Sunku Ranganath
https://www.linkedin.com/in/sunkuranganath/
Legal Disclaimer
No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.
Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any
warranty arising from course of performance, course of dealing, or usage in trade.
This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to
obtain the latest forecast, schedule, specifications and roadmaps.
The products and services described may contain defects or errors known as errata which may cause deviations from published specifications. Current characterized errata are available on
request.
Copies of documents which have an order number and are referenced in this document may be obtained by calling 1-800-548-4725 or by visiting www.intel.com/design/literature.htm.
Intel, the Intel logo, Intel Resource Director Technology, Intel Run Sure Technology, Intel Node Manager, are trademarks of Intel Corporation in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others
Copyright © 2017 Intel Corporation. All rights reserved.
Acknowledgements
Timothy Verrall
John Browne
Damien Power
Emma Collins
Jean Christophe Bouche
Krzysztof Kepka
Agenda
Platform Observability
Service Assurance
Closed Loop Automation
Platform Observability & Service Assurance (SA)
• Observability: Ability to expose state of the platform to ensure Service Level Objectives are met
• Observability Considerations: Logging, Metrics & Tracing
• Communications Service Provider Context:
  • Care about overall Service Assurance
  • Both Monitoring & Observability are important
• Service Assurance:
  • Application of policies to ensure services meet a pre-defined service quality level
  • FCAPS (Fault, Configuration, Accounting, Performance & Security) attributes on existing network infrastructure
Three Key Elements of SA Platform
• Monitoring: Enabling deeper management and tracking of specific service levels
• Presentation: Reporting to enable reaction to service level changes
• Provisioning: Enable configuration of service levels based on workload or service priority
Figure: Service Assurance elements mapping to ETSI NFV Model
Collectd Monitoring Agent
Collectd: Why & What
• Statistics collection daemon
• Uses read plugins to collect metrics and write plugins to send them to an endpoint
• Open source
• Widely adopted
• Configurable Collection Interval
Various Plugin types:
• Input/Output
• Binding Plugins
• Logging Plugins
• Notification Plugins
• Other: Network plugin with both send/receive feature
Figure: Collectd Architecture
https://github.com/collectd/collectd
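The read/write plugin split described on this slide can be illustrated with a toy model. This is a minimal sketch, not collectd's actual API: the `MiniCollector` class, the callback names and the sample values are all hypothetical, and one call to `run_once` stands in for one collection interval.

```python
from typing import Callable, Dict, List, Tuple

Metric = Tuple[str, float]  # (metric name, value)


class MiniCollector:
    """Toy stand-in for a collection daemon (hypothetical, not collectd's API)."""

    def __init__(self) -> None:
        self.readers: List[Callable[[], List[Metric]]] = []
        self.writers: List[Callable[[List[Metric]], None]] = []

    def register_read(self, callback: Callable[[], List[Metric]]) -> None:
        self.readers.append(callback)

    def register_write(self, callback: Callable[[List[Metric]], None]) -> None:
        self.writers.append(callback)

    def run_once(self) -> int:
        """One collection interval: poll every reader, fan out to every writer."""
        batch: List[Metric] = []
        for read in self.readers:
            batch.extend(read())
        for write in self.writers:
            write(batch)
        return len(batch)


def read_load() -> List[Metric]:
    # Hard-coded sample value; a real read plugin would query the system.
    return [("load.shortterm", 0.42)]


def read_memory() -> List[Metric]:
    return [("memory.used_bytes", 1024.0), ("memory.free_bytes", 2048.0)]


sink: Dict[str, float] = {}


def write_to_dict(batch: List[Metric]) -> None:
    # Stand-in for a write plugin that ships metrics to an endpoint.
    sink.update(dict(batch))


collector = MiniCollector()
collector.register_read(read_load)
collector.register_read(read_memory)
collector.register_write(write_to_dict)
dispatched = collector.run_once()
```

Keeping readers and writers independent is what lets a new metric source or a new destination be added without touching the other side.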
Platform Telemetry Exposure & Integration
Figure: Platform telemetry exposure and integration (labels recovered from the diagram; grouping is approximate)
• Platform: Compute, Network, Storage; Hypervisor [RT/SA KVM4NFV extensions]; NFVI (Virtualised Compute, Virtualised Network, Virtualised Storage)
• Fast Path: Triggers on events or counters; VM Stall Detection / RT Stall Detection; Local Corrective Action (e.g. Working/Protect Failover)
• Slow Path: Periodic Pull 1/15mins; Monitoring/Analytics Systems
• Counters: PMU^ counters, NIC counters, vSwitch counters, RAS, Hypervisor/Container Counters
• Common / Standard Open APIs: SNMP API, Enterprise MIB, Perfmon MIB, NFV Platform MIB, SYSLOG, IPFIX, sFlow, IPMI, Redfish
• Collectors: Collectd, NetFlow Collectors, Container Monitoring Solutions (Prometheus, …), Vendor SA Middleware
• Collectd Plugins: Gnocchi, VES Plugin, Kafka, Prometheus
• OpenStack*: Ceilometer, Aodh, Vitrage, Congress; OpenStack* VIM
• Intel® Infrastructure Management Technologies: Intel® Node Manager, Intel® Management Engine, Intel® RDT (CMT, CAT, MBM, CDP), Power, Out Of Band Telemetry
• Intel® Run Sure Technology: MCA*, PCIe AER; Resilient System Technology; Resilient Memory Technology (SDDC, DDDC+1, Mirroring)
• Intel® Rapid Storage Technology: RAID/NVMe*
• Legend: In progress; Done/Integrated; Standard Open APIs; Intel Components; Open Platform Collectors
PMU^: Performance Monitoring Unit
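The diagram distinguishes a Fast Path (triggers on events or counters) from a Slow Path (periodic pull). The sketch below shows one way the two could coexist on a single sample stream. It is an illustration under assumptions: the counter names, the threshold value and the `pull_every` setting (a sample count standing in for the pull period) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TelemetryPaths:
    thresholds: Dict[str, float]   # hypothetical per-counter trigger levels
    pull_every: int = 3            # slow path: snapshot every N samples
    alerts: List[str] = field(default_factory=list)
    exported: List[Dict[str, float]] = field(default_factory=list)
    _latest: Dict[str, float] = field(default_factory=dict)
    _seen: int = 0

    def on_sample(self, counter: str, value: float) -> None:
        self._latest[counter] = value
        # Fast path: react immediately when a counter crosses its threshold.
        limit = self.thresholds.get(counter)
        if limit is not None and value > limit:
            self.alerts.append(f"{counter}={value} exceeds {limit}")
        # Slow path: export a snapshot of the latest values periodically.
        self._seen += 1
        if self._seen % self.pull_every == 0:
            self.exported.append(dict(self._latest))


paths = TelemetryPaths(thresholds={"nic.rx_dropped": 100.0})
samples = [
    ("nic.rx_dropped", 5.0),
    ("vswitch.tx_errors", 0.0),
    ("nic.rx_dropped", 250.0),
    ("nic.rx_dropped", 7.0),
]
for name, value in samples:
    paths.on_sample(name, value)
```

In this run the third sample raises a fast-path alert and also completes the first slow-path snapshot; the fourth sample is held until the next snapshot.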
Multiple Closed Loops
Figure: Multiple closed loops driven by telemetry (labels recovered from the diagram)
• Loop stages: Design, Plan & Provision, Orchestrate, Monitor, Analyze, Optimize
• Feedback loops: Offline feedback loop (Offline Processing); Near-real Time Feedback loop and Real-Time Feedback loop (Online Processing); each loop is fed by Telemetry
• Use cases (Loops):
  • Capacity planning, Peering planning, Cache placement, …
  • Service assurance, Security operations, …
  • Traffic Engineering: Network Optimization, Demand placement, Workload placement…
Real-time/Near Real-time Loops - Automated
Source: https://pndablog.com/2017/06/05/feedback-loops-and-closed-loop-control/
Networking Closed Loops – High Level Architecture
Figure: High-level architecture of networking closed loops (labels recovered from the diagram; grouping is approximate)
• Layers: Infrastructure, Control, Application
• Platform Resources; Forwarding Plane; Interfaces; Traffic
• Local Platform Agent; Platform Telemetry; Telemetry distribution or storage or …
• Platform Analytics Systems
• SDN/NMS; Network Services
• Cloud and Virtual Management; MANO; EMS; VNFM
• Business Applications; Setting of Policy
• Policy Based Provisioning Control Loops
Independent Closed Loops: SDN, Cloud & Virtual Mgt, Platform
Closed Loops – Networking Stack
Figure: Closed loops across the networking stack (labels recovered from the diagram)
• Stack layers: Application Layer; Network Data Analytics; Orchestration, Management, Policy; Cloud & Virtual Management; Network Control; Operating Systems; Data Path; Hardware/Disaggregated Hardware
• Layer groups: Services, Management & Control, Infrastructure
• Closed Loop Reaction Time: Micro-seconds/Milliseconds; Seconds/Mins; Mins/Hours/Days
• Domain Knowledge: Local to Platform; End to End
• Loop functions: HW Enabled Loops (e.g. RAS); Enforce DP Loops (HA etc.); Enforce Local Policy; Enforce Network Domain Policy; Map Policies; Deployment Policies; Analyze/Plan Policies; Analytics
High Speed Control Loops are Close to the Platform
Closed Loops – Business Cases
Figure: Business use cases on top of policy-based provisioning control loops (labels recovered from the diagram; grouping is approximate)
• Business Use Cases: Improved Customer Experience; Cloud Optimization & Efficiency; Security
• Use-case labels: Edge Placement, Service Healing, Differentiated QoS, Service Optimization, Energy Optimization, Capacity Optimization, Cloud Configurations, Threat Detection, Threat Response
• Business Applications; AI/ML/DL; Monitoring/Storage
• NFV Orchestrator (NFVO) [e.g. ONAP/OSM]; VNF Manager (VNFM)
• OpenStack*, Kubernetes*, Bare Metal, each with a Telemetry I/F
• Platform(s): Feature Exposure, Provisioning, Telemetry; Local Policy Enforcement Agent(s) For Local Dynamic Control; collectd
• Intel® Infrastructure Management Tech: Intel® RDT, Power, Intel® Run Sure Technology
• Legend: Actively Contributing
Policy Based Provisioning Control Loops
Closed Loop Resiliency Demo
Goal: Maximize Service Availability of Virtual Border Network Gateway (vBNG) in memory error scenario
Figure 1: Service Recovery Timeline
Figure 2: Closed Loop Resiliency Demo with Kubernetes
Figure 1 Source: OpenSAF and VMware from the Perspective of High Availability - Ali Nikzad, Ferhat Khendek, Maria Toeroe; Concordia University, Ericsson; SVM’2013 – Zurich – October 2013
More Details on Demo: https://networkbuilders.intel.com/social-hub/video/closed-loop-platform-automation-workload-resiliency-demo
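A closed loop of this kind can be thought of as monitor, analyze, act. The sketch below is not the demo's implementation; it is a minimal illustration assuming that per-node corrected-memory-error counts are available as telemetry. The node names, workload names and both thresholds are hypothetical.

```python
from typing import Dict


def plan_actions(corrected_errors: Dict[str, int],
                 warn_at: int, evacuate_at: int) -> Dict[str, str]:
    """Analyze step: map each node's corrected-memory-error count to an action."""
    actions: Dict[str, str] = {}
    for node, count in corrected_errors.items():
        if count >= evacuate_at:
            actions[node] = "evacuate"  # move workloads before a hard failure
        elif count >= warn_at:
            actions[node] = "cordon"    # stop placing new workloads here
        else:
            actions[node] = "none"
    return actions


def apply_actions(actions: Dict[str, str],
                  placement: Dict[str, str], standby: str) -> Dict[str, str]:
    """Act step: return a new workload-to-node map with evacuated nodes drained."""
    return {
        workload: (standby if actions.get(node) == "evacuate" else node)
        for workload, node in placement.items()
    }


# Monitor step: hypothetical telemetry snapshot (node -> corrected error count).
telemetry = {"node-a": 0, "node-b": 12, "node-c": 57}
actions = plan_actions(telemetry, warn_at=10, evacuate_at=50)
placement = apply_actions(
    actions, {"vbng-1": "node-c", "vbng-2": "node-a"}, standby="node-d"
)
```

Acting on the rising error count, rather than waiting for the failure itself, is what lets the loop protect service availability.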
Use Cases & Gaps
• 5G Network Slicing
• Demand based Energy Savings
• Workload Resiliency
• Noisy Neighbor Detection & Avoidance
• And many more….
Figure: 5G Network Slicing Architecture
Source: https://www.researchgate.net/figure/5G-network-slicing-architecture_fig1_324175599
Gaps, Ongoing Work
• Telemetry tagging
• Policy delivery & management across VIM to NFVI
• ONAP, OPNFV, ETSI, etc.
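To make the telemetry tagging gap concrete: platform metrics are usually keyed by host, while policies are written per service or slice. The sketch below shows one possible shape for tagging, assuming an inventory that maps hosts to tags. The field names, the `slice` tag and the values are hypothetical and do not come from any of the projects listed above.

```python
from typing import Dict, List


def tag_metrics(samples: List[dict],
                inventory: Dict[str, Dict[str, str]]) -> List[dict]:
    """Attach inventory tags (e.g. slice, tenant) to each host-level sample."""
    return [
        {**sample, "tags": dict(inventory.get(sample["host"], {}))}
        for sample in samples
    ]


def mean_by_tag(tagged: List[dict], metric: str, tag_key: str) -> Dict[str, float]:
    """Aggregate one metric per tag value so a policy can reason per slice."""
    groups: Dict[str, List[float]] = {}
    for sample in tagged:
        if sample["metric"] != metric:
            continue
        key = sample["tags"].get(tag_key, "untagged")
        groups.setdefault(key, []).append(sample["value"])
    return {key: sum(values) / len(values) for key, values in groups.items()}


samples = [
    {"host": "host-1", "metric": "cpu.util", "value": 40.0},
    {"host": "host-2", "metric": "cpu.util", "value": 60.0},
    {"host": "host-3", "metric": "cpu.util", "value": 90.0},
]
inventory = {
    "host-1": {"slice": "embb"},
    "host-2": {"slice": "embb"},
    "host-3": {"slice": "urllc"},
}
tagged = tag_metrics(samples, inventory)
per_slice = mean_by_tag(tagged, "cpu.util", "slice")
```

Without the tags, the same three samples could only be compared host by host, which is not the level at which slice policies are expressed.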
Summary
• Platform observability and monitoring play a crucial role in ensuring service assurance
• Platform telemetry, alongside application telemetry, strongly differentiates services
• Various levels of closed loops are required for autonomous networks
• Real-time and near-real-time closed loops require automation
Call To Action
• Collaborate through Open Source Communities
• Figure out use cases of interest
• Leverage relevant infrastructure telemetry
Backup
Service Assurance “Phased” Evolution for NFV/SDN
• Strategic Framework for SA “Phase” Evolution
  • Phase 1 - Equivalence (Virtualized + Interworking with existing management systems)
  • Phase 2 - Automated by MANO + SDN Controller
  • Phase 3 - Predict failures and adapt automatically
Phase 1: Platform Service Assurance - Equivalence
• Platform Service Assurance supporting:
  • Intel® Run Sure Technology
  • Cache Config & Monitoring
  • BIOS Config & Reporting
  • Fastpath DPDK Interface Reporting
  • Fastpath DPDK Keep Alive
  • Virtual Switch Health
  • Host Health
Phase 2: Platform Service Assurance (MANO + SDN Controller)
• VIM and above, support:
  • Enable RAS Technologies
  • Enable DPDK and Keep Alive
  • Enable Host Health
  • Policy Based Provisioning
Phase 3: Predictive Platform Service Assurance
• Predict Failures and Adapt Automatically:
  • Automated and Adaptive to changes notified in metrics
  • Closed loop and Dynamic SA environment
Evolving from Equivalence towards NFV/SDN Automation
Never Stops; Solution of the day; Under Construction
Platform Plugins Contributed by Intel
Plugin Domain                      Description
Intel® Run Sure Technology / RAS   Mcelog, PCIe AER, logparser: Metrics & notifications pertaining to Intel Run Sure Technology
Intel® RDT                         Intel® Resource Director Technology (Intel® RDT) related metrics
Virt                               Libvirt related metrics
OVS                                Ovs_stats, ovs_events: Metrics related to Open vSwitch
DPDK                               Dpdk_stats, dpdk_events, hugepages: Data Plane Development Kit (DPDK) related metrics
OpenStack*                         Gnocchi, Aodh: Integration in OpenStack projects
Cloud                              Write_Kafka, Write_Prometheus, VES: Integration into various cloud platforms
Storage                            RAID, NVMe*: Storage related metrics
Power/Energy                       CPUFreq, Turbostat: Frequency & power related metrics
Platform                           IPMI, RedFish, PMU: Out of Band metrics & platform counters
Infrastructure Metrics are as Crucial as Application Metrics
Details: https://github.com/collectd/collectd
OPNFV Barometer
Barometer Strategy:
• Ensure platform metrics/events are accessible through open industry standard interfaces.
• Demonstrate platform technologies can be monitored, consumed and actioned in real time
One Click Install:
• Easy install/configuration for customers
• One command to install Collectd/InfluxDB/Grafana
• Three container approach for Collectd:
  • Stable Container: latest stable branch
  • Master Container: up to date with master
  • Experimental Container: cherry pick features of interest
Source: https://opnfv-barometer.readthedocs.io/en/latest/release/userguide/docker.userguide.html

Platform Observability and Infrastructure Closed Loops

  • 2. Legal Disclaimer No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document. Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade. This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps. The products and services described may contain defects or errors known as errata which may cause deviations from published specifications. Current characterized errata are available on request. Copies of documents which have an order number and are referenced in this document may be obtained by calling 1-800-548-4725 or by visiting www.intel.com/design/literature.htm. Intel, the Intel logo, Intel Resource Director Technology, Intel Run Sure Technology, Intel Node Manager, are trademarks of Intel Corporation in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others Copyright © 2017 Intel Corporation. All rights reserved.
  • 3. Acknowledgements: Timothy Verrall, John Browne, Damien Power, Emma Collins, Jean Christophe Bouche, Krzysztof Kepka
  • 5. Platform Observability & Service Assurance (SA)
      • Observability: the ability to expose the state of the platform to ensure Service Level Objectives are met
      • Observability considerations: logging, metrics and tracing
      • Communications Service Provider context:
          • Care about overall Service Assurance
          • Both monitoring and observability are important
      • Service Assurance:
          • Application of policies to ensure services meet a pre-defined service quality level
          • FCAPS (Fault, Configuration, Accounting, Performance & Security) attributes on existing network infrastructure
  • 6. Three Key Elements of SA Platform
      • Monitoring: enabling deeper management and tracking of specific service levels
      • Presentation: reporting to enable reaction to service level changes
      • Provisioning: enabling configuration of service levels based on workload or service priority
      Figure: Service Assurance elements mapping to ETSI NFV Model
  • 7. Collectd Monitoring Agent
      Collectd: Why & What
      • Statistics collection daemon
      • Uses read plugins to collect metrics and write plugins to send them to an endpoint
      • Open source
      • Widely adopted
      • Configurable collection interval
      Various plugin types:
      • Input/Output
      • Binding plugins
      • Logging plugins
      • Notification plugins
      • Other: Network plugin with both send and receive features
      Figure: Collectd Architecture
      https://github.com/collectd/collectd
  • 8. Platform Telemetry Exposure & Integration
      [Figure: platform telemetry sources and integration paths; the diagram layout is not recoverable. Labels in the figure:]
      • NFVI: Compute, Network, Storage; Hypervisor [RT/SA KVM4NFV extensions]; Virtualised Compute, Virtualised Network, Virtualised Storage
      • Fast path: triggers on events or counters; VM stall detection / RT stall detection; local corrective action (e.g. working/protect failover)
      • Slow path: periodic pull, 1/15 mins
      • Counters: PMU^ counters, NIC counters, vSwitch counters, RAS, hypervisor/container counters
      • Interfaces and protocols: SNMP, API, Enterprise MIB, Perfmon MIB, NFV Platform MIB, SYSLOG, IPFIX, sFlow, IPMI, Redfish; common/standard open APIs
      • Monitoring/analytics systems (includes NetFlow collectors), container monitoring solutions (Prometheus ...), vendor SA middleware, open platform collectors
      • Intel components: Intel® Node Manager, Intel® Management Engine, Intel® Run Sure Technology, MCA*, PCIe AER, Resilient System Technology, Resilient Memory Technology, SDDC, DDDC+1, mirroring, Intel® Rapid Storage Technology, RAID/NVMe*, Intel® RDT (CMT, CAT, MBM, CDP), power, out-of-band telemetry, Intel® Infrastructure Management Technologies
      • Collectd plugins (legend: in progress, done/integrated): OpenStack* VIM (Ceilometer, Aodh, Vitrage, Congress, Gnocchi), VES plugin, Redfish, Kafka, Prometheus
      PMU^: Performance Monitoring Unit
  • 9. Multiple Closed Loops
      [Figure: Design, Plan & Provision, Orchestrate, Monitor, Analyze and Optimize stages connected by telemetry-driven feedback loops]
      • Loops: offline feedback loop, near-real-time feedback loop, real-time feedback loop
      • Processing: offline, online
      • Use cases (loops): capacity planning, peering planning, cache placement, …
      • Use cases (loops): service assurance, security operations, …
      • Use cases (loops): traffic engineering (network optimization), demand placement, workload placement, …
      • Real-time and near-real-time loops are automated
      Source: https://pndablog.com/2017/06/05/feedback-loops-and-closed-loop-control/
  • 10. Networking Closed Loops – High Level Architecture
      [Figure: high-level closed-loop architecture; the diagram layout is not recoverable. Labels in the figure:]
      • Platform: platform resources, forwarding plane, interfaces, traffic, local platform agent
      • Telemetry: platform telemetry, telemetry distribution or storage, platform analytics systems
      • Policy: business applications, setting of policy, policy-based provisioning, control loops
      • Management and control: SDN/NMS, network services, cloud and virtual management, MANO, EMS, VNFM, infrastructure control
      • Application-independent closed loops: SDN, cloud & virtual management, platform
  • 11. Closed Loops – Networking Stack
      [Figure: networking stack layers plotted against closed-loop reaction time and domain knowledge]
      • Stack layers (as listed): Application Layer; Network Data Analytics; Orchestration, Management, Policy; Cloud & Virtual Management; Network Control; Operating Systems; Data Path; Hardware / Disaggregated Hardware
      • Layer groups: Services, Management & Control, Infrastructure
      • Closed-loop reaction time: microseconds/milliseconds, seconds/minutes, minutes/hours/days
      • Domain knowledge: local to platform, end to end
      • Loop actions: HW-enabled loops (e.g. RAS), enforce DP loops (HA etc.), enforce local policy, enforce network domain policy, deployment policies, map policies, analyze/plan policies
      • High-speed control loops are close to the platform
  • 12. Closed Loops – Business Cases
      [Figure: business use cases mapped onto the management stack and platform; the diagram layout is not recoverable. Labels in the figure:]
      • Business use cases: improved customer experience, cloud optimization & efficiency, edge placement, service healing, differentiated QoS, service optimization, energy optimization, capacity optimization, cloud configurations, security (threat detection, threat response)
      • Analytics: AI/ML/DL
      • Platform(s): feature exposure, provisioning, telemetry; local policy enforcement agent(s) for local dynamic control
      • Intel technologies: Intel® Infrastructure Management Tech, Intel® RDT, Intel® Run Sure Technology, power, monitoring/storage
      • Management stack: business applications, NFV Orchestrator (NFVO) [e.g. ONAP/OSM], VNF Manager (VNFM), OpenStack*, Kubernetes*, bare metal, telemetry I/F, collectd
      • Policy-based provisioning, control loops
      • Legend: actively contributing
  • 13. Closed Loop Resiliency Demo
      Goal: maximize service availability of a Virtual Border Network Gateway (vBNG) in a memory error scenario
      Figure 1: Service Recovery Timeline. Source: OpenSAF and VMware from the Perspective of High Availability, Ali Nikzad, Ferhat Khendek, Maria Toeroe; Concordia University, Ericsson; SVM’2013, Zurich, October 2013
      Figure 2: Closed Loop Resiliency Demo with Kubernetes
      More details on the demo: https://networkbuilders.intel.com/social-hub/video/closed-loop-platform-automation-workload-resiliency-demo
  • 14. Use Cases & Gaps
      Use cases:
      • 5G network slicing
      • Demand-based energy savings
      • Workload resiliency
      • Noisy neighbor detection & avoidance
      • And many more
      Figure: 5G Network Slicing Architecture. Source: https://www.researchgate.net/figure/5G-network-slicing-architecture_fig1_324175599
      Gaps and ongoing work:
      • Telemetry tagging
      • Policy delivery & management across VIM to NFVI
      • ONAP, OPNFV, ETSI, etc.
  • 15. Summary
      • Platform observability and monitoring play a crucial role in ensuring service assurance
      • Platform telemetry, alongside application telemetry, strongly differentiates services
      • Various levels of closed loops are required for autonomous networks
      • Real-time and near-real-time closed loops require automation
      Call To Action
      • Collaborate through open source communities
      • Identify use cases of interest
      • Leverage relevant infrastructure telemetry
  • 17. Service Assurance “Phased” Evolution for NFV/SDN
      Strategic framework for SA “phase” evolution:
      • Phase 1 - Equivalence (virtualized + interworking with existing management systems)
      • Phase 2 - Automated by MANO + SDN controller
      • Phase 3 - Predict failures and adapt automatically
      Phase 1: Platform Service Assurance - Equivalence
      • Platform Service Assurance supporting:
          • Intel Run Sure Technology
          • Cache config & monitoring
          • BIOS config & reporting
          • Fastpath DPDK interface reporting
          • Fastpath DPDK keep alive
          • Virtual switch health
          • Host health
      Phase 2: Platform Service Assurance (MANO + SDN Controller)
      • VIM and above, support:
          • Enable RAS technologies
          • Enable DPDK and keep alive
          • Enable host health
          • Policy-based provisioning
      Phase 3: Predictive Platform Service Assurance
      • Predict failures and adapt automatically:
          • Automated and adaptive to changes notified in metrics
          • Closed loop and dynamic SA environment
      Figure labels: Evolving from Equivalence towards NFV/SDN Automation; Never Stops; Solution of the day; Under Construction
  • 18. Platform Plugins Contributed by Intel
      Plugin domain: description
      • Intel® Run Sure Technology / RAS: Mcelog, PCIe AER, logparser. Metrics & notifications pertaining to Intel Run Sure Technology
      • Intel® RDT: Intel® Resource Director Technology (Intel® RDT) related metrics
      • Virt: Libvirt related metrics
      • OVS: Ovs_stats, ovs_events. Metrics related to Open vSwitch
      • DPDK: Dpdk_stats, dpdk_events, hugepages. Data Plane Development Kit (DPDK) related metrics
      • OpenStack*: Gnocchi, Aodh. Integration in OpenStack projects
      • Cloud: Write_Kafka, Write_Prometheus, VES. Integration into various cloud platforms
      • Storage: RAID, NVMe*. Storage related metrics
      • Power/Energy: CPUFreq, Turbostat. Frequency & power related metrics
      • Platform: IPMI, RedFish, PMU. Out-of-band metrics & platform counters
      Infrastructure metrics are as crucial as application metrics
      Details: https://github.com/collectd/collectd
  • 19. Barometer
      Strategy:
      • Ensure platform metrics/events are accessible through open industry standard interfaces
      • Demonstrate that platform technologies can be monitored, consumed and actioned in real time
      OPNFV Barometer One Click Install:
      • Easy install/configuration for customers
      • One command to install collectd/InfluxDB/Grafana
      • Three-container approach for collectd:
          • Stable container: latest stable branch
          • Master container: up to date with master
          • Experimental container: cherry-picked features of interest
      Source: https://opnfv-barometer.readthedocs.io/en/latest/release/userguide/docker.userguide.html
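
The read/write plugin model from slide 7 and the telemetry-to-policy loop from slide 10 can be illustrated with a minimal Python sketch. This is not the collectd API: `Sample`, `Agent`, `LocalPolicy`, the `memory_errors` metric name and the threshold of 10 are all hypothetical, chosen only to show how a write-side consumer could close the loop by acting on a reading.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# All names below are hypothetical; this is not the collectd API.


@dataclass
class Sample:
    plugin: str   # source of the metric, e.g. "memory_errors"
    value: float  # metric reading


ReadPlugin = Callable[[], Sample]
WritePlugin = Callable[[Sample], None]


class Agent:
    """Toy monitoring agent: read plugins collect, write plugins publish."""

    def __init__(self) -> None:
        self.readers: List[ReadPlugin] = []
        self.writers: List[WritePlugin] = []

    def run_once(self) -> None:
        # One collection interval: every sample goes to every write plugin.
        for read in self.readers:
            sample = read()
            for write in self.writers:
                write(sample)


class LocalPolicy:
    """Closes the loop: compares telemetry with a limit and triggers an action."""

    def __init__(self, limits: Dict[str, float],
                 action: Callable[[Sample], None]) -> None:
        self.limits = limits
        self.action = action

    def __call__(self, sample: Sample) -> None:
        limit = self.limits.get(sample.plugin)
        if limit is not None and sample.value > limit:
            self.action(sample)


def demo() -> List[str]:
    readings = iter([2.0, 12.0])  # assumed corrected-memory-error counts
    actions: List[str] = []

    agent = Agent()
    agent.readers.append(lambda: Sample("memory_errors", next(readings)))
    agent.writers.append(LocalPolicy(
        limits={"memory_errors": 10.0},  # assumed threshold
        action=lambda s: actions.append(f"failover:{s.plugin}={s.value}"),
    ))

    agent.run_once()  # 2.0 is below the limit, so no action is taken
    agent.run_once()  # 12.0 exceeds the limit, so the action fires
    return actions
```

Registering the policy as a write-side consumer keeps collection and reaction decoupled, which is one way to read the "local platform agent" in the slides; a real deployment would presumably replace the lambda with a call into an orchestrator or a local corrective action.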

Editor's Notes

  1. Monitoring:
       • Platform & network counters to track usage and performance against KPIs
       • Provide open standard components and interfaces
     Presentation:
       • Human & dynamic intervention for threshold violations or failures
       • Support for detecting trends against configured parameters and enabling capacity plan changes based on those trends
     Provisioning:
       • Includes allocating or partitioning platform resources
     Service Assurance intersects with every layer of the NFV framework. Capabilities have to be built in for easy interoperability and smooth consumption of telemetry data. Open standard interfaces play an important role.
  2. Platform telemetry agents: Collectd - https://github.com/collectd/collectd
     Local platform agent: Resource Management Daemon - https://github.com/intel/rmd
  3. Looking to help these with IA features. Feature exposure, provisioning & telemetry. We are looking to enable/fill in these gaps.
  4. Workload Resiliency: https://networkbuilders.intel.com/social-hub/video/closed-loop-platform-automation-workload-resiliency-demo
     Noisy Neighbor Detection & Avoidance: https://ftp.osuosl.org/pub/fosdem/2019/H.2214/noisy_neighbor_insurance.mp4
  5. Existing FCAPS systems are widely deployed and represent significant business investments, so Service Assurance is introduced in phases: first equivalence with existing systems, then MANO+SDN integration, and finally a fully automated and predictive management and orchestration system.
  6. Sunku
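
Note 1 mentions detecting trends against configured parameters, and slide 17 describes a predictive Phase 3. The source names no method, so the following is only a sketch under the assumption that a simple linear trend is good enough: it fits a least-squares slope to equally spaced samples and projects how many intervals remain before a configured limit is reached. The function names and the sample values are hypothetical.

```python
from typing import Optional, Sequence


def slope(values: Sequence[float]) -> float:
    """Least-squares slope of values sampled at equal intervals (x = 0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


def intervals_until(values: Sequence[float], limit: float) -> Optional[float]:
    """Projected number of intervals until the latest value reaches `limit`.

    Returns 0.0 if the limit is already reached, and None if the trend is
    flat or moving away from the limit.
    """
    latest = values[-1]
    if latest >= limit:
        return 0.0
    rate = slope(values)
    if rate <= 0:
        return None
    return (limit - latest) / rate
```

For example, with assumed utilization samples `[50, 55, 60, 65]` and a limit of 90, the fitted slope is 5 per interval, so `intervals_until` projects 5 more intervals. A capacity-planning loop could use that lead time to act before the threshold is violated rather than after.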