Enforcing Application SLAs with Congress and Monasca
Fabio Giannetti, Ken Owens
April 28, 2016
Outline
• Vision
• Congress and Monasca implementing:
  • OPS/NOC SLA Policies
  • App Intent SLA Policies
• Current State and Next Steps
Vision
• Application owners/developers do not care about the underlying infrastructure unless it is a problem.
• Microservices-based architectures demand inherently granular application design.
• SLAs for applications must be holistic and independent of the underlying infrastructure.
Vision
[Diagram: a Host with Virtualization and Container layers running Services (Srvc) that are grouped into Application A and Application B]
Enable business/application owners to easily define the aspects that are relevant in running their applications with the budget constraints that are imposed by IT.
Vision
Monitoring is now holistic: it has to consider various levels of virtualization and harmonize data across the different layers.
Containers are short-lived and are moved around the available infrastructure.
Vision
[Diagram: a Host with Virtualization and Container layers]
When an application owner's soft limits are crossed, alarms are notified back to the owner; when hard limits are crossed, actions are performed whenever required.
OPS/NOC SLA using Congress and Monasca
OPS/NOC Policy Example
Underutilized Servers
error(vm, email) :-
    nova:server_owner(vm, owner),
    two_months_before_today(start, end),
    ceilometer:statistics(vm, start, end, "cpu-util", cpu),
    cpu < 5,
    keystone:email(owner, email)
two_months_before_today(start, end) :-
    date:today(end),
    date:minus(end, "2 months", start)
If a VM has less than 5% CPU utilization for the last 2 months, then notify its owner via email.
Current Solution
[Diagram: Congress API, Policy Engine, Ceilometer Datasource, Ceilometer API]
The Ceilometer Datasource polls every <n>s:
GET /v2/meters/cpu_util/statistics?resource_id=…

VM UUID (Resource ID) | CPU
xxxxxxxx-0001-xxxx-xxxxxxxxxxx | 40
xxxxxxxx-0002-xxxx-xxxxxxxxxxx | 30
xxxxxxxx-0003-xxxx-xxxxxxxxxxx | 2
xxxxxxxx-0004-xxxx-xxxxxxxxxxx | 70
xxxxxxxx-0005-xxxx-xxxxxxxxxxx | 55
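The polling model on this slide can be sketched as follows. This is only an illustration, not Congress code: `fetch_cpu_stats`, `poll`, and the table layout are hypothetical names, and the fetch function returns the canned values from the slide instead of calling the Ceilometer API.

```python
import time
from typing import Callable, Dict

def fetch_cpu_stats() -> Dict[str, float]:
    """Hypothetical stand-in for a call to the Ceilometer statistics endpoint."""
    return {
        "xxxxxxxx-0001-xxxx-xxxxxxxxxxx": 40,
        "xxxxxxxx-0002-xxxx-xxxxxxxxxxx": 30,
        "xxxxxxxx-0003-xxxx-xxxxxxxxxxx": 2,
        "xxxxxxxx-0004-xxxx-xxxxxxxxxxx": 70,
        "xxxxxxxx-0005-xxxx-xxxxxxxxxxx": 55,
    }

def poll(fetch: Callable[[], Dict[str, float]], table: Dict[str, float],
         interval_s: float, rounds: int, sleep=time.sleep) -> int:
    """Refresh `table` from `fetch` for `rounds` rounds; return rows transferred."""
    transferred = 0
    for i in range(rounds):
        snapshot = fetch()
        table.clear()
        table.update(snapshot)      # the datasource table mirrors the latest poll
        transferred += len(snapshot)
        if i < rounds - 1:
            sleep(interval_s)       # wait <n> seconds between polls
    return transferred

cpu_table: Dict[str, float] = {}
rows = poll(fetch_cpu_stats, cpu_table, interval_s=0, rounds=3)
below = [vm for vm, cpu in cpu_table.items() if cpu < 5]
print(rows, "rows fetched;", len(below), "VM below 5%")
```

Every round transfers all five rows even though only one of them satisfies the policy condition.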
Current Solution
[Diagram: Congress API and Policy Engine with Ceilometer, Nova, and Keystone Datasources, backed by the Nova API and Keystone API]

Ceilometer Datasource:
VM UUID (Resource ID) | CPU
xxxxxxxx-0001-xxxx | 40
xxxxxxxx-0002-xxxx | 30
xxxxxxxx-0003-xxxx | 2
xxxxxxxx-0004-xxxx | 70
xxxxxxxx-0005-xxxx | 55

Nova Datasource:
VM | Owner
xxxxxxxx-0001-xxxx | Ann
xxxxxxxx-0002-xxxx | Fabio
xxxxxxxx-0003-xxxx | Fabio
xxxxxxxx-0004-xxxx | Ken
xxxxxxxx-0005-xxxx | Ken

Keystone Datasource:
Owner | Email
Ann | AnnNotRealEmail@cisco.com
Fabio | FabioNotRealEmail@cisco.com
Ken | KenNotRealEmail@cisco.com

Policy result:
VM | Email
xxxxxxxx-0003-xxxx | FabioNotRealEmail@cisco.com
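The join that produces the final VM/Email row can be reproduced with a few lines of Python. This is a minimal sketch of the rule's logic over the sample rows from the slide, not how the Congress policy engine evaluates Datalog; the function and variable names are illustrative.

```python
# Sample rows copied from the slide tables.
cpu_util = {
    "xxxxxxxx-0001-xxxx": 40,
    "xxxxxxxx-0002-xxxx": 30,
    "xxxxxxxx-0003-xxxx": 2,
    "xxxxxxxx-0004-xxxx": 70,
    "xxxxxxxx-0005-xxxx": 55,
}
server_owner = {
    "xxxxxxxx-0001-xxxx": "Ann",
    "xxxxxxxx-0002-xxxx": "Fabio",
    "xxxxxxxx-0003-xxxx": "Fabio",
    "xxxxxxxx-0004-xxxx": "Ken",
    "xxxxxxxx-0005-xxxx": "Ken",
}
owner_email = {
    "Ann": "AnnNotRealEmail@cisco.com",
    "Fabio": "FabioNotRealEmail@cisco.com",
    "Ken": "KenNotRealEmail@cisco.com",
}

def underutilized(cpu_by_vm, owner_by_vm, email_by_owner, threshold=5):
    """Return (vm, email) pairs for VMs whose CPU figure is below `threshold`."""
    errors = []
    for vm, cpu in cpu_by_vm.items():
        if cpu < threshold:                      # cpu < 5
            owner = owner_by_vm.get(vm)          # nova:server_owner(vm, owner)
            email = email_by_owner.get(owner)    # keystone:email(owner, email)
            if email is not None:
                errors.append((vm, email))
    return errors

errors = underutilized(cpu_util, server_owner, owner_email)
print(errors)
```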
From Policy to Alarm
error(vm, email) :-
    nova:server_owner(vm, owner),
    two_months_before_today(start, end),
    monasca_alarms:stats(vm, start, end, "cpu.user_perc", cpu),
    cpu < 5,
    keystone:email(owner, email)
two_months_before_today(start, end) :-
    date:today(end),
    date:minus(end, "2 months", start)
{
  "name":"Average CPU percent is less than 5",
  "description":"The average CPU percent is less than 5",
  "expression":"(avg(cpu.user_perc{resource_id=vm}) < 5)",
  "match_by":[
    "resource_id"
  ],
  "severity":"HIGH",
  "ok_actions":[
    "action_id_for_ok"
  ],
  "alarm_actions":[
    "action_id_for_alarm"
  ]
}
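As a rough illustration of the policy-to-alarm step, the sketch below builds an alarm-definition body like the one above from a metric name, an operator, and a threshold. The helper `alarm_definition` and its parameters are hypothetical, not part of Congress or Monasca. As an assumption, it also leaves out the `{resource_id=vm}` filter shown on the slide and relies on `match_by` to separate alarms per resource.

```python
import json

VALID_OPERATORS = ("<", ">", "<=", ">=")

def alarm_definition(metric, operator, threshold, match_by,
                     ok_action_id, alarm_action_id, severity="HIGH"):
    """Build an alarm-definition dict from one threshold condition of a policy."""
    if operator not in VALID_OPERATORS:
        raise ValueError(f"unsupported operator: {operator!r}")
    return {
        "name": f"Average {metric} {operator} {threshold}",
        "description": f"The average of {metric} is {operator} {threshold}",
        "expression": f"(avg({metric}) {operator} {threshold})",
        "match_by": [match_by],
        "severity": severity,
        "ok_actions": [ok_action_id],
        "alarm_actions": [alarm_action_id],
    }

# The condition `cpu < 5` on "cpu.user_perc" from the policy above.
definition = alarm_definition("cpu.user_perc", "<", 5, "resource_id",
                              "action_id_for_ok", "action_id_for_alarm")
body = json.dumps(definition, indent=2)
print(body)
```

The resulting JSON string is what would be sent as the body of the alarm-definition request; sending it is left out of the sketch.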
Proposed Solution (receiving notifications)
[Diagram: Monasca Agents, Monasca API, Kafka Cluster, Persister, Threshold Engine, Notification Engine, Metrics DB, and Settings DB; Congress API, Policy Engine, and Monasca Alarm Datasource]
Alarm definition: POST /v2.0/alarm-definitions
Notification setup: monasca notification-create congress WEBHOOK http:…/v1/data-sources/monasca_alarm?execute&action=handle_ala
Webhook: …/v1/data-sources/monasca_alarm?execute&action=handle_alarm
handle_alarm(params)

VM UUID (Resource ID) | CPU
xxxxxxxx-0003-xxxx | 2
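A handler for the webhook call might look like the sketch below. The payload shape (a `state` field plus a `metrics` list carrying `dimensions`) is an assumption about what the notification delivers, and `handle_alarm` here is a hypothetical stand-in for the datasource action named on the slide, not its actual implementation.

```python
from typing import Dict, List

def handle_alarm(payload: dict, table: Dict[str, str]) -> List[str]:
    """Update the datasource table from one webhook payload.

    Returns the resource IDs touched by the notification.
    """
    affected = []
    for metric in payload.get("metrics", []):
        resource_id = metric.get("dimensions", {}).get("resource_id")
        if resource_id is None:
            continue
        if payload.get("state") == "ALARM":
            table[resource_id] = payload.get("alarm_name", "")
        else:
            # The alarm cleared, so the row no longer belongs in the table.
            table.pop(resource_id, None)
        affected.append(resource_id)
    return affected

# Assumed example payload for the VM from the slide.
payload = {
    "alarm_name": "Average CPU percent is less than 5",
    "state": "ALARM",
    "metrics": [
        {"name": "cpu.user_perc",
         "dimensions": {"resource_id": "xxxxxxxx-0003-xxxx"}},
    ],
}

alarm_table: Dict[str, str] = {}
touched = handle_alarm(payload, alarm_table)
print(touched, alarm_table)
```

With this push model the table only ever holds rows for resources that are currently in alarm, instead of a full copy of every VM's statistics.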
Proposed Solution (receiving notifications)
[Diagram: Congress API and Policy Engine with Monasca Alarm, Nova, and Keystone Datasources, backed by the Nova API and Keystone API]

Monasca Alarm Datasource:
VM UUID (Resource ID) | CPU
xxxxxxxx-0003-xxxx | 2

Nova Datasource:
VM | Owner
xxxxxxxx-0003-xxxx | Fabio

Keystone Datasource:
Owner | Email
Fabio | FabioNotRealEmail@cisco.com

Policy result:
VM | Email
xxxxxxxx-0003-xxxx | FabioNotRealEmail@cisco.com
Application Intent SLA using Congress and Monasca
App Intent Policy Example
VM Evacuation for a Business-Critical App if the Host has potential health issues
error(vm) :-
    nova:show(vm, hostID),
    monasca_alarm:host_issues(hostID)
If a Host has issues, for instance:
1. Unhealthy: cannot be pinged and/or reached over SSH
2. Network errors and packet loss
3. Disk space below a certain threshold
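The rule above is a join between VM placement and the set of hosts with issues. A minimal sketch of that join, using made-up placement data (`vm_host`, `host_issues`, and the IDs are all hypothetical):

```python
# Hypothetical sample data: VM-to-host placement and hosts with active alarms.
vm_host = {"vm-1": "host-a", "vm-2": "host-b", "vm-3": "host-a"}
host_issues = {"host-a"}

def vms_to_evacuate(placement, hosts_with_issues):
    """Return the VMs whose host is currently reported as having issues."""
    return sorted(vm for vm, host in placement.items()
                  if host in hosts_with_issues)

candidates = vms_to_evacuate(vm_host, host_issues)
print(candidates)
```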
App Intent Policy: Metrics Correlation
error(vm) :-
    nova:show(vm, hostID),
    monasca_alarm:host_issues(hostID)

Metric Name | Dimensions | Value
host_alive_status | observer_host=fqdn, hostname=supplied hostname being checked, test_type=ping or ssh | 0=online, 1=offline
disk.space_used_perc | device, mount_point | The percentage of disk space that is being used on a device
net.in_packets_dropped_sec | device | Number of inbound network packets dropped per second
net.out_packets_dropped_sec | device | Number of outbound network packets dropped per second
App Intent Policy: Multi-Alarms #1
{
  "name":"Host is Unhealthy",
  "description":"The host is considered unhealthy",
  "expression":"(host_alive_status{host_id=hostID} = 1)",
  "match_by":[
    "host_id"
  ],
  ...
}
{
  "name":"Host disk getting full",
  "description":"The host disk is reaching capacity",
  "expression":"(disk.space_used_perc{host_id=hostID} > 90)",
  "match_by":[
    "host_id"
  ],
  ...
}

Metric Name | Value
host_alive_status | 0=online, 1=offline
disk.space_used_perc | The percentage of disk space that is being used on a device
net.in_packets_dropped_sec | Number of inbound network packets dropped per second
net.out_packets_dropped_sec | Number of outbound network packets dropped per second
App Intent Policy: Multi-Alarms #2
{
  "name":"Host dropping inbound packets",
  "description":"The host is dropping inbound network packets",
  "expression":"(net.in_packets_dropped_sec{host_id=hostID} > 30)",
  "match_by":[
    "host_id"
  ],
  ...
}
{
  "name":"Host dropping outbound packets",
  "description":"The host is dropping outbound network packets",
  "expression":"(net.out_packets_dropped_sec{host_id=hostID} > 30)",
  "match_by":[
    "host_id"
  ],
  ...
}

Metric Name | Value
host_alive_status | 0=online, 1=offline
disk.space_used_perc | The percentage of disk space that is being used on a device
net.in_packets_dropped_sec | Number of inbound network packets dropped per second
net.out_packets_dropped_sec | Number of outbound network packets dropped per second
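With several alarm definitions per host, one plausible way to derive the `host_issues` table used by the policy is to take every host that has at least one alarm in the ALARM state. The row layout and sample data below are assumptions made for illustration only.

```python
from typing import Iterable, Set, Tuple

# Hypothetical rows: (host_id, metric behind the alarm, alarm state).
alarm_rows = [
    ("host-a", "host_alive_status", "OK"),
    ("host-a", "disk.space_used_perc", "ALARM"),
    ("host-b", "net.in_packets_dropped_sec", "OK"),
    ("host-c", "net.out_packets_dropped_sec", "ALARM"),
    ("host-c", "host_alive_status", "ALARM"),
]

def host_issues(rows: Iterable[Tuple[str, str, str]]) -> Set[str]:
    """Return hosts with at least one alarm currently in the ALARM state."""
    return {host for host, _metric, state in rows if state == "ALARM"}

issues = host_issues(alarm_rows)
print(sorted(issues))
```

A host appears once in the result no matter how many of its alarms fire, so the policy rule sees a single `host_issues(hostID)` fact per affected host.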
Current State and Future Work
Overall Architecture
[Diagram: Monasca Agents, Monasca API, Keystone, Kafka Cluster, Persister, Threshold Engine, Notification Engine, Metrics DB, and Settings DB; Congress API, Policy Engine, and Monasca Alarm Datasource with an in-memory DB of Metric/Value rows (metric1/val1 to metricN/valN); connection labels: webhook, rpc]
Current Status and Next Steps
• Done:
  • Developed a Monasca Datasource to validate integration.
  • Designed the solution and found the main integration points.
• To be Done:
  • Develop a Monasca Alarm Datasource leveraging the RPC capabilities in Congress.
  • Create a Congress Notification Webhook for Monasca.
  • Develop a policy-to-alarm conversion component to handle policies prefixed with monasca-alarm.
OpenStack Summit
Austin, Texas 2016
Thank You!

 
Benefits of doing Internet peering and running an Internet Exchange (IX) pres...
Benefits of doing Internet peering and running an Internet Exchange (IX) pres...Benefits of doing Internet peering and running an Internet Exchange (IX) pres...
Benefits of doing Internet peering and running an Internet Exchange (IX) pres...
 
LESSON 10/ GROUP 10/ ST. THOMAS AQUINASS
LESSON 10/ GROUP 10/ ST. THOMAS AQUINASSLESSON 10/ GROUP 10/ ST. THOMAS AQUINASS
LESSON 10/ GROUP 10/ ST. THOMAS AQUINASS
 
Bio Medical Waste Management Guideliness 2023 ppt.pptx
Bio Medical Waste Management Guideliness 2023 ppt.pptxBio Medical Waste Management Guideliness 2023 ppt.pptx
Bio Medical Waste Management Guideliness 2023 ppt.pptx
 
Check out the Free Landing Page Hosting in 2024
Check out the Free Landing Page Hosting in 2024Check out the Free Landing Page Hosting in 2024
Check out the Free Landing Page Hosting in 2024
 
Computer 10 Lesson 8: Building a Website
Computer 10 Lesson 8: Building a WebsiteComputer 10 Lesson 8: Building a Website
Computer 10 Lesson 8: Building a Website
 
WordPress by the numbers - Jan Loeffler, CTO WebPros, CloudFest 2024
WordPress by the numbers - Jan Loeffler, CTO WebPros, CloudFest 2024WordPress by the numbers - Jan Loeffler, CTO WebPros, CloudFest 2024
WordPress by the numbers - Jan Loeffler, CTO WebPros, CloudFest 2024
 
LESSON 5 GROUP 10 ST. THOMAS AQUINAS.pdf
LESSON 5 GROUP 10 ST. THOMAS AQUINAS.pdfLESSON 5 GROUP 10 ST. THOMAS AQUINAS.pdf
LESSON 5 GROUP 10 ST. THOMAS AQUINAS.pdf
 

Enforcing Application SLA with Congress and Monasca

  • 1. Enforcing Application SLAs with Congress and Monasca. Fabio Giannetti, Ken Owens. April 28, 2016
  • 2. Outline
        • Vision
        • Congress and Monasca implementing:
            • OPS/NOC SLA Policies
            • App Intent SLA Policies
        • Current State and Next Steps
  • 4. Vision
        • Application owners and developers do not care about the underlying infrastructure unless it is a problem.
        • Microservices-based architectures demand inherently granular application design.
        • SLAs for applications must be holistic and independent of the underlying infrastructure.
        [Diagram: Application A and Application B, each made of services (Srvc) running in containers, on virtualization layers, on a host]
  • 5. Vision: Enable business and application owners to easily define the aspects that are relevant to running their applications within the budget constraints imposed by IT.
  • 6. Vision
        • Monitoring is now holistic: it has to consider the various levels of virtualization and harmonize data across the different layers.
        • Containers are short-lived and are moved around the available infrastructure.
        [Diagram: containers on virtualization layers on a host]
  • 7. Vision: Application owners are notified when soft limits (alarms) are crossed, and actions are performed when hard limits are reached.
  • 9. OPS/NOC Policy Example: Underutilized Servers
        If a VM has less than 5% CPU utilization for the last 2 months, then notify its owner via email.
            error(vm, email) :-
                nova:server_owner(vm, owner),
                two_months_before_today(start, end),
                ceilometer:statistics(vm, start, end, "cpu_util", cpu),
                cpu < 5,
                keystone:email(owner, email)
            two_months_before_today(start, end) :-
                date:today(end),
                date:minus(end, "2 months", start)
  • 10. Current Solution
        Components: Ceilometer API, Ceilometer Datasource, Policy Engine, Congress API
        Poll every <n>s: GET /v2/meters/cpu_util/statistics?resource_id=…
            VM UUID (Resource ID)             CPU
            xxxxxxxx-0001-xxxx-xxxxxxxxxxx    40
            xxxxxxxx-0002-xxxx-xxxxxxxxxxx    30
            xxxxxxxx-0003-xxxx-xxxxxxxxxxx    2
            xxxxxxxx-0004-xxxx-xxxxxxxxxxx    70
            xxxxxxxx-0005-xxxx-xxxxxxxxxxx    55
  • 11. Current Solution
        Components: Congress API, Policy Engine, Ceilometer Datasource, Nova Datasource (Nova API), Keystone Datasource (Keystone API)
        Ceilometer Datasource:
            VM UUID (Resource ID)    CPU
            xxxxxxxx-0001-xxxx       40
            xxxxxxxx-0002-xxxx       30
            xxxxxxxx-0003-xxxx       2
            xxxxxxxx-0004-xxxx       70
            xxxxxxxx-0005-xxxx       55
        Nova Datasource:
            VM                       Owner
            xxxxxxxx-0001-xxxx       Ann
            xxxxxxxx-0002-xxxx       Fabio
            xxxxxxxx-0003-xxxx       Fabio
            xxxxxxxx-0004-xxxx       Ken
            xxxxxxxx-0005-xxxx       Ken
        Keystone Datasource:
            Owner                    Email
            Ann                      AnnNotRealEmail@cisco.com
            Fabio                    FabioNotRealEmail@cisco.com
            Ken                      KenNotRealEmail@cisco.com
        Policy result:
            VM                       Email
            xxxxxxxx-0003-xxxx       FabioNotRealEmail@cisco.com
  • 12. From Policy to Alarm
        Policy:
            error(vm, email) :-
                nova:server_owner(vm, owner),
                two_months_before_today(start, end),
                monasca_alarms:stats(vm, start, end, "cpu.user_perc", cpu),
                cpu < 5,
                keystone:email(owner, email)
            two_months_before_today(start, end) :-
                date:today(end),
                date:minus(end, "2 months", start)
        Alarm definition:
            {
              "name": "Average CPU percent is less than 5",
              "description": "The average CPU percent is less than 5",
              "expression": "(avg(cpu.user_perc{resource_id=vm}) < 5)",
              "match_by": [ "resource_id" ],
              "severity": "HIGH",
              "ok_actions": [ "action_id_for_ok" ],
              "alarm_actions": [ "action_id_for_alarm" ]
            }
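The alarm definition on slide 12 can be assembled programmatically. The following is only a minimal sketch: the function name `build_cpu_alarm_definition` is hypothetical, the field names and values are copied from the slide, and `vm` in the expression is the slide's own placeholder. No request is sent to Monasca; the sketch only produces the JSON body.

```python
import json


def build_cpu_alarm_definition(threshold, ok_action_id, alarm_action_id):
    """Build the alarm-definition body shown on slide 12 (hypothetical helper)."""
    return {
        "name": f"Average CPU percent is less than {threshold}",
        "description": f"The average CPU percent is less than {threshold}",
        # "vm" is the placeholder used on the slide, not a real resource ID.
        "expression": f"(avg(cpu.user_perc{{resource_id=vm}}) < {threshold})",
        "match_by": ["resource_id"],
        "severity": "HIGH",
        "ok_actions": [ok_action_id],
        "alarm_actions": [alarm_action_id],
    }


definition = build_cpu_alarm_definition(5, "action_id_for_ok", "action_id_for_alarm")
payload = json.dumps(definition, indent=2)
print(payload)
```

Keeping the threshold as a parameter means the same helper could serve other policies of the form `cpu < N`, which is presumably what a policy-to-alarm conversion component would need.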
  • 13. Proposed Solution (receiving notifications)
        Monasca components: Monasca Agents, Monasca API, Kafka Cluster, Persister, Threshold Engine, Notification Engine, Metrics DB, Settings DB
        Congress components: Congress API, Policy Engine, Monasca Alarm Datasource
        Alarm definition: POST /v2.0/alarm-definitions
        Webhook: …/v1/data-sources/monasca_alarm?execute&action=handle_alarm
        Notification setup: monasca notification-create congress WEBHOOK http:…/v1/data-sources/monasca_alarm?execute&action=handle_ala
        Datasource handler: handle_alarm(params)
            VM UUID (Resource ID)    CPU
            xxxxxxxx-0003-xxxx       2
  • 14. Proposed Solution (receiving notifications)
        Components: Congress API, Policy Engine, Monasca Alarm Datasource, Nova Datasource (Nova API), Keystone Datasource (Keystone API)
        Monasca Alarm Datasource:
            VM UUID (Resource ID)    CPU
            xxxxxxxx-0003-xxxx       2
        Nova Datasource:
            VM                       Owner
            xxxxxxxx-0003-xxxx       Fabio
        Keystone Datasource:
            Owner                    Email
            Fabio                    FabioNotRealEmail@cisco.com
        Policy result:
            VM                       Email
            xxxxxxxx-0003-xxxx       FabioNotRealEmail@cisco.com
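The join that the policy engine performs over the datasource tables on slides 11 and 14 can be mimicked in a few lines. This is a sketch under stated assumptions: it is plain Python rather than Congress Datalog, the function name `notify_targets` is hypothetical, and the rows are the placeholder values from the slides.

```python
# Rows copied from the slides; the e-mail addresses are the slides' placeholders.
polled_cpu = {  # slide 11: every VM, as polled from Ceilometer
    "xxxxxxxx-0001-xxxx": 40,
    "xxxxxxxx-0002-xxxx": 30,
    "xxxxxxxx-0003-xxxx": 2,
    "xxxxxxxx-0004-xxxx": 70,
    "xxxxxxxx-0005-xxxx": 55,
}
alarmed_cpu = {"xxxxxxxx-0003-xxxx": 2}  # slide 14: only the VM in alarm

server_owner = {
    "xxxxxxxx-0001-xxxx": "Ann",
    "xxxxxxxx-0002-xxxx": "Fabio",
    "xxxxxxxx-0003-xxxx": "Fabio",
    "xxxxxxxx-0004-xxxx": "Ken",
    "xxxxxxxx-0005-xxxx": "Ken",
}
owner_email = {
    "Ann": "AnnNotRealEmail@cisco.com",
    "Fabio": "FabioNotRealEmail@cisco.com",
    "Ken": "KenNotRealEmail@cisco.com",
}


def notify_targets(cpu_by_vm, server_owner, owner_email, threshold=5):
    """Return (vm, email) pairs for VMs whose CPU value is below the threshold."""
    results = []
    for vm, cpu in cpu_by_vm.items():
        if cpu >= threshold:
            continue  # corresponds to the "cpu < 5" condition of the policy
        owner = server_owner.get(vm)
        email = owner_email.get(owner)
        if owner is None or email is None:
            continue  # a row missing from either table means no derived row
        results.append((vm, email))
    return results


print(notify_targets(polled_cpu, server_owner, owner_email))
print(notify_targets(alarmed_cpu, server_owner, owner_email))
```

Both calls derive the same single row. The difference between the two designs is the size of the CPU table the policy engine has to hold: one row per VM when polling, one row per alarm when Monasca pushes notifications.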
  • 15. Application Intent SLA using Congress and Monasca
  • 16. App Intent Policy Example: VM Evacuation for Biz Critical App if Host has potential health issues
            error(vm) :-
                nova:show(vm, hostID),
                monasca_alarm:host_issues(hostID)
        A host has issues if, for instance:
            1. It is unhealthy: it cannot be pinged and/or reached via SSH
            2. It shows network errors and packet loss
            3. Its disk space is below a certain threshold
  • 17. App Intent Policy: Metrics Correlation
            error(vm) :-
                nova:show(vm, hostID),
                monasca_alarm:host_issues(hostID)
        Metric Name | Dimensions | Value
        host_alive_status | observer_host=fqdn, hostname=supplied hostname being checked, test_type=ping or ssh | 0=online, 1=offline
        disk.space_used_perc | device, mount_point | The percentage of disk space that is being used on a device
        net.in_packets_dropped_sec | device | Number of inbound network packets dropped per second
        net.out_packets_dropped_sec | device | Number of outbound network packets dropped per second
  • 18. App Intent Policy: Multi-Alarms #1
            {
              "name": "Host is Unhealthy",
              "description": "The host is considered unhealthy",
              "expression": "(host_alive_status{host_id=hostID} = 1)",
              "match_by": [ "host_id" ],
              ...
            }
            {
              "name": "Host disk getting full",
              "description": "The host disk is reaching capacity",
              "expression": "(disk.space_used_perc{host_id=hostID} > 90)",
              "match_by": [ "host_id" ],
              ...
            }
        Metric Name | Value
        host_alive_status | 0=online, 1=offline
        disk.space_used_perc | The percentage of disk space that is being used on a device
        net.in_packets_dropped_sec | Number of inbound network packets dropped per second
        net.out_packets_dropped_sec | Number of outbound network packets dropped per second
  • 19. App Intent Policy: Multi-Alarms #2
            {
              "name": "Host is Unhealthy",
              "description": "The host is considered unhealthy",
              "expression": "(net.in_packets_dropped_sec{host_id=hostID} > 30)",
              "match_by": [ "host_id" ],
              ...
            }
            {
              "name": "Host disk getting full",
              "description": "The host disk is reaching capacity",
              "expression": "(net.out_packets_dropped_sec{host_id=hostID} > 30)",
              "match_by": [ "host_id" ],
              ...
            }
        Metric Name | Value
        host_alive_status | 0=online, 1=offline
        disk.space_used_perc | The percentage of disk space that is being used on a device
        net.in_packets_dropped_sec | Number of inbound network packets dropped per second
        net.out_packets_dropped_sec | Number of outbound network packets dropped per second
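How the four alarm expressions on slides 18 and 19 combine into a single `host_issues` table can be illustrated locally. This is a rough sketch, not how Monasca evaluates alarms (that happens in its Threshold Engine): the thresholds are taken from the slides, while the function names, host names, VM names, and metric samples are all hypothetical.

```python
import operator

# (metric name, comparison, threshold) taken from the alarm expressions on slides 18-19.
HOST_ALARM_RULES = [
    ("host_alive_status", operator.eq, 1),
    ("disk.space_used_perc", operator.gt, 90),
    ("net.in_packets_dropped_sec", operator.gt, 30),
    ("net.out_packets_dropped_sec", operator.gt, 30),
]


def host_issues(metrics_by_host):
    """Return the set of host IDs for which at least one rule fires."""
    flagged = set()
    for host_id, metrics in metrics_by_host.items():
        for name, compare, threshold in HOST_ALARM_RULES:
            value = metrics.get(name)
            if value is not None and compare(value, threshold):
                flagged.add(host_id)
                break  # one firing alarm is enough to flag the host
    return flagged


def vms_to_evacuate(vm_host, flagged_hosts):
    """Mimic error(vm) :- nova:show(vm, hostID), host_issues(hostID)."""
    return sorted(vm for vm, host in vm_host.items() if host in flagged_hosts)


# Hypothetical sample data, not taken from the slides.
metrics_by_host = {
    "host-a": {"host_alive_status": 0, "disk.space_used_perc": 42},
    "host-b": {"host_alive_status": 0, "disk.space_used_perc": 95},
    "host-c": {"host_alive_status": 1},
}
vm_host = {"vm-1": "host-a", "vm-2": "host-b", "vm-3": "host-c", "vm-4": "host-a"}

flagged = host_issues(metrics_by_host)
print(sorted(flagged))
print(vms_to_evacuate(vm_host, flagged))
```

Treating the four alarms as alternatives (any one of them flags the host) matches the "for instance" list on slide 16; a stricter policy could require several alarms at once.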
  • 20. Current State and Future Work
  • 21. Overall Architecture
        Monasca components: Monasca Agents, Monasca API, Keystone, Kafka Cluster, Persister, Threshold Engine, Notification Engine, Metrics DB, Settings DB
        Congress components: Congress API, Policy Engine, Monasca Alarm Datasource, In Mem DB
        Links labeled on the diagram: webhook, rpc
        In Mem DB:
            Metric     Value
            metric1    val1
            metricN    valN
  • 22. Current Status and Next Steps
        • Done:
            • Developed a Monasca Datasource to validate the integration.
            • Designed the solution and identified the main integration points.
        • To be done:
            • Develop a Monasca Alarm Datasource leveraging the RPC capabilities in Congress.
            • Create a Congress Notification Webhook for Monasca.
            • Develop a policy-to-alarm conversion component for policies prefixed with monasca-alarm.