1
2
Daniel Krook
Senior Certified IT Specialist, IBM
The IBM dashboard for operational metrics
3
We run Cloud Foundry on dozens of OpenStack VMs
Two intranet clusters
Classic: 38 huge VMs deployed with Chef: 1,302 users, 1,710 apps
NG: 41 medium VMs deployed with BOSH: 123 users, 247 apps
Not counting Dev deployments
All on 50+ Nova Compute nodes
In the past year, we’ve learned how to
• Keep Cloud Foundry running smoothly
• Discover and prevent impending problems
• Resolve unexpected issues quickly
4
Goals of this lightning talk
1. Show the key data points we track
2. Show how our metrics dashboard helps us monitor that data
3. Share ideas on how to find better data in NG and beyond
4. Spark discussion on improved visibility for CF admins and customers
We are looking to get better at this, and help the community get better as well.
5
1. The key data
6
What are the important metrics?
Data that can be tracked over time to see trends and behaviors
Data that can help us predict problems before they happen
At the PaaS layer, that means:
• DEAs and apps health
  - Memory reserved as a proportion of the memory available
• General health of all components
  - Health of the virtual machines
  - Status of the processes running on them
• Database nodes and services
  - Number of provisioned services against capacity available
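The two capacity ratios named on this slide can be sketched as below. This is a minimal illustration, not the dashboard's code: the class and field names (`reserved_mb`, `available_mb`, `provisioned`, `capacity`) are hypothetical, and "memory available" is assumed to mean the total memory a DEA can hand out.

```python
from dataclasses import dataclass


@dataclass
class Dea:
    # Hypothetical fields; the keys a real DEA reports may differ.
    name: str
    reserved_mb: int
    available_mb: int


@dataclass
class ServiceNode:
    # Hypothetical fields for a service node (for example, a database node).
    name: str
    provisioned: int
    capacity: int


def ratio(used: float, total: float) -> float:
    """Return used/total as a fraction, or 0.0 when total is not positive."""
    return used / total if total > 0 else 0.0


def dea_memory_ratio(dea: Dea) -> float:
    """Memory reserved as a proportion of the memory available."""
    return ratio(dea.reserved_mb, dea.available_mb)


def service_fill_ratio(node: ServiceNode) -> float:
    """Provisioned services against the capacity available."""
    return ratio(node.provisioned, node.capacity)
```

Tracking these fractions over time, rather than the raw counts, makes components of different sizes comparable on one chart.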
7
Why do we need metrics?
• Deliver continuous availability in the cloud
• Proactively solve problems rather than react to them
• Understand the behavior of the system to automate it
8
Where can we find them?
• NATS message bus
  - Discover the components to interrogate
  - Best for dynamically changing data
• Cloud Controller database (CCDB)
  - Longer lived data that isn’t in the varz endpoints
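A rough sketch of the "discover, then interrogate" flow follows. It assumes each component answers a discovery request on NATS with a JSON document holding a `host` ("ip:port") and a `credentials` pair, and that its varz endpoint is served over HTTP with basic authentication. The payload shape and the function names here are assumptions to verify against the deployed Cloud Foundry version; the NATS subscription itself is left out so the example needs only the standard library.

```python
import base64
import json
import urllib.request


def varz_request(announcement: str) -> urllib.request.Request:
    """Build an authenticated GET for a component's /varz endpoint.

    Assumes the discovery reply is JSON with "host" ("ip:port") and
    "credentials" ([user, password]); this shape is an assumption.
    """
    info = json.loads(announcement)
    user, password = info["credentials"]
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(f"http://{info['host']}/varz")
    req.add_header("Authorization", f"Basic {token}")
    return req


def fetch_varz(announcement: str, timeout: float = 5.0) -> dict:
    """Fetch and decode the varz document (performs a network call)."""
    request = varz_request(announcement)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

A collector would call `fetch_varz` for every announcement it receives and store the result with a timestamp, while longer-lived data would come from queries against the CCDB.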
9
2. Monitoring that data
10
Our metrics dashboard provides…
1. Views of component health
2. Resource usage details
3. Ongoing growth trends
4. Access to logs and raw varz
5. Email notifications
11
It helps us find (and fix) problems
• Components nearing capacity or failure
• Already failed components
• Out of control apps and noisy users
It helps us see patterns
• Active/inactive users and apps
• Growth trends and runtime/service adoption
12
User and app trends
There is also one unauthenticated page for high-level stats
13
DEA list
14
DEA details
15
Service node list
16
Service node details
17
User list
18
User details
19
App list
20
App details
21
Log list
22
Log details
23
Email notifications
24
3. Finding and acting on better data
25
Let’s resolve gaps in data captured from NG
• NG provides granular user/org/space views…
  - This enables better BSS potential in terms of QoS and departmental billing
• …But we lost user and app data linkages from the health manager
  - Can’t see what DEA my app resides on (not currently enabled in our NG version)
  - Can’t see how many apps a user has (replaced by orgs and spaces, but still valuable to trace)
  - See https://github.com/cloudfoundry/cloud_controller_ng/issues/81
• We’d like to restore that data and surface it either
  - in varz endpoints (dynamic data, preferred) or
  - in the CCDB (static data, could be a security concern)
26
Let’s begin to link metrics to automation
• Detect errors in applications that are traceable to users/orgs
  - Preemptively reach out to them to see if they need help
  - Think customer service and proactive support!
  - Can we hook into BOSH or Jenkins for automation?
• Automate (and expand links to the IaaS and SaaS stacks)
  - Self-healing systems (out of disk, move apps)
  - Self-scaling systems (detect when nearing thresholds)
  - Evolving topologies (replace unused service nodes with popular ones)
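As one way to picture the link from metrics to automation, the sketch below maps a usage fraction to a suggested action. The thresholds, function names, and action strings are hypothetical; a real system would hand the result to a tool such as BOSH or Jenkins rather than return text.

```python
from typing import Optional

# Hypothetical thresholds, expressed as fractions of capacity.
WARN_AT = 0.80
CRITICAL_AT = 0.95


def classify(usage: float) -> str:
    """Bucket a usage fraction into ok / warning / critical."""
    if usage >= CRITICAL_AT:
        return "critical"
    if usage >= WARN_AT:
        return "warning"
    return "ok"


def plan_action(component: str, disk_usage: float) -> Optional[str]:
    """Suggest an action for a component, or None when it is healthy."""
    level = classify(disk_usage)
    if level == "critical":
        # Self-healing case: nearly out of disk, so move apps away.
        return f"move apps off {component}"
    if level == "warning":
        # Self-scaling case: nearing the threshold, so add capacity.
        return f"add capacity alongside {component}"
    return None
```

Keeping the classification separate from the action makes it possible to reuse the same thresholds for email notifications and for automated responses.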
27
Let’s expand the broadcast of metrics to more users
• Admins are the primary beneficiaries right now
  - But data is almost completely read-only
  - Should we provide UAA-based tiers of access to admins?
• Others can and should benefit
  - Customers
  - End users
  - Developers
  - Management
  - Executives, line of business owners
  - Finance
28
Thanks!
29
The metrics dashboard innovators
Chris Peters, Russell Boykin
Doug Davis, Wei Feng
30
We’re hiring!
Search Jobs at IBM by:
SmartCloud Application Services
31

Mais conteúdo relacionado

Mais procurados

January 2015 Webinar - Wins and Successes from 2014
January 2015 Webinar -  Wins and Successes from 2014January 2015 Webinar -  Wins and Successes from 2014
January 2015 Webinar - Wins and Successes from 2014RapidScale
 
Science for the Future: Strategies for Moving and Sharing Data
Science for the Future: Strategies for Moving and Sharing DataScience for the Future: Strategies for Moving and Sharing Data
Science for the Future: Strategies for Moving and Sharing DataIan Foster
 
Big Data as a Service: A Neo-Metropolis Model Approach for Innovation
Big Data as a Service: A Neo-Metropolis Model Approach for InnovationBig Data as a Service: A Neo-Metropolis Model Approach for Innovation
Big Data as a Service: A Neo-Metropolis Model Approach for InnovationSoftServe
 
Towards Personalization in Global Digital Health
Towards Personalization in Global Digital HealthTowards Personalization in Global Digital Health
Towards Personalization in Global Digital HealthDatabricks
 
SplunkLive! Washington DC May 2013 - Splunk Enterprise 5
SplunkLive! Washington DC May 2013 - Splunk Enterprise 5SplunkLive! Washington DC May 2013 - Splunk Enterprise 5
SplunkLive! Washington DC May 2013 - Splunk Enterprise 5Splunk
 
Splunk Distributed Management Console
Splunk Distributed Management Console                                         Splunk Distributed Management Console
Splunk Distributed Management Console Splunk
 
Modern management of data pipelines made easier
Modern management of data pipelines made easierModern management of data pipelines made easier
Modern management of data pipelines made easierCloverDX
 
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionTaking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionSplunk
 
Affecto Informatica World Tour 2015: The Age of Engagement
Affecto Informatica World Tour 2015: The Age of EngagementAffecto Informatica World Tour 2015: The Age of Engagement
Affecto Informatica World Tour 2015: The Age of EngagementAffecto
 
Splunk in the Cisco Unified Computing System (UCS)
Splunk in the Cisco Unified Computing System (UCS) Splunk in the Cisco Unified Computing System (UCS)
Splunk in the Cisco Unified Computing System (UCS) Splunk
 
RapidScale CloudMail
RapidScale CloudMailRapidScale CloudMail
RapidScale CloudMailRapidScale
 
Three Pillars, Zero Answers: Rethinking Observability
Three Pillars, Zero Answers: Rethinking ObservabilityThree Pillars, Zero Answers: Rethinking Observability
Three Pillars, Zero Answers: Rethinking ObservabilityDevOps.com
 
Migrating from Java EE to cloud-native Reactive systems
Migrating from Java EE to cloud-native Reactive systemsMigrating from Java EE to cloud-native Reactive systems
Migrating from Java EE to cloud-native Reactive systemsMarkus Eisele
 
Event-driven architecture
Event-driven architectureEvent-driven architecture
Event-driven architectureAndrew Easter
 
IBM and Lightbend Build Integrated Platform for Cognitive Development
IBM and Lightbend Build Integrated Platform for Cognitive DevelopmentIBM and Lightbend Build Integrated Platform for Cognitive Development
IBM and Lightbend Build Integrated Platform for Cognitive DevelopmentLightbend
 
SplunkLive! Customer Presentation - SSA
SplunkLive! Customer Presentation - SSASplunkLive! Customer Presentation - SSA
SplunkLive! Customer Presentation - SSASplunk
 
SplunkLive! Customer Presentation - Staples
SplunkLive! Customer Presentation - StaplesSplunkLive! Customer Presentation - Staples
SplunkLive! Customer Presentation - StaplesSplunk
 
Splunk Ninjas: New Features, Pivot, and Search Dojo
Splunk Ninjas: New Features, Pivot, and Search DojoSplunk Ninjas: New Features, Pivot, and Search Dojo
Splunk Ninjas: New Features, Pivot, and Search DojoSplunk
 
Conferencia principal: Evolución y visión de Elastic Observability
Conferencia principal: Evolución y visión de Elastic ObservabilityConferencia principal: Evolución y visión de Elastic Observability
Conferencia principal: Evolución y visión de Elastic ObservabilityElasticsearch
 

Mais procurados (20)

January 2015 Webinar - Wins and Successes from 2014
January 2015 Webinar -  Wins and Successes from 2014January 2015 Webinar -  Wins and Successes from 2014
January 2015 Webinar - Wins and Successes from 2014
 
Science for the Future: Strategies for Moving and Sharing Data
Science for the Future: Strategies for Moving and Sharing DataScience for the Future: Strategies for Moving and Sharing Data
Science for the Future: Strategies for Moving and Sharing Data
 
Big Data as a Service: A Neo-Metropolis Model Approach for Innovation
Big Data as a Service: A Neo-Metropolis Model Approach for InnovationBig Data as a Service: A Neo-Metropolis Model Approach for Innovation
Big Data as a Service: A Neo-Metropolis Model Approach for Innovation
 
Towards Personalization in Global Digital Health
Towards Personalization in Global Digital HealthTowards Personalization in Global Digital Health
Towards Personalization in Global Digital Health
 
SplunkLive! Washington DC May 2013 - Splunk Enterprise 5
SplunkLive! Washington DC May 2013 - Splunk Enterprise 5SplunkLive! Washington DC May 2013 - Splunk Enterprise 5
SplunkLive! Washington DC May 2013 - Splunk Enterprise 5
 
Splunk Distributed Management Console
Splunk Distributed Management Console                                         Splunk Distributed Management Console
Splunk Distributed Management Console
 
Modern management of data pipelines made easier
Modern management of data pipelines made easierModern management of data pipelines made easier
Modern management of data pipelines made easier
 
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout SessionTaking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout Session
 
Affecto Informatica World Tour 2015: The Age of Engagement
Affecto Informatica World Tour 2015: The Age of EngagementAffecto Informatica World Tour 2015: The Age of Engagement
Affecto Informatica World Tour 2015: The Age of Engagement
 
Splunk in the Cisco Unified Computing System (UCS)
Splunk in the Cisco Unified Computing System (UCS) Splunk in the Cisco Unified Computing System (UCS)
Splunk in the Cisco Unified Computing System (UCS)
 
RapidScale CloudMail
RapidScale CloudMailRapidScale CloudMail
RapidScale CloudMail
 
Three Pillars, Zero Answers: Rethinking Observability
Three Pillars, Zero Answers: Rethinking ObservabilityThree Pillars, Zero Answers: Rethinking Observability
Three Pillars, Zero Answers: Rethinking Observability
 
Migrating from Java EE to cloud-native Reactive systems
Migrating from Java EE to cloud-native Reactive systemsMigrating from Java EE to cloud-native Reactive systems
Migrating from Java EE to cloud-native Reactive systems
 
Event-driven architecture
Event-driven architectureEvent-driven architecture
Event-driven architecture
 
IBM and Lightbend Build Integrated Platform for Cognitive Development
IBM and Lightbend Build Integrated Platform for Cognitive DevelopmentIBM and Lightbend Build Integrated Platform for Cognitive Development
IBM and Lightbend Build Integrated Platform for Cognitive Development
 
SplunkLive! Customer Presentation - SSA
SplunkLive! Customer Presentation - SSASplunkLive! Customer Presentation - SSA
SplunkLive! Customer Presentation - SSA
 
SplunkLive! Customer Presentation - Staples
SplunkLive! Customer Presentation - StaplesSplunkLive! Customer Presentation - Staples
SplunkLive! Customer Presentation - Staples
 
Splunk Ninjas: New Features, Pivot, and Search Dojo
Splunk Ninjas: New Features, Pivot, and Search DojoSplunk Ninjas: New Features, Pivot, and Search Dojo
Splunk Ninjas: New Features, Pivot, and Search Dojo
 
Dev ops toronto
Dev ops torontoDev ops toronto
Dev ops toronto
 
Conferencia principal: Evolución y visión de Elastic Observability
Conferencia principal: Evolución y visión de Elastic ObservabilityConferencia principal: Evolución y visión de Elastic Observability
Conferencia principal: Evolución y visión de Elastic Observability
 

Destaque

Meaningful Metrics - Aligning Operational Metrics with Marketing & Customer E...
Meaningful Metrics - Aligning Operational Metrics with Marketing & Customer E...Meaningful Metrics - Aligning Operational Metrics with Marketing & Customer E...
Meaningful Metrics - Aligning Operational Metrics with Marketing & Customer E...Earley Information Science
 
Best Practices in Measuring Critical Support Metrics
Best Practices in Measuring Critical Support MetricsBest Practices in Measuring Critical Support Metrics
Best Practices in Measuring Critical Support Metricsdreamforce2006
 
Cloud Foundry Deployment Tools: BOSH vs Juju Charms
Cloud Foundry Deployment Tools:  BOSH vs Juju CharmsCloud Foundry Deployment Tools:  BOSH vs Juju Charms
Cloud Foundry Deployment Tools: BOSH vs Juju CharmsAltoros
 
Webinar: “KPIs in Digital Marketing” - presented by Jacques Warren
Webinar: “KPIs in Digital Marketing” - presented by Jacques WarrenWebinar: “KPIs in Digital Marketing” - presented by Jacques Warren
Webinar: “KPIs in Digital Marketing” - presented by Jacques WarrenAT Internet
 
Regulatory Reporting Dashboard
Regulatory Reporting DashboardRegulatory Reporting Dashboard
Regulatory Reporting Dashboardaccenture
 
The difference between a KPI and a Metric
The difference between a KPI and a MetricThe difference between a KPI and a Metric
The difference between a KPI and a MetricDennis Mortensen
 
Stress management in hr
Stress management in hrStress management in hr
Stress management in hr'Anuraag Ghosh
 
KPI for HR Manager - Sample of KPIs for HR
KPI for HR Manager - Sample of KPIs for HRKPI for HR Manager - Sample of KPIs for HR
KPI for HR Manager - Sample of KPIs for HRYodhia Antariksa
 
Microservices with Spring and Cloud Foundry
Microservices with Spring and Cloud FoundryMicroservices with Spring and Cloud Foundry
Microservices with Spring and Cloud Foundrymimacom
 
The 10 Most Important Banking Metrics
The 10 Most Important Banking MetricsThe 10 Most Important Banking Metrics
The 10 Most Important Banking MetricsJohn J. Maxfield
 
Developing Metrics and KPI (Key Performance Indicators
Developing Metrics and KPI (Key Performance IndicatorsDeveloping Metrics and KPI (Key Performance Indicators
Developing Metrics and KPI (Key Performance IndicatorsVictor Holman
 
KEY PERFORMANCE INDICATOR
KEY PERFORMANCE INDICATORKEY PERFORMANCE INDICATOR
KEY PERFORMANCE INDICATORspeedcars
 

Destaque (14)

Meaningful Metrics - Aligning Operational Metrics with Marketing & Customer E...
Meaningful Metrics - Aligning Operational Metrics with Marketing & Customer E...Meaningful Metrics - Aligning Operational Metrics with Marketing & Customer E...
Meaningful Metrics - Aligning Operational Metrics with Marketing & Customer E...
 
Best Practices in Measuring Critical Support Metrics
Best Practices in Measuring Critical Support MetricsBest Practices in Measuring Critical Support Metrics
Best Practices in Measuring Critical Support Metrics
 
Cloud Foundry Deployment Tools: BOSH vs Juju Charms
Cloud Foundry Deployment Tools:  BOSH vs Juju CharmsCloud Foundry Deployment Tools:  BOSH vs Juju Charms
Cloud Foundry Deployment Tools: BOSH vs Juju Charms
 
Webinar: “KPIs in Digital Marketing” - presented by Jacques Warren
Webinar: “KPIs in Digital Marketing” - presented by Jacques WarrenWebinar: “KPIs in Digital Marketing” - presented by Jacques Warren
Webinar: “KPIs in Digital Marketing” - presented by Jacques Warren
 
Regulatory Reporting Dashboard
Regulatory Reporting DashboardRegulatory Reporting Dashboard
Regulatory Reporting Dashboard
 
The difference between a KPI and a Metric
The difference between a KPI and a MetricThe difference between a KPI and a Metric
The difference between a KPI and a Metric
 
Stress management in hr
Stress management in hrStress management in hr
Stress management in hr
 
KPI for HR Manager - Sample of KPIs for HR
KPI for HR Manager - Sample of KPIs for HRKPI for HR Manager - Sample of KPIs for HR
KPI for HR Manager - Sample of KPIs for HR
 
Microservices with Spring and Cloud Foundry
Microservices with Spring and Cloud FoundryMicroservices with Spring and Cloud Foundry
Microservices with Spring and Cloud Foundry
 
The 10 Most Important Banking Metrics
The 10 Most Important Banking MetricsThe 10 Most Important Banking Metrics
The 10 Most Important Banking Metrics
 
Project Metrics & Measures
Project Metrics & MeasuresProject Metrics & Measures
Project Metrics & Measures
 
Developing Metrics and KPI (Key Performance Indicators
Developing Metrics and KPI (Key Performance IndicatorsDeveloping Metrics and KPI (Key Performance Indicators
Developing Metrics and KPI (Key Performance Indicators
 
Learning Metrics: Building Your Training Scorecard
Learning Metrics: Building Your Training ScorecardLearning Metrics: Building Your Training Scorecard
Learning Metrics: Building Your Training Scorecard
 
KEY PERFORMANCE INDICATOR
KEY PERFORMANCE INDICATORKEY PERFORMANCE INDICATOR
KEY PERFORMANCE INDICATOR
 

Semelhante a The IBM dashboard for operational metrics

Cloudera federal summit
Cloudera federal summitCloudera federal summit
Cloudera federal summitMatt Carroll
 
2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics2022 Trends in Enterprise Analytics
2022 Trends in Enterprise AnalyticsDATAVERSITY
 
SMAC - Social, Mobile, Analytics and Cloud - An overview
SMAC - Social, Mobile, Analytics and Cloud - An overview SMAC - Social, Mobile, Analytics and Cloud - An overview
SMAC - Social, Mobile, Analytics and Cloud - An overview Rajesh Menon
 
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data Pipelines
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data PipelinesPutting the Ops in DataOps: Orchestrate the Flow of Data Across Data Pipelines
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data PipelinesDATAVERSITY
 
Whitepaper factors to consider when selecting an open source infrastructure ...
Whitepaper  factors to consider when selecting an open source infrastructure ...Whitepaper  factors to consider when selecting an open source infrastructure ...
Whitepaper factors to consider when selecting an open source infrastructure ...apprize360
 
Lecture 3.31 3.32.pptx
Lecture 3.31  3.32.pptxLecture 3.31  3.32.pptx
Lecture 3.31 3.32.pptxRATISHKUMAR32
 
ADDO Open Source Observability Tools
ADDO Open Source Observability Tools ADDO Open Source Observability Tools
ADDO Open Source Observability Tools Mickey Boxell
 
Whitepaper factors to consider commercial infrastructure management vendors
Whitepaper  factors to consider commercial infrastructure management vendorsWhitepaper  factors to consider commercial infrastructure management vendors
Whitepaper factors to consider commercial infrastructure management vendorsapprize360
 
The Architecture of Continuous Innovation - OSCON 2015
The Architecture of Continuous Innovation - OSCON 2015The Architecture of Continuous Innovation - OSCON 2015
The Architecture of Continuous Innovation - OSCON 2015Chip Childers
 
About Streaming Data Solutions for Hadoop
About Streaming Data Solutions for HadoopAbout Streaming Data Solutions for Hadoop
About Streaming Data Solutions for HadoopLynn Langit
 
Global Azure Bootcamp 2017 - Performance and Health Management for Modern App...
Global Azure Bootcamp 2017 - Performance and Health Management for Modern App...Global Azure Bootcamp 2017 - Performance and Health Management for Modern App...
Global Azure Bootcamp 2017 - Performance and Health Management for Modern App...Adin Ermie
 
Cloud-Native Data: What data questions to ask when building cloud-native apps
Cloud-Native Data: What data questions to ask when building cloud-native appsCloud-Native Data: What data questions to ask when building cloud-native apps
Cloud-Native Data: What data questions to ask when building cloud-native appsVMware Tanzu
 
How to improve your system monitoring
How to improve your system monitoringHow to improve your system monitoring
How to improve your system monitoringAndrew White
 
DockerCon SF 2019 - Observability Workshop
DockerCon SF 2019 - Observability WorkshopDockerCon SF 2019 - Observability Workshop
DockerCon SF 2019 - Observability WorkshopKevin Crawley
 
Why Monitoring and Logging are Important in DevOps.pdf
Why Monitoring and Logging are Important in DevOps.pdfWhy Monitoring and Logging are Important in DevOps.pdf
Why Monitoring and Logging are Important in DevOps.pdfDatacademy.ai
 
Whitepaper tableau for-the-enterprise-0
Whitepaper tableau for-the-enterprise-0Whitepaper tableau for-the-enterprise-0
Whitepaper tableau for-the-enterprise-0alok khobragade
 
How to add security in dataops and devops
How to add security in dataops and devopsHow to add security in dataops and devops
How to add security in dataops and devopsUlf Mattsson
 
Introducción a Microservicios, SUSE CaaS Platform y Kubernetes
Introducción a Microservicios, SUSE CaaS Platform y KubernetesIntroducción a Microservicios, SUSE CaaS Platform y Kubernetes
Introducción a Microservicios, SUSE CaaS Platform y KubernetesSUSE España
 
Artur Borycki - Beyond Lambda - how to get from logical to physical - code.ta...
Artur Borycki - Beyond Lambda - how to get from logical to physical - code.ta...Artur Borycki - Beyond Lambda - how to get from logical to physical - code.ta...
Artur Borycki - Beyond Lambda - how to get from logical to physical - code.ta...AboutYouGmbH
 

Semelhante a The IBM dashboard for operational metrics (20)

Cloudera federal summit
Cloudera federal summitCloudera federal summit
Cloudera federal summit
 
2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics
 
SMAC - Social, Mobile, Analytics and Cloud - An overview
SMAC - Social, Mobile, Analytics and Cloud - An overview SMAC - Social, Mobile, Analytics and Cloud - An overview
SMAC - Social, Mobile, Analytics and Cloud - An overview
 
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data Pipelines
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data PipelinesPutting the Ops in DataOps: Orchestrate the Flow of Data Across Data Pipelines
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data Pipelines
 
Whitepaper factors to consider when selecting an open source infrastructure ...
Whitepaper  factors to consider when selecting an open source infrastructure ...Whitepaper  factors to consider when selecting an open source infrastructure ...
Whitepaper factors to consider when selecting an open source infrastructure ...
 
Lecture 3.31 3.32.pptx
Lecture 3.31  3.32.pptxLecture 3.31  3.32.pptx
Lecture 3.31 3.32.pptx
 
ADDO Open Source Observability Tools
ADDO Open Source Observability Tools ADDO Open Source Observability Tools
ADDO Open Source Observability Tools
 
Whitepaper factors to consider commercial infrastructure management vendors
Whitepaper  factors to consider commercial infrastructure management vendorsWhitepaper  factors to consider commercial infrastructure management vendors
Whitepaper factors to consider commercial infrastructure management vendors
 
The Architecture of Continuous Innovation - OSCON 2015
The Architecture of Continuous Innovation - OSCON 2015The Architecture of Continuous Innovation - OSCON 2015
The Architecture of Continuous Innovation - OSCON 2015
 
About Streaming Data Solutions for Hadoop
About Streaming Data Solutions for HadoopAbout Streaming Data Solutions for Hadoop
About Streaming Data Solutions for Hadoop
 
Global Azure Bootcamp 2017 - Performance and Health Management for Modern App...
Global Azure Bootcamp 2017 - Performance and Health Management for Modern App...Global Azure Bootcamp 2017 - Performance and Health Management for Modern App...
Global Azure Bootcamp 2017 - Performance and Health Management for Modern App...
 
Cloud-Native Data: What data questions to ask when building cloud-native apps
Cloud-Native Data: What data questions to ask when building cloud-native appsCloud-Native Data: What data questions to ask when building cloud-native apps
Cloud-Native Data: What data questions to ask when building cloud-native apps
 
How to improve your system monitoring
How to improve your system monitoringHow to improve your system monitoring
How to improve your system monitoring
 
DockerCon SF 2019 - Observability Workshop
DockerCon SF 2019 - Observability WorkshopDockerCon SF 2019 - Observability Workshop
DockerCon SF 2019 - Observability Workshop
 
Why Monitoring and Logging are Important in DevOps.pdf
Why Monitoring and Logging are Important in DevOps.pdfWhy Monitoring and Logging are Important in DevOps.pdf
Why Monitoring and Logging are Important in DevOps.pdf
 
Big Data
Big DataBig Data
Big Data
 
Whitepaper tableau for-the-enterprise-0
Whitepaper tableau for-the-enterprise-0Whitepaper tableau for-the-enterprise-0
Whitepaper tableau for-the-enterprise-0
 
How to add security in dataops and devops
How to add security in dataops and devopsHow to add security in dataops and devops
How to add security in dataops and devops
 
Introducción a Microservicios, SUSE CaaS Platform y Kubernetes
Introducción a Microservicios, SUSE CaaS Platform y KubernetesIntroducción a Microservicios, SUSE CaaS Platform y Kubernetes
Introducción a Microservicios, SUSE CaaS Platform y Kubernetes
 
Artur Borycki - Beyond Lambda - how to get from logical to physical - code.ta...
Artur Borycki - Beyond Lambda - how to get from logical to physical - code.ta...Artur Borycki - Beyond Lambda - how to get from logical to physical - code.ta...
Artur Borycki - Beyond Lambda - how to get from logical to physical - code.ta...
 

Mais de Platform CF

The Platform for Building Great Software
The Platform for Building Great SoftwareThe Platform for Building Great Software
The Platform for Building Great SoftwarePlatform CF
 
The Path to Stackato
The Path to StackatoThe Path to Stackato
The Path to StackatoPlatform CF
 
Continuous Deployment with Cloud Foundry, Github and Travis CI
Continuous Deployment with Cloud Foundry, Github and Travis CIContinuous Deployment with Cloud Foundry, Github and Travis CI
Continuous Deployment with Cloud Foundry, Github and Travis CIPlatform CF
 
The Journey to Cloud Foundry
The Journey to Cloud FoundryThe Journey to Cloud Foundry
The Journey to Cloud FoundryPlatform CF
 
Pivotal HD as a Cloud Foundry Service
Pivotal HD as a Cloud Foundry ServicePivotal HD as a Cloud Foundry Service
Pivotal HD as a Cloud Foundry ServicePlatform CF
 
What Lessons Can Cloud Foundry Teach to IaaS?
What Lessons Can Cloud Foundry Teach to IaaS?What Lessons Can Cloud Foundry Teach to IaaS?
What Lessons Can Cloud Foundry Teach to IaaS?Platform CF
 
Cloud Foundry at VMware
Cloud Foundry at VMwareCloud Foundry at VMware
Cloud Foundry at VMwarePlatform CF
 
Go Within Cloud Foundry
Go Within Cloud FoundryGo Within Cloud Foundry
Go Within Cloud FoundryPlatform CF
 
Continuous Delivery with Cloud Foundry
Continuous Delivery with Cloud FoundryContinuous Delivery with Cloud Foundry
Continuous Delivery with Cloud FoundryPlatform CF
 
From Zero To Factory
From Zero To FactoryFrom Zero To Factory
From Zero To FactoryPlatform CF
 
Service Distribution to Any Cloud - Cloud Elements
Service Distribution to Any Cloud - Cloud ElementsService Distribution to Any Cloud - Cloud Elements
Service Distribution to Any Cloud - Cloud ElementsPlatform CF
 
Cloud Foundry Marketplace Powered by AppDirect
Cloud Foundry MarketplacePowered by AppDirectCloud Foundry MarketplacePowered by AppDirect
Cloud Foundry Marketplace Powered by AppDirectPlatform CF
 
The Path to Stackato
The Path to StackatoThe Path to Stackato
The Path to StackatoPlatform CF
 
Multi-site Architecture Considerations
Multi-site Architecture ConsiderationsMulti-site Architecture Considerations
Multi-site Architecture ConsiderationsPlatform CF
 
Cloud Foundry at NTT
Cloud Foundry at NTTCloud Foundry at NTT
Cloud Foundry at NTTPlatform CF
 
Building Opportunity with an Open Cloud Architecture
Building Opportunity with an Open Cloud ArchitectureBuilding Opportunity with an Open Cloud Architecture
Building Opportunity with an Open Cloud ArchitecturePlatform CF
 
Extending Cloud Foundry to .NET
Extending Cloud Foundry to .NETExtending Cloud Foundry to .NET
Extending Cloud Foundry to .NETPlatform CF
 
Cloud Foundry at Rakuten
Cloud Foundry at RakutenCloud Foundry at Rakuten
Cloud Foundry at RakutenPlatform CF
 

Mais de Platform CF (19)

The Platform for Building Great Software
The Platform for Building Great SoftwareThe Platform for Building Great Software
The Platform for Building Great Software
 
The Path to Stackato
The Path to StackatoThe Path to Stackato
The Path to Stackato
 
Continuous Deployment with Cloud Foundry, Github and Travis CI
Continuous Deployment with Cloud Foundry, Github and Travis CIContinuous Deployment with Cloud Foundry, Github and Travis CI
Continuous Deployment with Cloud Foundry, Github and Travis CI
 
The Journey to Cloud Foundry
The Journey to Cloud FoundryThe Journey to Cloud Foundry
The Journey to Cloud Foundry
 
Pivotal HD as a Cloud Foundry Service
Pivotal HD as a Cloud Foundry ServicePivotal HD as a Cloud Foundry Service
Pivotal HD as a Cloud Foundry Service
 
What Lessons Can Cloud Foundry Teach to IaaS?
What Lessons Can Cloud Foundry Teach to IaaS?What Lessons Can Cloud Foundry Teach to IaaS?
What Lessons Can Cloud Foundry Teach to IaaS?
 
Cloud Foundry at VMware
Cloud Foundry at VMwareCloud Foundry at VMware
Cloud Foundry at VMware
 
Go Within Cloud Foundry
Go Within Cloud FoundryGo Within Cloud Foundry
Go Within Cloud Foundry
 
Continuous Delivery with Cloud Foundry
Continuous Delivery with Cloud FoundryContinuous Delivery with Cloud Foundry
Continuous Delivery with Cloud Foundry
 
From Zero To Factory
From Zero To FactoryFrom Zero To Factory
From Zero To Factory
 
Service Distribution to Any Cloud - Cloud Elements
Service Distribution to Any Cloud - Cloud ElementsService Distribution to Any Cloud - Cloud Elements
Service Distribution to Any Cloud - Cloud Elements
 
Cloud Foundry Marketplace Powered by AppDirect
Cloud Foundry MarketplacePowered by AppDirectCloud Foundry MarketplacePowered by AppDirect
Cloud Foundry Marketplace Powered by AppDirect
 
The Path to Stackato
The Path to StackatoThe Path to Stackato
The Path to Stackato
 
Multi-site Architecture Considerations
Multi-site Architecture ConsiderationsMulti-site Architecture Considerations
Multi-site Architecture Considerations
 
Intro to MoPaaS
Intro to MoPaaSIntro to MoPaaS
Intro to MoPaaS
 
Cloud Foundry at NTT
Cloud Foundry at NTTCloud Foundry at NTT
Cloud Foundry at NTT
 
Building Opportunity with an Open Cloud Architecture
Building Opportunity with an Open Cloud ArchitectureBuilding Opportunity with an Open Cloud Architecture
Building Opportunity with an Open Cloud Architecture
 
Extending Cloud Foundry to .NET
Extending Cloud Foundry to .NETExtending Cloud Foundry to .NET
Extending Cloud Foundry to .NET
 
Cloud Foundry at Rakuten
Cloud Foundry at RakutenCloud Foundry at Rakuten
Cloud Foundry at Rakuten
 

The IBM dashboard for operational metrics

  • 1. 1
  • 2. The IBM dashboard for operational metrics
      Daniel Krook, Senior Certified IT Specialist, IBM
  • 3. We run Cloud Foundry on dozens of OpenStack VMs
      Two intranet clusters:
      • Classic: 38 huge VMs deployed with Chef: 1,302 users, 1,710 apps
      • NG: 41 medium VMs deployed with BOSH: 123 users, 247 apps
      Not counting Dev deployments. All on 50+ Nova Compute nodes.
      In the past year, we’ve learned how to:
      • Keep Cloud Foundry running smoothly
      • Discover and prevent impending problems
      • Resolve unexpected issues quickly
  • 4. Goals of this lightning talk
      1. Show the key data points we track
      2. Show how our metrics dashboard helps us monitor that data
      3. Share ideas on how to find better data in NG and beyond
      4. Spark discussion on improved visibility for CF admins and customers
      We are looking to get better at this, and help the community get better as well.
  • 6. What are the important metrics?
      • Data that can be tracked over time to see trends and behaviors
      • Data that can help us predict problems before they happen
      At the PaaS layer, that means:
      • DEAs and apps health: memory reserved as a proportion of the memory available
      • General health of all components: health of the virtual machines; status of the processes running on them
      • Database nodes and services: number of provisioned services against capacity available
  • 7. Why do we need metrics?
      • Deliver continuous availability in the cloud
      • Proactively solve problems rather than react to them
      • Understand the behavior of the system to automate it
  • 8. Where can we find them?
      • NATS message bus: discover the components to interrogate; best for dynamically changing data
      • Cloud Controller database (CCDB): longer-lived data that isn’t in the varz endpoints
  • 10. Our metrics dashboard provides…
      1. Views of component health
      2. Resource usage details
      3. Ongoing growth trends
      4. Access to logs and raw varz
      5. Email notifications
  • 11. It helps us find (and fix) problems
      • Components nearing capacity or failure
      • Already failed components
      • Out of control apps and noisy users
      It helps us see patterns
      • Active/inactive users and apps
      • Growth trends and runtime/service adoption
  • 12. User and app trends. There is also one unauthenticated page for high-level stats.
  • 24. Section 3: Finding and acting on better data
  • 25. Let’s resolve gaps in data captured from NG
      • NG provides granular user/org/space views…
        • This enables better BSS potential in terms of QoS and departmental billing
      • …But we lost user and app data linkages from the health manager
        • Can’t see what DEA my app resides on (not currently enabled in our NG version)
        • Can’t see how many apps a user has (replaced by orgs and spaces, but still valuable to trace)
        • See https://github.com/cloudfoundry/cloud_controller_ng/issues/81
      • We’d like to restore that data by surfacing it in either:
        • varz endpoints (dynamic data, preferred), or
        • CC_DB (static data, could be a security concern)
  • 26. Let’s begin to link metrics to automation
      • Detect errors in applications that are traceable to users/orgs
        • Preemptively reach out to them to see if they need help
        • Think customer service and proactive support!
        • Can we hook into BOSH or Jenkins for automation?
      • Automate (and expand links to the IaaS and SaaS stacks)
        • Self-healing systems (out of disk, move apps)
        • Self-scaling systems (detect when nearing thresholds)
        • Evolving topologies (replace unused service nodes with popular ones)
  • 27. Let’s expand the broadcast of metrics to more users
      • Admins are the primary beneficiaries right now
        • But data is almost completely read-only
        • Should we provide UAA-based tiers of access to admins?
      • Others can and should benefit
        • Customers
        • End users
        • Developers
        • Management
        • Executives, line of business owners
        • Finance
  • 29. The metrics dashboard innovators: Chris Peters, Russell Boykin, Doug Davis, Wei Feng
  • 30. We’re hiring! Search Jobs at IBM by: SmartCloud Application Services
  • 31. 31
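
Slides 6 and 26 name one concrete signal: memory reserved as a proportion of the memory available on each DEA, checked against a threshold. A minimal Python sketch of that check follows. It is not the dashboard's actual code. The `DeaSample` type, its field names, the reading of "available" as total capacity, and the 0.8 threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DeaSample:
    """One reading for a DEA. Field names are hypothetical, not real varz keys."""
    name: str
    reserved_mb: int   # memory reserved by app instances on this DEA
    available_mb: int  # assumed to mean total memory the DEA can hand out


def reserved_ratio(sample: DeaSample) -> float:
    """Return reserved memory as a fraction of available memory."""
    if sample.available_mb <= 0:
        # Treat a DEA that reports no capacity as full so it gets flagged.
        return 1.0
    return sample.reserved_mb / sample.available_mb


def nearing_capacity(samples, threshold=0.8):
    """Return (name, ratio) pairs at or above the threshold, fullest first."""
    flagged = []
    for sample in samples:
        ratio = reserved_ratio(sample)
        if ratio >= threshold:
            flagged.append((sample.name, ratio))
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    readings = [
        DeaSample("dea-0", 4096, 8192),
        DeaSample("dea-1", 7168, 8192),
    ]
    for name, ratio in nearing_capacity(readings):
        print(f"{name} is at {ratio:.0%} of reservable memory")
```

The same comparison could feed the email notifications or the self-scaling idea mentioned in the talk. How the readings are collected (NATS discovery, varz polling) is left out here on purpose.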