PCF Metrics – App Dev
Providing App Developers insight into app performance
Pieter Humphrey, Allen Duet
Gartner believes that more than 80% of all mission-critical IT service outages result from people and process errors and failures, and of those outages, more than 50% result from a lack of coordination between change, release and configuration management processes.
Four Steps to Optimize Configuration Management Process and Tools, By Ronni J. Colville, Doc #G00258557 Oct 2013
Modern infrastructure is constantly changing
• Methodologies: Linear / Sequential -> Agile, DevOps, CI / CD Pipelines
• Deployment: Sparingly at designated times -> Ready for prod at any time
• Architecture: Monolithic App -> Microservices / Composite app
• Technologies: App Server on Machine -> Containers, Public / Private / Hybrid Cloud
• Operations: Many tools, ad hoc automation -> Manage services, not servers
Rate of change is driving more outages
Outages often preventable using automation (2015)
• Facebook: 1 hour, Jan 26th. Config / app / net failures
• Apple App Store: 11 hours, March 11th. Internal DNS error
• NYSE, United, WSJ: 4 hr, 1.5 hr, 1 hr, July 8th. Software update, routing failure, server overload
• UltraDNS: 2.5 hours, Oct 15th. Configuration errors
https://blog.thousandeyes.com/top-internet-outages-2015/
http://www.informationweek.com/cloud/9-spectacular-cloud-computing-fails/d/d-id/1321305?image_number=2
http://www.informationweek.com/cloud/9-spectacular-cloud-computing-fails/d/d-id/1321305?image_number=4
http://www.informationweek.com/cloud/9-spectacular-cloud-computing-fails/d/d-id/1321305?image_number=8
Speed and Availability Matter
“25% of customers will abandon a web page that takes more than 4 seconds to load”
“47% of consumers expect a web page to load in < 2 seconds”
“Customers prefer a competitor's website if it is 250ms faster”
“Increase revenue 1% for each 100ms improvement”
Sources: Gartner, Google, Amazon, Walmart
Speed Performance and Human Perception
Delay time: User Reaction
• 0 - 100 ms: Instant
• 100 - 300 ms: Feels sluggish
• 300 - 1000 ms: Machine is working..
• 1 second +: Mental context switch
• 10 seconds +: I’ll come back later ..
Stay under 250 ms to feel "fast".
Stay under 1000 ms to keep the user's attention.
Breaking the 1000 ms Mobile Barrier - Velocity - Google Slides
https://docs.google.com/presentation/d/1wAxB5DPN-rcelwbGO6lCOus_S1rP24LMqA8m1eXEDRo/present?slide=id.p19
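The delay bands above can be turned into a small lookup, for example to label response times pulled from app metrics. This is a minimal sketch, not part of PCF Metrics: the names `PERCEPTION_BANDS` and `user_reaction` are hypothetical, and treating each band's upper bound as exclusive is an assumption, since the slide's ranges share their endpoints.

```python
# Upper bounds (in ms) and reaction labels copied from the slide's table.
# Assumption: each upper bound is exclusive, so exactly 100 ms falls in "Feels sluggish".
PERCEPTION_BANDS = [
    (100, "Instant"),
    (300, "Feels sluggish"),
    (1000, "Machine is working.."),
    (10000, "Mental context switch"),
]


def user_reaction(delay_ms: float) -> str:
    """Return the slide's user-reaction label for a response delay in milliseconds."""
    if delay_ms < 0:
        raise ValueError("delay must be non-negative")
    for upper_ms, reaction in PERCEPTION_BANDS:
        if delay_ms < upper_ms:
            return reaction
    # Anything at or beyond 10 seconds lands in the last band.
    return "I'll come back later .."
```

Calling `user_reaction(250)` returns "Feels sluggish", which lines up with the slide's advice to stay under 250 ms to feel fast.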
Changes to a single microservice or monolithic app can impact performance of downstream apps and services, or cause breakage
Troubleshooting apps and microservices is hard
Most platforms have:
• Disparate permissions on different apps
• Data silos across subsystems
• Trouble reconciling time series data
[Platform diagram]
DEVELOPMENT: Multiple Languages; Microservices Support; Services Marketplace (Native, User Provided, Partner)
OPERATIONS: Operating System; Cloud API; Container Orchestration; App Deployment & Management; Availability; Visibility & Administration; CI/CD Tools, ID, Security; Health, Metrics, Patching; Apps & Platform Dashboards
4 Levels of High Availability
1. App Instance Fail
2. Process Fail
3. VM Fail
4. Availability Zone Fail
Container Scheduler Handles Workloads
• 250,000 containers managed in a single environment
  https://blog.pivotal.io/pivotal-cloud-foundry/products/250k-containers-in-production-a-real-test-for-the-real-world
• Dynamic load balancing
• Remediation and rebalance of workloads
Each Layer Upgradable with No Downtime
Layers (diagram labeled "Container"):
• App Runtime*
• File system mapping
• Application
• Linux host & kernel
Deployment styles:
• Blue-Green deploy
• Canary style deploy
* e.g. embedded webserver, app configurations, JRE, agents for services packaged as buildpacks
Our Charter
To provide App Devs with data points to assess overall solution performance and health
Immediate, Integrated, Automated
• Near real-time view
• Covers 80-90% of the problems
• One tool correlates events, logs, metrics
• Common set of facts for Dev+Ops
• Designed for PCF multi-tenancy
• Agentless, no install
• Enabled automatically for all applications
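To make the idea of correlating events, logs, and metrics concrete, the sketch below lines up records from the three sources on a shared timeline around one record of interest. It is an illustration only and does not reflect how PCF Metrics is implemented: the `Record` type, the `correlate` function, and the 30-second default window are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Record:
    """One timestamped item from any source (hypothetical shape)."""
    timestamp: datetime
    kind: str    # "event", "log", or "metric"
    detail: str


def correlate(records, anchor, window=timedelta(seconds=30)):
    """Return every record within `window` of `anchor`, ordered by time."""
    nearby = [r for r in records if abs(r.timestamp - anchor.timestamp) <= window]
    return sorted(nearby, key=lambda r: r.timestamp)


if __name__ == "__main__":
    t0 = datetime(2016, 1, 1, 12, 0, 0)
    crash = Record(t0, "event", "app instance crashed")
    records = [
        Record(t0 - timedelta(seconds=10), "metric", "memory at 98% of quota"),
        Record(t0 - timedelta(seconds=2), "log", "OutOfMemoryError"),
        crash,
        Record(t0 + timedelta(minutes=5), "log", "request served"),
    ]
    for r in correlate(records, crash):
        print(r.timestamp.time(), r.kind, r.detail)
```

Keeping all three sources in one record shape is what lets a single sort put the memory spike, the error log, and the crash event side by side.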
Available Data
• CF EVENTS
• APP LOGS
• APP METRICS
• ROUTES
Select an app, watch streaming data
PCF Metrics v1.2.1
• 2 weeks of app log storage
• 2 weeks of detailed container and HTTP start stop metric storage
• App Log distribution histogram
• App Event UI improvements
• Fault tolerance on all storage services
• Testing and tuning for large ingestion loads
Data Correlation
Demo
PCF Metrics 1.2 Architecture
Our Journey
• PCF Metrics v1.0: Aggregate Container and HTTP metrics provided for Apps
• PCF Metrics v1.1: Aggregate Container and HTTP metrics + App events and Logs (24 hour storage)
• PCF Metrics v1.2.1: Aggregate Container and HTTP metrics + App events and Logs (2 weeks storage)
• PCF Metrics v1.3: Aggregate Container and HTTP metrics + App events and Logs (2 weeks storage), TraceID capture and Trace Logs
v1.3+ App Developers
• Spring Boot actuator support
• Expanded event descriptions
• Additional Log sources *
• Data exposed as API
• Continued UX improvements
Troubleshooting App Health and Performance with PCF Metrics 1.2