PCF Platform Monitoring with
Prometheus & Grafana
By Alan Strader & Jamie Christian
NTAC:3NS-20
Unless otherwise indicated, these slides are © 2013-2016 Pivotal Software, Inc. and licensed under a
Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
Northern Trust
Founded in 1889, Northern Trust is a global leader in asset servicing, asset
management, and banking for personal and institutional clients
Wealth Management
• Individuals
• Families
• Family offices
• Foundations
• Endowments
• Privately held businesses
Corporate & Institutional Services
• Insurance companies
• Pensions
• Sovereign entities
• Fund managers
• Foundations and endowments
Agenda
• Monitoring Solution Requirements
• Options Considered
• What is Prometheus?
• Prometheus on Cloud Foundry
• Alerting
• Dashboarding
• How Prometheus Helps Us
PCF at Northern Trust
• January 2017: Begin Prod and Non-Prod environment build
• March 2017: First application go-live
• June 2017: Started Prometheus journey
• Now: 750+ microservices executing across 5 foundations
4 Full Time PCF Platform Operators
250+ Spring Boot Developers
Prometheus
Prometheus at a Glance
Prometheus Value vs. Commercial Solution
• Upfront time/cost (soft dollars) higher
• Estimated ongoing cost significantly lower
• Initial implementation of new features takes time (learning curve), but is easily replicated across foundations
• 4 upgrades/foundation (20 deployments) to date!
• Time/value proposition greatly improved by input of the CF community
Monitoring Requirements
Application Monitoring
• CA APM is our enterprise implementation
Platform Monitoring (Gap identified as part of day 2 operations)
• Report health of opaque platform components
• Alert when we are approaching/exceeding capacity
• GoRouters/Diego Cells/Quota/Memory/Compute
• Forecast/project approximate dates for capacity increases
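One way to approach the forecasting requirement above is a least-squares linear fit over recent usage samples, extrapolated to the capacity limit. This is the same basic idea as PromQL's predict_linear function. The sketch below is illustrative only: the function name, the sample values, and the capacity figure are assumptions, not taken from the Northern Trust setup.

```python
from typing import Optional, Sequence, Tuple


def days_until_capacity(
    samples: Sequence[Tuple[float, float]], capacity: float
) -> Optional[float]:
    """Estimate days from the last sample until usage reaches capacity.

    samples: (day, used) pairs, e.g. daily memory usage of Diego Cells.
    Returns None when there is too little data or usage is not growing.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        return None  # all samples share one timestamp, no slope to fit
    # Ordinary least-squares slope and intercept.
    slope = sum((x - mean_x) * (y - mean_y) for x, y in samples) / sxx
    if slope <= 0:
        return None  # flat or shrinking usage never reaches capacity
    intercept = mean_y - slope * mean_x
    day_full = (capacity - intercept) / slope
    last_day = samples[-1][0]
    return max(0.0, day_full - last_day)


# Hypothetical usage in GB on days 0, 1, 2 against a 200 GB limit.
remaining = days_until_capacity([(0, 100), (1, 110), (2, 120)], capacity=200)
```

A straight-line fit is a rough projection; it ignores seasonality and step changes such as a new team onboarding, so the result is best read as an approximate date range.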
Options Considered – DataDog, Sysdig, Pandora FMS, Prometheus
• Cost (hard vs. soft dollars)
• Already in use at NT?
• Commercial vs. open source
• On premise vs. off premise
• Community & vendor recommendations
• Ease of use
• Look and feel
Prometheus:
Benefits:
• No hard dollar cost
• Existing bosh deployment
• Existing CF Dashboards
Limitations:
• No direct support
• Fragmented documentation
• Startup/Operational Learning Curve
• Desired features missing
Components of Prometheus & Grafana
Prometheus: scrapes/stores time series data
Exporters: applications that harvest existing metrics from third-party systems and expose them for Prometheus ingestion
Nginx: HTTP & reverse proxy server
Grafana: metric analytics & visualization suite
Alert Manager: provides notifications on alerts generated by the Prometheus server
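To make the exporter role concrete, the sketch below renders a gauge in the Prometheus text exposition format, which is what an exporter returns when Prometheus scrapes it over HTTP. The function name and the metric name demo_app_instances are hypothetical, and label-value escaping is left out for brevity.

```python
def render_gauge(name, help_text, samples):
    """Render one gauge metric in the Prometheus text exposition format.

    samples: list of (labels_dict, value) pairs.
    Note: label values are not escaped here; a real exporter must escape
    backslashes, double quotes and newlines.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        series = f"{name}{{{label_str}}}" if label_str else name
        lines.append(f"{series} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical metric: instance counts harvested from some third-party API.
body = render_gauge(
    "demo_app_instances",
    "Number of running instances per application.",
    [({"application_name": "ShowEnv"}, 2)],
)
```

In practice an exporter serves this text from an HTTP endpoint (conventionally /metrics), and most exporters use an official client library rather than formatting the text by hand.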
NT Environment – Current State
NT Environment – Future State
Installing Prometheus on PCF
• Download artifacts from GitHub
• Upload BOSH releases to BOSH director
• Create UAA clients for firehose and MySQL user
• Populate manifest with new creds/Ops Manager settings
• BOSH deploy
• Note: There is a tile!
Alerting
• Prometheus:
• Rule-based alerts; conditional
• Prometheus expression language (PromQL)
• Answers the question “what is broken right now?”
• Requires additional notification solution…
• Alert Manager:
• Notification solution!
• Send summarized notifications to Slack, email, etc.
• De-duplication, grouping, routing
• Can “silence” noisy alerts (rule-based)
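The de-duplication and grouping ideas listed above can be sketched in a few lines. This is a simplified model for illustration, not Alert Manager's actual implementation: alerts are represented as plain label dictionaries, and the function name and sample labels are assumptions.

```python
from collections import defaultdict


def group_alerts(alerts, group_by):
    """Drop duplicate alerts, then bucket the rest by selected labels.

    alerts: list of label dicts, one per incoming alert.
    group_by: label names whose values define a notification group.
    """
    groups = defaultdict(list)
    seen = set()
    for labels in alerts:
        # Two alerts with an identical label set count as the same alert.
        fingerprint = tuple(sorted(labels.items()))
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        key = tuple(labels.get(name, "") for name in group_by)
        groups[key].append(labels)
    return dict(groups)


incoming = [
    {"alertname": "CFAppDown", "application_name": "a"},
    {"alertname": "CFAppDown", "application_name": "a"},  # duplicate
    {"alertname": "CFAppDown", "application_name": "b"},
    {"alertname": "OtherAlert", "application_name": "a"},
]
grouped = group_alerts(incoming, group_by=["alertname"])
```

Each resulting group would become one summarized notification, which is what keeps a burst of related alerts from flooding a channel.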
Alerting: Custom Example
Prometheus custom.rules:
ALERT CFAppDown
  IF cf_application_instances{deployment="cf",environment="cf",organization_name="arch-org",application_name="ShowEnv"} < 2
  FOR 30s
  LABELS {service="cf", severity="warning"}
  ANNOTATIONS {details="`{{$labels.organization_name}}/{{$labels.space_name}}/{{$labels.application_name}}` has fewer application instances than expected; there has been {{$value}}/2 app instances running during the last 30s."}
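The FOR clause in the rule above means the condition must hold continuously for 30 seconds before the alert fires; until then it is only pending. The sketch below simulates that behaviour over a series of evaluations. It is a simplified model, and the function name and sample timestamps are assumptions.

```python
def alert_states(samples, threshold=2, for_seconds=30):
    """Return the alert state at each evaluation.

    samples: list of (timestamp_seconds, instance_count) in time order.
    States: 'inactive' (condition false), 'pending' (condition true but
    not yet for long enough), 'firing' (true for at least for_seconds).
    """
    states = []
    active_since = None
    for ts, value in samples:
        if value < threshold:
            if active_since is None:
                active_since = ts  # condition just became true
            held = ts - active_since
            states.append("firing" if held >= for_seconds else "pending")
        else:
            active_since = None  # any recovery resets the timer
            states.append("inactive")
    return states


# Hypothetical evaluations every 15 seconds.
states = alert_states([(0, 2), (15, 1), (30, 1), (45, 1), (60, 2)])
```

A short FOR window like this one reacts quickly but can still notify on brief restarts; lengthening it trades detection speed for fewer noisy alerts.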
Alerting: Notifications
• Set up new receiver for independent Slack channel
• Route particular alerts to independent Slack channel
routes:
- receiver: 'slack-test'
  match:
    alertname: CFAppDown
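The effect of the route above can be modelled as a first-match lookup on alert labels. This is a deliberately simplified sketch: real Alert Manager routing is a tree with nested routes, regex matchers, and a continue option, none of which is modelled here. The function name and the receiver name default-receiver are assumptions.

```python
def pick_receiver(labels, routes, default_receiver):
    """Return the receiver of the first route whose match labels all agree.

    labels: label dict of the alert being routed.
    routes: list of dicts shaped like the YAML route entries.
    """
    for route in routes:
        match = route.get("match", {})
        if all(labels.get(key) == value for key, value in match.items()):
            return route["receiver"]
    return default_receiver


routes = [{"receiver": "slack-test", "match": {"alertname": "CFAppDown"}}]

to_test_channel = pick_receiver(
    {"alertname": "CFAppDown", "severity": "warning"}, routes, "default-receiver"
)
to_default = pick_receiver({"alertname": "SomethingElse"}, routes, "default-receiver")
```

Only alerts named CFAppDown reach the slack-test receiver; everything else falls through to the default, which is how a single alert can be trialled in its own Slack channel.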
Alerting: Notifications
• Customize notification links to contain a generally available URL vs. hostname
Hostname-based link: http://2a74212d-b53c-4dbe-b0c2-a60c031ee063:9093/#/alerts?receiver=slack-test
Generally available URL: http://my.alerts.com:9093/#/alerts?receiver=slack-test
alertmanager_ctl:
exec alertmanager \
  -config.file="/var/vcap/jobs/alertmanager/config/alertmanager.yml" \
  -mesh.listen-address=10.1.1.1:6783 \
  -mesh.peer="10.1.1.1:6783" \
  -web.listen-address=":9093" \
  -web.external-url="http://my.alerts.com:9093"
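The external URL setting matters because notification links are built from it rather than from the VM's hostname. The sketch below reproduces the link shape shown on this slide; it mimics the URL format only and is not Alert Manager's own template code. The function name is an assumption.

```python
from urllib.parse import quote


def alert_link(external_url, receiver):
    """Build a notification link in the shape shown on the slide."""
    base = external_url.rstrip("/")
    return f"{base}/#/alerts?receiver={quote(receiver)}"


link = alert_link("http://my.alerts.com:9093", "slack-test")
```

With the external URL left unset, the base would be the BOSH-assigned hostname, which recipients of the Slack message generally cannot resolve.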
Deciphering Metrics
• Grafana Dashboards can have unclear metric definitions
• Ex: Organization Memory Quota Consumption
Deciphering Metrics
• Grafana Dashboards can have unclear metric definitions
• Ex: Instances
Dashboarding
Use Cases
Aggregating data across Orgs/Spaces/Diego Cells
• Failed deployments → increase Diego Cells
Use Cases
Diego Cell Configuration
Use Cases
Identify where Buildpacks are being used
Use Cases
MySQL Table Statistics
• Bug during upgrade as a result of high record count
Improvements We’d Like to See
• Drilldown into dashboards
• Mechanism to pull configuration from a “config server”
• Integration mechanism for Enterprise Ticketing System
• Searchable Dashboards
• User Provided Service Metrics
• Alert Manager security
Overall…
• Would we do this again? Yes!
• Provided significant amount of data in short period of time
• Highly customizable
• Data
• Dashboards
• Alerts
• Notifications
• Learning curve worth the agility & operational control
Reference Links
Prometheus on PCF: https://github.com/pivotal-cf/prometheus-on-PCF
Prometheus Bosh Release: https://github.com/cloudfoundry-community/prometheus-boshrelease
Prometheus Concourse Pipeline: https://github.com/pivotal-cf/pcf-prometheus-pipeline
Prometheus Documentation: https://prometheus.io/docs/
Grafana Documentation: http://docs.grafana.org/
Learn More. Stay Connected.
Monitoring with MongoDB on PCF, Jordan Sumerlus
Wednesday 3:20
Introducing Spring Metrics, Jon Schneider
Thursday 10:30
Monitoring and Troubleshooting Spring Boot Microservices Architecture, Mukesh Gadiya
Thursday 11:50
#springone @s1p

Mais conteúdo relacionado

Mais procurados

Mais procurados (20)

Grafana optimization for Prometheus
Grafana optimization for PrometheusGrafana optimization for Prometheus
Grafana optimization for Prometheus
 
kafka
kafkakafka
kafka
 
Red Hat OpenShift Container Platform Overview
Red Hat OpenShift Container Platform OverviewRed Hat OpenShift Container Platform Overview
Red Hat OpenShift Container Platform Overview
 
Infrastructure & System Monitoring using Prometheus
Infrastructure & System Monitoring using PrometheusInfrastructure & System Monitoring using Prometheus
Infrastructure & System Monitoring using Prometheus
 
Observability
Observability Observability
Observability
 
OpenTelemetry For Developers
OpenTelemetry For DevelopersOpenTelemetry For Developers
OpenTelemetry For Developers
 
Apache Airflow Introduction
Apache Airflow IntroductionApache Airflow Introduction
Apache Airflow Introduction
 
Application Monitoring using Datadog
Application Monitoring using DatadogApplication Monitoring using Datadog
Application Monitoring using Datadog
 
Introduction to Prometheus
Introduction to PrometheusIntroduction to Prometheus
Introduction to Prometheus
 
Modern DevOps with Spinnaker - Olga Kundzich
Modern DevOps with Spinnaker - Olga KundzichModern DevOps with Spinnaker - Olga Kundzich
Modern DevOps with Spinnaker - Olga Kundzich
 
Kafka presentation
Kafka presentationKafka presentation
Kafka presentation
 
Observability
ObservabilityObservability
Observability
 
Ansible Automation - Enterprise Use Cases | Juncheng Anthony Lin
Ansible Automation - Enterprise Use Cases | Juncheng Anthony LinAnsible Automation - Enterprise Use Cases | Juncheng Anthony Lin
Ansible Automation - Enterprise Use Cases | Juncheng Anthony Lin
 
AWS Security and SecOps
AWS Security and SecOpsAWS Security and SecOps
AWS Security and SecOps
 
Beautiful Monitoring With Grafana and InfluxDB
Beautiful Monitoring With Grafana and InfluxDBBeautiful Monitoring With Grafana and InfluxDB
Beautiful Monitoring With Grafana and InfluxDB
 
Demystifying observability
Demystifying observability Demystifying observability
Demystifying observability
 
Timeseries - data visualization in Grafana
Timeseries - data visualization in GrafanaTimeseries - data visualization in Grafana
Timeseries - data visualization in Grafana
 
22nd Athens Big Data Meetup - 1st Talk - MLOps Workshop: The Full ML Lifecycl...
22nd Athens Big Data Meetup - 1st Talk - MLOps Workshop: The Full ML Lifecycl...22nd Athens Big Data Meetup - 1st Talk - MLOps Workshop: The Full ML Lifecycl...
22nd Athens Big Data Meetup - 1st Talk - MLOps Workshop: The Full ML Lifecycl...
 
Data Observability Best Pracices
Data Observability Best PracicesData Observability Best Pracices
Data Observability Best Pracices
 
Adopting OpenTelemetry
Adopting OpenTelemetryAdopting OpenTelemetry
Adopting OpenTelemetry
 

Semelhante a PCF Platform Monitoring with Prometheus and Grafana

Case Study of Batch Processing With Spring Cloud Data Flow Server in Cloud Fo...
Case Study of Batch Processing With Spring Cloud Data Flow Server in Cloud Fo...Case Study of Batch Processing With Spring Cloud Data Flow Server in Cloud Fo...
Case Study of Batch Processing With Spring Cloud Data Flow Server in Cloud Fo...
VMware Tanzu
 

Semelhante a PCF Platform Monitoring with Prometheus and Grafana (20)

12 Factor, or Cloud Native Apps - What EXACTLY Does that Mean for Spring Deve...
12 Factor, or Cloud Native Apps - What EXACTLY Does that Mean for Spring Deve...12 Factor, or Cloud Native Apps - What EXACTLY Does that Mean for Spring Deve...
12 Factor, or Cloud Native Apps - What EXACTLY Does that Mean for Spring Deve...
 
Connecting All Abstractions with Istio
Connecting All Abstractions with IstioConnecting All Abstractions with Istio
Connecting All Abstractions with Istio
 
Chaos Engineering for PCF
Chaos Engineering for PCFChaos Engineering for PCF
Chaos Engineering for PCF
 
It’s a Multi-Cloud World, But What About The Data?
It’s a Multi-Cloud World, But What About The Data?It’s a Multi-Cloud World, But What About The Data?
It’s a Multi-Cloud World, But What About The Data?
 
Spring Integration Done Bootifully
Spring Integration Done BootifullySpring Integration Done Bootifully
Spring Integration Done Bootifully
 
Case Study of Batch Processing With Spring Cloud Data Flow Server in Cloud Fo...
Case Study of Batch Processing With Spring Cloud Data Flow Server in Cloud Fo...Case Study of Batch Processing With Spring Cloud Data Flow Server in Cloud Fo...
Case Study of Batch Processing With Spring Cloud Data Flow Server in Cloud Fo...
 
Cloud Native Java with Spring Cloud Services
Cloud Native Java with Spring Cloud ServicesCloud Native Java with Spring Cloud Services
Cloud Native Java with Spring Cloud Services
 
Automated PCF Upgrades with Concourse
Automated PCF Upgrades with ConcourseAutomated PCF Upgrades with Concourse
Automated PCF Upgrades with Concourse
 
State of Securing Restful APIs s12gx2015
State of Securing Restful APIs s12gx2015State of Securing Restful APIs s12gx2015
State of Securing Restful APIs s12gx2015
 
Consumer Driven Contracts and Your Microservice Architecture
Consumer Driven Contracts and Your Microservice ArchitectureConsumer Driven Contracts and Your Microservice Architecture
Consumer Driven Contracts and Your Microservice Architecture
 
Implementing Raft in RabbitMQ
Implementing Raft in RabbitMQImplementing Raft in RabbitMQ
Implementing Raft in RabbitMQ
 
Deploying Spring Boot apps on Kubernetes
Deploying Spring Boot apps on KubernetesDeploying Spring Boot apps on Kubernetes
Deploying Spring Boot apps on Kubernetes
 
Modern messaging with RabbitMQ, Spring Cloud and Reactor
Modern messaging with RabbitMQ, Spring Cloud and ReactorModern messaging with RabbitMQ, Spring Cloud and Reactor
Modern messaging with RabbitMQ, Spring Cloud and Reactor
 
High performance stream processing
High performance stream processingHigh performance stream processing
High performance stream processing
 
Numbers in the Hidden: A Pragmatic View of 'Nirvana'
Numbers in the Hidden: A Pragmatic View of 'Nirvana'Numbers in the Hidden: A Pragmatic View of 'Nirvana'
Numbers in the Hidden: A Pragmatic View of 'Nirvana'
 
Automation and Culture Changes for 40M Subscriber Platform Operation
Automation and Culture Changes for 40M Subscriber Platform OperationAutomation and Culture Changes for 40M Subscriber Platform Operation
Automation and Culture Changes for 40M Subscriber Platform Operation
 
Building Highly Scalable Spring Applications using In-Memory Data Grids
Building Highly Scalable Spring Applications using In-Memory Data GridsBuilding Highly Scalable Spring Applications using In-Memory Data Grids
Building Highly Scalable Spring Applications using In-Memory Data Grids
 
Zuul @ Netflix SpringOne Platform
Zuul @ Netflix SpringOne PlatformZuul @ Netflix SpringOne Platform
Zuul @ Netflix SpringOne Platform
 
Who Does What? Mapping Cloud Foundry Activities and Entitlements to IT Roles
Who Does What? Mapping Cloud Foundry Activities and Entitlements to IT RolesWho Does What? Mapping Cloud Foundry Activities and Entitlements to IT Roles
Who Does What? Mapping Cloud Foundry Activities and Entitlements to IT Roles
 
Lattice: A Cloud-Native Platform for Your Spring Applications
Lattice: A Cloud-Native Platform for Your Spring ApplicationsLattice: A Cloud-Native Platform for Your Spring Applications
Lattice: A Cloud-Native Platform for Your Spring Applications
 

Mais de VMware Tanzu

Mais de VMware Tanzu (20)

What AI Means For Your Product Strategy And What To Do About It
What AI Means For Your Product Strategy And What To Do About ItWhat AI Means For Your Product Strategy And What To Do About It
What AI Means For Your Product Strategy And What To Do About It
 
Make the Right Thing the Obvious Thing at Cardinal Health 2023
Make the Right Thing the Obvious Thing at Cardinal Health 2023Make the Right Thing the Obvious Thing at Cardinal Health 2023
Make the Right Thing the Obvious Thing at Cardinal Health 2023
 
Enhancing DevEx and Simplifying Operations at Scale
Enhancing DevEx and Simplifying Operations at ScaleEnhancing DevEx and Simplifying Operations at Scale
Enhancing DevEx and Simplifying Operations at Scale
 
Spring Update | July 2023
Spring Update | July 2023Spring Update | July 2023
Spring Update | July 2023
 
Platforms, Platform Engineering, & Platform as a Product
Platforms, Platform Engineering, & Platform as a ProductPlatforms, Platform Engineering, & Platform as a Product
Platforms, Platform Engineering, & Platform as a Product
 
Building Cloud Ready Apps
Building Cloud Ready AppsBuilding Cloud Ready Apps
Building Cloud Ready Apps
 
Spring Boot 3 And Beyond
Spring Boot 3 And BeyondSpring Boot 3 And Beyond
Spring Boot 3 And Beyond
 
Spring Cloud Gateway - SpringOne Tour 2023 Charles Schwab.pdf
Spring Cloud Gateway - SpringOne Tour 2023 Charles Schwab.pdfSpring Cloud Gateway - SpringOne Tour 2023 Charles Schwab.pdf
Spring Cloud Gateway - SpringOne Tour 2023 Charles Schwab.pdf
 
Simplify and Scale Enterprise Apps in the Cloud | Boston 2023
Simplify and Scale Enterprise Apps in the Cloud | Boston 2023Simplify and Scale Enterprise Apps in the Cloud | Boston 2023
Simplify and Scale Enterprise Apps in the Cloud | Boston 2023
 
Simplify and Scale Enterprise Apps in the Cloud | Seattle 2023
Simplify and Scale Enterprise Apps in the Cloud | Seattle 2023Simplify and Scale Enterprise Apps in the Cloud | Seattle 2023
Simplify and Scale Enterprise Apps in the Cloud | Seattle 2023
 
tanzu_developer_connect.pptx
tanzu_developer_connect.pptxtanzu_developer_connect.pptx
tanzu_developer_connect.pptx
 
Tanzu Virtual Developer Connect Workshop - French
Tanzu Virtual Developer Connect Workshop - FrenchTanzu Virtual Developer Connect Workshop - French
Tanzu Virtual Developer Connect Workshop - French
 
Tanzu Developer Connect Workshop - English
Tanzu Developer Connect Workshop - EnglishTanzu Developer Connect Workshop - English
Tanzu Developer Connect Workshop - English
 
Virtual Developer Connect Workshop - English
Virtual Developer Connect Workshop - EnglishVirtual Developer Connect Workshop - English
Virtual Developer Connect Workshop - English
 
Tanzu Developer Connect - French
Tanzu Developer Connect - FrenchTanzu Developer Connect - French
Tanzu Developer Connect - French
 
Simplify and Scale Enterprise Apps in the Cloud | Dallas 2023
Simplify and Scale Enterprise Apps in the Cloud | Dallas 2023Simplify and Scale Enterprise Apps in the Cloud | Dallas 2023
Simplify and Scale Enterprise Apps in the Cloud | Dallas 2023
 
SpringOne Tour: Deliver 15-Factor Applications on Kubernetes with Spring Boot
SpringOne Tour: Deliver 15-Factor Applications on Kubernetes with Spring BootSpringOne Tour: Deliver 15-Factor Applications on Kubernetes with Spring Boot
SpringOne Tour: Deliver 15-Factor Applications on Kubernetes with Spring Boot
 
SpringOne Tour: The Influential Software Engineer
SpringOne Tour: The Influential Software EngineerSpringOne Tour: The Influential Software Engineer
SpringOne Tour: The Influential Software Engineer
 
SpringOne Tour: Domain-Driven Design: Theory vs Practice
SpringOne Tour: Domain-Driven Design: Theory vs PracticeSpringOne Tour: Domain-Driven Design: Theory vs Practice
SpringOne Tour: Domain-Driven Design: Theory vs Practice
 
SpringOne Tour: Spring Recipes: A Collection of Common-Sense Solutions
SpringOne Tour: Spring Recipes: A Collection of Common-Sense SolutionsSpringOne Tour: Spring Recipes: A Collection of Common-Sense Solutions
SpringOne Tour: Spring Recipes: A Collection of Common-Sense Solutions
 

Último

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 

Último (20)

Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 

PCF Platform Monitoring with Prometheus and Grafana

  • 1. PCF Platform Monitoring with Prometheus & Grafana By Alan Strader & Jamie Christian 1 NTAC:3NS-20
  • 2. Unless otherwise indicated, these slides are © 2013 -2016 Piv otal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by -nc/3.0/ Northern Trust 2 Founded in 1889, Northern Trust is a global leader in asset servicing, asset management, and banking for personal and institutional clients Wealth Management Corporate & Institutional Services • Insurance companies • Pensions • Sovereign entities • Fund managers • Foundations and endowments • Individuals • Families • Family offices • Foundation • Endowments • Privately held businesses NTAC:3NS-20
  • 3. Unless otherwise indicated, these slides are © 2013 -2016 Piv otal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by -nc/3.0/ Agenda 3 • Monitoring Solution Requirements • Options Considered • What is Prometheus? • Prometheus on Cloud Foundry • Alerting • Dashboarding • How Prometheus Helps Us NTAC:3NS-20
  • 4. Unless otherwise indicated, these slides are © 2013 -2016 Piv otal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by -nc/3.0/ PCF at Northern Trust 4 • January 2017: Begin Prod and Non-Prod environment build • March 2017: First application go-live • June 2017: Started Prometheus journey • Now: 750+ microservices executing across 5 foundations 4 Full Time PCF Platform Operators 250+ Spring Boot Developers NTAC:3NS-20
  • 5. Unless otherwise indicated, these slides are © 2013 -2016 Piv otal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by -nc/3.0/ 5 Prometheus NTAC:3NS-20
  • 6. Unless otherwise indicated, these slides are © 2013 -2016 Piv otal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by -nc/3.0/ Prometheus at a Glance 6 NTAC:3NS-20
  • 7. Unless otherwise indicated, these slides are © 2013 -2016 Piv otal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by -nc/3.0/ Prometheus Value vs. Commercial Solution 7 • Upfront time/cost (soft dollars) higher • Estimated ongoing cost significantly lower • Initial implementation of new features takes time (learning curve), but easily replicated across foundations • 4 upgrades/foundation (20 deployments) to date! • Time/value proposition greatly improved by input of the CF community NTAC:3NS-20
  • 8. Unless otherwise indicated, these slides are © 2013 -2016 Piv otal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by -nc/3.0/ Monitoring Requirements 8 Application Monitoring • CA APM our enterprise implementation Platform Monitoring (Gap identified as part of day 2 operations) • Report health of opaque platform components • Alert when we are approaching/exceeding capacity • GoRouters/Diego Cells/Quota/Memory/Compute • Forecast/project approximate dates for capacity increases NTAC:3NS-20
  • 9. Unless otherwise indicated, these slides are © 2013 -2016 Piv otal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by -nc/3.0/ Options Considered – DataDog, Sysdig, Pandora FMS, Prometheus • Cost (hard vs. soft dollars) • Already in use at NT? • Commercial vs. open source • On premise vs. off premise • Community & vendor recommendations • Ease of use • Look and feel 9 Prometheus: Benefits: • No hard dollar cost • Existing bosh deployment • Existing CF Dashboards Limitations: • No direct support • Fragmented documentation • Startup/Operational Learning Curve • Desired features missing NTAC:3NS-20
  • 10. Components of Prometheus & Grafana • Prometheus: scrapes/stores time series data • Exporters: applications that harvest existing metrics from third-party systems and expose them for Prometheus ingestion • Nginx: HTTP & reverse proxy server • Grafana: metric analytics & visualization suite • Alert Manager: provides notifications on alerts generated by the Prometheus server
  • 11. NT Environment – Current State
  • 12. NT Environment – Future State
  • 13. Installing Prometheus on PCF • Download artifacts from GitHub • Upload BOSH releases to BOSH director • Create UAA clients for firehose and MySQL user • Populate manifest with new creds/Ops Manager settings • BOSH deploy • Note: There is a tile!
  • 14. Alerting. Prometheus: • Rule-based alerts; conditional • Prometheus Expression Language • Answers the question “what is broken right now?” • Requires additional notification solution… Alert Manager: • Notification solution! • Send summarized notifications to Slack, email, etc. • De-duplication, grouping, routing • Can “silence” noisy alerts (rule-based)
  • 15. Alerting: Custom Example. Prometheus custom.rules:
      ALERT CFAppDown
      IF cf_application_instances{deployment="cf",environment="cf",organization_name="arch-org",application_name="ShowEnv"} < 2
      FOR 30s
      LABELS {service="cf", severity="warning"}
      ANNOTATIONS {details="`{{$labels.organization_name}}/{{$labels.space_name}}/{{$labels.application_name}}` has fewer application instances than expected; there has been {{$value}}/2 app instances running during the last 30s."}
  • 16. Alerting: Notifications • Set up new receiver for independent Slack channel • Route particular alerts to independent Slack channel
      routes:
      - receiver: 'slack-test'
        match:
          alertname: CFAppDown
  • 17. Alerting: Notifications • Customize notification links to contain a generally available URL vs. hostname
      Before: http://2a74212d-b53c-4dbe-b0c2-a60c031ee063:9093/#/alerts?receiver=slack-test
      After: http://my.alerts.com:9093/#/alerts?receiver=slack-test
      alertmanager_ctl:
      exec alertmanager -config.file="/var/vcap/jobs/alertmanager/config/alertmanager.yml" -mesh.listen-address=10.1.1.1:6783 -mesh.peer="10.1.1.1:6783" -web.listen-address=":9093" -web.external-url="http://my.alerts.com:9093"
  • 18. Deciphering Metrics • Grafana dashboards can have unclear metric definitions • Ex: Organization Memory Quota Consumption
  • 19. Deciphering Metrics • Grafana dashboards can have unclear metric definitions • Ex: Instances
  • 20. Dashboarding
  • 21. Use Cases: Aggregating data across Orgs/Spaces/Diego Cells • Failed deployments → increase Diego Cells
  • 22. Use Cases: Diego Cell Configuration
  • 23. Use Cases: Identify where Buildpacks are being used
  • 24. Use Cases: MySQL Table Statistics • Bug during upgrade as a result of high record count
  • 25. Improvements We’d Like to See • Drilldown into dashboards • Mechanism to pull configuration from a “config server” • Integration mechanism for Enterprise Ticketing System • Searchable dashboards • User Provided Service metrics • Alert Manager security
  • 26. Overall… • Would we do this again? Yes! • Provided significant amount of data in short period of time • Highly customizable: data, dashboards, alerts, notifications • Learning curve worth the agility & operational control
  • 27. Reference Links • Prometheus on PCF: https://github.com/pivotal-cf/prometheus-on-PCF • Prometheus BOSH Release: https://github.com/cloudfoundry-community/prometheus-boshrelease • Prometheus Concourse Pipeline: https://github.com/pivotal-cf/pcf-prometheus-pipeline • Prometheus Documentation: https://prometheus.io/docs/ • Grafana Documentation: http://docs.grafana.org/
  • 28. Learn More. Stay Connected. • Monitoring with MongoDB on PCF, Jordan Sumerlus, Wednesday 3:20 • Introducing Spring Metrics, Jon Schneider, Thursday 10:30 • Monitoring and Troubleshooting Spring Boot Microservices Architecture, Mukesh Gadiya, Thursday 11:50 • #springone @s1p
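
The `IF ... < 2 FOR 30s` clause in the CFAppDown rule on slide 15 means the alert only fires once the condition has held continuously for the stated duration. The sketch below is a simplified Python model of that behavior, not Prometheus code: the class name `PendingAlert`, the state names, and the sample timestamps are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PendingAlert:
    """Simplified model of an 'IF value < threshold FOR duration' rule."""
    threshold: float                      # condition is: value < threshold
    hold_seconds: float                   # the FOR duration, in seconds
    pending_since: Optional[float] = None

    def observe(self, timestamp: float, value: float) -> str:
        """Feed one evaluation; return 'inactive', 'pending', or 'firing'."""
        if value >= self.threshold:
            # Condition no longer holds: reset the timer.
            self.pending_since = None
            return "inactive"
        if self.pending_since is None:
            # First evaluation where the condition holds.
            self.pending_since = timestamp
        if timestamp - self.pending_since >= self.hold_seconds:
            return "firing"
        return "pending"


# Hypothetical samples: (seconds, running instance count) for one app.
alert = PendingAlert(threshold=2, hold_seconds=30)
samples = [(0, 2), (15, 1), (30, 1), (45, 1), (60, 2)]
states = [alert.observe(t, v) for t, v in samples]
print(states)
```

In this model a brief dip below two instances stays in the pending state and never notifies anyone, which is the reason for putting a `FOR` clause on an app-down rule.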

Editor's Notes

  1. Financial services. Founded 125+ years ago. Primary businesses: Wealth Mgmt/C&IS
  2. Hope you’re in the right place!
  3. Highlights that we are new operators → learning curve
  4. Greek god who gave fire to man. Open source software that shows fire(s) to man → alcohol fires: in many cases, things we would not have otherwise seen. Open-source systems monitoring and alerting toolkit originally built by SoundCloud
  5. More time on slides than on Prometheus itself!
  6. We already have an application monitoring solution (CA APM)
  7. **Datadog, Sysdig, Pandora, Prometheus** Cost: actual dollar amount to spend vs. time/resources put in. In use at NT? Give preference to products already in-house where possible; is it an ADDITIONAL cost to NT? Commercial vs. open source → cost and support model considerations. On vs. off premise → prefer on premise due to security concerns and direction → unclear what data would be exposed due to lack of experience with product. Recommendations → PCF integration: we don’t have to build it! → Container awareness is likely on par with expectations; data makes sense. Ease of use → data presented in a useful way (ex: 1 metric per graph?) → does PCF integration already exist? Look and feel → visuals make sense → can display similar data in multiple formats
  8. POINT: how does the data actually travel? Where does it go? Transitions into what is in the BOSH release. PCF components: CC, MySQL, BOSH Director, … Exporters: applications which harvest existing metrics from third-party systems and expose them as Prometheus metrics. This is useful for cases where it is not feasible to instrument a given system with Prometheus metrics directly (for example, HAProxy or Linux system stats). Node_Exporter: one of the most widely used exporters. Added to PCF runtime-config → every VM running on PCF. I/O Memory Disk CPU Pressure. Several community exporters. Build your own! Prometheus: scrapes and stores time series data. Time Series database: Nginx: HTTP & reverse proxy server. Grafana: open source metric analytics & visualization suite used for visualizing time series data. Alert Manager: handles alerts sent by client applications such as the Prometheus server → it takes care of deduplicating, grouping, and routing them to the correct receiver integrations (email, Slack, PagerDuty, OpsGenie, etc.)
  9. Current state: 5 foundations. 3 non-prod (Sandbox, System, UAT) in one datacenter; 2 prod (live/warm) across two datacenters; 1 Grafana instance for each foundation. Future state: 8 foundations. 6 non-prod (Sandbox1, Sandbox2, System live/live, UAT live/live) across two datacenters; 2 prod (live/live) across two datacenters; 4 Grafana instances (non-prod DC1, non-prod DC2, prod DC1, prod DC2)
  10. Current state: 5 foundations. 3 non-prod (Sandbox, System, UAT) in one datacenter; 2 prod (live/warm) across two datacenters; 1 Grafana instance for each foundation. Future state: 8 foundations. 6 non-prod (Sandbox1, Sandbox2, System live/live, UAT live/live) across two datacenters; 2 prod (live/live) across two datacenters; 2 Grafana instances (non-prod, prod). Potential talking point: recommendations for others; start more granular? Start with spanned approach?
  11. Easy to install, as most of what is needed for CF has been built and published to GitHub for you. **Most time spent on this was on understanding configuration items** → ex: proxy, VM configurations → ex: exporters would often die until we increased RAM. Once it is set up, it provides valuable data out of the box and does not require much care and feeding, depending on existing skillset → our pain points were more a result of being new to Prometheus/BOSH and trying to learn many things at once. Tile: → when we started with Prometheus, it didn’t exist → it does not give us the flexibility to span multiple foundations (as is our goal in the future state), so we chose not to explore it.
  12. Grouping: categorizes alerts of a similar nature into a single notification. Useful during larger outages when many systems fail at once and hundreds to thousands of alerts may be firing simultaneously; one can configure Alert Manager to group alerts by their cluster/alertname and send a single notification. Routing: send alerts to different receivers based on “match” → I want to know when my app crashes, but don’t want to bombard everyone else in my Slack channel. Ex: send all alerts with organization_name “arch-org” to our email distribution list. Silence: simply mute alerts for a given time → helpful during times like an upgrade, when you expect a lot of activity and don’t care to be told your CPU usage is high! → Alternatively, can remove useless alerts entirely so that they are not time-bound.
  13. Not one size fits all, so customization is key. EXAMPLE: We’ve had app teams request alerting on when their apps go down; test scenario. >> if: query that triggers the alert **>>details: gives value to alert notifications** important because….
  14. By default, notifications provide alert name and labels. Details → alert is meaningful at a glance. Can also customize routes (alertmanager.yml) for notifications, so for these app teams, only they will get their alerts.
  15. One thing we had to modify, in addition to the optional customizations, was the notification URL. OOB, notifications use the Alert Manager hostname in URLs, which is not generally available outside of the VM. One concern: the URL leads to Alert Manager. We have not figured out how to isolate alert data by app team; everyone can view any alert.
  16. A LOT of information; need to learn what metrics mean, some are not obvious → Org Memory Quota Consumption: actual vs. reserved. GOOD! Helps teams use resources wisely
  17. INSTANCES: System Apps Dashboard vs. Space Summary Dashboard. Dashboards were showing us reserved resources, not actual values. CONFUSING. We decided to change them so that they made more sense to us.
  18. Now we can see all instance information in one dashboard. Running vs. Crashed → WHICH are crashed?? When we start using instance quotas, compare requested instances vs. instance quota. There are plenty of dashboards we loved OOB that required no modification… some we look at regularly to better manage the platform >>
  19. “CF: Cells Capacity”. Situations where this solution filled a void. THEN: failed deployments → application memory limit? → org memory quota? → ERT memory status (percentage) → probably about time we add cells or change template… NOW (proactive vs. reactive): alerted on cells with low resources or low total available cell memory
  20. Getting alerts for low storage… Went to Prometheus to see Allocated vs. Available and were surprised by total cell disk: why 32GB when our template assigned 64GB? SURPRISE: SWAP → learned how swap is allocated by BOSH → what configuration is needed to increase our usable storage
  21. When we upgrade our buildpacks or delete old ones…
  22. Upgraded the platform and ran into a bug; “push apps manager” errand failed. Truncated this table and it worked. Rather than: → Jumpbox → ssh Ops Manager → set deployment → target and log in to MySQL → set a table → read table size. NOW: check here prior to upgrading ERT; simplifies prerequisite work for upgrades. Take that anywhere we can get it!
  23. Plenty we love about the product, but some improvements we’d like to see… Drilling down into different metrics → showed on instance dashboard. Config Server → any changes require redeploy (to keep them) → ex: alerts. Enterprise Ticketing System → granularity: need individual notifications vs. aggregated. Searchable Dashboards → we have almost 70 dashboards, and some are very similar, though not entirely the same → can only search on dashboard name, not its contents. User Provided Service Metrics → can only see OOB (Rabbit, Redis, Autoscaler) → haven’t found a way to query for UPS to create our own dashboards. Alert Manager Security → no authentication needed; can see all alerts for all receivers.
  24. Happy with Prometheus and would recommend to others Provided a huge amount of data in a short period of time; we think we can customize the data more to better fit our needs. Good experiences with customizations thus far. Took time to understand the product and the data being collected to be able to meet our goals; but learning curve is made worth it by the agility and operational control we have w/ this tool
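
The routing and grouping behavior described in note 12 (and the `match: alertname: CFAppDown` route on slide 16) can be sketched in a few lines of Python. This is a deliberately simplified model rather than Alert Manager's implementation: real routes form a tree with more options. The function names `pick_receiver` and `group_alerts`, the `default` receiver, and the `OtherApp` and `HighCPU` sample values are assumptions for illustration only.

```python
from collections import defaultdict


def pick_receiver(labels, routes, default_receiver):
    """Return the receiver of the first route whose 'match' labels all equal the alert's."""
    for route in routes:
        if all(labels.get(name) == value for name, value in route["match"].items()):
            return route["receiver"]
    return default_receiver


def group_alerts(alerts, group_by):
    """Bucket alerts that share the same values for the group_by labels."""
    groups = defaultdict(list)
    for labels in alerts:
        key = tuple(labels.get(name, "") for name in group_by)
        groups[key].append(labels)
    return dict(groups)


# Route taken from slide 16; the alerts themselves are hypothetical.
routes = [{"receiver": "slack-test", "match": {"alertname": "CFAppDown"}}]
alerts = [
    {"alertname": "CFAppDown", "application_name": "ShowEnv"},
    {"alertname": "CFAppDown", "application_name": "OtherApp"},
    {"alertname": "HighCPU", "application_name": "ShowEnv"},
]

receivers = [pick_receiver(a, routes, "default") for a in alerts]
grouped = group_alerts(alerts, ["alertname"])
print(receivers)
print({key: len(members) for key, members in grouped.items()})
```

Grouping by `alertname` collapses the two CFAppDown alerts into one bucket, which is the idea behind sending a single notification during a larger outage instead of one message per alert.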