5 Tips on Determining the Most Impactful Metrics in Your App

Your monitoring works... but is it "work-centric"? This webinar, co-hosted by Preetam Jinka of VividCortex and Matt Williams of Datadog, is all about the best ways to define and monitor your app's most important performance metrics.

Published in: Software

  1. 1. Preetam Jinka Software Engineer VividCortex @PreetamJinka Matt Williams DevOps Evangelist Datadog @technovangelist
  2. 2. #WORKCENTRIC Preetam Jinka Software Engineer
  3. 3. VividCortex is the best way to improve your database performance, efficiency, and uptime. It is a secure, cloud-hosted platform that eliminates your most critical APM visibility gap: deep insights into production database workload and query performance.
  4. 4. #WORKCENTRIC Tip #1: Determine the work your systems are designed to perform How VividCortex views monitoring: ● Servers exist to do work. ● Database work is in terms of queries. We’re interested in: ● How databases complete their work ● The behavior, efficiency, and effects of each query ● How work changes over time
  5. 5. #WORKCENTRIC Adaptive Fault Detection Detecting micro-stalls in a database
  6. 6. #WORKCENTRIC Database Stalls ● A stall is a type of fault. ● Short periods when work isn’t being done ● Can be as short as 1 second ● VividCortex detects real database stalls early before they lead to user-facing outages or downtime ● We do this with zero configuration and no fixed thresholds ◦ The secret sauce: we have a model.
  7. 7. #WORKCENTRIC Tip #2: Find a model to create relationships between metrics or describe how work is being done Little’s Law ● L = λ × W ● Concurrency = Throughput × Latency ● Little’s Law provides a model to relate throughput and concurrency In MySQL: ● Concurrency: threads_running ◦ There’s one thread per query. ◦ From SHOW STATUS ● Throughput: queries completed per second
  8. 8. #WORKCENTRIC MySQL Server Stall More queries in progress Fewer being completed
  9. 9. #WORKCENTRIC What about latency? We can see the effect of the stall on overall query latencies. All of the stalled queries are completing after the fault ends.
  10. 10. #WORKCENTRIC Monitoring Queries Looking at the work itself
  11. 11. #WORKCENTRIC “What gets measured gets managed.” —Peter Drucker
  12. 12. #WORKCENTRIC Tip #3: Monitor what you want to optimize ● Monitor anything worth optimizing. ● If you’re interested in optimizing the use of indexes, monitor how queries are using indexes.
  13. 13. #WORKCENTRIC Work metrics Resource metrics Tip #4: Focus on heavy hitters
  14. 14. #WORKCENTRIC Monitoring query behavior over time Detecting Workload Changes
  15. 15. #WORKCENTRIC Tip #5: Automatically detect changes What’s taking up the database’s time that wasn’t before?
  16. 16. #WORKCENTRIC Query Anomaly Detection ● Detects changes in a query’s execution time, error rate, and throughput ● Uses intelligent baselining to account for seasonalities in metrics ● Consider at least hourly, daily, and weekly seasonal trends ● Not about detecting problems. ◦ Not worth alerting on! ◦ Systems are always changing.
  17. 17. #WORKCENTRIC Anomaly detected in a query’s total execution time metric
  18. 18. #WORKCENTRIC Summary 1. Determine the work your systems are designed to perform. 2. Find a model to create relationships between metrics or describe how work is being done. 3. Monitor what you want to optimize. 4. Focus on the heavy hitters. 5. Automatically detect changes. https://www.vividcortex.com/blog/webinar-cheat-sheet
  19. 19. #WORKCENTRIC Matt Williams DevOps Evangelist
  20. 20. #WORKCENTRIC Datadog Overview • SaaS-based infrastructure monitoring • Focus on modern infrastructure • Cloud, containers, microservices • Processing nearly a trillion data points per day • Intelligent alerting, insightful dashboards and reporting
  21. 21. #WORKCENTRIC
  22. 22. #WORKCENTRIC Collecting data is cheap
  23. 23. #WORKCENTRIC Collecting data is cheap Not having it when you need it can be expensive
  24. 24. #WORKCENTRIC
  25. 25. #WORKCENTRIC
  26. 26. #WORKCENTRIC ● Requests per second ● % 404 Requests
  27. 27. #WORKCENTRIC
  28. 28. #WORKCENTRIC ● CPU Utilization ● Queue Length
  29. 29. #WORKCENTRIC
  30. 30. #WORKCENTRIC ● Jenkins deploy ● Git Commit
  31. 31. #WORKCENTRIC
  32. 32. #WORKCENTRIC
  33. 33. #WORKCENTRIC DISCUSSION Preetam & Matt Request a Free Trial www.vividcortex.com/free-trial-sign-up app.datadoghq.com/signup
  34. 34. #WORKCENTRIC Q&A Session Additional Questions? preetam@vividcortex.com matt.williams@datadoghq.com
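
As a rough illustration of Tip #2, the sketch below applies Little's Law (L = λ × W) to two polling samples, rearranging it to get the implied average latency W = L / λ. The `Sample` type, the helper names, and the sample values are assumptions made for this example. The stall check only mirrors the "more queries in progress, fewer being completed" pattern from the MySQL Server Stall slide; it is not VividCortex's detection model, which the slides do not describe.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One polling interval of status data (hypothetical structure)."""
    threads_running: float     # concurrency L, e.g. read from SHOW STATUS
    queries_per_second: float  # throughput λ: queries completed per second


def implied_latency(sample: Sample) -> float:
    """Little's Law rearranged: W = L / λ, in seconds."""
    if sample.queries_per_second <= 0:
        # No queries completed in the interval, so latency is unbounded.
        return float("inf")
    return sample.threads_running / sample.queries_per_second


def looks_like_stall(prev: Sample, cur: Sample) -> bool:
    """Illustrative check: concurrency rose while throughput fell."""
    return (cur.threads_running > prev.threads_running
            and cur.queries_per_second < prev.queries_per_second)


# Made-up values for illustration only.
normal = Sample(threads_running=4, queries_per_second=2000)
stalled = Sample(threads_running=64, queries_per_second=500)

print(implied_latency(normal))             # 0.002 seconds
print(implied_latency(stalled))            # 0.128 seconds
print(looks_like_stall(normal, stalled))   # True
```

Comparing consecutive samples avoids a fixed threshold on either metric, but a real detector would need to tolerate ordinary noise between intervals.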