Prometheus for Monitoring Metrics (Fermilab 2018)

From its humble beginnings in 2012, the Prometheus monitoring system has grown a substantial community with a comprehensive set of integrations. This talk will give an overview of the core ideas behind Prometheus, its feature set and how it has grown to meet the challenges of modern cloud-based systems.

  1. Prometheus for Monitoring Metrics
     Brian Brazil, Founder
  2. Who am I?
     ● One of the main developers of Prometheus
     ● Founder of Robust Perception
     ● Contributor to many open source projects
  3. Why monitor?
     ● Know when things go wrong
       ○ To call in a human to prevent a business-level issue
     ● Be able to debug and gain insight
     ● Trending to see changes over time, and drive technical/business decisions
     ● To feed into other systems/processes
  4. Services have Internals
  5. Monitor the Internals
  6. Monitor as a Service, not as Machines
  7. What is Prometheus?
     Metrics monitoring system (not logs). A time series database. A query language. Client libraries. An ecosystem. A modern approach to monitoring services.
  8. Architecture
  9. Client Libraries
     Instrument your code to capture the metrics that matter to you. If upstream libraries are instrumented, you get that for free! There are also many exporters: cAdvisor, MySQL, MongoDB, SNMP, JMX, HAProxy, Minecraft, Factorio...
  10. Let’s Talk Code

          pip install prometheus_client

          from prometheus_client import Summary, start_http_server

          REQUEST_DURATION = Summary('request_duration_seconds',
                                     'Request duration in seconds')

          @REQUEST_DURATION.time()
          def my_handler(request):
              pass  # Your code here

          start_http_server(8000)

  11. Multiple Dimensions

          from prometheus_client import Counter

          REQUESTS = Counter('requests_total', 'Total requests', ['method'])

          def my_handler(request):
              REQUESTS.labels(request.method).inc()
              pass  # Your code here

  12. Exceptional Circumstances In Progress

          from prometheus_client import Counter, Gauge

          EXCEPTIONS = Counter('exceptions_total', 'Total exceptions')
          IN_PROGRESS = Gauge('inprogress_requests', 'In progress')

          @EXCEPTIONS.count_exceptions()
          @IN_PROGRESS.track_inprogress()
          def my_handler(request):
              pass  # Your code here

  13. Getting Data Out

          from prometheus_client import start_http_server

          if __name__ == '__main__':
              start_http_server(8080)

      Also possible with Django, Twisted etc.
  14. The PromQL Query Language
      Arbitrary aggregation, joins and slicing are all possible. You can calculate how close you'll be to your quota in 4 hours, or the 95th percentile latency across an entire datacenter. If you can graph it, you can alert on it!
  15. Analytics: Top 5 Docker images by CPU

          topk(5,
            sum by (image)(
              rate(container_cpu_usage_seconds_total{id=~"/system.slice/docker.*"}[5m])
            )
          )

  16. Heterogeneity
      Not all VMs are equal. Noisy neighbours mean different application instances have different performance. But PromQL can aggregate latency across instances, allowing you to alert on overall end-user visible latency rather than outliers.
  17. Alert management
      Not every alert results in a page. Group similar alerts together, route them to the right team and throttle notifications. Designed to work reliably during network partitions.
  18. Reliability is Key
      Core Prometheus server is a single binary. Each Prometheus server is independent. No clustering or attempts to backfill "missing" data when scrapes fail. Remote storage is an option for long-term storage.
  19. Monitoring Approach
      Service management went from manual to Chef to Kubernetes. Need to do the same for monitoring. Care about what matters to end users, such as latency and error rates. Distracting a human with alerts for everything that's vaguely off only leads to burnout.
  20. A Rich Community
      Today there are 750+ contributors to the core repositories, and 350+ 3rd party integrations. There are 1000+ subscribers on our mailing lists, 600+ people in IRC and an estimated 10000+ companies using Prometheus in production. Many companies are funding Prometheus development.
  21. Live Demo!
  22. What is Prometheus?
      Metrics monitoring system (not logs). A time series database. A query language. Client libraries. An ecosystem. A modern approach to monitoring services.
  23. Prometheus: The Book
      Coming in 2018!
  24. Resources
      Official Project Website: prometheus.io
      User Mailing List: prometheus-users@googlegroups.com
      Dev Mailing List: prometheus-developers@googlegroups.com
      IRC: #prometheus on chat.freenode.net
      Robust Perception Blog: www.robustperception.io/blog
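
To make the "Top 5 Docker images by CPU" query from slide 15 more concrete, here is a rough sketch in plain Python of what a `topk(k, sum by (image)(rate(...)))` expression computes. It is only an illustration, not how Prometheus is implemented: the helper names `simple_rate` and `top_images` and the sample data are hypothetical, and the rate calculation is simplified (counter resets are treated as a restart from zero, and the extrapolation to window boundaries that PromQL's `rate()` performs is left out).

```python
from collections import defaultdict


def simple_rate(samples):
    """Per-second increase of a counter over a list of (timestamp, value) samples.

    Simplified assumption: a drop in value is a counter reset, so the counter
    is taken to have restarted from zero.
    """
    if len(samples) < 2:
        return 0.0
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        increase += cur - prev if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else 0.0


def top_images(series, k=5):
    """Sum per-series rates by the 'image' label and return the k largest."""
    totals = defaultdict(float)
    for labels, samples in series:
        totals[labels["image"]] += simple_rate(samples)
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Hypothetical CPU-seconds counters for three containers over two minutes.
series = [
    ({"image": "web", "id": "a"}, [(0, 0.0), (60, 30.0), (120, 60.0)]),
    ({"image": "web", "id": "b"}, [(0, 10.0), (60, 40.0), (120, 70.0)]),
    ({"image": "db", "id": "c"}, [(0, 100.0), (60, 5.0), (120, 17.0)]),  # reset at t=60
]
result = top_images(series, k=2)
```

Here the two `web` containers are summed into one entry because they share the `image` label, which is the point of `sum by (image)`: the `id` dimension is aggregated away before ranking.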

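Slide 14 mentions calculating how close you'll be to your quota in 4 hours. In PromQL this kind of forecast is typically written with `predict_linear()`, which fits a simple linear regression to recent samples. The sketch below shows that idea in plain Python under stated assumptions: the function name mirrors the PromQL one only for readability, the disk-usage samples and the quota are made up, and the prediction is made relative to the last sample rather than to the query evaluation time.

```python
def predict_linear(samples, seconds_ahead):
    """Least-squares linear extrapolation over (timestamp, value) samples.

    Returns the fitted value `seconds_ahead` seconds after the last sample.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one timestamp")
    cov = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    slope = cov / var_t
    intercept = mean_v - slope * mean_t
    return intercept + slope * (samples[-1][0] + seconds_ahead)


# Hypothetical disk usage in GB, sampled hourly and growing by 1 GB per hour.
usage = [(0, 50.0), (3600, 51.0), (7200, 52.0)]
quota_gb = 55.0  # assumed quota for the illustration

predicted = predict_linear(usage, 4 * 3600)
will_exceed_quota = predicted > quota_gb
```

An alert built on this kind of expression fires before the quota is actually hit, which fits the "if you can graph it, you can alert on it" idea from the same slide.
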