
Monitoring on Kubernetes using Prometheus


Published in: Engineering


  1. Monitoring on Kubernetes using Prometheus. Chandresh Pancholi, Engineer at AI
  2. Kubernetes at Arvind Internet
     ● Our infrastructure is deployed on AWS
     ● Kubernetes minions run on m4.xlarge instances
     ● Kubernetes version 1.7.5 in QA/Prod, 1.8.3 in Pre-prod
     ● QA/Dev, Pre-prod, and Production all run on Kubernetes
     ● Total pods: more than 350 (QA/Dev, Prod)
     ● Total services: more than 200 (QA/Dev, Prod)
     ● Mongo, MySQL, Redis, and Hazelcast run in Kubernetes in QA/Dev
  3. What is Kubernetes? Kubernetes is an open-source container orchestration engine and an abstraction layer for managing full-stack operations of hosts and containers: deployment, scaling, load balancing, and rolling updates of containerized applications across multiple hosts within a cluster. Kubernetes makes sure that your applications stay in the desired state.
  4. Kubernetes Architecture
  5. Kubernetes Node Architecture
  6. Kubernetes concepts
     ● Master: The machine that controls Kubernetes nodes. This is where all task assignments originate.
     ● Node: These machines perform the requested, assigned tasks. The Kubernetes master controls them.
     ● Deployment: Provides declarative updates for pods.
     ● Pod: A group of one or more containers deployed to a single node. All containers in a pod share an IP address, IPC, hostname, and other resources. Pods abstract network and storage away from the underlying container, which lets you move containers around the cluster more easily.
  7. Kubernetes concepts (continued)
     ● Service: Decouples work definitions from the pods. Kubernetes service proxies automatically route service requests to the right pod, no matter where it moves in the cluster or whether it has been replaced.
     ● ConfigMap: ConfigMaps let you decouple configuration artifacts from image content to keep containerized applications portable.
     ● Secret: Secrets are intended to hold sensitive information, such as passwords, OAuth tokens, and SSH keys. Putting this information in a Secret is safer and more flexible than putting it verbatim in a pod definition or in a Docker image.
  8. Monitoring at AI (earlier). Diagram labels: EC2, Sensu, Kubernetes, µServices
  9. Cons
     1. Multiple monitoring systems
     2. Difficulty in troubleshooting
     3. Additional infrastructure cost to support three monitoring systems
     4. Graphite doesn't provide pod-level application metrics
     5. The infra team needs to understand both Sensu and Prometheus alerting
     6. Application metrics are single-dimensional, e.g. (a.b.c.d.99)
     7. Grafana alerting for application metrics
  10. Prometheus
      ● Developed at SoundCloud by ex-Googlers
      ● Prometheus is a close cousin of Kubernetes
      ● A multi-dimensional data model, with time series identified by metric name and key/value pairs
      ● Alerting and graphing are unified, using the same language
      ● Time series collection happens via a pull model over HTTP
      ● Targets are discovered via service discovery or static configuration
      ● Multiple exporters are available to expose AWS EC2, Kafka, Mongo, Cassandra, RMQ, and Redis metrics
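The pull model means each target exposes its current samples over HTTP and Prometheus scrapes them. The following is a minimal sketch of a scrape target using only the Python standard library, not the official client library. The metric name `hello_requests_total`, its value, and the `SAMPLES` list are hypothetical, chosen only for illustration.

```python
from http.server import BaseHTTPRequestHandler


def escape_label_value(value: str) -> str:
    """Escape backslash, double quote, and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_sample(name: str, labels: dict, value) -> str:
    """Render one sample as `name{k="v",...} value` (labels sorted by key)."""
    if labels:
        pairs = ",".join(
            f'{key}="{escape_label_value(val)}"' for key, val in sorted(labels.items())
        )
        return f"{name}{{{pairs}}} {value}"
    return f"{name} {value}"


# Hypothetical samples; a real application would update these at runtime.
SAMPLES = [
    ("hello_requests_total", {"namespace": "default", "service": "hello-world"}, 42),
]


class MetricsHandler(BaseHTTPRequestHandler):
    """Serves the current samples at /metrics in the text exposition format."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = "".join(render_sample(*s) + "\n" for s in SAMPLES).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

To try it, the handler could be served with `http.server.HTTPServer(("", 8000), MetricsHandler).serve_forever()` and listed as a static target; port 8000 is an arbitrary choice.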
  11. Sample metrics
      {endpoint="http",instance="",job="hello",namespace="default",pod="hello-946046218-397x2",service="hello-world"}
      {endpoint="http",instance="",job="hello",namespace="default",pod="hello-946046218-5h39f",service="hello-world"}
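To illustrate the multi-dimensional model, the sketch below parses the two sample label sets into dictionaries and shows that they differ only in the `pod` label, so they would be two distinct time series of the same metric. The helper names `parse_label_set` and `series_identity` are hypothetical, the metric name `some_metric` is a placeholder because the slide does not show one, and the parser is deliberately simplified (it does not unescape values).

```python
import re

# Matches key="value" pairs; values may contain backslash-escaped characters.
LABEL_PAIR = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_label_set(text: str) -> dict:
    """Parse a `{k="v",...}` label set into a dict (values are not unescaped)."""
    return {key: value for key, value in LABEL_PAIR.findall(text)}


def series_identity(name: str, labels: dict) -> tuple:
    """A series is identified by its metric name plus its full set of labels."""
    return (name, frozenset(labels.items()))


first = parse_label_set(
    '{endpoint="http",instance="",job="hello",namespace="default",'
    'pod="hello-946046218-397x2",service="hello-world"}'
)
second = parse_label_set(
    '{endpoint="http",instance="",job="hello",namespace="default",'
    'pod="hello-946046218-5h39f",service="hello-world"}'
)

# Labels whose values differ between the two series.
differing = {key for key in first if first[key] != second.get(key)}
```

Here `differing` contains only `pod`, which is the per-pod dimension that a single dotted metric path cannot express.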
  12. node_exporter: Prometheus exporter for hardware and OS metrics exposed by *NIX kernels, written in Go with pluggable metric collectors.
  13. Metrics
      ● CPU (system, user, nice, iowait, steal, idle, irq, softirq, guest)
      ● Memory (Apps, Buffers, Cached, Free, Slab, SwapCached, PageTables, VmallocUser, Swap, Committed, Mapped, Active, Inactive)
      ● Load
      ● Disk space used, in percent
      ● Disk utilization per device
      ● Disk I/Os per device (read, write)
      ● Disk throughput per device (read, write)
      ● Context switches
      ● Network traffic (in, out)
      ● Netstat (Established)
      ● UDP stats (InDatagrams, InErrors, OutDatagrams, NoPorts)
      ● Conntrack
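Several of the metrics above are derived values rather than raw readings. As a rough sketch of the arithmetic, assuming the exporter reports filesystem size and available bytes as gauges and per-mode CPU time as ever-increasing counters, the percentages could be computed as follows. The function names are hypothetical, and treating every mode except `idle` as busy is an assumption; some dashboards also exclude `iowait`.

```python
def disk_used_percent(size_bytes: float, avail_bytes: float) -> float:
    """Percentage of a filesystem in use, from its size and available bytes."""
    if size_bytes <= 0:
        raise ValueError("size_bytes must be positive")
    return 100.0 * (size_bytes - avail_bytes) / size_bytes


def cpu_busy_percent(previous: dict, current: dict) -> float:
    """Percentage of CPU time spent in non-idle modes between two scrapes.

    Both arguments map a CPU mode (user, system, idle, ...) to the
    cumulative seconds counter observed at that scrape.
    """
    deltas = {mode: current[mode] - previous[mode] for mode in current}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("no CPU time elapsed between the two scrapes")
    return 100.0 * (total - deltas.get("idle", 0.0)) / total
```

For example, a 100 GB filesystem with 25 GB available is 75% used, and counters that advance by 10 s user, 5 s system, and 85 s idle give 15% busy.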
  14. AWS EC2 config: relabelling tags
      ● __meta_ec2_availability_zone: availability zone
      ● __meta_ec2_instance_id: instance ID
      ● __meta_ec2_instance_state: instance state
      ● __meta_ec2_instance_type: instance type
      ● __meta_ec2_private_ip: private IP
      ● __meta_ec2_public_dns_name: public DNS name
      ● __meta_ec2_public_ip: public IP
      ● __meta_ec2_tag_<tagkey>: custom tag key
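Labels that start with `__` are only available during relabelling and are dropped from the final target, so any discovered value worth keeping has to be copied to a regular label. The sketch below emulates that copy step in plain Python; it is a simplified stand-in for a relabelling rule, not Prometheus code. The function `relabel`, the mapping, and all discovered values are hypothetical.

```python
def relabel(discovered: dict, mapping: dict) -> dict:
    """Copy selected __meta_* labels to target labels, then drop all __ labels."""
    # Labels without the __ prefix survive unchanged.
    target = {key: val for key, val in discovered.items() if not key.startswith("__")}
    # Copy each requested meta label to its destination label name.
    for source, destination in mapping.items():
        if source in discovered:
            target[destination] = discovered[source]
    return target


# Hypothetical labels as EC2 discovery might attach them to one instance.
discovered = {
    "job": "ec2-nodes",
    "__meta_ec2_instance_id": "i-0123456789abcdef0",
    "__meta_ec2_availability_zone": "us-east-1a",
    "__meta_ec2_tag_Name": "api-server",
}

# Hypothetical choice of which meta labels to keep, and under which names.
mapping = {
    "__meta_ec2_instance_id": "instance_id",
    "__meta_ec2_availability_zone": "availability_zone",
    "__meta_ec2_tag_Name": "name",
}

labels = relabel(discovered, mapping)
```

After this step `labels` holds `job`, `instance_id`, `availability_zone`, and `name`, and none of the `__meta_ec2_*` keys.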
  15. Alerting
  16. Approach #1: Prometheus on EC2. Diagram labels: EC2, Kubernetes, node ex, µServices, AWS EC2
  17. Findings from Approach #1
      #1. Getting EC2 server metrics is easy and straightforward, because Prometheus provides EC2 discovery.
      #2. Getting Kubernetes and application metrics is very complex. It takes 300+ lines of configuration to support Kubernetes metrics alone.
  18. Approach #2: Use the Prometheus Operator
  19. What is the Prometheus Operator? The Prometheus Operator creates, configures, and manages Prometheus monitoring instances. It automatically generates monitoring target configurations based on familiar Kubernetes label queries.
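A label query selects every object whose labels contain all of the requested key/value pairs. The sketch below shows that equality-based matching in plain Python, as an illustration of the idea rather than the Operator's own code. The service names, labels, and the functions `matches` and `select_targets` are hypothetical.

```python
def matches(selector: dict, labels: dict) -> bool:
    """True when every selector key/value pair is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def select_targets(selector: dict, services: list) -> list:
    """Names of the services whose labels satisfy the selector."""
    return [svc["name"] for svc in services if matches(selector, svc["labels"])]


# Hypothetical services and their labels.
services = [
    {"name": "hello-world", "labels": {"app": "hello", "team": "platform"}},
    {"name": "billing", "labels": {"app": "billing", "team": "payments"}},
    {"name": "hello-canary", "labels": {"app": "hello", "track": "canary"}},
]

selected = select_targets({"app": "hello"}, services)
```

With the selector `{"app": "hello"}`, `selected` contains `hello-world` and `hello-canary`. A new service carrying that label would be picked up by the same query, with no change to the monitoring configuration itself.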
  20. ServiceMonitor Custom Resource Definition (CRD)
  21. Prometheus Custom Resource Definition (CRD)