Modern Monitoring [ with Prometheus ]

  1. 1. Tikal Knowledge Haggai Philip Zagury - DevOps Group Lead - Tikal Knowledge
  2. 2. FullStack Developers Israel INTRO - WHO WE ARE ? ▸ Tikal helps ISVs in Israel & abroad with their technological challenges. ▸ Our engineers are full-stack developers with expertise in Android, DevOps, Java, JS, Ruby & Python. ▸ We are passionate about technology and specialize in open-source technologies. ▸ Our tech and group leaders help establish & enhance existing software teams with innovative & creative thinking. https://www.meetup.com/full-stack-developer-il/
  3. 3. INTRODUCTION TO MODERN MONITORING CURRENT STATUS [ INFRASTRUCTURE ] ▸ AWS, Cloud, Hybrid / Multi Cloud ▸ Define metrics and system health based on experience and application specific behaviors. ▸ Many False Positives ▸ Scaling is hard [ semi-auto, manual ]
  4. 4. INTRODUCTION TO MODERN MONITORING COMMON MONITORING STATUS ▸ OPS own the monitoring domain ▸ Define metrics and system health based on experience and application specific behaviors. ▸ Many False Positives ▸ Scaling is hard [ semi-auto, manual ]
  5. 5. INTRODUCTION TO MODERN MONITORING COMMON MONITORING SOLUTIONS ▸ CloudWatch ▸ New Relic ▸ Nagios ▸ AppDynamics ▸ Datadog ▸ Many more …
  6. 6. INTRODUCTION TO MODERN MONITORING GOALS ▸ Improve existing monitoring and RCA indicators ▸ Reduce false positives & ‘customer driven alerting’ ▸ Proactively identify data anomalies / diversions ▸ Provide meaningful / intelligent notifications [ severity, SLA compliance etc ] ▸ Proactively remediate commonly known issues, or set the foundation of a robust substitute ▸ Provide KPI integration policy & methodology for both DevOps & R&D teams
  7. 7. INTRODUCTION TO MODERN MONITORING CHALLENGES ▸ Preserve the knowledge and insights in the existing monitoring system ▸ Cultural changes: ▸ APM is part of the development process ▸ Monitoring tools are part of the developer stack (or developers will be woken up by any issue with their code/app) ▸ On-call isn’t only for OPS … everybody’s accountable ▸ Break down the “wall of confusion” between dev and ops
  8. 8. PHILOSOPHY
  9. 9. The Gap of Traditional Monitoring - We know what we want to know …
  10. 10. System Metrics Not enough || Too much a little too late
  11. 11. We do not always know what we are looking at / for …
  12. 12. Is this OK ?! || Normal What happened at 4AM
  13. 13. If you’re lucky + = No action needed
  14. 14. Go back to sleep ( you still woke up ! )
  15. 15. REALITY Murphy’s law …
  16. 16. Stop using Nagios (so it can die peacefully) Feb 13, 2014 [ slideshare ]
  17. 17. In 2 words: Configuration files… In a few more: - resources - services - dependencies - …
  18. 18. Traditional Monitoring • Reliable • Durable • Scalable Conclusion … system monitoring does not suffice, enter APM
  19. 19. HOW DID WE GET HERE
  20. 20. INTRODUCTION TO MODERN MONITORING TRADITIONAL MONITORING WAS (IS) ALL ABOUT THE “BLACK BOX” | “OS” METRICS ▸ All we care about is that the system is OK … APPLICATION FRONTEND APPLICATION BACKEND APPLICATION DATABASE
  21. 21. INTRODUCTION TO MODERN MONITORING OPS ARE WORKING ON OPTIMIZING INFRASTRUCTURE … ▸ Throw more RAM & “Reports” ▸ Add another node to the “FE cluster” ▸ Add another shard to the DB … ▸ …. APPLICATION …
  22. 22. INTRODUCTION TO MODERN MONITORING IN THE PAST ~10 YEARS ▸ Developers have started to implement METRICS ▸ Organizations are adopting Standards ▸ Common metrics have become a commodity
  23. 23. REALITY PREVAILS
  24. 24. APPLICATION FRONTEND APPLICATION BACKEND APPLICATION DATABASE APPLICATION …
  25. 25. Multiple Dimensions • [ Stability ] • Ops dimension • [ Innovation ] • Dev dimension • Product dimension
  26. 26. Even More • Environment [ stg, uat, prod ] • Application Stack(s) || tags || types • Business metrics
  27. 27. TEAMS | SCOPES | METRICS - COME TOGETHER
  28. 28. Tikal Knowledge
  29. 29. Apply
  30. 30. INTRODUCTION TO MODERN MONITORING MONITORING CRITERIA ▸ Server (OS) level monitoring ▸ Application Monitoring (APM) ▸ Perimeter (External website) monitoring ▸ Event driven remediation ▸ Alerting and Escalation ▸ Associated log data & anomaly detection
  31. 31. INTRODUCTION TO MODERN MONITORING REQUIRED FEATURES ▸ Accessibility ▸ Scheduling ▸ SLAs assured ▸ Auth & Authorization ▸ Escalation ▸ Durable & Resilient ▸ Forensics ▸ Automatic ▸ Flexible & Elastic ▸ Accountable
  32. 32. INTRODUCTION TO MODERN MONITORING IT’S AN ITERATIVE PROCESS ▸ How quick did we recover ? ▸ What worked / Didn’t work ? ▸ Iterative improvements [ Chaos Monkey, 10 story test ] ▸ RCA -> Remediation [ a.k.a False positive lifecycle ]
  33. 33. METHODOLOGY
  34. 34. INTRODUCTION TO MODERN MONITORING HOW TO DEFINE A METRIC OR ALERT VS. HOW TO STORE DATA ▸ A Metric’s Lifecycle & Design ▸ Time Series Data stream(s) || source(s) ▸ Common tagging ▸ Metric naming conventions and implications ▸ Micro Services, Integration of Traditional and New Generation solutions ▸ Choose short, mid & long term tools / services
  35. 35. INTRODUCTION TO MODERN MONITORING A METRIC’S LIFECYCLE NEW (A) METRIC: INFRASTRUCTURE (OS) | APPLICATION | EXTERNAL (DEPENDENCY / ENDPOINT) - REMEDIABLE ? - ALERTABLE ? - LOG CORRELATION - SCOPE OF IMPACT - LEARN IN DEV | STG - DEFINE IN DEV | STG - SHIP TO PROD
  36. 36. INTRODUCTION TO MODERN MONITORING A METRIC’S LIFECYCLE - “TAG-ABLE” == FILTERABLE | MEASURABLE | QUANTIFIABLE NEW (A) METRIC: INFRASTRUCTURE (OS) | APPLICATION | EXTERNAL (DEPENDENCY / ENDPOINT) - REMEDIABLE ? - ALERTABLE ? - LOG CORRELATION - SCOPE OF IMPACT - LEARN IN DEV | STG - DEFINE IN DEV | STG - SHIP TO PROD - ENVIRONMENT: DEVELOPMENT | STAGING | PRODUCTION
  37. 37. INTRODUCTION TO MODERN MONITORING A METRIC’S LIFECYCLE NEW (A) METRIC: INFRASTRUCTURE (OS) | APPLICATION | EXTERNAL (DEPENDENCY / ENDPOINT) - REMEDIABLE ? - ALERTABLE ? - LOG CORRELATION - SCOPE OF IMPACT - LEARN IN DEV | STG - DEFINE IN DEV | STG - SHIP TO PROD - QUANTIFIABLE METRICS: SEVERITY, CRITICAL STATE - EXPOSING A SERVICE - CONSUMING A SERVICE - WHY DOES MY SERVICE HAVE AN OS IMPACT ? - IS IT BY DESIGN ? - FALLBACK METHODS ? - ALTERNATE ENDPOINTS / RETRY ? - FEATURE TOGGLE - DEFINE SEVERITY
  38. 38. INTRODUCTION TO MODERN MONITORING TSD PRINCIPLES Credit->http://opentsdb.net/overview.html
  39. 39. INTRODUCTION TO MODERN MONITORING DATAPOINTS Credit->https://www.datadoghq.com/blog/the-power-of-tagged-metrics/ In tools like Prometheus you don't need the timestamp; it just uses the collection timestamp
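A minimal sketch of what a tagged data point looks like and why tags make a series filterable, assuming a simplified in-memory model. The `DataPoint` class, the `select` helper, and the sample values are hypothetical illustrations, not part of any monitoring library.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataPoint:
    """One sample of a tagged time series (illustrative model only)."""
    metric: str
    value: float
    timestamp: float
    tags: Dict[str, str] = field(default_factory=dict)


def select(points: List[DataPoint], metric: str, **wanted: str) -> List[DataPoint]:
    """Return the points of `metric` whose tags contain every wanted key/value."""
    return [
        p for p in points
        if p.metric == metric and all(p.tags.get(k) == v for k, v in wanted.items())
    ]


# Hypothetical samples: the same metric name, sliced by environment and stack.
points = [
    DataPoint("http_requests_total", 120.0, 1000.0, {"env": "prod", "stack": "frontend"}),
    DataPoint("http_requests_total", 30.0, 1000.0, {"env": "stg", "stack": "frontend"}),
    DataPoint("http_requests_total", 75.0, 1000.0, {"env": "prod", "stack": "backend"}),
]

# Aggregate across one tag dimension while ignoring the others.
prod_total = sum(p.value for p in select(points, "http_requests_total", env="prod"))
```

The same metric name can be aggregated or filtered along any tag without defining a new metric per environment or stack.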
  40. 40. INTRODUCTION TO MODERN MONITORING MIX ’N’ MATCH
  41. 41. INTRODUCTION TO MODERN MONITORING SHORT | MID | LONG TERM SOLUTIONS
  42. 42. PROMETHEUS https://github.com/prometheus/prometheus
  43. 43. INTRODUCTION TO MODERN MONITORING FEATURES ▸ Open-source systems monitoring and alerting toolkit ▸ A multi-dimensional data model (time series identified by metric name and key/value pairs) ▸ A flexible query language to leverage this dimensionality ▸ No reliance on distributed storage; single server nodes are autonomous ▸ Time series collection happens via a pull model over HTTP ▸ Pushing time series is supported via an intermediary gateway ▸ Targets are discovered via service discovery or static configuration ▸ Multiple modes of graphing and dashboarding support
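The pull model means each target serves its current metric values over HTTP and the server scrapes them. Below is a rough standard-library sketch of such an endpoint, assuming the plain-text exposition format (`# HELP`, `# TYPE`, then `name{labels} value` lines). The `render_counter` helper, the handler class, and the sample values are hypothetical. A real service would normally use an official client library, and this sketch does not escape special characters in label values.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Dict, List, Tuple

Sample = Tuple[Dict[str, str], float]


def render_counter(name: str, help_text: str, samples: List[Sample]) -> str:
    """Render one counter family in the text exposition format."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples:
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        selector = f"{{{label_str}}}" if label_str else ""
        lines.append(f"{name}{selector} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Serves a fixed, hypothetical sample on /metrics for a scraper to pull."""

    def do_GET(self) -> None:
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_counter(
            "http_requests_total",
            "Total HTTP requests.",
            [({"method": "get", "code": "200"}, 3.0)],
        ).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Blocking helper; call it manually to expose the endpoint."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` and pointing a scrape job at `host:8000/metrics` would be enough to try the pull flow locally; port 8000 is an arbitrary choice.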
  44. 44. INTRODUCTION TO MODERN MONITORING PROMETHEUS ARCHITECTURE Dashboarding Prometheus Server Alertmanager Retrieval / Collection Data Series Storage [DB] PromQL web UI Prometheus server Prometheus server(s) Push Gateway Service Discovery Providers Prometheus server Prometheus exporters
  45. 45. INTRODUCTION TO MODERN MONITORING UNTIL NOW ‣ Try providing this to each developer ‣ Sensu has a very similar approach to APM … ‣ Complexity is the barrier …
  46. 46. INTRODUCTION TO MODERN MONITORING UNTIL NOW ‣ Pull has become an advantage … ‣ Severity is implied [TSD] ‣ False Positives reduction ‣ Docker makes it super simple ‣ Go Lang lightweight approach
  47. 47. IMPLEMENTATION
  48. 48. INTRODUCTION TO MODERN MONITORING IMPLEMENTATION ‣ Review old system metrics & capabilities and decide what’s good and what’s bad ‣ What can move ‣ What needs to stay | integrate into the new system ‣ Prometheus deployment is automated from day 1 ‣ Prometheus exporter services are tagged and labeled per application stack | layer ‣ Preferably Dockerized ‣ Metric design workshops | meetings | Slack group ‣ Alert design workshops | meetings | Slack group ‣ Teams’ metric tags and Alerting & Escalation
  49. 49. INTRODUCTION TO MODERN MONITORING STEP1 - IMPLEMENT DISCOVERY AWS Discovery -> https://github.com/prometheus/prometheus/tree/master/discovery NEW NODE DEPLOYMENT SERVICE DISCOVERY DEV STAGING PRODUCTION STACK / APP NAME Alertmanager
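One way to picture discovery with environment and stack labels is a sketch that turns an inventory of nodes into labelled target groups. It assumes file-based service discovery (the JSON list of `targets` plus `labels` that Prometheus can read through `file_sd_configs`) rather than the AWS discovery linked on the slide. The inventory, the `build_target_groups` helper, and the default port 9100 are hypothetical.

```python
import json
from collections import defaultdict
from typing import Dict, List


def build_target_groups(nodes: List[Dict[str, str]], port: int = 9100) -> List[dict]:
    """Group hosts by (env, stack) into file_sd-style target groups."""
    grouped = defaultdict(list)
    for node in nodes:
        grouped[(node["env"], node["stack"])].append(f'{node["host"]}:{port}')
    return [
        {"targets": sorted(targets), "labels": {"env": env, "stack": stack}}
        for (env, stack), targets in sorted(grouped.items())
    ]


# Hypothetical inventory; in practice this would come from the cloud provider.
inventory = [
    {"host": "10.0.0.5", "env": "staging", "stack": "frontend"},
    {"host": "10.0.0.6", "env": "staging", "stack": "frontend"},
    {"host": "10.0.1.9", "env": "production", "stack": "backend"},
]

target_groups = build_target_groups(inventory)
file_sd_document = json.dumps(target_groups, indent=2)
```

Every new node then arrives already tagged with its environment and stack, so dashboards and alerts can filter on those labels from the first scrape.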
  50. 50. INTRODUCTION TO MODERN MONITORING STEP2 - IMPLEMENT EXPORTERS https://prometheus.io/docs/instrumenting/exporters/ Official node exporter -> https://github.com/prometheus/node_exporter Mssql Exporter -> https://hub.docker.com/r/awaragi/prometheus-mssql-exporter/ Nagios Exporter -> https://github.com/m-lab/prometheus-nagios-exporter
  51. 51. INTRODUCTION TO MODERN MONITORING STEP3 - IMPLEMENT CUSTOM APPLICATION METRICS https://prometheus.io/docs/instrumenting/exporters/ Windows WMI -> https://github.com/martinlindhe/wmi_exporter Java -> https://github.com/prometheus/jmx_exporter node.js -> https://www.npmjs.com/browse/keyword/prometheus .Net -> https://github.com/andrasm/prometheus-net
  52. 52. INTRODUCTION TO MODERN MONITORING STEP4 - ADAPT TO YOUR INFRA MONITORING [ FILTER || TAG || SELECTOR ] kubernetes_sd_config
  53. 53. INTRODUCTION TO MODERN MONITORING STEP 5 - METRIC DESIGN ‣ Review sample METRICS and GRAPHS ‣ Define | Reuse ‣ Naming conventions { https://prometheus.io/docs/practices/naming/ } ‣ Quantifiable [ numbers not strings … ]
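A small sketch of how naming conventions could be checked during a metric design review. It assumes three rules taken from the Prometheus naming practices: names match `[a-zA-Z_:][a-zA-Z0-9_:]*`, counters end in `_total`, and base units such as seconds are preferred over milliseconds. The `naming_issues` helper and its messages are hypothetical.

```python
import re
from typing import List

# Character set allowed in a metric name.
METRIC_NAME_RE = re.compile(r"^[a-zA-Z_:][a-zA-Z0-9_:]*$")


def naming_issues(name: str, metric_type: str) -> List[str]:
    """Return a list of convention problems for a proposed metric name."""
    if not METRIC_NAME_RE.match(name):
        return ["name contains characters outside [a-zA-Z0-9_:]"]
    issues = []
    if metric_type == "counter" and not name.endswith("_total"):
        issues.append("counters should end in _total")
    if name.endswith(("_ms", "_milliseconds")):
        issues.append("prefer base units such as seconds")
    return issues
```

For example, `naming_issues("request_latency_ms", "gauge")` flags the unit, while `naming_issues("http_requests_total", "counter")` returns an empty list.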
  54. 54. DASHBOARDS
  55. 55. INTRODUCTION TO MODERN MONITORING DEVELOPER TOOL
  56. 56. INTRODUCTION TO MODERN MONITORING DEVELOPER TOOL - SIMPLE GRAPHS
  57. 57. INTRODUCTION TO MODERN MONITORING DEVELOPER TOOL - METRICS - USING PROMQL ▸ Simple queries: ▸ rate(http_requests_total[5m]) ▸ Linear predictions ▸ predict_linear(node_filesystem_free[1h], 4*3600)
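To give a feel for what the two queries compute, here is a simplified sketch. It assumes `rate()` can be approximated as the per-second increase between the first and last sample in the window, and `predict_linear()` as a least-squares line extrapolated forward. The real functions also handle counter resets and window-boundary extrapolation, and they predict from the evaluation time rather than from the last sample. The function names and sample data below are hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, value)


def simple_rate(samples: List[Sample]) -> float:
    """Per-second increase between the first and last sample (no reset handling)."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)


def predict_linear_sketch(samples: List[Sample], seconds_ahead: float) -> float:
    """Fit a least-squares line and evaluate it `seconds_ahead` after the last sample."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    cov = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    slope = cov / var_t
    intercept = mean_v - slope * mean_t
    return intercept + slope * (samples[-1][0] + seconds_ahead)


# Hypothetical counter: 300 requests over 300 seconds.
requests = [(0.0, 100.0), (300.0, 400.0)]
requests_per_second = simple_rate(requests)

# Hypothetical free-space gauge shrinking by one unit per second.
free_space = [(0.0, 1000.0), (60.0, 940.0), (120.0, 880.0)]
predicted_free = predict_linear_sketch(free_space, 4 * 3600)
```

A negative prediction for the free-space series means the disk would run out within the four-hour horizon, which is the condition an alert on `predict_linear(...) < 0` would catch.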
  58. 58. INTRODUCTION TO MODERN MONITORING GRAFANA - SIMILAR WORKING EXPERIENCE - MUCH NICER
  59. 59. INTRODUCTION TO MODERN MONITORING GRAFANA - SIMILAR WORKING EXPERIENCE - MUCH NICER
  60. 60. INTRODUCTION TO MODERN MONITORING STEP 6 - ALERT DESIGN ‣ Review new METRICS and GRAPHS define | design thresholds ‣ Define Severity ‣ Ownership ‣ Escalation ladder
  61. 61. ALERTING
  62. 62. INTRODUCTION TO MODERN MONITORING ALERT DESIGN ▸ ALERT <alert name> ▸ IF <expression> ▸ [ FOR <duration> ] ▸ [ LABELS <label set> ] ▸ [ ANNOTATIONS <label set> ]
  63. 63. INTRODUCTION TO MODERN MONITORING ALERT FOR ANY INSTANCE THAT IS UNDER HIGH LOAD
      ALERT high_load
        IF node_load1 > 0.5
        ANNOTATIONS {
          description="{{ $labels.instance }} of job {{ $labels.job }} is under high load.",
          summary="Instance {{ $labels.instance }} under high load"
        }
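The optional `FOR <duration>` clause in the rule syntax is what keeps short spikes from paging anyone: the expression has to stay true for the whole duration before the alert fires. A minimal sketch of that behaviour, assuming the three states inactive, pending and firing; the `AlertState` class and the timings are hypothetical and much simpler than the real rule evaluator.

```python
from typing import Optional


class AlertState:
    """Tracks one alert through inactive -> pending -> firing."""

    def __init__(self, for_seconds: float) -> None:
        self.for_seconds = for_seconds
        self.active_since: Optional[float] = None

    def evaluate(self, condition_true: bool, now: float) -> str:
        """Feed one rule evaluation; `now` is a timestamp in seconds."""
        if not condition_true:
            self.active_since = None  # any false evaluation resets the timer
            return "inactive"
        if self.active_since is None:
            self.active_since = now
        if now - self.active_since >= self.for_seconds:
            return "firing"
        return "pending"


# Hypothetical rule with FOR 5m.
alert = AlertState(for_seconds=300)
states = [
    alert.evaluate(True, 0),     # condition just became true
    alert.evaluate(True, 299),   # still inside the FOR window
    alert.evaluate(True, 300),   # held for the full duration
    alert.evaluate(False, 310),  # condition cleared
    alert.evaluate(True, 320),   # timer starts over
]
```

Without a `FOR` clause the alert goes straight to firing on the first true evaluation, which is one common source of false positives.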
  64. 64. INTRODUCTION TO MODERN MONITORING STILL LOOKING FOR ONLINE EDITOR FOR EASE OF DEVELOPMENT https://github.com/alerta/prometheus-config
  65. 65. INTRODUCTION TO MODERN MONITORING SIMPLE YAML FILE [ callouts: WHERE TO ROUTE TO | ROUTER DETAILS ]
      route:
        receiver: 'slack'
      receivers:
        - name: 'slack'
          slack_configs:
            - send_resolved: true
              username: '<username>'
              channel: '#<channel-name>'
              api_url: '<incoming-webhook-url>'
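The `route` block decides which receiver gets an alert. A rough sketch of that decision, assuming a routing tree where the first child route whose `match` labels all equal the alert's labels wins and the parent receiver is the fallback. The `pick_receiver` helper, the child routes, and the receiver names `pager` and `db-slack` are hypothetical, and the sketch requires every route to name its own receiver.

```python
from typing import Dict


def pick_receiver(route: dict, labels: Dict[str, str]) -> str:
    """Walk the tree depth-first and return the receiver for an alert's labels."""
    for child in route.get("routes", []):
        matchers = child.get("match", {})
        if all(labels.get(k) == v for k, v in matchers.items()):
            return pick_receiver(child, labels)
    return route["receiver"]


# Hypothetical tree: the slide's single 'slack' receiver plus two child routes.
routing_tree = {
    "receiver": "slack",
    "routes": [
        {"match": {"severity": "critical"}, "receiver": "pager"},
        {"match": {"team": "db"}, "receiver": "db-slack"},
    ],
}

critical_target = pick_receiver(routing_tree, {"severity": "critical", "team": "db"})
default_target = pick_receiver(routing_tree, {"severity": "warning"})
```

Ordering matters in this sketch: a critical database alert goes to `pager` because that child route is listed first.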
  66. 66. INTRODUCTION TO MODERN MONITORING ALERTING [ callouts: Channel Configuration Variables | Global configuration ]
      global:
        resolve_timeout: 5m
        smtp_require_tls: true
        pagerduty_url: https://events.pagerduty.com/generic/2010-04-15/create_event.json
        hipchat_url: https://api.hipchat.com/
        opsgenie_api_host: https://api.opsgenie.com/
        victorops_api_url: https://alert.victorops.com/integrations/generic/20131114/alert/
      route:
        receiver: slack
      receivers:
        - name: slack
          slack_configs:
            - send_resolved: true
              api_url: <secret>
              channel: '#<channel-name>'
              username: <username>
              color: '{{ if eq .Status "firing" }}danger{{ else }}good{{ end }}'
              title: '{{ template "slack.default.title" . }}'
              title_link: '{{ template "slack.default.titlelink" . }}'
              pretext: '{{ template "slack.default.pretext" . }}'
              text: '{{ template "slack.default.text" . }}'
              fallback: '{{ template "slack.default.fallback" . }}'
              icon_emoji: '{{ template "slack.default.iconemoji" . }}'
              icon_url: '{{ template "slack.default.iconurl" . }}'
      templates: []
  67. 67. INTRODUCTION TO MODERN MONITORING ALERT TEMPLATING ▸ What | How to say … https://prometheus.io/blog/2016/03/03/custom-alertmanager-templates/
      - send_resolved: true
        api_url: <secret>
        channel: '#<channel-name>'
        username: <username>
        color: '{{ if eq .Status "firing" }}danger{{ else }}good{{ end }}'
        title: '{{ template "slack.default.title" . }}'
        title_link: '{{ template "slack.default.titlelink" . }}'
        pretext: '{{ template "slack.default.pretext" . }}'
        text: '{{ template "slack.default.text" . }}'
        fallback: '{{ template "slack.default.fallback" . }}'
        icon_emoji: '{{ template "slack.default.iconemoji" . }}'
        icon_url: '{{ template "slack.default.iconurl" . }}'
  68. 68. INTRODUCTION TO MODERN MONITORING SILENCING, VIA UI / API
  69. 69. INTRODUCTION TO MODERN MONITORING ANSWERS REQUIRED FEATURES ▸ Accessibility ▸ Scheduling ▸ SLAs assured ▸ Auth & Authorization ▸ Escalation ▸ Durable & Resilient ▸ Forensics ▸ Automatic ▸ Flexible & Elastic ▸ Accountable
  70. 70. INTRODUCTION TO MODERN MONITORING NEXT STEPS INFRASTRUCTURE (OS) | APPLICATION | EXTERNAL (DEPENDENCY / ENDPOINT) - REMEDIABLE ? - ALERTABLE ? - LOG CORRELATION - ALERT MANAGER - LEGACY - IDENTIFY - CHOOSE
  71. 71. INTRODUCTION TO MODERN MONITORING DEMO TIME ‣ Docker-compose - ready for R&D to start using to create custom application metrics. ‣ Prometheus, node_exporter, Alertmanager, cAdvisor, Grafana
  72. 72. INTRODUCTION TO MODERN MONITORING DOCKER SETTINGS - VOLUMES, NETWORKS [ callouts: Docker-compose version | Docker volumes for prometheus and grafana | Docker networks ]
      version: '2'
      volumes:
        prometheus_data: {}
        grafana_data: {}
      networks:
        front-tier:
          driver: bridge
        back-tier:
          driver: bridge
  73. 73. INTRODUCTION TO MODERN MONITORING PROMETHEUS - OFFICIAL CONTAINER [ callouts: Docker service name | Docker volumes for prometheus and grafana | Expose as service on specified port | Ports to expose as service | Link to cadvisor & alertmanager | Network placement ‘back-tier’ | Configuration ]
      services:
        prometheus:
          image: prom/prometheus
          container_name: prometheus
          volumes:
            - ./prometheus/:/etc/prometheus/
            - prometheus_data:/prometheus
          command:
            - '-config.file=/etc/prometheus/prometheus.yml'
            - '-storage.local.path=/prometheus'
            - '-alertmanager.url=http://alertmanager:9093'
          expose:
            - 9090
          ports:
            - 9090:9090
          links:
            - cadvisor:cadvisor
            - alertmanager:alertmanager
          depends_on:
            - cadvisor
          networks:
            - back-tier
  74. 74. INTRODUCTION TO MODERN MONITORING NODE-EXPORTER [ NODE METRICS COLLECTOR ] [ callouts: Access to /proc /sys | What to mount from OS to container for metric collection ]
      node-exporter:
        container_name: node-exporter
        image: prom/node-exporter
        volumes:
          - /proc:/host/proc:ro
          - /sys:/host/sys:ro
          - /:/rootfs:ro
        command: '-collector.procfs=/host/proc -collector.sysfs=/host/sys -collector.filesystem.ignored-mount-points="^(/rootfs|/host|)/(sys|proc|dev|host|etc)($$|/)" -collector.filesystem.ignored-fs-types="^(sys|proc|auto|cgroup|devpts|ns|au|fuse.lxc|mqueue)(fs|)$$"'
        expose:
          - 9100
        networks:
          - back-tier
  75. 75. INTRODUCTION TO MODERN MONITORING ALERT MANAGER
      alertmanager:
        image: prom/alertmanager
        ports:
          - 9093:9093
        volumes:
          - ./alertmanager/:/etc/alertmanager/
        networks:
          - back-tier
        command:
          - '-config.file=/etc/alertmanager/config.yml'
          - '-storage.path=/alertmanager'
  76. 76. INTRODUCTION TO MODERN MONITORING CADVISOR
      cadvisor:
        image: google/cadvisor
        volumes:
          - /:/rootfs:ro
          - /var/run:/var/run:rw
          - /sys:/sys:ro
          - /var/lib/docker/:/var/lib/docker:ro
        expose:
          - 8080
        networks:
          - back-tier
  77. 77. INTRODUCTION TO MODERN MONITORING GRAFANA
      grafana:
        image: grafana/grafana
        depends_on:
          - prometheus
        ports:
          - 3000:3000
        volumes:
          - grafana_data:/var/lib/grafana
        env_file:
          - config.monitoring
        networks:
          - back-tier
          - front-tier
  78. 78. INTRODUCTION TO MODERN MONITORING DOCKER PS
      CONTAINER ID  IMAGE               COMMAND                 CREATED       STATUS        PORTS                   NAMES
      3dcfd7c289cb  grafana/grafana     "/run.sh"               21 hours ago  Up 4 minutes  0.0.0.0:3000->3000/tcp  prometheus_grafana_1
      2b2817fc0bd9  prom/prometheus     "/bin/prometheus -..."  21 hours ago  Up 4 minutes  0.0.0.0:9090->9090/tcp  prometheus
      d2c6849d3bd9  google/cadvisor     "/usr/bin/cadvisor..."  21 hours ago  Up 4 minutes  8080/tcp                prometheus_cadvisor_1
      d4a3c3ceb97d  prom/node-exporter  "/bin/node_exporte..."  21 hours ago  Up 4 minutes  9100/tcp                node-exporter
      75eb08791ea9  prom/alertmanager   "/bin/alertmanager..."  21 hours ago  Up 4 minutes  0.0.0.0:9093->9093/tcp  prometheus_alertmanager_1
  79. 79. INTRODUCTION TO MODERN MONITORING DEMO PROJECT ON GITHUB https://github.com/shelleg/monlog-compose-stack
  80. 80. INTRODUCTION TO MODERN MONITORING ‣ All containers - monitored by prometheus + graphed in a small nice project.
  81. 81. ROLLOUT [ LLD ]
  82. 82. INTRODUCTION TO MODERN MONITORING PLACEMENT OPTIONS ‣ 1 main prometheus server vs. 1 Prometheus server per team ‣ 1 Alert-manager [ with pre-defined “receivers” ] vs. 1 per team / concern
  83. 83. INTRODUCTION TO MODERN MONITORING DEPLOYMENT OPTIONS ‣ Automate deployment of prometheus server(s) / Alert-manager [ pre-defined “receivers” ] ‣ Ansible, puppet etc ‣ Jenkins ‣ The combination of the 2 ;) ‣ Automation helps solve the “one 2 Many” dilemma IMHO …
  84. 84. INTRODUCTION TO MODERN MONITORING DEVELOPER STACK ‣ Options: ‣ Personal Docker / Docker-compose [ private fork if desired ] ‣ A small startup.cmd / startup.sh starting the Go applications of prometheus & alertmanager ‣ A centralized Grafana / Alertmanager with only prometheus on the dev machine ‣ Toolkit for: ‣ Developing metrics, alarms, graphs ‣ Adding exporters to configuration [ tendency :: as common as you develop new services ] ‣ SDLC -> Git pull/merge request mechanism
  85. 85. INTRODUCTION TO MODERN MONITORING DEVELOPER STACK(S) - EXAMPLE
  86. 86. INTRODUCTION TO MODERN MONITORING ALERTS IN SCM MASTER -> STG -> PRD
  87. 87. INTRODUCTION TO MODERN MONITORING POPULATE ALERTS | METRICS | DASHBOARDS VIA SCM 1. Use “ready made” || good starting point graphs from the Grafana dashboard exchange, or build your own 2. Customize 3. Add / push to git master branch 4. “ci” server -> listen on Git hook -> push to staging 5. “ci” server -> wait for manual trigger -> push to production
  88. 88. INTRODUCTION TO MODERN MONITORING CONTINUOUS DELIVERY OPTIONS [ ADDING AN ALERT SAMPLE WORKFLOW ] master (dev) staging production DEVELOP DEPLOY TO STAGE DEPLOY TO PROD 1 centralized repo branch per env / prometheus instance
  89. 89. INTRODUCTION TO MODERN MONITORING CONTINUOUS DELIVERY OPTIONS [ ADDING GRAPHS ] master (dev) staging production DEVELOP DEPLOY TO STAGE DEPLOY TO PROD “Grafana Dashboard hub” - separate repo ? - part of monitoring repo ?
  90. 90. INTRODUCTION TO MODERN MONITORING CI PIPELINE - DATA ORIGINS & PRESENTATION Exporters REGION POD INSTANCE * App Metrics OS Metrics Filter Tags & Alerts
  91. 91. INTRODUCTION TO MODERN MONITORING CI PIPELINE DEV STAGING PRODUCTION STACK / APP NAME ALERTMANAGER ALERTMANAGER Web-hook (PR-builder) GRAFANA GRAFANA OPS “CLEANUP” ROUTINE(S)
  92. 92. INTRODUCTION TO MODERN MONITORING BUILDING THE PIPELINE ‣ Routine on submit / push builds to dev/stg ‣ Run daily / weekly deployments of Alerts (prometheus) | Dashboards (grafana) ‣ Avoid / rollback any manual changes of Alerts / Graphs etc ‣ Help make automation a common practice ‣ Scheduled task which syncs and re-configures the desired state from SCM
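The scheduled sync in the last bullet can be sketched as a comparison between the desired state in SCM and what is actually deployed. The `plan_sync` helper, the file names, and the content hashes below are hypothetical; a real job would read both sides from git and from the Prometheus / Grafana instances.

```python
from typing import Dict, List


def plan_sync(desired: Dict[str, str], actual: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare name -> content hash maps and list what the sync job must do."""
    return {
        "create": sorted(desired.keys() - actual.keys()),
        "update": sorted(
            k for k in desired.keys() & actual.keys() if desired[k] != actual[k]
        ),
        "delete": sorted(actual.keys() - desired.keys()),
    }


# Hypothetical state: one alert file edited by hand, one dashboard added manually.
desired_state = {"alerts/high_load.rules": "a1", "dashboards/node.json": "b2"}
actual_state = {"alerts/high_load.rules": "zz", "dashboards/manual.json": "c3"}

plan = plan_sync(desired_state, actual_state)
```

Applying the plan rolls back manual edits (`update`), removes anything created outside SCM (`delete`), and deploys what is missing (`create`).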
  93. 93. INTRODUCTION TO MODERN MONITORING MEASURE THE PIPELINE ‣ Pipeline steps are monitored ‣ Expose metrics such as: ‣ deployment time & status [ in env | stack etc ] ‣ count (# of alerts, new vs old last week, month etc) ‣ Metric counters [ application metrics ] … ‣ [ Jenkins exporter || push gateway TBD ]
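A minimal sketch of capturing deployment time and status per pipeline step, assuming the results are later exported by whatever mechanism is chosen (the slide leaves Jenkins exporter versus push gateway open). The `timed_step` context manager and the step names are hypothetical.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator


@contextmanager
def timed_step(name: str, store: Dict[str, dict]) -> Iterator[None]:
    """Record duration and success/failure of one pipeline step into `store`."""
    start = time.monotonic()
    status = "success"
    try:
        yield
    except Exception:
        status = "failure"
        raise  # the pipeline still sees the original error
    finally:
        store[name] = {
            "status": status,
            "duration_seconds": time.monotonic() - start,
        }


step_metrics: Dict[str, dict] = {}

with timed_step("deploy_staging", step_metrics):
    pass  # the real deployment call would go here

try:
    with timed_step("deploy_production", step_metrics):
        raise RuntimeError("simulated deployment failure")
except RuntimeError:
    pass
```

Each entry in `step_metrics` maps naturally onto a gauge for the duration plus a status label per environment or stack.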
  94. 94. FEEDBACK / QUESTIONS ? I’M HERE … HAGZAG@TIKALK.COM, 0545302525 Haggai Philip Zagury - Tikal Knowledge MONITORING HLD FullStack Developers Israel
