This document provides details on Day 2 operations for an OpenStack Meetup on September 27, 2017 in Montreal. It includes an agenda for the day which involves setting up and configuring monitoring tools like Nagios, Elastic Stack, Filebeat, Metricbeat and Monasca Agent to monitor an OpenStack cloud deployment. It also discusses using Dynatrace to monitor the OpenStack deployment, correlate metrics and events, and provide real user monitoring.
2. confidential
Dirk Wallerstorfer
Cloud Technology Strategist @ Dynatrace
Tech enthusiast
Husband
Father
Son
Austrian (no kangaroos)
Never seen “The Sound of Music”
Yes, I own lederhosen
No, I don’t know how to yodel
@wall_dirk
dirk.wallerstorfer@dynatrace.com
Nagios
Monitoring IT infrastructure – and more ...
Monitoring static entities
Alternative: Nagios XI – Enterprise
Nagios Log Server
Great talk from Nagios World 2014: Monitoring OpenStack
https://www.youtube.com/watch?v=1U5fo6aPS-k
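Nagios checks are small programs that report a status through their exit code (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN) and a one-line message. The following is a minimal sketch of that convention, not a plugin from the talk; the function names, the service name and the thresholds are made up for illustration.

```python
# Nagios plugin convention: exit code 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def check_threshold(value, warn, crit):
    """Map a measured value onto a status code (hypothetical helper; higher is worse)."""
    if value is None:
        return UNKNOWN  # measurement failed, so the state is unknown
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def format_result(service, value, warn, crit):
    """Return (exit_code, status_line) using the 'text | perfdata' layout."""
    code = check_threshold(value, warn, crit)
    shown = "n/a" if value is None else value
    line = f"{service} {LABELS[code]} - value={shown}"
    if value is not None:
        # Performance data: label=value;warn;crit
        line += f" | {service.lower()}={value};{warn};{crit}"
    return code, line
```

A real plugin would print the status line and call `sys.exit(code)`; the same kind of script can also be run by the Monasca Agent through its Nagios plugin support.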
Monasca Agent
The Monasca Agent supports collecting metrics from a variety of sources as follows:
System metrics
Nagios plugins
Statsd
Host alive (icmp/ssh)
Process checks (# instances, memory, io, threads)
HTTP endpoint checks
Service checks (MySQL, RabbitMQ)
OpenStack process metrics
The Agent is extensible through configuration of additional plugins, written in Python.
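One of the sources listed above is Statsd, which lets an application push its own metrics to a local agent over UDP. As a rough sketch, assuming a StatsD-compatible listener on 127.0.0.1:8125 (host, port, metric name and function names are assumptions, not taken from the talk), a gauge can be sent like this:

```python
import socket


def statsd_gauge(name, value):
    """Encode a gauge in the plain StatsD line format: <name>:<value>|g"""
    return f"{name}:{value}|g".encode("ascii")


def send_metric(payload, host="127.0.0.1", port=8125):
    """Fire-and-forget UDP send; host and port are assumptions for this sketch."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


# Hypothetical metric name for an application running on top of OpenStack.
packet = statsd_gauge("payment.queue_length", 7)
```

Calling `send_metric(packet)` would hand the value to whatever is listening on that port. Because UDP is connectionless, the application is not blocked if no agent is running.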
Resource capacity and utilization
OpenStack service availability/performance
Supporting services
Log analytics
Applications running on top
Dependencies
Correlation of metrics/events/data
Real user monitoring, UX affects $
PaaS
OPTIMIZING
MAINTENANCE
AVAILABILITY
BILLING
ENSURE IT MATCHES EXPECTATIONS
monitoring IT infrastructure and more
UI
Beats are lightweight data shippers that you install as agents on your servers
Beats have a small footprint and use fewer system resources than Logstash.
Logstash provides a broad array of input, filter, and output plugins for collecting, enriching, and transforming data from a variety of sources.
X-Pack: security, alerting (watcher), monitoring, reporting, graph, machine learning
Cloud: hosted on AWS/GCP – scaling is easy, activate additional features on demand
modules: apache2, nginx, mysql, syslog, ...
configuration @ controller
processes @ elastic
Agents send data to APIs
Read data through CLI or Grafana
agent configuration in container for system metrics
integration in Horizon, dashboards, and ‘graph metrics’
OICR
make sense out of data
correlation doesn’t imply causation
It doesn’t make sense to artificially look for a causal relationship!
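A small illustration of why correlated metrics need not cause each other: in the sketch below, two made-up metrics (CPU usage and log volume) are both derived from a third one (request load). All numbers and names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical samples: request load drives both of the other metrics.
load = [100, 200, 300, 400, 500]            # requests per minute
cpu_percent = [r * 0.1 + 5 for r in load]   # CPU rises with load
log_lines = [r * 12 for r in load]          # log volume rises with load

r = pearson(cpu_percent, log_lines)
```

Here `r` comes out very close to 1, yet writing logs is not what drives CPU usage in this model, nor the other way round. Both simply follow the load.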
So, how can you monitor this application?
Update of the Payment Service by one of the rookie developers who are convinced that you have to write everything on your own and reinvent the wheel on a daily basis.
LET’S ASSUME YOU ARE MONITORING EVERYTHING ... in the most professional way, so you immediately notice any error or change in performance
This is one of the things that can go wrong in OpenStack ... Now, I don’t want to talk you out of doing OpenStack; on the contrary, I want to encourage you to think about maintenance and operations from the beginning, and don’t just go with
Putting it all together now.
With large environments, manual introspection and correlation and log browsing won’t cut it anymore ... people don’t scale as well ...
What do you need OpenStack for if you have 5 VMs with 8 services and 2 applications?
Imagine you need to configure Nagios or Elastic in this environment ... you need monitoring of your own