The five pillars of infrastructure monitoring are: 1) Know your infrastructure stack by keeping information up to date and automating its collection, 2) Know your monitoring tools and which tool monitors which aspect of the infrastructure, 3) Consolidate monitoring output into a single view for easier analysis and visualization, 4) Set up a proper support organization to handle alerts and detect impact, and 5) Make monitoring smart by following event chains and predicting events from historical patterns.
1. Know your infrastructure stack
• Make sure your infrastructure information is complete and current.
• Tools that contain outdated information do not get used.
• Aim for an automated and repeatable process of information collection.
aws ec2 describe-instances --instance-ids i-5203422c
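As a minimal sketch of what an automated, repeatable collection step could look like, the Python below flattens the JSON that `aws ec2 describe-instances` returns into one inventory record per instance. The function name `flatten_instances` and the sample values for instance type, state, and Name tag are assumptions made for illustration; only the instance ID comes from the command above.

```python
import json


def flatten_instances(describe_output):
    """Turn `aws ec2 describe-instances` JSON into one flat record per instance."""
    records = []
    for reservation in describe_output.get("Reservations", []):
        for inst in reservation.get("Instances", []):
            # Tags arrive as a list of {"Key": ..., "Value": ...} pairs.
            tags = {t["Key"]: t["Value"] for t in inst.get("Tags", [])}
            records.append({
                "id": inst["InstanceId"],
                "type": inst.get("InstanceType"),
                "state": inst.get("State", {}).get("Name"),
                "name": tags.get("Name"),
            })
    return records


# Hypothetical sample, trimmed to the fields used above.
sample = json.loads("""
{
  "Reservations": [
    {
      "Instances": [
        {
          "InstanceId": "i-5203422c",
          "InstanceType": "t2.micro",
          "State": {"Name": "running"},
          "Tags": [{"Key": "Name", "Value": "web-01"}]
        }
      ]
    }
  ]
}
""")

inventory = flatten_instances(sample)
print(inventory)
```

Running a step like this on a schedule, rather than by hand, is one way to keep the inventory current without relying on anyone remembering to update it.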
2. Know your monitoring tools
• In today's world, vendors often merge management tools with monitoring functions.
• You'll want to benefit from these very specific tools, so you'll end up using them (and sometimes you have to).
• Be clear within your organization about what you monitor and how.
• Check that the whole landscape is covered.
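The completeness check can be sketched as a set comparison between the inventory and what each monitoring tool reports. This is only an illustration: the function `coverage_gaps`, the tool names, and the host names are hypothetical.

```python
def coverage_gaps(inventory_ids, monitored_by_tool):
    """Compare the inventory against the hosts each monitoring tool covers."""
    monitored = set()
    for hosts in monitored_by_tool.values():
        monitored |= set(hosts)
    inventory = set(inventory_ids)
    return {
        # In the inventory, but no tool watches it.
        "unmonitored": sorted(inventory - monitored),
        # Watched by a tool, but missing from the inventory.
        "unknown": sorted(monitored - inventory),
    }


# Hypothetical landscape: one vendor tool and one agent-based tool.
gaps = coverage_gaps(
    ["i-5203422c", "db-01", "sw-core-1"],
    {"vendor_tool": ["sw-core-1"], "agent": ["i-5203422c", "old-host"]},
)
print(gaps)
```

The "unknown" list is as useful as the "unmonitored" one, since it points at inventory information that is no longer current.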
3. Consolidate your monitoring output
• You'll want a consolidated view of your monitoring output.
• Potential tools for that include the ELK stack, Splunk, or SAP ITOA.
• Work on sensible thresholds and alerts.
• A consolidated view simplifies tasks such as root cause analysis and pattern finding.
• Visualization helps! (Health status)
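To illustrate how consolidated output, thresholds, and a health status fit together, here is a small sketch that puts metrics from different tools into one record type and derives a per-host status. The `Metric` record, the threshold values, and the host and tool names are all assumptions for the example, not recommendations.

```python
from dataclasses import dataclass


@dataclass
class Metric:
    source: str   # which monitoring tool reported it
    host: str
    name: str
    value: float


# Assumed (warning, critical) thresholds per metric name.
THRESHOLDS = {"cpu_percent": (80.0, 95.0), "disk_percent": (85.0, 95.0)}
SEVERITY = {"ok": 0, "warning": 1, "critical": 2}


def classify(metric):
    """Map a metric value to ok / warning / critical."""
    warn, crit = THRESHOLDS.get(metric.name, (float("inf"), float("inf")))
    if metric.value >= crit:
        return "critical"
    if metric.value >= warn:
        return "warning"
    return "ok"


def health_by_host(metrics):
    """Keep the worst status seen for each host, regardless of source tool."""
    health = {}
    for m in metrics:
        worst = health.get(m.host, "ok")
        health[m.host] = max(worst, classify(m), key=SEVERITY.__getitem__)
    return health


metrics = [
    Metric("agent", "web-01", "cpu_percent", 91.0),
    Metric("vendor_tool", "web-01", "disk_percent", 40.0),
    Metric("agent", "db-01", "disk_percent", 97.5),
    Metric("agent", "app-01", "cpu_percent", 12.0),
]
health = health_by_host(metrics)
print(health)
```

A dictionary like `health` is the kind of input a health-status dashboard would render, typically as a colored tile per host.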
4. Setup of proper support organization
• Thousands of alerts …
• Not everything has to go to the service desk ("self-healing").
• The service desk needs to know what to do with monitoring output.
• Enable fast impact detection.
• How do you inform your customers?
• Regularly review recurring causes of issues.
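The routing idea above can be sketched as follows: alerts with a known automated remedy are tried once without human involvement, everything else reaches the service desk together with a runbook reference, and causes are counted so recurring ones stand out. The alert fields, cause names, remedy names, and runbook IDs are hypothetical.

```python
from collections import Counter

# Assumed mapping from alert cause to an automated remedy.
SELF_HEALING = {"disk_full": "rotate_logs", "service_down": "restart_service"}


def route(alert):
    """Try self-healing once; otherwise hand over to the service desk."""
    action = SELF_HEALING.get(alert["cause"])
    if action is not None and alert["attempts"] == 0:
        return ("self_heal", action)
    # The runbook tells the service desk what to do with this alert.
    return ("service_desk", alert["runbook"])


def repeating_causes(alerts, min_count=2):
    """List causes seen at least `min_count` times, most frequent first."""
    counts = Counter(a["cause"] for a in alerts)
    return [(cause, n) for cause, n in counts.most_common() if n >= min_count]


alerts = [
    {"cause": "disk_full", "attempts": 0, "runbook": "RB-12"},
    {"cause": "disk_full", "attempts": 1, "runbook": "RB-12"},
    {"cause": "link_flap", "attempts": 0, "runbook": "RB-40"},
]
routes = [route(a) for a in alerts]
recurring = repeating_causes(alerts)
print(routes)
print(recurring)
```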
5. Make it smart
• Follow event chains across the infrastructure layers (Markov chains may help).
• Predict events based on history.
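One simple way to read the Markov chain suggestion is a first-order model: count how often one event type follows another in past incident chains, turn the counts into probabilities, and use them to guess the likely next event. The sketch below does that. The event names and the history are invented for illustration, and a real setup would need far more data and some care about time windows.

```python
from collections import Counter, defaultdict


def transition_probabilities(chains):
    """Estimate P(next event | current event) from observed event chains."""
    counts = defaultdict(Counter)
    for chain in chains:
        # Pair each event with the one that followed it.
        for current, following in zip(chain, chain[1:]):
            counts[current][following] += 1
    return {
        event: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for event, c in counts.items()
    }


def predict_next(probs, event):
    """Return (most likely next event, probability), or None if unseen."""
    candidates = probs.get(event)
    if not candidates:
        return None
    return max(candidates.items(), key=lambda kv: kv[1])


# Hypothetical incident history, ordered from lower to higher layers.
history = [
    ["switch_port_down", "storage_latency", "db_timeout"],
    ["switch_port_down", "storage_latency", "app_errors"],
    ["switch_port_down", "vm_unreachable"],
    ["storage_latency", "db_timeout"],
]
probs = transition_probabilities(history)
prediction = predict_next(probs, "switch_port_down")
print(prediction)
```

With this history, a `switch_port_down` event was followed by `storage_latency` in two of three cases, so that is what the model predicts next.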
Something to take away
• Know your infrastructure stack
• Know your monitoring tools
• Consolidate your monitoring output
• Setup of proper support organization
• Make it smart