Night Owl by Boyd Meier of PROS
- 1. Night Owl
Log Monitoring using Elasticsearch and Hadoop
Boyd Meier (bmeier@pros.com)
Hadoop Meetup – October 16, 2013
© COPYRIGHT PROS, INC. 2013 | CONFIDENTIAL AND PROPRIETARY
- 3. Application Performance Monitoring
● Many servers
● Many applications
● Many log formats
● Many places to go look for information
● What if we could just look in one place and see everything?
- 4. Advanced Analysis
● The logs are too low-level
● The servers need the existing capacity
● The amount of data to be analyzed is huge
● Some analysis needs to be across multiple servers
● What if we want to change the analysis algorithms?
● How can we do analysis in the most flexible way possible?
- 5. Proactive Support
● See problems coming before they become crises
● Watch for errors and exceptions
● Track performance of the application
● Track usage of the application
● Enable checks we haven’t thought of yet
- 6. Some Analysis Questions
● What errors happen, and how often?
● Who did what, when?
● How long did it take to do a task?
● What else was happening on the server?
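The questions above can be sketched against a handful of parsed log events. This is only a minimal illustration in Python, not the Pig scripts used in the talk; the event fields (`server`, `user`, `action`, `level`, `error`, `start`, `end`) and the sample data are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Hypothetical parsed log events; all field names and values are assumptions.
EVENTS = [
    {"server": "app1", "user": "alice", "action": "login", "level": "INFO",
     "start": "2013-09-01T08:00:00", "end": "2013-09-01T08:00:02"},
    {"server": "app1", "user": "bob", "action": "report", "level": "ERROR",
     "error": "TimeoutException",
     "start": "2013-09-01T08:01:00", "end": "2013-09-01T08:01:30"},
    {"server": "app2", "user": "alice", "action": "report", "level": "ERROR",
     "error": "TimeoutException",
     "start": "2013-09-01T08:02:00", "end": "2013-09-01T08:02:45"},
]

TIME_FORMAT = "%Y-%m-%dT%H:%M:%S"


def error_counts(events):
    """What errors happen, and how often?"""
    return Counter(e["error"] for e in events if e["level"] == "ERROR")


def audit_trail(events):
    """Who did what, when? Returns (start, user, action) tuples sorted by time."""
    return sorted((e["start"], e["user"], e["action"]) for e in events)


def task_seconds(event):
    """How long did it take to do a task?"""
    start = datetime.strptime(event["start"], TIME_FORMAT)
    end = datetime.strptime(event["end"], TIME_FORMAT)
    return (end - start).total_seconds()


def concurrent_events(events, target):
    """What else was happening on the same server while `target` ran?"""
    return [
        e for e in events
        if e is not target
        and e["server"] == target["server"]
        and e["start"] <= target["end"]
        and e["end"] >= target["start"]
    ]
```

The timestamps are ISO 8601 strings in a single format, so comparing them as strings orders them correctly; mixed formats would need parsing first.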
- 7. Constraints
● Very little budget – as much free stuff as possible
● Can’t use client machines
● Communications need to be secure
● Large amounts of data (GB/day/client)
● Minimize support’s dependence on client IT
- 9. Hadoop
● We have a lot of data (~2 GB/day with 3 clients)
● We need to process it in reasonable time
● We can’t afford a big machine for this
● We have lots of old machines lying around
● Sounds like a job for the elephant! But what about querying?
- 10. Elasticsearch
● Query performance on base Hadoop is painful
● Ad-hoc queries are required
● Hadoop integration
● Cluster deployment
● Looks promising! How do we get the data into the server?
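To make "ad-hoc queries" concrete, here is a hedged sketch of a search request body built in Python. It follows the Elasticsearch query DSL (`bool`, `query_string`, `range`) as generally documented; exact syntax can differ between versions, so treat it as illustrative rather than as what the talk ran. `@timestamp` is the timestamp field Logstash writes; the `level` field and the helper name `error_query` are assumptions.

```python
import json


def error_query(start, end, text="level:ERROR", size=50):
    """Build a search body: a free-text match restricted to a time window."""
    return {
        "size": size,
        # Newest events first.
        "sort": [{"@timestamp": {"order": "desc"}}],
        "query": {
            "bool": {
                "must": [
                    {"query_string": {"query": text}},
                    {"range": {"@timestamp": {"gte": start, "lte": end}}},
                ]
            }
        },
    }


if __name__ == "__main__":
    body = error_query("2013-09-01T00:00:00", "2013-09-30T23:59:59")
    # The JSON printed here would be sent as the body of a search request.
    print(json.dumps(body, indent=2))
```

Changing the `text` argument is all it takes to ask a different question, which is the kind of flexibility a batch Hadoop job does not offer.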
- 11. Logstash
● Handle many sources, not just logs
● Fan-in architecture to server
● Compressed, SSL encrypted data
● Can offload some logic on the client if desired
● Massively configurable
● Output to Elasticsearch
● Great! Now how about visualization?
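The idea of turning many log formats into one stream of structured events can be sketched in Python. This is not Logstash configuration (Logstash does this kind of work with its own filters, such as grok); the two line formats, the field names, and the `parse_line` helper are all hypothetical.

```python
import re

# Two hypothetical line formats; a real deployment would define its own.
PATTERNS = [
    # e.g. "2013-09-01 08:01:30,123 ERROR [Pricing] Timeout calling service"
    re.compile(
        r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} "
        r"(?P<level>[A-Z]+) \[(?P<component>[^\]]+)\] (?P<message>.*)"
    ),
    # e.g. "ERROR|2013-09-01T08:01:30|Pricing|Timeout calling service"
    re.compile(
        r"(?P<level>[A-Z]+)\|(?P<timestamp>[^|]+)\|"
        r"(?P<component>[^|]+)\|(?P<message>.*)"
    ),
]


def parse_line(line, source):
    """Return a structured event for `line`, trying each known format in turn."""
    for pattern in PATTERNS:
        match = pattern.match(line)
        if match:
            event = match.groupdict()
            event["source"] = source
            return event
    # Keep unrecognized lines instead of dropping them, but mark them.
    return {"source": source, "message": line, "tags": ["unparsed"]}
```

Keeping unparsed lines with a tag means a new or changed log format shows up as searchable data rather than as silent loss.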
- 12. Kibana
● Backed by Elasticsearch
● Supports dynamic queries
● View information over time
● Built-in support for Logstash
● Configurable, shareable dashboards
- 14. Hadoop Processing
● Pig scripts process the data
● Wonderdog from InfoChimps to integrate Pig and Elasticsearch
– There are issues:
• Cluster stability using Wonderdog
• Wonderdog Pig interface has not been updated in a while
• Currently evaluating elasticsearch-hadoop project from Elasticsearch.org
● Analysis results are stored in Elasticsearch for ease of access
- 18. Software
● Ubuntu 12.04.2 LTS (Precise)
● Cloudera CDH 4.3.1
– Hadoop 2.0.0
– HBase 0.94
– Hive 0.10
– Pig 0.11
● Elasticsearch 0.90.3
● Logstash 1.1.12
● Kibana 3 M3
- 19. Hardware Architecture
● 27-node cluster of commodity machines
● 42 TB of disk space
● Connected via 10 gigabit switch
● Each machine has:
– 8 GB RAM
– 2 TB SATA HDD
– Gigabit Ethernet
- 20. Performance
● Over the month of September:
– 188 million events ingested from 3 clients
– 57.5 GB storage used (1.92 GB / day)
● At that rate, 42 TB is enough space for:
– 142 billion events
– 60 years of data from these clients
– 1 year of data from 180 clients at the same volume per client
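The capacity estimate above can be reproduced with a few lines of arithmetic. The sketch below assumes a 30-day September, 365-day years, and decimal units (1 TB = 1000 GB); the slide does not state its unit convention or rounding.

```python
# Figures taken from the slide.
EVENTS_SEPTEMBER = 188_000_000
STORAGE_GB = 57.5
CLIENTS = 3
CAPACITY_TB = 42

# Assumptions not stated on the slide.
DAYS_IN_SEPTEMBER = 30
DAYS_PER_YEAR = 365
GB_PER_TB = 1000

gb_per_day = STORAGE_GB / DAYS_IN_SEPTEMBER      # about 1.92 GB/day
capacity_gb = CAPACITY_TB * GB_PER_TB
years = capacity_gb / gb_per_day / DAYS_PER_YEAR  # about 60 years for 3 clients
client_years = years * CLIENTS                    # about 180 clients for 1 year
events_capacity = EVENTS_SEPTEMBER / STORAGE_GB * capacity_gb

if __name__ == "__main__":
    print(f"{gb_per_day:.2f} GB/day")
    print(f"{years:.1f} years for {CLIENTS} clients")
    print(f"{client_years:.0f} client-years")
    print(f"{events_capacity / 1e9:.0f} billion events")
```

Under these assumptions the daily rate, the 60 years, and the 180 clients match the slide. The event count lands at roughly 137 billion (about 141 billion with 1 TB = 1024 GB), slightly below the slide's 142 billion, so the slide presumably used a somewhat different rounding or unit convention.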
- 21. Resources
● Elasticsearch - http://www.elasticsearch.org/overview/
• http://github.com/elasticsearch/elasticsearch
● Logstash - http://www.elasticsearch.org/overview/logstash/
• https://github.com/logstash/logstash
● Kibana - http://www.elasticsearch.org/overview/kibana/
• https://github.com/elasticsearch/kibana
● ES-Hadoop - http://www.elasticsearch.org/overview/hadoop/
• http://github.com/elasticsearch/elasticsearch-hadoop
● Cloudera - http://www.cloudera.com/
- 22. World Headquarters
3100 Main Street, Suite #900
Houston, TX 77002
Phone: +1 713-335-5151
Sales: +1 855-846-0641
Fax: +1 713-335-8144
PROS Germany GmbH
Feringastrasse 6
85774 Unterfoehring
Munich
Tel.: +49 89 99216 270
Fax: +49 89 99216 200
European Headquarters - United Kingdom
Lakeside House
1 Furzeground Way
Stockley Park
Heathrow
UB11 1BD
Phone: +44 (0) 208 622 3555
Fax: +44 208 622 3230
Regional Office - Austin, TX
3600 Parmer Lane, Suite 205
Austin, Texas 78727
Regional Office - Cary, North Carolina
1000 Centre Green Way, #200
Cary, NC 27513
Phone: +1 919-228-6334