
Why @Loggly Loves Apache Kafka, and How We Use Its Unbreakable Messaging for Better Log Management

Published in: Technology

  1. Why Loggly Loves Apache Kafka, and How We Use Its Unbreakable Messaging for Better Log Management
     Infrastructure Engineering Team, June 2014 | Log management as a service | Simplify Log Management
  2. What Loggly Does
     • World’s most popular cloud-based log management service
       – More than 5,000 customers
       – Near real-time indexing of events
     • Distributed architecture, built on AWS
     • Initial production services in 2011
       – Loggly Generation 2 released in Sept 2013
  3. Loggly: Addressing the first big data problem every company faces
     • Centralized logging and archival
     • Real-time processing, analysis and visualization
     • Monitoring, alerting and troubleshooting
  4. Agenda for this Presentation
     • The challenges of Log Management at scale
     • Overview of Loggly’s processing pipeline
     • Alternative technologies considered
     • Why we love Apache Kafka
     • How Kafka has added flexibility to our pipeline
  5. The Challenges of Log Management at Scale
     • Big data
       – >750 billion events logged to date
       – Sustained bursts of 100,000+ events per second
       – Data space measured in petabytes
     • Need for high fault tolerance
     • Near real-time indexing requirements
     • Time-series index management
  6. Log Management Processing Pipeline: Overview
     (Pipeline diagram; labels: Load Balancing, Kafka Stage 2, Loggly Custom Module)
  7. Collectors Can Easily Outpace Downstream Processes
     • Written in C++
     • Designed to ingest massive data volumes
     • Need to collect regardless of what’s happening downstream
  8. Solution: Queue That’s External to Collector
     • Based on Apache Kafka
     • Highly performant and reliable
  9. Alternate/Supplementary Approaches Considered
     • Internal buffering in collectors
       – Added complexity
     • Cassandra
       – Not as good a queue as Kafka
     • Apache Storm
       – In initial Gen2 architecture, removed after launch
  10. The Secret to Log Management at Scale: Keep It Simple, Stupid
      Results:
      • Can process sustained rates of 100,000+ events per second per cluster
      • Average message size: 300 bytes
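To put the two figures on this slide together, here is a small back-of-the-envelope calculation. It only multiplies the numbers quoted above and assumes raw payload bytes, ignoring protocol, replication, and compression overhead, none of which the slides quantify. The function names are illustrative.

```python
# Figures from the slide: sustained events per second per cluster
# and average message size in bytes.
EVENTS_PER_SECOND = 100_000
AVG_MESSAGE_BYTES = 300

SECONDS_PER_DAY = 24 * 60 * 60


def bytes_per_second(events_per_second: int, avg_message_bytes: int) -> int:
    """Raw payload bytes per second (no protocol or replication overhead)."""
    return events_per_second * avg_message_bytes


def bytes_per_day(events_per_second: int, avg_message_bytes: int) -> int:
    """Raw payload bytes per day at a constant sustained rate."""
    return bytes_per_second(events_per_second, avg_message_bytes) * SECONDS_PER_DAY


per_second = bytes_per_second(EVENTS_PER_SECOND, AVG_MESSAGE_BYTES)
per_day = bytes_per_day(EVENTS_PER_SECOND, AVG_MESSAGE_BYTES)

print(f"{per_second / 1e6:.0f} MB/s")    # 30 MB/s
print(f"{per_day / 1e12:.2f} TB/day")    # 2.59 TB/day
```

At the quoted sustained rate, one cluster would move roughly 30 MB of raw log payload per second, or about 2.6 TB per day.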
  11. Why We Love Kafka
  12. What Attracted Us in the First Place
      No single point of failure
      • Terabytes of data move through our Kafka cluster every day without losing a single event
      • We use age-based retention to purge old data on disks
      Low latency
      • 99.99999% of the time our data is coming from disk cache and RAM; only very rarely do we hit disk
      Performance
      • Crazy good!
      • We currently have a bunch of Kafka brokers running on m2.xlarge instances backed by provisioned IOPS
      • One of our consumer groups (eight threads), which maps each log to a customer, can process about 200,000 events per second, draining from 192 partitions spread across three brokers
      Scalability
      • Ability to increase partition count per topic and downstream consumer threads provides flexibility to increase throughput when desired
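The performance figures on this slide can be broken down per thread. The sketch below spreads the 192 partitions over the eight consumer threads round-robin. That strategy is an assumption made for illustration; the slides do not say how partitions are actually assigned, nor whether they are spread evenly across the three brokers.

```python
from typing import Dict, List

# Figures quoted on the slide.
PARTITIONS = 192
CONSUMER_THREADS = 8
BROKERS = 3
GROUP_EVENTS_PER_SECOND = 200_000


def assign_partitions(partition_count: int, thread_count: int) -> Dict[int, List[int]]:
    """Spread partitions over threads round-robin.

    Illustrative only: this is not Kafka's own assignment logic.
    """
    assignment: Dict[int, List[int]] = {thread: [] for thread in range(thread_count)}
    for partition in range(partition_count):
        assignment[partition % thread_count].append(partition)
    return assignment


assignment = assign_partitions(PARTITIONS, CONSUMER_THREADS)
partitions_per_thread = len(assignment[0])
events_per_thread = GROUP_EVENTS_PER_SECOND // CONSUMER_THREADS
partitions_per_broker = PARTITIONS // BROKERS  # assumes an even spread across brokers

print(partitions_per_thread, events_per_thread, partitions_per_broker)  # 24 25000 64
```

Under these assumptions each thread drains 24 partitions at about 25,000 events per second. Because a Kafka partition is read by at most one consumer within a group, adding threads beyond the partition count leaves the extra threads idle (for example, `assign_partitions(4, 8)` gives threads 4 to 7 empty lists), which is why the slide pairs partition count with consumer threads as the scaling levers.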
  13. How Our Kafka Crush Has Deepened
      Distributed log collection
      • Local pods and collectors spread all over the Internet, with local Kafka deployments, collect data from customers located all over the world
      • Can collect logs even when we lose connectivity
      • When the network comes back, Kafka sends the logs downstream to the rest of the pipeline
      More efficient, effective DevOps
      • Deploying Kafka throughout the pipeline makes it easy to disable certain parts of the system (for troubleshooting or upgrades)
      • No worrying that we will lose customer data
      • Example: Add support for a new log type to our automatic parsing capabilities by turning off the existing parser, deploying the new one, and processing the logs that Kafka has queued up
      Controlling resource utilization
      • Keep collectors as simple as possible for resilience and reliability reasons
      • Add intelligence into our pipelines using Kafka
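The store-and-forward behavior described on this slide (keep collecting while the link is down, drain in order once it returns) can be sketched as follows. This is a simplified in-memory model, not Loggly's implementation and not Kafka's API: Kafka persists messages to disk, whereas this sketch uses a `deque`. The class names `StoreAndForward` and `FlakyLink` are hypothetical.

```python
from collections import deque
from typing import Callable, Deque, List


class StoreAndForward:
    """Buffer events locally and forward them in order when the link allows."""

    def __init__(self, send: Callable[[str], None]) -> None:
        self._send = send                      # raises ConnectionError when the link is down
        self._pending: Deque[str] = deque()

    def collect(self, event: str) -> None:
        """Always accept the event, then try to drain the backlog."""
        self._pending.append(event)
        self.flush()

    def flush(self) -> int:
        """Send pending events oldest-first; stop at the first failure."""
        sent = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                break                          # keep the event for the next attempt
            self._pending.popleft()
            sent += 1
        return sent

    @property
    def pending(self) -> int:
        return len(self._pending)


class FlakyLink:
    """Hypothetical downstream link that can be switched off for the demo."""

    def __init__(self) -> None:
        self.up = False
        self.delivered: List[str] = []

    def send(self, event: str) -> None:
        if not self.up:
            raise ConnectionError("link is down")
        self.delivered.append(event)


link = FlakyLink()
buffer = StoreAndForward(link.send)
for line in ["a", "b", "c"]:
    buffer.collect(line)       # link is down: events accumulate locally
link.up = True
flushed = buffer.flush()       # link is back: backlog drains in order
```

The same pattern covers the parser-upgrade example: the consumer is simply absent for a while, and the queued logs are processed once the new parser is deployed.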
  14. Resource Utilization Example: “Noisy Neighbors”
  15. “Noisy Neighbors” are Inherent to SaaS
      • Some customers send many times their “normal” level of logging volume, inadvertently or because their application is in big trouble
      • Routing their logs to a separate queue minimizes the impact on other customers
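One way to picture the routing rule on this slide is a per-customer counter compared against a baseline. The sketch below is an assumption-laden illustration: the topic names, the multiplier, the per-window counting, and the `NoisyNeighborRouter` class are all hypothetical, since the slides do not describe how Loggly detects or reroutes noisy customers.

```python
from collections import Counter
from typing import Dict

NORMAL_TOPIC = "events"        # hypothetical topic name
NOISY_TOPIC = "events-noisy"   # hypothetical topic name


class NoisyNeighborRouter:
    """Pick a topic per event based on the customer's volume in the current window."""

    def __init__(self, baselines: Dict[str, int], factor: int = 10,
                 default_baseline: int = 1000) -> None:
        self._baselines = baselines            # "normal" events per window, per customer
        self._factor = factor                  # how many times normal counts as noisy
        self._default_baseline = default_baseline
        self._window_counts: Counter = Counter()

    def route(self, customer_id: str) -> str:
        """Count the event and return the topic it should be written to."""
        self._window_counts[customer_id] += 1
        baseline = self._baselines.get(customer_id, self._default_baseline)
        limit = baseline * self._factor
        if self._window_counts[customer_id] > limit:
            return NOISY_TOPIC
        return NORMAL_TOPIC

    def reset_window(self) -> None:
        """Start a new measurement window (for example, once per minute)."""
        self._window_counts.clear()


router = NoisyNeighborRouter({"acme": 2}, factor=3)   # limit: 6 events per window
topics = [router.route("acme") for _ in range(8)]
```

In this demo the first six events from the hypothetical customer `acme` go to the normal topic and the seventh and eighth go to the separate one, while other customers are unaffected.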
  16. Kafka Queues Add Flexibility to Loggly Pipeline
      • Because Kafka topics are very cheap from a performance and overhead standpoint, we can create as many queues as we want
      • Scaled to the performance we want
      • Optimizing resource utilization across the system
      • Because they can be created dynamically, we can make business rules very flexible
      • Makes us confident that pipeline will scale as customer data volumes do
  17. Conclusion: Kafka Frees Our Development Team to Build Differentiating Features
      • Kafka deployment working without us thinking about it
      • Plenty of other things to do to keep our position as the world’s most popular cloud-based log management service!
  18. Does Log Management Sound Hard? It Should!
      Let us do the heavy lifting for you! Try Loggly FREE for 30 days.
      About Us: Loggly is the world’s most popular cloud-based log management solution, used by more than 5,000 happy customers to effortlessly spot problems in real-time, easily pinpoint root causes and resolve issues faster to ensure application success. Visit us at loggly.com or follow @loggly on Twitter.
  19. Did you like this presentation? Head over to our blog for more great content! Take me to the Loggly Blog
