Building a Replicated Logging System with Apache Kafka
Guozhang Wang, Joel Koshy, Sriram Subramanian, Kartik Paramasivam
Mammad Zadeh, Neha Narkhede, Jun Rao, Jay Kreps, Joe Stein
We All Love Logs!
Apache Kafka
• A distributed messaging system
...that stores messages as a log!
Example: LinkedIn back in 2010
Point-to-Point Pipelines
What We Want:
A Centralized Data Pipeline
Log-centric Data Flow
• Logical Ordering
• Persistent Buffering
• “Source-of-Truth”
Store Messages as a Log
Messages: 3 4 5 6 7 8 9 10 11 12 ...
Producer: Write
Consumer1: Reads (offset 7)
Consumer2: Reads (offset 10)
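A minimal sketch of the idea on this slide, assuming an in-memory list stands in for one log. The names `Log`, `append`, and `read` are hypothetical and are not Kafka's API; the point is only that the producer appends at the end while each consumer reads from its own offset.

```python
class Log:
    """Toy append-only log: an illustration, not Kafka's storage layer."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        """Producer write: add to the end and return the assigned offset."""
        self._messages.append(message)
        return len(self._messages) - 1

    def read(self, offset, max_messages=1):
        """Consumer read: return messages starting at the given offset."""
        return self._messages[offset:offset + max_messages]


log = Log()
for i in range(13):
    log.append(f"m{i}")          # offsets 0..12

# Two consumers read independently; neither read removes anything from the log.
consumer1 = log.read(7)
consumer2 = log.read(10, 2)
```

Because reads never modify the log, a slow consumer and a fast consumer can share the same data without coordinating with each other.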
Partition the Log across Machines
[Diagram: Producers, Brokers, Consumers; the Partitions of Topic 1 and Topic 2 are spread across the Brokers]
Apache Kafka
Example: Kafka at LinkedIn
“Source-of-Truth” should not be lost even when...
Replicas and Layout
Broker-1 Logs: topic1-part1, topic1-part3, topic1-part2
Broker-2 Logs: topic1-part2, topic1-part1, topic1-part3
Broker-3 Logs: topic1-part3, topic1-part2, topic1-part1
Consensus for Log Replication
[Diagram: a Write to the Logs on Broker-1, Broker-2, and Broker-3, with a Consensus Protocol between the brokers]
Key Idea
Separate membership configuration from data replication
Primary-backup Replication
[Diagram: a Write to the Logs on Broker-1 (*, the leader), with Logs on Broker-2 and Broker-3]
Conventional Quorum Commits
[Diagram: a Write to the Logs on Broker-1 (*, the leader), with Logs on Broker-2 and Broker-3]
Configurable ISR Commits
• Leader maintains in-sync-replicas (ISR)
• Failed / slow follower => drop from ISR
• Caught-up follower => re-join ISR
• Producer specifies required ACK based on ISR
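The ISR rules above can be sketched as a small simulation. This is a toy model under stated assumptions: lag is counted in messages, `max_lag` is a hypothetical threshold, and the `Leader` class is not Kafka code (real brokers apply their own lag criteria).

```python
class Leader:
    """Toy leader that tracks which replicas are in sync (illustrative only)."""

    def __init__(self, leader_id, follower_ids, max_lag=2):
        self.leader_id = leader_id
        self.log_end = 0                              # messages written by the leader
        self.fetched = {f: 0 for f in follower_ids}   # messages each follower has copied
        self.max_lag = max_lag                        # hypothetical lag threshold
        self.isr = {leader_id, *follower_ids}

    def write(self, n=1):
        self.log_end += n
        self._update_isr()

    def follower_fetch(self, follower_id, upto):
        self.fetched[follower_id] = min(upto, self.log_end)
        self._update_isr()

    def _update_isr(self):
        for f, pos in self.fetched.items():
            if self.log_end - pos > self.max_lag:
                self.isr.discard(f)       # failed / slow follower => drop from ISR
            elif pos == self.log_end:
                self.isr.add(f)           # caught-up follower => re-join ISR

    def committed(self):
        """Highest position copied by every follower in the current ISR."""
        in_sync = [self.fetched[f] for f in self.isr if f != self.leader_id]
        return min([self.log_end] + in_sync)


leader = Leader(1, [2, 3])
for end in (1, 2, 3):
    leader.write()
    leader.follower_fetch(2, end)     # broker 2 keeps up; broker 3 never fetches
isr_with_slow_follower = set(leader.isr)
committed_without_broker_3 = leader.committed()

leader.follower_fetch(3, 3)           # broker 3 finally catches up
isr_after_catch_up = set(leader.isr)
```

In this model the commit point advances as soon as the slow follower leaves the ISR, which is the trade-off the ISR design makes: availability of writes in exchange for a smaller replica set.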
Example: ACK with all ISRs
[Diagram: Write (ack=“all”) to the Logs on Broker-1 (*, the leader), with Logs on Broker-2 and Broker-3; ISR {1, 2, 3}]
Example: ACK with Leader-only
[Diagram: Write (ack=“leader”) to the Logs on Broker-1 (*, the leader), with Logs on Broker-2 and Broker-3; ISR {1, 2, 3}]
Example: Slow Follower
[Diagram: Write (ack=“all”) to the Logs on Broker-1 (*, the leader), with Logs on Broker-2 and Broker-3; the ISR shrinks from {1, 2, 3} to {1, 2}]
Configurable ISR Commits
ACK mode  | Latency               | On Failures
“no”      | no network delay      | some data loss
“leader”  | 1 network roundtrip   | a little data loss
“all”     | ~2 network roundtrips | no data loss
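The three ACK modes can be read as three different conditions for acknowledging a write. The sketch below is a toy model: `can_acknowledge` and its parameters are hypothetical names, not Kafka's API. The `ACKS_SETTING` mapping reflects the values that Kafka's producer `acks` setting accepts.

```python
# The slide's ACK modes mapped to values of Kafka's producer `acks` setting.
ACKS_SETTING = {"no": "0", "leader": "1", "all": "all"}


def can_acknowledge(mode, leader_has_written, follower_copies, isr, leader_id):
    """Return True once the producer's write may be acknowledged (toy model)."""
    if mode == "no":
        return True                      # producer does not wait at all
    if mode == "leader":
        return leader_has_written        # one round trip to the leader
    if mode == "all":
        followers_in_isr = isr - {leader_id}
        # Every follower currently in the ISR must hold a copy.
        return leader_has_written and followers_in_isr <= follower_copies
    raise ValueError(f"unknown ACK mode: {mode}")


isr = {1, 2, 3}
# The leader (broker 1) has the message, but only broker 2 has copied it so far.
leader_only = can_acknowledge("leader", True, {2}, isr, leader_id=1)
all_pending = can_acknowledge("all", True, {2}, isr, leader_id=1)
# If slow broker 3 is dropped from the ISR, the same write can be acknowledged.
all_after_shrink = can_acknowledge("all", True, {2}, {1, 2}, leader_id=1)
```

The extra round trip in the “all” row of the table corresponds to the wait for followers in the last branch.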
Membership Management
• Use an embedded controller
• Detect broker failure via ZooKeeper
• Leader failure => elect new leader from ISR
• Leader and ISR persisted in ZooKeeper, for controller fail-over
Example: Broker Failure
[Diagram, in sequence: Broker-1 (*) is the leader with ISR {1, 2}; Broker-1 fails and the ISR shrinks to {2}; Broker-2 (*) becomes the new leader; the ISR grows to {2, 3}]
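The fail-over sequence in this example can be sketched as one controller step. This is a simplified model with hypothetical names (`elect_leader`, the `state` dictionary); it is not the controller's actual code, and the dictionary only stands in for the leader and ISR that the slides say are persisted in ZooKeeper.

```python
def elect_leader(current_leader, isr, live_brokers):
    """Toy controller step: shrink the ISR to live brokers, pick a leader from it."""
    new_isr = [b for b in isr if b in live_brokers]
    if current_leader in new_isr:
        return current_leader, new_isr     # leader is still alive; nothing to elect
    if not new_isr:
        return None, new_isr               # no in-sync replica left in this model
    return new_isr[0], new_isr             # any ISR member has all committed data


# Persisted state before the failure, as in the first frame of the example.
state = {"leader": 1, "isr": [1, 2]}

# Broker 1 fails; brokers 2 and 3 are still alive.
leader, isr = elect_leader(state["leader"], state["isr"], live_brokers={2, 3})
state = {"leader": leader, "isr": isr}
```

Broker 3 is alive but not eligible here because it was not in the ISR when the leader failed; it re-joins only after catching up with the new leader.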
Agenda
• Overview: Logs and Kafka
• Log Replication in Kafka
• Kafka Usage at LinkedIn
• Conclusion
Change Log Replication
Apache Kafka
Example: Kafka at LinkedIn
Example: Espresso
• A distributed document store
• Primary online data serving platform at LinkedIn
• Member profile, homepage, InMail, etc.
[SIGMOD 2013]
Old Espresso Replication
[Diagram: Data Center-1 with two Storage Nodes, each on MySQL, linked by MySQL Replication; Databus feeding Search Index, Hadoop, …; a Cross-DC Replicator]
Problems with MySQL Replication
[Diagram: a Master Storage Node holding partitions P1 to P6 and a Slave Storage Node holding P1 to P6, connected by Binary Log Shipping]
Replicate Logs with Kafka
[Diagram: two Storage Nodes, each holding partitions P1 to P6 and each with a Kafka Producer and a Kafka Consumer, connected through Kafka Logs]
Key-based Log Compaction
Partition Messages:
Segment-3: a: 1, b: 2, d: 4, c: 3, a: 5, a: 6
Segment-4: d: 3, f: 8, b: 0, c: null, a: 5, f: 9
New Segment: d: 3, b: 0, a: 5, f: 9
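A minimal sketch of key-based compaction that reproduces the new segment shown on these slides. The function name `compact` is hypothetical, and the order of messages inside Segment-3 is an assumption made while reading the extracted slide text. The sketch drops `null` values (delete markers) immediately, whereas Kafka itself retains them for a configurable period before removing them.

```python
def compact(messages):
    """Keep only the latest value per key, in log order; drop null delete markers."""
    last_index = {}
    for i, (key, _value) in enumerate(messages):
        last_index[key] = i               # remember the final position of each key
    return [
        (key, value)
        for i, (key, value) in enumerate(messages)
        if last_index[key] == i and value is not None
    ]


# Segment contents from the slides (order within segment_3 is assumed).
segment_3 = [("a", 1), ("b", 2), ("d", 4), ("c", 3), ("a", 5), ("a", 6)]
segment_4 = [("d", 3), ("f", 8), ("b", 0), ("c", None), ("a", 5), ("f", 9)]

new_segment = compact(segment_3 + segment_4)
```

Every key in Segment-3 is overwritten later in Segment-4, so nothing from Segment-3 survives, and key `c` disappears entirely because its last value is `null`.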
New Espresso Replication
[Diagram: Data Center-1 through Data Center-n, each with three Storage Nodes on MySQL connected through Kafka Logs and feeding Search Index, Hadoop, …; Kafka MirrorMaker links the data centers]
* In Progress
Stream Processing
Apache Kafka
Example: Kafka at LinkedIn
Example: Samza [CIDR 2015]
• Data flow streaming on Kafka and YARN
• Stateful processing
• Re-processing
• Failure Recovery
Samza Processing
[Diagram: Samza tasks, each with State, Process, and Protocol boxes, between an input Kafka and an output Kafka; later frames add a Kafka Changelog]
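A minimal sketch of how a changelog supports stateful processing and failure recovery. It assumes a plain Python list stands in for a Kafka changelog topic; `CountingTask` and its methods are hypothetical names and are not Samza's API.

```python
class CountingTask:
    """Toy stateful task: counts words and logs every state change (illustrative only)."""

    def __init__(self, changelog):
        self.state = {}
        self.changelog = changelog        # stands in for a Kafka changelog topic

    def process(self, word):
        self.state[word] = self.state.get(word, 0) + 1
        # Record the new value for the key so the state can be rebuilt later.
        self.changelog.append((word, self.state[word]))

    @classmethod
    def restore(cls, changelog):
        """Failure recovery: rebuild state by replaying the changelog in order."""
        task = cls(changelog)
        for key, value in changelog:
            task.state[key] = value       # later entries overwrite earlier ones
        return task


changelog = []
task = CountingTask(changelog)
for word in ["kafka", "samza", "kafka"]:
    task.process(word)

# Simulate losing the task and starting a replacement from the changelog alone.
recovered = CountingTask.restore(changelog)
```

Since only the latest value per key matters during replay, a changelog like this is a natural fit for the key-based log compaction described earlier in the deck.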
Take-aways
• Log-centric data flow helps scale your systems
• Kafka: replicated log streams for real-time platforms
We are Hiring
THANKS!

Mais conteúdo relacionado

Mais procurados

Apache Hudi: The Path Forward
Apache Hudi: The Path ForwardApache Hudi: The Path Forward
Apache Hudi: The Path ForwardAlluxio, Inc.
 
RocksDB Performance and Reliability Practices
RocksDB Performance and Reliability PracticesRocksDB Performance and Reliability Practices
RocksDB Performance and Reliability PracticesYoshinori Matsunobu
 
ClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and howClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and howAltinity Ltd
 
Data Analytics and Processing at Snap - Druid Meetup LA - September 2018
Data Analytics and Processing at Snap - Druid Meetup LA - September 2018Data Analytics and Processing at Snap - Druid Meetup LA - September 2018
Data Analytics and Processing at Snap - Druid Meetup LA - September 2018Charles Allen
 
Deep dive into stateful stream processing in structured streaming by Tathaga...
Deep dive into stateful stream processing in structured streaming  by Tathaga...Deep dive into stateful stream processing in structured streaming  by Tathaga...
Deep dive into stateful stream processing in structured streaming by Tathaga...Databricks
 
Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...
Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...
Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...HostedbyConfluent
 
Simplifying Distributed Transactions with Sagas in Kafka (Stephen Zoio, Simpl...
Simplifying Distributed Transactions with Sagas in Kafka (Stephen Zoio, Simpl...Simplifying Distributed Transactions with Sagas in Kafka (Stephen Zoio, Simpl...
Simplifying Distributed Transactions with Sagas in Kafka (Stephen Zoio, Simpl...confluent
 
Demystifying flink memory allocation and tuning - Roshan Naik, Uber
Demystifying flink memory allocation and tuning - Roshan Naik, UberDemystifying flink memory allocation and tuning - Roshan Naik, Uber
Demystifying flink memory allocation and tuning - Roshan Naik, UberFlink Forward
 
Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Flink Forward
 
Reshape Data Lake (as of 2020.07)
Reshape Data Lake (as of 2020.07)Reshape Data Lake (as of 2020.07)
Reshape Data Lake (as of 2020.07)Eric Sun
 
Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaCloudera, Inc.
 
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfDeep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfAltinity Ltd
 
Clickhouse at Cloudflare. By Marek Vavrusa
Clickhouse at Cloudflare. By Marek VavrusaClickhouse at Cloudflare. By Marek Vavrusa
Clickhouse at Cloudflare. By Marek VavrusaAltinity Ltd
 
How YugaByte DB Implements Distributed PostgreSQL
How YugaByte DB Implements Distributed PostgreSQLHow YugaByte DB Implements Distributed PostgreSQL
How YugaByte DB Implements Distributed PostgreSQLYugabyte
 
High Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando PatroniHigh Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando PatroniZalando Technology
 
Introducing KRaft: Kafka Without Zookeeper With Colin McCabe | Current 2022
Introducing KRaft: Kafka Without Zookeeper With Colin McCabe | Current 2022Introducing KRaft: Kafka Without Zookeeper With Colin McCabe | Current 2022
Introducing KRaft: Kafka Without Zookeeper With Colin McCabe | Current 2022HostedbyConfluent
 
Looking towards an official cassandra sidecar netflix
Looking towards an official cassandra sidecar   netflixLooking towards an official cassandra sidecar   netflix
Looking towards an official cassandra sidecar netflixVinay Kumar Chella
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudNoritaka Sekiyama
 
ClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei MilovidovClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei MilovidovAltinity Ltd
 

Mais procurados (20)

Apache Hudi: The Path Forward
Apache Hudi: The Path ForwardApache Hudi: The Path Forward
Apache Hudi: The Path Forward
 
RocksDB Performance and Reliability Practices
RocksDB Performance and Reliability PracticesRocksDB Performance and Reliability Practices
RocksDB Performance and Reliability Practices
 
ClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and howClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and how
 
Data Analytics and Processing at Snap - Druid Meetup LA - September 2018
Data Analytics and Processing at Snap - Druid Meetup LA - September 2018Data Analytics and Processing at Snap - Druid Meetup LA - September 2018
Data Analytics and Processing at Snap - Druid Meetup LA - September 2018
 
Deep dive into stateful stream processing in structured streaming by Tathaga...
Deep dive into stateful stream processing in structured streaming  by Tathaga...Deep dive into stateful stream processing in structured streaming  by Tathaga...
Deep dive into stateful stream processing in structured streaming by Tathaga...
 
Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...
Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...
Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...
 
Simplifying Distributed Transactions with Sagas in Kafka (Stephen Zoio, Simpl...
Simplifying Distributed Transactions with Sagas in Kafka (Stephen Zoio, Simpl...Simplifying Distributed Transactions with Sagas in Kafka (Stephen Zoio, Simpl...
Simplifying Distributed Transactions with Sagas in Kafka (Stephen Zoio, Simpl...
 
Demystifying flink memory allocation and tuning - Roshan Naik, Uber
Demystifying flink memory allocation and tuning - Roshan Naik, UberDemystifying flink memory allocation and tuning - Roshan Naik, Uber
Demystifying flink memory allocation and tuning - Roshan Naik, Uber
 
Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...
 
Reshape Data Lake (as of 2020.07)
Reshape Data Lake (as of 2020.07)Reshape Data Lake (as of 2020.07)
Reshape Data Lake (as of 2020.07)
 
Postgres index types
Postgres index typesPostgres index types
Postgres index types
 
Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache Impala
 
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfDeep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
 
Clickhouse at Cloudflare. By Marek Vavrusa
Clickhouse at Cloudflare. By Marek VavrusaClickhouse at Cloudflare. By Marek Vavrusa
Clickhouse at Cloudflare. By Marek Vavrusa
 
How YugaByte DB Implements Distributed PostgreSQL
How YugaByte DB Implements Distributed PostgreSQLHow YugaByte DB Implements Distributed PostgreSQL
How YugaByte DB Implements Distributed PostgreSQL
 
High Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando PatroniHigh Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando Patroni
 
Introducing KRaft: Kafka Without Zookeeper With Colin McCabe | Current 2022
Introducing KRaft: Kafka Without Zookeeper With Colin McCabe | Current 2022Introducing KRaft: Kafka Without Zookeeper With Colin McCabe | Current 2022
Introducing KRaft: Kafka Without Zookeeper With Colin McCabe | Current 2022
 
Looking towards an official cassandra sidecar netflix
Looking towards an official cassandra sidecar   netflixLooking towards an official cassandra sidecar   netflix
Looking towards an official cassandra sidecar netflix
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
 
ClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei MilovidovClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei Milovidov
 

Destaque

Building Stream Infrastructure across Multiple Data Centers with Apache Kafka
Building Stream Infrastructure across Multiple Data Centers with Apache KafkaBuilding Stream Infrastructure across Multiple Data Centers with Apache Kafka
Building Stream Infrastructure across Multiple Data Centers with Apache KafkaGuozhang Wang
 
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...DataWorks Summit/Hadoop Summit
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka StreamsGuozhang Wang
 
Apache Kafka at LinkedIn
Apache Kafka at LinkedInApache Kafka at LinkedIn
Apache Kafka at LinkedInGuozhang Wang
 
Kafka at Scale: Multi-Tier Architectures
Kafka at Scale: Multi-Tier ArchitecturesKafka at Scale: Multi-Tier Architectures
Kafka at Scale: Multi-Tier ArchitecturesTodd Palino
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache KafkaJeff Holoman
 
Developing Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache KafkaDeveloping Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache KafkaJoe Stein
 
Building a Real-time Data Pipeline: Apache Kafka at LinkedIn
Building a Real-time Data Pipeline: Apache Kafka at LinkedInBuilding a Real-time Data Pipeline: Apache Kafka at LinkedIn
Building a Real-time Data Pipeline: Apache Kafka at LinkedInDataWorks Summit
 
Automatic Scaling Iterative Computations
Automatic Scaling Iterative ComputationsAutomatic Scaling Iterative Computations
Automatic Scaling Iterative ComputationsGuozhang Wang
 
Behavioral Simulations in MapReduce
Behavioral Simulations in MapReduceBehavioral Simulations in MapReduce
Behavioral Simulations in MapReduceGuozhang Wang
 
Apache Kafka, and the Rise of Stream Processing
Apache Kafka, and the Rise of Stream ProcessingApache Kafka, and the Rise of Stream Processing
Apache Kafka, and the Rise of Stream ProcessingGuozhang Wang
 
Kafka Reliability - When it absolutely, positively has to be there
Kafka Reliability - When it absolutely, positively has to be thereKafka Reliability - When it absolutely, positively has to be there
Kafka Reliability - When it absolutely, positively has to be thereGwen (Chen) Shapira
 
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...Amazon Web Services
 
No data loss pipeline with apache kafka
No data loss pipeline with apache kafkaNo data loss pipeline with apache kafka
No data loss pipeline with apache kafkaJiangjie Qin
 
Hadoop Summit Europe 2014: Apache Storm Architecture
Hadoop Summit Europe 2014: Apache Storm ArchitectureHadoop Summit Europe 2014: Apache Storm Architecture
Hadoop Summit Europe 2014: Apache Storm ArchitectureP. Taylor Goetz
 
Introduction to Kafka and Zookeeper
Introduction to Kafka and ZookeeperIntroduction to Kafka and Zookeeper
Introduction to Kafka and ZookeeperRahul Jain
 
Distributed Logging System Using Elasticsearch Logstash,Beat,Kibana Stack and...
Distributed Logging System Using Elasticsearch Logstash,Beat,Kibana Stack and...Distributed Logging System Using Elasticsearch Logstash,Beat,Kibana Stack and...
Distributed Logging System Using Elasticsearch Logstash,Beat,Kibana Stack and...Sanjog Kumar Dash
 

Destaque (20)

Building Stream Infrastructure across Multiple Data Centers with Apache Kafka
Building Stream Infrastructure across Multiple Data Centers with Apache KafkaBuilding Stream Infrastructure across Multiple Data Centers with Apache Kafka
Building Stream Infrastructure across Multiple Data Centers with Apache Kafka
 
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
Building Large-Scale Stream Infrastructures Across Multiple Data Centers with...
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka Streams
 
Apache Kafka at LinkedIn
Apache Kafka at LinkedInApache Kafka at LinkedIn
Apache Kafka at LinkedIn
 
Kafka at Scale: Multi-Tier Architectures
Kafka at Scale: Multi-Tier ArchitecturesKafka at Scale: Multi-Tier Architectures
Kafka at Scale: Multi-Tier Architectures
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
 
Developing Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache KafkaDeveloping Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache Kafka
 
Building a Real-time Data Pipeline: Apache Kafka at LinkedIn
Building a Real-time Data Pipeline: Apache Kafka at LinkedInBuilding a Real-time Data Pipeline: Apache Kafka at LinkedIn
Building a Real-time Data Pipeline: Apache Kafka at LinkedIn
 
Automatic Scaling Iterative Computations
Automatic Scaling Iterative ComputationsAutomatic Scaling Iterative Computations
Automatic Scaling Iterative Computations
 
Behavioral Simulations in MapReduce
Behavioral Simulations in MapReduceBehavioral Simulations in MapReduce
Behavioral Simulations in MapReduce
 
Apache Kafka, and the Rise of Stream Processing
Apache Kafka, and the Rise of Stream ProcessingApache Kafka, and the Rise of Stream Processing
Apache Kafka, and the Rise of Stream Processing
 
Kafka Reliability - When it absolutely, positively has to be there
Kafka Reliability - When it absolutely, positively has to be thereKafka Reliability - When it absolutely, positively has to be there
Kafka Reliability - When it absolutely, positively has to be there
 
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
 
No data loss pipeline with apache kafka
No data loss pipeline with apache kafkaNo data loss pipeline with apache kafka
No data loss pipeline with apache kafka
 
Netflix Data Pipeline With Kafka
Netflix Data Pipeline With KafkaNetflix Data Pipeline With Kafka
Netflix Data Pipeline With Kafka
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Hadoop Summit Europe 2014: Apache Storm Architecture
Hadoop Summit Europe 2014: Apache Storm ArchitectureHadoop Summit Europe 2014: Apache Storm Architecture
Hadoop Summit Europe 2014: Apache Storm Architecture
 
Introduction to Kafka and Zookeeper
Introduction to Kafka and ZookeeperIntroduction to Kafka and Zookeeper
Introduction to Kafka and Zookeeper
 
Distributed Logging System Using Elasticsearch Logstash,Beat,Kibana Stack and...
Distributed Logging System Using Elasticsearch Logstash,Beat,Kibana Stack and...Distributed Logging System Using Elasticsearch Logstash,Beat,Kibana Stack and...
Distributed Logging System Using Elasticsearch Logstash,Beat,Kibana Stack and...
 
Log
LogLog
Log
 

Semelhante a Building a Replicated Logging System with Apache Kafka

Building a Distributed Message Log from Scratch
Building a Distributed Message Log from ScratchBuilding a Distributed Message Log from Scratch
Building a Distributed Message Log from ScratchTyler Treat
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache KafkaShiao-An Yuan
 
Building a Distributed Message Log from Scratch - SCaLE 16x
Building a Distributed Message Log from Scratch - SCaLE 16xBuilding a Distributed Message Log from Scratch - SCaLE 16x
Building a Distributed Message Log from Scratch - SCaLE 16xTyler Treat
 
Profiling the logwriter and database writer
Profiling the logwriter and database writerProfiling the logwriter and database writer
Profiling the logwriter and database writerKyle Hailey
 
JDD2015: Make your world event driven - Krzysztof Dębski
JDD2015: Make your world event driven - Krzysztof DębskiJDD2015: Make your world event driven - Krzysztof Dębski
JDD2015: Make your world event driven - Krzysztof DębskiPROIDEA
 
Profiling the logwriter and database writer
Profiling the logwriter and database writerProfiling the logwriter and database writer
Profiling the logwriter and database writerEnkitec
 
SEMLA_logging_infra
SEMLA_logging_infraSEMLA_logging_infra
SEMLA_logging_infraswy351
 
Gemtalk Systems Product Roadmap
Gemtalk Systems Product RoadmapGemtalk Systems Product Roadmap
Gemtalk Systems Product RoadmapESUG
 
Dmytro Okhonko "LogDevice: durable and highly available sequential distribute...
Dmytro Okhonko "LogDevice: durable and highly available sequential distribute...Dmytro Okhonko "LogDevice: durable and highly available sequential distribute...
Dmytro Okhonko "LogDevice: durable and highly available sequential distribute...Fwdays
 
D itg-manual
D itg-manualD itg-manual
D itg-manualVeggax
 
Getting Started with Kafka on k8s
Getting Started with Kafka on k8sGetting Started with Kafka on k8s
Getting Started with Kafka on k8sVMware Tanzu
 
spark stream - kafka - the right way
spark stream - kafka - the right way spark stream - kafka - the right way
spark stream - kafka - the right way Dori Waldman
 
Paper_Scalable database logging for multicores
Paper_Scalable database logging for multicoresPaper_Scalable database logging for multicores
Paper_Scalable database logging for multicoresHyo jeong Lee
 
Unveiling etcd: Architecture and Source Code Deep Dive
Unveiling etcd: Architecture and Source Code Deep DiveUnveiling etcd: Architecture and Source Code Deep Dive
Unveiling etcd: Architecture and Source Code Deep DiveChieh (Jack) Yu
 
Fast and Reliable Apache Spark SQL Engine
Fast and Reliable Apache Spark SQL EngineFast and Reliable Apache Spark SQL Engine
Fast and Reliable Apache Spark SQL EngineDatabricks
 
Webinar Back to Basics 3 - Introduzione ai Replica Set
Webinar Back to Basics 3 - Introduzione ai Replica SetWebinar Back to Basics 3 - Introduzione ai Replica Set
Webinar Back to Basics 3 - Introduzione ai Replica SetMongoDB
 
PostgreSQL Troubleshoot On-line, (RITfest 2015 meetup at Moscow, Russia).
PostgreSQL Troubleshoot On-line, (RITfest 2015 meetup at Moscow, Russia).PostgreSQL Troubleshoot On-line, (RITfest 2015 meetup at Moscow, Russia).
PostgreSQL Troubleshoot On-line, (RITfest 2015 meetup at Moscow, Russia).Alexey Lesovsky
 
Experiences building a distributed shared log on RADOS - Noah Watkins
Experiences building a distributed shared log on RADOS - Noah WatkinsExperiences building a distributed shared log on RADOS - Noah Watkins
Experiences building a distributed shared log on RADOS - Noah WatkinsCeph Community
 

Semelhante a Building a Replicated Logging System with Apache Kafka (20)

Building a Distributed Message Log from Scratch
Building a Distributed Message Log from ScratchBuilding a Distributed Message Log from Scratch
Building a Distributed Message Log from Scratch
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
 
Building a Distributed Message Log from Scratch - SCaLE 16x
Building a Distributed Message Log from Scratch - SCaLE 16xBuilding a Distributed Message Log from Scratch - SCaLE 16x
Building a Distributed Message Log from Scratch - SCaLE 16x
 
Profiling the logwriter and database writer
Profiling the logwriter and database writerProfiling the logwriter and database writer
Profiling the logwriter and database writer
 
JDD2015: Make your world event driven - Krzysztof Dębski
JDD2015: Make your world event driven - Krzysztof DębskiJDD2015: Make your world event driven - Krzysztof Dębski
JDD2015: Make your world event driven - Krzysztof Dębski
 
Profiling the logwriter and database writer
Profiling the logwriter and database writerProfiling the logwriter and database writer
Profiling the logwriter and database writer
 
SEMLA_logging_infra
SEMLA_logging_infraSEMLA_logging_infra
SEMLA_logging_infra
 
Gemtalk Systems Product Roadmap
Editor's Notes

  1. Thank you, and good morning. Today I am going to talk about Kafka, and how it can be built as a general replicated log stream for a wide range of scalable systems. This is joint work from the Apache Kafka community.
  2. First of all, being in this room, I think it is safe to say “we all love logs”. Logs have been around almost as long as this research community.
  3. No-overwrite storage in POSTGRES. ARIES: write-ahead logging in the 80’s. Today, reading the 50-page ARIES paper is a must-do for every database graduate student, including myself.
  4. Similarly, the log-structured storage architecture.
  5. Replicated state machines. In all these examples, the log is used as the source-of-truth change log to scale the system while providing durability and consistency.
  6. So that is all good stuff about logs, but where is Kafka in this big picture? Well, Kafka is an Apache open-source distributed messaging system that stores messages as a commit log.
  7. As a data-serving website, LinkedIn has a lot of data. We have this variety of data, and we need to build all these products around it. Messaging: ActiveMQ. User activity: in-house log aggregation. Logging: Splunk. Metrics: JMX => Zenoss. Database data: Databus, custom ETL.
  8. This idea of using logs for data flow in a log-centric fashion has been floating around LinkedIn. Take all the organization's data and put it into a central log for real-time subscription. Data integration, replication, real-time stream processing.
  9. Disks are fast when used sequentially. File system caching.
  10. Topic = message stream. A topic has partitions; partitions are distributed across brokers.
  11. higher availability and durability
  12. evenly distributed
  13. replicated log => replicated state machine
  14. One of the replicas is the leader; leaders are evenly spread. All writes go to the leader. The leader propagates writes to followers in order. The leader decides when to commit a message.
  15. The size of the ISR is decoupled from the size of the replica set, hence the number of replicas and acknowledgements are independent.
  16. ack=3
  17. Only committed messages are exposed to consumers. When messages are committed is independent of the ack chosen by the producer.
  18. ack=1
  19. ack=1
  20. ack=3, follower slow
  21. under replicated partitions
  22. ack=3, broker failure
  23. load balancing, cluster expansion
  24. load balancing, cluster expansion
  25. This is a major initiative and will put Kafka on the critical path for data paths that are sensitive to site latency and that also require much stronger message delivery guarantees.
  26. Data standardization, site monitoring
  27. Data flow graph. Flow rate may overwhelm the query processor: batch processing, sampling, synopsis, etc. In-memory storage constraints: single-pass algorithms, no stream backtracking.
  28. WAL
  29. Streaming on Message Pipes
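
The ISR behaviour described in notes 14 to 22 can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Kafka's actual implementation: the class `PartitionLeader` and all of its method names are hypothetical, replication is modelled as a follower instantly copying the leader's log, and a follower re-joins the ISR as soon as it has fully caught up.

```python
class PartitionLeader:
    """Toy model of a partition leader tracking its in-sync replicas (ISR).

    Hypothetical illustration only; names and behaviour are simplified.
    """

    def __init__(self, leader, replicas):
        self.leader = leader
        self.replicas = set(replicas)
        self.isr = set(replicas)                  # initially every replica is in sync
        self.log_end = {r: 0 for r in replicas}   # replica id -> next offset to write

    def append(self):
        """All writes go to the leader; returns the offset of the new message."""
        offset = self.log_end[self.leader]
        self.log_end[self.leader] += 1
        return offset

    def replicate_to(self, follower):
        """Follower copies the leader's log in order; once caught up it (re-)joins the ISR."""
        self.log_end[follower] = self.log_end[self.leader]
        self.isr.add(follower)

    def drop_from_isr(self, follower):
        """A failed or slow follower is dropped from the ISR (the leader never is)."""
        if follower != self.leader:
            self.isr.discard(follower)

    def high_watermark(self):
        """Offsets below this value are committed: every ISR member has them."""
        return min(self.log_end[r] for r in self.isr)

    def is_acked(self, offset, acks):
        """Would a producer using this ack setting have been acknowledged yet?"""
        if acks == 1:
            return offset < self.log_end[self.leader]
        if acks == "all":
            return offset < self.high_watermark()
        raise ValueError("this sketch only models acks=1 and acks='all'")


# Usage: three replicas, broker 1 is the leader, broker 3 is slow.
p = PartitionLeader(leader=1, replicas=[1, 2, 3])
first = p.append()
p.replicate_to(2)
print(p.is_acked(first, 1), p.is_acked(first, "all"))  # leader has it, ISR {1,2,3} does not yet
p.drop_from_isr(3)
print(p.is_acked(first, "all"))                        # ISR is now {1,2}, so the message commits
```

The point of the sketch is that the commit condition depends on the current ISR rather than on a fixed majority of the replica set, which is why dropping the slow follower lets the write commit.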