By Amir Sedighi 
@amirsedighi 
Data Solutions Engineer at DatisPars 
Nov 2014
2 
References 
● http://kafka.apache.org/documentation.html 
● http://www.slideshare.net/charmalloc/current-and-future-of-apache-kafka 
● http://www.michael-noll.com/blog/2013/03/13/running-a-multi-broker-apache-kafka-cluster-on-a-single-node/
3 
At first data pipelining looks easy! 
● It often starts with one 
data pipeline from a 
producer to a 
consumer.
4 
It also looks pretty wise to reuse 
things! 
● Reusing the pipeline 
for new producers.
5 
We may handle some situations! 
● Reusing added 
producers for new 
consumers.
6 
But we can't go far! 
● Eventually the 
solution becomes the 
problem!
7 
The additional requirements make 
things complicated! 
● With later developments it gets even worse!
8 
How to avoid this mess?
9 
Decoupling Data-Pipelines
10 
Message Delivery Semantics 
● At most once 
– Messages may be lost but are never redelivered. 
● At least once 
– Messages are never lost but may be redelivered. 
● Exactly once 
– This is what people actually want.
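The difference between the first two semantics can be sketched as a question of when a consumer saves its position. The following is a minimal simulation, not Kafka client code: `consume`, `crash_at`, and `commit_first` are hypothetical names, and a single crash is simulated at one offset.

```python
def consume(messages, crash_at, commit_first):
    """Process messages with one simulated crash, then resume from the saved offset.

    commit_first=True  -> the offset is saved before processing (at most once)
    commit_first=False -> the offset is saved after processing (at least once)
    """
    processed = []
    committed = 0      # next offset to read after a restart
    crashed = False
    while committed < len(messages):
        offset = committed
        if commit_first:
            committed = offset + 1
        if offset == crash_at and not crashed:
            crashed = True
            if not commit_first:
                # The message was handled, but the crash hits before the commit.
                processed.append(messages[offset])
            continue   # "restart": resume from the committed offset
        processed.append(messages[offset])
        if not commit_first:
            committed = offset + 1
    return processed


print(consume(["a", "b", "c"], crash_at=1, commit_first=True))
print(consume(["a", "b", "c"], crash_at=1, commit_first=False))
```

Saving the offset first loses the message that was in flight during the crash; saving it afterwards processes that message twice.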
11 
Apache Kafka is publish-subscribe messaging 
rethought as a distributed commit log.
12 
Apache Kafka 
● Apache Kafka is publish-subscribe messaging 
rethought as a distributed commit log. 
– Kafka is super fast. 
– Kafka is scalable. 
– Kafka is durable. 
– Kafka is distributed by design.
15 
Apache Kafka 
● A single Kafka broker 
(server) can handle 
hundreds of 
megabytes of reads 
and writes per second 
from thousands of 
clients.
17 
Apache Kafka 
● Kafka is designed to 
allow a single cluster 
to serve as the central 
data backbone for a 
large organization. It 
can be elastically and 
transparently 
expanded without 
downtime.
19 
Apache Kafka 
● Messages are 
persisted on disk and 
replicated within the 
cluster to prevent 
data loss. Each 
broker can handle 
terabytes of 
messages without 
performance impact.
21 
Apache Kafka 
● Kafka has a modern 
cluster-centric design 
that offers strong 
durability and fault-tolerance 
guarantees.
22 
Kafka at LinkedIn
23
24 
Kafka is a distributed, partitioned, replicated 
commit log service.
25 
Main Components 
● Topic 
● Producer 
● Consumer 
● Broker
26 
Topic 
● Topic 
● Producer 
● Consumer 
● Broker 
● Kafka maintains feeds 
of messages in 
categories called 
topics. 
● Topics are the highest 
level of abstraction 
that Kafka provides.
27 
Topic
28 
Topic
29 
Topic
30 
Producer 
● Topic 
● Producer 
● Consumer 
● Broker 
● We'll call processes 
that publish 
messages to a Kafka 
topic producers.
31 
Producer
32 
Producer
33 
Producer
34 
Consumer 
● Topic 
● Producer 
● Consumer 
● Broker 
● We'll call processes 
that subscribe to 
topics and process 
the feed of published 
messages, 
consumers. 
– Hadoop Consumer
35 
Consumer
36 
Broker 
● Topic 
● Producer 
● Consumer 
● Broker 
● Kafka is run as a 
cluster comprised of 
one or more servers 
each of which is 
called a broker.
37 
Broker
38 
Broker
39 
Topics 
● A topic is a category 
or feed name to which 
messages are 
published. 
● Kafka cluster 
maintains a 
partitioned log for 
each topic.
40 
Partition 
● A partition is an ordered, 
immutable sequence of 
messages that is 
continually appended to 
a commit log. 
● The messages in the 
partitions are each 
assigned a sequential id 
number called the offset.
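As an illustration of the two properties above, here is a minimal in-memory sketch of an append-only partition. The class `Partition` and its methods are hypothetical names for this example, not Kafka's API; a real broker persists the log on disk.

```python
class Partition:
    """Append-only sequence of messages; each message gets the next offset."""

    def __init__(self):
        self._log = []

    def append(self, message):
        # The offset is simply the position of the message in the log.
        offset = len(self._log)
        self._log.append(message)
        return offset

    def read(self, offset, max_messages=10):
        """Return (offset, message) pairs starting at the given offset."""
        chunk = self._log[offset:offset + max_messages]
        return list(enumerate(chunk, start=offset))


partition = Partition()
partition.append(b"first")
partition.append(b"second")
print(partition.read(0))
```

Because messages are never modified after being appended, a consumer only needs to remember one number, its offset, to resume reading.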
41 
Partition
42 
Again Topic and Partition
43 
Log Compaction
44 
Producer 
● The producer is responsible for choosing which 
message to assign to which partition within the 
topic. 
– Round-Robin 
– Load-Balanced 
– Key-Based (Semantic-Oriented)
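Two of the strategies listed above can be sketched as small functions that map a message to a partition number. This is an illustration under assumptions: the function names are hypothetical, and CRC32 stands in for whatever hash a real Kafka client uses.

```python
import itertools
import zlib


def round_robin_partitioner(num_partitions):
    """Spread messages evenly, ignoring their content."""
    counter = itertools.count()

    def choose(key=None):
        return next(counter) % num_partitions

    return choose


def key_based_partitioner(num_partitions):
    """Send every message with the same key to the same partition."""

    def choose(key):
        # CRC32 is an assumption for this sketch, not Kafka's actual hash.
        return zlib.crc32(key) % num_partitions

    return choose


rr = round_robin_partitioner(3)
print([rr() for _ in range(4)])

by_key = key_based_partitioner(4)
print(by_key(b"user-1") == by_key(b"user-1"))
```

Key-based assignment keeps all messages for one key in one partition, so their relative order is preserved within that partition.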
45 
Log Compaction
46 
What does a Kafka cluster look like?
47 
How does Kafka replicate a topic's 
partitions across the cluster?
48 
Logical Consumers
49 
What if we put jobs (processors) 
across the flow?
50 
Where to Start? 
● http://kafka.apache.org/downloads.html
51 
Run Zookeeper 
● bin/zookeeper-server-start.sh config/zookeeper.properties
52 
Run kafka-server 
● bin/kafka-server-start.sh config/server.properties
53 
Create Topic 
● bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test 
> Created topic "test".
54 
List all Topics 
● bin/kafka-topics.sh --list --zookeeper localhost:2181
55 
Send some Messages by Producer 
● bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test 
Hello DatisPars Guys! 
How is it going with you?
56 
Start a Consumer 
● bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning
57 
Producing ...
58 
Consuming
59 
Use Cases 
● Messaging 
– Kafka is comparable to traditional messaging 
systems such as ActiveMQ and RabbitMQ. 
● Kafka provides customizable latency 
● Kafka has better throughput 
● Kafka is highly fault-tolerant
60 
Use Cases 
● Log Aggregation 
– Many people use Kafka as a replacement for a log aggregation 
solution. 
– Log aggregation typically collects physical log files off servers 
and puts them in a central place (a file server or HDFS perhaps) 
for processing. 
– In comparison to log-centric systems like Scribe or Flume, Kafka 
offers equally good performance, stronger durability guarantees 
due to replication, and much lower end-to-end latency. 
● Lower-latency 
● Easier support
61 
Use Cases 
● Stream Processing 
– Storm and Samza are popular frameworks for stream processing. They 
both use Kafka. 
● Event Sourcing 
– Event sourcing is a style of application design where state changes are 
logged as a time-ordered sequence of records. Kafka's support for very 
large stored log data makes it an excellent backend for an application 
built in this style. 
● Commit Log 
– Kafka can serve as a kind of external commit-log for a distributed 
system. The log helps replicate data between nodes and acts as a re-syncing 
mechanism for failed nodes to restore their data.
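The event-sourcing and commit-log ideas above both rest on replaying an ordered log to rebuild state. A minimal sketch, assuming a hypothetical log of (account, amount change) events rather than real Kafka messages:

```python
def replay(events):
    """Rebuild current balances by applying every logged event in order."""
    balances = {}
    for account, delta in events:
        balances[account] = balances.get(account, 0) + delta
    return balances


# Hypothetical event log: each entry is (account, change in balance).
log = [("alice", 100), ("bob", 50), ("alice", -30)]
print(replay(log))
```

A node that lost its state could recover it the same way, by reading the log from the beginning and applying each record again.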
62 
Message Format 
/**
 * A message. The format of an N byte message is the following:
 * If magic byte is 0
 *   1. 1 byte "magic" identifier to allow format changes
 *   2. 4 byte CRC32 of the payload
 *   3. N - 5 byte payload
 * If magic byte is 1
 *   1. 1 byte "magic" identifier to allow format changes
 *   2. 1 byte "attributes" identifier to allow annotations on the message independent of the version (e.g. compression enabled, type of codec used)
 *   3. 4 byte CRC32 of the payload
 *   4. N - 6 byte payload
 */
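The layout for magic byte 1 described above can be sketched with Python's standard library. This is an illustration, not Kafka's own serializer: `encode_v1` and `decode_v1` are hypothetical names, and big-endian byte order is an assumption of this example.

```python
import struct
import zlib


def encode_v1(payload, attributes=0):
    """magic (1 byte) + attributes (1 byte) + CRC32 of payload (4 bytes) + payload."""
    crc = zlib.crc32(payload) & 0xFFFFFFFF
    # ">BBI" packs two unsigned bytes and one unsigned 32-bit int, big-endian.
    return struct.pack(">BBI", 1, attributes, crc) + payload


def decode_v1(message):
    """Split a message into (attributes, payload) after checking magic and CRC."""
    magic, attributes, crc = struct.unpack(">BBI", message[:6])
    payload = message[6:]
    if magic != 1:
        raise ValueError("unexpected magic byte")
    if zlib.crc32(payload) & 0xFFFFFFFF != crc:
        raise ValueError("CRC mismatch")
    return attributes, payload


message = encode_v1(b"hello")
print(len(message))
print(decode_v1(message))
```

The header takes 6 bytes, which matches the "N - 6 byte payload" line: a 5-byte payload yields an 11-byte message.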
63 
Questions?

 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
 

An Introduction to Apache Kafka

  • 1. By Amir Sedighi @amirsedighi Data Solutions Engineer at DatisPars Nov 2014
  • 2. References ● http://kafka.apache.org/documentation.html ● http://www.slideshare.net/charmalloc/current-and-future-of-apache-kafka ● http://www.michael-noll.com/blog/2013/03/13/running-a-multi-broker-apache-kafka-cluster-on-a-single-node/
  • 3. At first data pipelining looks easy! ● It often starts with one data pipeline from a producer to a consumer.
  • 4. It also looks wise to reuse things! ● Reusing the pipeline for new producers.
  • 5. We may handle some situations! ● Reusing added producers for new consumers.
  • 6. But we can't go far! ● Eventually the solution becomes the problem!
  • 7. The additional requirements make things complicated! ● With later developments it gets even worse!
  • 8. How to avoid this mess?
  • 10. Message Delivery Semantics ● At most once – Messages may be lost but are never redelivered. ● At least once – Messages are never lost but may be redelivered. ● Exactly once – This is what people actually want.
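The difference between the first two semantics can be sketched as a question of ordering: does the consumer record its position before or after it processes a message? The following is a toy simulation, not Kafka client code; `consume`, `crash_at`, and `commit_first` are hypothetical names chosen for illustration, and the "crash" is simulated by skipping the second step once.

```python
def consume(log, crash_at, commit_first):
    """Process every message in `log`, crashing once at offset `crash_at`.

    The simulated crash always happens between the two steps
    (commit and process), whichever order they run in.
    """
    committed = 0      # next offset to read after a restart
    processed = []     # side effects of processing, in order
    crashed = False
    while committed < len(log):
        offset = committed
        message = log[offset]
        if commit_first:
            # At most once: the position is saved before processing.
            committed = offset + 1
            if offset == crash_at and not crashed:
                crashed = True
                continue  # restart: the message is never processed (lost)
            processed.append(message)
        else:
            # At least once: the position is saved after processing.
            processed.append(message)
            if offset == crash_at and not crashed:
                crashed = True
                continue  # restart: the same message is read again
            committed = offset + 1
    return processed


log = ["a", "b", "c"]
print(consume(log, crash_at=1, commit_first=True))   # ['a', 'c']
print(consume(log, crash_at=1, commit_first=False))  # ['a', 'b', 'b', 'c']
```

In this sketch, committing first loses `b`, while committing last delivers `b` twice. Exactly-once behaviour needs something extra on top, for example processing that is safe to repeat.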
  • 11. Apache Kafka is publish-subscribe messaging rethought as a distributed commit log.
  • 12. Apache Kafka ● Apache Kafka is publish-subscribe messaging rethought as a distributed commit log. – Kafka is super fast. – Kafka is scalable. – Kafka is durable. – Kafka is distributed by design.
  • 15. Apache Kafka ● A single Kafka broker (server) can handle hundreds of megabytes of reads and writes per second from thousands of clients.
  • 17. Apache Kafka ● Kafka is designed to allow a single cluster to serve as the central data backbone for a large organization. It can be elastically and transparently expanded without downtime.
  • 19. Apache Kafka ● Messages are persisted on disk and replicated within the cluster to prevent data loss. Each broker can handle terabytes of messages without performance impact.
  • 21. Apache Kafka ● Kafka has a modern cluster-centric design that offers strong durability and fault-tolerance guarantees.
  • 22. Kafka at LinkedIn
  • 24. Kafka is a distributed, partitioned, replicated commit log service.
  • 25. Main Components ● Topic ● Producer ● Consumer ● Broker
  • 26. Topic ● Kafka maintains feeds of messages in categories called topics. ● Topics are the highest level of abstraction that Kafka provides.
  • 30. Producer ● We'll call processes that publish messages to a Kafka topic producers.
  • 34. Consumer ● We'll call processes that subscribe to topics and process the feed of published messages consumers. – Hadoop Consumer
  • 36. Broker ● Kafka is run as a cluster comprised of one or more servers, each of which is called a broker.
  • 39. Topics ● A topic is a category or feed name to which messages are published. ● The Kafka cluster maintains a partitioned log for each topic.
  • 40. Partition ● A partition is an ordered, immutable sequence of messages that is continually appended to, that is, a commit log. ● The messages in the partitions are each assigned a sequential id number called the offset.
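To make the partition and offset idea concrete, here is a minimal in-memory sketch. The `Partition` class and its method names are hypothetical and are not Kafka's API; a real broker persists the log to disk and replicates it.

```python
class Partition:
    """Toy append-only log: messages are only ever added at the end."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        """Store a message and return its offset (its position in the log)."""
        self._messages.append(message)
        return len(self._messages) - 1

    def read(self, offset, max_messages=10):
        """Return up to `max_messages` messages starting at `offset`."""
        return self._messages[offset:offset + max_messages]


partition = Partition()
first = partition.append(b"hello")   # offset 0
second = partition.append(b"world")  # offset 1
print(first, second)                 # 0 1
print(partition.read(1))             # [b'world']
```

Because existing entries never change, a reader only needs to remember one number, its offset, to resume where it left off.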
  • 42. Topic and Partition Again
  • 44. Producer ● The producer is responsible for choosing which message to assign to which partition within the topic. – Round-Robin – Load-Balanced – Key-Based (Semantic-Oriented)
  • 46. What does a Kafka cluster look like?
  • 47. How does Kafka replicate a topic's partitions across the cluster?
  • 49. What if we put jobs (processors) across the flow?
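Two of the partitioning strategies named on the producer slide can be sketched as small functions. This is an illustration only: the function names are hypothetical, and the use of CRC32 as the key hash is an assumption made for the example, not a statement about the hash Kafka's own partitioner uses.

```python
import itertools
import zlib


def round_robin_partitioner(num_partitions):
    """Return a function that cycles through partitions, ignoring the key."""
    counter = itertools.cycle(range(num_partitions))
    return lambda key: next(counter)


def key_based_partitioner(num_partitions):
    """Return a function that maps equal keys to the same partition.

    CRC32 is an assumed stand-in hash, chosen because it is deterministic
    across runs (unlike Python's built-in hash() for bytes and str).
    """
    return lambda key: zlib.crc32(key) % num_partitions


round_robin = round_robin_partitioner(3)
print([round_robin(None) for _ in range(4)])  # [0, 1, 2, 0]

by_key = key_based_partitioner(3)
print(by_key(b"user-42") == by_key(b"user-42"))  # True
```

Round-robin spreads load evenly, while key-based assignment keeps all messages for one key in one partition, which preserves their relative order.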
  • 50. Where to Start? ● http://kafka.apache.org/downloads.html
  • 51. Run Zookeeper ● bin/zookeeper-server-start.sh config/zookeeper.properties
  • 52. Run kafka-server ● bin/kafka-server-start.sh config/server.properties
  • 53. Create Topic ● bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test > Created topic "test".
  • 54. List all Topics ● bin/kafka-topics.sh --list --zookeeper localhost:2181
  • 55. Send some Messages by Producer ● bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test Hello DatisPars Guys! How is it going with you?
  • 56. Start a Consumer ● bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning
  • 59. Use Cases ● Messaging – Kafka is comparable to traditional messaging systems such as ActiveMQ and RabbitMQ. ● Kafka provides customizable latency ● Kafka has better throughput ● Kafka is highly fault-tolerant
  • 60. Use Cases ● Log Aggregation – Many people use Kafka as a replacement for a log aggregation solution. – Log aggregation typically collects physical log files off servers and puts them in a central place (a file server or HDFS perhaps) for processing. – In comparison to log-centric systems like Scribe or Flume, Kafka offers equally good performance, stronger durability guarantees due to replication, and much lower end-to-end latency. ● Lower latency ● Easier support
  • 61. Use Cases ● Stream Processing – Storm and Samza are popular frameworks for stream processing. They both use Kafka. ● Event Sourcing – Event sourcing is a style of application design where state changes are logged as a time-ordered sequence of records. Kafka's support for very large stored log data makes it an excellent backend for an application built in this style. ● Commit Log – Kafka can serve as a kind of external commit-log for a distributed system. The log helps replicate data between nodes and acts as a re-syncing mechanism for failed nodes to restore their data.
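The event sourcing and commit log use cases both rest on one idea: current state can be rebuilt by replaying the log in order. A minimal sketch follows, assuming a hypothetical event shape of `(kind, account, amount)` tuples and plain Python lists in place of a Kafka topic.

```python
def apply_event(state, event):
    """Return a new state with one event applied (state is not mutated)."""
    kind, account, amount = event
    delta = amount if kind == "deposit" else -amount
    new_state = dict(state)
    new_state[account] = new_state.get(account, 0) + delta
    return new_state


def replay(log, from_offset=0, state=None):
    """Rebuild state by applying every event from `from_offset` onward."""
    state = dict(state or {})
    for event in log[from_offset:]:
        state = apply_event(state, event)
    return state


log = [
    ("deposit", "alice", 100),
    ("withdraw", "alice", 30),
    ("deposit", "bob", 5),
]
print(replay(log))  # {'alice': 70, 'bob': 5}

# A failed node that kept a snapshot taken at offset 2 only replays the tail.
snapshot = replay(log[:2])
print(replay(log, from_offset=2, state=snapshot) == replay(log))  # True
```

The second call mirrors the re-syncing role described on the slide: a node restores its data by resuming from the last offset it had applied.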
  • 62. Message Format
    /**
     * A message. The format of an N byte message is the following:
     * If magic byte is 0
     *   1. 1 byte "magic" identifier to allow format changes
     *   2. 4 byte CRC32 of the payload
     *   3. N - 5 byte payload
     * If magic byte is 1
     *   1. 1 byte "magic" identifier to allow format changes
     *   2. 1 byte "attributes" identifier to allow annotations on the message independent of the version (e.g. compression enabled, type of codec used)
     *   3. 4 byte CRC32 of the payload
     *   4. N - 6 byte payload
     */
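The layout described on the Message Format slide can be sketched with Python's `struct` and `zlib` modules. This is an illustration of the two layouts, not Kafka's serializer: the function names are hypothetical, and big-endian byte order for the CRC field is an assumption, since the slide does not state it.

```python
import struct
import zlib


def encode_message(payload, magic=1, attributes=0):
    """Build a message as [magic][attributes, if magic is 1][CRC32][payload]."""
    crc = zlib.crc32(payload) & 0xFFFFFFFF
    if magic == 0:
        header = struct.pack(">BI", 0, crc)               # 1 + 4 = 5 bytes
    elif magic == 1:
        header = struct.pack(">BBI", 1, attributes, crc)  # 1 + 1 + 4 = 6 bytes
    else:
        raise ValueError("unknown magic byte: %r" % magic)
    return header + payload


def decode_message(data):
    """Parse a message and verify its CRC; returns (magic, attributes, payload)."""
    magic = data[0]
    if magic == 0:
        (crc,) = struct.unpack_from(">I", data, 1)
        attributes, payload = None, data[5:]
    elif magic == 1:
        attributes, crc = struct.unpack_from(">BI", data, 1)
        payload = data[6:]
    else:
        raise ValueError("unknown magic byte: %r" % magic)
    if zlib.crc32(payload) & 0xFFFFFFFF != crc:
        raise ValueError("CRC mismatch: payload is corrupted")
    return magic, attributes, payload


encoded = encode_message(b"hello", attributes=2)
print(len(encoded))             # 11 (6 header bytes + 5 payload bytes)
print(decode_message(encoded))  # (1, 2, b'hello')
```

The leading magic byte is what lets a reader tell the two layouts apart, so the attributes byte could be added without breaking messages written in the older format.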