Agenda
Overview
• Problem statement
• Use cases
• What is Kafka?
Key Terminologies
• Topic
• Partition
• Offset
Kafka Cluster
• Zookeeper
• Broker
• Replication
• Analogy
• Relationship
Demo
• CLI
• Spark
Problem statement
[Diagram: three source systems and three target systems]
• Integrating many source and target systems
• High-velocity streams
Use cases
[Diagram: three source systems and three target systems, with Kafka between them]
• Tracking user activity
• Log aggregation
• Decoupling systems
• Stream processing
Kafka
[Diagram: three producers and three consumers, with Kafka between them]
• Scales to hundreds of nodes
• Handles millions of messages per second
• Real-time processing (~10 ms)
Kafka is a horizontally scalable, fault-tolerant, and fast messaging system.
Topic
• Stream of data
• Similar to a table in a NoSQL database
• Split into partitions
• Data is retrieved by offset
• Offsets are unique per partition within a topic
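The topic, partition, and offset terms above can be sketched with a toy in-memory model. This is only an illustration: the `Topic` class and its methods are hypothetical names, not part of any Kafka client, and offsets here start at 0 because they are list indices.

```python
class Topic:
    """Toy in-memory model of a topic: a set of append-only partitions."""

    def __init__(self, name, num_partitions):
        self.name = name
        # One list per partition; a message's offset is its index in that list.
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, partition, message):
        """Append a message to a partition and return the offset it received."""
        log = self.partitions[partition]
        log.append(message)
        return len(log) - 1

    def read(self, partition, offset):
        """Retrieve the message stored at (partition, offset)."""
        return self.partitions[partition][offset]


topic = Topic("topic-a", num_partitions=2)
first = topic.append(0, "click")    # offset 0 in partition 0
second = topic.append(0, "scroll")  # offset 1 in partition 0
other = topic.append(1, "login")    # offset 0 again, but in partition 1
```

The same offset value appears in both partitions, so a message is only identified by topic, partition, and offset together.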
NoSQL table & Topic Analogy
[Diagram: Table A with Key and Column fields, beside Topic A with Partition 0 (offsets 1 2 3) and Partition 1 (offsets 1 2)]
• Table = Topic
• Key = offset
• Key unique per table = offset unique per partition + topic
• Row = message
Partition
[Diagram: a Kafka cluster of Broker 1, Broker 2, and Broker 3 hosting the partitions of Topic A and Topic B]
• Topic A – 3 partitions
• Topic B – 2 partitions
• Enables a topic to be distributed
• Unit of parallelism
• Usually one topic has many partitions
• Order is guaranteed only within a partition
• Messages are immutable
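The ordering guarantee can be illustrated with plain lists. In this sketch, events alternate between two partitions purely for demonstration (an assumed placement, not how a real producer decides), and the consumer happens to read one partition before the other.

```python
def is_in_produced_order(events, produced):
    """True if `events` appear in the same relative order as in `produced`."""
    positions = [produced.index(e) for e in events]
    return positions == sorted(positions)


# Six events produced in this order, spread over two partitions.
produced = ["e0", "e1", "e2", "e3", "e4", "e5"]
partitions = {0: [], 1: []}
for i, event in enumerate(produced):
    # Append-only: entries already in a partition are never modified.
    partitions[i % 2].append(event)

# A consumer that drains partition 1 before partition 0.
consumed = partitions[1] + partitions[0]

per_partition_ok = all(is_in_produced_order(p, produced) for p in partitions.values())
global_ok = is_in_produced_order(consumed, produced)
```

Each partition keeps its own events in order, while the combined stream does not, which is why consumers that need ordering must rely on a single partition.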
Partition – offset & key
[Diagram: producers write to Topic A in Kafka; Partition 0 holds offsets 1–6, Partition 1 holds offsets 1–4, Partition 2 holds offsets 1–5]
• Key
  ➢ Messages are written to a partition based on their key
  ➢ With no key, partitions are chosen round-robin
  ➢ Keys are important to avoid hotspots
• Offsets
  ➢ Incremental unique ID per partition
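A minimal sketch of the key rule above, assuming a hypothetical `Partitioner` class. `zlib.crc32` is a stand-in hash chosen only because it is stable across runs; it is not the hash real Kafka clients use, and newer clients may handle keyless messages differently from the round-robin described on the slide.

```python
import itertools
import zlib


class Partitioner:
    """Toy partitioner: hash the key if present, otherwise rotate round-robin."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._round_robin = itertools.cycle(range(num_partitions))

    def partition_for(self, key=None):
        if key is None:
            # No key: take the next partition in turn.
            return next(self._round_robin)
        # Keyed: a stable hash sends equal keys to the same partition.
        # crc32 is an assumption for this sketch, not Kafka's own hash.
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions


partitioner = Partitioner(num_partitions=3)
keyless = [partitioner.partition_for() for _ in range(6)]
same_key = {partitioner.partition_for("user-42") for _ in range(5)}
```

Because every message with the same key lands in one partition, a key that dominates the traffic concentrates load on that partition, which is the hotspot the slide warns about.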
Kafka Architecture
[Diagram: producers write to a Kafka cluster (Broker 1, Broker 2, Broker 3); a consumer group reads from it; a Zookeeper cluster (node 1 to node n) is linked to the Kafka cluster with the label "Update metadata"]
• Zookeeper
• Kafka nodes (brokers)
• Producers
• Consumer groups
• Consumers
Zookeeper
[Diagram: a Zookeeper ensemble of Zookeeper 0 (Follower), Zookeeper 1 (Leader), and Zookeeper 2 (Follower) alongside Broker 0, Broker 1, Broker 2, Broker 3, and Broker 5, with the label "All metadata writes"]
• Hierarchical key-value store
• Configuration, synchronization, and name registry services
• Ensemble layer
• Ties things together
• Ensures high availability
• Odd number of nodes
• More than 7 nodes is not recommended
• Kafka can't work without Zookeeper (true of Zookeeper-based deployments; newer Kafka releases can run in KRaft mode without it)
• Stores metadata
• Leader and follower nodes
• All writes go only through the leader node
• As of Kafka 0.10, consumer offsets are not managed by Zookeeper
• Acts like a project manager (analogy)
Zookeeper is a centralized service for managing distributed systems.
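The "odd number of nodes" advice follows from majority voting: an ensemble stays available only while more than half of its nodes can reach each other. A small arithmetic sketch, with hypothetical function names:

```python
def quorum_size(ensemble_size):
    """Smallest number of nodes that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many nodes can fail while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


# Map each ensemble size to (quorum size, tolerated failures).
table = {n: (quorum_size(n), tolerated_failures(n)) for n in (3, 4, 5, 6, 7)}
```

Four nodes tolerate one failure, the same as three, and six tolerate two, the same as five, so an even-sized ensemble adds a node without adding fault tolerance.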
Broker
[Diagram: producers and consumers around Kafka, where Topic A is spread over Broker 1 (Partition 0, Partition 1) and Broker 2 (Partition 2, Partition 3)]
• Single Kafka node
• Managed by Zookeeper
• A topic is distributed across brokers according to its partitions and replication
• Acts like a developer (analogy)
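One way to picture how a topic is spread across brokers is a rotating placement. This is a simplified sketch under stated assumptions, not the assignment algorithm Kafka itself uses: `assign_replicas` is a hypothetical helper, and treating the first broker in each list as the leader is a convention of this example only.

```python
def assign_replicas(num_partitions, replication_factor, broker_ids):
    """Place each partition's replicas on distinct brokers, rotating the start.

    Returns {partition: [broker, ...]}; the first broker in each list is
    treated as the leader replica in this sketch.
    """
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor cannot exceed the number of brokers")
    assignment = {}
    for partition in range(num_partitions):
        assignment[partition] = [
            broker_ids[(partition + replica) % len(broker_ids)]
            for replica in range(replication_factor)
        ]
    return assignment


layout = assign_replicas(num_partitions=2, replication_factor=2, broker_ids=[1, 2, 3])
```

With two partitions, two copies each, and three brokers, every copy of a given partition sits on a different broker, so losing one broker never removes all copies of a partition.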
Replication
[Diagram: Topic B (2 partitions, replication factor of 2) on Broker 1, Broker 2, and Broker 3; Partition 0 and Partition 1 each have a leader replica and a follower replica; a producer and a consumer group are shown]
• Copy of a partition on another broker
• Enables fault tolerance
• A follower partition replicates from its leader
• Only the leader serves both producers and consumers
• ISR – In-Sync Replica
Replication – IT team analogy
[Diagram: a dev team of Developer 1, Developer 2, and Developer 3 working on Module B; Task 0 and Task 1 each have a leader and a follower; a Manager (Leader), a Task Assigner, and a Testing Team are shown alongside]
• Module B – 2 parallel tasks
• 1 backup resource for Module B
Replication – Followers
[Diagram: a producer writes to Partition 0 (Leader); two Partition 0 (Follower) replicas pull changes from the leader]
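The pull-based flow in the diagram can be sketched as follows. The `Replica` class and `in_sync_replicas` function are hypothetical, and "in sync" is simplified here to "log length equals the leader's"; a real broker decides ISR membership with a lag-based criterion rather than exact equality.

```python
class Replica:
    """One copy of a partition's log, held on a single broker."""

    def __init__(self, broker_id):
        self.broker_id = broker_id
        self.log = []

    def fetch_from(self, leader):
        """Pull any messages the leader has that this replica lacks."""
        self.log.extend(leader.log[len(self.log):])


def in_sync_replicas(leader, followers):
    """The leader plus every follower whose log has fully caught up."""
    return [leader.broker_id] + [
        f.broker_id for f in followers if len(f.log) == len(leader.log)
    ]


leader = Replica(broker_id=1)
followers = [Replica(broker_id=2), Replica(broker_id=3)]

leader.log.extend(["m0", "m1"])   # producer writes go to the leader only
followers[0].fetch_from(leader)   # broker 2 pulls and catches up
isr_before = in_sync_replicas(leader, followers)  # broker 3 is still behind

followers[1].fetch_from(leader)   # broker 3 pulls and catches up
isr_after = in_sync_replicas(leader, followers)
```

The leader never pushes data; each follower decides when to fetch, and the ISR grows or shrinks depending on which followers have kept up.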
Replication – Leader election
[Diagram: Topic B (2 partitions, replication factor of 2) on Broker 1, Broker 2, and Broker 3; leader and follower replicas of Partition 0 and Partition 1 joined by replication links; a producer and a consumer group are shown with the label "All Reads"]
Replication – Leader election
[Diagram: a producer writes to Partition 0 (Leader); Partition 1 (Leader) and Partition 2 (Follower) are also shown, with a "Pull changes" link]
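A minimal sketch of what a leader election has to decide, assuming the new leader is taken from the in-sync replicas. `elect_leader` is a hypothetical function; the slides do not say which component runs the election or how ties are broken, so picking the first surviving replica is an assumption of this example.

```python
def elect_leader(replicas, isr, failed_broker):
    """Pick a new leader for one partition after a broker fails.

    replicas: broker ids holding the partition, current leader first.
    isr: broker ids currently in sync with the leader.
    Returns the new leader's broker id, or None if no in-sync replica survives.
    """
    survivors = [b for b in replicas if b != failed_broker and b in isr]
    return survivors[0] if survivors else None


# Leader on broker 1 fails while the follower on broker 2 is in sync.
new_leader = elect_leader(replicas=[1, 2], isr=[1, 2], failed_broker=1)

# Same failure, but the follower had fallen out of the ISR.
no_leader = elect_leader(replicas=[1, 2], isr=[1], failed_broker=1)
```

Restricting candidates to the ISR means the promoted follower already holds everything the old leader had; when no such replica exists, the partition has no safe candidate.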
Components hierarchy
[Diagram: hierarchy levels labelled Cluster manager, Nodes, Stream, Scale, Replication, Followers. Zookeeper sits above Broker 0 and Broker 1; Topic 0 and Topic 1 sit under the brokers; Topic 0 has Partition 0 (Leader) and Partition 1 (Leader), each replicated to Partition (Follower) copies]
IT Team and Kafka Cluster Analogy
[Diagram: the Kafka hierarchy (Cluster manager, Nodes, Stream, Scale, Replication, Followers) beside the matching IT team hierarchy (Project manager, Dev Team, modules, Sub tasks, Knowledge Sharing, Backup team members)]
• Kafka cluster = IT team
• Zookeeper = Manager
• Broker = Developer
• Topic = Module
• Partition = Task
• Replication = Knowledge sharing
• Leader = Developer who owns the task
• Follower = Backup resource
Relationship summary
• Zookeeper (1) manages Broker (1..*)
• Broker (1) has Topic (0..1)
• Topic (1) is split into Partition (1..*)
• Partition (1) has Replica (1..*)
Demo
Read article @ https://www.linkedin.com/pulse/kafka-technical-overview-sylvester-daniel/
LinkedIn - https://www.linkedin.com/in/sylvesterdj/
Thank you

Kafka Overview and Key Concepts
