SlideShare uma empresa Scribd logo
1 de 23
www.edureka.co/apache-kafka
Apache Kafka : Next Generation Distributed Messaging System
www.edureka.co/apache-kafka
What will you learn today ?
 What is Apache Kafka ?
 Architecture of Kafka
 Multiple ways of setting Kafka cluster
 Comparing Kafka with other messaging systems
 Hands-On : Getting started with Kafka
www.edureka.co/apache-kafka
Data : The Ingredient
Data is the main ingredient of Internet applications and typically includes the following :
 Page visits and clicks
 User activities
 Events corresponding to logins
 Social networking activities such as likes, shares, and comments
 Application specific metrics (e.g. logs, page load time, performance etc.)
www.edureka.co/apache-kafka
Need : Real Time Analytics
In todays applications, activity data has become a part of production data and is used to run
analytics in real time. These analytics can be:
 Delivering advertisements to the masses
 Any abnormal user behavior or application hacking
 Search-based on relevance
 Recommendations based on popularity
www.edureka.co/apache-kafka
Messaging Systems
Messaging systems provide seamless integration among distributed applications with the
help of messages, that are shared between them
In the present big-data era, the very first challenge is to collect the data as it is a huge and the
second challenge is to analyze it, one way to solve this problem is by using messaging systems
Problem :
Solution :
www.edureka.co/apache-kafka
Apache Kafka
 Apache Kafka is a distributed publish-subscribe messaging system
 Originally developed at LinkedIn and later on became a part of Apache project
 Kafka is fast, scalable, durable and distributed by design
www.edureka.co/apache-kafka
Kafka Architecture
Producer
ConsumerConsumerConsumer
Producer Producer
Kafka Cluster
 A stream of messages of particular category is called a topic. Producers publish messages to a topic
 A Producer can be any application who can publish messages to a topic
 Consumers subscribe to topics and consume the messages
 Kafka cluster is a set of servers, each of which is called a broker
Kafka Architecture
www.edureka.co/apache-kafka
ZooKeeper and Kafka
 Each Kafka broker coordinates with other Kafka brokers using ZooKeeper
 Producers and Consumers are notified by ZooKeeper service about the presence
of new broker in Kafka system or failure of the broker in Kafka system
www.edureka.co/apache-kafka
Kafka Clusters
With Kafka we can create multiple types of clusters, such as the following :
 Single node single broker cluster
 Single node multiple broker cluster
 Multiple nodes multiple broker cluster
www.edureka.co/apache-kafka
Single Node Single Broker Cluster
Producer
Producer
Producer
Consumer
Consumer
Consumer
Kafka Broker
ZooKeeper
Single Node Single Broker Cluster
www.edureka.co/apache-kafka
Single Node Multiple Broker Cluster
Producer
Producer
Producer
Consumer
Consumer
Consumer
ZooKeeper
Single Node Multiple Broker Cluster
Broker 1
Broker 2
Broker 3
www.edureka.co/apache-kafka
Multiple Node Multiple Broker Cluster
Producer
Producer
Producer
Consumer
Consumer
Consumer
ZooKeeper
Multiple Node Multiple Broker Cluster
Broker 1
Broker 2
Broker 1
Broker 2
Node 1
Node 2
www.edureka.co/apache-kafka
Kafka @ LinkedIn
LinkedIn Newsfeed is powered by Kafka
LinkedIn recommendations are powered by Kafka
www.edureka.co/apache-kafka
Kafka @ LinkedIn
LinkedIn notifications are powered by Kafka
Apart from this LinkedIn uses Kafka for many
other purposes like log monitoring, performance
metrics, search improvement etc.
www.edureka.co/apache-kafka
Who else uses Kafka ?
DataSift uses Kafka as a collector of monitoring events and to track user’s
consumption of data streams in real time
Wooga uses Kafka to aggregate and process tracking data from all their
facebook games (hosted at various providers) in a central location
Spongecell uses Kafka to run their entire analytics and monitoring pipeline
driving both real-time and ETL applications
Loggly is the world's most popular cloud-based log management. It uses
Kafka for log collection
An exhaustive list of companies using Kafka can be found here : https://cwiki.apache.org/confluence/display/KAFKA/Powered+By
www.edureka.co/apache-kafka
But what about
other messaging
systems e.g.
ActiveMQ,
RabbitMQ etc.
www.edureka.co/apache-kafka
Comparing Messaging Systems
Kafka has a more efficient storage
format.
On average, each message had an
overhead of 9 bytes in Kafka,
versus 144 bytes in ActiveMQ
www.edureka.co/apache-kafka
Comparing Messaging Systems
In both ActiveMQ and RabbitMQ
brokers maintain delivery state of
every message by writing to disk
but in case of Kafka there is no
disk write which makes it fast
www.edureka.co/apache-kafka
Hands-on
Getting Started with Kafka
www.edureka.co/apache-kafka
References
Apache Kafka :
http://kafka.apache.org/
Kafka Papers :
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+papers+and+presentations
Powered by Kafka :
https://cwiki.apache.org/confluence/display/KAFKA/Powered+By
LinkedIn Performance Insights :
https://engineering.linkedin.com/samza/real-time-insights-linkedins-performance-using-apache-samza
www.edureka.co/apache-kafka
Survey
Your feedback is vital for us, be it a compliment, a suggestion or a complaint. It helps us to make your
experience better!
Please spare few minutes to take the survey after the webinar.
www.edureka.co/apache-kafka
Course Details
Edureka's Apache Kafka course:
• Introduction of course
• Online Live Classes: 15 hours
• Assignments: 25 hours
• Project: 20 hours
• Lifetime Access + 24 X 7 Support
Go to www.edureka.co/apache-kafka
Batch starts from 07 November (Weekend Batch)
www.edureka.co/apache-kafka
Thank You …
Questions/Queries/Feedback
Recording and presentation will be made available to you within 24 hours

Mais conteúdo relacionado

Mais procurados

Mais procurados (20)

Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Apache kafka introduction
Apache kafka introductionApache kafka introduction
Apache kafka introduction
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafka
 
Kafka tutorial
Kafka tutorialKafka tutorial
Kafka tutorial
 
ES & Kafka
ES & KafkaES & Kafka
ES & Kafka
 
Fundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache KafkaFundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache Kafka
 
Kafka blr-meetup-presentation - Kafka internals
Kafka blr-meetup-presentation - Kafka internalsKafka blr-meetup-presentation - Kafka internals
Kafka blr-meetup-presentation - Kafka internals
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafka
 
Apache Kafka - Martin Podval
Apache Kafka - Martin PodvalApache Kafka - Martin Podval
Apache Kafka - Martin Podval
 
Apache Kafka
Apache Kafka Apache Kafka
Apache Kafka
 
Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013
Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013
Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013
 
Kafka Overview
Kafka OverviewKafka Overview
Kafka Overview
 
Apache Kafka - Free Friday
Apache Kafka - Free FridayApache Kafka - Free Friday
Apache Kafka - Free Friday
 
Building Stream Infrastructure across Multiple Data Centers with Apache Kafka
Building Stream Infrastructure across Multiple Data Centers with Apache KafkaBuilding Stream Infrastructure across Multiple Data Centers with Apache Kafka
Building Stream Infrastructure across Multiple Data Centers with Apache Kafka
 
Messaging queue - Kafka
Messaging queue - KafkaMessaging queue - Kafka
Messaging queue - Kafka
 
Apache Kafka - Messaging System Overview
Apache Kafka - Messaging System OverviewApache Kafka - Messaging System Overview
Apache Kafka - Messaging System Overview
 
Introduction Apache Kafka
Introduction Apache KafkaIntroduction Apache Kafka
Introduction Apache Kafka
 
Hello, kafka! (an introduction to apache kafka)
Hello, kafka! (an introduction to apache kafka)Hello, kafka! (an introduction to apache kafka)
Hello, kafka! (an introduction to apache kafka)
 

Destaque

Building Kafka-powered Activity Stream
Building Kafka-powered Activity StreamBuilding Kafka-powered Activity Stream
Building Kafka-powered Activity Stream
Oleksiy Holubyev
 

Destaque (11)

Apache ActiveMQ
Apache ActiveMQApache ActiveMQ
Apache ActiveMQ
 
Kafka Workshop
Kafka WorkshopKafka Workshop
Kafka Workshop
 
Devoxx Morocco 2016 - Microservices with Kafka
Devoxx Morocco 2016 - Microservices with KafkaDevoxx Morocco 2016 - Microservices with Kafka
Devoxx Morocco 2016 - Microservices with Kafka
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !
 
Scaling MQTT With Apache Kafka
Scaling MQTT With Apache KafkaScaling MQTT With Apache Kafka
Scaling MQTT With Apache Kafka
 
Being Ready for Apache Kafka - Apache: Big Data Europe 2015
Being Ready for Apache Kafka - Apache: Big Data Europe 2015Being Ready for Apache Kafka - Apache: Big Data Europe 2015
Being Ready for Apache Kafka - Apache: Big Data Europe 2015
 
Building Kafka-powered Activity Stream
Building Kafka-powered Activity StreamBuilding Kafka-powered Activity Stream
Building Kafka-powered Activity Stream
 
Netflix Data Pipeline With Kafka
Netflix Data Pipeline With KafkaNetflix Data Pipeline With Kafka
Netflix Data Pipeline With Kafka
 
Kafka as Message Broker
Kafka as Message BrokerKafka as Message Broker
Kafka as Message Broker
 
Message queue demo
Message queue demoMessage queue demo
Message queue demo
 
Apache Kafka 0.8 basic training - Verisign
Apache Kafka 0.8 basic training - VerisignApache Kafka 0.8 basic training - Verisign
Apache Kafka 0.8 basic training - Verisign
 

Semelhante a Apache Kafka: Next Generation Distributed Messaging System

Unlocking the Power of Apache Kafka: How Kafka Listeners Facilitate Real-time...
Unlocking the Power of Apache Kafka: How Kafka Listeners Facilitate Real-time...Unlocking the Power of Apache Kafka: How Kafka Listeners Facilitate Real-time...
Unlocking the Power of Apache Kafka: How Kafka Listeners Facilitate Real-time...
Denodo
 

Semelhante a Apache Kafka: Next Generation Distributed Messaging System (20)

Fault Tolerance with Kafka
Fault Tolerance with KafkaFault Tolerance with Kafka
Fault Tolerance with Kafka
 
How Apache Kafka is transforming Hadoop, Spark and Storm
How Apache Kafka is transforming Hadoop, Spark and StormHow Apache Kafka is transforming Hadoop, Spark and Storm
How Apache Kafka is transforming Hadoop, Spark and Storm
 
How kafka is transforming hadoop, spark & storm
How kafka is transforming hadoop, spark & stormHow kafka is transforming hadoop, spark & storm
How kafka is transforming hadoop, spark & storm
 
Python Kafka Integration: Developers Guide
Python Kafka Integration: Developers GuidePython Kafka Integration: Developers Guide
Python Kafka Integration: Developers Guide
 
Edbt19 paper 329
Edbt19 paper 329Edbt19 paper 329
Edbt19 paper 329
 
Kafka overview
Kafka overviewKafka overview
Kafka overview
 
Apache kafka configuration-guide
Apache kafka configuration-guideApache kafka configuration-guide
Apache kafka configuration-guide
 
Apache Kafka - Strakin Technologies Pvt Ltd
Apache Kafka - Strakin Technologies Pvt LtdApache Kafka - Strakin Technologies Pvt Ltd
Apache Kafka - Strakin Technologies Pvt Ltd
 
Connecting Apache Kafka With Mule ESB
Connecting Apache Kafka With Mule ESBConnecting Apache Kafka With Mule ESB
Connecting Apache Kafka With Mule ESB
 
Kafka Streams for Java enthusiasts
Kafka Streams for Java enthusiastsKafka Streams for Java enthusiasts
Kafka Streams for Java enthusiasts
 
Confluent Enterprise Datasheet
Confluent Enterprise DatasheetConfluent Enterprise Datasheet
Confluent Enterprise Datasheet
 
Streaming the platform with Confluent (Apache Kafka)
Streaming the platform with Confluent (Apache Kafka)Streaming the platform with Confluent (Apache Kafka)
Streaming the platform with Confluent (Apache Kafka)
 
kafka_session_updated.pptx
kafka_session_updated.pptxkafka_session_updated.pptx
kafka_session_updated.pptx
 
Building Event-Driven Systems with Apache Kafka
Building Event-Driven Systems with Apache KafkaBuilding Event-Driven Systems with Apache Kafka
Building Event-Driven Systems with Apache Kafka
 
Session 23 - Kafka and Zookeeper
Session 23 - Kafka and ZookeeperSession 23 - Kafka and Zookeeper
Session 23 - Kafka and Zookeeper
 
Kafka for data scientists
Kafka for data scientistsKafka for data scientists
Kafka for data scientists
 
Unlocking the Power of Apache Kafka: How Kafka Listeners Facilitate Real-time...
Unlocking the Power of Apache Kafka: How Kafka Listeners Facilitate Real-time...Unlocking the Power of Apache Kafka: How Kafka Listeners Facilitate Real-time...
Unlocking the Power of Apache Kafka: How Kafka Listeners Facilitate Real-time...
 
Apache Kafka with Spark Streaming: Real-time Analytics Redefined
Apache Kafka with Spark Streaming: Real-time Analytics RedefinedApache Kafka with Spark Streaming: Real-time Analytics Redefined
Apache Kafka with Spark Streaming: Real-time Analytics Redefined
 
Building streaming data applications using Kafka*[Connect + Core + Streams] b...
Building streaming data applications using Kafka*[Connect + Core + Streams] b...Building streaming data applications using Kafka*[Connect + Core + Streams] b...
Building streaming data applications using Kafka*[Connect + Core + Streams] b...
 
Kafka for Scale
Kafka for ScaleKafka for Scale
Kafka for Scale
 

Mais de Edureka!

Mais de Edureka! (20)

What to learn during the 21 days Lockdown | Edureka
What to learn during the 21 days Lockdown | EdurekaWhat to learn during the 21 days Lockdown | Edureka
What to learn during the 21 days Lockdown | Edureka
 
Top 10 Dying Programming Languages in 2020 | Edureka
Top 10 Dying Programming Languages in 2020 | EdurekaTop 10 Dying Programming Languages in 2020 | Edureka
Top 10 Dying Programming Languages in 2020 | Edureka
 
Top 5 Trending Business Intelligence Tools | Edureka
Top 5 Trending Business Intelligence Tools | EdurekaTop 5 Trending Business Intelligence Tools | Edureka
Top 5 Trending Business Intelligence Tools | Edureka
 
Tableau Tutorial for Data Science | Edureka
Tableau Tutorial for Data Science | EdurekaTableau Tutorial for Data Science | Edureka
Tableau Tutorial for Data Science | Edureka
 
Python Programming Tutorial | Edureka
Python Programming Tutorial | EdurekaPython Programming Tutorial | Edureka
Python Programming Tutorial | Edureka
 
Top 5 PMP Certifications | Edureka
Top 5 PMP Certifications | EdurekaTop 5 PMP Certifications | Edureka
Top 5 PMP Certifications | Edureka
 
Top Maven Interview Questions in 2020 | Edureka
Top Maven Interview Questions in 2020 | EdurekaTop Maven Interview Questions in 2020 | Edureka
Top Maven Interview Questions in 2020 | Edureka
 
Linux Mint Tutorial | Edureka
Linux Mint Tutorial | EdurekaLinux Mint Tutorial | Edureka
Linux Mint Tutorial | Edureka
 
How to Deploy Java Web App in AWS| Edureka
How to Deploy Java Web App in AWS| EdurekaHow to Deploy Java Web App in AWS| Edureka
How to Deploy Java Web App in AWS| Edureka
 
Importance of Digital Marketing | Edureka
Importance of Digital Marketing | EdurekaImportance of Digital Marketing | Edureka
Importance of Digital Marketing | Edureka
 
RPA in 2020 | Edureka
RPA in 2020 | EdurekaRPA in 2020 | Edureka
RPA in 2020 | Edureka
 
Email Notifications in Jenkins | Edureka
Email Notifications in Jenkins | EdurekaEmail Notifications in Jenkins | Edureka
Email Notifications in Jenkins | Edureka
 
EA Algorithm in Machine Learning | Edureka
EA Algorithm in Machine Learning | EdurekaEA Algorithm in Machine Learning | Edureka
EA Algorithm in Machine Learning | Edureka
 
Cognitive AI Tutorial | Edureka
Cognitive AI Tutorial | EdurekaCognitive AI Tutorial | Edureka
Cognitive AI Tutorial | Edureka
 
AWS Cloud Practitioner Tutorial | Edureka
AWS Cloud Practitioner Tutorial | EdurekaAWS Cloud Practitioner Tutorial | Edureka
AWS Cloud Practitioner Tutorial | Edureka
 
Blue Prism Top Interview Questions | Edureka
Blue Prism Top Interview Questions | EdurekaBlue Prism Top Interview Questions | Edureka
Blue Prism Top Interview Questions | Edureka
 
Big Data on AWS Tutorial | Edureka
Big Data on AWS Tutorial | Edureka Big Data on AWS Tutorial | Edureka
Big Data on AWS Tutorial | Edureka
 
A star algorithm | A* Algorithm in Artificial Intelligence | Edureka
A star algorithm | A* Algorithm in Artificial Intelligence | EdurekaA star algorithm | A* Algorithm in Artificial Intelligence | Edureka
A star algorithm | A* Algorithm in Artificial Intelligence | Edureka
 
Kubernetes Installation on Ubuntu | Edureka
Kubernetes Installation on Ubuntu | EdurekaKubernetes Installation on Ubuntu | Edureka
Kubernetes Installation on Ubuntu | Edureka
 
Introduction to DevOps | Edureka
Introduction to DevOps | EdurekaIntroduction to DevOps | Edureka
Introduction to DevOps | Edureka
 

Último

Último (20)

Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 

Apache Kafka: Next Generation Distributed Messaging System