SlideShare uma empresa Scribd logo
1 de 38
Baixar para ler offline
BASEL BERN BRUGG DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. GENF
HAMBURG KOPENHAGEN LAUSANNE MÜNCHEN STUTTGART WIEN ZÜRICH
Apache Kafka
Scalable Message Processing and more!
Guido Schmutz
@gschmutz guidoschmutz.wordpress.com
Guido Schmutz
Working at Trivadis for more than 20 years
Oracle ACE Director for Fusion Middleware and SOA
Consultant, Trainer Software Architect for Java, Oracle, SOA and
Big Data / Fast Data
Member of Trivadis Architecture Board
Technology Manager @ Trivadis
More than 30 years of software development experience
Contact: guido.schmutz@trivadis.com
Blog: http://guidoschmutz.wordpress.com
Slideshare: http://www.slideshare.net/gschmutz
Twitter: gschmutz
2 8.12.2016 Big Data & Fast Data
Agenda
1. Introduction & Motivation
2. Kafka Core
3. Kafka Connect
4. Kafka Streams
5. Kafka and ”Big Data” / ”Fast Data” Ecosystem
6. Confluent Data Platform
7. Summary
Apache Kafka - Scalable Message Processing and more!3
Introduction & Motivation
Apache Kafka - Scalable Message Processing and more!4
Hadoop Clusterd
Hadoop Cluster
Big Data Cluster
Traditional Big Data Architecture
BI	Tools
Enterprise Data
Warehouse
Billing &
Ordering
CRM /
Profile
Marketing
Campaigns
File Import / SQL Import
SQL
Search
Online	&	Mobile	
Apps
Search
NoSQL
Parallel
Processing
Distributed
Filesystem
• Machine	Learning
• Graph	Algorithms
• Natural	Language	Processing
Event
Hub
Event
Hub
Hadoop Clusterd
Hadoop Cluster
Big Data Cluster
Event Hub – handle event stream data
BI	Tools
Enterprise Data
Warehouse
Location
Social
Click
stream
Sensor
Data
Billing &
Ordering
CRM /
Profile
Marketing
Campaigns
Event
Hub
Call
Center
Weather
Data
Mobile
Apps
SQL
Search
Online	&	Mobile	
Apps
Search
Data Flow
NoSQL
Parallel
Processing
Distributed
Filesystem
• Machine	Learning
• Graph	Algorithms
• Natural	Language	Processing
Hadoop Clusterd
Hadoop Cluster
Big Data Cluster
Event Hub – taking Velocity into account
Location
Social
Click
stream
Sensor
Data
Billing &
Ordering
CRM /
Profile
Marketing
Campaigns
Call
Center
Mobile
Apps
Batch Analytics
Streaming Analytics
Event
Hub
Event
Hub
Event
Hub
NoSQL
Parallel
Processing
Distributed
Filesystem
Stream Analytics
NoSQL
Reference /
Models
SQL
Search
Dashboard
BI	Tools
Enterprise Data
Warehouse
Search
Online	&	Mobile	
Apps
File Import / SQL Import
Weather
Data
Apache Kafka - Scalable Message Processing and more!7
Kafka Stream Data Platform
Source:	Confluent
Apache Kafka - Scalable Message Processing and more!8
Kafka Core
Apache Kafka - Scalable Message Processing and more!9
Apache Kafka - Overview
Distributed publish-subscribe messaging system
Designed for processing of real time activity stream data (logs, metrics
collections, social media streams, …)
Initially developed at LinkedIn, now part of Apache
Does not use JMS API and standards
Kafka maintains feeds of messages in topics
Apache Kafka - Scalable Message Processing and more!10
Apache Kafka - Motivation
LinkedIn’s motivation for Kafka was:
• “A unified platform for handling all the real-time data feeds a large company might
have.”
Must haves
• High throughput to support high volume event feeds.
• Support real-time processing of these feeds to create new, derived feeds.
• Support large data backlogs to handle periodic ingestion from offline systems.
• Support low-latency delivery to handle more traditional messaging use cases.
• Guarantee fault-tolerance in the presence of machine failures.
Apache Kafka - Scalable Message Processing and more!11
Kafka High Level Architecture
The who is who
• Producers write data to brokers.
• Consumers read data from
brokers.
• All this is distributed.
The data
• Data is stored in topics.
• Topics are split into partitions,
which are replicated.
Kafka Cluster
Consumer Consumer Consumer
Producer Producer Producer
Broker 1 Broker 2 Broker 3
Zookeeper
Ensemble
Apache Kafka - Scalable Message Processing and more!12
Apache Kafka - Architecture
Kafka Broker
Movement
Processor
Movement	Topic
Engine-Metrics	Topic
1 2 3 4 5 6
Engine
Processor1 2 3 4 5 6
Truck
Apache Kafka - Scalable Message Processing and more!13
Apache Kafka - Architecture
Kafka Broker
Movement
Processor
Movement	Topic
Engine-Metrics	Topic
1 2 3 4 5 6
Engine
Processor
Partition	0
1 2 3 4 5 6
Partition	0
1 2 3 4 5 6
Partition	1 Movement
Processor
Truck
Apache Kafka - Scalable Message Processing and more!14
Apache
Kafka
Kafka Broker 1
Movement
Processor
Truck
Movement	Topic
P	0
Movement
Processor
1 2 3 4 5
P	2 1 2 3 4 5
Kafka Broker 2
Movement	Topic
P	2 1 2 3 4 5
P	1 1 2 3 4 5
Kafka Broker 3
Movement	Topic
P	0 1 2 3 4 5
P	1 1 2 3 4 5
Movement
Processor
Apache Kafka - Scalable Message Processing and more!15
Apache Kafka - Architecture
• Write Ahead Log / Commit Log
• Producers always append to tail
• think append to file
Kafka Broker
Movement	Topic
1 2 3 4 5
Truck
6 6
Apache Kafka - Scalable Message Processing and more!16
Durability Guarantees
Producer can configure acknowledgements
Value Impact Durability
0 • Producer	doesn’t	wait	for	leader weak
1	(default) • Producer	waits	for	leader
• Leader	sends ack when	message	written	to	log
• No	wait	for	followers
medium
all • Producer	waits	for	leader
• Leader	sends	ack when all	In-Sync	Replica	have	
acknowledged
strong
Apache Kafka - Scalable Message Processing and more!17
Apache Kafka - Partition offsets
Offset: messages in the partitions are each assigned a unique (per partition) and
sequential id called the offset
• Consumers track their pointers via (offset, partition, topic) tuples
Consumer	Group	A Consumer	Group	B
Apache Kafka - Scalable Message Processing and more!18
Source:	Apache	Kafka
Data Retention – 3 options
1. Never
2. Time based (TTL)
log.retention.{ms | minutes | hours}
3. Size based
log.retention.bytes
4. Log compaction based (entries with same key are removed)
kafka-topics.sh --zookeeper localhost:2181 
--create --topic customers 
--replication-factor 1 --partitions 1 
--config cleanup.policy=compact
Apache Kafka - Scalable Message Processing and more!19
Apache Kafka – Some numbers
Kafka at LinkedIn => over 1800+ broker machines / 79K+ Topics
Kafka Performance at our own infrastructure => 6 brokers (VM) / 1 cluster
• 445’622 messages/second
• 31 MB / second
• 3.0405 ms average latency between producer / consumer
1.3	Trillion	messages	per	
day
330	Terabytes	in/day
1.2	Petabytes	out/day
Peak	load	for	a	single	cluster
2	million	messages/sec
4.7	Gigabits/sec	inbound
15	Gigabits/sec	outbound
http://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines
https://engineering.linkedin.com/kafka/running-kafka-scale
Apache Kafka - Scalable Message Processing and more!20
Kafka Topics
Creating a topic
• Command line interface
• Using AdminUtils.createTopic method
• Auto-create via auto.create.topics.enable = true
Modifying a topic
https://kafka.apache.org/documentation.html#basic_ops_modify_topic
Deleting a topic
• Command Line interface
$ kafka-topics.sh –zookeeper zk1:2181 --create 
--topic my.topic –-partitions 3 
–-replication-factor 2 --config x=y
Apache Kafka - Scalable Message Processing and more!21
Inspecting the current state of a topic
Use the --describe option
• Leader: brokerID of the currently elected leader broker
• Replica ID’s = broker ID’s
• ISR = “in-sync replica”, replicas that are in sync with the leader. In this example:
• Broker 0 is leader for partition 1.
• Broker 1 is leader for partitions 0 and 2.
• All replicas are in-sync with their respective leader partitions.
$ kafka-topics.sh –zookeeper zk1:2181 –-describe --topic my.topic
Topic:zerg2.hydra PartitionCount:3 ReplicationFactor:2 Configs:
Topic: my.topic Partition: 0 Leader: 1 Replicas: 1,0 Isr: 1,0
Topic: my.topic Partition: 1 Leader: 0 Replicas: 0,1 Isr: 0,1
Topic: my.topic Partition: 2 Leader: 1 Replicas: 1,0 Isr: 1,0
Apache Kafka - Scalable Message Processing and more!22
Kafka Connect
Apache Kafka - Scalable Message Processing and more!23
Kafka Connect Architecture
Apache Kafka - Scalable Message Processing and more!24
Source:	Confluent
Kafka Connector Hub – Certified Connectors
Source:	http://www.confluent.io/product/connectors
Apache Kafka - Scalable Message Processing and more!25
Kafka Connector Hub – Additional Connectors
Source:	http://www.confluent.io/product/connectors
Apache Kafka - Scalable Message Processing and more!26
Kafka Streams
Apache Kafka - Scalable Message Processing and more!27
Kafka Streams
• Designed as a simple and lightweight library in Apache Kafka
• no external dependencies on systems other than Apache Kafka
• Leverages Kafka as its internal messaging layer
• agnostic to resource management and configuration tools
• Supports fault-tolerant local state
• Event-at-a-time processing (not microbatch) with millisecond latency
• Windowing with out-of-order data using a Google DataFlow-like model
Apache Kafka - Scalable Message Processing and more!28
Kafka Streams - Architecture
Apache Kafka - Scalable Message Processing and more!29
topology defines the stream
processing computational logic for
your application
topology is a graph of stream
processors (nodes) that are
connected by streams (edges)
source processor is a stream
processor that does not have any
upstream processors
sink processor is a special type
of stream processor that does not
have down-stream processors.
Source:	Confluent
Kafka Streams - Processor Topology
Apache Kafka - Scalable Message Processing and more!30
topology defines the stream processing
computational logic for your application
topology is a graph of stream processors
(nodes) that are connected by streams (edges)
source processor is a stream processor that
does not have any upstream processors.
Consumes one or Kafka topics.
sink processor is a special type of stream
processor that does not have down-stream
processors. Produces to a single Kafka topic.
Source:	Confluent
Kafka and ”Big Data” / ”Fast Data”
Ecosystem
Apache Kafka - Scalable Message Processing and more!31
Kafka and the Big Data / Fast Data ecosystem
Kafka integrates with many popular
products / frameworks
• Apache Spark Streaming
• Apache Flink
• Apache Storm
• Apache NiFi
• Streamsets
• Apache Flume
• Oracle Stream Analytics
• Oracle Service Bus
• Oracle GoldenGate
• Spring Integration Kafka Support
• …Storm	built-in	Kafka	Spout	to	consume	events	from	Kafka
Apache Kafka - Scalable Message Processing and more!32
Confluent Platform
Apache Kafka - Scalable Message Processing and more!33
Confluent Data Platform 3.1
Apache Kafka - Scalable Message Processing and more!34
Source:	Confluent
Summary
Apache Kafka - Scalable Message Processing and more!35
Weather
Data
SQL Import
Hadoop Clusterd
Hadoop Cluster
Hadoop Cluster
Location
Social
Click
stream
Sensor
Data
Billing &
Ordering
CRM /
Profile
Marketing
Campaigns
Call
Center
Mobile
Apps
Batch Analytics
Streaming Analytics
Event
Hub
Event
Hub
Event
Hub
NoSQL
Parallel
Processing
Distributed
Filesystem
Stream Analytics
NoSQL
Reference /
Models
SQL
Search
Dashboard
BI	Tools
Enterprise Data
Warehouse
Search
Online	&	Mobile	
Apps
Customer Event Hub – mapping of technologies
Apache Kafka - Scalable Message Processing and more!36
Summary
• Kafka can scale to millions of messages per second, and more
• Easy to start with for a PoC
• A bit more to invest to setup production environment
• Monitoring is key
• Vibrant community and ecosystem
• Fast pace technology
• Confluent provides Kafka Distribution
Apache Kafka - Scalable Message Processing and more!37
Guido Schmutz
Technology Manager
guido.schmutz@trivadis.com
Apache Kafka - Scalable Message Processing and more!38
@gschmutz guidoschmutz.wordpress.com

Mais conteúdo relacionado

Mais procurados

Kafka Connect & Kafka Streams/KSQL - powerful ecosystem around Kafka core
Kafka Connect & Kafka Streams/KSQL - powerful ecosystem around Kafka coreKafka Connect & Kafka Streams/KSQL - powerful ecosystem around Kafka core
Kafka Connect & Kafka Streams/KSQL - powerful ecosystem around Kafka coreGuido Schmutz
 
Self-Service Data Ingestion Using NiFi, StreamSets & Kafka
Self-Service Data Ingestion Using NiFi, StreamSets & KafkaSelf-Service Data Ingestion Using NiFi, StreamSets & Kafka
Self-Service Data Ingestion Using NiFi, StreamSets & KafkaGuido Schmutz
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming VisualizationGuido Schmutz
 
Microservices with Kafka Ecosystem
Microservices with Kafka EcosystemMicroservices with Kafka Ecosystem
Microservices with Kafka EcosystemGuido Schmutz
 
Big Data - in the cloud or rather on-premises?
Big Data - in the cloud or rather on-premises?Big Data - in the cloud or rather on-premises?
Big Data - in the cloud or rather on-premises?Guido Schmutz
 
Ingesting streaming data into Graph Database
Ingesting streaming data into Graph DatabaseIngesting streaming data into Graph Database
Ingesting streaming data into Graph DatabaseGuido Schmutz
 
Apache Kafka Scalable Message Processing and more!
Apache Kafka Scalable Message Processing and more! Apache Kafka Scalable Message Processing and more!
Apache Kafka Scalable Message Processing and more! Guido Schmutz
 
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache KafkaSolutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache KafkaGuido Schmutz
 
Kafka as your Data Lake - is it Feasible?
Kafka as your Data Lake - is it Feasible?Kafka as your Data Lake - is it Feasible?
Kafka as your Data Lake - is it Feasible?Guido Schmutz
 
Apache Kafka - A modern Stream Processing Platform
Apache Kafka - A modern Stream Processing PlatformApache Kafka - A modern Stream Processing Platform
Apache Kafka - A modern Stream Processing PlatformGuido Schmutz
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Guido Schmutz
 
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & PartitioningApache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & PartitioningGuido Schmutz
 
Introduction to Streaming Analytics
Introduction to Streaming AnalyticsIntroduction to Streaming Analytics
Introduction to Streaming AnalyticsGuido Schmutz
 
Kafka for data scientists
Kafka for data scientistsKafka for data scientists
Kafka for data scientistsJenn Rawlins
 
dotScale 2017 Keynote: The Rise of Real Time by Neha Narkhede
dotScale 2017 Keynote: The Rise of Real Time by Neha NarkhededotScale 2017 Keynote: The Rise of Real Time by Neha Narkhede
dotScale 2017 Keynote: The Rise of Real Time by Neha Narkhedeconfluent
 
Big Data Architectures @ JAX / BigDataCon 2016
Big Data Architectures @ JAX / BigDataCon 2016Big Data Architectures @ JAX / BigDataCon 2016
Big Data Architectures @ JAX / BigDataCon 2016Guido Schmutz
 
Fast data for fitness 10 nov 2020
Fast data for fitness 10 nov 2020Fast data for fitness 10 nov 2020
Fast data for fitness 10 nov 2020Timothy Spann
 
Evolving from Messaging to Event Streaming
Evolving from Messaging to Event StreamingEvolving from Messaging to Event Streaming
Evolving from Messaging to Event Streamingconfluent
 
Solutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache KafkaSolutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache KafkaGuido Schmutz
 
Building event-driven (Micro)Services with Apache Kafka
Building event-driven (Micro)Services with Apache Kafka Building event-driven (Micro)Services with Apache Kafka
Building event-driven (Micro)Services with Apache Kafka Guido Schmutz
 

Mais procurados (20)

Kafka Connect & Kafka Streams/KSQL - powerful ecosystem around Kafka core
Kafka Connect & Kafka Streams/KSQL - powerful ecosystem around Kafka coreKafka Connect & Kafka Streams/KSQL - powerful ecosystem around Kafka core
Kafka Connect & Kafka Streams/KSQL - powerful ecosystem around Kafka core
 
Self-Service Data Ingestion Using NiFi, StreamSets & Kafka
Self-Service Data Ingestion Using NiFi, StreamSets & KafkaSelf-Service Data Ingestion Using NiFi, StreamSets & Kafka
Self-Service Data Ingestion Using NiFi, StreamSets & Kafka
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming Visualization
 
Microservices with Kafka Ecosystem
Microservices with Kafka EcosystemMicroservices with Kafka Ecosystem
Microservices with Kafka Ecosystem
 
Big Data - in the cloud or rather on-premises?
Big Data - in the cloud or rather on-premises?Big Data - in the cloud or rather on-premises?
Big Data - in the cloud or rather on-premises?
 
Ingesting streaming data into Graph Database
Ingesting streaming data into Graph DatabaseIngesting streaming data into Graph Database
Ingesting streaming data into Graph Database
 
Apache Kafka Scalable Message Processing and more!
Apache Kafka Scalable Message Processing and more! Apache Kafka Scalable Message Processing and more!
Apache Kafka Scalable Message Processing and more!
 
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache KafkaSolutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
 
Kafka as your Data Lake - is it Feasible?
Kafka as your Data Lake - is it Feasible?Kafka as your Data Lake - is it Feasible?
Kafka as your Data Lake - is it Feasible?
 
Apache Kafka - A modern Stream Processing Platform
Apache Kafka - A modern Stream Processing PlatformApache Kafka - A modern Stream Processing Platform
Apache Kafka - A modern Stream Processing Platform
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !
 
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & PartitioningApache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
 
Introduction to Streaming Analytics
Introduction to Streaming AnalyticsIntroduction to Streaming Analytics
Introduction to Streaming Analytics
 
Kafka for data scientists
Kafka for data scientistsKafka for data scientists
Kafka for data scientists
 
dotScale 2017 Keynote: The Rise of Real Time by Neha Narkhede
dotScale 2017 Keynote: The Rise of Real Time by Neha NarkhededotScale 2017 Keynote: The Rise of Real Time by Neha Narkhede
dotScale 2017 Keynote: The Rise of Real Time by Neha Narkhede
 
Big Data Architectures @ JAX / BigDataCon 2016
Big Data Architectures @ JAX / BigDataCon 2016Big Data Architectures @ JAX / BigDataCon 2016
Big Data Architectures @ JAX / BigDataCon 2016
 
Fast data for fitness 10 nov 2020
Fast data for fitness 10 nov 2020Fast data for fitness 10 nov 2020
Fast data for fitness 10 nov 2020
 
Evolving from Messaging to Event Streaming
Evolving from Messaging to Event StreamingEvolving from Messaging to Event Streaming
Evolving from Messaging to Event Streaming
 
Solutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache KafkaSolutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
 
Building event-driven (Micro)Services with Apache Kafka
Building event-driven (Micro)Services with Apache Kafka Building event-driven (Micro)Services with Apache Kafka
Building event-driven (Micro)Services with Apache Kafka
 

Semelhante a Apache Kafka - Scalable Message-Processing and more !

Trivadis TechEvent 2016 Apache Kafka - Scalable Massage Processing and more! ...
Trivadis TechEvent 2016 Apache Kafka - Scalable Massage Processing and more! ...Trivadis TechEvent 2016 Apache Kafka - Scalable Massage Processing and more! ...
Trivadis TechEvent 2016 Apache Kafka - Scalable Massage Processing and more! ...Trivadis
 
Introduction to Kafka Streams Presentation
Introduction to Kafka Streams PresentationIntroduction to Kafka Streams Presentation
Introduction to Kafka Streams PresentationKnoldus Inc.
 
Being Ready for Apache Kafka - Apache: Big Data Europe 2015
Being Ready for Apache Kafka - Apache: Big Data Europe 2015Being Ready for Apache Kafka - Apache: Big Data Europe 2015
Being Ready for Apache Kafka - Apache: Big Data Europe 2015Michael Noll
 
Kafka Streams for Java enthusiasts
Kafka Streams for Java enthusiastsKafka Streams for Java enthusiasts
Kafka Streams for Java enthusiastsSlim Baltagi
 
Building streaming data applications using Kafka*[Connect + Core + Streams] b...
Building streaming data applications using Kafka*[Connect + Core + Streams] b...Building streaming data applications using Kafka*[Connect + Core + Streams] b...
Building streaming data applications using Kafka*[Connect + Core + Streams] b...Data Con LA
 
Building Streaming Data Applications Using Apache Kafka
Building Streaming Data Applications Using Apache KafkaBuilding Streaming Data Applications Using Apache Kafka
Building Streaming Data Applications Using Apache KafkaSlim Baltagi
 
Fundamentals of Apache Kafka
Fundamentals of Apache KafkaFundamentals of Apache Kafka
Fundamentals of Apache KafkaChhavi Parasher
 
Apache Kafka
Apache KafkaApache Kafka
Apache KafkaJoe Stein
 
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)Kai Wähner
 
Apache Kafka 0.8 basic training - Verisign
Apache Kafka 0.8 basic training - VerisignApache Kafka 0.8 basic training - Verisign
Apache Kafka 0.8 basic training - VerisignMichael Noll
 
OSSNA Building Modern Data Streaming Apps
OSSNA Building Modern Data Streaming AppsOSSNA Building Modern Data Streaming Apps
OSSNA Building Modern Data Streaming AppsTimothy Spann
 
Building Event-Driven Systems with Apache Kafka
Building Event-Driven Systems with Apache KafkaBuilding Event-Driven Systems with Apache Kafka
Building Event-Driven Systems with Apache KafkaBrian Ritchie
 
Real-Time Log Analysis with Apache Mesos, Kafka and Cassandra
Real-Time Log Analysis with Apache Mesos, Kafka and CassandraReal-Time Log Analysis with Apache Mesos, Kafka and Cassandra
Real-Time Log Analysis with Apache Mesos, Kafka and CassandraJoe Stein
 
Lessons Learned: Using Spark and Microservices
Lessons Learned: Using Spark and MicroservicesLessons Learned: Using Spark and Microservices
Lessons Learned: Using Spark and MicroservicesAlexis Seigneurin
 
Kafka Summit NYC 2017 Introduction to Kafka Streams with a Real-life Example
Kafka Summit NYC 2017 Introduction to Kafka Streams with a Real-life ExampleKafka Summit NYC 2017 Introduction to Kafka Streams with a Real-life Example
Kafka Summit NYC 2017 Introduction to Kafka Streams with a Real-life Exampleconfluent
 
Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
Kafka Connect & Kafka Streams/KSQL - the ecosystem around KafkaKafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
Kafka Connect & Kafka Streams/KSQL - the ecosystem around KafkaGuido Schmutz
 
Apache Kafka DC Meetup: Replicating DB Binary Logs to Kafka
Apache Kafka DC Meetup: Replicating DB Binary Logs to KafkaApache Kafka DC Meetup: Replicating DB Binary Logs to Kafka
Apache Kafka DC Meetup: Replicating DB Binary Logs to KafkaMark Bittmann
 

Semelhante a Apache Kafka - Scalable Message-Processing and more ! (20)

Trivadis TechEvent 2016 Apache Kafka - Scalable Massage Processing and more! ...
Trivadis TechEvent 2016 Apache Kafka - Scalable Massage Processing and more! ...Trivadis TechEvent 2016 Apache Kafka - Scalable Massage Processing and more! ...
Trivadis TechEvent 2016 Apache Kafka - Scalable Massage Processing and more! ...
 
Kafka Explainaton
Kafka ExplainatonKafka Explainaton
Kafka Explainaton
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Introduction to Kafka Streams Presentation
Introduction to Kafka Streams PresentationIntroduction to Kafka Streams Presentation
Introduction to Kafka Streams Presentation
 
Being Ready for Apache Kafka - Apache: Big Data Europe 2015
Being Ready for Apache Kafka - Apache: Big Data Europe 2015Being Ready for Apache Kafka - Apache: Big Data Europe 2015
Being Ready for Apache Kafka - Apache: Big Data Europe 2015
 
Kafka Streams for Java enthusiasts
Kafka Streams for Java enthusiastsKafka Streams for Java enthusiasts
Kafka Streams for Java enthusiasts
 
Building streaming data applications using Kafka*[Connect + Core + Streams] b...
Building streaming data applications using Kafka*[Connect + Core + Streams] b...Building streaming data applications using Kafka*[Connect + Core + Streams] b...
Building streaming data applications using Kafka*[Connect + Core + Streams] b...
 
Building Streaming Data Applications Using Apache Kafka
Building Streaming Data Applications Using Apache KafkaBuilding Streaming Data Applications Using Apache Kafka
Building Streaming Data Applications Using Apache Kafka
 
Fundamentals of Apache Kafka
Fundamentals of Apache KafkaFundamentals of Apache Kafka
Fundamentals of Apache Kafka
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafka
 
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
 
Apache Kafka 0.8 basic training - Verisign
Apache Kafka 0.8 basic training - VerisignApache Kafka 0.8 basic training - Verisign
Apache Kafka 0.8 basic training - Verisign
 
OSSNA Building Modern Data Streaming Apps
OSSNA Building Modern Data Streaming AppsOSSNA Building Modern Data Streaming Apps
OSSNA Building Modern Data Streaming Apps
 
Building Event-Driven Systems with Apache Kafka
Building Event-Driven Systems with Apache KafkaBuilding Event-Driven Systems with Apache Kafka
Building Event-Driven Systems with Apache Kafka
 
Real-Time Log Analysis with Apache Mesos, Kafka and Cassandra
Real-Time Log Analysis with Apache Mesos, Kafka and CassandraReal-Time Log Analysis with Apache Mesos, Kafka and Cassandra
Real-Time Log Analysis with Apache Mesos, Kafka and Cassandra
 
Lessons Learned: Using Spark and Microservices
Lessons Learned: Using Spark and MicroservicesLessons Learned: Using Spark and Microservices
Lessons Learned: Using Spark and Microservices
 
Kafka Summit NYC 2017 Introduction to Kafka Streams with a Real-life Example
Kafka Summit NYC 2017 Introduction to Kafka Streams with a Real-life ExampleKafka Summit NYC 2017 Introduction to Kafka Streams with a Real-life Example
Kafka Summit NYC 2017 Introduction to Kafka Streams with a Real-life Example
 
Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
Kafka Connect & Kafka Streams/KSQL - the ecosystem around KafkaKafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
 
Apache Kafka DC Meetup: Replicating DB Binary Logs to Kafka
Apache Kafka DC Meetup: Replicating DB Binary Logs to KafkaApache Kafka DC Meetup: Replicating DB Binary Logs to Kafka
Apache Kafka DC Meetup: Replicating DB Binary Logs to Kafka
 

Mais de Guido Schmutz

30 Minutes to the Analytics Platform with Infrastructure as Code
30 Minutes to the Analytics Platform with Infrastructure as Code30 Minutes to the Analytics Platform with Infrastructure as Code
30 Minutes to the Analytics Platform with Infrastructure as CodeGuido Schmutz
 
Event Broker (Kafka) in a Modern Data Architecture
Event Broker (Kafka) in a Modern Data ArchitectureEvent Broker (Kafka) in a Modern Data Architecture
Event Broker (Kafka) in a Modern Data ArchitectureGuido Schmutz
 
Big Data, Data Lake, Fast Data - Dataserialiation-Formats
Big Data, Data Lake, Fast Data - Dataserialiation-FormatsBig Data, Data Lake, Fast Data - Dataserialiation-Formats
Big Data, Data Lake, Fast Data - Dataserialiation-FormatsGuido Schmutz
 
ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!Guido Schmutz
 
Event Hub (i.e. Kafka) in Modern Data Architecture
Event Hub (i.e. Kafka) in Modern Data ArchitectureEvent Hub (i.e. Kafka) in Modern Data Architecture
Event Hub (i.e. Kafka) in Modern Data ArchitectureGuido Schmutz
 
Event Hub (i.e. Kafka) in Modern Data (Analytics) Architecture
Event Hub (i.e. Kafka) in Modern Data (Analytics) ArchitectureEvent Hub (i.e. Kafka) in Modern Data (Analytics) Architecture
Event Hub (i.e. Kafka) in Modern Data (Analytics) ArchitectureGuido Schmutz
 
Building Event Driven (Micro)services with Apache Kafka
Building Event Driven (Micro)services with Apache KafkaBuilding Event Driven (Micro)services with Apache Kafka
Building Event Driven (Micro)services with Apache KafkaGuido Schmutz
 
Location Analytics - Real-Time Geofencing using Apache Kafka
Location Analytics - Real-Time Geofencing using Apache KafkaLocation Analytics - Real-Time Geofencing using Apache Kafka
Location Analytics - Real-Time Geofencing using Apache KafkaGuido Schmutz
 
Solutions for bi-directional integration between Oracle RDBMS and Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS and Apache KafkaSolutions for bi-directional integration between Oracle RDBMS and Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS and Apache KafkaGuido Schmutz
 
What is Apache Kafka? Why is it so popular? Should I use it?
What is Apache Kafka? Why is it so popular? Should I use it?What is Apache Kafka? Why is it so popular? Should I use it?
What is Apache Kafka? Why is it so popular? Should I use it?Guido Schmutz
 
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache KafkaSolutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache KafkaGuido Schmutz
 
Location Analytics Real-Time Geofencing using Kafka
Location Analytics Real-Time Geofencing using KafkaLocation Analytics Real-Time Geofencing using Kafka
Location Analytics Real-Time Geofencing using KafkaGuido Schmutz
 
Streaming Visualisation
Streaming VisualisationStreaming Visualisation
Streaming VisualisationGuido Schmutz
 
Kafka as an event store - is it good enough?
Kafka as an event store - is it good enough?Kafka as an event store - is it good enough?
Kafka as an event store - is it good enough?Guido Schmutz
 
Solutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache KafkaSolutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache KafkaGuido Schmutz
 
Fundamentals Big Data and AI Architecture
Fundamentals Big Data and AI ArchitectureFundamentals Big Data and AI Architecture
Fundamentals Big Data and AI ArchitectureGuido Schmutz
 
Location Analytics - Real-Time Geofencing using Kafka
Location Analytics - Real-Time Geofencing using Kafka Location Analytics - Real-Time Geofencing using Kafka
Location Analytics - Real-Time Geofencing using Kafka Guido Schmutz
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming VisualizationGuido Schmutz
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming VisualizationGuido Schmutz
 
Location Analytics - Real Time Geofencing using Apache Kafka
Location Analytics - Real Time Geofencing using Apache KafkaLocation Analytics - Real Time Geofencing using Apache Kafka
Location Analytics - Real Time Geofencing using Apache KafkaGuido Schmutz
 

Mais de Guido Schmutz (20)

30 Minutes to the Analytics Platform with Infrastructure as Code
30 Minutes to the Analytics Platform with Infrastructure as Code30 Minutes to the Analytics Platform with Infrastructure as Code
30 Minutes to the Analytics Platform with Infrastructure as Code
 
Event Broker (Kafka) in a Modern Data Architecture
Event Broker (Kafka) in a Modern Data ArchitectureEvent Broker (Kafka) in a Modern Data Architecture
Event Broker (Kafka) in a Modern Data Architecture
 
Big Data, Data Lake, Fast Data - Dataserialiation-Formats
Big Data, Data Lake, Fast Data - Dataserialiation-FormatsBig Data, Data Lake, Fast Data - Dataserialiation-Formats
Big Data, Data Lake, Fast Data - Dataserialiation-Formats
 
ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!
 
Event Hub (i.e. Kafka) in Modern Data Architecture
Event Hub (i.e. Kafka) in Modern Data ArchitectureEvent Hub (i.e. Kafka) in Modern Data Architecture
Event Hub (i.e. Kafka) in Modern Data Architecture
 
Event Hub (i.e. Kafka) in Modern Data (Analytics) Architecture
Event Hub (i.e. Kafka) in Modern Data (Analytics) ArchitectureEvent Hub (i.e. Kafka) in Modern Data (Analytics) Architecture
Event Hub (i.e. Kafka) in Modern Data (Analytics) Architecture
 
Building Event Driven (Micro)services with Apache Kafka
Building Event Driven (Micro)services with Apache KafkaBuilding Event Driven (Micro)services with Apache Kafka
Building Event Driven (Micro)services with Apache Kafka
 
Location Analytics - Real-Time Geofencing using Apache Kafka
Location Analytics - Real-Time Geofencing using Apache KafkaLocation Analytics - Real-Time Geofencing using Apache Kafka
Location Analytics - Real-Time Geofencing using Apache Kafka
 
Solutions for bi-directional integration between Oracle RDBMS and Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS and Apache KafkaSolutions for bi-directional integration between Oracle RDBMS and Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS and Apache Kafka
 
What is Apache Kafka? Why is it so popular? Should I use it?
What is Apache Kafka? Why is it so popular? Should I use it?What is Apache Kafka? Why is it so popular? Should I use it?
What is Apache Kafka? Why is it so popular? Should I use it?
 
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache KafkaSolutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
 
Location Analytics Real-Time Geofencing using Kafka
Location Analytics Real-Time Geofencing using KafkaLocation Analytics Real-Time Geofencing using Kafka
Location Analytics Real-Time Geofencing using Kafka
 
Streaming Visualisation
Streaming VisualisationStreaming Visualisation
Streaming Visualisation
 
Kafka as an event store - is it good enough?
Kafka as an event store - is it good enough?Kafka as an event store - is it good enough?
Kafka as an event store - is it good enough?
 
Solutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache KafkaSolutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
 
Fundamentals Big Data and AI Architecture
Fundamentals Big Data and AI ArchitectureFundamentals Big Data and AI Architecture
Fundamentals Big Data and AI Architecture
 
Location Analytics - Real-Time Geofencing using Kafka
Location Analytics - Real-Time Geofencing using Kafka Location Analytics - Real-Time Geofencing using Kafka
Location Analytics - Real-Time Geofencing using Kafka
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming Visualization
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming Visualization
 
Location Analytics - Real Time Geofencing using Apache Kafka
Location Analytics - Real Time Geofencing using Apache KafkaLocation Analytics - Real Time Geofencing using Apache Kafka
Location Analytics - Real Time Geofencing using Apache Kafka
 

Último

Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 

Último (20)

Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Apache Kafka - Scalable Message-Processing and more !

  • 1. BASEL BERN BRUGG DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. GENF HAMBURG KOPENHAGEN LAUSANNE MÜNCHEN STUTTGART WIEN ZÜRICH Apache Kafka Scalable Message Processing and more! Guido Schmutz @gschmutz guidoschmutz.wordpress.com
  • 2. Guido Schmutz Working at Trivadis for more than 20 years Oracle ACE Director for Fusion Middleware and SOA Consultant, Trainer Software Architect for Java, Oracle, SOA and Big Data / Fast Data Member of Trivadis Architecture Board Technology Manager @ Trivadis More than 30 years of software development experience Contact: guido.schmutz@trivadis.com Blog: http://guidoschmutz.wordpress.com Slideshare: http://www.slideshare.net/gschmutz Twitter: gschmutz 2 8.12.2016 Big Data & Fast Data
  • 3. Agenda 1. Introduction & Motivation 2. Kafka Core 3. Kafka Connect 4. Kafka Streams 5. Kafka and ”Big Data” / ”Fast Data” Ecosystem 6. Confluent Data Platform 7. Summary Apache Kafka - Scalable Message Processing and more!3
  • 4. Introduction & Motivation Apache Kafka - Scalable Message Processing and more!4
  • 5. Hadoop Clusterd Hadoop Cluster Big Data Cluster Traditional Big Data Architecture BI Tools Enterprise Data Warehouse Billing & Ordering CRM / Profile Marketing Campaigns File Import / SQL Import SQL Search Online & Mobile Apps Search NoSQL Parallel Processing Distributed Filesystem • Machine Learning • Graph Algorithms • Natural Language Processing
  • 6. Event Hub Event Hub Hadoop Clusterd Hadoop Cluster Big Data Cluster Event Hub – handle event stream data BI Tools Enterprise Data Warehouse Location Social Click stream Sensor Data Billing & Ordering CRM / Profile Marketing Campaigns Event Hub Call Center Weather Data Mobile Apps SQL Search Online & Mobile Apps Search Data Flow NoSQL Parallel Processing Distributed Filesystem • Machine Learning • Graph Algorithms • Natural Language Processing
  • 7. Hadoop Clusterd Hadoop Cluster Big Data Cluster Event Hub – taking Velocity into account Location Social Click stream Sensor Data Billing & Ordering CRM / Profile Marketing Campaigns Call Center Mobile Apps Batch Analytics Streaming Analytics Event Hub Event Hub Event Hub NoSQL Parallel Processing Distributed Filesystem Stream Analytics NoSQL Reference / Models SQL Search Dashboard BI Tools Enterprise Data Warehouse Search Online & Mobile Apps File Import / SQL Import Weather Data Apache Kafka - Scalable Message Processing and more!7
  • 8. Kafka Stream Data Platform Source: Confluent Apache Kafka - Scalable Message Processing and more!8
  • 9. Kafka Core Apache Kafka - Scalable Message Processing and more!9
  • 10. Apache Kafka - Overview Distributed publish-subscribe messaging system Designed for processing of real time activity stream data (logs, metrics collections, social media streams, …) Initially developed at LinkedIn, now part of Apache Does not use JMS API and standards Kafka maintains feeds of messages in topics Apache Kafka - Scalable Message Processing and more!10
  • 11. Apache Kafka - Motivation LinkedIn’s motivation for Kafka was: • “A unified platform for handling all the real-time data feeds a large company might have.” Must haves • High throughput to support high volume event feeds. • Support real-time processing of these feeds to create new, derived feeds. • Support large data backlogs to handle periodic ingestion from offline systems. • Support low-latency delivery to handle more traditional messaging use cases. • Guarantee fault-tolerance in the presence of machine failures. Apache Kafka - Scalable Message Processing and more!11
  • 12. Kafka High Level Architecture The who is who • Producers write data to brokers. • Consumers read data from brokers. • All this is distributed. The data • Data is stored in topics. • Topics are split into partitions, which are replicated. Kafka Cluster Consumer Consumer Consumer Producer Producer Producer Broker 1 Broker 2 Broker 3 Zookeeper Ensemble Apache Kafka - Scalable Message Processing and more!12
  • 13. Apache Kafka - Architecture Kafka Broker Movement Processor Movement Topic Engine-Metrics Topic 1 2 3 4 5 6 Engine Processor1 2 3 4 5 6 Truck Apache Kafka - Scalable Message Processing and more!13
  • 14. Apache Kafka - Architecture Kafka Broker Movement Processor Movement Topic Engine-Metrics Topic 1 2 3 4 5 6 Engine Processor Partition 0 1 2 3 4 5 6 Partition 0 1 2 3 4 5 6 Partition 1 Movement Processor Truck Apache Kafka - Scalable Message Processing and more!14
  • 15. Apache Kafka Kafka Broker 1 Movement Processor Truck Movement Topic P 0 Movement Processor 1 2 3 4 5 P 2 1 2 3 4 5 Kafka Broker 2 Movement Topic P 2 1 2 3 4 5 P 1 1 2 3 4 5 Kafka Broker 3 Movement Topic P 0 1 2 3 4 5 P 1 1 2 3 4 5 Movement Processor Apache Kafka - Scalable Message Processing and more!15
  • 16. Apache Kafka - Architecture • Write Ahead Log / Commit Log • Producers always append to tail • think append to file Kafka Broker Movement Topic 1 2 3 4 5 Truck 6 6 Apache Kafka - Scalable Message Processing and more!16
  • 17. Durability Guarantees Producer can configure acknowledgements Value Impact Durability 0 • Producer doesn’t wait for leader weak 1 (default) • Producer waits for leader • Leader sends ack when message written to log • No wait for followers medium all • Producer waits for leader • Leader sends ack when all In-Sync Replica have acknowledged strong Apache Kafka - Scalable Message Processing and more!17
  • 18. Apache Kafka - Partition offsets Offset: messages in the partitions are each assigned a unique (per partition) and sequential id called the offset • Consumers track their pointers via (offset, partition, topic) tuples Consumer Group A Consumer Group B Apache Kafka - Scalable Message Processing and more!18 Source: Apache Kafka
  • 19. Data Retention – 3 options 1. Never 2. Time based (TTL) log.retention.{ms | minutes | hours} 3. Size based log.retention.bytes 4. Log compaction based (entries with same key are removed) kafka-topics.sh --zookeeper localhost:2181 --create --topic customers --replication-factor 1 --partitions 1 --config cleanup.policy=compact Apache Kafka - Scalable Message Processing and more!19
  • 20. Apache Kafka – Some numbers Kafka at LinkedIn => over 1800+ broker machines / 79K+ Topics Kafka Performance at our own infrastructure => 6 brokers (VM) / 1 cluster • 445’622 messages/second • 31 MB / second • 3.0405 ms average latency between producer / consumer 1.3 Trillion messages per day 330 Terabytes in/day 1.2 Petabytes out/day Peak load for a single cluster 2 million messages/sec 4.7 Gigabits/sec inbound 15 Gigabits/sec outbound http://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines https://engineering.linkedin.com/kafka/running-kafka-scale Apache Kafka - Scalable Message Processing and more!20
  • 21. Kafka Topics Creating a topic • Command line interface • Using AdminUtils.createTopic method • Auto-create via auto.create.topics.enable = true Modifying a topic https://kafka.apache.org/documentation.html#basic_ops_modify_topic Deleting a topic • Command Line interface $ kafka-topics.sh –zookeeper zk1:2181 --create --topic my.topic –-partitions 3 –-replication-factor 2 --config x=y Apache Kafka - Scalable Message Processing and more!21
  • 22. Inspecting the current state of a topic Use the --describe option • Leader: brokerID of the currently elected leader broker • Replica ID’s = broker ID’s • ISR = “in-sync replica”, replicas that are in sync with the leader. In this example: • Broker 0 is leader for partition 1. • Broker 1 is leader for partitions 0 and 2. • All replicas are in-sync with their respective leader partitions. $ kafka-topics.sh –zookeeper zk1:2181 –-describe --topic my.topic Topic:zerg2.hydra PartitionCount:3 ReplicationFactor:2 Configs: Topic: my.topic Partition: 0 Leader: 1 Replicas: 1,0 Isr: 1,0 Topic: my.topic Partition: 1 Leader: 0 Replicas: 0,1 Isr: 0,1 Topic: my.topic Partition: 2 Leader: 1 Replicas: 1,0 Isr: 1,0 Apache Kafka - Scalable Message Processing and more!22
  • 23. Kafka Connect Apache Kafka - Scalable Message Processing and more!23
  • 24. Kafka Connect Architecture Apache Kafka - Scalable Message Processing and more!24 Source: Confluent
  • 25. Kafka Connector Hub – Certified Connectors Source: http://www.confluent.io/product/connectors Apache Kafka - Scalable Message Processing and more!25
  • 26. Kafka Connector Hub – Additional Connectors Source: http://www.confluent.io/product/connectors Apache Kafka - Scalable Message Processing and more!26
  • 27. Kafka Streams Apache Kafka - Scalable Message Processing and more!27
  • 28. Kafka Streams • Designed as a simple and lightweight library in Apache Kafka • no external dependencies on systems other than Apache Kafka • Leverages Kafka as its internal messaging layer • agnostic to resource management and configuration tools • Supports fault-tolerant local state • Event-at-a-time processing (not microbatch) with millisecond latency • Windowing with out-of-order data using a Google DataFlow-like model Apache Kafka - Scalable Message Processing and more!28
  • 29. Kafka Streams - Architecture Apache Kafka - Scalable Message Processing and more!29 topology defines the stream processing computational logic for your application topology is a graph of stream processors (nodes) that are connected by streams (edges) source processor is a stream processor that does not have any upstream processors sink processor is a special type of stream processor that does not have down-stream processors. Source: Confluent
  • 30. Kafka Streams - Processor Topology Apache Kafka - Scalable Message Processing and more!30 topology defines the stream processing computational logic for your application topology is a graph of stream processors (nodes) that are connected by streams (edges) source processor is a stream processor that does not have any upstream processors. Consumes one or Kafka topics. sink processor is a special type of stream processor that does not have down-stream processors. Produces to a single Kafka topic. Source: Confluent
  • 31. Kafka and ”Big Data” / ”Fast Data” Ecosystem Apache Kafka - Scalable Message Processing and more!31
  • 32. Kafka and the Big Data / Fast Data ecosystem Kafka integrates with many popular products / frameworks • Apache Spark Streaming • Apache Flink • Apache Storm • Apache NiFi • Streamsets • Apache Flume • Oracle Stream Analytics • Oracle Service Bus • Oracle GoldenGate • Spring Integration Kafka Support • …Storm built-in Kafka Spout to consume events from Kafka Apache Kafka - Scalable Message Processing and more!32
  • 33. Confluent Platform Apache Kafka - Scalable Message Processing and more!33
  • 34. Confluent Data Platform 3.1 Apache Kafka - Scalable Message Processing and more!34 Source: Confluent
  • 35. Summary Apache Kafka - Scalable Message Processing and more!35
  • 36. Weather Data SQL Import Hadoop Clusterd Hadoop Cluster Hadoop Cluster Location Social Click stream Sensor Data Billing & Ordering CRM / Profile Marketing Campaigns Call Center Mobile Apps Batch Analytics Streaming Analytics Event Hub Event Hub Event Hub NoSQL Parallel Processing Distributed Filesystem Stream Analytics NoSQL Reference / Models SQL Search Dashboard BI Tools Enterprise Data Warehouse Search Online & Mobile Apps Customer Event Hub – mapping of technologies Apache Kafka - Scalable Message Processing and more!36
  • 37. Summary • Kafka can scale to millions of messages per second, and more • Easy to start with for a PoC • A bit more to invest to setup production environment • Monitoring is key • Vibrant community and ecosystem • Fast pace technology • Confluent provides Kafka Distribution Apache Kafka - Scalable Message Processing and more!37
  • 38. Guido Schmutz Technology Manager guido.schmutz@trivadis.com Apache Kafka - Scalable Message Processing and more!38 @gschmutz guidoschmutz.wordpress.com