An introduction to the Kafka stream processing platform. The presentation gives a short introduction to stream processing and explains how Kafka Streams and Kafka Connect are used together to implement real-time stream processing flows.
5. Stream Processing
+
Get real-time insights
Lower processing latency
Easier to test
Easier to maintain
Easier to scale
-
Different way of thinking
At-least-once vs exactly-once
Time
6. Every company is already doing stream processing (more or less ...)
7. A Stream
Key 1 -> value 1
Key 2 -> value 2
Key 1 -> value 3
...
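As a minimal sketch of the slide above, a stream can be pictured as an ordered, append-only sequence of key/value records in which the same key may appear more than once. The `Record` type and the `values_for` helper are illustrative assumptions, not part of any Kafka API.

```python
from typing import List, NamedTuple


class Record(NamedTuple):
    """One entry in the stream: a key plus the value observed for it."""
    key: str
    value: str


# Records are kept in arrival order; nothing is overwritten.
stream: List[Record] = [
    Record("key 1", "value 1"),
    Record("key 2", "value 2"),
    Record("key 1", "value 3"),
]


def values_for(records: List[Record], key: str) -> List[str]:
    """Return every value seen for a key, in arrival order."""
    return [r.value for r in records if r.key == key]


print(values_for(stream, "key 1"))  # ['value 1', 'value 3']
```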
20. Kafka Platform
Streams and Connect apps are just (Java) apps
Streams and Connect are libraries
Can be deployed like any other (Java) app
Multiple instances of the same app can be launched
Use tools like Mesos, Kubernetes, Docker Swarm, ...
30. Sequential disk access is fast*
* Don’t believe me? Read http://kafka.apache.org/documentation#persistence
31. Producer
Puts messages onto Kafka
Determines the partition to write to
Can be implemented in many, many languages
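How a producer might pick a partition can be sketched as hashing the record key and taking the result modulo the number of partitions. This is an illustration only: `choose_partition` is a hypothetical helper, and CRC32 is a stand-in hash chosen for this sketch, not the hash function the Kafka clients use.

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a record key to a partition index in [0, num_partitions)."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    # CRC32 is a stand-in hash for this sketch; it is deterministic across
    # runs, unlike Python's built-in hash() for bytes and strings.
    return zlib.crc32(key) % num_partitions


# Records with the same key always land on the same partition,
# which preserves their relative order within that partition.
for key in (b"key 1", b"key 2", b"key 1"):
    print(key, "->", choose_partition(key, num_partitions=3))
```

The point of key-based partitioning is that all records for one key end up in one partition, so a single consumer sees them in order.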
32. Consumer
Gets messages from Kafka
Can be grouped into Consumer Groups
Allows for round robin message delivery
Enables scaling of consumers
Have a persisted offset per Consumer Group
Stored in ZooKeeper
Or in Kafka
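The consumer group ideas above can be sketched in a few lines: partitions are dealt out round robin to the consumers in a group, and each group keeps its own committed offset per partition. The function names, the consumer names `c1` and `c2`, and the group names `emailer` and `ranker` are assumptions made for this sketch; real clients handle assignment and offset storage themselves.

```python
from collections import defaultdict
from typing import Dict, List


def assign_round_robin(partitions: List[int], consumers: List[str]) -> Dict[str, List[int]]:
    """Deal partitions out to the consumers of one group, one at a time."""
    if not consumers:
        raise ValueError("a consumer group needs at least one consumer")
    assignment: Dict[str, List[int]] = {c: [] for c in consumers}
    for i, partition in enumerate(sorted(partitions)):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment


# Committed offsets are tracked per (group, partition), so each group
# reads the topic independently of the others.
committed: Dict[str, Dict[int, int]] = defaultdict(dict)


def commit(group: str, partition: int, offset: int) -> None:
    """Record the next offset the group should read from a partition."""
    committed[group][partition] = offset


print(assign_round_robin([0, 1, 2, 3], ["c1", "c2"]))
# {'c1': [0, 2], 'c2': [1, 3]}
commit("emailer", 0, 42)
commit("ranker", 0, 7)
print(committed["emailer"][0], committed["ranker"][0])  # 42 7
```

Adding a consumer to a group spreads the same partitions over more workers, which is how consumers scale. A consumer beyond the partition count would receive an empty assignment.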
43. Kafka Streams
KStream for a stream of data
KTable to keep the latest value for each key
KTable state is distributed across app instances
Transform from streams to tables and tables to streams
Choose which field to use as “timestamp”
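The relationship between a stream and a table can be sketched without the Kafka Streams library: folding a stream keeps only the latest value per key, and a table can be read back out as a stream of its current entries. `to_table` and `to_stream` are hypothetical helpers for illustration and are not the KStream or KTable API.

```python
from typing import Dict, Iterable, Iterator, Tuple

Record = Tuple[str, str]  # (key, value)


def to_table(stream: Iterable[Record]) -> Dict[str, str]:
    """Fold a stream into a table that keeps the latest value per key."""
    table: Dict[str, str] = {}
    for key, value in stream:
        table[key] = value  # a later record for the same key overwrites
    return table


def to_stream(table: Dict[str, str]) -> Iterator[Record]:
    """Emit one record per table entry, turning the table back into a stream."""
    yield from table.items()


stream = [("key 1", "value 1"), ("key 2", "value 2"), ("key 1", "value 3")]
table = to_table(stream)
print(table)                   # {'key 1': 'value 3', 'key 2': 'value 2'}
print(list(to_stream(table)))  # [('key 1', 'value 3'), ('key 2', 'value 2')]
```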
44. [Diagram: two Kafka Connect apps and two Kafka Streams apps connected through TOPIC A, TOPIC B, and TOPIC C]
47. [Diagram: TOPIC A, TOPIC B, and TOPIC C connecting Sales JDBC Kafka Connect, Top Products Ranker, Emailer, Low Stock Notifier, and a Kafka Connect App Slack Poster]