When it comes to choosing a distributed streaming platform for real-time data pipelines, everyone knows the answer: Apache Kafka! And when it comes to deploying applications at scale without needing to integrate different pieces of infrastructure yourself, the answer nowadays is increasingly Kubernetes. However, as with all great things, the devil is truly in the details. While Kubernetes does provide all the building blocks that are needed, a lot of thought is required to create an enterprise-grade Kafka platform that can be used in production. In this technical deep dive, Michael and Viktor will go through the challenges and pitfalls of managing Kafka on Kubernetes, as well as the goals and lessons learned from the development of the Confluent Operator for Kubernetes.
NOTE: This talk was given together with Michael Ng from Confluent.
Kafka on Kubernetes: Does it really have to be "The Hard Way"? (Viktor Gamov, Confluent) Kafka Summit SF 2019
1. @gamussa | #KafkaSummit | @ConfluentINc
Streaming on Kubernetes: Does it have to be the hard way?
Fall 2019 / San Francisco, CA
10. Who runs stateless apps in Kubernetes?
Who thinks it’s a good idea?
Who runs stateful apps in Kubernetes?
Who thinks it’s a good idea?
🙋
22. Pod
• Basic unit of deployment in Kubernetes
• A collection of containers sharing:
  • Namespace
  • Network
  • Volumes
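As a rough illustration of the bullets above, here is a minimal sketch of a pod manifest written as a Python dictionary. The pod name, container names, images, and mount path are hypothetical; only the field names follow the Kubernetes Pod schema. Network sharing needs no field of its own, because containers in a pod share the pod's IP address.

```python
import json

# Hypothetical pod: names, images, and paths are illustrative assumptions.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "example-pod", "namespace": "default"},
    "spec": {
        # One pod-level volume that both containers mount.
        "volumes": [{"name": "shared-data", "emptyDir": {}}],
        "containers": [
            {
                "name": "app",
                "image": "example/app:latest",
                "volumeMounts": [{"name": "shared-data", "mountPath": "/data"}],
            },
            {
                "name": "sidecar",
                "image": "example/sidecar:latest",
                "volumeMounts": [{"name": "shared-data", "mountPath": "/data"}],
            },
        ],
    },
}


def shared_volumes(manifest):
    """Return the names of volumes mounted by every container in the pod."""
    mounts = [
        {m["name"] for m in c.get("volumeMounts", [])}
        for c in manifest["spec"]["containers"]
    ]
    return sorted(set.intersection(*mounts)) if mounts else []


print(json.dumps(pod, indent=2))
print(shared_volumes(pod))
```

Serializing the dictionary gives JSON that could be passed to `kubectl apply -f`, assuming the placeholder images were replaced with real ones.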
23. Storage
• Persistent Volume (PV) & Persistent Volume Claim (PVC)
• A PV is a piece of storage that is provisioned dynamically or statically and has a lifecycle independent of any individual pod that uses the PV
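To make the PV/PVC relationship more concrete, the sketch below builds a claim and two generations of a pod that reference it. The claim name, size, storage class, image, and mount path are all assumptions chosen for illustration; the point is only that a replacement pod can reuse the same claim, since the volume is not tied to a single pod.

```python
# Hypothetical claim: name, size, and storage class are illustrative assumptions.
pvc = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "kafka-data-0"},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "storageClassName": "standard",
        "resources": {"requests": {"storage": "10Gi"}},
    },
}


def pod_with_claim(pod_name, claim_name, mount_path):
    """Build a minimal pod manifest that mounts the given claim."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {
            "volumes": [
                {"name": "data", "persistentVolumeClaim": {"claimName": claim_name}}
            ],
            "containers": [
                {
                    "name": "broker",
                    "image": "example/kafka:latest",  # placeholder image
                    "volumeMounts": [{"name": "data", "mountPath": mount_path}],
                }
            ],
        },
    }


claim_name = pvc["metadata"]["name"]
original = pod_with_claim("kafka-0", claim_name, "/var/lib/kafka")
# A replacement pod (for example after a restart) points at the same claim,
# so it sees the data written by the original pod.
replacement = pod_with_claim("kafka-0", claim_name, "/var/lib/kafka")

print(original["spec"]["volumes"][0])
```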
28. Provisioning storage in Kubernetes
[Diagram: storage system, Kubernetes controller, Kubernetes node]
29. Provisioning storage in Kubernetes
The controller requests the storage system to attach the volume.
30. Provisioning storage in Kubernetes
The storage system attaches the volume.
31. Provisioning storage in Kubernetes
The kubelet requests the storage system to mount the volume.
32. Provisioning storage in Kubernetes
The kubelet can now request the container runtime to bind-mount the volume into the requested path in the container.
[Diagram: storage system, Kubernetes controller, Kubernetes node, with the volume mounted at /var/lib]
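The provisioning sequence on these slides can be read as a small state machine: attach, then mount, then bind-mount. The toy model below is only a sketch of that ordering. The `Volume` class, its method names, and the volume and node names are hypothetical and do not correspond to any Kubernetes API.

```python
class Volume:
    """Toy model of one volume moving through the provisioning steps."""

    def __init__(self, name):
        self.name = name
        self.state = "provisioned"
        self.log = []

    def _advance(self, expected, new_state, message):
        # Each step is only valid once the previous one has completed.
        if self.state != expected:
            raise RuntimeError(
                f"cannot move to {new_state!r} from {self.state!r}"
            )
        self.state = new_state
        self.log.append(message)

    def attach(self, node):
        self._advance(
            "provisioned", "attached",
            f"controller asks storage system to attach {self.name} to {node}",
        )

    def mount(self, node):
        self._advance(
            "attached", "mounted",
            f"kubelet on {node} asks storage system to mount {self.name}",
        )

    def bind_mount(self, container_path):
        self._advance(
            "mounted", "bind-mounted",
            f"container runtime bind-mounts {self.name} at {container_path}",
        )


volume = Volume("pv-example")
volume.attach("node-1")
volume.mount("node-1")
volume.bind_mount("/var/lib")
for line in volume.log:
    print(line)
```

Calling the steps out of order (for example `bind_mount` before `attach`) raises `RuntimeError`, which mirrors the dependency between the slides.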
42. Rolling Upgrade
Kafka Broker Upgrades:
1. Stop the broker, upgrade Kafka
2. Wait for Partition Leader reassignment
3. Start the upgraded broker
4. Wait for zero under-replicated partitions
5. Upgrade the next broker
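The five steps above can be sketched as a loop. This is a minimal sketch, not the Confluent Operator's implementation: the callback names (`stop_and_upgrade`, `leaders_moved`, `start`, `under_replicated`) and the `FakeCluster` stand-in are hypothetical, and a real version would query the cluster's metrics or admin API instead.

```python
import time


def rolling_upgrade(brokers, stop_and_upgrade, leaders_moved, start,
                    under_replicated, poll_seconds=1.0, sleep=time.sleep):
    """Upgrade brokers one at a time, waiting for the cluster to recover."""
    for broker in brokers:
        stop_and_upgrade(broker)               # step 1
        while not leaders_moved(broker):       # step 2
            sleep(poll_seconds)
        start(broker)                          # step 3
        while under_replicated() > 0:          # step 4
            sleep(poll_seconds)
        # step 5: the loop moves on to the next broker


class FakeCluster:
    """In-memory stand-in for a real cluster, for demonstration only."""

    def __init__(self):
        self.events = []
        self._urp = 0

    def stop_and_upgrade(self, broker):
        self.events.append(f"stop+upgrade {broker}")

    def leaders_moved(self, broker):
        return True

    def start(self, broker):
        self.events.append(f"start {broker}")
        self._urp = 2  # pretend two partitions must catch up after restart

    def under_replicated(self):
        current = self._urp
        self._urp = max(0, self._urp - 1)
        return current


cluster = FakeCluster()
rolling_upgrade(
    [0, 1, 2],
    cluster.stop_and_upgrade,
    cluster.leaders_moved,
    cluster.start,
    cluster.under_replicated,
    sleep=lambda _seconds: None,  # skip real waiting in the demo
)
print(cluster.events)
```

Waiting for zero under-replicated partitions before touching the next broker is what keeps at most one broker's replicas out of sync at any time.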
43. Will it scale?
Spin up new brokers, connect workers easily
Manual rebalance required in v1.0:
• Determine balancing plan
• Execute balancing plan
• Monitor resources
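To give a feel for the "determine balancing plan" step, here is a deliberately naive sketch that spreads replicas round-robin across brokers and emits the JSON shape accepted by Kafka's `kafka-reassign-partitions` tool. The topic name, partition count, and broker IDs are made up, and the round-robin strategy is an assumption for illustration: it ignores racks, current placement, and the cost of moving data.

```python
import json


def balancing_plan(topic, num_partitions, brokers, replication_factor):
    """Assign replicas round-robin and return a reassignment document."""
    if replication_factor > len(brokers):
        raise ValueError("replication factor exceeds number of brokers")
    partitions = []
    for partition in range(num_partitions):
        # Start each partition one broker further along the list.
        replicas = [
            brokers[(partition + offset) % len(brokers)]
            for offset in range(replication_factor)
        ]
        partitions.append(
            {"topic": topic, "partition": partition, "replicas": replicas}
        )
    return {"version": 1, "partitions": partitions}


# Hypothetical cluster after scaling from three brokers to four.
plan = balancing_plan("orders", 4, [0, 1, 2, 3], 2)
print(json.dumps(plan, indent=2))
```

Executing the plan and monitoring resources while data moves remain separate manual steps, as the slide notes.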