Resilient Kafka: How DNS Traffic Management and Client Wrappers Ensure Availability

Vanessa Vuibert
Staff Production Engineer
@V3_XD
Scale
• 862 Kafka brokers
• 14 Kafka clusters
• 14M messages per sec
• 9 GCP regions
Traffic management use cases
• Maintenance
• Incidents
• Regionalize traffic
Kubernetes (K8s) out of the box
🔓open source
Kafka broker
K8s out of the box
dig +short service.namespace.svc.cluster.local
IP0
IP1
IP2
K8s out of the box
bootstrap.servers=
service.namespace.svc.cluster.local:9092
K8s out of the box
dig +short pod2.service.namespace.svc.cluster.local
IP2
K8s out of the box
advertised.listeners=
pod2.service.namespace.svc.cluster.local:9092
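The two settings above follow the Kubernetes DNS naming pattern. A minimal Python sketch, assuming a headless Service in front of a StatefulSet (which is what gives each pod its own DNS name). The function names are illustrative and not from the talk, and a real `advertised.listeners` value also carries a listener prefix such as `PLAINTEXT://`, which the slide leaves out.

```python
KAFKA_PORT = 9092  # port shown on the slides


def service_fqdn(service: str, namespace: str) -> str:
    """Cluster-internal DNS name of a Service."""
    return f"{service}.{namespace}.svc.cluster.local"


def pod_fqdn(pod: str, service: str, namespace: str) -> str:
    """Per-pod DNS name (assumes a headless Service governing a StatefulSet)."""
    return f"{pod}.{service_fqdn(service, namespace)}"


def bootstrap_servers(service: str, namespace: str, port: int = KAFKA_PORT) -> str:
    """Clients bootstrap through the Service name, which resolves to the broker pods."""
    return f"{service_fqdn(service, namespace)}:{port}"


def advertised_listener(pod: str, service: str, namespace: str,
                        port: int = KAFKA_PORT, protocol: str = "") -> str:
    """Each broker advertises its own pod name so clients can reach it directly.

    `protocol` is optional here only to mirror the slide; real broker configs
    need a prefix such as PLAINTEXT.
    """
    address = f"{pod_fqdn(pod, service, namespace)}:{port}"
    return f"{protocol}://{address}" if protocol else address


if __name__ == "__main__":
    print(bootstrap_servers("service", "namespace"))
    print(advertised_listener("pod2", "service", "namespace"))
```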
K8s StatefulSet: probes
• Readiness
• Startup
• Liveness
dig +short service.namespace.svc.cluster.local
IP0
IP2
K8s readiness probe
dig +short service.namespace.svc.cluster.local
IP0
IP2
IP3
K8s readiness probe
not ready
publishNotReadyAddresses: true
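As a rough model of the readiness slides: a headless Service normally publishes only ready pods in DNS, and `publishNotReadyAddresses: true` makes it publish every pod regardless of readiness. The `Pod` class and `published_ips` function below are hypothetical and exist only to illustrate that rule.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Pod:
    ip: str
    ready: bool  # result of the readiness probe


def published_ips(pods: List[Pod], publish_not_ready_addresses: bool = False) -> List[str]:
    """Return the IPs a DNS lookup of the headless Service would include."""
    return [p.ip for p in pods if p.ready or publish_not_ready_addresses]


if __name__ == "__main__":
    brokers = [Pod("IP0", True), Pod("IP1", False), Pod("IP2", True)]
    print(published_ips(brokers))        # only ready pods
    print(published_ips(brokers, True))  # all pods
```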
Regional pairs
External traffic: load balancers
External traffic: load balancers
bootstrap.servers
External traffic: load balancers
advertised.listeners
• Issues scaling
• Manual broker DNS records
• Limited traffic control
Built automation with K8s controllers.
Stateful buddy: load balancers
🔒closed source
Name buddy: DNS records
🔒closed source
Kafka access buddy: endpoints
🔒closed source
Kafka access buddy: consumer
Kafka access buddy: producer failover
“Let me failover real quick.”
- Elasticsearch on call
Faster failovers with a DNS traffic manager.
DNS traffic manager
🔒closed source
DNS traffic manager: normal
dig +short us-east1.somedomain.com
US-East1-IP
DNS traffic manager: failover
dig +short us-east1.somedomain.com
US-Central1-IP
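The traffic manager itself is closed source, so the following is only a toy in-memory model of the behaviour the two `dig` outputs show: the regional name keeps pointing at its own region until a failover repoints it at another region. The class, its methods, and the `us-central1.somedomain.com` name are assumptions made for the example.

```python
from typing import Dict


class DnsTrafficManager:
    """Toy model: each regional name answers with its home IP unless failed over."""

    def __init__(self, records: Dict[str, str]):
        self._home = dict(records)    # name -> IP of its own region
        self._active = dict(records)  # name -> IP currently served

    def resolve(self, name: str) -> str:
        return self._active[name]

    def fail_over(self, name: str, to: str) -> None:
        """Serve the home IP of `to` for lookups of `name`."""
        self._active[name] = self._home[to]

    def restore(self, name: str) -> None:
        """Point `name` back at its own region."""
        self._active[name] = self._home[name]


if __name__ == "__main__":
    tm = DnsTrafficManager({
        "us-east1.somedomain.com": "US-East1-IP",
        "us-central1.somedomain.com": "US-Central1-IP",  # assumed name
    })
    print(tm.resolve("us-east1.somedomain.com"))
    tm.fail_over("us-east1.somedomain.com", to="us-central1.somedomain.com")
    print(tm.resolve("us-east1.somedomain.com"))
```

Because clients keep using the same name, a failover in this model needs no client configuration change, only a fresh DNS lookup.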
“DNS trickery.”
- A Kafka client
Failover time savings
• Used to take 40 minutes
• Now only takes 1 minute
Incident during flash sale
Failover during flash sale
US Central1 -> US East1
Reduced toil with client wrappers.
Client wrappers
• Failover reconnection
• Everything needed for connection
• Ruby, Go, and Python
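One way "failover reconnection" could work, sketched in Python under stated assumptions: the wrapper re-resolves the bootstrap name and rebuilds the underlying client whenever the DNS answer changes. The talk does not show the wrappers' code, so `ReconnectingClient`, `resolve_ips`, and the `connect` factory are hypothetical names, and the factory stands in for whatever Kafka library the application uses.

```python
import socket
from typing import Any, Callable, FrozenSet, Iterable, Optional


def resolve_ips(host: str) -> set:
    """Default resolver: ask the system resolver for the host's current IPs."""
    return {info[4][0] for info in socket.getaddrinfo(host, None)}


class ReconnectingClient:
    """Rebuilds the wrapped client when the bootstrap name resolves differently."""

    def __init__(self, bootstrap_host: str,
                 connect: Callable[[str], Any],
                 resolve: Callable[[str], Iterable[str]] = resolve_ips):
        self._host = bootstrap_host
        self._connect = connect    # factory: host -> client object with .close()
        self._resolve = resolve
        self._ips: Optional[FrozenSet[str]] = None
        self._client: Any = None

    def client(self) -> Any:
        """Return a client connected to wherever the name currently points."""
        ips = frozenset(self._resolve(self._host))
        if self._client is None or ips != self._ips:
            if self._client is not None:
                self._client.close()  # drop connections to the old region
            self._client = self._connect(self._host)
            self._ips = ips
        return self._client
```

Callers would fetch the client through `client()` before each batch of work, so a DNS failover is picked up on the next call rather than after the old connections time out.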
K8s Deployment template: bootstrap.servers
K8s Deployment template: client ID
K8s Deployment template
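A hedged sketch of how a wrapper might pick up the values that the Deployment template provides, assuming they arrive as environment variables. The variable names `KAFKA_BOOTSTRAP_SERVERS` and `KAFKA_CLIENT_ID` are made up for this example; `bootstrap.servers` and `client.id` are the standard Kafka client settings.

```python
import os
from typing import Dict, Mapping, Optional


def kafka_config_from_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build the base client config from variables injected by the template.

    Raises KeyError if a variable is missing, so a misconfigured pod fails fast.
    """
    env = os.environ if env is None else env
    return {
        "bootstrap.servers": env["KAFKA_BOOTSTRAP_SERVERS"],  # hypothetical variable
        "client.id": env["KAFKA_CLIENT_ID"],                  # hypothetical variable
    }


if __name__ == "__main__":
    example_env = {
        "KAFKA_BOOTSTRAP_SERVERS": "us-east1.somedomain.com:9092",
        "KAFKA_CLIENT_ID": "checkout-service",
    }
    print(kafka_config_from_env(example_env))
```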
Improved availability with local consumers.
Local consumers
• More availability
• Reduced latency
• Reduced storage costs
• Reduced network costs
Aggregate consumer
Local consumers
Local consumers: DNS records
Latency (99th percentile)
• Aggregate: 500 ms
• Regional: 20 ms
Connect directly through private IPs.
Public to private traffic
• More secure
• Reduced network costs
• Fetch from closest replica: KIP-392
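For reference, KIP-392 (fetch from closest replica) is driven by three settings: `broker.rack` and `replica.selector.class` on the brokers, and `client.rack` on the consumers. The Python sketch below only assembles those settings as dictionaries. The helper names and the example zone are illustrative, and the talk does not say which values were used.

```python
from typing import Dict

# Selector class shipped with Apache Kafka for rack-aware follower fetching.
RACK_AWARE_SELECTOR = "org.apache.kafka.common.replica.RackAwareReplicaSelector"


def broker_overrides(zone: str) -> Dict[str, str]:
    """Broker-side settings: tag the broker with its zone and enable the selector."""
    return {
        "broker.rack": zone,
        "replica.selector.class": RACK_AWARE_SELECTOR,
    }


def consumer_overrides(zone: str) -> Dict[str, str]:
    """Consumer-side setting: tell brokers which zone the consumer runs in."""
    return {"client.rack": zone}


if __name__ == "__main__":
    zone = "us-east1-b"  # example zone, not from the talk
    print(broker_overrides(zone))
    print(consumer_overrides(zone))
```

When the consumer's `client.rack` matches the `broker.rack` of an in-sync follower, the consumer can fetch from that follower instead of a leader in another zone, which is where the network cost saving comes from.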
Traffic manager: pod IPs
Network cost reduction
• Network represents 29% of the bill
• Reduction: -6% of the bill
Failover: pod IPs
• GKE 1.24 -> 1.25 incident
• Apply firewall rules
• LB more secure for public traffic
One-stop shop with Multi-Cluster Services (MCS).
MCS endpoints
🔒closed source
Traffic sources
Regional pairs: uneven distribution
Regionalize traffic: Kafka access buddy
Regionalize traffic: MCS
MCS time savings
• Minutes to regionalize traffic: 40, 1 after migration
• Minutes to deploy: 18, 13 after migration
Resilient Kafka: How DNS Traffic Management and Client Wrappers Ensure Availability
• Resiliency: DNS traffic management
• Toil: client wrappers
• Availability: local consumption
Thanks!