Exactly-once Stream Processing
Matthias J. Sax, Software Engineer
Apache Kafka committer and PMC member
matthias@confluent.io | @MatthiasJSax
Exactly-once: Delivery vs Semantics
Exactly-once Delivery
• Academic distributed system problem:
• Can we send a message and ensure it’s delivered to the receiver exactly once?
• Two Generals’ Problem (https://en.wikipedia.org/wiki/Byzantine_fault)
• Provably not possible!
Delivery != Semantics
What is Exactly-once Semantics About?
Take input record, process it, update result, and record progress.
No error. No problem.
What happens if something goes wrong?
Error during read, processing, write, or record progress.
We retry!
But is it safe?
Are retries safe? With exactly-once, yes!
Exactly-once is about masking errors via safe retries.
The result of an exactly-once retry is semantically the same as if no error had occurred.
Common Misconceptions
Kafka as an intermediate
• Pattern: Produce -> Kafka -> Consume
• No exactly-once semantics:
• Upstream write-only producer!
There is no* Write-only Exactly-once!
(*) Write-only exactly-once is possible for idempotent updates (but Kafka is append-only…)
Common Misconceptions
Kafka as an intermediate
• Pattern: Produce -> Kafka -> Consume
• No exactly-once semantics:
• Upstream write-only producer!
• Downstream read-only consumer!
There is NO Read-only Exactly-once!
Common Misconceptions
Kafka as an intermediate
• Pattern: Produce -> Kafka -> Consume
• No exactly-once semantics.
Kafka for processing
• Pattern: Consume -> Process -> Produce
• Built-in exactly-once via Kafka Streams (or DIY).
• Also possible with external source/target system!
Let’s Break it Down
Steps in a Processing Pipeline
• Read input:
• Does not modify state; re-reading is always safe.
• Process data:
• Stateless re-processing (filter, map, etc.) is always safe.
• Stateful re-processing: need to roll back state before we can retry.
• Update result:
• Need to “retract” (partial) results.
• Or: rely on idempotent updates. (There are dragons!)
• Record progress:
• Modifies state in the source system (or does it?)
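The four steps above can be sketched as a small in-memory loop. This is only an illustration under stated assumptions: the class `RetryablePipeline`, its running-sum state, and the simulated crash are hypothetical and not part of Kafka. It shows that a retry is safe once state is rolled back and the partial result is retracted before the record is processed again.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory pipeline; the names are illustrative, not a Kafka API.
class RetryablePipeline {
    private final List<Integer> input;
    private final List<Long> results = new ArrayList<>(); // result "system"
    private long runningSum = 0;                           // internal processing state
    private int progress = 0;                              // index of the next record to read

    RetryablePipeline(List<Integer> input) {
        this.input = input;
    }

    // Processes one record. If crash is true, fails after the result update
    // but before progress is recorded, then undoes the partial work.
    private void step(boolean crash) {
        long sumBefore = runningSum;
        int resultsBefore = results.size();
        try {
            int value = input.get(progress);  // read: does not modify anything
            runningSum += value;              // process: stateful update
            results.add(runningSum);          // update result
            if (crash) {
                throw new IllegalStateException("simulated crash");
            }
            progress++;                       // record progress
        } catch (IllegalStateException e) {
            runningSum = sumBefore;           // roll back state
            while (results.size() > resultsBefore) {
                results.remove(results.size() - 1); // retract partial result
            }
        }
    }

    // Runs to completion, crashing once at crashAtIndex (use -1 for no crash).
    List<Long> run(int crashAtIndex) {
        boolean crashed = false;
        while (progress < input.size()) {
            boolean crash = !crashed && progress == crashAtIndex;
            crashed |= crash;
            step(crash);
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(new RetryablePipeline(List.of(1, 2, 3)).run(-1)); // no error
        System.out.println(new RetryablePipeline(List.of(1, 2, 3)).run(1));  // one retry
    }
}
```

Both runs print the same result list, which is the sense in which the retry is "semantically the same as if no error had occurred".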
Exactly-once
==
At-least-once + Idempotency
It depends…
Idempotent Updates (Internal State)?
Stateful processing
Stateful processing is usually a “read and modify” pattern, e.g., increase a counter.
• It’s context sensitive!
Cnt: 73 -> (73+1) -> Cnt: 74
Retry: Cnt: 74 -> (74+1) -> Cnt: 75 ☹
Idempotent Updates? Maybe…
Stateful processing
Stateful processing is usually a “read and modify” pattern, e.g., increase a counter.
• It’s context sensitive!
• Idempotency requires context agnostic state modifications, e.g., set a new address.
City: LA -> (Set “NY”) -> City: NY
Retry: City: NY -> (Set “NY”) -> City: NY ☺
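The contrast between the counter and the address update can be tried out directly. The sketch below is a minimal illustration with hypothetical helper names (`increment`, `setCity`) operating on plain maps; it assumes the retry simply applies the same update a second time.

```java
import java.util.HashMap;
import java.util.Map;

class IdempotencyDemo {
    // Context sensitive: the new value depends on the current value.
    static void increment(Map<String, Integer> state, String key) {
        state.merge(key, 1, Integer::sum);
    }

    // Context agnostic: the new value ignores the current value.
    static void setCity(Map<String, String> state, String key, String city) {
        state.put(key, city);
    }

    public static void main(String[] args) {
        Map<String, Integer> counters = new HashMap<>();
        counters.put("cnt", 73);
        increment(counters, "cnt"); // first attempt
        increment(counters, "cnt"); // retry of the same update
        System.out.println(counters.get("cnt")); // prints 75, not the intended 74

        Map<String, String> cities = new HashMap<>();
        cities.put("city", "LA");
        setCity(cities, "city", "NY"); // first attempt
        setCity(cities, "city", "NY"); // retry of the same update
        System.out.println(cities.get("city")); // prints NY
    }
}
```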
Idempotent Updates (External State)
The issue of time travel…
City: LA -> (Set “NY”) -> City: NY -> (Set “BO”) -> City: BO
Read: LA, then Read: NY, then Read: BO
Idempotent Updates (External State)
Retrying a sequence of updates:
City: BO -> (Set “NY”) -> City: NY -> (Set “BO”) -> City: BO
Read: BO ☺, then Read: NY ☹, then Read: BO ☺
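A minimal sketch of the retried sequence, assuming an external reader that looks at the value after every update (the class `TimeTravelDemo` is hypothetical). Each individual "set" is idempotent, yet the reader still observes an older value reappearing.

```java
import java.util.ArrayList;
import java.util.List;

class TimeTravelDemo {
    // Returns what an external reader sees if it reads before the retry
    // and then after every replayed update.
    static List<String> observedReads() {
        String city = "BO";                 // state left behind by the first attempt
        List<String> reads = new ArrayList<>();
        reads.add(city);                    // reader sees BO
        city = "NY";                        // retry replays the first update
        reads.add(city);                    // reader sees NY: an older value reappears
        city = "BO";                        // retry replays the second update
        reads.add(city);                    // reader sees BO again
        return reads;
    }

    public static void main(String[] args) {
        System.out.println(observedReads()); // prints [BO, NY, BO]
    }
}
```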
Idempotency is not enough.
All State Changes must be Atomic!
All State Changes must be Atomic
What is ”state”?
• Internal processing state.
• External state, i.e., result state.
• External state, i.e., source progress.
Transactions to the rescue!
Do we want to (can we) do a cross-system distributed transaction?
Good news: we don’t have to…
Exactly-Once with Kafka and External Systems
Example: Downstream target RDBMS
• Atomic write via ACID transaction: State, Result, Offsets
• (Async) offset update (not part of the transaction)
Exactly-Once with Kafka and External Systems
Example: Downstream target RDBMS
• State, Result, Offsets
• Reset offsets and retry
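The RDBMS example can be modeled with a toy in-memory "database". This is a sketch under assumptions, not a connector implementation: `TransactionalSink` and its methods are hypothetical, and a rolled-back transaction is simulated by simply not applying the buffered batch. The point is that rows and the source offset only ever change together, so fetching the start offset from the target after a crash leads to neither loss nor duplicates.

```java
import java.util.ArrayList;
import java.util.List;

class TransactionalSink {
    // Committed "database" contents: rows and source offset change together.
    private final List<String> committedRows = new ArrayList<>();
    private long committedOffset = 0; // next source offset to read

    long fetchStartOffset() {
        return committedOffset;
    }

    List<String> rows() {
        return committedRows;
    }

    // Writes a batch plus the new offset as one unit.
    // A crash before commit leaves rows and offset untouched.
    void writeBatch(List<String> batch, long nextOffset, boolean crashBeforeCommit) {
        List<String> pendingRows = new ArrayList<>(batch); // buffered in the transaction
        if (crashBeforeCommit) {
            return; // rolled back: pendingRows are discarded
        }
        committedRows.addAll(pendingRows);
        committedOffset = nextOffset;
    }

    // Copies the source into a fresh sink, crashing once on the second batch.
    static List<String> runWithCrash(List<String> source, int batchSize) {
        TransactionalSink db = new TransactionalSink();
        boolean crashed = false;
        while (db.fetchStartOffset() < source.size()) {
            int start = (int) db.fetchStartOffset(); // reset offsets from the target
            int end = Math.min(start + batchSize, source.size());
            boolean crash = !crashed && start > 0;
            crashed |= crash;
            db.writeBatch(source.subList(start, end), end, crash);
        }
        return db.rows();
    }

    public static void main(String[] args) {
        System.out.println(runWithCrash(List.of("a", "b", "c", "d", "e"), 2));
    }
}
```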
Kafka Connect (Part 1)
Exactly-once Sink
• Has “nothing” to do with Kafka:
• Kafka provides source system progress tracking via offsets.
• Connect provides an API to fetch start offsets from the target system.
• Depends on target system properties / features.
• Each individual connector must implement it.
How does Kafka Tackle Exactly-once?
Kafka Transactions
Multi-partition/multi-topic atomic write:
[Figure: one transaction writes records atomically across partitions t1-p0, t1-p1, t2-p0, t2-p1, and t2-p2.]
How does Kafka Tackle Exactly-once?
Kafka Transactions
Multi-partition/multi-topic atomic write:
producer.beginTransaction();
// state updates (changelogs + result)
producer.send(…);
producer.send(…);
…
producer.commitTransaction(); // or .abortTransaction()
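To make the visibility rule behind the calls above concrete, here is a toy model of a multi-partition atomic write. It is a sketch only: `ToyTransactionalLog` is hypothetical, and tracking committed transaction ids in a set is a simplification assumed for illustration, not how the brokers implement it. Records sent to several partitions become readable together once the transaction commits; an aborted transaction is one that never commits.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ToyTransactionalLog {
    private static final class Entry {
        final long txnId;
        final String value;

        Entry(long txnId, String value) {
            this.txnId = txnId;
            this.value = value;
        }
    }

    private final Map<String, List<Entry>> partitions = new HashMap<>();
    private final Set<Long> committed = new HashSet<>();
    private long nextTxnId = 0;

    long beginTransaction() {
        return nextTxnId++;
    }

    // Appends immediately, but the record stays invisible until commit.
    void send(long txnId, String partition, String value) {
        partitions.computeIfAbsent(partition, p -> new ArrayList<>())
                  .add(new Entry(txnId, value));
    }

    // One commit makes the records in all partitions visible at once.
    void commitTransaction(long txnId) {
        committed.add(txnId);
    }

    // Read-committed view: skips records of uncommitted transactions.
    List<String> readCommitted(String partition) {
        List<String> out = new ArrayList<>();
        for (Entry e : partitions.getOrDefault(partition, List.of())) {
            if (committed.contains(e.txnId)) {
                out.add(e.value);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        ToyTransactionalLog log = new ToyTransactionalLog();
        long txn = log.beginTransaction();
        log.send(txn, "t1-p0", "changelog update");
        log.send(txn, "t2-p1", "result");
        System.out.println(log.readCommitted("t2-p1")); // prints []
        log.commitTransaction(txn);
        System.out.println(log.readCommitted("t2-p1")); // prints [result]
    }
}
```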
Exactly-Once with Kafka
Kafka as Sink
Requirement: ability to track source system progress.
result
state (via changelogs)
source progress (via custom metadata topic)
Kafka Connect (Part 2)
Exactly-once Source
• “Exactly-once, Again: Adding EOS Support for Kafka Connect Source Connectors”
• Tomorrow: 2pm
• Chris Egerton, Aiven
• KIP-618 (Apache Kafka 3.3):
• https://cwiki.apache.org/confluence/display/KAFKA/KIP-618%3A+Exactly-Once+Support+for+Source+Connectors
Kafka Streams
Kafka Transactions
Atomic read-process-write pattern:
Kafka Streams
__consumer_offsets
changelogs
result
Kafka Transactions
Multi-partition/multi-topic atomic write:
Kafka Streams
Kafka Transactions
Multi-partition/multi-topic atomic write:
producer.beginTransaction();
// state updates (changelogs + result)
producer.send(…);
producer.send(…);
…
producer.sendOffsetsToTransaction(…);
producer.commitTransaction(); // or .abortTransaction()
Kafka Streams
Single vs Multi-cluster
Kafka Streams (currently) only works against a single broker cluster:
• Does not really matter. We still rely on the brokers as target system.
• Need source offsets but commit them via the producer.
• Single broker cluster only avoids “dual” commit of source offsets.
Supporting cross-cluster EOS with Kafka Streams is possible:
• Add custom metadata topic to target cluster.
• Replace sendOffsetsToTransaction() with send().
• Fetch consumer offset manually from metadata topic.
• Issues:
• EOS v2 implementation (producer per thread) not possible.
• Limited to single target cluster.
The Big Challenge
Error Handling in a (Distributed) Application
Kafka transactions allow fencing of “zombie” producers.
Any EOS target system needs to support something similar (or rely on idempotency if possible).
Kafka Connect Sink Connectors:
• Idempotency or sink system fencing is required; the Connect framework cannot help at all.
Kafka Connect Source Connectors:
• Rely on producer fencing.
• Use a producer per task (similar to Kafka Streams’ EOS v1 implementation).
Kafka Streams:
• Relies on producer fencing (EOS v1) or consumer fencing (EOS v2).
• EOS v2 implementation (producer per thread) relies on consumer/producer integration inside the same broker cluster.
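The idea of fencing a zombie can be illustrated with an epoch check. The sketch below is a toy model under assumptions: `ToyFencingCoordinator` and its methods are hypothetical and only mimic the general principle that a newly initialized instance with the same id invalidates older instances; any EOS target system would need some comparable mechanism.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ToyFencingCoordinator {
    private final Map<String, Integer> latestEpoch = new HashMap<>();
    private final List<String> log = new ArrayList<>();

    // Registers a new instance for the id and returns its epoch.
    // Every earlier instance with the same id is fenced from now on.
    int init(String transactionalId) {
        return latestEpoch.merge(transactionalId, 1, Integer::sum);
    }

    // Appends only if the caller still holds the latest epoch.
    boolean append(String transactionalId, int epoch, String value) {
        if (latestEpoch.getOrDefault(transactionalId, 0) != epoch) {
            return false; // stale epoch: the caller is a zombie
        }
        log.add(value);
        return true;
    }

    List<String> log() {
        return log;
    }

    public static void main(String[] args) {
        ToyFencingCoordinator coordinator = new ToyFencingCoordinator();
        int oldEpoch = coordinator.init("app-1");
        coordinator.append("app-1", oldEpoch, "a");
        int newEpoch = coordinator.init("app-1"); // restarted instance
        System.out.println(coordinator.append("app-1", oldEpoch, "b")); // prints false
        System.out.println(coordinator.append("app-1", newEpoch, "c")); // prints true
        System.out.println(coordinator.log()); // prints [a, c]
    }
}
```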
What to do in Practice?
Publishing with producer-only app?
The important thing is to figure out where to resume on restart:
• Is there any “source progress” information you can store?
• You need to add a consumer to your app!
• On app restart:
• Initialize producer to fence potential zombies and to force any pending transaction to complete.
• Use consumer (in read-committed mode) to inspect the target cluster’s data.
Reading with consumer-only app?
• If there is no target data system, only idempotency can help.
• With no target data system, everything is basically a side-effect.
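One way to picture the restart step for a publishing app: if every published record carries the source position it was derived from, the app can scan the committed data in the target and resume right after the highest position found. This is a sketch under that assumption; the `Published` record layout and the `resumeFrom` helper are hypothetical, and the list stands in for what a read-committed consumer would return.

```java
import java.util.List;

class ResumePoint {
    // Output record that carries the source position it was derived from
    // (hypothetical layout, chosen for this illustration).
    static final class Published {
        final long sourcePosition;
        final String payload;

        Published(long sourcePosition, String payload) {
            this.sourcePosition = sourcePosition;
            this.payload = payload;
        }
    }

    // Returns the first source position that has not been published yet.
    static long resumeFrom(List<Published> committedRecords) {
        long highest = -1;
        for (Published p : committedRecords) {
            highest = Math.max(highest, p.sourcePosition);
        }
        return highest + 1;
    }

    public static void main(String[] args) {
        List<Published> committed = List.of(
            new Published(0, "a"), new Published(1, "b"), new Published(2, "c"));
        System.out.println(resumeFrom(committed)); // prints 3
        System.out.println(resumeFrom(List.of())); // prints 0
    }
}
```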
Exactly-once Key Takeaways
(A) no producer-only EOS
(B) no consumer-only EOS
(C) read-process-write pattern
(1) need ability to track source system read progress
(2) require target system atomic write (plus fencing)
(3) source system progress is recorded in target system
Kafka built-in support via transactions + Zero coding with Kafka Streams
✅

 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 

Exactly-once Stream Processing Done Right with Matthias J Sax

  • 1. Exactly-once Stream Processing
    Matthias J. Sax, Software Engineer
    Apache Kafka committer and PMC member
    matthias@confluent.io | @MatthiasJSax
  • 2. Exactly-once: Delivery vs Semantics
    Exactly-once Delivery
    • Academic distributed system problem:
      • Can we send a message and ensure it’s delivered to the receiver exactly once?
      • Two Generals’ Problem (https://en.wikipedia.org/wiki/Byzantine_fault)
      • Provably not possible!
    Delivery != Semantics
  • 3. What is Exactly-once Semantics About?
    Take input record, process it, update result, and record progress.
    No error. No problem.
  • 4. What is Exactly-once Semantics About?
    What happens if something goes wrong?
    Error during read, processing, write, or record progress.
    We retry! But is it safe?
  • 5. What is Exactly-once Semantics About?
    Are retries safe? With exactly-once, yes!
    Exactly-once is about masking errors via safe retries.
    The result of an exactly-once retry is semantically the same as if no error had occurred.
  • 6. Common Misconceptions
    Kafka as an intermediate
    • Pattern: Produce -> Kafka -> Consume
    • No exactly-once semantics:
      • Upstream write-only producer!
  • 7. There is no* Write-only Exactly-once!
    (*) Write-only exactly-once is possible for idempotent updates (but Kafka is append-only…)
  • 8. Common Misconceptions
    Kafka as an intermediate
    • Pattern: Produce -> Kafka -> Consume
    • No exactly-once semantics:
      • Upstream write-only producer!
      • Downstream read-only consumer!
  • 9. There is NO Read-only Exactly-once!
  • 10. Common Misconceptions
    Kafka as an intermediate
    • Pattern: Produce -> Kafka -> Consume
    • No exactly-once semantics.
    Kafka for processing
    • Pattern: Consume -> Process -> Produce
    • Built-in exactly-once via Kafka Streams (or DIY).
    • Also possible with external source/target system!
  • 11. Let’s Break it Down
    Steps in a Processing Pipeline
    • Read input:
      • Does not modify state; re-reading is always safe.
    • Process data:
      • Stateless re-processing (filter, map, etc.) is always safe.
      • Stateful re-processing: need to roll back state before we can retry.
    • Update result:
      • Need to “retract” (partial) results.
      • Or: rely on idempotent updates. (There are dragons!)
    • Record progress:
      • Modifies state in the source system (or does it?)
  • 13. Idempotent Updates (Internal State)?
    Stateful processing
    Stateful processing is usually a “read and modify” pattern, e.g., increase a counter.
    • It’s context sensitive!
    [Figure: the first attempt reads Cnt: 73 and writes 73+1, giving Cnt: 74; the retry reads Cnt: 74 and writes 74+1, giving Cnt: 75. Retry: ☹]
  • 14. Idempotent Updates? Maybe…
    Stateful processing
    Stateful processing is usually a “read and modify” pattern, e.g., increase a counter.
    • It’s context sensitive!
    • Idempotency requires context-agnostic state modifications, e.g., set a new address.
    [Figure: the first attempt applies Set “NY” to City: LA, giving City: NY; the retry applies Set “NY” to City: NY, giving City: NY. Retry: ☺]
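
The two slides above contrast a context-sensitive update (increment) with a context-agnostic one (set). A minimal sketch of that difference, using a plain in-memory map as the state store; the class and method names (`RetryDemo`, `countAfterRetry`, `cityAfterRetry`) are hypothetical and only illustrate the idea:

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory model of a state store; all names are hypothetical. */
final class RetryDemo {
    /** Read-and-modify update: the outcome depends on the current value. */
    static void increment(Map<String, Integer> store, String key) {
        store.merge(key, 1, Integer::sum);
    }

    /** Context-agnostic update: the outcome does not depend on the current value. */
    static void setCity(Map<String, String> store, String key, String city) {
        store.put(key, city);
    }

    /** Applies the increment, then applies it again to model a blind retry. */
    static int countAfterRetry(int initial) {
        Map<String, Integer> store = new HashMap<>();
        store.put("cnt", initial);
        increment(store, "cnt"); // first attempt took effect, but the caller saw an error
        increment(store, "cnt"); // retry is applied on top of the first attempt
        return store.get("cnt");
    }

    /** Applies the same "set" twice to model a blind retry. */
    static String cityAfterRetry(String initial, String update) {
        Map<String, String> store = new HashMap<>();
        store.put("city", initial);
        setCity(store, "city", update); // first attempt
        setCity(store, "city", update); // retry overwrites with the same value
        return store.get("city");
    }
}
```

With the values from the slides, `RetryDemo.countAfterRetry(73)` yields 75 rather than 74, while `RetryDemo.cityAfterRetry("LA", "NY")` yields "NY" either way.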
  • 15. Idempotent Updates (External State)
    The issue of time travel…
    [Figure: City: LA, then Set “NY” gives City: NY, then Set “BO” gives City: BO. Concurrent readers observe Read: LA, Read: NY, Read: BO.]
  • 16. Idempotent Updates (External State)
    Retrying a sequence of updates:
    [Figure: the retry starts from City: BO, then Set “NY” gives City: NY, then Set “BO” gives City: BO. Concurrent readers observe Read: BO ☺, Read: NY ☹, Read: BO ☺.]
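
The “time travel” effect can be reproduced with a small in-memory model: a single mutable cell stands in for the external system, and a reader samples it before the first write and after each write. This is only a sketch; `TimeTravelDemo` and `applyAndObserve` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of externally visible state; all names are hypothetical. */
final class TimeTravelDemo {
    /**
     * Applies each idempotent "set" in order to cell[0] and records what a
     * concurrent reader would see before the first write and after each write.
     */
    static List<String> applyAndObserve(String[] cell, List<String> updates) {
        List<String> observed = new ArrayList<>();
        observed.add(cell[0]);     // read before any write
        for (String update : updates) {
            cell[0] = update;      // idempotent overwrite of external state
            observed.add(cell[0]); // read after this write
        }
        return observed;
    }
}
```

Running the sequence Set “NY”, Set “BO” once from LA gives the reads LA, NY, BO. Replaying the same sequence as a retry gives BO, NY, BO: the final state is correct, but a reader briefly sees the older value NY again.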
  • 17. Idempotency is not enough. All State Changes must be Atomic!
  • 18. All State Changes must be Atomic
    What is “state”?
    • Internal processing state.
    • External state, i.e., result state.
    • External state, i.e., source progress.
    Transactions to the rescue!
    Do we want to (can we) do a cross-system distributed transaction?
    Good news: we don’t have to…
  • 19. Exactly-Once with Kafka and External Systems
    Example: Downstream target RDBMS
    [Figure: State, Result, and Offsets are written atomically via an ACID transaction; the (async) offset update is not part of the transaction.]
  • 20. Exactly-Once with Kafka and External Systems
    Example: Downstream target RDBMS
    [Figure: State, Result, and Offsets in the target RDBMS; on error, reset offsets and retry.]
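
A rough sketch of the pattern on the two slides above: the result and the source offset are committed to the target together, and after a failure the task resets to the offset stored in the target. `TargetDb` is an in-memory stand-in for an RDBMS and `SinkTask` a hypothetical processing loop; neither is a real database or Kafka Connect API.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** In-memory stand-in for a target RDBMS; hypothetical, not a real database API. */
final class TargetDb {
    private Map<String, Integer> counts = new HashMap<>();
    private long nextOffset = 0L;

    /** Models one ACID transaction: result and source offset change together. */
    synchronized void commit(Map<String, Integer> newCounts, long newNextOffset) {
        counts = new HashMap<>(newCounts);
        nextOffset = newNextOffset;
    }

    synchronized Map<String, Integer> counts() { return new HashMap<>(counts); }

    synchronized long nextOffset() { return nextOffset; }
}

/** Hypothetical read-process-write loop that counts occurrences of each input record. */
final class SinkTask {
    /**
     * Starts from the offset stored in the target system. If failAtOffset >= 0,
     * throws before committing the record at that offset to simulate a crash.
     */
    static void run(List<String> input, TargetDb db, long failAtOffset) {
        long offset = db.nextOffset(); // reset to the last committed position
        while (offset < input.size()) {
            Map<String, Integer> working = db.counts();
            working.merge(input.get((int) offset), 1, Integer::sum);
            if (offset == failAtOffset) {
                throw new IllegalStateException("simulated crash before commit");
            }
            db.commit(working, offset + 1); // result + offset in one atomic step
            offset++;
        }
    }
}
```

Because the uncommitted work is discarded together with its offset, a second call to `SinkTask.run(input, db, -1)` after a simulated crash re-processes only the records that were never committed, so the counter is not incremented twice.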
  • 21. Kafka Connect (Part 1)
    Exactly-once Sink
    • Has “nothing” to do with Kafka:
      • Kafka provides source system progress tracking via offsets.
      • Connect provides an API to fetch start offsets from the target system.
    • Depends on target system properties / features.
    • Each individual connector must implement it.
  • 22. How does Kafka Tackle Exactly-once?
    Kafka Transactions
    Multi-partition/multi-topic atomic write:
    [Figure: records written atomically across partitions t1-p0, t1-p1, t2-p0, t2-p1, t2-p2.]
  • 23. How does Kafka Tackle Exactly-once?
    Kafka Transactions
    Multi-partition/multi-topic atomic write:
      producer.beginTransaction();
      // state updates (changelogs + result)
      producer.send(…);
      producer.send(…);
      …
      producer.commitTransaction(); // or .abortTransaction()
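
To make the “atomic multi-partition write” property concrete, here is a toy in-memory model. `ToyTxLog` is hypothetical and is not the Kafka client API; it only mirrors the method names from the slide and models the outcome a read-committed reader sees, namely all records of a transaction or none.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of an atomic multi-partition write; hypothetical, not the Kafka client API. */
final class ToyTxLog {
    private final Map<String, List<String>> committed = new HashMap<>();
    private final Map<String, List<String>> pending = new HashMap<>();
    private boolean inTx = false;

    void beginTransaction() {
        if (inTx) throw new IllegalStateException("transaction already open");
        inTx = true;
    }

    /** Buffers a record for the given partition inside the open transaction. */
    void send(String partition, String record) {
        if (!inTx) throw new IllegalStateException("no open transaction");
        pending.computeIfAbsent(partition, p -> new ArrayList<>()).add(record);
    }

    /** Makes all buffered records visible at once, across all partitions. */
    void commitTransaction() {
        if (!inTx) throw new IllegalStateException("no open transaction");
        pending.forEach((p, recs) ->
                committed.computeIfAbsent(p, k -> new ArrayList<>()).addAll(recs));
        pending.clear();
        inTx = false;
    }

    /** Discards all buffered records; nothing becomes visible. */
    void abortTransaction() {
        if (!inTx) throw new IllegalStateException("no open transaction");
        pending.clear();
        inTx = false;
    }

    /** Read-committed view of one partition. */
    List<String> readCommitted(String partition) {
        return new ArrayList<>(committed.getOrDefault(partition, List.of()));
    }
}
```

The buffering is a simplification made for this sketch. In Kafka itself, records are appended to the partition logs as they are sent, and commit or abort markers decide what a read-committed consumer gets to see.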
  • 24. Exactly-Once with Kafka
    Kafka as Sink
    Requirement: ability to track source system progress.
    [Figure labels: result; state (via changelogs); source progress (via custom metadata topic).]
  • 25. Kafka Connect (Part 2)
    Exactly-once Source
    • “Exactly-once, Again: Adding EOS Support for Kafka Connect Source Connectors”
      • Tomorrow: 2pm
      • Chris Egerton, Aiven
    • KIP-618 (Apache Kafka 3.3):
      • https://cwiki.apache.org/confluence/display/KAFKA/KIP-618%3A+Exactly-Once+Support+for+Source+Connectors
  • 28. Kafka Streams
    Kafka Transactions
    Multi-partition/multi-topic atomic write:
      producer.beginTransaction();
      // state updates (changelogs + result)
      producer.send(…);
      producer.send(…);
      …
      producer.addOffsetsToTransaction(…);
      producer.commitTransaction(); // or .abortTransaction()
  • 29. Kafka Streams
    Single vs Multi-cluster
    Kafka Streams (currently) only works against a single broker cluster:
    • Does not really matter. We still rely on the brokers as target system.
    • Need source offsets but commit them via the producer.
    • Single broker cluster only avoids “dual” commit of source offsets.
    Supporting cross-cluster EOS with Kafka Streams is possible:
    • Add custom metadata topic to target cluster.
    • Replace addOffsetsToTransaction() with send().
    • Fetch consumer offset manually from metadata topic.
    • Issues:
      • EOS v2 implementation (producer per thread) not possible.
      • Limited to single target cluster.
  • 30. The Big Challenge
    Error Handling in a (Distributed) Application
    Kafka transactions allow fencing “zombie” producers.
    Any EOS target system needs to support something similar (or rely on idempotency if possible).
    Kafka Connect Sink Connectors:
    • Idempotency or sink system fencing required; the Connect framework cannot help at all.
    Kafka Connect Source Connectors:
    • Relies on producer fencing.
    • Uses a producer per task (similar to Kafka Streams’ EOS v1 implementation).
    Kafka Streams:
    • Relies on producer fencing (EOS v1) or consumer fencing (EOS v2).
    • EOS v2 implementation (producer per thread) relies on consumer/producer integration inside the same broker cluster.
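
One common way for a target system to fence zombies is an epoch per writer identity: every restart bumps the epoch, and writes that carry an older epoch are rejected. The sketch below assumes that scheme; `FencedTarget` and its methods are hypothetical names, loosely analogous to a Kafka producer's transactional id and epoch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy target system that fences stale writers by epoch; all names are hypothetical. */
final class FencedTarget {
    private final Map<String, Integer> latestEpoch = new HashMap<>();
    private final List<String> log = new ArrayList<>();

    /** Registers a new instance for writerId and returns its epoch; older epochs become zombies. */
    synchronized int init(String writerId) {
        return latestEpoch.merge(writerId, 1, Integer::sum);
    }

    /** Appends the record unless the caller's epoch is older than the latest one. */
    synchronized void write(String writerId, int epoch, String record) {
        int current = latestEpoch.getOrDefault(writerId, 0);
        if (epoch < current) {
            throw new IllegalStateException("fenced: epoch " + epoch + " < " + current);
        }
        log.add(record);
    }

    synchronized List<String> log() { return new ArrayList<>(log); }
}
```

If a task is restarted while its old instance is still alive, the new instance calls `init("task-0")` and gets a higher epoch; any late `write` from the old instance then fails instead of producing a duplicate.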
  • 31. What to do in Practice?
    Publishing with producer-only app?
    The important thing is to figure out where to resume on restart:
    • Is there any “source progress” information you can store?
    • You need to add a consumer to your app!
    • On app restart:
      • Initialize producer to fence potential zombies and to force any pending TX to complete.
      • Use consumer (in read-committed mode) to inspect the target cluster’s data.
    Reading with consumer-only app?
    • If there is no target data system, only idempotency can help.
    • With no target data system, everything is basically a side-effect.
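
For the producer-only case, the restart logic can be sketched as follows, assuming each published record carries the source position it was derived from. That convention is an assumption made for this example, not something the slides prescribe, and `ResumePoint` and `Published` are hypothetical names.

```java
import java.util.List;

/** Sketch of finding the resume position from committed target data; hypothetical names. */
final class ResumePoint {
    /** A committed target record that carries the source position it was derived from. */
    static final class Published {
        final long sourcePosition;
        final String payload;

        Published(long sourcePosition, String payload) {
            this.sourcePosition = sourcePosition;
            this.payload = payload;
        }
    }

    /**
     * Returns the next source position to process: one past the highest position
     * found in the committed target records, or 0 if nothing was published yet.
     */
    static long resumeFrom(List<Published> committedTargetRecords) {
        long next = 0L;
        for (Published p : committedTargetRecords) {
            next = Math.max(next, p.sourcePosition + 1);
        }
        return next;
    }
}
```

In a real application the list would be obtained by reading the target topic in read-committed mode after the producer has been initialized, so that records from aborted or still-pending transactions are not counted.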
  • 32. Exactly-once Key Takeaways
    (A) No producer-only EOS.
    (B) No consumer-only EOS.
    (C) Read-process-write pattern:
      (1) Need the ability to track source system read progress.
      (2) Require target system atomic write (plus fencing).
      (3) Source system progress is recorded in the target system.
    Kafka: built-in support via transactions + zero coding with Kafka Streams ✅