Deep dive into Kafka Tiered Storage
Satish D / Kamal
Introduction
Goals
● Scalability
● Efficiency
○ Operational
○ Cost
● Elasticity
Non Goals
● It does not support compacted topics
● It does not support the JBOD feature
● Tiered storage does not replace ETL pipelines
Features
● Provides tiering in storage layer beyond local drives
○ Memory/PageCache
○ Local drive
○ Remote storage support (including cloud stores like S3/GCS/Azure)
■ Same consistency and ordering semantics as local storage
● Improves efficiency
○ operational
○ cost
● Isolation of reading latest and old data
● Easy tuning and provisioning of clusters
● No changes required from clients
Local and Remote Log Segments
[Figure: Local Log Segments 6, 7, 8, 9 and Active; Remote Log Segments 1 to 7]
High Level Architecture
Follower Fetch
● There are two main states
○ Fetch
■ Fetch messages from the leader and append them to its log segment
○ Truncating
■ Truncate the existing data so that its log segments follow the same log lineage as the leader
Follower Fetch
1. Leader copies log segments, along with the auxiliary state (including the leader epoch cache and producer-id snapshots), to remote storage.
2. Leader publishes remote log segment metadata about the copied remote log segment.
3. Follower tries to fetch the messages from the leader.
4. Follower waits until it has caught up on consuming the required remote log segment metadata.
5. Follower fetches the respective remote log segment metadata to build auxiliary state.
Follower Fetch
● Which offset should the follower fetch from?
○ Last tiered offset
○ Earliest local offset
[Figure: Local Log Segments and Remote Log Segments laid out over offsets 0, 1000, ..., 2001, ..., 9000, ..., 12000 and the Active segment, with markers for Earliest Local Offset and Last Tiered Offset]
Follower Fetch
● Maintain the same log lineage
● Leader epochs
○ A monotonically increasing number that represents leader transitions
○ Added by the leader to each message batch in the log segment
○ Each replica maintains a leader epoch sequence file that maps epoch to start offset
■ Maintained by each replica
■ Enables maintaining the log lineage across the replicas
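The epoch-to-start-offset sequence described above can be sketched in a few lines. This is a minimal illustration, not Kafka's implementation: the class name `LeaderEpochSequence` and its methods are hypothetical, and end offsets are treated as exclusive by assumption.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple


class LeaderEpochSequence:
    """Ordered (epoch, start_offset) entries, e.g. (0, 0), (1, 3).

    Hypothetical helper for illustration; not a Kafka class.
    """

    def __init__(self) -> None:
        self.entries: List[Tuple[int, int]] = []

    def assign(self, epoch: int, start_offset: int) -> None:
        # Leader epochs only ever increase; record an entry on each increase.
        if self.entries and epoch < self.entries[-1][0]:
            raise ValueError("leader epochs must not decrease")
        if not self.entries or epoch > self.entries[-1][0]:
            self.entries.append((epoch, start_offset))

    def epoch_for_offset(self, offset: int) -> Optional[int]:
        # The epoch of an offset is the last entry whose start offset <= offset.
        starts = [start for _, start in self.entries]
        i = bisect_right(starts, offset) - 1
        return self.entries[i][0] if i >= 0 else None

    def end_offset_for(self, epoch: int, log_end_offset: int) -> Optional[int]:
        # Exclusive end offset: start of the next epoch, or the log end offset.
        for i, (e, _) in enumerate(self.entries):
            if e == epoch:
                if i + 1 < len(self.entries):
                    return self.entries[i + 1][1]
                return log_end_offset
        return None


# Replays a log like the one in the following slides: offsets 0-2 written
# in epoch 0, offsets 3-6 written in epoch 1.
seq = LeaderEpochSequence()
for offset, epoch in enumerate([0, 0, 0, 1, 1, 1, 1]):
    seq.assign(epoch, offset)
print(seq.entries)
```

Comparing the end offset of an epoch between two replicas is what lets a follower find the point where its log diverges from the leader's.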
Leader
0: msg 0 LE-0
1: msg 1 LE-0
2: msg 2 LE-0
Leader epochs: (0,0)
Leader - A RemoteStorage
Leader
0: msg 0 LE-0
1: msg 1 LE-0
2: msg 2 LE-0
Leader epochs: (0,0)
Leader - A
seg 0-1, uuid-1
Leader epochs = (0, 0)
RemoteStorage
Leader
0: msg 0 LE-0
1: msg 1 LE-0
2: msg 2 LE-0
3: msg 3 LE-1
4: msg 4 LE-1
5: msg 5 LE-1
6: msg 6 LE-1
Leader epochs: (0,0), (1, 3)
Leader - B
seg 0-1, uuid-1
Leader epochs = (0, 0)
RemoteStorage
Leader
0: msg 0 LE-0
1: msg 1 LE-0
2: msg 2 LE-0
3: msg 3 LE-1
4: msg 4 LE-1
5: msg 5 LE-1
6: msg 6 LE-1
Leader epochs: (0,0), (1, 3)
Leader - B
seg 0-1, uuid-1
Leader epochs = (0, 0)
seg 2-5, uuid-2
Leader epochs = (0, 0), (1, 3)
RemoteStorage
Leader
0: msg 0 LE-0
1: msg 1 LE-0
2: msg 2 LE-0
3: msg 3 LE-1
4: msg 4 LE-1
5: msg 5 LE-1
6: msg 6 LE-1
7: msg 7 LE-1
Leader epochs: (0,0), (1, 3)
Leader - B
seg 0-1, uuid-1
Leader epochs = (0, 0)
seg 2-5, uuid-2
Leader epochs = (0, 0), (1, 3)
RemoteStorage
Follower Fetch - Empty Follower
Follower Fetch - Empty follower
1. Fetch from 0
a. Receives OMTS(OffsetMovedToTieredStorage)
2. Fetch ELO (Earliest Local Offset)
a. Receives ELO (leader epoch, offset)
3. Fetch remote segment info and build local leader
epoch sequence until ELO
a. Receives leader epoch sequence, producer-id snapshot
4. Fetch from ELO to HW
Leader - A
Follower - B
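The four steps above can be simulated with a toy model. This is a sketch under stated assumptions: `Leader`, `RemoteSegment`, `OffsetMovedToTieredStorage`, and `bootstrap_empty_follower` are hypothetical names, the remote segment list stands in for what the follower would learn from remote log segment metadata, and the high watermark of 6 (exclusive) is assumed for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


class OffsetMovedToTieredStorage(Exception):
    """Stand-in for the OMTS error returned by the leader."""


@dataclass
class RemoteSegment:
    start: int                      # first offset in the segment
    end: int                        # last offset in the segment (inclusive)
    epochs: List[Tuple[int, int]]   # (leader epoch, start offset) entries


class Leader:
    def __init__(self, local_log: Dict[int, int], high_watermark: int) -> None:
        self.local_log = local_log            # offset -> leader epoch
        self.high_watermark = high_watermark  # exclusive (assumption)

    def earliest_local_offset(self) -> Tuple[int, int]:
        offset = min(self.local_log)
        return self.local_log[offset], offset

    def fetch(self, offset: int) -> List[Tuple[int, int]]:
        if offset < min(self.local_log):
            raise OffsetMovedToTieredStorage(offset)
        return [(o, self.local_log[o]) for o in range(offset, self.high_watermark)]


def bootstrap_empty_follower(leader: Leader, remote: List[RemoteSegment]):
    trace = []
    try:
        leader.fetch(0)                                    # step 1
    except OffsetMovedToTieredStorage:
        trace.append("OMTS")
    elo_epoch, elo = leader.earliest_local_offset()        # step 2
    trace.append(f"ELO=({elo_epoch}, {elo})")
    # Step 3: take the epoch entries of the remote segment holding offset
    # ELO - 1 and keep only those that start before ELO.
    seg = next(s for s in remote if s.start <= elo - 1 <= s.end)
    epochs = [(e, start) for e, start in seg.epochs if start < elo]
    # Step 4: fetch from ELO up to the high watermark, extending the sequence.
    log = []
    for offset, epoch in leader.fetch(elo):
        if not epochs or epoch > epochs[-1][0]:
            epochs.append((epoch, offset))
        log.append(offset)
    return trace, epochs, log


leader = Leader({3: 1, 4: 1, 5: 2, 6: 2, 7: 3}, high_watermark=6)
remote = [
    RemoteSegment(0, 2, [(0, 0)]),
    RemoteSegment(3, 5, [(0, 0), (1, 3), (2, 5)]),
]
print(bootstrap_empty_follower(leader, remote))
```

With these inputs the follower ends up holding offsets 3 to 5 and the epoch sequence (0, 0), (1, 3), (2, 5), matching the final state in the walkthrough that follows.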
Follower Fetch - Empty follower
0: msg 0 LE-0
1: msg 1 LE-0
2: msg 2 LE-0
3: msg 3 LE-1
4: msg 4 LE-1
5: msg 5 LE-2
6: msg 6 LE-2
7: msg 7 LE-3
Leader epochs = (0, 0), (1, 3), (2, 5), (3, 7)
Leader - A
seg 0-2, uuid-1
Leader epochs = (0, 0)
seg 3-5, uuid-2
Leader epochs = (0, 0), (1, 3), (2, 5)
RemoteStorage
Follower Fetch - Empty follower
3: msg 3 LE-1
4: msg 4 LE-1
5: msg 5 LE-2
6: msg 6 LE-2
7: msg 7 LE-3
Leader epochs = (0, 0), (1, 3), (2, 5), (3, 7)
Fetch offset 0
Receives OMTS
Fetch ELO
Receives ELO (LE-1, 3)
Fetch remote segment info and build local leader epoch sequence until ELO
Leader epochs = (0, 0)
Fetch from ELO to HW
Leader epochs = (0, 0), (1, 3)
Leader - A Follower - B
seg 0-2, uuid-1
Leader epochs = (0, 0)
seg 3-5, uuid-2
Leader epochs = (0, 0), (1, 3), (2, 5)
RemoteStorage
Follower Fetch - Empty follower
3: msg 3 LE-1
4: msg 4 LE-1
5: msg 5 LE-2
6: msg 6 LE-2
7: msg 7 LE-3
Leader epochs = (0, 0), (1, 3), (2, 5), (3, 7)
Fetch from ELO to HW
3: msg 3 LE-1
4: msg 4 LE-1
5: msg 5 LE-2
Leader epochs = (0, 0), (1, 3), (2, 5)
Leader - A Follower - B
seg 0-2, uuid-1
Leader epochs = (0, 0)
seg 3-5, uuid-2
Leader epochs = (0, 0), (1, 3), (2, 5)
RemoteStorage
Follower Fetch - Empty follower - Summary
1. Fetch offset 0
a. Receives OMTS
2. Fetch EarliestLocalOffset (ELO)
a. Receives ELO (leader epoch, offset)
3. Fetch remote segment info and build local leader
epoch sequence until ELO
a. Receives leader epoch sequence
4. Fetch from ELO to HW
Leader - A
Follower - B
Follower Fetch - Out of sync follower
Follower Fetch - Out of sync follower
● Follower trying to catch up with the leader
● Segments are copied to remote storage
○ Locally available
■ Fetch from the leader as it does without tiered storage
○ Locally not available
■ Truncate the data on the follower
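The two cases above come down to comparing the follower's next fetch offset with the leader's earliest local offset. A minimal sketch, assuming a hypothetical helper `plan_follower_fetch` that returns the steps as plain strings:

```python
from typing import List


def plan_follower_fetch(follower_next_offset: int,
                        leader_earliest_local_offset: int) -> List[str]:
    """Return the steps an out-of-sync follower takes (illustrative only)."""
    if follower_next_offset >= leader_earliest_local_offset:
        # The data is still on the leader's local disk: plain replication.
        return [f"fetch from leader at offset {follower_next_offset}"]
    # The data now lives only in remote storage: behave like an empty follower.
    return [
        "receive OMTS",
        "truncate local segments",
        "fetch ELO and rebuild leader epoch sequence from remote segment metadata",
        f"fetch from leader at offset {leader_earliest_local_offset}",
    ]


# Follower wants offset 4, but the leader's local log now starts at 9.
for step in plan_follower_fetch(4, 9):
    print(step)
```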
Follower Fetch - Out of sync follower
0: msg 0 LE-0
1: msg 1 LE-0
2: msg 2 LE-0
3: msg 3 LE-1
4: msg 4 LE-1
5: msg 5 LE-2
6: msg 6 LE-2
7: msg 7 LE-3
8: msg 8 LE-3
9: msg 9 LE-3
Leader epochs
(0, 0), (1, 3), (2, 5), (3, 7)
0: msg 0 LE-0
1: msg 1 LE-0
2: msg 2 LE-0
3: msg 3 LE-1
Leader epochs (0, 0), (1, 3)
Leader - A Follower - B
seg 0-2, uuid-1
Leader epochs = (0, 0)
seg 3-5, uuid-2
Leader epochs = (0, 0), (1, 3), (2, 5)
seg 6-8: uuid-3
Leader epochs = (0, 0), (1, 3), (2, 5), (3, 7)
RemoteStorage
Follower Fetch - Out of sync follower
0: msg 0 LE-0
1: msg 1 LE-0
2: msg 2 LE-0
3: msg 3 LE-1
4: msg 4 LE-1
5: msg 5 LE-2
6: msg 6 LE-2
7: msg 7 LE-3
8: msg 8 LE-3
9: msg 9 LE-3
10: msg 10 LE-3
11: msg 11 LE-3
Leader epochs
(0, 0), (1, 3), (2, 5), (3, 7)
0: msg 0 LE-0
1: msg 1 LE-0
2: msg 2 LE-0
3: msg 3 LE-1
Leader epochs (0, 0), (1, 3)
Leader - A Follower - B
seg 0-2, uuid-1
Leader epochs = (0, 0)
seg 3-5, uuid-2
Leader epochs = (0, 0), (1, 3), (2, 5)
seg 6-8: uuid-3
Leader epochs = (0, 0), (1, 3), (2, 5), (3, 7)
seg 9-10: uuid-4
Leader epochs = (0, 0), (1, 3), (2, 5), (3, 7)
RemoteStorage
Follower Fetch - Out of sync follower
9: msg 9 LE-3
10: msg 10 LE-3
11: msg 11 LE-3
Leader epochs
(0, 0), (1, 3), (2, 5), (3, 7)
0: msg 0 LE-0
1: msg 1 LE-0
2: msg 2 LE-0
3: msg 3 LE-1
Fetch from leader LE-1, 4
Receives OMTS, truncate local segments.
Leader epochs (0, 0), (1, 3)
Leader - A Follower - B
seg 0-2, uuid-1
Leader epochs = (0, 0)
seg 3-5, uuid-2
Leader epochs = (0, 0), (1, 3), (2, 5)
seg 6-8: uuid-3
Leader epochs = (0, 0), (1, 3), (2, 5), (3, 7)
seg 9-10: uuid-4
Leader epochs = (0, 0), (1, 3), (2, 5), (3, 7)
RemoteStorage
Follower Fetch - Out of sync follower
9: msg 9 LE-3
10: msg 10 LE-3
11: msg 11 LE-3
Leader epochs
(0, 0), (1, 3), (2, 5), (3, 7)
Fetch from leader LE-1, 4
Receives OMTS, truncate local segments.
[This is similar to an empty broker]
Fetch ELO
Receives ELO 9, LE-3
Rebuilds leader epoch sequence up to LE-3
Fetch 9, LE-3
Leader epochs
(0, 0), (1, 3), (2, 5), (3, 7)
Leader - A Follower - B
seg 0-2, uuid-1
Leader epochs = (0, 0)
seg 3-5, uuid-2
Leader epochs = (0, 0), (1, 3), (2, 5)
seg 6-8: uuid-3
Leader epochs = (0, 0), (1, 3), (2, 5), (3, 7)
seg 9-10: uuid-4
Leader epochs = (0, 0), (1, 3), (2, 5), (3, 7)
RemoteStorage
Follower Fetch - Out of sync follower
9: msg 9 LE-3
10: msg 10 LE-3
11: msg 11 LE-3
Leader epochs
(0, 0), (1, 3), (2, 5), (3, 7)
Fetch ELO
Receives ELO 9, LE-3
Rebuilds leader epoch
sequence up to LE-3
Fetch 9, LE-3
9: msg 9 LE-3
10: msg 10 LE-3
11: msg 11 LE-3
Leader epochs
(0, 0), (1, 3), (2, 5), (3, 7)
Leader - A Follower - B
seg 0-2, uuid-1
Leader epochs = (0, 0)
seg 3-5, uuid-2
Leader epochs = (0, 0), (1, 3), (2, 5)
seg 9-10: uuid-4
Leader epochs = (0, 0), (1, 3), (2, 5), (3, 7)
RemoteStorage
Follower Fetch - Unclean leader election
Follower Fetch - Unclean leader election
0: msg 0 LE-0
1: msg 1 LE-0
2: msg 2 LE-0
3: msg 3 LE-0 (HW)
Leader epochs : (0, 0)
0: msg 0 LE-0
Leader epochs : (0, 0)
Leader - A Follower - B
seg 0-2: uuid-1
log:
0: msg 0 LE-0
1: msg 1 LE-0
2: msg 2 LE-0
Leader epochs : (0, 0)
RemoteStorage
Follower Fetch - Unclean leader election
0: msg 0 LE-0
1: msg 1 LE-0
2: msg 2 LE-0
3: msg 3 LE-0 (HW)
Leader epochs : (0, 0)
0: msg 0 LE-0
1: msg 4 LE-1
2: msg 5 LE-1
Leader epochs : (0,0), (1, 1)
Leader - A
Stopped
Leader - B
seg 0-2: uuid-1
log:
0: msg 0 LE-0
1: msg 1 LE-0
2: msg 2 LE-0
Leader epochs : (0, 0)
seg 0-1: uuid-2
log:
0: msg 0 LE-0
1: msg 4 LE-1
2: msg 5 LE-1
Leader epochs : (0, 0),
(1,1)
RemoteStorage
Follower Fetch - Unclean leader election
0: msg 0 LE-0
1: msg 1 LE-0
2: msg 2 LE-0
3: msg 3 LE-0 (HW)
1: msg 4 LE-1
2: msg 5 LE-1
Leader epochs : (0, 0),
(1,1)
0: msg 0 LE-0
1: msg 4 LE-1
2: msg 5 LE-1
Leader epochs : (0,0), (1, 1)
Follower-A Leader - B
seg 0-2: uuid-1
log:
0: msg 0 LE-0
1: msg 1 LE-0
2: msg 2 LE-0
Leader epochs : (0, 0)
seg 0-1: uuid-2
log:
0: msg 0 LE-0
1: msg 4 LE-1
2: msg 5 LE-1
Leader epochs : (0, 0),
(1,1)
RemoteStorage
Follower Fetch - Unclean leader election
0: msg 0 LE-0
1: msg 4 LE-1
2: msg 5 LE-1
Leader epochs : (0, 0),
(1,1)
0: msg 0 LE-0
1: msg 4 LE-1
2: msg 5 LE-1
Leader epochs : (0,0), (1, 1)
Follower-A Leader - B
seg 0-2: uuid-1
log:
0: msg 0 LE-0
1: msg 1 LE-0
2: msg 2 LE-0
Leader epochs : (0, 0)
seg 0-1: uuid-2
log:
0: msg 0 LE-0
1: msg 4 LE-1
2: msg 5 LE-1
Leader epochs : (0,0), (1,1)
RemoteStorage
RLMM (RemoteLogMetadataManager) maintains indexes of leader epochs, offsets, and the respective segments
- For a given leader epoch and offset, it returns any segment that contains the respective message
seg-0-2, uuid-1
segment epochs
(0, 0)
seg-0-1, uuid-2
segment epochs
(0, 0), (1, 1)
RemoteLogMetadata Storage
Follower Fetch - Unclean leader election
0: msg 0 LE-0
1: msg 4 LE-1
2: msg 5 LE-1
Leader epochs : (0, 0),
(1,1)
0: msg 0 LE-0
1: msg 4 LE-1
2: msg 5 LE-1
Leader epochs : (0,0), (1, 1)
Follower-A Leader - B
seg 0-2: uuid-1
log:
0: msg 0 LE-0
1: msg 1 LE-0
2: msg 2 LE-0
Leader epochs : (0,0)
seg 0-2: uuid-2
log:
0: msg 0 LE-0
1: msg 4 LE-1
2: msg 5 LE-1
Leader epochs : (0,0), (1,1)
RemoteStorage
Consumer fetch: LE:0, offset:0
- Either of the segments is chosen by RLMM
seg-0-2, uuid-1
segment epochs
(0, 0)
seg-0-1, uuid-2
segment epochs
(0, 0), (1, 1)
RemoteLogMetadata Storage
Consumer fetch: LE:1, offset:1
- msg 4 from seg 0-1
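The lookup by leader epoch and offset can be sketched as follows. This is an illustration only, not the RLMM API: `SegmentMetadata`, `contains`, and `find_segments` are hypothetical names. The segment uuid-2 is labelled "seg 0-1" on the slide, but its log lists offsets 0 to 2, so the sketch assumes an inclusive end offset of 2 for both segments.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SegmentMetadata:
    segment_id: str
    start_offset: int
    end_offset: int                            # inclusive (assumption)
    segment_epochs: Tuple[Tuple[int, int], ...]  # (leader epoch, start offset)

    def contains(self, epoch: int, offset: int) -> bool:
        if not (self.start_offset <= offset <= self.end_offset):
            return False
        for i, (e, first) in enumerate(self.segment_epochs):
            if e != epoch:
                continue
            # An epoch ends just before the next epoch starts, or at segment end.
            if i + 1 < len(self.segment_epochs):
                last = self.segment_epochs[i + 1][1] - 1
            else:
                last = self.end_offset
            return max(first, self.start_offset) <= offset <= last
        return False


def find_segments(segments: List[SegmentMetadata], epoch: int, offset: int) -> List[str]:
    """Return ids of every segment holding (epoch, offset); any one is valid."""
    return [s.segment_id for s in segments if s.contains(epoch, offset)]


segments = [
    SegmentMetadata("uuid-1", 0, 2, ((0, 0),)),
    SegmentMetadata("uuid-2", 0, 2, ((0, 0), (1, 1))),
]
print(find_segments(segments, epoch=0, offset=0))  # both segments qualify
print(find_segments(segments, epoch=1, offset=1))  # only uuid-2 qualifies
```

Keying the lookup on the leader epoch as well as the offset is what keeps the divergent segment uploaded by the old leader (uuid-1) from being served for offsets written under the new leader's epoch.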
Thanks
Q & A

 
Fish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonFish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonHostedbyConfluent
 
Tiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonTiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonHostedbyConfluent
 
Building a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyBuilding a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyHostedbyConfluent
 
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...HostedbyConfluent
 
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...HostedbyConfluent
 
Navigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersNavigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersHostedbyConfluent
 
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformApache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformHostedbyConfluent
 
Explaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubExplaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubHostedbyConfluent
 
TL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonTL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonHostedbyConfluent
 
A Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLA Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLHostedbyConfluent
 
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceMastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceHostedbyConfluent
 
Data Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondData Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondHostedbyConfluent
 
Code-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsCode-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsHostedbyConfluent
 
Debezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemDebezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemHostedbyConfluent
 
Beyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksBeyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksHostedbyConfluent
 

Mais de HostedbyConfluent (20)

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Renaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonRenaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit London
 
Evolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolEvolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at Trendyol
 
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesEnsuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
 
Exactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and KafkaExactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and Kafka
 
Fish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonFish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit London
 
Tiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonTiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit London
 
Building a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyBuilding a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And Why
 
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
 
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
 
Navigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersNavigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka Clusters
 
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformApache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
 
Explaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubExplaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy Pub
 
TL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonTL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit London
 
A Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLA Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSL
 
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceMastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
 
Data Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondData Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and Beyond
 
Code-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsCode-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink Apps
 
Debezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemDebezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC Ecosystem
 
Beyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksBeyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local Disks
 

Último

Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 

Último (20)

Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 

Deep Dive Into Kafka Tiered Storage With Satish Duggana | Current 2022

  • 1. Deep dive into Kafka Tiered Storage Satish D / Kamal
  • 2. Introduction Goals ● Scalability ● Efficiency ○ Operational ○ Cost ● Elasticity Non Goals ● It does not support compacted topics ● It does not support the JBOD feature ● Tiered storage does not replace ETL pipelines
  • 3. Features ● Provides tiering in the storage layer beyond local drives ○ Memory/PageCache ○ Local drive ○ Remote storage support (including cloud stores like S3/GCS/Azure) ■ Same consistency and ordering semantics as local storage ● Improves efficiency ○ Operational ○ Cost ● Isolation of reading latest and old data ● Easy tuning and provisioning of clusters ● No changes required from clients
  • 4. Local and Remote Log Segments [Figure: local log segments 6, 7, 8, 9 and the active segment; remote log segments 1 to 7]
  • 6. Follower Fetch ● There are two main states ○ Fetch ■ Fetch the messages from the leader and append them to its log segment ○ Truncating ■ Truncate the existing data to make sure its log segments follow the same log lineage as the leader
  • 7-11. Follower Fetch 1. Leader copies log segments with the auxiliary state (includes leader epoch cache and producer-id snapshots) to remote storage. 2. Leader publishes remote log segment metadata about the copied remote log segment. 3. Follower tries to fetch the messages from the leader. 4. Follower waits until it has caught up on consuming the required remote log segment metadata. 5. Follower fetches the respective remote log segment metadata to build auxiliary state.
  • 12. Follower Fetch ● Which offset should the follower fetch from? ○ Last tiered offset ○ Earliest local offset [Figure: local and remote log segments over offsets 0 to 12000, with the Earliest Local Offset and the Last Tiered Offset marked]
  • 13. Follower Fetch ● Maintain the same log lineage ● Leader epochs ○ A monotonically increasing number that represents leader transitions ○ Added to each message batch by the leader when it writes into the log segment ○ The leader epoch sequence file maps each epoch to its start offset ■ Maintained by each replica ■ Enables maintaining the log lineage across the replicas
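
The epoch-to-start-offset mapping described on this slide can be sketched as a small in-memory structure. This is only an illustration under stated assumptions: `LeaderEpochCache`, `assign`, `epoch_for_offset`, and `end_offset_for` are hypothetical names, not Kafka's actual classes, and the per-offset epochs fed in are illustrative values.

```python
from bisect import bisect_right


class LeaderEpochCache:
    """Hypothetical in-memory stand-in for the leader epoch sequence file."""

    def __init__(self):
        # (epoch, start_offset) pairs, kept in increasing order of both fields.
        self.entries = []

    def assign(self, epoch, offset):
        """Record the first offset written under a new leader epoch."""
        if self.entries and epoch < self.entries[-1][0]:
            raise ValueError("leader epochs must not decrease")
        if not self.entries or epoch > self.entries[-1][0]:
            self.entries.append((epoch, offset))

    def epoch_for_offset(self, offset):
        """Return the epoch under which `offset` was written, or None."""
        starts = [start for _, start in self.entries]
        index = bisect_right(starts, offset) - 1
        return self.entries[index][0] if index >= 0 else None

    def end_offset_for(self, epoch, log_end_offset):
        """Return the first offset that belongs to a later epoch."""
        for later_epoch, start in self.entries:
            if later_epoch > epoch:
                return start
        return log_end_offset


# Illustrative log: offsets 0-7 written under epochs 0, 0, 0, 1, 1, 2, 2, 3.
cache = LeaderEpochCache()
for offset, epoch in enumerate([0, 0, 0, 1, 1, 2, 2, 3]):
    cache.assign(epoch, offset)
print(cache.entries)
```

Two replicas that hold the same sequence of pairs up to a given offset share the same log lineage up to that offset, which is what the truncation step relies on.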
  • 14. Leader 0: msg 0 LE-0 1: msg 1 LE-0 2: msg 2 LE-0 Leader epochs: (0, 0) Leader - A RemoteStorage
  • 15. Leader 0: msg 0 LE-0 1: msg 1 LE-0 2: msg 2 LE-0 Leader epochs: (0, 0) Leader - A seg 0-1, uuid-1 Leader epochs = (0, 0) RemoteStorage
  • 16. Leader 0: msg 0 LE-0 1: msg 1 LE-0 2: msg 2 LE-0 3: msg 3 LE-1 4: msg 4 LE-1 5: msg 5 LE-1 6: msg 6 LE-1 Leader epochs: (0, 0), (1, 3) Leader - B seg 0-1, uuid-1 Leader epochs = (0, 0) RemoteStorage
  • 17. Leader 0: msg 0 LE-0 1: msg 1 LE-0 2: msg 2 LE-0 3: msg 3 LE-1 4: msg 4 LE-1 5: msg 5 LE-1 6: msg 6 LE-1 Leader epochs: (0, 0), (1, 3) Leader - B seg 0-1, uuid-1 Leader epochs = (0, 0) seg 2-5, uuid-2 Leader epochs = (0, 0), (1, 3) RemoteStorage
  • 18. Leader 0: msg 0 LE-0 1: msg 1 LE-0 2: msg 2 LE-0 3: msg 3 LE-1 4: msg 4 LE-1 5: msg 5 LE-1 6: msg 6 LE-1 7: msg 7 LE-1 Leader epochs: (0, 0), (1, 3) Leader - B seg 0-1, uuid-1 Leader epochs = (0, 0) seg 2-5, uuid-2 Leader epochs = (0, 0), (1, 3) RemoteStorage
  • 19. Follower Fetch - Empty Follower
  • 20. Follower Fetch - Empty follower 1. Fetch from 0 a. Receives OMTS (OffsetMovedToTieredStorage) 2. Fetch ELO (Earliest Local Offset) a. Receives ELO (leader epoch, offset) 3. Fetch remote segment info and build local leader epoch sequence until ELO a. Receives leader epoch sequence, producer-id snapshot 4. Fetch from ELO to HW (high watermark) Leader - A Follower - B
  • 21. Follower Fetch - Empty follower 0: msg 0 LE-0 1: msg 1 LE-0 2: msg 2 LE-0 3: msg 3 LE-1 4: msg 4 LE-1 5: msg 5 LE-2 6: msg 6 LE-2 7: msg 7 LE-3 Leader epochs = (0, 0), (1, 3), (2, 5), (3, 7) Leader - A seg 0-2, uuid-1 Leader epochs = (0, 0) seg 3-5, uuid-2 Leader epochs = (0, 0), (1, 3), (2, 5) RemoteStorage
  • 22. Follower Fetch - Empty follower 3: msg 3 LE-1 4: msg 4 LE-1 5: msg 5 LE-2 6: msg 6 LE-2 7: msg 7 LE-3 Leader epochs = (0, 0), (1, 3), (2, 5), (3, 7) Leader - A seg 0-2, uuid-1 Leader epochs = (0, 0) seg 3-5, uuid-2 Leader epochs = (0, 0), (1, 3), (2, 5) RemoteStorage
  • 23. Follower Fetch - Empty follower 3: msg 3 LE-1 4: msg 4 LE-1 5: msg 5 LE-2 6: msg 6 LE-2 7: msg 7 LE-3 Leader epochs = (0, 0), (1, 3), (2, 5), (3, 7) Leader - A Follower - B seg 0-2, uuid-1 Leader epochs = (0, 0) seg 3-5, uuid-2 Leader epochs = (0, 0), (1, 3), (2, 5) RemoteStorage
  • 24. Follower Fetch - Empty follower 3: msg 3 LE-1 4: msg 4 LE-1 5: msg 5 LE-2 6: msg 6 LE-2 7: msg 7 LE-3 Leader epochs = (0, 0), (1, 3), (2, 5), (3, 7) Fetch offset 0 Leader - A Follower - B seg 0-2, uuid-1 Leader epochs = (0, 0) seg 3-5, uuid-2 Leader epochs = (0, 0), (1, 3), (2, 5) RemoteStorage
  • 25. Follower Fetch - Empty follower 3: msg 3 LE-1 4: msg 4 LE-1 5: msg 5 LE-2 6: msg 6 LE-2 7: msg 7 LE-3 Leader epochs = (0, 0), (1, 3), (2, 5), (3, 7) Fetch offset 0 Receives OMTS Leader - A Follower - B seg 0-2, uuid-1 Leader epochs = (0, 0) seg 3-5, uuid-2 Leader epochs = (0, 0), (1, 3), (2, 5) RemoteStorage
  • 26. Follower Fetch - Empty follower 3: msg 3 LE-1 4: msg 4 LE-1 5: msg 5 LE-2 6: msg 6 LE-2 7: msg 7 LE-3 Leader epochs = (0, 0), (1, 3), (2, 5), (3, 7) Fetch offset 0 Receives OMTS Fetch ELO Leader - A Follower - B seg 0-2, uuid-1 Leader epochs = (0, 0) seg 3-5, uuid-2 Leader epochs = (0, 0), (1, 3), (2, 5) RemoteStorage
  • 27. Follower Fetch - Empty follower 3: msg 3 LE-1 4: msg 4 LE-1 5: msg 5 LE-2 6: msg 6 LE-2 7: msg 7 LE-3 Leader epochs = (0, 0), (1, 3), (2, 5), (3, 7) Fetch offset 0 Receives OMTS Fetch ELO Receives ELO (LE-1, 3) Leader - A Follower - B seg 0-2, uuid-1 Leader epochs = (0, 0) seg 3-5, uuid-2 Leader epochs = (0, 0), (1, 3), (2, 5) RemoteStorage
  • 28. Follower Fetch - Empty follower 3: msg 3 LE-1 4: msg 4 LE-1 5: msg 5 LE-2 6: msg 6 LE-2 7: msg 7 LE-3 Leader epochs = (0, 0), (1, 3), (2, 5), (3, 7) Fetch offset 0 Receives OMTS Fetch ELO Receives ELO (LE-1, 3) Fetch remote segment info and build local leader epoch sequence until ELO Leader - A Follower - B seg 0-2, uuid-1 Leader epochs = (0, 0) seg 3-5, uuid-2 Leader epochs = (0, 0), (1, 3), (2, 5) RemoteStorage
  • 29. Follower Fetch - Empty follower 3: msg 3 LE-1 4: msg 4 LE-1 5: msg 5 LE-2 6: msg 6 LE-2 7: msg 7 LE-3 Leader epochs = (0, 0), (1, 3), (2, 5), (3, 7) Fetch offset 0 Receives OMTS Fetch ELO Receives ELO (LE-1, 3) Fetch remote segment info and build local leader epoch sequence until ELO Leader epochs = (0, 0) Leader - A Follower - B seg 0-2, uuid-1 Leader epochs = (0, 0) seg 3-5, uuid-2 Leader epochs = (0, 0), (1, 3), (2, 5) RemoteStorage
  • 30. Follower Fetch - Empty follower 3: msg 3 LE-1 4: msg 4 LE-1 5: msg 5 LE-2 6: msg 6 LE-2 7: msg 7 LE-3 Leader epochs = (0, 0), (1, 3), (2, 5), (3, 7) Fetch offset 0 Receives OMTS Fetch ELO Receives ELO (LE-1, 3) Fetch remote segment info and build local leader epoch sequence until ELO Fetch from ELO to HW Leader epochs = (0, 0), (1, 3) Leader - A Follower - B seg 0-2, uuid-1 Leader epochs = (0, 0) seg 3-5, uuid-2 Leader epochs = (0, 0), (1, 3), (2, 5) RemoteStorage
  • 32. Follower Fetch - Empty follower 3: msg 3 LE-1 4: msg 4 LE-1 5: msg 5 LE-2 6: msg 6 LE-2 7: msg 7 LE-3 Leader epochs = (0, 0), (1, 3), (2, 5), (3, 7) Fetch from ELO to HW 3: msg 3 LE-1 4: msg 4 LE-1 5: msg 5 LE-2 Leader epochs = (0, 0), (1, 3), (2, 5) Leader - A Follower - B seg 0-2, uuid-1 Leader epochs = (0, 0) seg 3-5, uuid-2 Leader epochs = (0, 0), (1, 3), (2, 5) RemoteStorage
  • 33. Follower Fetch - Empty follower - Summary 1. Fetch offset 0 a. Receives OMTS 2. Fetch EarliestLocalOffset (ELO) a. Receives ELO (leader epoch, offset) 3. Fetch remote segment info and build local leader epoch sequence until ELO a. Receives leader epoch sequence 4. Fetch from ELO to HW Leader - A Follower - B
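
The four summary steps can be traced with a toy simulation. This is a minimal sketch, not Kafka's replica fetcher: the `Leader` class, `bootstrap_empty_follower`, and the dictionary layout of the remote segment metadata are hypothetical, and the high watermark of 6 is an assumption chosen so that the follower ends up holding offsets 3 to 5 as in the walkthrough.

```python
class OffsetMovedToTieredStorage(Exception):
    """Raised when the requested offset is no longer in the leader's local log."""


class Leader:
    def __init__(self, records, high_watermark):
        self.records = dict(records)  # offset -> (message, leader_epoch), local only
        self.high_watermark = high_watermark

    def earliest_local(self):
        offset = min(self.records)
        return self.records[offset][1], offset  # (leader epoch, offset)

    def fetch(self, offset):
        if offset < min(self.records):
            raise OffsetMovedToTieredStorage(offset)
        return [(o, *self.records[o]) for o in sorted(self.records)
                if offset <= o < self.high_watermark]


def bootstrap_empty_follower(leader, remote_segments):
    epochs = []
    try:
        batch = leader.fetch(0)                      # step 1: fetch from offset 0
    except OffsetMovedToTieredStorage:
        _, elo = leader.earliest_local()             # step 2: ask for the ELO
        # Step 3: rebuild the epoch sequence from the remote segment ending before ELO.
        segment = next(s for s in remote_segments if s["start"] <= elo - 1 <= s["end"])
        epochs = [(e, start) for e, start in segment["leader_epochs"] if start < elo]
        batch = leader.fetch(elo)                    # step 4: fetch from ELO to HW
    log = {}
    for offset, message, epoch in batch:
        if not epochs or epochs[-1][0] < epoch:
            epochs.append((epoch, offset))
        log[offset] = (message, epoch)
    return log, epochs


leader = Leader(
    {3: ("msg 3", 1), 4: ("msg 4", 1), 5: ("msg 5", 2), 6: ("msg 6", 2), 7: ("msg 7", 3)},
    high_watermark=6,  # assumed value
)
remote_segments = [
    {"start": 0, "end": 2, "leader_epochs": [(0, 0)]},
    {"start": 3, "end": 5, "leader_epochs": [(0, 0), (1, 3), (2, 5)]},
]
log, epochs = bootstrap_empty_follower(leader, remote_segments)
print(sorted(log), epochs)
```

The sketch leaves out the producer-id snapshot, which the follower also restores from remote storage in step 3.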
  • 34. Follower Fetch - Out of sync follower
  • 35-37. Follower Fetch - Out of sync follower ● Follower trying to catch up with the leader ● Segments are copied to remote storage ○ Locally available ■ Fetch from the leader as it does without tiered storage ○ Locally not available ■ Truncate the data on the follower
  • 38. Follower Fetch - Out of sync follower 0: msg 0 LE-0 1: msg 1 LE-0 2: msg 2 LE-0 3: msg 3 LE-1 4: msg 4 LE-1 5: msg 5 LE-2 6: msg 6 LE-2 7: msg 7 LE-3 8: msg 8 LE-3 9: msg 9 LE-3 Leader epochs (0, 0), (1, 3), (2, 5), (3, 7) 0: msg 0 LE-0 1: msg 1 LE-0 2: msg 2 LE-0 3: msg 3 LE-1 Leader epochs (0, 0), (1, 3) Leader - A Follower - B seg 0-2, uuid-1 Leader epochs = (0, 0) seg 3-5, uuid-2 Leader epochs = (0, 0), (1.3), (2, 5) seg 6-8: uuid-3 Leader epochs = (0, 0), (1.3), (2, 5), (3,7) RemoteStorage
  • 39. Follower Fetch - Out of sync follower 0: msg 0 LE-0 1: msg 1 LE-0 2: msg 2 LE-0 3: msg 3 LE-1 4: msg 4 LE-1 5: msg 5 LE-2 6: msg 6 LE-2 7: msg 7 LE-3 8: msg 8 LE-3 9: msg 9 LE-3 10: msg 10 LE-3 11: msg 11 LE-3 Leader epochs (0, 0), (1, 3), (2, 5), (3, 7) 0: msg 0 LE-0 1: msg 1 LE-0 2: msg 2 LE-0 3: msg 3 LE-1 Leader epochs (0, 0), (1, 3) Leader - A Follower - B seg 0-2, uuid-1 Leader epochs = (0, 0) seg 3-5, uuid-2 Leader epochs = (0, 0), (1.3), (2, 5) seg 6-8: uuid-3 Leader epochs = (0, 0), (1.3), (2, 5), (3,7) RemoteStorage
  • 40. Follower Fetch - Out of sync follower 0: msg 0 LE-0 1: msg 1 LE-0 2: msg 2 LE-0 3: msg 3 LE-1 4: msg 4 LE-1 5: msg 5 LE-2 6: msg 6 LE-2 7: msg 7 LE-3 8: msg 8 LE-3 9: msg 9 LE-3 10: msg 10 LE-3 11: msg 11 LE-3 Leader epochs (0, 0), (1, 3), (2, 5), (3, 7) 0: msg 0 LE-0 1: msg 1 LE-0 2: msg 2 LE-0 3: msg 3 LE-1 Leader epochs (0, 0), (1, 3) Leader - A Follower - B seg 0-2, uuid-1 Leader epochs = (0, 0) seg 3-5, uuid-2 Leader epochs = (0, 0), (1.3), (2, 5) seg 6-8: uuid-3 Leader epochs = (0, 0), (1.3), (2, 5), (3,7) seg 9-10: uuid-4 Leader epochs = (0, 0), (1.3), (2, 5), (3,7) RemoteStorage
  • 41. Follower Fetch - Out of sync follower 9: msg 9 LE-3 10: msg 10 LE-3 11: msg 11 LE-3 Leader epochs (0, 0), (1, 3), (2, 5), (3, 7) 0: msg 0 LE-0 1: msg 1 LE-0 2: msg 2 LE-0 3: msg 3 LE-1 Leader epochs (0, 0), (1, 3) Leader - A Follower - B seg 0-2, uuid-1 Leader epochs = (0, 0) seg 3-5, uuid-2 Leader epochs = (0, 0), (1.3), (2, 5) seg 6-8: uuid-3 Leader epochs = (0, 0), (1.3), (2, 5), (3,7) seg 9-10: uuid-4 Leader epochs = (0, 0), (1.3), (2, 5), (3,7) RemoteStorage
  • 42. Follower Fetch - Out of sync follower. Leader A log: 9: msg 9 LE-3, 10: msg 10 LE-3, 11: msg 11 LE-3; leader epochs (0, 0), (1, 3), (2, 5), (3, 7). Follower B log: 0: msg 0 LE-0, 1: msg 1 LE-0, 2: msg 2 LE-0, 3: msg 3 LE-1; leader epochs (0, 0), (1, 3). Follower B steps: Fetch from leader LE-1, 4. RemoteStorage: seg 0-2, uuid-1, leader epochs (0, 0); seg 3-5, uuid-2, leader epochs (0, 0), (1, 3), (2, 5); seg 6-8, uuid-3, leader epochs (0, 0), (1, 3), (2, 5), (3, 7); seg 9-10, uuid-4, leader epochs (0, 0), (1, 3), (2, 5), (3, 7).
  • 43. Follower Fetch - Out of sync follower. Leader A log: 9: msg 9 LE-3, 10: msg 10 LE-3, 11: msg 11 LE-3; leader epochs (0, 0), (1, 3), (2, 5), (3, 7). Follower B log: 0: msg 0 LE-0, 1: msg 1 LE-0, 2: msg 2 LE-0, 3: msg 3 LE-1; leader epochs (0, 0), (1, 3). Follower B steps: Fetch from leader LE-1, 4. Receives OMTS, truncate local segments. RemoteStorage: seg 0-2, uuid-1, leader epochs (0, 0); seg 3-5, uuid-2, leader epochs (0, 0), (1, 3), (2, 5); seg 6-8, uuid-3, leader epochs (0, 0), (1, 3), (2, 5), (3, 7); seg 9-10, uuid-4, leader epochs (0, 0), (1, 3), (2, 5), (3, 7).
  • 44. Follower Fetch - Out of sync follower. Leader A log: 9: msg 9 LE-3, 10: msg 10 LE-3, 11: msg 11 LE-3; leader epochs (0, 0), (1, 3), (2, 5), (3, 7). Follower B log: 0: msg 0 LE-0, 1: msg 1 LE-0, 2: msg 2 LE-0, 3: msg 3 LE-1; leader epochs (0, 0), (1, 3). Follower B steps: Fetch from leader LE-1, 4. Receives OMTS, truncate local segments. RemoteStorage: seg 0-2, uuid-1, leader epochs (0, 0); seg 3-5, uuid-2, leader epochs (0, 0), (1, 3), (2, 5); seg 6-8, uuid-3, leader epochs (0, 0), (1, 3), (2, 5), (3, 7); seg 9-10, uuid-4, leader epochs (0, 0), (1, 3), (2, 5), (3, 7).
  • 45. Follower Fetch - Out of sync follower. Leader A log: 9: msg 9 LE-3, 10: msg 10 LE-3, 11: msg 11 LE-3; leader epochs (0, 0), (1, 3), (2, 5), (3, 7). Follower B steps: Fetch from leader LE-1, 4. Receives OMTS, truncate local segments. [This is similar to empty broker] RemoteStorage: seg 0-2, uuid-1, leader epochs (0, 0); seg 3-5, uuid-2, leader epochs (0, 0), (1, 3), (2, 5); seg 6-8, uuid-3, leader epochs (0, 0), (1, 3), (2, 5), (3, 7); seg 9-10, uuid-4, leader epochs (0, 0), (1, 3), (2, 5), (3, 7).
  • 46. Follower Fetch - Out of sync follower. Leader A log: 9: msg 9 LE-3, 10: msg 10 LE-3, 11: msg 11 LE-3; leader epochs (0, 0), (1, 3), (2, 5), (3, 7). Follower B steps: Fetch from leader LE-1, 4. Receives OMTS, truncate local segments. Fetch ELO. RemoteStorage: seg 0-2, uuid-1, leader epochs (0, 0); seg 3-5, uuid-2, leader epochs (0, 0), (1, 3), (2, 5); seg 9-10, uuid-4, leader epochs (0, 0), (1, 3), (2, 5), (3, 7).
  • 47. Follower Fetch - Out of sync follower. Leader A log: 9: msg 9 LE-3, 10: msg 10 LE-3, 11: msg 11 LE-3; leader epochs (0, 0), (1, 3), (2, 5), (3, 7). Follower B steps: Fetch from leader LE-1, 4. Receives OMTS, truncate local segments. Fetch ELO. Receives ELO 9, LE-3. Rebuilds leader epoch sequence up to LE-3. Follower B leader epochs (0, 0), (1, 3), (2, 5), (3, 7). RemoteStorage: seg 0-2, uuid-1, leader epochs (0, 0); seg 3-5, uuid-2, leader epochs (0, 0), (1, 3), (2, 5); seg 9-10, uuid-4, leader epochs (0, 0), (1, 3), (2, 5), (3, 7).
  • 48. Follower Fetch - Out of sync follower. Leader A log: 9: msg 9 LE-3, 10: msg 10 LE-3, 11: msg 11 LE-3; leader epochs (0, 0), (1, 3), (2, 5), (3, 7). Follower B steps: Fetch ELO. Receives ELO 9, LE-3. Rebuilds leader epoch sequence up to LE-3. Fetch 9, LE-3. Follower B leader epochs (0, 0), (1, 3), (2, 5), (3, 7). RemoteStorage: seg 0-2, uuid-1, leader epochs (0, 0); seg 3-5, uuid-2, leader epochs (0, 0), (1, 3), (2, 5); seg 9-10, uuid-4, leader epochs (0, 0), (1, 3), (2, 5), (3, 7).
  • 49. Follower Fetch - Out of sync follower. Leader A log: 9: msg 9 LE-3, 10: msg 10 LE-3, 11: msg 11 LE-3; leader epochs (0, 0), (1, 3), (2, 5), (3, 7). Follower B steps: Fetch ELO. Receives ELO 9, LE-3. Rebuilds leader epoch sequence up to LE-3. Fetch 9, LE-3. Follower B log: 9: msg 9 LE-3, 10: msg 10 LE-3, 11: msg 11 LE-3; leader epochs (0, 0), (1, 3), (2, 5), (3, 7). RemoteStorage: seg 0-2, uuid-1, leader epochs (0, 0); seg 3-5, uuid-2, leader epochs (0, 0), (1, 3), (2, 5); seg 9-10, uuid-4, leader epochs (0, 0), (1, 3), (2, 5), (3, 7).
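The out-of-sync follower walk-through in slides 38 to 49 can be sketched as a toy model. This is a minimal illustration, not Kafka's implementation: the `Leader` and `Follower` classes, the `OffsetMovedToTieredStorage` exception (standing in for the OMTS response) and the dictionary layout of remote segments are all hypothetical names. It also assumes that the follower rebuilds its leader epoch sequence from the remote segment ending just before the leader's earliest local offset (read here as the meaning of ELO), which the slides do not spell out.

```python
from dataclasses import dataclass, field


class OffsetMovedToTieredStorage(Exception):
    """Toy stand-in for the OMTS response: the offset is no longer in the leader's local log."""


@dataclass
class Leader:
    local_log: dict  # offset -> (message, leader_epoch)

    def earliest_local_offset(self):
        # Returns (ELO, leader epoch of the record at ELO).
        offset = min(self.local_log)
        return offset, self.local_log[offset][1]

    def fetch(self, offset):
        if offset < min(self.local_log):
            raise OffsetMovedToTieredStorage(offset)
        return {o: rec for o, rec in self.local_log.items() if o >= offset}


@dataclass
class Follower:
    local_log: dict = field(default_factory=dict)
    epochs: list = field(default_factory=list)  # [(leader_epoch, start_offset), ...]

    def catch_up(self, leader, remote_segments):
        next_offset = max(self.local_log, default=-1) + 1
        try:
            self.local_log.update(leader.fetch(next_offset))
            return
        except OffsetMovedToTieredStorage:
            # Truncate local segments; the follower now resembles an empty broker.
            self.local_log.clear()
            self.epochs.clear()
        elo, elo_epoch = leader.earliest_local_offset()
        # Assumption: take the epoch sequence from the remote segment ending at ELO - 1.
        previous = next(s for s in remote_segments if s["end"] == elo - 1)
        self.epochs = [e for e in previous["epochs"] if e[0] <= elo_epoch]
        self.local_log.update(leader.fetch(elo))


# State taken from slide 45: leader keeps offsets 9-11 locally, follower is at offset 3.
leader = Leader({9: ("msg 9", 3), 10: ("msg 10", 3), 11: ("msg 11", 3)})
follower = Follower(
    {0: ("msg 0", 0), 1: ("msg 1", 0), 2: ("msg 2", 0), 3: ("msg 3", 1)},
    [(0, 0), (1, 3)],
)
remote_segments = [
    {"uuid": "uuid-1", "start": 0, "end": 2, "epochs": [(0, 0)]},
    {"uuid": "uuid-2", "start": 3, "end": 5, "epochs": [(0, 0), (1, 3), (2, 5)]},
    {"uuid": "uuid-3", "start": 6, "end": 8, "epochs": [(0, 0), (1, 3), (2, 5), (3, 7)]},
]
follower.catch_up(leader, remote_segments)
print(sorted(follower.local_log), follower.epochs)
```

After `catch_up`, the follower holds only offsets 9 to 11 and the full epoch sequence, matching the end state on slide 49; older offsets stay in remote storage and are not copied to the follower.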
  • 50. Follower Fetch - Unclean leader election
  • 51. Follower Fetch - Unclean leader election. Leader A log: 0: msg 0 LE-0, 1: msg 1 LE-0, 2: msg 2 LE-0, 3: msg 3 LE-0 (HW); leader epochs (0, 0). Follower B log: 0: msg 0 LE-0; leader epochs (0, 0). RemoteStorage: seg 0-2, uuid-1, log 0: msg 0 LE-0, 1: msg 1 LE-0, 2: msg 2 LE-0; leader epochs (0, 0).
  • 52. Follower Fetch - Unclean leader election. Leader A (stopped) log: 0: msg 0 LE-0, 1: msg 1 LE-0, 2: msg 2 LE-0, 3: msg 3 LE-0 (HW); leader epochs (0, 0). Leader B log: 0: msg 0 LE-0, 1: msg 4 LE-1, 2: msg 5 LE-1; leader epochs (0, 0), (1, 1). RemoteStorage: seg 0-2, uuid-1, log 0: msg 0 LE-0, 1: msg 1 LE-0, 2: msg 2 LE-0; leader epochs (0, 0); seg 0-1, uuid-2, log 0: msg 0 LE-0, 1: msg 4 LE-1, 2: msg 5 LE-1; leader epochs (0, 0), (1, 1).
  • 53. Follower Fetch - Unclean leader election. Follower A log: 0: msg 0 LE-0, 1: msg 1 LE-0, 2: msg 2 LE-0, 3: msg 3 LE-0 (HW); 1: msg 4 LE-1, 2: msg 5 LE-1; leader epochs (0, 0), (1, 1). Leader B log: 0: msg 0 LE-0, 1: msg 4 LE-1, 2: msg 5 LE-1; leader epochs (0, 0), (1, 1). RemoteStorage: seg 0-2, uuid-1, log 0: msg 0 LE-0, 1: msg 1 LE-0, 2: msg 2 LE-0; leader epochs (0, 0); seg 0-1, uuid-2, log 0: msg 0 LE-0, 1: msg 4 LE-1, 2: msg 5 LE-1; leader epochs (0, 0), (1, 1).
  • 54. Follower Fetch - Unclean leader election. Follower A log: 0: msg 0 LE-0, 1: msg 4 LE-1, 2: msg 5 LE-1; leader epochs (0, 0), (1, 1). Leader B log: 0: msg 0 LE-0, 1: msg 4 LE-1, 2: msg 5 LE-1; leader epochs (0, 0), (1, 1). RemoteStorage: seg 0-2, uuid-1, log 0: msg 0 LE-0, 1: msg 1 LE-0, 2: msg 2 LE-0; leader epochs (0, 0); seg 0-1, uuid-2, log 0: msg 0 LE-0, 1: msg 4 LE-1, 2: msg 5 LE-1; leader epochs (0, 0), (1, 1). RLMM maintains indexes of epochs, offsets and the respective segments; for a given leader epoch and offset, it returns any segment that contains the respective message. RemoteMetadata Storage: seg-0-2, uuid-1, segment epochs (0, 0); seg-0-1, uuid-2, segment epochs (0, 0), (1, 1).
  • 55. Follower Fetch - Unclean leader election. Follower A log: 0: msg 0 LE-0, 1: msg 4 LE-1, 2: msg 5 LE-1; leader epochs (0, 0), (1, 1). Leader B log: 0: msg 0 LE-0, 1: msg 4 LE-1, 2: msg 5 LE-1; leader epochs (0, 0), (1, 1). RemoteStorage: seg 0-2, uuid-1, log 0: msg 0 LE-0, 1: msg 1 LE-0, 2: msg 2 LE-0; leader epochs (0, 0); seg 0-2, uuid-2, log 0: msg 0 LE-0, 1: msg 4 LE-1, 2: msg 5 LE-1; leader epochs (0, 0), (1, 1). RemoteLogMetadata Storage: seg-0-2, uuid-1, segment epochs (0, 0); seg-0-1, uuid-2, segment epochs (0, 0), (1, 1). Consumer fetch LE:0, offset:0: either of the segments is chosen by RLMM. Consumer fetch LE:1, offset:1: msg 4 from seg 0-1.
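The lookup by leader epoch and offset described for RLMM on slides 54 and 55 can be sketched as follows. This is a hedged toy model rather than the actual RLMM code: `RemoteSegment`, `contains` and `find_segments` are hypothetical names, and the offset ranges are taken from the segment labels in the metadata listing on the slide (seg-0-2 and seg-0-1). The assumption is that a segment matches when the requested offset falls inside the range that the segment records for the requested leader epoch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RemoteSegment:
    uuid: str
    start_offset: int
    end_offset: int
    epochs: tuple  # ((leader_epoch, start_offset), ...) in ascending order

    def contains(self, leader_epoch, offset):
        if not self.start_offset <= offset <= self.end_offset:
            return False
        for i, (epoch, epoch_start) in enumerate(self.epochs):
            if epoch != leader_epoch:
                continue
            # An epoch ends where the next epoch starts, or at the end of the segment.
            if i + 1 < len(self.epochs):
                epoch_end = self.epochs[i + 1][1] - 1
            else:
                epoch_end = self.end_offset
            return max(epoch_start, self.start_offset) <= offset <= epoch_end
        return False


def find_segments(metadata, leader_epoch, offset):
    """Return the uuids of all segments holding the record at (leader_epoch, offset)."""
    return [s.uuid for s in metadata if s.contains(leader_epoch, offset)]


# Metadata entries as listed on the slide.
metadata = [
    RemoteSegment("uuid-1", 0, 2, ((0, 0),)),
    RemoteSegment("uuid-2", 0, 1, ((0, 0), (1, 1))),
]

print(find_segments(metadata, 0, 0))  # both segments qualify
print(find_segments(metadata, 1, 1))  # only the segment written by Leader B
print(find_segments(metadata, 0, 1))  # only the segment written by Leader A
```

Keying the lookup on the leader epoch as well as the offset is what separates the two diverged segments: offset 1 exists in both, but as msg 1 under LE-0 in uuid-1 and as msg 4 under LE-1 in uuid-2.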
  • 57. Q & A