8. Replication Process
• Record oplog entry on write
• Idempotent entries
• Pulled by replicas
1. Read over network
2. Buffer locally
3. Apply in batch
4. Repeat
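The pull loop above can be sketched as a toy model in Python. This is an illustration only, not MongoDB's implementation: the `Primary` and `Replica` classes, the `fetch_since` method, and the `(ts, key, value)` entry layout are all hypothetical. Each entry records the resulting value rather than the operation, which is what makes re-applying it harmless.

```python
from collections import deque


class Primary:
    """Toy primary: every write also appends an oplog entry."""

    def __init__(self):
        self.data = {}
        self.oplog = []  # list of (ts, key, value)

    def write(self, key, value):
        self.data[key] = value
        ts = len(self.oplog) + 1
        # Store the final value, not "increment by n", so the entry is idempotent.
        self.oplog.append((ts, key, value))

    def fetch_since(self, ts):
        """Hypothetical stand-in for the read over the network."""
        return [e for e in self.oplog if e[0] > ts]


class Replica:
    """Toy replica that pulls entries from its source."""

    def __init__(self, source):
        self.source = source
        self.data = {}
        self.buffer = deque()
        self.last_fetched = 0
        self.last_applied = 0

    def pull_once(self, batch_size=2):
        # 1. Read over network
        entries = self.source.fetch_since(self.last_fetched)
        # 2. Buffer locally
        self.buffer.extend(entries)
        if entries:
            self.last_fetched = entries[-1][0]
        # 3. Apply in batch (step 4, repeat, is the caller's loop)
        while self.buffer:
            n = min(batch_size, len(self.buffer))
            self.apply_batch([self.buffer.popleft() for _ in range(n)])

    def apply_batch(self, batch):
        for ts, key, value in batch:
            self.data[key] = value
            self.last_applied = max(self.last_applied, ts)
```

Calling `pull_once()` in a loop gives step 4. Because entries are idempotent, replaying a batch that was already applied leaves the replica's data unchanged.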
9. Read + Apply Decoupled
• Background oplog reader thread
• Pool of oplog applier threads (by collection)
[Diagram labels: Repl Source, Network, Buffer, Applier Thread Pool (16), DB1, DB2, DB3, DB4, Local Oplog]
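A minimal sketch of the decoupling, assuming a simple in-memory model: one background thread fills a buffer while a pool of workers applies each batch, partitioned by collection. The function names, the `(collection, key, value)` entry shape, and the pool and batch sizes are hypothetical choices for illustration.

```python
import queue
import threading
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def reader(source_entries, buf):
    """Background reader: moves entries from the source into the local buffer."""
    for entry in source_entries:
        buf.put(entry)
    buf.put(None)  # sentinel: no more entries


def apply_group(target, ops):
    """Apply one collection's ops in their original order."""
    for key, value in ops:
        target[key] = value


def applier(buf, db, pool_size=4, batch_size=3):
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        done = False
        while not done:
            # Take up to batch_size entries from the buffer.
            batch = []
            while len(batch) < batch_size:
                entry = buf.get()
                if entry is None:
                    done = True
                    break
                batch.append(entry)
            # Partition by collection so a collection is handled by one worker
            # per batch and its ops keep their relative order.
            groups = defaultdict(list)
            for collection, key, value in batch:
                groups[collection].append((key, value))
            futures = [pool.submit(apply_group, db[c], ops)
                       for c, ops in groups.items()]
            for f in futures:
                f.result()  # finish the whole batch before starting the next


def replicate(source_entries):
    buf = queue.Queue()
    db = defaultdict(dict)
    t = threading.Thread(target=reader, args=(source_entries, buf))
    t.start()
    applier(buf, db)
    t.join()
    return db
```

Reading never waits on applying here: the reader only blocks on the source, and the applier only blocks on the buffer.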
11. Good Replication States
• Initial Sync
o Record oplog start position
o Clone/copy all dbs
o Set minvalid, apply oplog since start
o Build indexes
• Replication Batch: MinValid
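The initial sync steps can be illustrated with a toy simulation. It assumes that minvalid means "the oplog position the replica must reach before its data is consistent", which is one reading of the bullets above; the function name and data layout are hypothetical, and index building is omitted.

```python
def initial_sync(source_data, source_oplog, writes_during_clone):
    """Toy initial sync.

    source_oplog is a list of (ts, key, value); writes_during_clone are
    (key, value) writes the source accepts while the copy is in progress.
    """
    # Record oplog start position.
    start_ts = source_oplog[-1][0] if source_oplog else 0

    # Clone: copy one key at a time while the source keeps taking writes.
    local = {}
    for i, key in enumerate(list(source_data)):
        local[key] = source_data[key]
        if i < len(writes_during_clone):
            k, v = writes_during_clone[i]
            next_ts = source_oplog[-1][0] + 1 if source_oplog else 1
            source_data[k] = v
            source_oplog.append((next_ts, k, v))

    # Set minvalid: the copy is only trustworthy once we reach this point.
    min_valid = source_oplog[-1][0] if source_oplog else 0

    # Apply oplog since start. Idempotent entries make it safe to re-apply
    # writes the clone may already have picked up.
    applied = start_ts
    for ts, k, v in source_oplog:
        if ts > start_ts:
            local[k] = v
            applied = ts

    return local, min_valid, applied >= min_valid
```

In this model the cloned copy misses the write to `a` made mid-copy; replaying the oplog from the recorded start position repairs it.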
15. Election Nomination Disqualifications
A replica will nominate itself unless:
• Priority:0 or arbiter
• Not freshest
• Just stepped down (in unelectable state)
• Would be vetoed by anyone because
o There is a Primary already
o They don't have us in their config
o Higher priority member out there
• Higher config version out there
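The disqualification list reads naturally as a chain of checks. The sketch below is a hypothetical model, not the server's code: the `Member` fields and the `disqualification` function are invented for illustration, and optimes are plain integers.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str
    priority: float = 1.0
    arbiter: bool = False
    optime: int = 0
    is_primary: bool = False
    config_version: int = 1
    known_members: frozenset = frozenset()  # names in this member's config
    stepped_down_recently: bool = False


def disqualification(me, others):
    """Return the first reason `me` should not nominate itself, or None."""
    if me.priority == 0 or me.arbiter:
        return "priority 0 or arbiter"
    if any(o.optime > me.optime for o in others if not o.arbiter):
        return "not freshest"
    if me.stepped_down_recently:
        return "just stepped down"
    # Reasons another member would veto us.
    for o in others:
        if o.is_primary:
            return "there is a primary already"
        if me.name not in o.known_members:
            return f"{o.name} does not have us in its config"
        if o.priority > me.priority:
            return "higher priority member out there"
    if any(o.config_version > me.config_version for o in others):
        return "higher config version out there"
    return None
```

A member nominates itself only when the function returns `None`.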
16. The Election
Nomination:
• If it looks like a tie, sleep random time (unless first node)
Voting:
• If all goes well, only one nominee
• All voting members vote for one nominee
• Majority of votes wins
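A small sketch of the two phases, under stated assumptions: the random sleep is modeled as a returned delay, and "majority" is taken to mean a strict majority of all voting members rather than of the votes actually cast. Both function names and the `max_sleep` parameter are hypothetical.

```python
import random


def nomination_delay(looks_like_tie, is_first_node, rng=random, max_sleep=1.0):
    """Seconds to wait before nominating; random back-off breaks likely ties."""
    if looks_like_tie and not is_first_node:
        return rng.uniform(0, max_sleep)
    return 0.0


def tally(votes, voting_members):
    """votes maps voter -> nominee. Return the winner, or None if no majority."""
    needed = len(voting_members) // 2 + 1
    counts = {}
    for voter, nominee in votes.items():
        if voter in voting_members:  # ignore non-voting members
            counts[nominee] = counts.get(nominee, 0) + 1
    for nominee, n in counts.items():
        if n >= needed:
            return nominee
    return None
```

With five voting members, three votes elect a nominee; a 2-2 split elects nobody, which is the outcome the random sleep is meant to make rare.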
22. Replication Source Selection
• Select closest source
o Limit to members that are neither hidden nor slave delayed
o If none qualify, try again including hidden/slave-delayed members
o Select node with fastest "ping" time
o Must be fresher
• Choose source when
o Starting
o Any error with existing source (network, query)
o Any member is 30s ahead of current source
• Manual override
o replSetSyncSource -- good until we choose again
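The selection rules above can be sketched as two small functions. This is a hypothetical model: `Candidate`, `choose_sync_source`, and `should_rechoose` are illustrative names, and optimes are treated as plain seconds so the 30s rule becomes a subtraction.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    optime: int          # seconds, toy representation
    ping_ms: float
    hidden: bool = False
    slave_delay: int = 0


def choose_sync_source(my_optime, candidates):
    """Pick the closest member that is fresher than we are, or None."""
    fresher = [c for c in candidates if c.optime > my_optime]
    # First pass: skip hidden and slave-delayed members.
    preferred = [c for c in fresher if not c.hidden and c.slave_delay == 0]
    # Second pass: fall back to hidden/slave-delayed members if nothing else.
    pool = preferred or fresher
    return min(pool, key=lambda c: c.ping_ms, default=None)


def should_rechoose(current, candidates, starting=False, error=False, max_lead=30):
    """True when it is time to pick a sync source again."""
    if starting or error or current is None:
        return True
    return any(c.optime - current.optime >= max_lead for c in candidates)
```

A manual override would simply replace the chosen source until the next time `should_rechoose` returns true.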
24. Goal: Dynamic Reads
Controls for consistency
• Default to Primary
• Non-primary allowed
• Based on
o Locality (ping/tags)
o Tags
[Diagram: Client reading from a replica set with one primary (P) and two secondaries (S); tag labels "A, B" and "B, C"]
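How a client might route a read under these controls can be sketched as follows. This is a toy model rather than any driver's API: `Node`, `pick_read_target`, and the ping values are hypothetical, and tags are modeled as simple label sets like the "A, B" and "B, C" labels in the diagram.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    role: str            # "P" for primary, "S" for secondary
    ping_ms: float
    tags: frozenset = frozenset()


def pick_read_target(nodes, allow_secondary=False, required_tags=frozenset()):
    """Default to the primary; otherwise pick the nearest node with the tags."""
    if not allow_secondary:
        return next((n for n in nodes if n.role == "P"), None)
    # Non-primary reads allowed: filter by tags, then choose by locality (ping).
    eligible = [n for n in nodes if required_tags <= n.tags]
    return min(eligible, key=lambda n: n.ping_ms, default=None)
```

Reads that default to the primary see the latest writes; allowing secondaries trades that consistency for locality.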