O slideshow foi denunciado.
Utilizamos seu perfil e dados de atividades no LinkedIn para personalizar e exibir anúncios mais relevantes. Altere suas preferências de anúncios quando desejar.
May 2016
Raft in details
Ivan Glushkov
ivan.glushkov@gmail.com
@gliush
Raft is a consensus algorithm for managing a
replicated log
Why
❖ Very important topic
❖ [Multi-]Paxos, the key player, is very difficult to
understand
❖ Paxos -> Multi-Paxos -> diffe...
Raft goals
❖ Main goal - understandability:
❖ Decomposition: separated leader election, log
replication, safety, membershi...
Replicated state machines
❖ The same state on each server
❖ Compute the state from replicated log
❖ Consensus algorithm: k...
Properties of consensus algorithm
❖ Never return an incorrect result: Safety
❖ Functional as long as any majority of the s...
Raft in a nutshell
❖ Elect a leader
❖ Give full responsibility for replicated log to leader:
❖ Leader accepts log entries ...
Raft basics: States
❖ Follower
❖ Candidate
❖ Leader
RequestVote RPC
❖ Become a candidate, if no communication from leader over
a timeout (random election timeout: e.g. 150-30...
Leader Election
❖ Heartbeats from leader
❖ Election timeout
❖ Candidate results:
❖ It wins
❖ Another server wins
❖ No winn...
Leader Election Property
❖ Election Safety: at most one leader can be elected in a
given term
Log Replication
❖ log index
❖ log term
❖ log command
Log Replication
❖ Leader appends command from client to its log
❖ Leader issues AppendEntries RPC to replicate the entry
❖...
Log Replication Properties
❖ Leader Append-Only: a leader never overwrites or
deletes entries in its log; it only appends ...
Inconsistency
Solving Inconsistency
❖ Leader forces the followers’ logs to duplicate its own
❖ Conflicting entries in the follower logs w...
Safety Property
❖ Leader Completeness: if a log entry is committed in a
given term, then that entry will be present in the...
Safety Restriction
❖ Up-to-date: if the logs have last entries with different
terms, then the log with the later term is m...
Follower and Candidate Crashes
❖ Retry indefinitely
❖ Raft RPC are idempotent
Raft Joint Consensus
❖ Two phase to change membership configuration
❖ In Joint Consensus state:
❖ replicate log entries to ...
Log compaction
❖ Independent

snapshotting
❖ Snapshot metadata
❖ InstallSnapshot RPC
Client Interaction
❖ Works with leader
❖ Leader respond to a request when it commits an entry
❖ Assign uniqueID to every c...
Correctness
❖ TLA+ formal specification for consensus alg (400 LOC)
❖ Proof of linearizability in Raft using Coq (community)
Questions?
Próximos SlideShares
Carregando em…5
×

Raft in details

760 visualizações

Publicada em

Raft explained in details, it's simply "Raft paper (Extended edition)"

Publicada em: Software
  • Seja o primeiro a comentar

Raft in details

  1. 1. May 2016 Raft in details Ivan Glushkov ivan.glushkov@gmail.com @gliush
  2. 2. Raft is a consensus algorithm for managing a replicated log
  3. 3. Why ❖ Very important topic ❖ [Multi-]Paxos, the key player, is very difficult to understand ❖ Paxos -> Multi-Paxos -> different approaches ❖ Different optimization, closed-source implementation
  4. 4. Raft goals ❖ Main goal - understandability: ❖ Decomposition: separated leader election, log replication, safety, membership changes ❖ State space reduction
  5. 5. Replicated state machines ❖ The same state on each server ❖ Compute the state from replicated log ❖ Consensus algorithm: keep the replicated log consistent
  6. 6. Properties of consensus algorithm ❖ Never return an incorrect result: Safety ❖ Functional as long as any majority of the severs are operational: Availability ❖ Do not depend on timing to ensure consistency (at worst - unavailable)
  7. 7. Raft in a nutshell ❖ Elect a leader ❖ Give full responsibility for replicated log to leader: ❖ Leader accepts log entries from client ❖ Leader replicates log entries on other servers ❖ Tell servers when it is safe to apply changes on their state
  8. 8. Raft basics: States ❖ Follower ❖ Candidate ❖ Leader
  9. 9. RequestVote RPC ❖ Become a candidate, if no communication from leader over a timeout (random election timeout: e.g. 150-300ms) ❖ Spec: (term, candidateId, lastLogIndex, lastLogTerm) 
 -> (term, voteGranted) ❖ Receiver: ❖ false if term < currentTerm ❖ true if not voted yet and term and log are Up-To-Date ❖ false
  10. 10. Leader Election ❖ Heartbeats from leader ❖ Election timeout ❖ Candidate results: ❖ It wins ❖ Another server wins ❖ No winner for a period of time -> new election
  11. 11. Leader Election Property ❖ Election Safety: at most one leader can be elected in a given term
  12. 12. Log Replication ❖ log index ❖ log term ❖ log command
  13. 13. Log Replication ❖ Leader appends command from client to its log ❖ Leader issues AppendEntries RPC to replicate the entry ❖ Leader replicates entry to majority -> apply entry to state -> “committed entry” ❖ Follower checks if entry is committed by server -> commit it locally
  14. 14. Log Replication Properties ❖ Leader Append-Only: a leader never overwrites or deletes entries in its log; it only appends new entries ❖ Log Matching: If two entries in different logs have the same index and term, then they store the same command. ❖ Log Matching: If two entries in different logs have the same index and term, then the logs are identical in all preceding entries.
  15. 15. Inconsistency
  16. 16. Solving Inconsistency ❖ Leader forces the followers’ logs to duplicate its own ❖ Conflicting entries in the follower logs will be overwritten with entries from the Leader’s log: ❖ Find latest log where two servers agree ❖ Remove everything after this point ❖ Write new logs
  17. 17. Safety Property ❖ Leader Completeness: if a log entry is committed in a given term, then that entry will be present in the logs of the leaders for all higher-numbered terms ❖ State Machine Safety: if a server has applied a log entry at a given index to its state machine, no other server will ever apply a different log entry for the same index.
  18. 18. Safety Restriction ❖ Up-to-date: if the logs have last entries with different terms, then the log with the later term is more up-to- date. If the logs end with the same term, then whichever log is longer is more up-to-date. time
  19. 19. Follower and Candidate Crashes ❖ Retry indefinitely ❖ Raft RPC are idempotent
  20. 20. Raft Joint Consensus ❖ Two phase to change membership configuration ❖ In Joint Consensus state: ❖ replicate log entries to both configuration ❖ any server from either configuration may be a leader ❖ agreement (for election and commitment) requires separate majorities from both configuration
  21. 21. Log compaction ❖ Independent
 snapshotting ❖ Snapshot metadata ❖ InstallSnapshot RPC
  22. 22. Client Interaction ❖ Works with leader ❖ Leader respond to a request when it commits an entry ❖ Assign uniqueID to every command, leader stores latest ID with response. ❖ Read-only - could be without writing to a log, possible stale data. Or write “no-op” entry at start
  23. 23. Correctness ❖ TLA+ formal specification for consensus alg (400 LOC) ❖ Proof of linearizability in Raft using Coq (community)
  24. 24. Questions?

×