The document discusses replication strategies for cloud storage that reduce data loss. Copyset replication groups storage nodes into copysets and places all replicas of a data chunk on the nodes of a single copyset. This lowers the probability of data loss under correlated failures compared to random replication. Tuning the scatter width, which is the number of other nodes that hold copies of a given node's data, controls the parallelism of data recovery after node failures. Wider scatter widths result in lower recovery loads on individual nodes but increase the chance of data loss. The strategies trade off how often data is lost against how much data is lost per event when nodes fail in cloud storage systems.
30. Pragmatic Aspects
• Move randomization to permutation stage
• Low overhead on operations
• Near optimal and fast
• Support for dynamic systems while maintaining guarantees is tricky -> chainsets (http://hackingdistributed.com/2014/02/14/chainsets/)
• Tiered replication (https://www.usenix.org/conference/atc15/technical-session/presentation/cidon)
Show of hands: who read the paper? No problem
Familiar with the idea?
Random replication: a simple replication technique
Great at handling uncorrelated failures
Correlated failures cause small amounts of data to be lost frequently
Data center power outage: 1% of nodes don’t come back up after power is restored
For clusters of more than 300 nodes, data loss is nearly guaranteed in this case
Recover by reading from backup
High cost to restore data
Random replication is nearly guaranteed to lose some data as N increases, due to simultaneous node failures
66% increase
Unexpectedly high cost
Different systems & applications prefer different tradeoffs
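Explain: N choose 3 = 84 (N = 9) copysets with random replication, so any combination of 3 failed nodes would end up losing some data
With copyset replication, data is lost only if the 3 specific nodes of one copyset fail at the same time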
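The counting argument above can be checked with a few lines of Python. This is only a sketch under stated assumptions: the function names are made up for illustration, random replication is assumed to have stored enough chunks that every possible 3-node combination is in use, the copysets are assumed to be non-overlapping groups, and exactly R nodes are assumed to fail at once, chosen uniformly at random.

```python
from math import comb


def random_replication_copysets(n: int, r: int) -> int:
    """Upper bound on distinct copysets when replicas are placed at random."""
    return comb(n, r)


def disjoint_copysets(n: int, r: int) -> int:
    """Copysets when nodes are split into non-overlapping groups of r (assumption)."""
    return n // r


def p_loss_given_r_failures(n: int, r: int, copysets: int) -> float:
    """Chance that r simultaneous random failures hit one complete copyset."""
    return copysets / comb(n, r)


n, r = 9, 3
print(random_replication_copysets(n, r))                             # 84
print(disjoint_copysets(n, r))                                       # 3
print(round(p_loss_given_r_failures(n, r, comb(n, r)), 4))           # 1.0
print(round(p_loss_given_r_failures(n, r, disjoint_copysets(n, r)), 4))  # 0.0357
```

With all 84 combinations in use, any 3 simultaneous failures lose some data; with 3 disjoint copysets, only 3 of the 84 combinations do.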
Randomly pick a node (for load balancing); that node then determines which copyset the data is written to
Probability of all R nodes in a copyset failing at the same time is low
When they all fail, more data is lost
During permutation, different constraints can be set; keep generating permutations until all constraints have been met
At most 1 node of overlap between any 2 copysets
Cover all nodes equally
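A minimal sketch of the permutation stage with these constraints, in Python. It is not the paper's implementation: the function names, the `max_attempts` retry limit, and the requirement that the node count be a multiple of R are assumptions made to keep the example short. Each accepted permutation is cut into consecutive groups of R nodes, so every node is covered exactly once per permutation.

```python
import random
from collections import Counter
from itertools import combinations


def max_pairwise_overlap(copysets):
    """Largest number of nodes shared by any two distinct copysets."""
    return max((len(a & b) for a, b in combinations(copysets, 2)), default=0)


def generate_copysets(nodes, r, num_permutations, max_overlap=1,
                      max_attempts=10_000, seed=None):
    """Keep shuffling until each permutation satisfies the overlap constraint."""
    if len(nodes) % r != 0:
        raise ValueError("this sketch needs len(nodes) to be a multiple of r")
    rng = random.Random(seed)
    copysets = []
    for _ in range(num_permutations):
        for _attempt in range(max_attempts):
            perm = list(nodes)
            rng.shuffle(perm)
            # Cut the permutation into consecutive groups of r nodes.
            candidate = [frozenset(perm[i:i + r]) for i in range(0, len(perm), r)]
            # Constraint: limited overlap with every copyset accepted so far.
            if all(len(c & old) <= max_overlap for c in candidate for old in copysets):
                copysets.extend(candidate)
                break
        else:
            raise RuntimeError("constraints not met within max_attempts")
    return copysets


def pick_copyset(nodes, copysets, rng):
    """Pick a random node (load balancing), then one of its copysets."""
    node = rng.choice(list(nodes))
    return node, rng.choice([c for c in copysets if node in c])


copysets = generate_copysets(list(range(9)), r=3, num_permutations=2, seed=1)
print(sorted(sorted(c) for c in copysets))
print(pick_copyset(range(9), copysets, random.Random(0)))
```

Rejecting a whole permutation and retrying is the simplest way to honor a constraint; it keeps the randomization in the permutation stage, so the write path only needs a lookup of the copysets that contain the chosen node.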