1. EEDC 34330: Execution Environments for Distributed Computing
The CAP theorem in depth
Erasmus Mundus Distributed Computing – EMDC
Homework: Final Project
Group number: EEDC-7.1
Group member: Ioanna Tsalouchidou – ioannatsalouchidou@gmail.com
2. Contents
The Theorem
The CAP theorem's growing impact
CAP twelve years later: how the “Rules” have changed
Perspectives on the CAP theorem
Consistency tradeoffs in modern distributed database
system design
CAP and cloud data management
Overcoming CAP with consistent soft-state replication
Conclusions
3. The Theorem
Consistency: each server returns the correct response to each request.
Availability: each request eventually receives a response.
Partition tolerance: a property of the underlying system rather than of the service; the servers may be split into groups that cannot communicate with each other.
4. The Theorem
“There is a fundamental tradeoff between consistency, availability and network partition tolerance.”
Eric Brewer
“The impossibility of guaranteeing both safety and liveness in an unreliable system”
“Fast, Cheap, Good - Pick Any Two”
J. Noel Chiappa
6. The Theorem
In practice, the CAP tradeoff arises during a timeout. Then a decision must be made:
• Cancel the operation and thus sacrifice availability
• Continue the operation and risk inconsistency
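The timeout decision above can be sketched as a tiny helper. Everything here is a hypothetical illustration: `fetch_remote` and `read_local_cache` are assumed callbacks, with `fetch_remote` raising `TimeoutError` when a partition is suspected.

```python
def read_with_timeout(fetch_remote, read_local_cache, prefer_consistency=True):
    """Sketch of the CAP decision at a timeout (hypothetical callbacks).

    fetch_remote() raises TimeoutError when the remote replica cannot
    be reached; read_local_cache() returns a possibly stale local value.
    """
    try:
        return fetch_remote()          # normal case: no partition observed
    except TimeoutError:
        if prefer_consistency:
            raise                      # cancel the operation: lose availability
        return read_local_cache()      # continue: risk inconsistency
```

The `prefer_consistency` flag is exactly the "2 of 3" choice, made per operation rather than once for the whole system.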
7. The CAP theorem's growing impact
The big-data challenge:
Handling the exponential growth of Web data
Relational DBMSs with ACID properties do not scale well
Alternative solution → NoSQL databases: non-relational, distributed databases
8. The CAP theorem's growing impact
NoSQL databases:
• Flexible schema
• Scale horizontally
• Do NOT support ACID properties
• Store and replicate data across distributed systems
• Achieve scalability and reliability
9. The CAP theorem's growing impact
ACID (Atomicity, Consistency, Isolation, Durability)
vs.
CAP (Consistency, Availability, Partition tolerance)
10. The CAP theorem's growing impact
Within a datacenter:
• Network failures are rare
• No tradeoff between consistency and availability
Cloud providers:
• Maintain multiple datacenters
• Datacenters are geographically separated
• The consistency/availability tradeoff appears
11. CAP: twelve years later
How the “Rules” have changed:
“Any networked shared-data system can have only two of three desirable properties. However, by explicitly handling partitions, designers can optimize consistency and availability, thereby achieving some tradeoff of all three.”
Eric Brewer
12. CAP: twelve years later
Use and abuse of the CAP theorem:
“2 of 3” oversimplifies the tensions among the properties.
CAP prohibits only a tiny part of the design space: perfect availability and consistency in the presence of partitions, which are rare.
Modern CAP:
Maximize the combination of consistency and availability when possible.
Plan for operation during a partition.
Plan for recovery after the partition.
Designers can go beyond CAP's perceived limitations.
13. CAP: twelve years later
Managing Partitions:
Detect the start of the partition.
Partition mode → limited operations.
Partition recovery.
14. CAP: twelve years later
During partition mode:
Which operations to limit depends on the invariants that must be maintained.
Recovery:
Both sides must become consistent again.
Compensate for the mistakes made during the partition.
Compensation:
Track and limit partition-mode operations.
Know which invariants were violated.
Last writer wins.
Still an open problem.
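One concrete (and lossy) compensation strategy is last writer wins. A minimal sketch, assuming each side of the partition kept a map from key to a `(timestamp, value)` pair:

```python
def last_writer_wins(side_a, side_b):
    """Merge the writes of two partitioned sides, keeping for each key
    the write with the latest timestamp. Ties are broken arbitrarily
    (here in favor of side_a), which is exactly why LWW can silently
    drop concurrent updates."""
    merged = dict(side_b)
    for key, (ts, value) in side_a.items():
        if key not in merged or ts >= merged[key][0]:
            merged[key] = (ts, value)
    return merged
```

The silent loss of one of two concurrent writes is why compensation remains an open problem: LWW restores consistency, but not necessarily the state the users intended.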
15. Perspectives on the CAP theorem
“The CAP theorem is one example of a more general tradeoff between safety and liveness.”
Gilbert and Lynch
16. Perspectives on the CAP theorem
Safety property → holds at every point of every execution – Consistency.
Liveness property → if the execution continues long enough, something desirable eventually happens – Availability.
CAP → no protocol implementing an atomic read/write register can guarantee both safety and liveness in a partition-prone system.
17. Perspectives on the CAP theorem
Agreement:
• Fault-tolerant agreement is impossible in an asynchronous system.
Requirements for consensus:
• Agreement: all processes output the same value (safety).
• Validity: every output value was provided as the input of some process (safety).
• Termination: every process must eventually output a value (liveness).
Consensus:
• Guaranteeing both safety and liveness is impossible if the system is potentially faulty.
18. Perspectives on the CAP theorem
Safety/liveness tradeoff for consensus:
Under which circumstances can we have both?
Network synchrony
• A wholly synchronous network → the tradeoff is wholly avoided
• Cynthia Dwork → eventual synchrony
• Tushar Chandra → failure detectors
Consistency
• What is the maximum achievable level of consistency?
• Soma Chaudhuri → set agreement
• 1-set agreement is exactly consensus
• Tolerating t crash failures requires ⌊t/k⌋ + 1 rounds to achieve k-set agreement
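The round bound in the last bullet is simple arithmetic; with k = 1 it recovers the classical t + 1 round bound for consensus:

```python
def set_agreement_rounds(t, k):
    """Rounds needed for k-set agreement with up to t crash failures
    in a synchronous system: floor(t/k) + 1."""
    return t // k + 1
```

For example, tolerating t = 6 crash failures needs 7 rounds for consensus (k = 1) but only 4 rounds for 2-set agreement: relaxing consistency buys speed.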
19. Perspectives on the CAP theorem
Practical implications
In an unreliable system you can choose to sacrifice
• Availability
• Consistency
• A moderate approach – sacrifice both dynamically
– Respond to most user requests
– Enforce consistency only when necessary
20. Perspectives on the CAP theorem
Best-effort availability
– The most common approach
– Guarantees consistency regardless of network behavior
– Suitable when communication is typically reliable
– Example: servers in the same datacenter, where partitions are rare
21. Perspectives on the CAP theorem
Best-effort consistency
– Sometimes unavailability is not an option
– Inconsistency is not a major problem
– Examples: Web caches, services with image and video content
– Best effort to serve up-to-date data
– No assurance that all users see the same content
– No strong-consistency requirement
22. Perspectives on the CAP theorem
Balancing consistency and availability
Neither strong consistency nor continual availability.
Applications specify the level of continuous consistency they need.
Airline reservation system:
• Many free seats → sacrifice consistency
• A few seats left → sacrifice availability
Allow inconsistency of data when consistency is not needed.
Accept unavailability only when a major network partition happens.
→ Increase the system's robustness to network disruption before sacrificing availability.
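The airline example amounts to a per-request policy switch. A minimal sketch, where `book_locally` and `coordinate_with_replicas` are hypothetical callbacks and the 10-seat threshold is an arbitrary assumption:

```python
def book_seat(free_seats, book_locally, coordinate_with_replicas, threshold=10):
    """Dynamically trade consistency for availability (sketch).

    Many free seats: book locally and reconcile later (available,
    possibly inconsistent).  Few seats left: insist on coordinating
    with the other replicas, failing if they are unreachable
    (consistent, possibly unavailable)."""
    if free_seats > threshold:
        return book_locally()
    return coordinate_with_replicas()
```

The same shape applies to any application that can rank its operations by how much inconsistency they can tolerate.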
23. Tradeoffs in modern distributed db
“The CAP theorem's impact on modern DDBSs is more limited than is often perceived.”
Daniel J. Abadi
24. Tradeoffs in modern distributed db
“It is wrong to assume that DDBSs that reduce
consistency in the absence of any partitions are
doing so due to CAP-based decision-making”
25. Tradeoffs in modern distributed db
Consistency/latency tradeoff:
• Availability ~ latency: an unavailable system provides extreme latency
• Exists even without network partitions
• If the system runs long enough, at least one component will fail
• Highly available systems must replicate data
The occurrence of a failure causes CAP tradeoffs; the possibility of failure causes the consistency/latency tradeoff.
26. Tradeoffs in modern distributed db
Data replication:
As soon as a DDBS replicates data, a tradeoff between consistency and latency arises.
Replication alternatives:
• Send data updates to all replicas at the same time.
• Send data updates first to an agreed-upon master node.
• Send data updates first to a single, arbitrary node.
Each implementation comes with its own consistency/latency tradeoff.
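The three alternatives can be sketched against a toy replica model. Everything below is a hypothetical illustration of the tradeoff, not any particular system's API:

```python
class Replica:
    """Toy replica: applied updates are visible to reads; queued
    updates are still waiting for asynchronous propagation."""
    def __init__(self):
        self.state = []   # updates already applied (visible)
        self.queue = []   # updates awaiting async propagation

    def apply(self, update):
        self.state.append(update)

def update_all_replicas(update, replicas):
    """Alternative 1: wait for every replica.  Reads are consistent
    anywhere, but write latency is that of the slowest replica."""
    for r in replicas:
        r.apply(update)

def update_via_master(update, master, followers):
    """Alternative 2: apply at the agreed master, propagate to the
    others asynchronously.  Master reads are consistent; follower
    reads may lag."""
    master.apply(update)
    for r in followers:
        r.queue.append(update)   # async propagation, sketched as a queue

def update_any_node(update, node):
    """Alternative 3: apply at a single arbitrary node and reconcile
    later.  Lowest latency, weakest consistency."""
    node.apply(update)
```

Reading from a follower's `state` after `update_via_master` shows exactly where the inconsistency window lives in each design.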
27. Tradeoffs in modern distributed db
PACELC:
If there is a Partition, how does the system trade off Availability and Consistency? Else, when the system is running normally in the absence of partitions, how does it trade off Latency and Consistency?
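PACELC reduces to two binary choices, which makes the taxonomy easy to state in code. A hypothetical helper, purely to illustrate the naming scheme (labels such as PA/EL or PC/EC):

```python
def pacelc_label(partition_choice, else_choice):
    """Build a PACELC classification: during a Partition the system
    favors Availability or Consistency; Else it favors Latency or
    Consistency."""
    if partition_choice not in ("A", "C") or else_choice not in ("L", "C"):
        raise ValueError("partition_choice in {A, C}, else_choice in {L, C}")
    return f"P{partition_choice}/E{else_choice}"
```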
28. CAP and cloud data management
Web applications must scale on demand:
• Serve requests with low latency
• Provide high throughput
• Be highly available
• Run at minimum operational cost
29. CAP and cloud data management
Coordinating all updates through a master has performance and availability implications.
PNUTS → automatically migrates the master to be close to the writers.
The impact on performance and availability is insignificant for Yahoo's applications:
• Localized user access patterns
31. Overcoming CAP: replication
First-tier cloud services:
A new consistency model for data replication.
Combines agreement on update ordering with amnesia freedom.
→ Surprising levels of scalability and performance.
32. Overcoming CAP: replication
The ISIS system:
Supports virtually synchronous process groups.
Reliable multicast with various ordering options:
• The Send primitive is FIFO-ordered
• The Ordered primitive guarantees total order
• The barrier primitive Flush → amnesia freedom: delays until prior unstable multicasts have reached their destinations
• SafeSend → a virtually synchronous version of Paxos, with in-memory or on-disk durability
34. Conclusions
CAP → “2 of 3” in unreliable systems: do not blindly sacrifice consistency or availability when partitions exist.
Safety/liveness tradeoff.
Failures → CAP tradeoffs; the possibility of failure → the consistency/latency tradeoff.
Replication.
PACELC.
35. References
[1] Guest Editor's Introduction: The CAP Theorem's Growing Impact -
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6155651
[2] Pushing the CAP: Strategies for Consistency and Availability -
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6133253
[3] Perspectives on the CAP Theorem -
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6122006
[4] Consistency Tradeoffs in Modern Distributed Database System Design:
CAP is Only Part of the Story -
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6127847
[5] CAP and Cloud Data Management -
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6122007