SlideShare uma empresa Scribd logo
1 de 34
All you didn’t know about the CAP theorem
CAP theorem
The theorem was presented on the Symposium on Principles of
Distributed Computing in 2000 by Eric Brewer.
In 2002, Seth Gilbert and Nancy Lynch of MIT published a
formal proof of Brewer's conjecture, rendering it a theorem.
According to Brewer, he wanted the
community to start conversation about it,
but his words have been corrected and
treated as a theorem.
What stands behind the cap?
The CAP theorem states that in the distributed system you can
choose only 2 out of 3:
Consistency: every read would get you the most recent write
Availability: every node (if not failed) always executes
queries (read and writes)
Partition-tolerance: even if the connections between nodes
are down, the other two (A & C) promises, are kept.
AP Proof
Not consistent!
CP Proof
Not available!
CA Proof
No partition tolerance!
the CAP-triangle
Let’s take a look at the postgresql
Master/slave architecture is one of common solutions
Slave can be synced with master in async/sync way
The transaction system use two-phase commit to ensure consistency
If a partition occurs you can’t talk to the server (in the basic case), the
system is not CAP-available.
So, it can’t continue work in case of network partitioning, but it provides
strong consistency and high-availability. It’s a CA system!
Let’s take a look at the mongodb
MongoDB provides strong consistency, because it is a single-master system and
all writes go to the primary by default
MongoDB provides automatic failover in case of partitioning
If a partition occurs it will stop accepting writes to the system until it
believes that it can safely complete those writes.
So, it can continue work in case of network partitioning and it gives up
availability. It’s a CP system!
Let’s take a look at the
postgres + salesforce + heroku connect system
It’s master-master system
Heroku connect is responsible for keeping system consistency
Salesforce data and our postgres db doesn’t know about each other
If network partition occurs both storages will be available. Heroku connect
will try to reconnect.
So, it can continue work in case of network partitioning and it gives up
consistency. It should be an AP system!
It is so easy! Now I know everything!
Actually, no.
There are a lot of problems with CAP theorem:
CAP uses very narrow and far-from-the-real-world definitions
Actually, it is the choice only between consistency and availability
Many systems are neither CAP-consistent nor CAP-available
Pure AP systems are useless
Pure CP systems might behave not as expected
What is wrong with definitions
Consistency in CAP actually means linearizability (and it’s
really hard to reach it).
Availability in CAP is defined as “every request received by
a non-failing [database] node in the system must result in
a [non-error] response” and it’s not restricted by time.
The only fault considered by the CAP theorem is a network
partition.
Linearizability
Why node failures are outside CAP?
By the definition of availability: ...every node (if not
failed) always...
By the proof of CAP: the proof of CAP used by Gilbert and
Lynch relies on having code running on both sides of the
partition.
In some cases, a partition will be equivalent to a failure,
but this equivalence will be obtained by implementing a
specific logic in all the nodes.
Of course, we should manage node failures, but CAP doesn’t
help us here.
AP / CP choice
Partition Tolerance basically means that you’re communicating
over an asynchronous network that may delay or drop messages.
The internet and all our data centers have this property, so
you don’t really have any choice in this matter.
Many systems are only “p”
In case you have one master and
one slave, and you are
partitioned from the master - you
can’t write, but you can read.
It’s not CAP-available.
Ok, it’s a CP system, but usually
sync between slave and master is
async and there might be a gap
between sync and system
partitioning, so you do not have
CAP-consistency.
AP and CP problems
Pure AP is useless, it may just return any random value and
it would be an AP system
Pure CP is useless too, because partitioning in CAP have no
fixed duration, so the system provides only eventual
consistency, which is not the strong one that we want to
have.
Try to digest it
How to describe distributed system
Remember about CAP narrow definitions, as they
are still widely used
Use PACELC(A) theorem instead of CAP, it
provides additional consistency/latency
tradeoff
Describe how ACID/BASE principles apply to
your system
Decide if the system suits your needs,
considering the project you are working on.
Let’s take a look on PACELC(A)
The PACELC theorem was first described and formalised by Daniel J. Abadi from
Yale University in 2012.
IF there is a partition (P), how does the system tradeoff availability and
consistency (A and C)
ELSE (E) When the system is running normally in the absence of partitions, how
does the system trade off latency (L) and consistency (C)?
As PACELC theorem is based on CAP, it also uses CAP definitions.
PACELC(A) described systems
Let’s take a look on ACID
ACID - is a set of properties of database transactions.
Jim Gray defined these properties of a reliable transaction system in the late
1970s and developed technologies to achieve them automatically.
Database vendors long time ago introduced 2 phase commit for providing ACID
across multiple database instances.
Let’s take a look on ACID
Atomicity. All of the operations in the transaction will complete, or none
will
Consistency. The database will be in a consistent state when the transaction
begins and ends
Isolation. The transaction will behave as if it is the only operation being
performed upon the database
Durability. Once a transaction has been committed, it will remain so, even in
the event of power loss, crashes, or errors.
CAP/ACID definition confusion
Consistency in ACID relates to data integrity,
whereas Consistency in CAP is a reference for
Atomic Consistency (Linearizability), which
is a consistency model
Isolation term is not used in CAP, but it’s
definition in ACID is actually the same for
what linearizability stands for
Availability in ACID is not used in definitions,
but when it’s presented in articles, it means
the same as CAP-availability, apart that it’s
not required for all non-failing nodes to
respond.
Let’s take a look on BASE
Eventually consistent services are often classified
as providing BASE semantics, in contrast to
traditional ACID guarantees.
One of the earliest definitions of eventual
consistency comes from a 1988.
BASE essentially embraces the fact that true
consistency cannot be achieved in the real world,
and as such cannot be modelled in highly scalable
distributed systems.
What is BASE
Basic Availability states that there will be a response to any request, but,
that response could still be a “failure” or the data may be in an
inconsistent or changing state
Soft-state state of the system could change over time due to “eventual
consistency” changes
Eventual consistency states that the system will eventually become consistent,
the system will continue to receive input and is not checking the
consistency of every transaction before it moves onto the next one
Try to digest it too...
Fresh look on the postgres
Postgres does allow multiple cluster configuration, so it’s really hard to
describe all of them. Let’s just take the master - slave replication with the
Slony implementation.
The system works according ACID (there are couple of problems with two-phase
commit, but mostly it’s reliable)
In case of partition the Slony will try to proceed switchover and if all went
fine, we have our new master with its consistency
When there is no partition the Slony gives up latency and does everything to
approach strong consistency. Actually, ACID is the reason off high latency
The system is considered as PC/EC(A)
Fresh look on the mongodb
MongoDB is a NOSQL database.
MongoDB is ACID in limited sense at the document level
In case of distributed system - it’s all about BASE
In case of no partition, the system guarantees reads and writes to be
consistent
If the master node will be failed or partitioned from the rest of the system,
some data will not be replicated. System elects a new master to remain
available for reads and writes.(New master and old master are inconsistent)
The system is considered as PA/EC(A), as most of the nodes remain CAP-available
in case of partition.
fresh look on the
postgres + salesforce + heroku connect system
The Salesforce is an abstraction to oracle database (RDBMS) and the Postgres is
an actual RDBMS. Heroku connect is a tool to sync those too dbs. Let’s see what
we have…
Heroku connect, Postgres and Salesforce provide us ACID, but we can’t operate
with related entities
Surprise! The full system looks more like a BASE.
It uses streaming API to sync the entities.
In case of partition, both systems will be available, but not consistent
In case of no partition, the heroku connect try to sync as much as possible,
Conclusion
The distributed systems might and should be understanded by a lot of metrics
and terms, but the start point is it’s tradeoffs
It’s really difficult to classify an abstract system. You should decide what
kind of system do you want and then - look at what you can achieve
The 2 point is the reason why to find any good articles about that is not easy
either
Do not overwhelm yourself by that work, you should be a scientist to do it 99 %
correctly
Links
https://dzone.com/articles/better-explaining-cap-theorem
https://martin.kleppmann.com/2015/05/11/please-stop-calling-databases-cp-or-ap.html
https://habrahabr.ru/post/231703/
http://blog.thislongrun.com/2015/03/the-confusing-cap-and-acid-wording.html
https://neo4j.com/blog/acid-vs-base-consistency-models-explained/
http://databases.about.com/od/databasetraining/a/databasesbegin.htm
https://brooker.co.za/blog/2014/07/16/pacelc.html
https://www.postgresql.org/files/developer/transactions.pdf
https://www.airpair.com/postgresql/posts/sql-vs-nosql-ko-postgres-vs-mongo
http://jennyxiaozhang.com/nosql-hbase-vs-cassandra-vs-mongodb/

Mais conteúdo relacionado

Mais procurados

K-means Clustering
K-means ClusteringK-means Clustering
K-means ClusteringAnna Fensel
 
Services comparison among Microsoft Azure AWS and Google Cloud Platform
Services comparison among Microsoft Azure AWS and Google Cloud PlatformServices comparison among Microsoft Azure AWS and Google Cloud Platform
Services comparison among Microsoft Azure AWS and Google Cloud Platformindu Yadav
 
Cloud Computing Security
Cloud Computing SecurityCloud Computing Security
Cloud Computing SecurityNinh Nguyen
 
Scalability and Reliability in the Cloud
Scalability and Reliability in the CloudScalability and Reliability in the Cloud
Scalability and Reliability in the Cloudgmthomps
 
DBMS _Relational model
DBMS _Relational modelDBMS _Relational model
DBMS _Relational modelAzizul Mamun
 
3.3 hierarchical methods
3.3 hierarchical methods3.3 hierarchical methods
3.3 hierarchical methodsKrish_ver2
 
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...Md. Main Uddin Rony
 
Latency and Consistency Tradeoffs in Modern Distributed Databases
Latency and Consistency Tradeoffs in Modern Distributed DatabasesLatency and Consistency Tradeoffs in Modern Distributed Databases
Latency and Consistency Tradeoffs in Modern Distributed DatabasesScyllaDB
 
Cloud Computing - Benefits and Challenges
Cloud Computing - Benefits and ChallengesCloud Computing - Benefits and Challenges
Cloud Computing - Benefits and ChallengesThoughtWorks Studios
 
Storage As A Service (StAAS)
Storage As A Service (StAAS)Storage As A Service (StAAS)
Storage As A Service (StAAS)Shreyans Jain
 
08. Object Oriented Database in DBMS
08. Object Oriented Database in DBMS08. Object Oriented Database in DBMS
08. Object Oriented Database in DBMSkoolkampus
 
Clustering in data Mining (Data Mining)
Clustering in data Mining (Data Mining)Clustering in data Mining (Data Mining)
Clustering in data Mining (Data Mining)Mustafa Sherazi
 
Cluster Computers
Cluster ComputersCluster Computers
Cluster Computersshopnil786
 
11. Storage and File Structure in DBMS
11. Storage and File Structure in DBMS11. Storage and File Structure in DBMS
11. Storage and File Structure in DBMSkoolkampus
 
Association Rule Learning Part 1: Frequent Itemset Generation
Association Rule Learning Part 1: Frequent Itemset GenerationAssociation Rule Learning Part 1: Frequent Itemset Generation
Association Rule Learning Part 1: Frequent Itemset GenerationKnoldus Inc.
 
5.3 mining sequential patterns
5.3 mining sequential patterns5.3 mining sequential patterns
5.3 mining sequential patternsKrish_ver2
 

Mais procurados (20)

K-means Clustering
K-means ClusteringK-means Clustering
K-means Clustering
 
Services comparison among Microsoft Azure AWS and Google Cloud Platform
Services comparison among Microsoft Azure AWS and Google Cloud PlatformServices comparison among Microsoft Azure AWS and Google Cloud Platform
Services comparison among Microsoft Azure AWS and Google Cloud Platform
 
Cloud Computing Security
Cloud Computing SecurityCloud Computing Security
Cloud Computing Security
 
Scalability and Reliability in the Cloud
Scalability and Reliability in the CloudScalability and Reliability in the Cloud
Scalability and Reliability in the Cloud
 
DBMS _Relational model
DBMS _Relational modelDBMS _Relational model
DBMS _Relational model
 
3.3 hierarchical methods
3.3 hierarchical methods3.3 hierarchical methods
3.3 hierarchical methods
 
Chapter1
Chapter1Chapter1
Chapter1
 
ppt
pptppt
ppt
 
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
 
Latency and Consistency Tradeoffs in Modern Distributed Databases
Latency and Consistency Tradeoffs in Modern Distributed DatabasesLatency and Consistency Tradeoffs in Modern Distributed Databases
Latency and Consistency Tradeoffs in Modern Distributed Databases
 
Cloud Computing - Benefits and Challenges
Cloud Computing - Benefits and ChallengesCloud Computing - Benefits and Challenges
Cloud Computing - Benefits and Challenges
 
Storage As A Service (StAAS)
Storage As A Service (StAAS)Storage As A Service (StAAS)
Storage As A Service (StAAS)
 
08. Object Oriented Database in DBMS
08. Object Oriented Database in DBMS08. Object Oriented Database in DBMS
08. Object Oriented Database in DBMS
 
Io t introduction
Io t introductionIo t introduction
Io t introduction
 
DBSCAN
DBSCANDBSCAN
DBSCAN
 
Clustering in data Mining (Data Mining)
Clustering in data Mining (Data Mining)Clustering in data Mining (Data Mining)
Clustering in data Mining (Data Mining)
 
Cluster Computers
Cluster ComputersCluster Computers
Cluster Computers
 
11. Storage and File Structure in DBMS
11. Storage and File Structure in DBMS11. Storage and File Structure in DBMS
11. Storage and File Structure in DBMS
 
Association Rule Learning Part 1: Frequent Itemset Generation
Association Rule Learning Part 1: Frequent Itemset GenerationAssociation Rule Learning Part 1: Frequent Itemset Generation
Association Rule Learning Part 1: Frequent Itemset Generation
 
5.3 mining sequential patterns
5.3 mining sequential patterns5.3 mining sequential patterns
5.3 mining sequential patterns
 

Destaque

NoSQL in Perspective
NoSQL in PerspectiveNoSQL in Perspective
NoSQL in PerspectiveJeff Smith
 
Rolling With Riak
Rolling With RiakRolling With Riak
Rolling With RiakJohn Lynch
 
Riak Core: Building Distributed Applications Without Shared State
Riak Core: Building Distributed Applications Without Shared StateRiak Core: Building Distributed Applications Without Shared State
Riak Core: Building Distributed Applications Without Shared StateRusty Klophaus
 
Riak in Ten Minutes
Riak in Ten MinutesRiak in Ten Minutes
Riak in Ten MinutesJon Meredith
 
Riak Training Session — Surge 2011
Riak Training Session — Surge 2011Riak Training Session — Surge 2011
Riak Training Session — Surge 2011DstroyAllModels
 
Что нового в nginx? / Максим Дунин (Nginx, Inc.)
Что нового в nginx? / Максим Дунин (Nginx, Inc.)Что нового в nginx? / Максим Дунин (Nginx, Inc.)
Что нового в nginx? / Максим Дунин (Nginx, Inc.)Ontico
 
NoSQL databases, the CAP theorem, and the theory of relativity
NoSQL databases, the CAP theorem, and the theory of relativityNoSQL databases, the CAP theorem, and the theory of relativity
NoSQL databases, the CAP theorem, and the theory of relativityLars Marius Garshol
 
NoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenNoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenLorenzo Alberton
 

Destaque (9)

Cap in depth
Cap in depthCap in depth
Cap in depth
 
NoSQL in Perspective
NoSQL in PerspectiveNoSQL in Perspective
NoSQL in Perspective
 
Rolling With Riak
Rolling With RiakRolling With Riak
Rolling With Riak
 
Riak Core: Building Distributed Applications Without Shared State
Riak Core: Building Distributed Applications Without Shared StateRiak Core: Building Distributed Applications Without Shared State
Riak Core: Building Distributed Applications Without Shared State
 
Riak in Ten Minutes
Riak in Ten MinutesRiak in Ten Minutes
Riak in Ten Minutes
 
Riak Training Session — Surge 2011
Riak Training Session — Surge 2011Riak Training Session — Surge 2011
Riak Training Session — Surge 2011
 
Что нового в nginx? / Максим Дунин (Nginx, Inc.)
Что нового в nginx? / Максим Дунин (Nginx, Inc.)Что нового в nginx? / Максим Дунин (Nginx, Inc.)
Что нового в nginx? / Максим Дунин (Nginx, Inc.)
 
NoSQL databases, the CAP theorem, and the theory of relativity
NoSQL databases, the CAP theorem, and the theory of relativityNoSQL databases, the CAP theorem, and the theory of relativity
NoSQL databases, the CAP theorem, and the theory of relativity
 
NoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenNoSQL Databases: Why, what and when
NoSQL Databases: Why, what and when
 

Semelhante a CAP Theorem Explained: Consistency, Availability and Partition Tolerance in Distributed Systems

A Critique of the CAP Theorem by Martin Kleppmann
A Critique of the CAP Theorem by Martin KleppmannA Critique of the CAP Theorem by Martin Kleppmann
A Critique of the CAP Theorem by Martin Kleppmannmustafa sarac
 
Database Throwdown Introduction
Database Throwdown IntroductionDatabase Throwdown Introduction
Database Throwdown IntroductionSean Collins
 
Lightning talk: highly scalable databases and the PACELC theorem
Lightning talk: highly scalable databases and the PACELC theoremLightning talk: highly scalable databases and the PACELC theorem
Lightning talk: highly scalable databases and the PACELC theoremVishal Bardoloi
 
CAP, PACELC, and Determinism
CAP, PACELC, and DeterminismCAP, PACELC, and Determinism
CAP, PACELC, and DeterminismDaniel Abadi
 
17-NoSQL.pptx
17-NoSQL.pptx17-NoSQL.pptx
17-NoSQL.pptxlevichan1
 
Nosql availability & integrity
Nosql availability & integrityNosql availability & integrity
Nosql availability & integrityFahri Firdausillah
 
Cassandra & Python - Springfield MO User Group
Cassandra & Python - Springfield MO User GroupCassandra & Python - Springfield MO User Group
Cassandra & Python - Springfield MO User GroupAdam Hutson
 
HbaseHivePigbyRohitDubey
HbaseHivePigbyRohitDubeyHbaseHivePigbyRohitDubey
HbaseHivePigbyRohitDubeyRohit Dubey
 
Database Consistency Models
Database Consistency ModelsDatabase Consistency Models
Database Consistency ModelsSimon Ouellette
 
Distributed Algorithms
Distributed AlgorithmsDistributed Algorithms
Distributed Algorithms913245857
 
Lecture-04-Principles of data management.pdf
Lecture-04-Principles of data management.pdfLecture-04-Principles of data management.pdf
Lecture-04-Principles of data management.pdfmanimozhi98
 
Container independent failover framework
Container independent failover frameworkContainer independent failover framework
Container independent failover frameworktelestax
 
Container Independent failover framework - Mobicents Summit 2011
Container Independent failover framework - Mobicents Summit 2011Container Independent failover framework - Mobicents Summit 2011
Container Independent failover framework - Mobicents Summit 2011telestax
 
Design Patterns For Distributed NO-reational databases
Design Patterns For Distributed NO-reational databasesDesign Patterns For Distributed NO-reational databases
Design Patterns For Distributed NO-reational databaseslovingprince58
 
Scalable Data Storage Getting You Down? To The Cloud!
Scalable Data Storage Getting You Down? To The Cloud!Scalable Data Storage Getting You Down? To The Cloud!
Scalable Data Storage Getting You Down? To The Cloud!Mikhail Panchenko
 
Scalable Data Storage Getting you Down? To the Cloud!
Scalable Data Storage Getting you Down? To the Cloud!Scalable Data Storage Getting you Down? To the Cloud!
Scalable Data Storage Getting you Down? To the Cloud!Mikhail Panchenko
 
Design Patterns for Distributed Non-Relational Databases
Design Patterns for Distributed Non-Relational DatabasesDesign Patterns for Distributed Non-Relational Databases
Design Patterns for Distributed Non-Relational Databasesguestdfd1ec
 

Semelhante a CAP Theorem Explained: Consistency, Availability and Partition Tolerance in Distributed Systems (20)

CAP: Scaling, HA
CAP: Scaling, HACAP: Scaling, HA
CAP: Scaling, HA
 
A Critique of the CAP Theorem by Martin Kleppmann
A Critique of the CAP Theorem by Martin KleppmannA Critique of the CAP Theorem by Martin Kleppmann
A Critique of the CAP Theorem by Martin Kleppmann
 
Database Throwdown Introduction
Database Throwdown IntroductionDatabase Throwdown Introduction
Database Throwdown Introduction
 
Lightning talk: highly scalable databases and the PACELC theorem
Lightning talk: highly scalable databases and the PACELC theoremLightning talk: highly scalable databases and the PACELC theorem
Lightning talk: highly scalable databases and the PACELC theorem
 
CAP, PACELC, and Determinism
CAP, PACELC, and DeterminismCAP, PACELC, and Determinism
CAP, PACELC, and Determinism
 
17-NoSQL.pptx
17-NoSQL.pptx17-NoSQL.pptx
17-NoSQL.pptx
 
Nosql availability & integrity
Nosql availability & integrityNosql availability & integrity
Nosql availability & integrity
 
Cassandra & Python - Springfield MO User Group
Cassandra & Python - Springfield MO User GroupCassandra & Python - Springfield MO User Group
Cassandra & Python - Springfield MO User Group
 
Dsys guide37
Dsys guide37Dsys guide37
Dsys guide37
 
HbaseHivePigbyRohitDubey
HbaseHivePigbyRohitDubeyHbaseHivePigbyRohitDubey
HbaseHivePigbyRohitDubey
 
Database Consistency Models
Database Consistency ModelsDatabase Consistency Models
Database Consistency Models
 
Distributed Algorithms
Distributed AlgorithmsDistributed Algorithms
Distributed Algorithms
 
Lecture-04-Principles of data management.pdf
Lecture-04-Principles of data management.pdfLecture-04-Principles of data management.pdf
Lecture-04-Principles of data management.pdf
 
No sql (not only sql)
No sql                 (not only sql)No sql                 (not only sql)
No sql (not only sql)
 
Container independent failover framework
Container independent failover frameworkContainer independent failover framework
Container independent failover framework
 
Container Independent failover framework - Mobicents Summit 2011
Container Independent failover framework - Mobicents Summit 2011Container Independent failover framework - Mobicents Summit 2011
Container Independent failover framework - Mobicents Summit 2011
 
Design Patterns For Distributed NO-reational databases
Design Patterns For Distributed NO-reational databasesDesign Patterns For Distributed NO-reational databases
Design Patterns For Distributed NO-reational databases
 
Scalable Data Storage Getting You Down? To The Cloud!
Scalable Data Storage Getting You Down? To The Cloud!Scalable Data Storage Getting You Down? To The Cloud!
Scalable Data Storage Getting You Down? To The Cloud!
 
Scalable Data Storage Getting you Down? To the Cloud!
Scalable Data Storage Getting you Down? To the Cloud!Scalable Data Storage Getting you Down? To the Cloud!
Scalable Data Storage Getting you Down? To the Cloud!
 
Design Patterns for Distributed Non-Relational Databases
Design Patterns for Distributed Non-Relational DatabasesDesign Patterns for Distributed Non-Relational Databases
Design Patterns for Distributed Non-Relational Databases
 

Último

HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
Clustering techniques data mining book ....
Clustering techniques data mining book ....Clustering techniques data mining book ....
Clustering techniques data mining book ....ShaimaaMohamedGalal
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Steffen Staab
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
Active Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfActive Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfCionsystems
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 

Último (20)

HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Clustering techniques data mining book ....
Clustering techniques data mining book ....Clustering techniques data mining book ....
Clustering techniques data mining book ....
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
Active Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfActive Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdf
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Exploring iOS App Development: Simplifying the Process
Exploring iOS App Development: Simplifying the ProcessExploring iOS App Development: Simplifying the Process
Exploring iOS App Development: Simplifying the Process
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 

CAP Theorem Explained: Consistency, Availability and Partition Tolerance in Distributed Systems

  • 1. All you didn’t know about the CAP theorem
  • 2. CAP theorem The theorem was presented on the Symposium on Principles of Distributed Computing in 2000 by Eric Brewer. In 2002, Seth Gilbert and Nancy Lynch of MIT published a formal proof of Brewer's conjecture, rendering it a theorem. According to Brewer, he wanted the community to start conversation about it, but his words have been corrected and treated as a theorem.
  • 3. What stands behind the cap? The CAP theorem states that in the distributed system you can choose only 2 out of 3: Consistency: every read would get you the most recent write Availability: every node (if not failed) always executes queries (read and writes) Partition-tolerance: even if the connections between nodes are down, the other two (A & C) promises, are kept.
  • 8. Let’s take a look at the postgresql Master/slave architecture is one of common solutions Slave can be synced with master in async/sync way The transaction system use two-phase commit to ensure consistency If a partition occurs you can’t talk to the server (in the basic case), the system is not CAP-available. So, it can’t continue work in case of network partitioning, but it provides strong consistency and high-availability. It’s a CA system!
  • 9. Let’s take a look at the mongodb MongoDB provides strong consistency, because it is a single-master system and all writes go to the primary by default MongoDB provides automatic failover in case of partitioning If a partition occurs it will stop accepting writes to the system until it believes that it can safely complete those writes. So, it can continue work in case of network partitioning and it gives up availability. It’s a CP system!
  • 10. Let’s take a look at the postgres + salesforce + heroku connect system It’s master-master system Heroku connect is responsible for keeping system consistency Salesforce data and our postgres db doesn’t know about each other If network partition occurs both storages will be available. Heroku connect will try to reconnect. So, it can continue work in case of network partitioning and it gives up consistency. It should be an AP system!
  • 11. It is so easy! Now I know everything!
  • 12. Actually, no. There are a lot of problems with CAP theorem: CAP uses very narrow and far-from-the-real-world definitions Actually, it is the choice only between consistency and availability Many systems are neither CAP-consistent nor CAP-available Pure AP systems are useless Pure CP systems might behave not as expected
  • 13. What is wrong with definitions Consistency in CAP actually means linearizability (and it’s really hard to reach it). Availability in CAP is defined as “every request received by a non-failing [database] node in the system must result in a [non-error] response” and it’s not restricted by time. The only fault considered by the CAP theorem is a network partition.
  • 15. Why node failures are outside CAP? By the definition of availability: ...every node (if not failed) always... By the proof of CAP: the proof of CAP used by Gilbert and Lynch relies on having code running on both sides of the partition. In some cases, a partition will be equivalent to a failure, but this equivalence will be obtained by implementing a specific logic in all the nodes. Of course, we should manage node failures, but CAP doesn’t help us here.
  • 16. AP / CP choice Partition Tolerance basically means that you’re communicating over an asynchronous network that may delay or drop messages. The internet and all our data centers have this property, so you don’t really have any choice in this matter.
  • 17. Many systems are only “p” In case you have one master and one slave, and you are partitioned from the master - you can’t write, but you can read. It’s not CAP-available. Ok, it’s a CP system, but usually sync between slave and master is async and there might be a gap between sync and system partitioning, so you do not have CAP-consistency.
  • 18. AP and CP problems Pure AP is useless, it may just return any random value and it would be an AP system Pure CP is useless too, because partitioning in CAP have no fixed duration, so the system provides only eventual consistency, which is not the strong one that we want to have.
  • 20. How to describe distributed system Remember about CAP narrow definitions, as they are still widely used Use PACELC(A) theorem instead of CAP, it provides additional consistency/latency tradeoff Describe how ACID/BASE principles apply to your system Decide if the system suits your needs, considering the project you are working on.
  • 21. Let’s take a look on PACELC(A) The PACELC theorem was first described and formalised by Daniel J. Abadi from Yale University in 2012. IF there is a partition (P), how does the system tradeoff availability and consistency (A and C) ELSE (E) When the system is running normally in the absence of partitions, how does the system trade off latency (L) and consistency (C)? As PACELC theorem is based on CAP, it also uses CAP definitions.
  • 23. Let’s take a look on ACID ACID - is a set of properties of database transactions. Jim Gray defined these properties of a reliable transaction system in the late 1970s and developed technologies to achieve them automatically. Database vendors long time ago introduced 2 phase commit for providing ACID across multiple database instances.
  • 24. Let’s take a look on ACID Atomicity. All of the operations in the transaction will complete, or none will Consistency. The database will be in a consistent state when the transaction begins and ends Isolation. The transaction will behave as if it is the only operation being performed upon the database Durability. Once a transaction has been committed, it will remain so, even in the event of power loss, crashes, or errors.
  • 25. CAP/ACID definition confusion Consistency in ACID relates to data integrity, whereas Consistency in CAP is a reference for Atomic Consistency (Linearizability), which is a consistency model Isolation term is not used in CAP, but it’s definition in ACID is actually the same for what linearizability stands for Availability in ACID is not used in definitions, but when it’s presented in articles, it means the same as CAP-availability, apart that it’s not required for all non-failing nodes to respond.
  • 26. Let’s take a look on BASE Eventually consistent services are often classified as providing BASE semantics, in contrast to traditional ACID guarantees. One of the earliest definitions of eventual consistency comes from a 1988. BASE essentially embraces the fact that true consistency cannot be achieved in the real world, and as such cannot be modelled in highly scalable distributed systems.
  • 27. What is BASE Basic Availability states that there will be a response to any request, but, that response could still be a “failure” or the data may be in an inconsistent or changing state Soft-state state of the system could change over time due to “eventual consistency” changes Eventual consistency states that the system will eventually become consistent, the system will continue to receive input and is not checking the consistency of every transaction before it moves onto the next one
  • 28. Try to digest it too...
  • 29. Fresh look on the postgres Postgres does allow multiple cluster configuration, so it’s really hard to describe all of them. Let’s just take the master - slave replication with the Slony implementation. The system works according ACID (there are couple of problems with two-phase commit, but mostly it’s reliable) In case of partition the Slony will try to proceed switchover and if all went fine, we have our new master with its consistency When there is no partition the Slony gives up latency and does everything to approach strong consistency. Actually, ACID is the reason off high latency The system is considered as PC/EC(A)
  • 30. Fresh look on the mongodb MongoDB is a NOSQL database. MongoDB is ACID in limited sense at the document level In case of distributed system - it’s all about BASE In case of no partition, the system guarantees reads and writes to be consistent If the master node will be failed or partitioned from the rest of the system, some data will not be replicated. System elects a new master to remain available for reads and writes.(New master and old master are inconsistent) The system is considered as PA/EC(A), as most of the nodes remain CAP-available in case of partition.
  • 31. fresh look on the postgres + salesforce + heroku connect system The Salesforce is an abstraction to oracle database (RDBMS) and the Postgres is an actual RDBMS. Heroku connect is a tool to sync those too dbs. Let’s see what we have… Heroku connect, Postgres and Salesforce provide us ACID, but we can’t operate with related entities Surprise! The full system looks more like a BASE. It uses streaming API to sync the entities. In case of partition, both systems will be available, but not consistent In case of no partition, the heroku connect try to sync as much as possible,
  • 32. Conclusion The distributed systems might and should be understanded by a lot of metrics and terms, but the start point is it’s tradeoffs It’s really difficult to classify an abstract system. You should decide what kind of system do you want and then - look at what you can achieve The 2 point is the reason why to find any good articles about that is not easy either Do not overwhelm yourself by that work, you should be a scientist to do it 99 % correctly
  • 33.

Notas do Editor

  1. Спросить эксперта в зале и попросить его внимательно следить и в конце поправить?
  2. Кто слышал про CAP теорему? А кто использовал её?
  3. Что осталось не ясно?
  4. Linearizability gives isolation at the level of operations. This is a fairly expensive guarantee to provide, because it requires a lot of coordination. Even the CPU in your computer doesn’t provide linearizable access to your local RAM! On modern CPUs, you need to use an explicit memory barrier instruction in order to get linearizability. And even testing whether a system provides linearizability is tricky.
  5. Вопросы?
  6. Вопросы?
  7. Вопросы?