© 2013 triAGENS GmbH | 2013-08-24 1
CAP and the Architectural Consequences
FrOSCon, St. Augustin, 2013-08-24
martin Schönert (triAGENS)
Who am I
• martin Schönert
• I work at triAGENS GmbH
• I have been in software development for 30 years
  - programmer
  - product manager
  - responsible for a data center
  - department head at a large company
  - software architect
• I am the architect of
The CAP Theorem:
Consistency, Availability, Partition Tolerance
[Diagrams, labels only: "Write", "Replicate"; "Read the actual data"; "Partition"]
Theorem: You can have at most two of these properties for any shared-data system.
Dr. Eric A. Brewer, "Towards Robust Distributed Systems", PODC Keynote, July 19, 2000
Proceedings of the Annual ACM Symposium on Principles of Distributed Computing, 2000
[Diagram labels: Consistency, Availability, Tolerance to network Partitions]
The theorem was criticized in many articles and blog entries (below is just a small sample ;-).
codahale.com/you-cant-sacrifice-partition-tolerance/
blog.voltdb.com/clarifications-cap-theorem-and-data-related-errors/
dbmsmusings.blogspot.de/2010/04/problems-with-cap-and-yahoos-little.html
"I really need to write an updated CAP theorem paper."
Dr. Eric A. Brewer (Twitter, Oct. 2010)
Critique of CAP: CP
• Was basically interpreted as:
  - if anything at all goes wrong (real network partition, node failure, ...), immediately stop accepting any operation (read, write, …) at all.
• and was rejected because:
  - you can still accept some operations (e.g. reads),
  - or continue to accept all operations in one partition (e.g. the one with a quorum),
  - ...
Critique of CAP: AP
• Was basically interpreted as:
  - the system gives up all of the ACID semantics and
  - at no time (even while not partitioned) does the system guarantee consistency.
• This confusion arose partly because at the same time we had discussions about:
  - ACID vs. BASE and
  - P(A|C) E(L|C), i.e. PACELC
Critique of CAP: CA
• Can you actually choose to not have partitions?
• Yes:
  - small clusters (2-3 nodes)
  - in one datacenter
  - nodes and clients are connected through one switch
• No:
  - not for systems with more nodes
  - or distributed over several datacenters
So let us take a closer look at the situation:
[Diagram: operations on the state pass through the phases normal mode -> partition detection -> partition mode -> partition recovery -> normal mode]
Detect the partition
• Happens, at the latest, when one node tries to replicate an operation to another node and this times out.
• At this moment the node must make a decision:
  - go ahead with the operation (and risk consistency)
  - cancel the operation (and reduce availability)
• Options:
  - separate watchdog (to distinguish failed nodes from partitions)
  - heartbeats (to avoid that only one side detects the partition)
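The decision a node faces when replication times out can be sketched as follows. This is a minimal illustration, not taken from the talk: the names `Node`, `Policy` and `unreachable_peer` are hypothetical, and the replication call is assumed to signal a timeout by raising `TimeoutError`.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class Policy(Enum):
    PREFER_CONSISTENCY = "cancel"      # cancel the operation, reduce availability
    PREFER_AVAILABILITY = "go_ahead"   # apply locally, risk consistency


@dataclass
class Node:
    policy: Policy
    log: List[str] = field(default_factory=list)           # operations applied locally
    unreplicated: List[str] = field(default_factory=list)  # applied but not replicated
    partition_suspected: bool = False

    def write(self, op: str, replicate: Callable[[str], None]) -> bool:
        """Apply op; replicate() is assumed to raise TimeoutError on timeout."""
        try:
            replicate(op)
        except TimeoutError:
            # The timeout is the latest point at which the partition is noticed.
            self.partition_suspected = True
            if self.policy is Policy.PREFER_CONSISTENCY:
                return False                 # operation cancelled
            self.unreplicated.append(op)     # remember it for partition recovery
        self.log.append(op)
        return True


def unreachable_peer(op: str) -> None:
    """Hypothetical stand-in for a peer on the other side of a partition."""
    raise TimeoutError("peer did not acknowledge in time")


cp_node = Node(Policy.PREFER_CONSISTENCY)
ap_node = Node(Policy.PREFER_AVAILABILITY)
cp_result = cp_node.write("set x=1", unreachable_peer)
ap_result = ap_node.write("set x=1", unreachable_peer)
```

Both nodes notice the partition at the same moment. The first refuses the write, the second accepts it and records that it still has to be replicated.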
Partition Mode
Place restrictions:
• on the nodes that accept operations:
  - quorum
• on the data on which a client can operate:
  - data ownership (MESI, MOESI, …)
  - problems with complex operations
• on the operations:
  - read only
• on the semantics:
  - delayed commit
  - async failure
  - record intent
• any combination of the above
• possibly with human intervention
  - (e.g. shut down one partition and make the other fully functional)
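Two of the restrictions above, quorum and read only, can be combined in a few lines. The sketch below assumes a simple strict-majority quorum and a node that knows the cluster size; `has_quorum` and `allowed_operations` are illustrative names, not an API of any particular database.

```python
def has_quorum(reachable_nodes: int, cluster_size: int) -> bool:
    """True if this side of the partition sees a strict majority of the cluster.

    reachable_nodes counts the local node itself.
    """
    return reachable_nodes > cluster_size // 2


def allowed_operations(reachable_nodes: int, cluster_size: int) -> set:
    """Full service on the quorum side, read only on the other side."""
    if has_quorum(reachable_nodes, cluster_size):
        return {"read", "write"}
    return {"read"}


majority_side = allowed_operations(3, 5)
minority_side = allowed_operations(2, 5)
```

With an even split (for example 2 of 4 nodes on each side) neither side has a strict majority, so under this rule both sides fall back to read only.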
Partition Recovery
• Merging strategies
  - last writer wins
  - commutative operators
  - lattice of operations
  - application controlled
  - opportunistic (read time)
• Fix invariants
  - e.g. violation of uniqueness constraints
• Eventual consistency
  - it IS NOT the fact that every operation is first committed on one node and later (eventually) replicated to other nodes
  - it IS the fact that the system will heal itself, i.e. converge to a consistent state without external intervention
• Merkle hash trees
• Hinted handoff
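Two of the merging strategies listed above can be sketched briefly. Both are illustrative assumptions rather than the method of any specific system: "last writer wins" is modelled as a (timestamp, node id, value) tuple, and the commutative operator is a grow-only counter with one slot per node, merged by element-wise maximum.

```python
from typing import Dict, Tuple

# Last writer wins: each replica stores (timestamp, node_id, value).
Versioned = Tuple[int, str, str]


def merge_lww(a: Versioned, b: Versioned) -> Versioned:
    """Keep the write with the larger (timestamp, node_id); node_id breaks ties."""
    return max(a, b, key=lambda v: (v[0], v[1]))


# Commutative operator: a grow-only counter, one slot per node.
def merge_counter(a: Dict[str, int], b: Dict[str, int]) -> Dict[str, int]:
    """Element-wise maximum; the order of merging does not matter."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


def counter_value(counter: Dict[str, int]) -> int:
    return sum(counter.values())


left = (10, "n1", "blue")
right = (12, "n2", "green")
winner = merge_lww(left, right)

side_a = {"n1": 3, "n2": 1}
side_b = {"n1": 2, "n2": 4, "n3": 1}
merged = merge_counter(side_a, side_b)
```

Last writer wins silently discards the losing write, which is why it is simple but lossy. The counter merge gives the same result regardless of which side is merged into which, so the partitions can be reconciled in any order.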
Massively Distributed Systems
• Store so much data that hundreds of nodes are needed just to store it.
• Not that common.
• Main driver behind early NoSQL developments.
• Receive a lot of publicity.
Consequences of CAP for massively distributed systems
• Failures happen constantly
  - Nodes die
  - Network connections die
  - Network route flapping
• Partitions can be huge
• Must use resources well
  - if a node dies, the load must be distributed over multiple other nodes
• Partition detection
  - the number of possible failure modes and fault lines is HUGE
  - finding out the failure mode quickly is impossible
  - always operate under a worst-case assumption
Consequences of CAP for massively distributed systems (continued)
• Partition mode
  - restricting operations to nodes with quorum is impossible
  - restricting operations to read only is impossible
  - restricting operation semantics is possible (though always difficult)
  - restricting operations to "own" or "borrowed" data is sometimes necessary
• Partition recovery
  - must happen fully automatically
  - must merge states
  - must fix invariants
• Consequences
  - no complex operations
  - or rather, only "local" complex operations
Further properties of massively distributed systems
• Properties
  - Nodes fail often
  - New nodes are added regularly
  - Nodes are not homogeneous
• Distribution and redistribution of data must be fully automatic
  - Consistent Hashing
• Consequence:
  - No complex operations
  - no scans over large parts of the data
  - no non-trivial joins
  - no multi-index operations
• "The marvel is not that the bear dances well, but that the bear dances at all." (Russian proverb)
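Consistent hashing, named above as the way to make distribution and redistribution automatic, can be sketched as a hash ring. This is a minimal illustration under stated assumptions: the class `HashRing`, the use of SHA-256, and the 64 virtual points per node are choices made for the example, not details from the talk.

```python
import bisect
import hashlib
from typing import Dict, List


def _hash(key: str) -> int:
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes: List[str], vnodes: int = 64) -> None:
        self._vnodes = vnodes             # virtual points per physical node
        self._ring: Dict[int, str] = {}   # point on the ring -> owning node
        self._points: List[int] = []      # sorted points
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        for i in range(self._vnodes):
            point = _hash(f"{node}#{i}")
            self._ring[point] = node
            bisect.insort(self._points, point)

    def remove(self, node: str) -> None:
        self._points = [p for p in self._points if self._ring[p] != node]
        self._ring = {p: n for p, n in self._ring.items() if n != node}

    def owner(self, key: str) -> str:
        # First point clockwise from the key's hash, wrapping around at the end.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._ring[self._points[idx]]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"key-{i}" for i in range(1000)]
before = {k: ring.owner(k) for k in keys}
ring.remove("node-b")                     # simulate a node that died
after = {k: ring.owner(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

Only keys that lived on the removed node change owner; all other keys stay where they were. The virtual points are what lets the orphaned keys spread over several surviving nodes rather than landing on a single neighbour.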
My view of the (NoSQL) Database world
[Diagram labels: DBs that manage an evolving state (OLTP); Complex Queries; Operations on complex structures; Massively Distributed; Key/Value Stores; Document Stores; Graph Stores; Map Reduce; Column oriented Stores; Analyzing data (OLAP)]
About us
triAGENS GmbH is a service company in the area of complex IT systems and web-based business solutions with high requirements for performance, scalability and security.
triAGENS supplies high-performance databases based on NoSQL database technology, which are used, for example, at Deutsche Post.
Created by:
martin Schönert
m.schoenert@triagens.de
triAGENS GmbH
Brüsseler Strasse 89-93
50672 Köln
www.triagens.de
Document information
Context: Marketing
Title: CAP and Consequences
Filing: 77_marketing
ID: TRI-MS-1308-004
Responsible: martin Schönert / triagens
Readers: public
Security class.: public
Keywords: CAP, Distributed Systems
Processing steps (step, editor, planned by, completed):
Draft, ms, 2013-08-18, 2013-08-20
Finalization, ms, 2013-08-26, 2013-08-26
History (version, date, author, comment):
V1.00, 2013-08-20, mS, initial version
V1.01, 2013-08-26, mS, typos corrected
Todos (slide, comment): none

Generative AI for Technical Writer or Information Developers
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 

CAP and the Architectural Consequences by martin Schönert

  • 1. CAP and the Architectural Consequences. FrOSCon, St. Augustin, 2013-08-24. martin Schönert (triAGENS). © 2013 triAGENS GmbH | 2013-08-24
  • 2. Who am I
      • martin Schönert
      • I work at triAGENS GmbH
      • I have been in software development for 30 years
          • programmer
          • product manager
          • responsible for a data center
          • department head at a large company
          • software architect
      • I am the architect of
  • 3. The CAP Theorem: Consistency, Availability, Partition Tolerance [diagram: Write, Replicate]
  • 4. The CAP Theorem: Consistency, Availability, Partition Tolerance [diagram: Read the actual data]
  • 5. The CAP Theorem: Consistency, Availability, Partition Tolerance [diagram: Partition]
  • 6. Theorem: You can at most have two of these properties for any shared data system.
      • Consistency
      • Availability
      • Tolerance to network Partitions
     Dr. Eric A. Brewer, "Towards Robust Distributed Systems", PODC Keynote, July 19, 2000. Proceedings of the Annual ACM Symposium on Principles of Distributed Computing, 2000.
  • 7. Which was criticized in many articles and blog entries (below is just a small sample ;-).
      • codahale.com/you-cant-sacrifice-partition-tolerance/
      • blog.voltdb.com/clarifications-cap-theorem-and-data-related-errors/
      • dbmsmusings.blogspot.de/2010/04/problems-with-cap-and-yahoos-little.html
  • 8. Which was criticized in many articles and blog entries.
     "I really need to write an updated CAP theorem paper." Dr. Eric A. Brewer (Twitter, Oct. 2010)
  • 9. Critique of CAP: CP
      • Was basically interpreted as:
          • if anything at all goes wrong (real network partition, node failure, ...), immediately stop accepting any operation (read, write, ...) at all.
      • and was rejected because:
          • you can still accept some operations (e.g. reads),
          • or continue to accept all operations in one partition (e.g. the one with a quorum),
          • ...
  • 10. Critique of CAP: AP
      • Was basically interpreted as:
          • the system gives up all of the ACID semantics and
          • at no time (even while not partitioned) does the system guarantee consistency.
      • This confusion arose partly because, at the same time, we had discussions about:
          • ACID vs. BASE and
          • P(A|C) E(L|C) (PACELC)
  • 11. Critique of CAP: CA
      • Can you actually choose to not have partitions?
      • Yes:
          • small clusters (2-3 nodes)
          • in one datacenter
          • nodes and clients are connected through one switch
      • No:
          • not for systems with more nodes
          • or distributed over several datacenters
  • 12. So let us take a better look at the situation. Operations on the state pass through: normal mode → partition detection → partition mode → partition recovery → normal mode
  • 13. Detect the partition
      • Happens, at the latest, when one node tries to replicate an operation to another node and this times out.
      • At this moment the node must make a decision:
          • go ahead with the operation (and risk consistency)
          • cancel the operation (and reduce availability)
      • Options:
          • separate watchdog (to distinguish a failed node from a partition)
          • heartbeats (to avoid that only one side detects the partition)
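The detection step on slide 13 can be sketched roughly as follows. This is a minimal illustration, not anything from the talk: the class `HeartbeatMonitor`, the function `on_replication_timeout`, and the `prefer_consistency` flag are hypothetical names, and time is passed in explicitly so the example stays deterministic.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Hypothetical helper: remembers when each peer was last heard from."""
    timeout: float                                  # seconds of silence before a peer is suspected
    last_seen: dict = field(default_factory=dict)   # peer name -> time of last heartbeat

    def record(self, peer: str, now: float) -> None:
        """Note that a heartbeat from `peer` arrived at time `now`."""
        self.last_seen[peer] = now

    def suspected(self, now: float) -> set:
        """Peers that have been silent for longer than the timeout."""
        return {p for p, t in self.last_seen.items() if now - t > self.timeout}


def on_replication_timeout(prefer_consistency: bool) -> str:
    """The decision from the slide, reduced to a single (assumed) policy flag:
    cancel the operation (less availability) or go ahead (risk consistency)."""
    return "cancel" if prefer_consistency else "proceed"


monitor = HeartbeatMonitor(timeout=2.0)
monitor.record("node-a", now=0.0)
monitor.record("node-b", now=3.0)
print(monitor.suspected(now=4.0))        # node-a has been silent for 4 s
print(on_replication_timeout(prefer_consistency=True))
```

A silent peer only tells the node that something is wrong; as the slide notes, a separate watchdog would be needed to tell a dead node from a partition.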
  • 14. Partition Mode
     Place restrictions:
      • on the nodes that accept operations:
          • quorum
      • on the data on which a client can operate:
          • data ownership (MESI, MOESI, ...)
          • problems with complex operations
      • on the operations:
          • read only
      • on the semantics:
          • delayed commit
          • async failure
          • record intent
      • any combination of the above
      • possibly with human intervention (e.g. shut down one partition and make the other fully functional)
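One possible combination of two of the restrictions on slide 14 (quorum and read only) is sketched below. The policy itself is an assumption chosen for illustration: the side that sees a strict majority of the cluster keeps accepting everything, the other side serves reads only. The function names `has_quorum` and `allowed` are hypothetical.

```python
def has_quorum(reachable: int, cluster_size: int) -> bool:
    """True if this side sees a strict majority of the cluster (itself included)."""
    return reachable > cluster_size // 2


def allowed(operation: str, reachable: int, cluster_size: int) -> bool:
    """Assumed policy: the majority side accepts all operations,
    the minority side is restricted to reads."""
    if has_quorum(reachable, cluster_size):
        return True
    return operation == "read"


# A 5-node cluster split 3 / 2:
print(allowed("write", reachable=3, cluster_size=5))   # majority side
print(allowed("write", reachable=2, cluster_size=5))   # minority side
print(allowed("read", reachable=2, cluster_size=5))    # minority side, read
```

With an even split (for example 2 / 2 in a 4-node cluster) neither side has a strict majority, so under this policy both sides become read only.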
  • 15. Partition Recovery
      • Merging strategies
          • last writer wins
          • commutative operators
          • lattice of operations
          • application controlled
          • opportunistic (read time)
      • Fix invariants
          • e.g. violation of uniqueness constraints
      • Eventual consistency
          • it IS NOT the fact that every operation is first committed on one node and later (eventually) replicated to other nodes
          • it IS the fact that the system will heal itself, i.e. converge to a consistent state without external intervention
          • Merkle hash trees
          • Hinted handoff
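Two of the merging strategies on slide 15 can be illustrated with a small sketch. The data layouts are assumptions made for the example: a last-writer-wins value is a `(timestamp, node_id, payload)` tuple, and the commutative case is shown as a grow-only counter that keeps one count per node. The function names are hypothetical.

```python
def merge_lww(a: tuple, b: tuple) -> tuple:
    """Last writer wins. Values are (timestamp, node_id, payload);
    node_id breaks ties so both sides pick the same winner."""
    return max(a, b, key=lambda v: (v[0], v[1]))


def merge_gcounter(a: dict, b: dict) -> dict:
    """Grow-only counter: per-node counts merged with element-wise max.
    The merge is commutative, associative and idempotent."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


def gcounter_value(counter: dict) -> int:
    """Total count across all nodes."""
    return sum(counter.values())


# State seen on each side of a healed partition:
left = {"n1": 3, "n2": 1}
right = {"n1": 2, "n2": 4, "n3": 1}
merged = merge_gcounter(left, right)
print(gcounter_value(merged))
print(merge_lww((10, "n1", "red"), (12, "n2", "blue")))
```

Last writer wins silently discards the older write, whereas the counter merge loses nothing; that difference is why the choice of strategy depends on the data being merged.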
  • 16. Massively Distributed Systems
      • Store so much data that hundreds of nodes are needed just to store it.
      • Not that common.
      • Main driver behind early NoSQL developments.
      • Receive a lot of publicity.
  • 17. Consequences of CAP for massively distributed systems
      • Failures happen constantly
          • Nodes die
          • Network connections die
          • Network route flapping
      • Partitions can be huge
      • Must use resources well
          • if a node dies, the load must be distributed over multiple other nodes
      • Partition detection
          • number of possible failure modes and fault lines is HUGE
          • finding out the failure mode quickly is impossible
          • always operate under a worst case assumption
  • 18. Consequences of CAP for massively distributed systems
      • Partition mode
          • restricting operations to nodes with quorum is impossible
          • restricting operations to read only is impossible
          • restricting operation semantics is possible (though always difficult)
          • restricting operations to "own" or "borrowed" data is sometimes necessary
      • Partition recovery
          • must happen fully automatically
          • must merge states
          • must fix invariants
      • Consequences
          • no complex operations
          • or rather, only "local" complex operations
  • 19. Further properties of massively distributed systems
      • Properties
          • Nodes fail often
          • New nodes are added regularly
          • Nodes are not homogeneous
      • Distribution and redistribution of data must be fully automatic
          • Consistent Hashing
      • Consequence:
          • No complex operations
          • no scans over large parts of the data
          • no non-trivial joins
          • no multi-index operations
     "The marvel is not that the bear dances well, but that the bear dances at all." Russian Proverb
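Consistent hashing, named on slide 19 as the way to distribute and redistribute data automatically, can be sketched as below. This is a minimal illustration under stated assumptions, not the implementation of any particular database: the class `HashRing`, the use of SHA-256, and the default of 64 virtual nodes per physical node are all choices made for the example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a position on the ring (first 8 bytes of SHA-256)."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes: int = 64):
        self.vnodes = vnodes
        self._ring = []                 # sorted list of (position, node)
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        """Place `vnodes` points for this node on the ring."""
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        """Drop all points of a node, e.g. after it has died."""
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def lookup(self, key: str) -> str:
        """Owner of a key: the first point at or after the key's position, wrapping around."""
        if not self._ring:
            raise LookupError("ring is empty")
        i = bisect.bisect_right(self._ring, (_hash(key), ""))
        return self._ring[i % len(self._ring)][1]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"key-{i}" for i in range(200)]
before = {k: ring.lookup(k) for k in keys}
ring.remove("node-b")
moved = [k for k in keys if ring.lookup(k) != before[k]]
# Only keys that lived on node-b change owner; everything else stays put.
print(all(before[k] == "node-b" for k in moved))
```

Because each physical node owns many small arcs of the ring, the keys of a dead node tend to be spread over several survivors rather than all landing on one neighbour, which is the "load must be distributed over multiple other nodes" requirement from slide 17.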
  • 20. My view of the (NoSQL) Database world [diagram labels: DBs that manage an evolving state (OLTP); Complex Queries; Operations on complex structures; Massively Distributed; Key/Value Stores; Document Stores; Graph Stores; Map Reduce; Column oriented Stores; Analyzing data (OLAP)]
  • 21. About us
     The triAGENS GmbH is a service company in the area of complex IT systems and web-based business solutions with high requirements on performance, scalability and security. triAGENS supplies high-performance databases based on NoSQL database technology, which are used, for example, at Deutsche Post.
     Created by: martin Schönert, m.schoenert@triagens.de, triAGENS GmbH, Brüsseler Strasse 89-93, 50672 Köln, www.triagens.de
  • 22. Document information
      • Meta information
          • Context: Marketing
          • Title: CAP and Consequences
          • Filing location: 77_marketing
          • ID: TRI-MS-1308-004
          • Responsible: martin Schönert / triagens
          • Readers: Public
          • Security class.: Public
          • Keywords: CAP, Distributed Systems
      • Processing steps (step, editor, planned by, completed, comment)
          • Draft, ms, 2013-08-18, 2013-08-20
          • Finalization, ms, 2013-08-26, 2013-08-26
      • History (version, date, author, comment)
          • V1.00, 2013-08-20, mS, initial version
          • V1.01, 2013-08-26, mS, typos corrected
      • Todos (slide, comment)
          • -, -