SlideShare uma empresa Scribd logo
1 de 19
Automated conflict resolution
     - enabling masterless data
     distribution



         GOTO Aarhus 2012
         Premier software development conference created by developers for
         developers.
         Conference: Oct. 1-3 // Training: Sept. 30, Oct. 4-5

         Se mere på www.gotocon.com/aarhus-2012




1
About the speaker

    ■ Working with databases since '97

    ■ NoSQL since 2008

    ■ Danish Shared Medication Record

       ■ Migrating data from MySQL to Riak

       ■ Devel Riak clients

    ■ NoSQL architect on various international projects

       ■ Seoul, Korea

       ■ Toulouse, France

       ■ Leeds, UK                                        RuneSkouLarsen

       ■ Copenhagen, Denmark


2
Agenda

    ■Polyglot persistence landscape
    ■Distribute those data!
    ■Types of Consistency
    ■Moving towards consistency
    ■Introduction to CRDTs
    ■Consistency models of OLTP databases


3
Polyglot persistence
    landscape
        In-memory    OLTP

          Neo4j        Riak
          VoltDB    Cassandra
          Redis     Voldemort
                    CouchBase

         EasyDB     Analytics

         MongoDB     Hadoop
         CouchDB




4
Why distributed
    databases?


    ■ Redundancy

    ■ Availability

    ■ Scaling

    ■ Getting closer to your
      users




5
Types of Concistency


                       ■ Consistency:

                           All nodes see the same
                           data at the same time
                       ■ Eventual consistency →
                         Autonomous consistency
                       ■ Sequential consistency →
                         Bureaucratic consistency



6
When to be Consistent with what
     ■ Eventual consistency

       Support disconnected operations

         – Better to read a stale value than nothing

         – Better to save writes somewhere than nothing

       Potentially anomalous application behavior

         – Stale reads and conflicting writes…

     ■ Sequential consistency

       Requires highly available connections

       Not suitable for certain scenarios:

       – Disconnected clients (e.g. your phone)

       – Apps might prefer potential inconsistency to loss of availability
7
Conflicting updates
          User A                     User B



               A                          B




               A                          B
                   Asynchronous
                   Synchronization




8
Last Write Wins (LWW)
             User A                         User B



                 A                               B
                t=t0                            t=t1



                 A                               B
                t=t0                            t=t1
                        Asynchronous
                        Synchronization



    ■ Assign timestamp to all objects
    ■ Simple but fragile – depends on precise
      synchronization of timers
    ■ Data is lost
9
Detecting conflicts using
     Vector Clocks (1)
             User A                                  User B



                      A                                         B
                 vclock=a:1                               vclock=a:1,b:1



                      A                          A
                                                B
                 vclock=a:1                 vclock=a:1
                              Asynchronous
                                         vclock=a:1,b:1
                              Synchronization



     ■ Assign vector clock to objects
     ■ Spawn siblings when causality chain is broken
     ■ Data is never lost

10
Detecting conflicts using
     Vector Clocks (2)
             User A                             User B



                      A                                  B
                 vclock=a:1                         vclock=b:1



                      A                                  B
                 vclock=a:1                         vclock=b:1
                              Asynchronous
                              Synchronization



     ■ Assign vector clock to objects
     ■ Spawn siblings when causality chain is broken
     ■ Data is never lost

11
Conflict-free Replicated DataTypes
             User A                        User B



                  A                             B




               AB A                         AB B
                         Asynchronous
                         Synchronization



     ■ Datastructure intrinsically merges objects
     ■ No data loss
     ■ Limited applicability

12
Semantic resolution
             User A                       User B



                  C
                  A                            B




                  A                            B
                        Asynchronous
                        Synchronization



     ■ Keep both values as siblings
     ■ User does the merging
     ■ Only solution for complex, important data

13
Methods for resolving conflicts
     ■ Last Write Wins

       ■ Easy

       ■ Data is lost

       ■ Depends on timestamps

     ■ Conflict-free Data Types

       ■ Data structure has built-in convergence

       ■ Limited ability to model real-world problems

     ■ Semantic resolution

       ■ Requires application/user involvement

       ■ Generic solution
14
Conflict-free Replicated Data Types
     ■ Convergent (CvRDT)

       ■ State is replicated

       ■ Moves towards one value



     ■ Commutative (CmRDT)

       ■ Operations to the state are
         replicated

       ■ The order of operations is
         insignificant

            a*b = b*a

     ■ CvRDT and CmRDT can emulate
       eachother
15
CvRDT examples: G-set and 2P-Set
                             Tombstone

                               RIP




16
CRDT References

     ■ CRDTs: Consistency without concurrency control
       2009

       INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET EN AUTOMATIQUE

     ■ A comprehensive study of Convergent and
       Commutative Replicated Data Types
       2009

       INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET EN AUTOMATIQUE

     ■ Sean Cribbs - Eventually Consistent Data Structures
       http://vimeo.com/43903960



17
Consistency models of OLTP databases
                                    ■ Hinted handoff with sloppy quorums
     ■ Last write wins                (highest write-availability)

           Riak                           Riak

           CouchDB/CouchBase              Cassandra

           Cassandra                ■ Strong consistency (read you own
                                      writes + strict quorums)
     ■ User resolvable conflicts
                                          Riak
           Riak
                                          Voldemort
           Voldemort
                                          Cassandra
           CouchDB/CouchBase (but
           unreliable)                    CouchBase

     ■ Active anti-entropy                MongoDB

           Riak (Soon)                    Traditional SQL databases
                                          (Oracle, MySQL, etc.)


18
Thank you




                 Rune Skou Larsen

                 rsl@trifork.com

                 Twitter: RuneSkouLarsen




19

Mais conteúdo relacionado

Semelhante a Automated conflict resolution - enabling masterless data distribution (Rune Skou Larsen, Trifork)

Dynamo and BigTable in light of the CAP theorem
Dynamo and BigTable in light of the CAP theoremDynamo and BigTable in light of the CAP theorem
Dynamo and BigTable in light of the CAP theoremGrisha Weintraub
 
Yes sql08 inmemorydb
Yes sql08 inmemorydbYes sql08 inmemorydb
Yes sql08 inmemorydbDaniel Austin
 
NewSQL Database Overview
NewSQL Database OverviewNewSQL Database Overview
NewSQL Database OverviewSteve Min
 
Programming Language Memory Models: What do Shared Variables Mean?
Programming Language Memory Models: What do Shared Variables Mean?Programming Language Memory Models: What do Shared Variables Mean?
Programming Language Memory Models: What do Shared Variables Mean?greenwop
 
On Rails with Apache Cassandra
On Rails with Apache CassandraOn Rails with Apache Cassandra
On Rails with Apache CassandraStu Hood
 
The Hows and Whys of a Distributed SQL Database - Strange Loop 2017
The Hows and Whys of a Distributed SQL Database - Strange Loop 2017The Hows and Whys of a Distributed SQL Database - Strange Loop 2017
The Hows and Whys of a Distributed SQL Database - Strange Loop 2017Alex Robinson
 
To Serverless and Beyond
To Serverless and BeyondTo Serverless and Beyond
To Serverless and BeyondScyllaDB
 
Intro to Big Data and NoSQL
Intro to Big Data and NoSQLIntro to Big Data and NoSQL
Intro to Big Data and NoSQLDon Demcsak
 
NoSQL Intro with cassandra
NoSQL Intro with cassandraNoSQL Intro with cassandra
NoSQL Intro with cassandraBrian Enochson
 
Mirco Dotta - Akka Streams
Mirco Dotta - Akka StreamsMirco Dotta - Akka Streams
Mirco Dotta - Akka StreamsScala Italy
 
[db tech showcase Tokyo 2016] E32: My Life as a Disruptor by Jim Starkey
[db tech showcase Tokyo 2016] E32: My Life as a Disruptor by Jim Starkey[db tech showcase Tokyo 2016] E32: My Life as a Disruptor by Jim Starkey
[db tech showcase Tokyo 2016] E32: My Life as a Disruptor by Jim StarkeyInsight Technology, Inc.
 
Akka streams scala italy2015
Akka streams scala italy2015Akka streams scala italy2015
Akka streams scala italy2015mircodotta
 
Thoughts on consistency models
Thoughts on consistency modelsThoughts on consistency models
Thoughts on consistency modelsrogerbodamer
 
Study Notes - Architecting for the cloud (AWS Best Practices, Feb 2016)
Study Notes - Architecting for the cloud (AWS Best Practices, Feb 2016)Study Notes - Architecting for the cloud (AWS Best Practices, Feb 2016)
Study Notes - Architecting for the cloud (AWS Best Practices, Feb 2016)Rick Hwang
 
Pythian: My First 100 days with a Cassandra Cluster
Pythian: My First 100 days with a Cassandra ClusterPythian: My First 100 days with a Cassandra Cluster
Pythian: My First 100 days with a Cassandra ClusterDataStax Academy
 
Building Stream Infrastructure across Multiple Data Centers with Apache Kafka
Building Stream Infrastructure across Multiple Data Centers with Apache KafkaBuilding Stream Infrastructure across Multiple Data Centers with Apache Kafka
Building Stream Infrastructure across Multiple Data Centers with Apache KafkaGuozhang Wang
 
Cистема распределенного, масштабируемого и высоконадежного хранения данных дл...
Cистема распределенного, масштабируемого и высоконадежного хранения данных дл...Cистема распределенного, масштабируемого и высоконадежного хранения данных дл...
Cистема распределенного, масштабируемого и высоконадежного хранения данных дл...Ontico
 
Be Social. Use CrowdRE.
Be Social. Use CrowdRE.Be Social. Use CrowdRE.
Be Social. Use CrowdRE.CrowdStrike
 
PayPal Big Data and MySQL Cluster
PayPal Big Data and MySQL ClusterPayPal Big Data and MySQL Cluster
PayPal Big Data and MySQL ClusterMat Keep
 
Highly available, scalable and secure data with Cassandra and DataStax Enterp...
Highly available, scalable and secure data with Cassandra and DataStax Enterp...Highly available, scalable and secure data with Cassandra and DataStax Enterp...
Highly available, scalable and secure data with Cassandra and DataStax Enterp...Johnny Miller
 

Semelhante a Automated conflict resolution - enabling masterless data distribution (Rune Skou Larsen, Trifork) (20)

Dynamo and BigTable in light of the CAP theorem
Dynamo and BigTable in light of the CAP theoremDynamo and BigTable in light of the CAP theorem
Dynamo and BigTable in light of the CAP theorem
 
Yes sql08 inmemorydb
Yes sql08 inmemorydbYes sql08 inmemorydb
Yes sql08 inmemorydb
 
NewSQL Database Overview
NewSQL Database OverviewNewSQL Database Overview
NewSQL Database Overview
 
Programming Language Memory Models: What do Shared Variables Mean?
Programming Language Memory Models: What do Shared Variables Mean?Programming Language Memory Models: What do Shared Variables Mean?
Programming Language Memory Models: What do Shared Variables Mean?
 
On Rails with Apache Cassandra
On Rails with Apache CassandraOn Rails with Apache Cassandra
On Rails with Apache Cassandra
 
The Hows and Whys of a Distributed SQL Database - Strange Loop 2017
The Hows and Whys of a Distributed SQL Database - Strange Loop 2017The Hows and Whys of a Distributed SQL Database - Strange Loop 2017
The Hows and Whys of a Distributed SQL Database - Strange Loop 2017
 
To Serverless and Beyond
To Serverless and BeyondTo Serverless and Beyond
To Serverless and Beyond
 
Intro to Big Data and NoSQL
Intro to Big Data and NoSQLIntro to Big Data and NoSQL
Intro to Big Data and NoSQL
 
NoSQL Intro with cassandra
NoSQL Intro with cassandraNoSQL Intro with cassandra
NoSQL Intro with cassandra
 
Mirco Dotta - Akka Streams
Mirco Dotta - Akka StreamsMirco Dotta - Akka Streams
Mirco Dotta - Akka Streams
 
[db tech showcase Tokyo 2016] E32: My Life as a Disruptor by Jim Starkey
[db tech showcase Tokyo 2016] E32: My Life as a Disruptor by Jim Starkey[db tech showcase Tokyo 2016] E32: My Life as a Disruptor by Jim Starkey
[db tech showcase Tokyo 2016] E32: My Life as a Disruptor by Jim Starkey
 
Akka streams scala italy2015
Akka streams scala italy2015Akka streams scala italy2015
Akka streams scala italy2015
 
Thoughts on consistency models
Thoughts on consistency modelsThoughts on consistency models
Thoughts on consistency models
 
Study Notes - Architecting for the cloud (AWS Best Practices, Feb 2016)
Study Notes - Architecting for the cloud (AWS Best Practices, Feb 2016)Study Notes - Architecting for the cloud (AWS Best Practices, Feb 2016)
Study Notes - Architecting for the cloud (AWS Best Practices, Feb 2016)
 
Pythian: My First 100 days with a Cassandra Cluster
Pythian: My First 100 days with a Cassandra ClusterPythian: My First 100 days with a Cassandra Cluster
Pythian: My First 100 days with a Cassandra Cluster
 
Building Stream Infrastructure across Multiple Data Centers with Apache Kafka
Building Stream Infrastructure across Multiple Data Centers with Apache KafkaBuilding Stream Infrastructure across Multiple Data Centers with Apache Kafka
Building Stream Infrastructure across Multiple Data Centers with Apache Kafka
 
Cистема распределенного, масштабируемого и высоконадежного хранения данных дл...
Cистема распределенного, масштабируемого и высоконадежного хранения данных дл...Cистема распределенного, масштабируемого и высоконадежного хранения данных дл...
Cистема распределенного, масштабируемого и высоконадежного хранения данных дл...
 
Be Social. Use CrowdRE.
Be Social. Use CrowdRE.Be Social. Use CrowdRE.
Be Social. Use CrowdRE.
 
PayPal Big Data and MySQL Cluster
PayPal Big Data and MySQL ClusterPayPal Big Data and MySQL Cluster
PayPal Big Data and MySQL Cluster
 
Highly available, scalable and secure data with Cassandra and DataStax Enterp...
Highly available, scalable and secure data with Cassandra and DataStax Enterp...Highly available, scalable and secure data with Cassandra and DataStax Enterp...
Highly available, scalable and secure data with Cassandra and DataStax Enterp...
 

Mais de Swiss Big Data User Group

Making Hadoop based analytics simple for everyone to use
Making Hadoop based analytics simple for everyone to useMaking Hadoop based analytics simple for everyone to use
Making Hadoop based analytics simple for everyone to useSwiss Big Data User Group
 
A real life project using Cassandra at a large Swiss Telco operator
A real life project using Cassandra at a large Swiss Telco operatorA real life project using Cassandra at a large Swiss Telco operator
A real life project using Cassandra at a large Swiss Telco operatorSwiss Big Data User Group
 
Building a Hadoop Data Warehouse with Impala
Building a Hadoop Data Warehouse with ImpalaBuilding a Hadoop Data Warehouse with Impala
Building a Hadoop Data Warehouse with ImpalaSwiss Big Data User Group
 
Closing The Loop for Evaluating Big Data Analysis
Closing The Loop for Evaluating Big Data AnalysisClosing The Loop for Evaluating Big Data Analysis
Closing The Loop for Evaluating Big Data AnalysisSwiss Big Data User Group
 
Big Data and Data Science for traditional Swiss companies
Big Data and Data Science for traditional Swiss companiesBig Data and Data Science for traditional Swiss companies
Big Data and Data Science for traditional Swiss companiesSwiss Big Data User Group
 
Design Patterns for Large-Scale Real-Time Learning
Design Patterns for Large-Scale Real-Time LearningDesign Patterns for Large-Scale Real-Time Learning
Design Patterns for Large-Scale Real-Time LearningSwiss Big Data User Group
 
Unleash the power of Big Data in your existing Data Warehouse
Unleash the power of Big Data in your existing Data WarehouseUnleash the power of Big Data in your existing Data Warehouse
Unleash the power of Big Data in your existing Data WarehouseSwiss Big Data User Group
 
Project "Babelfish" - A data warehouse to attack complexity
 Project "Babelfish" - A data warehouse to attack complexity Project "Babelfish" - A data warehouse to attack complexity
Project "Babelfish" - A data warehouse to attack complexitySwiss Big Data User Group
 
Brainserve Datacenter: the High-Density Choice
Brainserve Datacenter: the High-Density ChoiceBrainserve Datacenter: the High-Density Choice
Brainserve Datacenter: the High-Density ChoiceSwiss Big Data User Group
 
Urturn on AWS: scaling infra, cost and time to maket
Urturn on AWS: scaling infra, cost and time to maketUrturn on AWS: scaling infra, cost and time to maket
Urturn on AWS: scaling infra, cost and time to maketSwiss Big Data User Group
 
The World Wide Distributed Computing Architecture of the LHC Datagrid
The World Wide Distributed Computing Architecture of the LHC DatagridThe World Wide Distributed Computing Architecture of the LHC Datagrid
The World Wide Distributed Computing Architecture of the LHC DatagridSwiss Big Data User Group
 
New opportunities for connected data : Neo4j the graph database
New opportunities for connected data : Neo4j the graph databaseNew opportunities for connected data : Neo4j the graph database
New opportunities for connected data : Neo4j the graph databaseSwiss Big Data User Group
 
Technology Outlook - The new Era of computing
Technology Outlook - The new Era of computingTechnology Outlook - The new Era of computing
Technology Outlook - The new Era of computingSwiss Big Data User Group
 

Mais de Swiss Big Data User Group (20)

Making Hadoop based analytics simple for everyone to use
Making Hadoop based analytics simple for everyone to useMaking Hadoop based analytics simple for everyone to use
Making Hadoop based analytics simple for everyone to use
 
A real life project using Cassandra at a large Swiss Telco operator
A real life project using Cassandra at a large Swiss Telco operatorA real life project using Cassandra at a large Swiss Telco operator
A real life project using Cassandra at a large Swiss Telco operator
 
Data Analytics – B2B vs. B2C
Data Analytics – B2B vs. B2CData Analytics – B2B vs. B2C
Data Analytics – B2B vs. B2C
 
SQL on Hadoop
SQL on HadoopSQL on Hadoop
SQL on Hadoop
 
Building a Hadoop Data Warehouse with Impala
Building a Hadoop Data Warehouse with ImpalaBuilding a Hadoop Data Warehouse with Impala
Building a Hadoop Data Warehouse with Impala
 
Closing The Loop for Evaluating Big Data Analysis
Closing The Loop for Evaluating Big Data AnalysisClosing The Loop for Evaluating Big Data Analysis
Closing The Loop for Evaluating Big Data Analysis
 
Big Data and Data Science for traditional Swiss companies
Big Data and Data Science for traditional Swiss companiesBig Data and Data Science for traditional Swiss companies
Big Data and Data Science for traditional Swiss companies
 
Design Patterns for Large-Scale Real-Time Learning
Design Patterns for Large-Scale Real-Time LearningDesign Patterns for Large-Scale Real-Time Learning
Design Patterns for Large-Scale Real-Time Learning
 
Educating Data Scientists of the Future
Educating Data Scientists of the FutureEducating Data Scientists of the Future
Educating Data Scientists of the Future
 
Unleash the power of Big Data in your existing Data Warehouse
Unleash the power of Big Data in your existing Data WarehouseUnleash the power of Big Data in your existing Data Warehouse
Unleash the power of Big Data in your existing Data Warehouse
 
Big data for Telco: opportunity or threat?
Big data for Telco: opportunity or threat?Big data for Telco: opportunity or threat?
Big data for Telco: opportunity or threat?
 
Project "Babelfish" - A data warehouse to attack complexity
 Project "Babelfish" - A data warehouse to attack complexity Project "Babelfish" - A data warehouse to attack complexity
Project "Babelfish" - A data warehouse to attack complexity
 
Brainserve Datacenter: the High-Density Choice
Brainserve Datacenter: the High-Density ChoiceBrainserve Datacenter: the High-Density Choice
Brainserve Datacenter: the High-Density Choice
 
Urturn on AWS: scaling infra, cost and time to maket
Urturn on AWS: scaling infra, cost and time to maketUrturn on AWS: scaling infra, cost and time to maket
Urturn on AWS: scaling infra, cost and time to maket
 
The World Wide Distributed Computing Architecture of the LHC Datagrid
The World Wide Distributed Computing Architecture of the LHC DatagridThe World Wide Distributed Computing Architecture of the LHC Datagrid
The World Wide Distributed Computing Architecture of the LHC Datagrid
 
New opportunities for connected data : Neo4j the graph database
New opportunities for connected data : Neo4j the graph databaseNew opportunities for connected data : Neo4j the graph database
New opportunities for connected data : Neo4j the graph database
 
Technology Outlook - The new Era of computing
Technology Outlook - The new Era of computingTechnology Outlook - The new Era of computing
Technology Outlook - The new Era of computing
 
In-Store Analysis with Hadoop
In-Store Analysis with HadoopIn-Store Analysis with Hadoop
In-Store Analysis with Hadoop
 
Big Data Visualization With ParaView
Big Data Visualization With ParaViewBig Data Visualization With ParaView
Big Data Visualization With ParaView
 
Introduction to Apache Drill
Introduction to Apache DrillIntroduction to Apache Drill
Introduction to Apache Drill
 

Último

Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 

Último (20)

Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 

Automated conflict resolution - enabling masterless data distribution (Rune Skou Larsen, Trifork)

  • 1. Automated conflict resolution - enabling masterless data distribution GOTO Aarhus 2012 Premier software development conference created by developers for developers. Conference: Oct. 1-3 // Training: Sept. 30, Oct. 4-5 Se mere på www.gotocon.com/aarhus-2012 1
  • 2. About the speaker ■ Working with databases since '97 ■ NoSQL since 2008 ■ Danish Shared Medication Record ■ Migrating data from MySQL to Riak ■ Devel Riak clients ■ NoSQL architect on various international projects ■ Seoul, Korea ■ Toulouse, France ■ Leeds, UK RuneSkouLarsen ■ Copenhagen, Denmark 2
  • 3. Agenda ■Polyglot persistence landscape ■Distribute those data! ■Types of Consistency ■Moving towards consistency ■Introduction to CRDTs ■Consistency models of OLTP databases 3
  • 4. Polyglot persistence landscape In-memory OLTP Neo4j Riak VoltDB Cassandra Redis Voldemort CouchBase EasyDB Analytics MongoDB Hadoop CouchDB 4
  • 5. Why distributed databases? ■ Redundancy ■ Availability ■ Scaling ■ Getting closer to your users 5
  • 6. Types of Concistency ■ Consistency: All nodes see the same data at the same time ■ Eventual consistency → Autonomous consistency ■ Sequential consistency → Bureaucratic consistency 6
  • 7. When to be Consistent with what ■ Eventual consistency Support disconnected operations – Better to read a stale value than nothing – Better to save writes somewhere than nothing Potentially anomalous application behavior – Stale reads and conflicting writes… ■ Sequential consistency Requires highly available connections Not suitable for certain scenarios: – Disconnected clients (e.g. your phone) – Apps might prefer potential inconsistency to loss of availability 7
  • 8. Conflicting updates User A User B A B A B Asynchronous Synchronization 8
  • 9. Last Write Wins (LWW) User A User B A B t=t0 t=t1 A B t=t0 t=t1 Asynchronous Synchronization ■ Assign timestamp to all objects ■ Simple but fragile – depends on precise synchronization of timers ■ Data is lost 9
  • 10. Detecting conflicts using Vector Clocks (1) User A User B A B vclock=a:1 vclock=a:1,b:1 A A B vclock=a:1 vclock=a:1 Asynchronous vclock=a:1,b:1 Synchronization ■ Assign vector clock to objects ■ Spawn siblings when causality chain is broken ■ Data is never lost 10
  • 11. Detecting conflicts using Vector Clocks (2) User A User B A B vclock=a:1 vclock=b:1 A B vclock=a:1 vclock=b:1 Asynchronous Synchronization ■ Assign vector clock to objects ■ Spawn siblings when causality chain is broken ■ Data is never lost 11
  • 12. Conflict-free Replicated DataTypes User A User B A B AB A AB B Asynchronous Synchronization ■ Datastructure intrinsically merges objects ■ No data loss ■ Limited applicability 12
  • 13. Semantic resolution User A User B C A B A B Asynchronous Synchronization ■ Keep both values as siblings ■ User does the merging ■ Only solution for complex, important data 13
  • 14. Methods for resolving conflicts ■ Last Write Wins ■ Easy ■ Data is lost ■ Depends on timestamps ■ Conflict-free Data Types ■ Data structure has built-in convergence ■ Limited ability to model real-world problems ■ Semantic resolution ■ Requires application/user involvement ■ Generic solution 14
  • 15. Conflict-free Replicated Data Types ■ Convergent (CvRDT) ■ State is replicated ■ Moves towards one value ■ Commutative (CmRDT) ■ Operations to the state are replicated ■ The order of operations is insignificant a*b = b*a ■ CvRDT and CmRDT can emulate eachother 15
  • 16. CvRDT examples: G-set and 2P-Set Tombstone RIP 16
  • 17. CRDT References ■ CRDTs: Consistency without concurrency control 2009 INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET EN AUTOMATIQUE ■ A comprehensive study of Convergent and Commutative Replicated Data Types 2009 INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET EN AUTOMATIQUE ■ Sean Cribbs - Eventually Consistent Data Structures http://vimeo.com/43903960 17
  • 18. Consistency models of OLTP databases ■ Hinted handoff with sloppy quorums ■ Last write wins (highest write-availability) Riak Riak CouchDB/CouchBase Cassandra Cassandra ■ Strong consistency (read you own writes + strict quorums) ■ User resolvable conflicts Riak Riak Voldemort Voldemort Cassandra CouchDB/CouchBase (but unreliable) CouchBase ■ Active anti-entropy MongoDB Riak (Soon) Traditional SQL databases (Oracle, MySQL, etc.) 18
  • 19. Thank you Rune Skou Larsen rsl@trifork.com Twitter: RuneSkouLarsen 19