Distributed Transactions
Alan Medlar
amedlar@cs.ucl.ac.uk
Motivation
• Distributed Database
  • collection of sites, each with own database
  • each site processes local transactions
  • local transactions can only access local
    database
• Distributed transactions require co-
  ordination among sites
Advantages

• Distributed databases can improve
  availability (especially if we are using
  database replication)
• Processing sub-transactions in parallel at
  individual sites, instead of running everything
  at one site, improves performance
Disadvantages
• Cost: hardware, software dev, network
  (leased lines?)
• Operational Overhead: network traffic, co-
  ordination overhead
• Technical: harder to debug, security, greater
  complexity
• ACID properties harder to achieve
Main Issues
•   Transparency: database provides abstraction layer above
    data access, distributed databases should be accessed in the
    same way
•   Distributed Transactions: local transactions are only
    processed at one site, global transactions need to preserve
    ACID across multiple sites and provide distributed query
    processing (eg: distributed join)
•   Atomicity: all sites in a global transaction must commit or
    none do
•   Consistency: all schedules must be conflict serializable (last
    lecture!)
Failures
•   Site failures: exactly the same as for local databases
    (hardware failure, out of memory etc)
•   Networking failures
    •   Failure of a network link: no hope of communicating
        with other database site
    •   Loss of messages: network link might be fine, but
        congested, packet loss, TCP timeouts
    •   Network partition: more relevant to replication, set of
        replicas might be divided in two, updating only replicas
        in their partition
Fragmentation

• Divide a relation into fragments which can be
  allocated to different sites to optimise
  transaction processing (reduce processing
  time and network traffic overhead)
• Horizontal and vertical fragmentation
Branch Account no Customer Balance
Euston    1234      Alice    200
Euston    2345      Bob      100
Euston    3456      Eve       5
Harrow    4567     Richard   550
Harrow    5678      Jane     75
Harrow    6789     Graham    175
Branch Account no Customer Balance
Euston        1234           Alice         200
Euston        2345            Bob          100
Euston        3456            Eve           5

         Horizontal Fragmentation
(in this case taking advantage of usage locality)

Branch Account no Customer Balance
Harrow        4567         Richard         550
Harrow        5678           Jane          75
Harrow        6789         Graham          175
Branch Account no Customer Balance
Euston    1234      Alice    200
Euston    2345      Bob      100
Euston    3456      Eve       5
Harrow    4567     Richard   550
Harrow    5678      Jane     75
Harrow    6789     Graham    175
                Vertical Fragmentation

Branch   Customer   Id
Euston   Alice      0
Euston   Bob        1
Euston   Eve        2
Harrow   Richard    3
Harrow   Jane       4
Harrow   Graham     5

Id   Account no   Balance
0    1234         200
1    2345         100
2    3456         5
3    4567         550
4    5678         75
5    6789         175

    Additional Id-tuple allows for a join to recreate the
                      original relation
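
As a minimal sketch of both kinds of fragmentation, the account relation from the slides can be held as a list of Python dictionaries. The names ACCOUNTS, horizontal_fragments, vertical_fragments and rejoin are hypothetical and chosen for illustration only; a real DBMS would do this at the storage and query-planning level.

```python
# Hypothetical in-memory copy of the account relation shown on the slides.
ACCOUNTS = [
    {"branch": "Euston", "account_no": 1234, "customer": "Alice", "balance": 200},
    {"branch": "Euston", "account_no": 2345, "customer": "Bob", "balance": 100},
    {"branch": "Euston", "account_no": 3456, "customer": "Eve", "balance": 5},
    {"branch": "Harrow", "account_no": 4567, "customer": "Richard", "balance": 550},
    {"branch": "Harrow", "account_no": 5678, "customer": "Jane", "balance": 75},
    {"branch": "Harrow", "account_no": 6789, "customer": "Graham", "balance": 175},
]


def horizontal_fragments(rows, key):
    """Split rows into one fragment per distinct value of `key`."""
    fragments = {}
    for row in rows:
        fragments.setdefault(row[key], []).append(row)
    return fragments


def vertical_fragments(rows, left_cols, right_cols):
    """Split columns into two fragments that share an added tuple id."""
    left = [{"id": i, **{c: r[c] for c in left_cols}} for i, r in enumerate(rows)]
    right = [{"id": i, **{c: r[c] for c in right_cols}} for i, r in enumerate(rows)]
    return left, right


def rejoin(left, right):
    """Join two vertical fragments on the tuple id to rebuild the rows."""
    right_by_id = {r["id"]: r for r in right}
    joined = []
    for l_row in left:
        merged = {**l_row, **right_by_id[l_row["id"]]}
        del merged["id"]  # the id only exists to make the join possible
        joined.append(merged)
    return joined
```

Splitting on "branch" gives the Euston and Harrow fragments, and joining the two vertical fragments on the id gives back the original six rows.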
Problem

• Now our data is split into fragments and
  each fragment is at a separate site
• How do we access these sites using
  transactions, whilst maintaining the ACID
  properties?
2-Phase Commit

• Distributed algorithm that permits all nodes
  in a distributed system to agree to commit a
  transaction, the protocol results in all sites
  committing or aborting
• Reaches a consistent outcome despite network or node
  failures (though sites may have to wait for a failed
  co-ordinator to recover)
• Necessary to provide atomicity
2-Phase Commit
• Voting Phase: each site is polled as to
  whether a transaction should commit (ie:
  whether their sub-transaction can commit)
• Decision Phase: if any site says “abort” or
  does not reply, then all sites must be told
  to abort; otherwise all sites are told to commit
• Logging is performed for failure recovery
  (as usual)
[Diagram, built up over several slides: the client sends “start” to the TC; the TC sends “prepare” to sites A and B; A and B each reply “ready”; the TC sends “commit” to A and B; the TC returns “OK” to the client]
Voting Phase
• TC (transaction co-ordinator) writes
  <prepare Ti> to log
• TC sends prepare message to all sites (A,B)
• Each site’s local DBMS decides whether to
  commit its part of the transaction or abort.
  If it can commit it writes <ready Ti> to its log,
  otherwise <no Ti>
• Ready or abort message sent back to TC
Decision Phase
•   After receiving all results from prepare messages (or after
    a timeout) TC can decide whether the entire transaction
    should commit
•   If any site replies “abort” or times out, TC aborts the
    entire transaction by logging <abort Ti> and then sending
    the “abort” message to all sites
•   If all sites reply with “ready”, TC commits by logging
    <commit Ti> and sending the commit message to all sites
•   Upon receipt of a commit message, each site logs
    <commit Ti> and only then alters the database in memory
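
The two phases above can be sketched as a small in-process simulation. This is only an illustration under stated assumptions: the Site class, its can_commit and reachable flags, and the two_phase_commit function are hypothetical names, a timeout is modelled as a prepare call returning None, and logs are plain Python lists rather than stable storage.

```python
class Site:
    """Hypothetical participant: votes on prepare, then records the decision."""

    def __init__(self, name, can_commit=True, reachable=True):
        self.name = name
        self.can_commit = can_commit
        self.reachable = reachable
        self.log = []

    def prepare(self, tid):
        if not self.reachable:
            return None  # models a timeout: no reply reaches the TC
        # Log the vote before replying, as in the voting phase.
        self.log.append(("ready" if self.can_commit else "no", tid))
        return "ready" if self.can_commit else "abort"

    def decide(self, tid, decision):
        if self.reachable:
            self.log.append((decision, tid))


def two_phase_commit(tid, sites, tc_log):
    """Run both phases for transaction `tid` and return the decision."""
    # Voting phase: log <prepare Ti>, then poll every site.
    tc_log.append(("prepare", tid))
    votes = [site.prepare(tid) for site in sites]

    # Decision phase: commit only if every site replied "ready".
    decision = "commit" if all(v == "ready" for v in votes) else "abort"
    tc_log.append((decision, tid))  # log the decision before sending it
    for site in sites:
        site.decide(tid, decision)
    return decision
```

With two healthy sites the result is "commit"; a single "abort" vote or a missing reply turns the outcome into "abort" for every site.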
Failure Example 1
•   One of the database sites (A,B) fails
•   On recovery the log is examined:
    •   if log contains <commit Ti>, redo the changes of the
        transaction
    •   if the log contains <abort Ti>, undo the changes
    •   if the log contains <ready Ti>, but not a commit or abort,
        contact TC for the outcome of transaction Ti; if TC is down,
        ask the other sites
    •   if log does not contain ready, commit or abort then the failure
        must have occurred before the receipt of “prepare Ti”, so TC
        would have aborted the transaction
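
The recovery rules above amount to a lookup on the site's own log. A minimal sketch, assuming the log is a list of (record, transaction id) tuples and using the hypothetical function name recovery_action:

```python
def recovery_action(log, tid):
    """Return what a recovering site should do for transaction `tid`.

    `log` is assumed to be a list of (record, tid) tuples such as
    ("ready", "T1"); this layout is an illustration, not a real log format.
    """
    records = {rec for rec, t in log if t == tid}
    if "commit" in records:
        return "redo"
    if "abort" in records:
        return "undo"
    if "ready" in records:
        # The site voted ready but never learned the outcome.
        return "ask TC, then other sites"
    # No ready, commit or abort: the TC cannot have committed, so undo.
    return "undo"
```

The "ready" case is the only one where the site cannot decide on its own, which is why it has to contact the TC or its peers.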
Failure Example 2
•   The transaction co-ordinator (TC) fails (sites A or B waiting
    for commit/abort message)
•   Each database site log is examined:
    •   if any site log contains <commit Ti> Ti must be committed at all
        sites
    •   if any site log contains <abort Ti> or <no Ti> Ti must be aborted
        at all sites
    •   if any site log does not contain <ready Ti>, TC must have failed
        before deciding to commit, so Ti can be aborted
    •   if none of the above apply then all active sites must have
        <ready Ti> (but no additional commits or aborts), TC must be
        consulted (when it comes back online)
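
The same rules can be written as a function over the logs of the active sites. This is a sketch under assumptions: logs are lists of (record, transaction id) tuples, the name outcome_without_tc is hypothetical, and the third case is treated as a safe abort because the TC cannot have decided to commit without a ready vote from every site.

```python
def outcome_without_tc(site_logs, tid):
    """Decide the fate of `tid` from the active sites' logs while TC is down."""
    per_site = [{rec for rec, t in log if t == tid} for log in site_logs]
    if any("commit" in recs for recs in per_site):
        return "commit"
    if any("abort" in recs or "no" in recs for recs in per_site):
        return "abort"
    if any("ready" not in recs for recs in per_site):
        # Some site never voted ready, so TC cannot have committed.
        return "abort"
    # Every active site is ready and none knows the outcome: blocked.
    return "wait for TC"
```

The last case is the blocking situation: every active site has promised to commit if asked, so none of them may abort unilaterally.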
Network Faults

• Failure of the network
 • From the perspective of entities on one
    side of the network failure, entities on
    the other side have failed
    (apply previous strategies)
Locking (non-replicated system)

•   Each local site has a lock manager
    •   administers lock requests for data items stored
        at site
    •   when a transaction requires a data item to be
        locked, it requests a lock from the lock manager
    •   lock manager blocks the request until the lock can be granted
•   Problem: deadlocks in a distributed system, clearly
    more complicated to resolve...
Locking (single co-ordinator)

•   Have a single lock manager for the whole
    distributed database
    •   manages locks at all sites
    •   locks for reading of any replica
    •   locks for writing of all replicas
•   Simpler deadlock handling
•   Single point of failure
•   Bottleneck?
Locking (replicated system)

•   Majority protocol where each local site has a lock
    manager
•   A transaction wants a lock on a data item that is
    replicated at n sites
    •   must get a lock for that data item at more than n/2
        sites
    •   transaction cannot operate until it has locks on more
        than half of the replica sites (only one transaction can
        do this at a time)
    •   if replicas are written to, all replicas must be updated...
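
A minimal sketch of the majority rule, assuming one lock manager object per replica site. The LockManager class and the acquire_majority function are hypothetical; for simplicity the lock request is non-blocking and a failed attempt releases whatever it obtained, whereas a real lock manager would queue the request.

```python
class LockManager:
    """Hypothetical per-site lock manager holding one exclusive lock per item."""

    def __init__(self):
        self.holders = {}  # data item -> id of the transaction holding it

    def try_lock(self, item, tid):
        # Grant the lock if it is free or already held by this transaction.
        return self.holders.setdefault(item, tid) == tid

    def unlock(self, item, tid):
        if self.holders.get(item) == tid:
            del self.holders[item]


def acquire_majority(managers, item, tid):
    """Try to lock `item` at more than half of the replica sites."""
    granted = [m for m in managers if m.try_lock(item, tid)]
    if len(granted) > len(managers) / 2:
        return True
    # No majority: release what was obtained so other transactions can proceed.
    for m in granted:
        m.unlock(item, tid)
    return False
```

Because two different transactions cannot both hold more than n/2 of the n locks, at most one of them can succeed at a time.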
Updating Replicas
•   Replication makes reading more reliable
    (if each replica is independently unavailable with
    probability p, the probability that all n replicas
    are unavailable is p^n)
•   Replication makes writing less reliable (the
    probability that all n replicas are available to be
    updated by a write is (1-p)^n)
•   Writing must succeed even if not all replicas
    are available...
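
The two probabilities can be checked numerically. A small sketch, assuming replicas fail independently with the same probability p; the function names are hypothetical.

```python
def read_unavailable(p, n):
    """Probability that all n replicas are down, so a read cannot be served."""
    return p ** n


def write_all_available(p, n):
    """Probability that all n replicas are up, so a write-all can succeed."""
    return (1 - p) ** n
```

With p = 0.1 and n = 3, a read fails with probability 0.001, while a write that needs every replica succeeds with probability 0.729, which is lower than the 0.9 for a single copy.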
Updating Replicas (2)
• Majority update protocol!
• Update more than half of the replicas (the
    rest have “failed”, can be updated later), but
    this time add a timestamp or version number
• To read a data item, read more than half of
    the replicas and use the one with the most
    recent timestamp
•   Writing is more reliable, reading is more complex!
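
The majority update protocol can be sketched with version numbers. This is an illustration under assumptions: the Replica class with its up flag, and the majority_write and majority_read functions, are hypothetical names, and locking and concurrency are ignored.

```python
class Replica:
    """Hypothetical replica holding one data item plus its version number."""

    def __init__(self):
        self.up = True
        self.version = 0
        self.value = None


def majority_write(replicas, value):
    """Write to every reachable replica if they form a majority."""
    live = [r for r in replicas if r.up]
    if len(live) <= len(replicas) / 2:
        return False  # not enough replicas to guarantee overlap with readers
    new_version = max(r.version for r in live) + 1
    for r in live:
        r.version, r.value = new_version, value
    return True


def majority_read(replicas):
    """Read from a majority and return the value with the highest version."""
    live = [r for r in replicas if r.up]
    if len(live) <= len(replicas) / 2:
        raise RuntimeError("fewer than a majority of replicas reachable")
    return max(live, key=lambda r: r.version).value
```

Any two majorities share at least one replica, so a majority read always sees at least one copy carrying the latest version.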
~ Fin ~
(Graphics lectures begin on Monday 9th March)

Mais conteúdo relacionado

Destaque

Destaque (13)

Vaziuojam.lt
Vaziuojam.ltVaziuojam.lt
Vaziuojam.lt
 
2011 Db Concurrency
2011 Db Concurrency2011 Db Concurrency
2011 Db Concurrency
 
Marketing Design Portfolio
Marketing Design PortfolioMarketing Design Portfolio
Marketing Design Portfolio
 
M. Golivkin - Entreprenerystės Ekosistema
M. Golivkin - Entreprenerystės EkosistemaM. Golivkin - Entreprenerystės Ekosistema
M. Golivkin - Entreprenerystės Ekosistema
 
OCC Vilnius - 3 Men. Veiklos Apžvalga
OCC Vilnius - 3 Men. Veiklos ApžvalgaOCC Vilnius - 3 Men. Veiklos Apžvalga
OCC Vilnius - 3 Men. Veiklos Apžvalga
 
Autoerus - Automobiliu Servisas 03 26
Autoerus - Automobiliu Servisas 03 26Autoerus - Automobiliu Servisas 03 26
Autoerus - Automobiliu Servisas 03 26
 
Learnt Technologies
Learnt TechnologiesLearnt Technologies
Learnt Technologies
 
2011 Db Intro
2011 Db Intro2011 Db Intro
2011 Db Intro
 
ScimoreDB - Enterprise level database
ScimoreDB - Enterprise level databaseScimoreDB - Enterprise level database
ScimoreDB - Enterprise level database
 
Cloud Slam Co D Presentation
Cloud Slam Co D PresentationCloud Slam Co D Presentation
Cloud Slam Co D Presentation
 
APP DESIGN by WIZARD INTERACTIVE
APP DESIGN by WIZARD INTERACTIVEAPP DESIGN by WIZARD INTERACTIVE
APP DESIGN by WIZARD INTERACTIVE
 
2011 Db Transactions
2011 Db Transactions2011 Db Transactions
2011 Db Transactions
 
Target Audience/ Attracting Audience
Target Audience/ Attracting AudienceTarget Audience/ Attracting Audience
Target Audience/ Attracting Audience
 

Semelhante a 2011 Db Distributed

Two phase commit protocol in dbms
Two phase commit protocol in dbmsTwo phase commit protocol in dbms
Two phase commit protocol in dbmsDilouar Hossain
 
enc=encoded=TlJst0_SHq0cPRhLS74QDXTP4FpU303sSqpyVVkfhckA93UCiZrRF0QVNAFGmuGu9...
enc=encoded=TlJst0_SHq0cPRhLS74QDXTP4FpU303sSqpyVVkfhckA93UCiZrRF0QVNAFGmuGu9...enc=encoded=TlJst0_SHq0cPRhLS74QDXTP4FpU303sSqpyVVkfhckA93UCiZrRF0QVNAFGmuGu9...
enc=encoded=TlJst0_SHq0cPRhLS74QDXTP4FpU303sSqpyVVkfhckA93UCiZrRF0QVNAFGmuGu9...DHANUSHKUMARKS
 
Distributed Transactions(flat and nested) and Atomic Commit Protocols
Distributed Transactions(flat and nested) and Atomic Commit ProtocolsDistributed Transactions(flat and nested) and Atomic Commit Protocols
Distributed Transactions(flat and nested) and Atomic Commit ProtocolsSachin Chauhan
 
Kamaelia-ACCU-20050422
Kamaelia-ACCU-20050422Kamaelia-ACCU-20050422
Kamaelia-ACCU-20050422journeyer
 
Comet: an Overview and a New Solution Called Jabbify
Comet: an Overview and a New Solution Called JabbifyComet: an Overview and a New Solution Called Jabbify
Comet: an Overview and a New Solution Called JabbifyBrian Moschel
 
When Streaming Needs Batch With Konstantin Knauf | Current 2022
When Streaming Needs Batch With Konstantin Knauf | Current 2022When Streaming Needs Batch With Konstantin Knauf | Current 2022
When Streaming Needs Batch With Konstantin Knauf | Current 2022HostedbyConfluent
 
Extreme HTTP Performance Tuning: 1.2M API req/s on a 4 vCPU EC2 Instance
Extreme HTTP Performance Tuning: 1.2M API req/s on a 4 vCPU EC2 InstanceExtreme HTTP Performance Tuning: 1.2M API req/s on a 4 vCPU EC2 Instance
Extreme HTTP Performance Tuning: 1.2M API req/s on a 4 vCPU EC2 InstanceScyllaDB
 
High available energy management system
High available energy management systemHigh available energy management system
High available energy management systemJo Ee Liew
 
Web 2.0 Performance and Reliability: How to Run Large Web Apps
Web 2.0 Performance and Reliability: How to Run Large Web AppsWeb 2.0 Performance and Reliability: How to Run Large Web Apps
Web 2.0 Performance and Reliability: How to Run Large Web Appsadunne
 
Akka for realtime multiplayer mobile games
Akka for realtime multiplayer mobile gamesAkka for realtime multiplayer mobile games
Akka for realtime multiplayer mobile gamesYan Cui
 
Rails hosting
Rails hostingRails hosting
Rails hostingwonko
 
Enhanced Live Migration for Intensive Memory Loads
Enhanced Live Migration for Intensive Memory LoadsEnhanced Live Migration for Intensive Memory Loads
Enhanced Live Migration for Intensive Memory LoadsSamsung Open Source Group
 
Mutual Exclusion using Peterson's Algorithm
Mutual Exclusion using Peterson's AlgorithmMutual Exclusion using Peterson's Algorithm
Mutual Exclusion using Peterson's AlgorithmSouvik Roy
 
Acus08 Advanced Load Balancing Apache2.2
Acus08 Advanced Load Balancing Apache2.2Acus08 Advanced Load Balancing Apache2.2
Acus08 Advanced Load Balancing Apache2.2Jim Jagielski
 
How Booking.com avoids and deals with replication lag
How Booking.com avoids and deals with replication lagHow Booking.com avoids and deals with replication lag
How Booking.com avoids and deals with replication lagJean-François Gagné
 
TXGX 2019_Albert_High Availability Architecture of Klaytn Service Chain
TXGX 2019_Albert_High Availability Architecture of Klaytn Service ChainTXGX 2019_Albert_High Availability Architecture of Klaytn Service Chain
TXGX 2019_Albert_High Availability Architecture of Klaytn Service ChainKlaytn
 
Verifying Deadlock and Livelock Freedom in an SOA Scenario
Verifying Deadlock and Livelock Freedom in an SOA ScenarioVerifying Deadlock and Livelock Freedom in an SOA Scenario
Verifying Deadlock and Livelock Freedom in an SOA ScenarioUniversität Rostock
 
Solving_the_C20K_problem_PHP_Performance_and_Scalability-phpquebec_2009
Solving_the_C20K_problem_PHP_Performance_and_Scalability-phpquebec_2009Solving_the_C20K_problem_PHP_Performance_and_Scalability-phpquebec_2009
Solving_the_C20K_problem_PHP_Performance_and_Scalability-phpquebec_2009Hiroshi Ono
 
Transactions in Action: the Story of Exactly Once in Apache Kafka
Transactions in Action: the Story of Exactly Once in Apache KafkaTransactions in Action: the Story of Exactly Once in Apache Kafka
Transactions in Action: the Story of Exactly Once in Apache KafkaHostedbyConfluent
 

Semelhante a 2011 Db Distributed (20)

Two phase commit protocol in dbms
Two phase commit protocol in dbmsTwo phase commit protocol in dbms
Two phase commit protocol in dbms
 
enc=encoded=TlJst0_SHq0cPRhLS74QDXTP4FpU303sSqpyVVkfhckA93UCiZrRF0QVNAFGmuGu9...
enc=encoded=TlJst0_SHq0cPRhLS74QDXTP4FpU303sSqpyVVkfhckA93UCiZrRF0QVNAFGmuGu9...enc=encoded=TlJst0_SHq0cPRhLS74QDXTP4FpU303sSqpyVVkfhckA93UCiZrRF0QVNAFGmuGu9...
enc=encoded=TlJst0_SHq0cPRhLS74QDXTP4FpU303sSqpyVVkfhckA93UCiZrRF0QVNAFGmuGu9...
 
Distributed Transactions(flat and nested) and Atomic Commit Protocols
Distributed Transactions(flat and nested) and Atomic Commit ProtocolsDistributed Transactions(flat and nested) and Atomic Commit Protocols
Distributed Transactions(flat and nested) and Atomic Commit Protocols
 
Kamaelia-ACCU-20050422
Kamaelia-ACCU-20050422Kamaelia-ACCU-20050422
Kamaelia-ACCU-20050422
 
Comet: an Overview and a New Solution Called Jabbify
Comet: an Overview and a New Solution Called JabbifyComet: an Overview and a New Solution Called Jabbify
Comet: an Overview and a New Solution Called Jabbify
 
When Streaming Needs Batch With Konstantin Knauf | Current 2022
When Streaming Needs Batch With Konstantin Knauf | Current 2022When Streaming Needs Batch With Konstantin Knauf | Current 2022
When Streaming Needs Batch With Konstantin Knauf | Current 2022
 
Lattice yapc-slideshare
Lattice yapc-slideshareLattice yapc-slideshare
Lattice yapc-slideshare
 
Extreme HTTP Performance Tuning: 1.2M API req/s on a 4 vCPU EC2 Instance
Extreme HTTP Performance Tuning: 1.2M API req/s on a 4 vCPU EC2 InstanceExtreme HTTP Performance Tuning: 1.2M API req/s on a 4 vCPU EC2 Instance
Extreme HTTP Performance Tuning: 1.2M API req/s on a 4 vCPU EC2 Instance
 
High available energy management system
High available energy management systemHigh available energy management system
High available energy management system
 
Web 2.0 Performance and Reliability: How to Run Large Web Apps
Web 2.0 Performance and Reliability: How to Run Large Web AppsWeb 2.0 Performance and Reliability: How to Run Large Web Apps
Web 2.0 Performance and Reliability: How to Run Large Web Apps
 
Akka for realtime multiplayer mobile games
Akka for realtime multiplayer mobile gamesAkka for realtime multiplayer mobile games
Akka for realtime multiplayer mobile games
 
Rails hosting
Rails hostingRails hosting
Rails hosting
 
Enhanced Live Migration for Intensive Memory Loads
Enhanced Live Migration for Intensive Memory LoadsEnhanced Live Migration for Intensive Memory Loads
Enhanced Live Migration for Intensive Memory Loads
 
Mutual Exclusion using Peterson's Algorithm
Mutual Exclusion using Peterson's AlgorithmMutual Exclusion using Peterson's Algorithm
Mutual Exclusion using Peterson's Algorithm
 
Acus08 Advanced Load Balancing Apache2.2
Acus08 Advanced Load Balancing Apache2.2Acus08 Advanced Load Balancing Apache2.2
Acus08 Advanced Load Balancing Apache2.2
 
How Booking.com avoids and deals with replication lag
How Booking.com avoids and deals with replication lagHow Booking.com avoids and deals with replication lag
How Booking.com avoids and deals with replication lag
 
TXGX 2019_Albert_High Availability Architecture of Klaytn Service Chain
TXGX 2019_Albert_High Availability Architecture of Klaytn Service ChainTXGX 2019_Albert_High Availability Architecture of Klaytn Service Chain
TXGX 2019_Albert_High Availability Architecture of Klaytn Service Chain
 
Verifying Deadlock and Livelock Freedom in an SOA Scenario
Verifying Deadlock and Livelock Freedom in an SOA ScenarioVerifying Deadlock and Livelock Freedom in an SOA Scenario
Verifying Deadlock and Livelock Freedom in an SOA Scenario
 
Solving_the_C20K_problem_PHP_Performance_and_Scalability-phpquebec_2009
Solving_the_C20K_problem_PHP_Performance_and_Scalability-phpquebec_2009Solving_the_C20K_problem_PHP_Performance_and_Scalability-phpquebec_2009
Solving_the_C20K_problem_PHP_Performance_and_Scalability-phpquebec_2009
 
Transactions in Action: the Story of Exactly Once in Apache Kafka
Transactions in Action: the Story of Exactly Once in Apache KafkaTransactions in Action: the Story of Exactly Once in Apache Kafka
Transactions in Action: the Story of Exactly Once in Apache Kafka
 

Último

What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 

Último (20)

What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 

2011 Db Distributed

  • 1. Distributed Transactions Alan Medlar amedlar@cs.ucl.ac.uk
  • 2. Motivation • Distributed Database • collection of sites, each with own database • each site processes local transactions • local transactions can only access local database • Distributed transactions require co- ordination among sites
  • 3. Advantages • Distributed databases can improve availability (especially if we are using database replication) • Parallel processing of sub-transactions at individual sites instead of all locally improves performance
  • 4. Disadvantages • Cost: hardware, software dev, network (leased lines?) • Operational Overhead: network traffic, co- ordination overhead • Technical: harder to debug, security, greater complexity • ACID properties harder to achieve
  • 5. Main Issues • Transparency: database provides abstraction layer above data access, distributed databases should be accessed in the same way • Distributed Transactions: local transactions are only processed at one site, global transactions need to preserve ACID across multiple sites and provide distributed query processing (eg: distributed join) • Atomicity: all sites in a global transactions must commit or none do • Consistency: all schedules must be conflict serializable (last lecture!)
  • 6. Failures • Site failures: exactly the same as for local databases (hardware failure, out of memory etc) • Networking failures • Failure of a network link: no hope of communicating with other database site • Loss of messages: network link might be fine, but congested, packet loss, TCP timeouts • Network partition: more relevant to replication, set of replicas might be divided in two, updating only replicas in their partition
  • 7. Fragmentation • Divide a relation into sections which can be allocated to different sites to optimise (reduce processing time, network traffic overhead) transaction processing • Horizontal and vertical fragmentation
  • 8. Branch Account no Customer Balance Euston 1234 Alice 200 Euston 2345 Bob 100 Euston 3456 Eve 5 Harrow 4567 Richard 550 Harrow 5678 Jane 75 Harrow 6789 Graham 175
  • 9. Branch Account no Customer Balance Euston 1234 Alice 200 Euston 2345 Bob 100 Euston 3456 Eve 5 Horizontal Fragmentation (in this case taking advantage of usage locality) Branch Account no Customer Balance Harrow 4567 Richard 550 Harrow 5678 Jane 75 Harrow 6789 Graham 175
  • 10. Branch Account no Customer Balance Euston 1234 Alice 200 Euston 2345 Bob 100 Euston 3456 Eve 5 Harrow 4567 Richard 550 Harrow 5678 Jane 75 Harrow 6789 Graham 175
  • 11. Branch Customer Id Id Account no Balance Euston Alice 0 0 1234 200 Euston Bob 1 1 2345 100 Euston Eve 2 2 3456 5 Harrow Richard 3 3 4567 550 Harrow Jane 4 4 5678 75 Harrow Graham 5 5 6789 175 Vertical Fragmentation Additional Id-tuple allows for a join to recreate the original relation
  • 12. Problem • Now our data is split into fragments and each fragment is at a separate site • How do we access these sites using transactions, whilst maintaining the ACID properties?
  • 13. 2-Phase Commit • Distributed algorithm that permits all nodes in a distributed system to agree to commit a transaction, the protocol results in all sites committing or aborting • Completes despite network or node failures • Necessary to provide atomicity
  • 14. 2-Phase Commit • Voting Phase: each site is polled as to whether a transaction should commit (ie: whether their sub-transaction can commit) • Decision Phase: if any site says “abort” or does not reply, then all sites must be told to abort • Logging is performed for failure recovery (as usual)
  • 15-25. [Diagram, built up step by step: 2-phase commit message sequence between client, TC and sites A and B. Client sends "start" to TC; TC sends "prepare" to A and B; A and B each reply "ready"; TC sends "commit" to A and B; TC replies "OK" to client]
  • 26. Voting Phase • TC (transaction co-ordinator) writes <prepare Ti> to log • TC sends prepare message to all sites (A,B) • Each site’s local DBMS decides whether to commit its part of the transaction or abort. If it can commit, it writes <ready Ti> to its log, otherwise <no Ti> • Ready or abort message sent back to TC
  • 27. Decision Phase • After receiving all results from prepare messages (or after a timeout) TC can decide whether the entire transaction should commit • If any site replies “abort” or times out, TC aborts the entire transaction by logging <abort Ti> and then sending the “abort” message to all sites • If all sites reply with “ready”, TC commits by logging <commit Ti> and sending the commit message to all sites • Upon receipt of a commit message, each site logs <commit Ti> and only then alters the database in memory
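The voting and decision phases can be sketched in a single process as below. This is a simplified illustration, not the lecture's implementation: the `Site` class, the `reachable` flag (standing in for a network timeout) and the function `two_phase_commit` are all assumptions, and real sites would exchange messages over a network and force their logs to stable storage.

```python
class Site:
    """Hypothetical participant: votes in phase 1, obeys the decision in phase 2."""

    def __init__(self, name, can_commit=True, reachable=True):
        self.name = name
        self.can_commit = can_commit
        self.reachable = reachable
        self.log = []

    def prepare(self, tid):
        if not self.reachable:
            return None  # models a timeout: no reply reaches the TC
        vote = "ready" if self.can_commit else "no"
        self.log.append(f"<{vote} {tid}>")  # log the vote before replying
        return vote

    def decide(self, tid, decision):
        if self.reachable:
            self.log.append(f"<{decision} {tid}>")


def two_phase_commit(tid, sites, tc_log):
    # Voting phase: log <prepare Ti>, then poll every site.
    tc_log.append(f"<prepare {tid}>")
    votes = [site.prepare(tid) for site in sites]

    # Decision phase: commit only if every site replied "ready".
    decision = "commit" if all(vote == "ready" for vote in votes) else "abort"
    tc_log.append(f"<{decision} {tid}>")  # log the decision before sending it
    for site in sites:
        site.decide(tid, decision)
    return decision
```

For example, `two_phase_commit("T1", [Site("A"), Site("B", can_commit=False)], [])` returns `"abort"`, because a single "no" vote (or a missing reply) is enough to abort the whole transaction.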
  • 28. Failure Example 1 • One of the database sites (A,B) fails • On recovery the log is examined: • if the log contains <commit Ti>, redo the changes of the transaction • if the log contains <abort Ti>, undo the changes • if the log contains <ready Ti>, but not a commit or abort, contact TC for the outcome of transaction Ti; if TC is down, ask the other sites • if the log does not contain ready, commit or abort then the site must have failed before replying to “prepare Ti”, so TC would have aborted the transaction
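The recovery rules for a failed site can be written as one decision function. This is a sketch under assumptions: the log is modelled as a list of strings in the `<record Ti>` notation of the slides, and the name `recovery_action` and its return values are hypothetical. The final case returns "undo" on the assumption that a site which never voted ready rolls back locally, since TC will have aborted.

```python
def recovery_action(log, tid):
    """Return what a recovering site should do for transaction `tid`."""
    if f"<commit {tid}>" in log:
        return "redo"
    if f"<abort {tid}>" in log:
        return "undo"
    if f"<ready {tid}>" in log:
        # Voted ready but never learned the outcome: the site cannot decide alone.
        return "ask TC, or the other sites if TC is down"
    # No ready, commit or abort record: TC cannot have committed, so roll back.
    return "undo"
```

Only the third case requires communication; in the other three the site's own log is enough to decide.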
  • 29. Failure Example 2 • The transaction coordinator (TC) fails (sites A or B waiting for commit/abort message) • Each database site log is examined: • if any site log contains <commit Ti>, Ti must be committed at all sites • if any site log contains <abort Ti> or <no Ti>, Ti must be aborted at all sites • if any site log does not contain <ready Ti>, TC must have failed before deciding to commit, so Ti can be aborted • if none of the above apply then all active sites must have <ready Ti> (but no additional commits or aborts), so TC must be consulted (when it comes back online)
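The four rules for a failed coordinator can be sketched as a function over the logs of the active sites. The function name `outcome_without_tc` and the string-list log format are illustrative assumptions. The third rule is taken to mean that the sites may abort, since TC cannot have decided to commit without a ready vote from every site.

```python
def outcome_without_tc(site_logs, tid):
    """Decide the fate of `tid` from the active sites' logs while TC is down."""
    logs = list(site_logs)
    if any(f"<commit {tid}>" in log for log in logs):
        return "commit"
    if any(f"<abort {tid}>" in log or f"<no {tid}>" in log for log in logs):
        return "abort"
    if any(f"<ready {tid}>" not in log for log in logs):
        return "abort"  # TC cannot have decided to commit without this vote
    # Every active site is ready but none knows the decision: must wait for TC.
    return "wait for TC"
```

The last case is where the sites are stuck until the coordinator recovers, for example `outcome_without_tc([["<ready T1>"], ["<ready T1>"]], "T1")` returns `"wait for TC"`.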
  • 30. Network Faults • Failure of the network • From the perspective of entities on one side of the network failure, entities on the other side have failed (apply previous strategies)
  • 31. Locking (non-replicated system) • Each local site has a lock manager • administers lock requests for data items stored at site • when a transaction requires a data item to be locked, it requests a lock from the lock manager • the request blocks until the lock can be granted • Problem: deadlocks in a distributed system, clearly more complicated to resolve...
  • 32. Locking (single co-ordinator) • Have a single lock manager for the whole distributed database • manages locks at all sites • locks for reading of any replica • locks for writing of all replicas • Simpler deadlock handling • Single point of failure • Bottleneck?
  • 33. Locking (replicated system) • Majority protocol where each local site has a lock manager • A transaction wants a lock on a data item that is replicated at n sites • must get a lock for that data item at more than n/2 sites • transaction cannot operate until it has locks on more than half of the replica sites (only one transaction can do this at a time) • if the data item is written to, all replicas must be updated...
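A rough sketch of majority locking, under several simplifying assumptions that are not in the slides: locks are exclusive only, requests are non-blocking (`try_lock` returns immediately instead of waiting), and a transaction that fails to reach a majority gives back the locks it was granted in that attempt. The names `LockManager` and `acquire_majority` are hypothetical.

```python
class LockManager:
    """Hypothetical per-site lock manager holding exclusive locks only."""

    def __init__(self):
        self.holder = {}  # data item -> id of the transaction holding it

    def try_lock(self, item, tid):
        # Grant if the item is free or already held by the same transaction.
        if self.holder.get(item, tid) != tid:
            return False
        self.holder[item] = tid
        return True

    def unlock(self, item, tid):
        if self.holder.get(item) == tid:
            del self.holder[item]


def acquire_majority(managers, item, tid):
    """Try to lock `item` at more than half of the replica sites."""
    granted = [m for m in managers if m.try_lock(item, tid)]
    if len(granted) > len(managers) / 2:
        return True
    for m in granted:  # release partial locks so other transactions can proceed
        m.unlock(item, tid)
    return False
```

Because two sets that each contain more than half of the n sites must share at least one site, two transactions can never both hold a majority for the same item.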
  • 34. Updating Replicas • Replication makes reading more reliable (if the probability that a replica is unavailable is p, the probability that all n replicas are unavailable is p^n) • Replication makes writing less reliable (the probability that all n replicas are available to be updated with a write is (1-p)^n) • Writing must succeed even if not all replicas are available...
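A quick numerical illustration of the two probabilities, assuming replicas fail independently (an assumption the formulas rely on). The function names are illustrative. With p = 0.1 and n = 3, a read fails only if all three replicas are down, which has probability 0.1^3 = 0.001, while a write that needs all three replicas succeeds with probability 0.9^3 = 0.729.

```python
def all_replicas_unavailable(p, n):
    """Probability that a read finds no replica at all: p^n."""
    return p ** n


def all_replicas_available(p, n):
    """Probability that a write can reach every replica: (1 - p)^n."""
    return (1 - p) ** n


# Show how both probabilities change as replicas are added (p = 0.1).
for n in (1, 3, 5):
    print(n, all_replicas_unavailable(0.1, n), all_replicas_available(0.1, n))
```

Adding replicas drives the first number towards zero and the second one down as well, which is why a write-all policy becomes less practical as n grows.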
  • 35. Updating Replicas (2) • Majority update protocol! • Update more than half of the replicas (the rest have “failed”, can be updated later), but this time add a timestamp or version number • To read a data item, read more than half of the replicas and use the one with the most recent timestamp • Write more reliable, reading more complex!
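The majority update protocol can be sketched with version numbers as below. This is a single-process illustration under assumptions: the `Replica` class, its `available` flag and the functions `quorum_write` and `quorum_read` are hypothetical names, and locking and later repair of stale replicas are left out.

```python
class Replica:
    """Hypothetical replica holding one data item and its version number."""

    def __init__(self):
        self.available = True
        self.version = 0
        self.value = None


def quorum_write(replicas, value):
    """Write to every reachable replica if more than half are reachable."""
    live = [r for r in replicas if r.available]
    if len(live) <= len(replicas) / 2:
        return False  # cannot reach more than half of the replicas
    new_version = max(r.version for r in live) + 1
    for r in live:
        r.version, r.value = new_version, value
    return True


def quorum_read(replicas):
    """Read more than half of the replicas and return the newest value."""
    live = [r for r in replicas if r.available]
    if len(live) <= len(replicas) / 2:
        raise RuntimeError("no majority of replicas reachable")
    return max(live, key=lambda r: r.version).value
```

Any read majority overlaps any write majority in at least one replica, so the highest version number seen by a read always belongs to the most recent successful write.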
  • 36. ~ Fin ~ (Graphics lectures begin on Monday 9th March)

Editor's Notes