SlideShare uma empresa Scribd logo
1 de 61
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/1
Outline
ā€¢ Introduction
ā€¢ Background
ā€¢ Distributed Database Design
ā€¢ Database Integration
ā€¢ Semantic Data Control
ā€¢ Distributed Query Processing
ā€¢ Distributed Transaction Management
āž” Transaction Concepts and Models
āž” Distributed Concurrency Control
āž” Distributed Reliability
ā€¢ Data Replication
ā€¢ Parallel Database Systems
ā€¢ Distributed Object DBMS
ā€¢ Peer-to-Peer Data Management
ā€¢ Web Data Management
ā€¢ Current Issues
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/2
Reliability
Problem:
How to maintain
atomicity
durability
properties of transactions
Ch.10/2
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/3
Fundamental Definitions
ā€¢ Reliability
āž” A measure of success with which a system conforms to some authoritative
specification of its behavior.
āž” Probability that the system has not experienced any failures within a given
time period.
āž” Typically used to describe systems that cannot be repaired or where the
continuous operation of the system is critical.
ā€¢ Availability
āž” The fraction of the time that a system meets its specification.
āž” The probability that the system is operational at a given time t.
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/4
Fundamental Definitions
ā€¢ Failure
āž” The deviation of a system from the behavior that is described in its
specification.
ā€¢ Erroneous state
āž” The internal state of a system such that there exist circumstances in which
further processing, by the normal algorithms of the system, will lead to a
failure which is not attributed to a subsequent fault.
ā€¢ Error
āž” The part of the state which is incorrect.
ā€¢ Fault
āž” An error in the internal states of the components of a system or in the design
of a system.
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/5
Faults to Failures
Fault Error Failure
causes results in
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/6
Types of Faults
ā€¢ Hard faults
āž” Permanent
āž” Resulting failures are called hard failures
ā€¢ Soft faults
āž” Transient or intermittent
āž” Account for more than 90% of all failures
āž” Resulting failures are called soft failures
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/7
Fault Classification
Permanent
fault
Incorrect
design
Unstable
environment
Operator
mistake
Transient
error
System
Failure
Unstable or
marginal
components
Intermittent
error
Permanent
error
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/8
Failures
Fault
occurs
Error
caused
Detection
of error
Repair Fault
occurs
Error
caused
MTBF
MTTRMTTD
Multiple errors can occur
during this period
Time
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/9
Fault Tolerance Measures
Reliability
R(t) = Pr{0 failures in time [0,t] | no failures at t=0}
If occurrence of failures is Poisson
R(t) = Pr{0 failures in time [0,t]}
Then
where
z(x) is known as the hazard function which gives the time-dependent failure
rate of the component
k!
Pr(k failures in time [0,t] =
e-m(t)[m(t)]k
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/10
Fault-Tolerance Measures
Reliability
The mean number of failures in time [0, t] can be computed as
and the variance can be be computed as
Var[k] = E[k2] - (E[k])2 = m(t)
Thus, reliability of a single component is
R(t) = e-m(t)
and of a system consisting of n non-redundant components as
E [k] =
k =0
āˆž
k k!
e-m(t )[m(t )]k
= m(t )
Rsys(t) =
i =1
n
Ri(t)
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/11
Fault-Tolerance Measures
Availability
A(t) = Pr{system is operational at time t}
Assume
āœ¦ Poisson failures with rate
āœ¦ Repair time is exponentially distributed with mean 1/Ī¼
Then, steady-state availability
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/12
Fault-Tolerance Measures
MTBF
Mean time between failures
MTTR
Mean time to repair
Availability
MTBF
MTBF + MTTR
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/13
Types of Failures
ā€¢ Transaction failures
āž” Transaction aborts (unilaterally or due to deadlock)
āž” Avg. 3% of transactions abort abnormally
ā€¢ System (site) failures
āž” Failure of processor, main memory, power supply, ā€¦
āž” Main memory contents are lost, but secondary storage contents are safe
āž” Partial vs. total failure
ā€¢ Media failures
āž” Failure of secondary storage devices such that the stored data is lost
āž” Head crash/controller failure (?)
ā€¢ Communication failures
āž” Lost/undeliverable messages
āž” Network partitioning
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/14
Local Recovery Management ā€“
Architecture
ā€¢ Volatile storage
āž” Consists of the main memory of the computer system (RAM).
ā€¢ Stable storage
āž” Resilient to failures and loses its contents only in the presence of media
failures (e.g., head crashes on disks).
āž” Implemented via a combination of hardware (non-volatile storage) and
software (stable-write, stable-read, clean-up) components.
Secondary
storage
Stable
database
Read Write
Write Read
Main memoryLocal Recovery
Manager
Database Buffer
Manager
Fetch,
Flush Database
buffers
(Volatile
database)
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/15
Update Strategies
ā€¢ In-place update
āž” Each update causes a change in one or more data values on pages in the
database buffers
ā€¢ Out-of-place update
āž” Each update causes the new value(s) of data item(s) to be stored separate
from the old value(s)
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/16
In-Place Update Recovery
Information
Database Log
Every action of a transaction must not only perform the action, but must also
write a log record to an append-only file.
New
stable database
state
Database
Log
Update
Operation
Old
stable database
state
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/17
Logging
The log contains information used by the recovery process to restore the
consistency of a system. This information may include
āž” transaction identifier
āž” type of operation (action)
āž” items accessed by the transaction to perform the action
āž” old value (state) of item (before image)
āž” new value (state) of item (after image)
ā€¦
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/18
Why Logging?
Upon recovery:
āž” all of T1's effects should be reflected in the database (REDO if necessary due to
a failure)
āž” none of T2's effects should be reflected in the database (UNDO if necessary)
0 t time
system
crash
T1Begin End
Begin T2
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/19
REDO Protocol
ā€¢ REDO'ing an action means performing it again.
ā€¢ The REDO operation uses the log information and performs the action that
might have been done before, or not done due to failures.
ā€¢ The REDO operation generates the new image.
Database
Log
REDO
Old
stable database
state
New
stable database
state
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/20
UNDO Protocol
ā€¢ UNDO'ing an action means to restore the object to its before image.
ā€¢ The UNDO operation uses the log information and restores the old value
of the object.
New
stable database
state
Database
Log
UNDO
Old
stable database
state
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/21
When to Write Log Records Into
Stable Store
Assume a transaction T updates a page P
ā€¢ Fortunate case
āž” System writes P in stable database
āž” System updates stable log for this update
āž” SYSTEM FAILURE OCCURS!... (before T commits)
We can recover (undo) by restoring P to its old state by using the log
ā€¢ Unfortunate case
āž” System writes P in stable database
āž” SYSTEM FAILURE OCCURS!... (before stable log is updated)
We cannot recover from this failure because there is no log record to
restore the old value.
ā€¢ Solution: Write-Ahead Log (WAL) protocol
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/22
Writeā€“Ahead Log Protocol
ā€¢ Notice:
āž” If a system crashes before a transaction is committed, then all the operations
must be undone. Only need the before images (undo portion of the log).
āž” Once a transaction is committed, some of its actions might have to be redone.
Need the after images (redo portion of the log).
ā€¢ WAL protocol :
ļ‚Œ Before a stable database is updated, the undo portion of the log should be
written to the stable log
ļ‚ When a transaction commits, the redo portion of the log must be written to
stable log prior to the updating of the stable database.
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/23
Logging Interface
Read
WriteWrite
Read
Main memory
Local Recovery
Manager
Database Buffer
Manager
Fetch,
Flush
Secondary
storage
Stable
log
Stable
database
Database
buffers
(Volatile
database)
Log
buffers
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/24
Out-of-Place Update Recovery
Information
ā€¢ Shadowing
āž” When an update occurs, don't change the old page, but create a shadow page
with the new values and write it into the stable database.
āž” Update the access paths so that subsequent accesses are to the new shadow
page.
āž” The old page retained for recovery.
ā€¢ Differential files
āž” For each file F maintain
āœ¦ a read only part FR
āœ¦ a differential file consisting of insertions part DF+ and deletions part DF-
āœ¦ Thus, F = (FR DF+) ā€“ DF-
āž” Updates treated as delete old value, insert new value
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/25
Execution of Commands
Commands to consider:
begin_transaction
read
write
commit
abort
recover
Independent of execution
strategy for LRM
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/26
Execution Strategies
ā€¢ Dependent upon
āž” Can the buffer manager decide to write some of the buffer pages being
accessed by a transaction into stable storage or does it wait for LRM to
instruct it?
āœ¦ fix/no-fix decision
āž” Does the LRM force the buffer manager to write certain buffer pages into
stable database at the end of a transaction's execution?
āœ¦ flush/no-flush decision
ā€¢ Possible execution strategies:
āž” no-fix/no-flush
āž” no-fix/flush
āž” fix/no-flush
āž” fix/flush
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/27
No-Fix/No-Flush
ā€¢ Abort
āž” Buffer manager may have written some of the updated pages into stable
database
āž” LRM performs transaction undo (or partial undo)
ā€¢ Commit
āž” LRM writes an ā€œend_of_transactionā€ record into the log.
ā€¢ Recover
āž” For those transactions that have both a ā€œbegin_transactionā€ and an
ā€œend_of_transactionā€ record in the log, a partial redo is initiated by LRM
āž” For those transactions that only have a ā€œbegin_transactionā€ in the log, a global
undo is executed by LRM
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/28
No-Fix/Flush
ā€¢ Abort
āž” Buffer manager may have written some of the updated pages into stable
database
āž” LRM performs transaction undo (or partial undo)
ā€¢ Commit
āž” LRM issues a flush command to the buffer manager for all updated pages
āž” LRM writes an ā€œend_of_transactionā€ record into the log.
ā€¢ Recover
āž” No need to perform redo
āž” Perform global undo
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/29
Fix/No-Flush
ā€¢ Abort
āž” None of the updated pages have been written into stable database
āž” Release the fixed pages
ā€¢ Commit
āž” LRM writes an ā€œend_of_transactionā€ record into the log.
āž” LRM sends an unfix command to the buffer manager for all pages that were
previously fixed
ā€¢ Recover
āž” Perform partial redo
āž” No need to perform global undo
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/30
Fix/Flush
ā€¢ Abort
āž” None of the updated pages have been written into stable database
āž” Release the fixed pages
ā€¢ Commit (the following have to be done atomically)
āž” LRM issues a flush command to the buffer manager for all updated pages
āž” LRM sends an unfix command to the buffer manager for all pages that were
previously fixed
āž” LRM writes an ā€œend_of_transactionā€ record into the log.
ā€¢ Recover
āž” No need to do anything
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/31
Checkpoints
ā€¢ Simplifies the task of determining actions of transactions that need to be
undone or redone when a failure occurs.
ā€¢ A checkpoint record contains a list of active transactions.
ā€¢ Steps:
ļ‚Œ Write a begin_checkpoint record into the log
ļ‚ Collect the checkpoint dat into the stable storage
ļ‚Ž Write an end_checkpoint record into the log
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/32
Media Failures ā€“ Full Architecture
Read
WriteWrite
Read
Main memory
Local Recovery
Manager
Database Buffer
Manager
Fetch,
Flush
Archive
log
Archive
database
Secondary
storage
Stable
log
Stable
database
Database
buffers
(Volatile
database)
Log
buffers
Write Write
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/33
Distributed Reliability Protocols
ā€¢ Commit protocols
āž” How to execute commit command for distributed transactions.
āž” Issue: how to ensure atomicity and durability?
ā€¢ Termination protocols
āž” If a failure occurs, how can the remaining operational sites deal with it.
āž” Non-blocking : the occurrence of failures should not force the sites to wait until
the failure is repaired to terminate the transaction.
ā€¢ Recovery protocols
āž” When a failure occurs, how do the sites where the failure occurred deal with
it.
āž” Independent : a failed site can determine the outcome of a transaction without
having to obtain remote information.
ā€¢ Independent recovery non-blocking termination
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/34
Two-Phase Commit (2PC)
Phase 1 : The coordinator gets the participants ready to write the results into
the database
Phase 2 : Everybody writes the results into the database
āž” Coordinator :The process at the site where the transaction originates and
which controls the execution
āž” Participant :The process at the other sites that participate in executing the
transaction
Global Commit Rule:
ļ‚Œ The coordinator aborts a transaction if and only if at least one participant
votes to abort it.
ļ‚ The coordinator commits a transaction if and only if all of the participants
vote to commit it.
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/35
Centralized 2PC
ready? yes/no commit/abort?commited/aborted
Phase 1 Phase 2
C C C
P
P
P
P
P
P
P
P
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/36
2PC Protocol Actions
ParticipantCoordinator
No
Yes
VOTE-COMMIT
Yes GLOBAL-ABORT
No
write abort
in log
Abort
Commit
ACK
ACK
INITIAL
write abort
in log
write ready
in log
write commit
in log
Type of
msg
WAIT
Ready to
Commit?
write commit
in log
Any No?
write abort
in log
ABORTCOMMIT
COMMITABORT
write
begin_commit
in log
write
end_of_transaction
in log
READ
Y
INITIAL
Unilateralabort
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/37
Linear 2PC
Prepare VC/VA
Phase 1
Phase 2
GC/GA
VC/VA VC/VA VC/VA
VC: Vote-Commit, VA: Vote-Abort, GC: Global-commit, GA: Global-abort
1 2 3 4 5 N
GC/GA GC/GA GC/GA GC/GA
ā‰ˆā‰ˆ
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/38
Distributed 2PC
prepare
vote-abort/
vote-commit
global-commit/
global-abort
decision made
independently
Phase 1
Coordinator Participants Participants
Phase 2
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/39
State Transitions in 2PC
INITIAL
WAIT
Commit command
Prepare
Vote-commit (all)
Global-commit
INITIAL
READY
Prepare
Vote-commit
Global-commit
Ack
Prepare
Vote-abort
Global-abort
Ack
Coordinator Participants
Vote-abort
Global-abort
ABORT COMMIT COMMITABORT
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/40
Site Failures - 2PC Termination
ā€¢ Timeout in INITIAL
āž” Who cares
ā€¢ Timeout in WAIT
āž” Cannot unilaterally commit
āž” Can unilaterally abort
ā€¢ Timeout in ABORT or COMMIT
āž” Stay blocked and wait for the acks
COORDINATOR
INITIAL
WAIT
Commit command
Prepare
Vote-commit
Global-commit
ABORT COMMIT
Vote-abort
Global-abort
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/41
Site Failures - 2PC Termination
ā€¢ Timeout in INITIAL
āž” Coordinator must have failed in
INITIAL state
āž” Unilaterally abort
ā€¢ Timeout in READY
āž” Stay blocked
INITIAL
READY
Prepare
Vote-commit
Global-commit
Ack
Prepare
Vote-abort
Global-abort
Ack
ABORT COMMIT
PARTICIPANTS
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/42
Site Failures - 2PC Recovery
ā€¢ Failure in INITIAL
āž” Start the commit process upon recovery
ā€¢ Failure in WAIT
āž” Restart the commit process upon recovery
ā€¢ Failure in ABORT or COMMIT
āž” Nothing special if all the acks have been
received
āž” Otherwise the termination protocol is
involved
COORDINATOR
INITIAL
WAIT
Commit command
Prepare
Vote-commit
Global-commit
ABORT COMMIT
Vote-abort
Global-abort
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/43
Site Failures - 2PC Recovery
ā€¢ Failure in INITIAL
āž” Unilaterally abort upon recovery
ā€¢ Failure in READY
āž” The coordinator has been informed
about the local decision
āž” Treat as timeout in READY state and
invoke the termination protocol
ā€¢ Failure in ABORT or COMMIT
āž” Nothing special needs to be done
INITIAL
READY
Prepare
Vote-commit
Global-commit
Ack
Prepare
Vote-abort
Global-abort
Ack
ABORT COMMIT
PARTICIPANTS
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/44
2PC Recovery Protocols ā€“
Additional Cases
Arise due to non-atomicity of log and message send actions
ā€¢ Coordinator site fails after writing ā€œbegin_commitā€ log and before sending
ā€œprepareā€ command
āž” treat it as a failure in WAIT state; send ā€œprepareā€ command
ā€¢ Participant site fails after writing ā€œreadyā€ record in log but before ā€œvote-
commitā€ is sent
āž” treat it as failure in READY state
āž” alternatively, can send ā€œvote-commitā€ upon recovery
ā€¢ Participant site fails after writing ā€œabortā€ record in log but before ā€œvote-
abortā€ is sent
āž” no need to do anything upon recovery
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/45
2PC Recovery Protocols ā€“
Additional Case
ā€¢ Coordinator site fails after logging its final decision record but before
sending its decision to the participants
āž” coordinator treats it as a failure in COMMIT or ABORT state
āž” participants treat it as timeout in the READY state
ā€¢ Participant site fails after writing ā€œabortā€ or ā€œcommitā€ record in log but
before acknowledgement is sent
āž” participant treats it as failure in COMMIT or ABORT state
āž” coordinator will handle it by timeout in COMMIT or ABORT state
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/46
Problem With 2PC
ā€¢ Blocking
āž” Ready implies that the participant waits for the coordinator
āž” If coordinator fails, site is blocked until recovery
āž” Blocking reduces availability
ā€¢ Independent recovery is not possible
ā€¢ However, it is known that:
āž” Independent recovery protocols exist only for single site failures; no
independent recovery protocol exists which is resilient to multiple-site
failures.
ā€¢ So we search for these protocols ā€“ 3PC
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/47
Three-Phase Commit
ā€¢ 3PC is non-blocking.
ā€¢ A commit protocols is non-blocking iff
āž” it is synchronous within one state transition, and
āž” its state transition diagram contains
āœ¦ no state which is ā€œadjacentā€ to both a commit and an abort state, and
āœ¦ no non-committable state which is ā€œadjacentā€ to a commit state
ā€¢ Adjacent: possible to go from one stat to another with a single state
transition
ā€¢ Committable: all sites have voted to commit a transaction
āž” e.g.: COMMIT state
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/48
State Transitions in 3PC
INITIAL
WAIT
Commit command
Prepare
Vote-commit
Prepare-to-commit
Coordinator
Vote-abort
Global-abort
ABORT
COMMIT
PRE-
COMMIT
Ready-to-commit
Global commit
INITIAL
READY
Prepare
Vote-commit
Prepared-to-commit
Ready-to-commit
Prepare
Vote-abort
Global-abort
Ack
Participants
COMMIT
ABORT
PRE-
COMMIT
Global commit
Ack
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/49
Communication Structure
C
P
P
P
P
C
P
P
P
P
C
ready? yes/no
pre-commit/
pre-abort? commit/abort
Phase 1 Phase 2
P
P
P
P
C
yes/no ack
Phase 3
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/50
Site Failures ā€“ 3PC Termination
ā€¢ Timeout in INITIAL
āž” Who cares
ā€¢ Timeout in WAIT
āž” Unilaterally abort
ā€¢ Timeout in PRECOMMIT
āž” Participants may not be in PRE-
COMMIT, but at least in READY
āž” Move all the participants to
PRECOMMIT state
āž” Terminate by globally committing
INITIAL
WAIT
Commit command
Prepare
Vote-commit
Prepare-to-commit
Coordinator
Vote-abort
Global-abort
ABORT
COMMIT
PRE-
COMMIT
Ready-to-commit
Global commit
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/51
Site Failures ā€“ 3PC Termination
ā€¢ Timeout in ABORT or COMMIT
āž” Just ignore and treat the transaction
as completed
āž” participants are either in
PRECOMMIT or READY state and
can follow their termination
protocols
INITIAL
WAIT
Commit command
Prepare
Vote-commit
Prepare-to-commit
Coordinator
Vote-abort
Global-abort
ABORT
COMMIT
PRE-
COMMIT
Ready-to-commit
Global commit
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/52
Site Failures ā€“ 3PC Termination
ā€¢ Timeout in INITIAL
āž” Coordinator must have failed in
INITIAL state
āž” Unilaterally abort
ā€¢ Timeout in READY
āž” Voted to commit, but does not
know the coordinator's decision
āž” Elect a new coordinator and
terminate using a special protocol
ā€¢ Timeout in PRECOMMIT
āž” Handle it the same as timeout in
READY state
INITIAL
READY
Prepare
Vote-commit
Prepared-to-commit
Ready-to-commit
Prepare
Vote-abort
Global-abort
Ack
Participants
COMMIT
ABORT
PRE-
COMMIT
Global commit
Ack
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/53
Termination Protocol Upon
Coordinator Election
New coordinator can be in one of four states: WAIT, PRECOMMIT,
COMMIT, ABORT
ļ‚Œ Coordinator sends its state to all of the participants asking them to assume its
state.
ļ‚ Participants ā€œback-upā€ and reply with appriate messages, except those in
ABORT and COMMIT states. Those in these states respond with ā€œAckā€ but
stay in their states.
ļ‚Ž Coordinator guides the participants towards termination:
āœ¦ If the new coordinator is in the WAIT state, participants can be in INITIAL,
READY, ABORT or PRECOMMIT states. New coordinator globally aborts the
transaction.
āœ¦ If the new coordinator is in the PRECOMMIT state, the participants can be in
READY, PRECOMMIT or COMMIT states. The new coordinator will globally
commit the transaction.
āœ¦ If the new coordinator is in the ABORT or COMMIT states, at the end of the first
phase, the participants will have moved to that state as well.
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/54
Site Failures ā€“ 3PC Recovery
ā€¢ Failure in INITIAL
āž” start commit process upon recovery
ā€¢ Failure in WAIT
āž” the participants may have elected a
new coordinator and terminated the
transaction
āž” the new coordinator could be in WAIT
or ABORT states transaction
aborted
āž” ask around for the fate of the
transaction
ā€¢ Failure in PRECOMMIT
āž” ask around for the fate of the
transaction
INITIAL
WAIT
Commit command
Prepare
Vote-commit
Prepare-to-commit
Coordinator
Vote-abort
Global-abort
ABORT
COMMIT
PRE-
COMMIT
Ready-to-commit
Global commit
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/55
Site Failures ā€“ 3PC Recovery
ā€¢ Failure in COMMIT or ABORT
āž” Nothing special if all the
acknowledgements have been
received; otherwise the termination
protocol is involved
INITIAL
WAIT
Commit command
Prepare
Vote-commit
Prepare-to-commit
Coordinator
Vote-abort
Global-abort
ABORT
COMMIT
PRE-
COMMIT
Ready-to-commit
Global commit
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/56
Site Failures ā€“ 3PC Recovery
ā€¢ Failure in INITIAL
āž” unilaterally abort upon recovery
ā€¢ Failure in READY
āž” the coordinator has been informed
about the local decision
āž” upon recovery, ask around
ā€¢ Failure in PRECOMMIT
āž” ask around to determine how the
other participants have terminated
the transaction
ā€¢ Failure in COMMIT or ABORT
āž” no need to do anything
INITIAL
READY
Prepare
Vote-commit
Prepared-to-commit
Ready-to-commit
Prepare
Vote-abort
Global-abort
Ack
Participants
COMMIT
ABORT
PRE-
COMMIT
Global commit
Ack
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/57
Network Partitioning
ā€¢ Simple partitioning
āž” Only two partitions
ā€¢ Multiple partitioning
āž” More than two partitions
ā€¢ Formal bounds:
āž” There exists no non-blocking protocol that is resilient to a network partition if
messages are lost when partition occurs.
āž” There exist non-blocking protocols which are resilient to a single network
partition if all undeliverable messages are returned to sender.
āž” There exists no non-blocking protocol which is resilient to a multiple
partition.
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/58
Independent Recovery Protocols
for Network Partitioning
ā€¢ No general solution possible
āž” allow one group to terminate while the other is blocked
āž” improve availability
ā€¢ How to determine which group to proceed?
āž” The group with a majority
ā€¢ How does a group know if it has majority?
āž” Centralized
āœ¦ Whichever partitions contains the central site should terminate the transaction
āž” Voting-based (quorum)
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/59
Quorum Protocols
ā€¢ The network partitioning problem is handled by the commit protocol.
ā€¢ Every site is assigned a vote Vi.
ā€¢ Total number of votes in the system V
ā€¢ Abort quorum Va, commit quorum Vc
āž” Va + Vc > V where 0 ā‰¤ Va , Vc ā‰¤ V
āž” Before a transaction commits, it must obtain a commit quorum Vc
āž” Before a transaction aborts, it must obtain an abort quorum Va
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/60
State Transitions in Quorum
Protocols
INITIAL
WAIT
Commit command
Prepare
Vote-commit
Prepare-to-commit
Coordinator
Vote-abort
Prepare-to-abort
ABORT COMMIT
PRE-
COMMIT
Ready-to-commit
Global commit
INITIAL
READY
Prepare
Vote-commit
Prepare-to-commit
Ready-to-commit
Prepare
Vote-abort
Global-abort
Ack
Participants
COMMITABORT
PRE-
COMMIT
Global commit
Ack
PRE-
ABORT
Prepared-to-abortt
Ready-to-abort
PRE-
ABORT
Ready-to-abort
Global-abort
Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/61
Use for Network Partitioning
ā€¢ Before commit (i.e., moving from PRECOMMIT to COMMIT), coordinator
receives commit quorum from participants. One partition may have the
commit quorum.
ā€¢ Assumes that failures are ā€œcleanā€ which means:
āž” failures that change the network's topology are detected by all sites
instantaneously
āž” each site has a view of the network consisting of all the sites it can
communicate with

Mais conteĆŗdo relacionado

Mais procurados

DISTRIBUTED DATABASE WITH RECOVERY TECHNIQUES
DISTRIBUTED DATABASE WITH RECOVERY TECHNIQUESDISTRIBUTED DATABASE WITH RECOVERY TECHNIQUES
DISTRIBUTED DATABASE WITH RECOVERY TECHNIQUESAAKANKSHA JAIN
Ā 
Concurrency Control in Distributed Database.
Concurrency Control in Distributed Database.Concurrency Control in Distributed Database.
Concurrency Control in Distributed Database.Meghaj Mallick
Ā 
Optimistic concurrency control in Distributed Systems
Optimistic concurrency control in Distributed SystemsOptimistic concurrency control in Distributed Systems
Optimistic concurrency control in Distributed Systemsmridul mishra
Ā 
Query processing and optimization (updated)
Query processing and optimization (updated)Query processing and optimization (updated)
Query processing and optimization (updated)Ravinder Kamboj
Ā 
Distributed database management system
Distributed database management  systemDistributed database management  system
Distributed database management systemPooja Dixit
Ā 
Distributed data processing
Distributed data processingDistributed data processing
Distributed data processingAyisha Kowsar
Ā 
Query Decomposition and data localization
Query Decomposition and data localization Query Decomposition and data localization
Query Decomposition and data localization Hafiz faiz
Ā 
8. mutual exclusion in Distributed Operating Systems
8. mutual exclusion in Distributed Operating Systems8. mutual exclusion in Distributed Operating Systems
8. mutual exclusion in Distributed Operating SystemsDr Sandeep Kumar Poonia
Ā 
Concurrency control
Concurrency  controlConcurrency  control
Concurrency controlJaved Khan
Ā 
Concurrency control
Concurrency controlConcurrency control
Concurrency controlSubhasish Pati
Ā 
Concurrency Control in Database Management System
Concurrency Control in Database Management SystemConcurrency Control in Database Management System
Concurrency Control in Database Management SystemJanki Shah
Ā 
Transactions and Concurrency Control
Transactions and Concurrency ControlTransactions and Concurrency Control
Transactions and Concurrency ControlDilum Bandara
Ā 
Distributed Systems
Distributed SystemsDistributed Systems
Distributed SystemsRupsee
Ā 
Database, 3 Distribution Design
Database, 3 Distribution DesignDatabase, 3 Distribution Design
Database, 3 Distribution DesignAli Usman
Ā 
File models and file accessing models
File models and file accessing modelsFile models and file accessing models
File models and file accessing modelsishmecse13
Ā 
distributed shared memory
 distributed shared memory distributed shared memory
distributed shared memoryAshish Kumar
Ā 

Mais procurados (20)

DISTRIBUTED DATABASE WITH RECOVERY TECHNIQUES
DISTRIBUTED DATABASE WITH RECOVERY TECHNIQUESDISTRIBUTED DATABASE WITH RECOVERY TECHNIQUES
DISTRIBUTED DATABASE WITH RECOVERY TECHNIQUES
Ā 
Concurrency Control in Distributed Database.
Concurrency Control in Distributed Database.Concurrency Control in Distributed Database.
Concurrency Control in Distributed Database.
Ā 
Optimistic concurrency control in Distributed Systems
Optimistic concurrency control in Distributed SystemsOptimistic concurrency control in Distributed Systems
Optimistic concurrency control in Distributed Systems
Ā 
Query processing and optimization (updated)
Query processing and optimization (updated)Query processing and optimization (updated)
Query processing and optimization (updated)
Ā 
Distributed database management system
Distributed database management  systemDistributed database management  system
Distributed database management system
Ā 
Distributed data processing
Distributed data processingDistributed data processing
Distributed data processing
Ā 
Query Decomposition and data localization
Query Decomposition and data localization Query Decomposition and data localization
Query Decomposition and data localization
Ā 
8. mutual exclusion in Distributed Operating Systems
8. mutual exclusion in Distributed Operating Systems8. mutual exclusion in Distributed Operating Systems
8. mutual exclusion in Distributed Operating Systems
Ā 
Distributed DBMS - Unit 5 - Semantic Data Control
Distributed DBMS - Unit 5 - Semantic Data ControlDistributed DBMS - Unit 5 - Semantic Data Control
Distributed DBMS - Unit 5 - Semantic Data Control
Ā 
Distributed DBMS - Unit 3 - Distributed DBMS Architecture
Distributed DBMS - Unit 3 - Distributed DBMS ArchitectureDistributed DBMS - Unit 3 - Distributed DBMS Architecture
Distributed DBMS - Unit 3 - Distributed DBMS Architecture
Ā 
Concurrency control
Concurrency  controlConcurrency  control
Concurrency control
Ā 
Concurrency control
Concurrency controlConcurrency control
Concurrency control
Ā 
Concurrency Control in Database Management System
Concurrency Control in Database Management SystemConcurrency Control in Database Management System
Concurrency Control in Database Management System
Ā 
Transactions and Concurrency Control
Transactions and Concurrency ControlTransactions and Concurrency Control
Transactions and Concurrency Control
Ā 
Distributed Systems
Distributed SystemsDistributed Systems
Distributed Systems
Ā 
Lec 7 query processing
Lec 7 query processingLec 7 query processing
Lec 7 query processing
Ā 
Database, 3 Distribution Design
Database, 3 Distribution DesignDatabase, 3 Distribution Design
Database, 3 Distribution Design
Ā 
File models and file accessing models
File models and file accessing modelsFile models and file accessing models
File models and file accessing models
Ā 
distributed shared memory
 distributed shared memory distributed shared memory
distributed shared memory
Ā 
DDBMS Paper with Solution
DDBMS Paper with SolutionDDBMS Paper with Solution
DDBMS Paper with Solution
Ā 

Destaque

Optimistic Algorithm and Concurrency Control Algorithm
Optimistic Algorithm and Concurrency Control AlgorithmOptimistic Algorithm and Concurrency Control Algorithm
Optimistic Algorithm and Concurrency Control AlgorithmShounak Katyayan
Ā 
Database , 5 Semantic
Database , 5 SemanticDatabase , 5 Semantic
Database , 5 SemanticAli Usman
Ā 
PL/pgSQL - An Introduction on Using Imperative Programming in PostgreSQL
PL/pgSQL - An Introduction on Using Imperative Programming in PostgreSQLPL/pgSQL - An Introduction on Using Imperative Programming in PostgreSQL
PL/pgSQL - An Introduction on Using Imperative Programming in PostgreSQLReactive.IO
Ā 
MySQL InnoDB ęŗē å®žēŽ°åˆ†ęž(äø€)
MySQL InnoDB ęŗē å®žēŽ°åˆ†ęž(äø€)MySQL InnoDB ęŗē å®žēŽ°åˆ†ęž(äø€)
MySQL InnoDB ęŗē å®žēŽ°åˆ†ęž(äø€)frogd
Ā 
Database ,16 P2P
Database ,16 P2P Database ,16 P2P
Database ,16 P2P Ali Usman
Ā 
Database ,10 Transactions
Database ,10 TransactionsDatabase ,10 Transactions
Database ,10 TransactionsAli Usman
Ā 
Postgres MVCC - A Developer Centric View of Multi Version Concurrency Control
Postgres MVCC - A Developer Centric View of Multi Version Concurrency ControlPostgres MVCC - A Developer Centric View of Multi Version Concurrency Control
Postgres MVCC - A Developer Centric View of Multi Version Concurrency ControlReactive.IO
Ā 
Database ,14 Parallel DBMS
Database ,14 Parallel DBMSDatabase ,14 Parallel DBMS
Database ,14 Parallel DBMSAli Usman
Ā 
Dama - Protecting Sensitive Data on a Database
Dama - Protecting Sensitive Data on a DatabaseDama - Protecting Sensitive Data on a Database
Dama - Protecting Sensitive Data on a Databasejohanswart1234
Ā 
InnoDB Internal
InnoDB InternalInnoDB Internal
InnoDB Internalmysqlops
Ā 
Database , 13 Replication
Database , 13 ReplicationDatabase , 13 Replication
Database , 13 ReplicationAli Usman
Ā 
Database ,7 query localization
Database ,7 query localizationDatabase ,7 query localization
Database ,7 query localizationAli Usman
Ā 
Buffer management --database buffering
Buffer management --database buffering Buffer management --database buffering
Buffer management --database buffering julia121214
Ā 
Database ,11 Concurrency Control
Database ,11 Concurrency ControlDatabase ,11 Concurrency Control
Database ,11 Concurrency ControlAli Usman
Ā 
Oracle rac资ęŗē®”ē†ē®—ę³•äøŽcache fusion实ēŽ°ęµ…Ꞑ
Oracle rac资ęŗē®”ē†ē®—ę³•äøŽcache fusion实ēŽ°ęµ…ꞐOracle rac资ęŗē®”ē†ē®—ę³•äøŽcache fusion实ēŽ°ęµ…Ꞑ
Oracle rac资ęŗē®”ē†ē®—ę³•äøŽcache fusion实ēŽ°ęµ…Ꞑfrogd
Ā 
Discrete Structures lecture 2
 Discrete Structures lecture 2 Discrete Structures lecture 2
Discrete Structures lecture 2Ali Usman
Ā 
Database , 15 Object DBMS
Database , 15 Object DBMSDatabase , 15 Object DBMS
Database , 15 Object DBMSAli Usman
Ā 
Database , 1 Introduction
 Database , 1 Introduction Database , 1 Introduction
Database , 1 IntroductionAli Usman
Ā 

Destaque (20)

Would You Consider Abortion in These Circumstances?
Would You Consider Abortion in These Circumstances?Would You Consider Abortion in These Circumstances?
Would You Consider Abortion in These Circumstances?
Ā 
Buffer manager
Buffer managerBuffer manager
Buffer manager
Ā 
Optimistic Algorithm and Concurrency Control Algorithm
Optimistic Algorithm and Concurrency Control AlgorithmOptimistic Algorithm and Concurrency Control Algorithm
Optimistic Algorithm and Concurrency Control Algorithm
Ā 
Database , 5 Semantic
Database , 5 SemanticDatabase , 5 Semantic
Database , 5 Semantic
Ā 
PL/pgSQL - An Introduction on Using Imperative Programming in PostgreSQL
PL/pgSQL - An Introduction on Using Imperative Programming in PostgreSQLPL/pgSQL - An Introduction on Using Imperative Programming in PostgreSQL
PL/pgSQL - An Introduction on Using Imperative Programming in PostgreSQL
Ā 
MySQL InnoDB ęŗē å®žēŽ°åˆ†ęž(äø€)
MySQL InnoDB ęŗē å®žēŽ°åˆ†ęž(äø€)MySQL InnoDB ęŗē å®žēŽ°åˆ†ęž(äø€)
MySQL InnoDB ęŗē å®žēŽ°åˆ†ęž(äø€)
Ā 
Database ,16 P2P
Database ,16 P2P Database ,16 P2P
Database ,16 P2P
Ā 
Database ,10 Transactions
Database ,10 TransactionsDatabase ,10 Transactions
Database ,10 Transactions
Ā 
Postgres MVCC - A Developer Centric View of Multi Version Concurrency Control
Postgres MVCC - A Developer Centric View of Multi Version Concurrency ControlPostgres MVCC - A Developer Centric View of Multi Version Concurrency Control
Postgres MVCC - A Developer Centric View of Multi Version Concurrency Control
Ā 
Database ,14 Parallel DBMS
Database ,14 Parallel DBMSDatabase ,14 Parallel DBMS
Database ,14 Parallel DBMS
Ā 
Dama - Protecting Sensitive Data on a Database
Dama - Protecting Sensitive Data on a DatabaseDama - Protecting Sensitive Data on a Database
Dama - Protecting Sensitive Data on a Database
Ā 
InnoDB Internal
InnoDB InternalInnoDB Internal
InnoDB Internal
Ā 
Database , 13 Replication
Database , 13 ReplicationDatabase , 13 Replication
Database , 13 Replication
Ā 
Database ,7 query localization
Database ,7 query localizationDatabase ,7 query localization
Database ,7 query localization
Ā 
Buffer management --database buffering
Buffer management --database buffering Buffer management --database buffering
Buffer management --database buffering
Ā 
Database ,11 Concurrency Control
Database ,11 Concurrency ControlDatabase ,11 Concurrency Control
Database ,11 Concurrency Control
Ā 
Oracle rac资ęŗē®”ē†ē®—ę³•äøŽcache fusion实ēŽ°ęµ…Ꞑ
Oracle rac资ęŗē®”ē†ē®—ę³•äøŽcache fusion实ēŽ°ęµ…ꞐOracle rac资ęŗē®”ē†ē®—ę³•äøŽcache fusion实ēŽ°ęµ…Ꞑ
Oracle rac资ęŗē®”ē†ē®—ę³•äøŽcache fusion实ēŽ°ęµ…Ꞑ
Ā 
Discrete Structures lecture 2
 Discrete Structures lecture 2 Discrete Structures lecture 2
Discrete Structures lecture 2
Ā 
Database , 15 Object DBMS
Database , 15 Object DBMSDatabase , 15 Object DBMS
Database , 15 Object DBMS
Ā 
Database , 1 Introduction
 Database , 1 Introduction Database , 1 Introduction
Database , 1 Introduction
Ā 

Semelhante a Database , 12 Reliability

Database ,18 Current Issues
Database ,18 Current IssuesDatabase ,18 Current Issues
Database ,18 Current IssuesAli Usman
Ā 
1 introduction
1 introduction1 introduction
1 introductionAmrit Kaur
Ā 
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017Big Data Spain
Ā 
DBMS Vardhaman.pdf
DBMS Vardhaman.pdfDBMS Vardhaman.pdf
DBMS Vardhaman.pdfNithishReddy90
Ā 
What is Database Backup? The 3 Important Recovery Techniques from transaction...
What is Database Backup? The 3 Important Recovery Techniques from transaction...What is Database Backup? The 3 Important Recovery Techniques from transaction...
What is Database Backup? The 3 Important Recovery Techniques from transaction...Raj vardhan
Ā 
2007-05-23 Cecchet_PGCon2007.ppt
2007-05-23 Cecchet_PGCon2007.ppt2007-05-23 Cecchet_PGCon2007.ppt
2007-05-23 Cecchet_PGCon2007.pptnadirpervez2
Ā 
1 introduction DDBS
1 introduction DDBS1 introduction DDBS
1 introduction DDBSnaimanighat
Ā 
Sql server performance tuning
Sql server performance tuningSql server performance tuning
Sql server performance tuningngupt28
Ā 
database recovery techniques
database recovery techniques database recovery techniques
database recovery techniques Kalhan Liyanage
Ā 
Unit no 5 transation processing DMS 22319
Unit no 5 transation processing DMS 22319Unit no 5 transation processing DMS 22319
Unit no 5 transation processing DMS 22319ARVIND SARDAR
Ā 
1 introduction ddbms
1 introduction ddbms1 introduction ddbms
1 introduction ddbmsamna izzat
Ā 
Software architecture case study - why and why not sql server replication
Software architecture   case study - why and why not sql server replicationSoftware architecture   case study - why and why not sql server replication
Software architecture case study - why and why not sql server replicationShahzad
Ā 
Distributed DBMS - Unit 9 - Distributed Deadlock & Recovery
Distributed DBMS - Unit 9 - Distributed Deadlock & RecoveryDistributed DBMS - Unit 9 - Distributed Deadlock & Recovery
Distributed DBMS - Unit 9 - Distributed Deadlock & RecoveryGyanmanjari Institute Of Technology
Ā 
database backup and recovery
database backup and recoverydatabase backup and recovery
database backup and recoverysdrhr
Ā 
KScope14 Oracle EPM Troubleshooting
KScope14 Oracle EPM TroubleshootingKScope14 Oracle EPM Troubleshooting
KScope14 Oracle EPM TroubleshootingAlithya
Ā 
Maximizing Business Continuity and Minimizing Recovery Time Objectives in Win...
Maximizing Business Continuity and Minimizing Recovery Time Objectives in Win...Maximizing Business Continuity and Minimizing Recovery Time Objectives in Win...
Maximizing Business Continuity and Minimizing Recovery Time Objectives in Win...Cloudnition Web Development & Online Marketing
Ā 

Semelhante a Database , 12 Reliability (20)

DBMS unit-5.pdf
DBMS unit-5.pdfDBMS unit-5.pdf
DBMS unit-5.pdf
Ā 
Database ,18 Current Issues
Database ,18 Current IssuesDatabase ,18 Current Issues
Database ,18 Current Issues
Ā 
1 introduction
1 introduction1 introduction
1 introduction
Ā 
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017
Ā 
DBMS Vardhaman.pdf
DBMS Vardhaman.pdfDBMS Vardhaman.pdf
DBMS Vardhaman.pdf
Ā 
What is Database Backup? The 3 Important Recovery Techniques from transaction...
What is Database Backup? The 3 Important Recovery Techniques from transaction...What is Database Backup? The 3 Important Recovery Techniques from transaction...
What is Database Backup? The 3 Important Recovery Techniques from transaction...
Ā 
2007-05-23 Cecchet_PGCon2007.ppt
2007-05-23 Cecchet_PGCon2007.ppt2007-05-23 Cecchet_PGCon2007.ppt
2007-05-23 Cecchet_PGCon2007.ppt
Ā 
1 introduction DDBS
1 introduction DDBS1 introduction DDBS
1 introduction DDBS
Ā 
Sql server performance tuning
Sql server performance tuningSql server performance tuning
Sql server performance tuning
Ā 
database recovery techniques
database recovery techniques database recovery techniques
database recovery techniques
Ā 
Unit no 5 transation processing DMS 22319
Unit no 5 transation processing DMS 22319Unit no 5 transation processing DMS 22319
Unit no 5 transation processing DMS 22319
Ā 
1 introduction ddbms
1 introduction ddbms1 introduction ddbms
1 introduction ddbms
Ā 
Software architecture case study - why and why not sql server replication
Software architecture   case study - why and why not sql server replicationSoftware architecture   case study - why and why not sql server replication
Software architecture case study - why and why not sql server replication
Ā 
Distributed DBMS - Unit 9 - Distributed Deadlock & Recovery
Distributed DBMS - Unit 9 - Distributed Deadlock & RecoveryDistributed DBMS - Unit 9 - Distributed Deadlock & Recovery
Distributed DBMS - Unit 9 - Distributed Deadlock & Recovery
Ā 
database backup and recovery
database backup and recoverydatabase backup and recovery
database backup and recovery
Ā 
2 recovery
2 recovery2 recovery
2 recovery
Ā 
EQUNIX - PPT 11DB-Postgresā„¢.pdf
EQUNIX - PPT 11DB-Postgresā„¢.pdfEQUNIX - PPT 11DB-Postgresā„¢.pdf
EQUNIX - PPT 11DB-Postgresā„¢.pdf
Ā 
KScope14 Oracle EPM Troubleshooting
KScope14 Oracle EPM TroubleshootingKScope14 Oracle EPM Troubleshooting
KScope14 Oracle EPM Troubleshooting
Ā 
Recovery techniques
Recovery techniquesRecovery techniques
Recovery techniques
Ā 
Maximizing Business Continuity and Minimizing Recovery Time Objectives in Win...
Maximizing Business Continuity and Minimizing Recovery Time Objectives in Win...Maximizing Business Continuity and Minimizing Recovery Time Objectives in Win...
Maximizing Business Continuity and Minimizing Recovery Time Objectives in Win...
Ā 

Mais de Ali Usman

Cisco Packet Tracer Overview
Cisco Packet Tracer OverviewCisco Packet Tracer Overview
Cisco Packet Tracer OverviewAli Usman
Ā 
Islamic Arts and Architecture
Islamic Arts and  ArchitectureIslamic Arts and  Architecture
Islamic Arts and ArchitectureAli Usman
Ā 
Database , 17 Web
Database , 17 WebDatabase , 17 Web
Database , 17 WebAli Usman
Ā 
Database , 6 Query Introduction
Database , 6 Query Introduction Database , 6 Query Introduction
Database , 6 Query Introduction Ali Usman
Ā 
Database , 4 Data Integration
Database , 4 Data IntegrationDatabase , 4 Data Integration
Database , 4 Data IntegrationAli Usman
Ā 
Database ,2 Background
 Database ,2 Background Database ,2 Background
Database ,2 BackgroundAli Usman
Ā 
Processor Specifications
Processor SpecificationsProcessor Specifications
Processor SpecificationsAli Usman
Ā 
Fifty Year Of Microprocessor
Fifty Year Of MicroprocessorFifty Year Of Microprocessor
Fifty Year Of MicroprocessorAli Usman
Ā 
Discrete Structures. Lecture 1
 Discrete Structures. Lecture 1  Discrete Structures. Lecture 1
Discrete Structures. Lecture 1 Ali Usman
Ā 
Muslim Contributions in Medicine-Geography-Astronomy
Muslim Contributions in Medicine-Geography-AstronomyMuslim Contributions in Medicine-Geography-Astronomy
Muslim Contributions in Medicine-Geography-AstronomyAli Usman
Ā 
Muslim Contributions in Geography
Muslim Contributions in GeographyMuslim Contributions in Geography
Muslim Contributions in GeographyAli Usman
Ā 
Muslim Contributions in Astronomy
Muslim Contributions in AstronomyMuslim Contributions in Astronomy
Muslim Contributions in AstronomyAli Usman
Ā 
Processor Specifications
Processor SpecificationsProcessor Specifications
Processor SpecificationsAli Usman
Ā 
Ptcl modem (user manual)
Ptcl modem (user manual)Ptcl modem (user manual)
Ptcl modem (user manual)Ali Usman
Ā 
Nimat-ul-ALLAH shah wali
Nimat-ul-ALLAH shah wali Nimat-ul-ALLAH shah wali
Nimat-ul-ALLAH shah wali Ali Usman
Ā 
Muslim Contributions in Mathematics
Muslim Contributions in MathematicsMuslim Contributions in Mathematics
Muslim Contributions in MathematicsAli Usman
Ā 
Osi protocols
Osi protocolsOsi protocols
Osi protocolsAli Usman
Ā 

Mais de Ali Usman (17)

Cisco Packet Tracer Overview
Cisco Packet Tracer OverviewCisco Packet Tracer Overview
Cisco Packet Tracer Overview
Ā 
Islamic Arts and Architecture
Islamic Arts and  ArchitectureIslamic Arts and  Architecture
Islamic Arts and Architecture
Ā 
Database , 17 Web
Database , 17 WebDatabase , 17 Web
Database , 17 Web
Ā 
Database , 6 Query Introduction
Database , 6 Query Introduction Database , 6 Query Introduction
Database , 6 Query Introduction
Ā 
Database , 4 Data Integration
Database , 4 Data IntegrationDatabase , 4 Data Integration
Database , 4 Data Integration
Ā 
Database ,2 Background
 Database ,2 Background Database ,2 Background
Database ,2 Background
Ā 
Processor Specifications
Processor SpecificationsProcessor Specifications
Processor Specifications
Ā 
Fifty Year Of Microprocessor
Fifty Year Of MicroprocessorFifty Year Of Microprocessor
Fifty Year Of Microprocessor
Ā 
Discrete Structures. Lecture 1
 Discrete Structures. Lecture 1  Discrete Structures. Lecture 1
Discrete Structures. Lecture 1
Ā 
Muslim Contributions in Medicine-Geography-Astronomy
Muslim Contributions in Medicine-Geography-AstronomyMuslim Contributions in Medicine-Geography-Astronomy
Muslim Contributions in Medicine-Geography-Astronomy
Ā 
Muslim Contributions in Geography
Muslim Contributions in GeographyMuslim Contributions in Geography
Muslim Contributions in Geography
Ā 
Muslim Contributions in Astronomy
Muslim Contributions in AstronomyMuslim Contributions in Astronomy
Muslim Contributions in Astronomy
Ā 
Processor Specifications
Processor SpecificationsProcessor Specifications
Processor Specifications
Ā 
Ptcl modem (user manual)
Ptcl modem (user manual)Ptcl modem (user manual)
Ptcl modem (user manual)
Ā 
Nimat-ul-ALLAH shah wali
Nimat-ul-ALLAH shah wali Nimat-ul-ALLAH shah wali
Nimat-ul-ALLAH shah wali
Ā 
Muslim Contributions in Mathematics
Muslim Contributions in MathematicsMuslim Contributions in Mathematics
Muslim Contributions in Mathematics
Ā 
Osi protocols
Osi protocolsOsi protocols
Osi protocols
Ā 

ƚltimo

A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
Ā 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
Ā 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
Ā 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
Ā 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
Ā 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
Ā 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
Ā 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
Ā 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
Ā 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
Ā 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
Ā 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
Ā 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
Ā 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
Ā 
šŸ¬ The future of MySQL is Postgres šŸ˜
šŸ¬  The future of MySQL is Postgres   šŸ˜šŸ¬  The future of MySQL is Postgres   šŸ˜
šŸ¬ The future of MySQL is Postgres šŸ˜RTylerCroy
Ā 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
Ā 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
Ā 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
Ā 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
Ā 
Scaling API-first ā€“ The story of a global engineering organization
Scaling API-first ā€“ The story of a global engineering organizationScaling API-first ā€“ The story of a global engineering organization
Scaling API-first ā€“ The story of a global engineering organizationRadu Cotescu
Ā 

ƚltimo (20)

A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
Ā 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
Ā 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
Ā 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
Ā 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Ā 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Ā 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
Ā 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
Ā 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Ā 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Ā 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
Ā 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Ā 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
Ā 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
Ā 
šŸ¬ The future of MySQL is Postgres šŸ˜
šŸ¬  The future of MySQL is Postgres   šŸ˜šŸ¬  The future of MySQL is Postgres   šŸ˜
šŸ¬ The future of MySQL is Postgres šŸ˜
Ā 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Ā 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Ā 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
Ā 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
Ā 
Scaling API-first ā€“ The story of a global engineering organization
Scaling API-first ā€“ The story of a global engineering organizationScaling API-first ā€“ The story of a global engineering organization
Scaling API-first ā€“ The story of a global engineering organization
Ā 

Database , 12 Reliability

  • 1. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/1 Outline ā€¢ Introduction ā€¢ Background ā€¢ Distributed Database Design ā€¢ Database Integration ā€¢ Semantic Data Control ā€¢ Distributed Query Processing ā€¢ Distributed Transaction Management āž” Transaction Concepts and Models āž” Distributed Concurrency Control āž” Distributed Reliability ā€¢ Data Replication ā€¢ Parallel Database Systems ā€¢ Distributed Object DBMS ā€¢ Peer-to-Peer Data Management ā€¢ Web Data Management ā€¢ Current Issues
  • 2. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/2 Reliability Problem: How to maintain atomicity durability properties of transactions Ch.10/2
  • 3. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/3 Fundamental Definitions ā€¢ Reliability āž” A measure of success with which a system conforms to some authoritative specification of its behavior. āž” Probability that the system has not experienced any failures within a given time period. āž” Typically used to describe systems that cannot be repaired or where the continuous operation of the system is critical. ā€¢ Availability āž” The fraction of the time that a system meets its specification. āž” The probability that the system is operational at a given time t.
  • 4. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/4 Fundamental Definitions ā€¢ Failure āž” The deviation of a system from the behavior that is described in its specification. ā€¢ Erroneous state āž” The internal state of a system such that there exist circumstances in which further processing, by the normal algorithms of the system, will lead to a failure which is not attributed to a subsequent fault. ā€¢ Error āž” The part of the state which is incorrect. ā€¢ Fault āž” An error in the internal states of the components of a system or in the design of a system.
  • 5. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/5 Faults to Failures Fault Error Failure causes results in
  • 6. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/6 Types of Faults ā€¢ Hard faults āž” Permanent āž” Resulting failures are called hard failures ā€¢ Soft faults āž” Transient or intermittent āž” Account for more than 90% of all failures āž” Resulting failures are called soft failures
  • 7. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/7 Fault Classification Permanent fault Incorrect design Unstable environment Operator mistake Transient error System Failure Unstable or marginal components Intermittent error Permanent error
  • 8. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/8 Failures Fault occurs Error caused Detection of error Repair Fault occurs Error caused MTBF MTTRMTTD Multiple errors can occur during this period Time
  • 9. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/9 Fault Tolerance Measures Reliability R(t) = Pr{0 failures in time [0,t] | no failures at t=0} If occurrence of failures is Poisson R(t) = Pr{0 failures in time [0,t]} Then where z(x) is known as the hazard function which gives the time-dependent failure rate of the component k! Pr(k failures in time [0,t] = e-m(t)[m(t)]k
  • 10. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/10 Fault-Tolerance Measures Reliability The mean number of failures in time [0, t] can be computed as and the variance can be be computed as Var[k] = E[k2] - (E[k])2 = m(t) Thus, reliability of a single component is R(t) = e-m(t) and of a system consisting of n non-redundant components as E [k] = k =0 āˆž k k! e-m(t )[m(t )]k = m(t ) Rsys(t) = i =1 n Ri(t)
  • 11. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/11 Fault-Tolerance Measures Availability A(t) = Pr{system is operational at time t} Assume āœ¦ Poisson failures with rate āœ¦ Repair time is exponentially distributed with mean 1/Ī¼ Then, steady-state availability
  • 12. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/12 Fault-Tolerance Measures MTBF Mean time between failures MTTR Mean time to repair Availability MTBF MTBF + MTTR
  • 13. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/13 Types of Failures ā€¢ Transaction failures āž” Transaction aborts (unilaterally or due to deadlock) āž” Avg. 3% of transactions abort abnormally ā€¢ System (site) failures āž” Failure of processor, main memory, power supply, ā€¦ āž” Main memory contents are lost, but secondary storage contents are safe āž” Partial vs. total failure ā€¢ Media failures āž” Failure of secondary storage devices such that the stored data is lost āž” Head crash/controller failure (?) ā€¢ Communication failures āž” Lost/undeliverable messages āž” Network partitioning
  • 14. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/14 Local Recovery Management ā€“ Architecture ā€¢ Volatile storage āž” Consists of the main memory of the computer system (RAM). ā€¢ Stable storage āž” Resilient to failures and loses its contents only in the presence of media failures (e.g., head crashes on disks). āž” Implemented via a combination of hardware (non-volatile storage) and software (stable-write, stable-read, clean-up) components. Secondary storage Stable database Read Write Write Read Main memoryLocal Recovery Manager Database Buffer Manager Fetch, Flush Database buffers (Volatile database)
  • 15. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/15 Update Strategies ā€¢ In-place update āž” Each update causes a change in one or more data values on pages in the database buffers ā€¢ Out-of-place update āž” Each update causes the new value(s) of data item(s) to be stored separate from the old value(s)
  • 16. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/16 In-Place Update Recovery Information Database Log Every action of a transaction must not only perform the action, but must also write a log record to an append-only file. New stable database state Database Log Update Operation Old stable database state
  • 17. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/17 Logging The log contains information used by the recovery process to restore the consistency of a system. This information may include āž” transaction identifier āž” type of operation (action) āž” items accessed by the transaction to perform the action āž” old value (state) of item (before image) āž” new value (state) of item (after image) ā€¦
  • 18. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/18 Why Logging? Upon recovery: āž” all of T1's effects should be reflected in the database (REDO if necessary due to a failure) āž” none of T2's effects should be reflected in the database (UNDO if necessary) 0 t time system crash T1Begin End Begin T2
  • 19. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/19 REDO Protocol ā€¢ REDO'ing an action means performing it again. ā€¢ The REDO operation uses the log information and performs the action that might have been done before, or not done due to failures. ā€¢ The REDO operation generates the new image. Database Log REDO Old stable database state New stable database state
  • 20. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/20 UNDO Protocol ā€¢ UNDO'ing an action means to restore the object to its before image. ā€¢ The UNDO operation uses the log information and restores the old value of the object. New stable database state Database Log UNDO Old stable database state
  • 21. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/21 When to Write Log Records Into Stable Store Assume a transaction T updates a page P ā€¢ Fortunate case āž” System writes P in stable database āž” System updates stable log for this update āž” SYSTEM FAILURE OCCURS!... (before T commits) We can recover (undo) by restoring P to its old state by using the log ā€¢ Unfortunate case āž” System writes P in stable database āž” SYSTEM FAILURE OCCURS!... (before stable log is updated) We cannot recover from this failure because there is no log record to restore the old value. ā€¢ Solution: Write-Ahead Log (WAL) protocol
  • 22. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/22 Writeā€“Ahead Log Protocol ā€¢ Notice: āž” If a system crashes before a transaction is committed, then all the operations must be undone. Only need the before images (undo portion of the log). āž” Once a transaction is committed, some of its actions might have to be redone. Need the after images (redo portion of the log). ā€¢ WAL protocol : ļ‚Œ Before a stable database is updated, the undo portion of the log should be written to the stable log ļ‚ When a transaction commits, the redo portion of the log must be written to stable log prior to the updating of the stable database.
  • 23. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/23 Logging Interface Read WriteWrite Read Main memory Local Recovery Manager Database Buffer Manager Fetch, Flush Secondary storage Stable log Stable database Database buffers (Volatile database) Log buffers
  • 24. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/24 Out-of-Place Update Recovery Information ā€¢ Shadowing āž” When an update occurs, don't change the old page, but create a shadow page with the new values and write it into the stable database. āž” Update the access paths so that subsequent accesses are to the new shadow page. āž” The old page retained for recovery. ā€¢ Differential files āž” For each file F maintain āœ¦ a read only part FR āœ¦ a differential file consisting of insertions part DF+ and deletions part DF- āœ¦ Thus, F = (FR DF+) ā€“ DF- āž” Updates treated as delete old value, insert new value
  • 25. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/25 Execution of Commands Commands to consider: begin_transaction read write commit abort recover Independent of execution strategy for LRM
  • 26. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/26 Execution Strategies ā€¢ Dependent upon āž” Can the buffer manager decide to write some of the buffer pages being accessed by a transaction into stable storage or does it wait for LRM to instruct it? āœ¦ fix/no-fix decision āž” Does the LRM force the buffer manager to write certain buffer pages into stable database at the end of a transaction's execution? āœ¦ flush/no-flush decision ā€¢ Possible execution strategies: āž” no-fix/no-flush āž” no-fix/flush āž” fix/no-flush āž” fix/flush
  • 27. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/27 No-Fix/No-Flush ā€¢ Abort āž” Buffer manager may have written some of the updated pages into stable database āž” LRM performs transaction undo (or partial undo) ā€¢ Commit āž” LRM writes an ā€œend_of_transactionā€ record into the log. ā€¢ Recover āž” For those transactions that have both a ā€œbegin_transactionā€ and an ā€œend_of_transactionā€ record in the log, a partial redo is initiated by LRM āž” For those transactions that only have a ā€œbegin_transactionā€ in the log, a global undo is executed by LRM
  • 28. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/28 No-Fix/Flush ā€¢ Abort āž” Buffer manager may have written some of the updated pages into stable database āž” LRM performs transaction undo (or partial undo) ā€¢ Commit āž” LRM issues a flush command to the buffer manager for all updated pages āž” LRM writes an ā€œend_of_transactionā€ record into the log. ā€¢ Recover āž” No need to perform redo āž” Perform global undo
  • 29. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/29 Fix/No-Flush ā€¢ Abort āž” None of the updated pages have been written into stable database āž” Release the fixed pages ā€¢ Commit āž” LRM writes an ā€œend_of_transactionā€ record into the log. āž” LRM sends an unfix command to the buffer manager for all pages that were previously fixed ā€¢ Recover āž” Perform partial redo āž” No need to perform global undo
  • 30. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/30 Fix/Flush ā€¢ Abort āž” None of the updated pages have been written into stable database āž” Release the fixed pages ā€¢ Commit (the following have to be done atomically) āž” LRM issues a flush command to the buffer manager for all updated pages āž” LRM sends an unfix command to the buffer manager for all pages that were previously fixed āž” LRM writes an ā€œend_of_transactionā€ record into the log. ā€¢ Recover āž” No need to do anything
  • 31. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/31 Checkpoints ā€¢ Simplifies the task of determining actions of transactions that need to be undone or redone when a failure occurs. ā€¢ A checkpoint record contains a list of active transactions. ā€¢ Steps: ļ‚Œ Write a begin_checkpoint record into the log ļ‚ Collect the checkpoint dat into the stable storage ļ‚Ž Write an end_checkpoint record into the log
  • 32. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/32 Media Failures ā€“ Full Architecture Read WriteWrite Read Main memory Local Recovery Manager Database Buffer Manager Fetch, Flush Archive log Archive database Secondary storage Stable log Stable database Database buffers (Volatile database) Log buffers Write Write
  • 33. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/33 Distributed Reliability Protocols ā€¢ Commit protocols āž” How to execute commit command for distributed transactions. āž” Issue: how to ensure atomicity and durability? ā€¢ Termination protocols āž” If a failure occurs, how can the remaining operational sites deal with it. āž” Non-blocking : the occurrence of failures should not force the sites to wait until the failure is repaired to terminate the transaction. ā€¢ Recovery protocols āž” When a failure occurs, how do the sites where the failure occurred deal with it. āž” Independent : a failed site can determine the outcome of a transaction without having to obtain remote information. ā€¢ Independent recovery non-blocking termination
  • 34. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/34 Two-Phase Commit (2PC) Phase 1 : The coordinator gets the participants ready to write the results into the database Phase 2 : Everybody writes the results into the database āž” Coordinator :The process at the site where the transaction originates and which controls the execution āž” Participant :The process at the other sites that participate in executing the transaction Global Commit Rule: ļ‚Œ The coordinator aborts a transaction if and only if at least one participant votes to abort it. ļ‚ The coordinator commits a transaction if and only if all of the participants vote to commit it.
  • 35. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/35 Centralized 2PC ready? yes/no commit/abort?commited/aborted Phase 1 Phase 2 C C C P P P P P P P P
  • 36. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/36 2PC Protocol Actions ParticipantCoordinator No Yes VOTE-COMMIT Yes GLOBAL-ABORT No write abort in log Abort Commit ACK ACK INITIAL write abort in log write ready in log write commit in log Type of msg WAIT Ready to Commit? write commit in log Any No? write abort in log ABORTCOMMIT COMMITABORT write begin_commit in log write end_of_transaction in log READ Y INITIAL Unilateralabort
  • 37. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/37 Linear 2PC Prepare VC/VA Phase 1 Phase 2 GC/GA VC/VA VC/VA VC/VA VC: Vote-Commit, VA: Vote-Abort, GC: Global-commit, GA: Global-abort 1 2 3 4 5 N GC/GA GC/GA GC/GA GC/GA ā‰ˆā‰ˆ
  • 38. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/38 Distributed 2PC prepare vote-abort/ vote-commit global-commit/ global-abort decision made independently Phase 1 Coordinator Participants Participants Phase 2
  • 39. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/39 State Transitions in 2PC INITIAL WAIT Commit command Prepare Vote-commit (all) Global-commit INITIAL READY Prepare Vote-commit Global-commit Ack Prepare Vote-abort Global-abort Ack Coordinator Participants Vote-abort Global-abort ABORT COMMIT COMMITABORT
  • 40. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/40 Site Failures - 2PC Termination ā€¢ Timeout in INITIAL āž” Who cares ā€¢ Timeout in WAIT āž” Cannot unilaterally commit āž” Can unilaterally abort ā€¢ Timeout in ABORT or COMMIT āž” Stay blocked and wait for the acks COORDINATOR INITIAL WAIT Commit command Prepare Vote-commit Global-commit ABORT COMMIT Vote-abort Global-abort
  • 41. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/41 Site Failures - 2PC Termination ā€¢ Timeout in INITIAL āž” Coordinator must have failed in INITIAL state āž” Unilaterally abort ā€¢ Timeout in READY āž” Stay blocked INITIAL READY Prepare Vote-commit Global-commit Ack Prepare Vote-abort Global-abort Ack ABORT COMMIT PARTICIPANTS
  • 42. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/42 Site Failures - 2PC Recovery ā€¢ Failure in INITIAL āž” Start the commit process upon recovery ā€¢ Failure in WAIT āž” Restart the commit process upon recovery ā€¢ Failure in ABORT or COMMIT āž” Nothing special if all the acks have been received āž” Otherwise the termination protocol is involved COORDINATOR INITIAL WAIT Commit command Prepare Vote-commit Global-commit ABORT COMMIT Vote-abort Global-abort
  • 43. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/43 Site Failures - 2PC Recovery ā€¢ Failure in INITIAL āž” Unilaterally abort upon recovery ā€¢ Failure in READY āž” The coordinator has been informed about the local decision āž” Treat as timeout in READY state and invoke the termination protocol ā€¢ Failure in ABORT or COMMIT āž” Nothing special needs to be done INITIAL READY Prepare Vote-commit Global-commit Ack Prepare Vote-abort Global-abort Ack ABORT COMMIT PARTICIPANTS
  • 44. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/44 2PC Recovery Protocols ā€“ Additional Cases Arise due to non-atomicity of log and message send actions ā€¢ Coordinator site fails after writing ā€œbegin_commitā€ log and before sending ā€œprepareā€ command āž” treat it as a failure in WAIT state; send ā€œprepareā€ command ā€¢ Participant site fails after writing ā€œreadyā€ record in log but before ā€œvote- commitā€ is sent āž” treat it as failure in READY state āž” alternatively, can send ā€œvote-commitā€ upon recovery ā€¢ Participant site fails after writing ā€œabortā€ record in log but before ā€œvote- abortā€ is sent āž” no need to do anything upon recovery
  • 45. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/45 2PC Recovery Protocols ā€“ Additional Case ā€¢ Coordinator site fails after logging its final decision record but before sending its decision to the participants āž” coordinator treats it as a failure in COMMIT or ABORT state āž” participants treat it as timeout in the READY state ā€¢ Participant site fails after writing ā€œabortā€ or ā€œcommitā€ record in log but before acknowledgement is sent āž” participant treats it as failure in COMMIT or ABORT state āž” coordinator will handle it by timeout in COMMIT or ABORT state
  • 46. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/46 Problem With 2PC ā€¢ Blocking āž” Ready implies that the participant waits for the coordinator āž” If coordinator fails, site is blocked until recovery āž” Blocking reduces availability ā€¢ Independent recovery is not possible ā€¢ However, it is known that: āž” Independent recovery protocols exist only for single site failures; no independent recovery protocol exists which is resilient to multiple-site failures. ā€¢ So we search for these protocols ā€“ 3PC
  • 47. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/47 Three-Phase Commit ā€¢ 3PC is non-blocking. ā€¢ A commit protocols is non-blocking iff āž” it is synchronous within one state transition, and āž” its state transition diagram contains āœ¦ no state which is ā€œadjacentā€ to both a commit and an abort state, and āœ¦ no non-committable state which is ā€œadjacentā€ to a commit state ā€¢ Adjacent: possible to go from one stat to another with a single state transition ā€¢ Committable: all sites have voted to commit a transaction āž” e.g.: COMMIT state
  • 48. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/48 State Transitions in 3PC INITIAL WAIT Commit command Prepare Vote-commit Prepare-to-commit Coordinator Vote-abort Global-abort ABORT COMMIT PRE- COMMIT Ready-to-commit Global commit INITIAL READY Prepare Vote-commit Prepared-to-commit Ready-to-commit Prepare Vote-abort Global-abort Ack Participants COMMIT ABORT PRE- COMMIT Global commit Ack
  • 49. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/49 Communication Structure C P P P P C P P P P C ready? yes/no pre-commit/ pre-abort? commit/abort Phase 1 Phase 2 P P P P C yes/no ack Phase 3
  • 50. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/50 Site Failures ā€“ 3PC Termination ā€¢ Timeout in INITIAL āž” Who cares ā€¢ Timeout in WAIT āž” Unilaterally abort ā€¢ Timeout in PRECOMMIT āž” Participants may not be in PRE- COMMIT, but at least in READY āž” Move all the participants to PRECOMMIT state āž” Terminate by globally committing INITIAL WAIT Commit command Prepare Vote-commit Prepare-to-commit Coordinator Vote-abort Global-abort ABORT COMMIT PRE- COMMIT Ready-to-commit Global commit
  • 51. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/51 Site Failures ā€“ 3PC Termination ā€¢ Timeout in ABORT or COMMIT āž” Just ignore and treat the transaction as completed āž” participants are either in PRECOMMIT or READY state and can follow their termination protocols INITIAL WAIT Commit command Prepare Vote-commit Prepare-to-commit Coordinator Vote-abort Global-abort ABORT COMMIT PRE- COMMIT Ready-to-commit Global commit
  • 52. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/52 Site Failures ā€“ 3PC Termination ā€¢ Timeout in INITIAL āž” Coordinator must have failed in INITIAL state āž” Unilaterally abort ā€¢ Timeout in READY āž” Voted to commit, but does not know the coordinator's decision āž” Elect a new coordinator and terminate using a special protocol ā€¢ Timeout in PRECOMMIT āž” Handle it the same as timeout in READY state INITIAL READY Prepare Vote-commit Prepared-to-commit Ready-to-commit Prepare Vote-abort Global-abort Ack Participants COMMIT ABORT PRE- COMMIT Global commit Ack
  • 53. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/53 Termination Protocol Upon Coordinator Election New coordinator can be in one of four states: WAIT, PRECOMMIT, COMMIT, ABORT ļ‚Œ Coordinator sends its state to all of the participants asking them to assume its state. ļ‚ Participants ā€œback-upā€ and reply with appriate messages, except those in ABORT and COMMIT states. Those in these states respond with ā€œAckā€ but stay in their states. ļ‚Ž Coordinator guides the participants towards termination: āœ¦ If the new coordinator is in the WAIT state, participants can be in INITIAL, READY, ABORT or PRECOMMIT states. New coordinator globally aborts the transaction. āœ¦ If the new coordinator is in the PRECOMMIT state, the participants can be in READY, PRECOMMIT or COMMIT states. The new coordinator will globally commit the transaction. āœ¦ If the new coordinator is in the ABORT or COMMIT states, at the end of the first phase, the participants will have moved to that state as well.
  • 54. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/54 Site Failures ā€“ 3PC Recovery ā€¢ Failure in INITIAL āž” start commit process upon recovery ā€¢ Failure in WAIT āž” the participants may have elected a new coordinator and terminated the transaction āž” the new coordinator could be in WAIT or ABORT states transaction aborted āž” ask around for the fate of the transaction ā€¢ Failure in PRECOMMIT āž” ask around for the fate of the transaction INITIAL WAIT Commit command Prepare Vote-commit Prepare-to-commit Coordinator Vote-abort Global-abort ABORT COMMIT PRE- COMMIT Ready-to-commit Global commit
  • 55. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/55 Site Failures ā€“ 3PC Recovery ā€¢ Failure in COMMIT or ABORT āž” Nothing special if all the acknowledgements have been received; otherwise the termination protocol is involved INITIAL WAIT Commit command Prepare Vote-commit Prepare-to-commit Coordinator Vote-abort Global-abort ABORT COMMIT PRE- COMMIT Ready-to-commit Global commit
  • 56. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/56 Site Failures ā€“ 3PC Recovery ā€¢ Failure in INITIAL āž” unilaterally abort upon recovery ā€¢ Failure in READY āž” the coordinator has been informed about the local decision āž” upon recovery, ask around ā€¢ Failure in PRECOMMIT āž” ask around to determine how the other participants have terminated the transaction ā€¢ Failure in COMMIT or ABORT āž” no need to do anything INITIAL READY Prepare Vote-commit Prepared-to-commit Ready-to-commit Prepare Vote-abort Global-abort Ack Participants COMMIT ABORT PRE- COMMIT Global commit Ack
  • 57. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/57 Network Partitioning ā€¢ Simple partitioning āž” Only two partitions ā€¢ Multiple partitioning āž” More than two partitions ā€¢ Formal bounds: āž” There exists no non-blocking protocol that is resilient to a network partition if messages are lost when partition occurs. āž” There exist non-blocking protocols which are resilient to a single network partition if all undeliverable messages are returned to sender. āž” There exists no non-blocking protocol which is resilient to a multiple partition.
  • 58. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/58 Independent Recovery Protocols for Network Partitioning ā€¢ No general solution possible āž” allow one group to terminate while the other is blocked āž” improve availability ā€¢ How to determine which group to proceed? āž” The group with a majority ā€¢ How does a group know if it has majority? āž” Centralized āœ¦ Whichever partitions contains the central site should terminate the transaction āž” Voting-based (quorum)
  • 59. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/59 Quorum Protocols ā€¢ The network partitioning problem is handled by the commit protocol. ā€¢ Every site is assigned a vote Vi. ā€¢ Total number of votes in the system V ā€¢ Abort quorum Va, commit quorum Vc āž” Va + Vc > V where 0 ā‰¤ Va , Vc ā‰¤ V āž” Before a transaction commits, it must obtain a commit quorum Vc āž” Before a transaction aborts, it must obtain an abort quorum Va
  • 60. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/60 State Transitions in Quorum Protocols INITIAL WAIT Commit command Prepare Vote-commit Prepare-to-commit Coordinator Vote-abort Prepare-to-abort ABORT COMMIT PRE- COMMIT Ready-to-commit Global commit INITIAL READY Prepare Vote-commit Prepare-to-commit Ready-to-commit Prepare Vote-abort Global-abort Ack Participants COMMITABORT PRE- COMMIT Global commit Ack PRE- ABORT Prepared-to-abortt Ready-to-abort PRE- ABORT Ready-to-abort Global-abort
  • 61. Distributed DBMS Ā© M. T. Ɩzsu & P. Valduriez Ch.12/61 Use for Network Partitioning ā€¢ Before commit (i.e., moving from PRECOMMIT to COMMIT), coordinator receives commit quorum from participants. One partition may have the commit quorum. ā€¢ Assumes that failures are ā€œcleanā€ which means: āž” failures that change the network's topology are detected by all sites instantaneously āž” each site has a view of the network consisting of all the sites it can communicate with