CS 542 Database Management Systems Failure Recovery, Concurrency Control J Singh  April 4, 2011
Today’s meeting The D in ACID: Durability The ACI in ACID Consistency is specified by users in how they define transactions The database is responsible for Atomicity and Isolation
Types of Failures Potential sources of failures: Power loss, resulting in loss of main-memory state, Media failures, resulting in loss of disk state and Software errors, resulting in both Recovery is based on the concept of transactions.
Transactions and Concurrency Users submit transactions, and think of each transaction as executing by itself. Concurrency is achieved by the DBMS, which interleaves actions (reads/writes of DB objects) of various transactions. Each transaction must leave the database in a consistent state if the DB is consistent when the transaction begins. A transaction can end in two different ways: commit: successful end, all actions completed, abort: unsuccessful end, only some actions executed. Issues: effect of interleaving transactions on the database System failures (today’s lecture) Concurrent transactions (partly today, remainder next week)
Transactions, Logging and Recovery We studied Query Processing in the last two lectures Now, Log Manager and Recovery Manager Second part today, Transaction Manager
Reminder: Buffer Management (figure: DB page requests from higher levels are served from a buffer pool of frames in main memory, backed by pages on disk; the choice of frame is dictated by the replacement policy) Data must be in RAM for the DBMS to operate on it!
Primitive Buffer Operations Requests from Transactions Read (x,t):  Input(x) if necessary Assign value of x in block to local variable t (in buffer) Write (x,t):  Input(x) if necessary Assign value of local variable t (in buffer) to x Requests to Disk Input (x): Transfer block containing x from disk to memory (buffer) Output (x): Transfer block containing x from buffer to disk
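To make these primitives concrete, below is a minimal Python sketch, assuming one database element per block and a dict standing in for disk; the class and method names are illustrative, not from any real DBMS.

```python
# A minimal sketch of the four buffer primitives, assuming one
# database element per block; a dict stands in for the disk.

class Buffer:
    def __init__(self, disk):
        self.disk = disk    # element name -> value on "disk"
        self.pool = {}      # buffer pool: element name -> value in memory

    def input(self, x):
        """INPUT(X): transfer the block containing X from disk to buffer."""
        if x not in self.pool:
            self.pool[x] = self.disk[x]

    def output(self, x):
        """OUTPUT(X): transfer the block containing X from buffer to disk."""
        self.disk[x] = self.pool[x]

    def read(self, x):
        """READ(X,t): INPUT(X) if necessary, then return X's buffered value
        (the slide's local variable t is the caller's variable here)."""
        self.input(x)
        return self.pool[x]

    def write(self, x, t):
        """WRITE(X,t): INPUT(X) if necessary, then copy local t into X."""
        self.input(x)
        self.pool[x] = t
```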
Failure Recovery Approaches All of the approaches rely on logging – storing a log of changes to the database so it is possible to restore its state. They differ in What information is logged, The timing of when to force that information to stable storage, What the procedure for recovery will be The approaches are named after the recovery procedure Undo Logging The log contains enough information to detect if the transaction was committed and to roll back the state if it was not. When recovering after a failure, walk back through the log and undo the effect of all txns that do not have a COMMIT entry in the log Other approaches described later
Undo Logging When executing transactions Write the log before writing transaction data and force it to disk Make sure to preserve chronological order The log contains enough information to detect if the transaction was committed and to roll back the state if it was not. When restarting,  Walk back through the log and undo the effect of all uncommitted txns in the log. Challenge: How far back do we need to look? Answer: Until the last checkpoint Define and implement checkpoints momentarily
An Example Transaction Initially: A = 8, B = 8 Transaction T1: A ← 2×A, B ← 2×B Transaction T1: Read(A,t); t ← t×2; Write(A,t); Read(B,t); t ← t×2; Write(B,t); Output(A); Output(B) State at Failure Point (crash strikes after Output(A), before Output(B)): Memory: A = 16, B = 16 Disk: A = 16, B = 8 Undo Log Entries: <T1, start> <T1, A, 8> <T1, B, 8> <T1, Commit> (would have been written if the transaction had completed) Do we have the info to restore?
Execution with Undo Logging Forces all log records to disk Logging Rules: (U1) If transaction T modifies database element X, the log record <T, X, old> must be written to disk before the new value of X is written to disk (U2) If a transaction commits, all database elements it changed must be written to disk before the COMMIT record is written to disk
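As a sketch of the ordering these rules impose, here is the example transaction T1 from the previous slide run against the `Buffer` sketch above; the `log` list, `flush_log` callback, and record strings are assumptions for illustration.

```python
# Sketch: T1 (A <- 2*A, B <- 2*B) under undo logging.
# `log` is an in-memory log tail; flush_log forces it to disk.

def run_t1(buf, log, flush_log):
    log.append("<T1, start>")
    t = buf.read("A")
    log.append("<T1, A, {}>".format(t))   # old value of A
    buf.write("A", t * 2)
    t = buf.read("B")
    log.append("<T1, B, {}>".format(t))   # old value of B
    buf.write("B", t * 2)
    flush_log()        # U1: update records reach disk before the data
    buf.output("A")
    buf.output("B")    # U2: all changed data on disk before COMMIT
    log.append("<T1, commit>")
    flush_log()
```

If the crash of the previous slide strikes between the two OUTPUTs, the flushed <T1, A, 8> and <T1, B, 8> records are already on disk, so recovery can restore both old values.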
More on Undo Logging Failure During Recovery Recovery algorithm is idempotent Just do it again! How much of the log file needs to be processed? In principle, we need to examine the entire log. Checkpointing limits the part of the log that needs to be considered during recovery up to a certain point (checkpoint).
Quiescent Checkpointing Simple approach to introduce the concept Pause the database stop accepting new transactions, wait until all current transactions commit or abort and have written the corresponding log records, flush the log to disk, write a <CKPT> log record and flush the log, resume accepting new transactions. Once we encounter a checkpoint record, we know that there are no incomplete transactions. Do not need to go backward beyond checkpoint. Can afford to throw away any part of the log prior to the checkpoint Pausing the database may not be acceptable for business reasons
Non-quiescent Checkpointing Main idea: Start- and End-Checkpoints to bracket unfinished txns Write a <START CKPT (T1, T2, … Tk)> record into the log T1, T2, … Tk are the unfinished txns Wait till T1, T2, … Tk commit or abort, but allow other txns to begin Write an <END CKPT> record into the log Recovery method: scan the log backwards until a <CKPT> record is found If <END…>, scan backwards to the previous <START…> No need to look any further If <START…>, then the crash must have occurred during checkpointing. The START record tells us the unfinished txns Scan back to the beginning of the oldest one of these.
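A sketch of this backward scan, assuming log records parsed into tuples such as ("start", T), ("update", T, X, old), ("commit", T), ("start_ckpt", [T1, …]) and ("end_ckpt",); `restore` stands in for writing an old value back to the database.

```python
def undo_recover(log, restore):
    """Backward pass: undo every write of a txn with no COMMIT/ABORT,
    stopping at the START CKPT of a completed checkpoint."""
    finished = set()        # txns that committed or aborted
    saw_end_ckpt = False
    for rec in reversed(log):
        kind = rec[0]
        if kind in ("commit", "abort"):
            finished.add(rec[1])
        elif kind == "update":
            t, x, old = rec[1], rec[2], rec[3]
            if t not in finished:
                restore(x, old)     # reinstall the old value
        elif kind == "end_ckpt":
            saw_end_ckpt = True
        elif kind == "start_ckpt":
            if saw_end_ckpt:
                break   # checkpoint completed: nothing older is unfinished
            # crash hit mid-checkpoint: rec[1] lists the txns still to
            # chase; scanning further back is safe, so this sketch continues
```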
Issues with Undo Logging Bottlenecks on I/O All log records must be forced back to disk before any data written back All data records must be forced to disk before the COMMIT record is written back An alternative: Redo Logging Instead of scanning backward from the end Undoing all transactions that were not completed Scans the log forward Reapplies all transactions that were not completed
Logging with Redo Logs Creation of the Redo log For every action, generate a redo log record. <T, X, v> has a different meaning: v is the new value, not the old Flush log at commit. All log records for a transaction that modified X (including commit) must be on disk before X is modified on disk Write an END log record after DB modifications have been written to disk. Recovery algorithm. Redo the modifications by committed transactions not yet flushed to disk. S = set of txns with <Ti commit> and no <Ti end> in log For each <Ti, X, v> in the log, in forward order (from earliest to latest) do: if Ti in S then Write(X, v); Output(X) Write <Ti END>
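The recovery pass above, as a hedged Python sketch; the tuple record format and the `write`/`output`/`append_log` callbacks are assumptions.

```python
def redo_recover(log, write, output, append_log):
    """Forward pass: reapply writes of committed txns lacking an END."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    ended = {rec[1] for rec in log if rec[0] == "end"}
    s = committed - ended               # redo exactly these txns
    for rec in log:                     # earliest to latest
        if rec[0] == "update" and rec[1] in s:
            _, t, x, new = rec
            write(x, new)               # redo the modification
            output(x)
    for t in s:
        append_log(("end", t))          # record that redo completed
```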
Logging with Redo Logs (figure: example execution of a transaction under redo logging)
Comments on Redo Logging Checkpoint algorithms similar to those for Undo Logging Quiescent as well as Non-quiescent algorithms Issues with Redo Logging Writing data back to disk is not allowed until transaction logs have been written out Results in a large requirement for memory for buffer pool A flaw in the checkpointing algorithms (textbook, p869) Both undo and redo logs may put contradictory requirements on how buffers are handled during a checkpoint, unless the database elements are complete blocks or sets of blocks.  For instance, if a buffer contains one database element A that was changed by a committed transaction and another database element B that was changed in the same buffer by a transaction that has not yet had its COMMIT record written to disk, then we are required to copy the buffer to disk because of A but also forbidden to do so, because rule R1 applies to B.
Undo/Redo Logging (p1) Undo logging requires writing modified data to disk before the COMMIT record, leading to an unnecessarily large number of I/Os. Redo logging requires keeping all modified blocks in the buffer until the transaction commits and the log records have been flushed, increasing the buffer size requirement. Undo/redo logging combines undo and redo logging. It provides more flexibility in flushing modified blocks at the expense of maintaining more information in the log.
Undo/Redo Logging (p2) Main idea: The log can be used to reconstruct the data Update records <T, X, new, old> record the new and old values of X. The only undo/redo logging rule is: Log record must be flushed before the corresponding modified block Also known as write-ahead logging (WAL). Block of X can be flushed before or after T commits, i.e. before or after the COMMIT log record. Flush the log at commit.
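A sketch of the rule in code, assuming each buffered block remembers the LSN (log sequence number) of the record for its last change; the `Log` class and all names here are illustrative, loosely modeled on how WAL implementations track this.

```python
class Log:
    def __init__(self):
        self.records = []       # in-memory log tail
        self.flushed_lsn = -1   # highest LSN already on disk

    def append(self, rec):
        self.records.append(rec)
        return len(self.records) - 1   # the new record's LSN

    def flush(self, up_to):
        # stand-in for forcing records up to `up_to` to stable storage
        self.flushed_lsn = max(self.flushed_lsn, up_to)

def output_with_wal(buf, x, page_lsn, log):
    """OUTPUT(X) only after the log covering X's last change is on disk."""
    if page_lsn[x] > log.flushed_lsn:
        log.flush(up_to=page_lsn[x])   # write-ahead: log first
    buf.output(x)                      # then the data block
```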
Undo/Redo Logging (p3) Because of the flexibility of flushing X before or after the COMMIT record, we can have uncommitted transactions with modifications on disk and committed transactions with modifications not yet on disk. The undo/redo recovery policy is as follows: Redo committed transactions. Undo uncommitted transactions.
Undo/Redo Logging Recovery More details on the recovery procedure: Backward pass  From end of log back to latest valid checkpoint, construct set S of committed transactions. Undo actions of transactions not in S. Forward pass From latest checkpoint forward to end of log, Or from the beginning of time, if there are no checkpoints redo actions of transactions in S. Alternatively, can also perform the redos before the undos.
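The two passes as a sketch, assuming <T, X, new, old> update records parsed into tuples and no checkpoints (a checkpointed version would stop the backward pass as in the undo sketch earlier); `restore` stands in for writing a value into the database.

```python
def undo_redo_recover(log, restore):
    """Undo uncommitted txns (backward), then redo committed ones (forward)."""
    committed = set()
    for rec in reversed(log):           # backward pass
        if rec[0] == "commit":
            committed.add(rec[1])
        elif rec[0] == "update" and rec[1] not in committed:
            _, t, x, new, old = rec
            restore(x, old)             # undo
    for rec in log:                     # forward pass
        if rec[0] == "update" and rec[1] in committed:
            _, t, x, new, old = rec
            restore(x, new)             # redo
```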
Undo/Redo Checkpointing Write "start checkpoint" listing all active transactions to log Flush log to disk Write to disk all dirty buffers (those containing a changed DB element), whether or not the transaction has committed Implies nothing should be written (not even to memory buffers) until we are sure the transaction will not abort Implies some log records may need to be written to disk (WAL) Write "end checkpoint" to log Flush log to disk (log: … <START CKPT, active T's: T1, T2, …> … <END CKPT> …)
Protecting Against Media Failure Logging protects from local loss of main memory and disk content, but not against global loss of secondary storage content (media failure). To protect against media failures, employ archiving: maintaining a copy of the database on a separate, secure storage device. Log also needs to be archived in the same manner. Two levels of archiving: full dump vs. incremental dump.
Protecting Against Media Failure Typically, database cannot be shut down for the period of time needed to make a backup copy (dump). Need to perform nonquiescent archiving, i.e., create a dump while the DBMS continues to process transactions. Goal is to make copy of database at time when the dump began, but transactions may change database content during the dumping. Logging continues during the dumping, and discrepancies can be corrected from the log.
Protecting Against Media Failure We assume undo/redo (or redo) logging. The archiving procedure is as follows: Write a log record <START DUMP>. Perform a checkpoint for the log. Perform a (full / incremental) dump on the secure storage device. Make sure that enough of the log has been copied to the secure storage device so that at least the log up to the checkpoint will survive media failure. Write a log record <END DUMP>.
Protecting Against Media Failure After a media failure, we can restore the DB from the archived DB and archived log as follows:  Copy latest full dump (archive) back to DB.  Starting with the earliest ones, make the modifications recorded in the incremental dump(s) in increasing order of time.  Further modify DB using the archived log.  Use the recovery method corresponding to the chosen type of logging.
Summary Logging is an effective way to prepare for system failure Transactions provide a useful building block on which to base log entries Three types of logs Undo Logs Redo Logs Undo/Redo logs Only Undo/Redo logs are used in practice. Why? Periodic checkpoints are necessary for keeping recovery times under control. Why? Database Dumps (archives) protect against media failure Great for making a “point in time” copy of the database.
On the NoSQL Front… Google Datastore Recently (1/2011) added a “High Replication” option. Replicates the datastore synchronously across multiple data centers Does not use an append-only log Has performance and size impact CouchDB Append-only log that’s actually a b-tree No provision for deleting part of the log Provision for ‘compacting the log’ MongoDB Recently (12/2010) added a --journal option Has performance impact, no measurements available Common thread, tradeoff between performance and durability!
CS 542 Database Management Systems Concurrency Control J Singh  April 4, 2011
Concurrency Control Goal: Preserving Data Integrity Challenge: enforce ACID rules (while maintaining maximum traffic through the system) Committed transactions leave the system in a consistent state Rolled-back transactions behave as if they never happened! Historical Note Based on The Transaction Concept: Virtues and Limitations by Jim Gray, Tandem Computers, 1981 (Gray received the ACM Turing Award in 1998)
Transactions Concurrent execution of user programs is essential for good DBMS performance. Because disk accesses are frequent, and relatively slow, it is important to keep the CPU humming by working on several user programs concurrently. A user’s program may carry out many operations on the data retrieved from the database, but the DBMS is only concerned about what data is read/written from/to the database. A transaction is the DBMS’s abstract view of a user program: a sequence of reads and writes. Referred to as a Schedule Implemented by a Transaction Scheduler
Scheduler Scheduler takes read/write requests from transactions Either executes them in buffers or delays them Scheduler must avoid Isolation Anomalies
Isolation Anomalies (p1) READ UNCOMMITTED Dirty Read – data of an uncommitted transaction visible to others Sometimes called WR Conflict
T1: R(A), W(A),                               R(B), W(B), C
T2:               R(A), W(A), R(B), W(B), C
UNREPEATABLE READ Non-repeatable Read – some previously read data changes due to another transaction committing Sometimes called RW Conflict
T1: R(A),                     W(A), C
T2:         R(A), W(A), C
Isolation Anomalies (p2) Overwriting Uncommitted Data Sometimes called WW Conflicts
T1: W(A),                     W(B), C
T2:         W(A), W(B), C
We need a set of rules to prohibit such isolation anomalies The rules place constraints on the actions of concurrent transactions
Serial Schedules Definition: A schedule is a list of actions (i.e. reading, writing, aborting, committing) from a set of transactions. A schedule is serial if its transactions are not interleaved Serial schedules observe ACI properties Schedule D is the set of 3 transactions T1, T2, T3. T1 reads and writes to object X Then T2 reads and writes to object Y Then T3 reads and writes to object Z. D is an example of a serial schedule, because the 3 txns are not interleaved. Shorthand: R1(X), W1(X), R2(Y), W2(Y), R3(Z), W3(Z)
Serializable Schedules A serializable schedule is one that is equivalent to a serial schedule. The Transaction Manager should defer some transactions if the current schedule is not serializable The order of transactions in E is not the same as in D, But E gives the same result. Shorthand: E = R1(X); R2(Y); R3(Z); W1(X); W2(Y); W3(Z);
Serializability Is G serializable? Equivalent to the serial schedule <T1,T2> But not <T2,T1> G is conflict-serializable Conflict equivalence: The schedules S1 and S2 are conflict-equivalent if the following conditions are satisfied: Both schedules S1 and S2 involve the same set of transactions (including ordering of actions within each transaction). The order of each pair of conflicting actions in S1 and S2 is the same Conflict-serializability: A schedule is conflict-serializable when the schedule is conflict-equivalent to one or more serial schedules.
Serializability of Schedule G T1: R(A) … W(B) T2: R(A) W(A) Two actions conflict if The actions belong to different transactions. At least one of the actions is a write operation. The actions access the same object (read or write). Precedence graph: a node for each transaction an arc from Ti to Tj if an action in Ti precedes and conflicts with an action in Tj. T1 → T2? R1(A) precedes and conflicts with W2(A), so the arc T1 → T2 is present. T2 → T1? W2(A) does not conflict with W1(B) (different objects), so there is no arc T2 → T1. The graph is acyclic. Theorem: A schedule is conflict serializable if and only if its precedence graph is acyclic
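The theorem suggests a direct test, sketched below: collect an arc for every ordered pair of conflicting actions and check the graph for cycles by repeatedly peeling off nodes with no incoming arc. The triple encoding of a schedule is an assumption for illustration.

```python
def conflict_serializable(schedule):
    """Build the precedence graph of a schedule and test it for cycles."""
    edges = set()
    for i, (ti, ai, xi) in enumerate(schedule):
        for tj, aj, xj in schedule[i + 1:]:
            # conflict: different txns, same object, at least one write
            if ti != tj and xi == xj and "W" in (ai, aj):
                edges.add((ti, tj))     # arc Ti -> Tj
    nodes = {t for t, _, _ in schedule}
    while nodes:                        # topological-sort style peeling
        sources = [n for n in nodes
                   if not any(u in nodes and v == n for u, v in edges)]
        if not sources:
            return False                # a cycle remains
        nodes -= set(sources)
    return True

G = [(1, "R", "A"), (2, "R", "A"), (2, "W", "A"), (1, "W", "B")]
print(conflict_serializable(G))         # True: the only arc is T1 -> T2
```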
Enforcing Serializable Schedules Prevent cycles in the Precedence Graph, P(S), from occurring Locking primitives: Lock (exclusive): li(A) Unlock: ui(A) Make transactions consistent Ti: pi(A) becomes Ti: li(A) pi(A) ui(A) pi(A) is either a READ or a WRITE Allow only one transaction to hold a lock on A at any time Two-phase locking for transactions Ti: li(A) … pi(A) … ui(A), with no unlocks in the first (growing) phase and no new locks in the second (shrinking) phase
Legal Schedules? S1 = l1(A) l1(B) r1(A) w1(B) l2(B) u1(A) u1(B) r2(B) w2(B) u2(B) l3(B) r3(B) u3(B) S2 = l1(A) r1(A) w1(B) u1(A) u1(B) l2(B) r2(B) w2(B) l3(B) r3(B) u3(B) S3 = l1(A) r1(A) u1(A) l1(B) w1(B) u1(B) l2(B) r2(B) w2(B) u2(B) l3(B) r3(B) u3(B)
Locking Protocols for Serializable Schedules Strict Two-phase Locking (Strict 2PL) Protocol: Each transaction must obtain a S (shared) lock on object before reading, and an X (exclusive) lock on object before writing. All locks held by a transaction are released when the transaction completes Strict 2PL allows only serializable schedules Additionally, it simplifies transaction aborts (Non-strict) 2PL Variant: Release locks anytime, but cannot acquire locks after releasing any lock. If a txn holds an X lock on an object, no other txn can get a lock (S or X) on that object. (Non-strict) 2PL also allows only serializable schedules, but involves more complex abort processing Why is “acquiring after releasing” disallowed? To avoid cascading aborts More in a minute
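A minimal sketch of strict 2PL bookkeeping; the single-threaded scheduler that returns False on conflict (leaving the caller to delay the action) and every name here are assumptions, not a real lock manager.

```python
class StrictTwoPL:
    def __init__(self):
        self.locks = {}     # obj -> (mode, set of holder txns)

    def _grant(self, txn, obj, mode):
        held = self.locks.get(obj)
        if held is None:
            self.locks[obj] = (mode, {txn})
            return True
        held_mode, holders = held
        if mode == "S" and held_mode == "S":
            holders.add(txn)            # shared locks are compatible
            return True
        if holders == {txn}:            # sole holder: upgrade/re-request
            self.locks[obj] = ("X" if mode == "X" else held_mode, {txn})
            return True
        return False                    # conflict: caller must delay

    def read(self, txn, obj):
        return self._grant(txn, obj, "S")

    def write(self, txn, obj):
        return self._grant(txn, obj, "X")

    def commit(self, txn):
        """Strict 2PL: all locks are released only when txn completes."""
        for obj in list(self.locks):
            mode, holders = self.locks[obj]
            holders.discard(txn)
            if not holders:
                del self.locks[obj]     # (waking waiters is omitted here)
```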
Executing Locking Protocols Begin with a Serialized Schedule We know it won’t deadlock How do we know this? Beyond this simple 2PL protocol, it is all a matter of improving performance and allowing more concurrency…. Shared locks Increment locks Multiple granularity Other types of concurrency control mechanisms
Lock Management Lock and unlock requests are handled by the lock manager Lock Table Entry Number of transactions currently holding a lock Type of lock held (shared or exclusive) Pointer to queue of lock requests Locking and unlocking operations Atomic Support upgrade: transaction that holds a shared lock can be upgraded to hold an exclusive lock Any level of granularity can be locked Database, table, block, tuple Why is this necessary?
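One lock-table entry as described above might look like the following sketch; the field names, the FIFO queue policy, and the omission of the upgrade path are assumptions.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class LockTableEntry:
    mode: Optional[str] = None                   # "S" or "X"; None if free
    holders: set = field(default_factory=set)    # txns holding the lock
    queue: deque = field(default_factory=deque)  # waiting (txn, mode) pairs

    def request(self, txn, mode):
        """Grant immediately when compatible; otherwise queue (FIFO)."""
        if self.mode is None or (self.mode == "S" and mode == "S"):
            self.mode = mode
            self.holders.add(txn)
            return True
        self.queue.append((txn, mode))
        return False

    def release(self, txn):
        """Drop txn's hold; when the entry frees up, grant the next waiter."""
        self.holders.discard(txn)
        if not self.holders:
            self.mode = None
            if self.queue:
                nxt, m = self.queue.popleft()
                self.request(nxt, m)
```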
Multiple-Granularity Locks If a transaction needs to scan all records in a table, we don’t really want to have a lock on all tuples individually – significant locking overhead! Put a single lock on the table (hierarchy: Database contains Tables, which contain Pages, which contain Tuples) A lock on a node implicitly locks all descendants.
Aborting a Transaction If a transaction Ti is aborted, all its actions have to be undone. If Tj reads an object last written by Ti, Tj must be aborted as well! Most systems avoid such cascading aborts by releasing a transaction’s locks only at commit time. If Ti writes an object, Tj can read this only after Ti commits. In order to undo the actions of an aborted transaction, the DBMS maintains a log in which every write is recorded. The same mechanism is used to recover from system crashes; all active txns at the time of the crash are aborted when the system recovers
Performance Considerations (Again!) 2PL Protocol allows transactions to proceed with maximum parallelism Locking algorithm only delays actions that would cause conflicts But the locks are still a bottleneck Need to ensure lowest-possible level of locking granularity Classic memory-performance trade-off Conflict-serialization is too conservative But other methods of serialization are too complex A use case that occurs quite often should be optimized: besides scanning through the table, if we need to modify a few tuples, what kind of lock do we put on the table? It has to be X (if we only have S or X). But that blocks all other read requests! Lock-based concurrency control is pessimistic: it acquires/releases locks whether or not there is contention. The alternative: Optimistic Concurrency Control
Next Week Intention Locks Optimistic Concurrency Control Distributed Commit Please Read ahead of time The end of an Architectural Era, Stonebraker et al, Proc. VLDB, 2007 OLTP Through the Looking Glass, and What We Found There, Harizopoulos et al, Proc ACM SIGMOD, 2008