The document discusses techniques for recovery in concurrent transaction systems including log-based recovery, shadow paging, and checkpointing. It explains that concurrent transactions require modifications to the basic log-based recovery scheme including constructing undo and redo lists during recovery. Shadow paging uses copy-on-write to avoid in-place updates and provide atomicity. Checkpointing reduces the number of log records that must be scanned during recovery.
2. Recovery with concurrent transactions:
So far we have considered recovery in an environment where
only a single transaction at a time is executing.
We can modify and extend the log-based recovery scheme to
deal with multiple concurrent transactions.
Regardless of the number of concurrent transactions, the
system has a single disk buffer and a single log.
All transactions share the buffer blocks.
We allow immediate modification, and permit a buffer block
to have data items updated by one or more transactions.
3. Interaction with concurrency control:
The recovery scheme depends greatly on the concurrency-
control scheme that is used.
To roll back a failed transaction, we must undo the updates
performed by the transaction.
Suppose that a transaction T0 has to be rolled back, and a
data item Q that was updated by T0 has to be restored to its
old value.
Using the log-based schemes for recovery, we restore the
value by using the undo information in a log record.
Suppose now that a second transaction T1 has performed yet
another update on Q before T0 is rolled back.
4. Then, the update performed by T1 will be lost if T0 is rolled
back.
To avoid such situations, we require that, if a transaction T has
updated a data item Q, no other transaction may update the same
data item until T has committed or been rolled back.
We can ensure this requirement easily by using strict two-
phase locking—that is, two-phase locking with exclusive
locks held until the end of the transaction.
5. We roll back a failed transaction, Ti, by using the log.
The system scans the log backward; for every log record of the
form <Ti, Xj, V1, V2> found in the log, the system restores the
data item Xj to its old value V1.
Scanning of the log terminates when the log record <Ti start> is
found.
Scanning the log backward is important, since a transaction may
have updated a data item more than once.
As an illustration, consider the pair of log records
<Ti, A, 10, 20>
<Ti, A, 20, 30>
6. The log records represent a modification of data item A by
Ti, followed by another modification of A by Ti.
Scanning the log backward sets A correctly to 10.
If the log were scanned in the forward direction, A would
be set to 20, which is incorrect.
If strict two-phase locking is used for concurrency control,
locks held by a transaction T may be released only after the
transaction has been rolled back as described.
Therefore, restoring the old value of the data item will not
erase the effects of any other transaction.
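The backward undo scan described above can be sketched as follows. This is a minimal in-memory model, not a real DBMS: the log-record tuple shapes and the `rollback` function are illustrative assumptions.

```python
# Sketch of rolling back one failed transaction by scanning the log
# backward. Hypothetical record shapes:
#   ("start", txn)                        -- <txn start>
#   ("update", txn, item, old, new)       -- <txn, item, old, new>
def rollback(log, txn, db):
    """Undo txn's updates by scanning the log backward."""
    for record in reversed(log):
        if record[0] == "update" and record[1] == txn:
            _, _, item, old, _new = record
            db[item] = old           # restore the old value
        elif record == ("start", txn):
            break                    # stop at <txn start>

db = {"A": 30}
log = [("start", "Ti"),
       ("update", "Ti", "A", 10, 20),
       ("update", "Ti", "A", 20, 30)]
rollback(log, "Ti", db)
# Backward order first restores A to 20, then to 10.
```

Scanning forward instead would leave A at 20, which is why the backward direction matters when a transaction updates the same item more than once.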
7. Earlier, we used checkpoints to reduce the number of log
records that the system must scan when it recovers from a
crash. Since we assumed no concurrency, it was necessary to
consider only the following transactions during recovery:
Those transactions that started after the most recent
checkpoint
The one transaction, if any, that was active at the time of the
most recent checkpoint
8. The situation is more complex when transactions can execute
concurrently, since several transactions may have been active
at the time of the most recent checkpoint.
In a concurrent transaction-processing system, we require that
the checkpoint log record be of the form <checkpoint L>,
where L is a list of transactions active at the time of the
checkpoint.
Again, we assume that transactions do not perform updates
either on the buffer blocks or on the log while the checkpoint
is in progress.
9. The requirement that transactions must not perform any
updates to buffer blocks or to the log during checkpointing
can be bothersome, since transaction processing has to
halt while a checkpoint is in progress.
A fuzzy checkpoint is a checkpoint where transactions are
allowed to perform updates even while buffer blocks are being
written out.
10. When the system recovers from a crash, it constructs two lists:
The undo-list consists of transactions to be undone, and the
redo-list consists of transactions to be redone.
The system constructs the two lists as follows: Initially, they
are both empty. The system scans the log backward,
examining each record, until it finds the first
<checkpoint L> record:
For each record found of the form <Ti commit>, it adds Ti to
redo-list.
For each record found of the form <Ti start>, if Ti is not in
redo-list, then it adds Ti to undo-list.
Finally, for each transaction Ti in the list L, if Ti is not in
redo-list, it adds Ti to undo-list.
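The list construction can be sketched as below. The record shapes (`("start", T)`, `("commit", T)`, `("checkpoint", L)`) are illustrative assumptions, not a real log format.

```python
# Sketch of building redo-list and undo-list by scanning the log
# backward to the most recent <checkpoint L> record.
def build_lists(log):
    redo, undo = [], []
    checkpoint_active = []
    for record in reversed(log):
        if record[0] == "commit":
            redo.append(record[1])
        elif record[0] == "start" and record[1] not in redo:
            undo.append(record[1])
        elif record[0] == "checkpoint":
            checkpoint_active = record[1]   # L: active at checkpoint
            break
    # Transactions active at the checkpoint that did not commit
    # must also be undone.
    for t in checkpoint_active:
        if t not in redo and t not in undo:
            undo.append(t)
    return redo, undo

log = [("start", "T0"), ("checkpoint", ["T0"]),
       ("start", "T1"), ("commit", "T1"), ("start", "T2")]
redo, undo = build_lists(log)
# T1 committed after the checkpoint; T0 and T2 must be undone.
```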
11. Once the redo-list and undo-list have been constructed,
recovery proceeds as follows:
1. The system rescans the log from the most recent record
backward, and performs an undo for each log record that
belongs to a transaction Ti on the undo-list. Log records of
transactions on the redo-list are ignored in this phase. The
scan stops when the <Ti start> records have been found for
every transaction Ti in the undo-list.
12. 2. The system locates the most recent <checkpoint L>
record on the log. Notice that this step may involve
scanning the log forward, if the checkpoint record
was passed in step 1.
3. The system scans the log forward from the most
recent <checkpoint L> record, and performs redo for
each log record that belongs to a transaction Ti that is
on the redo-list. It ignores log records of transactions
on the undo-list in this phase.
13. It is important in step 1 to process the log backward, to ensure
that the resulting state of the database is correct.
After the system has undone all transactions on the undo-list,
it redoes those transactions on the redo-list. It is important,
in this case, to process the log forward. When the recovery
process has completed, transaction processing resumes.
As an illustration, consider the log records
<Ti, A, 10, 20>
<Tj, A, 10, 30>
<Tj commit>
where Ti is on the undo-list and Tj is on the redo-list.
Undoing Ti backward restores A to 10; redoing the committed
transaction Tj forward then sets A to its correct value, 30.
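The undo and redo passes can be sketched as below, assuming the undo-list and redo-list have already been built. This simplified model omits the stop-at-start and start-at-checkpoint details; the record tuples are illustrative assumptions.

```python
# Sketch of the two recovery passes: undo backward, then redo forward.
# Hypothetical update records: ("update", txn, item, old, new).
def recover(log, undo_list, redo_list, db):
    # Undo pass: scan backward, restoring old values for undo-list txns.
    for rec in reversed(log):
        if rec[0] == "update" and rec[1] in undo_list:
            db[rec[2]] = rec[3]          # restore old value
    # Redo pass: scan forward, reapplying new values for redo-list txns.
    # (In the full algorithm this starts at the <checkpoint L> record.)
    for rec in log:
        if rec[0] == "update" and rec[1] in redo_list:
            db[rec[2]] = rec[4]          # reapply new value

db = {"A": 30}
log = [("update", "Ti", "A", 10, 20),
       ("update", "Tj", "A", 10, 30),
       ("commit", "Tj")]
recover(log, {"Ti"}, {"Tj"}, db)
# Undo sets A to 10; the forward redo of committed Tj then sets A to 30.
```

Reversing the order of the passes (redo before undo) would leave A at 10, which is why undo must complete before redo begins.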
14. Shadow paging is a technique for providing atomicity and
durability (two of the ACID properties) in database systems.
A page in this context refers to a unit of physical storage
(probably on a hard disk), typically of the order of 1 to 64
KiB.
Shadow paging is a copy-on-write technique for avoiding in-place
updates of pages.
When a page is to be modified, a shadow page is allocated.
When the page is ready to become durable, all pages that
referred to the original are updated to refer to the new
replacement page instead.
15. Because the page is "activated" only when it is ready,
the operation is atomic.
If the referring pages must also be updated via
shadow paging, this procedure may recurse many
times, becoming quite costly.
Batching such updates so that pages are replaced only
occasionally increases performance significantly by avoiding
many writes on hotspots high up in the referential
hierarchy (e.g., a file system superblock), at the cost
of high commit latency.
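The copy-on-write idea can be sketched with a toy in-memory pager. All names here are illustrative assumptions; a real system would make the page-table switch durable on disk.

```python
# Toy shadow paging: writes go to freshly allocated shadow pages, and
# the committed page table is replaced only at commit time.
class ShadowPager:
    def __init__(self, pages):
        self.pages = dict(pages)                    # page id -> content
        self.current = {pid: pid for pid in pages}  # committed page table
        self.shadow = None                          # in-progress page table

    def begin(self):
        self.shadow = dict(self.current)            # copy the page table

    def write(self, pid, content):
        new_pid = max(self.pages) + 1               # allocate a shadow page
        self.pages[new_pid] = content               # never update in place
        self.shadow[pid] = new_pid                  # shadow table points to it

    def read(self, pid):
        table = self.shadow if self.shadow is not None else self.current
        return self.pages[table[pid]]

    def commit(self):
        # The atomic step: switch the page-table pointer.
        self.current, self.shadow = self.shadow, None

    def abort(self):
        self.shadow = None                          # committed state untouched

p = ShadowPager({0: "old"})
p.begin()
p.write(0, "new")
p.abort()            # crash/abort before commit: "old" survives

q = ShadowPager({0: "old"})
q.begin()
q.write(0, "new")
q.commit()           # the pointer flip makes "new" visible atomically
```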
16. Shadow paging is similar to the old master–new
master batch processing technique used in mainframe
database systems.
In these systems, the output of each batch run (possibly
a day's work) was written to two separate disks or other
form of storage medium.
One was kept for backup, and the other was used as the
starting point for the next day's work.
Shadow paging is also similar to purely functional
data structures, in that in-place updates are avoided.
18. Data must be in RAM for DBMS to operate on it!
Buffer Mgr hides the fact that not all data is in
RAM.
Requestor of page must eventually unpin it, and
indicate whether page has been modified:
–dirty bit is used for this
CC & recovery may entail additional I/O when a
frame is chosen for replacement. (Write-Ahead Log
protocol; more later.)
19. Frame is chosen for replacement by a replacement
policy:
– Least-recently-used (LRU), MRU, Clock, etc.
Policy can have big impact on # of I/O’s; depends on
the access pattern.
20. Least Recently Used (LRU):
for each page in buffer pool, keep track of time last
unpinned
replace the frame that has the oldest (earliest) time
very common policy: intuitive and simple
Problem: Sequential flooding:
LRU + repeated sequential scans.
# buffer frames < # pages in file means each page
request causes an I/O. MRU much better in this situation
(but not in all situations, of course).
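The LRU bookkeeping described above can be sketched as follows; the `LRUPool` class and its method names are illustrative assumptions.

```python
# Minimal sketch of LRU frame replacement: track the time each frame
# was last unpinned and evict the frame with the oldest such time.
import itertools

class LRUPool:
    def __init__(self):
        self.clock = itertools.count()   # logical time source
        self.frames = {}                 # page id -> last-unpinned time

    def unpin(self, page):
        self.frames[page] = next(self.clock)

    def choose_victim(self):
        # Replace the frame with the earliest last-unpinned time.
        return min(self.frames, key=self.frames.get)

pool = LRUPool()
for page in ["A", "B", "C"]:
    pool.unpin(page)
pool.unpin("A")                  # A is now most recently used
victim = pool.choose_victim()    # B has the oldest unpin time
```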
22. An approximation of LRU
Arrange frames into a cycle; store one reference bit per
frame
– Can think of this as the 2nd chance bit
When pin count reduces to 0, turn on ref. bit
When replacement is necessary:
do for each page in cycle {
  if (pincount == 0 && ref bit is on)
    turn off ref bit;
  else if (pincount == 0 && ref bit is off)
    choose this page for replacement;
} until a page is chosen;
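A runnable version of the clock (second-chance) sweep, assuming each frame carries a pin count and a reference bit; the `Frame` class and function names are illustrative.

```python
# Clock replacement: sweep the cycle of frames, clearing reference bits,
# and pick the first unpinned frame whose bit is already off.
class Frame:
    def __init__(self, page):
        self.page = page
        self.pin_count = 0
        self.ref_bit = True   # turned on when pin count drops to 0

def clock_choose(frames, hand=0):
    """Return the index of the chosen victim frame.
    Assumes at least one unpinned frame exists."""
    while True:
        f = frames[hand]
        if f.pin_count == 0 and f.ref_bit:
            f.ref_bit = False            # give a second chance
        elif f.pin_count == 0 and not f.ref_bit:
            return hand                  # choose this frame
        hand = (hand + 1) % len(frames)

frames = [Frame("A"), Frame("B"), Frame("C")]
frames[1].pin_count = 2                  # B is pinned, never chosen
victim = clock_choose(frames)
# The first sweep clears A's and C's bits; the second sweep picks A.
```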
24. ddd:
Two ways to debug postgres:
interactive mode
bare backend mode (described here)
Initialize postgres by typing:
/local_tmp/$USER/postgresql-7.2.2-clock/bin/initdb
-D <data directory>
25. Start up ddd with the newly compiled and
installed version of postgres:
ddd /local_tmp/$USER/postgresql-7.2.2-clock/bin/postgres
You can now examine functions.
Type in the search field the name of the function to
be examined (e.g., GetFreeBuffer()).
Then click the Lookup button.