CSF212 Module7

CSF212 DATABASE SYSTEMS

Jayalakshmi N, Guest Faculty (Off-Campus), Computer Science
BITS Pilani, Pilani Campus
M7 – Recovery

• Recovery Concepts
• Log-based Recovery Techniques
• Shadow Paging
• ARIES Recovery Algorithm

BITS Pilani, Pilani Campus


Recovery Concepts

Concepts
• Recovery from transaction failures usually means that the
database is restored to the most recent consistent state just
before the time of failure.
• To do this, the system must keep information about the
changes that were applied to data items by the various
transactions. This information is typically kept in the system
log.
• Conceptually, there are two main techniques for recovery
from non-catastrophic transaction failures: deferred update
and immediate update.
• The deferred update techniques do not physically update the
database on disk until after a transaction reaches its commit
point; then the updates are recorded in the database.
Deferred update is also known as the NO-UNDO/REDO
algorithm.
Concepts/Caching of Disk Blocks
• In the immediate update techniques, the database may be updated by
some operations of a transaction before the transaction reaches its
commit point. However, these operations must also be recorded in the
log on disk by force-writing before they are applied to the database on
disk, making recovery still possible.
• In the general case, both undo and redo may be needed during recovery,
which gives the UNDO/REDO algorithm. A variation in which all updates are
required to be recorded in the database on disk before a transaction
commits requires undo only, so it is known as the UNDO/NO-REDO algorithm.
• Typically a collection of in-memory buffers, called the DBMS cache, is
kept under the control of the DBMS to hold the disk pages that contain
the data items being read or updated. A directory for the cache keeps
track of which database items are in the buffers. This can be a table of
<Disk_page_address, Buffer_location, ... > entries.
• When a new item must be brought in and no buffer is free, it may be
necessary to replace (or flush) some of the cache buffers to make space
available for the new item.
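The cache directory and buffer replacement described above can be sketched as follows. This is a minimal illustration, assuming a tiny fixed-capacity cache and a dictionary standing in for the disk. The class name `BufferCache`, the `capacity` parameter, the dirty bit, and the least-recently-used replacement policy are assumptions made for the example and do not come from the slides.

```python
from collections import OrderedDict


class BufferCache:
    """Toy DBMS cache: the directory maps a disk page address to its buffer."""

    def __init__(self, disk, capacity=2):
        self.disk = disk                # dict: page address -> contents (stand-in for disk)
        self.capacity = capacity        # number of in-memory buffers (assumed)
        self.directory = OrderedDict()  # page address -> [contents, dirty_bit]

    def _fetch(self, addr):
        if addr in self.directory:
            self.directory.move_to_end(addr)      # mark as most recently used
            return self.directory[addr]
        if len(self.directory) >= self.capacity:  # no free buffer: replace one
            victim, (contents, dirty) = self.directory.popitem(last=False)
            if dirty:
                self.disk[victim] = contents      # flush the modified buffer
        entry = [self.disk[addr], False]
        self.directory[addr] = entry
        return entry

    def read(self, addr):
        return self._fetch(addr)[0]

    def write(self, addr, contents):
        entry = self._fetch(addr)
        entry[0], entry[1] = contents, True       # update in memory, set dirty bit


disk = {"P1": "a", "P2": "b", "P3": "c"}
cache = BufferCache(disk)
cache.write("P1", "a2")   # P1 is dirty in the cache; the disk still holds "a"
cache.read("P2")
cache.read("P3")          # the cache is full, so P1 is replaced and flushed
```

After the last call the disk copy of `P1` holds `"a2"`, because the modified buffer had to be written back before its slot could be reused.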
Caching of Disk Blocks
• Two main strategies can be employed when flushing a
modified buffer back to disk.
• The first strategy, known as in-place updating, writes the
buffer to the same original disk location, thus overwriting the
old value of any changed data items on disk. Hence, a single
copy of each database disk block is maintained.
• The second strategy, known as shadowing, writes an updated
buffer at a different disk location, so multiple versions of
data items can be maintained.
• In general, the old value of the data item before updating is
called the before image (BFIM), and the new value after
updating is called the after image (AFIM). If shadowing is
used, both the BFIM and the AFIM can be kept on disk; hence,
it is not strictly necessary to maintain a log for recovery.
Write-Ahead Logging
The recovery mechanism must ensure that the BFIM of the data item is
recorded in the appropriate log entry and that the log entry is flushed to
disk before the BFIM is overwritten with the AFIM in the database on disk.
This process is generally known as write-ahead logging.
Standard DBMS recovery terminology includes the terms
steal/no-steal and force/no-force, which specify the rules that govern
when a page from the database can be written to disk from the cache:
1. If a cache buffer page updated by a transaction cannot be written to
disk before the transaction commits, the recovery method is called a
no-steal approach. Otherwise, it is called a steal approach. The no-steal
rule means that UNDO will never be needed during recovery, since no
uncommitted update reaches the database on disk.
2. If all pages updated by a transaction are immediately written to disk
before the transaction commits, it is called a force approach. Otherwise, it
is called no-force. The force rule means that REDO will never be needed
during recovery, since any committed transaction will have all its updates
on disk before it is committed.
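The write-ahead rule combined with a steal/no-force policy can be sketched as below. This is a toy model under stated assumptions: dictionaries stand in for the disk and the cache, lists stand in for the log, and the class `WalStore` and its method names are hypothetical, not taken from any real DBMS.

```python
class WalStore:
    """Toy store that enforces write-ahead logging for in-place updates."""

    def __init__(self):
        self.disk = {}         # item -> value in the database on disk
        self.cache = {}        # item -> value in the DBMS cache
        self.log_buffer = []   # log records still in main memory
        self.log_on_disk = []  # log records already force-written

    def write_item(self, txn, item, new_value):
        old_value = self.cache.get(item, self.disk.get(item))
        # Record the BFIM and AFIM in the log before changing the cached copy.
        self.log_buffer.append(("write_item", txn, item, old_value, new_value))
        self.cache[item] = new_value

    def flush_log(self):
        self.log_on_disk.extend(self.log_buffer)
        self.log_buffer.clear()

    def flush_item(self, item):
        # Steal: this may run before the writing transaction commits.
        # WAL: the log entry reaches disk before the BFIM is overwritten.
        self.flush_log()
        self.disk[item] = self.cache[item]

    def commit(self, txn):
        # No-force: only the log is forced at commit, not the data pages.
        self.log_buffer.append(("commit", txn))
        self.flush_log()


store = WalStore()
store.disk["X"] = 10
store.write_item("T1", "X", 20)
store.flush_item("X")   # X reaches disk while T1 is still uncommitted
```

Because `flush_item` forces the log first, the old value 10 is already in `log_on_disk` when the disk copy of `X` is overwritten, so the update can still be undone if `T1` never commits.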
Checkpoint and Fuzzy Checkpointing
• Another type of entry in the log is called a checkpoint. A
[checkpoint, list of active transactions] record is written into the
log periodically at that point when the system writes out to the
database on disk all DBMS buffers that have been modified.
• Taking a checkpoint consists of the following actions:
1. Suspend execution of transactions temporarily.
2. Force-write all main memory buffers that have been modified to
disk.
3. Write a [checkpoint] record to the log, and force-write the log to
disk.
4. Resume executing transactions.
• The time needed to force-write all modified memory buffers may
delay transaction processing because of step 1. To reduce this delay,
it is common to use a technique called fuzzy checkpointing.
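Steps 2 and 3 of the basic (non-fuzzy) checkpoint can be sketched as follows. The function `take_checkpoint`, its parameters, and the sample data are illustrative assumptions; suspending and resuming transactions (steps 1 and 4) is left to the caller.

```python
def take_checkpoint(cache, dirty, disk, log, active_transactions):
    """Force-write modified buffers, then append a checkpoint record."""
    for item in sorted(dirty):
        disk[item] = cache[item]   # step 2: force-write modified buffers
    dirty.clear()
    # Step 3: the record carries the list of active transactions.
    log.append(("checkpoint", sorted(active_transactions)))
    return log


disk = {"X": 1, "Y": 2}
cache = {"X": 5, "Y": 2}
dirty = {"X"}                      # items modified in the cache
log = [("start_transaction", "T1"), ("write_item", "T1", "X", 1, 5)]
take_checkpoint(cache, dirty, disk, log, {"T1"})
```

In this toy model the log list is treated as already being on disk, so the force-write of the log in step 3 is implicit.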
Fuzzy Checkpointing
• In this technique, the system can resume transaction
processing after a [begin_checkpoint] record is written
to the log without having to wait for step 2 to finish.
• When step 2 is completed, an [end_checkpoint, ...]
record is written in the log with the relevant
information collected during checkpointing.

Transaction Rollback and Cascading Rollback
• If a transaction fails for whatever reason after updating the
database, but before the transaction commits, it may be necessary
to roll back the transaction.
• If any data item values have been changed by the transaction and
written to the database, they must be restored to their previous
values (BFIMs). The undo-type log entries are used to restore the
old values of data items that must be rolled back.
• If a transaction T is rolled back, any transaction S that has, in the
interim, read the value of some data item X written by T must also
be rolled back.
• Similarly, once S is rolled back, any transaction R that has read the
value of some data item Y written by S must also be rolled back; and
so on. This phenomenon is called cascading rollback, and it can occur
when the recovery protocol ensures recoverable schedules but does
not ensure strict or cascadeless schedules.
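The chain of rollbacks described above amounts to following the "has read from" relation until no new transaction is added. A minimal sketch, assuming that relation is available as a dictionary; the function `cascade` and the transaction names are illustrative.

```python
def cascade(failed, reads_from):
    """Return every transaction that must roll back when `failed` aborts.

    reads_from maps a transaction to the set of transactions whose
    uncommitted writes it has read.
    """
    to_abort = {failed}
    changed = True
    while changed:                       # repeat until no new victim is found
        changed = False
        for reader, writers in reads_from.items():
            if reader not in to_abort and writers & to_abort:
                to_abort.add(reader)
                changed = True
    return to_abort


# S read X written by T; R read Y written by S; Q read nothing from them.
reads_from = {"S": {"T"}, "R": {"S"}, "Q": set()}
victims = cascade("T", reads_from)
```

Here the rollback of `T` drags in `S` and then `R`, while `Q` is unaffected.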
Recovery Techniques Based on Deferred Update

NO-UNDO/REDO Recovery Based on Deferred Update
• The idea behind deferred update is to defer or postpone any actual
updates to the database on disk until the transaction completes its
execution successfully and reaches its commit point.
• If a transaction fails before reaching its commit point, there is no need to
undo any operations because the transaction has not affected the
database on disk in any way. Therefore, only REDO type log entries are
needed in the log, which include the new value (AFIM) of the item written
by a write operation.
• The UNDO-type log entries are not needed since no undoing of operations
will be required during recovery.
• A typical deferred update protocol can be stated as follows:
1. A transaction cannot change the database on disk until it reaches its
commit point.
2. A transaction does not reach its commit point until all its REDO-type log
entries are recorded in the log and the log buffer is force-written to disk.
NO-UNDO/REDO Recovery Based on Deferred Update
Procedure RDU_M (NO-UNDO/REDO with checkpoints). Use
two lists of transactions maintained by the system: the
committed transactions T since the last checkpoint (commit list),
and the active transactions T (active list). REDO all the WRITE
operations of the committed transactions from the log, in the
order in which they were written into the log. The transactions
that are active and did not commit are effectively cancelled and
must be resubmitted.
The REDO procedure is defined as follows:
Procedure REDO (WRITE_OP). Redoing a write_item operation
WRITE_OP consists of examining its log entry [write_item, T, X,
new_value] and setting the value of item X in the database to
new_value, which is the after image (AFIM).
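Procedure RDU_M and the REDO procedure can be sketched as below. The sketch assumes the log is a list of tuples that starts at the last checkpoint, with write entries of the form `(write_item, T, X, new_value)` as in the slide. The function names `redo` and `rdu_m` and the sample values are illustrative.

```python
def redo(db, entry):
    """REDO(WRITE_OP): set item X to the new value (AFIM) from the log entry."""
    _, _txn, item, new_value = entry
    db[item] = new_value


def rdu_m(db, log):
    """NO-UNDO/REDO recovery; returns the transactions to resubmit."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}   # commit list
    active = {rec[1] for rec in log} - committed                # active list
    for rec in log:                       # forward, in log order
        if rec[0] == "write_item" and rec[1] in committed:
            redo(db, rec)
    return sorted(active)


# With deferred update, T2's write never reached the database on disk.
db = {"A": 30, "B": 15, "D": 20}
log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "D", 25),
    ("commit", "T1"),
    ("start_transaction", "T2"),
    ("write_item", "T2", "B", 12),
]
resubmit = rdu_m(db, log)
```

Only `T1` is redone, so `D` becomes 25. `T2` is returned for resubmission and `B` keeps its old value without any undo step.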
[Figure: Recovery Timeline]
[Figure: Example (two slides)]
Recovery Techniques Based on Immediate Update

Recovery Techniques Based on Immediate Update
Procedure RIU_M (UNDO/REDO with checkpoints).
1. Use two lists of transactions maintained by the system:
the committed transactions since the last checkpoint and
the active transactions.
2. Undo all the write_item operations of the active
(uncommitted) transactions, using the UNDO procedure.
The operations should be undone in the reverse of the
order in which they were written into the log.
3. Redo all the write_item operations of the committed
transactions from the log, in the order in which they were
written into the log, using the REDO procedure defined
earlier.
Recovery Techniques Based on Immediate Update

The UNDO procedure is defined as follows:


Procedure UNDO (WRITE_OP). Undoing a write_item
operation WRITE_OP consists of examining its log entry
[write_item, T, X, old_value, new_value] and setting the
value of item X in the database to old_value, which is
the before image (BFIM).
Undoing a number of write_item operations from
one or more transactions from the log must proceed in
the reverse of the order in which the
operations were written in the log.
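Procedure RIU_M with the UNDO and REDO procedures can be sketched as follows. The sketch assumes a log of tuples starting at the last checkpoint, with write entries of the form `(write_item, T, X, old_value, new_value)`. The function names and the sample state are illustrative.

```python
def undo(db, entry):
    """UNDO(WRITE_OP): restore item X to its old value (BFIM)."""
    _, _txn, item, old_value, _new_value = entry
    db[item] = old_value


def redo(db, entry):
    """REDO(WRITE_OP): set item X to its new value (AFIM)."""
    _, _txn, item, _old_value, new_value = entry
    db[item] = new_value


def riu_m(db, log):
    """UNDO/REDO recovery for immediate update."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    writes = [rec for rec in log if rec[0] == "write_item"]
    for rec in reversed(writes):          # undo in reverse log order
        if rec[1] not in committed:
            undo(db, rec)
    for rec in writes:                    # redo in forward log order
        if rec[1] in committed:
            redo(db, rec)


# Disk state at the crash: T2's first update of B was flushed,
# while T1's committed update of D was not.
db = {"B": 12, "D": 20}
log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "D", 20, 25),
    ("commit", "T1"),
    ("start_transaction", "T2"),
    ("write_item", "T2", "B", 15, 12),
    ("write_item", "T2", "B", 12, 9),
]
riu_m(db, log)
```

Undoing T2's two writes in reverse order restores `B` to 15. Undoing them in forward order would leave `B` at 12, which is why the reverse order is required.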
Shadow Paging

Shadow Paging
• Shadow paging considers the database to be made up of a
number of fixed-size disk pages (or disk blocks), say n, for
recovery purposes.
• A directory with n entries is constructed, where the ith
entry points to the ith database page on disk.
• The directory is kept in main memory if it is not too large,
and all references—reads or writes—to database pages on
disk go through it.
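• When a transaction begins executing, the current
directory, whose entries point to the most recent or
current database pages on disk, is copied into a shadow
directory. The shadow directory is then saved on disk while
the current directory is used by the transaction.
• During transaction execution, the shadow directory is never
modified. A write creates a new copy of the page at a different
disk location and updates only the current directory, so the old
page stays reachable through the shadow directory.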
[Figure: Example]
Shadow Paging
• To recover from a failure during transaction execution, it is
sufficient to free the modified database pages and to
discard the current directory.
• The state of the database before transaction execution is
available through the shadow directory, and that state is
recovered by reinstating the shadow directory.
• The database thus is returned to its state prior to the
transaction that was executing when the crash occurred,
and any modified pages are discarded.
• Committing a transaction corresponds to discarding the
previous shadow directory. Since recovery involves neither
undoing nor redoing data items, this technique can be
categorized as a NO-UNDO/NO-REDO technique for
recovery.
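The current and shadow directories can be sketched as below for a single transaction at a time. The class `ShadowPagedDB`, the integer disk slots, and the method names are assumptions made for the illustration. A dictionary stands in for the disk, and writing the shadow directory to disk is not modelled.

```python
class ShadowPagedDB:
    """Toy shadow paging: directories map page number i to a disk slot."""

    def __init__(self, pages):
        self.disk = dict(enumerate(pages))      # disk slot -> page contents
        self.current = list(range(len(pages)))  # i-th entry -> slot of page i
        self.shadow = None
        self.next_slot = len(pages)

    def begin(self):
        self.shadow = list(self.current)        # saved copy, never modified

    def write(self, i, contents):
        if self.current[i] == self.shadow[i]:   # first write to page i
            self.current[i] = self.next_slot    # use a fresh disk location
            self.next_slot += 1
        self.disk[self.current[i]] = contents   # the old page stays intact

    def read(self, i):
        return self.disk[self.current[i]]

    def _free(self, old, keep):
        for slot in set(old) - set(keep):
            del self.disk[slot]

    def commit(self):
        self._free(old=self.shadow, keep=self.current)  # discard the shadow
        self.shadow = None

    def recover(self):
        self._free(old=self.current, keep=self.shadow)  # free modified pages
        self.current = list(self.shadow)                # reinstate the shadow
        self.shadow = None


db = ShadowPagedDB(["p0", "p1", "p2"])
db.begin()
db.write(1, "p1-new")
db.recover()                 # failure before commit
after_crash = [db.read(i) for i in range(3)]

db.begin()
db.write(2, "p2-new")
db.commit()
after_commit = [db.read(i) for i in range(3)]
```

Neither `recover` nor `commit` touches page contents. Both only switch directories and free unreferenced slots, which is the sense in which the technique is NO-UNDO/NO-REDO.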
Disadvantages
• One disadvantage of shadow paging is that the
updated database pages change location on disk.
This makes it difficult to keep related database pages
close together on disk without complex storage
management strategies.
• Furthermore, if the directory is large, the overhead
of writing shadow directories to disk as transactions
commit is significant.
• A further complication is how to handle garbage
collection when a transaction commits.

ARIES Recovery Algorithm

Concepts
• ARIES is used in many IBM relational database products.
• ARIES uses a steal/no-force approach for writing, and it is based
on three concepts: write-ahead logging, repeating history during
redo, and logging changes during undo.
• The second concept, repeating history, means that ARIES will
retrace all actions of the database system prior to the crash to
reconstruct the database state when the crash occurred.
Transactions that were uncommitted at the time of the
crash (active transactions) are undone.
• The third concept, logging during undo, will prevent ARIES from
repeating the completed undo operations if a failure occurs
during recovery, which causes a restart of the recovery process.
Recovery Procedure
• The ARIES recovery procedure consists of three main
steps: analysis, REDO, and UNDO.
• The analysis step identifies the dirty (updated) pages in
the buffer and the set of transactions active at the
time of the crash. The appropriate point in the log
where the REDO operation should start is also
determined.
• The REDO phase actually reapplies updates from the
log to the database. Information stored in the log and in the
data pages lets ARIES determine whether an update has already
been applied, so only the necessary REDO operations are applied
during recovery.
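The analysis step can be sketched in simplified form as a forward scan of the log records written after the last checkpoint. The scan updates a transaction table and a dirty page table and takes the smallest LSN in the dirty page table as the REDO start point. The record layout `(lsn, kind, txn, page)`, the function name `analysis`, and the sample values are assumptions for illustration. Real ARIES tracks more information than this.

```python
def analysis(log, txn_table, dirty_pages):
    """Simplified ARIES analysis pass.

    log: (lsn, kind, txn, page) records written after the last checkpoint.
    txn_table: txn -> (last_lsn, status) as saved at the checkpoint.
    dirty_pages: page -> LSN of the earliest update that dirtied the page.
    """
    txn_table, dirty_pages = dict(txn_table), dict(dirty_pages)
    for lsn, kind, txn, page in log:
        if kind == "end":
            txn_table.pop(txn, None)           # transaction is finished
            continue
        status = "commit" if kind == "commit" else "in progress"
        txn_table[txn] = (lsn, status)         # add or update the entry
        if kind == "update":
            dirty_pages.setdefault(page, lsn)  # keep the earliest LSN
    redo_start = min(dirty_pages.values(), default=None)
    return txn_table, dirty_pages, redo_start


# Illustrative tables as they might look at a checkpoint.
txn_table = {"T1": (3, "commit"), "T2": (2, "in progress")}
dirty_pages = {"C": 1, "B": 2}
log = [
    (6, "update", "T3", "A"),
    (7, "update", "T2", "C"),
    (8, "commit", "T2", None),
]
tt, dpt, start = analysis(log, txn_table, dirty_pages)
```

In this example `T3` is the only transaction still in progress after the scan, so it is the one the UNDO phase would roll back, and REDO would start at LSN 1.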

Recovery Procedure
• Finally, during the UNDO phase, the log is scanned
backward and the operations of transactions that were
active at the time of the crash are undone in reverse order.
• The information needed for ARIES to accomplish its
recovery procedure includes the log, the Transaction Table,
and the Dirty Page Table. Additionally, checkpointing is
used.
• These tables are maintained by the transaction manager
and written to the log during checkpointing.
• In ARIES, every log record has an associated log sequence
number (LSN) that is monotonically increasing and indicates
the address of the log record on disk.
Log Sequence Number
Each LSN corresponds to a specific change (action) of some
transaction. Also, each data page will store the LSN of the
latest log record corresponding to a change for that page.
A log record is written for any of the following actions:
updating a page (write), committing a transaction (commit),
aborting a transaction (abort), undoing an update (undo), and
ending a transaction (end). The need for including the first
three actions in the log has been discussed, but the last two
need some explanation.
When an update is undone, a compensation log record is
written in the log. When a transaction ends, whether by
committing or aborting, an end log record is written.
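Because each data page stores the LSN of the latest change applied to it, the REDO pass can skip updates that already reached the page. A simplified sketch of that test follows. The record layout `(lsn, kind, txn, page, value)` and the function name `redo_phase` are assumptions. The sketch leaves out the Dirty Page Table check and the redoing of compensation log records that full ARIES performs.

```python
def redo_phase(log, pages, page_lsn):
    """Reapply logged updates whose effect is missing from the page."""
    applied = []
    for lsn, kind, _txn, page, value in log:
        if kind != "update":
            continue
        if page_lsn.get(page, 0) >= lsn:
            continue               # the page already reflects this change
        pages[page] = value        # repeat history
        page_lsn[page] = lsn       # the page now carries this record's LSN
        applied.append(lsn)
    return applied


# Page C was flushed after LSN 7; page A was never flushed.
pages = {"A": "a0", "C": "c7"}
page_lsn = {"A": 0, "C": 7}
log = [
    (1, "update", "T1", "C", "c1"),
    (6, "update", "T3", "A", "a6"),
    (7, "update", "T2", "C", "c7"),
    (8, "commit", "T2", None, None),
]
applied = redo_phase(log, pages, page_lsn)
```

Only the update with LSN 6 is reapplied. Both updates to page `C` are skipped because the page's stored LSN shows they are already on disk.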
An example of recovery in ARIES
[Figure]
(a) The log at point of crash.
(b) The Transaction and Dirty Page Tables at time of checkpoint.
(c) The Transaction and Dirty Page Tables after the analysis
phase.
