Advanced Database MS 5

Advanced Database System
CHAPTER 5
DATABASE RECOVERY TECHNIQUES
Recovery Outline and Categorization of Recovery Algorithms
Recovery from transaction failures usually means that the database is restored to the
most recent consistent state just before the time of failure.
• To do this, the system must keep information about the changes that were applied to
data items by the various transactions. This information is typically kept in the system
log.
A typical strategy for recovery may be summarized informally as follows:
1. If there is extensive damage to a wide portion of the database due to
catastrophic failure, such as a disk crash, the recovery method restores a past
copy of the database that was backed up to archival storage and reconstructs a more
current state by reapplying or redoing the operations of committed transactions
from the backed-up log, up to the time of failure.
2. When the database on disk is not physically damaged, and a non-catastrophic
failure has occurred, the recovery strategy is to identify any changes that may cause
an inconsistency in the database.
  • For non-catastrophic failure, the recovery protocol does not need a
  complete archival copy of the database. Rather, the entries kept in the
  online system log on disk are analyzed to determine the appropriate actions
  for recovery.
Conceptually, we can distinguish two main techniques for recovery from non-
catastrophic transaction failures:
1. Deferred update and
2. Immediate update.
Deferred Update
The deferred update techniques do not physically update the database on disk until after a
transaction reaches its commit point; then the updates are recorded in the database.
• Before reaching commit, all transaction updates are recorded in the local transaction
workspace or in the main memory buffers that the DBMS maintains.
• Before commit, the updates are recorded persistently in the log, and then after commit,
the updates are written to the database on disk.
If a transaction fails before reaching its commit point, it will not have changed the database in
any way, so UNDO is not needed.
It may be necessary to REDO the effect of the operations of a committed transaction from the
log, because their effect may not yet have been recorded in the database on disk. Hence, deferred
update is also known as the NO-UNDO/REDO algorithm.
AUWC School of Technology and Informatics Page 1

Immediate Update
In the immediate update techniques, the database may be updated by some operations
of a transaction before the transaction reaches its commit point.
• However, these operations must also be recorded in the log on disk by force-writing
before they are applied to the database on disk, making recovery still possible.
If a transaction fails after recording some changes in the database on disk but before reaching
its commit point, the effect of its operations on the database must be undone; that is, the
transaction must be rolled back.
In the general case of immediate update, both undo and redo may be required during
recovery. This technique, known as the UNDO/REDO algorithm, is the one used most
often in practice.
A variation of the algorithm where all updates are required to be recorded in the database on
disk before a transaction commits requires undo only, so it is known as the UNDO/NO-REDO
algorithm.
Caching (Buffering) of Disk Blocks
Typically, multiple disk pages that include the data items to be updated are cached into
main memory buffers and then updated in memory before being written back to disk.
• The caching of disk pages is traditionally an operating system function, but
because of its importance to the efficiency of recovery procedures, it is handled by
the DBMS by calling low-level operating system routines.
When the DBMS requests action on some item:
1. First, it checks the cache directory to determine whether the disk page containing the
item is in the DBMS cache.
2. Second, if it is not, the item must be located on disk, and the appropriate disk pages are
copied into the cache.
It may be necessary to replace (or flush) some of the cache buffers to make space available for
the new item.
A page replacement strategy similar to those used in operating systems, such as least recently
used (LRU) or first-in-first-out (FIFO), or a DBMS-specific strategy such as DBMIN or
Least-Likely-to-Use, can be used to select the buffers for replacement.
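The lookup-then-replace procedure above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the class name BufferCache, its methods, and the dictionary standing in for the disk are all hypothetical, LRU is the only replacement strategy shown, and the write-ahead logging check that a real DBMS performs before flushing is left out.

```python
from collections import OrderedDict


class BufferCache:
    """Toy DBMS cache (hypothetical): page_id -> page contents, LRU replacement."""

    def __init__(self, capacity, disk):
        self.capacity = capacity
        self.disk = disk              # dict standing in for the pages on disk
        self.frames = OrderedDict()   # cache directory; least recently used first
        self.dirty = set()            # pages modified since they were loaded

    def get_page(self, page_id):
        if page_id in self.frames:             # step 1: check the cache directory
            self.frames.move_to_end(page_id)   # mark as most recently used
            return self.frames[page_id]
        if len(self.frames) >= self.capacity:  # make space for the new page
            self._evict()
        page = dict(self.disk[page_id])        # step 2: copy the page from disk
        self.frames[page_id] = page
        return page

    def write_item(self, page_id, item, value):
        page = self.get_page(page_id)
        page[item] = value                     # update in memory only
        self.dirty.add(page_id)

    def _evict(self):
        victim, page = self.frames.popitem(last=False)  # least recently used
        if victim in self.dirty:               # flush a modified buffer to disk
            self.disk[victim] = page
            self.dirty.discard(victim)
```

With a capacity of two pages, writing to page 1 and then touching pages 2 and 3 forces page 1 out of the cache, and its modified contents are written back to the stand-in disk at that moment.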
Two main strategies can be employed when flushing a modified buffer back to disk.
1. In-place updating writes the buffer to the same original disk location, thus
overwriting the old value of any changed data items on disk.
2. Shadowing writes an updated buffer at a different disk location, so multiple
versions of data items can be maintained.
In general, the old value of the data item before updating is called the before image
(BFIM), and the new value after updating is called the after image (AFIM).

If shadowing is used, both the BFIM and the AFIM can be kept on disk; hence, it is not
strictly necessary to maintain a log for recovery.
Write-Ahead Logging, Steal/No-Steal, and Force/No-Force
When in-place updating is used, it is necessary to use a log for recovery.

In this case, the recovery mechanism must ensure that the BFIM of the data item is
recorded in the appropriate log entry and that the log entry is flushed to disk before the
BFIM is overwritten with the AFIM in the database on disk. This process is generally
known as write-ahead logging.
There are two types of log entry information included for a write command:
1. The information needed for UNDO. An UNDO-type log entry includes the old value
(BFIM) of the item, since this is needed to undo the effect of the operation from the log (by
setting the item value in the database back to its BFIM).
2. The information needed for REDO. A REDO-type log entry includes the new value
(AFIM) of the item written by the operation, since this is needed to redo the effect of the
operation from the log (by setting the item value in the database on disk to its AFIM).
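As a rough illustration of the two kinds of information, the sketch below keeps both the BFIM and the AFIM in one log record and appends that record before the item is changed, which is the ordering that write-ahead logging requires. The record layout and the function names are assumptions made for this example, and a Python list stands in for the log on disk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteLogRecord:
    """Hypothetical log entry for one write operation."""
    txn_id: str
    item: str
    bfim: int   # before image: old value, needed for UNDO
    afim: int   # after image: new value, needed for REDO


def write_item(db, log, txn_id, item, new_value):
    """Append the log record first, then change the item (write-ahead order)."""
    log.append(WriteLogRecord(txn_id, item, db[item], new_value))
    db[item] = new_value


def undo(db, rec):
    db[rec.item] = rec.bfim   # set the item back to its before image


def redo(db, rec):
    db[rec.item] = rec.afim   # set the item to its after image
```

Both undo and redo simply install a stored value, so applying either one twice leaves the item in the same state as applying it once. That property matters if recovery itself is interrupted and has to be restarted.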
• Standard DBMS recovery terminology includes the terms steal/no-steal and force/no-
force, which specify the rules that govern when a page from the database can be
written to disk from the cache:
1. If a cache buffer page updated by a transaction cannot be written to disk before the
transaction commits, the recovery method is called a no-steal approach.
  • On the other hand, if the recovery protocol allows writing an updated
  buffer before the transaction commits, it is called a steal approach.
2. If all pages updated by a transaction are immediately written to disk before the
transaction commits, it is called a force approach. Otherwise, it is called no-force.
  • The force rule means that REDO will never be needed during recovery,
  since any committed transaction will have all its updates on disk before
  it is committed.
The deferred update (NO-UNDO) recovery scheme follows a no-steal approach. However,
typical database systems employ a steal/no-force strategy.
• The advantage of steal is that it avoids the need for a very large buffer space to store all
updated pages in memory.
• The advantage of no-force is that an updated page of a committed transaction may still
be in the buffer when another transaction needs to update it, thus eliminating the I/O
cost to write that page multiple times to disk, and possibly to have to read it again from
disk.
  • This may provide a substantial saving in the number of disk I/O
  operations when a specific page is updated heavily by multiple
  transactions.
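The relationship between the two buffer policies and the recovery work they imply can be summarized in a small function. The function name and its boolean parameters are made up for this sketch; the mapping itself follows the rules stated above (no-steal removes the need for UNDO, force removes the need for REDO).

```python
def recovery_needs(steal: bool, force: bool) -> str:
    """Name the recovery algorithm class implied by a buffer policy (illustrative)."""
    # With steal, uncommitted changes may already be on disk, so UNDO may be needed.
    undo = "UNDO" if steal else "NO-UNDO"
    # With force, committed changes are always on disk, so REDO is never needed.
    redo = "NO-REDO" if force else "REDO"
    return f"{undo}/{redo}"
```

For example, recovery_needs(steal=True, force=False) gives the UNDO/REDO class used by typical systems, while recovery_needs(steal=False, force=False) gives the NO-UNDO/REDO class of deferred update.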

Transaction Actions That Do Not Affect the Database
In general, a transaction will have actions that do not affect the database, such as
generating and printing messages or reports from information retrieved from the
database.
If a transaction fails before completion, we may not want the user to get these reports,
since the transaction has failed to complete.
• If such erroneous reports are produced, part of the recovery process would have
to inform the user that these reports are wrong, since the user may take an
action based on these reports that affects the database. Hence, such reports
should be generated only after the transaction reaches its commit point.
A common method of dealing with such actions is to issue the commands that generate the
reports but keep them as batch jobs, which are executed only after the transaction
reaches its commit point. If the transaction fails, the batch jobs are canceled.
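A minimal sketch of this batch-job idea is shown below. The class ReportingTransaction and its attributes are hypothetical, and a list called output stands in for whatever printer or report channel a real system would use.

```python
class ReportingTransaction:
    """Holds report requests back until the transaction commits (illustrative)."""

    def __init__(self):
        self.pending_jobs = []   # report jobs queued during execution
        self.output = []         # stand-in for reports actually delivered

    def request_report(self, text):
        self.pending_jobs.append(text)          # queue the job, do not run it yet

    def commit(self):
        self.output.extend(self.pending_jobs)   # run the batch jobs after commit
        self.pending_jobs.clear()

    def abort(self):
        self.pending_jobs.clear()               # cancel the queued jobs
```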
NO-UNDO/REDO Recovery Based on Deferred Update
The idea behind deferred update is to defer or postpone any actual updates to the
database on disk until the transaction completes its execution successfully and reaches
its commit point.
During transaction execution, the updates are recorded only in the log and in the cache
buffers.
We can state a typical deferred update protocol as follows:
1. A transaction cannot change the database on disk until it reaches its commit point.
2. A transaction does not reach its commit point until all its REDO-type log entries
are recorded in the log and the log buffer is force-written to disk.
Notice that step 2 of this protocol is a restatement of the write-ahead logging (WAL)
protocol.
• Because the database is never updated on disk until after the transaction commits,
there is never a need to UNDO any operations.
• REDO is needed in case the system fails after a transaction commits but before all
its changes are recorded in the database on disk.
  • In this case, the transaction operations are redone from the log entries
  during recovery.
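A recovery pass for this scheme might look like the following sketch. The tuple layout of the log entries, ("write", transaction, item, AFIM) and ("commit", transaction), is an assumption made for the example, and a dictionary stands in for the database on disk.

```python
def recover_deferred(db, log):
    """NO-UNDO/REDO recovery sketch: redo the writes of committed transactions."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    for rec in log:                       # forward scan, in log order
        if rec[0] == "write" and rec[1] in committed:
            _, _, item, afim = rec
            db[item] = afim               # REDO: install the after image
    # Writes of uncommitted transactions are ignored: under deferred update
    # they never reached the database on disk, so there is nothing to undo.
    return db
```

Only REDO-type information (the AFIM) appears in the log entries here, which matches step 2 of the protocol: the before image is never needed.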
For multiuser systems with concurrency control, the concurrency control and recovery
processes are interrelated.
• Consider a system in which concurrency control uses strict two-phase locking, so the
locks on items remain in effect until the transaction reaches its commit point. After that,
the locks can be released. This ensures strict and serializable schedules.
If a transaction is aborted for any reason (say, by the deadlock detection method), it is
simply resubmitted, since it has not changed the database on disk.
• A drawback of the method described here is that it limits the concurrent
execution of transactions because all write-locked items remain locked until the
transaction reaches its commit point.
• Additionally, it may require excessive buffer space to hold all updated items until the
transactions commit.
The method’s main benefit is that transaction operations never need to be undone, for two
reasons:
1. A transaction does not record any changes in the database on disk until after it reaches
its commit point—that is, until it completes its execution successfully. Hence, a
transaction is never rolled back because of failure during transaction execution.
2. A transaction will never read the value of an item that is written by an uncommitted
transaction, because items remain locked until a transaction reaches its commit point.
Hence, no cascading rollback will occur.
Recovery Techniques Based on Immediate Update
In these techniques, when a transaction issues an update command, the database on disk
can be updated immediately, without any need to wait for the transaction to reach its
commit point.
• Notice that it is not a requirement that every update be applied immediately to
disk; it is just possible that some updates are applied to disk before the
transaction commits.
Theoretically, we can distinguish two main categories of immediate update algorithms.
1. If the recovery technique ensures that all updates of a transaction are recorded in
the database on disk before the transaction commits, there is never a need to REDO
any operations of committed transactions. This is called the UNDO/NO-REDO recovery
algorithm.
  • Because REDO must never be needed, this method must utilize the force
  strategy for deciding when updated main memory buffers are written back
  to disk.
2. If the transaction is allowed to commit before all its changes are written to the
database, we have the most general case, known as the UNDO/REDO recovery
algorithm. In this case, the steal/no-force strategy is applied.
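For the general UNDO/REDO case, recovery can be sketched as a backward pass followed by a forward pass over the log. The log layout, ("write", transaction, item, BFIM, AFIM) and ("commit", transaction), is assumed for this example, and the sketch also assumes strict schedules, so that no transaction overwrites an item last written by an uncommitted transaction.

```python
def recover_immediate(db, log):
    """UNDO/REDO recovery sketch for immediate update (strict schedules assumed)."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    # UNDO: scan the log backward and restore the before image of every
    # write made by a transaction that did not commit.
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            db[rec[2]] = rec[3]
    # REDO: scan the log forward and reinstall the after image of every
    # write made by a committed transaction.
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            db[rec[2]] = rec[4]
    return db
```

Under an UNDO/NO-REDO variant the second loop would be unnecessary, because the force strategy guarantees that committed updates are already on disk.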
Shadow Paging
This recovery scheme does not require the use of a log in a single-user environment.
In a multiuser environment, a log may be needed for the concurrency control method.
• Shadow paging considers the database to be made up of a number of fixed-size
disk pages (or disk blocks), say n, for recovery purposes.
A directory with n entries is constructed, where the ith entry points to the ith database
page on disk.
The directory is kept in main memory if it is not too large, and all references (reads or
writes) to database pages on disk go through it.
When a transaction begins executing, the current directory—whose entries point to the
most recent or current database pages on disk—is copied into a shadow directory.

The shadow directory is then saved on disk while the current directory is used by the
transaction.
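During transaction execution, the shadow directory is never modified. When a write_item
operation is performed, a new copy of the modified database page is created, but the old
copy of that page is not overwritten. Instead, the new page is written elsewhere on disk, and
the current directory entry is changed to point to it, while the shadow directory entry still
points to the old page.
To recover from a failure during transaction execution, it is sufficient to free the modified
database pages and discard the current directory; the shadow directory is reinstated in its place.
The database thus is returned to its state prior to the transaction that was executing when
the crash occurred, and any modified pages are discarded. Committing a transaction
corresponds to discarding the previous shadow directory. Since recovery involves neither
undoing nor redoing data items, this technique can be categorized as a NO-UNDO/NO-REDO
technique for recovery.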
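The directory mechanics can be illustrated with a small single-user sketch. Everything here is hypothetical: the class ShadowPagedDB, its method names, and the dictionary of numbered slots standing in for physical disk pages. A transaction is assumed to call begin() before its first write.

```python
class ShadowPagedDB:
    """Single-user shadow paging sketch; slots stand in for physical disk pages."""

    def __init__(self, pages):
        self.disk = dict(enumerate(pages))       # slot number -> page contents
        self.current = list(range(len(pages)))   # current directory
        self.shadow = None                       # shadow directory
        self.next_slot = len(pages)

    def begin(self):
        self.shadow = list(self.current)         # saved copy, never modified

    def read_page(self, i):
        return self.disk[self.current[i]]

    def write_page(self, i, data):
        if self.current[i] == self.shadow[i]:    # first write to this page
            self.current[i] = self.next_slot     # point at a fresh slot
            self.next_slot += 1
        self.disk[self.current[i]] = data        # the old copy is left untouched

    def commit(self):
        self._free(self.shadow, self.current)    # old versions are no longer needed
        self.shadow = None                       # discard the shadow directory

    def recover(self):
        self._free(self.current, self.shadow)    # discard the modified pages
        self.current = list(self.shadow)         # reinstate the shadow directory

    def _free(self, old_dir, keep_dir):
        for slot in set(old_dir) - set(keep_dir):
            del self.disk[slot]
```

Neither commit() nor recover() touches the contents of a surviving page; each one only chooses which directory stays in force, which is why no undo or redo step is needed.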
The ARIES Recovery Algorithm
ARIES is used in many relational database-related products of IBM.

ARIES uses a steal/no-force approach for writing, and it is based on three concepts:
write-ahead logging, repeating history during redo, and logging changes during undo.
1. The first concept, write-ahead logging, requires that the log entry for an update be
flushed to disk before the updated page overwrites the old value in the database on disk.
2. The second concept, repeating history, means that ARIES will retrace all actions of the
database system prior to the crash to reconstruct the database state when the crash occurred.
a. Transactions that were uncommitted at the time of the crash (active transactions)
are then undone.
3. The third concept, logging during undo, will prevent ARIES from repeating the completed
undo operations if a failure occurs during recovery, which causes a restart of the recovery
process.
The ARIES recovery procedure consists of three main steps: analysis, REDO, and UNDO.
1. Analysis: The analysis step identifies the dirty (updated) pages in the buffer and the
set of transactions active at the time of the crash. The appropriate point in the log where
the REDO operation should start is also determined.
2. REDO: The REDO phase actually reapplies updates from the log to the database.
Generally, the REDO operation is applied only to committed transactions. In ARIES,
however, this is not the case: starting from the point found during analysis, REDO
reapplies the logged updates that are not yet reflected on disk, up to the end of the log,
including those of uncommitted transactions. This is what repeating history means.
3. UNDO: The log is scanned backward, and the operations of transactions that were
active at the time of the crash are undone in reverse order.
• In ARIES, every log record has an associated log sequence number (LSN)
that is monotonically increasing and indicates the address of the log record
on disk.
• Each LSN corresponds to a specific change (action) of some transaction.
• Also, each data page will store the LSN of the latest log record corresponding
to a change for that page.
A log record is written for any of the following actions: updating a page (write),
committing a transaction (commit), aborting a transaction (abort), undoing an
update (undo), and ending a transaction (end).
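The role of the LSN stored on each page can be shown with a simplified REDO pass. This is a sketch, not the full ARIES procedure: the analysis step and the dirty page information it produces are omitted, the starting LSN is passed in as a parameter, and the record and page layouts are assumptions made for the example.

```python
def redo_pass(pages, log, redo_start_lsn):
    """Simplified ARIES-style REDO using page LSNs (illustrative only).

    pages: page_id -> {"page_lsn": int, "data": dict}
    log:   list of (lsn, page_id, item, afim) update records in LSN order
    """
    applied = []
    for lsn, page_id, item, afim in log:
        if lsn < redo_start_lsn:
            continue                     # before the REDO starting point
        page = pages[page_id]
        if page["page_lsn"] >= lsn:
            continue                     # this change is already on the page
        page["data"][item] = afim        # repeat history: reapply the update
        page["page_lsn"] = lsn           # record the latest change applied
        applied.append(lsn)
    return applied
```

Comparing the page's stored LSN with the log record's LSN is what lets REDO skip updates that reached disk before the crash, so running the pass a second time applies nothing.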

Recovery in Multidatabase Systems
A single transaction, called a multidatabase transaction, may require access to multiple
databases. These databases may even be stored on different types of DBMSs; for example, some
DBMSs may be relational, whereas others are object oriented, hierarchical, or network DBMSs.
• In such a case, each DBMS involved in the multidatabase transaction may have its
own recovery technique and transaction manager separate from those of the other
DBMSs.
• This situation is somewhat similar to the case of a distributed database
management system, where parts of the database reside at different sites that are
connected by a communication network.
To maintain the atomicity of a multidatabase transaction, it is necessary to have a two-
level recovery mechanism.
A global recovery manager, or coordinator, is needed to maintain information needed for
recovery, in addition to the local recovery managers and the information they maintain (log,
tables).
The coordinator usually follows a protocol called the two-phase commit protocol, whose
two phases can be stated as follows:
• Phase 1. When all participating databases signal the coordinator that the part of the
multidatabase transaction involving each has concluded, the coordinator sends a
"prepare for commit" message to each participant to get ready for committing the transaction.
  • Each participating database receiving that message will force-write all log
  records and needed information for local recovery to disk and then send a
  "ready to commit" or "OK" signal to the coordinator.
  • If the force-writing to disk fails or the local transaction cannot commit for some
  reason, the participating database sends a "cannot commit" or "not OK" signal to the
  coordinator.
  • If the coordinator does not receive a reply from the database within a certain
  timeout interval, it assumes a "not OK" response.
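• Phase 2. If all participating databases reply OK, and the coordinator's vote is also OK, the
transaction is successful, and the coordinator sends a commit signal for the transaction to
the participating databases. If any participant or the coordinator has a not OK response, the
transaction has failed, and the coordinator sends a signal to roll it back to each participant.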
The net effect of the two-phase commit protocol is that either all participating databases
commit the effect of the transaction or none of them do.
• If any of the participants, or the coordinator, fails, it is always possible to
recover to a state where either the transaction is committed or it is rolled back.
A failure during or before Phase 1 usually requires the transaction to be rolled back,
whereas a failure during Phase 2 means that a successful transaction can recover and
commit.
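The message flow of the protocol can be sketched as follows. The Participant class and the function two_phase_commit are hypothetical, messages are modeled as plain method calls, and an exception raised by prepare() stands in for a failed or timed-out reply. Coordinator logging and crash recovery are not modeled.

```python
class Participant:
    """Stand-in for one participating database (illustrative)."""

    def __init__(self, can_commit=True):
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # A real participant would force-write its log records to disk here.
        if self.can_commit:
            self.state = "prepared"
        return self.can_commit           # True means OK, False means not OK

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled back"


def two_phase_commit(participants, coordinator_ok=True):
    # Phase 1: send "prepare for commit" to everyone and collect the votes.
    votes = []
    for p in participants:
        try:
            votes.append(p.prepare())
        except Exception:                # failure or timeout counts as not OK
            votes.append(False)
    # Phase 2: commit only if every vote, including the coordinator's, is OK.
    decision = coordinator_ok and all(votes)
    for p in participants:
        if decision:
            p.commit()
        else:
            p.rollback()
    return "commit" if decision else "rollback"
```

A single not OK vote is enough to roll back every participant, which is how the protocol achieves the all-or-nothing effect described above.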

Database Backup and Recovery from Catastrophic Failures
So far, all the techniques we have discussed apply to non-catastrophic failures.

A key assumption has been that the system log is maintained on the disk and is not lost as
a result of the failure.
Similarly, the shadow directory must be stored on disk to allow recovery when shadow
paging is used.
• The recovery techniques we have discussed use the entries in the system log or the
shadow directory to recover from failure by bringing the database back to a
consistent state.
The recovery manager of a DBMS must also be equipped to handle more catastrophic
failures such as disk crashes.
• The main technique used to handle such crashes is a database backup, in which the
whole database and the log are periodically copied onto a cheap storage medium such
as magnetic tapes or other large capacity offline storage devices.
In case of a catastrophic system failure, the latest backup copy can be reloaded from the
tape to the disk, and the system can be restarted.
Data from critical applications such as banking, insurance, stock market, and other
databases is periodically backed up in its entirety and moved to physically separate safe
locations.
• To avoid losing all the effects of transactions that have been executed since the last
backup, it is customary to back up the system log at more frequent intervals than full
database backup by periodically copying it to magnetic tape.
The system log is usually substantially smaller than the database itself and hence can be
backed up more frequently. Therefore, users do not lose all transactions they have
performed since the last database backup.
All committed transactions recorded in the portion of the system log that has been backed
up to tape can have their effect on the database redone.
• A new log is started after each database backup.
• Hence, to recover from disk failure, the database is first recreated on disk from
its latest backup copy on tape.
Following that, the effects of all the committed transactions whose operations have been
recorded in the backed-up copies of the system log are reconstructed.
