Concurrency Problems:: Transactions

UNIT -V

Concurrency: Concurrency is the ability of a database to let multiple users run multiple transactions at the same time.
Concurrency Problems:
 When multiple transactions execute concurrently in an uncontrolled or unrestricted manner, then it
might lead to several problems.
 Such problems are called concurrency problems.
The concurrency problems are-

1. Dirty Read Problem-


Reading data written by an uncommitted transaction is called a dirty read.

This read is called a dirty read because-

 There is always a chance that the uncommitted transaction might roll back later.
 Thus, the uncommitted transaction might make other transactions read a value that never becomes part of the committed database.
 This can leave the database inconsistent.
 A dirty read does not always lead to inconsistency.
 It becomes problematic only when the uncommitted transaction fails and rolls back later for some reason.
Example-

DBMS B.Tech II Year IV Sem. Page | 1


Here,

1. T1 reads the value of A.
2. T1 updates the value of A in the buffer.
3. T2 reads the value of A from the buffer.
4. T2 writes the updated value of A.
5. T2 commits.
6. T1 fails in later stages and rolls back.
In this example,

 T2 reads the dirty value of A written by the uncommitted transaction T1.
 T1 fails in later stages and rolls back.
 Thus, the value that T2 read now stands to be incorrect.
 Therefore, the database becomes inconsistent.
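
The six steps above can be traced with a minimal Python sketch. The `buffer` and `database` dictionaries, the starting value 10, and the update of +5 are assumptions made for illustration; they are not part of the example.

```python
# Illustrative trace of a dirty read; all values are assumed.
database = {"A": 10}         # committed state
buffer = dict(database)      # shared working copy, no concurrency control

# Steps 1-2: T1 reads A and updates it in the buffer (not yet committed).
t1_before_image = buffer["A"]
buffer["A"] = t1_before_image + 5

# Steps 3-5: T2 reads the uncommitted value, writes it, and commits.
t2_read = buffer["A"]
database["A"] = t2_read

# Step 6: T1 fails and rolls back, restoring its before-image in the buffer.
buffer["A"] = t1_before_image

# The committed database now holds a value derived from a write
# that was never committed.
dirty = database["A"] != t1_before_image
```

Here `dirty` ends up true: T2 has committed 15, a value that T1's rollback says never existed.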

2. Unrepeatable Read Problem-


This problem occurs when a transaction reads different values of the same variable in its different read operations, even though it has not updated the value itself.

Example-

Here,

1. T1 reads the value of X (= 10 say).
2. T2 reads the value of X (= 10).
3. T1 updates the value of X (from 10 to 15 say) in the buffer.
4. T2 again reads the value of X (but = 15).
In this example,

 T2 gets to read a different value of X in its second reading.
 T2 wonders how the value of X got changed because according to it, it is running in isolation.

3. Lost Update Problem-


This problem occurs when multiple transactions execute concurrently and updates from one or more
transactions get lost.
Example-

Here,

1. T1 reads the value of A (= 10 say).
2. T1 updates the value of A (= 15 say) in the buffer.
3. T2 does blind write A = 25 (write without read) in the buffer.
4. T2 commits.
5. When T1 commits, it writes A = 25 in the database.

In this example,

 T1 writes the overwritten value of A in the database.
 Thus, the update from T1 gets lost.
 This problem occurs whenever there is a write-write conflict.
 In a write-write conflict, there are two writes, one by each transaction, on the same data item without any read in the middle.
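
A minimal Python sketch of this write-write conflict follows. It assumes that the update to 15 is made by T1 (as the conclusion that T1's update is lost implies) and models the buffer and the database as plain dictionaries, which is an illustration only.

```python
# Illustrative trace of a lost update; the dictionaries are assumed stand-ins.
database = {"A": 10}
buffer = dict(database)

t1_local = buffer["A"]        # T1 reads A = 10
buffer["A"] = 15              # T1 updates A to 15 in the buffer
buffer["A"] = 25              # T2 blind-writes A = 25 without reading A
database["A"] = buffer["A"]   # T2 commits
database["A"] = buffer["A"]   # T1 commits: the buffer it flushes holds 25

# T1 intended to store 15, but the committed value is T2's 25.
t1_update_lost = database["A"] != 15
```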

4. Phantom Read Problem-


This problem occurs when a transaction reads some variable from the buffer and when it reads the same
variable later, it finds that the variable does not exist.

Example-



Here,

1. T1 reads X.
2. T2 reads X.
3. T1 deletes X.
4. T2 tries reading X but does not find it.
In this example,

 T2 finds that there does not exist any variable X when it tries reading X again.
 T2 wonders who deleted the variable X because according to it, it is running in isolation.

Concurrency Control

When multiple transactions try to access the same sharable resource, many problems can arise if access is not controlled properly. There are some important mechanisms by which access control can be maintained.

Concurrency Control Protocols help to prevent the occurrence of above problems and maintain the
consistency of the database. The practical concept of this can be implemented by
using Locks and Timestamps. Locks and Timestamps can be used to provide an environment in which
concurrent transactions can preserve their Consistency and Isolation properties.

Concurrency Control Protocols


Different concurrency control protocols offer different trade-offs between the amount of concurrency they allow and the amount of overhead that they impose.

 Lock-Based Protocols
 Two-Phase Locking Protocol
 Timestamp-Based Protocols
 Validation-Based Protocols

Lock Based Protocol


 A lock is a mechanism that tells the DBMS whether a particular data item is being used by any transaction for read/write purposes. Since there are two types of operations, i.e. read and write, whose basic natures are different, the locks for read and write operations may behave differently.
 Read operations performed by different transactions on the same data item pose less of a challenge. The value of the data item, if constant, can be read by any number of transactions at any given time.
 A write operation is different. When a transaction writes a value into a data item, the content of that data item remains in an inconsistent state from the moment the write begins until the moment it is over. If we allow any other transaction to read/write the value of the data item during the write operation, that transaction will read an inconsistent value or overwrite the value being written by the first transaction. In both cases anomalies will creep into the database.
 The simple rule for locking can be derived from here. If a transaction is reading the content of a sharable data item, then any number of other transactions can be allowed to read the content of the same data item. But if any transaction is writing into a sharable data item, then no other transaction will be allowed to read or write that same data item.

We can classify the locks into two types.

Shared Lock: A transaction may acquire shared lock on a data item in order to read its content. The
lock is shared in the sense that any other transaction can acquire the shared lock on that same data
item for reading purpose.

Exclusive Lock: A transaction may acquire an exclusive lock on a data item in order to both read and write it. The lock is exclusive in the sense that no other transaction can acquire any kind of lock (either shared or exclusive) on that same data item.

The relationship between Shared and Exclusive Lock can be represented by the following table which is
known as Lock Matrix.

LOCKS        Shared    Exclusive
Shared       TRUE      FALSE
Exclusive    FALSE     FALSE
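
The Lock Matrix can be written down directly as a lookup table. The sketch below is only an illustration; the names `COMPATIBLE` and `can_grant` and the mode codes "S" and "X" are assumptions, not part of any DBMS interface.

```python
# Lock Matrix as a lookup: COMPATIBLE[held][requested]
COMPATIBLE = {
    "S": {"S": True, "X": False},
    "X": {"S": False, "X": False},
}

def can_grant(requested, held_modes):
    """Return True if `requested` is compatible with every lock already held."""
    return all(COMPATIBLE[held][requested] for held in held_modes)
```

For example, `can_grant("S", ["S", "S"])` is true, while `can_grant("X", ["S"])` and `can_grant("S", ["X"])` are both false.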

In a transaction, a data item that we want to read/write should first be locked before the read/write is done. After the operation is over, the transaction should unlock the data item so that other transactions can lock it for their own use. In the earlier chapter we saw a transaction that transfers Rs 100/- from account A to account B. The transaction should now be written as follows:

Lock-X (A); (Exclusive Lock, we want to both read A’s value and modify it)
Read A;
A = A – 100;
Write A;
Unlock (A); (Unlocking A after the modification is done)
Lock-X (B); (Exclusive Lock, we want to both read B’s value and modify it)
Read B;
B = B + 100;
Write B;
Unlock (B); (Unlocking B after the modification is done)
And the transaction that deposits 10% of the amount in account A into account C should now be written as:

Lock-S (A); (Shared Lock, we only want to read A’s value)
Read A;
Temp = A * 0.1;
Unlock (A); (Unlocking A)
Lock-X (C); (Exclusive Lock, we want to both read C’s value and modify it)
Read C;
C = C + Temp;
Write C;
Unlock (C); (Unlocking C after the modification is done)

Without locks, the two transactions could be interleaved as follows:

T1                      T2
Read A;
A = A - 100;
                        Read A;
                        Temp = A * 0.1;
                        Read C;
                        C = C + Temp;
                        Write C;
Write A;
Read B;
B = B + 100;
Write B;

The error here, which we detected by common sense alone, is that the context switch is performed before the new value of A has been written. T2 reads the old value of A, and thus deposits a wrong amount in C. Had we used the locking mechanism, this error could never have occurred. Let us rewrite the schedule using locks.

T1                      T2
Lock-X (A)
Read A;
A = A - 100;
Write A;
                        Lock-S (A)
                        Read A;
                        Temp = A * 0.1;
                        Unlock (A)
                        Lock-X (C)
                        Read C;
                        C = C + Temp;
                        Write C;
                        Unlock (C)
Unlock (A)
Lock-X (B)
Read B;
B = B + 100;
Write B;
Unlock (B)

We cannot produce a schedule like the one above, even if we want to, provided that the transactions use locks. See the first statement in T2, which attempts to acquire a lock on A. This is impossible because T1 has not released its exclusive lock on A, so T2 cannot get the shared lock it wants on A. It must wait until T1 releases the exclusive lock on A, and can begin its execution only after that. So the proper schedule would look like the following:
T1                      T2
Lock-X (A)
Read A;
A = A - 100;
Write A;
Unlock (A)
                        Lock-S (A)
                        Read A;
                        Temp = A * 0.1;
                        Unlock (A)
                        Lock-X (C)
                        Read C;
                        C = C + Temp;
                        Write C;
                        Unlock (C)
Lock-X (B)
Read B;
B = B + 100;
Write B;
Unlock (B)

Two Phase Locking Protocol


The use of locks has helped us to create neat and clean concurrent schedule. The Two Phase Locking
Protocol defines the rules of how to acquire the locks on a data item and how to release the locks.
The Two Phase Locking Protocol assumes that a transaction can only be in one of two phases.

Growing Phase: In this phase the transaction can only acquire locks, but cannot release any lock. The transaction enters the growing phase as soon as it acquires the first lock it wants. From then on it has no option but to keep acquiring all the locks it will need. It cannot release any lock in this phase, even if it has finished working with a locked data item. Ultimately the transaction reaches a point where all the locks it may need have been acquired. This point is called the Lock Point.

Shrinking Phase: After the Lock Point has been reached, the transaction enters the shrinking phase. In this phase the transaction can only release locks, but cannot acquire any new lock. The transaction enters the shrinking phase as soon as it releases the first lock after crossing the Lock Point. From then on it has no option but to keep releasing all the acquired locks.
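
The two-phase rule can be checked mechanically: once a transaction has released any lock, it must not acquire another. The sketch below assumes a transaction is described as a list of ("lock" or "unlock", item) pairs; that format and the function name `is_two_phase` are hypothetical.

```python
def is_two_phase(operations):
    """Return True if no lock is acquired after the first unlock."""
    shrinking = False
    for action, _item in operations:
        if action == "unlock":
            shrinking = True      # the Lock Point has been passed
        elif action == "lock" and shrinking:
            return False          # acquiring a lock in the shrinking phase
    return True

# The transfer transaction as written earlier unlocks A before locking B.
transfer = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]
# A two-phase variant acquires both locks before releasing either.
two_phase = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
```

Under this check `transfer` is not two-phase, while `two_phase` is. The earlier transfer example uses locks but does not follow the Two Phase Locking Protocol.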

There are two different versions of the Two Phase Locking Protocol. One is called the Strict Two Phase Locking Protocol and the other is called the Rigorous Two Phase Locking Protocol.

Strict Two Phase Locking Protocol


In this protocol, a transaction may release all the shared locks after the Lock Point has been reached, but it cannot release any of the exclusive locks until the transaction commits. This protocol helps in creating cascadeless schedules.

Cascading rollback is a typical problem faced while creating concurrent schedules. Consider the following schedule once again.
T1                      T2
Lock-X (A)
Read A;
A = A - 100;
Write A;
Unlock (A)
                        Lock-S (A)
                        Read A;
                        Temp = A * 0.1;
                        Unlock (A)
                        Lock-X (C)
                        Read C;
                        C = C + Temp;
                        Write C;
                        Unlock (C)
Lock-X (B)
Read B;
B = B + 100;
Write B;
Unlock (B)

The schedule is theoretically correct. T1 releases the exclusive lock on A, and immediately after that the context switch is made. T2 acquires a shared lock on A to read its value, performs a calculation, updates the content of account C and then issues COMMIT. However, T1 is not finished yet. What if the remaining portion of T1 encounters a problem (power failure, disk failure etc.) and cannot be committed? In that case T1 should be rolled back and the old value (before image, BFIM) of A should be restored. T2, which has read the updated (but not committed) value of A and calculated the value of C based on it, must then also be rolled back. We have to roll back T2 through no fault of T2 itself, but because we proceeded with T2 depending on a value which had not yet been committed. This phenomenon of rolling back a dependent transaction when the transaction it depends on is rolled back is called Cascading Rollback, and it causes a tremendous loss of processing power and execution time.
Using the Strict Two Phase Locking Protocol, Cascading Rollback can be prevented. In the Strict Two Phase Locking Protocol a transaction cannot release any of its acquired exclusive locks until the transaction commits. In such a case, T1 would not release the exclusive lock on A until it finally commits, which makes it impossible for T2 to acquire the shared lock on A at a time when A’s value has not been committed. This makes it impossible for a schedule to be cascading.
Rigorous Two Phase Locking Protocol
In the Rigorous Two Phase Locking Protocol, a transaction is not allowed to release any lock (either shared or exclusive) until it commits. This means that until the transaction commits, other transactions may acquire a shared lock on a data item on which the uncommitted transaction holds a shared lock, but cannot acquire any lock on a data item on which it holds an exclusive lock.

Timestamp-based Protocol
 The Timestamp Ordering Protocol orders transactions based on their timestamps. The order of the transactions is the ascending order of their creation.
 The older transaction has higher priority, so it executes first. To determine the timestamp of a transaction, this protocol uses the system time or a logical counter.
 Lock-based protocols manage the order between conflicting pairs of transactions at execution time, whereas timestamp-based protocols start working as soon as a transaction is created.
 Let's assume there are two transactions T1 and T2. Suppose T1 entered the system at time 007 and T2 entered the system at time 009. T1 has the higher priority, so it executes first, as it entered the system first.
 The timestamp ordering protocol also maintains the timestamps of the last 'read' and 'write' operations on each data item.

Basic Timestamp ordering protocol works as follows:

1. Check the following condition whenever a transaction Ti issues a Read (X) operation:

 If W_TS(X) > TS(Ti) then the operation is rejected and Ti is rolled back.
 If W_TS(X) <= TS(Ti) then the operation is executed.
 After a successful read, R_TS(X) is set to the larger of R_TS(X) and TS(Ti).

2. Check the following condition whenever a transaction Ti issues a Write(X) operation:

 If TS(Ti) < R_TS(X) then the operation is rejected and Ti is rolled back.
 If TS(Ti) < W_TS(X) then the operation is rejected and Ti is rolled back.
 Otherwise the operation is executed and W_TS(X) is set to TS(Ti).



Where,

TS(Ti) denotes the timestamp of the transaction Ti.

R_TS(X) denotes the Read time-stamp of data-item X.

W_TS(X) denotes the Write time-stamp of data-item X.
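
The read and write checks can be sketched in a few lines of Python. The class `Item` and the functions `read` and `write` are hypothetical names for this illustration. The timestamp updates follow the usual formulation of basic timestamp ordering: a successful read raises R_TS(X) to at least TS(Ti), and a successful write sets W_TS(X) to TS(Ti). A return value of False stands for "rejected, roll back Ti".

```python
class Item:
    """A data item X with its read and write timestamps."""
    def __init__(self):
        self.r_ts = 0   # R_TS(X)
        self.w_ts = 0   # W_TS(X)

def read(item, ts):
    """Apply the Read(X) check for a transaction with timestamp ts."""
    if item.w_ts > ts:
        return False                    # X was already written by a younger transaction
    item.r_ts = max(item.r_ts, ts)
    return True

def write(item, ts):
    """Apply the Write(X) check for a transaction with timestamp ts."""
    if ts < item.r_ts or ts < item.w_ts:
        return False                    # a younger transaction already read or wrote X
    item.w_ts = ts
    return True

x = Item()
results = [
    read(x, 20),    # allowed, R_TS(X) becomes 20
    write(x, 10),   # rejected, 10 < R_TS(X)
    write(x, 30),   # allowed, W_TS(X) becomes 30
    read(x, 20),    # rejected, W_TS(X) = 30 > 20
]
```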

Example: Suppose there are three transactions T1, T2, and T3.

T1 has entered the system at time 0010
T2 has entered the system at time 0020
T3 has entered the system at time 0030
Priority will be given to transaction T1, then transaction T2 and lastly transaction T3.

Advantages:
 Schedules are serializable, just as with 2PL protocols
 Transactions never wait, which eliminates the possibility of deadlocks

Disadvantages:
 Starvation is possible if the same transaction is repeatedly aborted and restarted

Validation-based Protocol

It is based on Timestamp Protocol. It has three phases:

1. Read Phase: During this phase, the system executes transaction Ti. It reads the values of the various data items and stores them in variables local to Ti. It performs all the write operations on temporary local variables without updating the actual database.
2. Validation Phase: Transaction Ti performs a validation test to determine whether it can copy to the database the temporary local variables that hold the results of write operations without causing a violation of serializability.
3. Write Phase: If transaction Ti succeeds in validation, then the system applies the actual updates to the database; otherwise the system rolls back Ti.

To perform the validation test, we need to know when the various phases of transaction Ti took place.
We shall therefore associate three different timestamps with transaction Ti.

1. Start (Ti): the time when Ti started its execution.

2. Validation (Ti): the time when Ti finished its read phase and started its validation phase.

3. Finish (Ti): the time when Ti finished its write phase.



The Validation Test for Tj requires that, for all transactions Ti with TS(Ti) < TS(Tj), one of the following conditions must hold:

1. Finish (Ti) < Start (Tj): Since Ti completes its execution before Tj starts, the serializability order is indeed maintained.
2. Start (Tj) < Finish (Ti) < Validation (Tj), and the set of data items written by Ti does not intersect the set of data items read by Tj: Ti finishes its writes before Tj begins validation, and none of those writes affect what Tj has read.
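
The validation test can be sketched as a small Python function. The `Txn` record and its field names are assumptions for illustration. The second condition is coded in its usual textbook form, which additionally requires that the items written by Ti do not overlap the items read by Tj.

```python
from dataclasses import dataclass, field

@dataclass
class Txn:
    start: int
    validation: int
    finish: int
    read_set: set = field(default_factory=set)
    write_set: set = field(default_factory=set)

def validates(tj, older):
    """Return True if Tj passes validation against every older transaction Ti."""
    for ti in older:
        if ti.finish < tj.start:
            continue                       # condition 1: Ti finished before Tj started
        if (tj.start < ti.finish < tj.validation
                and not (ti.write_set & tj.read_set)):
            continue                       # condition 2: overlap in time, but no conflict
        return False
    return True

t1 = Txn(start=1, validation=3, finish=5, read_set={"A"}, write_set={"A"})
serial = Txn(start=6, validation=8, finish=10, read_set={"A"})
disjoint = Txn(start=4, validation=7, finish=9, read_set={"B"}, write_set={"B"})
conflict = Txn(start=4, validation=7, finish=9, read_set={"A"})
```

With these sample values, `serial` and `disjoint` validate against `t1`, while `conflict` fails because it read A while `t1` was still writing it.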

Deadlock

"A system is in a deadlock state if there exists a set of transactions such that every transaction in the set
is waiting for another transaction in the set."

As shown in fig., transaction T1 is waiting for transaction T2 to release its lock on a data item, transaction T2 is waiting for transaction T3 to release its lock on a data item, and transaction T3 is waiting for transaction T1 to release its lock on a data item. Such a cycle of transactions waiting for locks to be released is called a Deadlock.

There are two principal methods for dealing with the deadlock problem.

 Deadlock prevention
 Deadlock detection

Deadlock prevention: We can use a deadlock-prevention protocol to ensure that the system will never
enter a deadlock state.

Deadlock detection: In this case, we can allow the system to enter a deadlock state, and then try to
recover using a deadlock detection and deadlock recovery scheme.

Both the above methods may result in transaction rollback. Prevention is commonly used if the probability
that the system would enter a deadlock state is relatively high; otherwise detection and recovery are more
efficient.

Deadlock Prevention

Deadlocks can be prevented by giving each transaction a priority and ensuring that lower priority transactions are not allowed to wait for higher priority transactions (or vice versa). One way to assign priorities is to give each transaction a timestamp when it starts up. The lower the timestamp, the higher the transaction's priority; that is, the oldest transaction has the highest priority.



Deadlock prevention methods

Wait-die

If Ti has higher priority, it is allowed to wait; otherwise it is aborted. That is, when transaction Ti requests a data item currently held by Tj, Ti is allowed to wait only if it has a timestamp smaller than that of Tj (that is, Ti is older than Tj). Otherwise Ti is rolled back (dies).

Example

Suppose that transactions T1 and T2 have timestamps 7 and 10 respectively. If T1 requests a data item held by T2 then T1 will wait. If T2 requests a data item held by T1 then T2 will be rolled back (dies).

The wait-die scheme is a non-preemptive scheme because only a transaction requesting a lock can be aborted. As a transaction grows older (and its priority increases), it tends to wait for more and more younger transactions.

Wound-wait

If Ti has higher priority, abort Tj otherwise Ti waits. It means when transaction Ti requests a data item
currently held by Tj, Ti is allowed to wait only if it has a timestamp larger than that of Tj (that is Ti is
younger than Tj). Otherwise Tj is rolled back (Tj is wounded by Ti).

Example:

Suppose again that transactions T1 and T2 have timestamps 7 and 10 respectively. If T1 requests a data item held by T2, then the data item will be preempted from T2 and T2 will be rolled back. If T2 requests a data item held by T1, then T2 will wait.

This scheme is based on a preemptive technique. In the wait-die scheme, lower priority transactions can never wait for higher priority transactions. In the wound-wait scheme, higher priority transactions never wait for lower priority transactions. In either case no deadlock cycle can develop.
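
Both schemes reduce to a comparison of two timestamps. In the sketch below the function names and the returned strings are illustrative choices; a smaller timestamp means an older transaction with higher priority.

```python
def wait_die(requester_ts, holder_ts):
    """Non-preemptive: an older requester waits, a younger requester dies."""
    return "wait" if requester_ts < holder_ts else "abort requester"

def wound_wait(requester_ts, holder_ts):
    """Preemptive: an older requester wounds the holder, a younger requester waits."""
    return "abort holder" if requester_ts < holder_ts else "wait"

# T1 has timestamp 7 (older), T2 has timestamp 10 (younger).
T1, T2 = 7, 10
decisions = {
    "wait-die, T1 requests from T2": wait_die(T1, T2),
    "wait-die, T2 requests from T1": wait_die(T2, T1),
    "wound-wait, T1 requests from T2": wound_wait(T1, T2),
    "wound-wait, T2 requests from T1": wound_wait(T2, T1),
}
```

In both schemes the transaction that is rolled back is always the younger one; the schemes differ only in whether the requester or the holder is aborted.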

When a transaction is aborted and restarted, it should be given the same timestamp that it had originally. Reissuing timestamps in this way ensures that each transaction will eventually become the oldest transaction, and thus the one with the highest priority, and will get the locks that it requires.

Timeout-Based Schemes

Another simple approach to deadlock handling is based on lock timeouts. In this approach, a transaction
that has requested a lock waits for at most a specified amount of time. If the lock has not been granted
within that time, the transaction is said to time out, and it rolls itself back and restarts. If there was in fact
a deadlock, one or more transactions involved in the deadlock will time out and roll back, allowing the
others to proceed.

The timeout scheme is particularly easy to implement, and works well if transactions are short, and if long waits are likely to be due to deadlocks.

Deadlock Detection

Wait-for Graph:

 A simple way to detect a state of deadlock is with the help of a wait-for graph. This graph is constructed and maintained by the system.

 One node is created in the wait-for graph for each transaction that is currently executing. Whenever
a transaction Ti is waiting to lock an item X that is currently locked by a transaction Tj, a directed edge
(Ti->Tj) is created in the wait-for graph.

 When Tj releases the lock(s) on the items that Ti was waiting for, the directed edge is dropped from
the wait-for graph.

 We have a state of deadlock if and only if the wait-for graph has a cycle. Then each transaction
involved in the cycle is said to be deadlocked.

 To detect deadlocks, the system needs to maintain the wait-for graph and periodically invoke an algorithm that searches for a cycle in the graph.

Consider the following wait-for graph in figure. Here:

Transaction T25 is waiting for transactions T26 and T27.

Transactions T27 is waiting for transaction T26.

Transaction T26 is waiting for transaction T28.

[Figure: wait-for graph with no cycle]

The wait-for graph has no cycle, so there is no deadlock state.

Suppose now that transaction T28 is requesting an item held by T27. Then the edge T28 -> T27 is added to the wait-for graph, resulting in a new system state as shown in figure.

[Figure: wait-for graph with a cycle]

This time the graph contains the cycle

T26 -> T28 -> T27 -> T26

so there is a deadlock state. It means that transactions T26, T27 and T28 are all deadlocked.
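
The cycle search can be sketched as a depth-first search over the wait-for graph. The dictionary representation (each transaction mapped to the list of transactions it waits for) and the name `has_cycle` are assumptions for this illustration; the edges are the ones listed in the example above.

```python
def has_cycle(graph):
    """Depth-first search for a cycle in a directed wait-for graph."""
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on current path, finished
    colour = {node: WHITE for node in graph}

    def visit(node):
        colour[node] = GREY
        for nxt in graph.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                return True               # back edge: nxt is on the current path
            if state == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in graph)

# Edges Ti -> Tj mean "Ti is waiting for Tj".
before = {"T25": ["T26", "T27"], "T27": ["T26"], "T26": ["T28"], "T28": []}
after = dict(before, T28=["T27"])         # T28 now requests an item held by T27
```

Here `has_cycle(before)` is false and `has_cycle(after)` is true, matching the two system states described above.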

Recovery from Deadlock

When a detection algorithm determines that a deadlock exists, the system must recover from the
deadlock. The most common solution is to roll back one or more transactions to break the deadlock.
Choosing which transaction to abort is known as Victim Selection.



 Selection of deadlock victim

 In the wait-for graph containing the cycle, transactions T26, T28 and T27 are deadlocked. To remove the deadlock, one of these three transactions must be rolled back.

 We should roll back the transaction that will incur the minimum cost. When a deadlock is detected, the choice of which transaction to abort can be made using the following criteria:

• The transaction that holds the fewest locks

• The transaction that has done the least work

• The transaction that is farthest from completion

 Rollback

Once we have decided that a particular transaction must be rolled back, we must determine how far this transaction should be rolled back. The simplest solution is a total rollback: abort the transaction and then restart it. However, it is more effective to roll back the transaction only as far as necessary to break the deadlock. This method requires the system to maintain additional information about the state of all the running transactions.

 Problem of Starvation

In a system where the selection of victims is based primarily on cost factors, it may happen that the same transaction is always picked as a victim. As a result, this transaction never completes. We must ensure that a transaction can be picked as a victim only a (small) finite number of times. The most common solution is to include the number of rollbacks in the cost factor.

Starvation

When a transaction requests a lock on a data item in a particular mode, and no other transaction has a
lock on the same data item in a conflicting mode, the lock can be granted. However care must be taken
to avoid the following scenario.

Suppose a transaction T2 has a shared-mode lock on a data item, and another transaction T1 requests an exclusive-mode lock on the data item. Clearly, T1 has to wait for T2 to release the shared-mode lock. Meanwhile a transaction T3 may request a shared-mode lock on the same data item. The lock request is compatible with the lock granted to T2, so T3 may be granted the shared-mode lock. At this point T2 may release the lock, but T1 still has to wait for T3 to finish. But again there may be a new transaction T4 that requests a shared-mode lock on the same data item and is granted the lock before T3 releases it. In fact, it is possible that there is a sequence of transactions that each request a shared-mode lock on the data item, and each transaction releases the lock a short while after it is granted, but T1 never gets the exclusive-mode lock on the data item. The transaction T1 may never make progress and is said to be starved.



Database Failure Classification:
To find where the problem has occurred, we classify failures into the following categories:

1. Transaction failure
2. System crash
3. Disk failure

1. Transaction failure: A transaction failure occurs when a transaction fails to execute or reaches a point from which it cannot go any further. If only a few transactions or processes are affected, this is called a transaction failure.

Reasons for a transaction failure could be -

1. Logical errors: A logical error occurs if a transaction cannot complete due to a code error or an internal error condition.
2. System errors: A system error occurs when the DBMS itself terminates an active transaction because the database system is not able to execute it. For example, the system aborts an active transaction in case of deadlock or resource unavailability.

2. System Crash: System failure can occur due to power failure or other hardware or software
failure.
Example: Operating system error.

Fail-stop assumption: In a system crash, non-volatile storage is assumed not to be corrupted.

3. Disk Failure
 Disk failures were a common problem in the early days of technology evolution, when hard-disk drives or storage drives used to fail frequently.
 A disk failure occurs due to the formation of bad sectors, a disk head crash, unreachability of the disk, or any other failure that destroys all or part of the disk storage.

Log-based Recovery
A log is a sequence of records that tracks the actions performed by transactions. It is important that the log records are written prior to the actual modification and stored on stable storage media, which is failsafe.

An update log record describes a single database write: <Tn, X, V1, V2>

i. Tn: transaction identifier.
ii. X: data-item identifier.
iii. V1: old value.
iv. V2: new value.



Log-based recovery works as follows −

 The log file is kept on stable storage media.
 When a transaction enters the system and starts execution, it writes a log record about it:
<Tn, Start>
 When the transaction modifies an item X, it writes a log record as follows:
<Tn, X, V1, V2>
 It reads: Tn has changed the value of X from V1 to V2.
 When the transaction finishes, it logs:
<Tn, commit>

The database can be modified using two approaches −

 Deferred database modification − All log records are written to stable storage, and the database is updated only when the transaction commits.

 Immediate database modification − Each log record is followed by the actual database modification. That is, the database is modified immediately after every operation.

Deferred database modification:


 The deferred database modification scheme records all modifications in the log, but defers all the writes until after partial commit.
 Assume that transactions execute serially.
 A transaction starts by writing a <Ti start> record to the log.
 A write(X) operation results in a log record <Ti, X, V> being written, where V is the new value for X.
Note: the old value is not needed for this scheme.
 The write is not performed on X at this time, but is deferred.
 When Ti partially commits, <Ti commit> is written to the log.
 Finally, the log records are read and used to actually execute the previously deferred writes.
 During recovery after a crash, a transaction needs to be redone if and only if both <Ti start> and <Ti commit> are in the log.
 Redoing a transaction Ti (redo(Ti)) sets the value of all data items updated by the transaction to the new values.
 Crashes can occur while the transaction is executing the original updates, or while recovery action is being taken.
 Example transactions T0 and T1 (T0 executes before T1):

T0: read (A)            T1: read (C)
    A := A - 50             C := C - 100
    write (A)               write (C)
    read (B)
    B := B + 50
    write (B)

If the log on stable storage at the time of the crash is as in case:

(a) No redo actions need to be taken
(b) redo(T0) must be performed since <T0 commit> is present
(c) redo(T0) must be performed followed by redo(T1) since <T0 commit> and <T1 commit> are present

Immediate database modification:


 The immediate-update technique allows database modifications to be output to the database while the transaction is still in the active state.
 These modifications are called uncommitted modifications. In the event of a crash or transaction failure, the system must use the old-value field of the log records to restore the modified data items.
 The recovery procedure has two operations instead of one:
 Undo (Ti) restores the value of all data items updated by Ti to their old values, going backwards from the last log record for Ti.
 Redo (Ti) sets the value of all data items updated by Ti to the new values, going forward from the first log record for Ti.
 Both operations must be idempotent:
 That is, even if the operation is executed multiple times, the effect is the same as if it were executed once. This is needed since operations may get re-executed during recovery.
 When recovering after failure:
 Transaction Ti needs to be undone if the log contains the record <Ti start> but does not contain the record <Ti commit>.
 Transaction Ti needs to be redone if the log contains both the record <Ti start> and the record <Ti commit>.
 Undo operations are performed first, then redo operations.

Recovery actions in each case above are:

(a) undo (T0): B is restored to 2000 and A to 1000.
(b) undo (T1) and redo (T0): C is restored to 700, and then A and B are set to 950 and 2050 respectively.
(c) redo (T0) and redo (T1): A and B are set to 950 and 2050 respectively. Then C is set to 600.
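
Case (b) can be replayed with a short Python sketch of undo-then-redo recovery. The log contents below are an assumption reconstructed from the values quoted in the recovery actions (A: 1000 to 950, B: 2000 to 2050, C: 700 to 600); the tuple layout and the name `recover` are hypothetical.

```python
def recover(log, db):
    """Undo uncommitted transactions, then redo committed ones (immediate modification)."""
    committed = {rec[0] for rec in log if rec[1] == "commit"}
    # Undo pass: scan backwards and restore old values of uncommitted transactions.
    for rec in reversed(log):
        if len(rec) == 4 and rec[0] not in committed:
            _txn, item, old, _new = rec
            db[item] = old
    # Redo pass: scan forwards and reapply new values of committed transactions.
    for rec in log:
        if len(rec) == 4 and rec[0] in committed:
            _txn, item, _old, new = rec
            db[item] = new
    return db

# Assumed log for case (b): T0 committed, T1 crashed after updating C.
log_b = [
    ("T0", "start"),
    ("T0", "A", 1000, 950),
    ("T0", "B", 2000, 2050),
    ("T0", "commit"),
    ("T1", "start"),
    ("T1", "C", 700, 600),
]
# Assumed disk state at the crash: only some writes had reached the database.
db = {"A": 950, "B": 2000, "C": 600}
recovered = recover(log_b, db)
```

Running `recover` a second time on the result leaves it unchanged, which is the idempotence property that the undo and redo operations require.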

Recovery with Concurrent Transactions


When more than one transaction is being executed in parallel, the logs are interleaved. At the time of recovery, it would become hard for the recovery system to backtrack through all the logs and then start recovering. To ease this situation, modern DBMSs use the concept of 'checkpoints'.

Checkpoint
Keeping and maintaining logs in real time and in a real environment may fill all the memory space available in the system. As time passes, the log file may grow too big to be handled at all. A checkpoint is a mechanism where all the previous logs are removed from the system and stored permanently on a storage disk. A checkpoint declares a point before which the DBMS was in a consistent state and all the transactions were committed.

During recovery we need to consider only the most recent transaction Ti that started before the
checkpoint, and transactions that started after Ti.

 Scan backwards from the end of the log to find the most recent <checkpoint> record.
 Continue scanning backwards until a record <Ti start> is found.
 Only the part of the log following this start record needs to be considered. The earlier part of the log
can be ignored during recovery, and can be erased whenever desired.
 For all transactions (starting from Ti or later) with no <Ti commit>, execute undo(Ti). (Done only
in case of immediate modification.)
 Scanning forward in the log, for all transactions starting from Ti or later with a <Ti commit>,
execute redo(Ti).
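
The scan described in the steps above might look like the following sketch. The tuple-based log format, the function names, and the sample log are all assumptions made for illustration; the sketch only decides which transactions need undo or redo and does not apply the changes.

```python
# Hypothetical log record format (an assumption for this sketch):
#   ("start", txn), ("write", txn, item, old, new), ("commit", txn), ("checkpoint",)

def relevant_suffix(log):
    """Return the part of the log that recovery needs to examine."""
    # Scan backwards from the end of the log for the most recent checkpoint.
    cp = None
    for i in range(len(log) - 1, -1, -1):
        if log[i][0] == "checkpoint":
            cp = i
            break
    if cp is None:
        return log  # no checkpoint: the whole log must be considered
    # Continue scanning backwards until a start record is found.
    for i in range(cp - 1, -1, -1):
        if log[i][0] == "start":
            return log[i:]
    return log[cp:]

def transactions_to_recover(log):
    """Split transactions in the relevant suffix into (to_undo, to_redo)."""
    suffix = relevant_suffix(log)
    started = [rec[1] for rec in suffix if rec[0] == "start"]
    committed = {rec[1] for rec in suffix if rec[0] == "commit"}
    to_undo = [t for t in started if t not in committed]
    to_redo = [t for t in started if t in committed]
    return to_undo, to_redo

# Sample log, assumed for illustration; the crash happens after the last record.
sample_log = [
    ("start", "T1"), ("write", "T1", "A", 1, 2), ("commit", "T1"),
    ("start", "T2"), ("write", "T2", "B", 5, 6),
    ("checkpoint",),
    ("commit", "T2"),
    ("start", "T3"), ("write", "T3", "C", 7, 8), ("commit", "T3"),
    ("start", "T4"), ("write", "T4", "D", 9, 10),
]
to_undo, to_redo = transactions_to_recover(sample_log)
```

In this sample, T2 plays the role of Ti (the most recent transaction that started before the checkpoint), so T1 lies in the part of the log that can be ignored.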
Recovery
When a system with concurrent transactions crashes and recovers, it behaves in the following manner −

 The recovery system reads the logs backwards from the end to the last checkpoint.
 It maintains two lists, an undo-list and a redo-list.
 If the recovery system sees a log with <Tn, Start> and <Tn, Commit> or just <Tn, Commit>, it puts
the transaction in the redo-list.
 If the recovery system sees a log with <Tn, Start> but finds no commit or abort record, it puts the
transaction in the undo-list.
All the transactions in the undo-list are then undone and their logs are removed. For the transactions in
the redo-list, their previous logs are removed, and the transactions are then redone before their logs are saved.

 For example: in the log file, transactions T2 and T3 have both <Tn, Start> and <Tn, Commit> after the
checkpoint. Transaction T1 has only <Tn, Commit> after the checkpoint, because it started before the
checkpoint and committed after it. Hence the recovery system puts T1, T2 and T3 into the redo-list.
 A transaction is put into the undo-list if the recovery system sees a log with <Tn, Start> but finds no
commit or abort record for it. All the transactions in the undo-list are undone, and their logs are
removed.
 Transaction T4 has only <Tn, Start>. So T4 is put into the undo-list, since this transaction was not
yet complete and failed midway.
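
The construction of the two lists can be sketched as follows. The tuple-based log format and the function name are assumptions for this example, and the sample log is one possible interleaving consistent with the T1 to T4 example above. The sketch follows the text as written, so it does not cover a transaction that was active at the checkpoint and never committed.

```python
# Hypothetical log record format (an assumption for this sketch):
#   ("start", txn), ("commit", txn), ("abort", txn), ("checkpoint",)

def build_lists(log):
    """Read the log backwards from the end to the last checkpoint."""
    undo_list, redo_list = [], []
    finished = set()  # transactions for which a commit or abort was already seen
    for rec in reversed(log):
        if rec[0] == "checkpoint":
            break  # stop at the most recent checkpoint
        kind, txn = rec[0], rec[1]
        if kind == "commit":
            redo_list.append(txn)
            finished.add(txn)
        elif kind == "abort":
            finished.add(txn)
        elif kind == "start" and txn not in finished:
            undo_list.append(txn)  # started, but no commit or abort found
    return undo_list, redo_list

# One possible log for the example: T1 starts before the checkpoint and
# commits after it, T2 and T3 start and commit after it, T4 never commits.
example_log = [
    ("start", "T1"),
    ("checkpoint",),
    ("start", "T2"),
    ("commit", "T1"),
    ("start", "T3"),
    ("commit", "T2"),
    ("commit", "T3"),
    ("start", "T4"),
]
undo_list, redo_list = build_lists(example_log)
```

Because the scan runs backwards, a commit record is always seen before the matching start record, which is why a single pass is enough to classify each transaction.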

Shadow Paging:
This recovery scheme does not require the use of a log in a single-user environment. In a multiuser
environment, a log may be needed for the concurrency control method. For recovery purposes, shadow
paging considers the database to be made up of a number of fixed-size disk pages (or disk blocks), say n.
A directory with n entries is constructed, where the ith entry points to the ith database page
on disk. The directory is kept in main memory if it is not too large, and all references (reads or writes)
to database pages on disk go through it. When a transaction begins executing, the current directory,
whose entries point to the most recent or current database pages on disk, is copied into a shadow
directory. The shadow directory is then saved on disk while the current directory is used by the
transaction.

During transaction execution, the shadow directory is never modified. When a write_item operation is
performed, a new copy of the modified database page is created, but the old copy of that page is not
overwritten. Instead, the new page is written elsewhere on some previously unused disk block. The current
directory entry is modified to point to the new disk block, whereas the shadow directory is not modified
and continues to point to the old unmodified disk block. Figure illustrates the concepts of shadow and
current directories. For pages updated by the transaction, two versions are kept. The old version is
referenced by the shadow directory and the new version by the current directory.

To recover from a failure during transaction execution, it is sufficient to free the modified database pages
and to discard the current directory. The state of the database before transaction execution is available
through the shadow directory, and that state is recovered by reinstating the shadow directory. The
database thus is returned to its state prior to the transaction that was executing when the crash occurred,
and any modified pages are discarded. Committing a transaction corresponds to discarding the previous
shadow directory. Since recovery involves neither undoing nor redoing data items, this technique can be
categorized as a NO-UNDO/NO-REDO technique for recovery.
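
The interplay of the two directories can be sketched with an in-memory model. This is a simplified, single-transaction sketch: the class name, the method names other than write_item, and the use of a Python list as the "disk" are assumptions made for illustration, and nothing here is actually persisted.

```python
class ShadowPagedDB:
    """Toy single-user model of shadow paging (illustrative assumption only)."""

    def __init__(self, pages):
        self.disk = list(pages)                  # disk blocks holding page contents
        self.current = list(range(len(pages)))   # ith entry -> block of the ith page
        self.shadow = None                       # set while a transaction runs
        self.free = []                           # blocks available for reuse

    def begin(self):
        # Copy the current directory into the shadow directory.
        self.shadow = list(self.current)

    def read_item(self, page_no):
        return self.disk[self.current[page_no]]

    def write_item(self, page_no, value):
        if self.current[page_no] != self.shadow[page_no]:
            # Page already has a new copy in this transaction; reuse it.
            self.disk[self.current[page_no]] = value
            return
        # Write the new copy to an unused block; the old block is not overwritten.
        if self.free:
            block = self.free.pop()
            self.disk[block] = value
        else:
            self.disk.append(value)
            block = len(self.disk) - 1
        self.current[page_no] = block  # only the current directory changes

    def commit(self):
        # Discard the shadow directory and release the old page versions.
        self.free += [b for b in self.shadow if b not in self.current]
        self.shadow = None

    def recover(self):
        # Free the modified pages and reinstate the shadow directory.
        self.free += [b for b in self.current if b not in self.shadow]
        self.current = self.shadow
        self.shadow = None

db = ShadowPagedDB(["a", "b", "c"])
db.begin()
db.write_item(1, "B")
value_during_txn = db.read_item(1)
db.recover()                        # simulate a failure during the transaction
value_after_recovery = db.read_item(1)
```

No data item is undone or redone in `recover`; the old state is restored just by switching directories, which is the NO-UNDO/NO-REDO behaviour described above.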

In a multiuser environment with concurrent transactions, logs and checkpoints must be incorporated into
the shadow paging technique. One disadvantage of shadow paging is that the updated database pages
change location on disk. This makes it difficult to keep related database pages close together on disk
without complex storage management strategies. Furthermore, if the directory is large, the overhead of
writing shadow directories to disk as transactions commit is significant. A further complication is how to
handle garbage collection when a transaction commits. The old pages referenced by the shadow
directory that have been updated must be released and added to a list of free pages for future use. These
pages are no longer needed after the transaction commits. Another issue is that the operation to migrate
between current and shadow directories must be implemented as an atomic operation.
