Concurrency Problems: Transactions
Concurrency: Concurrency is the ability of a database to allow multiple users to run multiple
transactions at the same time.
Concurrency Problems:
When multiple transactions execute concurrently in an uncontrolled or unrestricted manner, then it
might lead to several problems.
Such problems are called as concurrency problems.
The concurrency problems are-
1. Dirty Read Problem
2. Unrepeatable Read Problem
3. Lost Update Problem
4. Phantom Read Problem
Dirty Read Problem-
A dirty read occurs when a transaction reads a value written by another transaction that has not yet committed.
There is always a chance that the uncommitted transaction might roll back later.
Thus, an uncommitted transaction might make other transactions read a value that does not even exist.
This leads to inconsistency of the database.
A dirty read does not always lead to inconsistency.
It becomes problematic only when the uncommitted transaction fails and rolls back later for some reason.
Example-
1. T1 reads X.
2. T2 reads X.
3. T1 deletes X.
4. T2 tries reading X but does not find it.
In this example,
T2 finds that variable X no longer exists when it tries to read X again.
T2 wonders who deleted variable X because, as far as it is concerned, it is running in isolation.
Concurrency Control
When multiple transactions try to access the same sharable resource, many problems can arise if
access control is not handled properly. There are some important mechanisms by which access
control can be maintained.
Concurrency Control Protocols help to prevent the occurrence of above problems and maintain the
consistency of the database. The practical concept of this can be implemented by
using Locks and Timestamps. Locks and Timestamps can be used to provide an environment in which
concurrent transactions can preserve their Consistency and Isolation properties.
Lock-Based Protocols
Two-Phase Locking Protocol
Timestamp-Based Protocols
Validation-Based Protocols
Shared Lock: A transaction may acquire a shared lock on a data item in order to read its content. The
lock is shared in the sense that any other transaction can also acquire a shared lock on that same data
item for reading purposes.
Exclusive Lock: A transaction may acquire an exclusive lock on a data item in order to both read and
write it. The lock is exclusive in the sense that no other transaction can acquire any kind of lock (either
shared or exclusive) on that same data item.
The relationship between Shared and Exclusive Locks can be represented by the following table, which
is known as the Lock Matrix.

            Shared     Exclusive
Shared      Yes        No
Exclusive   No         No

(Yes = the requested lock can be granted alongside the held lock; No = the requesting transaction must wait.)
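The compatibility check itself is tiny. Below is a minimal Python sketch of how a lock manager might consult the Lock Matrix; the names LockMode and is_compatible are illustrative, not taken from any particular DBMS.

from enum import Enum

class LockMode(Enum):
    SHARED = "S"
    EXCLUSIVE = "X"

def is_compatible(held: LockMode, requested: LockMode) -> bool:
    # Per the Lock Matrix, two locks coexist only when both are shared.
    return held == LockMode.SHARED and requested == LockMode.SHARED

# Usage: S + S is granted; any combination involving X must wait.
assert is_compatible(LockMode.SHARED, LockMode.SHARED)
assert not is_compatible(LockMode.SHARED, LockMode.EXCLUSIVE)
assert not is_compatible(LockMode.EXCLUSIVE, LockMode.SHARED)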
In a transaction, a data item which we want to read or write should first be locked before the read or
write is done. After the operation is over, the transaction should unlock the data item so that other
transactions can lock that same data item for their own use. In the earlier chapter we saw a transaction
that transfers Rs 100/- from account A to account B. The transaction should now be written as
the following:
Lock-X (A); (Exclusive Lock, we want to both read A’s value and modify it)
Read A;
A = A – 100;
Write A;
Unlock (A); (Unlocking A after the modification is done)
Lock-X (B); (Exclusive Lock, we want to both read B’s value and modify it)
Read B;
B = B + 100;
Write B;
Unlock (B); (Unlocking B after the modification is done)
And the transaction that deposits 10% of account A's balance into account C should now be written as:
Lock-S (A); (Shared Lock, we only want to read A's value)
Read A;
Temp = A * 0.1;
Unlock (A); (Unlocking A; its value is no longer needed)
Lock-X (C); (Exclusive Lock, we want to both read C's value and modify it)
Read C;
C = C + Temp;
Write C;
Unlock (C); (Unlocking C after the modification is done)
Let us see how this locking mechanism helps us create error-free schedules. Recall the erroneous
schedule from the earlier chapter:
T1                              T2
Read A;
A = A - 100;
                                Read A;
                                Temp = A * 0.1;
                                Read C;
                                C = C + Temp;
                                Write C;
Write A;
Read B;
B = B + 100;
Write B;
We detected this error by common sense alone: the context switch is performed before the new value
has been written to A. T2 reads the old value of A, and thus deposits a wrong amount in C. Had we
used the locking mechanism, this error could never have occurred. Let us rewrite the schedule using
the locks.
T1                              T2
Lock-X (A)
Read A;
A = A - 100;
                                Lock-S (A) (denied: T1 still holds the exclusive lock on A)
We cannot prepare a schedule like the one above even if we want to, provided that we use locks in the
transactions. See the first statement in T2, which attempts to acquire a lock on A. This is impossible
because T1 has not released its exclusive lock on A, so T2 simply cannot get the shared lock it wants on
A. It must wait until T1 releases the exclusive lock on A, and can begin its execution only after that.
So the proper schedule would look like the following:
T1                              T2
Lock-X (A)
Read A;
A = A - 100;
Write A;
Unlock (A)
                                Lock-S (A)
                                Read A;
                                Temp = A * 0.1;
                                Unlock (A)
                                Lock-X (C)
                                Read C;
                                C = C + Temp;
                                Write C;
                                Unlock (C)
Two-Phase Locking Protocol
The Two-Phase Locking (2PL) protocol requires that, in every transaction, all lock operations precede
the first unlock operation. Every transaction thus passes through two phases:
Growing Phase: In this phase the transaction can only acquire locks, but cannot release any lock. The
transaction enters the growing phase as soon as it acquires the first lock it wants. From then on it has no
option but to keep acquiring all the locks it will need. It cannot release any lock in this phase even if it
has finished working with a locked data item. Ultimately the transaction reaches a point where all the
locks it may need have been acquired. This point is called the Lock Point.
Shrinking Phase: After the Lock Point has been reached, the transaction enters the shrinking phase. In this
phase the transaction can only release locks, but cannot acquire any new lock. The transaction enters the
shrinking phase as soon as it releases its first lock after crossing the Lock Point. From then on it has no
option but to keep releasing all the acquired locks.
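The two phases can be enforced mechanically. Here is a hedged Python sketch, assuming a simple transaction object (the class and method names are illustrative): once the first unlock happens, every further lock request is rejected as a 2PL violation.

class TwoPhaseLockingTransaction:
    def __init__(self):
        self.locks = set()
        self.shrinking = False   # becomes True after the first unlock

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: lock requested in shrinking phase")
        self.locks.add(item)     # growing phase: only acquisitions allowed

    def unlock(self, item):
        self.shrinking = True    # the Lock Point has been crossed
        self.locks.discard(item) # shrinking phase: only releases allowed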
There are two different versions of the Two-Phase Locking Protocol. One is called the Strict Two-Phase
Locking Protocol and the other is called the Rigorous Two-Phase Locking Protocol.
The schedule is theoretically correct. T1 releases the exclusive lock on A, and immediately after that the
context switch is made. T2 acquires a shared lock on A to read its value, performs a calculation, updates
the content of account C and then issues COMMIT. However, T1 is not finished yet. What if the remaining
portion of T1 encounters a problem (power failure, disk failure etc.) and cannot be committed? In that
case T1 must be rolled back and the old BFIM value of A restored. In such a case T2, which has read the
updated (but uncommitted) value of A and calculated the value of C based on it, must also be rolled
back. We have to roll back T2 through no fault of its own, but because we allowed it to read a value
written by an uncommitted transaction. This situation is called a cascading rollback; the Strict Two-Phase
Locking Protocol avoids it by making every transaction hold its exclusive locks until it commits.
Timestamp-based Protocol
The Timestamp Ordering Protocol is used to order the transactions based on their Timestamps.
The order of the transactions is nothing but the ascending order of their creation.
The older transaction has the higher priority, which is why it executes first. To determine the
timestamp of a transaction, this protocol uses system time or a logical counter.
The lock-based protocol manages the order between conflicting pairs of transactions at execution
time, whereas timestamp-based protocols start working as soon as a transaction is created.
Let's assume there are two transactions T1 and T2. Suppose transaction T1 entered the system at
time 007 and transaction T2 entered at time 009. T1 has the higher priority, so it executes first, as it
entered the system first.
The timestamp ordering protocol also maintains the timestamp of the last 'read' and 'write' operation
on each data item.
1. Check the following condition whenever a transaction Ti issues a Read (X) operation:
   If TS(Ti) < W_TS(X), the operation is rejected and Ti is rolled back.
   Otherwise, the operation is executed and R_TS(X) is set to max(R_TS(X), TS(Ti)).
2. Check the following condition whenever a transaction Ti issues a Write (X) operation:
   If TS(Ti) < R_TS(X) or TS(Ti) < W_TS(X), the operation is rejected and Ti is rolled back.
   Otherwise, the operation is executed and W_TS(X) is set to TS(Ti).
Here TS(Ti) is the timestamp of Ti, and R_TS(X) and W_TS(X) are the timestamps of the last 'read'
and 'write' on data item X.
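These two checks translate almost directly into code. A minimal sketch, assuming the per-item read/write timestamps are kept in dictionaries and ti_ts is TS(Ti); all names are illustrative.

R_TS, W_TS = {}, {}   # timestamp of the last read / write on each data item

def read(ti_ts, x):
    if ti_ts < W_TS.get(x, 0):             # Ti would read an already-overwritten value
        raise Exception("rollback Ti")      # abort; restart Ti with a new timestamp
    R_TS[x] = max(R_TS.get(x, 0), ti_ts)

def write(ti_ts, x):
    if ti_ts < R_TS.get(x, 0) or ti_ts < W_TS.get(x, 0):
        raise Exception("rollback Ti")      # a younger transaction already read/wrote X
    W_TS[x] = ti_ts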
Example: Suppose there are three transactions T1, T2, and T3.
Advantages:
Schedules are serializable, just as with 2PL protocols
No waiting for the transaction, which eliminates the possibility of deadlocks!
Disadvantages:
Starvation is possible if the same transaction is restarted and continually aborted
Validation-based Protocol
1. Read Phase: During this phase, the system executes transaction Ti. It reads the values of the
various data items and stores them in variables local to Ti. It performs all write operations on
temporary local variables, without updating the actual database.
2. Validation Phase: Transaction Ti performs a validation test to determine whether it can copy the
temporary local variables that hold the results of its write operations to the database without
causing a violation of serializability.
3. Write Phase: If Transaction Ti succeeds in validation, then the system applies the actual updates
to the database, otherwise the system rolls back Ti.
To perform the validation test, we need to know when the various phases of transaction Ti took place.
We shall therefore associate three different timestamps with transaction Ti.
1. Start (Ti): the time when Ti started its execution.
2. Validation (Ti): the time when Ti finished its read phase and started its validation phase.
3. Finish (Ti): the time when Ti finished its write phase.
1. Finish (Ti) < Start (Tj): Since Ti completes its execution before Tj started, the serializability order is
indeed maintained.
2. Start (Tj) < Finish (Ti) < Validation (Tj): The validation phase of Tj occurs after Ti finishes its write
phase. Serializability is maintained in this case provided the set of data items written by Ti does not
intersect with the set of data items read by Tj.
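A hedged sketch of the validation test for a transaction Tj against every previously validated transaction Ti, assuming each transaction object records its start, validation and finish times along with its read and write sets (all field names are illustrative):

def validate(tj, validated):
    for ti in validated:
        if ti.finish < tj.start:
            continue          # condition 1: Ti finished before Tj started
        if ti.finish < tj.validation and not (ti.write_set & tj.read_set):
            continue          # condition 2: no read-write overlap, Ti done writing
        return False          # neither condition holds: abort Tj
    return True               # Tj may proceed to its write phase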
Deadlock
"A system is in a deadlock state if there exists a set of transactions such that every transaction in the set
is waiting for another transaction in the set."
There are two principal methods for dealing with the deadlock problem:
Deadlock prevention
Deadlock detection
Deadlock prevention: We can use a deadlock-prevention protocol to ensure that the system will never
enter a deadlock state.
Deadlock detection: In this case, we can allow the system to enter a deadlock state, and then try to
recover using a deadlock detection and deadlock recovery scheme.
Both the above methods may result in transaction rollback. Prevention is commonly used if the probability
that the system would enter a deadlock state is relatively high; otherwise detection and recovery are more
efficient.
Deadlock Prevention
Deadlocks can be prevented by giving each transaction a priority and ensuring that lower-priority
transactions are not allowed to wait for higher-priority transactions (or vice versa). One way to assign
priorities is to give each transaction a timestamp when it starts up. The lower the timestamp, the higher
the transaction's priority; that is, the oldest transaction has the highest priority.
Wait-die
If Ti has the higher priority, it is allowed to wait; otherwise it is aborted. That is, when transaction Ti
requests a data item currently held by Tj, Ti is allowed to wait only if it has a timestamp smaller than
that of Tj (that is, Ti is older than Tj). Otherwise, Ti is rolled back (dies).
Example
Suppose that transactions T1 and T2 have timestamps 7 and 10 respectively. If T1 requests a data item
held by T2, then T1 will wait. If T2 requests a data item held by T1, then T2 will be rolled back (dies).
Wound-wait
If Ti has the higher priority, Tj is aborted; otherwise Ti waits. That is, when transaction Ti requests a data
item currently held by Tj, Ti is allowed to wait only if it has a timestamp larger than that of Tj (that is,
Ti is younger than Tj). Otherwise, Tj is rolled back (Tj is wounded by Ti).
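Each scheme reduces to a single timestamp comparison. A minimal sketch, assuming a smaller timestamp means an older (higher-priority) transaction; the function and return-value names are illustrative.

def wait_die(ts_requester, ts_holder):
    # Older requester waits; younger requester dies.
    return "WAIT" if ts_requester < ts_holder else "ABORT_SELF"

def wound_wait(ts_requester, ts_holder):
    # Older requester wounds (aborts) the holder; younger requester waits.
    return "ABORT_HOLDER" if ts_requester < ts_holder else "WAIT"

# The example above: T1 (timestamp 7) and T2 (timestamp 10).
assert wait_die(7, 10) == "WAIT"              # T1 requests from T2: T1 waits
assert wait_die(10, 7) == "ABORT_SELF"        # T2 requests from T1: T2 dies
assert wound_wait(7, 10) == "ABORT_HOLDER"    # T1 requests from T2: T2 is wounded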
Timeout-Based Schemes
Another simple approach to deadlock handling is based on lock timeouts. In this approach, a transaction
that has requested a lock waits for at most a specified amount of time. If the lock has not been granted
within that time, the transaction is said to time out, and it rolls itself back and restarts. If there was in fact
a deadlock, one or more transactions involved in the deadlock will time out and roll back, allowing the
others to proceed.
The timeout scheme is particularly easy to implement, and works well if transactions are short and if
long waits are likely to be due to deadlocks.
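A minimal sketch of the timeout idea using Python's threading.Lock, whose acquire() call genuinely accepts a timeout argument; the rollback-and-restart step is only indicated by a comment.

import threading

lock = threading.Lock()   # stands in for the lock on one data item

def access_item(timeout_seconds=2.0):
    if lock.acquire(timeout=timeout_seconds):
        try:
            pass          # read/write the protected data item here
        finally:
            lock.release()
    else:
        # Timed out: assume a possible deadlock; roll back and restart.
        raise TimeoutError("lock wait timed out; roll back and restart the transaction")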
Deadlock Detection
Wait-for Graph:
A state of deadlock can be detected with the help of a wait-for graph. This graph is constructed and
maintained by the system.
One node is created in the wait-for graph for each transaction that is currently executing. Whenever
a transaction Ti is waiting to lock an item X that is currently locked by a transaction Tj, a directed edge
(Ti->Tj) is created in the wait-for graph.
When Tj releases the lock(s) on the items that Ti was waiting for, the directed edge is dropped from
the wait-for graph.
We have a state of deadlock if and only if the wait-for graph has a cycle. Then each transaction
involved in the cycle is said to be deadlocked.
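Detecting that cycle is ordinary depth-first search. A minimal sketch, assuming the wait-for graph is stored as adjacency sets (graph[Ti] is the set of transactions Ti is waiting for):

def has_deadlock(graph):
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited / on current path / done
    color = {node: WHITE for node in graph}

    def dfs(node):
        color[node] = GREY
        for nxt in graph.get(node, ()):
            if color.get(nxt, WHITE) == GREY:   # back edge: a cycle exists
                return True
            if color.get(nxt, WHITE) == WHITE and dfs(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and dfs(n) for n in list(graph))

# The deadlocked state discussed below: T26 -> T28 -> T27 -> T26
assert has_deadlock({"T26": {"T28"}, "T28": {"T27"}, "T27": {"T26"}})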
No Cycle
Suppose now that transaction T28 is requesting an item held by T27. Then the edge T28 -> T27 is added
to the wait-for graph, resulting in a new system state as shown in the figure.
Cycle
T26 -> T28 -> T27 -> T26
It means that transactions T26, T27 and T28 are all deadlocked.
When a detection algorithm determines that a deadlock exists, the system must recover from the
deadlock. The most common solution is to roll back one or more transactions to break the deadlock.
Choosing which transaction to abort is known as Victim Selection.
In the wait-for graph below, transactions T26, T27 and T28 are deadlocked. In order to remove the
deadlock, one of these three transactions must be rolled back.
We should roll back those transactions that will incur the minimum cost. When a deadlock is
detected, the choice of which transaction to abort can be made using the following criteria:
1. How long the transaction has computed, and how much longer it will compute before completing
its designated task.
2. How many data items the transaction has used.
3. How many more data items the transaction needs in order to complete.
4. How many transactions will be involved in the rollback.
Rollback
Once we have decided that a particular transaction must be rolled back, we must determine how far this
transaction should be rolled back. The simplest solution is a total rollback: abort the transaction and then
restart it. However, it is more effective to roll back the transaction only as far as necessary to break the
deadlock. This method requires the system to maintain additional information about the state of all
running transactions.
Problem of Starvation
In a system where the selection of victims is based primarily on cost factors, it may happen that the same
transaction is always picked as a victim. As a result, this transaction never completes its designated task
and is starved. We must ensure that a transaction can be picked as a victim only a (small) finite number
of times. The most common solution is to include the number of rollbacks in the cost factor.
Starvation
When a transaction requests a lock on a data item in a particular mode, and no other transaction has a
lock on the same data item in a conflicting mode, the lock can be granted. However care must be taken
to avoid the following scenario.
Suppose a transaction T2 has a shared-mode lock on a data item, and another transaction T1 requests an
exclusive-mode lock on the same data item. Clearly, T1 has to wait for T2 to release the shared-mode lock.
Meanwhile, a transaction T3 may request a shared-mode lock on the same data item. The lock request is
compatible with the lock granted to T2, so T3 may be granted the shared-mode lock. At this point T2 may
release its lock, but T1 still has to wait for T3 to finish. Again, there may be a new transaction T4 that
requests a shared-mode lock on the same data item and is granted the lock before T3 releases it. In fact,
it is possible that there is a sequence of transactions that each request a shared-mode lock on the data
item, and each transaction releases the lock a short while after it is granted, yet T1 never gets the exclusive-
mode lock on the data item. The transaction T1 may never make progress and is said to be starved.
Failure Classification
Failures in a DBMS are generally classified as:
1. Transaction failure
2. System crash
3. Disk failure
1. Transaction failure: A transaction failure occurs when a transaction fails to execute or reaches
a point from which it cannot proceed any further. If a transaction or process fails midway, this is
called a transaction failure. Reasons for a transaction failure include:
1. Logical errors: If a transaction cannot complete due to a code error or an internal
error condition, a logical error occurs.
2. Syntax error: This occurs when the DBMS itself terminates an active transaction because
the database system is unable to execute it. For example, the system aborts an active
transaction in case of deadlock or resource unavailability.
2. System Crash: System failure can occur due to power failure or other hardware or software
failure.
Example: Operating system error.
3. Disk Failure
This occurs when hard-disk drives or storage drives fail. It was a common problem in the
early days of technology evolution.
Disk failure can be caused by the formation of bad sectors, a disk head crash,
unreachability of the disk, or any other failure that destroys all or part of the disk storage.
Log-based Recovery
A log is a sequence of records that keeps track of the actions performed by transactions. It is
important that the logs are written prior to the actual modification and stored on a stable storage
medium, which is failsafe.
An update log record describes a single database write: <Tn, X, V1, V2>, where
i. Tn is the transaction identifier.
ii. X is the data-item identifier.
iii. V1 is the old value.
iv. V2 is the new value.
Deferred database modification − All logs are first written to stable storage, and the database
is updated only when the transaction commits.
Immediate database modification − Each log follows an actual database modification. That is,
the database is modified immediately after every operation.
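As an illustration, assuming account A initially holds Rs 1000 and account B holds Rs 500 (values chosen purely for this example), the log for the Rs 100/- transfer from the earlier chapter would contain records such as:

<T1 start>
<T1, A, 1000, 900>   (A's old value 1000, new value 900)
<T1, B, 500, 600>    (B's old value 500, new value 600)
<T1 commit>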
Checkpoint
Keeping and maintaining logs in real time in a real environment may fill all the memory space available
in the system. As time passes, the log file may grow too big to be handled at all. A checkpoint is a
mechanism by which all previous logs are removed from the system and stored permanently on a
storage disk. A checkpoint declares a point before which the DBMS was in a consistent state and all
transactions were committed.
During recovery we need to consider only the most recent transaction Ti that started before the
checkpoint, and transactions that started after Ti.
Scan backwards from end of log to find the most recent <checkpoint> record
Continue scanning backwards till a record <Ti start> is found.
Need only consider the part of log following above start record. Earlier part of log can be ignored
during recovery, and can be erased whenever desired.
For all transactions (starting from Ti or later) with no <Ti commit>, execute undo(Ti). (Done only
in case of immediate modification.)
Scanning forward in the log, for all transactions starting from Ti or later with a <Ti commit>,
execute redo(Ti).
Recovery
When a system with concurrent transactions crashes and recovers, it behaves in the following manner −
The recovery system reads the logs backwards from the end to the last checkpoint.
It maintains two lists, an undo-list and a redo-list.
If the recovery system sees a log with <Tn, Start> and <Tn, Commit> or just <Tn, Commit>, it puts
the transaction in the redo-list.
If the recovery system sees a log with <Tn, Start> but no commit or abort log found, it puts the
transaction in undo-list.
All the transactions in the undo-list are then undone and their logs are removed. All the transactions in
the redo-list are redone, starting from the last checkpoint, and their logs are retained.
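A simplified Python sketch of building the two lists by scanning the log backwards to the last checkpoint. The log is assumed to be a list of tuples, and transactions that both started and committed before the checkpoint are ignored; a full implementation would also follow the <Ti start> rule described under Checkpoint.

def build_lists(log):
    undo_list, redo_list = set(), set()
    for record in reversed(log):               # scan backwards from the end
        kind = record[0]
        if kind == "checkpoint":
            break                              # stop at the last checkpoint
        if kind == "commit":
            redo_list.add(record[1])           # committed after checkpoint: redo
        elif kind == "start" and record[1] not in redo_list:
            undo_list.add(record[1])           # started but never committed: undo
    return undo_list, redo_list

log = [("start", "T1"), ("checkpoint",), ("commit", "T1"), ("start", "T4")]
print(build_lists(log))   # ({'T4'}, {'T1'})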
For example: In the log file, transactions T2 and T3 will have both <Tn, Start> and <Tn, Commit>. The
T1 transaction will have only <Tn, Commit> in the log file, because T1 committed after the checkpoint
was crossed. Hence the recovery system puts T1, T2 and T3 into the redo-list.
A transaction is put into the undo-list if the recovery system sees a log with <Tn, Start> but no
commit or abort record. In the undo-list, all the transactions are undone and their logs are
removed.
Transaction T4 will have only <Tn, Start>, so T4 is put into the undo-list, since this transaction is not
yet complete and failed midway.
Shadow Paging:
This recovery scheme does not require the use of a log in a single-user environment. In a multiuser
environment, a log may be needed for the concurrency control method. Shadow paging considers the
database to be made up of a number of fixed-size disk pages (or disk blocks) say, n—for recovery
purposes. A directory with n entries is constructed, where the ith entry points to the ith database page
on disk. The directory is kept in main memory if it is not too large, and all references—reads or writes—
to database pages on disk go through it. When a transaction begins executing, the current directory—
whose entries point to the most recent or current database pages on disk—is copied into a shadow
directory. The shadow directory is then saved on disk while the current directory is used by the transaction.
During transaction execution, the shadow directory is never modified. When a write_item operation is
performed, a new copy of the modified database page is created, but the old copy of that page is not
overwritten. Instead, the new page is written elsewhere on some previously unused disk block. The current
directory entry is modified to point to the new disk block, whereas the shadow directory is not modified
and continues to point to the old unmodified disk block. The figure illustrates the concepts of shadow and
current directories. For pages updated by the transaction, two versions are kept. The old version is
referenced by the shadow directory and the new version by the current directory.
To recover from a failure during transaction execution, it is sufficient to free the modified database pages
and to discard the current directory. The state of the database before transaction execution is available
through the shadow directory, and that state is recovered by reinstating the shadow directory. The
database is thus returned to its state prior to the transaction that was executing when the crash occurred,
and any modified pages are discarded. Committing a transaction corresponds to discarding the previous
shadow directory. Since recovery involves neither undoing nor redoing data items, this technique can be
categorized as a NO-UNDO/NO-REDO technique for recovery.
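A hedged sketch of the shadow-paging bookkeeping, assuming the "disk" is a Python list of pages and each directory is a list of page indexes (all names are illustrative):

class ShadowPagedDB:
    def __init__(self, pages):
        self.pages = list(pages)                 # disk blocks
        self.current = list(range(len(pages)))   # current directory
        self.shadow = None                       # shadow directory

    def begin(self):
        self.shadow = list(self.current)         # copy; never modified afterwards

    def write_item(self, i, value):
        self.pages.append(value)                 # new copy on an unused block
        self.current[i] = len(self.pages) - 1    # only the current directory moves

    def commit(self):
        self.shadow = None                       # discard the shadow directory

    def recover(self):
        self.current = list(self.shadow)         # reinstate shadow: NO-UNDO/NO-REDO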
In a multiuser environment with concurrent transactions, logs and checkpoints must be incorporated into
the shadow paging technique. One disadvantage of shadow paging is that the updated database pages
change location on disk. This makes it difficult to keep related database pages close together on disk
without complex storage management strategies. Furthermore, if the directory is large, the overhead of
writing shadow directories to disk as transactions commit is significant. A further complication is how to
handle garbage collection when a transaction commits. The old pages referenced by the shadow
directory that have been updated must be released and added to a list of free pages for future use. These
pages are no longer needed after the transaction commits. Another issue is that the operation to migrate
between current and shadow directories must be implemented as an atomic operation.