Unit 4 - Query Processing and Transaction Management System
But they are not sufficient in concurrent execution. To handle those problems we need to
understand database ACID properties.
ACID Properties
Atomicity: Either all the operations of a transaction are reflected in the database, or none
are.
Consistency: Executing a transaction in isolation (that is, with no other transaction running
concurrently) preserves the consistency of the database.
Isolation: Even when several transactions run concurrently, every pair of transactions must
behave as if one of them finished execution before the other started. Neither transaction
should see the other's intermediate results.
Durability: Once a transaction completes successfully, the changes it has made to the
database should be permanent even if there is a system failure. The recovery-management
component of the database system ensures the durability of transactions.
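As a rough illustration of atomicity, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table name `account`, the helper names and the starting balances are assumptions made for this example only. A transfer either applies both updates or, if it fails midway, is rolled back so that neither update is visible.

```python
import sqlite3


def make_bank():
    # Hypothetical schema: two accounts with Rs 1000 each.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 1000), ("B", 1000)])
    conn.commit()
    return conn


def balance(conn, name):
    row = conn.execute("SELECT balance FROM account WHERE name = ?", (name,)).fetchone()
    return row[0]


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction; return True on commit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
            )
            if balance(conn, src) < 0:
                # Failure after the first update: the debit must be undone.
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
        return True
    except ValueError:
        return False
```

Calling `transfer(conn, "A", "B", 5000)` on the sample data returns False and leaves both balances unchanged, because the debit that had already been applied is rolled back.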
Transaction States
Active: The initial state when the transaction has just started execution.
Partially Committed: The transaction enters this state once its final operation has executed
and it is heading towards its COMMIT POINT. The values generated during the execution are
still held in volatile storage.
Failed: The transaction enters this state if it fails for some reason. The temporary values are
no longer required, and the transaction is set to ROLLBACK. This means that any change made
to the database by this transaction up to the point of failure must be undone. For example, if
the failed transaction has withdrawn Rs. 100/- from account A, then the ROLLBACK operation
should add Rs. 100/- back to account A.
Aborted: When the ROLLBACK operation is over, the database is back at the BFIM (before
image, the state it was in before the transaction started). The transaction is now said to have
been aborted.
Committed: If no failure occurs then the transaction reaches the COMMIT POINT. All the
temporary values are written to the stable storage and the transaction is said to have been
committed.
Terminated: Either committed or aborted, the transaction finally reaches this state.
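The states above can be sketched as a small state machine. The class and constant names below are illustrative, and the allowed transitions follow the usual textbook state diagram (including a failure while partially committed), which is an assumption here since the diagram itself is not reproduced in these notes.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    FAILED = "failed"
    ABORTED = "aborted"
    COMMITTED = "committed"
    TERMINATED = "terminated"


# Which states may follow each state (assumed from the standard diagram).
ALLOWED = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.FAILED: {State.ABORTED},
    State.ABORTED: {State.TERMINATED},
    State.COMMITTED: {State.TERMINATED},
    State.TERMINATED: set(),
}


class Transaction:
    def __init__(self):
        self.state = State.ACTIVE      # every transaction starts as active
        self.history = [State.ACTIVE]

    def move_to(self, new_state):
        """Change state, refusing transitions the diagram does not allow."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"cannot go from {self.state.value} to {new_state.value}")
        self.state = new_state
        self.history.append(new_state)
```

A transaction that fails therefore passes through failed and aborted before it terminates, and it cannot jump straight from active to committed.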
Concurrent Execution
Concurrent execution means that several transactions run side by side, with their operations
interleaved or executed in parallel.
Advantages of concurrent execution are:
Improved throughput and resource utilization – the number of transactions executed in a
given amount of time increases, and the processor is kept busy.
Reduced waiting time – unpredictable delays in running transactions are reduced, as is the
average response time.
Schedule
A schedule is a collection of transactions that is executed as a unit, together with the order in
which their operations run. Depending on how the transactions are arranged within it, a
schedule can be of two types:
Serial: The transactions are executed one after another, in a non-preemptive manner.
Concurrent: The transactions are executed in a preemptive, time-shared manner.
T1 is a transaction on two accounts A and B, each containing Rs 1000/-. It transfers Rs 100/-
from account A to account B.
T2 is a transaction which deposits to account C 10% of the amount in account A.
If we prepare a serial schedule, then either T1 will completely finish before T2 can begin, or T2
will completely finish before T1 can begin.
However, if we want to create a concurrent schedule, then some context switching needs to
take place, so that some portion of T1 is executed, then some portion of T2, and so on.
Concurrent Schedule
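The difference between a serial and a concurrent schedule for T1 and T2 can be sketched as follows. Each transaction is broken into read and write steps, and a schedule is simply the order in which the steps are taken. The opening balance of account C (0) and the way the steps are split are assumptions made for this example.

```python
def t1_steps():
    # T1: transfer Rs 100 from A to B.
    return [
        lambda db, loc: loc.update(a=db["A"]),        # read(A)
        lambda db, loc: db.update(A=loc["a"] - 100),  # write(A)
        lambda db, loc: loc.update(b=db["B"]),        # read(B)
        lambda db, loc: db.update(B=loc["b"] + 100),  # write(B)
    ]


def t2_steps():
    # T2: deposit 10% of the amount in A to C.
    return [
        lambda db, loc: loc.update(a=db["A"]),                   # read(A)
        lambda db, loc: loc.update(c=db["C"]),                   # read(C)
        lambda db, loc: db.update(C=loc["c"] + loc["a"] // 10),  # write(C)
    ]


def run(order):
    """Run the steps of T1 and T2 in the given order and return the final database."""
    db = {"A": 1000, "B": 1000, "C": 0}  # C's starting balance is assumed
    steps = {"T1": t1_steps(), "T2": t2_steps()}
    local = {"T1": {}, "T2": {}}         # each transaction's private variables
    position = {"T1": 0, "T2": 0}
    for name in order:
        steps[name][position[name]](db, local[name])
        position[name] += 1
    return db


serial_t1_t2 = ["T1"] * 4 + ["T2"] * 3
serial_t2_t1 = ["T2"] * 3 + ["T1"] * 4
interleaved = ["T1", "T1", "T2", "T2", "T2", "T1", "T1"]
```

With these inputs `run(interleaved)` gives the same final balances as `run(serial_t1_t2)`, so this particular interleaving is harmless. The two serial orders give different values for C (90 and 100), and both count as correct outcomes.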
Serializability
To create error-free concurrent schedules we must follow some well-formed rules for
arranging the instructions of the transactions.
When several concurrent transactions try to access the same data item, the instructions
within these transactions must be ordered so that there is no problem in accessing and
releasing the shared data item.
There are two aspects of serializability:
1. Conflict Serializability
2. View Serializability
Conflict Serializability
Two instructions of two different transactions may want to access the same data item in order to
perform a read/write operation.
Conflict Serializability deals with detecting whether the instructions are conflicting in any way,
and specifying the order in which these two instructions will be executed in case there is any
conflict.
A conflict arises if the two instructions access the same data item and at least one of them is
a write operation. The following rules are important in Conflict Serializability:
1. If two instructions of the two concurrent transactions are both for read operation, then they
are not in conflict, and can be allowed to take place in any order.
2. If one of the instructions wants to perform a read operation and the other wants to perform
a write operation, then they are in conflict, hence their ordering is important. If the read
instruction is performed first, then it reads the old value of the data item, and the new value is
written only after the reading is over. If the write instruction is performed first, then it updates
the data item with the new value and the read instruction reads the newly updated value.
3. If both instructions are write operations, then they are in conflict. Neither instruction reads
the value written by the other, but the order still matters: the value that persists in the data
item after the schedule is over is the one written by the instruction that performed the last
write.
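The three rules can be turned into a small conflict check, and from there into the standard precedence-graph test for conflict serializability. In the sketch below a schedule is a list of (transaction, action, item) tuples with action "R" or "W". This representation and the function names are assumptions for the example.

```python
def conflicts(first, second):
    """Two operations conflict if they come from different transactions,
    touch the same data item, and at least one of them is a write."""
    txn1, action1, item1 = first
    txn2, action2, item2 = second
    return txn1 != txn2 and item1 == item2 and "W" in (action1, action2)


def precedence_edges(schedule):
    """Edge (Ti, Tj) means an operation of Ti conflicts with a later one of Tj."""
    edges = set()
    for i, first in enumerate(schedule):
        for second in schedule[i + 1:]:
            if conflicts(first, second):
                edges.add((first[0], second[0]))
    return edges


def is_conflict_serializable(schedule):
    """True if the precedence graph has no cycle."""
    edges = precedence_edges(schedule)
    remaining = {op[0] for op in schedule}
    while remaining:
        # Transactions with no incoming edge from another remaining transaction.
        free = {
            n for n in remaining
            if not any(v == n and u in remaining for u, v in edges)
        }
        if not free:
            return False  # every remaining node has a predecessor: a cycle
        remaining -= free
    return True
```

For example, the schedule R1(A), W2(A), W1(A) produces edges in both directions between T1 and T2, so it is not conflict serializable.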
View Serializability
This is another type of serializability. It is defined by comparing a schedule with another
schedule built from it that involves the same set of transactions. The two schedules are called
view equivalent if the following rules are followed while creating the second schedule out of
the first, and a schedule is view serializable if it is view equivalent to some serial schedule.
Let us consider that the operations of transactions T1 and T2 are arranged into two different
schedules S1 and S2, which we want to be view equivalent, and that both T1 and T2 want to
access the same data item.
1. If in S1, T1 reads the initial value of the data item, then in S2 also, T1 should read the initial
value of that same data item.
2. If in S1, T1 writes a value to the data item which is then read by T2, then in S2 also, T2
should read the value written by T1.
3. If in S1, T1 performs the final write operation on that data item, then in S2 also, T1 should
perform the final write operation on that data item.
Apart from these three constraints, the operations may be reordered freely while creating S2
from S1.
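A minimal sketch of a view-equivalence check follows. It assumes both schedules contain the same operations and again represents a schedule as a list of (transaction, action, item) tuples, which is a convention chosen for this example. Rules 1 and 2 are covered together by recording, for every read, which transaction wrote the value being read (None stands for the initial value). Rule 3 is covered by recording the final writer of each item.

```python
from collections import Counter


def view_info(schedule):
    """Return (reads-from relation, final writer of each data item)."""
    last_writer = {}
    reads_from = Counter()
    for txn, action, item in schedule:
        if action == "R":
            # None means the read sees the initial value of the item.
            reads_from[(txn, item, last_writer.get(item))] += 1
        else:
            last_writer[item] = txn
    return reads_from, last_writer


def view_equivalent(s1, s2):
    """True if every read sees the same writer and the final writes match."""
    return view_info(s1) == view_info(s2)
```

Two schedules that differ only in how operations on unrelated items are interleaved pass this check, while moving a read ahead of the write it originally read from fails it.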
Concurrency-control Schemes
Concurrency-control schemes are also used to ensure serializability. All these schemes either
delay an operation or abort the transaction that issued the operation.
Most commonly used Concurrency-control schemes are:
locking protocols
timestamp based protocols
Lock based protocols
A locking protocol is a set of rules that state when a transaction may lock and unlock each of the
data items in the database.
Two-phase locking protocol: this protocol allows a transaction to lock a new data item only if
that transaction has not yet unlocked any data item. This protocol ensures serializability, but not
deadlock freedom.
Strict two-phase locking protocol: It permits release of exclusive locks only at the end of
transactions, in order to ensure recoverability and cascadelessness of the resulting schedules.
Rigorous two-phase locking protocol: This protocol releases all locks only at the end of the
transaction.
Lock-Based Protocol
In this type of protocol, a transaction cannot read or write a data item until it acquires an
appropriate lock on it. There are two types of lock:
1. Shared lock:
It is also known as a read-only lock. Under a shared lock, the data item can only be read by the
transaction. The lock can be shared between transactions, because a transaction that holds
only a shared lock cannot update the data item.
2. Exclusive lock:
Under an exclusive lock, the data item can be both read and written by the transaction.
The lock is exclusive: only one transaction can hold it at a time, so multiple transactions
cannot modify the same data simultaneously.
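The compatibility of the two lock types can be sketched with a small lock table: a shared lock is granted when every other holder has a shared lock, and an exclusive lock is granted only when no other transaction holds any lock on the item. The class and method names are assumptions for this example, and a refused request simply returns False instead of making the transaction wait.

```python
class LockTable:
    """Tracks which transactions hold which lock mode on each data item."""

    def __init__(self):
        self.holders = {}  # item -> {transaction name: "S" or "X"}

    def acquire(self, txn, item, mode):
        """Try to grant `mode` ("S" or "X") on `item` to `txn`; return success."""
        others = {t: m for t, m in self.holders.get(item, {}).items() if t != txn}
        if mode == "S":
            compatible = all(m == "S" for m in others.values())
        else:
            compatible = not others  # X needs the item free of other holders
        if compatible:
            self.holders.setdefault(item, {})[txn] = mode
        return compatible

    def release(self, txn, item):
        """Drop whatever lock `txn` holds on `item`, if any."""
        self.holders.get(item, {}).pop(txn, None)
```

Two readers can hold S locks on the same item at once, but a writer has to wait until it is the only transaction interested in that item.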
Two-phase locking (2PL)
The two-phase locking protocol can be viewed as dividing the execution of a transaction into
three parts.
In the first part, when the transaction starts executing, it seeks permission for the locks it
requires.
In the second part, the transaction acquires all the locks. The third part starts as soon as the
transaction releases its first lock.
In the third part, the transaction cannot request any new locks. It only releases the locks it has
acquired.
Two phases of 2PL:
Growing phase: In the growing phase, a new lock on the data item may be acquired by the
transaction, but none can be released.
Shrinking phase: In the shrinking phase, existing locks held by the transaction may be
released, but no new locks can be acquired.
If lock conversion is allowed, then the following rules apply:
Upgrading of a lock (from S(a) to X(a)) is allowed only in the growing phase.
Downgrading of a lock (from X(a) to S(a)) must be done in the shrinking phase.
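The two phases and the lock conversion rules can be sketched for a single transaction as follows. The class name and its methods are hypothetical, and conflicts with other transactions are deliberately left out: the sketch only enforces that nothing is acquired or upgraded once the shrinking phase has begun.

```python
class TwoPhaseTransaction:
    """Enforces the growing/shrinking discipline of 2PL for one transaction."""

    def __init__(self, name):
        self.name = name
        self.locks = {}         # item -> "S" or "X"
        self.shrinking = False  # becomes True at the first release or downgrade

    def lock(self, item, mode):
        """Acquire a new lock or upgrade S to X; growing phase only."""
        if self.shrinking:
            raise RuntimeError("2PL violation: no new locks in the shrinking phase")
        if self.locks.get(item) == "X" and mode == "S":
            raise RuntimeError("use downgrade() to convert X to S")
        self.locks[item] = mode

    def downgrade(self, item):
        """Convert X to S; this counts as the start of the shrinking phase."""
        if self.locks.get(item) != "X":
            raise RuntimeError("only an exclusive lock can be downgraded")
        self.shrinking = True
        self.locks[item] = "S"

    def unlock(self, item):
        """Release a lock; the first release starts the shrinking phase."""
        self.shrinking = True
        self.locks.pop(item, None)
```

After the first `unlock` or `downgrade`, any further call to `lock` raises an error, which is exactly the restriction that gives the protocol its name.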
Strict Two-phase locking (Strict-2PL)
The first phase of Strict-2PL is the same as in 2PL: the transaction acquires its locks and then
continues to execute normally.
The difference is that Strict-2PL does not release a lock as soon as it has finished using the
data item.
Instead, Strict-2PL waits until the whole transaction commits and then releases its locks all at
once. (Holding exclusive locks until commit is what makes the protocol strict; holding every
lock until then is the rigorous variant described earlier.)
As a result, Strict-2PL has no gradual shrinking phase of lock release.
Timestamp Ordering Protocol
The Timestamp Ordering Protocol orders transactions according to their timestamps, that is,
in ascending order of transaction creation time.
An older transaction has higher priority, so it executes first. To assign a timestamp to a
transaction, the protocol uses either the system time or a logical counter.
Lock-based protocols manage the order between conflicting pairs of operations at execution
time, whereas timestamp-based protocols start working as soon as a transaction is created.
Let's assume there are two transactions T1 and T2. Suppose T1 entered the system at time
007 and T2 entered the system at time 009. T1 entered first, so it has the higher priority and
executes first.
The timestamp ordering protocol also maintains the timestamps of the last 'read' and 'write'
operations on each data item.
Working of Timestamp ordering protocol
1. Check the following conditions whenever a transaction Ti issues a Read(X) operation:
If W_TS(X) > TS(Ti), the operation is rejected and Ti is rolled back.
If W_TS(X) <= TS(Ti), the operation is executed and R_TS(X) is set to max(R_TS(X), TS(Ti)).
2. Check the following conditions whenever a transaction Ti issues a Write(X) operation:
If TS(Ti) < R_TS(X), the operation is rejected and Ti is rolled back.
If TS(Ti) < W_TS(X), the operation is rejected and Ti is rolled back.
Otherwise the operation is executed and W_TS(X) is set to TS(Ti).
Where,
TS(Ti) denotes the timestamp of the transaction Ti.
R_TS(X) denotes the read timestamp of data item X.
W_TS(X) denotes the write timestamp of data item X.
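The read and write checks can be sketched as a small scheduler object. The class name, the `Rollback` exception and the choice of 0 as the timestamp of a data item that has never been accessed are assumptions for this example. A rejected operation is signalled by raising `Rollback`, and restarting the transaction with a new timestamp is left to the caller.

```python
class Rollback(Exception):
    """Raised when an operation is rejected and its transaction must roll back."""


class TimestampOrdering:
    def __init__(self):
        self.r_ts = {}   # R_TS(X): largest timestamp that has read X
        self.w_ts = {}   # W_TS(X): timestamp of the last write to X
        self.value = {}  # current value of each data item

    def read(self, ts, item):
        """Read(X) issued by a transaction with timestamp `ts`."""
        if self.w_ts.get(item, 0) > ts:
            # A younger transaction has already overwritten X.
            raise Rollback(f"read of {item} by TS={ts} rejected")
        self.r_ts[item] = max(self.r_ts.get(item, 0), ts)
        return self.value.get(item)

    def write(self, ts, item, new_value):
        """Write(X) issued by a transaction with timestamp `ts`."""
        if ts < self.r_ts.get(item, 0) or ts < self.w_ts.get(item, 0):
            # A younger transaction has already read or written X.
            raise Rollback(f"write of {item} by TS={ts} rejected")
        self.w_ts[item] = ts
        self.value[item] = new_value
```

Using the timestamps from the example above, if T2 (timestamp 9) writes X first, a later read of X by T1 (timestamp 7) is rejected, whereas T1 writing first and T2 reading afterwards is accepted.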
Precedence graph for TS ordering
Advantages and Disadvantages of TO protocol
The TO protocol ensures conflict serializability, because every edge in the precedence graph
goes from an older transaction to a younger one, so the graph has no cycles.
The TO protocol ensures freedom from deadlock, because no transaction ever waits.
But the schedule may not be recoverable and may not even be cascade-free.
Intent Locks
An intent lock is taken when SQL Server wants to acquire a shared (S) lock or exclusive
(X) lock on a resource lower in the lock hierarchy. In practice, before SQL Server
acquires a lock on a page or row, it places an intent lock on the table that contains it.
Recoverability of Schedule
Sometimes a transaction may not execute completely due to a software issue, system crash or
hardware failure. In that case, the failed transaction has to be rolled back. But some other
transactions may have used values produced by the failed transaction, so those transactions
have to be rolled back as well.
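The chain of rollbacks described above can be sketched as follows. The schedule is a list of (transaction, action, item) tuples, a convention chosen for this example, and the sketch assumes that none of the transactions involved has committed yet. It first records which transaction read a value written by which other transaction, and then keeps adding readers of doomed transactions until nothing changes.

```python
def reads_from_pairs(schedule):
    """Return (reader, writer) pairs: reader read a value written by writer."""
    last_writer = {}
    pairs = set()
    for txn, action, item in schedule:
        if action == "W":
            last_writer[item] = txn
        elif item in last_writer and last_writer[item] != txn:
            pairs.add((txn, last_writer[item]))
    return pairs


def transactions_to_roll_back(schedule, failed):
    """All transactions that must be undone when `failed` is rolled back."""
    pairs = reads_from_pairs(schedule)
    doomed = {failed}
    changed = True
    while changed:
        changed = False
        for reader, writer in pairs:
            if writer in doomed and reader not in doomed:
                doomed.add(reader)  # reader used a value that is being undone
                changed = True
    return doomed
```

If T2 reads a value written by T1 and T3 reads a value written by T2, a failure of T1 forces all three to roll back, while a transaction that only touched unrelated items is unaffected.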