
Database


Chapter 2-ADBMS

Chapter 2

Transaction Processing Concepts


What do you mean by Transaction Processing Systems?

Transaction processing systems are systems with large databases and hundreds of concurrent
users executing database transactions. Examples of such systems include airline reservations,
banking, credit card processing, online retail purchasing and many other applications. These
systems require high availability and fast response time for hundreds of concurrent users.

A DBMS can be classified by the number of users who can use the system concurrently. A DBMS is
single-user if at most one user at a time can use the system, and it is multiuser if many users
can use the system, and hence access the database, concurrently. Database systems used in banks,
insurance agencies, stock exchanges, supermarkets, and many other applications are multiuser
systems. In these systems, hundreds or thousands of users are typically operating on the
database by submitting transactions concurrently to the system.

Multiple users can access databases simultaneously because of the concept of multiprogramming,
which allows the operating system of the computer to execute multiple programs—or processes—
at the same time. A single central processing unit (CPU) can only execute at most one process at
a time. However, multiprogramming operating systems execute some commands from one
process, then suspend that process and execute some commands from the next process, and so on.
A process is resumed at the point where it was suspended whenever it gets its turn to use the CPU
again. Hence, concurrent execution of processes is actually interleaved. Interleaving keeps the
CPU busy when a process requires an input or output (I/O) operation, such as reading a block from
disk. The CPU is switched to execute another process rather than remaining idle during I/O time.
Interleaving also prevents a long process from delaying other processes. If the computer system
has multiple hardware processors (CPUs), parallel processing of multiple processes is possible.

What is a Transaction?

A transaction is a unit of program execution that accesses and possibly updates various
data items.

• It is either completed in its entirety or not done at all.
• This logical unit of database processing includes one or more access operations (read -
  retrieval, write - insert or update, delete).
• E.g. transaction to transfer $50 from account A to account B:

    1. read(A)
    2. A := A – 50
    3. write(A)
    4. read(B)
    5. B := B + 50
    6. write(B)

• Database operations that form a transaction can either be embedded within an application
  program or they can be specified interactively via a high-level query language such as
  SQL.
• One way of specifying the transaction boundaries is by specifying explicit begin
  transaction and end transaction statements in an application program.
• A successful transaction changes the database from one consistent state to another. A
  consistent database state is one in which all data integrity constraints are satisfied. To
  ensure consistency of the database, every transaction must begin with the database in a
  known consistent state and end in a consistent state.
• Two main issues to deal with:
  → Failures of various kinds, such as hardware failures and system crashes
  → Concurrent execution of multiple transactions
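
The transfer example above can be sketched with explicit transaction boundaries. This is only an illustrative sketch using Python's built-in sqlite3 module; the table name `account`, the starting balances, and the helper `transfer` are assumptions made for the example and do not come from the chapter.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` as one all-or-nothing transaction."""
    cur = conn.cursor()
    try:
        cur.execute("BEGIN")                                   # begin transaction
        (balance,) = cur.execute(
            "SELECT balance FROM account WHERE name = ?", (src,)
        ).fetchone()                                           # read(A)
        if balance < amount:
            raise ValueError("insufficient funds")
        cur.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                    (amount, src))                             # write(A)
        cur.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                    (amount, dst))                             # write(B)
        cur.execute("COMMIT")                                  # end transaction
        return True
    except Exception:
        cur.execute("ROLLBACK")                                # undo partial work
        return False

# isolation_level=None puts the connection in autocommit mode,
# so the BEGIN/COMMIT/ROLLBACK statements above are the only boundaries.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 20)])

ok = transfer(conn, "A", "B", 50)       # succeeds and commits
failed = transfer(conn, "A", "B", 500)  # aborts and is rolled back
balances = dict(conn.execute("SELECT name, balance FROM account"))
```

After the second call is rolled back, the balances still reflect only the first, committed transfer.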

Although a SELECT query in SQL does not make any changes in the table, the SQL code
represents a transaction because it accesses the database. If the database existed in a consistent
state before the access, the database remains in a consistent state after the access because the
transaction did not alter the database. A transaction may consist of a single SQL statement or
a collection of related SQL statements.

If the database operations in a transaction do not update the database but only retrieve data, the
transaction is called a read-only transaction; otherwise it is known as a read-write transaction.

Note: By default, MS Access does not support transaction management as discussed here. More
sophisticated DBMSs, such as Oracle, SQL Server, and DB2, do support the transaction
management components discussed in this chapter.

A transaction can have one of two outcomes:

1. Committed: The transaction completed successfully and the database reaches a new
consistent state.
2. Aborted: The transaction did not execute successfully. The database must be restored
to the consistent state it was in before the transaction started. Such a transaction is said
to be rolled back or undone.

For recovery purposes, the system needs to keep track of when the transaction starts,
terminates, and commits or aborts.

Why Is Concurrency Control Needed?

The coordination of the simultaneous execution of transactions in a multiuser database system is
known as concurrency control. The objective of concurrency control is to ensure the
serializability of transactions in a multiuser database environment. Concurrency control is
important because the simultaneous execution of transactions over a shared database can create
several data integrity and consistency problems. The three main problems are lost updates,
uncommitted data, and inconsistent retrievals.

1. Lost Update

The lost update problem occurs when two concurrent transactions, T1 and T2, are updating the
same data element and one of the updates is lost (overwritten by the other transaction.)

Assume that you have a product whose current quantity value is 35. Also assume that
two concurrent transactions, T1 and T2, occur that update the quantity value for some item in the
database. The transactions are as below:

Transaction          Computation

T1: Buy 100 units    quantity = quantity + 100
T2: Sell 30 units    quantity = quantity - 30

The following table shows the serial execution of those transactions under normal circumstances,
yielding the correct answer quantity = 105.

Time  Transaction  Step                   Stored Value

1     T1           Read quantity          35
2     T1           quantity = 35 + 100
3     T1           Write quantity         135
4     T2           Read quantity          135
5     T2           quantity = 135 - 30
6     T2           Write quantity         105

But suppose that a transaction is able to read a product’s quantity value from the table before a
previous transaction (using the same product) has been committed. The sequence depicted below
shows how the lost update problem can arise. Note that the first transaction (T1) has not yet been
committed when the second transaction (T2) is executed.

Time  Transaction  Step                   Stored Value

1     T1           Read quantity          35
2     T2           Read quantity          35
3     T1           quantity = 35 + 100
4     T2           quantity = 35 - 30
5     T1           Write quantity         135
6     T2           Write quantity         5 (Lost update)
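
The two tables above can be replayed with a small simulation. This is a minimal sketch, not a DBMS: the function `run` and the `(transaction, action, argument)` step format are hypothetical names chosen for the illustration. Each transaction works on a private copy of the value it read, which is exactly what makes the lost update possible.

```python
def run(schedule, stored):
    """Replay (txn, action, arg) steps against a single stored quantity."""
    local = {}                      # private working copy per transaction
    for txn, action, arg in schedule:
        if action == "read":
            local[txn] = stored     # copy the stored value
        elif action == "add":
            local[txn] += arg       # compute on the private copy
        elif action == "write":
            stored = local[txn]     # overwrite the stored value
    return stored

T1 = [("T1", "read", None), ("T1", "add", 100), ("T1", "write", None)]
T2 = [("T2", "read", None), ("T2", "add", -30), ("T2", "write", None)]

serial = run(T1 + T2, 35)
# Same steps, interleaved as in the second table: both read 35 first.
interleaved = run([T1[0], T2[0], T1[1], T2[1], T1[2], T2[2]], 35)
```

In the interleaved run T2 overwrites T1's result with a value computed from the stale read, so the addition of 100 units disappears.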

2. Uncommitted Data (The Temporary Update (or Dirty Read) Problem)

The phenomenon of uncommitted data occurs when two transactions, T1 and T2, are executed
concurrently and the first transaction (T1) is rolled back after the second transaction (T2) has
already accessed the uncommitted data—thus violating the isolation property of transactions. To
illustrate that possibility, let’s use the same transactions described during the lost updates
discussion. T1 is forced to roll back due to an error. T1 transaction is rolled back to eliminate the
addition of the 100 units. Because T2 subtracts 30 from the original 35 units, the correct answer
should be 5.

The following table shows the serial execution of those transactions under normal circumstances,
yielding the correct answer quantity = 5.

Time  Transaction  Step                   Stored Value

1     T1           Read quantity          35
2     T1           quantity = 35 + 100
3     T1           Write quantity         135
4     T1           Rollback               35
5     T2           Read quantity          35
6     T2           quantity = 35 - 30
7     T2           Write quantity         5

The following table shows how the uncommitted data problem can arise when the ROLLBACK is
completed after T2 has begun its execution.

Time  Transaction  Step                   Stored Value

1     T1           Read quantity          35
2     T1           quantity = 35 + 100
3     T1           Write quantity         135
4     T2           Read quantity          135 (Reading uncommitted data)
5     T2           quantity = 135 - 30
6     T1           ROLLBACK               35
7     T2           Write quantity         105
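
The dirty read can be reproduced with a short simulation. This is a sketch under simplifying assumptions: a single stored value, and a hypothetical `run` helper that remembers the value a transaction overwrote (its "before" value) so that a rollback can restore it.

```python
def run(schedule, stored):
    """Replay (txn, action, arg) steps; 'rollback' restores the overwritten value."""
    local, before = {}, {}
    for txn, action, arg in schedule:
        if action == "read":
            local[txn] = stored
        elif action == "add":
            local[txn] += arg
        elif action == "write":
            before.setdefault(txn, stored)   # keep the value seen before the first write
            stored = local[txn]
        elif action == "rollback":
            stored = before.pop(txn, stored)
    return stored

serial = run([
    ("T1", "read", None), ("T1", "add", 100), ("T1", "write", None),
    ("T1", "rollback", None),
    ("T2", "read", None), ("T2", "add", -30), ("T2", "write", None),
], 35)

dirty = run([
    ("T1", "read", None), ("T1", "add", 100), ("T1", "write", None),
    ("T2", "read", None),                    # reads the uncommitted 135
    ("T2", "add", -30),
    ("T1", "rollback", None),                # stored value goes back to 35
    ("T2", "write", None),                   # but T2 still writes 135 - 30
], 35)
```

The serial run ends at 5, while the interleaved run ends at 105 because T2 based its computation on a value that was never committed.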

3. Inconsistent Retrievals (The Incorrect Summary Problem)

Inconsistent retrievals occur when a transaction accesses data before and after another
transaction(s) finish working with such data. For example, an inconsistent retrieval would occur
if transaction T1 calculated some summary (aggregate) function over a set of data while another
transaction (T2) was updating the same data. The problem is that the transaction might read some
data before they are changed and other data after they are changed, thereby yielding inconsistent
results.

To illustrate that problem, assume the following conditions:

1. T1 calculates the total quantity of products stored in the PRODUCT table.

2. At the same time, T2 updates the quantity for two of the items in the PRODUCT table.

The two transactions are shown as:

Transaction 1 (T1):

    SELECT SUM(quantity) FROM product;

Transaction 2 (T2):

    UPDATE product SET quantity = quantity + 10 WHERE pid = 1003;
    UPDATE product SET quantity = quantity - 10 WHERE pid = 1004;
    COMMIT;

The following table shows the serial execution of those transactions under normal circumstances:

Product_id (pid)   Before update   After update

1001               8               8
1002               32              32
1003               15              25 (15 + 10)
1004               23              13 (23 - 10)
1005               8               8

Total              86              86

The table below demonstrates that inconsistent retrievals are possible during the transaction
execution, making the result of T1’s execution incorrect.

Time  Transaction  Action                        Value  Total

1     T1           Read quantity for pid=1001    8      8
2     T1           Read quantity for pid=1002    32     40
3     T2           Read quantity for pid=1003    15
4     T2           quantity = quantity + 10
5     T2           Write quantity for pid=1003   25
6     T1           Read quantity for pid=1003    25     65
7     T1           Read quantity for pid=1004    23     88
8     T2           Read quantity for pid=1004    23
9     T2           quantity = quantity - 10
10    T2           Write quantity for pid=1004   13
11    T2           Commit
12    T1           Read quantity for pid=1005    8      96

T1 reports a total of 96, although the correct total is 86 both before and after T2's updates:
T1 read pid=1003 after it was increased but pid=1004 before it was decreased.
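
The interleaving in the table can be traced in a few lines. This is an illustrative sketch only; the function `interleaved_sum` is a hypothetical name and simply hard-codes the order of steps shown in the table.

```python
PRODUCT = {1001: 8, 1002: 32, 1003: 15, 1004: 23, 1005: 8}

def interleaved_sum(product):
    """Replay the table: T1 sums the rows while T2 updates pid 1003 and 1004."""
    table = dict(product)      # work on a copy of the PRODUCT quantities
    total = 0
    total += table[1001]       # T1 reads 8
    total += table[1002]       # T1 reads 32
    table[1003] += 10          # T2 writes 25
    total += table[1003]       # T1 reads the new value 25
    total += table[1004]       # T1 reads the old value 23
    table[1004] -= 10          # T2 writes 13, then commits
    total += table[1005]       # T1 reads 8
    return total, sum(table.values())

reported, actual = interleaved_sum(PRODUCT)
```

Here `reported` is what T1 returns and `actual` is the true total after T2 commits; they differ by the 10 units that T1 counted at pid 1003 but did not see removed at pid 1004.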

The Unrepeatable Read Problem. A transaction T1 reads the same item twice and the item is
changed by another transaction T2 between the two reads. Hence, T1 receives different values for
its two reads of the same item. This may occur, for example, if during an airline reservation
transaction, a customer inquires about seat availability on several flights. When the customer
decides on a particular flight, the transaction then reads the number of seats on that flight a second
time before completing the reservation, and it may end up reading a different value for the item.

Why Recovery Is Needed?

Whenever a transaction is submitted to a DBMS for execution, the system is responsible for
making sure that either all the operations in the transaction are completed successfully and their
effect is recorded permanently in the database, or that the transaction does not have any effect on
the database or any other transactions. In the first case, the transaction is said to be committed,
whereas in the second case, the transaction is aborted. Database recovery restores a database from
a given state (usually inconsistent) to a previously consistent state after a failure. It is a service
provided by the DBMS to ensure that the database is reliable and remains in a consistent state in
the presence of failure.

Types of Failures

Failures are generally classified as transaction, system, and media failures. There are several
possible reasons for a transaction to fail in the middle of execution:

1. A computer failure (system crash): A hardware, software, or network error occurs in the
computer system during transaction execution. Hardware crashes are usually media failures—
for example, main memory failure.

2. A transaction or system error: Some operation in the transaction, such as integer overflow
or division by zero, or a logical programming error, may cause it to fail.
3. Local errors or exception conditions detected by the transaction: Certain conditions
may occur that necessitate cancellation of the transaction. (For example, an exception condition
such as an insufficient account balance in a banking database may cause a transaction such as a
fund withdrawal to be canceled.) This exception should be programmed in the transaction itself
and hence would not be considered a failure.
4. Concurrency control enforcement: The concurrency control method may decide to abort the
transaction, to be restarted later, because it violates serializability or because several
transactions are in a state of deadlock.
5. Disk failure: Some disk blocks may lose their data because of a read or write malfunction or
because of a disk read/write head crash. This may happen during a read or a write operation of
the transaction.

6. Physical problems and catastrophes: This refers to an endless list of problems that includes
power or air-conditioning failure, fire, theft, sabotage, overwriting disks or tapes by mistake.

Transaction States:

1. Active state
2. Partially committed state
3. Committed state
4. Failed state
5. Terminated State

State transition diagram illustrating the states for transaction execution

1. A transaction goes into the active state immediately after it starts execution, where it can
issue READ or WRITE operations.
2. When the transaction ends, it moves to the partially committed state, where a recovery
protocol checks that a system failure will not result in an inability to record the changes
of the transaction permanently. (Changes are recorded in the TRANSACTION LOG.)

a. If this check is successful, the transaction reaches its commit point and enters the
committed state. All its changes must then be recorded permanently in the database.

b. If this check fails, it goes to the failed state.

3. A transaction can go to the failed state from the partially committed state if any of the
checks there fails, or directly from the active state if it is aborted.
a. The transaction may then have to be rolled back to undo the effect of its WRITE
operations.
4. The terminated state corresponds to the transaction leaving the system.
5. Failed or aborted transactions may be restarted later, either automatically or after being
resubmitted by the user.
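
The state transitions described above can be written down as a small lookup table. This is only a sketch of the diagram as described in the text; the names `TRANSITIONS` and `is_valid_history` are hypothetical.

```python
# Allowed next states for each transaction state, as described in the steps above.
TRANSITIONS = {
    "active": {"partially committed", "failed"},
    "partially committed": {"committed", "failed"},
    "committed": {"terminated"},
    "failed": {"terminated"},
    "terminated": set(),
}

def is_valid_history(states):
    """Check that a sequence of states starts at 'active' and follows the diagram."""
    if not states or states[0] != "active":
        return False
    return all(nxt in TRANSITIONS[cur] for cur, nxt in zip(states, states[1:]))

committed_run = ["active", "partially committed", "committed", "terminated"]
aborted_run = ["active", "failed", "terminated"]
impossible_run = ["active", "committed"]   # skips the partially committed state
```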

Transaction Properties

Each individual transaction must ensure atomicity, consistency, isolation, and durability.
These properties are sometimes referred to as the ACID test. In addition, when executing multiple
transactions, the DBMS must schedule the concurrent execution of the transaction’s operations.
The schedule of such transaction’s operations must exhibit the property of serializability.

• Atomicity requires that all operations (SQL requests) of a transaction be completed; if not,
  the transaction is aborted. If a transaction T1 has four SQL requests, all four requests must be
  successfully completed; otherwise, the entire transaction is aborted. In other words, a
  transaction is treated as a single, indivisible, logical unit of work.
• Consistency indicates the permanence of the database’s consistent state. A transaction takes
  a database from one consistent state to another consistent state. When a transaction is
  completed, the database must be in a consistent state; if any of the transaction parts violates
  an integrity constraint, the entire transaction is aborted.
• Isolation means that the data used during the execution of a transaction cannot be used by a
  second transaction until the first one is completed. In other words, if a transaction T1 is being
  executed and is using the data item X, that data item cannot be accessed by any other
  transaction (T2, ..., Tn) until T1 ends. This property is particularly useful in multiuser database
  environments because several users can access and update the database at the same time.
• Durability ensures that once transaction changes are done (committed), they cannot be
  undone or lost, even in the event of a system failure.
• Serializability ensures that the schedule for the concurrent execution of the transactions
  yields consistent results. This property is important in multiuser and distributed databases,
  where multiple transactions are likely to be executed concurrently. Naturally, if only a single
  transaction is executed, serializability is not an issue.

A single-user database system automatically ensures serializability and isolation of the database
because only one transaction is executed at a time. The atomicity, consistency, and durability of
transactions must be guaranteed by the single-user DBMSs. (Even a single-user DBMS must
manage recovery from errors created by operating-system-induced interruptions, power
interruptions, and improper application execution.)

Multiuser databases are typically subject to multiple concurrent transactions. Therefore, the
multiuser DBMS must implement controls to ensure serializability and isolation of transactions in
addition to atomicity and durability—to guard the database’s consistency and integrity. For
example, if several concurrent transactions are executed over the same data set and the second
transaction updates the database before the first transaction is finished, the isolation property is
violated and the database is no longer consistent. The DBMS must manage the transactions by
using concurrency control techniques to avoid such undesirable situations.

The Transaction Log

A DBMS uses a transaction log to keep track of all transactions that update the database. The
information stored in this log is used by the DBMS for a recovery requirement triggered by a
ROLLBACK statement, a program’s abnormal termination, or a system failure such as a network
discrepancy or a disk crash. Some RDBMSs use the transaction log to recover a database forward
to a currently consistent state. After a server failure, for example, Oracle automatically rolls back
uncommitted transactions and rolls forward transactions that were committed but not yet written
to the physical database. This behaviour is required for transactional correctness and is typical of
any transactional DBMS. Although using a transaction log increases the processing overhead of a
DBMS, the ability to restore a corrupted database is worth the price.

While the DBMS executes transactions that modify the database, it also automatically updates the
transaction log. The transaction log stores:

1. A record for the beginning of the transaction.
2. For each transaction component (SQL statement):
   a. The type of operation being performed (update, delete, insert).
   b. The names of the objects affected by the transaction (the name of the table).
   c. The “before” and “after” values for the fields being updated.
   d. Pointers to the previous and next transaction log entries for the same transaction.
3. The ending (COMMIT) of the transaction.

The transaction log is a critical part of the database and it is usually implemented as one or
more files that are managed separately from the actual database files. The transaction log is subject
to common dangers such as disk-full conditions and disk crashes. Because the transaction log
contains some of the most critical data in a DBMS, some implementations support logs on several
different disks to reduce the consequences of a system failure.
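
To make the role of the "before" and "after" values concrete, the following sketch replays a tiny log after a simulated crash. It is an illustration under stated assumptions, not how any particular DBMS implements its log: the `LogRecord` layout, the `recover` function, and the sample data are all hypothetical, and the sketch assumes no uncommitted and committed transaction wrote the same item.

```python
from dataclasses import dataclass

@dataclass
class LogRecord:
    txn: str
    op: str          # "start", "update", or "commit"
    item: str = ""
    before: int = 0  # value before the update
    after: int = 0   # value after the update

def recover(db, log):
    """Roll forward committed updates, then roll back uncommitted ones."""
    committed = {r.txn for r in log if r.op == "commit"}
    db = dict(db)
    for r in log:                               # redo, in log order
        if r.op == "update" and r.txn in committed:
            db[r.item] = r.after
    for r in reversed(log):                     # undo, in reverse log order
        if r.op == "update" and r.txn not in committed:
            db[r.item] = r.before
    return db

log = [
    LogRecord("T1", "start"),
    LogRecord("T1", "update", "X", before=35, after=135),
    LogRecord("T2", "start"),
    LogRecord("T2", "update", "Y", before=10, after=20),
    LogRecord("T2", "commit"),
    # crash: T1 never committed
]
# State found on disk after the crash: T1's write reached disk, T2's did not.
on_disk = {"X": 135, "Y": 10}
recovered = recover(on_disk, log)
```

T1's uncommitted change to X is undone using its "before" value, and T2's committed change to Y is redone using its "after" value.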

Characterizing Schedules Based on Recoverability and Serializability

The Scheduler

Schedule – a sequence of instructions that specify the chronological order in which instructions
of concurrent transactions are executed. When transactions are executing concurrently in an
interleaved fashion, the order of execution of operations from the various transactions, forms what
is known as a transaction schedule (or history).

– A schedule for a set of transactions must consist of all instructions of those transactions.
– It must preserve the order in which the instructions appear in each individual transaction.

– A transaction that successfully completes its execution will have a commit instruction
as the last statement.

– A transaction that fails to successfully complete its execution will have an abort
instruction as the last statement.

As long as two transactions, T1 and T2, access unrelated data, there is no conflict and the order
of execution is irrelevant to the final outcome. But if the transactions operate on related (or the
same) data, conflict is possible among the transaction components and the selection of one
execution order over another might have some undesirable consequences. So how is the correct
order determined, and who determines that order? Fortunately, the DBMS handles that tricky
assignment by using a built-in scheduler.

The scheduler is a special DBMS process that establishes the order in which the operations
within concurrent transactions are executed. The scheduler interleaves the execution of database
operations to ensure serializability and isolation of transactions. To determine the appropriate
order, the scheduler bases its actions on concurrency control algorithms, such as locking or
timestamping methods.

– The scheduler also makes sure that the computer’s central processing unit (CPU) and
storage systems are used efficiently.

– Additionally, the scheduler facilitates data isolation to ensure that two transactions do not
update the same data element at the same time.

– Two operations are said to conflict if all of the following hold:

   • They belong to different transactions.
   • They access the same database item.
   • At least one of the two operations is a write operation.

Operation in T1    Operation in T2    Result

Read               Read               No conflict
Read               Write              Conflict
Write              Read               Conflict
Write              Write              Conflict

– Several methods have been proposed to schedule the execution of conflicting operations
in concurrent transactions.
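
The three conditions for a conflict translate directly into a predicate. A minimal sketch follows; the `Op` tuple and the function name `conflicts` are hypothetical names for the illustration.

```python
from typing import NamedTuple

class Op(NamedTuple):
    txn: str    # transaction that issues the operation, e.g. "T1"
    kind: str   # "R" for read, "W" for write
    item: str   # database item accessed

def conflicts(a, b):
    """True if a and b satisfy all three conditions for a conflict."""
    return (
        a.txn != b.txn                  # different transactions
        and a.item == b.item            # same database item
        and "W" in (a.kind, b.kind)     # at least one write
    )

read_read = conflicts(Op("T1", "R", "X"), Op("T2", "R", "X"))
read_write = conflicts(Op("T1", "R", "X"), Op("T2", "W", "X"))
write_write = conflicts(Op("T1", "W", "X"), Op("T2", "W", "X"))
other_item = conflicts(Op("T1", "W", "X"), Op("T2", "W", "Y"))
same_txn = conflicts(Op("T1", "R", "X"), Op("T1", "W", "X"))
```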

Characterizing Schedules Based on Serializability

(Serial, Nonserial, and Conflict-Serializable schedules, View Serializable schedules)

Serial Schedule: A schedule S is serial if, for every transaction T participating in the schedule, all
the operations of T are executed consecutively in the schedule.

– No interleaving occurs in serial schedule

– If we consider transactions to be independent, then every serial schedule is considered correct.

– E.g.: Let T1 transfer $50 from A to B, and T2 transfer 10% of the balance from A to B.

– A serial schedule in which T1 is followed by T2, or T2 is followed by T1, can be given as:

Serial schedule: T1 followed by T2

T1                  T2

Read(A)
A := A - 50
Write(A)
Read(B)
B := B + 50
Write(B)
                    Read(A)
                    temp := A * 0.1
                    A := A - temp
                    Write(A)
                    Read(B)
                    B := B + temp
                    Write(B)

Serial schedule: T2 followed by T1

T1                  T2

                    Read(A)
                    temp := A * 0.1
                    A := A - temp
                    Write(A)
                    Read(B)
                    B := B + temp
                    Write(B)
Read(A)
A := A - 50
Write(A)
Read(B)
B := B + 50
Write(B)

– In serial execution there is no interference between transactions, because only one is
executing at any given time. So, it prevents the occurrence of concurrency problems.

– A serial schedule never leaves the database in an inconsistent state, so every serial schedule
is considered correct.

Problems of Serial Schedules:

– They limit concurrency or interleaving of operations.

– If a transaction waits for an I/O operation to complete, we cannot switch the CPU
to another transaction.

– If some transaction T is long, the other transactions must wait for T to complete all its
operations.

Non-Serial Schedule (Concurrent Schedule):

– The schedule that is not serial is called non-serial schedule.

– Here each sequence interleaves operations from the two transactions.

– E.g.: Let T1 transfer $50 from A to B, and T2 transfer 10% of the balance from A to B.

Two non-serial schedules with interleaving of operations for T1 and T2 are listed below.

Non-Serial Schedule I

T1                  T2

Read(A)
A := A - 50
Write(A)
                    Read(A)
                    temp := A * 0.1
                    A := A - temp
                    Write(A)
Read(B)
B := B + 50
Write(B)
                    Read(B)
                    B := B + temp
                    Write(B)

Non-Serial Schedule II

T1                  T2

Read(A)
A := A - 50
                    Read(A)
                    temp := A * 0.1
                    A := A - temp
                    Write(A)
                    Read(B)
Write(A)
Read(B)
B := B + 50
Write(B)
                    B := B + temp
                    Write(B)

Non-serial schedule I (above) gives a correct result, equal to that of a serial schedule, whereas
non-serial schedule II gives an erroneous result. Determining which of the non-serial schedules
always give a correct result and which may give erroneous results helps in deciding how the
operations of transactions may be interleaved. The concept used to characterize schedules in
this manner is that of serializability of a schedule.
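
The difference between the schedules can be checked by replaying them. The sketch below is illustrative only: the starting balances A = 1000 and B = 2000 are assumed values not given in the text, the helper names are hypothetical, and 10% of A is computed with integer division to keep the arithmetic exact. Each transaction keeps private local variables, so a stale read stays stale until the transaction writes.

```python
def run(schedule, db):
    """Replay (txn, (kind, arg)) steps against a copy of the database."""
    db = dict(db)
    local = {}                                  # private variables per transaction
    for txn, (kind, arg) in schedule:
        v = local.setdefault(txn, {})
        if kind == "read":
            v[arg] = db[arg]
        elif kind == "write":
            db[arg] = v[arg]
        else:                                   # "calc": arg updates local variables
            arg(v)
    return db

def debit50(v): v["A"] -= 50
def credit50(v): v["B"] += 50
def take_tenth(v): v["temp"] = v["A"] // 10     # 10% of A
def debit_temp(v): v["A"] -= v["temp"]
def credit_temp(v): v["B"] += v["temp"]

T1 = [("T1", s) for s in [("read", "A"), ("calc", debit50), ("write", "A"),
                          ("read", "B"), ("calc", credit50), ("write", "B")]]
T2 = [("T2", s) for s in [("read", "A"), ("calc", take_tenth), ("calc", debit_temp),
                          ("write", "A"), ("read", "B"), ("calc", credit_temp),
                          ("write", "B")]]

start = {"A": 1000, "B": 2000}                  # assumed starting balances
serial_12 = run(T1 + T2, start)
serial_21 = run(T2 + T1, start)
nonserial_1 = run(T1[:3] + T2[:4] + T1[3:] + T2[4:], start)
nonserial_2 = run(T1[:2] + T2[:5] + T1[2:] + T2[5:], start)
```

Both serial orders and non-serial schedule I preserve the total A + B = 3000. Non-serial schedule II ends with A = 950 and B = 2100, a total of 3050, because T1 overwrites A with a value computed before T2's update.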

An important aspect of concurrency control, called serializability theory, attempts to
determine which schedules are correct and which are not, to develop techniques that allow
only correct schedules.

Serializable Schedule: A schedule S of n transactions is serializable if it is equivalent to
some serial schedule of the same n transactions. A serializable schedule is a schedule of a
transaction’s operations in which the interleaved execution of the transactions (T1, T2, T3,
etc.) yields the same results as if the transactions were executed in serial order (one after
another).

When are two schedules considered equivalent?

There are several ways to define schedule equivalence:

1. Result Equivalence

2. Conflict Equivalence

3. View Equivalence

– Two schedules are called result equivalent if they produce the same final state of the
database. However, two different schedules may accidentally produce the same final state.
Hence result equivalence is not used to define equivalence of schedules.

– Two schedules are said to be conflict equivalent if the order of any two conflicting
operations is the same in both schedules. If a schedule S can be transformed into a schedule
S´ by a series of swaps of non-conflicting instructions, we say that S and S´ are conflict
equivalent.

Conflict Serializable Schedule

Any given concurrent schedule is said to be a conflict serializable schedule if it is conflict
equivalent to one of the possible serial schedules.

– Schedule S is conflict serializable if it is conflict equivalent to some serial schedule S’.

To check whether two schedules are conflict equivalent, apply the following rule:

– For any two given schedules, say S1 and S2, if the order of all possible conflicting
operations is the same in both, then they are said to be conflict equivalent.

– E.g.: Consider two transactions T1 and T2 with the following schedules.

Schedule I (S1)             Schedule II (S2)            Schedule III (S3)
(Concurrent Schedule)       (Serial Schedule)           (Serial Schedule)

T1        T2                T2        T1                T1        T2

R(A)                        W(A)                        R(A)
          W(A)              W(B)                        R(B)
R(B)                                  R(A)                        W(A)
          W(B)                        R(B)                        W(B)

Existing conflicting        Existing conflicting        Existing conflicting
operation is R-W            operation is W-R            operation is R-W

– S1 → R1(A), W2(A), R1(B), W2(B): R-W conflict

– S2 → W2(A), W2(B), R1(A), R1(B): W-R conflict

– S3 → R1(A), R1(B), W2(A), W2(B): R-W conflict

In S1 and S3, R1(A) precedes W2(A) and R1(B) precedes W2(B), whereas in S2 both orders are
reversed. So S1 and S3 are conflict equivalent.
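
The comparison of S1, S2 and S3 can be automated by collecting every ordered pair of conflicting operations and comparing the resulting sets. This is a minimal sketch; the tuple encoding `(kind, transaction, item)` and the function names are assumptions made for the example, and it assumes no transaction repeats the exact same operation twice.

```python
from itertools import combinations

def conflict_pairs(schedule):
    """Return the set of (earlier, later) pairs of conflicting operations."""
    pairs = set()
    for a, b in combinations(schedule, 2):      # a appears before b in the schedule
        different_txn = a[1] != b[1]
        same_item = a[2] == b[2]
        has_write = "W" in (a[0], b[0])
        if different_txn and same_item and has_write:
            pairs.add((a, b))
    return pairs

def conflict_equivalent(s1, s2):
    """Same operations, and every conflicting pair appears in the same order."""
    return sorted(s1) == sorted(s2) and conflict_pairs(s1) == conflict_pairs(s2)

# (kind, transaction number, item)
S1 = [("R", 1, "A"), ("W", 2, "A"), ("R", 1, "B"), ("W", 2, "B")]
S2 = [("W", 2, "A"), ("W", 2, "B"), ("R", 1, "A"), ("R", 1, "B")]
S3 = [("R", 1, "A"), ("R", 1, "B"), ("W", 2, "A"), ("W", 2, "B")]

s1_s3 = conflict_equivalent(S1, S3)
s1_s2 = conflict_equivalent(S1, S2)
```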

Note:

– Being Serializable is distinct from being Serial.

– A serial schedule leads to inefficient utilization of the CPU because there is no interleaving
of operations from different transactions.

– A serializable schedule gives the benefits of concurrent execution without giving up any
correctness.

– Practically, it is difficult to test for serializability. Also, it is impractical to execute the
schedule and then test the result for serializability and cancel the effect of the schedule if
it is not serializable. The approach taken in most commercial DBMSs is to design
protocols that will ensure serializability of all schedules.

Testing for Conflict Serializability of a Schedule Using Precedence Graph

The algorithm looks only at the read_item and write_item operations in a schedule to construct a
precedence graph (or serialization graph), which is a directed graph G = (N, E) that consists
of a set of nodes N = {T1, T2, ..., Tn } and a set of directed edges E = {e1,e2, ..., em }. There is
one node in the graph for each transaction Ti in the schedule. Each edge ei in the graph is of the
form (Tj→Tk ), 1 ≤ j ≤ n, 1 ≤ k ≤ n, where Tj is the starting node of ei and Tk is the ending node
of ei. Such an edge from node Tj to node Tk is created by the algorithm if one of the operations in
Tj appears in the schedule before some conflicting operation in Tk.

Algorithm: Testing Conflict Serializability of a Schedule S

1. For each transaction Ti participating in schedule S, create a node labelled Ti in the
precedence graph.
2. For each case in S where Tj executes a read_item(X) after Ti executes a write_item(X),
create an edge (Ti→ Tj) in the precedence graph.
3. For each case in S where Tj executes a write_item(X) after Ti executes a read_item(X),
create an edge (Ti→Tj) in the precedence graph.
4. For each case in S where Tj executes a write_item(X) after Ti executes a write_item(X),
create an edge (Ti→ Tj) in the precedence graph.
5. The schedule S is serializable if and only if the precedence graph has no cycles.

(Cycle means - the sequence starts and ends at the same node).
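
The five steps above can be sketched as follows. This is an illustrative implementation, not a definitive one: the operation encoding `(kind, transaction, item)` and the function names are hypothetical, and the cycle test uses a standard depth-first search.

```python
def precedence_graph(schedule):
    """Steps 1-4: one node per transaction, one edge per ordered conflict."""
    nodes = {txn for _, txn, _ in schedule}
    edges = set()
    for i, (kind1, t1, item1) in enumerate(schedule):
        for kind2, t2, item2 in schedule[i + 1:]:
            if t1 != t2 and item1 == item2 and "W" in (kind1, kind2):
                edges.add((t1, t2))             # t1's operation comes first
    return nodes, edges

def has_cycle(nodes, edges):
    """Step 5: depth-first search that looks for an edge back to the current path."""
    successors = {n: [b for a, b in edges if a == n] for n in nodes}
    state = dict.fromkeys(nodes, 0)             # 0 unvisited, 1 on path, 2 finished

    def visit(n):
        state[n] = 1
        for m in successors[n]:
            if state[m] == 1 or (state[m] == 0 and visit(m)):
                return True
        state[n] = 2
        return False

    return any(state[n] == 0 and visit(n) for n in nodes)

def is_conflict_serializable(schedule):
    return not has_cycle(*precedence_graph(schedule))

good = [("R", "T1", "A"), ("W", "T2", "A"), ("R", "T1", "B"), ("W", "T2", "B")]
bad = [("R", "T1", "A"), ("W", "T2", "A"), ("W", "T1", "A")]
good_edges = precedence_graph(good)[1]
bad_edges = precedence_graph(bad)[1]
```

The schedule `bad` produces both T1→T2 (read before write) and T2→T1 (write before write), which forms a cycle.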

Topological Sorting

The process of ordering the nodes of an acyclic graph is known as topological sorting. We use
topological sorting to find the equivalent serial schedule for a conflict serializable schedule.

1. Consider the in-degree of the nodes. Find the nodes with in-degree zero, that is, the
vertices with no incoming edges.

In-degree = number of edges coming into the node.

Out-degree = number of edges going out of the node.

[Figure: precedence graph with a single edge T1 → T2]

For the given example, the in-degree of T1 is 0 and the in-degree of T2 is 1.

2. Note down the node whose in-degree is zero (here T1) and remove it from the graph.

3. Remove all edges that are connected with T1.

4. Consider the remaining nodes and repeat the above steps. Here only T2 remains, and its
in-degree is now zero. So the equivalent serial schedule is T1, T2.
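
The in-degree procedure above can be sketched in code. This is a minimal illustration; the function name `topological_order` and the example graphs are assumptions, and when several nodes have in-degree zero the sketch simply picks them in alphabetical order.

```python
def topological_order(nodes, edges):
    """Repeatedly take a node with in-degree zero and delete its outgoing edges."""
    indegree = {n: 0 for n in nodes}
    for _, dst in edges:
        indegree[dst] += 1
    remaining = set(edges)
    ready = sorted(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        n = ready.pop(0)
        order.append(n)                         # note the node down
        for edge in sorted(e for e in remaining if e[0] == n):
            remaining.discard(edge)             # ignore edges leaving n
            indegree[edge[1]] -= 1
            if indegree[edge[1]] == 0:
                ready.append(edge[1])
    # If some node never reached in-degree zero, the graph has a cycle.
    return order if len(order) == len(indegree) else None

two_nodes = topological_order({"T1", "T2"}, {("T1", "T2")})
three_nodes = topological_order(
    {"T1", "T2", "T3"}, {("T2", "T3"), ("T3", "T1"), ("T2", "T1")}
)
cyclic = topological_order({"T1", "T2"}, {("T1", "T2"), ("T2", "T1")})
```

For a cyclic graph no serial order exists, so the function returns `None`.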

Exercise:1

Consider the following graph for three transactions, specify whether it corresponds to a conflict
serializable schedule, and write the equivalent serial schedule.

[Figure: precedence graph on T1, T2 and T3; the in-degrees below imply the edges T1 → T2,
T1 → T3 and T2 → T3]

Step 1:
In-degree of T1: 0
In-degree of T2: 1
In-degree of T3: 2
Therefore consider T1 first and delete/ignore T1 and all edges from T1.

Step 2:
In-degree of T2: 0
In-degree of T3: 1
Therefore consider T2 next and delete/ignore T2 and all edges from T2.

Step 3:
In-degree of T3: 0
Therefore consider T3.

Hence the serial schedule equivalent to the conflict serializable schedule is T1, T2 and T3.

Exercise:2

Consider the following concurrent schedule for three transactions and specify whether it is a
conflict serializable schedule. If it is, find the equivalent serial schedule.

T1 T2 T3

R(X)

R(X)

W(X)

R(X)

W(X)

[Figure: precedence graph with edges T2 → T3, T3 → T1 and T2 → T1]

From T2→T3 there is an R-W conflict, from T3→T1 there is a W-R conflict, and from T2→T1 there
is an R-W conflict. As there is no cycle or loop in the precedence graph, the given schedule is a
conflict serializable schedule.

Now, to find the equivalent serial schedule:

Step 1:
In-degree of T1: 2
In-degree of T2: 0
In-degree of T3: 1
Therefore consider T2 first and delete/ignore T2 and all edges from T2.

Step 2:
In-degree of T1: 1
In-degree of T3: 0
Therefore consider T3 next and delete/ignore T3 and all edges from T3.

Step 3:
In-degree of T1: 0
Therefore consider T1.

Hence the serial schedule equivalent to the conflict serializable schedule is T2, T3 and T1.

Exercise:3 (Try yourself)

Consider the following concurrent schedule for three transactions and specify whether it is a
conflict serializable schedule. If it is, find the equivalent serial schedule.

T1 T2 T3

R(x)

W(x)

W(x)

W(x)

Exercise: 4 (Try Yourself)

Consider the following two schedules S1 and S2 for the transactions T1 and T2. Determine
which of the schedules are conflict serializable.

Concurrent Schedule (S1) Concurrent Schedule (S2)

T1 T2 T1 T2

R(x) R(x)

R(y) R(x) R(x)

W(y) W(y)

W(x) R(y)

W(x)

View Serializable Schedule:

Any given concurrent schedule is said to be a view serializable schedule if it is view
equivalent to one of the possible serial schedules.

View Equivalent

Two schedules S1 and S2 are said to be view equivalent if the following three conditions hold
for every data item X:

1. Initial Reads: If T1 reads the initial value of X in S1, then T1 also reads the initial value
of X in S2.

2. W-R Conflict: If T1 reads the value written by T2 in S1, then T1 also reads the value
written by T2 in S2.

3. Final Write: If T1 performs the final write on X in S1, then it also performs the final write
on X in S2.

Consider the following example:

Schedule 1 (S1)                 Schedule 2 (S2)
(Concurrent Schedule)           (Serial Schedule)

T1      T2      T3              T2      T3      T1

                R(X)            R(X)

        R(X)                            R(X)

                W(X)                    W(X)

R(X)                                            R(X)

W(X)                                            W(X)

Initial Reads: In S1 there are only two initial reads on data item X, one from T2 and one from
T3. Similarly, in the given serial schedule S2 there are exactly two initial reads on X, one from
T2 and one from T3. So the initial read condition is satisfied.

W-R Conflict: In S1 there is one write-read conflict, from T3 to T1. Similarly, in S2 there is a
write-read conflict from T3 to T1. So the second condition is also satisfied.

Final Write: In S1 the final write on X is from T1. Similarly, in S2 the final write on X is also
from T1. So the final condition is also satisfied.

Hence S1 is view equivalent to the serial schedule S2, and the given schedule is view serializable.
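
The three conditions can also be checked mechanically. The following Python sketch is only an illustration: the names `view_profile`, `view_equivalent` and `view_serial_orders` are invented here, the operation order used for S1 is one reading of the flattened table (T1 reads after T3 writes, and T2 and T3 both read the initial value), and repeated reads of the same value are collapsed into a set, which is a simplification.

```python
from itertools import permutations


def view_profile(schedule):
    """Collect initial reads, reads-from pairs and final writes.

    schedule is a list of (transaction, operation, item) in execution order.
    """
    last_writer = {}
    initial_reads, reads_from = set(), set()
    for txn, op, item in schedule:
        if op == "R":
            writer = last_writer.get(item)
            if writer is None:
                initial_reads.add((txn, item))          # condition 1
            else:
                reads_from.add((txn, writer, item))     # condition 2
        else:
            last_writer[item] = txn
    final_writes = set(last_writer.items())             # condition 3
    return initial_reads, reads_from, final_writes


def view_equivalent(s1, s2):
    """Two schedules are view equivalent when all three parts agree."""
    return view_profile(s1) == view_profile(s2)


def view_serial_orders(schedule):
    """Brute force: try every serial order of the transactions."""
    txns = sorted({txn for txn, _, _ in schedule})
    matches = []
    for order in permutations(txns):
        serial = [step for t in order for step in schedule if step[0] == t]
        if view_equivalent(schedule, serial):
            matches.append(order)
    return matches


# Assumed operation order for S1 (see the note above).
s1 = [("T3", "R", "X"), ("T2", "R", "X"), ("T3", "W", "X"),
      ("T1", "R", "X"), ("T1", "W", "X")]
# Serial schedule S2: T2, then T3, then T1.
s2 = [("T2", "R", "X"), ("T3", "R", "X"), ("T3", "W", "X"),
      ("T1", "R", "X"), ("T1", "W", "X")]

print(view_equivalent(s1, s2))   # True
print(view_serial_orders(s1))    # [('T2', 'T3', 'T1')]
```

Trying every permutation is only practical for a handful of transactions, which is enough for exercises of this size.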

Characterizing Schedules Based On Recoverability

For some schedules it is easy to recover from transaction and system failures, whereas for other
schedules the recovery process can be quite complicated. In some cases, it is not even possible to
recover correctly after a failure. Therefore, it is important to characterize the types of schedules
for which recovery is possible, as well as those for which recovery is not possible.

Sometimes a transaction may not execute completely due to a software issue, system crash or
hardware failure. In that case, the failed transaction has to be rolled back. But some other
transactions may also have used a value produced by the failed transaction, so those transactions
have to be rolled back as well. How easily this can be done is called the recoverability of a
schedule. There are different types of schedules based on recoverability: Recoverable Schedule,
Cascadeless Schedule, and Strict Schedule.

1. Recoverable Schedule:

If in a schedule,
• A transaction performs a dirty read operation from an uncommitted transaction
• And its commit operation is delayed till the uncommitted transaction either commits or
rolls back
Then such a schedule is known as a Recoverable Schedule.
– A schedule S is recoverable if no transaction T in S commits until all transactions T' that
have written some item X that T reads have committed.


Sa: r1(X); r2(X); w1(X); r1(Y); w2(X); c2; w1(Y); c1; Recoverable (but suffers from the lost
update problem).

Sc: r1(X); w1(X); r2(X); r1(Y); w2(X); c2; a1; Non-recoverable.

T2 reads X from T1, and then T2 commits before T1 commits. If T1 aborts, then T2 would have
to be aborted after it has already committed, which is not allowed.

Sd: r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); c1; c2; Recoverable.

To make Sc recoverable, c2 of Sc must be postponed until after T1 commits, as in Sd. Or else,
if T1 aborts, then T2 should also abort, as follows:

Se: r1(x); w1(x); r2(x); r1(y); w2(x); w1(y); a1; a2;

Cascading rollback (Cascading Abort) is a phenomenon that occurs in some recoverable
schedules when an uncommitted transaction has to be rolled back because it read an item from a
transaction that failed. For example, in Se: r1(x); w1(x); r2(x); r1(y); w2(x); w1(y); a1; T2 has
to be rolled back because it read item x from T1, which is aborted later.

2. Cascadeless schedule (avoids cascading rollback): A schedule is said to be cascadeless,
or to avoid cascading rollback, if every transaction in the schedule reads only items that
were written by committed transactions. In this case, all items read will be committed data,
so no cascading rollback will occur.
3. Strict Schedule: A schedule in which a transaction can neither read nor write an item X
until the last transaction that wrote X has committed (or aborted).
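
The cascading rollback described above can be traced with a short Python sketch. It is only an illustration under stated assumptions: the function name `rollback_set` is made up, schedules are written in the same r1(x); w1(x); a1 notation as the examples, and the sketch ignores the non-recoverable case where a dependent transaction has already committed.

```python
import re


def rollback_set(schedule, failed):
    """Return every transaction that must roll back when `failed` aborts."""
    last_writer = {}   # item -> transaction that wrote it most recently
    committed = set()
    dependents = {}    # writer -> transactions that read its uncommitted write
    for op, txn, item in re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", schedule):
        if op == "c":
            committed.add(txn)
        elif op == "w":
            last_writer[item] = txn
        elif op == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn and writer not in committed:
                dependents.setdefault(writer, set()).add(txn)
    # Follow the reads-from chain starting at the failed transaction.
    doomed, stack = set(), [failed]
    while stack:
        txn = stack.pop()
        if txn not in doomed:
            doomed.add(txn)
            stack.extend(dependents.get(txn, ()))
    return doomed


se = "r1(x); w1(x); r2(x); r1(y); w2(x); w1(y); a1"
print(sorted(rollback_set(se, "1")))   # ['1', '2']
```

In a cascadeless schedule no transaction ever reads an uncommitted write, so `dependents` stays empty and only the failed transaction itself is rolled back.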

Summary:

The cascadeless schedules are a subset of the recoverable schedules, and the strict schedules
are a subset of the cascadeless schedules. In other words, any strict schedule is also
cascadeless, and any cascadeless schedule is also recoverable.

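[Figure: nested sets of schedules. All schedules contain the Recoverable schedules, which contain
the Cascadeless schedules, which contain the Strict schedules.]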
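
The three definitions can be tested on the example schedules with a small Python sketch. This is an illustration only: the names `parse` and `classify` are invented, schedules use the r1(X); w1(X); c1 notation from the examples, and the sketch assumes a transaction never acts again after its own commit or abort.

```python
import re


def parse(text):
    """Turn 'r1(X); w1(X); c1' into (op, transaction, item) tuples."""
    steps = []
    for op, txn, item in re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", text):
        steps.append((op, int(txn), item.upper() if item else None))
    return steps


def classify(text):
    """Return (recoverable, cascadeless, strict) for a schedule string."""
    last_writer = {}   # item -> transaction that wrote it most recently
    finished = {}      # transaction -> 'c' or 'a'
    reads_from = {}    # reader -> writers whose uncommitted data it read
    recoverable = cascadeless = strict = True
    for op, txn, item in parse(text):
        if op in ("r", "w"):
            writer = last_writer.get(item)
            dirty = writer is not None and writer != txn and writer not in finished
            if dirty:
                strict = False               # touched X before its writer ended
                if op == "r":
                    cascadeless = False      # read uncommitted data
                    reads_from.setdefault(txn, set()).add(writer)
            if op == "w":
                last_writer[item] = txn
        else:
            if op == "c" and any(finished.get(w) != "c"
                                 for w in reads_from.get(txn, ())):
                recoverable = False          # committed before its writer did
            finished[txn] = op
    return recoverable, cascadeless, strict


print(classify("r1(X); r2(X); w1(X); r1(Y); w2(X); c2; w1(Y); c1"))  # Sa
print(classify("r1(X); w1(X); r2(X); r1(Y); w2(X); c2; a1"))         # Sc
print(classify("r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); c1; c2"))  # Sd
```

For Sa, Sc and Sd this prints (True, True, False), (False, False, False) and (True, False, False). In this sketch a schedule can only lose the recoverable flag after losing the cascadeless flag, and the cascadeless flag after the strict flag, which mirrors the nesting of the three classes.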


Schedule                Problem               Solution

Recoverable             Cascading rollback    Cascadeless schedule

Cascadeless schedule    Write conflict        Strict schedule

Strict schedule         Performance issue

Summary

• A transaction is a sequence of database operations that access the database. A transaction
represents a real-world event. A transaction must be a logical unit of work; that is, no portion
of the transaction can exist by itself. Either all parts are executed or the transaction is aborted.
A transaction takes a database from one consistent state to another. A consistent database state
is one in which all data integrity constraints are satisfied.
• Transactions have four main properties: atomicity (all parts of the transaction are executed;
otherwise, the transaction is aborted), consistency (the database’s consistent state is
maintained), isolation (data used by one transaction cannot be accessed by another transaction
until the first transaction is completed), and durability (the changes made by a transaction
cannot be rolled back once the transaction is committed). In addition, transaction schedules
have the property of serializability (the result of the concurrent execution of transactions is the
same as that of the transactions being executed in serial order).
• SQL provides support for transactions through the use of two statements: COMMIT (saves
changes to disk) and ROLLBACK (restores the previous database state).
• SQL transactions are formed by several SQL statements or database requests. Each database
request originates several I/O database operations.
• The transaction log keeps track of all transactions that modify the database. The information
stored in the transaction log is used for recovery (ROLLBACK) purposes.
• Concurrency control coordinates the simultaneous execution of transactions. The concurrent
execution of transactions can result in three main problems: lost updates, uncommitted data,
and inconsistent retrievals.
• The scheduler is responsible for establishing the order in which the concurrent transaction
operations are executed. The transaction execution order is critical and ensures database integrity
in multiuser database systems.
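
The use of COMMIT and ROLLBACK mentioned in the summary can be tried out with Python's built-in sqlite3 module. This is a minimal sketch, not part of the chapter: the `account` table, its contents and the `transfer` function are made up for illustration, and the example relies on sqlite3's default behaviour of opening a transaction implicitly before an UPDATE.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` between two accounts as one logical unit of work."""
    try:
        conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?",
                     (amount, src))
        conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?",
                     (amount, dst))
        (balance,) = conn.execute("SELECT balance FROM account WHERE id = ?",
                                  (src,)).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        conn.commit()      # COMMIT: both updates become permanent
        return True
    except ValueError:
        conn.rollback()    # ROLLBACK: restore the previous database state
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

print(transfer(conn, 1, 2, 30))    # True: committed
print(transfer(conn, 1, 2, 500))   # False: rolled back, balances unchanged
print(conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall())
```

The second transfer shows atomicity: both updates were executed, but because the transaction was rolled back, neither of them is visible afterwards.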
