Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                

Unit 4 Transaction Processing

Download as ppt, pdf, or txt
Download as ppt, pdf, or txt
You are on page 1of 45

Transaction Processing

-Ashu Mehta
Database Systems
Transaction Processing Systems
• Transaction Processing Systems are the
systems with large databases and hundreds of
concurrent users executing database
transactions.
• For example airline reservations, banking,
stock markets, etc.
Transaction
• A transaction is an executing program that
forms a logical unit of database processing.
• A transaction includes one or more database
access operations- these can include insertion,
deletion, modification, or retrieval operations.
• Transaction is executed as a single unit. It is a
program unit whose execution may or may
not change the contents of a database.
Example
• A transfer of money from one bank account to
another requires two changes to the database
both must succeed or fail together.
– Subtracting the money from the savings account
balance.
– Adding the money to the checking account
balance.
Example
Processes of Transaction
• Read Operation: To read a database object, it
is first brought into main memory from disk
and then its value is copied into a program
variable.
A
Read (2) 5000

INPUT (1) 5000 A


Physical
5000 a Block (X)
Program
Variable
Main Memory Disk
(RAM)
Processes of Transaction
• Write Operation: To write a database object,
an in-memory copy of the object is first
modified and then written back to disk.

A
Write (1) 5000 to 7000

OUTPUT (2) 5000 to 7000 A


Physical
7000 a Block (X)
Program
Variable
Main Memory Disk
(RAM)
Desirable Properties of Transactions
ACID properties:
• Atomicity: A transaction is an atomic unit of
processing; it is either performed in its entirety or not
performed at all.
• Consistency preservation: A correct execution of the
transaction must take the database from one
consistent state to another.
• Isolation: A transaction should appear as though it is
being executed in isolation from other transactions,
even if many transactions are executing concurrently.
That is, the execution of a transaction should not be
interfered with any other executing transactions.
• Durability or permanency: Once a transaction changes
the database and the changes are committed, these
changes must never be lost because of any failure.
Transaction Properties
• Let T1 be a transaction that transfers Rs 50 from
account A to account B. This transaction can be
defined as:

• Say value of A prior to transaction was 1000 and


that of B was 2000
• Atomicity: Suppose that during the execution of T1, a
power failure has occurred that prevented the T1 to
complete successfully. The point of failure may be after the
completion of Write(A) and before Write(B). It means that
the changes in A are performed but not in B. Thus the
values of account A and B are Rs.950 and Rs.2000
respectively. We have lost Rs.50 as a result of this failure.
• Now, our database is in inconsistent state.
• The reason for this inconsistent state is that our transaction
is completed partially.
• In order to maintain atomicity of transaction, the database
system keeps track of the old values of any write and if the
transaction does not complete its execution, the old values
are restored to make it appear as the transaction never
executed.
• Consistency: The consistency requirement here is
that the sum of A and B must be unchanged by
the execution of the transaction. It can be
verified easily that, if the database is consistent
before an execution of the transaction, the
database remains consistent after the execution
of the transaction.
• Ensuring consistency for an individual transaction
is the responsibility of the application
programmer who codes the transaction.
• Isolation: If several transactions are executed
concurrently (or in parallel), then each
transaction must behave as if it was executed
in isolation. It means that concurrent
execution does not result an inconsistent
state.
• For example, consider another transaction T2,
which has to display the sum of account A and
B. Then, its result should be Rs.3000.
Transaction Properties
• Lets suppose that T1 and T2 perform concurrently,
their schedule is shown below:
T1 T2 Status
Read (A, a) Value of A i.e. 1000 is copied to local variable a
a: a-50 Local variable a = 950
Write (A, a) Value of local variable a 950 is copied to database item A
Read (A, a) Value of database item A 950 is copied to a
Read (B, b) Value of database item B 2000 is copied to b
Display (a+b) 2950 is displayed as sum of accounts A and B
Read (B, b) Value of data item B 2000 is copied to vocal variable b
b: = b+50 Local variable b = 2050
Write (B, b) Value of local variable b 2050 is copied to database item B
• Durability: Once the execution of the
transaction completes successfully, and the
user who initiated the transaction has been
notified that the transfer of funds has taken
place, it must be the case that no system
failure will result in a loss of data
corresponding to this transfer of funds. 
Transaction States
• Active state– the initial state; the transaction stays in this
state while it is executing .
• Partially committed state– after the final statement has
been executed.
• Failed state -- after the discovery that normal execution can
no longer proceed.
• Aborted state– after the transaction has been rolled back
and the database restored to its state prior to the start of
the transaction. Two options after it has been aborted:
– restart the transaction can be done only if no internal logical
error
– kill the transaction
• Committed state – after successful completion
State Transition Diagram
Concurrent Executions
• Multiple transactions are allowed to run
concurrently in the system.
• Advantages are:
– increased processor and disk utilization, leading to
better transaction throughput
• E.g. one transaction can be using the CPU while another is
reading from or writing to the disk
– reduced average response time for transactions:
short transactions need not wait behind long ones.
• Concurrency control schemes – mechanisms to
achieve isolation that is, to control the
interaction among the concurrent transactions in
order to prevent them from destroying the
consistency of the database
Why Concurrency Control is needed?
• The Lost Update Problem
– This occurs when two transactions that access the same
database items have their operations interleaved in a way
that makes the value of some database item incorrect.
• The Temporary Update (or Dirty Read) Problem
– This occurs when one transaction updates a database item
and then the transaction fails for some reason
– The updated item is accessed by another transaction
before it is changed back to its original value.
• The Incorrect Summary Problem
– If one transaction is calculating an aggregate summary
function on a number of records while other transactions
are updating some of these records, the aggregate
function may calculate some values before they are
updated and others after they are updated.
Concurrent execution is uncontrolled:
(a) The lost update problem.
Concurrent execution is uncontrolled:
(b) The temporary update problem.
Concurrent execution is uncontrolled:
(c) The incorrect summary problem.
Schedules
• Schedule – a sequences of instructions that specify the
chronological order in which instructions of concurrent
transactions are executed
– a schedule for a set of transactions must consist of all
instructions of those transactions
– must preserve the order in which the instructions appear in
each individual transaction.
• A transaction that successfully completes its execution
will have a commit instructions as the last statement
– by default transaction assumed to execute commit
instruction as its last step
• A transaction that fails to successfully complete its
execution will have an abort instruction as the last
statement
Scheduling of Transactions
• Complete Schedule: A schedule that contains
either an abort or a commit for each transaction
whose actions are listed in it is called a complete
schedule.
• Serial Schedule: Each serial schedule consists of a
sequence of instructions from various transactions,
where the instructions belonging to one single
transaction appear together in that schedule. If the
actions of different transactions are not interleaved,
we call that schedule a serial schedule.
Schedule 1
Schedule 2
Schedule 3
Schedule 4
Scheduling of Transactions
• Serializiable Schedule: A non-serial schedule that
is equivalent to some serial execution of transactions
is called a serializiable schedule.
• For example schedule-3 is serializable schedule,
which is equivalent to schedule-1 and schedule-2.
• The objective of serializability is to find non-serial
schedules that allow transactions to execute
concurrently without interfering with one another,
and thereby produce a database state that could be
produced by a serial execution.
Serializability
• Basic Assumption – Each transaction preserves
database consistency.
• Thus serial execution of a set of transactions
preserves database consistency.
• A (possibly concurrent) schedule is serializable if it is
equivalent to a serial schedule. Different forms of
schedule equivalence give rise to the notions of:
1. conflict serializability
2. view serializability
Conflict Serializability
•If a schedule S can be transformed into a schedule S’
by a series of swaps of non-conflicting instructions, we
say that S and S’ are conflict equivalent.
•We say that a schedule S is conflict serializable if it is
conflict equivalent to a serial schedule.
Conflict Serializability
• We need to check the conflicts between two
consecutive operations of two different
transactions in a schedule. Operations upon data
can be read or write. There are two possibilities:
– If two consecutive operations are on different data
items, then they donot conflict i.e, their order of
execution does not matter and we can swap them
without affecting their result.
– If two consecutive operations are on same data items,
then they can conflict i.e their order of execution
matters and we cannot swap them.
Conflict Serializability
• Consider a schedule S in which there are two consecutive
operations Oi and Oj of two transactions Ti and Tj and both
operations refer to the same data item A. Then, there are
following four cases:
1. Oi= read(A) and Oj= read(A). Then, their order does not matter
because same value of A is read by Ti and Tj.
2. Oi= read(A) and Oj= write(A). Then, their order matters. If Oi
comes before Oj then, Oj does not read the value of A written
by Oi. If Oj comes before Oi then, Oi reads the value of A that is
written by Oj.
3. Oi= write(A) and Oj= read(A). Then, their order matters. If Oi
comes before Oj then, Oj reads the value of A written by Oi. If
Oj comes before Oi then, Oj does not read the value of A that is
written by Oi.
4. Oi= write(A) and Oj= write(A). Then, their order matters. If Oi
comes after Oj then, the value of A written by Oi is stored in
database. If Oj comes after Oi then, the value of A written by Oj
is stored in database.
Conflict Serializability
View Serializability
• A schedule is said to be view-serializable if it is view-
equivalent to some serial schedule.
• Any schedule that is conflict-serializable is also view-
serializable, but not vice versa.
• Below is a schedule which is view-serializable but not conflict
serializable.

• Every view serializable schedule that is not conflict


serializable has blind writes.
View Serializability
1. Initial read
2. Updated read
3. Final write operation
Examples
Schedule 1:
T1 T2 T3
W(A)
W(A)
R(A)

Schedule 2:
T1 T2 T3
W(A)
R(A)
W(A)

Schedule 1 and schedule 2 are not view equivalent


Examples
Schedule 3:
T1 T2
R(A)
W(A)
W(A)

Schedule 4:
T1 T2
R(A)
W(A)
W(A)

Schedule 3 and schedule 4 are not view equivalent


• Schedule 5: View equivalent
T1 T2 T1 T2
R(X) R(X)
W(X) W(X)
R(X) R(Y)
W(X) W(Y)
R(Y) R(X)
W(Y) W(X)
R(Y) R(Y)
W(Y) W(Y)
Recoverable Schedules
• Need to address the effect of transaction failures on
concurrently running transactions.
• Recoverable schedule — if a transaction Tj reads a data item
previously written by a transaction Ti , then the commit
operation of Ti appears before the commit operation of Tj.
• The following schedule is not recoverable if T9 commits
immediately after the read

• If T8 should abort, T9 would have read an inconsistent database


state. Hence, database must ensure that schedules are
recoverable.
Recoverable Schedules
• A schedule is said to be recoverable, if for each pair of
transaction Ti and Tj such that Tj reads a data item
previously written by Ti, the commit operation of Ti
appears before the commit operation of Tj.

• Both the above schedules are recoverable .


Recoverable Schedules
• Schedule1 is recoverable because T2 reads the
value of A updated by T1 and also T1 commits
before T2 which makes the value read by T2
correct. Then T2 can commit itself.
• In schedule2, if T1 is aborted, T2 has to abort
because the value A it read is incorrect.
• In both cases, the database is in consistent
state.
Cascading Rollbacks
• If in a schedule, failure of one transaction
causes several other dependent transactions
to rollback or abort, then schedule is called as
cascading schedule or cascading rollback or
cascading abort.
• It leads to the wastage of CPU time.
• Cascading rollbacks occur because of Dirty
read problems.
Cascadeless Schedules
• Cascadeless schedules—cascading rollbacks
cannot occur; for each pair of transactions Ti and
Tj such that Tj reads a data item previously
written by Ti, the commit operation of Ti
appears before the read operation of Tj.
• Every cascadeless schedule is also recoverable

You might also like