
Transaction Processing Concepts


MODULE 5

                                        Transaction Processing Concepts 

The concept of a transaction provides a mechanism for describing logical units of database
processing. Transaction processing systems are systems with large databases and hundreds of
concurrent users that are executing database transactions. Examples of such systems include
systems for reservations, banking, credit card processing, stock markets, supermarket
checkout, and other similar systems.
                                 
  Single-user versus Multiuser Systems

              One criterion for classifying a database system is according to the number of users
who can use the system concurrently—that is, at the same time. A DBMS is single-user if at
most one user at a time can use the system, and it is multiuser if many users can use the
system—and hence access the database—concurrently. Single-user DBMSs are mostly
restricted to some microcomputer systems; most other DBMSs are multiuser. For example, an
airline reservations system is used by hundreds of travel agents and reservation clerks
concurrently. Systems in banks, insurance agencies, stock exchanges, supermarkets, and the
like are also operated on by many users who submit transactions concurrently to the system. 
                         Multiple users can access databases—and use computer
systems—simultaneously because of the concept of multiprogramming, which allows the
computer to execute multiple programs—or processes—at the same time.

 Transactions, Read and Write Operations, and DBMS Buffers 

               A transaction is a logical unit of database processing that includes one or more
database access operations—these can include insertion, deletion, modification, or retrieval
operations. The database operations that form a transaction can either be embedded within an
application program or they can be specified interactively via a high-level query language
such as SQL. One way of specifying the transaction boundaries is by specifying explicit
begin transaction and end transaction statements in an application program; in this case, all
database access operations between the two are considered as forming one transaction. A
single application program may contain more than one transaction if it contains several
transaction boundaries. If the database operations in a transaction do not update the database
but only retrieve data, the transaction is called a read-only transaction. 
A database is basically represented as a collection of named data items. The size of a data
item is called its granularity. The basic database access operations that a transaction can
include are as follows:
• read_item(X): Reads a database item named X into a program variable. To simplify
our notation, we assume that the program variable is also named X.
• write_item(X): Writes the value of program variable X into the database item named
X.
A transaction includes read_item and write_item operations to access and update the database.
The read-set of a transaction is the set of all items that the transaction reads, and the
write-set is the set of all items that the transaction writes.
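
The two operations and the read-set and write-set they produce can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the database is a plain dictionary, and the class name Transaction, the attribute names, and the starting values of X and Y are invented for the example.

```python
class Transaction:
    """Toy transaction that records which items it reads and writes."""

    def __init__(self, database):
        self.database = database   # shared dict: item name -> stored value
        self.local = {}            # program variables, named after the items
        self.read_set = set()
        self.write_set = set()

    def read_item(self, name):
        # Copy the database item into the program variable of the same name.
        self.local[name] = self.database[name]
        self.read_set.add(name)
        return self.local[name]

    def write_item(self, name):
        # Copy the program variable back into the database item.
        self.database[name] = self.local[name]
        self.write_set.add(name)


db = {"X": 80, "Y": 20}            # assumed starting values
t = Transaction(db)
t.read_item("X")
t.local["X"] -= 5                  # update the program variable only
t.write_item("X")
t.read_item("Y")                   # Y is read but never written
```

Here the read-set of t is {X, Y} and its write-set is {X}. Since t never writes, say, Z or Y, a transaction that only called read_item would be a read-only transaction.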

 Why Concurrency Control Is Needed


When transactions submitted by various users execute concurrently and access and update the
same database items, several problems can occur.

1.The Lost Update Problem 


This problem occurs when two transactions that access the same database items have
their operations interleaved in a way that makes the value of some database item incorrect.
Suppose that transactions T1 and T2 are submitted at approximately the same time, and suppose
that their operations are interleaved as shown in Figure 2(a); then the final value of item X
is incorrect, because T2 reads the value of X before T1 changes it in the database, and hence
the updated value resulting from T1 is lost. For example, if X = 80 at the start (originally
there were 80 reservations on the flight), N = 5 (T1 transfers 5 seat reservations from the
flight corresponding to X to the flight corresponding to Y), and M = 4 (T2 reserves 4 seats on
X), the final result should be X = 79; but in the interleaving of operations shown in that
figure, it is X = 84 because the update in T1 that removed the five seats from X was lost.

T1                      T2

read_item(X);
X = X - N;
                        read_item(X);
                        X = X + M;
write_item(X);
read_item(Y);
                        write_item(X);
Y = Y + N;
write_item(Y);

Item X has an incorrect value because its update by T1 is lost.
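
The interleaving can be replayed step by step in Python to check the arithmetic. This is a sketch, not a DBMS: each transaction's program variables are kept in a private dictionary, the database is a shared dictionary, and the starting value Y = 20 is an assumption made only so the example runs.

```python
def run_lost_update(x=80, y=20, n=5, m=4):
    db = {"X": x, "Y": y}
    t1, t2 = {}, {}            # private program variables of T1 and T2

    t1["X"] = db["X"]          # T1: read_item(X)
    t1["X"] -= n               # T1: X = X - N
    t2["X"] = db["X"]          # T2: read_item(X), still sees the original X
    t2["X"] += m               # T2: X = X + M
    db["X"] = t1["X"]          # T1: write_item(X)
    t1["Y"] = db["Y"]          # T1: read_item(Y)
    db["X"] = t2["X"]          # T2: write_item(X), overwrites T1's update
    t1["Y"] += n               # T1: Y = Y + N
    db["Y"] = t1["Y"]          # T1: write_item(Y)
    return db


result = run_lost_update()
correct_x = 80 - 5 + 4         # value of X if T1 and T2 had run one after the other
```

With the values from the text, result["X"] is 84 while correct_x is 79: the five seats removed by T1 have reappeared because T2 wrote back a value computed from the old X.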

2.The Temporary Update (or Dirty Read) Problem 


This problem occurs when one transaction updates a database item and then the
transaction fails for some reason. The updated item is accessed by another transaction before
it is changed back to its original value. The following figure shows an example where T1
updates item X and then fails before completion, so the system must change X back to its
original value. Before it can do so, however, transaction T2 reads the "temporary" value of X,
which will not be recorded permanently in the database because of the failure of T1. The value
of item X that is read by T2 is called dirty data, because it has been created by a
transaction that has not completed and committed yet; hence, this problem is also known as the
dirty read problem.

T1                      T2

read_item(X);
X = X - N;
write_item(X);
                        read_item(X);
                        X = X + M;
                        write_item(X);
read_item(Y);

Transaction T1 fails and must change the value of X back to its old value; meanwhile T2 has
read the temporary incorrect value of X.
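
A small Python replay makes the effect of the dirty value concrete. It is a sketch under assumptions: the starting value X = 80 and the values N = 5 and M = 4 are borrowed from the lost update example, and the failure of T1 is modeled simply by never committing its write.

```python
def run_dirty_read(x=80, n=5, m=4):
    committed_x = x            # last committed value of X
    db = {"X": x}

    t1_x = db["X"]             # T1: read_item(X)
    t1_x -= n                  # T1: X = X - N
    db["X"] = t1_x             # T1: write_item(X), not yet committed
    dirty = db["X"]            # T2: read_item(X) sees T1's uncommitted value
    t2_x = dirty + m           # T2: X = X + M
    # T1 fails at this point, so its write should never have been visible.
    return {
        "read_by_T2": dirty,
        "computed_by_T2": t2_x,
        "expected_without_T1": committed_x + m,
    }


outcome = run_dirty_read()
```

T2 reads 75 and computes 79, although T1's update was never committed; had T2 read the committed value it would have computed 84.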

3.The Incorrect Summary Problem 

If one transaction is calculating an aggregate summary function on a number of
records while other transactions are updating some of these records, the aggregate function
may calculate some values before they are updated and others after they are updated. For
example, suppose that a transaction T3 is calculating the total number of reservations on all
the flights; meanwhile, transaction T1 is executing. If the interleaving of operations shown
in the following figure occurs, the result of T3 will be off by an amount N because T3 reads
the value of X after N seats have been subtracted from it but reads the value of Y before
those N seats have been added to it.
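
The same situation can be replayed in Python. This is only a sketch: the summing transaction is called T3 here, the transfer is the T1 of the earlier examples, and the starting values X = 80, Y = 20 and N = 5 are assumptions chosen for illustration.

```python
def run_incorrect_summary(x=80, y=20, n=5):
    db = {"X": x, "Y": y}
    true_total = db["X"] + db["Y"]   # the transfer does not change this total

    db["X"] = db["X"] - n      # T1: read_item(X); X = X - N; write_item(X)
    total = 0                  # T3 starts summing
    total += db["X"]           # T3 reads X after N seats were subtracted
    total += db["Y"]           # T3 reads Y before N seats are added
    db["Y"] = db["Y"] + n      # T1: read_item(Y); Y = Y + N; write_item(Y)
    return total, true_total


summary, actual = run_incorrect_summary()
```

The sum computed by T3 is 95 while the real total is 100 both before and after the transfer, so the summary is off by exactly N.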

4.The Unrepeatable Read Problem

Another problem that may occur is called unrepeatable read, where a transaction T reads an
item twice and the item is changed by another transaction T' between the two reads. Hence, T
receives different values for its two reads of the same item. This may occur, for example, if
during an airline reservation transaction, a customer is inquiring about seat availability on
several flights. When the customer decides on a particular flight, the transaction then reads
the number of seats on that flight a second time before completing the reservation.

 Why Recovery Is Needed 

Whenever a transaction is submitted to a DBMS for execution, the system is
responsible for making sure that either all the operations in the transaction are completed
successfully and their effect is recorded permanently in the database, or the transaction has
no effect on the database or on any other transactions. The DBMS must not permit some
operations of a transaction T to be applied to the database while other operations of T are not.
This may happen if a transaction fails after executing some of its operations but before
executing all of them.
Types of Failures 
Failures are generally classified as transaction, system, and media failures. There are several
possible reasons for a transaction to fail in the middle of execution: 
1. A computer failure (system crash): A hardware, software, or network error occurs in
the computer system during transaction execution. Hardware crashes are usually
media failures—for example, main memory failure. 
2. A transaction or system error: Some operation in the transaction may cause it to fail,
such as integer overflow or division by zero. Transaction failure may also occur
because of erroneous parameter values or because of a logical programming error. In
addition, the user may interrupt the transaction during its execution.
3. Local errors or exception conditions detected by the transaction: During transaction
execution, certain conditions may occur that necessitate cancellation of the
transaction. For example, data for the transaction may not be found. Notice that an
exception condition, such as insufficient account balance in a banking database, may
cause a transaction, such as a fund withdrawal, to be cancelled. This exception should
be programmed in the transaction itself, and hence would not be considered a failure.
4. Concurrency control enforcement: The concurrency control method may decide to
abort the transaction, to be restarted later, because it violates serializability or because
several transactions are in a state of deadlock. 
5. Disk failure: Some disk blocks may lose their data because of a read or write
malfunction or because of a disk read/write head crash. This may happen during a
read or a write operation of the transaction. 
6. Physical problems and catastrophes: This refers to an endless list of problems that
includes power or air-conditioning failure, fire, theft, sabotage, overwriting disks or
tapes by mistake, and mounting of a wrong tape by the operator. 
Transaction and System Concepts 

A transaction is an atomic unit of work that is either completed in its entirety or
not done at all. For recovery purposes, the system needs to keep track of when the transaction
starts, terminates, and commits or aborts (see below). Hence, the recovery manager keeps
track of the following operations:
• BEGIN_TRANSACTION: This marks the beginning of transaction execution.
• READ or WRITE: These specify read or write operations on the database items
that are executed as part of a transaction.
• END_TRANSACTION: This specifies that READ and WRITE transaction
operations have ended and marks the end of transaction execution. However, at this
point it may be necessary to check whether the changes introduced by the transaction
can be permanently applied to the database (committed) or whether the transaction
has to be aborted because it violates serializability or for some other reason.
• COMMIT_TRANSACTION: This signals a successful end of the transaction so
that any changes (updates) executed by the transaction can be safely committed to the
database and will not be undone.
• ROLLBACK (or ABORT): This signals that the transaction has ended
unsuccessfully, so that any changes or effects that the transaction may have applied to
the database must be undone.

A transaction goes into an active state immediately after it starts execution, where it can
issue READ and WRITE operations. When the transaction ends, it moves to the partially
committed state. At this point, some recovery protocols need to ensure that a system failure
will not result in an inability to record the changes of the transaction permanently (usually
by recording the changes in the system log). Once this check is successful, the transaction is
said to have reached its commit point and enters the committed state. A transaction can go to
the failed state if one of the checks fails or if the transaction is aborted during its active
state. The transaction may then have to be rolled back to undo the effect of its WRITE
operations on the database. The terminated state corresponds to the transaction leaving the
system.
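
The state transitions described above can be written down as a small lookup table. The sketch below is one possible encoding: the state names come from the text, while the event names (end_transaction, commit, abort, terminate) and the function names are labels chosen for this example.

```python
# (current state, event) -> next state
TRANSITIONS = {
    ("active", "end_transaction"): "partially committed",
    ("active", "abort"): "failed",
    ("partially committed", "commit"): "committed",
    ("partially committed", "abort"): "failed",
    ("committed", "terminate"): "terminated",
    ("failed", "terminate"): "terminated",
}


def step(state, event):
    """Return the next state, or raise ValueError for an illegal transition."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {event!r} in state {state!r}") from None


def run(events, state="active"):
    """Apply a sequence of events, starting from the active state."""
    for event in events:
        state = step(state, event)
    return state


successful = run(["end_transaction", "commit", "terminate"])
aborted_early = run(["abort"])
```

Note that the table has no entry leading out of the committed state other than terminate, which reflects the rule that a committed transaction is never rolled back.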

The System Log 


To be able to recover from failures that affect transactions, the system maintains
a log  to keep track of all transaction operations that affect the values of database items. This
information may be needed to permit recovery from failures. The log is kept on disk, so it is
not affected by any type of failure except for disk or catastrophic failure. In addition, the log
is periodically backed up to archival storage (tape) to guard against such catastrophic failures.
We now list the types of entries—called log records—that are written to the log and the
action each performs. In these entries, T refers to a unique transaction-id that is generated
automatically by the system and is used to identify each transaction: 
1. [start_transaction, T]: Indicates that transaction T has started execution.
2. [write_item, T, X, old_value, new_value]: Indicates that transaction T has changed the
value of database item X from old_value to new_value.
3. [read_item, T, X]: Indicates that transaction T has read the value of database item X.
4. [commit, T]: Indicates that transaction T has completed successfully, and affirms
that its effect can be committed (recorded permanently) to the database.
5. [abort, T]: Indicates that transaction T has been aborted.

If the system crashes, we can recover to a consistent database state by examining the log.
Because the log contains a record of every WRITE operation that changes the value of some
database item, it is possible to undo the effect of these WRITE operations of a transaction T
by tracing backward through the log and resetting all items changed by a WRITE operation of T
to their old values. Redoing the operations of a transaction may also be needed if all its
updates are recorded in the log but a failure occurs before we can be sure that all these new
values have been written permanently in the actual database on disk.
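
Undo and redo can be sketched directly from the log record formats listed above. This is a simplified illustration, not a real recovery algorithm: log records are encoded as Python tuples, the database is a dictionary, and the function names and the sample log are assumptions. In particular, the recover function handles one transaction at a time, which is only safe here because the sample transactions write different items.

```python
def undo(log, database, t):
    """Trace backward, resetting items written by t to their old values."""
    for record in reversed(log):
        if record[0] == "write_item" and record[1] == t:
            _, _, item, old_value, _ = record
            database[item] = old_value


def redo(log, database, t):
    """Trace forward, setting items written by t to their new values."""
    for record in log:
        if record[0] == "write_item" and record[1] == t:
            _, _, item, _, new_value = record
            database[item] = new_value


def recover(log, database):
    """Undo transactions with no commit record, redo those that committed."""
    committed = {r[1] for r in log if r[0] == "commit"}
    started = [r[1] for r in log if r[0] == "start_transaction"]
    for t in started:
        if t not in committed:
            undo(log, database, t)
    for t in started:
        if t in committed:
            redo(log, database, t)
    return database


log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "X", 80, 75),
    ("start_transaction", "T2"),
    ("write_item", "T2", "Y", 20, 30),
    ("commit", "T1"),
]
# State found on disk after the crash: T1's committed write never reached the
# disk, while T2's uncommitted write did.
on_disk = {"X": 80, "Y": 30}
recovered = recover(log, on_disk)
```

After recovery X holds 75 (the committed update of T1 was redone) and Y holds 20 (the uncommitted update of T2 was undone).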

Desirable Properties of Transactions 

Transactions should possess several properties. These are often called the ACID
properties, and they should be enforced by the concurrency control and recovery methods of
the DBMS. The following are the ACID properties: 

1. Atomicity: A transaction is an atomic unit of processing; it is either performed in its
entirety or not performed at all.
2. Consistency preservation: A transaction is consistency preserving if its complete
execution takes the database from one consistent state to another.
3. Isolation: A transaction should appear as though it is being executed in isolation
from other transactions. That is, the execution of a transaction should not be interfered
with by any other transactions executing concurrently. 
4. Durability or permanency: The changes applied to the database by a committed
transaction must persist in the database. These changes must not be lost because of
any failure. 
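
Atomicity can be observed with Python's built-in sqlite3 module. The sketch below is an illustration only: the account table, the account ids, and the balances are invented, and SQLite stands in for whatever DBMS is actually in use. The connection is used as a context manager, which commits if the block finishes and rolls back if an exception escapes it.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one atomic unit of work."""
    try:
        with conn:  # commit on success, roll back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient balance")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

ok = transfer(conn, "A", "B", 30)       # succeeds and is committed
failed = transfer(conn, "A", "B", 500)  # fails midway and is rolled back
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

The second transfer debits account A before the check fails, yet the final balances show no trace of that debit: the transaction was either performed in its entirety or not at all.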
Introduction to Database Security Issues

1.Types of Security 
Database security is a very broad area that addresses many issues, including the following: 
• Legal and ethical issues regarding the right to access certain information. Some
information may be deemed to be private and cannot be accessed legally by
unauthorized persons. In the United States, there are numerous laws governing
privacy of information.
• Policy issues at the governmental, institutional, or corporate level as to what
kinds of information should not be made publicly available, for example, credit
ratings and personal medical records.
• System-related issues such as the system levels at which various security functions
should be enforced, for example, whether a security function should be handled at
the physical hardware level, the operating system level, or the DBMS level.
• The need in some organizations to identify multiple security levels and to categorize
the data and users based on these classifications, for example, top secret, secret,
confidential, and unclassified. The security policy of the organization with respect to
permitting access to various classifications of data must be enforced.
Threats to databases

● Loss of integrity:-
       Database integrity refers to the requirement that information be protected from
improper modification. Modification of data includes creation, insertion, modification,
deletion, etc. Integrity is lost if unauthorized changes are made to the data.

● Loss of availability:-
     Database availability refers to making objects available to a human user or a program
that has a legitimate right to them.

● Loss of confidentiality:-
      Database confidentiality refers to the protection of data from unauthorized disclosure.
Loss of confidentiality can result in loss of public confidence and in legal action against
the organization.
 
In a multiuser database system, the DBMS must provide techniques to
enable certain users or user groups to access selected portions of a database without gaining
access to the rest of the database. A DBMS typically includes a database security and
authorization subsystem that is responsible for ensuring the security of portions of a
database against unauthorized access. It is now customary to refer to two types of database
security mechanisms:
• Discretionary security mechanisms: These are used to grant privileges to users,
including the capability to access specific data files, records, or fields in a specified
mode (such as read, insert, delete, or update).
• Mandatory security mechanisms: These are used to enforce multilevel security by
classifying the data and users into various security classes (or levels).

2.Control Measures

There are four control measures that are used to provide security of data in databases:
    2.1 Access control
    2.2 Inference control
    2.3 Flow control
    2.4 Data Encryption

2.1 Access control


A security problem common to all computer systems is that of preventing unauthorized persons
from accessing the system itself, either to obtain information or to make malicious changes
in a portion of the database. The security mechanism of a DBMS must include provisions for
restricting access to the database system as a whole. This function is called access control
and is handled by creating user accounts and passwords to control the log-in process by the
DBMS.

2.2 Inference control


Another security problem associated with databases is that of controlling the access to a
statistical database, which is used to provide statistical information or summaries of values
based on various criteria. For example, a database for population statistics may provide
statistics based on age groups, income levels, size of household, education levels, and other
criteria. Statistical database users such as government statisticians or market research firms
are allowed to access the database to retrieve statistical information about a population but
not to access the detailed confidential information on specific individuals. Security for
statistical databases must ensure that information on individuals cannot be accessed. It is
sometimes possible to deduce certain facts concerning individuals from queries that involve
only summary statistics on groups; consequently this must not be permitted either. This
problem is called statistical database security, and the corresponding control measures are
called inference control measures.
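
The kind of deduction that inference control must prevent can be shown with two summary queries. The sketch below uses entirely made-up names, departments, and salaries, and a hypothetical sum_salary function standing in for a statistical query interface; it is meant only to illustrate the idea.

```python
# Made-up confidential data that users may only see through aggregates.
salaries = {"ann": 52000, "bob": 61000, "cho": 58000, "dee": 75000}
groups = {"ann": "eng", "bob": "eng", "cho": "eng", "dee": "sales"}


def sum_salary(predicate, min_group_size=1):
    """Statistical query: return a total, or None if the group is too small."""
    matching = [name for name in salaries if predicate(name)]
    if len(matching) < min_group_size:
        return None
    return sum(salaries[name] for name in matching)


# Two queries that each involve only summary statistics on groups...
everyone = sum_salary(lambda name: True, min_group_size=2)
engineers = sum_salary(lambda name: groups[name] == "eng", min_group_size=2)
# ...but whose difference is one person's salary.
inferred = everyone - engineers

# Asking about the one-person group directly is refused by the size check.
blocked = sum_salary(lambda name: groups[name] == "sales", min_group_size=2)
```

In this example a minimum group size stops the direct query but not the differencing of two permitted queries, which is why inference control has to consider combinations of queries and not just each query on its own.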

2.3 Flow control


  Flow control prevents information from flowing in such a way that it reaches unauthorized
users. Channels that are pathways for information to flow implicitly in ways that violate the
security policy of an organization are called covert channels.

2.4 Data Encryption


Another security issue is data encryption, which is used to protect sensitive data (such as
credit card numbers) that is being transmitted via some type of communications network.
Encryption can be used to provide additional protection for sensitive portions of a database as
well. The data is encoded by using some coding algorithm.

3. Database Security and the DBA 


The database administrator (DBA) is the central authority for managing a database
system. The DBA's responsibilities include granting privileges to users who need to use the
system and classifying users and data in accordance with the policy of the organization. The
DBA has a DBA account in the DBMS, sometimes called a system or superuser account, which
provides powerful capabilities that are not made available to regular database accounts and
users. Using this account, the DBA can perform the following types of actions:

1. Account creation: This action creates a new account and password for a user or
a group of users to enable them to access the DBMS.
2. Privilege granting: This action permits the DBA to grant certain privileges to
certain accounts. 
3. Privilege revocation: This action permits the DBA to revoke (cancel) certain
privileges that were previously given to certain accounts. 
4. Security level assignment: This action consists of assigning user accounts to the
appropriate security classification level. 

The DBA is responsible for the overall security of the database system. 

4. Access Protection, User Accounts, and Database Audits 


                     Whenever a person or a group of persons needs to access a database system, the
individual or group must first apply for a user account. The DBA will then create a new
account number and password for the user if there is a legitimate need to access the
database. The user must log in to the DBMS by entering the account number and password
whenever database access is needed. The DBMS checks that the account number and
password are valid; if they are, the user is permitted to use the DBMS and to access the
database. Application programs can also be considered as users and can be required to supply
passwords. 
                    It is straightforward to keep track of database users and their accounts and
passwords by creating an encrypted table or file with the two fields Account Number and
Password. This table can easily be maintained by the DBMS. Whenever a new account is
created, a new record is inserted into the table. When an account is cancelled, the
corresponding record must be deleted from the table. 
To keep a record of all updates applied to the database and of the particular user
who applied each update, we can modify the system log. Recall that the system log includes an
entry for each operation applied to the database that may be required for recovery from a
transaction failure or system crash. These entries can be expanded so that they also include
the account number of the user who applied each operation.
If any tampering with the database is suspected, a database audit is performed,
which consists of reviewing the log to examine all accesses and operations applied to the
database during a certain time period. When an illegal or unauthorized operation is found, the
DBA can determine the account number used to perform this operation. A database log that is
used mainly for security purposes is sometimes called an audit trail. 
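
A database audit of this kind amounts to filtering the log by time period and checking each entry against what the account was allowed to do. The sketch below is a hypothetical illustration: the entry fields, the dates, the account numbers, and the set of authorized operations are all invented for the example.

```python
from datetime import datetime

# Hypothetical audit trail: one entry per operation, tagged with the account.
audit_trail = [
    {"time": datetime(2024, 3, 1, 9, 15), "account": "A2",
     "operation": "INSERT", "relation": "EMPLOYEE"},
    {"time": datetime(2024, 3, 1, 9, 40), "account": "A3",
     "operation": "SELECT", "relation": "DEPARTMENT"},
    {"time": datetime(2024, 3, 2, 23, 5), "account": "A3",
     "operation": "DELETE", "relation": "EMPLOYEE"},
    {"time": datetime(2024, 3, 9, 8, 0), "account": "A2",
     "operation": "DELETE", "relation": "EMPLOYEE"},
]

# Operations each account is authorized to perform (assumed for the example).
authorized = {
    ("A2", "INSERT", "EMPLOYEE"), ("A2", "DELETE", "EMPLOYEE"),
    ("A2", "INSERT", "DEPARTMENT"), ("A2", "DELETE", "DEPARTMENT"),
    ("A3", "SELECT", "EMPLOYEE"), ("A3", "SELECT", "DEPARTMENT"),
}


def audit(trail, authorized, start, end):
    """Return entries in [start, end] whose operation was not authorized."""
    return [
        entry for entry in trail
        if start <= entry["time"] <= end
        and (entry["account"], entry["operation"], entry["relation"])
        not in authorized
    ]


suspicious = audit(audit_trail, authorized,
                   datetime(2024, 3, 1), datetime(2024, 3, 7))
accounts = [entry["account"] for entry in suspicious]
```

For the audited week, the only unauthorized operation is the DELETE on EMPLOYEE, and the log entry identifies account A3 as the one that performed it.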

5. Discretionary Access Control Based on Granting/Revoking of Privileges 


                   The typical method of enforcing discretionary access control in a database
system is based on the granting and revoking of privileges. The main idea is to include
additional statements in the query language that allow the DBA and selected users to grant
and revoke privileges. 

Types of Discretionary Privileges 


                For simplicity, we will use the words user or account interchangeably in place of
authorization identifier. The DBMS must provide selective access to each relation in the
database based on specific accounts. Operations may also be controlled; thus having an
account does not necessarily entitle the account holder to all the functionality provided by the
DBMS. Informally, there are two levels for assigning privileges to use the database system: 
1. The account level: At this level, the DBA specifies the particular privileges that
each account holds independently of the relations in the database.
2. The relation (or table) level: At this level, we can control the privilege to access
each individual relation or view in the database.

Propagation of Privileges Using the GRANT OPTION 

                                    Suppose that the DBA creates four accounts—A1, A2, A3, and
A4—and wants only A1 to be able to create base relations; then the DBA must issue the
following GRANT command in SQL: 

GRANT CREATETAB TO A1;

The CREATETAB (create table) privilege gives account A1 the capability to create new
database tables (base relations) and is hence an account privilege. 

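Suppose that A1 then creates two base relations, EMPLOYEE and DEPARTMENT; as their owner, A1
holds all the relation privileges on each of them. Next, suppose that account A1 wants to
grant to account A2 the privilege to insert and delete tuples in both of these relations. Then
A1 can issue the following command: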

GRANT INSERT, DELETE ON EMPLOYEE, DEPARTMENT TO A2;


Next, suppose that A1 wants to allow account A3 to retrieve information from either of the
two tables and also to be able to propagate the SELECT privilege to other accounts. Then A1
can issue the following command:

GRANT SELECT ON EMPLOYEE, DEPARTMENT TO A3 WITH GRANT OPTION;

The clause WITH GRANT OPTION means that A3 can now propagate the privilege to other
accounts by using GRANT. 

Revoking Privileges 
                          In some cases it is desirable to grant some privilege to a user temporarily. For
example, the owner of a relation may want to grant the SELECT privilege to a user for a
specific task and then revoke that privilege once the task is completed. Hence, a mechanism
for revoking privileges is needed. In SQL a REVOKE command is included for the purpose
of cancelling privileges. 

REVOKE SELECT ON EMPLOYEE FROM A3;

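The DBMS must now remove the SELECT privilege on EMPLOYEE from A3. If A3 had already
propagated that privilege to other accounts using the GRANT OPTION, the DBMS must
automatically revoke it from those accounts as well, because A3 no longer holds it.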
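
The bookkeeping behind GRANT, WITH GRANT OPTION, and REVOKE can be sketched as a small in-memory table. This is a toy model written for illustration: the class and method names are invented, the step in which A3 passes SELECT on to A4 is an assumption added to show propagation, and real systems differ in how they handle cascading revocation.

```python
class AuthorizationTable:
    """Toy model of relation-level privileges with the grant option."""

    def __init__(self):
        # Each grant: (grantor, grantee, privilege, relation, grant_option)
        self.grants = []

    def add_owner(self, owner, relation, privileges):
        # The owner holds every privilege with the grant option; grantor is None.
        for privilege in privileges:
            self.grants.append((None, owner, privilege, relation, True))

    def holds(self, account, privilege, relation, need_option=False):
        return any(
            g[1] == account and g[2] == privilege and g[3] == relation
            and (g[4] or not need_option)
            for g in self.grants
        )

    def grant(self, grantor, grantee, privilege, relation, grant_option=False):
        if not self.holds(grantor, privilege, relation, need_option=True):
            raise PermissionError(f"{grantor} cannot grant {privilege} on {relation}")
        self.grants.append((grantor, grantee, privilege, relation, grant_option))

    def revoke(self, grantor, grantee, privilege, relation):
        self.grants = [
            g for g in self.grants
            if g[:4] != (grantor, grantee, privilege, relation)
        ]
        # Cascade: drop grants whose grantor no longer holds the grant option.
        changed = True
        while changed:
            changed = False
            for g in list(self.grants):
                if g[0] is not None and not self.holds(g[0], g[2], g[3], True):
                    self.grants.remove(g)
                    changed = True


auth = AuthorizationTable()
auth.add_owner("A1", "EMPLOYEE", ["SELECT", "INSERT", "DELETE"])
auth.grant("A1", "A2", "INSERT", "EMPLOYEE")                     # no grant option
auth.grant("A1", "A3", "SELECT", "EMPLOYEE", grant_option=True)  # WITH GRANT OPTION
auth.grant("A3", "A4", "SELECT", "EMPLOYEE")                     # A3 propagates (assumed)

a4_before = auth.holds("A4", "SELECT", "EMPLOYEE")
auth.revoke("A1", "A3", "SELECT", "EMPLOYEE")
a3_after = auth.holds("A3", "SELECT", "EMPLOYEE")
a4_after = auth.holds("A4", "SELECT", "EMPLOYEE")
```

In this model A2 cannot pass its INSERT privilege on because it was granted without the grant option, and revoking SELECT from A3 also removes the privilege that A3 had handed to A4.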
