DB2 9 Fundamentals Exam 730 Prep, Part 6: Data Concurrency
Data concurrency
Roger E. Sanders (rsanders@netapp.com)
Consultant Corporate Systems Engineer
EMC Corporation
This tutorial introduces the concept of data consistency and the various
mechanisms that DB2 for Linux, UNIX, and Windows uses to maintain
consistency in both single- and multi-user database environments. This
is the last tutorial in a series of six tutorials that you can use to help prepare for
the DB2 for Linux, UNIX, and Windows Fundamentals Certification (Exam 730).
developerWorks
ibm.com/developerWorks/
Objectives
After completing this tutorial, you should be able to:
Prerequisites
In order to understand some of the material presented in this tutorial, you should be
familiar with the following terms:
Object: Anything in a database that can be created or manipulated with SQL
(e.g., tables, views, indexes, packages).
Table: A logical structure that is used to present data as a collection of
unordered rows with a fixed number of columns. Each column contains a set
of values, each value of the same data type (or a subtype of the column's data
type); the definitions of the columns make up the table structure, and the rows
contain the actual table data.
Record: The storage representation of a row in a table.
Field: The storage representation of a column in a table.
Value: A specific data item that can be found at each intersection of a row and
column in a database table.
Structured Query Language (SQL): A standardized language used to define
objects and manipulate data in a relational database. (For more on SQL, see
the fourth tutorial in this series.)
DB2 optimizer: A component of the SQL precompiler that chooses an access
plan for a Data Manipulation Language (DML) SQL statement by modeling the
execution cost of several alternative access plans and choosing the one with
the minimal estimated cost.
System requirements
You do not need a copy of DB2 to complete this tutorial. However, you will get more
out of the tutorial if you download the free trial version of IBM DB2 for Linux, UNIX,
and Windows to work along with this tutorial.
Section 2. Transactions
Understanding data consistency
What is data consistency? The best way to answer this question is by example.
Suppose your company owns a chain of restaurants and you have a database that
is designed to keep track of supplies stored at each of those restaurants. To facilitate
the supply-purchasing process, your database contains an inventory table for each
restaurant in the chain. Whenever supplies are received or used by an individual
restaurant, the corresponding inventory table for that restaurant is modified to reflect
the changes.
Now, suppose some bottles of ketchup are physically moved from one restaurant
to another. In order to accurately represent this inventory move, the ketchup bottle
count value stored in the donating restaurant's table needs to be lowered and the
ketchup bottle count value stored in the receiving restaurant's table needs to be
raised. If a user lowers the ketchup bottle count in the donating restaurant's inventory
table but fails to raise the ketchup bottle count in the receiving restaurant's inventory
table, the data will become inconsistent: the total ketchup bottle count for the
chain of restaurants is no longer accurate.
Data in a database can become inconsistent if a user forgets to make all necessary
changes (as in the previous example), if the system crashes while the user is in
the middle of making changes, or if a database application for some reason stops
prematurely. Inconsistency can also occur when several users are accessing the
same database tables at the same time. In an effort to prevent data inconsistency,
particularly in a multi-user environment, the following data consistency support
mechanisms have been incorporated into the design of DB2 for Linux, UNIX, and
Windows:
Transactions
Isolation levels
Locks
Either the effects of all SQL operations performed within
a transaction are applied to the database (committed), or the effects of all SQL
operations performed are completely undone and thrown away (rolled back).
With embedded SQL applications and scripts run from the Command Center, the
Script Center, or the Command Line Processor, transactions are automatically
initiated the first time an executable SQL statement is executed, either after a
connection to a database has been established or after an existing transaction has been
terminated. Once initiated, a transaction must be explicitly terminated by the user or
application that initiated it, unless a process known as automatic commit is being
used (in which case each individual SQL statement submitted for execution is treated
as a single transaction that is implicitly committed as soon as it is executed).
In most cases, transactions are terminated by executing either the COMMIT or the
ROLLBACK statement. When the COMMIT statement is executed, all changes that have
been made to the database since the transaction was initiated are made permanent
(that is, they are written to disk). When the ROLLBACK statement is executed, all
changes that have been made to the database since the transaction was initiated are
backed out and the database is returned to the state it was in before the transaction
began. In either case, the database is guaranteed to be returned to a consistent state
at the completion of the transaction.
It is important to note that, while transactions provide generic database consistency
by ensuring that changes to data only become permanent after a transaction has
been successfully committed, it is up to the user or application to ensure that the
sequence of SQL operations performed within each transaction will always result in a
consistent database.
A table named DEPARTMENT will be created that looks something like this:
DEPT_ID   DEPT_NAME
100       PAYROLL
200       ACCOUNTING
500       MARKETING
That's because when the first COMMIT statement is executed, the creation of
the table named DEPARTMENT, along with the insertion of two records into the
DEPARTMENT table, will be made permanent. On the other hand, when the first
ROLLBACK statement is executed, the third record inserted into the DEPARTMENT
table is removed and the table is returned to the state it was in before the insert
operation was performed. Finally, when the second COMMIT statement is executed,
the insertion of the fourth record into the DEPARTMENT table is made permanent and the
database is again returned to a consistent state.
As you can see from this example, a commit or rollback operation only affects
changes that are made within the transaction that the commit or rollback operation
ends. As long as data changes remain uncommitted, other users and applications
are usually unable to see them (there are exceptions, which we will look at later),
and they can be backed out simply by performing a rollback operation. Once data
changes are committed, however, they become accessible to other users and
applications and can no longer be removed by a rollback operation.
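The commit and rollback sequence described above (create the table and insert two records, commit; insert a third record, roll back; insert a fourth record, commit) can be sketched as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for DB2, and the values of the rolled-back record (300, 'SALES') are hypothetical, since only the committed records are known.

```python
import sqlite3

# isolation_level=None turns off the module's implicit transaction handling,
# so every transaction boundary below is explicit (SQLite stand-in, not DB2).
conn = sqlite3.connect(":memory:", isolation_level=None)
cur = conn.cursor()

# Transaction 1: create the table and insert two records, then commit.
cur.execute("BEGIN")
cur.execute("CREATE TABLE department (dept_id INTEGER NOT NULL, dept_name TEXT)")
cur.execute("INSERT INTO department VALUES (100, 'PAYROLL')")
cur.execute("INSERT INTO department VALUES (200, 'ACCOUNTING')")
cur.execute("COMMIT")

# Transaction 2: insert a third record, then roll it back.
cur.execute("BEGIN")
cur.execute("INSERT INTO department VALUES (300, 'SALES')")  # hypothetical values
cur.execute("ROLLBACK")

# Transaction 3: insert a fourth record, then commit.
cur.execute("BEGIN")
cur.execute("INSERT INTO department VALUES (500, 'MARKETING')")
cur.execute("COMMIT")

rows = cur.execute(
    "SELECT dept_id, dept_name FROM department ORDER BY dept_id"
).fetchall()
print(rows)  # [(100, 'PAYROLL'), (200, 'ACCOUNTING'), (500, 'MARKETING')]
```

Only the work done inside committed transactions survives; the rolled-back record never becomes visible.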
DB2 for Linux, UNIX, and Windows uses the following isolation levels to
enforce concurrency:
Repeatable Read
Read Stability
Cursor Stability
Uncommitted Read
The repeatable read isolation level prevents all phenomena, but greatly reduces the
level of concurrency (the number of transactions that can access the same resource
simultaneously) available. The uncommitted read isolation level provides the greatest
level of concurrency, but allows all three phenomena (dirty reads, nonrepeatable
reads, and phantoms) to occur.
is now positioned on.) Furthermore, if the owning transaction modifies any row it
retrieves, no other transaction is allowed to update or delete that row until the owning
transaction is terminated, even though the cursor may no longer be positioned on
the modified row. As with the repeatable read and read stability isolation levels,
transactions using the cursor stability isolation level (which is the default isolation
level used) won't see changes made to other rows by other transactions until those
changes have been committed.
If our hotel reservation application is running under the cursor stability isolation level, here's how
it will operate. When a customer scans the database for a list of rooms available for
a given date range and then views information about each room on the list produced,
one room at a time, you will be able to change the room rates for any room in the
hotel except for the room the customer is currently looking at (for the date range
specified). Likewise, other customers will be able to make or cancel reservations for
any room in the hotel except the room the customer is currently looking at (for the
date range specified). However, neither you nor other customers will be able to do
anything with the room record the first customer is currently looking at. When the first
customer views information about another room in the list, you and other customers
will be able to modify the room record the first customer was just looking at (provided
the customer did not reserve it); however, no one will be allowed to change the room
record the first customer is now looking at. This behavior is illustrated in Figure 4.
indexes that have been dropped; transactions using the uncommitted read isolation
level will only learn that these objects no longer exist when the transaction that
dropped them is committed. (It's important to note that when a transaction running under this
isolation level uses an updatable cursor, the transaction will behave as if it is running
under the cursor stability isolation level, and the constraints of the cursor stability
isolation level will apply.)
So how would the uncommitted read isolation level affect our hotel reservation
application? Now, when a customer scans the database to obtain a list of available
rooms for a given date range, you will be able to change the room rates for any room
in the hotel over any date range. Likewise, other customers will be able to make
or cancel reservations for any room in the hotel including the room the customer
is currently looking at (for the date range specified). In addition, the list of rooms
produced for the first customer may contain records for rooms that other customers
are in the process of reserving and that are therefore not really available. This
behavior is illustrated in Figure 5.
Figure 5. Example of the Uncommitted Read isolation level
to DB2 commands, SQL statements, and scripts executed from the Command
Line Processor (CLP) as well as to embedded SQL, ODBC/CLI, JDBC, and SQLJ
applications. Therefore, it's also possible to specify the isolation level for operations
that are to be performed from the DB2 Command Line Processor (as well as
for scripts that are to be passed to the DB2 CLP for processing). In this case,
the isolation level is set by executing the CHANGE ISOLATION command before a
connection to a database is established.
DB2 for Linux, UNIX, and Windows also lets you specify the isolation level that
a particular query is to run under, by appending the WITH [RR | RS | CS | UR]
clause to a SELECT SQL statement. A simple SELECT statement that uses this
clause looks something like this:
SELECT * FROM EMPLOYEE WHERE EMPID = '001' WITH RR
If you have an application that needs to run under a less restrictive isolation level the
majority of the time (to support maximum concurrency), but that contains some queries
for which you must not see phenomena, this clause lets you raise the isolation level
for just those queries.
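As a small illustration of how an application might apply the clause selectively, the sketch below appends a statement-level isolation clause to a query string. The helper name with_isolation is hypothetical, and the code only builds SQL text; it does not connect to DB2 or execute anything.

```python
# The four abbreviations accepted by the WITH clause described in the text.
VALID_LEVELS = {"RR", "RS", "CS", "UR"}


def with_isolation(select_sql: str, level: str) -> str:
    """Return select_sql with a statement-level isolation clause appended.

    Hypothetical helper for illustration; it performs no SQL parsing beyond
    trimming trailing whitespace and a trailing semicolon.
    """
    level = level.upper()
    if level not in VALID_LEVELS:
        raise ValueError(f"unknown isolation level: {level!r}")
    base = select_sql.rstrip().rstrip(";").rstrip()
    return f"{base} WITH {level}"


query = with_isolation("SELECT * FROM EMPLOYEE WHERE EMPID = '001'", "rr")
print(query)  # SELECT * FROM EMPLOYEE WHERE EMPID = '001' WITH RR
```

An application could run most queries without the clause and call such a helper only for the few that need stricter isolation.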
Section 4. Locks
How locking works
In the section on Concurrency and isolation levels, we saw that DB2 for Linux, UNIX,
and Windows isolates transactions from each other through the use of locks. A lock
is a mechanism that is used to associate a data resource with a single transaction,
with the purpose of controlling how other transactions interact with that resource
while it is associated with the owning transaction. (The transaction that a locked
resource is associated with is said to hold or own the lock.) The DB2 Database
Manager uses locks to prohibit transactions from accessing uncommitted data written
by other transactions (unless the Uncommitted Read isolation level is used) and to
prohibit the updating of rows by other transactions when the owning transaction is
using a restrictive isolation level. Once a lock is acquired, it is held until the owning
transaction is terminated; at that point, the lock is released and the data resource is
made available to other transactions.
If one transaction attempts to access a data resource in a way that is incompatible
with the lock being held by another transaction (we'll look at lock compatibility
shortly), that transaction must wait until the owning transaction has ended. This
is known as a lock wait event. When a lock wait event occurs, the transaction
attempting to access the data resource simply stops execution until the owning
transaction has terminated and the incompatible lock is released.
Lock attributes
All locks have the following basic attributes:
Object: The object attribute identifies the data resource that is being locked.
The DB2 Database Manager acquires locks on data resources, such as
tablespaces, tables, and rows, whenever they are needed.
Size: The size attribute specifies the physical size of the portion of the data
resource that is being locked. A lock does not always have to control an entire
data resource. For example, rather than giving an application exclusive control
over an entire table, the DB2 Database Manager can give an application
exclusive control over a specific row in a table.
Duration: The duration attribute specifies the length of time for which a lock is
held. A transaction's isolation level usually controls the duration of a lock.
Mode: The mode attribute specifies the type of access allowed for the lock
owner as well as the type of access permitted for concurrent users of the locked
data resource. This attribute is commonly referred to as the lock state.
Lock states
The state of a lock determines the type of access allowed for the lock owner as well
as the type of access permitted for concurrent users of a locked data resource. Table
1 identifies the lock states that are available, in order of increasing control.
Table 1. Lock states
Lock State (Mode) | Applicable Objects | Description
(Only part of this table survives. The lock states still legible are Share (S),
Update (U), Exclusive (X), and Weak Exclusive (W).)
The DB2 Database Manager always attempts to acquire row-level locks. However,
this behavior can be modified by executing a special form of the ALTER TABLE
statement, as follows:
ALTER TABLE [ TableName ] LOCKSIZE TABLE
where TableName identifies the name of an existing table on which all transactions
are to acquire table-level locks when accessing it.
The DB2 Database Manager can also be forced to acquire a table-level lock on a
table for a specific transaction by executing the LOCK TABLE statement, as follows:
LOCK TABLE [ TableName ] IN [SHARE | EXCLUSIVE] MODE
where TableName identifies the name of an existing table for which a table-level
lock is to be acquired (provided that no other transaction has an incompatible lock
on this table). If this statement is executed with the SHARE mode specified, a
table-level lock that will allow other transactions to read, but not change, the data
stored in the table will be acquired; if executed with the EXCLUSIVE mode specified, a
table-level lock that does not allow other transactions to read or modify data stored
in the table will be acquired.
Lock conversion
When a transaction attempts to access a data resource that it already holds a lock
on, and the mode of access needed requires a more restrictive lock than the one
already held, the state of the lock held is changed to the more restrictive state. The
operation of changing the state of a lock already held to a more restrictive state is
known as lock conversion. Lock conversion occurs because a transaction can hold
only one lock on a data resource at a time.
In most cases, lock conversion is performed for row-level locks and the conversion
process is straightforward. For example, if a Share (S) or an Update (U)
row-level lock is held and an Exclusive (X) lock is needed, the held lock will be converted
to an Exclusive (X) lock. Intent Exclusive (IX) locks and Share (S) locks are special
cases, however, since neither is considered to be more restrictive than the other.
Thus, if one of these locks is held and the other is requested, the held lock
is converted to a Share with Intent Exclusive (SIX) lock. Similar conversions result in
the requested lock state becoming the new lock state of the held lock, provided the
requested lock state is more restrictive. (Lock conversion only occurs if a held lock
can increase its restriction.) Once a lock's state has been converted, the lock stays at
the highest state obtained until the transaction holding the lock is terminated.
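The conversion rules in this paragraph can be sketched as a small function. This is a toy model covering only the cases the text names (S, U, X, and the IX/S special case); the ranking table and the function name convert are assumptions made for illustration, not DB2 internals.

```python
# Assumed restrictiveness order for the states covered by this sketch: S < U < X.
RANK = {"S": 1, "U": 2, "X": 3}


def convert(held: str, requested: str) -> str:
    """Return the lock state held after `requested` is asked for."""
    if held == requested:
        return held
    if {held, requested} == {"IX", "S"}:
        # Neither state is more restrictive than the other, so both are combined.
        return "SIX"
    if held not in RANK or requested not in RANK:
        raise ValueError(f"pair not covered by this sketch: {held}, {requested}")
    # Conversion only ever increases restriction; otherwise the held state stays.
    return requested if RANK[requested] > RANK[held] else held


print(convert("S", "X"))   # X
print(convert("X", "S"))   # X (a lock never drops to a weaker state)
print(convert("IX", "S"))  # SIX
```

Because the function never returns a weaker state than the one held, repeated calls mirror the rule that a lock stays at the highest state obtained until the transaction ends.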
Lock escalation
All locks require space for storage; because the space available is not infinite, the
DB2 Database Manager must limit the amount of space that can be used for locks
(the locklist database configuration parameter sets the total size of the lock list, and
maxlocks sets the percentage of that list a single application may hold). In order
to prevent a specific database agent from exceeding the lock space limitations
established, a process known as lock escalation is performed automatically
whenever too many locks (of any type) have been acquired. Lock escalation is the
conversion of several individual row-level locks within the same table to a single
table-level lock. Since lock escalation is handled internally, the only externally
detectable result might be a reduction in concurrent access on one or more tables.
Here's how lock escalation works: When a transaction requests a lock and the lock
storage space is full, one of the tables associated with the transaction is selected, a
table-level lock is acquired on its behalf, all row-level locks for that table are released
(to create space in the lock list data structure), and the table-level lock is added to
the lock list. If this process does not free up enough space, another table is selected
and the process is repeated until enough free space is available. At that point, the
requested lock is acquired and the transaction resumes execution. However, if the
necessary lock space is still unavailable after all the transaction's row-level locks
have been escalated, the transaction is asked (via an SQL error code) to either
commit or roll back all changes that have been made since its initiation and the
transaction is terminated.
Lock timeouts
Any time a transaction holds a lock on a particular data resource (for example, a
table or a row), other transactions may be denied access to that resource until the
owning transaction terminates and frees all locks it has acquired. Without some sort
of lock timeout detection mechanism in place, a transaction might wait indefinitely for
a lock to be released. Such a situation might occur, for example, when a transaction
is waiting for a lock that is held by another user's application to be released, and the
other user has left his or her workstation without performing some interaction that
would allow the application to terminate the owning transaction. Obviously, such a
situation can cause poor application performance. To avoid stalling other applications
when this occurs, a lock timeout value can be specified in a database's configuration
file (via the locktimeout database configuration parameter). When used, this value
controls the amount of time any transaction will wait to obtain a requested lock. If the
desired lock is not acquired before the time interval specified elapses, the waiting
application receives an error and the transaction requesting the lock is rolled back.
Distributed transaction application environments are particularly prone to these types
of situations; you can avoid them by using lock timeouts.
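The effect of a lock timeout can be illustrated by analogy with an ordinary in-process lock. The sketch below uses Python's threading.Lock in place of a DB2 lock, and the function name acquire_or_roll_back and the 0.1-second timeout are assumptions chosen for the example; it shows only the waiting behavior, not DB2's implementation.

```python
import threading


def acquire_or_roll_back(lock: threading.Lock, lock_timeout: float) -> str:
    """Wait up to lock_timeout seconds for the lock, like a waiting transaction."""
    if lock.acquire(timeout=lock_timeout):
        try:
            return "committed"   # the transaction's work would happen here
        finally:
            lock.release()
    return "rolled back"         # timed out: give up instead of waiting forever


row_lock = threading.Lock()

row_lock.acquire()               # transaction 1 holds the lock and does not end
outcome_waiting = acquire_or_roll_back(row_lock, lock_timeout=0.1)

row_lock.release()               # transaction 1 finally terminates
outcome_free = acquire_or_roll_back(row_lock, lock_timeout=0.1)

print(outcome_waiting, outcome_free)  # rolled back committed
```

Without the timeout argument, the first call would block for as long as the other holder kept the lock, which is the stalled-application situation the paragraph warns about.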
Deadlocks
Although the situation of one transaction waiting indefinitely for a lock to be released
by another transaction can be resolved by establishing lock timeouts, there is one
scenario where contention for locks by two or more transactions cannot be resolved
by a timeout. This situation is known as a deadlock, or more specifically, a deadlock
cycle. The best way to illustrate how a deadlock can occur is by example: Suppose
Transaction 1 acquires an Exclusive (X) lock on Table A and Transaction 2 acquires
an Exclusive (X) lock on Table B. Now, suppose Transaction 1 attempts to acquire an
Exclusive (X) lock on Table B and Transaction 2 attempts to acquire an Exclusive (X)
lock on Table A. Processing by both transactions will be suspended until their second
lock request is granted. However, because neither lock request can be granted until
one of the transactions releases the lock it currently holds (by performing a commit or
rollback operation), and because neither transaction can release the lock it currently
holds (because both are suspended and waiting on locks), the transactions are stuck
in a deadlock cycle. Figure 7 illustrates this deadlock scenario.
When a deadlock cycle occurs, each transaction involved will wait indefinitely for a
lock to be released unless some outside agent steps in. With DB2 for Linux, UNIX,
and Windows, this agent is an asynchronous system background process that is
known as the deadlock detector. The sole responsibility of the deadlock detector is
to locate and resolve any deadlocks found in the locking subsystem. Each database
has its own deadlock detector, which is activated as part of the database initialization
process. Once activated, the deadlock detector stays "asleep" most of the time
but "wakes up" at preset intervals to examine the locking subsystem for deadlock
cycles. If the deadlock detector discovers that a deadlock cycle exists, it randomly
selects one of the transactions in the cycle to terminate and roll back. The transaction
chosen receives an SQL error code and all locks it had acquired are released; the
remaining transaction(s) can then proceed because the deadlock cycle has been
broken.
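One common way to picture what a deadlock detector does is as a search for a cycle in a "waits-for" graph. The sketch below is a simplified model under that assumption, not DB2's actual algorithm: waits_for maps each suspended transaction to the transaction holding the lock it wants, and the function names are hypothetical. The victim is chosen at random, as the text describes.

```python
import random
from typing import Dict, List, Optional


def find_cycle(waits_for: Dict[str, str]) -> List[str]:
    """Return the transactions forming a wait cycle, or [] if there is none."""
    for start in waits_for:
        path: List[str] = []
        current = start
        # Follow the chain of waits until it ends or revisits a transaction.
        while current in waits_for and current not in path:
            path.append(current)
            current = waits_for[current]
        if current in path:
            return path[path.index(current):]
    return []


def resolve_deadlock(waits_for: Dict[str, str], rng: random.Random) -> Optional[str]:
    """Pick a random victim from a deadlock cycle and release its locks."""
    cycle = find_cycle(waits_for)
    if not cycle:
        return None
    victim = rng.choice(cycle)
    waits_for.pop(victim)  # the victim is rolled back and stops waiting
    # Transactions that were waiting on the victim's locks can now proceed.
    for txn in [t for t, holder in waits_for.items() if holder == victim]:
        del waits_for[txn]
    return victim


# Transaction 1 waits for Table B (held by 2); Transaction 2 waits for Table A (held by 1).
waits_for = {"T1": "T2", "T2": "T1"}
victim = resolve_deadlock(waits_for, random.Random())
print(victim, find_cycle(waits_for))
```

After the victim is removed, a second call to find_cycle returns an empty list, which corresponds to the remaining transaction being free to continue.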
Lock granularity
It was mentioned earlier that any time a transaction holds a lock on a particular data
resource, other transactions may be denied access to that resource until the owning
transaction terminates. Therefore, to optimize for maximum concurrency, row-level
locks are usually better than table-level locks, because they limit access to a much
smaller resource. However, because each lock acquired requires some amount of
processing time and storage space to acquire and manage, a single table-level lock
will require less overhead than several individual row-level locks. Unless otherwise
specified, row-level locks are acquired by default.
The granularity of locks (that is, whether row-level locks or table-level locks are
acquired) can be controlled through the use of the ALTER TABLE ... LOCKSIZE
TABLE, ALTER TABLE ... LOCKSIZE ROW, and LOCK TABLE statements. The ALTER
TABLE ... LOCKSIZE TABLE statement provides a global approach to granularity
that results in table-level locks being acquired by all transactions that access rows
within a particular table. On the other hand, the LOCK TABLE statement allows
table-level locks to be acquired at an individual transaction level. When either of these
statements is used, a single Share (S) or Exclusive (X) table-level lock is acquired
whenever a lock is needed. As a result, locking performance is usually improved,
since only one table-level lock must be acquired and released instead of several
different row-level locks. However, when table-level locking is used, concurrency can
be decreased if long-running transactions acquire Exclusive, rather than Share,
table-level locks.
(IX), and Exclusive (X) locks for tables, and Share (S), Update (U), and Exclusive
(X) locks for rows. Change transactions tend to use Intent Exclusive (IX) and/or
Exclusive (X) locks, while Cursor Controlled transactions often use Intent Exclusive
(IX) and/or Exclusive (X) locks.
When an SQL statement is prepared for execution, the DB2 optimizer explores
various ways to satisfy that statement's request and estimates the execution cost
involved for each approach. Based on this evaluation, the DB2 optimizer then
selects what it believes to be the optimal access plan. (The access plan specifies the
operations required and the order in which those operations are to be performed to
resolve an SQL request.) An access plan can use one of two ways to access data
in a table: by directly reading the table (which is known as performing a table or a
relation scan), or by reading an index on that table and then retrieving the row in the
table to which a particular index entry refers (which is known as performing an index
scan).
The access path chosen by the DB2 optimizer, which is often determined by the
database's design, can have a significant impact on the number of locks acquired
and the lock states used. For example, when an index scan is used to locate a
specific row, the DB2 Database Manager will most likely acquire an Intent
Share (IS) table-level lock along with one or more row-level locks. However, if a
table scan is used, because the entire table must be scanned, in sequence, to locate
a specific row, the DB2 Database Manager may opt to acquire a single Share (S)
table-level lock.
Section 6. Summary
This tutorial was designed to introduce you to the concept of data consistency and
to the various mechanisms that are used by DB2 for Linux, UNIX, and Windows
to maintain database consistency in both single- and multi-user environments. A
database can become inconsistent if a user forgets to make all necessary changes, if
the system crashes while a user is in the middle of making changes, or if a database
application for some reason stops prematurely. Inconsistency can also occur when
several users/applications access the same data resource at the same time. For
example, one user might read another user's changes before all tables have been
properly updated and take some inappropriate action or make an incorrect change
based on the premature data values read. In an effort to prevent data inconsistency,
particularly in a multi-user environment, the developers of DB2 for Linux, UNIX, and
Windows incorporated the following data consistency support mechanisms into its
design:
Transactions
Isolation levels
Locks
Repeatable Read
Read Stability
Cursor Stability
Uncommitted Read
The repeatable read isolation level prevents all phenomena, but greatly reduces the
level of concurrency (the number of transactions that can access the same resource
simultaneously) available. The uncommitted read isolation level provides the greatest
level of concurrency, but allows dirty reads, nonrepeatable reads, and phantoms to
occur.
Along with isolation levels, DB2 for Linux, UNIX, and Windows provides concurrency
in multi-user environments through the use of locks. A lock is a mechanism that
is used to associate a data resource with a single transaction, for the purpose of
controlling how other transactions interact with that resource while it is associated
with the transaction that owns the lock. Several different types of locks are available:
To maintain data integrity, the DB2 Database Manager acquires locks implicitly, and
all locks acquired remain under the DB2 Database Manager's control. Locks can be
placed on tablespaces, tables, and rows.
To optimize for maximum concurrency, row-level locks are usually better than
table-level locks, because they limit access to a much smaller resource. However, because
each lock acquired requires some amount of storage space and processing time to
manage, a single table-level lock will require less overhead than several individual
row-level locks.