Chapter Four
Concurrency Control Techniques
Introduction
Concurrency control is the activity of coordinating the simultaneous execution of transactions
in a multiprocessing or multi-user database management system. The objective of concurrency
control is to ensure the serializability of transactions in a multi-user database management
system. Serializability can be achieved in several ways. Concurrency control can enforce
Isolation (through mutual exclusion) among conflicting transactions, preserve database
consistency through consistency preserving execution of transactions and can resolve read-
write and write-write conflicts. There are two main concurrency control techniques that allow
transactions to execute safely in parallel subject to certain constraints: locking and timestamp
methods.
Locking and timestamping are essentially conservative (or pessimistic) approaches in that
they cause transactions to be delayed in case they conflict with other transactions at some time
in the future. Optimistic methods are based on the premise that conflict is rare, so they allow
transactions to proceed unsynchronized and only check for conflicts at the end, when a
transaction commits.
Locking
Locking is a procedure used to control concurrent access to data. When one transaction is
accessing the database, a lock may deny access to other transactions to prevent incorrect
results. A lock is an operation which secures:
(a) permission to read, or
(b) permission to write a data item for a transaction.
Example:
• Lock (X): Data item X is locked on behalf of the requesting transaction.
Unlocking is an operation which removes these permissions from the data item.
Example: Unlock (X): Data item X is made available to all other transactions.
Lock and Unlock are atomic operations.
Locking methods are the most widely used approach to ensure serializability of concurrent
transactions. There are several variations, but all share the same fundamental characteristic,
namely that a transaction must claim a shared (read) or exclusive (write) lock on a data item
before the corresponding database read or write operation.
• Shared lock: If a transaction has a shared lock on a data item, it can read the item
but not update it.
• Exclusive lock: If a transaction has an exclusive lock on a data item, it can both
read and update the item.
Since read operations cannot conflict, it is permissible for more than one transaction to hold
shared locks simultaneously on the same item. On the other hand, an exclusive lock gives a
transaction exclusive access to that item. Thus, as long as a transaction holds the exclusive lock
on the item, no other transactions can read or update that data item. Locks are used in the
following way:
• Any transaction that needs to access a data item must first lock the item, requesting a
shared lock for read only access or an exclusive lock for both read and write access.
• If the item is not already locked by another transaction, the lock will be granted.
• If the item is currently locked, the DBMS determines whether the request is compatible
with the existing lock. If a shared lock is requested on an item that already has a shared
lock on it, the request will be granted; otherwise, the transaction must wait until the
existing lock is released.
• A transaction continues to hold a lock until it explicitly releases it either during
execution or when it terminates (aborts or commits). It is only when the exclusive lock
has been released that the effects of the write operation will be made visible to other
transactions.
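As an illustration, the following minimal Python sketch shows how a lock manager might apply these rules. The class and its methods (LockManager, request, release) are illustrative names only, not part of any particular DBMS, and waiting is reduced to a simple granted/refused answer:

# Minimal illustrative lock manager: grants shared (S) and exclusive (X) locks
# according to the compatibility rules described above.
class LockManager:
    def __init__(self):
        # item -> (mode, set of transaction ids holding the lock)
        self.locks = {}

    def request(self, tid, item, mode):           # mode is "S" or "X"
        if item not in self.locks:
            self.locks[item] = (mode, {tid})
            return True                           # item not locked: grant
        held_mode, holders = self.locks[item]
        if mode == "S" and held_mode == "S":
            holders.add(tid)                      # shared locks are compatible
            return True
        if holders == {tid}:
            self.locks[item] = (mode, holders)    # same transaction: allow conversion
            return True
        return False                              # incompatible: transaction must wait

    def release(self, tid, item):
        held_mode, holders = self.locks.get(item, (None, set()))
        holders.discard(tid)
        if not holders:
            self.locks.pop(item, None)            # last holder released: item unlocked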
In addition to these rules, some systems permit a transaction to issue a shared lock on an item
and then later to upgrade the lock to an exclusive lock. This in effect allows a transaction to
examine the data first and then decide whether it wishes to update it. For the same reason, some
systems also permit a transaction to issue an exclusive lock and then later to downgrade the
lock to a shared lock.
To guarantee serializability, we must follow an additional protocol concerning the positioning
of the lock and unlock operations in every transaction. The best-known protocol is two-phase
locking (2PL).
Two-phase locking (2PL)
A transaction follows the two-phase locking protocol if all locking operations precede the first
unlock operation in the transaction. According to the rules of this protocol, every transaction
can be divided into two phases: first a growing phase, in which it acquires all the locks needed
but cannot release any locks, and then a shrinking phase, in which it releases its locks but
2
Advanced Database Systems Handout
cannot acquire any new locks. There is no requirement that all locks be obtained
simultaneously. Normally, the transaction acquires some locks, does some processing, and
goes on to acquire additional locks as needed. However, it never releases any lock until it has
reached a stage where no new locks are needed. The rules are:
• A transaction must acquire a lock on an item before operating on the item. The lock
may be read or write, depending on the type of access needed.
• Once the transaction releases a lock, it can never acquire any new locks.
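Building on the lock-manager sketch above, the following hypothetical wrapper shows how the two-phase rule might be enforced for a single transaction; the names are illustrative only:

# Per-transaction two-phase locking: once the transaction has released any
# lock (shrinking phase), further lock requests are rejected.
class TwoPhaseTransaction:
    def __init__(self, tid, lock_manager):
        self.tid = tid
        self.lm = lock_manager
        self.held = set()
        self.shrinking = False      # becomes True after the first unlock

    def lock(self, item, mode):
        if self.shrinking:
            raise RuntimeError("2PL violation: no new locks after the first unlock")
        if self.lm.request(self.tid, item, mode):
            self.held.add(item)
            return True
        return False                # caller would normally block and retry

    def unlock(self, item):
        self.shrinking = True       # entering the shrinking phase
        self.lm.release(self.tid, item)
        self.held.discard(item)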
Two-Phase Locking Techniques: Essential components
• Lock Manager: responsible for managing lock requests and deciding the appropriate lock
type (shared, exclusive, update, and so on).
• Lock table: the lock manager uses it to store the identification of the transaction locking
a data item, the data item itself, the lock mode, and a pointer to the next data item locked.
One simple way to implement a lock table is through a linked list.
Lock conversion
Lock conversion occurs when a process accesses a data object on which it already holds a lock,
and the access mode requires a more restrictive lock than the one already held. A process can
hold only one lock on a data object at any given time, although it can request a lock on the
same data object many times indirectly through a query.
• Lock upgrade: converting an existing read lock to a write lock.
if Ti holds a read-lock(X) and no other transaction Tj (i ≠ j) holds a read-lock(X) then
convert read-lock(X) to write-lock(X)
else
force Ti to wait until the other transactions unlock X
• Lock downgrade: converting an existing write lock to a read lock.
Ti holds a write-lock(X)
(* no other transaction can hold any lock on X *)
convert write-lock(X) to read-lock(X)
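A minimal sketch of the upgrade and downgrade checks, assuming a lock table that maps each item to its lock mode and the set of holders (an illustrative representation only):

# Lock upgrade: Ti may convert its read lock on X to a write lock only if no
# other transaction also holds a read lock on X.
def try_upgrade(lock_table, ti, x):
    mode, holders = lock_table[x]              # e.g. ("S", {"T1", "T2"})
    if mode == "S" and holders == {ti}:
        lock_table[x] = ("X", {ti})            # upgrade read-lock(X) to write-lock(X)
        return True
    return False                               # Ti must wait until the other readers unlock X

def downgrade(lock_table, ti, x):
    # Ti holds the only (write) lock on X, so the conversion is always safe.
    lock_table[x] = ("S", {ti})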
Another problem with two-phase locking, which applies to all locking-based schemes, is that
it can cause deadlock, since transactions can wait for locks on data items. If two transactions
wait for locks on items held by the other, deadlock will occur.
Two common variations of the two-phase locking algorithm are:
• Basic: The transaction locks data items incrementally. This may cause deadlock, which
must be dealt with.
• Strict: A stricter version of the basic algorithm in which unlocking is performed only
after a transaction terminates (commits, or aborts and is rolled back). This is the most
commonly used two-phase locking algorithm.
Deadlock
Deadlock: An impasse that may result when two (or more) transactions are each waiting for
locks to be released that are held by the other.
Because preventing deadlock is more difficult than using timeouts or detecting deadlock and
breaking it when it occurs, systems generally avoid the deadlock prevention method.
Deadlock avoidance
There are many variations of the two-phase locking algorithm. Some avoid deadlock by not
letting the cycle complete: as soon as the algorithm discovers that blocking a transaction is
likely to create a cycle, it rolls back that transaction. The Wound-Wait and Wait-Die
algorithms use timestamps to avoid deadlocks by rolling back a victim transaction.
• wait-die: When an older transaction tries to lock a DB element that has been locked by
a younger transaction, it waits. When a younger transaction tries to lock a DB element
that has been locked by an older transaction, it dies.
• wound-wait: When an older transaction tries to lock a DB element that has been
locked by a younger transaction, it wounds the younger transaction. When
a younger transaction tries to lock a DB element that has been locked by
an older transaction, it waits.
Assume that Tn requests a lock held by Tk. The following table summarizes the actions taken
under the wait-die and wound-wait schemes:

Scheme        Tn is older than Tk (ts(Tn) < ts(Tk))        Tn is younger than Tk (ts(Tn) > ts(Tk))
Wait-die      Tn waits                                     Tn dies (is aborted and restarted)
Wound-wait    Tn wounds Tk (Tk is aborted and restarted)   Tn waits
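A small sketch of the two decision rules, assuming that a smaller timestamp denotes an older transaction (the function names are illustrative):

# Wait-die and wound-wait decisions: Tn requests a lock currently held by Tk.
def wait_die(ts_n, ts_k):
    # Older requester waits; younger requester dies (is aborted and later
    # restarted with its original timestamp, so it eventually becomes oldest).
    return "wait" if ts_n < ts_k else "abort Tn"

def wound_wait(ts_n, ts_k):
    # Older requester wounds (aborts) the younger holder; younger requester waits.
    return "abort Tk" if ts_n < ts_k else "wait"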
Starvation
Starvation occurs when a particular transaction consistently waits or is restarted and never gets
a chance to proceed further. In deadlock resolution it is possible that the same transaction may
repeatedly be selected as victim and rolled back. This limitation is inherent in all priority-
based scheduling mechanisms. In the Wound-Wait scheme, a younger transaction may always
be wounded (aborted) by a long-running older transaction, which may create starvation.
Timeouts
A simple approach to deadlock prevention is based on lock timeouts. With this approach, a
transaction that requests a lock will wait for only a system-defined period of time. If the lock
has not been granted within this period, the lock request times out. In this case, the DBMS
assumes the transaction may be deadlocked, even though it may not be, and it aborts and
automatically restarts the transaction. This is a very simple and practical solution to deadlock
prevention and is used by several commercial DBMSs.
The use of locks, combined with the two-phase locking protocol, guarantees serializability of
schedules. The order of transactions in the equivalent serial schedule is based on the order in
which the transactions lock the items they require. If a transaction needs an item that is already
locked, it may be forced to wait until the item is released. A different approach that also
guarantees serializability uses transaction timestamps to order transaction execution for an
equivalent serial schedule.
Timestamping Methods
Timestamp methods for concurrency control are quite different from locking methods. No
locks are involved, and therefore there can be no deadlock. Locking methods generally prevent
conflicts by making transactions wait. With timestamp methods, there is no waiting:
transactions involved in conflict are simply rolled back and restarted.
• Timestamp: A unique identifier created by the DBMS that indicates the relative
starting time of a transaction. Timestamps can be generated by simply using the system
clock at the time the transaction started, or, more normally, by incrementing a logical
counter every time a new transaction starts.
• Timestamping: A concurrency control protocol that orders transactions in such a way
that older transactions, transactions with smaller timestamps, get priority in the event
of conflict.
With timestamping, if a transaction attempts to read or write a data item, then the read or write
is only allowed to proceed if the last update on that data item was carried out by an older
transaction. Otherwise, the transaction requesting the read/write is restarted and given a new
timestamp. New timestamps must be assigned to restarted transactions to prevent their being
continually aborted and restarted. Without new timestamps, a transaction with an old
timestamp might not be able to commit owing to younger transactions having already
committed.
Besides timestamps for transactions, there are timestamps for data items. Each data item
contains a read_timestamp, giving the timestamp of the last transaction to read the item, and
a write_timestamp, giving the timestamp of the last transaction to write (update) the item. For
a transaction T with timestamp ts(T), the timestamp ordering protocol works as follows.
Basic Timestamp Ordering
1. Transaction T issues a read(x)
a. Transaction T asks to read an item (x) that has already been updated by a
younger (later) transaction, that is ts(T) < write_timestamp(x). This means that
an earlier transaction is trying to read a value of an item that has been updated
by a later transaction. The earlier transaction is too late to read the previous
outdated value, and any other values it has acquired are likely to be inconsistent
with the updated value of the data item. In this situation, transaction T must be
aborted and restarted with a new (later) timestamp.
b. Otherwise, ts(T) ≥ write_timestamp(x), and the read operation can proceed. We
set read_timestamp(x) = max(ts(T), read_timestamp(x)).
2. Transaction T issues a write(x)
a. Transaction T asks to write an item (x) whose value has already been read by a
younger transaction, that is ts(T) < read_timestamp(x). This means that a later
transaction is already using the current value of the item and it would be an error
to update it now. This occurs when a transaction is late in doing a write and a
younger transaction has already read the old value or written a new one. In this
case, the only solution is to roll back transaction T and restart it using a later
timestamp.
b. Transaction T asks to write an item (x) whose value has already been written by
a younger transaction, that is ts(T) < write_timestamp(x). This means that
transaction T is attempting to write an obsolete value of data item x. Transaction
T should be rolled back and restarted using a later timestamp.
c. Otherwise, the write operation can proceed. We set write_timestamp(x) = ts(T).
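The checks above can be summarised in the following sketch, assuming read_ts and write_ts are dictionaries holding the read and write timestamps of each item (illustrative code, not a full scheduler):

# Basic timestamp ordering checks; ts_t is ts(T).
def read_item(ts_t, x, read_ts, write_ts):
    if ts_t < write_ts[x]:
        return "abort"                       # x was already written by a younger transaction
    read_ts[x] = max(read_ts[x], ts_t)       # record the read and proceed
    return "proceed"

def write_item(ts_t, x, read_ts, write_ts):
    if ts_t < read_ts[x]:
        return "abort"                       # a younger transaction has already read x
    if ts_t < write_ts[x]:
        return "abort"                       # a younger transaction has already written x
    write_ts[x] = ts_t                       # record the write and proceed
    return "proceed"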
Basic timestamp ordering guarantees that transactions are conflict serializable, and the results
are equivalent to a serial schedule in which the transactions are executed in chronological order
of the timestamps. However, basic timestamp ordering does not guarantee recoverable
schedules. Strict Timestamp Ordering can overcome the limitation of the basic timestamp
ordering.
Strict Timestamp Ordering
1. Transaction T issues a read_item(X) operation: if TS(T) > write_TS(X), then delay T
until the transaction T’ that wrote X has terminated (committed or aborted).
2. Transaction T issues a write_item(X) operation: if TS(T) > write_TS(X), then delay T
until the transaction T’ that wrote X has terminated (committed or aborted).
Thomas’s write rule
A modification to the basic timestamp ordering protocol that relaxes conflict serializability can
be used to provide greater concurrency by rejecting obsolete write operations. The extension,
known as Thomas’s write rule, modifies the checks for a write operation by transaction T as
follows:
a) Transaction T asks to write an item (x) whose value has already been read by a
younger transaction, that is ts(T) < read_timestamp(x). As before, roll back
transaction T and restart it using a later timestamp.
b) Transaction T asks to write an item (x) whose value has already been written by a
younger transaction, that is ts(T) < write_timestamp(x). This means that a later
transaction has already updated the value of the item, and the value that the older
transaction is writing must be based on an obsolete value of the item. In this case,
the write operation can safely be ignored.
c) Otherwise, as before, the write operation can proceed. We set write_timestamp(x)
= ts(T).
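Under Thomas’s write rule only the second check of the write operation changes: the obsolete write is skipped rather than causing an abort. A sketch, mirroring the basic timestamp ordering code above:

# write_item under Thomas's write rule (compare with the basic version above).
def write_item_thomas(ts_t, x, read_ts, write_ts):
    if ts_t < read_ts[x]:
        return "abort"      # as before: a younger transaction has already read x
    if ts_t < write_ts[x]:
        return "ignore"     # obsolete write: safely ignored, T continues
    write_ts[x] = ts_t
    return "proceed"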
Multiversion Timestamp Ordering
Versioning of data can also be used to increase concurrency, since different users may work
concurrently on different versions of the same object instead of having to wait for each other’s
transactions to complete. In the event that the work appears faulty at any stage, it should be
possible to roll back the work to some valid state.
The basic timestamp ordering protocol assumes that only one version of a data item exists, and
so only one transaction can access a data item at a time. This restriction can be relaxed if we
allow multiple transactions to read and write different versions of the same data item, and
ensure that each transaction sees a consistent set of versions for all the data items it accesses.
In multiversion concurrency control, each write operation creates a new version of a data item
while retaining the old version. When a transaction attempts to read a data item, the system
selects one of the versions that ensures serializability.
For each data item x, we assume that the database holds n versions x1, x2, . . ., xn. For each
version i, the system stores three values:
• the value of version xi;
• read_timestamp(xi), which is the largest timestamp of all transactions that have
successfully read version xi;
• write_timestamp(xi), which is the timestamp of the transaction that created version xi.
Let ts(T) be the timestamp of the current transaction. The multiversion timestamp ordering
protocol uses the following two rules to ensure serializability:
1. If transaction T issues a write(x), and version xi of x has the largest write_timestamp(xi)
of all versions of x that is also less than or equal to ts(T), and read_timestamp(xi) > ts(T),
then roll back and restart transaction T; otherwise, create a new version xj of x with
read_timestamp(xj) = write_timestamp(xj) = ts(T).
2. If transaction T issues a read(x), the value returned is the version xi of x that has the
largest write_timestamp(xi) of all versions of x that is also less than or equal to ts(T);
read_timestamp(xi) is set to the larger of ts(T) and its current value. A read request can
therefore always be satisfied.
Optimistic Techniques
Optimistic (validation-based) techniques are based on the assumption that conflict is rare, so
transactions are allowed to proceed unsynchronized and are checked for conflicts only at the
end. A transaction proceeds in two or three phases:
• Read phase: This extends from the start of the transaction until immediately before the
commit. The transaction reads the values of all data items it needs from the database and
stores them in local variables; updates are applied to a local copy of the data, not to the
database itself.
• Validation phase: This follows the read phase. Checks are performed to ensure
serializability is not violated if the transaction updates are applied to the database. For
a read-only transaction, this consists of checking that the data values read are still the
current values for the corresponding data items. If no interference occurred, the
transaction is committed. If interference occurred, the transaction is aborted and
restarted. For a transaction that has updates, validation consists of determining whether
the current transaction leaves the database in a consistent state, with serializability
maintained. If not, the transaction is aborted and restarted.
• Write phase: This follows the successful validation phase for update transactions.
During this phase, the updates made to the local copy are applied to the database.
The validation phase examines the reads and writes of transactions that may cause interference.
Each transaction T is assigned a timestamp at the start of its execution, start (T), one at the start
of its validation phase, validation(T), and one at its finish time, finish(T), including its write
phase, if any. To pass the validation test, one of the following must be true:
1. All transactions S with earlier timestamps must have finished before transaction T
started; that is, finish(S) < start (T).
2. If transaction T starts before an earlier transaction S finishes, then:
a. the set of data items written by the earlier transaction is disjoint from the set of
data items read by the current transaction; and
b. the earlier transaction completes its write phase before the current transaction
enters its validation phase, that is, start(T) < finish(S) < validation(T).
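A sketch of this validation test is shown below; transactions are modelled as simple records carrying the three timestamps and the read and write sets (the field names are illustrative):

from dataclasses import dataclass, field

@dataclass
class Txn:
    start: int
    validation: int
    finish: int = 0                              # unknown for the transaction being validated
    read_set: set = field(default_factory=set)
    write_set: set = field(default_factory=set)

def validate(t, overlapping):
    # 'overlapping' contains every already-validated transaction s whose
    # execution overlaps transaction t.
    for s in overlapping:
        if s.finish < t.start:
            continue                             # rule 1: s finished before t started
        if s.finish < t.validation and not (s.write_set & t.read_set):
            continue                             # rule 2: no read/write overlap and s wrote out first
        return False                             # interference detected: abort and restart t
    return True                                  # t passes validation and may enter its write phase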
Although optimistic techniques are very efficient when there are few conflicts, they can result
in the rollback of individual transactions. Note that the rollback involves only a local copy of
the data so there are no cascading rollbacks, since the writes have not actually reached the
database. However, if the aborted transaction is of a long duration, valuable processing time
will be lost since the transaction must be restarted. If rollback occurs often, it is an indication
that the optimistic method is a poor choice for concurrency control in that particular
environment.
Granularity of Data Items
Granularity is the size of data items chosen as the unit of protection by a concurrency control
protocol. A lockable unit of data defines its granularity. Granularity can be coarse (entire
database) or it can be fine (a tuple or an attribute of a relation). Data item granularity
significantly affects concurrency control performance. Thus, the degree of concurrency is low
for coarse granularity and high for fine granularity. Example of data item granularity:
1. A field of a database record (an attribute of a tuple)
2. A database record (a tuple or a relation)
3. A disk block
4. An entire file
5. The entire database
The size or granularity of the data item that can be locked in a single operation has a significant
effect on the overall performance of the concurrency control algorithm. However, there are
several tradeoffs that must be considered in choosing the data item size. Typically, a data item
is chosen to be somewhere between coarse and fine, where fine granularity refers to small item
sizes and coarse granularity refers to large item sizes. Escalating the granularity from field or
record to file may also increase the likelihood of deadlock occurring. Thus, the coarser the
data item size, the lower the degree of concurrency permitted; on the other hand, the finer the
item size, the more locking information needs to be stored. The best item size depends upon
the nature of the transactions. The following diagram illustrates a hierarchy of granularity from
coarse (database) to fine (record).
(Figure: hierarchy of granularity — the database DB at the root, files f1 and f2 below it, and
records r111 … r11j as leaves.)
We could represent the granularity of locks in a hierarchical structure where each node
represents data items of different sizes. Here, the root node represents the entire database, the
level 1 nodes represent files, the level 2 nodes represent pages, the level 3 nodes represent
records, and the level 4 leaves represent individual fields. Whenever a node is locked, all its
descendants are also locked.
• For example, if a transaction locks a page, Page2, all its records (Record1 and Record2)
as well as all their fields (Field1 and Field2) are also locked. If another transaction
requests an incompatible lock on the same node, the DBMS clearly knows that the lock
cannot be granted.
• If another transaction requests a lock on any of the descendants of the locked node, the
DBMS checks the hierarchical path from the root to the requested node to determine if
any of its ancestors are locked before deciding whether to grant the lock. Thus, if the
request is for an exclusive lock on record Record1, the DBMS checks its parent (Page2),
its grandparent (File2), and the database itself to determine if any of them are locked.
When it finds that Page2 is already locked, it denies the request.
• Additionally, a transaction may request a lock on a node and a descendant of the node
is already locked. For example, if a lock is requested on File2, the DBMS checks every
page in the file, every record in those pages, and every field in those records to
determine if any of them are locked.
Multiple-granularity locking
To reduce the searching involved in locating locks on descendants, the DBMS can use another
specialized locking strategy called multiple-granularity locking. This strategy uses a new
type of lock called an intention lock. When any node is locked, an intention lock is placed on
all the ancestors of the node. Thus, if some descendant of File2 (in our example, Page2) is
locked and a request is made for a lock on File2, the presence of an intention lock on File2
indicates that some descendant of that node is already locked.
Intention locks may be either Shared (read) or eXclusive (write). An intention shared (IS)
lock conflicts only with an exclusive lock; an intention exclusive (IX) lock conflicts with both
a shared and an exclusive lock.
In addition, a transaction can hold a shared and intention exclusive (SIX) lock that is logically
equivalent to holding both a shared and an IX lock. A SIX lock conflicts with any lock that
conflicts with either a shared or IX lock; in other words, a SIX lock is compatible only with
an IS lock. The lock compatibility table for multiple-granularity locking is shown below
(“yes” means the requested lock is compatible with the lock already held):

Requested \ Held     IS      IX      S       SIX     X
IS                   yes     yes     yes     yes     no
IX                   yes     yes     no      no      no
S                    yes     no      yes     no      no
SIX                  yes     no      no      no      no
X                    no      no      no      no      no
To ensure serializability with locking levels, a two-phase locking protocol is used as follows:
• No lock can be granted once any node has been unlocked.
• No node may be locked until its parent is locked by an intention lock.
• No node may be unlocked until all its descendants are unlocked.
In this way, locks are applied from the root down using intention locks until the node requiring
an actual read or exclusive lock is reached, and locks are released from the bottom up.
However, deadlock is still possible and must be handled as discussed previously.
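The compatibility table above can be encoded directly as a lookup matrix; the following sketch (with illustrative names) shows how a DBMS might test whether a requested intention or actual lock can coexist with one already held on a node:

# Multiple-granularity lock compatibility matrix; True means the requested
# lock can be granted alongside the held lock.
COMPAT = {
    "IS":  {"IS": True,  "IX": True,  "S": True,  "SIX": True,  "X": False},
    "IX":  {"IS": True,  "IX": True,  "S": False, "SIX": False, "X": False},
    "S":   {"IS": True,  "IX": False, "S": True,  "SIX": False, "X": False},
    "SIX": {"IS": True,  "IX": False, "S": False, "SIX": False, "X": False},
    "X":   {"IS": False, "IX": False, "S": False, "SIX": False, "X": False},
}

def compatible(requested, held):
    return COMPAT[requested][held]

# Example: an IX request is compatible with an existing IS lock on the same
# node, but a SIX request is not compatible with an existing IX lock.
assert compatible("IX", "IS") is True
assert compatible("SIX", "IX") is False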
Using Locks for Concurrency Control in Indexes
• Two-phase locking can also be applied to indexes, where the nodes of an index correspond
to disk pages.
• However, holding locks on index pages until the shrinking phase of 2PL could cause an
undue amount of transaction blocking because searching an index always starts at the root.
• Therefore, if a transaction wants to insert a record (write operation), the root would be
locked in exclusive mode, so all other conflicting lock requests for the index must wait until
the transaction enters its shrinking phase.
• This blocks all other transactions from accessing the index, so in practice other approaches
to locking an index must be used.
Chapter Five
Database Recovery Techniques
Database recovery is the process of restoring the database to a correct state in the event of a
failure. Its purpose is to preserve the transaction properties (atomicity, consistency, isolation,
and durability).
The Need for Recovery
The storage of data generally includes four different types of media with an increasing degree
of reliability: main memory, magnetic disk, magnetic tape, and optical disk. Main memory is
volatile storage that usually does not survive system crashes. Magnetic disks provide online
non-volatile storage. Compared with main memory, disks are more reliable and much cheaper,
but slower by three to four orders of magnitude.
Magnetic tape is an offline non-volatile storage medium, which is far more reliable than disk
and fairly inexpensive, but slower, providing only sequential access. Optical disk is more
reliable than tape, generally cheaper, faster, and provides random access. Main memory is
also referred to as primary storage and disks and tape as secondary storage. Stable storage
represents information that has been replicated in several non-volatile storage media (usually
disk) with independent failure modes.
There are many different types of failure that can affect database processing, each of which has
to be dealt with in a different manner. Some failures affect main memory only, while others
involve non-volatile (secondary) storage. Among the causes of failure are:
• system crashes due to hardware or software errors, resulting in loss of main memory;
• media failures, such as head crashes or unreadable media, resulting in the loss of parts
of secondary storage;
• application software errors, such as logical errors in the program that is accessing the
database, which cause one or more transactions to fail;
• natural physical disasters, such as fires, floods, earthquakes, or power failures;
• carelessness or unintentional destruction of data or facilities by operators or users;
• sabotage, or intentional corruption or destruction of data, hardware, or software
facilities.
Transactions and Recovery
Transactions represent the basic unit of recovery in a database system. It is the role of the
recovery manager to guarantee two of the four ACID properties of transactions, namely
atomicity and durability, in the presence of failures. The recovery manager has to ensure that,
on recovery from failure, either all the effects of a given transaction are permanently recorded
in the database or none of them are.
Types of Failure
The database may become unavailable for use due to:
• Transaction failure: Transactions may fail because of incorrect input, deadlock,
incorrect synchronization.
• System failure: System may fail because of addressing error, application error,
operating system fault, RAM failure, etc.
• Media failure: Disk head crash, power disruption, etc.
Recovery Facilities
A DBMS should provide the following facilities to assist with recovery:
• a backup mechanism, which makes periodic backup copies of the database;
• logging facilities, which keep track of the current state of transactions and database
changes;
• a checkpoint facility, which enables updates to the database that are in progress to be
made permanent;
• a recovery manager, which allows the system to restore the database to a consistent
state following a failure.
Backup mechanism
The DBMS should provide a mechanism to allow backup copies of the database and the log
file to be made at regular intervals without necessarily having to stop the system first. The
backup copy of the database can be used in the event that the database has been damaged or
destroyed. A backup can be a complete copy of the entire database or an incremental backup,
consisting only of modifications made since the last complete or incremental backup.
Typically, the backup is stored on offline storage, such as magnetic tape.
Log file
To keep track of database transactions, the DBMS maintains a special file called a log (or
journal) that contains information about all updates to the database. The log may contain the
following data:
• Transaction records, containing:
o transaction identifier;
o type of log record (transaction start, insert, update, delete, abort, commit);
o identifier of data item affected by the database action (insert, delete, and
update operations);
o before-image of the data item, that is, its value before change (update and
delete operations only);
o after-image of the data item, that is, its value after change (insert and update
operations only);
o log management information, such as a pointer to previous and next log
records for that transaction (all operations).
• Checkpoint records
Data Update
• Immediate Update: As soon as a data item is modified in cache, the disk copy is
updated.
• Deferred Update: All modified data items in the cache are written either after a
transaction ends its execution or after a fixed number of transactions have completed
their execution.
• Shadow update: The modified version of a data item does not overwrite its disk copy
but is written at a separate disk location.
• In-place update: The disk version of the data item is overwritten by the cache version.
Data Caching
Data items to be modified are first stored in the database cache by the Cache Manager (CM)
and, after modification, they are flushed (written) to the disk. The flushing is controlled by
Modified and Pin-Unpin bits.
• Pin-Unpin: Instructs the operating system not to flush the data item.
• Modified: Indicates that the data item has been modified, i.e., that the cached copy holds
the AFIM (after image) of the data item.
Write-Ahead Logging
When in-place updating (immediate or deferred) is used, a log is necessary for recovery and
it must be available to the recovery manager. This is achieved by the Write-Ahead Logging
(WAL) protocol. WAL states that:
• For Undo: Before a data item’s AFIM is flushed to the database disk (overwriting the
BFIM, the before image), its BFIM must be written to the log and the log must be saved
on a stable store (log disk).
• For Redo: Before a transaction executes its commit operation, all its AFIMs must be
written to the log and the log must be saved on a stable store.
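A minimal sketch of the WAL discipline follows; the in-memory log list and the flush_log placeholder are illustrative stand-ins for the stable log disk:

log = []                                   # stands in for the stable log disk

def flush_log():
    pass                                   # placeholder: forces the log buffer to stable storage

def update_in_place(page, item, new_value, tid):
    bfim = page[item]                      # before-image
    log.append(("update", tid, item, bfim, new_value))   # BFIM logged first (undo rule)
    flush_log()                            # force the log record to stable storage
    page[item] = new_value                 # only now may the disk copy be overwritten

def commit(tid):
    flush_log()                            # all AFIMs of tid are on stable storage (redo rule)
    log.append(("commit", tid))
    flush_log()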
Checkpointing
A checkpoint is the point of synchronization between the database and the transaction log file.
All buffers are force-written to secondary storage. The information in the log file is used to
recover from a database failure. One difficulty with this scheme is that when a failure occurs,
we may not know how far back in the log to search and we may end up redoing transactions
that have been safely written to the database. To limit the amount of searching and subsequent
processing that we need to carry out on the log file, we can use a technique called
checkpointing. Checkpoints are scheduled at predetermined intervals and involve the
following operations:
• writing all log records in main memory to secondary storage;
• writing the modified blocks in the database buffers to secondary storage;
• writing a checkpoint record to the log file. This record contains the identifiers of all
transactions that are active at the time of the checkpoint.
If transactions are performed serially, then, when a failure occurs, we check the log file to find
the last transaction that started before the last checkpoint. Any earlier transactions would have
committed previously and would have been written to the database at the checkpoint.
Therefore, we need only redo the one that was active at the checkpoint and any subsequent
transactions for which both start and commit records appear in the log. If a transaction is active
at the time of failure, the transaction must be undone. If transactions are performed
concurrently, we redo all transactions that have committed since the checkpoint and undo all
transactions that were active at the time of the crash.
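The following sketch illustrates how the redo and undo sets might be derived from the log and the most recent checkpoint record; the log record layout used here is illustrative:

# Each log record is a tuple such as ("start", T), ("commit", T), or
# ("checkpoint", active_set). Assumes at least one checkpoint exists.
def classify(log_records):
    redo, undo = set(), set()
    i = max(k for k, r in enumerate(log_records) if r[0] == "checkpoint")
    undo.update(log_records[i][1])                 # transactions active at the checkpoint
    for rec in log_records[i + 1:]:                # scan forward from the checkpoint
        if rec[0] == "start":
            undo.add(rec[1])
        elif rec[0] == "commit":
            undo.discard(rec[1])
            redo.add(rec[1])
    return redo, undo                              # redo committed, undo still-active transactions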
Generally, checkpointing is a relatively inexpensive operation, and it is often possible to take
three or four checkpoints an hour. In this way, no more than 15–20 minutes of work will need
to be recovered.
Recovery Techniques
The particular recovery procedure to be used is dependent on the extent of the damage that has
occurred to the database. We consider two cases:
1. If the database has been extensively damaged, for example a disk head crash has
occurred and destroyed the database, then it is necessary to restore the last backup copy
of the database and reapply the update operations of committed transactions using the
log file. This assumes, of course, that the log file has not been damaged as well.
2. If the database has not been physically damaged but has become inconsistent, for
example the system crashed while transactions were executing, then it is necessary to
undo the changes that caused the inconsistency. It may also be necessary to redo some
transactions to ensure that the updates they performed have reached secondary storage.
Here, we do not need to use the backup copy of the database but can restore the database
to a consistent state using the before- and after-images held in the log file.
There are two techniques for recovery from the second situation, that is, the case where the
database has not been destroyed but is in an inconsistent state. The techniques, known as
deferred update and immediate update, differ in the way that updates are written to
secondary storage. There is also an alternative technique called shadow paging.
Recovery Schemes
Recovery techniques using deferred update
Using the deferred update recovery protocol, updates are not written to the database until after
a transaction has reached its commit point. If a transaction fails before it reaches this point, it
will not have modified the database and so no undoing of changes will be necessary. However,
it may be necessary to redo the updates of committed transactions as their effect may not have
reached the database. In this case, we use the log file to protect against system failures in the
following way:
• When a transaction starts, write a transaction start record to the log.
• When any write operation is performed, write a log record containing all the log data
specified previously (excluding the before-image of the update). Do not actually write
the update to the database buffers or the database itself.
• When a transaction is about to commit, write a transaction commit log record, write all
the log records for the transaction to disk, and then commit the transaction. Use the log
records to perform the actual updates to the database.
• If a transaction aborts, ignore the log records for the transaction and do not perform the
writes.
Note that we write the log records to disk before the transaction is actually committed, so that
if a system failure occurs while the actual database updates are in progress, the log records will
survive and the updates can be applied later. In the event of a failure, we examine the log to
identify the transactions that were in progress at the time of failure. Starting at the last entry in
the log file, we go back to the most recent checkpoint record:
• Any transaction with transaction start and transaction commit log records should be
redone. The redo procedure performs all the writes to the database using the afterimage
log records for the transactions, in the order in which they were written to the log. If
this writing has been performed already, before the failure, the write has no effect on
the data item, so there is no damage done if we write the data again (that is, the operation
is idempotent). However, this method guarantees that we will update any data item
that was not properly updated prior to the failure.
• For any transactions with transaction start and transaction abort log records, we do
nothing since no actual writing was done to the database, so these transactions do not
have to be undone.
If a second system crash occurs during recovery, the log records are used again to restore the
database. With the form of the write log records, it does not matter how many times we redo
the writes.
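A sketch of deferred-update recovery under these rules, using an illustrative log record layout of the form ("write", tid, item, after_image):

# Only committed transactions are redone (using after-images); nothing is
# undone, because uncommitted transactions never touched the database.
def recover_deferred(log_records, database):
    committed = {r[1] for r in log_records if r[0] == "commit"}
    for rec in log_records:                       # replay writes in log order
        if rec[0] == "write" and rec[1] in committed:
            _, tid, item, after_image = rec
            database[item] = after_image          # idempotent: repeating it is harmless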
Recovery techniques using immediate update
Using the immediate update recovery protocol, updates are applied to the database as they
occur, without waiting to reach the commit point. As well as having to redo the updates of
committed transactions following a failure, it may now be necessary to undo the effects of
transactions that had not committed at the time of failure. In this case, we use the log file in
the following way:
• When a transaction starts, write a transaction start record to the log.
• When a write operation is performed, write a log record containing the necessary data
(including the before- and after-images of the updated item).
• Once the log record is written, write the update to the database buffers.
• The updates to the database itself are written when the buffers are next flushed to
secondary storage.
• When the transaction commits, write a transaction commit record to the log.
It is essential that log records (or at least certain parts of them) are written before the
corresponding write to the database. This is known as the write-ahead log protocol. If updates
were made to the database first, and failure occurred before the log record was written, then
the recovery manager would have no way of undoing (or redoing) the operation. Under the
write-ahead log protocol, the recovery manager can safely assume that, if there is no transaction
commit record in the log file for a particular transaction then that transaction was still active at
the time of failure and must therefore be undone.
If a transaction aborts, the log can be used to undo it since it contains all the old values for the
updated fields. As a transaction may have performed several changes to an item, the writes are
undone in reverse order. Regardless of whether the transaction’s writes have been applied to
the database itself, writing the before-images guarantees that the database is restored to its state
prior to the start of the transaction. If the system fails, recovery involves using the log to undo
or redo transactions:
• For any transaction for which both a transaction start and transaction commit record
appear in the log, we redo using the log records to write the after-image of updated
fields, as described above. Note that if the new values have already been written to the
database, these writes, although unnecessary, will have no effect. However, any write
that did not actually reach the database will now be performed.
• For any transaction for which the log contains a transaction start record but not a
transaction commit record, we need to undo that transaction. This time the log records
are used to write the before-image of the affected fields, and thus restore the database
to its state prior to the transaction’s start. The undo operations are performed in the
reverse order to which they were written to the log.
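A sketch of immediate-update recovery, assuming each write record carries both the before- and after-image, of the form ("write", tid, item, before, after):

# Redo committed transactions with after-images (forward order) and undo
# uncommitted ones with before-images (reverse order).
def recover_immediate(log_records, database):
    committed = {r[1] for r in log_records if r[0] == "commit"}
    for rec in log_records:                          # redo phase
        if rec[0] == "write" and rec[1] in committed:
            database[rec[2]] = rec[4]                # apply after-image
    for rec in reversed(log_records):                # undo phase, reverse order
        if rec[0] == "write" and rec[1] not in committed:
            database[rec[2]] = rec[3]                # restore before-image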
Undo/Redo Algorithm (Single-user environment)
• Recovery schemes of this category apply undo and also redo for recovery.
• In a single-user environment no concurrency control is required but a log is maintained
under WAL. Note that at any time there will be one transaction in the system and it will
be either in the commit table or in the active table. The recovery manager performs:
– Undo of a transaction if it is in the active table.
– Redo of a transaction if it is in the commit table.
The ARIES Recovery Algorithm
ARIES is a recovery algorithm, based on write-ahead logging, that is used in many relational
DBMSs. After a crash, recovery proceeds in three steps:
1. Analysis: identifies the dirty (updated) pages in the buffer and the set of
transactions active at the time of the crash. The appropriate point in the log where
redo is to start is also determined.
2. Redo: the necessary redo operations are applied.
3. Undo: the log is scanned backwards and the operations of transactions that were
active at the time of the crash are undone in reverse order.
The Log and Log Sequence Number (LSN)
• A log record is written for:
(a) data update
(b) transaction commit
(c) transaction abort
(d) undo
(e) transaction end
• In the case of undo a compensating log record is written.
• A unique LSN is associated with every log record.
o LSN increases monotonically and indicates the disk address of the log record it
is associated with.
o In addition, each data page stores the LSN of the latest log record corresponding
to a change for that page.
• A log record stores
(a) the previous LSN of that transaction
(b) the transaction ID
(c) the type of log record.
• For a write operation the following additional information is logged:
1. Page ID for the page that includes the item
2. Length of the updated item
3. Its offset from the beginning of the page
4. BFIM of the item
5. AFIM of the item
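The record layout described above might be modelled as follows; the field names are illustrative, and a real DBMS packs these fields into a binary log record:

from dataclasses import dataclass
from typing import Optional

@dataclass
class LogRecord:
    lsn: int                         # monotonically increasing log sequence number
    prev_lsn: Optional[int]          # previous LSN of the same transaction
    tid: str                         # transaction ID
    rec_type: str                    # "update", "commit", "abort", "undo" (CLR), "end"
    page_id: Optional[int] = None    # for write operations only
    length: Optional[int] = None     # length of the updated item
    offset: Optional[int] = None     # offset of the item within the page
    bfim: Optional[bytes] = None     # before-image of the item
    afim: Optional[bytes] = None     # after-image of the item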
The Transaction Table and the Dirty Page Table
• For efficient recovery, the following tables are also stored in the log during checkpointing:
– Transaction Table: contains an entry for each active transaction, with information such
as the transaction ID, its status, and the LSN of the most recent log record written by the
transaction.
– Dirty Page Table: contains an entry for each dirty page in the buffer, including the page
ID and the LSN corresponding to the earliest update to that page.
Chapter Six
Database Security and Authorization
Introduction to DB Security Issues
Database security involves the mechanisms that protect the database against intentional or
accidental threats. Security considerations apply not only to the data held in a database:
breaches of security may affect other parts of the system, which may in turn affect the database.
Consequently, database security encompasses hardware, software, people, and data. To
implement security effectively requires appropriate controls, which are defined in specific
mission objectives for the system.
A database represents an essential corporate resource that should be properly secured using
appropriate controls. We consider database security in relation to the following situations:
• theft and fraud, which affect not only the database environment but also the entire
organization;
• loss of confidentiality (secrecy), which refers to the need to maintain secrecy over data,
usually only the data that is critical to the organization;
• loss of privacy, which refers to the need to protect data about individuals;
• loss of integrity, which results in invalid or corrupted data and may seriously affect the
operation of an organization;
• loss of availability, which means that the data, or the system, or both cannot be accessed,
and which can seriously affect an organization’s financial performance. In some cases,
events that cause a system to be unavailable may also cause data corruption.
Threats
A threat is any situation or event, whether intentional or accidental, that may adversely affect
a system and consequently the organization. A threat may be caused by a situation or event
involving a person, action, or circumstance that is likely to bring harm to an organization. The
harm may be tangible, such as loss of:
• hardware,
• software, or
• data;
or intangible, such as loss of credibility or client confidence.
The extent to which an organization suffers as a result of a threat’s succeeding depends upon a
number of factors, such as the existence of countermeasures and contingency plans. For
example, if a hardware failure occurs corrupting secondary storage, all processing activity must
cease until the problem is resolved. The recovery will depend upon a number of factors, which
include when the last backups were taken and the time needed to restore the system. An
organization needs to identify the types of threat it may be subjected to and initiate appropriate
plans and countermeasures, bearing in mind the costs of implementing them.
Countermeasures
The types of countermeasure to threats on computer systems range from physical controls to
administrative procedures. To protect databases against these types of threats four kinds of
countermeasures can be implemented:
• Access control
• Inference control
• Flow control
• Encryption
A DBMS typically includes a database security and authorization subsystem that is responsible
for ensuring the security of portions of a database against unauthorized access. There are two
types of database security mechanisms:
• Discretionary security mechanisms
• Mandatory security mechanisms
Access Controls
The typical way to provide access controls for a database system is based on the granting and
revoking of privileges.
• A privilege allows a user to create or access (that is read, write, or modify) some
database object (such as a relation, view, or index) or to run certain DBMS utilities.
• Since excessive granting of unnecessary privileges can compromise security, a privilege
should be granted to a user only if that user cannot accomplish his or her work without
that privilege.
• A user who creates a database object such as a relation or a view automatically gets all
privileges on that object.
• The DBMS subsequently keeps track of how these privileges are granted to other users,
and possibly revoked, and ensures that at all times only users with necessary privileges
can access an object.
Inference control
One security problem associated with databases is that of controlling access to a statistical
database, which is used to provide statistical information or summaries of values based on
various criteria. A statistical database is a database used for statistical analysis purposes; it is
an OLAP (online analytical processing) system. The countermeasures to the statistical
database security problem are called inference control measures.
Flow control
Another security issue is that of flow control, which prevents information from flowing in such
a way that it reaches unauthorized users.
• Channels that are pathways for information to flow implicitly, in ways that violate the
security policy of an organization, are called covert channels.
Data encryption
A final security issue is data encryption, which is used to protect sensitive data (such as credit
card numbers) that is being transmitted via some type of communication network. Encryption
involves the encoding of the data by a special algorithm that renders the data unreadable by
any program without the decryption key.
• The data is encoded using some encoding algorithm.
• An unauthorized user who accesses encoded data will have difficulty deciphering it, but
authorized users are given decoding or decrypting algorithms (or keys) to decipher the data.
Encryption also protects data transmitted over communication lines. There are a number of
techniques for encoding data to conceal the information; some are termed ‘irreversible’ and
others ‘reversible’. Irreversible techniques, as the name implies, do not permit the original
data to be known. However, the data can be used to obtain valid statistical information.
Reversible techniques are more commonly used. To transmit data securely over insecure
networks requires the use of a cryptosystem, which includes:
• an encryption key to encrypt the data (plaintext);
• an encryption algorithm that, with the encryption key, transforms the plaintext into
ciphertext;
• a decryption key to decrypt the ciphertext;
• a decryption algorithm that, with the decryption key, transforms the ciphertext back
into plaintext.
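As an illustration of a reversible (symmetric) cryptosystem, the sketch below assumes the third-party Python cryptography package is installed (any symmetric cipher would serve equally well); a single shared key plays the role of both the encryption and the decryption key:

from cryptography.fernet import Fernet

key = Fernet.generate_key()                  # shared secret key
cipher = Fernet(key)

plaintext = b"4929 1234 5678 9012"           # sensitive data, e.g. a credit card number
ciphertext = cipher.encrypt(plaintext)       # plaintext -> ciphertext
recovered = cipher.decrypt(ciphertext)       # ciphertext -> plaintext
assert recovered == plaintext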
Authorization
Authorization: The granting of a right or privilege that enables a subject to have legitimate
access to a system or a system’s object. Authorization controls can be built into the software,
and govern not only what system or object a specified user can access, but also what the user
may do with it. The process of authorization involves authentication of subjects requesting
access to objects.
Authentication: A mechanism that determines whether a user is who he or she claims
to be.
Database Security and the DBA
A system administrator is usually responsible for allowing users to have access to a computer
system by creating individual user accounts.
• Each user is given a unique identifier, which is used by the operating system to
determine who they are.
• Associated with each identifier is a password, chosen by the user and known to the
operating system.
The responsibility to authorize use of the DBMS usually rests with the Database Administrator
(DBA), who must also set up individual user accounts and passwords using the DBMS itself.
The DBA’s responsibilities include:
• granting privileges to users who need to use the system
• classifying users and data in accordance with the policy of the organization
The DBA has a DBA account in the DBMS, sometimes called a system or superuser account.
This account provides powerful capabilities such as:
1. Account creation
2. Privilege granting
3. Privilege revocation
4. Security level assignment
• To keep a record of all updates applied to the database and of the particular user who
applied each update, we can use the system log, which includes an entry for each
operation applied to the database and which may also be required for recovery from a
transaction failure or system crash.
• If any tampering with the database is suspected, a database audit is performed
o A database audit consists of reviewing the log to examine all accesses and
operations applied to the database during a certain time period.
• A database log that is used mainly for security purposes is sometimes called an audit
trail.
Discretionary privileges are typically assigned at two levels. The first is the account level, at
which the DBA specifies the particular privileges that each account holds independently of the
relations in the database (for example, the privilege to create schemas, tables, or views). The
second level of privileges applies to the relation level; this includes base relations and
virtual (view) relations. The granting and revoking of privileges generally follow an
authorization model for discretionary privileges known as the access matrix model where:
– The rows of a matrix M represent subjects (users, accounts, programs).
– The columns represent objects (relations, records, columns, views, operations).
– Each position M(i,j) in the matrix represents the types of privileges (read, write, update)
that subject i holds on object j.
To control the granting and revoking of relation privileges, each relation R in a database is
assigned an owner account, which is typically the account that was used when the relation was
created in the first place.
– The owner of a relation is given all privileges on that relation.
– In SQL2, the DBA can assign an owner to a whole schema by creating the schema and
associating the appropriate authorization identifier with that schema, using the
CREATE SCHEMA command.
– The owner account holder can pass privileges on any of the owned relations to other
users by granting privileges to their accounts.
In SQL the following types of privileges can be granted on each individual relation R:
– SELECT (retrieval or read) privilege on R:
– Gives the account retrieval privilege.
– In SQL this gives the account the privilege to use the SELECT statement to
retrieve tuples from R.
– MODIFY privileges on R:
– This gives the account the capability to modify tuples of R.
– In SQL this privilege is further divided into UPDATE, DELETE, and INSERT
privileges to apply the corresponding SQL command to R.
– In addition, both the INSERT and UPDATE privileges can specify that only
certain attributes can be updated by the account.
In SQL the following types of privileges can be granted on each individual relation R (contd.):
– REFERENCES privilege on R:
– This gives the account the capability to reference relation R when specifying
integrity constraints.
– The privilege can also be restricted to specific attributes of R.
Notice that to create a view, the account must have SELECT privilege on all relations involved
in the view definition.
Revoking Privileges
In some cases it is desirable to grant a privilege to a user temporarily. For example,
• The owner of a relation may want to grant the SELECT privilege to a user for a specific
task and then revoke that privilege once the task is completed.
• Hence, a mechanism for revoking privileges is needed. In SQL, a REVOKE command
is included for the purpose of canceling privileges.
Propagation of Privileges using the GRANT OPTION
Whenever the owner A of a relation R grants a privilege on R to another account B, the
privilege can be given to B with or without the GRANT OPTION.
If the GRANT OPTION is given, this means that B can also grant that privilege on R to other
accounts.
• Suppose that B is given the GRANT OPTION by A and that B then grants the privilege
on R to a third account C, also with GRANT OPTION. In this way, privileges on R
can propagate to other accounts without the knowledge of the owner of R.
• If the owner account A now revokes the privilege granted to B, all the privileges that B
propagated based on that privilege should automatically be revoked by the system.
Example:
• Suppose that the DBA creates four accounts
– A1, A2, A3, A4
• and wants only A1 to be able to create base relations. Then the DBA must issue the
following GRANT command in SQL
GRANT CREATETAB TO A1;
Example:
• Suppose that A1 creates the two base relations EMPLOYEE and DEPARTMENT; A1 is
then the owner of these two relations and hence has all the relation privileges on each of
them.
• Suppose that A1 wants to allow A3 to retrieve information from either of the two tables
and also to be able to propagate the SELECT privilege to other accounts.
• A1 can issue the command:
GRANT SELECT ON EMPLOYEE, DEPARTMENT
TO A3 WITH GRANT OPTION;
• A3 can grant the SELECT privilege on the EMPLOYEE relation to A4 by issuing:
GRANT SELECT ON EMPLOYEE TO A4;
– Notice that A4 can’t propagate the SELECT privilege because GRANT
OPTION was not given to A4
Example:
• Suppose that A1 decides to revoke the SELECT privilege on the EMPLOYEE relation
from A3; A1 can issue:
REVOKE SELECT ON EMPLOYEE FROM A3;
• The DBMS must now automatically revoke the SELECT privilege on EMPLOYEE
from A4, too, because A3 granted that privilege to A4 and A3 does not have the
privilege any more.
Example:
• Suppose that A1 wants to give back to A3 a limited capability to SELECT from the
EMPLOYEE relation and wants to allow A3 to be able to propagate the privilege.
• The limitation is to retrieve only the NAME, BDATE, and ADDRESS
attributes and only for the tuples with DNO=5.
• A1 then creates the view:
CREATE VIEW A3EMPLOYEE AS
SELECT NAME, BDATE, ADDRESS
FROM EMPLOYEE
WHERE DNO = 5;
After the view is created, A1 can grant SELECT on the view A3EMPLOYEE to A3
as follows:
GRANT SELECT ON A3EMPLOYEE TO A3
WITH GRANT OPTION;
Example:
• Finally, suppose that A1 wants to allow A4 to update only the SALARY attribute of
EMPLOYEE;
• A1 can issue:
GRANT UPDATE ON EMPLOYEE (SALARY) TO A4;
– The UPDATE or INSERT privilege can specify particular attributes that may
be updated or inserted in a relation.
– Other privileges (SELECT, DELETE) are not attribute specific.
Mandatory Access Control and Role-Based Access Control for Multilevel Security
The discretionary access control techniques of granting and revoking privileges on relations
have traditionally been the main security mechanism for relational database systems. This is an
all-or-nothing method:
• A user either has or does not have a certain privilege.
In many applications, an additional security policy is needed that classifies data and users based
on security classes. This approach, known as mandatory access control, would typically be
combined with the discretionary access control mechanisms.
Mandatory Access Control (MAC) for Multilevel Security
Mandatory Access Control (MAC) is based on system-wide policies that cannot be changed by
individual users. In this approach each database object is assigned a security class and each
user is assigned a clearance for a security class, and rules are imposed on reading and writing
of database objects by users.
The DBMS determines whether a given user can read or write a given object based on certain
rules that involve the security level of the object and the clearance of the user. These rules seek
to ensure that sensitive data can never be passed on to another user without the necessary
clearance. The SQL standard does not include support for MAC.
A popular model for MAC is called Bell–LaPadula model (Bell and LaPadula, 1974), which
is described in terms of objects (such as relations, views, tuples, and attributes), subjects (such
as users and programs), security classes, and clearances.
• Each database object is assigned a security class, and each subject is assigned a
clearance for a security class.
• The security classes in a system are ordered, with a most secure class and a least
secure class.
• Examples of security classes are: top secret (TS), secret (S), confidential (C), and
unclassified (U), where TS is the most secure and U the least secure (TS ≥ S ≥ C ≥ U).
Two restrictions are enforced on data access based on the subject/object classifications:
• Simple security property: A subject S is not allowed read access to an object O unless
class(S) ≥ class(O).
• Star property (or * property): A subject S is not allowed to write an object O unless
class(S) ≤ class(O).
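These two restrictions can be expressed as simple checks over an ordering of the security classes; the sketch below uses an illustrative numeric encoding of the classes listed earlier:

# Bell-LaPadula read/write checks (higher number = more secure class).
LEVEL = {"U": 0, "C": 1, "S": 2, "TS": 3}

def can_read(subject_class, object_class):
    # simple security property: read down only
    return LEVEL[subject_class] >= LEVEL[object_class]

def can_write(subject_class, object_class):
    # star property: write up only (no write-down of sensitive data)
    return LEVEL[subject_class] <= LEVEL[object_class]

# Example: a Secret (S) subject may read Confidential (C) data but may not
# write to it, since that could leak Secret information downward.
assert can_read("S", "C") and not can_write("S", "C")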
To incorporate multilevel security notions into the relational database model, it is common to
consider attribute values and tuples as data objects. Hence, each attribute A is associated with
a classification attribute C in the schema, and each attribute value in a tuple is associated with
a corresponding security classification. In addition, in some models, a tuple classification
attribute TC is added to the relation attributes to provide a classification for each tuple as a
whole. Hence, a multilevel relation schema R with n attributes would be represented as
• R(A1,C1,A2,C2, …, An,Cn,TC)
o where each Ci represents the classification attribute associated with attribute Ai.
The value of the TC attribute in each tuple t – which is the highest of all attribute classification
values within t – provides a general classification for the tuple itself, whereas each Ci provides
a finer security classification for each attribute value within the tuple.
The apparent key of a multilevel relation is the set of attributes that would have formed
the primary key in a regular (single-level) relation.
A multilevel relation will appear to contain different data to subjects (users) with different
clearance levels. In some cases, it is possible to store a single tuple in the relation at a higher
classification level and produce the corresponding tuples at a lower-level classification through
a process known as filtering. In other cases, it is necessary to store two or more tuples at
different classification levels with the same value for the apparent key. This leads to the
concept of polyinstantiation where several tuples can have the same apparent key value but
have different attribute values for users at different classification levels.
In general, the entity integrity rule for multilevel relations states that all attributes that are
members of the apparent key must not be null and must have the same security classification
within each individual tuple. In addition, all other attribute values in the tuple must have a
security classification greater than or equal to that of the apparent key. This constraint ensures
that a user can see the key if the user is permitted to see any part of the tuple at all.
Other integrity rules, called null integrity and interinstance integrity, informally ensure that
if a tuple value at some security level can be filtered (derived) from a higher-classified tuple,
then it is sufficient to store the higher-classified tuple in the multilevel relation.
Introduction to Statistical Database Security
Statistical databases are used mainly to produce statistics about various populations. The
database may contain confidential data about individuals, which should be protected from
user access. A population is a set of tuples of a relation (table) that satisfy some selection
condition. Statistical queries involve applying statistical functions to a population of tuples.
For example, we may want to retrieve the number of individuals in a population or the average
income in the population. However, statistical users are not allowed to retrieve individual data,
such as the income of a specific person. Statistical database security techniques must prohibit
the retrieval of individual data. This can be achieved by prohibiting queries that retrieve
attribute values and by allowing only queries that involve statistical aggregate functions such
as:
• COUNT, SUM, MIN, MAX, AVERAGE, and STANDARD DEVIATION.
Such queries are sometimes called statistical queries.
It is DBMS’s responsibility to ensure confidentiality of information about individuals, while
still providing useful statistical summaries of data about those individuals to users. Provision
of privacy protection of users in a statistical database is paramount.
In some cases it is possible to infer the values of individual tuples from a sequence of
statistical queries.
• This is particularly true when the conditions result in a population consisting of a small
number of tuples.
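A very simple inference control measure is to refuse statistical queries whose population falls below a minimum size; the following sketch illustrates the idea (the threshold and the function names are illustrative only):

MIN_POPULATION = 5        # illustrative threshold

def statistical_query(tuples, condition, aggregate):
    # Refuse statistical queries over populations that are too small, since
    # small populations make it easy to infer individual values.
    population = [t for t in tuples if condition(t)]
    if len(population) < MIN_POPULATION:
        raise PermissionError("query refused: population too small")
    return aggregate(population)

# Example: an average income over a sufficiently large population is allowed,
# but the same query restricted to a single named individual is refused.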