Outline
Introduction
Background
Distributed DBMS Architecture
Distributed Database Design
Distributed Query Processing
Distributed Transaction Management
Concurrency Control Ideas
Building Distributed Database Systems (RAID)
Mobile Database Systems
Privacy, Trust, and Authentication
Peer to Peer Systems
Distributed DBMS © 1998 M. Tamer Özsu & Patrick Valduriez Page 10-12. 1
Useful References
J. D. Ullman. Principles of Database Systems. Computer Science Press, Rockville, 1982.
J. Gray and A. Reuter. Transaction Processing: Concepts and Techniques. Morgan Kaufmann, 1993.
B. Bhargava. Concurrency Control in Database Systems. IEEE Transactions on Knowledge and Data Engineering, 11(1), Jan.-Feb. 1999.
Textbook: M. T. Özsu and P. Valduriez. Principles of Distributed Database Systems.
Concurrency Control
Some Examples:
Centralized locking
Distributed locking
Majority voting
Local and centralized validation
Basic Terms for Concurrency Control
Database
Concurrent processing
Database entity (item, object)
Conflict
Distributed database
Consistency
Program
Mutual consistency
Transaction, read set, write set
History
Actions
Serializability
Atomic
Serial history
Concurrency Control once again
The problem of synchronizing concurrent transactions
such that the consistency of the database is maintained
while, at the same time, the maximum degree of
concurrency is achieved.
Anomalies:
Lost updates
The effects of some transactions are not reflected in the
database.
Inconsistent retrievals
A transaction, if it reads the same data item more than once,
should always read the same value.
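The lost-update anomaly can be made concrete with a minimal sketch. The item x, its starting value, and the two increments are hypothetical values chosen only for illustration; they do not come from the slides.

```python
# Hypothetical setup: two transactions each add an amount to the same item x.
db = {"x": 100}

# Interleaved execution: both transactions read x before either one writes.
t1_local = db["x"]           # R1(x) reads 100
t2_local = db["x"]           # R2(x) reads 100
db["x"] = t1_local + 10      # W1(x) writes 110
db["x"] = t2_local + 20      # W2(x) writes 120 and overwrites T1's update
interleaved_result = db["x"]

# Serial execution for comparison: T1 runs to completion, then T2.
db = {"x": 100}
db["x"] = db["x"] + 10       # T1
db["x"] = db["x"] + 20       # T2
serial_result = db["x"]

print(interleaved_result, serial_result)
```

In the interleaved run the effect of T1 is missing from the final value, which is exactly the situation a concurrency control mechanism is meant to prevent.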
Execution Schedule (or History)
An order in which the operations of a set of transactions
are executed.
A schedule (history) can be defined as a partial order
over the operations of a set of transactions.
H1={W2(x),R1(x),R3(x),W1(x),C1,W2(y),R3(y),R2(z),C2,R3(z),C3}
Formalization of Schedule
Complete Schedule – Example
Given three transactions
T1: Read(x)      T2: Write(x)     T3: Read(x)
    Write(x)         Write(y)         Read(y)
    Commit           Read(z)          Read(z)
                     Commit           Commit
A possible complete schedule is given as the DAG
[Figure: DAG of a complete schedule over T1, T2, T3; recoverable nodes: C1, R2(z), R3(z), C2, C3]
Schedule Definition
A schedule is a prefix of a complete schedule such that
only some of the operations and only some of the
ordering relationships are included.
T1: Read(x)      T2: Write(x)     T3: Read(x)
    Write(x)         Write(y)         Read(y)
    Commit           Read(z)          Read(z)
                     Commit           Commit
[Figure: DAGs of the complete schedule and of a prefix of it; recoverable nodes: R1(x), W2(x), R3(x), C2, C3]
Serial History
All the actions of a transaction occur consecutively.
No interleaving of transaction operations.
If each transaction is consistent (obeys integrity rules),
then the database is guaranteed to be consistent at the
end of executing a serial history.
Hs={W2(x),W2(y),R2(z),C2,R1(x),W1(x),C1,R3(x),R3(y),R3(z),C3}
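The "no interleaving" condition can be checked mechanically. The following is a minimal sketch: the helper names `parse_history` and `is_serial` and the tuple encoding of operations are assumptions made for this example, not notation from the slides.

```python
import re

def parse_history(text):
    """Turn 'W2(x),C2' into [('W', 2, 'x'), ('C', 2, '')] (hypothetical encoding)."""
    return [(kind, int(txn), item)
            for kind, txn, item in re.findall(r"([RWC])(\d+)(?:\((\w+)\))?", text)]

def is_serial(history):
    """True if every transaction's operations form one contiguous block."""
    finished = set()   # transactions whose block has already ended
    current = None     # transaction whose block is in progress
    for _, txn, _ in history:
        if txn != current:
            if txn in finished:
                return False          # txn resumes after another one ran
            if current is not None:
                finished.add(current)
            current = txn
    return True

Hs = parse_history("W2(x),W2(y),R2(z),C2,R1(x),W1(x),C1,R3(x),R3(y),R3(z),C3")
H1 = parse_history("W2(x),R1(x),R3(x),W1(x),C1,W2(y),R3(y),R2(z),C2,R3(z),C3")

print(is_serial(Hs), is_serial(H1))
```

Hs passes because T2, T1, and T3 each run as one uninterrupted block, while H1 fails as soon as T1 resumes after T3 has started.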
Serializable History
Transactions execute concurrently, but the net effect of
the resulting history upon the database is equivalent to
some serial history.
Equivalent with respect to what?
Conflict equivalence: the relative order of execution of the
conflicting operations belonging to unaborted transactions is
the same in the two histories.
Conflicting operations: two incompatible operations (e.g.,
Read and Write) conflict if they both access the same data
item.
Incompatible operations of each transaction are assumed to
conflict; their execution order must not be changed.
If two operations from two different transactions conflict, the
corresponding transactions are also said to conflict.
Serializable History
T1: Read(x)      T2: Write(x)     T3: Read(x)
    Write(x)         Write(y)         Read(y)
    Commit           Read(z)          Read(z)
                     Commit           Commit
H1={W2(x),R1(x),R3(x),W1(x),C1,W2(y),R3(y),R2(z),C2,R3(z),C3}
H2={W2(x),R1(x),W1(x),C1,R3(x),W2(y),R3(y),R2(z),C2,R3(z),C3}
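One way to compare H1 and H2 against the serial history Hs = {W2(x),W2(y),R2(z),C2,R1(x),W1(x),C1,R3(x),R3(y),R3(z),C3} is to collect the ordered pairs of conflicting operations in each history and compare the sets. This is a minimal sketch: the function names and the tuple encoding are assumptions for this example, and since every transaction here commits, the restriction to unaborted transactions is not modeled.

```python
import re
from collections import Counter

def parse_history(text):
    """Turn 'W2(x),C2' into [('W', 2, 'x'), ('C', 2, '')] (hypothetical encoding)."""
    return [(kind, int(txn), item)
            for kind, txn, item in re.findall(r"([RWC])(\d+)(?:\((\w+)\))?", text)]

def conflict_pairs(history):
    """Ordered pairs (earlier, later) of conflicting operations.

    Two operations conflict here if they come from different transactions,
    access the same item, and at least one of them is a write.
    """
    pairs = set()
    for i, (k1, t1, x1) in enumerate(history):
        for k2, t2, x2 in history[i + 1:]:
            if x1 and x1 == x2 and t1 != t2 and "W" in (k1, k2):
                pairs.add(((k1, t1, x1), (k2, t2, x2)))
    return pairs

def conflict_equivalent(h1, h2):
    """Same operations and the same relative order of conflicting operations."""
    return Counter(h1) == Counter(h2) and conflict_pairs(h1) == conflict_pairs(h2)

Hs = parse_history("W2(x),W2(y),R2(z),C2,R1(x),W1(x),C1,R3(x),R3(y),R3(z),C3")
H1 = parse_history("W2(x),R1(x),R3(x),W1(x),C1,W2(y),R3(y),R2(z),C2,R3(z),C3")
H2 = parse_history("W2(x),R1(x),W1(x),C1,R3(x),W2(y),R3(y),R2(z),C2,R3(z),C3")

print(conflict_equivalent(H1, Hs), conflict_equivalent(H2, Hs))
```

Under this check H2 is conflict equivalent to Hs, while H1 is not: in H1, R3(x) precedes W1(x), whereas Hs orders W1(x) before R3(x).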
Serializability in Distributed DBMS
Global Non-serializability
T1: Read(x)        T2: Read(x)
    x ← x + 5          x ← x * 15
    Write(x)           Write(x)
    Commit             Commit
The following two local histories are individually
serializable (in fact serial), but the two transactions are not
globally serializable.
LH1={R1(x),W1(x),C1,R2(x),W2(x),C2}
LH2={R2(x),W2(x),C2,R1(x),W1(x),C1}
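A rough way to see why these two local histories clash is to derive a precedence graph from each one and look for a cycle in their union. This sketch assumes that LH1 and LH2 are executed at two different sites holding copies of x, and that global serializability requires the same serialization order at every site; the function names are hypothetical.

```python
import re

def parse_history(text):
    """Turn 'W2(x),C2' into [('W', 2, 'x'), ('C', 2, '')] (hypothetical encoding)."""
    return [(kind, int(txn), item)
            for kind, txn, item in re.findall(r"([RWC])(\d+)(?:\((\w+)\))?", text)]

def precedence_edges(history):
    """Edge (Ti, Tj) whenever an operation of Ti conflicts with a later one of Tj."""
    edges = set()
    for i, (k1, t1, x1) in enumerate(history):
        for k2, t2, x2 in history[i + 1:]:
            if x1 and x1 == x2 and t1 != t2 and "W" in (k1, k2):
                edges.add((t1, t2))
    return edges

def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph given by edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        if state.get(node) == "visiting":
            return True               # back edge: cycle found
        if state.get(node) == "done":
            return False
        state[node] = "visiting"
        if any(visit(nxt) for nxt in graph[node]):
            return True
        state[node] = "done"
        return False

    return any(visit(node) for node in graph)

LH1 = parse_history("R1(x),W1(x),C1,R2(x),W2(x),C2")
LH2 = parse_history("R2(x),W2(x),C2,R1(x),W1(x),C1")

e1, e2 = precedence_edges(LH1), precedence_edges(LH2)
print(has_cycle(e1), has_cycle(e2), has_cycle(e1 | e2))
```

Each local graph is acyclic on its own (T1 before T2 in LH1, T2 before T1 in LH2), but the union contains the cycle T1, T2, T1, so no single global serial order is consistent with both sites.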
Evaluation Criterion for Concurrency Control
1. Degree of Concurrency
[Figure: the scheduler takes a requested history as input and either recognizes or reshuffles it to produce the executed history]
General Comments
Information needed by Concurrency Controllers
Locks on database objects
Time stamps on database objects
Time stamps on transactions
Observations
Time stamp mechanisms are more fundamental than locking
Time stamps carry more information
Checking locks costs less than checking time stamps
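To illustrate how time stamps on objects and on transactions can be combined, here is a minimal sketch of the basic timestamp ordering rule. The slides do not spell out this rule; the class `Item` and the functions `to_read` and `to_write` are hypothetical names, and a rejected operation would in practice lead to aborting and restarting the transaction.

```python
class Item:
    """A database object carrying the largest read and write time stamps seen."""
    def __init__(self):
        self.read_ts = 0
        self.write_ts = 0

def to_read(item, ts):
    """Accept a read by a transaction with time stamp ts, or reject it."""
    if ts < item.write_ts:
        return False                  # a younger transaction already wrote the item
    item.read_ts = max(item.read_ts, ts)
    return True

def to_write(item, ts):
    """Accept a write by a transaction with time stamp ts, or reject it."""
    if ts < item.read_ts or ts < item.write_ts:
        return False                  # a younger transaction already read or wrote it
    item.write_ts = ts
    return True

x = Item()
results = [
    to_write(x, 5),   # accepted: nothing has touched x yet
    to_read(x, 3),    # rejected: transaction 3 is older than the writer 5
    to_read(x, 7),    # accepted: read time stamp of x becomes 7
    to_write(x, 6),   # rejected: transaction 7 has already read x
]
print(results)
```

Each check compares two numbers per access and can order any pair of transactions, whereas a lock only records that the object is currently held.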
General Comments (cont.)
When to synchronize
First access to an object (Locking, pessimistic validation)
At each access (question of granularity)
After all accesses and before commitment (optimistic validation)
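The last option, synchronizing after all accesses and before commitment, can be sketched as a backward validation test in the style usually attributed to Kung and Robinson. This particular test is an assumption brought in for illustration, not something the slides specify; `validate`, `start_tn`, and the `committed` log are hypothetical names.

```python
def validate(read_set, start_tn, committed):
    """Return True if the transaction may commit.

    read_set:  items the validating transaction has read
    start_tn:  number of the last transaction committed when it started
    committed: list of (commit number, write set) for committed transactions
    """
    for commit_tn, write_set in committed:
        # Only transactions that committed after we started can invalidate us.
        if commit_tn > start_tn and read_set & write_set:
            return False
    return True

# Hypothetical log: transaction 1 wrote x, transaction 2 wrote y.
committed = [(1, {"x"}), (2, {"y"})]

ok_disjoint = validate({"z"}, start_tn=1, committed=committed)
ok_old_write = validate({"x"}, start_tn=1, committed=committed)
ok_overlap = validate({"y", "z"}, start_tn=1, committed=committed)
print(ok_disjoint, ok_old_write, ok_overlap)
```

A transaction that fails validation is rolled back and restarted, which is why rollback appears among the fundamental notions of these schemes.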
Fundamental notions
Rollback
Identification of useless transactions
Delaying commit point
Semantics of transactions