The case for determinism in database systems

A Thomson, DJ Abadi - Proceedings of the VLDB Endowment, 2010 - dl.acm.org
Proceedings of the VLDB Endowment, 2010dl.acm.org
Replication is a widely used method for achieving high availability in database systems. Due
to the nondeterminism inherent in traditional concurrency control schemes, however, special
care must be taken to ensure that replicas don't diverge. Log shipping, eager commit
protocols, and lazy synchronization protocols are well-understood methods for safely
replicating databases, but each comes with its own cost in availability, performance, or
consistency. In this paper, we propose a distributed database system which combines a …
Replication is a widely used method for achieving high availability in database systems. Due to the nondeterminism inherent in traditional concurrency control schemes, however, special care must be taken to ensure that replicas don't diverge. Log shipping, eager commit protocols, and lazy synchronization protocols are well-understood methods for safely replicating databases, but each comes with its own cost in availability, performance, or consistency.
In this paper, we propose a distributed database system which combines a simple deadlock avoidance technique with concurrency control schemes that guarantee equivalence to a predetermined serial ordering of transactions. This effectively removes all nondeterminism from typical OLTP workloads, allowing active replication with no synchronization overhead whatsoever. Further, our system eliminates the requirement for two-phase commit for any kind of distributed transaction, even across multiple nodes within the same replica. By eschewing deadlock detection and two-phase commit, our system under many workloads outperforms traditional systems that allow nondeterministic transaction reordering.
ACM Digital Library