ACM SIGMOD Jim Gray Dissertation Award W Talk

Author:

Jose M. Faleiro

SIGMOD '20: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data

Page 1923

https://doi.org/10.1145/3318464.3394026

Published: 31 May 2020

Abstract

The increasing democratization of server hardware with multi-core CPUs and large main memories has been one of the dominant hardware trends of the last decade. "Bare metal" servers with tens of CPU cores and over 100 gigabytes of main memory have been available for several years now. Recently, this large-scale hardware has also become available via the cloud. Database systems, with their roots in uniprocessors and a paucity of main memory, have unsurprisingly been found wanting on modern hardware. In addition to changes in hardware, database systems have had to contend with changing application requirements and deployment environments. Database systems have long provided applications with an interactive interface, in which an application can communicate with the database over several round-trips in the course of a single request. A large class of applications, however, does not require interactive interfaces, and is unwilling to pay the performance cost associated with overly flexible interfaces. Some of these applications have eschewed database systems altogether in favor of high-performance key-value stores. Finally, modern applications are deployed at ever-increasing scales, often serving hundreds of thousands to millions of simultaneous clients. These large-scale deployments are more prone to errors due to consistency issues in their underlying database systems. Ever since their inception, database systems have allowed applications to trade off consistency for performance, and often nudge applications towards weak consistency. When deployed at scale, weak consistency exposes latent consistency-related bugs, in the same way that failures are more likely to occur at scale. Nearly every widely deployed database system provides applications with weak consistency by default, and its widespread use in practice significantly complicates application development, leading to latent Heisenbugs that are only exposed in production.
This dissertation proposes and explores the use of deterministic execution to address these concerns. Database systems have traditionally been non-deterministic: given an input list of transactions, the final state of the database, which corresponds to some totally ordered execution of transactions, depends on non-deterministic factors such as failures and thread scheduling decisions made by the operating system. Deterministic execution, on the other hand, ensures that the database's final state is always determined by its input list of transactions; in other words, the input list of transactions is the same as the total order of transactions that determines the database's state. While non-deterministic database systems expend significant resources in determining valid total orders of transactions, we show that deterministic systems can exploit a simple, low-cost, up-front total ordering of transactions to execute and schedule transactions much more efficiently. We show that deterministic execution enables low-overhead, highly parallel scheduling mechanisms that can address the performance limitations of existing database systems on modern hardware. Deterministic database systems are designed on the assumption that applications can submit their transactions as one-shot prepared transactions, instead of over multiple round-trips. Finally, we attempt to understand the fundamental reason for the observed performance differences between various consistency levels in database systems, and based on this understanding, show that we can exploit deterministic execution to provide strong consistency at a cost that is competitive with that offered by weak consistency levels.
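
As a rough illustration of the idea above, the following Python sketch runs a list of one-shot transactions in parallel while guaranteeing that the final state equals a serial execution in input order. It is a toy under stated assumptions, not the dissertation's mechanism: the names `Txn`, `transfer`, `run_serial`, and `run_deterministic` are hypothetical, each transaction is assumed to declare up front every key it touches, and reads and writes are conservatively treated alike.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List


@dataclass(frozen=True)
class Txn:
    """Hypothetical one-shot transaction: all keys are declared up front."""
    keys: FrozenSet[str]
    body: Callable[[Dict[str, int]], None]  # must touch only the declared keys


def transfer(src: str, dst: str, amount: int) -> Txn:
    """Move `amount` from src to dst if src has sufficient balance."""
    def body(db: Dict[str, int]) -> None:
        if db[src] >= amount:
            db[src] -= amount
            db[dst] += amount
    return Txn(frozenset({src, dst}), body)


def run_serial(txns: List[Txn], initial: Dict[str, int]) -> Dict[str, int]:
    """Reference semantics: apply transactions one by one in input order."""
    db = dict(initial)
    for txn in txns:
        txn.body(db)
    return db


def run_deterministic(txns: List[Txn], initial: Dict[str, int],
                      workers: int = 4) -> Dict[str, int]:
    """Run in parallel, yet always produce the same state as run_serial."""
    db = dict(initial)
    last_wave: Dict[str, int] = {}  # key -> wave of the latest txn touching it
    waves: List[List[Txn]] = []
    # The input order is the total order: a transaction is placed one wave
    # after the latest earlier transaction that shares a key with it.
    for txn in txns:
        wave = 1 + max((last_wave.get(k, -1) for k in txn.keys), default=-1)
        while len(waves) <= wave:
            waves.append([])
        waves[wave].append(txn)
        for k in txn.keys:
            last_wave[k] = wave
    # Transactions within a wave touch disjoint keys, so thread scheduling
    # inside a wave cannot influence the final state.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for wave_txns in waves:
            list(pool.map(lambda t: t.body(db), wave_txns))
    return db


if __name__ == "__main__":
    start = {"a": 100, "b": 100, "c": 0, "d": 0}
    batch = [transfer("a", "b", 50), transfer("c", "d", 10),
             transfer("b", "c", 120), transfer("a", "d", 70)]
    print(run_deterministic(batch, start) == run_serial(batch, start))
```

The ordering work here is a single pass over the declared key sets before any transaction runs, which is the sense in which an up-front total order can replace run-time conflict resolution in this sketch.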

Index Terms

1. Information systems
  1. Data management systems
    1. Database management system engines


Published In

SIGMOD '20: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data

June 2020

2925 pages

ISBN: 9781450367356

DOI: 10.1145/3318464

General Chairs:
David Maier, Portland State University, USA
Rachel Pottinger, University of British Columbia, Canada

Program Chairs:
AnHai Doan, University of Wisconsin, USA
Wang-Chiew Tan, Megagon Labs, USA

Publications Chairs:
Abdussalam Alawini, University of Illinois at Urbana-Champaign, USA
Hung Q. Ngo, RelationalAI, USA

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Abstract

Conference

SIGMOD/PODS '20: International Conference on Management of Data

Sponsor: SIGMOD

June 14 - 19, 2020

Portland, OR, USA

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%

