Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
10.1145/3597503.3639207acmconferencesArticle/Chapter ViewAbstractPublication PagesicseConference Proceedingsconference-collections
research-article
Open access

Understanding Transaction Bugs in Database Systems

Published: 12 April 2024 Publication History

Abstract

Transactions are used to guarantee data consistency and integrity in Database Management Systems (DBMSs), and have become an indispensable component in DBMSs. However, faulty designs and implementations of DBMSs' transaction processing mechanisms can introduce transaction bugs, and lead to severe consequences, e.g., incorrect database states and DBMS crashes. An in-depth understanding of real-world transaction bugs can significantly promote effective techniques in combating transaction bugs in DBMSs.
In this paper, we conduct the first comprehensive study on 140 transaction bugs collected from six widely-used DBMSs, i.e., MySQL, PostgreSQL, SQLite, MariaDB, CockroachDB, and TiDB. We investigate these bugs from their bug manifestations, root causes, bug impacts and bug fixing. Our study reveals many interesting findings and provides useful guidance for transaction bug detection, testing, and verification.

References

[1]
2022. The ANSI isolation levels. https://renenyffenegger.ch/notes/development/databases/SQL/transaction/isolation-level.
[2]
2022. Async Commit in TiDB. https://docs.pingcap.com/tidb/stable/release-5.0.0#async-commit.
[3]
2022. Character Set Support in PostgreSQL. https://www.postgresql.org/docs/current/multibyte.html.
[4]
2022. Character Sets, Collations, Unicode in MySQL. https://dev.mysql.com/doc/refman/8.0/en/charset.html.
[5]
2022. CockroachDB. https://www.cockroachlabs.com.
[6]
2022. CockroachDB issues. https://github.com/cockroachdb/cockroach/issues/.
[7]
2022. DB-Engines. https://db-engines.com/en/ranking.
[8]
2022. Gretchen: Offline serializability verification, in Clojure. https://jepsen.io/.
[9]
2022. MariaDB. https://mariadb.org.
[10]
2022. MariaDB JIRA issues. https://jira.mariadb.org/issues/.
[11]
2022. MySQL. https://www.mysql.com.
[12]
2022. MySQL Bugs Home. https://bugs.mysql.com/.
[13]
2022. Pgsql-bugs. https://www.postgresql.org/list/pgsql-bugs/.
[14]
2022. PostgreSQL. https://www.postgresql.org.
[15]
2022. SQLite. https://www.sqlite.org/index.html.
[16]
2022. SQLite Ticket Main Menu. https://www.sqlite.org/src/reportlist.
[17]
2022. SQLsmith. https://github.com/anse1/sqlsmith.
[18]
2022. TiDB issues. https://github.com/pingcap/tidb/issues/.
[19]
2022. TiDB, PingCAP. https://pingcap.com.
[20]
2022. Transaction Isolation in PostgreSQL. https://www.postgresql.org/docs/14/transaction-iso.html.
[21]
2022. Transaction Isolation Levels in MySQL. https://dev.mysql.com/doc/refman/8.0/en/innodb-transaction-isolation-levels.html.
[22]
2022. Transactions in TiDB. https://docs.pingcap.com/tidb/stable/transaction-overview.
[23]
2023. JanusGraph. https://janusgraph.org.
[24]
2023. Memcached. http://www.memcached.org/.
[25]
2023. Neo4j. https://neo4j.com/.
[26]
2023. OrientDB. https://orientdb.org.
[27]
2023. Redis. https://redis.com/.
[28]
2023. XA Transactions in MariaDB. https://mariadb.com/kb/en/xa-transactions/.
[29]
2023. XA Transactions in MySQL. https://dev.mysql.com/doc/refman/8.0/en/xa.html.
[30]
Atul Adya. 1999. Weak Consistency: A Generalized Theory and Optimistic Implementations for Distributed Transactions. Ph. D. Dissertation. Massachusetts Institute of Technology.
[31]
Atul Adya, Barbara Liskov, and Patrick O'Neil. 2000. Generalized Isolation Level Definitions. In Proceedings of International Conference on Data Engineering (ICDE). 67--78.
[32]
Mohammed Alfatafta, Basil Alkhatib, Ahmed Alquraan, and Samer Al-Kiswany. 2020. Toward a Generic Fault Tolerance Technique for Partial Network Partitioning. In Proceedings of USENIX Conference on Operating Systems Design and Implementation (OSDI). 351--368.
[33]
Ahmed Alquraan, Hatem Takruri, Mohammed Alfatafta, and Samer Al-Kiswany. 2018. An Analysis of Network-Partitioning Failures in Cloud Systems. In Proceedings of USENIX Conference on Operating Systems Design and Implementation (OSDI). 51--68.
[34]
Jinsheng Ba and Manuel Rigger. 2023. Testing Database Engines via Query Plan Guidance. In Proceedings of International Conference on Software Engineering (ICSE). 2060--2071.
[35]
Peter Bailis, Aaron Davidson, Alan Fekete, Ali Ghodsi, Joseph M. Hellerstein, and Ion Stoica. 2013. Highly Available Transactions: Virtues and Limitations. Proceedings of the VLDB Endowment 7, 3 (2013), 181--192.
[36]
Peter Bailis, Alan Fekete, Ali Ghodsi, Joseph M. Hellerstein, and Ion Stoica. 2016. Scalable Atomic Visibility with RAMP Transactions. ACM Transactions on Database Systems 41, 3 (2016), 15:1--15:45.
[37]
Hal Berenson, Phil Bernstein, Jim Gray, Jim Melton, Elizabeth O'Neil, and Patrick O'Neil. 1995. A Critique of ANSI SQL Isolation Levels. In Proceedings of International Conference on Management of Data (SIGMOD). 1--10.
[38]
Philip A. Bernstein and Nathan Goodman. 1983. Multiversion Concurrency Control---Theory and Algorithms. ACM Transactions on Database Systems (TODS) 8, 4 (1983), 465--483.
[39]
Ranadeep Biswas and Constantin Enea. 2019. On the Complexity of Checking Transactional Consistency. In Proceedings of ACM SIGPLAN Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA). 165:1--165:28.
[40]
Lucas Brutschy, Dimitar Dimitrov, Peter Müller, and Martin Vechev. 2017. Serializability for Eventual Consistency: Criterion, Analysis, and Applications. In Proceedings of ACM SIGPLAN Symposium on Principles of Programming Languages (POPL). 458--472.
[41]
Andrea Cerone, Giovanni Bernardi, and Alexey Gotsman. 2015. A Framework for Transactional Consistency Models with Atomic Visibility. In Proceedings of International Conference on Concurrency Theory (CONCUR). 58--71.
[42]
Junjie Chen, Yihua Liang, Qingchao Shen, Jiajun Jiang, and Shuochuan Li. 2023. Toward Understanding Deep Learning Framework Bugs. ACM Trans. Softw. Eng. Methodol. 32, 6 (2023), 31 pages.
[43]
Natacha Crooks, Youer Pu, Lorenzo Alvisi, and Allen Clement. 2017. Seeing is Believing: A Client-Centric Specification of Database Isolation. In Proceedings of the ACM Symposium on Principles of Distributed Computing (PODC). 73--82.
[44]
Ziyu Cui, Wensheng Dou, Qianwang Dai, Jiansen Song, Wei Wang, Jun Wei, and Dan Ye. 2022. Differentially Testing Database Transactions for Fun and Profit. In Proceedings of International Conference on Automated Software Engineering (ASE). 35:1--35:12.
[45]
Wensheng Dou, Ziyu Cui, Qianwang Dai, Jiansen Song, Dong Wang, Yu Gao, Wei Wang, Jun Wei, Lei Chen, Hanmo Wang, Hua Zhong, and Tao Huang. 2023. Detecting Isolation Bugs via Transaction Oracle Construction. In Proceedings of International Conference on Software Engineering (ICSE). 1123--1135.
[46]
Yu Gao, Wensheng Dou, Feng Qin, Chushu Gao, Dong Wang, Jun Wei, Ruirui Huang, Li Zhou, and Yongming Wu. 2018. An Empirical Study on Crash Recovery Bugs in Large-Scale Distributed Systems. In Proceedings of ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE). 539--550.
[47]
Hao Guan, Ying Xiao, Jiaying Li, Yepang Liu, and Guangdong Bai. 2023. A Comprehensive Study of Real-World Bugs in Machine Learning Model Optimization. In Proceedings of International Conference on Software Engineering (ICSE). 147--158.
[48]
Haryadi S. Gunawi, Mingzhe Hao, Tanakorn Leesatapornwongsa, Tiratat Patanaanake, Thanh Do, Jeffry Adityatama, Kurnia J. Eliazar, Agung Laksono, Jeffrey F. Lukman, Vincentius Martin, and Anang D. Satria. 2014. What Bugs Live in the Cloud? A Study of 3000+ Issues in Cloud Systems. In Proceedings of ACM Symposium on Cloud Computing (SOCC). 1--14.
[49]
Ziyue Hua, Wei Lin, Luyao Ren, Zongyang Li, Lu Zhang, Wenpin Jiao, and Tao Xie. 2023. GDsmith: Detecting Bugs in Cypher Graph Database Engines. In Proceedings of International Symposium on Software Testing and Analysis (ISSTA). 163--174.
[50]
Daniel Jackson and Craig A. Damon. 1996. Elements of Style: Analyzing a Software Design Feature with a Counterexample Detector. IEEE Transactions on Software Engineering 22, 7 (1996), 484--495.
[51]
Zu-Ming Jiang, Jia-Ju Bai, and Zhendong Su. 2023. DynSQL: Stateful Fuzzing for Database Management Systems with Complex and Valid SQL Query Generation. In Proceedings of USENIX Security Symposium (USENIX Security). 4949--4965.
[52]
Zu-Ming Jiang, Si Liu, Manuel Rigger, and Zhendong Su. 2023. Detecting Transactional Bugs in Database Engines via Graph-Based Oracle Construction. In Proceedings of USENIX Symposium on Operating Systems Design and Implementation (OSDI). 397--417.
[53]
Guoliang Jin, Linhai Song, Xiaoming Shi, Joel Scherpelz, and Shan Lu. 2012. Understanding and Detecting Real-World Performance Bugs. In Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). 77--88.
[54]
Jinho Jung, Hong Hu, Joy Arulraj, Taesoo Kim, and Woonhak Kang. 2019. APOLLO: Automatic Detection and Diagnosis of Performance Regressions in Database Systems. Proceedings of the VLDB Endowment (VLDB) 13, 1 (2019), 57--70.
[55]
Matteo Kamm, Manuel Rigger, Chengyu Zhang, and Zhendong Su. 2023. Testing Graph Database Engines via Query Partitioning. In Proceedings of International Symposium on Software Testing and Analysis (ISSTA). 140--149.
[56]
Kyle Kingsbury and Peter Alvaro. 2020. Elle: Inferring Isolation Anomalies from Experimental Observations. Proceedings of the VLDB Endowment 14, 3 (2020), 268--280.
[57]
Hsiang-Tsung Kung and John T Robinson. 1981. On Optimistic Methods for Concurrency Control. ACM Transactions on Database Systems (TODS) 6, 2 (1981), 213--226.
[58]
Tanakorn Leesatapornwongsa, Jeffrey F. Lukman, Shan Lu, and Haryadi S. Gunawi. 2016. TaxDC: A Taxonomy of Non-Deterministic Concurrency Bugs in Datacenter Distributed Systems. In Proceedings of International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). 517--530.
[59]
Jie Liang, Yaoguang Chen, Zhiyong Wu, Jingzhou Fu, Mingzhe Wang, Yu Jiang, Xiangdong Huang, Ting Chen, Jiashui Wang, and Jiajia Li. 2023. Sequence-Oriented DBMS Fuzzing. In Proceedings of International Conference on Data Engineering (ICDE). 668--681.
[60]
Yu Liang, Song Liu, and Hong Hu. 2022. Detecting Logical Bugs of DBMS with Coverage-based Guidance. In Proceedings of USENIX Security Symposium (USENIX Security). 4309--4326.
[61]
Xinyu Liu, Qi Zhou, Joy Arulraj, and Alessandro Orso. 2022. Automatic Detection of Performance Bugs in Database Systems using Equivalent Queries. In Proceedings of International Conference on Software Engineering (ICSE). 225--236.
[62]
Shan Lu, Soyeon Park, Eunsoo Seo, and Yuanyuan Zhou. 2008. Learning from Mistakes: A Comprehensive Study on Real World Concurrency Bug Characteristics. In Proceedings of International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). 329--339.
[63]
David Patrick Reed. 1978. Naming and Synchronization in a Decentralized Computer System. Technical Report.
[64]
Manuel Rigger and Zhendong Su. 2020. Detecting Optimization Bugs in Database Engines via Non-Optimizing Reference Engine Construction. In Proceedings of ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE). 1140--1152.
[65]
Manuel Rigger and Zhendong Su. 2020. Finding Bugs in Database Systems via Query Partitioning. Proceedings of the ACM on Programming Languages 4, OOPSLA (2020), 211:1--211:30.
[66]
Manuel Rigger and Zhendong Su. 2020. Testing Database Engines via Pivoted Query Synthesis. In Proceedings of USENIX Symposium on Operating Systems Design and Implementation (OSDI). 667--682.
[67]
Donald R. Slutz. 1998. Massive Stochastic Testing of SQL. In Proceedings of International Conference on Very Large Data Bases (VLDB). 618--622.
[68]
Jiansen Song, Wensheng Dou, Ziyu Cui, Qianwang Dai, Wei Wang, Jun Wei, Hua Zhong, and Tao Huang. 2023. Testing Database Systems via Differential Query Execution. In Proceedings of International Conference on Software Engineering (ICSE). 2072--2084.
[69]
Xiaohui Song and Jane W-S Liu. 1990. Performance of Multiversion Concurrency Control Algorithms in Maintaining Temporal Consistency. In Proceedings of Annual International Computer Software and Applications Conference (COMPSAC). 132--139.
[70]
Chengnian Sun, Vu Le, Qirun Zhang, and Zhendong Su. 2016. Toward Understanding Compiler Bugs in GCC and LLVM. In Proceedings of International Symposium on Software Testing and Analysis (ISSTA). 294--305.
[71]
Cheng Tan, Changgeng Zhao, Shuai Mu, and Michael Walfish. 2020. Cobra: Making Transactional Key-Value Stores Verifiably Serializable. In Proceedings of USENIX Symposium on Operating Systems Design and Implementation (OSDI). 63--80.
[72]
Jie Wang, Wensheng Dou, Yu Gao, Chushu Gao, Feng Qin, Kang Yin, and Jun Wei. 2017. A Comprehensive Study on Real World Concurrency Bugs in Node.Js. In Proceedings of International Conference on Automated Software Engineering (ASE). 520--531.
[73]
Mingzhe Wang, Zhiyong Wu, Xinyi Xu, Jie Liang, Chijin Zhou, Huafeng Zhang, and Yu Jiang. 2021. Industry Practice of Coverage-Guided Enterprise-Level DBMS Fuzzing. In Proceedings of International Conference on Software Engineering (ICSE SEIP). 328--337.
[74]
Rui Yang, Yingying Zheng, Lei Tang, Wensheng Dou, Wei Wang, and Jun Wei. 2023. Randomized Differential Testing of RDF Stores. In Proceedings of International Conference on Software Engineering (ICSE Demo). 136--140.
[75]
Xiangyao Yu, Andrew Pavlo, Daniel Sanchez, and Srinivas Devadas. 2016. TicToc: Time Traveling Optimistic Concurrency Control. In Proceedings of International Conference on Management of Data (SIGMOD). 1629--1642.
[76]
Yingying Zheng, Wensheng Dou, Lei Tang, Ziyu Cui, Jiansen Song, Ziyue Cheng, Wei Wang, Jun Wei, Hua Zhong, and Tao Huang. 2024. Differential Optimization Testing of Gremlin-Based Graph Database Systems. In Proceedings of IEEE International Conference on Software Testing, Verification and Validation (ICST).
[77]
Yingying Zheng, Wensheng Dou, Yicheng Wang, Zheng Qin, Lei Tang, Yu Gao, Dong Wang, Wei Wang, and Jun Wei. 2022. Finding Bugs in Gremlin-Based Graph Database Systems via Randomized Differential Testing. In Proceedings of International Symposium on Software Testing and Analysis (ISSTA). 302--313.
[78]
Rui Zhong, Yongheng Chen, Hong Hu, Hangfan Zhang, Wenke Lee, and Dinghao Wu. 2020. SQUIRREL: Testing Database Management Systems with Language Validity and Coverage Feedback. In Proceedings of ACM SIGSAC Conference on Computer and Communications Security (CCS). 955--970.

Cited By

View all
  • (2024)Detecting Metadata-Related Logic Bugs in Database Systems via Raw Database ConstructionProceedings of the VLDB Endowment10.14778/3659437.365944517:8(1884-1897)Online publication date: 31-May-2024
  • (2024)An Empirical Study on Kubernetes Operator BugsProceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3650212.3680396(1746-1758)Online publication date: 11-Sep-2024
  • (2024)Testing Gremlin-Based Graph Database Systems via Query DisassemblingProceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3650212.3680392(1695-1707)Online publication date: 11-Sep-2024

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
ICSE '24: Proceedings of the IEEE/ACM 46th International Conference on Software Engineering
May 2024
2942 pages
ISBN:9798400702174
DOI:10.1145/3597503
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

In-Cooperation

  • Faculty of Engineering of University of Porto

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 April 2024

Check for updates

Badges

Author Tags

  1. database system
  2. transaction bug
  3. empirical study

Qualifiers

  • Research-article

Conference

ICSE '24
Sponsor:

Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%

Upcoming Conference

ICSE 2025

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)506
  • Downloads (Last 6 weeks)73
Reflects downloads up to 27 Jan 2025

Other Metrics

Citations

Cited By

View all
  • (2024)Detecting Metadata-Related Logic Bugs in Database Systems via Raw Database ConstructionProceedings of the VLDB Endowment10.14778/3659437.365944517:8(1884-1897)Online publication date: 31-May-2024
  • (2024)An Empirical Study on Kubernetes Operator BugsProceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3650212.3680396(1746-1758)Online publication date: 11-Sep-2024
  • (2024)Testing Gremlin-Based Graph Database Systems via Query DisassemblingProceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3650212.3680392(1695-1707)Online publication date: 11-Sep-2024
  • (2024)Differential Optimization Testing of Gremlin-Based Graph Database Systems2024 IEEE Conference on Software Testing, Verification and Validation (ICST)10.1109/ICST60714.2024.00012(25-36)Online publication date: 27-May-2024

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media