research-article
Open access

IsoPredict: Dynamic Predictive Analysis for Detecting Unserializable Behaviors in Weakly Isolated Data Store Applications

Published: 20 June 2024

Abstract

Distributed data stores typically provide weak isolation levels, which are efficient but can lead to unserializable behaviors that are hard for programmers to understand and often result in errors. This paper presents IsoPredict, the first dynamic predictive analysis for data store applications under weak isolation levels. Given an observed serializable execution of a data store application, IsoPredict generates and solves SMT constraints to find an unserializable execution that is a feasible execution of the application. IsoPredict introduces novel techniques to handle divergent application behavior; to solve mutually recursive sets of constraints; and to balance coverage, precision, and performance. An evaluation shows that IsoPredict finds unserializable behaviors in four data store benchmarks and that more than 99% of its predicted executions are feasible.
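
To make the notion of an unserializable execution concrete, the following is a minimal sketch, not IsoPredict's SMT encoding: it brute-forces every serial order of a small history and checks whether each read observes the most recent preceding write. The history format, the `INIT` marker, and the function name `is_serializable` are assumptions made for illustration, and each transaction is assumed to perform all of its reads before its writes.

```python
from itertools import permutations

# Hypothetical marker for the writer of every key's initial value.
INIT = "init"


def is_serializable(txns):
    """Return True if some serial order of the transactions explains every read.

    txns maps a transaction id to a dict with:
      "reads":  {key: id of the transaction whose write was observed}
      "writes": set of keys written
    Assumption: within a transaction, all reads happen before all writes.
    """
    for order in permutations(txns):
        last_writer = {}  # key -> id of the latest writer in this serial order
        consistent = True
        for t in order:
            # Every read must observe the latest write that precedes it.
            for key, writer in txns[t]["reads"].items():
                if last_writer.get(key, INIT) != writer:
                    consistent = False
                    break
            if not consistent:
                break
            for key in txns[t]["writes"]:
                last_writer[key] = t
        if consistent:
            return True
    return False


# Observed execution: t2 reads the value written by t1 (serializable).
observed = {
    "t1": {"reads": {"x": INIT}, "writes": {"x"}},
    "t2": {"reads": {"x": "t1"}, "writes": {"x"}},
}

# Alternative execution: both transactions read the initial value of x,
# which no serial order can explain (a lost update).
predicted = {
    "t1": {"reads": {"x": INIT}, "writes": {"x"}},
    "t2": {"reads": {"x": INIT}, "writes": {"x"}},
}

print(is_serializable(observed))
print(is_serializable(predicted))
```

The brute-force search is exponential in the number of transactions and is only meant to illustrate the property being checked; it does not address whether the alternative execution is feasible for the application or allowed by a given weak isolation level.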

    Published In

    Proceedings of the ACM on Programming Languages  Volume 8, Issue PLDI
    June 2024
    2198 pages
    EISSN:2475-1421
    DOI:10.1145/3554317
    This work is licensed under a Creative Commons Attribution 4.0 International License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 20 June 2024
    Published in PACMPL Volume 8, Issue PLDI

    Author Tags

    1. data stores
    2. dynamic predictive analysis
    3. transactions
    4. weak isolation levels

    Qualifiers

    • Research-article

    Funding Sources

    • NSF
    • Oracle America, Inc
