research-article
Open access

DiffStream: differential output testing for stream processing programs

Published: 13 November 2020
    Abstract

    High performance architectures for processing distributed data streams, such as Flink, Spark Streaming, and Storm, are increasingly deployed in emerging data-driven computing systems. Exploiting the parallelism afforded by such platforms, while preserving the semantics of the desired computation, is prone to errors, and motivates the development of tools for specification, testing, and verification. We focus on the problem of differential output testing for distributed stream processing systems, that is, checking whether two implementations produce equivalent output streams in response to a given input stream. The notion of equivalence allows reordering of logically independent data items, and the main technical contribution of the paper is an optimal online algorithm for checking this equivalence. Our testing framework is implemented as a library called DiffStream in Flink. We present four case studies to illustrate how our framework can be used to (1) correctly identify bugs in a set of benchmark MapReduce programs, (2) facilitate the development of difficult-to-parallelize high performance applications, and (3) monitor an application for a long period of time with minimal performance overhead.
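The equivalence notion described in the abstract can be illustrated with a small offline check. The sketch below is not the paper's optimal online algorithm and does not use the DiffStream API. It is a simple quadratic check, written in Python for brevity, which assumes the tester supplies a symmetric `dependent(x, y)` predicate that says which items must keep their relative order. The function name, the predicate, and the keyed example items are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Item = Tuple[str, int]  # hypothetical item shape: (key, value)


def same_key(x: Item, y: Item) -> bool:
    """Assumed dependence relation: items with the same key must stay ordered."""
    return x[0] == y[0]


def equivalent_up_to_reordering(
    left: Sequence[Item],
    right: Sequence[Item],
    dependent: Callable[[Item, Item], bool],
) -> bool:
    """Return True if `right` can be obtained from `left` by repeatedly
    swapping adjacent items that are NOT dependent on each other."""
    if len(left) != len(right):
        return False
    remaining: List[Item] = list(left)
    for item in right:
        for i, candidate in enumerate(remaining):
            if candidate == item:
                # Nothing dependent precedes this occurrence, so it can be
                # moved to the front and matched against `item`.
                del remaining[i]
                break
            if dependent(candidate, item):
                # A dependent item comes earlier in `left`: the order differs.
                return False
        else:
            # `item` does not occur among the unmatched items of `left`.
            return False
    return True


if __name__ == "__main__":
    left = [("a", 1), ("b", 1), ("a", 2)]
    print(equivalent_up_to_reordering(left, [("b", 1), ("a", 1), ("a", 2)], same_key))  # True
    print(equivalent_up_to_reordering(left, [("a", 2), ("b", 1), ("a", 1)], same_key))  # False
```

In the first call only items with different keys change places, so the streams count as equivalent. In the second, two items with the same key are swapped, so the check fails.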

    Supplementary Material

    Auxiliary Presentation Video (oopsla20main-p96-p-video.mp4)
    OOPSLA Video Presentation



    Published In

    Proceedings of the ACM on Programming Languages, Volume 4, Issue OOPSLA
    November 2020
    3108 pages
    EISSN: 2475-1421
    DOI: 10.1145/3436718
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. differential testing
    2. runtime verification
    3. stream processing



    Cited By

    • (2023) Testing Graph Database Engines via Query Partitioning. Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis, 140-149. DOI: 10.1145/3597926.3598044. Online publication date: 12-Jul-2023
    • (2023) Fast Prototyping of Distributed Stream Processing Applications with stream2gym. 2023 IEEE 43rd International Conference on Distributed Computing Systems (ICDCS), 395-405. DOI: 10.1109/ICDCS57875.2023.00034. Online publication date: Jul-2023
    • (2023) Perfce: Performance Debugging on Databases with Chaos Engineering-Enhanced Causality Analysis. 2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE), 1454-1466. DOI: 10.1109/ASE56229.2023.00106. Online publication date: 11-Sep-2023
    • (2023) A Grey Literature Review on Data Stream Processing applications testing. Journal of Systems and Software 203 (111744). DOI: 10.1016/j.jss.2023.111744. Online publication date: Sep-2023
    • (2022) Stream processing with dependency-guided synchronization. Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 1-16. DOI: 10.1145/3503221.3508413. Online publication date: 2-Apr-2022
    • (2022) A Survey on Advancements of Real-Time Analytics Architecture Components. Computational Methods and Data Engineering, 547-559. DOI: 10.1007/978-981-19-3015-7_41. Online publication date: 9-Sep-2022
    • (2021) SPOT: Testing Stream Processing Programs with Symbolic Execution and Stream Synthesizing. Applied Sciences 11:17 (8057). DOI: 10.3390/app11178057. Online publication date: 30-Aug-2021
    • (2021) s2p: Provenance Research for Stream Processing System. Applied Sciences 11:12 (5523). DOI: 10.3390/app11125523. Online publication date: 15-Jun-2021
    • (2021) An order-aware dataflow model for parallel Unix pipelines. Proceedings of the ACM on Programming Languages 5:ICFP (1-28). DOI: 10.1145/3473570. Online publication date: 19-Aug-2021
    • (2021) Synchronization Schemas. Proceedings of the 40th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, 1-18. DOI: 10.1145/3452021.3458317. Online publication date: 20-Jun-2021
