research-article
Open access

Automated and Efficient Test-Generation for Grid-Based Multiagent Systems: Comparing Random Input Filtering versus Constraint Solving

Published: 23 November 2023

Abstract

Automatic generation of random test inputs is an approach that can alleviate the challenges of manual test case design. However, random test cases may be ineffective in fault detection and increase testing cost, especially in systems where test execution is resource- and time-consuming. To remedy this, the domain knowledge of test engineers can be exploited to select potentially effective test cases. To this end, test selection constraints suggested by domain experts can be utilized either for filtering randomly generated test inputs or for direct generation of inputs using constraint solvers. In this article, we propose a domain-specific language (DSL) for formalizing locality-based test selection constraints of autonomous agents and discuss the impact of test selection filters, specified in our DSL, on randomly generated test cases. We study and compare the performance of the filtering and constraint solving approaches in generating selective test cases for different test scenario parameters and discuss the role of these parameters in test generation performance. Through our study, we provide criteria for the suitability of the random data filtering approach versus the constraint solving one under varying size and complexity of our testing problem. We formulate the corresponding research questions and answer them by designing and conducting experiments using QuickCheck for random test data generation with filtering and Z3 for constraint solving. Our observations and statistical analysis indicate that applying filters can significantly improve the test efficiency of randomly generated test cases. Furthermore, we observe that test scenario parameters affect the performance of the filtering and constraint solving approaches differently. In particular, our results indicate that the two approaches have complementary strengths: random generation with filtering works best for large numbers of agents and long paths, while its performance degrades for larger grid sizes and stricter constraints. In contrast, constraint solving performs robustly for large grid sizes and strict constraints, while its performance degrades with more agents and longer paths.
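The difference between the two generation strategies can be illustrated with a minimal Python sketch. It is not the article's DSL, its QuickCheck generators, or its Z3 encoding. All names (`random_placement`, `is_local`, `generate_by_filtering`, `generate_by_construction`) and the toy locality constraint (every agent lies within a Manhattan distance `max_dist` of the first agent) are assumptions made for illustration, and the direct-construction function is only a hand-written stand-in for what a constraint solver would do.

```python
import random
from itertools import product


def manhattan(a, b):
    """Manhattan distance between two grid cells given as (x, y) tuples."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def random_placement(grid_size, num_agents, rng):
    """Place each agent on a uniformly random cell of a square grid.
    Simplification: agents are allowed to share a cell."""
    return [(rng.randrange(grid_size), rng.randrange(grid_size))
            for _ in range(num_agents)]


def is_local(placement, max_dist):
    """Hypothetical locality filter: every other agent must lie within
    max_dist (Manhattan) of the first agent."""
    ego = placement[0]
    return all(manhattan(ego, other) <= max_dist for other in placement[1:])


def generate_by_filtering(grid_size, num_agents, max_dist, rng,
                          max_tries=100_000):
    """Generate-and-filter: draw random placements until one passes the
    filter. Returns (placement, tries), or (None, max_tries) on failure."""
    for tries in range(1, max_tries + 1):
        placement = random_placement(grid_size, num_agents, rng)
        if is_local(placement, max_dist):
            return placement, tries
    return None, max_tries


def generate_by_construction(grid_size, num_agents, max_dist, rng):
    """Stand-in for constraint solving (no SMT solver is called here):
    pick the first agent, then draw the others only from cells that
    already satisfy the locality constraint."""
    ego = (rng.randrange(grid_size), rng.randrange(grid_size))
    candidates = [cell for cell in product(range(grid_size), repeat=2)
                  if manhattan(ego, cell) <= max_dist]
    return [ego] + [rng.choice(candidates) for _ in range(num_agents - 1)]


if __name__ == "__main__":
    rng = random.Random(42)
    for size in (5, 20, 80):
        _, tries = generate_by_filtering(size, 3, 2, rng)
        print(f"grid {size}x{size}: filtering needed {tries} random draws")
    print("constructed:", generate_by_construction(80, 3, 2, rng))
```

In this toy setting, the fraction of grid cells that lie within `max_dist` of the first agent shrinks as the grid grows or as `max_dist` decreases, so the filter rejects more random placements before one is accepted, whereas the direct construction always yields a valid placement in a single pass. This is only meant to convey the intuition behind the trade-off described in the abstract, not to reproduce the reported measurements.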

    Published In

    ACM Transactions on Software Engineering and Methodology  Volume 33, Issue 1
    January 2024
    933 pages
    EISSN:1557-7392
    DOI:10.1145/3613536
    Editor: Mauro Pezzè
    This work is licensed under a Creative Commons Attribution 4.0 International License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 23 November 2023
    Online AM: 30 September 2023
    Accepted: 03 July 2023
    Revised: 19 June 2023
    Received: 01 July 2022
    Published in TOSEM Volume 33, Issue 1

    Author Tags

    1. Test input generation
    2. domain specific languages
    3. test selection
    4. autonomous agents
    5. multiagent systems
    6. grid-based systems
    7. constraint solving
    8. test input filtering

    Funding Sources

    • Knowledge Foundation (KKS) in the framework of “Safety of Connected Intelligent Vehicles in Smart Cities – SafeSmart” project
    • UKRI Trustworthy Autonomous Systems Node in Verifiability
