DOI: 10.1145/3460319.3464811

DeepHyperion: exploring the feature space of deep learning-based systems through illumination search

Published: 11 July 2021

Abstract

Deep Learning (DL) has been successfully applied to a wide range of application domains, including safety-critical ones. Several DL testing approaches have recently been proposed in the literature, but none of them aims to assess how different interpretable features of the generated inputs affect the system's behaviour.
In this paper, we resort to Illumination Search to find the highest-performing test cases (i.e., misbehaving and closest to misbehaving), spread across the cells of a map representing the feature space of the system. We introduce a methodology that guides the users of our approach in the tasks of identifying and quantifying the dimensions of the feature space for a given domain. We developed DeepHyperion, a search-based tool for DL systems that illuminates, i.e., explores at large, the feature space, by providing developers with an interpretable feature map where automatically generated inputs are placed along with information about the exposed behaviours.
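
The idea of keeping the best test case per cell of a feature map can be illustrated with a minimal MAP-Elites-style sketch. This is not DeepHyperion's implementation: the function `illuminate`, the toy feature, fitness and mutation functions, the grid size and the bounds are all hypothetical stand-ins chosen for illustration. The fitness is assumed to be a margin that is negative when the system misbehaves, so lower values are "higher-performing" from a testing point of view.

```python
import random


def illuminate(seeds, mutate, features, fitness, bins, bounds, iterations, rng):
    """MAP-Elites-style loop: keep the lowest-fitness input found in each feature cell."""
    archive = {}  # cell coordinates -> (fitness value, input)

    def cell_of(x):
        # Discretise each feature value into one of `bins` intervals.
        coords = []
        for value, (low, high) in zip(features(x), bounds):
            ratio = (value - low) / (high - low)
            coords.append(min(bins - 1, max(0, int(ratio * bins))))
        return tuple(coords)

    def try_insert(x):
        cell, score = cell_of(x), fitness(x)
        # An input enters the map if its cell is empty or it beats the occupant.
        if cell not in archive or score < archive[cell][0]:
            archive[cell] = (score, x)

    for x in seeds:
        try_insert(x)
    for _ in range(iterations):
        # Pick a random elite from the map and mutate it.
        _, parent = rng.choice(list(archive.values()))
        try_insert(mutate(parent, rng))
    return archive


# --- Toy stand-ins (assumptions, not taken from the paper) ---
def toy_features(x):
    # Two hypothetical interpretable features, each in [0, 1].
    return (x[0], x[1])


def toy_fitness(x):
    # Hypothetical margin: negative means the system misbehaves on x.
    return x[0] + x[1] - 1.0


def toy_mutate(x, rng):
    # Small Gaussian perturbation, clamped to the valid range.
    return tuple(min(1.0, max(0.0, v + rng.gauss(0.0, 0.1))) for v in x)


archive = illuminate(
    seeds=[(0.5, 0.5)],
    mutate=toy_mutate,
    features=toy_features,
    fitness=toy_fitness,
    bins=5,
    bounds=[(0.0, 1.0), (0.0, 1.0)],
    iterations=2000,
    rng=random.Random(0),
)
misbehaving_cells = sorted(cell for cell, (score, _) in archive.items() if score < 0)
print(len(archive), "cells filled;", len(misbehaving_cells), "contain misbehaving inputs")
```

The resulting `archive` plays the role of the feature map: each filled cell holds one representative input together with its fitness, so a developer can see which feature combinations expose misbehaviours and which do not.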

    Published In

    ISSTA 2021: Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis
    July 2021
    685 pages
    ISBN:9781450384599
    DOI:10.1145/3460319

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. deep learning
    2. search based software engineering
    3. self-driving cars
    4. software testing

    Qualifiers

    • Research-article

    Conference

    ISSTA '21

    Acceptance Rates

    Overall Acceptance Rate 58 of 213 submissions, 27%
