DOI: 10.1145/3533767.3534220
Research article · Open access

DocTer: documentation-guided fuzzing for testing deep learning API functions

Published: 18 July 2022

Abstract

Input constraints are useful for many software development tasks. For example, input constraints of a function enable the generation of valid inputs, i.e., inputs that follow these constraints, to test the function more deeply. API functions of deep learning (DL) libraries have DL-specific input constraints, which are described informally in free-form API documentation. Existing constraint-extraction techniques are ineffective at extracting DL-specific input constraints.
To fill this gap, we design and implement a new technique—DocTer—to analyze API documentation to extract DL-specific input constraints for DL API functions. DocTer features a novel algorithm that automatically constructs rules to extract API parameter constraints from syntactic patterns in the form of dependency parse trees of API descriptions. These rules are then applied to a large volume of API documents in popular DL libraries to extract their input parameter constraints. To demonstrate the effectiveness of the extracted constraints, DocTer uses the constraints to enable the automatic generation of valid and invalid inputs to test DL API functions.
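As a concrete illustration of the pipeline's first half, below is a minimal, hypothetical sketch of pattern-based constraint extraction. It is not DocTer's implementation: DocTer constructs its extraction rules automatically from dependency parse trees of annotated API descriptions, whereas this sketch hard-codes a single pattern, uses spaCy (with the en_core_web_sm model) as a stand-in parser, and assumes an illustrative DTYPES vocabulary and function name.

    # Hypothetical sketch (assumptions flagged above): fire one hand-written
    # pattern over a dependency parse to extract a dtype constraint from a
    # parameter description. DocTer mines such patterns automatically.
    import spacy

    nlp = spacy.load("en_core_web_sm")  # assumes this model is installed
    DTYPES = {"float16", "float32", "float64", "int32", "int64", "uint8", "bool"}

    def extract_dtype_constraint(description):
        """Return the set of dtypes the description allows, if the pattern fires."""
        doc = nlp(description)
        for tok in doc:
            if tok.lemma_ == "type":  # trigger word of this one pattern
                hits = {t.text for t in tok.subtree if t.text in DTYPES}
                if hits:
                    return hits
        # fall back to a flat scan if the parser attaches the list elsewhere
        return {t.text for t in doc if t.text in DTYPES}

    print(extract_dtype_constraint(
        "A Tensor. Must be one of the following types: float32, int64."))
    # e.g. {'float32', 'int64'}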
Our evaluation on three popular DL libraries (TensorFlow, PyTorch, and MXNet) shows that DocTer’s precision in extracting input constraints is 85.4%. DocTer detects 94 bugs from 174 API functions, including one previously unknown security vulnerability that is now documented in the CVE database, while a baseline technique without input constraints detects only 59 bugs. Most (63) of the 94 bugs are previously unknown, 54 of which have been fixed or confirmed by developers after we report them. In addition, DocTer detects 43 inconsistencies in documents, 39 of which are fixed or confirmed.
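To ground the second half of the pipeline, here is a similarly hedged sketch of constraint-guided input generation. The constraint dictionary format below is an assumed simplification, not DocTer's actual representation: a conforming input exercises the function body beyond its input-validation code, while an input that deliberately violates exactly one constraint should be rejected gracefully, so a crash signals a bug.

    # Hypothetical sketch: generate valid/invalid inputs from an extracted
    # constraint. The {"dtype", "ndim"} format is an assumed simplification.
    import numpy as np

    def gen_valid(constraint, rng):
        """Sample an input satisfying the extracted dtype and ndim constraints."""
        dtype = rng.choice(sorted(constraint["dtype"]))
        shape = tuple(rng.integers(1, 4, size=constraint["ndim"]))
        return (rng.standard_normal(shape) * 10).astype(dtype)

    def gen_invalid(constraint, rng):
        """Violate exactly one constraint: pick a dtype outside the allowed set."""
        shape = tuple(rng.integers(1, 4, size=constraint["ndim"]))
        return rng.standard_normal(shape).astype("complex64")

    rng = np.random.default_rng(0)
    c = {"dtype": {"float32", "int64"}, "ndim": 2}
    x_ok, x_bad = gen_valid(c, rng), gen_invalid(c, rng)
    print(x_ok.dtype, x_ok.shape)    # conforming input, for deep testing
    print(x_bad.dtype, x_bad.shape)  # dtype outside the allowed set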




Published In

ISSTA 2022: Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis
July 2022
808 pages
ISBN: 9781450393799
DOI: 10.1145/3533767
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. deep learning
  2. test generation
  3. testing
  4. text analytics


Conference

ISSTA '22

Acceptance Rates

Overall acceptance rate: 58 of 213 submissions (27%)



Article Metrics

  • Downloads (last 12 months): 685
  • Downloads (last 6 weeks): 79
Reflects downloads up to 08 Feb 2025


Cited By

  • (2025) Deep Learning Library Testing: Definition, Methods and Challenges. ACM Computing Surveys. https://doi.org/10.1145/3716497 (5 Feb 2025)
  • (2025) D3: Differential Testing of Distributed Deep Learning With Model Generation. IEEE Transactions on Software Engineering 51(1), 38-52. https://doi.org/10.1109/TSE.2024.3461657 (1 Jan 2025)
  • (2024) WhiteFox: White-Box Compiler Fuzzing Empowered by Large Language Models. Proceedings of the ACM on Programming Languages 8(OOPSLA2), 709-735. https://doi.org/10.1145/3689736 (8 Oct 2024)
  • (2024) History-Driven Fuzzing for Deep Learning Libraries. ACM Transactions on Software Engineering and Methodology 34(1), 1-29. https://doi.org/10.1145/3688838 (28 Dec 2024)
  • (2024) A PSO-based Method to Test Deep Learning Library at API Level. Proceedings of the 3rd International Conference on Computer, Artificial Intelligence and Control Engineering, 117-130. https://doi.org/10.1145/3672758.3672777 (26 Jan 2024)
  • (2024) A Miss Is as Good as A Mile: Metamorphic Testing for Deep Learning Operators. Proceedings of the ACM on Software Engineering 1(FSE), 2005-2027. https://doi.org/10.1145/3660796 (12 Jul 2024)
  • (2024) Contract-based Validation of Conceptual Design Bugs for Engineering Complex Machine Learning Software. Proceedings of the ACM/IEEE 27th International Conference on Model Driven Engineering Languages and Systems, 155-161. https://doi.org/10.1145/3652620.3688201 (22 Sep 2024)
  • (2024) Practitioners' Expectations on Automated Test Generation. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, 1618-1630. https://doi.org/10.1145/3650212.3680386 (11 Sep 2024)
  • (2024) Large Language Models Can Connect the Dots: Exploring Model Optimization Bugs with Domain Knowledge-Aware Prompts. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, 1579-1591. https://doi.org/10.1145/3650212.3680383 (11 Sep 2024)
  • (2024) Towards More Complete Constraints for Deep Learning Library Testing via Complementary Set Guided Refinement. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, 1338-1350. https://doi.org/10.1145/3650212.3680364 (11 Sep 2024)
