Neuron Semantic-Guided Test Generation for Deep Neural Networks Fuzzing

Published: 30 December 2024

Abstract

In recent years, significant progress has been made in testing methods for deep neural networks (DNNs) to ensure their correctness and robustness. Coverage-guided criteria, such as neuron-wise, layer-wise, and path-/trace-wise criteria, have been proposed for DNN fuzzing. However, existing coverage-based criteria encounter performance bottlenecks for several reasons: ❶ Testing adequacy: some neuron coverage criteria can be saturated (i.e., reach full coverage) with only a small number of test inputs; once saturated, adding further test inputs does not consistently improve model quality. ❷ Interpretability: current coverage criteria lack interpretability, so testers cannot identify or understand which incorrect attributes or patterns of the model a test input triggers, which hampers the subsequent debugging and fixing process. Therefore, there is an urgent need for a novel fuzzing criterion that offers improved testing adequacy, better interpretability, and more effective failure detection for DNNs.
To alleviate these limitations, we propose NSGen, an approach for DNN fuzzing that uses neuron semantics as guidance during test generation. NSGen identifies critical neurons, translates their high-level semantic features into natural language descriptions, and assembles these descriptions into human-readable DNN decision paths that represent the DNN's internal decision logic. Guided by these decision paths, NSGen generates more fault-revealing test inputs by quantifying the similarity between original and mutated test inputs during fuzzing. We evaluate NSGen on popular DNN models (VGG16_BN, ResNet50, and MobileNet_v2) with the CIFAR10, CIFAR100, Oxford 102 Flower, and ImageNet datasets. Compared against 12 existing coverage-guided fuzzing criteria, NSGen outperforms all baselines, increasing the number of triggered faults by 21.4% to 61.2% over the state-of-the-art coverage-guided fuzzing criterion. This demonstrates NSGen's effectiveness in generating fault-revealing test inputs through guided input mutation and highlights its potential to improve DNN testing and interpretability.
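The similarity-guided selection step described above can be sketched as follows. This is a minimal illustration, not NSGen's actual implementation: it assumes a decision path has already been reduced to a set of natural-language concept descriptions for the activated critical neurons, and the function names and the Jaccard score are our own illustrative choices.

```python
# Hypothetical sketch of semantic-similarity-guided mutant ranking for DNN
# fuzzing. All names are illustrative, not NSGen's exact algorithm.

def decision_path(concepts):
    """A 'decision path' here is simply the set of natural-language concepts
    attached to the critical neurons that an input activates."""
    return frozenset(concepts)

def path_similarity(path_a, path_b):
    """Jaccard overlap between two decision paths (1.0 = identical logic)."""
    if not path_a and not path_b:
        return 1.0
    return len(path_a & path_b) / len(path_a | path_b)

def rank_mutants(original_path, mutant_paths):
    """Prefer mutants whose decision path diverges most from the original:
    low similarity suggests the mutation steered the DNN onto different
    internal logic and is therefore more likely to reveal a fault."""
    return sorted(mutant_paths, key=lambda p: path_similarity(original_path, p))

# Toy usage: a 'dog' seed input versus two mutated inputs.
seed = decision_path({"floppy ears", "fur texture", "snout"})
m1 = decision_path({"floppy ears", "fur texture", "grass"})  # mild drift
m2 = decision_path({"metal grid", "wheel", "grass"})         # strong drift
ranked = rank_mutants(seed, [m1, m2])
# ranked[0] is the mutant with the largest semantic divergence (here m2).
```

A real pipeline would derive the concept sets from neuron-description tools and could substitute an embedding-based similarity for the set overlap used here.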


Published In

ACM Transactions on Software Engineering and Methodology, Volume 34, Issue 1
January 2025
967 pages
EISSN:1557-7392
DOI:10.1145/3703005
Editor: Mauro Pezze

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 30 December 2024
Online AM: 14 August 2024
Accepted: 28 July 2024
Revised: 29 May 2024
Received: 06 November 2023
Published in TOSEM Volume 34, Issue 1


Author Tags

  1. Deep learning testing
  2. test input generation
  3. fuzzing

Qualifiers

  • Research-article

Funding Sources

  • National Natural Science Foundation of China
  • Chongqing Technology Innovation and Application Development Project
  • Natural Science Foundation of Chongqing
  • Fundamental Research Funds for the Central Universities
