Deeper Notions of Correctness in Image-Based DNNs: Lifting Properties from Pixel to Entities

Authors:

Sebastian Elbaum,

Matthew B. Dwyer

ESEC/FSE 2023: Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering

Pages 2122 - 2126

https://doi.org/10.1145/3611643.3613079

Published: 30 November 2023

Abstract

Deep Neural Networks (DNNs) that process images are widely used for many safety-critical tasks, from autonomous vehicles to medical diagnosis. Currently, DNN correctness properties are defined at the pixel level over the entire input. Such properties are useful for exposing system failures related to sensor noise or adversarial attacks, but they cannot capture features that are relevant to domain-specific entities and that reflect richer types of behavior. To overcome this limitation, we envision the specification of properties based on the entities that may be present in the input image, capturing their semantics and how they change. Creating such properties today is difficult because it requires determining where the entities appear in images, defining how each entity can change, and writing a specification that is compatible with each particular V&V client. We introduce an initial framework, structured around those challenges, that assists in automatically generating domain-specific entity-based properties by leveraging object detection models to identify entities in images and creating properties based on entity features. Our feasibility study provides initial evidence that the new properties can uncover interesting system failures; for example, changes in skin color can modify the output of a gender classification network. We conclude by analyzing the framework's potential to implement the vision and by outlining directions for future work.
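
The idea of an entity-based property can be illustrated with a minimal sketch. It is not the paper's framework. It only shows the general shape of such a check: change the pixels of one entity and test whether the model output stays the same. The names `recolor_entity`, `find_violations`, and `toy_model` are hypothetical, the mask is written by hand where the framework would rely on an object detection model, and the palette values are arbitrary.

```python
from typing import Callable, List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each in 0..255
Image = List[List[Pixel]]      # rows of pixels
Mask = List[List[bool]]        # True where the entity is present


def recolor_entity(image: Image, mask: Mask, color: Pixel) -> Image:
    """Return a copy of `image` in which only the masked pixels take `color`."""
    return [
        [color if inside else pixel for pixel, inside in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]


def find_violations(
    model: Callable[[Image], str],
    image: Image,
    mask: Mask,
    palette: List[Pixel],
) -> List[Pixel]:
    """Return the palette colors for which the model output differs from
    its output on the unmodified image."""
    baseline = model(image)
    return [
        color
        for color in palette
        if model(recolor_entity(image, mask, color)) != baseline
    ]


def toy_model(image: Image) -> str:
    """Hypothetical stand-in for a classifier: thresholds the mean red value."""
    reds = [pixel[0] for row in image for pixel in row]
    return "A" if sum(reds) / len(reds) > 100 else "B"


# Assumed toy input: a 2x2 gray image whose top row is the "entity".
image = [[(120, 120, 120), (120, 120, 120)],
         [(120, 120, 120), (120, 120, 120)]]
mask = [[True, True],
        [False, False]]
palette = [(255, 219, 172), (60, 40, 30)]  # arbitrary illustrative colors

print(find_violations(toy_model, image, mask, palette))
```

In this toy setting the darker color flips the output of `toy_model` while the lighter one does not, so the property "the output is invariant to the entity's color" is violated for one palette entry. A real V&V client would replace the enumeration over a palette with its own falsification or verification procedure.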

Index Terms

Deeper Notions of Correctness in Image-Based DNNs: Lifting Properties from Pixel to Entities
1. Software and its engineering
  1. Software creation and management
    1. Software verification and validation
  2. Software organization and properties
    1. Software functional properties
      1. Correctness

Published In

ESEC/FSE 2023: Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering

November 2023

2215 pages

ISBN: 9798400703270

DOI: 10.1145/3611643

General Chair: Satish Chandra (Google, USA)

Program Chairs: Kelly Blincoe (University of Auckland, New Zealand), Paolo Tonella (USI Lugano, Switzerland)

Copyright © 2023 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

SIGSOFT: ACM Special Interest Group on Software Engineering

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

NSF (National Science Foundation)

Conference

ESEC/FSE '23: 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering

December 3 - 9, 2023

San Francisco, CA, USA

Acceptance Rates

Overall Acceptance Rate 112 of 543 submissions, 21%
