DOI: 10.1007/978-3-030-29908-8_4
Explaining Deep Learning Models with Constrained Adversarial Examples

Published: 26 August 2019

Abstract

Machine learning algorithms generally suffer from a problem of explainability. Given a classification result from a model, it is typically hard to determine what caused the decision to be made and to give an informative explanation. We explore a new method of generating counterfactual explanations, which, instead of explaining why a particular classification was made, explain how a different outcome can be achieved. This gives the recipients of the explanation a better way to understand the outcome, and provides an actionable suggestion. We show that the introduced method of Constrained Adversarial Examples (CADEX) can be used in real-world applications, and yields explanations which incorporate business or domain constraints, such as handling categorical attributes and range constraints.
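To make the idea concrete, the following is a minimal sketch (not the paper's actual implementation) of the general recipe behind constrained counterfactual search: take gradient steps on the input toward the desired class, while clipping each feature to an allowed range and leaving immutable (e.g. categorical) features untouched. A toy logistic-regression model with hypothetical weights stands in for the deep network.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def counterfactual(x, w, b, target=1, lr=0.5, steps=200,
                   bounds=None, frozen=()):
    """Gradient-based counterfactual search with domain constraints.

    Moves x until the linear-logistic model predicts `target`, while:
    - clipping feature i to its (lo, hi) range from `bounds`, and
    - never modifying feature indices listed in `frozen`.
    """
    x = list(x)
    for _ in range(steps):
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
        if (p > 0.5) == (target == 1):
            break  # crossed the decision boundary: counterfactual found
        for i in range(len(x)):
            if i in frozen:
                continue  # immutable feature (e.g. a categorical attribute)
            # gradient of cross-entropy w.r.t. x_i is (p - target) * w_i
            x[i] -= lr * (p - target) * w[i]
            if bounds is not None:
                lo, hi = bounds[i]
                x[i] = min(max(x[i], lo), hi)  # enforce range constraint
    return x

# Example: flip class 0 -> 1 while feature 1 stays fixed at its value.
x_cf = counterfactual([-1.0, 1.0], w=[1.0, -2.0], b=0.0, target=1,
                      frozen={1}, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
```

The returned `x_cf` differs from the input only in the mutable features, which is what makes it an actionable suggestion rather than an arbitrary adversarial perturbation.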


Cited By

  • (2024) Counterfactual Explanation of Shapley Value in Data Coalitions. Proceedings of the VLDB Endowment 17(11), 3332–3345. DOI: 10.14778/3681954.3682004
  • (2023) Achieving Diversity in Counterfactual Explanations: a Review and Discussion. Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, 1859–1869. DOI: 10.1145/3593013.3594122
  • (2021) Model-Based Counterfactual Synthesizer for Interpretation. Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, 1964–1974. DOI: 10.1145/3447548.3467333

Published In

PRICAI 2019: Trends in Artificial Intelligence: 16th Pacific Rim International Conference on Artificial Intelligence, Cuvu, Yanuca Island, Fiji, August 26–30, 2019, Proceedings, Part I
Aug 2019
788 pages
ISBN:978-3-030-29907-1
DOI:10.1007/978-3-030-29908-8
  • Editors:
  • Abhaya C. Nayak,
  • Alok Sharma

Publisher

Springer-Verlag

Berlin, Heidelberg

Author Tags

  1. Explainable AI
  2. Adversarial examples
  3. Counterfactual explanations
