DOI: 10.1007/978-3-030-29908-8_4
Explaining Deep Learning Models with Constrained Adversarial Examples

Published: 26 August 2019

Abstract

Machine learning algorithms generally suffer from a problem of explainability. Given a classification result from a model, it is typically hard to determine what caused the decision to be made and to give an informative explanation. We explore a new method of generating counterfactual explanations, which, instead of explaining why a particular classification was made, explain how a different outcome can be achieved. This gives the recipients of the explanation a better way to understand the outcome, and provides an actionable suggestion. We show that the introduced method of Constrained Adversarial Examples (CADEX) can be used in real-world applications, and yields explanations which incorporate business or domain constraints, such as handling categorical attributes and range constraints.
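To make the idea concrete, the following is a minimal sketch (not the paper's actual implementation) of the general recipe behind constrained counterfactual search: take gradient steps on the input toward the desired class, while clipping each feature to an allowed range and leaving immutable (e.g. categorical) features untouched. A toy logistic-regression model with hypothetical weights stands in for the deep network.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def counterfactual(x, w, b, target=1, lr=0.5, steps=200,
                   bounds=None, frozen=()):
    """Gradient-based counterfactual search with domain constraints.

    Moves x until the linear-logistic model predicts `target`, while:
    - clipping feature i to its (lo, hi) range from `bounds`, and
    - never modifying feature indices listed in `frozen`.
    """
    x = list(x)
    for _ in range(steps):
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
        if (p > 0.5) == (target == 1):
            break  # crossed the decision boundary: counterfactual found
        for i in range(len(x)):
            if i in frozen:
                continue  # immutable feature (e.g. a categorical attribute)
            # gradient of cross-entropy w.r.t. x_i is (p - target) * w_i
            x[i] -= lr * (p - target) * w[i]
            if bounds is not None:
                lo, hi = bounds[i]
                x[i] = min(max(x[i], lo), hi)  # enforce range constraint
    return x

# Example: flip class 0 -> 1 while feature 1 stays fixed at its value.
x_cf = counterfactual([-1.0, 1.0], w=[1.0, -2.0], b=0.0, target=1,
                      frozen={1}, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
```

The returned `x_cf` differs from the input only in the mutable features, which is what makes it an actionable suggestion rather than an arbitrary adversarial perturbation.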


Cited By

  • (2024) Counterfactual Explanation of Shapley Value in Data Coalitions. Proceedings of the VLDB Endowment 17(11), 3332–3345. DOI: 10.14778/3681954.3682004
  • (2023) Achieving Diversity in Counterfactual Explanations: a Review and Discussion. Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, 1859–1869. DOI: 10.1145/3593013.3594122
  • (2021) Model-Based Counterfactual Synthesizer for Interpretation. Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, 1964–1974. DOI: 10.1145/3447548.3467333

Published In

PRICAI 2019: Trends in Artificial Intelligence: 16th Pacific Rim International Conference on Artificial Intelligence, Cuvu, Yanuca Island, Fiji, August 26–30, 2019, Proceedings, Part I
Aug 2019
788 pages
ISBN:978-3-030-29907-1
DOI:10.1007/978-3-030-29908-8
  • Editors:
  • Abhaya C. Nayak,
  • Alok Sharma

Publisher

Springer-Verlag

Berlin, Heidelberg

Author Tags

  1. Explainable AI
  2. Adversarial examples
  3. Counterfactual explanations
