Perturbation-based methods for explaining deep neural networks: A survey

Published: 01 October 2021

Highlights

Overview of the black box problem in DNNs and XAI research.
Survey of perturbation-based attribution methods for different input data types.
Future directions for research on the perturbation paradigm.

Abstract

Deep neural networks (DNNs) have achieved state-of-the-art results in a broad range of tasks, in particular those dealing with perceptual data. However, full-scale application of DNNs in safety-critical areas is hindered by their black-box nature, which keeps their inner workings opaque. In response to the black box problem, the field of explainable artificial intelligence (XAI) has recently emerged and is growing rapidly. The present survey is concerned with perturbation-based XAI methods, which explore DNN models by perturbing their input and observing changes in the output. We present an overview of the most recent research, focusing on the differences and similarities in how perturbation-based methods are applied to different data types, from the extensively studied perturbation of images to the emerging research on perturbing video, natural language, software code, and reinforcement learning entities.
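
As an illustration of the basic perturbation paradigm described in the abstract, the sketch below computes an occlusion-style attribution map for an image classifier: patches of the input are masked one at a time, and the drop in the target-class score is taken as the importance of the occluded region. It is a minimal illustrative example under stated assumptions, not a method from the survey; the black-box model_fn callable, the patch size, and the baseline fill value are hypothetical placeholders.

import numpy as np

def occlusion_attribution(image, model_fn, target_class, patch=8, baseline=0.0):
    # image: (H, W, C) float array.
    # model_fn: callable taking a (1, H, W, C) batch and returning class
    # probabilities of shape (1, num_classes); a hypothetical black-box
    # classifier, not a specific library API.
    h, w, _ = image.shape
    base_score = model_fn(image[None])[0, target_class]
    saliency = np.zeros((h, w), dtype=np.float32)
    for y in range(0, h, patch):
        for x in range(0, w, patch):
            perturbed = image.copy()
            # Occlude one patch with the baseline value and re-run the model.
            perturbed[y:y + patch, x:x + patch, :] = baseline
            score = model_fn(perturbed[None])[0, target_class]
            # A large drop in the target-class score marks an important region.
            saliency[y:y + patch, x:x + patch] = base_score - score
    return saliency

Most image-domain perturbation methods can be read as refinements of this loop: smarter choices of the perturbation itself (blur, noise, learned or extremal masks), of the sampling strategy, and of how the observed score changes are aggregated into an attribution map.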




      Published In

      Pattern Recognition Letters, Volume 150, Issue C
      October 2021
      313 pages

      Publisher

      Elsevier Science Inc.

      United States


      Author Tags

      1. Deep learning
      2. Explainable artificial intelligence
      3. Perturbation-based methods


      Qualifiers

      • Rapid-communication

