
Causal generative explainers using counterfactual inference: a case study on the Morpho-MNIST dataset

Original Article · Published in Pattern Analysis and Applications

Abstract

In this paper, we propose leveraging causal generative learning as an interpretable tool for explaining image classifiers. Specifically, we present a generative counterfactual inference approach to study the influence of visual features (pixels) as well as causal factors through generative learning. To this end, we first uncover the pixels most influential on a classifier’s decision by computing both Shapley and contrastive explanations for counterfactual images with different attribute values. We then establish a Monte Carlo mechanism using the generator of a causal generative model in order to adapt Shapley explainers to produce feature importances for the human-interpretable attributes of a causal dataset. This method is applied to the case where a classifier has been trained exclusively on the images of the causal dataset. Finally, we present optimization methods for creating counterfactual explanations of classifiers by means of counterfactual inference, proposing straightforward approaches for both differentiable and arbitrary classifiers. We use the Morpho-MNIST causal dataset as a case study for exploring our proposed methods of generating counterfactual explanations; however, our methods are also applicable to other causal datasets containing image data. We employ visual explanation methods from the OmnixAI open-source toolkit as a point of comparison for our proposed methods. Using quantitative metrics to measure the interpretability of counterfactual explanations, we find that our proposed methods offer more interpretable explanations than those generated by OmnixAI, suggesting that our methods are well suited for generating highly interpretable counterfactual explanations on causal datasets.
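To make the attribute-level Shapley idea concrete, the following is a minimal, self-contained sketch (our illustration, not the authors' released code; see the repository linked below for the real implementation). It assumes a causal generator mapping interpretable attributes plus exogenous noise to images, and a classifier over those images; both are replaced here by random stand-ins purely so the example runs end to end. The classifier's expected output is estimated by Monte Carlo over the generator noise, and the resulting attribute-level function is handed to a model-agnostic Shapley explainer from the shap package.

    # Monte Carlo wrapper for attribute-level Shapley values (illustrative
    # sketch only; the generator, classifier, and dimensions are hypothetical).
    import numpy as np
    import shap

    rng = np.random.default_rng(0)
    N_ATTRS, NOISE_DIM, IMG_DIM, N_CLASSES, N_MC = 3, 16, 64, 10, 32

    # Stand-in causal generator (attributes + noise -> "image") and classifier.
    W_a = rng.normal(size=(N_ATTRS, IMG_DIM))
    W_z = rng.normal(size=(NOISE_DIM, IMG_DIM))
    W_c = rng.normal(size=(IMG_DIM, N_CLASSES))

    def generator(attrs, z):
        return np.tanh(attrs @ W_a + z @ W_z)

    def classifier(imgs):
        logits = imgs @ W_c
        e = np.exp(logits - logits.max(axis=1, keepdims=True))
        return e / e.sum(axis=1, keepdims=True)  # softmax probabilities

    def attribute_model(attrs):
        """Monte Carlo estimate of E_z[classifier(generator(a, z))] per row."""
        out = []
        for a in np.atleast_2d(attrs):
            z = rng.normal(size=(N_MC, NOISE_DIM))           # sample exogenous noise
            probs = classifier(generator(np.tile(a, (N_MC, 1)), z))
            out.append(probs.mean(axis=0))                   # average over noise
        return np.stack(out)

    background = rng.normal(size=(20, N_ATTRS))   # reference attribute vectors
    explainer = shap.KernelExplainer(attribute_model, background)
    shap_values = explainer.shap_values(rng.normal(size=(5, N_ATTRS)))

The returned values attribute the classifier's noise-averaged prediction to each interpretable attribute (e.g. thickness, intensity, or slant for Morpho-MNIST) rather than to individual pixels.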
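For the counterfactual-explanation optimization, here is a hedged sketch of the differentiable-classifier case (again our illustration under stated assumptions, not the paper's exact procedure). Holding the exogenous noise fixed, in the spirit of counterfactual inference, gradient descent searches for attribute values that move the classifier's prediction to a target class while penalizing distance from the original attributes; the tiny networks and the weight lam are hypothetical stand-ins.

    # Gradient-based counterfactual search over causal attributes (illustrative
    # sketch; random networks stand in for the trained generator/classifier).
    import torch

    torch.manual_seed(0)
    N_ATTRS, NOISE_DIM, IMG_DIM, N_CLASSES = 3, 16, 64, 10

    gen = torch.nn.Linear(N_ATTRS + NOISE_DIM, IMG_DIM)   # stand-in generator
    clf = torch.nn.Linear(IMG_DIM, N_CLASSES)             # stand-in classifier

    a_orig = torch.randn(1, N_ATTRS)   # attributes of the factual input
    z = torch.randn(1, NOISE_DIM)      # exogenous noise, held fixed (abduction)
    target = torch.tensor([7])         # desired counterfactual class

    a_cf = a_orig.clone().requires_grad_(True)
    opt = torch.optim.Adam([a_cf], lr=0.05)
    lam = 0.1                          # proximity weight (hypothetical value)

    for step in range(200):
        opt.zero_grad()
        img = torch.tanh(gen(torch.cat([a_cf, z], dim=1)))  # counterfactual image
        loss = torch.nn.functional.cross_entropy(clf(img), target) \
               + lam * torch.sum((a_cf - a_orig) ** 2)      # stay near original
        loss.backward()
        opt.step()

For arbitrary (non-differentiable) classifiers, the same objective can instead be optimized with gradient-free search, e.g. random or evolutionary perturbations of the attribute vector.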



Data availability

The source code for our experiments is available in a public GitHub repository at https://github.com/wtaylor17/CDGMExplainers. The Morpho-MNIST dataset used in this work is available online at https://github.com/dccastro/Morpho-MNIST.

Notes

  1. https://github.com/wtaylor17/CDGMExplainers

  2. The original MNIST dataset may be downloaded from http://yann.lecun.com/exdb/mnist.

  3. https://github.com/shap/shap/blob/master/shap/explainers/_gradient.py



Funding

This work was supported by the Natural Sciences and Engineering Research Council of Canada [grant number 550722].

Author information


Corresponding author

Correspondence to Will Taylor-Melanson.

Ethics declarations

Ethical approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Taylor-Melanson, W., Sadeghi, Z. & Matwin, S. Causal generative explainers using counterfactual inference: a case study on the Morpho-MNIST dataset. Pattern Anal Applic 27, 89 (2024). https://doi.org/10.1007/s10044-024-01306-8
