
FACET: Robust Counterfactual Explanation Analytics

Published: 12 December 2023

Abstract

Machine learning systems are deployed in domains such as hiring and healthcare, where undesired classifications can have serious ramifications for the user. Thus, there is a rising demand for explainable AI systems which provide actionable steps for lay users to obtain their desired outcome. To meet this need, we propose FACET, the first explanation analytics system which supports a user in interactively refining counterfactual explanations for decisions made by tree ensembles. As FACET's foundation, we design a novel type of counterfactual explanation called the counterfactual region. Unlike traditional counterfactuals, FACET's regions concisely describe portions of the feature space where the desired outcome is guaranteed, regardless of variations in exact feature values. This property, which we coin explanation robustness, is critical for the practical application of counterfactuals. We develop a rich set of novel explanation analytics queries which empower users to identify personalized counterfactual regions that account for their real-world circumstances. To process these queries, we develop a compact high-dimensional counterfactual region index along with index-aware query processing strategies for near real-time explanation analytics. We evaluate FACET against state-of-the-art explanation techniques on eight public benchmark datasets and demonstrate that FACET generates actionable explanations of similar quality in an order of magnitude less time while providing critical robustness guarantees. Finally, we conduct a preliminary user study which suggests that FACET's regions lead to higher user understanding than traditional counterfactuals.
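The counterfactual regions described in the abstract can be pictured as axis-aligned boxes in feature space: any point inside the box receives the desired outcome, regardless of its exact coordinates. The following is a minimal illustrative sketch of that idea only (hypothetical names and values, not FACET's actual implementation or index):

```python
# Hypothetical sketch: a counterfactual region as a hyper-rectangle of
# per-feature intervals. A point "satisfies" the region if every feature
# value falls inside its interval, so the desired outcome is robust to
# small variations in exact feature values within the box.
from dataclasses import dataclass


@dataclass
class CounterfactualRegion:
    lower: list  # per-feature lower bounds
    upper: list  # per-feature upper bounds

    def contains(self, x: list) -> bool:
        # True iff x lies inside the box on every dimension
        return all(lo <= v <= hi for lo, v, hi in zip(self.lower, x, self.upper))


# Example region over two features, e.g. income in [50k, 80k], debt in [0, 10k]
region = CounterfactualRegion(lower=[50_000, 0], upper=[80_000, 10_000])
print(region.contains([60_000, 5_000]))  # True: inside the region
print(region.contains([40_000, 5_000]))  # False: income below the lower bound
```

Unlike a single counterfactual point, every point in such a region is a valid counterfactual, which is the robustness property the abstract refers to.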

Supplemental Material

  • MP4 File: presentation video
  • PPTX File: presentation slides
  • PPTX File: poster


Cited By

  • MetaStore: Analyzing Deep Learning Meta-Data at Scale. Proceedings of the VLDB Endowment 17, 6 (2024), 1446–1459. https://doi.org/10.14778/3648160.3648182 (published 3 May 2024)
  • Actionable Recourse for Automated Decisions: Examining the Effects of Counterfactual Explanation Type and Presentation on Lay User Understanding. Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency (2024), 1682–1700. https://doi.org/10.1145/3630106.3658997 (published 3 June 2024)


Published In

Proceedings of the ACM on Management of Data (PACMMOD), Volume 1, Issue 4
December 2023, 1317 pages
EISSN: 2836-6573
DOI: 10.1145/3637468

    Publisher

    Association for Computing Machinery

    New York, NY, United States



    Author Tags

    1. counterfactual explanation
    2. explainable AI
    3. gradient boosting ensembles
    4. interpretable machine learning
    5. random forest

    Qualifiers

    • Research-article

Article Metrics

    • Downloads (last 12 months): 240
    • Downloads (last 6 weeks): 34

    Reflects downloads up to 17 Oct 2024

