
Interventional SHAP values and interaction values for piecewise linear regression trees

Published: 07 February 2023
Abstract

    In recent years, game-theoretic Shapley values have gained increasing attention for local model explanation by feature attributions. While the Shapley-value approach is model-independent, its exact computation is usually intractable, so efficient model-specific algorithms have been devised, including approaches for decision trees and their ensembles. Our work goes further in this direction by extending the interventional TreeSHAP algorithm to piecewise linear regression trees, which have themselves gained attention in recent years. To this end, we introduce a decomposition of the contribution function based on decision paths, which allows a more comprehensible formulation of SHAP algorithms for tree-based models. Our algorithm can also be readily applied to computing SHAP interaction values for these models. In particular, as the main contribution of this paper, we provide a more efficient approach to interventional SHAP for tree-based models by precomputing statistics of the background data based on the tree structure.
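    As a rough illustration of the interventional value function underlying this line of work (not the paper's efficient algorithm, which precisely avoids this exponential enumeration), the sketch below computes exact interventional SHAP values by brute force for a hypothetical depth-1 model tree with linear leaves. The model `f`, the background data, and all names here are illustrative assumptions, not taken from the paper.

    ```python
    from itertools import combinations
    from math import factorial

    def f(x):
        # Hypothetical toy "model tree": one split on x[0], linear leaves.
        return 1.0 + 2.0 * x[0] if x[0] <= 0 else 3.0 * x[1]

    def interventional_value(S, x, background):
        # v(S): average model output when features in S are fixed to x's
        # values and the remaining features are drawn from the background
        # data (interventional / marginal conditioning).
        total = 0.0
        for b in background:
            z = [x[i] if i in S else b[i] for i in range(len(x))]
            total += f(z)
        return total / len(background)

    def shapley(x, background):
        # Exact Shapley values by enumerating all coalitions: O(2^n) calls,
        # feasible only for tiny n; TreeSHAP-style algorithms exploit the
        # tree structure to avoid this blow-up.
        n = len(x)
        phi = [0.0] * n
        for i in range(n):
            others = [j for j in range(n) if j != i]
            for k in range(n):
                for S in combinations(others, k):
                    w = factorial(k) * factorial(n - k - 1) / factorial(n)
                    phi[i] += w * (interventional_value(set(S) | {i}, x, background)
                                   - interventional_value(set(S), x, background))
        return phi

    x = [1.0, 2.0]
    background = [[-1.0, 0.0], [0.5, 1.0]]
    phi = shapley(x, background)

    # Local-accuracy check: attributions sum to f(x) minus the baseline,
    # i.e. the background-average prediction.
    base = sum(f(b) for b in background) / len(background)
    assert abs(sum(phi) + base - f(x)) < 1e-9
    ```

    For this toy setup the attributions come out to phi = [2.0, 3.0], summing to f(x) - base = 6.0 - 1.0; the paper's contribution is to obtain such values efficiently by precomputing background statistics along the tree's decision paths.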


    Published In

    AAAI'23/IAAI'23/EAAI'23: Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence
    February 2023
    16496 pages
    ISBN: 978-1-57735-880-0

    Sponsors

    • Association for the Advancement of Artificial Intelligence

    Publisher

    AAAI Press

