From local explanations to global understanding with explainable AI for trees

A preprint version of the article is available at arXiv.


Tree-based machine learning models such as random forests, decision trees and gradient boosted trees are popular nonlinear predictive models, yet comparatively little attention has been paid to explaining their predictions. Here we improve the interpretability of tree-based models through three main contributions: (1) a polynomial-time algorithm to compute optimal explanations based on game theory; (2) a new type of explanation that directly measures local feature interaction effects; and (3) a new set of tools for understanding global model structure based on combining many local explanations of each prediction. We apply these tools to three medical machine learning problems and show how combining many high-quality local explanations allows us to represent global structure while retaining local faithfulness to the original model. These tools enable us to (1) identify high-magnitude but low-frequency nonlinear mortality risk factors in the US population, (2) highlight distinct population subgroups with shared risk characteristics, (3) identify nonlinear interaction effects among risk factors for chronic kidney disease and (4) monitor a machine learning model deployed in a hospital by identifying which features are degrading the model’s performance over time. Given the popularity of tree-based machine learning models, these improvements to their interpretability have implications across a broad set of domains.
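To make the game-theoretic idea concrete, the following is a minimal sketch of Shapley value attributions for a toy tree, followed by a simple global summary built from many local explanations. It is a brute-force enumeration over feature subsets, which is exponential in the number of features, so it is not the polynomial-time algorithm described in the article. The tree `tree_predict`, the helper names and the eight-row background dataset are all hypothetical, and the sketch assumes that features outside a coalition are averaged over the background rows.

```python
from itertools import combinations
from math import factorial


def tree_predict(x):
    """Hypothetical hand-written decision tree over three binary features."""
    if x[0] > 0.5:
        return 4.0 if x[1] > 0.5 else 2.0
    return 1.0 if x[2] > 0.5 else 0.0


def coalition_value(f, x, background, subset):
    """Mean output when features in `subset` come from x and the rest from background rows."""
    total = 0.0
    for b in background:
        mixed = [x[i] if i in subset else b[i] for i in range(len(x))]
        total += f(mixed)
    return total / len(background)


def shapley_values(f, x, background):
    """Exact Shapley values by enumerating every coalition (exponential cost)."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Classic Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                gain = (coalition_value(f, x, background, s | {i})
                        - coalition_value(f, x, background, s))
                phi[i] += weight * gain
    return phi


# Assumed background data: all eight combinations of three binary features.
background = [[a, b, c] for a in (0, 1) for b in (0, 1) for c in (0, 1)]

# Expected model output when no feature is known.
base_value = coalition_value(tree_predict, background[0], background, set())

# One local explanation per row.
local = [shapley_values(tree_predict, row, background) for row in background]

# A simple global summary: mean absolute attribution per feature.
global_importance = [
    sum(abs(phi[i]) for phi in local) / len(local) for i in range(3)
]

for row, phi in zip(background, local):
    print(row, [round(p, 3) for p in phi])
print("global importance:", [round(g, 3) for g in global_importance])
```

For every row, the attributions plus `base_value` sum to the tree's prediction, which is the sense in which each local explanation stays faithful to the model. Averaging absolute attributions is only one of several ways to combine local explanations into a global view.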

Fig. 1: Local explanations based on TreeExplainer enable a wide variety of new ways to understand global model structure.
Fig. 2: Gradient boosted tree models can be more accurate than neural networks and more interpretable than linear models.
Fig. 3: Explanation method performance across 15 different evaluation metrics and 3 classification models in the chronic kidney disease dataset.
Fig. 4: By combining many local explanations, we can provide rich summaries of both an entire model and individual features.
Fig. 5: Monitoring plots reveal problems that would otherwise be invisible in a retrospective hospital machine learning model deployment.
Fig. 6: Local explanation embeddings support both supervised clustering and interpretable dimensionality reduction.

Data availability

The pre-processed mortality data are available at http://github.com/suinleelab/treexplainer-study. Privacy restrictions prevent the release of the hospital procedure-related data, and the kidney disease data are available only directly from the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK).

Code availability

Code supporting this paper is published online at https://github.com/suinleelab/treexplainer-study. A widely used Python implementation of TreeExplainer is available at https://github.com/slundberg/shap, and portions of it are included in the standard release of XGBoost (https://xgboost.ai), LightGBM (https://github.com/Microsoft/LightGBM) and CatBoost (https://catboost.ai).

Acknowledgements


We are grateful to R. Chen, A. Okeson, C. Robinson, V. Khotilovich, N. Hiranuma, J. Janizek, M. T. Ribeiro, J. Schreiber, P. Hall and members of S.-I.L.’s group for the feedback and assistance they provided during the development and preparation of this research. This work was funded by the National Science Foundation (DBI-1759487, DBI-1552309, DBI-1355899, DGE-1762114 and DGE-1256082), American Cancer Society (127332-RSG-15-097-01-TBG), National Institutes of Health (R35 GM 128638 and R01 NIA AG 061132), and an unrestricted gift from the Northwest Kidney Centers to the University of Washington Kidney Research Institute. The Chronic Renal Insufficiency Cohort (CRIC) study was conducted by the CRIC investigators and supported by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK). The data from the CRIC study reported here were supplied by the NIDDK Central Repositories. This manuscript was not prepared in collaboration with Investigators of the CRIC study and does not necessarily reflect the opinions or views of the CRIC study, the NIDDK Central Repositories or the NIDDK.

Author information

Contributions



S.M.L. and S.-I.L. conceived the study. S.M.L. designed algorithms, designed visualizations, designed metrics, ran experiments and contributed to the writing. G.E. ran experiments, designed visualizations and contributed to the writing. H.C. designed algorithms, ran experiments and contributed to the writing. A.D. performed dataset creation. R.K., J.H. and N.B. selected datasets, vetted models and defined the chronic kidney disease prediction problem. J.M.P., B.N., R.K., J.H. and N.B. each contributed writing and helped procure and interpret datasets. S.-I.L. supervised research and method development, and contributed to the writing.

Corresponding author

Correspondence to Su-In Lee.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Supplementary information

Supplementary Figs, methods and references.

Supplementary Data 1

Cite this article

Lundberg, S.M., Erion, G., Chen, H. et al. From local explanations to global understanding with explainable AI for trees. Nat Mach Intell 2, 56–67 (2020). https://doi.org/10.1038/s42256-019-0138-9


