A Unified Approach to Interpreting Model Predictions

Lundberg, Scott; Lee, Su-In

arXiv:1705.07874 (cs)

[Submitted on 22 May 2017 (v1), last revised 25 Nov 2017 (this version, v2)]

Abstract: Understanding why a model makes a certain prediction can be as crucial as the prediction's accuracy in many applications. However, the highest accuracy for large modern datasets is often achieved by complex models that even experts struggle to interpret, such as ensemble or deep learning models, creating a tension between accuracy and interpretability. In response, various methods have recently been proposed to help users interpret the predictions of complex models, but it is often unclear how these methods are related and when one method is preferable over another. To address this problem, we present a unified framework for interpreting predictions, SHAP (SHapley Additive exPlanations). SHAP assigns each feature an importance value for a particular prediction. Its novel components include: (1) the identification of a new class of additive feature importance measures, and (2) theoretical results showing there is a unique solution in this class with a set of desirable properties. The new class unifies six existing methods, notable because several recent methods in the class lack the proposed desirable properties. Based on insights from this unification, we present new methods that show improved computational performance and/or better consistency with human intuition than previous approaches.

Comments: To appear in NIPS 2017
Subjects: Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as: arXiv:1705.07874 [cs.AI] (or arXiv:1705.07874v2 [cs.AI] for this version)
https://doi.org/10.48550/arXiv.1705.07874

Submission history

From: Scott Lundberg
[v1] Mon, 22 May 2017 17:38:10 UTC (2,350 KB)
[v2] Sat, 25 Nov 2017 03:53:32 UTC (4,352 KB)
