DOI: 10.1145/3411764.3445315

Manipulating and Measuring Model Interpretability

Published: 07 May 2021

Abstract

With machine learning models being increasingly used to aid decision making even in high-stakes domains, there has been a growing interest in developing interpretable models. Although many supposedly interpretable models have been proposed, there have been relatively few experimental studies investigating whether these models achieve their intended effects, such as making people more closely follow a model’s predictions when it is beneficial for them to do so or enabling them to detect when a model has made a mistake. We present a sequence of pre-registered experiments (N = 3,800) in which we showed participants functionally identical models that varied only in two factors commonly thought to make machine learning models more or less interpretable: the number of features and the transparency of the model (i.e., whether the model internals are clear or black box). Predictably, participants who saw a clear model with few features could better simulate the model’s predictions. However, we did not find that participants more closely followed its predictions. Furthermore, showing participants a clear model meant that they were less able to detect and correct for the model’s sizable mistakes, seemingly due to information overload. These counterintuitive findings emphasize the importance of testing over intuition when developing interpretable models.
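To make the manipulated factors concrete, the following minimal Python sketch is purely illustrative: the feature names, weights, and presentation format are hypothetical and not drawn from the study materials. It shows how a single prediction function can be presented with few or many features, and with or without visible internals, while remaining functionally identical across conditions.

```python
# Illustrative sketch only (hypothetical names and weights, not the study's materials):
# the same underlying prediction function can be presented under different
# "interpretability" conditions -- varying how many features are surfaced and whether
# the model internals (weights) are visible -- while its predictions stay identical.

FEATURE_WEIGHTS = {        # hypothetical linear-model weights
    "bedrooms": 350.0,
    "bathrooms": 250.0,
    "square_feet": 1.2,
    "floor": 40.0,
}
INTERCEPT = 600.0

def predict(example: dict) -> float:
    """Identical in every condition: a plain linear model."""
    return INTERCEPT + sum(w * example.get(f, 0.0) for f, w in FEATURE_WEIGHTS.items())

def present(example: dict, n_features: int, transparent: bool) -> str:
    """Vary only the presentation: number of features shown, clear vs. black box."""
    shown = list(FEATURE_WEIGHTS)[:n_features]
    lines = [f"{f} = {example.get(f, 0.0)}" for f in shown]
    if transparent:  # "clear" condition: expose the model internals
        lines += [f"weight({f}) = {FEATURE_WEIGHTS[f]}" for f in shown]
    lines.append(f"model prediction: {predict(example):,.0f}")
    return "\n".join(lines)

if __name__ == "__main__":
    apt = {"bedrooms": 2, "bathrooms": 1, "square_feet": 900, "floor": 3}
    print(present(apt, n_features=2, transparent=True))    # clear, few features
    print(present(apt, n_features=4, transparent=False))   # black box, more features
```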

Supplementary Material

Supplementary Materials (3411764.3445315_supplementalmaterials.zip)


Information

Published In

CHI '21: Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems
May 2021
10862 pages
ISBN:9781450380966
DOI:10.1145/3411764
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 07 May 2021


Author Tags

  1. human-centered machine learning
  2. interpretability
  3. machine-assisted decision making

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

CHI '21

Acceptance Rates

Overall Acceptance Rate 6,199 of 26,314 submissions, 24%

Article Metrics

  • Downloads (last 12 months): 889
  • Downloads (last 6 weeks): 71
Reflects downloads up to 29 Jan 2025

Cited By

  • Towards Cleaner Cities: Estimating Vehicle-Induced PM2.5 with Hybrid EBM-CMA-ES Modeling. Toxics 12(11), 827. Online publication date: 19-Nov-2024. https://doi.org/10.3390/toxics12110827
  • Uncertainty in XAI: Human Perception and Modeling Approaches. Machine Learning and Knowledge Extraction 6(2), 1170–1192. Online publication date: 27-May-2024. https://doi.org/10.3390/make6020055
  • An Overview of the Empirical Evaluation of Explainable AI (XAI): A Comprehensive Guideline for User-Centered Evaluation in XAI. Applied Sciences 14(23), 11288. Online publication date: 3-Dec-2024. https://doi.org/10.3390/app142311288
  • Towards reconciling usability and usefulness of policy explanations for sequential decision-making systems. Frontiers in Robotics and AI 11. Online publication date: 22-Jul-2024. https://doi.org/10.3389/frobt.2024.1375490
  • Humans in XAI: increased reliance in decision-making under uncertainty by using explanation strategies. Frontiers in Behavioral Economics 3. Online publication date: 8-Mar-2024. https://doi.org/10.3389/frbhe.2024.1377075
  • Are logistic models really interpretable? Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 367–375. Online publication date: 3-Aug-2024. https://doi.org/10.24963/ijcai.2024/41
  • Putting a human in the loop: Increasing uptake, but decreasing accuracy of automated decision-making. PLOS ONE 19(2), e0298037. Online publication date: 9-Feb-2024. https://doi.org/10.1371/journal.pone.0298037
  • Rore: robust and efficient antioxidant protein classification via a novel dimensionality reduction strategy based on learning of fewer features. Genomics & Informatics 22(1). Online publication date: 4-Dec-2024. https://doi.org/10.1186/s44342-024-00026-z
  • Should AI models be explainable to clinicians? Critical Care 28(1). Online publication date: 12-Sep-2024. https://doi.org/10.1186/s13054-024-05005-y
  • The Pop-Out Effect of Rarer Occurring Stimuli Shapes the Effectiveness of AI Explainability. Proceedings of the Human Factors and Ergonomics Society Annual Meeting 68(1), 352–358. Online publication date: 13-Aug-2024. https://doi.org/10.1177/10711813241261284
