DOI: 10.1145/3411764.3445315

Manipulating and Measuring Model Interpretability

Published: 07 May 2021

Abstract

With machine learning models being increasingly used to aid decision making even in high-stakes domains, there has been a growing interest in developing interpretable models. Although many supposedly interpretable models have been proposed, there have been relatively few experimental studies investigating whether these models achieve their intended effects, such as making people more closely follow a model’s predictions when it is beneficial for them to do so or enabling them to detect when a model has made a mistake. We present a sequence of pre-registered experiments (N = 3,800) in which we showed participants functionally identical models that varied only in two factors commonly thought to make machine learning models more or less interpretable: the number of features and the transparency of the model (i.e., whether the model internals are clear or black box). Predictably, participants who saw a clear model with few features could better simulate the model’s predictions. However, we did not find that participants more closely followed its predictions. Furthermore, showing participants a clear model meant that they were less able to detect and correct for the model’s sizable mistakes, seemingly due to information overload. These counterintuitive findings emphasize the importance of testing over intuition when developing interpretable models.
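To make the manipulated factors concrete, the following minimal Python sketch is purely illustrative: the feature names, weights, and presentation format are hypothetical and not drawn from the study materials. It shows how a single prediction function can be presented with few or many features, and with or without visible internals, while remaining functionally identical across conditions.

```python
# Illustrative sketch only (hypothetical names and weights, not the study's materials):
# the same underlying prediction function can be presented under different
# "interpretability" conditions -- varying how many features are surfaced and whether
# the model internals (weights) are visible -- while its predictions stay identical.

FEATURE_WEIGHTS = {        # hypothetical linear-model weights
    "bedrooms": 350.0,
    "bathrooms": 250.0,
    "square_feet": 1.2,
    "floor": 40.0,
}
INTERCEPT = 600.0

def predict(example: dict) -> float:
    """Identical in every condition: a plain linear model."""
    return INTERCEPT + sum(w * example.get(f, 0.0) for f, w in FEATURE_WEIGHTS.items())

def present(example: dict, n_features: int, transparent: bool) -> str:
    """Vary only the presentation: number of features shown, clear vs. black box."""
    shown = list(FEATURE_WEIGHTS)[:n_features]
    lines = [f"{f} = {example.get(f, 0.0)}" for f in shown]
    if transparent:  # "clear" condition: expose the model internals
        lines += [f"weight({f}) = {FEATURE_WEIGHTS[f]}" for f in shown]
    lines.append(f"model prediction: {predict(example):,.0f}")
    return "\n".join(lines)

if __name__ == "__main__":
    apt = {"bedrooms": 2, "bathrooms": 1, "square_feet": 900, "floor": 3}
    print(present(apt, n_features=2, transparent=True))    # clear, few features
    print(present(apt, n_features=4, transparent=False))   # black box, more features
```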

Supplementary Material

Supplementary Materials (3411764.3445315_supplementalmaterials.zip)


Information

Published In

CHI '21: Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems
May 2021
10862 pages
ISBN:9781450380966
DOI:10.1145/3411764
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 07 May 2021


Author Tags

  1. human-centered machine learning
  2. interpretability
  3. machine-assisted decision making

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

CHI '21

Acceptance Rates

Overall Acceptance Rate 6,199 of 26,314 submissions, 24%

Article Metrics

  • Downloads (last 12 months): 889
  • Downloads (last 6 weeks): 71
Reflects downloads up to 29 Jan 2025

Cited By

  • Towards Cleaner Cities: Estimating Vehicle-Induced PM2.5 with Hybrid EBM-CMA-ES Modeling. Toxics 12(11), 827. Online publication date: 19-Nov-2024. https://doi.org/10.3390/toxics12110827
  • Uncertainty in XAI: Human Perception and Modeling Approaches. Machine Learning and Knowledge Extraction 6(2), 1170–1192. Online publication date: 27-May-2024. https://doi.org/10.3390/make6020055
  • An Overview of the Empirical Evaluation of Explainable AI (XAI): A Comprehensive Guideline for User-Centered Evaluation in XAI. Applied Sciences 14(23), 11288. Online publication date: 3-Dec-2024. https://doi.org/10.3390/app142311288
  • Towards reconciling usability and usefulness of policy explanations for sequential decision-making systems. Frontiers in Robotics and AI 11. Online publication date: 22-Jul-2024. https://doi.org/10.3389/frobt.2024.1375490
  • Humans in XAI: increased reliance in decision-making under uncertainty by using explanation strategies. Frontiers in Behavioral Economics 3. Online publication date: 8-Mar-2024. https://doi.org/10.3389/frbhe.2024.1377075
  • Are logistic models really interpretable? Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 367–375. Online publication date: 3-Aug-2024. https://doi.org/10.24963/ijcai.2024/41
  • Putting a human in the loop: Increasing uptake, but decreasing accuracy of automated decision-making. PLOS ONE 19(2), e0298037. Online publication date: 9-Feb-2024. https://doi.org/10.1371/journal.pone.0298037
  • Rore: robust and efficient antioxidant protein classification via a novel dimensionality reduction strategy based on learning of fewer features. Genomics & Informatics 22(1). Online publication date: 4-Dec-2024. https://doi.org/10.1186/s44342-024-00026-z
  • Should AI models be explainable to clinicians? Critical Care 28(1). Online publication date: 12-Sep-2024. https://doi.org/10.1186/s13054-024-05005-y
  • The Pop-Out Effect of Rarer Occurring Stimuli Shapes the Effectiveness of AI Explainability. Proceedings of the Human Factors and Ergonomics Society Annual Meeting 68(1), 352–358. Online publication date: 13-Aug-2024. https://doi.org/10.1177/10711813241261284
