research-article

Open access

Human-Centered Tools for Coping with Imperfect Algorithms During Medical Decision-Making

Authors:

Daniel Smilkov,

Martin Wattenberg,

Fernanda Viegas,

Greg S. Corrado,

Martin C. Stumpe,

Michael Terry

CHI '19: Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems

Paper No.: 4, Pages 1 - 14

https://doi.org/10.1145/3290605.3300234

Published: 02 May 2019

Abstract

Machine learning (ML) is increasingly being used in image retrieval systems for medical decision making. One application of ML is to retrieve visually similar medical images from past patients (e.g. tissue from biopsies) to reference when making a medical decision with a new patient. However, no algorithm can perfectly capture an expert's ideal notion of similarity for every case: an image that is algorithmically determined to be similar may not be medically relevant to a doctor's specific diagnostic needs. In this paper, we identified the needs of pathologists when searching for similar images retrieved using a deep learning algorithm, and developed tools that empower users to cope with the search algorithm on-the-fly, communicating what types of similarity are most important at different moments in time. In two evaluations with pathologists, we found that these tools increased the diagnostic utility of images found and increased user trust in the algorithm. The tools were preferred over a traditional interface, without a loss in diagnostic accuracy. We also observed that users adopted new strategies when using refinement tools, re-purposing them to test and understand the underlying algorithm and to disambiguate ML errors from their own errors. Taken together, these findings inform future human-ML collaborative systems for expert decision-making.

Index Terms

Human-Centered Tools for Coping with Imperfect Algorithms During Medical Decision-Making
1. Human-centered computing
  1. Human computer interaction (HCI)

Published In

CHI '19: Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems

May 2019

9077 pages

ISBN:9781450359702

DOI:10.1145/3290605

General Chairs:
Stephen Brewster (University of Glasgow, Scotland, UK)
Geraldine Fitzpatrick (TU Wien, Austria)

Program Chairs:
Anna Cox (University College London, UK)
Vassilis Kostakos (University of Melbourne, Australia)

Copyright © 2019 Owner/Author.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Sponsors

SIGCHI: ACM Special Interest Group on Computer-Human Interaction

Publisher

Association for Computing Machinery

New York, NY, United States

Badges

Honorable Mention

Qualifiers

Research-article

Conference

CHI '19

Sponsor:

SIGCHI

CHI '19: CHI Conference on Human Factors in Computing Systems

May 4 - 9, 2019

Glasgow, Scotland, UK

Acceptance Rates

CHI '19 Paper Acceptance Rate 703 of 2,958 submissions, 24%;

Overall Acceptance Rate 6,199 of 26,314 submissions, 24%

Bibliometrics

Total citations: 244
Total downloads: 11,197
Downloads (last 12 months): 2,077
Downloads (last 6 weeks): 270

Reflects downloads up to 25 Dec 2024
