Between Always and Never: Evaluating Uncertainty in Radiology Reports Using Natural Language Processing

Callen, Andrew L.; Dupont, Sara M.; Price, Adi; Laguna, Ben; McCoy, David; Do, Bao; Talbott, Jason; Kohli, Marc; Narvid, Jared

doi:10.1007/s10278-020-00379-1

Original Paper
Published: 19 August 2020

Volume 33, pages 1194–1201, (2020)

Journal of Digital Imaging

Andrew L. Callen ORCID: orcid.org/0000-0003-2315-7767¹,
Sara M. Dupont²,
Adi Price³,
Ben Laguna³,
David McCoy³,
Bao Do⁴,
Jason Talbott³,
Marc Kohli³ &
…
Jared Narvid³

Abstract

The ideal radiology report reduces diagnostic uncertainty, while avoiding ambiguity whenever possible. The purpose of this study was to characterize the use of uncertainty terms in radiology reports at a single institution and compare the use of these terms across imaging modalities, anatomic sections, patient characteristics, and radiologist characteristics. We hypothesized that there would be variability among radiologists and between subspecialities within radiology regarding the use of uncertainty terms and that the length of the impression of a report would be a predictor of use of uncertainty terms. Finally, we hypothesized that use of uncertainty terms would often be interpreted by human readers as “hedging.” To test these hypotheses, we applied a natural language processing (NLP) algorithm to assess and count the number of uncertainty terms within radiology reports. An algorithm was created to detect usage of a published set of uncertainty terms. All 642,569 radiology report impressions from 171 reporting radiologists were collected from 2011 through 2015. For validation, two radiologists without knowledge of the software algorithm reviewed report impressions and were asked to determine whether the report was “uncertain” or “hedging.” The relationship between the presence of 1 or more uncertainty terms and the human readers’ assessment was compared. There were significant differences in the proportion of reports containing uncertainty terms across patient admission status and across anatomic imaging subsections. Reports with uncertainty were significantly longer than those without, although report length was not significantly different between subspecialities or modalities. There were no significant differences in rates of uncertainty when comparing the experience of the attending radiologist. When compared with reader 1 as a gold standard, accuracy was 0.91, sensitivity was 0.92, specificity was 0.9, and precision was 0.88, with an F1-score of 0.9. When compared with reader 2, accuracy was 0.84, sensitivity was 0.88, specificity was 0.82, and precision was 0.68, with an F1-score of 0.77. Substantial variability exists among radiologists and subspecialities regarding the use of uncertainty terms, and this variability cannot be explained by years of radiologist experience or differences in proportions of specific modalities. Furthermore, detection of uncertainty terms demonstrates good test characteristics for predicting human readers’ assessment of uncertainty.
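The approach described in the abstract (flag a report impression when it contains one or more uncertainty terms, then compare those flags with a human reader's labels) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the three terms in `UNCERTAINTY_TERMS` are hypothetical placeholders rather than the published term set, and the function names, the regular-expression matching, and the `Metrics` container are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Sequence

# Hypothetical placeholder terms (assumption); the study's published
# term set is not reproduced here.
UNCERTAINTY_TERMS = ("possibly", "may represent", "cannot be excluded")

# One case-insensitive pattern; word boundaries keep matches to whole terms.
_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(term) for term in UNCERTAINTY_TERMS) + r")\b",
    re.IGNORECASE,
)


def count_uncertainty_terms(impression: str) -> int:
    """Count occurrences of the listed terms in one report impression."""
    return len(_PATTERN.findall(impression))


def is_uncertain(impression: str) -> bool:
    """Flag an impression that contains one or more uncertainty terms."""
    return count_uncertainty_terms(impression) >= 1


@dataclass
class Metrics:
    accuracy: float
    sensitivity: float
    specificity: float
    precision: float
    f1: float


def f1_score(precision: float, sensitivity: float) -> float:
    """Harmonic mean of precision and sensitivity (recall)."""
    return 2 * precision * sensitivity / (precision + sensitivity)


def evaluate(predicted: Sequence[bool], reference: Sequence[bool]) -> Metrics:
    """Compare term-based flags with a reader's labels (the reference).

    Assumes both classes occur in `reference` and at least one report is
    flagged, so that no denominator is zero.
    """
    pairs = list(zip(predicted, reference))
    tp = sum(p and r for p, r in pairs)          # flagged, reader agrees
    tn = sum(not p and not r for p, r in pairs)  # not flagged, reader agrees
    fp = sum(p and not r for p, r in pairs)      # flagged, reader disagrees
    fn = sum(not p and r for p, r in pairs)      # missed by the term search
    sensitivity = tp / (tp + fn)
    precision = tp / (tp + fp)
    return Metrics(
        accuracy=(tp + tn) / len(pairs),
        sensitivity=sensitivity,
        specificity=tn / (tn + fp),
        precision=precision,
        f1=f1_score(precision, sensitivity),
    )
```

As a consistency check on the reported figures, `f1_score(0.88, 0.92)` rounds to 0.90 and `f1_score(0.68, 0.88)` rounds to 0.77, in line with the F1-scores stated for reader 1 and reader 2.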

Funding

Dr. Callen was supported by the National Institutes of Health (NIBIB) T32 Training Grant, T32EB001631.

Author information

Authors and Affiliations

Department of Radiology, University of Colorado Anschutz Medical Campus, Denver, CO, USA
Andrew L. Callen
Subtle Medical Inc, Menlo Park, CA, USA
Sara M. Dupont
Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, CA, USA
Adi Price, Ben Laguna, David McCoy, Jason Talbott, Marc Kohli & Jared Narvid
Department of Radiology, Stanford University Medical Center, Stanford, CA, USA
Bao Do

Corresponding author

Correspondence to Andrew L. Callen.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Table 1 Uncertainty Terms

About this article

Cite this article

Callen, A.L., Dupont, S.M., Price, A. et al. Between Always and Never: Evaluating Uncertainty in Radiology Reports Using Natural Language Processing. J Digit Imaging 33, 1194–1201 (2020). https://doi.org/10.1007/s10278-020-00379-1

Received: 12 December 2019
Revised: 10 June 2020
Accepted: 23 July 2020
Published: 19 August 2020
Issue Date: October 2020
DOI: https://doi.org/10.1007/s10278-020-00379-1
