Abstract
Depending on the application, radiological diagnoses can be associated with high inter- and intra-rater variabilities. Most computer-aided diagnosis (CAD) solutions treat such data as incontrovertible, exposing learning algorithms to considerable and possibly contradictory label noise and biases. Thus, managing subjectivity in labels is a fundamental problem in medical imaging analysis. To address this challenge, we introduce auto-decoded deep latent embeddings (ADDLE), which explicitly models the tendencies of each rater using an auto-decoder framework. After a simple linear transformation, the latent variables can be injected into any backbone at one or more points, allowing the model to account for rater-specific effects on the diagnosis. Importantly, ADDLE does not expect multiple raters per image in training, meaning it can readily learn from data mined from hospital archives. Moreover, the complexity of training ADDLE does not increase as more raters are added. During inference, each rater can be simulated and a “mean” or “greedy” virtual rating can be produced. We test ADDLE on the problem of liver steatosis diagnosis from 2D ultrasound (US) by collecting \(36\,602\) studies along with clinical US diagnoses originating from 65 different raters. We evaluate diagnostic performance using a separate dataset with gold-standard biopsy diagnoses. ADDLE can improve the partial areas under the curve (AUCs) for diagnosing severe steatosis by \(10.5\%\) over standard classifiers while outperforming other annotator-noise approaches, including those requiring 65 times the parameters.
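The mechanism described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Only the rater count (65) comes from the abstract; the latent size, feature size, number of classes, the additive form of the injection, and all names (`rater_latents`, `W_inject`, `predict`, and so on) are assumptions made for illustration. The abstract also does not define the “greedy” rating, so the sketch assumes it means choosing the single simulated rater that scores best on a held-out validation set.

```python
import numpy as np

rng = np.random.default_rng(0)

N_RATERS = 65        # from the abstract
LATENT_DIM = 8       # hypothetical
FEAT_DIM = 16        # hypothetical backbone feature width
N_CLASSES = 4        # hypothetical number of diagnostic grades

# Auto-decoder: one free latent vector per rater, looked up by rater id.
# There is no encoder; in training these vectors would be optimized
# jointly with the network weights (random values stand in here).
rater_latents = rng.normal(scale=0.1, size=(N_RATERS, LATENT_DIM))

# "Simple linear transformation" that maps a latent to feature space.
W_inject = rng.normal(scale=0.1, size=(LATENT_DIM, FEAT_DIM))

# Stand-in for the classification head of an arbitrary backbone.
W_head = rng.normal(scale=0.1, size=(FEAT_DIM, N_CLASSES))


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def predict(features, rater_id):
    """Class probabilities for a batch of features as seen by one rater.

    Assumed injection: the transformed latent is added to the features.
    """
    z = rater_latents[rater_id]
    conditioned = features + z @ W_inject   # broadcasts over the batch
    return softmax(conditioned @ W_head)


def mean_virtual_rating(features):
    """Simulate every rater and average their predicted probabilities."""
    per_rater = np.stack([predict(features, r) for r in range(N_RATERS)])
    return per_rater.mean(axis=0)


def greedy_virtual_rater(val_features, val_labels):
    """Assumed 'greedy' rule: the rater with the best validation accuracy."""
    accuracies = [
        (predict(val_features, r).argmax(axis=1) == val_labels).mean()
        for r in range(N_RATERS)
    ]
    return int(np.argmax(accuracies))
```

Because each training sample only needs the latent of the rater who produced its label, a sketch like this never requires several ratings per image, and adding a rater only adds one row to `rater_latents`.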
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Li, B. et al. (2021). Learning from Subjective Ratings Using Auto-Decoded Deep Latent Embeddings. In: de Bruijne, M., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2021. MICCAI 2021. Lecture Notes in Computer Science, vol 12905. Springer, Cham. https://doi.org/10.1007/978-3-030-87240-3_26
DOI: https://doi.org/10.1007/978-3-030-87240-3_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87239-7
Online ISBN: 978-3-030-87240-3
eBook Packages: Computer Science (R0)