Abstract
Fréchet Inception Distance (FID) is a widely used metric for assessing synthetic image quality. It relies on an ImageNet-based feature extractor, making its applicability to medical imaging unclear. A recent trend is to adapt FID to medical imaging through feature extractors trained on medical images. Our study challenges this practice by demonstrating that ImageNet-based extractors are more consistent and aligned with human judgment than their RadImageNet counterparts. We evaluated sixteen StyleGAN2 networks across four medical imaging modalities and four data augmentation techniques, with Fréchet distances (FDs) computed using eleven ImageNet- or RadImageNet-trained feature extractors. Comparison with human judgment via visual Turing tests revealed that ImageNet-based extractors produced rankings consistent with expert evaluations, with the FD derived from the ImageNet-trained SwAV extractor correlating significantly with those evaluations. In contrast, RadImageNet-based rankings were volatile and inconsistent with human judgment. Our findings challenge prevailing assumptions, providing novel evidence that medical image-trained feature extractors do not inherently improve FDs and can even compromise their reliability. Our code is available at https://github.com/mckellwoodland/fid-med-eval.
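For context, every FD reported in the study is the closed-form Fréchet (2-Wasserstein) distance between two Gaussians fitted to the real and synthetic feature sets; only the feature extractor producing those sets varies. The sketch below is a minimal NumPy/SciPy illustration of that formula, not the authors' released implementation (see the repository above); the function name and array shapes are assumptions.

```python
import numpy as np
from scipy import linalg


def frechet_distance(feats_real: np.ndarray, feats_fake: np.ndarray) -> float:
    """Fréchet distance between Gaussians fitted to two feature sets.

    feats_real, feats_fake: (N, D) arrays of extractor features, e.g.
    Inception-v3 pool features for the standard FID. Illustrative only.
    """
    mu1, mu2 = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    sigma1 = np.cov(feats_real, rowvar=False)
    sigma2 = np.cov(feats_fake, rowvar=False)

    # Matrix square root of the covariance product; tiny imaginary
    # components from numerical error are discarded.
    covmean = linalg.sqrtm(sigma1 @ sigma2)
    if np.iscomplexobj(covmean):
        covmean = covmean.real

    diff = mu1 - mu2
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))
```

Swapping the feature extractor (Inception-v3, SwAV, a RadImageNet backbone, and so on) changes only how feats_real and feats_fake are produced; the distance computation itself is fixed, which is why the choice of extractor is the paper's central variable.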
Acknowledgments
Research reported in this publication was supported in part by resources of the Image Guided Cancer Therapy Research Program (IGCT) at The University of Texas MD Anderson Cancer Center, a generous gift from the Apache Corporation, the National Institutes of Health/NCI under award number P30CA016672, and the Tumor Measurement Initiative through the MD Anderson Strategic Initiative Development Program (STRIDE). We thank the NIH Clinical Center for the ChestX-ray14 dataset, Dr. Rishi Agrawal and Dr. Carol Wu for their feedback on the generative modeling, Dr. Vikram Haheshri and Dr. Oleg Igoshin for the discussion that led to the hypothesis testing contribution, and Erica Goodoff, Senior Scientific Editor in the Research Medical Library at MD Anderson, for editing this article. GPT-4 was used in the proofreading stage of this manuscript.
Ethics declarations
Disclosure of Interests
The authors have no competing interests to declare.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Woodland, M. et al. (2024). Feature Extraction for Generative Medical Imaging Evaluation: New Evidence Against an Evolving Trend. In: Linguraru, M.G., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2024. MICCAI 2024. Lecture Notes in Computer Science, vol 15012. Springer, Cham. https://doi.org/10.1007/978-3-031-72390-2_9
DOI: https://doi.org/10.1007/978-3-031-72390-2_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72389-6
Online ISBN: 978-3-031-72390-2
eBook Packages: Computer Science, Computer Science (R0)