
Enhancing Robust Liver Cancer Diagnosis: A Contrastive Multi-Modality Learner with Lightweight Fusion and Effective Data Augmentation

Published: 22 April 2024

Abstract

This article explores the application of self-supervised contrastive learning in the medical domain, focusing on classification of multi-modality Magnetic Resonance (MR) images. To address the challenges of limited and hard-to-annotate medical data, we introduce multi-modality data augmentation (MDA) and cross-modality group convolution (CGC). In the pre-training phase, we leverage Simple Siamese networks to maximize the similarity between two augmented MR images from a patient, without a handcrafted pretext task. Our approach also combines 3D and 2D group convolution with a channel shuffle operation to efficiently incorporate different modalities of image features. Evaluation on liver MR images from a well-known hospital in Taiwan demonstrates a significant improvement over previous methods. This work contributes to advancing multi-modality contrastive learning, particularly in the context of medical imaging, offering enhanced tools for analyzing complex image data.
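The abstract names two concrete mechanisms: maximizing agreement between two augmented views of a patient's scan in the Simple Siamese (SimSiam) style, and mixing modality-specific feature groups via group convolution with a channel shuffle. As a rough illustration of both ideas (a minimal NumPy sketch under assumed shapes and function names, not the authors' implementation), the shuffle and the negative-cosine objective can be written as:

```python
import numpy as np

def channel_shuffle(x, groups):
    """ShuffleNet-style channel shuffle: interleave channels across groups
    so that a subsequent group convolution mixes information between
    modality-specific channel groups (hypothetical helper, for illustration)."""
    n, c, h, w = x.shape
    assert c % groups == 0, "channel count must be divisible by group count"
    x = x.reshape(n, groups, c // groups, h, w)
    x = x.transpose(0, 2, 1, 3, 4)  # swap the group and per-group axes
    return x.reshape(n, c, h, w)

def neg_cosine(p, z):
    """SimSiam-style objective: negative cosine similarity between the
    predictor output p of one augmented view and the projection z of the
    other view (z is treated as a constant, i.e., stop-gradient)."""
    p = p / np.linalg.norm(p, axis=1, keepdims=True)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    return -(p * z).sum(axis=1).mean()
```

With `groups=2`, channels `[0, 1, 2, 3]` are reordered to `[0, 2, 1, 3]`, so each group of a following group convolution sees channels from both modalities; `neg_cosine` reaches its minimum of -1 when the two views' embeddings align.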


Published In

ACM Transactions on Computing for Healthcare, Volume 5, Issue 2
April 2024, 169 pages
EISSN: 2637-8051
DOI: 10.1145/3613591

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published: 22 April 2024
Online AM: 30 December 2023
Accepted: 10 December 2023
Revised: 07 September 2023
Received: 27 October 2022

Author Tags

1. Contrastive learning
2. Multi-modality learning
3. MRI image analysis

Qualifiers

• Research article

Funding Sources

• National Science and Technology Council (NSTC)
