DOI: 10.1145/3671151.3671286
DCTNet: A Fusion of Transformer and CNN for Advanced Multimodal Medical Image Segmentation

Published: 23 July 2024

Abstract

We propose DCTNet, a multimodal breast cancer segmentation framework that combines Convolutional Neural Networks (CNNs) and Transformers. The approach integrates complementary information and features from multiple imaging modalities to improve the accuracy of automated breast cancer lesion delineation. DCTNet employs two CNN-based feature-learning branches that extract modality-specific features independently, reducing cross-modality interference through an encoder-decoder structure with skip connections. In addition, a Transformer-based encoder performs cross-modal shared learning, extracting unified representations from the multimodal inputs; these are combined with the modality-specific features by a Cross-Modal Feature Fusion Module (CFM) and refined through the CNN decoder to produce the final segmentation. Experiments on the DCI breast cancer dataset show that DCTNet matches or surpasses the segmentation performance of existing advanced multimodal models. These results demonstrate the effectiveness of combining CNN and Transformer structures, the cross-modal feature fusion module, and a multimodal contrastive loss for breast cancer segmentation, and suggest new directions for research in multimodal medical image analysis.
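The abstract mentions a multimodal contrastive loss but does not specify its form. As an illustration only, the sketch below shows a common InfoNCE-style formulation over paired modality embeddings: each sample's embedding from modality A is pulled toward the same sample's embedding from modality B and pushed away from other samples'. The function names (`cosine`, `multimodal_contrastive_loss`) and the temperature value are hypothetical, not taken from the paper.

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def multimodal_contrastive_loss(z_a, z_b, temperature=0.1):
    """InfoNCE-style loss over paired embeddings from two modalities.

    For each sample i, the positive pair is (z_a[i], z_b[i]); all other
    (z_a[i], z_b[j]) pairs in the batch act as negatives.
    """
    n = len(z_a)
    loss = 0.0
    for i in range(n):
        sims = [cosine(z_a[i], z_b[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over the similarity row.
        m = max(sims)
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims))
        loss += -(sims[i] - log_denom)
    return loss / n
```

With toy 2-D embeddings, correctly aligned modality pairs yield a much smaller loss than misaligned ones, which is the behavior a contrastive term exploits to align shared representations across modalities.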



    Published In

    CIBDA '24: Proceedings of the 5th International Conference on Computer Information and Big Data Applications
    April 2024
    1285 pages
    ISBN:9798400718106
    DOI:10.1145/3671151

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    CIBDA 2024

