
ASF-LKUNet: Adjacent-scale fusion U-Net with large kernel for multi-organ segmentation

Published: 01 October 2024

Abstract

Multi-organ segmentation of medical images faces several challenging issues, such as complex backgrounds, blurred boundaries between organs, and large differences in organ volume. Because conventional convolution operations have local receptive fields, applying them directly to multi-organ segmentation rarely yields desirable results. Transformer-based models capture global information, but their high computational demands create a significant dependency on hardware. Meanwhile, depthwise convolution with large kernels can capture global information at a lower computational cost. Therefore, to leverage a large receptive field while reducing model complexity, we propose a novel CNN-based approach, the adjacent-scale fusion U-Net with large kernel (ASF-LKUNet), for multi-organ segmentation. ASF-LKUNet uses a u-shaped encoder–decoder as its base architecture. In the encoder path, we design a large kernel residual block that combines large and small kernels to capture global and local features simultaneously. Furthermore, for the first time, we propose an adjacent-scale fusion mechanism and a large kernel GRN channel attention, which incorporate low-level details with high-level semantics through adjacent-scale features and then adaptively focus on the more global and meaningful channel information. Extensive experiments and interpretability analyses are conducted on the Synapse multi-organ dataset (Synapse) and the ACDC cardiac multi-structure dataset (ACDC). ASF-LKUNet achieves DSC scores of 88.41% on Synapse and 89.45% on ACDC with 17.96M parameters and 29.14 GFLOPs, outperforming ten competing approaches while maintaining favorably low model complexity. Code and trained models have been released on GitHub.
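As a rough illustration of the GRN-style channel attention mentioned in the abstract, the sketch below follows the global response normalization formulation introduced in ConvNeXt V2: per-channel spatial L2 norms are computed, normalized across channels, and used to recalibrate the feature map with a residual connection. This is an assumption-laden sketch of the underlying GRN operation only; the paper's large kernel GRN channel attention may differ in its details.

```python
import numpy as np

def grn(x, gamma, beta, eps=1e-6):
    """Global Response Normalization (ConvNeXt V2 style), a sketch.

    x: feature map of shape (N, H, W, C).
    gamma, beta: learnable per-channel scale and shift (scalars also work
    via broadcasting).
    """
    # Global feature aggregation: per-channel L2 norm over spatial dims.
    gx = np.sqrt(np.sum(x ** 2, axis=(1, 2), keepdims=True))  # (N, 1, 1, C)
    # Divisive normalization: relative importance of each channel.
    nx = gx / (gx.mean(axis=-1, keepdims=True) + eps)
    # Calibrate the input and add a residual connection.
    return gamma * (x * nx) + beta + x
```

Channels whose global response is above the cross-channel mean are amplified (nx > 1), which matches the abstract's description of adaptively focusing on the more global and meaningful channel information.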

Highlights

An adjacent-scale fusion U-Net is proposed to enhance effectiveness and efficiency.
The LKRB and LKGRN designs help bridge the gap between ViTs and CNNs.
Grad-CAM is used to gain a deeper understanding of the information captured by the model.
Extensive results show that ASF-LKUNet achieves favorable performance with lower complexity.
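To make the large kernel residual block (LKRB) idea concrete, here is a minimal dependency-free sketch of combining a large depthwise kernel (global context) with a small one (local detail) around a residual connection. The parallel-branch arrangement and the simple additive fusion are assumptions for illustration; the paper's actual block structure, normalization, and activations may differ.

```python
import numpy as np

def depthwise_conv2d(x, kernels):
    """Naive depthwise 2D convolution with 'same' zero padding.

    x: feature map of shape (C, H, W); kernels: (C, k, k), one per channel.
    """
    C, H, W = x.shape
    k = kernels.shape[-1]
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros_like(x, dtype=float)
    for c in range(C):          # each channel convolved independently
        for i in range(H):
            for j in range(W):
                out[c, i, j] = np.sum(xp[c, i:i + k, j:j + k] * kernels[c])
    return out

def lk_residual_block(x, large_kernels, small_kernels):
    """Sketch of an LKRB: large + small depthwise branches with a residual add."""
    global_feat = depthwise_conv2d(x, large_kernels)  # large receptive field
    local_feat = depthwise_conv2d(x, small_kernels)   # fine local detail
    return x + global_feat + local_feat
```

Depthwise convolution keeps the cost of a large kernel low (O(C·k²) weights instead of O(C²·k²) for a dense convolution), which is the efficiency argument the abstract makes for large-kernel CNNs over Transformers.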




Published In

Computers in Biology and Medicine  Volume 181, Issue C
Oct 2024
730 pages

Publisher

Pergamon Press, Inc.

United States


Author Tags

  1. Multi-organ segmentation
  2. Adjacent-scale fusion
  3. Large kernel
  4. CNNs
  5. Interpretability analysis

Qualifiers

  • Research-article
