Pie-UNet: A Novel Parallel Interaction Encoder for Medical Image Segmentation

Jiang, Youtao; Zhang, Xiaoqian; Chen, Yufeng; Yang, Shukai; Sun, Feng

doi:10.1007/978-3-031-44210-0_45


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14255)

Included in the following conference series:

International Conference on Artificial Neural Networks


Abstract

Most early deep-learning methods for medical image segmentation adopt a fully convolutional structure, but the fixed size of the convolutional window limits the modeling of long-range dependencies. The Vision Transformer (ViT) has powerful global modeling capabilities, yet it represents low-level feature detail poorly. To address these problems, we propose a novel encoder structure and design a new U-shaped network for medical image segmentation, called Pie-UNet. First, because ViT lacks locality and CNNs lack global perception, we let the two complement each other by encoding global and local information separately and running both encoders in a parallel, interacting manner. Second, we propose a local-structure-aware ViT, called Rwin Transformer, to enhance the local detail representation of ViT itself. Third, to further refine the local representation, we construct a focal modulator based on large kernels. Finally, we propose a pre-fusion approach to optimize the information interaction between heterogeneous structures. Experimental results demonstrate that the proposed Pie-UNet achieves the best segmentation accuracy among the existing medical image segmentation methods compared.
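The complementary roles of the two branches described in the abstract can be illustrated with a toy one-dimensional sketch. It is not the Pie-UNet encoder: the scalar tokens, the window size of 3, the element-wise sum used for fusion, and all function names are assumptions made only for illustration. The sketch shows that a convolution-like branch ignores a change far outside its window, whereas an attention-like branch responds to it, and that stacked stride-1 convolutions enlarge their receptive field only linearly with depth.

```python
import math

def local_branch(x, k=3):
    """Convolution-like branch: each output sees only a k-wide window (zero-padded)."""
    r = k // 2
    n = len(x)
    return [sum(x[j] for j in range(max(0, i - r), min(n, i + r + 1))) / k
            for i in range(n)]

def global_branch(x):
    """Attention-like branch: each output is a softmax-weighted sum over all positions."""
    out = []
    for q in x:
        scores = [q * v for v in x]            # scalar dot-product similarity
        m = max(scores)                        # subtract the max for numerical stability
        w = [math.exp(s - m) for s in scores]
        z = sum(w)
        out.append(sum(wi * v for wi, v in zip(w, x)) / z)
    return out

def parallel_encode(x):
    """Run both branches on the same input and fuse by element-wise sum (assumed rule)."""
    loc, glo = local_branch(x), global_branch(x)
    return [a + b for a, b in zip(loc, glo)]

def receptive_field(layers, k=3):
    """Receptive field of `layers` stacked stride-1 convolutions with kernel size k."""
    return 1 + layers * (k - 1)

signal = [0.5, 0.1, -0.2, 0.3, 0.0, 0.4, -0.1, 0.2]
perturbed = signal[:-1] + [2.0]                # change only the last position

# The local branch at position 0 cannot see the change at position 7 ...
print(local_branch(signal)[0] == local_branch(perturbed)[0])
# ... while the global branch at position 0 does react to it.
print(global_branch(signal)[0] != global_branch(perturbed)[0])
print(receptive_field(4))                      # four 3-wide layers
```

Running the two branches side by side, as `parallel_encode` does, gives each position both a window-limited summary and a sequence-wide summary. The interaction and pre-fusion mechanisms named in the abstract are not reproduced here.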



Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grant 62102331, the Natural Science Foundation of Sichuan Province under Grant 2022NSFSC0839, and the Doctoral Research Fund Project of Southwest University of Science and Technology under Grant 22zx7110.

Author information

Authors and Affiliations

School of Information Engineering, Southwest University of Science and Technology, Mianyang, 621010, China
Youtao Jiang, Xiaoqian Zhang, Yufeng Chen & Shukai Yang
Tianfu Institute of Research and Innovation, Southwest University of Science and Technology, Mianyang, 621010, China
Xiaoqian Zhang
Radiology Department, Mianyang Central Hospital, Mianyang, 621010, China
Feng Sun


Corresponding author

Correspondence to Xiaoqian Zhang.

Editor information

Editors and Affiliations

Democritus University of Thrace, Xanthi, Greece
Lazaros Iliadis
Democritus University of Thrace, Xanthi, Greece
Antonios Papaleonidas
Lancaster University, Lancaster, UK
Plamen Angelov
Teesside University, Middlesbrough, UK
Chrisina Jayne


About this paper

Cite this paper

Jiang, Y., Zhang, X., Chen, Y., Yang, S., Sun, F. (2023). Pie-UNet: A Novel Parallel Interaction Encoder for Medical Image Segmentation. In: Iliadis, L., Papaleonidas, A., Angelov, P., Jayne, C. (eds) Artificial Neural Networks and Machine Learning – ICANN 2023. ICANN 2023. Lecture Notes in Computer Science, vol 14255. Springer, Cham. https://doi.org/10.1007/978-3-031-44210-0_45


DOI: https://doi.org/10.1007/978-3-031-44210-0_45
Published: 22 September 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-44209-4
Online ISBN: 978-3-031-44210-0
eBook Packages: Computer Science, Computer Science (R0)
