HCT-net: hybrid CNN-transformer model based on a neural architecture search network for medical image segmentation

Yu, Zhihong; Lee, Feifei; Chen, Qiu

doi:10.1007/s10489-023-04570-z

Published: 24 March 2023

Volume 53, pages 19990–20006, (2023)

Applied Intelligence

Abstract

Many convolutional neural networks (CNNs) for medical image segmentation are designed by hand for individual tasks, which requires considerable time, labor, and domain knowledge. In addition, because convolution operates locally, most CNNs capture only local feature information and lack a global receptive field, so there is still much room for performance improvement. A method is therefore needed that captures feature information more fully while reducing design time and human effort, with lower GPU memory consumption and complexity. In this paper, we propose a novel hybrid CNN-transformer model based on a neural architecture search network (HCT-Net). Its search space is built on a hybrid U-shaped CNN with a key-sampling Transformer backbone that considers both contextual and long-range pixel information. A single-path neural architecture search, combining a flexible search space with an efficient search strategy, simultaneously finds the optimal subnetwork, including three types of cells, within the SuperNet. Compared with various types of medical image segmentation methods, our framework achieves competitive precision and efficiency on various datasets, and extended experiments validate its generalization to unseen datasets, indicating that the method is competitive and robust. The code for the method is available at https://github.com/yuzh2022/HCT-Net.
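
To make the idea of a single-path search concrete, the following is a minimal sketch of how one subnetwork can be drawn from a SuperNet by picking exactly one candidate operation at each position. It is not the authors' implementation: the cell-type names, the candidate operations, and the U-shaped layout are all hypothetical placeholders, since the abstract states only that three types of cells are searched.

```python
import random

# Hypothetical search space: the three cell types and their candidate
# operations are illustrative placeholders, not the paper's search space.
SEARCH_SPACE = {
    "down_cell": ["conv3x3", "dilated_conv3x3", "max_pool"],
    "normal_cell": ["conv3x3", "depthwise_conv3x3", "identity"],
    "up_cell": ["transposed_conv", "bilinear_upsample_conv"],
}

# Hypothetical U-shaped layout: the cell type placed at each position.
LAYOUT = [
    "down_cell", "normal_cell", "down_cell", "normal_cell",
    "up_cell", "normal_cell", "up_cell",
]


def sample_single_path(layout, search_space, rng):
    """Pick exactly one candidate operation per position (one path)."""
    return [rng.choice(search_space[cell_type]) for cell_type in layout]


def count_subnetworks(layout, search_space):
    """Count the distinct single-path subnetworks this layout admits."""
    total = 1
    for cell_type in layout:
        total *= len(search_space[cell_type])
    return total


rng = random.Random(0)  # fixed seed so the sampled path is reproducible
path = sample_single_path(LAYOUT, SEARCH_SPACE, rng)
num_subnetworks = count_subnetworks(LAYOUT, SEARCH_SPACE)
```

In a single-path scheme of this kind, only the sampled operations are active in a given training step, which is one reason such searches tend to need less GPU memory than approaches that evaluate every candidate at once. With the placeholder space above, `num_subnetworks` is 3^5 × 2^2 = 972.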

Acknowledgments

This research is partially supported by JSPS KAKENHI Grant Number 22K12079.

Author information

Zhihong Yu and Feifei Lee contributed equally to this work.

Authors and Affiliations

Shanghai Engineering Research Center of Assistive Devices, School of Medical Instrument and Food Engineering, University of Shanghai for Science and Technology, Shanghai, 200093, China
Zhihong Yu & Feifei Lee
Rehabilitation Engineering and Technology Institute, University of Shanghai for Science and Technology, Shanghai, 200093, China
Feifei Lee
Major of Electrical Engineering and Electronics, Graduate School of Engineering, Kogakuin University, Tokyo, 163-8677, Japan
Qiu Chen

Corresponding authors

Correspondence to Feifei Lee or Qiu Chen.

Ethics declarations

Competing interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yu, Z., Lee, F. & Chen, Q. HCT-net: hybrid CNN-transformer model based on a neural architecture search network for medical image segmentation. Appl Intell 53, 19990–20006 (2023). https://doi.org/10.1007/s10489-023-04570-z

Accepted: 12 March 2023
Published: 24 March 2023
Issue Date: September 2023
DOI: https://doi.org/10.1007/s10489-023-04570-z
