HCT-net: hybrid CNN-transformer model based on a neural architecture search network for medical image segmentation

Applied Intelligence

Abstract

Many convolutional neural networks (CNNs) for medical image segmentation are designed manually for specific tasks, which requires considerable time, labor, and domain knowledge. Moreover, because of the locality of convolution, most CNNs consider only local feature information and ignore the global receptive field, so there is still much room for performance improvement. A new method is therefore needed that fully captures feature information while saving time and human effort, with lower GPU memory consumption and complexity. In this paper, we propose a novel hybrid CNN-transformer model based on a neural architecture search network (HCT-Net). It designs a hybrid U-shaped CNN with a key-sampling Transformer backbone that considers contextual and long-range pixel information in the search space, and it uses a single-path neural architecture search, with a flexible search space and an efficient search strategy, to simultaneously find the optimal subnetwork, including three types of cells, during SuperNet training. Compared with various types of medical image segmentation methods, our framework achieves competitive precision and efficiency on various datasets, and extended experiments validate its generalization to unseen datasets. These results indicate that our method is competitive and robust. The code for the method is available at https://github.com/yuzh2022/HCT-Net.
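As a rough illustration of the single-path idea mentioned in the abstract, the sketch below samples one candidate operation per slot from a small search space and keeps the best-scoring subnetwork. It is not the authors' implementation: the cell names, candidate operations, number of slots, plain random search, and the `toy_score` function are all assumptions made for illustration. The actual search space and strategy are defined in the paper and the linked repository.

```python
import random

# Hypothetical search space: three cell types, each with candidate operations.
# Cell and operation names are illustrative placeholders, not taken from the paper.
SEARCH_SPACE = {
    "cell_a": ["conv3x3", "conv5x5", "identity"],
    "cell_b": ["conv3x3", "dilated_conv3x3", "identity"],
    "cell_c": ["self_attention", "conv3x3", "identity"],
}
LAYERS_PER_CELL = 2  # assumed number of choice slots inside each cell


def sample_single_path(rng):
    """Uniformly pick one candidate operation per slot, giving one subnetwork."""
    return {
        cell: [rng.choice(ops) for _ in range(LAYERS_PER_CELL)]
        for cell, ops in SEARCH_SPACE.items()
    }


def search(score_fn, num_samples, seed=0):
    """Random search over single paths: keep the best-scoring sampled subnetwork."""
    rng = random.Random(seed)
    best_path, best_score = None, float("-inf")
    for _ in range(num_samples):
        path = sample_single_path(rng)
        score = score_fn(path)
        if score > best_score:
            best_path, best_score = path, score
    return best_path, best_score


def toy_score(path):
    """Stand-in for validation accuracy: counts non-identity operations."""
    return sum(op != "identity" for ops in path.values() for op in ops)


if __name__ == "__main__":
    path, score = search(toy_score, num_samples=50)
    print(path, score)
```

In a real setting, `toy_score` would be replaced by an evaluation of the sampled subnetwork using weights shared in the SuperNet, which is what makes single-path search cheaper than training every candidate from scratch.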

Acknowledgments

This research is partially supported by JSPS KAKENHI Grant Number 22K12079.

Author information

Corresponding authors

Correspondence to Feifei Lee or Qiu Chen.

Ethics declarations

Competing interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yu, Z., Lee, F. & Chen, Q. HCT-net: hybrid CNN-transformer model based on a neural architecture search network for medical image segmentation. Appl Intell 53, 19990–20006 (2023). https://doi.org/10.1007/s10489-023-04570-z

  • DOI: https://doi.org/10.1007/s10489-023-04570-z

Keywords