Abstract
Accurate Left Atrial (LA) segmentation from Late Gadolinium Enhancement Magnetic Resonance Imaging (LGE MRI) is fundamental to the diagnosis of Atrial Fibrillation (AF). Previous approaches have mostly addressed this problem by refining the network architecture to exploit spatial priors in medical images. However, these priors are hard to model because of the low image quality and the highly variable shape of the LA. In this paper, we instead learn the priors through generation. The motivation is simple: if a model can generate or recover image content well, it has likely learned the priors well, and a model with such priors built in can segment the LA better. Specifically, we investigate the self pre-training paradigm, i.e., models are pre-trained and fine-tuned on the same LGE MRI dataset, based on the Masked Autoencoder (MAE). In the pre-training stage, we use Vision Transformer (ViT) based auto-encoders to perform the pretext task of reconstructing the original MRI images from only a subset of patches. This encourages the ViT encoder to learn contextual information as priors, since it must aggregate global information to recover the content of the masked patches. In the fine-tuning stage, we further propose a single-scale adapter for the downstream task. The adapter first uses parallel branches with different numbers of upsampling blocks to compensate for the plain, non-hierarchical structure of the ViT, which better adapts it to dense prediction tasks. It then builds a feature pyramid directly from the single-scale ViT feature map, using the multi-scale features produced by the branches. Finally, the adapter incorporates a decoder that predicts the segmentation from the feature pyramid. The proposed model (called ViTUNet) outperforms both the baseline trained from scratch and the widely used nnUNet model. On the validation set, the final model achieves a Dice coefficient of 0.89013, an average surface distance (ASD) of 1.70567, and a Hausdorff distance (HD) of 17.12375.
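Two ideas in the abstract can be illustrated with a minimal sketch: the random patch masking behind the MAE pretext task, and the Dice coefficient used for evaluation. This is not the authors' implementation. The function names, the 14 x 14 patch grid, and the 0.75 mask ratio (the default in the original MAE paper) are assumptions made for illustration; the abstract does not state them.

```python
import random


def random_patch_mask(grid_h, grid_w, mask_ratio=0.75, seed=None):
    """Split a grid of patches into visible and masked index lists.

    mask_ratio=0.75 is the MAE paper's default, assumed here; the
    abstract does not give the ratio used for LGE MRI.
    """
    rng = random.Random(seed)
    num_patches = grid_h * grid_w
    num_masked = int(num_patches * mask_ratio)
    order = list(range(num_patches))
    rng.shuffle(order)
    # The encoder would only see the visible patches; the decoder
    # would be asked to reconstruct the masked ones.
    masked = sorted(order[:num_masked])
    visible = sorted(order[num_masked:])
    return visible, masked


def dice_coefficient(pred, target):
    """Dice = 2|P ∩ T| / (|P| + |T|) for flat binary masks (lists of 0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(p & t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


if __name__ == "__main__":
    # Hypothetical 14 x 14 patch grid (e.g. a 224 x 224 slice, 16 x 16 patches).
    visible, masked = random_patch_mask(14, 14, seed=0)
    print(len(visible), len(masked))  # 49 147
    print(dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0]))
```

With three quarters of the patches hidden, the encoder cannot rely on local interpolation alone, which is the intuition behind learning contextual priors from reconstruction.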
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Tu, C. et al. (2023). Self Pre-training with Single-Scale Adapter for Left Atrial Segmentation. In: Zhuang, X., Li, L., Wang, S., Wu, F. (eds) Left Atrial and Scar Quantification and Segmentation. LAScarQS 2022. Lecture Notes in Computer Science, vol 13586. Springer, Cham. https://doi.org/10.1007/978-3-031-31778-1_3
DOI: https://doi.org/10.1007/978-3-031-31778-1_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-31777-4
Online ISBN: 978-3-031-31778-1
eBook Packages: Computer Science, Computer Science (R0)