VISTA3D: Versatile Imaging SegmenTation and Annotation model for 3D Computed Tomography

He, Yufan; Guo, Pengfei; Tang, Yucheng; Myronenko, Andriy; Nath, Vishwesh; Xu, Ziyue; Yang, Dong; Zhao, Can; Simon, Benjamin; Belue, Mason; Harmon, Stephanie; Turkbey, Baris; Xu, Daguang; Li, Wenqi


arXiv:2406.05285 (cs)

[Submitted on 7 Jun 2024 (v1), last revised 7 Aug 2024 (this version, v2)]



Abstract: Medical image segmentation is a core component of precision medicine, and 3D computed tomography (CT) is one of the most important imaging techniques. A highly accurate and clinically applicable segmentation foundation model will greatly facilitate clinicians and researchers using CT images. Although existing foundation models have attracted great interest, none are adequate for 3D CT, either because they lack accurate automatic segmentation for large cohort analysis or the ability to segment novel classes. An ideal segmentation solution should possess two features: accurate out-of-the-box performance covering major organ classes, and effective adaptation or zero-shot ability to novel structures. To achieve this goal, we introduce Versatile Imaging SegmenTation and Annotation model (VISTA3D). VISTA3D is trained systematically on 11,454 volumes and provides accurate out-of-the-box segmentation for 127 common types of human anatomical structures and various lesions. Additionally, VISTA3D supports 3D interactive segmentation, allowing convenient editing of automatic results and achieving state-of-the-art annotation results on unseen classes. The novel model design and training recipe represent a promising step toward developing a versatile medical image foundation model and will serve as a valuable foundation for CT image analysis. Code and model weights are available at this https URL

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2406.05285 [cs.CV]
	(or arXiv:2406.05285v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2406.05285

Submission history

From: Yufan He
[v1] Fri, 7 Jun 2024 22:41:39 UTC (2,991 KB)
[v2] Wed, 7 Aug 2024 21:47:41 UTC (2,984 KB)
