Open Panoramic Segmentation

Zheng, Junwei; Liu, Ruiping; Chen, Yufan; Peng, Kunyu; Wu, Chengzhi; Yang, Kailun; Zhang, Jiaming; Stiefelhagen, Rainer

arXiv:2407.02685 (cs)

[Submitted on 2 Jul 2024 (v1), last revised 11 Jul 2024 (this version, v2)]

Abstract: Panoramic images, capturing a 360° field of view (FoV), encompass omnidirectional spatial information crucial for scene understanding. However, obtaining enough densely annotated panoramas for training is costly, and models trained in a closed-vocabulary setting are restricted in their applications. To tackle this problem, we define a new task termed Open Panoramic Segmentation (OPS), in which models are trained in an open-vocabulary setting on FoV-restricted pinhole images in the source domain and evaluated on FoV-open panoramic images in the target domain, enabling zero-shot open panoramic semantic segmentation. Moreover, we propose a model named OOOPS with a Deformable Adapter Network (DAN), which significantly improves zero-shot panoramic semantic segmentation performance. To further enhance distortion-aware modeling from the pinhole source domain, we propose a novel data augmentation method called Random Equirectangular Projection (RERP), which is specifically designed to address object deformations in advance. OOOPS with RERP surpasses other state-of-the-art open-vocabulary semantic segmentation approaches on three panoramic datasets, WildPASS, Stanford2D3D, and Matterport3D, demonstrating its effectiveness on the OPS task, notably with gains of +2.2% on outdoor WildPASS and +2.4% on indoor Stanford2D3D in mIoU. The source code is publicly available at this https URL.
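
The abstract reports its gains in mIoU (mean intersection over union). As a reading aid, the following is a minimal sketch of how that metric is commonly computed for semantic segmentation. It is not the authors' evaluation code: the function name `mean_iou`, the flattened label lists, and the `ignore_index` convention are assumptions made for illustration.

```python
from typing import Optional, Sequence


def mean_iou(
    pred: Sequence[int],
    target: Sequence[int],
    num_classes: int,
    ignore_index: Optional[int] = None,
) -> float:
    """Mean IoU over flattened label maps (hypothetical helper, not from the paper).

    Classes that appear in neither the prediction nor the ground truth
    are skipped so they do not affect the average.
    """
    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixel: excluded from all counts
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```

In this toy example the per-class IoUs are 1/3, 2/3, and 1/2, so `score` is 0.5. Benchmarks differ in details such as which classes are evaluated and how unlabeled pixels are handled, so the paper's exact protocol may not match this sketch.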

Comments:	Accepted by ECCV 2024. Project page: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2407.02685 [cs.CV]
	(or arXiv:2407.02685v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2407.02685

Submission history

From: Jiaming Zhang
[v1] Tue, 2 Jul 2024 22:00:32 UTC (44,242 KB)
[v2] Thu, 11 Jul 2024 22:29:05 UTC (25,811 KB)
