Jan 21, 2024 · Abstract: 3D open-vocabulary scene understanding aims to recognize arbitrary novel categories beyond the base label space.
In this paper, we propose a unified multimodal 3D open-vocabulary scene understanding network, namely UniM-OV3D, which aligns point clouds with image, language ...
Apr 22, 2024 · This paper introduces UniM-OV3D, a unified-modality 3D scene understanding model that can recognize a wide range of object classes using fine-grained ...
It introduces a hierarchical point cloud feature extractor that effectively captures both local and global features to acquire comprehensive fine-grained ...
3D open-vocabulary scene understanding aims to recognize arbitrary novel categories beyond the base label space. However, existing works not only fail to ...
Jan 23, 2024 · [2401.11395] UniM-OV3D: Uni-Modality Open-Vocabulary 3D Scene Understanding with Fine-Grained Feature Representation.
This paper presents the first detailed survey of open-vocabulary tasks, including open-vocabulary object detection, open-vocabulary segmentation, and 3D/video ...
To achieve a comprehensive 3D representation with fine-grained details, we introduce a Volumetric Environment Representation (VER), which voxelizes the physical ...
This work proposes to distill knowledge encoded in pretrained vision-language (VL) foundation models through captioning multi-view images from 3D, ...