RepVF: A Unified Vector Fields Representation for Multi-task 3D Perception

Li, Chunliang; Han, Wencheng; Yin, Junbo; Zhao, Sanyuan; Shen, Jianbing

arXiv:2407.10876 (cs)

[Submitted on 15 Jul 2024 (v1), last revised 20 Jul 2024 (this version, v2)]

Abstract: Concurrent processing of multiple autonomous driving 3D perception tasks within the same spatiotemporal scene poses a significant challenge, in particular due to the computational inefficiencies and feature competition between tasks when using traditional multi-task learning approaches. This paper addresses these issues by proposing a novel unified representation, RepVF, which harmonizes the representation of various perception tasks such as 3D object detection and 3D lane detection within a single framework. RepVF characterizes the structure of different targets in the scene through a vector field, enabling a single-head, multi-task learning model that significantly reduces computational redundancy and feature competition. Building upon RepVF, we introduce RFTR, a network designed to exploit the inherent connections between different tasks by utilizing a hierarchical structure of queries that implicitly model the relationships both between and within tasks. This approach eliminates the need for task-specific heads and parameters, fundamentally reducing the conflicts inherent in traditional multi-task learning paradigms. We validate our approach by combining labels from the OpenLane dataset with the Waymo Open dataset. Our work presents a significant advancement in the efficiency and effectiveness of multi-task perception in autonomous driving, offering a new perspective on handling multiple 3D perception tasks synchronously and in parallel. The code will be available at: this https URL

Comments:	Accepted by ECCV 2024
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2407.10876 [cs.CV]
	(or arXiv:2407.10876v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2407.10876

Submission history

From: Chunliang Li
[v1] Mon, 15 Jul 2024 16:25:07 UTC (3,898 KB)
[v2] Sat, 20 Jul 2024 15:46:44 UTC (3,898 KB)
