Rethinking Early-Fusion Strategies for Improved Multispectral Object Detection

Zhang, Xue; Cao, Si-Yuan; Wang, Fang; Zhang, Runmin; Wu, Zhe; Zhang, Xiaohan; Bai, Xiaokai; Shen, Hui-Liang

arXiv:2405.16038v1 (cs)

[Submitted on 25 May 2024 (this version), latest version 19 Sep 2024 (v2)]

Abstract: Most recent multispectral object detectors employ a two-branch structure to extract features from RGB and thermal images. While the two-branch structure achieves better performance than a single-branch structure, it overlooks inference efficiency. This conflict is growing sharper, as recent works pursue higher performance alone rather than balancing performance and efficiency. In this paper, we address this issue by improving the performance of efficient single-branch structures. We revisit the causes of the performance gap between the two structures. For the first time, we reveal the information interference problem in the naive early-fusion strategy adopted by previous single-branch structures. We also find that the domain gap between multispectral images and the weak feature representation of the single-branch structure are key obstacles to performance. To address these three problems, we propose corresponding solutions: a novel shape-priority early-fusion strategy, a weakly supervised learning method, and a core knowledge distillation technique. Experiments demonstrate that single-branch networks equipped with these three contributions achieve significant performance gains while retaining high efficiency. Our code will be available at this https URL.

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2405.16038 [cs.CV]
	(or arXiv:2405.16038v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2405.16038

Submission history

From: Xue Zhang
[v1] Sat, 25 May 2024 03:19:34 UTC (4,182 KB)
[v2] Thu, 19 Sep 2024 02:33:48 UTC (4,281 KB)
