DOI:10.1145/3126686.3126727
research-article

Multispectral Object Detection for Autonomous Vehicles

Published: 23 October 2017

Abstract

Research on mobile robot technologies that involve autonomous driving has been active in recent years. To deploy an autonomous mobile robot (e.g., an automated driving vehicle) in traffic, it is necessary to robustly detect various types of objects, such as cars, people, and bicycles, under various conditions, such as daytime and nighttime. In this paper, we propose the use of multispectral images as input for object detection in traffic. Multispectral images consist of RGB, near-infrared, middle-infrared, and far-infrared images, which together provide multifaceted information. For example, some objects that cannot be visually recognized in the RGB image can be detected in the far-infrared image. Training our multispectral object detection system requires a multispectral dataset for object detection in traffic. Since no such dataset currently exists, we created our own for this study. In addition, we propose a multispectral ensemble detection pipeline to make full use of the features of multispectral images. The pipeline is divided into two parts: the single-spectral detection model and the ensemble part. We conducted two experiments. In the first, we evaluated our single-spectral object detection model; the results show that each component of the multispectral image was individually useful for detecting different types of objects. In the second, we evaluated the entire multispectral object detection system and show that its mean average precision (mAP) is 13% higher than that of RGB-only object detection.
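
The abstract does not say how the ensemble part combines the outputs of the single-spectral detectors. As a rough illustration only, the sketch below assumes one plausible scheme: detections from all four spectra are pooled and then filtered with class-wise non-maximum suppression (NMS). The `Detection` type, the `ensemble` function, the 0.5 IoU threshold, and the assumption that all spectral images are registered to a common pixel frame are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in a shared pixel frame


@dataclass(frozen=True)
class Detection:
    """Hypothetical container for one detection from one spectral detector."""
    box: Box
    label: str
    score: float
    spectrum: str  # e.g. "rgb", "nir", "mir", "fir"


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def ensemble(per_spectrum: Dict[str, List[Detection]],
             iou_threshold: float = 0.5) -> List[Detection]:
    """Pool detections from every spectrum and apply class-wise NMS.

    This is an assumed combination rule, not the paper's method.
    """
    pooled = sorted(
        (d for dets in per_spectrum.values() for d in dets),
        key=lambda d: d.score,
        reverse=True,
    )
    kept: List[Detection] = []
    for det in pooled:
        # Keep a detection unless a higher-scoring one of the same class
        # already covers roughly the same region.
        if all(det.label != k.label or iou(det.box, k.box) < iou_threshold
               for k in kept):
            kept.append(det)
    return kept


# Made-up inputs: a car seen in both RGB and far-infrared, and a person
# seen only in far-infrared.
example = {
    "rgb": [Detection((0, 0, 10, 10), "car", 0.6, "rgb")],
    "fir": [
        Detection((1, 1, 10, 10), "car", 0.9, "fir"),
        Detection((20, 20, 25, 30), "person", 0.8, "fir"),
    ],
}
merged = ensemble(example)
```

With these made-up inputs, the two overlapping car boxes collapse into the higher-scoring far-infrared one, while the person found only in far-infrared is kept. This mirrors the abstract's point that some objects missed in RGB can still be detected in another spectrum.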

Published In

Thematic Workshops '17: Proceedings of the on Thematic Workshops of ACM Multimedia 2017
October 2017
558 pages
ISBN:9781450354165
DOI:10.1145/3126686

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. autonomous vehicles
  2. computer vision
  3. deep learning
  4. infrared images
  5. multispectral images
  6. object detection

Qualifiers

  • Research-article

Conference

MM '17
MM '17: ACM Multimedia Conference
October 23 - 27, 2017
Mountain View, California, USA
