
Sniffer Faster R-CNN ++: An Efficient Camera-LiDAR Object Detector with Proposal Refinement on Fused Candidates

Published: 08 April 2024

Abstract

In this article, we present Sniffer Faster R-CNN++, an efficient camera-LiDAR late-fusion network for low-complexity, accurate object detection in autonomous driving scenarios. The proposed architecture operates on the output candidates of any three-dimensional (3D) detector and the proposals from the region proposal network of any 2D detector to generate the final predictions. Compared with single-modality object detection approaches, fusion-based methods often struggle to integrate dissimilar data. Fusion-based network models are complicated in nature, and they also require large computational overhead, resources, and processing pipelines for training and inference, especially in the early-fusion and deep-fusion approaches. We therefore devise a late-fusion network that incorporates pre-trained, single-modality detectors without change, performing association only at the detection level. In addition, LiDAR-based methods fail to detect distant objects because LiDAR data are sparse, so we devise a proposal refinement algorithm that jointly optimizes detection candidates and assists detection of distant objects. Extensive experiments on both the 3D and 2D detection benchmarks of the challenging KITTI dataset show that the proposed architecture significantly improves detection accuracy while also accelerating detection.
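
To make "association only at the detection level" concrete, here is a minimal sketch of one way such an association could work. It is not the paper's algorithm: it assumes the 3D candidates have already been projected into the image plane as axis-aligned boxes, and it pairs each one with the best-overlapping 2D proposal by intersection-over-union (IoU). The names `iou`, `associate`, and the `min_iou` threshold are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical box format: (x1, y1, x2, y2) in image pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned 2D boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(
    projected_3d: List[Box],
    proposals_2d: List[Box],
    min_iou: float = 0.5,  # assumed threshold, not taken from the paper
) -> List[Tuple[int, int, float]]:
    """Pair each projected 3D candidate with its best-overlapping 2D proposal.

    Returns (index_3d, index_2d, iou) for every candidate whose best
    overlap reaches min_iou; candidates without such a match are skipped.
    """
    pairs = []
    for i, cand in enumerate(projected_3d):
        best_j, best = -1, 0.0
        for j, prop in enumerate(proposals_2d):
            score = iou(cand, prop)
            if score > best:
                best_j, best = j, score
        if best_j >= 0 and best >= min_iou:
            pairs.append((i, best_j, best))
    return pairs


# Illustrative inputs only: two projected 3D candidates, three 2D proposals.
projected = [(0.0, 0.0, 10.0, 10.0), (100.0, 100.0, 120.0, 120.0)]
proposals = [(0.0, 0.0, 10.0, 5.0), (50.0, 50.0, 60.0, 60.0), (1.0, 1.0, 10.0, 10.0)]
pairs = associate(projected, proposals)
```

Because the matching in this sketch touches only box coordinates, neither detector has to be retrained or modified, which is the property the abstract attributes to late fusion. The second candidate above has no overlapping proposal and is left unmatched.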



    Published In

    ACM Journal on Autonomous Transportation Systems  Volume 1, Issue 2
    June 2024
    127 pages
    EISSN:2833-0528
    DOI:10.1145/3613595

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 08 April 2024
    Online AM: 28 October 2023
    Accepted: 25 October 2023
    Revised: 01 September 2023
    Received: 03 May 2023
    Published in JATS Volume 1, Issue 2


    Author Tags

    1. Object detection
    2. late fusion
    3. proposal refinement
    4. candidates fusion
    5. regional proposal network

    Qualifiers

    • Research-article

    Funding Sources

    • National Science Foundation
