Lidar and monocular camera fusion: On-road depth completion for autonomous driving

C Fu, C Mertz, JM Dolan - 2019 IEEE Intelligent Transportation …, 2019 - ieeexplore.ieee.org
2019 IEEE Intelligent Transportation Systems Conference (ITSC), 2019
LIDAR and RGB cameras are commonly used sensors in autonomous vehicles, but each has limitations: LIDAR provides accurate depth but is sparse in both vertical and horizontal resolution, while RGB images provide dense texture but lack depth information. In this paper, we fuse LIDAR and RGB images with a deep neural network that completes the sparse measurements into a dense pixel-wise depth map. The proposed architecture reconstructs the pixel-wise depth map by taking advantage of both the dense color features and the sparse 3D spatial features. We apply early fusion and fine-tune a ResNet model as the encoder. The designed Residual Up-Projection block recovers the spatial resolution of the feature map and captures context within the depth map. We introduce a depth feature tensor that propagates context information from the encoder blocks to the decoder blocks. The proposed method is evaluated on the large-scale indoor NYUdepthV2 dataset and on the KITTI odometry dataset, where it outperforms state-of-the-art single-RGB-image and depth fusion methods. It is also evaluated on a reduced-resolution KITTI dataset that synthesizes the fusion of a planar LIDAR with RGB images.
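The early fusion step and the reduced-resolution LIDAR input mentioned in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not specify the input layout or how the resolution is reduced, so everything here is an assumption: the function names `keep_scan_rows` and `early_fusion` are hypothetical, the depth map is assumed to be already projected into the image plane with 0 meaning "no LIDAR return", and keeping every k-th row is only one possible way to mimic a lower-resolution scanner.

```python
import numpy as np


def keep_scan_rows(depth, keep_every):
    """Keep every `keep_every`-th row of a depth map and zero the rest.

    Assumption: a value of 0 marks a pixel with no LIDAR return.
    """
    reduced = np.zeros_like(depth)
    reduced[::keep_every] = depth[::keep_every]
    return reduced


def early_fusion(rgb, sparse_depth):
    """Concatenate an H x W x 3 RGB image and an H x W depth map
    along the channel axis, giving an H x W x 4 network input."""
    if rgb.shape[:2] != sparse_depth.shape:
        raise ValueError("RGB image and depth map must share height and width")
    rgb = rgb.astype(np.float32) / 255.0  # scale colors to [0, 1]
    depth = sparse_depth.astype(np.float32)[..., np.newaxis]
    return np.concatenate([rgb, depth], axis=-1)


# Synthetic stand-ins for a camera frame and a projected depth map.
rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(8, 10, 3), dtype=np.uint8)
depth = rng.uniform(1.0, 80.0, size=(8, 10)).astype(np.float32)

sparse = keep_scan_rows(depth, keep_every=4)  # rows 0 and 4 survive
fused = early_fusion(rgb, sparse)
print(fused.shape)
```

In a setup like this, the fused four-channel array would be what the encoder receives, which is what distinguishes early fusion from schemes that process color and depth in separate branches before merging them.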