research-article

AutoMatch: Leveraging Traffic Camera to Improve Perception and Localization of Autonomous Vehicles

Authors:

Chen Pan

SenSys '22: Proceedings of the 20th ACM Conference on Embedded Networked Sensor Systems

Pages 16 - 30

https://doi.org/10.1145/3560905.3568519

Published: 24 January 2023

Abstract

Traffic cameras are among the most ubiquitous traffic facilities, providing high coverage of complex, accident-prone road sections such as intersections. This work leverages traffic cameras to improve the perception and localization performance of autonomous vehicles at intersections. In particular, vehicles can expand their range of perception by matching images captured by traffic cameras against images from on-vehicle cameras. Moreover, a traffic camera can match its images to an existing high-definition map (HD map) to derive centimeter-level locations of the vehicles in its field of view. To this end, we propose AutoMatch, a novel system for real-time image registration, which is a key enabling technology for traffic camera-assisted perception and localization of autonomous vehicles. Our key idea is to leverage landmark keypoints of distinctive structures, such as ground signs at intersections, to facilitate image registration between traffic cameras and HD maps or vehicles. By exploiting the strong structural characteristics of ground signs, AutoMatch extracts very few but precise landmark keypoints for registration, which effectively reduces the communication and compute overhead. We implement AutoMatch on a testbed consisting of a self-built autonomous car, drones for surveying and mapping, and real traffic cameras. In addition, we collect two new multi-view traffic image datasets at intersections, which contain images from 220 real operational traffic cameras in 22 cities. Experimental results show that AutoMatch achieves pixel-level image registration accuracy within 88 milliseconds, and delivers an 11.7× improvement in accuracy, a 1.4× speedup in compute time, and a 17.1× saving in data transmission over existing approaches.
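To illustrate why a handful of precise landmark keypoints can suffice for registration, the sketch below fits a planar homography between two views from matched 2D points using the standard direct linear transform (DLT). This is a generic textbook technique and not necessarily the method used by AutoMatch; it assumes the ground signs lie on a roughly planar road surface, and the function names `estimate_homography` and `project` and all coordinates are hypothetical.

```python
import numpy as np


def estimate_homography(src, dst):
    """Fit a 3x3 homography H with dst ~ H @ src using the DLT.

    src, dst: sequences of matched (x, y) points, at least 4 pairs.
    Illustrative only: no coordinate normalization or outlier rejection.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    if src.shape != dst.shape or src.ndim != 2 or src.shape[1] != 2:
        raise ValueError("src and dst must both have shape (N, 2)")
    if len(src) < 4:
        raise ValueError("need at least 4 matched points")

    # Each correspondence contributes two linear constraints on the
    # nine entries of H.
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])

    # The solution is the right singular vector associated with the
    # smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    h = vt[-1].reshape(3, 3)
    # Fix the scale; assumes h[2, 2] is not (near) zero.
    return h / h[2, 2]


def project(h, pts):
    """Map (x, y) points through homography h."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ h.T
    return mapped[:, :2] / mapped[:, 2:3]


# Hypothetical keypoints of a ground sign seen in one view (pixels) ...
camera_pts = [[0, 0], [100, 0], [100, 50], [0, 50], [40, 20]]
# ... and their positions in a second view, generated here from a
# made-up ground-truth homography so the fit can be checked.
h_true = np.array([[1.2, 0.1, 5.0],
                   [-0.05, 0.9, -3.0],
                   [1e-4, 2e-4, 1.0]])
map_pts = project(h_true, camera_pts)

h_est = estimate_homography(camera_pts, map_pts)
```

Only the matched keypoint coordinates are needed to fit the transform, which is consistent with the abstract's point that sharing a few landmark keypoints is far cheaper than sharing full images or dense features. A practical system would add a robust estimator such as RANSAC to cope with mismatched keypoints.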


Cited By

Li X, Liu S, Zhou Z, Guo B, Xu Y, Yu Z. (2024). EchoPFL. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8:1 (1-22). Online publication date: 6-Mar-2024.
https://dl.acm.org/doi/10.1145/3643560

Index Terms

AutoMatch: Leveraging Traffic Camera to Improve Perception and Localization of Autonomous Vehicles
1. Computer systems organization
  1. Embedded and cyber-physical systems
    1. Sensor networks



Information

Published In


SenSys '22: Proceedings of the 20th ACM Conference on Embedded Networked Sensor Systems

November 2022

1280 pages

ISBN:9781450398862

DOI:10.1145/3560905

General Chairs: Jeremy Gummeson (University of Massachusetts Amherst), Sunghoon Ivan Lee (University of Massachusetts Amherst)

Program Chairs: Jie Gao (Rutgers University), Guoliang Xing (The Chinese University of Hong Kong)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]


Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Research-article

Funding Sources

Innovation and Technology Commission - The Government of the Hong Kong Special Administrative Region of the People's Republic of China
Centre for Perceptual and Interactive Intelligence

Conference

SenSys '22: The 20th ACM Conference on Embedded Networked Sensor Systems

November 6 - 9, 2022

Boston, Massachusetts

Acceptance Rates

Overall Acceptance Rate 122 of 680 submissions, 18%



Bibliometrics

Article Metrics

Total Citations: 1
Total Downloads: 426
Downloads (Last 12 months): 287
Downloads (Last 6 weeks): 19

Reflects downloads up to 01 Sep 2024

