Evaluation of vision transformers for traffic sign classification

Y. Zheng, W. Jiang. Wireless Communications and Mobile Computing, 2022. Wiley Online Library.
Traffic sign recognition is one of the most important tasks in autonomous driving. Camera‐based computer vision techniques have been proposed for this task, and various convolutional neural network architectures have been used and validated on multiple open datasets. Recently, Transformer‐based models have been proposed for various computer vision tasks and have achieved state‐of‐the‐art performance, outperforming convolutional neural networks in several of them. In this study, our goal is to investigate whether the success of Vision Transformers can be replicated in the traffic sign recognition area. Building on existing resources, we first extract and contribute three open traffic sign classification datasets. On these datasets, we experiment with seven convolutional neural networks and five Vision Transformers. We find that Transformers are not as competitive as convolutional neural networks for traffic sign classification: performance gaps of up to 12.81%, 2.01%, and 4.37% exist on the German, Indian, and Chinese traffic sign datasets, respectively. Finally, we offer several suggestions for improving the performance of Transformers.
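To make the experimental setup concrete, the following is a minimal sketch (not the authors' code) of how a convolutional network and a Vision Transformer can be fine-tuned and compared on a traffic sign classification dataset. It assumes the German GTSRB dataset as shipped with torchvision, and picks resnet18 and vit_b_16 purely as illustrative stand-ins for the seven CNNs and five Vision Transformers evaluated in the paper; hyperparameters are placeholders.

```python
# Sketch: fine-tune one CNN and one Vision Transformer on GTSRB and compare test accuracy.
# Assumes torchvision >= 0.13; model choices and hyperparameters are illustrative only.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

NUM_CLASSES = 43  # GTSRB defines 43 traffic sign classes
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

# Both backbones expect 224x224 ImageNet-normalized inputs in this setup.
tf = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
train_set = datasets.GTSRB(root="data", split="train", transform=tf, download=True)
test_set = datasets.GTSRB(root="data", split="test", transform=tf, download=True)

def build(name: str) -> nn.Module:
    """Load an ImageNet-pretrained backbone and swap in a 43-way classification head."""
    if name == "resnet18":                      # convolutional baseline
        m = models.resnet18(weights="IMAGENET1K_V1")
        m.fc = nn.Linear(m.fc.in_features, NUM_CLASSES)
    else:                                       # Vision Transformer
        m = models.vit_b_16(weights="IMAGENET1K_V1")
        m.heads.head = nn.Linear(m.heads.head.in_features, NUM_CLASSES)
    return m.to(DEVICE)

def run(name: str, epochs: int = 3) -> float:
    """Fine-tune the named model and return its top-1 accuracy on the test split."""
    model = build(name)
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
    loss_fn = nn.CrossEntropyLoss()
    train_loader = DataLoader(train_set, batch_size=64, shuffle=True, num_workers=4)
    test_loader = DataLoader(test_set, batch_size=64, num_workers=4)

    model.train()
    for _ in range(epochs):
        for x, y in train_loader:
            x, y = x.to(DEVICE), y.to(DEVICE)
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

    model.eval()
    correct = total = 0
    with torch.no_grad():
        for x, y in test_loader:
            preds = model(x.to(DEVICE)).argmax(dim=1).cpu()
            correct += (preds == y).sum().item()
            total += y.numel()
    return correct / total

for name in ("resnet18", "vit_b_16"):
    print(f"{name}: test accuracy = {run(name):.4f}")
```

Under this kind of protocol, the accuracy gap between the CNN and the ViT on each dataset is what the paper reports; the actual study repeats the comparison across seven CNN and five Transformer architectures and three national datasets.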