Search Results (262)

Search Parameters:
Keywords = monocular detection

15 pages, 3461 KiB  
Article
Accurate 3D to 2D Object Distance Estimation from the Mapped Point Cloud Data
by Saidrasul Usmankhujaev, Shokhrukh Baydadaev and Jang Woo Kwon
Sensors 2023, 23(4), 2103; https://doi.org/10.3390/s23042103 - 13 Feb 2023
Cited by 7 | Viewed by 5278
Abstract
Distance estimation using only a monocular camera is one of the oldest and most challenging tasks in computer vision. It is difficult owing to occlusions, noise, and variations in the lighting, texture, and shape of objects; the motion of the camera and of objects in the scene can further degrade accuracy. Various techniques have been proposed to overcome these challenges, including stereo matching, structured light, depth from focus, depth from defocus, depth from motion, and time of flight. Adding information from a high-resolution 3D view of the surroundings simplifies the distance calculation. This paper describes a novel distance estimation method that operates on converted point cloud data: a reliable map-based bird's-eye view (BEV) from which the distance to detected objects is calculated. With the help of the Euler region proposal network (E-RPN) model, we propose a LiDAR-to-image method for metric distance estimation that projects 3D bounding boxes onto the image. We demonstrate that, although the BEV representation generally struggles with features related to the height coordinate, it is possible to extract all parameters characterizing the objects' bounding boxes, including their height and elevation. Finally, we apply triangulation to calculate the accurate distance to the objects and show statistically that our methodology is among the best in terms of accuracy and robustness.
(This article belongs to the Special Issue The Intelligent Sensing Technology of Transportation System)
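To make the BEV-to-distance step concrete, here is a minimal Python sketch of how a detected box centre in a BEV grid maps back to a metric distance. The grid size and coverage (608 cells, 50 m, ego vehicle at the bottom centre) are placeholder assumptions, not values from the paper:

```python
import numpy as np

# Hypothetical BEV grid parameters (our assumptions, not the paper's):
# a 608x608 map covering 50 m forward and +/-25 m laterally.
BEV_H, BEV_W = 608, 608
X_RANGE, Y_RANGE = 50.0, 50.0   # metres covered by the grid

def bev_cell_to_metric(row: int, col: int) -> tuple[float, float]:
    """Convert a BEV grid cell (row, col) to metric (x, y) in the ego frame."""
    x = (BEV_H - row) * X_RANGE / BEV_H          # forward distance
    y = (col - BEV_W / 2) * Y_RANGE / BEV_W      # lateral offset
    return x, y

def distance_to_object(row: int, col: int) -> float:
    """Euclidean ground-plane distance to a detected box centre."""
    x, y = bev_cell_to_metric(row, col)
    return float(np.hypot(x, y))
```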

13 pages, 4336 KiB  
Article
Detect Orientation of Symmetric Objects from Monocular Camera to Enhance Landmark Estimations in Object SLAM
by Zehua Fang, Jinglin Han and Wei Wang
Appl. Sci. 2023, 13(4), 2096; https://doi.org/10.3390/app13042096 - 6 Feb 2023
Cited by 1 | Viewed by 1638
Abstract
Object simultaneous localization and mapping (SLAM) introduces object-level landmarks into the map and helps robots further perceive their surroundings. As one of the most popular landmark representations, the ellipsoid has a compact mathematical expression and can represent the space occupied by objects with high accuracy. However, the orientations of ellipsoid approximations often fail to coincide with the orientations of the objects themselves. To further improve the performance of object SLAM systems with ellipsoid landmarks, we propose a strategy that first extracts the orientations of symmetric human-made objects in a single frame and then uses the resulting orientations as a back-end constraint factor on the ellipsoid landmarks. Experimental results show that, compared with the baseline, the proposed orientation detection method reduces the orientation error by more than 46.5% on most tested datasets and improves mapping accuracy. The average translation, rotation, and shape errors improved by 63.4%, 61.7%, and 42.4%, respectively, compared with quadric-SLAM. With only 9 ms of additional time cost per frame, the object SLAM system integrated with our proposed method still runs in real time.
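As background on ellipsoid landmarks, the sketch below builds the standard dual-quadric form used by quadric-SLAM-style systems; the orientation R is exactly the quantity the paper's back-end constraint corrects. This is a generic illustration, not the authors' code:

```python
import numpy as np

# Generic dual-quadric construction for an ellipsoid landmark (background
# illustration only): centre t, half-axis lengths abc, rotation R.
def dual_quadric(R: np.ndarray, t: np.ndarray, abc: np.ndarray) -> np.ndarray:
    T = np.eye(4)               # homogeneous pose of the ellipsoid
    T[:3, :3] = R
    T[:3, 3] = t
    # Q* = T diag(a^2, b^2, c^2, -1) T^T
    return T @ np.diag([*(abc ** 2), -1.0]) @ T.T
```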

20 pages, 2499 KiB  
Article
Monocular Facial Presentation–Attack–Detection: Classifying Near-Infrared Reflectance Patterns
by Ali Hassani, Jon Diedrich and Hafiz Malik
Appl. Sci. 2023, 13(3), 1987; https://doi.org/10.3390/app13031987 - 3 Feb 2023
Cited by 2 | Viewed by 2195
Abstract
This paper presents a novel material-spectroscopy approach to facial presentation-attack defense (PAD). Best-in-class PAD methods typically detect artifacts in 3D space. This paper proposes that similar features can be obtained in a monocular, single-frame approach by using controlled light. A mathematical model shows how live faces and their spoof counterparts produce unique reflectance patterns due to geometry and albedo. A rigorous dataset was collected to evaluate this proposal: 30 diverse adults and their spoofs (paper mask, display replay, spandex mask, and COVID mask) under varied pose, position, and lighting, for 80,000 unique frames. A panel of 13 texture classifiers is then benchmarked to verify the hypothesis. The experimental results are excellent: the material-spectroscopy process enables a conventional MobileNetV3 network to achieve a 0.8% average classification error rate, outperforming the selected state-of-the-art algorithms. This demonstrates that the proposed imaging methodology generates extremely robust features.
(This article belongs to the Special Issue Application of Biometrics Technology in Security)
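The geometry-and-albedo intuition can be illustrated with a simple Lambertian point-light model; the paper derives its own, more complete model, so the function below is only an assumed approximation:

```python
import numpy as np

# Assumed Lambertian point-light approximation (not the paper's model):
# intensity falls with surface angle and distance, so a flat spoof and a
# curved live face reflect the controlled NIR light differently.
def nir_intensity(albedo: float, normal: np.ndarray, light_dir: np.ndarray,
                  distance: float) -> float:
    cos_theta = max(float(np.dot(normal, light_dir)), 0.0)  # incidence angle
    return albedo * cos_theta / distance ** 2               # inverse-square falloff
```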

16 pages, 3411 KiB  
Article
Monocular 3D Object Detection Based on Pseudo Multimodal Information Extraction and Keypoint Estimation
by Dan Zhao, Chaofeng Ji and Guizhong Liu
Appl. Sci. 2023, 13(3), 1731; https://doi.org/10.3390/app13031731 - 29 Jan 2023
Cited by 2 | Viewed by 2491
Abstract
Three-dimensional object detection is an essential and fundamental task in computer vision, with wide application in scenarios such as autonomous driving and visual navigation. Given the insufficient utilization of image information in current monocular-camera-based 3D object detection algorithms, we propose a monocular 3D object detection algorithm based on pseudo-multimodal information extraction and keypoint estimation. We use the original image to generate a pseudo-LiDAR point cloud and a bird's-eye view, feed the fused original-image and pseudo-LiDAR data to a keypoint-based network for an initial 3D box estimate, and finally use the bird's-eye view to refine the initial 3D box. Our method exceeds state-of-the-art algorithms under the 3D object detection and localization evaluation criteria on the KITTI dataset, achieving the best experimental performance so far.
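Pseudo-LiDAR generation from a monocular depth map follows a standard back-projection through the camera intrinsics. A generic sketch (not the authors' implementation), with fx, fy, cx, cy taken from calibration:

```python
import numpy as np

# Standard pseudo-LiDAR back-projection: lift every pixel (u, v) with
# depth z to a 3D point using the camera intrinsics.
def depth_to_pseudo_lidar(depth: np.ndarray, fx: float, fy: float,
                          cx: float, cy: float) -> np.ndarray:
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinate grids
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)  # N x 3 point cloud
```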

21 pages, 658 KiB  
Article
DPO: Direct Planar Odometry with Stereo Camera
by Filipe C. A. Lins, Nicolas S. Rosa, Valdir Grassi, Adelardo A. D. Medeiros and Pablo J. Alsina
Sensors 2023, 23(3), 1393; https://doi.org/10.3390/s23031393 - 26 Jan 2023
Cited by 2 | Viewed by 2612
Abstract
State-of-the-art direct visual odometry (VO) methods essentially rely on points to estimate the camera pose and reconstruct the environment. Direct Sparse Odometry (DSO) became the standard technique, and many approaches have been developed from it. Only recently, however, have two monocular plane-based DSOs been presented. The first uses a learning-based plane estimator to generate coarse planes as input for optimization; when these coarse estimates are too far from the minimum, the optimization may fail, so the entire system depends on the quality of the plane predictions and is restricted to the training-data domain. The second only detects planes in vertical and horizontal orientations, as being more adequate for structured environments. To the best of our knowledge, we propose the first stereo plane-based VO inspired by the DSO framework. Differing from the above-mentioned methods, our approach uses planes alone as features in the sliding-window optimization and uses a dual quaternion as the pose parameterization. The conducted experiments showed that our method performs similarly to Stereo DSO, a point-based approach.
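For reference, the dual-quaternion pose parameterization mentioned above combines a rotation quaternion q_r and a translation t into real and dual parts. A minimal sketch using the textbook formula q_d = 0.5 * t * q_r, not DPO's actual code:

```python
import numpy as np

# Hamilton product of two quaternions in (w, x, y, z) order.
def quat_mul(a, b):
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

# Pose (q_r, t) -> dual quaternion q = q_r + eps * q_d, q_d = 0.5 * t_quat * q_r.
def pose_to_dual_quaternion(q_r, t):
    t_quat = np.array([0.0, *t])        # pure quaternion from translation
    q_d = 0.5 * quat_mul(t_quat, q_r)
    return q_r, q_d                     # real and dual parts
```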

23 pages, 41307 KiB  
Article
Recognizing and Recovering Ball Motion Based on Low-Frame-Rate Monocular Camera
by Wendi Zhang, Yin Zhang, Yuli Zhao and Bin Zhang
Appl. Sci. 2023, 13(3), 1513; https://doi.org/10.3390/app13031513 - 23 Jan 2023
Viewed by 1664
Abstract
Reconstructing sphere motion is an essential part of indoor screen-based ball sports. Current sphere recognition techniques require expensive high-precision equipment and complex field deployment, which limits their application. This paper proposes a novel method for recognizing and recovering sphere motion based on a low-frame-rate monocular camera. The method captures ball motion streaks in the input images, reconstructs the trajectories in space, and then estimates the ball speed. We evaluated the effectiveness of the streak detection method and obtained an F1-score of 0.97. We also compared the proposed trajectory reconstruction method with existing methods, and it outperformed the compared techniques.
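The streak idea admits a back-of-envelope speed estimate: during one exposure the ball travels roughly the streak's length. A hypothetical helper (the paper's full method reconstructs the 3D trajectory first):

```python
# Back-of-envelope speed from a motion streak (illustration only): the ball
# travels the streak length during one exposure interval.
def speed_from_streak(streak_len_px: float, metres_per_px: float,
                      exposure_s: float) -> float:
    return streak_len_px * metres_per_px / exposure_s  # metres per second
```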

18 pages, 12953 KiB  
Article
Vision-Based Autonomous Following of a Moving Platform and Landing for an Unmanned Aerial Vehicle
by Jesús Morales, Isabel Castelo, Rodrigo Serra, Pedro U. Lima and Meysam Basiri
Sensors 2023, 23(2), 829; https://doi.org/10.3390/s23020829 - 11 Jan 2023
Cited by 23 | Viewed by 4859
Abstract
Interest in Unmanned Aerial Vehicles (UAVs) has increased due to their versatility and variety of applications; however, their limited battery life constrains those applications. Heterogeneous multi-robot systems can offer a solution by allowing an Unmanned Ground Vehicle (UGV) to serve as a recharging station for the aerial robot. Moreover, cooperation between aerial and terrestrial robots allows them to overcome other individual limitations, such as communication-link coverage or accessibility, and to solve highly complex tasks, e.g., environment exploration, infrastructure inspection, or search and rescue. This work proposes a vision-based approach that enables an aerial robot to autonomously detect, follow, and land on a mobile ground platform. ArUco fiducial markers are used to estimate the relative pose between the UAV and UGV by processing RGB images from a monocular camera on board the UAV. The pose estimate is fed to a trajectory planner and four decoupled controllers to generate speed set-points for the UAV. Using a cascade-loop strategy, these set-points are then sent to the UAV autopilot for inner-loop control. The proposed solution has been tested both in simulation, with a digital twin of a solar farm using ROS, Gazebo, and ArduPilot Software-in-the-Loop (SiL), and in the real world at IST Lisbon's outdoor facilities, with a UAV built on the basis of a DJ550 hexacopter and a modified Jackal ground robot from DJI and Clearpath Robotics, respectively. Pose estimation, trajectory planning, and speed set-points are computed on board the UAV on a Single Board Computer (SBC) running Ubuntu and ROS, without the need for external infrastructure.
(This article belongs to the Special Issue Sensors for Smart Vehicle Applications)
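Marker-based relative pose estimation of this kind is typically a few OpenCV calls. A sketch using the classic cv2.aruco API (pre-4.7 signatures); the dictionary choice and marker size are placeholders, and camera_matrix and dist_coeffs come from prior calibration:

```python
import cv2

MARKER_SIZE_M = 0.15   # printed marker side length (placeholder value)
ARUCO_DICT = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)

def ugv_pose_from_frame(gray, camera_matrix, dist_coeffs):
    corners, ids, _ = cv2.aruco.detectMarkers(gray, ARUCO_DICT)
    if ids is None:
        return None                                  # no marker in view
    rvecs, tvecs, _ = cv2.aruco.estimatePoseSingleMarkers(
        corners, MARKER_SIZE_M, camera_matrix, dist_coeffs)
    return rvecs[0], tvecs[0]                        # marker pose in camera frame
```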

34 pages, 1115 KiB  
Article
Autonomous Unmanned Aerial Vehicles in Bushfire Management: Challenges and Opportunities
by Shouthiri Partheepan, Farzad Sanati and Jahan Hassan
Drones 2023, 7(1), 47; https://doi.org/10.3390/drones7010047 - 10 Jan 2023
Cited by 37 | Viewed by 13614
Abstract
The intensity and frequency of bushfires have increased significantly in recent years, destroying property and living species. Unmanned aerial vehicle (UAV) technology is becoming increasingly popular in bushfire management systems because of characteristics such as manoeuvrability, autonomy, ease of deployment, and low cost. UAVs with remote-sensing capabilities are used with artificial intelligence, machine-learning, and deep-learning algorithms to detect fire regions, make predictions and decisions, and optimize fire-monitoring tasks. Moreover, UAVs equipped with various advanced sensors, including LIDAR and visual, infrared (IR), and monocular cameras, have been used to monitor bushfires due to their potential to provide new approaches and research opportunities. This review focuses on the use of UAVs in bushfire management for fire detection, fire prediction, autonomous navigation, obstacle avoidance, and search and rescue, with the goals of improving the accuracy of fire prediction and minimizing the impact of bushfires on people and nature. The objective is to provide information on UAV-based bushfire management systems and on machine-learning approaches for predicting and effectively responding to bushfires in inaccessible areas using intelligent autonomous UAVs, and to examine the benefits and limitations of existing UAV techniques for bushfire handling. We conclude that, despite the potential benefits of UAVs for bushfire management, shortcomings in accuracy remain, and solutions need to be optimized for effective bushfire management.
(This article belongs to the Section Drones in Agriculture and Forestry)

15 pages, 5429 KiB  
Article
A Novel Monocular Vision Technique for the Detection of Electric Transmission Tower Tilting Trend
by Yongsheng Yang, Minzhen Wang, Xinheng Wang, Cheng Li, Ziwen Shang and Liying Zhao
Appl. Sci. 2023, 13(1), 407; https://doi.org/10.3390/app13010407 - 28 Dec 2022
Cited by 7 | Viewed by 1746
Abstract
Transmission lines are primarily deployed overhead, and the transmission tower, acting as the fulcrum, can be affected by unbalanced wire forces and extreme weather, resulting in tower tilt, deformation, or collapse. This can jeopardize the safe operation of the power grid and even cause widespread failures, resulting in significant economic losses. Given the limitations of current tower-tilt detection methods, this paper proposes a tilt detection and analysis method based on monocular vision images. A monocular camera collects the profile and contour features of the tower, which are combined with a tower-tilt model to calculate and analyse the tilt. With this improved monocular visual monitoring method, the perception accuracy of the tower tilt is improved by 7.5%, and the axial eccentricity is accurate to ±2 mm. The method is reliable in real time and simple to operate, significantly reducing staff inspection workload and helping the power system operate safely and efficiently.
(This article belongs to the Topic Smart Energy)
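One simple way to quantify tilt from extracted contour features, offered only as an illustration of the idea rather than the paper's model, is to fit a line through the tower's centreline pixels and measure its deviation from vertical:

```python
import numpy as np

# Illustrative tilt computation (not the paper's model): fit a line through
# centreline pixels extracted from the tower contour.
def tilt_angle_deg(axis_points: np.ndarray) -> float:
    # axis_points: N x 2 array of (x, y) pixels along the tower centreline
    x, y = axis_points[:, 0], axis_points[:, 1]
    slope = np.polyfit(y, x, 1)[0]        # horizontal drift per vertical pixel
    return float(np.degrees(np.arctan(slope)))
```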

18 pages, 5122 KiB  
Article
An Embedded Framework for Fully Autonomous Object Manipulation in Robotic-Empowered Assisted Living
by Giovanni Mezzina and Daniela De Venuto
Sensors 2023, 23(1), 103; https://doi.org/10.3390/s23010103 - 22 Dec 2022
Viewed by 2186
Abstract
Most humanoid social robots currently in circulation are designed only for verbal and animated interactions with users, and despite being equipped with two upper arms for interactive animation, they lack object manipulation capabilities. In this paper, we propose the MONOCULAR (eMbeddable autONomous ObjeCt manipULAtion Routines) framework, which implements a set of routines that add manipulation functionality to social robots by fusing the data of two RGB cameras and a 3D depth sensor placed in the head frame. The framework is designed to: (i) localize specific objects to be manipulated via the RGB cameras; (ii) characterize the shelf on which they are placed; and (iii) autonomously adapt approach and manipulation routines to avoid collisions and maximize grabbing accuracy. To localize an item on the shelf, MONOCULAR exploits an embeddable version of the You Only Look Once (YOLO) object detector. The RGB camera outcomes are also used to estimate the height of the shelf using an edge detection algorithm. Based on the item's position and the estimated shelf height, MONOCULAR selects between two routines that dynamically optimize the approach and object manipulation parameters according to real-time analysis of the RGB and 3D sensor frames; the two routines are optimized for a central or lateral approach to objects on a shelf. The MONOCULAR procedures are designed to be fully automatic, intrinsically protecting sensitive user data and stored home or hospital maps. MONOCULAR was optimized for Pepper by SoftBank Robotics. To characterize the proposed system, a case study in which Pepper is used as a drug delivery operator is presented, divided into: (i) pharmaceutical package search; (ii) object approach and manipulation; and (iii) delivery operations. Experimental data showed that the object manipulation routine for laterally placed objects achieves a best grabbing success rate of 96%, while the routine for centrally placed objects reaches 97% across a wide range of shelf heights. Finally, a proof of concept demonstrates the applicability of the MONOCULAR framework in a real-life scenario.
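The shelf-height step can be approximated with standard edge and line detection; the sketch below (our reconstruction, with arbitrary thresholds) finds the topmost long horizontal line as a proxy for the shelf's front edge:

```python
import cv2
import numpy as np

# Our reconstruction of the shelf-edge idea (not the authors' code): detect
# long, nearly horizontal lines and return the topmost image row.
def shelf_edge_row(bgr):
    edges = cv2.Canny(cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY), 50, 150)
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=80,
                            minLineLength=120, maxLineGap=10)
    if lines is None:
        return None
    rows = [min(y1, y2) for x1, y1, x2, y2 in (l[0] for l in lines)
            if abs(y2 - y1) < 5]          # keep near-horizontal lines only
    return min(rows, default=None)        # smallest row index = highest edge
```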

14 pages, 4780 KiB  
Article
Vehicle Distance Estimation from a Monocular Camera for Advanced Driver Assistance Systems
by Seungyoo Lee, Kyujin Han, Seonyeong Park and Xiaopeng Yang
Symmetry 2022, 14(12), 2657; https://doi.org/10.3390/sym14122657 - 15 Dec 2022
Cited by 11 | Viewed by 6290
Abstract
The purpose of this study is to propose a framework for accurate and efficient vehicle distance estimation from a monocular camera. The proposed framework consists of a transformer-based object detector, a transformer-based depth estimator, and a distance predictor. The object detector detects various objects, which are mostly symmetrical, in an image captured by the monocular camera and provides the type of each object and the coordinates of a bounding box around it. The depth estimator generates a depth map for the image. The bounding boxes are then overlaid on the depth map to extract the depth features of each object, such as its mean, minimum, and maximum depth. Three models (eXtreme Gradient Boosting, Random Forest, and Long Short-Term Memory) were trained to predict the actual distance between the object and the camera based on the object's type, its bounding box coordinates and size, and the extracted depth features. We also propose including the trimmed mean depth of an object, which excludes the background pixels that fall inside the bounding box, as a predictor of the actual distance. The evaluation results show that the proposed framework outperformed existing studies.
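The depth-feature extraction, including the proposed trimmed mean, is straightforward to sketch. A generic implementation with an assumed 10% trim fraction (the exact fraction is not stated in the abstract):

```python
import numpy as np

# Generic sketch of the depth-feature extraction described above; the 10%
# trim fraction is our assumption.
def depth_features(depth_map: np.ndarray, box, trim: float = 0.1) -> dict:
    x1, y1, x2, y2 = box                       # detector bounding box (pixels)
    d = np.sort(depth_map[y1:y2, x1:x2].ravel())
    k = int(len(d) * trim)
    trimmed = d[k:len(d) - k] if k > 0 else d  # drop likely background pixels
    return {"mean": float(d.mean()), "min": float(d.min()),
            "max": float(d.max()), "trimmed_mean": float(trimmed.mean())}
```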

16 pages, 78378 KiB  
Article
A Comparison of Deep Neural Networks for Monocular Depth Map Estimation in Natural Environments Flying at Low Altitude
by Alexandra Romero-Lugo, Andrea Magadan-Salazar, Jorge Fuentes-Pacheco and Raúl Pinto-Elías
Sensors 2022, 22(24), 9830; https://doi.org/10.3390/s22249830 - 14 Dec 2022
Cited by 4 | Viewed by 3859
Abstract
Currently, the use of Unmanned Aerial Vehicles (UAVs) in natural and complex environments has been increasing, because they are appropriate and affordable solutions for supporting tasks such as rescue, forestry, and agriculture by collecting and analyzing high-resolution monocular images. Autonomous navigation at low altitude is an important area of research, as it would allow monitoring parts of a crop that are occluded by foliage or by other plants. This task is difficult due to the large number of obstacles that may be encountered in the drone's path. Generating high-quality depth maps is one alternative for providing real-time obstacle detection and collision avoidance for autonomous UAVs. In this paper, we present a comparative analysis of four supervised-learning deep neural networks, and a combination of two of them, for monocular depth map estimation on images captured at low altitude in simulated natural environments. Our results show that the Boosting Monocular network performs best in terms of depth map accuracy because of its ability to process the same image at multiple scales, avoiding the loss of fine details.
(This article belongs to the Section Remote Sensors)
12 pages, 3268 KiB  
Article
A 3D Scene Information Enhancement Method Applied in Augmented Reality
by Bo Li, Xiangfeng Wang, Qiang Gao, Zhimei Song, Cunyu Zou and Siyuan Liu
Electronics 2022, 11(24), 4123; https://doi.org/10.3390/electronics11244123 - 10 Dec 2022
Cited by 1 | Viewed by 1449
Abstract
To address the problem that small planes with unobvious texture are easily missed in augmented reality scenes, a 3D scene information enhancement method that grabs the planes in an augmented reality scene is proposed, based on a series of images of a real scene taken by a monocular camera. First, we extract feature points from the images. Second, we match the feature points across images and build three-dimensional sparse point cloud data of the scene from the feature points and the camera's internal parameters. Third, we estimate the position and size of the planes from the sparse point cloud; these planes can provide extra structural information for augmented reality. This paper proposes an optimized feature point extraction and matching algorithm based on the Scale-Invariant Feature Transform (SIFT) and establishes a fast spatial plane recognition method based on RANdom SAmple Consensus (RANSAC). Experiments show that the method achieves higher accuracy than Oriented FAST and Rotated BRIEF (ORB), Binary Robust Invariant Scalable Keypoints (BRISK), and SuperPoint. The proposed method effectively addresses the problem of missed detection of planar faces in ARCore and improves the integration of virtual objects with real scenes.
(This article belongs to the Section Computer Science & Engineering)
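The plane recognition stage rests on RANSAC over a sparse point cloud. A minimal, generic plane-fitting loop (not the paper's optimized variant; iteration count and inlier threshold are placeholder values):

```python
import numpy as np

# Generic RANSAC plane fit: repeatedly fit a plane to 3 random points and
# keep the hypothesis with the most inliers.
def ransac_plane(points, n_iters=500, thresh=0.01,
                 rng=np.random.default_rng(0)):
    best_plane, best_inliers = None, 0
    for _ in range(n_iters):
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)        # plane normal from 3 points
        norm = np.linalg.norm(n)
        if norm < 1e-9:                       # degenerate (collinear) sample
            continue
        n = n / norm
        dist = np.abs((points - p0) @ n)      # point-to-plane distances
        inliers = int((dist < thresh).sum())
        if inliers > best_inliers:
            best_plane, best_inliers = (n, p0), inliers
    return best_plane, best_inliers
```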

15 pages, 6436 KiB  
Article
A New Parallel Intelligence Based Light Field Dataset for Depth Refinement and Scene Flow Estimation
by Yu Shen, Yuhang Liu, Yonglin Tian, Zhongmin Liu and Feiyue Wang
Sensors 2022, 22(23), 9483; https://doi.org/10.3390/s22239483 - 4 Dec 2022
Cited by 4 | Viewed by 2403
Abstract
Computer vision tasks such as motion estimation, depth estimation, and object detection are better suited to light field images, which carry more structural information than traditional 2D monocular images. However, since the costly data acquisition instruments are difficult to calibrate, real-world light field images are hard to obtain. The majority of currently available static light field datasets are modest in size and cannot be used by methods such as transformers to fully leverage local and global correlations. Additionally, studies on dynamic situations, such as object tracking and motion estimation based on 4D light field images, have been rare, although superior performance can be anticipated there. In this paper, we first propose a new static light field dataset that contains up to 50 scenes, with 8 to 10 perspectives per scene, and ground truth including disparities, depths, surface normals, segmentations, and object poses. This dataset is larger in scale than current mainstream datasets for depth estimation refinement, and it focuses on indoor and some outdoor scenarios. Second, to provide optical flow ground truth indicating the 3D motion of objects, beyond what can be obtained in static scenes, and thereby enable more precise pixel-level motion estimation, we release a light field scene flow dataset with dense 3D motion ground truth for pixels; each scene has 150 frames. Third, using DistgDisp and DistgASR, which decouple the angular and spatial domains of the light field, we perform disparity estimation and angular super-resolution to evaluate the performance of our dataset. Experimental results demonstrate the performance and potential of our dataset in disparity estimation and angular super-resolution.
(This article belongs to the Special Issue Intelligent Monitoring, Control and Optimization in Industries 4.0)

13 pages, 1013 KiB  
Article
Multi-Vehicle Tracking Based on Monocular Camera in Driver View
by Pengfei Lyu, Minxiang Wei and Yuwei Wu
Appl. Sci. 2022, 12(23), 12244; https://doi.org/10.3390/app122312244 - 30 Nov 2022
Cited by 3 | Viewed by 1868
Abstract
Multi-vehicle tracking is used in advanced driver assistance systems to track obstacles, which is fundamental for higher-level tasks. It requires real-time performance while dealing with illumination variations and object deformation. To this end, we propose a novel multi-vehicle tracking algorithm based on a monocular camera in the driver's view. It follows the tracking-by-detection paradigm and integrates detection and appearance descriptors into a single network. The one-stage detector consists of a backbone, a modified BiFPN as the neck layer, and three prediction heads. Data association combines a two-step matching strategy with a Kalman filter. Experimental results demonstrate that the proposed approach outperforms state-of-the-art algorithms and solves the tracking problem in driving scenarios while maintaining 16 FPS on the test dataset.
(This article belongs to the Special Issue Intelligent Vehicles: Advanced Technology and Development)
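The Kalman filter in the data association step is typically a constant-velocity model over the box centre. A generic predict step (the state layout and noise scale are our assumptions, not the authors' exact filter):

```python
import numpy as np

# Constant-velocity Kalman predict step for a tracked box centre.
def kf_predict(x, P, dt=1.0, q=1e-2):
    # state x = [cx, cy, vx, vy]; P is its 4x4 covariance
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], dtype=float)
    Q = q * np.eye(4)                   # assumed process noise
    return F @ x, F @ P @ F.T + Q
```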
