Search Results (424)

Search Parameters:
Keywords = YOLO v3

18 pages, 5571 KiB  
Article
A Novel Pre-Processing Approach and Benchmarking Analysis for Faster, Robust, and Improved Small Object Detection Methods
by Mohammed Ali Mohammed Al-Hababi, Ahsan Habib, Fursan Thabit and Ying Liu
Remote Sens. 2024, 16(20), 3753; https://doi.org/10.3390/rs16203753 - 10 Oct 2024
Viewed by 307
Abstract
Detecting tiny objects in aerial imagery presents a major challenge owing to their limited resolution and size. Existing research predominantly focuses on evaluating average precision (AP) across various detection methods, often neglecting computational efficiency. Furthermore, state-of-the-art techniques can be complex and difficult to understand. This paper introduces a comprehensive benchmarking analysis specifically tailored to enhancing small object detection within the DOTA dataset, focusing on one-stage detection methods. We propose a novel data-processing approach to enhance the overall AP for all classes in the DOTA-v1.5 dataset using the YOLOv8 framework. Our approach utilizes YOLOv8's Darknet architecture, a proven and effective backbone for object detection tasks. To optimize performance, we introduce pre-processing techniques, including data formatting, noise handling, and normalization, that improve both the representation of small objects and their detectability. Extensive experiments on the DOTA-v1.5 dataset demonstrate the superiority of the proposed approach in overall mean average precision (mAP) across all classes, achieving 66.7%. Additionally, our method establishes a new benchmark for computational efficiency and speed. This advancement not only improves small object detection but also lays a foundation for future research and applications in aerial imagery analysis, paving the way for more efficient and effective detection techniques.
(This article belongs to the Section Remote Sensing Image Processing)
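
As a sketch of the kind of pre-processing pipeline this abstract describes (data formatting, noise handling, normalization), assuming OpenCV denoising and simple [0, 1] scaling rather than the authors' exact recipe:

```python
import cv2
import numpy as np

def preprocess_tile(image_path, out_size=1024):
    """Illustrative pre-processing before YOLOv8: denoise, normalize, resize."""
    img = cv2.imread(image_path)                                      # BGR uint8
    img = cv2.fastNlMeansDenoisingColored(img, None, 10, 10, 7, 21)   # noise handling
    img = img.astype(np.float32) / 255.0                              # normalize to [0, 1]
    img = cv2.resize(img, (out_size, out_size), interpolation=cv2.INTER_LINEAR)
    return img
```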

24 pages, 10093 KiB  
Article
Enhancing a You Only Look Once-Plated Detector via Auxiliary Textual Coding for Multi-Scale Rotating Remote Sensing Objects in Transportation Monitoring Applications
by Sarentuya Bao, Mingwang Zhang, Rui Xie, Dabhvrbayar Huang and Jianlei Kong
Appl. Sci. 2024, 14(19), 9074; https://doi.org/10.3390/app14199074 - 8 Oct 2024
Viewed by 468
Abstract
With the rapid development of intelligent information technologies, remote sensing object detection has come to play an important role in many field applications. In recent years in particular, it has attracted widespread attention for assisting with food safety supervision, which still faces a troubling trade-off between oversized parameters and low performance that is challenging to resolve. Hence, this article proposes a novel remote sensing detection framework for multi-scale objects with rotating status and mutual occlusion, termed EYMR-Net. The proposed approach is built on the YOLO-v7 architecture with a Swin Transformer backbone, which offers multi-scale receptive fields for mining massive features. An enhanced attention module is then added to exploit the spatial and dimensional interrelationships among different local characteristics. Subsequently, an effective rotating-frame regression mechanism based on circular smooth labels is introduced into the EYMR-Net structure, addressing the problem that horizontal YOLO (You Only Look Once) frames ignore direction changes. Extensive experiments on DOTA datasets demonstrated the outstanding performance of EYMR-Net, which achieved an impressive mAP@0.5 of up to 74.3%. Further ablation experiments verified that our proposed approach strikes a balance between performance and efficiency, which is beneficial for practical remote sensing applications in transportation monitoring and supply chain management.
(This article belongs to the Special Issue Deep Learning in Satellite Remote Sensing Applications)
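
The circular smooth labels mentioned above encode a box angle as a periodic soft label, so that 179° and 0° count as neighbors. A minimal sketch, with an assumed Gaussian window over 180 one-degree bins:

```python
import numpy as np

def circular_smooth_label(angle_deg, num_bins=180, radius=6.0):
    """Encode an angle as a circular, Gaussian-windowed label vector."""
    bins = np.arange(num_bins)
    diff = np.abs(bins - (angle_deg % num_bins))
    diff = np.minimum(diff, num_bins - diff)      # circular distance between bins
    label = np.exp(-diff**2 / (2 * radius**2))    # Gaussian window around the angle
    label[diff > radius] = 0.0                    # truncate the window
    return label

# Unlike plain one-hot bins, 179 deg and 1 deg now produce overlapping labels.
```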

16 pages, 3733 KiB  
Article
Research on Rapid Detection of Underwater Targets Based on Global Differential Model Compression
by Weishan Li, Yilin Li, Ruixue Li, Haozhe Shen, Wenjun Li and Keqiang Yue
J. Mar. Sci. Eng. 2024, 12(10), 1760; https://doi.org/10.3390/jmse12101760 - 4 Oct 2024
Viewed by 421
Abstract
Large-scale deep learning algorithms have emerged as the primary technology for underwater target detection, demonstrating exceptional inference effectiveness and accuracy. However, the real-time capabilities of these high-accuracy algorithms rely heavily on high-performance computing resources such as CPUs and GPUs. This presents a challenge for deploying them on underwater embedded devices, where communication is limited and computational and energy resources are constrained. To overcome this, this paper focuses on constructing a lightweight yet highly accurate deep learning model suitable for real-time underwater target detection on edge devices. We develop a new lightweight model, named YOLO-TN, for real-time underwater object recognition on edge devices, using a self-constructed image dataset captured by an underwater unmanned vehicle. The model is obtained by compressing the classical YOLO-V5 with a globally differentiable neural architecture search method and a network pruning technique. Experimental results show that YOLO-TN achieves a mean average precision (mAP) of 0.5425 and an inference speed of 28.6 FPS on embedded devices, with a parameter size between 0.4 M and 0.6 M: roughly a fifth of the parameter count of YOLO-V5 and twelve times its FPS, with almost no loss of inference accuracy. In conclusion, this framework significantly enhances the feasibility of deploying large-scale deep learning models on edge devices with high precision and compactness, ensuring real-time inference and offline deployment capabilities. This research is pivotal in addressing the computational challenges faced in underwater operations.
(This article belongs to the Special Issue Maritime Communication Networks and 6G Technologies)
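
The pruning half of the compression pipeline can be illustrated with PyTorch's built-in utilities; the L1 criterion and 30% ratio below are assumptions, and the differentiable architecture search is out of scope for a short snippet:

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

def prune_conv_channels(model: nn.Module, amount: float = 0.3):
    """Zero out the fraction of conv output channels with the smallest L1 norm."""
    for module in model.modules():
        if isinstance(module, nn.Conv2d):
            prune.ln_structured(module, name="weight", amount=amount, n=1, dim=0)
            prune.remove(module, "weight")   # bake the sparsity into the tensor
    return model
```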

16 pages, 2184 KiB  
Article
Comparative Analysis of Improved YOLO v5 Models for Corrosion Detection in Coastal Environments
by Qifeng Yu, Yudong Han, Xinjia Gao, Wuguang Lin and Yi Han
J. Mar. Sci. Eng. 2024, 12(10), 1754; https://doi.org/10.3390/jmse12101754 - 4 Oct 2024
Viewed by 623
Abstract
Coastal areas face severe corrosion issues, posing significant risks and economic losses to equipment, personnel, and the environment. YOLO v5, known for its speed, accuracy, and ease of deployment, has been employed for the rapid detection and identification of marine corrosion. However, corrosion images often feature complex characteristics and high variability in detection targets, presenting significant challenges for YOLO v5 in recognizing and extracting corrosion features. To improve its detection performance on corrosion image features, this study investigates two enhanced models, EfficientViT-NWD-YOLO v5 and Gold-NWD-YOLO v5, which target improvements to the backbone and neck structures of YOLO v5, respectively. Their corrosion detection performance is analyzed in comparison with both YOLO v5 and NWD-YOLO v5. Evaluation metrics including precision, recall, F1-score, frames per second (FPS), pre-processing time, inference time, non-maximum suppression (NMS) time, and the confusion matrix were used to assess detection performance. The results indicate that Gold-NWD-YOLO v5 shows significant improvements in precision, recall, F1-score, and accurate prediction probability, but also increases inference time and NMS time and decreases FPS. This suggests that while the modified neck structure significantly enhances detection performance on corrosion images, it also increases computational overhead. In contrast, EfficientViT-NWD-YOLO v5 shows slight improvements in precision, recall, F1-score, and accurate prediction probability while significantly reducing inference and NMS time and greatly improving FPS. This indicates that modifications to the backbone structure do not notably enhance corrosion detection performance but substantially improve detection speed. From the application perspective, YOLO v5 and NWD-YOLO v5 are suitable for routine corrosion detection, Gold-NWD-YOLO v5 is better suited to scenarios requiring high precision, and EfficientViT-NWD-YOLO v5 is ideal for applications needing a balance between speed and accuracy. The findings can guide decision making for corrosion health monitoring of critical infrastructure in coastal areas.
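
NWD in these model names refers to the Normalized Wasserstein Distance, which scores box similarity by modeling each box as a 2D Gaussian; a sketch of the standard formulation (the constant C is dataset-dependent):

```python
import math

def nwd(box1, box2, C=12.8):
    """Normalized Wasserstein Distance between two (cx, cy, w, h) boxes."""
    cx1, cy1, w1, h1 = box1
    cx2, cy2, w2, h2 = box2
    # Squared 2-Wasserstein distance between the corresponding Gaussians
    w2_sq = ((cx1 - cx2) ** 2 + (cy1 - cy2) ** 2
             + ((w1 - w2) / 2) ** 2 + ((h1 - h2) / 2) ** 2)
    return math.exp(-math.sqrt(w2_sq) / C)   # 1.0 = identical boxes
```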

16 pages, 5560 KiB  
Article
On-Line Measurement of Tracking Poses of Heliostats in Concentrated Solar Power Plants
by Fen Xu, Changhao Li and Feihu Sun
Sensors 2024, 24(19), 6373; https://doi.org/10.3390/s24196373 - 1 Oct 2024
Viewed by 358
Abstract
The tracking pose of heliostats directly affects the stability and working efficiency of concentrated solar power (CSP) plants. Due to occlusion, over-exposure, and uneven illumination caused by mirror reflection, traditional image processing algorithms perform poorly on the detection and segmentation of heliostats, which impedes vision-based 3D measurement of their tracking poses. To tackle this issue, object detection using deep neural networks is exploited. An improved neural network based on the YOLO-v5 framework was designed to solve the on-line detection problem for heliostats. The model achieves a recognition accuracy of 99.7% on the test set, significantly outperforming traditional methods. Based on the segmentation results, the corner points of each heliostat are located using the Hough transform and line-intersection methods. The 3D pose of each heliostat is then solved from the image coordinates of specific feature points and the camera model. Experimental and field test results demonstrate the feasibility of this hybrid approach, which provides a low-cost solution for monitoring and measuring the tracking poses of heliostats in CSP plants.
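
The corner-to-pose step can be sketched with OpenCV: intersect Hough lines to get the mirror corners, then recover the pose with solvePnP. The mirror dimensions and calibration inputs below are placeholders:

```python
import cv2
import numpy as np

def line_intersection(l1, l2):
    """Intersect two lines given in Hough (rho, theta) normal form."""
    A = np.array([[np.cos(l1[1]), np.sin(l1[1])],
                  [np.cos(l2[1]), np.sin(l2[1])]])
    b = np.array([l1[0], l2[0]])
    return np.linalg.solve(A, b)              # (x, y) intersection point

def heliostat_pose(corners_2d, mirror_w, mirror_h, K, dist):
    """Recover a heliostat pose from its four corner points (order: TL, TR, BR, BL)."""
    obj = np.array([[0, 0, 0], [mirror_w, 0, 0],
                    [mirror_w, mirror_h, 0], [0, mirror_h, 0]], dtype=np.float32)
    ok, rvec, tvec = cv2.solvePnP(obj, corners_2d.astype(np.float32), K, dist)
    return rvec, tvec                          # mirror rotation and translation
```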

21 pages, 10158 KiB  
Article
Object Extraction-Based Comprehensive Ship Dataset Creation to Improve Ship Fire Detection
by Farkhod Akhmedov, Sanjar Mukhamadiev, Akmalbek Abdusalomov and Young-Im Cho
Fire 2024, 7(10), 345; https://doi.org/10.3390/fire7100345 - 27 Sep 2024
Viewed by 391
Abstract
The detection of ship fires is a critical aspect of maritime safety and surveillance, demanding high accuracy in both identification and response mechanisms. However, the scarcity of ship fire images poses a significant challenge to the development and training of effective machine learning models. This research paper addresses this challenge by exploring advanced data augmentation techniques aimed at enhancing training datasets for ship and ship fire detection. We curated a dataset comprising ship images (both fire and non-fire) and various oceanic images, which serve as target and source images. By employing diverse image blending methods, we randomly integrate target images of ships with source images of oceanic environments under various conditions, such as windy, rainy, hazy, cloudy, or open-sky scenarios. This approach increases not only the quantity but also the diversity of the training data, improving the robustness and performance of machine learning models in detecting ship fires across different contexts. Furthermore, we developed a Gradio web interface application that facilitates selective augmentation of images. The key contribution of this work is object extraction-based blending: we propose basic and advanced data augmentation techniques that apply blending with selective randomness. In total, we cover eight critical steps for dataset creation. We collected 9200 ship fire and 4100 ship non-fire images, and from these we augmented 90 ship fire images with 13 background images to obtain 11,440 augmented images. To test the augmented dataset's performance, we trained Yolo-v8 and Yolo-v10 models on the "Fire" and "No-fire" augmented ship images. For Yolo-v8, the precision-recall curve reached 96.6% (Fire) and 98.2% (No-fire), with a 97.4% mAP score across all classes at the 0.5 threshold. For Yolo-v10, we obtained 90.3% (Fire), 93.7% (No-fire), and a 92% mAP score across all classes at the 0.5 threshold. Both trained models outperform other Yolo-based SOTA ship fire detection models in overall and mAP scores.
(This article belongs to the Section Fire Science Models, Remote Sensing, and Data)
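
The object extraction-based blending could look roughly like the following: paste a masked ship cutout onto an ocean background at a random position. This alpha compositing is a generic stand-in for the paper's blending methods:

```python
import random
import numpy as np

def blend_ship(ship_rgba, background):
    """Paste an extracted ship (RGBA cutout) onto an ocean background image."""
    bh, bw = background.shape[:2]
    sh, sw = ship_rgba.shape[:2]          # assumes cutout fits inside background
    x = random.randint(0, bw - sw)
    y = random.randint(0, bh - sh)
    alpha = ship_rgba[:, :, 3:4].astype(np.float32) / 255.0
    roi = background[y:y + sh, x:x + sw].astype(np.float32)
    blended = alpha * ship_rgba[:, :, :3] + (1 - alpha) * roi
    background[y:y + sh, x:x + sw] = blended.astype(np.uint8)
    return background, (x, y, sw, sh)     # augmented image plus box for the label
```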

23 pages, 5682 KiB  
Article
IV-YOLO: A Lightweight Dual-Branch Object Detection Network
by Dan Tian, Xin Yan, Dong Zhou, Chen Wang and Wenshuai Zhang
Sensors 2024, 24(19), 6181; https://doi.org/10.3390/s24196181 - 24 Sep 2024
Viewed by 655
Abstract
With the rapid growth in demand for security surveillance, assisted driving, and remote sensing, object detection networks with robust environmental perception and high detection accuracy have become a research focus. However, single-modality image detection technologies face limitations in environmental adaptability: they are often affected by lighting conditions, fog, rain, and obstacles such as vegetation, leading to information loss and reduced detection accuracy. To address these challenges, we propose IV-YOLO, an object detection network that integrates features from visible light and infrared images. This network is based on YOLOv8 (You Only Look Once v8) and employs a dual-branch fusion structure that leverages the complementary features of infrared and visible light images for target detection. We designed a Bidirectional Pyramid Feature Fusion structure (Bi-Fusion) to effectively integrate multimodal features, reducing errors from feature redundancy and extracting fine-grained features for small object detection. Additionally, we developed a Shuffle-SPP structure that combines channel and spatial attention to enhance the focus on deep features and extract richer information through upsampling. For model optimization, we designed a loss function tailored to multi-scale object detection, accelerating the convergence of the network during training. Compared with the current state-of-the-art Dual-YOLO model, IV-YOLO achieves mAP improvements of 2.8%, 1.1%, and 2.2% on the Drone Vehicle, FLIR, and KAIST datasets, respectively. On the Drone Vehicle and FLIR datasets, IV-YOLO has a parameter count of 4.31 M and achieves a frame rate of 203.2 fps, significantly outperforming YOLOv8n (5.92 M parameters, 188.6 fps on the Drone Vehicle dataset) and YOLO-FIR (7.1 M parameters, 83.3 fps on the FLIR dataset), which had previously achieved the best performance on these datasets. This demonstrates that IV-YOLO delivers higher real-time detection performance with lower parameter complexity, making it highly promising for applications in autonomous driving, public safety, and beyond.
(This article belongs to the Section Sensor Networks)
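
As a rough picture of what a dual-branch visible/infrared fusion block does, here is a generic concat-and-convolve module; it is not the paper's Bi-Fusion structure, only a minimal stand-in:

```python
import torch
import torch.nn as nn

class SimpleFusion(nn.Module):
    """Fuse visible and infrared feature maps of equal shape into one map."""
    def __init__(self, channels):
        super().__init__()
        self.mix = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),  # learn the mixing
            nn.BatchNorm2d(channels),
            nn.SiLU(),
        )

    def forward(self, feat_vis, feat_ir):
        return self.mix(torch.cat([feat_vis, feat_ir], dim=1))

# fused = SimpleFusion(256)(vis_feat, ir_feat)  # both inputs (N, 256, H, W)
```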

22 pages, 11344 KiB  
Article
The Detection of Maize Seedling Quality from UAV Images Based on Deep Learning and Voronoi Diagram Algorithms
by Lipeng Ren, Changchun Li, Guijun Yang, Dan Zhao, Chengjian Zhang, Bo Xu, Haikuan Feng, Zhida Chen, Zhongyun Lin and Hao Yang
Remote Sens. 2024, 16(19), 3548; https://doi.org/10.3390/rs16193548 - 24 Sep 2024
Viewed by 486
Abstract
Assessing the quality of maize seedlings is crucial for field management and germplasm evaluation. Traditional methods for evaluating seedling quality mainly rely on manual field surveys, which are inefficient and highly subjective, while large-scale satellite detection often lacks sufficient accuracy. To address these issues, this study proposes an innovative approach that combines the YOLO v8 object detection algorithm with Voronoi spatial analysis to rapidly evaluate maize seedling quality from high-resolution drone imagery. The YOLO v8 model provides the maize coordinates, which are used for Voronoi segmentation of the field after applying the convex hull difference method. From the generated Voronoi diagram, three key indicators are extracted to comprehensively evaluate seedling quality: the Voronoi Polygon Uniformity Index (VPUI), the missing seedling rate, and the repeated seedling rate. The results show that this method effectively extracts all three indicators for maize in the target area. Compared to the traditional plant spacing variation coefficient, the VPUI better represents seedling uniformity. Additionally, the R2 values for the missing seedling rate and the repeated seedling rate estimated with the Voronoi method were 0.773 and 0.940, respectively, increases of 0.09 and 0.544 over the plant spacing method. The maize seedling quality evaluation method proposed in this study provides technical support for precision maize planting management and is of great significance for improving agricultural production efficiency and reducing labor costs.
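
The Voronoi step can be sketched with SciPy: build cells from the detected plant coordinates and score uniformity from the spread of finite cell areas. The index below is a simple coefficient-of-variation stand-in for the paper's VPUI:

```python
import numpy as np
from scipy.spatial import Voronoi

def polygon_area(pts):
    """Shoelace formula for a simple polygon given as an (n, 2) array."""
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, 1)) - np.dot(y, np.roll(x, 1)))

def uniformity_index(plant_xy):
    """Uniformity score from the spread of finite Voronoi cell areas."""
    vor = Voronoi(plant_xy)
    areas = [polygon_area(vor.vertices[r])
             for r in (vor.regions[i] for i in vor.point_region)
             if r and -1 not in r]            # skip unbounded border cells
    areas = np.asarray(areas)
    return 1.0 - areas.std() / areas.mean()   # 1.0 = perfectly uniform cells
```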

20 pages, 3181 KiB  
Article
Dehazing Algorithm Integration with YOLO-v10 for Ship Fire Detection
by Farkhod Akhmedov, Rashid Nasimov and Akmalbek Abdusalomov
Fire 2024, 7(9), 332; https://doi.org/10.3390/fire7090332 - 23 Sep 2024
Cited by 1 | Viewed by 527
Abstract
Ship fire detection presents significant challenges for computer vision-based approaches due to factors such as the considerable distances from which ships must be detected and the unique conditions of the maritime environment. The presence of water vapor and high humidity further complicates detection and classification for deep learning models, as these factors can obscure visual clarity and introduce noise into the data. In this research, we describe the development of a custom ship fire dataset and a fine-tuned YOLO (You Only Look Once)-v10 model combined with dehazing algorithms. Our approach integrates the power of deep learning with sophisticated image processing to deliver a comprehensive solution for ship fire detection. The results demonstrate the efficacy of using YOLO-v10 in conjunction with a dehazing algorithm, highlighting significant improvements in detection accuracy and reliability. Experimental results show that the developed YOLO-v10-based ship fire detection model outperforms several YOLO and other detection models in precision (97.7%), recall (98%), and mAP@0.5 score (89.7%). However, the model achieved a relatively lower F1 score than the YOLO-v8 and ship-fire-net models. In addition, the dehazing approach significantly improves the model's detection performance in hazy environments.
(This article belongs to the Section Fire Science Models, Remote Sensing, and Data)
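
Dehazing pre-passes of the kind paired with YOLO-v10 here are often built on the dark channel prior; the sketch below uses conventional defaults and does not assume it is the authors' exact algorithm:

```python
import cv2
import numpy as np

def dehaze_dark_channel(img, patch=15, omega=0.95, t0=0.1):
    """Minimal dark-channel-prior dehazing (He et al. style)."""
    I = img.astype(np.float32) / 255.0
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (patch, patch))
    dark = cv2.erode(I.min(axis=2), kernel)             # dark channel
    # Atmospheric light: mean color of the brightest 0.1% dark-channel pixels
    n = max(1, dark.size // 1000)
    idx = np.unravel_index(np.argsort(dark, axis=None)[-n:], dark.shape)
    A = I[idx].mean(axis=0)
    t = 1 - omega * cv2.erode((I / A).min(axis=2), kernel)   # transmission map
    J = (I - A) / np.maximum(t, t0)[..., None] + A           # scene radiance
    return (np.clip(J, 0, 1) * 255).astype(np.uint8)
```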

23 pages, 15105 KiB  
Article
Coupled Impact of Points of Interest and Thermal Environment on Outdoor Human Behavior Using Visual Intelligence
by Shiliang Wang, Qun Zhang, Peng Gao, Chenglin Wang, Jiang An and Lan Wang
Buildings 2024, 14(9), 2978; https://doi.org/10.3390/buildings14092978 - 20 Sep 2024
Viewed by 368
Abstract
Although it is well established that thermal environments significantly influence travel behavior, the synergistic effects of points of interest (POI) and thermal environments on behavior remain unclear. This study developed a vision-based outdoor evaluation model aimed at uncovering the driving factors behind human behavior in outdoor spaces. First, Yolo v5 and questionnaires were employed to obtain crowd activity intensity and preference levels. Subsequently, target detection and clustering algorithms were used to derive variables such as POI attractiveness and POI distance, while a validated environmental simulator was utilized to simulate outdoor thermal comfort distributions across different times. Finally, multiple classification models were compared to establish the mapping relationships between POI, thermal environment variables, and crowd preferences, with SHAP analysis used to examine the contribution of each variable. The results indicate that XGBoost achieved the best predictive performance (accuracy = 0.95), with shadow proportion (|SHAP| = 0.24) and POI distance (|SHAP| = 0.12) identified as the most significant factors influencing crowd preferences. By extrapolation, this classification model can provide valuable insights for optimizing community environments and enhancing vitality in areas with similar climatic and cultural contexts.
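
The predict-then-attribute pattern (XGBoost plus SHAP) is easy to reproduce; the synthetic features below merely stand in for the study's POI and thermal variables:

```python
import numpy as np
import xgboost as xgb
import shap

# Placeholder features standing in for the study's variables:
# [shadow_proportion, poi_distance, poi_attractiveness, air_temp]
rng = np.random.default_rng(0)
X = rng.random((500, 4))
y = (0.24 * X[:, 0] - 0.12 * X[:, 1]
     + 0.05 * rng.standard_normal(500) > 0.05).astype(int)

model = xgb.XGBClassifier(n_estimators=200, max_depth=4)
model.fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)     # per-feature contributions per sample
print(np.abs(shap_values).mean(axis=0))    # mean |SHAP| per feature, as reported
```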

27 pages, 34070 KiB  
Article
Comparison of Faster R-CNN, YOLO, and SSD for Third Molar Angle Detection in Dental Panoramic X-rays
by Piero Vilcapoma, Diana Parra Meléndez, Alejandra Fernández, Ingrid Nicole Vásconez, Nicolás Corona Hillmann, Gustavo Gatica and Juan Pablo Vásconez
Sensors 2024, 24(18), 6053; https://doi.org/10.3390/s24186053 - 19 Sep 2024
Viewed by 897
Abstract
The use of artificial intelligence (AI) algorithms has gained importance in dental applications in recent years. Analyzing AI information from different sensor data, such as images or panoramic radiographs (panoramic X-rays), can help improve medical decisions and achieve early diagnosis of dental pathologies. In particular, deep learning (DL) techniques based on convolutional neural networks (CNNs) have obtained promising results in image-based dental applications, where approaches based on classification, detection, and segmentation are being studied with growing interest. However, several challenges remain, such as data quality and quantity, variability among categories, and the analysis of the possible bias and variance associated with each dataset distribution. This study compares the performance of three deep learning object detection models, Faster R-CNN, YOLO V2, and SSD, using different ResNet architectures (ResNet-18, ResNet-50, and ResNet-101) as feature extractors for detecting and classifying third molar angles in panoramic X-rays according to Winter's classification criterion. Each object detection architecture was trained, calibrated, validated, and tested with the three feature extraction CNNs, which were the networks that best fit our dataset distribution. Winter's criterion characterizes the third molar's position relative to the second molar's longitudinal axis, and the detected categories are distoangular, vertical, mesioangular, and horizontal. For training, we used a total of 644 panoramic X-rays. The results on the testing dataset reached up to 99% mean average accuracy, with YOLO V2 proving the most effective at solving the third molar angle detection problem. These results demonstrate that the use of CNNs for object detection in panoramic radiographs is a promising solution for dental applications.
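
Winter's criterion classifies a detected molar by the angle between its longitudinal axis and that of the second molar; a post-detection sketch with illustrative (not the paper's) threshold bands:

```python
def winter_category(third_axis_deg, second_axis_deg):
    """Classify a third molar by its inclination relative to the second molar."""
    angle = third_axis_deg - second_axis_deg
    angle = (angle + 180) % 360 - 180       # wrap the difference to (-180, 180]
    a = abs(angle)
    if a < 10:                              # illustrative thresholds only
        return "vertical"
    if a < 80:
        return "mesioangular" if angle > 0 else "distoangular"
    return "horizontal"
```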

17 pages, 4164 KiB  
Article
G-YOLO: A Lightweight Infrared Aerial Remote Sensing Target Detection Model for UAVs Based on YOLOv8
by Xiaofeng Zhao, Wenwen Zhang, Yuting Xia, Hui Zhang, Chao Zheng, Junyi Ma and Zhili Zhang
Drones 2024, 8(9), 495; https://doi.org/10.3390/drones8090495 - 18 Sep 2024
Viewed by 822
Abstract
A lightweight infrared target detection model, G-YOLO, for unmanned aerial vehicles (UAVs) is proposed to address the low accuracy of target detection in UAV aerial images of complex ground scenarios and the difficulty of applying large network models to mobile or embedded platforms. Firstly, the YOLOv8 backbone feature extraction network is redesigned based on the lightweight GhostBottleneckV2 network, and the remainder of the backbone adopts depthwise-separable convolution (DWConv) in place of part of the standard convolutions, which effectively retains the detection performance of the model while greatly reducing its parameter and computation counts. Secondly, the neck structure is improved with the ODConv module, whose adaptive convolutional structure adjusts the kernel size and stride, allowing more effective feature extraction and detection of targets at different scales. The neck is further optimized with the SEAttention attention mechanism, which improves the model's ability to learn global information from input feature maps and is applied to each channel of each feature map to enhance the useful information in specific channels and improve detection performance. Finally, the SlideLoss loss function enables the model to calculate the differences between predicted and ground-truth bounding boxes during training and adjust its parameters accordingly, improving the accuracy and efficiency of object detection. The experimental results show that, compared with YOLOv8n, G-YOLO reduces the missed and false detection rates for infrared small target detection in complex backgrounds. The number of model parameters is reduced by 74.2%, the number of floating-point operations is reduced by 54.3%, and the FPS improves by 71, increasing the detection efficiency of the model, while the mean average precision (mAP) reaches 91.4%, verifying the validity of the model for UAV-based infrared small target detection. Furthermore, the FPS of the model reaches 556, making it suitable for wider and more complex detection tasks involving small targets, long-distance targets, and other complex scenes.
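
The DWConv substitution trades one standard convolution for a depthwise-plus-pointwise pair, which is where most of the parameter savings come from; a minimal PyTorch illustration:

```python
import torch.nn as nn

def dw_separable(c_in, c_out, k=3):
    """Depthwise + pointwise pair replacing one standard k x k convolution."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_in, k, padding=k // 2, groups=c_in),  # depthwise
        nn.Conv2d(c_in, c_out, 1),                              # pointwise
    )

# Weight counts for 128 -> 128 channels with k = 3 (biases aside):
#   standard conv: 3 * 3 * 128 * 128 = 147,456
#   separable:     3 * 3 * 128 + 128 * 128 = 17,536  (~8.4x fewer)
```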

23 pages, 36929 KiB  
Article
Dynamic Target Tracking and Following with UAVs Using Multi-Target Information: Leveraging YOLOv8 and MOT Algorithms
by Diogo Ferreira and Meysam Basiri
Drones 2024, 8(9), 488; https://doi.org/10.3390/drones8090488 - 14 Sep 2024
Viewed by 850
Abstract
This work presents an autonomous vision-based mobile target tracking and following system designed for unmanned aerial vehicles (UAVs) that leverages multi-target information. It explores the research gap of applying recent multi-object tracking (MOT) methods, rather than traditional single-object tracking (SOT) algorithms, to target following scenarios. The system integrates the real-time object detection model You Only Look Once (YOLO)v8 with the MOT algorithms BoT-SORT and ByteTrack, extracting multi-target information. It leverages this information to improve redetection capabilities, addressing target misidentifications (ID changes) and partial and full occlusions in dynamic environments. A depth sensing module is incorporated to enhance distance estimation when feasible. A 3D flight control system is proposed for target following, capable of reacting to changes in target speed and direction while maintaining line-of-sight. The system is initially tested in simulation and then deployed in real-world scenarios. Results show precise target tracking and following, resilient to partial and full occlusions in dynamic environments, effectively distinguishing the followed target from bystanders. A comparison between the BoT-SORT and ByteTrack trackers reveals a trade-off between computational efficiency and tracking precision. In overcoming the presented challenges, this work enables new practical applications for vision-based target following from UAVs leveraging multi-target information.
(This article belongs to the Special Issue Advances in Detection, Security, and Communication for UAV)
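
Ultralytics exposes the YOLOv8 + BoT-SORT/ByteTrack pairing directly, which is one way to reproduce the detection-and-tracking half of such a pipeline; the weights and video path are placeholders:

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")   # placeholder weights

# persist=True keeps track IDs across frames; swap in "bytetrack.yaml" to compare
for result in model.track(source="target_video.mp4", tracker="botsort.yaml",
                          persist=True, stream=True):
    for box in result.boxes:
        if box.id is not None:
            print(int(box.id), box.xyxy[0].tolist())   # track ID and box corners
```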

31 pages, 73552 KiB  
Article
Enhancing 3D Rock Localization in Mining Environments Using Bird’s-Eye View Images from the Time-of-Flight Blaze 101 Camera
by John Kern, Reinier Rodriguez-Guillen, Claudio Urrea and Yainet Garcia-Garcia
Technologies 2024, 12(9), 162; https://doi.org/10.3390/technologies12090162 - 12 Sep 2024
Viewed by 820
Abstract
The mining industry faces significant challenges in production costs, environmental protection, and worker safety, necessitating the development of autonomous systems. This study presents the design and implementation of a robust rock centroid localization system for mining robotic applications, particularly rock-breaking hammers. The system comprises three phases: assembly, data acquisition, and data processing. Environmental sensing was accomplished using a Basler Blaze 101 three-dimensional (3D) Time-of-Flight (ToF) camera. The data processing phase incorporated advanced algorithms, including Bird's-Eye View (BEV) image conversion and You Only Look Once (YOLO) v8x-Seg instance segmentation. The system's performance was evaluated on a comprehensive dataset of 627 point clouds, including samples from real mining environments, and achieved efficient processing times of approximately 5 s. Segmentation accuracy, evaluated with the Intersection over Union (IoU), reached 95.10%. Localization precision, measured by the Euclidean distance in the XY plane (EDXY), achieved 0.0128 m, and the normalized error (e_norm) on the X and Y axes did not exceed 2.3%. Additionally, the system demonstrated high reliability, with R2 values close to 1 for the X and Y axes, and maintained performance under various lighting conditions and in the presence of suspended particles. The Mean Absolute Error (MAE) in the Z axis was 0.0333 m, addressing challenges in depth estimation. A sensitivity analysis assessed the model's robustness, revealing consistent performance across brightness and contrast variations, with an IoU ranging from 92.88% to 96.10%, while showing greater sensitivity to rotations.
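
The BEV conversion can be sketched as rasterizing the ToF point cloud into a top-down height image for the segmentation model; the range and resolution values are placeholders:

```python
import numpy as np

def pointcloud_to_bev(points, x_range=(0, 4), y_range=(-2, 2), res=0.01):
    """Rasterize (x, y, z) points into a top-down max-height image."""
    w = int((x_range[1] - x_range[0]) / res)
    h = int((y_range[1] - y_range[0]) / res)
    bev = np.zeros((h, w), dtype=np.float32)
    m = ((points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1]) &
         (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1]))
    pts = points[m]
    col = ((pts[:, 0] - x_range[0]) / res).astype(int)
    row = ((pts[:, 1] - y_range[0]) / res).astype(int)
    np.maximum.at(bev, (row, col), pts[:, 2])   # keep the max height per cell
    return bev
```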

18 pages, 12186 KiB  
Article
Cloud-Edge Collaborative Defect Detection Based on Efficient Yolo Networks and Incremental Learning
by Zhenwu Lei, Yue Zhang, Jing Wang and Meng Zhou
Sensors 2024, 24(18), 5921; https://doi.org/10.3390/s24185921 - 12 Sep 2024
Viewed by 545
Abstract
Defect detection constitutes one of the most crucial processes in industrial production. With a continuous increase in the number of defect categories and samples, defect detection models underpinned by deep learning find it challenging to expand to new categories, and the accuracy and real-time performance of product defect detection also face severe challenges. This paper addresses the insufficient detection accuracy of existing lightweight models on resource-constrained edge devices by presenting a new lightweight YoloV5 model, abbreviated here as SGRS-YoloV5n, which integrates four modules: SCDown, GhostConv, RepNCSPELAN4, and ScalSeq. Through these modules, the model notably enhances feature extraction and computational efficiency while reducing model size and computational load, making it more suitable for deployment on edge devices. Furthermore, a cloud-edge collaborative defect detection system is constructed to improve detection accuracy and efficiency through initial detection by edge devices followed by additional inspection by cloud servers. An incremental learning mechanism is also introduced, enabling the model to adapt promptly to new defect categories and update its parameters accordingly. Experimental results reveal that the SGRS-YoloV5n model exhibits superior detection accuracy and real-time performance, validating its value and stability for deployment in resource-constrained environments. This system presents a novel solution for efficient and accurate real-time defect detection.
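
The edge-first, cloud-verify loop reduces to a confidence gate; a schematic sketch in which the endpoint, weights, and threshold are invented for illustration:

```python
import requests
from ultralytics import YOLO

edge_model = YOLO("sgrs_yolov5n.pt")   # hypothetical lightweight edge weights
CONF_GATE = 0.6                        # below this, defer to the cloud model

def detect(image_path):
    result = edge_model(image_path)[0]
    confs = result.boxes.conf.tolist()
    if confs and min(confs) >= CONF_GATE:
        return result                   # edge verdict is trusted as-is
    # Otherwise forward the image for a second pass on the cloud server
    with open(image_path, "rb") as f:
        resp = requests.post("https://cloud.example/detect", files={"image": f})
    return resp.json()                  # hypothetical cloud API response
```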
