Search Results (818)

Search Parameters:
Keywords = Faster R-CNN

17 pages, 4299 KiB  
Article
Resistance Spot Welding Defect Detection Based on Visual Inspection: Improved Faster R-CNN Model
by Weijie Liu, Jie Hu and Jin Qi
Machines 2025, 13(1), 33; https://doi.org/10.3390/machines13010033 - 7 Jan 2025
Viewed by 208
Abstract
This paper presents an enhanced Faster R-CNN model for detecting surface defects in resistance welding spots, improving both efficiency and accuracy for body-in-white quality monitoring. Key innovations include using high-confidence anchor boxes from the region proposal network (RPN) to locate welding spots, adopting the SmoothL1 loss function, and applying the Fast R-CNN head to classify detected defects. Additionally, a new pruning scheme is introduced that removes unnecessary layers and parameters from the neural network, leading to faster processing times without sacrificing accuracy. Tests show that the model achieves over 90% accuracy and recall while processing each image in about 15 ms, meeting industrial requirements for welding spot inspection.
(This article belongs to the Section Industrial Systems)
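For readers unfamiliar with the components named above, the sketch below runs a stock torchvision Faster R-CNN, filters detections by confidence, and evaluates PyTorch's built-in SmoothL1 loss. This is the generic baseline, not the authors' pruned variant; the 0.9 threshold and tensor values are illustrative choices.

```python
import torch
import torchvision

# Load a stock Faster R-CNN (ResNet-50 FPN backbone) from torchvision.
# This is the unmodified baseline, not the paper's pruned model.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

# Run inference on a dummy 3-channel image; each result carries
# "boxes", "labels", and "scores" for the detected objects.
image = torch.rand(3, 480, 640)
with torch.no_grad():
    predictions = model([image])[0]

# Keep only high-confidence detections, mirroring the idea of relying
# on high-confidence region proposals to localize welding spots.
keep = predictions["scores"] > 0.9
print(predictions["boxes"][keep])

# SmoothL1 is PyTorch's built-in box-regression loss; beta controls
# the transition point between the quadratic and linear regimes.
pred = torch.tensor([1.2, 0.4, 3.1])
target = torch.tensor([1.0, 0.5, 3.0])
print(torch.nn.functional.smooth_l1_loss(pred, target, beta=1.0))
```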

12 pages, 1050 KiB  
Article
Rebar Recognition Using Multi-Hyperbolic Attention in Faster R-CNN
by Chuan Li, Nianbiao Cai, Tong Pu, Xi Yang, Hao Liu and Lulu Wang
Appl. Sci. 2025, 15(1), 367; https://doi.org/10.3390/app15010367 - 2 Jan 2025
Viewed by 400
Abstract
Rebar is a crucial element of tunnel lining structures, where its precise arrangement plays a pivotal role in determining both structural stability and load-bearing capacity. Because rebar is a conductor, its effective dielectric constant approaches infinity, so radar signal reflections are intensified and manifest as distinct hyperbolic patterns in radar imagery. These hyperbolic features can be extracted from radar images through convolutional operations. Building upon the feature extraction capabilities of the ResNet50 model, this study introduces a Deformable Attention to Capture Salient Information (DAS) mechanism, employing deformable and separable convolutions to enhance rebar localization and concentrate on hyperbolic features resulting from multiple reflections. Before the Region Proposal Network (RPN) and region of interest (ROI) pooling stages of Faster R-CNN, this study integrates a hyperbolic attention (HAT) module. Through refined distance metrics, the hyperbolic attention mechanism improves the network's precision in identifying rebar hyperbolic features within feature maps. To ensure robustness across diverse conditions, this study combines a simulated public tunnel-lining dataset with real data from the Husa Tunnel to create a comprehensive ground-penetrating radar image dataset for tunnel linings. Experimental results indicate that the proposed model achieved an average precision of 94.93%, a 3.14% increase over the baseline model. Lastly, on a random selection of 50 radar images used for testing, the model achieved a rebar detection accuracy of 93.46%, an improvement of 0.94% over the baseline model.
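As a rough illustration of placing attention before the RPN and ROI pooling stages, the sketch below wraps a torchvision Faster R-CNN backbone so every FPN level passes through a channel-attention gate. The ChannelAttention module is a generic squeeze-and-excitation-style stand-in; the paper's DAS and HAT modules are not reproduced here.

```python
import torch
import torch.nn as nn
import torchvision

class ChannelAttention(nn.Module):
    """Generic SE-style channel attention; a stand-in for the paper's
    hyperbolic attention (HAT) module, whose exact form differs."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):
        w = x.mean(dim=(2, 3))             # global average pool: (N, C)
        w = self.fc(w)[:, :, None, None]   # per-channel gate: (N, C, 1, 1)
        return x * w

class AttnBackbone(nn.Module):
    """Wraps a torchvision detection backbone so attention is applied to
    every FPN level before the features reach the RPN and ROI heads."""
    def __init__(self, backbone):
        super().__init__()
        self.backbone = backbone
        self.out_channels = backbone.out_channels
        self.attn = ChannelAttention(backbone.out_channels)

    def forward(self, x):
        feats = self.backbone(x)  # OrderedDict of FPN feature maps
        return {k: self.attn(v) for k, v in feats.items()}

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.backbone = AttnBackbone(model.backbone)
model.eval()
print(model([torch.rand(3, 512, 512)])[0].keys())  # boxes, labels, scores
```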

23 pages, 6756 KiB  
Article
Vehicle Target Detection of Autonomous Driving Vehicles in Foggy Environments Based on an Improved YOLOX Network
by Zhaohui Liu, Huiru Zhang and Lifei Lin
Sensors 2025, 25(1), 194; https://doi.org/10.3390/s25010194 - 1 Jan 2025
Viewed by 485
Abstract
To address the problems that exist in the target detection of vehicle-mounted visual sensors in foggy environments, a vehicle target detection method based on an improved YOLOX network is proposed. Firstly, to address vehicle target feature loss in foggy traffic scene images, specific characteristics of fog-affected imagery are integrated into the network training process; this not only augments the training data but also improves the robustness of the network in foggy environments. Secondly, the YOLOX network is optimized by adding attention mechanisms and an image enhancement module to improve feature extraction and training. Additionally, drawing on the characteristics of foggy-environment images, the loss function is optimized to further improve the network's detection performance in fog. Finally, transfer learning is applied during training, which not only accelerates network convergence and shortens training time but also further improves the robustness of the network in different environments. The mAP of the improved network is 13.57%, 10.3%, and 9.74% higher than that of the YOLOv5, YOLOv7, and Faster R-CNN networks, respectively. Comparative experiments from several perspectives show that the proposed method significantly enhances detection of vehicle targets in foggy environments.
(This article belongs to the Section Sensing and Imaging)
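One common way to integrate fog characteristics into training data is to synthesize fog with the atmospheric scattering model I = J·t + A·(1 − t). The sketch below uses a pseudo-depth based on distance from the image center; that pseudo-depth, and the beta/airlight values, are assumptions of this illustration, not the paper's exact augmentation.

```python
import numpy as np

def add_synthetic_fog(image, beta=1.5, airlight=0.9):
    """Apply the standard atmospheric scattering model
    I = J * t + A * (1 - t), with transmission t = exp(-beta * d).
    Depth d is approximated by distance from the image center, a
    common stand-in when true depth maps are unavailable."""
    h, w = image.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    d = np.sqrt((ys - h / 2) ** 2 + (xs - w / 2) ** 2)
    d = d / d.max()                    # normalize pseudo-depth to [0, 1]
    t = np.exp(-beta * d)[..., None]   # transmission map, shape (H, W, 1)
    foggy = image * t + airlight * (1.0 - t)
    return foggy.clip(0.0, 1.0)

# Usage: augment a float RGB image in [0, 1] before it enters the loader.
clean = np.random.rand(480, 640, 3).astype(np.float32)
foggy = add_synthetic_fog(clean, beta=2.0)
```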

18 pages, 16918 KiB  
Article
Advancing Road Safety: A Comprehensive Evaluation of Object Detection Models for Commercial Driver Monitoring Systems
by Huma Zia, Imtiaz ul Hassan, Muhammad Khurram, Nicholas Harris, Fatima Shah and Nimra Imran
Future Transp. 2025, 5(1), 2; https://doi.org/10.3390/futuretransp5010002 - 1 Jan 2025
Viewed by 425
Abstract
This paper addresses the critical issue of road safety, given transportation's indispensable role in societal well-being and economic growth. Despite global initiatives like Vision Zero, traffic accidents persist, largely influenced by driver behavior. Advanced driver monitoring systems (ADMSs) utilizing computer vision have emerged to mitigate this issue, but existing systems are often costly and inaccessible, particularly for bus companies. This study introduces a lightweight, deep-learning-based ADMS tailored for real-time driver behavior monitoring, addressing practical barriers to enhance safety measures. A meticulously curated dataset, encompassing diverse demographics and lighting conditions, captures 4966 images depicting five key driver behaviors: eye closure, yawning, smoking, mobile phone usage, and seatbelt compliance. Three object detection models (Faster R-CNN, RetinaNet, and YOLOv5) were evaluated using critical performance metrics. YOLOv5 demonstrated exceptional efficiency, achieving 125 FPS, a compact model size of 42 MB, and an mAP@IoU 50% of 93.6%; its performance reflects a favorable trade-off between speed, model size, and prediction accuracy, making it ideal for real-time applications. Faster R-CNN achieved 8.56 FPS, a model size of 835 MB, and an mAP@IoU 50% of 89.93%, while RetinaNet recorded 16.24 FPS, a model size of 442 MB, and an mAP@IoU 50% of 87.63%. Practical deployment of the ADMS on a mini CPU demonstrated cost-effectiveness and high performance, enhancing accessibility in real-world settings. By elucidating the strengths and limitations of different object detection models, this research contributes to advancing road safety through affordable, efficient, and reliable technology solutions.
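FPS figures like those above are straightforward to reproduce. The sketch below times repeated forward passes of a pretrained torchvision RetinaNet on a dummy image; absolute numbers depend heavily on hardware, batch size, and resolution, so the model choice and 640×640 input here are illustrative assumptions.

```python
import time
import torch
import torchvision

def measure_fps(model, n_warmup=5, n_runs=50, size=(3, 640, 640)):
    """Rough throughput benchmark; treat results as relative, since
    they vary with hardware, input size, and batch size."""
    model.eval()
    image = [torch.rand(*size)]
    with torch.no_grad():
        for _ in range(n_warmup):       # warm up caches and lazy init
            model(image)
        start = time.perf_counter()
        for _ in range(n_runs):
            model(image)
    return n_runs / (time.perf_counter() - start)

model = torchvision.models.detection.retinanet_resnet50_fpn(weights="DEFAULT")
print(f"{measure_fps(model):.2f} FPS")
```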

20 pages, 2870 KiB  
Article
Research on Mine-Personnel Helmet Detection Based on Multi-Strategy-Improved YOLOv11
by Lei Zhang, Zhipeng Sun, Hongjing Tao, Meng Wang and Weixun Yi
Sensors 2025, 25(1), 170; https://doi.org/10.3390/s25010170 - 31 Dec 2024
Viewed by 372
Abstract
In the complex environment of fully mechanized mining faces, current object detection algorithms face significant challenges in achieving accurate, real-time detection of mine personnel and safety helmets. The difficulty arises from factors such as uneven lighting and equipment obstructions, which often lead to missed detections and thus pose a considerable challenge to effective mine safety management. This article presents an enhanced algorithm based on YOLOv11n, referred to as GCB-YOLOv11. The improvements span three key aspects: firstly, the traditional convolution is replaced with GSConv, which significantly enhances feature extraction while reducing computational cost; secondly, a novel C3K2_FE module is designed that integrates the Faster_block and ECA attention mechanisms, improving detection accuracy while accelerating detection speed; finally, the BiFPN mechanism is introduced in the neck to optimize the efficiency of multi-scale feature fusion and address feature loss and redundancy. Experimental results demonstrate that GCB-YOLOv11 performs strongly on the mine personnel and safety helmet dataset, achieving a mean average precision of 93.6% and a speed of 90.3 frames per second, increases of 3.3% and 9.4%, respectively, over the baseline model. In addition, compared to models such as YOLOv5s, YOLOv8s, YOLOv3-Tiny, Fast R-CNN, and RT-DETR, GCB-YOLOv11 demonstrates superior performance in both detection accuracy and model complexity. This highlights its advantages in mining environments and offers a viable technical solution for enhancing the safety of mine personnel.
(This article belongs to the Special Issue Recent Advances in Optical Sensor for Mining)
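The ECA attention mechanism mentioned above is a published, lightweight channel-attention design (Wang et al., 2020). The sketch below gives a generic implementation of ECA alone, not the paper's combined C3K2_FE module.

```python
import math
import torch
import torch.nn as nn

class ECA(nn.Module):
    """Efficient Channel Attention: a 1-D convolution over the pooled
    channel descriptor replaces the fully connected bottleneck of SE."""
    def __init__(self, channels, gamma=2, b=1):
        super().__init__()
        t = int(abs((math.log2(channels) + b) / gamma))  # adaptive kernel size
        k = t if t % 2 else t + 1                        # force odd kernel
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)

    def forward(self, x):
        y = x.mean(dim=(2, 3))            # global average pool: (N, C)
        y = self.conv(y.unsqueeze(1))     # local channel interaction: (N, 1, C)
        y = torch.sigmoid(y).transpose(1, 2).unsqueeze(-1)  # gate: (N, C, 1, 1)
        return x * y

print(ECA(64)(torch.rand(2, 64, 32, 32)).shape)  # torch.Size([2, 64, 32, 32])
```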

17 pages, 6879 KiB  
Article
Machine Learning Models for Artist Classification of Cultural Heritage Sketches
by Gianina Chirosca, Roxana Rădvan, Silviu Mușat, Matei Pop and Alecsandru Chirosca
Appl. Sci. 2025, 15(1), 212; https://doi.org/10.3390/app15010212 - 30 Dec 2024
Viewed by 357
Abstract
Modern computer vision algorithms allow researchers and art historians to extract artist-characteristic contours from sketches, thus providing accurate input for artwork analysis, for possible attributions and classifications, and for the identification of specific stylistic features. We approach this challenging task with three machine learning algorithms and evaluate their performance on a small collection of images from five distinct artists. These algorithms aim to find the most appropriate artist for a sketch (or a contour of a sketch), with promising results at a high confidence level (around 92%). The models start from common Faster R-CNN architectures, reinforcement learning, and vector extraction tools. The proposed approach provides a basis for future improvements toward a tool that aids artwork evaluators.
(This article belongs to the Special Issue Applications in Computer Vision and Image Processing)

25 pages, 19869 KiB  
Article
PMDNet: An Improved Object Detection Model for Wheat Field Weed
by Zhengyuan Qi and Jun Wang
Agronomy 2025, 15(1), 55; https://doi.org/10.3390/agronomy15010055 - 28 Dec 2024
Viewed by 323
Abstract
Efficient and accurate weed detection in wheat fields is critical for precision agriculture to optimize crop yield and minimize herbicide usage. A dataset for weed detection in wheat fields was created, encompassing 5967 images across eight well-balanced weed categories and comprehensively covering the entire growth cycle of spring wheat as well as the weed species observed throughout this period. Based on this dataset, PMDNet, an improved object detection model built upon the YOLOv8 architecture, was introduced and optimized for wheat field weed detection. PMDNet incorporates the Poly Kernel Inception Network (PKINet) as the backbone, the self-designed Multi-Scale Feature Pyramid Network (MSFPN) for multi-scale feature fusion, and Dynamic Head (DyHead) as the detection head, resulting in significant performance improvements. Compared to the baseline YOLOv8n model, PMDNet increased mAP@0.5 from 83.6% to 85.8% (+2.2%) and mAP@0.5:0.95 from 65.7% to 69.6% (+5.9%). Furthermore, PMDNet outperformed several classical single-stage and two-stage object detection models, achieving the highest precision (94.5%, 14.1% higher than Faster R-CNN) and mAP@0.5 (85.8%, 5.4% higher than RT-DETR-L). Under the stricter mAP@0.5:0.95 metric, PMDNet reached 69.6%, surpassing Faster R-CNN by 16.7% and RetinaNet by 13.1%. Real-world video detection tests further validated PMDNet's practicality, achieving 87.7 FPS and demonstrating high precision in detecting weeds against complex backgrounds, including small targets. These advancements highlight PMDNet's potential for practical applications in precision agriculture, providing a robust solution for weed management and contributing to the development of sustainable farming practices.
(This article belongs to the Section Precision and Digital Agriculture)
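The mAP@0.5 and mAP@0.5:0.95 metrics quoted throughout these results can be computed with off-the-shelf tooling. The sketch below uses torchmetrics' MeanAveragePrecision on a single toy prediction/target pair; the box values are fabricated solely to exercise the metric.

```python
import torch
from torchmetrics.detection import MeanAveragePrecision

# Toy illustration of the mAP@0.5 and mAP@0.5:0.95 metrics;
# boxes use (x1, y1, x2, y2) format.
preds = [{
    "boxes": torch.tensor([[10.0, 10.0, 50.0, 50.0]]),
    "scores": torch.tensor([0.95]),
    "labels": torch.tensor([0]),
}]
targets = [{
    "boxes": torch.tensor([[12.0, 12.0, 48.0, 48.0]]),
    "labels": torch.tensor([0]),
}]

metric = MeanAveragePrecision()  # default IoU grid 0.5:0.05:0.95
metric.update(preds, targets)
result = metric.compute()
print(result["map_50"], result["map"])  # mAP@0.5 and mAP@0.5:0.95
```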

19 pages, 10948 KiB  
Article
Detecting Plant Diseases Using Machine Learning Models
by Nazar Kohut, Oleh Basystiuk, Nataliya Shakhovska and Nataliia Melnykova
Sustainability 2025, 17(1), 132; https://doi.org/10.3390/su17010132 - 27 Dec 2024
Viewed by 383
Abstract
Sustainable agriculture is pivotal to global food security and economic stability, with plant disease detection being a key challenge in ensuring healthy crop production. Early and accurate identification of plant diseases can significantly enhance agricultural practices, minimize crop losses, and reduce environmental impacts. This paper presents an approach to sustainable development that leverages machine learning models to detect plant diseases, focusing on tomato crops, a vital and globally significant agricultural product. Advanced object detection models, including YOLOv8 (minor and nano variants), Roboflow 3.0 (Fast), EfficientDetV2 (with an EfficientNetB0 backbone), and Faster R-CNN (with a ResNet50 backbone), were evaluated for their precision, efficiency, and suitability for mobile and field applications. YOLOv8 nano emerged as the optimal choice, offering a mean average precision (mAP) of 98.6% with minimal computational requirements, facilitating its integration into mobile applications for real-time support to farmers. This research underscores the potential of machine learning in advancing sustainable agriculture and highlights future opportunities to integrate these models with drone technology, Internet of Things (IoT)-based irrigation, and disease management systems. Expanding datasets and exploring alternative models could enhance this technology's efficacy and adaptability to diverse agricultural contexts.

22 pages, 9808 KiB  
Article
An Efficient Group Convolution and Feature Fusion Method for Weed Detection
by Chaowen Chen, Ying Zang, Jinkang Jiao, Daoqing Yan, Zhuorong Fan, Zijian Cui and Minghua Zhang
Agriculture 2025, 15(1), 37; https://doi.org/10.3390/agriculture15010037 - 27 Dec 2024
Viewed by 379
Abstract
Weed detection is a crucial step in achieving intelligent weeding for vegetables. Research on vegetable weed detection technology is currently limited, and existing detection methods still struggle under complex natural conditions, resulting in low detection accuracy and efficiency. This paper proposes the YOLOv8-EGC-Fusion (YEF) model, an enhancement of the YOLOv8 model, to address these challenges. The model introduces three plug-and-play modules: (1) the Efficient Group Convolution (EGC) module leverages convolution kernels of various sizes combined with group convolution to significantly reduce computational cost, and integrating this EGC module with the C2f module creates the C2f-EGC module, strengthening the model's capacity to grasp local contextual information; (2) the Group Context Anchor Attention (GCAA) module strengthens the model's capacity to capture long-range contextual information, contributing to improved feature comprehension; (3) the GCAA-Fusion module effectively merges multi-scale features, addressing shallow feature loss and preserving critical information. Leveraging GCAA-Fusion and PAFPN, we developed an Adaptive Feature Fusion (AFF) feature pyramid structure that amplifies the model's feature extraction capabilities. To ensure effective evaluation, we collected a diverse dataset of weed images from various vegetable fields. A series of comparative experiments verified the detection effectiveness of the YEF model: it outperforms the original YOLOv8 model, Faster R-CNN, RetinaNet, TOOD, RTMDet, and YOLOv5, achieving a precision of 0.904, recall of 0.88, F1 score of 0.891, and mAP@0.5 of 0.929. In conclusion, the YEF model demonstrates high detection accuracy for vegetable and weed identification, meeting the requirements for precise detection.
(This article belongs to the Special Issue Intelligent Agricultural Machinery Design for Smart Farming)
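Group convolution, the basic mechanism behind the EGC module, divides parameter count and FLOPs by the group count. The short sketch below makes the saving concrete with PyTorch layer parameter counts; the 128-channel, 4-group configuration is an arbitrary example, not the paper's design.

```python
import torch.nn as nn

# Grouped convolution splits channels into independent groups, so each
# filter sees only C/groups input channels, shrinking weights and FLOPs.
standard = nn.Conv2d(128, 128, kernel_size=3, padding=1)
grouped = nn.Conv2d(128, 128, kernel_size=3, padding=1, groups=4)

def n_params(m):
    return sum(p.numel() for p in m.parameters())

print(n_params(standard))  # 147584 = 128*128*9 + 128 bias terms
print(n_params(grouped))   # 36992  = 128*(128/4)*9 + 128 bias terms
```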

18 pages, 8018 KiB  
Article
STBNA-YOLOv5: An Improved YOLOv5 Network for Weed Detection in Rapeseed Field
by Tao Tao and Xinhua Wei
Agriculture 2025, 15(1), 22; https://doi.org/10.3390/agriculture15010022 - 25 Dec 2024
Viewed by 363
Abstract
Rapeseed is one of the primary oil crops, yet it faces significant threats from weeds. The ideal method for applying herbicides would be selective variable spraying, but the primary challenge lies in automatically identifying weeds. To address the issues of dense weed identification, frequent occlusion, and varying weed sizes in rapeseed fields, this paper introduces the STBNA-YOLOv5 weed detection model with three enhancements: a Swin Transformer encoder block to bolster feature extraction capabilities, a BiFPN structure coupled with a NAM attention mechanism module to efficiently harness feature information, and an adaptive spatial fusion module to enhance recognition sensitivity. Additionally, a random occlusion technique and a weed-category image data augmentation method are employed to diversify the dataset. Experimental results demonstrate that the STBNA-YOLOv5 model outperforms detection models such as SSD, Faster R-CNN, YOLOv3, DETR, and EfficientDet, achieving a precision, F1-score, and mAP@0.5 of 0.644, 0.825, and 0.908, respectively. For multi-target weed detection, the study presents detection results under various field conditions, including sunny, cloudy, unobstructed, and obstructed. The results indicate that the weed detection model can accurately identify both rapeseed and weed species, demonstrating high stability.
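The random occlusion augmentation mentioned above has a simple generic form, similar to random erasing; the sketch below blanks a random rectangle in an image array. The 20% maximum size and zero fill value are assumptions of this illustration, not the paper's settings.

```python
import random
import numpy as np

def random_occlusion(image, max_frac=0.2, fill=0.0):
    """Random-erasing-style occlusion: blank out a random rectangle so
    the detector learns to recognize partially hidden plants."""
    h, w = image.shape[:2]
    oh = random.randint(1, int(h * max_frac))   # occluder height
    ow = random.randint(1, int(w * max_frac))   # occluder width
    y = random.randint(0, h - oh)
    x = random.randint(0, w - ow)
    out = image.copy()
    out[y:y + oh, x:x + ow] = fill
    return out

# Usage: apply to a float image before it enters the training pipeline.
augmented = random_occlusion(np.random.rand(640, 640, 3))
```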

15 pages, 2776 KiB  
Article
Research on a Target Detection Algorithm for Common Pests Based on an Improved YOLOv7-Tiny Model
by He Gong, Xiaodan Ma and Ying Guo
Agronomy 2024, 14(12), 3068; https://doi.org/10.3390/agronomy14123068 - 23 Dec 2024
Viewed by 368
Abstract
In agriculture and forestry, pest detection is critical for increasing crop yields and reducing economic losses. However, traditional deep learning models face challenges in resource-constrained environments, such as insufficient accuracy, slow inference, and large model sizes, which hinder their practical application. To address these issues, this study proposes an improved YOLOv7-tiny model designed to deliver efficient, accurate, and lightweight pest detection. The main improvements are as follows: (1) lightweight network design: the backbone network is optimized by integrating GhostNet and Dynamic Region-Aware Convolution (DRConv) to enhance computational efficiency; (2) feature sharing enhancement: a cross-layer feature sharing network (CotNet Transformer) strengthens feature fusion and extraction capabilities; (3) activation function optimization: the traditional ReLU activation function is replaced with the Gaussian Error Linear Unit (GELU) to improve nonlinear expression and classification performance. Experimental results demonstrate that the improved model surpasses YOLOv7-tiny in accuracy, inference speed, and model size, achieving an mAP@0.5 of 92.8%, reducing inference time to 4.0 milliseconds, and shrinking the model to just 4.8 MB. Additionally, compared to algorithms like Faster R-CNN, SSD, and RetinaNet, the improved model delivers superior detection performance. In conclusion, the improved YOLOv7-tiny provides an efficient and practical solution for intelligent pest detection in agriculture and forestry.
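Replacing ReLU with GELU in an existing network is a mechanical change. The sketch below walks a PyTorch module tree and swaps every ReLU in place; the two-layer block is a stand-in, not the YOLOv7-tiny backbone.

```python
import torch.nn as nn

def relu_to_gelu(module):
    """Recursively replace every nn.ReLU with nn.GELU, which weights
    inputs by the Gaussian CDF instead of a hard zero cutoff."""
    for name, child in module.named_children():
        if isinstance(child, nn.ReLU):
            setattr(module, name, nn.GELU())
        else:
            relu_to_gelu(child)

block = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
)
relu_to_gelu(block)
print(block)  # both ReLU layers are now GELU
```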

19 pages, 10695 KiB  
Article
A Scene Knowledge Integrating Network for Transmission Line Multi-Fitting Detection
by Xinhang Chen, Xinsheng Xu, Jing Xu, Wenjie Zheng and Qianming Wang
Sensors 2024, 24(24), 8207; https://doi.org/10.3390/s24248207 - 23 Dec 2024
Viewed by 338
Abstract
Aiming at the severe occlusion problem and the tiny-scale object problem in the multi-fitting detection task, the Scene Knowledge Integrating Network (SKIN), comprising a scene filter module (SFM) and a scene structure information module (SSIM), is proposed. Firstly, the particularity of the scene in the multi-fitting detection task is analyzed: the aggregation of the fittings is defined as the scene, in line with professional knowledge of the power field and the way operators habitually identify fittings. Scene knowledge therefore includes global context information, fine-grained visual information of the fittings, and scene structure information. Then, the scene filter module is designed to learn the global context and fine-grained visual information, and the scene structure module is designed to learn the scene structure information. Finally, the scene semantic features are used as the carrier to integrate the three categories of information into relative scene features, which can assist in recognizing occluded and tiny-scale fittings after feature mining and feature integration. The experiments show that the proposed network effectively improves performance on the multi-fitting detection task compared with Faster R-CNN and other state-of-the-art models. In particular, detection of occluded and tiny-scale fittings is significantly improved.

21 pages, 15422 KiB  
Article
A Lightweight Model for Weed Detection Based on the Improved YOLOv8s Network in Maize Fields
by Jinyong Huang, Xu Xia, Zhihua Diao, Xingyi Li, Suna Zhao, Jingcheng Zhang, Baohua Zhang and Guoqiang Li
Agronomy 2024, 14(12), 3062; https://doi.org/10.3390/agronomy14123062 - 22 Dec 2024
Viewed by 448
Abstract
To address the computational intensity and deployment difficulties associated with weed detection models, a lightweight target detection model for weeds based on YOLOv8s in maize fields was proposed in this study. Firstly, a lightweight network, designated Dualconv High Performance GPU Net (D-PP-HGNet), was constructed on the foundation of the High Performance GPU Net (PP-HGNet) framework: Dualconv was introduced to reduce computation, and an Adaptive Feature Aggregation Module (AFAM) and Global Max Pooling were incorporated to augment the extraction of salient features in complex scenarios. The new network was then used to reconstruct the YOLOv8s backbone. Secondly, a four-stage inverted residual moving block (iRMB) was employed to construct a lightweight iDEMA module, which replaced the original C2f feature extraction module in the neck to improve model performance and accuracy. Finally, Dualconv was employed instead of conventional convolution for downsampling, further diminishing the network load. The new model was fully verified on the established field weed dataset. Test results showed a notable improvement in detection performance over YOLOv8s: accuracy improved from 91.2% to 95.8%, recall from 87.9% to 93.2%, and mAP@0.5 from 90.8% to 94.5%. Furthermore, GFLOPs and model size were reduced to 12.7 G and 9.1 MB, decreases of 57.4% and 59.2% relative to the original model. Compared with prevalent target detection models such as Faster R-CNN, YOLOv5s, and YOLOv8l, the new model was both more accurate and more lightweight. The model proposed in this paper effectively reduces the hardware cost of accurate weed identification in maize fields with limited resources.
(This article belongs to the Collection AI, Sensors and Robotics for Smart Agriculture)
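The iRMB named above builds on the inverted residual pattern from MobileNetV2. The sketch below implements that generic pattern (1×1 expand, depthwise 3×3, 1×1 project, with a skip connection); it is not the paper's four-stage iDEMA module.

```python
import torch
import torch.nn as nn

class InvertedResidual(nn.Module):
    """MobileNetV2-style inverted residual: expand with 1x1, filter with
    a depthwise 3x3, project back with 1x1, then add the skip path."""
    def __init__(self, channels, expansion=4):
        super().__init__()
        hidden = channels * expansion
        self.block = nn.Sequential(
            nn.Conv2d(channels, hidden, 1, bias=False),
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, hidden, 3, padding=1, groups=hidden, bias=False),
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, channels, 1, bias=False),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return x + self.block(x)  # residual connection at equal channels

print(InvertedResidual(64)(torch.rand(1, 64, 80, 80)).shape)
```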

27 pages, 24936 KiB  
Article
Multipath and Deep Learning-Based Detection of Ultra-Low Moving Targets Above the Sea
by Zhaolong Wang, Xiaokuan Zhang, Weike Feng, Binfeng Zong, Tong Wang, Cheng Qi and Xixi Chen
Remote Sens. 2024, 16(24), 4773; https://doi.org/10.3390/rs16244773 - 21 Dec 2024
Viewed by 314
Abstract
An intelligent approach is proposed and investigated in this paper for the detection of ultra-low-altitude sea-skimming moving targets by airborne pulse Doppler radar. Without suppressing interference, the proposed method uses both target and multipath information for detection, based on their distinguishable image features and deep learning (DL) techniques. First, the image features of the target, multipath, and sea clutter in real-measured range-Doppler (RD) maps are analyzed, based on which the target and multipath are defined together as the generalized target. Then, based on the composite electromagnetic scattering mechanism of the target and the ocean surface, a scattering-based echo generation model is established and validated to generate sufficient data for DL network training. Finally, the RD features of the generalized target are learned by training DL-based target detectors such as you-only-look-once version 7 (YOLOv7) and Faster R-CNN. The detection results show the high performance of the proposed method on both simulated and real-measured data without suppressing interference (e.g., clutter, jamming, and noise). In particular, even when the target is submerged in clutter, it can still be detected by the proposed method based on the multipath feature.
(This article belongs to the Special Issue Array and Signal Processing for Radar)
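For context, a range-Doppler map is formed by a Fourier transform across pulses. The sketch below builds a toy RD map from a synthetic pulse matrix with one moving scatterer; this is the textbook construction, assuming range bins are already formed, not the paper's measured airborne radar pipeline.

```python
import numpy as np

def range_doppler_map(echoes):
    """Form a range-Doppler map from a pulse matrix of shape
    (n_pulses, n_range_bins): an FFT across pulses (slow time)
    resolves Doppler, yielding the image in which target and
    multipath returns appear as localized responses."""
    rd = np.fft.fftshift(np.fft.fft(echoes, axis=0), axes=0)
    return 20 * np.log10(np.abs(rd) + 1e-12)  # magnitude in dB

# Toy example: one scatterer at range bin 100 with constant Doppler.
n_pulses, n_bins = 64, 256
t = np.arange(n_pulses)[:, None]
echoes = np.zeros((n_pulses, n_bins), dtype=complex)
echoes[:, 100:101] = np.exp(2j * np.pi * 0.2 * t)  # 0.2 cycles/pulse
print(range_doppler_map(echoes).shape)  # (64, 256)
```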

22 pages, 5498 KiB  
Article
Small-Sample Target Detection Across Domains Based on Supervision and Distillation
by Fusheng Sun, Jianli Jia, Xie Han, Liqun Kuang and Huiyan Han
Electronics 2024, 13(24), 4975; https://doi.org/10.3390/electronics13244975 - 18 Dec 2024
Viewed by 382
Abstract
To address the issues of significant object discrepancies, low similarity, and image noise interference between source and target domains in object detection, we propose a supervised learning approach combined with knowledge distillation. Initially, student and teacher models are jointly trained through supervised and distillation-based approaches, iteratively refining the inter-model weights to mitigate model overfitting. Secondly, a combined convolutional module is integrated into the feature extraction network of the student model to minimize redundant computation; an explicit visual center module is embedded within the feature pyramid network to bolster feature representation; and a spatial grouping enhancement module is incorporated into the region proposal network to mitigate the adverse effects of noise on the outcomes. Ultimately, the model is optimized with the loss functions originating from both the supervised and knowledge distillation phases. The experimental results demonstrate that this strategy significantly boosts classification and identification accuracy on cross-domain datasets: compared with TFA (Task-agnostic Fine-tuning and Adapter), CD-FSOD (Cross-Domain Few-Shot Object Detection), and DeFRCN (Decoupled Faster R-CNN for Few-Shot Object Detection), it increases detection accuracy by 1.67% and 1.87% at sample sizes of 1 and 5, respectively.
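The joint supervised-plus-distillation objective described above typically combines a hard-label loss with a softened teacher-matching term. The sketch below is the classic Hinton-style formulation for classification logits, offered as a generic reference rather than the paper's exact loss; the temperature and weighting values are arbitrary.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Blend the hard-label cross-entropy with a KL term that matches
    the student's temperature-softened outputs to the teacher's."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradients match the hard-loss magnitude
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Usage with random logits for an 8-sample, 10-class toy batch.
loss = distillation_loss(torch.randn(8, 10), torch.randn(8, 10),
                         torch.randint(0, 10, (8,)))
print(loss)
```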
