Search Results (624)

Search Parameters:
Keywords = focal loss

27 pages, 3600 KiB  
Article
Normalization-Guided and Gradient-Weighted Unsupervised Domain Adaptation Network for Transfer Diagnosis of Rolling Bearing Faults Under Class Imbalance
by Hao Luo, Xinyue Wang and Li Zhang
Actuators 2025, 14(1), 39; https://doi.org/10.3390/act14010039 (registering DOI) - 18 Jan 2025
Viewed by 178
Abstract
Transfer learning has garnered significant interest in the field of bearing fault diagnosis under varying operational conditions due to its robust generalization capabilities. However, real-world diagnostic scenarios frequently encounter data imbalances, which complicates the learning of the classification boundary for the minority class within the diagnostic model. To address this challenge, we propose a normalization-guided and gradient-weighted unsupervised domain adaptation network (NG-UDAN) for intelligent bearing fault diagnosis, aimed at tackling inter-domain feature shifts and intra-domain category imbalances. Firstly, the proposed network integrates a residual feature extractor with the Domain Normalization (DN) module to enhance domain-invariant feature extraction. Subsequently, the Local Maximum Mean Discrepancy (LMMD) loss is utilized to minimize the conditional distributional differences between the source and target domains. Finally, the Gradient-Weighted Focal Loss (GWFL) is specifically designed to address the issue of class imbalance. Experiments conducted across three imbalanced scenarios using the Case Western Reserve University (CWRU) and Paderborn University (PU) datasets demonstrate that NG-UDAN is effective in both single-source and mixed-source domain adaptation. Furthermore, comparisons with alternative methods validate the superiority of this approach in managing class imbalances under varying working conditions. Full article
(This article belongs to the Section Actuators for Manufacturing Systems)
16 pages, 3773 KiB  
Article
MDA-DETR: Enhancing Offending Animal Detection with Multi-Channel Attention and Multi-Scale Feature Aggregation
by Haiyan Zhang, Huiqi Li, Guodong Sun and Feng Yang
Animals 2025, 15(2), 259; https://doi.org/10.3390/ani15020259 - 17 Jan 2025
Viewed by 208
Abstract
Conflicts between humans and animals in agricultural and settlement areas have recently increased, resulting in significant resource loss and risks to human and animal lives. This growing issue presents a global challenge. This paper addresses the detection and identification of offending animals, particularly in obscured or blurry nighttime images. This article introduces Multi-Channel Coordinated Attention and Multi-Dimension Feature Aggregation (MDA-DETR). It integrates multi-scale features for enhanced detection accuracy, employing a Multi-Channel Coordinated Attention (MCCA) mechanism to incorporate location, semantic, and long-range dependency information and a Multi-Dimension Feature Aggregation Module (DFAM) for cross-scale feature aggregation. Additionally, the VariFocal Loss function is utilized to assign pixel weights, enhancing detail focus and maintaining accuracy. In the dataset section, this article uses a dataset from the Northeast China Tiger and Leopard National Park, which includes images of six common offending animal species. In the comprehensive experiments on the dataset, the mAP50 index of MDA-DETR was 1.3%, 0.6%, 0.3%, 3%, 1.1%, and 0.5% higher than RT-DETR-r18, yolov8n, yolov9-C, DETR, Deformable-detr, and DCA-yolov8, respectively, indicating that MDA-DETR is superior to other advanced methods. Full article
(This article belongs to the Special Issue Animal–Computer Interaction: Advances and Opportunities)
21 pages, 9000 KiB  
Article
An Investigation of Infrared Small Target Detection by Using the SPT–YOLO Technique
by Yongjun Qi, Shaohua Yang, Zhengzheng Jia, Yuanmeng Song, Jie Zhu, Xin Liu and Hongxing Zheng
Technologies 2025, 13(1), 40; https://doi.org/10.3390/technologies13010040 - 17 Jan 2025
Viewed by 411
Abstract
To detect and recognize small targets submerged in complex backgrounds in infrared images, we combine a dynamic receptive field fusion strategy with a multi-scale feature fusion mechanism to significantly improve small-target detection performance. First, the space-to-depth convolution module is introduced as a downsampling layer in the backbone; it achieves the same sampling effect while retaining more detailed information, which enhances the model’s detection capability for small targets. Then, the pyramid level 2 feature map, which has the minimum receptive field and maximum resolution, is added to the neck, reducing the loss of positional information during feature sampling. Furthermore, x-small detection heads are added, which strengthens the understanding of the overall characteristics and structure of the target and improves the representation and localization of small targets. Finally, the cross-entropy loss function in the original network model is replaced by an adaptive threshold focal loss function, forcing the model to allocate more attention to target features. The resulting model is based on an improved version of a public tool, the eighth version of You Only Look Once (YOLO), and is named SPT–YOLO (SPDConv + P2 + Adaptive Threshold + YOLOV8s) in this paper. Experiments on datasets such as infrared small object detection (IR-SOD) and infrared small target detection 1K (IRSTD-1K) have been executed to verify the proposed algorithm; mean average precision values of 94.0% and 69% are obtained at a threshold of 0.5 and over the threshold range from 0.5 to 0.95, respectively. The results show that the proposed method achieves the best performance of infrared small target detection compared to existing methods. Full article
(This article belongs to the Section Information and Communication Technologies)
16 pages, 2425 KiB  
Article
Improved Grain Boundary Reconstruction Method Based on Channel Attention Mechanism
by Xianyin Duan, Yang Chen, Xianbao Duan, Zhijun Rong, Wunan Nie and Jinwei Gao
Materials 2025, 18(2), 253; https://doi.org/10.3390/ma18020253 - 8 Jan 2025
Viewed by 396
Abstract
The grain size of metal materials has a significant impact on their macroscopic properties. However, original metallographic images often suffer from issues such as substantial noise, missing grain boundaries, low contrast, and blurred edges. These challenges hinder the accurate extraction of complete grain boundaries, limiting the precision of grain size measurement and material performance prediction. Therefore, effectively reconstructing incomplete grain boundaries is particularly crucial. This paper proposes a grain boundary reconstruction and grain size measurement method based on an improved channel attention mechanism. A generative adversarial network (GAN) serves as the backbone, with a custom-designed channel attention module embedded in the generator. Combined with a global context attention mechanism, the method captures the global contextual information of the image, enhancing the network’s semantic understanding and reconstruction accuracy for regions with missing grain boundaries. During the image reconstruction process, the method effectively leverages long-range feature correlations within the image, significantly improving network performance. To address the Mode Collapse observed during experiments, the loss function is optimized using Focal Loss, balancing the ratio of positive and negative samples and improving network robustness. Compared with other attention modules, the improved channel attention module significantly enhances the performance of the generative network. Experimental results demonstrate that the generative network based on this module outperforms comparable modules in terms of MIoU (86.25%), Accuracy (95.06%), and Precision (86.54%). The grain boundary reconstruction method based on the improved channel attention mechanism not only effectively improves the accuracy of grain boundary reconstruction but also significantly enhances the generalization ability of the network. This provides reliable technical support for the characterization of the microstructure and the performance prediction of metal materials. Full article
24 pages, 3385 KiB  
Article
An Improved Binary Simulated Annealing Algorithm and TPE-FL-LightGBM for Fast Network Intrusion Detection
by Yafei Luo, Ruihan Chen, Chuantao Li, Derong Yang, Kun Tang and Jing Su
Electronics 2025, 14(2), 231; https://doi.org/10.3390/electronics14020231 - 8 Jan 2025
Viewed by 343
Abstract
With the rapid proliferation of the Internet, network security issues that threaten users have become increasingly severe, despite the widespread benefits of Internet access. Most existing intrusion detection systems (IDS) suffer from suboptimal performance due to data imbalance and feature redundancy, while also facing high computational complexity in areas such as feature selection and optimization. To address these challenges, this study proposes a novel network intrusion detection method based on an improved binary simulated annealing algorithm (IBSA) and TPE-FL-LightGBM. First, by integrating Focal Loss into the loss function of the LightGBM classifier, we introduce cost-sensitive learning, which effectively mitigates the impact of class imbalance on model performance and enhances the model’s ability to learn difficult-to-classify samples. Next, significant improvements are made to the simulated annealing algorithm, including adaptive adjustments of the initial temperature and Metropolis criterion, the incorporation of multi-neighborhood search strategies, and the integration of an S-shaped transfer function. These improvements enable the IBSA method to achieve efficient optimal feature selection with fewer iterations. Finally, the Tree-structured Parzen Estimator (TPE) algorithm is employed to optimize the structure of the FL-LightGBM classifier, further enhancing its performance. Through comprehensive visual analysis, ablation studies, and comparative experiments on the NSL-KDD and UNSW-NB15 datasets, the reliability of the proposed network intrusion detection method is validated. Full article
(This article belongs to the Special Issue Artificial Intelligence in Cyberspace Security)
20 pages, 3216 KiB  
Review
Harnessing Genetic Resistance in Maize and Integrated Rust Management Strategies to Combat Southern Corn Rust
by Jiaying Chang, Shizhi Wei, Yueyang Liu, Zhiquan Zhao and Jie Shi
J. Fungi 2025, 11(1), 41; https://doi.org/10.3390/jof11010041 - 7 Jan 2025
Viewed by 457
Abstract
Southern corn rust (SCR) caused by Puccinia polysora Underw. has recently emerged as a focal point of study because of its extensive distribution, significant damage, and high prevalence in maize growing areas such as the United States, Canada, and China. P. polysora is an obligate biotrophic fungal pathogen that cannot be cultured in vitro or genetically modified, thus complicating the study of the molecular bases of its pathogenicity. High temperatures and humid environmental conditions favor SCR development. In severe cases, SCR may inhibit photosynthesis and cause early desiccation of maize, a decrease in kernel weight, and yield loss. Consequently, an expedited and accurate detection approach for SCR is essential for plant protection and disease management. Significant progress has been made in elucidating the pathogenic mechanisms of P. polysora, identifying resistance genes and developing SCR-resistant cultivars. A detailed understanding of the molecular interactions between maize and P. polysora will facilitate the development of novel and effective approaches for controlling SCR. This review gives a concise overview of the biological characteristics and symptoms of SCR, its life cycle, the molecular basis of interactions between maize and P. polysora, the genetic resistance of maize to SCR, the network of maize resistance to P. polysora infection, SCR management, and future perspectives. Full article
21 pages, 6316 KiB  
Article
A Method for Tomato Ripeness Recognition and Detection Based on an Improved YOLOv8 Model
by Zhanshuo Yang, Yaxian Li, Qiyu Han, Haoming Wang, Chunjiang Li and Zhandong Wu
Horticulturae 2025, 11(1), 15; https://doi.org/10.3390/horticulturae11010015 - 27 Dec 2024
Viewed by 381
Abstract
With the rapid development of agriculture, tomatoes, as an important economic crop, require accurate ripeness recognition technology to enable selective harvesting. Therefore, intelligent tomato ripeness recognition plays a crucial role in agricultural production. However, factors such as lighting conditions and occlusion lead to issues such as low detection accuracy, false detections, and missed detections. Thus, a deep learning algorithm for tomato ripeness detection based on an improved YOLOv8n is proposed in this study. First, the improved YOLOv8 model is used for tomato target detection and ripeness classification. The RCA-CBAM (Region and Color Attention Convolutional Block Attention Module) module is introduced into the YOLOv8 backbone network to enhance the model’s focus on key features. By incorporating attention mechanisms across three dimensions—color, channel, and spatial attention—the model’s ability to recognize changes in tomato color and spatial positioning is improved. Additionally, the BiFPN (Bidirectional Feature Pyramid Network) module is introduced to replace the traditional PANet connection, which achieves efficient feature fusion across different scales of tomato skin color, size, and surrounding environment and optimizes the expression ability of the feature map. Finally, an Inner-FocalerIoU loss function is designed and integrated to address the difficulty of ripeness classification caused by class imbalance in the samples. The results show that the improved YOLOv8+ model is capable of accurately recognizing the ripeness level of tomatoes, achieving relatively high values of 95.8% precision value and 91.7% accuracy on the test dataset. It is concluded that the new model has strong detection performance and real-time detection. Full article
27 pages, 6723 KiB  
Article
RS-FeatFuseNet: An Integrated Remote Sensing Object Detection Model with Enhanced Feature Extraction
by Yijuan Qiu, Jiefeng Xue, Gang Zhang, Xuying Hao, Tao Lei and Ping Jiang
Remote Sens. 2025, 17(1), 61; https://doi.org/10.3390/rs17010061 - 27 Dec 2024
Viewed by 360
Abstract
With the advancement of satellite and sensor technologies, remote sensing images are playing crucial roles in both civilian and military domains. This paper addresses challenges such as complex backgrounds and scale variations in remote sensing images by proposing a novel attention mechanism called ESHA. This mechanism effectively integrates multi-scale feature information and introduces a multi-head self-attention (MHSA) to better capture contextual information surrounding objects, enhancing the model’s ability to perceive complex scenes. Additionally, we optimized the C2f module of YOLOv8, which enhances the model’s representational capacity by introducing a parallel multi-branch structure to learn features at different levels, resolving feature scarcity issues. During training, we utilized focal loss to handle the issue of imbalanced target class distributions in remote sensing datasets, improving the detection accuracy of challenging objects. The final network model achieved training accuracies of 89.1%, 91.6%, and 73.2% on the DIOR, NWPU VHR-10, and VEDAI datasets, respectively. Full article
(This article belongs to the Special Issue Advanced AI Technology in Remote Sensing)
21 pages, 44945 KiB  
Article
Grape Target Detection Method in Orchard Environment Based on Improved YOLOv7
by Fuchun Sun, Qiurong Lv, Yuechao Bian, Renwei He, Dong Lv, Leina Gao, Haorong Wu and Xiaoxiao Li
Agronomy 2025, 15(1), 42; https://doi.org/10.3390/agronomy15010042 - 27 Dec 2024
Viewed by 287
Abstract
In response to the poor detection performance of grapes in orchards caused by issues such as leaf occlusion and fruit overlap, this study proposes an improved grape detection method named YOLOv7-MCSF based on the You Only Look Once v7 (YOLOv7) framework. Firstly, the original backbone network is replaced with MobileOne to achieve a lightweight improvement of the model, thereby reducing the number of parameters. In addition, a Channel Attention (CA) module was added to the neck network to reduce interference from the orchard background and to accelerate the inference speed. Secondly, the SPPFCSPC pyramid pooling is embedded to enhance the speed of image feature fusion while maintaining a consistent receptive field. Finally, the Focal-EIoU loss function is employed to optimize the regression prediction boxes, accelerating their convergence and improving regression accuracy. The experimental results indicate that, compared to the original YOLOv7 model, the YOLOv7-MCSF model achieves a 26.9% reduction in weight, an increase in frame rate of 21.57 f/s, and improvements in precision, recall, and mAP of 2.4%, 1.8%, and 3.5%, respectively. The improved model can efficiently and in real-time identify grape clusters, providing technical support for the deployment of mobile devices and embedded grape detection systems in orchard environments. Full article
(This article belongs to the Special Issue Remote Sensing Applications in Crop Monitoring and Modelling)
17 pages, 7209 KiB  
Article
Sorghum Spike Detection Method Based on Gold Feature Pyramid Module and Improved YOLOv8s
by Shujin Qiu, Jian Gao, Mengyao Han, Qingliang Cui, Xiangyang Yuan and Cuiqing Wu
Sensors 2025, 25(1), 104; https://doi.org/10.3390/s25010104 - 27 Dec 2024
Viewed by 363
Abstract
In order to solve the problems of high planting density, similar color, and serious occlusion between spikes in sorghum fields, such as difficult identification and detection of sorghum spikes, low accuracy and high false detection, and missed detection rates, this study proposes an improved sorghum spike detection method based on YOLOv8s. The method involves augmenting the information fusion capability of the YOLOv8 model’s neck module by integrating the Gold feature pyramid module. Additionally, the SPPF module is refined with the LSKA attention mechanism to heighten focus on critical features. To tackle class imbalance in sorghum detection and expedite model convergence, a loss function incorporating Focal-EIOU is employed. Consequently, the YOLOv8s-Gold-LSKA model, based on the Gold module and LSKA attention mechanism, is developed. Experimental results demonstrate that this improved method significantly enhances sorghum spike detection accuracy in natural field settings. The improved model achieved a precision of 90.72%, recall of 76.81%, mean average precision (mAP) of 85.86%, and an F1-score of 81.19%. Comparing the improved model of this study with the three target detection models of YOLOv5s, SSD, and YOLOv8, respectively, the improved model of this study has better detection performance. This advancement provides technical support for the rapid and accurate recognition of multiple sorghum spike targets in natural field backgrounds, thereby improving sorghum yield estimation accuracy. It also contributes to increased sorghum production and harvest, as well as the enhancement of intelligent harvesting equipment for agricultural machinery. Full article
(This article belongs to the Special Issue Sensor and AI Technologies in Intelligent Agriculture: 2nd Edition)
19 pages, 8527 KiB  
Article
Spatial and Temporal Changes and Assessment of Multi-Species Habitat in Hainan Jianfengling Protected Area
by Yong Ma, Lixi Liu, Wutao Yao, Zhigao Zeng, Mingjun Zhang, Erping Shang, Shuyan Zhang and Jing Yang
Remote Sens. 2025, 17(1), 46; https://doi.org/10.3390/rs17010046 - 27 Dec 2024
Viewed by 425
Abstract
The loss and fragmentation of wildlife habitats is a major threat to their survival and expansion, and protected areas (PAs) are the main tool for conserving biodiversity and protecting habitats. However, most current studies focus on analyzing suitable habitats for species and rarely analyze the spatial and temporal changes in multi-species habitats in protected areas and the effectiveness of conservation. In this study, we analyzed changes in the suitable habitats of five focal mammal species before and after the incorporation of the Hainan Jianfengling protected area into China’s national parks. We utilized the ensemble species distribution model (ESDM) to assess these changes, based on multi-species infrared camera monitoring data from 2015 to 2016 and 2020 to 2021. Furthermore, we evaluated differences in conservation effectiveness before and after the establishment of the national parks. The results showed that there were some differences in habitat changes among all the species included in this study, and all of them showed the phenomenon of the migration from suitable habitats to the central area. The environmental changes in and around the protected area suggest that the changes are closely related to the increase in anthropogenic activities around the protected area, and it is recommended that the protected area should be better managed at its edges to minimize the impact of anthropogenic disturbances on the species and their habitats. Full article
21 pages, 7395 KiB  
Article
Improved YOLOv8 Model for Phenotype Detection of Horticultural Seedling Growth Based on Digital Cousin
by Yuhao Song, Lin Yang, Shuo Li, Xin Yang, Chi Ma, Yuan Huang and Aamir Hussain
Agriculture 2025, 15(1), 28; https://doi.org/10.3390/agriculture15010028 - 26 Dec 2024
Viewed by 403
Abstract
Crop phenotype detection is a precise way to understand and predict the growth of horticultural seedlings in the smart agriculture era to increase the cost-effectiveness and energy efficiency of agricultural production. Crop phenotype detection requires the consideration of plant stature and agricultural devices, like robots and autonomous vehicles, in smart greenhouse ecosystems. However, collecting the imaging dataset is a challenge facing the deep learning detection of plant phenotype given the dynamic changes among leaves and the temporospatial limits of camera sampling. To address this issue, digital cousin is an improvement on digital twins that can be used to create virtual entities of plants through the creation of dynamic 3D structures and plant attributes using RGB image datasets in a simulation environment, using the principles of the variations and interactions of plants in the physical world. Thus, this work presents a two-phase method to obtain the phenotype of horticultural seedling growth. In the first phase, 3D Gaussian splatting is selected to reconstruct and store the 3D model of the plant with 7000 and 30,000 training rounds, enabling the capture of RGB images and the detection of the phenotypes of the seedlings, overcoming temporal and spatial limitations. In the second phase, an improved YOLOv8 model is created to segment and measure the seedlings, and it is modified by adding the LADH, SPPELAN, and Focaler-ECIoU modules. Compared with the original YOLOv8, the precision of our model is 91%, and the loss metric is lower by approximately 0.24. Moreover, a case study of watermelon seedlings is examined, and the results of the 3D reconstruction of the seedlings show that our model outperforms classical segmentation algorithms on the main metrics, achieving a 91.0% mAP50 (B) and a 91.3% mAP50 (M). Full article
13 pages, 3378 KiB  
Article
Research on Improved YOLOv7 for Traffic Obstacle Detection
by Yifan Yang, Song Cui, Xuan Xiang, Yuxing Bai, Liguo Zang and Hongshan Ding
World Electr. Veh. J. 2025, 16(1), 1; https://doi.org/10.3390/wevj16010001 - 24 Dec 2024
Viewed by 791
Abstract
Object detection and recognition algorithms are widely used in applications such as real-time monitoring and autonomous driving. However, there is limited research on traffic obstacle detection in complex scenarios involving road construction and sudden accidents. This gap results in low accuracy and difficulties in recognizing occluded targets, thereby hindering the further development and widespread adoption of intelligent transportation systems. To address these issues, this paper proposes an improved algorithm based on YOLOv7, incorporating a lightweight coordinate attention mechanism to focus on small objects at long distances and capture target location information. The use of a high receptive field enhances the feature hierarchy within the detection network. Additionally, we introduce the focal efficient intersection over union loss function to address sample imbalance, which accelerates the model’s convergence speed, reduces loss values, and improves overall model stability. Our model achieved a detection accuracy of 98.1%, reflecting a 1.4% increase, while also enhancing detection speed and minimizing missed detections. These advancements significantly bolster the model’s performance, demonstrating advantages for real-world applications. Full article
(This article belongs to the Special Issue Research on Intelligent Vehicle Path Planning Algorithm)
14 pages, 4833 KiB  
Article
Automatic Road Extraction from Historical Maps Using Transformer-Based SegFormers
by Elif Sertel, Can Michael Hucko and Mustafa Erdem Kabadayı
ISPRS Int. J. Geo-Inf. 2024, 13(12), 464; https://doi.org/10.3390/ijgi13120464 - 21 Dec 2024
Viewed by 1033
Abstract
Historical maps are valuable sources of geospatial data for various geography-related applications, providing insightful information about historical land use, transportation infrastructure, and settlements. While transformer-based segmentation methods have been widely applied to image segmentation tasks, they have mostly focused on satellite images. There is a growing need to explore transformer-based approaches for geospatial object extraction from historical maps, given their superior performance over traditional convolutional neural network (CNN)-based architectures. In this research, we aim to automatically extract five different road types from historical maps, using a road dataset digitized from the scanned Deutsche Heereskarte 1:200,000 Türkei (DHK 200 Turkey) maps. We applied the variants of the transformer-based SegFormer model and evaluated the effects of different encoders, batch sizes, loss functions, optimizers, and augmentation techniques on road extraction performance. Our best results, with an intersection over union (IoU) of 0.5411 and an F1 score of 0.7017, were achieved using the SegFormer-B2 model, the Adam optimizer, and the focal loss function. All SegFormer-based experiments outperformed previously reported CNN-based segmentation models on the same dataset. In general, increasing the batch size and using larger SegFormer variants (from B0 to B2) resulted in improved accuracy metrics. Additionally, the choice of augmentation techniques significantly influenced the outcomes. Our results demonstrate that SegFormer models substantially enhance true positive predictions and resulted in higher precision metric values. These findings suggest that the output weights could be directly applied to transfer learning for similar historical maps and the inference of additional DHK maps, while offering a promising architecture for future road extraction studies. Full article
13 pages, 5932 KiB  
Article
Automating an Encoder–Decoder Incorporated Ensemble Model: Semantic Segmentation Workflow on Low-Contrast Underwater Images
by Jale Bektaş
Appl. Sci. 2024, 14(24), 11964; https://doi.org/10.3390/app142411964 - 20 Dec 2024
Viewed by 386
Abstract
Numerous methods have been proposed for semantic segmentation and the state-of-the-art part is likely to be incorporated by deep learning-based methods which show a salient performance. This study addresses the challenge of semantic segmentation in low-contrast imbalanced underwater images. Moreover, it employs nine model fusions as a downstream workflow task using encoder–decoder architectures with Dice Loss and Focal Loss training focusing on the imbalance data. Afterwards, the most effective two encoder–decoder fusion models, Res34+Unet and VGG19+FPN, by 0.592%, 0.590% mIoU on average and by 0.510%, 0.491% F1-score yielded better performance, respectively, than other models. Using a weight-optimization algorithm, the ensemble model with recreated IoU results improves the accuracy for both the Res34+Unet and the VGG19+FPN models, by 0.652% mIoU on average which is 6%. The ensemble model combines the model performances of independent models by considering their superior inference accuracy on a per-class basis separately and improves the model performances by emphasizing the better one on a per-class basis. Full article
(This article belongs to the Section Computing and Artificial Intelligence)