Search Results (325)

Search Parameters:
Keywords = swin-transformer

19 pages, 10494 KiB  
Article
RT-DETR-Tomato: Tomato Target Detection Algorithm Based on Improved RT-DETR for Agricultural Safety Production
by Zhimin Zhao, Shuo Chen, Yuheng Ge, Penghao Yang, Yunkun Wang and Yunsheng Song
Appl. Sci. 2024, 14(14), 6287; https://doi.org/10.3390/app14146287 - 19 Jul 2024
Abstract
The detection of tomatoes is of vital importance for enhancing production efficiency, with image recognition-based tomato detection methods being the primary approach. However, these methods face challenges such as difficulty in extracting small targets, low detection accuracy, and slow processing speeds. Therefore, this paper proposes an improved RT-DETR-Tomato model for efficient tomato detection under complex environmental conditions. The model mainly consists of a Swin Transformer block, a BiFormer module, path merging, multi-scale convolutional layers, and fully connected layers. In the proposed model, Swin Transformer replaces ResNet50 as the backbone network because of its superior ability to capture broader global dependencies and contextual information. Meanwhile, a lightweight BiFormer block is adopted in the Swin Transformer to reduce computational complexity through content-aware flexible computation allocation. Experimental results show that the average accuracy of the final RT-DETR-Tomato model is greatly improved compared to the original model, and the model training time is greatly reduced, demonstrating better environmental adaptability. In the future, the RT-DETR-Tomato model can be integrated with intelligent patrol and picking robots, enabling precise identification of crops and ensuring their safety and the smooth progress of agricultural production.
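
A minimal sketch of the kind of backbone swap this abstract describes, assuming the timm library and its Swin model names (not the authors' implementation); recent timm versions can expose the multi-scale stages a DETR-style neck consumes:

```python
import timm
import torch

# Hypothetical sketch: a Swin Transformer standing in for a ResNet50 detection
# backbone. `features_only` needs a fairly recent timm; Swin feature maps come
# out channels-last (NHWC), so they are permuted before a conv/attention neck.
backbone = timm.create_model(
    "swin_tiny_patch4_window7_224",
    pretrained=False,
    features_only=True,
    out_indices=(1, 2, 3),  # stride-8/16/32 stages, what DETR-style necks expect
)

x = torch.randn(1, 3, 224, 224)          # dummy image batch
for f in backbone(x):
    print(f.permute(0, 3, 1, 2).shape)   # NHWC -> NCHW pyramid levels
```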

16 pages, 4388 KiB  
Article
CellGAN: Generative Adversarial Networks for Cellular Microscopy Image Recognition with Integrated Feature Completion Mechanism
by Xiangle Liao and Wenlong Yi
Appl. Sci. 2024, 14(14), 6266; https://doi.org/10.3390/app14146266 - 18 Jul 2024
Viewed by 136
Abstract
In response to the challenges of high noise, high adhesion, and a low signal-to-noise ratio in microscopic cell images, as well as the difficulty existing deep learning models such as UNet, ResUNet, and SwinUNet have in producing high-resolution segmentations with clear boundaries, this study proposes CellGAN, a semantic segmentation method based on a generative adversarial network with a feature completion mechanism. The method incorporates a Transformer to supplement long-range semantic information. In the self-attention module of the Transformer generator, bilinear interpolation for feature completion is introduced, reducing the computational complexity of self-attention to O(n). Additionally, two-dimensional relative positional encoding is employed in the self-attention mechanism to supplement positional information and facilitate position recovery. Experimental results demonstrate that this method outperforms ResUNet and SwinUNet in segmentation performance on the rice leaf cell, MuNuSeg, and Nucleus datasets, achieving up to 23.45% and 19.90% improvements in the Intersection over Union and Similarity metrics, respectively. The method provides an automated and efficient analytical tool for cell biology, enabling more accurate segmentation of cell images and contributing to a deeper understanding of cellular structure and function.
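
As an illustration of how bilinear-interpolation feature completion can bring self-attention down to linear cost, here is a hedged PyTorch sketch (module name and sizes are invented for the example; this is not CellGAN's code): keys and values are compressed to a fixed grid, so attention costs O(n·m) with m constant, i.e. O(n) in the number of pixels.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BilinearKVAttention(nn.Module):
    """Sketch: self-attention whose keys/values live on a fixed m x m grid
    obtained by bilinear interpolation, making the cost linear in H*W."""
    def __init__(self, dim, kv_size=16):
        super().__init__()
        self.kv_size = kv_size
        self.q = nn.Conv2d(dim, dim, 1)
        self.kv = nn.Conv2d(dim, dim * 2, 1)
        self.proj = nn.Conv2d(dim, dim, 1)

    def forward(self, x):                                   # x: (B, C, H, W)
        B, C, H, W = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)            # (B, HW, C)
        small = F.interpolate(x, size=(self.kv_size, self.kv_size),
                              mode="bilinear", align_corners=False)
        k, v = self.kv(small).chunk(2, dim=1)               # (B, C, m, m) each
        k = k.flatten(2).transpose(1, 2)                    # (B, m*m, C)
        v = v.flatten(2).transpose(1, 2)
        attn = (q @ k.transpose(1, 2)) * C ** -0.5          # (B, HW, m*m)
        out = attn.softmax(dim=-1) @ v                      # (B, HW, C)
        return self.proj(out.transpose(1, 2).reshape(B, C, H, W))

print(BilinearKVAttention(32)(torch.randn(2, 32, 64, 64)).shape)
```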

12 pages, 868 KiB  
Article
Trademark Text Recognition Combining SwinTransformer and Feature-Query Mechanisms
by Boxiu Zhou, Xiuhui Wang, Wenchao Zhou and Longwen Li
Electronics 2024, 13(14), 2814; https://doi.org/10.3390/electronics13142814 - 17 Jul 2024
Viewed by 245
Abstract
The task of trademark text recognition is a fundamental component of scene text recognition (STR), which currently faces a number of challenges, including unordered, irregular or curved text, as well as text that is distorted or rotated. In applications such as trademark infringement detection and analysis of brand effects, the diversity of artistic fonts in trademarks and the complexity of the product surfaces on which trademarks appear pose major challenges for research. To tackle these issues, this paper proposes a novel recognition framework named SwinCornerTR, which aims to enhance the accuracy and robustness of trademark text recognition. First, a novel feature-extraction network based on SwinTransformer with an enhanced feature pyramid network (EFPN) is proposed. By incorporating SwinTransformer as the backbone, efficient capture of global information in trademark images is achieved through the self-attention mechanism and the enhanced feature pyramid module, providing more accurate and expressive feature representations for subsequent text extraction. Then, during the encoding stage, a novel feature point-retrieval algorithm based on corner detection is designed: an OTSU-based fast corner detector generates the corner map efficiently and accurately, and the resulting retrieval mechanism gives priority to key-point regions, eliminating character-to-character lines and suppressing background interference. Finally, extensive experiments were conducted on two open-access benchmark datasets, SVT and CUTE80, as well as a self-constructed trademark dataset, to assess the effectiveness of the proposed method, which achieved accuracies of 92.9%, 92.3% and 84.8%, respectively, on these datasets. These results demonstrate the effectiveness and robustness of the proposed method in the analysis of trademark data.
(This article belongs to the Section Artificial Intelligence)
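
A plausible reading of the OTSU-based corner step, sketched with OpenCV (the Harris detector and file name are stand-ins; the paper's exact "fast corner detector" may differ): threshold a corner-response map with Otsu's automatically chosen threshold to obtain a binary corner map.

```python
import cv2
import numpy as np

# Assumed pipeline, not the paper's exact detector: build a corner-response
# map (Harris here as a stand-in) and binarize it with Otsu's threshold to
# obtain a corner map for key-point retrieval.
img = cv2.imread("trademark.png", cv2.IMREAD_GRAYSCALE)   # hypothetical input
resp = cv2.cornerHarris(np.float32(img), blockSize=2, ksize=3, k=0.04)

# Normalize the response to 8-bit so Otsu can pick the threshold.
resp_u8 = cv2.normalize(resp, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
_, corner_map = cv2.threshold(resp_u8, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)

ys, xs = np.nonzero(corner_map)
print(f"{len(xs)} corner pixels retained")
```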

22 pages, 80762 KiB  
Article
Super-Resolution Image Reconstruction of Wavefront Coding Imaging System Based on Deep Learning Network
by Xueyan Li, Haowen Yu, Yijian Wu, Lieshan Zhang, Di Chang, Xuhong Chu and Haoyuan Du
Electronics 2024, 13(14), 2781; https://doi.org/10.3390/electronics13142781 - 15 Jul 2024
Viewed by 292
Abstract
Wavefront Coding (WFC) is an innovative technique aimed at extending the depth of focus (DOF) of optical imaging systems. In digital imaging systems, super-resolution digital reconstruction close to the diffraction limit of the optical system has always been a hot research topic. With a point spread function (PSF) generated by a suitably designed phase mask, WFC can also be used in super-resolution image reconstruction. In this paper, we use a deep learning network combined with WFC as a general framework for image reconstruction and verify its feasibility and effectiveness. Considering blur and additive noise simultaneously, we propose three super-resolution image reconstruction procedures utilizing convolutional neural networks (CNN) based on mean square error (MSE) loss, conditional Generative Adversarial Networks (CGAN), and Swin Transformer networks (SwinIR) based on mean absolute error (MAE) loss, and verify their effectiveness by simulation experiments. A comparison of experimental results shows that the SwinIR deep residual network structure based on the MAE loss criterion generates more realistic super-resolution images with more detail. In addition, we used a WFC camera to acquire a resolution test target and real scene images for experiments. Using the resolution test target, we demonstrated that the spatial resolution could be improved from 55.6 lp/mm to 124 lp/mm by the proposed super-resolution reconstruction procedure. The reconstruction results show that the proposed deep learning network model is superior to the traditional method in reconstructing high-frequency details and effectively suppressing noise, with the resolution approaching the diffraction limit.
(This article belongs to the Section Networks)
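
A minimal sketch of the simulation setup such experiments assume: the observation is the sharp image blurred by a PSF, downsampled, and corrupted by noise, and the reconstruction network (SwinIR in the paper) is optimized with an MAE/L1 criterion. The Gaussian PSF below is a stand-in for a wavefront-coded PSF, not the paper's phase-mask design.

```python
import torch
import torch.nn.functional as F

# Sketch of the assumed degradation model: HR image blurred by a PSF,
# downsampled, plus additive Gaussian noise.
def degrade(hr, psf, scale=2, sigma=0.01):
    blurred = F.conv2d(hr, psf, padding=psf.shape[-1] // 2)
    lr = blurred[..., ::scale, ::scale]            # naive downsampling
    return lr + sigma * torch.randn_like(lr)

k = torch.arange(7, dtype=torch.float32) - 3
g = torch.exp(-(k ** 2) / 4.0)
psf = (g[:, None] * g[None, :]).reshape(1, 1, 7, 7)
psf /= psf.sum()                                   # normalized stand-in PSF

hr = torch.rand(1, 1, 64, 64)
lr = degrade(hr, psf)
sr = F.interpolate(lr, scale_factor=2, mode="bilinear", align_corners=False)
print(lr.shape, F.l1_loss(sr, hr).item())          # MAE criterion on the output
```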

14 pages, 3515 KiB  
Article
Swin-FER: Swin Transformer for Facial Expression Recognition
by Mei Bie, Huan Xu, Yan Gao, Kai Song and Xiangjiu Che
Appl. Sci. 2024, 14(14), 6125; https://doi.org/10.3390/app14146125 - 14 Jul 2024
Viewed by 192
Abstract
The ability of transformers to capture global context information is highly beneficial for recognizing subtle differences in facial expressions. However, compared to convolutional neural networks, transformers must compute dependencies between each element and all other elements, leading to high computational complexity. Additionally, their large number of parameters requires extensive training data to avoid overfitting. In this paper, according to the characteristics of facial expression recognition tasks, we make targeted improvements to the Swin Transformer network. The proposed Swin-FER network adopts a fusion strategy from the middle layer to deeper layers and employs a data dimension conversion method so that the network perceives more spatial information. Furthermore, we integrate a mean module, a split module, and a group convolution strategy to effectively control the number of parameters. Swin-FER achieves an accuracy of 71.11% on Fer2013, an in-the-wild dataset, and 100% on CK+, an in-the-lab dataset.
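
One of the parameter-control tricks this abstract mentions is easy to quantify: a grouped 3x3 convolution holds roughly 1/groups of the weights of its dense counterpart. A quick check (channel sizes are illustrative, not Swin-FER's):

```python
import torch.nn as nn

# Parameter count of a dense 3x3 conv vs. the same conv split into 8 groups.
def n_params(m):
    return sum(p.numel() for p in m.parameters())

dense = nn.Conv2d(96, 96, kernel_size=3, padding=1)
grouped = nn.Conv2d(96, 96, kernel_size=3, padding=1, groups=8)
print(n_params(dense), n_params(grouped))  # 83040 vs 10464, about 1/8 the weights
```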

20 pages, 78594 KiB  
Article
Underwater Side-Scan Sonar Target Detection: YOLOv7 Model Combined with Attention Mechanism and Scaling Factor
by Xin Wen, Jian Wang, Chensheng Cheng, Feihu Zhang and Guang Pan
Remote Sens. 2024, 16(13), 2492; https://doi.org/10.3390/rs16132492 - 8 Jul 2024
Viewed by 329
Abstract
Side-scan sonar plays a crucial role in underwater exploration, and autonomous detection in side-scan sonar images is vital for surveying unknown underwater environments. However, the complexity of the underwater environment, the presence of only a few highlighted areas on targets, blurred feature details, and the difficulty of collecting side-scan sonar data make high-precision autonomous target recognition in side-scan sonar images challenging. This article addresses this problem by improving the You Only Look Once v7 (YOLOv7) model to achieve high-precision object detection in side-scan sonar images. First, given that side-scan sonar images contain large areas of irrelevant information, this paper introduces the Swin Transformer for dynamic attention and global modeling, which enhances the model's focus on target regions. Second, the Convolutional Block Attention Module (CBAM) is utilized to further improve feature representation and enhance the model's accuracy. Lastly, to address the uncertainty of geometric features in side-scan sonar targets, this paper innovatively incorporates a feature scaling factor into the YOLOv7 model. The experiments first verified the necessity of the attention mechanisms on a public dataset. Subsequent experiments on our side-scan sonar (SSS) image dataset show that the improved YOLOv7 model achieves 87.9% mAP0.5 and 49.23% mAP0.5:0.95, exceeding the baseline YOLOv7 model by 9.28% and 8.41%, respectively. The improved YOLOv7 algorithm proposed in this paper has great potential for object detection and recognition in side-scan sonar images.
(This article belongs to the Special Issue Advancement in Undersea Remote Sensing II)
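
The abstract does not spell out the feature scaling factor, so the sketch below shows one common form only as an assumption: a learnable per-channel scale applied to a neck feature map before the detection head.

```python
import torch
import torch.nn as nn

class FeatureScale(nn.Module):
    """Hypothetical form of a feature scaling factor: a learnable per-channel
    scale applied to a feature map before detection. The paper's exact
    formulation is not given here."""
    def __init__(self, channels, init=1.0):
        super().__init__()
        self.gamma = nn.Parameter(init * torch.ones(1, channels, 1, 1))

    def forward(self, x):
        return self.gamma * x

feat = torch.randn(1, 256, 40, 40)   # a neck feature map
print(FeatureScale(256)(feat).shape)
```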

21 pages, 9235 KiB  
Article
Feature-Enhanced Attention and Dual-GELAN Net (FEADG-Net) for UAV Infrared Small Object Detection in Traffic Surveillance
by Tuerniyazi Aibibu, Jinhui Lan, Yiliang Zeng, Weijian Lu and Naiwei Gu
Drones 2024, 8(7), 304; https://doi.org/10.3390/drones8070304 - 8 Jul 2024
Viewed by 364
Abstract
With the rapid development of UAV and infrared imaging technology, the cost of UAV infrared imaging has decreased steadily. Small target detection in aerial infrared images has great application potential in many fields, especially traffic surveillance. Because infrared images offer low contrast and relatively limited feature information compared to visible images, small road target detection in infrared aerial images is all the more difficult. To solve this problem, this study proposes a feature-enhanced attention and dual-GELAN net (FEADG-net) model. In this network, the reliability and effectiveness of small target feature extraction are enhanced by a backbone combining low-frequency enhancement with a Swin Transformer. The multi-scale features of the target are fused using a dual-GELAN neck structure, and a detection head with an auto-adjusted Inner-IoU is constructed to improve the detection accuracy for small infrared targets. The viability of the method was demonstrated on the HIT-UAV and IRTS-AG datasets. In a comparative experiment, the mAP50 of FEADG-net exceeded 90 percent, higher than that of previous methods, while meeting real-time requirements. Finally, an ablation experiment demonstrated that all three of the proposed modules contributed to the improvement in detection accuracy. This study not only designs a new algorithm for small road object detection in infrared remote sensing images from UAVs but also provides new ideas for small target detection in remote sensing images in other fields.
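
For readers unfamiliar with Inner-IoU, the sketch below shows the basic computation as described in the Inner-IoU literature: the IoU is taken over auxiliary boxes that keep the originals' centers but shrink width and height by a ratio. The paper's automatic adjustment of that ratio is not reproduced here.

```python
import torch

def inner_iou(box1, box2, ratio=0.8, eps=1e-7):
    """Basic Inner-IoU sketch: IoU over auxiliary boxes sharing the originals'
    centers but scaled by `ratio`. Boxes are (cx, cy, w, h)."""
    (cx1, cy1, w1, h1), (cx2, cy2, w2, h2) = box1.unbind(-1), box2.unbind(-1)
    w1, h1, w2, h2 = w1 * ratio, h1 * ratio, w2 * ratio, h2 * ratio
    inter_w = (torch.min(cx1 + w1 / 2, cx2 + w2 / 2)
               - torch.max(cx1 - w1 / 2, cx2 - w2 / 2)).clamp(min=0)
    inter_h = (torch.min(cy1 + h1 / 2, cy2 + h2 / 2)
               - torch.max(cy1 - h1 / 2, cy2 - h2 / 2)).clamp(min=0)
    inter = inter_w * inter_h
    return inter / (w1 * h1 + w2 * h2 - inter + eps)

a = torch.tensor([[50.0, 50.0, 20.0, 20.0]])
b = torch.tensor([[55.0, 52.0, 18.0, 22.0]])
print(inner_iou(a, b))
```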

17 pages, 9234 KiB  
Article
Algorithm for Corn Crop Row Recognition during Different Growth Stages Based on ST-YOLOv8s Network
by Zhihua Diao, Shushuai Ma, Dongyan Zhang, Jingcheng Zhang, Peiliang Guo, Zhendong He, Suna Zhao and Baohua Zhang
Agronomy 2024, 14(7), 1466; https://doi.org/10.3390/agronomy14071466 - 6 Jul 2024
Viewed by 397
Abstract
Corn crop row recognition during different growth stages is a major difficulty facing the current development of visual navigation technology for agricultural robots. To solve this problem, an algorithm for recognizing corn crop rows during different growth stages is presented based on the ST-YOLOv8s network. First, a dataset of corn crop rows during different growth stages, including the seedling stage and mid-growth stage, is constructed. Second, an improved YOLOv8s network, in which the backbone is replaced by the Swin Transformer (ST), is proposed for detecting corn crop row segments. Next, an improved supergreen method is introduced, and segmentation of crop rows and background within the detection frame is achieved using the enhanced method. Finally, the corn crop row lines are identified using the proposed local–global detection method, which detects the local crop rows first and then the global crop rows. Detection experiments on corn crop row segments show that the mean average precision (mAP) of the ST-YOLOv8s network during different growth stages increases by 7.34%, 11.92%, and 4.03% on average compared to the YOLOv5s, YOLOv7, and YOLOv8s networks, respectively, indicating better crop row segment detection. Crop row line detection experiments show that, compared with the comparison method, the accuracy of the local–global detection method improves by 17.38%, 10.47%, and 5.99%, respectively; the average angle error is reduced by 3.78°, 1.61°, and 0.7°, respectively; and the average fitting time is reduced by 5.30 ms, 18 ms, and 33.77 ms, respectively, indicating better crop row line detection. In summary, the proposed algorithm accomplishes the task of corn crop row recognition during different growth stages and contributes to the development of crop row detection technology.
(This article belongs to the Special Issue AI, Sensors and Robotics for Smart Agriculture—2nd Edition)
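
The supergreen step is presumably a variant of the classic excess-green (ExG) index; as an assumed, simplified illustration (not the paper's enhanced method), crop pixels inside a detection box can be separated from soil and a row line fitted:

```python
import cv2
import numpy as np

# Assumed "supergreen" segmentation: excess-green index ExG = 2G - R - B,
# Otsu thresholding, then a line fit on the crop pixels.
bgr = cv2.imread("corn_row_patch.png").astype(np.float32)  # hypothetical box crop
b, g, r = cv2.split(bgr)
exg = 2 * g - r - b
exg_u8 = cv2.normalize(exg, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
_, crop_mask = cv2.threshold(exg_u8, 0, 255,
                             cv2.THRESH_BINARY + cv2.THRESH_OTSU)

ys, xs = np.nonzero(crop_mask)
vx, vy, x0, y0 = cv2.fitLine(np.column_stack([xs, ys]).astype(np.float32),
                             cv2.DIST_L2, 0, 0.01, 0.01).ravel()
print(f"row direction ({vx:.2f}, {vy:.2f}) through ({x0:.1f}, {y0:.1f})")
```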

8 pages, 1586 KiB  
Article
Automated Laryngeal Invasion Detector of Boluses in Videofluoroscopic Swallowing Study Videos Using Action Recognition-Based Networks
by Kihwan Nam, Changyeol Lee, Taeheon Lee, Munseop Shin, Bo Hae Kim and Jin-Woo Park
Diagnostics 2024, 14(13), 1444; https://doi.org/10.3390/diagnostics14131444 - 6 Jul 2024
Viewed by 362
Abstract
We aimed to develop an automated detector that determines laryngeal invasion during swallowing. Laryngeal invasion, which causes significant clinical problems, is defined as a score of two or more points on the penetration–aspiration scale (PAS). To detect laryngeal invasion (PAS 2 or higher) in videofluoroscopic swallowing study (VFSS) videos, we applied two three-dimensional (3D) stream networks for action recognition. To establish the robustness of our model, we compared its performance with that of various current image classification-based architectures. The proposed model achieved an accuracy of 92.10%; precision, recall, and F1 scores for detecting laryngeal invasion (≥PAS 2) in VFSS videos were 0.9470 each. Its accuracy in identifying laryngeal invasion surpassed that of updated image classification models (60.58% for ResNet101, 60.19% for Swin Transformer, 63.33% for EfficientNet-B2, and 31.17% for HRNet-W32). Our model is the first automated detector of laryngeal invasion in VFSS videos based on video action recognition networks. Considering its high and balanced performance, it may serve as an effective screening tool before clinicians review VFSS videos, ultimately reducing their burden.
(This article belongs to the Special Issue Advances in Diagnosis and Treatment in Otolaryngology)
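
As a hedged illustration of the action-recognition side (the paper's two-stream architecture and fusion are not specified here), a single 3D-CNN stream from torchvision classifying a VFSS clip might look like:

```python
import torch
from torchvision.models.video import r3d_18

# Illustrative single 3D-CNN stream classifying a VFSS clip as laryngeal
# invasion vs. none; not the authors' two-stream model.
model = r3d_18(weights=None, num_classes=2)   # binary: PAS >= 2 or not
clip = torch.randn(1, 3, 16, 112, 112)        # (batch, channels, frames, H, W)
print(model(clip).softmax(dim=-1))            # class probabilities
```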

14 pages, 7750 KiB  
Article
Bearing Health State Detection Based on Informer and CNN + Swin Transformer
by Chunyang Liu, Weiwei Zou, Zhilei Hu, Hongyu Li, Xin Sui, Xiqiang Ma, Fang Yang and Nan Guo
Machines 2024, 12(7), 456; https://doi.org/10.3390/machines12070456 - 4 Jul 2024
Viewed by 348
Abstract
In response to the challenge of timely fault identification in the spindle bearings of machine tools operating in complex environments, this study proposes a method that combines infrared imaging with an Informer and a CNN + Swin Transformer. The aim is to achieve real-time monitoring of bearing faults, precise fault localization, and classification of fault severity. To accomplish this, an angular contact ball bearing was chosen as the research subject. Initially, an infrared image dataset encompassing various fault positions and degrees was constructed by simulating different forms of bearing faults. Subsequently, an Informer-based bearing temperature prediction model was established to select faulty bearing data. Lastly, the faulty data were input into the CNN + Swin Transformer model for bearing fault recognition and classification. The results demonstrate that the Informer model accurately identifies abnormal temperature rises during bearing operation, effectively screening out faulty bearings. Under steady-state conditions, the model achieves a classification accuracy of 97.8%. Furthermore, after the Informer screening process, the proposed model exhibits a recognition precision of 98.9%, surpassing the other models considered, such as CNN, SVM, and Swin Transformer.
(This article belongs to the Section Machines Testing and Maintenance)
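
The Informer screening stage boils down to residual-based anomaly detection on predicted versus measured temperature; a toy sketch with a stand-in prediction (the trained Informer is assumed, and the threshold is invented):

```python
import numpy as np

# Toy residual-based screening: flag a bearing when measured temperature
# drifts beyond a threshold from what a healthy-bearing forecaster predicts.
def screen_faulty(measured, predicted, threshold=2.0):
    return np.abs(measured - predicted) > threshold

t = np.arange(200, dtype=float)
measured = 40 + 0.01 * t + np.random.normal(0, 0.3, t.size)
measured[150:] += 0.15 * (t[150:] - 150)   # simulated abnormal temperature rise
predicted = 40 + 0.01 * t                  # expected healthy trend
flags = screen_faulty(measured, predicted)
print("first flagged sample:", int(np.argmax(flags)) if flags.any() else None)
```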

19 pages, 4649 KiB  
Article
SIMCB-Yolo: An Efficient Multi-Scale Network for Detecting Forest Fire Smoke
by Wanhong Yang, Zhenlin Yang, Meiyun Wu, Gui Zhang, Yinfang Zhu and Yurong Sun
Forests 2024, 15(7), 1137; https://doi.org/10.3390/f15071137 - 29 Jun 2024
Viewed by 442
Abstract
Forest fire monitoring plays a crucial role in preventing and mitigating forest disasters. Early detection of forest fire smoke is essential for a timely response to forest fire emergencies. The key to effective forest fire monitoring lies in accounting for the various levels of forest fire smoke targets in the monitoring images, enhancing the model's resistance to interference from mountain clouds and fog, and reducing false positives and missed detections. In this paper, we propose SIMCB-Yolo, an improved multi-level forest fire smoke detection model based on You Only Look Once v5s (Yolov5s), which aims to achieve high-precision detection of forest fire smoke at various levels. First, to address low precision in detecting small target smoke, a Swin Transformer small target detection head is added to the neck of Yolov5s, enhancing the precision of small target smoke detection. Then, to address missed detections caused by the decline in conventional target smoke detection accuracy after improving small target smoke detection, we introduce a cross stage partial network bottleneck with three convolutional layers (C3) and a channel block sequence (CBS) into the trunk. These additions help extract more surface features and enhance the detection accuracy of conventional target smoke. Finally, the SimAM attention mechanism is introduced to address complex background interference in forest fire smoke detection, further reducing false positives and missed detections. Experimental results demonstrate that, compared to the Yolov5s model, the SIMCB-Yolo model achieves an average recognition accuracy (mAP50) of 85.6%, an increase of 4.5%, and an mAP50-95 of 63.6%, an improvement of 6.9%, indicating good detection accuracy. The performance of SIMCB-Yolo on the self-built forest fire smoke dataset is also significantly better than that of current mainstream models, demonstrating high practical value.
(This article belongs to the Special Issue Forest Fires Prediction and Detection—2nd Edition)
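
SimAM is a published, parameter-free attention module; the sketch below follows its commonly used implementation (whether SIMCB-Yolo uses exactly this variant is an assumption):

```python
import torch
import torch.nn as nn

class SimAM(nn.Module):
    """Parameter-free SimAM attention as commonly implemented (Yang et al.,
    ICML 2021); assumed here to match the paper's usage."""
    def __init__(self, e_lambda=1e-4):
        super().__init__()
        self.e_lambda = e_lambda

    def forward(self, x):                            # x: (B, C, H, W)
        n = x.shape[2] * x.shape[3] - 1
        d = (x - x.mean(dim=(2, 3), keepdim=True)).pow(2)
        v = d.sum(dim=(2, 3), keepdim=True) / n
        e_inv = d / (4 * (v + self.e_lambda)) + 0.5  # inverse energy per neuron
        return x * torch.sigmoid(e_inv)

print(SimAM()(torch.randn(1, 64, 32, 32)).shape)
```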

13 pages, 2264 KiB  
Article
Spatial Small Target Detection Method Based on Multi-Scale Feature Fusion Pyramid
by Xiaojuan Wang, Yuepeng Liu, Haitao Xu and Changbin Xue
Appl. Sci. 2024, 14(13), 5673; https://doi.org/10.3390/app14135673 - 28 Jun 2024
Viewed by 328
Abstract
Small target detection has become an important part of space exploration missions. Weak illumination and interference from the star-chart background in deep space pose great challenges to space target detection. In addition, space targets are usually distant, so most appear as small targets in the image, which makes detection even more difficult. To solve these problems, we propose a multi-scale feature fusion pyramid network. First, we propose the CST module, a CNN fused with a Swin Transformer, as the feature extraction module of the feature pyramid network to enhance the extraction of target features. Then, we improve the SE attention mechanism and construct the CSE module to find the attention region in the dense star map background. Finally, we introduce improved spatial pyramid pooling to fuse more features, increasing the receptive field to obtain multi-scale object information and improve detection performance for small targets. We provide two versions and conducted a detailed ablation study to empirically validate the effectiveness and efficiency of each component in our network architecture. The experimental results show that our network improves on the performance of the existing feature pyramid.
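
For context on the spatial pyramid pooling being improved, here is the standard YOLOv5-style SPPF block (a generic example, not the paper's improved variant): stacked max-pools enlarge the receptive field and the pooled features are fused by concatenation.

```python
import torch
import torch.nn as nn

class SPPF(nn.Module):
    """Standard fast spatial pyramid pooling: three chained 5x5 max-pools act
    like 5/9/13 pools before fusion. Shown for context only."""
    def __init__(self, c_in, c_out, k=5):
        super().__init__()
        c_mid = c_in // 2
        self.cv1 = nn.Conv2d(c_in, c_mid, 1)
        self.cv2 = nn.Conv2d(c_mid * 4, c_out, 1)
        self.pool = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)

    def forward(self, x):
        x = self.cv1(x)
        y1 = self.pool(x)
        y2 = self.pool(y1)
        y3 = self.pool(y2)
        return self.cv2(torch.cat([x, y1, y2, y3], dim=1))

print(SPPF(256, 256)(torch.randn(1, 256, 20, 20)).shape)
```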

18 pages, 6430 KiB  
Article
A Measurement Method for Body Parameters of Mongolian Horses Based on Deep Learning and Machine Vision
by Lide Su, Minghuang Li, Yong Zhang and Zheying Zong
Appl. Sci. 2024, 14(13), 5655; https://doi.org/10.3390/app14135655 - 28 Jun 2024
Viewed by 351
Abstract
Traditional manual methods for measuring Mongolian horse body parameters are not very safe, have low levels of automation, and cannot effectively ensure animal welfare. This research proposes a method for extracting the body parameters of a target Mongolian horse based on deep learning and machine vision technology. First, Swin Transformer is used as the backbone feature extraction network of the Mask R-CNN model, and a CNN-based differentiated feature clustering model is added to minimize the loss of similarity and spatial continuity between pixels, improving the robustness of the model while reducing erroneous pixels and refining the rough mask boundary output. Second, an improved Harris algorithm and a polynomial fitting method based on contour curves are applied to determine the positions of the various measurement points on the horse mask and calculate the body parameters. The accuracy of the proposed method was tested on 20 Mongolian horses. The experimental results show that, compared with the original Mask R-CNN network, the pixel accuracy (PA) and mean intersection over union (MIoU) of the optimized model increased from 91.46% and 84.72% to 98.72% and 95.36%, respectively. The average relative errors of shoulder height, withers height, chest depth, body length, croup height, shoulder angle, and croup angle were 4.01%, 2.98%, 4.86%, 2.97%, 3.06%, 4.91%, and 5.21%, respectively. These results can provide technical support for assessing body parameters related to the performance of horses under natural conditions, which is of great significance for improving the refinement and welfare of Mongolian horse breeding.
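
The polynomial-fitting step can be pictured as fitting a low-order curve to contour points along the back line and reading a measurement point off the fitted curve; the synthetic contour and the "vertex as measurement point" logic below are purely illustrative, not the paper's procedure.

```python
import numpy as np

# Illustrative only: fit a parabola to synthetic back-line contour points and
# take its vertex as the estimated measurement point.
contour = np.array([[x, 0.002 * (x - 120) ** 2 + 80 + np.random.normal(0, 0.5)]
                    for x in range(40, 200)])
a, b, c = np.polyfit(contour[:, 0], contour[:, 1], deg=2)
x_min = -b / (2 * a)                         # vertex of the fitted parabola
print(f"estimated point: x={x_min:.1f}, y={np.polyval([a, b, c], x_min):.1f}")
```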

17 pages, 6171 KiB  
Article
Detection and Multi-Class Classification of Invasive Knotweeds with Drones and Deep Learning Models
by Sruthi Keerthi Valicharla, Roghaiyeh Karimzadeh, Kushal Naharki, Xin Li and Yong-Lak Park
Drones 2024, 8(7), 293; https://doi.org/10.3390/drones8070293 - 28 Jun 2024
Viewed by 479
Abstract
Invasive knotweeds are rhizomatous, herbaceous perennial plants that pose significant ecological threats due to their aggressive growth and ability to outcompete native plants. Although detecting and identifying knotweeds is crucial for effective management, current ground-based survey methods are labor-intensive and limited in their coverage of large and hard-to-access areas. This study was conducted to determine the optimum drone flight height for aerial detection of knotweeds at different phenological stages and to develop automated detection of knotweeds in aerial images using the state-of-the-art Swin Transformer. The study found that, at the vegetative stage, Japanese knotweed and giant knotweed were detectable at ≤35 m and ≤25 m above the canopy, respectively, using an RGB sensor. The flowers of the knotweeds were detectable at ≤20 m. Thermal and multispectral sensors were not able to detect any knotweed species. The Swin Transformer achieved higher precision, recall, and accuracy in knotweed detection on aerial images acquired with drones and RGB sensors than conventional convolutional neural networks (CNNs). This study demonstrates the use of drones, sensors, and deep learning in revolutionizing invasive knotweed detection.
(This article belongs to the Section Drones in Agriculture and Forestry)
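
A hedged sketch of the classification side: fine-tuning a pretrained Swin Transformer on aerial image tiles. The class set, model size, and timm usage are assumptions, not the paper's configuration.

```python
import timm
import torch

# Assumed setup: a three-class scheme (Japanese knotweed, giant knotweed,
# background) on 224x224 tiles; the paper's actual pipeline may differ.
model = timm.create_model("swin_tiny_patch4_window7_224",
                          pretrained=True, num_classes=3)
images = torch.randn(4, 3, 224, 224)   # a batch of aerial image tiles
labels = torch.tensor([0, 1, 2, 0])
loss = torch.nn.functional.cross_entropy(model(images), labels)
loss.backward()                        # one fine-tuning step (optimizer omitted)
print(loss.item())
```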

16 pages, 5181 KiB  
Article
A Novel Dual-Component Radar-Signal Modulation Recognition Method Based on CNN-ST
by Chenxia Wan and Qinghui Zhang
Appl. Sci. 2024, 14(13), 5499; https://doi.org/10.3390/app14135499 - 25 Jun 2024
Viewed by 258
Abstract
Dual-component radar-signal modulation recognition is a challenging yet significant technique for electronic reconnaissance systems. To address the low recognition performance and high computational costs of conventional methods, this paper presents a recognition method for randomly overlapping dual-component radar signals based on a convolutional neural network–Swin Transformer (CNN-ST) under different signal-to-noise ratios (SNRs). To enhance the feature representation ability and decrease the loss of detailed features of dual-component radar signals under different SNRs, the Swin Transformer is adopted and integrated into the designed CNN model. An inverted residual structure and lightweight depthwise convolutions are used to maintain powerful representational ability. The results show that the dual-component radar-signal recognition accuracy of the proposed CNN-ST reaches 82.58% at −8 dB, demonstrating better recognition performance than the other methods considered. The recognition accuracies under different SNRs are all more than 88%, verifying that the CNN-ST achieves better recognition accuracy across SNRs. This work offers essential guidance for enhancing dual-component radar signal recognition under different SNRs and promoting practical applications.
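
The inverted residual structure with depthwise convolutions that this abstract mentions is MobileNetV2-style; a generic block of that kind is sketched below (channel sizes assumed, not the paper's CNN-ST configuration).

```python
import torch
import torch.nn as nn

class InvertedResidual(nn.Module):
    """Generic MobileNetV2-style inverted residual with a lightweight
    depthwise convolution; the actual CNN-ST block is assumed, not copied."""
    def __init__(self, c, expand=4):
        super().__init__()
        hidden = c * expand
        self.block = nn.Sequential(
            nn.Conv2d(c, hidden, 1, bias=False), nn.BatchNorm2d(hidden), nn.ReLU6(),
            # depthwise 3x3: one filter per channel keeps the cost low
            nn.Conv2d(hidden, hidden, 3, padding=1, groups=hidden, bias=False),
            nn.BatchNorm2d(hidden), nn.ReLU6(),
            nn.Conv2d(hidden, c, 1, bias=False), nn.BatchNorm2d(c),
        )

    def forward(self, x):
        return x + self.block(x)   # residual connection over the narrow ends

x = torch.randn(1, 32, 64, 64)     # e.g. a time-frequency image of the signal
print(InvertedResidual(32)(x).shape)
```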
