Search Results (5,902)

Search Parameters:
Keywords = feature fusion

18 pages, 5207 KiB  
Article
MAPPNet: A Multi-Scale Attention Pyramid Pooling Network for Dental Calculus Segmentation
by Tianyu Nie, Shihong Yao, Di Wang, Conger Wang and Yishi Zhao
Appl. Sci. 2024, 14(16), 7273; https://doi.org/10.3390/app14167273 (registering DOI) - 19 Aug 2024
Viewed by 61
Abstract
Dental diseases are among the most prevalent diseases globally, and accurate segmentation of dental calculus images plays a crucial role in periodontal disease diagnosis and treatment planning. However, the current methods are not stable and reliable enough due to the variable morphology of dental calculus and the blurring of the boundaries between the dental edges and the surrounding tissues; therefore, our hope is to propose an accurate and reliable calculus segmentation algorithm to improve the efficiency of clinical detection. We propose a multi-scale attention pyramid pooling network (MAPPNet) to enhance the performance of dental calculus segmentation. The network incorporates a multi-scale fusion strategy in both the encoder and decoder, forming a model with a dual-ended multi-scale structure. This design, in contrast to employing a multi-scale fusion scheme at a single end, enables more effective capturing of features from diverse scales. Furthermore, the attention pyramid pooling module (APPM) reconstructs the features on this map by leveraging a spatial-first and channel-second attention mechanism. APPM enables the network to adaptively adjust the weights of different locations and channels in the feature map, thereby enhancing the perception of important regions and key features. Experimental evaluation of our collected dental calculus segmentation dataset demonstrates the superior performance of MAPPNet, which achieves an intersection-over-union of 81.46% and an accuracy rate of 98.35%. Additionally, on two publicly available datasets, ISIC2018 (skin lesion dataset) and Kvasir-SEG (gastrointestinal polyp segmentation dataset), MAPPNet achieved an intersection-over-union of 76.48% and 91.38%, respectively. These results validate the effectiveness of our proposed network in accurately segmenting lesion regions and achieving high accuracy rates, surpassing many existing segmentation methods. Full article
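
The "spatial-first and channel-second" attention ordering described for the APPM can be sketched in a few lines of numpy. The pooling-based sigmoid weights below are illustrative stand-ins for the module's learned convolutions; the function name and exact weighting scheme are assumptions, not taken from the paper.

```python
import numpy as np

def spatial_then_channel_attention(x):
    """Apply spatial attention first, then channel attention, to a feature
    map x of shape (C, H, W). Weights come from simple pooling statistics;
    the real APPM derives them with learned convolutions."""
    # Spatial attention: sigmoid over the channel-mean map, one weight per pixel.
    spatial = 1.0 / (1.0 + np.exp(-x.mean(axis=0)))        # (H, W)
    x = x * spatial[None, :, :]
    # Channel attention: sigmoid over global-average-pooled channel descriptors,
    # computed from the already spatially re-weighted map.
    channel = 1.0 / (1.0 + np.exp(-x.mean(axis=(1, 2))))   # (C,)
    return x * channel[:, None, None]

feat = np.random.default_rng(0).normal(size=(8, 16, 16))
out = spatial_then_channel_attention(feat)
```

The ordering matters: the channel descriptor is pooled from the spatially re-weighted map, which is what a spatial-first, channel-second design implies.
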

19 pages, 19884 KiB  
Article
A Novel Transformer Network Based on Cross–Spatial Learning and Deformable Attention for Composite Fault Diagnosis of Agricultural Machinery Bearings
by Xuemei Li, Min Li, Bin Liu, Shangsong Lv and Chengjie Liu
Agriculture 2024, 14(8), 1397; https://doi.org/10.3390/agriculture14081397 (registering DOI) - 18 Aug 2024
Viewed by 331
Abstract
Diagnosing agricultural machinery faults is critical to agricultural automation, and identifying vibration signals from faulty bearings is important for agricultural machinery fault diagnosis and predictive maintenance. In recent years, data–driven methods based on deep learning have received much attention. Considering the roughness of the attention receptive fields in Vision Transformer and Swin Transformer, this paper proposes a Shift–Deformable Transformer (S–DT) network model with multi–attention fusion to achieve accurate diagnosis of composite faults. In this method, the vibration signal is first transformed into a time–frequency graph representation through continuous wavelet transform (CWT); secondly, dilated convolutional residual blocks and efficient attention for cross–spatial learning are used for low–level local feature enhancement. Then, the shift window and deformable attention are fused into S–D Attention, which has a more focused receptive field to learn global features accurately. Finally, the diagnosis result is obtained through the classifier. Experiments were conducted on self–collected datasets and public datasets. The results show that the proposed S–DT network performs excellently in all cases. With a slight decrease in the number of parameters, the validation accuracy improves by more than 2%, and the training network has a fast convergence period. This provides an effective solution for monitoring the efficient and stable operation of agricultural automation machinery and equipment. Full article
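
The first stage of this pipeline, converting a vibration signal into a time-frequency image via CWT, can be illustrated with a minimal numpy Morlet-wavelet transform. The wavelet choice, the center frequency w0 = 6, and the probed frequencies are illustrative assumptions, not parameters from the paper.

```python
import numpy as np

def morlet_cwt(signal, freqs, fs, w0=6.0):
    """Time-frequency magnitude map from a complex Morlet CWT:
    one row per probed frequency, one column per sample."""
    n = len(signal)
    t = (np.arange(n) - n / 2) / fs
    tfr = np.empty((len(freqs), n))
    for i, f in enumerate(freqs):
        s = w0 / (2 * np.pi * f)  # scale matched to frequency f
        psi = np.exp(1j * w0 * t / s - 0.5 * (t / s) ** 2) / np.sqrt(s)
        tfr[i] = np.abs(np.convolve(signal, psi, mode="same"))
    return tfr

fs = 1000.0
t = np.arange(0, 1, 1 / fs)
sig = np.sin(2 * np.pi * 50 * t)  # 50 Hz tone as a stand-in for a fault band
tfr = morlet_cwt(sig, freqs=[25.0, 50.0, 100.0], fs=fs)
```

The row matched to the tone's frequency dominates the map, which is the kind of structure the downstream transformer then classifies.
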
(This article belongs to the Section Digital Agriculture)

23 pages, 2501 KiB  
Article
MsFNet: Multi-Scale Fusion Network Based on Dynamic Spectral Features for Multi-Temporal Hyperspectral Image Change Detection
by Yining Feng, Weihan Ni, Liyang Song and Xianghai Wang
Remote Sens. 2024, 16(16), 3037; https://doi.org/10.3390/rs16163037 (registering DOI) - 18 Aug 2024
Viewed by 332
Abstract
With the development of satellite technology, the importance of multi-temporal remote sensing (RS) image change detection (CD) in urban planning, environmental monitoring, and other fields is increasingly prominent. Deep learning techniques enable a profound exploration of the intrinsic features within hyperspectral (HS) data, leading to substantial enhancements in CD accuracy while addressing several challenges posed by traditional methodologies. However, existing convolutional neural network (CNN)-based CD approaches frequently encounter issues during the feature extraction process, such as the loss of detailed information due to downsampling, which hampers a model’s ability to accurately capture complex spectral features. Additionally, these methods often neglect the integration of multi-scale information, resulting in suboptimal local feature extraction and, consequently, diminished model performance. To address these limitations, we propose a multi-scale fusion network (MsFNet) which leverages dynamic spectral features for effective multi-temporal HS-CD. Our approach incorporates a dynamic convolution module with spectral attention, which adaptively modulates the receptive field size according to the spectral characteristics of different bands. This flexibility enhances the model’s capacity to focus on critical bands, thereby improving its ability to identify and differentiate changes across spectral dimensions. Furthermore, we develop a multi-scale feature fusion module which extracts and integrates features from deep feature maps, enriching local information and augmenting the model’s sensitivity to local variations. Experimental evaluations conducted on three real-world HS-CD datasets demonstrate that the proposed MsFNet significantly outperforms contemporary advanced CD methods in terms of both efficacy and performance. Full article
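
The idea of adaptively modulating the receptive field per spectral band can be sketched as parallel branches with different kernel sizes fused under a softmax gate. In MsFNet the gate comes from a learned spectral-attention branch; here it is a free input, and the smoothing-kernel branches are illustrative stand-ins for learned convolutions.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

def dynamic_multiscale_fuse(spectrum, kernel_sizes=(3, 7, 15), gate_logits=None):
    """Smooth a 1-D spectrum at several receptive-field sizes and fuse the
    branches with softmax weights (a stand-in for spectral-attention gating)."""
    branches = [np.convolve(spectrum, np.ones(k) / k, mode="same")
                for k in kernel_sizes]
    if gate_logits is None:
        gate_logits = np.zeros(len(kernel_sizes))  # uniform gate by default
    w = softmax(np.asarray(gate_logits, dtype=float))
    return sum(wi * b for wi, b in zip(w, branches))

spec = np.random.default_rng(1).normal(size=64)
fused = dynamic_multiscale_fuse(spec, gate_logits=[2.0, 0.0, -2.0])
```

Pushing the gate toward one branch recovers that branch's receptive field, which is the "adaptive receptive field size" behavior in miniature.
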
(This article belongs to the Special Issue Recent Advances in the Processing of Hyperspectral Images)

27 pages, 3359 KiB  
Article
A Multi-Scale Mask Convolution-Based Blind-Spot Network for Hyperspectral Anomaly Detection
by Zhiwei Yang, Rui Zhao, Xiangchao Meng, Gang Yang, Weiwei Sun, Shenfu Zhang and Jinghui Li
Remote Sens. 2024, 16(16), 3036; https://doi.org/10.3390/rs16163036 (registering DOI) - 18 Aug 2024
Viewed by 324
Abstract
Existing methods of hyperspectral anomaly detection (HAD) still face several challenges: (1) Due to the limitations of self-supervision, avoiding the identity mapping of anomalies remains difficult; (2) the ineffective interaction between spatial and spectral features leads to the insufficient utilization of spatial information; and (3) current methods are not adaptable to the detection of multi-scale anomaly targets. To address these challenges, we propose a blind-spot network based on multi-scale blind-spot convolution for HAD. The multi-scale mask convolution module is employed to adapt to diverse scales of anomaly targets, while the dynamic fusion module is introduced to integrate the advantages of mask convolutions at different scales. The proposed approach includes a spatial–spectral joint module and a background feature attention mechanism to enhance the interaction between spatial and spectral features, with a specific emphasis on highlighting the significance of background features within the network. Furthermore, we propose a preprocessing technique that combines pixel shuffle down-sampling (PD) with spatial–spectral joint screening. This approach addresses anomalous identity mapping and enables finite-scale mask convolution for better detection of targets at various scales. The proposed approach was assessed on four real hyperspectral datasets comprising anomaly targets of different scales. The experimental results demonstrate the effectiveness and superior performance of the proposed methodology compared with nine state-of-the-art methods. Full article
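
Pixel shuffle down-sampling (PD), the preprocessing step mentioned above, splits an image into phase-shifted sub-images by strided sampling, which breaks up the spatial footprint of anomalies before blind-spot convolution. A minimal sketch of the operation and its inverse, on a single-band image for simplicity:

```python
import numpy as np

def pixel_shuffle_downsample(img, s=2):
    """Split an (H, W) image into s*s sub-images by taking every s-th pixel
    at each of the s*s phase offsets. Returns shape (s*s, H//s, W//s)."""
    h, w = img.shape
    assert h % s == 0 and w % s == 0, "image dims must be divisible by s"
    return np.stack([img[i::s, j::s] for i in range(s) for j in range(s)])

def pixel_shuffle_upsample(subs, s=2):
    """Inverse of pixel_shuffle_downsample: reinterleave the sub-images."""
    n, hs, ws = subs.shape
    out = np.empty((hs * s, ws * s), dtype=subs.dtype)
    for idx in range(n):
        i, j = divmod(idx, s)
        out[i::s, j::s] = subs[idx]
    return out

img = np.arange(64).reshape(8, 8).astype(float)
subs = pixel_shuffle_downsample(img, s=2)
rec = pixel_shuffle_upsample(subs, s=2)
```

The round trip is lossless, so detection maps computed on the sub-images can be reassembled at full resolution.
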
19 pages, 14105 KiB  
Article
Identification of Pine Wilt-Diseased Trees Using UAV Remote Sensing Imagery and Improved PWD-YOLOv8n Algorithm
by Jianyi Su, Bingxi Qin, Fenggang Sun, Peng Lan and Guolin Liu
Drones 2024, 8(8), 404; https://doi.org/10.3390/drones8080404 (registering DOI) - 18 Aug 2024
Viewed by 344
Abstract
Pine wilt disease (PWD) is one of the most destructive diseases for pine trees, causing a significant effect on ecological resources. The identification of PWD-infected trees is an effective approach for disease control. However, the effects of complex environments and the multi-scale features of PWD trees hinder detection performance. To address these issues, this study proposes a detection model based on PWD-YOLOv8 by utilizing aerial images. In particular, the coordinate attention (CA) and convolutional block attention module (CBAM) mechanisms are combined with YOLOv8 to enhance feature extraction. The bidirectional feature pyramid network (BiFPN) structure is used to strengthen feature fusion and recognition capability for small-scale diseased trees. Meanwhile, the lightweight FasterBlock structure and efficient multi-scale attention (EMA) mechanism are employed to optimize the C2f module. In addition, the Inner-SIoU loss function is introduced to seamlessly improve model accuracy and reduce missing rates. The experiment showed that the proposed PWD-YOLOv8n algorithm outperformed conventional target-detection models on the validation set (mAP@0.5 = 94.3%, precision = 87.9%, recall = 87.0%, missing rate = 6.6%; model size = 4.8 MB). Therefore, the proposed PWD-YOLOv8n model demonstrates significant superiority in diseased-tree detection. It not only enhances detection efficiency and accuracy but also provides important technical support for forest disease control and prevention. Full article
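
The BiFPN structure used here fuses resized feature maps with "fast normalized" weights, i.e. non-negative learnable weights normalized to sum to one, following the original EfficientDet formulation. A minimal numpy sketch, with the inputs assumed already resized to a common shape:

```python
import numpy as np

def bifpn_fast_fuse(features, weights, eps=1e-4):
    """Fast normalized fusion: ReLU the learnable weights, normalize them
    to (approximately) sum to one, and take a weighted sum of the inputs."""
    w = np.maximum(np.asarray(weights, dtype=float), 0.0)  # ReLU
    w = w / (w.sum() + eps)
    return sum(wi * f for wi, f in zip(w, features))

f1 = np.ones((4, 4))       # stand-ins for two resized feature maps
f2 = 3 * np.ones((4, 4))
fused = bifpn_fast_fuse([f1, f2], weights=[1.0, 1.0])
```

The eps term avoids division by zero when all weights are clipped, and clipping lets training effectively switch off an unhelpful input.
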

17 pages, 5247 KiB  
Article
Intra-Pulse Modulation Recognition of Radar Signals Based on Efficient Cross-Scale Aware Network
by Jingyue Liang, Zhongtao Luo and Renlong Liao
Sensors 2024, 24(16), 5344; https://doi.org/10.3390/s24165344 (registering DOI) - 18 Aug 2024
Viewed by 394
Abstract
Radar signal intra-pulse modulation recognition can be addressed with convolutional neural networks (CNNs) and time–frequency images (TFIs). However, current CNNs have high computational complexity and do not perform well in low-signal-to-noise ratio (SNR) scenarios. In this paper, we propose a lightweight CNN known as the cross-scale aware network (CSANet) to recognize intra-pulse modulation based on three types of TFIs. The cross-scale aware (CSA) module, designed as a residual and parallel architecture, comprises a depthwise dilated convolution group (DDConv Group), a cross-channel interaction (CCI) mechanism, and spatial information focus (SIF). DDConv Group produces multiple-scale features with a dynamic receptive field, CCI fuses the features and mitigates noise in multiple channels, and SIF is aware of the cross-scale details of TFI structures. Furthermore, we develop a novel time–frequency fusion (TFF) feature based on three types of TFIs by employing image preprocessing techniques, i.e., adaptive binarization, morphological processing, and feature fusion. Experiments demonstrate that CSANet achieves higher accuracy with our TFF compared to other TFIs. Meanwhile, CSANet outperforms cutting-edge networks across twelve radar signal datasets, providing an efficient solution for high-precision recognition in low-SNR scenarios. Full article
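
The TFF preprocessing chain (adaptive binarization followed by morphological processing) can be sketched with a numpy-only Otsu threshold and a 3x3 binary dilation. The structuring element and the toy time-frequency image are illustrative choices, not the paper's exact pipeline.

```python
import numpy as np

def otsu_threshold(img):
    """Otsu's method: pick the threshold maximizing between-class variance
    of the grayscale histogram (pixel values assumed in 0..255)."""
    hist = np.bincount(img.ravel().astype(np.int64), minlength=256).astype(float)
    p = hist / hist.sum()
    omega = np.cumsum(p)                  # class-0 probability up to each level
    mu = np.cumsum(p * np.arange(256))    # class-0 mean times omega
    mu_t = mu[-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1 - omega))
    return int(np.argmax(np.nan_to_num(sigma_b)))

def dilate(mask, iterations=1):
    """Binary dilation with a 3x3 cross structuring element."""
    out = mask.astype(bool)
    for _ in range(iterations):
        out = np.logical_or.reduce([out,
                                    np.roll(out, 1, 0), np.roll(out, -1, 0),
                                    np.roll(out, 1, 1), np.roll(out, -1, 1)])
    return out

# Bimodal toy TFI: dark background with one bright horizontal ridge.
tfi = np.full((32, 32), 30, dtype=np.uint8)
tfi[14:18, :] = 220
th = otsu_threshold(tfi)
mask = dilate(tfi > th)
```

Binarization isolates the modulation's time-frequency ridge and dilation closes small gaps before the binarized maps are fused into the TFF feature.
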
(This article belongs to the Special Issue Radar Signal Detection, Recognition and Identification)

21 pages, 1668 KiB  
Article
DCG-Net: Enhanced Hyperspectral Image Classification with Dual-Branch Convolutional Neural Network and Graph Convolutional Neural Network Integration
by Wenkai Zhu, Xueying Sun and Qiang Zhang
Electronics 2024, 13(16), 3271; https://doi.org/10.3390/electronics13163271 (registering DOI) - 18 Aug 2024
Viewed by 224
Abstract
In recent years, graph convolutional neural networks (GCNs) and convolutional neural networks (CNNs) have made significant strides in hyperspectral image (HSI) classification. However, existing models often encounter information redundancy and feature mismatch during feature fusion, and they struggle with small-scale refined features. To address these issues, we propose DCG-Net, an innovative classification network integrating CNN and GCN architectures. Our approach includes the development of a double-branch expanding network (E-Net) to enhance spectral features and efficiently extract high-level features. Additionally, we incorporate a GCN with an attention mechanism to facilitate the integration of multi-space scale superpixel-level and pixel-level features. To further improve feature fusion, we introduce a feature aggregation module (FAM) that adaptively learns channel features, enhancing classification robustness and accuracy. Comprehensive experiments on three widely used datasets show that DCG-Net achieves superior classification results compared to other state-of-the-art methods. Full article
(This article belongs to the Topic Hyperspectral Imaging and Signal Processing)

24 pages, 2025 KiB  
Article
A Cross-Working Condition-Bearing Diagnosis Method Based on Image Fusion and a Residual Network Incorporating the Kolmogorov–Arnold Representation Theorem
by Ziyi Tang, Xinhao Hou, Xin Wang and Jifeng Zou
Appl. Sci. 2024, 14(16), 7254; https://doi.org/10.3390/app14167254 (registering DOI) - 17 Aug 2024
Viewed by 473
Abstract
With the optimization and advancement of industrial production and manufacturing, the application scenarios of bearings have become increasingly diverse and highly coupled. This complexity poses significant challenges for the extraction of bearing fault features, consequently affecting the accuracy of cross-condition fault diagnosis methods. To improve the extraction and recognition of fault features and enhance the diagnostic accuracy of models across different conditions, this paper proposes a cross-condition bearing diagnosis method. This method, named MCR-KAResNet-TLDAF, is based on image fusion and a residual network that incorporates the Kolmogorov–Arnold representation theorem. Firstly, the one-dimensional vibration signals of the bearing are processed using Markov transition field (MTF), continuous wavelet transform (CWT), and recurrence plot (RP) methods, converting the resulting images to grayscale. These grayscale images are then multiplied by corresponding coefficients and fed into the R, G, and B channels for image fusion. Subsequently, fault features are extracted using a residual network enhanced by the Kolmogorov–Arnold representation theorem. Additionally, a domain adaptation algorithm combining multiple kernel maximum mean discrepancy (MK-MMD) and conditional domain adversarial network with entropy conditioning (CDAN+E) is employed to align the source and target domains, thereby enhancing the model’s cross-condition diagnostic accuracy. The proposed method was experimentally validated on the Case Western Reserve University (CWRU) dataset and the Jiangnan University (JUN) dataset, which include the 6205-2RS JEM SKF, N205, and NU205 bearing models. The method achieved accuracy rates of 99.36% and 99.889% on the two datasets, respectively. Comparative experiments from various perspectives further confirm the superiority and effectiveness of the proposed model. Full article
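
The MK-MMD term used for domain alignment measures the discrepancy between source- and target-domain feature distributions with a mixture of Gaussian kernels. A numpy sketch of the (biased) statistic itself, independent of the full CDAN+E adversarial training loop; the kernel bandwidths are illustrative:

```python
import numpy as np

def mk_mmd(x, y, gammas=(0.5, 1.0, 2.0)):
    """Biased estimate of multi-kernel MMD^2 between sample sets x and y
    (rows are samples), averaging Gaussian kernels of several widths."""
    def gram(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return sum(np.exp(-g * d2) for g in gammas) / len(gammas)
    return gram(x, x).mean() + gram(y, y).mean() - 2 * gram(x, y).mean()

rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, size=(100, 2))       # "source-domain" features
tgt_near = rng.normal(0.0, 1.0, size=(100, 2))  # same distribution
tgt_far = rng.normal(3.0, 1.0, size=(100, 2))   # shifted distribution
```

Minimizing this quantity over learned features pulls the two domains together, which is what enables cross-condition transfer.
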
22 pages, 15192 KiB  
Article
Joint Luminance-Saliency Prior and Attention for Underwater Image Quality Assessment
by Zhiqiang Lin, Zhouyan He, Chongchong Jin, Ting Luo and Yeyao Chen
Remote Sens. 2024, 16(16), 3021; https://doi.org/10.3390/rs16163021 (registering DOI) - 17 Aug 2024
Viewed by 308
Abstract
Underwater images, as a crucial medium for storing ocean information captured by underwater sensors, play a vital role in various underwater tasks. However, they are prone to distortion due to the imaging environment, and the resulting decline in visual quality is an urgent issue for marine vision systems to address. It is therefore necessary to develop underwater image enhancement (UIE) methods and corresponding quality assessment methods. At present, most underwater image quality assessment (UIQA) methods rely on extracting handcrafted features that characterize degradation attributes; such features struggle to measure complex mixed distortions and often exhibit discrepancies with human visual perception in practical applications. Furthermore, current UIQA methods do not consider the perceptual effect of enhancement. To this end, this paper is the first to employ luminance and saliency priors as critical visual information to measure the global and local quality improvements achieved by UIE algorithms; the proposed method is named JLSAU. JLSAU is built upon an overall pyramid-structured backbone, supplemented by the Luminance Feature Extraction Module (LFEM) and the Saliency Weight Learning Module (SWLM), which aim to obtain perception features with luminance and saliency priors at multiple scales. The luminance priors capture visually sensitive global luminance distortion, including histogram statistical features and grayscale features with positional information. The saliency priors capture visual information that reflects local quality variation in both the spatial and channel domains. Finally, to effectively model the relationship among the different levels of visual information contained in the multi-scale features, the Attention Feature Fusion Module (AFFM) is proposed. Experimental results on the public UIQE and UWIQA datasets demonstrate that the proposed JLSAU outperforms existing state-of-the-art UIQA methods. Full article
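
The histogram-based luminance statistics of the kind the LFEM builds on can be illustrated with simple global descriptors. The exact feature set below (normalized histogram plus mean, standard deviation, and skewness) is an assumption for illustration, not the paper's definition.

```python
import numpy as np

def luminance_histogram_features(img, bins=8):
    """Global luminance descriptors: a normalized histogram over 0..255
    plus normalized mean/std and skewness of the pixel values."""
    v = img.ravel().astype(float)
    hist, _ = np.histogram(v, bins=bins, range=(0.0, 255.0))
    hist = hist / hist.sum()
    mu, sigma = v.mean(), v.std()
    skew = ((v - mu) ** 3).mean() / (sigma ** 3 + 1e-12)
    return np.concatenate([hist, [mu / 255.0, sigma / 255.0, skew]])

dark = np.full((16, 16), 20.0)  # stand-in for a low-luminance underwater frame
feats = luminance_histogram_features(dark)
```

A low-luminance frame concentrates its mass in the darkest bin, which is exactly the global distortion cue such features expose.
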
(This article belongs to the Special Issue Ocean Remote Sensing Based on Radar, Sonar and Optical Techniques)

19 pages, 7491 KiB  
Article
A Data-Driven Approach for Leveraging Inline and Offline Data to Determine the Causes of Monoclonal Antibody Productivity Reduction in the Commercial-Scale Cell Culture Process
by Sheng Zhang, Hang Chen, Yuxiang Wan, Haibin Wang and Haibin Qu
Pharmaceutics 2024, 16(8), 1082; https://doi.org/10.3390/pharmaceutics16081082 (registering DOI) - 17 Aug 2024
Viewed by 286
Abstract
The monoclonal antibody (mAb) manufacturing process comes with high profits and high costs, and thus mAb productivity is of vital importance. However, many factors can impact the cell culture process, and lead to mAb productivity reduction. Nowadays, the biopharma industry is actively employing manufacturing information systems, which enable the integration of both online data and offline data. Although the volume of data is large, related data mining studies for mAb productivity improvement are rare. Therefore, a data-driven approach is proposed in this study to leverage both the inline and offline data of the cell culture process to discover the causes of mAb productivity reduction. The approach consists of four steps, namely data preprocessing, phase division, feature extraction and fusion, and cluster comparing. First, data quality issues are solved during the data preprocessing step. Next, the inline data are divided into several phases based on the moving window k-nearest neighbor method. Then, the inline data features are extracted via functional data analysis and combined with the offline data features. Finally, the causes of mAb productivity reduction are identified using the contrasting clusters via the principal component analysis method. A commercial-scale cell culture process case study is provided in this research to verify the effectiveness of the approach. Data from 35 batches were collected, and each batch contained nine inline variables and seven offline variables. The causes of mAb productivity reduction were identified to be the lack of nutrients, and recommended actions were taken according to the result, which was subsequently proven by six validation batches. Full article
(This article belongs to the Section Pharmaceutical Technology, Manufacturing and Devices)

19 pages, 4688 KiB  
Article
Enhancing the Resolution of Satellite Ocean Data Using Discretized Satellite Gridding Neural Networks
by Shirong Liu, Wentao Jia, Qianyun Wang, Weimin Zhang and Huizan Wang
Remote Sens. 2024, 16(16), 3020; https://doi.org/10.3390/rs16163020 (registering DOI) - 17 Aug 2024
Viewed by 278
Abstract
Ocean satellite data are often impeded by intrinsic limitations in resolution and accuracy. However, conventional data reconstruction approaches encounter substantial challenges when facing the nonlinear oceanic system and high-resolution fusion of variables. This research presents a Discrete Satellite Gridding Neural Network (DSGNN), a new machine learning method that processes satellite data within a discrete grid framework. By transforming the positional information of grid elements into a standardized vector format, the DSGNN significantly elevates the accuracy and resolution of data fusion through a neural network model. This method’s innovative aspect lies in its discretization and fusion technique, which not only enhances the spatial resolution of oceanic data but also, through the integration of multi-element datasets, better reflects the true physical state of the ocean. A comprehensive analysis of the reconstructed datasets indicates the DSGNN’s consistency and reliability across different seasons and oceanic regions, especially in its adept handling of complex nonlinear interactions and small-scale oceanic features. The DSGNN method has demonstrated exceptional competence in reconstructing global ocean datasets, maintaining small error variance, and achieving high congruence with in situ observations, which is almost equivalent to 1/12° hybrid coordinate ocean model (HYCOM) data. This study offers a novel and potent strategy for the high-resolution reconstruction and fusion of ocean satellite datasets. Full article
(This article belongs to the Special Issue Artificial Intelligence and Big Data for Oceanography)

25 pages, 11774 KiB  
Article
CR-YOLOv9: Improved YOLOv9 Multi-Stage Strawberry Fruit Maturity Detection Application Integrated with CRNET
by Rong Ye, Guoqi Shao, Quan Gao, Hongrui Zhang and Tong Li
Foods 2024, 13(16), 2571; https://doi.org/10.3390/foods13162571 (registering DOI) - 17 Aug 2024
Viewed by 230
Abstract
Strawberries are a commonly used agricultural product in the food industry. In the traditional production model, labor costs are high, and extensive picking techniques can result in food safety issues, like poor taste and fruit rot. In response to the existing challenges of low detection accuracy and slow detection speed in the assessment of strawberry fruit maturity in orchards, a CR-YOLOv9 multi-stage method for strawberry fruit maturity detection was introduced. The composite thinning network, CRNet, is utilized for target fusion, employing multi-branch blocks to enhance images by restoring high-frequency details. To address the issue of low computational efficiency in the multi-head self-attention (MHSA) model due to redundant attention heads, the design concept of CGA is introduced. This concept aligns input feature grouping with the number of attention heads, offering the distinct segmentation of complete features for each attention head, thereby reducing computational redundancy. A hybrid operator, ACmix, is proposed to enhance the efficiency of image classification and target detection. Additionally, the Inner-IoU concept, in conjunction with Shape-IoU, is introduced to replace the original loss function, thereby enhancing the accuracy of detecting small targets in complex scenes. The experimental results demonstrate that CR-YOLOv9 achieves a precision rate of 97.52%, a recall rate of 95.34%, and an mAP@50 of 97.95%. These values are notably higher than those of YOLOv9, by 4.2, 5.07, and 3.34 percentage points, respectively. Furthermore, the detection speed of CR-YOLOv9 is 84, making it suitable for the real-time detection of strawberry ripeness in orchards. The results demonstrate that the CR-YOLOv9 algorithm discussed in this study exhibits high detection accuracy and rapid detection speed. This enables more efficient and automated strawberry picking, meeting the public’s requirements for food safety. Full article
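
Inner-IoU and Shape-IoU both start from the standard intersection-over-union between a predicted and a ground-truth box, then add auxiliary-box and shape-aware terms. For reference, a sketch of the base quantity those variants modify:

```python
def iou(box_a, box_b):
    """Standard IoU for axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))   # intersection width
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))   # intersection height
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0
```

The IoU-based loss variants keep this overlap term but reshape its gradients for small or oddly proportioned targets.
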

20 pages, 3293 KiB  
Article
Camera-Radar Fusion with Radar Channel Extension and Dual-CBAM-FPN for Object Detection
by Xiyan Sun, Yaoyu Jiang, Hongmei Qin, Jingjing Li and Yuanfa Ji
Sensors 2024, 24(16), 5317; https://doi.org/10.3390/s24165317 (registering DOI) - 16 Aug 2024
Viewed by 245
Abstract
When it comes to road environment perception, millimeter-wave radar with a camera facilitates more reliable detection than a single sensor. However, the limited utilization of radar features and insufficient extraction of important features remain pertinent issues, especially with regard to the detection of small and occluded objects. To address these concerns, we propose a camera-radar fusion with radar channel extension and a dual-CBAM-FPN (CRFRD), which incorporates a radar channel extension (RCE) module and a dual-CBAM-FPN (DCF) module into the camera-radar fusion net (CRF-Net). In the RCE module, we design an azimuth-weighted RCS parameter and extend three radar channels, which leverage the secondary redundant information to achieve richer feature representation. In the DCF module, we present the dual-CBAM-FPN, which enables the model to focus on important features by inserting CBAM at the input and the fusion process of FPN simultaneously. Comparative experiments conducted on the NuScenes dataset and real data demonstrate the superior performance of the CRFRD compared to CRF-Net, as its weighted mean average precision (wmAP) increases from 43.89% to 45.03%. Furthermore, ablation studies verify the indispensability of the RCE and DCF modules and the effectiveness of azimuth-weighted RCS. Full article
(This article belongs to the Section Radar Sensors)

32 pages, 28406 KiB  
Article
Infrared and Harsh Light Visible Image Fusion Using an Environmental Light Perception Network
by Aiyun Yan, Shang Gao, Zhenlin Lu, Shuowei Jin and Jingrong Chen
Entropy 2024, 26(8), 696; https://doi.org/10.3390/e26080696 - 16 Aug 2024
Abstract
The complementary combination of salient target objects in infrared images and rich texture details in visible images can effectively enhance the information entropy of fused images, thereby providing substantial assistance for downstream composite high-level vision tasks, such as nighttime intelligent vehicle driving. However, mainstream fusion algorithms lack specific research on the contradiction between the low information entropy and high pixel intensity of visible images in harsh-light nighttime road environments. As a result, fusion algorithms that perform well in normal conditions can only produce low-information-entropy fusion images, similar in information distribution to the visible images, under harsh light interference. In response to these problems, we designed an image fusion network resilient to interference from harsh light environments, incorporating entropy and information-theoretic principles to enhance robustness and information retention. Specifically, an edge feature extraction module was designed to extract key edge features of salient targets to optimize fusion information entropy. Additionally, a harsh light environment aware (HLEA) module was proposed to avoid the decrease in fusion image quality caused by the contradiction between low information entropy and high pixel intensity, based on the information distribution characteristics of harsh-light visible images. Finally, an edge-guided hierarchical fusion (EGHF) module was designed to achieve robust feature fusion, minimizing irrelevant noise entropy and maximizing useful information entropy. Extensive experiments demonstrate that, compared to other advanced algorithms, the fusion results of the proposed method contain more useful information and show significant advantages in high-level vision tasks under harsh nighttime lighting conditions. Full article
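Edge-guided fusion of the kind described above starts from an edge map of the salient targets. As a rough stand-in for the learned edge feature extraction module, a fixed-kernel Sobel operator illustrates the idea; everything here is an assumption for illustration, not the EGHF module itself.

```python
import numpy as np

def sobel_edges(img):
    """Gradient-magnitude edge map via fixed Sobel kernels (illustrative only)."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            window = padded[i:i + 3, j:j + 3]
            gx[i, j] = (window * kx).sum()
            gy[i, j] = (window * ky).sum()
    return np.hypot(gx, gy)  # strong response along object contours

# A vertical step edge produces a strong response along the step and nothing
# elsewhere -- exactly the kind of map a fusion network can use as guidance.
step = np.zeros((6, 6))
step[:, 3:] = 1.0
edges = sobel_edges(step)
```

An edge-guided fusion rule could then weight infrared and visible features more heavily near high-magnitude pixels of such a map, keeping contour information in the fused result.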
(This article belongs to the Special Issue Methods in Artificial Intelligence and Information Processing II)
14 pages, 7566 KiB  
Article
Ship Segmentation via Combined Attention Mechanism and Efficient Channel Attention High-Resolution Representation Network
by Xiaoyi Li
J. Mar. Sci. Eng. 2024, 12(8), 1411; https://doi.org/10.3390/jmse12081411 - 16 Aug 2024
Abstract
Ship segmentation at small imaging sizes, where imaging noise interference challenges the performance of ship detection and visual navigation models, has attracted significant attention in the field. To address these issues, this study proposed a novel combined attention mechanism and efficient channel attention high-resolution representation network (CA2HRNET). More specifically, the proposed model achieves accurate ship segmentation by introducing a channel attention mechanism, a multi-scale spatial attention mechanism, and a weight self-adjusted attention mechanism. Overall, the proposed CA2HRNET model enhances attention performance by focusing on the subtle yet important features and pixels of a ship against background-interference pixels. The proposed ship segmentation model can accurately focus on ship features by implementing fused channel and spatial attention mechanisms at each feature scale. Moreover, the channel attention mechanism helps the proposed framework allocate higher weights to ship-feature-related pixels. The experimental results show that the proposed CA2HRNET model outperforms its counterparts in terms of accuracy (Accs), precision (Pc), F1-score (F1s), intersection over union (IoU), and frequency-weighted IoU (FIoU). The average Accs, Pc, F1s, IoU, and FIoU for the proposed CA2HRNET model were 99.77%, 97.55%, 97%, 96.97%, and 99.55%, respectively. The research findings can promote intelligent ship visual navigation and maritime traffic management in the smart shipping era. Full article
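The combined channel and spatial attention described above can be sketched on a toy (C, H, W) feature map. In this sketch the shared MLP of the channel branch and the convolution of the spatial branch are replaced by a simple additive combination of pooled statistics, purely for illustration; CA2HRNET's actual layers are learned.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat):
    """Pool each channel over space, then gate the channels."""
    gate = sigmoid(feat.mean(axis=(1, 2)) + feat.max(axis=(1, 2)))
    return feat * gate[:, None, None]

def spatial_attention(feat):
    """Pool over channels at each pixel, then gate the pixels."""
    gate = sigmoid(feat.mean(axis=0) + feat.max(axis=0))
    return feat * gate[None, :, :]

# Channel 1 carries the "ship" signal; attention amplifies it relative to the
# empty channel 0 while preserving the feature-map shape.
feat = np.zeros((2, 3, 3))
feat[1] = 3.0
out = spatial_attention(channel_attention(feat))
```

Applying such a gating pair at every scale of a high-resolution backbone is the general mechanism by which ship-related pixels receive higher weights than background-interference pixels.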