
Search Results (2,038)

Search Parameters:
Keywords = multi-scale attention

26 pages, 2841 KiB  
Article
Distributed Regional Photovoltaic Power Prediction Based on Stack Integration Algorithm
by Keyong Hu, Chunyuan Lang, Zheyi Fu, Yang Feng, Shuifa Sun and Ben Wang
Mathematics 2024, 12(16), 2561; https://doi.org/10.3390/math12162561 - 19 Aug 2024
Abstract
With the continuous increase in the proportion of distributed photovoltaic power stations, the demand for photovoltaic power grid connection is becoming more and more urgent, and the requirements for the accuracy of regional distributed photovoltaic power forecasting are also increasing. A distributed regional photovoltaic power prediction model based on a stacked ensemble algorithm is proposed here. This model first uses a graph attention network (GAT) to learn the structural features and relationships between sub-area photovoltaic power stations, dynamically calculating the attention weights of the photovoltaic power stations to capture the global relationships and importance between stations, and selects representative stations for each sub-area. Subsequently, the CNN-LSTM-multi-head attention parallel multi-channel (CNN-LSTM-MHA (PC)) model is used as the basic model to predict representative stations for sub-areas by integrating the advantages of both the CNN and LSTM models. The predicted results are then used as new features for the input data of the meta-model, which finally predicts the photovoltaic power of the large area. Through comparative experiments at different seasons and time scales, this distributed regional approach reduced the MAE metric by a total of 22.85 kW in spring, 17 kW in summer, 30.26 kW in autumn, and 50.62 kW in winter compared with other models. Full article
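To make the base model above more concrete, here is a minimal PyTorch sketch of one parallel CNN / LSTM / multi-head-attention channel in the spirit of the CNN-LSTM-MHA (PC) description; the layer sizes, fusion strategy, and class name are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumed shapes and layer sizes) of a parallel CNN / LSTM /
# multi-head-attention regressor; not the paper's implementation.
import torch
import torch.nn as nn

class CnnLstmMhaChannel(nn.Module):
    def __init__(self, n_features: int, hidden: int = 64, heads: int = 4):
        super().__init__()
        # CNN channel: local temporal patterns in the PV time series
        self.cnn = nn.Sequential(
            nn.Conv1d(n_features, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        # LSTM channel: longer-range temporal dependencies
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        # Multi-head self-attention over the time axis of the CNN channel
        self.attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.head = nn.Linear(2 * hidden, 1)   # fuse both channels -> power value

    def forward(self, x):                                   # x: (batch, time, n_features)
        c = self.cnn(x.transpose(1, 2)).transpose(1, 2)     # (batch, time, hidden)
        c, _ = self.attn(c, c, c)                           # attention-refined CNN channel
        h, _ = self.lstm(x)                                 # (batch, time, hidden)
        fused = torch.cat([c[:, -1], h[:, -1]], dim=-1)     # last-step features
        return self.head(fused).squeeze(-1)                 # predicted PV power

# Stacking idea: base-model predictions for each representative station become
# extra input features for a meta-model that predicts the regional total.
```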
27 pages, 18580 KiB  
Article
YOLOv5s-BiPCNeXt, a Lightweight Model for Detecting Disease in Eggplant Leaves
by Zhedong Xie, Chao Li, Zhuang Yang, Zhen Zhang, Jiazhuo Jiang and Hongyu Guo
Plants 2024, 13(16), 2303; https://doi.org/10.3390/plants13162303 - 19 Aug 2024
Abstract
Ensuring the healthy growth of eggplants requires the precise detection of leaf diseases, which can significantly boost yield and economic income. Improving the efficiency of plant disease identification in natural scenes is currently a crucial issue. This study aims to provide an efficient detection method suitable for disease detection in natural scenes. A lightweight detection model, YOLOv5s-BiPCNeXt, is proposed. This model utilizes the MobileNeXt backbone to reduce network parameters and computational complexity and includes a lightweight C3-BiPC neck module. Additionally, a multi-scale cross-spatial attention mechanism (EMA) is integrated into the neck network, and the nearest neighbor interpolation algorithm is replaced with the content-aware feature recombination operator (CARAFE), enhancing the model’s ability to perceive multidimensional information and extract multiscale disease features and improving the spatial resolution of the disease feature map. These improvements enhance the detection accuracy for eggplant leaves, effectively reducing missed and incorrect detections caused by complex backgrounds and improving the detection and localization of small lesions at the early stages of brown spot and powdery mildew diseases. Experimental results show that the YOLOv5s-BiPCNeXt model achieves an average precision (AP) of 94.9% for brown spot disease, 95.0% for powdery mildew, and 99.5% for healthy leaves. Deployed on a Jetson Orin Nano edge detection device, the model attains an average recognition speed of 26 FPS (Frame Per Second), meeting real-time requirements. Compared to other algorithms, YOLOv5s-BiPCNeXt demonstrates superior overall performance, accurately detecting plant diseases under natural conditions and offering valuable technical support for the prevention and treatment of eggplant leaf diseases. Full article

21 pages, 23534 KiB  
Article
GVC-YOLO: A Lightweight Real-Time Detection Method for Cotton Aphid-Damaged Leaves Based on Edge Computing
by Zhenyu Zhang, Yunfan Yang, Xin Xu, Liangliang Liu, Jibo Yue, Ruifeng Ding, Yanhui Lu, Jie Liu and Hongbo Qiao
Remote Sens. 2024, 16(16), 3046; https://doi.org/10.3390/rs16163046 - 19 Aug 2024
Abstract
Cotton aphids (Aphis gossypii Glover) pose a significant threat to cotton growth, exerting detrimental effects on both yield and quality. Conventional methods for pest and disease surveillance in agricultural settings suffer from a lack of real-time capability. The use of edge computing devices for real-time processing of cotton aphid-damaged leaves captured by field cameras holds significant practical research value for large-scale disease and pest control measures. The mainstream detection models are generally large in size, making it challenging to achieve real-time detection on edge computing devices with limited resources. In response to these challenges, we propose GVC-YOLO, a real-time detection method for cotton aphid-damaged leaves based on edge computing. Building upon YOLOv8n, lightweight GSConv and VoVGSCSP modules are employed to reconstruct the neck and backbone networks, thereby reducing model complexity while enhancing multiscale feature fusion. In the backbone network, we integrate the coordinate attention (CA) mechanism and the SimSPPF network to increase the model’s ability to extract features of cotton aphid-damaged leaves, balancing the accuracy loss of the model after becoming lightweight. The experimental results demonstrate that the size of the GVC-YOLO model is only 5.4 MB, a decrease of 14.3% compared with the baseline network, with a reduction of 16.7% in the number of parameters and 17.1% in floating-point operations (FLOPs). The mAP@0.5 and mAP@0.5:0.95 reach 97.9% and 90.3%, respectively. The GVC-YOLO model is optimized and accelerated by TensorRT and then deployed onto the embedded edge computing device Jetson Xavier NX for detecting cotton aphid damage video captured from the camera. Under FP16 quantization, the detection speed reaches 48 frames per second (FPS). In summary, the proposed GVC-YOLO model demonstrates good detection accuracy and speed, and its performance in detecting cotton aphid damage in edge computing scenarios meets practical application needs. This research provides a convenient and effective intelligent method for the large-scale detection and precise control of pests in cotton fields. Full article
(This article belongs to the Special Issue Plant Disease Detection and Recognition Using Remotely Sensed Data)
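For context, the coordinate attention (CA) mechanism mentioned above factorizes attention along the two spatial axes; the following is a compact, generic CA-style block, where the reduction ratio and layer choices are assumptions rather than the GVC-YOLO code.

```python
# Generic coordinate-attention block: pool along H and W separately, then
# produce direction-aware channel gates. Reduction ratio is an assumption.
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        mid = max(8, channels // reduction)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.act = nn.ReLU()
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x):                                       # x: (B, C, H, W)
        b, c, h, w = x.shape
        x_h = x.mean(dim=3, keepdim=True)                        # (B, C, H, 1)
        x_w = x.mean(dim=2, keepdim=True).transpose(2, 3)        # (B, C, W, 1)
        y = self.act(self.conv1(torch.cat([x_h, x_w], dim=2)))   # shared 1x1 conv
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                    # (B, C, H, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.transpose(2, 3)))    # (B, C, 1, W)
        return x * a_h * a_w                                     # position-aware reweighting
```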

20 pages, 631 KiB  
Article
Dynamic Target Assignment by Unmanned Surface Vehicles Based on Reinforcement Learning
by Tao Hu, Xiaoxue Zhang, Xueshan Luo and Tao Chen
Mathematics 2024, 12(16), 2557; https://doi.org/10.3390/math12162557 - 19 Aug 2024
Abstract
Due to the dynamic complexities of the multi-unmanned vessel target assignment problem at sea, especially when addressing moving targets, traditional optimization algorithms often fail to quickly find an adequate solution. To overcome this, we have developed a multi-agent reinforcement learning algorithm. This approach involves defining a state space, employing preferential experience replay, and integrating self-attention mechanisms, which are applied to a novel offshore unmanned vessel model designed for dynamic target allocation. We have conducted a thorough analysis of strike positions and times, establishing robust mathematical models. Additionally, we designed several experiments to test the effectiveness of the algorithm. The proposed algorithm improves the quality of the solution by at least 30% in larger-scale scenarios compared to the genetic algorithm (GA), and its average solution time is less than 10% of the GA's, demonstrating the feasibility of the algorithm in solving the problem. Full article
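As an illustration of the preferential (prioritized) experience replay mentioned above, here is a minimal sketch of a replay buffer that samples transitions in proportion to their TD error; the capacity and priority exponent are illustrative assumptions, not the paper's settings.

```python
# Minimal preferential replay buffer: transitions with larger TD error
# are stored with higher priority and replayed more often.
import random

class PreferentialReplayBuffer:
    def __init__(self, capacity: int = 10000, alpha: float = 0.6):
        self.capacity, self.alpha = capacity, alpha
        self.items, self.priorities = [], []

    def add(self, transition, td_error: float):
        priority = (abs(td_error) + 1e-6) ** self.alpha   # avoid zero priority
        if len(self.items) >= self.capacity:
            self.items.pop(0)
            self.priorities.pop(0)
        self.items.append(transition)
        self.priorities.append(priority)

    def sample(self, batch_size: int):
        # Sample proportionally to priority; high-error transitions dominate.
        return random.choices(self.items, weights=self.priorities, k=batch_size)
```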

18 pages, 5207 KiB  
Article
MAPPNet: A Multi-Scale Attention Pyramid Pooling Network for Dental Calculus Segmentation
by Tianyu Nie, Shihong Yao, Di Wang, Conger Wang and Yishi Zhao
Appl. Sci. 2024, 14(16), 7273; https://doi.org/10.3390/app14167273 - 19 Aug 2024
Abstract
Dental diseases are among the most prevalent diseases globally, and accurate segmentation of dental calculus images plays a crucial role in periodontal disease diagnosis and treatment planning. However, the current methods are not stable and reliable enough due to the variable morphology of dental calculus and the blurring of the boundaries between the dental edges and the surrounding tissues; therefore, our hope is to propose an accurate and reliable calculus segmentation algorithm to improve the efficiency of clinical detection. We propose a multi-scale attention pyramid pooling network (MAPPNet) to enhance the performance of dental calculus segmentation. The network incorporates a multi-scale fusion strategy in both the encoder and decoder, forming a model with a dual-ended multi-scale structure. This design, in contrast to employing a multi-scale fusion scheme at a single end, enables more effective capturing of features from diverse scales. Furthermore, the attention pyramid pooling module (APPM) reconstructs the features on this map by leveraging a spatial-first and channel-second attention mechanism. APPM enables the network to adaptively adjust the weights of different locations and channels in the feature map, thereby enhancing the perception of important regions and key features. Experimental evaluation of our collected dental calculus segmentation dataset demonstrates the superior performance of MAPPNet, which achieves an intersection-over-union of 81.46% and an accuracy rate of 98.35%. Additionally, on two publicly available datasets, ISIC2018 (skin lesion dataset) and Kvasir-SEG (gastrointestinal polyp segmentation dataset), MAPPNet achieved an intersection-over-union of 76.48% and 91.38%, respectively. These results validate the effectiveness of our proposed network in accurately segmenting lesion regions and achieving high accuracy rates, surpassing many existing segmentation methods. Full article
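The "spatial-first and channel-second" ordering described for APPM can be sketched generically as follows; the pooling statistics, kernel size, and reduction ratio are assumptions and this is not the MAPPNet implementation.

```python
# Spatial attention applied first (where to look), then channel attention
# (which features matter); sizes and ratios are illustrative.
import torch
import torch.nn as nn

class SpatialThenChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        # Spatial attention: a single map from pooled per-pixel statistics
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)
        # Channel attention: squeeze-and-excitation style gating
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid(),
        )

    def forward(self, x):                                     # x: (B, C, H, W)
        stats = torch.cat([x.mean(1, keepdim=True),
                           x.amax(1, keepdim=True)], dim=1)   # (B, 2, H, W)
        x = x * torch.sigmoid(self.spatial(stats))            # spatial reweighting
        return x * self.channel(x)                            # then channel reweighting
```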

23 pages, 2501 KiB  
Article
MsFNet: Multi-Scale Fusion Network Based on Dynamic Spectral Features for Multi-Temporal Hyperspectral Image Change Detection
by Yining Feng, Weihan Ni, Liyang Song and Xianghai Wang
Remote Sens. 2024, 16(16), 3037; https://doi.org/10.3390/rs16163037 - 18 Aug 2024
Abstract
With the development of satellite technology, the importance of multi-temporal remote sensing (RS) image change detection (CD) in urban planning, environmental monitoring, and other fields is increasingly prominent. Deep learning techniques enable a profound exploration of the intrinsic features within hyperspectral (HS) data, leading to substantial enhancements in CD accuracy while addressing several challenges posed by traditional methodologies. However, existing convolutional neural network (CNN)-based CD approaches frequently encounter issues during the feature extraction process, such as the loss of detailed information due to downsampling, which hampers a model’s ability to accurately capture complex spectral features. Additionally, these methods often neglect the integration of multi-scale information, resulting in suboptimal local feature extraction and, consequently, diminished model performance. To address these limitations, we propose a multi-scale fusion network (MsFNet) which leverages dynamic spectral features for effective multi-temporal HS-CD. Our approach incorporates a dynamic convolution module with spectral attention, which adaptively modulates the receptive field size according to the spectral characteristics of different bands. This flexibility enhances the model’s capacity to focus on critical bands, thereby improving its ability to identify and differentiate changes across spectral dimensions. Furthermore, we develop a multi-scale feature fusion module which extracts and integrates features from deep feature maps, enriching local information and augmenting the model’s sensitivity to local variations. Experimental evaluations conducted on three real-world HS-CD datasets demonstrate that the proposed MsFNet significantly outperforms contemporary advanced CD methods in terms of both efficacy and performance. Full article
(This article belongs to the Special Issue Recent Advances in the Processing of Hyperspectral Images)
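One plausible reading of the dynamic convolution module with spectral attention is a set of parallel branches with different receptive fields, mixed by gates computed from the global spectral signature; the sketch below illustrates that idea under assumed branch sizes and is not the MsFNet code.

```python
# Spectrally gated dynamic convolution: attention over parallel branches
# with different receptive fields, driven by the global spectral descriptor.
import torch
import torch.nn as nn

class SpectralDynamicConv(nn.Module):
    def __init__(self, channels: int, kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, k, padding=k // 2) for k in kernel_sizes
        )
        self.gate = nn.Sequential(
            nn.Linear(channels, channels // 2), nn.ReLU(),
            nn.Linear(channels // 2, len(kernel_sizes)), nn.Softmax(dim=-1),
        )

    def forward(self, x):                                       # x: (B, bands, H, W)
        spectrum = x.mean(dim=(2, 3))                            # global spectral signature
        w = self.gate(spectrum)                                  # (B, n_branches)
        outs = torch.stack([b(x) for b in self.branches], dim=1) # (B, n, C, H, W)
        return (w[:, :, None, None, None] * outs).sum(dim=1)     # gated mixture
```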

27 pages, 8893 KiB  
Article
A Multi-Scale Mask Convolution-Based Blind-Spot Network for Hyperspectral Anomaly Detection
by Zhiwei Yang, Rui Zhao, Xiangchao Meng, Gang Yang, Weiwei Sun, Shenfu Zhang and Jinghui Li
Remote Sens. 2024, 16(16), 3036; https://doi.org/10.3390/rs16163036 - 18 Aug 2024
Abstract
Existing methods of hyperspectral anomaly detection still face several challenges: (1) Due to the limitations of self-supervision, avoiding the identity mapping of anomalies remains difficult; (2) the ineffective interaction between spatial and spectral features leads to the insufficient utilization of spatial information; and (3) current methods are not adaptable to the detection of multi-scale anomaly targets. To address the aforementioned challenges, we proposed a blind-spot network based on multi-scale blind-spot convolution for HAD. The multi-scale mask convolution module is employed to adapt to diverse scales of anomaly targets, while the dynamic fusion module is introduced to integrate the advantages of mask convolutions at different scales. The proposed approach includes a spatial–spectral joint module and a background feature attention mechanism to enhance the interaction between spatial–spectral features, with a specific emphasis on highlighting the significance of background features within the network. Furthermore, we propose a preprocessing technique that combines pixel shuffle down-sampling (PD) with spatial spectral joint screening. This approach addresses anomalous identity mapping and enables finite-scale mask convolution for better detection of targets at various scales. The proposed approach was assessed on four real hyperspectral datasets comprising anomaly targets of different scales. The experimental results demonstrate the effectiveness and superior performance of the proposed methodology compared with nine state-of-the-art methods. Full article
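The blind-spot idea underlying the mask convolutions above can be illustrated with a centre-masked convolution, in which each output pixel never sees its own input value and anomalies therefore cannot be identity-mapped; the kernel size below is an illustrative assumption.

```python
# Centre-masked ("blind-spot") convolution: the kernel centre is zeroed,
# so reconstruction of a pixel relies only on its neighbourhood.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedConv2d(nn.Conv2d):
    def forward(self, x):
        mask = torch.ones_like(self.weight)
        mask[:, :, self.kernel_size[0] // 2, self.kernel_size[1] // 2] = 0.0
        return F.conv2d(x, self.weight * mask, self.bias,
                        self.stride, self.padding, self.dilation, self.groups)

# Usage: stacking such layers at several kernel sizes gives the multi-scale
# blind-spot receptive fields the detector above relies on.
blind_spot = MaskedConv2d(in_channels=32, out_channels=32, kernel_size=5, padding=2)
```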

19 pages, 14105 KiB  
Article
Identification of Pine Wilt-Diseased Trees Using UAV Remote Sensing Imagery and Improved PWD-YOLOv8n Algorithm
by Jianyi Su, Bingxi Qin, Fenggang Sun, Peng Lan and Guolin Liu
Drones 2024, 8(8), 404; https://doi.org/10.3390/drones8080404 - 18 Aug 2024
Abstract
Pine wilt disease (PWD) is one of the most destructive diseases for pine trees, causing a significant effect on ecological resources. The identification of PWD-infected trees is an effective approach for disease control. However, the effects of complex environments and the multi-scale features of PWD trees hinder detection performance. To address these issues, this study proposes a detection model based on PWD-YOLOv8 by utilizing aerial images. In particular, the coordinate attention (CA) and convolutional block attention module (CBAM) mechanisms are combined with YOLOv8 to enhance feature extraction. The bidirectional feature pyramid network (BiFPN) structure is used to strengthen feature fusion and recognition capability for small-scale diseased trees. Meanwhile, the lightweight FasterBlock structure and efficient multi-scale attention (EMA) mechanism are employed to optimize the C2f module. In addition, the Inner-SIoU loss function is introduced to seamlessly improve model accuracy and reduce missing rates. The experiment showed that the proposed PWD-YOLOv8n algorithm outperformed conventional target-detection models on the validation set (mAP@0.5 = 94.3%, precision = 87.9%, recall = 87.0%, missing rate = 6.6%; model size = 4.8 MB). Therefore, the proposed PWD-YOLOv8n model demonstrates significant superiority in diseased-tree detection. It not only enhances detection efficiency and accuracy but also provides important technical support for forest disease control and prevention. Full article
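As a reference point for the BiFPN structure mentioned above, the sketch below shows the learnable fast-normalized weighted fusion a BiFPN-style neck applies when merging same-resolution feature maps; the number of inputs and the epsilon are assumptions.

```python
# BiFPN-style fast-normalized weighted fusion of same-shape feature maps.
import torch
import torch.nn as nn

class WeightedFusion(nn.Module):
    def __init__(self, n_inputs: int = 2, eps: float = 1e-4):
        super().__init__()
        self.w = nn.Parameter(torch.ones(n_inputs))   # one learnable weight per input
        self.eps = eps

    def forward(self, feats):                 # feats: list of tensors, identical shapes
        w = torch.relu(self.w)                # keep weights non-negative
        w = w / (w.sum() + self.eps)          # fast normalized fusion
        return sum(wi * f for wi, f in zip(w, feats))
```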

21 pages, 1668 KiB  
Article
DCG-Net: Enhanced Hyperspectral Image Classification with Dual-Branch Convolutional Neural Network and Graph Convolutional Neural Network Integration
by Wenkai Zhu, Xueying Sun and Qiang Zhang
Electronics 2024, 13(16), 3271; https://doi.org/10.3390/electronics13163271 - 18 Aug 2024
Abstract
In recent years, graph convolutional neural networks (GCNs) and convolutional neural networks (CNNs) have made significant strides in hyperspectral image (HSI) classification. However, existing models often encounter information redundancy and feature mismatch during feature fusion, and they struggle with small-scale refined features. To address these issues, we propose DCG-Net, an innovative classification network integrating CNN and GCN architectures. Our approach includes the development of a double-branch expanding network (E-Net) to enhance spectral features and efficiently extract high-level features. Additionally, we incorporate a GCN with an attention mechanism to facilitate the integration of multi-space scale superpixel-level and pixel-level features. To further improve feature fusion, we introduce a feature aggregation module (FAM) that adaptively learns channel features, enhancing classification robustness and accuracy. Comprehensive experiments on three widely used datasets show that DCG-Net achieves superior classification results compared to other state-of-the-art methods. Full article
(This article belongs to the Topic Hyperspectral Imaging and Signal Processing)
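For readers unfamiliar with the GCN branch, the following is a standard symmetric-normalized graph-convolution layer of the kind such a branch would apply to superpixel features; the feature dimensions are assumptions and this is not the DCG-Net code.

```python
# Standard GCN layer with symmetric normalization D^{-1/2} A D^{-1/2}.
import torch
import torch.nn as nn

class GraphConvLayer(nn.Module):
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):          # x: (N, in_dim), adj: (N, N) incl. self-loops
        deg = adj.sum(dim=1)
        d_inv_sqrt = deg.clamp(min=1e-6).pow(-0.5)
        norm_adj = d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
        return torch.relu(self.lin(norm_adj @ x))   # aggregate neighbours, then project
```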

22 pages, 15192 KiB  
Article
Joint Luminance-Saliency Prior and Attention for Underwater Image Quality Assessment
by Zhiqiang Lin, Zhouyan He, Chongchong Jin, Ting Luo and Yeyao Chen
Remote Sens. 2024, 16(16), 3021; https://doi.org/10.3390/rs16163021 - 17 Aug 2024
Abstract
Underwater images, as a crucial medium for storing ocean information in underwater sensors, play a vital role in various underwater tasks. However, they are prone to distortion due to the imaging environment, leading to a decline in visual quality, an urgent issue for various marine vision systems to address. Therefore, it is necessary to develop underwater image enhancement (UIE) and corresponding quality assessment methods. At present, most underwater image quality assessment (UIQA) methods primarily rely on extracting handcrafted features that characterize degradation attributes, which struggle to measure complex mixed distortions and often exhibit discrepancies with human visual perception in practical applications. Furthermore, current UIQA methods give little consideration to how enhancement effects are perceived. To this end, this paper employs luminance and saliency priors as critical visual information, for the first time, to measure the global and local quality enhancement achieved by UIE algorithms; the proposed method is named JLSAU. The proposed JLSAU is built upon an overall pyramid-structured backbone, supplemented by the Luminance Feature Extraction Module (LFEM) and Saliency Weight Learning Module (SWLM), which aim at obtaining perception features with luminance and saliency priors at multiple scales. The supplement of luminance priors aims to perceive visually sensitive global distortion of luminance, including histogram statistical features and grayscale features with positional information. The supplement of saliency priors aims to perceive visual information that reflects local quality variation both in spatial and channel domains. Finally, to effectively model the relationship among different levels of visual information contained in the multi-scale features, the Attention Feature Fusion Module (AFFM) is proposed. Experimental results on the public UIQE and UWIQA datasets demonstrate that the proposed JLSAU outperforms existing state-of-the-art UIQA methods. Full article
(This article belongs to the Special Issue Ocean Remote Sensing Based on Radar, Sonar and Optical Techniques)

24 pages, 22182 KiB  
Article
Multi-Channel Multi-Scale Convolution Attention Variational Autoencoder (MCA-VAE): An Interpretable Anomaly Detection Algorithm Based on Variational Autoencoder
by Jingwen Liu, Yuchen Huang, Dizhi Wu, Yuchen Yang, Yanru Chen, Liangyin Chen and Yuanyuan Zhang
Sensors 2024, 24(16), 5316; https://doi.org/10.3390/s24165316 - 16 Aug 2024
Abstract
With the rapid development of industry, the risks factories face are increasing. Therefore, the anomaly detection algorithms deployed in factories need to have high accuracy, and they need to be able to promptly discover and locate the specific equipment causing the anomaly to restore the regular operation of the abnormal equipment. However, the neural network models currently deployed in factories cannot effectively capture both temporal features within dimensions and relationship features between dimensions; some algorithms that consider both types of features lack interpretability. Therefore, we propose a high-precision, interpretable anomaly detection algorithm based on variational autoencoder (VAE). We use a multi-scale local weight-sharing convolutional neural network structure to fully extract the temporal features within each dimension of the multi-dimensional time series. Then, we model the features from various aspects through multiple attention heads, extracting the relationship features between dimensions. We map the attention output results to the latent space distribution of the VAE and propose an optimization method to improve the reconstruction performance of the VAE, detecting anomalies through reconstruction errors. Regarding anomaly interpretability, we utilize the VAE probability distribution characteristics, decompose the obtained joint probability density into conditional probabilities on each dimension, and calculate the anomaly score, which provides helpful value for technicians. Experimental results show that our algorithm performed best in terms of F1 score and AUC value. The AUC value for anomaly detection is 0.982, and the F1 score is 0.905, which is 4% higher than the best-performing baseline algorithm, Transformer with a Discriminator for Anomaly Detection (TDAD). It also provides accurate anomaly interpretation capability. Full article
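The detection principle, reconstruction error from a variational autoencoder with per-dimension attribution, can be sketched as follows; the tiny encoder/decoder and the scoring rule are illustrative assumptions rather than the MCA-VAE architecture.

```python
# Minimal VAE with per-dimension reconstruction-error anomaly scores.
import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    def __init__(self, n_dims: int, latent: int = 8):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n_dims, 64), nn.ReLU())
        self.mu, self.logvar = nn.Linear(64, latent), nn.Linear(64, latent)
        self.dec = nn.Sequential(nn.Linear(latent, 64), nn.ReLU(), nn.Linear(64, n_dims))

    def forward(self, x):                                   # x: (batch, n_dims)
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterization
        return self.dec(z), mu, logvar

def anomaly_scores(model, x):
    """Per-dimension squared reconstruction error; the largest entries point to
    the sensor/equipment dimension most responsible for the anomaly."""
    with torch.no_grad():
        recon, _, _ = model(x)
    return (x - recon) ** 2                                 # (batch, n_dims)
```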

16 pages, 2860 KiB  
Article
Attention-Enhanced Bi-LSTM with Gated CNN for Ship Heave Multi-Step Forecasting
by Wenzhuo Shi, Zimeng Guo, Zixiang Dai, Shizhen Li and Meng Chen
J. Mar. Sci. Eng. 2024, 12(8), 1413; https://doi.org/10.3390/jmse12081413 - 16 Aug 2024
Abstract
This study addresses the challenges of predicting ship heave motion in real time, which is essential for mitigating sensor–actuator delays in high-performance active compensation control. Traditional methods often fall short due to training on specific sea conditions, and they lack real-time prediction capabilities. To overcome these limitations, this study introduces a multi-step prediction model based on a Seq2Seq framework, training with heave data taken from various sea conditions. The model features a long-term encoder with attention-enhanced Bi-LSTM, a short-term encoder with Gated CNN, and a decoder composed of multiple fully connected layers. The long-term encoder and short-term encoder are designed to maximize the extraction of global characteristics and multi-scale short-term features of heave data, respectively. An optimized Huber loss function is used to improve the fitting performance in peak and valley regions. The experimental results demonstrate that this model outperforms baseline methods across all metrics, providing precise predictions for high-sampling-rate real-time applications. Trained on simulated sea conditions and fine-tuned through transfer learning on actual ship data, the proposed model shows strong generalization with prediction errors smaller than 0.02 m. Based on both results from the regular test and the generalization test, the model’s predictive performance is shown to meet the necessary criteria for active heave compensation control. Full article
(This article belongs to the Section Ocean Engineering)
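The "optimized Huber loss" is described as improving the fit in peak and valley regions; one hedged way to realize that is a Huber loss reweighted by target amplitude, as sketched below. The threshold and the weighting rule are assumptions, not the paper's definition.

```python
# Huber loss with extra weight on large-amplitude (peak/valley) heave samples.
import torch

def weighted_huber(pred, target, delta: float = 0.1, peak_gain: float = 2.0):
    err = pred - target
    quadratic = 0.5 * err ** 2
    linear = delta * (err.abs() - 0.5 * delta)
    huber = torch.where(err.abs() <= delta, quadratic, linear)
    # Samples far from zero heave (peaks/valleys) receive up to peak_gain weight.
    weight = 1.0 + (peak_gain - 1.0) * (target.abs() / (target.abs().max() + 1e-8))
    return (weight * huber).mean()
```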

14 pages, 7566 KiB  
Article
Ship Segmentation via Combined Attention Mechanism and Efficient Channel Attention High-Resolution Representation Network
by Xiaoyi Li
J. Mar. Sci. Eng. 2024, 12(8), 1411; https://doi.org/10.3390/jmse12081411 - 16 Aug 2024
Abstract
Ship segmentation with small imaging size, which challenges ship detection and visual navigation model performance due to imaging noise interference, has attracted significant attention in the field. To address these issues, this study proposes a novel combined attention mechanism and efficient channel attention high-resolution representation network (CA2HRNET). More specifically, the proposed model achieves accurate ship segmentation by introducing a channel attention mechanism, a multi-scale spatial attention mechanism, and a weight self-adjusted attention mechanism. Overall, the proposed CA2HRNET model enhances attention mechanism performance by focusing on the trivial yet important features and pixels of a ship against background-interference pixels. The proposed ship segmentation model can accurately focus on ship features by implementing both channel and spatial fusion attention mechanisms at each scale feature layer. Moreover, the channel attention mechanism helps the proposed framework allocate higher weights to ship-feature-related pixels. The experimental results show that the proposed CA2HRNET model outperforms its counterparts in terms of accuracy (Accs), precision (Pc), F1-score (F1s), intersection over union (IoU), and frequency-weighted IoU (FIoU). The average Accs, Pc, F1s, IoU, and FIoU for the proposed CA2HRNET model were 99.77%, 97.55%, 97%, 96.97%, and 99.55%, respectively. The research findings can promote intelligent ship visual navigation and maritime traffic management in the smart shipping era. Full article

29 pages, 17576 KiB  
Article
Semantic Segmentation Network Based on Adaptive Attention and Deep Fusion Utilizing a Multi-Scale Dilated Convolutional Pyramid
by Shan Zhao, Zihao Wang, Zhanqiang Huo and Fukai Zhang
Sensors 2024, 24(16), 5305; https://doi.org/10.3390/s24165305 - 16 Aug 2024
Abstract
Deep learning has recently made significant progress in semantic segmentation. However, the current methods face critical challenges. The segmentation process often lacks sufficient contextual information and attention mechanisms, low-level features lack semantic richness, and high-level features suffer from poor resolution. These limitations reduce the model’s ability to accurately understand and process scene details, particularly in complex scenarios, leading to segmentation outputs that may have inaccuracies in boundary delineation, misclassification of regions, and poor handling of small or overlapping objects. To address these challenges, this paper proposes a Semantic Segmentation Network Based on Adaptive Attention and Deep Fusion with the Multi-Scale Dilated Convolutional Pyramid (SDAMNet). Specifically, the Dilated Convolutional Atrous Spatial Pyramid Pooling (DCASPP) module is developed to enhance contextual information in semantic segmentation. Additionally, a Semantic Channel Space Details Module (SCSDM) is devised to improve the extraction of significant features through multi-scale feature fusion and adaptive feature selection, enhancing the model’s perceptual capability for key regions and optimizing semantic understanding and segmentation performance. Furthermore, a Semantic Features Fusion Module (SFFM) is constructed to address the semantic deficiency in low-level features and the low resolution in high-level features. The effectiveness of SDAMNet is demonstrated on two datasets, revealing significant improvements in Mean Intersection over Union (MIOU) by 2.89% and 2.13%, respectively, compared to the Deeplabv3+ network. Full article
(This article belongs to the Section Sensing and Imaging)
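For orientation, DCASPP builds on atrous spatial pyramid pooling; the sketch below shows the core pattern of parallel dilated convolutions gathering context at several rates. The dilation rates and channel width are assumptions, not the SDAMNet configuration.

```python
# ASPP-style block: parallel dilated 3x3 convolutions at several rates,
# concatenated and projected back to the input width.
import torch
import torch.nn as nn

class DilatedPyramid(nn.Module):
    def __init__(self, channels: int, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=3, padding=r, dilation=r)
            for r in rates
        )
        self.project = nn.Conv2d(channels * len(rates), channels, kernel_size=1)

    def forward(self, x):                                  # x: (B, C, H, W)
        return self.project(torch.cat([b(x) for b in self.branches], dim=1))
```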

16 pages, 4191 KiB  
Communication
Optical-to-SAR Translation Based on CDA-GAN for High-Quality Training Sample Generation for Ship Detection in SAR Amplitude Images
by Baolong Wu, Haonan Wang, Cunle Zhang and Jianlai Chen
Remote Sens. 2024, 16(16), 3001; https://doi.org/10.3390/rs16163001 - 15 Aug 2024
Abstract
Abundant datasets are critical to train models based on deep learning technologies for ship detection applications. Compared with optical images, ship detection based on synthetic aperture radar (SAR) (especially the high-Earth-orbit spaceborne SAR launched recently) lacks enough training samples. A novel cross-domain attention GAN (CDA-GAN) model is proposed for optical-to-SAR translation, which can generate high-quality SAR amplitude training samples of a target by optical image conversion. This high quality includes high geometry structure similarity of the target compared with the corresponding optical image and low background noise around the target. In the proposed model, the cross-domain attention mechanism and cross-domain multi-scale feature fusion are designed to improve the quality of samples for detection based on the generative adversarial network (GAN). Specifically, a cross-domain attention mechanism is designed to simultaneously emphasize discriminative features from optical images and SAR images at the same time. Moreover, a designed cross-domain multi-scale feature fusion module further emphasizes the geometric information and semantic information of the target in a feature graph from the perspective of global features. Finally, a reference loss is introduced in CDA-GAN to completely retain the extra features generated by the cross-domain attention mechanism and cross-domain multi-scale feature fusion module. Experimental results demonstrate that the training samples generated by the proposed CDA-GAN can obtain higher ship detection accuracy using real SAR data than the other state-of-the-art methods. The proposed method is generally available for different orbit SARs and can be extended to the high-Earth-orbit spaceborne SAR case. Full article
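The cross-domain attention mechanism can be pictured as SAR-side features querying optical-side features so that target geometry from the optical domain is emphasised; the sketch below shows one possible realization with standard multi-head attention, where token and embedding sizes are assumptions and this is not the CDA-GAN implementation.

```python
# Cross-attention between SAR and optical feature tokens: SAR queries, optical keys/values.
import torch
import torch.nn as nn

class CrossDomainAttention(nn.Module):
    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, sar_tokens, opt_tokens):       # both: (batch, tokens, dim)
        # SAR features attend to optical features, pulling in target geometry cues.
        attended, _ = self.attn(query=sar_tokens, key=opt_tokens, value=opt_tokens)
        return self.norm(sar_tokens + attended)      # residual + normalization
```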
