Search Results (170)

Search Parameters:
Keywords = cross-layer fusion

18 pages, 4038 KiB  
Article
Multi-Height and Heterogeneous Sensor Fusion Discriminant with LSTM for Weak Fire Signal Detection in Large Spaces with High Ceilings
by Li Wang, Boning Li, Xiaosheng Yu and Jubo Chen
Electronics 2024, 13(13), 2572; https://doi.org/10.3390/electronics13132572 - 30 Jun 2024
Viewed by 116
Abstract
Fire is a significant cause of fatalities and property loss. In tall spaces, early smoke dispersion is hindered by thermal barriers, and initial flames with limited smoke production may be obscured by ground-level structures. Consequently, smoke, temperature, and other fire sensor signals are weakened, leading to delays in fire detection by sensor networks. This paper proposes a multi-height and heterogeneous fusion discriminant model with a multilayered LSTM structure for the robust detection of weak fire signals in such challenging situations. The model employs three LSTM structures with cross inputs in the first layer and an input-weighted LSTM structure in the second layer to capture the temporal and cross-correlation features of smoke concentration, temperature, and plume velocity sensor data. The third LSTM layer further aggregates these features to extract the spatial correlation patterns among different heights. The experimental results demonstrate that the proposed algorithm can effectively expedite alarm response during sparse smoke conditions and mitigate false alarms caused by weak signals. Full article
(This article belongs to the Special Issue Advances in Mobile Networked Systems)
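As a rough, hypothetical sketch of the data arrangement this abstract describes (not the authors' code): three cross-input streams pair the smoke, temperature, and plume-velocity series for the first layer, and a second stage combines per-stream features with input weights. The LSTM cells themselves are omitted; `weighted_fusion` merely stands in for the input-weighted second layer.

```python
# Illustrative sketch only: cross-input pairing of three sensor series and
# an input-weighted combination of per-stream feature vectors. The real
# model feeds these streams into LSTMs; weights here are placeholders.

def cross_inputs(smoke, temp, velocity):
    """Pair each sensor series with another, mimicking the cross-input
    arrangement of the first LSTM layer."""
    return [
        list(zip(smoke, temp)),      # stream 1: smoke x temperature
        list(zip(temp, velocity)),   # stream 2: temperature x velocity
        list(zip(velocity, smoke)),  # stream 3: velocity x smoke
    ]

def weighted_fusion(features, weights):
    """Normalised input-weighted sum of per-stream feature vectors
    (a stand-in for the second-layer input-weighted fusion)."""
    total = sum(weights)
    norm = [w / total for w in weights]
    fused = [0.0] * len(features[0])
    for w, vec in zip(norm, features):
        for i, v in enumerate(vec):
            fused[i] += w * v
    return fused
```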

27 pages, 10814 KiB  
Article
UPGAN: An Unsupervised Generative Adversarial Network Based on U-Shaped Structure for Pansharpening
by Xin Jin, Yuting Feng, Qian Jiang, Shengfa Miao, Xing Chu, Huangqimei Zheng and Qianqian Wang
ISPRS Int. J. Geo-Inf. 2024, 13(7), 222; https://doi.org/10.3390/ijgi13070222 - 26 Jun 2024
Viewed by 278
Abstract
Pansharpening is the fusion of panchromatic images and multispectral images to obtain images with high spatial resolution and high spectral resolution, which have a wide range of applications. At present, methods based on deep learning can fit the nonlinear features of images and achieve excellent image quality; however, the images generated with supervised learning approaches lack real-world applicability. Therefore, in this study, we propose an unsupervised pansharpening method based on a generative adversarial network. Considering the fine tubular structures in remote sensing images, a dense connection attention module is designed based on dynamic snake convolution to recover the details of spatial information. In the stage of image fusion, the fusion of features in groups is applied through the cross-scale attention fusion module. Moreover, skip layers are implemented at different scales to integrate significant information, thus improving the objective index values and visual appearance. The loss function contains four constraints, allowing the model to be effectively trained without reference images. The experimental results demonstrate that the proposed method outperforms other widely accepted state-of-the-art methods on the QuickBird and WorldView2 data sets. Full article
(This article belongs to the Special Issue Advances in AI-Driven Geospatial Analysis and Data Generation)
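A toy illustration of reference-free pansharpening losses of the kind the abstract describes. The paper's loss has four constraints; only two representative terms (spectral and spatial) are sketched here, with invented weights and tiny 2D lists in place of images.

```python
# Hedged sketch: how an unsupervised pansharpening loss can be trained
# without a reference image. `fused`, `ms`, `pan` are small 2D lists.

def mean(xs):
    return sum(xs) / len(xs)

def grad_x(img):
    """Horizontal finite differences, a crude spatial-detail proxy."""
    return [[row[i + 1] - row[i] for i in range(len(row) - 1)] for row in img]

def spectral_term(fused, ms):
    """Penalise drift of overall intensity from the multispectral input."""
    f = mean([v for row in fused for v in row])
    m = mean([v for row in ms for v in row])
    return abs(f - m)

def spatial_term(fused, pan):
    """Penalise mismatch between fused and panchromatic gradients."""
    gf = [v for row in grad_x(fused) for v in row]
    gp = [v for row in grad_x(pan) for v in row]
    return mean([abs(a - b) for a, b in zip(gf, gp)])

def unsupervised_loss(fused, ms, pan, w_spec=1.0, w_spat=1.0):
    return w_spec * spectral_term(fused, ms) + w_spat * spatial_term(fused, pan)
```

A perfectly consistent fused image drives both terms to zero, which is the sense in which the constraints replace a reference image.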

17 pages, 48171 KiB  
Article
S5Utis: Structured State-Space Sequence SegNeXt UNet-like Tongue Image Segmentation in Traditional Chinese Medicine
by Donglei Song, Hongda Zhang, Lida Shi, Hao Xu and Ying Xu
Sensors 2024, 24(13), 4046; https://doi.org/10.3390/s24134046 - 21 Jun 2024
Viewed by 316
Abstract
Intelligent Traditional Chinese Medicine can provide people with a convenient way to participate in daily health care. The ease of acceptance of Traditional Chinese Medicine is also a major advantage in promoting health management. In Traditional Chinese Medicine, tongue imaging is an important step in the examination process. The segmentation and processing of the tongue image directly affects the results of intelligent Traditional Chinese Medicine diagnosis. As intelligent Traditional Chinese Medicine continues to develop, remote diagnosis and patient participation will play important roles. Smartphone sensor cameras can provide irreplaceable data collection capabilities for enhancing interaction in smart Traditional Chinese Medicine. However, such real-world collection leads to differences in the size and quality of the captured images, owing to factors such as differences in shooting equipment, the professionalism of the photographer, and the subject’s cooperation. Most current tongue image segmentation algorithms are based on data collected by professional tongue diagnosis instruments in standard environments, and cannot achieve the same segmentation performance in complex environments. Therefore, we propose a segmentation algorithm for tongue images collected in complex multi-device and multi-user environments. We use convolutional attention and extend state space models to the 2D environment in the encoder. Then, cross-layer connection fusion is used in the decoder part to fuse shallow texture and deep semantic features. Through segmentation experiments on tongue image datasets collected by patients and doctors in real-world settings, our algorithm significantly improves segmentation performance and accuracy. Full article
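The decoder-side cross-layer connection fusion can be caricatured as follows. This is an illustrative sketch, not the paper's module: a deep, low-resolution semantic map is upsampled and stacked with a shallow, high-resolution texture map; real models fuse learned tensors.

```python
# Minimal single-channel sketch of cross-layer fusion in a decoder:
# upsample the deep map to the shallow map's resolution, then stack.

def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2D list."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in (0, 1)]
        out.append(wide)
        out.append(list(wide))
    return out

def cross_layer_fuse(shallow, deep):
    """Stack the shallow map with the upsampled deep map channel-wise."""
    up = upsample2x(deep)
    assert len(up) == len(shallow) and len(up[0]) == len(shallow[0])
    return [shallow, up]  # two "channels"; a conv would mix them next
```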

21 pages, 29925 KiB  
Article
The Impact of Multiple Thermal Cycles Using CMT® on Microstructure Evolution in WAAM of Thin Walls Made of AlMg5
by Vinicius Lemes Jorge, Felipe Ribeiro Teixeira, Sten Wessman, Americo Scotti and Sergio Luiz Henke
Metals 2024, 14(6), 717; https://doi.org/10.3390/met14060717 - 17 Jun 2024
Viewed by 469
Abstract
Wire Arc Additive Manufacturing (WAAM) of thin walls is an adequate technology for producing functional components made with aluminium alloys. The AlMg5 family is one of the most applicable alloys for WAAM. However, WAAM differs from traditional fabrication routes by imposing multiple thermal cycles on the material, leading the alloy to undergo cyclic thermal treatments. Depending on the heat source used, thermal fluctuation can also impact the microstructure of the builds and, consequently, the mechanical properties. No known publications discuss the effects of these two WAAM characteristics on the built microstructure. To study the influence of multiple thermal cycles and heat source-related thermal fluctuations, a thin wall was built using CMT-WAAM on a laboratory scale. Cross-sections of the wall were metallographically analysed, at the centre of a layer that was re-treated, and a region at the transition between two layers. The focus was the solidification modes and solubilisation and precipitations of secondary phases. Samples from the wall were post-heat treated in-furnace with different soaking temperatures and cooling, to support the results. Using numerical simulations, the progressive thermal cycles acting on the HAZ of one layer were simplified by a temperature sequence with a range of peak temperatures. The results showed that different zones are formed along the layers, either as a result of the imposed thermal cycling or the solidification mode resulting from CMT-WAAM deposition. In the zones, a band composed of coarse dendrites and an interdendritic phase and another band formed by alternating sizes of cells coexisted with the fusion and heat-affected zones. The numerical simulation revealed that the thermal cycling did not significantly promote the precipitation of second-phase particles. Full article
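To make the simplification concrete, here is a heavily idealised, hypothetical temperature-sequence generator in the spirit of the simulation described: each successive layer subjects the HAZ to a cycle with a lower peak, modelled as linear heating and exponential cooling toward an interpass temperature. All constants are invented for illustration, not calibrated to CMT-WAAM.

```python
import math

def thermal_cycles(peaks, interpass=100.0, heat_steps=5, cool_steps=20, tau=5.0):
    """Return a flat temperature-vs-step sequence for a list of peak
    temperatures, one heat/cool cycle per deposited layer (toy model)."""
    seq = []
    for peak in peaks:
        # linear ramp from the interpass temperature up to the peak
        for i in range(1, heat_steps + 1):
            seq.append(interpass + (peak - interpass) * i / heat_steps)
        # exponential decay back toward the interpass temperature
        for i in range(1, cool_steps + 1):
            seq.append(interpass + (peak - interpass) * math.exp(-i / tau))
    return seq
```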

21 pages, 1816 KiB  
Article
Improving Polyp Segmentation with Boundary-Assisted Guidance and Cross-Scale Interaction Fusion Transformer Network
by Lincen Jiang, Yan Hui, Yuan Fei, Yimu Ji and Tao Zeng
Processes 2024, 12(5), 1030; https://doi.org/10.3390/pr12051030 - 19 May 2024
Viewed by 450
Abstract
Efficient and precise colorectal polyp segmentation has significant implications for screening colorectal polyps. Although network variants derived from the Transformer network have high accuracy in segmenting colorectal polyps with complex shapes, they have two main shortcomings: (1) multi-level semantic information at the output of the encoder may result in information loss during the fusion process and (2) failure to adequately suppress background noise during segmentation. To address these challenges, we propose a cross-scale interaction fusion transformer for polyp segmentation (CIFFormer). Firstly, a novel feature supplement module (FSM) supplements the missing details and explores potential features to enhance the feature representations. Additionally, to mitigate the interference of background noise, we designed a cross-scale interactive fusion module (CIFM) that combines feature information between different layers to obtain more multi-scale and discriminative representative features. Furthermore, a boundary-assisted guidance module (BGM) is proposed to help the segmentation network obtain boundary-enhanced details. Extensive experiments on five typical datasets have demonstrated that CIFFormer has an obvious advantage in segmenting polyps. Specifically, CIFFormer achieved an mDice of 0.925 and an mIoU of 0.875 on the Kvasir-SEG dataset, achieving superior segmentation accuracy to competing methods. Full article

29 pages, 5473 KiB  
Article
Optimal Channel Selection of Multiclass Motor Imagery Classification Based on Fusion Convolutional Neural Network with Attention Blocks
by Joharah Khabti, Saad AlAhmadi and Adel Soudani
Sensors 2024, 24(10), 3168; https://doi.org/10.3390/s24103168 - 16 May 2024
Viewed by 456
Abstract
The widely adopted paradigm in brain–computer interfaces (BCIs) involves motor imagery (MI), enabling improved communication between humans and machines. EEG signals derived from MI present several challenges due to their inherent characteristics, which lead to a complex process of classifying and finding the potential tasks of a specific participant. Another issue is that BCI systems can result in noisy data and redundant channels, which in turn can lead to increased equipment and computational costs. To address these problems, the optimal channel selection of a multiclass MI classification based on a Fusion convolutional neural network with Attention blocks (FCNNA) is proposed. In this study, we developed a CNN model consisting of layers of convolutional blocks with multiple spatial and temporal filters. These filters are designed specifically to capture the distribution and relationships of signal features across different electrode locations, as well as to analyze the evolution of these features over time. Following these layers, a Convolutional Block Attention Module (CBAM) is used to further enhance EEG signal feature extraction. In the process of channel selection, the genetic algorithm is used to select the optimal set of channels using a new technique to deliver fixed as well as variable channels for all participants. The proposed methodology is validated, showing a 6.41% improvement in multiclass classification compared to most baseline models. Notably, we achieved the highest results of 93.09% for binary classes involving left-hand and right-hand movements. In addition, the cross-subject strategy for multiclass classification yielded an impressive accuracy of 68.87%. Following channel selection, multiclass classification accuracy was enhanced, reaching 84.53%. Overall, our experiments illustrated the efficiency of the proposed EEG MI model in both channel selection and classification, showing superior results with either a full channel set or a reduced number of channels. Full article
(This article belongs to the Special Issue EEG Signal Processing Techniques and Applications—2nd Edition)
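A self-contained toy version of genetic-algorithm channel selection. The paper's fitness is classifier accuracy; a synthetic fitness that rewards a known "good" subset is substituted here so the sketch runs standalone, and all GA hyperparameters are arbitrary small values.

```python
import random

def fitness(mask, good={1, 3, 5}):
    """Synthetic fitness: reward good channels, penalise extra ones."""
    chosen = {i for i, b in enumerate(mask) if b}
    return len(chosen & good) - 0.2 * len(chosen - good)

def select_channels(n_channels=8, pop_size=20, generations=40, seed=0):
    """Evolve a bitmask over channels: elitist selection, one-point
    crossover, and a low-probability single-bit mutation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_channels)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]          # keep the best half
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_channels)    # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.1:                # occasional mutation
                i = rng.randrange(n_channels)
                child[i] = 1 - child[i]
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

best = select_channels()
```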

32 pages, 15835 KiB  
Article
Research on Bidirectional Multi-Span Feature Pyramid and Key Feature Capture Object Detection Network
by Heng Zhang, Faming Shao, Xiaohui He, Dewei Zhao, Zihan Zhang and Tao Zhang
Drones 2024, 8(5), 189; https://doi.org/10.3390/drones8050189 - 9 May 2024
Viewed by 898
Abstract
UAV remote sensing (RS) image object detection is a very valuable and challenging technology. This article discusses the importance of key features and proposes an object detection network (URSNet) based on a bidirectional multi-span feature pyramid and a key feature capture mechanism. Firstly, a bidirectional multi-span feature pyramid (BMSFPN) is constructed. In the process of bidirectional sampling, bicubic interpolation and cross-layer fusion are used to filter out image noise and enhance the details of object features. Secondly, the designed feature polarization module (FPM) uses an internal polarization attention mechanism to build a powerful feature representation for the classification and regression tasks, making it easier for the network to capture key object features with more semantic discrimination. In addition, the anchor rotation alignment module (ARAM) further refines the preset anchor boxes based on the key regression features extracted by the FPM, yielding high-quality rotated anchors with a high matching degree and rich positioning information. Finally, the dynamic anchor optimization module (DAOM) improves the model's feature alignment and its discrimination of positive and negative samples, allowing the model to dynamically select candidate anchors that capture the key regression features and further eliminate the deviation between classification and regression. URSNet was evaluated in comprehensive ablation and SOTA comparison experiments on challenging RS datasets such as DOTA-V2.0, DIOR and RSOD. The optimal experimental results (87.19% mAP, 108.2 FPS) show that URSNet delivers efficient and reliable detection performance. Full article

25 pages, 4470 KiB  
Article
Multi-Scale Fusion Siamese Network Based on Three-Branch Attention Mechanism for High-Resolution Remote Sensing Image Change Detection
by Yan Li, Liguo Weng, Min Xia, Kai Hu and Haifeng Lin
Remote Sens. 2024, 16(10), 1665; https://doi.org/10.3390/rs16101665 - 8 May 2024
Viewed by 749
Abstract
Remote sensing image change detection (CD) is an important means in remote sensing data analysis tasks, which can help us understand the surface changes in high-resolution (HR) remote sensing images. Traditional pixel-based and object-based methods are only suitable for low- and medium-resolution images, and are still challenging for complex texture features and detailed image detail processing in HR images. At present, the method based on deep learning has problems such as inconsistent fusion and difficult model training in the combination of the difference feature information of the deep and shallow layers and the attention mechanism, which leads to errors in the distinction between the changing region and the invariant region, edge detection and small target detection. In order to solve the above problems of inconsistent fusions of feature information aggregation and attention mechanisms, and indistinguishable change areas, we propose a multi-scale feature fusion Siamese network based on attention mechanism (ABMFNet). To tackle the issues of inconsistent fusion and alignment difficulties when integrating multi-scale fusion and attention mechanisms, we introduce the attention-based multi-scale feature fusion module (AMFFM). This module not only addresses insufficient feature fusion and connection between different-scale feature layers, but also enables the model to automatically learn and prioritize important features or regions in the image. Additionally, we design the cross-scale fusion module (CFM) and the difference feature enhancement pyramid structure (DEFPN) to assist the AMFFM module in integrating differential information effectively. These modules bridge the spatial disparity between low-level and high-level features, ensuring efficient connection and fusion of spatial difference information. Furthermore, we enhance the representation and inference speed of the feature pyramid by incorporating a feature enhancement module (FEM) into DEFPN. 
Finally, the BICD dataset proposed by the laboratory and the public datasets LEVIR-CD and BCDD are used for comparison and testing, with the F1 score and MIoU as evaluation metrics. For ABMFNet, the F1 scores on the three datasets are 77.69%, 81.57%, and 77.91%, respectively, while the MIoU values are 84.65%, 85.84%, and 84.54%, respectively. The experimental results show that ABMFNet has better effectiveness and robustness. Full article
(This article belongs to the Section Environmental Remote Sensing)

19 pages, 3281 KiB  
Article
An Integrated Gather-and-Distribute Mechanism and Attention-Enhanced Deformable Convolution Model for Pig Behavior Recognition
by Rui Mao, Dongzhen Shen, Ruiqi Wang, Yiming Cui, Yufan Hu, Mei Li and Meili Wang
Animals 2024, 14(9), 1316; https://doi.org/10.3390/ani14091316 - 27 Apr 2024
Viewed by 676
Abstract
The behavior of pigs is intricately tied to their health status, highlighting the critical importance of accurately recognizing pig behavior, particularly abnormal behavior, for effective health monitoring and management. This study addresses the challenge of accommodating frequent non-rigid deformations in pig behavior using deformable convolutional networks (DCN) to extract more comprehensive features by incorporating offsets during training. To overcome the inherent limitations of traditional DCN offset weight calculations, the study introduces the multi-path coordinate attention (MPCA) mechanism to enhance the optimization of the DCN offset weight calculation within the designed DCN-MPCA module, further integrated into the cross-scale cross-feature (C2f) module of the backbone network. This optimized C2f-DM module significantly enhances feature extraction capabilities. Additionally, a gather-and-distribute (GD) mechanism is employed in the neck to improve non-adjacent layer feature fusion in the YOLOv8 network. Consequently, the novel DM-GD-YOLO model proposed in this study is evaluated on a self-built dataset comprising 11,999 images obtained from an online monitoring platform focusing on pigs aged between 70 and 150 days. The results show that DM-GD-YOLO can simultaneously recognize four common behaviors and three abnormal behaviors, achieving a precision of 88.2%, recall of 92.2%, and mean average precision (mAP) of 95.3% with 6.0MB Parameters and 10.0G FLOPs. Overall, the model outperforms popular models such as Faster R-CNN, EfficientDet, YOLOv7, and YOLOv8 in monitoring pens with about 30 pigs, providing technical support for the intelligent management and welfare-focused breeding of pigs while advancing the transformation and modernization of the pig industry. Full article

18 pages, 5494 KiB  
Article
Hierarchical Semantic-Guided Contextual Structure-Aware Network for Spectral Satellite Image Dehazing
by Lei Yang, Jianzhong Cao, Hua Wang, Sen Dong and Hailong Ning
Remote Sens. 2024, 16(9), 1525; https://doi.org/10.3390/rs16091525 - 25 Apr 2024
Viewed by 454
Abstract
Haze or cloud always shrouds satellite images, obscuring valuable geographic information for military surveillance, natural calamity surveillance and mineral resource exploration. Satellite image dehazing (SID) provides the possibility for better applications of satellite images. Most of the existing dehazing methods are tailored for natural images and are not very effective for satellite images with non-homogeneous haze since the semantic structure information and inconsistent attenuation are not fully considered. To tackle this problem, this study proposes a hierarchical semantic-guided contextual structure-aware network (SCSNet) for spectral satellite image dehazing. Specifically, a hybrid CNN–Transformer architecture integrated with a hierarchical semantic guidance (HSG) module is presented to learn semantic structure information by synergetically complementing local representation from non-local features. Furthermore, a cross-layer fusion (CLF) module is specially designed to replace the traditional skip connection during the feature decoding stage so as to reinforce the attention to the spatial regions and feature channels with more serious attenuation. The results on the SateHaze1k, RS-Haze, and RSID datasets demonstrated that the proposed SCSNet can achieve effective dehazing and outperforms existing state-of-the-art methods. Full article
(This article belongs to the Special Issue Remote Sensing Cross-Modal Research: Algorithms and Practices)

23 pages, 4882 KiB  
Article
USES-Net: An Infrared Dim and Small Target Detection Network with Embedded Knowledge Priors
by Lingxiao Li, Linlin Liu, Yunan He and Zhuqiang Zhong
Electronics 2024, 13(7), 1400; https://doi.org/10.3390/electronics13071400 - 8 Apr 2024
Viewed by 638
Abstract
Detecting and identifying small infrared targets has always been a crucial technology for many applications. To address the low accuracy, high false-alarm rate, and poor environmental adaptability that commonly exist in infrared target detection methods, this paper proposes a composite infrared dim and small target detection model called USES-Net, which combines the target prior knowledge and conventional data-driven deep learning networks to make use of both labeled data and the domain knowledge. Based on the typical encoder–decoder structure, USES-Net firstly introduces the self-attention mechanism of Swin Transformer to replace the universal convolution kernel at the encoder end. This helps to extract potential features related to dim, small targets in a larger receptive field. In addition, USES-Net includes an embedded patch-based contrast learning module (EPCLM) to integrate the spatial distribution of the target as a knowledge prior in the training network model. This guides the training process of the constrained network model with clear physical interpretability. Finally, USES-Net also designs a bottom-up cross-layer feature fusion module (AFM) as the decoder of the network, and a data-slicing-aided enhancement and inference method based on Slicing Aided Hyper Inference (SAHI) is utilized to further improve the model’s detection accuracy. An experimental comparative analysis shows that USES-Net achieves the best results on three typical infrared weak-target datasets: NUAA-SIRST, NUDT-SIRST, and IRSTD-1K. The results of the target segmentation are complete and sufficient, which demonstrates the validity and practicality of the proposed method in comparison to others. Full article
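The slicing step behind SAHI can be sketched as follows: the image is cut into overlapping tiles so that tiny targets occupy a larger share of each inference window. Tile and overlap sizes are illustrative, and the real library also merges per-tile predictions, which is omitted here.

```python
# Illustrative SAHI-style tiling: return overlapping (x0, y0, x1, y1)
# windows that cover the whole image; each would be run through the
# detector separately before merging results.

def slice_boxes(width, height, tile=256, overlap=64):
    step = tile - overlap
    boxes = []
    for y in range(0, max(height - overlap, 1), step):
        for x in range(0, max(width - overlap, 1), step):
            boxes.append((x, y, min(x + tile, width), min(y + tile, height)))
    return boxes
```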

15 pages, 7222 KiB  
Article
Insulator Defect Detection Based on YOLOv8s-SwinT
by Zhendong He, Wenbin Yang, Yanjie Liu, Anping Zheng, Jie Liu, Taishan Lou and Jie Zhang
Information 2024, 15(4), 206; https://doi.org/10.3390/info15040206 - 6 Apr 2024
Viewed by 1238
Abstract
Ensuring the safety of transmission lines necessitates effective insulator defect detection. Traditional methods often lack the required efficiency and accuracy, particularly for tiny defects. This paper proposes an innovative insulator defect recognition method leveraging YOLOv8s-SwinT. Combining the Swin Transformer with a Convolutional Neural Network (CNN) enhances the model’s understanding of multi-scale global semantic information through cross-layer interactions. The improved BiFPN structure in the neck achieves bidirectional cross-scale connections and weighted feature fusion during feature extraction. Additionally, a new small-target detection layer enhances the capability to detect tiny defects. The experimental results showcase outstanding performance, with precision, recall, and mAP reaching 95.6%, 95.3%, and 97.7%, respectively. This boosts detection efficiency and ensures high accuracy, providing robust support for real-time detection of tiny insulator defects. Full article
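BiFPN's weighted feature fusion follows the "fast normalized" form w_i / (Σ_j w_j + ε), with weights clamped to be non-negative. A minimal sketch, with flat lists standing in for feature maps and placeholder weights:

```python
# Fast normalized fusion as popularised by BiFPN: each input map gets a
# learnable non-negative weight, normalised by the sum plus a small eps.

def bifpn_fuse(maps, weights, eps=1e-4):
    relu_w = [max(0.0, w) for w in weights]   # keep weights non-negative
    total = sum(relu_w) + eps
    fused = [0.0] * len(maps[0])
    for w, fmap in zip(relu_w, maps):
        for i, v in enumerate(fmap):
            fused[i] += (w / total) * v
    return fused
```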

25 pages, 4894 KiB  
Article
A Spectral–Spatial Context-Boosted Network for Semantic Segmentation of Remote Sensing Images
by Xin Li, Xi Yong, Tao Li, Yao Tong, Hongmin Gao, Xinyuan Wang, Zhennan Xu, Yiwei Fang, Qian You and Xin Lyu
Remote Sens. 2024, 16(7), 1214; https://doi.org/10.3390/rs16071214 - 29 Mar 2024
Cited by 1 | Viewed by 698
Abstract
Semantic segmentation of remote sensing images (RSIs) is pivotal for numerous applications in urban planning, agricultural monitoring, and environmental conservation. However, traditional approaches have primarily emphasized learning within the spatial domain, which frequently leads to less than optimal discrimination of features. Considering the inherent spectral qualities of RSIs, it is essential to bolster these representations by incorporating the spectral context in conjunction with spatial information to improve discriminative capacity. In this paper, we introduce the spectral–spatial context-boosted network (SSCBNet), an innovative network designed to enhance the accuracy of semantic segmentation in RSIs. SSCBNet integrates synergetic attention (SYA) layers and cross-fusion modules (CFMs) to harness both spectral and spatial information, addressing the intrinsic complexities of urban and natural landscapes within RSIs. Extensive experiments on the ISPRS Potsdam and LoveDA datasets reveal that SSCBNet surpasses existing state-of-the-art models, achieving remarkable results in F1-scores, overall accuracy (OA), and mean intersection over union (mIoU). Ablation studies confirm the significant contribution of SYA layers and CFMs to the model’s performance, emphasizing the effectiveness of these components in capturing detailed contextual cues. Full article

17 pages, 4093 KiB  
Article
Sea-Surface Small Target Detection Based on Improved Markov Transition Fields
by Ru Ye, Hongyan Xing and Xing Zhou
J. Mar. Sci. Eng. 2024, 12(4), 582; https://doi.org/10.3390/jmse12040582 - 29 Mar 2024
Cited by 1 | Viewed by 571
Abstract
Addressing the limitations of manually extracting features from small maritime target signals, this paper explores Markov transition fields and convolutional neural networks, proposing a detection method for small targets based on an improved Markov transition field. Initially, the raw data undergo a Fourier transform, feature fusion is performed on the series, and a spectrogram is generated using Markov transition fields to extract radar data features from both the time domain and frequency domain, providing a more comprehensive data representation for the detector. Then, the InceptionResnetV2 network is employed as a classifier, setting decision thresholds based on the softmax layer’s output, thus achieving controllable false alarms in the detection of small maritime targets. Additionally, transfer learning is introduced to address the issue of sample imbalance. The IPIX dataset is used for experimental verification. The experimental results show that the proposed detection method can deeply mine the differences between targets and the maritime clutter background, demonstrating superior detection performance. When the observation time is set to 1.024 s, the IMIRV2 detector performs best. Cross-validation with different data preprocessing methods and classification models reveals a significant advantage in the performance of the IMIRV2 detector, especially at low signal-to-noise ratios. Finally, a comparison with the performance of existing detectors indicates that the proposed method offers certain improvements. Full article
(This article belongs to the Section Ocean Engineering)
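The Markov transition field itself is simple to state: quantise the series into Q bins, estimate a bin-to-bin transition matrix from consecutive samples, and set M[i][j] to the transition probability from the bin of x_i to the bin of x_j. A minimal sketch (Q and the input are toy choices; the paper additionally fuses time- and frequency-domain features before classification):

```python
# Minimal Markov transition field over a 1D series using equal-width bins.

def markov_transition_field(series, n_bins=4):
    lo, hi = min(series), max(series)
    width = (hi - lo) / n_bins or 1.0            # guard constant series
    bins = [min(int((v - lo) / width), n_bins - 1) for v in series]
    counts = [[0.0] * n_bins for _ in range(n_bins)]
    for a, b in zip(bins, bins[1:]):             # count consecutive moves
        counts[a][b] += 1.0
    for row in counts:                            # row-normalise to probs
        s = sum(row)
        if s:
            row[:] = [c / s for c in row]
    n = len(series)
    return [[counts[bins[i]][bins[j]] for j in range(n)] for i in range(n)]
```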

19 pages, 4737 KiB  
Article
SEB-YOLO: An Improved YOLOv5 Model for Remote Sensing Small Target Detection
by Yan Hui, Shijie You, Xiuhua Hu, Panpan Yang and Jing Zhao
Sensors 2024, 24(7), 2193; https://doi.org/10.3390/s24072193 - 29 Mar 2024
Cited by 1 | Viewed by 1281
Abstract
The limited semantic information of small objects and the difficulty of distinguishing similar targets pose great challenges to target detection in remote sensing scenarios and result in poor detection performance. This paper proposes an improved YOLOv5 remote sensing image target detection algorithm, SEB-YOLO (SPD-Conv + ECSPP + Bi-FPN + YOLOv5). Firstly, the space-to-depth (SPD) layer followed by a non-strided convolution (Conv) layer module (SPD-Conv) was used to reconstruct the backbone network, which retained the global features and reduced the feature loss. Meanwhile, the pooling module with the attention mechanism of the final layer of the backbone network was designed to help the network better identify and locate the target. Furthermore, a bidirectional feature pyramid network (Bi-FPN) with bilinear interpolation upsampling was added to improve bidirectional cross-scale connection and weighted feature fusion. Finally, the decoupled head is introduced to enhance the model convergence and solve the contradiction between the classification task and the regression task. Experimental results on the NWPU VHR-10 and RSOD datasets show that the mAP of the proposed algorithm reaches 93.5% and 93.9%, respectively, which is 4.0% and 5.3% higher than that of the original YOLOv5l algorithm. The proposed algorithm achieves better detection results for complex remote sensing images. Full article
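The space-to-depth step behind SPD-Conv is easy to sketch: an even-sized (H, W) map becomes four (H/2, W/2) channels by sampling every second pixel, so resolution is traded for channels without discarding information (unlike strided convolution). A minimal single-channel version with 2D lists standing in for tensors:

```python
# Space-to-depth rearrangement: split a 2D map into four half-resolution
# "channels", one per (row offset, column offset) pair.

def space_to_depth(fmap):
    h, w = len(fmap), len(fmap[0])
    assert h % 2 == 0 and w % 2 == 0, "expects even spatial dims"
    channels = []
    for dy in (0, 1):
        for dx in (0, 1):
            channels.append([[fmap[y][x] for x in range(dx, w, 2)]
                             for y in range(dy, h, 2)])
    return channels
```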
