
Search Results (15,188)

Search Parameters:
Keywords = deep neural network

18 pages, 23512 KiB  
Article
Design of a Tunnel Anchor Monitoring System Based on Long Short-Term Memory–Autoregressive Integrated Moving Average Prediction
by Junyan Qi, Yuhao Che, Lei Wang and Ruifu Yuan
Electronics 2024, 13(14), 2840; https://doi.org/10.3390/electronics13142840 - 19 Jul 2024
Abstract
Considering the shortcomings of current monitoring systems for tunnel anchor support, a tunnel anchor monitoring system based on LSTM-ARIMA prediction is proposed in this paper to prevent the deformation and collapse accidents that may occur in underground mine tunnels during the backfilling process. The system combines the Internet of Things with a deep learning algorithm to achieve real-time monitoring and prediction of the tunnel anchor pressure. To improve the prediction accuracy, a time series analysis algorithm is used in the prediction model of this system. In particular, an LSTM-ARIMA model is constructed to predict the tunnel anchor pressure by combining the Long Short-Term Memory (LSTM) model and the Autoregressive Integrated Moving Average (ARIMA) model, and a dynamic weighted combination method based on model prediction confidence is designed to acquire the optimal weight coefficients. This combined model enables the monitoring system to predict the anchor pressure more accurately, thereby preventing possible tunnel deformation and collapse accidents in advance. Finally, the overall system is verified using the anchor pressure dataset obtained from the 21,404 section of the Hulusu Coal Mine transportation tunnel in a real-world engineering setting. The results show that the pressure values predicted by the combined model closely match the actual values measured on site, and that the system offers high real-time performance and stability, demonstrating its effectiveness and reliability.
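
The abstract does not spell out the confidence-based weighting, so the sketch below is only one plausible instance: each model's weight is taken inversely proportional to its recent prediction error, a common proxy for confidence. The function name and error inputs are illustrative, not the paper's.

```python
import numpy as np

def combine_forecasts(lstm_pred, arima_pred, lstm_err, arima_err, eps=1e-8):
    """Blend two forecasts with weights inversely proportional to each
    model's recent error (a stand-in for 'prediction confidence')."""
    w_lstm, w_arima = 1.0 / (lstm_err + eps), 1.0 / (arima_err + eps)
    return (w_lstm * lstm_pred + w_arima * arima_pred) / (w_lstm + w_arima)

# Toy usage: the model with the lower recent RMSE dominates the blend.
lstm_pred = np.array([10.2, 10.4])    # anchor-pressure forecasts (arbitrary units)
arima_pred = np.array([9.8, 10.0])
print(combine_forecasts(lstm_pred, arima_pred, lstm_err=0.5, arima_err=1.5))
```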

27 pages, 2251 KiB  
Article
Threshold Active Learning Approach for Physical Violence Detection on Images Obtained from Video (Frame-Level) Using Pre-Trained Deep Learning Neural Network Models
by Itzel M. Abundez, Roberto Alejo, Francisco Primero Primero, Everardo E. Granda-Gutiérrez, Otniel Portillo-Rodríguez and Juan Alberto Antonio Velázquez
Algorithms 2024, 17(7), 316; https://doi.org/10.3390/a17070316 - 18 Jul 2024
Abstract
Public authorities and private companies have used video cameras as part of surveillance systems, and one of their objectives is the rapid detection of physically violent actions. This task is usually performed by human visual inspection, which is labor-intensive. For this reason, different deep learning models have been implemented to remove the human eye from this task, yielding positive results. One of the main problems in detecting physical violence in videos is the variety of possible scenarios, which leads to models being trained on narrow datasets and therefore detecting physical violence in only one or a few types of videos. In this work, we present an approach for physical violence detection on images obtained from video (at the frame level) based on threshold active learning, which increases the classifier's robustness in environments where it was not trained. The proposed approach consists of two stages. In the first stage, pre-trained neural network models are trained on initial datasets, and a threshold (μ) is used to identify images that the classifier considers ambiguous or hard to classify; these are added to the training dataset, and the model is retrained to improve its classification performance. In the second stage, the model is tested on video images from other environments, and μ is again employed to detect ambiguous images, which a human expert analyzes to determine the true class or to resolve the ambiguity. The ambiguous images are then added to the original training set and the classifier is retrained; this process is repeated as long as ambiguous images exist. The resulting model is a hybrid neural network that uses transfer learning and the threshold μ to successfully detect physical violence in images obtained from video files. The main contribution of this active learning process is the method used to obtain the threshold μ (based on the neural network output), which allows human experts to contribute to the classification process, yielding more robust neural networks and higher-quality datasets. The experimental results show the effectiveness of the proposed approach in detecting physical violence: trained on an initial dataset and incrementally augmented with new images, it becomes increasingly robust across diverse environments.
(This article belongs to the Special Issue Machine Learning Algorithms for Image Understanding and Analysis)
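
As a rough illustration of the threshold step, the sketch below flags frames whose top softmax probability falls below a hypothetical μ; in the paper, such frames go to a human expert and are folded back into the training set before retraining. The value of μ and the two-class setup are assumptions.

```python
import numpy as np

def split_by_threshold(probs, mu=0.8):
    """Flag frames the classifier finds ambiguous.

    probs : (n_samples, n_classes) softmax outputs.
    A frame is 'ambiguous' when its top-class probability falls below mu;
    such frames are routed to a human expert, relabelled, and appended to
    the training set (the threshold-active-learning loop).
    """
    confidence = probs.max(axis=1)
    ambiguous = confidence < mu
    return np.where(ambiguous)[0], np.where(~ambiguous)[0]

probs = np.array([[0.95, 0.05], [0.55, 0.45], [0.70, 0.30]])
amb_idx, conf_idx = split_by_threshold(probs, mu=0.8)
print("expert review:", amb_idx, "| auto-labelled:", conf_idx)
```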

20 pages, 5032 KiB  
Article
Enhanced Learning Enriched Features Mechanism Using Deep Convolutional Neural Network for Image Denoising and Super-Resolution
by Iqra Waseem, Muhammad Habib, Eid Rehman, Ruqia Bibi, Rehan Mehmood Yousaf, Muhammad Aslam, Syeda Fizzah Jilani and Muhammad Waqar Younis
Appl. Sci. 2024, 14(14), 6281; https://doi.org/10.3390/app14146281 - 18 Jul 2024
Abstract
Image denoising and super-resolution play vital roles in imaging systems, greatly reducing the preprocessing cost of many AI techniques for object detection, segmentation, and tracking. Although various advancements have been made in this field, further progress is still needed. In this paper, we propose a novel technique, the Enhanced Learning Enriched Features (ELEF) mechanism, which uses a deep convolutional neural network and makes significant improvements over existing techniques. ELEF consists of two major processes: (1) denoising, which removes noise from images; and (2) super-resolution, which improves the clarity and detail of images. Features are learned through a deep CNN rather than traditional algorithms, allowing images to be refined and enhanced more effectively. To capture features effectively, the network architecture adopts Dual Attention Units (DUs) aligned with a Multi-Scale Residual Block (MSRB) for robust feature extraction, working alongside feature-matching Selective Kernel Extraction (SKF). In addition, cases of resolution mismatch are handled explicitly to produce high-quality images. The effectiveness of the ELEF model is reflected in its performance metrics, achieving a Peak Signal-to-Noise Ratio (PSNR) of 42.99 and a Structural Similarity Index (SSIM) of 0.9889, which demonstrates its ability to deliver the desired high-quality image restoration and enhancement.
(This article belongs to the Special Issue Advances in Image Enhancement and Restoration Technology)
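
The abstract names Dual Attention Units but does not define them, so the following PyTorch block is a generic channel-plus-spatial attention sketch in the spirit of CBAM, not ELEF's actual module; channel count and reduction ratio are arbitrary.

```python
import torch
import torch.nn as nn

class DualAttentionUnit(nn.Module):
    """Channel + spatial attention over a feature map (CBAM-style sketch)."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        # Channel attention: squeeze spatial dims, excite per channel.
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        # Spatial attention: a single map telling the net where to look.
        self.spatial = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x):
        x = x * self.channel(x)      # reweight channels
        return x * self.spatial(x)   # reweight locations

feat = torch.randn(1, 64, 32, 32)
print(DualAttentionUnit(64)(feat).shape)  # torch.Size([1, 64, 32, 32])
```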

19 pages, 6138 KiB  
Article
Spectral-Frequency Conversion Derived from Hyperspectral Data Combined with Deep Learning for Estimating Chlorophyll Content in Rice
by Lei Du and Shanjun Luo
Agriculture 2024, 14(7), 1186; https://doi.org/10.3390/agriculture14071186 - 18 Jul 2024
Abstract
As a vital pigment for photosynthesis in rice, chlorophyll content is closely correlated with growth status and photosynthetic capacity. Estimating chlorophyll content allows rice growth to be monitored and facilitates precise field management, such as the application of fertilizers and irrigation. Advances in hyperspectral remote sensing technology have made it possible to estimate chlorophyll content non-destructively, quickly, and effectively, offering technical support for managing and monitoring rice growth across wide areas. Although hyperspectral data have a fine spectral resolution, they also carry a large amount of information redundancy and noise. This study addresses the instability of input variables and the poor cross-period applicability of models for predicting rice chlorophyll content. By introducing harmonic analysis theory and a time-frequency conversion method, a deep neural network (DNN) model framework based on wavelet packet transform-first-order differential-harmonic analysis (WPT-FD-HA) is proposed, which avoids the uncertainty in the calculation of spectral parameters. The accuracy of estimating rice chlorophyll content from WPT-FD and WPT-FD-HA variables was compared at the seedling, tillering, jointing, heading, grain-filling, milk, and complete periods to evaluate the validity and generalizability of the proposed framework. The results demonstrate that all single-period validations of the WPT-FD-HA models achieved coefficients of determination (R2) greater than 0.9 and root mean square error (RMSE) values less than 1. The multi-period validation model reached an RMSE of 1.664 and an R2 of 0.971, and even under independent data-splitting validation the multi-period model still achieved R2 = 0.95 and RMSE = 1.4. The WPT-FD-HA-based deep learning framework thus exhibits strong stability, and the outcome of this study merits application to monitoring rice growth at broad scales using hyperspectral data.
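
A minimal sketch of the WPT-FD-HA feature chain under stated assumptions: wavelet-packet smoothing keeps only the approximation path (one simple denoising choice), the first-order differential is a plain difference, and "harmonic analysis" is represented by low-order Fourier amplitudes and phases. The paper's exact node selection and harmonic definitions may differ.

```python
import numpy as np
import pywt

def wpt_fd_features(spectrum, wavelet="db4", level=3):
    """WPT-FD sketch: wavelet-packet-smooth a reflectance spectrum,
    then take its first-order difference."""
    wp = pywt.WaveletPacket(data=spectrum, wavelet=wavelet,
                            mode="symmetric", maxlevel=level)
    smooth = pywt.WaveletPacket(data=None, wavelet=wavelet,
                                mode="symmetric", maxlevel=level)
    smooth["a" * level] = wp["a" * level].data   # keep approximation path only
    denoised = smooth.reconstruct(update=False)[: len(spectrum)]
    return np.diff(denoised)                     # first-order differential

def harmonic_features(signal, n_harmonics=4):
    """HA sketch: amplitude/phase of the first few Fourier harmonics,
    used here as compact, stable inputs for the DNN."""
    coeffs = np.fft.rfft(signal)[1 : n_harmonics + 1]
    return np.concatenate([np.abs(coeffs), np.angle(coeffs)])

spectrum = np.sin(np.linspace(0, 8 * np.pi, 256)) + 0.05 * np.random.randn(256)
print(harmonic_features(wpt_fd_features(spectrum)).shape)  # (8,)
```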

20 pages, 4732 KiB  
Article
Short-Term Photovoltaic Power Generation Based on MVMD Feature Extraction and Informer Model
by Ruilin Xu, Jianyong Zheng, Fei Mei, Xie Yang, Yue Wu and Heng Zhang
Appl. Sci. 2024, 14(14), 6279; https://doi.org/10.3390/app14146279 - 18 Jul 2024
Abstract
Photovoltaic (PV) power fluctuates with weather changes, and traditional forecasting methods typically decompose the power signal itself to study its characteristics, ignoring the impact of multidimensional weather conditions on the decomposition. This paper therefore proposes a short-term PV power forecasting method based on MVMD (multivariate variational mode decomposition) feature extraction and the Informer model. First, MIC correlation analysis is used to extract the weather features most related to PV power. Next, to describe the relationship between PV power and environmental conditions more comprehensively, MVMD is used for time–frequency synchronous analysis of the PV power time series combined with the most MIC-correlated weather data, obtaining frequency-aligned multivariate intrinsic modes. These modes incorporate multidimensional weather factors into the decomposition-based forecasting method. Finally, to enhance the model's learning capability, the Informer neural network is employed in the prediction phase: based on the input PV IMF time series and associated weather mode components, an Informer prediction model is constructed for training and forecasting, and the predictions of the individual PV IMF modes are superimposed to obtain the total PV power generation. Experiments show that this method improves PV power forecasting accuracy, with a MAPE of 4.31%, and demonstrates good robustness. In terms of computational efficiency, the Informer's sparse attention mechanism for handling long sequences reduces training and prediction times by approximately 15%, making it faster than conventional deep learning models.
(This article belongs to the Section Energy Science and Technology)
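
MVMD and Informer are specialized components, so the sketch below only captures the pipeline's shape: rank weather channels by statistical dependence with PV power (sklearn's mutual-information estimator stands in for MIC, which it is not), then sum per-mode forecasts back into total power. `predict_fn` is a hypothetical per-mode forecaster.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

def rank_weather_features(weather, pv_power, names):
    """Stand-in for the paper's MIC analysis: rank weather channels by
    mutual information with PV power (not MIC itself)."""
    mi = mutual_info_regression(weather, pv_power)
    return sorted(zip(names, mi), key=lambda p: -p[1])

def forecast_by_modes(modes, predict_fn):
    """Decompose-predict-superimpose: one forecast per intrinsic mode
    (the paper trains an Informer per PV IMF), summed to total power."""
    return sum(predict_fn(m) for m in modes)

rng = np.random.default_rng(0)
weather = rng.normal(size=(200, 3))
pv = 2.0 * weather[:, 0] + rng.normal(scale=0.1, size=200)
print(rank_weather_features(weather, pv, ["irradiance", "temp", "humidity"]))

modes = np.stack([np.sin(np.linspace(0, 6, 50)), 0.1 * rng.normal(size=50)])
print(forecast_by_modes(modes, predict_fn=lambda m: m[-1]))  # naive per-mode forecast
```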

17 pages, 7982 KiB  
Article
Deep Dynamic Weights for Underwater Image Restoration
by Hafiz Shakeel Ahmad Awan and Muhammad Tariq Mahmood
J. Mar. Sci. Eng. 2024, 12(7), 1208; https://doi.org/10.3390/jmse12071208 - 18 Jul 2024
Abstract
Underwater imaging presents unique challenges, notably color distortions and reduced contrast due to light attenuation and scattering. Most underwater image enhancement methods first apply a linear transformation for color compensation and then enhance the image. We observed that linear transformation for color compensation is not suitable for certain images; for such images, non-linear mapping is a better choice. This paper introduces a unique underwater image restoration approach leveraging a streamlined convolutional neural network (CNN) to learn dynamic weights for linear and non-linear mapping. In the first phase, a classifier labels each input image as Type I or Type II. In the second phase, we use the Deep Line Model (DLM) for Type-I images and the Deep Curve Model (DCM) for Type-II images. To map an input image to an output image, the DLM combines color compensation and contrast adjustment in a single step, using deep lines for the transformation, whereas the DCM employs higher-order curves. Both models use lightweight neural networks that learn per-pixel dynamic weights based on the input image's characteristics. Comprehensive evaluations on benchmark datasets using metrics such as peak signal-to-noise ratio (PSNR) and root mean square error (RMSE) confirm our method's effectiveness in accurately restoring underwater images, outperforming existing techniques.
(This article belongs to the Special Issue Application of Deep Learning in Underwater Image Processing)
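
To make the line-versus-curve idea concrete, here is a numpy sketch under assumptions: the "deep line" is modeled as a per-pixel affine map and the "deep curve" as an iterated quadratic (as in curve-estimation enhancers); the per-pixel weights, produced by lightweight CNNs in the paper, are fixed placeholders here.

```python
import numpy as np

def deep_line(img, w, b):
    """DLM-style mapping: per-pixel line y = w*x + b folds color
    compensation and contrast adjustment into a single step."""
    return np.clip(w * img + b, 0.0, 1.0)

def deep_curve(img, alpha, iterations=4):
    """DCM-style mapping: iterating y = x + a*x*(1-x) yields an
    effectively higher-order curve."""
    out = img
    for _ in range(iterations):
        out = np.clip(out + alpha * out * (1.0 - out), 0.0, 1.0)
    return out

img = np.random.rand(4, 4, 3)      # toy underwater frame in [0, 1]
w = np.full_like(img, 1.2)         # in the paper, these per-pixel weights
b = np.full_like(img, -0.05)       # come from lightweight CNNs
print(deep_line(img, w, b).shape,
      deep_curve(img, np.full_like(img, 0.3)).shape)
```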

19 pages, 12648 KiB  
Article
A Reliability Quantification Method for Deep Reinforcement Learning-Based Control
by Hitoshi Yoshioka and Hirotada Hashimoto
Algorithms 2024, 17(7), 314; https://doi.org/10.3390/a17070314 - 18 Jul 2024
Abstract
Reliability quantification of deep reinforcement learning (DRL)-based control is a significant challenge for the practical application of artificial intelligence (AI) in safety-critical systems. This study proposes a method for quantifying the reliability of DRL-based control. First, an existing method, random network distillation, was applied to reliability evaluation to clarify the issues to be solved. Second, a novel reliability quantification method was proposed to address these issues. The reliability is quantified using two neural networks, a reference network and an evaluator network, which share the same structure and the same initial parameters, so their outputs are identical before training. During training, the evaluator's parameters are updated to maximize the difference between the reference and evaluator networks on trained data. The reliability of the DRL-based control for a given state can therefore be evaluated from the difference in output between the two networks. The proposed method was applied to DRL-based controls on a simple example task, and its effectiveness was demonstrated. Finally, the method was applied to the problem of switching trained models depending on the state: the performance of the DRL-based control improved when trained models were switched according to their reliability.
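
A compact PyTorch sketch of the reference/evaluator idea as described: both networks start identical, the reference stays frozen, and the evaluator is pushed away from it on trained states, so the output gap later acts as a familiarity (reliability) score. Network sizes, optimizer, and step counts are arbitrary choices.

```python
import copy
import torch
import torch.nn as nn

def make_net():
    return nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 8))

reference = make_net()
evaluator = copy.deepcopy(reference)     # same structure, same initial weights
for p in reference.parameters():
    p.requires_grad_(False)              # the reference is never updated

opt = torch.optim.Adam(evaluator.parameters(), lr=1e-3)
trained_states = torch.randn(256, 4)     # states visited during DRL training

for _ in range(200):
    # Maximize the gap on trained data: a large gap later signals
    # "this state resembles what the controller was trained on".
    gap = (evaluator(trained_states) - reference(trained_states)).pow(2).mean()
    opt.zero_grad()
    (-gap).backward()
    opt.step()

def reliability(states):
    with torch.no_grad():
        return (evaluator(states) - reference(states)).pow(2).mean(dim=-1)

print(reliability(trained_states[:3]))   # higher values -> higher reliability
```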

19 pages, 27994 KiB  
Article
ELFNet: An Effective Electricity Load Forecasting Model Based on a Deep Convolutional Neural Network with a Double-Attention Mechanism
by Pei Zhao, Guang Ling and Xiangxiang Song
Appl. Sci. 2024, 14(14), 6270; https://doi.org/10.3390/app14146270 - 18 Jul 2024
Abstract
Forecasting energy demand is critical to ensuring the steady operation of the power system, yet present approaches to estimating power load remain unsatisfactory in terms of accuracy, precision, and efficiency. In this paper, we propose a novel method, named ELFNet, for estimating short-term electricity consumption, based on a deep convolutional neural network with a double-attention mechanism. The Gramian Angular Field method is used to convert electrical load time series into 2D image data for input to the proposed model. Prediction accuracy is greatly improved by using a convolutional neural network to extract the intrinsic characteristics of the input data, together with channel attention and spatial attention modules that enhance the crucial features and suppress irrelevant ones. ELFNet is compared with several classic deep learning networks across different prediction horizons using publicly available real power demand data from the Belgian grid operator Elia. The results show that the suggested approach is competitive and effective for short-term power load forecasting.
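
The Gramian Angular Field step is standard enough to show directly. Below is a summation-GAF (GASF) encoding in numpy that turns a 1D load series into the 2D image such models consume; the toy sine series stands in for a real load curve.

```python
import numpy as np

def gramian_angular_field(series):
    """Encode a 1D load series as a 2D image (summation GAF)."""
    x = np.asarray(series, dtype=float)
    # Rescale to [-1, 1] so the polar encoding (arccos) is defined.
    x = 2 * (x - x.min()) / (x.max() - x.min()) - 1
    phi = np.arccos(np.clip(x, -1, 1))
    return np.cos(phi[:, None] + phi[None, :])   # GASF matrix

load = np.sin(np.linspace(0, 4 * np.pi, 96))     # toy daily load curve
print(gramian_angular_field(load).shape)          # (96, 96)
```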

20 pages, 5228 KiB  
Article
Remote Sensing Image Change Detection Based on Deep Learning: Multi-Level Feature Cross-Fusion with 3D-Convolutional Neural Networks
by Sibo Yu, Chen Tao, Guang Zhang, Yubo Xuan and Xiaodong Wang
Appl. Sci. 2024, 14(14), 6269; https://doi.org/10.3390/app14146269 - 18 Jul 2024
Abstract
Change detection (CD) in high-resolution remote sensing imagery remains challenging due to the complex nature of objects and varying spectral characteristics across different times and locations. Convolutional neural networks (CNNs) have shown promising performance in CD tasks by extracting meaningful semantic features; however, traditional 2D-CNNs may struggle to accurately integrate deep features from multi-temporal images, limiting their ability to improve CD accuracy. This study proposes a Multi-level Feature Cross-Fusion (MFCF) network with 3D-CNNs for remote sensing image change detection. The network effectively extracts and fuses deep features from multi-temporal images to identify surface changes. To bridge the semantic gap between high-level and low-level features, an MFCF module is introduced, and a channel attention mechanism (CAM) is integrated to enhance model performance, interpretability, and generalization. The proposed methodology is validated on the LEVIR change detection dataset (LEVIR-CD). The experimental results demonstrate performance superior to the current state of the art on evaluation metrics including recall, F1 score, and IoU. The MFCF network, combining 3D-CNNs and a CAM, effectively exploits multi-temporal information and deep feature fusion, resulting in precise and reliable change detection in remote sensing imagery. This study contributes to the advancement of change detection methods, facilitating more efficient management and decision making in domains such as urban planning, natural resource management, and environmental monitoring.
(This article belongs to the Special Issue Advances in Image Recognition and Processing Technologies)
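
As a minimal illustration of why 3D convolutions suit bi-temporal inputs, the PyTorch block below stacks the two acquisition dates along a depth axis so a single depth-2 kernel mixes "before" and "after" features jointly; it is a generic sketch, not the MFCF module itself.

```python
import torch
import torch.nn as nn

class BiTemporal3DBlock(nn.Module):
    """Minimal 3D-CNN sketch: the two acquisition dates form the depth
    axis, so one Conv3d kernel fuses both dates in a single operation."""
    def __init__(self, in_ch=3, out_ch=16):
        super().__init__()
        self.conv = nn.Conv3d(in_ch, out_ch, kernel_size=(2, 3, 3),
                              padding=(0, 1, 1))  # depth-2 kernel spans both dates

    def forward(self, img_t1, img_t2):
        x = torch.stack([img_t1, img_t2], dim=2)  # (B, C, T=2, H, W)
        return self.conv(x).squeeze(2)            # back to (B, C', H, W)

t1, t2 = torch.randn(1, 3, 64, 64), torch.randn(1, 3, 64, 64)
print(BiTemporal3DBlock()(t1, t2).shape)  # torch.Size([1, 16, 64, 64])
```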

20 pages, 560 KiB  
Article
Deep Learning Soft-Decision GNSS Multipath Detection and Mitigation
by Fernando Nunes and Fernando Sousa
Sensors 2024, 24(14), 4663; https://doi.org/10.3390/s24144663 - 18 Jul 2024
Abstract
A technique is proposed to detect the presence of the multipath effect in Global Navigation Satellite System (GNSS) signals using a convolutional neural network (CNN) as the building block. The network is trained and validated, over a wide range of C/N0 values, with a realistic dataset consisting of the synthetic noisy outputs of a 2D grid of correlators associated with different Doppler frequencies and code delays (time-domain dataset). Multipath-disturbed signals are generated in agreement with the various scenarios encompassed by the adopted multipath model. It was found that pre-processing the outputs of the correlator grid with the two-dimensional Discrete Fourier Transform (frequency-domain dataset) enables the CNN to improve its accuracy relative to the time-domain dataset. Depending on the kind of CNN outputs, two strategies can be devised to solve the navigation equations: either remove the disturbed signal from the equations (hard decision) or process the pseudoranges with a weighted least-squares algorithm in which the entries of the weighting matrix are computed from the analog outputs of the neural network (soft decision).
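
The soft-decision step is classical weighted least squares, shown below with numpy. Weighting each satellite row by one minus the network's multipath probability is one natural reading of the abstract, and the geometry matrix and residuals are synthetic.

```python
import numpy as np

def wls_position(A, residuals, multipath_probs):
    """Soft decision: weight each satellite's equation by the CNN's
    belief that its signal is clean (1 - P[multipath])."""
    w = 1.0 - np.asarray(multipath_probs)   # analog network outputs
    W = np.diag(w)
    # Classic weighted least squares: x = (A^T W A)^{-1} A^T W r
    return np.linalg.solve(A.T @ W @ A, A.T @ W @ residuals)

rng = np.random.default_rng(1)
A = rng.normal(size=(8, 4))                 # geometry matrix, 8 satellites
r = rng.normal(size=8)                      # prefit pseudorange residuals
print(wls_position(A, r,
                   multipath_probs=[0.05, 0.9, 0.1, 0.2, 0.0, 0.3, 0.1, 0.6]))
```

The hard decision is the limiting case: rows whose multipath probability exceeds a threshold are simply dropped before an ordinary least-squares solve.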

17 pages, 5024 KiB  
Article
SCAE—Stacked Convolutional Autoencoder for Fault Diagnosis of a Hydraulic Piston Pump with Limited Data Samples
by Oybek Eraliev, Kwang-Hee Lee and Chul-Hee Lee
Sensors 2024, 24(14), 4661; https://doi.org/10.3390/s24144661 - 18 Jul 2024
Abstract
Deep learning (DL) models require enormous amounts of data to produce reliable diagnosis results. The superiority of DL models over traditional machine learning (ML) methods in feature extraction, feature dimension reduction, and diagnosis performance has been shown in various studies of fault diagnosis systems. However, data acquisition can sometimes be compromised by sensor issues, resulting in limited data samples. In this study, we propose a novel DL model based on a stacked convolutional autoencoder (SCAE) to address the challenge of limited data. The innovation of the SCAE model lies in its ability to enhance gradient information flow and extract richer hierarchical features, leading to superior diagnostic performance even with limited and noisy data samples. This article describes the development of a fault diagnosis method for a hydraulic piston pump using time–frequency visual pattern recognition. The proposed SCAE model was evaluated on limited data samples from a hydraulic piston pump, and the experiments demonstrate that the suggested approach achieves excellent diagnostic performance, with over 99.5% accuracy. The SCAE model also outperformed traditional DL models such as deep neural networks (DNNs), standard stacked sparse autoencoders (SSAEs), and convolutional neural networks (CNNs) in diagnosis performance, and it remains robust under noisy data conditions, further highlighting its effectiveness and reliability.
(This article belongs to the Special Issue Deep-Learning-Based Defect Detection for Smart Manufacturing)
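
A generic stacked convolutional autoencoder in PyTorch, sketched for spectrogram-like (time-frequency) inputs: the bottleneck features drive both a reconstruction path and a small fault-classification head. Layer sizes and the four-class head are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class SCAE(nn.Module):
    """Stacked convolutional autoencoder sketch: reconstruct the input
    while classifying the fault type from the bottleneck features."""
    def __init__(self, n_classes=4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(   # reconstruction path for pretraining
            nn.ConvTranspose2d(32, 16, 3, stride=2, padding=1,
                               output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 3, stride=2, padding=1,
                               output_padding=1),
        )
        self.head = nn.Sequential(nn.Flatten(), nn.LazyLinear(n_classes))

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), self.head(z)   # (reconstruction, fault logits)

x = torch.randn(2, 1, 64, 64)                  # spectrogram-like inputs
recon, logits = SCAE()(x)
print(recon.shape, logits.shape)               # (2, 1, 64, 64) (2, 4)
```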

19 pages, 2442 KiB  
Article
Prediction of Accident Risk Levels in Traffic Accidents Using Deep Learning and Radial Basis Function Neural Networks Applied to a Dataset with Information on Driving Events
by Cristian Arciniegas-Ayala, Pablo Marcillo, Ángel Leonardo Valdivieso Caraguay and Myriam Hernández-Álvarez
Appl. Sci. 2024, 14(14), 6248; https://doi.org/10.3390/app14146248 - 18 Jul 2024
Abstract
A complex AI system typically operates offline because the training and execution phases are processed separately, and this often requires different computing resources due to the high demands of the model. A limitation of this approach is the convoluted training process that must be repeated as new data are continuously incorporated into the knowledge base. Since the environment may not be static, it is crucial to train models dynamically by integrating new information during execution. In this article, artificial neural networks (ANNs) are developed to predict risk levels in traffic accidents using configurations considerably simpler than a deep learning (DL) model, which is more computationally intensive. The objective is to demonstrate that efficient, fast, and comparable results can be obtained using simple architectures such as the Radial Basis Function neural network (RBFNN). This work led to the generation of a driving dataset, which was subsequently validated for testing the ANN models. The driving dataset simulated the dynamic approach by adding new data to the training on the fly, reflecting the constant changes in driver data, vehicle information, environmental conditions, and traffic accidents. This study compares the processing time and performance of a Convolutional Neural Network (CNN), Random Forest (RF), Radial Basis Function (RBF), and Multilayer Perceptron (MLP), using accuracy, specificity, and sensitivity (recall) as evaluation metrics, in order to recommend an appropriate, simple, and fast ANN architecture that can be implemented in a secure traffic alert system that uses encrypted data.
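
To show why an RBFNN is cheap relative to a DL model, the sketch below builds one from k-means centers, Gaussian activations, and a closed-form linear read-out, with no iterative deep training. Features, labels, and hyperparameters are toy stand-ins for the driving-event data.

```python
import numpy as np
from sklearn.cluster import KMeans

class RBFNN:
    """Minimal RBF network: k-means centers, Gaussian hidden layer,
    least-squares output weights."""
    def __init__(self, n_centers=10, sigma=1.0):
        self.n_centers, self.sigma = n_centers, sigma

    def _phi(self, X):
        d2 = ((X[:, None, :] - self.centers[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * self.sigma ** 2))

    def fit(self, X, y):
        self.centers = KMeans(self.n_centers, n_init=10).fit(X).cluster_centers_
        self.w, *_ = np.linalg.lstsq(self._phi(X), y, rcond=None)
        return self

    def predict(self, X):
        return self._phi(X) @ self.w

X = np.random.rand(300, 5)                     # toy driving-event features
y = (X[:, 0] + X[:, 1] > 1.0).astype(float)    # toy risk label
model = RBFNN(n_centers=20, sigma=0.5).fit(X, y)
print((model.predict(X).round() == y).mean())  # training accuracy
```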

24 pages, 758 KiB  
Article
Advanced Convolutional Neural Networks for Precise White Blood Cell Subtype Classification in Medical Diagnostics
by Athanasios Kanavos, Orestis Papadimitriou, Khalil Al-Hussaeni, Manolis Maragoudakis and Ioannis Karamitsos
Electronics 2024, 13(14), 2818; https://doi.org/10.3390/electronics13142818 - 18 Jul 2024
Abstract
White blood cell (WBC) classification is pivotal in medical image analysis, playing a critical role in the precise diagnosis and monitoring of diseases. This paper presents a novel convolutional neural network (CNN) architecture designed specifically for the classification of WBC images. Our model, trained on an extensive dataset, automates the extraction of the discriminative features essential for accurate subtype identification. We conducted comprehensive experiments on a publicly available image dataset to validate the efficacy of our methodology. Comparative analysis with state-of-the-art methods shows that our approach significantly outperforms existing models in accurately categorizing WBCs into their respective subtypes. An in-depth analysis of the features learned by the CNN reveals key insights into the morphological traits, such as shape, size, and texture, that contribute to its classification accuracy. Importantly, the model demonstrates robust generalization capabilities, suggesting high potential for real-world clinical implementation. Our findings indicate that the proposed CNN architecture can substantially enhance the precision and efficiency of WBC subtype identification, offering significant improvements in medical diagnostics and patient care.
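
The abstract describes a bespoke CNN without architectural detail, so the following is only a compact generic classifier sketch; the four-class head follows the common public WBC sets (eosinophil, lymphocyte, monocyte, neutrophil), which is an assumption here.

```python
import torch
import torch.nn as nn

class WBCNet(nn.Module):
    """Compact CNN classifier sketch: stacked conv blocks extract
    morphology cues (shape, size, texture); a linear head scores subtypes."""
    def __init__(self, n_classes=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, n_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

print(WBCNet()(torch.randn(2, 3, 128, 128)).shape)  # torch.Size([2, 4])
```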

23 pages, 7788 KiB  
Article
A Novel Mamba Architecture with a Semantic Transformer for Efficient Real-Time Remote Sensing Semantic Segmentation
by Hao Ding, Bo Xia, Weilin Liu, Zekai Zhang, Jinglin Zhang, Xing Wang and Sen Xu
Remote Sens. 2024, 16(14), 2620; https://doi.org/10.3390/rs16142620 - 17 Jul 2024
Abstract
Real-time remote sensing segmentation technology is crucial for unmanned aerial vehicles (UAVs) in battlefield surveillance, land characterization, earthquake disaster assessment, and similar tasks, and can significantly enhance the application value of UAVs in military and civilian fields. To realize this potential, it is essential to develop real-time semantic segmentation methods that can run on resource-limited platforms such as edge devices. Most mainstream real-time semantic segmentation methods rely on convolutional neural networks (CNNs) or transformers; however, CNNs cannot effectively capture long-range dependencies, while transformers have high computational complexity. This paper proposes a novel remote sensing Mamba architecture for real-time segmentation tasks, named RTMamba. Specifically, the backbone uses Visual State-Space (VSS) blocks to extract deep features while maintaining linear computational complexity, thereby capturing long-range contextual information. Additionally, a novel Inverted Triangle Pyramid Pooling (ITP) module is incorporated into the decoder; it effectively filters redundant feature information and sharpens the perception of objects and their boundaries in remote sensing images. Extensive experiments were conducted on three challenging aerial remote sensing segmentation benchmarks: Vaihingen, Potsdam, and LoveDA. The results show that RTMamba achieves competitive segmentation accuracy and inference speed compared with state-of-the-art CNN and transformer methods. To further validate the model's deployment potential on resource-limited embedded devices such as UAVs, we conducted tests on the Jetson AGX Orin edge device, where RTMamba achieved impressive real-time segmentation performance.

23 pages, 2598 KiB  
Article
Enhancing Human Activity Recognition through Integrated Multimodal Analysis: A Focus on RGB Imaging, Skeletal Tracking, and Pose Estimation
by Sajid Ur Rehman, Aman Ullah Yasin, Ehtisham Ul Haq, Moazzam Ali, Jungsuk Kim and Asif Mehmood
Sensors 2024, 24(14), 4646; https://doi.org/10.3390/s24144646 - 17 Jul 2024
Abstract
Human activity recognition (HAR) is pivotal in advancing applications ranging from healthcare monitoring to interactive gaming. Traditional HAR systems, which rely primarily on single data sources, are limited in capturing the full spectrum of human activities. This study introduces a comprehensive approach to HAR by integrating two critical modalities: RGB imaging and advanced pose estimation features. Our methodology leverages the strengths of each modality to overcome the drawbacks of unimodal systems, providing a richer and more accurate representation of activities. We propose a two-stream network that processes skeletal and RGB data in parallel, enhanced by pose estimation techniques for refined feature extraction. The integration of these modalities through advanced fusion algorithms significantly improves recognition accuracy. Extensive experiments on the UTD multimodal human action dataset (UTD-MHAD) demonstrate that the proposed approach outperforms existing state-of-the-art algorithms. This study not only sets a new benchmark for HAR systems but also highlights the importance of feature engineering in capturing the complexity of human movements and integrating optimal features. Our findings pave the way for more sophisticated, reliable, and applicable HAR systems in real-world scenarios.
(This article belongs to the Section Sensing and Imaging)
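
One simple instance of the two-stream idea: an RGB branch and a pose/skeleton branch are encoded separately and fused by concatenation before the action head. The 27-class head matches UTD-MHAD's action count; the pose dimension (20 joints x 3 coordinates) and the layer sizes are assumptions, and the paper's fusion is more advanced than plain concatenation.

```python
import torch
import torch.nn as nn

class TwoStreamHAR(nn.Module):
    """Two-stream sketch: RGB and skeleton/pose branches run in parallel,
    and their embeddings are concatenated before the action classifier."""
    def __init__(self, n_actions=27, pose_dim=60):
        super().__init__()
        self.rgb = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.pose = nn.Sequential(nn.Linear(pose_dim, 64), nn.ReLU())
        self.head = nn.Linear(16 + 64, n_actions)

    def forward(self, frame, pose_vec):
        fused = torch.cat([self.rgb(frame), self.pose(pose_vec)], dim=1)
        return self.head(fused)

frame, pose = torch.randn(2, 3, 112, 112), torch.randn(2, 60)
print(TwoStreamHAR()(frame, pose).shape)  # torch.Size([2, 27])
```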
