Search Results (1,543)

Search Parameters:
Keywords = VGG-19

20 pages, 2982 KiB  
Article
Transformative Transparent Hybrid Deep Learning Framework for Accurate Cataract Detection
by Julius Olaniyan, Deborah Olaniyan, Ibidun Christiana Obagbuwa, Bukohwo Michael Esiefarienrhe and Matthew Odighi
Appl. Sci. 2024, 14(21), 10041; https://doi.org/10.3390/app142110041 - 4 Nov 2024
Abstract
This paper presents a transformative explainable convolutional neural network (CNN) framework for cataract detection, utilizing a hybrid deep learning model that combines Siamese networks with VGG16. By leveraging a learning rate scheduler and Grad-CAM (Gradient-weighted Class Activation Mapping) for explainability, the proposed model not only achieves high accuracy in identifying cataract-affected images but also provides interpretable visual explanations of its predictions. Performance evaluation metrics such as accuracy, precision, recall, and F1 score demonstrate the model’s robustness, with a perfect accuracy of 100%. Grad-CAM visualizations highlight the key image regions—primarily around the iris and pupil—that contribute most to the model’s decision-making, making the system more transparent for clinical use. Additionally, novel statistical analysis methods, including saliency map evaluation metrics such as AUC (Area Under the Curve) and the Pointing Game, were employed to quantify the quality of the model’s explanations. These metrics enhance the interpretability of the model and support its practical applicability in medical image analysis. This approach advances the integration of deep learning with explainable AI, offering a robust, accurate, and interpretable solution for cataract detection, with the potential for broader adoption in ocular disease diagnosis and medical decision support systems.
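As a rough illustration of the Grad-CAM technique this entry relies on, the sketch below computes a class-activation heatmap against a stock Keras VGG16. It is a minimal reading of Grad-CAM, not the authors' hybrid Siamese model; the layer name `block5_conv3` and the assumption of an already-preprocessed 224x224 input are ours.

```python
# Minimal Grad-CAM sketch for a VGG16-based classifier (TensorFlow/Keras).
# Stand-in model only; the paper's hybrid Siamese-VGG16 network is not shown here.
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications import VGG16

model = VGG16(weights="imagenet")

def grad_cam(model, image, conv_layer_name="block5_conv3"):
    """Heatmap of where the model looks for its top-scoring class.

    image: float32 array of shape (224, 224, 3), already preprocessed.
    """
    grad_model = tf.keras.Model(
        inputs=model.input,
        outputs=[model.get_layer(conv_layer_name).output, model.output],
    )
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(image[None, ...])
        top_class = tf.argmax(preds[0])
        score = preds[:, top_class]
    grads = tape.gradient(score, conv_out)               # d(score)/d(feature map)
    weights = tf.reduce_mean(grads, axis=(0, 1, 2))      # global-average-pooled grads
    cam = tf.nn.relu(tf.reduce_sum(conv_out[0] * weights, axis=-1))
    return (cam / (tf.reduce_max(cam) + 1e-8)).numpy()   # normalized to [0, 1]
```

Upsampling the returned map to the input resolution and overlaying it on the eye image gives the iris-and-pupil saliency view the abstract describes.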

16 pages, 9878 KiB  
Article
An Enhanced Deep Learning Model for Effective Crop Pest and Disease Detection
by Yongqi Yuan, Jinhua Sun and Qian Zhang
J. Imaging 2024, 10(11), 279; https://doi.org/10.3390/jimaging10110279 - 2 Nov 2024
Abstract
Traditional machine learning methods struggle with plant pest and disease image recognition, particularly when dealing with small sample sizes, indistinct features, and numerous categories. This paper proposes an improved ResNet34 model (ESA-ResNet34) for crop pest and disease detection. The model employs ResNet34 as its backbone and introduces an effective spatial attention (ESA) mechanism to focus on key regions of the images. By replacing the standard convolutions in ResNet34 with depthwise separable convolutions, the model reduces its parameter count by 85.37% and its computational load by 84.51%. Additionally, Dropout is used to mitigate overfitting, and data augmentation techniques such as center cropping and horizontal flipping are employed to enhance the model’s robustness. The experimental results show that the improved algorithm achieves an accuracy, precision, and F1 score of 87.09%, 87.14%, and 86.91%, respectively, outperforming several benchmark models (including AlexNet, VGG16, MobileNet, DenseNet, and various ResNet variants). These findings demonstrate that the proposed ESA-ResNet34 model significantly enhances crop pest and disease detection.
(This article belongs to the Special Issue Imaging Applications in Agriculture)
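The parameter savings quoted above come from factorizing each standard 3x3 convolution into a depthwise and a pointwise step. A minimal PyTorch sketch of that substitution, with the block layout as our assumption rather than the paper's exact configuration:

```python
# Depthwise-separable replacement for a standard 3x3 convolution (PyTorch).
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        # Depthwise: one 3x3 filter per input channel (groups=in_ch).
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, stride, 1,
                                   groups=in_ch, bias=False)
        # Pointwise: 1x1 convolution mixes channels and sets the output width.
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

# Rough parameter comparison for a 256 -> 256 channel layer:
std = nn.Conv2d(256, 256, 3, padding=1, bias=False)
sep = DepthwiseSeparableConv(256, 256)
print(sum(p.numel() for p in std.parameters()))  # 589,824
print(sum(p.numel() for p in sep.parameters()))  # 68,352 (incl. BatchNorm)
```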

16 pages, 2883 KiB  
Article
Enhanced Skin Lesion Segmentation and Classification Through Ensemble Models
by Su Myat Thwin and Hyun-Seok Park
Eng 2024, 5(4), 2805-2820; https://doi.org/10.3390/eng5040146 - 31 Oct 2024
Abstract
This study addresses challenges in skin cancer detection, particularly class imbalance and the varied appearance of lesions, which complicate segmentation and classification tasks. The research employs deep learning ensemble models for both segmentation (using U-Net, SegNet, and DeepLabV3) and classification (using VGG16, ResNet-50, and Inception-V3). The ISIC dataset is balanced through oversampling for classification, and data augmentation and post-processing techniques are applied in segmentation to increase robustness. The ensemble models outperformed the individual models, achieving a Dice coefficient of 0.93, an IoU of 0.90, and an accuracy of 0.95 for segmentation, along with a classification accuracy of 90% on the original dataset and 99% on the balanced dataset. The use of ensemble models and balanced datasets proved highly effective in improving the accuracy and reliability of automated skin lesion analysis, supporting dermatologists in early detection efforts.
(This article belongs to the Special Issue Feature Papers in Eng 2024)
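The listing does not spell out the combination rule, so as an illustration only, a soft-voting combiner over the three trained classification models might look like this (equal weights are our assumption):

```python
# Hypothetical soft-voting ensemble over three trained Keras classifiers.
import numpy as np

def ensemble_predict(models, images):
    """Average the class probabilities of each member, then take the argmax."""
    probs = np.mean([m.predict(images, verbose=0) for m in models], axis=0)
    return probs.argmax(axis=1)

# usage sketch: labels = ensemble_predict([vgg16, resnet50, inception_v3], x_test)
```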

26 pages, 17802 KiB  
Article
MR_NET: A Method for Breast Cancer Detection and Localization from Histological Images Through Explainable Convolutional Neural Networks
by Rachele Catalano, Myriam Giusy Tibaldi, Lucia Lombardi, Antonella Santone, Mario Cesarelli and Francesco Mercaldo
Sensors 2024, 24(21), 7022; https://doi.org/10.3390/s24217022 - 31 Oct 2024
Abstract
Breast cancer is the most prevalent cancer among women globally, making early and accurate detection essential for effective treatment and improved survival rates. This paper presents a method designed to detect and localize breast cancer using deep learning, specifically convolutional neural networks. The approach classifies histological images of breast tissue as either tumor-positive or tumor-negative. We utilize several deep learning models, including a custom-built CNN, EfficientNet, ResNet50, VGG-16, VGG-19, and MobileNet. Fine-tuning was also applied to VGG-16, VGG-19, and MobileNet to enhance performance. Additionally, we introduce a novel deep learning model called MR_Net, aimed at providing a more accurate network for breast cancer detection and localization, potentially assisting clinicians in making informed decisions. This model could also accelerate the diagnostic process, enabling early detection of the disease. Furthermore, we propose a method for explainable predictions by generating heatmaps that highlight the regions within tissue images that the model focuses on when predicting a label, revealing the detection of benign, atypical, and malignant tumors. We evaluate both the quantitative and qualitative performance of MR_Net and the other models, also presenting explainable results that allow visualization of the tissue areas identified by the model as relevant to the presence of breast cancer.
(This article belongs to the Special Issue Feature Papers in the Internet of Things Section 2024)

13 pages, 2185 KiB  
Article
Diagnosis of Pancreatic Ductal Adenocarcinoma Using Deep Learning
by Fulya Kavak, Sebnem Bora, Aylin Kantarci, Aybars Uğur, Sumru Cagaptay, Deniz Gokcay, Anıl Aysal, Burcin Pehlivanoglu and Ozgul Sagol
Sensors 2024, 24(21), 7005; https://doi.org/10.3390/s24217005 - 31 Oct 2024
Abstract
Recent advances in artificial intelligence (AI) research, particularly in image processing technologies, have shown promising applications across various domains, including health care. There is a significant effort to use AI for the early diagnosis and detection of diseases, offering cost-effective and timely solutions to enhance patient outcomes. This study introduces a deep learning network aimed at analyzing pathology images for the accurate diagnosis of pancreatic cancer, specifically pancreatic ductal adenocarcinoma (PDAC). Utilizing a novel dataset comprising cases diagnosed with PDAC and/or chronic pancreatitis, this study applies deep learning algorithms to assess the effectiveness and reliability of the diagnostic process. The dataset was enhanced through image duplication and the creation of a second dataset with varied dimensions, facilitating the training of advanced transfer learning models including InceptionV3, DenseNet, ResNet, VGG, EfficientNet, and a specially designed deep neural network. The study presents a convolutional neural network model, optimized for the rapid and accurate detection of pancreatic cancer, and conducts a comparative analysis with other models to select the most accurate algorithm for a decision support system. The results from Dataset 1 show that EfficientNetB0 achieved a high success rate of 92%. In Dataset 2, VGG16 also performed strongly, with a success rate of 92%. ResNet50, meanwhile, achieved a remarkable success rate of 96% despite a moderate training time, with high precision, recall, F1 score, and accuracy. These results demonstrate the relevance of different deep learning models to pancreatic cancer diagnosis.
(This article belongs to the Special Issue Digital Imaging Processing, Sensing, and Object Recognition)
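A transfer-learning pipeline of the kind compared above (ResNet50 shown, the best performer at 96%) can be sketched in Keras as follows; the input size, classification head, and learning rate are illustrative assumptions, not values reported by the paper:

```python
# Hedged sketch of a two-class transfer-learning model (PDAC vs. chronic pancreatitis).
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import ResNet50

base = ResNet50(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze ImageNet features for the initial training stage

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.3),
    layers.Dense(2, activation="softmax"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=10)  # datasets assumed
```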

25 pages, 18179 KiB  
Article
ES-L2-VGG16 Model for Artificial Intelligent Identification of Ice Avalanche Hidden Danger
by Daojing Guo, Minggao Tang, Qiang Xu, Guangjian Wu, Guang Li, Wei Yang, Zhihang Long, Huanle Zhao and Yu Ren
Remote Sens. 2024, 16(21), 4041; https://doi.org/10.3390/rs16214041 - 30 Oct 2024
Abstract
Ice avalanches (IAs) are highly concealed and sudden, and can cause severe disasters. Early identification of IA hidden danger is of great value for disaster prevention and mitigation; however, it is very difficult and inefficient to identify by site investigation or manual remote sensing. We therefore propose an artificial intelligence method for identifying IA hidden dangers using a deep learning model, with the glacier area of the Yarlung Tsangpo River Gorge in Nyingchi selected for identification and validation. First, through engineering geological investigations, three key identification indices for IA hidden dangers are established: glacier source, slope angle, and cracks. Sentinel-2A satellite data, Google Earth, and ArcGIS are used to extract these indices and to construct a feature dataset for the study and validation areas. Next, key performance metrics, such as training accuracy, validation accuracy, test accuracy, and loss rates, are compared to assess the performance of the ResNet50 (Residual Neural Network 50) and VGG16 (Visual Geometry Group 16) models. The VGG16 model (96.09% training accuracy) is selected and optimized using Early Stopping (ES) to prevent overfitting and L2 regularization (L2) to add weight penalties, constraining model complexity and improving generalization, ultimately yielding the ES-L2-VGG16 model (98.61% training accuracy). Lastly, during the validation phase, the model is applied to the Yarlung Tsangpo River Gorge glacier area on the Tibetan Plateau (TP), identifying a total of 100 IA hidden danger areas, with average slopes ranging between 34° and 48°. The ES-L2-VGG16 model achieves an accuracy of 96% in identifying these hidden danger areas, ensuring the precise identification of IA dangers. This study offers a new intelligent technical method for identifying IA hidden danger, with clear advantages and promising application prospects.
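Both optimization ingredients map directly onto standard Keras facilities. A minimal sketch of the ES-L2 recipe, with the dense head, penalty strength, and patience as our assumptions:

```python
# ES-L2-style training setup: VGG16 backbone, L2 weight penalties, Early Stopping.
import tensorflow as tf
from tensorflow.keras import layers, models, regularizers
from tensorflow.keras.applications import VGG16

base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu",
                 kernel_regularizer=regularizers.l2(1e-4)),   # L2 weight penalty
    layers.Dense(1, activation="sigmoid",
                 kernel_regularizer=regularizers.l2(1e-4)),   # hidden danger: yes/no
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True)  # ES component
# model.fit(train_ds, validation_data=val_ds, epochs=100, callbacks=[early_stop])
```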

20 pages, 3672 KiB  
Article
Grad-CAM Enabled Breast Cancer Classification with a 3D Inception-ResNet V2: Empowering Radiologists with Explainable Insights
by Fatma M. Talaat, Samah A. Gamel, Rana Mohamed El-Balka, Mohamed Shehata and Hanaa ZainEldin
Cancers 2024, 16(21), 3668; https://doi.org/10.3390/cancers16213668 - 30 Oct 2024
Abstract
Breast cancer (BCa) poses a severe threat to women’s health worldwide as it is the most frequently diagnosed type of cancer and the primary cause of death for female patients. The biopsy procedure remains the gold standard for accurate and effective diagnosis of BCa. However, its adverse effects, such as invasiveness, bleeding, infection, and reporting time, keep this procedure as a last resort for diagnosis. A mammogram is considered the routine noninvasive imaging-based procedure for diagnosing BCa, mitigating the need for biopsies; however, it might be prone to subjectivity depending on the radiologist’s experience. Therefore, we propose a novel, mammogram image-based BCa explainable AI (BCaXAI) model with a deep learning-based framework for precise, noninvasive, objective, and timely diagnosis of BCa. The proposed BCaXAI leverages the Inception-ResNet V2 architecture, where the integration of explainable AI components, such as Grad-CAM, provides radiologists with valuable visual insights into the model’s decision-making process, fostering trust and confidence in the AI-based system. Using the DDSM and CBIS-DDSM mammogram datasets, BCaXAI achieved exceptional performance, surpassing traditional models such as ResNet50 and VGG16. The model demonstrated superior accuracy (98.53%), recall (98.53%), precision (98.40%), F1-score (98.43%), and AUROC (0.9933), highlighting its effectiveness in distinguishing between benign and malignant cases. These promising results could alleviate the diagnostic subjectivity that arises from variability in experience among radiologists, as well as minimize the need for repetitive biopsy procedures.
(This article belongs to the Special Issue Artificial Intelligence-Assisted Radiomics in Cancer)

31 pages, 5080 KiB  
Article
Detection of Subarachnoid Hemorrhage Using CNN with Dynamic Factor and Wandering Strategy-Based Feature Selection
by Jewel Sengupta, Robertas Alzbutas, Tomas Iešmantas, Vytautas Petkus, Alina Barkauskienė, Vytenis Ratkūnas, Saulius Lukoševičius, Aidanas Preikšaitis, Indre Lapinskienė, Mindaugas Šerpytis, Edgaras Misiulis, Gediminas Skarbalius, Robertas Navakas and Algis Džiugys
Diagnostics 2024, 14(21), 2417; https://doi.org/10.3390/diagnostics14212417 - 30 Oct 2024
Abstract
Objectives: Subarachnoid hemorrhage (SAH) is a serious neurological emergency with a high mortality rate. Automatic SAH detection is needed to expedite and improve identification, aiding timely and efficient treatment pathways. Noisy and dissimilar anatomical structures in NCCT images, limited availability of labeled SAH data, and ineffective training cause irrelevant features, overfitting, and vanishing gradients, which make SAH detection a challenging task. Methods: In this work, a water-waves dynamic factor and wandering strategy-based Sand Cat Swarm Optimization, namely DWSCSO, is proposed to ensure optimal feature selection, while a Parametric Rectified Linear Unit with a Stacked Convolutional Neural Network, referred to as PRSCNN, is developed for classifying grades of SAH. DWSCSO and PRSCNN surpass current practice in SAH detection by improving feature selection and classification accuracy: DWSCSO avoids local optima through higher exploration capacity, while PRSCNN mitigates overfitting in classification. First, a modified region-growing method was applied to patient Non-Contrast Computed Tomography (NCCT) images to segment the regions affected by SAH. From the segmented regions, a wide range of patterns and irregularities, fine-grained textures and details, and complex and abstract features were extracted using pre-trained models, namely GoogleNet, Visual Geometry Group (VGG)-16, and ResNet50. Next, PRSCNN was developed for classifying grades of SAH, which helped to avoid the vanishing gradient issue. Results: DWSCSO-PRSCNN obtained a maximum accuracy of 99.48%, which is significant compared with other models. DWSCSO-PRSCNN provides an improved accuracy of 99.62% on the CT dataset compared with DL-ICH, GoogLeNet + (GLCM and LBP), ResNet-50 + (GLCM and LBP), and AlexNet + (GLCM and LBP), confirming that it effectively reduces false positives and false negatives. Conclusions: The complexity of DWSCSO-PRSCNN was acceptable in this research; while simpler approaches appeared preferable, they failed to address problems like overfitting and vanishing gradients. Accordingly, DWSCSO for optimized feature selection and PRSCNN for robust classification were essential to handling these challenges and enhancing detection in different clinical settings.
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
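As a sketch of the PRSCNN ingredient named above, a stacked CNN with Parametric ReLU activations might be structured as follows in PyTorch; the depth, channel widths, and four-grade output are our assumptions, not the paper's architecture:

```python
# Hypothetical stacked CNN with per-channel Parametric ReLU (PReLU) activations.
import torch
import torch.nn as nn

class PRStackedCNN(nn.Module):
    def __init__(self, in_ch=1, n_grades=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.BatchNorm2d(32),
            nn.PReLU(32),             # learnable negative slope per channel
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.BatchNorm2d(64),
            nn.PReLU(64),
            nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, n_grades),  # one logit per SAH grade
        )

    def forward(self, x):
        return self.head(self.features(x))

logits = PRStackedCNN()(torch.randn(2, 1, 128, 128))  # -> shape (2, 4)
```

PReLU's learnable negative slopes keep gradients flowing for negative activations, which is the property the abstract invokes against vanishing gradients.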

25 pages, 6970 KiB  
Article
Urban Land Use Classification Model Fusing Multimodal Deep Features
by Yougui Ren, Zhiwei Xie and Shuaizhi Zhai
ISPRS Int. J. Geo-Inf. 2024, 13(11), 378; https://doi.org/10.3390/ijgi13110378 - 30 Oct 2024
Abstract
Urban land use classification plays a significant role in urban studies and provides key guidance for urban development. However, existing methods predominantly rely on either raster structure deep features through convolutional neural networks (CNNs) or topological structure deep features through graph neural networks (GNNs), making it challenging to comprehensively capture the rich semantic information in remote sensing images. To address this limitation, we propose a novel urban land use classification model by integrating both raster and topological structure deep features to enhance the accuracy and robustness of the classification model. First, we divide the urban area into block units based on road network data and further subdivide these units using the fractal network evolution algorithm (FNEA). Next, the K-nearest neighbors (KNN) graph construction method with adaptive fusion coefficients is employed to generate both global and local graphs of the blocks and sub-units. The spectral features and subgraph features are then constructed, and a graph convolutional network (GCN) is utilized to extract the node relational features from both the global and local graphs, forming the topological structure deep features while aggregating local features into global ones. Subsequently, VGG-16 (Visual Geometry Group 16) is used to extract the image convolutional features of the block units, obtaining the raster structure deep features. Finally, the transformer is used to fuse both topological and raster structure deep features, and land use classification is completed using the softmax function. Experiments were conducted using high-resolution Google images and Open Street Map (OSM) data, with study areas on the third ring road of Shenyang and the fourth ring road of Chengdu. The results demonstrate that the proposed method improves the overall accuracy and Kappa coefficient by 9.32% and 0.17, respectively, compared to single deep learning models. Incorporating subgraph structure features further enhances the overall accuracy and Kappa by 1.13% and 0.1. The adaptive KNN graph construction method achieves accuracy comparable to that of the empirical threshold method. This study enables accurate large-scale urban land use classification with reduced manual intervention, improving urban planning efficiency. The experimental results verify the effectiveness of the proposed method, particularly in terms of classification accuracy and feature representation completeness.
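The topological branch can be illustrated with a single plain-PyTorch GCN propagation step over a block-unit graph; the toy adjacency below stands in for the paper's adaptive KNN graph, and all dimensions are assumptions:

```python
# One Kipf-Welling GCN propagation step: H' = relu(D^-1/2 (A+I) D^-1/2 H W).
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, h, adj):
        a_hat = adj + torch.eye(adj.size(0))        # add self-loops
        d_inv_sqrt = torch.diag(a_hat.sum(dim=1).pow(-0.5))
        norm_adj = d_inv_sqrt @ a_hat @ d_inv_sqrt  # symmetric normalization
        return torch.relu(self.linear(norm_adj @ h))

# Toy usage: 6 block units with 16 spectral features each.
h = torch.randn(6, 16)
adj = (torch.rand(6, 6) > 0.6).float()
adj = ((adj + adj.T) > 0).float()                   # symmetrize the toy graph
out = GCNLayer(16, 32)(h, adj)                      # -> shape (6, 32)
```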

17 pages, 45843 KiB  
Article
How to Learn More? Exploring Kolmogorov–Arnold Networks for Hyperspectral Image Classification
by Ali Jamali, Swalpa Kumar Roy, Danfeng Hong, Bing Lu and Pedram Ghamisi
Remote Sens. 2024, 16(21), 4015; https://doi.org/10.3390/rs16214015 - 29 Oct 2024
Abstract
Convolutional neural networks (CNNs) and vision transformers (ViTs) have shown excellent capability in complex hyperspectral image (HSI) classification. However, these models require large amounts of training data and computational resources. On the other hand, modern Multi-Layer Perceptrons (MLPs) have demonstrated great classification capability, requiring significantly less training data than CNNs and ViTs while achieving state-of-the-art classification accuracy. Recently, Kolmogorov–Arnold networks (KANs) were proposed as viable alternatives to MLPs. Because of their internal similarity to splines and their external similarity to MLPs, KANs are able to optimize learned features with remarkable accuracy, in addition to being able to learn new features. Thus, in this study, we assessed the effectiveness of KANs for complex HSI data classification. Moreover, to enhance the HSI classification accuracy obtained by the KANs, we developed and propose a hybrid architecture utilizing 1D, 2D, and 3D KANs. To demonstrate the effectiveness of the proposed KAN architecture, we conducted extensive experiments on three newly created HSI benchmark datasets: QUH-Pingan, QUH-Tangdaowan, and QUH-Qingyun. The results underscore the competitive or better capability of the developed hybrid KAN-based model across these benchmark datasets relative to several other CNN- and ViT-based algorithms, including 1D-CNN, 2D-CNN, 3D-CNN, VGG-16, ResNet-50, EfficientNet, RNN, and ViT.

14 pages, 1193 KiB  
Article
Hyper CLS-Data-Based Robotic Interface and Its Application to Intelligent Peg-in-Hole Task Robot Incorporating a CNN Model for Defect Detection
by Fusaomi Nagata, Ryoma Abe, Shingo Sakata, Keigo Watanabe and Maki K. Habib
Machines 2024, 12(11), 757; https://doi.org/10.3390/machines12110757 - 26 Oct 2024
Abstract
Various types of numerical control (NC) machine tools can be operated and controlled in a standardized way based on NC data that are easily generated using widespread CAD/CAM systems. By contrast, the operating environments of industrial robots still depend on the conventional teaching-and-playback systems provided by their makers, and have not yet been standardized and unified in the way NC machine tools have. Additionally, robotic functional extensions, e.g., the easy implementation of a machine learning model such as a convolutional neural network (CNN), a visual feedback controller, or cooperative control for multiple robots, have not been sufficiently realized yet. In this paper, a hyper cutter location source (HCLS)-data-based robotic interface is proposed to cope with these issues. With the HCLS-data-based robot interface, robotic control sequences can be described visually and uniformly as NC code. In addition, a VGG19-based CNN model for defect detection, whose classification accuracy is over 99% and whose average forward-calculation time is 70 ms, can be systematically incorporated into a robotic control application that handles multiple robots. The effectiveness and validity of the proposed system are demonstrated through a cooperative pick-and-place task using three small-sized MG400 industrial robots and a peg-in-hole task that checks workpieces for undesirable defects with a CNN model, all without using any programmable logic controller (PLC). The PC used for the experiments has an Intel(R) Core(TM) i9-10850K CPU at 3.60 GHz, an NVIDIA GeForce RTX 3090 GPU, and 64 GB of main memory.
(This article belongs to the Special Issue Industry 4.0: Intelligent Robots in Smart Manufacturing)

24 pages, 7237 KiB  
Article
An Embedded System for Real-Time Atrial Fibrillation Diagnosis Using a Multimodal Approach to ECG Data
by Monalisa Akter, Nayeema Islam, Abdul Ahad, Md. Asaduzzaman Chowdhury, Fahim Foysal Apurba and Riasat Khan
Eng 2024, 5(4), 2728-2751; https://doi.org/10.3390/eng5040143 - 24 Oct 2024
Abstract
Cardiovascular diseases pose a significant global health threat, with atrial fibrillation representing a critical precursor to more severe heart conditions. In this work, a multimodality-based deep learning model has been developed for diagnosing atrial fibrillation using an embedded system consisting of a Raspberry Pi 4B, an ESP8266 microcontroller, and an AD8232 single-lead ECG sensor to capture real-time ECG data. Our approach leverages a deep learning model that is capable of distinguishing atrial fibrillation from normal ECG signals. The proposed method involves real-time ECG signal acquisition and employs a multimodal model trained on the PTB-XL dataset. This model utilizes a multi-step approach combining a CNN–bidirectional LSTM for numerical ECG series tabular data and VGG16 for image-based ECG representations. A fusion layer is incorporated into the multimodal CNN-BiLSTM + VGG16 model to enhance atrial fibrillation detection, achieving state-of-the-art results with a precision of 94.07% and an F1 score of 0.94. This study demonstrates the efficacy of a multimodal approach in improving the real-time diagnosis of cardiovascular diseases. Furthermore, for edge devices, we have distilled knowledge to train a smaller student model, CNN-BiLSTM, using a larger CNN-BiLSTM model as a teacher, which achieves an accuracy of 83.21% with 0.85 s detection latency. Our work represents a significant advancement towards efficient and preventative cardiovascular health management.
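The fusion layer described above concatenates the two modality embeddings ahead of the classifier head. A Keras sketch under assumed shapes (signal length, image size, and head widths are ours, not the paper's):

```python
# Two-branch CNN-BiLSTM + VGG16 fusion model for AF detection (shapes assumed).
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import VGG16

# Branch 1: numerical ECG series (e.g., 1000 samples, single lead).
sig_in = layers.Input(shape=(1000, 1))
x = layers.Conv1D(32, 7, activation="relu")(sig_in)
x = layers.MaxPooling1D(4)(x)
x = layers.Bidirectional(layers.LSTM(64))(x)

# Branch 2: image rendering of the same ECG segment.
img_in = layers.Input(shape=(224, 224, 3))
vgg = VGG16(weights="imagenet", include_top=False, input_tensor=img_in)
vgg.trainable = False
y = layers.GlobalAveragePooling2D()(vgg.output)

# Fusion layer: concatenate the modality embeddings, then classify.
z = layers.concatenate([x, y])
z = layers.Dense(64, activation="relu")(z)
out = layers.Dense(1, activation="sigmoid")(z)      # AF vs. normal rhythm

model = Model(inputs=[sig_in, img_in], outputs=out)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```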

16 pages, 1532 KiB  
Article
An Improved Random Forest Approach on GAN-Based Dataset Augmentation for Fog Observation
by Yucan Cao, Panpan Zhao, Balin Xu and Jingshu Liang
Appl. Sci. 2024, 14(21), 9657; https://doi.org/10.3390/app14219657 - 22 Oct 2024
Abstract
The monitoring of fog density is of great importance in meteorology and in its applications to the environment, aviation, and transportation. Nowadays, vision-based fog estimation from images taken with surveillance cameras has become a valuable supplement to scarce traditional meteorological fog observations. In this paper, we propose a new Random Forest (RF) approach for image-based fog estimation. To reduce the impact of data imbalance on recognition, the StyleGAN2-ADA (generative adversarial network with adaptive discriminator augmentation) algorithm is used to generate virtual images that expand the underrepresented classes. Key fog-related image features are extracted, and an RF method, integrated with hierarchical and k-medoid clustering, is deployed to estimate the fog density. An experiment conducted in Sichuan in February 2024 shows that the improved RF model achieved an average fog-density observation accuracy of 93%: 6.4% higher than the RF model without data expansion, and 3–6% higher than VGG16, VGG19, ResNet50, and DenseNet169 with or without data expansion. Moreover, the improved RF method exhibits very good convergence and is a cost-effective solution.
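Once the GAN-expanded dataset and the clustered image features exist, the RF stage itself is standard scikit-learn. In this sketch, the feature matrix and labels are purely illustrative toy data standing in for the extracted fog features:

```python
# Random Forest on (toy) fog-feature vectors; the GAN expansion and clustering
# steps from the paper are assumed to have produced X and y already.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1200, 12))    # 12 fog-related image features per sample
y = rng.integers(0, 4, size=1200)  # four fog-density classes

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = RandomForestClassifier(n_estimators=300, random_state=0)
clf.fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```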

13 pages, 14375 KiB  
Article
Spiking Neural Networks for Object Detection Based on Integrating Neuronal Variants and Self-Attention Mechanisms
by Weixuan Li, Jinxiu Zhao, Li Su, Na Jiang and Quan Hu
Appl. Sci. 2024, 14(20), 9607; https://doi.org/10.3390/app14209607 - 21 Oct 2024
Abstract
Thanks to their event-driven asynchronous computing capabilities and low power consumption, spiking neural networks (SNNs) show significant potential for computer vision tasks, especially object detection. However, effective training methods and optimization mechanisms for SNNs remain underexplored. This study proposes two high-accuracy SNNs for object detection, AMS_YOLO and AMSpiking_VGG, integrating neuronal variants and attention mechanisms. To enhance these proposed networks, we explore the impact of incorporating different neuronal variants. The results show that optimizing the SNN’s structure with neuronal variants outperforms optimizing the attention mechanism for object detection. Compared to the current state of the art in SNNs, AMS_YOLO improves accuracy by 6.7% on the static dataset COCO2017, and AMSpiking_VGG improves it by 11.4% on the dynamic dataset GEN1.

18 pages, 41079 KiB  
Article
Research on Target Image Classification in Low-Light Night Vision
by Yanfeng Li, Yongbiao Luo, Yingjian Zheng, Guiqian Liu and Jiekai Gong
Entropy 2024, 26(10), 882; https://doi.org/10.3390/e26100882 - 21 Oct 2024
Abstract
In extremely dark conditions, low-light imaging can offer viewers a rich visual experience, which is important for both military and civilian applications. However, images taken in ultra-low-light environments usually have inherent defects such as extremely low brightness and contrast, high noise levels, and serious loss of scene details and colors, which poses great challenges for low-light image object detection and classification research. The low-light night-vision images studied in this work are excessively dim overall and carry very little feature information. Three algorithms, histogram equalization (HE), adaptive histogram equalization (AHE), and contrast-limited adaptive histogram equalization (CLAHE), were used to enhance the images. The effectiveness of these enhancement methods was evaluated using metrics such as the peak signal-to-noise ratio and mean square error, and CLAHE was selected after comparison. The target images include vehicles, people, license plates, and other objects. The gray-level co-occurrence matrix (GLCM) was used to extract texture features from the enhanced images, and the extracted texture features were used as input to a backpropagation (BP) neural network classification model. Low-light image classification models were then developed based on the VGG16 and ResNet50 convolutional neural networks combined with the low-light image enhancement algorithms. The experimental results show that the VGG16 model achieves an overall classification accuracy of 92.1%, 4.5% and 2.3% higher than the BP and ResNet50 models, respectively, demonstrating its effectiveness in classifying low-light night-vision targets.
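The enhancement-plus-texture front end described above maps directly onto OpenCV and scikit-image; the clip limit, tile size, and choice of GLCM statistics below are our assumptions:

```python
# CLAHE enhancement followed by GLCM texture features for the BP classifier.
import cv2
import numpy as np
from skimage.feature import graycomatrix, graycoprops

img = cv2.imread("night_frame.png", cv2.IMREAD_GRAYSCALE)  # hypothetical frame

# Contrast-limited adaptive histogram equalization.
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
enhanced = clahe.apply(img)

# GLCM at four orientations; classic Haralick-style statistics.
glcm = graycomatrix(enhanced, distances=[1],
                    angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                    levels=256, symmetric=True, normed=True)
features = np.hstack([graycoprops(glcm, p).ravel()
                      for p in ("contrast", "homogeneity", "energy", "correlation")])
print(features.shape)  # (16,) -> one input vector for the BP network
```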
