Search Results (1,143)

Search Parameters:
Keywords = deep autoencoder

19 pages, 947 KiB  
Article
Knowledge Graph and Personalized Answer Sequences for Programming Knowledge Tracing
by Jianguo Pan, Zhengyang Dong, Lijun Yan and Xia Cai
Appl. Sci. 2024, 14(17), 7952; https://doi.org/10.3390/app14177952 - 6 Sep 2024
Viewed by 172
Abstract
Knowledge tracing is a significant research area in educational data mining, aiming to predict future performance based on students' historical learning data. In the programming domain, knowledge tracing faces several challenges, including inaccurate exercise representation and limited student information. These issues can lead to biased models and inaccurate predictions of students' knowledge states. To effectively address these issues, we propose a novel programming knowledge tracing model named GPPKT (Knowledge Graph and Personalized Answer Sequences for Programming Knowledge Tracing), which enhances performance by using knowledge graphs and personalized answer sequences. Specifically, we establish the associations between well-defined knowledge concepts and exercises, incorporating student learning abilities and latent representations generated from personalized answer sequences using Variational Autoencoders (VAE) in the model. This deep knowledge tracing model employs Long Short-Term Memory (LSTM) networks and attention mechanisms to integrate the embedding vectors of exercises and student information. Extensive experiments are conducted on two real-world programming datasets. The results indicate that GPPKT outperforms state-of-the-art methods, achieving an AUC of 0.8840 and an accuracy of 0.8472 on the Luogu dataset, and an AUC of 0.7770 and an accuracy of 0.8799 on the Codeforces dataset. This demonstrates the superiority of the proposed model, with an average improvement of 9.03% in AUC and 2.02% in accuracy across both datasets. Full article
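To make the pipeline described in the abstract concrete, here is a minimal, illustrative sketch (not the authors' GPPKT code): a small VAE summarizes a student's personalized answer sequence into a latent vector, which is concatenated with exercise embeddings and fed to an LSTM that predicts per-step correctness. All class names and dimensions are hypothetical.

```python
# Illustrative sketch only: VAE latent from an answer history feeding an LSTM tracer.
import torch
import torch.nn as nn

class AnswerSequenceVAE(nn.Module):
    def __init__(self, seq_dim, latent_dim=16):
        super().__init__()
        self.enc = nn.Linear(seq_dim, 64)
        self.mu = nn.Linear(64, latent_dim)
        self.logvar = nn.Linear(64, latent_dim)
        self.dec = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, seq_dim))

    def forward(self, x):
        h = torch.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        return self.dec(z), mu, logvar, z

class SimpleKnowledgeTracer(nn.Module):
    def __init__(self, n_exercises, latent_dim=16, emb_dim=32, hidden=64):
        super().__init__()
        self.exercise_emb = nn.Embedding(n_exercises, emb_dim)
        self.lstm = nn.LSTM(emb_dim + latent_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)

    def forward(self, exercise_ids, student_latent):
        emb = self.exercise_emb(exercise_ids)                     # (B, T, emb_dim)
        z = student_latent.unsqueeze(1).expand(-1, emb.size(1), -1)
        h, _ = self.lstm(torch.cat([emb, z], dim=-1))
        return torch.sigmoid(self.out(h)).squeeze(-1)             # P(correct) per step

# Toy usage with random data (hypothetical sizes).
vae = AnswerSequenceVAE(seq_dim=50)
tracer = SimpleKnowledgeTracer(n_exercises=200)
answers = torch.rand(8, 50)                    # 8 students, 50-step answer history each
_, _, _, z = vae(answers)
p = tracer(torch.randint(0, 200, (8, 20)), z)  # predictions for 20 future exercises
print(p.shape)                                 # torch.Size([8, 20])
```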

19 pages, 2777 KiB  
Article
Generative Models Utilizing Padding Can Efficiently Integrate and Generate Multi-Omics Data
by Hyeon-Su Lee, Seung-Hwan Hong, Gwan-Heon Kim, Hye-Jin You, Eun-Young Lee, Jae-Hwan Jeong, Jin-Woo Ahn and June-Hyuk Kim
AI 2024, 5(3), 1614-1632; https://doi.org/10.3390/ai5030078 - 5 Sep 2024
Viewed by 193
Abstract
Technological advances in information-processing capacity have enabled integrated analyses (multi-omics) of different omics data types, improving target discovery and clinical diagnosis. This study proposes novel artificial intelligence (AI) learning strategies for incomplete datasets, common in omics research. The model comprises (1) a multi-omics generative model based on a variational auto-encoder that learns tumor genetic patterns based on different omics data types and (2) an expanded classification model that predicts cancer phenotypes. Padding was applied to replace missing data with virtual data. The embedding data generated by the model accurately classified cancer phenotypes, addressing the class imbalance issue (weighted F1 score: cancer type > 0.95, primary site > 0.92, sample type > 0.97). The classification performance was maintained in the absence of omics data, and the virtual data resembled actual omics data (cosine similarity mRNA gene expression > 0.96, mRNA isoform expression > 0.95, DNA methylation > 0.96). Meanwhile, in the presence of omics data, high-quality, non-existent omics data were generated (cosine similarity mRNA gene expression: 0.9702, mRNA isoform expression: 0.9546, DNA methylation: 0.9687). This model can effectively classify cancer phenotypes based on incomplete omics data with data sparsity robustness, generating omics data through deep learning and enabling precision medicine. Full article
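A minimal sketch of the padding idea under assumed block names and sizes (nothing below is taken from the paper): each sample's omics blocks are concatenated into one fixed-length vector, and any missing block is replaced with virtual padded values so that every sample has the same input size for the generative model.

```python
# Padding sketch: missing omics blocks are filled with virtual values (assumed sizes).
import numpy as np

BLOCK_SIZES = {"mrna": 100, "isoform": 80, "methylation": 60}  # hypothetical dimensions
PAD_VALUE = 0.0

def assemble_input(sample: dict) -> np.ndarray:
    """Concatenate available omics blocks, padding the missing ones."""
    parts = []
    for name, size in BLOCK_SIZES.items():
        block = sample.get(name)
        if block is None:                      # this omics type was not measured
            block = np.full(size, PAD_VALUE)   # virtual data stands in
        parts.append(np.asarray(block, dtype=np.float32))
    return np.concatenate(parts)

# A sample missing DNA methylation still yields a fixed-length vector.
x = assemble_input({"mrna": np.random.rand(100), "isoform": np.random.rand(80)})
print(x.shape)  # (240,)
```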

19 pages, 1785 KiB  
Article
Representing the Information of Multiplayer Online Battle Arena (MOBA) Video Games Using Convolutional Accordion Auto-Encoder (A2E) Enhanced by Attention Mechanisms
by José A. Torres-León, Marco A. Moreno-Armendáriz and Hiram Calvo
Mathematics 2024, 12(17), 2744; https://doi.org/10.3390/math12172744 - 3 Sep 2024
Viewed by 395
Abstract
In this paper, we propose a representation of the visual information about Multiplayer Online Battle Arena (MOBA) video games using an adapted unsupervised deep learning architecture called Convolutional Accordion Auto-Encoder (Conv_A2E). Our study includes a presentation of current representations of MOBA video game information and why our proposal offers a novel and useful solution to this task. This approach aims to achieve dimensionality reduction and refined feature extraction of the visual data. To enhance the model’s performance, we tested several attention mechanisms for computer vision, evaluating algorithms from the channel attention and spatial attention families, and their combination. Through experimentation, we found that the best reconstruction of the visual information with the Conv_A2E was achieved when using a spatial attention mechanism, deformable convolution, as its mean squared error (MSE) during testing was the lowest, reaching a value of 0.003893, which means that its dimensionality reduction is the most general and representative for this case study. This paper presents one of the first approaches to applying attention mechanisms to the case study of MOBA video games, representing a new horizon of possibilities for research. Full article
(This article belongs to the Special Issue Mathematical Optimization and Control: Methods and Applications)
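For orientation, a rough sketch of the general pattern (a plain convolutional autoencoder with a CBAM-style spatial-attention block, not the published accordion architecture or its deformable-convolution variant), trained with the same MSE reconstruction objective:

```python
# Generic conv autoencoder + spatial attention sketch; layer sizes are assumptions.
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)
        mx, _ = x.max(dim=1, keepdim=True)
        att = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * att  # reweight the feature map spatially

class ConvAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.attention = SpatialAttention()
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1), nn.Sigmoid())

    def forward(self, x):
        return self.decoder(self.attention(self.encoder(x)))

model = ConvAE()
frame = torch.rand(4, 3, 64, 64)          # stand-in for downscaled game frames
loss = nn.functional.mse_loss(model(frame), frame)
print(loss.item())
```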

21 pages, 6847 KiB  
Article
Hyperspectral Anomaly Detection Based on Spectral Similarity Variability Feature
by Xueyuan Li and Wenjing Shang
Sensors 2024, 24(17), 5664; https://doi.org/10.3390/s24175664 - 30 Aug 2024
Viewed by 223
Abstract
In traditional methods for hyperspectral anomaly detection, spectral feature mapping is used to map hyperspectral data to a high-level feature space in which different ground objects are more easily distinguished. However, the uncertainty in the mapping direction makes the mapped features ineffective in distinguishing anomalous targets from the background. To address this problem, a hyperspectral anomaly detection algorithm based on the spectral similarity variability feature (SSVF) is proposed. First, the high-dimensional similar neighborhoods are fused into similar features using AE networks, and then the SSVF is obtained using a residual autoencoder. Finally, the detection result for the SSVF is obtained using the Reed-Xiaoli (RX) detector. Compared with the most accurate of the comparison algorithms, the overall detection accuracy (AUC_ODP) of the SSVF-RX algorithm is increased by 0.2106. The experimental results show that the SSVF has great advantages in both highlighting anomalous targets and improving separability between different ground objects. Full article
(This article belongs to the Special Issue Advanced Optical Sensors Based on Machine Learning)
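The final scoring stage is the classical part of the pipeline; a short sketch of the Reed-Xiaoli (RX) detector applied to a feature cube (random data stands in here for the SSVF features) might look like this:

```python
# RX detector sketch: per-pixel Mahalanobis distance from the background statistics.
import numpy as np

def rx_detector(cube: np.ndarray) -> np.ndarray:
    """cube: (H, W, B) feature cube; returns an (H, W) anomaly score map."""
    h, w, b = cube.shape
    pixels = cube.reshape(-1, b)
    mean = pixels.mean(axis=0)
    cov = np.cov(pixels, rowvar=False) + 1e-6 * np.eye(b)    # regularized covariance
    inv_cov = np.linalg.inv(cov)
    diff = pixels - mean
    scores = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)   # Mahalanobis distances
    return scores.reshape(h, w)

scores = rx_detector(np.random.rand(100, 100, 20))           # random stand-in features
print(scores.shape, scores.max())
```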

26 pages, 27118 KiB  
Article
A Denoising Method Based on DDPM for Radar Emitter Signal Intra-Pulse Modulation Classification
by Shibo Yuan, Peng Li, Xu Zhou, Yingchao Chen and Bin Wu
Remote Sens. 2024, 16(17), 3215; https://doi.org/10.3390/rs16173215 - 30 Aug 2024
Viewed by 330
Abstract
Accurately classifying the intra-pulse modulations of radar emitter signals is important for radar systems and can provide necessary information for relevant military command strategy and decision making. As strong additive white Gaussian noise (AWGN) leads to a lower signal-to-noise ratio (SNR) of received signals, which results in poor classification accuracy for classification models based on deep neural networks (DNNs), in this paper we propose an effective denoising method based on a denoising diffusion probabilistic model (DDPM) to increase the quality of signals. Trained with denoised signals, classification models can classify samples denoised by our method with better accuracy. Experiments are conducted under three different conditions with three DNN classification models that use different modal inputs, comparing undenoised data, data denoised by a convolutional denoising auto-encoder (CDAE), and data denoised by our method. The extensive experimental results indicate that our proposed method can denoise samples with lower SNR values and is more effective at increasing the accuracy of DNN classification models for radar emitter signal intra-pulse modulations, where the average accuracy is increased by around 3 to 22 percentage points across the three conditions. Full article
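As background, a toy sketch of the DDPM training objective on 1-D signal segments (an assumed setup with a trivial MLP noise predictor, not the paper's network): a clean segment is noised to a random timestep with the closed-form forward process, and the model learns to predict the injected Gaussian noise.

```python
# DDPM training-step sketch on 1-D signals; schedule and model are illustrative only.
import torch
import torch.nn as nn

T = 100
betas = torch.linspace(1e-4, 0.02, T)
alphas_bar = torch.cumprod(1.0 - betas, dim=0)              # cumulative \bar{alpha}_t

class NoisePredictor(nn.Module):
    def __init__(self, length=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(length + 1, 512), nn.ReLU(), nn.Linear(512, length))

    def forward(self, x_t, t):
        t_feat = (t.float() / T).unsqueeze(-1)               # crude timestep embedding
        return self.net(torch.cat([x_t, t_feat], dim=-1))

model = NoisePredictor()
x0 = torch.randn(16, 256)                                    # stand-in clean signal segments
t = torch.randint(0, T, (16,))
eps = torch.randn_like(x0)
ab = alphas_bar[t].unsqueeze(-1)
x_t = ab.sqrt() * x0 + (1 - ab).sqrt() * eps                 # forward noising q(x_t | x_0)
loss = nn.functional.mse_loss(model(x_t, t), eps)            # predict the injected noise
loss.backward()
print(loss.item())
```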

19 pages, 5145 KiB  
Article
Variational Autoencoders for Network Lifetime Enhancement in Wireless Sensors
by Boopathi Chettiagounder Sengodan, Prince Mary Stanislaus, Sivakumar Sabapathy Arumugam, Dipak Kumar Sah, Dharmesh Dhabliya, Poongodi Chenniappan, James Deva Koresh Hezekiah and Rajagopal Maheswar
Sensors 2024, 24(17), 5630; https://doi.org/10.3390/s24175630 - 30 Aug 2024
Viewed by 269
Abstract
Wireless sensor networks (WSNs) are structured for monitoring an area with distributed sensors and built-in batteries. However, most of their battery energy is consumed during the data transmission process. In recent years, several methodologies, like routing optimization, topology control, and sleep scheduling algorithms, have been introduced to improve the energy efficiency of WSNs. This study introduces a novel method based on a deep learning approach that utilizes variational autoencoders (VAEs) to improve the energy efficiency of WSNs by compressing transmission data. The VAE approach is customized in this work for compressing WSN data by retaining its important features. This is achieved by analyzing the statistical structure of the sensor data rather than providing a fixed-size latent representation. The performance of the proposed model is verified using a MATLAB simulation platform, integrating a pre-trained variational autoencoder model with openly available wireless sensor data. The performance of the proposed model is found to be satisfactory in comparison to traditional methods, like the compressed sensing technique, lightweight temporal compression, and the autoencoder, in terms of having an average compression rate of 1.5572. The WSN simulation also indicates that the VAE-incorporated architecture attains a maximum network lifetime of 1491 s and suggests that VAE could be used for compression-based transmission using WSNs, as its reconstruction rate is 0.9902, which is better than results from all the other techniques. Full article
(This article belongs to the Section Sensor Networks)
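A minimal sketch of the compression idea under assumed sizes (48-sample windows compressed to 12 latent values; not the paper's model or data): the node-side encoder produces the short latent vector that would actually be transmitted, and the sink-side decoder reconstructs the window, so the compression rate is the ratio of the two sizes.

```python
# Toy VAE compression sketch for sensor windows; all sizes and data are assumptions.
import torch
import torch.nn as nn

WINDOW, LATENT = 48, 12

class SensorVAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Linear(WINDOW, 32)
        self.mu, self.logvar = nn.Linear(32, LATENT), nn.Linear(32, LATENT)
        self.dec = nn.Sequential(nn.Linear(LATENT, 32), nn.ReLU(), nn.Linear(32, WINDOW))

    def forward(self, x):
        h = torch.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return self.dec(z), mu, logvar

def make_batch(n=256):  # synthetic periodic sensor windows with noise
    t = torch.linspace(0, 4 * 3.14159, WINDOW)
    phase = torch.rand(n, 1) * 6.28
    return torch.sin(t + phase) + 0.05 * torch.randn(n, WINDOW)

vae = SensorVAE()
opt = torch.optim.Adam(vae.parameters(), lr=1e-3)
for step in range(2000):
    x = make_batch()
    recon, mu, logvar = vae(x)
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    loss = nn.functional.mse_loss(recon, x) + 1e-3 * kld
    opt.zero_grad(); loss.backward(); opt.step()

x = make_batch(1)
recon, _, _ = vae(x)                         # only the 12 latent values would be sent
cos = nn.functional.cosine_similarity(recon, x).item()
print(f"compression rate: {WINDOW / LATENT:.1f}, reconstruction similarity: {cos:.3f}")
```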

9 pages, 1964 KiB  
Article
Deciphering Rod Pump Anomalies: A Deep Learning Autoencoder Approach
by Cai Wang, He Ma, Xishun Zhang, Xiaolong Xiang, Junfeng Shi, Xingyuan Liang, Ruidong Zhao and Guoqing Han
Processes 2024, 12(9), 1845; https://doi.org/10.3390/pr12091845 - 29 Aug 2024
Viewed by 296
Abstract
This paper investigates the application of an autoencoder neural network to oilfield rod pump anomaly detection. Rod pumps are critical equipment in oilfield production engineering, and their stability and reliability are crucial to production efficiency and economic benefits. However, rod pumps are often affected by anomalies such as wax deposition, leading to increased maintenance costs and production interruptions. Traditional wax deposition detection methods are inefficient and fail to provide early warning capabilities. This paper reviews the research progress in sucker rod pump anomaly detection and autoencoder neural networks, providing a detailed description of the construction and training process of the autoencoder neural network model. Utilizing data from the rod-pumped wells of the Tuha oilfield in China, this study achieves the automatic recognition of various anomalies through data preprocessing and the training of an autoencoder model. This study also includes a comparative analysis of the differences in anomaly detection performance between the autoencoder and traditional methods and verifies the effectiveness and superiority of the proposed method. Full article
(This article belongs to the Section Advanced Digital and Other Processes)
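The underlying detection pattern here is standard reconstruction-error thresholding; a generic sketch (not the authors' exact model or features) trains an autoencoder on normal cycles only and flags samples whose reconstruction error exceeds a 3-sigma threshold derived from the training errors.

```python
# Generic autoencoder anomaly-detection sketch; data and dimensions are synthetic.
import torch
import torch.nn as nn

class DenseAE(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 16), nn.ReLU(), nn.Linear(16, dim))

    def forward(self, x):
        return self.net(x)

def reconstruction_errors(model, x):
    with torch.no_grad():
        return ((model(x) - x) ** 2).mean(dim=1).numpy()

# Hypothetical "normal" pump-cycle features living on a low-dimensional manifold.
torch.manual_seed(0)
normal = torch.rand(500, 8) @ torch.rand(8, 64)

ae = DenseAE()
opt = torch.optim.Adam(ae.parameters(), lr=1e-3)
for _ in range(2000):                                    # train on normal cycles only
    loss = nn.functional.mse_loss(ae(normal), normal)
    opt.zero_grad(); loss.backward(); opt.step()

train_err = reconstruction_errors(ae, normal)
threshold = train_err.mean() + 3 * train_err.std()       # 3-sigma rule on training errors
anomalous = normal[:1] + 2.0 * torch.randn(1, 64)        # a cycle hit by a large disturbance
err = reconstruction_errors(ae, anomalous)[0]
print(f"error {err:.3f} vs threshold {threshold:.3f} ->",
      "anomaly" if err > threshold else "normal")
```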

26 pages, 9607 KiB  
Article
A Global Spatial-Spectral Feature Fused Autoencoder for Nonlinear Hyperspectral Unmixing
by Mingle Zhang, Mingyu Yang, Hongyu Xie, Pinliang Yue, Wei Zhang, Qingbin Jiao, Liang Xu and Xin Tan
Remote Sens. 2024, 16(17), 3149; https://doi.org/10.3390/rs16173149 - 26 Aug 2024
Viewed by 297
Abstract
Hyperspectral unmixing (HU) aims to decompose mixed pixels into a set of endmembers and corresponding abundances. Deep learning-based HU methods are currently a hot research topic, but most existing unmixing methods still rely on per-pixel training or employ convolutional neural networks (CNNs), which overlook the non-local correlations of materials and spectral characteristics. Furthermore, current research mainly focuses on linear mixing models, which limits the feature extraction capability of deep encoders and further improvement in unmixing accuracy. In this paper, we propose a nonlinear unmixing network capable of extracting global spatial-spectral features. The network is designed based on an autoencoder architecture, where a dual-stream CNN is employed in the encoder to separately extract spectral and local spatial information. The extracted features are then fused together to form a more complete representation of the input data. Subsequently, a linear projection-based multi-head self-attention mechanism is applied to capture global contextual information, allowing for comprehensive spatial information extraction while maintaining lightweight computation. To achieve better reconstruction performance, a model-free nonlinear mixing approach is adopted to enhance the model’s universality, with the mixing model learned entirely from the data. Additionally, an initialization method based on endmember bundles is utilized to reduce interference from outliers and noise. Comparative results on real datasets against several state-of-the-art unmixing methods demonstrate the superiority of the proposed approach. Full article
(This article belongs to the Topic Computer Vision and Image Processing, 2nd Edition)
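As background, the standard autoencoder formulation of unmixing that such networks build on (a linear-mixing baseline, not the proposed nonlinear, attention-based network) can be sketched as follows: the encoder maps each pixel spectrum to sum-to-one abundances, and the decoder is a single bias-free linear layer whose weight columns act as endmember spectra.

```python
# Baseline unmixing-autoencoder sketch; band/endmember counts are assumptions.
import torch
import torch.nn as nn

BANDS, ENDMEMBERS = 120, 4

class UnmixingAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(BANDS, 64), nn.ReLU(), nn.Linear(64, ENDMEMBERS))
        self.decoder = nn.Linear(ENDMEMBERS, BANDS, bias=False)  # weight columns = endmembers

    def forward(self, x):
        abundances = torch.softmax(self.encoder(x), dim=-1)      # non-negative, sum to one
        return self.decoder(abundances), abundances

model = UnmixingAE()
pixels = torch.rand(1024, BANDS)               # stand-in for image pixel spectra
recon, abundances = model(pixels)
loss = nn.functional.mse_loss(recon, pixels)   # reconstruction objective
print(abundances.sum(dim=-1)[:3])              # each pixel's abundances sum to 1
```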

28 pages, 8909 KiB  
Article
A Novel Data-Driven Approach with a Long Short-Term Memory Autoencoder Model with a Multihead Self-Attention Deep Learning Model for Wind Turbine Converter Fault Detection
by Joel Torres-Cabrera, Jorge Maldonado-Correa, Marcelo Valdiviezo-Condolo, Estefanía Artigao, Sergio Martín-Martínez and Emilio Gómez-Lázaro
Appl. Sci. 2024, 14(17), 7458; https://doi.org/10.3390/app14177458 - 23 Aug 2024
Viewed by 373
Abstract
The imminent depletion of oil resources and increasing environmental pollution have driven the use of clean energy, particularly wind energy. However, wind turbines (WTs) face significant challenges, such as critical component failures, which can cause unexpected shutdowns and affect energy production. To address this challenge, we analyzed Supervisory Control and Data Acquisition (SCADA) data to identify significant changes in the relationships between variables based on the reconstruction errors between actual and predicted values. This study proposes a hybrid long short-term memory autoencoder model with multihead self-attention (LSTM-MA-AE) for WT converter fault detection. The proposed model identifies anomalies in the data by comparing the reconstruction errors of the variables involved. However, anomaly detection alone is not sufficient. To address this limitation, we developed a fault prediction system that employs an adaptive threshold with an Exponentially Weighted Moving Average (EWMA) and a fixed threshold. This system analyzes the anomalies of several variables and generates fault warnings ahead of time. Thus, we propose an outlier detection method based on data preprocessing and unsupervised learning, using SCADA data collected from a wind farm located in complex terrain, including real faults in the converter. The LSTM-MA-AE is shown to be able to predict converter failure 3.3 months in advance, with an F1 score greater than 90% in the tests performed. The results provide evidence of the potential of the proposed model to improve converter fault diagnosis with SCADA data in complex environments, highlighting its ability to increase the reliability and efficiency of WTs. Full article
(This article belongs to the Special Issue Advanced Forecasting Techniques and Methods for Energy Systems)
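The adaptive-threshold component can be illustrated with a standard EWMA control chart on reconstruction errors, combined with a fixed ceiling (the parameters and synthetic error stream below are assumptions, not the paper's settings):

```python
# EWMA control chart on reconstruction errors plus a fixed threshold; illustrative only.
import numpy as np

def ewma_fault_warnings(errors, healthy, lam=0.1, L=3.0, fixed_threshold=1.5):
    """Flag indices where the EWMA statistic or the raw error exceeds its limit."""
    mu0, sigma0 = healthy.mean(), healthy.std()
    ucl = mu0 + L * sigma0 * np.sqrt(lam / (2 - lam))   # EWMA upper control limit
    z, warnings = mu0, []
    for t, e in enumerate(errors):
        z = lam * e + (1 - lam) * z                     # exponentially weighted average
        if z > ucl or e > fixed_threshold:
            warnings.append(t)
    return warnings

rng = np.random.default_rng(0)
healthy = 0.5 + 0.05 * rng.normal(size=500)             # errors from a healthy reference period
stream = np.concatenate([0.5 + 0.05 * rng.normal(size=200),
                         np.linspace(0.6, 3.0, 50)])    # slow degradation ramp starts at index 200
print(ewma_fault_warnings(stream, healthy)[:5])         # earliest warning indices
```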

19 pages, 5132 KiB  
Article
Synthetic Face Discrimination via Learned Image Compression
by Sofia Iliopoulou, Panagiotis Tsinganos, Dimitris Ampeliotis and Athanassios Skodras
Algorithms 2024, 17(9), 375; https://doi.org/10.3390/a17090375 - 23 Aug 2024
Viewed by 280
Abstract
The emergence of deep learning has sparked notable strides in the quality of synthetic media. Yet, as photorealism reaches new heights, the line between generated and authentic images blurs, raising concerns about the dissemination of counterfeit or manipulated content online. Consequently, there is a pressing need to develop automated tools capable of effectively distinguishing synthetic images, especially those portraying faces, which is one of the most commonly encountered issues. In this work, we propose a novel approach to synthetic face discrimination, leveraging deep learning-based image compression and predominantly utilizing the quality metrics of an image to determine its authenticity. Full article
(This article belongs to the Special Issue Algorithms for Image Processing and Machine Vision)

16 pages, 3374 KiB  
Article
P-CA: Privacy-Preserving Convolutional Autoencoder-Based Edge–Cloud Collaborative Computing for Human Behavior Recognition
by Haoda Wang, Chen Qiu, Chen Zhang, Jiantao Xu and Chunhua Su
Mathematics 2024, 12(16), 2587; https://doi.org/10.3390/math12162587 - 21 Aug 2024
Viewed by 488
Abstract
With the development of edge computing and deep learning, intelligent human behavior recognition has spawned extensive applications in smart worlds. However, current edge computing technology faces performance bottlenecks due to limited computing resources at the edge, which prevent deploying advanced deep neural networks. In addition, there is a risk of privacy leakage during interactions between the edge and the server. To tackle these problems, we propose an effective, privacy-preserving edge–cloud collaborative interaction scheme based on WiFi, named P-CA, for human behavior sensing. In our scheme, a convolutional autoencoder neural network is split into two parts. The shallow layers are deployed on the edge side for inference and privacy-preserving processing, while the deep layers are deployed on the server side to leverage its computing resources. Experimental results based on datasets collected from real testbeds demonstrate the effectiveness and considerable performance of the P-CA. The recognition accuracy is maintained at 88%, although it could reach about 94.8% without the mixing operation. In addition, the proposed P-CA achieves better recognition accuracy than two state-of-the-art methods, i.e., FedLoc and PPDFL, by 2.7% and 2.1%, respectively, while maintaining privacy. Full article
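The split-computation idea can be sketched generically (this is not the P-CA implementation; layer sizes, input shape, and the six behavior classes are hypothetical): the shallow layers run on the edge device and only their intermediate feature map is transmitted, so raw sensing data never leaves the device.

```python
# Edge/server model split sketch; all shapes and class counts are assumptions.
import torch
import torch.nn as nn

edge_part = nn.Sequential(                       # deployed on the edge device
    nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU())
server_part = nn.Sequential(                     # deployed on the server
    nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Flatten(), nn.Linear(16 * 8 * 8, 6))      # 6 hypothetical behavior classes

csi = torch.rand(1, 1, 32, 32)                   # stand-in for a WiFi sensing "image"
features = edge_part(csi)                        # this tensor is what gets transmitted
logits = server_part(features)
print(features.shape, logits.shape)              # torch.Size([1, 8, 16, 16]) torch.Size([1, 6])
```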

14 pages, 2930 KiB  
Article
Editable Co-Speech Gesture Synthesis Enhanced with Individual Representative Gestures
by Yihua Bao, Dongdong Weng and Nan Gao
Electronics 2024, 13(16), 3315; https://doi.org/10.3390/electronics13163315 - 21 Aug 2024
Viewed by 331
Abstract
Co-speech gesture synthesis is a challenging task due to the complexity and uncertainty between gestures and speech. Gestures that accompany speech (i.e., co-speech gestures) are an essential part of natural and efficient embodied human communication, as they work in tandem with speech to convey information more effectively. Although data-driven approaches have improved gesture synthesis, existing deep learning-based methods use deterministic modeling, which could lead to averaging out the predicted gestures. Additionally, these methods lack control over gesture generation, such as user editing of generated results. In this paper, we propose an editable gesture synthesis method based on a learned pose script, which disentangles gestures into individual representative and rhythmic gestures to produce high-quality, diverse and realistic poses. Specifically, we first detect the time of occurrence of gestures in video sequences and transform them into pose scripts. Regression models are then built to predict the pose scripts. Next, learned pose scripts are used for gesture synthesis, while rhythmic gestures are modeled using a variational auto-encoder and a one-dimensional convolutional network. Moreover, we introduce a large-scale Chinese co-speech gesture synthesis dataset with multimodal annotations for training and evaluation, which will be publicly available to facilitate future research. The proposed method allows for the re-editing of generated results by changing the pose scripts for applications such as interactive digital humans. The experimental results show that this method generates higher-quality, more diverse, and more realistic gestures than other existing methods. Full article
(This article belongs to the Section Artificial Intelligence)

19 pages, 5727 KiB  
Article
Stage-Aware Interaction Network for Point Cloud Completion
by Hang Wu and Yubin Miao
Electronics 2024, 13(16), 3296; https://doi.org/10.3390/electronics13163296 - 20 Aug 2024
Viewed by 440
Abstract
Point cloud completion aims to restore the full shapes of objects from partial scans, and a typical network pipeline is the AutoEncoder, which has coarse-to-fine refinement modules. Although existing approaches using this kind of architecture achieve promising results, they usually neglect the usage of shallow geometry features in partial inputs and the fusion of multi-stage features in the upsampling process, which prevents network performance from improving further. Therefore, in this paper, we propose a new method with dense interactions between different encoding and decoding steps. First, we introduce the Decoupled Multi-head Transformer (DMT), which implements and integrates semantic prediction and resolution upsampling in a unified network module and serves as a primary ingredient in our pipeline. Second, we propose an Encoding-aware Coarse Decoder (ECD) that compactly makes the top–down shape-decoding process interact with the bottom–up feature-encoding process to utilize both shallow and deep features of partial inputs for coarse point cloud generation. Third, we design a Stage-aware Refinement Group (SRG), which comprehensively understands local semantics from densely connected features across different decoding stages and gradually upsamples point clouds based on them. In general, the key contributions of our method are the DMT for joint semantic-resolution generation, the ECD for multi-scale feature fusion-based shape decoding, and the SRG for stage-aware shape refinement. Evaluations on two synthetic and three real-world datasets illustrate that our method achieves competitive performance compared with existing approaches. Full article
(This article belongs to the Section Artificial Intelligence)

29 pages, 2253 KiB  
Article
Clustering Molecules at a Large Scale: Integrating Spectral Geometry with Deep Learning
by Ömer Akgüller, Mehmet Ali Balcı and Gabriela Cioca
Molecules 2024, 29(16), 3902; https://doi.org/10.3390/molecules29163902 - 17 Aug 2024
Viewed by 678
Abstract
This study conducts an in-depth analysis of clustering small molecules using spectral geometry and deep learning techniques. We applied a spectral geometric approach to convert molecular structures into triangulated meshes and used the Laplace–Beltrami operator to derive significant geometric features. By examining the eigenvectors of these operators, we captured the intrinsic geometric properties of the molecules, aiding their classification and clustering. The research utilized four deep learning methods: Deep Belief Network, Convolutional Autoencoder, Variational Autoencoder, and Adversarial Autoencoder, each paired with k-means clustering at different cluster sizes. Clustering quality was evaluated using the Calinski–Harabasz and Davies–Bouldin indices, Silhouette Score, and standard deviation. Nonparametric tests were used to assess the impact of topological descriptors on clustering outcomes. Our results show that the DBN + k-means combination is the most effective, particularly at lower cluster counts, demonstrating significant sensitivity to structural variations. This study highlights the potential of integrating spectral geometry with deep learning for precise and efficient molecular clustering. Full article
(This article belongs to the Special Issue Deep Learning in Molecular Science and Technology)
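The clustering-evaluation loop is conventional; a small sketch with synthetic embeddings standing in for the learned molecular representations (using scikit-learn) shows how k-means is run at several cluster counts and scored with the same indices the study reports:

```python
# k-means sweep scored with Calinski-Harabasz, Davies-Bouldin, and Silhouette indices.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import calinski_harabasz_score, davies_bouldin_score, silhouette_score

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(500, 32))          # stand-in for autoencoder/DBN embeddings

for k in (3, 5, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(embeddings)
    print(k,
          round(calinski_harabasz_score(embeddings, labels), 1),
          round(davies_bouldin_score(embeddings, labels), 3),
          round(silhouette_score(embeddings, labels), 3))
```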

26 pages, 11215 KiB  
Article
Unsupervised Learning-Based Optical–Acoustic Fusion Interest Point Detector for AUV Near-Field Exploration of Hydrothermal Areas
by Yihui Liu, Yufei Xu, Ziyang Zhang, Lei Wan, Jiyong Li and Yinghao Zhang
J. Mar. Sci. Eng. 2024, 12(8), 1406; https://doi.org/10.3390/jmse12081406 - 15 Aug 2024
Viewed by 415
Abstract
The simultaneous localization and mapping (SLAM) technique provides long-term near-seafloor navigation for autonomous underwater vehicles (AUVs). However, the stability of the interest point detector (IPD) remains challenging in the seafloor environment. This paper proposes an optical–acoustic fusion interest point detector (OAF-IPD) using a monocular camera and forward-looking sonar. Unlike the artificial feature detectors most underwater IPDs adopt, a deep neural network model based on the unsupervised interest point detector (UnsuperPoint) was built to reach stronger environmental adaptation. First, a feature fusion module based on feature pyramid networks (FPNs) and a depth module were integrated into the system to ensure a uniform distribution of interest points in depth for improved localization accuracy. Second, a self-supervised training procedure was developed to adapt the OAF-IPD for unsupervised training. This procedure included an auto-encoder framework for the sonar data encoder, a ground truth depth generation framework for the depth module, and optical–acoustic mutual supervision for the fusion module training. Third, a non-rigid feature filter was implemented in the camera data encoder to mitigate the interference from non-rigid structural objects, such as smoke emitted from active vents in hydrothermal areas. Evaluations were conducted using open-source datasets as well as a dataset captured by the research team of this paper from pool experiments to prove the robustness and accuracy of the newly proposed method. Full article
(This article belongs to the Section Ocean Engineering)
