MDPI - Publisher of Open Access Journals

22 pages, 758 KiB

Open Access Article

A Hierarchical Machine Learning Method for Detection and Visualization of Network Intrusions from Big Data

by Jinrong Wu, Su Nguyen, Thimal Kempitiya and Damminda Alahakoon

Technologies 2024, 12(10), 204; https://doi.org/10.3390/technologies12100204 - 17 Oct 2024

Viewed by 300

Machine learning is regarded as an effective approach in network intrusion detection, and has gained significant attention in recent studies. However, few intrusion detection methods have been successfully applied to detect anomalies in large-scale network traffic data, and low explainability of the complex algorithms has caused concerns about fairness and accountability. A further problem is that many intrusion detection systems need to work with distributed data sources in the cloud. In this paper, we propose an intrusion detection method based on distributed computing to learn the latent representations from large-scale network data with lower computation time while improving the intrusion detection accuracy. Our proposed classifier, based on a novel hierarchical algorithm combining adaptability and visualization ability from a self-structured unsupervised learning algorithm and achieving explainability from self-explainable supervised algorithms, is able to enhance the understanding of the model and data. The experimental results show that our proposed method is effective, efficient, and scalable in capturing the network traffic patterns and detecting detailed network intrusion information such as type of attack with high detection performance, and is an ideal method to be applied in cloud-computing environments.

(This article belongs to the Special Issue Data Science and Big Data in Biology, Physical Science and Engineering II)

27 pages, 1069 KiB

Open Access Article

Fractional Derivative to Symmetrically Extend the Memory of Fuzzy C-Means

by Safaa Safouan, Karim El Moutaouakil and Alina-Mihaela Patriciu

Symmetry 2024, 16(10), 1353; https://doi.org/10.3390/sym16101353 - 12 Oct 2024

Viewed by 309

Abstract

The fuzzy C-means (FCM) clustering algorithm is a widely used unsupervised learning method known for its ability to identify natural groupings within datasets. While effective in many cases, FCM faces challenges such as sensitivity to initial cluster assignments, slow convergence, and difficulty in handling non-linear and overlapping clusters. To address these limitations, this paper introduces a novel fractional fuzzy C-means (Frac-FCM) algorithm, which incorporates fractional derivatives into the FCM framework. By capturing non-local dependencies and long memory effects, fractional derivatives offer a more flexible and precise representation of data relationships, making the method more suitable for complex datasets. Additionally, a genetic algorithm (GA) is employed to optimize a new least-squares objective function that emphasizes the geometric properties of clusters, particularly focusing on the Fukuyama–Sugeno and Xie–Beni indices, thereby enhancing the balance between cluster compactness and separation. The Frac-FCM algorithm is evaluated on several benchmark datasets, including Iris, Seed, and Statlog, and compared against traditional methods such as K-means, SOM, GMM, and FCM. The results indicate that Frac-FCM consistently outperforms these methods in terms of the Silhouette and Dunn indices. For instance, Frac-FCM achieves higher Silhouette scores in most cases, indicating more distinct and well-separated clusters. Dunn’s index further shows that Frac-FCM generates clusters that are better separated, surpassing the performance of traditional methods. These findings highlight the robustness and superior clustering performance of Frac-FCM. The Friedman test was employed to validate the effectiveness of Frac-FCM.

(This article belongs to the Section Computer)
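
For readers unfamiliar with the baseline that the Frac-FCM abstract builds on, the following is a minimal sketch of one iteration of standard fuzzy C-means, not the fractional variant proposed in the paper. The function name fcm_step, the toy points, and the initial centers are all hypothetical; the fuzzifier m = 2 is a common default and is an assumption here.

```python
import math

def fcm_step(points, centers, m=2.0):
    """One iteration of standard fuzzy C-means: update memberships, then centers."""
    memberships = []
    for x in points:
        dists = [math.dist(x, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # A point lying exactly on a center is assigned fully to it.
            row = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(row)
            row = [r / total for r in row]
        else:
            power = 2.0 / (m - 1.0)
            row = [1.0 / sum((d_i / d_k) ** power for d_k in dists)
                   for d_i in dists]
        memberships.append(row)

    dim = len(points[0])
    new_centers = []
    for j in range(len(centers)):
        # Each center is the mean of all points, weighted by membership ** m.
        weights = [u[j] ** m for u in memberships]
        total = sum(weights)
        new_centers.append(tuple(
            sum(w * x[d] for w, x in zip(weights, points)) / total
            for d in range(dim)
        ))
    return memberships, new_centers

# Hypothetical toy data: two well-separated groups in the plane.
points = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9)]
centers = [(1.0, 1.0), (4.0, 4.0)]
for _ in range(20):
    memberships, centers = fcm_step(points, centers)
```

Because every point keeps a graded membership in every cluster, the result depends on the starting centers, which is the sensitivity to initialization that the abstract mentions.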

66 pages, 1555 KiB

Open Access Article

Extracting Sentence Embeddings from Pretrained Transformer Models

by Lukas Stankevičius and Mantas Lukoševičius

Appl. Sci. 2024, 14(19), 8887; https://doi.org/10.3390/app14198887 - 2 Oct 2024

Viewed by 626

Abstract

Pre-trained transformer models shine in many natural language processing tasks and therefore are expected to bear the representation of the input sentence or text meaning. These sentence-level embeddings are also important in retrieval-augmented generation. But do commonly used plain averaging or prompt templates sufficiently capture and represent the underlying meaning? After providing a comprehensive review of existing sentence embedding extraction and refinement methods, we thoroughly test different combinations and our original extensions of the most promising ones on pretrained models. Namely, given 110 M parameters, BERT’s hidden representations from multiple layers, and many tokens, we try diverse ways to extract optimal sentence embeddings. We test various token aggregation and representation post-processing techniques. We also test multiple ways of using a general Wikitext dataset to complement BERT’s sentence embeddings. All methods are tested on eight Semantic Textual Similarity (STS), six short text clustering, and twelve classification tasks. We also evaluate our representation-shaping techniques on other static models, including random token representations. Proposed representation extraction methods improve the performance on STS and clustering tasks for all models considered. Very high improvements for static token-based models, especially random embeddings for STS tasks, almost reach the performance of BERT-derived representations. Our work shows that the representation-shaping techniques significantly improve sentence embeddings extracted from BERT-based and simple baseline models.

(This article belongs to the Special Issue Advances in Large Language Models: Techniques, Applications and Challenges)
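
The "plain averaging" baseline questioned in the abstract above can be illustrated with a minimal sketch: token vectors are averaged into one sentence vector, and two sentences are compared by cosine similarity. The three-dimensional token vectors below are made-up stand-ins for real model outputs, and the helper names mean_pool and cosine are hypothetical.

```python
import math

def mean_pool(token_vectors):
    """Average a list of equal-length token vectors into one sentence vector."""
    n = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(vec[d] for vec in token_vectors) / n for d in range(dim)]

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical token vectors (a real model would produce hundreds of dimensions).
sentence_a = [[1.0, 0.0, 2.0], [3.0, 0.0, 0.0]]
sentence_b = [[2.0, 0.0, 1.0]]
sentence_c = [[0.0, 1.0, 0.0]]

emb_a = mean_pool(sentence_a)
emb_b = mean_pool(sentence_b)
emb_c = mean_pool(sentence_c)

sim_ab = cosine(emb_a, emb_b)
sim_ac = cosine(emb_a, emb_c)
```

Averaging discards token order and weights every token equally, which is one reason alternative aggregation and post-processing steps are worth testing.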

18 pages, 2584 KiB

Open Access Article

Robust Remote Sensing Scene Interpretation Based on Unsupervised Domain Adaptation

by Linjuan Li, Haoxue Zhang, Gang Xie and Zhaoxiang Zhang

Electronics 2024, 13(18), 3709; https://doi.org/10.3390/electronics13183709 - 19 Sep 2024

Viewed by 788

Abstract

Deep learning models excel in interpreting the exponentially growing amounts of remote sensing data; however, they are susceptible to deception and spoofing by adversarial samples, posing catastrophic threats. The existing methods to combat adversarial samples have limited performance in robustness and efficiency, particularly in complex remote sensing scenarios. To tackle these challenges, an unsupervised domain adaptation algorithm is proposed for the accurate identification of clean images and adversarial samples by exploring a robust generative adversarial classification network that can harmonize the features between clean images and adversarial samples to minimize distribution discrepancies. Furthermore, linear polynomial loss as a replacement for cross-entropy loss is integrated to guide robust representation learning. Additionally, we leverage the fast gradient sign method (FGSM) and projected gradient descent (PGD) algorithms to generate adversarial samples with varying perturbation amplitudes to assess model robustness. A series of experiments was performed on the RSSCN7 dataset and SIRI-WHU dataset. Our experimental results illustrate that the proposed algorithm performs exceptionally well in classifying clean images while demonstrating robustness against adversarial perturbations.

(This article belongs to the Section Artificial Intelligence)
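
As a rough illustration of how FGSM, named in the abstract above, produces an adversarial sample, the sketch below applies it to a tiny logistic-regression classifier rather than to a deep remote sensing model. The weights, input, and perturbation amplitude eps are hypothetical values chosen so that the effect is visible. PGD, roughly speaking, repeats smaller steps of this kind and projects the result back into the allowed perturbation range.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, eps):
    """Move x by eps along the sign of the loss gradient with respect to x."""
    p = predict(w, b, x)
    # For binary cross-entropy with a sigmoid output, dLoss/dx = (p - y) * w.
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Hypothetical model and input: the clean sample is classified as class 1.
w, b = [2.0, -1.0], 0.0
x, y = [0.5, 0.2], 1
eps = 0.5

x_adv = fgsm(w, b, x, y, eps)
p_clean = predict(w, b, x)
p_adv = predict(w, b, x_adv)
```

Varying eps corresponds to the "varying perturbation amplitudes" used to probe robustness: larger values change the input more and usually flip more predictions.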

21 pages, 1072 KiB

Open Access Article

Community Detection Using Deep Learning: Combining Variational Graph Autoencoders with Leiden and K-Truss Techniques

by Jyotika Hariom Patil, Petros Potikas, William B. Andreopoulos and Katerina Potika

Information 2024, 15(9), 568; https://doi.org/10.3390/info15090568 - 16 Sep 2024

Viewed by 634

Abstract

Deep learning struggles with unsupervised tasks like community detection in networks. This work proposes the Enhanced Community Detection with Structural Information VGAE (VGAE-ECF) method, which enhances variational graph autoencoders (VGAEs) for community detection in large networks. It incorporates community structure information and edge weights alongside traditional network data. This combined input leads to improved latent representations for community identification via K-means clustering. We perform experiments and show that our method works better than previous community-aware VGAE approaches.

(This article belongs to the Special Issue Optimization Algorithms and Their Applications)

15 pages, 12772 KiB

Open Access Article

Learning Unsupervised Cross-Domain Model for TIR Target Tracking

by Xiu Shu, Feng Huang, Zhaobing Qiu, Xinming Zhang and Di Yuan

Mathematics 2024, 12(18), 2882; https://doi.org/10.3390/math12182882 - 15 Sep 2024

Viewed by 357

Abstract

The limited availability of thermal infrared (TIR) training samples leads to suboptimal target representation by convolutional feature extraction networks, which adversely impacts the accuracy of TIR target tracking methods. To address this issue, we propose an unsupervised cross-domain model (UCDT) for TIR tracking. Our approach leverages labeled training samples from the RGB domain (source domain) to train a general feature extraction network. We then employ a cross-domain model to adapt this network for effective target feature extraction in the TIR domain (target domain). This cross-domain strategy addresses the challenge of limited TIR training samples effectively. Additionally, we utilize an unsupervised learning technique to generate pseudo-labels for unlabeled training samples in the source domain, which helps overcome the limitations imposed by the scarcity of annotated training data. Extensive experiments demonstrate that our UCDT tracking method outperforms existing tracking approaches on the PTB-TIR and LSOTB-TIR benchmarks.

(This article belongs to the Special Issue Mathematics-Based Methods in Artificial Intelligence, Pattern Recognition and Deep Learning, 2nd Edition)

15 pages, 2970 KiB

Open Access Article

scVGATAE: A Variational Graph Attentional Autoencoder Model for Clustering Single-Cell RNA-seq Data

by Lijun Liu, Xiaoyang Wu, Jun Yu, Yuduo Zhang, Kaixing Niu and Anli Yu

Biology 2024, 13(9), 713; https://doi.org/10.3390/biology13090713 - 11 Sep 2024

Viewed by 640

Abstract

Single-cell RNA sequencing (scRNA-seq) is now a successful technology for identifying cell heterogeneity, revealing new cell subpopulations, and predicting developmental trajectories. A crucial component in scRNA-seq is the precise identification of cell subsets. Although many unsupervised clustering methods have been developed for clustering cell subpopulations, the performance of these methods is easily affected by dropout, high dimensionality, and technical noise. Additionally, most existing methods are time-consuming and fail to fully consider the potential correlations between cells. In this paper, we propose a novel unsupervised clustering method called scVGATAE (Single-cell Variational Graph Attention Autoencoder) for scRNA-seq data. This method constructs a reliable cell graph through network denoising, utilizes a novel variational graph autoencoder model integrated with graph attention networks to aggregate neighbor information and learn the distribution of the low-dimensional representations of cells, and adaptively determines the model training iterations for various datasets. Finally, the obtained low-dimensional representations of cells are clustered using k-means. Experiments on nine public datasets show that scVGATAE outperforms classical and state-of-the-art clustering methods.

(This article belongs to the Special Issue 2nd Edition of Computational Methods in Biology)

14 pages, 5141 KiB

Open Access Article

An End-to-End, Multi-Branch, Feature Fusion-Comparison Deep Clustering Method

by Xuanyu Li and Houqun Yang

Mathematics 2024, 12(17), 2749; https://doi.org/10.3390/math12172749 - 5 Sep 2024

Viewed by 497

Abstract

The application of contrastive learning to image clustering in the field of unsupervised learning has attracted much attention due to its ability to effectively improve clustering performance. Extracting features for face-oriented clustering using deep learning networks has also become one of the key challenges in this field. Some current research focuses on learning valuable semantic features using contrastive learning strategies to accomplish cluster allocation in the feature space. However, methods that decouple the two phases of feature extraction and clustering are prone to error transfer; moreover, features learned in the feature extraction phase of multi-stage training are not guaranteed to be suitable for the clustering task. To address these challenges, we propose an end-to-end multi-branch feature fusion comparison deep clustering method (SwEAC), which incorporates a multi-branch feature extraction strategy in the representation learning phase. This method completes the clustering center comparison between multiple views and then assigns clusters to the extracted features. In order to extract higher-level semantic features, a multi-branch structure is used to learn multi-dimensional spatial channel dimension information and weighted receptive-field spatial features, achieving cross-dimensional information exchange of multi-branch sub-features. Meanwhile, we jointly optimize unsupervised contrastive representation learning and clustering in an end-to-end architecture to obtain semantic features that are more suitable for clustering tasks. Experimental results show that our model achieves good clustering performance on three popular image datasets evaluated by three unsupervised evaluation metrics, which demonstrates the effectiveness of the end-to-end multi-branch feature fusion comparison deep clustering method.

27 pages, 10427 KiB

Open Access Article

UMMFF: Unsupervised Multimodal Multilevel Feature Fusion Network for Hyperspectral Image Super-Resolution

by Zhongmin Jiang, Mengyao Chen and Wenju Wang

Remote Sens. 2024, 16(17), 3282; https://doi.org/10.3390/rs16173282 - 4 Sep 2024

Viewed by 729

Abstract

Due to the inadequacy in utilizing complementary information from different modalities and the biased estimation of degraded parameters, the unsupervised hyperspectral super-resolution algorithm suffers from low precision and limited applicability. To address this issue, this paper proposes an approach for hyperspectral image super-resolution, namely, the Unsupervised Multimodal Multilevel Feature Fusion network (UMMFF). The proposed approach employs a gated cross-retention module to learn shared patterns among different modalities. This module effectively eliminates the intermodal differences while preserving spatial–spectral correlations, thereby facilitating information interaction. A multilevel spatial–channel attention and parallel fusion decoder are constructed to extract features at three levels (low, medium, and high), enriching the information of the multimodal images. Additionally, an independent prior-based implicit neural representation blind estimation network is designed to accurately estimate the degraded parameters. On the “Washington DC”, Salinas, and Botswana datasets, UMMFF exhibited superior performance compared to existing state-of-the-art methods in terms of primary performance metrics such as PSNR and ERGAS: the PSNR values improved by 18.03%, 8.55%, and 5.70%, respectively, while the ERGAS values decreased by 50.00%, 75.39%, and 53.27%, respectively. The experimental results indicate that UMMFF demonstrates excellent algorithm adaptability, resulting in high-precision reconstruction outcomes.

(This article belongs to the Special Issue Image Enhancement and Fusion Techniques in Remote Sensing)
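
Since the results above are reported in PSNR (higher is better) and ERGAS (lower is better), a minimal sketch of the usual PSNR definition, 10·log10(MAX² / MSE), may help with interpretation. It assumes pixel values scaled to [0, 1] and flattened into plain lists; the function name psnr and the sample values are hypothetical, and the paper's exact evaluation protocol may differ.

```python
import math

def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# Hypothetical two-pixel "images": each pixel is off by 0.1, so MSE = 0.01.
reference = [0.0, 1.0]
estimate = [0.1, 0.9]
score = psnr(reference, estimate)
```

Because PSNR is logarithmic, a percentage gain in PSNR is not the same thing as a percentage reduction in reconstruction error.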

19 pages, 1785 KiB

Open Access Article

Representing the Information of Multiplayer Online Battle Arena (MOBA) Video Games Using Convolutional Accordion Auto-Encoder (A²E) Enhanced by Attention Mechanisms

by José A. Torres-León, Marco A. Moreno-Armendáriz and Hiram Calvo

Mathematics 2024, 12(17), 2744; https://doi.org/10.3390/math12172744 - 3 Sep 2024

Viewed by 581

Abstract

In this paper, we propose a representation of the visual information about Multiplayer Online Battle Arena (MOBA) video games using an adapted unsupervised deep learning architecture called Convolutional Accordion Auto-Encoder (Conv_A²E). Our study includes a presentation of current representations of MOBA video game information and why our proposal offers a novel and useful solution to this task. This approach aims to achieve dimensional reduction and refined feature extraction of the visual data. To enhance the model’s performance, we tested several attention mechanisms for computer vision, evaluating algorithms from the channel attention and spatial attention families, and their combination. Through experimentation, we found that the best reconstruction of the visual information with the Conv_A²E was achieved when using a spatial attention mechanism, deformable convolution, as its mean squared error (MSE) during testing was the lowest, reaching a value of 0.003893, which means that its dimensional reduction is the most generalist and representative for this case study. This paper presents one of the first approaches to applying attention mechanisms to the case study of MOBA video games, representing a new horizon of possibilities for research.

(This article belongs to the Special Issue Mathematical Optimization and Control: Methods and Applications)

20 pages, 24086 KiB

Open Access Article

Clustering Hyperspectral Imagery via Sparse Representation Features of the Generalized Orthogonal Matching Pursuit

by Wenqi Guo, Xu Xu, Xiaoqiang Xu, Shichen Gao and Zibu Wu

Remote Sens. 2024, 16(17), 3230; https://doi.org/10.3390/rs16173230 - 31 Aug 2024

Viewed by 379

Abstract

This study focused on improving the clustering performance of hyperspectral imaging (HSI) by employing the Generalized Orthogonal Matching Pursuit (GOMP) algorithm for feature extraction. Hyperspectral remote sensing imaging technology, which is crucial in various fields like environmental monitoring and agriculture, faces challenges due to its high dimensionality and complexity. Supervised learning methods require extensive data and computational resources, while clustering, an unsupervised method, offers a more efficient alternative. This research presents a novel approach using GOMP to enhance clustering performance in HSI. The GOMP algorithm iteratively selects multiple dictionary elements for sparse representation, which makes it well-suited for handling complex HSI data. The proposed method was tested on two publicly available HSI datasets and evaluated in comparison with other methods to demonstrate its effectiveness in enhancing clustering performance.

18 pages, 4262 KiB

Open Access Article

Cyclic Consistent Image Style Transformation: From Model to System

by Jun Peng, Kaiyi Chen, Yuqing Gong, Tianxiang Zhang and Baohua Su

Appl. Sci. 2024, 14(17), 7637; https://doi.org/10.3390/app14177637 - 29 Aug 2024

Viewed by 667

Abstract

Generative Adversarial Networks (GANs) have achieved remarkable success in various tasks, including image generation, editing, and reconstruction, as well as in unsupervised and representation learning. Despite their impressive capabilities, GANs are often plagued by challenges such as unstable training dynamics and limitations in generating complex patterns. To address these challenges, we propose a novel image style transfer method, named C3GAN, which leverages CycleGAN architecture to achieve consistent and stable transformation of image style. In this context, “image style” refers to the distinct visual characteristics or artistic elements, such as the color schemes, textures, and brushstrokes that define the overall appearance of an image. Our method incorporates cyclic consistency, ensuring that the style transformation remains coherent and visually appealing, thus enhancing the training stability and overcoming the generative limitations of traditional GAN models. Additionally, we have developed a robust and efficient image style transfer system by integrating Flask for web development and MySQL for database management. Our system demonstrates superior performance in transferring complex styles compared to existing model-based approaches. This paper presents the development of a comprehensive image style transfer system based on our advanced C3GAN model, effectively addressing the challenges of GANs and expanding application potential in domains such as artistic creation and cinematic special effects.

(This article belongs to the Special Issue Selected Papers from CCF 39th China Computer Application Conference (CCF NCCA 2024))

12 pages, 4152 KiB

Open Access Article

Exploring Molecular Heteroencoders with Latent Space Arithmetic: Atomic Descriptors and Molecular Operators

by Xinyue Gao, Natalia Baimacheva and Joao Aires-de-Sousa

Molecules 2024, 29(16), 3969; https://doi.org/10.3390/molecules29163969 - 22 Aug 2024

Viewed by 612

Abstract

A variational heteroencoder based on recurrent neural networks, trained with SMILES linear notations of molecular structures, was used to derive the following atomic descriptors: delta latent space vectors (DLSVs) obtained from the original SMILES of the whole molecule and the SMILES of the same molecule with the target atom replaced. Different replacements were explored, namely, changing the atomic element, replacement with a character of the model vocabulary not used in the training set, or the removal of the target atom from the SMILES. Unsupervised mapping of the DLSV descriptors with t-distributed stochastic neighbor embedding (t-SNE) revealed a remarkable clustering according to the atomic element, hybridization, atomic type, and aromaticity. Atomic DLSV descriptors were used to train machine learning (ML) models to predict ¹⁹F NMR chemical shifts. An R² of up to 0.89 and mean absolute errors of up to 5.5 ppm were obtained for an independent test set of 1046 molecules with random forests or a gradient-boosting regressor. Intermediate representations from a Transformer model yielded comparable results. Furthermore, DLSVs were applied as molecular operators in the latent space: the DLSV of a halogenation (H→F substitution) was added to the LSVs of 4135 new molecules with no fluorine atom and decoded into SMILES, yielding 99% valid SMILES, with 75% of the SMILES incorporating fluorine and 56% of the structures incorporating fluorine with no other structural change.

(This article belongs to the Special Issue QSAR and QSPR: Recent Developments and Applications, 4th Edition)

11 pages, 607 KiB

Open Access Article

A Semi-Supervised Lie Detection Algorithm Based on Integrating Multiple Speech Emotional Features

by Ji Xi, Hang Yu, Zhe Xu, Li Zhao and Huawei Tao

Appl. Sci. 2024, 14(16), 7391; https://doi.org/10.3390/app14167391 - 21 Aug 2024

Viewed by 636

Abstract

When people tell lies, they often exhibit tension and emotional fluctuations, reflecting a complex psychological state. However, the scarcity of labeled data in datasets and the complexity of deception information pose significant challenges in extracting effective lie features, which severely restrict the accuracy of lie detection systems. To address this, this paper proposes a semi-supervised lie detection algorithm based on integrating multiple speech emotional features. Firstly, Long Short-Term Memory (LSTM) and Auto Encoder (AE) networks process log Mel spectrogram features and acoustic statistical features, respectively, to capture the contextual links between similar features. Secondly, a joint attention model is used to learn the complementary relationship among different features to obtain feature representations with richer details. Lastly, the model combines the unsupervised loss Local Maximum Mean Discrepancy (LMMD) and supervised loss Jefferys multi-loss optimization to enhance the classification performance. Experimental results show that the algorithm proposed in this paper achieves better performance.

(This article belongs to the Special Issue Application of Affective Computing)

40 pages, 4079 KiB

Open Access Article

Investigating Contrastive Pair Learning’s Frontiers in Supervised, Semisupervised, and Self-Supervised Learning

by Bihi Sabiri, Amal Khtira, Bouchra El Asri and Maryem Rhanoui

J. Imaging 2024, 10(8), 196; https://doi.org/10.3390/jimaging10080196 - 13 Aug 2024

Viewed by 1144

Abstract

In recent years, contrastive learning has been a highly favored method for self-supervised representation learning, which significantly improves the unsupervised training of deep image models. Self-supervised learning is a subset of unsupervised learning in which the learning process is supervised by creating pseudolabels from the data themselves. Using supervised final adjustments after unsupervised pretraining is one way to take the most valuable information from a vast collection of unlabeled data and teach from a small number of labeled instances. This study aims firstly to compare contrastive learning with other traditional learning models; secondly to demonstrate by experimental studies the superiority of contrastive learning during classification; thirdly to fine-tune performance using pretrained models and appropriate hyperparameter selection; and finally to address the challenge of using contrastive learning techniques to produce data representations with semantic meaning that are independent of irrelevant factors like position, lighting, and background. Relying on contrastive techniques, the model efficiently captures meaningful representations by discerning similarities and differences between modified copies of the same image. The proposed strategy, involving unsupervised pretraining followed by supervised fine-tuning, improves the robustness, accuracy, and knowledge extraction of deep image models. The results show that even with a modest 5% of data labeled, the semisupervised model achieves an accuracy of 57.72%. However, the use of supervised learning with a contrastive approach and careful hyperparameter tuning increases accuracy to 85.43%. Further adjustment of the hyperparameters resulted in an excellent accuracy of 88.70%.
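
The idea of "discerning similarities and differences between modified copies of the same image" is often implemented with an InfoNCE-style loss. The sketch below shows one common formulation for a single anchor; it is not necessarily the loss used in the study above. The two-dimensional embeddings, the temperature value, and the function names are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def info_nce(anchor, positive, negatives, temperature=0.5):
    """Loss is low when the anchor is closer to its positive than to the negatives."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))

# Hypothetical embeddings: 'augmented' stands for a modified copy of the anchor image,
# while 'other_1' and 'other_2' stand for different images.
anchor = [1.0, 0.0]
augmented = [0.9, 0.1]
other_1 = [0.0, 1.0]
other_2 = [-1.0, 0.0]

good_loss = info_nce(anchor, augmented, [other_1, other_2])
bad_loss = info_nce(anchor, other_1, [augmented, other_2])
```

Minimizing this quantity over many images pushes embeddings of augmented copies together and embeddings of different images apart, without requiring any labels.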
