Search Results (693)

Search Parameters:
Keywords = word embedding

24 pages, 2069 KiB  
Article
Automated Detection of Misinformation: A Hybrid Approach for Fake News Detection
by Fadi Mohsen, Bedir Chaushi, Hamed Abdelhaq, Dimka Karastoyanova and Kevin Wang
Future Internet 2024, 16(10), 352; https://doi.org/10.3390/fi16100352 - 27 Sep 2024
Viewed by 151
Abstract
The rise of social media has transformed the landscape of news dissemination, presenting new challenges in combating the spread of fake news. This study addresses the automated detection of misinformation within written content, a task that has prompted extensive research efforts across various methodologies. We evaluate existing benchmarks, introduce a novel hybrid word embedding model, and implement a web framework for text classification. Our approach integrates traditional term frequency–inverse document frequency (TF–IDF) methods with sophisticated feature extraction techniques, considering linguistic, psychological, morphological, and grammatical aspects of the text. Through a series of experiments on diverse datasets, applying transfer and incremental learning techniques, we demonstrate the effectiveness of our hybrid model in surpassing benchmarks and outperforming alternative experimental setups. Furthermore, our findings emphasize the importance of dataset alignment and balance in transfer learning, as well as the utility of incremental learning in maintaining high detection performance while reducing runtime. This research offers promising avenues for further advancements in fake news detection methodologies, with implications for future research and development in this critical domain.
(This article belongs to the Special Issue Embracing Artificial Intelligence (AI) for Network and Service)
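
The hybrid model pairs TF–IDF with handcrafted linguistic features. A minimal sketch of that pattern using scikit-learn's FeatureUnion follows; the three toy features are illustrative assumptions, not the authors' feature set.

```python
# Sketch: combining TF-IDF with handcrafted linguistic features, in the
# spirit of the hybrid model described above (not the authors' code).
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import FeatureUnion, Pipeline

class LinguisticFeatures(BaseEstimator, TransformerMixin):
    """Toy morphological/grammatical features; the paper's set is richer."""
    def fit(self, X, y=None):
        return self
    def transform(self, X):
        return np.array([[len(t.split()),                      # document length
                          sum(c == "!" for c in t),            # exclamation marks
                          sum(w.isupper() for w in t.split())] # all-caps words
                         for t in X], dtype=float)

model = Pipeline([
    ("features", FeatureUnion([
        ("tfidf", TfidfVectorizer(max_features=20000)),
        ("linguistic", LinguisticFeatures()),
    ])),
    ("clf", LogisticRegression(max_iter=1000)),
])
# model.fit(train_texts, train_labels); model.predict(test_texts)
```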

21 pages, 2680 KiB  
Article
TACSan: Enhancing Vulnerability Detection with Graph Neural Network
by Qingyao Zeng, Dapeng Xiong, Zhongwang Wu, Kechang Qian, Yu Wang and Yinghao Su
Electronics 2024, 13(19), 3813; https://doi.org/10.3390/electronics13193813 - 26 Sep 2024
Viewed by 277
Abstract
With the increasing scale and complexity of software, the advantages of using neural networks for static vulnerability detection are becoming increasingly prominent. Before being input into a neural network, the source code needs to undergo word embedding, transforming discrete high-dimensional text data into low-dimensional continuous vectors suitable for training in neural networks. However, analysis has revealed that different implementation ideas by code writers for the same functionality can lead to varied code implementation methods. Embedding different code texts into vectors results in distinctions that can reduce the robustness of a model. To address this issue, this paper explores the impact of converting source code into different forms on word embedding and finds that a TAC (Three-Address Code) can largely eliminate the noise caused by different code implementation approaches. Given the excellent capability of a GNN (Graph Neural Network) in handling data in non-Euclidean spaces and complex features, this paper subsequently employs a GNN to learn and classify vulnerabilities by capturing the implicit syntactic structure information in a TAC. On this basis, this paper introduces TACSan, a novel static vulnerability detection system based on a GNN designed to detect vulnerabilities in C/C++ programs. TACSan transforms the preprocessed source code into a TAC representation, adds control and data edges to create a graph structure, and then inputs it into the GNN for training. Comparative testing and evaluation of TACSan against other renowned static analysis tools, such as VulDeePecker and Devign, demonstrate that TACSan not only exceeds these tools in detection capability but also achieves substantial gains in accuracy and F1 score.
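
The pipeline (a TAC graph with control and data edges feeding a GNN classifier) can be sketched compactly. The following is an illustrative skeleton with PyTorch Geometric, not TACSan's actual architecture; the layer sizes are assumptions.

```python
# Minimal GNN classifier over code graphs (e.g., TAC statements as nodes,
# control/data dependencies as edges); an illustrative sketch, not TACSan.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv, global_mean_pool

class VulnGNN(torch.nn.Module):
    def __init__(self, in_dim, hidden=128, classes=2):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden)
        self.conv2 = GCNConv(hidden, hidden)
        self.out = torch.nn.Linear(hidden, classes)

    def forward(self, x, edge_index, batch):
        x = F.relu(self.conv1(x, edge_index))   # message passing, hop 1
        x = F.relu(self.conv2(x, edge_index))   # message passing, hop 2
        x = global_mean_pool(x, batch)          # graph-level readout
        return self.out(x)                      # vulnerable / not vulnerable
```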

18 pages, 2249 KiB  
Article
Fractal Self-Similarity in Semantic Convergence: Gradient of Embedding Similarity across Transformer Layers
by Minhyeok Lee
Fractal Fract. 2024, 8(10), 552; https://doi.org/10.3390/fractalfract8100552 - 24 Sep 2024
Viewed by 309
Abstract
This paper presents a mathematical analysis of semantic convergence in transformer-based language models, drawing inspiration from the concept of fractal self-similarity. We introduce and prove a novel theorem characterizing the gradient of embedding similarity across layers. Specifically, we establish that there exists a monotonically increasing function that provides a lower bound on the rate at which the average cosine similarity between token embeddings at consecutive layers and the final layer increases. This establishes a fundamental property: semantic alignment of token representations consistently increases through the network, exhibiting a pattern of progressive refinement, analogous to fractal self-similarity. The key challenge addressed is the quantification and generalization of semantic convergence across diverse model architectures and input contexts. To validate our findings, we conduct experiments on BERT and DistilBERT models, analyzing embedding similarities for diverse input types. While our experiments are limited to these models, we empirically demonstrate consistent semantic convergence within these architectures. Quantitatively, we find that the average rates of semantic convergence are approximately 0.0826 for BERT and 0.1855 for DistilBERT. We observe that the rate of convergence varies based on token frequency and model depth, with rare words showing slightly higher similarities (differences of approximately 0.0167 for BERT and 0.0120 for DistilBERT). This work advances our understanding of transformer models’ internal mechanisms and provides a mathematical framework for comparing and optimizing model architectures.
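
The measured quantity, the average cosine similarity between each layer's token embeddings and the final layer's, is easy to reproduce in spirit with Hugging Face Transformers; the sentence and model choice below are illustrative.

```python
# Layer-wise semantic convergence: mean cosine similarity of each hidden
# layer's token embeddings to the final layer (illustrative measurement).
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)

enc = tok("Word embeddings drift toward their final meaning.", return_tensors="pt")
with torch.no_grad():
    hidden = model(**enc).hidden_states          # tuple: embeddings + 12 layers

final = hidden[-1].squeeze(0)                    # (tokens, dim)
for k, layer in enumerate(hidden):
    sims = torch.nn.functional.cosine_similarity(layer.squeeze(0), final, dim=-1)
    print(f"layer {k:2d}: mean cos sim to final = {sims.mean():.4f}")
```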

23 pages, 3964 KiB  
Article
Geometry of Textual Data Augmentation: Insights from Large Language Models
by Sherry J. H. Feng, Edmund M-K. Lai and Weihua Li
Electronics 2024, 13(18), 3781; https://doi.org/10.3390/electronics13183781 - 23 Sep 2024
Viewed by 534
Abstract
Data augmentation is crucial for enhancing the performance of text classification models when labelled training data are scarce. For natural language processing (NLP) tasks, large language models (LLMs) are able to generate high-quality augmented data, but a fundamental understanding of the reasons for their effectiveness remains limited. This paper presents a geometric and topological perspective on textual data augmentation using LLMs. We compare the augmentation data generated by GPT-J with those generated through cosine similarity from Word2Vec and GloVe embeddings. Topological data analysis reveals that GPT-J-generated data maintains label coherence. Convex hull analysis of such data, represented by their two principal components, shows that they lie within the spatial boundaries of the original training data. Delaunay triangulation reveals that increasing the number of augmented data points connected within these boundaries correlates with improved classification accuracy. These findings provide insights into the superior performance of LLMs in data augmentation, and the techniques involved could form the basis of a framework for predicting the usefulness of augmentation data from its geometric properties.
(This article belongs to the Special Issue Emerging Theory and Applications in Natural Language Processing)
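
The convex hull and Delaunay checks translate directly into a few lines of SciPy; the sketch below uses random stand-ins for the original and augmented embeddings.

```python
# Geometric check in the spirit of the paper: do augmented points fall inside
# the convex hull of the original data's first two principal components?
import numpy as np
from scipy.spatial import Delaunay
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
orig = rng.normal(size=(200, 50))        # stand-in for original text embeddings
aug = rng.normal(size=(40, 50)) * 0.8    # stand-in for augmented embeddings

pca = PCA(n_components=2).fit(orig)
tri = Delaunay(pca.transform(orig))      # triangulation doubles as a hull test
inside = tri.find_simplex(pca.transform(aug)) >= 0   # -1 means outside the hull
print(f"{inside.mean():.0%} of augmented points lie within the original hull")
```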

25 pages, 896 KiB  
Article
Enhancing Fake News Detection with Word Embedding: A Machine Learning and Deep Learning Approach
by Mutaz A. B. Al-Tarawneh, Omar Al-irr, Khaled S. Al-Maaitah, Hassan Kanj and Wael Hosny Fouad Aly
Computers 2024, 13(9), 239; https://doi.org/10.3390/computers13090239 - 19 Sep 2024
Viewed by 880
Abstract
The widespread dissemination of fake news on social media has necessitated the development of more sophisticated detection methods to maintain information integrity. This research systematically investigates the effectiveness of different word embedding techniques—TF-IDF, Word2Vec, and FastText—when applied to a variety of machine learning (ML) and deep learning (DL) models for fake news detection. Leveraging the TruthSeeker dataset, which includes a diverse set of labeled news articles and social media posts spanning over a decade, we evaluated the performance of classifiers such as Support Vector Machines (SVMs), Multilayer Perceptrons (MLPs), and Convolutional Neural Networks (CNNs). Our analysis demonstrates that both SVMs and CNNs using TF-IDF embeddings achieve the highest overall performance in terms of accuracy, precision, recall, and F1 score. These results suggest that TF-IDF, with its capacity to highlight discriminative features in text, enhances the performance of models like SVMs, which are adept at handling sparse data representations. Additionally, CNNs benefit from TF-IDF by effectively capturing localized features and patterns within the textual data. In contrast, while Word2Vec and FastText embeddings capture semantic and syntactic nuances, they introduce complexities that may not always benefit traditional ML models like MLPs or SVMs, which could explain their relatively lower performance in some cases. This study emphasizes the importance of selecting embedding techniques appropriate to the model architecture to maximize fake news detection performance. Future research should consider integrating contextual embeddings and exploring hybrid model architectures to further enhance detection capabilities. These findings contribute to the ongoing development of advanced computational tools for combating misinformation.
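
The best-performing pairing above, TF-IDF features with a linear SVM, is a few lines in scikit-learn; the two toy documents are placeholders, not TruthSeeker samples.

```python
# Baseline from the comparison above: TF-IDF features feeding a linear SVM.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

texts = ["the moon landing was staged", "parliament passed the budget today"]
labels = [0, 1]                          # 0 = fake, 1 = real (toy placeholders)

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(texts, labels)
print(clf.predict(["budget staged on the moon"]))
```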

15 pages, 683 KiB  
Article
Cross-Lingual Short-Text Semantic Similarity for Kannada–English Language Pair
by Muralikrishna S N, Raghurama Holla, Harivinod N and Raghavendra Ganiga
Computers 2024, 13(9), 236; https://doi.org/10.3390/computers13090236 - 18 Sep 2024
Viewed by 456
Abstract
Analyzing the semantic similarity of cross-lingual texts is a crucial part of natural language processing (NLP). The computation of semantic similarity is essential for a variety of tasks such as evaluating machine translation systems, quality checking human translation, information retrieval, plagiarism checks, etc. In this paper, we propose a method for measuring the semantic similarity of Kannada–English sentence pairs that uses embedding space alignment, lexical decomposition, word order, and a convolutional neural network. The proposed method achieves a maximum correlation of 83% with human annotations. Experiments on semantic matching and retrieval tasks yielded promising results in terms of precision and recall.
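
The first ingredient, embedding space alignment, is commonly done with orthogonal Procrustes over a seed dictionary of translation pairs; the sketch below uses synthetic vectors and is not necessarily the paper's exact procedure.

```python
# Embedding space alignment via orthogonal Procrustes: learn an orthogonal
# map W so that Kannada vectors land near their English counterparts.
import numpy as np

def align(src, tgt):
    """W minimizing ||src @ W - tgt||_F subject to W being orthogonal."""
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt

rng = np.random.default_rng(1)
kn = rng.normal(size=(500, 300))                          # seed-pair source vectors
true_map = np.linalg.qr(rng.normal(size=(300, 300)))[0]   # hidden rotation
en = kn @ true_map + rng.normal(scale=0.01, size=(500, 300))

W = align(kn, en)
print(np.allclose(W, true_map, atol=0.1))  # the mapping is recovered
mapped = kn @ W                            # Kannada words, now in English space
```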

32 pages, 5227 KiB  
Article
Global Suicide Mortality Rates (2000–2019): Clustering, Themes, and Causes Analyzed through Machine Learning and Bibliographic Data
by Erinija Pranckeviciene and Judita Kasperiuniene
Int. J. Environ. Res. Public Health 2024, 21(9), 1202; https://doi.org/10.3390/ijerph21091202 - 10 Sep 2024
Viewed by 2372
Abstract
Suicide research is directed at understanding the social, economic, and biological causes of suicide thoughts and behaviors. (1) Background: Worldwide, certain countries have high suicide mortality rates (SMRs) compared to others. Age-standardized SMRs published by the World Health Organization (WHO), together with numerous bibliographic records of the Web of Science (WoS) database, provide resources to understand these disparities between countries and regions. (2) Methods: Hierarchical clustering was applied to age-standardized suicide mortality rates per 100,000 population from 2000–2019. Keywords of country-specific suicide-related publications collected from WoS were analyzed by network and association rule mining. Keyword embedding was carried out using a recurrent neural network. (3) Results: Countries with similar SMR trends formed naturally distinct groups of high, medium, and low suicide mortality rates. Major themes in suicide research worldwide are depression, mental disorders, youth suicide, euthanasia, hopelessness, loneliness, unemployment, and drugs. Prominent themes differentiating countries and regions include alcohol in post-Soviet countries, HIV/AIDS in Sub-Saharan Africa, war veterans and PTSD in the Middle East, students in East Asia, and many others. (4) Conclusions: Countries naturally group into high, medium, and low SMR categories characterized by different keyword-informed themes. The compiled dataset and presented methodology allow analytical results to be enriched with bibliographic data where the observed trends are difficult to interpret.
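
Step (2)'s clustering of SMR trajectories maps onto standard SciPy hierarchical clustering; the rates below are synthetic stand-ins for the WHO series, not the study's data.

```python
# Clustering countries by their 2000-2019 suicide mortality rate trajectories,
# as in step (2) above (synthetic data; WHO rates would be loaded instead).
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)
# 90 "countries", 20 annual rates each, drawn around high/medium/low levels
rates = np.vstack([rng.normal(m, 1.5, size=(30, 20)) for m in (25, 12, 5)])

Z = linkage(rates, method="ward")            # agglomerative clustering
groups = fcluster(Z, t=3, criterion="maxclust")
print(np.bincount(groups)[1:])               # sizes of the three SMR clusters
```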

17 pages, 543 KiB  
Article
Speaker-Attributed Training for Multi-Speaker Speech Recognition Using Multi-Stage Encoders and Attention-Weighted Speaker Embedding
by Minsoo Kim and Gil-Jin Jang
Appl. Sci. 2024, 14(18), 8138; https://doi.org/10.3390/app14188138 - 10 Sep 2024
Viewed by 519
Abstract
Automatic speech recognition (ASR) aims at understanding naturally spoken human speech to be used as text input to machines. In multi-speaker environments, where multiple speakers talk simultaneously with a large amount of overlap, a significant performance degradation may occur with conventional ASR systems if they are trained on recordings of single talkers. This paper proposes a multi-speaker ASR method that incorporates speaker embedding information as an additional input. The embedding information for each of the speakers in the training set was extracted as numeric vectors, and all of the embedding vectors were stacked to construct a total speaker profile matrix. The speaker profile matrix from the training dataset enables finding embedding vectors that are close to the speakers of the input recordings in the test conditions, and it helps to recognize the individual speakers’ voices mixed in the input. Furthermore, the proposed method efficiently reuses the decoder from the existing speaker-independent ASR model, eliminating the need to retrain the entire system. Various speaker embedding methods such as i-vector, d-vector, and x-vector were adopted, and the experimental results show 0.33% and 0.95% absolute (3.9% and 11.5% relative) improvements in word error rate (WER) without and with the speaker profile, respectively.
(This article belongs to the Special Issue Speech Recognition and Natural Language Processing)
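
At test time, the speaker profile matrix described above reduces to a cosine nearest-neighbor lookup. Here is a hedged sketch of that step with synthetic embeddings; the profile size and dimensionality are assumptions.

```python
# Speaker profile lookup: stack training speakers' embeddings, then match a
# test-utterance embedding by cosine similarity (illustrative sketch).
import numpy as np

def closest_speakers(profile, query, k=2):
    """profile: (speakers, dim) matrix; query: (dim,) test embedding."""
    p = profile / np.linalg.norm(profile, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    sims = p @ q                              # cosine similarity to each row
    order = np.argsort(sims)[::-1][:k]
    return order, sims[order]

rng = np.random.default_rng(2)
profile = rng.normal(size=(100, 256))             # e.g., stacked x-vectors
query = profile[7] + 0.1 * rng.normal(size=256)   # noisy embedding of speaker 7
idx, sims = closest_speakers(profile, query)
print(idx)                                        # speaker 7 should rank first
```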

24 pages, 5079 KiB  
Article
Leveraging Generative AI in Short Document Indexing
by Sara Bouzid and Loïs Piron
Electronics 2024, 13(17), 3563; https://doi.org/10.3390/electronics13173563 - 8 Sep 2024
Viewed by 496
Abstract
The efficiency of information retrieval systems primarily depends on the effective representation of documents during query processing. This representation is mainly constructed from relevant document terms identified and selected during indexing, which are then used for retrieval. However, when documents contain only a few features, as in short documents, the resulting representation may be information-poor owing to the scarcity of index terms and their limited relevance. Although document representation can be enriched using techniques like word embeddings, these techniques require large pre-trained datasets, which are often unavailable in the context of domain-specific short documents. This study investigates a new approach to enriching document representation during indexing using generative AI. In the proposed approach, relevant terms extracted from documents and preprocessed for indexing are enriched with a list of key terms suggested by a large language model (LLM). After a small benchmark of several renowned LLMs for key term suggestion from a set of short texts, the GPT-4o model was chosen to experiment with the proposed indexing approach. The findings of this study yielded notable results, demonstrating that generative AI can efficiently fill the knowledge gap in document representation, regardless of the retrieval technique used.
(This article belongs to the Section Computer Science & Engineering)
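
The enrichment step lends itself to a short sketch: extract terms, ask an LLM for suggestions, merge. This assumes the OpenAI Python client; the prompt wording and merging policy are illustrative, not the paper's.

```python
# Index enrichment sketch: merge terms extracted at indexing time with key
# terms suggested by an LLM. Prompt and merging policy are assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def enrich_index(doc_text, extracted_terms, n_terms=10):
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user",
                   "content": f"List {n_terms} key index terms, comma-separated, "
                              f"for this short document:\n{doc_text}"}],
    )
    suggested = [t.strip().lower()
                 for t in resp.choices[0].message.content.split(",")]
    return sorted(set(extracted_terms) | set(suggested))  # merged index entry
```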

26 pages, 12522 KiB  
Article
A Vision–Language Model-Based Traffic Sign Detection Method for High-Resolution Drone Images: A Case Study in Guyuan, China
by Jianqun Yao, Jinming Li, Yuxuan Li, Mingzhu Zhang, Chen Zuo, Shi Dong and Zhe Dai
Sensors 2024, 24(17), 5800; https://doi.org/10.3390/s24175800 - 6 Sep 2024
Viewed by 415
Abstract
As a fundamental element of the transportation system, traffic signs are widely used to guide traffic behaviors. In recent years, drones have emerged as an important tool for monitoring the conditions of traffic signs. However, existing image processing techniques are heavily reliant on image annotations, and it is time-consuming to build a high-quality dataset with diverse training images and human annotations. In this paper, we introduce the utilization of Vision–Language Models (VLMs) in the traffic sign detection task. Without the need for discrete image labels, rapid deployment is achieved through multi-modal learning and large-scale pretrained networks. First, we compile a keyword dictionary to explain traffic signs. The Chinese national standard is used to suggest the shape and color information. Our program conducts Bootstrapping Language-image Pretraining v2 (BLIPv2) to translate representative images into text descriptions. Second, a Contrastive Language-image Pretraining (CLIP) framework is applied to characterize not only drone images but also text descriptions. Our method utilizes the pretrained encoder network to create visual features and word embeddings. Third, the category of each traffic sign is predicted according to the similarity between drone images and keywords. The cosine distance and a softmax function are used to compute the class probability distribution. To evaluate the performance, we apply the proposed method in a practical application: drone images captured from Guyuan, China, are employed to record the conditions of traffic signs. Further experiments include two widely used public datasets. The results indicate that our vision–language model-based method achieves acceptable prediction accuracy at a low training cost.
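
The third step (image-keyword similarity followed by a softmax) is essentially CLIP's zero-shot recipe. A sketch with Hugging Face's CLIP follows; the prompts and file name are placeholders, not the paper's keyword dictionary.

```python
# Zero-shot sign classification in the spirit of step three: score a drone
# image against keyword descriptions with CLIP, then softmax the similarities.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

texts = ["a red circular prohibition sign",
         "a blue rectangular guide sign",
         "a yellow triangular warning sign"]        # keyword dictionary entries
inputs = processor(text=texts, images=Image.open("sign.jpg"),
                   return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_image       # image-text similarities
print(logits.softmax(dim=-1))                       # class probability distribution
```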

26 pages, 1413 KiB  
Article
Active Learning for Biomedical Article Classification with Bag of Words and FastText Embeddings
by Paweł Cichosz
Appl. Sci. 2024, 14(17), 7945; https://doi.org/10.3390/app14177945 - 6 Sep 2024
Viewed by 486
Abstract
In several applications of text classification, training document labels are provided by human evaluators, and therefore, gathering sufficient data for model creation is time-consuming and costly. The labeling time and effort may be reduced by active learning, in which classification models are created based on relatively small training sets, which are obtained by collecting class labels provided in response to labeling requests or queries. This is an iterative process with a sequence of models being fitted, and each of them is used to select query articles to be added to the training set for the next one. Such a learning scenario may pose different challenges for machine learning algorithms and text representation methods used for text classification than ordinary passive learning, since they have to deal with very small, often imbalanced data, and the computational expense of both model creation and prediction has to remain low. This work examines how classification algorithms and text representation methods that have been found particularly useful by prior work handle these challenges. The random forest and support vector machines algorithms are coupled with the bag of words and FastText word embedding representations and applied to datasets consisting of scientific article abstracts from systematic literature review studies in the biomedical domain. Several strategies are used to select articles for active learning queries, including uncertainty sampling, diversity sampling, and strategies favoring the minority class. Confidence-based and stability-based early stopping criteria are used to generate active learning termination signals. The results confirm that active learning is a useful approach to creating text classification models with limited access to labeled data, making it possible to save at least half of the human effort needed to assign relevant or irrelevant class labels to training articles. Two of the four examined combinations of classification algorithms and text representation methods were the most successful: the SVM algorithm with the FastText representation and the random forest algorithm with the bag of words representation. Uncertainty sampling turned out to be the most useful query selection strategy, and confidence-based stopping was found more universal and easier to configure than stability-based stopping.
(This article belongs to the Special Issue Data and Text Mining: New Approaches, Achievements and Applications)
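
The query loop at the heart of this setup (fit on the small labeled set, pick the pool articles the model is least sure about, request labels) is sketched below with a random forest on synthetic features; the batch size and number of rounds are arbitrary assumptions.

```python
# Core active learning loop with uncertainty sampling, as examined above
# (illustrative; the paper pairs it with SVM/FastText and RF/bag-of-words).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def uncertainty_query(model, pool_X, batch=10):
    proba = model.predict_proba(pool_X)
    margin = np.abs(proba[:, 1] - proba[:, 0])   # small margin = uncertain
    return np.argsort(margin)[:batch]            # indices to send for labeling

rng = np.random.default_rng(3)
X, y = rng.normal(size=(1000, 50)), rng.integers(0, 2, 1000)
labeled = list(range(20))                        # tiny seed set
for _ in range(5):                               # five labeling rounds
    model = RandomForestClassifier(n_estimators=200).fit(X[labeled], y[labeled])
    pool = np.setdiff1d(np.arange(len(X)), labeled)
    labeled += list(pool[uncertainty_query(model, X[pool])])
```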

19 pages, 714 KiB  
Article
Combining Semantic Matching, Word Embeddings, Transformers, and LLMs for Enhanced Document Ranking: Application in Systematic Reviews
by Goran Mitrov, Boris Stanoev, Sonja Gievska, Georgina Mirceva and Eftim Zdravevski
Big Data Cogn. Comput. 2024, 8(9), 110; https://doi.org/10.3390/bdcc8090110 - 4 Sep 2024
Viewed by 749
Abstract
The rapid increase in scientific publications has made it challenging to keep up with the latest advancements. Conducting systematic reviews using traditional methods is both time-consuming and difficult. To address this, new review formats like rapid and scoping reviews have been introduced, reflecting an urgent need for efficient information retrieval. This challenge extends beyond academia to many organizations where numerous documents must be reviewed in relation to specific user queries. This paper focuses on improving document ranking to enhance the retrieval of relevant articles, thereby reducing the time and effort required by researchers. By applying a range of natural language processing (NLP) techniques, including rule-based matching, statistical text analysis, word embeddings, and transformer- and LLM-based approaches like Mistral LLM, we assess each article’s similarity to user-specific inputs and prioritize articles according to relevance. We propose a novel methodology, Weighted Semantic Matching (WSM) + MiniLM, combining the strengths of the different methodologies. For validation, we employ global metrics such as precision at K, recall at K, average rank, and median rank, as well as pairwise comparison metrics, including higher rank count, average rank difference, and median rank difference. Our proposed algorithm achieves optimal performance, with an average recall at 1000 of 95% and an average median rank of 185 for selected articles across the five datasets evaluated. These findings show promise for pinpointing relevant articles and reducing manual work.
(This article belongs to the Special Issue Advances in Natural Language Processing and Text Mining)
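
The rank-based validation metrics are simple to pin down precisely; a minimal sketch with example rank positions (not the paper's data):

```python
# Rank-based evaluation metrics of the kind used above (illustrative values).
import numpy as np

def recall_at_k(ranks, k):
    """ranks: 1-based positions of the relevant articles in the ranked list."""
    ranks = np.asarray(ranks)
    return float((ranks <= k).mean())

ranks = [3, 17, 185, 950, 40]        # example positions of relevant documents
print(recall_at_k(ranks, 1000))      # -> 1.0
print(float(np.median(ranks)))       # median rank, as reported above
```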

19 pages, 2828 KiB  
Article
KCB-FLAT: Enhancing Chinese Named Entity Recognition with Syntactic Information and Boundary Smoothing Techniques
by Zhenrong Deng, Zheng Huang, Shiwei Wei and Jinglin Zhang
Mathematics 2024, 12(17), 2714; https://doi.org/10.3390/math12172714 - 30 Aug 2024
Viewed by 348
Abstract
Named entity recognition (NER) is a fundamental task in Natural Language Processing (NLP). During training, NER models suffer from over-confidence; the Chinese NER task in particular involves word segmentation, which introduces erroneous entity boundary segmentation, exacerbating over-confidence and reducing the model’s overall performance. These issues limit further enhancement of NER models. To tackle these problems, we propose a new model named KCB-FLAT, designed to enhance Chinese NER performance by integrating enriched semantic information with a word-boundary smoothing technique. Specifically, we first extract various types of syntactic data and encode them with a Key-Value Memory Network, integrating the result through an attention mechanism to generate syntactic feature embeddings for Chinese characters. Subsequently, we employ a Cross-Transformer encoder to thoroughly combine syntactic and lexical information and address the entity boundary segmentation errors caused by lexical information. Finally, we introduce a Boundary Smoothing module, combined with a regularity-conscious function, to capture the internal regularity of each entity, reducing the model’s over-confidence in entity probabilities through smoothing. Experimental results demonstrate that the proposed model achieves exceptional performance on the MSRA, Resume, Weibo, and self-built ZJ datasets, as measured by the F1 score.
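
Boundary smoothing, as used here, reassigns a small amount of probability mass from each gold entity span to spans whose boundaries differ slightly. The snippet below is an illustrative re-statement of that idea, not the KCB-FLAT implementation; the neighborhood and epsilon are assumptions.

```python
# Boundary smoothing in miniature: move a little probability mass from a gold
# entity span to spans whose boundaries differ by at most one token.
import numpy as np

def smooth_span(seq_len, start, end, eps=0.1):
    """Return a (seq_len, seq_len) matrix of soft span probabilities."""
    probs = np.zeros((seq_len, seq_len))
    neighbors = [(s, e) for s in (start - 1, start, start + 1)
                 for e in (end - 1, end, end + 1)
                 if 0 <= s <= e < seq_len and (s, e) != (start, end)]
    probs[start, end] = 1.0 - eps              # most mass stays on the gold span
    for s, e in neighbors:
        probs[s, e] = eps / len(neighbors)     # the rest spreads over neighbors
    return probs

print(smooth_span(6, start=2, end=3)[1:4, 1:5])
```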

19 pages, 1006 KiB  
Article
Semantic Interaction Meta-Learning Based on Patch Matching Metric
by Baoguo Wei, Xinyu Wang, Yuetong Su, Yue Zhang and Lixin Li
Sensors 2024, 24(17), 5620; https://doi.org/10.3390/s24175620 - 30 Aug 2024
Viewed by 510
Abstract
Metric-based meta-learning methods have demonstrated remarkable success in the domain of few-shot image classification. However, their performance is significantly contingent upon the choice of metric and the feature representation for the support classes. Current approaches, which predominantly rely on holistic image features, may inadvertently disregard critical details necessary for novel tasks, a phenomenon known as “supervision collapse”. Moreover, relying solely on visual features to characterize support classes can prove insufficient, particularly in scenarios involving limited sample sizes. In this paper, we introduce an innovative framework named Patch Matching Metric-based Semantic Interaction Meta-Learning (PatSiML), designed to overcome these challenges. To counteract supervision collapse, we have developed a patch matching metric strategy based on the Transformer architecture to transform input images into a set of distinct patch embeddings. This approach dynamically creates task-specific embeddings, facilitated by a graph convolutional network, to formulate precise matching metrics between the support classes and the query image patches. To incorporate semantic knowledge, we have also integrated a label-assisted channel semantic interaction strategy. This strategy merges word embeddings with patch-level visual features across the channel dimension, utilizing a sophisticated language model to combine semantic understanding with visual information. Our empirical findings across four diverse datasets reveal that the PatSiML method achieves a classification accuracy improvement of 0.65% to 21.15% over existing methodologies, underscoring its robustness and efficacy.
(This article belongs to the Special Issue Advances in Remote Sensing Image Enhancement and Classification)
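
The core metric (matching query patch embeddings against support class patch embeddings) can be sketched in a few tensor operations; the dimensions below are arbitrary assumptions for a 5-way task, not the PatSiML architecture.

```python
# Patch-level matching in miniature: score a query image by the similarity of
# its patch embeddings to each support class's patch embeddings (illustrative).
import torch

def patch_match_score(query, support):
    """query: (p, d) patch embeddings; support: (classes, q, d) prototypes."""
    q = torch.nn.functional.normalize(query, dim=-1)
    s = torch.nn.functional.normalize(support, dim=-1)
    sim = torch.einsum("pd,cqd->cpq", q, s)     # all patch-pair cosine similarities
    return sim.max(dim=-1).values.mean(dim=-1)  # best match per patch, averaged

query = torch.randn(16, 64)                     # e.g., 16 patches from a ViT
support = torch.randn(5, 16, 64)                # 5-way few-shot support classes
print(patch_match_score(query, support))        # one score per class
```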

18 pages, 2293 KiB  
Article
Social Media Topic Classification on Greek Reddit
by Charalampos Mastrokostas, Nikolaos Giarelis and Nikos Karacapilidis
Information 2024, 15(9), 521; https://doi.org/10.3390/info15090521 - 26 Aug 2024
Viewed by 439
Abstract
Text classification (TC) is a subtask of natural language processing (NLP) that categorizes text pieces into predefined classes based on their textual content and thematic aspects. This process typically includes the training of a Machine Learning (ML) model on a labeled dataset, where each text example is associated with a specific class. Recent progress in Deep Learning (DL) enabled the development of deep neural transformer models, surpassing traditional ML ones. However, work in the topic classification literature has prioritized high-resource languages, particularly English, while research efforts for low-resource ones, such as Greek, remain limited. Taking the above into consideration, this paper presents: (i) the first Greek social media topic classification dataset; (ii) a comparative assessment of a series of traditional ML models trained on this dataset, utilizing an array of text vectorization methods including TF-IDF, classical word embeddings, and transformer-based Greek embeddings; (iii) a fine-tuned GREEK-BERT-based TC model on the same dataset; (iv) key empirical findings demonstrating that transformer-based embeddings significantly increase the performance of traditional ML models, while our fine-tuned DL model outperforms previous ones. The dataset, the best-performing model, and the experimental code are made public, aiming to improve the reproducibility of this work and advance future research in the field.
(This article belongs to the Section Artificial Intelligence)
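
Point (ii)'s pairing of traditional classifiers with transformer-based Greek embeddings can be sketched as mean-pooled GREEK-BERT vectors feeding a linear model; the pooling choice below is an assumption, not necessarily the authors' setup.

```python
# Traditional classifier over transformer-based Greek embeddings: mean-pooled
# GREEK-BERT vectors feeding logistic regression (illustrative sketch).
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("nlpaueb/bert-base-greek-uncased-v1")
bert = AutoModel.from_pretrained("nlpaueb/bert-base-greek-uncased-v1")

def embed(texts):
    enc = tok(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = bert(**enc).last_hidden_state            # (batch, tokens, dim)
    mask = enc["attention_mask"].unsqueeze(-1)
    return ((out * mask).sum(1) / mask.sum(1)).numpy() # masked mean pooling

# train_texts / train_labels are placeholders for a labeled Greek dataset:
# clf = LogisticRegression(max_iter=1000).fit(embed(train_texts), train_labels)
```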
