TransVAE-PAM: A Combined Transformer and DAG-based Approach for Enhanced Fake News Detection in Indian Context

Authors: Shivani Tufchi, Tanveer Ahmed, Ashima Yadav, Krishna Kant Agrawal, Ankit VidyarthiAuthors Info & Claims

ACM Transactions on Asian and Low-Resource Language Information Processing

Accepted on 24 February 2024

https://doi.org/10.1145/3651160

Online AM: 02 March 2024 Publication History

PDF eReader

Abstract

In this study, we introduce a novel method, “TransVAE-PAM”, for the classification of fake news articles, tailored specifically for the Indian context. The approach capitalizes on state-of-the-art contextual and sentence transformer-based embedding models to generate article embeddings. Furthermore, we also try to address the issue of compact model size. In this respect, we employ a Variational Autoencoder (VAE) and β-VAE to reduce the dimensions of the embeddings, thereby yielding compact latent representations. To capture the thematic essence or important topics in the news articles, we use the Pachinko Allocation Model (PAM) model, a Directed Acyclic Graph (DAG) based approach, to generate meaningful topics. These two facets of representation - the reduced-dimension embeddings from the VAE and the extracted topics from the PAM model - are fused together to create a feature set. This representation is subsequently channeled into five different methods for fake news classification. Furthermore, we use eight distinct transformer-based architectures to test the embedding generation. To validate the feasibility of the proposed approach, we have conducted extensive experimentation on a proprietary dataset. The dataset is sourced from “Times of India” and other online media. Considering the size of the dataset, large-scale experiments are conducted on an NVIDIA supercomputer. Through this comprehensive numerical investigation, we have achieved an accuracy of 96.2% and an F1 score of 96% using the DistilBERT transformer architecture. By complementing the method via topic modeling, we record a performance improvement with the accuracy and F1 score both at 97%. These results indicate a promising direction toward leveraging the combination of advanced topic models into existing classification schemes to enhance research on fake news detection.

References

[1]

Omar Abu Arqoub, Adeola Abdulateef Elega, Bahire Efe Özad, Hanadi Dwikat, and Felix Adedamola Oloyede. 2022. Mapping the scholarship of fake news research: A systematic review. Journalism Practice 16, 1 (2022), 56–86.

Abstract

References

Index Terms

Recommendations

Polarity Analysis of Editorial Articles towards Fake News Detection

RankSum—An unsupervised extractive text summarization based on rank fusion

A Robust User Sentiment Biterm Topic Mixture Model Based on User Aggregation Strategy to Avoid Data Sparsity for Short Text

Comments

Information

Published In

Publisher

Publication History

Check for updates

Author Tags

Qualifiers

Contributors

Other Metrics

Bibliometrics

Article Metrics

Other Metrics

Citations

View options

PDF

eReader

Get Access

Login options

Full Access

Figures

Other

Share

Share this Publication link

Share on social media

Affiliations