Authors: Samuel Zammit, Fiona Sammut and David Suda

Affiliation: Department of Statistics & Operations Research, University of Malta, Msida, Malta

Keyword(s): Natural Language Processing, Word Embeddings, Word2Vec, FastText, Doc2Vec, k-means Clustering.

Abstract: This paper aims to identify common topics in a dataset of online news portal comments made between April 2008 and January 2017 on the Times of Malta website. By making use of the FastText algorithm, Word2Vec is used to obtain word embeddings for each unique word in the dataset. Furthermore, document vectors are also obtained for each comment, where again similar comments are assigned similar representations. The resulting word and document embeddings are also clustered using k-means clustering to identify common topic clusters. The results obtained indicate that the majority of comments follow a political theme related either to party politics, foreign politics, corruption, issues of an ideological nature, or other issues. Comments related to themes such as sports, arts and culture were not common, except around years with major events. Additionally, a number of topics were identified as being more prevalent during some time periods rather than others. These include the Maltese divorce referendum in 2011, the Maltese citizenship scheme in 2013, Russia’s annexation of Crimea in 2014, Brexit in 2015 and corruption/Panama Papers in 2016.
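
The pipeline outlined in the abstract (embed words, derive one vector per comment, cluster with k-means) can be illustrated with a minimal sketch. This is not the authors' implementation: the two-dimensional `WORD_VECTORS` table is invented toy data, `document_vector` and `kmeans` are hypothetical helper names, and simple averaging of word vectors stands in for the Doc2Vec document embeddings listed in the keywords.

```python
import math
import random

# Hypothetical toy word embeddings (2-D for readability). Real FastText/Word2Vec
# vectors would be learned from the comment corpus and have many more dimensions.
WORD_VECTORS = {
    "election": (0.90, 0.10),
    "party": (0.80, 0.20),
    "minister": (0.85, 0.15),
    "football": (0.10, 0.90),
    "match": (0.20, 0.80),
    "goal": (0.15, 0.85),
}


def document_vector(tokens):
    """Average the vectors of known tokens (an assumed stand-in for Doc2Vec)."""
    known = [WORD_VECTORS[t] for t in tokens if t in WORD_VECTORS]
    if not known:
        raise ValueError("no known tokens in document")
    dims = len(known[0])
    return tuple(sum(v[d] for v in known) / len(known) for d in range(dims))


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm: assign points to nearest centroid, then re-average."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for every point.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids


# Four toy "comments", already tokenised.
comments = [
    ["election", "party"],
    ["minister", "election"],
    ["football", "match"],
    ["goal", "match"],
]
doc_vectors = [document_vector(c) for c in comments]
labels, centroids = kmeans(doc_vectors, k=2)
print(labels)
```

On this toy data the two politics comments share one cluster label and the two sports comments share the other; which cluster receives label 0 depends on the random initialisation.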

CC BY-NC-ND 4.0

Paper citation in several formats:
Zammit, S.; Sammut, F. and Suda, D. (2021). Common Topic Identification in Online Maltese News Portal Comments. In Proceedings of the 10th International Conference on Pattern Recognition Applications and Methods - ICPRAM; ISBN 978-989-758-486-2; ISSN 2184-4313, SciTePress, pages 548-555. DOI: 10.5220/0010250605480555

@conference{icpram21,
author={Samuel Zammit and Fiona Sammut and David Suda},
title={Common Topic Identification in Online Maltese News Portal Comments},
booktitle={Proceedings of the 10th International Conference on Pattern Recognition Applications and Methods - ICPRAM},
year={2021},
pages={548-555},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010250605480555},
isbn={978-989-758-486-2},
issn={2184-4313},
}

TY - CONF

JO - Proceedings of the 10th International Conference on Pattern Recognition Applications and Methods - ICPRAM
TI - Common Topic Identification in Online Maltese News Portal Comments
SN - 978-989-758-486-2
IS - 2184-4313
AU - Zammit, S.
AU - Sammut, F.
AU - Suda, D.
PY - 2021
SP - 548
EP - 555
DO - 10.5220/0010250605480555
PB - SciTePress