Computer Science > Computation and Language
[Submitted on 17 Oct 2023]
Title:The effect of stemming and lemmatization on Portuguese fake news text classification
View PDFAbstract:With the popularization of the internet, smartphones and social media, information is being spread quickly and easily way, which implies bigger traffic of information in the world, but there is a problem that is harming society with the dissemination of fake news. With a bigger flow of information, some people are trying to disseminate deceptive information and fake news. The automatic detection of fake news is a challenging task because to obtain a good result is necessary to deal with linguistics problems, especially when we are dealing with languages that not have been comprehensively studied yet, besides that, some techniques can help to reach a good result when we are dealing with text data, although, the motivation of detecting this deceptive information it is in the fact that the people need to know which information is true and trustful and which one is not. In this work, we present the effect the pre-processing methods such as lemmatization and stemming have on fake news classification, for that we designed some classifier models applying different pre-processing techniques. The results show that the pre-processing step is important to obtain betters results, the stemming and lemmatization techniques are interesting methods and need to be more studied to develop techniques focused on the Portuguese language so we can reach better results.
Submission history
From: Murilo Varges da Silva [view email][v1] Tue, 17 Oct 2023 15:26:40 UTC (750 KB)
References & Citations
Bibliographic and Citation Tools
Bibliographic Explorer (What is the Explorer?)
Litmaps (What is Litmaps?)
scite Smart Citations (What are Smart Citations?)
Code, Data and Media Associated with this Article
CatalyzeX Code Finder for Papers (What is CatalyzeX?)
DagsHub (What is DagsHub?)
Gotit.pub (What is GotitPub?)
Papers with Code (What is Papers with Code?)
ScienceCast (What is ScienceCast?)
Demos
Recommenders and Search Tools
Influence Flower (What are Influence Flowers?)
Connected Papers (What is Connected Papers?)
CORE Recommender (What is CORE?)
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.