Multi-labeled dataset of Arabic Covid-19 tweets for topic-based sentiment classifications
FM Alderazi, AA Algosaibi, MA Alabdullatif
2022 IEEE International Conference on Evolving and Adaptive …, 2022 • ieeexplore.ieee.org
Natural Language Processing (NLP) can analyze and classify the growing volume of opinions and feelings expressed in online texts and quickly extract the required feedback. Text classification is the task of automatically assigning a textual document the most appropriate set of labels; supervised text classifiers, however, require extensive human expertise and labeling effort. This paper seeks to build a multi-labeled Arabic dataset by labeling Arabic Covid-19 tweets with two groups of labels based on their lexical features: related topic and associated sentiment. To this end, an extensive dataset of over 32k multi-labeled tweets was created from Twitter posts. The dataset will be made freely available to the Arabic computational linguistics research community. Both traditional machine learning approaches and a deep-learning approach were used to evaluate performance on this dataset. The paper demonstrates that traditional ML approaches provide higher accuracy with almost stable performance when applied to the Twitter dataset for sentiment analysis and topic classification.
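The two-label setup described in the abstract (one topic label and one sentiment label per tweet) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the abstract does not name the specific traditional ML models used, so a small multinomial Naive Bayes stands in for them; the English toy tweets and the label names (`vaccine`, `lockdown`, `positive`, `negative`) are hypothetical placeholders, not samples from the actual Arabic dataset.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one smoothing (stand-in model)."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc.split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in doc.split():
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy rows: (text, topic label, sentiment label).
TWEETS = [
    ("vaccine appointment booked great news", "vaccine", "positive"),
    ("vaccine side effects worry me", "vaccine", "negative"),
    ("lockdown extended again terrible", "lockdown", "negative"),
    ("lockdown lifted great relief", "lockdown", "positive"),
]

texts = [text for text, _, _ in TWEETS]
# One independent classifier per label group: topic and sentiment.
topic_model = NaiveBayes().fit(texts, [topic for _, topic, _ in TWEETS])
sentiment_model = NaiveBayes().fit(texts, [sent for _, _, sent in TWEETS])


def classify(text):
    """Return a (topic, sentiment) pair for one tweet."""
    return topic_model.predict(text), sentiment_model.predict(text)


print(classify("vaccine great news"))
```

Training one classifier per label group is only one way to handle multi-labeled data; it keeps the sketch short and mirrors the topic/sentiment split named in the abstract, but it is not claimed to be the authors' method.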