research-article

Detection of hate speech in Arabic tweets using deep learning

Authors:

Areej Al-Hassan,

Hmood Al-Dossari

Multimedia Systems, Volume 28, Issue 6

Pages 1963 - 1974

https://doi.org/10.1007/s00530-020-00742-w

Published: 01 December 2022

Abstract

People now communicate through social networks everywhere. However, verbal misbehavior such as hate speech is now propagated through these networks. Twitter is one of the most popular social networks and has gained widespread use in the Arab region. This research aims to identify and classify Arabic tweets into five distinct classes: none, religious, racial, sexism, or general hate. A dataset of 11 K tweets was collected and labelled, and an SVM model was used as a baseline for comparison against four deep learning models: LSTM, CNN + LSTM, GRU, and CNN + GRU. The results show that all four deep learning models outperform the SVM model in detecting hateful tweets. The SVM achieves an overall recall of 74%, whereas the deep learning models achieve an average recall of 75%. Moreover, adding a CNN layer to the LSTM enhances overall detection performance, with 72% precision, 75% recall, and a 73% F1 score.
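The abstract reports precision, recall, and F1 for a five-class problem without stating how the per-class scores were averaged. The following is a minimal sketch of how such figures are commonly computed, assuming unweighted (macro) averaging, which may not be the scheme the authors used. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter

# The five classes named in the abstract.
CLASSES = ["none", "religious", "racial", "sexism", "general hate"]


def per_class_scores(y_true, y_pred, classes=CLASSES):
    """Return {class: (precision, recall, f1)} from two parallel label lists."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # the predicted class gains a false positive
            fn[truth] += 1  # the true class gains a false negative
    scores = {}
    for c in classes:
        p = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        r = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        scores[c] = (p, r, f1)
    return scores


def macro_average(scores):
    """Unweighted mean of precision, recall, and F1 over all classes."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))


def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels for illustration only, not data from the paper.
y_true = ["none", "none", "religious", "racial", "sexism", "general hate"]
y_pred = ["none", "religious", "religious", "racial", "none", "general hate"]

scores = per_class_scores(y_true, y_pred)
macro_p, macro_r, macro_f1 = macro_average(scores)

# Harmonic mean of the precision and recall reported for CNN + LSTM.
reported_f1 = f1_from(0.72, 0.75)
```

The harmonic mean of 72% precision and 75% recall is about 73.5%, which is in line with the reported 73% F1 score. An F1 score averaged over classes need not equal the harmonic mean of the averaged precision and recall, so this is a plausibility check and not a reconstruction of the authors' evaluation.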



Published In

Multimedia Systems Volume 28, Issue 6

Dec 2022

568 pages

ISSN: 0942-4962

© The Author(s), under exclusive licence to Springer-Verlag GmbH, DE part of Springer Nature 2021.

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 01 December 2022

Accepted: 20 December 2020

Received: 11 October 2020
