research-article

Detection of hate speech in Arabic tweets using deep learning

Published: 01 December 2022

Abstract

Social networks are now a ubiquitous channel of communication. However, verbal misbehavior such as hate speech is now propagated through them. Twitter is one of the most popular social networks and is widely used in the Arab region. This research aims to classify Arabic tweets into five distinct classes: none, religious, racial, sexism, or general hate. A dataset of 11K tweets was collected and labelled, and an SVM model was used as a baseline against which four deep learning models were compared: LSTM, CNN + LSTM, GRU, and CNN + GRU. The results show that all four deep learning models outperform the SVM model in detecting hateful tweets. The SVM achieves an overall recall of 74%, while the deep learning models have an average recall of 75%. Adding a CNN layer to the LSTM enhances the overall detection performance, with 72% precision, 75% recall, and a 73% F1 score.
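As a quick consistency check on the reported CNN + LSTM figures, the F1 score can be recomputed as the harmonic mean of precision and recall. This is only a sketch: the function name `f1_score` is illustrative, and it assumes the overall scores can be combined directly. If the paper averages per-class F1 scores instead, the result need not match exactly.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Overall CNN + LSTM precision and recall as reported in the abstract.
cnn_lstm_f1 = f1_score(0.72, 0.75)
print(f"{cnn_lstm_f1:.2f}")  # prints 0.73
```

Under that assumption, the recomputed value agrees with the 73% F1 score stated in the abstract.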



Published In

Multimedia Systems, Volume 28, Issue 6
Dec 2022
568 pages

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 01 December 2022
Accepted: 20 December 2020
Received: 11 October 2020

Author Tags

  1. Hate speech
  2. Arabic tweets
  3. Arabic NLP
  4. Deep learning
  5. Multiclassification
  6. Social networks
  7. Text mining

Qualifiers

  • Research-article


