short-paper

Evaluation of Different Machine Learning and Deep Learning Techniques for Hate Speech Detection

Authors:

Hazim Shatnawi

ACMSE '24: Proceedings of the 2024 ACM Southeast Conference

Pages 253 - 258

https://doi.org/10.1145/3603287.3651218

Published: 27 April 2024

Abstract

Detecting online hate speech is important for creating safer online spaces. In this paper, we evaluate the performance of several machine learning (ML) and deep learning (DL) models in detecting hate speech on three different datasets. The traditional ML algorithms evaluated are Support Vector Machines (SVM), Naive Bayes, Decision Trees, Random Forests, and Logistic Regression. The DL models evaluated are Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM) networks, and the pre-trained BERT transformer model. Our experiments show that BERT outperformed all other models, with F1-scores of 90.6%, 89.7%, and 88.2% on the three datasets. CNN and LSTM ranked next, outperforming the traditional ML algorithms with F1-scores over 80% on all three datasets. Among the traditional ML models, SVM performed best, with a highest F1-score of 75.6%.
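
All results in the abstract are reported as F1-scores, the harmonic mean of precision and recall. The following is a minimal sketch of how that metric is computed for a binary hate / non-hate task. It is an illustration only: the abstract does not say which averaging scheme the paper used, the helper name f1_score_binary is hypothetical, and the labels below are invented toy data, not taken from any of the paper's datasets.

```python
def f1_score_binary(y_true, y_pred, positive=1):
    """Return the F1-score of the `positive` class (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    # True positives: hateful items correctly flagged.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: non-hateful items wrongly flagged.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: hateful items that were missed.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # Precision and recall are both zero (or undefined); report 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Invented toy labels: 1 = hate speech, 0 = not hate speech.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

score = f1_score_binary(y_true, y_pred)
print(f"F1 = {score:.3f}")
```

Because F1 penalizes both missed hateful posts and false alarms, it is generally more informative than plain accuracy when one class is much rarer than the other.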

Index Terms

1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Supervised learning
        1. Supervised learning by classification
    2. Machine learning approaches
      1. Neural networks

Published In

ACMSE '24: Proceedings of the 2024 ACM Southeast Conference

April 2024

337 pages

ISBN:9798400702372

DOI:10.1145/3603287

Organizing Chair: Dan Lo
Program Chair: Eric Gamess

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

ACM: Association for Computing Machinery

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Short-paper
Research
Refereed limited

Conference

ACM SE '24

Sponsor:

ACM

ACM SE '24: 2024 ACM Southeast Conference

April 18 - 20, 2024

Marietta, GA, USA

Acceptance Rates

ACMSE '24 Paper Acceptance Rate 44 of 137 submissions, 32%;

Overall Acceptance Rate 502 of 1,023 submissions, 49%
