research-article

Enhancing social network hate detection using back translation and GPT-3 augmentations during training and test-time

Authors:

Lior Rokach

Volume 99, Issue C

https://doi.org/10.1016/j.inffus.2023.101887

Published: 01 November 2023

Abstract

Social media platforms have become an essential means of communication, but they also serve as a breeding ground for hateful content. Detecting hate speech accurately is challenging due to factors such as slang and implicit hate speech. In response to these challenges, this paper presents a novel ensemble approach utilizing DeBERTa models, integrating back-translation and GPT-3 augmentation techniques during both training and test time. This method aims to address the complexities associated with detecting hate speech, resulting in more robust and accurate results. Our findings indicate that the proposed approach significantly enhances hate speech detection performance across various metrics and models in both the Parler and GAB datasets. For reproducibility and further exploration, our code is publicly available at https://github.com/OrKatz7/parler-hate-speech.

Highlights

• Apply Test-Time Augmentation in NLP to improve hate speech detection.

• Enhance augmentation quality with a selection method to filter unsuitable ones.

• Merge DeBERTa, back-translation & GPT-3 to boost hate speech detection.

• Manage hate speech complexities - effectively detects slang & implicit language.
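The test-time augmentation and selection steps named in the highlights can be illustrated with a minimal sketch. It is not the authors' implementation: the token-level Jaccard similarity, the `min_sim`/`max_sim` thresholds, and the function names `select_augmentations` and `predict_with_tta` are all assumptions made for illustration, since this page does not describe the paper's actual selection criterion. In the setting of the paper, the `augmenters` would be back-translation and GPT-3 paraphrasing, and `classifier` would wrap a fine-tuned DeBERTa model; here both are plain callables.

```python
from typing import Callable, List, Sequence


def jaccard_similarity(a: str, b: str) -> float:
    """Token-level Jaccard overlap between two texts, in [0.0, 1.0]."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def select_augmentations(
    original: str,
    candidates: Sequence[str],
    min_sim: float = 0.3,   # assumed threshold: drop augmentations that drifted too far
    max_sim: float = 0.95,  # assumed threshold: drop near-copies that add no diversity
) -> List[str]:
    """Keep only candidates whose similarity to the original lies in [min_sim, max_sim]."""
    return [
        c for c in candidates
        if min_sim <= jaccard_similarity(original, c) <= max_sim
    ]


def predict_with_tta(
    text: str,
    augmenters: Sequence[Callable[[str], str]],
    classifier: Callable[[str], float],
    min_sim: float = 0.3,
    max_sim: float = 0.95,
) -> float:
    """Average the classifier score over the original text and its selected augmentations."""
    candidates = [augment(text) for augment in augmenters]
    kept = select_augmentations(text, candidates, min_sim, max_sim)
    # The original text is always scored, so the list is never empty.
    scores = [classifier(t) for t in [text, *kept]]
    return sum(scores) / len(scores)
```

Averaging is only one way to combine the per-augmentation scores; weighted voting or a learned combiner would fit the same interface.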

Cited By

Zhong H., Zhang Q., Li W., Lin R., Tang Y. (2025), KPLLM-STE: Knowledge-enhanced and prompt-aware large language models for short-text expansion, World Wide Web 28:1, doi:10.1007/s11280-024-01322-y. Online publication date: 1-Jan-2025.
https://dl.acm.org/doi/10.1007/s11280-024-01322-y

Rawat A., Kumar S., Samant S. (2024), Hate speech detection in social media, WIREs Computational Statistics 16:2, doi:10.1002/wics.1648. Online publication date: 11-Mar-2024.
https://dl.acm.org/doi/10.1002/wics.1648


Published In

Information Fusion, Volume 99, Issue C (Nov 2023), 688 pages

ISSN: 1566-2535

Elsevier B.V.

Publisher

Elsevier Science Publishers B. V., Netherlands
