Enhancing social network hate detection using back translation and GPT-3 augmentations during training and test-time

Published: 01 November 2023

Abstract

Social media platforms have become an essential means of communication, but they also serve as a breeding ground for hateful content. Detecting hate speech accurately is challenging due to factors such as slang and implicit hate speech. In response to these challenges, this paper presents a novel ensemble approach that uses DeBERTa models and integrates back-translation and GPT-3 augmentation techniques during both training and test time. The method aims to address the complexities of hate speech detection and to yield more robust and accurate predictions. Our findings indicate that the proposed approach significantly enhances hate speech detection performance across various metrics and models on both the Parler and Gab datasets. For reproducibility and further exploration, our code is publicly available at https://github.com/OrKatz7/parler-hate-speech.

Highlights

Apply test-time augmentation (TTA) in NLP to improve hate speech detection.
Enhance augmentation quality with a selection method that filters out unsuitable augmentations.
Merge DeBERTa, back-translation, and GPT-3 to boost hate speech detection.
Manage hate speech complexities: effectively detect slang and implicit language.
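
The highlights above describe test-time augmentation combined with a selection step. The following is a minimal sketch of that general idea, not the authors' implementation (their code is in the repository linked in the abstract). It assumes the augmented texts (for example, back-translations or GPT-3 paraphrases) have already been generated, uses token-level Jaccard overlap as a stand-in selection criterion, and takes a hypothetical `score_fn` that returns a hate probability for one text. The names `jaccard_similarity`, `select_augmentations`, `tta_predict`, and the threshold `min_similarity` are illustrative assumptions.

```python
from typing import Callable, List, Sequence


def jaccard_similarity(a: str, b: str) -> float:
    """Token-level Jaccard overlap between two texts (assumed selection metric)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def select_augmentations(
    original: str,
    candidates: Sequence[str],
    min_similarity: float = 0.3,  # illustrative threshold, not taken from the paper
) -> List[str]:
    """Keep augmentations that differ from the original but stay close to it."""
    return [
        c
        for c in candidates
        if c != original and jaccard_similarity(original, c) >= min_similarity
    ]


def tta_predict(
    original: str,
    candidates: Sequence[str],
    score_fn: Callable[[str], float],  # hypothetical classifier: text -> probability
    min_similarity: float = 0.3,
) -> float:
    """Average the classifier score over the original text and the kept augmentations."""
    texts = [original] + select_augmentations(original, candidates, min_similarity)
    scores = [score_fn(t) for t in texts]
    return sum(scores) / len(scores)
```

In practice `score_fn` would wrap a fine-tuned classifier such as a DeBERTa model, and an ensemble could average `tta_predict` across several such models; both are left out here to keep the sketch self-contained.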

Cited By

  • (2025) KPLLM-STE: Knowledge-enhanced and prompt-aware large language models for short-text expansion. World Wide Web 28:1. doi:10.1007/s11280-024-01322-y. Online publication date: 1-Jan-2025
  • (2024) Hate speech detection in social media. WIREs Computational Statistics 16:2. doi:10.1002/wics.1648. Online publication date: 11-Mar-2024

Published In

Information Fusion, Volume 99, Issue C
Nov 2023
688 pages

Publisher

Elsevier Science Publishers B. V.
Netherlands

Author Tags

1. Hate-detection
2. TTA
3. Back-translation
4. GPT

Qualifiers

• Research-article
