research-article

You too Brutus! Trapping Hateful Users in Social Media: Challenges, Solutions & Insights

Authors:

Animesh Mukherjee,

Binny Mathew

HT '21: Proceedings of the 32nd ACM Conference on Hypertext and Social Media

Pages 79 - 89

https://doi.org/10.1145/3465336.3475106

Published: 29 August 2021

Abstract

Hate speech is regarded as one of the crucial issues plaguing online social media. The current literature on hate speech detection relies primarily on textual content to find hateful posts and subsequently identify hateful users. However, this methodology disregards the social connections between users. In this paper, we explore the problem space in detail and investigate an array of models, ranging from purely textual to graph-based to, finally, semi-supervised techniques using Graph Neural Networks (GNNs) that utilize both textual and graph-based features. We run exhaustive experiments on two datasets: Gab, which is loosely moderated, and Twitter, which is strictly moderated. Overall, the AGNN model achieves a macro F1-score of 0.791 on the Gab dataset and 0.780 on the Twitter dataset using only 5% of the labeled instances, considerably outperforming all the other models, including the fully supervised ones. We perform a detailed error analysis on the best-performing text-based and graph-based models and observe that hateful users have unique network neighborhood signatures, and that the AGNN model benefits from paying attention to these signatures. This property, as we observe, also allows the model to generalize well across platforms in a zero-shot setting. Lastly, we use the best-performing GNN model to analyze the evolution of hateful users and their targets over time on Gab.
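
To make the idea of "paying attention to neighborhood signatures" concrete, the following is a minimal NumPy sketch of one attention-guided propagation step in the style of the AGNN of Thekumparampil et al. (2018): each user attends to its neighbors (and itself) with weights given by a softmax over scaled cosine similarities. It is an illustration under stated assumptions, not the authors' implementation. The function name `agnn_propagate`, the toy graph, and the fixed value of `beta` are hypothetical, and a real model would learn `beta` and stack several such layers between an embedding layer and a classifier.

```python
import numpy as np

def agnn_propagate(H, adj, beta=1.0):
    """One attention-guided propagation step: H_next = P @ H.

    H    : (n, d) array of node (user) features.
    adj  : (n, n) 0/1 adjacency matrix; self-loops are added here.
    beta : scalar attention temperature (learned in a real model, fixed here).
    """
    n = H.shape[0]
    # Each node attends to its neighbors and to itself.
    mask = (adj + np.eye(n)) > 0
    # Cosine similarity between every pair of node feature vectors.
    norms = np.linalg.norm(H, axis=1, keepdims=True)
    unit = H / np.clip(norms, 1e-12, None)
    scores = beta * (unit @ unit.T)
    # Masked softmax over each node's neighborhood (non-neighbors get weight 0).
    scores = np.where(mask, scores, -np.inf)
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    P = weights / weights.sum(axis=1, keepdims=True)
    return P @ H, P

# Hypothetical toy graph: users 0 and 1 are connected; user 2 is isolated.
H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
adj = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]])
H_next, P = agnn_propagate(H, adj)
```

Each row of `P` is a probability distribution over that user's neighborhood, so a user whose features resemble those of its neighbors draws more of its updated representation from them. An isolated user attends only to itself and keeps its features unchanged.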

Supplementary Material

MP4 File (HT21-ht038.mp4)

Presentation video


Cited By

Miao Z, Chen X, Wang H, Tang R, Yang Z, Huang T, Tang W. (2024). Detecting Offensive Language Based on Graph Attention Networks and Fusion Features. IEEE Transactions on Computational Social Systems, 11:1 (1493-1505). Online publication date: Feb-2024.
https://doi.org/10.1109/TCSS.2023.3250502

Shome D, Kar T. (2021). ConOffense: Multi-modal multitask Contrastive learning for offensive content identification. 2021 IEEE International Conference on Big Data (Big Data) (4524-4529). Online publication date: 15-Dec-2021.
https://doi.org/10.1109/BigData52589.2021.9671427

Index Terms

1. Human-centered computing
  1. Collaborative and social computing
    1. Empirical studies in collaborative and social computing

Published In

HT '21: Proceedings of the 32nd ACM Conference on Hypertext and Social Media

August 2021

306 pages

ISBN:9781450385510

DOI:10.1145/3465336

General Chair: Owen Conlan, Trinity College Dublin, Ireland
Program Chair: Eelco Herder, Radboud Universiteit Nijmegen, Netherlands

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Badges

Best Paper

Qualifiers

Research-article

Conference

HT '21: 32nd ACM Conference on Hypertext and Social Media

August 30 - September 2, 2021

Virtual Event, USA

Acceptance Rates

Overall Acceptance Rate 378 of 1,158 submissions, 33%

Bibliometrics

Total Citations: 2
Total Downloads: 248
Downloads (Last 12 months): 50
Downloads (Last 6 weeks): 2

Reflects downloads up to 30 Aug 2024