DOI: 10.1145/3308558.3313546
research-article

Fuzzy Multi-task Learning for Hate Speech Type Identification

Published: 13 May 2019

Abstract

In traditional machine learning, classifier training is typically undertaken in a single-task learning setting, so that the trained classifier can discriminate between different classes. This, however, rests on the assumption that the classes are mutually exclusive, which does not always hold in real applications. For example, the same book may belong to multiple subjects. This observation motivated researchers to formulate multi-label learning problems, in which each instance can be assigned multiple labels, although classifier training is still typically undertaken in a single-task learning setting. When probabilistic approaches are adopted for classifier training, multi-task learning can be enabled by transforming a multi-labelled data set into several binary data sets, but this transformation usually results in class imbalance. Without the transformation, multi-labelling of data results in an exponential increase in the number of classes, leaving fewer instances for each class and making each class harder to identify. In addition, multi-labelling of data is very time-consuming and expensive in some application areas, such as hate speech detection. In this paper, we introduce a novel formulation of the hate speech type identification problem in the setting of multi-task learning through our proposed fuzzy ensemble approach. In this setting, single-labelled data can be used for semi-supervised multi-label learning, and two new metrics (detection rate and irrelevance rate) are proposed to measure performance on this kind of learning task more effectively. We report an experimental study on the identification of four types of hate speech: religion, race, disability and sexual orientation. The experimental results show that our proposed fuzzy ensemble approach outperforms other popular probabilistic approaches, with an overall detection rate of 0.93.
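
The two data-handling options contrasted in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not reproduce the fuzzy ensemble approach; the toy instances, the label spelling `sexual_orientation`, and the helper names `to_binary_datasets`, `positive_ratio` and `powerset_class_count` are all assumptions made for illustration. The sketch splits a multi-labelled data set into one binary data set per label (commonly called binary relevance), reports how few positives each binary data set contains, and counts the classes that arise if every label combination is treated as its own class (commonly called label powerset).

```python
from collections import Counter

# The four hate speech types named in the abstract.
LABELS = ["religion", "race", "disability", "sexual_orientation"]

# Hypothetical toy data: (text id, set of assigned labels). Not from the paper.
toy_data = [
    ("text_0", {"religion"}),
    ("text_1", {"race"}),
    ("text_2", {"religion", "race"}),
    ("text_3", {"disability"}),
    ("text_4", {"sexual_orientation"}),
    ("text_5", set()),
    ("text_6", {"race"}),
    ("text_7", set()),
]


def to_binary_datasets(data, labels):
    """Build one binary data set per label: target 1 if the label is assigned, else 0."""
    return {
        label: [(text, int(label in assigned)) for text, assigned in data]
        for label in labels
    }


def positive_ratio(binary_dataset):
    """Fraction of positive instances; a low value indicates class imbalance."""
    counts = Counter(target for _, target in binary_dataset)
    return counts[1] / len(binary_dataset)


def powerset_class_count(num_labels):
    """Number of classes when every label combination is a separate class."""
    return 2 ** num_labels


binary = to_binary_datasets(toy_data, LABELS)
ratios = {label: positive_ratio(dataset) for label, dataset in binary.items()}

for label, ratio in ratios.items():
    print(f"{label}: {ratio:.3f} positive")
print("label combinations:", powerset_class_count(len(LABELS)))
```

Even in this small example each binary data set is dominated by negatives, and the number of label combinations doubles with every additional label, which is the trade-off the abstract describes.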


Published In

WWW '19: The World Wide Web Conference
May 2019
3620 pages
ISBN: 9781450366748
DOI: 10.1145/3308558
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

  • IW3C2: International World Wide Web Conference Committee

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Cyberhate detection
  2. Fuzzy classification
  3. Machine learning
  4. Multi-task learning
  5. Text classification

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

WWW '19: The Web Conference
May 13-17, 2019
San Francisco, CA, USA

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Cited By

  • (2024) Hate speech detection in social media: Techniques, recent trends, and future challenges. WIREs Computational Statistics 16:2. DOI: 10.1002/wics.1648. Online publication date: 11-Mar-2024
  • (2024) Mapping the scientific knowledge and approaches to defining and measuring hate crime, hate speech, and hate incidents: A systematic review. Campbell Systematic Reviews 20:2. DOI: 10.1002/cl2.1397. Online publication date: 28-Apr-2024
  • (2023) Exploring Automatic Hate Speech Detection on Social Media: A Focus on Content-Based Analysis. Sage Open 13:2. DOI: 10.1177/21582440231181311. Online publication date: 17-Jun-2023
  • (2023) Hate Speech and Offensive Language Detection Using an Emotion-Aware Shared Encoder. ICC 2023 - IEEE International Conference on Communications (2852-2857). DOI: 10.1109/ICC45041.2023.10279690. Online publication date: 28-May-2023
  • (2023) Twitter Hate Speech Detection: A Systematic Review of Methods, Taxonomy Analysis, Challenges, and Opportunities. IEEE Access 11 (16226-16249). DOI: 10.1109/ACCESS.2023.3239375. Online publication date: 2023
  • (2023) Finding hate speech with auxiliary emotion detection from self-training multi-label learning perspective. Information Fusion 96 (214-223). DOI: 10.1016/j.inffus.2023.03.015. Online publication date: Aug-2023
  • (2023) Transfer learning for hate speech detection in social media. Journal of Computational Social Science 6:2 (1081-1101). DOI: 10.1007/s42001-023-00224-9. Online publication date: 17-Oct-2023
  • (2023) Impounding behavioural connotations for hate speech analysis – a view towards criminal investigation using machine learning. International Journal of Information Technology. DOI: 10.1007/s41870-023-01500-7. Online publication date: 27-Sep-2023
  • (2023) A literature survey on multimodal and multilingual automatic hate speech identification. Multimedia Systems 29:3 (1203-1230). DOI: 10.1007/s00530-023-01051-8. Online publication date: 1-Jun-2023
  • (2022) Keyword-Enhanced Multi-Expert Framework for Hate Speech Detection. Mathematics 10:24 (4706). DOI: 10.3390/math10244706. Online publication date: 11-Dec-2022
