DOI: 10.1145/3580305.3599773
research-article
Open access

Analysis of COVID-19 Offensive Tweets and Their Targets

Published: 04 August 2023

    Abstract

    During the global COVID-19 pandemic, people used social media platforms, especially Twitter, to share and express opinions about the pandemic. These discussions also drove a rise in COVID-related offensive speech. In this work, focusing on Twitter, we present a comprehensive analysis of COVID-related offensive tweets and their targets. We collected a COVID-19 dataset of over 747 million tweets spanning 30 months and fine-tuned a BERT classifier to detect offensive tweets. Our analysis shows that the ebb and flow of COVID-related offensive tweets potentially reflects events in the physical world. We then studied the targets of these offensive tweets. A large number of offensive tweets contained abusive words, which could negatively affect the targeted groups or individuals. We also conducted a user network analysis and found that offensive users interact more with other offensive users and that the pandemic had a lasting impact on some offensive users. Our study offers novel insights into the persistence and evolution of COVID-related offensive tweets during the pandemic.
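
    As a rough illustration of the user network finding, the sketch below measures what share of each group's interactions is directed at users labeled offensive. It is not the authors' method: the function name, the edge list, and the user labels are all hypothetical, and the toy data is invented purely to show the calculation.

```python
from collections import defaultdict


def offensive_interaction_share(edges, offensive_users):
    """Per source group, return the fraction of interactions aimed at offensive users.

    edges: iterable of (source_user, target_user) pairs, e.g. replies or mentions.
    offensive_users: set of user ids labeled offensive (hypothetical labels).
    """
    to_offensive = defaultdict(int)  # interactions whose target is offensive
    total = defaultdict(int)         # all interactions, per source group
    for source, target in edges:
        group = "offensive" if source in offensive_users else "other"
        total[group] += 1
        if target in offensive_users:
            to_offensive[group] += 1
    return {group: to_offensive[group] / total[group] for group in total}


# Toy data, invented for illustration only.
edges = [("a", "b"), ("a", "c"), ("b", "a"), ("c", "d"), ("d", "a"), ("d", "c")]
offensive = {"a", "b"}
shares = offensive_interaction_share(edges, offensive)
print(shares)
```

    On real data, a higher share for the "offensive" group than for the "other" group would be consistent with the pattern the abstract describes, although a proper analysis would also need to control for how many offensive users exist in the network.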

    Supplementary Material

    MP4 File (kdd_video.mp4)
    Two-minute promotional video for the KDD 2023 Applied Data Science (ADS) paper "Analysis of COVID-19 Offensive Tweets and Their Targets".


    Published In

    KDD '23: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
    August 2023
    5996 pages
    ISBN: 9798400701030
    DOI: 10.1145/3580305
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. covid-19
    2. offensive tweets
    3. twitter

    Qualifiers

    • Research-article

    Conference

    KDD '23
    Acceptance Rates

    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
