research-article

Unravelling social media racial discriminations through a semi-supervised approach

Authors: Vimala Balakrishnan, Hamid R. Arabnia

Volume 67, Issue C

https://doi.org/10.1016/j.tele.2021.101752

Published: 01 February 2022

Highlights

• Machine learning models were used to detect cyber-racism during the COVID-19 pandemic.
• Cyber-racism detection was based on negative English tweets.
• Random Forest with bagging emerged as the best detection classifier.
• The top themes of cyber-racism were Eating habit, Xenophobia and Political hatred.

Abstract

This study investigated cyber-racism on social media during the COVID-19 pandemic using a semi-supervised approach. Specifically, several machine learning models were trained to detect cyber-racism, followed by topic modelling using Latent Dirichlet Allocation (LDA). Twitter data were gathered using the hashtags Chinese virus and Kung Flu in March 2020, resulting in 7,454 clean tweets. Negative tweets extracted using sentiment analysis were annotated (Racism, Sarcasm/Irony and Others) and used to train the models. Experimental results show that Random Forest with bagging consistently outperformed Random Forest, J48 and Support Vector Machine, with an accuracy of 78.1% (Racism versus Sarcasm/Irony) and 77.9% (Racism versus Others). LDA revealed three distinct topics for tweets identified as racist, namely Eating habit, Political hatred and Xenophobia. The consistent detection performance of the models evaluated indicates their reliability in detecting cyber-racism patterns in textual communications.
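
The bagging idea behind the best-performing classifier can be illustrated with a minimal sketch. It is not the authors' implementation: it swaps the Random Forest base learner for single-token decision stumps so that it runs on the Python standard library alone, and the labelled examples are invented placeholder tokens (w1 to w9), not tweets from the study. All function names (tokenize, train_stump, train_bagged, predict_bagged) are hypothetical. Each base model is trained on a bootstrap resample of the training set, and the ensemble predicts by majority vote.

```python
import random
from collections import Counter

# Invented placeholder data: tokens w1..w9 stand in for words in annotated
# tweets. These are NOT taken from the study's dataset.
TOY_TWEETS = [
    ("w1 w2 w3", "Racism"),
    ("w1 w4", "Racism"),
    ("w1 w5 w6", "Racism"),
    ("w7 w2", "Others"),
    ("w8 w4", "Others"),
    ("w9 w6 w3", "Others"),
]


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real pipelines do more)."""
    return text.lower().split()


def predict_stump(stump, text):
    """Apply a stump of the form (token, label_if_present, label_if_absent)."""
    token, present, absent = stump
    return present if token in tokenize(text) else absent


def train_stump(samples):
    """Pick the single token test that classifies the most samples correctly."""
    labels = sorted({label for _, label in samples})
    vocab = sorted({tok for text, _ in samples for tok in tokenize(text)})
    majority = Counter(label for _, label in samples).most_common(1)[0][0]
    # Fallback: always predict the majority label of this sample set.
    best = (None, majority, majority)
    best_correct = sum(1 for _, label in samples if label == majority)
    for token in vocab:
        for present in labels:
            for absent in labels:
                stump = (token, present, absent)
                correct = sum(
                    1 for text, label in samples
                    if predict_stump(stump, text) == label
                )
                if correct > best_correct:
                    best, best_correct = stump, correct
    return best


def train_bagged(samples, n_estimators=25, seed=0):
    """Bagging: fit one stump per bootstrap resample (drawn with replacement)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_estimators):
        bootstrap = [rng.choice(samples) for _ in samples]
        models.append(train_stump(bootstrap))
    return models


def predict_bagged(models, text):
    """Aggregate the base models' predictions by majority vote."""
    votes = Counter(predict_stump(model, text) for model in models)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    ensemble = train_bagged(TOY_TWEETS)
    print(predict_bagged(ensemble, "w1 w10"))
```

In a real setting the stumps would be replaced by full decision trees or a Random Forest, and the text would be vectorised (for example with TF-IDF) before training; the resampling and voting steps stay the same.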


Published In

Telematics and Informatics, Volume 67, Issue C (Feb 2022), 98 pages

ISSN: 0736-5853

Elsevier Ltd.

Publisher: Pergamon Press, Inc., United States
