research-article

HURI: Hybrid user risk identification in social networks

Authors: Roberto Corizzo, Emanuele Pio Barracchia, Antonio Pellicani, Nathalie Japkowicz, Michelangelo Ceci

World Wide Web, Volume 26, Issue 5

Pages 3409 - 3439

https://doi.org/10.1007/s11280-023-01192-w

Published: 28 July 2023

Abstract

The massive adoption of social networks has increased the need to analyze users’ data and interactions in order to detect and block the spread of propaganda and harassment, and to prevent actions that push people towards illegal or immoral activities. In this paper, we propose HURI, a method for social network analysis that accurately classifies users as safe or risky according to their behavior in the social network. Specifically, the proposed hybrid approach leverages both the topology of the network of interactions and the semantics of the content shared by users. This yields accurate classification even in the presence of noisy data, such as users who appear risky because of the topics of their posts but are actually safe according to their relationships. The strength of the approach lies in the full and simultaneous exploitation of both aspects, each of which receives equal consideration during the combination phase. This characteristic distinguishes HURI from approaches that fully consider only one aspect and graft partial or superficial elements of the other onto it. Experiments on a real-world Twitter dataset show that the proposed method offers competitive performance with respect to eight state-of-the-art approaches.
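
The idea of giving content and topology equal weight can be illustrated with a toy sketch. This is not HURI itself: the abstract does not specify the models or the combination rule, so the hand-written scores, the `RISKY_TERMS` set, the 0.5 threshold, and all function and user names below are assumptions made purely for illustration.

```python
from typing import Dict, List, Set

# Hypothetical keyword list; a real system would learn content semantics.
RISKY_TERMS: Set[str] = {"attack", "recruit"}


def content_score(posts: List[str]) -> float:
    """Fraction of a user's posts that contain at least one risky term."""
    if not posts:
        return 0.0
    flagged = sum(1 for post in posts if RISKY_TERMS & set(post.lower().split()))
    return flagged / len(posts)


def topology_score(user: str, edges: Dict[str, List[str]], known_risky: Set[str]) -> float:
    """Fraction of a user's neighbours that are already labelled risky."""
    neighbours = edges.get(user, [])
    if not neighbours:
        return 0.0
    return sum(1 for n in neighbours if n in known_risky) / len(neighbours)


def hybrid_label(
    user: str,
    posts: Dict[str, List[str]],
    edges: Dict[str, List[str]],
    known_risky: Set[str],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> str:
    """Average the two scores with equal weight, then threshold the result."""
    score = 0.5 * content_score(posts.get(user, [])) + 0.5 * topology_score(
        user, edges, known_risky
    )
    return "risky" if score > threshold else "safe"


# Toy data: "reporter" writes about risky topics but has only safe contacts.
posts = {
    "reporter": ["report on the attack downtown", "analysts say groups recruit online"],
    "u2": ["join us and recruit", "nice weather"],
}
edges = {"reporter": ["alice", "bob"], "u2": ["x", "y"]}
known_risky = {"x", "y"}

labels = {u: hybrid_label(u, posts, edges, known_risky) for u in posts}
```

In this toy data, a content-only rule would flag `reporter` because every post mentions a risky term, while the equal-weight combination labels the account safe because none of its neighbours is risky. This mirrors the noisy-data case described in the abstract, under the simplifying assumptions stated above.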

Published In

World Wide Web Volume 26, Issue 5

Sep 2023

1444 pages

ISSN:1386-145X

© The Author(s) 2023.

Publisher

Kluwer Academic Publishers

United States

Publication History

Published: 28 July 2023

Accepted: 26 June 2023

Revision received: 12 April 2023

Received: 16 November 2021
