research-article

Collective classification in social networks

Authors: Omar Jaafor and Babiga Birregah

ASONAM '17: Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017

July 2017

Pages 827 - 835

https://doi.org/10.1145/3110025.3110128

Published: 31 July 2017

Abstract

Classification is one of the most studied subjects in machine learning. Most classification methods developed over the last decade account for either structure (interactions, relationships) or attributes (text, numerical values, etc.), but not both. As a result, they miss significant patterns in a dataset that can only be captured by analyzing an item's features together with its interactions. Collective classification methods use both structure and attributes, often by aggregating data from the neighbors of a node and learning a model on the aggregated data. In social networks, the degree distribution of nodes follows a power law: a few nodes have many neighbors, while many nodes have very few edges. High-degree nodes also receive incoming links from low-degree nodes of different classes. Hence, using only local structure may lead to poor predictions. In addition, many social networks allow for different types of interactions (retweet, reply, like, etc.) that affect classification differently. This article proposes a collective classification method that uses the structure of the network to determine each node's neighbors. It then presents experiments aimed at detecting jihadi propagandists and malware distributors on social networks.
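
The generic aggregation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the method proposed in the article: the function name `aggregate_neighbor_features`, the choice of the mean as aggregate, the treatment of links as undirected, and the toy data are all assumptions made for illustration. Each node's own attributes are extended with the mean attributes of its neighbors, computed separately for each interaction type.

```python
from collections import defaultdict


def aggregate_neighbor_features(features, edges):
    """Extend each node's attributes with per-interaction-type neighbor means.

    features: non-empty dict mapping node -> list of floats (equal lengths)
    edges: list of (source, target, interaction_type) triples
    """
    types = sorted({t for _, _, t in edges})
    dim = len(next(iter(features.values())))

    # neighbors[node][interaction_type] -> list of neighbor ids
    neighbors = defaultdict(lambda: defaultdict(list))
    for src, dst, t in edges:
        # Assumption: links are treated as undirected when aggregating.
        neighbors[src][t].append(dst)
        neighbors[dst][t].append(src)

    aggregated = {}
    for node, own in features.items():
        row = list(own)
        for t in types:
            nbrs = neighbors[node][t]
            if nbrs:
                means = [sum(features[n][i] for n in nbrs) / len(nbrs)
                         for i in range(dim)]
            else:
                # No neighbor of this type: pad with zeros (an assumption).
                means = [0.0] * dim
            row.extend(means)
        aggregated[node] = row
    return aggregated


# Hypothetical toy network: three accounts, two interaction types.
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
edges = [("a", "b", "retweet"), ("a", "c", "retweet"), ("b", "c", "reply")]
rows = aggregate_neighbor_features(features, edges)
```

Each resulting row could then be passed to any standard classifier. Keeping one block of aggregated values per interaction type lets the downstream model weight retweets, replies, and likes differently, which is the concern the abstract raises about mixed interaction types.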

Published In

ASONAM '17: Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2017

July 2017

698 pages

ISBN: 9781450349932

DOI: 10.1145/3110025

Copyright © 2017 ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

ASONAM '17

Sponsor:

SIGKDD

ASONAM '17: Advances in Social Networks Analysis and Mining 2017

July 31 - August 3, 2017

Sydney, Australia

Acceptance Rates

Overall Acceptance Rate 116 of 549 submissions, 21%
