research-article

Ties That Bind: Characterizing Classes by Attributes and Social Ties

Authors:

Leman Akoglu

WWW '17 Companion: Proceedings of the 26th International Conference on World Wide Web Companion

Pages 973 - 981

https://doi.org/10.1145/3041021.3055138

Published: 03 April 2017

Abstract

Given a set of attributed subgraphs known to be from different classes, how can we discover their differences? There are many cases where collections of subgraphs may be contrasted against each other. For example, they may be assigned ground truth labels (spam/not-spam), or one may wish to directly compare the biological networks of different species or the compound networks of different chemicals.

In this work we introduce the problem of characterizing the differences between attributed subgraphs that belong to different classes. We define this characterization problem as one of partitioning the attributes into as many groups as the number of classes, while maximizing the total attributed quality score of all the given subgraphs.
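One way to write this objective down, as a sketch only: the abstract does not give notation, so every symbol here is an assumption. Let A be the set of attributes, c the number of classes, S_k the given subgraphs of class k, and q(g | A_k) the quality score of subgraph g measured on the attribute group A_k.

```latex
% Assumed notation (not from the source): A = attribute set, c = number of classes,
% S_k = subgraphs of class k, q(g | A_k) = quality of subgraph g under attribute group A_k.
\[
\max_{A_1,\dots,A_c}\; \sum_{k=1}^{c} \sum_{g \in S_k} q(g \mid A_k)
\quad \text{s.t.} \quad
\bigcup_{k=1}^{c} A_k = A, \qquad A_i \cap A_j = \emptyset \;\; (i \neq j).
\]
```

The two constraints together say that the groups A_1, ..., A_c form a partition of the attributes, one group per class.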

We show that our attribute-to-class assignment problem is NP-hard and that an optimal (1 - 1/e)-approximation algorithm exists. We also propose two faster heuristics that run in time linear in the number of attributes and subgraphs. Unlike previous work, which took only attributes into account for characterization, we exploit both attributes and social ties (i.e., graph structure).
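To make the idea of attribute-to-class assignment concrete, here is a minimal Python sketch of a single greedy pass: each attribute goes to the class whose score gains the most from it. This is a generic baseline for assignment problems of this shape, not the paper's algorithm or either of its heuristics, and it is not claimed to reach the (1 - 1/e) bound. The names `greedy_assign` and `toy_score` are hypothetical, and the square-root score is a stand-in chosen only because it has diminishing returns; the paper's actual quality score uses graph structure and is not reproduced here.

```python
import math
from typing import Callable, Dict, FrozenSet, List, Sequence

# A class score maps a set of attributes to a number (hypothetical interface).
ScoreFn = Callable[[FrozenSet[str]], float]


def greedy_assign(attributes: Sequence[str],
                  class_scores: Sequence[ScoreFn]) -> List[FrozenSet[str]]:
    """Assign each attribute to the class with the largest marginal gain."""
    groups: List[FrozenSet[str]] = [frozenset() for _ in class_scores]
    for attr in attributes:
        best_k, best_gain = 0, float("-inf")
        for k, score in enumerate(class_scores):
            # Marginal gain of adding attr to the current group of class k.
            gain = score(groups[k] | {attr}) - score(groups[k])
            if gain > best_gain:
                best_k, best_gain = k, gain
        groups[best_k] = groups[best_k] | {attr}
    return groups


def toy_score(weights: Dict[str, float]) -> ScoreFn:
    """Stand-in score with diminishing returns: sqrt of summed weights."""
    return lambda attrs: math.sqrt(sum(weights.get(a, 0.0) for a in attrs))


# Made-up weights for two classes; illustration only.
scores = [
    toy_score({"price": 4.0, "rating": 1.0}),
    toy_score({"rating": 9.0, "genre": 4.0}),
]
groups = greedy_assign(["price", "rating", "genre"], scores)
```

The pass makes one score comparison per (attribute, class) pair, so the number of score evaluations grows linearly with the number of attributes. With the toy weights above, `price` ends up in the first group and `rating` and `genre` in the second.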

Through extensive experiments, we compare our proposed algorithms and show findings that agree with human intuition on datasets from Amazon co-purchases, Congressional bill sponsorships, and DBLP co-authorships. We also show that our approach to characterizing subgraphs is better suited for sense-making than discriminative classification approaches.

Cited By

T. Silva, A. Laender, and P. Vaz de Melo (2020). On knowledge-transfer characterization in dynamic attributed networks. Social Network Analysis and Mining, 10:1. Online publication date: 13-Jun-2020.
https://doi.org/10.1007/s13278-020-00657-4
A. Rezaei and J. Gao (2019). On Privacy of Socially Contagious Attributes. 2019 IEEE International Conference on Data Mining (ICDM), pages 1294-1299. Online publication date: Nov-2019.
https://doi.org/10.1109/ICDM.2019.00163

Index Terms

Ties That Bind: Characterizing Classes by Attributes and Social Ties
1. Human-centered computing
  1. Collaborative and social computing
    1. Collaborative and social computing design and evaluation methods
      1. Social network analysis
2. Information systems
  1. World Wide Web
    1. Web applications
      1. Social networks

Information & Contributors

Information

Published In

WWW '17 Companion: Proceedings of the 26th International Conference on World Wide Web Companion

April 2017

1738 pages

ISBN:9781450349147

General Chairs: Rick Barrett (W3Events), Rick Cummings (Murdoch University)

Program Chairs: Eugene Agichtein (Emory University), Evgeniy Gabrilovich (Google Research)

Sponsors

IW3C2: International World Wide Web Conference Committee

In-Cooperation

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

Publisher

International World Wide Web Conferences Steering Committee

Republic and Canton of Geneva, Switzerland

Qualifiers

Research-article

Conference

WWW '17

Sponsor:

IW3C2

WWW '17: 26th International World Wide Web Conference

April 3 - 7, 2017

Perth, Australia

Acceptance Rates

WWW '17 Companion Paper Acceptance Rate: 164 of 966 submissions, 17%

Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%

