Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
research-article

ASCOS++: An Asymmetric Similarity Measure for Weighted Networks to Address the Problem of SimRank

Published: 12 October 2015 Publication History

Abstract

In this article, we explore the relationships among digital objects in terms of their similarity based on vertex similarity measures. We argue that SimRank—a famous similarity measure—and its families, such as P-Rank and SimRank++, fail to capture similar node pairs in certain conditions, especially when two nodes can only reach each other through paths of odd lengths. We present new similarity measures ASCOS and ASCOS++ to address the problem. ASCOS outputs a more complete similarity score than SimRank and SimRank’s families. ASCOS++ enriches ASCOS to include edge weight into the measure, giving all edges and network weights an opportunity to make their contribution. We show that both ASCOS++ and ASCOS can be reformulated and applied on a distributed environment for parallel contribution. Experimental results show that ASCOS++ reports a better score than SimRank and several famous similarity measures. Finally, we re-examine previous use cases of SimRank, and explain appropriate and inappropriate use cases. We suggest future SimRank users following the rules proposed here before naïvely applying it. We also discuss the relationship between ASCOS++ and PageRank.

References

[1]
Evrim Acar, Daniel M. Dunlavy, and Tamara G. Kolda. 2009. Link prediction on evolving data using matrix and tensor factorizations. In IEEE International Conference on Data Mining Workshops. IEEE, Washington DC, USA, 262--269.
[2]
Lada A. Adamic and Eytan Adar. 2003. Friends and neighbors on the web. Social Networks 25, 3, 211--230.
[3]
Alekh Agarwal and Soumen Chakrabarti. 2007. Learning random walks to rank nodes in graphs. In Proceedings of the 24th International Conference on Machine learning. ACM, New York, NY, USA, 9--16.
[4]
Charu Aggarwal, Yan Xie, and Philip S. Yu. 2012. On dynamic link inference in heterogeneous networks. In Proceedings of the 12th SIAM International Conference on Data Mining. SIAM, Anaheim, CA, USA, 415--426.
[5]
Mohammad Al Hasan and Mohammed J. Zaki. 2011. A survey of link prediction in social networks. Social Network Data Analytics, 243--275.
[6]
Réka Albert and Albert-László Barabási. 2002. Statistical mechanics of complex networks. Reviews of Modern Physics 74, 1, 47--97.
[7]
Ioannis Antonellis, Hector Garcia Molina, and Chi Chao Chang. 2008. Simrank++: Query rewriting through link analysis of the click graph. Proceedings of the VLDB Endowment 1, 1, 408--421.
[8]
Lars Backstrom and Jure Leskovec. 2011. Supervised random walks: Predicting and recommending links in social networks. In Proceedings of the 4th ACM International Conference on Web Search and Data Mining. ACM, Hong Kong, China, 635--644.
[9]
Shenghua Bao, Guirong Xue, Xiaoyuan Wu, Yong Yu, Ben Fei, and Zhong Su. 2007. Optimizing web search using social annotations. In Proceedings of the 16th International Conference on World Wide Web. ACM, Banff, Alberta, Canada, 501--510.
[10]
Albert-László Barabási and Réka Albert. 1999. Emergence of scaling in random networks. Science 286, 5439, 509--512.
[11]
Sergey Brin and Lawrence Page. 1998. The anatomy of a large-scale hypertextual Web search engine. Computer Networks and ISDN Systems 30, 1, 107--117.
[12]
Rajmonda Sulo Caceres, Tanya Berger-Wolf, and Robert Grossman. 2011. Temporal scale of processes in dynamic networks. In IEEE International Conference on Data Mining Workshops. IEEE, Vancouver, Canada, 925--932.
[13]
Yuanzhe Cai, Miao Zhang, Chris Ding, and Sharma Chakravarthy. 2010. Closed form solution of similarity algorithms. In Proceedings of the 33rd International SIGIR Conference on Research and Development in Information Retrieval. ACM, Geneva, Switzerland, 709--710.
[14]
Hung-Hsuan Chen, Yan-Bin Ciou, and Shou-De Lin. 2012. Information propagation game: A tool to acquire humanplaying data for multiplayer influence maximization on social networks. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, Beijing, China, 1524--1527.
[15]
Hung-Hsuan Chen and C. Lee Giles. 2013. ASCOS: An asymmetric network structure context similarity measure. In Advances in Social Networks Analysis and Mining (ASONAM), 2013 IEEE/ACM International Conference on. IEEE, Niagara Falls, Canada, 442--449.
[16]
Hung-Hsuan Chen, Liang Gou, Xiaolong Zhang, and C. Lee Giles. 2011. CollabSeer: A search engine for collaboration discovery. In Proceeding of the 11th Annual International ACM/IEEE Joint Conference on Digital Libraries. ACM, Ottawa, Canada, 231--240.
[17]
Hung-Hsuan Chen, Liang Gou, Xiaolong Zhang, and C. Lee Giles. 2012a. Discovering missing links in networks using vertex similarity measures. In The 27th ACM Symposium on Applied Computing. ACM, Riva del Garda (Trento), Italy, 138--143.
[18]
Hung-Hsuan Chen, Liang Gou, Xioalong Zhang, and C. Lee Giles. 2012b. Predicting recent links in FOAF networks. In Social Computing, Behavioral-Cultural Modeling and Prediction. Springer, College Park, MD, USA, 156--163.
[19]
Hung-Hsuan Chen, Liang Gou, Xiaolong Luke Zhang, and C. Lee Giles. 2013a. Towards the discovery of diseases related by genes using vertex similarity measures. In Healthcare Informatics (ICHI), 2013 IEEE International Conference on. IEEE, Philadelphia, PA, USA, 505--510.
[20]
Hung-Hsuan Chen, David J. Miller, and C. Lee Giles. 2013b. The predictive value of young and old links in a social network. In Proceedings of the ACM SIGMOD Workshop on Databases and Social Networks. ACM, New York, NY, USA, 43--48.
[21]
Fan Chung. 2007. The heat kernel as the pagerank of a graph. Proc. Natl. Acad. Sci. USA 104, 50, 19735--19740.
[22]
Fan Chung. 2009. A local graph partitioning algorithm using heat kernel pagerank. Internet Mathematics 6, 3, 315--330.
[23]
Adele Cutler and Leo Breiman. 1994. Archetypal analysis. Technometrics 36, 4, 338--347.
[24]
Yuxiao Dong, Qing Ke, Bai Wang, and Bin Wu. 2011. Link prediction based on local information. In Advances in Social Networks Analysis and Mining (ASONAM), 2011 International Conference on. IEEE, Kaohsiung, Taiwan, 382--386.
[25]
Leo A. Goodman. 1961. Snowball sampling. The Annals of Mathematical Statistics 32, 1, 148--170.
[26]
Guoming He, Haijun Feng, Cuiping Li, and Hong Chen. 2010. Parallel simrank computation on large graphs with iterative aggregation. In Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, Washington, DC, USA, 543--552.
[27]
Keith Henderson, Brian Gallagher, Tina Eliassi-Rad, Hanghang Tong, Sugato Basu, Leman Akoglu, Danai Koutra, Christos Faloutsos, and Lei Li. 2012. Rolx: Structural role extraction & mining in large graphs. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, Beijing, China, 1231--1239.
[28]
Glen Jeh and Jennifer Widom. 2002. SimRank: A measure of structural-context similarity. In Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, Edmonton, Alberta, Canada, 538--543.
[29]
Leo Katz. 1953. A new status index derived from sociometric analysis. Psychometrika 18, 1, 39--43.
[30]
David Kempe, Jon Kleinberg, and Éva Tardos. 2003. Maximizing the spread of influence through a social network. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, Washington, DC, USA, 137--146.
[31]
Kyle Kloster and David F. Gleich. 2014. Heat kernel based community detection. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’14). ACM, New York, NY, USA, 1386--1395.
[32]
Danai Koutra, Joshua T. Vogelstein, and Christos Faloutsos. 2013. DeltaCon: A principled massive-graph similarity function. In Proceedings of the SIAM International Conference on Data Mining. SIAM, Austin, Texas, 162--170.
[33]
Elizabeth A. Leicht, Petter Holme, and Mark E. J. Newman. 2006. Vertex similarity in networks. Physical Review E 73, 2, 26120.
[34]
Jure Leskovec, Jon Kleinberg, and Christos Faloutsos. 2005. Graphs over time: Densification laws, shrinking diameters and possible explanations. In Proceedings of the 8th ACM International Conference on Knowledge Discovery and Data Mining. ACM, Chicago, IL, USA, 177--187.
[35]
Jure Leskovec, Andreas Krause, Carlos Guestrin, Christos Faloutsos, Jeanne VanBriesen, and Natalie Glance. 2007. Cost-effective outbreak detection in networks. In Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, San Jose, CA, USA, 420--429.
[36]
Cuiping Li, Jiawei Han, Guoming He, Xin Jin, Yizhou Sun, Yintao Yu, and Tianyi Wu. 2010a. Fast computation of simrank for static and dynamic information networks. In Proceedings of the 13th International Conference on Extending Database Technology. ACM, Lausanne, Switzerland, 465--476.
[37]
Pei Li, Hongyan Liu, Jeffrey Xu Yu, Jun He, and Xiaoyong Du. 2010b. Fast single-pair simrank computation. In Proceedings of the SIAM International Conference on Data Mining. SIAM, Columbus, Ohio, USA, 571--582.
[38]
David Liben-Nowell and Jon Kleinberg. 2007. The link-prediction problem for social networks. Journal of the American Society for Information Science and Technology 58, 7, 1019--1031.
[39]
Ryan Lichtnwalter and Nitesh V. Chawla. 2012. Link prediction: Fair and effective evaluation. In Advances in Social Networks Analysis and Mining (ASONAM), 2012 IEEE/ACM International Conference on. IEEE, Istanbul, Turkey, 376--383.
[40]
Linyuan Lü and Tao Zhou. 2011. Link prediction in complex networks: A survey. Physica A: Statistical Mechanics and its Applications 390, 6, 1150--1170.
[41]
Rashid Mehmood and Jon Crowcroft. 2005. Parallel iterative solution method for large sparse linear equation systems. Computer Laboratory: University of Cambridge.
[42]
Douglas L. Nelson, Cathy L. McEvoy, and Thomas A. Schreiber. 2004. The university of south florida free association, rhyme, and word fragment norms. Behavior Research Methods, Instruments, & Computers 36, 3, 402--407.
[43]
Mark E. J. Newman. 2003. The structure and function of complex networks. SIAM REVIEW 45, 2, 167--256.
[44]
Pang-Ning Tan, Michael Steinbach, and Vipin Kumar. 2006. Introduction to Data Mining. Pearson Addison Wesley.
[45]
Erzsébet Ravasz, Anna Lisa Somera, Dale A. Mongru, Zoltán N. Oltvai, and A.-L. Barabási. 2002. Hierarchical organization of modularity in metabolic networks. Science 297, 5586, 1551--1555.
[46]
Yousef Saad. 2003. Iterative Methods for Sparse Linear Systems. SIAM.
[47]
Gerard Salton. 1989. Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer. Addison-Wesley.
[48]
Charalampos E. Tsourakakis. 2014. Towards quantifying vertex similarity in networks. Internet Mathematics 10, 3--4, 263--286.
[49]
Amos Tversky. 1977. Features of similarity. Psychological Review 84, 4, 327--352.
[50]
Wensi Xi, Edward A. Fox, Weiguo Fan, Benyu Zhang, Zheng Chen, Jun Yan, and Dong Zhuang. 2005. SimFusion: Measuring similarity using unified relationship matrix. In Proceedings of the 28th Annual International SIGIR Conference on Research and Development in Information Retrieval. ACM, Salvador, Brazil, 130--137.
[51]
Weiren Yu, Xuemin Lin, Wenjie Zhang, Ying Zhang, and Jiajin Le. 2012. SimFusion+: Extending simfusion towards efficient estimation on large and dynamic networks. In Proceedings of the 35th International SIGIR Conference on Research and Development in Information Retrieval. ACM, Portland, OR, USA, 365--374.
[52]
Reza Zafarani, Mohammad Ali Abbasi, and Huan Liu. 2014. Social Media Mining: An Introduction. Cambridge University Press.
[53]
Jun Zhang, Chaokun Wang, Philip S. Yu, and Jianmin Wang. 2013. Learning latent friendship propagation networks with interest awareness for link prediction. In Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’13). ACM, Dublin, Ireland, 63--72.
[54]
Peixiang Zhao, Jiawei Han, and Yizhou Sun. 2009. P-Rank: A comprehensive structural similarity measure over information networks. In Proceeding of the 18th ACM Conference on Information and Knowledge Management. ACM, Hong Kong, China, 553--562.
[55]
Tao Zhou, Linyuan Lü, and Yi-Cheng Zhang. 2009. Predicting missing links via local information. The European Physical Journal B-Condensed Matter and Complex Systems 71, 4, 623--630.

Cited By

View all
  • (2023)Detecting Inactive Cyberwarriors from Online Forums2023 IEEE International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT)10.1109/WI-IAT59888.2023.00008(9-15)Online publication date: 26-Oct-2023
  • (2023)Detecting Inaccurate Sensors on a Large-Scale Sensor Network Using Centralized and Localized Graph Neural NetworksIEEE Sensors Journal10.1109/JSEN.2023.328727023:15(16446-16455)Online publication date: 1-Aug-2023
  • (2022)CoSimHeat: An Effective Heat Kernel Similarity Measure Based on Billion-Scale Network Topology✱Proceedings of the ACM Web Conference 202210.1145/3485447.3511952(234-245)Online publication date: 25-Apr-2022
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Transactions on Knowledge Discovery from Data
ACM Transactions on Knowledge Discovery from Data  Volume 10, Issue 2
October 2015
291 pages
ISSN:1556-4681
EISSN:1556-472X
DOI:10.1145/2835206
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 October 2015
Accepted: 01 May 2015
Revised: 01 January 2015
Received: 01 August 2014
Published in TKDD Volume 10, Issue 2

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. ASCOS++
  2. SimRank
  3. Vertex similarity
  4. coauthor network
  5. link analysis
  6. link prediction

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • National Science Foundation and Dow Chemical

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)18
  • Downloads (Last 6 weeks)2
Reflects downloads up to 10 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2023)Detecting Inactive Cyberwarriors from Online Forums2023 IEEE International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT)10.1109/WI-IAT59888.2023.00008(9-15)Online publication date: 26-Oct-2023
  • (2023)Detecting Inaccurate Sensors on a Large-Scale Sensor Network Using Centralized and Localized Graph Neural NetworksIEEE Sensors Journal10.1109/JSEN.2023.328727023:15(16446-16455)Online publication date: 1-Aug-2023
  • (2022)CoSimHeat: An Effective Heat Kernel Similarity Measure Based on Billion-Scale Network Topology✱Proceedings of the ACM Web Conference 202210.1145/3485447.3511952(234-245)Online publication date: 25-Apr-2022
  • (2022)Personalized query techniques in graphsInformation Sciences: an International Journal10.1016/j.ins.2022.06.023607:C(961-1000)Online publication date: 1-Aug-2022
  • (2021)Adaptive Similarity Function with Structural Features of Network Embedding for Missing Link PredictionComplexity10.1155/2021/12775792021Online publication date: 1-Jan-2021
  • (2021)AdaSimProceedings of the 30th ACM International Conference on Information & Knowledge Management10.1145/3459637.3482316(1528-1537)Online publication date: 26-Oct-2021
  • (2021)Expanding the Possibilities of the Metric Approach Based on the Theory of Means and the Theory of ErrorsAutomation and Remote Control10.1134/S000511792111007282:11(1912-1922)Online publication date: 22-Dec-2021
  • (2021)Finding diverse ways to improve algebraic connectivity through multi-start optimizationJournal of Complex Networks10.1093/comnet/cnab0059:1Online publication date: 25-Apr-2021
  • (2021)RoleSim*: Scaling axiomatic role-based similarity ranking on large graphsWorld Wide Web10.1007/s11280-021-00925-z25:2(785-829)Online publication date: 11-Aug-2021
  • (2020)Measuring similarity in co-occurrence data using ego-networksChaos: An Interdisciplinary Journal of Nonlinear Science10.1063/1.512903630:1Online publication date: 2-Jan-2020
  • Show More Cited By

View Options

Get Access

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media