Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
10.1145/3106426.3109447acmconferencesArticle/Chapter ViewAbstractPublication PageswiConference Proceedingsconference-collections
research-article

Partial sums-based P-Rank computation in information networks

Published: 23 August 2017 Publication History

Abstract

P-Rank is a simple and captivating link-based similarity measure that extends SimRank by exploiting both in- and out-links for similarity computation. However, the existing work of P-Rank computation is expensive in terms of time and space cost and cannot efficiently support similarity computation in large information networks. For tackling this problem, in this paper, we propose an optimization technique for fast P-Rank computation in information networks by adopting the spiritual of partial sums. We write P-Rank equation based on partial sums and further approximate this equation by setting a threshold for ignoring the small similarity scores during iterative similarity computation. An optimized similarity computation algorithm is developed, which reduces the computation cost by skipping the similarity scores smaller than the give threshold during accumulation operations. And the accuracy loss estimation under the threshold is given through extensive mathematical analysis. Extensive experiments demonstrate the effectiveness and efficiency of our proposed approach through comparing with the straightforward P-Rank computation algorithm.

References

[1]
R. Amsler. 1972. Application of citation-based automatic classification. Technical report, The University of Texas at Austin Linguistics Research Center.
[2]
Ioannis Antonellis, Hector Garcia-Molina, and Chi-Chao Chang. 2008. Simrank++: query rewriting through link analysis of the click graph. PVLDB 1, 1 (2008), 408--421.
[3]
Dániel Fogaras and Balázs Rácz. 2005. Scaling link-based similarity search. In WWW. 641--650.
[4]
Kalervo Järvelin and Jaana Kekäläinen. 2002. Cumulated gain-based evaluation of IR techniques. ACM Trans. Inf. Syst. 20, 4 (2002).
[5]
Glen Jeh and Jennifer Widom. 2002. SimRank: a measure of structural-context similarity. In KDD. 538--543.
[6]
M. M. Kessler. 1963. Bibliographic coupling between scientific papers. American Documentation 14 (1963), 10--25.
[7]
Cuiping Li, Jiawei Han, Guoming He, Xin Jin, Yizhou Sun, Yintao Yu, and Tianyi Wu. 2010. Fast computation of SimRank for static and dynamic information networks. In EDBT. 465--476.
[8]
Lina Li, Cuiping Li, Hong Chen, and Xiaoyong Du. 2013. MapReduce-Based SimRank Computation and Its Application in Social Recommender System. In IEEE International Congress on Big Data, BigData Congress 2013, June 27 2013--July 2, 2013. 133--140.
[9]
Ruiqi Li, Xiang Zhao, Haichuan Shang, Yifan Chen, and Weidong Xiao. 2017. Fast top-k similarity join for SimRank. Inf. Sci. 381 (2017), 1--19.
[10]
Zhenjiang Lin, Irwin King, and Michael R. Lyu. 2006. PageSim: A Novel Link-Based Similarity Measure for the World Wide Web. In WI. 687--693.
[11]
Dmitry Lizorkin, Pavel Velikhov, Maxim N. Grinev, and Denis Turdakov. 2008. Accuracy estimate and optimization techniques for SimRank computation. PVLDB 1, 1 (2008), 422--433.
[12]
Jia-Yu Pan, Hyung-Jeong Yang, Christos Faloutsos, and Pinar Duygulu. 2004. Automatic multimedia cross-modal correlation discovery. In KDD. 653--658.
[13]
Henry G. Small. 1973. Co-citation in the scientific literature: A new measure of the relationship betweenf two documents. Journal of the American Society for Information Scienc 24, 4 (1973), 265--269.
[14]
Wensi Xi, Edward A. Fox, Weiguo Fan, Benyu Zhang, Zheng Chen, Jun Yan, and Dong Zhuang. 2005. SimFusion: measuring similarity using unified relationship matrix. In SIGIR. 130--137.
[15]
Xiaoxin Yin, Jiawei Han, and Philip S. Yu. 2006. LinkClus: Efficient Clustering via Heterogeneous Semantic Links. In VLDB. 427--438.
[16]
Mingxi Zhang, Hao Hu, Zhenying He, Liping Gao, and Liujie Sun. 2015. Efficient link-based similarity search in web networks. Expert Syst. Appl. 42, 22 (2015), 8868--8880.
[17]
Mingxi Zhang, Hao Hu, Zhenying He, and Wei Wang. 2015. Top-k similarity search in heterogeneous information networks with x-star network schema. Expert Syst. Appl. 42, 2 (2015), 699--712.
[18]
Peixiang Zhao, Jiawei Han, and Yizhou Sun. 2009. P-Rank: a comprehensive structural similarity measure over information networks. In Proceedings of the 18th ACM Conference on Information and Knowledge Management, CIKM 2009, Hong Kong, China, November 2--6, 2009. 553--562.

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
WI '17: Proceedings of the International Conference on Web Intelligence
August 2017
1284 pages
ISBN:9781450349512
DOI:10.1145/3106426
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 23 August 2017

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. P-Rank
  2. information network
  3. partial sums
  4. similarity computation

Qualifiers

  • Research-article

Funding Sources

  • National High-tech R&D Program of China (863 Program)
  • University of Shanghai for Science and Technology
  • Innovation Program of Shanghai Municipal Education Commission
  • Natural Science Foundation of Shanghai

Conference

WI '17
Sponsor:

Acceptance Rates

WI '17 Paper Acceptance Rate 118 of 178 submissions, 66%;
Overall Acceptance Rate 118 of 178 submissions, 66%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • 0
    Total Citations
  • 55
    Total Downloads
  • Downloads (Last 12 months)2
  • Downloads (Last 6 weeks)0
Reflects downloads up to 11 Jan 2025

Other Metrics

Citations

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media