DOI: 10.1145/2783258.2783297
research-article

Efficient PageRank Tracking in Evolving Networks

Published: 10 August 2015

Abstract

Real-world networks, such as the World Wide Web and online social networks, are very large and evolve rapidly. Tracking personalized PageRank in such evolving networks is therefore an important challenge in network analysis and graph mining.
In this paper, we propose an efficient online algorithm for tracking personalized PageRank in an evolving network. The proposed algorithm tracks personalized PageRank accurately (i.e., within a given accuracy ε > 0). Moreover, it updates the personalized PageRank scores in amortized O(1/ε) iterations per graph modification. In addition, when m edges are randomly and sequentially inserted, the expected total number of iterations is O(log m/ε).
We evaluated our algorithm on real-world networks. On average, for each edge insertion or deletion, our algorithm updated the personalized PageRank in 3 μs on a web graph with 105M vertices and 3.7B edges, and in 20 ms on a social network with 42M vertices and 1.5B edges. Compared with existing state-of-the-art algorithms, our algorithm is 2 to 290 times faster at equal accuracy.
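
The abstract does not spell out the update rule, so the following is only an illustrative sketch of one way to track personalized PageRank under edge insertions and deletions. It assumes a residual-push (Gauss-Southwell-style) scheme for the linear system x = αPx + (1 − α)b, where P is the column-normalized transition matrix and b is the indicator of the source vertex. The class name `PPRTracker`, the helper `power_iteration_ppr`, the damping factor `alpha = 0.85`, and the stopping rule (every residual entry at most `eps` in absolute value) are assumptions made for this example and are not taken from the paper; how `eps` relates to the accuracy ε above is likewise not claimed here.

```python
from collections import defaultdict


class PPRTracker:
    """Hypothetical sketch: maintain x ~ PPR(source) and the residual
    r = (1 - alpha) * b - (I - alpha * P) x while edges change."""

    def __init__(self, source, alpha=0.85, eps=1e-6):
        self.alpha = alpha
        self.eps = eps
        self.out = defaultdict(set)    # out-neighbours (simple directed graph)
        self.x = defaultdict(float)    # current PPR estimate
        self.r = defaultdict(float)    # current residual
        self.r[source] = 1.0 - alpha   # with x = 0 the residual is (1 - alpha) * b
        self._push([source])

    def _push(self, candidates):
        """Push residual mass until every |r[u]| <= eps; returns the push count."""
        stack = list(candidates)
        pushes = 0
        while stack:
            u = stack.pop()
            ru = self.r[u]
            if abs(ru) <= self.eps:
                continue
            self.x[u] += ru            # absorb the residual into the estimate
            self.r[u] = 0.0
            nbrs = self.out[u]
            for w in nbrs:             # forward alpha * ru along out-edges
                self.r[w] += self.alpha * ru / len(nbrs)
                if abs(self.r[w]) > self.eps:
                    stack.append(w)
            pushes += 1
        return pushes

    def _change(self, u, mutate):
        """Apply r += alpha * (P_new - P_old) x for a change in u's out-edges."""
        xu = self.x[u]
        old = self.out[u]
        touched = set(old)
        for w in old:                  # remove the old column of P scaled by x[u]
            self.r[w] -= self.alpha * xu / len(old)
        mutate(self.out[u])
        new = self.out[u]
        for w in new:                  # add the new column of P scaled by x[u]
            self.r[w] += self.alpha * xu / len(new)
        touched |= new
        return self._push(touched)     # only locally disturbed vertices are revisited

    def insert_edge(self, u, v):
        return self._change(u, lambda s: s.add(v))

    def delete_edge(self, u, v):
        return self._change(u, lambda s: s.discard(v))


def power_iteration_ppr(out, source, alpha=0.85, iters=300):
    """From-scratch baseline used only to sanity-check the tracker."""
    x = defaultdict(float)
    for _ in range(iters):
        nxt = defaultdict(float)
        nxt[source] += 1.0 - alpha
        for u, nbrs in out.items():
            for w in nbrs:
                nxt[w] += alpha * x[u] / len(nbrs)
        x = nxt
    return x
```

In this sketch, calling `tracker.insert_edge(u, v)` or `tracker.delete_edge(u, v)` returns the number of push steps that the modification triggered, and `tracker.x` can be compared against `power_iteration_ppr(tracker.out, source)` on a small graph. Vertices without out-edges simply lose their forwarded mass here, which is a simplification chosen for brevity.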

Supplementary Material

MP4 File (p875.mp4)




    Published In

    KDD '15: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
    August 2015
    2378 pages
    ISBN: 9781450336642
    DOI: 10.1145/2783258

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. dynamic graphs
    2. online algorithms
    3. personalized pagerank

    Qualifiers

    • Research-article

    Conference

    KDD '15
    Acceptance Rates

    KDD '15 Paper Acceptance Rate 160 of 819 submissions, 20%;
    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

    Cited By

    • (2024) BIRD: Efficient Approximation of Bidirectional Hidden Personalized PageRank. Proceedings of the VLDB Endowment 17:9 (2255-2268). DOI: 10.14778/3665844.3665855. Online publication date: 1-May-2024
    • (2024) Efficient Algorithms for Personalized PageRank Computation: A Survey. IEEE Transactions on Knowledge and Data Engineering 36:9 (4582-4602). DOI: 10.1109/TKDE.2024.3376000. Online publication date: Sep-2024
    • (2024) Multi-Order Clustering on Dynamic Networks: On Error Accumulation and Its Elimination. IEEE INFOCOM 2024 - IEEE Conference on Computer Communications (1950-1959). DOI: 10.1109/INFOCOM52122.2024.10621124. Online publication date: 20-May-2024
    • (2024) Personalized PageRanks over Dynamic Graphs - The Case for Optimizing Quality of Service. 2024 IEEE 40th International Conference on Data Engineering (ICDE) (409-422). DOI: 10.1109/ICDE60146.2024.00038. Online publication date: 13-May-2024
    • (2024) A Sample Reuse Strategy for Dynamic Influence Maximization Problem. Bio-Inspired Computing: Theories and Applications (107-120). DOI: 10.1007/978-981-97-2275-4_9. Online publication date: 16-Apr-2024
    • (2024) DF* PageRank: Incrementally Expanding Approaches for Updating PageRank on Dynamic Graphs. Euro-Par 2024: Parallel Processing (312-326). DOI: 10.1007/978-3-031-69583-4_22. Online publication date: 26-Aug-2024
    • (2023) Efficient Tree-SVD for Subset Node Embedding over Large Dynamic Graphs. Proceedings of the ACM on Management of Data 1:1 (1-26). DOI: 10.1145/3588950. Online publication date: 30-May-2023
    • (2023) Personalized PageRank on Evolving Graphs with an Incremental Index-Update Scheme. Proceedings of the ACM on Management of Data 1:1 (1-26). DOI: 10.1145/3588705. Online publication date: 30-May-2023
    • (2023) Real-Time PageRank on Dynamic Graphs. Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing (239-251). DOI: 10.1145/3588195.3593004. Online publication date: 7-Aug-2023
    • (2023) FTLTM: Fine Tuned Linear Threshold Model for gauging of influential user in complex networks for information diffusion. International Journal of Information Technology 15:7 (3593-3604). DOI: 10.1007/s41870-023-01387-4. Online publication date: 15-Aug-2023
