DOI: 10.1145/1835804.1835916

Fast query execution for retrieval models based on path-constrained random walks

Published: 25 July 2010

Abstract

    Many recommendation and retrieval tasks can be represented as proximity queries on a labeled directed graph, with typed nodes representing documents, terms, and metadata, and labeled edges representing the relationships between them. Recent work has shown that the accuracy of the widely used random-walk-based proximity measures can be improved by supervised learning. One especially effective learning technique is based on Path-Constrained Random Walks (PCRW), in which similarity is defined by a learned combination of constrained random walkers, each constrained to follow only a particular sequence of edge labels away from the query nodes. The PCRW-based method significantly outperformed unsupervised random-walk-based queries and models with learned edge weights. Unfortunately, PCRW query systems are expensive to evaluate. In this study we evaluate approximations to the computation of the PCRW distributions, including fingerprinting, particle filtering, and truncation strategies. In experiments on several recommendation and retrieval problems using two large scientific publication corpora, we show speedups of factors of 2 to 100 with little loss in accuracy.
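
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy graph `EDGES`, the node and label names, the uniform transition rule, and all function names are assumptions made for illustration. `walk_exact` pushes a probability distribution along one sequence of edge labels, `truncate` stands in for a truncation strategy by keeping only the highest-probability nodes, `walk_sampled` estimates the same distribution with independent simulated walkers (a simple Monte Carlo stand-in for the sampling-based approximations), and `score` combines several paths with weights that would normally be learned.

```python
import random
from collections import defaultdict

# Hypothetical toy graph: EDGES[label][source] -> list of target nodes.
EDGES = {
    "cites": {"p1": ["p2", "p3"], "p2": ["p3"]},
    "written_by": {"p2": ["alice"], "p3": ["alice", "bob"]},
}

def step(dist, label, edges):
    """Advance a node distribution one hop along edges carrying `label`."""
    nxt = defaultdict(float)
    for node, mass in dist.items():
        targets = edges.get(label, {}).get(node, [])
        for t in targets:
            # Assumption: the walker picks uniformly among matching edges.
            nxt[t] += mass / len(targets)
    return dict(nxt)

def truncate(dist, keep):
    """Keep only the `keep` highest-probability nodes (no renormalization)."""
    top = sorted(dist.items(), key=lambda kv: kv[1], reverse=True)[:keep]
    return dict(top)

def walk_exact(query, path, edges, keep=None):
    """Distribution of a walker that follows `path` from the query nodes."""
    dist = {q: 1.0 / len(query) for q in query}
    for label in path:
        dist = step(dist, label, edges)
        if keep is not None:
            dist = truncate(dist, keep)
    return dist

def walk_sampled(query, path, edges, walkers, rng):
    """Monte Carlo estimate of the same distribution from simulated walkers."""
    counts = defaultdict(int)
    for _ in range(walkers):
        node = rng.choice(query)
        for label in path:
            targets = edges.get(label, {}).get(node, [])
            if not targets:
                node = None  # walker is stuck and contributes no mass
                break
            node = rng.choice(targets)
        if node is not None:
            counts[node] += 1
    return {n: c / walkers for n, c in counts.items()}

def score(query, weighted_paths, edges):
    """Weighted sum of per-path distributions; weights are given, not learned here."""
    total = defaultdict(float)
    for path, weight in weighted_paths:
        for node, p in walk_exact(query, path, edges).items():
            total[node] += weight * p
    return dict(total)
```

For example, `walk_exact(["p1"], ["cites", "written_by"], EDGES)` ranks authors of papers cited by `p1`, while `walk_sampled` trades exactness for a cost that depends on the number of walkers rather than on the size of the intermediate distributions.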

    Supplementary Material

    MOV File (kdd2010_lao_fer_01.mov)




      Published In

      KDD '10: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
      July 2010
      1240 pages
      ISBN: 9781450300551
      DOI: 10.1145/1835804
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. filtering and recommending
      2. learning to rank
      3. path-constrained random walks
      4. relational retrieval

      Qualifiers

      • Research-article

      Conference

      KDD '10
      Acceptance Rates

      Overall Acceptance Rate 1,133 of 8,635 submissions, 13%


      Cited By

      • (2022) Short Text Topic Learning Using Heterogeneous Information Network. IEEE Transactions on Knowledge and Data Engineering, pp. 1-1. DOI: 10.1109/TKDE.2022.3147766. Online publication date: 2022
      • (2022) A Survey of Data Mining Method in Heterogeneous Information Networks with Node Importance Evaluation. 2022 IEEE 8th International Conference on Computer and Communications (ICCC), pp. 412-417. DOI: 10.1109/ICCC56324.2022.10065864. Online publication date: 9-Dec-2022
      • (2022) Homophily-aware correction framework for crowdsourced labels using heterogeneous information network. Expert Systems with Applications, 200, article 116896. DOI: 10.1016/j.eswa.2022.116896. Online publication date: Aug-2022
      • (2022) Discover Important Paths in the Knowledge Graph Based on Dynamic Relation Confidence. Big Data and Social Computing, pp. 341-358. DOI: 10.1007/978-981-19-7532-5_22. Online publication date: 7-Dec-2022
      • (2022) Finding All Shortest Meaningful Meta-Paths Between Two Vertices of a Secured Large Heterogeneous Information Network Using Distributed Algorithm. Robotics and AI for Cybersecurity and Critical Infrastructure in Smart Cities, pp. 171-192. DOI: 10.1007/978-3-030-96737-6_10. Online publication date: 29-Mar-2022
      • (2021) Incorporating heterogeneous information in deep learning with informative meta-paths for community recommendations. Journal of Information Science, 49:5, pp. 1309-1324. DOI: 10.1177/01655515211047423. Online publication date: 13-Oct-2021
      • (2021) Neural PathSim for Inductive Similarity Search in Heterogeneous Information Networks. Proceedings of the 30th ACM International Conference on Information & Knowledge Management, pp. 2201-2210. DOI: 10.1145/3459637.3482454. Online publication date: 26-Oct-2021
      • (2021) Measuring diversity in heterogeneous information networks. Theoretical Computer Science, 859, pp. 80-115. DOI: 10.1016/j.tcs.2021.01.013. Online publication date: Mar-2021
      • (2020) Distributed Algorithms for Finding Meta-Paths of a Large Heterogeneous Information Network on Cloud. Modern Principles, Practices, and Algorithms for Cloud Security, pp. 223-249. DOI: 10.4018/978-1-7998-1082-7.ch011. Online publication date: 2020
      • (2020) Effective Similarity Search on Heterogeneous Networks: A Meta-path Free Approach. IEEE Transactions on Knowledge and Data Engineering, pp. 1-1. DOI: 10.1109/TKDE.2020.3019488. Online publication date: 2020
