DOI: 10.1145/1367497.1367517

Learning multiple graphs for document recommendations

Published: 21 April 2008

Abstract

The Web offers rich relational data with different semantics. In this paper, we address the problem of document recommendation in a digital library, where the documents in question are networked by citations and are associated with other entities by various relations. Because a single graph is sparse and graph construction is noisy, we propose a new method for combining multiple graphs to measure document similarities, where different factorization strategies are used based on the nature of the different graphs. In particular, the new method seeks a single low-dimensional embedding of documents that captures their relative similarities in a latent space. Based on the obtained embedding, a new recommendation framework is developed using semi-supervised learning on graphs. In addition, we address the scalability issue and propose an incremental algorithm. The new incremental method significantly improves efficiency by calculating the embedding for new incoming documents only. The new batch and incremental methods are evaluated on two real-world datasets prepared from CiteSeer. Experiments demonstrate significant quality improvement for our batch method and significant efficiency improvement with tolerable quality loss for our incremental method.
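
The last stage described in the abstract, semi-supervised learning on a graph derived from the document embedding, can be illustrated with a small sketch. This is not the authors' algorithm: it assumes the embedding is already available, builds the graph from clipped cosine similarity (an assumption), and uses a commonly used normalized label-propagation iteration, F <- alpha * S * F + (1 - alpha) * Y. The function names, the four 2-D embeddings, and the value of alpha are all hypothetical.

```python
import math


def cosine_similarity_graph(embeddings):
    """Build a symmetric, non-negative similarity matrix from embedding vectors."""
    n = len(embeddings)
    norms = [math.sqrt(sum(x * x for x in v)) for v in embeddings]
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if norms[i] == 0.0 or norms[j] == 0.0:
                continue
            dot = sum(a * b for a, b in zip(embeddings[i], embeddings[j]))
            # Clip negative cosines to zero so edge weights stay non-negative.
            W[i][j] = W[j][i] = max(0.0, dot / (norms[i] * norms[j]))
    return W


def propagate_labels(W, Y, alpha=0.9, iterations=200):
    """Iterate F <- alpha * S @ F + (1 - alpha) * Y with S = D^-1/2 W D^-1/2."""
    n, k = len(W), len(Y[0])
    degree = [sum(row) for row in W]
    # Symmetric degree normalization; isolated nodes get all-zero rows.
    S = [[W[i][j] / math.sqrt(degree[i] * degree[j])
          if degree[i] > 0 and degree[j] > 0 else 0.0
          for j in range(n)] for i in range(n)]
    F = [row[:] for row in Y]
    for _ in range(iterations):
        F = [[alpha * sum(S[i][j] * F[j][c] for j in range(n))
              + (1 - alpha) * Y[i][c]
              for c in range(k)] for i in range(n)]
    return F


# Hypothetical 2-D embeddings for four documents (made-up values).
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
# Columns: [relevant, not relevant]; documents 1 and 3 are unlabeled.
Y = [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
scores = propagate_labels(cosine_similarity_graph(embeddings), Y)
```

In this toy setup, unlabeled document 1 sits close to the document labeled relevant, so its first score column ends up larger than its second; document 3 behaves the opposite way. A ranking for recommendation could then be read off the first column.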

References

[1] F. Chung. Spectral Graph Theory. American Mathematical Society, 1997.
[2] F. Chung. Laplacians and the Cheeger inequality for directed graphs. Annals of Combinatorics, 9, 2005.
[3] D. Cohn and H. Chang. Learning to probabilistically identify authoritative documents. In Proc. ICML 2000, pages 167--174, 2000.
[4] D. Cohn and T. Hofmann. The missing link - a probabilistic model of document content and hypertext connectivity. In T. K. Leen, T. G. Dietterich, and V. Tresp, editors, Advances in Neural Information Processing Systems 13, pages 430--436. MIT Press, 2001.
[5] S. C. Deerwester, S. T. Dumais, T. K. Landauer, G. W. Furnas, and R. A. Harshman. Indexing by latent semantic analysis. Journal of the American Society of Information Science, 41(6):391--407, 1990.
[6] M. Fazel, H. Hindi, and S. P. Boyd. Log-det heuristic for matrix rank minimization with applications to Hankel and Euclidean distance matrices. In Proceedings of American Control Conference, 2003.
[7] R. Guha, R. Kumar, P. Raghavan, and A. Tomkins. Propagation of trust and distrust. In WWW '04: Proceedings of the 13th international conference on World Wide Web, pages 403--412, New York, NY, USA, 2004. ACM Press.
[8] X. He, H. Zha, C. H. Q. Ding, and H. D. Simon. Web document clustering using hyperlink structures. Computational Statistics & Data Analysis, 41(1):19--45, November 2002.
[9] T. Hofmann. Unsupervised learning by probabilistic latent semantic analysis. Machine Learning, 42(1-2):177--196, 2001.
[10] T. Hofmann. Latent semantic models for collaborative filtering. ACM Trans. Inf. Syst., 22(1):89--115, 2004.
[11] B. Sarwar, G. Karypis, J. Konstan, and J. Reidl. Item-based collaborative filtering recommendation algorithms. In WWW '01: Proceedings of the 10th international conference on World Wide Web, pages 285--295, New York, NY, USA, 2001. ACM Press.
[12] F. Wang, S. Ma, L. Yang, and T. Li. Recommendation on item graphs. In ICDM '06: Proceedings of the Sixth International Conference on Data Mining, pages 1119--1123, Washington, DC, USA, 2006. IEEE Computer Society.
[13] J. Wang, A. P. de Vries, and M. J. T. Reinders. Unifying user-based and item-based collaborative filtering approaches by similarity fusion. In SIGIR '06: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval, pages 501--508, New York, NY, USA, 2006. ACM Press.
[14] K. Yu, A. Schwaighofer, V. Tresp, X. Xu, and H.-P. Kriegel. Probabilistic memory-based collaborative filtering. IEEE Transactions on Knowledge and Data Engineering, 16(1):56--69, 2004.
[15] H. Zha, C. Ding, M. Gu, X. He, and H. Simon. Spectral relaxation for k-means clustering. In Neural Information Processing Systems, volume 14, 2001.
[16] D. Zhou and C. J. C. Burges. Spectral clustering and transductive learning with multiple views. In ICML '07: Proceedings of the 24th international conference on Machine learning, pages 1159--1166, 2007.
[17] D. Zhou, I. Councill, H. Zha, and C. L. Giles. Discovering temporal communities from social network documents. In ICDM '07: Proceedings of the 7th IEEE International Conference on Data Mining, 2007.
[18] D. Zhou, J. Huang, and B. Schölkopf. Learning from labeled and unlabeled data on a directed graph. In ICML '05: Proceedings of the 22nd international conference on Machine learning, pages 1036--1043, 2005.
[19] D. Zhou, E. Manavoglu, J. Li, C. L. Giles, and H. Zha. Probabilistic models for discovering e-communities. In WWW '06: Proceedings of the 15th international conference on World Wide Web, pages 173--182. ACM Press, 2006.
[20] S. Zhu, K. Yu, Y. Chi, and Y. Gong. Combining content and link for classification using matrix factorization. In SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, 2007.


Reviews

Amos O Olagunju

The measurement of document similarities by using citations in a digital library naturally leads to fundamental questions. How should different graphs for measurement of document similarities in information retrieval systems be presented? How should document citation (DC) graphs and bipartite graphs of the document-author (DA) and document-publication (DP) venue be explored to develop novel document similarity metrics? Principles and applications of collaborative item filtering (CIF) in document similarity measurement exist in the literature [1,2,3]. However, the usefulness of CIF in networked document items (DIs) on the Web remains underexploited. The authors propose a new document recommendation framework predicated on the semi-supervised learning of multiple graphs. The framework includes a multiple graph learning model for accurately capturing the relative similarities among single low-dimensional embedded DIs in a latent semantic space, and a label propagation scheme for making use of the similarities among partly tagged documents to calculate approximately the labels of untagged documents. The authors combine the learning represented in the DC, DA, and DP graphs into an optimization problem, with factorization strategies to obtain the distinctive traits of the graphs. They present an incremental document embedding update algorithm for enhancing the scalability and efficiency of processing large documents in digital libraries. The standard conjugate gradient method is used to solve the optimization problems in both the learning and the incremental updating of document insertion. The proposed batch and incremental techniques for merging multiple graphs by document similarity measurement are evaluated using samples of datasets from real-world digital databases containing documents with citations, authors, and venues.
Experimental precision and recall results of document recommendations show major quality enhancement with the batch technique, and a significant upgrade of efficiency, with acceptable quality depreciation, with the incremental method. Consequently, the planned graph-synthesizing methods are excellent candidates for document recommendation over cooperating networks of digital libraries.
Online Computing Reviews Service
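
The review reports precision and recall for the recommendations without defining them on this page. As a reminder of the usual definitions only, here is a minimal sketch of precision and recall over the top-k items of a ranked list. The paper's exact evaluation protocol is not described here, so the cutoff convention, the function name, and the document identifiers below are assumptions.

```python
def precision_recall_at_k(ranked, relevant, k):
    """Return (precision, recall) for the top-k items of a ranked list.

    ranked   -- recommended document ids, best first
    relevant -- set of document ids judged relevant
    k        -- cutoff; precision divides by k, a common convention
    """
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    precision = hits / k if k > 0 else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ranking and relevance judgments (made-up ids).
ranked = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2", "d9"}
p3, r3 = precision_recall_at_k(ranked, relevant, 3)
p4, r4 = precision_recall_at_k(ranked, relevant, 4)
```

With these made-up inputs, one of the top three recommendations is relevant, and extending the cutoff to four picks up a second relevant document, which raises both measures.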

Published In

WWW '08: Proceedings of the 17th international conference on World Wide Web
April 2008
1326 pages
ISBN:9781605580852
DOI:10.1145/1367497
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. collaborative filtering
  2. recommender systems
  3. semi-supervised learning
  4. social network analysis
  5. spectral clustering

Qualifiers

  • Research-article

Conference

WWW '08
Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Article Metrics

  • Downloads (last 12 months): 17
  • Downloads (last 6 weeks): 0

Reflects downloads up to 10 Nov 2024

Cited By

  • (2024) IntellectSeeker: A Personalized Literature Management System with the Probabilistic Model and Large Language Model. Knowledge Science, Engineering and Management, pages 270-282. DOI: 10.1007/978-981-97-5489-2_24. Online publication date: 27-Jul-2024
  • (2023) A Content-Collaborative Recommender System based on clustering and ontology. Signal and Data Processing, 20:3, pages 197-224. DOI: 10.61186/jsdp.20.3.197. Online publication date: 1-Dec-2023
  • (2022) Tell Me How to Survey: Literature Review Made Simple with Automatic Reading Path Generation. 2022 IEEE 38th International Conference on Data Engineering (ICDE), pages 3426-3438. DOI: 10.1109/ICDE53745.2022.00322. Online publication date: May-2022
  • (2022) Citation recommendation using semantic representation of cited papers’ relations and content. Expert Systems with Applications, 187, article 115826. DOI: 10.1016/j.eswa.2021.115826. Online publication date: Jan-2022
  • (2022) Enhancing citation recommendation using citation network embedding. Scientometrics. DOI: 10.1007/s11192-021-04196-3. Online publication date: 11-Jan-2022
  • (2021) Creative Destruction: The Structural Consequences of Scientific Curation. American Sociological Review, 86:2, pages 341-376. DOI: 10.1177/0003122421996323. Online publication date: 18-Mar-2021
  • (2021) Graph Neural Collaborative Topic Model for Citation Recommendation. ACM Transactions on Information Systems, 40:3, pages 1-30. DOI: 10.1145/3473973. Online publication date: 17-Nov-2021
  • (2021) A Hybrid Recommender System Using KNN and Clustering. International Journal of Information Technology & Decision Making, 20:02, pages 553-596. DOI: 10.1142/S021962202150005X. Online publication date: 31-Mar-2021
  • (2021) VOPRec: Vector Representation Learning of Papers with Text Information and Structural Identity for Recommendation. IEEE Transactions on Emerging Topics in Computing, 9:1, pages 226-237. DOI: 10.1109/TETC.2018.2830698. Online publication date: 1-Jan-2021
  • (2020) Additive Angular Margin Loss in Deep Graph Neural Network Classifier for Learning Graph Edit Distance. IEEE Access, 8, pages 201752-201761. DOI: 10.1109/ACCESS.2020.3035886. Online publication date: 2020
