research-article

fLDA: matrix factorization through latent Dirichlet allocation

Authors:

Deepak Agarwal,

Bee-Chung Chen

WSDM '10: Proceedings of the third ACM international conference on Web search and data mining

Pages 91 - 100

https://doi.org/10.1145/1718487.1718499

Published: 04 February 2010

Abstract

We propose fLDA, a novel matrix factorization method to predict ratings in recommender system applications where a "bag-of-words" representation for item meta-data is natural. Such scenarios are commonplace in web applications like content recommendation, ad targeting, and web search, where items are articles, ads, and web pages, respectively. Because of data sparseness, regularization is key to good predictive accuracy. Our method works by regularizing both user and item factors simultaneously through user features and the bag of words associated with each item. Specifically, each word in an item is associated with a discrete latent factor, often referred to as the topic of the word; item topics are obtained by averaging topics across all words in an item. A user's rating of an item is then modeled as the user's affinity to the item's topics, where user affinity to topics (user factors) and topic assignments to words in items (item factors) are learned jointly in a supervised fashion. To avoid overfitting, user and item factors are regularized through Gaussian linear regression and Latent Dirichlet Allocation (LDA) priors, respectively. We show that our model is accurate, interpretable, and handles both cold-start and warm-start scenarios seamlessly through a single model. The efficacy of our method is illustrated on benchmark datasets and a new dataset from Yahoo! Buzz, where fLDA provides superior predictive accuracy in cold-start scenarios and is comparable to state-of-the-art methods in warm-start scenarios. As a by-product, fLDA also identifies interesting topics that explain user-item interactions. Our method also generalizes a recently proposed technique called supervised LDA (sLDA) to collaborative filtering applications. While sLDA estimates item topic vectors in a supervised fashion for a single regression, fLDA incorporates multiple regressions (one for each user) in estimating the item factors.
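The prediction step described in the abstract can be illustrated with a minimal sketch. It assumes the per-word topic assignments and the user's topic affinities are already known; in fLDA these are learned jointly, which is not shown here. The function names, the toy values, and the handling of an item with no words are all hypothetical, and the full model in the paper may include terms that the abstract does not mention.

```python
from collections import Counter


def item_topic_proportions(word_topics, num_topics):
    """Average the one-hot topic assignments over all words in an item.

    word_topics: list of topic indices, one per word in the item.
    Returns a list of length num_topics holding the fraction of words
    assigned to each topic.
    """
    if not word_topics:
        # Assumption for this sketch: an item with no words gets an
        # all-zero topic vector.
        return [0.0] * num_topics
    counts = Counter(word_topics)
    total = len(word_topics)
    return [counts.get(k, 0) / total for k in range(num_topics)]


def predict_rating(user_affinity, word_topics):
    """Score an item as the user's affinity to the item's averaged topics."""
    z_bar = item_topic_proportions(word_topics, len(user_affinity))
    return sum(s * z for s, z in zip(user_affinity, z_bar))


# Hypothetical toy values: 3 topics, one user, one item with 4 words.
user_affinity = [2.0, -1.0, 0.5]   # user's affinity to each topic
word_topics = [0, 0, 2, 1]         # topic assigned to each word

print(item_topic_proportions(word_topics, 3))      # [0.5, 0.25, 0.25]
print(predict_rating(user_affinity, word_topics))  # 0.875
```

Because each user has a separate affinity vector, scoring the same item for many users amounts to one regression per user on shared item topic proportions, which is the contrast with sLDA that the abstract draws.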

Published In

WSDM '10: Proceedings of the third ACM international conference on Web search and data mining

February 2010

468 pages

ISBN: 9781605588896

DOI: 10.1145/1718487

General Chairs: Brian D. Davison (Lehigh University, USA), Torsten Suel (Polytechnic Institute of NYU, USA)

Program Chairs: Nick Craswell (Microsoft, USA), Bing Liu (University of Illinois, Chicago, USA)

Copyright © 2010 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Conference

WSDM'10

WSDM'10: Third ACM International Conference on Web Search and Data Mining

February 4 - 6, 2010

New York, New York, USA

Acceptance Rates

Overall Acceptance Rate 498 of 2,863 submissions, 17%
