Deep content-based music recommendation

Authors: Aäron van den Oord, Sander Dieleman, Benjamin Schrauwen

NIPS'13: Proceedings of the 26th International Conference on Neural Information Processing Systems - Volume 2

Pages 2643 - 2651

Published: 05 December 2013

Abstract

Automatic music recommendation has become an increasingly relevant problem in recent years, since a lot of music is now sold and consumed digitally. Most recommender systems rely on collaborative filtering. However, this approach suffers from the cold start problem: it fails when no usage data is available, so it is not effective for recommending new and unpopular songs. In this paper, we propose to use a latent factor model for recommendation, and predict the latent factors from music audio when they cannot be obtained from usage data. We compare a traditional approach using a bag-of-words representation of the audio signals with deep convolutional neural networks, and evaluate the predictions quantitatively and qualitatively on the Million Song Dataset. We show that using predicted latent factors produces sensible recommendations, despite the fact that there is a large semantic gap between the characteristics of a song that affect user preference and the corresponding audio signal. We also show that recent advances in deep learning translate very well to the music recommendation setting, with deep convolutional neural networks significantly outperforming the traditional approach.
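The cold-start idea in the abstract can be illustrated with a minimal sketch. It assumes that item latent factors have already been learned from usage data for "warm" songs, and it fits a regressor from audio features to those factors so that factors can be predicted for songs with no usage data. Everything here is an illustrative assumption rather than the authors' setup: the data is synthetic, the helper names (`fit_ridge`, `predict_factors`, `recommend`) are hypothetical, and plain ridge regression stands in for the bag-of-words and convolutional network models that the paper compares.

```python
import numpy as np


def fit_ridge(features, factors, reg=1.0):
    """Closed-form ridge regression from audio features to latent factors.

    NOTE: ridge regression is a stand-in chosen for brevity; it is not
    the model used in the paper.
    """
    n_features = features.shape[1]
    gram = features.T @ features + reg * np.eye(n_features)
    return np.linalg.solve(gram, features.T @ factors)


def predict_factors(features, weights):
    """Predict item latent factors for songs that have no usage data."""
    return features @ weights


def recommend(user_factors, item_factors, top_k=3):
    """Rank items per user by the dot product of user and item factors."""
    scores = user_factors @ item_factors.T
    return np.argsort(-scores, axis=1)[:, :top_k]


# Synthetic stand-ins (assumptions): real inputs would be audio features
# and factors obtained by factorizing a user-song usage matrix.
rng = np.random.default_rng(0)
n_warm, n_cold, n_users, n_feat, n_dim = 200, 5, 4, 16, 8

true_map = rng.normal(size=(n_feat, n_dim))
warm_audio = rng.normal(size=(n_warm, n_feat))
warm_factors = warm_audio @ true_map + 0.01 * rng.normal(size=(n_warm, n_dim))
user_factors = rng.normal(size=(n_users, n_dim))

# Learn the audio-to-factor mapping on songs that have usage data.
weights = fit_ridge(warm_audio, warm_factors, reg=1e-3)

# Predict factors for new songs from audio alone, then rank them per user.
cold_audio = rng.normal(size=(n_cold, n_feat))
cold_factors = predict_factors(cold_audio, weights)
top = recommend(user_factors, cold_factors, top_k=3)
print(top)
```

Because the predicted factors live in the same space as the user factors, new songs can be scored with the same dot product used for songs that do have usage data, which is what lets a latent factor model cover the cold-start case.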

Publisher: Curran Associates Inc., Red Hook, NY, United States
