Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
10.1145/3534678.3539103acmconferencesArticle/Chapter ViewAbstractPublication PageskddConference Proceedingsconference-collections
research-article

Generalized Deep Mixed Models

Published: 14 August 2022 Publication History

Abstract

We introduce generalized deep mixed model (GDMix), a class of machine learning models for large-scale recommender systems that combines the power of deep neural networks and the efficiency of logistic regression. GDMix leverages state-of-the-art deep neural networks (DNNs) as the global models (fixed effects), and further improves the performance by adding entity-specific personalized models (random effects). For instance, the click response from a particular user m to a job posting j may consist of contributions from a DNN model common to all users and job postings, a model specific to the user m and a model specific to the job j. GDMix models not only possess powerful modeling capabilities but also enjoy high training efficiency especially for web-scale recommender systems. We demonstrate the capabilities by detailing their use in Feed and Ads recommendation at LinkedIn. The source code for the GDMix training framework is available at https://github.com/linkedin/gdmix https://github.com/linkedin/gdmix under the BSD-2-Clause License.

References

[1]
Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dandelion Mané, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Mike Schuster, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2015. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. https://www.tensorflow.org/ Software available from tensorflow.org.
[2]
Badi H Baltagi. 2008. Econometric Analysis of Panel Data .Wiley.
[3]
R. H. Byrd, P. Lu, and J. Nocedal. 1995. A Limited Memory Algorithm for Bound Constrained Optimization. SIAM J. Sci. Statist. Comput., Vol. 16, 5 (1995), 1190--1208.
[4]
Heng-Tze Cheng, Levent Koc, Jeremiah Harmsen, Tal Shaked, Tushar Chandra, Hrishi Aradhye, Glen Anderson, Greg Corrado, Wei Chai, Mustafa Ispir, Rohan Anil, Zakaria Haque, Lichan Hong, Vihan Jain, Xiaobing Liu, and Hemal Shah. 2016. Wide & Deep Learning for Recommender Systems. In Proceedings of the 1st Workshop on Deep Learning for Recommender Systems (Boston, MA, USA) (DLRS 2016). Association for Computing Machinery, New York, NY, USA, 7--10. https://doi.org/10.1145/2988450.2988454
[5]
Criteo. 2014. Kaggle Display Advertising Challenge Dataset. https://labs.criteo.com/2014/02/kaggle-display-advertising-challenge-dataset/
[6]
Jerome H Friedman. 2001. Greedy function approximation: a gradient boosting machine. Annals of statistics (2001), 1189--1232.
[7]
Andrew Gelman. 2005. Analysis of Variance-Why it is more important than ever. In The Annals of Statistics, Vol. 33. Institute of Mathematical Statistics, 1--53.
[8]
Antonio Ginart, Maxim Naumov, Dheevatsa Mudigere, Jiyan Yang, and James Zou. 2019. Mixed Dimension Embeddings with Application to Memory-Efficient Recommendation Systems. CoRR, Vol. abs/1909.11810 (2019). https://arxiv.org/abs/1909.11810
[9]
Grouplens. 1998. MovieLens 100K Dataset. https://grouplens.org/datasets/movielens/100k/
[10]
Huifeng Guo, Ruiming Tang, Yunming Ye, Zhenguo Li, and Xiuqiang He. 2017. DeepFM: A Factorization-Machine Based Neural Network for CTR Prediction. In Proceedings of the 26th International Joint Conference on Artificial Intelligence (Melbourne, Australia) (IJCAI'17). AAAI Press, 1725--1731.
[11]
Weiwei Guo, Xiaowei Liu, Sida Wang, Huiji Gao, and Bo Long. 2020. DeText: A Deep NLP Framework for Intelligent Text Understanding. https://engineering.linkedin.com/blog/2020/open-sourcing-detext
[12]
Yoon Kim. 2014. Convolutional Neural Networks for Sentence Classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, Doha, Qatar, 1746--1751. https://doi.org/10.3115/v1/D14--1181
[13]
Diederik P Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).
[14]
Maxim Naumov, Dheevatsa Mudigere, Hao-Jun Michael Shi, Jianyu Huang, Narayanan Sundaraman, Jongsoo Park, Xiaodong Wang, Udit Gupta, Carole-Jean Wu, Alisson G. Azzolini, Dmytro Dzhulgakov, Andrey Mallevich, Ilia Cherniavskii, Yinghai Lu, Raghuraman Krishnamoorthi, Ansha Yu, Volodymyr Kondratenko, Stephanie Pereira, Xianjie Chen, Wenlin Chen, Vijay Rao, Bill Jia, Liang Xiong, and Misha Smelyanskiy. 2019. Deep Learning Recommendation Model for Personalization and Recommendation Systems. CoRR, Vol. abs/1906.00091 (2019). https://arxiv.org/abs/1906.00091
[15]
Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. 2019. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In Advances in Neural Information Processing Systems 32, H. Wallach, H. Larochelle, A. Beygelzimer, F. dtextquotesingle Alché-Buc, E. Fox, and R. Garnett (Eds.). Curran Associates, Inc., 8024--8035. http://papers.neurips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf
[16]
Rohan Ramanath, Konstantin Salomatin, Jeffrey D. Gee, Kirill Talanine, Onkar Dalal, Gungor Polatkan, Sara Smoot, and Deepak Kumar. 2021. Lambda Learner: Fast Incremental Learning on Data Streams. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery I& Data Mining (Virtual Event, Singapore) (KDD '21). Association for Computing Machinery, New York, NY, USA, 3492--3502. https://doi.org/10.1145/3447548.3467172
[17]
Steffen Rendle. 2010. Factorization Machines. In 2010 IEEE International Conference on Data Mining. 995--1000. https://doi.org/10.1109/ICDM.2010.127
[18]
Alexander Sergeev and Mike Del Balso. 2018. Horovod: fast and easy distributed deep learning in TensorFlow. arXiv preprint arXiv:1802.05799 (2018).
[19]
Hao-Jun Michael Shi, Dheevatsa Mudigere, Maxim Naumov, and Jiyan Yang. 2019. Compositional Embeddings Using Complementary Partitions for Memory-Efficient Recommendation Systems. CoRR, Vol. abs/1909.02107 (2019). https://arxiv.org/abs/1909.02107
[20]
Pauli Virtanen, Ralf Gommers, Travis E. Oliphant, Matt Haberland, Tyler Reddy, David Cournapeau, Evgeni Burovski, Pearu Peterson, Warren Weckesser, Jonathan Bright, Stéfan J. van der Walt, Matthew Brett, Joshua Wilson, K. Jarrod Millman, Nikolay Mayorov, Andrew R. J. Nelson, Eric Jones, Robert Kern, Eric Larson, C J Carey, .Ilhan Polat, Yu Feng, Eric W. Moore, Jake VanderPlas, Denis Laxalde, Josef Perktold, Robert Cimrman, Ian Henriksen, E. A. Quintero, Charles R. Harris, Anne M. Archibald, Antônio H. Ribeiro, Fabian Pedregosa, Paul van Mulbregt, and SciPy 1.0 Contributors. 2020. SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python. Nature Methods, Vol. 17 (2020), 261--272. https://doi.org/10.1038/s41592-019-0686--2
[21]
XianXing Zhang, Yitong Zhou, Yiming Ma, Bee-Chung Chen, Liang Zhang, and Deepak Agarwal. 2016. GLMix: Generalized Linear Mixed Models For Large-Scale Response Prediction. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (San Francisco, California, USA) (KDD '16). Association for Computing Machinery, New York, NY, USA, 363--372. https://doi.org/10.1145/2939672.2939684
[22]
Guorui Zhou, Na Mou, Ying Fan, Qi Pi, Weijie Bian, Chang Zhou, Xiaoqiang Zhu, and Kun Gai. 2019. Deep Interest Evolution Network for Click-through Rate Prediction. In Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence (Honolulu, Hawaii, USA) (AAAI'19/IAAI'19/EAAI'19). AAAI Press, Article 729, 8 pages. https://doi.org/10.1609/aaai.v33i01.33015941
[23]
Guorui Zhou, Xiaoqiang Zhu, Chenru Song, Ying Fan, Han Zhu, Xiao Ma, Yanghui Yan, Junqi Jin, Han Li, and Kun Gai. 2018. Deep Interest Network for Click-Through Rate Prediction. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery I& Data Mining (London, United Kingdom) (KDD '18). Association for Computing Machinery, New York, NY, USA, 1059--1068. https://doi.org/10.1145/3219819.3219823

Cited By

View all
  • (2024)Enabling mixed effects neural networks for diverse, clustered data using Monte Carlo MethodsProceedings of the Thirty-Third International Joint Conference on Artificial Intelligence10.24963/ijcai.2024/555(5018-5024)Online publication date: 3-Aug-2024
  • (2024)Design and Experimentation of a Distributed Information Retrieval-Hybrid Architecture in Cloud IoT Data CentersInternet of Things. 7th IFIPIoT 2024 International IFIP WG 5.5 Workshops10.1007/978-3-031-82065-6_2(12-21)Online publication date: 29-Dec-2024
  • (2023)Neural Mixed Effects for Nonlinear Personalized PredictionsProceedings of the 25th International Conference on Multimodal Interaction10.1145/3577190.3614115(445-454)Online publication date: 9-Oct-2023

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
August 2022
5033 pages
ISBN:9781450393850
DOI:10.1145/3534678
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 14 August 2022

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. generalized deep mixed model
  2. neural networks
  3. recommender systems
  4. response prediction

Qualifiers

  • Research-article

Conference

KDD '22
Sponsor:

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Upcoming Conference

KDD '25

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)47
  • Downloads (Last 6 weeks)0
Reflects downloads up to 09 Jan 2025

Other Metrics

Citations

Cited By

View all
  • (2024)Enabling mixed effects neural networks for diverse, clustered data using Monte Carlo MethodsProceedings of the Thirty-Third International Joint Conference on Artificial Intelligence10.24963/ijcai.2024/555(5018-5024)Online publication date: 3-Aug-2024
  • (2024)Design and Experimentation of a Distributed Information Retrieval-Hybrid Architecture in Cloud IoT Data CentersInternet of Things. 7th IFIPIoT 2024 International IFIP WG 5.5 Workshops10.1007/978-3-031-82065-6_2(12-21)Online publication date: 29-Dec-2024
  • (2023)Neural Mixed Effects for Nonlinear Personalized PredictionsProceedings of the 25th International Conference on Multimodal Interaction10.1145/3577190.3614115(445-454)Online publication date: 9-Oct-2023

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media