research-article

Generalized Deep Mixed Models

Authors:

Chengming Jiang,

Qiang Charles Xiao,

Huiji GaoAuthors Info & Claims

KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

Pages 3869 - 3877

https://doi.org/10.1145/3534678.3539103

Published: 14 August 2022 Publication History

Abstract

We introduce generalized deep mixed model (GDMix), a class of machine learning models for large-scale recommender systems that combines the power of deep neural networks and the efficiency of logistic regression. GDMix leverages state-of-the-art deep neural networks (DNNs) as the global models (fixed effects), and further improves the performance by adding entity-specific personalized models (random effects). For instance, the click response from a particular user m to a job posting j may consist of contributions from a DNN model common to all users and job postings, a model specific to the user m and a model specific to the job j. GDMix models not only possess powerful modeling capabilities but also enjoy high training efficiency especially for web-scale recommender systems. We demonstrate the capabilities by detailing their use in Feed and Ads recommendation at LinkedIn. The source code for the GDMix training framework is available at https://github.com/linkedin/gdmix https://github.com/linkedin/gdmix under the BSD-2-Clause License.

References

[1]

Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dandelion Mané, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Mike Schuster, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2015. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. https://www.tensorflow.org/ Software available from tensorflow.org.

[2]

Badi H Baltagi. 2008. Econometric Analysis of Panel Data .Wiley.

[3]

R. H. Byrd, P. Lu, and J. Nocedal. 1995. A Limited Memory Algorithm for Bound Constrained Optimization. SIAM J. Sci. Statist. Comput., Vol. 16, 5 (1995), 1190--1208.

Digital Library

[4]

Heng-Tze Cheng, Levent Koc, Jeremiah Harmsen, Tal Shaked, Tushar Chandra, Hrishi Aradhye, Glen Anderson, Greg Corrado, Wei Chai, Mustafa Ispir, Rohan Anil, Zakaria Haque, Lichan Hong, Vihan Jain, Xiaobing Liu, and Hemal Shah. 2016. Wide & Deep Learning for Recommender Systems. In Proceedings of the 1st Workshop on Deep Learning for Recommender Systems (Boston, MA, USA) (DLRS 2016). Association for Computing Machinery, New York, NY, USA, 7--10. https://doi.org/10.1145/2988450.2988454

Digital Library

[5]

Criteo. 2014. Kaggle Display Advertising Challenge Dataset. https://labs.criteo.com/2014/02/kaggle-display-advertising-challenge-dataset/

[6]

Jerome H Friedman. 2001. Greedy function approximation: a gradient boosting machine. Annals of statistics (2001), 1189--1232.

[7]

Andrew Gelman. 2005. Analysis of Variance-Why it is more important than ever. In The Annals of Statistics, Vol. 33. Institute of Mathematical Statistics, 1--53.

[8]

Antonio Ginart, Maxim Naumov, Dheevatsa Mudigere, Jiyan Yang, and James Zou. 2019. Mixed Dimension Embeddings with Application to Memory-Efficient Recommendation Systems. CoRR, Vol. abs/1909.11810 (2019). https://arxiv.org/abs/1909.11810

[9]

Grouplens. 1998. MovieLens 100K Dataset. https://grouplens.org/datasets/movielens/100k/

[10]

Huifeng Guo, Ruiming Tang, Yunming Ye, Zhenguo Li, and Xiuqiang He. 2017. DeepFM: A Factorization-Machine Based Neural Network for CTR Prediction. In Proceedings of the 26th International Joint Conference on Artificial Intelligence (Melbourne, Australia) (IJCAI'17). AAAI Press, 1725--1731.

Digital Library

[11]

Weiwei Guo, Xiaowei Liu, Sida Wang, Huiji Gao, and Bo Long. 2020. DeText: A Deep NLP Framework for Intelligent Text Understanding. https://engineering.linkedin.com/blog/2020/open-sourcing-detext

[12]

Yoon Kim. 2014. Convolutional Neural Networks for Sentence Classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, Doha, Qatar, 1746--1751. https://doi.org/10.3115/v1/D14--1181

[13]

Diederik P Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).

[14]

Maxim Naumov, Dheevatsa Mudigere, Hao-Jun Michael Shi, Jianyu Huang, Narayanan Sundaraman, Jongsoo Park, Xiaodong Wang, Udit Gupta, Carole-Jean Wu, Alisson G. Azzolini, Dmytro Dzhulgakov, Andrey Mallevich, Ilia Cherniavskii, Yinghai Lu, Raghuraman Krishnamoorthi, Ansha Yu, Volodymyr Kondratenko, Stephanie Pereira, Xianjie Chen, Wenlin Chen, Vijay Rao, Bill Jia, Liang Xiong, and Misha Smelyanskiy. 2019. Deep Learning Recommendation Model for Personalization and Recommendation Systems. CoRR, Vol. abs/1906.00091 (2019). https://arxiv.org/abs/1906.00091

[15]

Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. 2019. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In Advances in Neural Information Processing Systems 32, H. Wallach, H. Larochelle, A. Beygelzimer, F. dtextquotesingle Alché-Buc, E. Fox, and R. Garnett (Eds.). Curran Associates, Inc., 8024--8035. http://papers.neurips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf

Digital Library

[16]

Rohan Ramanath, Konstantin Salomatin, Jeffrey D. Gee, Kirill Talanine, Onkar Dalal, Gungor Polatkan, Sara Smoot, and Deepak Kumar. 2021. Lambda Learner: Fast Incremental Learning on Data Streams. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery I& Data Mining (Virtual Event, Singapore) (KDD '21). Association for Computing Machinery, New York, NY, USA, 3492--3502. https://doi.org/10.1145/3447548.3467172

Digital Library

[17]

Steffen Rendle. 2010. Factorization Machines. In 2010 IEEE International Conference on Data Mining. 995--1000. https://doi.org/10.1109/ICDM.2010.127

Digital Library

[18]

Alexander Sergeev and Mike Del Balso. 2018. Horovod: fast and easy distributed deep learning in TensorFlow. arXiv preprint arXiv:1802.05799 (2018).

[19]

Hao-Jun Michael Shi, Dheevatsa Mudigere, Maxim Naumov, and Jiyan Yang. 2019. Compositional Embeddings Using Complementary Partitions for Memory-Efficient Recommendation Systems. CoRR, Vol. abs/1909.02107 (2019). https://arxiv.org/abs/1909.02107

[20]

Pauli Virtanen, Ralf Gommers, Travis E. Oliphant, Matt Haberland, Tyler Reddy, David Cournapeau, Evgeni Burovski, Pearu Peterson, Warren Weckesser, Jonathan Bright, Stéfan J. van der Walt, Matthew Brett, Joshua Wilson, K. Jarrod Millman, Nikolay Mayorov, Andrew R. J. Nelson, Eric Jones, Robert Kern, Eric Larson, C J Carey, .Ilhan Polat, Yu Feng, Eric W. Moore, Jake VanderPlas, Denis Laxalde, Josef Perktold, Robert Cimrman, Ian Henriksen, E. A. Quintero, Charles R. Harris, Anne M. Archibald, Antônio H. Ribeiro, Fabian Pedregosa, Paul van Mulbregt, and SciPy 1.0 Contributors. 2020. SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python. Nature Methods, Vol. 17 (2020), 261--272. https://doi.org/10.1038/s41592-019-0686--2

[21]

XianXing Zhang, Yitong Zhou, Yiming Ma, Bee-Chung Chen, Liang Zhang, and Deepak Agarwal. 2016. GLMix: Generalized Linear Mixed Models For Large-Scale Response Prediction. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (San Francisco, California, USA) (KDD '16). Association for Computing Machinery, New York, NY, USA, 363--372. https://doi.org/10.1145/2939672.2939684

Digital Library

[22]

Guorui Zhou, Na Mou, Ying Fan, Qi Pi, Weijie Bian, Chang Zhou, Xiaoqiang Zhu, and Kun Gai. 2019. Deep Interest Evolution Network for Click-through Rate Prediction. In Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence (Honolulu, Hawaii, USA) (AAAI'19/IAAI'19/EAAI'19). AAAI Press, Article 729, 8 pages. https://doi.org/10.1609/aaai.v33i01.33015941

Digital Library

[23]

Guorui Zhou, Xiaoqiang Zhu, Chenru Song, Ying Fan, Han Zhu, Xiao Ma, Yanghui Yan, Junqi Jin, Han Li, and Kun Gai. 2018. Deep Interest Network for Click-Through Rate Prediction. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery I& Data Mining (London, United Kingdom) (KDD '18). Association for Computing Machinery, New York, NY, USA, 1059--1068. https://doi.org/10.1145/3219819.3219823

Digital Library

Cited By

Wörtwein TAllen NSheeber LAuerbach RCohn JMorency L(2023)Neural Mixed Effects for Nonlinear Personalized PredictionsProceedings of the 25th International Conference on Multimodal Interaction10.1145/3577190.3614115(445-454)Online publication date: 9-Oct-2023
https://dl.acm.org/doi/10.1145/3577190.3614115

Index Terms

Generalized Deep Mixed Models
1. Computing methodologies
  1. Machine learning
    1. Machine learning approaches
      1. Neural networks
2. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
      1. Recommender systems

Recommendations

A Deep Architecture for Content-based Recommendations Exploiting Recurrent Neural Networks
UMAP '17: Proceedings of the 25th Conference on User Modeling, Adaptation and Personalization

In this paper we investigate the effectiveness of Recurrent Neural Networks (RNNs) in a top-N content-based recommendation scenario. Specifically, we propose a deep architecture which adopts Long Short Term Memory (LSTM) networks to jointly learn two ...
News Session-Based Recommendations using Deep Neural Networks
DLRS 2018: Proceedings of the 3rd Workshop on Deep Learning for Recommender Systems

News recommender systems are aimed to personalize users experiences and help them to discover relevant articles from a large and dynamic search space. Therefore, news domain is a challenging scenario for recommendations, due to its sparse user profiling, ...
Neural Collaborative Ranking
CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge Management

Recommender systems are aimed at generating a personalized ranked list of items that an end user might be interested in. With the unprecedented success of deep learning in computer vision and speech recognition, recently it has been a hot topic to ...

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences

KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

August 2022

5033 pages

ISBN:9781450393850

DOI:10.1145/3534678

General Chairs:
Aidong Zhang
University of Virginia
,
Huzefa Rangwala
Amazon/George Mason University

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 14 August 2022

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article

Conference

KDD '22

Sponsor:

KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

August 14 - 18, 2022

Washington DC, USA

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Upcoming Conference

KDD '25

Sponsor:
sigkdd
sigkdd

The 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining

August 3 - 7, 2025

Toronto , ON , Canada

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

1
Total Citations
View Citations
328
Total Downloads

Downloads (Last 12 months)51
Downloads (Last 6 weeks)1

Reflects downloads up to 21 Dec 2024

Other Metrics

View Author Metrics

Citations

Cited By

Wörtwein TAllen NSheeber LAuerbach RCohn JMorency L(2023)Neural Mixed Effects for Nonlinear Personalized PredictionsProceedings of the 25th International Conference on Multimodal Interaction10.1145/3577190.3614115(445-454)Online publication date: 9-Oct-2023
https://dl.acm.org/doi/10.1145/3577190.3614115

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Media

Figures

Other

Tables

View Table of Contents