Research article. DOI: 10.1145/3219819.3219823

Deep Interest Network for Click-Through Rate Prediction

Published: 19 July 2018

Abstract

    Click-through rate prediction is an essential task in industrial applications such as online advertising. Recently, deep learning based models have been proposed that follow a similar Embedding&MLP paradigm. In these methods, large-scale sparse input features are first mapped into low-dimensional embedding vectors, then transformed into fixed-length vectors in a group-wise manner, and finally concatenated and fed into a multilayer perceptron (MLP) to learn the nonlinear relations among features. In this way, user features are compressed into a fixed-length representation vector regardless of what the candidate ads are. The fixed-length vector becomes a bottleneck, making it difficult for Embedding&MLP methods to capture a user's diverse interests effectively from rich historical behaviors. In this paper, we propose a novel model, Deep Interest Network (DIN), which tackles this challenge by designing a local activation unit to adaptively learn the representation of user interests from historical behaviors with respect to a certain ad. This representation vector varies over different ads, greatly improving the expressive ability of the model. In addition, we develop two techniques, mini-batch aware regularization and a data adaptive activation function, which help in training industrial deep networks with hundreds of millions of parameters. Experiments on two public datasets as well as an Alibaba real production dataset with over 2 billion samples demonstrate the effectiveness of the proposed approaches, which achieve superior performance compared with state-of-the-art methods. DIN has now been successfully deployed in the online display advertising system at Alibaba, serving the main traffic.
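
The contrast the abstract draws, between a fixed-length user vector and one that varies with the candidate ad, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the 2-dimensional embeddings are hypothetical, and a plain dot product stands in for the local activation unit, whose exact form the abstract does not specify.

```python
def sum_pool(behaviors):
    """Candidate-independent pooling: element-wise sum of behavior embeddings."""
    dim = len(behaviors[0])
    return [sum(b[i] for b in behaviors) for i in range(dim)]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def activated_pool(behaviors, candidate):
    """Candidate-dependent pooling: weight each behavior by its relevance to the ad.

    ASSUMPTION: the dot-product score is a stand-in for the paper's local
    activation unit; the weights are left unnormalized in this sketch.
    """
    dim = len(candidate)
    weights = [dot(b, candidate) for b in behaviors]
    pooled = [sum(w * b[i] for w, b in zip(weights, behaviors)) for i in range(dim)]
    return pooled, weights


# Hypothetical 2-d embeddings for three historical behaviors and two candidate ads.
behaviors = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
ad_a = [1.0, 0.0]
ad_b = [0.0, 1.0]

fixed = sum_pool(behaviors)                    # same vector whatever the ad is
user_a, w_a = activated_pool(behaviors, ad_a)  # emphasizes behaviors 0 and 2
user_b, w_b = activated_pool(behaviors, ad_b)  # emphasizes behavior 1

print(fixed, user_a, user_b)
```

With these toy inputs the sum-pooled vector is identical for both ads, while the activated vectors differ, which is the property the abstract attributes to DIN's user representation.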

    Published In

    KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
    July 2018
    2925 pages
    ISBN: 9781450355520
    DOI: 10.1145/3219819
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. click-through rate prediction
    2. display advertising
    3. e-commerce

    Qualifiers

    • Research-article

    Conference

    KDD '18

    Acceptance Rates

    KDD '18 Paper Acceptance Rate 107 of 983 submissions, 11%;
    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

    Cited By

    • (2024) TABST. AI-Driven Marketing Research and Data Analytics, 342-359. DOI: 10.4018/979-8-3693-2165-2.ch019. Online publication date: 19-Apr-2024
    • (2024) It’s Not Always about Wide and Deep Models: Click-Through Rate Prediction with a Customer Behavior-Embedding Representation. Journal of Theoretical and Applied Electronic Commerce Research 19:1, 135-151. DOI: 10.3390/jtaer19010008. Online publication date: 12-Jan-2024
    • (2024) Non-Stationary Transformer Architecture: A Versatile Framework for Recommendation Systems. Electronics 13:11, 2075. DOI: 10.3390/electronics13112075. Online publication date: 27-May-2024
    • (2024) LSH Models in Federated Recommendation. Applied Sciences 14:11, 4423. DOI: 10.3390/app14114423. Online publication date: 23-May-2024
    • (2024) Weight Adjustment Framework for Self-Attention Sequential Recommendation. Applied Sciences 14:9, 3608. DOI: 10.3390/app14093608. Online publication date: 24-Apr-2024
    • (2024) Feature-Interaction-Enhanced Sequential Transformer for Click-Through Rate Prediction. Applied Sciences 14:7, 2760. DOI: 10.3390/app14072760. Online publication date: 26-Mar-2024
    • (2024) A bias study and an unbiased deep neural network for recommender systems. Web Intelligence 22:1, 15-29. DOI: 10.3233/WEB-230036. Online publication date: 26-Mar-2024
    • (2024) MeFiNet: Modeling multi-semantic convolution-based feature interactions for CTR prediction. Intelligent Data Analysis 28:1, 261-278. DOI: 10.3233/IDA-227113. Online publication date: 3-Feb-2024
    • (2024) Modeling Long- and Short-Term Service Recommendations with a Deep Multi-Interest Network for Edge Computing. Tsinghua Science and Technology 29:1, 86-98. DOI: 10.26599/TST.2022.9010054. Online publication date: Feb-2024
    • (2024) Enhancing User Interest based on Stream Clustering and Memory Networks in Large-Scale Recommender Systems. SSRN Electronic Journal. DOI: 10.2139/ssrn.4836975. Online publication date: 2024
