Abstract
Click-through rate (CTR) prediction plays an important role in many industrial applications, and feature engineering directly influences its performance because the input features are typically multi-field. However, existing CTR prediction techniques either neglect the importance of individual features or treat all feature interactions equally during feature learning. In addition, an inner product or a Hadamard product is too simple to model feature interactions effectively. These limitations lead to suboptimal performance in existing models. In this paper, we propose a framework called the Hierarchical Attention and Feature Projection neural network (HAFP) for CTR prediction, which automatically learns more representative and efficient feature representations in an end-to-end manner. To this end, we employ a feature learning layer with a hierarchical attention mechanism to jointly extract more generalized and dominant features and feature interactions. In addition, a projective bilinear function is designed in the meaningful second-order interaction encoder to learn more fine-grained and comprehensive second-order feature interactions. By taking advantage of the hierarchical attention mechanism and the projective bilinear function, the proposed model not only learns features flexibly but also makes its predictions interpretable. Experimental results on two real-world datasets demonstrate that HAFP outperforms state-of-the-art CTR prediction baselines in terms of Logloss and AUC. Further analysis verifies the importance of the proposed hierarchical attention mechanism and the projective bilinear function for modelling the feature representation, demonstrating the rationality and effectiveness of HAFP.
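To make the contrast between interaction operators concrete, the following minimal sketch compares the inner product, the Hadamard product, and a generic bilinear interaction between two field embeddings. It is illustrative only: the abstract does not specify HAFP's projective bilinear function, so `bilinear_interaction` is a standard bilinear form with a learnable matrix, and the names `v_user`, `v_ad`, and `w_demo` are hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def inner_product(v_i: Vector, v_j: Vector) -> float:
    """Scalar interaction: sum of element-wise products, no learned parameters."""
    return sum(a * b for a, b in zip(v_i, v_j))


def hadamard_product(v_i: Vector, v_j: Vector) -> Vector:
    """Vector interaction: element-wise product, no learned parameters."""
    return [a * b for a, b in zip(v_i, v_j)]


def bilinear_interaction(v_i: Vector, w: Matrix, v_j: Vector) -> Vector:
    """Generic bilinear interaction (v_i W) * v_j, where W would be learned.

    Assumption: this is a textbook bilinear form, not HAFP's projective
    bilinear function, which the abstract does not define.
    """
    k = len(v_j)
    # Project v_i through W (row vector times matrix).
    projected = [sum(v_i[r] * w[r][c] for r in range(len(v_i))) for c in range(k)]
    # Combine the projected vector with v_j element-wise.
    return [p * b for p, b in zip(projected, v_j)]


# Hypothetical 2-dimensional embeddings for two fields.
v_user = [1.0, 2.0]
v_ad = [3.0, 4.0]
w_demo = [[0.0, 1.0], [1.0, 0.0]]  # hypothetical W that swaps the two dimensions
w_identity = [[1.0, 0.0], [0.0, 1.0]]

dot = inner_product(v_user, v_ad)
had = hadamard_product(v_user, v_ad)
bil = bilinear_interaction(v_user, w_demo, v_ad)
bil_identity = bilinear_interaction(v_user, w_identity, v_ad)
```

With the identity matrix, the bilinear form reduces to the Hadamard product, which shows why a learnable `W` gives strictly more modelling freedom than either parameter-free operator.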
Cite this article
Zhang, J., Zhong, C., Fan, S. et al. Hierarchical attention and feature projection for click-through rate prediction. Appl Intell 52, 8651–8663 (2022). https://doi.org/10.1007/s10489-021-02931-0
DOI: https://doi.org/10.1007/s10489-021-02931-0