Retrieval & Interaction Machine for Tabular Data Prediction

Authors:

Yong Yu

KDD '21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining

Pages 1379 - 1389

https://doi.org/10.1145/3447548.3467216

Published: 14 August 2021

Abstract

Prediction over tabular data is an essential task in many data science applications, such as recommender systems, online advertising, and medical treatment. Tabular data is structured into rows and columns, with each row as a data sample and each column as a feature attribute. Both the columns and the rows of tabular data carry useful patterns that could improve model prediction performance. However, most existing models focus on cross-column patterns and overlook cross-row patterns, because they process each sample independently. In this work, we propose a general learning framework named Retrieval & Interaction Machine (RIM) that fully exploits both cross-row and cross-column patterns in tabular data. Specifically, RIM first leverages search engine techniques to efficiently retrieve useful rows of the table to assist the label prediction of the target row. It then uses feature interaction networks to capture the cross-column patterns among the target row and the retrieved rows and make the final label prediction. We conduct extensive experiments on 11 datasets covering three important tasks: CTR prediction (classification), top-n recommendation (ranking), and rating prediction (regression). Experimental results show that RIM achieves significant improvements over the state-of-the-art and various baselines, demonstrating the superiority and efficacy of RIM.
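The retrieve-then-predict idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify the ranking function, so the IDF-weighted feature overlap used here is an assumption. The helper names (`build_index`, `retrieve`, `aggregate_labels`) and the toy table are hypothetical. The feature interaction network is out of scope, so the mean label of the retrieved rows stands in for the cross-row signal that such a network would consume.

```python
import math
from collections import defaultdict


def build_index(rows):
    """Inverted index: (column, value) -> set of ids of rows containing it."""
    index = defaultdict(set)
    for row_id, row in enumerate(rows):
        for col, val in row.items():
            index[(col, val)].add(row_id)
    return index


def retrieve(target, rows, index, k=2):
    """Return ids of the k rows with the highest IDF-weighted feature overlap.

    Assumption: an IDF-weighted overlap score stands in for the
    "search engine techniques" mentioned in the abstract.
    """
    n = len(rows)
    scores = defaultdict(float)
    for col, val in target.items():
        posting = index.get((col, val), ())
        if not posting:
            continue  # feature value never seen in the table
        idf = math.log(1 + n / len(posting))  # rarer values weigh more
        for row_id in posting:
            scores[row_id] += idf
    # Sort by descending score; break ties by row id for determinism.
    ranked = sorted(scores, key=lambda r: (-scores[r], r))
    return ranked[:k]


def aggregate_labels(retrieved_ids, labels):
    """Mean label of the retrieved rows; 0.5 if nothing was retrieved."""
    if not retrieved_ids:
        return 0.5
    return sum(labels[i] for i in retrieved_ids) / len(retrieved_ids)


# Hypothetical toy table of labelled rows (e.g. click / no click).
rows = [
    {"user": "u1", "item": "i1", "city": "NY"},
    {"user": "u1", "item": "i2", "city": "NY"},
    {"user": "u2", "item": "i1", "city": "LA"},
    {"user": "u3", "item": "i3", "city": "LA"},
]
labels = [1, 0, 1, 0]

index = build_index(rows)
target = {"user": "u1", "item": "i1", "city": "LA"}
neighbours = retrieve(target, rows, index, k=2)
label_feature = aggregate_labels(neighbours, labels)
print(neighbours, label_feature)
```

In a fuller pipeline, `label_feature` and the retrieved rows' features would be passed, together with the target row's own columns, to a cross-column model. The choice of that model is left open here.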

Supplementary Material

MP4 File (retrieval_interaction_machine_for_tabular-jiarui_qin-weinan_zhang-38957790-1zCc.mp4)

Presentation video

Index Terms

Retrieval & Interaction Machine for Tabular Data Prediction
1. Information systems
  1. Information retrieval
    1. Users and interactive retrieval
  2. Information systems applications
    1. Data mining

Information & Contributors

Information

Published In

KDD '21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining

August 2021

4259 pages

ISBN:9781450383325

DOI:10.1145/3447548

General Chairs: Feida Zhu (Singapore Management University), Beng Chin Ooi (National University of Singapore), Chunyan Miao (Nanyang Technological University)

Program Chairs: Haixun Wang, Iryna Skrypnyk, Wynne Hsu, Sanjay Chawla

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

KDD '21

KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

August 14 - 18, 2021

Virtual Event, Singapore

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Bibliometrics & Citations

Article Metrics

Total Citations: 15
Total Downloads: 467
Downloads (Last 12 months): 92
Downloads (Last 6 weeks): 13

Reflects downloads up to 10 Aug 2024

Citations

Cited By

Meng L, Gong X, Chen Y (2024). BAD-FM: Backdoor Attacks Against Factorization-Machine Based Neural Network for Tabular Data Prediction. Chinese Journal of Electronics 33:4 (1077-1092). Online publication date: Jul-2024.
https://doi.org/10.23919/cje.2023.00.041

Cai Q, Zheng K, Jagadish H, Ooi B, Yip J (2024). CohortNet: Empowering Cohort Discovery for Interpretable Healthcare Analytics. Proceedings of the VLDB Endowment 17:10 (2487-2500). Online publication date: 1-Jun-2024.
https://dl.acm.org/doi/10.14778/3675034.3675041

Duan M, Li K, Zhang W, Qin J, Xiao B (2024). Attacking Click-through Rate Predictors via Generating Realistic Fake Samples. ACM Transactions on Knowledge Discovery from Data 18:5 (1-24). Online publication date: 28-Feb-2024.
https://dl.acm.org/doi/10.1145/3643685

Zhong T, Zhang J, Cheng Z, Zhou F, Chen X, Hui Yang G, Wang H, Han S, Hauff C, Zuccon G, Zhang Y (2024). Information Diffusion Prediction via Cascade-Retrieved In-context Learning. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (2472-2476). Online publication date: 10-Jul-2024.
https://dl.acm.org/doi/10.1145/3626772.3657909

Li W, Wang Z, Wang J, Xia S, Zhu J, Chen M, Fan J, Cheng J, Lei J, Hui Yang G, Wang H, Han S, Hauff C, Zuccon G, Zhang Y (2024). ReFer: Retrieval-Enhanced Vertical Federated Recommendation for Full Set User Benefit. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (1763-1773). Online publication date: 10-Jul-2024.
https://dl.acm.org/doi/10.1145/3626772.3657763

Lin J, Shan R, Zhu C, Du K, Chen B, Quan S, Tang R, Yu Y, Zhang W, Chua T, Ngo C, Ka-Wei Lee R, Kumar R, Lauw H (2024). ReLLa: Retrieval-enhanced Large Language Models for Lifelong Sequential Behavior Comprehension in Recommendation. Proceedings of the ACM on Web Conference 2024 (3497-3508). Online publication date: 13-May-2024.
https://dl.acm.org/doi/10.1145/3589334.3645467

Xiao F, Wu Y, Zhang M, Chen G, Ooi B (2023). MINT: Detecting Fraudulent Behaviors from Time-Series Relational Data. Proceedings of the VLDB Endowment 16:12 (3610-3623). Online publication date: 1-Aug-2023.
https://dl.acm.org/doi/10.14778/3611540.3611551

Chen M, Zhao H, Zhao Y, Fan H, Gao H, Yu Y, Tian Z (2023). ROMO: Retrieval-enhanced Offline Model-based Optimization. Proceedings of the Fifth International Conference on Distributed Artificial Intelligence (1-9). Online publication date: 30-Nov-2023.
https://dl.acm.org/doi/10.1145/3627676.3627685

Yu J, Zhenyu M, Lei J, Yin L, Xia W, Yu Y, Long T (2023). SACAT: Student-Adaptive Computerized Adaptive Testing. Proceedings of the Fifth International Conference on Distributed Artificial Intelligence (1-7). Online publication date: 30-Nov-2023.
https://dl.acm.org/doi/10.1145/3627676.3627679

Luo Z, Cai S, Wang Y, Ooi B (2023). Regularized Pairwise Relationship based Analytics for Structured Data. Proceedings of the ACM on Management of Data 1:1 (1-27). Online publication date: 30-May-2023.
https://dl.acm.org/doi/10.1145/3588936
