DOI: 10.1145/3448016.3457321
research-article
Open access

ARM-Net: Adaptive Relation Modeling Network for Structured Data

Published: 18 June 2021

Abstract

Relational databases are the de facto standard for storing and querying structured data, and extracting insights from structured data requires advanced analytics. Deep neural networks (DNNs) have achieved super-human prediction performance in particular data types, e.g., images. However, existing DNNs may not produce meaningful results when applied to structured data. The reason is that there are correlations and dependencies across combinations of attribute values in a table, and these do not follow simple additive patterns that can be easily mimicked by a DNN. The number of possible such cross features is combinatorial, making them computationally prohibitive to model. Furthermore, the deployment of learning models in real-world applications has also highlighted the need for interpretability, especially for high-stakes applications, which remains another issue of concern to DNNs. In this paper, we present ARM-Net, an adaptive relation modeling network tailored for structured data, and a lightweight framework ARMOR based on ARM-Net for relational data analytics. The key idea is to model feature interactions with cross features selectively and dynamically, by first transforming the input features into exponential space, and then determining the interaction order and interaction weights adaptively for each cross feature. We propose a novel sparse attention mechanism to dynamically generate the interaction weights given the input tuple, so that we can explicitly model cross features of arbitrary orders with noisy features filtered selectively. Then during model inference, ARM-Net can specify the cross features being used for each prediction for higher accuracy and better interpretability. Our extensive experiments on real-world datasets demonstrate that ARM-Net consistently outperforms existing models and provides more interpretable predictions for data-driven decision making.
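
The key idea described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes scalar, strictly positive feature values rather than learned embeddings, it uses sparsemax (a sparse alternative to softmax) as a stand-in for the paper's sparse attention mechanism, and the function names `sparsemax` and `cross_feature` as well as the attention scores are hypothetical. Computing a cross feature as exp(sum_i w_i * log x_i) is the same as taking the product of x_i ** w_i, so each weight acts as the interaction order of its feature, and a weight of exactly zero removes that feature from the interaction.

```python
import math


def sparsemax(scores):
    """Project scores onto the probability simplex; low scores get exactly 0."""
    z = sorted(scores, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for k, zk in enumerate(z, start=1):
        cumsum += zk
        # The support is a prefix of the sorted scores; keep the threshold
        # computed for the largest prefix that satisfies the condition.
        if 1.0 + k * zk > cumsum:
            tau = (cumsum - 1.0) / k
    return [max(s - tau, 0.0) for s in scores]


def cross_feature(values, weights):
    """Return prod_i values[i] ** weights[i], computed in exponential space.

    Assumes every value is strictly positive so that log is defined.
    """
    return math.exp(sum(w * math.log(v) for v, w in zip(values, weights)))


# Hypothetical attention scores for one input tuple with three features.
scores = [1.0, 0.5, -1.0]
weights = sparsemax(scores)  # [0.75, 0.25, 0.0]: the third feature is filtered out
features = [2.0, 3.0, 5.0]
print(weights, cross_feature(features, weights))
```

Because the weights are produced per input, the non-zero entries indicate which features take part in the cross feature for that particular prediction, which is the sense in which the sketch mirrors the selective and interpretable behaviour the abstract describes.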

Supplementary Material

MP4 File (3448016.3457321.mp4)
Relational databases are the de facto standard for storing and querying structured data, and extracting insights from structured data requires advanced analytics. Deep neural networks (DNNs) have achieved super-human prediction performance in particular data types, e.g., images. However, existing DNNs may not produce meaningful results when applied to structured data. The reason is that there are correlations and dependencies across combinations of attribute values in a table, and these do not follow a simple geometric pattern that can be mimicked by a DNN. The number of possible such "cross features" is combinatorial, making them computationally prohibitive to model. Furthermore, the deployment of learning models in real-world applications has highlighted the need for interpretability, especially for high-stakes applications, which remains a major drawback for many DNNs. In this paper, we present ARM-Net, an adaptive relation modeling network tailored for structured data, and a lightweight framework ARMOR based on ARM-Net for relational data analytics, which is designed to be accurate, efficient and interpretable. The key idea is to model feature interactions with cross features selectively and dynamically, by first transforming the input features into exponential space, and then determining interaction weights and the interaction order adaptively for each cross feature. We propose a novel sparse attention mechanism to dynamically generate the interaction weights given the input tuple, so that we can model cross features of arbitrary orders and selectively filter noisy features. Then during model inference, ARM-Net can identify the most informative cross features in an input-aware manner for more accurate prediction and better interpretability. Our extensive experiments on real-world datasets show that ARM-Net consistently outperforms existing models and provides interpretable predictions for data-driven decision making.

Information & Contributors

Information

Published In

SIGMOD '21: Proceedings of the 2021 International Conference on Management of Data
June 2021
2969 pages
ISBN: 9781450383431
DOI: 10.1145/3448016
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. feature importance
  2. feature interaction
  3. interpretability
  4. multi-head gated attention
  5. neural networks
  6. structured data

Qualifiers

  • Research-article

Conference

SIGMOD/PODS '21

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 403
  • Downloads (Last 6 weeks): 51
Reflects downloads up to 03 Feb 2025

Citations

Cited By

  • (2024) Exploiting negative samples. Proceedings of the 41st International Conference on Machine Learning, 61287-61320. DOI: 10.5555/3692070.3694606. Online publication date: 21-Jul-2024
  • (2024) OEBench: Investigating Open Environment Challenges in Real-World Relational Data Streams. Proceedings of the VLDB Endowment 17:6, 1283-1296. DOI: 10.14778/3648160.3648170. Online publication date: 3-May-2024
  • (2024) Database Native Model Selection: Harnessing Deep Neural Networks in Database Systems. Proceedings of the VLDB Endowment 17:5, 1020-1033. DOI: 10.14778/3641204.3641212. Online publication date: 2-May-2024
  • (2024) METER: A Dynamic Concept Adaptation Framework for Online Anomaly Detection. Proceedings of the VLDB Endowment 17:4, 794-807. DOI: 10.14778/3636218.3636233. Online publication date: 5-Mar-2024
  • (2024) Do LLMs Understand Visual Anomalies? Uncovering LLM's Capabilities in Zero-shot Anomaly Detection. Proceedings of the 32nd ACM International Conference on Multimedia, 48-57. DOI: 10.1145/3664647.3681190. Online publication date: 28-Oct-2024
  • (2024) Rotative Factorization Machines. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2912-2923. DOI: 10.1145/3637528.3671740. Online publication date: 25-Aug-2024
  • (2024) NPA: Improving Large-scale Graph Neural Networks with Non-parametric Attention. Companion of the 2024 International Conference on Management of Data, 414-427. DOI: 10.1145/3626246.3653399. Online publication date: 9-Jun-2024
  • (2024) Deep Neural Networks and Tabular Data: A Survey. IEEE Transactions on Neural Networks and Learning Systems 35:6, 7499-7519. DOI: 10.1109/TNNLS.2022.3229161. Online publication date: Jun-2024
  • (2024) Managing Metaverse Data Tsunami: Actionable Insights. IEEE Transactions on Knowledge and Data Engineering 36:12, 7423-7441. DOI: 10.1109/TKDE.2024.3354960. Online publication date: Dec-2024
  • (2024) Applications and Challenges for Large Language Models: From Data Management Perspective. 2024 IEEE 40th International Conference on Data Engineering (ICDE), 5530-5541. DOI: 10.1109/ICDE60146.2024.00441. Online publication date: 13-May-2024
