
Graph Neural Network contextual embedding for Deep Learning on tabular data

Published: 02 July 2024

Abstract

Industries across all sectors are trying to leverage Artificial Intelligence (AI) on their existing big data, which is typically available in so-called tabular form: each record is composed of a number of heterogeneous continuous and categorical columns, also known as features. Deep Learning (DL) has constituted a major breakthrough for AI in fields related to human skills, such as natural language processing, but its applicability to tabular data has been more challenging, and more classical Machine Learning (ML) models, such as tree-based ensembles, usually perform better. This paper presents a novel DL model that uses a Graph Neural Network (GNN), more specifically an Interaction Network (IN), for contextual embedding and for modeling interactions among tabular features. Its results outperform those of a recently published survey with a DL benchmark based on seven public datasets, and are competitive with boosted-tree solutions.
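The core idea of an Interaction Network applied to tabular data can be illustrated with a minimal sketch: each column of a record is embedded as a vector ("node"), every ordered pair of features exchanges a message through a relation function, and each node is updated from its own state plus the aggregated incoming messages, yielding a contextual embedding per feature. The sketch below is illustrative only and assumes random linear maps in place of the learned MLPs of the actual model; all names (`interaction_step`, `rel_w`, `upd_w`, `DIM`) are hypothetical.

```python
# Illustrative sketch of Interaction Network-style message passing over
# tabular features. The real model learns MLPs; here the relation and
# update functions are random linear maps, just to show the data flow.
import random

DIM = 4  # embedding size (hypothetical)

def linear(vec, weights):
    """Apply a weight matrix (list of rows) to a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]

def interaction_step(nodes, rel_w, upd_w):
    """One round of message passing among all feature embeddings."""
    updated = []
    for i, receiver in enumerate(nodes):
        # Aggregate messages from every other feature (sender j -> receiver i).
        agg = [0.0] * DIM
        for j, sender in enumerate(nodes):
            if i == j:
                continue
            msg = linear(sender + receiver, rel_w)  # relation function on the pair
            agg = [a + m for a, m in zip(agg, msg)]
        # Update the receiver from its own state plus the aggregated messages.
        updated.append(linear(receiver + agg, upd_w))
    return updated

random.seed(0)
n_features = 3  # e.g. three embedded columns of one record
nodes = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(n_features)]
rel_w = [[random.gauss(0, 0.1) for _ in range(2 * DIM)] for _ in range(DIM)]
upd_w = [[random.gauss(0, 0.1) for _ in range(2 * DIM)] for _ in range(DIM)]

contextual = interaction_step(nodes, rel_w, upd_w)
print(len(contextual), len(contextual[0]))  # one updated embedding per feature
```

Each output vector is a contextual embedding: it depends on all other features of the record, which is what distinguishes this approach from column-wise embeddings fed directly to an MLP.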


Information

Published In

Neural Networks, Volume 173, Issue C
May 2024
703 pages

Publisher

Elsevier Science Ltd.

United Kingdom


Author Tags

  1. Deep Learning
  2. Graph Neural Network
  3. Interaction Network
  4. Contextual embedding
  5. Tabular data
  6. Artificial Intelligence

Qualifiers

  • Research-article
