DOI: 10.1145/3580305.3599303
KDD Conference Proceedings · Research article · Free access

Deep Pipeline Embeddings for AutoML

Published: 04 August 2023

Abstract

Automated Machine Learning (AutoML) is a promising direction for democratizing AI by automatically deploying Machine Learning systems with minimal human expertise. The core technical challenge behind AutoML is optimizing the pipelines of Machine Learning systems (e.g. the choice of preprocessing, augmentations, models, optimizers, etc.). Existing Pipeline Optimization techniques fail to explore deep interactions between pipeline stages/components. As a remedy, this paper proposes a novel neural architecture that captures the deep interaction between the components of a Machine Learning pipeline. We propose embedding pipelines into a latent representation through a novel per-component encoder mechanism. To search for optimal pipelines, such pipeline embeddings are used within deep-kernel Gaussian Process surrogates inside a Bayesian Optimization setup. Furthermore, we meta-learn the parameters of the pipeline embedding network using existing evaluations of pipelines on diverse collections of related datasets (a.k.a. meta-datasets). Through extensive experiments on three large-scale meta-datasets, we demonstrate that pipeline embeddings yield state-of-the-art results in Pipeline Optimization.
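The core idea of the abstract — encode each pipeline component separately, aggregate the encodings into one latent vector, and place a standard kernel over that learned space (a "deep kernel") for a Gaussian Process surrogate — can be sketched as follows. This is a minimal illustration under assumptions, not the authors' implementation: the component names and dimensions are hypothetical, and the per-component encoders are fixed random linear maps standing in for the learned, meta-trained networks described in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-stage pipeline: each component is described by a small
# hyperparameter vector. One encoder per component (random map as a stand-in).
COMPONENT_DIMS = {"preprocessor": 3, "model": 4}
EMBED_DIM = 8
encoders = {name: rng.normal(size=(dim, EMBED_DIM)) / np.sqrt(dim)
            for name, dim in COMPONENT_DIMS.items()}

def embed_pipeline(pipeline):
    """Encode each component separately, then aggregate (here: elementwise sum)."""
    parts = [np.tanh(pipeline[name] @ encoders[name]) for name in COMPONENT_DIMS]
    return np.sum(parts, axis=0)

def deep_rbf_kernel(P1, P2, lengthscale=1.0):
    """RBF kernel evaluated in the learned embedding space (a 'deep kernel')."""
    Z1 = np.stack([embed_pipeline(p) for p in P1])
    Z2 = np.stack([embed_pipeline(p) for p in P2])
    sq_dists = ((Z1[:, None, :] - Z2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * sq_dists / lengthscale**2)

def gp_posterior_mean(train_pipes, y, test_pipes, noise=1e-3):
    """GP posterior mean over observed pipeline evaluations y."""
    K = deep_rbf_kernel(train_pipes, train_pipes) + noise * np.eye(len(y))
    K_star = deep_rbf_kernel(test_pipes, train_pipes)
    return K_star @ np.linalg.solve(K, y)
```

In the paper the encoders are trained (and meta-learned across datasets) jointly with the GP likelihood; this sketch only shows how per-component encoding composes with a deep-kernel surrogate.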

Supplementary Material

MP4 File (rtfp0321-2min-promo.mp4)
How can one effectively search for Machine Learning and Deep Learning Pipelines? Typically, pipelines contain numerous conditional hyperparameters and correlated features. Moreover, they often result in large search spaces. We propose learning an embedding function that enables a more efficient search. This function is implemented using a neural network, which can be meta-learned or designed based on knowledge of the pipeline structure. We demonstrate that this approach outperforms the state-of-the-art method. Additionally, it can be easily adapted to new components added to the pipeline.
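To show how such a surrogate enables a more efficient search: given a posterior mean and standard deviation for each candidate pipeline (however produced, e.g. by a deep-kernel GP), Bayesian Optimization typically picks the next pipeline to evaluate by maximizing an acquisition function such as Expected Improvement. The candidate names and prediction numbers below are purely illustrative.

```python
import math

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_improvement(mu, sigma, best_so_far):
    """EI for maximization: E[max(0, f - best)] under f ~ N(mu, sigma^2)."""
    if sigma <= 0.0:
        return max(0.0, mu - best_so_far)
    z = (mu - best_so_far) / sigma
    return (mu - best_so_far) * normal_cdf(z) + sigma * normal_pdf(z)

# Illustrative surrogate predictions (mean, std) for three candidate pipelines.
candidates = {
    "pipeline_a": (0.80, 0.02),  # confident, barely above the incumbent
    "pipeline_b": (0.78, 0.10),  # uncertain, potentially much better
    "pipeline_c": (0.70, 0.01),  # confidently worse
}
best = 0.79  # best validation score observed so far

scores = {name: expected_improvement(mu, s, best) for name, (mu, s) in candidates.items()}
next_pipeline = max(scores, key=scores.get)  # high uncertainty wins here
```

Note how the uncertain candidate is preferred over the confidently mediocre one: the acquisition function trades off exploitation against exploration, which is why surrogate quality (and hence the pipeline embedding) matters.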



Published In

KDD '23: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
August 2023, 5996 pages
ISBN: 9798400701030
DOI: 10.1145/3580305

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. automl
  2. deep kernel gaussian processes
  3. meta-learning
  4. pipeline optimization


Conference

KDD '23
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%

