CLBO: Conditional Local Bayesian Optimization for Automated Machine Learning

Published: 27 December 2024

Abstract

In this paper, we present a novel Bayesian optimization method named Conditional Local Bayesian Optimization (CLBO) designed specifically to address challenges in optimizing Automated Machine Learning (AutoML) tasks. Inspired by a controller-responder architecture, our method leverages a controller that selects promising pipelines based on an acquisition function. Local responders employ Bayesian optimization to refine the search within sub-spaces. CLBO introduces a progressive budget system, which dynamically allocates the optimization budget. Our method utilizes a group of surrogate models to initially optimize a simpler objective function. Notably, CLBO outperforms current state-of-the-art algorithms on 35 classification datasets from the OpenML repository. Finally, CLBO reports unbiased performance estimates through the use of a performance correction method.

1 Introduction

Machine learning (ML) has revolutionized diverse fields like natural language processing [7], computer vision [36], robotics [24], material design [22], and fraud detection [35]. However, unlocking this potential hinges on optimal hyperparameter tuning for ML algorithms [38]. This crucial yet time-consuming task is addressed by hyperparameter optimization (HPO) techniques [3, 28]. Among HPO algorithms, Bayesian Optimization (BO) stands out for its efficiency [23, 37]. It excels at converging on near optimal hyperparameters with fewer iterations compared to alternatives like random search and evolutionary algorithms [3, 34, 39].
AutoML, a subfield of machine learning (ML), encompasses a range of objectives, primarily focusing on maximizing predictive performance on datasets [14]. AutoML tools achieve this by constructing pipelines that include various processing steps such as pre-processing, imputation, feature selection, and modeling. Each stage utilizes specific algorithms, and AutoML optimizes the hyperparameters for each algorithm within the pipeline. A configuration refers to the combination of algorithms selected for each stage and their corresponding hyperparameter values [32]. A key objective of AutoML is to identify the optimal configuration for a given problem based on predictive performance. This task requires extensive optimization across a complex search space with many factors influencing the outcome, and is referred to as the combined algorithm selection and hyperparameter optimization (CASH) problem [30]. Additionally, fitting a configuration for medium to large datasets can be time-consuming, ranging from minutes to hours. To address this challenge, AutoML tools incorporate optimization algorithms to achieve high performance with fewer function evaluations, thus reducing analysis time [10, 30]. Another critical aspect of AutoML tools is providing unbiased performance estimates. Various methods exist to address this challenge [29, 31, 33]. However, the majority of AutoML tools rely on simple cross-validation scores, which are prone to estimation bias [32].
Bayesian Optimization (BO) is a state-of-the-art optimization method that utilizes a probabilistic model for predictions and uncertainty estimates [23]. BO starts with an initial sampling phase where a few random configurations are evaluated to gather initial data. These configurations are assessed by running the objective function, which can involve tasks like fitting a machine learning model on a dataset and returning a performance metric (e.g., cross-validated AUC score). The probabilistic model aims to predict the average outcome (mean) of the objective function and quantify the associated uncertainty (standard deviation). During each iteration, the probabilistic model predicts the mean value and uncertainty for configurations that have not been evaluated yet. The next configuration to be evaluated is chosen by an acquisition function, with Expected Improvement (EI) being the most common choice [16, 23]. This function balances exploration vs. exploitation by selecting configurations with potentially high scores and/or high uncertainty. By utilizing the acquisition function, BO avoids getting stuck in local optima and converges to near-optimal solutions with sufficient iterations. Finally, to identify the configuration that maximizes the acquisition function in each step, we can either use random sampling or an evolutionary algorithm to generate a set of candidate configurations. Once a configuration is evaluated, the probabilistic model is retrained with the new data point.
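To make this loop concrete, the following is a minimal, self-contained sketch of one such Bayesian optimization cycle with a Random Forest surrogate and Expected Improvement. The toy objective, the two-dimensional unit-box search space, and the iteration budget are illustrative placeholders, not the paper's actual AutoML setting.

```python
import numpy as np
from scipy.stats import norm
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

def objective(x):
    # Stand-in for an expensive evaluation such as a cross-validated AUC.
    return -(x[0] - 0.3) ** 2 - (x[1] - 0.7) ** 2

# Initial random sampling phase.
X = rng.uniform(0, 1, size=(8, 2))
y = np.array([objective(x) for x in X])

for _ in range(20):
    surrogate = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
    candidates = rng.uniform(0, 1, size=(900, 2))
    # Per-tree predictions provide a mean and an uncertainty estimate.
    per_tree = np.stack([t.predict(candidates) for t in surrogate.estimators_])
    mu, sigma = per_tree.mean(axis=0), per_tree.std(axis=0) + 1e-9
    # Expected Improvement over the current best observation (maximization).
    z = (mu - y.max()) / sigma
    ei = (mu - y.max()) * norm.cdf(z) + sigma * norm.pdf(z)
    x_next = candidates[np.argmax(ei)]
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next))

print("best observed score:", y.max())
```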
This paper introduces Conditional Local Bayesian Optimization (CLBO), a novel Bayesian optimization procedure designed to address the challenges posed by conditional search spaces in AutoML. CLBO tackles this challenge by splitting the complex search space into smaller, more manageable sub-spaces. The search space is naturally partitioned through the conditional variables. In this paper, the conditional variable is the selection of the ML algorithm. Each sub-space is then optimized by localized, semi-independent Bayesian optimization algorithms we refer to as responders. A central controller identifies the most promising configuration by selecting the one with the highest acquisition value among all configurations proposed by the responders. CLBO constructs multiple Random Forest (RF) models for each responder. Each RF model is trained to predict the performance of individual folds. CLBO adopts a progressive approach, similar to cross-validation. In early iterations, the algorithm optimizes a simpler objective function using a subset of folds. As the optimization progresses, more budget is allocated and a higher number of folds is utilized. Finally, we include a method to correct bias in performance estimates within Bayesian optimization. CLBO uses the Bootstrap Bias Correction algorithm for Cross-Validation (BBC-CV) to report unbiased performance estimates [33]. BBC-CV eliminates the need for constructing additional models, unlike Nested Cross-Validation (NCV).
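As a rough illustration of the BBC-CV idea [33], and not the paper's exact implementation, the sketch below pools the out-of-fold predictions of all evaluated configurations, repeatedly bootstraps the samples, selects the winning configuration on each bootstrap sample, and scores it on the left-out samples. The function name, the AUC metric, and the matrix layout are assumptions made for the example.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def bbc_cv(oos_preds, y, n_boot=500, seed=0):
    """oos_preds: (n_configs, n_samples) pooled out-of-fold predictions.
    y: true binary labels. Returns a bias-corrected performance estimate."""
    rng = np.random.default_rng(seed)
    n, scores = len(y), []
    for _ in range(n_boot):
        in_bag = rng.integers(0, n, size=n)            # bootstrap sample indices
        out_bag = np.setdiff1d(np.arange(n), in_bag)   # samples not drawn
        if len(np.unique(y[in_bag])) < 2 or len(np.unique(y[out_bag])) < 2:
            continue                                   # skip degenerate draws
        # Select the best configuration on the bootstrap sample ...
        best = np.argmax([roc_auc_score(y[in_bag], p[in_bag]) for p in oos_preds])
        # ... and score it on the left-out samples.
        scores.append(roc_auc_score(y[out_bag], oos_preds[best, out_bag]))
    return float(np.mean(scores))
```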
We validated CLBO’s overall effectiveness through a large-scale evaluation. We used 35 classification datasets from the OpenML CC-18 benchmark [6] to compare CLBO against several leading optimization frameworks designed for conditional hyperparameter spaces: SMAC [20], Hyperopt [4], Optuna [1], and Mango [26]. Additionally, we included Random Search [3] as a baseline. Our experiments revealed that CLBO achieved the best average performance across all datasets, secured the highest number of wins (16 out of 35), and performed significantly better than both Mango and Random Search. Finally, we compared the performance estimates obtained from CLBO against the held-out performance and the reported cross-validation (CV) performance. The results demonstrated that CLBO delivers estimates that are very close to unbiased, accurately reflecting the true predictive performance.

2 Related Work

While this paper doesn’t delve into a comprehensive review of the broader black-box optimization literature, we acknowledge the existence of well-established surveys and tutorials on the subject for those interested in a more extensive overview [5, 11, 27, 37]. Our specific focus lies on applying black-box optimization within the context of AutoML, which introduces unique complexities compared to the general black-box optimization domain. The AutoML search space presents a significant challenge for most existing optimization algorithms. This space is characterized by several complexities: high dimensionality, the presence of conditional hyper-parameters (where a parameter’s value depends on another parameter), and the need to tune both continuous and discrete variables. Such complexity renders many prominent methods incompatible with AutoML optimization [28].
While a significant portion of black-box optimization research focuses on single-objective, non-conditional search spaces, the area of conditional hyper-parameter space optimization remains relatively unexplored. Early work proposed using independent Bayesian optimization procedures based on Gaussian Processes (GPs) to address conditional spaces [3]. Subsequent advancements explored alternative approaches to handle conditionality. For instance, SMAC introduced Random Forests as surrogate models [20]. Similarly, tree-structured Parzen estimators have been adopted as surrogate models in both Hyperopt and Optuna [1, 4]. Building upon the work of [3], Mango leverages the acquisition function to establish dependencies between independent BO instances [26]. Large-scale benchmarks designed for evaluating BO methods in AutoML remain scarce. However, in the context of AutoML tasks, Random Forests have been shown to outperform TPE (Tree-Parzen Estimator) [30].
Most AutoML tools rely heavily on Bayesian Optimization algorithms for pipeline optimization. An overview of the optimization methods across various open-source AutoML tools follows. Auto-sklearn and Auto-WEKA 2.0 use SMAC for optimization [10, 17]. TPOT utilizes a genetic algorithm [25]. AutoPrognosis 2.0 utilizes the Optuna framework, with Hyperband as an alternative [1, 15, 19]. H2O currently employs random search [18]. Auto-PyTorch relies on BOHB, which combines Bayesian optimization with Hyperband [9, 40]. AutoGluon relies on ensemble learning without performing hyperparameter optimization [8].

3 Our Method

The central piece of CLBO is the controller-responder architecture. By partitioning the search space into semi-independent sub-spaces, we are able to create multiple local Bayesian optimization procedures that "communicate" via the controller through the acquisition function. This approach was first proposed within the Mango framework [26].
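A minimal sketch of how such a controller step could look is given below. It assumes hypothetical responder objects exposing propose() and update() methods that run the local BO over one sub-space and refit its surrogate, and it assumes that only the winning responder's surrogate is updated, since the evaluated configuration belongs to its sub-space. This illustrates the idea rather than CLBO's actual code.

```python
def controller_step(responders, objective):
    """One controller-responder iteration over hypothetical responder objects."""
    best_acq, best_config, winner = -float("inf"), None, None
    for responder in responders:
        config, acq = responder.propose()      # local BO over one sub-space
        if acq > best_acq:
            best_acq, best_config, winner = acq, config, responder
    score = objective(best_config)             # e.g. (partial) CV score of the pipeline
    winner.update(best_config, score)          # refit only the winning responder's surrogate
    return best_config, score
```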
Surrogate model: The central part of every BO algorithm is the probabilistic model, also known as the surrogate model, which learns the characteristics of the objective function. Gaussian Processes (GPs), Random Forests (RFs), and Tree Parzen Estimators (TPEs) are all prominent examples of surrogate models. We selected RFs as our surrogate model due to their sample efficiency compared to GPs. One of the novelties of our work is the utilization of a group of RFs as surrogate models. Unlike traditional approaches that employ a single surrogate model, CLBO constructs an RF specifically for predicting the performance of each fold, rather than the fold average. This enables CLBO to generate more fine-grained estimates of the predicted score and the uncertainty for each configuration.
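The sketch below shows one way such a per-fold surrogate group could be organized, with one RandomForestRegressor per fold and a candidate's score and uncertainty taken as the mean and standard deviation of the per-fold predictions. The class and its interface are hypothetical, introduced only for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

class FoldSurrogates:
    """Hypothetical per-fold surrogate group: one RF per CV fold."""

    def __init__(self, n_folds, **rf_kwargs):
        self.models = [RandomForestRegressor(**rf_kwargs) for _ in range(n_folds)]

    def fit(self, X, fold_scores):
        # X: (n_configs, n_hyperparams); fold_scores: (n_configs, n_folds).
        for j, model in enumerate(self.models):
            model.fit(X, fold_scores[:, j])
        return self

    def predict(self, X_cand):
        # Predicted score and uncertainty derived from the spread across folds.
        preds = np.stack([m.predict(X_cand) for m in self.models])  # (n_folds, n_cand)
        return preds.mean(axis=0), preds.std(axis=0)
```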
Configuration Sampler: The Bayesian optimization algorithm relies on an initial data acquisition procedure to train the surrogate model for the first time. Common methods for generating these initial random configurations include Sobol sequences, Latin hypercube sampling, and uniform random sampling [2, 21]. Our work leverages Sobol sequences due to their ability to explore the search space more evenly than uniform random search [3]. In each iteration, Sobol sequences are used to generate a set of 900 candidate configurations over which the Expected Improvement (EI) acquisition function is maximized. To further refine the search, a local search method is implemented. This local search utilizes a Gaussian sampler to generate 100 additional configurations around each responder’s current best evaluated configuration.
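A hedged sketch of this candidate generation step follows, assuming hyperparameters encoded in the unit hypercube and an arbitrary Gaussian width; the scipy Sobol sampler stands in for whichever implementation the paper uses.

```python
import numpy as np
from scipy.stats import qmc

def generate_candidates(dim, best_config, rng, sigma=0.1):
    # 900 space-filling Sobol points over the unit hypercube ...
    sobol = qmc.Sobol(d=dim, scramble=True, seed=int(rng.integers(2**31)))
    global_cands = sobol.random(900)
    # ... plus 100 Gaussian perturbations around the current best configuration.
    local_cands = rng.normal(best_config, sigma, size=(100, dim))
    local_cands = np.clip(local_cands, 0.0, 1.0)
    return np.vstack([global_cands, local_cands])

rng = np.random.default_rng(0)
candidates = generate_candidates(dim=5, best_config=np.full(5, 0.5), rng=rng)
```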
Optimization Budget allocation: We propose a novel progressive budget allocation algorithm that dynamically allocates more resources (iterations) to later folds. This is a cross-validation-flavored procedure, in which models are initially trained on a limited subset of folds in the early iterations. Estimates from more folds are included as the optimization progresses. At the start of each fold, the configurations selected on the previous folds are run on the new fold. We introduce a linear weighting system to determine the budget allocation for each fold. Let k denote the total number of folds and n the total number of iterations allocated (excluding the initial random configurations). To begin, a normalization value is calculated as norm = k*(k + 1)/2. For each fold i, a weight C[i] is then computed as C[i] = (n*i)/norm. This weighting system ensures a gradual increase in the number of iterations allocated to each subsequent fold.
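The weighting scheme itself is a one-liner; a small sketch with an illustrative k and n follows (rounding may require a minor adjustment in practice so the budgets sum exactly to n).

```python
def fold_budgets(k, n):
    # norm = k*(k+1)/2; fold i (1-indexed) receives roughly n*i/norm iterations.
    norm = k * (k + 1) / 2
    return [round(n * i / norm) for i in range(1, k + 1)]

# With k=5 folds and n=60 iterations: [4, 8, 12, 16, 20]; later folds get more budget.
print(fold_budgets(k=5, n=60))
```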

4 Experimental Setup

In this section, we introduce the search space, the datasets selected, the objective to optimize, and the optimization methods.

4.1 Search Space

This section describes the approach for jointly optimizing the hyperparameters of a variety of classification algorithms. The selected classifiers encompass a range of complexities, including basic models, ensemble methods, and boosting algorithms. We also tune a variety of hyper-parameters with a wide range of values. Table 1 gives an overview of the classifiers, the hyper-parameters, and the hyper-parameter ranges that the optimization algorithms tune.
Table 1: The search space employed for HPO.

4.2 Datasets

We employed the curated OpenML CC-18 benchmark [6], which provides a comprehensive suite of multi-class and binary classification problems. A crucial aspect of OpenML CC-18’s dataset selection involved prioritizing datasets where simple models (e.g., Decision Trees) typically achieve lower performance. This characteristic ensures that CASH optimization can improve performance over these baselines. For the evaluation, we randomly selected 35 datasets. Sample sizes range from 540 to 6430, the number of features from 5 to 857, and the datasets include binary and multi-class labels with up to 11 classes. The number of continuous features ranges from 0 to 856, while the number of categorical features ranges from 1 to 61.

4.3 Evaluation Protocol - Optimization Metric

The data are partitioned into 80% training and 20% testing sets (for estimating performance bias). We repeat this process 5 times to produce disjoint test sets, aligning with the principles of Nested Cross-Validation (NCV). The training set is used by the optimization algorithms to maximize the 5-fold CV (inner-fold) performance, with the Area Under the ROC Curve (AUC) as the optimized metric. Our proposed method deviates slightly from this approach. While the other algorithms directly optimize the inner-fold CV performance throughout the optimization process, our method incorporates a dynamic optimization procedure that delays this focus until the final iterations (see Section 3).
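A sketch of this protocol is shown below, assuming binary classification and a hypothetical optimize callable that stands in for any of the compared optimizers and returns a fitted pipeline together with its inner-CV AUC. An outer stratified 5-fold split is used here to produce the five disjoint 20% test sets.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

def evaluate_protocol(X, y, optimize):
    """optimize(X_train, y_train, inner_cv) -> (fitted_pipeline, inner_cv_auc)."""
    outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)  # five disjoint 20% test sets
    results = []
    for train_idx, test_idx in outer.split(X, y):
        inner_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)
        model, cv_auc = optimize(X[train_idx], y[train_idx], inner_cv)
        holdout_auc = roc_auc_score(y[test_idx],
                                    model.predict_proba(X[test_idx])[:, 1])
        results.append({"cv_auc": cv_auc, "holdout_auc": holdout_auc})
    return results
```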

4.4 Optimization algorithms

Due to space constraints, we refrain from providing a comprehensive explanation of the optimization algorithms employed in this work. However, we present an overview of the five algorithms included in our test-bed.
Random Search: Random search is the simplest hyperparameter optimization method. It generates a set of configurations by uniformly sampling values from the defined hyperparameter ranges. It is used as the baseline method.
SMAC: The first Bayesian optimization algorithm to utilize Random Forests as surrogate models. We select the "Hyperparameter Optimization" version from the SMAC3 package [20]. SMAC uses Sobol sequences to sample the initial random configurations and to propose the next most promising configurations, coupled with a local search around the current best configuration [13].
Mango: Instead of using a tree-based model to handle conditionality, Mango adopts a local BO for each sub-space. To sample the next configuration, it selects the highest acquisition value across all local BO instances. Contrary to our approach, Mango relies on sparse Gaussian processes as the surrogate model [26].
HyperOpt: By using Tree-Parzen Estimators (TPE) for both surrogate modeling and configuration sampling, HyperOpt is able to handle the conditional search space. We include the default TPE implementation provided by the HyperOpt package [4].
Optuna: Utilizes TPE for both surrogate modeling and configuration sampling, same as HyperOpt. We select the default implementation from the Optuna package [1].

5 Comparative Evaluation

This section presents the results of the comparative evaluation of CLBO against currently available optimization packages. We utilized the Autorank tool [12], which performs non-parametric statistical tests (the Friedman test), for the ranking analysis.
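For reference, a minimal sketch of how such an Autorank analysis is typically invoked is shown below; the scores are made-up placeholders, with one row per dataset and one column per optimizer.

```python
import pandas as pd
from autorank import autorank, plot_stats

# One row per dataset, one column per optimizer; the values are illustrative only.
scores = pd.DataFrame({
    "CLBO":          [0.94, 0.91, 0.88, 0.97, 0.93, 0.90],
    "SMAC":          [0.93, 0.90, 0.87, 0.96, 0.93, 0.89],
    "HyperOpt":      [0.93, 0.90, 0.86, 0.96, 0.92, 0.89],
    "Random Search": [0.92, 0.89, 0.85, 0.95, 0.91, 0.88],
}, index=[f"dataset_{i}" for i in range(6)])

result = autorank(scores, alpha=0.05, verbose=False)  # Friedman test plus post-hoc analysis
plot_stats(result)                                    # ranking plot of the compared methods
```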
Figure 1 illustrates the average performance (AUC) of each optimization method across all 35 datasets over the course of the optimization iterations. CLBO emerges as the clear winner, achieving the highest final score of 0.936 AUC. Other methods, like SMAC, HyperOpt, Optuna, and Mango, follow closely with an average AUC of around 0.934. Random Search lags behind with an average AUC of 0.932. CLBO’s strong performance is further supported by Autorank’s statistical tests, which confirm that CLBO is significantly better than both Mango and Random Search. Additionally, CLBO leads in terms of winning datasets, achieving the best performance in 45% (16 out of 35) of the experiments. CLBO’s consistently strong performance, as illustrated across the optimization process and the variety of datasets, establishes it as a leading optimization method.
Figure 1: The efficiency of CLBO vs. the state-of-the-art optimizers.

6 Bias Correction Experiments

This section reports the results of using BBC-CV to remove bias from the final performance estimates. Figure 2 shows, for each dataset, the difference of the BBC-CV estimate (y-axis) and the CV estimate (x-axis) from the holdout performance. Each point represents a dataset, and the plot is divided into four quadrants for visualization. Each quadrant indicates whether the estimates are optimistic or conservative and reports the number of datasets in that category (out of 35). Points on the diagonal white line correspond to datasets where the CV estimate matches BBC-CV. Points to the right of the diagonal indicate that CV is more optimistic than BBC-CV, which is the case for most datasets.
In Figure 2, the CV estimate is optimistically biased in 22 datasets (green and red quadrants), while BBC-CV addresses this bias, providing conservative estimates in 30 datasets (grey and green quadrants). Only 5 datasets show optimistic estimates by BBC-CV (red quadrant). CV never reports conservative estimates when BBC-CV reports optimistic ones (blue quadrant). BBC-CV has an absolute average bias of 0.0026 AUC, compared to 0.0036 AUC for 5-fold CV.
We conducted a Wilcoxon signed-rank test to compare BBC-CV and Holdout performance. The null hypothesis (BBC-CV performance equals Holdout) was rejected with p-value = 0.000096, supporting the alternative that BBC-CV is conservative. A second test compared CV and Holdout performance. The null hypothesis (CV performance equals Holdout) was rejected with p-value = 0.0096, supporting the alternative that CV is optimistic. These results indicate BBC-CV provides more accurate performance estimates.
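These one-sided tests can be reproduced with scipy's wilcoxon; the sketch below uses made-up per-dataset estimates purely to show the call pattern, not the paper's numbers.

```python
import numpy as np
from scipy.stats import wilcoxon

# Illustrative per-dataset estimates (placeholders, not the reported results).
bbc     = np.array([0.930, 0.912, 0.955, 0.884, 0.941])
cv      = np.array([0.936, 0.917, 0.958, 0.889, 0.945])
holdout = np.array([0.933, 0.915, 0.956, 0.886, 0.943])

# H1: BBC-CV estimates fall below the holdout performance (conservative).
print(wilcoxon(bbc, holdout, alternative="less"))
# H1: CV estimates exceed the holdout performance (optimistic).
print(wilcoxon(cv, holdout, alternative="greater"))
```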
Figure 2: The average difference from hold-out estimates for CV and BBC-CV per dataset.

7 Conclusion

In this paper, we introduce Conditional Local Bayesian Optimization (CLBO), a new Bayesian optimization method for conditional search spaces. We compared our algorithm against state-of-the-art Bayesian optimization algorithms, and the results indicated that our method is better on average. Finally, by integrating a performance correction method, Bootstrap Bias Correction for Cross-Validation (BBC-CV), we ensure accurate reporting of performance estimates.

Acknowledgments

The research project was co-funded by the Stavros Niarchos Foundation (SNF) and the Hellenic Foundation for Research and Innovation (H.F.R.I.) under the 5th Call of “Science and Society” Action – “Always Strive for Excellence – Theodore Papazoglou” (Project Number: 9578.)

References

[1]
Takuya Akiba, Shotaro Sano, Toshihiko Yanase, Takeru Ohta, and Masanori Koyama. 2019. Optuna: A Next-generation Hyperparameter Optimization Framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (Anchorage, AK, USA) (KDD ’19). Association for Computing Machinery, New York, NY, USA, 2623–2631.
[2]
I. A. Antonov and V. M. Saleev. 1979. An economic method of computing LPτ-sequences. USSR Computational Mathematics and Mathematical Physics 19 (1979), 252–256. https://api.semanticscholar.org/CorpusID:122566579
[3]
James Bergstra and Yoshua Bengio. 2012. Random search for hyper-parameter optimization. Journal of machine learning research 13, 2 (2012).
[4]
James Bergstra, Brent Komer, Chris Eliasmith, Dan Yamins, and David D Cox. 2015. Hyperopt: a Python library for model selection and hyperparameter optimization. Computational Science & Discovery 8, 1 (jul 2015), 014008.
[5]
Bernd Bischl, Martin Binder, Michel Lang, Tobias Pielok, Jakob Richter, Stefan Coors, Janek Thomas, Theresa Ullmann, Marc Becker, Anne-Laure Boulesteix, Difan Deng, and Marius Lindauer. 2023. Hyperparameter optimization: Foundations, algorithms, best practices, and open challenges. WIREs Data Mining and Knowledge Discovery 13, 2 (2023), e1484. https://wires.onlinelibrary.wiley.com/doi/pdf/10.1002/widm.1484
[6]
Bernd Bischl, Giuseppe Casalicchio, Matthias Feurer, Frank Hutter, Michel Lang, Rafael G. Mantovani, Jan N. van Rijn, and Joaquin Vanschoren. 2019. OpenML Benchmarking Suites. arXiv:1708.03731v2 [stat.ML] (2019).
[7]
K. R. Chowdhary. 2020. Natural Language Processing. Springer India, New Delhi, 603–649.
[8]
Nick Erickson, Jonas Mueller, Alexander Shirkov, Hang Zhang, Pedro Larroy, Mu Li, and Alexander Smola. 2020. AutoGluon-Tabular: Robust and accurate AutoML for structured data. arXiv preprint arXiv:2003.06505 (2020).
[9]
Stefan Falkner, Aaron Klein, and Frank Hutter. 2018. BOHB: Robust and Efficient Hyperparameter Optimization at Scale. In Proceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholmsmässan, Stockholm, Sweden, July 10-15, 2018 (Proceedings of Machine Learning Research, Vol. 80), Jennifer G. Dy and Andreas Krause (Eds.). PMLR, 1436–1445. http://proceedings.mlr.press/v80/falkner18a.html
[10]
Matthias Feurer, Katharina Eggensperger, Stefan Falkner, Marius Lindauer, and Frank Hutter. 2022. Auto-sklearn 2.0: hands-free AutoML via meta-learning. J. Mach. Learn. Res. 23, 1, Article 261 (jan 2022), 61 pages.
[11]
Peter I. Frazier. 2018. A Tutorial on Bayesian Optimization. arXiv:1807.02811 [stat.ML].
[12]
Steffen Herbold. 2020. Autorank: A Python package for automated ranking of classifiers. Journal of Open Source Software 5, 48 (2020), 2173.
[13]
Frank Hutter, Holger H. Hoos, and Kevin Leyton-Brown. 2011. Sequential Model-Based Optimization for General Algorithm Configuration. In Learning and Intelligent Optimization, Carlos A. Coello Coello (Ed.). Springer Berlin Heidelberg, Berlin, Heidelberg, 507–523.
[14]
Frank Hutter, Lars Kotthoff, and Joaquin Vanschoren (Eds.). 2019. Automated Machine Learning - Methods, Systems, Challenges. Springer.
[15]
Fergus Imrie, Bogdan Cebere, Eoin F. McKinney, and Mihaela van der Schaar. 2023. AutoPrognosis 2.0: Democratizing diagnostic and prognostic modeling in healthcare with automated machine learning. PLOS Digital Health 2, 6 (06 2023), 1–21.
[16]
Donald R. Jones, Matthias Schonlau, and William J. Welch. 1998. Efficient Global Optimization of Expensive Black-Box Functions. Journal of Global Optimization 13, 4 (01 Dec 1998), 455–492.
[17]
Lars Kotthoff, Chris Thornton, Holger H. Hoos, Frank Hutter, and Kevin Leyton-Brown. 2017. Auto-WEKA 2.0: Automatic model selection and hyperparameter optimization in WEKA. Journal of Machine Learning Research 18, 25 (2017), 1–5. http://jmlr.org/papers/v18/16-261.html
[18]
Erin LeDell and Sebastien Poirier. 2020. H2O AutoML: Scalable Automatic Machine Learning. 7th ICML Workshop on Automated Machine Learning (AutoML) (July 2020). https://www.automl.org/wp-content/uploads/2020/07/AutoML_2020_paper_61.pdf
[19]
Lisha Li, Kevin Jamieson, Giulia DeSalvo, Afshin Rostamizadeh, and Ameet Talwalkar. 2018. Hyperband: A novel bandit-based approach to hyperparameter optimization. Journal of Machine Learning Research 18, 185 (2018), 1–52.
[20]
Marius Lindauer, Katharina Eggensperger, Matthias Feurer, André Biedenkapp, Difan Deng, Carolin Benjamins, Tim Ruhkopf, René Sass, and Frank Hutter. 2022. SMAC3: A Versatile Bayesian Optimization Package for Hyperparameter Optimization. Journal of Machine Learning Research 23, 54 (2022), 1–9. http://jmlr.org/papers/v23/21-0888.html
[21]
M. D. McKay, R. J. Beckman, and W. J. Conover. 1979. A Comparison of Three Methods for Selecting Values of Input Variables in the Analysis of Output from a Computer Code. Technometrics 21, 2 (1979), 239–245. http://www.jstor.org/stable/1268522
[22]
Timon Meier, Runxuan Li, Stefanos Mavrikos, Brian Blankenship, Zacharias Vangelatos, M. Erden Yildizdag, and Costas P. Grigoropoulos. 2024. Obtaining auxetic and isotropic metamaterials in counterintuitive design spaces: an automated optimization approach and experimental characterization. npj Computational Materials 10, 1 (03 Jan 2024), 3.
[23]
J. Močkus. 1975. On bayesian methods for seeking the extremum. In Optimization Techniques IFIP Technical Conference Novosibirsk, July 1–7, 1974, G. I. Marchuk (Ed.). Springer Berlin Heidelberg, Berlin, Heidelberg, 400–404.
[24]
Ronja Möller, Antonino Furnari, Sebastiano Battiato, Aki Härmä, and Giovanni Maria Farinella. 2021. A survey on human-aware robot navigation. Robotics and Autonomous Systems 145 (2021), 103837.
[25]
Randal S. Olson, Nathan Bartley, Ryan J. Urbanowicz, and Jason H. Moore. 2016. Evaluation of a Tree-based Pipeline Optimization Tool for Automating Data Science. In Proceedings of the Genetic and Evolutionary Computation Conference 2016 (Denver, Colorado, USA) (GECCO ’16). ACM, New York, NY, USA, 485–492.
[26]
Sandeep Singh Sandha, Mohit Aggarwal, Igor Fedorov, and Mani Srivastava. 2020. Mango: A Python Library for Parallel Hyperparameter Tuning. In ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 3987–3991.
[27]
Bobak Shahriari, Kevin Swersky, Ziyu Wang, Ryan P. Adams, and Nando de Freitas. 2016. Taking the Human Out of the Loop: A Review of Bayesian Optimization. Proc. IEEE 104, 1 (2016), 148–175.
[28]
Jasper Snoek, Hugo Larochelle, and Ryan P. Adams. 2012. Practical Bayesian Optimization of Machine Learning Algorithms. In Advances in Neural Information Processing Systems, F. Pereira, C.J. Burges, L. Bottou, and K.Q. Weinberger (Eds.), Vol. 25. Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2012/file/05311655a15b75fab86956663e1819cd-Paper.pdf
[29]
M. Stone. 1974. Cross-Validatory Choice and Assessment of Statistical Predictions. Journal of the Royal Statistical Society. Series B (Methodological) 36, 2 (1974), 111–147. http://www.jstor.org/stable/2984809
[30]
Chris Thornton, Frank Hutter, Holger H. Hoos, and Kevin Leyton-Brown. 2013. Auto-WEKA: combined selection and hyperparameter optimization of classification algorithms. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (Chicago, Illinois, USA) (KDD ’13). Association for Computing Machinery, New York, NY, USA, 847–855.
[31]
Ryan Tibshirani and Robert Tibshirani. 2009. A bias correction for the minimum error rate in cross-validation. The Annals of Applied Statistics 3 (08 2009).
[32]
Ioannis Tsamardinos, Paulos Charonyktakis, Georgios Papoutsoglou, Giorgos Borboudakis, Kleanthi Lakiotaki, Jean Claude Zenklusen, Hartmut Juhl, Ekaterini Chatzaki, and Vincenzo Lagani. 2022. Just Add Data: automated predictive modeling for knowledge discovery and feature selection. npj Precision Oncology 6, 1 (16 Jun 2022), 38.
[33]
Ioannis Tsamardinos, Elissavet Greasidou, and Giorgos Borboudakis. 2018. Bootstrapping the out-of-sample predictions for efficient and accurate cross-validation. Mach Learn 107, 12 (May 2018), 1895–1922.
[34]
Ryan Turner, David Eriksson, Michael McCourt, Juha Kiili, Eero Laaksonen, Zhen Xu, and Isabelle Guyon. 2021. Bayesian Optimization is Superior to Random Search for Machine Learning Hyperparameter Tuning: Analysis of the Black-Box Optimization Challenge 2020. In Proceedings of the NeurIPS 2020 Competition and Demonstration Track (Proceedings of Machine Learning Research, Vol. 133), Hugo Jair Escalante and Katja Hofmann (Eds.). PMLR, 3–26. https://proceedings.mlr.press/v133/turner21a.html
[35]
Vasilios Plakandaras, Periklis Gogas, Theophilos Papadimitriou, and Ioannis Tsamardinos. 2022. Credit Card Fraud Detection with Automated Machine Learning Systems. Applied Artificial Intelligence 36, 1 (2022), 2086354.
[36]
Athanasios Voulodimos, Nikolaos Doulamis, Anastasios Doulamis, and Eftychios Protopapadakis. 2018. Deep learning for computer vision: A brief review. Computational intelligence and neuroscience 2018 (2018).
[37]
Xilu Wang, Yaochu Jin, Sebastian Schmitt, and Markus Olhofer. 2023. Recent Advances in Bayesian Optimization. ACM Comput. Surv. 55, 13s, Article 287 (jul 2023), 36 pages.
[38]
Li Yang and Abdallah Shami. 2020. On hyperparameter optimization of machine learning algorithms: Theory and practice. Neurocomputing 415 (2020), 295–316.
[39]
Xinjie Yu. 2010. Introduction to Evolutionary Algorithms. Industrial Engineering and Management Systems 9, 1–1.
[40]
Lucas Zimmer, Marius Lindauer, and Frank Hutter. 2021. Auto-PyTorch Tabular: Multi-Fidelity Meta-Learning for Efficient and Robust AutoDL. IEEE Transactions on Pattern Analysis and Machine Intelligence (2021), 3079–3090.
