Abstract
Modern machine learning models are often constructed taking multiple objectives into account, e.g., minimizing inference time while also maximizing accuracy. Multi-objective hyperparameter optimization (MHPO) algorithms return such candidate models, and the approximation of the Pareto front is used to assess their performance. In practice, we also want to measure generalization when moving from the validation to the test set. However, some of the models might no longer be Pareto-optimal on the test set, which makes it unclear how to quantify the performance of the MHPO method there. To resolve this, we provide a novel evaluation protocol that allows measuring the generalization performance of MHPO methods, and we study its capabilities for comparing two optimization experiments.
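The core of the problem can be sketched in a few lines of Python. The snippet below is a minimal illustration, not the paper's implementation: it filters candidate configurations to the Pareto-optimal set on two validation objectives (both minimized) and then re-checks dominance for those configurations on the test set, where some of them may no longer be Pareto-optimal. All function names and numeric values are illustrative assumptions.

```python
import numpy as np

def pareto_mask(scores: np.ndarray) -> np.ndarray:
    """Boolean mask of non-dominated rows; `scores` has shape
    (n_configs, n_objectives) and all objectives are minimized."""
    n = scores.shape[0]
    mask = np.ones(n, dtype=bool)
    for i in range(n):
        # Row j dominates row i if it is <= in all objectives and < in at least one.
        dominates_i = np.all(scores <= scores[i], axis=1) & np.any(scores < scores[i], axis=1)
        mask[i] = not dominates_i.any()
    return mask

# Hypothetical (objective_1, objective_2) values for five configurations,
# e.g., (1 - accuracy, normalized inference time); lower is better for both.
val_scores = np.array([[0.10, 0.90], [0.12, 0.50], [0.20, 0.30], [0.16, 0.55], [0.30, 0.20]])
test_scores = np.array([[0.14, 0.90], [0.11, 0.50], [0.22, 0.30], [0.17, 0.55], [0.29, 0.20]])

val_front = pareto_mask(val_scores)                   # configurations selected on validation
still_optimal = pareto_mask(test_scores[val_front])   # dominance re-checked on the test set
print(val_front)        # [ True  True  True False  True]
print(still_optimal)    # [False  True  True  True]: one selected config is dominated on test
```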
Notes
1. In principle, this is agnostic to the capability of the HPO algorithm to consider multiple objectives. Any HPO algorithm (including random search) would suffice, since one can compute the Pareto-optimal set post-hoc.
2. The true Pareto front is only approximated because there is usually no guarantee that an MHPO algorithm finds the optimal solutions, and no guarantee that it finds all solutions on the true Pareto front (one common summary of approximation quality, the hypervolume indicator, is sketched after this list).
3. This is due to a shift in distribution when going from the validation set to the test set caused by random sampling. The hyperparameter configuration might then no longer be optimal due to overfitting.
4. If the true function values of evaluated configurations cannot be recovered due to budget restrictions, our proposed evaluation protocol can likewise be applied to deal with solutions that are no longer part of the Pareto front on the test set.
5. Distributionally Robust Bayesian Optimization (Kirschner et al., 2020) is an algorithm that could be used in such a setting, and the paper introducing it explicitly states AutoML as an application, but it neither demonstrates its applicability to AutoML nor elaborates on how to describe the distribution shift in a way the algorithm could handle.
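As mentioned in note 2, one common way to summarize how well an approximation set covers the Pareto front is the hypervolume indicator (Zitzler et al., 2003). The sketch below is our own minimal illustration for two minimization objectives, not code from the paper; it assumes the points have already been reduced to a mutually non-dominated set (e.g., with `pareto_mask` above) and that the user supplies a reference point dominated by all points.

```python
import numpy as np

def hypervolume_2d(front: np.ndarray, reference: np.ndarray) -> float:
    """Hypervolume of a 2-D non-dominated set (both objectives minimized).

    `front` has shape (k, 2) and must contain only mutually non-dominated points;
    `reference` must be worse than every point in both objectives.
    """
    # Sort by the first objective; for a non-dominated set the second
    # objective is then non-increasing, so we can sum disjoint rectangles.
    front = front[np.argsort(front[:, 0])]
    hv, prev_f2 = 0.0, reference[1]
    for f1, f2 in front:
        hv += (reference[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return hv

# Illustrative values only.
reference = np.array([1.0, 1.0])
validation_front = np.array([[0.10, 0.90], [0.12, 0.50], [0.20, 0.30]])
print(hypervolume_2d(validation_front, reference))  # 0.602
```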
References
Benmeziane, H., El Maghraoui, K., Ouarnoughi, H., Niar, S., Wistuba, M., Wang, N.: A comprehensive survey on Hardware-aware Neural Architecture Search. arXiv:2101.09336 [cs.LG] (2021)
Bergstra, J., Bengio, Y.: Random search for hyper-parameter optimization. J. Mach. Learn. Res. 13, 281–305 (2012)
Binder, M., Moosbauer, J., Thomas, J., Bischl, B.: Multi-objective hyperparameter tuning and feature selection using filter ensembles. In: Ceberio, J. (ed.) Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2020), pp. 471–479. ACM Press (2020)
Breiman, L.: Random forests. Mach. Learn. J. 45, 5–32 (2001)
Chakraborty, J., Xia, T., Fahid, F., Menzies, T.: Software engineering for fairness: a case study with Hyperparameter Optimization. In: Proceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE (2019)
Cruz, A., Saleiro, P., Belem, C., Soares, C., Bizarro, P.: Promoting fairness through hyperparameter optimization. In: Bailey, J., Miettinen, P., Koh, Y., Tao, D., Wu, X. (eds.) Proceedings of the IEEE International Conference on Data Mining (ICDM 2021), pp. 1036–1041. IEEE (2021)
Dua, D., Graff, C.: UCI machine learning repository (2017)
Elsken, T., Metzen, J., Hutter, F.: Efficient multi-objective Neural Architecture Search via Lamarckian evolution. In: Proceedings of the International Conference on Learning Representations (ICLR 2019) (2019a). Published online: https://iclr.cc/
Elsken, T., Metzen, J., Hutter, F.: Neural architecture search: a survey. J. Mach. Learn. Res. 20(55), 1–21 (2019b)
Emmerich, M.T.M., Deutz, A.H.: A tutorial on multiobjective optimization: fundamentals and evolutionary methods. Nat. Comput. 17(3), 585–609 (2018)
Feffer, M., Hirzel, M., Hoffman, S., Kate, K., Ram, P., Shinnar, A.: An empirical study of modular bias mitigators and ensembles. arXiv:2202.00751 [cs.LG] (2022)
Feurer, M., Hutter, F.: Hyperparameter optimization. In: Hutter et al. (2019), chap. 1, pp. 3–38, available for free at http://automl.org/book
Feurer, M., et al.: OpenML-Python: an extensible Python API for OpenML. J. Mach. Learn. Res. 22(100), 1–5 (2021)
Gardner, S., et al.: Constrained multi-objective optimization for automated machine learning. In: Singh, L., De Veaux, R., Karypis, G., Bonchi, F., Hill, J. (eds.) Proceedings of the International Conference on Data Science and Advanced Analytics (DSAA 2019), pp. 364–373. IEEE (2019)
Gelbart, M., Snoek, J., Adams, R.: Bayesian optimization with unknown constraints. In: Zhang, N., Tian, J. (eds.) Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence (UAI 2014), pp. 250–258. AUAI Press (2014)
Gonzalez, S., Branke, J., van Nieuwenhuyse, I.: Multiobjective ranking and selection using stochastic Kriging. arXiv:2209.03919 [stat.ML] (2022)
Hernández-Lobato, J., Gelbart, M., Adams, R., Hoffman, M., Ghahramani, Z.: A general framework for constrained Bayesian optimization using information-based search. J. Mach. Learn. Res. 17(1), 5549–5601 (2016)
Horn, D., Bischl, B.: Multi-objective parameter configuration of machine learning algorithms using model-based optimization. In: Likas, A. (ed.) 2016 IEEE Symposium Series on Computational Intelligence (SSCI), pp. 1–8. IEEE (2016)
Horn, D., Dagge, M., Sun, X., Bischl, B.: First investigations on noisy model-based multi-objective optimization. In: Trautmann, H., et al. (eds.) EMO 2017. LNCS, vol. 10173, pp. 298–313. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-54157-0_21
Hutter, F., Kotthoff, L., Vanschoren, J. (eds.): Automated Machine Learning: Methods, Systems, Challenges. Springer, Heidelberg (2019). Available for free at http://automl.org/book
Iqbal, M., Su, J., Kotthoff, L., Jamshidi, P.: Flexibo: Cost-aware multi-objective optimization of deep neural networks. arXiv:2001.06588 [cs.LG] (2020)
Karl, F., et al.: Multi-objective hyperparameter optimization - an overview. arXiv:2206.07438 [cs.LG] (2022)
Kirschner, J., Bogunovic, I., Jegelka, S., Krause, A.: Distributionally robust Bayesian optimization. In: Chiappa, S., Calandra, R. (eds.) Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020), pp. 2174–2184. Proceedings of Machine Learning Research (2020)
Konen, W., Koch, P., Flasch, O., Bartz-Beielstein, T., Friese, M., Naujoks, B.: Tuned data mining: a benchmark study on different tuners. In: Krasnogor, N. (ed.) Proceedings of the 13th Annual Conference on Genetic and Evolutionary Computation (GECCO 2011), pp. 1995–2002. ACM Press (2011)
Letham, B., Karrer, B., Ottoni, G., Bakshy, E.: Constrained Bayesian optimization with noisy experiments. Bayesian Analysis (2018)
Levesque, J.C., Durand, A., Gagné, C., Sabourin, R.: Multi-objective evolutionary optimization for generating ensembles of classifiers in the ROC space. In: Soule, T. (ed.) Proceedings of the 14th Annual Conference on Genetic and Evolutionary Computation (GECCO 2012), pp. 879–886. ACM Press (2012)
Manning, C., Raghavan, P., Schütze, H.: Introduction to Information Retrieval. Cambridge University Press, Cambridge (2008)
Molnar, C., Casalicchio, G., Bischl, B.: Quantifying model complexity via functional decomposition for better post-hoc interpretability. In: Cellier, P., Driessens, K. (eds.) ECML PKDD 2019. CCIS, vol. 1167, pp. 193–204. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-43823-4_17
Morales-Hernández, A., van Nieuwenhuyse, I., Gonzalez, S.: A survey on multi-objective hyperparameter optimization algorithms for machine learning. arXiv:2111.13755 [cs.LG] (2021)
Pedregosa, F., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
Raschka, S.: Model evaluation, model selection, and algorithm selection in machine learning. arXiv:1811.12808 [stat.ML] (2018)
Schmucker, R., Donini, M., Zafar, M., Salinas, D., Archambeau, C.: Multi-objective asynchronous successive halving. arXiv:2106.12639 [stat.ML] (2021)
Vanschoren, J., van Rijn, J., Bischl, B., Torgo, L.: OpenML: networked science in machine learning. SIGKDD Explor. 15(2), 49–60 (2014)
Zitzler, E., Deb, K., Thiele, L.: Comparison of multiobjective evolutionary algorithms: empirical results. Evol. Comput. 8(2), 173–195 (2000)
Zitzler, E., Thiele, L., Laumanns, M., Fonseca, C., Fonseca, V.: Performance assessment of multiobjective optimizers: an analysis and review. IEEE Trans. Evol. Comput. 7, 117–132 (2003)
Acknowledgements
Robert Bosch GmbH is acknowledged for financial support. This research was also partially supported by TAILOR, a project funded by the EU Horizon 2020 research and innovation programme under GA No. 952215. The authors of this work take full responsibility for its content.
A Experimental Details
| Random forest hyperparameter | Search space | Linear model hyperparameter | Search space |
|---|---|---|---|
| criterion | [gini, entropy] | penalty | [l2, l1, elasticnet] |
| bootstrap | [True, False] | alpha | \([10^{-6}, 10^{-2}]\), log |
| max_features | [0.0, 1.0] | l1_ratio | [0.0, 1.0] |
| min_samples_split | [2, 20] | fit_intercept | [True, False] |
| min_samples_leaf | [1, 20] | eta0 | \([10^{-7}, 10^{-1}]\) |
| pos_class_weight exponent | \([-7, 7]\) | pos_class_weight exponent | \([-7, 7]\) |
We provide the random forest and linear model search spaces in Table A. We fit the linear model with stochastic gradient descent, using an adaptive learning rate and minimizing the log loss (please see the scikit-learn documentation (Pedregosa et al., 2011) for a description of these settings). Because we are dealing with unbalanced data, we treat the class weights as a hyperparameter and tune the weight of the minority (positive) class in the range \([2^{-7}, 2^{7}]\) on a log scale (Horn and Bischl, 2016; Konen et al., 2011). To deal with categorical features, we use one-hot encoding. For the linear models, we additionally transform the features using a quantile transformer with a normal output distribution.
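To make this setup concrete, the following sketch shows how one sampled configuration could be turned into scikit-learn estimators. The helper names and the `config` dictionary keys are illustrative rather than taken from the paper; `loss="log_loss"` is the current scikit-learn name for the log loss described above (older versions call it `"log"`), and the positive-class weight is derived as \(2^{\text{exponent}}\) from the exponent hyperparameter in the table.

```python
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, QuantileTransformer

def build_linear_model(config, categorical_cols, numerical_cols):
    """Linear model pipeline for one sampled configuration (illustrative)."""
    preprocessing = ColumnTransformer([
        # One-hot encoding for categorical features.
        ("one_hot", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
        # Quantile transformation with normal output, used for the linear model only.
        ("quantile", QuantileTransformer(output_distribution="normal"), numerical_cols),
    ])
    classifier = SGDClassifier(
        loss="log_loss",              # log loss, fitted with stochastic gradient descent
        learning_rate="adaptive",
        penalty=config["penalty"],
        alpha=config["alpha"],
        l1_ratio=config["l1_ratio"],
        fit_intercept=config["fit_intercept"],
        eta0=config["eta0"],
        class_weight={0: 1.0, 1: 2.0 ** config["pos_class_weight_exponent"]},
    )
    return Pipeline([("preprocessing", preprocessing), ("classifier", classifier)])

def build_random_forest(config):
    """Random forest for one sampled configuration (illustrative); categorical
    features would still need one-hot encoding before fitting."""
    return RandomForestClassifier(
        criterion=config["criterion"],
        bootstrap=config["bootstrap"],
        max_features=config["max_features"],
        min_samples_split=config["min_samples_split"],
        min_samples_leaf=config["min_samples_leaf"],
        class_weight={0: 1.0, 1: 2.0 ** config["pos_class_weight_exponent"]},
    )
```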
We use the German credit dataset (Dua and Graff, 2017) because it is both relatively small, which leads to high variance in algorithm performance, and unbalanced. We downloaded the dataset from OpenML (Vanschoren et al., 2014) using the OpenML-Python API (Feurer et al., 2021) under task ID 31, but conducted our own 60/20/20 split. It is a binary classification problem with 30% positive samples. The dataset has 1000 samples and 20 features, 13 of which are categorical, and contains no missing values.
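Loading the data as described could look like the following sketch with OpenML-Python and scikit-learn. The 60/20/20 split is reproduced with two calls to `train_test_split`; stratification by the class label and the fixed random seed are our assumptions, since the text only states the split proportions.

```python
import openml
from sklearn.model_selection import train_test_split

# OpenML task 31 wraps the German credit dataset.
task = openml.tasks.get_task(31)
dataset = task.get_dataset()
X, y, categorical_indicator, attribute_names = dataset.get_data(
    target=dataset.default_target_attribute
)

# 60% training data, then split the remaining 40% evenly into validation and test.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, train_size=0.6, stratify=y, random_state=0
)
X_valid, X_test, y_valid, y_test = train_test_split(
    X_tmp, y_tmp, train_size=0.5, stratify=y_tmp, random_state=0
)
```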
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Feurer, M., Eggensperger, K., Bergman, E., Pfisterer, F., Bischl, B., Hutter, F. (2023). Mind the Gap: Measuring Generalization Performance Across Multiple Objectives. In: Crémilleux, B., Hess, S., Nijssen, S. (eds) Advances in Intelligent Data Analysis XXI. IDA 2023. Lecture Notes in Computer Science, vol 13876. Springer, Cham. https://doi.org/10.1007/978-3-031-30047-9_11
Print ISBN: 978-3-031-30046-2
Online ISBN: 978-3-031-30047-9