Abstract
Context:
Many software systems can be tuned for multiple objectives (e.g., faster runtime, lower memory consumption, less network traffic, or lower energy consumption). Such systems can suffer from “disagreement”, where different models offer different (or even opposite) insights and tactics on how to optimize a system. For configuration problems, we show that (a) model disagreement is rampant; yet (b) prior to this paper, it had barely been explored.
Objective:
We aim to help practitioners and researchers better solve multi-objective configuration-optimization problems by resolving model disagreement.
Method:
We propose a dimension-reduction method called VEER that builds a useful one-dimensional approximation of the original N-objective space. Traditional model-based optimizers use Pareto search to locate Pareto-optimal solutions to a multi-objective problem, which is computationally expensive for large-scale systems. VEER builds a surrogate that can replace the Pareto-sorting step after deployment (a minimal sketch of this idea follows the abstract).
Results:
Compared to the prior state of the art on 11 configurable systems, VEER significantly reduces both disagreement and execution time, without compromising optimization performance in most cases. For our largest problem (with tens of thousands of possible configurations), optimizing with VEER finds solutions as good as or better than before, with zero model disagreements, three orders of magnitude faster.
Conclusion:
When employing model-based optimizers for multi-objective optimization, we recommend applying VEER, which not only improves execution time but also resolves the potential model-disagreement problem.
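To make the method concrete, here is a minimal sketch of the core idea described above, not the authors' implementation (that is in the reproduction package): learn a one-dimensional surrogate of the N-objective space and rank candidates by that single score instead of Pareto-sorting them. The Pareto-rank target, the tree regressor, and the helper names (`pareto_ranks`, `surrogate`) are our illustrative assumptions.

```python
# A minimal sketch (not the authors' code) of the idea behind VEER:
# learn a one-dimensional surrogate of the N-objective space, then rank
# candidate configurations by that single score instead of running a
# Pareto sort. The rank target and regressor are illustrative choices.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def pareto_ranks(objs):
    """For each point, count how many other points dominate it
    (0 = on the Pareto front). All objectives are minimized."""
    n = len(objs)
    ranks = np.empty(n)
    for i in range(n):
        dominates_i = (np.all(objs <= objs[i], axis=1) &
                       np.any(objs < objs[i], axis=1))
        ranks[i] = dominates_i.sum()
    return ranks

# Toy stand-ins for measured data: rows of `configs` are option
# settings; rows of `objs` are measured objectives (runtime, memory, ...).
rng = np.random.default_rng(1)
configs = rng.integers(0, 2, size=(200, 10)).astype(float)
objs = rng.random((200, 3))

# One-dimensional surrogate: configuration options -> scalar score.
surrogate = DecisionTreeRegressor(max_depth=6).fit(configs, pareto_ranks(objs))

# After deployment: rank new candidates with one cheap prediction each,
# with no Pareto sorting and no disagreement between per-objective models.
candidates = rng.integers(0, 2, size=(5, 10)).astype(float)
print(np.argsort(surrogate.predict(candidates)))  # best-first indices
```

Because a single model produces the ranking, two analysts querying it can never receive contradictory advice, which is how the one-dimensional approximation removes disagreement by construction.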
Notes
Holland’s advice (Holland 1992) for genetic algorithms (such as NSGA-II and MOEA/D) is that 100 individuals need to be evolved over 100 generations; i.e., 10⁴ evaluations in all.
The source code for that implementation can be found in the reproduction package mentioned in our abstract.
We define a concordant pair for tasks with more than two objectives in a similar manner: one configuration has better performance than the other in all objectives (see the sketch below). This is not Kendall’s original definition, but we believe it is a proper extension.
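For illustration, the sketch below (our code, with illustrative names) tests this extended notion of concordance for two configurations whose objectives are all minimized:

```python
# A minimal sketch of the extended concordant-pair test from the note
# above, assuming all objectives are minimized; names are ours.
def concordant(a, b):
    """True if one configuration beats the other in ALL objectives."""
    a_wins = all(x < y for x, y in zip(a, b))
    b_wins = all(y < x for x, y in zip(a, b))
    return a_wins or b_wins

print(concordant((1.0, 2.0, 3.0), (2.0, 3.0, 4.0)))  # True: first wins everywhere
print(concordant((1.0, 5.0, 3.0), (2.0, 3.0, 4.0)))  # False: mixed results
```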
References
Agrawal A, Menzies T, Minku LL, Wagner M, Yu Z (2020) Better software analytics via “duo”: data mining algorithms using/used-by optimizers. Empir Softw Eng 25(3):2099–2136
Antkiewicz M, Bąk K, Murashkin A, Olaechea R, Liang JH, Czarnecki K (2013) Clafer tools for product line engineering. In: Proceedings of the 17th international software product line conference co-located workshops, pp 130–135
Bergstra J, Bardenet R, Bengio Y, Kégl B (2011) Algorithms for hyper-parameter optimization. Adv Neural Inf Process Syst, vol 24
Bergstra J, Yamins D, Cox DD et al (2013) Hyperopt: a python library for optimizing the hyperparameters of machine learning algorithms. In: Proceedings of the 12th Python in science conference, Citeseer, vol 13, p 20
Brochu E, Cora VM, De Freitas N (2010) A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning. arXiv:1012.2599
Chen D, Fu W, Krishna R, Menzies T (2018a) Applications of psychological science for actionable analytics. In: Proceedings of the 2018 26th ACM joint meeting on european software engineering conference and symposium on the foundations of software engineering, pp 456–467
Chen J, Nair V, Krishna R, Menzies T (2018b) “Sampling” as a baseline optimizer for search-based software engineering. IEEE Trans Softw Eng (pre-print):1–1
Coello CAC, Sierra MR (2004) A study of the parallelization of a coevolutionary multi-objective evolutionary algorithm. In: Mexican international conference on artificial intelligence. Springer, pp 688–697
Deb K, Pratap A, Agarwal S, Meyarivan T (2002) A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans Evol Comput 6(2):182–197
Devanbu P, Zimmermann T, Bird C (2016) Belief & evidence in empirical software engineering. In: 2016 IEEE/ACM 38th international conference on software engineering (ICSE). IEEE, pp 108–119
Gigerenzer G (2008) Why heuristics work. Perspect Psychol Sci 3(1):20–29
Golovin D, Solnik B, Moitra S, Kochanski G, Karro J, Sculley D (2017) Google vizier: a service for black-box optimization. In: Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining, pp 1487–1495
Guo J, Yang D, Siegmund N, Apel S, Sarkar A, Valov P, Czarnecki K, Wasowski A, Yu H (2018) Data-efficient performance learning for configurable systems. Empirical Softw Eng (EMSE) 23(3):1826–1867
Herodotou H, Lim H, Luo G, Borisov N, Dong L, Cetin FB, Babu S (2011) Starfish: a self-tuning system for big data analytics. In: Conference on innovative data systems research
Hess MR, Kromrey JD (2004) Robust confidence intervals for effect sizes: a comparative study of Cohen’s d and Cliff’s delta under non-normality and heterogeneous variances. In: Annual meeting of the American educational research association, pp 1–30
Holland JH (1992) Genetic algorithms. Sci Amer 267(1):66–73
Huband S, Hingston P, While L, Barone L (2003) An evolution strategy with probabilistic mutation for multi-objective optimisation. In: The 2003 congress on evolutionary computation, 2003. CEC’03., IEEE, vol 4, pp 2284–2291
Hutter F, Hoos HH, Leyton-Brown K (2011) Sequential model-based optimization for general algorithm configuration. In: 5th LION
Jiarpakdee J, Tantithamthavorn C, Grundy J (2021) Practitioners’ perceptions of the goals and visual explanations of defect prediction models. arXiv:2102.12007
Kaltenecker C, Grebhahn A, Siegmund N, Apel S (2020) The interplay of sampling and machine learning for software performance prediction. IEEE Softw 37(4):58–66
Kendall MG (1948) Rank correlation methods
Kolesnikov S, Siegmund N, Kästner C, Grebhahn A, Apel S (2019) Tradeoffs in modeling performance of highly configurable software systems. Softw Syst Model 18(3):2265–2283
Laumanns M, Thiele L, Deb K, Zitzler E (2002) Combining convergence and diversity in evolutionary multiobjective optimization. Evol Comput
Macbeth G, Razumiejczyk E, Ledesma RD (2011) Cliff’s delta calculator: a non-parametric effect size program for two groups of observations. Univ Psychol 10(2):545–555
Mittas N, Angelis L (2012) Ranking and clustering software cost estimation models through a multiple comparisons algorithm. IEEE Trans Softw Eng 39(4):537–551
Nair V, Menzies T, Siegmund N, Apel S (2017) Using bad learners to find good configurations. In: Proceedings of the 2017 11th joint meeting on foundations of software engineering, pp 257–267
Nair V, Yu Z, Menzies T, Siegmund N, Apel S (2018) Finding faster configurations using flash. IEEE Trans Softw Eng 46(7):794–811
Sarkar A, Guo J, Siegmund N, Apel S, Czarnecki K (2015) Cost-efficient sampling for performance prediction of configurable systems (t). In: 2015 30th IEEE/ACM international conference on automated software engineering (ASE). IEEE, pp 342–352
Sawyer R (2011) BI’s impact on analyses and decision making depends on the development of less complex applications. IJBIR 2:52–63. https://doi.org/10.4018/IJBIR.2011070104
Shrikanth N, Menzies T (2020) Assessing practitioner beliefs about software defect prediction. In: 2020 IEEE/ACM 42nd international conference on software engineering: software engineering in practice (ICSE-SEIP). IEEE, pp 182–190
Siegmund N, Grebhahn A, Apel S, Kästner C (2015) Performance-influence models for highly configurable systems. In: Proceedings of the joint meeting on foundations of software engineering (ESEC/FSE), ACM, pp 284–294
Snoek J, Larochelle H, Adams R (2012) Practical Bayesian optimization of machine learning algorithms. In: NIPS - volume 2
Song W, Chan FT (2015) Multi-objective configuration optimization for product-extension service. J Manuf Syst 37:113–125
Tan SY, Chan T (2016) Defining and conceptualizing actionable insight: a conceptual framework for decision-centric analytics. arXiv:1606.03510
Tu H, Papadimitriou G, Kiran M, Wang C, Mandal A, Deelman E, Menzies T (2021) Mining workflows for anomalous data transfers. In: 2021 IEEE/ACM 18th international conference on mining software repositories (MSR), pp 1–12. https://doi.org/10.1109/MSR52588.2021.00013
Van Aken D, Pavlo A, Gordon GJ, Zhang B (2017) Automatic database management system tuning through large-scale machine learning. In: International conference on management of data, ACM
Van Veldhuizen DA (1999) Multiobjective evolutionary algorithms: classifications, analyses and new innovations. Tech rep, Air Force Institute of Technology, Wright-Patterson AFB, OH, School of Engineering
Xia T, Krishna R, Chen J, Mathew G, Shen X, Menzies T (2018) Hyperparameter optimization for effort estimation. arXiv:1805.00336
Xu T, Jin L, Fan X, Zhou Y, Pasupathy S, Talwadker R (2015) Hey, you have given me too many knobs!: understanding and dealing with over-designed configuration in system software. In: Foundations of software engineering
Zhang Q, Li H (2007) MOEA/D: a multiobjective evolutionary algorithm based on decomposition. IEEE Trans Evol Comput 11(6):712–731
Zhu H, Jin J, Tan C, Pan F, Zeng Y, Li H, Gai K (2017) Optimized cost per click in taobao display advertising. arXiv preprint
Zitzler E, Laumanns M, Thiele L (2001) SPEA2: improving the strength Pareto evolutionary algorithm. TIK-Report, vol 103
Zuluaga M, Krause A, Püschel M (2016) ε-PAL: an active learning approach to the multi-objective optimization problem. J Mach Learn Res 17(1):3619–3650
Acknowledgements
This work was partially funded by a research grant from the Laboratory for Analytical Sciences, North Carolina State University. Apel’s work has been funded by the German Research Foundation (AP 206/11 and Grant 389792660 as part of TRR 248 – CPEC). Siegmund’s work has been supported by the Federal Ministry of Education and Research of Germany and by the Sächsische Staatsministerium für Wissenschaft, Kultur und Tourismus within the program Center of Excellence for AI Research “Center for Scalable Data Analytics and Artificial Intelligence Dresden/Leipzig” (project identification number ScaDS.AI), and by the German Research Foundation (SI 2171/2-2).
Funding
Apart from the funding acknowledged above, the authors declare no other conflicts of interest.
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Additional information
Communicated by: Erik Linstead
About this article
Cite this article
Peng, K., Kaltenecker, C., Siegmund, N. et al. VEER: enhancing the interpretability of model-based optimizations. Empir Software Eng 28, 61 (2023). https://doi.org/10.1007/s10664-023-10296-w