Safe Learning and Optimization Techniques: Towards a Survey of the State of the Art

Kim, Youngmin; Allmendinger, Richard; López-Ibáñez, Manuel

doi:10.1007/978-3-030-73959-1_12

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12641)

Included in the following conference series:

International Workshop on the Foundations of Trustworthy AI Integrating Learning, Optimization and Reasoning

Abstract

Safe learning and optimization deals with learning and optimization problems in which the evaluation of unsafe input points must be avoided as much as possible. Unsafe input points are solutions, policies, or strategies that cause an irrecoverable loss (e.g., breakage of a machine or equipment, or a threat to life). Although a comprehensive survey of safe reinforcement learning algorithms was published in 2015, a number of new algorithms have been proposed since then, and that survey did not consider related work in active learning and in optimization. This paper reviews those algorithms from a number of domains, including reinforcement learning, Gaussian process regression and classification, evolutionary computing, and active learning. We provide the fundamental concepts on which the reviewed algorithms are based and a characterization of the individual algorithms. We conclude by explaining how the algorithms are connected and by offering suggestions for future research.

Notes

1. The notion of risk is considered more formally in SAL [26], discussed in Sect. 4.3.

References

1. Allmendinger, R.: Tuning evolutionary search for closed-loop optimization. Ph.D. thesis, The University of Manchester, UK (2012)
2. Allmendinger, R., Knowles, J.D.: Evolutionary search in lethal environments. In: International Conference on Evolutionary Computation Theory and Applications, pp. 63–72. SciTePress (2011)
3. Auer, P.: Using confidence bounds for exploitation-exploration trade-offs. J. Mach. Learn. Res. 3, 397–422 (2002)
4. Bachoc, F., Helbert, C., Picheny, V.: Gaussian process optimization with failures: classification and convergence proof. J. Glob. Optim. 78, 483–506 (2020). https://doi.org/10.1007/s10898-020-00920-0
5. Bäck, T.: Evolutionary Algorithms in Theory and Practice: Evolution Strategies, Evolutionary Programming, Genetic Algorithms. Oxford University Press, Oxford (1996)
6. Berkenkamp, F., Krause, A., Schoellig, A.P.: Bayesian optimization with safety constraints: safe and automatic parameter tuning in robotics. arXiv preprint arXiv:1602.04450 (2016)
7. Berkenkamp, F., Schoellig, A.P., Krause, A.: Safe controller optimization for quadrotors with Gaussian processes. In: 2016 IEEE International Conference on Robotics and Automation (ICRA), pp. 491–496. IEEE (2016)
8. Bıyık, E., Margoliash, J., Alimo, S.R., Sadigh, D.: Efficient and safe exploration in deterministic Markov decision processes with unknown transition models. In: 2019 American Control Conference (ACC), pp. 1792–1799. IEEE (2019)
9. Brochu, E., Cora, V., de Freitas, N.: A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning. arXiv preprint arXiv:1012.2599 (2010)
10. Duivenvoorden, R.R.P.R., Berkenkamp, F., Carion, N., Krause, A., Schoellig, A.P.: Constrained Bayesian optimization with particle swarms for safe adaptive controller tuning. IFAC-PapersOnLine 50(1), 11800–11807 (2017)
11. Ferrer, J., López-Ibáñez, M., Alba, E.: Reliable simulation-optimization of traffic lights in a real-world city. Appl. Soft Comput. 78, 697–711 (2019)
12. Forrester, A.I.J., Keane, A.J.: Recent advances in surrogate-based optimization. Prog. Aerosp. Sci. 45(1–3), 50–79 (2009)
13. García, J., Fernández, F.: A comprehensive survey on safe reinforcement learning. J. Mach. Learn. Res. 16(1), 1437–1480 (2015)
14. Geibel, P.: Reinforcement learning for MDPs with constraints. In: Fürnkranz, J., Scheffer, T., Spiliopoulou, M. (eds.) ECML 2006. LNCS (LNAI), vol. 4212, pp. 646–653. Springer, Heidelberg (2006). https://doi.org/10.1007/11871842_63
15. Gosavi, A.: Reinforcement learning: a tutorial survey and recent advances. INFORMS J. Comput. 21(2), 178–192 (2009)
16. Hansen, N., Ostermeier, A.: Completely derandomized self-adaptation in evolution strategies. Evol. Comput. 9(2), 159–195 (2001)
17. Huang, D., Allen, T.T., Notz, W.I., Zeng, N.: Global optimization of stochastic black-box systems via sequential kriging meta-models. J. Glob. Optim. 34(3), 441–466 (2006). https://doi.org/10.1007/s10898-005-2454-3
18. Kaji, H., Ikeda, K., Kita, H.: Avoidance of constraint violation for experiment-based evolutionary multi-objective optimization. In: Proceedings of the 2009 Congress on Evolutionary Computation (CEC 2009), pp. 2756–2763. IEEE Press, Piscataway (2009)
19. Knowles, J.D.: Closed-loop evolutionary multiobjective optimization. IEEE Comput. Intell. Mag. 4, 77–91 (2009)
20. Likar, B., Kocijan, J.: Predictive control of a gas-liquid separation plant based on a Gaussian process model. Comput. Chem. Eng. 31(3), 142–152 (2007)
21. Moldovan, T.M., Abbeel, P.: Safe exploration in Markov decision processes. In: Langford, J., Pineau, J. (eds.) Proceedings of the 29th International Conference on Machine Learning, ICML 2012, pp. 1451–1458. Omnipress (2012)
22. Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning. MIT Press, Cambridge (2006)
23. Sacher, M., et al.: A classification approach to efficient global optimization in presence of non-computable domains. Struct. Multidiscip. Optim. 58(4), 1537–1557 (2018). https://doi.org/10.1007/s00158-018-1981-8
24. Schillinger, M., Hartmann, B., Skalecki, P., Meister, M., Nguyen-Tuong, D., Nelles, O.: Safe active learning and safe Bayesian optimization for tuning a PI-controller. IFAC-PapersOnLine 50(1), 5967–5972 (2017)
25. Schillinger, M., et al.: Safe active learning of a high pressure fuel supply system. In: Proceedings of the 9th EUROSIM Congress on Modelling and Simulation, EUROSIM 2016 and the 57th SIMS Conference on Simulation and Modelling SIMS 2016, pp. 286–292. Linköping University Electronic Press (2018)
26. Schreiter, J., Nguyen-Tuong, D., Eberts, M., Bischoff, B., Markert, H., Toussaint, M.: Safe exploration for active learning with Gaussian processes. In: Bifet, A., et al. (eds.) ECML PKDD 2015. LNCS (LNAI), vol. 9286, pp. 133–149. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-23461-8_9
27. Schulz, E., Speekenbrink, M., Krause, A.: A tutorial on Gaussian process regression: modelling, exploring, and exploiting functions. J. Math. Psychol. 85, 1–16 (2018)
28. Small, B.G., et al.: Efficient discovery of anti-inflammatory small-molecule combinations using evolutionary computing. Nat. Chem. Biol. 7(12), 902–908 (2011)
29. Snoek, J., Larochelle, H., Adams, R.P.: Practical Bayesian optimization of machine learning algorithms. In: Bartlett, P.L., Pereira, F.C.N., Burges, C.J.C., Bottou, L., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems (NIPS 25), pp. 2960–2968. Curran Associates, Red Hook (2012)
30. Sui, Y., Gotovos, A., Burdick, J.W., Krause, A.: Safe exploration for optimization with Gaussian processes. In: Bach, F., Blei, D. (eds.) Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, vol. 37, pp. 997–1005 (2015)
31. Sui, Y., Zhuang, V., Burdick, J.W., Yue, Y.: Stagewise safe Bayesian optimization with Gaussian processes. In: Dy, J.G., Krause, A. (eds.) Proceedings of the 35th International Conference on Machine Learning, ICML 2018. Proceedings of Machine Learning Research, vol. 80, pp. 4788–4796. PMLR (2018)
32. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction, 2nd edn. MIT Press, Cambridge (2018)
33. Turchetta, M., Berkenkamp, F., Krause, A.: Safe exploration in finite Markov decision processes with Gaussian processes. In: Lee, D.D., Sugiyama, M., Luxburg, U.V., Guyon, I., Garnett, R. (eds.) Advances in Neural Information Processing Systems (NIPS 29), pp. 4312–4320 (2016)
34. Turchetta, M., Berkenkamp, F., Krause, A.: Safe exploration for interactive machine learning. In: Wallach, H.M., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E.B., Garnett, R. (eds.) Advances in Neural Information Processing Systems (NIPS 32), pp. 2887–2897 (2019)
35. Wachi, A., Sui, Y., Yue, Y., Ono, M.: Safe exploration and optimization of constrained MDPs using Gaussian processes. In: McIlraith, S.A., Weinberger, K.Q. (eds.) AAAI Conference on Artificial Intelligence, pp. 6548–6556. AAAI Press (2018)

Acknowledgements

M. López-Ibáñez is a “Beatriz Galindo” Senior Distinguished Researcher (BEAGAL 18/00053) funded by the Spanish Ministry of Science and Innovation.

Author information

Authors and Affiliations

Alliance Manchester Business School, University of Manchester, Manchester, M15 6PB, UK
Youngmin Kim, Richard Allmendinger & Manuel López-Ibáñez
School of Computer Science, University of Málaga, 29071, Málaga, Spain
Manuel López-Ibáñez

Corresponding author

Correspondence to Youngmin Kim.

Editor information

Editors and Affiliations

Department of Computer and Information Science, Linköping University, Linköping, Sweden
Fredrik Heintz
ALMA-AI Research Institute on Human-Centered AI, University of Bologna, Bologna, Italy
Michela Milano
Department of Computer Science, University College Cork, Cork, Ireland
Barry O'Sullivan

About this paper

Cite this paper

Kim, Y., Allmendinger, R., López-Ibáñez, M. (2021). Safe Learning and Optimization Techniques: Towards a Survey of the State of the Art. In: Heintz, F., Milano, M., O'Sullivan, B. (eds) Trustworthy AI - Integrating Learning, Optimization and Reasoning. TAILOR 2020. Lecture Notes in Computer Science, vol 12641. Springer, Cham. https://doi.org/10.1007/978-3-030-73959-1_12

DOI: https://doi.org/10.1007/978-3-030-73959-1_12
Published: 13 April 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-73958-4
Online ISBN: 978-3-030-73959-1
eBook Packages: Computer Science, Computer Science (R0)
