Safe Learning and Optimization Techniques: Towards a Survey of the State of the Art

  • Conference paper
Trustworthy AI - Integrating Learning, Optimization and Reasoning (TAILOR 2020)

Abstract

Safe learning and optimization deals with learning and optimization problems that avoid, as much as possible, the evaluation of non-safe input points, which are solutions, policies, or strategies that cause an irrecoverable loss (e.g., breakage of a machine or equipment, or a threat to life). Although a comprehensive survey of safe reinforcement learning algorithms was published in 2015, a number of new algorithms have been proposed since then, and related work in active learning and in optimization was not considered. This paper reviews those algorithms from a number of domains including reinforcement learning, Gaussian process regression and classification, evolutionary computing, and active learning. We provide the fundamental concepts on which the reviewed algorithms are based and a characterization of the individual algorithms. We conclude by explaining how the algorithms are connected and by offering suggestions for future research.

Notes

  1. The notion of risk is considered more formally in SAL [26], discussed in Sect. 4.3.

References

  1. Allmendinger, R.: Tuning evolutionary search for closed-loop optimization. Ph.D. thesis, The University of Manchester, UK (2012)

  2. Allmendinger, R., Knowles, J.D.: Evolutionary search in lethal environments. In: International Conference on Evolutionary Computation Theory and Applications, pp. 63–72. SciTePress (2011)

  3. Auer, P.: Using confidence bounds for exploitation-exploration trade-offs. J. Mach. Learn. Res. 3, 397–422 (2002)

  4. Bachoc, F., Helbert, C., Picheny, V.: Gaussian process optimization with failures: classification and convergence proof. J. Glob. Optim. 78, 483–506 (2020). https://doi.org/10.1007/s10898-020-00920-0

  5. Bäck, T.: Evolutionary Algorithms in Theory and Practice: Evolution Strategies, Evolutionary Programming, Genetic Algorithms. Oxford University Press, Oxford (1996)

  6. Berkenkamp, F., Krause, A., Schoellig, A.P.: Bayesian optimization with safety constraints: safe and automatic parameter tuning in robotics. arXiv preprint arXiv:1602.04450 (2016)

  7. Berkenkamp, F., Schoellig, A.P., Krause, A.: Safe controller optimization for quadrotors with Gaussian processes. In: 2016 IEEE International Conference on Robotics and Automation (ICRA), pp. 491–496. IEEE (2016)

  8. Bıyık, E., Margoliash, J., Alimo, S.R., Sadigh, D.: Efficient and safe exploration in deterministic Markov decision processes with unknown transition models. In: 2019 American Control Conference (ACC), pp. 1792–1799. IEEE (2019)

  9. Brochu, E., Cora, V., de Freitas, N.: A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning. arXiv preprint arXiv:1012.2599 (2010)

  10. Duivenvoorden, R.R.P.R., Berkenkamp, F., Carion, N., Krause, A., Schoellig, A.P.: Constrained Bayesian optimization with particle swarms for safe adaptive controller tuning. IFAC-PapersOnLine 50(1), 11800–11807 (2017)

  11. Ferrer, J., López-Ibáñez, M., Alba, E.: Reliable simulation-optimization of traffic lights in a real-world city. Appl. Soft Comput. 78, 697–711 (2019)

  12. Forrester, A.I.J., Keane, A.J.: Recent advances in surrogate-based optimization. Prog. Aerosp. Sci. 45(1–3), 50–79 (2009)

  13. García, J., Fernández, F.: A comprehensive survey on safe reinforcement learning. J. Mach. Learn. Res. 16(1), 1437–1480 (2015)

  14. Geibel, P.: Reinforcement learning for MDPs with constraints. In: Fürnkranz, J., Scheffer, T., Spiliopoulou, M. (eds.) ECML 2006. LNCS (LNAI), vol. 4212, pp. 646–653. Springer, Heidelberg (2006). https://doi.org/10.1007/11871842_63

  15. Gosavi, A.: Reinforcement learning: a tutorial survey and recent advances. INFORMS J. Comput. 21(2), 178–192 (2009)

  16. Hansen, N., Ostermeier, A.: Completely derandomized self-adaptation in evolution strategies. Evol. Comput. 9(2), 159–195 (2001)

  17. Huang, D., Allen, T.T., Notz, W.I., Zeng, N.: Global optimization of stochastic black-box systems via sequential kriging meta-models. J. Global Optim. 34(3), 441–466 (2006). https://doi.org/10.1007/s10898-005-2454-3

  18. Kaji, H., Ikeda, K., Kita, H.: Avoidance of constraint violation for experiment-based evolutionary multi-objective optimization. In: Proceedings of the 2009 Congress on Evolutionary Computation (CEC 2009), pp. 2756–2763. IEEE Press, Piscataway (2009)

  19. Knowles, J.D.: Closed-loop evolutionary multiobjective optimization. IEEE Comput. Intell. Mag. 4, 77–91 (2009)

  20. Likar, B., Kocijan, J.: Predictive control of a gas-liquid separation plant based on a Gaussian process model. Comput. Chem. Eng. 31(3), 142–152 (2007)

  21. Moldovan, T.M., Abbeel, P.: Safe exploration in Markov decision processes. In: Langford, J., Pineau, J. (eds.) Proceedings of the 29th International Conference on Machine Learning, ICML 2012, pp. 1451–1458. Omnipress (2012)

  22. Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning. MIT Press, Cambridge (2006)

  23. Sacher, M., et al.: A classification approach to efficient global optimization in presence of non-computable domains. Struct. Multidiscip. Optim. 58(4), 1537–1557 (2018). https://doi.org/10.1007/s00158-018-1981-8

  24. Schillinger, M., Hartmann, B., Skalecki, P., Meister, M., Nguyen-Tuong, D., Nelles, O.: Safe active learning and safe Bayesian optimization for tuning a PI-controller. IFAC-PapersOnLine 50(1), 5967–5972 (2017)

  25. Schillinger, M., et al.: Safe active learning of a high pressure fuel supply system. In: Proceedings of the 9th EUROSIM Congress on Modelling and Simulation, EUROSIM 2016 and the 57th SIMS Conference on Simulation and Modelling SIMS 2016, pp. 286–292. Linköping University Electronic Press (2018)

  26. Schreiter, J., Nguyen-Tuong, D., Eberts, M., Bischoff, B., Markert, H., Toussaint, M.: Safe exploration for active learning with Gaussian processes. In: Bifet, A., et al. (eds.) ECML PKDD 2015. LNCS (LNAI), vol. 9286, pp. 133–149. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-23461-8_9

  27. Schulz, E., Speekenbrink, M., Krause, A.: A tutorial on Gaussian process regression: modelling, exploring, and exploiting functions. J. Math. Psychol. 85, 1–16 (2018)

  28. Small, B.G., et al.: Efficient discovery of anti-inflammatory small-molecule combinations using evolutionary computing. Nat. Chem. Biol. 7(12), 902–908 (2011)

  29. Snoek, J., Larochelle, H., Adams, R.P.: Practical Bayesian optimization of machine learning algorithms. In: Bartlett, P.L., Pereira, F.C.N., Burges, C.J.C., Bottou, L., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems (NIPS 25), pp. 2960–2968. Curran Associates, Red Hook (2012)

  30. Sui, Y., Gotovos, A., Burdick, J.W., Krause, A.: Safe exploration for optimization with Gaussian processes. In: Bach, F., Blei, D. (eds.) Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, vol. 37, pp. 997–1005 (2015)

  31. Sui, Y., Zhuang, V., Burdick, J.W., Yue, Y.: Stagewise safe Bayesian optimization with Gaussian processes. In: Dy, J.G., Krause, A. (eds.) Proceedings of the 35th International Conference on Machine Learning, ICML 2018. Proceedings of Machine Learning Research, vol. 80, pp. 4788–4796. PMLR (2018)

  32. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction, 2nd edn. MIT Press, Cambridge (2018)

  33. Turchetta, M., Berkenkamp, F., Krause, A.: Safe exploration in finite Markov decision processes with Gaussian processes. In: Lee, D.D., Sugiyama, M., Luxburg, U.V., Guyon, I., Garnett, R. (eds.) Advances in Neural Information Processing Systems (NIPS 29), pp. 4312–4320 (2016)

  34. Turchetta, M., Berkenkamp, F., Krause, A.: Safe exploration for interactive machine learning. In: Wallach, H.M., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E.B., Garnett, R. (eds.) Advances in Neural Information Processing Systems (NIPS 32), pp. 2887–2897 (2019)

  35. Wachi, A., Sui, Y., Yue, Y., Ono, M.: Safe exploration and optimization of constrained MDPs using Gaussian processes. In: McIlraith, S.A., Weinberger, K.Q. (eds.) AAAI Conference on Artificial Intelligence, pp. 6548–6556. AAAI Press (2018)

Acknowledgements

M. López-Ibáñez is a “Beatriz Galindo” Senior Distinguished Researcher (BEAGAL 18/00053) funded by the Spanish Ministry of Science and Innovation.

Author information

Corresponding author

Correspondence to Youngmin Kim.

Copyright information

© 2021 Springer Nature Switzerland AG

Cite this paper

Kim, Y., Allmendinger, R., López-Ibáñez, M. (2021). Safe Learning and Optimization Techniques: Towards a Survey of the State of the Art. In: Heintz, F., Milano, M., O'Sullivan, B. (eds) Trustworthy AI - Integrating Learning, Optimization and Reasoning. TAILOR 2020. Lecture Notes in Computer Science, vol 12641. Springer, Cham. https://doi.org/10.1007/978-3-030-73959-1_12

  • DOI: https://doi.org/10.1007/978-3-030-73959-1_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-73958-4

  • Online ISBN: 978-3-030-73959-1

  • eBook Packages: Computer Science, Computer Science (R0)
