ABC-LMPC: Safe Sample-Based Learning MPC for Stochastic Nonlinear Dynamical Systems with Adjustable Boundary Conditions

  • Conference paper
Algorithmic Foundations of Robotics XIV (WAFR 2020)

Abstract

Sample-based learning model predictive control (LMPC) strategies have recently attracted attention due to their desirable theoretical properties and good empirical performance on robotic tasks. However, prior analysis of LMPC controllers for stochastic systems has mainly focused on linear systems in the iterative learning control setting. We present a novel LMPC algorithm, Adjustable Boundary Condition LMPC (ABC-LMPC), which enables rapid adaptation to novel start and goal configurations, and we show theoretically that the resulting controller guarantees iterative improvement in expectation for stochastic nonlinear systems. We present results with a practical instantiation of this algorithm and experimentally demonstrate that the resulting controller adapts to a variety of initial and terminal conditions on three stochastic continuous control tasks.

B. Thananjeyan and A. Balakrishna—Equal contribution.

Acknowledgements

This research was performed at the AUTOLAB at UC Berkeley in affiliation with the Berkeley AI Research (BAIR) Lab. Authors were also supported by the Scalable Collaborative Human-Robot Learning (SCHooL) Project, an NSF National Robotics Initiative Award 1734633, and in part by donations from Google and Toyota Research Institute. Ashwin Balakrishna is supported by an NSF GRFP. This article solely reflects the opinions and conclusions of its authors and does not reflect the views of the sponsors. We thank our colleagues who provided helpful feedback and suggestions, especially Michael Danielczuk, Daniel Brown and Suraj Nair.

Corresponding author

Correspondence to Ashwin Balakrishna.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Thananjeyan, B., Balakrishna, A., Rosolia, U., Gonzalez, J.E., Ames, A., Goldberg, K. (2021). ABC-LMPC: Safe Sample-Based Learning MPC for Stochastic Nonlinear Dynamical Systems with Adjustable Boundary Conditions. In: LaValle, S.M., Lin, M., Ojala, T., Shell, D., Yu, J. (eds) Algorithmic Foundations of Robotics XIV. WAFR 2020. Springer Proceedings in Advanced Robotics, vol 17. Springer, Cham. https://doi.org/10.1007/978-3-030-66723-8_1
