ABC-LMPC: Safe Sample-Based Learning MPC for Stochastic Nonlinear Dynamical Systems with Adjustable Boundary Conditions

Thananjeyan, Brijen; Balakrishna, Ashwin; Rosolia, Ugo; Gonzalez, Joseph E.; Ames, Aaron; Goldberg, Ken

doi:10.1007/978-3-030-66723-8_1

Part of the book series: Springer Proceedings in Advanced Robotics (SPAR, volume 17)

Included in the following conference series:

International Workshop on the Algorithmic Foundations of Robotics

Abstract

Sample-based learning model predictive control (LMPC) strategies have recently attracted attention due to their desirable theoretical properties and good empirical performance on robotic tasks. However, prior analysis of LMPC controllers for stochastic systems has mainly focused on linear systems in the iterative learning control setting. We present a novel LMPC algorithm, Adjustable Boundary Condition LMPC (ABC-LMPC), which enables rapid adaptation to novel start and goal configurations, and we show theoretically that the resulting controller guarantees iterative improvement in expectation for stochastic nonlinear systems. We present results with a practical instantiation of this algorithm and experimentally demonstrate that the resulting controller adapts to a variety of initial and terminal conditions on three stochastic continuous control tasks.

B. Thananjeyan and A. Balakrishna contributed equally.

Acknowledgements

This research was performed at the AUTOLAB at UC Berkeley in affiliation with the Berkeley AI Research (BAIR) Lab. Authors were also supported by the Scalable Collaborative Human-Robot Learning (SCHooL) Project, an NSF National Robotics Initiative Award 1734633, and in part by donations from Google and Toyota Research Institute. Ashwin Balakrishna is supported by an NSF GRFP. This article solely reflects the opinions and conclusions of its authors and does not reflect the views of the sponsors. We thank our colleagues who provided helpful feedback and suggestions, especially Michael Danielczuk, Daniel Brown, and Suraj Nair.

Author information

Authors and Affiliations

University of California Berkeley, Berkeley, CA, 94720, USA
Brijen Thananjeyan, Ashwin Balakrishna, Joseph E. Gonzalez & Ken Goldberg
California Institute of Technology, Pasadena, CA, 91125, USA
Ugo Rosolia & Aaron Ames

Corresponding author

Correspondence to Ashwin Balakrishna.

Editor information

Editors and Affiliations

Center for Ubiquitous Computing, University of Oulu, Oulu, Finland
Steven M. LaValle
Brendan Iribe Center for Computer Science and Engineering, University of Maryland, College Park, MD, USA
Ming Lin
Center for Ubiquitous Computing, University of Oulu, Oulu, Finland
Timo Ojala
Department of Computer Science and Engineering, Texas A&M University, College Station, TX, USA
Dylan Shell
Department of Computer Science, Rutgers University, Piscataway, NJ, USA
Jingjin Yu

Cite this paper

Thananjeyan, B., Balakrishna, A., Rosolia, U., Gonzalez, J.E., Ames, A., Goldberg, K. (2021). ABC-LMPC: Safe Sample-Based Learning MPC for Stochastic Nonlinear Dynamical Systems with Adjustable Boundary Conditions. In: LaValle, S.M., Lin, M., Ojala, T., Shell, D., Yu, J. (eds) Algorithmic Foundations of Robotics XIV. WAFR 2020. Springer Proceedings in Advanced Robotics, vol 17. Springer, Cham. https://doi.org/10.1007/978-3-030-66723-8_1

DOI: https://doi.org/10.1007/978-3-030-66723-8_1
Published: 09 February 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-66722-1
Online ISBN: 978-3-030-66723-8
eBook Packages: Intelligent Technologies and Robotics (R0)
