DOI: 10.1109/ICRA46639.2022.9811541

Learning Controller Gains on Bipedal Walking Robots via User Preferences

Published: 23 May 2022

Abstract

Experimental demonstration of complex robotic behaviors relies heavily on finding the correct controller gains. This painstaking process is often completed by a domain expert, requiring deep knowledge of the relationship between parameter values and the resulting behavior of the system. Even when such knowledge is possessed, it can take significant effort to navigate the nonintuitive landscape of possible parameter combinations. In this work, we explore the extent to which preference-based learning can be used to optimize controller gains online by repeatedly querying the user for their preferences. This general methodology is applied to two variants of control Lyapunov function based nonlinear controllers framed as quadratic programs, which provide theoretical guarantees but are challenging to realize in practice. These controllers are successfully demonstrated both on the planar underactuated biped, AMBER, and on the 3D underactuated biped, Cassie. We experimentally evaluate the performance of the learned controllers and show that the proposed method is repeatably able to learn gains that yield stable and robust locomotion.
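To make the idea of tuning gains from pairwise preferences concrete, here is a minimal sketch of a query loop. It is not the authors' algorithm, which the abstract does not detail. It is a deliberately simple greedy search that proposes a nearby gain setting, asks which of the two the user prefers, and keeps the winner. The function names, the two-gain (kp, kd) parameterization, the step sizes, and the simulated user with a hidden target are all illustrative assumptions.

```python
def simulated_user_prefers(new_gains, current_gains):
    """Hypothetical stand-in for the human operator.

    Returns True if the 'user' strictly prefers new_gains over current_gains.
    The hidden target is an illustrative assumption, not a value from the paper.
    """
    target_kp, target_kd = 40.0, 4.0

    def discomfort(gains):
        kp, kd = gains
        return ((kp - target_kp) / 10.0) ** 2 + (kd - target_kd) ** 2

    return discomfort(new_gains) < discomfort(current_gains)


def tune_gains(initial_gains, steps, prefers, max_queries=100):
    """Greedy pairwise-preference search over a grid of gains.

    Each sweep perturbs one gain at a time, up and down, and asks the user to
    compare the perturbed gains with the current ones. The search stops after
    a full sweep in which the user never prefers the perturbed gains.
    The query budget is checked once per sweep, not per query.
    """
    current = tuple(initial_gains)
    queries = 0
    improved = True
    while improved and queries < max_queries:
        improved = False
        for axis, step in enumerate(steps):
            for direction in (+1, -1):
                candidate = list(current)
                candidate[axis] += direction * step
                candidate = tuple(candidate)
                queries += 1  # one preference query per comparison
                if prefers(candidate, current):
                    current = candidate
                    improved = True
    return current, queries


best_gains, num_queries = tune_gains(
    initial_gains=(20.0, 2.0),
    steps=(10.0, 1.0),
    prefers=simulated_user_prefers,
)
print(best_gains, num_queries)
```

On hardware, the `prefers` callback would be replaced by an actual prompt to the operator after each pair of walking trials. A greedy search like this only uses the most recent answer, whereas a learning-based approach would typically fit a model to all past preferences to choose the next query.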


Published In

2022 International Conference on Robotics and Automation (ICRA)
May 2022
6634 pages

Publisher

IEEE Press

Qualifiers

  • Research-article
