Learning Controller Gains on Bipedal Walking Robots via User Preferences

Authors: Noel Csomay-Shanklin, Aaron D. Ames

2022 International Conference on Robotics and Automation (ICRA)

Pages 10405 - 10411

https://doi.org/10.1109/ICRA46639.2022.9811541

Published: 23 May 2022

Abstract

Experimental demonstration of complex robotic behaviors relies heavily on finding the correct controller gains. This painstaking process is often completed by a domain expert and requires deep knowledge of the relationship between parameter values and the resulting behavior of the system. Even when such knowledge is available, it can take significant effort to navigate the nonintuitive landscape of possible parameter combinations. In this work, we explore the extent to which preference-based learning can be used to optimize controller gains online by repeatedly querying the user for their preferences. This general methodology is applied to two variants of control Lyapunov function-based nonlinear controllers framed as quadratic programs, which provide theoretical guarantees but are challenging to realize in practice. These controllers are successfully demonstrated both on the planar underactuated biped, AMBER, and on the 3D underactuated biped, Cassie. We experimentally evaluate the performance of the learned controllers and show that the proposed method is able to repeatably learn gains that yield stable and robust locomotion.
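The query loop described in the abstract can be illustrated with a deliberately simplified sketch. It is not the learning algorithm used in the paper: it swaps in plain pairwise-preference hill climbing, and the human operator is replaced by a simulated one. The names `simulated_user_prefers` and `tune_gains`, the two-gain (kp, kd) parameterization, the bounds, and the hidden optimum are all assumptions made only for illustration.

```python
import random


def simulated_user_prefers(a, b, hidden_best=(40.0, 6.0)):
    """Stand-in for a human operator: True if gains `a` look better than `b`.

    The hidden optimum is purely illustrative; a real user would instead
    watch the robot walk with each gain set and state a preference.
    """
    def cost(g):
        return sum((gi - hi) ** 2 for gi, hi in zip(g, hidden_best))
    return cost(a) < cost(b)


def tune_gains(initial, bounds, prefers, n_queries=200, step=0.1, seed=0):
    """Pairwise-preference hill climbing over a box of controller gains."""
    rng = random.Random(seed)
    best = tuple(initial)
    for _ in range(n_queries):
        # Perturb each gain by a fraction of its range, clipped to the bounds.
        candidate = tuple(
            min(hi, max(lo, g + rng.gauss(0.0, step * (hi - lo))))
            for g, (lo, hi) in zip(best, bounds)
        )
        # One query: the user compares the candidate against the incumbent,
        # and the incumbent is replaced only if the candidate is preferred.
        if prefers(candidate, best):
            best = candidate
    return best


bounds = [(0.0, 100.0), (0.0, 20.0)]  # hypothetical (kp, kd) ranges
learned = tune_gains((10.0, 1.0), bounds, simulated_user_prefers)
print(learned)
```

Only pairwise comparisons are used, so no numeric reward has to be designed. A practical system would likely replace the random perturbation with a model of the user's preferences so that fewer queries are needed.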

Proceedings: May 2022, 6634 pages

Copyright © 2022.

Publisher: IEEE Press
