Output Feedback Adaptive Optimal Control of Affine Nonlinear systems with a Linear Measurement Model

Authors: Tochukwu Elijah Ogri, S. M. Nahid Mahmud, Zachary I. Bell, Rushikesh Kamalapurkar

Abstract: Real-world control applications in complex and uncertain environments require adaptability to handle model uncertainties and robustness against disturbances. This paper presents an online, output-feedback, critic-only, model-based reinforcement learning architecture that simultaneously learns and implements an optimal controller while maintaining stability during the learning phase. Using multiplier matrices, a convenient way to search for observer gains is designed along with a controller that learns from simulated experience to ensure stability and convergence of trajectories of the closed-loop system to a neighborhood of the origin. Local uniform ultimate boundedness of the trajectories is established using a Lyapunov-based analysis and demonstrated through simulation results, under mild excitation conditions.

Submitted 3 April, 2023; v1 submitted 12 October, 2022; originally announced October 2022.

Comments: 16 pages, 5 figures, submitted to 2023 IEEE Conference on Control Technology and Applications

arXiv:2204.01409 [pdf, other]

Safe Controller for Output Feedback Linear Systems using Model-Based Reinforcement Learning

Authors: S M Nahid Mahmud, Moad Abudia, Scott A Nivison, Zachary I. Bell, Rushikesh Kamalapurkar

Abstract: The objective of this research is to enable safety-critical systems to simultaneously learn and execute optimal control policies in a safe manner to achieve complex autonomy. Learning optimal policies via trial and error, i.e., traditional reinforcement learning, is difficult to implement in safety-critical systems, particularly when task restarts are unavailable. Safe model-based reinforcement learning techniques based on a barrier transformation have recently been developed to address this problem. However, these methods rely on full state feedback, limiting their usability in a real-world environment. In this work, an output-feedback safe model-based reinforcement learning technique based on a novel barrier-aware dynamic state estimator has been designed to address this issue. The developed approach facilitates simultaneous learning and execution of safe control policies for safety-critical linear systems. Simulation results indicate that barrier transformation is an effective approach to achieve online reinforcement learning in safety-critical systems using output feedback.

Submitted 4 April, 2022; originally announced April 2022.

Comments: arXiv admin note: substantial text overlap with arXiv:2110.00271

arXiv:2110.00271 [pdf, other]

Safety aware model-based reinforcement learning for optimal control of a class of output-feedback nonlinear systems

Authors: S M Nahid Mahmud, Moad Abudia, Scott A Nivison, Zachary I. Bell, Rushikesh Kamalapurkar

Abstract: The ability to learn and execute optimal control policies safely is critical to realization of complex autonomy, especially where task restarts are not available and/or the systems are safety-critical. Safety requirements are often expressed in terms of state and/or control constraints. Methods such as barrier transformation and control barrier functions have been successfully used, in conjunction with model-based reinforcement learning, for safe learning in systems under state constraints, to learn the optimal control policy. However, existing barrier-based safe learning methods rely on full state feedback. In this paper, an output-feedback safe model-based reinforcement learning technique is developed that utilizes a novel dynamic state estimator to implement simultaneous learning and control for a class of safety-critical systems with partially observable state.

Submitted 1 October, 2021; originally announced October 2021.

Comments: arXiv admin note: substantial text overlap with arXiv:2007.12666

arXiv:2008.08972 [pdf, other]

Online inverse reinforcement learning with limited data

Authors: Ryan Self, S M Nahid Mahmud, Katrine Hareland, Rushikesh Kamalapurkar

Abstract: This paper addresses the problem of online inverse reinforcement learning for systems with limited data and uncertain dynamics. In the developed approach, the state and control trajectories are recorded online by observing an agent perform a task, and reward function estimation is performed in real-time using a novel inverse reinforcement learning approach. Parameter estimation is performed concurrently to help compensate for uncertainties in the agent's dynamics. Data insufficiency is resolved by developing a data-driven update law to estimate the optimal feedback controller. The estimated controller can then be queried to artificially create additional data to drive reward function estimation.

Submitted 18 August, 2020; originally announced August 2020.

Comments: 8 pages, 5 figures. arXiv admin note: text overlap with arXiv:2003.03912

arXiv:2007.12666 [pdf, other]

Safe Model-Based Reinforcement Learning for Systems with Parametric Uncertainties

Authors: S M Nahid Mahmud, Scott A Nivison, Zachary I. Bell, Rushikesh Kamalapurkar

Abstract: Reinforcement learning has been established over the past decade as an effective tool to find optimal control policies for dynamical systems, with recent focus on approaches that guarantee safety during the learning and/or execution phases. In general, safety guarantees are critical in reinforcement learning when the system is safety-critical and/or task restarts are not practically feasible. In optimal control theory, safety requirements are often expressed in terms of state and/or control constraints. In recent years, reinforcement learning approaches that rely on persistent excitation have been combined with a barrier transformation to learn the optimal control policies under state constraints. To soften the excitation requirements, model-based reinforcement learning methods that rely on exact model knowledge have also been integrated with the barrier transformation framework. The objective of this paper is to develop a safe reinforcement learning method for deterministic nonlinear systems, with parametric uncertainties in the model, to learn approximate constrained optimal policies without relying on stringent excitation conditions. To that end, a model-based reinforcement learning technique that utilizes a novel filtered concurrent learning method, along with a barrier transformation, is developed in this paper to realize simultaneous learning of unknown model parameters and approximate optimal state-constrained control policies for safety-critical systems.

Submitted 5 October, 2021; v1 submitted 24 July, 2020; originally announced July 2020.

Comments: This manuscript has been accepted in Frontiers in Robotics and AI. doi: 10.3389/frobt.2021.733104

arXiv:1703.07068 [pdf, other]

doi 10.1109/CDC.2017.8263965

Online Simultaneous State and Parameter Estimation

Authors: Ryan Self, Moad Abudia, S. M. Nahid Mahmud, Rushikesh Kamalapurkar

Abstract: In this paper, a concurrent learning based adaptive observer is developed for a class of second-order nonlinear time-invariant systems with uncertain dynamics. The developed technique results in uniformly ultimately bounded state and parameter estimation errors. As opposed to persistent excitation which is required for parameter convergence in traditional adaptive control methods, the developed technique only requires excitation over a finite time interval to achieve parameter convergence. Simulation results in both noise-free and noisy environments are presented to validate the design.

Submitted 13 October, 2020; v1 submitted 21 March, 2017; originally announced March 2017.

Comments: arXiv admin note: text overlap with arXiv:1609.05879

Showing 1–6 of 6 results for author: Mahmud, S M N