Adaptive Optimal Control via Continuous-Time Q-Learning for Unknown Nonlinear Affine Systems

Anthony Siming Chen, Guido Herrmann

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Abstract

This paper proposes two novel adaptive optimal control algorithms for continuous-time nonlinear affine systems based on reinforcement learning: i) generalised policy iteration (GPI) and ii) Q-learning. With GPI, a priori knowledge of the system drift f(x) is not needed, which yields a partially model-free, online solution. We then, for the first time, extend the idea of Q-learning to the nonlinear continuous-time optimal control problem in a non-iterative manner. This leads to a completely model-free method in which neither the system drift f(x) nor the input gain g(x) is needed. For both methods, the adaptive critic and actor are updated continuously and simultaneously without iterative steps, which avoids a hybrid structure and the need for an initial stabilising control policy. Moreover, finite-time convergence is guaranteed by a sliding-mode technique in the new adaptive approach, where the persistent excitation (PE) condition can be verified directly online. We also prove overall Lyapunov stability and demonstrate the effectiveness of the proposed algorithms through numerical examples.
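For context on the "continuously and simultaneously" updated critic and actor, the following is a minimal Python sketch of a synchronous actor-critic rule on a scalar linear-quadratic problem whose optimum is known in closed form from the Riccati equation. It is a sketch under stated assumptions, not the authors' algorithm: the dynamics (a, b), the quadratic basis, the adaptation gains, and the reset-based excitation are illustrative choices, and, unlike the paper's GPI (drift-free) and Q-learning (fully model-free) laws, it uses full knowledge of the drift and input gain.

```python
import numpy as np

# Minimal sketch of simultaneous (non-iterative) actor-critic tuning on a
# scalar linear-quadratic problem: x_dot = a*x + b*u, running cost Q*x^2 + R*u^2.
# NOTE: unlike the paper's GPI and Q-learning laws, this baseline assumes the
# drift a and input gain b are known; the basis phi(x) = x^2, the adaptation
# gains, and the reset-based excitation below are illustrative choices only.

a, b = -1.0, 1.0              # known scalar dynamics (assumption of this sketch)
Q, R = 1.0, 1.0               # state and control weights
P_star = -1.0 + np.sqrt(2.0)  # scalar Riccati solution of 2aP - b^2 P^2/R + Q = 0

dt, T = 1e-3, 30.0
alpha_c, alpha_a = 10.0, 5.0  # critic / actor adaptation gains (tuning choices)

x = 1.0
w_c, w_a = 0.0, 0.0           # critic and actor weights for V_hat(x) = w * x^2

for k in range(int(T / dt)):
    # crude excitation: re-initialise the state every 2 s so the regressor
    # stays informative (a stand-in for the PE verification in the paper)
    if k % int(2.0 / dt) == 0:
        x = 1.0

    # actor: u = -(1/2) R^{-1} b dV_hat/dx, with dphi/dx = 2x
    u = -(b / R) * w_a * x
    x_dot = a * x + b * u

    # Bellman/HJB residual for the current critic estimate
    sigma = 2.0 * x * x_dot                   # dphi/dx * x_dot
    delta = w_c * sigma + Q * x**2 + R * u**2

    # critic: normalised gradient descent on the residual
    w_c -= dt * alpha_c * sigma * delta / (1.0 + sigma**2) ** 2
    # actor: tracks the critic continuously (no policy-iteration steps)
    w_a -= dt * alpha_a * (w_a - w_c)

    x += dt * x_dot

print(f"learned weight {w_c:.4f}  vs  Riccati solution {P_star:.4f}")
```

The normalised gradient step on the Bellman residual plays the role of the critic update, and the actor simply tracks the critic, mirroring the simultaneous (non-hybrid) structure described in the abstract; the paper's sliding-mode modification would additionally provide finite-time convergence, which this plain gradient rule does not.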
Original language: English
Title of host publication: 58th Conference on Decision and Control
DOIs
Publication status: Published - 12 Mar 2020
Event: 58th IEEE Conference on Decision and Control - Nice, France
Duration: 11 Dec 2019 - 13 Dec 2019

Conference

Conference: 58th IEEE Conference on Decision and Control
Country/Territory: France
City: Nice
Period: 11/12/19 - 13/12/19
