Reinforcement learning and optimal adaptive control: an overview and implementation examples

Said G. Khan, Guido Herrmann, Frank L. Lewis, Tony Pipe, Chris Melhuish

    Research output: Contribution to journal > Article > peer-review


    This paper provides an overview of the reinforcement learning and optimal adaptive control literature and its application to robotics. Reinforcement learning bridges the gap between traditional optimal control, adaptive control and bio-inspired learning techniques borrowed from animals. This work highlights some of the key techniques presented by well-known researchers from the combined areas of reinforcement learning and optimal control theory. At the end, an example of an implementation of a novel model-free Q-learning-based discrete optimal adaptive controller for a humanoid robot arm is presented. The controller uses a novel adaptive dynamic programming (ADP) reinforcement learning (RL) approach to develop an optimal policy on-line. The RL joint space tracking controller was implemented for two links (shoulder flexion and elbow flexion joints) of the arm of the humanoid Bristol-Elumotion-Robotic-Torso II (BERT II) torso. The constrained case (joint limits) of the RL scheme was tested for a single link (elbow flexion) of the BERT II arm by modifying the cost function to deal with the extra nonlinearity due to the joint constraints.
    Original language: English
    Pages (from-to): 42-59
    Number of pages: 18
    Journal: Annual Reviews in Control
    Issue number: 1
    Publication status: Published - Apr 2012


    • Reinforcement learning
    • ADP
    • Q-learning
    • Optimal adaptive control

