Learning and Optimization in Hierarchical Adaptive Critic Design

Document Type


Date of Original Version



This chapter introduces a novel hierarchical adaptive critic design to improve learning and optimization over time. Specifically, we propose integrating a hierarchical goal generator network that gives the learning system a more informative and detailed goal representation to guide its decision making. The motivation for this idea is twofold. First, instead of using a typical binary reinforcement signal (e.g., 0 or 1) to represent "success" or "failure" of the system, we propose a more informative reinforcement signal representation so that the intelligent system can make better choices of actions. Second, to mimic certain levels of brain-like intelligence, we consider it important to introduce a multilevel goal representation into the adaptive critic design to guide the system's decision making toward accomplishing the long-term goal over time. We present the detailed system architecture, the learning and adaptation procedure, and a case study of the ball-and-beam system to demonstrate the learning and control capability of this approach. © 2013 The Institute of Electrical and Electronics Engineers, Inc.

Publication Title

Reinforcement Learning and Approximate Dynamic Programming for Feedback Control