Learning and Optimization in Hierarchical Adaptive Critic Design
This chapter introduces a novel hierarchical adaptive critic design to improve learning and optimization over time. Specifically, we propose integrating a hierarchical goal generator network that gives the learning system a more informative and detailed goal representation to guide its decision making. The motivation for this idea is twofold. First, instead of using a typical binary reinforcement signal (e.g., 0 or 1) to represent "success" or "failure" of the system, we propose a more informative reinforcement signal representation that allows the intelligent system to make better choices of actions. Second, to mimic certain levels of brain-like intelligence, we consider it important to introduce a multilevel goal representation into the adaptive critic design to guide the system's decision making toward accomplishing the long-term goal over time. We present the detailed system architecture, the learning and adaptation procedure, and a case study of the ball-and-beam system to demonstrate the learning and control capability of this approach. © 2013 The Institute of Electrical and Electronics Engineers, Inc.
Reinforcement Learning and Approximate Dynamic Programming for Feedback Control
He, Haibo, Zhen Ni, and Dongbin Zhao. "Learning and Optimization in Hierarchical Adaptive Critic Design." Reinforcement Learning and Approximate Dynamic Programming for Feedback Control (2013): 78-97. doi:10.1002/9781118453988.ch4.