Learning and Optimization in Hierarchical Adaptive Critic Design
Document Type
Article
Date of Original Version
2-7-2013
Abstract
This chapter introduces a novel hierarchical adaptive critic design to improve learning and optimization over time. Specifically, we propose integrating a hierarchical goal generator network that provides the learning system with a more informative and detailed goal representation to guide its decision making. The motivation for this idea is twofold. First, instead of using a typical binary reinforcement signal (e.g., 0 or 1) to represent "success" or "failure" of the system, we propose a more informative reinforcement signal representation that lets the intelligent system make better choices of actions. Second, in order to mimic certain levels of brain-like intelligence, we consider it important to introduce a multilevel goal representation into the adaptive critic design to guide the system's decision making toward accomplishing the long-term goal over time. We present the detailed system architecture, the learning and adaptation procedure, and a case study of the ball-and-beam system to demonstrate the learning and control capability of this approach. © 2013 The Institute of Electrical and Electronics Engineers, Inc.
Publication Title
Reinforcement Learning and Approximate Dynamic Programming for Feedback Control
Citation/Publisher Attribution
He, Haibo, Zhen Ni, and Dongbin Zhao. "Learning and Optimization in Hierarchical Adaptive Critic Design." Reinforcement Learning and Approximate Dynamic Programming for Feedback Control (2013): 78-97. doi: 10.1002/9781118453988.ch4.