Adaptive dynamic programming with balanced weights seeking strategy
Document Type
Conference Proceeding
Date of Original Version
9-5-2011
Abstract
In this paper, we propose integrating the recursive Levenberg-Marquardt (LM) method into the adaptive dynamic programming (ADP) design to improve learning and adaptive control performance. Our key motivation is a balanced weight-updating strategy that accounts for both robustness and convergence during the online learning process. Specifically, a modified recursive LM method is integrated into both the action network and the critic network of the ADP design, and a detailed learning algorithm is proposed to implement this approach. We test our approach on the triple-link inverted pendulum, a popular benchmark in the community, to demonstrate the online learning and control strategy. Experimental results and a comparative study under different noise conditions demonstrate the effectiveness of this approach. © 2011 IEEE.
Publication Title
IEEE SSCI 2011: Symposium Series on Computational Intelligence - ADPRL 2011: 2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning
Citation/Publisher Attribution
Fu, Jian, Haibo He, and Zhen Ni. "Adaptive dynamic programming with balanced weights seeking strategy." IEEE SSCI 2011: Symposium Series on Computational Intelligence - ADPRL 2011: 2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (2011): 210-217. doi: 10.1109/ADPRL.2011.5967373.