Data-driven partially observable dynamic processes using adaptive dynamic programming

Document Type

Conference Proceeding

Date of Original Version



Adaptive dynamic programming (ADP) has been widely recognized as one of the "core methodologies" for achieving optimal control of intelligent systems in Markov decision processes (MDPs). ADP control design generally requires complete information about the system dynamics. In many practical situations, however, the measured input and output data represent only part of the system state, so complete information about the system is unavailable, which narrows the range of applications of ADP design. In this paper, we propose a data-driven ADP method, based on neural network techniques, to stabilize systems with partially observable dynamics. A state network is integrated into the typical actor-critic architecture to provide an estimated state from the measured input/output sequences. A theoretical analysis and a stability discussion of this data-driven ADP method are also provided. Two examples are studied to verify the proposed method.
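
The data flow described in the abstract (measured input/output sequences, then a state network, then the actor and critic) can be illustrated with a minimal sketch. This is not the authors' implementation: the network sizes, the window length, the untrained random weights, the choice to feed the critic both the estimated state and the control, and all names (`mlp`, `forward`, `control_step`) are assumptions made purely for illustration, and no training rule is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(in_dim, hidden_dim, out_dim):
    """Random-weight one-hidden-layer network (illustrative, untrained)."""
    return {
        "W1": rng.normal(scale=0.1, size=(hidden_dim, in_dim)),
        "W2": rng.normal(scale=0.1, size=(out_dim, hidden_dim)),
    }

def forward(net, x):
    """Forward pass: tanh hidden layer followed by a linear output layer."""
    return net["W2"] @ np.tanh(net["W1"] @ x)

# Hypothetical sizes: 2 hidden states, 1 input, 1 output, 3-step I/O window.
n_state, n_in, n_out, window = 2, 1, 1, 3

# State network: maps the stacked I/O history to an estimated state.
state_net = mlp(window * (n_in + n_out), 8, n_state)
# Actor: maps the estimated state to a control action.
actor_net = mlp(n_state, 8, n_in)
# Critic: maps (estimated state, control) to a scalar cost estimate (assumed input choice).
critic_net = mlp(n_state + n_in, 8, 1)

def control_step(u_hist, y_hist):
    """One pass: I/O history -> state estimate -> control -> cost estimate."""
    history = np.concatenate([np.ravel(u_hist), np.ravel(y_hist)])
    x_hat = forward(state_net, history)
    u = forward(actor_net, x_hat)
    j_hat = forward(critic_net, np.concatenate([x_hat, u]))
    return x_hat, u, j_hat

# Example call with a made-up history of past inputs and outputs.
u_hist = np.zeros((window, n_in))
y_hist = np.ones((window, n_out))
x_hat, u, j_hat = control_step(u_hist, y_hist)
```

The point of the sketch is only that the actor and critic never see the true state: both consume the state network's estimate, which is built from measured inputs and outputs alone.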

Publication Title, e.g., Journal

IEEE SSCI 2014 - 2014 IEEE Symposium Series on Computational Intelligence - ADPRL 2014: 2014 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, Proceedings