Electrical, Computer, and Biomedical Engineering Faculty Publications

Experimental studies on data-driven heuristic dynamic programming for POMDP

Zhen Ni
Haibo He
Xiangnan Zhong

Document Type

Article

Date of Original Version

8-13-2014

Abstract

Adaptive dynamic programming (ADP) has been a popular approach for seeking the optimal control strategy in Markov decision processes (MDPs). Generally, this type of approach requires a complete set of system information/states to achieve online optimal decision-making. However, the full system information/states are usually not available in practical situations. In many cases, the measured input/output data represent only part of the system information, and the internal system states are not available. In this chapter, we investigate a data-driven heuristic dynamic programming (HDP) architecture to tackle the partially observed Markov decision process (POMDP). Specifically, we include a state estimator neural network to recover the full system information for the action network, so that the optimal control policy can still be achieved in a partially observed environment. We randomly initialize the weights of the state estimator network and conduct online learning for the entire process. Both discrete-time and continuous-time system functions are tested. Simulation results and system trajectories demonstrate the control performance of our proposed approach.

Publication Title, e.g., Journal

Frontiers of Intelligent Control and Information Processing

Citation/Publisher Attribution

Ni, Zhen, Haibo He, and Xiangnan Zhong. "Experimental studies on data-driven heuristic dynamic programming for POMDP." Frontiers of Intelligent Control and Information Processing (2014): 83-106. doi: 10.1142/9789814616881_0003.

DOI

https://doi.org/10.1142/9789814616881_0003
