Invariant Adaptive Dynamic Programming for Discrete-Time Optimal Control
Document Type
Article
Date of Original Version
11-1-2020
Abstract
For systems that can only be locally stabilized, both the control law and its effective region are important. In this paper, invariant policy iteration is proposed to solve the optimal control problem for discrete-time systems. At each iteration, the current policy is evaluated over its invariantly admissible region, and a new policy together with a new region is computed for the next iteration. Theoretical analysis shows that the method converges regionally to the optimal value function and the optimal policy. Combined with sum-of-squares polynomials, the method achieves near-optimal control for a class of discrete-time systems. An invariant adaptive dynamic programming algorithm is developed to extend the method to scenarios where the system dynamics are unavailable. Online data are utilized to learn the near-optimal policy and the invariantly admissible region. Simulation experiments verify the effectiveness of our method.
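The evaluate/improve loop described in the abstract can be illustrated in its simplest setting. The sketch below is not the paper's algorithm: it shows plain policy iteration for the linear-quadratic special case, where every stabilizing policy is globally admissible, so the invariantly admissible region (which the paper tracks with sum-of-squares certificates for nonlinear systems) degenerates to the whole state space. All matrices, the initial gain, and the tolerances are illustrative assumptions.

```python
# Minimal sketch: policy iteration for a discrete-time LQR problem,
# the linear special case of the evaluate/improve loop in the abstract.
# The invariantly admissible region is global here, so region updates
# are omitted; the paper's method additionally maintains that region.
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

A = np.array([[1.0, 0.1],
              [0.0, 1.0]])          # assumed dynamics x_{k+1} = A x_k + B u_k
B = np.array([[0.0],
              [0.1]])
Q = np.eye(2)                       # assumed stage cost x'Qx + u'Ru
R = np.array([[1.0]])

K = np.array([[1.0, 2.0]])          # assumed initial stabilizing gain, u = -Kx

for _ in range(50):
    Ac = A - B @ K                  # closed-loop dynamics under the current policy
    # Policy evaluation: solve P = Ac' P Ac + Q + K' R K, so V(x) = x' P x
    P = solve_discrete_lyapunov(Ac.T, Q + K.T @ R @ K)
    # Policy improvement: K_new = (R + B' P B)^{-1} B' P A
    K_new = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    if np.max(np.abs(K_new - K)) < 1e-10:
        K = K_new
        break
    K = K_new

print("near-optimal gain K =", K)
print("value matrix P =\n", P)
```

Each pass mirrors the abstract's two steps: the Lyapunov solve is the policy-evaluation step, and the gain update is the policy-improvement step; for nonlinear, locally stabilizable systems the same loop must additionally certify and update the region on which the new policy remains admissible.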
Publication Title
IEEE Transactions on Systems, Man, and Cybernetics: Systems
Volume
50
Issue
11
Citation/Publisher Attribution
Zhu, Yuanheng, Dongbin Zhao, and Haibo He. "Invariant Adaptive Dynamic Programming for Discrete-Time Optimal Control." IEEE Transactions on Systems, Man, and Cybernetics: Systems 50, no. 11 (2020): 3959-3971. doi: 10.1109/TSMC.2019.2911900.