Goal representation adaptive dynamic programming for machine intelligence
This dissertation presents a new general-purpose framework for machine intelligence based on adaptive dynamic programming (ADP) design. The research is important for developing self-adaptive intelligent systems that are robust and fault-tolerant in uncertain and unstructured environments. Building truly self-adaptive systems generally requires two key components: a fundamental understanding of brain intelligence, and complex engineering design. This dissertation focuses on general-purpose computational intelligence methodologies from a biologically inspired perspective, and develops a new machine intelligence system that learns by itself online over time. The new approach is also explored in a wide range of critical engineering applications.
Specifically, a new framework named "goal representation adaptive dynamic programming (GrADP)" is proposed in this dissertation. It is regarded as a foundation for building intelligent systems through internal reward learning, goal representation, and state-action association. Unlike the traditional ADP design, which has an action network and a critic network, the new approach integrates an additional network, called the reference (or goal) network, to build a general internal reinforcement signal. Unlike a traditional fixed or predefined reinforcement learning signal, the new design adaptively updates the internal reinforcement representation over time and thus facilitates the system's learning and optimization toward its ultimate goals.
The original contribution of this research is to integrate an adaptive goal representation design into the ADP framework, rather than relying on the hand-crafted reward functions found in the literature. This is the first time that the reward signal has been represented as a general mapping function built from observations of system variables over time. It is also an important step toward a general-purpose self-adaptive learning system based on ADP designs. The ADP family has three major categories: heuristic dynamic programming (HDP), dual heuristic dynamic programming (DHP), and globalized dual heuristic dynamic programming (GDHP). In this research, the goal representation principle has been integrated into each design and verified with promising optimization and learning results. To this end, goal representation heuristic dynamic programming (GrHDP), goal representation dual heuristic dynamic programming (GrDHP), and goal representation globalized dual heuristic dynamic programming (Gr-GDHP) are proposed and developed as a new GrADP family. GrADP approaches are studied on problems ranging from toy examples to real-world applications, in comparison with several classical control and reinforcement learning approaches. Rigorous mathematical analysis is also provided to address convergence and boundedness, which gives the theoretical stability assurance for the new integrated design. In summary, this is the first time that the GrADP design framework has been proposed and described explicitly together with its family members. Numerical simulations, engineering applications, and theoretical results are provided to study each new architecture from different viewpoints.
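As a rough illustration of the three-network arrangement described in the abstract, the following Python sketch wires an action network, a goal (reference) network, and a critic network into one forward pass. It is only a minimal sketch under stated assumptions: the names `GrADPSketch`, `make_layer`, and `forward`, the single tanh layer per network, the dimensions, and the exact choice of inputs to each network are hypothetical and are not taken from the dissertation. No learning rule is shown.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Return a random weight matrix with n_out rows and n_in columns."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]


def forward(weights, inputs):
    """Apply one tanh layer: each output is tanh of a weighted input sum."""
    return [math.tanh(sum(w * x for w, x in zip(row, inputs))) for row in weights]


class GrADPSketch:
    """Hypothetical forward pass through action, goal, and critic networks."""

    def __init__(self, state_dim, action_dim, seed=0):
        rng = random.Random(seed)
        # Action network: state -> action.
        self.action_net = make_layer(state_dim, action_dim, rng)
        # Goal network (assumed inputs): state and action -> internal
        # reinforcement signal, replacing a fixed hand-crafted reward.
        self.goal_net = make_layer(state_dim + action_dim, 1, rng)
        # Critic network (assumed inputs): state, action, and the internal
        # reinforcement signal -> value estimate.
        self.critic_net = make_layer(state_dim + action_dim + 1, 1, rng)

    def step(self, state):
        action = forward(self.action_net, state)
        internal_signal = forward(self.goal_net, state + action)
        value = forward(self.critic_net, state + action + internal_signal)
        return action, internal_signal[0], value[0]


agent = GrADPSketch(state_dim=4, action_dim=1)
action, internal_signal, value = agent.step([0.1, -0.2, 0.05, 0.0])
```

In this sketch the critic never sees an external reward directly; it only sees the signal produced by the goal network, which is the structural difference from a two-network ADP design that the abstract emphasizes.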
Electrical engineering|Computer science
Dissertations and Master's Theses (Campus Access).