Date of Award
2023
Degree Type
Dissertation
Degree Name
Doctor of Philosophy in Electrical Engineering
Department
Electrical, Computer, and Biomedical Engineering
First Advisor
Haibo He
Abstract
Reinforcement Learning (RL) has achieved remarkable success in recent years in solving complicated control and sequential decision-making tasks, such as board games, video games, robot control, dynamic energy management, and autonomous driving. In these tasks, RL has demonstrated unprecedented adaptability and self-learning capability in unknown and stochastic environments. However, existing RL algorithms still fall significantly short of human intelligence in areas such as learning stability, sample efficiency, safety, scalability, and collaboration with other agents. These shortcomings impede the widespread application of RL in real-life domains.
To address these challenges, this dissertation proposes novel theories and algorithms for more effective and efficient RL and develops their applications in real-world domains. The work consists of three major parts. First, the dissertation establishes a novel theoretical result for trust-region-based RL methods by analyzing the lower bound of policy performance. A closed-form policy update rule is then derived from this result, providing a monotonic improvement guarantee. Second, inspired by the closed-form update rule, the dissertation develops an off-policy RL algorithm for continuous control problems that enables sample-efficient learning of deep neural network policies. Additionally, trust region policy optimization is extended to cooperative multi-agent systems through consensus optimization, resulting in a distributed multi-agent RL (MARL) algorithm. Third, the dissertation applies a trust-region-based safe RL method and the MARL method to real-world problems in smart grids. Numerous experiments on benchmark environments, including robotics, strategic games, and power systems, verify the effectiveness of these methods.
Recommended Citation
Li, Hepeng, "TRUST-REGION BASED POLICY OPTIMIZATION FOR EFFICIENT REINFORCEMENT LEARNING" (2023). Open Access Dissertations. Paper 1559.
https://digitalcommons.uri.edu/oa_diss/1559
Terms of Use
All rights reserved under copyright.