Date of Award

2023

Degree Type

Dissertation

Degree Name

Doctor of Philosophy in Electrical Engineering

Department

Electrical, Computer, and Biomedical Engineering

First Advisor

Haibo He

Abstract

Reinforcement Learning (RL) has achieved remarkable success in recent years on complicated control and sequential decision-making tasks such as board games, video games, robot control, dynamic energy management, and autonomous driving. In these tasks, RL has demonstrated unprecedented adaptability and self-learning capability in unknown and stochastic environments. However, existing RL algorithms still fall well short of human intelligence in learning stability, sample efficiency, safety, scalability, and collaboration with other agents. These shortcomings impede the widespread application of RL in real-life domains.

To address these challenges, this dissertation proposes novel theories and algorithms for more effective and efficient RL and develops their applications in real-world domains. The work consists of three major parts. First, the dissertation establishes a novel theoretical result for trust-region-based RL methods by analyzing the lower bound of policy performance. A closed-form policy update rule is then derived from this result, providing a monotonic improvement guarantee. Second, the dissertation develops an off-policy RL algorithm for continuous control problems, inspired by the closed-form update rule. This algorithm enables sample-efficient learning of deep neural network policies. Additionally, trust region policy optimization is extended to cooperative multi-agent systems through consensus optimization, resulting in a distributed multi-agent RL (MARL) algorithm. Third, the dissertation applies a trust-region-based safe RL method and the MARL method to real-world problems in smart grids. Numerous experiments on various benchmark environments, including robotics, strategic games, and power systems, verify the effectiveness of these methods.

Recommended Citation

Li, Hepfeng, "TRUST-REGION BASED POLICY OPTIMIZATION FOR EFFICIENT REINFORCEMENT LEARNING" (2023). Open Access Dissertations. Paper 1559.
https://digitalcommons.uri.edu/oa_diss/1559

DOI

https://doi.org/10.23860/diss-li-hepfeng-2023

Open Access Dissertations

TRUST-REGION BASED POLICY OPTIMIZATION FOR EFFICIENT REINFORCEMENT LEARNING
