An Improved Trust-Region Method for Off-Policy Deep Reinforcement Learning
Document Type
Conference Proceeding
Date of Original Version
1-1-2023
Abstract
Reinforcement learning (RL) is a powerful tool for training agents to interact with complex environments. In particular, trust-region methods are widely used for policy optimization in model-free RL. However, these methods suffer from high sample complexity due to their on-policy nature, which requires fresh interactions with the environment for each update. To address this issue, off-policy trust-region methods have been proposed, but they have shown limited success in high-dimensional continuous control problems compared to other off-policy deep reinforcement learning (DRL) methods. To improve the performance and sample efficiency of trust-region policy optimization, we propose an off-policy trust-region RL algorithm. Our algorithm is based on a theoretical result giving a closed-form solution to trust-region policy optimization and is effective in optimizing complex nonlinear policies. We demonstrate the superiority of our algorithm over prior trust-region DRL methods and show that it achieves excellent performance on a range of continuous control tasks in the Multi-Joint dynamics with Contact (MuJoCo) environment, comparable to state-of-the-art off-policy algorithms.
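The abstract does not state the paper's closed-form result, but in the trust-region literature the KL-constrained objective is known to admit the exponentiated-advantage solution $\pi'(a\mid s) \propto \pi(a\mid s)\,\exp(A(s,a)/\eta)$, where $\eta$ is the temperature induced by the trust-region constraint. A minimal sketch of this generic update for a discrete action space (the function name, $\eta$ value, and advantage estimates are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def trust_region_update(pi, advantages, eta):
    """Generic KL-regularized closed-form policy update:
        pi'(a|s) ∝ pi(a|s) * exp(A(s,a) / eta).
    A larger eta keeps pi' closer to pi, i.e. a tighter trust region.
    Note: this is the textbook form, not necessarily the paper's exact result."""
    logits = np.log(pi) + advantages / eta
    logits -= logits.max()          # subtract max for numerical stability
    new_pi = np.exp(logits)
    return new_pi / new_pi.sum()    # renormalize to a valid distribution

# Illustrative old policy over 3 discrete actions and advantage estimates.
pi = np.array([0.5, 0.3, 0.2])
adv = np.array([1.0, 0.0, -1.0])
print(trust_region_update(pi, adv, eta=1.0))
```

Probability mass shifts toward the high-advantage action while remaining close (in KL) to the old policy; as `eta` grows, the update shrinks toward the original distribution.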
Publication Title, e.g., Journal
Proceedings of the International Joint Conference on Neural Networks
Volume
2023-June
Citation/Publisher Attribution
Li, Hepeng, Xiangnan Zhong, and Haibo He. "An Improved Trust-Region Method for Off-Policy Deep Reinforcement Learning." Proceedings of the International Joint Conference on Neural Networks 2023-June, (2023). doi: 10.1109/IJCNN54540.2023.10191837.