Master of Science in Electrical Engineering (MSEE)
Electrical, Computer, and Biomedical Engineering
In the current reinforcement learning (RL) taxonomy, few algorithms for discrete action environments are capable of learning stochastic policies in an off-policy manner. Learning stochastic policies brings benefits such as stable training and smoother exploration strategies. Training an algorithm in an off-policy manner allows for greater sample efficiency, because experiences collected while interacting with a learning environment can be used more than once. Stable performance and good sample efficiency are especially important when collecting experiences from a learning environment is expensive. This thesis proposes a new algorithm for discrete action RL called Discrete General Policy Optimization (Discrete GPO) that has both of the above characteristics. The algorithm is designed following recent theoretical developments in trust region policy optimization techniques. The performance of Discrete GPO is tested in different simulated learning environments, and a comparison to other state-of-the-art methods is provided.
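The sample-efficiency argument above, that experiences collected once can be reused across many updates, is commonly realized with an experience replay buffer in off-policy methods. The following is a minimal Python sketch of that idea only; the class and method names are illustrative and are not taken from the thesis:

```python
import random
from collections import deque

class ReplayBuffer:
    """Hypothetical minimal replay buffer illustrating off-policy sample reuse."""

    def __init__(self, capacity=10000):
        # Bounded deque: oldest transitions are discarded once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        # Store one interaction with the environment as a transition tuple.
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Each call draws a fresh random minibatch, so a stored transition
        # may contribute to many gradient updates, not just one.
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))

# Collect 100 toy transitions once, then reuse them for multiple updates.
buf = ReplayBuffer()
for t in range(100):
    buf.add(t, t % 4, 1.0, t + 1, False)
batch = buf.sample(32)
```

On-policy methods must discard such data after each policy update, which is what makes the off-policy setting more sample efficient when environment interaction is expensive.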
Clavette, Nicholas, "A SAMPLE EFFICIENT OFF-POLICY ACTOR-CRITIC APPROACH FOR DISCRETE ACTION ENVIRONMENTS" (2023). Open Access Master's Theses. Paper 2374.