Electrical, Computer, and Biomedical Engineering Faculty Publications

Parameterized Batch Reinforcement Learning for Longitudinal Control of Autonomous Land Vehicles

Zhenhua Huang, National University of Defense Technology
Xin Xu, National University of Defense Technology
Haibo He, University of Rhode Island
Jun Tan, National University of Defense Technology
Zhenping Sun, National University of Defense Technology

Document Type

Article

Date of Original Version

4-1-2019

Abstract

This paper presents a parameterized batch reinforcement learning algorithm for near-optimal longitudinal control of autonomous land vehicles (ALVs). The proposed approach uses an actor-critic architecture, where parameterized feature vectors based on kernels are learned from collected samples for approximating the value functions and policies. One difference between the parameterized batch actor-critic (PBAC) algorithm and previous actor-critic learning approaches is that the critic and actor in PBAC share the same linear features, which has been theoretically proved to be a beneficial property for the convergence of actor-critic learning approaches. In order to obtain better learning efficiency, least-squares-based batch updating rules are designed for the critic and actor, respectively. Based on the PBAC learning algorithm, a data-driven longitudinal control method is presented for ALVs to obtain near-optimal control policies which adaptively tune the fuel/brake control signals to track different speeds. A multiobjective reward function is designed so that both tracking precision and driving smoothness are considered. Extensive experiments were conducted on a real ALV platform while driving on flat, slippery, sloping, and bumpy roads. The experimental results illustrate the superiority of the PBAC-based self-learning controller over conventional longitudinal control methods such as proportional-integral (PI) control and learning-based PI control.
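The abstract names three ingredients: kernel-based features shared by the critic and actor, least-squares batch updates, and a multiobjective reward covering tracking precision and smoothness. The following is a minimal sketch of how such pieces might fit together, not the authors' PBAC implementation. The Gaussian kernel, the fixed `centers`, the reward weights `w_track` and `w_smooth`, and the use of a standard LSTD(0) critic fit are all assumptions made for illustration; the actor update is omitted.

```python
import numpy as np

def kernel_features(state, centers, width=1.0):
    """Gaussian kernel features of one state against fixed centers.

    The kernel type, centers, and width are hypothetical choices.
    """
    diff = np.asarray(centers, dtype=float) - np.asarray(state, dtype=float)
    return np.exp(-np.sum(diff ** 2, axis=1) / (2.0 * width ** 2))

def reward(speed_error, control_change, w_track=1.0, w_smooth=0.1):
    """Assumed multiobjective reward: penalize squared speed-tracking error
    and squared change in the fuel/brake command (a smoothness proxy)."""
    return -(w_track * speed_error ** 2 + w_smooth * control_change ** 2)

def lstd_critic(samples, centers, gamma=0.9, ridge=1e-6):
    """Batch least-squares TD(0) fit of V(s) ~ phi(s) @ w.

    samples: iterable of (state, reward, next_state) tuples.
    A small ridge term keeps the linear system well conditioned.
    """
    k = len(centers)
    A = ridge * np.eye(k)
    b = np.zeros(k)
    for s, r, s_next in samples:
        phi = kernel_features(s, centers)
        phi_next = kernel_features(s_next, centers)
        A += np.outer(phi, phi - gamma * phi_next)
        b += phi * r
    return np.linalg.solve(A, b)

# Toy usage: one state that loops onto itself with a constant reward.
centers = np.array([[0.0]])
samples = [([0.0], reward(1.0, 0.0), [0.0])] * 5
w = lstd_critic(samples, centers)
value = kernel_features([0.0], centers) @ w
```

In this toy case the fitted value should sit close to the discounted sum r / (1 - gamma). An actor that reuses `kernel_features` would share the critic's linear features, which is the property the abstract highlights.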

Publication Title, e.g., Journal

IEEE Transactions on Systems, Man, and Cybernetics: Systems

Volume

49

Issue

4

Citation/Publisher Attribution

Huang, Zhenhua, Xin Xu, Haibo He, Jun Tan, and Zhenping Sun. "Parameterized Batch Reinforcement Learning for Longitudinal Control of Autonomous Land Vehicles." IEEE Transactions on Systems, Man, and Cybernetics: Systems 49, 4 (2019): 730-741. doi: 10.1109/TSMC.2017.2712561.

DOI

https://doi.org/10.1109/TSMC.2017.2712561
