Electrical, Computer, and Biomedical Engineering Faculty Publications

A hybrid evolving and gradient strategy for approximating policy evaluation on online critic-actor learning

Jian Fu, Wuhan University of Technology
Haibo He, University of Rhode Island
Huiying Li, Wuhan University of Technology
Qing Liu, Wuhan University of Technology

Document Type

Conference Proceeding

Date of Original Version

8-23-2012

Abstract

In this paper, we propose a novel strategy for approximating policy evaluation during the online critic-actor learning procedure. In the early stage, we adopt adaptive differential evolution with elites (ADEE), which is good at global search, to optimize one-step moving least-squares temporal difference (MLSTD(0)). We then apply a gradient method to perform local search efficiently and effectively. This resolves the dilemma between exploration and exploitation in seeking the weights of the critic neural network. Simulation results on the online learning control of a cart-pole benchmark demonstrate the efficiency of the presented method. © 2012 Springer-Verlag.
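
The two-stage idea in the abstract (global evolutionary search first, then gradient-based local refinement of the critic weights) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify ADEE or MLSTD(0), so the sketch substitutes plain DE/rand/1/bin differential evolution and a mean squared one-step TD error for a linear critic. The discount factor, the synthetic samples, and all function names (`td_loss`, `de_stage`, `gradient_stage`, `make_samples`) are assumptions made for illustration only.

```python
import random

GAMMA = 0.95  # discount factor; an assumed value, not taken from the paper


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def td_loss(w, samples):
    """Mean squared one-step TD error of a linear critic V(s) = w . phi(s)."""
    return sum((r + GAMMA * dot(w, p2) - dot(w, p1)) ** 2
               for p1, r, p2 in samples) / len(samples)


def td_grad(w, samples):
    """Exact gradient of td_loss with respect to w."""
    grad = [0.0] * len(w)
    for p1, r, p2 in samples:
        delta = r + GAMMA * dot(w, p2) - dot(w, p1)
        for k in range(len(w)):
            grad[k] += 2.0 * delta * (GAMMA * p2[k] - p1[k]) / len(samples)
    return grad


def de_stage(samples, dim, rng, pop_size=20, generations=30, f=0.5, cr=0.9):
    """Global search: plain DE/rand/1/bin with greedy selection (not ADEE)."""
    pop = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(pop_size)]
    cost = [td_loss(w, samples) for w in pop]
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = [pop[a][k] + f * (pop[b][k] - pop[c][k])
                     if (rng.random() < cr or k == forced) else pop[i][k]
                     for k in range(dim)]
            trial_cost = td_loss(trial, samples)
            if trial_cost <= cost[i]:  # keep the trial only if it is no worse
                pop[i], cost[i] = trial, trial_cost
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best]


def gradient_stage(w, samples, lr=0.05, steps=500):
    """Local search: fixed-step gradient descent starting from the DE result."""
    w = list(w)
    for _ in range(steps):
        g = td_grad(w, samples)
        w = [wk - lr * gk for wk, gk in zip(w, g)]
    return w


def make_samples(rng, w_true, n=50):
    """Synthetic (phi, reward, phi_next) triples with zero TD error at w_true."""
    samples = []
    for _ in range(n):
        p1 = [rng.uniform(-1.0, 1.0) for _ in w_true]
        p2 = [rng.uniform(-1.0, 1.0) for _ in w_true]
        reward = dot(w_true, p1) - GAMMA * dot(w_true, p2)
        samples.append((p1, reward, p2))
    return samples


rng = random.Random(0)
w_true = [1.0, -2.0, 0.5]  # hypothetical target weights for the demo
samples = make_samples(rng, w_true)
w_de = de_stage(samples, len(w_true), rng)   # stage 1: explore
w_final = gradient_stage(w_de, samples)      # stage 2: exploit
print(td_loss(w_de, samples), td_loss(w_final, samples))
```

In this toy setting the loss is a convex quadratic, so the gradient stage alone would suffice; the hybrid is meant for cases such as a neural-network critic, where the evolutionary stage helps avoid poor local minima before gradient descent takes over.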

Publication Title, e.g., Journal

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Volume

7367 LNCS

Issue

PART 1

Citation/Publisher Attribution

Fu, Jian, Haibo He, Huiying Li, and Qing Liu. "A hybrid evolving and gradient strategy for approximating policy evaluation on online critic-actor learning." Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 7367 LNCS, PART 1 (2012): 555-564. doi: 10.1007/978-3-642-31346-2_62.

DOI

https://doi.org/10.1007/978-3-642-31346-2_62