Learning to navigate through complex dynamic environment with modular deep reinforcement learning

Document Type

Article

Date of Original Version

12-1-2018

Abstract

In this paper, we propose an end-to-end modular reinforcement learning architecture for a navigation task in complex dynamic environments with rapidly moving obstacles. In this architecture, the main task is divided into two subtasks: local obstacle avoidance and global navigation. For obstacle avoidance, we develop a two-stream Q-network, which processes spatial and temporal information separately and generates action values. The global navigation subtask is resolved by a conventional Q-network framework. An online learning network and an action scheduler are introduced to first combine two pretrained policies, and then continue exploring and optimizing until a stable policy is obtained. The two-stream Q-network obtains better performance than the conventional deep Q-learning approach in the obstacle avoidance subtask. Experiments on the main task demonstrate that the proposed architecture can efficiently avoid moving obstacles and complete the navigation task at a high success rate. The modular architecture enables parallel training and also demonstrates good generalization capability in different environments. 2017 IEEE.

Publication Title, e.g., Journal

IEEE Transactions on Games

Volume

10

Issue

4

Share

COinS