In recent years, deep reinforcement learning methods have achieved impressive performance in many different fields, including playing games, robotics, and dialogue systems. However, there are still a lot of restrictions… Click to show full abstract
In recent years, deep reinforcement learning methods have achieved impressive performance in many different fields, including playing games, robotics, and dialogue systems. However, there are still a lot of restrictions here, one of which is the demand for massive amounts of sampled data. In this paper, a hierarchical meta-learning method based on the actor-critic algorithm is proposed for sample efficient learning. This method provides the transferable knowledge that can efficiently train an actor on a new task with a few trials. Specifically, a global basic critic, meta critic, and task specified network are shared within a distribution of tasks and are capable of criticizing any actor trying to solve any specified task. The hierarchical framework is applied to a critic network in the actor-critic algorithm for distilling meta-knowledge above the task level and addressing distinct tasks. The proposed method is evaluated on multiple classic control tasks with reinforcement learning algorithms, including the start-of-the-art meta-learning methods. The experimental results statistically demonstrate that the proposed method achieves state-of-the-art performance and attains better results with more depth of meta critic network.
               
Click one of the above tabs to view related content.