In this article, we address the difficulty of controlling unmanned surface vehicles (USVs) under unforeseeable and unobservable external disturbances using model-based reinforcement learning (MBRL) without prior human knowledge. A novel MBRL approach, filtered probabilistic model predictive control (FPMPC), is proposed to iteratively learn the USV model and an MPC-based policy in a probabilistic way through trial-and-error interactions. Whereas existing MBRL approaches model the unobservable disturbances as system noise, FPMPC introduces a Bayesian filter process that implicitly translates the system dynamics into a partially observed Markov decision process, representing those disturbances as hidden states. An adaptive sample selection scheme is proposed to remove redundant learning samples based on the filter belief. Equipped with bias compensation and parallel computation, an FPMPC system specific to USVs is developed. Evaluated on both position-holding and target-reaching tasks in a simulation driven by real USV data, FPMPC shows significantly better control performance, generalization capability, and sample efficiency than the baseline approaches under large disturbances.
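To make the idea of treating an unobservable disturbance as a hidden state concrete, the following is a minimal illustrative sketch, not the FPMPC filter itself, whose form the abstract does not specify. It assumes a toy one-dimensional system in which the position advances by the control input plus an unknown constant drift, and it uses a scalar Kalman filter (one simple kind of Bayesian filter) to maintain a Gaussian belief over that drift. The function name, the dynamics, and the noise parameters `q` and `r` are all hypothetical choices made for illustration.

```python
import random


def estimate_hidden_disturbance(true_disturbance=0.5, noise_std=0.1,
                                steps=200, q=1e-4, r=1e-2, seed=0):
    """Track an unobserved additive disturbance with a scalar Kalman filter.

    Assumed toy dynamics (hypothetical, not from the article):
        x_next = x + u + d + noise
    Only x and u are observed; d is the hidden state being estimated.
    """
    rng = random.Random(seed)
    x = 0.0
    d_mean, d_var = 0.0, 1.0  # Gaussian belief over the hidden disturbance

    for _ in range(steps):
        u = rng.uniform(-1.0, 1.0)  # arbitrary control input
        x_next = x + u + true_disturbance + rng.gauss(0.0, noise_std)

        # Predict: the disturbance is modeled as a slow random walk,
        # so the belief variance grows by q each step.
        d_var += q

        # Update: the one-step residual is a noisy observation of d.
        residual = x_next - x - u
        gain = d_var / (d_var + r)
        d_mean += gain * (residual - d_mean)
        d_var *= 1.0 - gain

        x = x_next

    return d_mean, d_var


if __name__ == "__main__":
    mean, var = estimate_hidden_disturbance()
    print(f"estimated disturbance: {mean:.3f}, belief variance: {var:.5f}")
```

In this sketch the belief variance shrinks as observations accumulate. A sample selection rule based on the filter belief could, under similar assumptions, use such a quantity to decide which samples are redundant, although the article's actual criterion is not given here.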
               