Collision-free trajectory planning is a critical technique for space robot missions. In this letter, we develop a model-free Hierarchical Decoupling Optimization (HDO) algorithm to realize 6D-pose multi-target trajectory planning for the free-floating space robot. To reduce the complexity of exploration, the system consists of two layers: the high-level policy plans a collision-free trajectory for the end-effector’s pose, and the low-level policy divides the task of reaching an arbitrary pose into two decoupled sub-tasks (position and orientation) within a large target space. By introducing Hindsight Experience Replay (HER), we train two agents based on multi-goal reinforcement learning. We propose an Event-based Alternating Optimization (EAO) scheme to stabilize training and efficiently approximate the optimal policy. Theoretical analysis shows that EAO guarantees learning stability and reachability of the equilibrium point. Simulation results illustrate that the proposed algorithm achieves high environmental adaptability and anti-disturbance capacity. Furthermore, we demonstrate the proposed method in a practical space mission by applying it to capture a target satellite. Qualitative results (videos) are available at 1.
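For readers unfamiliar with HER, the following is a minimal sketch of goal relabeling for a position-reaching sub-task, not the implementation used in this letter. It assumes the generic "final" relabeling strategy and a sparse 0/-1 reward; the transition layout, the names `sparse_reward` and `her_relabel_final`, and the 0.05 tolerance are all hypothetical choices made for illustration.

```python
import math

def sparse_reward(achieved, goal, tol=0.05):
    """Return 0.0 if `achieved` lies within `tol` of `goal`, else -1.0 (assumed reward shape)."""
    return 0.0 if math.dist(achieved, goal) <= tol else -1.0

def her_relabel_final(episode, tol=0.05):
    """Copy an episode, replacing each goal with the position reached at the episode's end.

    A failed rollout then still yields at least one successful transition,
    which is the core idea of Hindsight Experience Replay.
    """
    new_goal = episode[-1]["achieved_next"]
    relabeled = []
    for step in episode:
        relabeled.append({
            "obs": step["obs"],
            "action": step["action"],
            "achieved_next": step["achieved_next"],
            "goal": new_goal,  # hindsight goal
            "reward": sparse_reward(step["achieved_next"], new_goal, tol),
        })
    return relabeled

# Toy two-step episode (hypothetical end-effector positions) that misses its original goal.
episode = [
    {"obs": (0.0, 0.0, 0.0), "action": (0.1, 0.0, 0.0),
     "achieved_next": (0.1, 0.0, 0.0), "goal": (1.0, 1.0, 1.0)},
    {"obs": (0.1, 0.0, 0.0), "action": (0.1, 0.1, 0.0),
     "achieved_next": (0.2, 0.1, 0.0), "goal": (1.0, 1.0, 1.0)},
]
relabeled = her_relabel_final(episode)
```

In practice the relabeled transitions would be stored in the replay buffer alongside the originals; an orientation sub-task could reuse the same pattern with an angular distance in place of `math.dist`.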
               