An unmanned swarm system (UWS) is a multiagent system that can fulfill task requirements by learning autonomous and cooperative behavior strategies. However, learning instability is inevitable in a dynamic mission setting, as the agents continuously adapt to an evolving mission objective. This article proposes several knowledge enhancement mechanisms to improve the training efficiency and learning stability of a UWS in a confined-space confrontation mission. Specifically, a penalty for exceeding the action-space boundary and a reward for satisfying the space-time distance constraints between agents are introduced as training reward enhancements. In addition, experience sharing among agents is optimized to promote consistent behavior across the swarm. We apply these mechanisms to several representative single-agent and multiagent reinforcement learning algorithms and verify their effectiveness on our proprietary simulation system, SwarmFlow. Simulations show that the proposed mechanisms improve the convergence speed and performance stability of existing algorithms. The improvement is more pronounced for multiagent reinforcement learning algorithms than for single-agent algorithms, with convergence time halved and mission success rates increased by 3–4%.
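
The two reward enhancements named in the abstract can be illustrated with a minimal sketch. It is not the authors' formulation: the function name `shaped_reward`, the weights `boundary_penalty` and `spacing_bonus`, and the thresholds `d_min` and `d_max` are all hypothetical, and the space-time distance constraint is simplified here to a purely spatial pairwise-distance check.

```python
from math import dist


def shaped_reward(base_reward, action, low, high, positions,
                  d_min, d_max, boundary_penalty=1.0, spacing_bonus=0.5):
    """Hypothetical reward shaping; all names and weights are assumptions."""
    reward = base_reward

    # Penalty term: sum how far each action component lies outside
    # its allowed interval [low_i, high_i].
    overshoot = sum(
        max(lo - a, 0.0) + max(a - hi, 0.0)
        for a, lo, hi in zip(action, low, high)
    )
    reward -= boundary_penalty * overshoot

    # Bonus term: granted only when every pair of agents keeps a
    # distance inside [d_min, d_max] (spatial simplification).
    pairs = [
        (p, q)
        for i, p in enumerate(positions)
        for q in positions[i + 1:]
    ]
    if pairs and all(d_min <= dist(p, q) <= d_max for p, q in pairs):
        reward += spacing_bonus

    return reward
```

In this sketch the penalty grows linearly with the size of the boundary violation, while the spacing bonus is all-or-nothing; other shapes, such as a per-pair bonus, would fit the abstract's description equally well.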
               