Considering the natural advantages of deep reinforcement learning algorithms in dealing with continuous control problems, especially those involving dynamic interactions, these algorithms can be applied to solve the Attacker-Defender-Target (ADT) game problem. In this paper, the deep deterministic policy gradient (DDPG) algorithm and the multiagent DDPG algorithm are employed to solve the target-defense problem in the ADT game. By introducing the angle between the attacker-target line of sight and the attacker-defender line, we modify the reward function in the deep reinforcement learning algorithm and redefine the corresponding state space and action space. Several numerical experiments demonstrate the validity of the modified reward function: the modified defender's reward function improves the defender's strategic performance in the game. Compared with traditional differential game theory, the DDPG and multiagent DDPG algorithms with the modified reward function enable real-time decision-making and improve the flexibility of the defender during the confrontation.
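The abstract does not give the exact form of the shaped reward, but the angle-based modification it describes could look roughly like the following minimal Python sketch. The function name defender_reward, the shaping weight, the capture radius, and the terminal bonuses/penalties are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def defender_reward(attacker_pos, defender_pos, target_pos,
                    capture_radius=1.0, angle_weight=0.1):
    """Hypothetical shaped reward for the defender (illustration only).

    theta is the angle between the attacker->target line of sight and the
    attacker->defender line; a small theta means the defender is positioned
    near the attacker's pursuit line, which this sketch rewards.
    """
    a = np.asarray(attacker_pos, dtype=float)
    d = np.asarray(defender_pos, dtype=float)
    t = np.asarray(target_pos, dtype=float)

    at = t - a  # attacker -> target line of sight
    ad = d - a  # attacker -> defender line
    cos_theta = np.dot(at, ad) / (np.linalg.norm(at) * np.linalg.norm(ad) + 1e-8)
    theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))

    reward = -angle_weight * theta  # encourage the defender toward the line of sight
    if np.linalg.norm(d - a) < capture_radius:
        reward += 10.0  # assumed bonus: defender intercepts the attacker
    if np.linalg.norm(a - t) < capture_radius:
        reward -= 10.0  # assumed penalty: attacker reaches the target
    return reward
```

In a DDPG or multiagent DDPG training loop, a shaped reward of this kind would be evaluated at every environment step from the agents' positions, which is what allows the defender's policy to react to the attacker in real time rather than following a precomputed differential-game solution.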