Autonomous maneuvering decisions for unmanned aerial vehicles (UAVs) in short-range air combat remain a challenging research topic, and a decision method based on an improved deep deterministic policy gradient (DDPG) algorithm is proposed. First, the problem model is improved from the perspective of energy air combat, and a decision model with engine thrust, angle of attack, and roll angle as control variables is established. The normal and tangential overloads are determined by these control variables, and the decision is constrained by flight stability and threshold ranges. Subsequently, the maneuver-command decision learning algorithm is designed based on the DDPG framework. Following the energy air-combat perspective, speed is introduced into the reward function in certain states to make the reward value more consistent with reality. To address the slow learning speed of the DDPG algorithm, the winning rate is introduced into the $\varepsilon$-greedy strategy to adjust the exploration and exploitation probabilities in real time. To counter the loss of computational efficiency caused by the large volume of experience data, similar experiences are excluded based on vector distance. The simulation results show that the DDPG-based algorithm realizes autonomous decisions on engine thrust, roll angle, and angle of attack under constraints, and comparative simulations show that the improvement measures are effective.
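The winning-rate-adjusted $\varepsilon$-greedy idea can be illustrated with a minimal sketch. The abstract does not give the exact schedule, so the linear mapping below (`adaptive_epsilon`, with hypothetical bounds `eps_min` and `eps_max`) is an assumption: a low winning rate raises the exploration probability, a high winning rate lowers it in favor of the learned policy.

```python
import random

def adaptive_epsilon(win_rate, eps_min=0.05, eps_max=0.5):
    """Map the current winning rate in [0, 1] to an exploration probability.

    Hypothetical linear schedule (not from the paper): win_rate = 0 gives
    eps_max (explore heavily), win_rate = 1 gives eps_min (mostly exploit).
    """
    return eps_min + (eps_max - eps_min) * (1.0 - win_rate)

def select_action(policy_action, random_action_fn, win_rate):
    """Epsilon-greedy choice between the actor's action and a random one."""
    if random.random() < adaptive_epsilon(win_rate):
        return random_action_fn()  # explore: sample a random maneuver command
    return policy_action           # exploit: use the DDPG actor's output
```

In a training loop, `win_rate` would be updated from a sliding window of recent engagement outcomes, so the exploration probability tracks how well the current policy is actually performing.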
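The vector-distance-based exclusion of similar experiences can likewise be sketched. The paper does not specify the buffer mechanics, so the class below is an assumed design: a new transition is admitted only if its state vector is at least `min_dist` (Euclidean) away from the most recently stored states, which thins out redundant samples before they enter the replay buffer.

```python
import numpy as np

class FilteredReplayBuffer:
    """Replay buffer that rejects transitions too similar to stored ones.

    Hypothetical sketch: `min_dist` and `check_last` are illustrative
    parameters, not values from the paper.
    """

    def __init__(self, capacity=10000, min_dist=0.1, check_last=100):
        self.capacity = capacity
        self.min_dist = min_dist      # minimum Euclidean distance to admit
        self.check_last = check_last  # how many recent states to compare
        self.states = []
        self.transitions = []

    def add(self, state, transition):
        """Store (state, transition) unless a recent state is too close."""
        state = np.asarray(state, dtype=float)
        for s in self.states[-self.check_last:]:
            if np.linalg.norm(state - s) < self.min_dist:
                return False  # too similar to an existing sample: excluded
        if len(self.transitions) >= self.capacity:
            self.states.pop(0)        # evict the oldest entry (FIFO)
            self.transitions.pop(0)
        self.states.append(state)
        self.transitions.append(transition)
        return True
```

Filtering at insertion time keeps the comparison cost bounded by `check_last`, which matters here because the stated motivation is precisely the computational cost of a large experience set.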