Abstract In this paper, a novel synchronous off-policy method is proposed to solve the multi-player zero-sum (ZS) game when, simultaneously, the system dynamics are completely unknown, the control actuators are constrained, and the disturbances are bounded. The cost functions are built from nonquadratic functions to reflect the input constraints. Integral reinforcement learning (IRL) is employed to solve the Hamilton–Jacobi–Bellman equation, so that knowledge of the system dynamics is no longer required. The obtained value function is proved to converge to the optimal game value. The equivalence of the traditional policy iteration (PI) algorithm and the proposed algorithm is established for the multi-player ZS game with constrained inputs. Three neural networks are utilized: the critic neural network (CNN) to approximate the cost function, the action neural network (ANN) to approximate the control policies, and the disturbance neural network (DNN) to approximate the disturbances. Finally, a simulation example is given to demonstrate the convergence and performance of the proposed algorithm.
               