Multiagent reinforcement learning (MARL) has been widely applied to engineering problems. However, strictly constrained problems, such as distributed optimization in engineering applications, remain a great challenge for MARL. In particular, strict global constraints on agents' actions easily lead to sparse rewards. Moreover, existing studies cannot resolve the instability caused by partial observability while keeping the algorithm fully distributed, and algorithms that rely on centralized training may encounter significant obstacles in real-world deployment. For the first time, we provide a theoretical analysis of MARL that characterizes the adverse effects of partial observability on convergence, and we propose a fully distributed and convergent MARL algorithm based on a Reward Recorder. Each agent runs an independent reinforcement learning algorithm and uses an average-consensus protocol to estimate the global state-action value locally, thereby achieving global optimization. To verify the performance of the algorithm, we propose a novel generalized constrained optimization model that includes local inequality constraints and strict global constraints. The proposed distributed reinforcement learning algorithm is validated on several simulation examples. The results reveal that the proposed algorithm has high stability and excellent decision-making ability.
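A generic form of the constrained model the abstract describes might look as follows. This is our reading, not the paper's exact formulation; the local objectives f_i, local constraints g_i, coupling terms h_i, and target d are placeholders.

\begin{align}
  \min_{x_1,\dots,x_n} \quad & \sum_{i=1}^{n} f_i(x_i) \\
  \text{s.t.} \quad & g_i(x_i) \le 0, \quad i = 1,\dots,n \quad \text{(local inequality constraints)} \\
  & \sum_{i=1}^{n} h_i(x_i) = d \quad \text{(strict global constraint)}
\end{align}

Below is a minimal sketch of the average-consensus idea behind the local estimation of a global quantity such as the state-action value, assuming a fixed communication graph with doubly stochastic mixing weights. All names here (q_local, W, n_agents) are illustrative; the paper's Reward Recorder mechanism is not reproduced.

import numpy as np

# Hypothetical sketch: each agent keeps a local estimate of a global
# value and repeatedly averages it with its neighbors' estimates over
# a communication graph with doubly stochastic weight matrix W.

n_agents = 4

# Doubly stochastic weights for a ring graph: each agent mixes its own
# estimate with those of its two neighbors.
W = np.zeros((n_agents, n_agents))
for i in range(n_agents):
    W[i, i] = 0.5
    W[i, (i - 1) % n_agents] = 0.25
    W[i, (i + 1) % n_agents] = 0.25

# Local value estimates observed by each agent for the current joint
# state-action pair (placeholder values).
q_local = np.array([1.0, 3.0, 2.0, 6.0])

# Consensus iterations: q <- W q drives every entry toward the average
# of the initial local values, (1/n) * sum_i q_local[i].
q_hat = q_local.copy()
for _ in range(50):
    q_hat = W @ q_hat

print(q_hat)            # all entries close to q_local.mean() == 3.0
print(q_local.mean())

Because W is doubly stochastic and the graph is connected, every agent's estimate converges to the network-wide average using only neighbor-to-neighbor communication, which is the property that makes a fully distributed scheme (no centralized training) plausible.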