Reinforcement learning (RL) is a branch of artificial intelligence in which a decision-making agent tries to act optimally in an environment by controlling different parameters. Such a method does not require the environmental constraints to be identified and formulated mathematically. Moreover, the RL agent does not need prior information about future outcomes to act optimally in the current situation. However, its performance is adversely affected by environmental complexity, which increases the effort the agent needs to choose the optimal action in a given condition. Integrating RL with linear programming (LP) helps address this problem because it reduces the state-action space that the agent must learn. In this approach, the optimization variables are divided into two categories: experience-dependent variables, which have an inter-time dependency and whose values depend on the agent’s decisions, and experience-independent variables, whose values are determined by the LP model and have no inter-time connection. Numerical results from integrating the two methods demonstrate the hybrid model’s effectiveness in converging to the global optimum with more than 95% accuracy.
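The division of variables described above can be illustrated with a minimal sketch. It is not the authors' model: the storage-dispatch setting, every number, and every name below (`solve_dispatch_lp`, `train`, `greedy_cost`) are hypothetical and chosen only for illustration. A tabular Q-learning agent chooses the one variable with an inter-time dependency (a storage level), while the variables with no inter-time connection (how much to buy from each source in the current step) are fixed by a small LP. Because that LP has a single balance constraint and simple bounds, filling the cheapest source first solves it exactly; a general LP would need a proper solver.

```python
import random

# Hypothetical toy data: 4 time steps, unit demand, two sources per step.
HORIZON = 4
DEMAND = [1, 1, 1, 1]
COSTS = [(1.0, 5.0), (1.0, 5.0), (4.0, 9.0), (4.0, 9.0)]  # unit cost per source
CAPS = (2, 2)          # capacity of each source per step
MAX_LEVEL = 2          # storage capacity
ACTIONS = (-1, 0, 1)   # discharge, idle, charge (agent's decision)


def solve_dispatch_lp(demand, costs, caps):
    """min sum(c_i * x_i)  s.t.  sum(x_i) = demand, 0 <= x_i <= cap_i.

    With one equality constraint and box bounds, filling sources in
    ascending cost order is an optimal solution of this LP.
    """
    if demand < 0 or demand > sum(caps):
        raise ValueError("infeasible dispatch problem")
    remaining, total = demand, 0.0
    for cost, cap in sorted(zip(costs, caps)):
        x = min(cap, remaining)
        total += cost * x
        remaining -= x
    return total


def valid_actions(level):
    """Actions that keep the storage level within [0, MAX_LEVEL]."""
    return [a for a in ACTIONS if 0 <= level + a <= MAX_LEVEL]


def train(episodes=5000, alpha=0.5, epsilon=0.2, seed=0):
    """Tabular Q-learning over (time, storage level) only."""
    rng = random.Random(seed)
    q = {}  # (t, level, action) -> estimated negative cost-to-go
    for _ in range(episodes):
        level = 0
        for t in range(HORIZON):
            acts = valid_actions(level)
            if rng.random() < epsilon:
                a = rng.choice(acts)
            else:
                a = max(acts, key=lambda b: q.get((t, level, b), 0.0))
            # Experience-independent variables: solved by the LP each step.
            cost = solve_dispatch_lp(DEMAND[t] + a, COSTS[t], CAPS)
            nxt = level + a
            future = 0.0
            if t + 1 < HORIZON:
                future = max(q.get((t + 1, nxt, b), 0.0)
                             for b in valid_actions(nxt))
            old = q.get((t, level, a), 0.0)
            q[(t, level, a)] = old + alpha * (-cost + future - old)
            level = nxt
    return q


def greedy_cost(q):
    """Total cost of following the learned greedy policy from an empty store."""
    level, total = 0, 0.0
    for t in range(HORIZON):
        a = max(valid_actions(level), key=lambda b: q.get((t, level, b), 0.0))
        total += solve_dispatch_lp(DEMAND[t] + a, COSTS[t], CAPS)
        level += a
    return total
```

In this toy setting the agent only has to learn over time step, storage level, and three actions, because the per-source purchase quantities never enter the Q-table. With the numbers above, always idling costs 10.0, whereas charging during the two cheap steps and discharging during the two expensive ones costs 4.0, which is the minimum for this instance; calling `greedy_cost(train())` shows what the learned policy achieves.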
               