
On Supervised On-Line Rolling-Horizon Control for Infinite-Horizon Discounted Markov Decision Processes


This note revisits the rolling-horizon control approach to the problem of a Markov decision process (MDP) with the infinite-horizon discounted expected reward criterion. In contrast to the classical value-iteration approach, we develop an asynchronous on-line algorithm based on policy iteration integrated with a multi-policy improvement method of policy switching. A sequence of monotonically improving solutions to the forecast-horizon sub-MDP is generated by updating the current solution only at the currently visited state, building in effect a rolling-horizon control policy for the MDP over the infinite horizon. Feedback from "supervisors," if available, can also be incorporated during the updates. We focus on the convergence of the algorithm in relation to the transition structure of the MDP. Depending on that structure, the algorithm achieves in finite time either global convergence to an optimal forecast-horizon policy or local convergence to a "locally optimal" fixed policy.
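To make the update pattern concrete, here is a minimal Python sketch of an on-line, single-state policy-switching step. It is an illustration only, not the authors' algorithm: it assumes a small tabular MDP with a known transition kernel P[s, a, s'] and reward r[s, a], uses a fixed candidate-policy set standing in for the "supervisors," and the helper names h_horizon_value and online_policy_switching are hypothetical. The supervised-feedback and convergence machinery of the note is omitted.

import numpy as np

def h_horizon_value(P, r, gamma, pi, H):
    # Exact H-horizon discounted value of stationary policy pi on a
    # tabular MDP with kernel P[s, a, s'] and reward r[s, a].
    n = r.shape[0]
    V = np.zeros(n)
    for _ in range(H):
        # One Bellman backup under the fixed policy pi.
        V = np.array([r[s, pi[s]] + gamma * P[s, pi[s]] @ V
                      for s in range(n)])
    return V

def online_policy_switching(P, r, gamma, H, candidates, s0, steps, rng):
    # On-line loop: at the currently visited state s, switch to the action
    # of whichever candidate policy has the best H-horizon value at s
    # (a single-state form of multi-policy improvement), then act.
    pi = candidates[0].copy()
    s = s0
    for _ in range(steps):
        values = [h_horizon_value(P, r, gamma, p, H)
                  for p in [pi] + list(candidates)]
        best = max(range(len(values)), key=lambda i: values[i][s])
        if best > 0:
            pi[s] = candidates[best - 1][s]   # update at state s only
        s = rng.choice(r.shape[0], p=P[s, pi[s]])  # act; horizon rolls forward
    return pi

# Example on a random 4-state, 2-action MDP (illustrative data only):
rng = np.random.default_rng(0)
n_s, n_a = 4, 2
P = rng.dirichlet(np.ones(n_s), size=(n_s, n_a))  # each P[s, a] sums to 1
r = rng.random((n_s, n_a))
candidates = [rng.integers(n_a, size=n_s) for _ in range(3)]
pi = online_policy_switching(P, r, 0.95, 10, candidates, 0, 50, rng)

The sketch only mimics the structural point of the abstract: the improvement step touches nothing but the currently visited state, and a switch is made only when some candidate promises a higher H-horizon value there. In the paper's setting this asynchronous update generates the monotonically improving sequence of solutions to the forecast-horizon sub-MDP.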

Keywords: rolling-horizon control; infinite horizon; policy

Journal Title: IEEE Transactions on Automatic Control
Year Published: 2022
