Path Integral Policy Improvement With Population Adaptation

Path integral policy improvement (PI²) is known to be an efficient reinforcement learning algorithm, particularly when the target system is a high-dimensional dynamical system. However, PI² and its existing extensions have adjustable parameters on which the efficiency depends significantly. This article proposes an extension of PI² that adjusts all of the critical parameters automatically. Motion acquisition tasks for three different types of simulated legged robots were performed to test the efficacy of the proposed algorithm. The results show that the proposed method not only eliminates the burden on the user of setting the parameters appropriately but also improves the optimization performance significantly. For one of the acquired motions, a real-robot experiment was conducted to validate the motion.
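
To illustrate the kind of adjustable parameters the abstract refers to, here is a minimal Python sketch of a simplified, episodic PI²-style update (Gaussian parameter perturbation followed by probability-weighted averaging). It is not the population-adaptation method proposed in the article, whose details the abstract does not give. The function names, the quadratic toy cost, and the default values of `sigma` (exploration noise), `h` (cost sensitivity), and `n_rollouts` (number of rollouts per update) are assumptions chosen purely for illustration.

```python
import math
import random


def pi2_style_update(theta, cost_fn, sigma=0.1, h=10.0, n_rollouts=10, rng=None):
    """One simplified, episodic PI2-style update of a parameter vector.

    sigma, h, and n_rollouts are hand-set here (illustrative defaults);
    they stand in for the user-tuned parameters mentioned in the abstract.
    """
    rng = rng or random.Random()

    # Explore: sample perturbed copies of the current parameters.
    samples = [[t + rng.gauss(0.0, sigma) for t in theta]
               for _ in range(n_rollouts)]
    costs = [cost_fn(s) for s in samples]

    # Map costs to weights: normalize to [0, 1], then exponentiate so that
    # low-cost rollouts receive high weight.
    c_min, c_max = min(costs), max(costs)
    span = c_max - c_min
    if span == 0.0:
        # All rollouts cost the same: fall back to uniform weights.
        weights = [1.0 / n_rollouts] * n_rollouts
    else:
        raw = [math.exp(-h * (c - c_min) / span) for c in costs]
        total = sum(raw)
        weights = [r / total for r in raw]

    # Update: probability-weighted average of the sampled parameters.
    return [sum(w * s[i] for w, s in zip(weights, samples))
            for i in range(len(theta))]


def quadratic_cost(theta):
    """Hypothetical toy cost with its minimum at the origin."""
    return sum(t * t for t in theta)


def run_demo(seed=0, iterations=50):
    """Repeat the update on the toy cost and return the final parameters."""
    rng = random.Random(seed)
    theta = [1.0, -1.0, 0.5]
    for _ in range(iterations):
        theta = pi2_style_update(theta, quadratic_cost, rng=rng)
    return theta
```

Calling `run_demo()` returns the parameter vector after 50 updates on the toy cost. In this sketch `sigma`, `h`, and `n_rollouts` must all be chosen by hand, which is the tuning burden the proposed extension is said to remove.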

Keywords: integral policy; path integral; policy improvement; improvement population

Journal Title: IEEE Transactions on Cybernetics
Year Published: 2020
