Bipedal walking is a challenging task for humanoid robots. In this study, we develop a lightweight reinforcement learning method for real-time gait planning of a biped robot. We regard bipedal walking as a process in which the robot continuously interacts with the environment, judges the quality of a control action from the resulting walking state, and then adjusts its control strategy accordingly. A mean-asynchronous advantage actor-critic (M-A3C) reinforcement learning algorithm is proposed to handle the continuous state and action spaces and to obtain the robot's final gait directly, without introducing a reference gait. Multiple sub-agents of the M-A3C algorithm independently and simultaneously train multiple virtual robots on a physical simulation platform. The trained model is then transferred to the walking control of the actual robot, which reduces the amount of training required on the physical hardware, improves training speed, and ensures that the final gait is obtained. Finally, a biped robot is designed and fabricated to verify the effectiveness of the proposed method. Experiments show that the proposed method achieves continuous and stable gait planning for the biped robot.
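The abstract does not give the M-A3C update rule, so the following is only an illustrative sketch of the generic pattern it describes: several sub-agent workers each compute an advantage actor-critic gradient on their own copy of a toy environment, and the gradients are averaged (the assumed "mean" step) before updating shared parameters. The linear policy, toy reward, and all dimensions below are hypothetical stand-ins, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM, N_WORKERS, N_STEPS = 4, 3, 20
theta = rng.normal(scale=0.1, size=STATE_DIM)   # shared policy weights (assumed linear policy)
w = rng.normal(scale=0.1, size=STATE_DIM)       # shared value-function weights

def rollout_grads(theta, w, seed):
    """One sub-agent: run a toy episode and return averaged policy/value gradients."""
    r = np.random.default_rng(seed)
    g_theta = np.zeros_like(theta)
    g_w = np.zeros_like(w)
    for _ in range(N_STEPS):
        s = r.normal(size=STATE_DIM)            # stand-in for the robot's walking state
        mu = theta @ s                          # Gaussian policy mean (continuous action)
        a = mu + r.normal()                     # sampled continuous action
        reward = -abs(a)                        # toy reward, not the paper's reward
        advantage = reward - (w @ s)            # A = R - V(s)
        g_theta += advantage * (a - mu) * s     # policy-gradient term
        g_w += advantage * s                    # value-function gradient
    return g_theta / N_STEPS, g_w / N_STEPS

# Each sub-agent trains its own virtual robot; gradients are then averaged
# and applied to the shared parameters.
for update in range(50):
    grads = [rollout_grads(theta, w, seed=update * N_WORKERS + i)
             for i in range(N_WORKERS)]
    theta += 0.05 * np.mean([g[0] for g in grads], axis=0)  # ascend policy gradient
    w += 0.05 * np.mean([g[1] for g in grads], axis=0)      # fit the value baseline
```

In a sim-to-real setting like the one described, the loop above would run against the simulation platform, and only the converged `theta` would be carried over to the physical robot's gait controller.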