Learning an accurate state transition dynamics model in a sample-efficient way is important for predicting a system's future states from its current states and actions, both accurately and efficiently, in model-based reinforcement learning for many robotic applications. This paper proposes a sample-efficient learning approach that accurately learns a state transition dynamics model by fitting both the predicted next states and their derivatives. The derivatives of the feedforward neural network outputs (next states) with respect to its inputs (current states and actions) are computed using the chain rule. In addition, the effects of the activation functions on learning these derivatives are illustrated with a sum-of-elementary-sine-functions example, and various activation functions are compared with respect to accuracy. The proposed learning approach yields significant accuracy improvements in both one-step and multi-step prediction with a six-degree-of-freedom manipulation robot (UR-10), in both simulated and real environments.
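The chain-rule computation described above can be sketched as follows. This is a minimal illustration (not the authors' code): a one-hidden-layer feedforward network with a smooth (tanh) activation, whose Jacobian of the output (predicted next state) with respect to the input (current state and action) is derived analytically via the chain rule and checked against finite differences. The layer sizes and the combined prediction-plus-derivative loss shown in the comments are illustrative assumptions.

```python
# Sketch of chain-rule input Jacobians for a feedforward dynamics model.
# Hypothetical sizes: input = state (+ action) of dim 5, next state of dim 3.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 5, 16, 3
W1 = rng.normal(scale=0.3, size=(n_hidden, n_in))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.3, size=(n_out, n_hidden))
b2 = np.zeros(n_out)

def forward(x):
    """Predict the next state from the (state, action) input x."""
    h = np.tanh(W1 @ x + b1)
    return W2 @ h + b2

def jacobian(x):
    """Chain rule: dy/dx = W2 · diag(1 - tanh²(W1 x + b1)) · W1."""
    s = 1.0 - np.tanh(W1 @ x + b1) ** 2   # elementwise tanh derivative
    return W2 @ (s[:, None] * W1)

x = rng.normal(size=n_in)
J = jacobian(x)

# Sanity check: compare the analytic Jacobian to central finite differences.
eps = 1e-6
J_fd = np.zeros_like(J)
for i in range(n_in):
    d = np.zeros(n_in)
    d[i] = eps
    J_fd[:, i] = (forward(x + d) - forward(x - d)) / (2 * eps)

err = np.max(np.abs(J - J_fd))
print(err)  # should be very small (finite-difference accuracy)
```

In a derivative-fitting (Sobolev-style) training setup of the kind the abstract describes, this Jacobian would enter the loss alongside the prediction error, e.g. a weighted sum of the next-state error and the Jacobian error against target derivatives; the weighting is a design choice not specified here.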