LAUSR: pessimistic value

Improving Exploration in Actor-Critic With Weakly Pessimistic Value Estimation and Optimistic Policy Optimization.

Published in 2022 in IEEE Transactions on Neural Networks and Learning Systems

DOI: 10.1109/tnnls.2022.3215596

Abstract: Deep off-policy actor-critic algorithms have been successfully applied to challenging tasks in continuous control. However, these methods typically suffer from poor sample efficiency, which limits their widespread adoption in real-world domains. To mitigate this…

Keywords: pessimistic value; value estimation; policy; value ...
