Articles with "martingale proximal" as a keyword




Anti-Martingale Proximal Policy Optimization.

Published in 2022 in "IEEE Transactions on Cybernetics"

DOI: 10.1109/tcyb.2022.3170355

Abstract: Since the sample data after one exploration process can only be used to update network parameters once in on-policy deep reinforcement learning (DRL), a high sample efficiency is necessary to accelerate the training process of…

Keywords: policy optimization; martingale proximal; proximal policy; policy ...