Articles with "difference learning" as a keyword




The serial blocking effect: a testbed for the neural mechanisms of temporal-difference learning

Published in 2019 at "Scientific Reports"

DOI: 10.1038/s41598-019-42244-4

Abstract: Temporal-difference (TD) learning models afford the neuroscientist a theory-driven roadmap in the quest for the neural mechanisms of reinforcement learning. The application of these models to understanding the role of phasic midbrain dopaminergic responses in…

Keywords: serial blocking; difference learning; neural mechanisms; blocking effect ...

Distributed Off-Policy Temporal Difference Learning Using Primal-Dual Method

Published in 2022 at "IEEE Access"

DOI: 10.1109/access.2022.3211395

Abstract: The goal of this paper is to provide theoretical analysis and additional insights on a distributed temporal-difference (TD)-learning algorithm for the multi-agent Markov decision processes (MDPs) via saddle-point viewpoints. The (single-agent) TD-learning is a reinforcement…

Keywords: temporal difference; policy temporal; policy; distributed policy ...

CTD: Cascaded Temporal Difference Learning for the Mean-Standard Deviation Shortest Path Problem

Published in 2022 at "IEEE Transactions on Intelligent Transportation Systems"

DOI: 10.1109/tits.2021.3096829

Abstract: This paper investigates the reliable shortest path (RSP) planning problem from the reinforcement learning perspective. Different from canonical path planning methods, which require at least the first-order statistic (mean) and second-order statistic (variance) information…

Keywords: problem; temporal difference; path; shortest path ...

Online Sparse Temporal Difference Learning Based on Nested Optimization and Regularized Dual Averaging

Published in 2022 at "IEEE Transactions on Systems, Man, and Cybernetics: Systems"

DOI: 10.1109/tsmc.2020.3043584

Abstract: In policy evaluation of reinforcement learning tasks, the temporal difference (TD) learning with value function approximation has been widely studied. However, feature representation has a decisive influence on both accuracy of value function approximation and…

Keywords: temporal difference; online sparse ...