Articles with "policy gradient" as a keyword



SeqVAE: Sequence variational autoencoder with policy gradient

Published in 2021 at "Applied Intelligence"

DOI: 10.1007/s10489-021-02374-7

Abstract: In the paper, we propose a variant of Variational Autoencoder (VAE) for sequence generation task, called SeqVAE, which is a combination of recurrent VAE and policy gradient in reinforcement learning. The goal of SeqVAE is…

Keywords: policy gradient; policy; seqvae; variational autoencoder ...

Grasping Control of a Vision Robot Based on a Deep Attentive Deterministic Policy Gradient

Published in 2022 at "IEEE Access"

DOI: 10.1109/access.2021.3137821

Abstract: Reinforcement learning can achieve excellent performance in the field of robotic grasping if the grasping target is stable. However, during applications in the real world, robot needs to overcome the effects of a complex working…

Keywords: deep attentive; deterministic policy; policy gradient; attentive deterministic ...

Research on Maneuvering Decision Algorithm Based on Improved Deep Deterministic Policy Gradient

Published in 2022 at "IEEE Access"

DOI: 10.1109/access.2022.3202918

Abstract: Autonomous maneuvering decisions of unmanned aerial vehicle (UAV) in short-range air combat remain a challenging research topic, and a decision method based on an improved deep deterministic policy gradient (DDPG) is proposed. First, the problem…

Keywords: improved deep; deep deterministic; based improved; deterministic policy ...

Data-Driven Coordinated Charging for Electric Vehicles With Continuous Charging Rates: A Deep Policy Gradient Approach

Published in 2022 at "IEEE Internet of Things Journal"

DOI: 10.1109/jiot.2021.3135977

Abstract: In this article, we consider a parking lot that manages the charging processes of its parked electric vehicles (EVs). Upon arrival, each EV requests a certain amount of energy. This request should be fulfilled before…

Keywords: charging rates; deep policy; policy gradient; policy ...

Computing Stabilizing Feedback Gains via a Model-Free Policy Gradient Method

Published in 2022 at "IEEE Control Systems Letters"

DOI: 10.1109/lcsys.2022.3188180

Abstract: In spite of the lack of convexity, convergence and sample complexity properties were recently established for the random search method applied to the linear quadratic regulator (LQR) problem. Since policy gradient approaches require an initial…

Keywords: feedback gains; policy gradient; policy; method ...

Revisiting LQR Control From the Perspective of Receding-Horizon Policy Gradient

Published in 2023 at "IEEE Control Systems Letters"

DOI: 10.1109/lcsys.2023.3271594

Abstract: We revisit in this letter the discrete-time linear quadratic regulator (LQR) problem from the perspective of receding-horizon policy gradient (RHPG), a newly developed model-free learning framework for control applications. We provide a fine-grained sample complexity…

Keywords: horizon policy; receding horizon; control; policy gradient ...

Learning Optimal Controllers for Linear Systems With Multiplicative Noise via Policy Gradient

Published in 2021 at "IEEE Transactions on Automatic Control"

DOI: 10.1109/tac.2020.3037046

Abstract: The linear quadratic regulator (LQR) problem has reemerged as an important theoretical benchmark for reinforcement learning-based control of complex dynamical systems with continuous state and action spaces. In contrast with nearly all recent work in…

Keywords: optimal controllers; policy; learning optimal; multiplicative noise ...

Policy Gradient for Continuing Tasks in Discounted Markov Decision Processes

Published in 2022 at "IEEE Transactions on Automatic Control"

DOI: 10.1109/tac.2022.3163085

Abstract: Reinforcement learning aims to find policies that maximize an expected cumulative reward in Markov decision processes with unknown transition probabilities. Policy gradient (PG)-algorithms use stochastic gradients of the value function to update the policy. A…

Keywords: markov decision; decision processes; policy gradient; policy ...

Expert System-Based Multiagent Deep Deterministic Policy Gradient for Swarm Robot Decision Making

Published in 2022 at "IEEE Transactions on Cybernetics"

DOI: 10.1109/tcyb.2022.3228578

Abstract: In this article, an expert system-based multiagent deep deterministic policy gradient (ESB-MADDPG) is proposed to realize the decision making for swarm robots. Multiagent deep deterministic policy gradient (MADDPG) is a multiagent reinforcement learning algorithm proposed…

Keywords: deep deterministic; policy gradient; policy; multiagent deep ...

Policy Gradient Adaptive Critic Design With Dynamic Prioritized Experience Replay for Wastewater Treatment Process Control

Published in 2022 at "IEEE Transactions on Industrial Informatics"

DOI: 10.1109/tii.2021.3106402

Abstract: With the industrialization of modern society, the pollution of water resources becomes more and more serious. Although purifying urban sewage through the wastewater treatment plants eases the burden of fragile ecosystems, the nonlinearities and uncertainties…

Keywords: dynamic prioritized; wastewater treatment; control; policy gradient ...

Intelligent Fault Quantitative Identification via the Improved Deep Deterministic Policy Gradient (DDPG) Algorithm Accompanied With Imbalanced Sample

Published in 2023 at "IEEE Transactions on Instrumentation and Measurement"

DOI: 10.1109/tim.2023.3250284

Abstract: The imbalanced amount of faulty and normal samples seriously affects the performance of intelligent fault diagnosis models. Aiming to solve the above problem, an improved deep deterministic policy gradient (DDPG) algorithm incorporating ResNet, ResDPG, based…

Keywords: improved deep; deep deterministic; intelligent fault; deterministic policy ...