Deep reinforcement learning (DRL) has attracted much attention as an approach to solve optimal control problems without mathematical models of systems. On the other hand, in general, constraints may be… Click to show full abstract
Deep reinforcement learning (DRL) has attracted much attention as an approach to solve optimal control problems without mathematical models of systems. On the other hand, in general, constraints may be imposed on optimal control problems. In this study, we consider the optimal control problems with constraints to complete temporal control tasks. We describe the constraints using signal temporal logic (STL), which is useful for time sensitive control tasks since it can specify continuous signals within bounded time intervals. To deal with the STL constraints, we introduce an extended constrained Markov decision process (CMDP), which is called a
               
Click one of the above tabs to view related content.