CTD: Cascaded Temporal Difference Learning for the Mean-Standard Deviation Shortest Path Problem

This paper investigates the reliable shortest path (RSP) planning problem from a reinforcement learning perspective. Canonical path planning methods require at least the first-order statistic (mean) and second-order statistic (variance) of the travel time distribution. In contrast, we target the RSP planning problem without assuming that any characteristic of the travel time distribution is known beforehand. We propose a cascaded temporal difference learning (CTD) method, which simultaneously estimates the mean and variance of the path being executed and gradually improves the policy through the generalized policy iteration (GPI) scheme as the ego vehicle interacts with the environment. Extensive simulation results demonstrate the applicability of the proposed method to RSP learning in various transportation networks.
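
The abstract does not give the update rules, so the following is only a minimal sketch of one plausible reading of "cascaded" TD learning, not the authors' algorithm. It assumes a tabular setting, a hypothetical three-route toy network (`NETWORK`), independent link travel times (so variances add along a path), and a weight `lam` on the standard deviation. The first TD stage estimates the mean remaining travel time; its TD error feeds a second TD stage that estimates the variance. Acting greedily on mean + `lam` × standard deviation while updating both tables plays the role of generalized policy iteration. All names and step sizes are illustrative.

```python
import math
import random

# Hypothetical toy network: node -> {next_node: (low, high)}.
# Link travel times are uniform on [low, high]; the learner only sees samples.
NETWORK = {
    "O": {"A": (4.5, 5.5), "B": (0.0, 8.0)},  # via A: slower but steady
    "A": {"D": (4.5, 5.5)},                   # via B: faster on average, erratic
    "B": {"D": (0.0, 8.0)},
    "D": {},
}


def score(M, V, s, a, lam):
    """Mean-standard deviation criterion for leaving node s towards node a."""
    return M[(s, a)] + lam * math.sqrt(max(V[(s, a)], 0.0))


def greedy(M, V, s, lam):
    """Successor of s with the smallest estimated mean + lam * std."""
    return min(NETWORK[s], key=lambda a: score(M, V, s, a, lam))


def train(lam, episodes=20000, alpha=0.02, beta=0.02, eps=0.3, seed=0):
    """Learn mean (M) and variance (V) tables from sampled trips O -> D."""
    rng = random.Random(seed)
    M = {(s, a): 0.0 for s in NETWORK for a in NETWORK[s]}
    V = dict(M)
    for _ in range(episodes):
        s = "O"
        while s != "D":
            # Epsilon-greedy action selection (policy improvement step).
            if rng.random() < eps:
                a = rng.choice(list(NETWORK[s]))
            else:
                a = greedy(M, V, s, lam)
            cost = rng.uniform(*NETWORK[s][a])  # observed travel time
            if a == "D":
                m_next = v_next = 0.0
            else:
                a_next = greedy(M, V, a, lam)
                m_next, v_next = M[(a, a_next)], V[(a, a_next)]
            # Stage 1: TD update of the mean remaining travel time.
            delta = cost + m_next - M[(s, a)]
            M[(s, a)] += alpha * delta
            # Stage 2: TD update of the variance, driven by the squared
            # stage-1 TD error (assumes independent link travel times).
            V[(s, a)] += beta * (delta ** 2 + v_next - V[(s, a)])
            s = a
    return M, V


def plan(M, V, lam, origin="O", dest="D"):
    """Follow the greedy policy from origin to dest."""
    path = [origin]
    while path[-1] != dest:
        path.append(greedy(M, V, path[-1], lam))
    return path
```

For example, `plan(*train(lam=2.0), lam=2.0)` plans under a risk-averse criterion, while `lam=0.0` reduces to the ordinary mean shortest path. Because the mean-standard deviation objective is not additive over links, a greedy per-node rule like this one is a heuristic in general networks.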

Keywords: problem; temporal difference; path; shortest path; cascaded temporal; difference learning

Journal Title: IEEE Transactions on Intelligent Transportation Systems
Year Published: 2022

