This paper presents a route design model for simultaneous pickup and delivery that accounts for the use of express lockers. Unlike the traditional traveling salesman problem (TSP), the model considers a scenario in which a courier serves a neighborhood over multiple trips. Subject to locker and vehicle capacity, the total cost comprises back-order cost, lost-sale cost, and traveling time. The objective is to minimize this total cost while satisfying all requests. A modified deep Q-learning network is designed to solve the model, using masked multi-head attention to select the courier's paths. The algorithm outperforms other stochastic optimization methods, finding better solutions with O(n) computational time during evaluation. The experiments show that reinforcement learning is a better choice than traditional stochastic optimization methods here: it consumes less power and time during evaluation, which indicates that the approach is better suited to large-scale data and broad deployment.
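
The role of masking in path selection can be illustrated with a minimal sketch. It is not the paper's network: the function names (`feasible_mask`, `masked_softmax`, `greedy_next_stop`), the single-capacity feasibility rule, and the example scores are all assumptions made for illustration. The idea shown is only that infeasible stops (already visited, or with demand exceeding the remaining vehicle capacity) receive zero probability before the next stop is chosen.

```python
import math


def feasible_mask(demands, visited, remaining_capacity):
    """Hypothetical feasibility rule: a stop can be chosen if it has not been
    visited yet and its demand fits in the remaining vehicle capacity."""
    return [(not v) and d <= remaining_capacity
            for d, v in zip(demands, visited)]


def masked_softmax(scores, mask):
    """Softmax over scores in which masked-out entries get probability 0."""
    if not any(mask):
        raise ValueError("at least one stop must be feasible")
    # Subtract the largest feasible score for numerical stability.
    top = max(s for s, m in zip(scores, mask) if m)
    exps = [math.exp(s - top) if m else 0.0 for s, m in zip(scores, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_next_stop(scores, demands, visited, remaining_capacity):
    """Pick the feasible stop with the highest masked probability."""
    mask = feasible_mask(demands, visited, remaining_capacity)
    probs = masked_softmax(scores, mask)
    return max(range(len(probs)), key=probs.__getitem__)


# Illustrative values only; in a learned model the scores would come from
# the attention layers rather than being hand-written.
scores = [2.0, 5.0, 1.0, 3.0]
demands = [1, 4, 2, 1]
visited = [False, False, True, False]
next_stop = greedy_next_stop(scores, demands, visited, remaining_capacity=2)
```

In this example stop 1 has the highest raw score but is masked out because its demand exceeds the remaining capacity, and stop 2 is masked out because it was already visited, so the choice falls to the best remaining feasible stop.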
               