Deep reinforcement learning with Experience Replay (ER), including the Deep Q-Network (DQN), has been used to solve many multi-step learning problems. In practice, however, DQN algorithms lack explainability, which limits their applicability in many scenarios. While DQN can be regarded as a black-box model, Learning Classifier Systems (LCSs), including their anticipatory variants, also solve multi-step problems, and their operation is interpretable. Combining the properties of these two learning approaches therefore seems promising. This paper describes the design and evaluation of a modification to the Experience Replay extension of the anticipatory classifier system ACS2. The modification, named Episode-based Experience Replay (EER), replays entire episodes instead of single experience samples. Promising results, supported by Bayesian estimation, are obtained on multi-step problems, albeit limited to deterministic and discrete tasks. The experimental results show that the EER extension significantly improves ACS2's learning capabilities.
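To illustrate the core premise of EER, the following is a minimal sketch of an episode-based replay buffer, not the authors' ACS2 implementation: transitions are accumulated until an episode terminates, the whole episode is stored, and replay presents complete episodes to the learner in their original temporal order rather than drawing independent transitions. The `EpisodeReplayBuffer` class and the `learner.update` interface are illustrative assumptions.

```python
import random
from collections import deque


class EpisodeReplayBuffer:
    """Hypothetical episode-based replay buffer: stores whole episodes
    and replays each one as an ordered sequence of transitions."""

    def __init__(self, capacity=100):
        self.episodes = deque(maxlen=capacity)  # each entry is a complete episode
        self.current = []                       # transitions of the ongoing episode

    def record(self, state, action, reward, next_state, done):
        self.current.append((state, action, reward, next_state, done))
        if done:                                # episode finished: store it whole
            self.episodes.append(self.current)
            self.current = []

    def replay(self, learner, n_episodes=1):
        # Replay complete episodes in temporal order, unlike sample-based ER,
        # which draws single, independent transitions from the buffer.
        k = min(n_episodes, len(self.episodes))
        for episode in random.sample(list(self.episodes), k):
            for transition in episode:
                learner.update(*transition)     # `update` is an assumed learner method
```

Replaying episodes as ordered sequences preserves the temporal structure of the experience, which is the property the abstract credits for the improved learning capabilities of ACS2 on multi-step tasks.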