Many deep RL algorithms use conventional deep neural networks, such as the convolutional networks in DQN or the LSTMs in DRQN. The dueling network takes a different approach: instead of reusing an existing architecture, it introduces a network designed specifically for deep reinforcement learning. The network splits into two streams, one estimating the state value and the other the per-action advantages, which are then combined to produce the Q values.
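To make the idea concrete, here is a minimal PyTorch sketch of the dueling architecture. The names and sizes (`DuelingQNetwork`, `state_dim`, `n_actions`, `hidden`) are placeholders I chose for illustration, not anything from the paper:

```python
import torch
import torch.nn as nn

class DuelingQNetwork(nn.Module):
    """Shared trunk, then separate value and advantage streams (illustrative sketch)."""
    def __init__(self, state_dim, n_actions, hidden=128):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)              # V(s)
        self.advantage = nn.Linear(hidden, n_actions)  # A(s, a)

    def forward(self, state):
        h = self.trunk(state)
        v = self.value(h)
        a = self.advantage(h)
        # Combine the streams; subtracting the mean advantage keeps Q identifiable.
        return v + a - a.mean(dim=1, keepdim=True)

# Example: Q values for a batch of 4 random states with 6 actions.
net = DuelingQNetwork(state_dim=8, n_actions=6)
q = net(torch.randn(4, 8))
print(q.shape)  # torch.Size([4, 6])
```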
Experience replay allows agents to remember and reuse past experiences, sampling them uniformly from memory. However, some experiences are more important than others and should not be treated equally. With prioritized experience replay (PER), the agent replays important experiences more often and therefore learns more efficiently.
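Below is a rough sketch of proportional prioritization in NumPy, where transitions are sampled with probability proportional to their priority (typically the absolute TD error) and importance-sampling weights correct the resulting bias. The class and method names are my own, and details like the segment-tree implementation used in practice are omitted:

```python
import numpy as np

class PrioritizedReplayBuffer:
    """Proportional PER sketch: P(i) is proportional to priority ** alpha."""
    def __init__(self, capacity, alpha=0.6):
        self.capacity, self.alpha = capacity, alpha
        self.buffer, self.priorities, self.pos = [], np.zeros(capacity), 0

    def add(self, transition):
        # New transitions get the current max priority so they are replayed at least once.
        max_prio = self.priorities.max() if self.buffer else 1.0
        if len(self.buffer) < self.capacity:
            self.buffer.append(transition)
        else:
            self.buffer[self.pos] = transition
        self.priorities[self.pos] = max_prio
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        prios = self.priorities[:len(self.buffer)] ** self.alpha
        probs = prios / prios.sum()
        idxs = np.random.choice(len(self.buffer), batch_size, p=probs)
        # Importance-sampling weights correct the bias from non-uniform sampling.
        weights = (len(self.buffer) * probs[idxs]) ** (-beta)
        weights /= weights.max()
        return [self.buffer[i] for i in idxs], idxs, weights

    def update_priorities(self, idxs, td_errors, eps=1e-6):
        self.priorities[idxs] = np.abs(td_errors) + eps
```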
DQN tends to overestimate Q values: the agent values some actions much higher than their true values. These estimation errors can lead to longer learning times and poor policies. Double DQN combines DQN with double Q-learning to reduce the overestimation.
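The core change is small: the online network selects the next action, while the target network evaluates it. A minimal PyTorch sketch of the target computation, with `online_net`, `target_net`, and the tensor shapes assumed for illustration:

```python
import torch

def double_dqn_targets(online_net, target_net, rewards, next_states, dones, gamma=0.99):
    """Double DQN target: action selection by the online net, evaluation by the target net."""
    with torch.no_grad():
        next_actions = online_net(next_states).argmax(dim=1, keepdim=True)   # selection
        next_q = target_net(next_states).gather(1, next_actions).squeeze(1)  # evaluation
        return rewards + gamma * next_q * (1.0 - dones)
```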
Inspired by David Silver’s wonderful introductory course on reinforcement learning (RL), I’ve recently started learning about deep reinforcement learning (deep RL). My goal is to gain an in-depth understanding of the field, and the learning path suggested by OpenAI is to read the key papers and re-implement the algorithms.