Portfolio Project

Description

In this project, a deep Q-network (DQN) with experience replay and a target network is implemented to train an agent to play Breakout on the Atari 2600 VCS. The code is written in both the TensorFlow and PyTorch frameworks. First, the input frames are preprocessed: they are cropped to remove uninformative regions and reduce the amount of data, then converted to grayscale to cut training time. A frame buffer, based on this article, is used for experience replay. The CNN has four convolutional layers and one output layer producing the Q-values, and is called at every agent step. The Q-learning TD error is computed using the target network to provide reference Q-values. The agent is then trained with the Adam optimizer for about 75k steps. Training is stopped once the agent reaches a reward of roughly 13 or 14 per life.
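The crop-and-grayscale preprocessing step can be sketched as follows. This is a minimal illustration, not the project's exact code: the crop offsets (`crop_top`, `crop_bottom`), the downsampling factor, and the resulting 80x80 output size are assumptions chosen for a standard 210x160x3 Atari frame.

```python
import numpy as np

def preprocess(frame, crop_top=34, crop_bottom=16):
    """Crop the score/border rows and convert an RGB frame to grayscale.

    `frame` is assumed to be an HxWx3 uint8 array (210x160x3 for Atari).
    The offsets here are illustrative, not the project's exact values.
    """
    # Cut away rows that carry no gameplay information.
    cropped = frame[crop_top:frame.shape[0] - crop_bottom]
    # Luminance-weighted grayscale conversion collapses the color channels.
    gray = cropped @ np.array([0.299, 0.587, 0.114])
    # Downsample by 2 in both dimensions and rescale pixel values to [0, 1].
    small = gray[::2, ::2] / 255.0
    return small.astype(np.float32)

frame = np.zeros((210, 160, 3), dtype=np.uint8)
obs = preprocess(frame)
print(obs.shape)  # (80, 80)
```

Working on a single small grayscale plane instead of the full RGB frame is what gives the reduction in data volume and training time described above.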
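The experience-replay buffer can be sketched with a fixed-capacity deque. This is a generic, framework-free version under assumed names (`ReplayBuffer`, `push`, `sample`), not the project's implementation from the referenced article.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-capacity experience replay: old transitions are evicted FIFO."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        # Store one transition per agent step.
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform random minibatches break the correlation between
        # consecutive frames, which stabilizes DQN training.
        batch = random.sample(self.buffer, batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.buffer)

buf = ReplayBuffer(capacity=1000)
for t in range(50):
    buf.push(t, t % 4, 1.0, t + 1, False)
states, actions, rewards, next_states, dones = buf.sample(8)
print(len(buf), len(states))  # 50 8
```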
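The TD-error computation against the target network reduces to the standard Q-learning target. A minimal numpy sketch, assuming a discount factor `gamma=0.99` (not stated in the text) and that `next_q_target` holds the target network's Q-values for the next states:

```python
import numpy as np

def td_targets(rewards, dones, next_q_target, gamma=0.99):
    """Q-learning targets: r + gamma * max_a' Q_target(s', a').

    `next_q_target` has shape (batch, n_actions); terminal transitions
    (done == 1) do not bootstrap from the next state.
    """
    max_next_q = next_q_target.max(axis=1)
    return rewards + gamma * (1.0 - dones) * max_next_q

rewards = np.array([1.0, 0.0])
dones = np.array([0.0, 1.0])
next_q = np.array([[0.5, 2.0], [3.0, 1.0]])
targets = td_targets(rewards, dones, next_q)
print(targets)  # [2.98 0.  ]
```

The TD error minimized by the Adam optimizer is then the difference between these targets and the online network's Q-values for the actions actually taken; using a separate, periodically updated target network keeps the targets from chasing the online network's own estimates.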