Connect Four is a strategic board game in which two players take turns dropping a stone of their color (here 'o' and '+') into one of the 7 columns.
A player wins by getting four consecutive stones of their color in a row, a column or a diagonal.
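The winning rule can be sketched in a few lines of Python. This is only an illustration, not the code that produced the image: the function name `has_four`, the list-of-lists board layout and the '.' marker for empty cells are assumptions, as is the 6-row board used in the usage note (the text above only fixes the 7 columns).

```python
def has_four(board, stone):
    """Return True if `stone` has four consecutive cells in a row, column or diagonal.

    `board` is assumed to be a list of rows, each a list of single characters
    ('o', '+', or '.' for an empty cell). This layout is hypothetical.
    """
    rows, cols = len(board), len(board[0])
    # Step directions: right, down, down-right, down-left.
    directions = [(0, 1), (1, 0), (1, 1), (1, -1)]
    for r in range(rows):
        for c in range(cols):
            for dr, dc in directions:
                # The four cells starting at (r, c) in this direction.
                cells = [(r + i * dr, c + i * dc) for i in range(4)]
                if all(
                    0 <= rr < rows and 0 <= cc < cols and board[rr][cc] == stone
                    for rr, cc in cells
                ):
                    return True
    return False
```

For example, on an assumed 6x7 board where the bottom row starts with four 'o' stones, `has_four(board, 'o')` is true and `has_four(board, '+')` is false.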
This image is an ASCII representation of that game, generated from my code.
It shows the result of a game the agent generated for training purposes.
The agent follows an epsilon-greedy policy, i.e. it takes a random move with probability epsilon and otherwise the move it currently rates best.
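A minimal sketch of epsilon-greedy move selection, assuming the agent has some value estimate for each legal column. The names `epsilon_greedy_move`, `legal_moves` and `move_values` are hypothetical and not taken from my code; how the values are computed is left out.

```python
import random


def epsilon_greedy_move(legal_moves, move_values, epsilon, rng=random):
    """Pick a column to play under an epsilon-greedy policy.

    legal_moves: list of column indices that are not yet full (assumed input).
    move_values: mapping from column index to the agent's value estimate (assumed input).
    epsilon:     probability of ignoring the estimates and playing at random.
    """
    # Explore: with probability epsilon, choose uniformly among the legal moves.
    if rng.random() < epsilon:
        return rng.choice(legal_moves)
    # Exploit: otherwise choose the legal move with the highest estimate.
    return max(legal_moves, key=lambda move: move_values[move])
```

With epsilon close to 1 almost every move is random; as epsilon shrinks, the agent increasingly plays the moves it rates highest.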
This screenshot was taken at the beginning of training, when epsilon is still very large, so most of the moves are random.