Solving the CartPole-v1 environment in Keras with the Advantage Actor-Critic (A2C) algorithm, a deep reinforcement learning algorithm
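
As a quick orientation, the quantity that gives A2C its name can be sketched in a few lines. This is a minimal sketch that assumes the common one-step (TD) form of the advantage. The function name `one_step_advantage`, the discount `gamma=0.99`, and the sample values are illustrative assumptions, not taken from this project's code.

```python
def one_step_advantage(reward, value, next_value, done, gamma=0.99):
    """Return (critic_target, advantage) for a single transition.

    value and next_value are the critic's estimates V(s) and V(s').
    Hypothetical helper for illustration only.
    """
    # Do not bootstrap from the next state once the episode has ended.
    target = reward if done else reward + gamma * next_value
    # The advantage measures how much better the outcome was than the critic expected.
    advantage = target - value
    return target, advantage


# Made-up critic estimates; CartPole gives a reward of 1.0 per step.
target, advantage = one_step_advantage(
    reward=1.0, value=10.0, next_value=10.0, done=False
)
```

In a typical setup the critic is trained toward `target`, while the actor's log-probability of the chosen action is weighted by `advantage`.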