Solving an equation problem with naive policy gradient (Keras version): finding where the function reaches its maximum value
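
To make the idea concrete, here is a minimal sketch of naive policy gradient (REINFORCE without a baseline) used to locate a maximum. It is an illustration under stated assumptions, not the project's actual code: it uses plain NumPy instead of Keras so that it stays self-contained, and the objective `-(x - 3)^2`, the Gaussian policy with fixed standard deviation, and all names and hyperparameters (`objective`, `naive_policy_gradient`, `lr`, `sigma`, and so on) are hypothetical choices made for the example.

```python
import numpy as np


def objective(x):
    # Hypothetical function to maximize; its maximum is at x = 3.
    return -(x - 3.0) ** 2


def naive_policy_gradient(n_steps=300, batch_size=256, lr=0.05, sigma=0.5, seed=0):
    """Estimate the argmax of `objective` with a Gaussian policy N(mu, sigma^2).

    Only the policy mean `mu` is learned; `sigma` is kept fixed (an assumption
    made to keep the sketch short).
    """
    rng = np.random.default_rng(seed)
    mu = 0.0  # initial guess for the location of the maximum
    for _ in range(n_steps):
        # Sample a batch of candidate x values ("actions") from the policy.
        actions = rng.normal(mu, sigma, size=batch_size)
        # The reward of each action is the objective value at that point.
        rewards = objective(actions)
        # Score function: d/d(mu) log N(x; mu, sigma^2) = (x - mu) / sigma^2.
        grad_log_prob = (actions - mu) / sigma ** 2
        # Naive policy gradient estimate: mean of reward * grad log-probability.
        grad_mu = np.mean(rewards * grad_log_prob)
        # Gradient ascent step on the expected reward.
        mu += lr * grad_mu
    return mu


estimated_argmax = naive_policy_gradient()
print(f"estimated location of the maximum: {estimated_argmax:.3f}")
```

In this sketch the learned mean `mu` drifts toward the point where the objective is largest, because the expected update is the gradient of the expected reward with respect to `mu`. A Keras version would typically replace the single scalar `mu` with a small network that outputs the policy parameters, but the update rule follows the same pattern.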