Solving a Puzzle with QLearning

Say you're a bird and want to avoid pipes to get a high score. Which action would be best to take: to flap or not to flap?

1.) Mapping the agent's (bird's) position into states

Again, there may be too many states to cover, so what we will do is take the difference in height percentage between the bird and each of the two pipes.
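As a rough sketch of this kind of state reduction (not the demo's actual code), the Python below turns raw pixel heights into percentages of the screen height, takes the bird-minus-pipe difference for each pipe, and rounds it to a coarse bucket. The function name height_state, the 10-point bucket size, and the reading of "the two pipes" as the top and bottom pipe of the next gap are all assumptions made for illustration.

```python
def height_state(bird_y, top_pipe_y, bottom_pipe_y, screen_height, step=10):
    """Encode the bird's height relative to two pipes as a short state label.

    All names and the bucket size `step` are hypothetical; the original
    demo may reduce its states differently.
    """
    def pct(value):
        # Express a pixel height as a percentage of the screen height.
        return 100.0 * value / screen_height

    def bucket(diff_pct):
        # Snap the difference to the nearest multiple of `step` so that
        # nearby positions collapse into the same state.
        return int(round(diff_pct / step)) * step

    d_top = bucket(pct(bird_y) - pct(top_pipe_y))
    d_bottom = bucket(pct(bird_y) - pct(bottom_pipe_y))
    return f"{d_top},{d_bottom}"


# Two slightly different bird heights map to the same reduced state.
state_a = height_state(200, 120, 280, 400)
state_b = height_state(204, 120, 280, 400)
```

Because the state only records bucketed differences, the number of distinct states depends on the bucket size rather than on the screen resolution.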

2.) Adding state scenarios as inputs (even if the data is incomplete, that's OK, as long as we achieve the goal)

Syntax: state action reward next-state
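To make the row format concrete, here is a minimal Python sketch that reads rows of the form "state action reward next-state" and applies the standard tabular Q-learning update to them. It is an illustration under assumptions, not the demo's implementation: the action names flap and idle, the state labels, the reward values, and the alpha and gamma settings are all hypothetical.

```python
ACTIONS = ("flap", "idle")  # hypothetical action names


def parse_row(line):
    # One row: "state action reward next-state", separated by whitespace.
    state, action, reward, next_state = line.split()
    return state, action, float(reward), next_state


def train(rows, alpha=0.5, gamma=0.9, epochs=50):
    """Replay the recorded rows and build a Q-table keyed by (state, action)."""
    q = {}
    for _ in range(epochs):
        for line in rows:
            s, a, r, s2 = parse_row(line)
            # Value of the best action available in the next state.
            best_next = max(q.get((s2, b), 0.0) for b in ACTIONS)
            old = q.get((s, a), 0.0)
            # Standard Q-learning update rule.
            q[(s, a)] = old + alpha * (r + gamma * best_next - old)
    return q


def best_action(q, state):
    # Greedy choice: the action with the highest learned value.
    return max(ACTIONS, key=lambda a: q.get((state, a), 0.0))


# Hypothetical sample data; "dead" is a made-up terminal state label.
rows = [
    "20,-20 idle 1 10,-30",
    "20,-20 flap -10 dead",
    "10,-30 flap 1 20,-20",
]
q_table = train(rows)
choice = best_action(q_table, "20,-20")
```

States that never appear in the data default to a value of 0, which is why incomplete scenarios can still be enough as long as the states the bird actually visits are covered.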

Results: (Note: Press Speeds to see how far it will go)

Conclusion: We now know that our algorithm can survive Flappy Bird, provided that we give it the right data. Still, the data, although reduced, takes up 22243 bytes, or more than 34,000 rows, to produce our simple result.

Use it for a tictactoe game

Survive Flappybird with QLearning Demo
