Crossy Road AI

Comparing Learning Algorithms for the Crossy Road Game
Sajana Weerawardena, Alwyn Tan, Nick Rubin

Overview

● Crossy Road is a 3D mobile version of Frogger, and is one of the top-grossing apps for iOS and Android.
● Goal: create an intelligent agent that can (hopefully) play indefinitely.
● Infrastructure: we built our own Crossy Road simulator in Unity (code at github.com/alwyntan/Crossy-Road-AI).
● We implemented both a game tree agent and a reinforcement learning agent to compare how each performed.

Game Tree Approach

● Minimax implementation with a variable depth (a minimal sketch appears below).
● All game objects (cars, logs, trees) are treated as a single opponent that moves deterministically.
● Evaluation function:
  ○ No penalty for going backwards
  ○ -99,999 if dead
  ○ Score = score × 1/|x offset|, to discourage moving to the edge

RL Approach

● Epsilon-greedy vanilla Q-learning (sketched below).
● Our state includes:
  ○ Safety status of the 8 positions surrounding the player
  ○ Type of the previous, current, and next rows (grass, road, or water)
  ○ Direction of moving objects in the previous, current, and next rows
  ○ Total existing states: 186,624
● Our rewards:
  ○ +7 for going forwards
  ○ -8 for going backwards
  ○ -300 if dead
● To train faster, we trained on road-only, water-only, and water/grass worlds, so that more states were explored.
● We reduced epsilon as we trained.

Results

● Game tree: the agent never dies unless it is stuck with no escape.
● Reinforcement learning: still training, but slowly improving.
  ○ Total states explored so far: ~8k, roughly 4.3% of the state space.

Game tree (minimax) results:

Depth          1     2     3      4
Average Score  213   390   2692   8536

Analysis

● The minimax agent, even at a low depth like 3, achieves extremely high scores because it will always survive unless it reaches a state in which every action leads to death.
● The RL implementation isn't as successful, for a few reasons:
  ○ Crossy Road is a complex game, and our feature extractor isn't able to capture enough information about a state.
  ○ The agent has only been trained for 20 hours (~3,000 game iterations) and hasn't yet encountered the vast majority of the 186,624 states that our current feature extractor captures.
● We expect that with more training our RL agent will improve.
● Future work: function approximation, TD learning, a policy network.

Challenges

● RL: we struggled to craft a feature extractor that is robust enough in its representation of states, but not so complex that the state space becomes impossibly large. We are also still tuning the reward values.
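Illustrative Sketches

A minimal sketch of the depth-limited minimax search described above, in Python rather than the project's actual code; the state API (is_dead, score, x_offset, legal_actions, apply, advance_world) and the zero-offset guard are assumptions for illustration. Because all cars, logs, and trees are modelled as a single opponent that moves deterministically, the opponent's turn collapses to one known successor, which is why even shallow searches survive so reliably.

```python
DEATH_SCORE = -99_999

def evaluate(state):
    """Evaluation function from the poster: -99,999 for death, otherwise
    the game score damped by 1/|x offset| to discourage drifting to the
    edge of the board."""
    if state.is_dead:                 # assumed state API
        return DEATH_SCORE
    if state.x_offset == 0:           # assumed guard: poster formula is
        return state.score            # undefined at |x offset| = 0
    return state.score / abs(state.x_offset)

def minimax(state, depth):
    """Depth-limited search. The 'min' layer is not a true min node:
    the world (cars, logs, trees) advances deterministically, so each
    player action has exactly one known successor."""
    if depth == 0 or state.is_dead:
        return evaluate(state), None
    best_value, best_action = DEATH_SCORE - 1, None
    for action in state.legal_actions():          # assumed state API
        child = state.apply(action).advance_world()  # deterministic opponent
        value, _ = minimax(child, depth - 1)
        if value > best_value:
            best_value, best_action = value, action
    return best_value, best_action
```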
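To make the state-space figure concrete, here is a hypothetical feature encoding of the kind the poster describes; every name and the exact packing order are our illustration, not the project's code. It shows that 2^8 safety flags × 3^3 row types × 3^3 object directions gives exactly the 186,624 states quoted above (assuming each row's object direction takes one of three values).

```python
# Hypothetical feature extractor: 2^8 * 3^3 * 3^3 = 186,624 states.
ROW_TYPES = ("grass", "road", "water")    # type of prev/current/next row
DIRECTIONS = ("none", "left", "right")    # assumed: object movement per row

def encode_state(safe_flags, row_types, directions):
    """Pack the features into a single integer state index.

    safe_flags: 8 booleans, safety of the 8 tiles around the player
    row_types:  3 entries from ROW_TYPES (previous, current, next row)
    directions: 3 entries from DIRECTIONS (previous, current, next row)
    """
    index = 0
    for flag in safe_flags:                  # 2^8 = 256 combinations
        index = index * 2 + int(flag)
    for row in row_types:                    # 3^3 = 27 combinations
        index = index * 3 + ROW_TYPES.index(row)
    for direction in directions:             # 3^3 = 27 combinations
        index = index * 3 + DIRECTIONS.index(direction)
    return index

# 256 * 27 * 27 = 186,624 distinct states, matching the poster.
assert 2**8 * 3**3 * 3**3 == 186_624
```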
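Finally, a sketch of the epsilon-greedy Q-learning loop. The reward values (+7 / -8 / -300) and the idea of decaying epsilon over training come from the poster; the action set, learning rate, discount factor, decay rate, and the env interface are placeholders, not the authors' settings.

```python
import random
from collections import defaultdict

ACTIONS = ("forward", "back", "left", "right")   # assumed action set

# Rewards from the poster; alpha, gamma, and the decay schedule are assumptions.
REWARD_FORWARD, REWARD_BACK, REWARD_DEAD = 7, -8, -300
ALPHA, GAMMA = 0.1, 0.9

Q = defaultdict(float)                           # Q[(state, action)] -> value

def choose_action(state, epsilon):
    """Epsilon-greedy: explore with probability epsilon, else exploit."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def q_update(state, action, reward, next_state):
    """Vanilla Q-learning update toward the one-step bootstrapped target."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

def train(env, episodes=3000, epsilon=1.0, decay=0.999):
    """Training loop; env.reset()/env.step() is an assumed simulator API
    returning the encoded state, the shaped reward, and a done flag."""
    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            action = choose_action(state, epsilon)
            next_state, reward, done = env.step(action)
            q_update(state, action, reward, next_state)
            state = next_state
        epsilon *= decay                         # reduce epsilon as we train
```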
