Automated Playtesting of Platformer Games Using Reinforcement Learning


A thesis presented to the academic faculty in partial fulfillment of the requirements for the degree of Master of Science in Game Science and Design in the College of Arts, Media and Design

by Varun Sriram

Thesis Committee
Thesis Advisors: Dr. Giovanni Troiano
Committee Members: Dr. Christoffer Holmgård, Dr. Casper Harteveld, Dr. Magy Seif El-Nasr

Northeastern University
Boston, Massachusetts
December 2019

Abstract

Platformer games are popular in the video game industry, and their design requires considerable effort from game companies. As part of the design process, playtesting is key to improving the gameplay before a game's release. Playtesting is the quality assurance phase of the game development cycle in which people are hired to play the game, report bugs, and provide feedback on its playability. This feedback can be used for game balancing (the process of tuning game rules so that they are neither ineffective nor produce undesirable results). However, playtesting can be expensive if done manually and may require several iterations, which costs game companies both money and time. In this thesis, we investigate a way to automatically playtest 2D platformer levels using a combination of deep reinforcement learning and curriculum learning, for both quality assurance and game balancing. Deep reinforcement learning has contributed greatly to game playing (Atari and Dota 2), and in this thesis we try to carry those results over to playtesting. Curriculum learning is an approach that has shown promising results, so we explore it here as well. We develop our automated playtesting (APT) tool by training an artificial intelligence (AI) agent on several different platformer levels following a curriculum, and use the trained agent to playtest newly created levels.
Our APT tool was able to identify areas of the level that needed design improvements and further gameplay balancing. We contribute a reliable APT tool for designers who wish to easily design 2D platformer games, and a discussion of how our results extend to APT at large.

Keywords: 2D Platformer Games, Quality Assurance, Automated Playtesting, Deep Reinforcement Learning, Curriculum Learning

Table of Contents
List of Tables
List of Figures
List of Equations
Acknowledgements
1. Introduction
2. Background
3. Methodology
4. Results
5. Discussion
6. Conclusion
References
Appendix

List of Tables
Table 1. Hyperparameters for Neural Network
Table 2. Reward Structure for the agent
Table 3. Damage taken by agent
Table 4. Agent Death

List of Figures
Figure 1. Huge map system in Ori and the Blind Forest
Figure 2. Turrets and Spike Pits in Ori and the Blind Forest
Figure 3. Markov Reward Process
Figure 4. A Neuron
Figure 5. A Neural Network
Figure 6. Raycasting
Figure 7. Level with spike pits
Figure 8. Level with enemy AI
Figure 9. Level with both spike pits and enemy
Figure 10. Unseen level for testing
Figure 11. Unseen level with player heatmaps
Figure 12. Heatmaps for random agent on unseen level
Figure 13. Coins in Sonic the Hedgehog
Figure 14. “Spirit light” is the equivalent of coins and acts as a currency to unlock new abilities
Figure 15. Special moves in Ori and the Blind Forest
Figure 16. Ability tree in Ori and the Blind Forest

List of Equations
Equation 1. Equation for Markov Reward Process
Equation 2. Bellman Equation

ACKNOWLEDGEMENTS

Firstly, I would like to thank my committee members Dr. Christoffer Holmgård, Dr. Casper Harteveld, Dr. Giovanni Troiano and Dr. Magy Seif El-Nasr for their guidance and help, which eased the thesis-writing process.
Thank you, Christoffer, for introducing me to artificial intelligence and for guiding me, answering my endless list of queries, and providing constructive criticism wherever needed. I would like to thank my family for believing in me and in game science as an important field regardless of its application, and for helping me pursue my dreams. Special thanks to all my friends, here in the USA and in India, who helped me throughout my Master's journey with their constant support.

INTRODUCTION

Playtesting is an important part of the game development cycle. It provides feedback on the “playability” of the game. In the game industry, the quality assurance (QA) process involves hiring human playtesters to play the game, report bugs, and provide feedback on its playability. Game development is an iterative process, and a game is released when it is balanced and has almost no bugs (it is unrealistic to assume that a game will have no bugs at all). For this iterative process to function, there is a constant need for human playtesters, which costs money and time. One way to cut QA costs is to automate the playtesting process, leaving only a minimal need for human playtesters.

Machine learning (a branch of artificial intelligence) is used in playtesting applications (PTA; Gudmundsson et al., 2018); this approach is often referred to as automated playtesting (APT). In APT, pre-trained artificial intelligence (AI) agents “play” the game, test for bugs, and give game designers feedback on the balance of game mechanics, thereby providing QA (Pfau, Smeddinck, & Malaka, 2017).

Platformer games (PGs) are among the most popular types of video games (Galyonkin, 2019) and include games like Mario Bros, Sonic the Hedgehog, and Crash Bandicoot. In PGs, the main task of the player is usually to jump between obstacles, move and jump from one platform to another, and avoid or shoot enemies.
A PG is a combination of various design patterns, including collectibles, mechanics, and power-ups (Khalifa, de Mesentier Silva, & Togelius, 2019), which developers can use to create a vast number of unique levels. However, analyzing design patterns (Smith, Cha, & Whitehead, 2008) and their combinations in order to balance the gameplay (Spencer, 1977) is hard. PGs can be 2D or 3D, depending on the dimensionality of the game objects used: 3D games have x, y, and z axes, and ignoring the z axis results in a 2D PG. In this thesis, we focus on 2D PGs.

This thesis uses APT in the context of PGs, with the end goal of helping game designers playtest and design their PGs effectively and rapidly. We develop a PTA that automatically plays and tests premade PG levels. The PTA gives developers feedback on the difficulty and the degree of balance of their game design, so that they can playtest their games with respect to these attributes. This system could also be useful to students who aspire to develop platformer games, to independent game developers, and for research in game artificial intelligence.

BACKGROUND

This thesis investigates APT in PGs using machine learning, and deep reinforcement learning in particular. APT has been researched extensively: King (Gudmundsson et al., 2018) used supervised learning, EA (Zhao et al., 2019) used inverse reinforcement learning, Mugrai, Silva, Holmgard, and Togelius (2019) used Monte Carlo tree search with genetic evolution, and García-Sánchez, Tonda, Mora, Squillero, and Merelo (2018) used genetic evolution. All of these papers have contributed good results to the field of APT.
Curriculum learning (CL) (Bengio, Louradour, Collobert, & Weston, 2009) is a way of training a machine learning model in which more difficult aspects of a problem are introduced gradually, so that the model or agent is always challenged at an appropriate level. In this way, I can train AI models that are well versed in different aspects of their environment, because the problems are presented to the agent along a proper difficulty curve (Aponte, Levieux, & Natkin, 2009).

Deep reinforcement learning has been used to play games such as Atari (Mnih et al., 2013) by DeepMind and Dota 2 (OpenAI, 2018) by OpenAI. DeepMind's AI agents have learned to play Atari games successfully, and OpenAI's Dota 2 agents have defeated the reigning world champions (Peng & Sarazen, 2019). This suggests that deep reinforcement learning could be used not just to play games, but to playtest them as well.

To explore new possibilities and set baselines, in this thesis I position my work in the area of APT, with machine learning as the approach, and in particular deep reinforcement learning combined with curriculum learning. Next, I briefly review previous work in both areas, discuss the challenges that earlier work encountered when developing PG levels, and explain how APT can help tackle them.

Level Design in Platformer Games

Consider a 2D PG like Ori and the Blind Forest (Moon Studios, 2015). It consists of a complex map system, which includes PG mechanics such as jumping to and from platforms, as well as solving puzzles. Ori, the protagonist, can face enemies (e.g., turrets, melee frogs, porcupines with long-range projectiles) and collect health shards, energy shards, and many other special items (e.g., snow orb, keys for doors). The presence of so many elements gives rise to many mechanics, so there is no limit to the number of unique levels one can design.
Thus, by automating the testing process, I am able to test a vast number of levels in a short period of time, making playtesting more scalable and less expensive, with only minimal human supervision required.

Fig 1. An example of the map system from Ori and the Blind Forest

Fig 2. Turrets, spikes, platforms and blue crystal pickups in Ori and the Blind Forest

Automated Playtesting in Platformer Games

APT is used in games to replace human testers with pre-trained AI agents. The agent provides QA regarding the playability of the level. The advantage of APT is that it reduces time and money expenses compared with human testers. In this thesis, we explore the use of APT for the design of PGs.
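
The curriculum learning idea from the Background section, in which harder aspects of a problem are introduced gradually, can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used in this thesis: the level names are hypothetical labels loosely based on the training levels named in the List of Figures, and the `Curriculum` class, the `train` function, the `run_episode` callback, and the promotion rule (advance once the success rate over a sliding window reaches a threshold) are all illustrative inventions. A real setup would replace `run_episode` with an actual reinforcement learning rollout.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Curriculum:
    """Orders training levels from easy to hard and advances on mastery."""
    levels: List[str]            # hypothetical level names, easiest first
    promote_at: float = 0.8      # assumed success-rate threshold
    window: int = 5              # assumed number of recent episodes to average
    stage: int = 0
    recent: List[bool] = field(default_factory=list)

    @property
    def current_level(self) -> str:
        return self.levels[self.stage]

    def record(self, success: bool) -> None:
        """Log one episode outcome; move to the next level once mastered."""
        self.recent = (self.recent + [success])[-self.window:]
        window_full = len(self.recent) == self.window
        mastered = window_full and sum(self.recent) / self.window >= self.promote_at
        if mastered and self.stage < len(self.levels) - 1:
            self.stage += 1
            self.recent = []     # start a fresh window on the harder level


def train(curriculum: Curriculum,
          run_episode: Callable[[str], bool],
          episodes: int) -> List[str]:
    """Run `episodes` episodes and return the level used for each one."""
    history = []
    for _ in range(episodes):
        level = curriculum.current_level
        history.append(level)
        curriculum.record(run_episode(level))
    return history


# Usage: a stand-in "agent" that always succeeds, so it is promoted
# after every full window of five episodes.
curriculum = Curriculum(levels=["spike_pits", "enemy_ai", "spike_pits_and_enemy"])
history = train(curriculum, run_episode=lambda level: True, episodes=12)
print(history[0], history[5], history[-1])
```

Gating promotion on a sliding window rather than on a single success is one simple way to keep the agent on a level until it handles that level consistently; other schedules (fixed step counts, reward thresholds) would fit the same interface.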