Training Agents to Playtest Modern Games

Winning Isn’t Everything: Training Agents to Playtest Modern Games

Igor Borovikov∗†, Yunqi Zhao∗†, Ahmad Beirami∗†, Jesse Harder†, John Kolen†, James Pestrak†, Jervis Pinto†, Reza Pourabolghasem†, Harold Chaput†, Mohsen Sardari†, Long Lin†, Navid Aghdaie†, Kazi Zaman†

∗ These authors contributed equally to this work. Contact: {iborovikov, yuzhao, abeirami}@ea.com.
† EA Digital Platform – Data & AI, Electronic Arts, 209 Redwood Shores Pkwy, Redwood City, CA 94065, USA.
Copyright © 2019, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Abstract

Recently, there have been several high-profile achievements of agents learning to play games against humans and beat them. We propose an approach that instead addresses how the player experiences the game, which we consider to be a more challenging problem. In this paper, we present an alternative context for developing AI that plays games. Specifically, we study the problem of creating intelligent game agents in service of the development processes of the game developers that design, build, and operate modern games. We highlight some of the ways in which we think intelligent agents can assist game developers to understand their games, and even to build them. Our main contribution is to propose a learning and planning framework that is uniquely tuned to the environment and needs of modern game engines, developers and players. Our game agent framework takes a few steps towards addressing the unique challenges that game developers face. We discuss some early results from an initial implementation of our framework.

Artificial intelligence; artificial game agent; reinforcement learning; imitation learning; deep learning; model-based learning; game design; game playtesting; non-player character (NPC).

Introduction

The history of artificial intelligence (AI) can be mapped by its achievements playing and winning various games. From the early days of Chess-playing machines to the most recent accomplishments of Deep Blue and AlphaGo, AI has advanced from competent, to competitive, to champion in even the most complex games. Games have been instrumental in advancing AI, and most notably in recent times, reinforcement learning (RL).

IBM Deep Blue was the first AI agent that beat the chess world champion, Garry Kasparov (Deep Blue 1997). A decade later, Monte Carlo Tree Search (MCTS) (Coulom 2006; Kocsis and Szepesvári 2006) was a big leap in solving games. MCTS agents for playing Settlers of Catan were reported in (Szita, Chaslot, and Spronck 2009; Chaslot et al. 2008) and shown to beat previous heuristics. Other work compares multiple approaches of agents to one another in the game Carcassonne on the two-player variant of the game and discusses variations of MCTS and Minimax search for playing the game (Heyden 2009). MCTS has also been applied to the game of 7 Wonders (Robilliard, Fonlupt, and Teytaud 2014) and Ticket to Ride (Huchler 2015).

Recently, DeepMind researchers demonstrated that deep neural networks (DNNs) combined with MCTS could lead to AI agents that play Go at a super-human level (Silver et al. 2016), and solely via self-play (Silver et al. 2017; Silver et al. 2018). Subsequently, OpenAI researchers showed that AI agents could learn to cooperate at a human level in Dota 2 (OpenAI Five 2018). The impressive recent progress on RL to solve games is partly due to the advancements in processing power and AI computing technology.¹ Further, deep Q networks (DQNs) have emerged as a general representation learning framework from the pixels in a frame buffer combined with Q function approximation without need for task-specific feature engineering (Mnih et al. 2015).

¹ The amount of AI compute has been doubling every 3-4 months in the past few years (AI & Compute 2018).

The design of a DQN and setting the hyperparameters is still a daunting task. In addition, it takes hundreds of thousands of state-action pairs for the agent to reach human-level performance. Applying the same techniques to modern games would require obtaining and processing even more state-action pairs, which is infeasible in most cases because speeding up the game engine may not be possible and the game state may be difficult to infer from the frame buffer. The costs associated with such an approach may be too high for many applications, not justifying the benefits.

On modern strategy games, DeepMind and Blizzard showed that existing techniques fall short even on learning the rules of StarCraft II (Vinyals et al. 2017). While breaking the state space and action space into a hierarchy of simpler learning problems has been shown to be promising (Vinyals et al. 2017; Pang et al. 2018), additional complications arise when building agents for a modern game that is still under development. Some of these challenges are:

1. The game state space is huge, with continuous attributes, and is only partially observable to the agent.
2. The set of available actions is huge, parametric, partly continuous, and potentially unknown to the agent, rendering a direct application of MCTS infeasible.
3. The game itself is dynamic in the design and development stage, and multiple parameters and attributes (particularly related to graphics) may change between different builds.
4. The games are designed to last tens of millions of timesteps,² leading to potentially long episodes, and the manner in which the player engages with the game environment impacts the gameplay strategy.³
5. Multiple players may interact in conflict or cooperation, leading to an exploding state space and non-convergence issues due to invalidation of the Markov assumption in a multi-agent learning environment.
6. Winning isn’t everything, and the goal of the agent could be to exhibit human-like behavior/style to better engage human players, making it non-trivial to design a proper rewarding mechanism.

² A timestep is the unit time where the agent takes an action, its observation is updated, and a reward signal is

The idea of using AI techniques to augment game development and playtesting is not new. Algorithmic approaches have been proposed to address the issue of game balance, in board games (De Mesentier Silva et al. 2017; Hom and Marks 2007) and card games (Krucher 2015; Mahlmann, Togelius, and Yannakakis 2012). More recently, (Holmgard et al. 2018) builds a variant of MCTS to create a player model for AI Agent based playtesting. These techniques are relevant to creating rewarding mechanisms for mimicking player behavior.

Another line of research focuses on investigating approaches where AI and machine learning can play the role of a co-designer, making suggestions during development (Yannakakis, Liapis, and Alexopoulos 2014). Tools for creating game maps (Liapis, Yannakakis, and Togelius 2013) and level design (Smith, Whitehead, and Mateas 2010; Shaker, Shaker, and Togelius 2013) are also proposed. Other approaches have explored AI for designing new games. (Browne and Maire 2010) generates entirely new abstract games by means of evolutionary algorithms. (Salge and Mahlmann 2010) relates the game design to the concept of relevant information. (Smith, Nelson, and Mateas 2010) addresses the behavior emanating from the design by having an engine capable of recording play traces. (Treanor et al. 2015) proposes an ideation technique to embed design patterns in AI based game design. (Zhu, Wang, and Zyda 2018) uses a measure for similarity between game events to transfer different levels across games. See (Togelius et al. 2011; Summerville et al. 2018) for a survey of these techniques in game design.

In the next section, we make the case for using AI as more of a tool to help designers tune their game, rather than to build an agent with super-human performance or to create a game (or a part thereof).

Playtesting Game Agents

While achieving optimal super-human gameplay using modern RL techniques is impressive, our goal is to train agents that can help game designers ensure their game provides players with optimal experiences. As it is not obvious how to define a reward function that abstracts an optimal experience, this problem does not necessarily lend itself to a traditional RL formulation. For instance, we considered the early development of The Sims Mobile, whose gameplay is about “emulating life”: players create avatars, called Sims, and conduct them through a variety of everyday activities. In this game, there is no single predetermined goal to achieve. Instead, players craft their own experiences, and the designer’s objective is to evaluate different aspects of that experience.

To validate their design, game designers conduct playtesting sessions. Playtesting consists of having a group of players interact with the game in the development cycle to not only gauge the engagement of players, but also to discover elements and states that result in undesirable outcomes. As a game goes through the various stages of development, it is essential to continuously iterate and improve the relevant aspects of the gameplay and its balance. Relying exclusively on playtesting conducted by humans can be costly and inefficient. Artificial agents could perform much faster play sessions, allowing the exploration of much more of the game space in much shorter time. This becomes even more valuable as game worlds grow large enough to hold tens of thousands of simultaneously interacting players. Games at this scale render traditional human playtesting infeasible.

Recent advances in the field of RL, when applied to playing computer games (e.g., (OpenAI Five 2018; Mnih et al. 2015; Vinyals et al. 2017; Harmer et al. 2018)), assume that the purpose of a trained artificial agent (“agent” for short) is to achieve the best possible performance with respect to clearly defined rewards while the game itself remains fixed for the foreseeable future.
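
The contrast between reward maximization and playtesting can be illustrated with a minimal sketch, assuming a toy grid game in place of a real build. The loop advances the game one timestep at a time, ignores rewards entirely, and instead reports how much of the state space was explored and which states trapped the player. `ToyGame`, `playtest`, the grid layout, the planted "stuck" cell, and the random policy are all hypothetical illustrations and are not part of the framework proposed in the paper.

```python
import random
from dataclasses import dataclass

# Hypothetical action set: move right, left, up, down on a grid.
ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


@dataclass
class ToyGame:
    """Hypothetical stand-in for a game build: a grid with one 'stuck' cell."""
    size: int = 5
    stuck_cell: tuple = (4, 4)  # assumed design flaw the playtest should surface
    pos: tuple = (0, 0)

    def reset(self):
        """Start a new episode and return the initial observation."""
        self.pos = (0, 0)
        return self.pos

    def step(self, action):
        """Advance one timestep: apply the action, return the new observation."""
        if self.pos == self.stuck_cell:
            return self.pos  # the player can never leave this cell
        dx, dy = action
        x = min(max(self.pos[0] + dx, 0), self.size - 1)
        y = min(max(self.pos[1] + dy, 0), self.size - 1)
        self.pos = (x, y)
        return self.pos


def playtest(game, episodes=200, max_steps=50, seed=0):
    """Run a random agent and report coverage and trapped states, not a score."""
    rng = random.Random(seed)
    visited = set()
    moved = {}  # state -> {action: did the state change?}
    for _ in range(episodes):
        obs = game.reset()
        visited.add(obs)
        for _ in range(max_steps):
            action = rng.choice(ACTIONS)
            nxt = game.step(action)
            moved.setdefault(obs, {})[action] = (nxt != obs)
            obs = nxt
            visited.add(obs)
    # A state is reported as stuck only if every action was tried and none moved.
    stuck = {
        s for s, tried in moved.items()
        if len(tried) == len(ACTIONS) and not any(tried.values())
    }
    return {"coverage": len(visited) / game.size ** 2, "stuck_states": stuck}


if __name__ == "__main__":
    print(playtest(ToyGame()))
```

In this sketch a designer would read the report rather than a score: low coverage suggests content that play rarely reaches, and any reported stuck state points at a defect to fix before the next build.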