The Undergraduate Games Corpus: a Dataset for Machine Perception of Interactive Media
Total Page:16
File Type:pdf, Size:1020Kb
The Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21) The Undergraduate Games Corpus: A Dataset for Machine Perception of Interactive Media Barrett R. Anderson, Adam M. Smith Design Reasoning Lab, University of California Santa Cruz [email protected], [email protected] Abstract of machine perception techniques that directly address the unique features of interactive media. Machine perception research primarily focuses on process- ing static inputs (e.g. images and texts). We are interested This paper contributes a new, annotated corpus of 755 in machine perception of interactive media (such as games, games. We provide full game source code and assets, in apps, and complex web applications) where interactive au- many cases licensed for reuse, that were collected with in- dience choices have long-term implications for the audience formed consent for their use in AI research. In comparison experience. While there is ample research on AI methods for to lab-created game clones intended as AI testbeds, these the task of playing games (often just one game at a time), this games are complete works intended as human experiences, work is difficult to apply to new and in-development games and they represent a range of design polish rather than being or to use for non-playing tasks such as similarity-based re- filtered to only commercially successful games. By offering trieval or authoring assistance. In response, we contribute a games for a variety of platforms (from micro-games in Bitsy corpus of 755 games and structured metadata, spread across and text-oriented stories in Twine to the heavyweight games several platforms (Twine, Bitsy, Construct, and Godot), with full source and assets available and appropriately licensed built in the Godot Engine), we set up future research with a for use and redistribution in research. Because these games wide range of technical challenges. were sourced from student projects in an undergraduate game In the following sections, we place machine perception of development program, they reference timely themes in their interactive media into the context of emerging work on in- content and represent a variety of levels of design polish formation retrieval for interactive media, humanist theories rather than only representing past commercial successes. This of the meaning of interactive media, and automated game- corpus could accelerate research in understanding interactive playing. We then compare our corpus with other sources of media while anchoring that work in freshly-developed games games and game-like artifacts, and quantitatively character- intended as legitimate human experiences (rather than lab- ize the scale and variety of the corpus. To demonstrate the created AI testbeds). We validate the utility of this corpus by usefulness of just one slice of this data (interactive narra- setting up the novel task of predicting tags relevant to the player experience from the game source code, showing that tives), we setup a metadata tag prediction task and compare representations that better exploit the structure of the media the performance of a system that reduces an interactive story outperform a text-only baseline. to plain text with one that exploits knowledge of the connec- tivity between scenes by player selected choices. The com- plete data and metadata for the new corpus are available at Introduction https://github.com/barrettrees/undergraduate games corpus. There is an extensive history of research into machine per- ception for many kinds of non-interactive media (e.g. text, Background images, video), but interactivity remains comparatively un- Towards a future of search engines that could search within derstudied. Interactivity is an important component of many as well as across games, Zhang et al. recently introduced modern media, including apps, complex web applications, the challenge of crawling, indexing, and retrieving moments and games. Games in particular are almost pathologically in- from the vast space of interactivity contained within each teractive. Many games are not just incidentally but primarily game (Zhang et al. 2018). This requires representing and interactive. Further, the meaning of text, images, and other reasoning about what makes one moment different from an- observations made while interacting with a game can be con- other or more or less relevant to a user-provided query. Pre- ditioned on the activity leading up to those observations. A liminary work in this area made progress by representing screen that reads “game over, try again?” comments on the moments as vectors in a space where proximity of points in players’s recent choices and suggests the possibility of alter- space approximated their semantic similarity. By training a nate outcomes with different choices. We conjecture that a neural network to predict the state of a videogame’s work- large collection of games could accelerate the development ing memory from the pixels in the screenshot of a moment, Copyright c 2021, Association for the Advancement of Artificial the Pix2Mem technique induces an intermediate vector rep- Intelligence (www.aaai.org). All rights reserved. resentation that implicitly represented spatial and temporal 3 aspects of moments (Zhan and Smith 2018). As with word Related Corpora vectors in natural language processing (Mikolov, Yih, and Partlan et al. (2019) recently proposed a formal represen- Zweig 2013), this representation allowed reasoning by anal- tation for interactive narratives that distinguished scenes ogy using simple operations on vectors (Zhang et al. 2018). within a larger story. Significantly, their representation was Variations on this vector representation strategy have been automatically constructable from a game’s definition (by explored. For example, search with open-vocabulary, natu- running something like exhaustive symbolic execution of ral language queries for game moments containing recog- the game’s scripting logic) and validated by an expert inter- nizable objects or having topics mentioned in character dia- view study designed to highlight structural features of nar- log was demonstrated by fusing information from visual and ratives that had human-level significance. This work was in auditory observations (Zhang and Smith 2019). This work turn based on a collection of 20 student-authored interac- (and past work like Pix2Mem) reduced perception of inter- tive narratives (Partlan et al. 2018) built on the StudyCrafter active media to perception of linear streams of images, au- (Harteveld et al. 2016) platform. dio, and text (potentially augmented with software execution Our contribution of the Undergraduate Games Corpus state annotations). To date, little work outside of automated aims to accelerate research on the development of percep- gameplaying (addressed shortly) has built machine percep- tual representations of interactive media (comparable within tion systems that model the semantics of observations as be- and across individual works) in a way that is anchored in ing conditioned on past player choices. human meaning-making processes and the work of a diverse Rooted in the humanistic field of game studies, choice collection of authors. poetics offers a theory of how authors of interactive media Several corpora of games (like the 20 StudyCrafter sto- build meaning through the structure of the choices they of- ries mentioned previously) exist and in some cases have al- fer to the audience (Mawhorter et al. 2018). Patterns in the ready been used in AI research. A corpus of all known his- framing of options and how they are linked to outcomes can torical games (130 total) for the ZX Spectrum version of construct simple “dead-end” situations that encourage the the Graphic Adventure Creator, a 1986 game development player to backtrack in search of a more desirable outcome tool advertised to nonprofessionals, was published (Aycock or more subtle “unchoice” situations where the fact that all and Biittner 2020) while this article was under review. In choices eventually lead to the same outcome can convey in- the history of AI, the effort to build one system (or at least evitability or even convey reluctance of the story’s protago- devise one technique) that could competently interact with nist to do what the player wants. multiple games—otherwise known as general game playing The related theory of procedural rhetoric (Bogost 2007) (GGP)—prompted Pell’s Metagamer (1996), a system that attempts to capture how games convey meaning and build could play any symmetric, chess-like games that could be arguments even through potentially very abstract systems. modeled in the SCL-Metagame language. Pell’s dissertation The rules governing the production and consumption of re- (1993) included a collection of such games in the appendix. sources under the influence of player actions, for example, This thread was followed by the commercial release of the can be used to convey subtle and culture-dependent mes- Zillions of Games “universal gaming engine” (Lefler and sages even what the player directly observes is just colored Mallett 1998-2020) which allowed competitive human play rectangles moving in straight lines. Proceduralist readings (including manipulation of artist created pieces) against an (Treanor et al. 2011) offer one way of analyzing meaning in AI opponent able to play hundreds of bundled board games games employing graphical logics, and they have been ap- with the software. As of the time of publication, there are plied to the interpretation of newsgames (which function like 2995 free add-on games available for the ZOG engine.1 political cartoons) (Treanor and Mateas 2009). Efforts to op- Academically, GGP research led to the development of erationalize proceduralist readings in AI systems (Martens the GDL (Love et al. 2006) and GDL-II (Thielscher 2010) et al. 2016) have thus far worked by taking a static game de- game description languages. Collections of games in these scription (analogous to a videogame’s source code) as input. formats are made to evaluate agents submitted to annual Many automated gameplaying systems, such as Alp- GGP competitions (Genesereth and Bjornsson¨ 2013). These haZero (Silver et al.