Chapter 12 Game Data Mining

Anders Drachen, Christian Thurau, Julian Togelius, Georgios N. Yannakakis, and Christian Bauckhage

Take Away Points:

1. The data revolution in games – and everywhere else – calls for analysis methods that scale with dataset size. The solution: game data mining.
2. Game data mining deals with the challenges of acquiring actionable insights from game telemetry.
3. Read the chapter for an introduction to game data mining, an overview of methods commonly and not so commonly used, examples, case studies and a substantial amount of practical advice on how to employ game data mining effectively.

A. Drachen, Ph.D. (*): PLAIT Lab, Northeastern University, Boston, MA, USA; Department of Communication and Psychology, Aalborg University, Aalborg, Denmark; Game Analytics, Copenhagen, Denmark. e-mail: [email protected]
C. Thurau, Ph.D.: Game Analytics, Refshalevej 147, Copenhagen K 1422, Denmark
J. Togelius, Ph.D.: Center for Computer Games Research, IT University of Copenhagen, Rued Langgaards Vej 7, Copenhagen S 2300, Denmark
G. N. Yannakakis, Ph.D.: Department of Digital Games, University of Malta, Msida MSD 2080, Malta; Center for Computer Games Research, IT University of Copenhagen, Rued Langgaards Vej 7, Copenhagen S 2300, Denmark
C. Bauckhage, Ph.D.: Fraunhofer Institute Intelligent Analysis and Information Systems IAIS, Schloss Birlinghoven, 53754 Sankt Augustin, Germany; Bonn-Aachen International Center for Information Technology B-IT, Dahlmannstraße 2, D-53113 Bonn, Germany


12.1 Introduction

During the years of the Information Age, technological advances in computers, satellites, data transfer, optics, and digital storage have led to the collection of an immense mass of data on everything from business to astronomy, counting on the power of digital computing to sort through the amalgam of information and generate meaning from the data. Initially, in the 1970s and 1980s, data were stored on disparate structures and very rapidly became overwhelming. The initial chaos led to the creation of structured databases and database management systems to assist with the management of large corpuses of data, and notably, the effective and efficient retrieval of information from databases. The rise of the database management system increased the already rapid pace of information gathering.

In later years, a virtually exponential increase in the availability of data has emerged in fields across industry and research, such as bio-informatics, social network analysis, computer vision and not least digital games. Today, far more data is available than can be handled directly: business transaction data, scientific data from, e.g., telescopes, satellite pictures, text reports, military intelligence and digital games (Berry and Linoff 1999; Han et al. 2005; Larose 2004; Kim et al. 2008; Isbister and Schaffer 2008; Mellon 2009; Drachen and Canossa 2009; Bohannon 2010).

Modern digital games range from simple applications to incredibly sophisticated information systems, but common to all of them is the need to keep track of the actions of players and calculate a response to them, as is discussed in most chapters in this book. In recent years, the tracking and logging of this information – termed telemetry data in computer science – as well as data on the technical performance of the game engines and applications themselves, has become widespread in the digital entertainment industry, leading to a wealth of incredibly detailed information about the behavior of – in the case of major commercial titles – millions of players and installed clients (Mellon 2009; King and Chen 2009; Drachen and Canossa 2011).

Applied right, game telemetry can be a very powerful tool for game development (Kim et al. 2008; Isbister and Schaffer 2008; King and Chen 2009): not only for analyzing and tuning games, QA, testing and monitoring infrastructure (Mellon 2009), figuring out and correcting problems and generally learning about effective game design, but also for guiding marketing, strategic decision making, technical development, customer support, etc. However, it is generally far from obvious how to employ the analysis (Kim et al. 2008; Mellon 2009): what data should we record, how can we analyze it, and how should it be presented to facilitate the effective transformation of raw data into knowledge that is fully integrated into the organization?

Narrowing the focus to behavioral game telemetry, i.e. telemetry data about how people play games (Chap. 2), there are a wide variety of ways that this kind of game telemetry data can be employed to assist a variety of stakeholders (as discussed in Chap.
3), during and following the development process, e.g., informing game designers about the effectiveness of their design via user modeling, behavior analysis, matchmaking and playtesting, something that is evident from the range of publications and presentations across academia and industry, including: Kennerly (2003), Hoobler et al. (2004), Houlette (2004), Charles and Black (2004), Thurau et al. (2004, 2007, 2011), Ducheneaut and Moore (2004), DeRosa (2007), Thompson (2007), Kim et al. (2008), Isbister and Schaffer (2008), Thawonmas and Iizuka (2008), Thawonmas et al. (2008), Coulton et al. (2008), Drachen et al. (2009, 2012), Missura and Gärtner (2009), Williams et al. (2009, 2008), Thurau and Bauckhage (2010), Pedersen et al. (2010), Yannakakis and Hallam (2009), Weber and Mateas (2009), Mellon (2009), Seif El-Nasr and Zammitto (2010), Seif El-Nasr et al. (2010), Thurau and Drachen (2011), Yannakakis and Togelius (2011), Moura et al. (2011), Erfani and Seif El-Nasr (2011), Drachen and Canossa (2011), Yannakakis (2012), Bauckhage et al. (2012) and Gagne et al. (2012).

There is a wealth of information hidden in good game telemetry data, but not all of it is readily available, and some of it is very hard to discover without the proper expert knowledge (or even with it). This has led to much game telemetry data being tracked, logged and stored, but not analyzed and employed. The challenge faced by the game industry in taking advantage of game telemetry data mirrors the general challenge of working with large-scale data. Simply retrieving information from databases – irrespective of the field of application – is not enough to guide decision-making. Instead, new methods have emerged to assist analysts and decision makers in obtaining the information they need to make better decisions. These include: automatic summarization of data, the extraction of the essence of the stored information, and the discovery of patterns in raw data.

When datasets become very large (we consider any dataset that does not fit into the memory of a high-end PC as large-scale, i.e. several GB and beyond) and complex, many traditional methodologies and algorithms used on smaller datasets break down. In fact, any algorithm that scales quadratically with the number of samples is not feasible to use on large data. Instead, methods designed for large datasets must be employed. These methods are collectively referred to as data mining. Data revolutions call for novel analysis methods that scale to massive data sizes and allow for effective, rapid analysis, as well as results that are intuitively accessible to non-experts (Han et al. 2005; Larose 2004). Using data mining methods in the context of game telemetry data – what we will here call game data mining – we can for example:

• Find weak spots in a game's design (Chap. 7; Thompson 2007; Kim et al. 2008; Drachen and Canossa 2009; Gagne et al. 2012)
• Figure out how to convert non-paying to paying users (Chap. 4; King and Chen 2009)
• Discover geographical patterns in our player community
• Figure out how players spend their time when playing (Chaps. 18 and 19; DeRosa 2007; Moura et al. 2011; Drachen et al. 2012)
• Discover gold farmers in MMORPGs (Thawonmas et al. 2008)

• Explore how people play a game (Chap. 14; Drachen and Canossa 2009)
• Find out how much time they spend playing (Williams et al. 2009)
• Predict when they will stop playing (Mahlman et al. 2010; Bauckhage et al. 2012)
• Predict what they will do while playing (Weber and Mateas 2009)
• Identify which assets are not getting used (Chap. 14)
• Develop better AI-controlled opponents or make games that adapt to the player (Charles and Black 2004; Thurau et al. 2007; Missura and Gärtner 2009; Pedersen et al. 2010; Yannakakis and Hallam 2009)
• Explore and use social grouping (Thurau and Bauckhage 2010)

– and much, much more.

This chapter will outline how large-scale data mining can be effectively carried out on game telemetry data (i.e. telemetry from game clients/game servers, which can include data on players), and cover a range of scenarios, from behavior analysis of individual players and how they give rise to patterns, to interpretation of larger scale structures like guilds in massively multiplayer online games. To begin with, we will provide an introduction to data mining in general and game data mining specifically, good data mining practices and methods, as well as notes on tools and challenges in game data mining. This should by no means be viewed as a thorough introduction to data mining – that requires an entire book. Fortunately such books exist; for example, Han et al. (2005) is a good starting place for the novice data miner. Following this, the major categories of data mining will be outlined and a number of case studies used to exemplify some of the commonly used game data mining approaches, covering supervised and unsupervised methods, with examples from, e.g., World of Warcraft (Blizzard, 2003), Tomb Raider: Underworld (2008, Eidos Interactive) and Heroes of Newerth (2010, S2 Games), as well as case examples obtained from concrete production contexts. As the interpretability of data analysis results is important, we focus on methods that go well beyond descriptive techniques to provide more meaningful, useful and actionable data representations, enabling the analysis of player behavior across millions of individuals.

On a practical note, the first half of this chapter (Sects. 12.1 and 12.2) is written for the general audience and does not require previous knowledge of data mining. The second half (Sects. 12.3 and 12.4), which focuses on case examples of different data mining methods, does however use data mining terminology, includes some use of formulas, and some sub-sections may require knowledge of statistics and dimensionality reduction methods. Section 12.5 takes a specific look at online games, including Free-to-Play (F2P), as these have received particular attention with respect to data mining.

12.2 Data Mining in Games: Background and Overview

In this section, data mining is introduced and an overview of the main types of methods is presented, leading up to the subsequent sections, which will cover the specific methods employed for analyzing game telemetry data.

12.2.1 What Is Data Mining?

Data mining is a somewhat nebulous concept, and there is no single definition of what it is (Chen et al. 1996; Fayyad et al. 1996; Han et al. 2005). The name itself – data "mining" – derives from the similarity between searching for valuable information in large databases and mining rocks for valuable ore. Both imply either sifting through large amounts of material randomly, or using intelligent methods to pinpoint where the valuable material is. The term is something of a misnomer though, as the goal of mining data is not data, but knowledge (i.e. "knowledge mining"). However, data mining sounded sexier and became the accepted term, overshadowing other terms such as knowledge discovery, knowledge extraction and pattern discovery, which better describe the complete process. According to the Gartner Group (www.thegartnergroup), data mining is: "the process of discovering meaningful new correlations, patterns and trends by sifting through large amounts of data stored in repositories, using pattern recognition technologies as well as statistical and mathematical techniques." Similar to this definition, others usually emphasize the exploration and analysis of large quantities of data, in order to discover meaningful patterns. Data mining forms an amalgam of methods and ideas, drawing inspiration from different fields of science and business, including machine learning, artificial intelligence, pattern recognition, medical science, statistics and database systems – in many ways, data mining has emerged in the space between these fields of research (Berry and Linoff 1999).

Depending on the definition, data mining is either a step in, or the whole of, the process of knowledge discovery in databases (KDD), a concept originating in 1989 and referring to the non-trivial extraction of implicit, unknown and potentially useful information from data in databases (Berry and Linoff 1999). It used to be that KDD was viewed as the overall process of discovering useful knowledge from data, while data mining was the application of particular algorithms to extract patterns from data without the additional steps of the KDD process, but the difference is largely academic.

The same is the case for the difference between data mining and statistics. Many data mining methods stem from statistics, and data mining itself largely arose due to the need in statistics to adopt and invent new ways of working with huge masses of data. Over the 1990s, working with large masses of data became synonymous with data mining, although the methods applied could also be called statistics. At the mathematical level, there are various arguments that can be made, but the main difference again relates to whether data mining and statistics refer to specific methods, or the entire process of working with and extracting meaning from data. If statistics is viewed as a set of methods, and data mining the entire process, they are different – but again, in practice, from the trenches of game analytics, there is not a lot of difference.

Irrespective, there is a very strong human element in data mining. Early definitions of data mining emphasized automatic or semi-automatic methods, but it is important to note that the human element is vital when exploring and analyzing large datasets. Data mining has become synonymous with automatic techniques, which has led people to believe that it is a product that can be bought.
There is a variety of black box software available on the market, which embeds powerful algorithms, making their misuse proportionally more dangerous. As with any other technology, data mining is easy to do badly. Analysts may apply inappropriate algorithms to data sets that call for a completely different approach, for example, or models may be derived that are built upon wholly specious assumptions. Therefore, an understanding of the statistical and mathematical model structures underlying the software is required. Data mining is a discipline. In our view, humans need to be involved at every phase of the data mining process. In a game development context, at least in some steps, those humans need to be the people who design, code, test and ultimately play, games. There is no quick fix for a game developer wanting to employ data mining on, e.g., user telemetry data. However, as noted by Larose (2004), purely explorative methods can be quite powerful in their own right, and require much less training and specialized knowledge than semi-automatic or automatic processes, although a general understanding of the data is always required.

12.2.2 The Knowledge Discovery Process in Data Mining

It can be tempting to approach data mining haphazardly, without an overall strategy. Therefore, a cross-industry standard was developed in 1996, which took an industry-, application- and tool-neutral approach to data mining, defining the Cross-Industry Standard Process for Data Mining (CRISP-DM) (www.crisp-dm.org) (Chapman et al. 2000). CRISP represents a fundamentally good approach to data mining processes, and various specialized variants exist aimed towards particular industries or problems. It can seem a bit cumbersome to apply the full CRISP process to each and every question that we want answered via game telemetry, and in practice, some of the phases will be very quick to go through, especially if the analysis has already been performed on a previous version of a game or an earlier user-behavior dataset. However, CRISP provides a non-proprietary and freely available standard process for fitting data mining into the general problem-solving strategy of a business or research organization. It is an iterative process, fitting well into the typical agile and rapid-iterative approaches applied in game development (Mellon 2009). The phase sequence of CRISP is adaptive – i.e., the next phase in the sequence of six defined phases depends on the outcomes associated with the preceding phase. For example, it may be necessary to return to the data preparation phase for refinement before moving on with the modeling phase – it is a typical situation that the solution to a question leads to further questions, not least when working with player behavior in games. Importantly, lessons learned from past projects should be brought to bear as input into new projects.

In the context of game development, the data in question can originate in game telemetry (data from installed game clients) or any of the traditional sources of business intelligence, e.g. production and marketing (Romero 2008; Mellon 2009; Drachen and Canossa 2011). The CRISP phases are as follows (modified from www.crisp-dm.org and Larose (2004)):

1. Business/research understanding
• Define the project objectives and requirements.

• Translate the goals and restrictions into the formulation of a data mining problem definition.
• Prepare a preliminary strategy for achieving these objectives.

2. Data understanding
• Collect the data (extract the data from a database).
• Use exploratory data analysis (EDA) to familiarize yourself with the data and discover initial insights.
• Evaluate the quality of the data.
• If desired, select interesting subsets that may contain actionable patterns.

3. Data preparation (typically the most time-consuming phase)
• Prepare from the initial raw data the final data set that is to be used for all subsequent phases.
• Select the cases and variables you want to analyze and that are appropriate for your analysis (this is sometimes performed after transformation and cleaning).
• Perform transformations (or consolidation) on certain variables, if needed. Under transformation, the selected data are transformed into the form needed to perform the mining procedure in question, e.g. normalizing values.
• Cleaning: Clean the raw data (remove noise and irrelevant data) so that it is ready for analysis.

4. Modeling
• Select and apply appropriate data mining techniques (modeling, exploration, synthesis etc.). Different techniques may be used for the same problem.
• If the technique results in a model (most do, outside of explorative analysis), calibrate model settings to optimize results.
• The process can be repeated with new selections and transformations of the data, gradually refining the result or integrating new relevant data sources, in order to get different, more appropriate/valuable results.

5. Evaluation
• Evaluate the results and/or models delivered for quality and effectiveness.
• Check that the model in fact achieves the objectives set for it in the first phase.
• Check whether some important facet of the problem has not been accounted for sufficiently.
• Come to a decision regarding use of the data mining results.

6. Deployment
• The discovered knowledge is presented to the relevant stakeholder (designers, producers, marketing, management), using a choice of knowledge representation, e.g. a visualization or report. E.g., a game telemetry analyst develops a heatmap of a multi-player shooter level to present to the designer of that level (see also Chap. 17).
• Make sure the presentation is done in such a way that the target stakeholder can understand it. Explain the result in a way that helps the target stakeholder to understand, interpret and act upon the data mining results.
• The target stakeholder carries out deployment within the organization.

12.2.3 Database Navigation

When dealing with data stored in databases – irrespective of the format – there are three fundamental navigational tasks that the chosen database format must allow: drilling down (vertical), drilling across (horizontal), and controlling time. The latter is the most obvious: we have to be able to select data from only a specific build, a particular user test, or segment of time. We mention these here because basic navigation of game telemetry data is often the first step needed in data mining, and sometimes the only step needed to answer questions.

Drilling (or navigating) is a method for interactively navigating or exploring data horizontally and vertically in the dimensional structure of a database. Drill-across (or drill-through) navigation occurs across multiple dimensions (used commonly with OLAP, Online Analytical Processing, a class of functions that enable users to dynamically and in real-time analyze and navigate data, e.g. in a data cube), and is used for comparing different variables for a specific dimension, for example playtime and item purchases for all players from Europe. Similarly, identifying the top ten most profitable players in a free-to-play game from each game server is an example of a drill-across analysis. Drill-across navigation can operate across dimensions, measures or attributes in OLAP and data warehouses.

Drill-down navigation is a means for exploring data in greater detail (more low-level) than is currently displayed (Kim et al. 2008). The term drill-down is commonly encountered in game data mining contexts. This is because drill-down analysis is a method for in-depth analysis of data, which makes it very useful especially for player behavior analysis, where the root causes of behavioral patterns are often nestled deep within the data – e.g. a specific checkpoint missing, a single mob being too powerful, a pathway between two areas going unnoticed (Kim et al. 2008; Drachen and Canossa 2011). For example, viewing aggregated completion times across ten levels in a game, and noting that level 5 completion times are very high, drilling down would then be to look at the data pertaining to level 5 only, for each player. This kind of process is commonly used in game analytics to locate the root causes of an observed effect. See e.g. Romero (2008), Kim et al. (2008), Drachen et al. (2009) and Chap. 14 for examples.

To take an example (Fig. 12.1): A game developer considers a simple breakdown of data consisting of a few variables, here the average completion times for the levels of a game (five levels). At this top level, he notices that a level appears to take longer to complete than others (level 4). This is not intended by the design, and could therefore indicate a problem. In order to explore why, the underlying data need to be exposed, in this case, a breakdown of the completion times for individual components of the level. In this more detailed view of the data, it may be noticed that a specific sector of the level is where the players spend a lot of time (Main Hall). Drilling down further into the metrics data captured from the level, it may be found that the root cause is that players have trouble beating a specific type of enemy and keep dying (Evil Conductors – they stamp your ticket really, really hard), whose difficulty can then be adjusted to accommodate. If the cause of the observed pattern is not obvious from looking at the data, it can be useful to consider the actual game environment – e.g. finding that players do not spot the big weapon they needed in the room preceding the boss.

Fig. 12.1 An example of drill-down navigation from a fictive game. See text for explanation

Six Myths About Game Data Mining

1. With game data mining, we can fire our designers as the testers/users will tell us what they want! We will save heaps of money! Wrong: mining gameplay telemetry data is incredibly useful for evaluating and testing design, but telemetry data cannot tell you what design is, nor how your players feel or whether they have a good experience – game data mining is not a replacement for good game design.

2. With game data mining, we do not need user testing! We can fire our testing department and save heaps of money! Wrong: Game data mining goes hand in hand with user-oriented testing and research, but does not replace it. With mining of gameplay metrics data, a powerful supplement to playtesting, usability testing and physiological testing is gained.

3. Game data mining is autonomous, requiring little human oversight! The tools are automated; we just turn them loose on our data and find the answer to our design/business/marketing problem! We can fire our business analysts and save heaps of money! Wrong: There are no automatic tools that will solve your problems – data mining is a process, as highlighted by CRISP. Additionally, data mining requires significant human interactivity at each phase, and also for the subsequent quality monitoring.


4. Game data mining pays for itself quickly! Let us invest in tools, infrastructure and people right away and save heaps of money! Wrong: well, partly wrong. The return rates vary, depending on the specific situation, the game, the size of the developer or publisher, etc. The return will be there in terms of improved knowledge, but careful planning needs to go into deciding on an initial setup and the strategic and practical goals.

5. Game data mining will identify the causes of all our problems! We will make heaps of money integrating game data mining in our business! Wrong: the knowledge discovery process will help identify and uncover patterns of behavior in the data, whether user-derived or business-derived, and these can be highly valuable, but it requires human interpretation to identify the causes of the patterns (with the help of analysis).

6. We need to obtain data on everything! Data equals value, we will make a heap of money! Wrong: you need the right data to solve the problems you have. Just measuring everything will waste resources. Getting the right data requires as much thought as its analysis.

Drill-down analysis works in this way to identify the root cause of patterns that emerge much "higher" in the data. At the lowest level of a drill-down analysis are the raw data. There are a number of other fundamental actions that users should be able to take in a database system, e.g. filtering, sorting, morphing etc., but a full discussion is out of scope here – the reader is referred to Han et al. (2005) and Larose (2004) for additional information.
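To make the drill-down example above concrete, here is a minimal sketch in Python using pandas, assuming a hypothetical telemetry table with player_id, level, area and completion_time columns (the schema and values are illustrative, not from an actual game):

```python
import pandas as pd

# Hypothetical telemetry table: one row per player per level area
# (column names and values are illustrative, not from an actual game).
telemetry = pd.DataFrame({
    "player_id":       [1, 1, 2, 2, 3, 3],
    "level":           [4, 5, 4, 5, 4, 5],
    "area":            ["Main Hall", "Vault", "Main Hall", "Vault",
                        "Main Hall", "Vault"],
    "completion_time": [620.0, 310.0, 700.0, 290.0, 655.0, 305.0],
})

# Top-level view: average completion time per level (the aggregated chart).
print(telemetry.groupby("level")["completion_time"].mean())

# Drill-down: level 4 stands out, so expose the underlying data for that
# level only, broken down by area within the level.
level4 = telemetry[telemetry["level"] == 4]
print(level4.groupby("area")["completion_time"].describe())
```

Each drill-down step is simply a filter plus a finer-grained aggregation, repeated until the raw data (single events) are reached.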

12.2.4 Separating Gold from Rock in Data Mining Results

When running a game data mining analysis, for example a classification analysis of the behavior of players during development of a game, there will typically be more than one result. So how do we know which one to pick? Data mining allows the discovery of patterns in the data, and there may be quite a lot of them if the dataset is big enough and heterogeneous enough. Finding the best, most useful and interesting pattern is not always straightforward. In order to choose the best pattern (or result), we need to be able to evaluate the patterns based on how "interesting" or useful they are to the specific situation. There are three approaches that can be employed:

Objective measures: these are based on the raw data themselves, e.g., validity of the patterns when tested on new data with some degree of certainty. They are by far the easiest to employ and provide hard numbers to evaluate results.

Subjective measures: these are based on the data and the user of the data. As noted by Geng and Hamilton (2006): "to define a subjective measure, access to the user's domain or background knowledge about the data is required. This access can be obtained by interacting with the user during the data mining process or by explicitly representing the users' knowledge or expectations." Novelty or understandability of a result is an example of a subjective measure of interestingness.

Semantic measures: these consider the semantics and explanations of the patterns. A good example is "utility" or "actionability" as an evaluation mechanism.

Identifying and measuring the interestingness of patterns is essential for the evaluation of the mined knowledge and the data mining process in general. Concretely, interestingness measures are useful because they can be used to: (1) prune uninteresting patterns during the mining process so as to narrow the search space and thus improve mining efficiency; (2) rank patterns according to the order of their interestingness scores; (3) select interesting patterns during post-processing (Larose 2004). Fortunately, interestingness measures have been the focus of considerable research interest. All of the method groups outlined above have associated suggestions of interestingness measures, although most are objective.

12.2.5 Data Formats

An important aspect of working with game telemetry data is how they are stored and accessed. It is one thing to have collected behavioral data from ten million players; it is another to store these in a way that makes it as easy as possible to apply data mining techniques to them. There are currently a plethora of database formats available, with SQL-based systems such as MySQL being among the most commonly used, and long the default for new web applications. However, SQL has problems with scaling up to very large datasets, despite recent innovations such as SSD enhancements and 32+ core scalability, and can be overly complex for many operations (and making changes to large databases can be hard). Therefore, in recent years, more "elastic" means of data storage have emerged, running on cloud computing frameworks with up to hundreds of servers and offering scaling on demand. These new database formats are commonly referred to as "NoSQL" (and NewSQL) and have become popular in big data contexts due to the need for fast, efficient data access. A full review of different database formats is dramatically out of scope, but interested readers can find useful information in Chaps. 6 and 7 of this book. It is also recommended to look at the NoSQL database formats MongoDB, Cassandra, CouchDB and HBase (Hadoop) for information on newer database formats. In general, the Net is a good source for information on database formats.

Data mining methods are applicable to any kind of data or media and are independent of the specifics of the repository of the data (relational database, unstructured database, multimedia database, time-series database, flat file, object-oriented database, spatial database, etc.). However, algorithms may vary when applied to different types of data, e.g. images vs. behavior measures.

12.2.6 Tools for Game Data Mining

There are a wide variety of software tools available for data mining. Some are specialized for particular sectors of industry or research; others are more open and accommodating to game data mining. However, in our experience, some software vendors have a tendency to market analytical software as being "plug and play" applications that will provide solutions to all kinds of problems without the need for human interaction. This is blatantly not the case: there is a strong need for a human element in data mining, possibly especially in games, where the fundamental goal of providing the user with a good experience is at the forefront; results, therefore, need to be interpreted with user experience in mind (Drachen et al. 2009).

In recent years, several companies have started to offer middleware technologies specifically for game data mining or game analytics (e.g. www.gameanalytics.com, www.playtomic.com, www.honeytracks.com, www.kontagent.com), supplementing the tools and services offered by traditional analytics companies. However, game-specific data mining tools remain in their infancy, and traditional data mining companies, used to working with, e.g., business analysis or web analytics, do not always have the intimate understanding of game design necessary to fully understand game telemetry data and deliver relevant and interesting results. This has led several major publishers, notably Microsoft Studios Research, to develop their own solutions to game analytics (Kim et al. 2008). The barrier to entry for non-experts in game design and data mining remains, therefore, relatively high. However, the current rapid development in game telemetry analysis favors a wider availability of solutions and methods evolving over the next few years.

In the open-source market, there are many freely available tools developed by researchers and practitioners that are useful to novices and experts alike, for example Weka (www.cs.waikato.ac.nz/ml/weka/), which is used for supervised learning; RapidMiner, a general data mining tool; Shogun, a library for large scale machine learning (www.shogun-toolbox.org); Pymf, a toolbox for matrix factorization in Python (pymf.googlecode.com); or QGIS for spatial problems (www.qgis.com). There are many of these tools (see e.g. Chaps. 7, 10, and 14). At the practical level, the easiest way to locate an open-source toolbox useful to a particular data mining task is to figure out what type of problem we are dealing with, and then browse the Net for relevant tools.

12.2.7 Practical Issues in Game Data Mining

There are a range of important issues to consider when planning or performing collection of game telemetry and mining of this type of data. Confidentiality of user data, security of hosting servers, transparency of analysis results, and effective preprocessing approaches are among the most important. In this section, these issues and their implications are briefly introduced, but the interested reader is strongly advised to consult the literature at the end of the chapter for more detail.

Transparency: the patterns discovered by data mining tools are useful only if they are interesting and understandable to the user they are aimed at. Any data mining result (model) should be as transparent as possible, i.e. the result should describe a pattern that is intuitively interpretable and which is followed by an explanation, targeted at the specific stakeholder or user of the result. E.g. decision trees are intuitive and almost self-explanatory in terms of their results, but neural networks are comparatively opaque to the non-expert (as are non-linear models in general). For example, a game designer may not be a statistics expert, and therefore providing the results of a variance analysis in the standard statistical reporting form (a series of values) will not be conducive to the designer understanding the result and being able to act upon it. Transparency is vital to ensure that the various users of game data mining results are able to understand and act upon them. Other issues in visualizations are screen real-estate, information rendering and user-pattern interaction. Interacting with raw data or mining results is important, because it provides the means for users to focus and refine the mining tasks. Additionally, it allows users to view the discovered knowledge from different angles or conceptual levels.

Data cleaning: data analysis can only be as good as the data that is being analyzed, and most algorithms assume the data to be noise-free. This is an important assumption. Depending on the technical back end, game telemetry data may be more or less complete or saddled with different types of problems. Data cleaning (or cleansing) is the process of detecting and removing inconsistencies from data, towards improving and ensuring the quality of the data (Han et al. 2005; Larose 2004). Quality problems in raw data come in many forms, e.g. misspellings during data entry, missing information or the presence of invalid data. When multiple sources of data are integrated, for example in a data warehouse, or analysis is run across multiple data sources (e.g. telemetry from different games), the requirement for careful data cleaning increases due to the potential for error introduced when datasets are combined. Performing data mining on low-quality data ("dirty data"), with, for example, missing or duplicate information, can compromise the validity and accuracy of the results, or even worse, can lead to outright wrong results, following the "garbage in, garbage out" principle in data mining. As a consequence, data cleaning and data transformation (commonly referred to as pre-processing) is vital, but is often erroneously viewed as lost time. As frustrating as data cleaning may be, it is one of the most important phases of the knowledge discovery process. Data cleaning is a complex topic, and unfortunately it is not possible to provide simple guidelines to address it here. There is also a general lack of research in the area despite its importance.

Performance and sampling: many methods for data analysis and interpretation were not originally designed for the very large datasets that exist today. In game development, telemetry datasets easily reach the terabyte size for online social games or for large commercial games with hundreds of thousands or millions of players. In addition to the size of the data, the dimensionality of the data, i.e. the number of variables in the dataset (e.g. the number of variables such as completion time, class, level, etc., known for each player in a game), is decisive for the choice of data mining techniques. In general, the search space grows exponentially with the number of dimensions in a dataset, and its effect is so dramatic that it is currently one of the most important research problems in data mining.

Many techniques have issues with scalability and efficiency at large scales and dimensionalities, especially those that scale quadratically with dataset size, or algorithms with exponential or polynomial complexity (Mahlman et al. 2010). Sampling is a possible solution, i.e. mining part of the dataset rather than the whole, and extrapolating results from the sample to the whole dataset. Sampling has its own complexities and challenges, for example in relation to ensuring a representative sample that captures the features of the entire dataset. Sampling is covered in more detail in Chap. 9. Another approach is parallel programming, where the dataset is subdivided and the results for each subset are merged later.

Security: this is an important issue with any game telemetry data collection, whether intended for low-level work or strategic decision making. Game telemetry data are generally considered confidential in the industry, and should be kept safe, which includes considerations on how to handle data access, transfer of data and transfer of results.

Social and privacy issues: one of the key issues in data mining is the question of individual privacy. The immense collections of data on people, and the many opportunities for collecting additional information, combined with data mining, make it possible to analyze, e.g., routine business transactions, and obtain a substantial amount of information about the habits and preferences of individuals or businesses. Additionally, when data is collected for player profiling, behavior, correlations of personal data with other information, and so forth, sensitive and private information about individuals or businesses is collected and stored. This is controversial given the confidential nature of such data, and the potential for illegal access to it. Another issue is how the data is being used. Because this type of data is valuable, databases of all kinds are traded. It is, thus, important to be aware of what data and analysis results are being distributed, e.g. email addresses of players.

Collection strategies: there are two fundamental ways to obtain data from an installed game client or hardware unit (e.g., PS3, PSP, smartphone), irrespective of the protocol employed (e.g. REST API). Choosing the right strategy for capturing data from game clients is vital to avoid excessive data cleaning issues and data loss. There are pros and cons to both approaches, as follows (adapted from Mahlman et al. (2010)):

• Fire and forget: game telemetry data are stored locally in queues. Depending on the memory allocated to telemetry tracking, the size of the queue can vary. The game client will attempt to transmit data to the collection server, but may or may not receive confirmation of receipt from the server. If a queue is full, the oldest stored data are deleted first to make space for new data. This solution ensures a specific memory use and is, thus, useful for mobile platforms or consoles where memory resources are limited, or for high-frequency data (e.g. navigation), where some random losses are unimportant.

• Reliable metrics: the game client keeps storing telemetry data until they have been successfully transferred to the collection server, and confirmation of receipt has been received. The solution is resistant to loss and useful in situations where the data must be collected as completely as possible, e.g., during playtests.

In both cases, a key rule is that game execution must not be affected by the collection or transfer of telemetry data to the collection server. The approaches can be combined, e.g., using limited queuing for navigational data and unlimited queuing for important variables.
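As an illustration, here is a minimal sketch of the fire-and-forget strategy in Python. The `send` callable is a hypothetical transport (e.g. a wrapper around an HTTP POST); this is not code from an actual telemetry SDK:

```python
from collections import deque

class FireAndForgetQueue:
    """Bounded local event queue: the oldest events are dropped when the
    queue is full, so telemetry never exceeds a fixed memory budget."""

    def __init__(self, max_events=1000):
        # A deque with maxlen silently evicts the oldest entry when full.
        self.queue = deque(maxlen=max_events)

    def track(self, event):
        self.queue.append(event)

    def flush(self, send):
        # Attempt transmission; no confirmation of receipt is awaited,
        # and failures must never block or affect game execution.
        while self.queue:
            event = self.queue.popleft()
            try:
                send(event)   # fire...
            except OSError:
                pass          # ...and forget

# Usage with a stand-in transport:
q = FireAndForgetQueue(max_events=3)
for i in range(5):
    q.track({"event": "jump", "tick": i})  # ticks 0 and 1 get evicted
q.flush(print)                              # emits the 3 newest events
```

A reliable-metrics variant would instead keep each event in the queue until the server acknowledges receipt, trading bounded memory for completeness.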

12.2.8 Data Mining Approaches

It is difficult to generalize about data mining methods given the many fields of research and business that employ data mining techniques. However, the various methods are usually divided into either the categories descriptive/predictive or unsupervised/supervised learning. Depending on the person or book being consulted, either of these two divisions will be used – they are not, however, completely interchangeable – descriptive data mining is not the same thing as unsupervised learning, for example. To be more precise, predictive/descriptive data mining are concepts, and supervised/unsupervised learning are concrete categories of methods – and not the only ones used for data mining, although the main ones. This means that, for example, correlation methods are referred to as descriptive data mining, but are not assigned to the unsupervised learning group of data mining methods. Similarly, interpolation is a technique used for predictive data mining, but can in at least some cases be argued to be not a form of supervised learning. As with so many other things, the difference is good to be aware of, but not vital in practice. In this chapter, we adopt the division of methods into unsupervised and supervised categories.

Descriptive data mining is used to describe the general properties of existing data in a concise way. In addition, it presents any interesting characteristics of the data without having a predefined target. For example, exploring the number of daily users and pointing to a sharp increase in active users on a specific day, say Saturdays. Some authors equate descriptive data mining with statistics.

Predictive data mining is used to forecast explicit values, based on patterns determined from known data. In other words, it is used to attempt to predict something based on inference on the data at hand. For example, predicting how many paying users a game will have based on data on previous subscriptions.

Supervised learning originates in machine learning – a branch of artificial intelligence concerned with the design and development of algorithms that allow computers to evolve behaviors based on data. A learning algorithm takes advantage of a training dataset (observed examples) to capture characteristics of interest of the underlying, unknown probability distribution of the data and make intelligent decisions based on their properties. In supervised learning, training data is combined with knowledge of desired outputs. The output of the algorithm can be a continuous value (regression) or a prediction of a class label of the input object (classification). The task of the supervised learning algorithm (the "learner") is to predict the value of the function for any valid input, after seeing a number of training examples (i.e. pairs of inputs and target outputs). In order to achieve this ability, the algorithm has to generalize from the training data to unknown situations in a way that is reasonable. In the context of digital games, predictive data mining can be used to forecast when a player will stop playing, whether a player will convert from a non-paying to a paying user, what types of items players will purchase, classify player behavior, etc.

Unsupervised learning also originates in machine learning, and also focuses on fitting a model to observations. However, unlike supervised learning, there is no a priori output.
The input objects are generally treated as random variables, and a density model is built for the dataset. For example, if we want to classify player behavior, we can use unsupervised learning if we do not know how the behaviors vary, or if no previous classes have been defined. We can use supervised learning if, for example, we have already run a classification on earlier data, and are interested in fitting new players into these pre-defined classes.
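The distinction can be made concrete with a short sketch in Python using scikit-learn on synthetic data (the feature columns and the payer label are illustrative assumptions, not from an actual game):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
# Two behavioral features per player: playtime (h) and money spent.
X = rng.random((500, 2)) * [100.0, 50.0]

# Unsupervised: no labels are given; the algorithm proposes groupings.
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Supervised: labels from earlier data (here: a stand-in payer flag) are
# given, and the model learns to assign them to unseen players.
y = (X[:, 1] > 25.0).astype(int)
model = DecisionTreeClassifier(random_state=0).fit(X, y)
print(model.predict([[10.0, 2.0], [80.0, 40.0]]))  # predicted labels
```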

12.2.9 Data Mining Methods

The classification of data mining methods beyond descriptive/predictive and supervised/unsupervised has always been a somewhat sensitive issue in data mining, leading to some confusion when attempting to learn about the different methods – a popular class or concept may have dozens of different names (Han et al. 2005). In the context of game data mining, some of the most common methods used are:

Description: is when analysts are simply trying to describe patterns or trends in game data, and is usually accomplished using Exploratory Data Analysis (EDA), which is a graphical method for exploring data in search of trends or patterns. For example, plotting class level vs. playtime per level in a bar chart across six classes in an MMORPG, and finding that the "warrior" class progresses more slowly than the other classes. Descriptions of patterns often suggest possible explanations for them. EDA is particularly useful for basic analysis and for obtaining an understanding of the game data prior to the application of advanced algorithms.

Characterization: is simply the summation of general features of objects in a target class (or sample), producing a characteristic rule. For example, we may want to characterize all players who complete the first 100 quests in an RPG in less than 5 h (an example of characterization is shown in Fig. 12.2, where telemetry data on the ranges at which weapons were used in an FPS are averaged across the weapons).

Discrimination: is when features of objects across two classes (or samples) are compared. For example, comparing the most popular item purchases for players between 10–15 and 16–20 years, or comparing the navigation paths of two types of players through a game level (Chaps. 7, 14, and 19; Drachen and Canossa 2011). Discrimination is identical to characterization except that discrimination results include comparative measures.

Fig. 12.2 An example of a descriptive analysis: A simple bar chart providing an overview of the average distance at which playtesters of the multiplayer shooter Fragile Alliance (Eidos Interactive, 2007) used different weapons, during early production of the game, for a particular map (note that the published version of the game has other rates and weaponry) (used with permission from IO Interactive)
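A descriptive analysis like the class-progression example above can be sketched in a few lines of Python with pandas and matplotlib (the file name and the class/playtime columns are hypothetical):

```python
import pandas as pd
import matplotlib.pyplot as plt

# Assumed telemetry export with one row per character; "telemetry.csv",
# "class" and "playtime" are illustrative names, not a real schema.
players = pd.read_csv("telemetry.csv")

# Simple EDA: mean playtime per character class as a bar chart.
players.groupby("class")["playtime"].mean().sort_values().plot(kind="bar")
plt.ylabel("Mean playtime (h)")
plt.title("Average playtime by character class")
plt.tight_layout()
plt.show()
```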

Classification: is used to organize data into classes, which is hugely useful in game development. For example, classifying players based on their potential to become paying users vs. non-paying, or classifying player behavior to test whether players play the game as intended by the game's design. Classification uses class labels to order objects in a data collection, normally using a training data set where all objects are already associated with known class labels (e.g. playtime per level associated with character class). The classification algorithm learns from the training data and builds a model that can be applied to other or future data.

Estimation: is similar to classification, but the target variable is numerical, not categorical. In statistics, methods such as regression and correlation are estimation methods. Here we are interested in knowing a value, not in obtaining information about how our data group into distinct classes. For example, estimating how much money a player will spend on in-game items, or how long a player will continue playing a specific game. In estimation, models are built using training data of complete records, which provide the value of the target variable as well as the predictors (causal variables). For new observations, estimates of the value of the target variable are made, based on the values of the predictors. For example, using simple regression to find the relationship between two variables, such as playtime and money spent on in-game items.
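As a minimal sketch of the estimation example just described – simple regression of money spent on playtime – with illustrative numbers:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Illustrative data: hours played vs. money spent on in-game items.
playtime = np.array([[5.0], [20.0], [60.0], [120.0]])
spend = np.array([0.0, 4.0, 15.0, 28.0])

model = LinearRegression().fit(playtime, spend)
print(model.coef_[0], model.intercept_)   # fitted slope and intercept
print(model.predict([[40.0]]))            # estimated spend after 40 h
```

The target here is a number (spend), not a class label, which is exactly what distinguishes estimation from classification.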

Prediction: is reminiscent of classification and estimation, but with prediction, we want to know about the future. The core idea is to use a large number of known values to predict possible future values. For example, how many players an MMORPG will have 3 months into the future, when there will only be 1,000 active players left in a social casual game, or how many players are needed to reach the critical threshold where player communities become self-sustaining. There are many approaches to prediction, from traditional statistical methods to more specialized knowledge discovery methods, such as neural networks, decision tree analysis, and k-nearest neighbor (Mahlman et al. 2010). Prediction is one of the most widely applied data mining methods in the analysis of data from multi-player and massively multi-player persistent games, where predicting the effect of design changes or the behavior of the player community is important for revenue. Prediction can be used to forecast in many contexts around game development and publishing.

Clustering: is a lot like classification, in that the aim is to order data into classes. However, the class labels are unknown and it is up to the clustering algorithm to discover what the classes are and evaluate their acceptability. The core goal of clustering algorithms is to group or segment objects (e.g. players, assets, items, games or any observation, case or record) in such a way that the similarity between objects within one group (cluster) is high (intra-cluster similarity), while the similarity between groups is low (inter-cluster similarity).

Association (affinity): when performing an association analysis, the goal is to find features (attributes) that "go together", thus defining association rules in the data. An association rule specifies that if X, then Y, e.g., "if players buy Striped Trousers of Strength +3, they will also buy Girdle of Charisma +2." The association rule is accompanied by a measure of support, and of confidence. The support threshold identifies the frequency of the features occurring together, and the confidence threshold defines the probability that one appears when the other does. For example, it may be found that out of 1,000 players, 500 bought the Striped Trousers of Strength +3, and of those 500, 250 bought a Girdle of Charisma +2. The association rule then specifies: "if players buy Striped Trousers of Strength +3, they will also buy Girdle of Charisma +2, with a confidence of 50%" (and a support of 25%, since 250 of the 1,000 players bought both items).

There are many other methods that can be used for game data mining, such as outlier analysis (looking at the exceptions to normal behavior, which can be pretty useful in the analysis of player behavior, e.g. for locating gold farmers), evolution analysis, and deviation analysis (the investigation of time-related data that change as a function of time). However, these are out of scope for this chapter. The reader is referred to the reference list for further information.
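The support and confidence computation from the item example above can be sketched directly (the purchase baskets below are illustrative):

```python
# Toy purchase baskets using the item names from the example above.
purchases = [
    {"Striped Trousers of Strength +3", "Girdle of Charisma +2"},
    {"Striped Trousers of Strength +3"},
    {"Girdle of Charisma +2"},
    {"Striped Trousers of Strength +3", "Girdle of Charisma +2"},
]

def rule_stats(transactions, antecedent, consequent):
    n = len(transactions)
    with_antecedent = [t for t in transactions if antecedent in t]
    with_both = [t for t in with_antecedent if consequent in t]
    support = len(with_both) / n                        # P(X and Y)
    confidence = len(with_both) / len(with_antecedent)  # P(Y | X)
    return support, confidence

s, c = rule_stats(purchases,
                  "Striped Trousers of Strength +3",
                  "Girdle of Charisma +2")
print(f"support={s:.2f}, confidence={c:.2f}")  # support=0.50, confidence=0.67
```

At production scale one would use a dedicated algorithm such as Apriori or FP-growth rather than this brute-force scan, but the measures being computed are the same.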

12.3 Unsupervised Methods

As discussed above, unlike supervised models, in unsupervised learning there is no a priori defined output, i.e. we are not trying to predict target values, but rather focus on the intrinsic structure of, and relations in, the data. In particular, the data mining algorithm searches for patterns among all variables of a given dataset. A traditional example is clustering, e.g. for classifying player behavior, causes of game crashes, etc. In such examples, we are typically not sure how these behaviors vary or whether particular causes are more typical than others. In the following sections, some basic mathematical properties are described in the interest of accuracy.

In this section, we concentrate on a few examples of the application of unsupervised models for analyzing game telemetry data. We will give an overview of a few common methods applicable for unsupervised data analysis in games, and demonstrate the usefulness of recent data mining techniques in terms of acquiring interpretable data representations.

12.3.1 Clustering

In the context of customer behavior analysis in computer game development, cluster (and classification) analysis provides a means for reducing the dimensionality of a dataset in order to find the most important features, and for locating patterns which are expressed in terms of user behavior as a function of these features, which can be acted upon to test and refine a game design (or specific parameters of a design) (see Drachen et al. 2012 for a more in-depth discussion). For example, figuring out how people play the game, or identifying groups of players who display unwanted behavior. Clustering is thus a highly useful data mining method, containing a plethora of algorithms, the most commonly used being k-means (Golub and van Loan 1996). There are, however, notable challenges (Drachen et al. 2012):

1. The potentially high dimensionality of behavioral data from games.
2. There is sometimes a need to mix datatypes, e.g. binomial and categorical features, which makes normalization challenging.
3. Telemetry datasets can be noisy.
4. Clustering generally requires informed decisions as to the number of clusters to extract.
5. The results have to be actionable. What this means is that it should be possible to relate the results to the design of the game in question, which entails converting results into a language understandable by the target stakeholder group (designers, marketing, management etc.).

Successful clustering of player behaviors in computer games requires that these challenges are addressed. Furthermore, it is important to note that the integration of knowledge of the design of the game being investigated is necessary to guide the process of selecting which behavioral variables (or features) to work with. Also, depending on the goals of the analysis, different clustering algorithms may be more or less applicable (Thurau and Drachen 2011), because the algorithms have different properties (this is discussed in further detail below).

To exemplify the process of clustering, we applied k-means clustering to a simple bivariate (playtime and character level) dataset derived from World of Warcraft (approx. 70,000 characters).

Fig. 12.3 Different cluster/matrix factorization methods can yield completely different views on the same game telemetry data. (a) k-means clustering, (b) PCA, (c) NMF, and (d) Archetypal Analysis

It can be seen that the resulting cluster centroids (the central point in each cluster cloud) or basis vectors reside within the data and represent certain cluster regions (top left diagram), i.e. one particular cluster center can now be used to represent a vast number of data samples. Each data sample is assigned to exactly one cluster centroid. This is, arguably, the most common form of cluster analysis, as it tries to approximate larger dense data distributions by one particular cluster centroid. However, various other unsupervised methods exist (e.g. Principal Component Analysis, Archetypal Analysis and Non-negative Matrix Factorization – results of which are shown in the other diagrams of Fig. 12.3) that yield different cluster centroid locations and are less restrictive with respect to the cluster membership of individual data samples. Often, these methods have advantages over the standard mean-based approach and can lead to more interpretable data representations (Thurau and Drachen 2011; Drachen et al. 2012).
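
As an illustration of the k-means step, the following sketch runs scikit-learn's k-means on synthetic bivariate data standing in for the playtime/level observations (this is not the study's code or data; all values are invented):

    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    # Synthetic stand-in for (playtime in hours, character level) observations.
    low_engagement = rng.normal(loc=[30, 20], scale=[15, 8], size=(1000, 2))
    high_engagement = rng.normal(loc=[400, 78], scale=[80, 2], size=(1000, 2))
    V = np.vstack([low_engagement, high_engagement])

    km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(V)
    print(km.cluster_centers_)      # centroids lie inside the dense "clouds"
    print(np.bincount(km.labels_))  # hard assignment: one centroid per sample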

12.3.1.1 Clustering – Formal Basis

Mathematically, when running a cluster analysis we are dealing with $n$ samples of $d$-dimensional vectorial data gathered in a data matrix $V \in \mathbb{R}^{d \times n}$. The problem of determining useful clusters corresponds to finding a set of $k \ll n$ centroid vectors gathered in a matrix $W \in \mathbb{R}^{d \times k}$ (note: not all clustering methods use centroid vectors, see e.g. AA below). If we express the membership of the data points in $V$ to the centroids in $W$ using a coefficient matrix $H \in \mathbb{R}^{k \times n}$, we note that clustering can be cast as a matrix factorization problem which aims at minimizing the Frobenius norm $\|V - WH\|$. For example, for k-means clustering, where each data sample belongs exclusively to a particular cluster center, each column of $H$ is all zeros except for a 1 in the row corresponding to the closest cluster centroid.

Generalizing clustering as a matrix factorization task immediately extends the range of applicable approaches. Common methods to achieve the desired factorization include Principal Component Analysis (PCA) (Jolliffe 1986; Golub and van Loan 1996), Non-negative Matrix Factorization (NMF) (Paatero and Tapper 1994; Lee and Seung 1999), and Archetypal Analysis (AA) (Cutler and Breiman 1994), among others. However, the resulting basis vectors (or cluster centroids) $W$ differ considerably among the mentioned algorithms. While all of these methods roughly try to minimize the same criterion (the norm $\|V - WH\|$), they impose different constraints that yield different matrix factors. For example, PCA (Fig. 12.3b) constrains $W$ to be composed of orthonormal vectors and produces a dense $H$; k-means clustering constrains $H$ to unary vectors; and NMF (Fig. 12.3c) assumes $V$, $W$, and $H$ to be non-negative matrices, which often leads to sparse representations of the data. While the mentioned factorizations each have their specific applications in data analysis, it is often not obvious which method to choose for a particular task. Therefore, we will first take a closer look at the specific requirements of data analysis in games.

A common goal of unsupervised data analysis in games is player categorization, or grouping (the supervised learning equivalent is classification), ideally resulting in representations of the telemetry data which are interpretable by non-experts. Ideally, one could assign a simple, expressive label to each found basis vector or centroid. While there is no objective criterion for what a descriptive representation is, it is widely assumed that approaches yield interpretable results when they embed the data in lower dimensional spaces whose basis vectors $W$ correspond to actual data points. This is, e.g., the case for Archetypal Analysis (AA), as the basis vectors or archetypes the method produces are restricted to being sparse mixtures of individual data points. This makes the method interesting as a means for game data mining, as it does not require expert knowledge to interpret the results. This contrasts with other dimensionality reduction methods, such as PCA (Jolliffe 1986), where the resulting elements can lack physical meaning (Fig. 12.3), and NMF, which yields characteristic parts (Fig. 12.3) (Finesso and Spreij 2004). K-means clustering is similar to AA in that the basis vectors reside within cluster regions of the data samples; however, the centroids do not necessarily reside on existing data samples.
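
The factorization view can be made concrete with a small sketch: the same data matrix is approximated by PCA, NMF and k-means, and the reconstruction error $\|V - WH\|$ is compared. Note that scikit-learn stores samples as rows, so the matrix is transposed relative to the chapter's $d \times n$ convention; the data here is synthetic:

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.decomposition import NMF, PCA

    rng = np.random.default_rng(1)
    V = np.abs(rng.normal(size=(500, 10)))  # non-negative so NMF applies

    pca = PCA(n_components=3).fit(V)
    nmf = NMF(n_components=3, init="nndsvda", max_iter=500).fit(V)
    km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(V)

    reconstructions = {
        "PCA (orthonormal W, dense H)": pca.inverse_transform(pca.transform(V)),
        "NMF (V, W, H non-negative)": nmf.transform(V) @ nmf.components_,
        "k-means (unary columns in H)": km.cluster_centers_[km.labels_],
    }
    for name, V_hat in reconstructions.items():
        print(name, np.linalg.norm(V - V_hat))  # Frobenius norm ||V - WH||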
Taking a closer look at Archetypal Analysis, we note that it uses a constraint that expresses data as convex combinations of certain points in $V$; exemplary resulting clusters are shown in Fig. 12.3. It can be seen that the resulting basis vectors come to reside on the convex hull of the data distribution, and thus, unlike with most other methods, data is expressed by the most extreme and not the most average samples. Searching for certain extremal elements in a set of data, as is done for AA, accommodates human cognition, since memorable insights and experiences typically occur in the form of extremes rather than averages (on a side note, philosophers and psychologists have noted this for long, since explanations of the world in terms of archetypes date back to Plato). In contrast, k-means clustering focuses on the average, and its centroids are therefore usually more difficult to interpret: while the centroid vectors all cover different regions of the data space, their overall similarity is often too high to help a human observer assign a concrete label, i.e. a description, to each cluster.

The AA problem can be formulated as $V \approx VGH$, where $G \in \mathbb{R}^{n \times k}$ and $H \in \mathbb{R}^{k \times n}$ are coefficient matrices such that $H$ is restricted to convexity ($h_{ij} \geq 0$ and $\sum_i h_{ij} = 1$) and $G$ is restricted to unary column vectors $g_j = [0, \ldots, 0, 1, 0, \ldots, 0]^T$. In other words, the factorization approximates $V$ using convex combinations in which the basis vectors $W = VG$ are data points selected from $V$. The goal now is to determine a basis that minimizes the Frobenius norm $E = \|V - VGH\|^2 = \|V - WH\|^2$. When minimizing the Frobenius norm, we have to simultaneously optimize $W$ and $H$, which is generally considered a difficult problem and known to suffer from many local minima. AA, as introduced in Cutler and Breiman (1994), applies an alternating least squares procedure, where each iteration solves several constrained quadratic optimization problems; it solves the case where $G$ is restricted to convexity instead of to unarity. It is important to note that Archetypal Analysis was originally restricted to smaller datasets due to the demanding computation; recent work has discovered ways of extending Archetypal Analysis to large-scale datasets (Thurau et al. 2009, 2010), making the method effective for implementation in the context of game metrics. Namely, the authors introduced Convex-Hull Non-negative Matrix Factorization (CH-NMF) and Simplex Volume Maximization as approximations to AA (Thurau et al. 2009, 2010) (a Python implementation of the two methods is available from pymf.googlecode.com).

12.3.1.2 Example 1: Clustering Players in Battlefield 2: Bad Company 2

The following case study is drawn from Drachen et al. (2012), and is focused on Battlefield 2: Bad Company 2 (BF2BC2) (2010, Electronic Arts), a first person shooter with tactical wargame elements, usually played in online multiplayer supporting up to 32 players, but including off-line (single-player campaign) capability.

In the multi-player mode of BF2BC2, each player controls one character in a team, playing against another team. There are various modes of play, and players can select between a range of classes, referred to in the game as "kits". These are: Assault, Demolition, Specialist, Recon and Support. Each class provides different starting equipment. In addition, players can earn awards, ranks and special equipment.

Drachen et al. (2012) used behavioral telemetry data from 10,000 randomly selected BF2BC2 players, all playing on PC. A total of 11 variables (features) were included in their analysis, some of these being compound features. Given the hundreds of possible behavioral variables that can be tracked from players in BF2BC2, selecting these 11 required consideration. Drachen et al. (2009), working with data from Tomb Raider: Underworld, suggested that any initial and explorative cluster or classification analysis of player behavior should focus on behaviors related to the central mechanics of a game. This principle was adopted, leading to a selection of features relating to character performance (score, skill level, accuracy, etc.), game asset use (kit stats, vehicle use), and playtime – as follows (quoted from Drachen et al. 2012):

• Score: Total number of points scored
• Skill level: An aggregate measure of player skill
• Total playtime: The sum total of time the player's account has been active
• Kill/Death ratio: K/D ratio, the number of kills the player has scored divided by the number of deaths suffered
• Accuracy: The percentage of hits scored with weapons
• Score per minute: The average number of points scored per minute of play while on active combat missions
• Deaths per minute/Kills per minute: Dpm/Kpm – average deaths or kills per minute
• Rounds played: The number of game rounds the player has played
• Kit stats: The number of points scored with each kit (class) and the number of kills and deaths for each class
• Vehicle use: Total time spent in air, water, land-based or stationary vehicles

Following pre-processing and normalization of the telemetry data, two algorithms were applied to the data: k-means, which produces cluster centroids (Fig. 12.3), and Simplex Volume Maximization (SIVM), a variant of Archetypal Analysis extended to large-scale datasets. SIVM does not look for commonalities between players, but rather for archetypal (extreme) profiles that do not reside in dense cluster regions, but at the edges of the space spanned by the data points (Fig. 12.3). Both algorithms resulted in seven clusters, though the behavioral profiles that could be extracted from these varied somewhat – this is to be expected given the different natures of the algorithms. The number of clusters was decided upon using scree plots and mean squared error, two techniques for deciding on the number of clusters to work with (see the sketch following the profile list below). We will here focus on the results from the SIVM analysis, which resulted in the following behavioral profiles, three of which are largely independent of the classes in the game, and four of which are closely related to them:

• Assassins: characterized by extremely high Kill/Death ratios and the highest Kpm ratio, but surprisingly low-to-middle playtime. Assassins are the most lethal players in the game, but also highly specialized.
• Veterans: are the all-round elite. Where the Assassins are specialized, the Veterans display the highest or second highest values across all the behavioral variables measured, but have also invested a lot of playtime into the game, indicating that these players are committed and stable. They represent a small fraction of the players, however, on the scale of 2–4%.
• Target dummies: These are the opposite of the Veterans, with the lowest or very low values for all the behavioral variables, comprising about a quarter of the players in the sample. They have not played BF2BC2 for long, have low K/D ratios, often get killed, and their Score per minute is the lowest of all the profiles. Their only redeeming factor is a middling accuracy.
• Assault-Recon: These players display high performance with the Assault and Recon kits, correlating with high kill rates and death rates (they are on the frontline), and the second highest K/D rate overall. They also exhibit low accuracy, which may relate to the rapid-fire weapons associated with the Assault class. Only about 1.5% of the players are included in this cluster.
• Medic-Engineer: These players have very high skill levels and accuracy, score many points (second only to Veterans) and drive in vehicles a lot. Only about 1% of the players are included in this cluster, representing a highly specialized type of behavior.
• Assault "specialists": While this cluster of players mainly plays the Assault class, they do so relatively badly. They die a lot, but have invested a lot of playtime into the game, with low skill, K/D ratio and accuracy. They are not quite at the level of the Target Dummies, but perhaps represent the typical novice player. About 5% of the players fall into this cluster.
• Driver-Engineers: These players favor the Engineer class and have extremely high vehicle times (four times higher than any other cluster), i.e. they spend a lot of time driving, sailing or flying the various kinds of vehicles in BF2BC2. They have high playtimes, scores and accuracy, and a very high K/D ratio, but kill very few players and also die rarely. Only about 1% of the players are included in this cluster.

The latter four behavioral profiles represent well two of the fundamental ways of playing BF2BC2: combat-oriented or support-oriented.
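
A sketch of the elbow heuristic mentioned above (mean squared error as a function of k) might look as follows; the features are random stand-ins, not the BF2BC2 data:

    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(2)
    X = rng.normal(size=(2000, 11))  # stand-in for 11 normalized behavioral features

    for k in range(2, 11):
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
        print(k, round(km.inertia_ / len(X), 3))  # mean squared distance to centroid
    # The k at which the error curve "bends" (the elbow of the scree plot)
    # is taken as the number of clusters to work with.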

12.3.1.3 Example 2: Comparing Clustering Algorithms in World of Warcraft

Our intention here is to demonstrate how common clustering techniques perform on game metrics data with respect to (a) descriptive representations, and (b) cluster separation. Four different clustering algorithms are applied to find clusters in a World of Warcraft dataset, the results are compared and evaluated, and recommendations are made. The example presented here is drawn from Thurau and Drachen (2011).

The data for this case study contains a selection of approximately 70,000 player records, covering a period of about 5 years. The telemetry data are player/guild logs gathered from WarCraft Realms (http://www.warcraftrealms.com). The logs show, for a certain number of dates, the records of currently online players from European and United States World of Warcraft realms. In addition, character names, level, class, and guild membership are recorded.

The World of Warcraft dataset contains a set of player recordings, their online time, and their level for a specific date. We aggregate the recordings into a 2,555-dimensional feature vector, where each entry corresponds to the level the player had reached on each day of the recording period (a construction sketched in code below). Note that the maximal level of a character was increased twice via expansion packs (from level 60 to 70 and from 70 to 80) during the period of recording (and a third time, from level 80 to 85, on December 7th, 2010, following the end of the data logging period).

We applied AA, NMF, k-means, and PCA to the dataset. Note that unsupervised methods usually suffer from the problem of having no objective way of defining threshold values, which makes the definition of the number of classes (or cluster centroids) to use a subjective decision. These aspects of classification analysis add to the difficulty of adopting these methods for non-experts in a game design/development context. For the presented experiments we set the number of basis vectors/classes to k = 8 (note that we only visualize the first five), based on a consideration of variance explained vs. retaining a useful number of basis vectors, the end goal being to produce player classes that are significantly different behaviorally.

The resulting basis vectors or cluster centroids are visualized in Fig. 12.4 for AA, and in Figs. 12.5, 12.6, and 12.7 for k-means, NMF, and PCA, respectively. For AA, for example, the leftmost plot in Fig. 12.4 shows the level/time history of a specific player who only very slowly increased his experience level from level 10 to level 20, and the second plot from the left shows a player who quickly increased his level to 70, and then after some time to level 80. These two player types can be immediately labeled as "casual player" and "hardcore player".

Comparing the resulting basis vectors of the different methods shows that only for k-means clustering and AA do we obtain an interpretable factorization. However, the k-means centroids (Fig. 12.5) are overall very similar and do not allow straightforward labeling: basically, they all show the same curve, where only the slopes vary slightly. In contrast, the AA basis vectors in Fig. 12.4 are intuitively easier to interpret. From these, we can also make assumptions about the leveling behavior of the players; the steepest increases in level seem to correlate with the release of expansion packs and the simultaneous increase of the maximal level. The basis vectors of PCA and NMF are, as expected given the nature of the algorithms, not or only partly interpretable. However, this does not necessarily mean they are useless. We can think of various tasks where a representation of individuals by meaningful parts (NMF) is desired. For example, it is reasonable to assume that social groups (guilds in World of Warcraft) consist of linear non-negative combinations of meaningful parts, e.g. leaders and followers.
This could be captured more accurately using NMF, as it does not restrict the basis vectors to actually existing data samples.
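
A minimal sketch of how such per-day level histories could be assembled into a feature matrix is given below; the log format and character names are hypothetical assumptions, not the WarCraft Realms schema:

    import numpy as np

    n_days = 2555  # one entry per day of the observation period
    # Hypothetical (character, day index, observed level) log entries,
    # sorted by day, standing in for the per-date online records.
    logs = [
        ("Ajira", 0, 1), ("Ajira", 30, 20), ("Ajira", 400, 70), ("Ajira", 900, 80),
        ("Borin", 5, 1), ("Borin", 2000, 12),
    ]
    characters = sorted({c for c, _, _ in logs})
    V = np.zeros((len(characters), n_days))
    for character, day, level in logs:
        # carry the observed level forward in time; later entries overwrite
        V[characters.index(character), day:] = level
    print(V.shape)  # one 2,555-dimensional level-history vector per character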

Fig. 12.4 Basis vectors for Archetypal Analysis. These reside on data samples (players in this case). All basis vectors correspond to legal player behavior (e.g. players do not lose levels). Note the straight line segments, which map directly to level increases

Fig. 12.5 Cluster centroids for k-means clustering reside on the center locations of cluster regions. While they accurately represent a broad number of players, they are overall very similar to each other and do not allow straightforward interpretation

Besides a descriptive representation, a quantitative discrimination of player types is desirable, i.e. how many players belong to each behavioral class. This, however, is only fully supported by k-means clustering, as it is the only method that builds hard cluster assignments, with each sample belonging to exactly one cluster. The other methods usually produce soft (or more precisely linear, convex, or non-negative) combinations of their basis vectors. This means that players are expressed in terms of their relationship to each of the eight behavioral profiles (basis vectors) located, and summarily grouped (clustered) according to their distribution in the space spanned by the basis vectors. For the numbers of players belonging to each basis vector provided here, players have been assigned to the nearest basis vector (behavioral profile), as sketched below. This provides clear profile divisions; however, a more precise way of grouping players would be to define clusters in the space spanned by the eight basis vectors. The results indicate that the distributions of players over the eight basis vectors are not similar across the four methods, with AA and PCA indicating three large groups, and k-means and NMF a division into four large and four smaller groups each (Fig. 12.8).
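
A minimal sketch of this nearest-basis-vector assignment (assuming samples as rows, and synthetic data):

    import numpy as np

    def hard_assign(V, W):
        """V: n x d samples, W: k x d basis vectors; index of nearest basis vector."""
        d2 = ((V[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)  # n x k distances
        return d2.argmin(axis=1)

    rng = np.random.default_rng(3)
    V = rng.normal(size=(1000, 5))
    W = V[:8]  # e.g. archetypes that reside on actual data points, as in AA
    print(np.bincount(hard_assign(V, W), minlength=8))  # players per profile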

Fig. 12.6 Basis vectors for non-negative matrix factorization represent parts of the original data samples. As they are strictly positive, they allow for interpretation, but they do not in this case correspond to actually existing players or to behaviors that are possible in the game, e.g. characters are seen to lose levels

Fig. 12.7 Basis vectors for principal component analysis. These do not correspond to actual players, and represent behaviors that are not possible in the game, e.g. loss of character levels

Fig. 12.8 Hard assignment of data samples to cluster centroids for (from top left and clockwise) AA, k-means, PCA and NMF. The bar charts highlight that the solutions generated by the four algorithms vary

12.3.1.4 The Evolution of Social Groups in World of Warcraft

Extracting meaningful information from very large amounts of data is a non-trivial task, especially if it is not entirely clear what to look for. In such situations, data mining resembles the proverbial search for a needle in a haystack. Thurau and Bauckhage (2010) proposed the use of Convex-Hull Non-Negative Matrix Factorization (CH-NMF) as an efficient approach towards AA-like data embeddings by means of constrained matrix factorization. The goal was to arrive at an accessible and interpretable description of very large amounts of game telemetry data, towards developing an analysis of the development of guilds over time.

The dataset used by Thurau and Bauckhage (2010) in this example is similar to the one used in the example above (Sect. 12.3.1.3), but consists of 192 million recordings of 18 million characters belonging to 1.4 million guilds, covering a period of 4 years, starting in 2005 (when World of Warcraft was released) and ending in early 2009. The data recorded (roughly) summarizes some of the social in-game activities of players. That is to say, we know when players joined or left a guild, how many players were with a guild at what time, and how character experience levels were distributed among the members of a guild, as well as information about class and race. As mentioned before, a player's experience level provides a measure of skill for that particular player. While a guild with a large number of high level players is more likely to be successful, a guild of only low level players is essentially excluded from a large amount of the game content.

The distribution of experience levels among guilds, i.e. the number of players of a certain level that are with a particular guild, therefore provides a feature that characterizes a guild in terms of game success. The distribution can be approximated by building a histogram over the experience levels of guild members (e.g. eight bins of ten levels each, as sketched in code below). If we build these histograms over all observations of a specific period of time, they also summarize the temporal evolution of a guild. A brief example should clarify this: if a guild is newly formed by level 80 players, it does not contain any observations of level 10 players, and the corresponding histogram bin will be empty. A guild which is formed by level 10 players should, over a longer period, also have observations of level 40, 60, 80 (and intermediate level) players, as the guild members usually increase their level over time.

In order to obtain an interpretable categorization of guilds, Thurau and Bauckhage (2010) applied Archetypal Analysis (more precisely its large-scale variant CH-NMF (Thurau et al. 2009)) to 1.4 million such guild histograms, containing data covering a period of 4 years. A number of different basis vector counts were tested, and it was found that eight basis vectors provide a convenient tradeoff between granularity and convenient visualization. Following the definition of CH-NMF, each basis vector resides on the convex hull (the outer surface of the point cloud in multi-dimensional space) of all the individual guild histograms, and thereby each basis vector represents an "archetypal" guild. As noted before, this makes the basis vectors easy to interpret, as there is usually only one salient characteristic – e.g. a guild comprised only of level 80 characters, or one whose players level very rapidly from 0 to 60.
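
The guild histogram feature described above is straightforward to sketch; the member levels below are invented for illustration:

    import numpy as np

    member_levels = [12, 14, 38, 61, 64, 68, 70, 80, 80]  # one hypothetical guild
    hist, _ = np.histogram(member_levels, bins=8, range=(0, 80))
    print(hist)  # [0 2 0 1 0 0 3 3] -- one row of the 1.4-million-guild matrix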
Figure 12.9 shows a constructed example of a cluster centroid, i.e. an archetypal histogram (due to copyright rules, the original illustrations could not be presented here). In the current case, the archetypal guilds are distinguishable from each other; the eight basis vectors describe the following types of overall guild behavior:

1. Formed early, then disbanded
2. Active till the 2nd game expansion (Wrath of the Lich King), then disbanded
3. Seldom active
4. Formed before the 2nd game update, then very active
5. Increasing activity, then disbanded
6. Increasing activity till the first expansion (The Burning Crusade), then disbanded
7. Active for character levels 10–80
8. Active between the 1st and 2nd expansions only

A wide variety of guilds are formed by convex combinations of these archetypal guilds, e.g. a guild can exhibit traits of both guild type 2 and guild type 5. Running the same data through k-means, Thurau and Bauckhage (2010) found that many of the resulting basis vectors were not as readily interpretable as the CH-NMF results, with the basis vectors tending towards being similar (an effect of the distribution of the data points in the variability space, as well as of the centroid-seeking behavior of the k-means algorithm).


Fig. 12.9 Example of a basis vector resulting from the application of CH-NMF to the World of Warcraft guild dataset (constructed example). The x-axis denotes the level histogram bin, the y-axis denotes the number of observations for this bin. The guild described here has a gradually increasing number of players in the lower levels, with a noticeable spike and plateau structure at level 60–70, and a major spike at level 70

In order to obtain a better understanding of the development of the guilds over time, Thurau and Bauckhage (2010) projected time slices (90 days, 180 days, 1 year, 2 years, 3 years, 4 years) of the guilds into the space spanned by the CH-NMF basis vectors. They noticed that the total number of guilds (including disbanded guilds) increased considerably over time, following a roughly exponential growth rate, but also with a high abandonment rate. Furthermore, a huge part of the guild space is densely covered: most guilds fall into the category of seldom active guilds (which could also indicate very small guilds) or are close to it. Only a small number of guilds (still, many thousands) fall completely into the other categories (the eight basis vectors mentioned above). On a final note, Thurau and Bauckhage (2010) did not find any significant differences between the development of guilds on US and EU servers.

12.3.2 Player Classification in Tomb Raider: Underworld

A Self-Organizing Map (SOM) is a form of artificial neural network that is used in unsupervised learning to look for low-dimensional representations of the input data, similar to multidimensional scaling (Kohonen 2001). For example, to find the main ways in which a group of people play a game. The input data are in this case gameplay metrics; the output are the classes into which the players are collected, along with the properties of the classes. For example, one class might be characterized by completing the game really fast, another by completing the game really slowly, and a third by not completing it at all.

SOMs (also called Kohonen maps after their inventor) work like most neural networks, in that the first step is building a model based on a training dataset, subsequently applying the model to map the main dataset. For example, in a 10,000 player sample, 1,000 could be used as training data, and the achieved SOM model then applied to the remaining 9,000. Without going into details, an SOM consists of neurons organized in a low-dimensional grid (usually two or three dimensions only). According to Drachen and Canossa (2009): "Each neuron on the grid (map) is connected to the input vector via a d-dimensional connection weight vector, $m = [m_1, m_2, \ldots, m_d]$, where d is the size of the input vector, x. The connection weight vector is also named prototype or codebook vector. In addition to the input vector, the neurons are connected to neighbor neurons of the map through neighborhood interconnections which generate the structure of the map: rectangular and hexagonal lattices organized in two-dimensional sheet or three-dimensional toroid shapes are some of the most popular topologies used." Please see Kohonen (2001) for a more detailed description.

Drachen and Canossa (2009) provide an example of how to field SOMs in practice. They used gameplay metrics data from 1,365 players of Tomb Raider: Underworld, including data on completion time, number of deaths, causes of death, etc. An SOM was used to find the emergent structures in the data, i.e. to classify the players into distinct groups based on their behavior. The analysis revealed four distinct classes of behavior, encompassing 93.54% of the player sample:

Cluster 1 (Veterans): (8.68%) characterized by having very few death events, and these mainly caused by the environment. Fast completion times. Generally perform very well in the game.

Cluster 2 (Solvers): (22.12%) die rarely, and very rarely use the help system in TRU, apparently preferring to solve the many puzzles in the game themselves. Take a long time to complete the game, indicating a slow-moving, careful style of play.

Cluster 3 (Pacifists): (46.18%) form the largest group of players, characterized by dying primarily from enemies. Completion time is relatively fast and help requests are minimal, indicating some skill at playing the game in terms of navigation, but not a lot of experience with the shooter elements of Tomb Raider: Underworld (the game uses shooting substantially more than previous iterations of the Tomb Raider series).

Cluster 4 (Runners): (16.56%) die often, by enemies as well as the environment, and use the help system fairly often, but complete the game very fast.

The results showcase how SOMs are useful for evaluating game designs. In this case, the analysis indicates that players of the game utilize the affordances provided by the game, rather than simply adopting a specific strategy to complete it. When evaluating whether people play a game as intended by the design, the type of results generated by SOMs are immediately useful. However, the results of an SOM analysis are not intuitively understandable, and need to be translated into language that the intended user of the analysis can act upon. Finally, SOMs provide a good first-strike method for classifying player behavior, providing an overall view useful in guiding drill-down analysis.
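
As a rough sketch of SOM-based player classification, the snippet below uses the third-party MiniSom package on synthetic features (this is not the toolchain or data used in the study):

    import numpy as np
    from minisom import MiniSom

    rng = np.random.default_rng(4)
    # Stand-in for 1,365 players with 4 normalized gameplay metrics each,
    # e.g. completion time, total deaths, help requests, deaths by environment.
    X = rng.random((1365, 4))

    som = MiniSom(3, 3, input_len=4, sigma=1.0, learning_rate=0.5, random_seed=0)
    som.train_random(X, num_iteration=5000)

    # Each player maps to a "winning" neuron; neurons act as behavioral classes.
    winners = [som.winner(x) for x in X]
    print(len(set(winners)), "occupied map cells")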

12.3.3 Frequent Pattern Mining

Frequent pattern mining is the name used for a set of problems and techniques related to finding patterns and structures in data. While related to other unsupervised learning problems and techniques, such as clustering, frequent pattern mining differs both in its methods and in the format of input and output data – in particular, the latter is discrete and in the form of sets or strings. Several important problems in game data mining, e.g. player type identification and identification of common player behavior patterns, can be cast as frequent pattern mining problems of some form. The whole field of frequent pattern mining is less than two decades old (it was introduced in Agrawal et al. (1993)), yet several efficient algorithms exist for solving these problems. Two particular types of frequent pattern mining problems that we will discuss here are frequent itemset mining and frequent sequence mining. Frequent itemset mining aims to find structure among data points that have no internal order, similar to most other data mining algorithms, whereas frequent sequence mining aims to find structure among data that has an inherent sequential (e.g. temporal) order. In the two sections below, we describe the problems, some main algorithms, and applications for game data mining.

12.3.3.1 Frequent Itemset and Association Rule Mining

In frequent itemset mining, the base data takes the form of sets of instances (also called transactions) that each have a number of features (also called items). For example, a dataset of the items players bought in a social game might contain four transactions as follows:

1. {Sword of Grungni, Shirt of Awesomeness, Pretty Pet}
2. {Shirt of Awesomeness, Pretty Pet, Healing Potion}
3. {Sword of Grungni, Healing Potion}
4. {Shirt of Awesomeness, Sword of Grungni, Fancy Hat, Pretty Pet}

The task for the frequent itemset mining algorithm is then to find all common sets of items, defined as those itemsets that have at least a minimum support (occur at least a minimum number of times). If the support is set to 3, the following 1-itemsets (sets of only one item) can be found in the dataset described above: {Sword of Grungni}, {Shirt of Awesomeness} and {Pretty Pet}. It is also possible to find one 2-itemset: {Shirt of Awesomeness, Pretty Pet}, as three of the transactions contain both Shirt of Awesomeness and Pretty Pet. Other itemsets of the same lengths are considered non-frequent, as they recur fewer than three times.

The original algorithm for mining frequent itemsets, which was published in 1993 and is still frequently used, is Apriori (Agrawal et al. 1993). This algorithm functions by first scanning the database to find all frequent 1-itemsets, then proceeding to find all frequent 2-itemsets, then 3-itemsets, etc. At each iteration, candidate itemsets of length n are generated by joining frequent itemsets of length n – 1; the frequency of each candidate itemset is evaluated before it is added to the set of frequent itemsets (a minimal sketch of this level-wise procedure is given below). However, there exist several alternatives to this algorithm. A prominent alternative is the FP-growth algorithm, which finds frequent itemsets by building prefix trees (Han et al. 2000).

Once a set of frequent itemsets has been found, association rules can be generated. Association rules are of the form A → B, and can be read as "A implies B". Each association rule has support (how common the precondition is in the dataset), confidence (how often the precondition leads to the consequence in the dataset) and lift (how much more common the consequence is in instances covered by the rule compared to the whole dataset). From the dataset and frequent itemsets above, the association rule Shirt of Awesomeness → Pretty Pet can be derived with support 3 and confidence 1, whereas the rule Shirt of Awesomeness → Pretty Pet and Sword of Grungni has a confidence of only 2/3 and so would most likely not be selected as a useful association rule.

Frequent itemset mining can be used in several different ways for understanding game data. One way is to find patterns among players. If a database is organized so that each instance describes a separate player and the (binary or ordinal) attributes of each instance describe the player's playing style (e.g. {violent, speedrunner, cleared_level_3, dies_from_falling}), frequent itemset mining can be used to find playing style characteristics that frequently co-occur.
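
The level-wise (Apriori-style) procedure can be sketched in a few lines of plain Python over the four example transactions; this is a didactic simplification, not an optimized Apriori implementation:

    transactions = [
        {"Sword of Grungni", "Shirt of Awesomeness", "Pretty Pet"},
        {"Shirt of Awesomeness", "Pretty Pet", "Healing Potion"},
        {"Sword of Grungni", "Healing Potion"},
        {"Shirt of Awesomeness", "Sword of Grungni", "Fancy Hat", "Pretty Pet"},
    ]
    min_support = 3

    def support(itemset):
        return sum(itemset <= t for t in transactions)

    size = 1
    current = [frozenset([i]) for i in set().union(*transactions)
               if support(frozenset([i])) >= min_support]
    frequent = []
    while current:
        frequent.extend(current)
        size += 1
        # join frequent (n-1)-itemsets into candidate n-itemsets, then filter
        candidates = {a | b for a in current for b in current if len(a | b) == size}
        current = [c for c in candidates if support(c) >= min_support]

    for itemset in frequent:
        print(sorted(itemset), support(itemset))
    # -> three frequent 1-itemsets, plus {Pretty Pet, Shirt of Awesomeness} (support 3)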

12.3.3.2 Frequent Sequence Mining

Unlike frequent itemset mining, frequent sequence mining cannot be applied to separate, unordered instances (such as where each instance represents a player). Instead, frequent sequence mining requires the instances to be ordered in one or several sequences. Probably the most common type of sequence data is temporal sequence data, where each instance represents the state of the system at some time t; the interval between instances might or might not be constant (in some terminology, an instance with all its features is called a symbol; identical instances map to the same symbol). The sequence mining problem is, given a sequence or a set of sequences, to find frequently occurring subsequences. For example, if the support threshold is set to 3, the sequence "abbabbcbdabb" has the frequent 3-sequence "abb" and the frequent 2-sequences "ab" and "bb". One of the most commonly used frequent sequence mining algorithms is SPADE (Zaki 2001). SPADE works in a similar way to Apriori: first find frequent sequences of length 1 (i.e. single symbols), then combine these frequent sequences into candidate sequences of length 2, evaluate their frequency, combine into sequences of length 3, etc.

By virtue of being discrete-time systems, computer games constantly generate large amounts of sequential data. At one extreme, one could consider the complete state of a game at every frame (where a modern game usually runs at 30 or 60 frames per second) as a data stream to be mined. Of course, this data stream would generate far too much data for any existing algorithm to handle, and all practical applications require that only a few interesting features are logged rather than the complete game state. Additionally, the temporal resolution is often decreased. Identifying which features are interesting to log depends on the purpose of the data mining, but they may include any aspect of the game state which is directly or indirectly affected by the player's actions, such as button presses, player character position and actions, non-player character position and actions, changing level geometry, etcetera.

Kastbjerg (2011) combined frequent sequence mining with clustering in order to visualize the spatial form of common sequences of player actions in the multiplayer game Heroes of Newerth (2010, S2 Games). This work was an attempt to improve on the "heatmaps" that are commonly used to analyze players' movements in game levels, but which do not convey information on what players did at any particular point in time (see Chap. 17 for more on heatmaps). A large, publicly available database of Heroes of Newerth game replays was mined; a few hundred thousand player traces were used in initial experiments (39,390 games, roughly 59 million events across 20,000 h of play data, average playtime per game around 30 min). For each game, the actions taken by each player were recorded, along with the (in-game) time and position of the action. The most frequent 3-sequences of actions were then found using SPADE (the process is shown in Fig. 12.10). Once a particular 3-sequence had been decided on, the starting points of that sequence were clustered (for very frequent sequences, more than a hundred thousand repetitions of that sequence could be found in the database for a particular map). The user can then select a particular starting point cluster, and from there investigate how players typically move as they perform the chosen action sequence starting from the chosen point.
This analysis revealed, for example, that the initial phase of a particular attack spell was frequently used before a teleport spell, and then unleashed on another part of the map; the spatial analysis pointed out which particular areas of the map this sequence typically started from and ended in (Fig. 12.10).

Sequences are not necessarily temporal data. Shaker et al. (2011) used frequent sequence mining as a way to find features with which to classify levels of the platform game Super Mario Bros (2004). The task was to classify which levels would be preferred over others, based on survey results from over 700 players who had played at least two levels each. Here, the sequences are not based on the players' actions over time, but simply on scanning the levels from left to right. In the version of Super Mario Bros that was used for the experiments, levels are linear (the level starts at the left end and is won by reaching the right end) and constructed of blocks. A level is about 15 blocks high and a couple of hundred blocks long. Each level was turned into a single sequence with the same length as the level (one symbol per block). A few different ways were investigated for transforming each vertical slice of the level into a symbol, for example by simply using the height of the level at that point, or by basing the symbol on the topmost block in that slice. In the next step, SPADE was applied to the sequences generated from levels, in order to find commonly occurring subsequences, i.e. commonly used level segments – these included flat parts without any gaps, lines of coins, short gaps surrounded by platforms, etc. Each level could then be categorized according to the incidence of these segments, and the segment counts were successfully used to form features when using a supervised learning algorithm to predict whether a particular level was preferred over another.
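
As a toy stand-in for SPADE, the following counts contiguous subsequences (n-grams) of a single symbol sequence and keeps those meeting a support threshold; real SPADE additionally handles multiple sequences and non-contiguous subsequences:

    from collections import Counter

    def frequent_ngrams(seq, n, min_support):
        counts = Counter(seq[i:i + n] for i in range(len(seq) - n + 1))
        return {s: c for s, c in counts.items() if c >= min_support}

    seq = "abbabbcbdabb"  # the example sequence from the text
    print(frequent_ngrams(seq, 2, 3))  # {'ab': 3, 'bb': 3}
    print(frequent_ngrams(seq, 3, 3))  # {'abb': 3}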

Fig. 12.10 Visualization of the sequence mining process applied to player traces in the Heroes of Newerth dataset. Image 1 shows an intermediate step of the data loading and frequent sequence mining process (SPADE). Each point on the map is colored in a heat map style, where the color of crossing edges is blended and increased in saturation. Image 2 shows the final result of the process. The result looks chaotic and is included to display how the data quickly becomes excessive, hence the need for information extraction in the following steps. Image 3 shows a particular 4-sequence selected from the result pool of the previous sequence mining process. Image 4 shows the result of applying a modified version of FDEB to the start points of each instance in the sequence. In short, the modified version omits the edge subdivision and intra-attraction parts (see Kastbjerg 2011, section 4.5 for an in-depth explanation). Image 5 shows the same as image 3, except only sequences that start from a specific area, within a user-defined radius. The particular area is selected by the user, but guided by the information found in step 4. Step 6 shows the final result, after the original FDEB algorithm has been applied to the edges selected in step 5 (Reproduced from Kastbjerg 2011 with permission)

Fig. 12.11 A visualization of a frequent action sequence "jump-jump" from Super Mario Bros. Each circle marks a point where the player jumps (Reproduced from Kastbjerg 2011 with permission)

12.4 Supervised Learning

Supervised learning methods for data mining are drawn from Machine Learning (ML), a branch of artificial intelligence concerned with the design and development of algorithms that allow computers to evolve behaviors based on data, notably for the purpose of prediction. Machine learning – and thus supervised learning – works from supervised training data. The training data, or signal, contains examples that the algorithms learn from, in order to be able to find the patterns or signals in data where the connection between input and output objects is unknown, e.g., which class to put a player in given a specific behavior. Supervised learning thus relies on a training dataset, or signal. The source of the signal distinguishes the different families of ML algorithms available: while in unsupervised learning the pattern (signal) is hidden in the internal structure of the data (interconnections among data attributes), and in reinforcement learning (Sutton and Barto 1998) the training signal is derived as a reward from the learning environment, in supervised learning the signal is given in the form of target data observations. Supervised learning is the process of training a function that approximates the mapping between attributes of the observations and the target observation.

As a popular example of supervised learning, consider a machine being asked to distinguish between apples and pears (classes), given the color and size of the fruit (data attributes). Initially, the machine learner is trained on a number of attribute-class pair observations (i.e. training data providing the color and size of a number of apples and pears), from which it learns how to classify apples and pears. It can then subsequently be used to classify fruit based on the color and size input data only. Popular supervised learning techniques include artificial neural networks, decision tree learning, support vector machines and Bayesian learning (Bishop 2006). The primary uses of supervised learning within games have so far been the imitation of player behavior, the analysis of player behavior in online games, the prediction of player behavior in massively multi-player online games, and notably the analysis of player behavior towards driving revenue in social online games (King and Chen 2009). For instance, the Drivatar system in Forza Motorsport (2005–2012, Microsoft Game Studios) is an artificial neural network that imitates the way a player drives a car and generates a race path that simulates the player's driving style. Similarly, the AI behind the player's deity avatar in Black and White (2001, Electronic Arts) uses supervised learning to imitate and respond to the player's actions and motivations.

A particular area of supervised learning in games is the imitation of human playing behavior. Given a sufficiently large set of behavioral metrics data derived from players, supervised learning can be used both for imitating human playing behavior and for predicting various aspects of that behavior. In AI research and industry, the main purpose of imitation is the creation of believable, human-like, non-player characters or similar computer-controlled entities, but there are also other purposes.
Prediction can give answers to questions such as: "when will this player stop playing the game?" and "how many times will this player use one weapon over another?" Essentially, supervised learning can be used for the prediction of any player attribute given in the dataset. These kinds of questions are important to get answered during the development and testing of a game, and even after release. Supervised learning methods can thus be used, e.g., to test and adjust designs, or even to form the basis for systems controlling real-time adjustment of game features during play (e.g. dynamic difficulty adjustment).
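
The apples-and-pears example sketched as code, using a scikit-learn decision tree (any supervised classifier would do; all attribute values are invented):

    from sklearn.tree import DecisionTreeClassifier

    # color: 0 = green, 1 = red; size in cm -- hypothetical training observations
    X_train = [[1, 7.0], [1, 7.5], [0, 8.0], [0, 9.5], [0, 10.0], [1, 6.5]]
    y_train = ["apple", "apple", "apple", "pear", "pear", "apple"]

    clf = DecisionTreeClassifier(max_depth=2).fit(X_train, y_train)
    print(clf.predict([[0, 9.0]]))  # classify a new fruit from its attributes alone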

12.4.1 Prediction Analysis and Decision Trees in Tomb Raider: Underworld

Prediction in data mining is performed with the goal of identifying a model or set of models that can be used to predict responses, for example, predicting which players will convert from non-paying to paying, or when particular players will stop playing a game. Notably in the context of social online games, the prediction of player behavior, and design responses to changes in behavior, are important to ensure revenue. Prediction is similar to classification in that a model is constructed based on known data, and the model is used to predict unknown or missing values, e.g. future player behavior. The major method in prediction is regression, which generally attempts to construct either linear or non-linear models.

Combining predictions from multiple models, which is particularly useful when the types of models included in an investigation are very different, is referred to as "stacking" or "stacked generalization". Stacking is interesting because experience has shown that predictions from multiple methods can yield more accurate predictions than any single method. When stacking, the predictions from different models are used as input to a meta-learner, which basically tries to combine the prediction models to create a "super-model" with the best predictions possible. The meta-learner can, for example, be a neural network, which attempts to learn from the data how to combine the models for maximal accuracy in the predictions. Alternative approaches to combining prediction models are boosting and bagging (for further information see: Witten and Frank (2000); Han et al. (2005)).

As an example of prediction based on game metrics (specifically gameplay metrics), we will use Tomb Raider: Underworld (2008, Eidos Interactive). For a more in-depth description of the example, please see Mahlman et al. (2010). Tomb Raider: Underworld consists of seven levels plus a prologue level. The goal of this analysis was to investigate whether it was possible to develop a model that could predict when a player would stop playing the game, based on their early play behavior. This kind of prediction is useful for locating players who stop playing early in the game, and for exploring why this happens and how to modify the design to prevent players from leaving the game. For this experiment, the Weka (www.cs.waikato.ac.nz/ml/weka/) toolbox was used. Weka is a relatively easy-to-use toolset for data mining, and includes a wealth of prediction algorithms (76 just for classification of nominal attributes) – and it is open-source.

The data for the analysis was drawn from the native metrics suite of Square Enix Europe, which contains data from approximately 1.5 million players of Tomb Raider: Underworld. From this population, a sub-sample of 10,000 players was selected, randomly drawn from a larger sample of over 200,000 players, whose metrics data was captured within the period of 1st December 2008 – 1st January 2009. The original sample of 10,000 players was cleaned thoroughly, removing instances where the metrics suite had missing data for a player. Because the aim of the analysis was to predict when players stop playing the game, only players who had completed level one were included. After cleaning the 10,000 player sample, 6,430 players remained. The data from level 1 were used as the training (learning) dataset. All features were normalized to a 0–1 scale via a uniform distribution to minimize the effect of outliers.
The input features (variables) were selected from the core mechanics of the game, a strategy which helps ensure that the features are relevant to player behavior analysis. Each feature was measured either per map unit or per level, giving a total feature set of over 400 variables (number of features * levels/map units):

• Playing time: the time that each player spent playing the game. This includes a number of features, notably the playing time spent in each sub-segment of each level in the game (there are over 70 such segments).
• Total number of deaths: how many times the player died.
• Help-on-Demand: how many times the player requested help from the native Help-on-Demand system in the game, which assists players with their progress in the game.
• Causes of death: the game features various ways in which a player can die. These were classified into four groups: death by melee enemies, death by ranged enemies, death by environmental causes, and death by falling (by far the most common cause of death in Tomb Raider: Underworld – 62.92%).
• Adrenalin: the number of times the adrenalin feature was used. Using adrenalin allows the player to temporarily slow down time while performing special attacks.
• Rewards: the number of rewards collected (the average is 112.08).
• Treasure: the number of treasures found. Each level has one or a few of these major finds, which take particular exploration to locate.
• Setting changes: players can change various parameters of the game, and four of these impact directly on gameplay and were therefore included: ammo adjustment, enemy hit points, player hit points, and saving grab adjustment (which adjusts the time a player has to secure a handhold after a jump).

From the dataset of 6,430 players (including all the variables mentioned above) who completed at least level 1, a smaller dataset was extracted, consisting of the 3,517 players who also completed at least level 2. A third set of data was created from the second, containing the 1,732 players that finished the entire game. The three datasets were used to try to predict the time taken to play through the game, with the underlying assumption that there is an (unknown) relationship or function between early playing behavior (levels 1 and 2) and the speed with which the game is completed – or, conversely, when a player will stop playing (i.e. the last level played) – which a classification algorithm will be able to predict.

Several classification algorithms were used on the two problems: completion time and last level played. The best results were found using logistic regression, a relatively simple algorithm, which could predict when a player would stop playing Tomb Raider: Underworld with a success rate of 77.3%. Several algorithms performed well on the dataset (notably the SMO support vector machine and MLP/backpropagation), with a much better accuracy than the baseline of 39.8% (the baseline is the optimal predictor in case there is no data available, equal to the number of samples in the most common class (level completed) divided by the total number of samples). The accuracy of the prediction is in this case decent, which is typical for prediction models built on high-dimensionality gameplay metrics datasets (in this case hundreds of features), presumably due to the high degree of variance in the datasets, i.e. in how people play games, and to data losses during the collection of telemetry data from game clients (Drachen et al. 2009).

The ability to predict when a player will stop playing a game, or for how long the game will be played, based on their early behavior, is useful in user-oriented testing, where it is possible to use this information to locate the kinds of behaviors that lead players to quit playing. This is particularly useful in certain forms of social online games, where player retention (the ability of the game to keep people playing it) is central to the revenue stream (see Chap. 4).
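
In scikit-learn rather than Weka, the prediction setup might be sketched as below; the feature matrix here is random noise standing in for the real telemetry, so the score only illustrates the mechanics, not the 77.3% result:

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(5)
    X = rng.random((6430, 400))        # stand-in for ~400 normalized features
    y = rng.integers(2, 8, size=6430)  # last level played (levels 2-7)

    model = LogisticRegression(max_iter=1000)
    print(cross_val_score(model, X, y, cv=5).mean())
    # On real telemetry the features carry signal; on this random stand-in the
    # score stays near the majority-class baseline.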

12.4.2 Decision Trees

Results from prediction analysis need to be explained in a way that makes them understandable to the target user, e.g., a game designer. Apart from accuracy in the predictions, an advantage of some of the algorithms for predictive data mining is that they provide relatively transparent models, which means that the consequences of changes to design elements can be easily understood. Decision trees are a good example of this.

Decision trees use a graphic approach to compare competing alternatives, and assign values to these alternatives, describing sequential decision problems. They provide a complementary approach to traditional statistical forms of analysis, such as multiple linear regression, and to data mining approaches such as neural networks. They are relatively powerful analytically, easy to use, easy to interpret and robust across a range of data and levels of measurement. They are presented incrementally, as a collection of one-cause, one-effect relationships in the recursive form of a tree, which means they are easier to understand than more complex multiple variable techniques (Rokach and Maimon 2008). Like other methods of multiple variable analysis, they allow the prediction, explanation, description or classification of an outcome. For example, a multiple variable analysis could address the probability that a player will convert from non-paying to paying as the combined effect of multiple variables, e.g. a marketing campaign, being given a valuable in-game item, the size of their social network in the game – or being given a free T-shirt if they sign up for a subscription. In essence, decision trees allow analysts to follow the effect of different decisions, and to plan the optimal strategy for causing specific decisions (or situations) to occur in the games in question – for example, answering questions such as: which set of methods for encouraging players to become paying users works best?

Decision trees are produced by algorithms which try to split a dataset into branch-like segments – hence the name. The branches form inverted decision trees that originate with a root node at the top, and branch out from there. Decision trees attempt to find relationships between input values and target values in a dataset (group of observations). When an input value is identified as being strongly related to a target value, all of these variables are grouped in a bin that becomes a branch in the tree. A strong relationship is formed when the value of an input variable can be used to predict the value of the target. For example, the amount of time a player spends in a particular map unit of Tomb Raider: Underworld might be a strong predictor of the completion time for the entire level, or the number of times a player activates the adrenalin feature of the game (an advanced game mechanic) could be a predictor of whether the player is an experienced player or not. To take a hypothetical example from Tomb Raider: Underworld (see Mahlman et al. 2010 for a more in-depth example), where decision tree analysis is employed to predict at which level players will stop playing, as a function of playtime and rewards, the resulting tree could look like this:

Level-2 rewards
   −> Rewards > 10
         Level-3 playtime
         −> playtime > 43 minutes : 4
         −> playtime < 43 minutes : 7
   −> Rewards < 10 : 2

The right arrow (−>) indicates a branch under the tree-node, which is directly above the symbol. The number to the right of the colon represents the predicted game level where the player will stop playing. What the tree means is that a strong relationship has been found between the level at which players stop playing, and the rewards earned at level 2 and the playtime at level 3. The first branch informs that if the player earns fewer than 10 rewards, they will stop playing at level 2. The second branch informs that if players spend more than 43 min on level 3, they will stop playing on level 4, but if they complete below this time, they will play through the entire game. Decision trees like this one can be employed on virtually any kind of variable tracked via telemetry, i.e. behavioral variables, in order to find out which features are the most important in determining when a player quits playing (or any other outcome being measured, e.g. how much money they spend on microtransactions) (Chap. 4), and the values of these features that prompt different end-points in the tree.
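A tree of this kind can be induced directly from telemetry with standard tools. The following minimal sketch, with hypothetical feature names and input file mirroring the example above, fits a shallow decision tree in Python with scikit-learn and prints it as readable rules; the depth is deliberately capped so the output stays interpretable for designers.

```python
# Sketch: inducing a decision tree like the one above and printing it as
# readable rules. Feature names mirror the hypothetical example; the data
# file is illustrative.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

df = pd.read_csv("tru_progression.csv")  # hypothetical telemetry export
X = df[["level2_rewards", "level3_playtime_min"]]
y = df["quit_level"]  # level at which each player stopped playing

# A shallow tree: depth is capped so the resulting rules stay interpretable.
tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(export_text(tree, feature_names=list(X.columns)))
```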

12.5 Game Data Mining in Free-to-Play Games

In this section we take a specific look at online games. These games are of particular interest in game data mining because they are highly dependent on understanding player behavior, and are currently one of the major forces driving the use of, and innovation in, data mining in game development. It is not the goal here to provide a comprehensive introduction to data mining in online games, but a brief overview. The reader is referred to the references for additional information.

To start with a brief (and generalized) historical perspective, the current push for data mining player behavior in the industry has to a certain extent been driven by the rise of the social online game – or Free-to-Play (F2P) – genre, as well as the widespread popularity of the Massively Multiplayer Online Game (MMOG) genre. MMOGs have an almost two decades long history, reaching back to games like Meridian 59 and EverQuest. They started getting serious attention in the regular press with the realization that these online, persistent worlds contained intricate economies (Castronova 2001), and, with Second Life (arguably a virtual world, not a game) and notably World of Warcraft, that they had become highly popular. With the evolution of Web 2.0 technologies, notably social networking platforms like Facebook, another type of game also increased in popularity: F2P, with early examples on Facebook including Mafia Wars.

MMOGs and F2P games were different from previous game forms in that they catered to very large groups of players who could interact in real time. MMOGs and many F2Ps are also persistent world games – they are always running – which facilitated the emergence of social communities in these games.

In the past few years, metrics-driven development has almost become standard in online games development and management. Acronyms and terms like ARPU, NoSQL and Big Data are becoming commonplace (see Chap. 4), and it is likely that most publishers and developers in the online games sphere are highly dependent on analytics and reports to keep their businesses running. While many Key Performance Indicators (KPIs) are common (Fig. 12.12), the level of sophistication in the analytics software and processes varies across the industry (Flood 2012). Competition, the cross-over of players between different sectors of the games industry, and the evolution in player communities over time require online games companies to field efficient data capture and storage, and the ability to generate KPIs and ad-hoc analyses and reports.

12.5.1 Metrics-Driven Business Practices in Online Games

A lot more could be said about the historical background of MMOGs and F2Ps (for more information see Fields and Cotton 2011), but the essence of the matter is that these games need data mining because they have to manage, and monetize on, a community of players. Generalizing, the essential requirement in the MMOG business model is to keep people engaged so they continue to pay subscription fees.

Fig. 12.12 Screenshot from an early beta version of the analytics tool from Game Analytics, showing the development over time of two of the common Key Performance Indicators (KPIs) for online games: Daily Active Users (DAU) and Average Revenue Per User (ARPU) (© Game Analytics, used with permission, www.gameanalytics.com)

The requirement for F2P games is to convince players to spend money on buying in-game resources. There are a number of ways to handle this kind of challenge, but they fundamentally relate to business intelligence management. There are a number of similarities between managing and monetizing on player communities and the management of websites, online forums and web-based communities in general. These, similar to online games, have customers coming and going, interacting with the site and/or with people via the site, for shorter or longer periods of time.

Web analytics is the field of research and practice dealing with quantitative analysis of user behavior on the Net (Jansen 2009), and back when the MMOG and F2P models were gaining momentum, there was a lot of knowledge available that could be adapted for use in these – and other – types of games, for example with regard to online advertising and customer retention, and the use of techniques like funnel analysis and cohort analysis to understand the cost of acquisition, retention factors, revenue generation, social factors, etc. (a minimal cohort example is sketched below).

The metrics-driven business practice in online games gained strong traction with the rapid growth of game companies like Zynga, BigFish and Wooga, who had adopted a metrics-driven development practice and became highly successful in a short period of time, and with the growth of the social application market in general (e.g. Facebook, Instagram, LinkedIn, Google+, Twitter, Picasa …). Business intelligence has emerged as a key aspect of operating a successful online games company.
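As an illustration of one of these adapted techniques, the following sketch computes a simple weekly cohort-retention table in Python with pandas. The event-log schema (player_id, timestamp) is an assumption made for the example.

```python
# Minimal cohort-retention sketch: group players by the week they first
# appeared and measure what fraction is still active N weeks later.
# The event-log schema (player_id, timestamp) is an assumption.
import pandas as pd

events = pd.read_csv("events.csv", parse_dates=["timestamp"])  # hypothetical log

# Cohort anchor = timestamp of each player's first recorded session.
first_seen = events.groupby("player_id")["timestamp"].min().rename("first_seen")
events = events.join(first_seen, on="player_id")

# Whole weeks elapsed since the player's first session.
events["week"] = (events["timestamp"] - events["first_seen"]).dt.days // 7
events["cohort"] = events["first_seen"].dt.to_period("W")

# Retention table: share of each weekly cohort still active N weeks later.
active = (events.groupby(["cohort", "week"])["player_id"]
                .nunique()
                .unstack(fill_value=0))
retention = active.div(active[0], axis=0)
print(retention.round(2))
```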

12.5.2 Data Mining Telemetry from Online Games

Whereas the publicly available knowledge about data mining in MMOGs is limited due to confidentiality issues, the available knowledge for F2P games is more comprehensive, if somewhat fragmented; it generally originates in articles, blog posts or reports from the industry, and is therefore not falsifiable. With that in mind, the data mining techniques for analyzing player telemetry from online games can be divided into three broad categories:

1. Key Performance Indicators: These are metrics like Daily Active Users (DAU) and churn rate, which are generated using descriptive methods, e.g. aggregates or ratios, typically calculated as a function of time, game build or geographical area (Chap. 4; Fields and Cotton 2011) (see the sketch following this list).

2. Adopted techniques: These are techniques adopted – and sometimes subsequently adapted – from other areas where business intelligence is applied, notably web analytics. Examples include acquisition analysis, funnel analysis, A/B testing and cohort analysis (Chap. 4; Fields and Cotton 2011). These methods are generally descriptive, or examples of characterization and discrimination.

3. Advanced techniques: These are techniques that rely on data mining methods for clustering, classification, prediction, estimation and association. Notably, user behavior prediction (Weber et al. 2011; Nozhnin 2012; Bauckhage et al. 2012; Lim 2012), classification of user behavior (Drachen et al. 2009, 2012) and retention modeling (Fields and Cotton 2011) have received interest in the online games sector, as these techniques are of key interest in driving revenue.
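To make the first category concrete, the sketch below computes two common KPIs, DAU and ARPU, as a function of time in Python with pandas. The two input tables (session events and purchases) and their column names are hypothetical.

```python
# Sketch of the first category: descriptive KPIs as a function of time.
# Assumes two hypothetical telemetry tables: session events and purchases.
import pandas as pd

sessions = pd.read_csv("sessions.csv", parse_dates=["timestamp"])
purchases = pd.read_csv("purchases.csv", parse_dates=["timestamp"])

# DAU: distinct players seen per calendar day.
dau = sessions.groupby(sessions["timestamp"].dt.date)["player_id"].nunique()

# ARPU: revenue per day divided by that day's active users.
revenue = purchases.groupby(purchases["timestamp"].dt.date)["amount"].sum()
arpu = (revenue / dau).fillna(0.0)

kpis = pd.DataFrame({"DAU": dau, "ARPU": arpu})
print(kpis.tail())
```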

12.6 Discussion and Next Steps

In this chapter, we have presented an introduction to data mining and its particular application in game development: game data mining. A number of important issues in relation to working with game telemetry datasets have been discussed, covering topics such as methods, stakeholders and practice. Additionally, we have outlined a number of examples showing how to perform different types of supervised and unsupervised analysis on game telemetry data.

While the focus of the chapter is on game telemetry data, and the types of problems they can be applied to solve, the general principles and the methods presented are not unique, but rather common in data mining across several application areas, and therefore accessible in a wealth of literature to anyone interested in learning more about data mining (e.g. Han et al. 2005). For more on game data mining, Chap. 4 outlines KPIs for online games; Chap. 7 describes developer-facing analytics; Chap. 17 discusses game data mining in the specific context of spatial data, i.e. data with a spatial component (e.g. data on player movement in a 3D environment); and Chaps. 18 and 19 go into more depth with visualization of game telemetry data.

Game telemetry presents some challenges that are uncommon, or maybe even unique, in large-scale user-oriented datasets:

1. The data can have a high dimensionality, often with thousands of features (or variables) being tracked for each user.

2. The data can be of substantial size, an average MMOG or social online game generating datasets on the terabyte scale.

3. It is often necessary to compile datasets for analysis from disparate sources, e.g., game telemetry and account systems, with associated challenges in merging data and avoiding redundancies.

4. Obtaining game telemetry from remote clients, across multiple hardware platforms (many games are released on multiple hardware platforms), requires well-designed back-end systems to ensure that the datasets are as complete as possible. This is in particular a challenge when collecting data from devices that are not online all the time while the user is playing, e.g. games for smartphones.

The list goes on, and it is out of scope here to provide a full discussion of all of the issues related to game data mining. However, we provide a starting point for non-experts, and hopefully the case studies will satisfy the game data mining expert as well.

It can seem like a daunting project to develop the technical back-end for collecting and storing game telemetry, as well as learning how to pre-process and subsequently analyze the datasets, and finally finding the best ways to present the results to various stakeholders in development companies. The cost alone can seem prohibitive, but the simple fact that publishers like Microsoft, Square Enix, EA Games and Ubisoft are employing game telemetry, and the success stories of companies like Zynga, Wooga, BigFish and Bungie, are strong indications of the benefits that can be obtained via game data mining.

To make game data mining easier, recent years have seen a proliferation of tech startups seeking to develop middleware tools that enable even small developers to work with telemetry data (e.g. Game Analytics, HoneyTracks, Playtomic, Playnomics, Tableau, Kontagent), as well as a number of open-source tools for different engines (e.g. Unity) and analytics packages that can be used for analyzing game telemetry (e.g. statistics packages like SPSS, and open-source data mining tools like WEKA and RapidMiner).

While the main focus has hitherto been on analyses of player behavior (e.g. Zoeller 2010) (see also Chaps. 4, 7, 14, 17, 18, and 19) and customer data (e.g. King and Chen 2009) (see Chap. 4), the potential scope of application of game data mining as a source of business intelligence is substantial, crossing marketing, production (Mellon 2009) (see Chaps. 6 and 7), design, user testing, strategic decision making, etc. If the current rapid development in the application of various types of measures to guide game development, including game telemetry, is any indication, the application of game metrics in the digital entertainment industry will become a standard that is as normal as other types of processes in business, e.g. benchmarking and usability testing – if you need more evidence, read any other chapter in this book.

About the Authors

Anders Drachen, Ph.D. is a veteran data scientist, currently operating as Lead Game Analyst for Game Analytics (www.gameanalytics.com). He is also affiliated with the PLAIT Lab at Northeastern University (USA) and Aalborg University (Denmark) as an Associate Professor, and sometimes takes on independent consulting jobs. His work in the game industry as well as in data and game science is focused on game analytics, business intelligence for games, game data mining, game user experience, industry economics, business development and game user research. His research and professional work is carried out in collaboration with companies spanning the industry, from big publishers to indies. He writes about analytics for game development on blog.gameanalytics.com, and about game and data science in general on www.andersdrachen.wordpress.com. His writings can also be found in the pages of Game Developer Magazine and on Gamasutra.com.

Christian Thurau, Ph.D. is CTO of Game Analytics (www.gameanalytics.com) and a former researcher at the Fraunhofer IAIS in St. Augustin and at the Bonn-Aachen International Center for Information Technology B-IT. He works with developing game telemetry systems and the application of advanced data mining methods in games contexts, for example player modeling, behavior cloning and data analysis in the massive dataset size range. His fields of research include pattern recognition, computer vision, data mining, and machine learning.

Julian Togelius, Ph.D. is Associate Professor at the Center for Computer Games Research, IT University of Copenhagen, Denmark. He works on all aspects of computational intelligence and games, on geometric generalization of stochastic search algorithms and on evolutionary reinforcement learning. His current main research directions involve search-based procedural content generation in games, automatic game design, and fair and relevant benchmarking of game AI through competitions. He is also the chair of the IEEE CIS Technical Committee on Games, and an associate editor of IEEE Transactions on Computational Intelligence and AI in Games. He holds a BA from Lund University, an MSc from the University of Sussex, and a PhD from the University of Essex.

Georgios N. Yannakakis, Ph.D. is an Associate Professor at the IT University of Copenhagen. He received the Ph.D. degree in Informatics from the University of Edinburgh in 2005. Prior to joining the Center for Computer Games Research, ITU, in 2007, he was a postdoctoral researcher at the Mærsk Mc-Kinney Møller Institute, University of Southern Denmark. His research interests include user modeling, neuro-evolution, computational intelligence in computer games, machine learning, cognitive modeling and affective computing. He has published over 100 journal and international conference papers in the aforementioned fields. He is an Associate Editor of the IEEE Transactions on Affective Computing and the IEEE Transactions on Computational Intelligence and AI in Games, and the chair of the IEEE CIS Task Force on Player Satisfaction Modeling.

Christian Bauckhage, Ph.D. is Professor of Media Informatics at the University of Bonn and lead scientist for multimedia pattern recognition at Fraunhofer IAIS. He obtained a PhD in Computer Science from Bielefeld University, Germany, and has worked in academic and industrial research labs. He is an expert in large-scale data mining and pattern recognition and researches computational approaches to artificial cognition and behavior.

References

Agrawal, R., Imielinski, T., & Swami, A. (1993). Mining association rules between sets of items in large databases. In Proceedings of the 1993 ACM-SIGMOD international conference on management of data (SIGMOD) (pp. 207–216). Washington, DC.

Bauckhage, C., Kersting, K., Sifa, R., Thurau, C., Drachen, A., & Canossa, A. (2012). How players lose interest in playing a game: An empirical study based on distributions of total playing times. In Proceedings of IEEE computational intelligence in games, Granada, Spain.

Berry, M., & Linoff, G. (1999). Mastering data mining: The art and science of customer relationship management. New York: Wiley.

Bishop, C. M. (2006). Pattern recognition and machine learning (Information science and statistics). New York: Springer.

Bohannon, J. (2010). Game-miners grapple with massive data. Science, 330(6000), 30–31.

Castronova, E. (2001). Virtual worlds: A first-hand account of market and society on the Cyberian frontier (CESifo Working Paper Series No. 618). München.

Chapman, P., Clinton, J., Kerber, R., Khabaza, T., Reinartz, T., Shearer, C., & Wirth, R. (2000). CRISP-DM step-by-step data mining guide. http://www.crisp-dm.org/

Charles, D., & Black, M. (2004, November 8–10). Dynamic player modelling: A framework for player-centric digital games. In Proceedings of CGAIDE 2004, 5th international conference on computer games: Artificial intelligence, design and education. Microsoft Campus, Reading, UK. ISBN 09549016-0-6.

Chen, M. S., Han, J., & Yu, P. S. (1996). Data mining: An overview from a database perspective. IEEE Transactions on Knowledge and Data Engineering, 8, 866–883.

Coulton, P., Bamford, W., Cheverst, K., & Rashid, O. (2008). 3D space-time visualization of player behavior in pervasive location-based games. International Journal of Computer Games Technology, 2008, Article ID 192153. doi:10.1155/2008/192153.

Cutler, A., & Breiman, L. (1994). Archetypal analysis. Technometrics, 36(4), 338–347.

DeRosa, P. (2007, August 7). Tracking player feedback to improve game design. Gamasutra. Available from: http://www.gamasutra.com/view/feature/1546/tracking_player_feedback_to_.php

Drachen, A., & Canossa, A. (2009). Towards gameplay analysis via gameplay metrics. In Proceedings of the 13th international MindTrek conference. Tampere: ACM.

Drachen, A., & Canossa, A. (2011). Evaluating motion: Spatial user behavior in virtual environments. International Journal of Arts and Technology, 4, 294–314.

Drachen, A., Canossa, A., & Yannakakis, G. N. (2009). Player modeling using self-organization in Tomb Raider: Underworld. In Proceedings of the international symposium on Computational Intelligence and Games, CIG'09, Piscataway.

Drachen, A., Sifa, R., Bauckhage, C., & Thurau, C. (2012). Guns, swords and data: Clustering of player behavior in computer games in the wild. In Proceedings of IEEE computational intelligence in games, Granada, Spain.

Ducheneaut, N., & Moore, R. J. (2004). The social side of gaming: A study of interaction patterns in a massively multiplayer online game. In Proceedings of the 2004 ACM conference on computer supported cooperative work, Chicago.

Erfani Joorabchi, M., & Seif El-Nasr, M. (2011, October 5–8). Measuring the impact of knowledge gained from playing FPS and RPG games on gameplay performance. In Proceedings of the 10th international conference, ICEC 2011 (Lecture notes in computer science, Vol. 6972, pp. 300–306). Vancouver.

Fayyad, U. M., Piatetsky-Shapiro, G., Smyth, P., & Uthurusamy, R. (1996). Advances in knowledge discovery and data mining. Menlo Park: AAAI Press.

Fields, T., & Cotton, B. (2011). Social game design: Monetization methods and mechanics. Waltham: Morgan Kaufmann Publishers.

Finesso, L., & Spreij, P. (2004). Approximate nonnegative matrix factorization via alternating minimization. In Proceedings of the 16th international symposium on mathematical theory of networks and systems, Leuven.

Flood, K. (2012, March 27). Game analytics (series). Kevin's Corner.

Gagne, A., Seif El-Nasr, M., & Shaw, C. (2012). Analysis of telemetry data from a real time strategy game: A case study. Computers in Entertainment (CIE), 10(3), Article No. 2. New York: ACM. doi:10.1145/2381876.2381878.

Geng, L., & Hamilton, H. J. (2006). Interestingness measures for data mining: A survey. ACM Computing Surveys, 38(3), Article No. 9. New York: ACM. doi:10.1145/1132960.1132963.

Golub, G., & van Loan, J. (1996). Matrix computations (3rd ed.). Baltimore: Johns Hopkins University Press.

Han, J., Pei, J., & Yin, Y. (2000). Mining frequent patterns without candidate generation. In Proceedings of the 2000 ACM-SIGMOD international conference on management of data (SIGMOD) (pp. 1–12). New York.

Han, J., Kamber, M., & Pei, J. (2005). Data mining: Concepts and techniques (2nd ed.). San Francisco: Morgan Kaufmann Publishers.

Hoobler, N., Humphreys, G., & Agrawala, M. (2004). Visualizing competitive behaviors in multi-user virtual environments. In Proceedings of the conference on visualization. Los Alamitos: IEEE.

Houlette, R. (2004). Player modeling for adaptive games. In S. Rabin (Ed.), AI game programming wisdom II (pp. 557–566). Hingham: Charles River Media.

Isbister, K., & Schaffer, N. (2008). Game usability: Advancing the player experience. San Francisco: Morgan Kaufman.

Jansen, B. J. (2009). Understanding user-web interactions via web analytics. San Rafael: Morgan & Claypool Publishers.

Jolliffe, I. (1986). Principal component analysis. New York: Springer.

Kastbjerg, E. (2011). Combining sequence mining and heatmaps to visualize game event flows (working title). Master's thesis, IT University of Copenhagen, Copenhagen.

Kennerly, D. (2003, August 15). Better game design through data mining. Gamasutra. Available from: http://www.gamasutra.com/view/feature/2816/better_game_design_through_data_.php

Kim, J. H., Gunn, D. V., Phillips, B. C., Pagulayan, R. J., & Wixon, D. (2008). Tracking real-time user experience (TRUE): A comprehensive instrumentation solution for complex systems. In Proceedings of the twenty-sixth annual SIGCHI conference on human factors in computing systems, CHI'08, Florence.

King, D., & Chen, S. (2009). Metrics for social games. Presentation at the social games summit 2009, game developers conference, San Francisco, CA.

Kohonen, T. (2001). Self-organizing maps. Heidelberg: Springer.

Larose, D. T. (2004). Discovering knowledge in data: An introduction to data mining. Hoboken: Wiley.

Lee, D. D., & Seung, H. S. (1999). Learning the parts of objects by non-negative matrix factorization. Nature, 401(6755), 788–799.

Lim, N. (2012, June 26). Freemium games are not normal. Gamasutra. URL: http://www.gamasutra.com/blogs/NickLim/20120626/173051/Freemium_games_are_not_normal.php

Mahlmann, T., Drachen, A., Canossa, A., Togelius, J., & Yannakakis, G. (2010). Predicting player behavior in Tomb Raider: Underworld. In Proceedings of the international conference on Computational Intelligence and Games, CIG'10, Copenhagen.

Mellon, L. (2009). Applying metrics driven development to MMO costs and risks. White paper, Versant Corporation.

Missura, O., & Gärtner, T. (2009). Player modeling for intelligent difficulty adjustment. In Proceedings of the 12th international conference on discovery science, DC'09, Berlin.

Moura, D., Seif El-Nasr, M., & Shaw, C. D. (2011). Visualizing and understanding players' behavior in video games: Discovering patterns and supporting aggregation and comparison. In Proceedings of the 2011 ACM SIGGRAPH symposium on video games (Sandbox '11) (pp. 11–15). New York. ISBN 978-1-4503-0775-8. doi:10.1145/2018556.2018559.

Nozhnin, D. (2012, May 17). Predicting churn: Data-mining your game. Gamasutra. URL: http://www.gamasutra.com/view/feature/170472/predicting_churn_datamining_your_.php

Paatero, P., & Tapper, U. (1994). Positive matrix factorization: A non-negative factor model with optimal utilization of error estimates of data values. Environmetrics, 5(2), 111–126.

Pedersen, C., Togelius, J., & Yannakakis, G. N. (2010). Modeling player experience for content creation. IEEE Transactions on Computational Intelligence and AI in Games, 2, 54–67.

Rokach, L., & Maimon, O. (2008). Data mining with decision trees: Theory and applications. New Jersey: World Scientific Publishing.

Seif El-Nasr, M., & Zammitto, V. (2010). User experience research for sports games. Presentation at the GDC summit on games user research, San Francisco, CA.

Seif El-Nasr, M., Aghabeigi, B., Milam, D., Erfani, M., Lameman, B., Maygoli, H., & Mah, S. (2010). Understanding and evaluating cooperative games. In CHI 2010 (pp. 253–262). New York.

Shaker, N., Yannakakis, G., & Togelius, J. (2011). Feature analysis for modeling game content quality. In Proceedings of the 2011 IEEE conference on computational intelligence and games (pp. 126–133). Seoul, Korea.

Sutton, R. S., & Barto, A. G. (1998). Reinforcement learning: An introduction (Adaptive computation and machine learning). Cambridge: The MIT Press.

Thawonmas, R., & Iizuka, K. (2008). Visualization of online-game players based on their action behaviors. International Journal of Computer Games Technology, 2008, 1–9.

Thawonmas, R., Kashifuji, Y., & Chen, K. T. (2008, December 3–5). Design of MMORPG bots based on behavior analysis. In Proceedings of the 2008 international conference on advances in computer entertainment technology, ACE'08, Yokohama, Japan (ACM International Conference Proceeding Series 352, pp. 91–94). doi:10.1145/1501750.1501770. ISBN 978-1-60558-393-8.

Thompson, C. (2007). Halo 3: How Microsoft labs invented a new science of play. Wired Magazine.

Thurau, C., & Bauckhage, C. (2010). Analyzing the evolution of social groups in World of Warcraft. In Proceedings of the international conference on Computational Intelligence and Games, IEEE, CIG'10, Copenhagen.

Thurau, C., & Drachen, A. (2011). Introducing archetypal analysis for player classification in games. In Proceedings of the international workshop on evaluating player experience in games (EPEX'11), hosted at the 6th international conference on the foundations of digital games (FDG 2011), Bordeaux.

Thurau, C., Bauckhage, C., & Sagerer, G. (2004, July 13–17). Learning human-like movement behavior for computer games. In Proceedings of the 8th international conference on the Simulation of Adaptive Behavior, SAB'04, Los Angeles, USA. ISBN 9780262693417.

Thurau, C., Paczian, T., Sagerer, G., & Bauckhage, C. (2007). Bayesian imitation learning in game characters. International Journal of Intelligent Systems Technologies and Applications, 2(2–3), 284–295.

Thurau, C., Kersting, K., & Bauckhage, C. (2009). Convex non-negative matrix factorization in the wild. In Proceedings of the IEEE international conference on data mining, Miami.

Thurau, C., Kersting, K., & Bauckhage, C. (2010). Yes we can – Simplex volume maximization for descriptive web-scale matrix factorization. In Proceedings of the international Conference on Information and Knowledge Management, ACM, CIKM'10, Toronto.

Thurau, C., Kersting, K., Wahabzada, M., & Bauckhage, C. (2011). Descriptive matrix factorization for sustainability: Adopting the principle of opposites. Journal of Data Mining and Knowledge Discovery, 24, 325–354.

Weber, B., & Mateas, M. (2009). A data mining approach to strategy prediction. In Proceedings of the international symposium on Computational Intelligence and Games, CIG'09, Piscataway.

Weber, B. G., John, M., Mateas, M., & Jhala, A. (2011). Modeling player retention in Madden NFL 11. In Proceedings of the association for the advancement of artificial intelligence conference, San Francisco.

Williams, D., Yee, N., & Caplan, S. E. (2008). Who plays, how much, and why? Debunking the stereotypical gamer profile. Journal of Computer-Mediated Communication, 13, 993–1018.

Williams, D., Consalvo, M., Caplan, S., & Yee, N. (2009). Looking for gender (LFG): Gender roles and behaviors among online gamers. Journal of Communication, 59, 700–725.

Witten, I. H., & Frank, E. (2000). Data mining. New York: Morgan-Kaufmann.

Yannakakis, G. A. (2012). Game AI revisited. In Proceedings of the conference on computing frontiers, Cagliari.

Yannakakis, G. N., & Hallam, J. (2009). Real-time game adaptation for optimizing player satisfaction. IEEE Transactions on Computational Intelligence and AI in Games, 1, 121–133.

Yannakakis, G. N., & Togelius, J. (2011). Experience-driven procedural content generation. IEEE Transactions on Affective Computing, 2(3), 147–161.

Zaki, M. J. (2001). SPADE: An efficient algorithm for mining frequent sequences. Machine Learning, 42, 31–60.

Zoeller, G. (2010). Game development telemetry. Presentation at the game developers conference 2010.