Available online at www.sciencedirect.com


ScienceDirect

Procedia Computer Science 126 (2018) 1289–1297

www.elsevier.com/locate/procedia

International Conference on Knowledge Based and Intelligent Information and Engineering Systems, KES2018, 3-5 September 2018, Belgrade, Serbia

Hardcore Gamer Profiling: Results from an unsupervised learning approach to playing behavior on the Steam platform

Florian Baumann*, Dominik Emmert, Hermann Baumgartl, Ricardo Buettner

Aalen University, Germany

Abstract

Based on a very large dataset of over 100 million Steam platform users we present the first comprehensive analysis of hardcore gamer profiles by over 700,000 hardcore players (users playing more than 20 hours per week) covering more than 3,300 games. Using an unsupervised machine learning approach we reveal the specific behavioral categories of hardcore players, i.e. First Person Shooter, Team Fortress 2 player, Action game player, Dota 2 player, Strategy and action combiner, Genre-switching player. Subsequently we derive individual patterns of hardcore gamers in the categories found, such as strategy-action games combiner or game switching players. Our results are useful for computer science and information systems scholars interested in individual differences in user behavior as well as practitioners interested in game-designing.

© 2018 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0/). Selection and peer-review under responsibility of KES International.

Keywords: computer gaming behavior; hardcore gamer; player profiling; k-means cluster analysis

1. Introduction

Computer science and information systems scholars argue for the need to take individual differences into account in order to increase our understanding of information systems user behavior [1-3]. One of the most interesting areas for retrieving unbiased and large amounts of data is the computer gaming sector [4].

Online distribution networks like Valve’s Steam, one of the largest gaming networks for computer games in the world, provide an effective opportunity to analyze game behavior in an unbiased way, because the Steam platform community records actual gaming behavior without distortion across a broad spectrum of different players and games.

While a substantial number of studies analyzing game playing behavior exist, these generally focus on overall playing behavior [9-16]. However, in-depth analyses of specific types of players have been largely neglected. Since the so-called “hardcore gamer” spends most of his/her lifetime and money in gaming [8, 9], it is valuable for computer science and information systems scholars to understand a player’s intentions and behavior.

* Corresponding author. E-mail address: [email protected]-aalen.de

1877-0509 © 2018 The Authors. Published by Elsevier Ltd. 10.1016/j.procs.2018.08.078


That is why our study focuses on the behavior of “hardcore gamers” in a lot more detail. This type of player is more dedicated to gaming in almost every way, for example due to their high level of involvement in games, quantified by time spent playing and the scale of their respective in-game achievements [5-7]. Hardcore gamers can be described as people who play as a lifestyle preference and invest substantial amounts of time and money on games [8, 9]. They constitute the pioneers of a particular game, despite being the smallest group of players among the total player-base, and they help to define the experience for their fellow players through their own actions and behavior [7]. By identifying and analyzing their playing patterns it is possible to see how games are perceived by these influential players [7]. Such information can help support improvements in game design and game development [4, 7]. It is also important for the game industry to know and better understand their most influential players in terms of marketing and sales-promotional activities.

In this paper, we provide results from an unsupervised learning approach to the analysis of the playing behavior of over 700,000 hardcore players, covering more than 3,300 games. Our analysis is based on a very large dataset, collected by O'Neill et al. [10]. The dataset was originally used to analyze the gaming behavior of over 100 million Steam users in general, and offers numerous possibilities for follow-up research [10]. For this reason, in our study we focus on a subset of players with the highest playtime in the dataset. To form this subset, Steam players classified as “hardcore gamers” were selected. According to Poels et al. [17] “hardcore gamers” play 19 hours per week on average. O'Neill et al. [10] also demonstrated that the 95th percentile of gamers has a total playtime of 1,233.9 hours, while the 99th percentile has 2,660.1 hours of total playtime. Based on these results, players with a minimum total playtime of 2,000 hours and a minimum two-week playtime of 40 hours across all games were selected as “hardcore gamers”. This subset represents the active Steam community and is characterized by the large amount of time the players spend playing computer games.
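The selection rule above can be illustrated with a minimal sketch. It assumes that each player record already carries the summed total playtime and the summed two-week playtime in hours; the record type, field names and sample values are hypothetical and are not taken from the dataset.

```python
from dataclasses import dataclass

# Thresholds from the selection rule stated in the text.
MIN_TOTAL_HOURS = 2000.0
MIN_TWO_WEEK_HOURS = 40.0


@dataclass(frozen=True)
class PlayerPlaytime:
    """Hypothetical record layout, assumed for illustration only."""
    steam_id: int
    total_hours: float      # playtime summed over all games
    two_week_hours: float   # playtime over the previous two weeks, all games


def is_hardcore(player: PlayerPlaytime) -> bool:
    """True if the player meets both minimum playtime thresholds."""
    return (player.total_hours >= MIN_TOTAL_HOURS
            and player.two_week_hours >= MIN_TWO_WEEK_HOURS)


def select_hardcore(players):
    """Keep only the players classified as hardcore gamers."""
    return [p for p in players if is_hardcore(p)]


# Made-up sample records.
sample = [
    PlayerPlaytime(1, 2500.0, 45.0),   # meets both thresholds
    PlayerPlaytime(2, 2500.0, 10.0),   # not active enough recently
    PlayerPlaytime(3, 1200.0, 60.0),   # total playtime too low
]
hardcore_ids = [p.steam_id for p in select_hardcore(sample)]
print(hardcore_ids)
```

Both conditions are treated as inclusive minimums here, which is one reading of "minimum playtime".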

Our results identify six behavioral subtypes of hardcore gamers as well as the games and genres they play.

In the next section we examine related work in the field and outline our methodology, before presenting results in-depth and a discussion of them. We then conclude by highlighting limitations and future work.

2. Research Background

While scholars are becoming more and more interested in hedonic information systems, the focus on computer games as a serious field of research was established within the last two decades. As a result there are comprehensive general overviews of gaming behavior [4, 16, 18, 19] and a lot of studies on individual games and their related communities [11, 13]. The authors of these studies limit their findings since data were regularly biased due to the single game focus. Authors of studies analyzing gaming behavior using questionnaires limit their findings due to self-selection and social desirability biases.

For these reasons scholars acknowledge the desirability of capturing unbiased actual behavior data of a broad range of players using gaming platform data. Since the Steam platform (store.steampowered.com) is the largest digital computer gaming platform with an open application programming interface, analyzing Steam data offers potential insights into realistic gaming behavior. Scholars [10, 21] have demonstrated that Steam data is unbiased as they have also found well-known real-world phenomena [22] in gaming behavior data from Steam. For instance, Becker et al. [21] demonstrated that the network shows small world graph characteristics by studying the structure of Steam’s gaming network. They collected Steam network data of 9 million users, 82.3 million friendships, 1.98 million groups and 1,824 games using a web crawling program [21]. O'Neill et al. [10] also demonstrated the power law distribution in Steam data.

However, while gaming behavior was analyzed either based on biased data from questionnaires, or unbiased data from gaming platforms but with a single-game focus or small data, no gamer type specific analysis using massive unbiased data exists, with one exception: Sifa et al. [12] analyzed cross-games behavior based on a Steam sub-sample of 6 million players and 3,000 games. As a result, Sifa et al. [12] demonstrated that gaming behavioral data can be clustered according to the users’ playtime. Cross-games behavior is a term used by Sifa et al. [12] to refer to cross-sectional analyses of relationships between players / game ownership, games and playing time.

We extend the approach adopted by Sifa et al. [12] using cluster analysis but based on massive data from all 108.7 million Steam user accounts and their 384.4 million owned games. In addition, to offer in-depth results we cluster them based not only on playtime but also on game type/genre.

3. Methodology

3.1. Dataset

The dataset we use is from a comprehensive analysis of the Steam gaming network, comprising all 108.7 million user accounts and 384.4 million owned games, a scale that makes this dataset unique in terms of both magnitude and focus [10]. The dataset [10, 22] covers gaming behavior across several dimensions, for example, social connectivity, playtime, game ownership, genre affinity and monetary expenditure. Compared to a lot of other studies collecting (small and biased) samples of Steam user data via web crawling, the dataset we use here captured the data directly from Steam’s application programming interface.

3.2. Data cleansing

The dataset has been adjusted according to the definition of “hardcore gamers”. In order to classify “hardcore gamers”, all Steam-IDs were filtered by their playtime over the previous two weeks and their added total playtime for all games. It was also necessary to consider games only, and no other software such as developer tools. All lines were removed that did not correspond to the game type or downloadable content (DLC). Some entries were not games but developer or streaming software, even though they had the entry of a specific game type. To cater for this, we removed all entries with genres that were not playable, examples being “Video Production” and “Utilities”. The final dataset consists of 707,477 players and 3,366 different games.

3.3. Clustering method

To analyze gamers’ behavior we use K-means clustering. The Player Characteristic (SteamID) and Game Characteristic (GameID) correlate with the age of an account, where lower IDs indicate older accounts. We use Euclidean distance for measuring the similarity. Related work shows that K-means clustering is an appropriate method for analyzing play behavior and creating player profiles [12, 18, 26]. It is necessary to find the best number of clusters to run the K-means algorithm with the cleaned-up dataset. There are two methods to search for the best number of clusters; these are the silhouette and the elbow method [24]. It is also necessary to encode the categorical data in the final table using Label Encoding since the features Player Characteristic and Game Characteristic are categorical and the K-means algorithm can only be used with numerical data [25]. To preserve information about the account age we sorted the Player Characteristic and Game Characteristic in ascending order. This process makes possible the correct usage of the K-means algorithm, and the algorithm clustered the Player Characteristic by games and their respective playtime. Most clustering performance metrics such as Adjusted Rand Index, Homogeneity, V-measure or Fowlkes-Mallows score require knowledge of the ground truth labels. Since these labels are unknown for the given hardcore gaming dataset, we report the average silhouette score for each cluster in order to evaluate cluster quality. With further analysis it is possible to explain these clusters in more detail by creating player profiles for each cluster [18]. These profiles characterize the playing behavior of hardcore gamers on the Steam platform [26].

For cluster naming and explanation we use all the characteristics the dataset provides.
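The clustering setup described in Section 3.3 (label encoding of the two ID features in ascending order, followed by K-means with Euclidean distance) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the paper does not name its software, the helper names `label_encode` and `kmeans` are hypothetical, and the six sample rows are made up.

```python
import math
import random


def label_encode(values):
    """Map each distinct value to an integer code, assigned in ascending order."""
    codes = {v: i for i, v in enumerate(sorted(set(values)))}
    return [codes[v] for v in values]


def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's algorithm with Euclidean distance; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k distinct rows as initial centroids
    labels = None
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Made-up rows: (player id, game id, playtime in hours).
rows = [
    (7001, 30, 100.0),
    (7002, 30, 120.0),
    (7003, 20, 90.0),
    (7004, 10, 3000.0),
    (7005, 10, 3100.0),
    (7006, 20, 2900.0),
]
player_codes = label_encode([r[0] for r in rows])
game_codes = label_encode([r[1] for r in rows])
points = [(float(p), float(g), r[2]) for p, g, r in zip(player_codes, game_codes, rows)]

labels, centroids = kmeans(points, k=2)
print(labels)
```

In this toy example the playtime column has a much larger numeric range than the encoded IDs, so it dominates the Euclidean distance; whether and how features were scaled in the study is not stated in the text.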


4. Results

4.1. Most played games and genres

Since 10 games drew more playtime than all the other 3,537 games together (Fig. 1), we found a power law-like distribution of playtime in Steam data.

Figure 1. Game-related playtime distribution [y-axis shows the total playtime from zero to 1 billion hours].
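The concentration of playtime in a few titles can be quantified as the share of total playtime held by the n most played games. The following is a small sketch of that calculation; the function name and the playtime figures are hypothetical and serve only to show the arithmetic.

```python
def top_n_share(playtime_by_game, n=10):
    """Fraction of total playtime accounted for by the n most played games."""
    totals = sorted(playtime_by_game.values(), reverse=True)
    grand_total = sum(totals)
    if grand_total == 0:
        return 0.0
    return sum(totals[:n]) / grand_total


# Made-up playtime totals in hours for five games.
example = {"game_a": 500, "game_b": 300, "game_c": 100, "game_d": 60, "game_e": 40}
share = top_n_share(example, n=2)
print(share)  # 0.8
```

A share above 0.5 for n = 10 corresponds to the statement that ten games drew more playtime than all other games together.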

A genre-related analysis revealed that in general action games dominate gaming behavior (Figure 2).

Figure 2. Genre-related playtime distribution without Dota 2 since Dota 2 is related to both genres action and strategy [y-axis shows the total playtime from zero to 1 billion hours].


4.2. Gamer profiles

K-means is an unsupervised algorithm that clusters data entries based on the given number of clusters. The most appropriate number of clusters for a given dataset can be calculated using the silhouette method or the elbow method. Both methods show that gaming behavior can be partitioned into six clusters (separation index: 0.4). For this reason we built six clusters to meaningfully group the behavior of hardcore gamers (Table 1).
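To make the silhouette criterion concrete, the sketch below computes the mean silhouette score of a given partition: for each point, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to the members of any other cluster, and the score is (b - a) / max(a, b). The function name and the four one-dimensional sample points are assumptions for illustration; the code averages over all points, whereas Table 1 reports averages per cluster.

```python
import math
from collections import defaultdict


def mean_silhouette(points, labels):
    """Mean silhouette score over all points, using Euclidean distance."""
    clusters = defaultdict(list)
    for p, lab in zip(points, labels):
        clusters[lab].append(p)
    if len(clusters) < 2:
        raise ValueError("the silhouette score needs at least two clusters")

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention for single-member clusters
            continue
        # The sum includes dist(p, p) = 0, hence the division by len(own) - 1.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Two well-separated pairs of made-up points.
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
good = mean_silhouette(points, [0, 0, 1, 1])  # pairs kept together
bad = mean_silhouette(points, [0, 1, 0, 1])   # pairs split apart
print(round(good, 3), round(bad, 3))
```

Running a clustering for several candidate numbers of clusters and keeping the one with the highest mean silhouette is the usual way this score is used for model selection.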

Table 1. Hardcore gaming behavior appropriately grouped by six clusters including playtime distribution (playtime in hours; M = mean, STD = standard deviation).

No.  Cluster name                  Number of players  Share of playtime  Playtime M  Playtime STD  Silhouette score
1    First-Person Shooter          221,507            13.5 %             3,293.0     2,089.0       0.39
2    Team Fortress 2 player        96,362             9.2 %              2,956.6     1,567.1       0.35
3    Action game player            295,588            11.2 %             2,938.5     1,855.0       0.39
4    Dota 2 player                 14,438             39.5 %             2,638.3     1,542.3       0.35
5    Strategy and action combiner  1,008              15.5 %             5,409.9     3,368.4       0.47
6    Genre-switching player        78,574             11.1 %             3,080.7     2,357.7       0.38

We used the K-means algorithm with the features Player Characteristic (N=707,477), Game Characteristic (N=3,366), and (summarized) Playtime per game in hours (N=2,222,468,173, M=45.96, STD=292.05).

First Person Shooter (Cluster 1): Hardcore gamers in this cluster spend most of their time playing shooters such as Counter-Strike (Global Offensive, and ), Call of Duty, Left 4 Dead (2) or DayZ. While users in this category also play Dota 2 and Garry’s Mod, the playtime of these two games is small compared to the time spent on shooters. With the exception of Dota 2, none of the games played by hardcore gamers in this category are free to play, and some are very expensive.

Team Fortress 2 player (Cluster 2): Hardcore gamers in this category use most of their time to play Team Fortress 2 followed by Skyrim, Civilization V and Terraria. Players in this cluster tend to use free-to-play games such as Team Fortress 2 or War Thunder.

Action game player (Cluster 3): Users in this category are attracted to action games such as Dungeon Defenders or Grand Theft Auto (IV). None of the Top 50 games in this cluster are free to play with the exception of Dota 2.

Dota 2 player (Cluster 4): This category is dominated by players who spend a large amount of playing time on Dota 2 (420 million playing hours in total). Dota 2 is a free-to-play multiplayer online battle arena computer game. With a big gap in terms of playtime, a series of first-person shooters is also played by users in this category (Fig. 3).



Figure 3. Total playtime of games in Dota 2 Player category (cluster 4).

Strategy and action combiner (Cluster 5): Users in this category play games which simultaneously combine strategy and action gaming elements. Users here often switch between Civilization V, Counter-Strike, Dota 2, Garry’s Mod and Skyrim (alphabetical order).

Genre-switching player (Cluster 6): Hardcore gamers in this category move from genre to genre (role-playing, adventure, indie, strategy, action, etc.) when playing games (Fig. 4). It seems that these players are very open to new gaming types and elements.

Figure 4. Top 20 Games of genre-switching players.

5. Discussion

Our results demonstrate that hardcore gamers can be grouped into separate clusters that reflect their individual playing behavior. These clusters seem to reflect individual differences in players, which is an interesting finding for computer science and information systems scholars [1-3]. For example, Jansz and Tanis [37] found that players of first-person shooters (cluster 1 in our study) scored highest on motives with respect to competition and challenge. In addition, it can be speculated that strategy and action combiners (cluster 5) score higher on conscientiousness, and that genre-switching players (cluster 6) score higher on openness to experience [38, 39].


In addition we found that the free-to-play property of games has a substantial impact on the playing of these games. Many top games, and even games with less playtime, are free to play games. These make up a very large proportion of the content on Steam. Examples include the big names Dota 2 and Team Fortress 2, as well as increasingly popular free-to-play games like Path of Exile [27]. One of the reasons for the popularity of these free to play games is that you can basically play them for free without any restrictions on gameplay. Spending money on these titles does not provide any competitive advantage over fellow players. There are also titles where this is not the case, but this type of free-to-play game is not noticeable in the analysis. Free-to-play games usually use microtransactions to finance them. The player is thus less inhibited from first testing the game and only spends money if he/she wants to [29].

It is noticeable that many of Steam's top titles are played in the Electronic Sports League (ESL). These include games like Dota 2, the Counter-Strike series and titles from the Call of Duty series [30-32]. Games and teams found here will be famous in the context of this eSport which provides a good marketing opportunity. Free-to-play games and games that can be purchased through the acquisition of software licenses are equally popular on Steam and in the ESL [31, 32]. Another aspect that makes these games become popular is expenditure on them from social media platforms. Most games played in the ESL are played by streamers on YouTube, Twitch and other streaming platforms. This has a large potential to, first, increase the distribution of games and, second, to analyze game popularity [33].

The clusters also show which types of games are played most frequently via the gaming platform Steam. Genres such as car racing games or general sports games are relatively less well represented compared to the others. Players tend to be attracted more to real-time strategy games like Dota 2, Shooters or more generally Action, Strategy, RPG, Indie or Adventure titles. Certain titles might be more interesting to users of personal computers than others. One reason for this could be that consoles are easy to use via a controller that is well suited for sports games. Keyboard and mouse work well for shooters and real-time strategy games, as they provide a greater variety of key combinations and engagement with the monitor is very precise [34]. Steam tries to close this gap through the use of its proprietary controller [35]. It is possible to connect popular controllers for the Xbox series and PlayStation series to the PC and use them as a gaming interface. Another reason could be that very popular and famous sport titles like the FIFA series are Electronic Arts (EA) games. EA has its own game distribution platform for personal computers called Origin.

Another notable feature of heavily used games is that these games have multiplayer capability (e.g., almost all top shooters, strategy games like Civilization V, etc.).

5.1. Limitations

One limitation is that we did not observe behavior while a game was being played; instead we retrieved behavioral data summarizing facts such as the playtime of each gaming session. However, the approach used here is in line with several studies analyzing gaming behavior [40].

Since numerical and categorical data is present in the dataset, a mixture of similarity measurements such as Hamming and Euclidean distance could yield better clustering results [41].
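One way to read the suggestion of mixing similarity measurements is a distance that counts mismatches on the categorical features (Hamming) and adds the Euclidean distance on the numerical features. The sketch below is an assumption-laden illustration, not a method evaluated in this paper: the function name, the `weight` parameter that balances the two parts, and the sample records are all hypothetical.

```python
import math


def mixed_distance(a, b, categorical_idx, weight=1.0):
    """Euclidean distance on numerical features plus weighted Hamming
    distance (number of mismatches) on categorical features."""
    cat = set(categorical_idx)
    hamming = sum(1 for i in cat if a[i] != b[i])
    euclidean = math.sqrt(
        sum((a[i] - b[i]) ** 2 for i in range(len(a)) if i not in cat)
    )
    return euclidean + weight * hamming


# Made-up records: (genre, feature 1, feature 2); index 0 is categorical.
x = ("action", 0.0, 3.0)
y = ("strategy", 4.0, 0.0)
z = ("action", 4.0, 0.0)

d_xy = mixed_distance(x, y, categorical_idx=[0])  # 5.0 Euclidean + 1 mismatch
d_xz = mixed_distance(x, z, categorical_idx=[0])  # 5.0 Euclidean, same genre
print(d_xy, d_xz)
```

The choice of `weight` and the scaling of the numerical features determine how much a category mismatch counts relative to a difference in playtime, so both would need to be tuned for real data.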

Despite analyzing playing behavior using massive data (over 700,000 hardcore players covering more than 3,300 games), our results are internally valid for Steam platform users [10, 23, 35] but possibly slightly biased (single data source selection bias) [42].


5.2. Future Work

Based on the clustering results derived from this work we intend to analyze cluster-personality relationships in one future study and personality-mining opportunities in another study. Mining information about a user’s personality based on gaming behavior is of particular interest for recruiting [43-47].

Furthermore, while we used the K-means algorithm due to its good performance in previous studies that also clustered gaming data [12, 16, 18], and due to its low time complexity as well as good applicability on a large scale [48], future work could use different clustering algorithms like DBSCAN or k-prototypes and genre-specific clustering. In addition, this study could be expanded by examining how gaming behavior changes over time.

6. Conclusion

Based on a very large dataset of over 100 million Steam users we analyzed the gaming behavior of over 700,000 hardcore players covering more than 3,300 games. Using K-means clustering as an unsupervised learning approach we identified specific behavioral categories (First Person Shooter, Team Fortress 2 player, Action game player, Dota 2 player, Strategy and action combiner, Genre-switching player), and subsequent individual patterns of gaming behavior. Our results are useful for computer science and information systems scholars interested in individual differences in user behavior [1-3] as well as practitioners interested in game-designing.

References

1. I. A. Junglas, N. A. Johnson, and C. Spitzmüller, “Personality traits and concern for privacy: an empirical study in the context of location-based services”, European Journal of Information Systems Vol. 17, pp. 387-402, 2008.
2. R. Buettner, “Predicting user behavior in electronic markets based on personality-mining in large online social networks: A personality-based product recommender framework”, Electronic Markets Vol. 27 No. 3, pp. 247-265, 2017.
3. R. Buettner, “Personality as a predictor of business social media usage: An empirical investigation of XING usage patterns”, In PACIS 2016 Proceedings: 20th Pacific Asia Conference on Information Systems (PACIS), June 27 - July 1, Chiayi, Taiwan, 2016.
4. M. Seif El-Nasr, A. Drachen, and A. Canossa, Game Analytics: Maximizing the Value of Player Data, Springer, London, England, 2013.
5. J. Tuunanen, and J. Hamari, “Meta-synthesis of player typologies”, In Proc. of Nordic Digra 2012 Conference: Local and Global – Games in Culture and Society, Tampere, Finland, 2012, pp. 1-14.
6. B. Ip, and G. Jacobs, “Segmentation of the games market using multivariate analysis”, Journal of Targeting, Measurement and Analysis for Marketing, Vol. 13 No. 3, 2005, pp. 275–287.
7. B. Kirman, and S. Lawson, “Hardcore Classification: Identifying Play Styles in Social Games using Network Analysis”, In Proc. of the Int. Conf. on Entertainment Computing (ICEC), Paris, France, 2009, pp. 246–251.
8. J. Juul, A Casual Revolution: Reinventing Video Games and Their Players, MIT Press, Cambridge, Massachusetts, London, England, 2010.
9. B. Manero, J. Torrente, M. Freire, and B. Fernández-Manjón, “An instrument to build a gamer clustering framework according to gaming preferences and habits”, Computers in Human Behavior, Vol. 62, 2016, pp. 353–363.
10. M. O'Neill, E. Vaziripour, J. Wu, and D. Zappala, “Condensing Steam: Distilling the Diversity of Gamer Behavior”, IMC ’16 Proc. of the Internet Measurement Conference (IMC), ACM, 2016, Santa Monica, USA, pp. 81-95.
11. A. Drachen, J. Green, C. Gray, E. Harik, P. Lu, R. Sifa, and D. Klabjan, “Guns and Guardians: Comparative Cluster Analysis and Behavioral Profiling in Destiny”, In Proc. of Computational Intelligence and Games (CIG), IEEE, 2016, Santorini, Greece, pp. 1-8.
12. R. Sifa, A. Drachen, and C. Bauckhage, “Large-Scale Cross-Game Player Behavior Analysis on Steam”, In Proc. of the 11th Artificial Intelligence and Interactive Digital Entertainment Conference (AIIDE), AAAI, Santa Cruz, USA, 2015, pp. 198-204.
13. R. Sifa, A. Drachen, C. Bauckhage, C. Thurau, and A. Canossa, “Behavior Evolution in Tomb Raider Underworld”, In Proc. of CIG, IEEE, Niagara Falls, Canada, 2013.
14. A. Drachen, R. Sifa, C. Bauckhage, and C. Thurau, “Guns, swords and data: Clustering of player behavior in computer games in the wild”, In Proc. of CIG, IEEE, Granada, Spain, 2012, pp. 163-170.
15. A. Drachen, A. Canossa, and G. N. Yannakakis, “Player Modeling using Self-Organization in Tomb Raider: Underworld”, In Proc. of CIG, IEEE, Milano, Italy, 2009, pp. 1–8.
16. C. Bauckhage, K. Kersting, R. Sifa, C. Thurau, A. Drachen, and A. Canossa, “How Players Lose Interest in Playing a Game: An Empirical Study Based on Distributions of Total Playing Times”, In Proc. CIG, IEEE, Granada, Spain, 2012, pp. 139-146.
17. Y. Poels, J.H. Annema, M. Verstraete, B. Zaman, D. DeGroof, “Are you a gamer? A qualitative study on the parameters for categorizing casual and hardcore gamers”, IADIS International Journal on WWW/Internet, 2012, pp. 1-16.


18. R. Sifa, A. Drachen, and C. Bauckhage, “Profiling in Games: Understanding Behavior from Telemetry”, In Social Interaction in Virtual Worlds, Cambridge University Press, 2018.
19. C. Chambers, W. Feng, S. Sahu, and D. Saha, “Measurement-based Characterization of a Collection of On-line Games”, In Proc. of IMC, Berkeley, USA, 2005, pp. 1-14.
20. W. Feng, D. Brandt, and D. Saha, “A Long-Term Study of a Popular MMORPG”, In NetGames ’07: Proc. of ACM SIGCOMM workshop on Network and system support for games, Melbourne, Australia, 2007, pp. 19–24.
21. R. Becker, Y. Chernihov, Y. Shavitt, and N. Zilberman, “An Analysis of The Steam Community Network Evolution”, In Proc. Convention of Electrical and Electronics Engineers, IEEE, Eilat, Israel, 2012, pp. 1-5.
22. L. Poessneck, H. Hofmann, and R. Buettner, “Physical Theories of the Evolution of Online Social Networks: A Discussion Impulse”, In Proc. of 7th International Conference on Internet and Web Applications and Services (ICIW 2012), May 27 - June 1, 2012, Stuttgart, Germany, pp. 137-142.
23. M. O'Neill, E. Vaziripour, J. Wu, and D. Zappala, “Explanation and download link for the basic dataset of this paper”, https://steam.internet.byu.edu/, January 2018.
24. T. Kodinariya, and P. Makwana, “Review on determining number of Cluster in K-Means Clustering”, International Journal of Advance Research in Computer Science and Management Studies, Vol. 1 No. 6, 2013, pp. 90-95.
25. Z. Huang, and M. K. Ng, “A Fuzzy k-Modes Algorithm for Clustering Categorical Data”, IEEE Transactions on Fuzzy Systems, Vol. 7 No. 4, 1999, pp. 446-452.
26. C. Bauckhage, A. Drachen, and R. Sifa, “Clustering Game Behavior Data”, IEEE Transactions on Computational Intelligence and AI in Games, Vol. 7 No. 3, 2015, pp. 266-278.
27. Githyp, “Githyp shows the rising Player count of Path of Exile”, http://www.githyp.com/path-of-exile-100607/player-count/, January 2018.
28. Steam, “Steam Statistics shows the ranking of Games”, http://store.steampowered.com/stats/?l=german, January 2018.
29. C. B. Hart, H. Chou, M. D. Cruea, S. Cuff, B. Kice, B. Liboriussen, J. Svelch, C. Terry, S. S. Wang, and E. Whatman, The Evolution and Social Impact of Video Game Economics, Lexington Books, Lanham, USA, 2017.
30. Electronic Sports League, “ESL shows their approved games”, https://play.eslgaming.com/games, January 2018.
31. E-Sports Earnings, “The site shows the ranking of the Top eSports Titles of 2017”, https://www.esportsearnings.com/history/2017/games, January 2018.
32. Y. Seo, “Electronic sports: A new marketing landscape of the experience economy”, Journal of Marketing Management, Vol. 29 No. 13-14, 2013, pp. 1542-1560.
33. S. Nakandala, G. L. Ciampaglia, N. M. Su, and Y. Ahn, “Gendered Conversation in a Social Game-Streaming Platform”, In Proc. of the International AAAI Conference on Web and Social Media (ICWSM), ACM, Montreal, Canada, 2017, pp. 162-171.
34. K. Gkikas, D. Nathanael, and N. Marmaras, “The evolution of FPS games controllers: how use progressively shaped their present design”, In Proc. of the 11th Panhellenic Conference in Informatics (PCI), 2007, Patras, Greece, pp. 37-46.
35. Steam, “Information about the Steam Controller”, http://store.steampowered.com/app/353370/Steam_Controller, January 2018.
36. R. Sifa, A. Drachen, C. Bauckhage, C. Thurau, and A. Canossa, “Behavior Evolution in Tomb Raider Underworld”, In Proc. of CIG, IEEE, Niagara Falls, Canada, 2013.
37. J. Jansz, and M. Tanis, “Appeal of Playing Online First Person Shooter Games”, Cyberpsychology & Behavior, Vol. 10 No. 1, 2007, pp. 133-136.
38. E. Romero, P. Villar, M. Á. Luengo, and J. A. Gómez-Fraguela, “Traits, personal strivings and well-being”, Journal of Research in Personality, Vol. 43 No. 4, 2009, pp. 535-546.
39. R. R. McCrae, and P. T. Costa, “A five-factor theory of personality”, In Pervin, L. A.; John, O. P. (eds) Handbook of Personality: Theory and Research, Guilford, New York, 1999, pp. 139-152.
40. H. M. Pontes, and M. D. Griffiths, “Measuring DSM-5 internet gaming disorder: Development and validation of a short psychometric scale”, Computers in Human Behavior, Vol. 45, 2015, pp. 137-143.
41. Z. Huang, “Extensions to the k-Means Algorithm for Clustering Large Data Sets with Categorical Values”, Data Mining and Knowledge Discovery, Vol. 2 No. 3, 1998, pp. 283-304.
42. P. M. Podsakoff, S. B. MacKenzie, J.-Y. Lee, and N. P. Podsakoff, “Common method biases in behavioral research: a critical review of the literature and recommended remedies”, Journal of Applied Psychology, Vol. 88 No. 5, 2003, pp. 879-903.
43. R. Buettner, “Getting a job via career-oriented social networking markets: The weakness of too many ties”, Electronic Markets Vol. 27 No. 4, pp. 371-385, 2017.
44. R. Buettner, and I. J. Timm, “An Innovative Social Media Recruiting Framework for Human Resource Consulting”, In Nissen, V. (eds) Digital Transformation of the Consulting Industry: Extending the Traditional Delivery Model. Progress in IS series, 2018, pp. 415-425.
45. R. Buettner, “Development of an Efficient Europe-wide e-Recruiting System (European Recruiting 2020)”, In Bakırcı, F.; Heupel, T.; Kocagöz, O.; Özen, Ü. (eds) German-Turkish Perspectives on IT and Innovation Management: Challenges and Approaches, 2018, pp. 267-274.
46. R. Buettner, “Innovative Personality-based Digital Services”, In PACIS 2016 Proceedings: 20th Pacific Asia Conference on Information Systems (PACIS), June 27 - July 1, Chiayi, Taiwan, 2016.
47. R. Buettner, “A Framework for Recommender Systems in Online Social Network Recruiting: An Interdisciplinary Call to Arms”, In HICSS-47 Proceedings: 47th Hawaii International Conference on System Sciences (HICSS-47), January 6-9, 2014, Big Island, Hawaii, pp. 1415-1424.
48. D. Xu, and Y. Tian, “A Comprehensive Survey of Clustering Algorithms”, Annals of Data Science, Vol. 2 No. 2, pp. 165-193.