<<

arXiv:2009.08947v1 [cs.IR] 11 Sep 2020 n/rpaesi sdi nbigtepeitos nagame a In predictions. the enabling in used In ga is other. about players the information can played and/or available we has predictions, player and based two a similar content when play probably game ob- often are one by games recommend players game the possible if or together, made example, player games For use are player correlations. not past predictions served of does the set it because data and features a interactions, divided [1] on game (CB) be based based and can is content algorithms and filtering (CF) Collaborative popular filtering most collaborative into the but profiles. contexts, user and informa content additional game often about and available interactions is past of lar facilitates sets solution gaming data problem: natural recommendation a game like his- the interacti to seem of new therefore predict sets to systems data interactions Recommender on user trained and game be torical can has user that systems and recommender methods of investigated sales field methods increase academic The these can engagement. in recommendations of interested good catalog are large because place. a Businesses of from market games. problem players the possible practical to and the in games new have games successful recommending also certain be marketplaces play game would players Many game certain a why whether know to ested erig odstart cold learning, lyr o aeaayispurposes. ce by also analytics liked game models are for use games content players certain them The why recommendation. of makes interpretations game which provide games world tasks, new real outperfo these both for models in the and th filtering that players, models find collaborative We interaction new simultaneously. based players games, and content new dat present survey into a to gam use generalize we likes new paper, games this game completely In those a of possible. for not interactions for predictions is past players words, interactions of and the other c number new In significant be it predict present. a However, to where to interactions. players found applied and past game been and be of describe has interactions, only predictor that game filtering accurate features and Collaborative most include player players. sets historical or data of the possible sets made sometimes are data Recommendations by systems. recommender of o aeRcmedto nteCl tr setting Start Cold the in Recommendation Game for otn ae lyradGm neato Model Interaction Game and Player Based Content eomne ytm aebe netgtdi different in investigated been have systems Recommender inter- all are platforms and publishers, developers, Game ne Terms Index Abstract Gm eomnaini nipratapplication important an is recommendation —Game gm eomnain aeaayis machine analytics, game recommendation, —game aksVlae,TpoPahikkala Tapio Viljanen, Markus eateto uueTechnologies Future of Department .I I. NTRODUCTION nvriyo Turku of University uk,Finland Turku, rtain set a ons. mes tion rm ful an ge re es at s . , e-rc 6 otana dnia oe eyfast. the very as model known identical result an train mathematical standa to a the [6] utilize vec-trick with we infeasible so pairs interactions player approach, t and the to game learning interactions, lear possible their to of makes then or number is The games, task features. or The given players given. of is response vectors collaborative latent the the to on of where approach both (SVD), or standard Decomposition Value the Singular methods the by the filtering, motivations motivated of gaming be development market as The can game preferences. such useful gameplay traits mode provide and player the they about interpret that information find also and therefore fast parameters, We simple, are interpretable. methods easily resulting filtering The collaborative settings. than different better generalize that simultaneously dation players and new 3), and 2), (Setting games 4). (Setting games (Setting new players without for without players (Setting predicting new games interactions for have new that predicting for players predicting We past 1), common. and predicti is [5]: of games [4] players problem past settings new evaluation the four and reality define games therefore in new but histori for in interactions, recommendation predicting ability game game predicted and their on player by Research methods setting. evaluated has this c kno in filtering is collaborative because predict interactions problem, no start or with cold few the setting with as The games the [3]. or players unless interactions for no predictions, predict to accurate filtering tasked cor- more are Collaborative always methods produce [2]. not to found may utility been accuracy perceived that with recognize literatur relate easil to system and started recommender objective has an the is However, this task. since evaluated methods, the of accuracy aefaue n lyrfaue)wt olbrtv filte collaborative (i.e. with features) information calle player So content and likes. features the game game combine observed features, the recommenders player in or hybrid result features together, game both their how or learning about Predictions on questions habits. based answer gaming then to and players motivations, the preferences, becaus ask creat flexible, to directly very used can is be features can player text Creating that reviews, features. etc. genres, videos, tags, gameplay have information, typically games database, nti td,w pl e ehd ogm recommen- game to methods new apply we study, this In the improving by motivated often is research Academic etefrClaoaieResearch Collaborative for Centre uk al,AiKoponen Aki Vahlo, Jukka nvriyo Turku of University uk,Finland Turku, gfor ng othe to annot one e ring. and has are ual wn cal he rd y n d e e e l II. RELATED WORK because it is closely related to game features. Furthermore, it has been been shown that gameplay type preferences such as Our recommendation models produce a list of game recom- preference in exploration, management or aggression predicts mendations to a player. This is known as Top-N recommenda- habit to play games of specific genres [14]. Because of these tion, where the methods are evaluated on the accuracy of the reasons, we make use of gameplay type preference survey data predicted list of recommendations, in contrast to how well the in this study. methods would predict missing rating or playtime values for Comparing recommender systems is difficult for several example. The objective of predicting an accurate list of game reasons. The performance may depend on the data set, the recommendations is probably the most relevant real world task prediction task, and the chosen metric [2]. In addition, many of these systems, since this task has been adopted by many methods are sensitive to the choice of hyperparameters and commercial companies. the optimization method, which means that authors of new Top-N recommendation differs from the traditional task of methods may have not always used baselines [17]. predicting missing values. The evaluation is based on ranking There can also be performance differences between different accuracy metrics such as precision or recall, not on error software implementations of otherwise identical methods [18]. metrics like the root mean squared error (RMSE). There is no However, the simple baseline methods tend to produce com- guarantee that algorithms optimized for the RMSE are also petitive results when the hyperparameters are carefully tuned optimal for ranking because the task is different. Furthermore, [17]. High accuracy if often assumed to correlate with an use- item popularity has been found to have a large effect on the ful recommender system, but subjective utility recommender error metrics [7]. The k Nearest Neighbour (kNN) and the Sin- system has also become an important research goal in itself gular Value Decomposition (SVD) based matrix factorization [2]. Optimizing accuracy can lead to recommending popular have become standard methods in predicting missing values, items at the cost of personalized results [7]. but they also work for the top-N recommendation task [3]. There are studies that have investigated the development of In the task of producing game recommendations, one should new recommender systems to games. The first study [19] used also take into account whether information about the player- probabilistic matrix factorization based collaborative filtering game interactions is implicit or explicit. Implicit signals about for the Xbox platform. The second study [20] presented a liking a game are for example owned or played games, recommender based on archetypes, where their formula (5) is whereas explicit data is the ratings and opinions provided a constrained case of the SVD. The study included compar- about the games. Implicit data is often complete, which means isons to kNN (cosine) trained on the latent SVD factors, the that every player and game pair has a value and the task is to standard kNN with somewhat small neighbourhood sizes, and rank the games by the probability of liking them. In explicit random or most popular recommendations. The third study data, there are typically many missing values because a player [21] investigated a new case-based disability rehabilitation has not rated every possible game, and the task is to predict recommender, which is a type of content recommender. The the values of the missing ratings. content was based on game descriptions combined with social For our models we technically use explicit data, because the network information and questionnaire answers of the users. player and game interactions are based on the favourite games The fourth study [22] presented a graph based approach with mentioned in a survey. Similar data sets to our explicit survey a biased random walk model inspired by ItemRank, which based data on favorite games and player preferences could is a type of a hybrid recommender. The fifth study [23] is be obtained without using surveys, i.e. implicitly by crawling a kNN (cosine) recommender based on content created by gaming platforms for the games that users own or play. In the latent semantic analysis of wikipedia article. The sixth either case, these data sets are understood as complete data for study [24] defined a content based recommender to find similar our task. This means that there are no missing values because games through the cosine similarity of feature vectors based all player and game pairs have a value that denotes whether on Gamespot game reviews. The bag-of words representation the player mentioned, owned or played the game. was outperformed by information theoretic co-clustering of In this study, also we utilize player traits and preferences to adjective-context word pairs to reduce the dimensionality. construct player features. In game research, player preferences The seventh study evaluated the quantitative and qualitative can be divided into four main categories: 1) player motiva- performance of game recommenders on the data set tions [8], [9], 2) preferred play styles [10], [11], 3) gaming [25]. They used BPR++, grouped BPR, kNN (cosine) on game mentalities [12], [13], and 4) gameplay type preferences [14], tag content, kNN combined with grouped BPR, a simulation [15]. Player motivations measure general reasons why players of Steam’s recommender, SVD, Factorization Machine (FM), play games, whereas models that investigate play styles are and popularity ranking. They found significant quantitative typically based on player behavior data rather than survey data. differences but no qualitative differences between the methods. Gaming mentalities refer to the psychometric data on players’ III. DATA SET typical and preferred gaming intensity type (e.g. casual or hardcore gaming). [16] Of the four approaches on player pref- A. Survey data set erences, gameplay type preference data is arguably the most The survey data (N=15,894) was collected by using a total promising for producing personalized game recommendations, of 10 standalone web-based surveys. Each of the surveys focused on different aspects of player preferences but all sur- game interactions are stored in a n m binary game like × veys included open-ended questions about respondents favorite matrix Ri,j = I(player i likes game j). The matrix does not games. Most of the samples were collected by using a UK- have missing entries, because the game like status is known based crowd sourcing platform, market companies providing for every player. For example, the player i may have answered large online panels, or by recruiting respondents from social the first and the third game as their favourites: media platforms such as Facebook or Reddit. The survey data includes representative samples from Finland, Denmark, USA, Ri,: = (1, 0, 1, 0, ..., 0) (1) Canada, and Japan. A typical survey took up to 20 minutes to The task is to predict the ranking of games that the player complete with a computer or a mobile phone, and was targeted has not answered as their favourite but might like. These to everyone between the ages of 18 and 60. Another type of predictions are the matrix ∗ R, where only the order survey data was collected by using a short online player type Ri,j of the values in each row matters∈ for ranking. For example, test. The short test was open to everyone regardless of their the predictions for player over all the games could be: prior experience in playing games, or the possible lack of it. i m Before analyzing survey data of the individual survey data ∗ (2) sets, the data was cleaned of participants who implied content Ri,: = (1.41, 0.10, 0.82, 0.04, ..., 0.21) non-responsivity by responding similarly to every question. The recommendation list for a player is obtained by taking The survey data analyzed in this study consists of 15894 ∗ the indices of games with k largest predicted values in Ri,:, observations, 6465 unique games, and 80916 favorite game where the games that the player has already liked are excluded. mentions. There are 1 to 37 favorite game mentions per player, In addition to game likes, we also have player and game and on average a player mentioned 5 games as his or her features that we can use for content based prediction. Denote favorites. Every game that is mentioned as a favorite game has the m r matrix of game features as Xtags, where the feature from 1 to 1108 individual favorite game mentions i.e. players. vectors× for the m games are stored as rows. In our case these The data set is very sparse, since only 0.08% of possible features are indicators of presence or absence of each of the (player, game)-pairs are favorite game mentions. The answers r game tags. Denote the n s matrix of player features as × tended to be more novel and personal than data sets based on Xquestions, where the feature vectors for the n players are stored played or owned games. as rows. In our case these features are the responses to the s Content for the games was obtained by crawling game tags questions on a Likert scale of -2,-1,0,1,2. from publicly available sources, such as the Steam platform To test model performance, we split the data set into training and the Internet Games DataBase (IGDB), the latter of which and validation sets as follows. First, we randomly sampled provided an API for this purpose. The presence or absence of 25% of games into ’test games’ an 25% of players into ’test every tag in each game was stored as a binary indicator. players’ that the models do not see during training. These Content for the players was obtained by asking the survey games and players test the performance of the model for new participants’ preferences using the Gameplay Activity Inven- games and players. The other games and players belong to tory (GAIN) which has been validated with cross-cultural data. the training set. In Setting 1, the models are tested on the task The GAIN consists of 52 questions (5-Likert scale, 1=very of recommending known games for a known player who has unpleasant, 5=very pleasant) about gameplay preferences, and mentioned 3 favourite games. We therefore further selected 20 the inventory measures five dimensions of gameplay appre- % of the training set players by randomly sampling amongst ciation: aggression, management, caretaking, exploration, and those who have liked more than 3 games. Randomly chosen coordination. These questions consists of items such as ’Ex- 3 games of each player are the ’seed’ which belongs to the ploding and Destroying’ (factor: aggression), ’Searching for training set, and the remaining games for these players belong and collecting rare treasures’ (factor: exploration), ’Flirting, to the validation set for Setting 1. The Setting 2 (new games) seducing and romantic dating’ (factor: caretaking), ’Matching validation set consists of the unseen ’test games’ for the known tiles together’ (factor: coordination), ’Generating resources players. In Setting 3 (new players), the validations set consists such as energy or money’ (factor: management) etc. (Vahlo et of unseen ’test players’ for the known games. In setting 4, the al. 2018). In addition to the GAIN model, we utilized also a 9- validation set is the game likes for the unseen ’test games’ item inventory on players’ game challenge preferences. These and ’test players’. We have illustrated the game like matrix R, items measure how pleasurable players consider e.g. ’logical the game features Xtags, the player features Xquestions, and the problem-solving’, ’strategic thinking’, ’puzzle solving’, ’racing different validation settings in Figure 1. against time’, etc. A typical survey included the full 52 plus 9 questions whereas the online player type test consisted of a B. Metrics partly randomized sample of these questions. We use Precision@20 and nDCG@m to measure accuracy in the validation sets. They are defined as follows. IV. MODELS 1) Precision@20: The Precision@k metric counts the num- A. Data set and validation ber of games the player has liked in the validation set, as a Assume we have n players and m games. Denote player fraction of all games in a recommendation list of length k. i 1, 2, ..., n and game j 1, 2, ..., m . The player and Assume the model predicts R∗ for player i, and denote r(i) ∈ { } ∈ { } i,: Game form solution for the distribution parameters, because the 1 2 .... m Data set definition maximum likelihood estimate of these is the sample mean 1 R Game likes vector µj = n Pi Ri,j and the sample covariance matrix Tag 1 1 2 .... r . XTags Game features Σi,j = n Ps(Rs,i µi)(Rs,j µj ) New − − + X Player features In prediction time, we assume that the values of liked +- Game Questions + games are known to be one but the values for other games - + Validation are missing. Denote the indices of liked games as and the indices of other games as so that = 1, 2, ...mI . We + - Training set J I ∪ J { } + Setting 1 validation use indexing XJ ,I to denote the submatrix with rows from Player - + Setting 2 validation and columns from , for example. The predictions for the + New J I Game, Setting 3 validation missing game likes are then given by the expectation of the - - + Player ∗ Setting 4 validation conditional distribution R J = E(Ri,J (Ri,j = 1)j∈I ). This 1 2 .... n + New Player i, | 1 2 .... s can be shown to equal the solution: Question Game Likes

∗ −1 Fig. 1. Illustration of the data set and validation settings. R = µJ +ΣJ ,I (ΣI,I) (1 µI ) (5) i,J − To predict without game popularity, we remove the mean as the vector of indices that sorts the predictions. The element vectors from the formula and substitute the sample covariance (i) rj is the j’th game in the recommendation list, i.e. the index matrix with the sample correlation matrix. This is equivalent ∗ mean centering and then normalizing the game like matrix of j’th largest predicted value in Ri,:. The metric for the data set is the average precision over the players: column wise before applying the model. 2) k Nearest Neighbour (kNN): The kNN is a simple 1 n 1 k I (i) recommendation method. Assume we have calculated the n Pi=1 k Pj=1 (player i likes game rj ) (3) similarity between any two games in a m m matrix Sij . Precision is a realistic measure of the real-world accuracy The rating prediction for game j considers the× k most similar of a recommendation list, where k is typically small and the games in the game like matrix for player i. Denote this set of position of a game in the list does not matter. most similar games Dk(i, j). The prediction for a player is the 2) nDCG@k: The normalized Discounted Cumulative Gain similarity weighted average to the player’s like status of k most (nDCG@k) metric measures the position of liked games in ∗ similar games: R = P ∈ k Sj,sRi,s/ P ∈ k Sj,s. the recommendation list. When a player liked a game, its i,j s D (i,j) s D (i,j) We evaluated the kNN on neigbhourhood sizes of k = position in the player’s recommendation list is rewarded by the 1, 2, 4, 8, ..., m, but we always obtained the best results with inverse of its logarithmic rank. These are called the discounted the maximum neighbourhood size of k = m. As others have cumulative gains. In the optimal ranking, we have ranked liked pointed out [7], the normalizing denominator is not necessary games on the top of the recommendation list, of the total validation for the ranking task and we in fact obtained better predictions ki = j : Ri,j =1 liked games, and the discounted cu- without it. We therefore predict simply by: |{ }| min(ki,k) mulative gain has the value IDCGi = Pj=1 1/log2(j+1). The nDCG@k is the discounted cumulative gain in the recom- ∗ Ri,j = Ps∈Dk(i,j) Sj,sRi,s (6) mendation list r(i) of length k, normalized by the maximum attainable value IDCGi. The metric for the data set is the We define the similarity function as either the cosine (cos) average over the players: or the Pearson correlation (phi), where µj = Pi Ri,j /n is the column mean of the game like matrix: I(player i likes game r(i)) 1 n 1 k j (4) n Pi=1 IDCG Pj=1 log (j+1) P R R i 2 Scos = s s,i s,j i,j P R2 P R2 (7) Because Precision@20 measures the top recommendations, √ s s,i√ s s,j we used used nDCG@m to measure the overall ranking. (R −µ )(R −µ ) Sphi = Ps s,i i s,j j i,j − 2 − 2 (8) C. Models √Ps(Rs,i µi) √Ps(Rs,j µj ) 1) Multivariate Normal Distribution (MVN): First we 3) Singular Value Decomposition (SVD): The SVD of di- present a simple new collaborative filtering model which has mension k is defined in terms of n k matrix P of latent player a competitive accuracy but simpler interpretation. This model factors as rows and m k matrix×G of latent game factors as allows us to explain the recommendations through explicit rows. It is a type of regularized× matrix factorization because correlation matrix between games. The model also allows us the predicted game like matrix is the product R∗ = P GT to completely remove the influence of game popularity to of the factor matrices. A prediction for player i and game k investigate its effect. Assume that every row of the player j is simply the product of the latent user vector Pi,: R k ∈ like matrix is a sample from a multivariate normal distri- and the latent game vector Gj,: R . These latent vectors are m ∈ bution: Ri,: (µ, Σ) with mean vector µ R and initially unknown. We evaluated different choices of dimension covariance matrix∼ N Σ Rm×m. This model has∈ a closed k = 4, 8, 16, 32, 64, 128 and regularization parameter λ. For ∈ SVD Tags Questions Tags X Questions

Game Game Game Interaction Game

G QA Tag Tag Latent feature + Response + +- +- + + P T - + - +

+ - + - + + Player Player Player Player - + - + + + - - + - - + + + Latent Game Likes Response Game Likes Question Game Likes Question Game Likes feature

Collaborative Filtering Content Based

Fig. 2. Illustration of the SVD and content models. Models can only generalize to settings (light green) for which they learn representations (light blue). the SVD implementation in the Suprise library, a python matrix T , which needs to be learned from data. A given player package for implicit recommendations, we found that a may for example answer that they like ’Candy Crush’ and of λ = 1, 2, 4, 8, 16, 32, 64, 128, 256, 512 produced a concave ’Tetris’, which implies that the player interacts positively with maximum between the values. We call the choice of λ = 0 game tags ’puzzle’ and ’tile-matching’. We predict the game as PureSVD, because it is possible to use a standard singular likes as a product of the game features Xtags and the player ∗ T value decomposition. The SVD was sensitive to regularization, interaction strengths T : R = TXtags. To find the parameters, but if the regularization choice was optimal we obtained almost we minimize the RMSE between observed game likes and identical results for dimensions k 32. The model parameters predicted game likes: are found by minimizing the RMSE≥ between observed game likes and predicted game likes: T = argmin R TXT 2 + λ T 2 (12) T k − tagskF k kF Every row Ti,: in fact independently minimizes the squared P, G = argmin R P GT 2 + λ P 2 + λ G 2 (9) error associated with that row, so the model can be fitted P,Gk − kF k kF k kF 2 separately for every row with standard ridge regression: Where the matrix norm X F denotes the Frobenius norm, or the root mean square ofk everyk element in the matrix X. T = (XT X + λI)−1XT RT (13) To find the parameters, one approach is to use Alternating i,: tags tags tags i,: Least Squares (ALS) optimization [3]. In this method, either 5) Questions: The second content model is based on player the latent game vectors G or the latent player vectors P features, which we call the ’Questions’ model because our are assumed to be fixed and the optimal solution for the player features are based on questionnaire about gaming other is found. Because in this case every row independently preferences. We assume that each game has an interaction minimizes the squared error associated with that row, we can strength with each player feature. These interaction strengths solve for each row with standard ridge regression either: are described by a game specific vector of length s. Collect these vectors as rows of the n s model parameter matrix T −1 T T × Pi,: = (G G + λI) G Ri,: (10) Q, which needs to be learned from data. A given game ’Candy Crush’ may for example be liked by players that

T −1 T have stated a preference for ’logical challenges’ and ’racing Gj,: = (P P + λI) P R:,j (11) against time’. We predict the game likes as a product of the The optimization starts by initializing P and G with random player features Xquestions and the game interaction strengths ∗ T values. We iterate between fixing G to find optimal values for Q: R = XquestionsQ . To find the parameters, we minimize P , and then fix the resulting P to find optimal values for G. the RMSE between observed game likes and predicted game This is repeated until convergence. likes: 4) Tags: The first content model is based on game features, which we call the ’Tags’ model because our game features are T 2 2 Q = argmin R XquestionsQ + λ Q (14) based on game tags. We assume that each player has some Qk − kF k kF interaction strength with each game feature. These interaction Again, every row Qi,: independently minimizes the squared strengths are described by a player specific vector of length error associated with that row, so the model can be fitted r. Collect these vectors as rows of the n r model parameter separately for every row of Q with standard ridge regression: × TABLE I RANKING ACCURACY BY NDCG@M (%) T −1 T Qj,: = (XquestionsXquestions + λI) XquestionsR:,j (15) 6) Tags X Questions: The final content model is based on Model Setting 1 Setting 2 Setting 3 Setting 4 Random 13.9 13.2 14.6 13.4 both game and player features, which we call the ’Tags x MVN 31.9 13.2 14.6 13.4 Questions’ model because it is based on interactions between kNN (cos) 30.0 13.2 14.6 13.4 the game tags and the questionnaire answers. To model the kNN (phi) 27.4 13.2 14.6 13.4 game likes between every player and every game , we PureSVD 28.4 13.2 14.6 13.4 i j SVD 30.9 13.2 14.6 13.4 assume that each (player, game)-pair is described by a feature Tags 23.4 22.3 14.6 13.4 vector. The feature vector for the pair is the tensor product Questions 26.9 13.2 32.2 13.4 of the player’s features and the game’s features. Given the Tags X Questions 23.2 19.9 24.3 20.1 n s player feature matrix Xquestions and the m r game feature× matrix X , the pair feature matrix can be represented× tags TABLE II as a nm sr feature matrix Xquestions Xtags. The logic RECOMMENDATION LIST ACCURACY BY PRECISION@20(%) behind this× representation implies that the⊗ game likes are simply the sum of interaction strengths between player and Model Setting 1 Setting 2 Setting 3 Setting 4 game features, where the r s interaction strength matrix as Random 0.1 0.1 0.1 0.1 A needs to be learned from× data. A given player may for MVN 5.0 0.1 0.1 0.1 kNN (cos) 3.9 0.1 0.1 0.1 example answer that they like ’logical challenges’ and ’racing kNN (phi) 3.2 0.1 0.1 0.1 against time’, and the data implies that these answers interact PureSVD 3.8 0.1 0.1 0.1 positively with game tags ’puzzle’ and ’tile-matching’. Denote SVD 4.3 0.1 0.1 0.1 the columnwise stacking of the interaction strength matrix as Tags 2.3 1.7 0.1 0.1 Questions 3.4 0.1 4.3 0.1 vec(A), and the columnwise stacking of the n m game Tags X Questions 2.4 1.1 2.7 1.2 like matrix as vec(R∗). Further denote the feature× matrix as Xinteractions = Xquestions Xtags. The response is predicted as the sum of the player feature⊗ and game feature interactions: ∗ with the MVN model delivering the best results. In Setting 2 vec(R ) = Xinteractionsvec(A). To find the model parameters, we minimize the RMSE between observed game likes and we generalize to the new games, and only the Tags and Tags predicted game likes: X Questions models are able to generalize. The Tags model has a greater degree of freedom and it performs better in this setting. In Setting 3 we generalize to new players, and only the ∗ 2 2 A = argmin vec(R ) Xinteractionsvec(A) + λ A Questions and Tags x Questions models are able to generalize. Ak − kF k kF (16) Again, the Questions model has more freedom to fit the data There is a mathematical optimization shortcut known as and performs better in this setting. For the last setting, only the vec-trick [6], which makes the minimization problem the Tags X Questions model which uses both features is able computationally tractable in the special case that the feature to make useful predictions. matrix is a Kronecker product. We use the python software The mathematics underlying the generalization ability is package RLScore1 which implements this short cut to obtain illustrated in Figure 2. Predictions can only be made for an exact closed form solution to the ridge regression problem: player and game pairs where both games and players have representations which have been learned from the training set. However, if the representations can be learned for the setting vec(A) = (XT X + λI)−1XT vec(R∗) interactions interactions interactions it is useful to use more flexible models. There is therefore (17) an important trade-off between using the provided features V. RESULTS to generalize better or learning latent features to have better A. Model accuracy performance inside the training set games and players. We report the accuracy of different models using nDCG@m B. Model interpretation in Table I and Precision@20 in Table II. The metrics have values between 0-100%, yet it is challenging to interpret the Because the models are based on simple linear models, we quality of recommendations from their absolute values. We can interpret the model coefficients. The coefficients provide can use accuracy metrics to compare and improve the models, an explanation of why certain players are predicted to like but the results need to be verified qualitatively in practise. certain games. This information can be useful for game devel- Comparing models by accuracy tells a clear-cut story. The opers or publishers that seek to understand the gaming market, two metrics have the same interpretation. In Setting 1, collab- and they can be used to tune the model towards qualitatively orative filtering methods outperform content based approaches better predictions. Recalling that the MVN model predictions are based on the 1http://staff.cs.utu.fi/ aatapa/software/RLScore game mean vector µ and covariance matrix Σ, we interpret Tag Question #1 Question #2 Question #3 Question #4 Game Players Liked Question #1 Question #2 Question #3 Question #4 Challenges of Challenges of Challenges of Exploring the Customizing a Searching and crosswords and Matching tiles or World of 108, 589, 631, Developing skills Puzzle logical problem- creative problem- gameworld and its character's collecting rare other word shapes together Warcraft 674, 739, … and abilities solving solving secrets appearance treasures puzzles The Elder 2102, 6064, Exploring the Developing skills Killing and Befriending with in Challenges of Challenges of Scrolls V: 6779, 6853, gameworld and its Real-time Challenges of Doing tricks in and abilities Murdering game characters strategy and acting in a Skyrim 6945, … secrets Strategy tactics extreme sports strategic thinking constant hurry Searching and Challenges of 141, 212, 252, Sniping to Overwatch collecting rare Team sports acting in a Dressing up, 337, 742, … eliminate Gardening and Decorating applying make Hugging, kissing treasures constant hurry Dating Sim taking care of rooms and up and choosing and making out Challenges of farms houses 47, 91, 141, Challenges of crosswords and Challenges of fast Matching tiles or looks TETRIS Jumping on 153, 166, … memorizing other word reaction shapes together Investigating the Searching and platforms and Challenges of puzzles Lara Croft story and its collecting rare bouncing on tactics mysteries treasures walls Interaction Game

A Q Game Game #1 Game #2 Game #3 Game #4 Player Games Liked Tag #1 Tag #2 Tag #3 Tag #4 Warcraft III: Child of Light, Dungeon First- Question Response World of Heroes of Turn-based Hand- Diablo II Reign of Diablo III 15887 Master, Shin Megami person Friendship Warcraft the Storm Strategy drawn Chaos Tensei: Persona 3 shooter Correlation Project CARS, Gran Motorsport Driving Customizat The Elder The Elder The The Elder 15888 Driving T Ʃ Turismo 5, Forza Horizon 2 s Simulator ion Scrolls V: Scrolls IV: Witcher 3: Fallout 4 Scrolls III: Shovel Knight, Super Mario Character 4 Player Skyrim Oblivion Wild Hunt Morrowind 15889 3D World, The Legend of developme Fishing Real-Time Local League of Heroes of Sudden Zelda: Ocarina of Time nt Player Overwatch Destiny 2 Legends the Storm Attack Quick- Heavy Rain, Steins Gate, 15890 Cinematic Time Story Rich Drama Space Microsoft The New Life is Strange TETRIS Dr. Mario Events Invaders Mahjong Tetris Tag Game Likes Response

Fig. 3. Example of model interpretation: correlated games (MVN), tag responses (Tags), question responses (Questions) and interactions (Tags x Questions) how collaborative filtering arrives at the predictions. The mean the game, based on their questionnaire answers. At prediction vector is simply the popularity of each game, calculated as the time, the players with these answers are predicted to be play fraction of players that play each game. This is the starting the game. This model can therefore generalize to new players point of the predictions, which are then adjusted based on by predicting players that have similar preferences. the strength or positive or negative correlations to the games The Tags X Questions model infers the interaction strengths the player has played. For example, in Figure 3, we provide A between all player questionnaire answers and game tags, 4 example games and their 4 most correlated games. These based on every player and game pair. Fitting the model correlations are very believable, and these are the games whose produces a matrix of r s interaction strengths, where each play probability increases the most when player mentions the answer and tag pair have× their own value. Figure 3 illustrates game in question. The exact calculation of the probability is 4 example tags and 4 answers with the strongest implied based on the assumption of a normal distribution, which while interactions. These interactions are very logical and similar incorrect, seems to work well in practise. to what one might manually define: Puzzle tag for example The Tags model is based on inferring a player specific interacts with the answer of liking ’Challenges of crosswords response vectors T to the game tags, based on the games the and other word puzzles’. With 61 possible answers and 379 player has liked. Fitting the model therefore produces a vector game tags, manually defining and tuning 23 119 parameters is of r interaction strengths to each of the game tags for each however infeasible. At prediction time, the players that have of the n players. In Figure 3 we illustrate 4 example players preferences which best match the game tags are predicted to with 3 liked games each and 4 tags with strongest implied be play the game. This model can therefore generalize to both interactions. It seems that the model is able to correctly learn new players and new games simultaneously. the content of the liked games and describe the player in terms of their tag interactions. At prediction time, the games C. Model Recommendations with these tags are predicted to be played the most by the player. The model can therefore generalize to new games by Finally, we illustrate the model recommendations in Ta- predicting games that have similar tag content. ble III. The MVN model produces recommendations which The Questions model is based on inferring a game specific are close to the games the player has played, and the Tags response vectors Q to the player preferences, based on the model provides close matches in terms of game genres. players that have liked the game. Fitting the model therefore The Question and Tags x Questions models rely on rather produces a vector of s interaction strengths to each of the ambiguous question answers, and as a result provide quite questionnaire answers for each of the m games. Figure 3 generic recommendations. However, their answers still seem illustrates 4 example games and 4 answers with the strongest better than random. The Tags model seems better than the implied interactions. The model is able to correctly learn the Questions model, even though accuracy metrics suggest the content of these games from the types of players that play opposite conclusion. There is a significant popularity bias visible in the Questions [4] Park, Yungki, and Edward M. Marcotte. ”Flaws in evaluation schemes and Tags x Questions models, and to some extent in the MVN for pair-input computational predictions.” Nature methods 9.12 (2012): 1134. model. The effect of popularity can be removed from the MVN [5] Pahikkala, Tapio, et al. ”Toward more realistic drugtarget interaction model as described earlier, and there is a similar trick that can predictions.” Briefings in bioinformatics 16.2 (2015): 325-337. be used with the other models. First, normalize the game like [6] Airola, Antti, and Tapio Pahikkala. ”Fast Kronecker product kernel methods via generalized vec trick.” IEEE transactions on neural net- matrix R column wise: substract the mean vector µ and divide works and learning systems 29.8 (2017): 3374-3387. by the standard deviation σ = pµ(1 µ) to produce a matrix [7] Cremonesi, Paolo, Yehuda Koren, and Roberto Turrin. ”Performance of of standardized deviations from baseline− popularity. Second, recommender algorithms on top-n recommendation tasks.” Proceedings of the fourth ACM conference on Recommender systems. ACM, 2010. more popular games tend to have more tags provided so [8] Yee, Nick. ”Motivations for play in online games.” CyberPsychology & produce a more egalitarian ’game profile’ vector by projecting behavior 9.6 (2006): 772-775. [9] Yee, Nick, Nicolas Ducheneaut, and Les Nelson. ”Online gaming moti- XTags with PCA into a lower dimensional space and normalize vations scale: development and validation.” Proceedings of the SIGCHI this vector. We found 16 dimensions worked qualitatively well. conference on human factors in computing systems. 2012. We skip reporting these results because they produced worse [10] Bartle, Richard. ”Hearts, clubs, diamonds, spades: Players who suit accuracy on the metrics, even though they virtually eliminated MUDs.” Journal of MUD research 1.1 (1996): 19. [11] Cowley, Benjamin, and Darryl Charles. ”Behavlets: a method for prac- the effect of recommending popular but unrelated games. tical player modelling using psychology-based player traits and domain specific features.” User Modeling and User-Adapted Interaction 26.2-3 VI. CONCLUSION (2016): 257-306. [12] Ip, Barry, and Gabriel Jacobs. ”Segmentation of the games market using Research in game recommendation has focused on the multivariate analysis.” Journal of Targeting, Measurement and Analysis prediction of missing game likes in data sets of historical for Marketing 13.3 (2005): 275-287. player and game interactions. However, many practical tasks [13] Kallio, Kirsi Pauliina, Frans Myr, and Kirsikka Kaipainen. ”At least nine ways to play: Approaching gamer mentalities.” Games and Culture require predictions for completely new games and players. 6.4 (2011): 327-353. Collaborative filtering has been found to be a useful model [14] Vahlo, Jukka, Jouni Smed, and Aki Koponen. ”Validating gameplay in the traditional setting, but for games or players with few activity inventory (GAIN) for modeling player profiles.” User Modeling and User-Adapted Interaction 28.4-5 (2018): 425-453. or no interactions different models are required. We presented [15] Fortes Tondello, Gustavo, et al. ”Towards a trait model of video game content based Tags, Questions, and Tags X Questions models preferences.” International Journal of HumanComputer Interaction 34.8 that generalize into new games, new players, or both simul- (2018): 732-748. [16] Vahlo, Jukka, and Juho Hamari. ”Five-factor inventory of intrinsic taneously. These methods are inspired by the Singular Value motivations to gameplay (IMG).” Proceedings of the 52nd Hawaii Decomposition (SVD), a standard approach in collaborative International Conference on System Sciences, Hawaii, USA, 2019.. filtering, where one or both of the feature vectors are assumed HICSS, 2019. [17] Rendle, Steffen, Li Zhang, and Yehuda Koren. ”On the Difficulty to be given. The optimization corresponds to a linear model of Evaluating Baselines: A Study on Recommender Systems.” arXiv with computational shortcuts, which makes them fast and preprint arXiv:1905.01395 (2019). easily interpretable. [18] Said, Alan, and Alejandro Bellogn. ”Comparative recommender system evaluation: benchmarking recommendation frameworks.” Proceedings of We compared the following models: Random baseline, Mul- the 8th ACM Conference on Recommender systems. ACM, 2014. tivariate Normal Distribution (MVN), k Nearest Neighbour [19] Koenigstein, Noam, et al. ”The Xbox recommender system.” Proceed- (kNN) with cosine or Pearson similarity, PureSVD, SVD, ings of the sixth ACM conference on Recommender systems. ACM, 2012. Tags, Questions, Tags x Questions. We evaluated the per- [20] Sifa, Rafet, Christian Bauckhage, and Anders Drachen. ”Archetypal formance with the nDCG and Precision@20 metrics within Game Recommender Systems.” LWA. 2014. known games and players (Setting 1), new games (Setting [21] Catal, Laura, Vicente Julin, and Jos-Antonio Gil-Gmez. ”A cbr-based game recommender for rehabilitation videogames in social networks.” 2), new players (Setting 3) and both new games and new International Conference on Intelligent Data Engineering and Automated players (Setting 4). We found that each content based model Learning. Springer, Cham, 2014. performed the best in the setting for which it was designed, [22] Chow, Anthony, Min-Hui Nicole Foo, and Giuseppe Manai. ”Hy- bridRank: A Hybrid Content-Based Approach To Mobile Game Rec- and restricting the models to use the provided features instead ommendations.” CBRecSys@ RecSys. 2014. of learning latent features is useful in terms of generalization [23] Ryan, James Owen, et al. ”People Tend to Like Related Games.” FDG. ability but has a trade off in terms of accuracy. Each model can 2015. [24] Meidl, Michael, Steven L. Lytinen, and Kevin Raison. ”Using game re- therefore be useful depending on the setting. Finally, we note views to recommend games.” Tenth Artificial Intelligence and Interactive that accuracy does not tell the full story because the qualitative Digital Entertainment Conference. 2014. results do not seem to perfectly correlate with accuracy. [25] Jannach, Dietmar, and Lukas Lerche. ”Offline performance vs. subjective quality experience: a case study in video game recommendation.” Proceedings of the Symposium on Applied Computing. ACM, 2017. REFERENCES [1] Ricci, Francesco, Lior Rokach, and Bracha Shapira. ”Introduction to recommender systems handbook.” Recommender systems handbook. Springer, Boston, MA, 2011. 1-35. [2] Herlocker, Jonathan L., et al. ”Evaluating collaborative filtering recom- mender systems.” ACM Transactions on Information Systems (TOIS) 22.1 (2004): 5-53. [3] Hu, Yifan, Yehuda Koren, and Chris Volinsky. ”Collaborative filtering for implicit feedback datasets.” 2008 Eighth IEEE International Confer- ence on Data Mining. Ieee, 2008. TABLE III EXAMPLE PLAYERS AND TOP 5 GAME RECOMMENDATIONS FROM DIFFERENT MODELS

Player Questions Liked Games Liked MVN Tags Questions Tags X Questions 93519 Engaging in battle, Child of Light, , Costume Quest, World of Warcraft, Fallout 4, Mass Ef- Weapons and skills Dungeon Master, Xenogears, Shin Ori and the Blind The Witcher 3: fect 2, Fallout 3, selection for characters, Shin Megami Megami Tensei: Forest, Abyss Wild Hunt, Diablo, Fallout: New Vegas, Searching and collecting Tensei: Persona 3 Persona 4, Odyssey, Fortune The Elder Scrolls V: Warframe rare treasures Cross, Bravely Summoners, Skyrim, Overwatch Default Bahamut Lagoon 93520 Piloting and steering ve- Project CARS, Gran Grand Theft Auto DiRT 4, Gran Tur- Call of Duty, Grand StarCraft: Brood hicles, Racing in a high Turismo 5, Forza V, Pokmon GO, ismo 2, Gran Tur- Theft Auto, Clash War, StarCraft, speed, Challenges of tac- Horizon 2 Forza Motorsport ismo (PSP), Forza of Clans, Angry StarCraft II: Legacy tics 6, Gran Turismo 6, Motorsport 4, Forza Birds, Battle-field of the Void, Doom Hill Climb Racing Motorsport 2 II RPG, Call of 2 Duty 93521 Running in a fast speed Shovel Knight, The Legend of The Legend of TETRIS, League StarCraft, Tomb while avoiding obstacles, Super Mario 3D Zelda: Breath of Zelda: Twilight of Legends, Raider, Dota 2, Developing skills and World, The Legend the Wild, Super Princess, Rogue Call of Duty, StarCraft: Brood abilities, Challenges of of Zelda: Ocarina Mario Galaxy, Legacy, Assassin’s Crash Bandicoot, War, Counter- fast reaction of Time Super Mario 64, Creed IV: Black Minecraft Strike: Global The Legend of Flag, The Legend Offensive Zelda: The Wind of Zelda: A Link Waker, The Legend to the Past, Power of Zelda: A Link to Stone 2 the Past 93522 Hugging, kissing and Heavy Rain, The Last of Us, Beyond: Two Sudoku, The Sims, 2, making out, Investigating Steins;Gate, Life Is Pokmon GO, The Souls, The Wolf Angry Birds, Pok- Fallout 3, The the story and its mysteries, Strange Witcher 3: Wild Among Us, Zero mon GO, Counter- Elder Scrolls IV: Challenges of logical Hunt, World of Escape: Zero Time Strike: Global Of- Oblivion, The Elder problem-solving Warcraft, TETRIS Dilemma, Persona fensive Scrolls V: Skyrim, 4 Golden, Alan Fallout: New Vegas Wake 93523 Decorating rooms Cities: Skylines, The Sims 3, The Train Valley, Prison Sudoku, The Warframe, The and houses, Hugging, Overcooked, The Sims, Civilization Architect, The Sims, TETRIS, Elder Scrolls IV: kissing and making out, Sims 2 V, The Sims 4, The Sims, RollerCoaster Gardenscapes, Oblivion, Dragon’s Challenges of logical Elder Scrolls V: Tycoon 3: Platinum, World of Warcraft Dogma: Dark problem-solving Skyrim Tropico 4 Arisen, The Sims, Stardew Valley