Rating of Players in a Battle Royale Game
Sander Johannes Cornelis van Riel
ANR: 640828
SNR: 1247320

THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE IN DATA SCIENCE AND SOCIETY, AT THE SCHOOL OF HUMANITIES AND DIGITAL SCIENCES OF TILBURG UNIVERSITY

Supervisor: prof. dr. ir. P.H.M. Spronck
Second Reader: dr. Ç. Güven

Tilburg University
School of Humanities and Digital Sciences
Department of Cognitive Science & Artificial Intelligence
Tilburg, The Netherlands
May 2019

Abstract

In this research, the goal is to investigate whether the rating of a player can be predicted for the Battle Royale game PlayerUnknown’s Battlegrounds (PUBG). The dataset consisted of player statistics for approximately 85,000 top-ranked PUBG players, with 50 features for each game mode. First, an exploratory analysis was done to investigate which features are most important for predicting the rating of a PUBG player. This resulted in the same features for the solo, duos and squads game modes. Second, this research showed that predicting the rating of PUBG players was possible with multiple classification models: the results for all three game modes were better than their baseline accuracy scores.

Table of Contents

1. Introduction
1.1 Context
1.2 Problem Statement and Research Questions
1.3 Outline
2. Theoretical Framework
2.1 Prior Literature
2.1.1 Outcome Prediction
2.1.2 Rating of Players
2.2 Classification Models
2.2.1 Decision Tree and Random Forest
2.2.2 Logistic Regression
2.2.3 k-Nearest Neighbor
2.3 Battle Royale
2.4 PlayerUnknown’s Battlegrounds (PUBG)
3. Experimental Setup
3.1 Dataset
3.2 Pre-processing
3.3 Setup for RQ1
3.4 Setup for RQ2
4. Results
4.1 Exploratory Analysis
4.2 Classification Models
4.2.1 Decision Tree and Random Forest
4.2.2 Logistic Regression
4.2.3 k-Nearest Neighbor
4.2.4 Performance on Test Set
5. Discussion and Conclusion
5.1 Discussion
5.2 Limitations and Future Research
5.3 Conclusion
References
Appendix A
Appendix B
Appendix C
Appendix D

1. Introduction

In this chapter the goal of this research is introduced. Section 1.1 gives an overview of the context for this research. In section 1.2 the problem statement and the research questions are formulated. Finally, section 1.3 contains an outline of the rest of this research.

1.1 Context

Over the past couple of years, two trends in the gaming industry have been notable. Firstly, a new gaming genre has become very popular, namely Battle Royale. In this online multiplayer genre 100 players are dropped on an island with the aim of being the last player or team standing.
PlayerUnknown’s Battlegrounds (PUBG) was one of the first games to introduce this gaming genre, and it quickly became one of the best-selling games of all time. Since its release the Battle Royale genre has grown rapidly, and it is still growing. In 2018 the Battle Royale game Fortnite broke the records for the number of players and for the number of spectators watching gamers play the game on the streaming website Twitch. Other shooter game franchises, like Call of Duty and Battlefield, have also implemented a Battle Royale mode in their new games.

The second trend in the gaming industry is the rise of electronic sports (eSports). In 2018 the prize money pools and number of viewers for eSports surpassed those of some of the greatest sports events, like the Tour de France and Wimbledon (The Washington Post, 2018). To further illustrate the rise of competitive gaming, eSports could be added to the Olympic program as an official medal sport in 2024 (The Guardian, 2017). For the relatively new Battle Royale gaming genre, both PUBG and Fortnite are in the top 5 of the 2018 eSports tournament prize money pools (Statista, 2019).

Despite the economic significance of the aforementioned trends in the gaming industry, little to no research has been done on the Battle Royale game mode. This research explores which features influence the rating of a Battle Royale player in PUBG and tries to predict that rating based on the player's pre-match statistics. From a practical point of view, this research could benefit the eSports gaming industry, especially players trying to compete in eSports. They could focus on the features that most influence the rating of a player and thereby reach and stay in the highest rating of the game. From a scientific point of view, this research may add knowledge to past rating and outcome prediction research in the field of game analytics.
Looking at prior research on rating and outcome prediction, no scientific research has been done for the Battle Royale genre. Therefore, for outcome prediction this research reviews studies that have been done on other gaming genres, such as Real-Time Strategy (RTS) games (Erickson & Buro, 2014; Ravari, Bakkes & Spronck, 2016) and Multiplayer Online Battle Arena (MOBA) games (Yang, Harrison & Roberts, 2014; Yang, Qin, & Lei, 2016). These studies mostly used only during-match or post-match data. The study by Yang, Qin, and Lei (2016) showed that including pre-match data improves prediction models. For outcome prediction in First-Person Shooters (FPS) there has been little to no work, except for that of Ravari, Spronck, Sifa, and Drachen (2017). There are not many studies that have investigated rating systems for players of video games. Arpad Elo introduced