Machine Learning Approaches to Choose Heroes in Dota 2

______________________________________________________PROCEEDING OF THE 24TH CONFERENCE OF FRUCT ASSOCIATION Machine Learning Approaches to Choose Heroes in Dota 2 Iuliia Porokhnenko, Petr Polezhaev, Alexander Shukhman Orenburg State University Orenburg, Russian Federation [email protected], [email protected], [email protected] Abstract–The winning in the multiplayer online game Dota 2 objects. The goal of the game is the destruction of the main for teams is a sum of many factors. One of the most significant of building belonged to the enemy’s team by player controlled them is the right choice of heroes for the team. It is possible to heroes and creeps controlled by artificial intelligence. predict a match result based on the chosen heroes for both teams. This paper considers different approaches to predicting results of Section 2 describes the existing approaches to solving the a match using machine learning methods to solve the problem of increasing the efficiency of a team in the classification problem. The experimental comparison of multiplayer online game DOTA 2. Formulation of machine predictive classification models was done, including the learning problem for DOTA 2 and preparation of the training optimization of their hyperparameters. It showed that the best dataset are considered in Section 3. Section 4 presents the classification models are linear regression, linear support vector classifying models chosen from well-known frameworks, the machine, as well as neural network with Softplus and Sigmoid classifiers are used to solve the formulated problem. activation functions. The fastest of them is the linear regression Comparative study results of the classifying models trained on model, so it is best suited for practical implementation. the prepared dataset, the optimization of their I. INTRODUCTION hyperparameters, as well as the results of neural network study (on CPU, GPU and TPU) and its hyperparameter optimization, Computer games have become an important social, cultural are described in Section 5. and economic factor. Multiplayer online games currently attract a large number of players and have a wide audience of II. RELATED WORK observers. In addition, there is a growing interest in games There are many different approaches to solving the related to eSports. problem of increasing the efficiency of a team in the ESports includes team or individual competitions based on multiplayer online game DOTA 2. computer games. All eSports disciplines are divided into The success of the team is influenced by many different several main classes, which differ in the properties of spaces, factors. They can be divided conditionally into two groups: the models, game problem and developed skills of cyber personal contribution of each player and the principle of team sportsmen. Multiplayer Online Battle Arena (MOBA) games building. are among the most popular in eSports. The first group includes such characteristics as the player’s Prediction about the results of matches in sports games has experience, the sequence of his actions in the game, made always been a popular topic in the machine learning area. decisions, etc. Paper [1] considers player models and methods, Sports analytics is often used to make decisions in professional which can get results closest to the behavior of a real player. kinds of sport. Therefore, it can be assumed that such systems Player models take into account the choice of hero skills will also be useful for users of multiplayer online games. depending on the game situation and the sequence of such MOBA is a genre of computer games that combines the choices to increase their skills. The choice of items for real-time strategy and computer role-playing. One of the most purchasing by hero was also considered as part of studying the popular games in the MOBA genre is Dota 2 developed by influence of the hero’s personal contribution to the success of Valve Corporation. Nearly 500 thousand players play it every the whole team [2]. day. So, it was selected as a subject of the current research. More attention is paid to the principle of team building. In In Dota2, two teams of players fight on a certain type of [3] Nataliia Pobiedina and others prove that a correct choice of maps in real time. The map is a combination of three lines heroes by player is the factor, which has the most significant (top, middle and bottom) and the area between them (jungles). effect on the success of a team in the game. Each player controls one hero, which he chooses from a list of There are several approaches to choosing the heroes for the heroes. Heroes differ from each other in various features and match. Existing services, such as DotaBuff [4] and DotaPicker characteristics. During the match, heroes can become stronger, [5], use analysis of statistic information obtained from played develop new skills, enhance their characteristics and acquire matches. These statistics are updated sometime after each ISSN 2305-7254 ______________________________________________________PROCEEDING OF THE 24TH CONFERENCE OF FRUCT ASSOCIATION release of the new update in the game to actualize available (pick is the choice of a hero), the first half of which is for the information. “radiant” team, the second half for the “dire” team: ݋݂ݐ̶݄݁radiant" teamݎMachine learning is another approach, which is used for ͳǡ ݂݅݌݈ܽݕ݁ the selection of heroes in the game. ݔ௜ ൌ൝ had played with the hero ݅݀ ൌ ݅ǡ ǡ݁ݏݓ݅ݎͲǡ݋ݐ݄݁ team "݁ݎ݋݂ݐ̶݄݁݀݅ݎIn [6] Filip Beskyd considers a decision tree and a neural ͳǡ ݂݅݌݈ܽݕ݁ network to predict the outcome of the game based on chosen ݔ ൌ൝ had played with the hero ݅݀ ൌ ݅ǡ heroes. ଵ଴଼ା௜ Ǥ݁ݏݓ݅ݎͲǡ݋ݐ݄݁ Zhengxing Chen and others [7] compared Monte-Carlo method, logistic regression and gradient boosting to predict the There are 108 different heroes, and each of them can be outcome of a match. selected once by both teams. Vector ݕ contains information about the match result: It worth nothing that all authors, who use the machine learning approach in their research, gain accuracy ranged from to 70%. Moreover, the use of the same machine learning ͳǡ ̶̶݂݅ݐ݄݁ܽ݉ܽ݀ݓ݋݊ǡ 50% ݕൌ൜ Ǥ݁ݏݓ݅ݎmodels by different researchers gives different results. This Ͳǡ݋ݒ݁ happens because these models are applied to different datasets or they are configured by different hyperparameters or For further research, it was necessary to compare various structures (for neural networks). classification methods and select the optimal combination of their hyperparameters. At present, there are no studies, which can help to choose the most efficient set of hyperparameters that affect the IV.CLASSIFICATION MODELS FOR SOLVING THE accuracy of predicting game outcome. PROBLEM Gradient Boosting Classification (GBC) is a machine III. FORMULATION OF MACHINE LEARNING PROBLEM AND DATASET PREPARATION learning method for solving regression and classification problems that creates a prediction model as a linear The problem of our research is to build an effective combination of basic classifiers, which minimize the algorithm for predicting match results based on information differentiated loss function. about the DOTA 2 heroes chosen by players. This problem is a binary classification problem with two output classes meaning Random Forest Classification (RFC) is a machine learning victory for the “radiant” or “dire” team. algorithm that uses a set of decision trees, which reduces retraining problems and improves accuracy in comparison The OpenDota API allows developers to get data about with a single tree. The result is obtained by aggregating the Dota 2 matches. The information about 56,690 matches played responses of multiple decision trees. in 2018 was obtained through the OpenDota API. This information corresponds to a number of requirements. XGBoost Classification (XGBC) uses pre-sorted algorithm and Histogram-based algorithm for computing the best split First, game modes must be such that each hero has a non- [8]. zero probability of appearing in the game. Such modes are “all pick” (each player chooses one from all the heroes), “single Logistic regression (LR) is a method for constructing a draft” (each player chooses one from three random heroes), linear classifier to estimate a posteriori probabilities of “all random” (each player gets a random hero), “random draft” belonging objects to classes. It is a statistical model used to (players take turns choosing heroes from a pool of 50 random predict the probability of an event occurring by fitting data to a heroes), “captain’s draft” (each team is assigned a captain who logistic curve. LR is a very powerful algorithm, especially for chooses heroes from the list of random heroes), “captain’s high-dimensional problems. It is actively used in Kaggle mode” (each team is assigned a captain who chooses heroes competitions along with tree boosting approaches. for his team and prohibits ones for opponents). Linear support vector classification (LSVC) is an Secondly, the skill level of the players in the match must algorithm for solving classification problems using only the be high. Such a requirement is necessary, so only those linear core. Compared to the SVM algorithm, Linear SVC matches are considered to learn our model, in which heroes are learns faster and scales better. selected based on any strategic considerations. CatBoost Classification (CBC) is a machine learning Thirdly, only those matches are taken into account in algorithm using gradient boosting on decision trees, it which players do not leave the match until the end. This is available in the CatBoost library from Yandex. It is a follower necessary so that the math result depends only on skills and of the MatrixNet algorithm, which is used for ranking and choice of the team's heroes, and doesn’t depend on the balance forecasting. Also, it is the base for recommender technologies. and the number of players. The implementations of classification methods for training Based on downloaded data, a training set of labeled data and testing were used from the Scikit-learn, XGBoost and was created consisting of pairs of vectors (x, y).

Load more