International Journal of Applied Exercise Physiology 2322-3537 www.ijaep.com Vol.8 No.4

An Alternative Ranking System for Bundesliga Standings by Using Google’s PageRank

Celal Gençoğlu1 and Hikmet Gümüş1

1 Faculty of Sport Sciences, Dokuz Eylül University, Izmir, Turkey.

ARTICLE INFORMATION
Original Research Paper
Doi: 10.26655/IJAEP.2019.12.12
Received September 2019
Accepted December 2019
Keywords: PageRank, Sport metrics, Handball, Ranking

ABSTRACT
Network-based algorithms have been applied to sports metrics in several disciplines. The PageRank (PR) algorithm has been used to assess both team sports and individual players' performance, but no study has applied it to handball team rankings. The purpose of this study is to investigate the differences between the traditional ranking system and the PR algorithm in handball. A total of 306 game results, goals scored, goals conceded, and offense- and defense-related match statistics from the 2017-18 Bundesliga season were collected from the official DKB Bundesliga web page. We estimated four different PR values, named PageRankstraight, PageRankoffence, PageRankdefense, and PageRankdifference. Comparing the official points with the normalized PR scores showed that the team standings changed under the PR algorithm. A positive correlation was found between the official system ranking and PageRankstraight, PageRankoffence, PageRankdefense, and PageRankdifference (r = 0.932, p < 0.001; r = 0.915, p < 0.001; r = 0.711, p = 0.001; and r = 0.926, p < 0.001, respectively). In this study, the PR algorithm presented a novel approach to ranking handball teams. Further research is needed to investigate alternative ranking methods that allow objective evaluation of a team’s performance, not just game results. Finally, we do not propose replacing the traditional system with the PR algorithm for league standings, but it could serve as a method for assessing team market values and the distribution of income.

1. Introduction
In all sports disciplines, rankings are derived from match results and are critically important: they determine the champion, qualification for European cup competitions, the amount of TV income, and the ability to attract new fans. Depending on whether the sport uses a league, tournament, or knockout format, teams or players collect points corresponding to match results. Accordingly, under International Handball Federation rules, team handball rankings are based on the total points of the teams, with three points for a win, one point for each side in a draw, and zero points for a loss [1]. Thus, at the end of the season, the number of points accumulated determines the respective standings. However, the traditional points system is limited because it considers only wins and losses, not the strength or weakness of the opponent. Therefore, in handball the official points system may place a team higher than its quality warrants, or place a team lower than it deserves. In fact, there are several ranking systems that integrate computer science into sports standings, such as Winning Percentage, the Rating Percentage Index, Elo’s Method, and the Keener algorithm [2]. One of the best-known algorithms is Google’s PageRank [3], which enables meaningful ranking of web pages when someone makes a query on the search engine. A key aspect of PageRank (PR) is that, when interpreting results, it takes into account whether a team plays and scores against a stronger or a weaker opponent.

Recently, the PageRank algorithm has been used to rank American National Football League (NFL) teams [4-6]. Additionally, Lazova and Basnarkov (2015) compared PR with the conventional FIFA rankings for international soccer teams [7]. Several attempts have also been made to implement PR for the weighted ranking of individual performance in team sports such as soccer, basketball, and hockey [8-10]. Moreover, Beggs et al. (2017) and London et al. (2014) have reported that PageRank is a useful ranking system for track athletes and tennis players [11,12]. Despite the competitive structure of the DKB Bundesliga, no alternative ranking using PR has been attempted in handball. This study aims to investigate the differences between the traditional ranking system and the PR algorithm in handball. A further purpose of this study is to address the offensive and defensive performance of teams by comparing match statistics with the PR algorithm.

2. Method
2.1. Data
A total of 306 game results, together with goals scored and goals conceded, from the 2017-18 Bundesliga season were collected from the official DKB Bundesliga web page (https://www.dkb-handball-bundesliga.de/de/). In addition, the following offense- and defense-related match statistics were gathered for each team: technical errors, fast break goals, goals from line players, shooting percentage of line players, goals from wings, shooting percentage of wings, goals from back players, shooting percentage of back players, total shooting percentage, blocks, steals, conceded fast break goals, conceded goals from wings, conceded goals from line players, conceded goals from back players, and conceded 7m goals.

2.2. Procedure
PageRank determines the importance of a web page from the interconnection of the web. Mathematically, the scores are obtained from the system of coupled equations

$$P_i = (1-q)\sum_j P_j \frac{\omega_{ij}}{s_j^{\mathrm{out}}} + \frac{q}{N} + \frac{1-q}{N}\sum_j P_j\,\delta\!\left(s_j^{\mathrm{out}}\right)$$

In the formula, $\omega_{ij}$ is the weight of the link from team j to team i, $s_j^{\mathrm{out}} = \sum_i \omega_{ij}$ is the out-strength of team j, N is the number of teams, and $\delta(\cdot)$ equals 1 when its argument is zero and 0 otherwise. $P_i$ is the PR score assigned to team i and represents the fraction of the overall "influence" that sits on vertex i in the steady state of the diffusion process. The algorithm is iterative; it was run within 500 iterations, and the PageRank values sum to 1. A fixed-point data format scaled to six digits was used so that PageRank values could be compared accurately. The damping factor was set to 0.85. The algorithm rests on the principle that the losing team passes part of the value it holds to the winning team. In other words, the lower the value of the losing team, the smaller the value it transfers. In the weighted ranking process, the winning team receives a link from the losing team, so a team becomes stronger as it collects more incoming links from the teams it beats. For instance, suppose teams A, B, and C play against each other. If A defeats B, a directed link is drawn from B to A. This link represents the amount of value that B gives away, which is proportional to the fraction of wins between A and B. Hence, when the PR algorithm takes all competing teams into account, a weighted and directed network is established (Figure 1).

Figure 1. In the PR algorithm, the beaten team transfers part of its own value to the winning side.

Table 1 presents example match-ups for teams A, B, and C and the resulting ranking according to who beats whom. Team ‘C’ has the best PR value, with three wins and one loss, whereas teams ‘A’ and ‘B’ have the same number of wins and losses. Nevertheless, team ‘A’ is ranked 2nd with a higher PR value because it won against a stronger opponent, ‘C’.

Table 1. An example of team (‘A’, ‘B’ and ‘C’) standings according to the PageRank algorithm

| Team | PageRank value | Win | Lose |
|------|----------------|-----|------|
| 'C'  | 0.39738        | 3   | 1    |
| 'A'  | 0.38777        | 1   | 2    |
| 'B'  | 0.21485        | 1   | 2    |
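The procedure of Section 2.2 and the three-team example of Table 1 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' MATLAB code: the function name `pagerank`, the link format, and the five-game schedule are hypothetical. The schedule (A beats C once, B beats A once, C beats A once and B twice) is inferred from the win and loss counts in Table 1 and from the remark that A won against C, and q = 0.15 is assumed so that 1 − q equals the stated damping factor of 0.85.

```python
def pagerank(teams, links, q=0.15, iterations=500):
    """Power iteration for the weighted PageRank formula of Section 2.2.

    links: (source, target, weight) triples; in the win-loss variant the
    source is the losing team and the target is the winning team.
    q = 0.15 is an assumption, chosen so that 1 - q is the stated 0.85.
    """
    n = len(teams)
    index = {team: k for k, team in enumerate(teams)}
    # weight[i][j] is the total weight of links pointing from team j to team i.
    weight = [[0.0] * n for _ in range(n)]
    for source, target, w in links:
        weight[index[target]][index[source]] += w
    # Out-strength of team j: total weight of the links leaving j.
    out_strength = [sum(weight[i][j] for i in range(n)) for j in range(n)]

    scores = [1.0 / n] * n
    for _ in range(iterations):
        # Teams without outgoing links (unbeaten teams) spread their score evenly.
        dangling = sum(scores[j] for j in range(n) if out_strength[j] == 0)
        scores = [
            (1 - q) * sum(scores[j] * weight[i][j] / out_strength[j]
                          for j in range(n) if out_strength[j] > 0)
            + q / n
            + (1 - q) * dangling / n
            for i in range(n)
        ]
    return dict(zip(teams, scores))


# Hypothetical schedule inferred from Table 1: (loser, winner, number of games).
example_links = [("C", "A", 1), ("A", "B", 1), ("A", "C", 1), ("B", "C", 2)]
example_scores = pagerank(["A", "B", "C"], example_links)
for team in sorted(example_scores, key=example_scores.get, reverse=True):
    print(team, round(example_scores[team], 4))
```

Under these assumptions the sketch yields approximately 0.3974 for C, 0.3878 for A, and 0.2148 for B, which agrees with Table 1 to about four decimal places; the remaining differences may stem from the fixed-point arithmetic described in Section 2.2.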

We estimated four different PR values, named PageRankstraight, PageRankoffence, PageRankdefense, and PageRankdifference. PageRankstraight considers only wins and losses; draws are excluded because they cause no change in valuation. In the PageRankoffence and PageRankdefense estimations, goals were used as the links between teams, and the team that conceded a goal passed part of its value to the other team. Therefore, contrary to all other PR scores, the team with the minimum PageRankdefense score has the worst performance. The PageRankdefense score was computed as 1 − PR value in order to reverse the ranking. The margin of victory in a game was used to compute PageRankdifference. All PR scores were normalized between the maximum and minimum official league points (56 and 13) to provide a more meaningful comparison.
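How match results could be turned into link weights for three of the variants, and how scores could be rescaled to the range of official points, is sketched below. The function names `build_links` and `rescale`, the mode labels, and the match tuples are hypothetical, and a linear min-max rescaling is assumed because the text does not state the exact normalization. PageRankdefense, described above as 1 − PR value, is left out because its link direction is not fully specified.

```python
def build_links(matches, mode):
    """Turn match results into (source, target, weight) links.

    matches: (team_a, team_b, goals_a, goals_b) tuples.
    mode: "straight", "offence" or "difference" (labels are illustrative).
    """
    if mode not in ("straight", "offence", "difference"):
        raise ValueError("unknown mode: " + mode)
    links = []
    for team_a, team_b, goals_a, goals_b in matches:
        if mode == "offence":
            # Every goal is a link from the conceding team to the scoring team.
            if goals_a:
                links.append((team_b, team_a, goals_a))
            if goals_b:
                links.append((team_a, team_b, goals_b))
        elif goals_a != goals_b:
            # Draws are skipped because they change no valuation.
            loser, winner = (team_b, team_a) if goals_a > goals_b else (team_a, team_b)
            weight = 1 if mode == "straight" else abs(goals_a - goals_b)
            links.append((loser, winner, weight))
    return links


def rescale(scores, low=13.0, high=56.0):
    """Min-max rescale scores onto the range of official points (assumed linear)."""
    smallest, largest = min(scores.values()), max(scores.values())
    span = largest - smallest  # assumes at least two distinct scores
    return {team: low + (value - smallest) * (high - low) / span
            for team, value in scores.items()}


# Made-up results and scores, for illustration only.
demo_matches = [("A", "B", 30, 25), ("B", "C", 27, 27), ("C", "A", 24, 28)]
straight_links = build_links(demo_matches, "straight")
difference_links = build_links(demo_matches, "difference")
offence_links = build_links(demo_matches, "offence")
demo_scaled = rescale({"A": 0.08, "B": 0.05, "C": 0.02})
print(straight_links)
print(difference_links)
print(demo_scaled)
```

The resulting link lists have the (source, target, weight) shape that a weighted PageRank routine expects, so the same solver can serve all variants and only the link construction changes.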

2.3. Statistical Analysis
Data were processed in MATLAB (MATLAB and Statistics Toolbox Release 2018, The MathWorks, Inc., Natick, Massachusetts, United States). The rankings obtained from the official system and from the PR scores were compared using Spearman’s rank correlation. In addition, nonparametric Spearman correlation was used to examine the relationship between PR scores and match analysis variables (IBM Corp. Released 2013. IBM SPSS Statistics for Windows, Version 22.0. Armonk, NY: IBM Corp.). The magnitude of each correlation was interpreted according to the Hopkins scale (r < 0.1, trivial; 0.1-0.3, small; 0.3-0.5, moderate; 0.5-0.7, large; 0.7-0.9, very large; > 0.9, nearly perfect; 1, perfect) [13]. Statistical significance was assumed at p < 0.05.
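The rank correlation used here can be illustrated with a short pure-Python sketch. It is not the MATLAB or SPSS procedure the authors ran: the helper names `average_ranks` and `spearman` are hypothetical, no p-value is computed, and the input values are the official points and normalized PageRankstraight scores as listed in Table 4.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in order[start:end + 1]:
            ranks[k] = mean_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Official points and normalized PageRankstraight scores as listed in Table 4.
official = [56, 55, 53, 50, 49, 47, 41, 37, 34, 31, 30, 26, 25, 20, 16, 15, 14, 13]
pr_straight = [53.73, 56.00, 43.11, 43.16, 49.97, 47.18, 48.14, 33.72, 19.74,
               31.26, 30.23, 16.32, 23.20, 17.91, 19.60, 14.90, 13.00, 14.69]
rho = spearman(official, pr_straight)
print(round(rho, 3))
```

For these 18 teams the sketch gives a coefficient of about 0.932, the value reported for PageRankstraight, which falls in the highest band of the Hopkins scale below a perfect correlation.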

3. Results
Table 2 presents the official standings, including results, awarded points, goals scored, goals conceded, and average goal difference. It also shows the PageRankstraight, PageRankoffence, PageRankdefense, and PageRankdifference scores of the teams.

Table 2. Bundesliga handball team standings according to the official system and PR values in the 2017-18 season

SGF: SG Flensburg-Handewitt; RNL: Rhein-Neckar Löwen; BER: Füchse Berlin; SCM: SC Magdeburg; THW: THW Kiel; HAN: TSV Hannover-Burgdorf; MTM: MT Melsungen; LEI: SC DHfK Leipzig; TBV: TBV Lemgo Lippe; FAG: FRISCH AUF! Göppingen; WET: HSG Wetzlar; GWD: TSV GWD Minden; HCE: HC Erlangen; TVB: TVB 1898 Stuttgart; GUM: VfL Gummersbach; LUD: The owls Ludwigshafen; NLB: TuS N-Lübbecke; TVH: TV 05/07 Hüttenberg.

The correlations between official points and PR scores are shown in Table 3. A positive correlation was found between the official system ranking and PageRankstraight, PageRankoffence, PageRankdefense, and PageRankdifference (r = 0.932, p < 0.001; r = 0.915, p < 0.001; r = 0.711, p = 0.001; and r = 0.926, p < 0.001, respectively).

Table 3. Correlations of PageRank scores and official points.

| Variable           | Statistic | PageRankdefense | PageRankstraight | PageRankdifference | PageRankoffence | Official Points |
|--------------------|-----------|-----------------|------------------|--------------------|-----------------|-----------------|
| PageRankdefense    | r         | 1.000           |                  |                    |                 |                 |
|                    | p         | .               |                  |                    |                 |                 |
| PageRankstraight   | r         | -0.740**        | 1.000            |                    |                 |                 |
|                    | p         | <0.001          | .                |                    |                 |                 |
| PageRankdifference | r         | -0.709**        | 0.965**          | 1.000              |                 |                 |
|                    | p         | 0.001           | <0.001           | .                  |                 |                 |
| PageRankoffence    | r         | -0.501*         | 0.862**          | 0.891**            | 1.000           |                 |
|                    | p         | 0.034           | <0.001           | <0.001             | .               |                 |
| Official Points    | r         | -0.711**        | 0.932**          | 0.926**            | 0.915**         | 1.000           |
|                    | p         | 0.001           | <0.001           | <0.001             | <0.001          | .               |

**. Correlation is significant at the 0.01 level (2-tailed). *. Correlation is significant at the 0.05 level (2-tailed).

Table 4. Comparison of the official points and normalized PR scores

| Rank | Team | Official | PRstraight | PRoffence | PRdefense | PRdifference |
|------|------|----------|------------|-----------|-----------|--------------|
| 1    | SGF  | 56       | 53.73*     | 47.83*    | -15.08*   | 51.56*       |
| 2    | RNL  | 55       | 56.00#     | 55.63     | -12.00#   | 56.00#       |
| 3    | BER  | 53       | 43.11*     | 43.80*    | -22.08*   | 39.97*       |
| 4    | SCM  | 50       | 43.16*     | 56.00#    | -40.00*   | 37.45*       |
| 5    | THW  | 49       | 49.97#     | 47.35#    | -15.40#   | 47.57#       |
| 6    | HAN  | 47       | 47.18#     | 42.07*    | -29.62*   | 31.48*       |
| 7    | MTM  | 41       | 48.14#     | 42.48#    | -25.14    | 36.04#       |
| 8    | LEI  | 37       | 33.72*     | 27.36*    | -13.26#   | 23.47*       |
| 9    | TBV  | 34       | 19.74*     | 30.14*    | -38.46*   | 15.50*       |
| 10   | FAG  | 31       | 31.26#     | 33.93#    | -31.63    | 25.08#       |
| 11   | WET  | 30       | 30.23#     | 32.63#    | -22.80#   | 28.36#       |
| 12   | GWD  | 26       | 16.32*     | 33.53#    | -48.32*   | 14.66*       |
| 13   | HCE  | 25       | 23.20*     | 24.03*    | -38.52#   | 17.99#       |
| 14   | TVB  | 20       | 17.91*     | 25.76#    | -46.35*   | 14.94        |
| 15   | GUM  | 16       | 19.60#     | 24.27     | -43.45    | 18.19#       |
| 16   | LUD  | 15       | 14.90      | 21.34*    | -42.71#   | 13.77*       |
| 17   | NLB  | 14       | 13.00*     | 13.00*    | -29.48#   | 13.00*       |
| 18   | TVH  | 13       | 14.69#     | 24.89#    | -55.00    | 14.00#       |

* indicates that the team's placement moved down; # indicates that the team's placement moved up.

SGF: SG Flensburg-Handewitt; RNL: Rhein-Neckar Löwen; BER: Füchse Berlin; SCM: SC Magdeburg; THW: THW Kiel; HAN: TSV Hannover-Burgdorf; MTM: MT Melsungen; LEI: SC DHfK Leipzig; TBV: TBV Lemgo Lippe; FAG: FRISCH AUF! Göppingen; WET: HSG Wetzlar; GWD: TSV GWD Minden; HCE: HC Erlangen; TVB: TVB 1898 Stuttgart; GUM: VfL Gummersbach; LUD: The owls Ludwigshafen; NLB: TuS N-Lübbecke; TVH: TV 05/07 Hüttenberg.

Table 4 compares the official points with the normalized PR scores and shows how the team standings change under the PR algorithm. Interestingly, relative to the official points-based standings, only one team keeps the same place under PageRankstraight (Figure 2) and under PageRankdifference (LUD and TVB, respectively). In addition, only two teams keep their rank under PageRankoffence (RNL and GUM), while four teams keep theirs under PageRankdefense (MTM, FAG, GUM, and TVH). The results of the correlational analysis of offensive and defensive match statistics with PR scores and official points are shown in Table 5. Figure 3 illustrates the connectivity network between teams. In the graph, every team is represented by a node, and the size of each node is proportional to its PR score.
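The comparison of placements between two ranking systems can be sketched as follows. The helper names `positions` and `movement` and the four-team league are made up for illustration, ties are not handled, and this is not claimed to be the procedure behind the markers in Table 4.

```python
def positions(scores):
    """Map each team to its place (1 = highest score); ties are not handled."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {team: place for place, team in enumerate(ordered, start=1)}


def movement(official_scores, alternative_scores):
    """Label each team 'up', 'down' or 'same' under the alternative ranking."""
    official_place = positions(official_scores)
    alternative_place = positions(alternative_scores)
    labels = {}
    for team in official_place:
        # A positive change means the team climbs to a better (smaller) place.
        change = official_place[team] - alternative_place[team]
        labels[team] = "up" if change > 0 else "down" if change < 0 else "same"
    return labels


# Made-up four-team league, for illustration only.
demo_official = {"W": 40, "X": 35, "Y": 30, "Z": 10}
demo_alternative = {"W": 36.0, "X": 40.0, "Y": 21.5, "Z": 10.0}
demo_labels = movement(demo_official, demo_alternative)
print(demo_labels)
```

In this toy league W and X swap places while Y and Z stay where they were, which mirrors the kind of change that the down and up markers in Table 4 are meant to flag.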

Figure 2. Relation between PageRankstraight and official rankings as different ranking methods. Scatter plot of the rank positions obtained with the official point-based system against those obtained with PageRankstraight.

Figure 3. Network graph of teams by PageRankstraight

4. Discussion and Conclusion
The purpose of the study was to apply the PageRank algorithm to the ranking of handball teams. First of all, the results indicate that the official system and the PR algorithm rank the teams differently. Interestingly, the champion of the 2017-18 Bundesliga season under the traditional system did not take first place in any of the four PR scores. The Bundesliga is known as one of the most competitive leagues in European handball, and there is little difference among the first four and among the last four teams. In such cases, the result of a game between two closely ranked teams is critically important. The PageRank algorithm, however, takes into account not only the outcome of matches but also the strength or weakness of the opponent. Top-ranked teams therefore have to take games against weak sides seriously, while teams at the bottom of the league try to achieve unexpected results against top-placed ones. Another potential disadvantage of the traditional points system is that mid-table teams have motivational problems toward the end of the season. The PR algorithm removes this lack of motivation and keeps the ranking uncertain, because every match against every opponent is crucial. In a previous study, Lazova and Basnarkov (2015) observed a considerable ranking difference between FIFA and PR for international soccer teams [7]. However, the method compared with PR in that study was the Elo rating, by which FIFA ranks 210 national federations. In a study of eight different algorithms, including PageRank, Barrow et al. (2013) found that score difference is a better indicator than win-loss assessment for predicting team rankings [2]. However, there was a conflicting result in determining top seedings. This result may be explained by the effect of knockout matches in the other tournament data.
In another study, Bigsby and Ohlmann (2017) compared a traditional ranking system with PageRank and the Elo rating method for wrestlers from all weight classes [14]. The findings of that study suggest that PageRank may have several benefits compared with the traditional system. However, they showed that the Elo rating method yielded better rankings. A possible explanation is that PR rewards wins over high-ranked teams but does not punish losses. Govan and Meyer (2006) found that the PR algorithm was a feasible tool for ranking NFL teams, and their results showed that a PageRank weighted by point differential outperformed the comparison method [4]. The findings of the present study support the view that the PageRank algorithm adjusts the ranking of teams by using a simplified `who beat whom' model. The main finding of the present study concerns the correlations of match-related statistics with the official system and the PR scores. The official system and PR scores showed little or no correlation with the in-game actions. In this study, the PR algorithm ranked the teams differently from the official system. These relationships may partly be explained by the fact that defensive and offensive statistics reflect game performance, whereas the PageRank algorithm also considers the opponent's strength. Overall, these results indicate that ranking by the PageRank algorithm is sensitive to defensive and offensive actions. The PageRank algorithm allows different sides of performance, such as defense or offense, to be evaluated. Another important finding was that fast break goals and technical errors showed a remarkable correlation with PageRankdifference when compared with the official system. This result may be explained by a weaker team making more technical errors, which result in fast break goals in unbalanced games.

Contrary to the official system, the PageRankdefense score showed a correlation with conceded wing goals but no significant correlation with the block parameter. These results suggest that scoring more goals against a stronger opponent is essential if a team cannot win the game. Likewise, conceding as few goals as possible against weaker teams seems to be as critical as beating them. Similarly, under International Volleyball Federation regulations (FIVB, 2018), the number of sets won changes the points a team gains [15]. Nevertheless, the score margin within a set is not considered as a factor. It was hypothesized that the PageRank algorithm has an offensive and defensive perspective when evaluating the performances of teams. This accords with our findings that PageRankoffence, PageRankdefense, and PageRankdifference ranked the teams significantly differently from the official system. Although the traditional system considers goals scored and conceded as an average, PageRank seems to be a better indicator. Previous research (Govan et al., 2009) ranked teams according to offensive and defensive performance using a linear algebra method and another algorithm used for ranking web pages [16]. However, the findings of that study are not directly comparable with ours, because the present study used the PageRank algorithm to rank teams. Jacobson and King (2009) investigated predicting the winner of a game by considering the margin of victory [17]. Their results were conflicting in predicting teams, especially the higher-seeded teams. However, the focus of that study was betting and probabilities, not ranking the teams. Also, in some unbalanced matches the result may be decided early in the game, and the goal margin can affect the strategies of players or coaches. When a team leads the game, its performance in attack or defense may decrease unless it is playing a knockout game.
If the PR algorithm is used to rank the teams, it evaluates all of the factors mentioned above; therefore, both the leading and the trailing team keep competing for the goal margin until the final whistle, which makes games more enjoyable to watch. This result may encourage teams to pursue more decisive wins if goal-rich handball games are rewarded with extra points. A limitation of this study is that the PageRank algorithm was used to rank teams on four separate variables: win-loss, defense, offense, and margin of victory. Four different PR scores may result in overrated or underrated rankings. Therefore, the present study could not provide a single, complete PageRank score for each team. In further research, a composite PageRank score or an equivalent could be investigated. In this study, the PR algorithm presented a novel approach to ranking handball teams. Further research is needed to investigate alternative ranking methods that allow objective evaluation of a team’s performance, not just game results. Finally, we do not propose replacing the traditional system with the PR algorithm for league standings, but it could serve as a method for assessing team market values and the distribution of income.

References
1. International Handball Federation. The official handball rules. 2010. http://ihf.info/files/Uploads/NewsAttachments/0_RuleGame_GB.pdf.
2. Barrow D., Drayer I., Elliott P., Gaut G. Ranking rankings: an empirical comparison of the predictive power of sports ranking methods. Journal of Quantitative Analysis in Sports, 2013, 9(2):187–202.
3. Page L., Brin S., Motwani R., Winograd T. The PageRank citation ranking: Bringing order to the web. Tech. Rep. 1999-66, Stanford InfoLab.
4. Govan AY., Meyer CD. (editors). Ranking national football league teams using Google’s PageRank. AA Markov Anniversary Meeting; 2006; Charleston: Boson Books.
5. Govan AY. Ranking theory with application to popular sports. Raleigh: North Carolina State University; 2008.
6. Zack L., Lamb R., Ball S. An application of Google’s PageRank to NFL rankings. Involve, a Journal of Mathematics, 2012, 5(4):463–71.
7. Lazova V., Basnarkov L. PageRank approach to ranking national football teams. CoRR abs/1503.01331 (2015).
8. Pena JL., Touchette H. A network theory analysis of football strategies. In: C. Clanet (ed.), Sports Physics: Proc. 2012 Euromech Physics of Sports Conference, pp. 517–528 (2012).
9. Brandt M., Brefeld U. Graph-based Approaches for Analyzing Team Interaction on the Example of Soccer. Proceedings of the ECML/PKDD Workshop on Machine Learning and Data Mining for Sports Analytics 8 (2015).
10. Brown S. A PageRank Model for Player Performance Assessment in Basketball, Soccer and Hockey. 2017. http://www.sloansportsconference.com/wp-content/uploads/2017/02/1494.pdf.
11. Beggs CB., Shepherd SJ., Emmonds S., Jones B. A novel application of PageRank and user preference algorithms for assessing the relative performance of track athletes in competition. PLOS ONE, 2017, 12(6):1–26.
12. London A., Németh J., Németh T. Time-dependent network algorithm for ranking in sports. Acta Cybernet., 2014, 21(3):495–506.
13. Hopkins WG. A New View Of Statistics. Internet Society for Sport Science. 2013. http://www.sportsci.org/resource/stats/.
14. Bigsby GK., Ohlmann JW. Ranking and prediction of collegiate wrestling. Journal of Sports Analytics, 2017, 3:1–19. DOI 10.3233/JSA-160024.
15. The Fédération Internationale de Volleyball (FIVB). Sports Regulations Volleyball. Version 4 May 2018. http://www.fivb.org/en/FIVB/Document/Legal/FIVB_Sports_Regulations_2018_20180504.pdf.
16. Govan AY., Langville AN., Meyer CD. Offense-Defense Approach to Ranking Team Sports. Journal of Quantitative Analysis in Sports, 2009, 5(1):1–19.
17. Jacobson SH., King DM. Seeding In The NCAA Men’s Basketball Tournament: When Is A Higher Seed Better? The Journal of Gambling Business and Economics, 2009, 3(2):63–87.
