Salary Determination in the

Econometrics II Project January 16, 2015

Abstract In our semester project we estimate a model of players’ salary determination in the National Hockey League. We use the standard OLS method to estimate the regression and discuss the specifics of sports economics research, including a player’s performance and its effect on the value of his contract. We also deal with the topic of the present in the NHL and the specific role of salary versus cap hit measures. Our main focus is on whether cap hit is a better estimate of a player’s contract value than the traditionally used salary in a given season. We identify a few possible contributions to the study of salary determination in sports economics, including a way of dealing with the overvalued contracts of highly paid players.

∗Our main reference database for NHL statistics is the NHL’s official website, nhl.com. Salary information was retrieved from spotrac.com and nhlnumbers.com. All additional data on average salaries, and any other information concerning NHL salaries, are computed from or taken from these websites.

1 Introduction

The National Hockey League (NHL) is one of the four elite sports leagues in the USA (along with the National Basketball Association, the National Football League and Major League Baseball), which are among the wealthiest sports leagues in the world. The average salary of an NHL player is 2.4 million dollars (third among US sports leagues). Much has been written on players’ salary determination, and various regression methods have been used to describe the process of negotiation between players’ agents and team managers.

The sports industry is an interesting playground for scholars who investigate the relationship between the marginal product of capital and the earnings of a firm. The data are readily available (teams are often obliged to make salary information public), and the performance of a firm (i.e. a team) is simple to measure because games provide a constant comparison among teams. Contracts are awarded differently than in traditional labor markets (e.g. the manufacturing industry). Badly performing players are put aside and their future is endangered, whereas workers in a factory might effectively hide their lack of skill, a luxury NHL players do not have. For these reasons, the methods used in sports economics research might contribute to the study of real microeconomic problems.

For example, the NBA offers a widely used opportunity to examine wage differentiation based on racial discrimination (Gius & Johnson, 1998), a topic we often covered in our seminars. The National Football League contributes to another topic, the effect of academic experience and specific field positions on a player’s performance, a typical way of investigating the effect of education on abilities (Kowalewski, 2010).

We retain most of the theoretical foundations of Kevin Peck’s honors thesis (Peck, 2012). Peck examines which effects determine NHL players’ salaries using same-season statistics. He presents two effects associated with salary determination: the effect of the player himself and the effect of the team. These two effects were already investigated by Leo Kahane (Kahane, 2001), who suggested that team aspects significantly affect salaries, but we concentrate solely on the individual effects. Our reasoning is that, contrary to Kahane’s work, team performance is an immensely complex process in which skills such as cooperation and mental proficiency are intuitively of great importance, and their estimation exceeds the scope of this project. We also propose that the team effect is slightly exaggerated, since there are many examples of poorly performing teams with elite players as well as “free riding” players hiding their shortcomings behind team performance.

Peck uses slightly different measures from ours; we overlap mainly in the offensive statistics. He uses the variable allstars (indicating whether a player attended the annual NHL All-Star Game), which he found relevant for both individual and team effects. Popular players attract fans to games and therefore increase gate revenue and merchandise sales. Also, the presence of an All-Star player increases the performance of a team. We decided not to incorporate this measure. Peck also uses the variable career games, which we replace with age and age squared.

The second study that inspired us is Vincent & Eastman’s (2009) quantile regression approach. The authors point out the importance of differentiating between forwards and defencemen. The typical statistics of offensive ability (e.g. goals, assists) are straightforward, but those of defending and preventing goals are much more complex. Penalty minutes, the plus/minus statistic and physical attributes are among the commonly used proxies. The authors concentrate on the difference between high-paid and low-paid players, dividing them into a few quantile groups to account for different effects across groups. We did not divide the sample into offensive and defensive groups, but we were aware that these groups differ and incorporated a dummy variable to account for that. We discuss below why our model would not be effective if we had split the sample.

2 Dataset Description

Unlike most previous research, our model focuses on the effect of players’ performance in past seasons on their future salary. The logic behind this is intuitive. Players negotiate their contracts based on their past performance, and their bargaining power is determined by how well they did in previous years. Once a contract is signed, a player can no longer influence it with his play. The exception is performance bonuses included in a contract. However, these bonuses also count against the NHL salary cap (the maximum amount of money a team can pay its players in a given season) and are rather rare, since teams are reluctant to use them for fear of exceeding the cap. They are mostly given to high-profile rookies, because rookie salaries are tightly restricted while performance bonuses can dramatically increase a rookie’s pay (under the Collective Bargaining Agreement (CBA), the maximum entry-level base salary was $925,000, with up to $2,850,000 in performance bonuses, during the 2012-2013 season). Still, contracts are mainly determined by previous play, and even the inclusion of performance bonuses in a contract has to be earned in the past.

In order to avoid performance statistics from the lockout-shortened 2012-2013 season, we took 200 random players who participated in the 2011-2012 season. We collected their statistics from the 2010-2011 and 2011-2012 seasons and restricted the sample to players who played at least 15 games in both seasons (the same restriction that Peck uses). We then added their salary information for 2012-2013 (adjusted to a full season). Players who did not sign an NHL contract for that season were dropped from the regression. We ended up with 122 players (observations), each of whom played at least 15 games in both the 2010-2011 and 2011-2012 seasons and was signed with an NHL team during the 2012-2013 season.

We use both players’ salaries in a season and their cap hits (both in thousands of dollars) as dependent variables. Cap hit is calculated as the average of a player’s salaries over all seasons of his contract and therefore remains constant throughout the contract. This is the sum that counts against the team’s salary cap. In some cases cap hit can be a more reasonable estimate of a player’s true value, as teams often tried to give established players “front-loaded” contracts in order to bring their cap hit down. These contracts usually featured extremely high salaries in the initial seasons and very low salaries at the end. This was often combined with the practice of signing players well into their forties, on the assumption that they would retire before the contract expired, since very few players are able to continue playing at that age. It is reasonable to assume that a player’s performance will decline as he gets older, but teams exploited these strategies to give themselves more cap space to sign other players. There are several highly publicised cases of such behavior, perhaps the most notorious was the case of . The issues were addressed in the current CBA signed in 2013: changes in salary between years of a given contract are now regulated, as is the maximum length of a contract.

There are two blatant examples of such contracts in our sample, both signed in 2012 before the updated CBA was negotiated. Shea Weber signed a 14-year deal with the Nashville Predators; his salary in the 2012-2013 season is $14,000,000, whereas his cap hit is only $7,850,000. The other example is Ryan Suter’s 13-year deal with the Minnesota Wild ($12,000,000 actual salary versus a $7,538,000 cap hit in 2012-2013). Both proved very problematic when using salary as the dependent variable.
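The cap hit calculation described above can be illustrated with a minimal sketch. The yearly salaries below are hypothetical figures chosen for round numbers (they do not describe any real contract), and the function name `cap_hit` is our own.

```python
# Hypothetical front-loaded contract, yearly salaries in thousands of dollars.
# These figures are illustrative only and do not describe any real player.
yearly_salaries = [10000, 10000, 8000, 6000, 3000, 1000, 1000, 1000]


def cap_hit(salaries):
    """Average yearly salary over the full length of the contract."""
    return sum(salaries) / len(salaries)


hit = cap_hit(yearly_salaries)
# How far the first-year salary sits above the constant cap hit.
first_year_gap = yearly_salaries[0] - hit
print(f"cap hit: {hit:.0f}, first-year salary: {yearly_salaries[0]}, gap: {first_year_gap:.0f}")
```

In this made-up case the first-year salary is twice the cap hit, which is the kind of gap that makes salary in a single season a noisy measure of contract value.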

The independent variables that we use are games played, goals, assists, plus/minus (all for the 2010-2011 and 2011-2012 seasons), penalty minutes (only for 2011-2012), age, draft position, years since signing the contract, and whether a player is a forward or a defenceman.

Games played measures whether a player is able to stay healthy throughout the season and/or whether his performance keeps him with an NHL team. We expect a positive coefficient, and the effect should be the same for forwards and defencemen.

Goals and assists measure a player’s offensive abilities and are likely to be more important for forwards. A goal is credited to the last player to touch the puck before a goal is scored. Each goal carries from zero to two assists, awarded to the players who touched the puck before the goal scorer.

Plus/minus measures a player’s ability to contribute at “both ends of the ice” (offensively and defensively). A player is awarded a “plus” when he is on the ice when his team scores a goal and a “minus” when he is on the ice when his team concedes a goal (excluding power plays). These are added up to give the plus/minus. A solid “two-way” player may contribute more to a team’s success than an offensively productive player who is a defensive liability.

Penalty minutes are likely to have different effects for low-end and high-end players. Generally, taking a penalty puts the team at a disadvantage. However, for “character” players penalty minutes might have a positive effect, as their primary role is to provide a spark for the rest of the team by getting into fights and hitting opposing players. This type of play results in an accumulation of penalty minutes.

We expect age to have a positive coefficient, as we use it mainly as a measure of experience. However, we also add its squared value (with a negative expected coefficient), because from a certain point a player’s performance starts to decline as he ages.

Draft indicates the position at which a player was taken in the annual NHL entry draft, in which teams gain the rights to the players they select. It is a measuring stick of a player’s perceived potential when he becomes draft eligible (usually around the age of 18). Players drafted high are regarded as more valuable and are usually given more chances to prove themselves than players picked in later rounds. There are 13 players in our sample who made it to the NHL despite being undrafted. The higher the number, the later a player was picked, so we expect a negative sign.

The variable contract is simply the age of the contract: the number of years since a player signed the contract effective in 2012-2013. Since its introduction, the salary cap has increased annually along with the league’s revenues. The increase is substantial, from $39.0 million in 2005-2006 to $70.2 million (pro-rated) in 2012-2013. This means that players who signed their contracts more recently are likely to be paid more, because the salary cap is higher and teams have more money to spend. Therefore, we expect a negative sign. This variable is not traditionally used, but we feel it might be a good way to account for the increasing salary cap.

Lastly, we add the dummy variable forw to indicate whether a player is a forward or a defenceman. Defencemen are likely to get fewer points (goals and assists combined), as their primary responsibility is to prevent goals. In our regression, ceteris paribus, a defenceman is likely to get a higher salary than a forward, because we estimate mainly offensive abilities, while defensive abilities are far more intangible and harder to measure.

The numbers attached to variable names refer to the year in which the season ended; e.g. goals12 refers to goals scored in the 2011-2012 season, and similarly for the others. Below is a summary of all our variables.
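As a rough sketch of how the derived regressors could be built from raw player records: the example player is hypothetical, the field names follow the variable names used in this paper (with assists11 spelled out), and computing age as 2011 minus birth year is an assumption that happens to match the reported means and ranges of age and born, not a documented step.

```python
def derive_variables(player):
    """Return a copy of one player record (a dict) with derived regressors added."""
    derived = dict(player)
    # Assumption: age = 2011 - birth year (consistent with the summary statistics).
    derived["age"] = 2011 - player["born"]
    derived["age2"] = derived["age"] ** 2
    # Points are goals and assists combined, per season.
    for season in ("11", "12"):
        derived["points" + season] = player["goals" + season] + player["assists" + season]
    # Aggregated statistics sum the 2010-2011 and 2011-2012 seasons.
    for stat in ("goals", "assists", "plusminus"):
        derived["aggreg" + stat] = player[stat + "11"] + player[stat + "12"]
    derived["aggregpoints"] = derived["points11"] + derived["points12"]
    return derived


# Hypothetical player, for illustration only.
example = {"born": 1984, "goals11": 12, "assists11": 20, "plusminus11": 3,
           "goals12": 15, "assists12": 18, "plusminus12": -4}
row = derive_variables(example)
print(row["age"], row["age2"], row["aggregpoints"])
```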

Table 1: Summary Statistics

Variable             Mean  Std. Dev.  Min.   Max.    N  Expected sign
age                27.525      4.186    20     39  122  +
age2              774.984    236.929   400   1521  122  -
asists11           18.721     11.561     0     49  122  +
assists12          18.467     12.759     0     59  122  +
born             1983.475      4.186  1972   1991  122
caphit13         2747.254   1938.258   525   8700  122
contract            1.057       1.13     0      5  122  -
draft              76.560     77.209     1    287  109  -
forw                0.689      0.465     0      1  122  -
games11            66.066     16.812    20     82  122  +
games12            65.549     18.805    15     82  122  +
goals11             10.09      8.116     0     37  122  +
goals12            10.139        9.4     0     50  122  +
pim12               40.18     32.689     0    235  122  -
plusminus11         0.172     11.009   -27     32  122  +
plusminus12         0.533     11.464   -28     28  122  +
points11           28.811     18.094     0     73  122  +
points12           28.607      20.44     2    109  122  +
salary13          2781.59   2235.999   525  14000  122
aggreggoals         20.23      16.37     0     67  122  +
aggregassists      37.189     22.086     3    108  122  +
aggregplusminus     0.705     17.576   -53     47  122  +
aggregpoints       57.418      35.85     3    148  122  +

3 Regression model

We estimate the equation by standard Ordinary Least Squares, even though other researchers have used more advanced methods such as quantile regression and hierarchical linear models. Simply plotting the variable points (goals and assists combined) against salary shows that our dataset suffers from potential bias caused by two leverage points (Figure 1). We therefore also examine the caphit variable, which might capture a player’s contract value more precisely. In the case of caphit (Figure 2), the leverage points move significantly closer to the main cluster of observations. The issue is even clearer when we plot salary against the contract variable (Figure 3), which further supports the argument for caphit; this figure is discussed in Model 1.

Figure 1: Points to Salary

Figure 2: Points to Cap Hit
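To put a number on how far the two front-loaded contracts sit from the rest of the sample, the sketch below standardises their salary and cap hit using the means and standard deviations from Table 1. It takes the 13-year deal's salary as 12,000 and its cap hit as 7,538 (the only assignment consistent with the Table 1 maximum cap hit of 8,700). This is a rough distance measure on the dependent variable, not a formal leverage or influence diagnostic, and the helper names are our own.

```python
# Summary statistics from Table 1 (thousands of dollars).
SUMMARY = {
    "salary13": {"mean": 2781.59, "sd": 2235.999},
    "caphit13": {"mean": 2747.254, "sd": 1938.258},
}

# The two front-loaded contracts described in Section 2 (thousands of dollars).
CONTRACTS = {
    "14-year deal": {"salary13": 14000, "caphit13": 7850},
    "13-year deal": {"salary13": 12000, "caphit13": 7538},
}


def z_score(value, variable):
    """Distance of a value from the sample mean, in standard deviations."""
    stats = SUMMARY[variable]
    return (value - stats["mean"]) / stats["sd"]


for name, contract in CONTRACTS.items():
    z_salary = z_score(contract["salary13"], "salary13")
    z_caphit = z_score(contract["caphit13"], "caphit13")
    print(f"{name}: salary z = {z_salary:.2f}, cap hit z = {z_caphit:.2f}")
```

Both contracts lie more than four standard deviations above the mean salary but only about two and a half above the mean cap hit, which is the shift visible between Figures 1 and 2.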

3.1 Model 1

$$\text{salary13} = \beta_0 + \beta_1\,\text{goals12} + \beta_2\,\text{assists12} + \beta_3\,\text{contract} + \beta_4\,\text{forw} + \beta_5\,\text{age} + \beta_6\,\text{age}^2 + \beta_7\,\text{plusminus12} + \beta_8\,\text{pim12} + u$$

We run the regression with a cap on salary that excludes the leverage points (maximum salary is restricted to $10,000,000). We lose two observations, but the model performs well: only two variables are insignificant, penalty minutes and plus/minus, and most of the coefficients have the expected signs.

Surprisingly, the coefficient of contract is positive (significant at 1%), meaning that players with older contracts are paid more. Upon inspecting the data, we came up with a possible explanation. Only high-end players are able to earn long-term contracts, and they are also more likely to earn higher salaries. On the other hand, we have an abundance of fringe players signed to short-term, low-value contracts in 2012. It seems that this effect is stronger than the effect of the increasing salary cap. When we include the two leverage points, contract has a negative coefficient as expected (but it is highly insignificant). Figure 3 clearly shows why. Results from this regression are not included.

Age has a positive effect and its square a negative effect, as we expected. The turning point is at 29.37 years. Although it is insignificant, it is surprising that plus/minus has a negative coefficient. The likely explanation is that players in prominent roles are matched against better competition, which brings their plus/minus down. The dummy variable forw has the expected negative coefficient, meaning that a forward is paid $1,027,090 less than a defenceman with the same statistics.

Goals and assists both have positive coefficients as expected, but it is a little surprising that the coefficient on assists is higher. Intuitively, when two players have the same number of points, ceteris paribus, the one with more goals is likely to be preferred by teams.

Figure 3: Contract Age to Salary

See Table 2 for detailed results.
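The age at which the estimated age effect peaks follows from setting the derivative of β5·age + β6·age² to zero, which gives age* = -β5 / (2·β6). A minimal sketch using the Table 2 coefficients (the function and constant names are our own):

```python
def turning_point(b_age, b_age2):
    """Age at which b_age*age + b_age2*age**2 reaches its maximum."""
    if b_age2 >= 0:
        raise ValueError("no interior maximum unless the squared term is negative")
    return -b_age / (2 * b_age2)


# Coefficients from Table 2 (dependent variable salary13, thousands of dollars).
B_AGE, B_AGE2, B_FORW = 861.585, -14.668, -1027.090

peak_age = turning_point(B_AGE, B_AGE2)
# Salary is measured in thousands, so the dummy coefficient is scaled to dollars.
forward_gap_dollars = B_FORW * 1000
print(f"peak age: {peak_age:.2f} years, forward gap: {forward_gap_dollars:,.0f} dollars")
```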

3.2 Model 2

$$\text{caphit13} = \beta_0 + \beta_1\,\text{goals12} + \beta_2\,\text{assists12} + \beta_3\,\text{contract} + \beta_4\,\text{forw} + \beta_5\,\text{age} + \beta_6\,\text{age}^2 + \beta_7\,\text{plusminus12} + \beta_8\,\text{pim12} + u$$

In the second model we change the dependent variable to caphit13 to examine whether it captures the effects better. With cap hit we are not forced to drop any observations. Most coefficients retain their significance from Model 1; the exceptions in Table 3 are goals, contract, age and age squared, which decline to the 5% level. The coefficient of determination remains similar at 66%. We might conclude that using caphit as the dependent variable did not provide any advantage over salary in terms of fit. But, once again, we did not have to restrict the sample. See Table 3.
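As a consistency check on the reported fit statistics, the overall F statistic of an OLS regression with an intercept can be recovered from R², the number of observations n and the number of regressors k as F = (R²/k) / ((1 - R²)/(n - k - 1)). The sketch below applies this to the figures reported in Tables 2 and 3; small deviations from the reported F values are expected because the R² values are rounded. The function name is our own.

```python
def f_statistic(r_squared, n_obs, n_regressors):
    """Overall F statistic and its degrees of freedom for OLS with an intercept."""
    df_model = n_regressors
    df_resid = n_obs - n_regressors - 1
    f_value = (r_squared / df_model) / ((1 - r_squared) / df_resid)
    return f_value, df_model, df_resid


# (R squared, N, number of regressors) as reported in Tables 2 and 3.
models = {
    "Model 1 (salary13)": (0.63, 120, 8),
    "Model 2 (caphit13)": (0.665, 122, 8),
}

for name, (r2, n, k) in models.items():
    f_value, df1, df2 = f_statistic(r2, n, k)
    print(f"{name}: F({df1},{df2}) = {f_value:.2f}")
```

The degrees of freedom match the reported F (8,111) and F (8,113), and the recomputed values land close to the reported 23.66 and 28.099.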

3.3 Model 3

$$\text{salary13} = \beta_0 + \beta_1\,\text{games12} + \beta_2\,\text{goals12} + \beta_3\,\text{assists12} + \beta_4\,\text{contract} + \beta_5\,\text{forw} + \beta_6\,\text{age} + \beta_7\,\text{age}^2 + \beta_8\,\text{plusminus12} + \beta_9\,\text{pim12} + \beta_{10}\,\text{draft} + u$$

Model 3 features only players who went through the draft (i.e. teams selected them as rookies at the annual entry draft). We think that draft position might reflect some of the intangible assets that players have. There are 13 players in our sample who were not selected. This does not say anything about their performance in the NHL, merely that they were not as highly regarded when they were younger. In order to study the effect of the draft, we simply drop them from this model.

The regression results (Table 4 and Table 5) are in accordance with our expectations. The significant variables from Model 1 retain the same significance level, and the coefficient of draft is negative, as we would expect. We could have included draft as a dummy variable that only checks for the effect of being selected, but we feel that draft position captures reality much better. The caphit model for drafted players (Table 5) differs from the salary model (Table 4) in almost no respect, except for a slight difference in the significance level of draft. Since both Shea Weber and Ryan Suter were drafted, the caphit model again has the benefit of not having to drop them from the regression.
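The difference in the significance of draft between the two models can be traced to the t-ratios (coefficient divided by standard error). The sketch below uses the draft estimates from Tables 4 and 5; the critical values are approximate two-sided values of Student's t for roughly 100 degrees of freedom, which is an assumption made here for illustration rather than something reported in the paper.

```python
# Approximate two-sided critical values of Student's t for about 100 degrees
# of freedom (an assumption; exact values depend on the residual df).
CRITICAL = [("1%", 2.63), ("5%", 1.985), ("10%", 1.66)]


def t_ratio(coefficient, std_err):
    """t-ratio for the null hypothesis that the coefficient is zero."""
    return coefficient / std_err


def significance(t_value):
    """Strictest conventional level at which |t| exceeds the critical value."""
    for label, threshold in CRITICAL:
        if abs(t_value) > threshold:
            return label
    return "not significant"


# (coefficient, standard error) of draft as reported in Tables 4 and 5.
draft_estimates = {
    "Table 4 (salary13)": (-3.656, 1.342),
    "Table 5 (caphit13)": (-3.609, 1.395),
}

for table, (coef, se) in draft_estimates.items():
    t = t_ratio(coef, se)
    print(f"{table}: t = {t:.2f}, level: {significance(t)}")
```

The two t-ratios sit just on either side of the approximate 1% threshold, so the change in stars reflects a small difference in precision rather than a different estimated effect.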

3.4 Model 4

$$\text{caphit13} = \beta_0 + \beta_1\,\text{aggreggoals} + \beta_2\,\text{aggregassists} + \beta_3\,\text{contract} + \beta_4\,\text{forw} + \beta_5\,\text{age} + \beta_6\,\text{age}^2 + \beta_7\,\text{aggregplusminus} + \beta_8\,\text{pim12} + u$$

Finally, we wanted to account for statistics from more than one previous season by creating the aggregated variables aggreggoals, aggregassists and aggregplusminus (summed over the 2010-2011 and 2011-2012 seasons). The reasoning is that consistent performance is valued more, and one good season might overvalue an otherwise mediocre player. Because performance indicators in consecutive seasons are highly correlated, we decided to add them up. The model with salary as the dependent variable yielded results very similar to those of the previous models, so we do not present them in the paper. The model with caphit (Table 6), however, was somewhat less informative. The goals variable lost its significance, suggesting that aggregate goal totals do not help in estimating the effect on salary. The main cause might be that we pool offensive and defensive players in one group. It is plausible that defencemen’s earnings are decided on the basis of skills that are not as effectively measurable rather than goals scored, which might undermine our reasoning. We feel that splitting the sample by position might bring different results, as suggested by Vincent & Eastman. Unfortunately, our sample contains only 38 defencemen, which is too few.

4 Conclusion

The determination of players’ salaries in the NHL is a topic that might be overshadowed by complicated macroeconomic and microeconomic models, but sports economics can still contribute to the research field. We did not bring a revolutionary approach to the field; rather, we examined and refined established research procedures. Our main contribution concerns the caphit variable. We wanted to establish that caphit is a more suitable measure than salary (which might suffer from overvaluation) because of teams’ practice of using “front-loaded” contracts to give themselves more cap space. We found that such contracts may lie far from the main cluster of observations and bias the regression. Our sample is limited, but other researchers might face the same issue on a larger scale. The new CBA prevents the signing of such contracts, which should make future research easier.

Our estimates show that the models with salary as the dependent variable produced higher significance of the variables. Intuitively, salary should be the better indicator because, more often than not, a player’s performance is expected to change over the length of his contract (whether he is a rising star or a declining veteran). But the presence of these problematic contracts makes a case for using caphit.

Overall, our results do not diverge in any major way from previous work, despite our use of performance indicators from previous seasons. This suggests that salary is indeed derived from proxies of a player’s skill, such as offensive statistics, and from the details of the contract he signs with his employer.

5 References

Gius, M., & Johnson, D. (1998). An empirical investigation of wage discrimination in professional basketball. Applied Economics Letters, 5(11), 703-705.

Kahane, L. H. (2001). Team and player effects on NHL player salaries: A hierarchical linear model approach. Applied Economics Letters, 8, 629-632.

Kowalewski, S. (2010). Salary determination in the National Football League. Temple University Publishing.

Peck, K. (2012). Salary Determination in the National Hockey League: Restricted, Unrestricted, Forwards, and Defensemen. Honors Theses, Paper 232.

Vincent, C., & Eastman, B. (2009). Determinants of Pay in the NHL: A Quantile Regression Approach. Journal of Sports Economics, 10(3), 256-277.

Table 2: Dependent variable: salary13

Variable        Coefficient    (Std. Err.)
goals12              54.653∗∗     (19.252)
assists12            72.315∗∗     (13.165)
contract            300.940∗∗    (101.328)
forw              -1027.090∗∗    (280.201)
age                 861.585∗∗    (309.301)
age2                -14.668∗∗      (5.487)
plusminus12         -13.470       (10.122)
pim12                 0.098        (3.279)
Intercept        -11206.886∗    (4283.402)

N                       120
R²                     0.63
F (8,111)             23.66

Table 3: Dependent variable: caphit13

Variable        Coefficient    (Std. Err.)
goals12              41.869∗      (19.242)
assists12            91.127∗∗     (13.024)
contract            263.307∗     (100.914)
forw              -1086.138∗∗    (277.608)
age                 816.350∗     (311.634)
age2                -13.902∗       (5.525)
plusminus12          -7.191       (10.211)
pim12                 0.796        (3.327)
Intercept        -10614.810∗    (4324.270)

N                       122
R²                    0.665
F (8,113)            28.099

Table 4: Dependent variable: salary13 (drafted players only)

Variable        Coefficient    (Std. Err.)
games12             -20.851∗∗      (7.061)
goals12              66.206∗∗     (18.864)
assists12            81.558∗∗     (12.914)
contract            307.995∗∗     (95.605)
forw              -1221.610∗∗    (283.692)
age                1035.396∗∗    (294.787)
age2                -17.312∗∗      (5.216)
plusminus12         -10.447        (9.821)
pim12                 1.705        (3.207)
draft                -3.656∗∗      (1.342)
Intercept        -12463.061∗∗   (4086.961)

N                       107
R²                     0.71
F (10,96)            23.547

Table 5: Dependent variable: caphit13 (drafted players only)

Variable        Coefficient    (Std. Err.)
games12             -22.522∗∗      (7.162)
goals12              58.396∗∗     (18.521)
assists12            94.989∗∗     (12.844)
contract            254.273∗∗     (94.586)
forw              -1287.428∗∗    (280.016)
age                 869.431∗∗    (301.936)
age2                -14.429∗∗      (5.335)
pim12                 2.904        (3.256)
plusminus11          17.805†       (9.994)
draft                -3.609∗       (1.395)
Intercept        -10028.943∗    (4210.398)

N                       109
R²                     0.74
F (10,98)             27.87

Table 6: Dependent variable: caphit13 (aggregated statistics)

Variable        Coefficient    (Std. Err.)
aggreggoals          16.785       (11.232)
aggregassists        58.826∗∗      (7.422)
contract            251.107∗∗     (89.153)
forw              -1122.111∗∗    (270.046)
age                 747.848∗∗    (279.869)
age2                -13.032∗∗      (4.960)
aggregplusminus       2.133        (5.775)
pim12                 2.804        (2.952)
Intercept         -9872.056∗    (3889.688)

N                       122
R²                    0.733
F (8,113)            38.725
