
PRODUCTION PROCESS IN THE NBA: A FORMULA FOR A SUCCESSFUL TEAM

A THESIS

Presented to

The Faculty of the Department of Economics and Business

The Colorado College

In Partial Fulfillment of the Requirements for the Degree

Bachelor of Arts

By

Jigmei Dorji

May 2016

Economics

Abstract

Achieving success in the National Basketball Association is not only a priceless and historic feat; teams that succeed in the playoffs and regular season also benefit from financial bonuses. This paper estimates a production function for professional basketball teams and uses the results to determine the areas of focus that are positively and negatively associated with regular season win percentage. A Cobb-Douglas production function and multi-variable Ordinary Least Squares regression models are applied to data collected from the 2010-11 through 2014-15 seasons in the National Basketball Association. The results are also applied to successful teams in the playoffs in order to determine how regular season results translate to the playoffs. The resulting estimates indicate that successful NBA teams over the last five seasons have focused on shooting efficiently, keeping opponent shooting percentages low, rebounding, forcing turnovers at a high rate, and building their teams through the draft.

KEYWORDS: Correlation, Econometrics, Multicollinearity, Multiple Variable Model, Ordinary Least Squares, Regression, Cobb Douglas, Production Function, Production Measurement

JEL CODES: C1, C3, D24, L83

ON MY HONOR, I HAVE NEITHER GIVEN NOR RECEIVED UNAUTHORIZED AID ON THIS THESIS

Jigmei Dorji

Signature

TABLE OF CONTENTS

ABSTRACT

INTRODUCTION
    Financial Incentive
    Area of Focus

LITERATURE REVIEW
    Cobb-Douglas Production Functions
    Basketball Analytics

THEORETICAL FRAMEWORK
    Graph 1: Output and Marginal Product of Input X

DATA AND METHODOLOGY
    Recent Trends in Basketball Statistics
    Methodology

REGRESSION RESULTS AND ANALYSIS
    Table 1: Net Rating Regression Results
    Graph 2: Net Rating vs. Winning Percentage
    Table 2: Play Style Regression Results
    Table 2.1: Regression Results
    Table 2.1a: Percentage vs. Assists
    Table 2.2: Regression Results
    Table 3: Fixed Team Effects Regression Results
    Table 3.1: Summarization of Win Percentage by Conference
    Table 4: Play Style and Fixed Team Effects Regression Results
    Table 4.1: Play Style and Fixed Team Effects Beta Coefficients
    Table 4.2: Summarization of Inputs and Output
    Table 4.3: Play Style and Fixed Team Effects VIFs

CONCLUSION

REFERENCES

Introduction

Every year from October to June, 30 teams play an 82-game regular season (with the most successful teams playing close to 100 games including the playoffs) and fight for supremacy in the National Basketball Association (NBA). But only one team can call itself the champion of the league at the end of the season. Over the last five seasons, four different franchises have claimed the Larry O’Brien trophy: the Dallas Mavericks, the Miami Heat (twice), the San Antonio Spurs, and most recently, the Golden State Warriors. The current decade has been relatively balanced when compared to historical trends. The NBA has only had ten different teams win a title since 1975, indicating that the league has been enjoying a spell of competitive balance in the last five seasons.

Led by eventual Most Valuable Player (MVP) Stephen Curry, the 2014-15 Warriors were able to jump out to an early lead in the standings despite playing in an incredibly competitive Western Conference. Golden State finished the season with a league-leading record of 67 wins and 15 losses and defeated the Cleveland Cavaliers in the NBA Finals for their first championship since 1975. Although the supremacy that Golden State exhibited during the 2015 season is now indisputable, did the Warriors exhibit distinct types of advantages that allowed them to dominate the rest of the NBA? What effects did the style of play have on the winning percentage of the team? How did the personnel affect the record of the team? Did the fixed effects that the franchise has implemented off the court have an impact on the record on the floor?

Financial Incentive

Winning a championship is the unmistakable goal of any NBA franchise and the players involved. Not only is claiming the Larry O’Brien trophy and hanging up the title banner priceless and intangibly significant, but teams that have success in the playoffs benefit from monetary bonuses as well. After winning the 2014 title, the San Antonio Spurs were awarded over $2 million in bonuses from the NBA, not including the trophy and championship rings, while the runner-up Miami Heat were also awarded $1.5 million.

The valuations of franchises also increase significantly after winning the championship. According to Forbes, recent NBA champions have gained an average of 30% in team value after raising the trophy. Teams also raise ticket prices significantly during their Finals run. In 2015, the average Finals ticket price in Golden State ran over $1,200, while the average ticket price in Cleveland was over $1,300. Compared to the regular season, when the Warriors charged an average ticket price of $327 and the Cavs charged an average price of $258, merely playing in the Finals provided a tangible financial benefit. In addition, teams that make the playoffs are awarded nearly $200,000 as a bonus, while teams that reach the conference finals are awarded over $380,000. The teams that finish the regular season with the best record in the NBA and in each conference are also awarded well over $300,000 as a bonus.

Area of Focus

This study focuses on the past five seasons in the NBA. In particular, I plan to observe the characteristics of teams since the 2010-2011 season and determine the inputs that have been significant in affecting regular season success in the past five years of NBA basketball. Given the makeup of the last four title-winning teams, I expect that accurate three-point shooting and assisted field goals, as well as defending three-pointers, are indicators of successful team basketball in the NBA. These team characteristics are related to the style of play that teams consciously employ, but are also a result of the type of players that each team has available. As for fixed team effects, I expect that teams that build through the draft and free agency will have positive relationships with winning percentage.

I plan to use an extensive list of independent variables in the model, including variables measuring the output of the team, statistics measuring the style of play the team utilizes (such as percentage of total shots from three-point range), and additional variables measuring fixed team effects (such as average attendance and the conference of the team). After running several regressions and determining significant variables through the model, the models should be able to describe successful teams from the past five years through style of play as well as the areas of focus that the team exhibits through their statistics. After determining the significant inputs in the model, the results should also be able to predict characteristics of successful teams in the future.

Literature Review

The article “Who is ‘Most Valuable’? Measuring the Player’s Production of Wins in the National Basketball Association” by David Berri, published in Managerial and Decision Economics (Berri, 1999), focused on linking individual performance to team wins in the NBA. Berri found that although having the MVP certainly helps to produce wins, having multiple efficient and productive teammates, particularly in the Playoffs, was the key factor in the 1997-1998 season. The model that Berri used to determine each player’s production of wins was:

Production of wins = (PM + TF + TDF – PA + TA) * total mins (2.1)

Berri first calculated each of the inputs individually, and then combined the factors into the equation above. His inputs were per-minute player production (PM), per-minute team tempo factor (TF), per-minute team defensive factor (TDF), average per-minute production at position (PA), and average player’s per-minute production (TA). The results indicated that one dominant player per team is not enough to have success against Playoff competition.
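As a minimal sketch of how equation (2.1) combines its inputs, the function below simply evaluates the formula as written above. The function name and the sample per-minute values are hypothetical and chosen only for illustration; they are not taken from Berri's data.

```python
def production_of_wins(pm, tf, tdf, pa, ta, total_minutes):
    """Evaluate equation (2.1): (PM + TF + TDF - PA + TA) * total minutes."""
    return (pm + tf + tdf - pa + ta) * total_minutes


# Hypothetical per-minute inputs for a single player (illustration only).
wins = production_of_wins(
    pm=0.008,    # per-minute player production
    tf=0.001,    # per-minute team tempo factor
    tdf=0.0005,  # per-minute team defensive factor
    pa=0.004,    # average per-minute production at the player's position
    ta=0.002,    # average player's per-minute production
    total_minutes=3000,
)
print(round(wins, 2))
```

Because every term is expressed per minute, multiplying by total minutes scales a player's net contribution by playing time.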

Fiona Carmichael, Dennis Thomas, and Robert Ward’s article “Team Performance: The Case of English Premiership Football” (Carmichael, Thomas, & Ward, 2000) utilized a linear production function in which individual match results were determined by various input variables. The main difference from the other studies used as background for this thesis lies in the types of independent variables. Team performance is still the output, but the inputs to the production function were statistics such as the difference in shots on target, the difference in percentage of all successful passes, the difference in number of red cards, the difference in clearances, blocks, and interceptions, and the difference in cumulative team goal differences before the game in question, along with a number of other statistics. The results showed that player skills such as accurate and efficient shooting and passing, as well as defensive skills such as tackles, clearances, and blocks, were all significant independent variables in determining team performance in the Premiership.

José M. Sánchez Santos, Pablo Castellanos García, and Jesus A. Dopico Castro used data from the Spanish league in their article “The Production Process in Basketball: Empirical Evidence from the Spanish League” (Santos, Garcia, & Castro, 2006). The authors found that factors such as home-court advantage, field goal and percentage, keeping turnovers and fouls in check, and defensive rebounds had the highest marginal effects on the probability of winning a particular game. The authors used two different models to estimate the probability of winning a game in the Spanish ACB League. Their first model considered the statistics of the home team versus the away team in each game in relative terms, finding that the home team won nearly 62% of the observed games and that the home team's means for shooting percentage, assists, and total rebounds were higher than the visiting team's. Their second model specifically analyzed the influence of home-court advantage on the probability of winning, and used separate variables for the home team and the visitor. The second model produced significant results similar to those of the first.

Nate Silver of ESPN’s FiveThirtyEight utilized analytic methods in his article “Every NBA Team’s Chance of Winning A Title by 2019” (Silver, 2014), using the current performance of the team, the average age of the team, and the talent level of the best player on the team to estimate which team had the brightest outlook in the near future. Silver found that the Golden State Warriors, Los Angeles Clippers, and Cleveland Cavaliers were the three teams with the highest probability of winning at least one championship by 2019, based on the three factors listed above. When calculating the average age of the team and measuring the talent level of the best player on the team, Silver used projected wins added, a statistic based on a combination of Win Shares and Player Efficiency Rating (PER), to weight the team’s average age by performance, to help determine the relative age of the best players on each team, and to determine how many projected wins each team’s best player accounted for.

Cobb-Douglas Production Functions

Thomas A. Zak, Cliff J. Huang and John J. Siegfried used a production function in their article “Production Efficiency: The Case of Professional Basketball” (Zak, Huang, & Siegfried, 1979). Using a Cobb-Douglas production function and data for individual games during the 1976-77 NBA season, the authors formulated a production frontier and estimated the impact of various inputs used in the production process. The variables included in the model were the ratio of field-goal percentages, ratio of free-throw percentages, ratio of offensive and defensive rebounds, ratio of assists, ratio of personal fouls, ratio of steals and blocks, ratio of turnovers, and a binary dummy variable for location (home versus away games), with the ratio of the final scores as the dependent variable. The empirical results from their production function showed that output was most responsive to field-goal percentage, free-throw percentage, and rebounding. Other variables significantly affecting output were turnovers and personal fouls. Based on their results, teams playing at home held an observed advantage over their visiting opponents.

“An Empirical Estimation of a Production Function: The Case of Major League Baseball” by Charles E. Zech (Zech, 1981) also uses a Cobb-Douglas production function in order to estimate production of victories by a team in Major League Baseball. Zech’s model used the major skills involved in baseball, such as batting average, home runs, stolen bases, ratio of strikeouts to walks, and total fielding chances, as well as years with the same manager and manager win percentage, to describe team success in Major League Baseball. Based on the results, hitting for average is by far the most important factor contributing to team success, which contradicts the conventional wisdom that pitching is the most important factor in baseball. The author then used the results to identify the most valuable player (MVP) in the American and National Leagues in a particular year by empirically determining the value that each player brings to the team.

Zech used each player’s marginal product, calculated by recomputing each team’s batting average, home runs, etc. without the player and using those values in the production function to determine the number of victories the team would have been expected to achieve without the player. Taking the difference between the two values as the player’s marginal product, the model was able to determine which player added the most wins to each team in the 1977 season.

Basketball Analytics

Basketball Analytics by Stephen Shea and Christopher Baker (Shea & Baker, 2013) provided insight into the rapidly growing world of basketball analytics and was used as background research for this thesis. Shea and Baker used traditional statistics to create new statistics that attempt to measure players and teams in new and more effective ways. A notable statistic introduced in Basketball Analytics is Offensive Efficiency (OE), measured as:

OE = (FG + A) / (FGA – ORB + A + TO). (2.2)

Offensive Efficiency is treated as a percentage variable: the formula produces a higher result when made field goals, assists, and offensive rebounds are higher, while missed field goals and turnovers bring the resulting value lower. Offensive Efficiency, as well as total points and total assists, is used to create Efficient Offensive Production (EOP). EOP is used to describe total offensive production but also accounts for the efficiency of the player or team.
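The Offensive Efficiency formula in equation (2.2) can be sketched in a few lines. The function name and the sample stat line below are hypothetical and serve only to show how the numerator and denominator are assembled; EOP is not computed here because its formula is not given in this section.

```python
def offensive_efficiency(fg, ast, fga, orb, tov):
    """Evaluate equation (2.2): OE = (FG + A) / (FGA - ORB + A + TO)."""
    denominator = fga - orb + ast + tov
    if denominator <= 0:
        raise ValueError("FGA - ORB + A + TO must be positive")
    return (fg + ast) / denominator


# Hypothetical per-game team line (illustration only):
# 38 made field goals, 22 assists, 84 attempts, 11 offensive rebounds, 15 turnovers.
oe = offensive_efficiency(fg=38, ast=22, fga=84, orb=11, tov=15)
print(round(oe, 3))  # 60 / 110 -> 0.545
```

Holding everything else fixed, an extra offensive rebound shrinks the denominator and an extra turnover grows it, which matches the description of the statistic above.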

Shea and Baker also introduced a defensive statistic that “accounts for defensive contributions beyond blocks or steals” (Shea and Baker, 2013) called Defensive Stops Gained (DSG). Statistics such as DSG are valuable because defensive statistics have traditionally been lacking compared to offensive statistics. DSG is measured by using net effective field goal percentage, net offensive percentage, and net percentage, and includes several positive per-game constants.

However, the most important statistic Shea and Baker introduced is Approximate Value (AV). AV is calculated by adding together Defensive Points Saved (or Defensive Stops Gained * 2) and Efficient Offensive Production in order to describe each player's total contribution to the team. The resulting statistic is comparable to Player Efficiency Rating (PER) and Wins Produced (WP) as the most complete measurement of a player’s total performance.

Stephen Shea’s Basketball Analytics: Spatial Tracking (Shea, 2014) was also used in this thesis as background knowledge and research. Using the new spatial player tracking data collected by SportVU, Shea expands on his previous work and describes new ways to measure performance. Using effective field goal percentage, Shea shows that the most efficient regions on the floor to shoot from are the corner three and the restricted area near the basket, and that catch-and-shoot attempts are far more efficient than pull-up attempts. Shea was able to predict a team’s effective field goal percentage and overall offensive efficiency from its use of drives and catch-and-shoot corner threes, indicating that drives and kick-outs to corner threes combined have a positive effect on offensive efficiency (Shea, 2014).

Shea was also able to quantify the spacing in an offense and the stretch of a defense, and showed the effects that both have in game situations. Using the Miami Heat and San Antonio Spurs of 2013 and 2014 (both teams made the NBA Finals both seasons) as examples, spacing was shown to be beneficial on the offensive end, as long as efficient shooters were on the floor to draw the defense away from the basket. On the other hand, stretching the defense proved to be disastrous for the defensive team, as drawing defenders further from the rim allows the offense more space to perform drives to the restricted area and other actions detrimental to the integrity of the defense.

Theoretical Framework

In this paper, the theoretical background is focused on the production frontier and the corresponding production function. In terms of an economic or production theory view, a basketball team can be compared to a competitive firm. Each team has a different view for a successful team, and in this case, production can be seen as winning percentage while the various statistics and variables take the place of the traditional production inputs. The form of the function is the Cobb-Douglas production function:

$$Q = A X_1^{a} X_2^{b} \cdots X_n^{x} \tag{3.1}$$

where n is the number of variables involved in the production function, A is a positive constant, and a, b, and x are the exponents of the function. The Cobb-Douglas form has several advantages for this type of study, especially since the exponents give relevant information concerning returns to scale. Breaking down the derivation of the Cobb-Douglas form: if $Q_1 = A X_1^{a} X_2^{b}$ and the firm doubles the amount of both variables, the enterprise produces $Q_2 = A (2X_1)^{a} (2X_2)^{b} = 2^{a+b} A X_1^{a} X_2^{b}$. Thus, the output increases by a factor of $Q_2/Q_1 = (2^{a+b} A X_1^{a} X_2^{b}) / (A X_1^{a} X_2^{b}) = 2^{a+b}$. If a + b > 1, the firm (or team, in this case) is experiencing increasing returns to scale. If the sum of the exponents is equal to 1, the team is exhibiting constant returns to scale. Finally, if a + b < 1, the team is experiencing decreasing returns to scale.
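The doubling argument can be checked numerically. The sketch below is an illustration under assumed values: the constant, the input levels, and the three exponent pairs are hypothetical, and the helper names are not from the thesis. It confirms that doubling both inputs multiplies output by 2 raised to the sum of the exponents, and classifies returns to scale from that sum.

```python
import math


def cobb_douglas(scale, inputs, exponents):
    """Return scale * product(x_i ** e_i) for paired inputs and exponents."""
    output = scale
    for x, exponent in zip(inputs, exponents):
        output *= x ** exponent
    return output


def returns_to_scale(exponents):
    """Classify returns to scale from the sum of the exponents."""
    total = sum(exponents)
    if math.isclose(total, 1.0):
        return "constant"
    return "increasing" if total > 1.0 else "decreasing"


A = 1.5              # hypothetical positive constant
inputs = (4.0, 9.0)  # hypothetical input levels X1 and X2
for exponents in ((0.3, 0.4), (0.5, 0.5), (0.8, 0.6)):
    q1 = cobb_douglas(A, inputs, exponents)
    q2 = cobb_douglas(A, [2 * x for x in inputs], exponents)
    # Doubling every input multiplies output by 2 ** (a + b).
    assert math.isclose(q2 / q1, 2 ** sum(exponents))
    print(exponents, round(q2 / q1, 4), returns_to_scale(exponents))
```

The ratio does not depend on the constant A or on the starting input levels, which is why the exponents alone determine returns to scale.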

Graph 1: Output and Marginal Product of Input X

[Graph 1 has two panels: Output (Q) plotted against Input X, and Marginal Product of Input X plotted against Input X.]

Another appealing property of the Cobb-Douglas form is the behavior of its marginal products. In the Cobb-Douglas production function, the marginal product of each input depends on the levels of the other inputs. If the function is still $Q_1 = A X_1^{a} X_2^{b}$, the marginal product of $X_1$ would be:

$$MP_{X_1} = \frac{\partial Q_1}{\partial X_1} = a A X_1^{a-1} X_2^{b}, \tag{3.2}$$

while the marginal product of $X_2$ would be equivalent to

$$MP_{X_2} = \frac{\partial Q_1}{\partial X_2} = b A X_1^{a} X_2^{b-1}. \tag{3.3}$$

After taking the partial derivatives, we can see that the marginal product of one input depends not only on the level of the input in question but also on the values of the other inputs. This is important for this particular study because in basketball, the number of shots taken and subsequent field goals made by a team depends on the number of opportunities to possess the ball through rebounds, forcing turnovers, etc.
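The dependence of one input's marginal product on the other input can be illustrated numerically. The sketch below assumes a two-input function Q = A * X1^a * X2^b with hypothetical parameter values, uses the analytic derivative a * A * X1^(a-1) * X2^b, and compares it against a central finite difference. All names and numbers are assumptions made for illustration.

```python
def production(scale, x1, x2, a, b):
    """Two-input Cobb-Douglas output: scale * x1**a * x2**b."""
    return scale * x1 ** a * x2 ** b


def marginal_product_x1(scale, x1, x2, a, b):
    """Analytic partial derivative of output with respect to x1."""
    return a * scale * x1 ** (a - 1) * x2 ** b


def numeric_derivative_x1(scale, x1, x2, a, b, step=1e-6):
    """Central finite-difference approximation of the same derivative."""
    upper = production(scale, x1 + step, x2, a, b)
    lower = production(scale, x1 - step, x2, a, b)
    return (upper - lower) / (2 * step)


params = dict(scale=1.5, a=0.3, b=0.4)  # hypothetical constant and exponents
for x2 in (4.0, 9.0):
    analytic = marginal_product_x1(x1=5.0, x2=x2, **params)
    numeric = numeric_derivative_x1(x1=5.0, x2=x2, **params)
    # The marginal product of X1 changes when only X2 changes.
    print(x2, round(analytic, 6), round(numeric, 6))
```

With a positive exponent on X2, raising X2 raises the marginal product of X1, which mirrors the basketball intuition that extra possessions make each unit of shooting skill more productive.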

The Cobb-Douglas production function also captures elasticity in a convenient manner. Elasticity is defined as the percentage change in one variable in response to a given percentage change in another variable while holding all other relevant variables constant. In this particular form, the exponents of the respective inputs can be interpreted as elasticities. For example, in a traditional Cobb-Douglas production function $Q = A X_1^{a} X_2^{b}$, if exponent a = 0.2, a 1% increase in $X_1$ would lead to approximately a 0.2% increase in output Q. Finally, the Cobb-Douglas form is a widely used specification when dealing with production functions. This makes the Cobb-Douglas form familiar to many and therefore relatively simple to interpret. For this model, the variables not already expressed as percentages are transformed by the natural log in order to undo the exponentiation of the Cobb-Douglas production function. This process allows the exponents to be estimated as regression coefficients, and also permits the coefficients to be interpreted as the elasticity for each respective input. Taking the natural log also turns the multiplicative function into one that is linear in the logged variables, so it can be estimated by Ordinary Least Squares.
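To illustrate why the log transformation lets the exponents be read off as regression coefficients, the sketch below generates noise-free synthetic data from a two-input Cobb-Douglas function and fits ln Q = ln A + a ln X1 + b ln X2 by ordinary least squares. Everything here is an assumption for illustration: the parameter values, the synthetic inputs, and the small hand-written solver are not the thesis's data or estimation software.

```python
import math
import random


def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def ols(design, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical "true" parameters, used only to generate synthetic data.
A_TRUE, A_EXP, B_EXP = 2.0, 0.2, 0.5
rng = random.Random(0)
x1 = [rng.uniform(1.0, 10.0) for _ in range(50)]
x2 = [rng.uniform(1.0, 10.0) for _ in range(50)]
q = [A_TRUE * u ** A_EXP * v ** B_EXP for u, v in zip(x1, x2)]

# After taking logs, the model is linear: ln Q = ln A + a ln X1 + b ln X2.
design = [[1.0, math.log(u), math.log(v)] for u, v in zip(x1, x2)]
log_q = [math.log(value) for value in q]
intercept, a_hat, b_hat = ols(design, log_q)
print(round(math.exp(intercept), 4), round(a_hat, 4), round(b_hat, 4))
```

Because the synthetic data contain no error term, the fitted slopes recover the exponents up to floating-point error; with real data the estimates would carry sampling error, and a statistics package would normally replace the hand-written solver.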

Data and Methodology

For this project in particular, the functional form of the model will be:

$$
\begin{aligned}
Q = X \cdot{} & (\mathrm{OR})^{a}\,(\mathrm{3P\%})^{b}\,(\mathrm{3PAPG})^{c}\,(\mathrm{PITPPG})^{d}\,(\mathrm{RPG})^{e}\,(\mathrm{APG})^{f}\,(\mathrm{FBPPG})^{g}\,(\mathrm{TOVPG})^{h}\,(\mathrm{FT\%})^{i} \\
& (\mathrm{FTAPG})^{j}\,(\mathrm{FG\%})^{k}\,(\mathrm{\%TS3})^{l}\,(\mathrm{\%3FGMA})^{m}\,(\mathrm{C3\%})^{n}\,(\mathrm{DR})^{o}\,(\mathrm{OFG\%})^{p}\,(\mathrm{O3P\%})^{q} \\
& (\mathrm{OTOVPG})^{r}\,(\mathrm{STLPG})^{s}\,(\mathrm{NR})^{t}\,(\mathrm{DRAFT})^{u}\,(\mathrm{TRADE})^{v}\,(\mathrm{FA})^{w}\,(\mathrm{FANS})^{x}\,(\mathrm{AGE})^{y}\,(\mathrm{CONF})^{z}\,U
\end{aligned}
\tag{4.1}
$$

where:

• Q = regular season win percentage,
• OR = offensive rating, or number of points scored per 100 possessions,
• 3P% = 3-point percentage,
• 3PAPG = 3-point attempts per game,
• PITPPG = points in the paint per game,
• RPG = rebounds per game,
• APG = assists per game,
• FBPPG = fast-break points per game,
• TOVPG = turnovers per game,
• FT% = free-throw percentage,
• FTAPG = free-throw attempts per game,
• FG% = field-goal percentage,
• %TS3 = percent of total shots from 3-point territory,
• %3FGMA = percent of 3-point field goals made assisted by a teammate,
• C3% = corner 3-point percentage,
• DR = defensive rating, or number of points allowed per 100 possessions,
• OFG% = opponent’s field-goal percentage,
• O3P% = opponent’s 3-point percentage,
• OTOVPG = opponent’s turnovers per game,
• STLPG = steals per game,
• NR = net rating, or the difference between offensive and defensive rating (OR - DR),
• DRAFT = number of players acquired through the draft or draft rights trade,
• TRADE = number of players acquired through trade,
• FA = number of players acquired through free agency,
• FANS = average attendance per game,
• AGE = average age of the team,
• CONF = dummy variable describing the conference that each team plays in,

and X is a positive constant and U is the error term.

An extensive list of traditional, advanced, and shooting statistics for individual players, teams, and opponents is tabulated and recorded by the NBA. The database on NBA.com, as well as the data located on basketball-reference.com and basketball.realgm.com, provided the data for this project. Data were gathered on the NBA regular seasons from the 2010-11 season through the 2014-15 season, using per-game averages for the majority of statistics and percentages for shooting statistics.

In the model, I decided to use regular season win percentage as the indicator of team success, or the output in the production function. Using win percentage rather than other methods of capturing team success, such as the ratio of final scores or absolute score differences, is important because it describes the success of the team over the course of the entire season while showing the consistency of the team. Using win percentage also accounts for a team’s playing style and does not differentiate between high and low scoring teams.

Offensive, defensive, and net ratings are important statistics to tabulate, as they also account for a team’s playing style. Offensive rating is the number of points scored per 100 possessions and defensive rating is the number of points allowed per 100 possessions, so by definition neither differentiates between fast- and slow-paced teams. These statistics are measurements of efficiency, and ESPN’s John Hollinger refers to them as offensive and defensive efficiency. Net rating is the difference between offensive and defensive rating, and measures a team’s point differential per 100 possessions. Over the 150 observations (30 teams over 5 seasons), the minimum net rating was -15.5, posted by the historically bad Charlotte Bobcats in the lockout-shortened 2012 season. The highest net rating over the last five seasons was 11.4, achieved in the 2015 season by the defending champion Golden State Warriors.
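The rating definitions above reduce to simple arithmetic. The sketch below is a hedged illustration: the function names and season totals are hypothetical, and it assumes for simplicity that a team and its opponents use the same number of possessions, which is only approximately true in practice.

```python
def rating_per_100(points, possessions):
    """Points per 100 possessions (offensive or defensive rating)."""
    if possessions <= 0:
        raise ValueError("possessions must be positive")
    return 100.0 * points / possessions


def net_rating(points_for, points_against, possessions):
    """Net rating = offensive rating - defensive rating (OR - DR)."""
    offensive = rating_per_100(points_for, possessions)
    defensive = rating_per_100(points_against, possessions)
    return offensive - defensive


# Hypothetical season totals (illustration only).
print(rating_per_100(9000, 8000))    # offensive rating: 112.5
print(rating_per_100(8200, 8000))    # defensive rating: 102.5
print(net_rating(9000, 8200, 8000))  # net rating: 10.0
```

Dividing by possessions rather than games is what removes pace from the comparison: two teams with the same points per possession get the same rating regardless of how many possessions they play.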

Teams in the NBA have increasingly utilized the 3-point shot over the last five seasons, as average 3-point attempts per team have gone up in each successive year, from 18 attempts per game in 2011 to 22.4 per game in 2015. However, shooting percentages have stayed relatively similar over the last five seasons, with the league-wide average holding steady around 35% from beyond the 3-point line. As previously mentioned, I expect that shooting an above-average 3-point percentage, shooting a high volume of 3-point shots, and keeping opposing 3-point percentage down are all positive descriptors of successful team performance. However, when running the model through the regression, the coefficient estimate for 3-point percentage is expected to be the elasticity of non-corner 3-point percentage, as corner 3-point percentage is also included as an input.

3-point percentage from the corner is also expected to be an important factor in determining win percentage, as the corner three is the shortest 3-point shot and therefore the most efficient shot from 3-point range. Corner threes are 22 feet from the rim, while 3-point shots above the break around the rest of the arc are 23 feet, 9 inches. Compared to the league-wide average 3-point percentage of 35%, the average corner 3-point percentage over the last five years has been 38.5%.

Assisted 3-point shots are also expected to be an indicator of successful team play, as a high number of assisted shots tends to lead to catch-and-shoot 3-point shots, open attempts, or both, which are generally higher-percentage shots. The average percent of 3-point shots that have been assisted by teammates over the last five years is about 85%.

The mean for points in the paint has stayed steady over the last five seasons at just over 41 points per game. Nevertheless, points in the paint are a strong indicator of a successful offense, as shots close to the basket are the most efficient and effective shots an offense can generate on a consistent basis.

Rebounds and assists per game are traditional statistics and historically strong indicators of team success. Offensive rebounds give teams extra possessions, while rebounds on the defensive end help to end possessions and start offense. Assists are good signs of ball movement on the offensive side and tend to lead to higher-percentage shots for teammates. The average number of rebounds per game per team has increased slightly throughout the five successive seasons in question from 41.4 to 43.3, indicating a slight increase in pace – or possessions per game – during the same timespan. Meanwhile, assists per game have had slight peaks and valleys over the last half-decade but have remained steady at just under 22 assists per game.

Turnovers, opponent’s turnovers, steals, and fast-break points per game are closely related and are very important in determining easy shots for teammates and for the opposition as well. Although steals are closely correlated with forced turnovers, as a steal constitutes a turnover, forced turnovers comprise more than steals and also include dead-ball turnovers. However, steals force a live-ball turnover and tend to lead to fast-break points in transition. I anticipate that forcing turnovers, steals, and fast-break points have a positive impact on win percentage. On the other hand, a team’s own turnovers give the same advantage to the opposition and should have a negative impact on team success.

I anticipate that free throw percentage and volume of free throw attempts are positive indicators of team success, as drawing fouls places the opposition in foul trouble. The average free-throw percentage over the last five seasons has been about 75%, indicating that a team shooting only free throws, at two attempts per possession, would score about 1.5 points per possession, or an offensive rating of 150 over 100 possessions, which would be the highest offensive rating in history.

Field-goal shooting (offensively and defensively) is expected to have a positive (or negative, in the case of opposing field goal percentage) impact on win percentage. In a hypothetical situation in which all other variables are equal, a team that shoots a higher percentage from the field than its opposition has a tangible advantage.

Although efficiency statistics such as true shooting percentage and effective field goal percentage can be useful as well, the results can be skewed when using such statistics. True shooting percentage and effective field goal percentage are composed of other statistics (such as field goals, free throw percentage, field goals attempted, and three-point shots made) and, when combined with those original statistics in the ordinary least squares regression model, can introduce multicollinearity and distort the results. These types of statistics also do not differentiate between teams that focus on scoring in the paint and teams that shoot a high volume of outside shots. For example, if Team A shot 40 of 100 while making 20 three-point shots and Team B shot 50 of 100 with zero three-pointers made, both teams would end up with 100 points from the same number of attempts. Although this scenario is highly improbable, similar situations can occur, and the resulting statistics cannot tell us the style of play that each team employs.
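The Team A and Team B example can be worked through directly. The sketch below uses the standard definition of effective field goal percentage, (FGM + 0.5 * 3PM) / FGA, which is an assumption brought in here since the thesis does not spell out the formula; the function names are likewise illustrative.

```python
def points_from_field_goals(fgm, three_pm):
    """Points from field goals only: 2 per made two, 3 per made three."""
    return 2 * (fgm - three_pm) + 3 * three_pm


def effective_fg_pct(fgm, three_pm, fga):
    """Effective field goal percentage: (FGM + 0.5 * 3PM) / FGA."""
    return (fgm + 0.5 * three_pm) / fga


team_a = {"fgm": 40, "three_pm": 20, "fga": 100}  # perimeter-heavy team
team_b = {"fgm": 50, "three_pm": 0, "fga": 100}   # no made threes
for name, team in (("Team A", team_a), ("Team B", team_b)):
    points = points_from_field_goals(team["fgm"], team["three_pm"])
    print(name, points, effective_fg_pct(**team))
# Both lines show 100 points and an eFG% of 0.5.
```

Both teams land on the same point total and the same effective field goal percentage even though their shot profiles are opposite, which is the reason separate style-of-play inputs such as the share of shots from 3-point range are kept in the model.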

The average attendance per game should have a positive relationship with regular season winning percentage. However, this may reflect correlation rather than causation, as teams that win tend to draw a larger crowd. On the other hand, teams playing at home have historically held an advantage over their visiting opponents due to “superior performance by the home team and not preferential treatment by officials” (Zak, Huang, & Siegfried, 1979). Chicago has led the league in attendance in all of the last five years at an average of nearly 22,000 per game and has ranked fourth in the league in win percentage with an average of over 65% over the same span.

The average age is expected to have a positive relationship with success until a certain point, around 29, after which the relationship is expected to diminish as players age. The championship teams over the last five seasons have had varying mixes of players in terms of roster composition. The LeBron James-led Miami Heat from 2011 to 2014 were built primarily through free agency, as the Heat had the most free agents in the league when building rosters from 2011-2013 and ranked second and third in 2014 and 2015, respectively. On the other hand, the San Antonio Spurs have maintained a consistent core of players acquired on draft day, ranking in the top 10 in the league in the last five years.

Recent Trends in Basketball Statistics

The recent partnership between the NBA and SportVU has dramatically expanded the range of statistics available to the public, as the system provides new precise data that would not be possible to gather without the use of SportVU camera technology and tracking software. As a result, basketball is experiencing a renaissance of sorts with data and statistics. Prior to player tracking, capturing the dynamic movements within a basketball game was nearly impossible due to the fluidity and complexity of actions that take place on the floor. The camera technology that SportVU has now implemented in every NBA arena follows the ball and every player on the court, providing real-time player and ball positioning and utilizing advanced statistical algorithms to derive previously unavailable statistics.

Before the NBA’s partnership with SportVU, the available statistics were weakest on the defensive side of the game, as steals and blocks have historically been poor or neutral indicators of defensive ability. Although steals and blocks are still important to track, as they effectively end an opponent’s offensive possession and can generate transition offense, placing importance on them can incentivize players to gamble rather than prevent their man from getting to the basket or box out for defensive rebounds. As the main objective on the defensive side of the ball is to prevent the opposition from getting open shots that lead to made baskets, the player tracking system from SportVU now captures defensive presence with statistics such as opponent’s contested field goal percentage and rim protection, while still recording the traditional statistics.

The NBA’s partnership with SportVU has revolutionized statistics for basketball, as many new figures are now available to teams as well as the public. However, player tracking data is only available from the 2013-2014 NBA season onward, as the SportVU camera technology was only implemented into all NBA arenas in 2013. As this project is dealing with data from the 2010-11 season through the 2014-15 season, this project will unfortunately be devoid of player tracking statistics. However, future research can be conducted with this new technology, as similar ideas can be used with player tracking data to discover new areas of importance that have not been previously categorized.

Methodology

There are four models in place within the larger construct of this project. All four models have regular season winning percentage from 2010-11 to 2014-15 as the dependent variable. The first model is purely used as a reference to the succeeding models, and consists of net rating as the only independent variable. The second model will be constrained to the play style statistics, while the third model will consist of the fixed team effects. The final model will tie both play style and team effects together in order to gain a picture of a successful franchise, on and off the court.

The final three models will be run through OLS regression several times, using F-tests in order to determine the significance of the independent variables. Multicollinearity tests will also be performed, as the large set of interrelated variables suggests that collinearity is potentially present within the inputs. Although multicollinearity does not violate OLS assumptions, acknowledging it where present is important, as it can inflate the standard errors of the resulting coefficient estimates and make them unstable.

Regression Results and Analysis

After using the natural logarithm to transform the variables that are not already expressed as percent values (such as rebounds per game, points in the paint, average attendance, etc.), the regression coefficients from ordinary least squares regression can be interpreted as the elasticity for each input. However, certain inputs (net rating, players acquired on draft day, and players acquired via trade) have at least one negative or zero value within the dataset. Therefore, the original values of these inputs must be used instead of the logged form, as the natural log of a negative number or zero is undefined.
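The transformation rule described above can be sketched as follows. This is only an illustration of the rule, not the procedure actually run for this study; the function name and the sample values are hypothetical.

```python
import math


def transform_input(values):
    """Log-transform a series only if every value is strictly positive."""
    if all(v > 0 for v in values):
        return "log", [math.log(v) for v in values]
    # ln(x) is undefined for x <= 0, so keep the series in levels.
    return "level", list(values)


# Hypothetical sample values, for illustration only.
rebounds_per_game = [41.2, 43.8, 45.1]
net_rating = [4.5, 0.0, -3.2]

rebounds_form, rebounds_logged = transform_input(rebounds_per_game)
net_rating_form, net_rating_kept = transform_input(net_rating)
print(rebounds_form, net_rating_form)
```

A series such as rebounds per game is always positive and is logged, while a series such as net rating, which can be zero or negative, stays in its original units.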

The first model consists of winning percentage as the dependent variable and net rating as the single independent variable. Net rating is an exceptionally good predictor of winning, as it describes the difference between offensive and defensive efficiency. This can also be seen as the difference between points scored per 100 possessions and points allowed per 100 possessions. This first model will be used primarily as a reference point for the ensuing models.

Table 1: Net Rating Regression Results

WP          Coefficient  Std. Error  t-score  P-value  95% CI
NR               0.0293      0.0006    48.30    0.000  0.0281, 0.0305
X                0.4999      0.0031   159.53    0.000  0.4938, 0.5062

R-squared: 0.9403   Adjusted R2: 0.9399   Residual: 0.2181   Prob > F: 0.0000

Note: WP = winning percentage, NR = net rating, X = positive constant, CI = confidence interval

First, we look to the p-value of the F-test. As shown in Table 1, the p-value is 0.0000, indicating that the overall model is statistically significant. In the dataset, the values for winning percentage are recorded to three decimal places (50% is logged as 0.500, for example). As shown in Table 1, the coefficient for net rating is 0.0293. This value indicates that for every one-point increase in net rating, winning percentage increases by 2.93 percentage points (from 0.500 to 0.5293, to continue the example). The R-squared value of the model is 0.9403 and the adjusted R-squared value is 0.9399, indicating that the model explains approximately 94 percent of the variability of the data around the mean. As shown below, net rating and winning percentage have a very strong positive relationship, and the data points fit the regression line remarkably well.
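As a rough illustration of how the Table 1 estimates can be read, the sketch below plugs a net rating into the fitted line. The intercept and slope are the values reported in Table 1; the function names and the 82-game conversion are added here for illustration, and the straight line should not be trusted far outside the range of net ratings observed in the data.

```python
# Estimates reported in Table 1.
INTERCEPT = 0.4999        # constant X
NET_RATING_COEF = 0.0293  # coefficient on net rating (NR)


def predicted_win_pct(net_rating):
    """Fitted winning percentage for a given net rating."""
    return INTERCEPT + NET_RATING_COEF * net_rating


def predicted_wins(net_rating, games=82):
    """Fitted win total over a regular season of `games` games."""
    return predicted_win_pct(net_rating) * games


for nr in (-5.0, 0.0, 1.0, 5.0):
    print(nr, round(predicted_win_pct(nr), 4), round(predicted_wins(nr), 1))
```

A team that outscores its opponents by one point per 100 possessions is fitted at roughly 0.529, while a team with a net rating of zero is fitted at almost exactly 0.500.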

Graph 2: Net Rating vs. Win Percentage

The second model is constrained to the play style variables. The objective behind keeping this particular model to the play style variables is to help define and describe on-the-court success while setting aside the actions taking place behind the scenes. It is important to differentiate the effects of play style variables from the fixed effects of the franchise before combining the two in the final model, as teams can potentially have success on the floor without having established fixed team variables in place. Another reason for keeping this model constrained to play style inputs is to help determine if the fixed team effects are important and significant to team success on the floor. This model is attempting to describe the effect that coaching has on winning, as these play style variables are mostly determined by coaching decisions.

Table 2: Play Style Regression Results

WP          Coefficient  Std. Error  t-score  P-value  95% CI
3P%             -0.0496      0.3046    -0.16    0.871  -0.6522, 0.5529
FT%              0.0386      0.1585     0.24    0.808  -0.2749, 0.3521
FG%             -1.3520      0.7773    -1.74    0.084  -2.8899, 0.1859
%TS3             0.8745      0.5786     1.51    0.133  -0.2702, 2.0192
%3FGMA           0.1261      0.0974     1.29    0.198  -0.0665, 0.3187
C3P%            -0.1051      0.1457    -0.72    0.472  -0.3932, 0.1831
OFG%            -0.3627      0.6610    -0.55    0.584  -1.6705, 0.9450
O3FG%            0.1481      0.2876     0.51    0.607  -0.4209, 0.7171
logOR            3.5094      0.4504     7.79    0.000  2.6184, 4.4004
log3PAPG        -0.2086      0.1310    -1.59    0.114  -0.4677, 0.0506
logPITPPG        0.1296      0.0697     1.85    0.066  -0.0088, 0.2669
logRPG          -0.2308      0.2002    -1.15    0.251  -0.6269, 0.1653
logAPG           0.0648      0.0593     1.09    0.277  -0.0526, 0.1821
logFBPPG         0.0003      0.0220     0.01    0.990  -0.0432, 0.0437
logTOVPG         0.0179      0.0884     0.20    0.840  -0.1569, 0.1927
logFTAPG        -0.0413      0.0509    -0.81    0.419  -0.1421, 0.0595
logDR           -3.0305      0.3121    -9.71    0.000  -3.648, -2.4131
logOTOVPG       -0.0786      0.1116    -0.70    0.482  -0.2994, 0.1421
logSTLPG        -0.0173      0.0719    -0.24    0.811  -0.1595, 0.1249
X               -0.1566      1.9754    -0.08    0.937  -4.0647, 3.7515

R-squared: 0.9470   Adjusted R2: 0.9393   Residual: 0.1936   Prob > F: 0.0000

Note: 3P% = 3-point percentage, FT% = free throw percentage, FG% = field goal percentage, %TS3 = percent of total shots from 3, %3FGMA = percent of assisted 3-point field goals made, C3P% = corner 3-point percentage, OFG% = opponent field goal percentage, O3FG% = opponent 3-point percentage, logOR = natural log of offensive rating, log3PAPG = natural log of 3-point attempts per game, logPITPPG = natural log of points in the paint per game, logRPG = natural log of rebounds per game, logAPG = natural log of assists per game, logFBPPG = natural log of fast break points per game, logTOVPG = natural log of turnovers per game, logFTAPG = natural log of free throw attempts per game, logDR = natural log of defensive rating, logOTOVPG = natural log of opponent turnovers per game, logSTLPG = natural log of steals per game

As shown in Table 2, the p-value of the F-test is 0.0000, indicating that the overall model is statistically significant. Similar to the previous model, the R-squared and adjusted R-squared values are 0.9470 and 0.9393, respectively, indicating that the data fits the regression line and that the model explains much of the variability around the mean. Despite this fit, most of the variables have P-values above 0.05, indicating that they are not significant at the 95% level. The natural log of offensive rating and the natural log of defensive rating are the only two inputs that are significant to winning percentage in this second model, and are statistics that model efficient offenses and defenses. However, many of the other inputs involved in the model are pieces that are needed to have high offensive ratings and low defensive ratings. Therefore, if we break down the play style variables into offensive and defensive categories, the resulting models should result in significant variables describing offensive and defensive rating.

Table 2.1: Offensive Rating Regression Results

OR          Coefficient  Std. Error  t-score  P-value  95% CI
3P%             34.3029      5.8727     5.84    0.000  22.6894, 45.9166
FT%             21.8690      2.7661     7.91    0.000  16.3989, 27.3392
FG%            129.4964      8.1827    15.83    0.000  113.3145, 145.6782
%TS3            24.7104      9.9754     2.48    0.014  4.9834, 44.4375
%3FGMA          -1.8888      2.0296    -0.93    0.354  -5.9024, 2.1249
C3P%            -5.8564      3.0983    -1.89    0.061  -11.9836, 0.2706
log3PAPG        -0.0408      2.3165    -0.02    0.986  -4.6217, 4.5401
logPITPPG        5.6671      1.2556     4.51    0.000  3.1842, 8.1500
logRPG           9.2363      1.9745     4.68    0.000  5.3317, 13.1410
logAPG          -3.1921      1.2620    -2.53    0.013  -5.6877, -0.6965
logFBPPG         0.1197      0.4331     0.28    0.783  -0.7367, 0.9761
logTOVPG       -15.5971      1.1179   -13.95    0.000  -17.8078, -13.3865
logFTAPG         7.8618      0.8213     9.57    0.000  6.2377, 9.4859
X              -15.2168      8.3950    -1.81    0.072  -31.8183, 1.3848

R-squared: 0.9467   Adjusted R2: 0.9416   Residual: 98.0228   Prob > F: 0.0000

The p-value of the F-test is 0.0000, signifying that the overall model is statistically significant. The R-squared and adjusted R-squared values have stayed relatively similar to Table 2, indicating that the model still explains much of the variability around the regression mean. Unlike in Table 2, however, many of the inputs in Table 2.1 have p-values less than 0.05, indicating that these variables are statistically significant at the 95% level. The coefficient value for each variable indicates the estimated change in offensive rating given a one-unit change in that variable (for the variables transformed by the natural log, a one percent change in the variable corresponds to roughly one hundredth of the coefficient), given that all other variables in the model are held constant.

The percent of three-pointers made from an assist has a negative coefficient estimate and a p-value of 0.354, and is therefore insignificant to offensive rating. This result is somewhat unexpected, as assisted three-point makes should theoretically come from uncontested three-point attempts, which tend to be a mark of a good offense. Perhaps a better variable to tabulate in future research would be the number of uncontested three-point attempts a team is able to generate per game, as this variable would describe both the ball movement and the spacing of an offense. However, this type of variable is produced through SportVU’s player tracking system (explained in Data and Methodology) and is unavailable for the entire period of this study.

Three-point attempts per game are also insignificant to offensive rating, as the natural log of 3-point attempts per game has a p-value of 0.986. This result can be explained, as bad offenses can be liable to jack up three-point attempts at a low-percentage, high-volume rate, while teams with a dominant inside presence can establish a good offense without a high volume of three-point attempts. An example of a bad offense that takes a high volume of low-percentage three-point attempts is the 2015 Philadelphia 76ers. Although Philadelphia has been accused on numerous occasions of tanking in the present in order to build for the future, the fact remains that Philly had the 11th highest rate of three-point attempts per game among all teams during the last five seasons at over 26 attempts per game while generating a paltry 93 offensive rating, second worst over the same time span. On the other hand, the Memphis Grizzlies of 2011 attempted only 11.3 three-pointers per game, which was the lowest mark over the last five seasons. However, the twin towers of Marc Gasol and Zach Randolph helped lead Memphis to an offensive rating of 104.4 during the 2011 season, an above-average mark over the last five seasons. In fact, Memphis has averaged only 13.4 three-point attempts per game over the given time frame, lowest in the league, while maintaining a league-average offensive rating.

Fast break points are also an insignificant input to offensive rating, as the p-value is 0.783. This result can be explained through each team’s preference of style, as the Denver Nuggets, Houston Rockets and Golden State Warriors have combined for 7 of the 15 highest totals in fast break points. On the other hand, the New York Knicks and Brooklyn Nets have combined for 6 of the 10 lowest totals in fast break points per game over the last five seasons, all to varying degrees of success.

Corner three-point percentage has a negative coefficient and is insignificant at the 95% level (albeit just barely, as the p-value is 0.061), which is the most unexpected result obtained from Table 2.1. A potential explanation for the input’s lack of significance to offensive rating is that shooting percentage from the corner does not account for the number of attempts, particularly open attempts, that an offense is able to generate. In hindsight, a variable that could potentially have more success predicting offensive rating is the total number of corner threes and uncontested corner threes that a team attempts per game. The total number of attempts a team is able to produce from the corner should be indicative of an offense that spreads the floor and moves the ball, as good defenses tend to focus on defending the corner three.

Non-corner three-point percentage, field goal percentage, and free throw percentage are all significant at the 99.9% level and all three inputs have highly positive coefficient estimates. The coefficient results indicate that shooting percentages have very strong influences on offensive rating, with field goal percentage being the best indicator. This is not a new result, as good offenses are historically dependent on being able to score at an efficient percentage from the field and from the free throw line.

Percent of total shots attempted from three-point range has a positive coefficient of 24.7, indicating that teams that place greater emphasis on three-point attempts in their shot selection tend to be associated with a more efficient offense. Compared to total attempts per game from three, this input is achieved through a team’s gameplan and accounts for the pace and total shot attempts. The variable is also statistically significant at the 98.6% level, as the p-value is 0.014.

The natural log of points in the paint has a positive coefficient estimate of 5.7. Because offensive rating enters the model in levels, this indicates that a one percent increase in points in the paint per game is associated with an increase of roughly 0.057 points in offensive rating (the coefficient divided by 100). This result makes sense from a basketball perspective, as points in the paint come from the closest shots to the basket a team can produce and are therefore the most efficient. The variable is also statistically significant at the 99.9% level.

Rebounds per game also have a strong positive connection with offensive rating, as the coefficient for the natural log of rebounds per game is 9.2. The result denotes a strong association between rebounds and offensive efficiency, as a one percent increase in rebounds per game is associated with an increase of roughly 0.092 points in offensive rating.
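One conventional way to read the coefficients on the logged inputs, assuming that offensive rating itself enters Table 2.1 in levels (as the magnitudes of the estimates suggest), is that a p percent increase in the input changes offensive rating by the coefficient times ln(1 + p/100). The sketch below applies that reading to the Table 2.1 estimates; the function name is illustrative.

```python
import math

# Coefficients on logged inputs reported in Table 2.1.
COEF_LOG_PITPPG = 5.6671  # natural log of points in the paint per game
COEF_LOG_RPG = 9.2363     # natural log of rebounds per game


def level_log_effect(coef, pct_increase):
    """Change in the dependent variable for a pct_increase (%) in the input."""
    return coef * math.log(1.0 + pct_increase / 100.0)


paint_1pct = level_log_effect(COEF_LOG_PITPPG, 1.0)
rebounds_1pct = level_log_effect(COEF_LOG_RPG, 1.0)
paint_10pct = level_log_effect(COEF_LOG_PITPPG, 10.0)

print(round(paint_1pct, 3), round(rebounds_1pct, 3), round(paint_10pct, 3))
```

Under this reading, a one percent increase in either input moves offensive rating by less than a tenth of a point, and even a ten percent increase in points in the paint corresponds to roughly half a point of offensive rating.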

The coefficient for the natural log of assists per game is negative, which is an unanticipated result. Theoretically, assists would be presumed to have a positive link to offensive rating, as assists tend to create high percentage shot attempts for teammates. However, assists have been a thorn in the side of the analytics movement in basketball, as assist percentage (percentage of field goals assisted) has historically had little influence on field goal percentage (Ziller, 2013). Conceptually, assists should have a significant and strong positive effect on a team’s shooting percentage.

Table 2.1a: Field Goal Percentage vs. Assists

FGP         Coefficient  Std. Error  t-score  P-value  95% CI
logAPG           0.1013      0.0152     6.65    0.000  0.0712, 0.1314
X                0.1410      0.0469     3.01    0.003  0.0483, 0.2337

R-squared: 0.2298   Adjusted R2: 0.2246   Residual: 0.0285   Prob > F: 0.0000

Although assists are significant to field goal percentage, the coefficient estimate for the natural log of assists per game is only 0.1. This result indicates that for every one percent increase in assists per game, field goal percentage increases by roughly 0.1 percentage points. While the effect is indeed positive, it is minimal.

Turnovers per game have a strong negative association with offensive rating, and the variable is also highly significant. This result is to be expected, as turnovers result in a loss of possession and forfeit a shot attempt. Free throw attempts per game, on the other hand, have a strong positive association with offensive rating and are also highly significant. Teams that attempt a high number of free throws per game are troublesome for opposing defenses to deal with, as free throws place opposing teams in foul trouble, which tends to limit the minutes of the guilty players. Free throws are also a very efficient source of offense, as the league average has hovered around 75% over the last five seasons.

Table 2.2: Defensive Rating Regression Results

DR          Coefficient  Std. Error  t-score  P-value  95% CI
OFG%           190.9438      9.2621    20.62    0.000  172.6366, 209.251
O3P%            25.8783      8.2924     3.12    0.002  9.4877, 42.2688
logRPG           1.4735      2.5516     0.58    0.564  -3.5698, 6.5169
logOTOVPG       -7.9471      2.9019    -2.72    0.007  -13.6830, -2.2111
logSTLPG         0.6417      1.9624     0.33    0.744  -3.2374, 4.5203
X               25.8060      7.5349     3.42    0.001  10.9127, 40.6993

R-squared: 0.8643   Adjusted R2: 0.8596   Residual: 210.9558   Prob > F: 0.0000

Table 2.2 describes defensive rating as the dependent variable through defensive statistics. The p-value of the F-test is 0.0000, indicating that the overall model is statistically significant. The R-squared value is 0.8643, which is lower than the values for net rating and offensive rating but is still a relatively high value. The adjusted R-squared is 0.8596, meaning that approximately 86% of the variability of defensive rating is accounted for by the model, even after taking the number of predictor variables in the model into account.
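The adjustment for the number of predictors can be reproduced with the usual formula, adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1). The sketch below assumes n = 150 team-seasons (the observation count shown in Table 4.2) and takes k as the number of inputs listed in each table, excluding the constant; the function name is illustrative.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R-squared for a model with an intercept."""
    penalty = (n_obs - 1) / (n_obs - n_predictors - 1)
    return 1.0 - (1.0 - r_squared) * penalty


N_OBS = 150  # team-seasons, per the observation counts in Table 4.2

adj_table_1 = adjusted_r_squared(0.9403, N_OBS, 1)   # net rating only
adj_table_2_2 = adjusted_r_squared(0.8643, N_OBS, 5)  # defensive rating model
adj_table_3 = adjusted_r_squared(0.4755, N_OBS, 6)    # fixed team effects

print(round(adj_table_1, 4), round(adj_table_2_2, 4), round(adj_table_3, 4))
```

With these inputs the formula returns values that agree with the adjusted R-squared figures reported in Tables 1, 2.2, and 3 to four decimal places.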

Opponent field goal percentage and opponent three-point percentage are the two strongest predictors of defensive rating, and are both statistically significant at the 95% level or higher. These two results, compared to steals per game – which is statistically insignificant – imply that limiting high-percentage shots and keeping the shooting percentages of opposing teams low is more important to a stingy defense than steals.

As shown above, rebounds per game are not significant statistically. Although the coefficient is positive, the effect rebounds have on defensive rating is negligible. The variable for rebounds is total rebounds, and is therefore used in both offensive rating and defensive rating regression models.

Blocks were not used in this study, as the traditional statistic that groups all blocks into one category has generally been a poor indicator of defensive prowess. Blocks do not account for a team’s shooting percentage at the rim, nor do they account for the distance from the basket of the blocked shots attempted. Blocks are also assumed to begin transition offense or guarantee possession for the team doing the blocking, but according to Nylon Calculus, “57.2% of all blocked shots were recovered by the defense” (Willard, 2015). However, expanding blocks into multiple new statistics has the potential to be a strong descriptor of effective defense with future research.

The third model is constrained to the fixed team effects. The objective behind keeping this model to the fixed team effects is to determine if play style variables are important and significant to successful team basketball in the NBA. The question that this model is attempting to answer is: does the coach or the general manager have a higher level of accountability when it comes to success on the floor? This third model will attempt to answer this question by categorizing variables that occur off the court and determining whether these inputs are significant to regular season winning percentage.

Table 3: Fixed Team Effects Regression Results

WP          Coefficient  Std. Error  t-score  P-value  95% CI
CONF             0.0435      0.0194     2.24    0.026  0.0052, 0.0819
DRAFT            0.0501      0.0165     3.04    0.003  0.0176, 0.0827
TRADE            0.0327      0.0161     2.03    0.044  0.0009, 0.0646
FA               0.0309      0.0159     1.96    0.053  -0.0003, 0.0623
logFANS          0.3216      0.0919     3.50    0.001  0.1399, 0.5031
logAGE           1.1451      0.1821     6.29    0.000  0.7852, 1.5051
X               -7.0172      0.8785    -7.99    0.000  -8.7537, -5.2807

R-squared: 0.4755   Adjusted R2: 0.4535   Residual: 1.9170   Prob > F: 0.0000

Note: CONF = dummy variable (0 for eastern conference, 1 for western conference), DRAFT = number of players acquired on draft day, TRADE = number of players acquired via trade, FA = number of players acquired via free agency, logFANS = natural log of average attendance per game, logAGE = natural log of average age of team

As shown above, the p-value of the F-test is 0.0000, indicating that the overall model is statistically significant. The R-squared and adjusted R-squared values are 0.4755 and 0.4535 respectively, signifying that the model explains just over 45% of the variability around the mean. Although the R-squared values are approximately half of those in the previous two models, each input is statistically significant at the 95% level (excluding FA, which is significant at the 94.7% level). The coefficient for conference shows that Western Conference teams on average perform better than their Eastern Conference counterparts. Table 3.1 also shows this result using basic descriptive statistics:

Table 3.1: Summarization of Win Percentage by Conference

Conference            Variable  # of Obs.  Mean    Std. Dev.  Min.   Max.
Eastern Conference    WP        75         0.4681  0.1580     0.106  0.805
Western Conference    WP        75         0.5318  0.1496     0.195  0.817

Shifting the focus back to Table 3, the coefficient value for players acquired on draft day is the highest amongst the roster composition variables, and is estimated at 0.0501. This result indicates that for every additional player on the roster originally acquired on the respective player’s draft day (whether the player was drafted by his current team or the team traded for his draft rights), winning percentage is estimated to increase by 5.01 percentage points (from 0.500 to 0.5501, or 50% to 55.01%). Conceptually, this makes sense from a basketball perspective, as drafting and developing a higher number of players allows for a higher level of continuity and familiarity between the players and within each team’s respective system.

The San Antonio Spurs are an excellent example of building through the draft and having continued success. Over the years, the Spurs have made smart draft night decisions and acquired players such as Manu Ginobili and Kawhi Leonard, and have been the most successful team over the last five seasons, winning over 72% of their games. The number of players acquired through free agency and the number of players acquired via trade also both have positive coefficient estimates and are relatively similar in value, but neither variable affects winning percentage at the same rate as players acquired through the draft.

The coefficient estimate for the natural log of average attendance is 0.3216, indicating that for every one percent increase in attendance, winning percentage increases by approximately 0.32 percentage points. However, based on the results, the model is unable to specify whether a higher winning percentage causes a higher average attendance or whether the two variables are merely positively correlated.

The natural log of age has a coefficient estimate of 1.1451, indicating that when the average age of a team increases by 1 percent, the winning percentage of the team is expected to increase by roughly 1.15 percentage points. This result also makes theoretical sense on a basketball level, as teams with more veterans are more experienced and are more likely to be competing for a championship. However, the resulting estimate does not indicate that a team made up exclusively of old veteran players is expected to be more successful. The average ages over the last five years range from 23.2 to 31.3, so the coefficient is estimating success within this range.

The final model is a combination of play style variables and fixed team effects. Combining the play style and team effects variables should provide insight into the structure and focus of successful franchises. This final model is attempting to describe how the front office, coaching staff, and roster collaborate to result in accomplishments on the court.

Table 4: Play Style and Fixed Team Effects Regression Results

WP          Coefficient  Std. Error  t-score  P-value  95% CI
3P%              1.1469      0.3729     3.08    0.003  0.4089, 1.8847
FT%              0.7179      0.1787     4.02    0.000  0.3644, 1.0715
FG%              3.2704      0.6178     5.29    0.000  2.0481, 4.4929
%TS3             3.8434      0.7135     5.39    0.000  2.4316, 5.2552
%3FGMA           0.2354      0.1307     1.80    0.074  -0.0232, 0.4940
C3P%            -0.3448      0.1969    -1.75    0.082  -0.7345, 0.0449
OFG%            -2.9839      0.5703    -5.23    0.000  -4.1125, -1.8554
O3P%            -0.6903      0.3859    -1.79    0.076  -1.4539, 0.0734
log3PAPG        -0.7743      0.1676    -4.62    0.000  -1.1061, -0.4426
logPITPPG        0.1839      0.0969     1.90    0.060  -0.0079, 0.3758
logRPG           0.9821      0.2213     4.44    0.000  0.5442, 1.4199
logAPG          -0.0531      0.0806    -0.66    0.511  -0.2125, 0.1064
logFBPPG        -0.0197      0.0314    -0.63    0.531  -0.0818, 0.0424
logTOVPG        -0.5454      0.0738    -7.39    0.000  -0.6914, -0.3994
logFTAPG         0.1053      0.0546     1.93    0.056  -0.0027, 0.2133
logOTOVPG        0.3979      0.1425     2.79    0.006  0.1159, 0.6799
logSTLPG        -0.0470      0.0985    -0.48    0.634  -0.2419, 0.1478
logFANS          0.0436      0.0471     0.92    0.357  -0.0496, 0.1368
logAGE           0.3763      0.0999     3.77    0.000  0.1787, 0.5739
DRAFT            0.0163      0.0081     2.00    0.047  0.0002, 0.0324
TRADE            0.0076      0.0078     0.98    0.329  -0.0078, 0.0230
FA               0.0071      0.0076     0.94    0.350  -0.0079, 0.0220
X               -4.8313       1.048    -4.61    0.000  -6.9068, -2.7559

R-squared: 0.9055   Adjusted R2: 0.8891   Residual: 0.3455   Prob > F: 0.0000

The p-value of the F-test is 0.0000, indicating that the overall model is statistically significant. The R-squared and adjusted R-squared values are 0.9055 and 0.8891 respectively, which indicates that the model explains approximately 90% of the variability around the regression mean. The coefficient value for the positive constant X describes the predicted value if the inputs all equal zero, which does not have much significance for this particular model.

In order to compare the relative strength of each input to one another, we can take the beta scores of each variable. The beta scores are measured in standard deviations rather than the units of the original variables, which allows the relative strength of each variable within the model to be compared. The beta scores are essentially the regression coefficients that would result if the output and inputs were transformed into standard scores, or z-scores. As shown in Table 4.1, the variable with the highest positive coefficient estimate and beta coefficient is the percent of total shots taken that were three-point attempts. However, the input with the most negative beta coefficient is the natural log of three-point attempts per game.

Although the results for the highest and lowest coefficient values are initially confusing, they can potentially be explained by the shot selection of each team. Conceptually, a team taking a higher number of three-point attempts is more likely to be taking a higher percentage of lower quality shots, as a higher number of attempts signals that the team is playing at a faster pace (or number of possessions per game) and is taking a higher number of total shot attempts. High pace rankings in the past have not translated to success in the playoffs, as only the 1981-82 Los Angeles Lakers and the 2014-15 Golden State Warriors have ranked in the top five in pace since 1978 and won the championship. A team that takes a higher number of three-point attempts each game is also likely to be trailing in the game and attempting to make a comeback. However, a team that incorporates a higher percentage of threes into the total shots taken per game is theoretically not rushing to take quick shots but instead is choosing to focus on three-point attempts rather than midrange shots, which have historically been less efficient.

Below, Table 4.1 describes each independent variable and the associated beta coefficient. As mentioned previously, the coefficients are measured in standard deviations, which allows the variables to be compared amongst one another.

Table 4.1: Play Style and Fixed Team Effects Beta Coefficients

WP          Beta coefficient
3P%                   0.1468
FT%                   0.1345
FG%                   0.3291
%TS3                  1.1412
%3FGMA                0.0612
C3P%                 -0.0660
OFG%                 -0.2725
O3P%                 -0.0666
log3PAPG             -0.9914
logPITPPG             0.1071
logRPG                0.2877
logAPG               -0.0253
logFBPPG             -0.0265
logTOVPG             -0.2463
logFTAPG              0.0720
logOTOVPG             0.2072
logSTLPG             -0.0350
logFANS               0.0336
logAGE                0.1646
DRAFT                 0.1971
TRADE                 0.1018
FA                    0.0931

Since the beta coefficients are measured in standard deviations, a one standard deviation increase in three-point percentage results in an estimated increase of 0.1468 standard deviations in win percentage (as shown in Table 4.1). In order to translate the standard deviations of each respective variable into their original units, Table 4.2 summarizes the statistics used in the combined model of play style variables and fixed team effects and shows the means and standard deviations of each input as well as the output. The actual effects that the beta values describe are based on the values in Tables 4.1 and 4.2.

As shown, one standard deviation in three-point percentage is equivalent to 0.0201, or approximately two percentage points. Therefore, when three-point percentage increases by two percentage points, winning percentage is expected to increase by 0.1468 standard deviations. Since the standard deviation for winning percentage is 0.1566, or 15.66 percentage points, a one standard deviation (two percentage point) increase in three-point percentage is expected to raise winning percentage by approximately 2.3 percentage points (0.1468 × 0.1566 ≈ 0.023).
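The arithmetic linking the raw coefficients, the beta coefficients, and the standard deviations can be sketched as follows, using the usual relation beta = b × (SD of input) / (SD of output). The inputs are the rounded values printed in Tables 4, 4.1, and 4.2, so small rounding differences are to be expected; the function names are illustrative.

```python
def beta_from_raw(raw_coef, sd_input, sd_output):
    """Standardized (beta) coefficient implied by a raw OLS coefficient."""
    return raw_coef * sd_input / sd_output


def raw_effect_of_one_sd(beta, sd_output):
    """Change in the output, in original units, for a one-SD input change."""
    return beta * sd_output


SD_WIN_PCT = 0.1566  # Table 4.2, WP

# Three-point percentage: raw coefficient (Table 4) and SD (Table 4.2).
beta_3p = beta_from_raw(1.1469, 0.0201, SD_WIN_PCT)
# Percent of total shots from three.
beta_ts3 = beta_from_raw(3.8434, 0.0465, SD_WIN_PCT)
# Effect of a one-SD rise in three-point percentage, using the Table 4.1 beta.
effect_3p = raw_effect_of_one_sd(0.1468, SD_WIN_PCT)

print(round(beta_3p, 4), round(beta_ts3, 4), round(effect_3p, 4))
```

The implied betas land within rounding distance of the Table 4.1 values, and the one standard deviation effect for three-point percentage works out to about 0.023, or 2.3 percentage points of winning percentage.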

Table 4.2: Summarization of Inputs and Output

Variable    # of Obs.  Mean    Std. Dev.  Min     Max
WP          150        0.5001  0.1566     0.106   0.817
3P%         150        0.3538  0.0201     0.295   0.403
FT%         150        0.7554  0.0293     0.66    0.828
FG%         150        0.4526  0.0158     0.408   0.501
%TS3        150        0.2437  0.0465     0.136   0.392
%3FGMA      150        0.8476  0.0407     0.715   0.934
C3P%        150        0.3848  0.0299     0.319   0.467
OFG%        150        0.4525  0.0143     0.419   0.487
O3P%        150        0.3548  0.0151     0.308   0.411
log3PAPG    150        2.9792  0.2005     2.4248  3.4874
logPITPPG   150        3.7266  0.0911     3.5086  4.0604
logRPG      150        3.7449  0.0459     3.6082  3.8607
logAPG      150        3.0758  0.0746     2.9178  3.3105
logFBPPG    150        2.5728  0.2106     2.1282  3.0397
logTOVPG    150        2.6700  0.0707     2.4159  2.8736
logFTAPG    150        3.1338  0.1071     2.8094  3.4372
logOTOVPG   150        2.6691  0.0816     2.4248  2.8565
logSTLPG    150        2.0272  0.1165     1.7047  2.2618
logFANS     150        9.7595  0.1207     9.5079  10.0061
logAGE      150        3.2789  0.0685     3.1442  3.4436
DRAFT       150        5.0467  1.8943     0       10
TRADE       150        4.4600  2.0939     0       11
FA          150        5.1600  2.0598     1       10

Table 4.3 below lists the variance inflation factor (VIF) of each input in descending order. VIFs measure the degree of multicollinearity among the regression variables. Multicollinearity can be a problem, particularly when the number of inputs is high (as in this model), since closely related independent variables inflate the standard errors of the individual coefficient estimates and make them less precise. VIFs above 10 may be especially indicative of multicollinearity, while a tolerance (1/VIF) close to one means that collinearity is not an issue. As shown below, the number of three-point attempts and the percentage of total shots taken from three-point territory are very closely related and are likely affected by multicollinearity. The numbers of players acquired through trades, free agency, and the draft are also potentially affected by multicollinearity.

Although multicollinearity is clearly present in this model, each of the collinear variables contributes individual value to the overall model and will therefore remain in it. Multicollinearity remains a concern for regression models such as this one, but imperfect multicollinearity does not violate Ordinary Least Squares assumptions: it does not affect the overall fit of the model, nor does it result in inadequate prediction estimates.

Table 4.3: Play Style and Fixed Team Effects Variance Inflation Factors (VIFs)

Variable     VIF      Tolerance (1/VIF)
log3PAPG     61.89    0.016158
%TS3         60.30    0.016585
TRADE        14.54    0.068771
FA           13.26    0.075431
DRAFT        13.02    0.076787
logTOVPG     7.40     0.135161
logSTLPG     7.21     0.138732
logRPG       5.65     0.177144
FG%          5.19     0.192593
logPITPPG    4.27     0.233973
OFG%         3.64     0.274424
3P%          3.06     0.326633
logAGE       2.56     0.390164
logFBPPG     2.39     0.418371
logAPG       1.98     0.505467
C3P%         1.91     0.523245
logFTAPG     1.87     0.534404
O3P%         1.87     0.536135
logFANS      1.77     0.564224
%3FGMA       1.55     0.645506
FT%          1.51     0.664381
logOTOVPG    1.49     0.670415
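To make the VIF and tolerance columns of Table 4.3 more concrete, the sketch below uses the standard definition VIF_j = 1 / (1 - R_j^2), where R_j^2 is the R-squared from regressing input j on all of the other inputs. The function names and the small two-variable data set are hypothetical and chosen only for illustration; the thesis does not show how its VIFs were computed. With only two inputs, R_j^2 reduces to the squared correlation between them, which keeps the example free of any regression library.

```python
# Illustrative sketch: the data and function names are hypothetical and
# do not reproduce the thesis's actual estimation workflow.
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def vif_from_r_squared(r_squared):
    """VIF_j = 1 / (1 - R_j^2)."""
    return 1.0 / (1.0 - r_squared)


def r_squared_from_vif(vif):
    """Invert the definition: R_j^2 = 1 - 1/VIF_j = 1 - tolerance."""
    return 1.0 - 1.0 / vif


# Made-up values standing in for two closely related inputs.
attempts = [18.0, 20.5, 22.0, 24.5, 27.0, 30.0]
share_of_shots = [0.21, 0.25, 0.26, 0.28, 0.33, 0.35]
r2 = pearson_r(attempts, share_of_shots) ** 2
print(f"Toy example: R^2 = {r2:.3f}, VIF = {vif_from_r_squared(r2):.1f}")

# Reading Table 4.3 backwards: the VIF reported for log3PAPG implies how
# much of its variance the other inputs explain.
vif_log3papg = 61.89
print(f"log3PAPG: tolerance = {1 / vif_log3papg:.6f}, "
      f"implied R^2 = {r_squared_from_vif(vif_log3papg):.4f}")
```

Read this way, a VIF of 61.89 means that roughly 98 percent of the variation in log3PAPG is accounted for by the other inputs, which is why its individual coefficient is estimated imprecisely.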


Conclusion

As shown in Tables 4 and 4.1, the model found that independent variables with the greatest statistically significant positive effect on win percentage are shooting percentages (including field goal, three-point, and free throw percentages), percent of total shots taken as three-point attempts, total rebounds, forcing turnovers, increased age and experience (to a certain limit), and the number of players acquired through the draft.

On the other hand, the inputs with the largest negative effects on winning are opponent field goal percentage, total three-point attempts per game, and turnovers.

To determine whether these variables have also translated to success in the playoffs, the inputs that significantly affected regular season win percentage are compared against the teams that ranked highly in these statistics over the last five seasons. Three-point percentage translates very well to playoff success, as three of the top six teams over the last five seasons are the 2015 Golden State Warriors (2), the 2014 San Antonio Spurs (3), and the 2013 Miami Heat (6). All three of these teams were crowned NBA champions in their respective seasons. Field goal percentage is also crucial to playoff success, as six of the top ten teams were NBA Finalists in their respective seasons. All ten of the top ten teams also finished with a win percentage of 65% or above. The top ten in free throw percentage is dominated by the Oklahoma City Thunder teams from 2013, 2011, 2014, and 2012 and the Portland Trail Blazers of 2014, 2011, 2015, and 2012. Although neither of the two teams has won a title within this span, both have posted strong regular season records: OKC has averaged a win percentage of over 70%, for an average record of 58 wins and 24 losses during the four seasons above, and Portland has averaged a very respectable win percentage of 57.25%, for an average record of 47 wins and 35 losses.

Percentage of total shots taken from three was found to be the strongest predictor of regular season win percentage among the independent variables in Tables 4 and 4.1, and it has had a positive effect on playoff success as well. Twenty of the top 25 teams ranked by this statistic were playoff teams, but the 2015 Golden State Warriors (13) were the only team in this set to win the title. However, the 2015 Cleveland Cavaliers (6) and the 2014 Miami Heat (21) both reached their respective Finals, and the 2015 Houston Rockets (1) and 2015 Atlanta Hawks (9) finished as conference finalists.

Rebounds per game is a category that is clearly important to team success and was found to have a positive effect on regular season winning percentage. However, the 2015 Golden State Warriors were the only team in the last five seasons to rank in the top ten in rebounding during the season and win the title in the same year. Teams place differing amounts of importance on rebounding, as it is particularly dependent on the personnel and capabilities of the team. For example, the Miami Heat ranked last in the league in rebounding in each of the last three seasons, but won the NBA Finals in 2013 and finished as the runner-up in 2014. For a team like Miami that does not emphasize rebounding, the alternative is to focus on different aspects of the game in order to make up for its lack of rebounding.

Forcing turnovers has had a positive effect on winning percentage because teams that force a high number of turnovers give themselves a better opportunity to win based on the sheer number of possessions. Just as turning the ball over reduces the number of possessions in which a team can even attempt a shot, forcing turnovers does the same to the opposition while creating extra possessions for the team. In their respective seasons, the 2015 Golden State Warriors and the 2013 and 2014 Miami Heat teams finished in the top five in forced turnovers per game and won the NBA championship. Conversely, turning the ball over was found to have a negative effect on winning, since turnovers concede possessions.

Due to the experience and establishment that come with veteran NBA players, an average team age of 29-31 is associated with a positive effect on winning. An older team also likely indicates that its players are fully developed and in their primes, without the inexperience that younger players bring. Older teams have been successful in the playoffs in the last five seasons, as the 2011 Dallas Mavericks, the 2012 and 2013 Miami Heat, and the 2014 San Antonio Spurs all ranked in the top five in average age during their respective seasons, and all four teams went on to win the title.

The San Antonio Spurs have ranked in the top ten in the league for each of the last five seasons in the number of players on the roster acquired through the draft and are essentially the model for consistency and success, averaging a win percentage of 73%, or nearly 60 wins per year, over the last five seasons. A potential reason for their success is the number of valuable players that the Spurs acquired on draft day, such as Tim Duncan, Tony Parker, Manu Ginobili, and .

Other teams that have had a consistently high number of players acquired on draft day are: the Oklahoma City Thunder, who have also ranked in the top ten every season over the last five and have averaged a win percentage of 68%, or nearly 56 wins per season; the , who have finished first, twelfth, fifth, and tenth in the league over the last four seasons and have a win percentage of 62.5%, or over 51 wins per season; the Portland Trail Blazers, who have finished first twice, fifth, and second in four of the last five seasons, and finished with an average win percentage of 57% in those four seasons; and the Golden State Warriors, who finished among the bottom ten teams in the league in 2011 and 2012 (and won 43.9% and 34.8% of their games, respectively), but have finished in the top ten over the last three seasons and have won an average of 67% of their games during this span.

Holding opponents to a low field goal percentage was found to be the best indicator of a strong defense (as far as traditional statistics can describe) and has a significant positive effect on regular season win percentage. This statistic is also very indicative of playoff success, as all four of the championship teams in the last five seasons finished in the top ten in opponent field goal percentage during their respective title-winning seasons.

Based on the results found in the model and displayed in Table 2.1, an efficient offense must shoot the ball well from the field, the free throw line, and the three-point line; take a high percentage of its total shots as three-point attempts; rebound the ball well; limit turnovers; and get to the free throw line often. A strong defensive team forces a high number of turnovers while keeping the field goal percentage of the opposition low, as shown in Table 2.2. Meanwhile, as shown in Table 3, teams in the Western Conference have historically outperformed their Eastern Conference counterparts, teams that build through the draft have had higher winning percentages than teams that focus more on building through free agency and trades, and teams with more age and experience tend to perform at a higher level than those made up of younger and less experienced players.

Based on the regression results from these models, the prototype for teams to follow is the standard set by the Golden State Warriors and San Antonio Spurs. The Spurs in particular have set a standard of excellence dating back to their first NBA title in 1999 and continuing through their fifth championship in 2014. Despite the relatively small market of San Antonio (compared to cities such as New York, Los Angeles, and Chicago), the Spurs have managed to build a dynasty through smart drafting, strong defense, efficient shooting, and a collection of effective and established veteran players.

On the other hand, Golden State has experienced a meteoric rise from mediocrity over the last five seasons, as the Warriors have leapt from winning 35% of their games in 2012 to winning 82% of their games in 2015 (and have only continued to improve in 2016). The Warriors have achieved this drastic improvement through devastating three-point shooting, dominant and chaotic defense, and surprisingly strong rebounding. Although Golden State has mimicked San Antonio’s process of building through the draft, a key difference between the two is average age. San Antonio has had an average age of nearly 29 over the last five seasons, while the Warriors have an average age of just over 25, suggesting that the Warriors will remain a threat to win the title for several years. Based on the results, management and coaching have had a very strong influence on the success of both of these teams, as management is in charge of personnel and drafting, while the coaching staff has been able to optimize the focus on the floor for each set of players.
