How good is ? Baker, RD and McHale, IG

Title How good is Tiger Woods? Authors Baker, RD and McHale, IG Type Article URL This version is available at: http://usir.salford.ac.uk/id/eprint/22851/ Published Date 2012

USIR is a digital collection of the research output of the University of Salford. Where copyright permits, full text material held in the repository is made freely available online and can be read, downloaded and copied for non-commercial private study or research purposes. Please check the manuscript for any further copyright restrictions.

For more information, including our policy and submission procedure, please contact the Repository Team at: [email protected]. How good is Tiger Woods?

Rose D. Baker, C. Math, FIMA and Ian G. McHale Centre for Operational Research and Applied Statistics, University of Salford

Abstract tions, or compare the popularity of a set of choices made by consumers. A major objective of professional sport Given the main objective of profes- is to find out which player or team is sional sport is to find the best player or the best. Unfortunately the structure team, ratings models are naturally ap- of some sports means that this is of- plied to sport. Further, data on sport ten a difficult question to answer. For are often widely accessible making it example, there may be too many com- an ideal playground for a statistician. petitors to run a round-robin league, Here we use a ratings model to com- whilst knock-out tournaments do not pare the ability of golfers. The model compare every player with every other adopted is the Plackett-Luce model player. The problem gets worse when which we use to estimate competitor one has to compare players whose ‘strengths’. performance varies over time. For- The strengths estimated from the tunately mathematical modelling can standard Plackett-Luce model are help and in this article, we use the static in that they do not vary with Plackett-Luce model to estimate time- time. When used to estimate player varying player strengths of golfers. We strength in the context of golf, it seems use the model to investigate how good unreasonable to assume each and ev- golf’s current biggest attraction, Tiger ery player has a constant strength Woods, really is. throughout their playing career. It is more likely that a player’s strength Introduction will vary during his or her career. As such, we need a model that can al- ‘Who is the greatest golfer?’ is an of- low a player’s strength to vary with ten discussed topic at the 19th hole time. Our solution is to estimate of any golf course. In recent times, it the strength parameters at regularly- seems likely that Tiger Woods is. We spaced time points, with a smooth know Tiger is good, in fact we can be interpolation between them. Splines pretty certain he is the best amongst are very often used in statistical mod- his contemporaries, but just how good elling (e.g. Harrell, 2001) but it is is he? How much better than his com- even simpler and arguably better to petitors is he, and given his recent use the newer but little-known method drop in form, can we expect Tiger to of barycentric interpolation. This was get back to his best? It is difficult, mentioned in a recent article in this if not impossible to give definitive an- magazine (Trefethen, 2011), and we swers to these questions. However, as believe we are the first workers to use is often the case in life, mathematical it in a statistical analysis. Adopting modelling and statistical analysis can this methodology means that, unlike shed light on the situation. We use previous work, we model the evolu- a ratings or ranking model, and fit it tion of player strength deterministi- to tournament data by the method of cally rather than stochastically. maximum likelihood. Other authors, when modelling Ratings models have typically been player or team strengths in other used to compare candidates at elec- sports, have most commonly adopted

1 stochastic rather than deterministic player wins is evolution of player strength. This is sometimes justifiable. In team sports, ∑ πi Pi = n , for example, there is a non-neglible j=1 πj stochastic component of strength evo- lution: as the team buys and sells where the jth player has strength πj. top players, its strength forms part of After a player has been ranked first, a random walk. In individual sports another player ranks second by win- however, there is a strong systematic ning out of the remaining players, and component in strength evolution. For so on down to the last, who ‘wins’ with example, a player may gain experience probability one. in the early part of his career, reach a peak and then decline in strength until he retires. Indeed, as we find, Time-varying player strengths this would appear to describe the ca- reer path of a typical golf professional. In order to allow for time-dependent player strengths, π, we take as model variables for player i the strengths yik tabulated at mi equally-spaced epochs A Plackett-Luce model tik from the first year of tournament with time-varying strengths play to the last. Next, we interpo- late these tabulated strengths, using The Bradley-Terry model (Bradley the barycentric interpolation formula and Terry, 1952) is a ratings model ∑ mi used for contests between two play- wikyik/(t − tik) π (t) = ∑k=1 , i mi − ers. Such a model is appropriate for k=1 wik/(t tik) obtaining strength estimates of ten- k nis players for example, where play- where wik = (−1) , with first and last ers compete in head-to-head matches, weights being halved. This method but many players are observed play- is described in Berrut and Trefethen ing many matches. In golf, the situ- (2004), and is claimed to give smaller ation is more complicated since many errors than spline interpolation. It players compete in the same ‘match’, is trivial to obtain the differential and we observe many of these matches. ∂πi(t)/∂yik which will be needed in Each match is of course a golf tour- computing the likelihood function. nament, and at the end of each tour- To obtain a more precise estimate nament we have a rankings list of the of a player’s maximum strength, af- players who competed. But when we ter a preliminary likelihood maximisa- have many such tournaments, with tion, a new set of interpolation epochs different sets of competitors, how can was chosen so that the time of maxi- we combine the results? The answer mum strength fell at a tabulated time. comes from the generalisation of the This necessitated the use of unequally- B-T model, to be used when n players spaced points, when the weight for- are competing in each ‘match’, named mula is the Plackett-Luce model, introduced by Luce (1959), and Plackett (1975). k 1 1 wik = (−1) { + }, There are more complex models (e.g. tik − ti,k−1 ti,k+1 − tik Stern, 1990) which however require computation of multivariate integrals. with terms with out-of-range epochs Here the probability that the ith omitted.

2 The likelihood function playing lifetime. This can be given a simple meaning in terms of ob- Hunter (2004) describes the maximisa- servables. Imagine that a player of tion of Bradley-Terry likelihood func- strength π regularly competes against tions, including the Plackett-Luce ex- 1 a player of very high ability π . The tension. He writes the log-likelihood 2 proportion of games won is π /(π + function (for constant strengths) as 1 1 π2) ≃ π1/π2. Over the player’s play- − ing lifetime, the number of games ∑N m∑j 1 ∑mj ∫ won is (ρ/π2) π(t) dt, where ρ is the ℓ(π) = {ln πa(j,i)−ln πa(j,s)}, j=1 i=1 s=i rate at which∫ these competitions oc- cur. Hence π(t) dt is proportional where there are N tournaments, with to the total number of games that the mj players competing in the jth, player would win against a very strong and a(j, i) denotes the identity of the player. This is a sensible measure of player who came ith in the jth tour- lifetime achievement. nament. The sum is up to mj − 1 only because the probability for the last Estimation player is unity. With ties, however, the formula must be modified to When maximising the likelihood func- tion for the tabulated ‘strengths’, we ℓ(π) = cannot use the slow but sure method of likelihood maximisation recommended ∑N ∑mj ∑mj by Hunter (2004).Solving for the tab- {ln πa(j,i) − ln πa(j,s)}, ulated ‘strengths’ yik rather than di- j=1 i=1 s=p(j,i) rectly for the π means that we can- where p(j, i) is the position (place- not manipulate the likelihood maximi- ment) of the ith player in the jth sation condition so that yik is the sub- match; several players can have the ject. This is the basis of the ‘fixed same position. This approximation point’ method. A further complicat- for ties is that made by Cox and ing factor here is that when fitting the Oakes (1984) in the context of the model to data from golf tournaments proportional hazards model; the anal- we need to maximise a likelihood func- ogy with ranking models was pointed tion with respect to hundreds, or even out in a short paper by Su and Zhou thousands, of parameters. This is no (2006). easy feat, but the tractability of the likelihood function means that first The meaning of the model and second derivatives can be found parameters analytically, which helps. An ad hoc method was developed which worked A player’s strength π(t) is a function well enough. The difficulty is that of model parameters, but does not di- one does not want to try to invert the rectly have any observable meaning. huge Hessian matrix. After some ex- However, it is easy to find such mean- perimentation, a Newton-type method ings. Thus, the player with greater was evolved, whereby only the diago- maximum strength would have been nal term of the Hessian was computed. able to beat the other player more of- This of course throws away the attrac- ten than not. For lifetime achievement tive property of second-order conver- we looked at the total strength, i.e. gence, but function maximisation still the integral of strength over player’s proceeded rapidly until near the max-

3 imum. to constrain the parameters, removing Backtracking was used if a step de- the problem and greatly speeding up creased the likelihood function. In ad- likelihood maximisation. dition, single-variable minimisations were interspersed, because these can be done without using the full Hessian, Results and the routine switched to ‘single variable mode’ if backtracking failed. We obtained data from all four of It tried switching out of this mode men’s golf’s major tournaments (the again if enough successful steps had Masters, the US Open, the British been made using the single-variable it- Open and the USPGA) from 1996 to eration. The single variable to be iter- 2011. For each tournament, we had ated was chosen as the one promising the finishing position of all players. the largest increase in log-likelihood, In all there were 63 tournaments and based on the first and second differen- 228 players. Table 1 shows the top tials. Finally, several random restarts 10 players according to total ‘lifetime’ were made. strength. A small problem is that only the INSERT TABLE 1 ratio of the strengths π is determinable As expected Tiger Woods is the from tournament data. Hence the best player of his generation. More in- strength of one player at one time teresting perhaps is the identity of the point was fixed at unity, e.g. y11 = 1. other players in the list. is The yik were rescaled after each itera- rated as the fourth best player since tion to preserve this property. 1996 yet is not typically regarded so The lack of a Hessian matrix again highly in golfing circles. Furyk splits makes it harder to do statistical in- the big five of recent times, namely, ference, i.e. to find errors on model Woods, Mickelson, Els, Goosen and parameter estimates, and to do signif- Singh. The dominance of the US- icance tests. Here the famous boot- based players is evident. Only Lee strap method comes to the rescue, giv- Westwood, and to some extent Sergio ing us a complete methodology for pa- Garcia are players based in Europe. rameter estimation and inference. The rise of European golf in the last Implementing this methodology few years (at the time of writing in requires some heavy computing, even early 2012, European players have won with the relatively simple Plackett- seven of the last eight major titles) Luce model, and one of the authors may mean that this list will be domi- was obliged to upgrade her computer! nated with European golfers in future Then using fortran95 from the Nu- years. merical Algorithms Group (NAG) and Although of some interest, the re- use of their library of numerical sub- sults shown in Table 1 do not display routines made rapid computation pos- the time-varying strengths of the play- sible. There was a wide variety of ers. Figure 1 shows these for the top complications to be sorted out, such 4 players. as the already mentioned existence of INSERT FIGURE 1 ties. Another obvious problem is that and Tiger Woods’ careers if a player performs very well, their seem to be following a familiar pat- strength parameter may tend to in- tern - an increasing strength in the finity, causing numerical problems and early part of their careers with a de- greatly slowing down likelihood max- cline in later years. This is a very imisation. A small ‘prior’ term serves common shape of such strength tra-

4 jectories. Tiger’s peak is near 2001 - ‘twin peak career’, Tiger can. a year in which Tiger dominated the Future work in this area will be to world of golf. Jim Furyk and Phil answer the ultimate question for any Mickelson are following a more pecu- golf fan: ‘who is the greatest golfer liar career path. Jim Furyk has two ever?’. To do so, one would need clear peaks in his ability, whilst Phil data from a much longer time frame Mickelson looks like he may be enter- (golf’s major competitions began in ing another peak having peaked once 1860 with the British Open), and even in 2003 (although recent form suggests more heroic computing, but getting this may not be long-lived). Exam- answers to questions such as ‘who ining the trajectories of all players in would win a match between Tiger our analysis shows that the inverted Woods and ’ would make U-shape is very much the norm career the effort worth it. path for modern golfers. Tiger Woods’ closest rival has been . Our parameter es- timates suggest that, at their re- spective peak strengths (14.99 and 10.08), a match between the two would be won by Tiger with probabil- ity 14.99/(14.99 + 10.08) = 0.598.

Conclusions

In answer to the questions set at the start of the paper, Tiger Woods does indeed seem to be the best golfer of his generation, but given the pat- terns observed in players’ strengths, it seems unlikely that he will once again reach the heights of his powers observed around 2001. Golfers typ- ically increase in strength, and hav- ing reached a peak, face a slow de- cline in playing ability until they re- tire. Of course, part of Tiger’s recent decline has been down to injury (the few tournaments Tiger has played re- cently have seen him perform some- what below the standards he set in the early part of his career), which, as Tiger gets older is more and more likely to happen, suggesting his de- cline will continue. However, his de- cline in form was also partly due to his personal problems. As such, if these are overcome, he may find himself en- tirely focussed on his golfing legacy once more, and if anyone can have a

5

[5] Hunter, D. R. (2004). MM Al- gorithms for generalized Bradley- Terry models, The Annals of Statistics, 32, 384-406. [6] Luce, R. D. (1959). Individual Choice Behavior: A Theoretical Bibliography Analysis. Wiley, New York.

[7] Plackett, R. L. (1975). The anal- ysis of permutations, Applied [1] Berrut, J. P. and Trefethen, L. Statistics, 24, 193-202. N. (2004). Barycentric Lagrange [8] Stern, H. (1990). Models for Dis- interpolation, SIAM Review, 46, tributions on Permutations, Jour- 501-517. nal of the American Statistical [2] Bradley, R. A., and Terry, M. Association, 85, 410, 558-564. E. (1952). Rank analysis of in- [9] Su, Y. and Zhou, M. (2006). On a complete block designs I: the connection between the Bradley- method of paired comparisons. Terry model and the Cox propor- Biometrika, 39, 324345. tional hazards model, Statistics [3] Cox, D. R. and Oakes, D. (1984). and Probability Letters, 76, 698- Analysis of Survival Data, Chap- 702. man and Hall, New York. [10] Trefethen, L. N. (2011). Six [4] Harrell, Jr., F.E. (2001). Regres- myths of polynomial interpola- sion modeling strategies. Springer tion and quadrature, Mathemat- series in statistics, New York. ics Today, August.

7 16 Tiger Woods Phil Mickelson Ernie Els 14 Jim Furyk

12

10

8 Strength

6

4

2

0 1996 1998 2000 2002 2004 2006 2008 2010 2012 Year

Figure 0.1: Lifetime strengths of the four players with highest lifetime strength.

Figures and Tables

Player total lifetime strength (cv) Tiger Woods 12.64 (0.17) Phil Mickelson 8.16 (0.19) Ernie Els 6.12 (0.18) Jim Furyk 4.56 (0.17) 4.55 (0.16) 4.11 (0.21) 3.3 (0.15) 2.84 (0.16) Sergio Garcia 2.81 (0.15) 2.66 (0.11) Table 0.1: Top 10 players by total ‘lifetime’ strength with coefficient of varia- tion.

8