Hot Racquet or Not? An Exploration of Momentum in Grand Slam Tennis Matches

by Arjun Goyal

An honors thesis submitted in partial fulfillment of the requirements for the degree of Bachelor of Science

Undergraduate College
Leonard N. Stern School of Business
New York University
May 2020

Professor Marti G. Subrahmanyam, Faculty Adviser
Professor Jeffrey Simonoff, Thesis Advisor

Acknowledgements

The research done throughout the thesis program would not have been possible without the help, commitment and patience of my thesis advisor, Professor Simonoff. I learnt far more than I could have ever imagined from this endeavor, not just about the research topic, but also about the process of research. Many hours were spent poring over models, exploring odd trends in tennis and reading through lengthy drafts; all this effort from Professor Simonoff was invaluable in producing this final paper.

I’d also like to thank Professor Subrahmanyam, the faculty advisor and program coordinator, for providing this exciting opportunity to undergraduate students, organizing the weekly seminars and providing support throughout the research process.

Lastly, thank you to my friends and family for providing me with unending encouragement throughout the year. I could not have produced this work without your support.

Abstract

The hot hand has been a widely studied phenomenon in the field of sports statistics, specifically in basketball. However, tennis is no stranger to this concept of ‘momentum’. Gilovich, Vallone & Tversky debunked the hot hand in basketball with their paper in 1985, but the belief that a player can ‘get on a roll’ is still widely shared in the tennis community. As a fan, it is difficult to know whether this phenomenon is real or a fallacy. This paper aims to investigate the presence and manifestation of momentum in Grand Slam tennis matches from 2014 to 2019 for men and women.
Specifically, it investigates whether the outcome of previous point(s), game(s) or set(s) carries over to the current one in a way that cannot be accounted for by metrics measuring player quality, form and fatigue. The primary model, a Generalized Linear Mixed Effects Model, was fitted to the data; it incorporates the nested structure of a tennis match into a logistic regression model. The model’s output shows the effect of each factor on the odds of winning a point, game or set.

On a set level, there is a clear indication of positive momentum from winning the previous set, more so for women than for men. Surprisingly, clay courts seem to show the highest degree of momentum, followed by grass and then hard courts.

On a game level, holding one’s serve seems to be paramount for the current serving player, and failing to do so in prior games significantly reduces the predicted odds of winning the current game. Surprisingly, losing a past game can spur positive momentum and is associated with higher predicted odds of winning the next game, which could be explained by players playing harder to get back into the match after losing past games. The momentum effects exhibited are more volatile for men than for women.

On a point level, lagged point outcomes are clearly significant and have a carryover effect on the current point. Winning two or three points in a row is associated with the highest predicted odds of winning the next point, and correspondingly losing two or three points in a row is associated with the lowest predicted odds of winning the next point. It also seems that the momentum effect from winning previous points has a short memory: Lag 1 point outcomes tend to affect predicted odds more than Lag 2 or Lag 3 point outcomes. As at the game level, these momentum effects are more volatile for men than for women.
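To make the idea of lagged outcomes and odds concrete, the following is a minimal Python sketch. It is not the mixed effects model used in this thesis: it only tabulates raw win rates conditional on an earlier point, with no adjustment for player quality, form or fatigue. The function names (lagged_rows, conditional_win_rate, odds) and the toy sequence of point outcomes are illustrative assumptions, not part of the thesis data or code.

```python
# Illustrative sketch only: raw conditional win rates from lagged point
# outcomes. All names and the toy data below are hypothetical.

def lagged_rows(outcomes, max_lag=3):
    """Pair each point outcome with its previous `max_lag` outcomes.

    `outcomes` is a list of 0/1 values for one player in playing order
    (1 = point won). Each returned row is (current, (lag1, lag2, ...)).
    Points without a full lag history are skipped.
    """
    rows = []
    for i in range(max_lag, len(outcomes)):
        lags = tuple(outcomes[i - k] for k in range(1, max_lag + 1))
        rows.append((outcomes[i], lags))
    return rows


def conditional_win_rate(rows, lag=1):
    """Share of points won, split by the outcome `lag` points earlier."""
    counts = {0: [0, 0], 1: [0, 0]}  # previous outcome -> [wins, total]
    for current, lags in rows:
        prev = lags[lag - 1]
        counts[prev][0] += current
        counts[prev][1] += 1
    return {prev: (w / n if n else None) for prev, (w, n) in counts.items()}


def odds(p):
    """Convert a probability (0 <= p < 1) into odds p / (1 - p)."""
    return p / (1 - p)


if __name__ == "__main__":
    toy_points = [1, 1, 0, 1, 1, 1, 0, 0, 1, 1]  # made-up sequence
    rates = conditional_win_rate(lagged_rows(toy_points, max_lag=1), lag=1)
    # A raw odds ratio above 1 would hint at positive carryover, but it
    # ignores every control variable a regression model would include.
    print(rates, odds(rates[1]) / odds(rates[0]))
```

A logistic regression expresses the same comparison on the log-odds scale, so the exponential of a lag coefficient can be read as an odds ratio once the other factors are held fixed.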
Table of Contents

ACKNOWLEDGEMENTS
ABSTRACT
BACKGROUND INFORMATION
    INTRODUCTION
    RULES OF TENNIS
    DEFINING MOMENTUM IN TENNIS
    RESEARCH APPLICATION
    DATA UNIVERSE CONSTRAINTS
    LITERATURE REVIEW
    VARIABLES
    DATA COLLECTION
SET-LEVEL ANALYSIS
    RAW DATA ANALYSIS
    MODEL SELECTION
    MODEL PARAMETERS
    MODEL OUTPUT
    RESIDUAL ANALYSIS
GAME-LEVEL ANALYSIS
    RAW DATA ANALYSIS
    MODEL SELECTION
    MODEL PARAMETERS
    MODEL OUTPUT
    RESIDUAL ANALYSIS
POINT-LEVEL ANALYSIS
    RAW DATA ANALYSIS
    MODEL SELECTION
    MODEL PARAMETERS
    MODEL OUTPUT
    RESIDUAL ANALYSIS
CONCLUSION