Engineering Moneyball in Hockey, Baseball and Beyond
Applied Optimization Lab, University of Toronto

Timothy Chan
Mechanical and Industrial Engineering, University of Toronto
TALYR, April 16, 2018

What do you think of when you hear “Moneyball”?
• Statistics
• Data
• This guy…
• Or this guy…
• I prefer a broad interpretation of sports analytics
  – Not just statistics (#fancystats), but models
  – I’m especially interested in decision models

Outline
• Hockey
  – Several past and current research projects
  – A real application
• Baseball
  – A project where a core Industrial Engineering concept was the key
  – First place in MIT Sloan Sports Analytics Conference paper competition (2013)
• Beyond

Thanks to all the students I’ve worked with
• David Novati, IndE 1T1+PEY (Riot Games)
• Justin Cho, IndE 1T1+PEY (DHL)
• James Sun, IndE 1T1+PEY (Deloitte)
• Aditya Dhoot, EngSci 1T2+PEY (Snapy Inc)
• Swapneel Mehta, IndE 1T3+PEY (Bell)
• Raghav Singal, IndE 1T4+PEY (PhD)
• Mark Cui, IndE 1T4+PEY (MasterCard)
• Daniel Biancolin, IndE 1T5+PEY (BoA Merrill Lynch)
• Yusuf Shalaby, IndE 1T7+PEY (PEY: CCO)
• David Madras, CS 1T6 (MSc)
• Justin Boutilier, IndE PhD 1T8 (PhD)
• Islay Wright, IndE MASc 1T7 (MASc)
• Kyle Booth, IndE PhD 1T9 (PhD)
• Amina Sherif, IndE 1T6+PEY (BASc)

Hockey #1

How it all started
• Construct the “optimal” Team Canada hockey team for the 2010 Olympics
  – What should a decision maker consider?
• List of players
• Attributes
• NHL stats
• Intangibles
• Int’l experience
[Diagram: Model → Roster]
• How to value different types of players?

Project 1: Quantifying the value of player types
• Turn players into data and use models to identify “good” players
• We represent players as a vector of their statistics
  – E.g., Sidney Crosby in 2010 is [51, 58, 15, 63, 43, 71]
  – [Goals, Assists, +/-, Hits, Blocks, PIM]
  – The beauty of models is that we can add new stats as they become available (e.g., Corsi, etc.)

K-means clustering to identify player types
[Figure: players plotted on three axes: Goals, Blocked shots, PIM]

Example player classification from 2013-2014
Position | Player type | Names
Forward | Top Line | Steven Stamkos, Sidney Crosby, Gustav Nyquist
Forward | Second Line | Joffrey Lupul, Henrik Sedin, Ryan Nugent-Hopkins
Forward | Defensive | Jay McClement, Ryan Callahan, Daniel Winnik
Forward | Physical | Colton Orr, Chris Neil, Zac Rinaldo
Defense | Offensive | Erik Karlsson, PK Subban, Tyson Barrie
Defense | Defensive | Dan Hamhuis, Marc-Edouard Vlasic, Jacob Trouba
Defense | Average | Justin Schultz, Christian Ehrhoff, Seth Jones
Defense | Physical | Dion Phaneuf, Radko Gudas, Erik Gudbranson
Goalie | Elite | Tuukka Rask, Josh Harding, Martin Jones
Goalie | Average | Roberto Luongo, Cory Schneider, Eddie Lack
Goalie | Bottom | Ondrej Pavelec, Tim Thomas, Devan Dubnyk

Project 2: Individualizing player valuations
• Can we refine the model to better differentiate between players?
[Figure: points A, B and C on the Goals, Blocked shots and PIM axes, with percentage labels 20%, 30%, 30%, 30%, 40%, 50%]

Validate split personality analysis
• In 2009-2010, Ovechkin won MVP (players), Crosby co-won goal scoring trophy, H. Sedin won MVP (media) and points scoring trophy
• Daniel and Henrik Sedin really are twins

Olympics…is there an optimal team?
Forwards (2010) | Cluster (2009-2010) | Forwards (2014) | Cluster (2013-2014)
Patrice Bergeron | Top Line | Jamie Benn | Top Line
Sidney Crosby | Top Line | Patrice Bergeron | Top Line
Ryan Getzlaf | Top Line | Jeff Carter | Top Line
Dany Heatley | Top Line | Sidney Crosby | Top Line
Jarome Iginla | Top Line | Matt Duchene | Top Line
Patrick Marleau | Top Line | Ryan Getzlaf | Top Line
Rick Nash | Top Line | Chris Kunitz | Top Line
Corey Perry | Top Line | Patrick Marleau | Top Line
Eric Staal | Top Line | Rick Nash | Top Line
Joe Thornton | Top Line | Corey Perry | Top Line
Jonathan Toews | Top Line | Patrick Sharp | Top Line
Brendan Morrow | Second Line | Steven Stamkos | Top Line
Mike Richards | Second Line | John Tavares | Top Line
- | - | Jonathan Toews | Top Line

A real application
Moore v. Bertuzzi

The incident
• (show video clip)
• https://www.youtube.com/watch?v=GEjdwlT6g7o

A real application
• Incident occurred on March 8, 2004
• Trial date was Sept 8, 2014
  – Settled days before
• A major issue was determining whether Moore would have had a long career in the NHL
• My research methods were used in an expert witness report to determine comparable players
• Potential for these methods to be used in salary arbitration in general

Baseball

Motivation
• During a game between Tampa Bay and Seattle on May 2, 2012, B. J. Upton (Tampa’s Center Fielder) experienced tightness in his right quad
• As a precautionary measure, Tampa removed him from the game
• The following is a news story posted on MLB.com after the game:

A chain reaction
“Upton’s departure set into motion a radical change of positions in the field for the Rays.
Desmond Jennings shifted from left field to center, Matt Joyce went from right to left and Ben Zobrist moved from second base to right field. Elliot Johnson entered the game to play shortstop and hit in Upton’s slot in the lineup. Sean Rodriguez moved from shortstop to third base and Will Rhymes shifted from third base to second.”
[Diagram: baseball field with positions LF, CF, RF, SS, 2B, 3B, 1B, P, C]

Baseball’s Hilbert Problems
• In 2000, Baseball Prospectus came up with a list of 23 open questions in baseball analytics
  – Mathematician David Hilbert gave an address in 1900 outlining 23 major unsolved problems in math
• The value of positional flexibility
  – A player who plays two positions at a league-average level gives his manager flexibility, both in setting up the team’s roster and using in-game strategies. … Because roster spots are scarce, a team gets value from a player’s ability to play multiple positions, but we do not yet have an understanding of how much value there is to having [such a player] on your roster.
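
One way to make the open question above concrete is to treat the value of flexibility as the gap between the best lineup a team can field after an injury with and without multi-position players. The sketch below is illustrative only. The player names, the ratings, the choice of "primary position" as a player's best-rated position, and the rule that an unfilled position is covered by a replacement-level player worth 0 are all assumptions, not figures or methods from the talk, and the brute-force search is a stand-in for a proper optimization model.

```python
POSITIONS = ["2B", "SS", "RF"]

# Hypothetical ratings (runs above replacement) for each position a player can play.
NATIVE = {
    "Starter A": {"2B": 30, "RF": 24},   # multi-position player
    "Starter B": {"SS": 28, "2B": 20},
    "Starter C": {"RF": 26},
    "Bench D":   {"2B": 15},
}

def restrict_to_primary(skills):
    """Assumed 'no flexibility' case: keep only each player's best-rated position."""
    restricted = {}
    for player, ratings in skills.items():
        primary = max(ratings, key=ratings.get)
        restricted[player] = {primary: ratings[primary]}
    return restricted

def best_lineup_value(skills, unavailable=frozenset(), positions=POSITIONS):
    """Best total rating over all one-player-per-position assignments.

    A position nobody can cover is assumed to be filled by a
    replacement-level player, which contributes 0 by definition.
    """
    roster = [p for p in skills if p not in unavailable]

    def solve(i, used):
        if i == len(positions):
            return 0
        pos = positions[i]
        best = solve(i + 1, used)  # leave this position to a replacement-level player
        for player in roster:
            if player not in used and pos in skills[player]:
                best = max(best, skills[player][pos] + solve(i + 1, used | {player}))
        return best

    return solve(0, frozenset())

injured = {"Starter C"}  # the right fielder leaves the game
flexible = best_lineup_value(NATIVE, injured)
rigid = best_lineup_value(restrict_to_primary(NATIVE), injured)
print(f"with flexibility: {flexible}, without: {rigid}, gap: {flexible - rigid}")
```

With these made-up numbers, the flexible roster moves Starter A to right field and brings Bench D in at second base, so the gap between the two cases is 9 runs. When nobody is injured, both rosters field the same lineup and the gap is zero.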

Flexibility has been well-studied in manufacturing
[Figure: links between factories and cars, and between players and positions (C, 1B, 2B, 3B, SS, LF, CF, RF), under no flexibility, some flexibility and full flexibility]

The experiment
• Pretend each player can only play a single position
• Calculate how well the team performs when a player gets injured
  – A bench player is substituted in
• Then compare to the case when players have their native flexibility

The scientific approach
• Statistical models to estimate
  – Injury risk
  – Length of injury, if injured
  – Capability of players at different positions
• Available playing time determined by simulating player injuries
• Optimization model to determine player-position assignments based on available playing time

The value of team flexibility
• At least 1.5 wins are due to flexibility alone
[Figure: values represent % of total performance (RAR) due to flexibility of players on roster]

Beyond

Some exciting future directions
• Incorporating physiological or biomechanical data
• Smart equipment
• Business analytics

Golf handicapping
• Does current golf handicap system bias match play outcomes?
  – Yes!
  – Stronger player wins roughly 53% of the time…better than house odds for many casino games!

Daily fantasy sports
• Choose fantasy lineup daily and compete for cash prizes after paying entry fee
• FanDuel and DraftKings valued at over $1B each
• $2.6B entry fees in 2015 → $15B projected by 2020
[Diagram: Input data → Prediction model → Player performance → Optimization model → Lineup]

NHL expansion draft optimization
• Las Vegas became 31st NHL team in 2017-2018
• Expansion draft held in summer 2017
  – Las Vegas gets to pick players from other teams to form its own team
  – Existing 30 teams get to protect some players
[Diagram: Current players → Protection models (30 teams) → Protected / exposed players → Selection model (Las Vegas) → Roster]

Developed website for fans to optimize in real time

Summary
• Think broadly about sports analytics
  – It’s not just baseball statistics
  – It’s complex models, non-traditional data, all sports
• Think deeply about sports analytics
  – It’s not enough to do basic number crunching
  – Need advanced methods: optimization, simulation, machine learning, statistics
• Better decision-making is possible when you combine the best of both

Thanks for listening! Questions?
Timothy Chan
Associate Professor, University of Toronto
[email protected]
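
Backup sketch for the daily fantasy sports slide: the optimization step takes predicted player performance and picks a lineup. The example below is a minimal illustration, assuming the contest imposes a salary cap and a fixed lineup size (neither is stated on the slide). The player labels, salaries and projected points are made up, and the exhaustive search is only practical for tiny pools; it is not the model used in the talk.

```python
from itertools import combinations

# (label, salary, projected points): hypothetical output of a prediction model.
PLAYERS = [
    ("P1", 9000, 20.0),
    ("P2", 7000, 16.0),
    ("P3", 6000, 13.0),
    ("P4", 5000, 12.0),
    ("P5", 4000, 8.0),
]

def best_lineup(players, size, salary_cap):
    """Return the lineup of `size` players with the most projected points
    whose total salary does not exceed `salary_cap` (exhaustive search)."""
    best_points, best_combo = float("-inf"), None
    for combo in combinations(players, size):
        salary = sum(p[1] for p in combo)
        points = sum(p[2] for p in combo)
        if salary <= salary_cap and points > best_points:
            best_points, best_combo = points, combo
    return best_combo, best_points

lineup, points = best_lineup(PLAYERS, size=3, salary_cap=18000)
print([p[0] for p in lineup], points)
```

With these numbers the best lineup is P2, P3 and P4 for 41 projected points. The highest-projected player, P1, is left out because fitting him under the cap forces two weaker picks.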