Lecture 2.Pdf
Mathematical modelling of football
David Sumpter
Uppsala University & Hammarby IF

Structure today
• Summary of last time. Key Performance Indices.
• Statistical models of passes
• Summary of expected goals (chance to raise questions)
Break 11-11:15. Start again 11:15.

Please ask questions in the chat section where possible and I will answer them as I go. I will take a reasonably slow pace and interact along the way.

Go in to Canvas and look where we are in the course…

First a correction. Pass arrows.

Summary from last time

[Excerpt from Soccermatics, p. 138]
… the defence. By marking every point at which the ball was played just before each of the Real Madrid shots during the Champions League season, we can get an overall picture of how they create successful attacks. Figure 7.12 is a risk map showing where the ball was played in the 15 seconds leading up to a shot from the 20m by 20m area in front of goal. The darker areas show places where there is a high risk of a Real Madrid shot from the danger zone coming within the next 15 seconds; the lighter areas show places where the risk is low. Corners are one clear risk-zone and, not surprisingly, if the ball is already in the box then the risk of a shot is high. But the most interesting risk zone is the hot area outside the box on Real Madrid’s left. This area of the pitch is mainly occupied by Marcelo, who comes up on the left wing, and Ronaldo, who is more central. It is from here that dangerous chances are created.

Figure 7.12 Real Madrid danger-zones during the Champions League season 2014/15. The shading is proportionally darker in areas where the ball was located during the 15 seconds leading up to a shot from the 20m by 20m area in front of the opposition goal. Data provided by Opta.

[Cover of Soccermatics: Mathematical Adventures in the Beautiful Game, Pro Edition, by David Sumpter. www.bloomsbury.com]
‘Football looked at in a very different way’ Pat Nevin
‘Every football nerd’s dream.’ FourFourTwo
Football: the most mathematical of sports. From shot statistics and league tables to the geometry of passing and managerial strategy, the modern game is filled with numbers, patterns and shapes. How do we make sense of them? The answer lies in modelling: mathematical processes more usually applied in biology, physics and economics.
Soccermatics brings football and mathematics together in a mind-bending synthesis, using numbers to help reveal the inner workings of the beautiful game. This new and expanded edition analyses the current big-name players and teams using mathematics, and meets the professionals working inside football who use numbers and statistics to boost performance.
No matter who you follow, from your local non-league side to the big boys of the Premiership, La Liga, the Bundesliga, Serie A or the MLS, you’ll be amazed at what mathematics has to teach us about the world’s favourite sport.

Summary from last time…
• Raw data is seldom enough.
• Standard visualisations are seldom enough (why we need Python!)
• What is the question? How do you answer it?
• Does the data support your hypothesis?
• Danger of self-confirmation, but risk of ignoring domain knowledge.
• Building measures that can then be used for future benchmarking (KPIs)
• Use both deductive (story) and inductive (data) thinking.

Key Performance Indices at clubs
• Entries final third/box.
• Shots danger zone.
• Number of passes leading to shot.
• Ball recoveries within 5 seconds.
• Passing tempo.
• Expected goals.
Single set of measures that are used over the entire club.
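The risk map described above (where the ball was played in the 15 seconds leading up to a shot) and the KPI "number of passes leading to shot" can both be sketched with a small filter-and-bin step. This is only a minimal sketch under stated assumptions: the event format (dictionaries with `type`, `time`, `x`, `y`), the function names and the 105 by 68 pitch size are hypothetical and are not taken from the course code or from Opta's data format. It also ignores team and match period, which a real analysis would need to handle.

```python
def passes_before_shots(events, window=15.0):
    """Return the passes played at most `window` seconds before any shot."""
    shot_times = [e["time"] for e in events if e["type"] == "shot"]
    return [
        e for e in events
        if e["type"] == "pass"
        and any(0.0 <= t - e["time"] <= window for t in shot_times)
    ]


def bin_on_pitch(points, n_long=5, n_wide=5, length=105.0, width=68.0):
    """Count points in an n_long x n_wide grid laid over the pitch.

    grid[i][j] counts points in band i along the pitch length (x)
    and band j across the pitch width (y).
    """
    grid = [[0] * n_wide for _ in range(n_long)]
    for p in points:
        # min(...) keeps points lying exactly on the far edge inside the grid.
        i = min(int(p["x"] / length * n_long), n_long - 1)
        j = min(int(p["y"] / width * n_wide), n_wide - 1)
        grid[i][j] += 1
    return grid


# Made-up events for illustration only (time in seconds, position in metres).
events = [
    {"type": "pass", "time": 10.0, "x": 90.0, "y": 30.0},
    {"type": "shot", "time": 20.0, "x": 98.0, "y": 34.0},
    {"type": "pass", "time": 40.0, "x": 50.0, "y": 10.0},
    {"type": "pass", "time": 100.0, "x": 20.0, "y": 60.0},
    {"type": "shot", "time": 110.0, "x": 95.0, "y": 40.0},
]

recent = passes_before_shots(events)
grid = bin_on_pitch(recent)
print(len(recent), "passes within 15 seconds of a shot")
```

Changing `n_long` and `n_wide` gives a coarser or finer map; the counts in `grid` are what a heat map would shade.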
Lecture 2
Statistics of Passing
Mathematical modelling of football

Pass sequence

[From Soccermatics, ‘Check My Flow’, p. 67]
Figure 3.6 Passes leading up to Italy’s first goal against Germany in Euro 2012, showing passes made by Italy (arrows in top panel) and Mesut Özil’s movement while chasing the ball (bottom panel). Darker shading indicates more recent events in time. Letters indicate events: (A) Pirlo’s first pass; (B) Pirlo’s second pass; (C) Chiellini receives the ball; (D) Balotelli scores.
Diego Escribano (Group 2)
Philip Winchester (Group 2)

The next step is to go from visual understanding to statistical understanding…

All passes by England’s women in World Cup

All passes by England’s women in World Cup
5 lanes, 5 ‘heights’

All passes by England’s women in World Cup
10 lanes, 10 ‘heights’

Passes within 15 seconds of a shot

Limitations
• Not adjusted for xG of chance created.
• For example, all chances with greater than 0.05 xG. (Challenge)
• Not compared to other teams.

Passes made by each player
Lucy Bronze          31    Alex Greenwood       11
Francesca Kirby      26    Abbie McManus        10
Jill Scott           23    Bethany Mead         10
Nikita Parris        22    Demi Stokes           9
Keira Walsh          16    Jodie Taylor          4
Stephanie Houghton   16    Millie Bright         4
Rachel Daly          13    Jade Moore            4
Toni Duggan          13    Georgia Stanway       3
Ellen White          12    Karen Julia Carney    3
                           Karen Bardsley        1

Limitations
• Not corrected for minutes played.
• Again, could be corrected for xG.

Let’s look at the code…
7PassHeatMap.py

Still don’t have statistical understanding until we compare to other teams…

Do teams that pass more shoot more?
All teams in the Women’s World Cup

Linear regression
Fit the line that lies closest to the points, where closeness is measured as the sum of squares of the vertical differences between the points and the line.

Do teams that pass more shoot more?

Fitting in Python
Goals = b0 + b1 × Passes
(b0 is the intercept, b1 is the slope)

Fitting in Python
Test goodness of fit

Fitting in Python
Goals = b0 + b1 × Passes
Test null hypothesis that intercept (b0) and slope (b1) are zero

Can we learn something about individual teams?
USA
England

Linear regression is a quick way of checking relationships in data.

They predict future goals better than goals.
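The straight-line fit described above (an outcome modelled as b0 + b1 × Passes, chosen to minimize the sum of squared differences) can be sketched with the closed-form least-squares formulas. This is a minimal sketch, not the course's own script: the function names and the numbers are made up for illustration, and it uses only the Python standard library rather than whichever fitting package the lecture uses. R² is shown as one simple measure of goodness of fit.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares estimates (b0, b1) for the model y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx            # slope
    b0 = y_bar - b1 * x_bar   # intercept: the line passes through the means
    return b0, b1


def r_squared(x, y, b0, b1):
    """Share of the variance in y explained by the fitted line."""
    y_bar = mean(y)
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Made-up per-match team averages, for illustration only.
passes = [300, 400, 500, 600, 700]
shots = [8, 11, 12, 16, 18]

b0, b1 = fit_line(passes, shots)
r2 = r_squared(passes, shots, b0, b1)
print(f"shots = {b0:.3f} + {b1:.4f} * passes, R^2 = {r2:.3f}")
```

A team's residual (observed value minus the fitted line) then says whether it produces more or less than the regression would predict from its passing volume.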
https://cartilagefreecaptain.sbnation.com/2014/2/28/5452786/shot-matrix-tottenham-hotspur-stats-analysis-expected-goals

Possession and goal difference
Premier League 16/17
https://medium.com/@Soccermatics/how-important-is-it-to-have-the-ball-47f93b7760fd

Possession and goal difference
Champions League 16/17
https://medium.com/@Soccermatics/how-important-is-it-to-have-the-ball-47f93b7760fd

Possession and winning are not usually correlated!

For goals we should use Poisson regression (next lecture)
No evidence to dismiss the null hypothesis

Poisson regression fit.
(intercept b0, slope b1)

Poisson regression fit.
Look in code 8PassCompare.py

Difference to average of all teams

What would I tell England (or Sweden) about their World Cup?
Based on 3 or 4 hours of data analysis…
Lucy Bronze          31
Francesca Kirby      26
Jill Scott           23
Nikita Parris        22
Keira Walsh          16
Stephanie Houghton   16
Rachel Daly          13
Toni Duggan          13
Ellen White          12

Data Alchemy

[Cover of Outnumbered by David Sumpter. Featuring Cambridge Analytica, Facebook and Google.]
Algorithms are running our society, and we don’t really know what they are up to.
Our increasing reliance on technology and the internet has opened a window for mathematicians and data researchers to gaze through into our lives. Using the data they are constantly collecting about where we travel, where we shop, what we buy and what interests us, they can begin to predict our daily habits. But how reliable is this data? Without understanding what mathematics can and can’t do, it is impossible to get a handle on how it is changing our lives.
In this book, David Sumpter takes an algorithm-strewn journey to the dark side of mathematics. He investigates the equations that analyse us, influence us and will (maybe) become like us, answering questions such as:
How does Facebook build a 100-dimensional picture of your personality?
Are Google algorithms racist and sexist?
‘You’ve heard about these algorithms that run your life, and you want to know two things: how exactly do they work? And how much should I worry? With a refreshing mix of in-depth knowledge and personal honesty, David Sumpter answers both those questions.’ Timandra Harkness, writer, comedian and broadcaster, and author of Big Data
‘A stellar book about the application of mathematics to the real world. Each chapter tells a fascinating story, and David’s warm and witty style demonstrates that a mathematician can be so much more than just a machine for turning coffee into theorems. A riveting read.’ Kit Yates, Senior Lecturer, Department of Mathematical Sciences, University of Bath
David Sumpter is Professor of Applied Mathematics at the University of Uppsala, Sweden. Originally from London, but growing up in Scotland, he completed his doctorate in Mathematics at Manchester, and held a Royal Society Fellowship at Oxford before heading to Sweden. His scientific research covers everything from the inner workings of fish schools and ant colonies, the analysis of the passing networks of football teams, and segregation in society to machine learning and artificial intelligence.
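The Poisson regression mentioned earlier for goal counts can be sketched as follows. This is a minimal sketch under stated assumptions and is not the content of 8PassCompare.py: it assumes the model log(mean goals) = b0 + b1 × Passes, fits it by Newton's method on the Poisson log-likelihood using only the standard library, and uses made-up numbers. The function name `fit_poisson` is hypothetical.

```python
import math


def fit_poisson(x, y, n_iter=50, tol=1e-10):
    """Maximum-likelihood fit of log(mu) = b0 + b1 * x for count data y."""
    # Start from the intercept-only model: mu equals the mean of y.
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(n_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Gradient of the log-likelihood with respect to (b0, b1).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Negative Hessian (a 2 x 2 symmetric matrix).
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 system explicitly.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Made-up numbers: passes per match (in hundreds) and goals per match.
passes_hundreds = [3.0, 4.0, 4.5, 5.0, 6.0, 6.5]
goals = [0, 1, 1, 2, 2, 4]

b0, b1 = fit_poisson(passes_hundreds, goals)
print(f"log(mean goals) = {b0:.3f} + {b1:.3f} * (passes / 100)")
```

Unlike the straight-line fit, the predicted mean exp(b0 + b1 × Passes) can never be negative, which is one reason a Poisson model suits goal counts.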