
“Sweet-Spot”: Using Spatiotemporal Data to Discover and Predict Shots in Tennis

Xinyu Wei1, Patrick Lucey2, Stuart Morgan3 and Sridha Sridharan1
1Queensland University of Technology, 2Disney Research Pittsburgh, 3Australian Institute of Sport
Email: [email protected], [email protected], [email protected], [email protected]

Abstract

In this paper, we use ball and player tracking data from “Hawk-Eye” to discover unique player styles and predict within-point events. We move beyond current analysis that only incorporates coarse match statistics (i.e. serves, winners, number of shots, volleys) and use spatial and temporal information which better characterizes the tactics and tendencies of each player. Using a probabilistic graphical model, we are able to model player behaviors, which enables us to: 1) find the factors, such as the location and speed of the incoming shot, which are most conducive to a player hitting a winner (i.e. their “sweet-spot”) or committing an error, and 2) perform “live in-point” prediction - based on the shots played so far in a rally, we estimate the probability of the outcome of the next shot (e.g. winner, continuation or error). As player behavior depends on the opponent, we use model adaptation to enhance our prediction. We show the utility of our approach by analyzing the play of Djokovic, Nadal and Federer at the 2012 Australian Tennis Open.

1 Introduction

In Nadal’s recent biography [1], he candidly describes his strategy against Federer as, “If I have to hit the ball twenty times to Federer’s backhand, I’ll hit it twenty times, not nineteen. If I have to wait for the rally to stretch to ten shots or twelve or fifteen to bide my chance to hit a winner, I’ll wait. There are moments when you have a chance to go for a winning drive, but you have a 70% chance of succeeding; you wait five shots more and your odds will have improved to 85%... That’s the plan. It’s not a complicated plan. You can’t even call it a tactic, it’s so simple. I play the shot that’s easiest for me and he plays the one that’s harder for him - I mean, my left-handed drive against his right-handed backhand. It’s just a question of sticking with it...”

Despite the simplicity of his tactic, what is compelling is Nadal’s probabilistic mindset of maximizing the chance of a winner or waiting to force an error. Motivated by this insight, in this paper we use ball and player tracking information from Hawk-Eye [2] to model a player’s behavior based on a myriad of spatiotemporal variables (e.g. shot location, speed, angle, number of shots, feet location). We use a probabilistic graphical model to determine the combination of variables which maximizes a player’s chance of hitting a winner from the incoming shot. We call this analysis “sweet-spot”, and we also use it to find the combination of variables which frequently leads to an error - a method which can be used to highlight a player’s tactics as well as their strengths and weaknesses.
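The sweet-spot idea can be sketched numerically. Given class-conditional densities p(x | winner) and p(x | other) over an incoming-shot factor, Bayes’ rule gives P(winner | x), and the sweet-spot is the factor value that maximizes that posterior. The densities and the prior below are illustrative stand-ins, not values learned in this paper:

```python
import numpy as np

# Hypothetical class-conditional densities p(x | outcome) over one factor
# (incoming-shot landing position, in metres from the centre line).
# These Gaussians are stand-ins for the multi-dimensional pdfs learned
# from tracking data.
def p_x_given_winner(x):
    return np.exp(-0.5 * ((x - 3.2) / 0.8) ** 2) / (0.8 * np.sqrt(2 * np.pi))

def p_x_given_other(x):
    return np.exp(-0.5 * ((x - 1.0) / 2.0) ** 2) / (2.0 * np.sqrt(2 * np.pi))

P_WINNER = 0.15  # illustrative prior probability that a shot is a winner

def winner_posterior(x):
    """P(winner | x) via Bayes' rule over a two-class simplification."""
    num = p_x_given_winner(x) * P_WINNER
    den = num + p_x_given_other(x) * (1 - P_WINNER)
    return num / den

# "Sweet-spot": the factor value that maximises the posterior.
grid = np.linspace(0, 5, 501)
sweet_spot = grid[np.argmax(winner_posterior(grid))]
```

With these stand-in densities the posterior peaks a little wide of the winner-density mean, because the broad “other” density still dominates near the centre of the court.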

Our work differs from current tools used to analyze tennis matches, such as IBM’s Slamtracker [3], which only incorporate coarse match statistics (i.e. serves, winners, number of shots, volleys) and typically lack rich spatial and temporal information. This is despite the fact that ball and player tracking technology has become commonplace at major tennis tournaments to aid officiating and broadcast visualization. Our approach also allows us to do live “in-point” prediction - given the previous shots during a rally, we can predict the outcome of the current shot. As tennis is an adversarial game, the behavior of a player is heavily conditioned on the behavior of the opponent. To model specific opponent tendencies, we employ a model adaptation technique which greatly improves the predictive power.
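The model-adaptation step can be sketched as an interpolation between a data-rich generic model and a sparsely trained opponent-specific one. The paper does not spell out its adaptation scheme here, so the linear blend and the weight `lambda_` below are illustrative assumptions:

```python
def adapt(p_generic, p_specific, lambda_=0.3):
    """Blend an opponent-specific density into a generic one.

    lambda_ is a hypothetical interpolation weight: 0 ignores the
    opponent-specific model, 1 trusts it completely.
    """
    def p_adapted(x):
        return lambda_ * p_specific(x) + (1 - lambda_) * p_generic(x)
    return p_adapted

# Toy densities: probability a rally shot at position x becomes a winner.
p_generic = lambda x: 0.10      # tour-wide base rate (illustrative)
p_vs_federer = lambda x: 0.20   # opponent-specific estimate (illustrative)

p_adapted = adapt(p_generic, p_vs_federer, lambda_=0.3)
```

A small `lambda_` hedges against the opponent-specific model being trained on only a handful of matches, which is the situation the adaptation technique is meant to address.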

2 Methodology

Since Intille and Bobick’s seminal work on football play recognition over a decade ago [4], numerous efforts have concentrated on using probabilistic methods to model spatiotemporal data to aid in the analysis and prediction of plays in sport [5-8]. Though notable, these efforts have been limited in their widespread use by the lack of tracking data with which to adequately train models. Recently, however, the release of STATS SportVU data for basketball has enabled interesting analysis of shots and rebounding [9-10]. For soccer, researchers have characterized team behaviors in the English Premier League using ball-motion information across an entire season using OPTA data [11].

2013 Research Paper Competition

[Figure 1: court diagram; panel labels: “Player B serves”, “Winner”, “Player B foot movement”, “Impact point of incoming shot”, “Player A foot movement”.]

Figure 1. The Hawk-Eye data contains both the ball-trajectory information and the player feet-movement information. In this example, player B serves the ball to player A, who then hits a winner back to player B.

In this paper, we use a Bayesian Network (BN) framework to model player behavior using ball and player tracking information. BNs are a type of probabilistic model that allow us to construct a single compact model that captures the varying properties of a rally. To facilitate this work, we used Hawk-Eye [2] data from the Men’s draw of the 2012 Australian Open. The Hawk-Eye data contains ball trajectories and player foot position over time. An example of the data is illustrated in Figure 1. This rich set of data allows us to see which variables (or combinations of variables) increase the likelihood that a player will hit a winner or commit an error1 - which in essence suggests a player’s strengths and weaknesses. The genesis of a player’s behavior can be described by a number of underlying variables, such as: shot speed, shot location, player position, opposition position, number of shots in the rally, and identity or rank of the opposition (e.g. Federer, or top-10 rank). Except for the identity of the opponent, all of these factors vary temporally, which makes this problem an ideal candidate for a BN. Additional factors such as set number/length of match, environmental conditions (e.g. hot, humid, cold or windy), specific match context (e.g. game/set/match/break point) and court surface (e.g. grass, hard-court, clay) would no doubt enhance the predictive power of our model; however, as we increase the number of variables, the amount of training data required to train our BN increases exponentially. For this work, we specifically modeled the winners and errors associated with the top 3 seeds at the tournament (Novak Djokovic, Rafael Nadal and Roger Federer), as they had the most data and it also allowed us to compare their different styles of play. The respective opponents and outcomes of the matches used in this study are shown in Table 1.
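The training-data pressure described above can be made concrete: with discretised factors, a full joint distribution needs one parameter per joint configuration, so each added factor multiplies the table size. The factor names and level counts below are hypothetical discretisations, not values used in the paper:

```python
from functools import reduce

# Hypothetical discretisation of the factors named in the text.
factor_levels = {
    "shot_speed": 5,
    "shot_location": 8,
    "player_position": 8,
    "opponent_position": 8,
    "rally_length": 4,
    "opponent_identity": 10,
}

def joint_table_size(levels):
    """Number of joint configurations a full probability table must cover."""
    return reduce(lambda a, b: a * b, levels, 1)

base = joint_table_size(factor_levels.values())
# Adding one more factor, e.g. a 3-level court surface, triples the table.
with_surface = joint_table_size(list(factor_levels.values()) + [3])
```

A BN sidesteps part of this blow-up by factorising the joint into local conditional tables, but every extra variable still multiplies the number of configurations that must be observed often enough to estimate reliably.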

Table 1. Opponents and outcomes for Djokovic, Nadal and Federer at the 2012 Australian Open.

| Player   | Stat  | 1st         | 2nd         | 3rd         | 4th             | Qtr             | Semi                | Final               |
|----------|-------|-------------|-------------|-------------|-----------------|-----------------|---------------------|---------------------|
| Djokovic | Opp   | Lorenzi     | Giraldo     | Mahut       | Hewitt          | Ferrer          | Murray              | Nadal               |
|          | Score | 6-2,6-0,6-0 | 6-3,6-2,6-1 | 6-0,6-2,6-1 | 6-1,6-3,4-6,6-3 | 6-4,7-6,6-1     | 6-3,3-6,6-7,6-1,7-5 | 5-7,6-4,6-2,6-7,7-5 |
| Nadal    | Opp   | Kuznetsov   | Haas        | Lacko       | Lopez           | Berdych         | Federer             | Djokovic            |
|          | Score | 6-4,6-1,6-1 | 6-4,6-3,6-4 | 6-2,6-4,6-2 | 6-4,6-4,6-2     | 6-7,7-6,6-4,6-3 | 6-7,6-2,7-6,6-4     | 7-5,4-6,2-6,7-6,5-7 |
| Federer  | Opp   | Kudryavtsev | Beck        | Karlovic    | Tomic           | Del Potro       | Nadal               | -                   |
|          | Score | 7-5,6-2,6-2 | walkover    | 7-6,7-5,6-3 | 6-4,6-2,6-2     | 6-4,6-3,6-2     | 7-6,2-6,6-7,4-6     | -                   |

1 Due to the ambiguity in differentiating between forced and unforced errors directly from the data, we group both into a single error category. Owing to the different characteristics of these errors, two discriminant modes resulted for each error type.

A point essentially consists of three game states: 1) the initial state (i.e. the serve), 2) the middle state (i.e. the rally, which can consist of many shots), and 3) the end state (i.e. the point ends via an ace, double fault, winner or forced/unforced error). A typical dynamic Bayesian network (DBN) consists of two parts: an initial Bayesian network (BN) to model the initial state and a 2TBN (2-time-slice BN) to model the rally. In our point model, we used a 2TBN to represent the rally component and a BN for the service component of a point. A depiction of this model is shown in Figure 2.

To learn a model for each player, we first create a shot database where each shot is labeled as either ace, fault, continuing shot, winner or error. Given these labeled shots, we can then learn probability distribution functions (pdfs) for each factor (e.g. shot location, speed, angle, position etc.), which yields a multi-dimensional pdf for each shot type for each player. A Gaussian Mixture Model (GMM) is used to calculate the pdf of each of these factors, and the parameters are calculated using the Expectation-Maximization (EM) algorithm. For our work we used the BayesNet Toolbox [12] for model estimation and inference. A visualization of the pdfs of the different factors of the winners for the top 3 players is shown in Figure 3. As we can see from this figure, each player has different tendencies. Specifically, Federer’s winners tend to be hit from wider positions on both the forehand and backhand compared to the others, and he also tends to be further behind the baseline. From this model, we can see the combination of factors which most likely leads to a player hitting a winner, an unforced error or an ace (i.e. their sweet-spot). This is shown in Figure 4.
fault, (or speed, their con- A sweet angle, visualizationz spot).238 position This of the is etc.), pdfs shown↵ in7,13 Figure(39) 4.↵85,13,12218 221242 (17) x1 (21)(16) 255ICCVS1178 2011 Submission(46)194 #****.187 CONFIDENTIALz3 (29) z REVIEW COPY.(48)194 DO NOT DISTRIBUTE.(50)290 292 272 296275 309 237 231 further139 behindeachz playerthez baseline has different (needz tendencies.to check this?). Specifically,(39)(44) From this Federers (38) z1 ↵2,5 253 245 (45)(11)z3 2 291285 5 (48)193 307 299 126131125 231 134 230137131 140 238ace141 (or their sweet6 spot). This240150 is10 shown in Figure 4. z2 ↵6,5 180243 185179185(47)(24)t 1 188191 x194285 T 284 (57) 195 292204 294 297 tinuing shot, winner141which or error. yields Givenmodel,of134a the multi-dimensional these different we can labeled factors see the ofshots, combination the pdf winners we for each for241239 of the shot factors top243 type 3 play- which for ↵ leads6,10 z ↵7,13219 222 (18) (45)(17) 256 t z 195188 x (32) (59) 295293 273297 276 310 232132 237140 winners6 tend to be hit fromz5 wider on bothz the forehand and(40)1↵7,11 243 (44) zz42 (22)254 z186 (47)(49) z4 (30)286 194 (49) 291 308297 127 232238 135 138 ers135 is shown141 in Figurez 3. As we canz see from this figure,241151(40) (39) x1 ↵5,6 181244 246 (46)(12) 189189192 T 1195286292 z6 (51) 205 295 298 300 132126 can then learn probability231142each player. 
distributionmost A Gaussianlikely functionsbackhand to239 a Mixture player1142 compared (pdfs)p hitting(s Model) for to a thewinner, (GMM)7 others,242 unforced is and used he(45) also errorto (5) tendsp and(s) to be 220 z3 ↵6,9(5) 186180(48)(25)t z14 196 x 1 (49)285 (58) 196 293 296 274 233133 141 142 240 p(s244) ↵8,11 ↵6,10 223244 (19) (18)255257 187t 196 287 195 294 298 309298277 311 128 239 136 139 238each136 player has different7 tendencies.z Specifically,6 Federers152 x(41)1↵7,12(5) 182(46)(23) 247 x x 190190193 zT5293 z (33) (50) (60)292 206 301 133127 233 143calculate134 theace pdf (or of their eachfurther sweetz of behind143 these spot). the factors This baseline is and shown (need the in parameters to Figure check242z this?). 4.1 From this 221 216 z (45)↵6,7 zz53 245 187181188(47)(13) (48)197(50) 287(31) z 197 (52) 296 270275 299 each factor (e.g. shot232 location,234 winners speed, tend angle,143 to240 bex1 position hit from↵1 wider,2 etc.), onz both8 243241 the forehand and(41)(46)(6)↵↵1,2 (40) 2242 ↵5,7(6) 256 (26) z5 x 197 288 7 (50)286 (59) 294 297295 310 278 129 To learn a model137 for each140 player,137 we142 first create a shot ↵p1245,(2s) 153 8,12 ↵8,11(6) (5)245z4 (20) 183(19) 258 t 1(49) 1 191191194 2 196 207 299 299 312 134128 234240 144are calculated135 239 usingmodel, the8 Expectation-Maximization we144 can see the combinationz of (EM)factors243 which leads(42)↵6,5217 246(24) 248 188182189 z 198 z16288294 z (34) (51)198 (61)293 297 271 300 302 130 which yields a multi-dimensional233 235 backhand pdf for144 compared each shot to the↵ type others, for and7 he244 also tends tox be z2 222 ↵7,8 z4 184(47)257 z (14) (49) z198(32)289 287 (60) 298 276 311 database where138 each shot is141 labeled as138 either143 ace, fault,241z con- 2,3 z9 242 ↵ (42)(47)1 (7)↵↵8,213,3 (41)(7) 225z3 (21)(46)↵5,8(7)z6 259 (48)(27) (51)192192195 z8 197 (53) 295 296 279 313 135129 241 145algorithm.136 Maybe equation145most likely2145 of to GMM a player here, hitting 
followed a winner,↵2246,13,2 by unforced154 an error and ↵8,12218 (6)246z5 247(20) 189183190(50) z16 199 199295 T 1 (51) 199 208 272300 300 301 131 each235 player.To learn A a Gaussian model for236 Mixture each240further139 player, Model behind we (GMM) the first baseline create isused (need a shot to to checkz this?). From244 this (43)↵6,9223 185(25)258 249tS11 193 z27289(29)290 (52) 294 298 277 312 303 tinuing shot, winner or error.234142137 Given these144 labeledace (or shots,9 their we sweet↵2,p4( spot).s) 8 This245243 is shown in Figure(8) 4.(5) ↵8,5 x 191 (15)x 196 z (33)z (35)288198 (61) (62) 299297 130 139 146inference equationmodel, we146 can see242z the combinationz of10 factors which↵2247,4 leads155z(43)↵↵7,211,4 z3 (8) 226247z4 (22)(47)(8)Playerzz5 A (48)260 184(49)(28) ...(50)200(52)193 200 z 296209 301 301280 314 132136 calculatedatabase236242 the where pdf of each each shot237 of these is labeled241140 factors as either andAcez the3146 ace, parameters fault, con-Playerz B ↵2,3 245(48)2 (44)(42)↵8,13224219 (7)Playerz6 A 7 186248(21)259 250 190t(51) z7 194 290296291 9 (52) 200 (54)295 299 273278 313 302 304 can then learn probability235147 distribution143138 functions145 (pdfs) for 246 ↵5,7 Error (26) z1192 2 201197 T z18 (30) T 289199 (53) 300 131 140 most likely147 to a player10 hitting↵1,↵4 a1, winner,21st Serve9 unforced244 ↵1 error,4 156 and (9)↵ (6) (9) 227Winner↵(23)5,12 261 z185 (16)z 194 z 201(34) z (36) (62) (63) 210298 281 315 133137 tinuing243 shot, winner or139 error. Given141 these243 labeled147 shots, we 248 ↵7,112,4 220 248z (9)z6 187249260 191193 (51)195 297 201 297 274302 314302 303 eachare factor calculated237 (e.g. shot using location, the144238 Expectation-Maximization speed,242ace angle, (or146 their position sweetzz4 spot). 
etc.), This (EM) is shownz in Figure↵ 4.2,4 246z(44)(49)3 (45)z4(43)↵7,11225 (8) 5 (48) z8 (49)(22) 251 t(50) z (53)198 T 291292 z10 200 (55)296 300 279 305 141 236148 142 148 1 247245 157 ↵ z7 (27) x1(52) 38 202195196 z202(31) (53)290 211301299 134138132 can then learn probability140 distribution functions148 ↵ (pdfs)1,↵52,3 forp(s) ↵1,5 (10)↵(7)6,5 (5) 5(10),8221 228249 ↵(24)7,13 188250261262 x192186194 (17)z z 9 (35) (63)(54)202 275 315303282 304 316 whichalgorithm. yields238244 a Maybemulti-dimensional equation149145239 of pdf GMM for each here,149 shot244 followed type for by an10 249 ↵1,5 226 (10)z 203199 203292298293 1 (37) (64) 298 280303 142 237 141 243143147 z1 zx1 248 ↵1,4 247158(45) (46)(44)↵7,12 (9)z6 7 (23) 252t 1951(51) (52)196197 x 291201 297 212302 301 306 135139133 each factor (e.g. shot location, speed, angle,z5149 position↵ ↵ etc.),↵1,2246 ↵2,5 Playerz(50)4 (11)A Player(6)z5 B (11) 229z8 (49) z9 189(50)(28)262263 1931872(53) z9 (54) (54) (64) 203 (56) 300 316 283 317 eachinference player.245 A equation Gaussian150 Mixture146240 Model (GMM)150 is used to 2,52,4 ↵(8)6,9 227222 250 ↵(25)S16,10 251 z z (29)(18) 4 204200 z10204299(36)(32)294 (55) 276281 304 305 239 143 142 144148 245 p(s) 249 250Returns248(5) ↵2,5 Returns (11) 263 253 196 z 197198 293 2 (38)202 299 303 302 304 317 307 136 134 calculate the pdf of each238 of these factors244 and the parametersx 150 z2↵ 247 ↵↵5,16,5 159(46) (7)(47) ↵(12)6,5 (10)z z8 190(24) t 1881(52) (53) 1 292 204 298 213301 140 which yields a multi-dimensional147241 pdf151 for eachz16 shot↵5 type,↵61,4 forz1 2,3 (51)(12)↵(9)5,7 (45) 223 230251z7 (26) 252 264 1943 z 201 205 295 x (57) 277 305284 306 318 240246 144 151 143 145149 ↵ z(6)5 z6 228 9 ↵(50)8,11z10 (51) x 197(54)(19)105 205(55)198199 x 294300(33) (55)203 (56) 282 137 135 are calculated using the Expectation-Maximization152 246151 (EM) 1,2 250 251249160 ↵5,6 (12) 191 264 254 z189 Inz a naive Bayes206 Net,(37) we could infer the probability205 of 300214304 
303 305 318 308 141 each player. A Gaussian239148242144 Mixture245146 ModelDouble (GMM) is used toz3↵2,4248 ↵↵6,27,5 (8)(48) ↵(13)6,9224 (11)Player B Player B 253(25) 195198In a naive1 Bayes200 Net,202 we could296 inferT 1 the(39) probability293 of 299 302 278 307 145 152 150 z2 ↵6,↵71,5Playerx B (47)(13)↵(10)5,84. Shape(46)229 Constraint231252z8 (27) z9 265 41(53) (54)206199 2 204 283 306285 319 138 136 algorithm.241247 Maybe equation of GMM here, followed153 2477 by an↵2,3 1 250161(52)(7) z z10 192(52)265 255 z190(55)the(20) game-statex6 as follows:x207295301(34)x (56) (57) (58) 301215 304 319 309 142 calculate the pdf of240153 each149243145 of these147 factors andFault152 the parameters2nd Serve↵1,4251 ↵7252,8 z6 ↵6,7 (9)7 (14) Winner (51)(13)Error thez196199 game-state as207(56) follows:201203 (38)297 294 206 305 306 146 246151 z4 249 ↵5,6 (49) ↵5,2307225 (12)232 254(26) 266 z 200 (40)205 300 303 279284 286 308 320 139 137 inference242248 equation 146 154 z3 ↵27,4,↵82,5 (48)(8) (14)(11) S1 253z9 (28) z10 193(29)266 1912001(54) 2 (55) T 2081296302 T 320307 143 are calculated using154150244 the Expectation-Maximization148 248z8153 (EM)z2↵ 252 ↵ 251(53) 5. 
Experiments(47)(15) 256 x1975(56) x 208202204 x (35)298 x (57) (58)207 (59) 302 306 305 310 140 147 241 152 155 1,5250 8253,5 z ↵7,8 (10)z8 ↵ 226 233 (14) 194255(53)267 z 7 201 209(39) 295206 304 280307 287 309 321 138 249 147 247149 ↵18,4,↵55,6 z5 ↵6,7 (9)7 (15)(12) (50) 5,2318 (13)254 (52) (27)267 192201 z 203 303 (41) 301 285 321308 algorithm.243 Maybe155 equation151245 of GMM here,z followed154 by an 253 252(49) z 2(55) T 1 (56)209205 PT(297x 299z )P (1z ) 208 307 306 141144 148 242 148 156 249z49 z3↵2,5251↵5,12254 (54) In(11)We a(48) naive can(16) try227 Bayes it on10 hockey, Net, we basketball could195256268 inferand257 soccer thez1986202 probability data.x 2 202 of x210(36)n n Pz(xnn z296n)P (zn) (59) 303 305 281308 322 310 311 248150153 ↵1,5 (10) ↵8,5 232 234 (15) 268 z 8P (z x204)= (40)| (58)207 (65) (60)302 286 288 322 139 inference244250 equation156152 Initial155 Point↵5 ,State:12↵6,7 Servez6 4. Shape↵Continuous7,8 z8In Constraint(16) a(13) naiveState:(51)z 9 BayesRally Net,(14) we255End could (53)Point infer State the(54)(28) probability193 of 210n 206Pn (zn x298304n)= | 209 (65) 309 142145 246149 157 ↵ 254↵7,13 253 (12) (17)228 196 258 1992033 z | 2111 P300(xn) (42) 308 307 282 312 149 243153 151154 250z5 ↵ z 5,6252 255(11)(50) theWhat game-state type of233 experiments as follows: can we 257269 z7(56) T 203205207 z| (37) 2 P297(x208n) 304 306 287309 323 311 140 157 150 249 z10156 2,5 4 (55)↵(a)S15,12 (49) 235256 (29)(16) 269 z194204 x(b) 211 (41) z (59) (60)210 (61)303 310289 323 143146 245251 247 152 158 ↵7,13↵7,8 z7 ↵8,5the game-state(17)(14) (52)In as a naive follows:(15) Bayes Net, we could197 infer the200 probability9 of206 212299305301 150 244154 155 ↵6,7255↵6,10 254z9 (13)z10 (18)229 (54) 258(55) 259 4 z 204208 2 (43)298209 309 308 283 312 313 141 158 151 159Figure251z6157 2. ↵(a)5,6 The model2535. 
Experimentsof256 a (12)point(51) in tennis - given234 player236 B serves the ball andz 1958205it is not an1 ace212 or double213(38) fault,T 1 211 305 307 288310 290 144147 246252 248 250153 ↵6,10↵8,5 z5 (56)(18)↵(15)7,13 (50) 257 (17) 198 z201 207 z 300306(42)302 (61) 304 311 151 For our work156 we used the BayesNetz8↵ Toolbox7,8256↵↵8,511 [12],12 255 for the(14)(53) game-state(19)230 (16) as follows: P (x z )P (z ) 10z 205 z (60)210 (62) 310 309 284 142 245159155152 160 ↵6,7 4. Shape254 Constraint(13)In a naive Bayes235 Net, we could infer the259n probabilityn 260 n1962065 of 213209 214 299 3 308 289 313 314 145 249 154 player252158 A returns the ball back257 zto10 player B and the rallyP ( 237ensuesz258Px(x(55)n)= untilzn) Pplayer(199zn(56)) A or B hitsz9 a winnerz or an208 error.T 1 (b)(39)303 The (44) 212 306 311 312291 148 247253 152 model156153 estimation251 and inference.z7 ↵8 A,↵115 visualization,12 z ofWe the can pdfs(52) try(19)↵(16)6 it,10 on hockey,S1(20)231 basketballn n and(18) soccer data.(29)| 202207 2 (57)206210 z 301307 T (62) 305 285 143 160 157 161 ↵7,8 z69↵th8, 5257 ↵7,13the256(14) game-state(15)(54)P(51) asT(z follows:n xn(17))= | 260 Tz197(57) z 214 T-1215(43) z (61)211 3 311 310 314 146 246 250 155 probability159 of the i 255point-state at time T, z i , 236given the| current observation200P (xn261) x 6 and previous state209 z , can304 be 300 213 (63) 309 290 315 ICCV 149 248254 153 Inof a157 naive the154 different Bayes factors Net, we253 of thecould winners infer5. 
for Experimentsthe the probabilityWhat top258 3 type play- of of experiments can| we 238259 P (xnP) (xn zn)P (zn10203208) ICCV207211 T302308 (45) 307 312 313292 144 161 252156158 z8 ↵ ↵7,13 ↵5,12258 257(15)(53)(20)↵(17)8,11 (16) 232 (56)(19) 261 z198 T 1 215210 z (40) 212 (63) 306 312 311 286 315 147 247 251155 160 8,5 zz107 256 ↵6,10 (55)(52)237 (18)P (zn xn)= 201 | 262z 209 z (57) (44)305 (62)301 214 310 291 316 150 249255 154the game-stateers158 is shown as in follows:159 Figurecalculated 3.254 As we using can In seethe a from sum naive4. this and Shape259 Bayes figure, product Constraint Net, rule we (gray could 233nodes infer239 260are the|( observed probability) ( andP) ( ofclearxn) nodes2047 are hidden).208212 The state303309 with 213 (64) 308 287313 314293 #**** 148 145 157 ↵5,12↵6,10 We↵7,13259 can try it(16) on hockey,(18) basketball(17) and2 soccerP data.xn zn P 202z262n z199 #****211 (41) 313 316 248 252156 253 z9161 257 ↵ 258(54) (56)P (z 238x (19)) = (20)| 263 210(57) T (45)306 302 (64)215 307 311 312 292 317 151 For our work155 weeach used159 player the BayesNet158 has160 differentthe Toolbox highest tendencies. [12] probabilitythe for Specifically, game-statez8 is our Federers8,11 asin-point4. follows: Shape4. prediction. Shape Constraint(53)nConstraint 234n 240261 In a naive Bayes205 Net, wez could209212213 infer the probability(63) of214 288 315294 149 146 250256 ICCV 2011157 Submission255 #****.↵7 CONFIDENTIAL,13 What↵6, type10260 of experiments260 REVIEW(17) COPY. can we(18) DO NOT DISTRIBUTE.P (x ) 203263 2002118 304310 309 314 314 317 249 253 254 PIn(x a naivez↵8),11P ( Bayesz ) 258 Net, we259 could(19)2 infer the239| probability of n 264 z (42)307 303 308 312 313 293 318 152 model estimation156 andwinners inference.160158 tend159 A to161 visualization be hit fromz10 wider ofn then on pdfs bothn the forehand5. 
Experiments and(55) (20)241 206212 210213214 215 295 150 147 251257 P (z x )= ↵6|,10 z↵9 (18)(57) 4.(19) Shape(54) 235 Constraint262 the game-state204264 as follows:201 In a naive Bayes305311 Net, we could infer the probability of 289 316 318 161254 160nIn an naivethe256 game-state Bayes Net, we as8 could,11 follows:261 infer261 the260 probability(20) of 240 In a naive Bayes Net, we9 could infer the214215 probability of308 (64) 310 315 314 294315 151153 ofFor the our different work157 we factors used250backhand the of159 BayesNet the compared255 winners| Toolbox for to the the [12] others, top forP 3(x play- andn) he also259 tends to5. be Experiments5. ExperimentsP (x z )P (z ) 2 205 265 Inz207213 a naive Bayes211 Net, we could(43) infer the304 probability of 309 313 319 148 252258 the game-state as follows:↵8,11 We(19) can(56) try it on(20) hockey,n 236 basketballn 242263n and soccer265 data. 202 the game-state as follows:306312 290 317296 319 158 further255160 behind161 the baseline257 (need to checkz10 this?).262 From2624.261 this Shape Constraint(55)241 the game-state as follows: 10214 212215 309 311 316 315 295316 ICCV 152154 modelers is estimation shown in and Figure inference.251 3. As A visualization we256 can3 see of fromModeling the pdfs this figure, Player260 P Behavior(zn xn)=5. Experiments| ICCV 206(57) 266thez 208 game-stateP as(x follows:n zn)P (zn) 305 310 314 320 149 253259 161 What type(20)We of| experiments can try it can onP237( wex hockey,)243264 basketball and266 soccer203215 data. 307313(44) 291 318297 320 #**** 216 153 159 model,256 we can see the combination of factors263 which leads262P (xn zWen)P can(zn)242 try#****n it on2 hockey, basketball207 P (zn andxn)= soccer data.213| 270 310 (46) 317 316 296 155150 ofeach the different player has factors4. different Shape252 of the tendencies. 
winners Constraint for Specifically,the top258 3 play- Federers P (x261n zn)P263(zn) 2 (56) In a naive267 Bayes209204 Net, we could infer the probability306 of 312 315 317 321 254260 In a naive257 Bayes257 Net, we could inferP the(zn probabilityxn)=What5. Experiments of type| of experiments238 244265 can we(57) 267 P| (xn zn)P (znP) (xn) 308314(45)311 311 292 319298 321 217 154156 ersICCV is shown 2011 Submission in160 Figure 3. #****.most As CONFIDENTIAL welikely can to see a player from REVIEW this hitting figure,P COPY.(z an winner,x DOn)= NOT unforced DISTRIBUTE.264| error and What(57)We type2 can of243 tryexperiments it on hockey, can we208 basketball and210 soccer data.214271 P (x z )P (z ) 318 297 151 winnersFor tend our to work be hit253 we from used wider the BayesNet on bothTo model the259 Toolbox forehand player| [12] and behavior for 262P| ( xwen264) first263 haveP to(x constructn) 239 an accurate modelthe game-stateP of(z n howxn268)= asa point follows:205 is| played. At a coarseP ((46)x nlevel,zn)P a( znn)307n n 313 316 317 293318 322 155 each player255261 has161 differentthe game-stateace tendencies.258 (or their as258 Specifically, sweet follows: spot). Federers This is shown in Figure 4. 2 245266 T T 2091268 |T P (xn) 215 P (zn309315xn312)= | (65)312 320299 322 218 157 backhand compared5. Experiments to the others,In and a naive he also Bayes tends Net,4. to be Shapewe could265 Constraint infer264 theWe probability canWhat try type it of on244 of hockey, experimentsP basketball(z , canz we and, x soccer) 211 data. P (zn x272n)= | (65) 319 318 298 152 model estimation254 and inference. A visualizationpoint260 consists of the of pdfs three263 game265 states: 1)T theT initial1 T 240state (i.e. Inthe a 4.serve), naive Shape Bayes 2) the Constraint Net, 269 middle we206 could state infer (i.e. 
the the| probability rally which| ofP ( xcann) P308(xn) 314 317 294319 323 156 winners256262 tend to be hit fromS1 wider259 on259 both the forehand and P (z z , x245)=246267 210269 310316313 313 299 321300 323 158153 furtherof the behind different the baselinefactors of (need the4.winners game-stateto Shape check for this?). Constraintthe as follows: topFrom 3 thisplay-(29) 266 What265 type of experiments can we 212207 (66) 320 319 216 219 157 backhand compared toWe the255 can others,260 try and it he on alsoconsist hockey,4. tends261P Shape ( toofx basketball ben manyzn) ConstraintP ( shots),z andn) 264 soccer and266 3) data. the end| state. Given241 270 that4.the Shape there game-stateP was() Constraint 211no as follows: ace or ,P (x n thez273n ) end-pointP (zn) 314 state 309 315 318 295320 159 model,257263 we can see the combinationt P260(zn ofxnICCV factors)= which| 5. leads Experiments267 (57) 2 246 247268 P (z 213x )= | 311317 (65) ICCV 314 321 300 322301 217 158 154 furtherers behind is shown the baseline in Figure (needz 3. to As check we can this?). see From from this this figure,(30) 265 266 242 271P 5. Experiments212 n208n 319 320 296 220 What256 type261 of experiments5. Experiments| consists can262 weofP either(xn) player A 267or player B hitting a winner or an error. A depiction | of our pointP model(274xn) is given315 in 310 316 321 ICCV160155 mosteach258264 likely player to ahas player different hitting tendencies. a261 winner, Specifically, unforced error Federers and P (x268n zn)TP (Tzn1) T 247 248269 214209P (x z ICCV)P (z ) 312318 315 322 301 323302 218 159 model, we can see the combination262t of factors#**** which5. Experiments leadsT T 1 T 266P (z , z267 , x ) 243 272 5. 
Experiments213 n n n 316 #****3 3 320 321 297 221 S1 257x (29) PP(z(znz xn)=, x(31)We)= can| try it on hockey,(66)(57) basketball248 and soccerP data.(zn xn)= | 275 (46) 311 302 219 160#****161156 mostacewinners likely (or259265 their to tenda sweetplayer to be spot).hitting hit from This a winner, wider is262 shown unforcedWe onFigure both canin Figure error263 thetry 2(a), forehandit and 4.| on| which hockey, and shows basketball269P ( x that268P)() and ICCV we soccer 2011have Submissiondata. six possible #****.273 249 point-states CONFIDENTIALWe after214 can REVIEW the| try serveit COPY.on215210 hockey, {zP5 DO,...,z(x#**** NOT)10 basketball} DISTRIBUTE. and the and313319 transition soccer data. 316 317 323 322 303 t 4. Shape258 263 Constraint P267 n 268 244 T T T T 1 n 317 312 321 322 298 220 222 161 157 z t 1ICCV(30) 2011 Submission #****.What CONFIDENTIAL type of experiments REVIEW COPY. can DO we NOT249 DISTRIBUTE.274P250(x Wez can)P try(z215 itz on hockey,) 211 basketball and276 soccerT data. 303 304 ace (orbackhand their260266 sweet compared spot). This to264 is the shown others,What in Figure andprobabilities type he 4.of264 alsoWe experiments tends can between totry becan it we on different hockey,269 states basketballT isT given1 andT by245 a socceri,j. Given data. thatWhat our type observation of experiments or feature can we vector x 314320 contains318 318 299323 t z 263 (32) 269 3 317 323 221 158 further behind the259 baseline (need to check this?). From this 268 P (z z , x 250)=275 What| type of experiments| can212 we(67) 313 322 304 223 x261267 265 (31) information265 about the incoming shot (i.e. location, speed,251 4.angle, Shape player’sT ConstraintT 1feet position, number of shots277 in 315321rally319 etc.), 319 305 5. 
Experimentst 1 4.264 ShapeWhat Constraint type of experiments2 269T T canT weT 1| 246 P (x z ) 318 323 300 222 159 model,t 1 we can see260x the combination of factorsT whichT 1 leadsT(33)P (x z )P (z T-z1 ) 251 276 | 213 3 314 305 224 z 262268 266 (32) and216 Pwe(z knowz 2,thex )=previous| state z | (i.e. player(67) A or247 player252 B returned the ball), using our model topology278 316322shown320 in 270 301 306 223 265 266 T T 1 277 319 320 216160 mostt 1 likely to aWe player261 can1 hitting try it a winner,on hockey, unforced basketball| error and and soccerP (x z data. ) 252 5. Experiments 214 270 3 315 306 x 263269 z267 5.(33) Experiments217 (34) | 248 T253 3 317323321 271 302 307 224 225 161 Figure267 2(b), we can infer the probability of next state z278 being a returning shot, winner215 or error by 279using Bayes’ law 321 217 ace1 (or theirWhat sweet262type spot).268 of This experiments266 is shown in can Figure we 4. 253 271 T 322T 1 T 316 320 307 z264 1 (34) 218 T T 1 T T T T T 1 T T 1249 254 TWeT can tryT itT on1 hockey, basketball3 and soccerP (z318 data., z , x ) 272 303 308 225 226 x We canas P try ( 268z itz on , x hockey, )=( (35) P ( basketballx z ) P ( z and z soccer )) /P ( x data. z 254 ) . 
This279P ( expressionx zT )PT ( describesz1 Tz )a dynamicT T 1BayesianT Network280 (DBN), 322 308 218 263 269 267 | | | S1T T 1 | T P (z (29), z , x ) P (z z , x 272)= 323 317 (66) 321 226 x2651 (35)S1 219 P (zP (zzT zT ,1x, xT)=250)=280255What| type of experiments| can| we(68) 319P () 273 304 309 227 264 2 What typewhere of269 experiments the next state can(29) weis conditioned on3t the previous255 state, butP (x seeingT zT that1) there is(66) only one possible281 state for the 318 323 309 227 219 2662 z 268 (36) 2 z | | 251 281256 (30)P () 273 P 320 322 305 310 z (36)t returning220 shot and we observeT T thisT (otherwiseT 1 the256 point Pis over), we| can simplify this into a Bayesian Network (BN), 274 310 228 220 265 3 z T T 1 T (30)P (x z )P (z z t) 252 274 282 319 306 228 2673 269 221P (z z , x )= | | (68)3 282257 321 275 323 311 z z (37)t which yields:| (37) P (xT zT 1) x 257 (31) 311 229 229 221 266 x 253 283 275 283T T T T 1320 307 2684 (31) | t 1 258 322 312 z 4 (38) 222 258 T T T T 1 3 T T 1 T P (x z )P (z z ) 276 312 230 222 267z t 1 (38) z 254 284P (x Tz T)(32)P (z Tz ) P (z z , x )=276 | (1)| 321 (67) 308 230 269 z (32) T T T1 TT P259(x z )P (z ) 284 323T T 1 313 231 5 223 Pt(z1 z , x 259)= 285 | | (67)| P (x z ) 277 313 223 z 268 5 (39) T T T P (z x )=255 | T T3 1 (69) 277 322 309 231 z t 1 (39)P (x z )P (z )x | 260 P (x(33)Tz ) 285 | 314 232 6 x 224 P (zT xT )=(33) | 260 286 P (x ) 278 314 224 z 269 (40) | T 1 (69) 256 | 278 323 310 232 6 | P (x ) z 261 261 (34) 286 315 315 233 7 1 225 3 287 279 225 z z (41)z (40)(34) 257 279 311 234 233 1 262 288262 287T T T T 1 316 316 8 7 1 226P (x ✓)= M w G(x; µ ⌃4.) 
Shapex Constraint258 T T (35) T T 1 T TT TT 1 T P (x z )PT(z Tz ) 280 312 235 226 z z (42)x Depending onk=1 the(41)k player(35) k returningk theT shot,T 1 weT 263 infer289P 263the(x probabilityz )P (z zof {z) P5, (zz 7, zz 8}for, x player)=280 A and| {z 6, z| 9, (68) 317 317 234 227 | P (z2 z , x )=2593 | | (68) 288 T T 1 281 313 236 227 9 8 2 T z T 290 T(36)T 1 | 281 P (x z ) z (43)z z 10} for playerP B - and(36) our prediction is| the state,264 z i, with264 Pthe(x highestz )probability. To learn a model for each| player, 318 318 235 z 228 (42) 5. Experiments3 260 | 289 282 314 237 228 10 1 1 265 291 282 319 z 9 (44)3 we first created a shot database whereTz each1 shot was261 labeled265 as(37) either an ace, fault, continuing shot, ground stroke 315 319 238 236 z G229(x; µk⌃k)= d (37)1 exp (x µ) ⌃ (x µ) 292 290 283 229 z 2 (43)2 4 266 283 320 z1 (45) winner or ground2⇡ ⌃ kstroke error.2 We Given can these try itlabeled on hockey, shots, we basketball then learnt and probability soccer data.distribution functions (pdf ’s) 239 4 230 ✓ z ◆ 262 293266 (38) T T T 284 316 320 237 10 z | |(38) (70) 267 T T T T T P291(x z )P (z ) 321 230 x z (44) What type of experimentsT T 263 P can(x wez )P (z ) P (z x )=284 | (69) 317 240 1 (46) for231 each variable (e.g. shot location, speed,5 P (z angle,x )= position,294267 |feet position, number (69)of shots), which yieldedT a multi- 285 321 238 231 z5 (39) z 268 T(39) | 285 292P (x ) 322 241 z2 (47) | 264 295268 P (x ) 318 322 z1 dimensional4.232 Shape Constraint pdf (45) for each shot type for6 each player.269 To obtain a continuous distribution for each variable !, we 286 323 242 239 232 6 z 265 296 (40) 286 293 319 z3 (48)z (40) 269 M 323 233 x represented233 the probabilities as a Gaussian7 MixtureM Model (GMM), where P ( x ✓ )= w 287 k G ( x ; µ k ⌃ k ) given 287 243 240 1 7 5. 
Experiments (46) P (xz✓)= k=1 w266kG(297x; µk⌃k)(41) k=1 294 320 z4 (49)z 234 (41) | | 288 244 234 that the GMM has the form: 8 267 298 288 3 321 241 z z2 (50)8 We can try it on(47) hockey, basketball andz soccer data. (42) P 295 245 235 5 z 235 (42) P 268 299 4. Shape Constraint289 (2) 3 289 322 246 242 What type of experiments can we 9 300 296 236 z6 z3 (51)9 236 (48) z 1269 (43)1 T 1 290 290 323 z (43) G(x; µk⌃k)= exp (x µ) ⌃ (x µ) 247 243 10 d 1301 297 237 z7 (52)10 237 2⇡ 2 ⌃k 2 2 5. Experiments 291 291 248 z4 z (49)(44) z 302 ✓ (44) ◆ 244 z (53) 238 | | (70) 298 292 249 238 8 303 292 3 z5 z1 (50)(45) z1 (45) We can try it on hockey, basketball and soccer data. 250 245 239 z9 (54) 239 304 293 299 293 where is the mean, ∑ is the covariance and = {(w1; 1, ∑1), ... , (wM;What M, ∑ typeM)}are of experimentsthe parameters can of we the GMM " x1 # " (46) " 251 246 240 z10 z6 (55)x1 240 (51)(46) 4. Shape Constraint305 294 300 294 252 1 for M mixtures. The parameters of are learnt using the306 Expectation Maximization (EM) algorithm. It is important 247 241 x z (56)z2 241 (47) #z2 (47) 295 301 295 253 7 (52) 5. Experiments 307 248 242 x2 (57) to242 note these parameters as they allow us to adapt to specific opponent behavior, which we describe296 302 in Section 5. 296 254 z3 (48) z3 308 (48) T 1 z8 (53) 249 243 x (58) 243 We can try it on hockey, basketball and soccer data. 297 303 297 255 z4 (49) z4 309 (49) 256 250 244 xT z9 (59) In244 this paper, we(54) are only concernedWhat type with of experiments shots within310 can the we rally (i.e. all non-serve behavior).298 The304 pdf ’s for the 298 257 245 1 z5 245 (50) z5 311 (50) 299 299 251 z z10 (60) incoming shot preceding(55) winners are shown in Figure 3. Before, we analyze these plots it is wise to305 revisit Figure 1 258 246 2 z6 246 (51) z 312 (51) 300 300 252 z 1 (61) to get a better understanding of what 6the different variables mean. 
Given the opponent has hit the 306ball to the player 259 247 x (56) 313 301 T 1 z7 247 (52) 301 260 253 z (62) of interest - which we call the incomingz7 shot - the shot314 impact location(52) refers to the (x, y) location of307 where the ball 248 2 248 302 302 261 zT x (63)z (57)(53) 315 254 249 8 z8 (53) 303 308 262 T 1 249 2013 Research316 Paper Competition 303 255 x (64)z9 (58)(54) 309 263 250 250 z9 Presented317 by:(54) 304 304 T 264 256 251 x z10 251 (59)(55) z 318 (55) 305 310 305 In a naive Bayes Net, we could infer the probability of 10 265 257 252 1 1 1 319 306 311 the game-state as follows: z x 252 (60)(56) x (56) 306 266 253 320 307 258 2 2 253 312 307 267 z x (61)(57) x2 321 (57) 254 254 308 308 268 259 P (x z )P (z ) T 1 322 313 255 n n n T 1 x (58) T 1 309 P (zn xn)= | z (65) 255 (62) x (58) 309 269 260 | P (x ) T 323 314 256 n x (59) T 310 261 T 256 x (59) 315 310 257 z 1 (63) 311 z 1 262 3 257 (60) z (60) 316 311 258 2 (64) 312 z 258 (61) 2 312 263 259 z (61) 313 317 T 1 259 313 264 z (62) T 1 318 260 z (62) 314 In a naive Bayes Net, we couldT infer260 the probability of 314 265 261 z (63) 315 319 261 zT (63) 315 266 the262 game-state as follows: 316 320 262 (64) 316 263 (64) 317 267 263 321 317 264 318 268 In a naive Bayes Net,P (x wen z couldn)P (z264 infern) the probability of 322 318 265 P (zn xn)= | (65) 319 269 265 In a naive Bayes Net, we could infer the probability of 323 319 266 the game-state| as follows: P (xn) 320 266 the game-state as follows: 320 267 321 267 321 268 P (x z )P (z ) 322 P (z x )= n n n 3 269 n n | 268 (65) P (xn zn)P (zn) 323 322 | P (xn) P (zn xn)= | (65) 269 | P (xn) 323 3 3

[Figure 3 panels: for each of Djokovic (D), Nadal (N) and Federer (F), pdf's over impact location, shot speed, shot direction, feet location (player of interest) and feet location (opponent).]

Figure 3. The probability distribution functions (pdf's) of winners for Djokovic (D), Nadal (N) and Federer (F) in the 2012 Australian Open with respect to various variables.

lands prior to the player hitting the ball. The speed and angle of the incoming shot are also quantified. The feet location refers to the location of each player at the moment the incoming shot impacts the court. In this work we did not differentiate between ground-strokes, volleys and smashes, as the multiple modes within the GMM encode this information (these strokes have distinct speed and impact-location profiles, as the analysis below makes clear). In terms of unique player characteristics, these plots make for some interesting analysis. First, it can be seen that Federer tends to hit his winners from balls that land closer to the boundary widths of the court compared to Djokovic and Nadal, which may allow him to generate more angle on his winning strokes. It is also evident that Federer tends to stroke more winners from volleys than Djokovic and Nadal, which can be inferred from his feet location when striking the winners. Furthermore, in comparison to Djokovic and Nadal, Federer frequently hits winners from further inside the baseline while his opponents are pressed significantly deeper behind their own baseline. Many of Nadal's winners are characterized by his opponents playing from their left side (the backhand side for right-handed players), which may indicate his preference for backhand rallies. In contrast, Djokovic tends to stroke winners while his opponent is positioned more to the right-handed forehand side (albeit more subtly). The shot-speed profiles also indicate that Federer and Djokovic achieve a number of winners via over-head smashes, which may be inferred from the small peaks at the upper end of the speed spectrum. The combination of variables in a rally that is most likely to lead to a winner for a specific player - the "sweet-spot" - can be visualized as in the examples shown in Figure 4.
In terms of general trends, when we compare the winners (left) to the errors (right), we can see that the impact location of the incoming shot is much deeper and the shot itself significantly quicker.
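As a sketch of the modeling machinery described above - one GMM per shot outcome, inverted with Bayes' rule to score an incoming shot - the following uses synthetic features as hypothetical stand-ins for the Hawk-Eye variables (impact x/y, speed, angle); it is not the authors' data or code:

```python
# Sketch (not the authors' implementation): fit one GMM per shot outcome
# and invert with Bayes' rule to get P(outcome | incoming shot).
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Synthetic incoming-shot features per outcome: [impact_x, impact_y, speed, angle]
shots = {
    "winner":       rng.normal([3.5, 11.0, 95.0, 10.0], 1.0, size=(200, 4)),
    "error":        rng.normal([3.8, 12.5, 105.0, 12.0], 1.0, size=(200, 4)),
    "continuation": rng.normal([2.0, 9.0, 80.0, 5.0], 1.0, size=(800, 4)),
}

# P(x | z): one GMM per game-state z, parameters learnt via EM.
gmms = {z: GaussianMixture(n_components=3, random_state=0).fit(X)
        for z, X in shots.items()}
# P(z): class priors from the training counts.
n_total = sum(len(X) for X in shots.values())
priors = {z: len(X) / n_total for z, X in shots.items()}

def posterior(x):
    """P(z | x) proportional to P(x | z) P(z) for one incoming shot x."""
    lik = {z: np.exp(g.score_samples(x[None]))[0] * priors[z]
           for z, g in gmms.items()}
    total = sum(lik.values())
    return {z: p / total for z, p in lik.items()}

# A deep, fast incoming shot should score highest as a likely error.
p = posterior(np.array([3.6, 12.4, 104.0, 12.0]))
```

Under these synthetic densities, the deep and fast test shot is assigned to the "error" state, mirroring the general trend noted above.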

[Figure 4 panels, each annotated with the incoming-shot speed (81.1, 86.6, 89.5, 97.4, 101.3 and 109.2 km/h) and the rally length (8-13 shots).]

Figure 4. "Sweet-Spot" - these visualizations show the incoming shot that gives the highest probability of (left) hitting a winner, and (right) causing an error - these shots are hit deeper and with more pace.

2013 Research Paper Competition
Presented by:
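The "sweet-spot" search itself can be sketched as a grid search over incoming-shot variables. The densities below are hypothetical stand-in Gaussians over (impact depth, speed) rather than the fitted player GMMs, and the units and parameters are assumed for illustration:

```python
# Sketch: locate the "sweet-spot" - the incoming shot most typical of those
# preceding winners - by grid search, and report the winner-vs-error
# posterior there (equal priors). Densities are hypothetical stand-ins.
import numpy as np
from scipy.stats import multivariate_normal

p_win = multivariate_normal(mean=[9.0, 85.0], cov=[[1.0, 0.0], [0.0, 25.0]])
p_err = multivariate_normal(mean=[12.0, 105.0], cov=[[1.0, 0.0], [0.0, 25.0]])

# Grid over plausible incoming shots: depth in metres, speed in km/h.
depth = np.linspace(6.0, 14.0, 81)     # step 0.1 m
speed = np.linspace(70.0, 120.0, 101)  # step 0.5 km/h
D, S = np.meshgrid(depth, speed, indexing="ij")
grid = np.stack([D.ravel(), S.ravel()], axis=1)

dens_win = p_win.pdf(grid)
best = int(np.argmax(dens_win))
sweet = grid[best]                                        # most winner-typical shot
post_win = dens_win / (dens_win + p_err.pdf(grid))
p_at_sweet = post_win[best]                               # winner posterior there
```

With these stand-in parameters the search recovers the mode of the winner density, where the winner posterior is close to 1.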

4 In-Point Prediction

In order to validate our player models, we conducted a series of experiments to measure the accuracy of next-stroke predictions at any point in a rally - which we call "in-point prediction". As we are only interested in the non-service behavior of a player (i.e. no aces or double faults), "in-point prediction" refers to predicting whether the next shot is: i) a winner, ii) an error, or iii) a return (i.e. continuation of the point). For our experiments, we generated player models for Djokovic, Nadal and Federer and tested these models on two matches, Nadal vs Federer (semi-final) and Djokovic vs Nadal (final). To ensure there was no overlap between training and testing data, for the Federer and Djokovic models we trained on all their matches except those against Nadal. Similarly for Nadal, we created two models: the first was tested against Djokovic and trained on all the other matches he played in; the second was tested against Federer and again trained on all other matches. As there are many more continuation shots than winners or errors, the overall agreement between correctly classified shots can skew the results. To counter this we used the receiver operating characteristic (ROC) curve, which plots the hit rate against the false-positive rate. From these curves, we used the area under the ROC curve (AUC) to assess performance. The AUC ranges from 0.5 (pure chance) to 1.0 (ideal classification). In Table 2(a), we show the aggregate shot prediction performance for winners and errors.
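A minimal illustration of why AUC, rather than raw accuracy, is the appropriate score under this class skew (synthetic scores and an assumed class ratio, for illustration only):

```python
# Sketch: continuations vastly outnumber winners, so a classifier that always
# predicts "no winner" is ~95% accurate yet useless. AUC, which ranks
# positives against negatives, is insensitive to that skew.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
y = np.concatenate([np.ones(50), np.zeros(950)])       # 50 winners, 950 others
scores = np.concatenate([rng.normal(0.6, 0.15, 50),    # winners score higher
                         rng.normal(0.4, 0.15, 950)])  # on average

always_continue_acc = (y == 0).mean()   # 0.95 accuracy from a trivial predictor
auc = roc_auc_score(y, scores)          # chance = 0.5, perfect = 1.0
```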

As can be seen from the results, the impact location of the incoming shot was the best single predictor of a winner, while the feet location was the second best. When combining factors, speed + impact location + feet location gave the best performance, which makes sense given our analysis in the previous section. Even though the performance improves, the overall predictive power for winners is still quite poor, but the lack of opponent modeling can explain this (we look at this in the next section). It is also interesting to note that adding the number of shots in the rally diminishes performance. As the variance of this variable is very high, coupled with the fact that we have relatively few examples, it is probable that we have severely undertrained this variable, which can explain the noisy results. For errors, a similar trend is observed, with the best predictor achieving an AUC of 76.09%, which is much higher than the winner rate of 68.52%.

5 Opponent Modeling

As the models used in the previous section do not model specific opponent behavior, this represents an obvious area of improvement, as the behavior or tactics of a player are heavily dependent on the opponent and the court surface (e.g. Nadal's behavior in a match against another "base-liner" such as Djokovic on a clay court is likely to be a poor predictor of his behavior against "serve-and-volleyer" Federer on a grass court). Obviously, the best model

Table 2. (a) Performance of our player models for predicting winners and errors using a "one-versus-everyone-else" or UBM model. (b) Adapting the player models to specific opponents yields better performance (N.B. there was no feet location data in our adaptation data - hence the absence of those results).

(a) Universal Background Model Prediction Accuracy (AUC)

Shot Variable                              Winner   Errors
Speed                                       52.49    61.63
Angle                                       54.13    54.76
# Shots in Rally                            55.51    59.61
Feet Location                               61.27    60.94
Impact Location                             65.29    59.76
Speed + Angle                               55.51    58.75
Speed + Impact Location                     63.62    61.12
Speed + Impact Loc + # Shots                60.04    60.39
Speed + Impact Loc + Feet Loc               68.52    76.09
Speed + Impact Loc + Feet Loc + # Shots     57.83    68.60

(b) Adapted Model Prediction Accuracy (AUC)

Shot Variable                              Winner   Errors
Speed                                       56.45    62.65
Angle                                       60.36    54.47
# Shots in Rally                            60.30    68.21
Impact Location                             70.30    63.29
Speed + Angle                               58.67    57.74
Speed + Impact Loc                          77.28    71.85
Speed + Impact Loc + # Shots                65.41    70.94

of future performance is going to be one that is trained on data gathered under the same conditions (i.e. same opponent, court surface etc.). However, this is problematic, as obtaining enough data to adequately train a model is extremely difficult: players may only play each other a couple of times a year, and often on different surfaces. A method to resolve this issue is to employ model adaptation techniques, such as those commonly used in speech and speaker verification tasks [13]. Unlike the standard approach of maximum-likelihood training, adaptive models "adapt" the parameters of an initial model or Universal Background Model (UBM) to held-out data which is indicative of the test data. To do this, our held-out data consisted of two matches from previous tournaments which were the nearest examples to the Australian Open conditions in our library: 1) Djokovic vs Nadal at a previous hard-court tournament, and 2) Federer vs Nadal at a previous grass-court tournament. Using this adaptation technique (see Appendix A for a full description), we were able to greatly improve the prediction performance for winners, from 68.52% to 77.28% - as can be seen in Table 2(b). As the held-out data did not contain the feet location of the players, we were constrained to using only the ball trajectory information, which can explain the diminished performance for predicting errors. Again, the number of shots in the rally had a similar impact, which suggests that this variable is undertrained. When adapting the player models, we found that the court surface had a significant impact on performance. For example, when we adapted our Nadal vs Djokovic model to a match these players contested on a clay surface, performance dropped considerably, which suggests that the court surface has a large impact on how a match is played (a commonly held belief).
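The UBM adaptation step (described in Appendix A) can be sketched as follows, using the standard MAP-adaptation recipe in the style of Reynolds et al. [13] for the weights and means, with an assumed relevance factor of 16; this is not the authors' exact implementation, and the data is synthetic:

```python
# Sketch of UBM adaptation: move the UBM's weights and means toward the
# sufficient statistics of a small held-out ("adaptation") set, with a
# data-dependent coefficient a_k balancing old vs new estimates.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)

# UBM trained on plenty of "all other matches" data (two synthetic clusters).
ubm = GaussianMixture(n_components=2, random_state=0).fit(
    np.concatenate([rng.normal(0.0, 1.0, (500, 2)),
                    rng.normal(5.0, 1.0, (500, 2))]))

# Small opponent-specific adaptation set, shifted relative to the UBM.
X_adapt = np.concatenate([rng.normal(0.8, 1.0, (40, 2)),
                          rng.normal(5.8, 1.0, (40, 2))])

resp = ubm.predict_proba(X_adapt)           # P(k | x_t) under the UBM
n_k = resp.sum(axis=0)                      # soft counts per component
E_x = (resp.T @ X_adapt) / n_k[:, None]     # E_k(x), data mean per component
a_k = n_k / (n_k + 16.0)                    # adaptation coefficient, relevance 16

w_new = a_k * (n_k / len(X_adapt)) + (1 - a_k) * ubm.weights_
w_new /= w_new.sum()                        # renormalize the adapted weights
mu_new = a_k[:, None] * E_x + (1 - a_k[:, None]) * ubm.means_
```

Because a_k lies in (0, 1), each adapted mean is a convex combination of the UBM mean and the adaptation-data mean, so components with little adaptation data stay close to the UBM.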

From these experiments it can be seen that reasonable "in-point prediction" can be obtained using spatiotemporal data. An obvious application of this work is within a broadcast environment, where this type of approach can be used for real-time (or near real-time) analysis. The idea is that, given the trajectory information of the incoming shot, we can predict not only the most likely outcome of the shot but also the position where we think the shot will go. In terms of in-depth analysis, we can determine whether the shot played fell within or outside the expected distribution for that player. If it fell outside our expectation, this "anomalous" behavior could trigger further analysis of the shot.

6 Summary and Future Work

Rich sources of player and ball tracking data are emerging in sports such as tennis, and these are well suited to novel analyses using probabilistic graphical models. In this paper, we employed a Bayesian framework to build a stroke-by-stroke model that is predictive of the outcome of individual points in top-level tennis matches. This analysis creates novel insights into the playing styles of individual players; in particular, we identify the "sweet-spot" for three of the top male tennis players in the world. Our modeling approach demonstrates superior performance using adaptive techniques, which allow greater sensitivity by tuning the model to specific match parameters such as court surface. These results are insightful for coaches hoping to discover critical points of strength and weakness in opponents. Furthermore, the dynamic and intuitive nature of the analysis has excellent potential to enhance the in-game viewer experience for spectators.

As noted by Nadal, tennis may be thought of as a very fast moving chess match, where players systematically maneuver their opponent into a position of weakness before attempting to win the point. Therefore, it is important in future work to extend the scope of model parameters to include deeper analysis of the sequence of strokes that ultimately lead to a winning shot. Analysis of that type requires exponentially larger data sets, however, the popularity of technologies such as Hawk-Eye means that “big data” modes of analysis will become more feasible.

7 References

[1] R. Nadal and J. Carlin, "Rafa", Hyperion, 2011.
[2] Hawkeye Innovations. http://www.hawkeyeinnovations.co.uk.
[3] IBM SlamTracker, 2012. www.australianopen.com/en_AU/ibmrealtime/index.html.
[4] S. Intille and A. Bobick, "A Framework for Recognizing Multi-Agent Action from Visual Evidence", in AAAI, 1999.
[5] R. Li, R. Chellappa and S. Zhou, "Learning Multi-Modal Densities on Discriminative Temporal Interaction Manifold for Group Activity", in CVPR, 2009.
[6] V. Morariu and L. Davis, "Multi-Agent Event Recognition in Structured Scenarios", in CVPR, 2011.


[7] D. Stracuzzi, A. Fern, K. Ali, R. Hess, J. Pinto, N. Li, T. Konik and D. Shapiro, "An Application of Transfer to American Football: From Observation of Raw Video to Control in a Simulated Environment", AI Magazine, vol. 32, no. 2, 2011.
[8] M. Perse, M. Kristan, S. Kovacic and J. Pers, "A Trajectory-Based Analysis of Coordinated Team Activity in Basketball Games", Computer Vision and Image Understanding, 2008.
[9] R. Maheswaran, Y. Chang, A. Henehan and S. Danesis, "Deconstructing the Rebound with Optical Tracking Data", in MIT Sloan Sports Analytics Conference, 2012.
[10] K. Goldsberry, "CourtVision: New Visual and Spatial Analytics for the NBA", in MIT Sloan Sports Analytics Conference, 2012.
[11] P. Lucey, A. Bialkowski, P. Carr, E. Foote and I. Matthews, "Characterizing Multi-Agent Team Behavior from Partial Team Tracings: Evidence from the English Premier League", in AAAI, 2012.
[12] K. Murphy, "The Bayes Net Toolbox for MATLAB", Computing Science and Statistics, vol. 33, 2001.
[13] D. Reynolds, T. Quatieri and R. Dunn, "Speaker Verification using Adapted Gaussian Mixture Models", Digital Signal Processing, 2000.

A Appendix - Model Adaptation

Given that we want to train a specific model between two players, but do not have enough data to adequately train the model, we can use a method which is widely used in the speech and speaker recognition communities called "model adaptation" [13]. The core idea behind model adaptation is to "adapt" the parameters of an initial model or Universal Background Model (UBM) to a held-out or "adaptation" dataset which should be indicative of what we can expect to see in the test data. So given the initial model or UBM parameters θ_UBM = {(w_1, μ_1, Σ_1), ..., (w_M, μ_M, Σ_M)}, we can find the parameters of the held-out matches θ_Adapt = {(w_1, μ_1, Σ_1), ..., (w_M, μ_M, Σ_M)}. We then update the parameters of the UBM using the following equations:

w_k* = a_k^w (n_k / n) + (1 - a_k^w) w_k

μ_k* = a_k^m E_k(x) + (1 - a_k^m) μ_k

c_k* = a_k^v E_k(x²) + (1 - a_k^v)(c_k + μ_k²) - μ_k*²

where w_k*, μ_k*, c_k* are the new parameters of the adapted model, and a_k^p, p ∈ {w, m, v}, are used to control the balance between the old and new estimates for the weights, means and covariances. This is a form of regularization, which guarantees improved generalization.
