
Assessing Decision Rules For Stopped Cricket Games

Ahsan Bhatti, B. Sc., A. Stat

A project

Submitted to the School of Graduate Studies

in Partial Fulfilment of the Requirements

for the Degree of

Master of Science

McMaster University

© 2015 Ahsan Bhatti

Master of Science (2015), Statistics
McMaster University, Hamilton, Ontario

TITLE: Assessing Decision Rules For Stopped Cricket Games

AUTHOR: Ahsan Bhatti

SUPERVISOR: Professor Ben Bolker

NUMBER OF PAGES: ix, 75

Abstract

In interrupted cricket games, teams do not always get a chance to finish the full game. In such cases, different methods can be used to establish which team wins. The method currently in use is the Duckworth-Lewis method, which was introduced at the international level in 1998. The purpose of this project is to investigate the accuracy of the Duckworth-Lewis method and to compare it statistically with other proposed methods. Accuracy is checked via bias estimation, Cohen’s Kappa and root mean square error; cross-validation is used to assess out-of-sample accuracy. The resource table, a summary of the expected fraction of total runs scored by a given point in the game, has missing values and monotonicity flaws. To improve the resource table, different statistical methods, such as isotonic regression and Gibbs sampling, are used to construct different resource surfaces. The results show that the Duckworth-Lewis method has the lowest accuracy of all the methods, whereas the improved Duckworth-Lewis method is more accurate at predicting the new target or the result of stopped games. Thus, the accuracy of the Duckworth-Lewis method can be improved by ignoring the old games and constructing the resource surface using modern data.

Acknowledgements

First of all, I would like to express my appreciation to my supervisor, Ben, for being a supportive, generous and compassionate person. It was a wonderful experience to work under his supervision.

I would also like to sincerely thank Dr. Viveros and Dr. Feng for being my committee members.

Special thanks to my family and friends for being supportive at all times. I appreciate it more than you can even imagine.

Contents

1 Introduction
1.1 History and Formats of Cricket
1.2 Project Motivation
1.3 Cricket Rules and Old Methods
1.3.1 Average Run Rate
1.3.2 Most Productive Overs
1.3.3 Discounted Most Productive Overs
1.3.4 PARAB
1.3.5 Clark Curves
1.3.6 Duckworth-Lewis (D/L) Method

2 Constructing Resource Table and Surfaces
2.1 Data Collection
2.2 New Resource Tables
2.2.1 Mean of Ratios (R)
2.2.2 Optimization via D/L Method (Improved D/L Method)
2.2.3 Isotonic Regression
2.2.4 Gibbs Sampling
2.3 Comparison of Resource Tables
2.3.1 R vs. Duckworth-Lewis
2.3.2 Isotonic Regression vs. Duckworth-Lewis
2.3.3 Gibbs Sampling vs. Duckworth-Lewis
2.3.4 Improved Duckworth-Lewis vs. Isotonic Regression
2.3.5 Improved Duckworth-Lewis vs. Gibbs Sampling
2.3.6 Improved Duckworth-Lewis vs. Duckworth-Lewis

3 Accuracy Check
3.1 Cohen’s Kappa
3.2 Root Mean Square Error
3.3 Stoppage Distribution

4 Discussion

A Basic Rules of Cricket
A.1 Ways to dismiss a batsman
A.2 Ways to score runs

B Clark Curves

C Difference between T20I and 50 overs game

D Scraping Code

E 2 Resource Table and Resource Surfaces for 50 overs games

F 20 Resource Tables
F.1 First Innings Tables
F.2 Second Innings Tables
F.2.1 Isotonic Regression
F.2.2 Optimization on R

List of Figures

1.1 Runs Achievable vs Overs Remaining using PARAB Method
1.2 Plot of Duckworth-Lewis resource table for the average number of runs scorable with wickets lost and overs remaining
2.1 Heatmap (levelplot) of a Resource Table
2.2 Heatmap (levelplot) of Standard Deviation of a Resource Table
2.3 Heatmap (levelplot) of difference of Resource Table of both innings
2.4 Heatmap (levelplot) of Optimized R Resource Surface via Duckworth-Lewis Method
2.5 Heatmap (levelplot) of Isotonic Regression Resource Surface
2.6 Heatmap (levelplot) of Gibbs Sampling Resource Surface
2.7 Comparison of R Resource Table and Duckworth-Lewis Resource Surface
2.8 Comparison of Isotonic Regression and Duckworth-Lewis Resources
2.9 Comparison of Gibbs Sampling and Duckworth-Lewis Resources
2.10 Comparison of Isotonic Regression and Improved Duckworth-Lewis Resource Surfaces
2.11 Comparison of Gibbs Sampling and Improved Duckworth-Lewis Resource Surfaces
2.12 Comparison of Improved Duckworth-Lewis and Duckworth-Lewis Resource Tables
3.1 Kappa values along with the CI for different methods for 50 overs
3.2 Out of sample (unweighted) Kappa values along with the CI for different methods for all overs combined
3.3 RMSE values along with the CI for different methods for 50 overs
3.4 Out of sample (unweighted) RMSE values along with the CI for different methods for all overs combined
3.5 Bias values for different methods for 50 overs
3.6 Stoppage probability for all 50 overs
3.7 Average stoppage distribution for 50 overs
4.1 Heatmap (levelplot) of difference between D/L and improved Duckworth-Lewis Resource Surfaces
4.2 Difference between Low, Mid and High runs Resource Tables
4.3 Out of sample RMSE values along with the CI for different methods for all overs combined
4.4 Kappa estimates along with the CI for different methods for 20 overs games
4.5 RMSE values along with the CI for different methods for 20 overs games
B.1 CLARK Curves
B.2 Stoppage Type 2
B.3 Stoppage Type 3
B.4 Stoppage Type 5
C.1 Resources available in T20I with wickets lost and overs remaining
F.1 Heatmap (levelplot) of R Resource Table for Innings 2
F.2 Heatmap (levelplot) of Isotonic Regression Resource Table for Innings 2
F.3 Heatmap (levelplot) of Optimized R Resource Table for Innings 2

List of Tables

3.1 Average stoppage distribution for 50 overs
3.2 Duckworth-Lewis Resource Table for 50 overs
3.3 R Resource Table using 50 over games
3.4 Optimized R Resource Surface using 50 over games
3.5 Isotonic Regression Resource Surface using 50 over games
3.6 Gibbs Sampling Resource Surface using 50 over games
C.1 Duckworth-Lewis Resource Table
E.1 R Resource Table using 50 over games for Innings 2
E.2 Standard Deviation of R Resource Table using 50 over games for Innings 1
E.3 Isotonic Regression Resource Table using 50 over games for Innings 2
E.4 Optimized R Resource Table using 50 over games for Innings 2
E.5 Gibbs Sampling Resource Table using 50 over games for Innings 2
F.1 R Resource Table
F.2 Isotonic Regression Resource Table
F.3 Optimized R Resource Table
F.4 Gibbs Sampling Resource Table
F.5 R Resource Table for Innings 2
F.6 Isotonic Regression Resource Table for Innings 2
F.7 Optimized R Resource Table for Innings 2

Chapter 1

Introduction

1.1 History and Formats of Cricket

Cricket is one of the most entertaining sports and the second most watched sport in the world, after soccer. The game originated in England; the first indication of cricket being played there dates back to 1598 [1, 3, 16]. The claim is supported by the testimony of an older man, who recalled playing cricket with his friends instead of attending church. Cricket became popular after the Restoration of 1660, when gamblers started to show interest in the game. Through the English colonies, the game was introduced to North America in the 17th century, and it had spread throughout the rest of the world by the 18th century [3]. Interestingly, the first international game was played in 1844 between Canada and the United States of America [1].

Three different formats of cricket are currently played at the international level. The first format introduced, ‘Test cricket’, can last up to five days. Many cricketers consider this format the soul of cricket; unlike any other sport, it combines tactical, technical, physical and mental elements [4]. The players (mainly batsmen) can also take their time to settle down, since there is no time pressure. On every game day, play starts at 9:30 in the morning and ends at 5:00 in the evening, with two major breaks (lunch and tea). However, Test matches can be affected by rain or other weather, occasionally leading to a game ending in a draw. Both teams bat twice, and if the teams fail to finish the game in the specified time, the game also ends in a draw.

Due to the lack of aggression and entertainment in Test games, a new format of cricket, the “limited overs game”, was introduced at the international level in 1971. An over consists of six legitimate deliveries (or balls) by a single player (called the bowler) to a batsman at the other end. The concept of the limited overs game was introduced by English county teams in 1962, and the first international game was played in 1971 between Australia and England [1]. As the name suggests, the game is completed within a limited time frame, and each team is allowed to bat for at most 50 overs. To increase the level of competition and entertainment, a World Cup is played between the top 12 to 16 teams every four years.

The third format of cricket is called T20 (20-overs game), which is also a limited overs game. This format was first played in England at county level in 2003. It was liked by many cricketers and fans and gained popularity quickly. The first international 20-overs game (or T20I) was played between New Zealand and Australia on February 17, 2005. T20I was introduced to bring more aggression and entertainment, and a game can be finished within four hours. As with the 50-overs game, a World Cup is organized for T20I, in this case every two years. Unlike the 50-overs game, T20I has an additional law: in case of a tie (both teams score the same number of runs), a super over is bowled to decide the outcome of the match.

Like any other sport, cricket has a governing body that makes major decisions to improve the game. The body that controls the game and organizes all main tournaments at the international level is the International Cricket Council (ICC). Currently, the ICC is run by its 10 full member countries: Australia, South Africa, India, Pakistan, Sri Lanka, England, New Zealand, West Indies, Bangladesh and Zimbabwe. Some associate countries also play a minor role by helping the full members make decisions; these include Ireland, Kenya, Scotland and Afghanistan.

1.2 Project Motivation

Games are sometimes interrupted by thunderstorms or rain. In these cases, teams do not get a chance to finish the full game, that is, to lose all their wickets or play all their allotted overs (in this project, we consider only limited overs games). Different methods can then be used to compensate for the lost time. For instance, suppose team A plays its full allotted overs, but due to rain there is not enough time for team B to play its full overs. A new target (the number of runs that team B must achieve within its truncated innings in order to win the game) then needs to be set, given the number of overs that team B has completed.

The method currently used to reset the target is the Duckworth-Lewis method. The methods used before it are described in the next section. The parameters of the Duckworth-Lewis method were estimated from 50-overs games, using data from roughly 1971 to 1998, and the same method has been scaled down for use in 20-overs games. However, players and researchers have complained about the Duckworth-Lewis method. Some of the arguments against the method are as follows:

• On May 3, 2010, after the England vs. West Indies game, a member of each team expressed his views about the Duckworth-Lewis method:

Paul Collingwood, the English captain at that time, said “There’s a major problem with Duckworth-Lewis in this form of the game”. He later added “Ninety-five percent of the time when you get 191 runs on the board you are going to win the game. Unfortunately Duckworth-Lewis seems to have other ideas and brings the equation completely the other way and makes it very difficult” [17].

The West Indian opener agreed with Collingwood and said “I think it’s something they (ICC) are going to have to look into”.

• Murhekar [18] argues that the Duckworth-Lewis method gives more weight to wickets than to overs. Even if the team batting second plays at a losing run rate, as long as it manages not to lose wickets, it has a higher chance of winning the game.

Since the validity of the internationally accepted method has been criticized, the question arises: “Is the Duckworth-Lewis method still a fair and accurate method to use in limited overs cricket games?” Note that Duckworth-Lewis does not consider how the innings was played before the stoppage. It computes the new target using only the number of overs remaining and the number of wickets lost.

The main purpose of this project is to investigate the accuracy of the Duckworth-Lewis method and to compare it with other proposed methods. Many cricket analysts have worked on limited overs cricket; most of their work relates directly to the Duckworth-Lewis method but, in general, no one has assessed the fairness or accuracy of that method. For instance, Clarke and Allsopp [9] used the Duckworth-Lewis method to estimate the rankings of teams in a contest. De Silva, Pond and Swartz [7] used it to predict the runs scorable by the team batting second. Lewis [15] suggested a method to estimate a player’s performance in a match using the Duckworth-Lewis method; he also suggested a method in 2008 to rank players in limited overs games. Other researchers have proposed different methods for deciding the outcome of an interrupted game. For instance, Jayadevan [14] proposed the VJD method, which is built around two curves. The first curve (the normal curve) shows the pattern of runs scored when there is no interruption; the second curve (the target curve) shows how the batting side should speed up after an interruption. Like Duckworth and Lewis, Jayadevan did not explain how the resource tables are computed. Bhattacharya proposed a method based on Gibbs sampling (explained in Chapter 2) and claimed that the model works better than the Duckworth-Lewis method; Perera updated Bhattacharya’s model [6, 8, 19].

In short, researchers have proposed a variety of methods and compared them with Duckworth-Lewis using data from various sets of matches. Some compared their methods with the actual Duckworth-Lewis method, while others compared them with the scaled Duckworth-Lewis method used in 20-overs games. However, to date there has not been a formal evaluation of the accuracy of the methods. In this project, I will first construct the resource table from the average of the runs scored at each point of the game (u overs remaining and w wickets lost). In cricket terminology, a resource table gives the proportion of the target a team can achieve, or the number of runs a team can make, given the number of overs remaining, u, and wickets lost, w. It is used to calculate the target for the team batting second: the new target is the product of the runs scored by Team 1 and the resources available according to the resource table. The resource table can be constructed either from the average, over games, of the ratio of the runs still to be scored at a given point of the game to the final runs, that is:

\[ R(u, w) = \left\langle 1 - \frac{Z(u, w)}{F} \right\rangle, \]

or ratio of average of runs at different points of game and the final runs, that is:

\[ R_1(u, w) = \frac{\langle F - Z(u, w) \rangle}{\langle F \rangle}, \]

where \(\langle \cdot \rangle\) is the average over games that reach the point where u overs are remaining and w wickets are lost. The average value at u overs remaining and w wickets lost is calculated as:

\[ \frac{1}{n} \sum_{i=1}^{n} \frac{F_i - Z_i(u, w)}{F_i}; \]

F represents the final runs of the innings; i indexes the n games that reach the point; and Z(u, w) represents the runs scored up to that point of the game. Both resource tables give approximately similar values. Note that the runs scored in each over are a discrete random variable. The ratio of means method has higher variance than the mean of ratios method; therefore we ignore the ratio of means method and focus on the mean of ratios method [19]. One flaw expected in the resource table is missing values. The resource table is calculated from actual data, and no game can ever reach some points, such as R(49, 7): a team cannot lose 7 wickets in one over. Also, more games reach certain points than others. I expect a resource table to be monotonically decreasing; in practice, at points that few games reach, the monotonicity rule might be violated because, in general, smaller samples have higher variance. A function is said to be monotonic if it is entirely non-increasing or non-decreasing. A function is strictly monotonically decreasing if, for u_i > u_{i+1}, f(u_i) < f(u_{i+1}). I expect a resource table to be monotonically decreasing because with every over consumed and/or wicket lost, the number of remaining deliveries decreases; hence run-scoring deliveries are lost (the expected runs still to be scored decrease compared to when that over or wicket was available). I also expect smoothness; however, if the resource table is not monotonic, then it cannot be smooth. I use different statistical methods to improve the resource table. The tables constructed using these methods are called resource surfaces. I expect resource surfaces to overcome the missing values and monotonicity problems, that is, to satisfy ∂R/∂u < 0. However, two of the border values cannot obey the strict monotonicity rule; that is, R(0, w) = R(u, 10) = 0. This is because if all the overs are played or all the wickets are lost, the batting team runs out of available resources and hence cannot score any more runs.
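The two ways of filling a single cell of the resource table can be compared on a toy example. The sketch below is only illustrative: the function names and the three games are hypothetical, and it assumes every listed game reached the same point (u overs remaining, w wickets lost).

```python
from statistics import mean

def mean_of_ratios(final_runs, runs_so_far):
    """R(u, w): average over games of 1 - Z_i(u, w) / F_i."""
    return mean(1 - z / f for f, z in zip(final_runs, runs_so_far))

def ratio_of_means(final_runs, runs_so_far):
    """R_1(u, w): average of (F_i - Z_i(u, w)) divided by average of F_i."""
    remaining = [f - z for f, z in zip(final_runs, runs_so_far)]
    return mean(remaining) / mean(final_runs)

# Hypothetical games that all reached the same (u, w) cell.
final_runs = [250, 300, 200]    # F_i: final innings totals
runs_so_far = [100, 150, 120]   # Z_i(u, w): runs scored up to that point

r = mean_of_ratios(final_runs, runs_so_far)
r1 = ratio_of_means(final_runs, runs_so_far)
print(f"mean of ratios: {r:.4f}, ratio of means: {r1:.4f}")
```

On this toy data the two estimates are close but not identical, which matches the remark that both tables give approximately similar values.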

There is also a bias-variance trade-off in estimating the resource table and surfaces. Dealing with bias and variance is really about dealing with under- and over-fitting. Small sample size is the main source of variance. If we increase the sample size (via bootstrapping), the variance of the predictions will decrease; however, the bias might increase. Usually, as model complexity changes, reducing the bias increases the variance (and vice versa). In practice, there is no analytical way to minimize both bias and variance. The best approach is to choose an accurate measure of prediction error, explore different levels of model complexity, and then choose the complexity level that minimizes the overall error. If we try to reduce the bias, then theoretically there will be more variance in the resource table and surfaces; and decreasing the variance by using a similar set of games will increase the bias. By the same reasoning, within-sample prediction will reduce the bias but increase the variance, whereas out-of-sample prediction will increase the bias but reduce the variance. The accuracy of prediction can be measured via the Mean Square Error (MSE). The MSE of an estimator \(\hat{\theta}\) with respect to the unknown parameter \(\theta\) is defined as:

\[ \mathrm{MSE}(\hat{\theta}) = \mathrm{Var}(\hat{\theta}) + \left(\mathrm{bias}(\hat{\theta}, \theta)\right)^2. \]

The lower the MSE, the better the method. An MSE of zero indicates that the estimator predicts the parameter with perfect accuracy; in practice, however, it is impossible to obtain zero MSE. The Root Mean Square Error (RMSE) is the square root of the MSE. It is easier to interpret because its units are runs.
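The decomposition of the MSE into variance and squared bias can be checked numerically. The following is a minimal sketch with hypothetical predicted targets and a hypothetical true value; it uses the population (divide by n) variance, which is the form under which the identity holds exactly for a finite set of estimates.

```python
from statistics import mean

def mse_decomposition(estimates, theta):
    """Return (mse, variance, bias) of a set of estimates of theta."""
    centre = mean(estimates)
    variance = mean((e - centre) ** 2 for e in estimates)  # population variance
    bias = centre - theta
    mse = mean((e - theta) ** 2 for e in estimates)
    return mse, variance, bias

# Hypothetical predicted targets (in runs) and a hypothetical true value.
estimates = [248.0, 255.0, 251.0, 250.0]
theta = 250.0

mse, variance, bias = mse_decomposition(estimates, theta)
rmse = mse ** 0.5  # RMSE is in the same units as the estimates (runs)
print(mse, variance + bias ** 2, rmse)
```

Here the first two printed values agree, illustrating that the MSE equals the variance plus the squared bias.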

To construct the resource surfaces, I will use isotonic regression, an optimization method that combines the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm with the Duckworth-Lewis function, and Gibbs sampling. Isotonic regression minimizes the weighted sum of squared differences

\[ \mathrm{ISO} = \sum_{u} \sum_{w} W_{uw} (y_{uw} - x_{uw})^2, \]

where W_{uw} are the weights. The weight for each combination of w wickets lost and u overs remaining is the total number of games played at that point. Since the fit is a step function, I do not expect to see smoothness. Also, at some points, the strict constraint ∂R/∂u < 0 might be violated. However, the monotonicity rule will not be violated; that is, ∂R/∂u ≤ 0.

The optimization method that uses the BFGS algorithm and the Duckworth-Lewis function will resolve the equality issue (the surface will follow ∂R/∂u < 0). The Duckworth-Lewis function has 11 unknown parameters, which I will estimate using the BFGS algorithm:

\[ Z(u, w) = Z_0 F(w) \{1 - \exp(-bu/F(w))\}, \]

where Z(u, w) is the average number of runs that can be scored with u overs remaining and w wickets lost. The method is explained in detail in the next section. I also expect this surface to be smooth.

I will calculate another resource surface using Gibbs sampling, which belongs to the general class of Markov Chain Monte Carlo (MCMC) algorithms. The Gibbs sampler is designed to generate samples from a complex joint distribution by sequentially sampling from the conditional distribution of each parameter. This greatly simplifies hard analytical or numerical problems (joint distributions) by replacing them with easier calculations (conditional densities), and it is provably equivalent to sampling from the joint distribution. In the simplest case, the Gibbs sampler reduces complexity by sampling directly from a conditional distribution of known form (for example, a Normal distribution with known parameters). In our case, we are sampling from a truncated Normal distribution. This could also be done directly by using the accept/reject algorithm. By simulating a large enough sample, the desired degree of accuracy can be achieved [13].
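For the isotonic regression step described above, a weighted least-squares monotone fit along one direction of the table can be obtained with the pool-adjacent-violators algorithm. The sketch below is an illustration under assumptions, not the implementation used in this project: it handles a single column (one ordering at a time), assumes all weights are positive, and uses hypothetical resource values and game counts.

```python
def pava_non_decreasing(y, w):
    """Weighted pool-adjacent-violators: minimise sum w_i * (y_i - x_i)^2
    subject to x_1 <= x_2 <= ... <= x_n (weights assumed positive)."""
    blocks = []  # each block is [weighted mean, total weight, number of points]
    for value, weight in zip(y, w):
        blocks.append([value, weight, 1])
        # Pool backwards while two adjacent blocks violate the ordering.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    fitted = []
    for block_mean, _, count in blocks:
        fitted.extend([block_mean] * count)
    return fitted

def pava_non_increasing(y, w):
    """Non-increasing fit, obtained by reversing the sequence."""
    return pava_non_decreasing(y[::-1], w[::-1])[::-1]

# Hypothetical raw resource values along one column, with game counts as weights.
raw = [0.90, 0.80, 0.85, 0.60, 0.62, 0.40]
games = [50, 10, 30, 20, 20, 40]

fitted = pava_non_increasing(raw, games)
print(fitted)
```

Cells that violate the ordering are pooled into their weighted average, which is why the fitted values form a step function with ties rather than a strictly decreasing surface.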

Once the resource table and resource surfaces are constructed, I will compare the accuracy of the methods via Cohen’s Kappa and the root mean square error (RMSE). Since I know the actual result of each game, I will stop each game at different points to calculate within-sample predictions. For cross-validation (out-of-sample prediction), I will randomly choose 80% of the games and compute the resource surfaces from them. I will then predict the results of the remaining 20% of the games. This is explained further in Chapter 3. Finally, I will estimate the bias of the resource table and resource surfaces.
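The two ingredients of this comparison, an 80/20 split of the games and Cohen’s Kappa between actual and predicted results, can be sketched as follows. The helper names, the random seed and the ten example results are hypothetical, and the Kappa shown is the standard unweighted form for nominal labels.

```python
import random
from collections import Counter

def cohens_kappa(actual, predicted):
    """Unweighted Cohen's Kappa: (p_o - p_e) / (1 - p_e).
    Undefined (division by zero) when chance agreement p_e equals 1."""
    n = len(actual)
    observed = sum(a == p for a, p in zip(actual, predicted)) / n
    actual_counts = Counter(actual)
    predicted_counts = Counter(predicted)
    labels = set(actual) | set(predicted)
    expected = sum(actual_counts[label] * predicted_counts[label]
                   for label in labels) / n ** 2
    return (observed - expected) / (1 - expected)

def split_games(games, train_fraction=0.8, seed=0):
    """Randomly split games into a training part and a held-out part."""
    shuffled = list(games)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

# Hypothetical results for ten stopped games.
actual = ["win", "win", "loss", "loss", "win", "loss", "win", "loss", "win", "loss"]
predicted = ["win", "win", "loss", "win", "win", "loss", "loss", "loss", "win", "loss"]
kappa = cohens_kappa(actual, predicted)

train, test = split_games(range(100))
print(kappa, len(train), len(test))
```

In this toy case the predictions agree with the actual results in 8 of 10 games while chance agreement is 0.5, so Kappa is lower than the raw agreement rate.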

1.3 Cricket Rules and Old Methods

In Test matches, play is stopped in case of poor light or bad weather. The lost time can be made up by starting earlier or finishing later than the intended time on the remaining days of the game. In limited overs games, however, there is not sufficient time to finish the game after a weather or bad-light interruption. In the past, some competitions allotted an extra day to finish the game, while others would call it a tie. The main purpose of introducing limited overs games was to get a definite result (win/loss) within a day. A result is required especially in knockout games, such as World Cup games, or when more than two teams are involved in a series. Hereafter, the project focuses only on limited overs games.

Every team has two resources available: wickets and overs. If a team plays aggressively, it may lose its wickets quickly and hence use fewer overs; if a team plays defensively, it may lose fewer wickets but will run out of overs. Thus, teams try to strike a balance between playing defensively (not scoring enough runs) and aggressively (losing early wickets). Whatever strategy they adopt, they always compromise between the two resources. However, when a game is stopped by rain or for any other reason, the game is usually shortened so that a result can be achieved within a day. In that case, only one of the two resources, overs, is depleted, and the balance is upset [11].

Two teams (also called sides) are needed to play the game; each team has 11 players. Before the start of the game, a toss is held to decide which team will bat (or field) first. After the toss, all eleven players from the bowling/fielding side come onto the field, while only two players from the batting side bat at a time. A player who gets out is replaced by another player from the same team. In a limited overs game, an innings is closed when either ten players of a team are out or the batting team runs out of overs (one over consists of six legitimate deliveries, or balls, bowled by a single player, the bowler, to a batsman at the other end). The team that bats first sets a target for the other team, and the roles of the teams swap at the end of the first innings. Whichever side scores more runs, given the same resources (overs available and wickets lost), is the winner. In limited overs games, only two innings are played altogether, whereas in Test cricket at most four innings are played, meaning each team can bat twice. (Appendix A discusses how a batsman can be dismissed and how runs can be scored.)

When rain or any other event shortens the game, a method is required to decide the result of the game (or the new target, if some playing time remains). To calculate the outcome (or new target), two factors (overs remaining and wickets lost) should be considered. The time of the stoppage should also be considered, since the game can be stopped at any time and, in general, players play defensively at the beginning and aggressively at the end of an innings. Many methods have been proposed for deciding the result of shortened games; most were rejected due to technical issues, such as failing to account for one of the available resources or for the stage of the innings at which the game stops. Some of the best-known methods, and the reasons they were removed from international games, are as follows:

1.3.1 Average Run Rate

In the past, average run rate (ARR) was the most common method used to decide the result of a game. In an affected match, the team with the higher average number of runs per over is declared the winner. In other words, if team 2 has scored more runs than team 1’s average run rate (the slope) implies for the overs played, team 2 wins; if team 2 has scored fewer, team 1 wins. This method (or resource surface) usually favours the team that bats aggressively, and it does not consider the number of wickets fallen: R(u, w) = R(u), that is, the resource surface is one-dimensional and uses only information about the number of overs played. For instance, if the first team manages to score above-average runs, then the second team has to keep up with the required rate at all times. In the past, this rule favoured the team batting second. For instance, consider a match played between Pakistan and India in 1990. India batted first and scored 300 runs for the loss of two wickets (300/2) in their allotted 50 overs, while in response Pakistan played poorly and scored only 151 for the loss of 9 wickets in 25 overs. With only one wicket in hand, the target of 301 runs was almost impossible to achieve. The match was stopped by rain and could not be resumed. Since Pakistan had a slightly higher run rate (151/25 = 6.04) than India’s run rate of 6.0, they were declared the winners.
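The ARR decision for the example above reduces to comparing two run rates. The sketch below is a minimal illustration; the function names are hypothetical, overs are passed as plain numbers, and treating equal run rates as a tie is an assumption not stated in the text.

```python
def run_rate(runs, overs):
    """Average runs per over."""
    return runs / overs

def arr_winner(team1_runs, team1_overs, team2_runs, team2_overs):
    """Decide an interrupted game by average run rate; wickets are ignored."""
    rate1 = run_rate(team1_runs, team1_overs)
    rate2 = run_rate(team2_runs, team2_overs)
    if rate2 > rate1:
        return "team 2"
    if rate2 < rate1:
        return "team 1"
    return "tie"  # assumption: equal run rates are treated as a tie

india_rate = run_rate(300, 50)      # team batting first
pakistan_rate = run_rate(151, 25)   # team batting second, 9 wickets down
result = arr_winner(300, 50, 151, 25)
print(india_rate, pakistan_rate, result)
```

The number of wickets lost never enters the calculation, which is exactly the weakness the example illustrates.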

1.3.2 Most Productive Overs

ARR remained in use until 1992, when the International Cricket Council realized that the rule favoured the team batting second. To improve on it, the Australians came up with a new idea, the ‘Most Productive Overs (MPO) Rule’. Under this rule, in any weather-affected game, the target is set by arranging team 1’s overs in descending order of runs scored and summing the runs from as many of the top overs as are available to the team batting second. In other words, in affected games, it calculates the slope using team 1’s highest-scoring overs; if team 2’s runs are above the slope, team 2 is victorious. This rule clearly favours the team batting first, as it takes into account neither how those runs were scored nor the maiden overs bowled by the team batting second. Also, like ARR, this one-dimensional resource surface does not consider the number of wickets fallen, that is, R(u, w) = R(u), or the stage of the innings at which the game stops, R(u − n) ≤ R(u + n). The resource surface is supposed to be monotonically decreasing, ∂R/∂u < 0. For example, consider the semi-final played between South Africa and England in the 1992 World Cup. South Africa needed 22 runs from 13 balls to win the game. Rain interrupted the game and the target was revised to 21 runs from just 1 ball, R(u − 2) ≈ R(u). Scoring 22 runs from 13 balls is achievable, but scoring 21 runs from a single ball is impossible.

[Figure 1.1: Runs Achievable vs Overs Remaining using PARAB Method, f(x) = 7.46x − 0.059x². Axes: Overs Remaining (horizontal), Runs Achievable (vertical).]
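The MPO rule described above can be sketched on a toy innings. The per-over scores are hypothetical, and adding one run to the sum to obtain the winning target is an assumption of this sketch rather than something stated in the text.

```python
def mpo_target(team1_runs_per_over, overs_available):
    """Most Productive Overs: sum team 1's highest-scoring overs, taking as
    many overs as are available to team 2, then add one run to win
    (the +1 is an assumption of this sketch)."""
    most_productive = sorted(team1_runs_per_over, reverse=True)[:overs_available]
    return sum(most_productive) + 1

# Hypothetical 10-over innings for team 1, including two maiden overs (0 runs).
team1_runs_per_over = [2, 0, 6, 4, 8, 1, 0, 5, 3, 7]

full_target = mpo_target(team1_runs_per_over, 10)     # no overs lost
reduced_target = mpo_target(team1_runs_per_over, 6)   # team 2 loses 4 overs
print(full_target, reduced_target)
```

Losing four of ten overs lowers the target by only three runs here, because the discarded overs are team 1’s least productive ones, including the maidens bowled by team 2.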

1.3.3 Discounted Most Productive Overs

Since the MPO method favours the team batting first and ignores good bowling by the team bowling first, a slight adjustment was made to it after the 1992 World Cup incident: the target was discounted by 0.5% for each allotted over. This method, Discounted Most Productive Overs (DMPO), slightly reduced the advantage given to the team batting first but still faced the same problems as MPO. For instance, maiden overs (overs in which no runs are scored) could still be completely ignored under this method.

1.3.4 PARAB

After complaints were made about MPO and DMPO, the next method used at the international level was PARAB, the ‘parabola’ method. It was proposed by a young South African named do Rego and was first used during the World Cup of 1996 [10]. It calculates the norms f(x), where f(x) denotes the runs achievable in x overs, using the parabola f(x) = 7.46x − 0.059x², based on do Rego’s calculations. The function has a turning point at about 63 overs, as shown in Figure 1.1. This method is an improvement over the ARR method but, like the previous methods, it takes no account of the stage of the innings at which overs are lost or of the number of wickets fallen.

The target under the PARAB method is calculated differently from the previously defined methods. The proportion of the expected runs of the team that bats first is calculated as $R_1 = f(x_1)/f(N)$, where $x_1$ denotes the remaining overs of the team that bats first and $N$ represents the total number of overs allocated to each team before the interruption. The proportion of expected runs of the team that bats second, $R_2$, is calculated in the same way. The target for the team that bats second is then

$$T = \frac{S \cdot R_2}{R_1} + 1,$$

where $S$ represents the total runs scored by the team batting first and $R_2/R_1$ is called the reduction factor [6]. For instance, during a game between India and Australia, a storm interrupted play and reduced India’s number of overs to 46 in reply to Australia’s 284 runs for the loss of seven wickets (284/7) in 50 overs. According to the parabola equation, the reduction factor was $f(46)/f(50) \approx 0.968$ and hence the target was adjusted to $284 \times 0.968 + 1 \approx 275.9$, that is 276 (the target is always rounded up to the next integer value).
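The India vs. Australia calculation above can be reproduced with a short sketch. The function names `parab_norm` and `parab_target` are illustrative choices, not names from the thesis, and the rounding rule is taken directly from the text (round up after adding 1).

```python
import math

def parab_norm(overs):
    """Runs achievable in `overs` overs under do Rego's parabola f(x)."""
    return 7.46 * overs - 0.059 * overs ** 2

def parab_target(first_innings_runs, overs_team1, overs_team2, overs_allocated=50):
    """Target T = S * R2 / R1 + 1, rounded up to the next integer."""
    r1 = parab_norm(overs_team1) / parab_norm(overs_allocated)
    r2 = parab_norm(overs_team2) / parab_norm(overs_allocated)
    return math.ceil(first_innings_runs * r2 / r1 + 1)

# Australia scored 284 in 50 overs; India's reply was cut to 46 overs.
reduction = parab_norm(46) / parab_norm(50)
target = parab_target(284, overs_team1=50, overs_team2=46)
```

Because the norm depends only on the number of overs, the same target results no matter when in the innings the overs were lost or how many wickets had fallen, which is the limitation noted above.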

1.3.5 Clark Curves

Before its introduction in international cricket, this method (CLARK) was used in South African domestic limited-overs games. In contrast to the previously defined methods, it takes into account, through R(u, w), both the number of wickets lost and the time of the innings when rain interrupts the game. There are six different types of stoppage that can occur in a game, three for each innings, and at least one of them is used to calculate the predicted scores in any affected game (depending on the stoppage time of the game). For this project, we consider only Type 5 and Type 6. Type 5 is used when the game is interrupted during the second innings but only for a short period of time (there is a time allotted for every innings, and this type is used if the game resumes before that allotted time runs out). Type 6 is used if the game is stopped during the second innings and does not resume before the end of the time allotted to the team batting second. (Appendix B, reference [5], explains the method in detail.)

1.3.6 Duckworth-Lewis (D/L) Method

The method proposed by two Englishmen, Frank Duckworth and Tony Lewis, was first used in international cricket during the 1999 World Cup. Since then, it has been considered the best model known to the International Cricket Council. The Duckworth-Lewis method considers two important resources, overs remaining and wickets lost, when calculating the predicted scores in a shortened game. The target for the team batting second is independent of the first team’s scoring pattern. The model that Duckworth and Lewis introduced is:

$$Z(u, w) = Z_0 F(w)\left\{1 - \exp\left(-bu/F(w)\right)\right\}, \tag{1.1}$$

where $Z(u, w)$ is the average number of runs that can be scored with $u$ overs remaining and $w$ wickets lost, and $F(w)$ is a positive decreasing step function estimated from data. $Z_0 F(w)$ is the asymptote: the expected number of achievable runs with $w$ wickets lost as the number of overs remaining, $u$, becomes large. The parameters $b$, $Z_0$ and $F(w)$ can be estimated by fitting the model to the data [11]. To make the method useful and easily accessible, a resource table is constructed using the model:

$$P(u, w) = \frac{Z(u, w)}{Z(50, 0)}, \tag{1.2}$$

where $P(u, w)$ gives the proportion of combined resources of overs remaining and wickets lost. There are 100% resources available when 0 wickets are lost and all the allotted overs are remaining, $P(50, 0) = 1$.

However, Duckworth and Lewis’s original paper does not provide enough information about how the estimates are calculated, due to the commercial use of this method [11]. My assumption is that they used weighted least squares based on average resource fractions observed in previous games. It should be noted that we do not need to know the method used by Duckworth and Lewis in order to calculate the parameters and reconstruct the Duckworth-Lewis resource surface. Since the table is provided, the parameters can be estimated and the table reconstructed easily. In total, there are 11 parameters that need to be estimated: $Z_0$, $b$ and $F(i)$ for $i = 1, \ldots, 9$ (with $F(0) = 1$). Assuming the average number of runs in a 50-over game is 225, that is $Z(50, 0) = 225$, Equation 1.1 becomes:

$$\begin{aligned}
225 &= Z_0 F(0)\{1 - \exp(-50b/F(0))\} \\
    &= Z_0\{1 - \exp(-50b)\} \\
Z_0 &= \frac{225}{1 - \exp(-50b)}
\end{aligned} \tag{1.3}$$

Using the Duckworth-Lewis resource table, $P(49, 0) = 99.1\%$, thus Equation 1.2 becomes:

$$P(49, 0) = \frac{Z(49, 0)}{Z(50, 0)} \implies 0.991 = \frac{Z(49, 0)}{225} \implies Z(49, 0) = 0.991 \cdot 225 = 222.975.$$

And,

$$\begin{aligned}
222.975 &= Z_0 F(0)\{1 - \exp(-49b/F(0))\} \\
        &= Z_0\{1 - \exp(-49b)\} \\
Z_0 &= \frac{222.975}{1 - \exp(-49b)}
\end{aligned} \tag{1.4}$$

Using Equation (1.3) and Equation (1.4), we get:

$$\frac{225}{1 - \exp(-50b)} = \frac{222.975}{1 - \exp(-49b)}.$$

Solving this equation for $b$ with Brent’s method, via the uniroot function in the statistical software R, I found $b \approx 0.029$ (an approximate estimate of $b$ can also be obtained using a Taylor expansion). Plugging the value of $b$ into Equation (1.3), I get:

$$Z_0 = \frac{225}{1 - \exp(-50b)} \approx 293.671.$$
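As a rough check of this step, the root can be found with a few lines of code. The sketch below swaps in plain bisection for Brent's method (it is not the thesis's R script), and the bracketing interval [0.001, 0.1] is an assumption chosen so that the two sides of the equation change order inside it.

```python
import math

def z0_from(runs, overs, b):
    """Z0 implied by Z(overs, 0) = runs, using F(0) = 1 in Equation 1.1."""
    return runs / (1.0 - math.exp(-overs * b))

def gap(b):
    """Difference between the two expressions for Z0 (Equations 1.3 and 1.4)."""
    return z0_from(225.0, 50, b) - z0_from(222.975, 49, b)

def bisect(func, lo, hi, tol=1e-12):
    """Plain bisection; assumes func changes sign exactly once on [lo, hi]."""
    f_lo = func(lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

b_hat = bisect(gap, 0.001, 0.1)
z0_hat = z0_from(225.0, 50, b_hat)
```

Printing `b_hat` and `z0_hat` gives values that round to the estimates quoted above; small differences in the last digits of $Z_0$ depend on how many digits of $b$ are carried.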

We can also estimate the values of $F(i)$ for $i = 1, \ldots, 9$. As said earlier,

$$P(u, w) = \frac{Z(u, w)}{Z(50, 0)} \implies Z(u, w) = P(u, w) \cdot Z(50, 0).$$

So, using the resource table:

$$Z(50, 1) = P(50, 1) \cdot Z(50, 0) = 0.934 \cdot 225 = 210.15. \tag{1.5}$$

And,

$$\begin{aligned}
210.15 &= Z_0 F(1)\{1 - \exp(-50b/F(1))\} \\
210.15 &= 293.671\, F(1)\{1 - \exp(-50 \cdot 0.029/F(1))\} \\
\implies F(1) &= 0.889.
\end{aligned} \tag{1.6}$$

Following the same steps as in Equation (1.5) and Equation (1.6), I found $F(2) = 0.768$, $F(3) = 0.640$, $F(4) = 0.510$, $F(5) = 0.384$, $F(6) = 0.269$, $F(7) = 0.168$, $F(8) = 0.091$ and $F(9) = 0.036$. Figure 1.2 shows the average number of runs, in a 50-over game, that can be scored when $u$ overs are available and $w$ wickets are lost.

Figure 1.2: Plot of Duckworth-Lewis resource table for the average number of runs scorable with wickets lost and overs remaining (x-axis: overs available; y-axis: average runs scoreable; one curve for each of w = 0, ..., 9)

Chapter 2

Constructing Resource Table and Surfaces

2.1 Data Collection

In order to understand whether the Duckworth-Lewis method is reliable and to construct new resource tables, data on limited-overs games are required. Since it is hard to find data for all the games in a single convenient location, I collected the data from cricinfo.com. The website provides a detailed ball-by-ball summary, including the number of runs scored and wickets lost in each over. For this thesis, I collected information on 1810 50-over games played from June 7, 2001 to September 12, 2014 using an automated script in the statistical software R (Appendix D). After excluding games that were completely rained out (that is, no overs were played), 1700 games remained in the data set. Out of these 1700 games, the Duckworth-Lewis method was used 122 times to set a new target. To assess the accuracy of the methods, I exclude those 122 games and work with the remaining 1578 games.

2.2 New Resource Tables

Since players and researchers alike have complained about the Duckworth-Lewis method in the past, many statisticians have tried to come up with new methods [6, 8]. The simplest way to construct the resource table is by looking at the ‘ratio of means’ of runs or the ‘mean of ratios’ of runs [8]. The ratio of means method has higher variance than the mean of ratios method; therefore we ignore the ratio of means method and focus only on the mean of ratios method [19]. Some researchers, including Duckworth and Lewis, have argued that team 1 plays its natural game, while the style of play in innings 2 depends on the target; thus resource surfaces should be constructed using first innings information. For instance, if the target is low then teams play defensively at the beginning of the innings, whereas if the target is high, teams play aggressively from the very first over. On the other hand, other researchers have argued that since Duckworth-Lewis is mostly used to predict the outcomes of second innings, second innings data would provide more precise values.

The main purpose of this project is to assess and compare the accuracy of the resource table and the different surfaces. If team 2 wins the game without consuming all the overs, then accuracy results are likely to be misleading, as some of the available resources are left unused. On the other hand, if we exclude these games, then we will only be considering the games where team 1 wins; hence, the results will be biased towards team 1 (such resource tables/surfaces will favour only team 1). It is difficult to find a way around this problem that seems likely to provide reliable results. In contrast, team 1 consumes all of its available resources because it sets the target for team 2 and its resources do not get truncated. For this project, to construct new tables/surfaces, we look only at innings 1 data, as we assume that both innings are independent of each other. First we compute the resource table, R, and then, using the resource table, we compute different resource surfaces using different statistical methods.

2.2.1 Mean of Ratios (R)

To construct the resource table, we need the final runs of the innings and the runs at different points of the game. Let $Z(u, w)$ be the runs at different points of the game, where $u$ is the number of overs remaining and can take the values $0, 1, \ldots, 50$ and $w$ is the number of wickets lost and can take the values $0, 1, \ldots, 10$; and let $F$ be the final runs of the innings. To construct this resource table, we take the average of the ratio of the runs at different points of the game to the final runs. The formula for this resource table is:

$$R(u, w) = \left\langle 1 - \frac{Z(u, w)}{F} \right\rangle,$$

where $\langle \cdot \rangle$ is the average over games that reached the point when $u$ overs are remaining and $w$ wickets are lost.
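A minimal sketch of the mean-of-ratios calculation follows. The input layout (a list of innings, each with its final runs and a list of `(overs_remaining, wickets_lost, runs_so_far)` states) and the two toy innings are hypothetical and only serve to show the averaging; they are not data from the thesis. The sketch also assumes every innings has a positive final score.

```python
from collections import defaultdict

def mean_of_ratios(innings_list):
    """Return R(u, w) in percent and the number of innings behind each cell."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for innings in innings_list:
        final_runs = innings["final_runs"]  # assumed to be greater than 0
        for overs_left, wickets_lost, runs_so_far in innings["states"]:
            cell = (overs_left, wickets_lost)
            totals[cell] += 1.0 - runs_so_far / final_runs
            counts[cell] += 1
    table = {cell: 100.0 * totals[cell] / counts[cell] for cell in totals}
    return table, dict(counts)

# Two made-up innings, observed at the start and after one and two overs.
toy_innings = [
    {"final_runs": 250, "states": [(50, 0, 0), (49, 0, 5), (48, 1, 10)]},
    {"final_runs": 200, "states": [(50, 0, 0), (49, 0, 2), (48, 0, 12)]},
]
resource_table, games_per_cell = mean_of_ratios(toy_innings)
```

Cells that no innings reached are simply absent from `resource_table`, and the per-cell counts in `games_per_cell` are the kind of sample sizes that later serve as weights.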

Since cricket players and fans care about the amount of resources available, I express the resources available as percentages ($R(u, w)$ is multiplied by 100). The calculated resource table is provided in Table 3.3. Figure 2.1 shows the levelplot (or heat map) of the same resource table. As seen in the levelplot, $R(50, 0) = 100$, which implies that if all the overs are remaining and no wickets are lost, then the full target is achievable. Also, $R(0, w) = R(u, 10) = 0$; no matter how many wickets are lost, if there are no more overs to play, there is no chance to score any further runs.

Figure 2.1: Heatmap (levelplot) of a Resource Table (x-axis: Overs Remaining, 0 to 50; y-axis: Wickets Lost, 0 to 9; colour scale: 0 to 100)

Figure 2.2: Heatmap (levelplot) of Standard Deviation of a Resource Table (x-axis: Overs Remaining, 0 to 50; y-axis: Wickets Lost, 0 to 9; colour scale: 0 to 20)

Table E.2 and Figure 2.2 show the standard deviation of all the entries of resource table R. My initial guess was that the standard error would be higher for entries with a smaller sample size since, in general, the standard error is smaller for large data sets and vice versa, but surprisingly no such pattern is seen in the standard deviation table. The standard deviation values are smaller at the beginning of the game, and they tend to increase until the point when 14 overs are available. After that, the standard deviation values start decreasing again. The standard deviation values are highest in the middle of the innings and when 6 or 7 wickets are lost.

In principle, the resource table values should decrease as the number of overs remaining decreases or the number of wickets lost increases. However, there are a couple of flaws in the calculated resource table. The first flaw is the violation of monotonicity; that is, the values in the resource table should decrease monotonically as we go up the columns or move from left to right along the rows. For instance, $R(48, 0) = 96.56 < R(48, 3) = 97.42$, which implies that in a fifty-over game, if 48 overs are remaining and no wickets are lost, then there is a lower chance of achieving the target than when three wickets are lost and the same number of overs are remaining. This does not make sense because, with the same number of overs remaining but more wickets lost, some of the resources have been used up ($\partial R/\partial w < 0$); thus the value should be lower than the one where fewer wickets are lost. A plausible explanation for the violation of monotonicity is the sample size. For instance, the openers have survived the first over 1325 times, whereas only three times has a team lost three wickets after an over. Monotonicity violations might be controlled by increasing the sample size: with more games, each entry would approach the long-run average over all the games that reach the point when $u$ overs are remaining and $w$ wickets are lost.

The second flaw is the missing values found in the table. Such values are missing for one of two reasons: (1) No games ever reached that point (these outcomes are highly improbable, and thus not represented in the data set, but we would expect them to happen at some time in the future). (2) The scenario is actually impossible in real games. For instance, there are six balls in an over, so the maximum a team can lose in an over is six wickets; obtaining a value $R(49, 7)$ is impossible since no team can lose seven wickets in six deliveries.
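Both flaws can be detected mechanically. The sketch below is one possible check, not the procedure used in the thesis: it lists grid cells with no entry and pairs of cells where the one with fewer resources (more wickets lost at the same overs, or fewer overs at the same wickets) has the larger value. The two entries for 48 overs come from the example above; the (47, 0) entry and the small grid are made up for illustration.

```python
def find_flaws(table, overs, wickets):
    """Return (missing cells, monotonicity violations) for a resource table.

    `table` maps (overs_remaining, wickets_lost) to a resource percentage.
    """
    overs, wickets = list(overs), list(wickets)
    missing = [(u, w) for u in overs for w in wickets if (u, w) not in table]
    violations = []
    for (u, w), value in table.items():
        for (u2, w2), other in table.items():
            # The second cell should never hold more resources than the first.
            fewer_resources = (u2 == u and w2 > w) or (w2 == w and u2 < u)
            if fewer_resources and other > value:
                violations.append(((u, w), (u2, w2)))
    return missing, sorted(violations)

toy_table = {(48, 0): 96.56, (48, 3): 97.42, (47, 0): 95.10}
missing, violations = find_flaws(toy_table, overs=[47, 48], wickets=[0, 3])
```

Here `violations` flags the pair (48, 0) and (48, 3), and `missing` reports the cell (47, 3) that the toy table does not cover.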

Figure 2.3 shows the difference between the resource tables of the two innings. There is an obvious difference between the two tables. The heatmap shows that innings 1 has more resources available when fewer wickets are lost, whereas innings 2 has slightly more resources available when more than 7 wickets are lost and 30 to 40 overs are remaining; the innings 2 values are hence more biased towards team 2 (they favour team 2).

Figure 2.3: Heatmap (levelplot) of difference of Resource Table of both innings (x-axis: Overs Remaining, 0 to 50; y-axis: Wickets Lost, 0 to 9; colour scale: −20 to 30)

2.2.2 Optimization via D/L Method (Improved D/L Method)

In numerical optimization, the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm is an iterative method for solving unconstrained non-linear optimization problems [20]. The fit here is essentially a special case of weighted least squares, where runs are independent and Normally distributed with mean $x_{uw}$ and variance $1/W_{uw}$ [20]. In this case the weights, $W_{uw}$, are the total number of games played when $u$ overs are remaining and $w$ wickets are lost. If the homogeneity of variances assumption is satisfied, that is if $\sigma^2$ is the same for all games, then the variance of the expected total runs scored is approximately equal to $\sigma^2/N$. The calculated resource table R can be improved using the optimization method and the Duckworth-Lewis function. For this resource surface, we first use the calculated resource table, R, to estimate the 11 unknown Duckworth-Lewis parameters using the default BFGS algorithm. We could impose the monotonicity constraints, that is $F(0) > F(1) > \ldots > F(9)$, by reparameterizing in terms of $F(0)$ and $\Delta F_j$ (where $j = 1, 2, 3, \ldots$), where $F(w) = F(0) - \sum_{j=1}^{w} \Delta F_j$ and the delta parameters are constrained to be greater than 0. To estimate the unknown parameters using a constrained optimizer, the actual Duckworth-Lewis parameters are used as an initial guess. It should be noted that the actual Duckworth-Lewis parameters are not necessary; however, using those parameters as starting values makes the BFGS algorithm converge faster.
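The reparameterization and the weighted least-squares objective can be sketched as follows. This is an illustration under stated assumptions rather than the thesis's implementation: $F(0)$ is fixed at 1, $Z_0$ is left out because it cancels in the ratio of Equation 1.2, the positive steps are written as exponentials of unconstrained numbers, and the optimizer itself (BFGS in the thesis) is not included. The "observed" cells and unit weights in the demonstration are synthetic.

```python
import math

def unpack(theta):
    """Map unconstrained theta = [log b, log dF_1, ..., log dF_9] to (b, F)."""
    b = math.exp(theta[0])
    f = [1.0]  # assumption: F(0) fixed at 1
    for log_step in theta[1:]:
        f.append(f[-1] - math.exp(log_step))  # each step is strictly positive
    return b, f

def dl_resource(u, w, b, f, full_overs=50):
    """Resource percentage 100 * Z(u, w) / Z(full_overs, 0); Z0 cancels."""
    z_uw = f[w] * (1.0 - math.exp(-b * u / f[w]))
    z_full = f[0] * (1.0 - math.exp(-b * full_overs / f[0]))
    return 100.0 * z_uw / z_full

def weighted_sse(theta, observed, weights):
    """Sum over cells of W_uw * (R(u, w) - model(u, w))^2."""
    b, f = unpack(theta)
    if min(f) <= 0.0:
        return math.inf  # F(w) must stay positive
    return sum(weights[cell] * (value - dl_resource(cell[0], cell[1], b, f)) ** 2
               for cell, value in observed.items())

# Starting values: the Duckworth-Lewis estimates reported in Chapter 1.
dl_f = [1.0, 0.889, 0.768, 0.640, 0.510, 0.384, 0.269, 0.168, 0.091, 0.036]
theta0 = [math.log(0.029)] + [math.log(dl_f[j - 1] - dl_f[j]) for j in range(1, 10)]

# Synthetic "observed" table generated from the model itself, with unit weights.
b0, f0 = unpack(theta0)
cells = [(u, w) for u in (10, 30, 50) for w in (0, 3, 6)]
observed = {cell: dl_resource(cell[0], cell[1], b0, f0) for cell in cells}
weights = {cell: 1.0 for cell in cells}

sse_at_start = weighted_sse(theta0, observed, weights)
sse_shifted = weighted_sse([theta0[0] + 0.1] + theta0[1:], observed, weights)
```

An unconstrained optimizer can then be run on `weighted_sse` over `theta`; any value of `theta` maps back to a strictly decreasing `F`, so the monotonicity constraint never has to be enforced explicitly.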

Once the parameters are calculated, we can use Duckworth-Lewis Equations 1.1 and 1.2 to calculate the improved Duckworth-Lewis resource surface. I call it improved Duckworth-Lewis because it applies the Duckworth-Lewis function and optimization to the resource table; for instance, we need to estimate the 11 unknown Duckworth-Lewis function parameters to calculate this resource surface. Since Duckworth and Lewis never disclosed the method they used to estimate the parameters, the algorithms used to estimate the parameters might be different. The improved Duckworth-Lewis resource surface is provided in Table 3.4 and the heat map is shown in Figure 2.4. The table and heat map show that, as expected, none of the values are repeated (other than 0), the values decrease monotonically as wickets are lost and overs are consumed, and there are no missing values.

Figure 2.4: Heatmap (levelplot) of Optimized R Resource Surface via Duckworth-Lewis Method (x-axis: Overs Remaining, 0 to 50; y-axis: Wickets Lost, 0 to 9; colour scale: 0 to 100)

2.2.3 Isotonic Regression

Isotonic regression, also known as monotonic regression, fits a function to the data that is as faithful to the raw resource data as possible, except where deviating is necessary to ensure monotonicity [8]. Since we have two independent variables, overs remaining and wickets lost, we can use isotonic regression to compute a new resource surface that improves on the resource table. We replace all the missing values in the resource table with 0. Isotonic regression involves finding a weighted least-squares fit $x \in \mathbb{R}^n$ to a vector $y \in \mathbb{R}^n$ with weights vector $W \in \mathbb{R}^n$, that minimizes

$$\sum_u \sum_w W_{uw}\,(y_{uw} - x_{uw})^2,$$

where $W_{uw}$ are the weights. The weight for each cell is the total number of games played. As said before, the model finds the best least-squares fit to a set of points, given the constraint that the fit must be a non-decreasing function. Since the isotonic regression obeys monotonicity, $x_{u,w} \geq x_{u,w+1}$ and $x_{u,w} \geq x_{u-1,w}$. Like the resource table, this resource surface is also calculated in a non-parametric fashion. According to Bhattacharya, the minimization of the function is equivalent to constrained maximum likelihood estimation where the $y_{uw}$ are independent and Normally distributed with means $x_{uw}$ and variances $1/W_{uw}$ [8]. To calculate the resource table, I used the R package ISO [21]. The calculated resource surface using isotonic regression is provided in Table 3.5 and the heat map is shown in Figure 2.5.

Figure 2.5: Heatmap (levelplot) of Isotonic Regression Resource Surface (x-axis: Overs Remaining, 0 to 50; y-axis: Wickets Lost, 0 to 9; colour scale: 0 to 100)

Although the isotonic regression obeys the monotonicity rule by construction and the weights are supposed to be strictly positive, we again get the ‘missing values’ problem, because the weights are 0 at the point(s) ($u$ overs remaining and $w$ wickets lost) that no game ever reached. Since the isotonic regression resource table gives a step function, the missing values can be replaced by the last calculated entry. For instance, according to Table 3.5, ISO(49, 1) = 98.44 and ISO(49, 0) is a missing entry, so it is given the value 98.44.
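To show how a weighted isotonic fit produces a step function, here is a one-dimensional pool-adjacent-violators sketch. It is a simplification: the thesis fits a surface that is monotone in two variables with an R package, whereas this code handles a single row or column, requires strictly positive weights, and uses made-up numbers.

```python
def pava_nondecreasing(values, weights):
    """Weighted least-squares non-decreasing fit (pool adjacent violators)."""
    blocks = []  # each block holds [weighted mean, total weight, point count]
    for value, weight in zip(values, weights):
        blocks.append([float(value), float(weight), 1])
        # Merge backwards while the last two blocks are out of order.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            mean2, w2, n2 = blocks.pop()
            mean1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(mean1 * w1 + mean2 * w2) / total, total, n1 + n2])
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit

# Hypothetical resources for one wicket level, ordered by overs remaining.
raw = [0.0, 3.0, 2.0, 6.0]
games = [1, 1, 3, 1]  # number of games behind each entry, used as weights
fitted = pava_nondecreasing(raw, games)
```

The out-of-order pair (3.0, 2.0) is pooled into its weighted mean 2.25, so `fitted` contains a repeated value. Repeated values of this kind are what make the isotonic surface a step function.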

This resource surface resolves the problems (monotonicity and missing values) faced by the resource table, but it encounters one other problem. Since it is a step function, many entries have the same value; for instance, both ISO(35, 7) and ISO(36, 7) have the same value (36.62), meaning a team could afford to waste an over and still achieve the same target, which theoretically does not make sense. The resource surface is supposed to be strictly monotone inside the boundaries (decreasing in wickets lost, $\partial R/\partial w < 0$, and increasing in overs remaining, $\partial R/\partial u > 0$) and weakly monotone at the boundaries ($\partial R/\partial w \leq 0$ and $\partial R/\partial u \geq 0$). We know that ISO gives only monotonically non-decreasing results throughout; however, changing our data (by taking the inverse) can provide the desired result. A plausible reason that values are the same in the resource surface is a small or zero sample size. When the sample size is small, the raw value might be higher than the previous value (violating the constraint), and to avoid that, the value is replaced with the previous entry’s value.

2.2.4 Gibbs Sampling

Gibbs sampling is a Markov Chain Monte Carlo (MCMC) algorithm for obtaining a sequence of observations that are approximately drawn from a specified multivariate probability distribution when direct sampling is challenging. In the simplest case, the Gibbs sampler reduces complexity by sampling directly from the conditional distributions (for example, by conditioning them to be Normal with known parameters). We use Gibbs sampling to improve the isotonic regression, specifically by making the fitted surface strictly decreasing (rather than just non-increasing, as in the case of isotonic regression). The minimization in the isotonic regression arises from the maximization of the Normal likelihood

$$\exp\left\{-\frac{1}{2}\sum_u \sum_w W_{uw}\,(y_{uw} - x_{uw})^2\right\},$$

where $x_{uw}$ are the unknown parameters [8]. One application of Gibbs sampling is that, in the case of flat lines, that is when $ISO_{u-1,w} = ISO_{u+1,w}$ or $ISO_{u,w-1} = ISO_{u,w+1}$, the entry $R_{u,w}$ is replaced by a value drawn from $N(y_{u,w}, \mathrm{sd}_{R_{u,w}}/W_{u,w})$, where $\mathrm{sd}_{R_{u,w}}$ is the standard deviation of the runs scored (R). Since we know that the new values are supposed to lie within the desired intervals, we run the algorithm and use an accept/reject step to choose the values. By simulating a large enough sample, we can achieve the desired degree of accuracy. Another way to find those values is to use the truncated Normal distribution. However, the approach is still non-parametric, since no relationship is enforced on how runs are scored.
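The accept/reject idea can be sketched for a single row of the surface. Everything here is illustrative: the function names, the unit standard deviations, the seed and the end values 40.0 and 30.0 are assumptions, while 36.62 is the repeated isotonic value quoted earlier. Each interior cell is redrawn from a Normal distribution and kept only if it falls strictly between its current neighbours, which is one simple way to break ties.

```python
import random

def sample_between(mean, sd, lower, upper, rng, max_tries=100_000):
    """Draw from Normal(mean, sd) restricted to (lower, upper) by accept/reject."""
    if not lower < upper:
        raise ValueError("need lower < upper")
    for _ in range(max_tries):
        draw = rng.gauss(mean, sd)
        if lower < draw < upper:
            return draw
    raise RuntimeError("no draw accepted; interval too far from the mean")

def gibbs_sweep(row, means, sds, rng):
    """One sweep over the interior cells of a row ordered from most to fewest
    resources; each cell is redrawn between its current neighbours."""
    row = list(row)
    for i in range(1, len(row) - 1):
        row[i] = sample_between(means[i], sds[i], row[i + 1], row[i - 1], rng)
    return row

rng = random.Random(2015)
iso_row = [40.0, 36.62, 36.62, 30.0]  # flat step in the middle
smoothed = gibbs_sweep(iso_row, means=iso_row, sds=[1.0] * 4, rng=rng)
```

After the sweep the two tied cells take distinct values, so the row is strictly decreasing while the end cells are unchanged. Repeating the sweep many times and averaging the draws is the usual way to turn such a sampler into point estimates.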

The calculated Gibbs sampling resource surface is provided in Table 3.6 and the heatmap of the same resource surface is provided in Figure 2.6. Some of the resource surface values are very close to each other, for instance Gb(49, 1) = 90.08 and Gb(48, 1) = 90.07. The values are close to each other because the game is still in its early stages and batsmen generally take time to settle down.

Figure 2.6: Heatmap (levelplot) of Gibbs Sampling Resource Surface (x-axis: Overs Remaining, 0 to 50; y-axis: Wickets Lost, 0 to 9; colour scale: 0 to 100)

2.3 Comparison of Resource Tables

We compare the resource table and all the introduced resource surfaces with each other and with the internationally used Duckworth-Lewis method. Since most of the tables are calculated from actual 50-over games in a non-parametric fashion, whereas the Duckworth-Lewis table was constructed in a parametric fashion, differences might be evident.

2.3.1 R vs. Duckworth-Lewis

Figure 2.7 shows the comparison between the Duckworth-Lewis and raw resource tables. The dashed lines show the resources available under the Duckworth-Lewis method and the solid lines show the resources available under R. As mentioned before, there are missing values and monotonicity flaws in the R resource table, which are quite obvious in Figure 2.7. The reason for the missing values is that no games ever reached that point, and the reason for the violation of monotonicity is the smaller sample size. At the points where the sample size is large, the R resource table values lie on or near the Duckworth-Lewis curves. When fewer than 4 wickets are lost, Duckworth-Lewis shows more resources available than the raw resource table. In cricket terminology, it means that if fewer than 4 wickets are lost then teams have a higher tendency to score big runs. On the contrary, the raw resource table shows that teams, on average, score at the same rate.

Figure 2.7: Comparison of R Resource Table and Duckworth-Lewis Resource Surface (x-axis: Overs available; y-axis: Resources Available; one curve per number of wickets lost, 0 to 9)

2.3.2 Isotonic Regression vs. Duckworth-Lewis

Since isotonic regression returns a step function, we expect minimal overlap between the curves of the two resource tables.

Figure 2.8 shows the comparison between the Duckworth-Lewis resource table and the isotonic regression resource surface. The dashed lines show the resources available under the Duckworth-Lewis method and the solid lines show the resources available under isotonic regression. As expected, there is little similarity between the two methods and only a few points from the two tables overlap. Moreover, Duckworth-Lewis gives more resources available for the middle twenty overs when fewer than five wickets are lost.

2.3.3 Gibbs Sampling vs. Duckworth-Lewis

Figure 2.9 shows the comparison between the Duckworth-Lewis and Gibbs sampling resource tables. The dashed lines show the resources available under the Duckworth-Lewis method and the solid lines show the resources available under Gibbs sampling. Most of the values show that Gibbs sampling gives more resources than the Duckworth-Lewis resource table; however, when fewer than four wickets are lost and between overs 5 and 45, Duckworth-Lewis shows more resources available than the Gibbs sampling table.

Figure 2.8: Comparison of Isotonic Regression and Duckworth-Lewis Resources (x-axis: Overs available; y-axis: Resources Available; one curve per number of wickets lost, 0 to 9)

Figure 2.9: Comparison of Gibbs Sampling and Duckworth-Lewis Resources (x-axis: Overs available; y-axis: Resources Available; one curve per number of wickets lost, 0 to 9)

Figure 2.10: Comparison of Isotonic Regression and Improved Duckworth-Lewis Resource Surfaces (x-axis: Overs available; y-axis: Resources Available; one curve per number of wickets lost, 0 to 9)

2.3.4 Improved Duckworth-Lewis vs. Isotonic Regression

Figure 2.10 shows the comparison between the improved Duckworth-Lewis and isotonic regression resource surfaces. The dashed lines show the resources available under the improved Duckworth-Lewis method and the solid lines show the resources available under isotonic regression. Most of the values show that isotonic regression gives more resources than the improved Duckworth-Lewis resource surface; however, when fewer than four wickets are lost and between overs 25 and 45, the improved Duckworth-Lewis method shows slightly more resources available than the isotonic regression surface.

2.3.5 Improved Duckworth-Lewis vs. Gibbs Sampling

Figure 2.11 shows the comparison between the improved Duckworth-Lewis and Gibbs sampling resource surfaces. The dashed lines show the resources available under the improved Duckworth-Lewis method and the solid lines show the resources available under Gibbs sampling. Improved Duckworth-Lewis and Gibbs sampling show similar resources available at almost all points. At some points (from 0 to 10 overs available), Gibbs sampling shows non-convex lines, which implies that resources available decrease drastically as we go towards the completion of the innings, whereas resources available under the improved Duckworth-Lewis method decrease in a parametric fashion. Convexity is also seen in parts of the Gibbs sampling curves, which implies that runs are scored at a faster rate for some parts of the game.

Figure 2.11: Comparison of Gibbs Sampling and Improved Duckworth-Lewis Resource Surfaces (x-axis: Overs available; y-axis: Resources Available; one curve per number of wickets lost, 0 to 9)

2.3.6 Improved Duckworth-Lewis vs. Duckworth-Lewis

The Duckworth-Lewis table is calculated using 50-over games played before 1998, whereas the improved Duckworth-Lewis surface is calculated using 50-over games played after 2001. Both are calculated in a similar fashion but have different parameters.

Figure 2.12 shows the comparison of the improved Duckworth-Lewis and Duckworth-Lewis resource surfaces. The dashed lines show the resources available under Duckworth-Lewis and the solid lines show the resources available under improved Duckworth-Lewis. The difference between the two methods is due to the different estimates of b. We do not know the method Duckworth and Lewis used, but presumably the change in the curves is due to the actual scoring patterns. The improved Duckworth-Lewis table shows more resources available than the Duckworth-Lewis method at the points when four or more wickets are lost. This implies that, according to the improved Duckworth-Lewis surface, even if teams play defensively and lose early wickets, there is still a chance to achieve most of the target. However, when fewer than four wickets are lost, Duckworth-Lewis shows more resources available than the improved Duckworth-Lewis method. The distance between the Duckworth-Lewis curves increases at approximately the same rate; most of the curves for the improved Duckworth-Lewis surface show the same pattern, except the curves for ‘6’ and ‘7’ wickets lost, which are a little further from each other. A plausible explanation could be that the resources do reduce after the sixth wicket is lost and the batsman who gets out is replaced by a less specialist or tail-ender batsman, so the target cannot be achieved at the same rate.

Figure 2.12: Comparison of Improved Duckworth-Lewis and Duckworth-Lewis Resource Tables (x-axis: Overs available; y-axis: Resources Available; one curve per number of wickets lost, 0 to 9)

Chapter 3

Accuracy Check

The next step is to check the accuracy of the resource tables and surfaces; without a statistical comparison, it is almost impossible to tell which method works best. The resource surfaces are used only when a game is delayed due to unforeseen circumstances, either to set a new target for team 2 or to decide the result of the game, depending on the time available. To check the accuracy of the resource surfaces, we consider only completed games, stop them at certain points, and try to predict their results. The comparison can be made either on the result of the game (binary prediction) or on the distance from the target, that is, the margin of runs in case of victory (continuous prediction). There are many metrics for measuring the accuracy of binary predictions, such as AUC (Area Under the Curve)/ROC (Receiver Operating Characteristic) approaches, the Matthews correlation coefficient, and Cohen’s Kappa. Binary predictions are easier to compare and interpret, since we are more interested in predicting the outcome of the game than the margin of victory. However, binary prediction has less resolution than continuous prediction [12]. I decided to use Cohen’s Kappa for the binary comparison and Root Mean Square Error (RMSE) for the continuous comparison [2]. To check the accuracy of the methods, we look at both within-sample and out-of-sample prediction. In our case, we know the actual result (win/loss) of each game, and we can predict the target for the second innings (expected win/loss) using the resource surfaces. First, we stopped the games at different points and looked at the predicted runs.

We first stopped each game after the first over and looked at the outcome. We then stopped the game after every over and compared the methods. To check the accuracy of the methods, we assumed that teams play at a constant rate; that is, the net run rate of the batting team remains the same throughout the game. This is not a realistic assumption because the net run rate commonly fluctuates throughout almost the whole game and mostly increases towards the end. To check the accuracy of the methods using Cohen’s Kappa, we assume that team 1 played its full overs (either 50, or the specified number of overs if rain interfered before the start of the game) or lost all its wickets, whereas team 2 plays only the number of overs up to the point at which we decide to stop the game (or gets out before playing the specified number of overs). For RMSE, we stop the game at the same points and predict the number of runs that can be scored using the resource table/surfaces.

For out-of-sample prediction, I randomly choose 80% of the games and, after computing the resource tables from them, predict the outcome of the remaining 20% of the games. There is also a bias-variance trade-off in estimating the resource table/surfaces. Dealing with bias and variance is really about dealing with under- and over-fitting: as model complexity changes, reducing the bias increases the variance, and vice versa. In practice, there is no analytical way to minimize both bias and variance. The best approach is to choose an accurate measure of prediction error, explore differing levels of model complexity, and then choose the complexity level that minimizes the overall error (RMSE). Within-sample prediction reduces the bias but increases the variance, whereas out-of-sample prediction increases the bias but reduces the variance. The method with the lowest overall error is the best method (resource table/surface) of all those considered.
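The 80/20 split described above could be sketched as follows in Python. This is only an illustration: the function name `split_games`, the integer game identifiers and the fixed random seed are assumptions made for the example, not details of the analysis itself.

```python
import random


def split_games(game_ids, train_fraction=0.8, seed=0):
    """Randomly split game identifiers into a fitting set and a held-out set."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    shuffled = list(game_ids)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical example: 100 games labelled 0..99.
train_games, test_games = split_games(range(100))
```

The resource tables would then be computed from `train_games` only, and the held-out `test_games` used to score the predictions.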

3.1 Cohen’s Kappa

Cohen’s Kappa, $\kappa$, can be used to assess the accuracy of binary predictions of the outcome of the game. It is defined as:

\[
\kappa = \frac{p - p_e}{1 - p_e},
\]

where $p$ is the observed agreement between predictions and outcomes (that is, the proportion of successful predictions) and $p_e$ is the proportion of agreement expected by chance. If there is complete agreement between prediction and result, then $\kappa = 1$. The standard error (SE) of $\kappa$ is calculated as:

\[
\mathrm{SE}(\kappa) = \sqrt{\frac{p(1 - p)}{n(1 - p_e)^2}},
\]

where $n$ is the number of predictions.

Cohen’s Kappa is approximately Normally distributed when both $np$ and $n(1 - p)$ are large enough (greater than 5). In such a scenario, the 95% confidence interval can be found as $\kappa \pm 1.96 \cdot \mathrm{SE}(\kappa)$ [2].
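As a rough illustration of the formulas above, the following Python sketch computes Kappa, its standard error and the Normal-approximation interval. The chance agreement `p_e` is computed here from the marginal proportions of predicted and actual outcomes, which is the usual textbook definition and is an assumption about how it would be obtained in practice; the function names and the eight win/loss values are hypothetical.

```python
import math


def agreement(predicted, actual):
    """Observed agreement p, chance agreement p_e and sample size n."""
    n = len(actual)
    p = sum(a == b for a, b in zip(predicted, actual)) / n
    categories = set(predicted) | set(actual)
    # Chance agreement from the marginal proportions (standard definition).
    p_e = sum((predicted.count(c) / n) * (actual.count(c) / n) for c in categories)
    return p, p_e, n


def cohen_kappa(p, p_e, n, z=1.96):
    """Kappa, its standard error and the interval kappa +/- z * SE."""
    kappa = (p - p_e) / (1 - p_e)
    se = math.sqrt(p * (1 - p) / (n * (1 - p_e) ** 2))
    return kappa, se, (kappa - z * se, kappa + z * se)


# Hypothetical predictions and results (1 = team 2 wins, 0 = team 2 loses).
predicted = [1, 1, 0, 0, 1, 0, 1, 1]
actual = [1, 0, 0, 0, 1, 1, 1, 1]
p, p_e, n = agreement(predicted, actual)
kappa, se, ci = cohen_kappa(p, p_e, n)
```

With only eight games the Normal approximation would not be trustworthy; the small sample is used purely to keep the arithmetic easy to follow.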

Figure 3.1 shows the Kappa estimates calculated at the different stoppage points for within-sample prediction. The higher the Kappa value, the better the method. The plot shows that Gibbs sampling and the Duckworth-Lewis method have the lowest overall Kappa estimates of all the methods, whereas the improved Duckworth-Lewis method has the highest. Looking at the individual points, Gibbs sampling has the lowest Kappa estimates at almost all the points, whereas improved Duckworth-Lewis has the highest Kappa estimates at most of the stoppage points. After over 42, the accuracy of Gibbs sampling surprisingly starts to decline. The coloured ribbons show the 95% confidence intervals.

[Figure 3.1 plot: Kappa against Overs played (0 to 50) for the methods G.S, R.opt, ISO, R and DL.]

Figure 3.1: Kappa values along with the CI for different methods for 50 overs

Figure 3.2 shows the Kappa values for out-of-sample prediction. We randomly selected 80% of the games to calculate the different resource surfaces and used those surfaces to predict the remaining 20% of the games. We stopped the games after every over, computed the Kappa values, and then averaged all the Kappa values. The plot shows that the Gibbs sampling method has the lowest Kappa value and the Duckworth-Lewis method the second lowest, whereas the improved Duckworth-Lewis resource surface has the highest Kappa value.

3.2 Root Mean Square Error

Root Mean Square Error (RMSE), also known as root mean square deviation, represents the sample standard deviation of the differences between observed and predicted values. The individual differences are called ‘residuals’ when the calculations are performed over the data sample that was used for estimation, and ‘prediction errors’ when computed out-of-sample.

[Figure 3.2 plot: Kappa for each method (R.opt, G.S, ISO, DL, R).]

Figure 3.2: Out of sample (unweighted) Kappa values along with the CI for different methods for all overs combined

RMSE is calculated as:

\[
\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (\hat{Y}_i - Y_i)^2}{n}},
\]

where $Y_i$ is the observed value and $\hat{Y}_i$ is the predicted value. The RMSE estimates indicate the size of the run deviations between actual and expected runs. This type of metric (continuous prediction) is more powerful for measuring accuracy than binary prediction.

For within-sample prediction, we stop the game at 49 points (as in the Kappa section: overs 1, 2, . . . , 49) and predict the target. The observed score is the final number of runs scored by team 1, whereas the predicted (expected) score is the projected number of runs that can be scored after the stoppage point using the resource table/surfaces. RMSE gives only a point estimate; however, an approximate CI for the RMSE can be found using the bootstrap method. By keeping the same sample size and sampling the games with replacement, we obtain another RMSE estimate. Repeating this procedure multiple times gives a set of different values, from which we can compute the standard error of the estimate and thus the 95% CI (assuming that the estimate follows a Normal distribution).
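The bootstrap procedure described above can be sketched in Python as follows. The function names, the seed and the five pairs of scores are hypothetical, and the default of 500 resamples simply mirrors the number of repetitions reported in the text; this is a sketch under those assumptions rather than the code used for the analysis.

```python
import math
import random


def rmse(predicted, observed):
    """Root mean square error between predicted and observed runs."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


def bootstrap_rmse_ci(predicted, observed, n_boot=500, z=1.96, seed=0):
    """Point estimate and Normal-approximation CI from bootstrap resamples."""
    rng = random.Random(seed)
    n = len(observed)
    estimate = rmse(predicted, observed)
    replicates = []
    for _ in range(n_boot):
        # Resample games with replacement, keeping the sample size fixed.
        idx = [rng.randrange(n) for _ in range(n)]
        replicates.append(rmse([predicted[i] for i in idx],
                               [observed[i] for i in idx]))
    mean_rep = sum(replicates) / n_boot
    se = math.sqrt(sum((r - mean_rep) ** 2 for r in replicates) / (n_boot - 1))
    return estimate, (estimate - z * se, estimate + z * se)


# Hypothetical projected and final scores for five games.
predicted_runs = [250, 230, 270, 300, 210]
observed_runs = [240, 245, 260, 280, 225]
estimate, (lower, upper) = bootstrap_rmse_ci(predicted_runs, observed_runs)
```

Resampling whole games (rather than individual overs) keeps each predicted score paired with its observed score.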

For this thesis, we compute the RMSE for 500 bootstrap samples. RMSE estimates along with their confidence intervals for the different stoppage points are shown in Figure 3.3. The figure shows that Duckworth-Lewis is the worst method at most of the points, whereas the raw resource table gives the smallest RMSE value. The points in the figure represent the estimates and the bars show the 95% confidence intervals. The estimates tell us about the size of the run deviations we expect to see between actual and expected runs. For instance, for the resource table R after 45 overs, the estimate is approximately 10: using the resource table, there will be a difference of around 10 runs between the actual runs and our calculated expected runs. Thus, the smaller the value, the better the method. As expected, the RMSE estimates decrease as the game nears completion. It is hard to predict accurate results at the beginning of the game, but as the game progresses, the runs can be predicted by looking at the performance of the team. After 20 overs, on average, the improved Duckworth-Lewis method shows slightly better results than the remaining resource surfaces.

For out-of-sample prediction, we again randomly choose 80% of the games to calculate the resource tables and predict the remaining 20% of the games using those tables. Figure 3.4 shows the out-of-sample prediction for all the methods, arranged according to their estimates. The plot shows that the raw resource table and isotonic regression have the smallest RMSE estimates, whereas the Duckworth-Lewis method has the highest RMSE estimate, much larger than those of the other methods. Also, all of the methods except Duckworth-Lewis have approximately the same standard error. According to Figure 3.3, Duckworth-Lewis does not predict results very well at the beginning of the game; that is why Duckworth-Lewis has a high out-of-sample RMSE estimate.

Figure 3.5 shows the bias values at the different stoppage points for the resource table and resource surfaces. For reference, a black dotted line is drawn at zero bias. Gibbs sampling has average bias values close to zero in the middle of the innings, and is not biased towards either innings. By contrast, the actual Duckworth-Lewis method is more biased towards the expected values (towards innings 2): it assumes that, in case of stoppage, teams will on average score more runs than they actually score. Near the completion of the game, almost all the methods have the same bias except Gibbs sampling.

3.3 Stoppage Distribution

The purpose of this section is to look at the stoppage distribution for each method. First, as in the previous two sections, we stop the games after every over and calculate the bias, RMSE and Kappa estimates for each over. We also know that 122 games were stopped prematurely and that the results of those games were decided via the Duckworth-Lewis method (or called a tie if fewer than 20 overs were played).

[Figure 3.3 plot: RMSE against Overs played (0 to 50) for the methods R, ISO, G.S, R.opt and DL.]

Figure 3.3: RMSE values along with the CI for different methods for 50 overs

[Figure 3.4 plot: RMSE for each method (R.opt, ISO, G.S, DL, R).]

Figure 3.4: Out of sample (unweighted) RMSE values along with the CI for different methods for all overs combined

[Figure 3.5 plot: bias against Overs played (0 to 50) for the methods R, ISO, G.S, R.opt and DL.]

Figure 3.5: Bias values for different methods for 50 overs

[Figure 3.6 plot: Stoppage Probability against Overs Played (0 to 50).]

Figure 3.6: Stoppage probability for all 50 overs

We can find the probability of stoppage (the time at which games are stopped) for each over, which can be calculated as:

\[
p_u = \frac{n_u}{N},
\]

where $n_u$ represents the number of games stopped after over $u$, and $N$ represents the total number of games stopped (122 in this case). Figure 3.6 shows the stoppage probability after each over. The largest number of games (7) was stopped after over 46, but the stoppage probability is approximately uniformly distributed over all the overs. Once we know the stoppage probabilities, we can find the average bias, RMSE or Kappa over the stoppage distribution using the following formulae:

\[
E(\text{stoppage bias}) = \sum_{u} p_u \cdot \mathrm{bias}_u,
\]

\[
E(\text{stoppage RMSE}) = \sum_{u} p_u \cdot \mathrm{RMSE}_u,
\]

\[
E(\text{stoppage Kappa}) = \sum_{u} p_u \cdot \mathrm{Kappa}_u,
\]

where the sums run over the stoppage overs $u$ and the subscript $u$ denotes the estimate computed at over $u$.

Table 3.1 and Figure 3.7 show the averages over the stoppage distribution for all the methods; ‘lwr’ denotes the lower limit of the CI and ‘upr’ the upper limit. It can be seen from the plot and the table that Duckworth-Lewis has the highest average absolute bias, whereas the improved Duckworth-Lewis shows the lowest. From this, we can say that the improved Duckworth-Lewis is the least biased of all the methods. The standard error of the bias is approximately the same for all the methods. As seen in Figure 3.3, Duckworth-Lewis showed the highest RMSE values at the beginning; however, Gibbs sampling had slightly higher RMSE values in the last few overs compared to the other resource surfaces. In the middle, almost all the resource table/surfaces have the same average RMSE values. The average RMSE stoppage results agree with this: Duckworth-Lewis shows the highest average RMSE value over the stoppage points, and Gibbs sampling shows the second highest. The standard error is approximately the same for all the methods. Overall, the raw resource table, the isotonic regression resource surface and the improved Duckworth-Lewis resource surface show better and more accurate results than the Duckworth-Lewis method.
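The stoppage probabilities and the stoppage-weighted averages defined in this section can be sketched in Python as below. The counts and per-over RMSE values are made up for illustration (they are not the 122 stopped games), and the function names are hypothetical.

```python
def stoppage_probabilities(stops_per_over):
    """Map each over u to p_u = n_u / N, where N is the total number of stops."""
    total = sum(stops_per_over.values())
    return {u: n_u / total for u, n_u in stops_per_over.items()}


def stoppage_weighted_average(probabilities, metric_by_over):
    """Average a per-over metric (bias, RMSE or Kappa) over the stoppage distribution."""
    return sum(p_u * metric_by_over[u] for u, p_u in probabilities.items())


# Hypothetical data: 10 stopped games spread over three overs.
stops = {10: 2, 25: 3, 46: 5}
rmse_by_over = {10: 90.0, 25: 50.0, 46: 10.0}

p = stoppage_probabilities(stops)
average_rmse = stoppage_weighted_average(p, rmse_by_over)
```

Overs in which no game was stopped have $p_u = 0$ and so contribute nothing to the weighted average, which is why they can be left out of the dictionary.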

Table 3.1: Average stoppage distribution for 50 overs

                   R      ISO    R.opt       GS       DL
Av. Bias      -6.439   -6.312    3.820   -4.307   19.509
lwr Bias      -8.474   -8.347    1.567   -6.483   16.250
upr Bias      -4.404   -4.277    6.072   -2.131   22.767
Av. RMSE      43.099   43.072   48.459   47.382   71.631
lwr RMSE      41.437   41.480   46.469   45.557   68.594
upr RMSE      44.604   44.706   50.467   49.212   74.642
Av. Kappa      0.696    0.695    0.686    0.633    0.703
lwr Kappa      0.665    0.664    0.654    0.597    0.673
upr Kappa      0.727    0.727    0.717    0.669    0.733

[Figure 3.7 plot: three panels (bias, kappa, rmse), each showing the average (avg) for the methods R, ISO, R.opt, GS and DL.]

Figure 3.7: Average stoppage distribution for 50 overs

Table 3.2: Duckworth-Lewis Resource Table for 50 overs

        Wickets lost
Overs       0       1       2       3       4       5       6       7       8       9
    0    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
    1    3.74    3.73    3.72    3.71    3.69    3.65    3.60    3.48    3.25    2.60
    2    7.37    7.34    7.31    7.25    7.17    7.04    6.82    6.42    5.61    3.76
    3   10.90   10.84   10.76   10.64   10.46   10.18    9.72    8.89    7.33    4.28
    4   14.32   14.22   14.08   13.87   13.57   13.09   12.32   10.96    8.57    4.51
    5   17.65   17.50   17.28   16.97   16.50   15.79   14.65   12.71    9.48    4.62
    6   20.89   20.67   20.36   19.92   19.28   18.29   16.74   14.18   10.14    4.66
    7   24.03   23.73   23.33   22.75   21.90   20.61   18.62   15.42   10.62    4.68
    8   27.08   26.70   26.18   25.44   24.37   22.77   20.31   16.46   10.97    4.69
    9   30.04   29.58   28.93   28.02   26.71   24.76   21.82   17.34   11.22    4.70
   10   32.92   32.36   31.58   30.49   28.91   26.61   23.18   18.08   11.41    4.70
   11   35.71   35.05   34.13   32.84   31.00   28.32   24.39   18.70   11.54    4.70
   12   38.43   37.65   36.58   35.09   32.97   29.91   25.49   19.22   11.64    4.70
   13   41.07   40.17   38.95   37.24   34.83   31.39   26.47   19.66   11.71    4.70
   14   43.63   42.61   41.22   39.30   36.59   32.75   27.35   20.03   11.76    4.70
   15   46.12   44.97   43.41   41.26   38.25   34.02   28.14   20.35   11.80    4.70
   16   48.54   47.26   45.52   43.14   39.82   35.20   28.85   20.61   11.83    4.70
   17   50.88   49.47   47.55   44.93   41.30   36.28   29.49   20.83   11.85    4.70
   18   53.17   51.61   49.51   46.64   42.70   37.29   30.06   21.02   11.86    4.70
   19   55.38   53.68   51.39   48.28   44.02   38.23   30.57   21.17   11.87    4.70
   20   57.53   55.69   53.20   49.85   45.27   39.10   31.03   21.30   11.88    4.70
   21   59.62   57.63   54.95   51.34   46.44   39.90   31.44   21.41   11.89    4.70
   22   61.66   59.51   56.63   52.77   47.56   40.65   31.81   21.51   11.89    4.70
   23   63.63   61.33   58.25   54.14   48.61   41.34   32.15   21.59   11.89    4.70
   24   65.54   63.09   59.81   55.44   49.60   41.98   32.44   21.65   11.89    4.70
   25   67.41   64.79   61.31   56.69   50.54   42.58   32.71   21.71   11.90    4.70
   26   69.21   66.44   62.75   57.88   51.43   43.13   32.95   21.76   11.90    4.70
   27   70.97   68.03   64.14   59.02   52.27   43.64   33.17   21.79   11.90    4.70
   28   72.68   69.58   65.48   60.11   53.06   44.11   33.36   21.83   11.90    4.70
   29   74.33   71.07   66.77   61.14   53.81   44.55   33.54   21.86   11.90    4.70
   30   75.94   72.52   68.01   62.14   54.51   44.96   33.69   21.88   11.90    4.70
   31   77.50   73.92   69.21   63.09   55.18   45.34   33.83   21.90   11.90    4.70
   32   79.02   75.27   70.36   64.00   55.81   45.69   33.96   21.92   11.90    4.70
   33   80.50   76.58   71.47   64.86   56.40   46.01   34.07   21.93   11.90    4.70
   34   81.93   77.85   72.54   65.69   56.97   46.31   34.17   21.94   11.90    4.70
   35   83.32   79.08   73.56   66.48   57.50   46.59   34.26   21.95   11.90    4.70
   36   84.68   80.27   74.55   67.24   58.00   46.85   34.34   21.96   11.90    4.70
   37   85.99   81.42   75.50   67.96   58.47   47.09   34.42   21.97   11.90    4.70
   38   87.26   82.53   76.42   68.65   58.92   47.31   34.48   21.97   11.90    4.70
   39   88.50   83.61   77.30   69.31   59.34   47.52   34.54   21.98   11.90    4.70
   40   89.71   84.65   78.15   69.94   59.74   47.71   34.59   21.98   11.90    4.70
   41   90.88   85.66   78.97   70.55   60.12   47.89   34.64   21.99   11.90    4.70
   42   92.01   86.64   79.76   71.12   60.48   48.05   34.68   21.99   11.90    4.70
   43   93.11   87.58   80.52   71.67   60.81   48.20   34.72   21.99   11.90    4.70
   44   94.19   88.50   81.25   72.20   61.13   48.34   34.76   21.99   11.90    4.70
   45   95.23   89.38   81.96   72.70   61.43   48.48   34.79   21.99   11.90    4.70
   46   96.24   90.24   82.63   73.18   61.71   48.60   34.82   22.00   11.90    4.70
   47   97.22   91.07   83.28   73.64   61.98   48.71   34.84   22.00   11.90    4.70
   48   98.17   91.87   83.91   74.08   62.24   48.81   34.86   22.00   11.90    4.70
   49   99.10   92.65   84.52   74.50   62.47   48.91   34.88   22.00   11.90    4.70
   50  100.00   93.40   85.10   74.90   62.70   49.00   34.90   22.00   11.90    4.70

Table 3.3: R Resource Table using 50 over games

        Wickets lost
Overs       0       1       2       3       4       5       6       7       8       9
    0    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
    1            3.74    3.97    3.77    3.59    3.80    3.79    3.91    3.26    2.07
    2           10.06    7.58    7.13    6.99    7.44    7.10    7.12    5.47    3.48
    3            9.92   10.30   10.16   10.72   10.87   10.08    9.65    7.25    4.40
    4           12.42   13.16   13.52   13.81   13.48   12.57   12.57    8.28    4.84
    5           15.07   15.41   16.56   16.77   15.88   15.40   14.43    9.44    5.89
    6   14.98   18.10   18.73   19.09   19.53   18.25   17.79   15.30   11.81    6.46
    7   17.74   22.51   21.25   21.77   21.98   20.46   19.83   17.07   11.90    6.51
    8   20.80   23.87   23.54   24.20   23.95   22.81   21.96   18.80   12.30    6.69
    9   27.02   26.36   26.26   26.24   26.11   24.93   24.00   19.57   12.73    6.96
   10   29.40   28.96   28.34   28.15   28.36   26.60   25.29   20.63   13.96    7.28
   11   32.45   31.14   30.82   29.96   30.37   28.21   26.95   21.64   14.24    7.72
   12   35.81   33.07   32.86   32.01   32.34   29.76   28.15   22.23   15.92    7.48
   13   38.09   34.63   34.64   34.25   33.98   31.52   30.16   23.10   16.99    6.71
   14   39.49   37.08   36.87   35.79   36.13   33.24   31.48   24.81   16.07    8.68
   15   39.76   39.48   38.50   38.11   37.76   34.84   33.30   25.80   17.81    6.54
   16   41.86   41.31   40.33   40.10   39.22   36.61   35.14   26.50   18.30    7.26
   17   43.06   42.61   42.19   41.66   40.90   38.61   34.29   28.76   18.94    6.71
   18   45.85   44.24   43.85   43.31   42.57   39.91   35.27   29.33   19.02    6.24
   19   48.11   46.03   45.40   45.02   44.07   41.71   35.55   30.05   20.27    7.65
   20   49.41   47.32   47.28   46.86   45.52   42.63   37.27   31.57   18.90    5.91
   21   51.20   49.23   48.93   48.32   47.48   43.09   37.73   33.49   20.07    8.92
   22   53.23   50.96   50.51   50.22   48.16   45.65   39.31   32.92   20.96   10.78
   23   53.82   52.75   52.29   51.55   49.99   47.04   40.43   35.51   21.38   13.34
   24   55.07   54.65   53.68   53.49   51.96   47.42   43.24   33.07   22.87   11.93
   25   56.44   56.31   55.69   54.91   52.98   49.56   45.16   31.66   28.59    9.46
   26   57.79   57.98   57.33   56.67   54.84   50.25   46.41   35.17   29.15    9.04
   27   59.25   59.42   58.96   58.31   56.30   50.77   46.92   37.11   24.57    8.25
   28   61.10   60.47   60.87   59.70   57.59   52.17   46.70   35.05   25.32   13.30
   29   62.79   61.63   62.78   61.69   58.28   51.64   50.28   38.24   15.12   12.98
   30   64.42   63.18   64.40   63.13   60.04   53.36   51.02   30.83   14.33   17.61
   31   65.87   65.28   65.99   64.38   61.54   55.83   51.94   33.84   15.87   23.83
   32   67.58   66.79   67.86   65.52   62.91   58.05   52.97   28.45   19.81    0.00
   33   69.32   68.59   69.27   66.70   64.69   60.26   50.77   28.84   16.87
   34   71.11   70.19   70.60   68.49   65.85   62.53   45.02   39.50   11.31
   35   72.94   71.78   71.87   70.46   66.89   63.18   50.33   40.04   10.00
   36   74.50   73.47   74.04   71.98   68.89   64.28   51.96   24.21
   37   76.35   75.31   75.47   73.64   71.03   64.86   47.79   31.63
   38   78.01   77.31   76.91   75.19   72.39   64.93   49.87   38.89
   39   79.72   79.10   78.86   77.42   73.71   66.44   47.67
   40   81.57   80.78   81.05   78.23   74.01   71.84   70.26
   41   83.47   82.81   82.81   81.07   76.56   72.73   75.38
   42   85.11   84.86   84.26   82.76   78.86   75.73   76.92
   43   87.24   86.79   86.50   83.80   84.55   72.58   80.00
   44   89.12   88.95   87.73   84.75   84.85   80.14
   45   90.85   90.74   90.10   87.79   90.69   79.84
   46   92.74   92.77   92.33   89.02   80.65
   47   94.66   94.65   94.20   91.44   88.71
   48   96.56   96.49   96.40   97.42   90.32
   49   98.41   98.77   98.13   95.97
   50  100.00

Table 3.4: Optimized R Resource Surface using 50 over games

        Wickets lost
Overs       0       1       2       3       4       5       6       7       8       9
    0    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
    1    2.80    2.80    2.80    2.80    2.79    2.78    2.77    2.72    2.64    2.43
    2    5.56    5.55    5.55    5.54    5.53    5.50    5.45    5.27    4.94    4.23
    3    8.28    8.27    8.26    8.24    8.22    8.14    8.03    7.63    6.94    5.56
    4   10.96   10.93   10.92   10.89   10.85   10.72   10.53    9.84    8.69    6.54
    5   13.60   13.56   13.54   13.49   13.43   13.22   12.94   11.90   10.21    7.26
    6   16.21   16.15   16.12   16.04   15.95   15.67   15.26   13.82   11.54    7.80
    7   18.77   18.69   18.65   18.55   18.43   18.05   17.51   15.61   12.70    8.19
    8   21.30   21.20   21.15   21.02   20.86   20.37   19.68   17.28   13.72    8.49
    9   23.79   23.66   23.60   23.44   23.24   22.63   21.77   18.83   14.60    8.70
   10   26.24   26.09   26.01   25.81   25.58   24.83   23.79   20.28   15.37    8.86
   11   28.66   28.48   28.39   28.15   27.86   26.97   25.75   21.63   16.04    8.98
   12   31.05   30.83   30.72   30.44   30.11   29.06   27.63   22.89   16.62    9.07
   13   33.40   33.14   33.02   32.69   32.30   31.10   29.45   24.07   17.13    9.13
   14   35.71   35.42   35.27   34.90   34.46   33.08   31.21   25.16   17.58    9.18
   15   37.99   37.66   37.50   37.08   36.57   35.01   32.90   26.18   17.96    9.21
   16   40.24   39.86   39.68   39.21   38.64   36.89   34.54   27.14   18.30    9.24
   17   42.46   42.03   41.83   41.30   40.67   38.73   36.12   28.02   18.60    9.26
   18   44.64   44.17   43.94   43.36   42.66   40.51   37.65   28.85   18.85    9.27
   19   46.79   46.27   46.02   45.38   44.61   42.25   39.12   29.62   19.08    9.28
   20   48.91   48.34   48.07   47.37   46.52   43.95   40.54   30.34   19.27    9.29
   21   51.00   50.38   50.08   49.31   48.40   45.60   41.92   31.01   19.44    9.30
   22   53.06   52.39   52.06   51.23   50.23   47.21   43.24   31.64   19.59    9.30
   23   55.09   54.36   54.01   53.11   52.03   48.77   44.52   32.22   19.72    9.30
   24   57.09   56.30   55.93   54.95   53.80   50.30   45.76   32.76   19.84    9.30
   25   59.06   58.21   57.81   56.77   55.53   51.79   46.95   33.27   19.93    9.31
   26   61.00   60.09   59.66   58.55   57.22   53.23   48.11   33.74   20.02    9.31
   27   62.91   61.95   61.49   60.30   58.88   54.65   49.22   34.18   20.10    9.31
   28   64.80   63.77   63.28   62.01   60.51   56.02   50.29   34.59   20.16    9.31
   29   66.66   65.56   65.04   63.70   62.11   57.36   51.33   34.97   20.22    9.31
   30   68.49   67.33   66.78   65.36   63.68   58.67   52.33   35.33   20.27    9.31
   31   70.29   69.07   68.48   66.98   65.21   59.94   53.30   35.66   20.31    9.31
   32   72.07   70.78   70.16   68.58   66.71   61.17   54.23   35.97   20.35    9.31
   33   73.82   72.46   71.81   70.15   68.19   62.38   55.13   36.26   20.38    9.31
   34   75.55   74.12   73.44   71.69   69.63   63.56   56.00   36.53   20.41    9.31
   35   77.25   75.75   75.04   73.20   71.05   64.70   56.84   36.78   20.44    9.31
   36   78.93   77.35   76.61   74.69   72.44   65.82   57.65   37.02   20.46    9.31
   37   80.58   78.93   78.15   76.15   73.80   66.90   58.44   37.24   20.48    9.31
   38   82.21   80.49   79.67   77.58   75.13   67.96   59.19   37.44   20.49    9.31
   39   83.82   82.02   81.17   78.99   76.44   68.99   59.92   37.63   20.51    9.31
   40   85.40   83.53   82.64   80.38   77.72   70.00   60.63   37.81   20.52    9.31
   41   86.96   85.01   84.09   81.73   78.98   70.98   61.31   37.97   20.53    9.31
   42   88.49   86.47   85.51   83.07   80.21   71.93   61.97   38.12   20.54    9.31
   43   90.00   87.90   86.91   84.38   81.42   72.86   62.60   38.27   20.55    9.31
   44   91.50   89.32   88.29   85.66   82.60   73.77   63.21   38.40   20.56    9.31
   45   92.97   90.71   89.64   86.93   83.76   74.65   63.81   38.53   20.56    9.31
   46   94.41   92.08   90.98   88.17   84.90   75.51   64.38   38.64   20.57    9.31
   47   95.84   93.43   92.29   89.39   86.02   76.34   64.93   38.75   20.57    9.31
   48   97.25   94.75   93.58   90.58   87.11   77.16   65.46   38.85   20.58    9.31
   49   98.63   96.06   94.84   91.76   88.18   77.95   65.97   38.95   20.58    9.31
   50  100.00   97.34   96.09   92.91   89.23   78.73   66.47   39.03   20.58    9.31

Table 3.5: Isotonic Regression Resource Surface using 50 over games

        Wickets lost
Overs       0       1       2       3       4       5       6       7       8       9
    0    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
    1   32.23    3.94    3.94    3.81    3.81    3.81    3.81    3.81    3.26    2.07
    2   32.23   10.06    7.58    7.26    7.26    7.26    7.11    7.11    5.47    3.48
    3   32.23   10.70   10.70   10.70   10.70   10.70   10.08    9.65    7.25    4.40
    4   32.23   13.68   13.68   13.68   13.68   13.48   12.57   12.57    8.28    4.84
    5   32.23   16.57   16.57   16.57   16.57   15.88   15.40   14.43    9.44    5.89
    6   32.23   19.31   19.31   19.31   19.31   18.25   17.79   15.30   11.81    6.46
    7   32.23   22.51   21.84   21.84   21.84   20.46   19.83   17.07   11.90    6.51
    8   32.23   24.00   24.00   24.00   23.95   22.81   21.96   18.80   12.30    6.69
    9   32.23   26.36   26.26   26.24   26.11   24.93   24.00   19.57   12.73    6.96
   10   32.23   28.96   28.34   28.28   28.28   26.60   25.29   20.63   13.96    7.16
   11   32.45   31.14   30.82   30.20   30.20   28.21   26.95   21.64   14.24    7.16
   12   35.81   33.07   32.86   32.19   32.19   29.76   28.15   22.23   15.92    7.16
   13   38.09   34.64   34.64   34.25   33.98   31.52   30.16   23.10   16.56    7.16
   14   39.49   37.08   36.87   35.96   35.96   33.24   31.48   24.81   16.56    7.16
   15   39.76   39.48   38.50   38.11   37.76   34.84   33.30   25.80   17.81    7.16
   16   41.86   41.31   40.33   40.10   39.22   36.61   34.72   26.50   18.30    7.16
   17   43.06   42.61   42.19   41.66   40.90   38.61   34.72   28.76   18.94    7.16
   18   45.85   44.24   43.85   43.31   42.57   39.91   35.27   29.33   19.02    7.16
   19   48.11   46.03   45.40   45.02   44.07   41.71   35.55   30.05   19.67    7.16
   20   49.41   47.32   47.28   46.86   45.52   42.63   37.27   31.57   19.67    7.16
   21   51.20   49.23   48.93   48.32   47.48   43.09   37.73   33.23   20.07    8.92
   22   53.23   50.96   50.51   50.22   48.16   45.65   39.31   33.23   20.96   10.32
   23   53.82   52.75   52.29   51.55   49.99   47.04   40.43   33.64   21.38   10.32
   24   55.07   54.65   53.68   53.49   51.96   47.42   43.24   33.64   22.87   10.32
   25   56.44   56.31   55.69   54.91   52.98   49.56   45.16   33.64   23.90   10.32
   26   57.94   57.94   57.33   56.67   54.84   50.25   46.41   34.18   23.90   10.32
   27   59.39   59.39   58.96   58.31   56.30   50.77   46.81   34.18   23.90   10.32
   28   61.10   60.71   60.71   59.70   57.59   51.92   46.81   34.18   23.90   13.12
   29   62.79   62.32   62.32   61.69   58.28   51.92   50.28   34.18   23.90   13.12
   30   64.42   63.90   63.90   63.13   60.04   53.36   50.58   34.18   23.90   14.76
   31   65.87   65.68   65.68   64.38   61.54   55.83   50.58   34.18   23.90   14.76
   32   67.58   67.38   67.38   65.52   62.91   58.05   50.58   34.18   23.90   14.76
   33   69.32   68.96   68.96   66.70   64.69   60.26   50.58   34.18   23.90   23.90
   34   71.11   70.41   70.41   68.49   65.85   62.53   50.58   36.62   23.90   23.90
   35   72.94   71.83   71.83   70.46   66.89   63.18   50.58   36.62   23.90   23.90
   36   74.50   73.75   73.75   71.98   68.89   64.28   50.58   36.62   36.62   36.62
   37   76.35   75.38   75.38   73.64   71.03   64.86   50.58   37.39   37.39   37.39
   38   78.01   77.31   76.91   75.19   72.39   64.93   50.58   46.78   46.78   46.78
   39   79.72   79.10   78.86   77.42   73.71   66.44   50.58   50.58   50.58   50.58
   40   81.57   80.89   80.89   78.23   74.01   71.84   66.37   54.67   54.67   54.67
   41   83.47   82.81   82.81   81.07   76.56   72.15   66.37   54.67   54.67   54.67
   42   85.11   84.86   84.26   82.76   78.86   72.15   66.37   54.67   54.67   54.67
   43   87.24   86.79   86.50   83.90   83.90   72.15   66.37   54.67   54.67   54.67
   44   89.12   88.95   87.73   84.76   84.76   72.15   66.37   54.67   54.67   54.67
   45   90.85   90.74   90.10   87.79   87.73   72.15   66.37   54.67   54.67   54.67
   46   92.75   92.75   92.33   89.02   87.73   72.15   66.37   54.67   54.67   54.67
   47   94.66   94.65   94.20   89.99   87.73   72.15   66.37   54.67   54.67   54.67
   48   96.56   96.49   96.40   89.99   87.73   72.15   66.37   54.67   54.67   54.67
   49   98.44   98.44   96.63   89.99   87.73   72.15   66.37   54.67   54.67   54.67
   50  100.00   98.44   96.63   89.99   87.73   72.15   66.37   54.67   54.67   54.67

Table 3.6: Gibbs Sampling Resource Surface using 50 over games

        Wickets lost
Overs       0       1       2       3       4       5       6       7       8       9
    0    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
    1    0.52    0.46    0.41    0.35    0.30    0.25    0.20    0.15    0.09    0.04
    2    2.43    2.42    2.41    2.27    2.13    1.68    1.23    0.81    0.44    0.18
    3    4.29    4.29    4.29    4.25    4.12    3.31    2.49    1.71    0.98    0.40
    4    7.93    7.93    7.93    7.91    7.75    6.11    4.48    3.08    1.75    0.68
    5   11.27   11.27   11.27   11.26   11.11    9.23    6.96    4.76    2.77    1.01
    6   16.89   16.89   16.88   16.88   16.80   16.40   11.22    6.85    3.94    1.35
    7   20.55   20.54   20.54   20.53   20.48   20.02   15.74    9.15    5.25    1.72
    8   23.04   23.04   23.04   23.03   22.99   22.46   21.59   11.75    6.62    2.11
    9   27.41   26.75   26.66   26.58   26.40   25.33   23.92   14.20    7.98    2.51
   10   29.83   29.38   28.76   28.62   28.53   27.02   25.52   16.50    9.44    2.93
   11   32.92   31.59   31.27   30.50   30.45   28.63   27.27   18.91   10.89    3.36
   12   36.34   33.56   33.34   32.57   32.52   30.20   28.60   21.23   12.36    3.80
   13   38.64   35.16   35.12   34.75   34.47   31.97   30.60   23.59   13.80    4.28
   14   40.07   37.62   37.41   36.42   36.39   33.74   31.95   25.15   15.22    4.77
   15   40.34   40.06   39.06   38.67   38.31   35.34   33.80   26.16   16.74    5.26
   16   42.47   41.91   40.92   40.68   39.79   37.15   35.11   27.00   18.21    5.78
   17   43.69   43.23   42.81   42.27   41.51   39.18   35.20   29.08   18.75    6.26
   18   46.52   44.89   44.49   43.95   43.19   40.50   35.75   29.79   19.08    6.72
   19   48.82   46.70   46.06   45.68   44.72   42.32   36.11   30.52   19.32    7.17
   20   50.13   48.03   47.95   47.55   46.19   43.26   37.81   32.04   19.43    7.47
   21   51.95   49.95   49.65   49.02   48.18   43.72   38.28   33.15   19.54    8.27
   22   54.01   51.70   51.25   50.96   48.86   46.31   39.89   33.20   19.64    8.50
   23   54.60   53.52   53.05   52.30   50.72   47.73   41.02   33.25   19.69    8.55
   24   55.88   55.44   54.46   54.28   52.72   48.11   43.88   33.26   19.72    8.57
   25   57.31   57.12   56.50   55.72   53.76   50.29   45.82   33.27   19.73    8.59
   26   58.86   58.77   58.17   57.50   55.64   50.99   47.08   33.29   19.74    8.60
   27   60.36   60.26   59.82   59.17   57.12   51.52   47.46   33.30   19.74    8.61
   28   62.02   61.63   61.61   60.57   58.43   52.66   47.53   33.30   19.74    8.62
   29   63.72   63.33   63.32   62.59   59.13   52.67   50.43   33.31   19.74    8.62
   30   65.38   64.92   64.91   64.05   60.92   54.15   50.45   33.31   19.74    8.63
   31   66.96   66.73   66.71   65.32   62.45   56.65   50.45   33.31   19.74    8.63
   32   68.75   68.53   68.52   66.48   63.83   58.90   50.45   33.31   19.74    8.63
   33   70.41   70.10   70.08   67.67   65.63   61.14   50.46   33.31   19.74    8.63
   34   72.16   71.55   71.51   69.49   66.81   63.44   50.46   33.32   19.75    8.63
   35   74.02   72.96   72.88   71.49   67.87   64.11   50.46   33.32   19.75    8.64
   36   75.60   75.02   74.99   73.04   69.90   65.23   50.46   33.32   19.75    8.64
   37   77.47   76.60   76.52   74.72   72.07   65.79   50.46   33.32   19.75    8.64
   38   79.18   78.44   78.03   76.29   73.45   65.89   50.47   33.32   19.75    8.64
   39   80.93   80.28   80.01   78.55   74.79   67.41   50.47   33.32   19.75    8.64
   40   82.87   82.25   82.16   79.38   75.09   72.62   66.89   33.32   19.75    8.64
   41   84.87   84.14   83.98   82.25   77.67   72.64   66.89   33.32   19.75    8.64
   42   86.65   86.06   85.49   83.96   80.02   72.64   66.89   33.32   19.75    8.64
   43   89.03   88.08   87.74   85.52   85.51   72.64   66.89   33.32   19.75    8.64
   44   90.71   89.76   89.02   86.09   86.04   72.64   66.89   33.32   19.75    8.64
   45   92.03   89.97   89.84   87.02   87.02   72.64   66.89   33.32   19.75    8.64
   46   93.71   90.02   89.85   87.02   87.02   72.64   66.89   33.32   19.75    8.64
   47   95.40   90.05   89.86   87.02   87.02   72.64   66.89   33.32   19.75    8.64
   48   97.04   90.07   89.86   87.02   87.02   72.64   66.89   33.32   19.75    8.64
   49   98.51   90.08   89.86   87.02   87.02   72.64   66.89   33.32   19.75    8.64
   50  100.00   90.09   89.86   87.02   87.02   72.64   66.89   33.32   19.75    8.64

Chapter 4

Discussion

Some complaints have been made in the past about the Duckworth-Lewis method; however, most of the com- plaints have been made by the losing teams or their fans. In the past, statisticians such as Bhattacharya [8] and Perera [19] have proposed different methods and claimed that their method improved on the Duckworth-

Lewis method. In order to test their claim, we compared their methods with the actual Duckworth-Lewis method. We checked the accuracy of the method by performing cross-validation, and not by speculation or looking at the principles of the game. Since the style of the game has changed a lot in the last decade and players usually play more aggressively, we proposed a new resource table, improved Duckworth-Lewis, which uses the same surface model and fitting algorithm as Duckworth-Lewis table, with a major exception; that is, we used the games that are played after 2000, whereas the original Duckworth-Lewis table was calculated using the games that were played before 1998. The methods were compared using Cohen’s Kappa, RMSE and bias. Both with-in-sample and out-of-sample prediction results showed the actual Duckworth-Lewis method (based on the data from pre-1998 games) displayed the worst result out of all the methods, whereas the isotonic regression resource surface showed the lowest RMSE and improved Duckworth-Lewis displayed the highest Kappa estimate and lowest absolute average bias value. Due to discrepancies in the R resource table (missing values and monotonicity issues), it cannot be used at an international level; however, the improved Duckworth-Lewis resource surface can be used to replace the Duckworth-Lewis table at an inter- national level. Figure 4.1 shows the difference between the Duckworth-Lewis and improved Duckworth-Lewis resource surface. As seen in the plot, there is a big difference in the resources at the middle right end of the plot; that is Duckworth-Lewis assumes that when between four to seven wickets are lost with more than 35 overs remaining, then few runs can be scored, whereas improved Duckworth-Lewis resource surface still believes that there is enough time for players to score enough runs (Duckworth-Lewis underpredicts

Figure 4.1: Heatmap (levelplot) of difference between D/L and improved Duckworth-Lewis Resource Surfaces (x-axis: Overs Remaining; y-axis: Wickets Lost)

compared to the improved Duckworth-Lewis surface).
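The three accuracy measures used in this comparison can be sketched in a few lines of R. This is a minimal illustration, not the code used for the reported results: the function names and the toy data are hypothetical, bias is assumed to be the mean of predicted minus observed runs, and Kappa is computed on predicted versus actual match outcomes.

```r
# Root mean square error between predicted and observed final scores.
rmse <- function(pred, obs) sqrt(mean((pred - obs)^2))

# Average bias; ASSUMPTION: defined as predicted minus observed.
bias <- function(pred, obs) mean(pred - obs)

# Cohen's Kappa: agreement between predicted and actual outcomes,
# corrected for the agreement expected by chance.
cohen_kappa <- function(pred, obs) {
  lev <- union(pred, obs)
  tab <- table(factor(pred, levels = lev), factor(obs, levels = lev))
  n   <- sum(tab)
  po  <- sum(diag(tab)) / n                       # observed agreement
  pe  <- sum(rowSums(tab) * colSums(tab)) / n^2   # chance agreement
  (po - pe) / (1 - pe)
}

# Toy data (hypothetical, for illustration only).
pred_runs <- c(250, 230, 270)
obs_runs  <- c(250, 230, 272)
pred_win  <- c("W", "W", "L", "L")
obs_win   <- c("W", "L", "L", "L")

rmse(pred_runs, obs_runs)
bias(pred_runs, obs_runs)
cohen_kappa(pred_win, obs_win)
```

A Kappa of 1 indicates perfect agreement between predicted and actual results, while a value near 0 indicates agreement no better than chance.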

The raw resource table is constructed by looking at the fraction of total runs scored by a given point in the game (that is, for a given number of overs remaining and wickets lost). However, not all teams end up with the same final score even if they have the same number of runs at some point in the game. This might provide useful information, because play might proceed differently; hence we might be able to alter (and improve) predictions on the basis of runs scored. Some teams end up scoring fewer than the average number of runs, some score a lot more, and the remaining teams score close to the average. We define the lowest one-third of the games as low scoring games (fewer than 226 runs); the highest one-third are categorized as high scoring games (more than 272 runs); and the remaining games are classified as mid scoring games (between 226 and 272 runs).

Figure 4.2 shows the resource table for each scoring category. The resource tables are split by the number of wickets lost. There are minor differences between the three resource tables when fewer than 8 wickets are lost. When 8 or 9 wickets are lost, low scoring games show more resources available than the other two resource tables. Figure 4.3 shows the cross-validated RMSE of the additional resource tables along with the previously defined resource table and surfaces. It can be seen that if we split the data into the three scoring categories, out-of-sample prediction is slightly better and more accurate.
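The split into scoring categories can be illustrated with a short R sketch. The function name and the example totals are hypothetical; only the cut-offs of 226 and 272 runs come from the text, and totals of exactly 226 or 272 are assumed to fall in the mid category.

```r
# Classify innings totals into low / mid / high scoring games.
# Cut-offs follow the text: low < 226 runs, high > 272 runs, mid otherwise.
score_category <- function(total_runs, low_cut = 226, high_cut = 272) {
  ifelse(total_runs < low_cut, "low",
         ifelse(total_runs > high_cut, "high", "mid"))
}

# Hypothetical innings totals, for illustration only.
totals <- c(180, 226, 250, 272, 310)
categories <- score_category(totals)
table(categories)
```

A separate resource table can then be built from the games in each category, in the same way as the raw table is built from all games.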

Figure 4.2: Difference between Low, Mid and High scoring runs Resource Tables (one panel per number of wickets lost, 0 to 9; x-axis: Overs; y-axis: Mean)

Figure 4.3: Out of sample RMSE values along with the CI for different methods for all overs combined (x-axis: method, namely R.High, R.Low, R.Mid, R.opt, ISO, G.S, DL and R; y-axis: RMSE)

We can use the same methods to predict the runs in twenty-over (T20I) games. T20I is a fast-paced game compared to the 50-over game: there are fewer overs to play, so batsmen play aggressively from start to end, whereas in a 50-over game players get enough time to settle down and play their natural game. T20I games are sufficiently short that stoppages and predictions rarely decide games. The method used at the international level was also proposed by Duckworth and Lewis; however, the actual Duckworth-Lewis method was scaled down so that it can be applied to twenty-over games. Figure C.1 shows the scaled Duckworth-Lewis table. While comparing, we can also look at the methods that were used before Duckworth-Lewis and test their accuracy.

To decide the outcome of the game, the team batting second needs to play at least five overs. If that team does not get enough time to play five overs, then the game results in a tie. Thus, we stop the games at overs 5, 10 and 15. Figure 4.4 shows the within-sample Kappa estimates. According to the plot, the twenty-over resource table, R, gives the highest Kappa values, whereas the internationally rejected methods (MPO, DMPO, PARAB, ARR) display the lowest Kappa values. An interesting point is that the Duckworth-Lewis method produced a higher Kappa value than the improved Duckworth-Lewis resource surface. The reason might be the smaller sample size: only 333 T20I games were used to calculate the resource tables, whereas 1738 games were used for the 50-over tables. Figure 4.5 shows the RMSE values for the different methods applied to T20I games. RMSE largely presents the same results as Kappa. However, the isotonic regression method performs better than the R resource table and, again, the Duckworth-Lewis method seems to perform better than the improved Duckworth-Lewis resource surface.

Figure 4.4: Kappa estimates along with the CI for different methods for 20 overs games (x-axis: method; y-axis: Kappa; one series per number of overs played: 5, 10 and 15)

A few complaints have been made against the Duckworth-Lewis method, as mentioned in the introduction. The English captain, Paul Collingwood, raised the issue by saying that Duckworth-Lewis does not consider actual T20I games in its calculations. England scored 191 runs in 20 overs before rain interrupted the game. After the rain stopped, West Indies were given a target of 60 runs in six overs with all wickets in hand. Collingwood complained that the target should have been more than 60 runs, since it is a fast-paced game and, with all wickets remaining, players would play aggressively. We can calculate the target using the T20I games data. Using the improved Duckworth-Lewis resource table, the resources available at that point are 42.35%; hence the target using this method would be 81 runs. West Indies successfully scored 60 runs in 5.5 overs and, in my opinion, had the target been 81 runs, England might have won that game. Another complaint, made by Anit Murhekar, is that Duckworth-Lewis gives more weight to wickets than to overs. As seen in Figure C.1, most of the resources are still available if the team does not lose more than a couple of wickets in the first five to six overs, whereas the percentage of resources decreases significantly after four wickets are lost. The improved Duckworth-Lewis surface shows more resources available than the Duckworth-Lewis at most points.
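The target of 81 runs quoted above follows from a one-line calculation, sketched here in R. The function name is hypothetical, and the rounding rule (par score rounded down, plus one run) is an assumption; the text only states the resulting target.

```r
# Revised target for team 2, given team 1's score and the percentage of
# resources available to team 2.
# ASSUMPTION: the par score is rounded down and one run is added, so that
# team 2 has to exceed the par score to win.
revised_target <- function(team1_score, resources_pct) {
  par_score <- team1_score * resources_pct / 100
  floor(par_score) + 1
}

# England scored 191; 42.35% of resources were available to West Indies
# according to the improved Duckworth-Lewis table for T20I games.
target <- revised_target(191, 42.35)
target
```

Here the par score is 191 ∗ 42.35% = 80.89, which gives the target of 81 runs.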

Figure 4.5: RMSE values along with the CI for different methods for 20 overs games (x-axis: method; y-axis: RMSE; one series per number of overs played: 5, 10 and 15)

In short, we have reviewed several methods that can perform better than the Duckworth-Lewis method. Since the style, rules and regulations of the game have changed in the last decade, the accuracy of the Duckworth-Lewis can be improved by ignoring the old games and constructing the resource surface from modern data. Other factors can also affect the accuracy of the resource tables and surfaces, such as ground and weather conditions and the different scoring patterns of different teams. Hence, in my opinion, it might be a good idea to look into constructing different resource surfaces for different ground conditions (subcontinent vs. non-subcontinent) or for different levels of team 1 performance (low/mid/high scoring games).

Appendix A

Basic Rules of Cricket

A.1 Ways to dismiss a batsman

A batsman can be dismissed in several ways. In a limited-overs game, once dismissed, a batsman cannot bat again. Full details of all the methods can be found at http://en.wikipedia.org/wiki/Dismissal_%28cricket%29. A summary of those methods is as follows:

• Retired: This is one of the rarest forms of dismissal, and the player is considered “retired out”. This rule applies if a batsman leaves the field without the umpire’s consent for any reason other than injury. However, the player can come back only if the opposing captain lets him bat again.

• Timed out: This is another rare form of dismissal. This rule applies if the new batsman takes more than two minutes to come to the field.

• Bowled: If the bowler’s delivery hits the stump(s) and a bail is completely removed, then the striker batsman is considered out. The ball can hit the stumps directly or indirectly (after hitting the bat or any part of the body first).

• Caught: If the batsman hits the ball with the bat or glove(s) (when the glove is in contact with the bat) and the ball is caught by the bowler or any fielder before it hits the ground, then the batsman is considered out. If the wicketkeeper catches the ball, then it is called “caught behind”.

• Stumped: If the wicketkeeper removes the bails with the ball while the batsman tries to play a shot and gets out of his crease (leaving no part of his body behind the crease), then the batsman is stumped out.

• Run Out: If a fielder uses the ball to remove the bails from either set of stumps whilst the batsmen are running between the wickets and are outside their crease, then the batsman (striker or non-striker) is out.

• Leg before wicket (LBW): This rule relies mainly on the umpire’s judgement. When the batsman fails to hit the ball and the ball instead hits either leg of the batsman while he is standing in front of the stumps, he is considered out. Usually, the umpire’s decision relies on whether the ball pitched inside the line and whether it would have hit the stumps had there been no leg between the ball and the stumps.

• Hit wicket: The batsman is considered hit wicket if he breaks the stump(s) (at the striker’s end) by any means while the game is in play. However, this rule does not apply if he breaks the stumps while avoiding a run out.

• Handled the ball: If the batsman intentionally touches the ball with a hand that is not in contact with the bat, then he is considered out. In the past, only seven players have been dismissed in this way.

• Hitting the ball twice: This is an extremely rare form of dismissal. If a batsman intentionally hits the ball twice with his bat, then he is considered out. However, if he senses he would get bowled, then he is allowed to stop the ball from hitting the stumps.

• Obstructing the field: If a batsman, by words or action, obstructs a fielder, then he is out. Usually, batsmen obstruct the field in order to avoid run outs.

A.2 Ways to score runs

A batsman can score at most 6 runs per ball. Some of the common ways to score runs are as follows:

• Running between wickets: A batsman can run between the wickets as many times as he wants. Mostly, batsmen run once (single) or twice (double) after hitting the ball, and sometimes they run three (triple) or four times, depending on the situation.

• Boundary: There are two ways to score runs via a boundary.

– If the ball crosses the boundary without touching the ground, then six runs are awarded to the batting team.

– If the ball crosses the boundary after touching the ground, then four runs are awarded to the batting team.

• Extras: Sometimes runs are also awarded to the batting side because of errors by the bowling side.

– Wide: If a bowler fails to bowl in a playable zone, then an additional run is awarded to the batting side.

– No ball: If a bowler oversteps or hits the stumps with his body at the non-striker’s end while bowling, an extra run is awarded to the batting side.

– Bye: A bye is awarded when the striker batsman does not hit the ball but the batsmen run between the wickets.

– Leg bye: A leg bye is awarded when the ball hits the body of the striker batsman and the batsmen successfully run between the wickets.

• Penalty: Sometimes the wicketkeeper of the bowling side puts a helmet behind him on the ground (instead of wearing it). If the ball hits the helmet, then five penalty runs are awarded to the batting side. Penalty runs could also be awarded to the batting side, at the umpires’ discretion, in cases of ball tampering or time wasting.

Appendix B

Clark Curves

There are six types of stoppage [5].

Stoppage Type 1

This type of stoppage applies when the game is delayed before the start of the first innings. To compensate for the interruption (caused by rain, storm or flood lights), the same number of overs is deducted from both teams’ innings. However, no adjustment is required to the target set by team 1. In this scenario, curve 1 in Figure B.1 moves to the left, depending on the new number of overs allotted to each side. For instance, suppose rain interrupts a 20-over game and wastes almost half an hour of game time. In such a case, each side will be allowed to bat at most 17 overs (depending on the umpires’ judgement) and the target for the team batting second will be one run more than the runs scored by the team batting first.

Figure B.1: CLARK Curves

Stoppage Type 2

If the game is interrupted during the first innings for a short period of time and the umpires allow the teams to play their full overs, then no adjustment is made to the target. However, if the game is stopped for a longer period, then the number of overs is reduced in both innings to make up for the lost time. In such a scenario, two possibilities are considered:

• If the team is allowed to bat more than five overs after the delay, no adjustment is made to the target. The target for the team batting second will be one run more than the number of runs scored by the team batting first (same number of overs for both teams). Figure B.2a shows that, since more than 5 overs remain after the match resumes, the scoring curve moves towards the right side.

• If the team is allowed to bat fewer than five overs after the delay, an adjustment is made to the target. The resources available can be calculated from Figure B.2b. The target for team 2 is calculated in the following steps:

– Projected Total w.r.t. Overs (PTO) = Actual Score before stoppage ∗ Par Percentage when play Resumes (PPR) / Par Percentage when play Stops (PPS) (PPS and PPR can be read from curve 1 in Figure B.2b)

– Projected Total w.r.t. Wickets (PTW) = Actual Score before stoppage ∗ Wickets lost Ratio (WR) ∗ Par Percentage when play Resumes (PPR) (WR can be read from table 3 of [5])

– Projected Total (PT) = min(PTO, PTW) + Runs scored after break

– Percent Reduced (PR) = Resources lost for number of overs, which can be calculated from curve 2 in Figure B.2b

– Target = PT ∗ PR

For instance, in a 50-over game, let’s assume team 1 scores 171 runs for the loss of 6 wickets (171/6) in 34 overs and rain interrupts the game. Team 1 is only allowed to bat 3 more overs after the delay, and they score 200/8 in 37 overs. Thus, team 1 lost 13 overs. The target for team 2 is:

– PTO = 171 ∗ 91.2% / 61.0% = 255.659

– PTW = 171 ∗ 1.37 ∗ 91.2% = 213.654

– PT = min(255.659, 213.654) + 29 = 242.654

– PR = 84.6%

– Target = 242.654 ∗ 0.846 = 205.29. Therefore, the target is 206 for team 2.

Figure B.2: Stoppage Type 2. (a) More than 5 overs remaining; (b) Less than 5 overs remaining

Figure B.3: Stoppage Type 3
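The Type 2 calculation in the worked example can be reproduced with a short R sketch. The function and argument names are hypothetical. Following the numbers in the example, the score is scaled by 91.2% / 61.0% for PTO, with 61.0% taken as the par percentage when play stops and 91.2% as the par percentage when play resumes; all percentages are assumed to be read from the Clark curves and are passed in as proportions. The final rounding (round down, add one) is also an assumption consistent with the stated target.

```r
# Clark stoppage type 2 (fewer than five overs left after the delay).
clark_type2_target <- function(score_at_stop, pps, ppr, wr, runs_after, pr) {
  pto    <- score_at_stop * ppr / pps     # projected total w.r.t. overs
  ptw    <- score_at_stop * wr * ppr      # projected total w.r.t. wickets
  pt     <- min(pto, ptw) + runs_after    # projected total
  target <- pt * pr                       # reduced for team 2's lost overs
  list(PTO = pto, PTW = ptw, PT = pt, target = target,
       runs_needed = floor(target) + 1)   # ASSUMPTION: round down, add one
}

# Worked example: 171/6 after 34 overs, then 29 more runs in 3 overs.
res2 <- clark_type2_target(score_at_stop = 171, pps = 0.610, ppr = 0.912,
                           wr = 1.37, runs_after = 29, pr = 0.846)
res2$runs_needed
```

The intermediate values in `res2` agree with the PTO, PTW and PT figures listed above to the precision shown there.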

Stoppage Type 3

This type of stoppage applies when the game is interrupted during the first innings and team 1 does not get a chance to bat again because too much time has been lost. In such a scenario, team 1 cannot make full use of its resources (by playing aggressively at the end of the innings). Hence, it is unfair to give team 2 the same target in the same number of overs. To make it fair for both teams, an adjustment is made to the target (as in the previous stoppage type). PTO and PTW (the probable scores that could be achieved with the given overs or wickets remaining) can be calculated from Figure B.3. The new target is set as follows:

– PTO = Actual Score ∗ 100% / Par Percentage (PP)

– PTW = Actual Score ∗ WR

– PT = min(PTO, PTW)

– Target = PT ∗ PR

For instance, in a 50-over game, let’s assume team 1 scores 171/5 in 40 overs and the rest of the innings is washed out by rain. The target for team 2, using Figure B.3, is:

– PTO = 171 ∗ 100% / 73.1% = 233.926

– PTW = 171 ∗ 1.56 = 266.76

– PT = min(233.926, 266.76) = 233.926

– PR = 90.1%

– Target = 233.926 ∗ 0.901 = 210.77. Therefore, the target is 211 for team 2 in 40 overs.
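The Type 3 steps translate directly into a small R function. As before, this is only a sketch: the names are hypothetical, the percentages are passed as proportions read from the Clark curves, and the rounding rule (round down, add one) is an assumption that matches the stated target of 211.

```r
# Clark stoppage type 3 (first innings cut short, team 1 does not resume).
clark_type3_target <- function(score, pp, wr, pr) {
  pto    <- score / pp        # projected total w.r.t. overs (score * 100% / PP)
  ptw    <- score * wr        # projected total w.r.t. wickets
  pt     <- min(pto, ptw)     # projected total
  target <- pt * pr
  list(PTO = pto, PTW = ptw, PT = pt, target = target,
       runs_needed = floor(target) + 1)   # ASSUMPTION: round down, add one
}

# Worked example: team 1 scores 171/5 in 40 overs of a 50-over game.
res3 <- clark_type3_target(score = 171, pp = 0.731, wr = 1.56, pr = 0.901)
res3$runs_needed
```

Taking the minimum of PTO and PTW means the more conservative of the two projections sets the target.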

Stoppage Type 4

The fourth type of stoppage occurs after the completion of the first innings and before the start of the second. The target needs to be adjusted for the team batting second. The new target is set by multiplying the original target (team 1’s score plus one) by PR. For instance, if team 1 scores 255/8 in 50 overs and, because of rain, team 2 is allowed to play only 40 overs, then the target for team 2 is 231 runs (256 ∗ 90.1% = 230.7).

Stoppage Type 5

This method is used when the delay happens during the second innings and the game resumes after some time. To make it fair for team 2, the target needs to be reduced, since the number of overs available is reduced. Depending on the duration of the interruption, the runs that would probably have been scored in the lost overs are deducted from the target. In other words, the new target is set as:

– Target = Original Target ∗ (1 − Percentage Lost (PL)), where PL can be calculated as shown in Figure B.4

For instance, team 1 scores 255/8 in 50 overs and rain interrupts the game after 8 overs of innings 2. Team 2 is allowed to play 32 overs after the game resumes (10 overs are lost). The target for team 2 in 40 overs is:

– PL = 33.0% (over 18) − 14.3% (over 8) = 18.7%; this is the percentage of the target that should have been achieved during the lost overs

– Target = 256 ∗ (1 − 18.7%) = 208.13. Therefore, the revised target is 209 runs.
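The Type 5 adjustment can be sketched in R in the same way. The function and argument names are hypothetical; the two curve readings (14.3% at over 8 and 33.0% at over 18) are taken from the example and passed as proportions, and the rounding rule (round down, add one) is an assumption that matches the stated target of 209.

```r
# Clark stoppage type 5 (interruption during the second innings).
clark_type5_target <- function(original_target, pct_start_lost, pct_end_lost) {
  pl     <- pct_end_lost - pct_start_lost  # share of the target lost with the overs
  target <- original_target * (1 - pl)
  list(PL = pl, target = target,
       runs_needed = floor(target) + 1)    # ASSUMPTION: round down, add one
}

# Worked example: original target 256, overs 9 to 18 of innings 2 are lost.
res5 <- clark_type5_target(original_target = 256,
                           pct_start_lost = 0.143, pct_end_lost = 0.330)
res5$runs_needed
```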

Figure B.4: Stoppage Type 5

Stoppage Type 6

This stoppage type is similar to stoppage type 3. If rain interrupts the game during the second innings and team 2 does not get a chance to finish its innings, then a new target (winning score) needs to be set. This method also considers the number of wickets lost by team 2. If the runs scored by team 2 at that particular point are greater than the winning score, team 2 is considered the winner, and vice versa. The ‘winning score’ is calculated by multiplying PP by the old target.

For instance, team 1 scores 255/7 in 50 overs, so the target for team 2 is 256. At the time of the interruption, team 2’s score is 205/6 in 42 overs and they do not get a chance to continue their innings. Therefore, the new target for team 2 is:

– PP = 77.4%

– Winning score > 256 ∗ 0.774 = 198.14. So, team 2 needed 199 runs to win.

– But we need to consider the number of wickets lost as well. So, PTW = 205 ∗ 1.37 = 280.85, which is greater than 256. Therefore, team 2 is considered the winner.

Appendix C

Difference between T20I and 50 overs game

Although both T20I and 50-over games finish in a day and have many similarities, there are some dissimilarities between the two formats as well. The differences between the two formats are as follows:

• The main difference is that T20I games are 20 overs long per innings, whereas 50-over games are 50 overs long per innings.

• Each bowler is allowed to bowl at most 4 overs in a T20I, whereas each bowler can bowl up to 10 overs in a 50-over game.

• In T20I, a fielding restriction is applied for the first 6 overs (30% of the game), whereas in a 50-over game the first mandatory fielding restriction is applied for the first 10 overs (20% of the game). The restriction means that only two players are allowed to stay outside the 30-yard circle.

• In case of a tie, a super over is bowled in T20I to decide the result of the game, whereas a 50-over game is called a tie if both teams score the same number of runs.

• T20I is a fast-paced game compared to the 50-over game: there are fewer overs to play, so batsmen play aggressively from start to end, whereas in a 50-over game players get enough time to settle down and play their natural game.

However, the Duckworth-Lewis method was constructed for the 50-over format and was scaled down for T20I. Also, the table is monotonically decreasing along both rows and columns; that is, the resources available decrease with every wicket lost and with every over played. The resources available in T20I for u overs remaining and w wickets lost are shown in Figure C.1. The D/L resource table is also provided in Table C.1. Each row gives the resources available at a particular point of the game (overs remaining) for each number of wickets lost, whereas each column gives the resources available at different points of the game for a specific number of wickets lost.

Figure C.1: Resources available in T20I with wickets lost and overs remaining (x-axis: Overs available; y-axis: Resources Available; one curve per number of wickets lost, w = 0 to 9)

Table C.1: Duckworth-Lewis Resource Table (rows: overs remaining; columns: wickets lost)

Overs       0       1       2       3       4       5       6       7       8       9
    0    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
    1    6.71    6.70    6.68    6.65    6.62    6.56    6.47    6.33    6.04    5.28
    2   13.21   13.15   13.08   12.98   12.83   12.60   12.28   11.75   10.75    8.37
    3   19.49   19.37   19.21   18.98   18.67   18.18   17.49   16.40   14.43   10.17
    4   25.58   25.36   25.08   24.70   24.15   23.32   22.17   20.39   17.29   11.23
    5   31.46   31.13   30.70   30.12   29.30   28.06   26.37   23.80   19.53   11.85
    6   37.16   36.69   36.09   35.28   34.14   32.43   30.14   26.74   21.27   12.21
    7   42.67   42.04   41.25   40.18   38.69   36.47   33.52   29.25   22.63   12.42
    8   48.00   47.20   46.19   44.84   42.96   40.19   36.56   31.40   23.69   12.54
    9   53.16   52.17   50.93   49.27   46.98   43.62   39.28   33.25   24.51   12.62
   10   58.15   56.96   55.46   53.47   50.75   46.78   41.73   34.84   25.16   12.66
   11   62.98   61.57   59.81   57.47   54.29   49.70   43.93   36.19   25.66   12.68
   12   67.66   66.02   63.97   61.27   57.61   52.39   45.90   37.36   26.05   12.70
   13   72.18   70.30   67.96   64.88   60.74   54.87   47.67   38.36   26.36   12.70
   14   76.56   74.42   71.78   68.31   63.68   57.16   49.26   39.21   26.60   12.71
   15   80.79   78.39   75.44   71.57   66.43   59.27   50.69   39.95   26.78   12.71
   16   84.89   82.22   78.94   74.67   69.03   61.21   51.97   40.58   26.93   12.71
   17   88.86   85.91   82.30   77.62   71.46   63.01   53.12   41.12   27.04   12.72
   18   92.69   89.46   85.51   80.42   73.75   64.67   54.15   41.58   27.13   12.72
   19   96.41   92.89   88.60   83.08   75.90   66.19   55.08   41.98   27.20   12.72
   20  100.00   96.18   91.55   85.60   77.91   67.60   55.91   42.32   27.25   12.72
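The monotonicity property described for the scaled table can be checked directly. The following is a minimal R sketch using five rows copied from Table C.1 (overs remaining 0, 5, 10, 15 and 20); the variable names are hypothetical, and the same check could be applied to a full resource table read from file.

```r
# Five rows of Table C.1; rows are overs remaining, columns are wickets lost.
dl <- rbind(
  "0"  = rep(0, 10),
  "5"  = c(31.46, 31.13, 30.70, 30.12, 29.30, 28.06, 26.37, 23.80, 19.53, 11.85),
  "10" = c(58.15, 56.96, 55.46, 53.47, 50.75, 46.78, 41.73, 34.84, 25.16, 12.66),
  "15" = c(80.79, 78.39, 75.44, 71.57, 66.43, 59.27, 50.69, 39.95, 26.78, 12.71),
  "20" = c(100.00, 96.18, 91.55, 85.60, 77.91, 67.60, 55.91, 42.32, 27.25, 12.72)
)
colnames(dl) <- 0:9

# Resources must not increase as more wickets are lost (along each row) ...
monotone_in_wickets <- all(apply(dl, 1, diff) <= 0)
# ... and must not decrease as more overs remain (down each column).
monotone_in_overs <- all(apply(dl, 2, diff) >= 0)

c(wickets = monotone_in_wickets, overs = monotone_in_overs)
```

A raw resource table with monotonicity flaws would return FALSE for at least one of the two checks.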

Appendix D

Scraping Code

Once the IDs of all the games are collected (cricinfo assigns a different web ID to each game), the code used to scrape the T20I games is as follows:

library(stringr)  ## str_extract(), word(), fixed()
library(plyr)     ## adply()

## NOTE: commentary.URL() (builds the commentary page URL for a match ID and
## innings) and ID (the vector of match IDs) are not shown in this appendix.

get.data <- function(x) {
    ## locate summary info: line numbers of the end-of-over summaries
    olines <- grep("End of over", x)
    if (length(olines) == 0) {
        warning("no commentary?")
        ## return one row of NAs with the same columns as the normal result
        return(data.frame(T20I=NA, over=NA, team=NA, runs=NA,
                          totruns=NA, wicket.over=NA, wickets=NA, ground=NA))
    }
    ## process first line of summary info
    firstlines <- x[olines]
    ## extract the first number in each line: the over number
    over <- as.numeric(str_extract(firstlines, "\\d+"))
    ## runs scored in the over
    runs <- as.numeric(gsub(" +(run|runs)", "",
                            str_extract(firstlines, "\\d+ (run|runs)")))
    ## process second line of summary info
    nextlines <- x[olines + 1]
    strlines <- gsub("<[^>]+>", "", nextlines)   ## erase HTML tags
    strlines <- gsub(" +\\(.*$", "", strlines)   ## erase everything after ()
    ## extract the team, runs, wickets
    team <- word(strlines, 1, -2)                ## all but last word
    runswickets <- word(strlines, -1, -1)        ## last word, "runs/wickets"
    totruns <- as.numeric(word(runswickets, 1, sep=fixed("/")))
    wickets <- as.numeric(word(runswickets, 2, sep=fixed("/")))
    ## number of wickets lost per over: the first over keeps its own count,
    ## later overs are differences between successive cumulative counts
    wicket.over <- c(wickets[1], diff(wickets))
    ## T20I game number
    T20 <- grep("T20I no.", x)
    gameno <- x[T20]
    T20I <- as.numeric(gsub("T20I no. +", "",
                            str_extract(gameno, "T20I no. \\d+")))
    ## ground name
    ground <- grep("Played at", x)
    ground. <- x[ground + 1]
    ground <- word(gsub("<[^>]+>", "", ground.), 1, -2)
    ground <- gsub(" +\\(.*$", "", ground)
    data.frame(T20I, over, team, runs, totruns, wicket.over, wickets, ground)
}

get.URL <- function(ID, innings, sleep=0) {
    if (sleep > 0) Sys.sleep(sleep)   ## pause between requests
    d <- get.data(readLines(commentary.URL(ID, innings)))
    data.frame(ID, innings, d)
}

## Final data: one row per over, for every match and innings
dd <- expand.grid(matchno=ID, innings=1:2)
dd.all <- adply(as.matrix(dd), 1,
                function(x) {
                    get.URL(x["matchno"], x["innings"], sleep=2)
                },
                .progress="text")

Appendix E

Innings 2 Resource Table and Resource Surfaces for 50 overs games

Table E.1: R Resource Table using 50 over games for Innings 2

0 1 2 3 4 5 6 7 8 9 0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1 0.16 0.58 1.29 1.15 1.80 1.80 1.40 2 1.90 0.84 1.16 2.75 3.07 3.92 3.68 2.66 3 1.61 2.90 3.91 4.32 5.41 6.10 5.95 3.45 4 0.00 4.00 5.15 6.04 6.10 8.16 8.26 7.14 3.60 5 0.62 6.06 6.73 7.55 9.13 10.42 10.28 7.94 4.16 6 3.69 7.79 8.95 9.37 12.05 12.20 11.48 9.27 4.14 7 2.79 10.37 10.96 11.44 14.09 14.27 12.98 9.75 4.78 8 0.00 3.67 12.34 13.33 13.70 16.50 16.14 14.24 10.06 6.31 9 3.12 5.88 14.33 15.00 16.73 18.54 18.28 14.62 11.17 6.81 10 8.48 9.21 17.04 16.26 18.97 20.99 19.69 15.45 12.33 7.62 11 8.26 11.55 17.56 18.85 20.90 22.23 20.98 17.17 12.76 6.75 12 11.59 15.46 19.59 20.55 23.06 24.16 22.38 18.06 14.11 6.89 13 16.53 17.42 21.04 22.56 25.49 25.95 23.89 19.48 12.74 7.36 14 14.59 18.68 23.09 24.95 27.36 27.23 25.90 19.61 13.54 7.47 15 14.05 20.29 24.84 27.06 29.11 28.72 26.80 19.91 15.81 8.73 16 16.66 21.08 26.52 28.55 30.68 30.52 27.79 20.75 15.87 8.86 17 17.21 23.98 28.70 29.83 32.88 31.27 29.24 21.79 17.49 7.00 18 20.70 27.41 30.85 31.11 34.84 32.53 30.66 23.57 17.49 9.72 19 23.91 29.79 32.41 33.80 36.25 33.99 31.61 22.98 20.04 8.10 20 26.41 32.37 34.29 36.22 37.79 35.16 31.76 25.86 21.66 5.79 21 30.08 35.06 36.29 38.92 38.88 36.10 32.89 27.40 21.58 3.76 22 30.75 38.15 38.29 40.76 40.71 37.40 35.83 27.23 20.80 6.21 23 31.30 39.81 41.20 42.27 41.83 38.56 37.33 28.59 18.26 9.20 24 35.37 40.81 42.70 44.26 43.88 39.65 39.33 29.05 19.33 10.73 25 38.67 43.03 44.81 46.12 45.64 41.67 39.80 28.41 22.00 15.34 26 40.90 44.37 47.33 48.09 47.76 42.59 40.26 29.57 22.23 8.75 27 41.32 46.54 49.24 50.08 48.84 45.27 40.25 30.01 23.31 8.64 28 43.73 48.08 51.42 51.90 49.64 47.91 41.28 33.56 19.64 10.64 29 45.93 49.85 53.18 54.03 51.66 48.14 43.87 31.65 25.72 9.98 30 47.43 51.65 55.11 54.71 54.57 49.33 46.64 29.31 24.71 9.10 31 50.02 53.71 56.72 56.68 55.76 50.71 50.61 25.51 28.93 12.88 32 52.81 55.11 58.81 58.71 57.55 54.61 46.49 32.28 32.99 14.36 33 54.71 57.86 60.25 60.32 58.62 56.25 45.00 34.20 26.85 0.00 
34 57.21 60.39 61.86 62.78 61.31 56.03 47.28 35.47 30.09 10.84 35 59.78 62.17 64.25 65.20 62.87 57.15 53.02 30.18 32.75 26.51 36 62.05 64.58 66.69 67.39 64.25 57.33 50.73 29.35 45.05 15.66 37 64.71 67.20 69.07 68.83 66.01 62.99 54.09 39.61 12.85 46.99 38 66.79 69.58 71.54 69.72 68.28 69.71 35.40 36.92 20.00 46.99 39 69.74 72.08 73.80 72.30 69.61 66.99 48.02 35.16 49.40 40 72.43 74.58 75.97 74.16 72.32 69.21 48.91 35.56 53.01 41 75.27 76.83 78.70 77.56 73.15 74.99 49.21 57.83 42 77.82 79.52 80.91 78.69 76.07 73.19 49.07 59.04 43 81.13 81.90 83.42 79.90 78.24 63.86 44 83.71 84.49 85.46 84.26 81.68 69.20 45 86.69 87.57 87.93 86.57 87.32 69.18 46 89.55 89.81 90.64 88.25 82.06 69.88 47 92.36 92.41 93.10 88.64 94.85 79.52 48 94.93 94.72 96.12 87.95 49 97.60 97.59 98.10 50 100.00

Table E.2: Standard Deviation of R Resource Table using 50 over games for Innings 1

0 1 2 3 4 5 6 7 8 9 0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1 0.92 1.65 1.52 1.58 1.84 1.98 2.19 2.67 2 1.58 2.48 2.14 2.49 2.60 3.34 3.60 3.91 3 3.62 2.84 3.16 2.82 3.05 3.47 4.34 4.54 5.00 4 3.47 3.15 3.45 3.44 3.69 4.69 4.77 5.46 5.46 5 2.58 3.49 4.16 3.87 4.58 5.33 6.07 5.80 5.62 6 2.48 3.64 4.21 4.52 5.10 5.64 7.05 6.60 5.72 7 2.33 3.87 4.68 5.24 5.38 6.38 7.60 6.68 6.64 8 3.39 4.12 5.08 5.68 5.90 6.86 8.08 7.43 6.79 9 1.56 4.00 4.42 5.60 5.94 6.35 7.23 8.43 7.84 7.87 10 0.60 3.83 5.23 6.12 6.17 6.55 8.42 9.48 9.01 7.28 11 2.63 5.45 5.15 6.33 6.24 7.28 9.39 10.20 9.76 8.02 12 3.09 5.79 5.18 6.53 6.78 8.02 9.96 10.98 10.02 7.65 13 3.56 6.16 5.25 6.40 7.08 8.64 9.90 11.52 9.96 8.13 14 4.15 5.90 5.85 6.88 6.92 8.59 10.77 11.93 9.99 9.03 15 5.91 5.82 6.46 6.67 7.35 9.06 11.03 11.87 10.29 7.56 16 5.96 5.89 6.47 6.98 7.78 8.81 10.99 12.15 10.52 8.07 17 6.34 7.26 6.47 7.20 8.00 8.73 13.25 11.48 11.46 8.33 18 6.06 6.62 6.66 7.38 8.11 10.09 12.94 12.24 12.38 6.87 19 6.56 6.22 6.70 7.68 8.63 10.43 13.93 12.40 12.83 7.52 20 6.14 6.91 6.44 7.58 9.31 11.58 13.43 12.36 12.27 6.98 21 6.08 6.49 6.37 8.58 9.22 12.44 13.94 11.95 12.26 7.80 22 5.38 6.37 7.03 8.41 10.67 11.40 13.67 14.87 10.95 5.08 23 5.50 6.30 7.37 9.27 10.36 11.66 13.70 14.28 11.85 7.28 24 5.31 6.26 8.01 8.90 10.22 11.85 13.79 15.16 13.14 9.24 25 7.67 6.04 7.71 8.96 10.76 11.42 13.34 17.29 12.21 7.11 26 7.92 5.94 7.61 8.91 10.68 12.06 13.82 14.35 16.91 8.86 27 7.53 6.97 7.86 8.63 10.69 13.08 14.57 15.35 19.74 10.65 28 7.32 7.05 8.17 8.73 12.13 13.89 16.00 17.08 20.37 10.13 29 6.69 7.54 7.59 8.76 12.88 15.09 14.65 16.23 11.48 12.08 30 6.29 7.75 8.01 9.50 12.11 14.00 16.21 19.98 3.44 14.52 31 5.86 7.51 7.98 9.95 12.46 13.78 15.07 18.06 11.22 20.54 32 5.79 7.34 7.92 10.87 12.37 13.31 12.75 14.44 19.43 0.00 33 5.61 6.96 8.36 11.49 12.75 13.00 14.82 17.16 18.80 34 5.33 6.83 9.46 11.08 13.17 11.94 20.62 15.57 4.21 35 5.40 6.86 9.61 11.24 13.09 14.59 19.63 17.43 14.14 36 5.55 6.72 9.03 11.57 12.93 14.54 
19.95 10.28 37 5.32 6.70 10.03 10.05 12.54 15.18 21.11 11.96 38 5.19 6.70 9.75 9.71 13.66 16.38 19.26 39 5.37 6.88 9.18 8.95 13.99 16.03 16.76 40 5.00 7.18 7.89 11.43 15.12 11.29 5.08 41 4.86 6.60 8.26 8.80 13.86 12.22 42 4.84 5.99 9.26 9.83 10.96 6.67 43 4.59 6.51 8.70 9.53 7.95 44 4.35 5.25 10.03 13.60 6.89 10.69 45 5.89 6.32 7.96 11.50 3.36 46 5.15 4.82 7.93 10.23 47 5.19 5.71 6.86 12.00 48 3.51 3.74 3.81 1.84 49 2.22 1.65 1.64 50 0.00

Table E.3: Isotonic Regression Resource Table using 50 over games for Innings 2 (rows: overs remaining; columns: wickets lost)

Overs       0       1       2       3       4       5       6       7       8       9
    0    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
    1   16.95    1.40    1.40    1.40    1.40    1.40    1.40    1.40    1.40    1.40
    2   16.95    3.01    3.01    3.01    3.01    3.01    3.01    3.01    3.01    2.66
    3   16.95    5.04    5.04    5.04    5.04    5.04    5.04    5.04    5.04    3.45
    4   16.95    6.97    6.97    6.97    6.97    6.97    6.97    6.97    6.97    3.60
    5   16.95    8.98    8.98    8.98    8.98    8.98    8.98    8.98    7.94    4.15
    6   16.95   10.89   10.89   10.89   10.89   10.89   10.89   10.89    9.27    4.15
    7   16.95   12.75   12.75   12.75   12.75   12.75   12.75   12.75    9.75    4.78
    8   16.95   14.70   14.70   14.70   14.70   14.70   14.70   14.24   10.06    6.31
    9   16.95   16.83   16.83   16.83   16.83   16.83   16.83   14.62   11.17    6.81
   10   18.68   18.68   18.68   18.68   18.68   18.68   18.68   15.45   12.33    7.12
   11   20.21   20.21   20.21   20.21   20.21   20.21   20.21   17.17   12.76    7.12
   12   21.98   21.98   21.98   21.98   21.98   21.98   21.98   18.06   13.47    7.12
   13   23.80   23.80   23.80   23.80   23.80   23.80   23.80   19.48   13.47    7.36
   14   25.53   25.53   25.53   25.53   25.53   25.53   25.53   19.61   13.54    7.47
   15   27.10   27.10   27.10   27.10   27.10   27.10   26.80   19.91   15.81    7.49
   16   28.60   28.60   28.60   28.60   28.60   28.60   27.79   20.75   15.87    7.49
   17   30.17   30.17   30.17   30.17   30.17   30.17   29.24   21.79   17.49    7.49
   18   31.89   31.89   31.89   31.89   31.89   31.89   30.66   23.28   17.49    7.49
   19   33.70   33.70   33.70   33.70   33.70   33.70   31.61   23.28   20.04    7.49
   20   35.59   35.59   35.59   35.59   35.59   35.16   31.76   25.86   20.42    7.49
   21   37.61   37.61   37.61   37.61   37.61   36.10   32.89   27.32   20.42    7.49
   22   39.55   39.55   39.55   39.55   39.55   37.40   35.83   27.32   20.42    7.49
   23   41.26   41.26   41.26   41.26   41.26   38.56   37.33   28.59   20.42    9.20
   24   42.99   42.99   42.99   42.99   42.99   39.65   39.33   28.75   20.42   10.11
   25   44.91   44.91   44.91   44.91   44.91   41.67   39.80   28.75   21.88   10.11
   26   46.92   46.92   46.92   46.92   46.92   42.59   40.26   29.57   21.88   10.11
   27   48.59   48.59   48.59   48.59   48.59   45.27   40.26   30.01   21.88   10.11
   28   50.35   50.35   50.35   50.35   49.64   47.91   41.28   30.17   21.88   10.11
   29   52.17   52.17   52.17   52.17   51.66   48.14   43.87   30.17   25.39   10.11
   30   53.71   53.71   53.71   53.71   53.71   49.33   46.64   30.17   25.39   10.11
   31   55.39   55.39   55.39   55.39   55.39   50.71   47.57   30.17   28.93   12.88
   32   57.19   57.19   57.19   57.19   57.19   54.61   47.57   32.28   30.53   14.36
   33   58.99   58.99   58.99   58.99   58.62   56.15   47.57   33.61   30.53   30.53
   34   61.10   61.10   61.10   61.10   61.10   56.15   47.57   33.61   33.61   33.61
   35   63.19   63.19   63.19   63.19   62.87   57.15   48.85   33.61   33.61   33.61
   36   65.42   65.42   65.42   65.42   64.25   57.33   48.85   33.61   33.61   33.61
   37   67.64   67.64   67.64   67.64   66.01   62.99   48.85   41.64   41.64   41.64
   38   69.65   69.65   69.65   69.65   68.36   68.36   48.85   42.32   42.32   42.32
   39   72.07   72.07   72.07   72.07   69.61   68.36   48.85   47.72   47.72   47.72
   40   74.39   74.39   74.39   74.16   72.32   69.21   48.85   47.72   47.72   47.72
   41   76.89   76.89   76.89   76.89   73.15   69.66   48.85   47.72   47.72   47.72
   42   79.23   79.23   79.23   78.69   75.85   69.66   48.85   47.72   47.72   47.72
   43   81.88   81.88   81.88   79.90   75.85   69.66   48.85   47.72   47.72   47.72
   44   84.30   84.30   84.30   84.26   75.85   69.66   48.85   47.72   47.72   47.72
   45   87.16   87.16   87.16   86.57   75.85   69.66   48.85   47.72   47.72   47.72
   46   89.73   89.73   89.73   87.25   75.85   69.66   48.85   47.72   47.72   47.72
   47   92.41   92.41   92.41   87.25   75.85   69.66   48.85   47.72   47.72   47.72
   48   94.93   94.85   94.85   87.25   75.85   69.66   48.85   47.72   47.72   47.72
   49   97.60   97.48   96.36   87.25   75.85   69.66   48.85   47.72   47.72   47.72
   50  100.00   97.48   96.36   87.25   75.85   69.66   48.85   47.72   47.72   47.72

Table E.4: Optimized R Resource Table using 50 over games for Innings 2 (rows: overs remaining; columns: wickets lost)

Overs       0       1       2       3       4       5       6       7       8       9
    0    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
    1    3.82    3.78    3.78    3.78    3.78    3.76    3.77    3.75    3.65    3.43
    2    7.53    7.38    7.37    7.38    7.36    7.29    7.31    7.24    6.87    6.08
    3   11.13   10.79   10.78   10.80   10.75   10.60   10.64   10.50    9.71    8.14
    4   14.62   14.03   14.01   14.05   13.97   13.71   13.77   13.53   12.22    9.74
    5   18.01   17.10   17.08   17.14   17.01   16.63   16.72   16.36   14.44   10.97
    6   21.30   20.02   19.99   20.08   19.90   19.37   19.49   18.99   16.40   11.93
    7   24.48   22.80   22.76   22.87   22.64   21.94   22.10   21.45   18.12   12.68
    8   27.58   25.43   25.38   25.53   25.23   24.35   24.55   23.74   19.65   13.25
    9   30.58   27.93   27.87   28.05   27.68   26.62   26.86   25.87   20.99   13.70
   10   33.49   30.31   30.24   30.45   30.01   28.74   29.03   27.86   22.18   14.04
   11   36.32   32.56   32.48   32.73   32.21   30.74   31.07   29.71   23.23   14.31
   12   39.06   34.71   34.61   34.89   34.30   32.61   32.99   31.44   24.15   14.52
   13   41.71   36.74   36.63   36.95   36.28   34.37   34.80   33.05   24.97   14.68
   14   44.29   38.67   38.55   38.91   38.16   36.02   36.50   34.55   25.69   14.81
   15   46.80   40.51   40.37   40.77   39.94   37.56   38.09   35.94   26.33   14.90
   16   49.22   42.25   42.10   42.54   41.62   39.02   39.60   37.25   26.89   14.98
   17   51.58   43.90   43.74   44.22   43.22   40.38   41.01   38.46   27.39   15.04
   18   53.87   45.47   45.29   45.82   44.73   41.66   42.34   39.59   27.83   15.08
   19   56.08   46.97   46.77   47.34   46.16   42.86   43.59   40.65   28.21   15.12
   20   58.23   48.38   48.17   48.78   47.52   43.99   44.77   41.63   28.55   15.14
   21   60.32   49.73   49.51   50.16   48.81   45.05   45.88   42.54   28.86   15.17
   22   62.35   51.01   50.77   51.46   50.03   46.04   46.92   43.40   29.12   15.18
   23   64.31   52.22   51.97   52.70   51.18   46.97   47.90   44.19   29.36   15.19
   24   66.22   53.37   53.11   53.88   52.28   47.84   48.82   44.94   29.56   15.20
   25   68.07   54.46   54.19   55.00   53.31   48.67   49.68   45.63   29.75   15.21
   26   69.86   55.50   55.21   56.07   54.30   49.44   50.50   46.27   29.91   15.22
   27   71.60   56.49   56.18   57.08   55.23   50.16   51.26   46.87   30.05   15.22
   28   73.29   57.43   57.11   58.04   56.11   50.84   51.99   47.43   30.18   15.23
   29   74.93   58.32   57.98   58.96   56.95   51.47   52.66   47.95   30.29   15.23
   30   76.52   59.16   58.82   59.83   57.74   52.07   53.30   48.44   30.38   15.23
   31   78.06   59.96   59.60   60.66   58.49   52.63   53.90   48.89   30.47   15.23
   32   79.55   60.73   60.35   61.44   59.20   53.16   54.47   49.31   30.55   15.23
   33   81.00   61.45   61.07   62.19   59.87   53.65   55.00   49.70   30.62   15.23
   34   82.41   62.14   61.74   62.90   60.51   54.12   55.50   50.07   30.67   15.24
   35   83.78   62.79   62.38   63.57   61.12   54.55   55.97   50.41   30.73   15.24
   36   85.10   63.41   62.99   64.22   61.69   54.96   56.41   50.73   30.77   15.24
   37   86.39   64.00   63.57   64.83   62.23   55.34   56.82   51.03   30.81   15.24
   38   87.64   64.55   64.11   65.40   62.75   55.70   57.22   51.30   30.85   15.24
   39   88.85   65.08   64.63   65.96   63.24   56.04   57.58   51.56   30.88   15.24
   40   90.02   65.59   65.13   66.48   63.70   56.36   57.93   51.80   30.91   15.24
   41   91.16   66.07   65.60   66.98   64.14   56.66   58.25   52.02   30.94   15.24
   42   92.27   66.52   66.04   67.45   64.55   56.94   58.56   52.23   30.96   15.24
   43   93.34   66.95   66.46   67.90   64.95   57.20   58.85   52.43   30.98   15.24
   44   94.38   67.36   66.86   68.33   65.32   57.44   59.12   52.61   30.99   15.24
   45   95.39   67.75   67.24   68.74   65.67   57.68   59.37   52.78   31.01   15.24
   46   96.37   68.12   67.60   69.12   66.01   57.89   59.61   52.94   31.02   15.24
   47   97.32   68.47   67.95   69.49   66.32   58.10   59.84   53.08   31.03   15.24
   48   98.24   68.80   68.27   69.84   66.62   58.29   60.05   53.22   31.04   15.24
   49   99.13   69.12   68.58   70.17   66.91   58.47   60.25   53.35   31.05   15.24
   50  100.00   69.42   68.87   70.49   67.18   58.63   60.44   53.46   31.06   15.24

Table E.5: Gibbs Sampling Resource Table using 50 over games for Innings 2

        0      1      2      3      4      5      6      7      8      9
 0   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
 1   0.50   0.42   0.34   0.28   0.22   0.16   0.12   0.08   0.04   0.02
 2   0.53   0.50   0.47   0.41   0.34   0.28   0.22   0.16   0.10   0.05
 3   0.56   0.54   0.53   0.49   0.44   0.38   0.31   0.24   0.18   0.10
 4   0.59   0.58   0.57   0.56   0.53   0.48   0.42   0.36   0.27   0.17
 5   1.14   1.14   1.14   1.13   1.06   0.92   0.78   0.63   0.46   0.28
 6   3.11   3.11   3.11   3.11   3.08   2.98   2.26   1.57   0.98   0.49
 7   5.17   5.17   5.17   5.17   5.16   5.11   4.05   2.89   1.75   0.80
 8   7.69   7.69   7.69   7.69   7.68   7.62   7.51   5.04   2.80   1.15
 9  10.51  10.51  10.51  10.51  10.50  10.48  10.36   7.25   4.10   1.54
10  13.49  13.49  13.49  13.49  13.48  13.47  13.33   9.74   5.57   1.92
11  15.47  15.47  15.47  15.47  15.46  15.45  15.37  12.31   7.28   2.32
12  18.04  18.04  18.04  18.04  18.04  18.03  17.98  15.20   9.04   2.74
13  20.37  20.37  20.37  20.37  20.37  20.36  20.33  19.05  10.87   3.15
14  22.67  22.67  22.67  22.67  22.66  22.66  22.62  19.90  12.68   3.57
15  24.73  24.73  24.73  24.73  24.73  24.72  24.70  20.46  14.47   3.99
16  26.59  26.59  26.59  26.59  26.59  26.58  26.55  21.25  16.31   4.41
17  28.57  28.57  28.57  28.57  28.57  28.57  28.53  22.27  17.50   4.84
18  30.69  30.69  30.69  30.69  30.69  30.69  30.63  23.56  18.08   5.25
19  32.86  32.86  32.86  32.86  32.86  32.85  32.19  23.76  19.70   5.66
20  35.11  35.11  35.11  35.11  35.10  35.09  32.46  26.39  19.97   5.78
21  37.44  37.44  37.44  37.44  37.43  36.83  33.56  27.77  20.05   5.90
22  39.71  39.71  39.70  39.70  39.70  38.17  36.56  27.96  20.11   6.47
23  41.71  41.71  41.71  41.71  41.70  39.34  38.09  29.01  20.15   9.10
24  43.61  43.61  43.61  43.61  43.60  40.45  40.12  29.25  20.28   9.52
25  45.78  45.78  45.78  45.78  45.77  42.52  40.60  29.34  21.94   9.59
26  47.95  47.94  47.94  47.94  47.94  43.45  41.01  29.91  22.09   9.60
27  49.80  49.79  49.79  49.79  49.75  46.19  41.13  30.00  22.15   9.62
28  51.83  51.82  51.82  51.82  50.64  48.88  42.12  30.03  22.17   9.64
29  53.89  53.88  53.88  53.88  52.71  49.11  44.77  30.04  25.71   9.65
30  55.38  55.37  55.37  55.36  55.36  50.33  47.58  30.04  25.79   9.66
31  57.19  57.18  57.18  57.18  56.89  51.74  48.21  30.04  29.53   9.68
32  59.30  59.29  59.29  59.28  58.72  55.73  48.22  32.82  30.22   9.69
33  61.09  61.08  61.07  61.07  59.81  57.26  48.22  32.89  30.23   9.69
34  63.33  63.32  63.31  63.31  62.55  57.29  48.30  32.90  30.70  11.06
35  65.78  65.76  65.76  65.75  64.15  58.31  48.54  32.90  31.19  20.63
36  68.15  68.14  68.13  68.13  65.56  58.49  48.54  32.91  31.20  20.64
37  70.12  70.10  70.09  70.07  67.35  64.27  48.54  37.92  31.20  31.20
38  72.43  72.41  72.40  71.14  69.73  69.70  48.54  37.92  31.20  31.20
39  74.79  74.76  74.75  73.76  71.03  69.70  48.54  37.93  31.21  31.20
40  77.14  77.09  77.07  75.66  73.79  70.43  48.54  42.36  42.36  31.20
41  79.97  79.86  79.84  79.14  74.64  70.45  48.54  42.36  42.36  31.20
42  82.34  82.23  82.20  80.29  77.18  70.45  48.54  44.97  42.36  31.20
43  85.23  84.85  84.80  81.52  77.18  70.45  48.54  44.97  42.36  31.21
44  87.52  87.12  87.03  85.20  77.18  70.45  48.54  44.97  42.36  31.21
45  89.32  88.71  88.66  85.21  77.18  70.45  48.54  44.97  42.36  31.21
46  91.47  88.75  88.69  85.21  77.18  70.45  48.54  44.97  42.36  31.21
47  93.71  88.78  88.70  85.21  77.19  70.45  48.54  44.97  42.36  31.21
48  95.80  88.79  88.70  85.21  77.19  70.45  48.54  44.97  42.36  31.21
49  97.85  88.81  88.70  85.21  77.19  70.45  48.54  44.97  42.36  31.21
50 100.00  88.82  88.71  85.21  77.19  70.45  48.54  44.97  42.36  31.21

Appendix F

20 over Resource Tables

F.1 First Innings Tables

Table F.1: R Resource Table

0 1 2 3 4 5 6 7 8 9 0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1 6.19 6.54 6.04 6.69 6.33 6.48 6.74 5.92 4.64 2 13.65 11.99 12.35 12.04 12.92 13.23 12.20 10.20 7.30 3 19.34 16.26 19.03 18.88 18.59 18.37 17.39 12.89 5.52 4 21.58 23.28 24.08 24.11 24.96 24.40 20.96 11.64 9.58 5 19.77 29.67 28.95 28.83 30.54 29.99 27.31 25.45 17.85 10.51 6 34.37 33.16 33.70 34.33 35.74 35.03 28.73 23.42 28.88 9.84 7 34.87 38.66 38.54 39.89 40.04 38.78 31.48 29.97 20.30 11.90 8 39.30 42.59 44.49 44.80 44.30 42.99 34.55 36.91 28.05 9 45.22 47.91 49.48 49.25 49.22 45.14 40.14 37.04 36.21 10 49.78 52.86 54.11 53.45 54.53 46.76 44.55 54.57 57.55 11 55.69 57.79 58.16 57.97 59.15 49.84 37.63 56.90 12 60.69 61.85 62.97 62.85 61.34 53.08 63.49 60.34 13 64.80 65.82 67.14 68.94 59.60 57.44 68.97 14 69.73 69.29 71.98 71.76 64.36 73.65 72.41 15 74.82 75.23 76.22 77.26 72.01 81.01 16 80.05 81.35 80.84 81.11 85.00 82.76 17 86.17 86.52 85.01 86.29 90.79 18 91.52 91.33 92.09 94.12 98.63 19 95.71 96.28 97.88 20 100.00

Table F.2: Isotonic Regression Resource Table

        0      1      2      3      4      5      6      7      8      9
 0   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
 1  39.30   6.46   6.46   6.45   6.45   6.45   6.45   6.45   5.20   3.14
 2  39.30  13.65  12.64  12.64  12.64  12.64  12.64  12.01   9.09   4.66
 3  39.30  19.34  18.53  18.53  18.53  18.27  18.27  16.67  10.58   4.66
 4  39.30  24.09  24.09  24.09  24.09  24.09  24.09  19.61  10.58   8.99
 5  39.30  29.67  29.52  29.52  29.52  29.52  26.95  23.21  15.99  10.24
 6  39.30  34.55  34.55  34.55  34.55  34.55  27.52  23.21  23.21  10.24
 7  39.30  39.30  39.30  39.30  39.30  38.34  30.37  29.56  23.21  11.90
 8  44.09  44.09  44.09  44.09  44.09  42.41  34.49  34.49  32.97  32.97
 9  48.81  48.81  48.81  48.81  48.81  44.38  39.30  38.47  38.47  38.47
10  53.35  53.35  53.35  53.35  53.35  45.53  45.17  45.17  45.17  45.17
11  57.82  57.82  57.82  57.82  57.82  48.98  45.17  45.17  45.17  45.17
12  62.28  62.28  62.28  62.28  60.09  53.28  53.28  53.28  52.68  52.68
13  66.66  66.66  66.66  66.66  60.09  58.27  58.27  53.28  52.68  52.68
14  70.48  70.48  70.48  70.48  64.80  64.80  58.27  53.28  52.68  52.68
15  75.49  75.49  75.49  75.49  71.27  69.99  58.27  53.28  52.68  52.68
16  80.80  80.80  80.80  80.80  80.80  69.99  58.27  53.28  52.68  52.68
17  86.29  86.29  85.00  85.00  84.67  69.99  58.27  53.28  52.68  52.68
18  91.48  91.39  91.39  85.00  84.67  69.99  58.27  53.28  52.68  52.68
19  95.74  95.74  92.86  85.00  84.67  69.99  58.27  53.28  52.68  52.68
20 100.00  95.74  92.86  85.00  84.67  69.99  58.27  53.28  52.68  52.68

F.2 Second Innings Tables

In this subsection, resource tables are calculated for T20I games using second-innings data. The tables are computed in exactly the same way as the first-innings tables; the only difference is that second-innings data are used.

Mean of Ratios (R)

Table F.5 and Figure F.1 give the calculated resource table for the mean of ratios.

F.2.1 Isotonic Regression

Table F.6 and Figure F.2 give the calculated resource table for isotonic regression.
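To illustrate the idea behind the monotonicity correction, the following is a minimal sketch of an unweighted, one-dimensional pool-adjacent-violators fit in Python. It is not the procedure used to produce Table F.6: the function name `pava_nondecreasing` is hypothetical, only one column is treated, and no weights are used. The input is the "9 wickets lost" column of Table F.5 for 3 to 8 overs remaining, where the raw values fall as overs remaining increase, although resources should not decrease.

```python
def pava_nondecreasing(values):
    """Least-squares non-decreasing fit by pooling adjacent violators (unweighted)."""
    blocks = []  # each block holds [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while the previous block mean exceeds the last block mean.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)  # every point in a block gets the block mean
    return fitted


# Table F.5, column "9 wickets lost", rows for 3 to 8 overs remaining.
raw = [7.48, 7.29, 7.43, 5.96, 2.92, 2.00]
fitted = pava_nondecreasing(raw)
print([round(x, 2) for x in fitted])
```

Because every later value in this slice is pulled below the earlier ones, the whole slice is pooled into a single block at its mean. The pooled value does not match the corresponding entries of Table F.6 (5.17), presumably because the table was fitted to the full surface rather than to this isolated slice.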

F.2.2 Optimization on R

Table F.7 and Figure F.3 give the calculated resource table for the optimized R1 method.

Table F.3: Optimized R Resource Table

        0      1      2      3      4      5      6      7      8      9
 0   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
 1   9.55   9.48   9.43   9.37   9.34   9.22   9.02   8.56   7.95   6.54
 2  18.38  18.11  17.92  17.71  17.61  17.14  16.43  14.86  12.96   9.20
 3  26.54  25.96  25.56  25.12  24.91  23.96  22.52  19.51  16.11  10.27
 4  34.08  33.11  32.44  31.71  31.37  29.82  27.52  22.93  18.10  10.71
 5  41.04  39.62  38.63  37.58  37.08  34.86  31.63  25.45  19.35  10.89
 6  47.49  45.54  44.21  42.80  42.13  39.19  35.01  27.30  20.14  10.96
 7  53.44  50.93  49.24  47.44  46.60  42.91  37.79  28.67  20.63  10.99
 8  58.94  55.84  53.76  51.56  50.55  46.11  40.07  29.68  20.95  11.00
 9  64.03  60.31  57.83  55.23  54.04  48.87  41.95  30.42  21.14  11.01
10  68.73  64.37  61.49  58.50  57.13  51.24  43.49  30.97  21.27  11.01
11  73.08  68.07  64.80  61.40  59.86  53.27  44.76  31.37  21.35  11.01
12  77.09  71.44  67.77  63.98  62.27  55.02  45.80  31.67  21.40  11.01
13  80.80  74.51  70.44  66.28  64.41  56.53  46.65  31.89  21.43  11.01
14  84.23  77.30  72.85  68.32  66.29  57.82  47.36  32.05  21.45  11.01
15  87.40  79.84  75.02  70.14  67.96  58.93  47.94  32.17  21.46  11.01
16  90.33  82.16  76.97  71.76  69.44  59.89  48.41  32.26  21.47  11.01
17  93.04  84.26  78.73  73.20  70.74  60.71  48.80  32.32  21.47  11.01
18  95.55  86.18  80.31  74.47  71.89  61.42  49.12  32.37  21.47  11.01
19  97.86  87.92  81.74  75.61  72.91  62.03  49.38  32.40  21.48  11.01
20 100.00  89.51  83.02  76.62  73.82  62.55  49.60  32.43  21.48  11.01

Table F.4: Gibbs Sampling Resource Table

        0      1      2      3      4      5      6      7      8      9
 0   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
 1  21.46   8.96   6.94   6.83   6.72   6.51   6.46   6.44   5.79   4.52
 2  27.05  14.03  12.79  12.73  12.72  12.70  12.62  12.17  10.29   6.46
 3  28.92  19.49  18.72  18.71  18.70  18.59  18.47  17.30  12.20   6.49
 4  32.61  24.54  24.44  24.43  24.39  24.39  24.37  20.68  12.25   8.83
 5  34.19  29.99  29.82  29.80  29.79  29.78  27.26  24.77  17.95   9.83
 6  35.30  34.89  34.84  34.83  34.82  34.74  28.93  24.84  24.83   9.85
 7  39.51  39.49  39.49  39.48  39.46  38.79  31.74  30.69  24.85   9.93
 8  44.40  44.40  44.39  44.38  44.25  43.16  35.25  34.81  27.78   9.96
 9  49.32  49.18  49.18  49.18  49.16  45.29  40.12  37.05  34.30  11.34
10  53.62  53.61  53.61  53.60  53.57  47.05  43.57  43.51  43.22  12.86
11  58.09  58.08  58.08  58.05  58.04  50.27  43.59  43.51  43.23  13.93
12  62.40  62.39  62.39  62.36  60.42  54.64  54.60  43.56  43.25  16.90
13  67.06  67.05  67.04  67.02  60.44  57.61  54.61  43.58  43.37  19.85
14  70.77  70.75  70.75  70.73  66.20  66.16  54.62  48.05  47.97  20.20
15  75.87  75.84  75.84  75.80  72.26  66.47  54.62  50.79  49.48  23.65
16  81.01  80.96  80.92  80.90  80.87  66.47  61.66  61.18  53.49  36.06
17  86.25  86.24  85.02  81.22  81.14  66.48  64.60  61.77  54.04  38.96
18  91.58  87.02  86.99  81.23  81.21  70.57  66.17  62.43  55.58  41.88
19  95.75  87.02  87.00  81.23  81.22  73.71  68.54  63.67  59.12  50.41
20 100.00  87.02  87.00  86.81  84.78  78.14  77.75  71.48  67.41  52.10

Table F.5: R Resource Table for Innings 2

0 1 2 3 4 5 6 7 8 9 0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1 18.27 1.76 3.67 5.34 5.38 5.10 5.89 5.02 4.45 2 11.40 4.19 9.05 9.64 9.96 11.03 11.69 8.91 4.21 3 4.72 10.78 9.19 14.47 14.95 15.41 15.42 14.88 11.40 7.48 4 9.43 16.76 14.95 19.20 21.00 20.87 21.08 20.78 13.37 7.29 5 8.17 22.18 21.10 23.38 24.48 26.67 25.47 24.06 15.13 7.43 6 12.27 24.06 25.74 29.69 29.09 32.13 29.19 27.24 20.79 5.96 7 16.76 29.47 32.96 34.59 34.32 35.96 36.29 27.79 21.53 2.92 8 17.42 36.57 38.93 38.61 39.69 40.75 36.35 40.33 21.99 2.00 9 22.96 42.75 43.47 44.41 45.86 42.55 37.96 43.01 41.31 10 32.13 46.85 48.66 49.25 50.07 45.93 48.53 38.89 58.13 11 41.05 51.37 53.50 54.00 54.48 49.80 53.22 51.44 42.08 12 46.85 57.20 57.93 59.55 57.59 52.42 63.13 46.67 34.88 13 52.93 61.59 63.53 63.63 62.64 52.32 54.95 60.21 14 60.09 66.51 68.29 68.23 61.30 57.59 71.25 60.76 15 68.15 72.65 72.69 73.50 66.68 78.28 16 75.77 77.63 77.66 81.06 78.88 84.92 17 82.45 83.24 85.70 84.22 89.62 18 88.27 89.55 90.97 91.61 93.94 19 94.52 94.93 97.21 99.01 20 100.00

Figure F.1: Heatmap (levelplot) of R Resource Table for Innings 2. Horizontal axis: Overs Remaining (0 to 20); vertical axis: Wickets Lost (0 to 9); colour scale: 0 to 100.

Table F.6: Isotonic Regression Resource Table for Innings 2

        0      1      2      3      4      5      6      7      8      9
 0   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
 1  22.43   8.63   5.66   3.95   3.95   3.95   3.95   3.95   3.95   3.16
 2  22.43   8.63   8.20   8.20   8.20   8.20   8.20   8.20   7.94   3.16
 3  22.43  12.52  12.52  12.52  12.52  12.52  12.52  12.52  10.90   5.17
 4  22.43  18.04  18.04  18.04  18.04  18.04  18.04  18.04  11.47   5.17
 5  22.43  22.43  22.43  22.43  22.43  22.43  22.43  22.43  13.78   5.17
 6  26.78  26.78  26.78  26.78  26.78  26.78  26.78  25.62  19.82   5.17
 7  32.10  32.10  32.10  32.10  32.10  32.10  32.10  26.79  21.02   5.17
 8  36.62  36.62  36.62  36.62  36.62  36.62  36.62  36.62  21.61   5.17
 9  41.63  41.63  41.63  41.63  41.63  41.63  38.97  38.97  38.97  38.97
10  46.39  46.39  46.39  46.39  46.39  46.01  46.01  43.34  43.34  43.34
11  50.94  50.94  50.94  50.94  50.94  49.03  49.03  49.03  43.34  43.34
12  55.54  55.54  55.54  55.54  55.54  52.95  52.95  49.03  43.34  43.34
13  59.98  59.98  59.98  59.98  59.98  52.95  52.95  52.95  50.92  50.92
14  64.96  64.96  64.96  64.96  61.12  54.80  54.80  53.38  50.92  50.92
15  70.08  70.08  70.08  70.08  67.97  67.97  54.80  53.38  50.92  50.92
16  76.11  76.11  76.11  76.11  76.11  67.97  54.80  53.38  50.92  50.92
17  82.20  82.20  82.20  82.20  82.20  67.97  54.80  53.38  50.92  50.92
18  88.09  88.09  88.09  88.09  82.20  67.97  54.80  53.38  50.92  50.92
19  94.06  94.06  94.06  88.09  82.20  67.97  54.80  53.38  50.92  50.92
20 100.00  94.06  94.06  88.09  82.20  67.97  54.80  53.38  50.92  50.92

Figure F.2: Heatmap (levelplot) of Isotonic Regression Resource Table for Innings 2. Horizontal axis: Overs Remaining (0 to 20); vertical axis: Wickets Lost (0 to 9); colour scale: 0 to 100.

Table F.7: Optimized R Resource Table for Innings 2

        0      1      2      3      4      5      6      7      8      9
 0   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
 1   8.08   8.03   8.00   7.96   7.95   7.87   7.74   7.45   7.06   6.10
 2  15.72  15.54  15.41  15.28  15.22  14.92  14.46  13.42  12.10   9.27
 3  22.95  22.55  22.28  22.00  21.87  21.24  20.28  18.20  15.71  10.91
 4  29.78  29.11  28.65  28.17  27.95  26.90  25.33  22.03  18.29  11.77
 5  36.24  35.23  34.55  33.84  33.51  31.98  29.71  25.10  20.14  12.21
 6  42.35  40.95  40.02  39.05  38.59  36.53  33.51  27.56  21.46  12.44
 7  48.12  46.30  45.08  43.83  43.24  40.60  36.81  29.52  22.40  12.56
 8  53.58  51.30  49.78  48.23  47.50  44.25  39.66  31.10  23.07  12.63
 9  58.75  55.97  54.13  52.26  51.39  47.53  42.14  32.37  23.56  12.66
10  63.63  60.33  58.16  55.97  54.95  50.46  44.29  33.38  23.90  12.68
11  68.25  64.41  61.90  59.38  58.20  53.09  46.15  34.19  24.15  12.68
12  72.62  68.22  65.36  62.50  61.18  55.44  47.77  34.84  24.32  12.69
13  76.75  71.78  68.57  65.38  63.90  57.55  49.17  35.36  24.45  12.69
14  80.65  75.10  71.55  68.01  66.39  59.45  50.39  35.77  24.54  12.69
15  84.34  78.21  74.30  70.44  68.67  61.14  51.44  36.11  24.60  12.69
16  87.83  81.12  76.86  72.66  70.75  62.66  52.35  36.37  24.65  12.69
17  91.14  83.83  79.22  74.71  72.65  64.02  53.15  36.59  24.68  12.69
18  94.26  86.37  81.42  76.59  74.40  65.24  53.83  36.76  24.71  12.69
19  97.21  88.74  83.45  78.31  75.99  66.34  54.43  36.90  24.72  12.69
20 100.00  90.95  85.33  79.90  77.45  67.32  54.95  37.01  24.73  12.69

Figure F.3: Heatmap (levelplot) of Optimized R Resource Table for Innings 2. Horizontal axis: Overs Remaining (0 to 20); vertical axis: Wickets Lost (0 to 9); colour scale: 0 to 100.
