
BAYESIAN NON-LINEAR QUANTILE REGRESSION

WITH APPLICATION IN DECLINE CURVE

ANALYSIS FOR PETROLEUM RESERVOIRS.

by

YOUJUN LI

Submitted in partial fulfillment of the requirements

for the degree of Master of Science

Department of Mathematics, Applied Mathematics and Statistics

CASE WESTERN RESERVE UNIVERSITY

May, 2017

CASE WESTERN RESERVE UNIVERSITY

SCHOOL OF GRADUATE STUDIES

We hereby approve the dissertation of Youjun Li

candidate for the degree of Master of Science*.

Committee Chair Dr. Anirban Mondal

Committee Member Dr. Jenny Brynjarsdottir

Committee Member Dr. Wojbor Woyczynski

Date of Defense

March 31, 2017

*We also certify that written approval has been obtained

for any proprietary material contained therein.

Contents

List of Tables

List of Figures

Acknowledgments

Abstract

1 Introduction

2 Mean Regression And Quantile Regression Compared (Linear Case)
  2.1 Mean Regression
  2.2 Quantile Regression
  2.3 Comparison with Examples
    2.3.1 The Household Income Dataset
    2.3.2 Simulated Data with Outliers
    2.3.3 Econometric Growth Dataset

3 Bayesian Linear Mean and Quantile Regressions
  3.1 Bayesian Linear Mean Regression
  3.2 Bayesian Linear Quantile Regression
  3.3 Asymmetric Laplace As Error Distribution
  3.4 Bayesian Quantile Regression Demonstrated Using a Real Dataset

4 Bayesian Nonlinear Regression with Simulated Oil Reservoir Data
  4.1 A Brief Description of Decline Curve Analysis
  4.2 The Simulated Data with Asymmetric Laplace Error
  4.3 Bayesian Nonlinear Mean Regression
  4.4 Bayesian Nonlinear Quantile Regression

5 Bayesian Nonlinear Regressions with Real Oil Reservoir Data
  5.1 Bayesian Model and Sampling from the Posterior
    5.1.1 Prior for q0
    5.1.2 Prior for b
    5.1.3 Prior for d0
    5.1.4 Prior for σ
    5.1.5 The Advantage of “rstan”
  5.2 Data Analysis for the First Well
  5.3 Data Analysis for the Second Well

6 Discussion

7 Future Work

8 Bibliography

List of Tables

3.1 Bayesian Summary of “Prostate Cancer” Dataset

4.1 Summary of Bayesian Median Regression for Simulated Data

5.1 Prior Parameters for q0
5.2 Prior Parameters for b
5.3 Prior Parameters for d0
5.4 Prior Parameters for σ
5.5 Median of P-Curves for Well 1
5.6 Median of P-Curves for Well 2

List of Figures

2.1 Graph of the Loss Function
2.2 Example 1-a Household Income Dataset
2.3 Example 1-b Household Income Dataset with Mean Regression
2.4 Example 1-c Household Income Dataset with Five Quantile Regressions
2.5 Example 2 Mean and Quantile Regression Against Outliers
2.6 Example 3-a GDP and Female Secondary Education
2.7 Example 3-b Slopes of Quantile Regressions
2.8 Example 3-c All Quantile Regression Lines

3.1 Example 4-a Simulated Data with Normal Error
3.2 Example 4-b Residual Plots
3.3 Example 5 Traceplot for “Age”

4.1 Example 6 Data Plot
4.2 Example 6 Bayesian Nonlinear Mean Regression Line
4.3 Example 6 Traceplots and Histograms
4.4 Example 6 Posterior Predictive Curves for Median Regression

5.1 Example 7
5.2 Example 8
5.3 Example 7-a Traceplots of Mean Regression
5.4 Example 7-b Traceplots of Median Regression
5.5 Example 7-c Traceplots of 10th Quantile Regression
5.6 Example 7-d Traceplots of 90th Quantile Regression
5.7 Example 7-e Histograms of Mean Regression
5.8 Example 7-f Histograms of Median Regression
5.9 Example 7-g Histograms of 10th Quantile Regression
5.10 Example 7-h Histograms of 90th Quantile Regression
5.11 Example 7-i Fitted Curves
5.12 Example 7-j P10 P50 P90 Curves
5.13 Example 7-k Posterior Predictive Curves of Median Regression
5.14 Example 8-a Well Two Fitted Curves Compared
5.15 Example 8-b Well Two P90 P50 P10 Curves
5.16 Example 8-c Well Two Posterior Predictive Curves

Acknowledgments

I would like to express my special thanks, warmth and appreciation to the people below, who made my study successful and assisted me at every point of the thesis process in pursuit of my goal:

My academic and thesis advisor Dr. Anirban Mondal, for his guidance and expertise that made the whole thing possible.

My committee member and course instructor Dr. Jenny Brynjarsdottir, for her professional and kind suggestions that helped me improve the writing of the thesis.

My committee member Dr. Wojbor Woyczynski, whose encouragement made me proud of what I have done.

My classmate Yuchen Han for reminding me to keep improving.

And last but not least, my parents and family, for being supportive no matter what. I am nothing without them.

Bayesian Non-linear Quantile Regression with Application in Decline Curve Analysis for Petroleum Reservoirs.

Abstract by YOUJUN LI

In decline curve analysis for hydrocarbon reservoirs, the use of quantile regression instead of the conventional mean regression would be appropriate in the context of oil industry requirements, as the fitted quantile regression curves have the correct interpretation for the predicted reserves. However, percentiles of a mean regression result have commonly been reported. In this thesis, we consider a non-linear quantile regression model where the quantiles of the conditional distribution of the production rate are expressed as standard non-linear functions of time, under a Bayesian framework. The posterior distribution of the regression coefficients and other parameters is intractable, mainly due to the non-linearity in the quantile regression function; hence a Metropolis-Hastings algorithm is used to sample from the posterior. A quantitative assessment of the uncertainty of the decline parameters and the future prediction is provided for two real datasets.

1. Introduction

Unlike conventional mean regression, quantile regression was comprehensively studied rather late and is not as widely used as mean regression. Owing to its straightforward intuition, mean regression has dominated statistical analysis in both business and industry. However, quantile regression handles some difficulties that mean regression fails to overcome. Thanks to the work of Koenker and Bassett Jr (1978), quantile regression became an alternative tool to the conventional mean regression in statistical research. In quantile regression models the quantiles of the conditional distribution of the response variable are expressed as functions of the covariates. Quantile regression is particularly useful when the conditional distribution is heterogeneous and does not have a “standard” shape, such as an asymmetric, fat-tailed, or truncated distribution. Compared to conventional mean regression, quantile regression is more robust to outliers and to misspecification of the error distribution. It also provides an interpretation of the relationship between different quantiles of the response variable and the predictive variables. On the other hand, Bayesian methodology has been growing rapidly in the digital era. More and more traditional statistical methods have evolved with the integration of Bayesian theory. Kottas and Gelfand (2001) attempted to combine the Bayesian approach with quantile regression by considering non-parametric modeling for the error distribution of median regression, a special case of quantile regression, based on


either Pólya tree or Dirichlet process priors. Yu and Moyeed (2001) further discussed modeling the error distribution with the asymmetric Laplace distribution, by showing that minimizing the loss function of quantile regression is equivalent to maximizing an asymmetric Laplace likelihood. Although Yu and Moyeed provided a very thorough study of Bayesian quantile regression, they left open the idea of using informative priors for the parameters of the asymmetric Laplace distribution, such as a prior for the scale parameter σ. In this paper, we will also consider an informative prior for σ with a relatively large standard deviation, so that we can bring the estimate of the error distribution’s variance into the posterior inference. Unlike Bayesian quantile regression for linear models, the application of Bayesian quantile regression to non-linear parametric models has not been well studied in the statistics community. In this thesis, we focus on a particular industrial field, the oil industry, and propose an alternative approach to well data analysis other than conventional mean regression. Due to the nature of oil exploitation, the production rate tends to decline over time. Hence the data distribution will not have a “standard” shape, which makes mean regression less adequate. More importantly, the industry regulations require certain percentiles to be reported, indicating that no other approach should be more suitable than quantile regression for this particular problem. Confusing as it may be, merely reporting the percentiles of a mean regression result is incorrect. Unfortunately, this kind of mistake is commonly seen in the oil industry. Raising awareness of this problem is one of the primary aims of this thesis. More precisely, in order to estimate the recoverable reserves of hydrocarbon reservoirs, people use a nonlinear regression model called “decline curve analysis”, where certain types of standard curves are fitted based on past production performance and are then extrapolated to predict future well performance. However, according

to the Securities and Exchange Commission (SEC) handbook, the reported reserves estimate is defined as a “percentile” rather than a “mean”, such as what are called the P90 and P10 curves. The names of the curves can be confusing, since they are defined to describe the percentage of the data that are above the curve. For example, P90 means that 90% of the data are above this curve, which actually implies that the curve should just be the 10th percentile curve. What most people do now is simply to report the percentile out of a mean regression result, which is clearly not what is defined in the handbook. Hence a quantile regression at the respective “percentile” is what should really be applied. Furthermore, the complexity of the decline curve model and of the real field data makes regular quantile regression difficult: the solution is subject to an optimization problem whose result is highly sensitive to the choice of starting points, and sometimes it is even hard to achieve convergence. The Bayesian method, on the other hand, takes account of the uncertainty of the parameters and also allows for the incorporation of the user's own beliefs by adding prior distributions for the parameters. And with the help of the Metropolis-Hastings algorithm, the inference tends to work better than regular quantile regression. In this thesis, we first explain how quantile regression is derived, and then compare linear mean and quantile regression, showing the general advantages of quantile regression over mean regression with examples of real data. Secondly, the Bayesian method is applied to both linear mean and quantile regression by forming a likelihood from the normal distribution and the asymmetric Laplace distribution respectively, to demonstrate how the Bayesian method can be combined with regression. Then we simulate a data set with the implementation of the decline curve model from the oil industry to apply Bayesian nonlinear mean regression, and Bayesian nonlinear quantile regression with uniform priors for all the parameters. We have demonstrated why quantile regression should be the better approach for the specific

problem, i.e. oil well data analysis. We do so by comparing the two kinds of regressions, plotting the fitted value curve of the mean regression and the posterior predictive curves of the quantile regression, as well as following the industry regulations. Eventually, two real oil well datasets are taken through the above procedure to make inference under Bayesian mean regression and Bayesian quantile regression. Instead of using uniform priors, we apply informative priors not only for the parameters of the decline curve model, but also for the scale parameter of the asymmetric Laplace likelihood.

2. Mean Regression And Quantile Regression Compared (Linear Case)

2.1 Mean Regression

Recall the modeling of linear mean regression: the conditional mean of the response variable, given the value of the predictive variable(s), is expressed as a linear function of the predictive variable(s), i.e. $y_i = x_i'\beta + \epsilon_i$. The unknown parameter $\beta$ is usually estimated by the least squares approach, which is to find the value of $\beta$ that minimizes the sum of the squared errors:

$$\sum_{i=1}^{n} (y_i - x_i'\beta)^2$$

The solution is easily found by taking the derivative of the sum of the squared errors with respect to $\beta$, setting it equal to zero and solving for $\beta$. Under the normality assumption, the maximum likelihood method for the normal distribution happens to be equivalent to the least squares method. One might use it as an alternative way to obtain the estimated value of the parameter.


2.2 Quantile Regression

Before we get to the modeling of linear quantile regression, we first define the $\tau$th quantile of a random variable $Y$ with cumulative distribution function $F_Y$ as:

$$q_Y(\tau) := F_Y^{-1}(\tau) = \inf\{y : F_Y(y) \geq \tau\}$$

where $\tau \in [0, 1]$. Now suppose the $\tau$th conditional quantile of $Y$ given $X$ is written as a linear function of $X$, i.e.

$$q(\tau) = x'\beta + e_i$$

Analogous to the least squares method, the $\tau$th quantile can also be obtained by minimizing the expected loss of $y - q(\tau)$ with respect to $q(\tau)$:

$$\hat{q}_\tau = \arg\min_q E_Y[\rho_\tau(y - q)] = \arg\min_q \Big[\tau \int_{y > q} (y - q)\,dF_Y(y) - (1-\tau) \int_{y < q} (y - q)\,dF_Y(y)\Big]$$

where $\rho_\tau(y - q) = (y - q)\big(\tau - I_{(y - q < 0)}\big)$ is the loss function. For real data analysis problems we would like the sample quantile, obtained by the same approach with the integral replaced by a summation:

$$\hat{q}_\tau = \arg\min_q \sum_{i=1}^{n} \rho_\tau(y_i - q)$$

With the above linear specification for the conditional quantile, we finally have the estimate of the parameter:

$$\hat{\beta}_\tau = \arg\min_\beta \sum_{i=1}^{n} \rho_\tau(y_i - x_i'\beta)$$

However, to solve for $\hat{\beta}$, can one use the same approach by taking the derivative of the expected loss function?


Figure 2.1: Graph of the Loss Function

Figure 2.1 shows the graph of the loss function. The expected loss function is not differentiable at $y = x'\hat{\beta}$ (the elbow point), which means the solution cannot be computed by conventional numerical methods (Kuan, 2007). We thus reformulate the problem as a linear programming problem: minimize $c'z$ with respect to $z$, such that $y = Az$ and $z$ contains only non-negative elements, where

$$c = \big(0', 0', \tau 1', (1 - \tau)1'\big)'$$

$$z = \big(\beta^{+\prime}, \beta^{-\prime}, e^{+\prime}, e^{-\prime}\big)'$$

$$A = (X, -X, I_n, -I_n)$$

Note that $\beta^+ = \max(\beta, 0)$ and $\beta^- = -\min(\beta, 0)$, and the same applies to $e^+$ and $e^-$. Then simplex methods or interior point methods can be applied to solve the reformulated linear programming problem. Moreover, sequential quadratic programming shall be applied when the model specification is nonlinear. Similar to mean regression, if the error term is assumed to follow an asymmetric Laplace distribution, the maximum likelihood estimator is equivalent to the quantile regression estimator $\hat{\beta}$. We will discuss this in more detail in the Bayesian quantile regression part.
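Both families of algorithms are available off the shelf. Below is a minimal sketch in R with the quantreg package (the simulated data are purely illustrative): method = "br" selects the simplex-type Barrodale-Roberts algorithm, and method = "fn" the Frisch-Newton interior point method.

# Median regression fitted with two different optimization back ends
library(quantreg)

set.seed(1)
x <- 1:100
y <- 8 * x + rnorm(100, sd = 100)

fit_br <- rq(y ~ x, tau = 0.5, method = "br")   # simplex-type algorithm
fit_fn <- rq(y ~ x, tau = 0.5, method = "fn")   # interior point method
coef(fit_br)
coef(fit_fn)   # both back ends should agree up to numerical tolerance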

2.3 Comparison with Examples

2.3.1 The Household Income Dataset

As the setups for each regression have been specified, we can go ahead and compare the two methods by applying them to real datasets. The first dataset comes with the package “quantreg”, which will be used to perform quantile regression later on. The dataset contains a number of households’ annual income and their respective annual expenditure on food (both in Belgian francs). We will use regression to find whether there is any linear relationship between these two variables, setting the income as the predictive variable and the food expenditure as the response variable. Let us first take a look at the plot of the dataset in Figure 2.2. As the skewed plot depicts, the data are not evenly distributed: the data points are concentrated in the lower income range, and spread out drastically as the income goes up. If we were to fit a linear mean regression model to this dataset, the regression line would look like Figure 2.3. The plot clearly shows that when the income is higher than 1000 francs, the mean regression line can hardly represent the data trend, which makes our interpretation of the regression result pointless in reality. Now we perform quantile regressions at the 5th, 25th, 50th, 75th and 95th quantiles respectively and plot the quantile regression lines.
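A sketch of this analysis in R, assuming the dataset is the “engel” data shipped with quantreg (household income and food expenditure, in Belgian francs):

library(quantreg)
data(engel)

# scatter plot of the raw data (cf. Figure 2.2)
plot(engel$income, engel$foodexp,
     xlab = "household income", ylab = "food expenditure")

# mean regression line (cf. Figure 2.3)
abline(lm(foodexp ~ income, data = engel), lty = 2)

# five quantile regression lines (cf. Figure 2.4)
taus <- c(0.05, 0.25, 0.50, 0.75, 0.95)
fit  <- rq(foodexp ~ income, tau = taus, data = engel)
for (j in seq_along(taus))
  abline(coef(fit)[1, j], coef(fit)[2, j])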


Figure 2.2: Example 1-a Household Income Dataset


Figure 2.3: Example 1-b Household Income Dataset with Mean Regression


Figure 2.4: Example 1-c Household Income Dataset with Five Quantile Regressions


As depicted in Figure 2.4, the linear trends only appear similar to the mean regression line at the 25th percentile, while the rest show very different slopes from that of the mean regression line. As a result, the differences allow us to make realistic and reasonable interpretations of the dataset. For example, the 95th quantile regression line has the biggest slope, suggesting that, in general, for those households that have been spending more on food already, their food expenditure would go up more rapidly when their income increases. Similarly, those households that like to spend little on food most likely will not spend very differently if their income changes, as the 5th quantile regression line looks relatively flat. This interpretation is reasonable because when not making much money most people tend to form a habit of saving money, so their expenditure would not be too sensitive to how much they earn; but if they have already been making good money, they have no reason to economize their spending, especially on food, so they tend to spend more when they make more money than before, and vice versa.

2.3.2 Simulated Data with Outliers

The second example is a simulated dataset with normal errors, designed to find out what difference outliers make to the mean and quantile regression lines. The intercept is set to zero and the slope is 8. We add 4 outliers to the tails of the response variable, as the quantile regression line we will examine is the median regression line. Figure 2.5 gives the plot of the data. From the plot we can tell that no obvious change has been made to the median regression line, yet the mean regression line is pivoted by the outliers. This robustness of quantile regression against outliers makes the result of quantile regression more reliable when dealing with a really noisy dataset. However, we have to pay attention to the position of the outliers: if they happen to lie on the same percentile as that of the quantile regression, the robustness would fail.
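A sketch of this simulation in R; the outlier positions and magnitudes below are illustrative assumptions, not the exact values behind Figure 2.5:

library(quantreg)

set.seed(2)
x <- 1:100
y <- 8 * x + rnorm(100, sd = 100)   # true line: zero intercept, slope 8

# four outliers in the tails of the response (assumed placement)
y[c(1, 2, 99, 100)] <- y[c(1, 2, 99, 100)] + c(1500, 1500, -1500, -1500)

coef(lm(y ~ x))              # mean line: pivoted by the outliers
coef(rq(y ~ x, tau = 0.5))   # median line: stays near the true slope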


Figure 2.5: Example 2 Mean and Quantile Regression Against Outliers


2.3.3 Econometric Growth Dataset

The third example is designed to show how quantile regression can help spot hidden relationships at the tails of the conditional distribution when the distribution is symmetric. The dataset comes from a study by Barro and Lee (1994), consisting of national growth rates (GDP) and 13 other variables that may potentially have an impact on GDP, for 161 countries from 1965 to 1985. For the purpose of demonstration, we only take one covariate, called “Female Secondary Education” (fse2), to study the impact it might have on GDP. Figure 2.6 gives the plot of the annual change in per capita GDP against Female Secondary Education in average years of secondary schooling. A linear mean regression is fitted first. We get an estimated slope of 0.003, which is close to zero, indicating no obvious effect of Female Secondary Education on GDP, though some economists believe otherwise. Next we perform multiple quantile regressions at tau = (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9) to compare with the result of mean regression. We plot the slope values from each quantile regression against their corresponding quantiles (Figure 2.7). The horizontal line stands for the mean regression slope value, 0.003. However, if we look at the slope values from the quantile regressions, they clearly decrease as the percentile increases, from positive to negative, passing through the mean regression slope value somewhere between the 30th and 40th quantile. We can also plot all the quantile regression lines as we did for the last example; they form a “V” shaped graph (Figure 2.8). By doing quantile regressions at different percentiles, it is reasonable to conclude that the effect of Female Secondary Education on GDP does not stay constant over the entire distribution. The effect becomes stronger in both the lower and upper tails, but with one being positive and the other negative. The reason why a mean regression would only show a non-significant relationship is that the opposite effects
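A sketch in R, assuming the data are available as the “barro” dataset shipped with quantreg, with y.net the annual growth rate and fse2 female secondary education:

library(quantreg)
data(barro)

coef(lm(y.net ~ fse2, data = barro))   # mean slope, close to zero

taus <- seq(0.1, 0.9, by = 0.1)
fit  <- rq(y.net ~ fse2, tau = taus, data = barro)
coef(fit)["fse2", ]   # slopes move from positive to negative as tau grows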


Figure 2.6: Example 3-a GDP and Female Secondary Education


Figure 2.7: Example 3-b Slopes of Quantile Regressions


Figure 2.8: Example 3-c All Quantile Regression Lines

of the quantiles in the two tails get neutralized when averaging into the overall mean. One thing we would like to point out is that, for this particular example, in order to showcase how quantile regression can provide deeper insight into a dataset, we did not consider other models besides the linear one. That is to say, we assume a linear model is appropriate to describe the relationship between Female Secondary Education and GDP.

3. Bayesian Linear Mean and Quantile Regressions

3.1 Bayesian Linear Mean Regression

In order to apply the Bayesian method to any model, we need to specify the conditional distribution of the data $Y$ given the parameters, i.e. the likelihood. It is commonly assumed that the error term of a linear model has a normal distribution, so that the response has mean given by the linear function $x'\beta$. Then the maximum likelihood method can be used to find the parameter solution, which is equivalent to that of the least squares method. This implies that the normal likelihood should be appropriate for the Bayesian approach. The posterior distribution of the parameter is then:

$$\pi(\beta|y) \propto p(\beta)\,(2\pi\sigma^2)^{-n/2} \exp\Big\{-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i - x_i'\beta)^2\Big\}$$

It is proportional to the product of some prior distribution for β and the normal likelihood.

3.2 Bayesian Linear Quantile Regression

Similar to Bayesian mean regression, the key is to find an appropriate likelihood for the quantile model. Koenker and Machado (1999) showed that a random variable U

is said to follow an asymmetric Laplace distribution if its density is given by:

$$f_p(u; \mu, \sigma) = \frac{p(1-p)}{\sigma}\exp\Big\{-\rho_p\Big(\frac{u-\mu}{\sigma}\Big)\Big\}$$

where $\mu$ is the location parameter, $\sigma$ is the scale parameter, $0 < p < 1$, and $\rho_p(u)$ is defined the same as the loss function of quantile regression in the last section. The distribution is skewed when $p \neq 0.5$, and tends to fit better if the data are noisy in the tails of the distribution. Yu and Moyeed (2001) connected this distribution with quantile regression by showing that, no matter what the actual distribution of the error term is, forming the likelihood based on the asymmetric Laplace distribution works well. Setting $p$ to the quantile level $\tau$, $u$ to $Y$, $\mu$ to $q(\tau)$, and $\sigma$ to 1 for simplicity, we now have the setup for the asymmetric Laplace likelihood for the $\tau$th quantile regression model. Furthermore, with the specification $q(\tau) = x'\beta$, the setup for Bayesian linear quantile regression is complete. The posterior distribution of the parameter is then:

$$\pi(\beta|y) \propto p(\beta)\,\tau^n(1-\tau)^n \exp\Big\{-\sum_{i=1}^{n}\rho_\tau(y_i - x_i'\beta)\Big\}$$

where $p(\beta)$ is some prior distribution for the parameter $\beta$.

3.3 Asymmetric Laplace As Error Distribution

In order to show that the asymmetric Laplace distribution (ALD) works even when the real error distribution is different, we simulate simple linear regression data with noise distributed as $N(0, 10000)$. As in the previous simulated data, the slope is set to 8 with no intercept, and the predictor is a sequence from 1 to 100. The data plot is in Figure 3.1. In R, there is a convenient package for Bayesian linear quantile regression called “bayesQR”, one of whose authors is actually Keming Yu, the same one who


Figure 3.1: Example 4-a Simulated Data with Normal Error


Figure 3.2: Example 4-b Residual Plots

proposed the asymmetric Laplace likelihood for Bayesian quantile regression. It implements what we mentioned above and uses a Markov chain Monte Carlo algorithm to sample from the posterior distribution. We use this package to run a Bayesian linear quantile regression for the median and summarize the posterior samples, with half of the 10000 iterations as burn-in. The estimated slope is 8.08 with a 95% credible interval of (7.94, 8.16), and the residual plot looks fairly similar to the real normal error plot (Figure 3.2). That means our method still works well even if the original error distribution is normal.
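A sketch of this fit in R (the simulated data are regenerated for illustration; argument names follow the bayesQR documentation as we recall it):

library(bayesQR)

set.seed(3)
x <- 1:100
y <- 8 * x + rnorm(100, sd = 100)

out <- bayesQR(y ~ x, quantile = 0.5, ndraw = 10000)
summary(out, burnin = 5000)   # posterior estimates and credible intervals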

3.4 Bayesian Quantile Regression Demonstrated Using a Real Dataset

The R package “bayesQR” also provides datasets for users to get familiar with the package. We will use the “Prostate Cancer” dataset to illustrate how Bayesian linear quantile regression and its inference are done.

Covariates   Bayes Estimate   Lower     Upper
Intercept    -0.0183          -0.0874    0.0505
lcavol        0.5549           0.4498    0.6629
lweight       0.2263           0.1357    0.3205
age          -0.1565          -0.2326   -0.0774
lbph          0.1836           0.1000    0.2655
svi           0.2859           0.1834    0.3881
lcp          -0.1525          -0.2760   -0.0294
gleason       0.0711          -0.0415    0.1764
pgg45         0.1242           0.0116    0.2423

Table 3.1: Bayesian Summary of “Prostate Cancer” Dataset

The dataset consists of the medical records of 97 male patients who were to receive a radical prostatectomy. We are interested in examining the correlation between the level of prostate-specific antigen (lpsa) and some other predictive variables such as the log of prostate weight (lweight), age, and so on. We fit a median regression model including all 8 covariates. We first check the convergence of the MCMC chains by plotting the traceplots, with 2000 burn-in out of 5000 iterations. We only show the traceplot for age (Figure 3.3), as the rest behave similarly. Judging by the plot, the chain converges very fast. Once convergence is confirmed, we have various ways to make posterior inference by looking at different statistics of the posterior distribution. Here we decide to take a 90% credible interval. The summary of all 8 covariates is given in Table 3.1. The uncertainty of each covariate estimate is reflected by the credible intervals, whose interpretation is far more straightforward than that of a confidence interval: there is a 90% chance that the true value of the parameter lies inside the credible interval. Additionally, we can also plot the histograms of the samples of each parameter to have a better look at their marginal posterior distributions. Sometimes a marginal posterior distribution happens to have more than one mode; then


Figure 3.3: Example 5 Traceplot for “Age”

all the modes should be reported.

4. Bayesian Nonlinear Regression with Simulated Oil Reservoir Data

4.1 A Brief Description of Decline Curve Analysis

Decline curve analysis (DCA) is a graphical procedure used for analyzing declining production rates and forecasting future performance of oil and gas wells. Oil and gas production rates decline as a function of time; loss of reservoir pressure, or changing relative volumes of the produced fluids, are usually the cause. Fitting a line through the performance history and assuming this same trend will continue in the future forms the basis of the DCA concept. The basic assumption in this procedure is that whatever causes controlled the trend of a curve in the past will continue to govern its trend in the future in a uniform manner. Arps (1945) developed a set of equations defining the decline curve, based on numerical study and some justifications from the physics of fluid flow. His equations are still the foundation of today’s decline curve applications, including those for oil reservoirs.

$$q(t) = \frac{q_0}{(1 + b\,d_0\,t)^{1/b}}$$

This is Arps’ equation for general decline in a well, also called the Hyperbolic Decline. If one applies this model to oil well data, the current production rate $q(t)$ is fully determined as a function of time ($t$ in days) once the three


parameters, the initial rate ($q_0$), the degree of curvature of the line ($b$) and the initial decline rate ($d_0$), are known. As we are to apply this general decline model to the oil reservoir data, these three parameters are our main interest.

Several remarks regarding the $b$ and $d_0$ parameters shall be made. The parameter $b$ takes a range from 0 to 1, and is most commonly seen to be less than 0.5. Although some rare cases where $b > 1$ have been found, they are highly unlikely to appear in the oil industry. When $b$ takes the value of 0 or 1, we have two special cases called Exponential and Harmonic Decline, respectively.

When $b = 0$, the decline model can be reformulated as $q(t) = q_0 \exp\{-dt\}$, where $d$ is called the nominal decline rate. Since it is defined as $d = -\frac{d(\ln q)}{dt}$, it is fixed regardless of the choice of the baseline point where the decline starts.

When $b = 1$, we get the Harmonic Decline: $q(t) = \frac{q_0}{1 + d_0 t}$. Here $d_0$ is the initial decline rate, just as in the Hyperbolic Decline. In some cases it is calculated from what is called the effective decline factor, defined as $d = \frac{q_0 - q_1}{q_0}$, which changes with the choice of the baseline point. Details of how to calculate the initial decline rate will be provided in the following sections.

The parameter $d_0$ is normally between 0 and 0.2. Since we will apply the more general model, the Hyperbolic Decline, we treat $d_0$ as the initial decline rate.

4.2 The Simulated Data with Asymmetric Laplace Error

We assign some values to the above 3 parameters of our own choice, attempting to be as realistic as possible by restricting the values within the ranges we mentioned earlier. We take q0 = 9000, b = 0.9 and d0 = 0.004. We first obtain the data from the Hyperbolic Decline model with those 3 assigned parameters over 1000 days. Then we generate 1000 random samples from asymmetric


Figure 4.1: Example 6 Data Plot

Laplace distribution, with the mean of each sample set to the corresponding value from the Hyperbolic Decline model and a variance of 200, obtained by setting $\sigma$ equal to 5. Now we have created a data set containing 1000 observations of daily production rate and their corresponding time points, where the production rate comes with a systematic error that follows an asymmetric Laplace distribution. Then we trim the sample size from 1000 to 30 by keeping one observation every 34 days. Finally, we add 3 outliers to the upper tail (Figure 4.1).
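A sketch of this simulation in R. The asymmetric Laplace noise is generated via the standard difference-of-exponentials representation, $X = \mu + \sigma(U/p - V/(1-p))$ with $U, V$ i.i.d. Exp(1); this is one of several equivalent ways to sample the distribution.

rald <- function(n, mu = 0, sigma = 1, p = 0.5) {
  # difference of two scaled exponentials gives an ALD(mu, sigma, p) variate
  mu + sigma * (rexp(n) / p - rexp(n) / (1 - p))
}

set.seed(4)
days  <- 1:1000
q     <- 9000 / (1 + 0.9 * 0.004 * days)^(1 / 0.9)   # true decline curve
y     <- q + rald(length(days), sigma = 5)           # variance 200 when p = 0.5
keep  <- seq(1, length(days), by = 34)               # thin to 30 observations
t_obs <- days[keep]; y_obs <- y[keep]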


4.3 Bayesian Nonlinear Mean Regression

With the implementation of the Hyperbolic Decline model, we suppose the conditional mean is given by the right-hand side of the equation, which makes the regression nonlinear. The error distribution is presumed to follow a normal distribution with variance set to 200. The posterior density we obtain (under uniform priors) is:

$$\pi(\beta|y) \propto (2\pi \cdot 200)^{-n/2}\exp\Big\{-\frac{1}{2 \cdot 200}\sum_{i=1}^{n}\big(y_i - q_i(\beta)\big)^2\Big\}$$

The model setup and sampling algorithm will not be elaborated here, because the same procedure is explained in detail when introducing Bayesian nonlinear quantile regression in the next section; the only difference is that the likelihood here comes from the normal distribution, while for quantile regression it comes from the asymmetric Laplace distribution. Figure 4.2 shows the plot of the posterior median fitted-value curve, for comparison with the results from quantile regression. The mean regression curve is shifted by the outliers, leading to a deviated fitted-value curve that fails to meet the last 9 observations.

4.4 Bayesian Nonlinear Quantile Regression

Similar to mean regression, the conditional quantile is assumed to satisfy the Hyperbolic Decline model. Unlike the former example, we cannot take advantage of the “bayesQR” package due to the nonlinearity of the quantile regression, so we program the sampler ourselves. For the simulated data, although we know the real values of the parameters, we assume no prior knowledge about them and use uniform priors for the 3 parameters. To ensure a proper posterior distribution, the parameters can only take values within certain intervals. These intervals will be specified with the uniform


Figure 4.2: Example 6 Bayesian Nonlinear Mean Regression Line


prior distributions. We now write down the model setting step by step. The conditional quantile is defined by the Hyperbolic Decline model:

$$q(\tau) = \frac{q_0}{(1 + b\,d_0\,t)^{1/b}}$$

The likelihood is formed from the asymmetric Laplace distribution:

$$L(\beta|y) = \tau^n (1 - \tau)^n \exp\Big\{-\sum_{i=1}^{n} \rho_\tau\big(y_i - q(\tau)\big)\Big\}$$

where $\beta = (q_0, b, d_0)$. Adding uniform priors for each parameter, we finally have the posterior distribution:

$$\pi(\beta|y) \propto \frac{\tau^n (1 - \tau)^n}{\sigma^n} \exp\Big\{-\sum_{i=1}^{n} \rho_\tau\Big(\frac{y_i - q(\tau)}{\sigma}\Big)\Big\}$$

with $q_0 \in (100, 15000)$, $b \in (0, 2)$ and $d_0 \in (0, 0.2)$; for now we treat $\sigma$ as a known parameter equal to 5, computed from the variance 200. In order to make inference from this posterior distribution, we need to sample from the posterior density function. As the density function does not have a known standard form, we use the Metropolis-Hastings algorithm to obtain draws from it. The algorithm essentially applies a rejection sampling method using some nice properties of Markov chains. We will not go through the rationale behind it in this paper. The algorithm can be represented as (Gelman et al., 2014):

• Assign some initial values $\theta^0$ for the 3 parameters to start the algorithm;

• Choose a proposal distribution that can be easily sampled from, say $p(x)$;

• Sample a new set of parameter values $\theta^*$ from $p(x)$ based on the starting point $\theta^0$;

• Calculate the acceptance ratio $r = \frac{\pi(\theta^*)\,p(\theta^0|\theta^*)}{\pi(\theta^0)\,p(\theta^*|\theta^0)}$;

• Generate a random number from Uniform(0, 1); if it is less than $r$, accept $\theta^*$ as the starting point for the next iteration; otherwise keep $\theta^0$ for the next iteration.

Back to our simulated data: first we need to transform the posterior density, because in R, when the exponent is too large, the output goes to infinity, breaking down the sampling algorithm. To avoid this failure, we take the natural logarithm of the posterior density when calculating the acceptance ratio, producing the reformulated equation:

$$\ln(r) = \ln\big(\pi(\theta^*)\big) + \ln\big(p(\theta^0|\theta^*)\big) - \ln\big(\pi(\theta^0)\big) - \ln\big(p(\theta^*|\theta^0)\big)$$

Consequently, we also need to take the natural logarithm of the random number generated from Uniform(0, 1) when comparing it with $r$. Note that for now we only perform a median regression, i.e. the 50th quantile regression. Next we assign some values close to the real parameter values as the starting point, $\theta^0 = (8500, 0.85, 0.0038)$. Then we choose a truncated multivariate normal distribution as our proposal density. The truncation intervals for each parameter are those defined in the uniform prior distributions earlier; this is how we implement the priors in the sampling algorithm. The covariance matrix is set to

$$\begin{pmatrix} 40^2 & 0 & 0 \\ 0 & 0.04^2 & 0 \\ 0 & 0 & 0.004^2 \end{pmatrix}$$

based on the standard deviations being 4% of each parameter's scale, by experience. Note that we leave out the last 5 data points for prediction; that means we only take 25 of the 30 data points. After running the algorithm in R with 10000 iterations, we realize that a major challenge is tuning the jump size of the proposal distribution, i.e. the

values of the standard deviations for the three parameters. Judging by the traceplots under the current standard deviation combination, even with half the samples as burn-in, none of the three chains behaves well; the acceptance rate is too low. We do not want to make inference without the chains bouncing around the convergence value. Attempting to find the best-behaving standard deviation combination, we come up with an interval that should include the most reasonable standard deviation values for each of the three parameters. Then we make 3 arithmetic sequences of length 20 based on each parameter's interval, and take all possible combinations to form a $20^3 \times 3$ dimensional grid, so that we can go through each combination by repeating the above sampling method with 10000 iterations 8000 times. If we were to check the traceplots for all 8000 combinations, it would obviously be too time consuming and inefficient. Note that an index called the acceptance rate is also widely used to determine whether an MCMC algorithm is behaving well. It is the ratio of the samples ($\theta^*$) that we have kept to the total number of samples, so a high acceptance rate at least implies the jump size is reasonable. We keep the combinations whose acceptance rate is higher than some threshold we set, say 20%, and then make further inspections such as traceplots, since only several combinations should be left. We check the traceplots of the parameters as well as the natural log density; the best-behaving combination turns out to be (8, 0.008, 0.00002). Its corresponding traceplots are given in Figure 4.3. Now it is finally time for Bayesian inference. Similar to what we have done in Bayesian linear quantile regression, we obtain the marginal histograms (Figure 4.3) and 95% credible intervals for all 3 parameters with half burn-in (20000 samples in total). We could report either the median or the mode of the posterior samples. We take the median here out of consideration for the credible intervals, so that all reported values are quantiles of the posterior samples. The summary of the posterior samples is given in Table 4.1.
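A condensed sketch of the sampler in R under this setup. For simplicity it uses an untruncated (hence symmetric) random-walk proposal and lets the uniform priors reject out-of-range draws, so the proposal densities cancel in the ratio; the truncated normal proposal used above would retain them. The function and variable names are our own.

log_post <- function(theta, t, y, tau = 0.5, sigma = 5) {
  q0 <- theta[1]; b <- theta[2]; d0 <- theta[3]
  # uniform priors: the log posterior is -Inf outside the support
  if (q0 < 100 || q0 > 15000 || b <= 0 || b > 2 || d0 <= 0 || d0 > 0.2)
    return(-Inf)
  q   <- q0 / (1 + b * d0 * t)^(1 / b)
  u   <- (y - q) / sigma
  rho <- u * (tau - (u < 0))                        # check loss
  sum(log(tau) + log(1 - tau) - log(sigma) - rho)   # ALD log likelihood
}

mh_sample <- function(t, y, n_iter = 10000,
                      theta0 = c(8500, 0.85, 0.0038),
                      step   = c(8, 0.008, 0.00002)) {   # tuned jump sizes
  draws <- matrix(NA_real_, n_iter, 3)
  theta <- theta0
  lp    <- log_post(theta, t, y)
  for (i in seq_len(n_iter)) {
    prop    <- rnorm(3, theta, step)                # random-walk proposal
    lp_prop <- log_post(prop, t, y)
    if (log(runif(1)) < lp_prop - lp) {             # accept/reject on log scale
      theta <- prop
      lp    <- lp_prop
    }
    draws[i, ] <- theta
  }
  draws
}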


Figure 4.3: Example 6 Traceplots and Histograms

Parameters   Real Value   Median        2.5% lower    97.5% upper
q0           9000         9010.045      8985.569      9057.090
b            0.9          0.8943111     0.8637123     0.9176253
d0           0.004        0.003985633   0.003906158   0.004056896

Table 4.1: Summary of Bayesian Median Regression for Simulated Data


From Table 4.1 it is easy to see that the medians are all close to the real values, which all lie inside the respective credible intervals. One very important advantage of Bayesian inference, which we have mentioned multiple times, is that it accounts for uncertainty when making predictions, a feature the conventional method will never be able to provide. It is achieved by obtaining the posterior predictive distribution. Theoretically, the posterior predictive density $\pi(\tilde{y}|y)$ is given by the convolution integral:

$$\pi(\tilde{y}|y) = \int \pi(\tilde{y}|\theta)\,\pi(\theta|y)\,d\theta$$

To apply this with the posterior samples we already have, we take the samples as the parameters of the likelihood distribution and generate $\tilde{y}$ from that distribution. Thus each sample produces its own $\tilde{y}$ at each time point. As a result, $\tilde{y}$ has a distribution, represented by $\pi(\tilde{y}|y)$, which differs from the conventional method where $\tilde{y}$ is fixed. That is to say, we can do all the inference we want on the distribution of $\tilde{y}$, such as the mode, median and credible intervals. We follow the procedure described above to make predictions for the last 5 data points and compare the results with the real data. Since the scale parameter of the asymmetric Laplace distribution is known (5), we only need to calculate the mean using the posterior samples. Then we generate from the asymmetric Laplace distribution to get 10000 samples of $\tilde{y}$ at each time point. We plot the median and 90% credible interval curves for the last 5 time points with dotted lines, to distinguish them from the posterior median and credible interval curves. As shown in Figure 4.4, quantile regression's robustness against outliers saves the day again! In terms of prediction, none of the 5 points falls outside the predictive credible interval curves. The reason the credible interval is so narrow is that our data are very nicely distributed with small variance, so the regression fits almost perfectly and has little uncertainty.


We will shortly see how this predictive credible interval can widen once dealing with real data.
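A sketch of this predictive computation in R, reusing rald() and the posterior draws from the sampler sketched above:

post_pred <- function(draws, t_new, sigma = 5, tau = 0.5) {
  # decline-curve value at t_new for every posterior draw, plus ALD noise
  q <- draws[, 1] / (1 + draws[, 2] * draws[, 3] * t_new)^(1 / draws[, 2])
  q + rald(nrow(draws), sigma = sigma, p = tau)
}

# draws <- mh_sample(t_obs[1:25], y_obs[1:25])
y_tilde <- post_pred(draws, t_new = 900)        # an illustrative future day
quantile(y_tilde, c(0.05, 0.5, 0.95))           # predictive median and 90% band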


Figure 4.4: Example 6 Posterior Predictive Curves for Median Regression

5. Bayesian Nonlinear Regressions with Real Oil Reservoir Data

The datasets we will be working on are of two oil wells from the Eagle Ford shale in Texas, with time spans of two years and three years, respectively. Before we can start any data analysis, certain data wrangling is needed to make the datasets workable for decline curve analysis. Note that these two datasets come from wells that did not have any other external interventions or controls, allowing the implementation of decline curve analysis. For now we consider only the oil production: the monthly production rate will be the response variable, with the corresponding dates as the predictive variable. Then we cut off the time points at the beginning, where the oil production rate is still increasing, because it takes time for the wells to be initialized and to reach the most efficient level for producing the oil. When the initialization is finished, the oil production rate will go down over time and follow the Hyperbolic Decline model. For the predictive variable, we convert the dates to the respective numbers of days and set the baseline at the cutoff point where the production rate starts to decline. Then we calculate the daily production rate from the monthly production rate by dividing each one by the corresponding number of days in that month. Finally, the cumulative days at each time point are calculated. The plots

for the two cleaned datasets are given in Figures 5.1 and 5.2.

Figure 5.1: Example 7
Figure 5.2: Example 8

5.1 Bayesian Model and Sampling from the Posterior

As with the simulated dataset, we use the normal and the asymmetric Laplace distributions as the likelihoods for the Bayesian mean and quantile regressions, but we now use informative priors for the three decline curve parameters instead of uniform priors. Moreover, we also use a prior for the scale parameter of the asymmetric Laplace distribution. It is simple and common to use truncated normal distributions as priors if one has some information about the likely values the parameters should take. By taking the mean of the truncated normal distribution as the potential value, we assign the highest probability to the values we believe to be the most reasonable within a restricted range.


Parameters   Well One   Well Two
µq           1600       214
σq           110        62

Table 5.1: Prior Parameters for q0

Parameters   Well One   Well Two
µb           1          1
σb           0.3        0.3

Table 5.2: Prior Parameters for b

5.1.1 Prior for q0

Since $q_0$ is by definition the production rate at the start of production, we can take the baseline production rate directly as $q_0$'s prior mean. However, since the data tend to be noisy, we round the baseline production rate to the nearest ten. We take the square root of the mean squared error (MSE) from a nonlinear mean regression as the prior standard deviation (Table 5.1).

5.1.2 Prior for b

Setting the prior mean and standard deviation is rather straightforward for parameter $b$: we simply take 1 for the mean, which is the reported average Hyperbolic Decline constant from the historical data of the Eagle Ford shale. The historical data also show that the value of $b$ fluctuates within the range $\mu_b \pm 3\sigma_b$. Hence, one third of the mean value is an acceptable choice for the standard deviation (Table 5.2).

5.1.3 Prior for d0

As we mentioned in an earlier section, the initial decline rate $d_0$ can be calculated from the effective decline factor $d = \frac{q_0 - q_1}{q_0}$ over a specific time period. In our case, we can plug in the baseline production rate for $q_0$ and the production rate at the next time point for $q_1$.


Parameters   Well One   Well Two
µd           0.007      0.008
σd           0.0024     0.0027

Table 5.3: Prior Parameters for d0

Parameters   Well One   Well Two
α            2.25       2.25
β            50         27.5

Table 5.4: Prior Parameters for σ

Note that what we get here is a monthly factor, which should still be divided by the number of days in that month to get the daily effective decline factor. Once we have $d$, we can compute $d_0 = \frac{1}{b}\big((1 - d)^{-b} - 1\big)$ and take it as the mean of the prior distribution. Then, as we did for parameter $b$, we take one third of the mean value as the standard deviation, based on the historical data (Table 5.3).
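As a worked example of this computation in R (the next-month rate q1 below is illustrative, not taken from the data):

q0 <- 1600; q1 <- 1300    # baseline and next-month rates (q1 is hypothetical)
b  <- 1                   # prior mean of b
d  <- (q0 - q1) / q0 / 30          # daily effective decline factor
d0 <- ((1 - d)^(-b) - 1) / b       # prior mean for d0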

5.1.4 Prior for σ

For the scale parameter $\sigma$ of the asymmetric Laplace distribution, it is common to take a skewed distribution as the prior, since $\sigma$ defines the variance. So we use the inverse gamma distribution, proposed by Kozumi and Kobayashi (2011), to simplify the calculation of the posterior distribution. The mean of the inverse gamma is set to the value of $\sigma$ computed from the MSE we used for $q_0$, via the definition of the variance of an asymmetric Laplace distribution:

$$\mathrm{var} = \frac{1 - 2p + 2p^2}{p^2(1-p)^2}\,\sigma^2$$

Since the prior mean is taken from an estimate of the mean regression, we do not want to be too restricted by it. Hence, we set a relatively large standard deviation of twice the prior mean, so that the posterior result will not be dominated by the prior setup (Table 5.4).
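The entries of Table 5.4 follow from this choice: for an inverse gamma IG(α, β), the mean is β/(α − 1) and the variance is β²/((α − 1)²(α − 2)), so requiring the standard deviation to equal twice the mean forces 1/(α − 2) = 4, i.e. α = 2.25 and β = 1.25 × mean. A one-line check in R, using the prior means implied by the table:

ig_hyper <- function(prior_mean) c(alpha = 2.25, beta = 1.25 * prior_mean)
ig_hyper(40)    # well one: beta = 50
ig_hyper(22)    # well two: beta = 27.5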

5.1.5 The Advantage of “rstan”

Instead of programming our own algorithms for sampling from the posterior distribution, we use the powerful R package “rstan” to deal with the posterior distribution, as


we find it more efficient than having to go through the “tuning” step required by our own code. However, there is no built-in function for the asymmetric Laplace distribution in Stan. We handle the problem by coding the exact log density of the asymmetric Laplace distribution as a user-defined function.
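A sketch of that user-defined function, with the Stan program embedded as a string in R; the data block and the priors are abbreviated, so this outlines the approach rather than reproducing the exact model used here:

library(rstan)

stan_code <- "
functions {
  // log density of the asymmetric Laplace distribution
  real ald_lpdf(real y, real mu, real sigma, real p) {
    real u = (y - mu) / sigma;
    return log(p) + log1m(p) - log(sigma) - u * (p - (u < 0));
  }
}
data {
  int<lower=1> n;
  vector[n] t;
  vector[n] y;
  real<lower=0, upper=1> p;   // quantile level tau
}
parameters {
  real<lower=0> q0;
  real<lower=0> b;
  real<lower=0> d0;
  real<lower=0> sigma;
}
model {
  // the truncated normal priors for q0, b, d0 and the inverse gamma
  // prior for sigma (Tables 5.1-5.4) are omitted in this sketch
  for (i in 1:n)
    y[i] ~ ald(q0 / (1 + b * d0 * t[i])^(1 / b), sigma, p);
}
"
# t, y: cumulative days and daily rates after the wrangling above (names ours)
# fit <- stan(model_code = stan_code, iter = 4000, chains = 4,
#             data = list(n = length(t), t = t, y = y, p = 0.5))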

5.2 Data Analysis for the First Well

We perform Bayesian mean and quantile regressions with the same prior distributions for the decline curve parameters, plus one more prior in the quantile regressions for the $\sigma$ of the asymmetric Laplace distribution. The quantile regressions include the 10th quantile, the median and the 90th quantile. The data are split into two parts: the first 10 of the 15 observations are used for regression, and the remaining 5 points are for verifying the prediction.

By looking at the histograms of the marginal posterior distribution for each pa- rameter, we make sure no one has multiple modes, i.e. no concern of mistakenly reporting some local optima when using “maximum a posteriori” (MAP) estimate (Figure 5.7-5.10). To compare mean regression and quantile regression, we plot the fitted curves of the mean regression and the median regression, by taking the mean of the posterior samples as the parameter values to compute the production rate as fitted values. Then we simply plot the fitted values against the time variable in the dataset (Figure 5.11). Note that the plot is in log scale and we will be using log scale for the rest of the analysis until the posterior predictive plots for the second well.


Figure 5.3: Example 7-a Traceplots of Mean Regression


Figure 5.4: Example 7-b Traceplots of Median Regression


Figure 5.5: Example 7-c Traceplots of 10th Quantile Regression


Figure 5.6: Example 7-d Traceplots of 90th Quantile Regression


Figure 5.7: Example 7-e Histograms of Mean Regression


Figure 5.8: Example 7-f Histograms of Median Regression


Figure 5.9: Example 7-g Histograms of 10th Quantile Regression


Figure 5.10: Example 7-h Histograms of 90th Quantile Regression


Figure 5.11: Example 7-i Fitted Curves


     P90     P50     P10
q0   1555    1603    1619
b    0.92    0.81    0.74
d0   0.012   0.01    0.009
σ    7.28    17.12   4.95

Table 5.5: Median of P-Curves for Well 1

Since the first well's data are relatively nicely distributed, without any outliers, the mean and median regressions almost overlap. But we will still focus on the inference from the quantile regressions, because the 10th and 90th quantiles are of interest in the oil industry. A graph of the 10th, 50th, and 90th quantile fitted curves is normally required when reporting the analysis results of an oil well, even though the fitted curves do not really play an important role in Bayesian inference, because they do not take account of the uncertainty of the parameters; people are interested in seeing the fitted curves just to have an idea of the well behavior. We provide the fitted quantile regression curves by taking the median of the posterior samples instead of the mean (Figure 5.12), together with the parameters' corresponding values (Table 5.5). The more conservative P90 curve (the 10th quantile curve) shows that only one point is not in the area above the curve, while the area above the P10 curve (the 90th quantile curve) contains relatively more points (three). Four points are outside the interval in total. Now we come to the main part of the Bayesian inference, the posterior predictive distribution. We compute the production rate without error by plugging in the posterior decline curve parameter samples of every iteration (2000 in total). Then we take each computed production rate as the mean of the asymmetric Laplace distribution, with the corresponding posterior $\sigma$, to generate 2000 random numbers. We follow the same procedure for every time point, so that at each time point we have a set of posterior predictive samples. Finally, we take the median of those


Figure 5.12: Example 7-j P10 P50 P90 Curves

samples to get the posterior predictive curve, as well as the 5th and 95th quantiles to get the posterior predictive credible interval. We repeat the above for all 3 quantile regressions and plot the posterior predictive curves. Figure 5.13 only shows the plot for the median regression. The posterior predictive credible interval of the median regression is clearly wider than the interval formed by the P90 and P10 curves. The interval not only includes all the data points we used for regression, but also gives the prediction for the last 5 data points, which successfully contains the real values.

5.3 Data Analysis for the Second Well

The second dataset has more observations (23) left after the data cleaning. The first 15 of them are used for the regressions and the remaining 8 for checking the prediction. We follow the same procedure as for the first well. We do not include the traceplots and histograms this time, since they are similar to those of the first well. Again, even though some of the chains of the quantile regressions do not behave as well as for the first well, no obvious convergence issue is spotted, and none of the marginal posterior distributions has multiple modes either. Since this well's dataset is much noisier than the first well's, the fitted mean regression curve and quantile regression curve have different trends (Figure 5.14). The mean regression curve is pushed downward by the relatively abnormal observations at the beginning, and hence fails to fit the later points, which actually look more like a decline curve trend. On the other hand, the “outliers” at the beginning did not have much effect on the quantile regression, so the median regression curve captures the later points better. As a routine, we provide the P90, P50 and P10 plot (Figure 5.15) as well as the parameter values (Table 5.6).


Figure 5.13: Example 7-k Posterior Predictive Curves of Median Regression

     P90      P50      P10
q0   107.03   175.6    223.42
b    1.23     1.34     1.4
d0   0.007    0.006    0.005
σ    4        12.57    4.39

Table 5.6: Median of P-Curves for Well 2


Figure 5.14: Example 8-a Well Two Fitted Curves Compared


Figure 5.15: Example 8-b Well Two P90 P50 P10 Curves


Instead of plotting the posterior predictive curves on the log scale, we use the original scale here, because some of the low-quantile posterior predictive samples turn out to be negative and are set to zero to be more realistic. Granted that a value of zero is still unlikely, we keep them anyway so that a complete curve can be presented. Figure 5.16 shows the posterior predictive curves for the median regression. Taking uncertainty into account leads to a relatively wide credible interval that covers all the data points. Whether it is too wide is still debatable, but the fact that it captures all the data variation gives the edge to the Bayesian inference. After all, we can only talk about precision once there is accuracy.


Figure 5.16: Example 8-c Well Two Posterior Predictive Curves

6. Discussion

In most cases of industry data analysis, people are familiar with conventional mean regression and thus apply it to many different problems. However, it often happens that the normality prerequisite, the most commonly used assumption for mean regression, is not met and yet a linear regression is still used anyway. The results, which are supposed to be treated with a lot of caution, are often taken directly without further analysis and sometimes lead to unreliable conclusions. Moreover, in reality it is rare to have a nicely distributed data set; it either has complicated noise or many close-by outliers. Obstinately applying a mean regression will create difficulties for model fitting and results interpretation. For instance, misspecification of the error distribution will make mean regression meaningless; outliers will change the regression results drastically if not handled carefully; potential effects at the tails are hard to notice with symmetrically distributed data, etc. More and more criticisms and negative comments have been made against the abusive use of mean regression, and a call for a more cautious attitude toward data analysis has caught people's attention. Through real data examples, we showed that quantile regression tends to deal better with the above problems, which are very common in real data. Quantile regression's robustness against outliers and heterogeneous distributions makes the results more reliable. Furthermore, in the oil industry, quantile regression should be disseminated, since certain quantiles are of interest yet most people are applying mean

regression. Even though Bayesian methodology has gained more and more attention and popularity due to the development of computer science and technology, the idea of combining it with quantile regression is still at an early stage, where only methods for Bayesian linear quantile regression have been studied. However, we find the Bayesian method has the most potential in nonlinear cases, as most nonlinear models are complicated and hard to solve with the conventional least squares approach: sometimes the optimization does not even converge, or it gives values that are against our prior beliefs. This kind of mistake can easily be avoided through the prior distributions of the Bayesian method, and sampling methods like Markov chain Monte Carlo will converge with an appropriate setup. More importantly, by giving the prediction a distribution rather than a single smooth curve, the results are more reasonable and realistic.

7. Future Work

In recent years, Gong et al. (2011) have done Bayesian mean regression with natural gas well data using decline curve analysis. Yet no one has done Bayesian quantile regression, which is actually what people are supposed to report to the SEC under the oil industry regulations. In future research, we would also like to incorporate more of the information from nearby wells that behave similarly to the ones being analyzed into the prior distribution, to make well-founded predictions not only for the behavior of the existing wells, but also for future wells. Moreover, a simultaneous quantile regression framework has been established to estimate multiple quantiles at the same time. Tokdar et al. (2012) made this possible with a Bayesian approach, but only in the linear case. We would like to extend it to the nonlinear case, where the Hyperbolic Decline model and other nonlinear models can be applied.

Bibliography

Arps, J. J. (1945). Analysis of decline curves. Transactions of the AIME 160, 228–247.

Barro, R. J. and Lee, J.-W. (1994). Sources of economic growth. In Carnegie-Rochester Conference Series on Public Policy, vol. 40, pp. 1–46. Elsevier.

Buchinsky, M. (1994). Changes in the US wage structure 1963–1987: Application of quantile regression. Econometrica 62, 405–458.

Chernozhukov, V. and Hansen, C. (2004). The effects of 401(k) participation on the wealth distribution: an instrumental quantile regression analysis. Review of Economics and Statistics 86, 735–751.

Fetkovich, M., Fetkovich, E. and Fetkovich, M. (1996). Useful concepts for decline curve forecasting, reserve estimation, and analysis. SPE Reservoir Engineering 11, 13–22.

Gelman, A., Carlin, J. B., Stern, H. S. and Rubin, D. B. (2014). Bayesian Data Analysis, vol. 2. Chapman & Hall/CRC, Boca Raton, FL, USA.

Gong, X., Gonzalez, R. A., McVay, D. and Hart, J. D. (2011). Bayesian Probabilistic Decline Curve Analysis Quantifies Shale Gas Reserves Uncertainty. In Canadian Unconventional Resources Conference. Society of Petroleum Engineers.

Koenker, R. and Bassett Jr, G. (1978). Regression quantiles. Econometrica 46, 33–50.


Koenker, R. and Machado, J. A. (1999). Goodness of fit and related inference processes for quantile regression. Journal of the American Statistical Association 94, 1296–1310.

Kottas, A. and Gelfand, A. E. (2001). Bayesian semiparametric median regression modeling. Journal of the American Statistical Association 96, 1458–1468.

Kottas, A. and Krnjajić, M. (2009). Bayesian semiparametric modelling in quantile regression. Scandinavian Journal of Statistics 36, 297–319.

Kuan, C.-M. (2007). An introduction to quantile regression.

Powell, J. L. (1986). Censored regression quantiles. Journal of Econometrics 32, 143–155.

Tokdar, S. T. and Kadane, J. B. (2012). Simultaneous linear quantile regression: a semiparametric Bayesian approach. Bayesian Analysis 7, 51–72.

Yu, K. and Moyeed, R. A. (2001). Bayesian quantile regression. Statistics & Probability Letters 54, 437–447.
