Bayesian Non-Linear Quantile Regression with Application in Decline Curve Analysis for Petroleum Reservoirs

BAYESIAN NON-LINEAR QUANTILE REGRESSION WITH APPLICATION IN DECLINE CURVE ANALYSIS FOR PETROLEUM RESERVOIRS. by YOUJUN LI Submitted in partial fulfillment of the requirements for the degree of Master of Science Department of Mathematics, Applied Mathematics and Statistics CASE WESTERN RESERVE UNIVERSITY May, 2017 CASE WESTERN RESERVE UNIVERSITY SCHOOL OF GRADUATE STUDIES We hereby approve the dissertation of Youjun Li candidate for the degree of Master of Science*. Committee Chair Dr. Anirban Mondal Committee Member Dr. Jenny Brynjarsdottir Committee Member Dr. Wojbor Woyczynski Date of Defense March 31, 2017 *We also certify that written approval has been obtained for any proprietary material contained therein. Contents List of Tables iii List of Figures iv Acknowledgments vi Abstract vii 1 Introduction 1 2 Mean Regression And Quantile Regression Compared (Linear Case) 5 2.1 Mean Regression . .5 2.2 Quantile Regression . .6 2.3 Comparison with Examples . .8 2.3.1 The Household Income Dataset . .8 2.3.2 Simulated Data with Outliers . 12 2.3.3 Econometric Growth Dataset . 14 3 Bayesian Linear Mean and Quantile Regressions 19 3.1 Bayesian Linear Mean Regression . 19 3.2 Bayesian Linear Quantile Regression . 19 3.3 Asymmetric Laplace As Error Distribution . 20 3.4 Bayesian Quantile Regression Demonstrated Using a Real Dataset . 22 i CONTENTS CONTENTS 4 Bayesian Nonlinear Regression with Simulated Oil Reservoir Data 26 4.1 A Brief Description of Decline Curve Analysis . 26 4.2 The Simulated Data with Asymmetric Laplace Error . 27 4.3 Bayesian Nonlinear Mean Regression . 29 4.4 Bayesian Nonlinear Quantile Regression . 29 5 Bayesian Nonlinear Regressions with Real Oil Reservoir Data 38 5.1 Bayesian Model and Sampling from the Posterior . 39 5.1.1 Prior for q0 ............................ 40 5.1.2 Prior for b ............................. 40 5.1.3 Prior for d0 ............................ 40 5.1.4 Prior for σ ............................. 41 5.1.5 The Advantage of \rstan" . 41 5.2 Data Analysis for the First Well . 42 5.3 Data Analysis for the Second Well . 54 6 Discussion 60 7 Future Work 62 8 Bibliography 64 ii List of Tables 3.1 Bayesian Summary of \Prostate Cancer" Dataset . 23 4.1 Summary of Bayesian Median Regression for Simulated Data . 34 5.1 Prior Parameters for q0 .......................... 40 5.2 Prior Parameters for b .......................... 40 5.3 Prior Parameters for d0 .......................... 41 5.4 Prior Parameters for σ .......................... 41 5.5 Median of P-Curvs for Well 1 . 52 5.6 Median of P-Curvs for Well 2 . 55 iii List of Figures 2.1 Graph of the Loss Function ........................7 2.2 Example 1-a Household Income Dataset ..................9 2.3 Example 1-b Household Income Dataset with Mean Regression ...... 10 2.4 Example 1-c Household Income Dataset with Five Quantile Regressions .. 11 2.5 Example 2 Mean and Quantile Regression Against Outliers ........ 13 2.6 Example 3-a GDP and Female Secondary Education ........... 15 2.7 Example 3-b Slopes of Quantile Regressions ................ 16 2.8 Example 3-c All Quantile Regression Lines ................. 17 3.1 Example 4-a Simulated Data with Normal Error .............. 21 3.2 Example 4-b Residual Plots ........................ 22 3.3 Example 5 Traceplot for \Age" ....................... 24 4.1 Example 6 Data Plot ............................ 28 4.2 Example 6 Bayesian Nonlinear Mean Regression Line ........... 30 4.3 Example 6 Traceplots and Histograms ................... 34 4.4 Example 6 Posterior Predictive Curves for Median Regression ....... 37 5.1 Example 7 . 39 5.2 Example 8 . 39 5.3 Example 7-a Traceplots of Mean Regression ................ 43 5.4 Example 7-b Traceplots of Median Regression ............... 44 iv LIST OF FIGURES LIST OF FIGURES 5.5 Example 7-c Traceplots of 10th Quantile Regression ............ 45 5.6 Example 7-d Traceplots of 90th Quantile Regression ............ 46 5.7 Example 7-e Histograms of Mean Regression ................ 47 5.8 Example 7-f Histograms of Median Regression ............... 48 5.9 Example 7-g Histograms of 10th Quantile Regression ........... 49 5.10 Example 7-h Histograms of 90th Quantile Regression ........... 50 5.11 Example 7-i Fitted Curves ......................... 51 5.12 Example 7-j P10 P50 P90 Curves ..................... 53 5.13 Example 7-k Posterior Predictive Curves of Median Regression ...... 55 5.14 Example 8-a Well Two Fitted Curves Compared .............. 56 5.15 Example 8-b Well Two P90 P50 P10 Curves ................ 57 5.16 Example 8-c Well Two Posterior Predictive Curves ............ 59 v Acknowledgments I would like to pay special thankfulness, warmth and appreciation to the persons below who made my study successful and assisted me at every point of the thesis process to cherish my goal: My academic and thesis advisor Dr. Anirban Mondal, for his guidance and exper- tise that made the whole thing possible. My committee member and course instructor Dr. Jenny Brynjarsdottir, for her pro- fessional and kind suggestions that helped me ameliorate the writing of the thesis. My committee member Dr. Wojbor Woyczynski, whose encouragement made me proud of what I have done. My classmate Yuchen Han for reminding me to keep improving. And last but not least, my parents and family, for being supportive no matter what. I am nothing without them. vi Bayesian Non-linear Quantile Regression with Application in Decline Curve Analysis for Petroleum Reservoirs. Abstract by YOUJUN LI In decline curve analysis for hydrocarbon reservoirs, the use of quantile regression instead of the conventional mean regression would be appropriate in the context of oil industry requirement as the fitted quantile regression curves have the correct inter- pretation for the predicted reserves. However, quantiles of a mean regression result have been commonly reported. In this thesis, we consider non-linear quantile regression model where the quantiles of the conditional distribution of the production rate are expressed as some standard non-linear functions of time, under a Bayesian frame- work. The posterior distribution of the regression coefficients and other parameters is intractable mainly due to the non-linearity in the quantile regression function, hence Metropolis Hastings algorithm is used to sample from the posterior. A quantitative assessment of the uncertainty of the decline parameters and the future prediction would be provided for two real datasets. vii 1. Introduction Unlike conventional mean regression, quantile regression was comprehensively studied rather late and not widely used as mean regression. Due to its straightforward intuition, mean regression has dominated the statistical analysis in both business and industry. However, quantile regression managed to handle some difficulties that mean regression failed to overcome. Thanks to the work of Koenker and Bassett Jr (1978), quantile regression be- came an alternative tool to the conventional mean regression in statistical research. In quantile regression models the quantiles of the conditional distribution of the response variable are expressed as functions of the covariates. The quantile regression is particularly useful when the conditional distribution is heterogeneous and does not have a \standard" shape, such as an asymmetric, fat-tailed, or truncated distribution. Compared to conventional mean regression, quantile regression is more robust to outliers and misspecification of the error distribution. It also provides the interpre- tation of the relationship between different percentiles of the response variable and the predictive variables. On the other hand, Bayesian methodology has been growing rapidly in the digital era. More and more traditional statistical methods have evolved with the integra- tion of Bayesian theory. Kottas and Gelfand (2001) attempted to combine Bayesian approach with quantile regression by considering non-parametric modeling for the error distribution of median regression, a special case of quantile regression, based on 1 Part 1 either Pólya tree or Dirichlet process priors. Yu and Moyeed (2001) further discussed an error distribution modeling from the asymmetric Laplace distribution by proving that the loss function of the quantile regression is distributed as asymmetric Laplace distribution. Although Yu & Moyeed have provided a very thorough study of Bayesian quantile regression, they left the idea of using informative prior for the parameters of the asymmetric Laplace distribution open, such as a prior for the scale parameter σ. In this paper, we will also consider an informative prior for σ with a relatively large variance, so that we can bring the estimate of the error distribution's variance as a part of the posterior inference. Unlike Bayesian quantile regression being applied for linear regression models in the past, application of Bayesian quantile regression for non-linear parametric models have not been well studied in the statistics community. In this thesis, we focus on a particular industrial field, the oil industry, and propose an alternative way to well data analysis other than conventional mean regression. Due to the nature of oil exploitation, the production rate tends to decline over time. Hence the data distribution will not have a \standard" shape, which makes mean regression less adequate. More importantly, the industry regulations require certain percentiles to be reported, indicating that no other approach should be more favorable than quantile regression for this particular problem. As confusing

Bayesian Non-Linear Quantile Regression with Application in Decline Curve Analysis for Petroleum Reservoirs

QUANTILE REGRESSION for CLIMATE DATA Dilhani Marasinghe Clemson University, [email protected]

A Toolbox for Nonlinear Regression in R: the Package Nlstools

Gretl User's Guide

An Efficient Nonlinear Regression Approach for Genome-Wide

Quantreg: Quantile Regression

Model Selection, Transformations and Variance Estimation in Nonlinear Regression

Nonlinear Regression, Nonlinear Least Squares, and Nonlinear Mixed Models in R

柏際股份有限公司 Bockytech, Inc. 9F-3,No 70, Yanping S

Fitting Models to Biological Data Using Linear and Nonlinear Regression

Unconditional Quantile Regressions*

Stochastic Dominance Via Quantile Regression

Nonlinear Regression and Nonlinear Least Squares