Factorization of flow profile data in production and injection wells based on Bayesian modeling*

Konstantin Sidelnikov1 and Rinat Faizullin2

1 Department of Reservoir Energy State Control, Izhevsk Petroleum Scientific Center, Izhevsk, Russia 2 Department of Management, Kalashnikov Izhevsk State Technical University, Izhevsk, Rus- sia [email protected]

Abstract. This problem is ill-posed, and the process of adding information in order to solve it is needed (regularization). Several methods of deterministic

regularization based on  2 -norm minimization of the unknown vector are con- sidered. It is shown that such approaches do not take into account the petro- physical properties of the reservoir and cannot cover all possible factorization combinations. Bayesian regularization is proposed to factorize the flow profile data. According to this method, all relative factors are defined by the corre- sponding probability distribution functions. Core studies are used to determine the joint probability distribution of rock permeability and porosity. Layer pro- ductivity ratio distributions are calculated separately for each well based on its log interpretation data. Bayesian statistical inference is used to obtain the gen- eral drawdown ratio distribution for the entire field. This approach was tested on real data obtained from three fields.

Keywords: Regularization, Bayes’ theorem, statistical inference, well test, flow profile, productivity index, injectivity index

1 Introduction

In the process of analyzing the oil field development, it is required to solve the prob- lems of determining the injected water front, assessing the oil recovery factor sepa- rately for the producing formations, etc. In order to distribute the cumulative oil pro- duction and the volume of injected water between the producing formations, it is nec- essary to have the values of the oil and water flow rate for each formation separately [1-7]. Investigation of the flow profiles in formations allows obtaining the distribution of the produced and injected fluid over the entire cross-section of the pay zone. As a result, the dependence of the amount of produced or injected fluids on the depth of the completed interval is established. To obtain flow profiles over the formation thick-

* Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribu- tion 4.0 International (CC BY 4.0). ness, the results of measurements by flow meters are usually used, as well as ther- mometry data in the wellbore [8-15]. Bayesian methods are widely used in the practice of developing hydrocarbon fields: processing the results of well tests [5-6; 11] and well logging [9; 12-13], in reservoir simulation [7-8; 10], in statistical prediction of oil-field performance [3-4].

2 Materials and methods

Formulation of the problem. The paper discusses the results of determining the flow profile for a two-layered res- ervoir (Figure 1). It is also assumed that the inflow data from the investigated well interval contain the value of the total oil and water production rate without exact in- formation about fractional flow of each fluid.

Fig. 1. Well inflow model.

Thus, as a result of the production log, the following flow rate ratio for 2 produc- tion layers becomes known: Q q  1 , (1) Q2 Where:

Qi JPP ii wf   JPi ii ,  1, 2 (2) Ji – productivity of i -th layer; Pi – drawdown for the i-th layer in the case of production ("+" sign) or injection ("–" sign), respectively. The problem is to factorize (1) resulted in decomposition of the known q flow rate ratio into the product of the following unknown ratios: q j  p (3)

Where j JJ1 2 – productivity / injectivity ratio; p PP1  2 – drawdown ratio. Possible interpretations of nonuniform recovery of reserves based on factorization (3) are: ─ For relatively homogeneous reservoirs in terms of productivity, it can be approxi- mately assumed that j  1 ; therefore, the observed flow profile will be determined mainly by the ratio p, which will depend on the initial reserve and dynamics of the reservoir energy consumption in relation to each reservoir; ─ In the case of hydrodynamic equilibrium between the reservoirs and sufficiently fast process of repressuring between them, it can be assumed that p 1; therefore the main factor of the flow profile non-uniformity will be the heterogeneity of the reservoir properties, expressed as a ratio j.

There may also be other possible values of the productivity and drawdown ratios, reflecting more complex hydrodynamic processes of multilayer reservoir develop- ment. Obviously, under such conditions, factorization problem (3) has no solution with- out additional information. In particular, if it is possible to directly calculate the pro- ductivity Ji of each interval, then with the known q, it is possible to obtain the draw- down for each reservoir. However, direct calculation of Ji requires knowing the transmissibility of each layer, the current inflow regime of the well, the reservoir ge- ometry, etc. [15]. The absence or incompleteness of such information generally pre- cludes this approach. It is known that, to control the development of oil and gas fields, in addition to the production log, well testing is carried out as well. The values of the productivity index and reservoir pressure obtained from the results of these studies, as a rule, are integral in nature and determine the properties of the multilayered system only as a whole. Thus, on the basis of well testing, only the total fluid flow rate (injection) can be fac- tored:

QJ  P (4)

Where Q Q1  Q 2 , J J1  J 2 and weighted average drawdown: J J P1  P 2  P (5) J1 J 2 Obviously, however, for an unambiguous factorization (3) one well-known expres- sion in the form (4) is not enough. Moreover, this problem can be attributed to an ill- posed problem, the solution to which requires some additional constraints to its condi- tions (regularization) [16].

Deterministic regularization. When solving factorization problem (3) on the basis of (4), in fact, there is a vector of four unknowns: T u JJ1 2  P 1 P 2  (6) And a system of three equations:

Q1 J 1  P 1  Q2 J 2  P 2 (7)  J J1  J 2 The system of equations (7) has many solutions; therefore, one of the methods for obtaining a unique solution is to impose an additional constraint on the norm of the vector space [16]. For example, consider the minimization of the  2 -norm of the vector (6):

2 u uuT  min (8) 2 Subject to the fulfillment of (7). Problem (8) formally belongs to the class of nonlinear programming problems [17] due to additional conditions (7), some of which are nonlinear functions with respect to variables. Therefore, a special solution method is required, suitable for constrained optimization problems [17]. Conditions (7) can be linearized by replacing the variables for drawdown as fol- lows: T x JJ1 21  P 1 1 P 2  (9) As a result, system (7) can be represented as: Ax b (10) Where:

1 1 0 0 J A1 0Q 0  , b  0 1  (11) 0 1 0Q2  0 Further, there are two essentially equivalent ways of solving the problem:

x2  xT x  min 2 (12) Method 1 (optimization methods). On the one hand, problem (12) with constraints (11) is a quadratic programming problem [17]. Due to the fact that all constraints are equalities (there are no inequalities), it can be reduced to solving a system of linear equations: I AT  x  0     (13) A 0  λ  b Where I is the identity matrix, the elements of the main diagonal of which are equal to one; λ is the vector of Lagrange multipliers that appear along with the solu- tion x. Method 2 (methods of linear algebra). On the other hand, system (10) itself be- longs to the class of underdetermined systems of linear equations; therefore, one of the ways to solve it is to obtain a system of normal equations, in which the number of equations will already be equal to the number of unknowns: AAxT Ab T (14) Unfortunately, for a given matrix A in the form (11), system (14) has no solution, since the determinant of a normal matrix ATA is 0. Nevertheless, solution (10) remains possible on the basis of special algorithms for decomposition of rectangular matrices: QR decomposition, singular value decomposi- tion (SVD), etc.

Thus, factorization problem (3) can be solved in the formulation of minimizing  2 - norm of either vector u (8) or vector x (12). Table 1 shows the results of calculations in two ways. Here Q = 50 m3 / day and J = 1 m3 / day / bar. As follows from the results in table 1, the solutions obtained in two ways differ significantly from each other, with the exception of the case q = 1. This is due, of course, to the fact that in one case the values of the drawdown are minimized, and in the other case, their reciprocal values.

Table 1. Solutions based on minimizing the  2 -norm.

Solution in the form u Solution in the form x

q J1 , J 2 , P1 , P2 , J1 , J 2 , P1 , P2 , m3/day/bar m3/day/bar bar bar m3/day/bar m3/day/bar bar bar 50/50 0.500 0.500 50.0 50.0 0.500 0.500 50.0 50.0 70/30 0.637 0.363 54.9 41.4 0.501 0.499 69.9 30.1 90/10 0.812 0.188 55.4 26.6 0.510 0.490 88.3 10.2 99/1 0.955 0.045 51.8 11.2 0.833 0.167 59.4 3.0

It is also interesting to note that for the most frequently encountered in practice range of 1q  9 values, the solution in the form of x corresponds to formations that are practically homogeneous in terms of productivity. It can also be noted that for both cases, when q  1 the values of j and p obtained in two ways are always greater than one. The main problem of factorization (3) based on regularization (8) or (12) is that it is, in fact, an artificial mathematical technique that is used to mechanically solve a problem without taking into account its specifics. In addition, with this approach, there is an incomplete coverage of possible combinations of relations j and p that may take place in reality. In fact, for cases q  1 , either it will always be that j  1 or j 1, which may contradict the petrophysical concept of the distribution of reservoir properties for different formations.

Bayesian regularization. Consider the factorization problem (3) from a probabilistic point of view. We will assume that the observed values of q for different wells are some finite sample from the generally unknown true distribution of a random variable Q. Accordingly, the variables j and p are also instances of random variables J and P. From the point of Bayesian statistics, regularization corresponds to the addition of some prior distribu- tions on the required parameters, i.e. a method for calculating the distribution density functions f j and f p is required. J P Neglecting the influence of different mobility of fluids flowing in or out of forma- tions, different degrees of wellbore damage (skin factor), etc., we will assume that: k h  n n 1 j  n (15) k h m m 2 m

Where k h is the product of the absolute permeability and the thickness of the  n n l n-th interlayer for the l-th layer. The randomness of J is due to the uncertainty related to absolute permeability val- ues. As it is known, according to the core study data, a positive correlation is ob- served between the permeability and porosity of the samples, which can be repre- sented as the density of the two-dimensional distribution of a random vector X = [K, F]T:

Xf~K, F  k ,   (16)

According to well log interpretation data, the values of the porosity are known for each interlayer. As a result, the distribution for the random values of the permeability of the n-th interlayer Kn can be interpreted as the conditional distribution density K at

F  n obtained on the basis of (16):

fK, F k,  n  Kn~ fK F  k  n   (17) fF n 

Thus, for each w-th well, using (15) and (17), the distribution density fJ j φ w  is calculated, where φw is the vector of interlayer porosity values. The use of different distribution functions f j for different wells naturally takes J into account the chaotic change in the reservoir properties of oil formations from one zone (well) to another [14]. However, assuming the continuity of the spatial change in reservoir pressure, its distribution across reservoirs is mainly determined by produc- tion mechanism of a reservoir and well operation conditions that have been estab- lished at the current stage of development. Thus, the main task is to statistically derive a single distribution function fP  p for the entire field. The calculation is attended with certain difficulties. On the one hand, it is possible to use well test data for wells that completed in only one layer. However, as a rule, the amount of such information is not enough to construct a sample distribution function. Instead, we use Bayesian inference to derive the posterior distribution function for P by combining all available observations qw and known functions fJ j φ w  , where w is the well number. According to Bayes' theorem, the posterior distribution of P with respect to the available data q can be calculated as: f q p f p QP   P   fPQ  p q   (18) fQ  q 

Where fQP  q p is the likelihood of the data q at P = p; fP  p is prior distribu- tion P; fQ  q is the marginal likelihood of the data Q. The formula for a function f q is not so important, since it plays the role of a Q simple normalization factor [1; 2]. An uninformative distribution of a random variable P whose values belong to an interval of finite length is used as a prior distribution. In this case the probability density of fP  p will be constant throughout this interval.

The likelihood of observations qw can be calculated as:

qw  fQP  qpw   f Jφ w  (19) p  Where w is the well number. Thus, we get:

qw  fpPQ q    fJφ w  (20) w p 

Where q is the vector of observations qw . Using a different likelihood function for each observation q differs from classical Bayesian statistical inference. This approach is typical for hierarchical models in which some of the prior distributions (so-called hyperdistributions) are shared as pa- rameters of lower-level distributions [1]. In this case, the role of such a prior hy- perdistribution plays fP  p , through which information is exchanged between dif- ferent groups of observations (flow profile and reservoir properties of the wells) to obtain a more stable (reduced) estimate of the posterior distribution [1]. In addition, let us explain some details of the practical implementation of the pro- posed scheme. As it turned out, more stable estimates of the distribution parameters are obtained if we carry out a logarithmic transformation of problem (3): logq log j  log p . (21) Note also that, as a rule, the results of studies on the core are presented in the form of pairs of sample logk ,   values.

Methodology of factorizing the flow profile data. As a result of Bayesian regularization, posterior distributions j and p are calculated, which can be used as additional constraints when solving the factorization problem (3). For example, one can solve this problem in the following form: 2 2  j jMAP   p p MAP   min (22)

Subject to fulfillment of (3), where jMAP , pMAP are the maximum aposteriori esti- mates; α is weight coefficient. In fact, under condition (3), problem (22) is reduced to finding a real positive root of the quartic equation: 4 3 2 j jMAP j  qp MAP j   q 0 (23) In some cases, equation (23) can have two real positive roots. This situation arises with an equivalent contribution of the first and second terms to the total sum (22), i.e. the problem has two equivalent solutions. Further, on the basis of (4), one can calculate the individual parameters of the lay- ers: j 1 J J; J  J 1j1 2 j  1 (24) p j 1  j 1 P  PP;  P 1q1 2 q  1

3 Results

Examples of real fields. Let us consider the application of Bayesian regularization for the factorization prob- lem (3) on the example of three fields. At each of them, productive formation consists of two conditionally distinguished layers, designated as C2vr (upper layer) and C2b (lower layer). For some wells of the fields, a production logging tests were carried out to determine the flow profile. If there were several studies for the same well, then only the latest flow profile results were used. In total, 15, 87 and 65 flow profile results were used for fields No. 1, 2, and 3. Fig- ure 2 shows the distributions for all three fields, obtained using kernel density estima- tion (KDE) method. The bandwidth was 3 for q and 0.2 for log q . All distributions also have 0.25, 0.5 and 0.75-quantiles. As it can be seen, among all fields, the distribution shape log q for field No. 3 is closest to the normal curve, and its center (median) is located above zero. At the same time, field No. 2 demonstrates a two-modal distribution type log q , i.e. perhaps there are two groups of wells, for one of which the most probable values of log q will be less than zero, and for the second, on the contrary, they will be above zero. Regarding field no. 1, the only thing that can be said is that for about 75% of the samples there is logq  0.5 . In addition, for all field the percentage of observations, for which logq  0 , is higher than 60%. The results of the core study to determine the open porosity and absolute perme- ability of core samples for various formations of the fields are shown in figure 3. Dots denote pairs of values logk ,   , which are assumed to have a joint normal distribu- tion. Thus, according to (16) we use: log K    ~ Nμ , Σ  (25)   Where μ is the vector of mean values for log k and φ; Σ is covariance matrix 2×2;

Nμ , Σ is the probability density function of the bivariate normal distribution. To determine the parameters of the normal distribution (25), a Bayesian inference procedure similar to (18) was used. The following prior distributions of parameters were used:

  Nˆlog k ,    μ ~   Nˆ ,       Exp   (26) σ ~   Exp   C ~ LJK  

Where ˆ , ˆ are sample means for log k and φ, respectively; Exp  is the log k    probability density function of the exponential distribution with a parameter   0 ; LJK   is the probability density function of the LKJ-distribution with a parameter

  1 ; C is the correlation matrix 2×2 for which Σ DCD  ; σ is the vector of - dard deviations for log k and φ; D is the diagonal matrix with values σ on the main  diagonal. A reasonable choice is    1 ,   1 and   2 [18].

a) field number 1

b) field number 2

c) field number 3 Fig. 2. Distributions of q for fields.

a) field number 1

b) field number 2

c) field number 3 Fig. 3. Distributions of logk ,   of core samples for fields. Figure 3 also shows the values of the coordinates of the distribution center μ MAP and the correlation coefficient  corresponding to the maximum of the posterior MAP distributions μ and C. For fields No. 1 and 2, core samples from the C2vr formation have, on average, higher porosity and permeability than for the C2b formation. In turn, for field No. 3, the core samples of the C2vr formation, although on average have a higher porosity, their average permeability is lower than for the C2b formation. It is interesting to note that for all fields, the correlation coefficient between porosity and permeability is highest for the C2b reservoir.

a) field number 1

b) field number 2

c) field number 3 Fig. 4. Distributions of log j for fields. Then, for each well, the following sequence of actions was implemented: ─ According to the data on perforations for each layer, a group of interlayers com- pleted on the date of the flow profile was formed; ─ According to well log interpretation data, for each interlayer, the value of porosity was taken and the distribution log k was calculated based on (25) and (17);

─ According to well log interpretation data, for each interlayer, distribution kh was calculated based on the obtained distribution log k and interlayer’s thickness; ─ For each layer, the distribution of the sum of all interlayer’s kh obtained at the previous step was calculated; ─ The distribution of log j was calculated based on (15).

Figure 4 shows the results of calculating the distribution of log j separately for each well, as well as the averaged distribution curve for the entire field. The plots also show the value of the mode of the averaged distribution of log j . Thus, for fields No. 1 and No. 2, the productivity of the C2vr formation is on average about 10 and 4 times higher than the productivity of the C2b formation, respectively. At the same time, for field No. 3, the productivity of C2vr is on average approximately 2 times less than the productivity of the C2b formation. For all fields, the maximum value of the distribution mode of log j does not exceed 2, while the minimum value of the distribution mode of log j is higher than -2.

4 Discussion

As a result of Bayesian inference base on (18), the posterior distributions of log p for all the fields are obtained and are shown in figure 5. The mean value and highest den- sity interval (HDI) of 94% are also indicated there. Thus, for fields No. 1 and No. 2

pMAP  1, i.e. the drawdown for the C2vr formation is on average almost 2.5 times lower than the drawdown for the C2b formation. At the same time, it is the other way around for deposit No. 3, for which pMAP  1, i.e. it is possible that even at a lower productivity of the C2vr formation, the reservoir pressure in it is higher than in the C2b formation. As a result of which, a greater inflow from / injection into C2vr is observed. It is also interesting to note that the range of the posterior distribution of log p is the highest for field No. 1, which is possibly due to the small volume of ob- servations of log q . Figure 6 shows the results of testing the convergence of the procedure for Bayesian inference of the posterior distribution. Trace diagrams on the right in Fig. 6 looks like white noise, which indicates a good mixing of the statistical inference engine. In this case, the NUTS sampler was used, which generated four parallel chains of independ- ent samples from the posterior distribution. The size of each sample was 5000 ele- ments. In addition to them, 2000 elements in the chain were intended for automatic tuning of the sampling algorithm. 5 Conclusion

The problem of factorization of the results of flow profile of a two-layered reservoir using well test data is formulated. It is shown that this problem is ill-posed, and addi- tional restrictions on its conditions (regularization) are required. Several methods of deterministic regularization based on the minimization of the

 2 -norm of the vector of unknowns are described. It is shown that such approaches do not take into account the petrophysical properties of formations and have limita- tions on the coverage of possible factorization solutions.

a) field number 1

b) field number 2

с) field number 3 Fig. 5. Posterior distributions of log p for fields.

a) field number 1

b) field number 2

c) field number 3 Fig. 6. Convergence results of the Bayesian inference procedure for fields.

A Bayesian regularization is proposed for factorizing the results of the flow profile. According to it, for all relative factors the corresponding probability distribution func- tions are formed. For this, data from core studies are used to determine the permeabil- ity and porosity of the samples. Calculation of the distributions of the ratio of reser- voir productivity is carried out separately for each well, taking into account the well log interpretation data. Bayesian statistical inference is used to obtain the general distribution of the drawdown ratio for different reservoirs. This approach was tested on the example of three real fields. All calculations were carried out in Python using the PyMC3 probabilistic programming library. Visualiza- tion and exploratory analysis of the results were carried out using the and ArviZ libraries.

References

1. Martin O.: Bayesian Analysis with Python: Introduction to Statistical Modeling and Prob- abilistic Programming Using PyMC3 and ArviZ 2nd Edition. Birmingham-Mumbai: Packt (2018). 2. Downey A.B.: Think Bayes. Sebastopol: O’Reilly Media (2013). 3. Korde A.A.: Probabilistic Decline Curve Analysis in Unconventional Reservoirs Using Bayesian and Approximate Bayesian Inference Thesis. Fairbanks: University of Alaska (2019). 4. Li L.: Bayesian Statistical Inference Applied to Reservoir Modelling and Earthquake Scal- ing Thesis (Ph.D.). Edinburgh: University of Edinburgh (2006). 5. Booth ., Morton K., Onur M., Kuchuk F.: Grid-based Inversion of Pressure Transient Test Data with Stochastic Gradient Techniques Int. J. for Uncertainty Quantification, 2, 4 323–339 (2012). 6. Christen A., Sanso B., Santana Cibrian M., Velasco-Hernandez J.: 2015 Bayesian Decon- volution of Oil Well Test Data Using Gaussian Processes J. Appl. Statistics, 43, 422 (2016). 7. Eltahan E., Ganjdanesh R., Yu W., Sepehrnoori K.: Assisted History Matching Using Bayesian Inference: Application to Multiwell Simulation of a Huff ’n’ Puff Pilot Test in the Permian Basin Proc. of Unconventional Resources Technology Conf. (Austin) (SPE) (2020). 8. Darwis S., Fitriyati N., Gunawan A.Y., Marwati R.: 2012 Bayesian Reservoir Simulation Int. Conf. on Statistics in Science, Business and Engineering. Langkawi, IEEE, 1–5. (2012). 9. Masoudi P., Asgarinezhad Y., Tokhmechi B.: Feature Selection for Reservoir Characteri- sation by Bayesian Network Arabian J. of Geosciences, 8(5), 3031–43 (2015). 10. Dou Z.: Bayesian Global Optimization Approach to the Oil Well Placement Problem with Quantified Uncertainties Thesis (M.S.). West Lafayette: Purdue University (2015). 11. Anraku T.: Discrimination Between Reservoir Models in Well Test Analysis Thesis (Ph.D.). Stanford: Stanford University (1993). 12. D’Windt A.: Reservoir Zonation and Permeability Estimation: A Bayesian Approach Proc. of 48th Annual Logging Symp. Society of Petrophysicists and Well-Log Analysts (2007). 13. Loures L.L.: Porosity Bayesian Inference from Multiple Well-Log Data. CREWES Re- search Report, 14 (2002). 14. Lysenko V.D.: Oilfield Development: Theory and Practice. Moscow: Nedra, 367 (1996). 15. Ipatov A.I., Kremenetsky M.I.: Geophysical and hydrodynamic control of the development of hydrocarbon deposits. Moscow: Izhevsk: Institute of Computer Research, 780 (2005). 16. Numaier A.: Solving Ill-conditioned and Singular Linear Systems: A Tutorial on Regulari- zation. SIAM Review, 40(3), 636–66 (1998). 17. Avriel M.: Nonlinear Programming: Analysis and Methods. Mineola: Dover Publishing (2003). 18. LKJ Cholesky Covariance Priors for Multivariate Normal Models https://docs.pymc.io/notebooks/LKJ.html, last accessed 2020/10/29.