
Journal of Hydrology 403 (2011) 66–82


journal homepage: www.elsevier.com/locate/jhydrol

Information content of slug tests for estimating hydraulic properties in realistic, high-conductivity scenarios

Michael Cardiff a,⇑, Warren Barrash a, Michael Thoma a, Bwalya Malama b

a Boise State University, Center for Geophysical Investigation of the Shallow Subsurface (CGISS), Department of Geosciences, 1910 University Drive, MS 1536, Boise, ID 83725-1536, USA
b Montana Tech of the Univ. of Montana, Dept. of Geological Engineering, 1300 West Park Street, Butte, MT 59701, USA

Article history: Received 12 August 2010; received in revised form 30 January 2011; accepted 24 March 2011; available online 2 April 2011. This manuscript was handled by P. Baveye, Editor-in-Chief.

Keywords: Slug test; Aquifer characterization; Wellbore skin; Identifiability; Kozeny–Carman

A recently developed unified model for partially-penetrating slug tests in unconfined aquifers (Malama et al., in press) provides a semi-analytical solution for aquifer response at the wellbore in the presence of inertial effects and wellbore skin, and is able to model the full range of responses from overdamped/monotonic to underdamped/oscillatory. While the model provides a unifying framework for realistically analyzing slug tests in aquifers (with the ultimate goal of determining aquifer properties such as hydraulic conductivity K and specific storage $S_s$), it is currently unclear whether parameters of this model can be well-identified without significant prior information and, thus, what degree of information content can be expected from such slug tests. In this paper, we examine the information content of slug tests in realistic field scenarios with respect to estimating aquifer properties, through analysis of both numerical experiments and field datasets. First, through numerical experiments using Markov Chain Monte Carlo methods for gauging parameter uncertainty and identifiability, we find that: (1) as noted by previous researchers, estimation of aquifer storage parameters using slug test data is highly unreliable and subject to significant uncertainty; (2) joint estimation of aquifer and skin parameters contributes to significant uncertainty in both unless prior knowledge is available; and (3) similarly, without prior information joint estimation of both aquifer radial and vertical conductivity may be unreliable. These results have significant implications for the types of information that must be collected prior to slug test analysis in order to obtain reliable aquifer parameter estimates.
For example, plausible estimates of aquifer anisotropy ratios and bounds on wellbore skin K should be obtained, if possible, a priori. Secondly, through analysis of field data, consisting of over 2500 records from partially-penetrating slug tests in a heterogeneous, highly conductive aquifer, we present some general findings that have applicability to slug testing. In particular, we find that aquifer hydraulic conductivity estimates obtained from larger slug heights tend to be lower on average (presumably due to non-linear wellbore losses) and tend to be less variable (presumably due to averaging over larger support volumes), supporting the notion that using the smallest slug heights possible to produce measurable water level changes is an important strategy when mapping aquifer heterogeneity. Finally, we present results specific to characterization of the aquifer at the Boise Hydrogeophysical Research Site. Specifically, we note that (1) K estimates obtained using a range of different slug heights give similar results, generally within ±20%; (2) correlations between estimated K profiles with depth at closely-spaced wells suggest that K values obtained from slug tests are representative of actual aquifer heterogeneity and not overly affected by near-well media disturbance (i.e., "skin"); (3) geostatistical analysis of K values obtained indicates reasonable correlation lengths for sediments of this type; and (4) overall, K values obtained do not appear to correlate well with data from previous studies.

© 2011 Elsevier B.V. All rights reserved.

1. Introduction

Slug tests have become a primary method for analyzing aquifer transmissivity due to their relative speed and simplicity as compared with more labor-intensive tests such as pumping tests or hydraulic tomography. Likewise, slug tests have proven beneficial at contaminated sites since they do not produce water during a test and thus may reduce characterization costs. In addition, partially-penetrating slug tests such as those that can be performed within packed-off intervals in a borehole are a beneficial source of information about depth variations in hydraulic conductivity, which is not generally obtainable with traditional fully-penetrating pumping tests. In order to obtain aquifer parameter estimates from slug test records, curve fitting is generally carried out using one of a variety of analytic or semi-analytic models which assume homogeneous aquifer properties within the volume interrogated by slug test measurements.

Depending on the type of aquifer being investigated (confined/unconfined), the type of slug response observed (overdamped/underdamped), the location of the test (shallow/deep), the type of slug test performed (fully-penetrating/partially-penetrating), and the existence of near-well disturbance (skin/no skin), a variety of models may be used to analyze slug test data. Relatively simpler slug-test models may be used for analysis when the response observed is non-oscillatory or "overdamped" – as is generally the case in very shallow or low-conductivity aquifers – and where vertical flow in the aquifer is deemed insignificant (Hvorslev, 1951; Cooper et al., 1967; Bouwer and Rice, 1976). In the overdamped response case where partial penetration produces significant vertical flow, more complex models may be used (e.g., Hyder et al., 1994, which also incorporates wellbore skin).

While partially-penetrating slug tests present a potentially quick and information-rich source of data for estimating depth-dependent aquifer heterogeneity, several difficulties are associated with their implementation in highly conductive aquifers. Firstly, in highly conductive aquifers, slug response may be so fast that inertial effects within the wellbore become important, resulting in oscillatory well water level responses. In order to duplicate such responses in numerical or analytical models, both the head response in the aquifer and inertial balances in the wellbore must be modeled, resulting in more computationally complex solutions (Bredehoeft et al., 1966; Van Der Kamp, 1976; Kipp, 1985; Springer and Gelhar, 1991; Hyder et al., 1994; Zlotnik and McGuire, 1998; Butler and Zhan, 2004). In the case of extremely fast-moving wellbore water, even turbulent energy loss may contribute to slug response, resulting in non-linear responses (see, e.g., McElwee and Zenner, 1998). Secondly, the fast response of highly conductive systems means that, simply due to finite measurement frequency, it may be difficult to exactly define both the initial slug height and the time at which the test began (see, e.g., Butler, 1996, 1998), both of which are required as parameters for slug test modeling. Thirdly, since most slug test setups perform both slug injection/extraction and measurement at the same well, wellbore skin effects may contribute prominently to slug response (Faust and Mercer, 1984; Hyder et al., 1994; Malama et al., in press), meaning analysis of wellbore skin parameters (such as radial extent and hydraulic conductivity $K_{sk}$ within the skin) may be crucial to determining accurate aquifer conductivities. Finally, typical models used for slug test data analysis generally assume homogeneity within the "region of influence" of the slug test, which may impact K estimates obtained depending on the scale of the slug used and the extent of heterogeneity within the aquifer being analyzed (see, e.g., analyses in Butler et al. (1994) and Beckie and Harvey (2002)).

Recently, a semi-analytical solution was developed by Malama et al. (in press) that models partially-penetrating slug test response in unconfined aquifers, in the presence of both wellbore inertial effects and wellbore skin. As such, this "unified" model is a highly flexible platform for slug test analysis and is able to reproduce the full range of possible slug responses, from overdamped/monotonic to underdamped/oscillatory. Using this unified model, it becomes possible to gain insight into the utility of slug tests in highly conductive unconfined aquifers, where fast slug response and wellbore skin may contribute significantly to uncertainty in aquifer hydraulic parameters.

In this work, we undertake a study of slug test response in highly conductive aquifers under realistic field conditions (i.e., taking into account wellbore skin as well as timing issues that may be present in field data). In the first section of this work, we perform numerical experiments with the goal of understanding how estimates of aquifer properties (and their associated uncertainty) are affected by such conditions. We utilize the above-mentioned unified model in order to determine the sensitivity of estimated aquifer parameters and their uncertainty (as determined via inversion of synthetic data) under several different types of "prior" information. Then, in the second section of this work, we analyze a large set of slug test field data from the Boise Hydrogeophysical Research Site (BHRS) using the unified model with prior information from studies at the site (Barrash et al., 2006).

⇑ Corresponding author. Tel.: +1 208 426 4678; fax: +1 208 426 3888.
E-mail addresses: [email protected] (M. Cardiff), wbarrash@boisestate.edu (W. Barrash), [email protected] (M. Thoma), bmalama@mtech.edu (B. Malama).
0022-1694/$ - see front matter © 2011 Elsevier B.V. All rights reserved. doi:10.1016/j.jhydrol.2011.03.044

2. Mathematical model

Building on the work of Hyder et al. (1994) and Butler and Zhan (2004) among others, Malama et al. (in press) developed a semi-analytical model for slug test response in unconfined aquifers that takes into account partial penetration as well as skin effects at the source well and inertial effects within the borehole. The solution presented by Malama et al. (in press) satisfies the governing partial differential equations (PDEs), boundary conditions, and initial conditions described below.

Fig. 1. Slug test setup diagram, showing packer and port system utilized. Before each test, the water column was pressurized through a manifold (1) using a compressed gas source. Water level in the unscreened column (2) is depressed due to outflow in the screened interval (3). At time $t_0$, excess air pressure is released through a valve in the manifold, resulting in flow into the well at the interval that has been isolated with packers (3). Water level in column is measured as it equilibrates with surrounding head, using a transducer (4) connected to a data acquisition system (DAQ) at the surface through a pressure-tight fitting. Dimensions are as explained in the text.

We consider a system for performing partially-penetrating slug tests in an unconfined aquifer, such as the pressurized-gas system with geometry as described in Fig. 1. Within the aquifer, the saturated flow equations apply assuming a small displacement of the water table:

$$S_{s,aq}\frac{\partial s_{aq}}{\partial t} = \frac{K_{r,aq}}{r}\frac{\partial}{\partial r}\left(r\frac{\partial s_{aq}}{\partial r}\right) + K_{z,aq}\frac{\partial^2 s_{aq}}{\partial z^2} \tag{1}$$

$$s_{aq}(r,z,t=0)=0,\qquad s_{aq}(r,z=0,t)=0,\qquad \left.\frac{\partial s_{aq}}{\partial z}\right|_{z=B}=0,\qquad \lim_{r\to\infty}s_{aq}(r,z,t)=0$$

where $s_{aq}$ represents the change in head from static conditions in the aquifer [L], $r$ represents radial distance from the wellbore center [L], $z$ represents depth below the water table [L], $t$ represents time [T], $B$ represents the aquifer saturated thickness [L], $S_{s,aq}$ is specific storage of the aquifer [L$^{-1}$], and $K_{r,aq}$ and $K_{z,aq}$ represent radial and vertical hydraulic conductivities within the aquifer [L T$^{-1}$], respectively.

Within the wellbore skin, similar saturated flow equations apply:

$$S_{s,sk}\frac{\partial s_{sk}}{\partial t} = \frac{K_{r,sk}}{r}\frac{\partial}{\partial r}\left(r\frac{\partial s_{sk}}{\partial r}\right) + K_{z,sk}\frac{\partial^2 s_{sk}}{\partial z^2} \tag{2}$$

$$s_{sk}(r,z,t=0)=0,\qquad s_{sk}(r,z=0,t)=0,\qquad \left.\frac{\partial s_{sk}}{\partial z}\right|_{z=B}=0$$

where $s_{sk}$ represents change in head from static conditions in the wellbore skin [L], and $S_{s,sk}$ [L$^{-1}$], $K_{r,sk}$, and $K_{z,sk}$ [L T$^{-1}$] represent storage and hydraulic conductivity parameters for the wellbore skin.

Between the aquifer and wellbore skin, continuity of head and discharge are enforced:

$$s_{aq}(r_{sk},z,t) = s_{sk}(r_{sk},z,t),\qquad K_{r,aq}\left.\frac{\partial s_{aq}}{\partial r}\right|_{r=r_{sk}} = K_{r,sk}\left.\frac{\partial s_{sk}}{\partial r}\right|_{r=r_{sk}} \tag{3}$$

where $r_{sk}$ is the radius of the wellbore skin [L].

Finally, a linearized inertial balance is applied to the open water within the wellbore (see Butler and Zhan, 2004), resulting in the following PDE and initial conditions:

$$\frac{d^2H}{dt^2} + \frac{8\nu L}{r_c^2 L_e}\frac{dH}{dt} + \frac{g}{L_e}H = \frac{g}{bL_e}\int_{d_B}^{d_T} s_{sk}(r_w,z,t)\,dz \tag{4}$$

$$H(t=0)=H_0,\qquad \left.\frac{dH}{dt}\right|_{t=0}=H_0'$$

where $H$ is the displacement of the wellbore water level from static conditions [L], $\nu$ is kinematic viscosity [L$^2$ T$^{-1}$], $d_T$ and $d_B$ represent the $z$ coordinates of the top and bottom of the test interval [L], $r_w$ is the radius of the well [L], and $r_c$ is the radius of the water column. $L$ and $L_e$ are length parameters related to the flow geometry in the well, with $L = d_T + \frac{(d_T - d_B)}{2}\left(\frac{r_c}{r_w}\right)^4$ (Butler, 2002), and $L_e = d_T + \frac{(d_T - d_B)}{2}\left(\frac{r_c}{r_w}\right)^2$ (Kipp, 1985).

The water level changes in the well are linked to head changes in the wellbore skin, lastly, through a mass balance:

$$2\pi(d_T - d_B)K_{r,sk}\left.\left(r\frac{\partial s_{sk}}{\partial r}\right)\right|_{r=r_w} = \begin{cases}\pi r_c^2\,\dfrac{dH}{dt} & \forall z\in[d_T,d_B]\\[4pt] 0 & \text{elsewhere}\end{cases} \tag{5}$$

A summary of all physical parameters of the model is given in Table 1. While it is recognized that all of the physical model parameters are subject to measurement errors or uncertainties, a few of the parameters can be very well constrained by simple field measurements and are thus considered constant and known for the remainder of this work, as marked with Xs in Table 1.

The solution to these coupled equations (the "unified solution"), as detailed in Malama et al. (in press) and references therein, is obtained through a combination of a Laplace transform in time and a finite Fourier transform in $z$, resulting in a solution in the Laplace domain that involves an infinite series of Bessel functions. We thus refer to this solution as semi-analytical since a finite approximation of the infinite series is necessary to obtain a result and since the inverse Laplace transform used to obtain the solution must, in general, be carried out numerically.

In this work, we utilize a slightly modified version of the unified solution in order to model slug tests from the BHRS. A modified code has been implemented in MATLAB and takes advantage of vectorization for most coding loops, resulting in a forward model

that can simulate roughly 20–30 temporal observations per second on a standard laptop CPU. The input parameters for this model are summarized in Table 1. In addition to the physical parameters utilized in the governing equations, our revised model includes three new parameters used to allow for errors in field data collection and data pre-processing. These errors may include misjudgement of static water levels, misjudgement of initial slug height, and mis-estimation of the time $t_0$ at which slug test initiation occurs. Given model results that calculate $H(t)$, the water level displacement in the borehole at time $t$, a revised estimate $H_r(t)$ can be calculated taking into account these errors using the formula

$$H_r(t) = H_{scale}\,H(t - t_{offset}) + H_{offset} \tag{6}$$

where $H_{scale}$, $H_{offset}$, and $t_{offset}$ can be used to correct the slug height, static water levels, and time zeroing, respectively. During data-fitting these parameters may be modified within reasonable ranges in order to improve curve matching.

Table 1
Definitions of physical parameters required for performing simulation of slug tests. Those parameters marked with Xs are considered known, and uncertainty in these parameters (or errors in their measurement) is not considered here.

| Model parameter (Units) | Definition | Assumed known/constant throughout? |
| --- | --- | --- |
| $K_{r,aq}$ (L/T) | Hydraulic conductivity in the radial direction within the main aquifer | |
| $K_{z,aq}$ (L/T) | Hydraulic conductivity in the vertical direction within the main aquifer | |
| $S_{s,aq}$ (L$^{-1}$) | Specific storage within the main aquifer | |
| $K_{r,sk}$ (L/T) | Hydraulic conductivity in the radial direction within the wellbore skin | |
| $K_{z,sk}$ (L/T) | Hydraulic conductivity in the vertical direction within the wellbore skin | |
| $S_{s,sk}$ (L$^{-1}$) | Specific storage within the wellbore skin | |
| $r_w$ (L) | Radius of well | X |
| $r_c$ (L) | Radius of slug water column | X |
| $r_{sk}$ (L) | Radius of the wellbore skin from the center of the well | X |
| $d_T$ (L) | Depth to the top of the test interval | X |
| $d_B$ (L) | Depth to the bottom of the test interval | X |
| $B$ (L) | Aquifer saturated thickness | X |
| $\nu$ (L$^2$ T$^{-1}$) | Kinematic viscosity of water | X |
| $g$ (L T$^{-2}$) | Acceleration due to gravity | X |
| $H_0$ (L) | Initial water level displacement | |
| $H_{offset}$ (L) | Correction factor for imperfect choice of static water levels (shifts time/head change graph up/down) | |
| $H_{scale}$ (L) | Correction factor for imperfect estimation of initial slug height (scales solution head values by a constant) | |
| $t_{offset}$ (T) | Correction factor for imperfect choice of test start time (shifts time/head change graph left/right) | |

3. Analysis of parameter identifiability

The unified model presented above provides a method for realistically simulating slug tests in unconfined, high-conductivity aquifers. In general, though, the more important feature of such models is that they allow estimation of important hydrologic parameters (primarily, hydraulic conductivity in the aquifer) given a set of field data, through inverse modeling. However, as is made evident by the above mathematical formulation, slug test response will also be dependent on factors such as wellbore skin hydraulic conductivity and on storage parameters in both the aquifer and skin. For this reason, as discussed by prior researchers (Butler, 1996, 1998; Faust and Mercer, 1984), care must be taken in evaluating the information content of slug tests for identifying aquifer parameters.

In this section, we investigate the issue of parameter identifiability under a number of realistic scenarios. For example, if slug test data are available from a wellbore which may have a wellbore skin, can we uniquely identify the hydrologic characteristics of both the wellbore skin and the aquifer from the data? Alternately, are there a variety of different possibilities for combinations of aquifer and skin parameters that fit the data well, such that accurate joint estimation of these parameters is, in effect, impossible?

To investigate the issue of parameter identifiability, we take the following approach. First, a set of field data is synthetically generated using the unified model, and a small amount of noise is added to the data in order to simulate measurement error and/or conceptual error in our model. For comparability with the field data analysis presented, the "true" parameter values used are similar to those expected from the BHRS aquifer, and the degree of measurement error added is comparable to that seen in real field data from the field site (as discussed in later sections). We then treat the hydrologic parameters of the unified model as unknown and search for parameter sets that result in good fits to the data.

More formally, given a set of field data and a set of estimated hydrologic parameters, the likelihood of the parameters given the data is, within a constant:

$$L(\mathbf{m}\mid\mathbf{d}) \propto \exp\left(-\frac{1}{2}\,(\mathbf{d}-G(\mathbf{m}))^T R^{-1} (\mathbf{d}-G(\mathbf{m}))\right)$$

where $\mathbf{m}$ is an ($n \times 1$) vector of unknown model parameters, $\mathbf{d}$ is an ($m \times 1$) vector of datapoints, $G(\cdot)$ is the forward model operator, $\mathbb{R}^n \to \mathbb{R}^m$, which converts from a given set of parameter values to the equivalent simulated field data, $R$ is an ($m \times m$) matrix representing the covariance of measurement errors, and where the standard assumption of Gaussian measurement errors has been invoked. To evaluate the distribution $L(\mathbf{m}\mid\mathbf{d})$ (i.e., to determine which parameter values are "likely") we perform Metropolis–Hastings sampling (Metropolis et al., 1953; Hastings, 1970), a Markov Chain Monte Carlo (MCMC) sampling method which results in a series of equally likely parameter sets given the data. While computationally intensive, MCMC methods are more likely to focus sampling and detail on high probability density areas of the likelihood function (which are of the most interest), in comparison to naive strategies such as grid search.

To implement the Metropolis–Hastings algorithm, we begin from an initial parameter set and implement the following steps iteratively:

1. Starting from a current realization of the parameter set $\mathbf{m}$, calculate the negative log-likelihood $NLL(\mathbf{m}) = \frac{1}{2}(\mathbf{d}-G(\mathbf{m}))^T R^{-1}(\mathbf{d}-G(\mathbf{m}))$.
2. Take a random step from the current parameter set $\mathbf{m}$ to a new set $\mathbf{m}'$. Calculate the negative log-likelihood at the new location, $NLL(\mathbf{m}')$.
3. Select a random variable $u$ from a uniform distribution on [0, 1].
4. If $-\ln(u) > NLL(\mathbf{m}') - NLL(\mathbf{m})$, accept $\mathbf{m}'$ as the new current realization, otherwise re-accept $\mathbf{m}$ as the current realization. Return to step 1.

To arrive at a good representative sample of conditional realizations, the above algorithm should first go through a "burn-in" period where new realizations are accepted, but not stored in memory. In practice, the "burn-in" period is utilized so that the Markov Chain is independent of its initial state (i.e., the initial parameter set supplied by the user), and so that the initial guess does not show a prominent signature in the final obtained distribution. After the set number of iterations in the burn-in period, all realizations of the variables are stored. Ideally, it can be shown that conditional realizations from the algorithm above will effectively be drawn from the true probability distribution as the number of accepted realizations approaches infinity.

In the following sections, we present a few specific sample cases that simplify the problem of parameter estimation in order to focus on identifiability of a few key parameters. We focus on the issues of whether aquifer anisotropy, storage coefficients, and wellbore skin parameters can be effectively estimated given realistic, noisy field data. In all cases presented below, we utilized synthetic slug tests

that generated underdamped responses, though a similar analysis could be carried out for synthetic cases in which the responses were overdamped. The true parameter values used in each case, and the types of prior knowledge assumed, are detailed in Table 2. The sets of prior information used in each case are "strict" in the sense that we assume some parameters of the model are known perfectly a priori in each scenario, but only upper bounds are defined for the parameters that are allowed to vary. Thus these sample cases focus on the relationship between parameters that are allowed to vary in each sample case, which are essentially assumed to have a uniform or "non-informative" prior distribution. Similar analysis may be carried out if other types of prior information are available (e.g. prior mean and variance estimates) by performing Metropolis–Hastings sampling on the posterior probability density function of the parameters given the data.

For all of the sample cases investigated below, we utilized as initial guess the "best estimate" obtained from deterministic optimization. A burn-in period of 5000 and acceptance total of 150,000 realizations were utilized for each case. While proper burn-in period length and accepted Markov Chain length are both the subject of some debate (see, e.g., discussion in Liu et al., 2010), the choices utilized were found to perform well on similar but faster-running test problems.

Table 2
Unknowns and prior knowledge assumed in three sample cases investigated using MCMC sampling to determine parameter identifiability.

| Model parameter (Units) | Case 1: True value | Case 1: Assumed knowledge | Case 2: True value | Case 2: Assumed knowledge | Case 3: True value | Case 3: Assumed knowledge |
| --- | --- | --- | --- | --- | --- | --- |
| $K_{r,aq}$ (m/s) | 3.70e-03 | <1 | 3.70e-03 | <1 | 3.70e-03 | <1 |
| $K_{z,aq}$ (m/s) | 2.70e-03 | <1 | 3.70e-03 | =$K_{r,aq}$ | 3.70e-03 | =$K_{r,aq}$ |
| $S_{s,aq}$ (m$^{-1}$) | 5.00e-05 | =5e-5 | 5.00e-05 | >1e-11 | 5.00e-05 | =5e-5 |
| $K_{r,sk}$ (m/s) | 2.00e-04 | =2e-4 | 2.00e-04 | =2e-4 | 2.00e-04 | <1 |
| $K_{z,sk}$ (m/s) | 2.00e-04 | =2e-4 | 2.00e-04 | =2e-4 | 2.00e-04 | =$K_{r,sk}$ |
| $S_{s,sk}$ (m$^{-1}$) | 5.00e-05 | =5e-5 | 5.00e-05 | =$S_{s,aq}$ | 5.00e-05 | =5e-5 |

3.1. Case 1 – Identifiability of aquifer anisotropy

In Case 1, a synthetic slug test data set was generated for a slightly anisotropic aquifer ($K_{r,aq}$ = 3.7e-3 m/s, $K_{z,aq}$ = 2.7e-3 m/s) using the unified model and parameters as given in Table 2. In order to specifically investigate the identifiability of aquifer anisotropy during parameter estimation, we assumed that wellbore skin parameters ($S_{s,sk}$, $K_{r,sk}$, $K_{z,sk}$) and aquifer specific storage ($S_{s,aq}$) were known. Additionally, an upper bound of 1 m/s was set for both $K_{r,aq}$ and $K_{z,aq}$. The set of accepted parameter values for $K_{r,aq}$ and $K_{z,aq}$ found using Metropolis–Hastings sampling is shown in Fig. 2, along with a few examples of the fit of simulated curves to the synthetic data.

Fig. 2. Realizations of $K_{r,aq}$ and $K_{z,aq}$ accepted by the Metropolis–Hastings algorithm, with true values and maximum likelihood estimate highlighted (top). Non-linear trade-off between $K_{r,aq}$ and $K_{z,aq}$ results in a wide range of reasonable parameter values that fit the data well. Selected fits for three randomly selected parameter realizations are plotted beneath, emphasizing that good data fits can be obtained with quite different parameter sets.

We note the following interesting features of the obtained parameter distribution. Firstly, the maximum likelihood estimate of the parameters is not equal to the true parameter values, suggesting that even moderate amounts of noise in the data may bias estimates of aquifer anisotropy. Also obvious is the interesting non-linear trade-off between $K_{r,aq}$ and $K_{z,aq}$. A wide range of both $K_{r,aq}$ and $K_{z,aq}$ values were found that fit the data well, and thus joint identifiability of these parameters may be difficult without prior information. However, if prior information about one of these parameters is available (e.g. $K_{z,aq} \approx$ 3e-3 m/s), or if prior information is available relating the two parameters (e.g. $K_{r,aq} \approx 1.3K_{z,aq}$), then conditional distributions will contain only very narrow uncertainty bounds, owing to the thinness of the given likelihood distribution. Of course, prior information must also be used with care since assuming inaccurate information (e.g. $K_{r,aq} \approx 2K_{z,aq}$) can lead to quite inaccurate parameter estimates.

3.2. Case 2 – Identifiability of storage coefficient

In Case 2, we generated synthetic slug test data from an isotropic aquifer and assumed that both aquifer conductivity and storage parameters were unknown. That is, $K_{r,aq}$ and $S_{s,aq}$ were to be estimated assuming knowledge of skin parameters and also assuming knowledge that $K_{r,aq} = K_{z,aq}$. In addition, a lower bound of 1e-11 m$^{-1}$ was assumed for $S_{s,aq}$. The true values used for $K_{r,aq}$ and $S_{s,aq}$ are 3.7e-3 m/s and 5e-5 m$^{-1}$, respectively.

Fig. 3. Realizations of $K_{r,aq}$ and $S_s$ accepted by the Metropolis–Hastings algorithm, with true values and maximum likelihood estimate highlighted (top). Several orders of magnitude in $S_s$ are spanned, suggesting poor identifiability of this parameter. Uncertainty in $K_{r,aq}$ remains low, even with unknown $S_s$. Selected fits for three randomly selected parameter realizations are plotted beneath, emphasizing that good data fits can be obtained with quite different parameter sets.

The distribution of likely parameters in Case 2 provides insights into joint estimation of these parameters (see Fig. 3). First, we note that the marginal distribution of K is quite narrow, only spanning a range of a few percent around the true value. In contrast, the marginal distribution of $S_s$ is extremely wide, suggesting – as other researchers have noted – that estimation of storage parameters using slug test data will be prone to large errors.

The relative lack of correlation between accepted K and $S_s$ estimates indicates, practically, that if a reasonable value of $S_s$ is assumed, it will not have a large impact on estimation of K. For example, if one assumed in this particular case that $S_s$ = 5e-6 m$^{-1}$, the conditional distribution of K obtained from this estimation of $S_s$ will still give values close to the true K. This agrees well with the analyses of Beckie and Harvey (2002), which found that transmissivity estimates from slug tests were not strongly influenced by storage properties, but that estimates of storage properties obtained from slug tests had dubious value.

3.3. Case 3 – Identifiability of skin conductivity

In Case 3, we assume an isotropic aquifer with known storage parameters and seek to estimate both aquifer and skin hydraulic conductivity ($K_{r,aq}$ and $K_{r,sk}$). Synthetic data were generated with $K_{r,aq}$ = 3.7e-3 m/s and $K_{r,sk}$ = 2e-4 m/s, i.e. the slug response is affected by both aquifer conductivity and a low-conductivity or "positive" skin. In order to bound the problem, a reasonable upper bound of 1 m/s was assumed for both parameters.

Similarly to Case 1, the distribution of $K_{r,aq}$ and $K_{r,sk}$ shows an interesting non-linear trade-off between accepted values of the two parameters (see Fig. 4). This suggests that, as noted by other researchers, joint estimation of both aquifer and skin conductivity may be difficult without prior information (Faust and Mercer, 1984; Hyder et al., 1994). As before, if good prior information is available about either one of the parameters, then the resulting conditional uncertainty in the other parameter will generally be low due to the thinness of the overall likelihood distribution. The distribution of accepted parameter sets also indicates that it may be possible to obtain good lower bounds on skin conductivity values without prior information. Note, as shown by the accepted parameter estimates, that as one assumes progressively lower and lower skin K estimates, drastic increases in the aquifer K value are required in order to fit the data (see also Malama et al., in press). Since we know that aquifer K estimates should be reasonable (for example, below 1 m/s), this information can be used to place reasonable lower bounds on the skin K values supported by the data.

4. Field data analysis

In the following sections, we apply the lessons learned from our investigation of parameter identifiability to a large set of slug test data collected at the Boise Hydrogeophysical Research Site (BHRS).

Fig. 4. Realizations of $K_{r,sk}$ and $K_{r,aq}$ accepted by the Metropolis–Hastings algorithm, with true values and maximum likelihood estimate highlighted (top). Non-linear trade-off between $K_{r,sk}$ and $K_{r,aq}$ results in a wide range of parameter values that fit the data well. Selected fits for three randomly selected parameter realizations are plotted beneath, emphasizing that good data fits can be obtained with quite different parameter sets.
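The Metropolis–Hastings sampling behind the accepted realizations in Figs. 2–4 (steps 1–4 of Section 3, followed by a burn-in period) can be illustrated with a minimal sketch. This is not the authors' MATLAB implementation. It assumes a hypothetical one-parameter exponential-decay forward model in place of the unified model, independent Gaussian errors (so that R = sigma^2 I), a Gaussian random-walk proposal, and uniform prior bounds. The names `forward`, `nll`, and `metropolis_hastings` and all numerical values are illustrative assumptions.

```python
import math
import random


def forward(k, times, h0=1.0):
    """Toy stand-in for G(m): overdamped decay H(t) = h0 * exp(-k * t).
    (Assumption: the unified slug-test model is NOT implemented here.)"""
    return [h0 * math.exp(-k * t) for t in times]


def nll(k, times, data, sigma):
    """Negative log-likelihood 0.5 * (d - G(m))^T R^-1 (d - G(m)),
    assuming independent errors so that R = sigma^2 * I."""
    sim = forward(k, times)
    return 0.5 * sum(((d - s) / sigma) ** 2 for d, s in zip(data, sim))


def metropolis_hastings(times, data, sigma, k_init, step,
                        n_burn, n_keep, k_max=10.0, seed=0):
    """Random-walk Metropolis-Hastings with uniform bounds 0 < k < k_max."""
    rng = random.Random(seed)
    k, nll_k = k_init, nll(k_init, times, data, sigma)      # step 1
    samples = []
    for i in range(n_burn + n_keep):
        k_new = k + rng.gauss(0.0, step)                     # step 2: random step
        if 0.0 < k_new < k_max:                              # outside bounds: reject
            nll_new = nll(k_new, times, data, sigma)
            u = 1.0 - rng.random()                           # step 3: u in (0, 1]
            if -math.log(u) > nll_new - nll_k:               # step 4: accept m'
                k, nll_k = k_new, nll_new
        if i >= n_burn:                                      # store only after burn-in
            samples.append(k)                                # rejected step re-stores m
    return samples


# Synthetic "field data": toy model plus a small amount of Gaussian noise.
noise_rng = random.Random(1)
times = [0.1 * i for i in range(50)]
k_true, sigma = 2.0, 0.01
data = [h + noise_rng.gauss(0.0, sigma) for h in forward(k_true, times)]

samples = metropolis_hastings(times, data, sigma, k_init=1.0, step=0.02,
                              n_burn=2000, n_keep=10000)
post_mean = sum(samples) / len(samples)
print(f"posterior mean of k: {post_mean:.4f} (true value {k_true})")
```

With a single well-constrained parameter the stored chain clusters tightly around the true value. The trade-offs shown in Figs. 2–4 arise only when two or more parameters are sampled jointly, which in a sketch like this would mean replacing the scalar `k` with a parameter vector.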

4.1. Field site and data collection

The BHRS is a research wellfield in an uncontaminated fluvial aquifer established in a gravel bar of the Boise River (see Fig. 5), developed by the Center for Geophysical Investigation of the Shallow Subsurface (CGISS) at Boise State University. The purpose of the site is to provide a field-scale control volume for the development and testing of hydrologic and geophysical methods used for aquifer characterization (Barrash and Knoll, 1998). The site setup consists of 18 wells – arranged in a series of roughly concentric rings around a central well (designated A1) – which were drilled through the roughly 18 m thick aquifer and completed into the underlying clay aquitard. The wells themselves are small-diameter (10 cm inner diameter) PVC pipe and are fully screened throughout the aquifer depth with the exception of blank segments, roughly 0.3 m in length and spaced 3 m apart, representing locations where sections of pipe are threaded together. Disturbance of the natural material near the wellbores during drilling is thought to be minimal based on the drilling and well-finishing techniques employed (Barrash et al., 2006; Morin et al., 1988).

Geologically, the shallow unconfined aquifer at the BHRS consists of a heterogeneous mixture of unconsolidated, unaltered sand and gravel deposits with ages from Pleistocene to Holocene. Grain size analyses from cores show varying distributions with depth, from primarily sand-dominated units to bimodal sand-gravel mixtures to cobble-dominated units with sand in the interstices of the framework (Reboulet and Barrash, 2003; Barrash and Reboulet, 2004). Similarly, neutron porosity logs show significant variations in porosity with depth, with mean values for individual stratigraphic units ranging from 0.172 to 0.425 in addition to significant changes in porosity variance between units (Barrash and Clemo, 2002).

Earlier analyses of datasets to obtain hydraulic conductivity (K) estimates at the site have noted differences in fitted K values when using analytic models on a well-by-well basis (Fox, 2006; Barrash et al., 2006), suggesting lateral heterogeneity. Similarly, inverse modeling of dipole pumping tests has likewise suggested variability in depth-integrated K throughout the site (Cardiff et al., 2009), providing further evidence for lateral heterogeneity. In addition to lateral heterogeneity affecting head responses at wells, wellbore skin has been found to contribute to differing responses at pumping/testing vs. observation wells. Drawdown curves show consistent, systematic evidence for positive wellbore skin (Barrash et al., 2006) at the BHRS, in terms of larger than expected drawdowns at pumping wells given observation wells' responses.

A series of slug tests was performed at the BHRS with the goal of contributing to the understanding of vertical variability in hydraulic conductivity at the BHRS in the vicinity of the wells and to provide more information about the 3D geologic structures influencing groundwater flow and solute transport. The slug test data analyzed in this paper were collected using a partially-penetrating, pressurized-gas slug test system (Leap, 1984; Levy et al., 1993) as diagrammed in Fig. 1. For all tests performed, the following parameters of the system geometry were constant: $r_c$ = 1.905 cm, $r_w$ = 5.08 cm, $r_{sk}$ = 5.715 cm, and $d_B - d_T$ = 30.48 cm. Thickness of the aquifer $B$ varied depending on water levels during testing conditions and minor variations in the depth of the clay aquitard at the site. The depth to the top of the slug interval $d_T$ varied based on the interval of the aquifer being evaluated. Logistically, each slug test proceeded as follows: before the test, the air above the water column was pressurized and water in the water column was allowed to come to equilibrium with the surrounding

Fig. 5. Location of BHRS with respect to Boise River and upstream Diversion Dam. Inset shows arrangement of central wells.

Fig. 6. Normalized sample slug test records, showing overall independence of response from initial slug height. Slug test records have been normalized by the height of the first measured datapoint.
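The normalization described in the Fig. 6 caption can be sketched as follows. This is a minimal illustration, assuming each record is an array of head displacements sampled on a common time grid. The damped-cosine records and the function names `normalize_record` and `max_normalized_difference` are hypothetical stand-ins, not the field data or the authors' code.

```python
import numpy as np

def normalize_record(head):
    """Divide a slug test record by its first measured datapoint."""
    head = np.asarray(head, dtype=float)
    if head[0] == 0.0:
        raise ValueError("first datapoint is zero; cannot normalize")
    return head / head[0]

def max_normalized_difference(records):
    """Largest pointwise gap between normalized records on a shared time grid."""
    normalized = np.array([normalize_record(r) for r in records])
    return float(np.max(normalized.max(axis=0) - normalized.min(axis=0)))

# Hypothetical linear (height-independent) responses for three slug heights.
t = np.linspace(0.0, 10.0, 201)
shape = np.exp(-0.5 * t) * np.cos(2.0 * t)            # assumed response shape
records = [h0 * shape for h0 in (0.30, 0.25, 0.20)]   # slug heights in m

# For a linear system the normalized curves coincide (gap near zero).
print(max_normalized_difference(records))
```

A gap that grows with slug height would point to the kind of non-linearity discussed in the text; how large a gap is acceptable is a judgment call that this sketch does not settle.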

Table 3 Slug test data collected during BHRS field campaigns. Tests were performed at each 0.3 m (1 ft) interval, except at intervals where well casing is blank (i.e., unscreened). Slug heights represent approximate equivalent head changes in height of water based on column gas pressurization.

Well name | Highest elevation (m AMSL) | Lowest elevation (m AMSL) | Data collection date | Slug height 1 (cm H2O) | Slug height 2 (cm H2O) | Slug height 3 (cm H2O) | Slug height 4 (cm H2O)
A1 | 846.93 | 332.60 | 7/1/2008–7/2/2008 | 30 | 25 | 20 | –
B1 | 847.70 | 831.24 | 6/16/2009–6/17/2009 | 5 | 13 | 20 | 5
B2 | 847.29 | 832.05 | 6/17/2008–6/19/2008 | 30 | 25 | 20 | –
B3 | 847.21 | 831.67 | 6/24/2008–6/26/2008 | 30 | 25 | 20 | –
B4 | 847.38 | 832.14 | 5/23/2008–6/03/2008 | 30 | 25 | 20 | –
B5 | 847.21 | 832.16 | 6/30/2009–6/30/2009 | 5 | 13 | 20 | 5
B6 | 847.53 | 832.59 | 6/30/2009–7/1/2009 | 5 | 13 | 20 | 5
C1 | 847.16 | 832.22 | 7/14/2009 | 5 | 13 | 5 | –
C2 | 847.19 | 831.65 | 3/5/2008–8/6/2008 | 30 | 25 | 20 | –
C3 | 846.84 | 831.90 | 7/9/2008–7/11/2008 | 30 | 25 | 20 | –
C4 | 847.22 | 831.18 | 6/4/2008–6/6/2008 | 30 | 25 | 20 | –
C5 | 847.30 | 837.54 | 7/6/2008 | 5 | 13 | 20 | –
C5 | 836.93 | 832.06 | 7/7/2008 | 5 | 13 | 5 | –
C6 | 847.23 | 831.69 | 7/7/2009–7/8/2009 | 5 | 13 | 5 | –
X1 | 847.26 | 831.72 | 7/20/2009 | 5 | 13 | 5 | –
X2 | 847.31 | 831.46 | 7/15/2009 | 5 | 13 | 5 | –
X3 | 847.33 | 831.18 | 7/21/2009 | 5 | 13 | 5 | –
X4 | 847.53 | 823.79 | 6/8/2009–6/10/2009 | 5 | 13 | 20 | 5
X5 | 847.23 | 829.30 | 7/22/2009 | 5 | 13 | 5 | –

Fig. 7. Samples of field data and optimized simulated slug test curves for both analysis cases. Curve fits obtained with and without wellbore skin are near-identical in most cases (e.g., Samples 1–3), but produce different aquifer K estimates. In a few cases (e.g., Sample 4), field data analyzed using the model with skin could not be fit with reasonable aquifer K estimates. Note aquifer K estimates were constrained to be below 1 m/s (log10(K) = 0).

aquifer. Then, at time t0, the excess air pressure was near-instantaneously released, resulting in a loss of equivalent head in the water column and, hence, flow into the well (i.e., a "slug out" test).

In accordance with practical guidance from prior research studies (see, e.g., Butler, 1996, 1998; McElwee, 2002), a series of at least three different slug heights (i.e., air pressurizations to produce

Fig. 8. Well C3 profiles, showing consistency of K estimates across slug heights. Spread of estimates is <20% for 98% of intervals in Analysis Case 1, and for 89% of intervals in Analysis Case 2.

Fig. 9. Well C5 profiles, showing consistency of K estimates across slug heights. Note unreasonable aquifer K estimate obtained near 839 m in Analysis Case 2 (1.0 m/s), suggesting that 2e-4 m/s represents a lower bound for reasonable skin K values.

three different H0 starting values) were performed for each depth interval. These repetitions allow for testing whether slug response is independent of slug height, which can be used to validate whether the assumption of linearity invoked by most analytic theories (i.e., the linearity of Eq. (4)) applies. In many cases, like those shown in Fig. 6, normalized slug response was largely repeatable and independent of initial slug height, suggesting that the data do not grossly violate the assumptions of linearity. However, in some slug test records a trend of increased damping with larger slug heights was observed, suggesting that tests using larger slug heights resulted in slight flow non-linearities within the wellbore.

In many of the wells, multiple repetitions of the smallest slug height were collected for a given depth interval (see Table 3). The repetition of a given slug height allows testing of whether well development is taking place during the performance of slug tests. Given prior well development activities undertaken at the BHRS and the variety of prior hydraulic tests performed at the site, well development during slug tests was not expected. In cases where multiple replicates of the same slug height were performed, visual inspection of slug test data showed little to no difference between these records.

The field data utilized in this work were collected throughout the summers of 2008 and 2009. Slug tests were performed at 0.3 m depth increments for all intervals in each well, except for those intervals in the well that are unscreened due to pipe threading. Intervals were progressively isolated using a down-borehole packer system consisting of 1 m long packers with a 0.3 m isolated screen interval in between. The tests were performed at all 18 wells of the BHRS, resulting in a total of over 2500 response curves to be analyzed. Due to the highly conductive nature of the aquifer and the quick response time, a full set of slug tests for a well could be collected quite quickly, and routinely took only 1–2 days. The particular slug heights used for each well varied and, as discussed earlier, in more than half the zones, tests using the smallest slug height were repeated multiple times.

4.2. Parameter estimation

Incorporating the lessons learned from our synthetic investigations, we apply parameter estimation to the full set of over 2500 slug test records available from the BHRS. Useful prior information from earlier studies at the site is used to constrain model parameters that otherwise would contribute to significant uncertainty in the estimation of aquifer hydraulic conductivity.

Since identification of storage parameters from slug test data appears unreliable, we have chosen a reasonable value of specific storage Ss that applies to both the aquifer material and the wellbore skin. Prior studies at the BHRS (e.g. Barrash et al., 2006) have obtained estimates of Ss between 3 × 10⁻⁵ m⁻¹ and 1 × 10⁻⁴ m⁻¹. For the analysis of slug test data, we have thus chosen to utilize a constant Ss = 5 × 10⁻⁵ m⁻¹. We note that, based on our earlier synthetic analyses, estimation of aquifer K values should not be particularly dependent on this choice. Estimation of aquifer anisotropy from slug test data likewise appears unreliable based on our synthetic analyses, and we thus resort to utilizing prior information available from earlier analyses. Based on analyses from fully-penetrating pumping tests, Barrash et al. (2006) found little evidence for significant aquifer anisotropy. We therefore assume during our analyses of slug test data that Kr,aq = Kz,aq.

With regards to wellbore skin, Barrash et al. (2006) found evidence of positive (i.e., lower conductivity) skin through analyses of drawdown curves at both pumping and observation wells. However, since the exact conductivity values for wellbore skin may be difficult to estimate, and since pumping tests may be affected by mechanisms such as strongly-convergent radial flow which result in "pseudo-skin" behaviors (see, e.g. Desbarats, 1992; Neuman and Orr, 1993; Rovey and Niemann, 2001), we have analyzed slug test records using three different analysis cases. In Analysis Case 1, we have assumed that there is no wellbore skin present, i.e. by setting Kr,aq = Kz,aq = Kr,sk = Kz,sk. If positive wellbore skin is present in any magnitude, this assumption results in decreased estimates of aquifer K, meaning that Analysis Case 1 provides a lower bound on the aquifer K values from the site. In Analysis Case 2, we assume isotropic wellbore skin with magnitude Kr,sk = Kz,sk = 2 × 10⁻⁴ m/s, which is somewhat higher conductivity than the skin value obtained by Barrash et al. (2006). Finally, in Analysis Case 3 we utilize the skin value obtained by Barrash et al. (2006), with Kr,sk = Kz,sk = 2 × 10⁻⁵ m/s.

In order to perform parameter estimation, we utilized an automatic routine based on MATLAB's built-in fmincon (constrained, gradient-based optimization) and fminsearch (simplex search) optimization routines in order to minimize data-fitting residuals.
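The combination of a bounded gradient-based search with a simplex search can be illustrated with a short sketch. It uses SciPy's `L-BFGS-B` and `Nelder-Mead` methods as stand-ins for MATLAB's fmincon and fminsearch, and a hypothetical exponential surrogate in place of the semi-analytical slug test model, so neither the forward model nor the K-to-rate mapping below comes from the paper. The timing offset is omitted to keep the surrogate identifiable.

```python
import numpy as np
from scipy.optimize import minimize

def surrogate_response(t, log10_k, h_scale, h_offset):
    """Hypothetical stand-in forward model (not the slug test solution)."""
    rate = 1.0e3 * 10.0 ** log10_k  # assumed mapping from K (m/s) to decay rate (1/s)
    return h_offset + h_scale * np.exp(-rate * t)

def sum_squared_residuals(params, t, observed):
    """Data-fitting objective: sum of squared misfits."""
    return float(np.sum((surrogate_response(t, *params) - observed) ** 2))

def fit_record(t, observed, start):
    # Stage 1: gradient-based search with log10(K) bounded above by 0 (K <= 1 m/s).
    bounds = [(None, 0.0), (None, None), (None, None)]
    stage1 = minimize(sum_squared_residuals, start, args=(t, observed),
                      method="L-BFGS-B", bounds=bounds)
    # Stage 2: derivative-free simplex polish started from the stage 1 result.
    stage2 = minimize(sum_squared_residuals, stage1.x, args=(t, observed),
                      method="Nelder-Mead")
    # Keep the simplex result only if it improves the fit and respects the bound.
    if stage2.fun <= stage1.fun and stage2.x[0] <= 0.0:
        return stage2
    return stage1

t = np.linspace(0.0, 5.0, 101)
observed = surrogate_response(t, -3.0, 1.0, 0.0)  # synthetic, noise-free record
best = fit_record(t, observed, start=[-2.5, 0.8, 0.05])
print(best.x)
```

Running the simplex stage second is one plausible ordering; the text does not state how the two routines were sequenced, and re-running from modified starting values remains necessary when a fit lands in a poor local minimum.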

Table 4 Trends in mean and variance of K estimates per well (Analysis Case 1) due to slug height used. In all except one case (bold), larger slug heights are associated with smaller depth-averaged K estimates. Similarly, in all but two cases (bold), use of larger slug heights is associated with decreased variance in K estimates obtained.

Well | Slug height (cm): Slug 1 | Slug 2 | Slug 3 | Slug 4 | K mean (m/s): Slug 1 | Slug 2 | Slug 3 | Slug 4 | K variance (m²/s²): Slug 1 | Slug 2 | Slug 3 | Slug 4
A1 | 30 | 25 | 20 | – | 9.57e-04 | 9.86e-04 | 1.05e-03 | – | 1.91e-07 | 1.98e-07 | 3.30e-07 | –
B1 | 5 | 13 | 20 | 5 | 1.25e-03 | 1.17e-03 | 1.11e-03 | 1.25e-03 | 6.02e-07 | 4.81e-07 | 3.69e-07 | 6.14e-07
B2 | 30 | 25 | 20 | – | 8.74e-04 | 8.99e-04 | 9.36e-04 | – | 1.15e-07 | 1.21e-07 | 1.31e-07 | –
B3 | 30 | 25 | 20 | – | 8.58e-04 | 8.81e-04 | 9.07e-04 | – | 2.56e-07 | 2.76e-07 | 2.74e-07 | –
B4 | 30 | 25 | 20 | – | 1.09e-03 | 1.14e-03 | 1.16e-03 | – | 3.64e-07 | 4.06e-07 | 4.14e-07 | –
B5 | 5 | 13 | 20 | 5 | 8.99e-04 | 8.37e-04 | 8.15e-04 | 8.83e-04 | 4.73e-07 | 3.34e-07 | 2.39e-07 | 4.04e-07
B6 | 5 | 13 | 20 | 5 | 1.10e-03 | 1.02e-03 | 9.82e-04 | 1.08e-03 | 5.80e-07 | 4.07e-07 | 3.14e-07 | 5.15e-07
C1 | 5 | 13 | 5 | – | 1.16e-03 | 1.07e-03 | 1.16e-03 | – | 3.94e-07 | 3.45e-07 | 4.02e-07 | –
C2 | 30 | 25 | 20 | – | 6.66e-04 | 6.88e-04 | 6.93e-04 | – | 9.73e-08 | 1.06e-07 | 1.11e-07 | –
C3 | 30 | 25 | 20 | – | 7.13e-04 | 7.51e-04 | 7.63e-04 | – | 3.09e-07 | 3.58e-07 | 3.29e-07 | –
C4 | 30 | 25 | 20 | – | 5.61e-04 | 5.82e-04 | 6.08e-04 | – | 7.10e-08 | 7.70e-08 | 8.23e-08 | –
C5 | 5 | 13 | 20 | – | 1.00e-03 | 9.63e-04 | 9.19e-04 | – | 7.82e-07 | 5.76e-07 | 4.45e-07 | –
C5 | 5 | 13 | 5 | – | 1.05e-03 | 1.00e-03 | 1.08e-03 | – | 7.27e-07 | 6.66e-07 | 8.88e-07 | –
C6 | 5 | 13 | 5 | – | 1.21e-03 | 1.16e-03 | 1.22e-03 | – | 1.48e-06 | 1.42e-06 | 1.58e-06 | –
X1 | 5 | 13 | 5 | – | 1.61e-03 | 1.48e-03 | 1.58e-03 | – | 1.79e-06 | 1.41e-06 | 1.71e-06 | –
X2 | 5 | 13 | 5 | – | 7.64e-04 | 7.63e-04 | 7.46e-04 | – | 3.76e-07 | 3.52e-07 | 3.42e-07 | –
X3 | 5 | 13 | 5 | – | 8.30e-04 | 7.90e-04 | 8.33e-04 | – | 1.79e-07 | 1.58e-07 | 1.80e-07 | –
X4 | 5 | 13 | 20 | 5 | 1.20e-03 | 1.13e-03 | 1.10e-03 | 1.23e-03 | 4.74e-07 | 3.95e-07 | 3.25e-07 | 4.97e-07
X5 | 5 | 13 | 5 | – | 1.21e-03 | 1.12e-03 | 1.16e-03 | – | 5.76e-07 | 3.88e-07 | 4.81e-07 | –
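The first trend in Table 4 (smaller depth-averaged K at larger slug heights) can be checked with a few lines of code. The sketch below copies the K mean values for wells B1, B5 and B6 from Table 4, with exponents read as negative. The helper names and the spread measure, (max − min)/mean, are assumptions made for illustration and are not defined in the paper.

```python
# Depth-averaged K (m/s) per slug height, copied from Table 4 (Analysis Case 1).
table4_k_mean = {
    "B1": {"heights_cm": [5, 13, 20, 5], "k_mean": [1.25e-3, 1.17e-3, 1.11e-3, 1.25e-3]},
    "B5": {"heights_cm": [5, 13, 20, 5], "k_mean": [8.99e-4, 8.37e-4, 8.15e-4, 8.83e-4]},
    "B6": {"heights_cm": [5, 13, 20, 5], "k_mean": [1.10e-3, 1.02e-3, 9.82e-4, 1.08e-3]},
}

def mean_k_by_height(heights_cm, k_mean):
    """Average repeated slug heights, then return (height, K) pairs sorted by height."""
    grouped = {}
    for height, k in zip(heights_cm, k_mean):
        grouped.setdefault(height, []).append(k)
    return sorted((h, sum(v) / len(v)) for h, v in grouped.items())

def decreases_with_height(pairs):
    """True if K falls strictly as slug height rises."""
    ks = [k for _, k in pairs]
    return all(a > b for a, b in zip(ks, ks[1:]))

def relative_spread(k_values):
    """(max - min) / mean: the spread measure assumed in this sketch."""
    mean = sum(k_values) / len(k_values)
    return (max(k_values) - min(k_values)) / mean

for well, row in table4_k_mean.items():
    pairs = mean_k_by_height(row["heights_cm"], row["k_mean"])
    print(well, decreases_with_height(pairs), round(relative_spread(row["k_mean"]), 3))
```

The same helpers can be applied to the variance columns to examine the second trend.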

Table 5 Trends in mean and variance of K estimates per well (Analysis Case 2) due to slug height used. In all except four cases (bold), larger slug heights are associated with smaller depth-averaged K estimates. Similarly, in all but six cases (bold), use of larger slug heights is associated with decreased variance in K estimates obtained.

Well | Slug height (cm): Slug 1 | Slug 2 | Slug 3 | Slug 4 | K mean (m/s): Slug 1 | Slug 2 | Slug 3 | Slug 4 | K variance (m²/s²): Slug 1 | Slug 2 | Slug 3 | Slug 4
A1 | 30 | 25 | 20 | – | 1.62e-03 | 1.65e-03 | 3.05e-03 | – | 3.62e-06 | 2.95e-06 | 9.77e-05 | –
B1 | 5 | 13 | 20 | 5 | 4.56e-03 | 2.95e-03 | 2.10e-03 | 4.40e-03 | 1.38e-04 | 2.98e-05 | 5.22e-06 | 1.02e-04
B2 | 30 | 25 | 20 | – | 1.23e-03 | 1.28e-03 | 1.34e-03 | – | 3.79e-07 | 4.20e-07 | 4.63e-07 | –
B3 | 30 | 25 | 20 | – | 1.38e-03 | 1.46e-03 | 1.56e-03 | – | 2.00e-06 | 2.39e-06 | 3.00e-06 | –
B4 | 30 | 25 | 20 | – | 2.83e-03 | 2.15e-02 | 2.44e-02 | – | 3.52e-05 | 1.66e-02 | 2.17e-02 | –
B5 | 5 | 13 | 20 | 5 | 2.66e-03 | 1.90e-03 | 1.29e-03 | 2.07e-03 | 5.12e-05 | 1.95e-05 | 2.25e-06 | 1.75e-05
B6 | 5 | 13 | 20 | 5 | 1.37e-02 | 7.00e-03 | 2.77e-03 | 1.67e-02 | 3.32e-03 | 1.09e-03 | 6.23e-05 | 8.54e-03
C1 | 5 | 13 | 5 | – | 3.49e-03 | 1.04e-02 | 4.41e-03 | – | 4.39e-05 | 2.97e-03 | 1.68e-04 | –
C2 | 30 | 25 | 20 | – | 8.74e-04 | 9.25e-04 | 9.71e-04 | – | 5.29e-07 | 6.57e-07 | 8.36e-07 | –
C3 | 30 | 25 | 20 | – | 1.22e-03 | 1.59e-03 | 1.46e-03 | – | 2.73e-06 | 1.08e-05 | 4.85e-06 | –
C4 | 30 | 25 | 20 | – | 6.82e-04 | 7.19e-04 | 7.57e-04 | – | 1.68e-07 | 1.88e-07 | 2.06e-07 | –
C5 | 5 | 13 | 20 | – | 2.27e-02 | 3.06e-03 | 1.88e-03 | – | 1.22e-02 | 5.93e-05 | 7.73e-06 | –
C5 | 5 | 13 | 5 | – | 2.95e-03 | 2.89e-03 | 3.10e-03 | – | 2.68e-05 | 2.82e-05 | 3.24e-05 | –
C6 | 5 | 13 | 5 | – | 5.80e-02 | 4.58e-02 | 6.01e-02 | – | 4.55e-02 | 4.23e-02 | 4.92e-02 | –
X1 | 5 | 13 | 5 | – | 9.01e-02 | 6.18e-02 | 8.18e-02 | – | 6.35e-02 | 4.74e-02 | 5.43e-02 | –
X2 | 5 | 13 | 5 | – | 1.67e-03 | 1.49e-03 | 1.92e-03 | – | 1.24e-05 | 7.39e-06 | 2.53e-05 | –
X3 | 5 | 13 | 5 | – | 1.20e-03 | 1.11e-03 | 1.21e-03 | – | 1.01e-06 | 8.55e-07 | 9.73e-07 | –
X4 | 5 | 13 | 20 | 5 | 2.71e-03 | 2.39e-03 | 1.95e-03 | 3.12e-03 | 1.68e-05 | 1.80e-05 | 4.55e-06 | 3.03e-05
X5 | 5 | 13 | 5 | – | 1.66e-02 | 2.24e-03 | 2.64e-03 | – | 1.03e-02 | 7.10e-06 | 1.14e-05 | –

In all three Analysis Cases, log10(Kr,aq) was treated as a variable bounded on the interval [−∞, 0], in addition to the scaling and timing parameters mentioned earlier (Hscale, Hoffset, and toffset). After optimization, all data fits were visually examined and a subset of

Fig. 10. Example of similarity of estimated log10(K) profiles at closely-spaced wells (both analysis cases), suggesting that K values are representative of aquifer properties and likewise suggesting horizontal continuity of individual layers or lenses. Similar slug heights chosen for comparison. Inter-well distances are as follows: B1–B2 = 3.33 m, B1–C1 = 4.67 m, B2–C1 = 5.08 m.

Fig. 11. Visualization of anisotropy ellipse for log10(K) variogram for Analysis Case 1 (no skin), estimated using RML methodology. Dimensions of the ellipse represent optimized correlation length L of exponential variogram, and are compared with dimensions of BHRS well field (black lines representing wells onsite). Axes of ellipse represent optimized principal anisotropy directions.
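The anisotropic exponential variogram behind this ellipse can be written compactly. The sketch below is illustrative only: it assumes the convention γ(h) = sill · (1 − exp(−|h′|)), where h′ is the lag scaled by the correlation length along each principal axis, uses the Analysis Case 1 lengths quoted in the text (about 6 m, 10 m and 1.5 m), ignores the small tilt from vertical, and takes a hypothetical sill of 1 because the text does not report one.

```python
import numpy as np

def exponential_variogram(lag, sill, lengths, rotation=None):
    """Anisotropic exponential variogram: sill * (1 - exp(-|scaled lag|)).

    lag      : (..., 3) separation vectors in metres
    lengths  : correlation lengths along the three principal axes (m)
    rotation : optional 3x3 matrix whose rows are the principal axis directions
    """
    lag = np.asarray(lag, dtype=float)
    if rotation is not None:
        # Project each lag vector onto the principal axes.
        lag = lag @ np.asarray(rotation, dtype=float).T
    scaled = np.linalg.norm(lag / np.asarray(lengths, dtype=float), axis=-1)
    return sill * (1.0 - np.exp(-scaled))

# Axes assumed here: river-parallel, river-perpendicular, near-vertical.
lengths_case1 = (6.0, 10.0, 1.5)   # metres, from the text (Analysis Case 1)
sill = 1.0                          # hypothetical; not reported in the text

lags = np.array([[0.0, 0.0, 0.0],   # zero separation
                 [0.0, 0.0, 1.5],   # one correlation length vertically
                 [6.0, 0.0, 0.0]])  # one correlation length along the river
print(exponential_variogram(lags, sill, lengths_case1))
```

Under this convention, a separation of one correlation length along any principal axis gives the same variogram value, which is why the ellipse is much flatter vertically than horizontally.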

Fig. 12. Visualization of anisotropy ellipse for log10(K) variogram for Analysis Case 2 (skin K = 2e-4 m/s), estimated using RML methodology. Dimensions of the ellipse represent optimized correlation length L of exponential variogram and are compared with dimensions of BHRS well field (black lines representing wells onsite). Axes of ellipse represent optimized principal anisotropy directions.

records with poor fits (due to convergence to local minima) were re-run. In cases where optimization had converged to unrealistic local minima, modified starting parameter estimates were tried until acceptable convergence was obtained.

Examples of slug test data fits for a range of responses – using Analysis Cases 1 and 2 – are shown in Fig. 7. In Analysis Case 3, which is not pictured, all simulated slug responses were overdamped even when extremely high aquifer K values up to 1 m/s were utilized. Given that a large number of records from the field data exhibit underdamped behavior, this suggests that the skin conductivities obtained by Barrash et al. (2006) are too low, possibly due to "pseudo-skin" effects from strongly convergent flow near the pumping well. For this reason, Analysis Case 3 is not considered further.

In Analysis Cases 1 and 2, very consistent estimates of aquifer K were obtained for the majority of records across the full set of slug heights utilized. Estimated K profiles for wells C3 and C5 are shown in Figs. 8 and 9, respectively. For the majority of depths, K estimate variability across slug heights was within ±20% of the average K obtained. However, it should be noted that at a few intervals (e.g., around 839 m AMSL in well C5, Fig. 9), Analysis Case 2 resulted in aquifer K estimates that were unrealistically high. This suggests, as discussed in the synthetic analyses, that the skin value of 2 × 10⁻⁴ m/s used represents a lower bound on the skin K value and, thus, aquifer K estimates obtained using Analysis Case 2 represent an upper bound on actual aquifer K. Given our previous reasoning that Analysis Case 1 represents a reasonable lower bound on actual aquifer K values, it is likely that true aquifer K values are somewhere between those found in Analysis Case 1 and Analysis Case 2.

4.3. Analysis of K estimates

4.3.1. Dependence on initial slug height

Differences between aquifer K estimates obtained using different slug heights are slight in most cases when compared to overall K variability throughout the BHRS aquifer, as shown in Figs. 8 and 9. However, subtle trends are visible in overall K estimates dependent on slug height when viewed on a well-by-well basis. In Tables 4 and 5, we note two predominant trends, both of which suggest that K estimates obtained from small slug heights may be the most useful for future analysis. The first trend we note is that, analyzed on a well-by-well basis, average K estimates tend to decrease as slug height increases. While such behavior could be due to a variety of factors, one of the most likely is perhaps the existence of higher in-well flow velocities and, thus, some non-linear energy losses within the well (McElwee and Zenner, 1998; McElwee, 2002), which incur extra damping and thus lower-K parameter estimates. The second trend made evident in Tables 4 and 5 is that, analyzed on a well-by-well basis, the variance of K estimates obtained tends to decrease as slug height increases. In this case, a likely cause for this behavior is that larger slug heights interrogate a larger region of influence, i.e. they produce a measurable water level disturbance in a larger portion of the aquifer around the slug region, resulting in K estimates that average over a larger aquifer volume.

Because of these trends, and their likely causes, K estimates obtained when using smaller initial slug heights are thought to be more representative of fine-scale depth-dependent aquifer heterogeneity. The analyses performed in the following sections thus focus on K estimation results for each well using data from stimulations with the smallest slug height.

4.3.2. K autocorrelation analysis

Because of slug tests' high sensitivity to wellbore skin conductivity values, depth variability in aquifer hydraulic conductivity estimates could easily be attributed to depth variability of wellbore skin – a complication with slug tests that has not received much attention. However, if consistency in profiles of K with depth is seen at several wells, this suggests that aquifer K estimates obtained are actually related to true aquifer K values near the pumped interval, and not just to random variations in wellbore skin conductivity. A high degree of consistency is seen between K profiles obtained from wells that are located near each other at the BHRS, as exhibited in Fig. 10.

As expected from geostatistical theory, correlation between K profiles in wells is high when distances between wells are relatively small (as in Fig. 10), and correlation decreases with increasing distance. A geostatistical analysis of the K values obtained in all wells (which are treated as point-wise information) was performed in order to gain more insight into the correlation structure. In our analysis, we utilized the Restricted Maximum Likelihood (RML) method of Kitanidis (1987), and analyzed the aquifer K estimates from both Analysis Cases. We assume an anisotropic exponential variogram in which the sill of the variogram, the correlation lengths along the principal anisotropy directions, and the directions of the principal anisotropy vectors were all optimized. Since the optimization of these parameters is non-linear, a number of different initial guesses were chosen for the geostatistical parameters. Several of the optimization cases converged to the same parameter estimates, and in the other cases local minima were obtained, all of which had lower likelihood values. The maximum likelihood geostatistical parameters obtained from Analysis Cases 1 and 2 are shown visually in Figs. 11 and 12, respectively, and are quite similar. For Analysis Case 1, i.e. for the K estimates obtained when no wellbore skin was assumed, a minimum correlation length of about 1.5 m was obtained along a principal axis that is only slightly tilted relative to vertical (8°), and longer correlation lengths of 6 m and 10 m are observed along the other two principal axes, which are roughly parallel and perpendicular to the Boise River, respectively. In Analysis Case 2, where a skin conductivity of Ksk = 2e-4 m/s was assumed, only slight differences in the estimated geostatistical parameters are observed. The minimum correlation length in this case is about 1.2 m, and is tilted 18° relative to vertical. As before, the longer correlation lengths are oriented roughly parallel and perpendicular to the Boise River, with correlation length values equal to 4 m and 8 m, respectively. Regardless of which Analysis Case is considered, the geostatistical results obtained support the notion that the BHRS is a predominantly layered system, as has been suggested in numerous prior analyses of both hydrologic and geophysical data from the site (e.g., Barrash and Clemo, 2002; Tronicke et al., 2004; Irving et al., 2007; Bradford et al., 2009).

4.3.3. K-porosity correlation analysis

Finally, one very interesting feature of the K distribution estimated using slug testing is its overall lack of correlation with porosity data. Porosity values at the BHRS wells have been estimated using neutron well logs (Barrash and Clemo, 2002), and trends in porosity observed at wellbores have been verified using numerous geophysical methods (Tronicke et al., 2004; Clement and Knoll, 2006; Irving et al., 2007; Bradford et al., 2009). Theories such as the Kozeny–Carman equation have been used extensively in hydrologic and geophysical research to make inferences about aquifer K given either direct or indirect measurements of porosity φ. For example, the Kozeny–Carman Bear model suggests that K should be proportional to a function of the porosity, K ∝ φ³/(1 − φ)² (Domenico and Schwartz, 1998). However, as shown for selected wells in Fig. 13, a strong correlation between porosity and K is not apparent. Correlations such as those suggested by the oft-used Kozeny–Carman equation do not appear to apply throughout the BHRS aquifer. While positive correlation between K and porosity exists for some intervals, there are equally many intervals in which no correlation is apparent or even in which negative correlations appear to occur. As suggested by Fig. 13 and shown more completely for Analysis Cases 1 and 2 in Tables 6 and 7, respectively, porosity data alone (whether collected through direct, hydrologic, or geophysical methods) may be insufficient for estimating aquifer hydraulic conductivity in sediments similar to those present at the BHRS. Investigation into possible sedimentary and post-depositional causes for this discrepancy at the BHRS is currently in progress.

Fig. 13. Comparison between neutron log-derived porosity (line) and estimated log10(K) from slug tests (points) for wells B1, B2, and C1, which are adjacent wells showing similar K trends. Note some regions in which porosity and K appear positively correlated and others in which negative correlation is evident (e.g., near 838–844 m AMSL in all wells).

Table 6 Correlation between porosity term of Kozeny–Carman equation and estimated K values from field data (Analysis Case 1) computed on a well-by-well basis. Highlighted entries have statistically significant P-values. Negative correlations occur often and represent almost half of statistically significant correlations.

Well name | Correlation coefficient | Correlation type | P-value
A1 | 0.439 | Positive | 0.4%
B1 | 0.209 | Positive | 15.5%
B2 | 0.287 | Positive | 5.3%
B3 | 0.508 | Positive | 0.0%
B4 | 0.200 | Negative | 18.3%
B5 | 0.127 | Negative | 41.0%
B6 | 0.279 | Negative | 8.1%
C1 | 0.239 | Positive | 11.4%
C2 | 0.135 | Positive | 37.3%
C3 | 0.063 | Positive | 68.0%
C4 | 0.117 | Positive | 42.4%
C5 | 0.076 | Negative | 61.7%
C6 | 0.153 | Positive | 30.9%
X1 | 0.576 | Positive | 0.0%
X2 | 0.014 | Negative | 92.8%
X3 | 0.329 | Negative | 2.1%
X4 | 0.286 | Negative | 3.4%
X5 | 0.041 | Positive | 77.1%

Table 7 Correlation between porosity term of Kozeny–Carman equation and estimated K values from field data (Analysis Case 2) computed on a well-by-well basis. Only two wells (highlighted) have statistically significant positive correlations. For all other wells, correlation is weakly positive or negative.

Well name | Correlation coefficient | Correlation type | P-value
A1 | 0.167 | Positive | 29.1%
B1 | 0.050 | Positive | 73.7%
B2 | 0.268 | Positive | 7.2%
B3 | 0.531 | Positive | 0.0%
B4 | 0.027 | Positive | 85.9%
B5 | 0.065 | Negative | 67.7%
B6 | 0.040 | Negative | 80.6%
C1 | 0.180 | Positive | 23.7%
C2 | 0.018 | Positive | 90.7%
C3 | 0.164 | Positive | 28.2%
C4 | 0.075 | Positive | 60.8%
C5 | 0.064 | Negative | 67.1%
C6 | 0.102 | Negative | 50.0%
X1 | 0.644 | Positive | 0.0%
X2 | 0.011 | Negative | 94.5%
X3 | 0.158 | Negative | 27.8%
X4 | 0.081 | Negative | 55.5%
X5 | 0.036 | Negative | 79.8%

5. Conclusions

Partially-penetrating slug tests are an information-rich and relatively easy-to-collect source of data for estimating vertical variability of hydraulic conductivity, even in highly conductive aquifers such as the aquifer at the BHRS. However, in interpreting data from such tests, care must be taken in order to be realistic about the level of information that can be obtained from analyzing such tests. For example, estimation of aquifer anisotropy and aquifer storage properties using slug test data may be highly inaccurate.

Since often both the stimulation (i.e., the slug injection/extraction) and measurement take place at the same location, slug test data may be highly affected by wellbore skin, and in highly conductive aquifers a key concern is the existence of "positive", or low-conductivity, skin. However, by analyzing slug test data under a variety of different scenarios, it may be generally possible to place bounds on aquifer K. By making the assumption of no wellbore skin, aquifer K estimates obtained will represent a lower bound. Similarly, by analyzing data using a model with wellbore skin (as indicated by prior information in the case of the BHRS), and by steadily decreasing the skin conductivity values, one can arrive at an approximate upper bound of aquifer K.

Even after depth-dependent aquifer K has been estimated by slug test data analysis, we believe these values must be validated through other methods such as comparison against other data sources and autocorrelation (geostatistical) analyses. Due to slug tests' high sensitivity to low-conductivity skins, it is possible that aquifer K estimates obtained may simply represent depth-variability in a low-conductivity skin. By comparing K profiles obtained at closely-spaced wells, though, we increase our confidence that the volume being interrogated by the slug test response extends beyond the wellbore skin and represents true aquifer K variability. When slug test data are treated carefully and when K estimates obtained are validated, the information content present in this data source can help to provide new insights into aquifer heterogeneity, and can likewise provide validation for hydrologic and geophysical models.

Acknowledgments

The authors wish to thank Brady Johnson, who participated in the collection of the data analyzed in this work, and Drs. James Butler Jr. and Geoff Bohling, who provided numerous helpful suggestions for field operation and model development. In addition, the authors would like to thank Walter Illman and one anonymous reviewer, whose comments helped improve this publication. Support for this research was provided by NSF Grants EAR-0710949 and DMS-0934680, and is gratefully acknowledged.

References

Barrash, W., Clemo, T., 2002. Hierarchical geostatistics and multifacies systems: Boise Hydrogeophysical Research Site, Boise, Idaho. Water Resources Research 38 (10), 1196.
Barrash, W., Knoll, M.D., 1998. Design of research wellfield for calibrating geophysical methods against hydrologic parameters. In: Conference on Hazardous Waste Research. pp. 296–318.
Barrash, W., Reboulet, E.C., 2004. Significance of porosity for stratigraphy and textural composition in subsurface, coarse fluvial deposits: Boise Hydrogeophysical Research Site. Geological Society of America Bulletin 116 (9/10), 1059–1073.
Barrash, W., Clemo, T., Fox, J.J., Johnson, T.C., 2006. Field, laboratory, and modeling investigation of the skin effect at wells with slotted casing, Boise Hydrogeophysical Research Site. Journal of Hydrology 326, 181–198.
Beckie, R., Harvey, C.F., 2002. What does a slug test measure: an investigation of instrument response and the effects of heterogeneity. Water Resources Research 38 (12), 1290.
Bouwer, H., Rice, R., 1976. A slug test for determining hydraulic conductivity of unconfined aquifers with completely or partially penetrating wells. Water Resources Research 12 (3), 423–428.
Bradford, J.H., Clement, W., Barrash, W., 2009. Estimating porosity with ground-penetrating radar reflection tomography: a controlled 3-d experiment at the Boise Hydrogeophysical Research Site. Water Resources Research 45 (W00D26).
Bredehoeft, J.D., Cooper Jr., H.H., Papadopulos, I.S., 1966. Inertial and storage effects in well-aquifer systems: an analog investigation. Water Resources Research 2 (4), 697–707.
Butler Jr., J.J., 1996. Slug tests in site characterization: some practical considerations. Environmental Geosciences 3 (2), 154–163.
Butler Jr., J.J., 1998. The Design, Performance, and Analysis of Slug Tests. Lewis Publishers, Boca Raton.
Butler Jr., J.J., 2002. A simple correction for slug tests in small-diameter wells. Ground Water 40 (3), 303–307.
Butler Jr., J.J., Zhan, X., 2004. Hydraulic tests in highly permeable aquifers. Water Resources Research 40, W12402.
Butler Jr., J.J., Bohling, G.C., Hyder, Z., McElwee, C., 1994. The use of slug tests to describe vertical variations in hydraulic conductivity. Journal of Hydrology 156, 137–162.
Cardiff, M., Barrash, W., Kitanidis, P., Malama, B., Revil, A., Straface, S., Rizzo, E., 2009. A potential-based inversion of unconfined steady-state hydraulic tomography. Ground Water 47 (2), 259–270.
Clement, W.P., Knoll, M.D., 2006. Traveltime inversion of vertical radar profiles. Geophysics 71 (3), K67–K76.
Cooper Jr., H.H., Bredehoeft, J.D., Papadopulos, I.S., 1967. Response of a finite-diameter well to an instantaneous charge of water. Water Resources Research 3 (1), 263–269.

Desbarats, J., 1992. Spatial averaging of transmissivity in heterogeneous fields with flow toward a well. Water Resources Research 28 (3), 757–767.
Domenico, P.A., Schwartz, F.W., 1998. Physical and Chemical Hydrogeology, second ed. John Wiley & Sons, Inc., New York, NY, USA.
Faust, C.R., Mercer, J.W., 1984. Evaluation of slug tests in wells containing a finite-thickness skin. Water Resources Research 20 (4), 504–506.
Fox, J.J., 2006. Analytical Modeling of Fully-penetrating Pumping Tests at the Boise Hydrogeophysical Research Site for Aquifer Parameters and Wellbore Skin. Ph.D. thesis. Boise State University.
Hastings, W., 1970. Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57 (1), 97–109.
Hvorslev, M., 1951. Time lag and soil permeability in ground-water observations. U.S. Army Corps of Engineers, Waterways Experiment Station, Bulletin No. 36.
Hyder, Z., Butler Jr., J.J., McElwee, C.D., Liu, W., 1994. Slug tests in partially penetrating wells. Water Resources Research 30 (11), 2945–2957.
Irving, G.D., Knoll, M.D., Knight, R.J., 2007. Improving crosshole radar velocity tomograms: a new approach to incorporating high-angle traveltime data. Geophysics 72 (4), J31–J41.
Kipp Jr., K.L., 1985. Type curve analysis of inertial effects in the response of a well to a slug test. Water Resources Research 21 (9), 1397–1408.
Kitanidis, P.K., 1987. Parametric estimation of covariances of regionalized variables. Water Resources Bulletin 23 (4), 557–567.
Leap, D.I., 1984. A simple pneumatic device and technique for performing rising water level slug tests. Ground Water Monitoring Review 4 (4), 141–146.
Levy, B.S., Pannell, L.J., Dadoly, J.P., 1993. A pressure-packer system for conducting rising head tests in water table wells. Journal of Hydrology 148, 189–202.
Liu, X., Cardiff, M., Kitanidis, P.K., 2010. Parameter estimation in nonlinear environmental problems. Stochastic Environmental Research and Risk Assessment 24 (7), 1003–1022.
Malama, B., Barrash, W., Cardiff, M., Thoma, M., Kuhlman, K.L., in press. Modeling slug tests in unconfined aquifers taking into account water table kinematics, wellbore skin and inertial effects. Journal of Hydrology.
McElwee, C.D., 2002. Improving the analysis of slug tests. Journal of Hydrology 269 (3-4), 122–133.
McElwee, C.D., Zenner, M.A., 1998. A nonlinear model for analysis of slug-test data. Water Resources Research 34 (1), 55–66.
Metropolis, N., Rosenbluth, A., Rosenbluth, M., Teller, A., Teller, E., 1953. Equation of state calculations by fast computing machines. Journal of Chemical Physics 21, 1087–1092.
Morin, R., LeBlanc, D., Teasdale, W., 1988. A statistical evaluation of formation disturbance produced by well-casing installation methods. Ground Water 26 (2), 207–217.
Neuman, S., Orr, S., 1993. Prediction of steady state flow in nonuniform geologic media by conditional moments: exact nonlocal formalism, effective conductivities, and weak approximation. Water Resources Research 29 (2), 341–364.
Reboulet, E.C., Barrash, W., 2003. Core, Grain-size, and Porosity Data from the Boise Hydrogeophysical Research Site, Boise, Idaho. Tech. Rep. 03-02, CGISS. Boise State University.
Rovey, C., Niemann, W., 2001. Wellskins and slug tests: where's the bias? Journal of Hydrology 243, 120–132.
Springer, R.K., Gelhar, L.W., 1991. Characterization of Large-scale Aquifer Heterogeneity in Glacial Outwash by Analysis of Slug Tests with Oscillatory Response, Cape Cod, Massachusetts. Tech. Rep. 91-4034, US Geological Survey.
Tronicke, J., Holliger, K., Barrash, W., Knoll, M.D., 2004. Multivariate analysis of cross-hole georadar velocity and attenuation tomograms for aquifer zonation. Water Resources Research 40 (W01519). doi:10.1024/2003WR002031.
Van Der Kamp, G., 1976. Determining aquifer transmissivity by means of well response tests: the underdamped case. Water Resources Research 12 (1), 71–77.
Zlotnik, V.A., McGuire, V.L., 1998. Multi-level slug tests in highly permeable formations: 1. Modification of the Springer–Gelhar (SG) model. Journal of Hydrology 204, 271–282.