<<

Journal of Contaminant Hydrology 94 (2007) 109–125 www.elsevier.com/locate/jconhyd

Probability density function of non-reactive solute concentration in heterogeneous porous formations ⁎ Alberto Bellin a, , Daniele Tonina b,c

a Dipartimento di Ingegneria Civile e Ambientale, Università di Trento, via Mesiano 77, I-38050 Trento, Italy b Earth and Planetary Science Department, University of California, Berkeley, USA c USFS, Rocky Mountain Research Station 322 East Front Street, Suite 401 Boise, ID, 83702, USA Received 20 June 2006; received in revised form 15 May 2007; accepted 17 May 2007 Available online 12 June 2007

Abstract

Available models of solute transport in heterogeneous formations lack in providing complete characterization of the predicted concentration. This is a serious drawback especially in risk analysis where confidence intervals and probability of exceeding threshold values are required. Our contribution to fill this gap of knowledge is a model for the local concentration of conservative tracers migrating in heterogeneous aquifers. Our model accounts for dilution, mechanical mixing within the sampling volume and spreading due to formation heterogeneity. It is developed by modeling local concentration dynamics with an Ito Stochastic Differential Equation (SDE) that under the hypothesis of statistical stationarity leads to the Beta probability distribution function (pdf) for the solute concentration. This model shows large flexibility in capturing the smoothing effect of the sampling volume and the associated reduction of the probability of exceeding large concentrations. Furthermore, it is fully characterized by the first two moments of the solute concentration, and these are the same pieces of information required for standard geostatistical techniques employing Normal or Log-Normal distributions. Additionally, we show that in the absence of pore-scale dispersion and for point concentrations the pdf model converges to the binary distribution of [Dagan, G., 1982. Stochastic modeling of groundwater flow by unconditional and conditional probabilities, 2, The solute transport. Water Resour. Res. 18 (4), 835–848.], while it approaches the for sampling volumes much larger than the characteristic scale of the aquifer heterogeneity. Furthermore, we demonstrate that the same model with the spatial moments replacing the statistical moments can be applied to estimate the proportion of the plume volume where solute concentrations are above or below critical thresholds. Application of this model to point and vertically averaged bromide concentrations from the first Cape Cod tracer test and to a set of numerical simulations confirms the above findings and for the first time it shows the superiority of the Beta model to both Normal and Log-Normal models in interpreting field data. Furthermore, we show that assuming a-priori that local concentrations are normally or log-normally distributed may result in a severe underestimate of the probability of exceeding large concentrations. © 2007 Elsevier B.V. All rights reserved.

Keywords: Concentration pdf; Heterogeneous geological formations; Ito's Stochastic Differential Equation; Beta distribution

1. Introduction concentrations over short distances. This complexity is due to flow non-uniformity generated by spatial variability of Densely monitored field-scale tracer tests and studies on the hydraulic conductivity K. Solute concentrations are contaminated sites revealed large variations of solute attenuated by dilution associated with pore-scale dispersion (PSD) and their measurements are further affected by ⁎ Corresponding author. Tel.: +39 0461 882620; fax: +39 0461 882672. mixing within the sampling volume. In fact, solute E-mail address: [email protected] (A. Bellin). concentration is defined as the mass of solute dissolved

0169-7722/$ - see front matter © 2007 Elsevier B.V. All rights reserved. doi:10.1016/j.jconhyd.2007.05.005 110 A. Bellin, D. Tonina / Journal of Contaminant Hydrology 94 (2007) 109–125

in a unit volume of water: CΔV(x, t)=M(x, t)/(nΔV), where More recently, Fiori (2001) argued that the Beta n is the formation porosity and M is the mass of solute that distribution can be used to represent the variability of at time t is within the sampling volume ΔV centered at x, local concentrations, because it is bounded between a such that the larger ΔV the more it acts as an attenuation minimum and a maximum and is flexible enough to factor for the spatial variability of solute concentration. comply with the above properties. Other evidence in Point concentrations are obtained operationally when of the Beta distribution, involving the relation- the size of the sampling volume is much smaller than the ship between the and the , has been characteristic scale of variability of K, such that one can discussed by Chatwin and Sullivan (1990) and Chatwin safely assume that ΔV→0. In such a situation, the only et al. (1995) for atmospheric turbulent transport. hydrodynamical process attenuating solute concentra- The effect of pore-scale dispersion on point concen- tions is PSD. However, in all other cases, solute tration CDF was investigated numerically by Fiorotto concentration fluctuations that are perceived from and Caroni (2002). They showed a good fit of the Beta measurements – or that should be considered in risk distribution with data obtained numerically by simulat- analysis, when exposures are evaluated at extraction ing a tracer test in a two-dimensional heterogeneous wells – are attenuated by the combined effect of PSD formation. In a companion paper, Caroni and Fiorotto and mixing within the sampling volume. (2005) proposed an integral expression of the concen- Since detailed mapping of hydraulic conductivity in tration pdf, whose validity is limited to weakly the subsurface is unfeasible, solute transport is typically heterogeneous formations, and showed numerically modeled within a stochastic framework by considering an that the Beta distribution fits the experimental concen- ensemble of realizations of the K field. Most of the tration CDF well for values of the log-conductivity 2 existing stochastic studies analyze the evolution of the up to σY =2. 2 ensemble 〈C〉 and variance σC of point concentra- Our work differs from that of Caroni and Fiorotto tion (e.g., Kapoor and Gelhar, 1994; Zhang and Neuman, (2005) in the following two aspects: (1) In Caroni and 1996; Dagan and Fiori, 1997; Kapoor and Kitanidis, Fiorotto's work the Beta distribution is assumed a- 1998; Pannone and Kitanidis, 1999; Fiori and Dagan, priori, while we look for a physical interpretation 2000; Fiori, 2002; Fiorotto and Caroni, 2002; Caroni and leading to a theoretical justification of this choice; (2) Fiorotto, 2005). Others have specifically addressed the they considered only concentration data obtained from combined effect of pore-scale dispersion and sampling numerical simulations, whereas in addition to that we volume on solute concentration moments (Andricevic, confronted our model to field data. 1998) and flux at a compliance plane (Fiori et al., In this work, we seek to answer the following 2002). These studies concur in concluding that ergodic questions: Why is the Beta distribution a better model of conditions are hardly obtained in applications, such that local uncertainty obtained from Monte Carlo numerical 〈C〉 does not suffice to describe the concentration spatial experiments than the Normal and Log-Normal distribu- variability (Kitanidis, 1994; Fiori and Dagan, 1999), and a tions? Is it an artifact of numerical simulations, and in more effective representation should be used, for example particular of the geostatistical model of the spatial the local probability distribution (Goovaerts, 1997, ch. 7). variability of K, or a property of solute concentrations of Despite its importance in applications, the probability passive tracers? And furthermore, is the Beta distribu- density function (pdf) of solute concentrations in hetero- tion also a good model for the global probability geneous aquifers has been considered in few studies. distribution, which describes the probability of exceed- Dagan (1982) showed that in the absence of pore-scale ing a given concentration threshold irrespective of the dispersion point concentration at a given (fixed) location location within the plume? We address these questions is at any time either the initial concentration C0 or zero, first developing a theoretical model for the concentra- and the resulting Cumulative Distribution Function (CDF) tion CDF and then applying it to concentration data is binary with probability P=〈C〉/C0 of observing C0 and from the Cape Cod tracer test (Leblanc et al., 1991) and 1−P of observing zero concentration. Successively, Bellin from two-dimensional numerical simulations. et al. (1994) showed that although the concentration CDF is close to bimodality for small sampling volumes, it 2. Mathematical statement of the problem approaches the Normal distribution as the sampling volume grows large. However, all these studies neglected pore- 2.1. The general expression of the concentration pdf scale dispersion, which for point concentrations, has a 2 strong impact on σC (Kapoor and Kitanidis, 1998; Fiori Let us consider the instantaneous release of a and Dagan, 2000), and thus on the concentration pdf. conservative tracer with concentration C0 within the source A. Bellin, D. Tonina / Journal of Contaminant Hydrology 94 (2007) 109–125 111

volume V0 in a heterogeneous aquifer of spatially variable and is not uniform within the plume. In fact, Bellin et al. K and constant porosity n. The evolution of the solute (1994) observed that for a fixed time the coefficient of concentration can be described by the following Stochastic variation of the solute concentration CV[CΔV]issmallat Differential Equation (SDE) (Cassiani et al., 2005): locations close to the center of mass, and increases sharply as one approaches the fringes of the plume. This behavior of CV[CΔ ] can be explained by observing that in a dCDV ¼ FðCDV Þdt þ GðCDV ÞdWðtÞð1Þ V Monte Carlo framework randomness is reflected in variations of plume position and shape between indepen- where W(t) is the Wiener process with zero mean and dent realizations of the velocity field, and that the dispersion coefficient equal to 1/2 (Gardiner, 1985, p. 66), expected variations between any two independent and F and G are functions of the concentration, but not realizations are larger at the fringes of the plume than explicitly of time (autonomous SDE). Hereafter to simplify close to the center of mass (e.g. Fisher et al., 1979; notation we omit the indication that CΔ varies in time and V Kitanidis, 1988; Dagan, 1991; Gelhar, 1993). space, which is therefore considered as implicit. More These differences in randomness can be modeled by specifically, F is the expected rate of change of CΔ and is V assigning the following form to the function G in the referred to in genetics as drift function, whereas G, called second right hand term of Eq. (1): the diffusion function, defines the disturbance to the expected rate of change caused by formation heterogeneity. pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi G ϵ C C C 3 Eq. (1), which has been shown consistent with the transport ¼ DV ð 0 DV Þ ð Þ equation, was extensively applied to model solute transport where ϵN0 is a coefficient proportional to the formation in atmospheric turbulent flow and turbulent combustion heterogeneity. In Eq. (3), the location with respect to the processes (see e.g. Wandel et al., 2003; Cassiani et al., center of mass is epitomized by CΔ , with CΔ ≈C 2005). V V 0 and CΔ ≈0 representing locations close to the center of One important result of stochastic theories is that solute V mass and outside the ensemble plume, respectively. concentrations evolve toward a maximum entropy state, Consistently with the above reasoning, in Eq. (1) G is corresponding to a well mixedGaussianplume,spontane- small compared to F when CΔ is either high, i.e. close ously in heterogeneous aquifers (Kitanidis, 1994; Kapoor V to C , or small, and it is maximum for CΔ =C /2. and Kitanidis, 1998). The rate of convergence of C toward 0 V 0 After these preparatory steps the SDE (1) assumes the 〈C〉 increases with pore-scale dispersion and sampling following form: volume. A simple model describing such a behavior as- sumes that the expected rate of change of local concentra- pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi dC ¼ j½hCiC dt þ ϵ C ðC C ÞdWðtÞ; tions is proportional to its “distance” from equilibrium: DV DV DV 0 DV ð4Þ j F ¼ ½hCiCDV ð2Þ which can be written in a more convenient form for mathematical treatment by dividing both sides of the κN 〈 〉 where 0 is the rate of convergence of CΔV to C due to equation by C0 to obtain: dilution and mixing. The parameter κ is influenced by pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi pore-scaledispersionandmixinginsuchawaythatan dZ ¼ j½hZiZdt þ ϵ Zð1 ZÞdWðtÞð5Þ increase in pore-scale dispersion coefficients or ΔV causes Δ 〈 〉 a faster convergence of C V to C . This approach, which where Z=CΔV /C0 and 〈Z〉=〈C〉/C0 are dimensionless supposes that mixing and pore scale dispersion homoge- solute concentrations, which vary between 0 and 1. Note nize the concentration field, corresponds to the Interaction that to simplify the notation we omitted to indicate the by Exchange with the Mean (IEM) (Villermaux and dependence of Z on ΔV, which therefore is considered as Devillon, 1972) also known as Linear Mean Square implicit. Estimation model (Dopazo and O'Brian, 1974) utilized in The pdf of the normalized concentration described by combustion turbulence and chemical engineering (Cassiani Eq. (5) satisfies the following Kolmogorov forward et al., 2005). equation (see e.g. Gardiner, 1985, ch. 4.3.6): The otherwise smooth transition of CΔV to 〈C〉 described by the first right hand term of Eq. (1) is  Af ðz; tÞ A 1 A A perturbed by flow non-uniformity, which introduces ¼ ½FðzÞf ðz; tÞþ GðzÞ ½GðzÞf ðz; tÞ : At Az 2 Az Az randomness into the system. This disturbance is described by the function G in the second right hand term of Eq. (1), ð6Þ 112 A. Bellin, D. Tonina / Journal of Contaminant Hydrology 94 (2007) 109–125

Under stationary conditions, when the SDE (1) reaches singular point at z→0 (see also Fig. 3 of (Cobb, 1981, 2 equilibrium and ∂f/∂t→0, which is possible for auton- p. 49)). Note that σZ plays a fundamental role in controlling omous diffusion processes (Pope, 1965), the solution of the shape of the pdf. the Eq. (6) is given by (Cobb, 1981,p.51): In the next section, we discuss two known cases for the pdf model: point solute concentration in the absence C½ p þ q f ðzÞ¼ z1þpð1 zÞ1þq ð7Þ of pore-scale dispersion and the purely diffusive case in C½ pC½q a uniform flow field, showing that our model (7) where Γ[x]=∫∞τ x−1e−τdτ is the Euler (x resembles these cases when PSD is set to zero and when 0 σ2 → stands for p, q,orp+q and τ is the integral dummy Y 0 (homogeneous formation), respectively. variable), p = 〈Z〉κ / ϵ = 〈Z〉 / β,andq =(1− 〈Z〉)κ / ϵ = (1−〈Z〉)/β. In these last two expressions, the parameter β 2.2. Two known cases of the concentration pdf assumes the following form: In an early work, Dagan (1982) showed that in the ϵ r2 absence of pore-scale dispersion the point concentration b ¼ ¼ Z ð8Þ j r2 of a conservative tracer is either C=C (Z=1) or C=0 hZi½1 hZi Z 0 (Z=0), depending on whether the sampler is internal or Note that the model (7) is the familiar Beta density external to the solute body, and the plume does not β 2 function, with the polarization ratio controlling the dilute. In this idealized situation σZ =〈Z〉 [1−〈Z〉] shape of the pdf. and the pdf of Z assumes the following form: The stationary hypothesis introduced in order to solve Eq. (6) does not imply that the solute concentra- f ðzÞ¼hZid½1 zþ½1 hZid½zð9Þ tion should reach a steady state condition, but rather that at each time step the population ensemble of the where δ is the . Hence, in this case concentration has reached statistical equilibrium. In knowing 〈Z〉 suffices to fully characterize the spatial other words, the concentration pdf depends on time random function Z regardless of the model of spatial though the solute concentration, but not directly. variability. In Appendix A, we show that for β→∞, In the Eq. (8), β varies between zero and infinity 2 which results from setting σZ =〈Z〉[1−〈Z〉] into Eq. (8), conferring to the Beta density function a large flexibility. the Beta model (7) reduces to the binary distribution (9) β For example, Fig. 1 shows that when is larger than both as required. Notice that the first right hand term of the 〈 〉 −〈 〉 Z and 1 Z the pdf is U-shaped with polarization to SDE (5) vanishes for β→∞, thereby validating the β z=0 and z=1. On the other hand, when is smaller than assumption introduced in Section 2.1 that the second 〈 〉 − 〈 〉 β both Z and 1 Z the pdf is unimodal, and when is right hand term mimics solute spreading due to flow 〈 〉 −〈 〉 between Z and 1 Z the pdf has a J-shape with a non-uniformities associated to formation heterogeneity. Another interesting result is obtained when dilution, or mixing associated with ΔV, dominates over solute spreading caused by flow non-uniformities. Increases in pore-scale dispersion coefficients, or in ΔV, lead to 2 reductions of σZ (see the works by Kapoor and Kitanidis (1998) and Fiori and Dagan (2000)) and β, which 2 converges linearly to zero as σZ →0. The limit of the model (7) for small β values is conveniently analyzed by examining the behavior of the central moments of Z: Z 1 ν lν ¼ ðz hZiÞ f ðzÞdz 0 ν  p p þ q ¼ F ν; p; p þ q; ð10Þ p þ q 2 1 p Fig. 1. Typical shapes of the beta distribution model for β larger than νN both 〈Z〉 and 1−〈Z〉 (i.e. p, qb1; solid ), smaller than both 〈Z〉 and where 1 is the order of the and 2F1 is the 1−〈Z〉 (i.e. p, qN1; dotted line), and for βN〈Z〉 but βb1−〈Z〉 (i.e. hypergeometric function (Gradshteyn and Ryzhik, pb1, qN1; dashed-dot line). 1980, p. 1045). In view of the ensuing analysis, it is A. Bellin, D. Tonina / Journal of Contaminant Hydrology 94 (2007) 109–125 113 convenient to write the moments (10) in the following simulations for both small and large sampling volumes dimensionless form: and for different levels of formation heterogeneity. "#ν= l ν 2 lV ν p pq 3.1. Numerical experiment ν ν= ¼ ðl Þ 2 p þ q ðp þ qÞ2ð1 þ p þ qÞ 2  We considered a two-dimensional isotropic forma- p þ q : F ν; p; p þ q; tion with spatial variability of the log-transmissivity 2 1 p ð11Þ field, Y=ln(TK), described by the isotropic exponential model: The careful inspection of Eq. (11) for β→0 conducted in Appendix B shows that μν′=0 when ν is odd, and μν′= r2 = ; CY ðrÞ¼ Y exp½r IY ð12Þ 1·3·5⋯ (ν−1)=(ν−1)!!, that is the double factorial of ν− ν b 1(Arfken, 1985, p. 548), when is even. This where TK(x1, x2)=∫0K(x1, x2, x3)dx3 is the hydraulic property characterizes a Normal distribution with mean transmissivity, r is the two-point separation distance and 〈 〉 σ2 Z and variance Z. Furthermore, the convergence of IY is the log-transmissivity integral scale. the moments Eq. (11) to thosep offfiffiffi the Normal distribution A single realization of the transmissivity field was is proportional to β and b for ν even and odd, generated using HYDRO_GEN (Bellin and Rubin, 1996; respectively. Note that small β values are obtained for Rubin and Bellin, 1998), and the flow equation was k≫ϵ in the SDE (5) such that Z converges rapidly to solved using the Galerkin finite element code developed 〈Z〉, leading to an ergodic plume. Furthermore, the by Bellin et al. (1992). Simulations were conducted in a β→ σ2 → condition 0 is obtained for Z 0, such that the domain 100IY long and 70IY wide with the average flow, Normal distribution degenerates to the Dirac Delta U,inthex1 direction. A unitary mean head gradient was distribution f(z)=δ[z−〈Z〉], a solution consistent with imposed by assigning constant piezometric heads along the fact that for an ergodic plume C(x, t)=〈C(x, t)〉. the short sides of the domain and impervious boundaries We emphasize that the concentration pdf can be seen along the remaining two sides. as a measure of the uncertainty associated to the Transport was solved by forward or backward estimation of the local concentration by the ensemble particle tracking with pore-scale dispersion modeled as mean, and most importantly it can be used to assess the a Brownian motion with constant longitudinal and probability that the actual (measurable) concentration transverse dispersion coefficients Dd,L =UIY /200 and exceeds a given threshold at a given location as required Dd,T = UIY /2000, respectively. The resulting Peclet in risk assessment. As in Normal and Log-Normal numbers are PeL =200 and PeT =2000 in longitudinal distributions, note that the Beta model is fully and transverse directions, respectively. characterized by the ensemble mean and variance of In the forward particle tracking scheme (FPT), which the concentration, which in the last decades have been we adopted in estimating the spatial distribution of the the object of several investigations leading to a number solute concentration, a total mass M0 =nC0V0 of solute of closed form solutions (see e.g. Dagan, 1982; Kapoor was instantaneously released within the initial volume V0 and Gelhar, 1994; Kapoor and Kitanidis, 1998; Fiori and with a square horizontal projection area A0 of side L=1.5 Dagan, 2000), but at the same time enjoying a much IY, and thickness equal to the aquifer depth b.Themass larger flexibility with respect to these models. of solute was divided into Np =90,000 non-interacting −5 2 particles of volume vp =nV0 /Np =2.5 10 nIY b.The 3. Numerical validation of the concentration pdf model particles were tracked forward in time by utilizing the following tracking algorithm: In Section 2 we showed, by using a physically based pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ; ϵ stochastic model, that the concentration pdf of a passive X1ðt þ DtÞ¼X1ðtÞþV1½X1ðtÞ X2ðtÞDt þ 2Dd;LDt 1 solute is the Beta model (7). This result was anticipated ð13Þ heuristically by Fiorotto and Caroni (2002) and Caroni pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi and Fiorotto (2005), who observed that the model (7) X2ðt þ DtÞ¼X2ðtÞþV2½X1ðtÞ; X2ðtÞDt þ 2Dd;T Dtϵ2 provides a good fit of the sample Cumulative Frequency ð14Þ Distribution (CFD) of concentrations obtained by of a numerical Monte Carlo experiment (see e.g. Figs. where V=(V1, V2) is the Eulerian velocity, X=(X1, X2)is 8 and 9 of Caroni and Fiorotto (2005)). In the ensuing the particle's trajectory, and ϵ1, ϵ2 are two independent section we validate the model (7) against numerical random numbers both normally distributed with zero 114 A. Bellin, D. Tonina / Journal of Contaminant Hydrology 94 (2007) 109–125

mean and unitary variance. The time step Δt is selected where Xb,1 and Xb,2 assume the expression in the last right such that the maximum step resulting from the application hand term of Eqs. (13) and (14), respectively and Xt,j(t; x) of Eqs. (13) and (14) to the computed velocity field is is the j-th component of the particle's position at time t. smaller than the grid's size. At the second step, the particle is advected back in time Consistent with particle tracking the solute concentra- according to the Eulerian velocity V at the position ⁎ tion was computed as follows: x=X (t−Δt; x): x; x; D ; x ⁎ D x MDV ð tÞ npð tÞV0 Xt;jðt t Þ¼Xj ðt t; Þ CDV ðt; xÞ¼ ¼ C0; ð15Þ nDV NpDV ⁎ Vj½X ðt Dt; xÞDtj¼ 1; 2: where MΔV is the solute mass that at time t is within the ð18Þ sampling volume ΔV with horizontal square projection area ΔA of side Δ and thickness b centered at x=(x , x ). where Vj is the j-th component of V. Eqs. (17) and (18) are 1 2 applied repeatedly until t=0, when the concentration is Note that MΔV is obtained by counting the number np(t, x) of particles, whose centers of mass are within ΔV at time computed as follows: Δ t. Simulations were conducted with =0.2IY and IY,such n ðV ; tÞ C ðx; tÞ¼ p 0 C ð19Þ that CΔV mimics the vertically averaged concentration DV n 0 measured in a monitoring well fully penetrating the f aquifer, in the first case, and a larger sampling volume in where np(V0, t) is the number of particles found within the the second. source volume. In this case, the minimum detectable Numerical errors in the above forward particle tracking concentration is given by: ΔCΔV,min =C0 /nf , corresponding scheme (FPT) arise from neglecting particle deformation to the case of only one particle reaching the source at time and assuming that the contribution of each particle to MΔV t=0. In our simulation we used nf =64,000 and 80,000 is either Δm=vpC0 =M0 /Np or zero, depending on whether for the control volume with size ΔV=0.2 IY and IY, its center of mass is within or outside ΔV. The latter respectively. assumption leads to the following lower detection limit for the simulated concentration: CΔV,min =C0V0 /(NpΔV), which is obtained when only one particle is found within 3.2. Analysis of the numerical results ΔV. At this point, we introduce the relative sensitivity of the FPT as the ratio between the solute concentration The need to quantify the probability of exceeding a obtained when only one particle is within ΔV and C : given concentration for regulatory limitations justifies 0 the interest in studying the CDF of solute concentra- CDV;min V0 tions. More specifically, because of its importance in Sc ¼ ¼ : ð16Þ C0 NpDV environmental analysis where samplers of different dimensions are used, we extended our analysis to the − 4 Simulations were conducted with Sc =6.25 10 , effects of sampling dimensions and distance from or meaning that concentrations lower than Sc C0 cannot be time since injection on the CDF of CΔC /C0. Simulations 2 detected. were conducted in weakly (i.e. σY =0.2) and moderately 2 Eq. (16) shows that in order to maintain the same (i.e., σY =1.2) heterogeneous two-dimensional forma- detection limit and accuracy, a reduction of the sampling tions with a square source area of side L=1.5 IY. volume requires an increase in the number of particles. Fig. 2a, and b show the CFDs of Z=CΔV /C0 for 2 This would lead to an exceedingly large computational σY =1.2 at the center of mass of the mean plume at times Δ ≪ burden when V V0. To alleviate the computational t=4 IY /U and 55 IY /U, respectively. The CFDs are burden and increase accuracy, we shifted from FTP to the obtained numerically with 2000 Monte Carlo simula- backward particle tracking (BPT) approach when tions for several ΔV and are assumed to resemble the calculating the local concentration over many Monte corresponding CDFs. In the same figures the CFDs – Carlo realizations. This methodology consists in releas- note that according to the Bernoulli statistical theorem Δ ing nf particles within Vand tracking them back in time the sample CFD converges to the CDF of the population towards the source (V0)(Vanderborght, 2001). Similarly as the sample's size grows to infinity – are compared to to FPT, the particle's movement is split into two steps: a the CDF of our Beta model, which is obtained by Brownian jump, modeled as: replacing the first two statistical moments of Z in the Eq. (A.1) with the sample mean and variance obtained from ⁎ D x x ; X j ðt t; Þ¼Xt; jðt; ÞXb; j j ¼ 1 2 ð17Þ the Monte Carlo simulation. We utilized this procedure A. Bellin, D. Tonina / Journal of Contaminant Hydrology 94 (2007) 109–125 115

resulted in a poor adaptation to the same set of data (not shown in the figures). In Fig. 3, we report the results of simulations conducted to compute the concentration CDF at positions other than the center of mass of the mean plume. In all cases the Beta model (obtained as before by replacing in Eq. (A.1) the statistical moments with the sample's moments) matches the numerical CDF and performs much better than Normal and Log-normal models. A first comparison is between the CDFs at the position x=(4 IY, 0) for t=4 IY /U and t=8 IY /U, when the center of mass of the mean plume is on the sampling point and ahead of it, respectively. Besides the good match of the Beta model with numerical simulations, we observe that the probability of exceeding moderate and high concentrations is greater at the center of mass of the mean plume, and this is consistent with the structure we assumed for the “noise” term G (Eq. (3)) in Eq. (1). At a larger distance from the solute source, for example x=(55IY, 0) and time t=45 IY /U (i.e., when concentrations are measured at a location ahead of the center of mass of the mean plume), the effect is more evident. However at large distances, because of solute spreading, the zone around the center of mass with small variations of the solute concentration increases in size and the sampling within this zone provides similar CDF regardless of the position. This effect is the same that reduces the importance of sampling dimension at large distances from the source or at greater time since injection (see Figs. 2b and 3). Fig. 2. Comparison between Beta model and the CDF of the solute 2 Simulations conducted with σY =0.2, and not shown concentration obtained numerically by Monte Carlo simulations for here, produced similar results. Notice that since our Δ=0.2IY and IY at a) x=4IY and t=4IY /U and b) at x=55IY and 2 t=55IY /U. In all cases σY =1.2, PeL =200 and PeT =2000, and the source area is a square of side L=1.5 IY . rather than the least-square fitting or maximum likelihood estimation, because they require measured concentration data, which are often rare or absent in applications. Whereas the first two concentration moments, which are the only pieces of information required by the moment method, can be obtained from transport models. Examination of Fig. 2a reveals that CΔV /C0 is adequately described by the Beta model, which shows a great flexibility in capturing modifications of the CFD induced by variations of the sampling volume. More- over, numerical results of Fig. 2b show a declining sensitivity of CFD to the sampling volume at large times Fig. 3. Comparison between Beta model and the CDF of the solute concentration obtained numerically by Monte Carlo simulations at since injection, and this is accurately captured by the x=4IY for t=4IY /U and 8IY /U and at x=55IY for t=45IY /U. 2 Beta model. On the other hand, Normal and Log- Furthermore, Δ=0.2IY , σY =1.2, PeL =200 and PeT =2000, and the Normal models, often used in geostatistical studies, source area is a square of side L=1.5IY . 116 A. Bellin, D. Tonina / Journal of Contaminant Hydrology 94 (2007) 109–125 model (7) is not limited to weakly heterogeneous applications, such that the CFD is poorly characterized, formations it comes as no surprise that it performs in particular when the threshold is higher than the mean well also in moderately heterogeneous formations and it concentration. In order to gain confidence in the can be expected to perform well at higher heterogeneity concentration CFD a suitable probability distribution is of the flow field. Therefore, the model provides good often adapted to the experimental data. predictions of the solute concentration CDFs for Furthermore, in the Bayesian approach to inverse different heterogeneity of the formation, at different problems one seeks for a set of model parameters m that locations within the plume, and for different sampling maximizes their conditional distribution p(m|d) given the dimensions. data d. According to the Bayes' theorem p(m|d) depends on the marginal distribution of the data h(d), through the 4. Applications following expression: pðmjdÞ¼TðdjmÞqðmÞ=hðdÞ, where T is the conditional distribution of the data given The above model can be used in risk analysis and in the parameters and ρ is the marginal prior distribution of all the applications requiring the probability that a the parameters. For example, one may want to infer the threshold concentration is exceeded. In envisioning most likely distribution of hydraulic conductivity from the possible applications, one should consider that this information provided by concentration measurements, in 2 〈 Δ 〉 σ model requires reliable estimates of C V and CΔV by which case h(d) can be estimated by our model by placing either theoretical closed form solutions (see e.g., Kapoor h(d)=f (Z). and Gelhar, 1994; Zhang and Neuman, 1996; Dagan and In the remainder of this Section we show how the Fiori, 1997; Kapoor and Kitanidis, 1999; Fiori and Beta model can be used to obtain the probability Dagan, 2000; Fiori, 2002), or numerical Monte Carlo distribution of a sample of concentration measurements simulations, which may be conditional to available taken from the plume at a given time, i.e. a snapshot, measurements, thus alleviating the computational bur- that is used in both applications described above. In den associated with the alternative approach of comput- order to do that, we assume that the sample made of ing the concentration CDF directly from a large sample concentrations taken from the plume at a given time of Monte Carlo realizations. In fact, when the first two obeys the following SDE: concentration moments are computed through Monte P j Carlo simulations the gain is in the smaller number of dCDV ¼ ½CpDVffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiðtÞCDV dt realizations needed to stabilize them with respect to the þ ϵCDV ðC0 CDV ÞdWðtÞ; ð20Þ sample CFD. In geostatistics several strategies have been elaborat- which is similar to the SDE (4) utilized for modeling the ed in order to cope with the limited information provided local concentrationP with the difference that the spatial by the measurements. For example, in the Simple average CDV ðtÞ of the solute concentration replaces the Indicator Kriging (SIK) approach the prior information Pensemble average. We consider now that by definition = provided by the sample CFD is updated by a linear CDV ðtÞ¼MðtÞ VðtÞ, where M(t) is the total mass of combination of indicators obtained from the measure- solute and V(t) is the volume occupied by the plume. ments (see e.g. Goovaerts, 1997, p. 293). In doing that For a plume larger than the scale of spatial variability of one assumes implicitly that the prior concentration CDF K, V(t) varies slightly between independent realizations is statistically stationary with the consequence that the of K, and if, in addition to that, the plume is also well probability of exceeding a given threshold is assumed mixed the following approximation can be applied: Z the same within the plume (Goovaerts, 1997; Goovaerts P g 1 x; x et al., 2005). Substituting the sample CFD with the Beta CDV ðtÞ hCDV ð tÞid ð21Þ VGðtÞ model of the concentration CDF may alleviate this VGðtÞ problem providing more consistent prior information. Apart from these applications sometimes it is useful to and VG(t) is the volume occupied by a Gaussian plume know the proportion of the plume volume where solute with the same spatial moments of the actual plume. An concentrations are above a given threshold, for example indirect confirmation that the approximation introduced the solute concentration above which a remedial action in Eq. (21) holds can be found in the paper by Fitts should be taken. The Cumulative Frequency Distribution (1996), who showed that the spatial average of point (CFD) obtained from concentration measurements is the concentrations above background concentration at both simplest way of obtaining such information. However, a Cape Cod and Border tracer experiments is well set of coarse measurements is typically available in approximated by the spatial mean of the solute A. Bellin, D. Tonina / Journal of Contaminant Hydrology 94 (2007) 109–125 117 concentration obtained by solving the ADE with the probability of observing solute concentrations smaller mean velocity and macrodispersion coefficients inferred than Clim. from concentration data (see the column “Mean” in Tables 3 and 4 of Fitts (1996)). 4.1.1. Vertically averaged concentrations As in Eq. (4) the first right hand term of the SDE (20) First we analyze the vertically averaged concentra- epitomizes the effects of PSD and mechanical mixing tion because it represents the type of information that within the sampling volume, which act to reduce spatial most commonly is available in applications where gradients of the concentration and thus drive the system sampling is performed in wells screened over the entire toward a spatially constant concentration. It should be plume depth rather than in multilevel samplers as in the noted, however, that this is an approximation since strictly Cape Cod experiment. speaking the system evolves toward a uniform concen- Fig. 4a–e show the CFD of Z(x1, x2)=(Cm(x1, x2)− tration only in a bounded domain, i.e. when the solute Clim)/(C0 −Clim) at 33, 83, 237, 426, and 511 days since disperses within a constant volume (Kitanidis, 1994). injection, where Cm(x1, x2) is the vertically averaged Under statistically stationary conditions the solution of Bromide concentration at the monitoring well location x= the SDE (20) is given by the model (7) with the statistical (x1, x2).Inallfigures,thesolidlineshowstheBetaCDFof moments replaced by the corresponding spatial moments, Eq. (A.1) with the parameter β computed through Eq. (8), ¯ nd 2 i.e. 〈CΔV (x, t)〉 with CΔV (t)=∑i=1 CΔV (xi, t)/nd and where 〈Z〉 and σZ areapproximatedbythesamplemean 2 2 nd ¯ 2 σΔV(x, t) with SΔV(t)=∑i=1[CΔV(xi, t)−CΔV (t)] /(nd −1), and variance, respectively. In each figure, the inset depicts where nd is the number of measurements. the results in a semi-logarithmic scale that better evidences the discrepancy between the models and the experimental 4.1. Analysis of the Cape Cod solute concentration data data for small concentrations. Furthermore, the dashed line shows the Log-Normal CDF, The Cape Cod tracer experiment (Leblanc et al., 1991; "# Z n Hess et al., 1992) provides a data set of solute concen- 1 ðe hniÞ2 PðZÞ¼ qffiffiffiffiffiffiffiffiffiffi exp de; ð22Þ trations, which is large enough for an experimental r2 0 e pr2 2 n validation of the theoretical results presented in the 2 n previous section. A controlled amount of non-reactive solute, bromide, was released for 17 hours within a where ξ=lnZ. In Eq. (22) volume with dimensions of 1.2×4×4 m, resulting in an ÂÃ hni¼exp hZiþr2 =2 ð23Þ initial concentration of C0 =640 mg/l. The thickness of Z the injection volume does not exceed 30 vertical integral and scale of the hydraulic log-conductivity, IYv, which has ÂÃ been estimated by Hess et al. (1992) in the range between r2 r2 r2 n ¼ exp 2hZiþ Z fexp½ Z 1gð24Þ 13 and 39 cm, such that ergodic conditions are only partially satisfied (Quinodoz and Valocchi, 1990; Dagan, are the and variance of the transformed 1991). Furthermore, Fitts (1996) showed that concentra- variable written as a function of the first two statistical tions predicted by solving the classic ADE were moments of Z. Likewise the Beta model, the Log-Normal inaccurate, although the plume thickness spans several CDF has been obtained by replacing the first two statistical vertical integral scales and the macrodispersion coeffi- moments of Z in Eqs. (23) and (24) with the sample mean cients were obtained with high accuracy from the and variance. Although unbounded, the Log-Normal concentration measurements at a dense monitoring distribution has been selected as a possible alternative to network of 656 multilevel samplers (MLS) (Leblanc et the Beta distribution because it is widely used in al., 1991, Figs. 8 and 9). geostatistics for skewed random functions. Here, we do At Cape Cod, the background concentration of not consider the Normal distribution as a possible Bromide was estimated less than Clim =0.1 mg/l (Leblanc alternative to the Beta model because it resulted in a et al., 1991), such that concentrations smaller than this much poorer fit with the sample than the Log-Normal limit are removed from the sample because of the distribution. impossibility to distinguish between the two sources of In addition, we performed the best fitting of Beta solute mass when concentrations fall below this limit. CDF model (A.1) with the least-square method to the With the lower limit of the population larger than zero, the sample CFD obtaining the dot-dashed curve shown in – variable Z should be redefined as follows: Z=(C−Clim)/ Fig. 4a e. With respect to matching the moments (C0 −Clim), such that Z varies between 0 and 1 with zero (moments method) the adaptation is better at low 118 A. Bellin, D. Tonina / Journal of Contaminant Hydrology 94 (2007) 109–125

Fig. 4. Cumulative Frequency Distribution of the vertically averaged Bromide concentrations reported from the first Cape Cod tracer test at a) t=33, b) t=83, c) t=237, d) t=426, and e) t=511 days since injection. The Cumulative Frequency Distribution is compared with the Beta and Log-Normal CDF models. The dot-dashed line shows the best fitting of the Beta model with the sample CFD.

concentrations and poorer at large concentrations. The 0.05 (Conover, 1999, pg. 239), resulting in an upper difference may be attributed to the more weight given limit of χ2 =12.59 for the test statistic: by the least-square method to low concentration Xnc 2 ðO E Þ measurements, which outnumber large concentration T ¼ j j ð25Þ E measurements. j¼1 j We applied the chi-square goodness-of-fit test to both Beta and Log-Normal distributions by using nine where nc=9 is the number of classes, Oj is the number equiprobable classes and level of significance set at of observations within class j, and Ej is the expected A. Bellin, D. Tonina / Journal of Contaminant Hydrology 94 (2007) 109–125 119 number of observations within the same class. The matching the first two moments of the probability results of the test are shown in columns three and seven model and the sample. of Table 1 for the Beta and Log-Normal distributions, A closer inspection of Fig. 4b and c reveals that when respectively, and Clim =0.1 mg/l. At 203 days, none of the null hypothesis is rejected for the Beta model the the observations falls within the fourth class, which is mismatch between the model and the data is mostly then merged with the fifth class, reducing the total concentrated at low Z values (see the inset of the number of classes to 8, and χ2 to 11.07. We found that figures). Consistently with our finding there are the null hypothesis that Z is Log-Normally distributed evidences that small concentration measurements may is accepted at day 384, and rejected in all the other be inaccurate at Cape Cod. In a recent analysis of the cases. On the other hand, the null hypothesis that Z is Cape Cod data set, Fitts (1993) eliminated Bromide distributed according to the Beta model of Eq. (A.1) is concentrations less than 0.3 mg/l with the justification accepted at day 33, 203, 273, 384, 426 and 511, and that they were “substantially inaccurate due to chemical rejected in the remaining cases. With the only analysis method limitations”, while Thierrin and exception of day 461, note that when the null Kitanidis (1994) eliminated all the concentrations hypothesis is rejected for both distributions, the smaller than 0.5 mg/l in their analysis of dilution. smallest T is obtained with the Beta distribution, By eliminating from the sample concentration which shows a better adaptation to the experimental measurements lower than Clim =0.3 mg/l because unre- data with respect to the Log-Normal model. The same liable, as Fitts (1996) and Fiori and Dagan (1999) did in test applied to the best fitting curve produced similar their studies, the null hypothesis that Z is Beta results with the only difference that the null hypothesis distributed is accepted in all cases it was accepted for is accepted at the same days of the previous case plus Clim =0.1 mg/l plus days 83, 111, 139, 315, and 461 (the day 111. This slightly better performance of the best T statistic are reported in columns 6 and 10 of Table 1 for fitting in term of test statistic is counterbalanced by a Beta and Log-Normal distributions, respectively). loss of accuracy in the probability of exceeding large Increasing Clim had a beneficial effect also on the concentrations, as shown in Fig. 4a–e. Therefore, Log-Normal distribution, but the null hypothesis was subsequently we focus on the results obtained by verified for a smaller number of days with respect to the

Table 1 Chi-square goodness-of-fit test for the beta and Log-Normal probability distribution models of the vertically averaged bromide concentration at Cape Cod Test statistic, T Day F (C≤ Beta distribution Log-Normal distribution 0.3 mg/l) T T1 T1,R Tbf TT1 T1,R Tbf 33 0.15 11.45 6.06 0.53 9.25 19.64 0.97 0.05 12.79 55 0.20 15.42 3.29 0.21 14.63 16.26 1.51 0.09 13.79 83 0.35 33.63 25.15 0.75 7.97 26.00 15.61 0.60 12.41 111 0.31 21.75 12.62 0.58 8.68 19.75 2.50 0.13 14.08 139 0.50 57.37 32.63 0.57 12.46 56.33 22.31 0.40 20.42 174 0.27 27.56 14.44 0.52 21.64 40.53 5.59 0.14 23.08 203 a 0.18 10.78 4.08 0.38 9.95 14.63 0.04 0.00 18.05 237 0.48 37.20 25.39 0.68 13.55 33.70 17.84 0.53 15.8 273 0.12 5.39 2.29 0.43 7.02 15.68 0.01 0.00 13.63 315 0.37 28.90 13.89 0.48 10.48 37.45 5.14 0.14 18.76 349 0.16 12.62 4.02 0.32 12.73 26.79 0.73 0.03 27.96 384 0.13 5.48 1.45 0.26 7.61 8.09 0.23 0.03 7.9 426 0.17 2.84 0.44 0.15 3.36 16.42 0.30 0.02 7.96 461 0.20 14.36 2.32 0.16 12.46 12.16 3.38 0.28 6.46 511 0.31 9.59 6.03 0.63 8.88 11.43 1.58 0.14 10.47

T1 is the contribution to the T statistic of concentrations smaller than 0.3 mg/l for Clim =0.1 mg/l, and Tbf is the T statistic for Clim =0.3 mg/l. Furthermore, T1,R =T1/T is the relative contribution of concentrations smaller than 0.3 mg/l to the T statistic, and F (C≤0.3 mg/l) is the sample CFD for C=0.3 mg/l. a Obtained by merging the classes number 4 and 5 in one class, such that the number of classes reduces to 8 and consequently χ2 =11.07. 120 A. Bellin, D. Tonina / Journal of Contaminant Hydrology 94 (2007) 109–125

Beta distribution and, however, in cases when it was at C1 =0.3 mg/l, and computed the relative contribution accepted for both distributions the smallest T statistic to T of classes below C1: was that of the Beta model (see Table 1). In none of the Xnc1 2 days the null hypothesis was accepted for Log-Normal T1 1 ðOj EjÞ T ; ¼ ¼ ð26Þ 1 R T T E distribution and rejected for the Beta distribution. j¼1 j In order to evaluate the impact of measurement errors further on the concentration pdf, we have set the where nc1 is the number of classes with the upper bound threshold between doubtful and reliable concentrations smaller than Z1 =(C1 − Clim)/(C0 − Clim)=0.0003125,

Fig. 5. Cumulative frequency distribution of the point concentration of Bromide from the first Cape Cod tracer test at a) t=33, b) t=83, c) t=237, d) t=426 and, e) t=511 days since injection. The Cumulative Frequency Distribution is compared with the Beta and Log-Normal CDF models. The dot- dashed line shows the best fitting of the Beta model with the sample CFD. A. Bellin, D. Tonina / Journal of Contaminant Hydrology 94 (2007) 109–125 121

incremented by one if Z1 is located in the upper half of in all other cases the T statistic was much smaller than the class containing it. that obtained with Clim =0.1 mg/l. Contrarily, the null The results of this analysis are shown in columns five hypothesis that the concentration is Log-Normally and nine of Table 1 for the Beta and Log-Normal distributed was accepted only for day 111. models, respectively. Additionally, columns four and In the light of these results, we conclude that the Beta eight show the corresponding values of T1. With the distribution is a good model of the spatial variability for exception of day 461, when both models are rejected the Bromide concentration observed at the first Cape and the smallest T is that of the Log-Normal model, the Cod tracer test. In particular, the Beta distribution largest contribution to T of classes Z≤Z1 is that of the models the experimental CFD accurately at high Beta model, and in 7 out of 15 cases this contribution concentrations, but it is less satisfactory at low exceeds 50% of T. In other words, when the null concentrations. However, chemical analysis errors hypothesis is rejected for the Beta model, it is because of have been reported by Hess et al. (1992) and Fitts the poor adaptation to concentrations smaller than (1996) to influence low concentration measurements 0.3 mg/l, which are negatively impacted by measure- negatively, such that the mismatch between CFD and ment and analytical errors. A treatment of the measure- Beta distribution function (Eq. (A.1)) should not be ment error, although theoretically possible, is beyond entirely attributed to our model's limitations. Converse- the scope of this work, suffice here to know that it ly, the Log-Normal model performs better, yet not impacts concentrations smaller than 0.3 mg/l negatively satisfyingly, at low concentrations, but poorly at high as evidenced in other studies. concentrations. The Normal model performs poorly in We conclude that the Beta model of Eq. (A.1) is all cases. Overall, the Beta model is then preferable over superior to the Log-Normal model of Eq. (22) in the alternative Log-Normal model. describing the probability distribution of vertically averaged Bromide concentrations at Cape Cod, and 4.2. Numerical validation of the spatial CDF that its performance improves significantly at high concentrations. On the other hand, small concentrations Simulations are performed in a two-dimensional are described better by the Log-Normal distribution, but formation by using a single realization of the transmis- this may be a consequence of the larger relative impact sivity field, which we assumed as the “real” field. The of measurement errors, which obscure the underlying resulting concentration distributions at selected times probability distribution. since injection (snapshots), obtained by solving solute transport with the FPT scheme of Eqs. (13) and (14), are 4.1.2. Point concentration compared with the Beta model (7). Concentration We repeated the analysis of the previous section for measurements composing the sample are taken at the Bromide concentrations measured at the ports of the center of a grid with horizontal dimension equal to the multilevel samplers. Each sampling port consists of a polyethylene tube with internal diameter of 0.47 cm (Leblanc et al., 1991). Since the sampler's size is one order of magnitude smaller than IYv, concentration measurements at the sampling ports can be thought as point concentrations. Fig. 5a–e show the CFD of Z (x1, x2, x3)=(C (x1, x2, x3)−Clim)/(C0 −Clim), where C (x1, x2, x3) is the Bromide concentration at the port's location (x1, x2, x3)andClim = 0.1 mg/l. Although, both Beta and Log-Normal models fail the χ2 goodness-of-fit test with χ2 exceeding the 95th percentile, the Beta distribution follows the CFD closely at moderate to large concentrations, while as in the vertically averaged concentrations the Log-Normal distribution seems preferable, yet not ideal, at small concentrations (see the inset of Fig. 5a–e). Fig. 6. Cumulative frequency distribution of solute concentrations obtained from numerical simulations of a field-scale tracer test, at By assuming Clim =0.3 mg/l, the null hypothesis that t=4IY /U since injection. The Cumulative Frequency Distribution is point concentrations are distributed as the Beta model compared with the following three CDF models: Beta, Normal and was accepted for days 203, 237, 273, 384 and 511, while Log-Normal. 122 A. Bellin, D. Tonina / Journal of Contaminant Hydrology 94 (2007) 109–125 horizontal projection of ΔV. We considered two snap- Although, strictly speaking all the above models are shots: one at an early time since injection, and the other rejected by the chi-square goodness of fit test at 0.05 level at a later time. of significance, the Beta model performs much better than Fig. 6 compares the experimental CFD of Z=(CΔV − both Log-Normal and Normal models, confirming what CΔV,min)/(C0 −CΔV,min)attimet=4IY/U, obtained from was already evident from the visual inspection of Figs. 6 the numerical simulation of the tracer experiment with and 7. Similar considerations are suggested by the sampling size Δ=0.2IY, with the Beta model (A.1) and Kolmogorov–Smirnov test with the only difference that two additional models widely used in the theory of random the null hypothesis that the concentration CDF follows the functions: the Log-Normal model (22) and the Gaussian Beta model is in this case accepted for t=4IY /U and (Normal) model. Fig. 7 repeats Fig. 6 for t=55IY /U.For rejected for both Normal and Log-Normal models. each model the parameters are computed by matching the Simulations with a larger sampling volume of side first two statistical moments with the corresponding Δ=1IY have similar results with the Beta distribution moments of the sample. For the Beta model the parameter outperforming the Normal and Log-Normal models. β is computed by substituting the sample mean and However, in this case the chi-square goodness of fit test variance of Z into Eq. (8). is met by the Beta model at both t=4 IY /U and t=55IY / Likewise the Cape Cod data analysis presented in U and rejected at t=28IY /U, whereas Kolmogorov– Section 3, the Beta model (A.1) performs much better Smirnov test is always satisfied by the Beta, but not by than the other two models. Capitalizing on the informa- the Normal and Log-Normal distributions with the tion provided by the first two moments of the solute exception of the Normal model at time t=4IY /U. concentration, the model (A.1) provides an accurate description of the concentration CFD. At low concentra- 5. Conclusions tions, the Beta model performs much better with the data produced by the simulated tracer experiment than with We developed a new model for the concentration pdf those obtained in the field, suggesting that the latter may of a conservative tracer migrating in a heterogeneous be negatively affected by the chemical analysis error aquifer. The model is based on a Stochastic Differential reported by Hess et al. (1992) and Fitts (1996),aswe Equation describing the evolution of the plume subjected argued in the previous section. Furthermore, both Normal to spreading, pore-scale dispersion, and mechanical and Log-Normal models underestimate the probability of mixing associated with the sampler volume. Under exceeding high concentrations, which may impact risk statistical stationarity, we show that the local solute assessment negatively. concentration is distributed according to the Beta We performed the chi-square and Kolmogorov– distribution, whose parameters depend on the first two Smirnov goodness-of-fit tests (Conover, 1999, pg.428) statistical moments of the concentration. Similarly, the for the three models shown in Figs. 6 and 7, and for the probability distribution of the solute concentration in a snapshot at t=28IY /U not shown in the figures. single realization is well described by the Beta distribution with the first two statistical moments replaced by the corresponding spatial moments. The former is a model of local uncertainty, i.e. it describes the variability of the solute concentration at a given location, the latter can be seen as a model of global uncertainty describing the probability of exceeding a given concentration irrespec- tive of the location within the plume. Analysis of the Bromide concentration measurements from the first Cape Cod tracer experiment confirmed that our model is a valuable tool for interpreting concentration data, as well as to quantify the uncertainty resulting from modeling solute transport with a limited amount of information on the spatial distribution of the hydraulic conductivity. In particular, our model captures the smoothing effect of the sampling volume accurately Fig. 7. Cumulative Frequency Distribution of solute concentrations ob- predicting the reduction of the probability of exceeding tained from numerical simulations of a field-scale tracer test, at t=55IY/U since injection. The Cumulative Frequency Distribution is compared with high concentrations, when larger sampling volumes are the following three models: Beta, Normal and Log-Normal. adopted. The relatively poorer accuracy in modeling the A. Bellin, D. Tonina / Journal of Contaminant Hydrology 94 (2007) 109–125 123 spatial distribution of small concentrations may be either a where γ=Γ [ω]/(Γ [ω〈Z〉]Γ [ω(1−〈Z〉)]), in which Γ [x]= ∞ x−1 −t model limitation, or a direct consequence of chemical ∫0 t e dt is the Euler gamma function, and z=ε≪1. By analysis errors in the concentration data. The Log-Normal taking the limit of the expansion (Eq. (A.2)) for ω→0we model performs slightly better than the Beta model for obtain: low concentrations for Cape Cod data, but at price of lim FðeÞ¼1 hZiðA:3Þ much poorer performance at high concentration levels xY0 with a significant negative impact on risk assessment. ε −〈 〉 εN Numerical simulations confirmed that our model is Thus, F( )=1 Z no matter how small 0 is, superior to both Log-Normal and Normal models in whereas it jumps to zero for z=0. representing local and global uncertainty arising from the Expanding now the Eq. (A.1) around z=1 we obtain: x x incomplete knowledge of spatial variability of the Fð1 eÞ¼1 ge ð1hZiÞ þ gð1 hZiÞe ð1hZiÞþ1 þ N permeability. Similarly to Cape Cod data, the probability A:4 of exceeding high concentrations is well predicted, while ð Þ ω→ the accuracy deteriorates slightly at low concentrations, which for 0 converges to with the limit between high and low concentrations at (C− lim Fð1 eÞ¼1 hZiðA:5Þ xY0 Clim)/(C0 −Clim)= 0.01 and 0.03 for t=4IY /U and 55IY / U, respectively. Thus, the range of concentrations for The discontinuity in z=1 is then similar to that which our model shows high accuracy in predicting the observed in z=0, with F(1−e)=1−〈Z〉 regardless how probability of exceeding a given threshold reduces with small e is, while F (z=1)=1. Whereas, F(z) is constant the elapsed time since injection. and equal to 1−〈Z〉 between z=0 and z=1. Thus, for ω→0(β→∞) the concentration CDF assumes the Acknowledgments following form: lim FðzÞ¼½1 hZiHðzÞþhZiHðz 1ÞðA:6Þ We are indebted to Kathrin Hess for making available xY0 to us the Cape Cod concentrations data set, and we also where H is the Heaviside step function (Abramowitz and thank three anonymous reviewers and the editor-in- Stegun, 1972), and the concentration pdf (Eq. (7)) charge for their helpful comments. reduces to the expression (9) obtained by Dagan (1982) with a different reasoning. Appendix A. Solution of the concentration pdf for β→∞ Appendix B. Moments of the RF Z for small β values

Let us consider the limit of the concentration pdf (7) The normalized statistical moments of order νN2, for β→∞. To handle this case properly, it is convenient given by the expression (11), can be rewritten by means to consider the CDF of the solute concentration: of the definition of hypergeometric function (Gradsh- Z z teyn and Ryzhik, 1980, p. 1045): V f f  FðzÞ¼Prob½Z z¼ f ½ d b ν=2 0 lV ν ν C x ν ¼ð1Þ hZi hZið1 hZiÞ b ½ x ; x 1 þ ¼ Bz½ hZi ð1 hZiÞ Xν ðB:1Þ C½xhZiC½xð1 hZiÞ ðνÞ ðhZi=bÞ 1 k k ðA:1Þ 1=b k k¼0 ð Þk k!hZi z a−1 b−1 where Bz[a, b]=∫0t (1−t) dt is the incomplete Beta where (a)k =Γ [a+k]/Γ [a] is the rising factorial and function (Abramowitz and Stegun, 1972, p. 258), and Γ [a]=∫∞ta−1e− tdt is the Euler gamma function. ω β 0 =1/ . By expanding the rising factorials in Eq. (B.1) in ω→ For 0 the function Eq. (A.1) is discontinuous at Taylor series around β=0, we obtain: → → z 0andz 1. The behavior of the CDF (A.1) in  ν=2 proximity of these discontinuities can be studied by means ν ν b lνV¼ð1Þ hZi hZið1 hZiÞ of Taylor expansion. For z=0 we obtain: 1 þ b  "# 2 Xν ν ν N ν W ; b 1 1 xð1 hZiÞ e k ð 1Þ ð k þ 1Þ k ðhZi Þ F e exhZig e O 1 þ ð1Þ ð Þ¼ x þ x þ k! k hZi 1 þ hZi g k¼1 hZi ðA:2Þ ðB:2Þ 124 A. Bellin, D. Tonina / Journal of Contaminant Hydrology 94 (2007) 109–125 with Caroni, E., Fiorotto, V., 2005. Analysis of concentration as sampled in natural aquifers. Transp. Porous Media 59, 19–45. ½hZiþðk 1Þb½hZiþðk 2Þb N ½hZiþbhZi Cassiani, M., Franzese, P., Giostra, U., 2005. A PDF micromixing W ðhZi; bÞ¼ k ½1 þðk 1Þb½1 þðk 2Þb N :½1 þ b model of dispersion for atmospheric flow: 1. Development of the model, application to homogeneous turbulence and to neutral ðB:3Þ boundary layer. Atmos. Environ. 39, 1457–1469. Chatwin, P.C., Sullivan, P.J., 1990. A simple and unifying physical In order to study the structure of Eq. (B.2) it is interpretation of scalar fluctuation measurements from many convenient to treat separately the odd and even moments. turbulent shear flows. J. Fluid Mech. 212, 533–556. When ν is odd, let say ν=2j+1, Eq. (B.2) assumes the Chatwin, P.C., Lewis, D.M., Sullivan, P.J., 1995. Turbulent dispersion – following form: and the beta distribution. Environmetrics 6, 395 402. Cobb, L., 1981. Stochastic differential equations for the social 1=2 3=2 sciences. In: Cobb, Loren, Thrall, Robert M. (Eds.), Mathematical lV b f ; Z ; b b f ; Z ; b N 2jþ1 ¼ 2jþ1 0ðh i Þþ 2jþ1 1ðh i Þþ Frontiers of the Social and Policy Sciences. Westview Press, Inc., Boulder, Colorado, pp. 37–68. Ch. 2. ð2jþ1Þ=2 þb f2jþ1;jðhZi; bÞ; j ¼ 1; 2; N : ðB:4Þ Conover, W.J., 1999. Practical Nonparametric Statistics, Third edition. John Wiley and Sons, New York. and the functions f2j+1,k, k=0,1,…, j, which depend on Dagan, G., 1982. Stochastic modeling of groundwater flow by both β and 〈Z〉, share the important property that for unconditional and conditional probabilities, 2. The solute trans- β→0 they are function of 〈Z〉 only. A direct consequence port. Water Resour. Res. 18 (4), 835–848. Dagan, G., 1991. Dispersion of a passive solute in non-ergodic of this property is that μ′2j+1 →0 for β→0. ν ν transport by steady velocity fields in heterogeneous formations. When is even, let say =2j, Eq. (B.2) assumes the J. Fluid Mech. 233, 197–210. following form: Dagan, G., Fiori, A., 1997. The influence of pore-scale dispersion on concentration statistical moments in transport through heteroge- ð2j 1Þ!! neous aquifers. Water Resour. Res. 33 (7), 1595–1605. l2Vj ¼ ½1 þ 2b½1 þ 3b N ½1 þð2j 1Þb Dopazo, C., O'Brian, E.E., 1974. An approach to the autoignition of a 2 ðB:5Þ turbulent mixture. Acta Astronaut. 1, 1239–1266. bg ; Z ; b b g ; Z ; b N þ 2j 1ðh i Þþ 2j 2ðh i Þþ Fiori, A., 2001. On the influence of local dispersion in solute transport 2j2 þ b g2j;2j2ðhZi; bÞ; j ¼ 1; 2; N : through formations with evolving scales of heterogeneity. Water Resour. Res. 37 (2), 235–242. Similarly to the previous case, the functions g , Fiori, A., 2002. An asymptotic analysis for determining concentration 2j,k – k=1,2, …,2j−2 depend on both 〈Z〉 and β, but reduce uncertainty in aquifer transport. J. Hydrol. 284, 1 12. 〈 〉 β→ Fiori, A., Dagan, G., 1999. Concentration fluctuation in transport by to a polynomial in Z for 0. This leads us to groundwater: comparison between theory and experiments. Water conclude that limβ→0μ2′j =(2j−1)!!. Resour. Res. 35 (1), 105–112. The Eqs. (B.4) and (B.5) show that for β≪1 the Fiori, A., Dagan, G., 2000. Concentration fluctuations in aquifer difference between the moments of Z and those of a transport: a rigorous first-order solution and applications. J. Contam. Hydrol. 45 (1–2), 139–163. Normal distribution with the sameffiffiffi first two moments β→ pϵ ϵ Fiori, A., Berglund, S., Cvetkovic, V., Dagan, G., 2002. A first-order reduces to zero as 0 with and as leading term analysis of solute flux statistics in aquifers: the combined effect of for the odd and even moments, respectively. pore-scale dispersion, sampling, and linear sorption kinetics. Water Resour. Res. 38 (8). doi:10.1029/2001WR000678. References Fiorotto, V., Caroni, E., 2002. Solute concentration statistics in heterogeneous aquifers for finite Pèclet values. Transp. Porous Abramowitz, M., Stegun, I.A., 1972. Handbook of Mathematical Media 48, 331–351. Functions with Formulas, Graphs, and Mathematical Tables, 9th Fisher, H.B., List, E.J., Koh, R.C.Y., Imberger, J., Brooks, N.H., 1979. Printing. Dover, New York. Mixing in Inland and Coastal Waters. Academic, San Diego, Calif. Andricevic, R., 1998. Effects of local dispersion and sampling volume Fitts, C.R., 1996. Uncertainty in deterministic groundwater transport on the evolution of concentration fluctuations in aquifers. Water models due to the assumption of macrodispersive mixing: evidence Resour. Res. 34 (5), 1115–1129. from Cape Cod (Massachusetts, U.S.A. and Borden (Ontario, Arfken, G., 1985. Mathematical Methods for Physicists, 3rd edition. Canada) tracer tests. J. Contam. Hydrol. 23, 69–84. Academic Press, Orlando, Florida. Gardiner, C.W., 1985. Handbook of Stochastic Methods for Physics, Bellin, A., Rubin, Y., 1996. HYDRO_GEN: a spatially distributed Chemistry and the Natural Sciences. Springer, Berlin. random field generator for correlated properties. Stoch. Hydrol. Gelhar, L.W., 1993. Stochastic Subsurface Hydrology. Prentice Hall, Hydraul. 10 (4), 253–278. Englewood Cliffs, New Jersey. Bellin, A., Salandin, P., Rinaldo, A., 1992. Simulation of dispersion in Goovaerts, P., 1997. Geostatistics for Natural Resources Evaluation. heterogeneous porous formations: statistics, first-order theories, Oxford University Press, New York. convergence of computations. Water Resour. Res. 28 (9), 2211–2227. Goovaerts, P., AvRuskin, G., Meliker, J., Slotnick, M., Jacquez, G., Bellin, A., Rubin, Y., Rinaldo, A., 1994. Eulerian–Lagrangian Nriagu, J., 2005. Geostatistical modeling of the spatial variability approach for modeling of flow and transport in heterogeneous of arsenic in groundwater of southeast Michigan. Water Resour. geological formations. Water Resour. Res. 30 (11), 2913–2924. Res. 341 (7). A. Bellin, D. Tonina / Journal of Contaminant Hydrology 94 (2007) 109–125 125

Gradshteyn, I.S., Ryzhik, I.M., 1980. Table of Integrals, Series, and Gravel Aquifers: Field and Modeling Studies. Atomic Energy of Product. Academic Press, Inc., Orlando, Florida. Canada Limited, Chalk River Ont., Ottawa. October. Hess, K.M., Wolf, S.H., Celia, M.A., 1992. Large-scale natural Rubin, Y., Bellin, A., 1998. Conditional simulation of geologic media gradient tracer test in sand and gravel, Cape Cod, Massachusetts, 3. with evolving scales of heterogeneity. In: Sposito, G. (Ed.), Scale Hydraulic conductivity variability and calculated macrodispersiv- Dependence and Scale Invariance in Hydrology. Cambridge ities. Water Resour. Res. 28 (8), 2011–2017. University Press, New York, pp. 398–420. Ch. 14. Kapoor, V., Gelhar, L.W., 1994. Transport in three-dimensionally Thierrin, J., Kitanidis, P.K., 1994. Solute diluition at the Borden and heterogeneous aquifers, 1. Dynamics of concentration fluctuations. Cape Cod groundwater tracer tests. Water Resour. Res. 30 (11), Water Resour. Res. 30 (6), 1775–1788. 2883–2890. Kapoor, V., Kitanidis, P.K., 1998. Concentration fluctuations and Vanderborght, J., 2001. Concentration variance and spatial covariance dilution in aquifers. Water Resour. Res. 34 (5), 1181–1193. in second-order stationary heterogeneous conductivity fields. Kitanidis, P.K., 1988. Prediction by the method of moments of Water Resour. Res. 37 (7), 1893–1912. transport in a heterogeneous formation. J. Hydrol. 102, 453–473. Villermaux, J., Devillon, J.C., 1972. Représentation de la coalescence Kitanidis, P.K., 1994. The concept of the dilution index. Water Resour. et de la redispersion des domaines de ségrégation dans un fluide Res. 30 (7), 2011–2026. par un modéle d'interaction phénoménologique. In: Elsevier (Ed.), Leblanc, D.R., Garabedian, S.P., Hess, K.M., Gelhar, L.W., Quadri, R.D., Second International Symposium On Chemical Reaction Engi- Stollenwerk, K.G., Wood, W.W., 1991. Large-scale natural gradient neering, New York. tracer test in sand and gravel, Cape Cod, Massachusetts: 1. Wandel, A.P., Smith, N.S.A., Klimenko, A.Y., 2003. Verification of Experimental design and observed tracer movement. Water Resour. Markov hypothesis for conserved scalar process: validation of Res. 27 (5), 895–910. conditional moment closure for turbulent combustion. ANZIAM J. Pannone, M., Kitanidis, P.K., 1999. Large-time behavior of concen- 44, C802–C819. tration variance and dilution in heterogeneous media. Water Zhang, D., Neuman, S.P., 1996. Effect of local dispersion on solute Resour. Res. 35, 623–634. transport in randomly heterogeneous media. Water Resour. Res. Pope, S.B., 1965. Turbulent Flow. Cambridge University Press. 32 (9), 2715–2723. Quinodoz, H.A.M., Valocchi, A.J., 1990. Macrodispersion in heterogeneous aquifers: numerical experiments. In: Moltyaner, G. (Ed.), Transport and Mass Exchange Processes in Sand and