Attenuation imaging by wavefield reconstruction inversion with bound constraints and total variation regularization

Hossein S. Aghamiry 1,2, Ali Gholami 1 and Stéphane Operto 2

1 University of Tehran, Institute of Geophysics, Tehran, Iran
2 University Cote d'Azur - CNRS - IRD - OCA, Geoazur, Valbonne, France

A PREPRINT. arXiv:1909.05170v2 [math.OC] 6 Jan 2020

ABSTRACT

Wavefield reconstruction inversion (WRI) extends the search space of Full Waveform Inversion (FWI) by allowing for wave equation errors during wavefield reconstruction to match the data from the first iteration. Then, the wavespeeds are updated from the wavefields by minimizing the source residuals. Performing these two tasks in alternating mode breaks down the nonlinear FWI as a sequence of two linear subproblems, relying on the bilinearity of the wave equation. We solve this biconvex optimization with the alternating-direction method of multipliers (ADMM) to cancel out efficiently the data and source residuals in iterations and stabilize the parameter estimation with appropriate regularizations. Here, we extend WRI to viscoacoustic media for attenuation imaging. Attenuation reconstruction is challenging because of the small imprint of attenuation in the data and the cross-talks with velocities. To address these issues, we recast the multivariate viscoacoustic WRI as a triconvex optimization and update wavefields, squared slowness, and attenuation factor in alternating mode at each WRI iteration. This requires linearizing the attenuation-estimation subproblem via an approximated trilinear viscoacoustic wave equation. The iterative defect correction embedded in ADMM corrects the errors generated by this linearization, while the operator splitting allows us to tailor ℓ1 regularization to each parameter class. A toy numerical example shows that these strategies mitigate cross-talk artifacts and noise from the attenuation reconstruction. A more realistic synthetic example representative of the North Sea validates the method.

INTRODUCTION

Full waveform inversion (FWI) is a high-resolution nonlinear imaging technology which can provide accurate subsurface models by matching observed and calculated waveforms (Tarantola, 1984; Pratt et al., 1998; Virieux and Operto, 2009). However, it is well acknowledged that it suffers from two main pathologies. The first one is the nonlinearity associated with cycle skipping: when the distance between the observed and calculated data is the least-squares norm of their differences, FWI remains stuck in spurious local minima when the initial velocity model does not allow traveltimes to be matched with an error lower than half a period. To mitigate cycle skipping, many variants of FWI have been proposed with more convex distances such as those based on matching filters (Warner and Guasch, 2016; Guasch et al., 2019) or optimal transport (Métivier et al., 2018) among others. The second pathology is ill-posedness resulting from uneven subsurface illumination provided by limited-aperture surface acquisitions (e.g., Tang, 2009) and parameter cross-talks during multiparameter reconstruction (see Operto et al., 2013, for a tutorial). Mitigating this ill-posedness requires accounting for the Hessian in local optimization methods (e.g. Métivier et al., 2017) and regularizing the inversion with prior information such as physical bound constraints (e.g. Asnaashari et al., 2013; Duan and Sava, 2016).

Among the methods proposed to mitigate cycle skipping, wavefield reconstruction inversion (WRI) (van Leeuwen and Herrmann, 2013, 2016) extends the parameter search space of frequency-domain FWI by processing the wave equation as a soft constraint with a penalty method. The resulting wave equation relaxation allows for data fitting with inaccurate velocity models through the reconstruction of data-assimilated wavefields, namely wavefields satisfying the observation equation relating the wavefields to the observations (Aghamiry et al., 2019a). The algorithm then updates the model parameters by least-squares minimization of the wave equation errors (or source residuals) so that the assimilated wavefields explain both the wave equation and the data as well as possible. Performing wavefield reconstruction and parameter estimation in an alternating mode (van Leeuwen and Herrmann, 2013) rather than by variable projection (van Leeuwen and Herrmann, 2016) recasts WRI as a sequence of two linear subproblems as a result of the bilinearity of the wave equation in wavefield and squared slowness. The reader is also referred to Aghamiry et al. (2019b) for a more general discussion on the bilinearity of the elastic anisotropic wave equation. Aghamiry et al. (2019e) solved this biconvex problem with the alternating direction method of multipliers (ADMM) (Boyd et al., 2010). ADMM is an augmented Lagrangian method which makes use of operator splitting and alternating directions to solve convex separable multi-variate constrained problems. The augmented Lagrangian function combines a penalty function and a Lagrangian function (Nocedal and Wright, 2006a, Chapter 17). The penalty function relaxes the constraints during early iterations as in WRI, while the Lagrangian function progressively corrects the constraint violations via the action of the Lagrange multipliers. The leverage provided by the Lagrange multipliers guarantees that the constraints are satisfied at the convergence point with constant penalty parameters (Aghamiry et al., 2019e). Accordingly, Aghamiry et al. (2019e) called their approach iteratively-refined WRI (IR-WRI). Alternatives to satisfy the constraints at the convergence point with penalty methods rely on multiplicative (da Silva and Yao, 2017) or discrepancy-based (Fu and Symes, 2017) approaches. Aghamiry et al. (2019d) implemented bounding constraints and total variation (TV) regularization (Rudin et al., 1992) in IR-WRI with the split Bregman method (Goldstein and Osher, 2009) to improve the imaging of large-contrast media, with however undesirable staircase imprints in smooth regions. To overcome this issue and capture both the blocky and smooth components of the subsurface, Aghamiry et al. (2019c) combine in IR-WRI Tikhonov and TV regularizations by infimal convolution.

The objective of this paper is to extend frequency-domain IR-WRI to viscoacoustic media for attenuation imaging. Attenuation reconstruction by FWI raises two potential issues. The first is related to the cross-talks between wavespeed and attenuation. The ambiguity between velocity and attenuation perturbation in least-squares migration has been emphasized by Mulder and Hak (2009). Many combinations of velocity and attenuation perturbations can fit reflection amplitudes equally well since they are basically related by a Hilbert transform. This ambiguity can be simply illustrated by the radiation pattern of velocity and attenuation perturbations, which have the same amplitude versus angle behavior and a 90° phase shift (Malinowski et al., 2011; da Siva et al., 2019). The conclusions of Mulder and Hak (2009) are substantiated by Ribodetti et al. (2000) who show that the Hessian of ray+Born least-squares migration of single-offset reflection data is singular if the reflector is not illuminated from above and beneath. On the other hand, Hak and Mulder (2011) show that wavespeed and attenuation can be decoupled during nonlinear waveform inversion of multi-offset/multi-frequency data provided that the causality term is properly implemented in the attenuation model. This conclusion has been further supported by several realistic synthetic experiments and real data case studies in marine and land environments, which manage to reconstruct trustworthy attenuation models (Hicks and Pratt, 2001; Askan et al., 2007; Malinowski et al., 2011; Takougang and Calvert, 2012; Kamei and Pratt, 2008; Prieux et al., 2013; Stopin et al., 2016; Operto and Miniussi, 2018; Lacasse et al., 2019). This decoupling between velocity and attenuation can be further argued on the basis of physical considerations. In the transmission regime of wave propagation, wavespeeds control the kinematics of wave propagation. This implies that FWI is dominantly driven toward wavespeed updating to match the traveltimes of the wide-aperture data (diving waves, post-critical reflections) and update the long wavelengths of the subsurface accordingly, while attenuation has a secondary role to match amplitude and dispersion effects (e.g., see Operto and Miniussi, 2018, for an illustration). This weak imprint of the attenuation in the seismic response was illustrated by the sensitivity analysis carried out by Kurzmann et al. (2013) who concluded that a crude homogeneous background attenuation model may be enough to perform reliable FWI, while da Siva et al. (2019) proposed to reconstruct an under-parametrized attenuation model by semi global FWI. When a high-resolution attenuation model is sought, the ill-posedness of the attenuation reconstruction may be managed with different recipes including data-driven and model-driven inversions (joint versus sequential updates of the velocity and attenuation of selected subdatasets), parameter scaling, bound constraints and regularizations (e.g. Prieux et al., 2013; Operto et al., 2013).

In this context, the contribution of this study is twofold: first, we show how to implement velocity and attenuation reconstruction in frequency-domain viscoacoustic IR-WRI when equipped with bound constraints and nonsmooth regularizations. Second, we discuss with numerical examples whether the alternating-direction algorithm driven by the need to expand the search space is suitable to manage ill-conditioned multi-parameter reconstruction. It is well acknowledged that viscous effects are easily included in the time-harmonic wave equation with frequency-dependent complex-valued velocities as function of phase velocity and attenuation factor (the inverse of quality factor), which are both real-valued parameters (Toksöz and Johnston, 1981). Accordingly, the objective function of viscoacoustic IR-WRI must be minimized over a set of three parameter classes (wavefield, squared slowness, attenuation factor). In this study, we consider the Kolsky-Futterman model as attenuation model (Kolsky, 1956; Futterman, 1962). With this model, the viscoacoustic wave equation is bilinear in wavefield and squared slowness, while it is nonlinear in attenuation factor. This prompts us to introduce a first-order approximation of the viscoacoustic function to form a trilinear viscoacoustic wave equation. This equation allows us to recast the multivariate viscoacoustic IR-WRI as a sequence of three linear subproblems for wavefields, squared slowness and attenuation factor estimation, which are solved in alternating mode following the block relaxation strategy of ADMM. Then, the errors generated by the approximated wave equation during attenuation estimation are corrected by the action of the Lagrange multipliers (dual variables), which are formed by the source residuals computed with the exact wave equation. Another application of augmented Lagrangian method in AVO inversion is presented in Gholami et al. (2018) where the linearized Zoeppritz equations are used to simplify the primal problem, while the dual problem compensates the linearization-related errors by computing the residuals with the exact Zoeppritz equations. Also, the decomposition of the viscoacoustic IR-WRI into three linear subproblems provides the suitable framework to tailor ℓ1 regularizations to each parameter-estimation subproblem (Aghamiry et al., 2019d,c).

The alternating update of the squared slowness and attenuation factor at each IR-WRI iteration is probably not neutral on how the inversion manages the parameter cross-talks and the contrasted sensitivity of the data to each parameter class, as the multi-parameter inversion is broken down as a sequence of two mono-parameter inversions. This approach differs from those commonly used in multi-parameter inversion. The most brute-force approach consists in the joint updating of the multiple parameter classes, with the issue of managing multi-parameter Hessian with suitable parameter scaling (e.g., Stopin et al., 2014; Métivier et al., 2015; Yang et al., 2016). Other approaches rely on ad-hoc hierarchical data-driven and model-driven inversion where the dominant parameter is updated during a first mono-parameter inversion, before involving the secondary parameter in a subsequent multi-parameter inversion (e.g., Prieux et al., 2013; Cheng et al., 2016). Another possible model-driven strategy consists of performing the joint updating of the multiple parameter classes during a first inversion, then resetting the secondary parameters to their initial values and restarting a multi-parameter inversion involving all the parameter classes (Yang et al., 2014).

In this study, we assess our approach against two synthetic experiments, a toy example and a more realistic well-documented synthetic example representative of the North Sea (Prieux et al., 2013). A comparison of our approach with those reviewed above remains however beyond the scope of this paper. One reason is that all the above approaches have been implemented in conventional FWI, which would remain stuck in a local minimum when starting from the crude initial model used in this study. This is to remind that IR-WRI provides a practical framework to conciliate the search space expansion to mitigate cycle skipping and easy-to-design multi-parameter reconstruction via the alternating update of the multiple parameter classes.

This paper is organized as follows. In the method section, we first review the forward problem equation before going into the details of viscoacoustic IR-WRI: we first formulate the constrained optimization problem to be solved and recast it as a saddle point problem with an augmented Lagrangian function (Appendix A). Then, we review the solution of the three primal subproblems for wavefields, squared slowness and attenuation factor in the framework of ADMM, as well as the expression of the dual variables or Lagrange multipliers that capture the history of the solution refinement in iterations. Appendix B reviews in a general setting the split Bregman method to solve ℓ1-regularized convex problems. This recipe can be easily applied to the squared slowness and attenuation reconstruction subproblems. The final section presents two synthetic examples, which are performed without and with bound constraints and TV regularization in order to discriminate the role of the augmented-Lagrangian optimization from that of the priors. A toy example allows us to illustrate in a simple setting how well IR-WRI manages the parameter cross-talk and the ill-posedness of the attenuation reconstruction and how bound constraints and TV regularization remove the corresponding artifacts. A second synthetic example representative of the North Sea environment allows one to assess the method in a more realistic setting.

THEORY

In this section, we first review the viscoacoustic wave equation in the frequency-space domain. Then, we use this wave equation to formulate the iteratively-refined wavefield reconstruction inversion (IR-WRI) for velocity and attenuation.

Forward problem

The viscoacoustic wave equation in the frequency-space domain is given by

\[ \left( \Delta + \frac{\omega^2}{c(\mathbf{x})^2} \right) u(\mathbf{x}, \omega) = b(\mathbf{x}, \omega), \tag{1} \]

where $\Delta$ is the Laplacian operator, $\omega$ is the angular frequency, $\mathbf{x} = (x, z)$ denotes the position in the subsurface model and $b(\mathbf{x}, \omega)$ and $u(\mathbf{x}, \omega)$ are respectively the source term and the wavefield for frequency $\omega$. Viscoacoustic (attenuative) media can be described by complex-valued velocity $c(\mathbf{x})$. The velocity associated with the Kolsky-Futterman model is given by (Kolsky, 1956; Futterman, 1962)

\[ \frac{1}{c(\mathbf{x})} = \frac{1}{v(\mathbf{x})} \left[ 1 - \frac{1}{\pi Q(\mathbf{x})} \log\left| \frac{\omega}{\omega_r} \right| + i \, \frac{\mathrm{sign}(\omega)}{2 Q(\mathbf{x})} \right], \tag{2} \]

where $v(\mathbf{x})$ denotes the phase velocity, $Q(\mathbf{x})$ the frequency-independent quality factor, both real-valued, and $i = \sqrt{-1}$.
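As a minimal sketch of how equation 2 can be evaluated at a single grid point, the following Python function returns the complex slowness 1/c for a given phase velocity, quality factor and frequency. The function name, argument names and sample values are illustrative assumptions and are not taken from the original study; the default reference frequency of 50 Hz is the value adopted in this paper.

```python
import math


def kolsky_futterman_slowness(v, q, freq_hz, ref_freq_hz=50.0):
    """Complex slowness 1/c of equation 2 at one point and one frequency.

    v           : phase velocity (m/s), real-valued
    q           : frequency-independent quality factor, real-valued
    freq_hz     : frequency in Hz (must be nonzero)
    ref_freq_hz : reference frequency f_r in Hz
    """
    omega = 2.0 * math.pi * freq_hz
    omega_r = 2.0 * math.pi * ref_freq_hz
    # Logarithmic (causality) term: modifies the real part of the slowness.
    dispersion = -math.log(abs(omega / omega_r)) / (math.pi * q)
    # Absorption term: imaginary part, carrying the sign of omega.
    absorption = math.copysign(1.0, omega) / (2.0 * q)
    return (1.0 + dispersion + 1j * absorption) / v


# Illustrative values only (hypothetical medium): v = 1500 m/s, Q = 100.
s_ref = kolsky_futterman_slowness(v=1500.0, q=100.0, freq_hz=50.0)
s_low = kolsky_futterman_slowness(v=1500.0, q=100.0, freq_hz=5.0)
print("slowness at 50 Hz:", s_ref)
print("slowness at  5 Hz:", s_low)
```

At the reference frequency the logarithmic term vanishes, so the real part of the slowness reduces to 1/v. Below the reference frequency the real part increases, which corresponds to the velocity dispersion implied by the causality term.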

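In equation 2, $\mathrm{sign}(\cdot)$ is the sign function that extracts the sign of a real number. The logarithmic term with reference frequency $\omega_r = 2\pi f_r$ implies causality (Aki and Richards, 2002; Hak and Mulder, 2011). In this study, $f_r$ is chosen to be 50 Hz (Toverud and Ursin, 2005).

Inverse problem

We discretize the 2D partial-differential equation (PDE), equation 1, with $N = N_x \times N_z$ grid points, where $N_x$ and $N_z$ are the number of points in the horizontal and vertical directions, respectively. We parametrize the inversion by squared slowness $\mathbf{m} = 1/\mathbf{v}^2$ and attenuation factor $\boldsymbol{\alpha} = 1/\mathbf{Q}$. Accordingly, equation 2 in discrete form reads as

\[ \frac{1}{\mathbf{c}^2} = \mathbf{m} \circ \boldsymbol{\rho}(\boldsymbol{\alpha}), \tag{3} \]

where

\[ \boldsymbol{\rho}(\boldsymbol{\alpha}) = \left( 1 + \beta(\omega) \boldsymbol{\alpha} \right)^2, \qquad \beta(\omega) = i \, \frac{\mathrm{sign}(\omega)}{2} - \frac{1}{\pi} \log\left| \frac{\omega}{\omega_r} \right| \tag{4} \]

and $\circ$ denotes the Hadamard (element-wise) product operator. The model parameters $\mathbf{m} \in \mathbb{R}^N$ and $\boldsymbol{\alpha} \in \mathbb{R}^N$ are defined as the solution of the following nonlinear PDE-constrained optimization problem (Aghamiry et al., 2019e)

\[ \min_{\mathbf{u},\, \mathbf{m} \in \mathcal{M},\, \boldsymbol{\alpha} \in \mathcal{A}} \; \mu E(\mathbf{m}) + \nu F(\boldsymbol{\alpha}) \quad \text{subject to} \quad \begin{cases} \mathbf{A}(\mathbf{m}, \boldsymbol{\alpha}) \mathbf{u} = \mathbf{b}, \\ \mathbf{P} \mathbf{u} = \mathbf{d}, \end{cases} \tag{5} \]

where $\mathbf{u} \in \mathbb{C}^{N \times 1}$ is the wavefield, $\mathbf{b} \in \mathbb{C}^{N \times 1}$ is the source term, and $\mathbf{A}(\mathbf{m}, \boldsymbol{\alpha}) \in \mathbb{C}^{N \times N}$ is the matrix representation of the discretized Helmholtz PDE, equation 1. The observation operator $\mathbf{P} \in \mathbb{R}^{M \times N}$ samples the reconstructed wavefields at the $M$ receiver positions for comparison with the recorded data $\mathbf{d} \in \mathbb{C}^{M \times 1}$ (we assume a single source experiment for the sake of compact notation; however, the extension to multiple sources is straightforward). The functions $E$ and $F$ are appropriate regularization functions for $\mathbf{m}$ and $\boldsymbol{\alpha}$, respectively, which are weighted by the penalty parameters $\mu > 0$ and $\nu > 0$, respectively. $\mathcal{M}$ and $\mathcal{A}$ are convex sets defined according to our prior knowledge of $\mathbf{m}$ and $\boldsymbol{\alpha}$. For example, if we know the lower and upper bounds on $\mathbf{m}$ and $\boldsymbol{\alpha}$ then

\[ \mathcal{M} = \{ \mathbf{m} \mid \mathbf{m}_{min} \leq \mathbf{m} \leq \mathbf{m}_{max} \}, \tag{6} \]

and

\[ \mathcal{A} = \{ \boldsymbol{\alpha} \mid \boldsymbol{\alpha}_{min} \leq \boldsymbol{\alpha} \leq \boldsymbol{\alpha}_{max} \}. \tag{7} \]

The PDE constraint $\mathbf{A}(\mathbf{m}, \boldsymbol{\alpha}) \mathbf{u} = \mathbf{b}$ in equation 5 is nonlinear in $\mathbf{m}$ and $\boldsymbol{\alpha}$ and very ill-conditioned (Dolean et al., 2015), while the data constraint $\mathbf{P}\mathbf{u} = \mathbf{d}$ is linear but the operator $\mathbf{P}$ is rank-deficient with a huge null space because $M \ll N$. Therefore, determination of the optimum multivariate solution $(\mathbf{u}^*, \mathbf{m}^*, \boldsymbol{\alpha}^*)$ satisfying both constraints (the wave equation and the observation equation) simultaneously is extremely difficult, and requires sophisticated regularizations. In this paper, we use the first-order isotropic TV regularization (Rudin et al., 1992) for both $\mathbf{m}$ and $\boldsymbol{\alpha}$, i.e. $E(\mathbf{m}) = \|\mathbf{m}\|_{TV}$ and $F(\boldsymbol{\alpha}) = \|\boldsymbol{\alpha}\|_{TV}$. However, other regularizations such as compound regularizations can be used in a similar way (see Aghamiry et al., 2019c). The isotropic TV norm of a 2D image $\mathbf{w} \in \mathbb{R}^N$ is defined as (Rudin et al., 1992)

\[ \|\mathbf{w}\|_{TV} = \sum \sqrt{ (\nabla_x \mathbf{w})^2 + (\nabla_z \mathbf{w})^2 }, \tag{8} \]

where $\nabla_x$ and $\nabla_z$ are respectively first-order difference operators in the horizontal and vertical directions with appropriate boundary conditions (Gholami and Naeini, 2019).

Beginning with an initial velocity model $\mathbf{v}^0 = 1/\sqrt{\mathbf{m}^0}$ and $\boldsymbol{\alpha}^0 = \mathbf{0}$, $\mathbf{b}^0 = \mathbf{0}$, $\mathbf{d}^0 = \mathbf{0}$, ADMM solves iteratively the multivariate optimization problem, equation 5, with alternating directions as (see Boyd et al., 2010; Benning et al., 2015; Aghamiry et al., 2019e,d, for more details)

\begin{align}
\mathbf{u}^{k+1} &= \arg\min_{\mathbf{u}} \Psi(\mathbf{u}, \mathbf{m}^k, \boldsymbol{\alpha}^k, \mathbf{b}^k, \mathbf{d}^k) \tag{9a} \\
\mathbf{m}^{k+1} &= \arg\min_{\mathbf{m} \in \mathcal{M}} \Psi(\mathbf{u}^{k+1}, \mathbf{m}, \boldsymbol{\alpha}^k, \mathbf{b}^k, \mathbf{d}^k) \tag{9b} \\
\boldsymbol{\alpha}^{k+1} &= \arg\min_{\boldsymbol{\alpha} \in \mathcal{A}} \Psi(\mathbf{u}^{k+1}, \mathbf{m}^{k+1}, \boldsymbol{\alpha}, \mathbf{b}^k, \mathbf{d}^k) \tag{9c} \\
\mathbf{b}^{k+1} &= \mathbf{b}^k + \mathbf{b} - \mathbf{A}(\mathbf{m}^{k+1}, \boldsymbol{\alpha}^{k+1}) \mathbf{u}^{k+1} \tag{9d} \\
\mathbf{d}^{k+1} &= \mathbf{d}^k + \mathbf{d} - \mathbf{P} \mathbf{u}^{k+1}, \tag{9e}
\end{align}

where

\[ \Psi(\mathbf{u}, \mathbf{m}, \boldsymbol{\alpha}, \mathbf{b}^k, \mathbf{d}^k) = \mu \|\mathbf{m}\|_{TV} + \nu \|\boldsymbol{\alpha}\|_{TV} + \lambda \|\mathbf{b}^k + \mathbf{b} - \mathbf{A}(\mathbf{m}, \boldsymbol{\alpha}) \mathbf{u}\|_2^2 + \gamma \|\mathbf{d}^k + \mathbf{d} - \mathbf{P}\mathbf{u}\|_2^2, \tag{10} \]

is the augmented Lagrangian function written in scaled form (Appendix A), $\cdot^k$ is the value of $\cdot$ at iteration $k$, the scalars $\lambda, \gamma > 0$ are the penalty parameters assigned to the wave-equation and observation-equation constraints, respectively, and $\mathbf{b}^k$, $\mathbf{d}^k$ are the scaled Lagrange multipliers, which are updated through a dual ascent scheme by the running sum of the constraint violations (source and data residuals) as shown by equations 9d-9e. The penalty parameters $\lambda, \gamma > 0$ can be tuned in equation 10 such that a dominant weight $\gamma$ is given to the observation equation at the expense of the wave equation during the early iterations to guarantee the data fit, while the iterative update of the Lagrange multipliers progressively corrects the errors introduced by these penalizations such that both the observation equation and the wave equation are satisfied at the convergence point with acceptable accuracies. In the next three subsections, we show how to solve each optimization subproblem 9a-9c.

Update wavefield (subproblem 9a)

The objective function $\Psi$ is quadratic in $\mathbf{u}$ and its minimization gives the following closed-form expression of $\mathbf{u}$

\[ \left( \lambda \mathbf{A}^T \mathbf{A} + \gamma \mathbf{P}^T \mathbf{P} \right) \mathbf{u}^{k+1} = \lambda \mathbf{A}^T (\mathbf{b}^k + \mathbf{b}) + \gamma \mathbf{P}^T (\mathbf{d}^k + \mathbf{d}), \tag{11} \]

where $\mathbf{A} \equiv \mathbf{A}(\mathbf{m}^k, \boldsymbol{\alpha}^k)$ and $\mathbf{A}^T$ denotes the Hermitian transpose of $\mathbf{A}$.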
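The wavefield update of equation 11 can be sketched on a toy problem as follows. This is only an illustration under stated assumptions: the matrix A is a small random complex matrix standing in for the discretized Helmholtz operator, P samples two arbitrary grid points, and the function name, sizes and penalty values are hypothetical choices that do not come from the original study. A dense solve with NumPy is used here in place of the sparse direct solver that a realistic grid would require.

```python
import numpy as np


def update_wavefield(A, P, b, d, bk, dk, lam, gam):
    """Solve the normal equations of equation 11 for the wavefield u.

    A      : (N, N) complex matrix, stand-in for A(m^k, alpha^k)
    P      : (M, N) real sampling operator
    b, d   : source term and recorded data
    bk, dk : scaled Lagrange multipliers b^k and d^k
    lam    : penalty parameter of the wave-equation constraint
    gam    : penalty parameter of the observation-equation constraint
    """
    AH = A.conj().T  # Hermitian transpose of A
    lhs = lam * (AH @ A) + gam * (P.T @ P)
    rhs = lam * (AH @ (bk + b)) + gam * (P.T @ (dk + d))
    return np.linalg.solve(lhs, rhs)


rng = np.random.default_rng(0)
N, M = 6, 2  # toy numbers of grid points and receivers

# Hypothetical stand-ins for the PDE operator, receivers, source and data.
A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
P = np.zeros((M, N))
P[0, 1] = 1.0
P[1, 4] = 1.0
b = rng.standard_normal(N) + 1j * rng.standard_normal(N)
d = rng.standard_normal(M) + 1j * rng.standard_normal(M)

# Multipliers are zero at the first iteration (b^0 = 0, d^0 = 0).
bk = np.zeros(N, dtype=complex)
dk = np.zeros(M, dtype=complex)

# Dominant weight on the observation equation versus on the wave equation.
u_data = update_wavefield(A, P, b, d, bk, dk, lam=1.0, gam=100.0)
u_pde = update_wavefield(A, P, b, d, bk, dk, lam=1.0, gam=0.01)

print("data residual, gam = 100 :", np.linalg.norm(P @ u_data - d))
print("data residual, gam = 0.01:", np.linalg.norm(P @ u_pde - d))
```

Comparing the two printed residuals illustrates the trade-off discussed for equation 10: increasing the weight of the observation equation relative to that of the wave equation yields a reconstructed wavefield that fits the data more closely, at the price of a larger source residual.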

Update squared slowness (subproblem 9b)

The PDE operator

\[ \mathbf{A}(\mathbf{m}, \boldsymbol{\alpha}) = \boldsymbol{\Delta} + \omega^2 \mathbf{C} \, \mathrm{diag}(\mathbf{m} \circ \boldsymbol{\rho}(\boldsymbol{\alpha})) \mathbf{B} \tag{12} \]

is discretized with the finite-difference method of Chen et al. (2013), where $\boldsymbol{\Delta}$ is the discretized Laplace operator, $\mathbf{C}$ introduces boundary conditions such as perfectly-matched layers (Bérenger, 1994), $\mathbf{B}$ is the mass matrix (Marfurt, 1984), which spreads the mass term $\omega^2 \mathbf{C} \, \mathrm{diag}(\mathbf{m} \circ \boldsymbol{\rho}(\boldsymbol{\alpha}))$ over all the coefficients of the stencil to improve its accuracy following an anti-lumped mass strategy, and $\mathrm{diag}(\cdot)$ denotes a diagonal matrix. From equation 12 we get that

\[ \mathbf{A}(\mathbf{m}, \boldsymbol{\alpha}) \mathbf{u} = \mathbf{A}(\mathbf{0}, \boldsymbol{\alpha}) \mathbf{u} + \omega^2 \mathbf{C} \, \mathrm{diag}(\mathbf{B}\mathbf{u} \circ \boldsymbol{\rho}(\boldsymbol{\alpha})) \mathbf{m}, \tag{13} \]

where $\mathbf{A}(\mathbf{0}, \boldsymbol{\alpha}) \equiv \boldsymbol{\Delta}$. Therefore, subproblem 9b can be written as

\[ \mathbf{m}^{k+1} = \arg\min_{\mathbf{m} \in \mathcal{M}} \; \mu \|\mathbf{m}\|_{TV} + \lambda \|\mathbf{L}\mathbf{m} - \mathbf{y}^k\|_2^2, \tag{14} \]

where

\[ \mathbf{L} = \omega^2 \mathbf{C} \, \mathrm{diag}(\mathbf{B}\mathbf{u}^{k+1} \circ \boldsymbol{\rho}(\boldsymbol{\alpha}^k)), \tag{15} \]

and

\[ \mathbf{y}^k = \mathbf{b}^k + \mathbf{b} - \boldsymbol{\Delta} \mathbf{u}^{k+1}. \tag{16} \]

Equation 14 describes the box-constrained TV-regularized subproblem for $\mathbf{m}$, which is convex but non-smooth. This box-constrained TV-regularized problem can be solved efficiently with ADMM and splitting methods, also referred to as the split Bregman method (Goldstein and Osher, 2009). Using splitting methods, the unconstrained subproblem 14 is recast as a multivariate constrained problem, through the introduction of auxiliary variables. These auxiliary variables are introduced to decouple the ℓ2 subproblem from the ℓ1 subproblem such that they can be solved in alternating mode with ADMM. Moreover, a closed-form expression of the auxiliary variables is easily obtained by solving the ℓ1 subproblem with proximity operators (Combettes and Pesquet, 2011; Parikh and Boyd, 2013). We refer the reader to Appendix B for a more detailed review of this method.

Update attenuation factor (subproblem 9c)

Subproblem 9c is nonlinear due to the nonlinearity of the PDE with respect to $\boldsymbol{\alpha}$. We linearize this subproblem by using a first-order approximation of $\boldsymbol{\rho}(\boldsymbol{\alpha})$, equation 4, as

\[ \boldsymbol{\rho}(\boldsymbol{\alpha}) \approx 1 + 2 \beta(\omega) \boldsymbol{\alpha}, \tag{17} \]

which is accurate for $\boldsymbol{\alpha} \ll 1$ (Hak and Mulder, 2011), and gives

\[ \mathbf{A}(\mathbf{m}, \boldsymbol{\alpha}) \mathbf{u} \approx \mathbf{A}(\mathbf{m}, \mathbf{0}) \mathbf{u} + 2 \omega^2 \beta(\omega) \mathbf{C} \, \mathrm{diag}(\mathbf{B}\mathbf{u} \circ \mathbf{m}) \boldsymbol{\alpha}. \tag{18} \]

Accordingly, subproblem 9c can be written as the following linear problem

\[ \boldsymbol{\alpha}^{k+1} \approx \arg\min_{\boldsymbol{\alpha} \in \mathcal{A}} \; \nu \|\boldsymbol{\alpha}\|_{TV} + \lambda \|\mathbf{H}\boldsymbol{\alpha} - \mathbf{h}^k\|_2^2, \tag{19} \]

where

\[ \mathbf{H} = 2 \omega^2 \beta(\omega) \mathbf{C} \, \mathrm{diag}(\mathbf{B}\mathbf{u}^{k+1} \circ \mathbf{m}^{k+1}), \tag{20} \]

and

\[ \mathbf{h}^k = \mathbf{b}^k + \mathbf{b} - \mathbf{A}(\mathbf{m}^{k+1}, \mathbf{0}) \mathbf{u}^{k+1}. \tag{21} \]

Equation 19 is also a box-constrained TV-regularized convex problem, which can be solved with ADMM (Appendix A) in a manner similar to the previous subproblem for squared slowness. It is important to stress that the errors generated by the first-order approximation of $\boldsymbol{\rho}(\boldsymbol{\alpha})$ during the update of $\boldsymbol{\alpha}$ are iteratively compensated by the action of the scaled Lagrange multiplier $\mathbf{b}^k$. These Lagrange multipliers are formed by the running sum of the wave equation errors, which are computed with the exact wave equation operator (namely, without linearization of $\boldsymbol{\rho}(\boldsymbol{\alpha})$).

The overall workflow described above is summarized in Algorithm 1.

Practical implementation

The ADMM optimization that is used to solve equations 14 and 19 is reviewed in Appendix B. We also refer the reader to Goldstein and Osher (2009), Boyd et al. (2010) and Aghamiry et al. (2019d) as a complement. A key property of the ADMM algorithm, equation 9, is that, at iteration $k$, we do not need to solve each optimization subproblem 9a-9c exactly via inner iterations. The intuitive reason is that the updating of the primal variable performed by one subproblem is hampered by the errors of the other primal variables that are kept fixed. In this framework, the errors at each iteration $k$ are more efficiently compensated by the gradient-ascent update of the Lagrange multipliers (dual variables). This statement was corroborated by numerical experiments which showed that one (inner) iteration of each subproblem per ADMM cycle $k$ generates solutions which are accurate enough to guarantee the fastest convergence of the ADMM algorithm (Goldstein and Osher, 2009; Boyd et al., 2010; Aghamiry et al., 2019d; Gholami et al., 2018; Aghamiry et al., 2019b). Moreover, this error compensation is more efficient when the dual variables are updated after each primal subproblem 9a-9c rather than at the end of an iteration $k$ as indicated in Algorithm 1 for the sake of compactness. This variant of ADMM is referred to as the Peaceman-Rachford splitting method (Peaceman and Rachford, 1955; He et al., 2014) and will be used in the following numerical experiments. The reader is referred to Aghamiry et al. (2019e) for more details about the improved convergence of the Peaceman-Rachford splitting method compared to ADMM in the framework of IR-WRI.

We follow the guideline presented in Aghamiry et al. (2019d, section 3.1) for tuning the penalty parameters. The overall procedure is as follows: we first set $\mu = 0.6$ and $\nu = 0.4$ to tune the relative weight of the regularizations of the squared slowness ($\mathbf{m}$) and attenuation factor ($\boldsymbol{\alpha}$), equation 10. Then, we set the ratios $\mu/\xi_m$ and $\nu/\xi_\alpha$, lines 7, 9, 10, 12 in Algorithm 1, to tune the soft thresholding performed by the TV regularization of $\mathbf{m}$ and $\boldsymbol{\alpha}$ (subproblems 9b and 9c). We refer the reader to Appendix B for the role of the penalty parameters $\xi_m$ and $\xi_\alpha$ in TV regularization (denoted generically by $\xi$ in equation B-3). We set $\mu/\xi_m$ and $\nu/\xi_\alpha$ equal to $0.02 \times \max(\mathbf{r})$, equations B-19. These values can be refined according to our prior knowledge of the subsurface medium. Then, we set a constant $\lambda$ to balance the relative weight of the regularization and the wave-equation misfit function during the parameter-estimation subproblems, equations 14 and 19, and lines 5 and 6 in Algorithm 1. If necessary, $\lambda$ can be increased during iterations to mitigate the imprint of the regularization near the convergence point. Finally, we set $\gamma$ for wavefield reconstruction such that $\lambda/\gamma$ is a small fraction of the highest eigenvalue of the regularized normal operator, equation 11 and line 1 in Algorithm 1 (van Leeuwen and Herrmann, 2016). This parameter $\gamma$ can be kept constant during iterations. The reader is referred to Aghamiry et al. (2019e) for an analysis of the sensitivity of IR-WRI to the choice of the weight balancing the role of the observation equation and the wave equation during wavefield reconstruction.

NUMERICAL EXAMPLES

Simple inclusions test

We first consider a simple 2D example to validate viscoacoustic IR-WRI without and with TV regularization. The true velocity model is a homogeneous background model with a wavespeed of 1.5 km/s, which contains two inclusions: a 250-m-diameter circular inclusion at position (1 km, 1.6 km) with a wavespeed of 1.8 km/s and a 0.2 × 0.8 km rectangular inclusion at the center of the model with a wavespeed of 1.3 km/s (Figure 1a). Also, the true α model is a homogeneous background model with α = 0.01 (Q = 100), which contains two inclusions, both of them with α = 0.1 (Q = 10). The first is a 250-m-diameter circular inclusion at position (1 km, 0.4 km) and the second is a 0.2 × 0.8 km rectangular inclusion at the center of the model (Figure 1b). The wavespeed and α rectangular inclusions share the same size and position, while the positions of the wavespeed and α circular inclusions differ in order to test different parameter trade-off scenarios. Vertical and horizontal logs which cross the center of the v and α models are plotted on the left and bottom sides of the models, respectively, for all the figures of this test. An ideal acquisition is used with 8 sources and 200 receivers along the four edges of the model, and three frequency components (2.5, 5, 7 Hz) are jointly inverted using a maximum of 30 iterations as stopping criterion.

We performed viscoacoustic IR-WRI without and with TV regularization starting from homogeneous v and α models with v = 1.5 km/s and α = 0, respectively. The final (v, α) models estimated by IR-WRI without and with TV regularization are shown in Figures 1c-d and 1e-f, respectively. The IR-WRI results without TV regularization show an acceptable velocity model and a quite noisy attenuation reconstruction (Figure 1c-d). The velocity reconstruction is however hampered by significant limited-bandwidth effects. We also observe underestimated wavespeeds in the rectangular inclusion along the horizontal log. These underestimated wavespeeds are clearly correlated with underestimated α values (overestimated Q), hence highlighting some trade-off between v and α. Other moderate trade-off artifacts are visible in the vertical log of the α model, which shows undesired high-frequency perturbations at the position of the circular velocity inclusion. The TV regularization removes very efficiently all of these pathologies: it extends the wavenumber bandwidth of the models and removes to a large extent the parameter cross-talk as the subsurface medium perfectly matches the piecewise-constant prior associated with TV regularization (Figure 1e-f). Note however that α remains slightly underestimated in the circular inclusion (Figure 1f). This correlates with a barely-visible velocity underestimation at this location in Figure 1e. The relative magnitude of these errors gives some insight on the relative sensitivity of the data to m and α.

Figure 1: (a) True velocity model. (b) True attenuation model. (c-d) Reconstructed velocity (c) and attenuation (d) from viscoacoustic IR-WRI without TV regularization. (e-f) Same as (c-d) when TV regularization is applied. Profiles of the true (blue) and reconstructed (red) models running across the center of the models are shown on the left and bottom of the reconstructed models.

Synthetic North Sea case study

Experimental setup

We consider a more realistic 16 km × 5.2 km shallow-water synthetic model representative of the North Sea (Munns, 1985). The true v and α models are shown in Figures 2a and 2b, respectively. The velocity model is formed by soft sediments in the upper part, a pile of low-velocity gas layers above a chalk reservoir, the top of which is indicated by a sharp positive velocity contrast at around 2.5 km depth, and a flat reflector at 5 km depth (Figure 2a). The α model has two highly attenuative zones in the upper soft sediments and gas layers, and the α value is relatively low elsewhere (Figure 2b).

Figure 2: North Sea case study. (a) True v model. (b) True α model. (c) Initial v model.

The fixed-spread surface acquisition consists of 320 explosive sources spaced 50 m apart at 25 m depth and 80 hydrophone receivers spaced 200 m apart on the sea floor at 75 m depth. For the sake of computational efficiency, we use the spatial reciprocity of Green's functions to process sources as receivers and vice versa. A free-surface boundary condition is used on top of the grid and the source signature is a Ricker wavelet with a 10 Hz dominant frequency. We perform forward modeling with a nine-point stencil finite-difference method implemented with anti-lumped mass and PML absorbing boundary conditions, where the stencil coefficients are optimized to the frequency (Chen et al., 2013). We solve the normal-equation system for wavefield reconstruction, equation 11, with a sparse direct solver (Duff et al., 1986). The initial model for v is a highly Gaussian filtered version of the true model (Figure 2c), while the starting α model is homogeneous with α = 0. The common-shot gathers computed in the true and initial models are compared in Figure 3 for a shot located at 16.0 km. The latter mainly show the direct wave and the diving waves, which are highly cycle skipped relative to those computed in the true model.

We perform the inversion with small batches of three frequencies with one frequency overlap between two consecutive batches, moving from the low frequencies to the higher ones according to a classical frequency continuation strategy. The starting and final frequencies are 3 Hz and 15 Hz and the sampling interval in each batch is 0.5 Hz. We perform three paths through the frequency batches to improve the results, using the final model of one path as the initial model of the next one (these paths can be viewed as outer iterations of the algorithm). The starting and finishing frequencies of the paths are [3, 6], [4, 10], [6, 15] Hz respectively, where the first and second elements of each pair show the starting and finishing frequencies, respectively. Also, we used batches of four frequencies with two frequencies overlap during the second path and five frequencies with three frequencies overlap during the third path. The motivation behind this frequency management is to keep the bandwidth of each batch narrow during the first path in order to mitigate nonlinearities through a progressive frequency continuation, before broadening this bandwidth during the second and third paths to strengthen the imprint of dispersion in the inversion and decouple velocity and attenuation more efficiently.

Comparison of FWI and WRI objective functions

Before showing the inversion results, it is worth illustrating how WRI extends the search space of FWI for this North Sea case study. For this purpose, we compare the shape of the classical FWI misfit function based upon the ℓ2 norm of the data residuals (e.g. Pratt et al., 1998) with that of the parameter-estimation WRI subproblem for the 3 Hz frequency and for a series of v and α models

that are generated according to

$$v_a = v_{true} + a^2 (v_{init} - v_{true}), \tag{22a}$$

$$\alpha_b = \alpha_{true} + b^2 (\alpha_{init} - \alpha_{true}), \tag{22b}$$

where −1 ≤ a, b ≤ 1. In this case, $(v_a, \alpha_b)$ lies on the line segment joining the initial point $(v_{init}, \alpha_{init})$ and the final point $(v_{true}, \alpha_{true})$. We set $v_{true}$ and $\alpha_{true}$ as those shown in Figures 2(a-b), $v_{init}$ as the initial v model shown in Figure 2c, and $\alpha_{init}$ as a homogeneous model with α = 0.004 (Q = 250). The misfit functions of FWI and WRI are shown in Figures 4a and 4b, respectively. The FWI objective function exhibits spurious local minima with respect to both velocity and attenuation (a and b dimensions), while only one minimum is seen in the WRI objective function. This highlights the search space expansion generated by the wave-equation relaxation during WRI, which shows up as a wider and flatter attraction basin compared to that of FWI. This wide attraction basin exacerbates the contrast between the sensitivities of the objective function to velocity and attenuation, with a quite weak sensitivity to the latter. These contrasted sensitivities would likely make the parameter estimation subproblem poorly scaled if the velocity and attenuation were jointly updated during WRI based on variable projection (van Leeuwen and Herrmann, 2016). In this context, the alternating-direction strategy can be viewed as a heuristic to overcome this scaling issue. Indeed, the lack of sensitivity to attenuation during the early WRI iterations requires aggressive regularization and bound constraints to stabilize the attenuation estimation. As WRI proceeds over iterations and the wave equation constraint is satisfied more accurately, the inversion should recover a significant sensitivity to attenuation, such as that highlighted in Figure 4a, allowing for a relaxation of the regularization. These statements highlight the need to reconcile search space expansion to manage nonlinearity and regularization plus bound constraints to restrict the range of feasible solutions for α.

Figure 3: Time domain seismograms computed in the true and initial models. The true seismograms are shown in the left and right panels, while those computed in the initial model are shown in the middle panel with a mirror representation such that the two sets of seismograms can be compared at long and short offsets. The seismograms are plotted with a reduction velocity of 2.5 km/s for the sake of time axis compression.

Figure 4: The objective function for the 3 Hz wavefield as a function of $v_a$ and $\alpha_b$ generated using equation 22. (a) Classical reduced-space FWI. (b) WRI.

Viscoacoustic FWI results

To show the need for search space expansion and regularization, we first perform a classical viscoacoustic FWI for noiseless data (e.g. Kamei and Pratt, 2013; Operto and Miniussi, 2018). We use squared slowness and attenuation as optimization parameters and update them simultaneously with the L-BFGS quasi-Newton optimization and a procedure for step length estimation (that satisfies the ). Neither TV regularization nor bound constraints are applied. Owing to the limited kinematic accuracy of the initial models highlighted by the seismogram mismatches in Figure 3, the reconstruction of the velocity model remains stuck in a local minimum during the first frequency batch inversion (Figure 5a), while the estimated attenuation model shows unrealistic values due to the lack of bound constraints (Figure 5b). This failure prompts us to stop the inversion at this stage.

Figure 5: Viscoacoustic FWI results after inverting the first frequency batch. (a) Reconstructed velocity model. (b) Reconstructed attenuation model. TV regularization and bound constraints are not applied.

Viscoacoustic IR-WRI results

We now update v and α with IR-WRI according to the optimization workflow described in Algorithm 1. IR-WRI is performed without and with bound constraints + TV regularization (referred to as BTV regularization in the following). The lower and upper bounds are 1.2 km/s and 4 km/s for velocities, and 0.001 and 0.025 for α. For each case, the inversion of a batch stops either when a maximum iteration count of 20 is reached or when

$$\sum \|A(m^{k+1}, \alpha^{k+1}) u^{k+1} - b\|_2^2 \le \epsilon_b \quad \text{and} \quad \sum \|P u^{k+1} - d\|_2^2 \le \epsilon_d, \tag{23}$$

where the sums run over the frequencies of the current batch, $\epsilon_b = 10^{-3}$ and $\epsilon_d = 10^{-5}$. We start with the inversion of noiseless data. The final v and α models, estimated by IR-WRI without and with BTV regularization, are shown in Figure 6(a-d) after 360 and 321 iterations, respectively. A direct comparison between the logs extracted from the true models, the initial model and the IR-WRI velocity models reconstructed without/with BTV regularization at x = 3.5 km, x = 8.0 km and x = 12.0 km is shown in Figure 7a. Although a crude initial velocity model was used, the velocities in the shallow sedimentary cover and the gas layers are fairly well reconstructed in both cases (Figure 6a,c). Also, IR-WRI without BTV regularization manages to reconstruct an acceptable velocity model, unlike classical FWI (Figure 5). The main differences between the IR-WRI velocity models built with and without BTV regularization are shown at the reservoir level and below. Without BTV regularization, the top of the reservoir is mispositioned (Figure 7a, x = 8.0 km, green versus red curves) and the inversion fails to reconstruct the smoothly-decreasing velocity below it due to the lack of diving wave illumination at these depths (Figure 7a). This in turn prevents the focusing of the deep reflector at 5 km depth by migration of the associated short-spread reflections (Figure 6a). When BTV regularization is used, viscoacoustic IR-WRI provides a more accurate and cleaner image of the reservoir and better reconstructs the sharp contrast on top of it (Figure 6c). As expected, the TV regularization replaces the smoothly-varying velocities below the reservoir (between 3 and 5 km depth) by a piecewise-constant layer due to the lack of wave illumination in this part of the model (Figure 7a, green curves). However, this does not prevent a fairly accurate reconstruction of the deep reflector at 5 km depth.

A direct comparison between the logs extracted from the true and the IR-WRI attenuation models at x = 3.5 km, x = 8.0 km and x = 12.0 km is shown in Figure 7b. The reconstruction of α without BTV regularization is quite unstable, with an oscillating trend and overestimated values (Figure 6b and red curves in Figure 7b). This highlights fairly well the ill-posedness of the attenuation reconstruction. In contrast, the α model reconstructed with BTV regularization captures the large-scale attenuation trend in the shallow sedimentary cover, in the gas layers and below the reservoir (Figure 6d and green curves in Figure 7b). We note however that the attenuation is underestimated on top of the gas layers between 1 km and 1.4 km depth (Figure 7b, x = 8.0 km, green curve). This error in the attenuation reconstruction might be correlated with a subtle underestimation of velocities at these depths (Figure 7a, x = 8 km, green curve). This might indicate on the one hand some mild amplitude-related cross-talk effects between velocities and attenuation and on the other hand the higher sensitivity of the data to velocities compared to attenuation (in the sense that a small error in the velocity contrasts can compensate for more significant attenuation errors). Similar cross-talk artifacts have been previously discussed during the toy inclusion test (Figure 1e-f).

We continue by assessing the resilience of the proposed viscoacoustic IR-WRI to noise when data are contaminated with Gaussian random noise with SNR = 10 dB. Here, SNR is defined based on the root mean square (RMS) amplitude of the signal and that of the noise as

$$\mathrm{SNR} = 20 \log_{10}\left(\frac{\text{Signal RMS Amplitude}}{\text{Noise RMS Amplitude}}\right). \tag{24}$$

We use the same setup and the same initial models as those used for the noiseless case. The stopping criterion is defined by equation 23, where $\epsilon_d$ is now set to the noise level. The final models of IR-WRI obtained without and with BTV regularization are shown in Figure 6(e-h). The total numbers of IR-WRI iterations are 196 and 185, respectively, for these results. In a similar manner to the noiseless case, a direct comparison between the logs extracted from the true models, the initial model and the IR-WRI velocity models at x = 3.5 km, x = 8.0 km and x = 12.0 km is shown in Figure 8. Overall, a similar trend as for the noiseless case is shown. However, the presence of noise in the data leads to a mispositioning of the reservoir at 8 km distance in the BTV IR-WRI velocity model, which was not observed in the noiseless case (compare Figures 7a and 8a, x = 8 km, green curves). This mispositioning of the reservoir may be correlated with a poorer reconstruction of the attenuating gas layers between 1 km and 2.5 km depth (compare Figures 7b and 8b, x = 8 km, green curves).

We also compute a common-shot gather for a shot located at 16.0 km in the IR-WRI models inferred from noiseless/noisy data with/without BTV regularization (Figure 9). The time-domain seismograms computed in the IR-WRI models obtained without regularization (Figures 9a and 9c) show underestimated amplitudes and do not match late dispersive arrivals due to the overestimated and oscillating values of α (Figures 6b and 7b for noiseless data and Figures 6f and 8b for noisy data). In the case of noiseless data, the bound constraints and the TV regularization allow for a high-quality data fit (Figure 9b), consistent with the accuracy of the models shown in Figure 6c-d. In the case of noisy data, the bound constraints and the TV regularization significantly improve the data match (compare Figure 9c and Figure 9d). However, the imprint of the cross-talk artifacts mentioned above is clearly seen at long offsets with a degraded fit of deeply-propagating waves (for example, the refracted wave from the deep reflector at around 1.5 s traveltime and the late dispersive waves at around 4 s traveltime) relative to the noiseless-data results (compare Figure 9b and Figure 9d).

CONCLUSIONS

We extended the recently proposed ADMM-based iteratively-refined wavefield reconstruction inversion (IR-WRI) for attenuation imaging by inversion of viscoacoustic wavefields. The proposed viscoacoustic IR-WRI treats the nonlinear viscoacoustic waveform inversion as a multiconvex optimization problem. To achieve this goal, the original nonlinear multi-parameter problem for squared slowness and attenuation factor is replaced by three recursive linear mono-parameter subproblems for wavefield, squared slowness and attenuation factor that are solved in alternating mode at each IR-WRI iteration. The attenuation-reconstruction subproblem requires the introduction of an approximate multilinear viscoacoustic wave equation in wavefield, squared slowness, and attenuation factor. However, the errors generated by this approximate viscoacoustic wave equation during the attenuation reconstruction are efficiently compensated by the Lagrange multipliers (namely, the running sum of the wave equation errors) that are computed with the exact viscoacoustic equation. This new formulation first has the flexibility to tailor the regularization to the squared slowness and attenuation factor. Moreover, it simplifies the multi-parameter optimization workflow and mitigates its computational cost since the original poorly-scaled multi-parameter inversion is recast as two interlaced mono-parameter inversions. A realistic synthetic example suggests that the search space extension embedded in IR-WRI efficiently mitigates cycle skipping when a crude initial velocity model is used, while the TV-regularized alternating-direction optimization reasonably manages the cross-talks between squared slowness and attenuation as well as the limited sensitivity of the data to the attenuation.

APPENDIX A

SCALED FORM OF AUGMENTED LAGRANGIAN

In this appendix, we briefly review how an augmented Lagrangian (AL) function, such as the one shown in equation 10, is used to solve a constrained problem with the method of multipliers (Nocedal and Wright, 2006b, Chapter 17). Let us start with the following constrained problem

$$\min_x \|P(x)\|_2^2 \quad \text{subject to} \quad Q(x) = 0. \tag{A-1}$$

The AL function associated with problem A-1 combines a Lagrangian function and a penalty function as

$$\mathcal{L}_A(x, v) = \underbrace{\|P(x)\|_2^2 + \langle v, Q(x) \rangle}_{\text{Lagrangian}} + \underbrace{\frac{\xi}{2} \|Q(x)\|_2^2}_{\text{Augmentation}}, \tag{A-2}$$

where $\langle \cdot, \cdot \rangle$ denotes the inner product, and v and ξ denote the Lagrange multiplier (dual variable) and the penalty parameter, respectively. This AL function can be written in a compact form by introducing the scaled dual variable $q = -v/\xi$ and adding and subtracting the term $\frac{\xi}{2}\|q\|_2^2$ to the function A-2 (see Boyd et al., 2010, page 15 for more details):

$$\mathcal{L}_A(x, q) = \|P(x)\|_2^2 - \xi \langle q, Q(x) \rangle + \frac{\xi}{2}\|Q(x)\|_2^2 + \frac{\xi}{2}\|q\|_2^2 - \frac{\xi}{2}\|q\|_2^2 = \|P(x)\|_2^2 + \frac{\xi}{2}\|Q(x) - q\|_2^2 - \frac{\xi}{2}\|q\|_2^2. \tag{A-3}$$

Equation A-3 shows that the augmented Lagrangian method can be seen as a penalty method with an error correction term in the penalty function, $\frac{\xi}{2}\|Q(x) - q\|_2^2$, corresponding to the scaled Lagrange multipliers. This correction term controls how well the constraint is satisfied at the convergence point. In the framework of the method of multipliers, the AL function is minimized with respect to the primal variable x and maximized with respect to the scaled dual variable q in alternating mode. Expression A-3 shows that the dual variable is simply updated with the constraint violation when a gradient ascent method is used. This recipe has been used to derive equation 9 with ADMM (AL method with alternating update of multiple classes of primal variable). The reader is referred to Aghamiry et al. (2019e) for the detailed development.
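The alternating primal/dual recipe of Appendix A can be illustrated on a toy problem. The following is a minimal sketch, assuming $P(x) = x - c$ and a linear constraint $Q(x) = Ax - b$; the function name `method_of_multipliers`, the toy data and the default penalty value are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def method_of_multipliers(c, A, b, xi=1.0, n_iter=60):
    """Minimise ||x - c||^2 subject to A x = b using the scaled AL form (A-3).

    Hypothetical toy setting: P(x) = x - c and Q(x) = A x - b.
    """
    q = np.zeros(b.size)                       # scaled dual variable, q = -v / xi
    lhs = 2.0 * np.eye(c.size) + xi * A.T @ A  # normal-equation matrix of the x-step
    for _ in range(n_iter):
        # Primal step: argmin_x ||x - c||^2 + (xi / 2) ||A x - b - q||^2
        x = np.linalg.solve(lhs, 2.0 * c + xi * A.T @ (b + q))
        # Dual step: gradient ascent, i.e. accumulate the constraint violation
        q = q + b - A @ x
    return x, q

# Toy usage: project c = (3, 1) onto the line x1 + x2 = 2.
c = np.array([3.0, 1.0])
A = np.array([[1.0, 1.0]])
b = np.array([2.0])
x, q = method_of_multipliers(c, A, b)
```

In this sketch the dual variable is a running sum of the constraint residuals, which mirrors the form of the update in line 19 of Algorithm 1. A finite penalty ξ is enough to satisfy the constraint at convergence because the correction term q absorbs the remaining violation.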

Figure 6: Viscoacoustic IR-WRI results. (a-d) Noiseless data. (a-b) Velocity (a) and attenuation (b) models reconstructed without BTV regularization. (c-d) Same as (a-b) when BTV regularization is applied. (e-h) Same as (a-d) for noisy data (SNR=10db).
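The noisy-data panels rely on the SNR definition of equation 24. As a minimal sketch, assuming real-valued traces and a hypothetical helper name `add_gaussian_noise` (neither is specified in the paper), Gaussian noise can be scaled to a target SNR as follows:

```python
import numpy as np

def rms(values):
    """Root-mean-square amplitude of an array."""
    return np.sqrt(np.mean(np.abs(values) ** 2))

def snr_db(signal, noise):
    """SNR in dB as in equation 24: 20 log10(signal RMS / noise RMS)."""
    return 20.0 * np.log10(rms(signal) / rms(noise))

def add_gaussian_noise(data, target_snr_db, rng):
    """Return data contaminated with Gaussian noise at the requested SNR (dB)."""
    noise = rng.standard_normal(data.shape)
    # Rescale the noise so that its RMS amplitude matches the target exactly.
    target_noise_rms = rms(data) / 10.0 ** (target_snr_db / 20.0)
    noise *= target_noise_rms / rms(noise)
    return data + noise

# Usage with a synthetic trace (illustrative only).
rng = np.random.default_rng(0)
trace = np.sin(np.linspace(0.0, 20.0, 500))
noisy_trace = add_gaussian_noise(trace, 10.0, rng)
```

With SNR = 10 dB the noise RMS amplitude is about 0.32 times the signal RMS amplitude, since $10^{-10/20} \approx 0.316$.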

Figure 7: For noiseless data, direct comparison along the logs at x = 3.5 km (left), x = 8.0 km (center) and x = 12.0 km (right) between the true model (black), the initial model (dashed line) and the IR-WRI models without (red) and with (green) BTV regularization (Figure 6(a-d)). (a) Estimated v, (b) estimated α.

Figure 8: Same as Figure 7 but for noisy data with SNR = 10 dB. The corresponding IR-WRI models are shown in Figure 6(e-h).

Figure 9: Time domain seismograms computed in (a) IR-WRI without regularization and (b) BTV regularized IR-WRI for noiseless data (Figure 6a-6d). (c-d) Same as (a-b), but for noisy data (Figure 6e-6h). The true seismograms are shown in the first and the last panel of the above-mentioned seismograms (folded) to allow a comparison at short and long offsets. The seismograms are plotted with a reduction velocity of 2.5 km/s for the sake of time axis compression.

APPENDIX B

BOUND CONSTRAINED TV-REGULARIZATION USING ADMM

In this appendix, we review step by step how to solve a bound-constrained TV-regularized convex problem (such as those in equations 14 and 19) using variable splitting and ADMM (Boyd et al., 2010). Let us consider a general bound-constrained TV-regularized convex problem of the form

$$\min_{x \in \mathcal{X}} \sum_{i=1}^{N} \sqrt{|\nabla_x x|_i^2 + |\nabla_z x|_i^2} + \frac{\lambda}{2}\|Gx - y\|_2^2, \tag{B-1}$$

for some column vector y and matrix G. The model x is an N-length column vector, $\nabla_x$ and $\nabla_z$ are square first-order difference matrices, and $\mathcal{X}$ is the desired convex set. The penalty parameter λ > 0 balances the relative weight of the TV regularizer and the misfit term. Following Aghamiry et al. (2019d, section 2.2.2), equation B-1 can be solved via the following three easy tricks.

1) Variable splitting. Since the variable x appears simultaneously in the TV, misfit, and bounding terms, it "couples" these terms and makes it difficult to solve the problem. To decouple them, new auxiliary variables $p_x = \nabla_x x$, $p_y \in \mathcal{X}$, and $p_z = \nabla_z x$ are substituted in the TV term and the bound constraint, respectively, and their expressions as functions of the original variable x are introduced as new equality constraints. This recasts B-1 as the following constrained problem

$$\min_{x, p_x, p_y \in \mathcal{X}, p_z} \sum_{i=1}^{N} \sqrt{|p_x|_i^2 + |p_z|_i^2} + \frac{\lambda}{2}\|Gx - y\|_2^2 \quad \text{subject to} \quad \begin{cases} p_x = \nabla_x x, \\ p_y = x, \\ p_z = \nabla_z x. \end{cases} \tag{B-2}$$

2) Augmented Lagrangian. The second trick is to relax these new linking constraints with an augmented Lagrangian function. This recasts B-2 as the following min-max optimization problem

$$\begin{aligned} \min_{x, p_x, p_y \in \mathcal{X}, p_z} \max_{q_x, q_y, q_z} \; & \sum_{i=1}^{N} \sqrt{|p_x|_i^2 + |p_z|_i^2} + \frac{\lambda}{2}\|Gx - y\|_2^2 \\ & + \langle q_x, p_x - \nabla_x x \rangle + \frac{\xi}{2}\|p_x - \nabla_x x\|_2^2 \\ & + \langle q_y, p_y - x \rangle + \frac{\xi}{2}\|p_y - x\|_2^2 \\ & + \langle q_z, p_z - \nabla_z x \rangle + \frac{\xi}{2}\|p_z - \nabla_z x\|_2^2, \end{aligned} \tag{B-3}$$

where $q_x$, $q_y$, $q_z$ are Lagrange multipliers (dual variables), and ξ > 0 is the penalty parameter. Note that we used the same penalty parameter for all three constraints, but one may use a different parameter for each of them. The augmented Lagrangian method (a.k.a. method of multipliers, Hestenes, 1969) maximizes the objective in equation B-3 with respect to the dual variables iteratively by using a simple steepest ascent algorithm (with step length ξ)

$$q_x^{k+1} = q_x^k + \xi (p_x^{k+1} - \nabla_x x^{k+1}), \tag{B-4}$$

$$q_y^{k+1} = q_y^k + \xi (p_y^{k+1} - x^{k+1}), \tag{B-5}$$

$$q_z^{k+1} = q_z^k + \xi (p_z^{k+1} - \nabla_z x^{k+1}), \tag{B-6}$$

where $x^{k+1}$ and $p_x^{k+1}$, $p_y^{k+1}$, $p_z^{k+1}$ are obtained by solving

$$\begin{aligned} \arg\min_{x, p_x, p_y \in \mathcal{X}, p_z} \; & \sum_{i=1}^{N} \sqrt{|p_x|_i^2 + |p_z|_i^2} + \frac{\lambda}{2}\|Gx - y\|_2^2 \\ & + \langle q_x^k, p_x - \nabla_x x \rangle + \frac{\xi}{2}\|p_x - \nabla_x x\|_2^2 \\ & + \langle q_y^k, p_y - x \rangle + \frac{\xi}{2}\|p_y - x\|_2^2 \\ & + \langle q_z^k, p_z - \nabla_z x \rangle + \frac{\xi}{2}\|p_z - \nabla_z x\|_2^2, \end{aligned} \tag{B-7}$$

beginning with $p_x^0 = p_y^0 = p_z^0 = 0$.

Equations B-4 to B-7 can be simplified by using a change of variables ($q_x \leftarrow \frac{1}{\xi} q_x$, $q_y \leftarrow \frac{1}{\xi} q_y$, $q_z \leftarrow \frac{1}{\xi} q_z$), using the fact that for two real vectors a and b the following holds:

$$\langle a, b \rangle + \frac{\xi}{2}\|b\|_2^2 = \frac{\xi}{2}\left\|b + \frac{1}{\xi} a\right\|_2^2 - \frac{\xi}{2}\left\|\frac{1}{\xi} a\right\|_2^2. \tag{B-8}$$

Accordingly,

$$q_x^{k+1} = q_x^k + p_x^{k+1} - \nabla_x x^{k+1}, \tag{B-9}$$

$$q_y^{k+1} = q_y^k + p_y^{k+1} - x^{k+1}, \tag{B-10}$$

$$q_z^{k+1} = q_z^k + p_z^{k+1} - \nabla_z x^{k+1}, \tag{B-11}$$

and

$$\begin{aligned} \arg\min_{x, p_x, p_y \in \mathcal{X}, p_z} \; & \sum_{i=1}^{N} \sqrt{|p_x|_i^2 + |p_z|_i^2} + \frac{\lambda}{2}\|Gx - y\|_2^2 \\ & + \frac{\xi}{2}\|p_x - \nabla_x x + q_x^k\|_2^2 - \frac{\xi}{2}\|q_x^k\|_2^2 \\ & + \frac{\xi}{2}\|p_y - x + q_y^k\|_2^2 - \frac{\xi}{2}\|q_y^k\|_2^2 \\ & + \frac{\xi}{2}\|p_z - \nabla_z x + q_z^k\|_2^2 - \frac{\xi}{2}\|q_z^k\|_2^2. \end{aligned} \tag{B-12}$$

3) Alternating minimization. The basic augmented Lagrangian method minimizes the objective function in equation B-12 (augmented Lagrangian function) jointly over x, $p_x$, $p_y$, and $p_z$. The third trick is to perform this minimization by alternately minimizing with respect to each variable separately (Goldstein and Osher, 2009; Boyd et al., 2010) to arrive at the so-called ADMM:

$$x^{k+1} = \arg\min_x \frac{\lambda}{2}\|Gx - y\|_2^2 + \frac{\xi}{2}\|p_x^k - \nabla_x x + q_x^k\|_2^2 + \frac{\xi}{2}\|p_y^k - x + q_y^k\|_2^2 + \frac{\xi}{2}\|p_z^k - \nabla_z x + q_z^k\|_2^2, \tag{B-13}$$

$$(p_x^{k+1}, p_z^{k+1}) = \arg\min_{p_x, p_z} \sum_{i=1}^{N} \sqrt{|p_x|_i^2 + |p_z|_i^2} + \frac{\xi}{2}\|p_x - \nabla_x x^{k+1} + q_x^k\|_2^2 + \frac{\xi}{2}\|p_z - \nabla_z x^{k+1} + q_z^k\|_2^2, \tag{B-14}$$

$$p_y^{k+1} = \arg\min_{p_y \in \mathcal{X}} \frac{\xi}{2}\|p_y - x^{k+1} + q_y^k\|_2^2. \tag{B-15}$$

Subproblem B-13 is an easy-to-solve least-squares problem, which has a closed-form solution obtained by setting the derivative of the objective function with respect to x equal to zero:

$$x^{k+1} = \left[\lambda G^T G + \xi \nabla_x^T \nabla_x + \xi I + \xi \nabla_z^T \nabla_z\right]^{-1} \left[\lambda G^T y + \xi \nabla_x^T [p_x^k + q_x^k] + \xi [p_y^k + q_y^k] + \xi \nabla_z^T [p_z^k + q_z^k]\right]. \tag{B-16}$$

Subproblem B-14 also has a closed-form solution, given by the generalized soft thresholding function (Goldstein and Osher, 2009),

$$p_x^{k+1} = \max\left(1 - \frac{1/\xi}{r}, 0\right) \circ (\nabla_x x^{k+1} - q_x^k), \tag{B-17}$$

and

$$p_z^{k+1} = \max\left(1 - \frac{1/\xi}{r}, 0\right) \circ (\nabla_z x^{k+1} - q_z^k), \tag{B-18}$$

where

$$r = \sqrt{|\nabla_x x^{k+1} - q_x^k|^2 + |\nabla_z x^{k+1} - q_z^k|^2}. \tag{B-19}$$

Subproblem B-15 is a projection operator given by

$$p_y^{k+1} = \mathrm{proj}_{\mathcal{X}}(x^{k+1} - q_y^k). \tag{B-20}$$

In the case that $\mathcal{X}$ is a box set of the form

$$\mathcal{X} = \{x \mid x_{min} \le x \le x_{max}\}, \tag{B-21}$$

the projection operator admits the closed-form solution

$$\mathrm{proj}_{\mathcal{X}}(x^{k+1} - q_y^k) = \min(\max(x^{k+1} - q_y^k, x_{min}), x_{max}),$$

where $x_{min}$ and $x_{max}$ are the lower and upper bounds of x, respectively.

REFERENCES

Aghamiry, H., A. Gholami, and S. Operto, 2019a, Accurate and efficient wavefield reconstruction in the time domain: Geophysics, 85(2), 1–6.
Aghamiry, H., A. Gholami, and S. Operto, 2019b, ADMM-based multi-parameter wavefield reconstruction inversion in VTI acoustic media with TV regularization: Geophysical Journal International, 219, 1316–1333.
Aghamiry, H., A. Gholami, and S. Operto, 2019c, Compound regularization of Full-Waveform Inversion for imaging piecewise media: IEEE Transactions on Geoscience and Remote Sensing, 10.1109/TGRS.2019.2944464.
Aghamiry, H., A. Gholami, and S. Operto, 2019d, Implementing bound constraints and total-variation regularization in extended full waveform inversion with the alternating direction method of multiplier: application to large contrast media: Geophysical Journal International, 218, 855–872.
Aghamiry, H., A. Gholami, and S. Operto, 2019e, Improving full-waveform inversion by wavefield reconstruction with alternating direction method of multipliers: Geophysics, 84(1), R139–R162.
Aki, K. and P. G. Richards, 2002, Quantitative seismology, theory and methods, second edition: University Science Books.
Askan, A., V. Akcelik, J. Bielak, and O. Ghattas, 2007, Full waveform inversion for seismic velocity and anelastic losses in heterogeneous structures: Bulletin of the Seismological Society of America, 97, 1990–2008.
Asnaashari, A., R. Brossier, S. Garambois, F. Audebert, P. Thore, and J. Virieux, 2013, Regularized seismic full waveform inversion with prior model information: Geophysics, 78, R25–R36.
Benning, M., F. Knoll, C.-B. Schönlieb, and T. Valkonen, 2015, Preconditioned ADMM with nonlinear operator constraint: IFIP Conference on System Modeling and Optimization, 117–126.
Bérenger, J.-P., 1994, A perfectly matched layer for absorption of electromagnetic waves: Journal of Computational Physics, 114, 185–200.
Boyd, S., N. Parikh, E. Chu, B. Peleato, and J. Eckstein, 2010, Distributed optimization and statistical learning via the alternating direction of multipliers: Foundations and Trends in Machine Learning, 3, 1–122.
Chen, Z., D. Cheng, W. Feng, and T. Wu, 2013, An optimal 9-point finite difference scheme for the Helmholtz equation with PML: International Journal of Numerical Analysis & Modeling, 10.
Cheng, X., K. Jiao, D. Sun, and D. Vigh, 2016, Multiparameter estimation with acoustic vertical transverse isotropic full-waveform inversion of surface seismic data: Interpretation, 4(4), SU1–SU16.
Combettes, P. L. and J.-C. Pesquet, 2011, Proximal splitting methods in signal processing, in Bauschke, H. H., R. S. Burachik, P. L. Combettes, V. Elser, D. R. Luke, and H. Wolkowicz, eds., Fixed-Point Algorithms for Inverse Problems in Science and Engineering, volume 49 of Springer Optimization and Its Applications, 185–212. Springer New York.
da Silva, N. and G. Yao, 2017, Wavefield reconstruction inversion with a multiplicative cost function: Inverse Problems, 34, 015004.
da Silva, N. V., G. Yao, and M. Warner, 2019, Semiglobal viscoacoustic full-waveform inversion: Geophysics, 84(2), R271–R293.
Dolean, V., P. Jolivet, and F. Nataf, 2015, An introduction to domain decomposition methods - algorithms, theory, and parallel implementation: SIAM.
Duan, Y. and P. Sava, 2016, Elastic wavefield tomography with physical model constraints: Geophysics, 81, R447–R456.
Duff, I. S., A. M. Erisman, and J. K. Reid, 1986, Direct methods for sparse matrices, second edition: Oxford Science Publications.
Fu, L. and W. W. Symes, 2017, A discrepancy-based penalty method for extended waveform inversion: Geophysics, R282-R298, 78–82.
Futterman, W., 1962, Dispersive body waves: Journal of Geophysical Research, 67, 5279–5291.
Gholami, A., H. Aghamiry, and M. Abbasi, 2018, Constrained nonlinear AVO inversion using Zoeppritz equations: Geophysics, 83(3), R245–R255.
Gholami, A. and E. Z. Naeini, 2019, 3D Dix inversion using bound-constrained TV regularization: Geophysics, 84, 1–43.
Goldstein, T. and S. Osher, 2009, The split Bregman method for L1-regularized problems: SIAM Journal on Imaging Sciences, 2, 323–343.
Guasch, L., M. Warner, and C. Ravaut, 2019, Adaptive waveform inversion: Practice: Geophysics, 84(3), R447–R461.
Hak, B. and W. A. Mulder, 2011, Seismic attenuation imaging with causality: Geophysical Journal International, 184, 439–451.
He, B., H. Liu, Z. Wang, and X. Yuan, 2014, A strictly contractive Peaceman–Rachford splitting method for convex programming: SIAM Journal on Optimization, 24, 1011–1040.
Hestenes, M. R., 1969, Multiplier and gradient methods: Journal of Optimization Theory and Applications, 4, 303–320.
Hicks, G. J. and R. G. Pratt, 2001, Reflection waveform inversion using local descent methods: estimating attenuation and velocity over a gas-sand deposit: Geophysics, 66, 598–612.
Kamei, R. and R. G. Pratt, 2008, Waveform tomography strategies for imaging attenuation structure for cross-hole data: 70th Annual International Meeting, EAGE, Expanded Abstracts, F019.
Kamei, R. and R. G. Pratt, 2013, Inversion strategies for visco-acoustic waveform inversion: Geophysical Journal International, 194, 859–894.
Kolsky, H., 1956, The propagation of stress pulses in viscoelastic solids: Philosophical Magazine, 1, 693–710.
Kurzmann, A., A. Przebindowska, D. Kohn, and T. Bohlen, 2013, Acoustic full waveform tomography in the presence of attenuation: a sensitivity analysis: Geophysical Journal International, 195(2), 985–1000.
Lacasse, M., H. Denli, L. White, V. Gudipati, S. Lee, and S. Tan, 2019, Accounting for heterogeneous attenuation in full-wavefield inversion: Presented at the 81th Annual EAGE Meeting (London) - WS01: Attenuation: Challenges in Modelling and Imaging at the Exploration Scale.
Malinowski, M., S. Operto, and A. Ribodetti, 2011, High-resolution seismic attenuation imaging from wide-aperture onshore data by visco-acoustic frequency-domain full waveform inversion: Geophysical Journal International, 186, 1179–1204.
Marfurt, K., 1984, Accuracy of finite-difference and finite-element modeling of the scalar and elastic wave equations: Geophysics, 49, 533–549.
Métivier, L., A. Allain, R. Brossier, Q. Mérigot, E. Oudet, and J. Virieux, 2018, Optimal transport for mitigating cycle skipping in full waveform inversion: a graph space transform approach: Geophysics, 83, R515–R540.
Métivier, L., R. Brossier, S. Operto, and V. J., 2017, Full waveform inversion and the : SIAM Review, 59, 153–195.
Métivier, L., R. Brossier, S. Operto, and J. Virieux, 2015, Acoustic multi-parameter FWI for the reconstruction of P-wave velocity, density and attenuation: preconditioned truncated Newton approach: 85th Annual Meeting-New Orleans, Expanded Abstracts, 1198–1203, SEG.
Mulder, W. A. and B. Hak, 2009, An ambiguity in attenuation scattering imaging: Geophysical Journal International, 178, 1614–1624.
Munns, J. W., 1985, The Valhall field: a geological overview: Marine and Petroleum Geology, 2, 23–43.
Nocedal, J. and S. J. Wright, 2006a, Numerical optimization: Springer, 2nd edition.
Nocedal, J. and S. J. Wright, 2006b, Numerical optimization: Springer, 2nd edition.
Operto, S., R. Brossier, Y. Gholami, L. Métivier, V. Prieux, A. Ribodetti, and J. Virieux, 2013, A guided tour of multiparameter full waveform inversion for multicomponent data: from theory to practice: The Leading Edge, Special section Full Waveform Inversion, 1040–1054.
Operto, S. and A. Miniussi, 2018, On the role of density and attenuation in 3D multi-parameter visco-acoustic VTI frequency-domain FWI: an OBC case study from the North Sea: Geophysical Journal International, 213, 2037–2059.
Parikh, N. and S. Boyd, 2013, Proximal algorithms: Foundations and Trends in Optimization, 1(3), 123–231.
Peaceman, D. W. and H. H. Rachford, Jr, 1955, The numerical solution of parabolic and elliptic differential equations: Journal of the Society for Industrial and Applied Mathematics, 3, 28–41.
Pratt, R. G., C. Shin, and G. J. Hicks, 1998, Gauss-Newton and full Newton methods in frequency-space seismic waveform inversion: Geophysical Journal International, 133, 341–362.
Prieux, V., R. Brossier, S. Operto, and J. Virieux, 2013, Multiparameter full waveform inversion of multicomponent OBC data from Valhall. Part 1: imaging compressional wavespeed, density and attenuation: Geophysical Journal International, 194, 1640–1664.
Ribodetti, A., S. Operto, J. Virieux, G. Lambaré, H.-P. Valéro, and D. Gibert, 2000, Asymptotic viscoacoustic diffraction tomography of ultrasonic laboratory data: a tool for rock properties analysis: Geophysical Journal International, 140, 324–340.
Rudin, L., S. Osher, and E. Fatemi, 1992, Nonlinear total variation based noise removal algorithms: Physica D, 60, 259–268.
Stopin, A., R.-E. Plessix, and S. Al Abri, 2014, Multiparameter waveform inversion of a large wide-azimuth low-frequency land data set in Oman: Geophysics, 79, WA69–WA77.
Stopin, A., R.-E. Plessix, H. Kuehl, V. Goh, and K. Overgaag, 2016, Application of visco-acoustic full waveform inversion for gas cloud imaging and velocity model building: Presented at the 78th EAGE Conference and Exhibition 2016, EAGE.
Takougang, E. M. T. and A. J. Calvert, 2012, Seismic velocity and attenuation structures of the Queen Charlotte Basin from full-waveform tomography of seismic reflection data: Geophysics, 77(3), B107–B124.
Tang, Y., 2009, Target-oriented wave-equation least-squares migration/inversion with phase-encoded Hessian: Geophysics, 74(6), WCA95–WCA107.
Tarantola, A., 1984, Inversion of seismic reflection data in the acoustic approximation: Geophysics, 49, 1259–1266.
Toksöz, M. N. and D. H. Johnston, 1981, Geophysics reprint series, no. 2: Seismic wave attenuation: Society of Exploration Geophysicists.
Toverud, T. and B. Ursin, 2005, Comparison of seismic attenuation models using zero-offset vertical seismic profiling (VSP) data: Geophysics, 70, F17–F25.
van Leeuwen, T. and F. Herrmann, 2016, A penalty method for PDE-constrained optimization in inverse problems: Inverse Problems, 32(1), 1–26.
van Leeuwen, T. and F. J. Herrmann, 2013, Mitigating local minima in full-waveform inversion by expanding the search space: Geophysical Journal International, 195(1), 661–667.
Virieux, J. and S. Operto, 2009, An overview of full waveform inversion in exploration geophysics: Geophysics, 74, WCC1–WCC26.
Warner, M. and L. Guasch, 2016, Adaptive waveform inversion: Theory: Geophysics, 81, R429–R445.
Yang, J., Y. Liu, and L. Dong, 2016, Simultaneous estimation of velocity and density in acoustic multiparameter full-waveform inversion using an improved scattering-integral approach: Geophysics, 81, R399–R415.
Yang, J.-Z., Y.-Z. Liu, and L.-G. Dong, 2014, A multi-parameter full waveform strategy for acoustic media with variable density: Chinese Journal of Geophysics, 57, 628–643.

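Algorithm 1 Viscoacoustic Wavefield Inversion with Bound Constrained TV Regularization. Lines 4 to 6 are the primal subproblems for wavefield reconstruction and parameter estimation. Lines 7 to 12 are primal subproblems for auxiliary variables introduced to implement nonsmooth regularizations and bound constraints (Appendix B). Lines 13 to 20 are the dual subproblems solved with gradient ascent steps.

1: Begin with $k = 0$, an initial squared slowness $m^0$, and attenuation $\alpha^0$.
2: Set to zero the values of $d^0$, $b^0$, $p^0_{x,m}$, $p^0_{y,m}$, $p^0_{z,m}$, $p^0_{x,\alpha}$, $p^0_{y,\alpha}$, $p^0_{z,\alpha}$, $q^0_{x,m}$, $q^0_{y,m}$, $q^0_{z,m}$, $q^0_{x,\alpha}$, $q^0_{y,\alpha}$, $q^0_{z,\alpha}$.
3: while convergence criteria not satisfied do
4:   $u^{k+1} = \left[\lambda A^T A + \gamma P^T P\right]^{-1} \left[\lambda A^T [b^k + b] + \gamma P^T [d^k + d]\right]$
5:   $m^{k+1} = \left[\lambda L^T L + \xi_m \nabla_x^T \nabla_x + \xi_m I + \xi_m \nabla_z^T \nabla_z\right]^{-1} \left[\lambda L^T y^k + \xi_m \nabla_x^T [p^k_{x,m} + q^k_{x,m}] + \xi_m [p^k_{y,m} + q^k_{y,m}] + \xi_m \nabla_z^T [p^k_{z,m} + q^k_{z,m}]\right]$
6:   $\alpha^{k+1} = \left[\lambda H^T H + \xi_\alpha \nabla_x^T \nabla_x + \xi_\alpha I + \xi_\alpha \nabla_z^T \nabla_z\right]^{-1} \left[\lambda H^T h^k + \xi_\alpha \nabla_x^T [p^k_{x,\alpha} + q^k_{x,\alpha}] + \xi_\alpha [p^k_{y,\alpha} + q^k_{y,\alpha}] + \xi_\alpha \nabla_z^T [p^k_{z,\alpha} + q^k_{z,\alpha}]\right]$
7:   $p^{k+1}_{x,m} = \max\left(1 - \frac{\mu/\xi_m}{\sqrt{|\nabla_x m^{k+1} - q^k_{x,m}|^2 + |\nabla_z m^{k+1} - q^k_{z,m}|^2}}, 0\right) \circ (\nabla_x m^{k+1} - q^k_{x,m})$
8:   $p^{k+1}_{y,m} = \mathrm{proj}_{\mathcal{M}}(m^{k+1} - q^k_{y,m})$
9:   $p^{k+1}_{z,m} = \max\left(1 - \frac{\mu/\xi_m}{\sqrt{|\nabla_x m^{k+1} - q^k_{x,m}|^2 + |\nabla_z m^{k+1} - q^k_{z,m}|^2}}, 0\right) \circ (\nabla_z m^{k+1} - q^k_{z,m})$
10:  $p^{k+1}_{x,\alpha} = \max\left(1 - \frac{\nu/\xi_\alpha}{\sqrt{|\nabla_x \alpha^{k+1} - q^k_{x,\alpha}|^2 + |\nabla_z \alpha^{k+1} - q^k_{z,\alpha}|^2}}, 0\right) \circ (\nabla_x \alpha^{k+1} - q^k_{x,\alpha})$
11:  $p^{k+1}_{y,\alpha} = \mathrm{proj}_{\mathcal{A}}(\alpha^{k+1} - q^k_{y,\alpha})$
12:  $p^{k+1}_{z,\alpha} = \max\left(1 - \frac{\nu/\xi_\alpha}{\sqrt{|\nabla_x \alpha^{k+1} - q^k_{x,\alpha}|^2 + |\nabla_z \alpha^{k+1} - q^k_{z,\alpha}|^2}}, 0\right) \circ (\nabla_z \alpha^{k+1} - q^k_{z,\alpha})$
13:  $q^{k+1}_{x,m} = q^k_{x,m} + p^{k+1}_{x,m} - \nabla_x m^{k+1}$
14:  $q^{k+1}_{y,m} = q^k_{y,m} + p^{k+1}_{y,m} - m^{k+1}$
15:  $q^{k+1}_{z,m} = q^k_{z,m} + p^{k+1}_{z,m} - \nabla_z m^{k+1}$
16:  $q^{k+1}_{x,\alpha} = q^k_{x,\alpha} + p^{k+1}_{x,\alpha} - \nabla_x \alpha^{k+1}$
17:  $q^{k+1}_{y,\alpha} = q^k_{y,\alpha} + p^{k+1}_{y,\alpha} - \alpha^{k+1}$
18:  $q^{k+1}_{z,\alpha} = q^k_{z,\alpha} + p^{k+1}_{z,\alpha} - \nabla_z \alpha^{k+1}$
19:  $b^{k+1} = b^k + b - A(m^{k+1}, \alpha^{k+1}) u^{k+1}$
20:  $d^{k+1} = d^k + d - P u^{k+1}$
21:  $k = k + 1$
22: end while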
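
The auxiliary-variable and dual updates for the squared slowness (lines 7 to 9 and 13 to 15 of Algorithm 1) can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the gradients $\nabla_x m^{k+1}$ and $\nabla_z m^{k+1}$ are assumed to be precomputed arrays, and the set $\mathcal{M}$ is assumed to be a simple box $[\text{lower}, \text{upper}]$, so that the projection reduces to clipping. The same functions would apply to the attenuation by substituting $\nu$, $\xi_\alpha$ and the bounds defining $\mathcal{A}$.

```python
import numpy as np


def tv_shrink(gx, gz, threshold):
    """Isotropic soft-thresholding of a two-component field (lines 7 and 9)."""
    norm = np.sqrt(np.abs(gx) ** 2 + np.abs(gz) ** 2)
    # The tiny floor avoids a division by zero where both components vanish;
    # the scale is zero there anyway, so the result is unaffected.
    scale = np.maximum(1.0 - threshold / np.maximum(norm, 1e-30), 0.0)
    return scale * gx, scale * gz


def project_box(v, lower, upper):
    """Projection onto a box (line 8), ASSUMING the constraint set is a box."""
    return np.clip(v, lower, upper)


def aux_and_dual_update(grad_x_m, grad_z_m, m, q_x, q_y, q_z,
                        mu, xi, lower, upper):
    """One pass over lines 7-9 (auxiliary) and 13-15 (dual) for m."""
    # Lines 7 and 9: TV proximal step on the shifted gradients.
    p_x, p_z = tv_shrink(grad_x_m - q_x, grad_z_m - q_z, mu / xi)
    # Line 8: bound constraints enforced by projection.
    p_y = project_box(m - q_y, lower, upper)
    # Lines 13 to 15: dual variables accumulate the constraint residuals.
    q_x_new = q_x + p_x - grad_x_m
    q_y_new = q_y + p_y - m
    q_z_new = q_z + p_z - grad_z_m
    return (p_x, p_y, p_z), (q_x_new, q_y_new, q_z_new)
```

Applying the threshold to the joint magnitude of the two gradient components, rather than to each component separately, is what makes the TV penalty isotropic: both components are scaled by the same factor and are set to zero together when the magnitude falls below $\mu/\xi_m$.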