
Seismo 1: Introduction

Barbara Romanowicz

Institut de Physique du Globe de Paris, Univ. of California, Berkeley

Les Houches, 6 October 2014

Kola borehole, 1970-1989 – depth: 12.3 km. Photographed in 2007.

Radius of the Earth: 6,371 km. These disciplines contribute to our knowledge of the Earth's internal structure, dynamics, and evolution.

They rely on observations at the Earth's surface and on remote sensing by satellites: Geomagnetism, Geochemistry/cosmochemistry, Seismology.

• Lecture 1: Geophysical inverse problems
• Lecture 2: Body waves, ray theory
• Lecture 3: Surface waves, free oscillations
• Lecture 4: Anisotropy
• Lecture 5: Forward modeling of deep mantle body waves for fine-scale structure
• Lecture 6: Inner core structure and anisotropy

Seismology 1: Geophysical Inverse Problems, with a focus on seismic tomography

Gold buried in beach example

[Figure: the true model, a model with good fit, and an unreasonable model]

The forward problem – an example

• Gravity survey over an unknown buried mass distribution: gravity measurements are made at points along the surface.
• Continuous integral expression:

d(x) = ∫ G(x, ξ) m(ξ) dξ

where m(ξ) is the unknown anomalous mass at depth, d(x) the data along the surface, and G(x, ξ) the physics linking mass and gravity (Newton's Universal Gravitation), sometimes called the kernel of the integral.

Make it a discrete problem

• Data are sampled (in time and/or space)
• Model is expressed as a finite set of parameters

Data vector d; model vector m.

• The functional g can be linear or non-linear:
– If linear: d = Gm
– If non-linear: d = g(m), which can be linearized about a reference model m_0:

d - d_0 = g(m) - g(m_0) ≈ G(m - m_0)

Linear vs. non-linear – parameterization matters!

• Modeling our unknown anomaly as a sphere of unknown radius R, density anomaly Δρ, and depth b: non-linear in R and b.
• Modeling it as a set of density anomalies Δρ_j in fixed pixels: linear in all the Δρ_j.

The discrete linear forward problem

A matrix equation:

d_i = Σ_{j=1}^{M} G_ij m_j

• d_i – the gravity anomaly measured at x_i
• m_j – the density anomaly at pixel j
• G_ij – the geometric term linking pixel j to observation i
• Generally we say we have N data measurements and M model parameters, so G is an N × M matrix.

Geophysical inverse problems:
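A minimal numerical sketch of the discrete forward problem d = Gm above. The kernel, model, and dimensions here are hypothetical random stand-ins; for real gravity data G would come from Newton's law of gravitation.

```python
import numpy as np

# Hypothetical setup: N = 5 surface gravity stations, M = 4 density pixels.
# G[i, j] is the geometric kernel linking density pixel j to station i.
rng = np.random.default_rng(0)
N, M = 5, 4                      # N data measurements, M model parameters
G = rng.normal(size=(N, M))      # N x M kernel matrix (random placeholder)
m = rng.normal(size=M)           # model vector: density anomaly in each pixel
d = G @ m                        # forward problem: predicted data
print(d.shape)                   # (5,)
```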

• Infer some properties of the Earth → "MODEL"

• From a set of observations → "DATA"

• Assuming a specific method which relates the model parameters to the data → "THEORY"

Forward Problem:

• Given a model m, predict a set of observations d:  m → g → d

• g is a functional that may be: – Explicit or implicit – Linear or non-linear

• g may contain theoretical assumptions/approximations

Inverse problem

• Given a set of data d, estimate model parameters.

d → g^{-1} → m

• The problem can be stated implicitly: g(d, m) = 0

• Or explicitly: d = g(m)

Forward/inverse problem

If g is linear, this relationship can take several forms:

– Discrete:

d_i = Σ_{j=1}^{M} G_ij m_j

– Continuous:

d_i = ∫ G_i(x) m(x) dx,   or:   d(y) = ∫ G(x, y) m(x) dx

– A continuous problem can always be reduced to a discrete one.

Example of linear problem: acoustic tomography
h = dimension of the blocks (length/width); s_j = slowness within each block (assumed uniform); t_i = measured travel times.

Example 2: CAT scan

The x-ray intensity I decays along the ray path:

dI/ds = -c(x, y) I

which integrates to the non-linear relation I = I_0 exp(-∫ c ds). Taking the logarithm linearizes it:

ln(I_0 / I) = ∫ c(x, y) ds

Another way to linearize: assume the net absorption of x-rays is small and replace exp(-x) by 1 - x.

Then discretize into small square boxes with constant absorption coefficient c_j. The integral ∫ c ds becomes

Σ_j c_j Δs_j  →  a matrix equation
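The CAT-scan linearization can be sketched numerically. The grid, rays, path lengths, and absorption values below are all hypothetical; the point is that taking the logarithm of the measured intensities yields data exactly linear in the box coefficients c_j.

```python
import numpy as np

# Exact physics: I = I0 * exp(-integral of c along the ray).
# After the log: ln(I0 / I) = sum_j c_j * L[i, j], with L[i, j] the
# path length of ray i in box j (a matrix equation, linear in c).
c = np.array([0.10, 0.05, 0.02, 0.08])   # absorption coefficient per box
L = np.array([[1.0, 1.0, 0.0, 0.0],      # ray 1 crosses boxes 0 and 1
              [0.0, 0.0, 1.0, 1.0]])     # ray 2 crosses boxes 2 and 3
I0 = 1.0
I = I0 * np.exp(-(L @ c))                # exact (non-linear) forward model
d = np.log(I0 / I)                       # linearized data
print(np.allclose(d, L @ c))             # True: d = Lc exactly after the log
```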

Inverse problems

• Existence of solution?

• Uniqueness of solution? – Null space

• Stability of the solution:
– Many inverse problems are "ill-posed", with extreme sensitivity of the solution to the data: m1 and m2 may not be "close" even though the corresponding data d1 and d2 are very "close".

• Underdetermined system
– More unknowns than data (M > N), but the equations are consistent (each provides an independent constraint); an extra criterion, e.g. norm minimization, selects a solution:

m_est = G^T (G G^T)^{-1} d

• Overdetermined system – e.g. fitting a straight line through a set of points – Typically more data than unknowns (N>M)

• Mixed determined systems – The case for most problems

Purely underdetermined problems

• Fewer equations than unknowns (N < M)

• Assume equations independent

• More than one solution – how do we choose the "right" one?
– Add "a priori" information:
• For example: "the line passes through the origin"
• Parameters have a given sign, or lie in a given range (e.g. density inside the Earth should be between 1 and 50 g/cm3), or the density profile should be close to Adams-Williamson.

• Often, we choose to minimize the norm of the solution (given an appropriate norm).
• Example: minimize the size of the solution as measured by its Euclidean norm:

||m|| = (Σ_i m_i^2)^{1/2}

• Solve the problem: find the m which minimizes L = ||m||^2, subject to the constraint d - Gm = 0.
– Use Lagrange multipliers λ_i and minimize:

Φ(m) = Σ_{j=1}^{M} m_j^2 + Σ_{i=1}^{N} λ_i (d_i - Σ_{j=1}^{M} G_ij m_j)

∂Φ/∂m_q = 2 m_q - Σ_{i=1}^{N} λ_i G_iq = 0   ⇒   2m = G^T λ

Substitute into Gm = d:  d = G G^T λ / 2. This gives λ, and then m:

m̂ = G^T (G G^T)^{-1} d

If the equations are independent, G G^T is non-singular.
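The minimum-norm solution above can be sketched as follows, with a hypothetical underdetermined system (2 data, 4 unknowns); it fits the data exactly while having the smallest Euclidean norm of all solutions.

```python
import numpy as np

# Minimum-length solution m_hat = G^T (G G^T)^{-1} d of an underdetermined
# system (hypothetical G and d). G G^T is invertible when the rows of G
# (the equations) are independent.
rng = np.random.default_rng(1)
G = rng.normal(size=(2, 4))        # N = 2 equations, M = 4 unknowns
d = np.array([1.0, -2.0])
m_hat = G.T @ np.linalg.solve(G @ G.T, d)
print(np.allclose(G @ m_hat, d))   # True: the data are fit exactly
```

As a cross-check, `numpy.linalg.lstsq` returns this same minimum-norm solution for a full-row-rank underdetermined system.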

Overdetermined system: solution (example)

• The equations to solve are:

d_i = m_1 + m_2 x_i,   i = 1, …, N,   N ≥ 2

• Minimize the misfit at each point:

e_i = d_i - (m_1 + m_2 x_i)

• That is, minimize:

E = e^T e = Σ_i (d_i - m_1 - m_2 x_i)^2

• Set the derivatives of E with respect to m_1 and m_2 to zero:

∂E/∂m_1 = 2 N m_1 + 2 m_2 Σ_i x_i - 2 Σ_i d_i = 0
∂E/∂m_2 = 2 m_1 Σ_i x_i + 2 m_2 Σ_i x_i^2 - 2 Σ_i d_i x_i = 0

• More generally:

E = e^T e = (d - Gm)^T (d - Gm) = Σ_{i=1}^{N} (d_i - Σ_{j=1}^{M} G_ij m_j)(d_i - Σ_{k=1}^{M} G_ik m_k)

∂E/∂m_q = 2 Σ_{k=1}^{M} m_k Σ_{i=1}^{N} G_iq G_ik - 2 Σ_{i=1}^{N} G_iq d_i = 0

G^T G m - G^T d = 0

Least squares solution:  m̂ = (G^T G)^{-1} G^T d

• A solution doesn't always exist:
– E.g. the straight-line problem: if we have only one data point, then

G^T G = [ N       Σ_i x_i   ]
        [ Σ_i x_i Σ_i x_i^2 ]

is singular (with N = 1 its determinant is x_1^2 - x_1·x_1 = 0).

– We need to add other constraints!

Mixed determined problems

• Example, the x-ray tomography problem:
– Several rays through one box
– No rays through another box

• 1 – Separate the model parameters into two groups, overdetermined and underdetermined.
– Ideally, partition the equation:

Go 0 mo  do         0 Gu mu  du 


• 2 – Minimize some combination of the misfit and the solution size:

Φ(m) = e^T e + ε^2 m^T m

• Then the solution is the damped least squares solution:

1 mˆ  GTG   2I GT d

• Generalize to the case where we want to find the solution closest to some particular model ⟨m⟩, called the "a priori model":
– Replace m by m - ⟨m⟩

• Generalize to other norms.
– Example: minimize roughness, i.e. the difference between adjacent model parameters.
– Consider ||Dm|| instead of ||m|| and minimize:

(Dm)^T (Dm) = m^T D^T D m = m^T W_m m

– More generally, minimize:

(m - ⟨m⟩)^T W_m (m - ⟨m⟩)

Example: minimizing roughness

Combined with being close to a reference model.

Form of W_m:
– First-order "roughening matrix" D (finite-difference approximation to the first derivative).
– Laplacian (finite-difference approximation to the second derivative).

• We may also want to weight the misfits in data space (some observations are more accurate than others). Instead of e^T e, minimize:

E = e^T W_e e

• Completely overdetermined problem:
– Minimize E; the solution is:

m_est = (G^T W_e G)^{-1} G^T W_e d

• Completely underdetermined problem:
– Minimize L = (m - ⟨m⟩)^T W_m (m - ⟨m⟩); the solution is:

m_est = ⟨m⟩ + W_m^{-1} G^T (G W_m^{-1} G^T)^{-1} (d - G⟨m⟩)

Weighted damped least squares

• In general we will want to minimize: E + ε^2 L.
• The solution then has the form:

m_est = ⟨m⟩ + (G^T W_e G + ε^2 W_m)^{-1} G^T W_e (d - G⟨m⟩)

or, equivalently:

m_est = ⟨m⟩ + W_m^{-1} G^T (G W_m^{-1} G^T + ε^2 W_e^{-1})^{-1} (d - G⟨m⟩)

More reading: Tarantola and Valette (1982, Rev. Geophys.); Tarantola (2005, SIAM).

Summary 1: recipes

Overdetermined: Minimize error “Least squares”

Underdetermined: Minimize model size “Minimum length”

Mixed-determined: Minimize both “Damped least squares”

Concept of "Generalized Inverse"

• The generalized inverse G^{-g} is the matrix in the linear inverse problem that multiplies the data to provide an estimate of the model parameters:

m̂ = G^{-g} d

– For least squares:  G^{-g} = (G^T G)^{-1} G^T
– For minimum length:  G^{-g} = G^T (G G^T)^{-1}
– For damped least squares:  G^{-g} = (G^T G + ε^2 I)^{-1} G^T

Damped weighted least squares

In this formula, W_e is the data weighting, W_m the model weighting, d - G⟨m⟩ the misfit of the reference model, and m_est - ⟨m⟩ the perturbation to the reference model.

Regularization tradeoffs

• Changing the weighting of the regularization terms affects the balance between minimizing model size and minimizing data misfit (the "L curve").

• Too large values lead to simple models, biased toward the reference model, with poor fit to the data.

• Too small values lead to overly complex models that may offer only marginal improvement in misfit.
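This tradeoff can be explored numerically; a sketch with a hypothetical G and d, sweeping the damping parameter and recording the two quantities one would plot as an L curve:

```python
import numpy as np

# Sweep the regularization value in damped least squares and record
# (data misfit, model norm) for each inversion.
rng = np.random.default_rng(4)
G = rng.normal(size=(10, 15))
d = rng.normal(size=10)
misfits, norms = [], []
for eps2 in [1e-3, 1e-1, 1e1]:
    m = np.linalg.solve(G.T @ G + eps2 * np.eye(15), G.T @ d)
    misfits.append(np.linalg.norm(d - G @ m))
    norms.append(np.linalg.norm(m))
# Stronger damping: smaller model norm, larger misfit.
print(misfits, norms)
```

Plotting misfit against model norm across many such values traces the L curve, whose knee marks the preferred regularization.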

• Run inversion for a number of different regularization values.

• Plot data residual versus model norm for different inversions.

• Choose the inversion result at the knee of the "L" curve.

Summary 2

• In order to get more reliable and robust answers, we need to weight the data appropriately to make sure we focus on fitting the most reliable data

• We also need to specify a priori characteristics of the model through model weighting or regularization

• These are often not well constrained by the data, and so are "tuneable" parameters in our inversions.

Now that we have an answer…

• With some combination of the previous equations, nearly every dataset can give us an “answer” for an inverted model – This is only halfway there, though!

• How certain are we in our results?

• How well is the dataset able to resolve the chosen model parameterization?

• Are there model parameters or combinations of model parameters that we can't resolve?

Resolution Indicators

• Data resolution – how well does the inversion uniquely predict the data?
– We have the generalized-inverse estimate of the model parameters:

m̂ = G^{-g} d_obs

– Plug this estimate into original forward formulation of problem

d pre  Gm  Gmˆ  GGg d obs GGg d obs  Nd obs

– Tells how well predicted data match the measurements. • Perfect match N=I, prediction error =0 – This is the case for the minimum length solution. • Imperfect match N ≠I, prediction error ≠ 0.
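A numerical check of these two cases, using a hypothetical underdetermined system; the minimum-length and damped least-squares generalized inverses are built explicitly.

```python
import numpy as np

# Data resolution matrix N = G G^{-g} for two choices of generalized inverse.
rng = np.random.default_rng(7)
G = rng.normal(size=(3, 6))              # 3 independent equations, 6 unknowns

# Minimum-length inverse: data are predicted exactly, so N = I.
Ginv_ml = G.T @ np.linalg.inv(G @ G.T)
N_ml = G @ Ginv_ml
print(np.allclose(N_ml, np.eye(3)))      # True

# Damped least-squares inverse: N deviates from I (prediction error != 0).
Ginv_dls = np.linalg.solve(G.T @ G + 0.5 * np.eye(6), G.T)
N_dls = G @ Ginv_dls
print(np.allclose(N_dls, np.eye(3)))     # False
```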

Data Resolution

• The matrix N provides an indicator of how well neighboring data can be independently predicted (or resolved).
– The magnitude of the main diagonal gives the relative weight of a datum in its own prediction.
– The off-diagonal elements show how averaging occurs over neighboring values to produce the value at a given point.
• The data resolution matrix is very useful in experimental design, for picking out the most important data and for designing an experiment in which the data are uniquely resolved.

Model evaluation

• Model resolution – Given the geometry of data collection and the choices of model parameterization and regularization, how well are we able to image target structures?

• Model error – Given the errors in our measurements and the a priori model constraints (regularization), what is the uncertainty of the resolved model? The model resolution matrix

• For any solution type, we can define a generalized inverse G^{-g}, where m_est = G^{-g} d.
• The data are related to the "true" model by:

d = G m_true

so that:

m_est = G^{-g} G m_true = R m_true,   with R = G^{-g} G

Synthetic Example
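A minimal synthetic example of the model resolution matrix, using a hypothetical G and a damped least-squares inverse; a spike test shows how a delta function in one model parameter is smeared by the inversion.

```python
import numpy as np

# Model resolution matrix R = G^{-g} G; m_est = R m_true.
rng = np.random.default_rng(5)
G = rng.normal(size=(8, 12))
Ginv = np.linalg.solve(G.T @ G + 0.5 * np.eye(12), G.T)  # damped LS inverse
R = Ginv @ G

m_true = np.zeros(12)
m_true[5] = 1.0                       # spike test: delta in parameter 5
m_est = R @ m_true                    # recovered (smeared) spike
print(np.allclose(m_est, R[:, 5]))    # True: the spike returns column 5 of R
```

Perfect resolution would mean R = I; in practice each column of R shows how one target parameter leaks into its neighbors.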

[Figure: input model vs. inversion result]

The resolution matrix

• Think of it as a filter that runs a target model through the data geometry and regularization to see how your inversion can see different kinds of structure

• Does not account for errors in theory or noise in data. (Masters, CIDER 2010)

Checkerboard test

m̂ = R m_true, with R = G^{-g} G. R contains theoretical assumptions on wave propagation and the parameterization, and assumes the problem is linear. (After Masters, CIDER 2010)

Beware the checkerboard!

• Checkerboard tests really only reveal how well the experiment can resolve checkerboards of various length scales

• For example, if the study is interpreting vertically or laterally continuous features, it might make more sense to use input models which test the ability of the inversion to resolve continuous or separated features

(From Allen and Tromp, 2005)

What about model error?

• Resolution matrix tests ignore effects of data error

• Very good apparent resolution can often be obtained by decreasing damping/regularization

• If we assume a linear problem with Gaussian errors, we can propagate the data errors directly to model error Linear estimations of model error

The a posteriori model covariance follows from the data covariance: C_m = G^{-g} C_d (G^{-g})^T.
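A sketch of this linear error propagation, with a hypothetical system and assumed uncorrelated data errors; the diagonal of the model covariance gives the one-sigma uncertainty of each model parameter.

```python
import numpy as np

# Propagate data errors to the model: C_m = G^{-g} C_d (G^{-g})^T.
rng = np.random.default_rng(6)
G = rng.normal(size=(8, 6))
Cd = 0.1**2 * np.eye(8)               # assumed uncorrelated errors, sigma_d = 0.1
Ginv = np.linalg.solve(G.T @ G + 0.2 * np.eye(6), G.T)   # damped LS inverse
Cm = Ginv @ Cd @ Ginv.T               # a posteriori model covariance
sigma_m = np.sqrt(np.diag(Cm))        # one-sigma model parameter uncertainties
print(sigma_m.shape)                  # (6,)
```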

Alternatively, the diagonal elements of the model covariance can be estimated using bootstrap or other random realization approaches

Note that this estimate depends on the choice of regularization.

Linear approaches: resolution/error tradeoff

Checkerboard resolution map and bootstrap error map (Panning and Romanowicz, 2006).