February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book


To the memory of my parents


Preface

This book presents an analytic theory of random fields estimation, optimal by the criterion of minimum variance of the error of the estimate. This theory is a generalization of the classical Wiener theory. Wiener's theory was developed for optimal estimation of stationary random processes, that is, random functions of one variable; random fields are random functions of several variables. Wiener's theory was based on the analytical solution of the basic integral equation of estimation theory. For estimation of stationary random processes this equation is of Wiener-Hopf type, originally posed on a positive semiaxis. About 25 years later the theory of such equations was developed for the case of finite intervals. The assumption of stationarity of the processes was vital for the theory. Analytical formulas for optimal estimates (filters) were obtained under the assumption that the spectral density of the stationary process is a positive rational function.

We generalize Wiener's theory in several directions. First, estimation theory is developed for random fields, and not only for random processes. Secondly, the stationarity assumption is dropped. Thirdly, the assumption of a rational spectral density is generalized: we consider kernels which are positive rational functions of arbitrary elliptic selfadjoint operators on the whole space. The domain of observation of the signal does not enter into the definition of the kernel. These kernels are correlation functions of random fields, and therefore the class of such kernels defines the class of random fields for which the analytical estimation theory is developed. In the appendix we consider an even more general class of kernels, namely kernels R(x, y) which solve the equation QR = Pδ(x − y), where P and Q are elliptic operators and δ(x − y) is the delta-function.

We study the singular perturbation problem for the basic integral equation of estimation theory, Rh = f.
The solution to this equation, which is of interest


viii Random Fields Estimation Theory

in estimation theory, is in general a distribution. The perturbed equation εh_ε + Rh_ε = f has a unique solution in L²(D). The singular perturbation problem consists of the study of the asymptotics of h_ε as ε → 0. This theory is not only of mathematical interest, but also a basis for the numerical solution of the basic integral equation in distributions. We discuss the relation between estimation theory and quantum-mechanical non-relativistic scattering theory. Applications of the estimation theory are also discussed.

The presentation in this book is based partly on the author's earlier monographs [Ramm (1990)] and [Ramm (1996)], but also contains recent results [Ramm (2002)], [Ramm (2003)], [Kozhevnikov and Ramm (2005)], and [Ramm and Shifrin (2005)].

The book is intended for researchers in probability and statistics, analysis, numerical analysis, signal estimation and image processing, theoretically inclined electrical engineers, geophysicists, and graduate students in these areas. Parts of the book can be used in graduate courses in probability and statistics. The analytical tools that the author uses are not usual for statistics and probability. These tools include the spectral theory of elliptic operators, pseudodifferential operators, and the theory of distributions. The presentation in this book is essentially self-contained. Auxiliary material which we use is collected in Chapter 8.

Contents

Preface vii

1. Introduction 1

2. Formulation of Basic Results  9
   2.1 Statement of the problem ...... 9
   2.2 Formulation of the results (multidimensional case) ...... 14
       2.2.1 Basic results ...... 14
       2.2.2 Generalizations ...... 17
   2.3 Formulation of the results (one-dimensional case) ...... 18
       2.3.1 Basic results for the scalar equation ...... 19
       2.3.2 Vector equations ...... 22
   2.4 Examples of kernels of class R and solutions to the basic equation ...... 25
   2.5 Formula for the error of the optimal estimate ...... 29

3. Numerical Solution of the Basic Integral Equation in Distributions  33
   3.1 Basic ideas ...... 33
   3.2 Theoretical approaches ...... 37
   3.3 Multidimensional equation ...... 43
   3.4 Numerical solution based on the approximation of the kernel ...... 46
   3.5 Asymptotic behavior of the optimal filter as the white noise component goes to zero ...... 54
   3.6 A general approach ...... 57



4. Proofs  65
   4.1 Proof of Theorem 2.1 ...... 65
   4.2 Proof of Theorem 2.2 ...... 73
   4.3 Proof of Theorems 2.4 and 2.5 ...... 79
   4.4 Another approach ...... 84

5. Singular Perturbation Theory for a Class of Fredholm Integral Equations Arising in Random Fields Estimation Theory  87
   5.1 Introduction ...... 87
   5.2 Auxiliary results ...... 90
   5.3 Asymptotics in the case n = 1 ...... 93
   5.4 Examples of asymptotical solutions: case n = 1 ...... 98
   5.5 Asymptotics in the case n > 1 ...... 103
   5.6 Examples of asymptotical solutions: case n > 1 ...... 105

6. Estimation and Scattering Theory  111
   6.1 The direct scattering problem ...... 111
       6.1.1 The direct scattering problem ...... 111
       6.1.2 Properties of the scattering solution ...... 114
       6.1.3 Properties of the scattering amplitude ...... 120
       6.1.4 Analyticity in k of the scattering solution ...... 121
       6.1.5 High-frequency behavior of the scattering solutions ...... 123
       6.1.6 Fundamental relation between u⁺ and u⁻ ...... 127
       6.1.7 Formula for det S(k) and the Levinson Theorem ...... 128
       6.1.8 Completeness properties of the scattering solutions ...... 131
   6.2 Inverse scattering problems ...... 134
       6.2.1 Inverse scattering problems ...... 134
       6.2.2 Uniqueness theorem for the inverse scattering problem ...... 134
       6.2.3 Necessary conditions for a function to be a scattering amplitude ...... 135
       6.2.4 A Marchenko equation (M equation) ...... 136
       6.2.5 Characterization of the scattering data in the 3D inverse scattering problem ...... 138
       6.2.6 The Born inversion ...... 141
   6.3 Estimation theory and inverse scattering in R³ ...... 150

7. Applications  159


   7.1 What is the optimal size of the domain on which the data are to be collected? ...... 159
   7.2 Discrimination of random fields against noisy background ...... 161
   7.3 Quasioptimal estimates of derivatives of random functions ...... 169
       7.3.1 Introduction ...... 169
       7.3.2 Estimates of the derivatives ...... 170
       7.3.3 Derivatives of random functions ...... 172
       7.3.4 Finding critical points ...... 180
       7.3.5 Derivatives of random fields ...... 181
   7.4 Stable summation of orthogonal series and integrals with randomly perturbed coefficients ...... 182
       7.4.1 Introduction ...... 182
       7.4.2 Stable summation of series ...... 184
       7.4.3 Method of multipliers ...... 185
   7.5 Resolution ability of linear systems ...... 185
       7.5.1 Introduction ...... 185
       7.5.2 Resolution ability of linear systems ...... 187
       7.5.3 Optimization of resolution ability ...... 191
       7.5.4 A general definition of resolution ability ...... 196
   7.6 Ill-posed problems and estimation theory ...... 198
       7.6.1 Introduction ...... 198
       7.6.2 Stable solution of ill-posed problems ...... 205
       7.6.3 Equations with random noise ...... 216
   7.7 A remark on nonlinear (polynomial) estimates ...... 230

8. Auxiliary Results  233
   8.1 Sobolev spaces and distributions ...... 233
       8.1.1 A general imbedding theorem ...... 233
       8.1.2 Sobolev spaces with negative indices ...... 236
   8.2 Eigenfunction expansions for elliptic selfadjoint operators ...... 241
       8.2.1 Resolution of the identity and integral representation of selfadjoint operators ...... 241
       8.2.2 Differentiation of operator measures ...... 242
       8.2.3 Carleman operators ...... 246
       8.2.4 Elements of the spectral theory of elliptic operators in L²(Rʳ) ...... 249
   8.3 Asymptotics of the spectrum of linear operators ...... 260
       8.3.1 Compact operators ...... 260
           8.3.1.1 Basic definitions ...... 260


           8.3.1.2 Minimax principles and estimates of eigenvalues and singular values ...... 262
       8.3.2 Perturbations preserving asymptotics of the spectrum of compact operators ...... 265
           8.3.2.1 Statement of the problem ...... 265
           8.3.2.2 A characterization of the class of linear compact operators ...... 266
           8.3.2.3 Asymptotic equivalence of s-values of two operators ...... 268
           8.3.2.4 Estimate of the remainder ...... 270
           8.3.2.5 Unbounded operators ...... 274
           8.3.2.6 Asymptotics of eigenvalues ...... 275
           8.3.2.7 Asymptotics of eigenvalues (continuation) ...... 283
           8.3.2.8 Asymptotics of s-values ...... 284
           8.3.2.9 Asymptotics of the spectrum for quadratic forms ...... 287
           8.3.2.10 Proof of Theorem 2.3 ...... 293
       8.3.3 Trace class and Hilbert-Schmidt operators ...... 297
           8.3.3.1 Trace class operators ...... 297
           8.3.3.2 Hilbert-Schmidt operators ...... 298
           8.3.3.3 Determinants of operators ...... 299
   8.4 Elements of probability theory ...... 300
       8.4.1 The probability space and basic definitions ...... 300
       8.4.2 Hilbert space theory ...... 306
       8.4.3 Estimation in Hilbert space L²(Ω, U, P) ...... 310
       8.4.4 Homogeneous and isotropic random fields ...... 312
       8.4.5 Estimation of parameters ...... 315
       8.4.6 Discrimination between hypotheses ...... 317
       8.4.7 Generalized random fields ...... 319
       8.4.8 Kalman filters ...... 320

Appendix A  Analytical Solution of the Basic Integral Equation for a Class of One-Dimensional Problems  325
   A.1 Introduction ...... 326
   A.2 Proofs ...... 329

Appendix B  Integral Operators Basic in Random Fields Estimation Theory  337


   B.1 Introduction ...... 337
   B.2 Reduction of the basic integral equation to a boundary-value problem ...... 341
   B.3 Isomorphism property ...... 349
   B.4 Auxiliary material ...... 354

Bibliographical Notes 359

Bibliography 363

Symbols 371

Index 373

Chapter 1 Introduction

This work deals with just one topic: the analytic theory of random fields estimation within the framework of covariance theory. No assumptions about distribution laws are made: the fields are not necessarily Gaussian or Markovian. The only information used is the covariance functions. Specifically, we assume that the random field is of the form

U(x) = s(x) + n(x),   x ∈ R^r,   (1.1)

where s(x) is the useful signal and n(x) is noise. Without loss of generality assume that

s̄(x) = n̄(x) = 0,   (1.2)

where the bar denotes the mean value. If these mean values are not zero, then one either assumes that they are known and considers the fields s(x) − s̄(x) and n(x) − n̄(x) with zero mean values, or one estimates the mean values and then subtracts them from the corresponding fields. We also assume that the covariance functions

\overline{U(x)U^*(y)} := R(x, y),   \overline{U(x)s^*(y)} := f(x, y)   (1.3)

are known. The star stands for complex conjugation. This information is necessary for any development within the framework of covariance theory. We will show that, under some assumptions about the functions (1.3), one can develop an analytic theory of random fields estimation. If the functions (1.3) are not known, then one has to estimate them from statistical data or from some theory. In many applications the exact analytical expression for the covariance functions is not very important; rather, some general features of R or f are of practical interest. These features include, for example, the correlation radius.
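When R must be estimated from statistical data, the simplest route is the empirical covariance over independent sample paths. A minimal synthetic sketch (not from the book; the exponential covariance, grid, and sample sizes are all illustrative choices):

```python
import numpy as np

# Estimate R(x, y) = mean of U(x)U*(y) from N sample paths of a zero-mean
# real field on a grid.  Synthetic data: Gaussian samples with the
# (illustrative) covariance exp(-|x - y|), generated via a Cholesky factor.
rng = np.random.default_rng(0)
n_grid, n_samples = 50, 20000
x = np.linspace(-1.0, 1.0, n_grid)
R_true = np.exp(-np.abs(x[:, None] - x[None, :]))
C = np.linalg.cholesky(R_true + 1e-10 * np.eye(n_grid))  # jitter for safety
U = rng.standard_normal((n_samples, n_grid)) @ C.T       # rows ~ N(0, R_true)
R_hat = U.T @ U / n_samples                              # empirical covariance
err = np.max(np.abs(R_hat - R_true))
assert err < 0.1    # Monte Carlo error is O(1/sqrt(n_samples)) per entry
```

The entrywise Monte Carlo error decays like 1/√N, which is why a rough covariance estimate often suffices and only general features (e.g. the correlation radius) are of practical interest.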



The estimation problem of interest is the following one. The signal U(x) of the form (1.1) is observed in a domain D ⊂ R^r with boundary Γ. Assuming (1.2) and (1.3), one needs to estimate As(x_0) linearly, where A is a given operator and x_0 ∈ R^r is a given point. The linear estimate is to be best possible by the criterion of minimum variance, i.e. the estimate is by the least squares method. The most general form of a linear estimate of U observed in the domain D is

LU := ∫_D h(x, y)U(y) dy   (1.4)

where h(x, y) is a distribution. Therefore the optimal linear estimate solves the variational problem

ε := \overline{(LU − As)^2} = min   (1.5)

where LU and As are computed at the point x_0. A necessary condition on h(x, y) for (1.5) (with A = I) to hold is (see equations (2.11) and (8.423))

Rh := ∫_D R(x, y)h(z, y) dy = f(x, z),   x, z ∈ D̄ := D ∪ Γ.   (1.6)

The basic topic of this work is the study of a class of equations (1.6) for which the analytical properties of the solution h can be obtained, a numerical procedure for computing h can be given, and properties of the operator R in (1.6) can be studied. Since z enters (1.6) as a parameter, one can study the basic equation of estimation theory

Rh := ∫_D R(x, y)h(y) dy = f(x),   x ∈ D̄.   (1.7)

A typical one-dimensional example of equation (1.7) in estimation theory is

∫_{−1}^{1} exp(−|x − y|) h(y) dy = f(x),   −1 ≤ x ≤ 1.   (1.8)

Its solution of minimal order of singularity is

h(x) = (−f″ + f)/2 + δ(x + 1)[−f′(−1) + f(−1)]/2 + δ(x − 1)[f′(1) + f(1)]/2.   (1.9)

One can see that the solution is a distribution with singular support at the boundary of the domain D. By sing supp h we mean the set of points having no open neighborhood on which the restriction of h can be identified with a locally integrable function.
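Formula (1.9) can be checked numerically for a concrete f (a sketch added for illustration; the choice f(x) = cos x and all tolerances are arbitrary): the regular part (−f″ + f)/2 is integrated by quadrature, and each delta term contributes the kernel evaluated at a boundary point.

```python
import numpy as np

# Check (1.9) for f(x) = cos(x): regular part (-f'' + f)/2 = cos(x)
# (since f'' = -cos), plus delta terms at x = -1 and x = +1.
y = np.linspace(-1.0, 1.0, 20001)
dy = y[1] - y[0]
h_reg = np.cos(y)                               # (-f'' + f)/2
c_minus = (np.sin(-1.0) + np.cos(-1.0)) / 2.0   # [-f'(-1) + f(-1)]/2
c_plus = (-np.sin(1.0) + np.cos(1.0)) / 2.0     # [ f'(1) + f(1)]/2

def Rh(x):
    g = np.exp(-np.abs(x - y)) * h_reg
    smooth = (g.sum() - 0.5 * (g[0] + g[-1])) * dy   # trapezoid rule
    # delta terms: integrating the kernel against c * delta(y -/+ 1)
    return smooth + c_minus * np.exp(-abs(x + 1.0)) + c_plus * np.exp(-abs(x - 1.0))

for xv in (-1.0, -0.3, 0.0, 0.7, 1.0):
    assert abs(Rh(xv) - np.cos(xv)) < 1e-6   # R h reproduces f
```

The smooth part alone does not reproduce f; the boundary delta terms are essential, which is precisely why the solution lives in a space of distributions.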


In the case of equation (1.8) this domain is D = (−1, 1). Even if f ∈ C^∞(D̄), the solutions to equations (1.7), (1.8) are, in general, not in L²(D). The problem is: in what functional space should one look for the solution? Is the solution unique? Does the solution to (1.7) provide the solution to the estimation problem (1.5)? Does the solution depend continuously on the data, e.g. on f and on R(x, y)? How does one compute the solution analytically and numerically? What are the properties of the solution; for example, what is the order of singularity of the solution? What is the singular support of the solution? What are the properties of the operator R as an operator in L²(D)? These questions are answered in Chapters 2-4. The answers are given for the class of random fields whose covariance functions R(x, y) are kernels of positive rational functions of selfadjoint elliptic operators in L²(R^r). The class R of such kernels consists of the kernels

R(x, y) = ∫_Λ P(λ)Q^{−1}(λ)Φ(x, y, λ) dρ(λ)   (1.10)

where Λ, dρ, and Φ(x, y, λ) are, respectively, the spectrum, spectral measure, and spectral kernel of an elliptic selfadjoint operator L in L²(R^r) of order s, and P(λ) and Q(λ) are positive polynomials of degrees p and q respectively. The notions of spectral measure and spectral kernel are discussed in Section 8.2. If p > q, then the operator R in L²(D) with kernel (1.10) is an elliptic integro-differential operator; if p = q, then R = cI + K, where c = const > 0, I is the identity operator, and K is a compact selfadjoint operator in L²(D); if p < q, which is the most interesting case, then R is a compact selfadjoint operator in L²(D). In this case the noise n(x) is called colored. If φ(λ) is a measurable function, then the kernel of the operator φ(L) is defined by the formula

φ(L)(x, y) = ∫_Λ φ(λ)Φ(x, y, λ) dρ(λ).   (1.11)

The domain of definition of the operator φ(L) consists of all functions f ∈ L²(R^r) such that

∫_Λ |φ(λ)|² d(E_λ f, f) < ∞   (1.12)

where E_λ is the resolution of the identity for L. It is a projection operator


with the kernel

E_λ(x, y) = ∫_{−∞}^{λ} Φ(x, y, λ) dρ(λ).   (1.13)

In particular, since E_{+∞} = I, one has

δ(x − y) = ∫_{−∞}^{∞} Φ(x, y, λ) dρ(λ).   (1.14)

In (1.13) and (1.14) the integration is actually taken over (−∞, λ) ∩ Λ and (−∞, ∞) ∩ Λ respectively, since dρ = 0 outside Λ. The kernel in (1.8) corresponds to the simple case

r = 1,   L = −i d/dx,   Λ = (−∞, ∞),   dρ = dλ,   Φ(x, y, λ) = (2π)^{−1} exp{iλ(x − y)},

P(λ) = 1,   Q(λ) = (λ² + 1)/2,   e^{−|x|} = (2π)^{−1} ∫_{−∞}^{∞} 2(λ² + 1)^{−1} exp(iλx) dλ,

and formula (1.9) is a very particular case of the general formulas given in Chapter 2.

Let R(x, y) ∈ R, let α := s(q − p)/2, let H^ℓ(D) be the Sobolev spaces, and let Ḣ^{−ℓ}(D) be the dual of H^ℓ(D) with respect to H^0(D) = L²(D). Then the answers to the questions formulated above are as follows. The solution to equation (1.7) solves the estimation problem if and only if h ∈ Ḣ^{−α}(D). The operator R : Ḣ^{−α}(D) → H^α(D) is an isomorphism. The singular support of the solution h ∈ Ḣ^{−α}(D) of equation (1.7) is Γ = ∂D. The analytic formula for h is of the form h = Q(L)G, where G is a solution to some interface elliptic boundary value problem and the differentiation is taken in the sense of distributions. An exact description of this analytic formula is given in Chapter 2. The spectral properties of the operator R : L²(D) → L²(D) with kernel R(x, y) ∈ R are also given in Chapter 2. These properties include the asymptotics as n → ∞ of the eigenvalues λ_n of R, the dependence of λ_n on D, and the asymptotics of λ_1(D) as D → R^r, that is, as D grows uniformly in all directions. Numerical methods for solving equation (1.7) in the space Ḣ^{−α}(D) of distributions are given in Chapter 3. These methods depend heavily on the analytical results given in Chapter 2.

The necessary background material on Sobolev spaces and spectral theory is given in Chapter 8, so that the reader does not have to consult the literature in order to understand the contents of this work.

No attempt was made by the author to present all aspects of the theory of random fields.
There are several books [Adler (1981)], [Yadrenko (1983)], [Vanmarcke (1983)], [Rosanov (1982)] and [Preston (1967)], and many papers on various aspects of the theory of random fields. They have


practically no intersection with this work, which can be viewed as an extension of Wiener's filtering theory. The statement of the problem is the same as in Wiener's theory, but we study random functions of several variables, that is, random fields, while Wiener (and many researchers after him) studied filtering and extrapolation of stationary random processes, that is, random functions of one variable. Wiener's basic assumptions were:

1) the random process u(t) = s(t) + n(t) is stationary,
2) it is observed on the interval (−∞, T),
3) it has a rational spectral density (this assumption can be relaxed, but for effective solution of the estimation problems it is quite useful).

The first assumption means that R(t, τ) = R(t − τ), where R is the covariance function (1.3). The second one means that D = (−∞, T). The third one means that R̃(λ) = P(λ)Q^{−1}(λ), where P(λ) and Q(λ) are polynomials, R̃(λ) ≥ 0 for −∞ < λ < ∞, and R̃(λ) := ∫_{−∞}^{∞} R(t) exp(−iλt) dt. The analytical theory used by Wiener is the theory of Wiener-Hopf equations. Later the Wiener theory was extended to the case D = [T_1, T] of a finite interval of observation, while assumptions 1) and 3) remained valid. A review of this theory with many references is [Kailath (1974)].

Although the literature on filtering and estimation theory is large (dozens of books and hundreds of papers are mentioned in [Kailath (1974)]), the analytic theory presented in this work and developed in the works of the author cited in the references has not been available in book form in its present form, although a good part of it appeared in [?, Ch. 1]. Most of the previously known analytical results on Wiener-Hopf equations with rational R̃(λ) are immediate and simple consequences of our general theory.

Engineers can use the theory presented here in many applications. These include signal and image processing in TV, underwater acoustics, geophysics, optics, etc.
In particular, the following question of long standing is answered by the theory given here. Suppose a random field (1.1) is observed in a ball B and one wants to estimate s(x_0), where x_0 is the center of B. What is the optimal size of the radius of B? If the radius is too small, then the estimate is not accurate. If it is too large, then the estimate is not better than the one obtained from the observations in a ball of smaller radius, so that the efforts are wasted. This problem is of practical importance in many applications.

We will briefly discuss some other applications of the estimation theory, for example, discrimination of hypotheses, resolution ability of linear


systems, estimation of derivatives of random functions, etc. However, the emphasis is on the theory, and the author hopes that other scientists will pursue further possible applications.

Numerical solution of the basic integral equation of estimation theory was widely discussed in the literature [Kailath (1974)] in the case of random processes (r = 1), mostly stationary, that is, when R(x, y) = R(x − y), and mostly in the case when the noise is white, so that the integral equation for the optimal filter is

(I + R)h := h(t) + ∫_0^T R_s(t − τ)h(τ) dτ = f(t),   0 ≤ t ≤ T,   (1.15)

where R_s is the covariance function of the useful signal s(t). Note that the integral operator R in (1.15) is selfadjoint and nonnegative in L²[0, T]. Therefore (I + R)^{−1} exists and is bounded, and the numerical solution of (1.15) is not difficult. Many methods are available for solving the one-dimensional Fredholm integral equation of the second kind (1.15) with the positive-definite operator I + R. Iterative methods, projection methods, collocation and many other methods are available for solving (1.15); convergence of these methods has been proved, and effective error estimates of the numerical methods are known [Kantorovich and Akilov (1980)]. Much effort was spent on effective numerical inversion of the Toeplitz matrices which one obtains if one discretizes (1.15) using equidistant collocation points [Kailath (1974)]. However, if the noise is colored, the basic equation becomes

∫_0^T R(t − τ)h(τ) dτ = f(t),   0 ≤ t ≤ T.   (1.16)

This is a Fredholm equation of the first kind. A typical example is equation (1.8). As we have seen in (1.9), the solution to (1.8) is, in general, a distribution. The theory for the numerical treatment of such equations was given by the author [Ramm (1985)] and is presented in Chapter 3 of this book. In particular, the following question of singular perturbation theory is of interest. Suppose that the equation

εh + Rh = f,   ε > 0,   (1.17)

is given. This equation corresponds to the case when the intensity of the white-noise component of the noise is ε. What is the behavior of h_ε when ε → +0? We will answer this question in Chapter 5.

This book is intended for a broad audience: mathematicians, engineers interested in signal and image processing, geophysicists, etc. Therefore the author separated the formulation of the results, their discussion, and examples from the proofs. In order to understand the proofs, one should be familiar with some facts and ideas of functional analysis. Since the author wants to give a relatively self-contained presentation, the necessary facts from functional analysis are presented in Chapter 8.

The book presents the theory developed by the author. Many aspects of estimation theory are not discussed in this book. The book has practically no intersection with works of other authors on random fields estimation theory.
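The two numerical regimes discussed in this Introduction can be illustrated by a small computation (a sketch, not from the book; the kernel exp(−|t − τ|) on [−1, 1], the grid size, and the test functions are illustrative choices): the second-kind equation of type (1.15) is well posed and a direct discretization recovers the solution, while for the first-kind equation with the small parameter ε as in (1.17) the discrete solution concentrates near the boundary as ε → 0, reflecting the delta terms in (1.9).

```python
import numpy as np

# Midpoint (Nystrom) discretization of (Kh)(t) = integral of exp(-|t-tau|) h(tau) dtau.
n = 400
t = -1.0 + (np.arange(n) + 0.5) * (2.0 / n)
K = np.exp(-np.abs(t[:, None] - t[None, :])) * (2.0 / n)

# Regime 1: white-noise component present -- (I + K)h = f, as in (1.15).
# A manufactured solution is recovered to near machine precision.
h_true = np.sin(t)
f1 = h_true + K @ h_true
h1 = np.linalg.solve(np.eye(n) + K, f1)
assert np.max(np.abs(h1 - h_true)) < 1e-10

# Regime 2: colored noise -- (eps*I + K)h = f, as in (1.17), with f(t) = e^t.
# For this f the distributional solution of Kh = f is e*delta(t - 1)
# (formula (1.9) with f'' = f), so h_eps piles up at the right endpoint.
peak = []
for eps in (1e-1, 1e-2, 1e-3):
    h_eps = np.linalg.solve(eps * np.eye(n) + K, np.exp(t))
    peak.append(h_eps[-1])
assert peak[0] < peak[1] < peak[2]   # boundary value grows as eps -> 0
```

The growth of the discrete solution at the boundary as ε decreases is exactly the singular perturbation phenomenon studied in Chapter 5.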


Chapter 2 Formulation of Basic Results

2.1 Statement of the problem

Let D ⊂ R^r be a bounded domain with a sufficiently smooth boundary Γ. The requirement that D is bounded could be omitted; it is imposed for simplicity. The reader will see that if D is not bounded, then the general line of the arguments remains the same. The additional difficulties which appear in the case when D is unbounded are of a technical nature: one needs to establish existence and uniqueness of the solution to a certain transmission problem with transmission conditions on Γ. The requirement of smoothness of Γ is also of a technical nature: the needed smoothness should guarantee existence and uniqueness of the solution to the above transmission problem.

Let L be an elliptic operator of order s, selfadjoint in H = L²(R^r). Let Λ, Φ(x, y, λ), dρ(λ) be the spectrum, spectral kernel, and spectral measure of L, respectively. A function F(L) is defined as an operator on H with the kernel

F(L)(x, y) = ∫_Λ F(λ)Φ(x, y, λ) dρ(λ)   (2.1)

and dom F(L) = {f : f ∈ H, ∫_{−∞}^{∞} |F(λ)|² d(E_λ f, f) < ∞}, where

(E_λ f, f) = ∫_{−∞}^{λ} ( ∫∫ Φ(x, y, µ) f(y) f^*(x) dx dy ) dρ(µ),   ∫∫ := ∫_{R^r} ∫_{R^r}.   (2.2)

Definition 2.1  Let R denote the class of kernels of positive rational functions of L, where L runs through the set of all selfadjoint elliptic operators



in H = L²(R^r). In other words, R(x, y) ∈ R if and only if

R(x, y) = ∫_Λ P(λ)Q^{−1}(λ)Φ(x, y, λ) dρ(λ),   (2.3)

where P(λ) > 0 and Q(λ) > 0 for all λ ∈ Λ, and Λ, Φ, dρ correspond to an elliptic selfadjoint operator L in H = L²(R^r).

Let

p = deg P(λ),   q = deg Q(λ),   s = ord L,   (2.4)

where deg P(λ) stands for the degree of the polynomial P(λ), and ord L stands for the order of the differential operator L.

Consider an operator given by the differential expression

Lu := Σ_{|j| ≤ s} a_j(x) ∂^j u,   (2.5)

where j = (j_1, j_2, …, j_r) is a multiindex, ∂^j u = ∂_{x_1}^{j_1} ∂_{x_2}^{j_2} ⋯ ∂_{x_r}^{j_r} u, |j| = j_1 + j_2 + … + j_r, and the j_m ≥ 0 are integers. The expression (2.5) is called elliptic if, for any real vector t ∈ R^r, the equation

Σ_{|j| = s} a_j(x) t^j = 0

implies that t = 0. The expression

L⁺u := Σ_{|j| ≤ s} (−1)^{|j|} ∂^j (a_j^*(x) u)   (2.6)

is called the formal adjoint of L. The star in (2.6) stands for complex conjugation. One says that L is formally selfadjoint if L = L⁺. If L is formally selfadjoint, then L is symmetric on C_0^∞(R^r), that is, (Lφ, ψ) = (φ, Lψ) for all φ, ψ ∈ C_0^∞(R^r), where (φ, ψ) is the inner product in H = L²(R^r). Sufficient conditions on a_j(x) can be given for a formally selfadjoint differential expression to define a selfadjoint operator in H in the following way. Define a symmetric operator L_0 with the domain C_0^∞(R^r) by the formula L_0 u = Lu for u ∈ C_0^∞(R^r). Under suitable conditions on a_j(x) one can prove that L_0 is essentially selfadjoint, that is, its closure L is selfadjoint (see Chapter 8). In particular, this is the case if a_j = a_j^* = const.

In what follows we assume that R(x, y) ∈ R. Some generalizations will be considered later. The kernel R(x, y) is the covariance function (1.3) of the random field U(x) = s(x) + n(x) observed in a bounded domain D ⊂ R^r.
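As a simple illustration of the definitions above (an added example, not from the text), take the Laplacian:

```latex
% For \mathcal{L} = -\Delta in \mathbb{R}^r (so s = 2), the characteristic form is
\sum_{|j|=2} a_j(x)\, t^j \;=\; -\sum_{m=1}^{r} t_m^2 \;=\; -|t|^2 ,
% which vanishes for real t only at t = 0; hence -\Delta is elliptic.
% Its coefficients are real constants, so it is formally selfadjoint,
% and the operator defined on C_0^\infty(\mathbb{R}^r) is essentially selfadjoint.
```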


A linear estimation problem can be formulated as follows: find a linear estimate

Û := LU := ∫_D h(x, y)U(y) dy   (2.7)

ε := \overline{|Û − As|²} = min.   (2.8)

The kernel h(x, y) in (2.7) is a distribution, so that, by L. Schwartz's theorem about kernels, estimate (2.7) is the most general linear estimate. The operator A in (2.8) is assumed to be known. It is an arbitrary operator, not necessarily a linear one. In the case when AU = U, that is, A = I, where I is the identity operator, the estimation problem (2.8) is called the filtering problem. From (2.8) and (2.7) one obtains

ε = \overline{∫_D h(x, y)U(y) dy ∫_D h^*(x, z)U^*(z) dz} − 2 Re \overline{∫_D h(x, z)U(z) dz (As)^*(x)} + \overline{|As(x)|²}

  = ∫_D ∫_D h(x, y)h^*(x, z)R(z, y) dz dy − 2 Re ∫_D h^*(x, z)f(z, x) dz + \overline{|As(x)|²} = min.   (2.9)

Here

f(y, x) := \overline{U(y)(As(x))^*} = f^*(x, y),   (2.10)

where the bar stands for the mean value and the star stands for complex conjugation. By the standard procedure one finds that a necessary condition for the minimum in (2.9) is:

∫_D R(z, y)h(x, y) dy = f(z, x),   x, z ∈ D̄ := D ∪ Γ.   (2.11)

In order to derive (2.11) from (2.9) one takes h + αη in place of h in (2.9).
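The substitution h → h + αη can be written out explicitly in the real-valued case (an added sketch of the variational step, not verbatim from the text):

```latex
% With \epsilon(h) = \int_D\!\int_D h(x,y)\,h(x,z)\,R(z,y)\,dz\,dy
%                  - 2\int_D h(x,z)\,f(z,x)\,dz + \overline{|As(x)|^2},
% replace h by h + \alpha\eta and use the symmetry R(z,y) = R(y,z):
\epsilon(h+\alpha\eta)-\epsilon(h)
  \;=\; 2\alpha\int_D \eta(x,z)\Big[\int_D R(z,y)\,h(x,y)\,dy - f(z,x)\Big]\,dz
  \;+\; \alpha^2\int_D\!\int_D \eta(x,y)\,\eta(x,z)\,R(z,y)\,dz\,dy .
% The quadratic term is nonnegative because R is a covariance kernel, so
% \epsilon is minimal iff the linear term vanishes for every \eta, i.e. (2.11).
```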

Here α is a small number and η ∈ C_0^∞(D). The condition ε(h) ≤ ε(h + αη) implies ∂ε/∂α |_{α=0} = 0. This implies (2.11). Since h is a distribution, the left-hand side of (2.11) makes sense only if the kernel R(z, y) belongs to

the space of test functions on which the distribution h is defined. We February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book


will discuss this point later in detail. In (2.11) the variable x enters as a parameter. Therefore, the equation

Rh := ∫_D R(x, y)h(y) dy = f(x),   x ∈ D̄,   (2.12)

is basic for estimation theory. We have suppressed the dependence on x in (2.11) and have written x in place of z in (2.12). From this derivation it is clear that the operator A does not influence the theory in an essential way: if one changes A, then f is changed, but the kernel of the basic equation (2.12) remains the same. If Au = ∂u/∂x_j, one has the problem of estimating the derivative of u. If Au = u(x + x_0), where x_0 is a given point such that x + x_0 ∉ D, then one has the extrapolation problem. Analytically these problems reduce to solving equation (2.12). If no assumptions are made about R(x, y) except that R(x, y) is a covariance function, then one cannot develop an analytical theory for equation (2.12). Such a theory will be developed below under the basic assumption R(x, y) ∈ R.

Let us show that the class R of kernels, that is, the class of random fields that we have introduced, is a natural one. To see this, recall that in the one-dimensional case, studied analytically in the literature, the covariance functions are of the form

R(x, y) = R(x − y),   x, y ∈ R¹,

R̃(λ) := ∫_{−∞}^{∞} R(x) exp(−iλx) dx = P(λ)Q^{−1}(λ),

where P(λ) and Q(λ) are positive polynomials [Kailath (1974)]. This case is a very particular case of the kernels in the class R. Indeed, take

r = 1,   L = −i d/dx,   Λ = (−∞, ∞),   dρ(λ) = dλ,

Φ(x, y, λ) = (2π)^{−1} exp{iλ(x − y)}.

Then formula (2.3) gives the above class of convolution covariance functions with rational Fourier transforms. If p = q, where p and q are defined in (2.4), then the basic equation (2.12) can be written as

Rh := σ²h(x) + ∫_D R_1(x, y)h(y) dy = f(x),   x ∈ D,   σ² > 0,   (2.13)


where

P(λ)Q^{−1}(λ) = σ² + P_1(λ)Q^{−1}(λ),   p_1 := deg P_1 < q,   (2.14)

and σ² > 0 is interpreted as the variance of the white noise component of the observed signal U(x). If p < q, then the noise in U(x) is colored: it does not contain a white noise component.

Mathematically, equation (2.13) is very simple. The operator R in (2.13) is of Fredholm type, selfadjoint and positive definite in H, R ≥ σ²I, where I is the identity operator, and A ≥ B means (Au, u) ≥ (Bu, u) for all u ∈ H. Therefore, if p = q, then equation (2.12) reduces to (2.13) and has a unique solution in H. This solution can be computed numerically without difficulties. There are many numerical methods which are applicable to equation (2.13). In particular (see Section 2.3.2), an iterative process can be constructed for solving (2.13) which converges as a geometric series; a projection method can be constructed for solving (2.13) which converges and is stable computationally; one can solve (2.13) by collocation methods. However, the important and practically interesting question is the following one: what happens with the solution h_σ to (2.13) as σ → 0? This is a question of singular perturbation theory: for σ > 0 the unique solution to equation (2.13) belongs to L²(D), while for σ = 0 the unique solution to (2.13) of minimal order of singularity is a distribution. What is the asymptotics of h_σ as σ → 0? As we will show, the answer to this question is based on analytical results concerning the solution to (2.13).

The basic questions we would like to answer are:

1) In what space of functions or distributions should one look for the solution to (2.12)?

2) When does a solution to (2.12) solve the estimation problem (2.8)? Note that (2.12) is only a necessary condition for h(x, y) to solve (2.8). We will show that there is a solution to (2.12) which solves (2.8), and this solution to (2.12) is unique.
The fact that the estimation problem (2.8) has a unique solution follows from a Hilbert-space interpretation of (2.8) as the problem of finding the distance from the element $(As)(x)$ to the subspace spanned by the values of the random field $u(y)$, $y \in D$. Since there exists a unique element of a subspace at which this distance is attained, problem (2.8) has a solution and the solution is unique. It was mentioned in the Introduction (see (1.9)) that equation (2.12) may have no solutions in $L^1(D)$; rather, its solution is a distribution.

14 Random Fields Estimation Theory

There can be several solutions to (2.12) in spaces of distributions, but only one of them solves the estimation problem (2.8). This solution is characterized as the solution to (2.12) of minimal order of singularity.

3) What are the order of singularity and the singular support of the solution to (2.12) which solves (2.8)?

4) Is this solution stable under small perturbations of the data, that is, under small perturbations of $f(x)$ and $R(x,y)$? What is the appropriate notion of smallness in this case? What are the stability estimates for $h$?

5) How does one compute the solution analytically?

6) How does one compute the solution numerically?

7) What are the properties of the operator $R : L^2(D) \to L^2(D)$ in (2.12)? In particular, what is the asymptotics of its eigenvalues $\lambda_j(D)$ as $j \to +\infty$? What is the asymptotics of $\lambda_1(D)$ as $D \to \mathbb{R}^r$, that is, as $D$ expands uniformly in all directions?

These questions are of interest in applications. Note that if $D$ is finite then the operator $R$ is selfadjoint and positive in $L^2(D)$, its spectrum is discrete, $\lambda_1 > \lambda_2 \ge \cdots > 0$, and the first eigenvalue is nondegenerate by the Krein–Rutman theorem. However, if $D = \mathbb{R}^r$ then the spectrum of $R$ may be continuous; this is the case, e.g., when $R(x,y) = R(x-y)$, $\tilde{R}(\lambda) = P(\lambda)Q^{-1}(\lambda)$. Therefore it is of interest to find $\lambda_{1\infty} := \lim \lambda_1(D)$ as $D \to \mathbb{R}^r$. The quantity $\lambda_{1\infty}$ is used in some statistical problems.

8) What is the asymptotics of the solution to (2.13) as $\sigma \to 0$?
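The limit $\lambda_{1\infty}$ can be watched numerically. For the kernel $R(x,y) = e^{-|x-y|}$ the spectral density is $\tilde{R}(\lambda) = 2/(1+\lambda^2)$, with maximum $2$, so $\lambda_1(D)$ should grow with $D$ and approach $2$. A sketch (the kernel choice, the Nyström discretization, and the power iteration are assumptions of this sketch):

```python
import math

# lambda_1(D) for R(x, y) = exp(-|x - y|) on D = (-T, T), estimated by
# power iteration on a Nystrom (quadrature) discretization of the operator.
def lambda1(T, n=150, iters=120):
    xs = [-T + 2*T*i/(n-1) for i in range(n)]
    w = 2.0*T/(n-1)
    K = [[w*math.exp(-abs(xi-xj)) for xj in xs] for xi in xs]
    v, lam = [1.0]*n, 0.0
    for _ in range(iters):
        u = [sum(Ki[j]*v[j] for j in range(n)) for Ki in K]
        lam = max(abs(t) for t in u)
        v = [t/lam for t in u]
    return lam

vals = [lambda1(T) for T in (1.0, 2.0, 5.0, 20.0)]
print(vals)   # increasing toward the supremum 2 of the spectral density
```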

2.2 Formulation of the results (multidimensional case)

2.2.1 Basic results

We assume throughout that $R(x,y) \in \mathcal{R}$ and that $f(x)$ is smooth; more precisely, $f \in H^\alpha$, where $H^\alpha = H^\alpha(D)$ is the Sobolev space, $\alpha := (q-p)s/2$, and $s$, $q$, $p$ are the same as in (2.4). The coefficients $a_j(x)$ of $\mathcal{L}$ (see (2.5)) are sufficiently smooth: say, $a_s(x) \in C(\mathbb{R}^r)$; $a_j(x) \in L^{r/(s-|j|)}_{loc}$ if $s - |j| < r/2$; $a_j \in L^2_{loc}$ if $s - |j| > r/2$; and $a_j \in L^{2+\epsilon}_{loc}$, $\epsilon > 0$, if $s - |j| = r/2$ (see [Hörmander (1983-85), Ch. 17]).

If $q \le p$, then the problem of finding the solution to (2.12) is simple: such a solution does exist, is unique, and belongs to $H^{m+2|\alpha|}$ if $f \in H^m$. This follows from


the usual theory of elliptic boundary value problems [Berezanskij (1968)], since the operator $P(\mathcal{L})Q^{-1}(\mathcal{L})$ is an elliptic integro-differential operator of order $2|\alpha|$ if $q \le p$. The solution satisfies the elliptic estimate:

$$\|h\|_{H^{m+2|\alpha|}} \le c\,\|f\|_{H^m}, \qquad q \le p,$$

where $c$ depends on $\mathcal{L}$ but does not depend on $f$.

If $q > p$, then the problem of finding the mos solution of (2.12) is more interesting and difficult, because the order of singularity of $h$ is, in general, positive: $\operatorname{ord} h = \alpha$. The basic result we obtain is:

The mapping $R : \dot{H}^{-\alpha} \to H^\alpha$ is a linear isomorphism between the spaces $\dot{H}^{-\alpha}$ and $H^\alpha$. The singular support of $h$ is $\partial D = \Gamma$, provided that $f$ is smooth.

If $h_1$ is a solution to equation (2.12) and $\operatorname{ord} h_1 > \alpha$, then $\epsilon_1 = \infty$, where $\epsilon$ is defined in (2.8). Therefore if $h_1$ solves (2.12) and $\operatorname{ord} h_1 > \alpha$, then $h_1$ does not solve the estimation problem (2.8). The unique solution to (2.8) is the unique mos solution to (2.12). We give analytical formulas for the mos solution to (2.12). This solution is stable with respect to small perturbations of the data. We also give a stable numerical procedure for computing this solution. In this section we formulate the basic results.

Theorem 2.1 If $R(x,y) \in \mathcal{R}$, then the operator $R$ in (2.12) is an isomorphism between the spaces $\dot{H}^{-\alpha}$ and $H^\alpha$. The solution to (2.12) of minimal order of singularity, $\operatorname{ord} h \le \alpha$, can be calculated by the formula:

$$h(x) = Q(\mathcal{L})G, \tag{2.15}$$

where

$$G(x) = \begin{cases} g(x) + v(x) & \text{in } D, \\ u(x) & \text{in } \Omega := \mathbb{R}^r \setminus D, \end{cases} \tag{2.16}$$

$g(x) \in H^{s(p+q)/2}$ is an arbitrary fixed solution to the equation

$$P(\mathcal{L})g = f \quad \text{in } D, \tag{2.17}$$

and the functions $u(x)$ and $v(x)$ form the unique solution to the following transmission problem (2.18)–(2.20):

$$Q(\mathcal{L})u = 0 \ \text{in } \Omega, \qquad u(\infty) = 0, \tag{2.18}$$

$$P(\mathcal{L})v = 0 \ \text{in } D, \tag{2.19}$$


$$\partial_N^j u = \partial_N^j (v+g) \ \text{on } \Gamma, \qquad 0 \le j \le \frac{s(p+q)}{2} - 1. \tag{2.20}$$

By $u(\infty) = 0$ we mean $\lim u(x) = 0$ as $|x| \to \infty$.

Corollary 2.1 If $f \in H^{2\beta}$, $\beta \ge \alpha$, then

$$\operatorname{sing\,supp} h = \Gamma. \tag{2.21}$$

Corollary 2.2 If P (λ) = 1, then the transmission problem (2.18)-(2.20) reduces to the Dirichlet problem in Ω:

$$Q(\mathcal{L})u = 0 \ \text{in } \Omega, \qquad u(\infty) = 0, \tag{2.22}$$

$$\partial_N^j u = \partial_N^j f \ \text{on } \Gamma, \qquad 0 \le j \le \frac{sq}{2} - 1, \tag{2.23}$$

and (2.15) takes the form

$$h = Q(\mathcal{L})F, \qquad F = \begin{cases} f & \text{in } D, \\ u & \text{in } \Omega. \end{cases} \tag{2.24}$$

Corollary 2.1 follows immediately from formulas (2.15) and (2.16), since $g(x)+v(x)$ and $u(x)$ are smooth inside $D$ and $\Omega$ respectively. Corollary 2.2 follows immediately from Theorem 2.1: if $P(\lambda) = 1$ then $g = f$, $v = 0$, and $p = 0$.

Let $\omega(\lambda) \ge 0$, $\omega(\lambda) \in C(\mathbb{R}^1)$, $\omega(\infty) = 0$,

$$\omega := \max_{\lambda \in \Lambda} \omega(\lambda), \tag{2.25}$$

$$R(x,y) = \int_\Lambda \omega(\lambda)\Phi(x,y,\lambda)\,d\rho(\lambda), \tag{2.26}$$

and let $\lambda_j = \lambda_j(D)$ be the eigenvalues of the operator $R : L^2(D) \to L^2(D)$ with kernel (2.26), arranged so that

$$\lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \cdots > 0. \tag{2.27}$$

Theorem 2.2 If $D \subset D'$ then $\lambda_j \le \lambda_j'$, where $\lambda_j' = \lambda_j(D')$. If

$$\sup_{x \in \mathbb{R}^r}\int |R(x,y)|\,dy := A < \infty, \tag{2.28}$$


then

$$\lambda_{1\infty} = \omega, \tag{2.29}$$

where

$$\lambda_{1\infty} := \lim_{D \to \mathbb{R}^r} \lambda_1(D), \tag{2.30}$$

and $\omega$ is defined in (2.25).

Theorem 2.3 If $\omega(\lambda) = |\lambda|^{-a}(1 + o(1))$ as $|\lambda| \to \infty$, and $a > 0$, then the asymptotics of the eigenvalues of the operator $R$ with kernel (2.26) is given by the formula:

$$\lambda_j \sim c\,j^{-as/r} \ \text{as } j \to \infty, \qquad c = \mathrm{const} > 0, \tag{2.31}$$

where $c = \gamma^{as/r}$ and

$$\gamma := (2\pi)^{-r}\int_D \eta(x)\,dx, \tag{2.32}$$

with

$$\eta(x) := \operatorname{meas}\Big\{ t : t \in \mathbb{R}^r,\ \sum_{|\alpha|=|\beta|=s/2} a_{\alpha\beta}(x)\,t^{\alpha+\beta} \le 1 \Big\}. \tag{2.33}$$

Here the form $a_{\alpha\beta}(x)$ generates the principal part of the selfadjoint elliptic operator $\mathcal{L}$:

$$\mathcal{L}u = \sum_{|\alpha|=|\beta|=s/2} \partial^\alpha\big(a_{\alpha\beta}(x)\partial^\beta u\big) + \mathcal{L}_1 u, \qquad \operatorname{ord}\mathcal{L}_1 < s.$$

Corollary 2.3 If $\omega(\lambda) = P(\lambda)Q^{-1}(\lambda)$ then $a = q - p$, where $q = \deg Q$, $p = \deg P$, and $\lambda_n \sim c\,n^{-(q-p)s/r}$, where $\lambda_n$ are the eigenvalues of the operator in equation (2.12).

This Corollary follows immediately from Theorem 2.3. Theorems 2.1–2.3 answer questions 1)–5) and 7) in Section 2.1. Answers to questions 6) and 8) will be given in Chapter 3. The proof of Theorem 2.3 is given in Section 8.3.2.10.
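Corollary 2.3 can be checked against a kernel whose spectrum is computable. For $R(x,y) = e^{-|x-y|}$ on $(-1,1)$ one has $q - p = 2$, $s = r = 1$, so $\lambda_n \sim c\,n^{-2}$. Since $\frac12(-d^2/dx^2 + 1)$ sends this kernel to $\delta(x-y)$, the eigenfunctions satisfy $h'' = -\mu^2 h$ with $\lambda = 2/(1+\mu^2)$, and the boundary conditions (of the type (3.3)) lead to the classical transcendental equations $\mu\tan\mu = 1$ (even eigenfunctions) and $\tan\mu = -\mu$ (odd eigenfunctions); since $\mu_n$ grows linearly in $n$, one gets $\lambda_n n^2 \to 8/\pi^2$. A sketch:

```python
import math

# Eigenvalues lambda_n = 2/(1 + mu_n^2) of exp(-|x-y|) on (-1, 1), where
# mu_n runs over the interlaced roots of mu tan(mu) = 1 and tan(mu) = -mu;
# there is exactly one root in each half-period ((m-1)pi/2, m pi/2).
def bisect(F, lo, hi, it=100):
    for _ in range(it):
        mid = 0.5*(lo + hi)
        if F(lo)*F(mid) <= 0: hi = mid
        else: lo = mid
    return 0.5*(lo + hi)

mus = []
for m in range(1, 201):
    lo, hi = (m-1)*math.pi/2 + 1e-9, m*math.pi/2 - 1e-9
    if m % 2 == 1:   # even type: mu sin(mu) - cos(mu) = 0
        F = lambda mu: mu*math.sin(mu) - math.cos(mu)
    else:            # odd type:  sin(mu) + mu cos(mu) = 0
        F = lambda mu: math.sin(mu) + mu*math.cos(mu)
    mus.append(bisect(F, lo, hi))

lam = [2.0/(1.0 + mu*mu) for mu in mus]
print(lam[0])                              # largest eigenvalue, ~1.1493
print(lam[-1]*200**2*math.pi**2/8)         # tends to 1: lambda_n ~ (8/pi^2) n^{-2}
```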

2.2.2 Generalizations

First, let us consider a generalization of the class $\mathcal{R}$ of kernels to the case when there are several commuting differential operators. Let $\mathcal{L}_1, \dots, \mathcal{L}_m$ be a system of commuting selfadjoint differential operators in $L^2(\mathbb{R}^r)$. There


exists a spectral measure $d\mu(\xi)$ and a spectral kernel $\Phi(x,y,\xi)$, $\xi = (\xi_1,\dots,\xi_m)$, such that a function $F(\mathcal{L}_1,\dots,\mathcal{L}_m)$ is given by the formula

$$F(\mathcal{L}_1,\dots,\mathcal{L}_m) = \int_M F(\xi)\Phi(\xi)\,d\mu(\xi), \tag{2.34}$$

where $\Phi(\xi)$ is the operator with kernel $\Phi(x,y,\xi)$. The domain of definition of the operator $F(\mathcal{L}_1,\dots,\mathcal{L}_m)$ is the set of all functions $u \in L^2(\mathbb{R}^r)$ for which $\int_M |F(\xi)|^2(\Phi(\xi)u,u)\,d\mu < \infty$; here $M$ is the support of the spectral measure $d\mu$, and the parentheses denote the inner product in $L^2(\mathbb{R}^r)$.

For example, let $m = r$, $\mathcal{L}_j = -i\frac{\partial}{\partial x_j}$. Then $\xi = (\xi_1,\dots,\xi_r)$, $d\mu = d\xi_1\dots d\xi_r$, $\Phi(x,y,\xi) = (2\pi)^{-r}\exp\{i\xi\cdot(x-y)\}$, where the dot denotes the inner product in $\mathbb{R}^r$.

If $F(\xi) = P(\xi)Q^{-1}(\xi)$, where $P(\xi)$ and $Q(\xi)$ are positive polynomials and the operators $P(\mathcal{L}) := P(\mathcal{L}_1,\dots,\mathcal{L}_r)$ and $Q(\mathcal{L}) := Q(\mathcal{L}_1,\dots,\mathcal{L}_r)$ are elliptic of orders $m$ and $n$ respectively, $m < n$, then theorems analogous to Theorems 2.1–2.2 hold with $sp = m$ and $sq = n$. Theorem 2.3 also has an analogue, in which $as = n - m$ in formula (2.31).

Another generalization of the class $\mathcal{R}$ of kernels is the following one. Let $Q(x,\partial)$ and $P(x,\partial)$ be elliptic differential operators and

$$QR = P\delta(x-y) \quad \text{in } \mathbb{R}^r. \tag{2.35}$$

Note that the kernels $R \in \mathcal{R}$ satisfy equation (2.35) with $Q = Q(\mathcal{L})$, $P = P(\mathcal{L})$. Let $\operatorname{ord} Q = n$, $\operatorname{ord} P = m$, $n > m$. Assume that the transmission problem (2.18)–(2.20), with $Q(x,\partial)$ and $P(x,\partial)$ in place of $Q(\mathcal{L})$ and $P(\mathcal{L})$ respectively, and $ps = m$, $qs = n$, has a unique solution in $H^{(n+m)/2}$. Then Theorem 2.1 holds with $\alpha = (n-m)/2$.

The transmission problem (2.18)–(2.20) with $Q(x,\partial)$ and $P(x,\partial)$ in place of $Q(\mathcal{L})$ and $P(\mathcal{L})$ is uniquely solvable provided that, for example, $Q(x,\partial)$ and $P(x,\partial)$ are elliptic positive definite operators. For more details see Chapter 4 and Appendices A and B.
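A one-dimensional instance of (2.35) can be verified directly: for $R(x,y) = \frac12 e^{-|x-y|}$, the operators $Q = -d^2/dx^2 + 1$ and $P = 1$ give $QR = \delta(x-y)$ in $\mathbb{R}^1$. The check below tests the identity distributionally, $\int R(x,0)(Q\phi)(x)\,dx = \phi(0)$, against a Gaussian test function (the test function and the quadrature are choices of this sketch):

```python
import math

# Distributional check of Q R = delta with R(x) = (1/2) exp(-|x|),
# Q = -d^2/dx^2 + 1:  \int R(x) (-phi'' + phi)(x) dx must equal phi(0).
def R(x):    return 0.5*math.exp(-abs(x))
def phi(x):  return math.exp(-x*x)                                  # test function
def Qphi(x): return -(4*x*x - 2)*math.exp(-x*x) + math.exp(-x*x)    # -phi'' + phi

L, n = 10.0, 40001          # the grid contains x = 0, where R has its kink
h = 2*L/(n-1)
total = sum((0.5 if i in (0, n-1) else 1.0)*R(-L + i*h)*Qphi(-L + i*h)
            for i in range(n))*h
print(total, phi(0))   # both ~ 1
```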

2.3 Formulation of the results (one-dimensional case)

In this section we formulate the results in the one-dimensional case, i.e., $r = 1$. Although the corresponding estimation problem is a problem for random processes (and not random fields), the method and the results are the same as in the multidimensional case; because of the interest of the results in applications, we formulate the results in the one-dimensional case separately.

2.3.1 Basic results for the scalar equation

Let $r = 1$, $D = (t-T, t)$, $R(x,y) \in \mathcal{R}$. The basic equation (2.12) takes the form

$$\int_{t-T}^{t} R(x,y)h(y)\,dy = f(x), \qquad t-T \le x \le t. \tag{2.36}$$

Assume that $f \in H^\alpha$, $\alpha = s(q-p)/2$.

Theorem 2.4 The solution to equation (2.36) in $\dot{H}^{-\alpha}$ exists, is unique, and can be found by the formula

$$h = Q(\mathcal{L})G, \tag{2.37}$$

where

$$G(x) = \begin{cases} \sum_{j=1}^{sq/2} b_j^-\psi_j^-(x), & x \le t-T, \\[2pt] g(x), & t-T \le x \le t, \\[2pt] \sum_{j=1}^{sq/2} b_j^+\psi_j^+(x), & x \ge t. \end{cases} \tag{2.38}$$

Here $b_j^\pm$ are constants, and the functions $\psi_j^\pm(x)$, $1 \le j \le sq/2$, form a fundamental system of solutions to the equation

$$Q(\mathcal{L})\psi = 0, \qquad \psi_j^-(-\infty) = 0, \quad \psi_j^+(+\infty) = 0. \tag{2.39}$$

The function $g(x)$ is defined by the formula

$$g(x) = g_0(x) + \sum_{j=1}^{sp} c_j\phi_j(x), \tag{2.40}$$

where g0(x) is an arbitrary fixed solution to the equation

$$P(\mathcal{L})g = f, \qquad t-T \le x \le t, \tag{2.41}$$

and the functions $\phi_j$, $1 \le j \le sp$, form a fundamental system of solutions to the equation

$$P(\mathcal{L})\phi = 0, \tag{2.42}$$


and $c_j$, $1 \le j \le sp$, are constants. The constants $b_j^\pm$, $1 \le j \le sq/2$, and $c_j$, $1 \le j \le sp$, are uniquely determined from the linear system:

$$D^k\Big(\sum_{j=1}^{sq/2} b_j^-\psi_j^-\Big)\Big|_{x=t-T} = D^k\Big(g_0 + \sum_{j=1}^{sp} c_j\phi_j\Big)\Big|_{x=t-T}, \tag{2.43}$$

$$D^k\Big(\sum_{j=1}^{sq/2} b_j^+\psi_j^+\Big)\Big|_{x=t} = D^k\Big(g_0 + \sum_{j=1}^{sp} c_j\phi_j\Big)\Big|_{x=t}, \tag{2.44}$$

where $D = d/dx$, $0 \le k \le \frac12 s(p+q) - 1$. The map $R^{-1} : f \to h$, where $h$ is given by formula (2.37), is an isomorphism of the space $H^\alpha$ onto the space $\dot{H}^{-\alpha}$.

Remark 2.1 This theorem is a complete analogue of Theorem 2.1. The role of $\mathcal{L}$ is played now by an ordinary differential selfadjoint operator $\mathcal{L}$ in $L^2(\mathbb{R}^1)$. An ordinary differential operator is elliptic if and only if the coefficient in front of its senior (that is, highest-order) derivative does not vanish:

$$\mathcal{L}u = \sum_{j=0}^{s} a_j(x)D^j u, \qquad a_s(x) \ne 0. \tag{2.45}$$

One can assume that $a_s(x) > 0$, $x \in \mathbb{R}^1$, and the condition of uniform ellipticity is assumed, that is,

$$0 < c_1 \le a_s(x) \le c_2, \tag{2.46}$$

where $c_1$ and $c_2$ are positive constants which do not depend on $x$.

Corollary 2.4 If $f \in H^\alpha$, then $\operatorname{sing\,supp} h = \partial D$, where $\partial D$ consists of the two points $t$ and $t-T$.

Corollary 2.4 is a complete analogue of Corollary 2.1.

Corollary 2.5 Let $Q(\lambda) = a^+(\lambda)a^-(\lambda)$, where $a^\pm(\lambda)$ are polynomials of degree $q/2$, the zeros of the polynomial $a^+(\lambda)$ lie in the upper half-plane $\operatorname{Im}\lambda > 0$, while the zeros of $a^-(\lambda)$ lie in the lower half-plane $\operatorname{Im}\lambda < 0$. Since $Q(\lambda) > 0$ for $-\infty < \lambda < \infty$, the zeros of $a^-(\lambda)$ are complex conjugates of the corresponding zeros of $a^+(\lambda)$. Assume that $P(\lambda) = 1$. Then formula (2.37) can be written as

$$h(x) = a^+(\mathcal{L})\big[\theta(x-t+T)\,a^-(\mathcal{L})f(x)\big] - a^-(\mathcal{L})\big[\theta(x-t)\,a^+(\mathcal{L})f(x)\big], \tag{2.47}$$


where $\theta(x) = \begin{cases} 1 & \text{if } x \ge 0, \\ 0 & \text{if } x < 0, \end{cases}$ and the differentiation in (2.47) is understood in the sense of distributions. This Corollary is an analogue of Corollary 2.2.

Remark 2.2 Formula (2.47) is convenient for practical calculations. Let us give a simple example of its application. Let $\mathcal{L} = -i\partial$, $r = 1$, $P(\lambda) = 1$, $Q(\lambda) = (\lambda^2+1)/2$, $R(x,y) = \exp(-|x-y|)$, $\Phi(x,y,\lambda) = (2\pi)^{-1}\exp\{i\lambda(x-y)\}$, $d\rho(\lambda) = d\lambda$, $t = 1$, $t-T = -1$. Equation (2.36) becomes

$$\int_{-1}^{1}\exp(-|x-y|)h(y)\,dy = f(x), \qquad -1 \le x \le 1, \tag{2.48}$$

$a^+(\lambda) = \frac{\lambda-i}{\sqrt2}$, $a^-(\lambda) = \frac{\lambda+i}{\sqrt2}$. Formula (2.47) yields:

$$\begin{aligned}
h(x) &= \frac12(-i\partial - i)\big[\theta(x+1)(-i\partial + i)f(x)\big] - \frac12(-i\partial + i)\big[\theta(x-1)(-i\partial - i)f(x)\big] \\
&= -\frac12(\partial + 1)\big[\theta(x+1)(\partial - 1)f\big] + \frac12(\partial - 1)\big[\theta(x-1)(\partial + 1)f\big] \\
&= -\frac12\theta(x+1)(\partial^2 - 1)f - \frac12\delta(x+1)(\partial - 1)f + \frac12\theta(x-1)(\partial^2 - 1)f + \frac12\delta(x-1)(\partial + 1)f \\
&= \frac{-f'' + f}{2} + \frac{f'(1) + f(1)}{2}\,\delta(x-1) + \frac{-f'(-1) + f(-1)}{2}\,\delta(x+1). 
\end{aligned} \tag{2.49}$$

Here we have used the well-known formula $\theta'(x-a) = \delta(x-a)$, where $\delta(x)$ is the delta-function. Formula (2.49) is the same as formula (1.9). The term $(-f''+f)/2$ in (2.49) vanishes outside the interval $[-1,1]$ by definition.

Remark 2.3 If $t = +\infty$ and $t-T = 0$, so that equation (2.36) takes the form of the Wiener–Hopf equation of the first kind

$$\int_0^\infty R(x,y)h(y)\,dy = f(x), \qquad x \ge 0, \tag{2.50}$$

then formula (2.47) reduces to

$$h(x) = a^+(\mathcal{L})\big[\theta(x)\,a^-(\mathcal{L})f(x)\big]. \tag{2.51}$$


If $\mathcal{L} = -i\partial$, formula (2.51) can be obtained by the well-known factorization method.

Example 2.1 Consider the equation

$$\int_0^\infty \exp(-|x-y|)h(y)\,dy = f(x), \qquad x \ge 0. \tag{2.52}$$

By formula (2.51) one obtains

$$h(x) = \frac{-f'' + f}{2} + \frac{-f'(0) + f(0)}{2}\,\delta(x), \tag{2.53}$$

if one uses calculations similar to those given in formula (2.49).
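The explicit formulas (2.49) and (2.53) are easy to verify for concrete right-hand sides; the choices $f = \cosh x$, $f = \cos x$ and $f = e^{-x}$ below are illustrative assumptions of this sketch.

```python
import math

# (a) Equation (2.48) with f(x) = cosh(x): here -f'' + f = 0, so (2.49)
#     gives h = (e/2)[delta(x-1) + delta(x+1)], and
#     (Rh)(x) = (e/2)[exp(-|x-1|) + exp(-|x+1|)] must equal cosh(x) on [-1,1].
err_a = max(abs(0.5*math.e*(math.exp(-abs(x-1)) + math.exp(-abs(x+1)))
               - math.cosh(x)) for x in (-1.0, -0.3, 0.0, 0.8, 1.0))

# (b) Equation (2.48) with f(x) = cos(x): (2.49) gives
#     h = cos(x) + A delta(x-1) + B delta(x+1),
#     A = (f'(1)+f(1))/2, B = (-f'(-1)+f(-1))/2; check Rh = f by quadrature.
A = (-math.sin(1.0) + math.cos(1.0))/2
B = A                                    # equal by the symmetry of cos
def Rh(x, n=20001):
    step = 2.0/(n-1)
    smooth = sum((0.5 if i in (0, n-1) else 1.0)
                 * math.exp(-abs(x - (-1 + i*step)))*math.cos(-1 + i*step)
                 for i in range(n))*step
    return smooth + A*math.exp(-abs(x-1)) + B*math.exp(-abs(x+1))
err_b = max(abs(Rh(x) - math.cos(x)) for x in (-0.7, 0.0, 0.5))

# (c) The Wiener-Hopf equation (2.52) with f(x) = exp(-x): by (2.53),
#     h = delta(x), and (R delta)(x) = exp(-|x|) = f(x) for x >= 0.
err_c = max(abs(math.exp(-abs(x)) - math.exp(-x)) for x in (0.0, 1.0, 3.0))

print(err_a, err_b, err_c)   # all tiny
```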

2.3.2 Vector equations

In both cases $r = 1$ and $r > 1$ it is of interest to consider estimation problems for vector random processes and vector random fields. For vector random processes the basic equation is (2.36) with a positive kernel $R(x,y)$, in the sense that $(Rh,h) > 0$ for $h \not\equiv 0$, given by formula (2.3), in which $\tilde{R}(\lambda) = P(\lambda)Q^{-1}(\lambda)$ is a matrix:

$$\tilde{R}(\lambda) = (\tilde{R}_{ij}(\lambda)), \qquad \tilde{R}_{ij}(\lambda) := P_{ij}(\lambda)Q_{ij}^{-1}(\lambda), \quad 1 \le i,j \le d, \tag{2.54}$$

where $P_{ij}(\lambda)$ and $Q_{ij}(\lambda)$ are relatively prime positive polynomials for each fixed pair of indices $(ij)$, $1 \le i,j \le d$, and $d$ is the number of components of the random processes $\mathcal{U}$, $s$ and $n$. Let $Q(\lambda)$ be the polynomial of minimal degree, $\deg Q(\lambda) = q$, of which every $Q_{ij}(\lambda)$, $1 \le i,j \le d$, is a divisor, and set $A_{ij}(\lambda) := \tilde{R}_{ij}(\lambda)Q(\lambda)$. Denote by $E$ the unit $d \times d$ matrix and by $A(\mathcal{L})$ the matrix differential operator with entries $A_{ij}(\mathcal{L})$. Assume that

$$|\det A_{ij}(\lambda)| > 0 \quad \forall\lambda \in \mathbb{R}^1, \tag{2.55}$$

$$\det B_m(x) \ne 0 \quad \forall x \in \mathbb{R}^1, \tag{2.56}$$

where $m := s\max_{1\le i,j\le d}\deg A_{ij}(\lambda)$, $s = \operatorname{ord}\mathcal{L}$, and

$$A(\mathcal{L}) := \sum_{j=0}^{m} B_j(x)\partial^j, \qquad \partial = \frac{d}{dx}. \tag{2.57}$$


Let S(x, y) denote the matrix kernel

$$S(x,y) := \delta_{ij}\int_\Lambda Q^{-1}(\lambda)\Phi(x,y,\lambda)\,d\rho(\lambda), \qquad \delta_{ij} = \begin{cases} 1 & i = j, \\ 0 & i \ne j, \end{cases} \tag{2.58}$$

of the diagonal operator $Q^{-1}(\mathcal{L})E$. The operator $Q(\mathcal{L})E$ is a diagonal matrix differential operator of order $n = sq$. Let us write the basic equation

$$\int_{t-T}^{t} R(x,y)h(y)\,dy = f(x), \qquad t-T \le x \le t, \tag{2.59}$$

where $R(x,y)$ is the $d \times d$ matrix with spectral density (2.54), $h$ and $f$ are vector functions with $d$ components, $f \in \mathcal{H}^\alpha$, $\alpha = (n-m)/2$, and $\mathcal{H}^\alpha$ denotes the space of vector functions $(f_1,\dots,f_d)$ such that $\|f\|_{\mathcal{H}^\alpha} := \big(\sum_{j=1}^{d}\|f_j\|^2_{H^\alpha}\big)^{1/2}$.

Remark 2.4 In the vector estimation problem $h$ and $f$ are $d \times d$ matrices, but for simplicity and without loss of generality we discuss the case when $h$ is a vector: the matrix equation (2.59) is equivalent to $d$ vector equations. Equation (2.59) can be written as

$$A(\mathcal{L})v = f, \tag{2.60}$$

$$v := Q^{-1}(\mathcal{L})Eh = \int_D S(x,y)h(y)\,dy. \tag{2.61}$$

Let $\Phi_j$, $1 \le j \le m$, be a fundamental system of matrix solutions to the equation

$$A(\mathcal{L})\Phi = 0, \tag{2.62}$$

and let $\Psi_j^\pm$, $1 \le j \le n/2$, be the fundamental system of matrix solutions to the equation

$$Q(\mathcal{L})E\Psi = 0, \tag{2.63}$$

$$\Psi_j^+(+\infty) = 0, \qquad \Psi_j^-(-\infty) = 0. \tag{2.64}$$

The choice of the fundamental system of matrix solutions to (2.63) with properties (2.64) is possible if $\mathcal{L}$ is an elliptic ordinary differential operator,


that is, $a_s(x) \ne 0$, $x \in \mathbb{R}^1$ (see Remark 2.1 and [N] p. 118). Let us write equations (2.60) and (2.61) as

$$\int_D S(x,y)h(y)\,dy = g_0(x) + \sum_{j=1}^{m}\Phi_j(x)c_j, \tag{2.65}$$

where $g_0(x)$ is an arbitrary fixed solution to equation (2.60), and $c_j$, $1 \le j \le m$, are arbitrary linearly independent constant vectors.

Theorem 2.5 If $R \in \mathcal{R}$ with $\tilde{R}$ given by (2.54), the assumptions (2.46), (2.55), (2.56) hold, and $f \in \mathcal{H}^\alpha$, $\alpha = (n-m)/2$, then the matrix equation (2.59) has a solution in $\dot{\mathcal{H}}^{-\alpha}$; this solution is unique and can be found by the formula

$$h = Q(\mathcal{L})EG, \tag{2.66}$$

where the vector function $G$ is given by

$$G(x) = \begin{cases} \sum_{j=1}^{n/2}\Psi_j^- b_j^-, & x \le t-T, \\[2pt] g_0(x) + \sum_{j=1}^{m}\Phi_j c_j, & t-T \le x \le t, \\[2pt] \sum_{j=1}^{n/2}\Psi_j^+ b_j^+, & x \ge t. \end{cases} \tag{2.67}$$

Here the functions $\Psi_j^\pm$, $\Phi_j$ and $g_0(x)$ were defined above, and the constant vectors $b_j^\pm$, $1 \le j \le n/2$, and $c_j$, $1 \le j \le m$, can be uniquely determined from the linear system

$$D^k\Big(\sum_{j=1}^{n/2}\Psi_j^-(x)b_j^-\Big)\Big|_{x=t-T} = D^k\Big(g_0(x) + \sum_{j=1}^{m}\Phi_j(x)c_j\Big)\Big|_{x=t-T}, \tag{2.68}$$

$$D^k\Big(\sum_{j=1}^{n/2}\Psi_j^+(x)b_j^+\Big)\Big|_{x=t} = D^k\Big(g_0(x) + \sum_{j=1}^{m}\Phi_j(x)c_j\Big)\Big|_{x=t}, \tag{2.69}$$

where 0 k (n + m)/2 1.   ≤ ≤ 1 − The map R− : f h, given by formulas (2.66)-(2.69) is an isomor- → α α phism between the spaces and ˙ − , α = (n m)/2. H H − Remark 2.5 The conditions (2.68), (2.69) guarantee that the function G(x), defined by formula (2.67), is maximally smooth so that the order of singularity of G and, therefore, of h (see formula (2.66) ) is minimal. February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book


2.4 Examples of kernels of class R and solutions to the basic equation

1. If $r = 1$, $\mathcal{L} = -i\partial$, $\partial = d/dx$, $\Phi(x,y,\lambda) = (2\pi)^{-1}\exp\{i\lambda(x-y)\}$, $d\rho = d\lambda$, then $R(x,y) \in \mathcal{R}$ if

$$R(x,y) = (2\pi)^{-1}\int_{-\infty}^{\infty}\tilde{R}(\lambda)\exp\{i\lambda(x-y)\}\,d\lambda, \tag{2.70}$$

where

$$\tilde{R}(\lambda) = P(\lambda)Q^{-1}(\lambda) \tag{2.71}$$

and $P(\lambda)$, $Q(\lambda)$ are positive polynomials.

2. If $r > 1$, $\mathcal{L} = (\mathcal{L}_1,\dots,\mathcal{L}_r)$, $\mathcal{L}_j = -i\partial_j$, $\partial_j = \partial/\partial x_j$, $\Phi(x,y,\lambda) = (2\pi)^{-r}\exp\{i\lambda\cdot(x-y)\}$, $\lambda = (\lambda_1,\dots,\lambda_r)$, $d\rho(\lambda) = d\lambda = d\lambda_1\dots d\lambda_r$, then

$$R(x,y) = (2\pi)^{-r}\int_{\mathbb{R}^r}\tilde{R}(\lambda)\exp\{i\lambda\cdot(x-y)\}\,d\lambda, \tag{2.72}$$

where $\tilde{R}(\lambda)$ is given by (2.71) and

$$P(\lambda) = P(\lambda_1,\dots,\lambda_r) > 0, \qquad Q(\lambda) = Q(\lambda_1,\dots,\lambda_r) > 0 \tag{2.73}$$

are polynomials. For the operators $P(\mathcal{L})$ and $Q(\mathcal{L})$ to be elliptic of orders $p$ and $q$ respectively, one has to assume that

$$0 < c_1 \le P(\lambda)|\lambda|^{-p} \le c_2, \qquad 0 < c_3 \le Q(\lambda)|\lambda|^{-q} \le c_4 \quad \forall\lambda \in \mathbb{R}^r, \tag{2.74}$$

where $|\lambda| = (\lambda_1^2 + \cdots + \lambda_r^2)^{1/2}$ and $c_j$, $1 \le j \le 4$, are positive constants.

3. If $r = 1$, $\mathcal{L} = -\frac{d^2}{dx^2}$, $D(\mathcal{L}) = \{u : u \in H^2(0,\infty),\ u'(0) = 0\}$, where $D(\mathcal{L})$ is the domain of $\mathcal{L}$, then

$$R(x,y) = \frac12\big[A(|x+y|) + A(|x-y|)\big], \qquad x, y \ge 0, \tag{2.75}$$

where

$$A(x) = \pi^{-1}\int_0^\infty P(\lambda)Q^{-1}(\lambda)\cos(\sqrt{\lambda}\,x)\,\lambda^{-1/2}\,d\lambda \tag{2.76}$$

and $P(\lambda) > 0$, $Q(\lambda) > 0$ are polynomials. Indeed, one has for $\mathcal{L}$

$$\Phi(x,y,\lambda)\,d\rho(\lambda) = \begin{cases} \pi^{-1}\cos(\sqrt{\lambda}\,x)\cos(\sqrt{\lambda}\,y)\,\lambda^{-1/2}\,d\lambda, & \lambda \ge 0, \\ 0, & \lambda < 0, \end{cases}$$


$0 \le x, y < \infty$. Since

$$\cos(kx)\cos(ky) = \frac12\big[\cos(kx-ky) + \cos(kx+ky)\big], \qquad k = \sqrt{\lambda},$$

one obtains (2.75) and (2.76). If one puts $\sqrt{\lambda} = k$ in (2.76), one gets

$$A(x) = \frac{2}{\pi}\int_0^\infty P(k^2)Q^{-1}(k^2)\cos(kx)\,dk, \tag{2.77}$$

which is a cosine transform of a positive rational function of $k$. The eigenfunctions of $\mathcal{L}$, normalized in $L^2(0,\infty)$, are $\big(\frac{2}{\pi}\big)^{1/2}\cos(kx)$, and $d\rho = dk$ in the variable $k$. If $\mathcal{L} = -\frac{d^2}{dx^2}$ is determined in $L^2(0,\infty)$ by the boundary condition $u(0) = 0$, then

$$R(x,y) = \frac12\big[A(|x-y|) - A(x+y)\big], \qquad x, y \ge 0, \tag{2.78}$$

where $A(x)$ is given by (2.77); the eigenfunctions of $\mathcal{L}$ with the Dirichlet boundary condition $u(0) = 0$ are $\big(\frac{2}{\pi}\big)^{1/2}\sin(kx)$, $d\rho = dk$ in the variable $k$, and $\Phi(x,y,k)\,d\rho(k) = \frac{2}{\pi}\sin(kx)\sin(ky)\,dk$; one can compare this with the formula $\Phi(x,y,k)\,d\rho(k) = \frac{2}{\pi}\cos(kx)\cos(ky)\,dk$, which holds for $\mathcal{L}$ determined by the Neumann boundary condition $u'(0) = 0$.

4. If $\mathcal{L} = -\frac{d^2}{dx^2} + \big(\nu^2 - \frac14\big)x^{-2}$, $\nu \ge 0$, $x \ge 0$, then

$$\Phi(x,y,\lambda)\,d\rho(\lambda) = \begin{cases} \sqrt{x\lambda}\,J_\nu(x\lambda)\,\sqrt{y\lambda}\,J_\nu(y\lambda)\,d\lambda, & \lambda \ge 0, \\ 0, & \lambda < 0, \end{cases} \tag{2.79}$$

so that

$$R(x,y) = \sqrt{xy}\int_0^\infty P(\lambda)Q^{-1}(\lambda)J_\nu(\lambda x)J_\nu(\lambda y)\,\lambda\,d\lambda, \tag{2.80}$$

where $P(\lambda)$ and $Q(\lambda)$ are positive polynomials on the semiaxis $\lambda \ge 0$.

5. Let $R(x,y) = \exp(-a|x-y|)(4\pi|x-y|)^{-1}$, $x, y \in \mathbb{R}^3$, $a = \mathrm{const} > 0$. Note that $(-\Delta + a^2)R = \delta(x-y)$ in $\mathbb{R}^3$. The kernel $R(x,y) \in \mathcal{R}$. One has $\mathcal{L} = (\mathcal{L}_1, \mathcal{L}_2, \mathcal{L}_3)$, $\mathcal{L}_j = -i\partial_j$, $P(\lambda) = 1$, $Q(\lambda) = \lambda^2 + a^2$, $\lambda^2 = \lambda_1^2 + \lambda_2^2 + \lambda_3^2$, $\Phi\,d\rho = (2\pi)^{-3}\exp\{i\lambda\cdot(x-y)\}\,d\lambda$,

$$R(x,y) = (2\pi)^{-3}\int_{\mathbb{R}^3}\frac{\exp\{i\lambda\cdot(x-y)\}}{\lambda^2 + a^2}\,d\lambda. \tag{2.81}$$

6. Let $R(x,y) = R(xy)$. Put $x = \exp(\xi)$, $y = \exp(-\eta)$. Then $R(xy) = R(\exp(\xi-\eta)) := R_1(\xi-\eta)$. If $R_1 \in \mathcal{R}$ with $\mathcal{L} = -i\partial$, then one can solve


the equation

$$\int_a^b R(xy)h(y)\,dy = f(x), \qquad a \le x \le b, \tag{2.82}$$

analytically.

7. Let $K_0(a|x|)$ be the modified Bessel function, which can be defined by the formula

$$K_0(a|x|) = (2\pi)^{-1}\int_{\mathbb{R}^2}\frac{\exp(i\lambda\cdot x)}{\lambda^2 + a^2}\,d\lambda, \qquad a > 0, \tag{2.83}$$

where $\lambda\cdot x = \lambda_1 x_1 + \lambda_2 x_2$. Then the kernel $R(x,y) := K_0(a|x-y|) \in \mathcal{R}$, with $\mathcal{L} = (-i\partial_1, -i\partial_2)$, $r = 2$, $P(\lambda) = 1$, $Q(\lambda) = \lambda^2 + a^2$, $\Phi(x,y,\lambda)\,d\rho(\lambda) = (2\pi)^{-1}\exp\{i\lambda\cdot(x-y)\}\,d\lambda$.

8. Consider the equation

$$\int_D \frac{\exp(-a|x-y|)}{4\pi|x-y|}\,h(y)\,dy = f(x), \qquad x \in D \subset \mathbb{R}^3, \quad a > 0, \tag{2.84}$$

with kernel (2.81). By formula (2.15), Theorem 2.1, one obtains the unique solution to equation (2.84) in $\dot{H}^{-1}(D)$:

$$h(x) = (-\Delta + a^2)f + \delta_\Gamma\Big(\frac{\partial f}{\partial N} - \frac{\partial u}{\partial N}\Big), \tag{2.85}$$

where $u$ is the unique solution to the Dirichlet problem in the exterior domain $\Omega := \mathbb{R}^3 \setminus D$:

$$(-\Delta + a^2)u = 0 \ \text{in } \Omega, \qquad u|_\Gamma = f|_\Gamma, \tag{2.86}$$

$\Gamma = \partial D = \partial\Omega$ is the boundary of $D$, and $\delta_\Gamma$ is the delta-function with support $\Gamma$.

Let us derive formula (2.85). For the kernel (2.81) one has $r = 3$, $p = 0$, $P(\lambda) = 1$, $Q(\lambda) = \lambda^2 + a^2$, $s = 1$, $q = 2$, $\alpha = \frac{sq}{2} = 1$. Formula (2.15) reduces to

$$h(x) = (-\Delta + a^2)G, \tag{2.87}$$

with

$$G = \begin{cases} f & \text{in } D, \\ u & \text{in } \Omega, \end{cases} \tag{2.88}$$


and $u$ is the solution to (2.86). Indeed, since $P(\lambda) = 1$, one has $v = 0$ and $g = f$. In order to compute $h$ by formula (2.87), one uses the definition of the derivative in the sense of distributions. For any $\phi \in C_0^\infty(\mathbb{R}^r)$ one has:

$$\begin{aligned}
\big((-\Delta + a^2)G, \phi\big) &= \big(G, (-\Delta + a^2)\phi\big) \\
&= \int_D f(-\Delta + a^2)\phi\,dx + \int_\Omega u(-\Delta + a^2)\phi\,dx \\
&= \int_D (-\Delta + a^2)f\,\phi\,dx + \int_\Omega (-\Delta + a^2)u\,\phi\,dx \\
&\quad - \int_\Gamma\Big(f\frac{\partial\phi}{\partial N} - \frac{\partial f}{\partial N}\phi\Big)\,ds + \int_\Gamma\Big(u\frac{\partial\phi}{\partial N} - \frac{\partial u}{\partial N}\phi\Big)\,ds \\
&= \int_D (-\Delta + a^2)f\,\phi\,dx + \int_\Gamma\Big(\frac{\partial f}{\partial N} - \frac{\partial u}{\partial N}\Big)\phi\,ds, 
\end{aligned} \tag{2.89}$$

where the condition $u = f$ on $\Gamma$ was used. Formula (2.89) is equivalent to (2.85).

9. Consider the equation

$$2\pi\int_D K_0(a|x-y|)h(y)\,dy = f(x), \qquad x \in D \subset \mathbb{R}^2, \quad a > 0, \tag{2.90}$$

where $D = \{x : x \in \mathbb{R}^2,\ |x| \le b\}$ and $K_0(x)$ is given by formula (2.83). The solution to (2.90) in $\dot{H}^{-1}(D)$ can be calculated by formula (2.85), in which $u(x)$ can be calculated explicitly:

$$u(x) = \sum_{n=-\infty}^{\infty} f_n\,\frac{\exp(in\phi)K_n(ar)}{K_n(ab)}, \tag{2.91}$$

where $x = (r,\phi)$, and $(r,\phi)$ are polar coordinates in $\mathbb{R}^2$,

$$f_n := (2\pi)^{-1}\int_0^{2\pi} f(b,\phi)\exp(-in\phi)\,d\phi, \tag{2.92}$$

and $K_n(r)$ is the modified Bessel function of order $n$ which decays as $r \to +\infty$. One can easily calculate $\frac{\partial u}{\partial N}\big|_\Gamma$ in formula (2.85):

$$\frac{\partial u}{\partial N}\Big|_\Gamma = \frac{\partial u}{\partial r}\Big|_{r=b} = a\sum_{n=-\infty}^{\infty} f_n\,\frac{\exp(in\phi)K_n'(ab)}{K_n(ab)}. \tag{2.93}$$

Formulas (2.85) and (2.93) give an explicit analytical formula for the solution to equation (2.90) in $\dot{H}^{-1}(D)$.
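As a numerical sanity check of the representation (2.70)–(2.71) from Example 1, one can recover a kernel from its rational spectral density by quadrature. For $\tilde{R}(\lambda) = 2/(1+\lambda^2)$ the kernel is known in closed form, $R(x,y) = e^{-|x-y|}$; the truncation level and step of the quadrature below are choices of this sketch.

```python
import math

# Recover R(t) = (2 pi)^{-1} \int R~(lambda) cos(lambda t) dlambda for
# R~(lambda) = 2/(1 + lambda^2)  (P = 2, Q = 1 + lambda^2); the exact
# answer is exp(-|t|).  The lambda-integral is truncated at |lambda| = L.
def kernel_from_density(t, L=300.0, n=120001):
    h = 2.0*L/(n-1)
    tot = 0.0
    for i in range(n):
        lam = -L + i*h
        wgt = 0.5 if i in (0, n-1) else 1.0      # trapezoid rule
        tot += wgt*2.0/(1.0 + lam*lam)*math.cos(lam*t)
    return tot*h/(2.0*math.pi)

err = max(abs(kernel_from_density(t) - math.exp(-abs(t))) for t in (0.0, 0.5, 2.0))
print(err)   # small; limited by the truncation of the integral
```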


2.5 Formula for the error of the optimal estimate

In this section we give an explicit formula for the error of the optimal estimator. This error is given by formula (1.5). We assume for simplicity that A = I in what follows. This means that we are discussing the filtering problem. In the same way the general estimation problem can be treated. 1. The error of the estimate can be computed by formula (2.9) with A = I. This yields

$$\epsilon = (Rh,h) - 2\operatorname{Re}(h,f) + \epsilon_0(x), \qquad \epsilon_0(x) := \overline{|s(x)|^2}, \tag{2.94}$$

where $(u,v) := \int_D uv\,dx$, $(Rh,h) = \int_D\int_D R(z,y)h(y)h^*(z)\,dy\,dz$, and $h(y) := h(x,y)$. The optimal estimate

$$\hat{\mathcal{U}}(x) = \int_D h(x,y)\,\mathcal{U}(y)\,dy \tag{2.95}$$

is given by the solution to the equation (2.11):

$$Rh = f, \tag{2.96}$$

and we assume that $R \in \mathcal{R}$. Since $(Rh,h) > 0$, it follows from (2.94) and (2.96) that

$$\epsilon(x) = \epsilon_0(x) - (Rh,h). \tag{2.97}$$

It is clear that the right side of (2.97) is finite if and only if the quadratic form $(Rh,h)$ is finite. Our goal is to show that one obtains a finite value of $(Rh,h)$ if and only if one takes the solution to (2.96) of minimal order of singularity — the mos solution to (2.96), that is, the solution $h \in \dot{H}^{-\alpha}$. Therefore only the mos solution to (2.96) solves the estimation problem, and the error of the optimal estimate is given by formula (2.97), in which $h \in \dot{H}^{-\alpha}$ is the unique solution to (2.96) of minimal order of singularity.

2. In order to achieve our goal, let us write the form $(Rh,h)$ using the Parseval equality and the basic assumption $R \in \mathcal{R}$:

$$(Rh,h) = \int_\Lambda P(\lambda)Q^{-1}(\lambda)\,|\tilde{h}(\lambda)|^2\,d\rho(\lambda). \tag{2.98}$$

Here

$$|\tilde{h}(\lambda)|^2 = \sum_{j=1}^{N(\lambda)}\Big|\int_D h(x)\phi_j^*(x,\lambda)\,dx\Big|^2, \tag{2.99}$$

February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book


where $\phi_j(x,\lambda)$ are the eigenfunctions of $\mathcal{L}$ which are used in the expansion of the spectral kernel:

$$\Phi(x,y,\lambda) = \sum_{j=1}^{N(\lambda)}\phi_j(x,\lambda)\phi_j^*(y,\lambda), \tag{2.100}$$

and $N(\lambda) \le \infty$ (see Section 8.2). One has

$$h \in \dot{H}^{-bs} \ \Leftrightarrow\ \int_\Lambda |\tilde{h}(\lambda)|^2\,(1+\lambda^2)^{-b}\,d\rho(\lambda) < \infty, \tag{2.101}$$

where $b > 0$ is an arbitrary number. By the assumption (see (2.74)),

$$0 < c_1 \le P(\lambda)(1+\lambda^2)^{-p/2} \le c_2, \tag{2.102}$$

$$0 < c_3 \le Q(\lambda)(1+\lambda^2)^{-q/2} \le c_4, \tag{2.103}$$

where $c_j$, $1 \le j \le 4$, are positive constants. Thus

$$0 < c_5 \le PQ^{-1}(1+\lambda^2)^{(q-p)/2} \le c_6. \tag{2.104}$$

From (2.98), (2.101) and (2.104) it follows that

(q p)s/2 α (Rh, h) < h H˙ − − = H˙ − . (2.105) ∞ ⇔ ∈ In particular, if m(h) := ordh > α, then (Rh, h) = . ∞ Let be an operator with constant coefficients in L2(Rr) so that L R(x, y) = R(x y), = ( , . . . ), := i∂/∂x , and formula (2.98) − L L1 Lr Lr − r takes the form 2 1 (Rh, h) = P (λ)Q− (λ) h˜(λ) dλ, (2.106) Rr Z

with $\lambda = (\lambda_1,\dots,\lambda_r)$ and

$$\tilde{h}(\lambda) := (2\pi)^{-r/2}\int_{\mathbb{R}^r} h(x)\exp(i\lambda\cdot x)\,dx. \tag{2.107}$$

Then $P(\mathcal{L})$ is elliptic of order $p$ if and only if $P(\lambda)$ satisfies (2.102), and $Q(\mathcal{L})$ is elliptic of order $q$ if and only if $Q(\lambda)$ satisfies (2.103). The integral (2.106) is finite if and only if $\tilde{h}(\lambda)(1+\lambda^2)^{(p-q)/4} \in L^2(\mathbb{R}^r)$, where $\tilde{h}(\lambda)$ is the usual Fourier transform. This is equivalent to $h \in H^{-\alpha}(\mathbb{R}^r)$, $\alpha = (q-p)/2$. Since we assume that $\operatorname{supp} h \subset D$, we conclude in the case described that $h \in \dot{H}^{-\alpha}(D)$.
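The Parseval identity (2.106) can be checked numerically. Take $R(x-y) = e^{-|x-y|}$, so $P(\lambda)Q^{-1}(\lambda) = 2/(1+\lambda^2)$, and let $h$ be the indicator of $[-1,1]$ (both choices of this sketch); the space-domain value $(Rh,h) = \int\int e^{-|x-y|}\,dx\,dy = 2 + 2e^{-2}$ must then match the frequency-domain integral:

```python
import math

# Space side of (2.106): (Rh, h) over [-1,1]^2, computed in closed form.
space_side = 2.0 + 2.0*math.exp(-2.0)

# Frequency side: \int 2/(1+lam^2) |h~(lam)|^2 dlam, where (2.107) gives
# h~(lam) = (2 pi)^{-1/2} * 2 sin(lam)/lam for the indicator of [-1, 1].
def integrand(lam):
    s = 1.0 if lam == 0 else math.sin(lam)/lam
    return 2.0/(1.0 + lam*lam) * (2.0/math.pi) * s*s

L, n = 200.0, 40001
h = 2*L/(n-1)
freq_side = sum((0.5 if i in (0, n-1) else 1.0)*integrand(-L + i*h)
                for i in range(n))*h

print(space_side, freq_side)   # both ~ 2.2707
```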


3. Formula (2.97) can be written as

$$\epsilon(x) = \epsilon_0(x) - (f,h) = \epsilon_0(x) - (f, R^{-1}f). \tag{2.108}$$

If $\mathcal{U}$ is a vector random field, then $h(x,y)$ is a $d \times d$ matrix and formulas (2.97) and (2.108) take the form

$$\epsilon(x) = \epsilon_0(x) - \operatorname{tr}(f,h) = \epsilon_0(x) - \operatorname{tr}\int_D \{f(x,y)h^*(y,x)\}\,dy, \tag{2.109}$$

where $\operatorname{tr} A$ is the trace of the matrix $A$.


Chapter 3

Numerical Solution of the Basic Integral Equation in Distributions

3.1 Basic ideas

It is convenient to explain the basic ideas using the equation

$$Rh = \int_{-1}^{1}\exp(-|x-y|)h(y)\,dy = f(x), \qquad -1 \le x \le 1, \tag{3.1}$$

as an example, which contains all of the essential features of the general equation

$$Rh = \int_D R(x,y)h(y)\,dy = f(x), \qquad x \in D \subset \mathbb{R}^r, \tag{3.2}$$

with kernel $R \in \mathcal{R}$.

The first idea that might come to mind is that equation (3.1) is a Fredholm equation of the first kind, so that the regularization method can yield a numerical solution to (3.1) (see [R28], for example). On second thought one realizes that, according to Theorem 2.4, the solution to (3.1) does not belong to $L^2(-1,1)$ in general, and that the mapping $R^{-1} : H^\alpha \to \dot{H}^{-\alpha}$, $\alpha = 1$ for equation (3.1), is an isomorphism. Therefore the problem of numerical solution of equation (3.1) is not an ill-posed but a well-posed problem. The solution to (3.1) is a distribution in general, which is clear from formula (2.49). For the solution of (3.1) to be an integrable function it is necessary and sufficient that the following boundary conditions hold:

$$f'(1) + f(1) = 0, \qquad f'(-1) = f(-1). \tag{3.3}$$

This follows immediately from formula (2.49). The problem is to develop a numerical method for solving equations (3.1) and (3.2) in the space of distributions $\dot{H}^{-\alpha}$.
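The boundary conditions (3.3) can be verified on an explicit pair: taking $h \equiv 1$ (an illustrative choice), $f = Rh$ has the elementary closed form $f(x) = 2 - e^{-(1+x)} - e^{-(1-x)}$. Since this $h$ lies in $L^2$, $f$ must satisfy (3.3), and formula (2.49) must return $h = (-f''+f)/2 = 1$:

```python
import math

# f(x) = \int_{-1}^{1} exp(-|x-y|) dy = 2 - exp(-(1+x)) - exp(-(1-x)).
def f(x):   return 2 - math.exp(-(1+x)) - math.exp(-(1-x))
def fp(x):  return math.exp(-(1+x)) - math.exp(-(1-x))
def fpp(x): return -math.exp(-(1+x)) - math.exp(-(1-x))

bc_right = fp(1) + f(1)      # coefficient of delta(x-1) in (2.49) vanishes
bc_left  = fp(-1) - f(-1)    # coefficient of delta(x+1) in (2.49) vanishes
h_vals = [(-fpp(x) + f(x))/2 for x in (-0.5, 0.0, 0.7)]
print(bc_right, bc_left, h_vals)   # ~0, ~0, [~1.0, ~1.0, ~1.0]
```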



Much work has been done on the effective numerical inversion of the Toeplitz matrices which one obtains by discretizing the integral equation

$$\epsilon h + Rh = f, \qquad \epsilon > 0, \tag{3.4}$$

with $R$, for example, given by (3.1). If the nodes of discretization are equidistant, equation (3.4), after discretizing, reduces to a linear algebraic system with a Toeplitz matrix $t_{ij} = t_{i-j}$. Discussion of this, however, is not within the scope of our work. The question of principal interest is the asymptotic behavior of the solution to (3.4) as $\epsilon \to +0$. Note that, for any $\epsilon > 0$, equation (3.4) is an equation with a selfadjoint positive definite operator $\epsilon I + R$ in $L^2(-1,1)$. Therefore, for any $\epsilon > 0$, equation (3.4) has a solution in $L^2(-1,1)$ for any $f \in L^2(-1,1)$, and this solution is unique. Numerical solution of equation (3.4) by the above-mentioned discretization (or collocation) method becomes impossible as $\epsilon \to +0$, because the condition number of the matrix of the discretized problem grows quickly as $\epsilon \to +0$. The nature of the singularity of the solution to the limiting equation ($\epsilon = 0$), equation (3.1), is not clear from the discretization method described above. Numerical solution of equation (3.1) therefore requires a new approach, which we wish to describe.

The basic idea is to take into account the theoretical results obtained in Theorems 2.1 and 2.4. According to Theorem 2.4, the solution to equation (3.1) with a smooth right-hand side $f(x)$ has the following structure:

$$h = A\delta(x-1) + B\delta(x+1) + h_{sm}, \tag{3.5}$$

where $A$ and $B$ are constants and $h_{sm}$ is a smooth function. The order of singularity of the solution to equation (3.1) is 1, since $\alpha = 1$ for this equation. Let us assume that $f$ is smooth, $f \in H^2$, for example, so that

$$\frac{-f'' + f}{2} \in H^0 = L^2(D).$$

Let us look for an approximate solution to equation (3.1) of the form

$$h_n(x) = \sum_{j=1}^{n} c_j\phi_j(x) + c_0\delta(x-1) + c_{-1}\delta(x+1), \tag{3.6}$$

where $c_j$, $j = -1, 0, 1, \dots, n$, are constants and $\{\phi_j\}$, $1 \le j < \infty$, is a basis in $H^0 = L^2(-1,1)$. The constants can be found, for example, from the least squares method:

$$\|Rh_n - f\|_1 = \min, \tag{3.7}$$

where $\|f\|_\alpha$ is the norm of $f$ in the space $H^\alpha$. The variational problem (3.7) can be written as

$$\begin{aligned}
\epsilon := \int_{-1}^{1}\Big\{ &\Big| c_0'\exp(x) + c_{-1}'\exp(-x) + \sum_{j=1}^{n} c_j\psi_j(x) - f(x)\Big|^2 \\
+ &\Big| c_0'\exp(x) - c_{-1}'\exp(-x) + \sum_{j=1}^{n} c_j\psi_j'(x) - f'(x)\Big|^2 \Big\}\,dx = \min,
\end{aligned} \tag{3.8}$$

where

$$c_0' := c_0 e^{-1}, \qquad c_{-1}' = c_{-1}e^{-1}, \qquad \psi_j(x) = R\phi_j, \quad 1 \le j \le n. \tag{3.9}$$

The linear system for finding the $c_j$, $1 \le j \le n$, and $c_0'$, $c_{-1}'$, is

$$\frac{\partial\epsilon}{\partial c_j} = 0, \quad 1 \le j \le n, \qquad \frac{\partial\epsilon}{\partial c_0'} = 0, \qquad \frac{\partial\epsilon}{\partial c_{-1}'} = 0. \tag{3.10}$$

The matrix of this system is

$$a_{ij} := (\psi_j, \psi_i)_1, \qquad -1 \le i,j \le n, \tag{3.11}$$

where $\psi_0 = \exp(x)$, $\psi_{-1} = \exp(-x)$, $\psi_j$ for $1 \le j < \infty$ is defined in (3.9), the system $\{\psi_j\}$, $-1 \le j \le n$, is assumed to be linearly independent for any $n$, and the inner product is taken in the space $H^1$: $(u,v)_1 := \int_{-1}^{1}(uv + u'v')\,dx$. The matrix $a_{ij}$ is positive definite for any $n$, so that the system

$$\sum_{j=-1}^{n} a_{ij}c_j = b_i, \qquad -1 \le i \le n, \quad b_i := (f, \psi_i)_1, \tag{3.12}$$

is uniquely solvable for any $n$. Convergence of the suggested numerical method is easy to prove. One wishes to prove that

h h 0 as n . (3.13) k n − k−1 → → ∞ 1 1 Since R : H˙ − H is an isomorphism, it is sufficient to prove that → Rh Rh 0 as n . (3.14) k n − k1 → → ∞ Since Rh = f, equation (3.14) reduces to

Rh f 0 as n . (3.15) k n − k1→ → ∞ February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

36 Random Fields Estimation Theory

Equation (3.15) holds if the set of functions $\{\psi_j\}$, $-1 \le j < \infty$, is complete in $H^1$. Here

\psi_j = R\phi_j, \quad 1 \le j < \infty, \qquad \psi_{-1} = \exp(-x), \quad \psi_0 = \exp(x). \qquad (3.16)

Therefore, if one chooses a system $\psi_j \in H^1$, $1 \le j < \infty$, such that the system $\{\psi_j\}$, $-1 \le j < \infty$, forms a basis of $H^1$ (or just a complete system in $H^1$), then (3.15) holds. Since for practical calculations one needs to know only the matrix $a_{ij}$ and the vector $b_i$ (see (3.12)), and both of these quantities can be computed once the system $\{\psi_j\}$, $-1 \le j < \infty$, and $f$ are known, it is not necessary to deal with the system $\{\phi_j\}$, $1 \le j < \infty$. We have proved the following

Proposition 3.1 If $\{\psi_j\}$, $-1 \le j < \infty$, $\psi_0 = \exp(x)$, $\psi_{-1} = \exp(-x)$, is a complete system in $H^1$, then, for any $n$, the system (3.12) is uniquely solvable. Let $c_j^{(n)}$ be its solution. Then the function

h_n = \sum_{j=-1}^{n} c_j^{(n)} \phi_j(x), \quad \text{where } \phi_0 = \delta(x - 1), \ \phi_{-1} = \delta(x + 1), \ \phi_j = R^{-1}\psi_j, \ 1 \le j \le n,

converges in $H^{-1}$ to the solution $h$ of equation (3.1):

\|h - h_n\|_{-1} \to 0, \quad n \to \infty.

There are some questions of a practical nature:

1) How does one choose the system $\psi_j$, $1 \le j < \infty$, so that the matrix $a_{ij}$ in equation (3.12) is easily invertible?
2) How does one choose $\psi_j$, $1 \le j < \infty$, so that the functions $\phi_j$ are easily computable?

The first question is easy to answer: it is sufficient that the condition number of the matrix $a_{ij}$, $-1 \le i, j \le n$, be bounded uniformly in $n$. This will be the case if the system $\{\psi_j\}$, $-1 \le j < \infty$, forms a Riesz basis. Let us recall that a system $\{\psi_j\}$ is called a Riesz basis of a Hilbert space $H$ if and only if there exist an orthonormal basis $\{f_j\}$ of $H$ and a linear isomorphism $B$ of $H$ onto $H$ such that $Bf_j = \psi_j$ for all $j$. The system $\{\psi_j\}$ forms a Riesz basis of the Hilbert space $H$ if and only if the Gram matrix $\Gamma_{ij} := (\psi_i, \psi_j)$ defines a linear isomorphism of $\ell^2$ onto itself.

If $\{\phi_j\}$, $1 \le j < \infty$, in (3.6) is a basis of $H^0$, then $\{\psi_j\}$, $1 \le j < \infty$, $\psi_j = R\phi_j$, is a complete set in $H^1$. Indeed, suppose that $f \in H^1$ and $(f, R\phi_j)_1 = 0$ for all $j$. Then $0 = (f, R\phi_j)_1 = (I_+^{-1} f, R\phi_j)_0 = (R I_+^{-1} f, \phi_j)_0$ for all $j$, where the operator $I_+$ has been introduced in §VI.1, and $H_+ = H^1$.


Since the system $\{\phi_j\}$ is complete in $H_0$ ($H_0 = H^0$ in our case) by assumption, one concludes that $R I_+^{-1} f = 0$. Since $I_+^{-1} f \in H_-$ and $R(x, y)$ is positive, so that $(Rg, g)_0 > 0$ for $g \ne 0$, $g \in H_-$, one concludes that $I_+^{-1} f = 0$. Since $I_+^{-1}$ is an isometry between $H_+$ and $H_-$, one concludes that $f = 0$. Therefore, by Proposition 3.1, if the system $\{\phi_j\}$ forms a basis of $H^0$, then $\|h_n - h\|_{-1} \to 0$ as $n \to \infty$, where $h_n$ is the solution to (3.7) of the form (3.6).

If $\{\phi_j\}$ is a basis of $H^0$, then the system (3.12) is uniquely solvable for all $n$ if and only if the system $\{\psi_j\}$, $-1 \le j \le n$, is linearly independent in $H^1$ for all $n$. Here the system $\psi_j$ is defined by formula (3.16).
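The scheme (3.6)-(3.12) is easy to try numerically. The sketch below assumes the model kernel $R(x,y) = \exp(-|x-y|)$ (a choice consistent with $\psi_0 = \exp(x)$, $\psi_{-1} = \exp(-x)$ and the factors $e^{-1}$ in (3.9)) and takes Legendre polynomials for the $\phi_j$; the grid, the quadrature, and these basis choices are illustrative assumptions, not prescriptions from the text.

```python
# Least squares solution (3.7)-(3.12) of R h = f on (-1, 1) in the form (3.6),
# for the model kernel R(x, y) = exp(-|x - y|).  The smooth basis phi_j
# (Legendre polynomials) and the grid are illustrative choices.
import numpy as np

n = 8
x = np.linspace(-1.0, 1.0, 401)
w = np.gradient(x)                      # quadrature weights, sum(w*g) ~ integral

def apply_R(g):
    """(R g)(x) = int_{-1}^{1} exp(-|x - y|) g(y) dy, by quadrature."""
    K = np.exp(-np.abs(x[:, None] - x[None, :]))
    return K @ (w * g)

def inner_H1(u, v):
    """(u, v)_1 = int (u v + u' v') dx, the H^1 inner product of (3.11)."""
    return np.sum(w * (u * v + np.gradient(u, x) * np.gradient(v, x)))

# psi_0 = exp(x), psi_{-1} = exp(-x) are the images of the delta terms
# (up to the factor e^{-1} absorbed in (3.9)); psi_j = R phi_j otherwise.
phis = [np.polynomial.legendre.Legendre.basis(j)(x) for j in range(n)]
psis = [np.exp(x), np.exp(-x)] + [apply_R(p) for p in phis]

f = np.cosh(x)                          # smooth test data; (f - f'')/2 = 0 here
A = np.array([[inner_H1(pj, pi) for pj in psis] for pi in psis])  # (3.11)
b = np.array([inner_H1(f, pi) for pi in psis])                    # (3.12)
c = np.linalg.solve(A, b)

Rh = sum(ci * pi for ci, pi in zip(c, psis))
residual = np.sqrt(inner_H1(Rh - f, Rh - f))    # the minimized quantity, cf. (3.15)
print(residual)
```

For this $f$ the exact solution is purely singular (boundary deltas), so the residual is limited only by quadrature and conditioning.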

3.2 Theoretical approaches

1. Let us consider equation (3.2) as an equation with the operator $R : H_- \to H_+$, which is a linear isomorphism between the spaces $H_-$ and $H_+$. The general theory of the triples of spaces $H_+ \subset H_0 \subset H_-$ is given in Section 8.1, and we will use the results proved in Section 8.1. In our case $H_+ = H^\alpha$, $H_0 = H^0$, $H_- = \dot H^{-\alpha}$. In general, $H_+$ and $H_0$ are Hilbert spaces, $H_+$ is dense in $H_0$, $\|u\|_0 \le \|u\|_+$, and $H_-$ is the dual to $H_+$ with respect to $H_0$. It is proved in Section 8.1 that there exist linear isometries $p_+ : H_0 \to H_+$ and $p_- : H_- \to H_0$, and $(u, v)_+ = (qu, qv)_0$, where $q = p_+^{-1}$. The operator $q^*$, the adjoint of $q$ in $H_0$, is an isometry of $H_0$ onto $H_-$.

Let us rewrite equation (3.2) in the equivalent form

A h_0 := q R q^* h_0 = f_0, \qquad (3.17)

where

f_0 := qf, \quad h_0 := (q^*)^{-1} h, \quad f_0 \in H_0, \quad h_0 \in H_0. \qquad (3.18)

The linear operator $q R q^*$ is bounded, selfadjoint, and positive definite:

(q R q^* \phi, \phi)_0 = (R q^* \phi, q^* \phi)_0 \ge c_1 \|q^* \phi\|_-^2 = c_1 \|\phi\|_0^2, \quad c_1 > 0. \qquad (3.19)

Moreover,

(R q^* \phi, q^* \phi)_0 \le c_2 \|q^* \phi\|_-^2 = c_2 \|\phi\|_0^2, \quad c_2 > 0. \qquad (3.20)

Here we used the isometry of $q^*$: $\|q^* \phi\|_- = \|\phi\|_0$, and the inequality

c_1 \|h\|_-^2 \le (Rh, h)_0 \le c_2 \|h\|_-^2, \quad c_2 \ge c_1 > 0. \qquad (3.21)


This inequality is proved in Lemma 3.5 of Section 3.4, below. Equation (3.17), with a linear positive definite operator $A$ on a Hilbert space $H_0$, is uniquely solvable in $H_0$, and its solution can be obtained by iterative or projection methods. If the solution $h_0$ to equation (3.17) is found, then the desired function is $h = q^* h_0$. Let us describe these methods.

Let us start with an iterative method. Assume that $A$ is a bounded positive definite operator on a Hilbert space:

0 < m \le A \le M. \qquad (3.22)

This means that $m \|\phi\|^2 \le (A\phi, \phi) \le M \|\phi\|^2$ for all $\phi \in H$. Let

Au = f. \qquad (3.23)

Consider the iterative process

u_{n+1} = (I - aA) u_n + af, \quad a := \frac{2}{M + m}, \qquad (3.24)

where $u_0 \in H$ is arbitrary.

Lemma 3.1 There exists $\lim_{n \to \infty} u_n = u$. This limit solves equation (3.23). One has

\|u_n - u\| \le c q^n, \quad q := \frac{M - m}{M + m}, \quad 0 < q < 1, \quad c = \mathrm{const} > 0. \qquad (3.25)

This is a well-known result (see, e.g., [Kantorovich and Akilov (1980)]). We give a proof for the convenience of the reader.

Proof. If $u_n \to u$ in $H$, then, passing to the limit in (3.24), one concludes that the limit $u$ solves equation (3.23). In order to prove convergence and the estimate (3.25), it is sufficient to check that

\|I - aA\| \le q, \qquad (3.26)

where $q$ is defined in (3.25). This follows from the spectral representation:

\|I - aA\| = \sup_{m \le \lambda \le M} \left| 1 - \frac{2\lambda}{m + M} \right| = \frac{M - m}{M + m} = q. \qquad (3.27)

Lemma 3.1 is proved. $\Box$

If $A$ is not positive definite but only nonnegative, and $f \in R(A)$, where $R(A)$ is the range of $A$, then consider the following iterative process:

u_{n+1} + A u_{n+1} = u_n + f, \qquad (3.28)


where $u_0 \in H$ is arbitrary.

Lemma 3.2 If $A \ge 0$ and $f \in R(A)$, then there exists

\lim_{n \to \infty} u_n = u, \qquad (3.29)

where $u_n$ is defined by (3.28) and $u$ solves equation (3.23).

Proof. If $u_n \to u$ in $H$, then passing to the limit in (3.28) yields equation (3.23) for $u$. In order to prove that $u_n \to u$, one writes equation (3.28) as

u_{n+1} = B u_n + h, \qquad (3.30)

where

B := (I + A)^{-1}, \quad h := Bf. \qquad (3.31)

Since $A \ge 0$, one has $0 \le B \le I$, where $I$ is the identity operator in $H$. Under this condition ($0 \le B \le I$) one can prove [Krasnoselskii et al. (1972), p. 71] that $u_n \to u$. Lemma 3.2 is proved. $\Box$

Remark 3.1 If $A$ satisfies assumptions (3.22), then $0 < (M + 1)^{-1} \le B \le (m + 1)^{-1}$, and the iterative process (3.30) converges as a geometric series with $q = (m + 1)^{-1}$.

2. Let us consider the projection methods for solving equation (3.23) under the assumption (3.22). First, consider the least squares method, which is a variant of the projection method. It can be described as follows. Take a complete linearly independent system $\{\phi_j\}$ in $H$. Look for a solution

u_n = \sum_{j=1}^{n} c_j \phi_j. \qquad (3.32)

Find the constants $c_j$ from the condition

\|A u_n - f\| = \min. \qquad (3.33)

This leads to the linear system for the $c_j$:

\sum_{j=1}^{n} a_{ij} c_j = f_i, \quad 1 \le i \le n, \qquad (3.34)

where

a_{ij} := (A\phi_j, A\phi_i), \quad f_i = (f, A\phi_i). \qquad (3.35)


Since the system $\phi_j$, $1 \le j \le n$, is linearly independent for any $n$, and $A$ is an isomorphism of $H$ onto $H$, the system $\{A\phi_j\}$, $1 \le j \le n$, is linearly independent for any $n$. Therefore $\det a_{ij} \ne 0$, $1 \le i, j \le n$, and the system (3.34) is uniquely solvable for any right-hand side and any $n$. Let $c_j^{(n)}$, $1 \le j \le n$, be the unique solution to system (3.34), and let

u_n := \sum_{j=1}^{n} c_j^{(n)} \phi_j. \qquad (3.36)

Let us prove that $u_n \to u$ as $n \to \infty$. It is sufficient to prove that the system $\{A\phi_j\}$, $1 \le j < \infty$, is complete in $H$. Indeed, if this is so, then $\|A u_n - f\| \to 0$ as $n \to \infty$, where $u_n$ is given by (3.36). Therefore

\|u_n - u\| = \|A^{-1}(A u_n - f)\| \le m^{-1} \|A u_n - f\| \to 0. \qquad (3.37)

Here we used the estimate $\|A^{-1}\| \le m^{-1}$. It is easy to check that the system $\{A\phi_j\}$, $1 \le j < \infty$, is complete in $H$. Indeed, suppose $(h, A\phi_j) = 0$, $1 \le j < \infty$, for some $h \in H$. Then $(Ah, \phi_j) = 0$, $1 \le j < \infty$. Thus $Ah = 0$, since by assumption the system $\{\phi_j\}$, $1 \le j < \infty$, is complete in $H$. Since $A^{-1}$ exists, the equation $Ah = 0$ implies $h = 0$. We have proved the following lemma.

Lemma 3.3 If $A$ satisfies condition (3.22) and $\{\phi_j\}$, $1 \le j < \infty$, is a complete linearly independent system in $H$, then the least squares method for solving equation (3.23) converges. Namely: a) for any $n$ the system (3.34) is uniquely solvable, and the approximate solution $u_n$ is uniquely determined by formula (3.36); and b) $\|u_n - u\| \to 0$ as $n \to \infty$, where $u$ is the unique solution to equation (3.23).
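A minimal finite-dimensional sketch of the least squares method (3.32)-(3.36), with $H = \mathbb{R}^N$; the random positive definite $A$ and the random system $\{\phi_j\}$ are illustrative assumptions:

```python
# Least squares method (3.32)-(3.36) for Au = f in H = R^N.
# A is positive definite as in (3.22); the phi_j are random columns,
# hence linearly independent with probability one (an illustrative choice).
import numpy as np

rng = np.random.default_rng(0)
N = 40
G = rng.standard_normal((N, N))
A = G @ G.T / N + np.eye(N)             # 0 < m <= A <= M
u_true = rng.standard_normal(N)
f = A @ u_true

phi = rng.standard_normal((N, N))       # columns are phi_1, ..., phi_N

errors = []
for n in (5, 20, 40):
    Phi = phi[:, :n]
    APhi = A @ Phi
    a = APhi.T @ APhi                   # a_ij = (A phi_j, A phi_i)   (3.35)
    b = APhi.T @ f                      # f_i  = (f, A phi_i)
    u_n = Phi @ np.linalg.solve(a, b)   # (3.36)
    errors.append(np.linalg.norm(u_n - u_true))
print(errors)
```

At $n = N$ the span of the $\phi_j$ is all of $H$, so $u_n$ coincides with the exact solution up to rounding.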

The general projection method can be described as follows. Pick two complete linearly independent systems $\{\phi_j\}$ and $\{\psi_j\}$, $1 \le j < \infty$, in $H$. Look for an approximate solution to equation (3.23) of the form (3.32). Find the coefficients $c_j$, $1 \le j \le n$, from the condition

(A u_n - f, \psi_i) = 0, \quad 1 \le i \le n. \qquad (3.38)

Geometrically this means that the vector $A u_n - f$ is orthogonal to the linear span of the vectors $\psi_i$, $1 \le i \le n$. Equations (3.38) can be written as

\sum_{j=1}^{n} b_{ij} c_j = f_i, \quad 1 \le i \le n, \qquad (3.39)


where

b_{ij} = (A\phi_j, \psi_i), \quad f_i = (f, \psi_i). \qquad (3.40)

The least squares method is the projection method with $\psi_i = A\phi_i$. In [Krasnoselskii, M. (1972)] one can find a detailed study of the general projection method.

Let us give a brief argument which demonstrates the convergence of the projection method. Let $\{\phi_j\}$ be a complete linearly independent system in $H$, let $L_n := \mathrm{span}\{\phi_1, \dots, \phi_n\}$, and let $P_n$ be the orthogonal projection onto $L_n$ in $H$. An infinite system $\{\phi_j\}$ is called linearly independent in $H$ if, for any $n$, the system $\{\phi_j\}$, $1 \le j \le n$, is linearly independent in $H$. Take $\psi_j = \phi_j$ and write equation (3.38) as

P_n A u_n = P_n f, \quad u_n \in L_n. \qquad (3.41)

Since $u_n = P_n u_n$ and the operator $P_n A P_n$ is selfadjoint and positive definite on the subspace $L_n \subset H$, equation (3.41) is uniquely solvable for any $f \in H$ and any $n$. Note that $P_n A u_n = P_n A P_n u_n$ and that $A$ satisfies assumptions (3.22). To prove that $\|u_n - u\| \to 0$ as $n \to \infty$, let us subtract from (3.41) the equation

P_n A u = P_n f. \qquad (3.42)

The result is

P_n A P_n (u_n - u) = P_n A (u - P_n u). \qquad (3.43)

Since $\{\phi_j\}$ is complete, $P_n \to I$ strongly as $n \to \infty$, where $I$ is the identity operator. This and the boundedness of $A$ imply

\|P_n A (u - P_n u)\| \le c \|u - P_n u\| \to 0, \quad n \to \infty. \qquad (3.44)

Multiply (3.43) by $P_n(u - u_n)$ and use the positive definiteness of $A$ to obtain

c \|P_n(u - u_n)\|^2 \le \|u - P_n u\| \, \|P_n(u - u_n)\|,

or

c \|P_n(u - u_n)\| \le \|u - P_n u\| \to 0, \quad n \to \infty. \qquad (3.45)

Since $P_n u_n = u_n$, equations (3.44) and (3.45) imply

\|u - u_n\| \le \|u - P_n u\| + \|P_n u - u_n\| \to 0, \quad n \to \infty. \qquad (3.46)


We have proved

Lemma 3.4 If (3.22) holds and $\{\phi_j\}$ is a complete linearly independent system in $H$, then the projection method (3.41) for solving equation (3.23) converges, and equation (3.41) is uniquely solvable for any $n$.
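The Galerkin version (3.38)-(3.41) with $\psi_j = \phi_j$ differs from the least squares sketch only in the matrix assembled; here is the same kind of finite-dimensional illustration (random symmetric positive definite $A$ and an orthonormal system obtained by QR, both illustrative assumptions):

```python
# Projection (Galerkin) method (3.41): P_n A u_n = P_n f with psi_j = phi_j,
# in H = R^N.  The orthonormal basis (QR of a random matrix) is an
# illustrative choice of a complete system {phi_j}.
import numpy as np

rng = np.random.default_rng(1)
N = 40
G = rng.standard_normal((N, N))
A = G @ G.T / N + np.eye(N)             # symmetric, satisfies (3.22)
u_true = rng.standard_normal(N)
f = A @ u_true

Q, _ = np.linalg.qr(rng.standard_normal((N, N)))  # columns phi_1, ..., phi_N

errors = []
for n in (5, 20, 40):
    Phi = Q[:, :n]                      # orthonormal basis of L_n
    B = Phi.T @ A @ Phi                 # b_ij = (A phi_j, phi_i)   (3.40)
    c = np.linalg.solve(B, Phi.T @ f)
    u_n = Phi @ c
    errors.append(np.linalg.norm(u_n - u_true))
print(errors)
```

Unlike the least squares matrix $(A\phi_j, A\phi_i)$, the Galerkin matrix inherits the conditioning of $A$ itself, which is one practical reason to prefer it when $A$ is well conditioned.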

3. If one applies the projection method with $\psi_j = \phi_j$ to equation (3.17) and chooses a Schauder basis $\{\phi_j\}$ of $H$, then the matrix

a_{ij} := (q R q^* \phi_j, \phi_i) = (R q^* \phi_j, q^* \phi_i) \qquad (3.47)

and the numbers

f_{0i} = (f_0, \phi_i) = (f, q^* \phi_i) \qquad (3.48)

can be computed. The parentheses here denote the inner product in $H^0 = H_0$, $(u, v) = (u, v)_0$. As $q^* \phi_i := w_i$ one can take a basis of $H_-$: since $q^*$ is an isometry between $H_0$ and $H_-$, it sends a basis $\{\phi_i\}$ of $H_0$ onto a basis $\{w_i\}$ of $H_-$. Let us suggest a system $\{w_i\}$ for computational purposes. Let $B$ be a ball which contains the domain $D$, and let $\{v_j\}$ be the system of eigenfunctions of the Dirichlet Laplacian in $B$, orthonormal in $L^2(B)$:

-\Delta v_j = \lambda_j v_j, \quad v_j = 0 \ \text{on } \partial B. \qquad (3.49)

For any $-\infty < \beta < \infty$, the system $\{v_j\}$, $1 \le j < \infty$, forms a basis of $H^\beta(B)$. Indeed, the norm in $H^\beta(B)$ is equivalent to the norm $\|(-\Delta)^{\beta/2} u\|_{L^2(B)} := \|u\|_\beta$, and

\|u\|_\beta = \left( \sum_{j=1}^{\infty} |u_j|^2 \lambda_j^\beta \right)^{1/2}, \quad u_j := (u, v_j). \qquad (3.50)

Therefore, $u \in H^\beta(B)$ is a necessary and sufficient condition for the Fourier series

u = \sum_{j=1}^{\infty} u_j v_j \qquad (3.51)

to converge in $H^\beta(B)$, so that the system $\{v_j\}$ is a basis of $H^\beta(B)$ for any $\beta$, $-\infty < \beta < \infty$. Moreover, this basis is orthogonal in $H^\beta(B)$ for any $\beta$, although it is not normalized in $H^\beta(B)$ for $\beta \ne 0$. In order to check the orthogonality property, note that


(v_j, v_i)_\beta = \left( (-\Delta)^\beta v_j, v_i \right)_0 = \lambda_j^\beta (v_j, v_i)_0 = \lambda_j^\beta \delta_{ji},

where $\delta_{ji}$ is the Kronecker delta. The basis $\{v_j\}$ can therefore be used as a basis of $H_- = \dot H^{-\beta}$ for any $\beta \ge 0$. In the case of equation (3.17) with $R \in \mathcal{R}$, one has $H_- = \dot H^{-\alpha}$, $\alpha = s(q - p)/2$. Although the system $\{v_j\}$, $1 \le j < \infty$, is a basis of $\dot H^{-\alpha}$, it is not very convenient for the representation of singular functions, such as $\delta_\Gamma$, for example. The situation is similar to the one arising when $\delta(x - y)$ is represented as

\delta(x - y) = \sum_{j=1}^{\infty} \phi_j(x) \phi_j^*(y), \qquad (3.52)

where $\{\phi_j\}$ is an orthonormal basis in $L^2(D)$. Formula (3.52) is valid in the sense that for any $f \in L^2(D)$ one has

f(x) = \sum_{j=1}^{\infty} (f, \phi_j) \phi_j. \qquad (3.53)

The sequence $\delta_n(x - y) := \sum_{j=1}^{n} \phi_j(x) \phi_j^*(y)$ is a delta-sequence, that is,

\left\| \int_D f(y) \delta_n(x - y)\,dy - f(x) \right\|_{L^2(D)} \to 0 \quad \text{as } n \to \infty, \quad \forall f \in L^2(D). \qquad (3.54)

The series (3.52) does not converge in the classical sense.

3.3 Multidimensional equation

In this section we describe the application of the basic idea presented in Section 3.1 to the multidimensional equation of random fields estimation theory

\int_D R(x, y) h(y)\,dy = f(x), \quad x \in D \subset \mathbb{R}^r. \qquad (3.55)

We assume that $R \in \mathcal{R}$ and that $\Gamma$ is smooth. Let $j = (j_1, \dots, j_r)$ be a multiindex, $|j| = j_1 + j_2 + \cdots + j_r$. By $\{b(s)\delta_\Gamma^{(j)}\}$ we mean the distribution with support on $\Gamma = \partial D$ which acts on a test function $\phi \in C_0^\infty(\mathbb{R}^r)$ by the formula

\left( \{b(s)\delta_\Gamma^{(j)}\}, \phi \right) = (-1)^{|j|} \int_\Gamma b(s) \phi^{(j)}(s)\,ds. \qquad (3.56)


Here $b(s)$ is a smooth function on $\Gamma$. Let us look for an approximate solution to equation (3.55) in $\dot H^{-\alpha}$ of the form

h_n = \sum_{j=1}^{n} c_j \phi_j(x) + \sum_{0 \le |i| \le \alpha - 1} \sum_{m=1}^{n} (-1)^{|i|} a_{mi} \{b_{mi}(s)\delta_\Gamma^{(i)}\} := h_{on} + h_{sn}, \qquad (3.57)

where the $a_{mi}$ and $c_j$ are constants, the system $\{\phi_j\}$, $1 \le j < \infty$, forms a basis of $H_0 = L^2(D)$, and the systems $\{b_{mi}(s)\}$, $1 \le m < \infty$, form a basis of $L^2(\Gamma)$, $0 \le |i| \le \alpha - 1$. Here $h_{on}$ stands for an approximation of $h_o$, the ordinary part of the solution $h = h_o + h_s$ to equation (3.55), and $h_{sn}$ stands for an approximation of the singular part, $h_s$, of this solution. If $f \in H^{2qs - ps}$, then $G(x)$, defined by formula (2.16), belongs to $H^{2qs}$, and $h$, the solution to (3.55) given by formula (2.15), belongs to $H^0 = L^2(D)$ in the interior of $D$: $h_o = Q(\mathcal{L})G|_D$, where the symbol $h|_D$ denotes the restriction of the distribution $h$ to the interior of $D$. For example, if $D = (t - T, t)$ and $h$ is given by formula (2.49), then $h|_D = \frac{-f'' + f}{2}$. The term

h_{on} := \sum_{j=1}^{n} c_j \phi_j(x) \qquad (3.58)

in (3.57) can approximate $h_o$ with arbitrary accuracy in $H^0$ if $n$ is large enough, because the system $\{\phi_j\}$, $1 \le j < \infty$, forms a basis of $H^0$. (It would be sufficient to assume that $\{\phi_j\}$ is a complete linearly independent system in $H^0$.) The term

h_{sn} := \sum_{0 \le |i| \le \alpha - 1} \sum_{m=1}^{n} (-1)^{|i|} a_{mi} \{b_{mi}(s)\delta_\Gamma^{(i)}\} \qquad (3.59)

can approximate $h_s$ in $\dot H^{-\alpha}$ with arbitrary accuracy if $n$ is large enough, because the systems $\{b_{mi}(s)\}$, $1 \le m < \infty$, are complete in $L^2(\Gamma)$ and $h_s$ is of the form (see formula (2.16))

h_s = \sum_{0 \le |i| \le \alpha - 1} (-1)^{|i|} \{b_i(s)\delta_\Gamma^{(i)}\}, \qquad (3.60)

where the coefficients $b_i(s)$ are the traces on $\Gamma$ of certain derivatives of $f(x)$ (see formula (2.85), for example). If $f(x)$ is sufficiently smooth, then the functions $b_i(s)$ are in $L^2(\Gamma)$ and can be approximated in $L^2(\Gamma)$ with arbitrary accuracy, if $m$ is large enough, by linear combinations of the functions $b_{mi}(s)$, because the systems $\{b_{mi}(s)\}$, $1 \le m < \infty$, are assumed to be complete in $L^2(\Gamma)$ for any $0 \le |i| \le \alpha - 1$.


Choose the coefficients $c_j$ and $a_{mi}$ in (3.57) so that

\|R h_n - f\|_1 = \min. \qquad (3.61)

The variational problem (3.61) leads to a linear algebraic system for the coefficients $c_j$ and $a_{mi}$. The arguments given in Section 3.1 below formula (3.7) remain valid without essential changes. Rather than formulate some general statements, let us consider an example. Let

\int_D \frac{\exp(-|x - y|)}{4\pi |x - y|} h(y)\,dy = f(x), \quad D \subset \mathbb{R}^3. \qquad (3.62)

This is equation (2.84) with $a = 1$. Look for its approximate solution in $\dot H^{-1}$ of the form

h_n = \sum_{j=1}^{n} c_j \phi_j + \sum_{m=1}^{n} a_m b_m(s) \delta_\Gamma. \qquad (3.63)

Here $\alpha = 1$, so that the double sum in (3.59) reduces to the second sum in (3.63); $c_j$ and $a_m$ are the coefficients to be determined by the least squares method (3.61). One has

R h_n = \sum_{j=1}^{n} c_j \eta_j(x) + \sum_{m=1}^{n} a_m w_m(x), \qquad (3.64)

where

g(x, s) = \frac{\exp(-|x - s|)}{4\pi |x - s|}, \qquad \eta_j := R\phi_j(x), \qquad (3.65)

w_m(x) := \int_\Gamma b_m(s) g(x, s)\,ds. \qquad (3.66)

Therefore (3.61) yields:

\left\| \sum_{j=1}^{n} c_j \eta_j(x) + \sum_{m=1}^{n} a_m w_m(x) - f \right\|_1 = \min. \qquad (3.67)

This leads to the linear system for the $2n$ coefficients $c_j$ and $a_m$:

\sum_{j=1}^{2n} a_{ij} \beta_j = \gamma_i, \quad 1 \le i \le 2n, \qquad (3.68)


where

\beta_j = c_j, \ 1 \le j \le n; \qquad \beta_j = a_m, \ j = n + m, \ 1 \le m \le n, \qquad (3.69)

\gamma_i = (f, \eta_i)_1, \ 1 \le i \le n; \qquad \gamma_i = (f, w_m)_1, \ i = n + m, \ 1 \le m \le n, \qquad (3.70)

a_{ij} := (v_j, v_i)_1, \qquad (3.71)

v_i = \eta_i, \ 1 \le i \le n; \qquad v_i = w_m, \ i = n + m, \ 1 \le m \le n. \qquad (3.72)

Exercise: Under what assumptions can one prove that $\|R h_n - f\|_1 \to 0$ as $n \to \infty$ implies $\|h_{on} - h_o\|_0 \to 0$ and $\|h_{sn} - h_s\|_{-1} \to 0$ as $n \to \infty$?
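The assembly of the block system (3.68)-(3.72) is mechanical once the functions $\eta_j = R\phi_j$, $w_m$, and the inner product $(\cdot,\cdot)_1$ can be evaluated. The following schematic shows the structure; the callables and the self-check (which replaces the function spaces by $\mathbb{R}^6$ with the Euclidean inner product) are hypothetical illustrations, not the text's prescription:

```python
# Schematic assembly of (3.68)-(3.72): given eta_j = R phi_j, the single-layer
# images w_m of (3.66), the data f, and an inner product inner1 = (., .)_1,
# build the 2n x 2n system and return the coefficients c_j, a_m of (3.63).
import numpy as np

def solve_block_system(etas, ws, f, inner1):
    v = list(etas) + list(ws)                                   # (3.72)
    a = np.array([[inner1(vj, vi) for vj in v] for vi in v])    # (3.71)
    gamma = np.array([inner1(f, vi) for vi in v])               # (3.70)
    beta = np.linalg.solve(a, gamma)                            # (3.68)
    n = len(etas)
    return beta[:n], beta[n:]                                   # (3.69)

# Tiny self-check in R^6 with the Euclidean inner product: if f lies in the
# span of the v_i, the exact coefficients are recovered.
rng = np.random.default_rng(2)
etas = [rng.standard_normal(6) for _ in range(3)]
ws = [rng.standard_normal(6) for _ in range(3)]
c_true = np.array([1.0, -2.0, 0.5])
a_true = np.array([0.0, 3.0, -1.0])
f = sum(c * e for c, e in zip(c_true, etas)) + sum(a * u for a, u in zip(a_true, ws))
c, am = solve_block_system(etas, ws, f, np.dot)
print(c, am)
```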

3.4 Numerical solution based on the approximation of the kernel

Consider the basic equation

Rh := \int_D R(x, y) h(y)\,dy = f(x), \quad x \in D \subset \mathbb{R}^r, \qquad (3.73)

with kernel

R(x, y) = \int_\Lambda \tilde R(\lambda) \Phi(x, y, \lambda)\,d\rho(\lambda), \quad \tilde R(\lambda) > 0, \qquad (3.74)

where $\tilde R(\lambda)$ is a positive continuous function vanishing at infinity. Let us call this function the spectral density corresponding to the kernel $R(x, y)$. Assume that, for any $\epsilon > 0$, one can find polynomials $P_\epsilon(\lambda) > 0$ and $Q_\epsilon(\lambda) > 0$ such that the function $\tilde R_\epsilon(\lambda) := P_\epsilon(\lambda) Q_\epsilon^{-1}(\lambda)$ approximates $\tilde R(\lambda)$ in the following sense:

\sup_{\lambda \in \Lambda} \left\{ |\tilde R - \tilde R_\epsilon| (1 + \lambda^2)^\beta \right\} := \|\tilde R - \tilde R_\epsilon\|_\beta < \epsilon. \qquad (3.75)

We assume that for all sufficiently small $\epsilon$, $0 < \epsilon < \epsilon_0$, where $\epsilon_0$ is a small number, one has

\deg Q_\epsilon(\lambda) - \deg P_\epsilon(\lambda) = 2\beta > 0, \qquad (3.76)

where $\beta$ does not depend on $\epsilon$. For example, if

\tilde R(\lambda) = \left( \frac{\lambda^2 + 3}{\lambda^2 + 2} \right)^{1/2} \frac{1}{\lambda^2 + 1}, \quad \Lambda = (-\infty, \infty),


then β = 1. We also assume that

\inf_{0 < \epsilon < \epsilon_0} \ \inf_{\lambda \in \Lambda} \left\{ |\tilde R_\epsilon(\lambda)| (1 + \lambda^2)^\beta \right\} := \gamma_0 > 0, \qquad (3.77)

and

\inf_{\lambda \in \Lambda} \left\{ \tilde R(\lambda)(1 + \lambda^2)^\beta \right\} := \gamma_1 > 0, \qquad \sup_{\lambda \in \Lambda} \left\{ \tilde R(\lambda)(1 + \lambda^2)^\beta \right\} := \gamma_2 < \infty. \qquad (3.78)

The basic idea of this section is this: if the operator $R_\epsilon : \dot H^{-\beta s} \to H^{\beta s}$, with the spectral density $\tilde R_\epsilon(\lambda)$, is an isomorphism for all $\epsilon \in (0, \epsilon_0)$, and the assumptions (3.75)-(3.78) hold with constants $\beta$, $c$ which do not depend on $\epsilon$, then $R$, the operator with the kernel $R(x, y)$, is also an isomorphism between $\dot H^{-\beta s}$ and $H^{\beta s}$. Therefore the properties of the operator $R$ will be expressed in terms of the properties of the rational approximants of its spectral density. We will need some preparations. First, let us prove a general lemma.

Lemma 3.5 Let $H_+ \subset H_0 \subset H_-$ be a rigged triple of Hilbert spaces, where $H_-$ is the dual space to $H_+$ with respect to $H_0$. Assume that $R : H_- \to H_+$ is a linear map such that

c_1 \|h\|_-^2 \le (Rh, h) \le c_2 \|h\|_-^2, \quad \forall h \in H_-, \qquad (3.79)

where $0 < c_1 < c_2$ are constants, and $(f, h)$ is the value of the functional $h \in H_-$ on the element $f \in H_+$. Then

\|R\| \le c_2, \qquad \|R^{-1}\| \le c_1^{-1}, \qquad (3.80)

so that $R$ is an isomorphism of $H_-$ onto $H_+$. Here $\|R\|$ is the norm of the mapping $R : H_- \to H_+$.

Proof. One has

c_1 \|h\|_-^2 \le (Rh, h) \le \|Rh\|_+ \|h\|_-,

so that

\|Rh\|_+ \ge c_1 \|h\|_-, \quad \forall h \in H_-. \qquad (3.81)

Therefore $R^{-1}$ is defined on the range of $R$ and

\|R^{-1}\| \le c_1^{-1}. \qquad (3.82)


Let us prove that the map $R$ is surjective, that is, that the range of $R$ is all of $H_+$. If it is not, then there exists a $\phi \in H_-$, $\phi \ne 0$, such that

(Rh, \phi) = 0 \quad \forall h \in H_-. \qquad (3.83)

It follows from (3.83) that

(h, R\phi) = 0 \quad \forall h \in H_-. \qquad (3.84)

Therefore $R\phi = 0$ and

0 = (R\phi, \phi) \ge c_1 \|\phi\|_-^2. \qquad (3.85)

Thus $\phi = 0$, contrary to the assumption. Therefore the map $R$ is surjective. Let us now prove the first inequality in (3.80). One has

\|R\| = \sup |(Rg, h)| = \sup |\mathrm{Re}(Rg, h)| = \sup \frac{|(R(h + g), h + g) - (R(h - g), h - g)|}{4} \le \frac{c_2}{4} \sup \left( \|h + g\|_-^2 + \|h - g\|_-^2 \right) \le c_2, \qquad (3.86)

where the supremum is taken over all $h, g \in H_-$ such that

\|h\|_- \le 1, \quad \|g\|_- \le 1. \qquad (3.87)

$\Box$

Remark 3.2 The surjectivity of the map $R : H_- \to H_+$ follows also from the fact that $R$ is a coercive, monotone, and continuous mapping (see, e.g., [Deimling (1985), p. 100]).

Let us now prove

Lemma 3.6 Let $R_\epsilon : H_- \to H_+$ be an isomorphism for all $\epsilon \in (0, \epsilon_0)$, where $\epsilon_0 > 0$ is a small fixed number. Let $R : H_- \to H_+$ be a linear map defined on all of $H_-$. Assume that

\|R_\epsilon^{-1}\| \le M \qquad (3.88)

and

\|R - R_\epsilon\| < \epsilon, \qquad (3.89)


where $M = \mathrm{const} > 0$ does not depend on $\epsilon \in (0, \epsilon_0)$. Then $R : H_- \to H_+$ is an isomorphism and

\|R^{-1}\| \le M (1 - \epsilon M)^{-1} \quad \text{for } \epsilon M < 1. \qquad (3.90)

Proof. One has

R = R_\epsilon + R - R_\epsilon = R_\epsilon [I + R_\epsilon^{-1}(R - R_\epsilon)], \qquad (3.91)

where $I$ is the identity operator on $H_-$. The operator $R_\epsilon^{-1}(R - R_\epsilon)$ is an operator from $H_-$ into $H_-$ and

\|R_\epsilon^{-1}(R - R_\epsilon)\| \le \epsilon M \qquad (3.92)

because of (3.88) and (3.89). If $\epsilon M < 1$, then the operator $I + R_\epsilon^{-1}(R - R_\epsilon)$ is an isomorphism of $H_-$ onto $H_-$, and

\|[I + R_\epsilon^{-1}(R - R_\epsilon)]^{-1}\| \le (1 - \epsilon M)^{-1}. \qquad (3.93)

Therefore, the operator $R$ is an isomorphism of $H_-$ onto $H_+$,

R^{-1} = [I + R_\epsilon^{-1}(R - R_\epsilon)]^{-1} R_\epsilon^{-1}, \qquad (3.94)

and

\|R^{-1}\| \le M (1 - \epsilon M)^{-1}, \quad \epsilon M < 1. \qquad (3.95)

Lemma 3.6 is proved. $\Box$

Let us choose $H_- = \dot H^{-\beta s}$, $H_0 = H^0 = L^2(D)$, and $H_+ = H^{\beta s}$, where $s = \mathrm{ord}\,\mathcal{L}$, and $\mathcal{L}$ is the elliptic operator which defines the kernel $R \in \mathcal{R}$ of equation (3.73). Note that, by Parseval's equality, one has:

(Rh, h) = \int_\Lambda \tilde R(\lambda) |\tilde h(\lambda)|^2 \,d\rho(\lambda) \le \gamma_2 \int_\Lambda (1 + \lambda^2)^{-\beta} |\tilde h(\lambda)|^2 \,d\rho(\lambda) = \gamma_2 \|h\|_{-\beta s}^2 \qquad (3.96)

and, similarly,

\gamma_1 \|h\|_{-\beta s}^2 \le (Rh, h), \qquad (3.97)

where γ1 and γ2 are constants from condition (3.78). From (3.96), (3.97) and Lemma 3.5, one obtains

Lemma 3.7 If the spectral density $\tilde R(\lambda)$ of the kernel (3.74) satisfies conditions (3.78) with some $\beta > 0$, then the operator $R$ with kernel $R(x, y)$,


defined by formula (3.74), is an isomorphism between the spaces $\dot H^{-\beta s}(D)$ and $H^{\beta s}(D)$, and

\|R\| \le \gamma_2, \qquad \|R^{-1}\| \le \gamma_1^{-1}, \qquad (3.98)

where $\gamma_1$ and $\gamma_2$ are the constants from condition (3.78).

Let us discuss briefly the approximation problem. Let $\tilde R(\lambda)$ be a continuous positive function such that conditions (3.78) hold and

\lim_{\lambda \to \infty} \tilde R(\lambda) \lambda^{2\beta} = \gamma_3, \qquad (3.99)

where $\beta$ is a positive integer, and let $\lambda = \tan(\phi/2)$. Then $\lambda$ runs through the real axis, $-\infty < \lambda < \infty$, as $\phi$ runs through the unit circle, $-\pi \le \phi \le \pi$. Because of the assumption (3.99), one can identify $+\infty$ and $-\infty$ and consider the function $\tilde R(\lambda)$ as a function

R(\phi) := \tilde R\left( \tan\frac{\phi}{2} \right), \quad -\pi \le \phi \le \pi, \qquad (3.100)

which is a $2\pi$-periodic function defined on the unit circle. Since

\sin\phi = \frac{2\lambda}{1 + \lambda^2}, \qquad \cos\phi = \frac{1 - \lambda^2}{1 + \lambda^2}, \qquad (3.101)

and $\cos(m\phi)$ is a polynomial of degree $m$ in $\cos\phi$, while $\frac{\sin(m+1)\phi}{\sin\phi}$ is also a polynomial of degree $m$ in $\cos\phi$, a trigonometric polynomial

S_n(\phi) := a_0 + \sum_{j=1}^{n} \left( a_j \cos(j\phi) + b_j \sin(j\phi) \right), \qquad (3.102)

where $a_j$ and $b_j$ are constants, can be written as

S_n(\phi) := \tilde R_n(\lambda) = \frac{\sum_{m=0}^{2n} c_m \lambda^m}{(1 + \lambda^2)^n}, \qquad (3.103)

where the $c_m$ are some constants. Therefore, if one wishes to approximate a function $\tilde R(\lambda)$ on the whole axis $(-\infty, \infty)$ by a rational function, one can approximate the function $R(\phi)$, defined by formula (3.100), on the unit circle by a trigonometric polynomial $S_n(\phi)$, and then write this polynomial as a rational function of $\lambda$ as in formula (3.103). The function $\tilde R_n(\lambda)$, defined by formula (3.103), satisfies the condition

\tilde R_n(\lambda) \sim \gamma_3 \lambda^{-2\beta} \quad \text{as } \lambda \to \infty \qquad (3.104)


if and only if β, 0 < β < n, is an integer and

c_{2n} = c_{2n-1} = \dots = c_{2n - 2\beta + 1} = 0. \qquad (3.105)

If (3.105) holds, then the constant $\gamma_3$ in formula (3.104) equals $c_{2n - 2\beta}$. The theory of approximation by trigonometric polynomials is well developed (see [Akhieser (1965)]). In particular, if condition (3.99) holds, then approximation in the norm

\sup_{\lambda \in \mathbb{R}^1} (1 + \lambda^2)^\beta |\tilde R(\lambda) - \tilde R_\epsilon(\lambda)| := \|\tilde R - \tilde R_\epsilon\|_\beta \qquad (3.106)

is possible. The norm (3.106) is the norm (3.75) with $\Lambda = \mathbb{R}^1$. We keep the same notation for these two norms, since there is no danger of confusing them: in all our arguments they are interchangeable.

Lemma 3.8 If $\tilde R(\lambda)$ is a continuous function defined on $\mathbb{R}^1 = (-\infty, \infty)$ which satisfies condition (3.99), then, for any $\epsilon > 0$, there exists a rational function $\tilde R_\epsilon = P_\epsilon(\lambda) Q_\epsilon^{-1}(\lambda)$ such that

\|\tilde R(\lambda) - \tilde R_\epsilon(\lambda)\|_\beta < \epsilon \qquad (3.107)

and condition (3.76) holds. If $\tilde R(\lambda) > 0$, then the polynomials $P_\epsilon(\lambda)$ and $Q_\epsilon(\lambda)$ can be chosen positive.

Proof. The function $\psi(\lambda) := (1 + \lambda^2)^\beta \tilde R(\lambda)$ is continuous on $\mathbb{R}^1$ and

\lim_{\lambda \to \infty} (1 + \lambda^2)^\beta \tilde R(\lambda) = \gamma_3 \qquad (3.108)

because of (3.99). The function $\psi(\tan\frac{\phi}{2})$, $\tan\frac{\phi}{2} = \lambda$, is a continuous function of $\phi$ on the interval $[-\pi, \pi]$. Therefore, for any given $\epsilon > 0$, there exists a trigonometric polynomial $S_n(\phi)$ such that

\max_{-\pi \le \phi \le \pi} \left| \psi\left( \tan\frac{\phi}{2} \right) - S_n(\phi) \right| < \epsilon, \quad \lambda = \tan\frac{\phi}{2}, \qquad (3.109)

where $n = n(\epsilon)$ depends on $\psi$. From (3.109) and (3.103) it follows that

\max_{-\infty < \lambda < \infty} \left| (1 + \lambda^2)^\beta \tilde R(\lambda) - \sum_{m=0}^{2n} c_m \lambda^m (1 + \lambda^2)^{-n} \right| < \epsilon. \qquad (3.110)


Therefore (3.107) holds with

P_\epsilon(\lambda) := \sum_{m=0}^{2n} c_m \lambda^m, \qquad Q_\epsilon(\lambda) := (1 + \lambda^2)^{n + \beta}. \qquad (3.111)

In order to prove the last statement of Lemma 3.8, one first approximates the continuous function $(1 + \lambda^2)^{\beta/2} \tilde R^{1/2}(\lambda)$ by a rational function $T(\lambda)$ in the uniform norm on $\mathbb{R}^1$ with accuracy $\le \epsilon$, where $\epsilon > 0$ is a given number. Then the square of this rational function approximates the function $(1 + \lambda^2)^\beta \tilde R(\lambda)$ with accuracy $\mathrm{const} \cdot \epsilon$, where the constant does not depend on $\epsilon$. Indeed, if $|f - T| < \epsilon$, then

|f^2 - T^2| \le |f - T| \left( \max |f| + \max |T| \right) \le \epsilon \cdot \mathrm{const}. \qquad (3.112)

If $T$ is a rational function, then $T^2(\lambda) = P(\lambda) Q^{-1}(\lambda)$, where $P(\lambda)$ and $Q(\lambda)$ are positive. Lemma 3.8 is proved. $\Box$
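The proof is constructive and can be mimicked numerically: sample $\psi(\tan(\phi/2))$ on the circle, truncate its Fourier series to obtain $S_n(\phi)$, and read off $\tilde R_\epsilon = S_n (1 + \lambda^2)^{-\beta}$. A sketch for the example density of this section (the FFT grid size and the truncation order are illustrative assumptions):

```python
# Rational approximation of a spectral density via the substitution
# lambda = tan(phi/2) and a truncated Fourier series, as in Lemma 3.8.
import numpy as np

beta = 1
def Rt(lam):
    # Rt(lam) = ((lam^2 + 3)/(lam^2 + 2))^{1/2} / (lam^2 + 1), beta = 1
    return np.sqrt((lam**2 + 3.0) / (lam**2 + 2.0)) / (lam**2 + 1.0)

K, n = 512, 32
phi = -np.pi + 2.0 * np.pi * np.arange(K) / K
lam = np.tan(phi / 2.0)                 # phi = -pi gives a huge |lam| ("infinity")
psi = (1.0 + lam**2) ** beta * Rt(lam)  # psi is smooth, 2*pi periodic; psi(+-pi) -> gamma_3 = 1

F = np.fft.fft(psi)
F[n + 1 : K - n] = 0.0                  # keep harmonics |k| <= n: the partial sum S_n(phi)
Sn = np.fft.ifft(F).real

err = np.max(np.abs(psi - Sn))          # sup |psi - S_n| = ||Rt - Rt_eps||_beta, cf. (3.75)
print(err)
```

Since $\psi$ is smooth and periodic here, the Fourier coefficients decay geometrically and a small $n$ already gives a tiny weighted error (3.75).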

Let us summarize the results in the following theorem.

Theorem 3.1 Let $\tilde R(\lambda)$ be a continuous positive function on $\mathbb{R}^1$ and suppose condition (3.99) holds with a positive integer $\beta$. Then:

a) for any $\epsilon > 0$, there exists a positive rational function $\tilde R_\epsilon(\lambda)$ such that conditions (3.75) and (3.76) hold;
b) for all sufficiently small $\epsilon$, $0 \le \epsilon \le \epsilon_0$, the operator $R_\epsilon$, with the kernel defined by the spectral density $\tilde R_\epsilon(\lambda)$, is an isomorphism of the space $\dot H^{-\beta s}(D) := \dot H^{-\beta s}$ onto $H^{\beta s}(D) := H^{\beta s}$;
c) the operator $R : \dot H^{-\beta s} \to H^{\beta s}$ is an isomorphism;
d) there exist positive constants $\gamma_0$, $\gamma_1$, and $\gamma_2$ such that conditions (3.77) and (3.78) hold;
e) the following estimates hold:

\|R\| \le \gamma_2, \qquad \|R^{-1}\| \le \gamma_1^{-1}, \qquad (3.113)

where $\gamma_1$ and $\gamma_2$ are the constants in formula (3.78);
f) if $\gamma_1^{-1} \epsilon < 1$, then

\|R_\epsilon^{-1}\| \le \gamma_1^{-1} (1 - \gamma_1^{-1} \epsilon)^{-1} \qquad (3.114)

and

\|R^{-1} - R_\epsilon^{-1}\| \le \epsilon \gamma_1^{-2} (1 - \gamma_1^{-1} \epsilon)^{-1}. \qquad (3.115)


Proof. The statements (a) to (e) of Theorem 3.1 follow from Lemmas 3.5-3.8. The statement (3.114) is analogous to (3.95) and can be proved similarly. The last statement (3.115) follows immediately from the identity:

R^{-1} - R_\epsilon^{-1} = R_\epsilon^{-1}(R_\epsilon - R) R^{-1} \qquad (3.116)

and estimate (3.114), the second estimate in (3.113), and the estimate

\|R_\epsilon - R\| \le \epsilon, \qquad (3.117)

which is a consequence of (3.75). Let us explain why (3.75) implies (3.117). One has

\|R\| = \sup_{\|h\|_- \le 1} (Rh, h) = \sup_{\|h\|_- \le 1} \int \tilde R(\lambda) |\tilde h(\lambda)|^2 \,d\rho(\lambda)
\le \sup_{\lambda \in \mathbb{R}^1} \left\{ (1 + \lambda^2)^\beta \tilde R(\lambda) \right\} \sup_{\|h\|_- \le 1} \int (1 + \lambda^2)^{-\beta} |\tilde h|^2 \,d\rho(\lambda)
= \sup_{\lambda \in \mathbb{R}^1} \left\{ (1 + \lambda^2)^\beta \tilde R(\lambda) \right\} \sup_{\|h\|_- \le 1} \|h\|_-^2
= \sup_{\lambda \in \mathbb{R}^1} \left\{ (1 + \lambda^2)^\beta \tilde R(\lambda) \right\}. \qquad (3.118)

Here $H_- = \dot H^{-\beta s}$. Thus

\|R\| \le \sup_{\lambda \in \mathbb{R}^1} \left\{ (1 + \lambda^2)^\beta \tilde R(\lambda) \right\}. \qquad (3.119)

Applying (3.119) to the operator $R_\epsilon - R$, whose spectral density is $\tilde R_\epsilon - \tilde R$, one obtains (3.117) from (3.75). Theorem 3.1 is proved. $\Box$

It is now easy to study the stability of the numerical solution of equation (3.73) based on the approximation of $R(x, y)$. Consider the equation

R h_\delta = f_\delta, \quad f_\delta \in H_+, \qquad (3.120)

where $f_\delta$ is the noisy data:

\|f_\delta - f\|_+ \le \delta. \qquad (3.121)

This means that, in place of the exact data $f \in H_+$, approximate data $f_\delta$ are given, where $\delta > 0$ is the accuracy with which the given data approximate the exact data in $H_+$. Suppose that $\tilde R(\lambda)$, the spectral density of the given kernel, satisfies condition (3.99) with $\beta > 0$ an integer. Take a kernel $R_\epsilon \in \mathcal{R}$ such that the estimate (3.117) holds with $\epsilon > 0$ sufficiently


small. This is possible by Theorem 3.1. Then estimate (3.115) holds. Consider the equation

R_\epsilon h_\delta = f_\delta. \qquad (3.122)

By Theorem 3.1, the operator $R_\epsilon : H_- \to H_+$ is an isomorphism if $\epsilon > 0$ is sufficiently small, $0 < \epsilon < \epsilon_0$. Therefore, for such $\epsilon$, equation (3.122) is uniquely solvable in $H_-$. We wish to estimate the error of the approximate solution:

\|h - h_\delta\|_- = \|R^{-1} f - R_\epsilon^{-1} f_\delta\|_- \le \|R^{-1}(f - f_\delta)\|_- + \|(R^{-1} - R_\epsilon^{-1}) f_\delta\|_- \le \|R^{-1}\| \|f - f_\delta\|_+ + \|R^{-1} - R_\epsilon^{-1}\| \|f_\delta\|_+ \le \gamma_1^{-1} \delta + \epsilon \gamma_1^{-2} (1 - \gamma_1^{-1} \epsilon)^{-1} \|f_\delta\|_+. \qquad (3.123)

In our case $H_- = \dot H^{-\beta s}$, $H_+ = H^{\beta s}$. Estimate (3.123) proves that the error of the approximate solution goes to zero as the accuracy of the data increases, that is, as $\delta \to 0$. Indeed, one can choose $\epsilon > 0$ so small that the second term on the right-hand side of inequality (3.123) is arbitrarily small, say less than $\delta$. Then the right-hand side of (3.123) is not more than $(\gamma_1^{-1} + 1)\delta$. We have proved

Lemma 3.9 The error of the approximate solution $h_\delta$ is estimated by the inequality

\|h - h_\delta\|_- \le \gamma_1^{-1} \delta + \epsilon \gamma_1^{-2} (1 - \gamma_1^{-1} \epsilon)^{-1} \|f_\delta\|_+, \qquad (3.124)

where γ1 is the constant in condition (3.78).

3.5 Asymptotic behavior of the optimal filter as the white noise component goes to zero

Consider the equation

\epsilon h + Rh = f, \quad \epsilon > 0, \qquad (3.125)

where $R \in \mathcal{R}$ or, more generally, $R$ is an isomorphism between $H_-$ and $H_+$, where $H_+ \subset H_0 \subset H_-$ is a rigged triple of Hilbert spaces. We wish to study the behavior as $\epsilon \to 0$ of $h_\epsilon$, the optimal filter. This question is of theoretical and practical interest, as was explained in the Introduction. It


will be discussed in depth in Chapter 5. We assume that the estimate

2 2 c1 h (Rh, h) c2 h h H (3.126) k k−≤ ≤ k k− ∀ ∈ −

holds, where c1 > 0 and c2 > 0 are constants and the parentheses denote the pairing between H and H+. − From (3.125) it follows that

 h 2 +(Rh , h ) = (f, h ), (3.127) k  k0   

where the parentheses denote the inner product in H0, which is the pairing between H and H+ (see Section 8.1). It follows from (3.126) and (3.127) − that

2 c1 h (Rh, h) (f, h) f + h . (3.128) k k−≤ ≤ ≤k k k k− Thus

1 h c f +, c = c1− , (3.129) k k−≤ k k where the constant c > 0 does not depend on . Since H is a Hilbert − space, and bounded sets are weakly compact in Hilbert spaces, inequality (3.129) implies that there is a weakly convergent subsequence of h, which we denote again h, so that

    h_ε ⇀ h in H₋ as ε → 0.    (3.130)

Here ⇀ denotes weak convergence in H₋, which means that for any f ∈ H₊ one has

    (f, h_ε) → (f, h) as ε → 0,    ∀f ∈ H₊.    (3.131)

Let φ ∈ H₊ be arbitrary. It follows from (3.125) that

    ε(h_ε, φ) + (Rh_ε, φ) = (f, φ),

or, since (Rh_ε, φ) = (h_ε, Rφ) and Rφ ∈ H₊,

    ε(h_ε, φ) + (h_ε, Rφ) = (f, φ),    ∀φ ∈ H₊.    (3.132)

One has

    |ε(h_ε, φ)| ≤ ε‖h_ε‖₋ ‖φ‖₊ → 0 as ε → 0,    (3.133)

where we used estimate (3.129). Therefore one can pass to the limit ε → 0 in equation (3.132) and obtain

    (h, Rφ) = (f, φ)    ∀φ ∈ H₊,    (3.134)

or

    (Rh − f, φ) = 0    ∀φ ∈ H₊,    (3.135)

where h ∈ H₋ is the weak limit (3.130). Since H₊ ⊂ H₋ is dense in H₋ in the norm of H₋, one concludes from (3.135) that

Rh = f. (3.136)

We have proved the following theorem.

Theorem 3.2 Let H₊ ⊂ H₀ ⊂ H₋ be a triple of rigged Hilbert spaces. If R : H₋ → H₊ is an isomorphism and (3.126) holds, then the unique solution to equation (3.125) converges weakly in H₋ to the unique solution of the limit equation (3.136).

Remark 3.3 The weak convergence in H₋ is exactly what is natural in estimation theory. Indeed, the estimate

    û = ∫_D h(x, y)f(y)dy = (h_x, f)    (3.137)

is the value of the functional (3.137) at the element f, where h_x = h(x, ·) ∈ H₋, f ∈ H₊, and x ∈ ℝ^r is a parameter. The errors of the optimal estimate (see, e.g., formulas (2.96) and (2.108)) are also expressed as values of a functional of the form (3.137).

One can prove that actually h_ε converges strongly in H₋. Indeed, equation (3.125) implies

    (h_ε, h_ε)₋ = (Rh_ε, h_ε) ≤ (Rh, h_ε) = (h, h_ε)₋.

Thus ‖h_ε‖₋ ≤ ‖h‖₋. Choose a sequence h_n := h_{ε_n}, lim_{n→∞} ε_n = 0, which converges weakly in H₋. Then h_n ⇀ h in H₋, ‖h‖₋ ≤ liminf_{n→∞} ‖h_n‖₋, and limsup_{n→∞} ‖h_n‖₋ ≤ ‖h‖₋. Consequently, lim_{n→∞} ‖h_n‖₋ = ‖h‖₋. This and the weak convergence h_n ⇀ h in H₋ imply strong convergence in H₋, so lim_{n→∞} ‖h − h_n‖₋ = 0.
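The content of Theorem 3.2 can be sketched in finite dimensions. The kernel, grid, and Nyström discretization below are illustrative assumptions; the point is only that the solution h_ε of the perturbed equation εh + Rh = f approaches the solution h of Rh = f in the norm ‖u‖₋ = (Ru, u)^{1/2} as ε → 0:

```python
import numpy as np

# Finite-dimensional sketch of Theorem 3.2 (assumed discretization):
# h_eps solves eps*h + R*h = f; h solves R*h = f.
n = 200
x = np.linspace(0.0, 1.0, n)
R = np.exp(-np.abs(x[:, None] - x[None, :])) / n   # symmetric PD "kernel" matrix

h = np.cos(np.pi * x)                              # exact solution of Rh = f
f = R @ h

def minus_norm(u):
    # discrete analog of ||u||_- = (Ru, u)^{1/2}, cf. (3.141)
    return np.sqrt(u @ (R @ u))

errs = [minus_norm(np.linalg.solve(eps * np.eye(n) + R, f) - h)
        for eps in (1e-1, 1e-3, 1e-5)]
assert errs[0] > errs[1] > errs[2]                 # convergence as eps -> 0
```

In the eigenbasis of R the error is ε(εI + R)^{−1}h, each component of which decreases strictly as ε decreases, which is what the assertion checks.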


3.6 A general approach

In this section we outline an approach to solving the equation

Rh = f (3.138)

which is based on the theory developed in Section 8.1.

Assume that R : H → H is a compact positive operator in the Hilbert space H, that is,

    (Rh, h) > 0    ∀h ∈ H, h ≠ 0.    (3.139)

The parentheses denote the inner product in H. The inner product

    (h, g)₋ := (Rh, g)    (3.140)

induces on H a new norm

    ‖h‖₋ = (Rh, h)^{1/2} = ‖R^{1/2}h‖.    (3.141)

Let H₋ be the Hilbert space with the inner product (3.140) which is the completion of H in the norm (3.141). By H₊ we denote the space dual to H₋ with respect to H = H₀ (see Section 8.1). One has

    H₊ ⊂ H ⊂ H₋    (3.142)

where H₊ is dense in H and H is dense in H₋. The inner product in H₊ is

    (u, v)₊ = (R^{−1}u, v) = (R^{−1/2}u, R^{−1/2}v),    u, v ∈ Dom(R^{−1/2}).    (3.143)

Therefore

    ‖u‖₊ = ‖R^{−1/2}u‖.    (3.144)

One can see that H₊ is the range of R^{1/2}. Indeed, Ran(R^{1/2}) ⊂ H₊ by definition and is closed in the H₊ norm. Indeed, let f_n = R^{1/2}u_n and assume that ‖f_n − f_m‖₊ → 0 as n, m → ∞. Then, by (3.144), ‖u_n − u_m‖ → 0 as n, m → ∞. Therefore there exists a u ∈ H such that ‖u_n − u‖ → 0 as n → ∞. Let f := R^{1/2}u. Then f ∈ H₊ and ‖f − f_n‖₊ → 0 as n → ∞. Thus Ran(R^{1/2}) is closed in H₊, where Ran(A) is the range of an operator A. Since H₊ is the completion of Ran(R^{1/2}) in the H₊ norm, it follows that H₊ = Ran(R^{1/2}). One can also define the norm in H₊ as

    ‖u‖₊ = sup_{h∈H, h≠0} |(u, R^{−1/2}h)| / ‖h‖.    (3.145)

If and only if the right side of (3.145) is finite one concludes that u ∈ Dom(R^{−1/2}) and obtains from (3.145) equation (3.144). The triple (3.142) is a triple of rigged Hilbert spaces (see Section 8.1), and R : H₋ → H₊ is an isomorphism.

Therefore, a general approach to stably solving equation (3.138) can be described as follows. Suppose an operator A : H → H is found such that A > 0 and the norm (Au, u)^{1/2} is equivalent to the norm (3.141). In this case the spaces H₊ and H₋ constructed with the help of A consist of the same elements as the spaces H₊ and H₋ constructed above, and the norms of these spaces are equivalent, so that one can identify these spaces. Suppose that one can construct the mapping A^{−1} : H₊ → H₋, an isomorphism between H₊ and H₋. Then equation (3.138) can be written as

    Bh := A^{−1}Rh = A^{−1}f := g,    (3.146)

where B : H₋ → H₋ is an isomorphism. Therefore equation (3.138) is reduced to the equation

Bh = g (3.147)

which is an equation in the Hilbert space H₋ with a linear operator B which is an isomorphism of H₋ onto H₋. The operator B in equation (3.147) is selfadjoint and positive in H₋. Indeed,

    (Bh, v)₋ = (RBh, v) = (RA^{−1}Rh, v) = (h, RA^{−1}Rv) = (h, Bv)₋.    (3.148)

Moreover,

    (Bh, h)₋ = (A^{−1}Rh, Rh) > 0 for h ≠ 0,    (3.149)

since A and R are positive. Equation (3.147), with an isomorphism B from H₋ onto H₋ which is positive in the sense (3.149), can easily be solved numerically by iterative or projection methods described in Section 3.2.

Let us now describe some connections between the concepts of this section and the well-known concept of a reproducing kernel.
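As a sketch of the kind of iterative method meant here, one can solve a discretized positive definite system Rh = f by conjugate gradients. The choice of kernel and of plain conjugate gradients (rather than a particular preconditioner A, which would enter the same loop through an extra solve with A) is an assumption made for this illustration:

```python
import numpy as np

# Iterative solution of a discretized Rh = f with R symmetric positive
# definite; the exp(-|x - y|) kernel and the grid are illustrative only.
n = 100
x = np.linspace(0.0, 1.0, n)
R = np.exp(-np.abs(x[:, None] - x[None, :])) / n   # symmetric PD matrix

h_true = x * (1.0 - x)
f = R @ h_true

# Conjugate gradient iteration for the PD system R h = f.
h = np.zeros(n)
r = f - R @ h
p = r.copy()
for _ in range(200):
    Rp = R @ p
    alpha = (r @ r) / (p @ Rp)
    h += alpha * p
    r_new = r - alpha * Rp
    if np.linalg.norm(r_new) < 1e-12:
        break
    p = r_new + ((r_new @ r_new) / (r @ r)) * p
    r = r_new

assert np.linalg.norm(R @ h - f) < 1e-8            # residual is small
```

A preconditioned variant, with A as in the text, would replace the update direction by A^{−1}-applications; the positivity (3.149) is what guarantees that such iterations converge.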


Definition 3.1 A kernel K(x, y), x, y ∈ D ⊂ ℝ^r, is called a reproducing kernel for a Hilbert space H₊ of functions defined on D, where D is a (not necessarily bounded) domain in ℝ^r, if for any u ∈ H₊ one has

    (K(x, y), u(y))₊ = u(x).    (3.150)

It is assumed that, for every x ∈ D, K(x, y) ∈ H₊, and that H₊ consists of functions whose values at a point are well defined. From (3.150) it follows that (K(x, y)u, u)₊ ≥ 0, where K(x, y)u := (K(x, y), u)₊, and

    K(x, x) ≥ 0,    K(x, y) = K*(y, x),    |K(x, y)|² ≤ K(x, x)K(y, y),    (3.151)

as we will prove shortly. The reproducing kernel, if it exists for a Hilbert space H₊, is unique. Indeed, if K₁ is another reproducing kernel then

    (K(x, y) − K₁(x, y), u(y))₊ = 0    ∀u ∈ H₊.    (3.152)

Therefore K(x, y) = K₁(x, y). The reproducing kernel exists if and only if the estimate

    |u(x)| ≤ c‖u‖₊    ∀u ∈ H₊    (3.153)

holds with a positive constant c which does not depend on u. Indeed,

    |u(x)| = |(K(x, y), u(y))₊| ≤ ‖K(x, ·)‖₊ ‖u‖₊.    (3.154)

Thus (3.153) holds with c = ‖K(x, ·)‖₊. Note that

    ‖K(x, ·)‖₊² = (K(x, y), K(x, y))₊ = K(x, x)    (3.155)

because of (3.150). Conversely, if (3.153) holds then, by Riesz's theorem about linear functionals on a Hilbert space, there exists a K(x, y) such that (3.150) holds. Since (3.150) implies that, for any numbers t_j, 1 ≤ j ≤ n, one has

    Σ_{i,j=1}^{n} K(x_i, x_j) t_i t_j* = (Σ_{i=1}^{n} K(x_i, y)t_i, Σ_{j=1}^{n} K(x_j, y)t_j)₊ ≥ 0,    (3.156)

one sees that the matrix K(x_i, x_j) is nonnegative definite, and therefore (3.151) holds.

Lemma 3.10 Assume that D ⊂ ℝ^r is a bounded domain and the kernel R(x, y) of the operator R : H → H, H = L²(D), is nonnegative definite and


continuous in x, y ∈ D̄. Then the Hilbert space H₊ generated by R (see formula (3.143)) is a Hilbert space with reproducing kernel R(x, y).

Proof. If u ∈ H₊ then u ∈ Ran(R^{1/2}), so that there is a v such that

    u = R^{1/2}v,    v ∈ H.    (3.157)

If we prove that the operator R^{1/2} is an integral operator:

    u(x) = R^{1/2}v = ∫_D T(x, y)v(y)dy    (3.158)

such that the function

    t(x) := (∫_D |T(x, y)|² dy)^{1/2}    (3.159)

is continuous in D̄, then (3.158) and (3.144) imply

    |u(x)| ≤ t(x)‖v‖ = t(x)‖R^{−1/2}u‖ = t(x)‖u‖₊.    (3.160)

This is an estimate identical to (3.153), and we have proved that this estimate implies that H₊ has the reproducing kernel K(x, y). To finish the proof one has to prove (3.158) and (3.159).

Since D is bounded, the operator R : H → H with continuous kernel is in the trace class. This means that

    R(x, y) = Σ_{j=1}^{∞} λ_j φ_j(x)φ_j*(y),    (3.161)

where λ₁ ≥ λ₂ ≥ ··· > 0 are the eigenvalues of R counted according to their multiplicities, and φ_j are the normalized eigenfunctions:

    ∫_D R(x, y)φ_j(y)dy = λ_j φ_j(x),    (φ_j, φ_i) = δ_{ij},    (3.162)

and

    Tr R = Σ_{j=1}^{∞} λ_j = ∫_D R(x, x)dx < ∞.    (3.163)

We will explain the second equality in (3.163) later. The operator R^{1/2} has the kernel

    T(x, y) = Σ_j λ_j^{1/2} φ_j(x)φ_j*(y),    (3.164)

which can be easily checked: the kernel of R is the composition

    T ∘ T := ∫_D T(x, z)T(z, y)dz = R(x, y).    (3.165)

Therefore

    ∫_D |T(x, y)|²dy = Σ_{j=1}^{∞} λ_j |φ_j(x)|² = R(x, x).    (3.166)

Therefore (3.158) and (3.159) are proved, and

    t(x) = [R(x, x)]^{1/2}.    (3.167)
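The trace identity (3.163) is easy to check numerically. The Gaussian kernel exp(−(x − y)²) on D = [0, 1] and the uniform-grid Nyström discretization below are assumptions for the sketch; for this kernel R(x, x) = 1, so both sides of (3.163) equal 1:

```python
import numpy as np

# Numerical check of Tr R = sum_j lambda_j = int_D R(x, x) dx, cf. (3.163),
# for an assumed continuous nonnegative definite kernel on D = [0, 1].
n = 400
x = np.linspace(0.0, 1.0, n)
w = 1.0 / n                                     # quadrature weight
R = np.exp(-(x[:, None] - x[None, :]) ** 2)     # continuous PD kernel

lam = np.linalg.eigvalsh(R * w)                 # eigenvalues of the discretized operator
trace_spectral = lam.sum()                      # sum of lambda_j
trace_diagonal = np.sum(np.diag(R)) * w         # quadrature for int_D R(x, x) dx

assert abs(trace_spectral - trace_diagonal) < 1e-10
assert abs(trace_diagonal - 1.0) < 1e-12        # here R(x, x) = 1 on [0, 1]
```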

Let us finally sketch the proof of the second equality in (3.163). This equality is well known, and the proof is sketched for the convenience of the reader. It suffices to use Mercer's theorem: if R(x, y) is continuous and nonnegative definite, that is,

    ∫_D ∫_D R(x, y)h(y)h*(x)dydx ≥ 0    ∀h ∈ L²(D),    (3.168)

then the series (3.161) converges absolutely and uniformly in D̄ × D̄. If Mercer's theorem is applied to the series (3.161) with x = y, then

    ∫_D R(x, x)dx = Σ_{j=1}^{∞} λ_j ∫_D |φ_j|²dx = Σ_{j=1}^{∞} λ_j.    (3.169)

Thus, (3.163) holds.

To prove Mercer's theorem, note that the kernel R_n(x, y) := R(x, y) − Σ_{j=1}^{n} λ_j φ_j(x)φ_j*(y) is nonnegative definite for every n. Therefore R_n(x, x) ≥ 0, so that

    Σ_{j=1}^{n} λ_j |φ_j(x)|² ≤ R(x, x)    ∀n.    (3.170)

Therefore the series Σ_{j=1}^{∞} λ_j |φ_j(x)|² ≤ R(x, x) ≤ c converges, and c does not depend on x because R(x, x) is a continuous function in D̄. Thus, the series

    R(x, y) = Σ_{j=1}^{∞} λ_j φ_j(x)φ_j*(y)    (3.171)

converges uniformly in x for each y ∈ D. Indeed:

    |Σ_{j=m}^{n} λ_j φ_j(x)φ_j*(y)|² ≤ Σ_{j=m}^{n} λ_j |φ_j(x)|² · Σ_{j=m}^{n} λ_j |φ_j(y)|²
        ≤ c Σ_{j=m}^{n} λ_j |φ_j(y)|² → 0 as m, n → ∞.    (3.172)

Take y = x in (3.171) and get

    R(x, x) = Σ_{j=1}^{∞} λ_j |φ_j(x)|².    (3.173)

Since R(x, y) is continuous in D̄ × D̄, the functions φ_j(x) are continuous in D̄. By Dini's lemma the series (3.173) converges uniformly in x ∈ D̄. Therefore the series (3.173) can be integrated termwise, which gives (3.169). Lemma 3.10 is proved.

Exercise Prove Dini's lemma: if a monotone sequence of continuous functions on a compactum D ⊂ ℝ^r converges to a continuous function, then it converges uniformly.

In Definition 3.1 we assumed that the space H₊ with reproducing kernel consists of functions u(x) whose values at a point are well defined. This excludes spaces such as L²(D). If the definition (3.150) is understood in the sense that both sides of (3.150) are equal as elements of H₊ (and not pointwise), then spaces of the type L² can be included. However, in general, for such spaces the reproducing kernel is not necessarily an element of the space. For example, if H₊ = L²(D) then (3.150) implies that K(x, y) is the kernel of the identity operator in L²(D). But the identity operator in L²(D) does not have a kernel in the set of locally integrable functions. In the set of distributions, however, it has the kernel δ(x − y), where δ(x) is the delta-function. As the kernel of the identity operator in L²(D) the delta-function is understood in the weak sense:

    ∫_D ∫_D δ(x − y)f(x)g(y)dxdy = ∫_D f(x)g(x)dx    ∀f, g ∈ L²(D).    (3.174)

Remark 3.4 We have seen that the operator R in L²(D), D ⊂ ℝ^r a bounded domain, with continuous nonnegative definite kernel R(x, y), belongs to the trace class. Therefore R^{1/2} is a Hilbert-Schmidt operator. One can prove that such an operator is an integral operator without assuming that R^{1/2} ≥ 0. A linear operator A : H → H on a Hilbert space is called a


Hilbert-Schmidt operator if Σ_{j=1}^{∞} ‖Aφ_j‖² < ∞, where {φ_j}, 1 ≤ j < ∞, is an orthonormal basis of H. Pick an arbitrary f ∈ H, f = Σ_{j=1}^{∞} (f, φ_j)φ_j. Consider

    Af = Σ_{i=1}^{∞} (Af, φ_i)φ_i = Σ_{i,j=1}^{∞} (f, φ_j)(φ_j, A*φ_i)φ_i.    (3.175)

Let H = L²(D). Then (3.175) can be written as

    Af = ∫_D A(x, y)f(y)dy    (3.176)

with

    A(x, y) := Σ_{i,j=1}^{∞} a_{ji} φ_j*(y)φ_i(x),    a_{ji} := (φ_j, A*φ_i).    (3.177)

One can check that the series (3.177) converges in L²(D) × L²(D) and

    ∫_D ∫_D |A(x, y)|² dx dy = Σ_{i,j=1}^{∞} |a_{ji}|² < ∞.    (3.178)

Indeed, by Parseval's equality one has

    Σ_{i=1}^{∞} Σ_{j=1}^{∞} |a_{ji}|² = Σ_{j=1}^{∞} ‖Aφ_j‖² < ∞.    (3.179)

One can prove that the sum (3.179) does not depend on the choice of the orthonormal basis of H.
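The basis-independence of the sum (3.179) is easy to confirm in a finite-dimensional model (the matrix size and the random bases below are arbitrary assumptions): the sum equals the squared Hilbert-Schmidt (Frobenius) norm for any orthonormal basis.

```python
import numpy as np

# Check that sum_j ||A phi_j||^2 in (3.179) is independent of the
# orthonormal basis: it equals the squared Frobenius norm of A.
rng = np.random.default_rng(1)
n = 30
A = rng.standard_normal((n, n))

# Basis 1: the standard basis. Basis 2: a random orthonormal basis via QR.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))

s1 = sum(np.linalg.norm(A @ e) ** 2 for e in np.eye(n))
s2 = sum(np.linalg.norm(A @ q) ** 2 for q in Q.T)

assert abs(s1 - np.linalg.norm(A, 'fro') ** 2) < 1e-8
assert abs(s1 - s2) < 1e-8
```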


Chapter 4 Proofs

In this chapter we prove all of the theorems formulated in Chapter 2 except Theorem 2.3, the proof of which is given in Section 8.3.2.10 as a consequence of an abstract theory we develop.

4.1 Proof of Theorem 2.1

In order to make it easier for the reader to understand the basic ideas, we first give a proof of Corollary 2.2, which is a particular case of Theorem 2.1. This case corresponds to the assumption P(λ) = 1, and in this case the transmission problem (2.18)-(2.20) reduces to the exterior Dirichlet boundary value problem (2.22)-(2.23).

Proof of Corollary 2.2 Equation (2.12)

    ∫_D R(x, y)h(y)dy = f(x),    x ∈ D,    (4.1)

holds if and only if

    (Rh, φ) = (f, φ)    ∀φ ∈ Ḣ^{−α}.    (4.2)

Since smooth functions with compact support in D are dense in Ḣ^{−α}, equation (4.2) holds if and only if

    (Rh, φ) = (f, φ)    ∀φ ∈ H₀^m(D), m ≥ α,    (4.3)

where H₀^m(D) is the Sobolev space of functions defined in the domain D with compact support in D. Let us take φ = Q(L)ψ, ψ ∈ C₀^∞(D). The function φ ∈ H₀^m(D) if the coefficients of the operator L belong to H^m(D).


We will assume that these coefficients are sufficiently smooth. Sharp smoothness conditions on the coefficients of the operator L are formulated at the beginning of Section 2.2. Let us write (4.3) as

    (Q(L)Rh, ψ) = (Q(L)f, ψ)    ∀ψ ∈ C₀^∞(D).    (4.4)

By the assumption

    Q(L)R = δ(x − y),    (4.5)

    (h, ψ) = (Q(L)f, ψ)    ∀ψ ∈ C₀^∞(D).    (4.6)

This means that the distribution h equals Q(L)f in the domain D (which is an open set). If f is smooth enough, say f ∈ H^{qs}, then the obtained result says that

sing supp h = ∂D = Γ, (4.7)

since in D the distribution h is equal to the regular function Q(L)f. In order to find h in D̄, we extend f from D to ℝ^r so that the extension F has two properties:

F is maximally smooth (4.8)

and

    Q(L)F = 0 in Ω.    (4.9)

Requirement (4.9) is necessary because the function h = Q(L)F has to have support in D̄. Requirement (4.8) is natural from two points of view. The first, purely mathematical, is: requirement (4.8) selects the unique solution to equation (4.1) of minimal order of singularity, the mos solution to (4.1). The second point of view is of a statistical nature: only the mos solution to equation (4.1) gives the (unique) solution to the estimation problem we are interested in (see formula (2.105)). Let F = u in Ω. Then (4.9) says that

    Q(L)u = 0 in Ω.    (4.10)

Since F = f in D, condition (4.8) requires that

    ∂_N^j u = ∂_N^j f on Γ = ∂D,    0 ≤ j ≤ qs/2 − 1,    (4.11)

where N is the outer normal to Γ, and one cannot impose more than qs/2 boundary conditions on u, since the Dirichlet problem in Ω allows one to impose not more than qs/2 conditions on Γ. Finally one has to impose the condition

    u(∞) = 0.    (4.12)

Indeed, one can consider F as the left-hand side of equation (4.1) with h ∈ Ḣ^{−α}. In this case it is clear that condition (4.12) holds, since R(x, y) → 0 as |x| → ∞.

The Dirichlet problem (4.10)-(4.12) is uniquely solvable in H^∞(Ω) if f ∈ C^∞(D̄), Γ ∈ C^∞, and the coefficients of L are C^∞. If Γ and the coefficients a_j(x) of L are C^∞, but f ∈ H^m(D), m ≥ α = qs/2, then the solution u to the Dirichlet problem (4.10)-(4.12) belongs to H^m(Ω) ∩ H_loc^∞(Ω). We assume that a_j(x) ∈ C^∞ and Γ ∈ C^∞. This is done for simplicity and in order to avoid lengthy explanations of the results on elliptic regularity of solutions and of the connection between the smoothness of Γ and of the coefficients of L and the smoothness of the solution u to the problem (4.10)-(4.12).

The uniqueness of the solution to the Dirichlet problem (4.10)-(4.12)

follows from the positivity of Q(λ): the quadratic form (Q(L)u, u)_{L²(Ω)} = 0 if and only if u = 0, provided that u satisfies conditions (4.11) with f = 0. If u is the unique solution to the problem (4.10)-(4.12) then

    F(x) = { f(x) in D;  u(x) in Ω },    F ∈ H^α(ℝ^r).    (4.13)

Indeed, F ∈ H_loc^α(ℝ^r), and F ∈ H^α(ℝ^r) since u decays at infinity. Since F ∈ H^α(ℝ^r) and ord Q(L) = qs = 2α, one has

    h(x) = Q(L)F ∈ Ḣ^{−α}(D̄).    (4.14)

Corollary 2.2 is proved.

Remark 4.1 Consider the operator L = −∆ + a², a > 0, in L²(ℝ³) with the domain of definition H²(ℝ³). The operator L is elliptic, selfadjoint, and positive definite in H₀ = L²(ℝ³):

    (Lu, u) ≥ a²(u, u),    a > 0,    ∀u ∈ H₀.    (4.15)

The Green function of L, which is the kernel of the operator L^{−1}, is

    G(x, y) = exp(−a|x − y|) / (4π|x − y|).    (4.16)

It decays exponentially as |x| → ∞. Exponential decay as |x| → ∞ holds for the Green function of L = −∆ + q in H₀ if L has property (4.15), in particular if q(x) ≥ q₀ > 0.

Exercise. Is it true that if L is an elliptic selfadjoint operator in H₀ = L²(ℝ^r) and Q(λ) > 0 for all λ ∈ ℝ¹ is a polynomial, then the kernel of the operator [Q(L)]^{−1} decays exponentially as |x| → ∞?
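That (4.16) solves (−∆ + a²)G = 0 away from the singularity can be checked numerically: for the radial function g(r) = exp(−ar)/(4πr) the equation reads −g'' − (2/r)g' + a²g = 0. The grid and the value a = 2 below are arbitrary assumptions for the check:

```python
import numpy as np

# Finite-difference check that g(r) = exp(-a r)/(4 pi r), the radial profile
# of the Green function (4.16), satisfies -g'' - (2/r) g' + a^2 g = 0 for r > 0.
a = 2.0

def g(r):
    return np.exp(-a * r) / (4.0 * np.pi * r)

r = np.linspace(0.5, 3.0, 2001)          # stay away from the singularity at r = 0
dr = r[1] - r[0]
gm, g0, gp = g(r[:-2]), g(r[1:-1]), g(r[2:])
g1 = (gp - gm) / (2.0 * dr)              # central difference for g'
g2 = (gp - 2.0 * g0 + gm) / dr ** 2      # central difference for g''
residual = -g2 - (2.0 / r[1:-1]) * g1 + a ** 2 * g0

assert np.max(np.abs(residual)) < 1e-4   # only O(dr^2) truncation error remains
```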

Proof of Theorem 2.1 Let us start by rewriting equation (4.1) with R ∈ 𝓡 in the form

    P(L) ∫_D S(x, y)h(y)dy = f(x),    x ∈ D,    (4.17)

where

    S(x, y) := ∫_Λ Q^{−1}(λ)Φ(x, y, λ)dρ(λ).    (4.18)

    ∫_D S(x, y)h(y)dy = g(x) + v(x),    (4.19)

    P(L)g = f in D,    (4.20)

    P(L)v = 0 in D.    (4.21)

Equation (4.19) is of the form considered in the proof of Corollary 2.2, with g + v in place of f. Applying the result proved in this corollary, one obtains the following formula:

    h = Q(L)G,    (4.22)

where

    G = { g + v in D;  u in Ω },    (4.23)

and

    Q(L)u = 0 in Ω,    u(∞) = 0.    (4.24)

Here g is a particular solution of (4.20) and v is an arbitrary solution to (4.21). Formula (4.22) gives the unique solution of minimal order of singularity, the mos solution, to equation (4.1) if and only if G is maximally smooth. If f and Γ are sufficiently smooth, the maximal smoothness of G is guaranteed if and only if the following transmission boundary conditions hold on Γ:

    ∂_N^j u = ∂_N^j (v + g) on Γ,    0 ≤ j ≤ s(p + q)/2 − 1.    (4.25)

Given the orders of the elliptic operators

    ord P(L) = ps,    ord Q(L) = qs,    (4.26)

one cannot impose, in general, more than s(p + q)/2 conditions of the form (4.25). We will prove that if one imposes s(p + q)/2 conditions then the transmission problem (4.20), (4.21), (4.24), (4.25) is uniquely solvable and G ∈ H^{s(p+q)/2}(ℝ^r). Therefore the mos solution h to equation (4.1), given by formula (4.22) in which G has maximal smoothness, G ∈ H^{s(p+q)/2}(ℝ^r), has the minimal order of singularity:

    h ∈ Ḣ^{−α}(D̄),    α = qs − s(p + q)/2 = (q − p)s/2.

In order to complete the proof one has to prove that the transmission problem (4.20), (4.21), (4.24), (4.25) has a solution and that its solution is unique. This problem can be written as

    P(L)G = f in D,    (4.27)

    Q(L)G = 0 in Ω,    (4.28)

    G(∞) = 0,    (4.29)

    (∂_N^j G)₊ = (∂_N^j G)₋,    0 ≤ j ≤ s(p + q)/2 − 1,    (4.30)

where + and − in (4.30) denote the limiting values on Γ from D and from Ω respectively.

First let us prove uniqueness of the solution to the problem (4.27)-(4.30). Suppose f = 0. The problem (4.27)-(4.30) is equivalent to finding the mos solution of equation (4.1). Indeed, we have proved that if h is the mos solution to (4.1) then h is given by formula (4.22), where G solves (4.27)-(4.30). Conversely, if G solves (4.27)-(4.30) then h given by formula (4.22) solves equation (4.1) and has minimal order of singularity. This is checked by a straightforward calculation: for any φ ∈ C₀^∞(D) one has

    (RQ(L)G, φ) = (G, Q(L)Rφ) = (G, P(L)φ) = (P(L)G, φ) = (f, φ),    ∀φ ∈ C₀^∞(D),    (4.31)

where we have used the formula

    Q(L)R(x, y) = P(L)δ(x − y)    (4.32)

and the selfadjointness of P(L). Formula (4.31) implies that

    Rh = RQ(L)G = f in D.    (4.33)

Equations (4.22)-(4.24) imply that supp h ⊂ D̄. It follows from (4.22) and the inclusion G ∈ H^{s(p+q)/2}(ℝ^r) that h ∈ Ḣ^{−α}(D̄), α = (q − p)s/2. Thus, we have checked that h given by (4.22), with G given by (4.27)-(4.30), solves equation (4.1) and belongs to Ḣ^{−α}(D̄); that is, h is the mos solution to (4.1).

Exercise. Prove uniqueness of the solution to problem (4.27)-(4.30) in H^{s(p+q)/2}(ℝ^r) by establishing the equivalence of this problem and the problem of solving equation (4.1) in Ḣ^{−α}(D̄), and then proving that equation (4.1) has at most one solution in Ḣ^{−α}(D̄).

Hint: If h ∈ Ḣ^{−α} and Rh = 0 in D, then

    0 = (Rh, h) ≥ c₁‖h‖_{−α}²,    c₁ > 0,    (4.34)

so that h = 0.

Let us prove the existence of the solution to the problem (4.27)-(4.30) in H^{s(p+q)/2}(ℝ^r). Consider the bilinear form

    [φ, ψ] := ∫_Λ P(λ)Q(λ)φ̃(λ)ψ̃*(λ)dρ(λ),    (4.35)

defined on the set V := H^{s(p+q)/2}(ℝ^r) ∩ H^{sq}(Ω) of functions which satisfy the equation

    Q(L)φ = 0 in Ω.    (4.36)

Since P(λ)Q(λ) ≥ c > 0 ∀λ ∈ ℝ¹, one has the norm

    [φ, φ]^{1/2} = (∫_Λ P(λ)Q(λ)|φ̃|²dρ(λ))^{1/2},    (4.37)

which is equivalent to the norm of H^{s(p+q)/2}(ℝ^r). Indeed,

    0 < d₁ ≤ P(λ)Q(λ)(1 + λ²)^{−(p+q)/2} ≤ d₂.    (4.38)

Therefore

    d₁ ∫_Λ (1 + λ²)^{(p+q)/2}|φ̃|²dρ ≤ ∫_Λ P(λ)Q(λ)|φ̃|²dρ ≤ d₂ ∫_Λ (1 + λ²)^{(p+q)/2}|φ̃|²dρ(λ).    (4.39)

On the other hand,

    ∫_Λ (1 + λ²)^β |φ̃|²dρ(λ) = ‖φ‖²_{H^{βs}(ℝ^r)}.    (4.40)

This proves that the norm (4.37) is equivalent to the norm of the space H^{s(p+q)/2}(ℝ^r). Consider the form

    [G, ψ] = ∫_Λ P(λ)Q(λ)G̃(λ)ψ̃*(λ)dρ(λ) = ∫_{ℝ^r} P(L)G(y){Q(L)ψ(y)}*dy
        = ∫_D P(L)G {Q(L)ψ}*dy = ∫_D f {Q(L)ψ}*dy    ∀ψ ∈ V,    (4.41)

where Parseval's equality was used. Let W be the Hilbert space which is the completion of V in the norm (4.37). For any f ∈ H^α(D), α = s(q − p)/2, the right-hand side of (4.41) is a bounded linear functional on W. Indeed, extend f to all of ℝ^r so that f ∈ H^α(ℝ^r), and use Parseval's equality and

the equation Q(L)ψ = 0 in Ω to obtain

    ∫_D f {Q(L)ψ}*dy = ∫_{ℝ^r} f {Q(L)ψ}*dy = ∫_Λ Q(λ)f̃(λ)ψ̃*(λ)dρ(λ)
        ≤ (∫_Λ (Q(λ)/P(λ))|f̃|²dρ(λ))^{1/2} (∫_Λ Q(λ)P(λ)|ψ̃|²dρ(λ))^{1/2}
        ≤ ‖f‖_α ‖ψ‖_W ≤ c‖ψ‖_W,    (4.42)

where c = ‖f‖_α is a positive constant which does not depend on ψ ∈ W. According to Riesz's theorem about linear functionals one concludes from (4.41) and (4.42) that

    [G, ψ] = [Tf, ψ]    ∀ψ ∈ W,    (4.43)

where T : H^α → W is a bounded linear mapping. The function

    G = Tf,    G ∈ W,    (4.44)

is the solution to problem (4.27)-(4.30) in H^{s(p+q)/2}(ℝ^r). The last statement can be proved as follows. Suppose that G ∈ W satisfies equation (4.41) for all ψ ∈ V. Then G ∈ H^{s(p+q)/2}(ℝ^r), so that equations (4.30) and (4.29) hold, and G solves equation (4.28). In order to check that equation (4.27) holds, let ψ ∈ C₀^∞(ℝ^r) in (4.41). This is possible since C₀^∞(ℝ^r) ⊂ V. Then

    ∫_D {P(L)G − f} η*dy = 0    ∀η = Q(L)ψ, ψ ∈ C₀^∞(ℝ^r).    (4.45)

If the set of such η is complete in L²(D), one can conclude from equation (4.45) that equation (4.27) holds. To finish the proof of Theorem 2.1, let us prove that the set {Q(L)ψ : ψ ∈ C₀^∞(ℝ^r)} is complete in L²(D). But this is clear because even the smaller set of functions

    {Q(L)ψ : ψ ∈ C^∞(D̄), ∂_N^j ψ = 0 on Γ, 0 ≤ j ≤ qs/2 − 1}    (4.46)

is dense in L²(D). Indeed, the operator Q(L) is essentially selfadjoint in L²(D) on the set

    {ψ : ψ ∈ C^∞(D̄), ∂_N^j ψ = 0 on Γ, 0 ≤ j ≤ qs/2 − 1}.    (4.47)


That is, the closure of Q(L) with the domain (4.47) is selfadjoint in L²(D): it is the Dirichlet operator Q(L) in L²(D). Since Q(L) is positive definite on the set (4.47), its closure is also a positive definite selfadjoint operator in L²(D). Therefore the range of the closure of Q(L) is the whole space L²(D), and the range of the operator Q(L) with the domain of definition (4.47) is dense in L²(D). This completes the proof of Theorem 2.1.

4.2 Proof of Theorem 2.2

Let us first prove a lemma of a general nature.

Lemma 4.1 Let

    Rφ := ∫_D R(x, y)φ(y)dy,    R(x, y) = R*(y, x),    (4.48)

and assume that the kernel R(x, y) defines a compact selfadjoint operator in H = L²(D) for any bounded domain D ⊂ ℝ^r. Let λ_j(D) be the eigenvalues of R : L²(D) → L²(D),

    Rφ_j = λ_j(D)φ_j,    (4.49)

and let λ_j^+(D) be the positive eigenvalues, ordered so that

    λ₁^+(D) ≥ λ₂^+(D) ≥ ···    (4.50)

and counted according to their multiplicities. Then

    λ_j^+(D₂) ≥ λ_j^+(D₁)    ∀j, provided that D₂ ⊃ D₁.    (4.51)

Proof. By the well-known minimax principle one has

    λ_j^+(D₂) = min_{ψ₁,...,ψ_{j−1}} max{ (Rφ, φ)₂ : (φ, ψ_i)₂ = 0, 1 ≤ i ≤ j − 1, (φ, φ)₂ = 1 } := min_ψ μ_j(ψ).    (4.52)

Here (u, v)_m := ∫_{D_m} uv*dx, m = 1, 2, and

    μ_j(ψ) := max{ (Rφ, φ)₂ : (φ, ψ_i)₂ = 0, 1 ≤ i ≤ j − 1, (φ, φ)₂ = 1 }.    (4.53)

Let D₃ := D₂ \ D₁. If we assume that an additional restriction is imposed on φ in formula (4.52), namely

    φ = 0 in D₃,    (4.54)

then µj(ψ) cannot increase:

    ν_j(ψ) := max{ (Rφ, φ)₂ : (φ, ψ_i)₂ = 0, 1 ≤ i ≤ j − 1, (φ, φ)₂ = 1, φ = 0 in D₃ } ≤ μ_j(ψ).    (4.55)

    ν_j(ψ) = max{ (Rφ, φ)₁ : (φ, ψ_i)₁ = 0, 1 ≤ i ≤ j − 1, (φ, φ)₁ = 1 }.    (4.56)

Since

    λ_j^+(D₁) = min_ψ ν_j(ψ) ≤ min_ψ μ_j(ψ) = λ_j^+(D₂),    (4.57)

one obtains (4.51). Lemma 4.1 is proved.
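Lemma 4.1 is easy to observe numerically. The kernel exp(−|x − y|), the intervals [0, 1] ⊂ [0, 2], and the Nyström discretization below are assumptions made for the illustration:

```python
import numpy as np

# Numerical illustration of Lemma 4.1: the positive eigenvalues of the
# integral operator with kernel exp(-|x - y|) can only grow when the
# interval D = [0, L] is enlarged.
def top_eigs(L, n=400, k=5):
    x = np.linspace(0.0, L, n)
    R = np.exp(-np.abs(x[:, None] - x[None, :])) * (L / n)   # Nystrom matrix
    lam = np.linalg.eigvalsh(R)
    return np.sort(lam)[::-1][:k]        # the k largest eigenvalues

lam_small = top_eigs(1.0)
lam_big = top_eigs(2.0)
assert np.all(lam_big >= lam_small)      # lambda_j^+([0,2]) >= lambda_j^+([0,1])
```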

Assume now that D expands uniformly in all directions, D → ℝ^r. We wish to find out if the limit

    lim_{D→ℝ^r} λ₁(D) := λ_{1∞}    (4.58)

exists and is finite. We assume that

    R(x, y) = ∫_Λ R̃(λ)Φ(x, y, λ)dρ(λ),    (4.59)

where R̃(λ) > 0 is a continuous function which vanishes at infinity. This implies that the operator R : L²(D) → L²(D) with kernel (4.59) is compact, as follows from

Lemma 4.2 If R̃(λ) is a continuous function such that

    lim_{λ→∞} R̃(λ) = 0,    (4.60)

then the operator R : L²(D) → L²(D), where D ⊂ ℝ^r is a bounded domain, with the kernel (4.59) is compact in H = L²(D).

Proof. Given a number ε > 0, find a continuous function r̃(λ), with r̃(λ) = 0 for |λ| > N, such that

    max_{|λ|≤N} |R̃(λ) − r̃(λ)| < ε.    (4.61)

Here the number N is chosen so large that

    |R̃(λ)| < ε for |λ| > N.    (4.62)

Denote by R_ε the operator in L²(D) whose kernel has the spectral density r̃(λ). Let P denote the orthoprojection in L²(ℝ^r) onto L²(D). If R_∞ is the operator with kernel (4.59) (considered as an operator in L²(ℝ^r)), then

    R = P R_∞ P    (4.63)

and

    R_ε = P R_{ε∞} P,    (4.64)

with the same notation for R_{ε∞}. One has

    ‖R − R_ε‖ ≤ ‖R_∞ − R_{ε∞}‖ ≤ ε.    (4.65)

Here one has used the fact that the norm of the operator r : L²(ℝ^r) → L²(ℝ^r) with the kernel

    r(x, y) = ∫_Λ r̃(λ)Φ(x, y, λ)dρ(λ)    (4.66)

is given by the formula

    ‖r‖ = max_{λ∈Λ} |r̃(λ)|.    (4.67)

Indeed,

    ‖r‖ = sup_{‖φ‖_{L²(ℝ^r)}=1} |(rφ, φ)| = sup_{‖φ‖_{L²(ℝ^r)}=1} |∫_Λ r̃(λ)|φ̃|²dρ(λ)|
        ≤ max_{λ∈Λ} |r̃(λ)| sup_{‖φ‖_{L²(ℝ^r)}=1} ‖φ‖²_{L²(ℝ^r)} = max_{λ∈Λ} |r̃(λ)|.    (4.68)

This proves the inequality

    ‖r‖ ≤ max_{λ∈Λ} |r̃(λ)|.    (4.69)

In order to establish the equality (4.67), take the point λ₀ at which the function |r̃(λ)| attains its maximum. Such a point does exist since the function |r̃(λ)| is continuous and vanishes at infinity. Then find a φ(x), ‖φ‖_{L²(ℝ^r)} = 1, such that φ̃ = 0 for |λ − λ₀| > δ, where δ > 0 is an arbitrarily small number. Then, using the continuity of r̃(λ), one obtains

    ‖r‖ = sup_{‖φ‖_{L²(ℝ^r)}=1} |∫_{|λ−λ₀|≤δ} r̃(λ)|φ̃|²dρ(λ)| ≥ |r̃(λ₀)| − η(δ) = max_{λ∈Λ} |r̃(λ)| − η(δ),    (4.70)

where η(δ) is arbitrarily small if δ > 0 is sufficiently small. From (4.69) and (4.70) formula (4.67) follows. From (4.67) and the obvious inequality ‖P‖ ≤ 1 one obtains (4.65). If one can prove that the operator R_ε is compact in L²(D), then Lemma 4.2 is proved, because R can be approximated in norm with arbitrary accuracy by the compact operators R_ε, according to (4.65).

Let us prove that the operator R_ε is compact in L²(D). One has

    w := R_ε f = ∫_D (∫_{−N}^{N} r̃(λ)Φ(x, y, λ)dρ(λ)) f(y)dy.    (4.71)

Let ‖f‖_{L²(ℝ^r)} ≤ 1. Taking into account that LΦ = λΦ and using Parseval's equality, one obtains

    ‖Lw‖²_{L²(ℝ^r)} = ‖∫_D ∫_{−N}^{N} λr̃(λ)Φ(x, y, λ)dρ(λ) f(y)dy‖²_{L²(ℝ^r)}
        = ∫_{−N}^{N} λ²|r̃(λ)|²|f̃|²dρ(λ) ≤ max_{−N≤λ≤N} λ²|r̃(λ)|² ‖f‖²_{L²(ℝ^r)} ≤ c(N).    (4.72)

Let us now recall the well-known elliptic estimate ([Hörmander (1983-85)]):

    ‖w‖_{H^s(D₁)} ≤ c(D₁, D₂)(‖Lw‖_{L²(D₂)} + ‖w‖_{L²(D₂)}),    (4.73)

which holds for the elliptic operator L, ord L = s, and for arbitrary bounded domains D₁ ⊂ D₂ ⊂ ℝ^r, where D₁ is a strictly inner subdomain of D₂. From (4.72) and (4.73) it follows that

    ‖w‖_{H^s(D)} ≤ c    ∀f ∈ B₁ := {f : ‖f‖_{L²(ℝ^r)} ≤ 1},    (4.74)

where c > 0 is a constant which does not depend on f ∈ B₁. Indeed, the estimate for ‖Lw‖_{L²(ℝ^r)} is given by formula (4.72), and the estimate for ‖w‖_{L²(ℝ^r)} is obtained in the same way. Inequality (4.74) and the embedding theorem (see Theorem 8.1) imply that the set {R_ε f} is relatively compact in L²(D). Therefore the operator R_ε maps the unit ball of L²(D) into a relatively compact set in L²(D). This means that R_ε is compact in L²(D). Lemma 4.2 is proved.

In order to proceed with the study of the behavior of λ₁(D) as D → ℝ^r, let us assume that

    sup_{x∈ℝ^r} ∫_{ℝ^r} |R(x, y)|dy := A < ∞.    (4.75)

Lemma 4.3 If the kernel R(x, y) defines, for any bounded domain D ⊂ ℝ^r, a selfadjoint compact nonnegative operator in L²(D), and condition

(4.75) holds, then the limit (4.58) exists and

    λ_{1∞} ≤ A.    (4.76)

Proof. From Lemma 4.1 one knows that λ1(D) grows monotonically as D increases in the sense (4.51). Therefore, existence of the limit (4.58) and the estimate (4.76) will be established if one proves that

    λ₁(D) ≤ A    (4.77)

for all D ⊂ ℝ^r. Let Rφ₁ = λ₁(D)φ₁. One has

    λ₁(D) sup_{x∈D} |φ₁(x)| ≤ sup_{x∈D} ∫_D |R(x, y)|dy sup_{y∈D} |φ₁(y)|
        ≤ sup_{x∈ℝ^r} ∫_{ℝ^r} |R(x, y)|dy sup_{y∈D} |φ₁(y)|.    (4.78)

Therefore inequality (4.77) is obtained. Lemma 4.3 is proved.

Let us now prove Theorem 2.2.

Proof of Theorem 2.2 We need only prove formula (2.29). Take φ₁ as in Lemma 4.3 with ‖φ₁‖_{L²(D)} = 1. Extend φ₁ to all of ℝ^r by setting φ₁ = 0 in Ω. Then, using Parseval's equality, one obtains:

    λ₁(D) = ∫_D (∫_D R(x, y)φ₁(y)dy) φ₁*(x)dx = ∫_{ℝ^r} ∫_{ℝ^r} R(x, y)φ₁(y)φ₁*(x)dydx
        = ∫_Λ R̃(λ)|φ̃₁(λ)|²dρ(λ) ≤ max_{λ∈Λ} R̃(λ) ‖φ₁‖²_{L²(ℝ^r)} = max_{λ∈Λ} R̃(λ).    (4.79)

Choose φ̃₁ with support in a small neighborhood of the point λ₀ at which R̃(λ) attains its maximum. Then λ₁(D) ≥ max_λ R̃(λ) − ε, where ε > 0 is an arbitrarily small number. This proves formula (2.29) and Theorem 2.2, in which ω(λ) stands for R̃(λ).
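Both Lemma 4.3 and the limit (2.29) can be observed numerically for an assumed concrete kernel. For R(x − y) = exp(−|x − y|) the spectral density is R̃(k) = 2/(1 + k²), so max R̃ = 2, and also A = ∫ exp(−|t|)dt = 2; the discretization below is an illustrative assumption:

```python
import numpy as np

# Numerical illustration of Lemma 4.3 and Theorem 2.2 for the kernel
# exp(-|x - y|): lambda_1([0, L]) increases with L, stays below A = 2,
# and approaches max R~ = 2 as the interval grows.
def lam1(L, n=1000):
    x = np.linspace(0.0, L, n)
    R = np.exp(-np.abs(x[:, None] - x[None, :])) * (L / n)   # Nystrom matrix
    return np.linalg.eigvalsh(R).max()

vals = [lam1(L) for L in (1.0, 5.0, 25.0)]
assert vals[0] < vals[1] < vals[2] <= 2.0 + 1e-6   # monotone, bounded by A = 2
assert vals[2] > 1.95                              # approaching max R~ = 2
```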

We now discuss some properties of the eigenvalues λj (D). Suppose that R(x, y) = R(x y), R( x) = R(x), and the domain − − D is centrally symmetric with respect to the origin, that is, if x D ∈ then x D. Let us recall that an eigenvalue is called simple if the − ∈ corresponding eigenspace is one dimensional. February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book


Lemma 4.4  If $\lambda$ is a simple eigenvalue of the operator $R : L^2(D) \to L^2(D)$ with the kernel $R(x - y)$, $R(-x) = R(x)$, and $D$ is centrally symmetric, then the corresponding eigenfunction $\varphi$,

$$R\varphi = \lambda\varphi, \qquad (4.80)$$

is either even or odd.

Proof. One has

$$\lambda\varphi(-x) = \int_D R(-x - y)\varphi(y)\,dy = \int_D R(x + y)\varphi(y)\,dy = \int_D R(x - z)\varphi(-z)\,dz. \qquad (4.81)$$

Here we set $y = -z$ in the second integral and used the assumption of central symmetry. Therefore $\varphi(-x)$ is an eigenfunction corresponding to the same eigenvalue $\lambda$. Since this eigenvalue is simple, one has $\varphi(-x) = c\varphi(x)$, $c = \mathrm{const}$. This implies $\varphi(x) = c\varphi(-x)$, so that $c^2 = 1$. Thus $c = \pm 1$. If $c = 1$ then $\varphi(x)$ is even. Otherwise it is odd. Lemma 4.4 is proved. $\Box$
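The even/odd dichotomy can be seen numerically. A sketch (an illustration only) for $R(x - y) = e^{-|x-y|}$ on the centrally symmetric domain $D = (-1, 1)$, whose eigenvalues are simple:

```python
import numpy as np

n = 801
x = np.linspace(-1, 1, n)           # symmetric grid: x reversed equals -x
w = x[1] - x[0]
K = np.exp(-np.abs(x[:, None] - x[None, :])) * w
_, vecs = np.linalg.eigh(K)
for k in range(1, 5):               # top four (simple) eigenvalues
    phi = vecs[:, -k]
    even_err = np.abs(phi - phi[::-1]).max()
    odd_err = np.abs(phi + phi[::-1]).max()
    assert min(even_err, odd_err) < 1e-8   # each eigenfunction is even or odd
```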

Remark 4.2 If λ is not simple, the corresponding eigenfunction may be neither even nor odd.

For example the operator

$$R\varphi := \int_{-\pi}^{\pi} \varphi(y)\,dy \qquad (4.82)$$

has an eigenvalue $\lambda = 0$. The corresponding eigenspace is infinite-dimensional: it consists of all functions orthogonal to $1$ in $L^2(-\pi, \pi)$. In particular, the function $\cos y + \sin y$ is an eigenfunction which is neither even nor odd, and it corresponds to the eigenvalue $\lambda = 0$.

Suppose one has a family of domains $D_t$, $0 < t < \infty$, such that $D_1 = D$ and $D_t = \{x : x = t\xi,\ \xi \in D_1\}$. Then the eigenvalues $\lambda_j(D_t) := \lambda_j(t)$ depend on the parameter $t$ and one can study this dependence. If one writes

$$\int_{D_t} R(x, y)\varphi(y)\,dy = t^r \int_{D_1} R(t\xi, t\eta)\varphi(t\eta)\,d\eta = \lambda_j(t)\varphi(t\xi),$$

then one sees that $\lambda_j(t)$ are the eigenvalues of the operator $R(t)$ in $L^2(D)$ with the kernel $R(\xi, \eta, t) := t^r R(t\xi, t\eta)$, where $D = D_1$ does not depend on


t. This implies immediately that λj(t) depend on t continuously provided that

$$\|R(t_0) - R(t)\| \to 0 \quad \text{as } t \to t_0. \qquad (4.83)$$

Indeed,

$$\max_j |\lambda_j(t_0) - \lambda_j(t)| \le \|R(t) - R(t_0)\|. \qquad (4.84)$$

Estimate (4.84) follows from the minimax principle and is derived in Section 8.3. Condition (4.83) holds, for example, if

$$\int_D \int_D |R(x, y)|^2\,dx\,dy \le c(D) < \infty \qquad (4.85)$$

for any bounded domain $D \subset \mathbb{R}^r$. One can also study differentiability of the eigenvalues $\lambda_j(t)$ with respect to the parameter $t$ using, for example, methods given in [K]. This and the study of the eigenfunctions as functions of the parameter $t$ would lead us astray.
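Estimate (4.84) is the operator form of Weyl's perturbation bound. A finite-dimensional sketch, with random symmetric matrices standing in for the discretized operators $R(t_0)$ and $R(t)$ (the matrices are hypothetical illustrations):

```python
import numpy as np

rng = np.random.default_rng(0)
m = 60
B = rng.standard_normal((m, m)); R0 = (B + B.T) / 2      # stands in for R(t0)
E = rng.standard_normal((m, m)); E = (E + E.T) / 2
E *= 1e-3 / np.linalg.norm(E, 2)                          # small symmetric perturbation
R1 = R0 + E                                               # stands in for R(t)
l0 = np.sort(np.linalg.eigvalsh(R0))
l1 = np.sort(np.linalg.eigvalsh(R1))
# max_j |lambda_j(t) - lambda_j(t0)| <= ||R(t) - R(t0)||, cf. (4.84)
assert np.abs(l1 - l0).max() <= np.linalg.norm(E, 2) + 1e-12
```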

4.3 Proof of Theorems 2.4 and 2.5

Proof of Theorem 2.4 This proof can be given in complete analogy to the proof of Theorem 2.1. On the other hand, there is a special feature in the one-dimensional (r = 1) theory which is the subject of Theorem 2.4. Namely the spaces of all solutions to homogeneous equations

$$P(\mathcal{L})\varphi = 0 \qquad (4.86)$$

and

$$Q(\mathcal{L})\psi = 0 \qquad (4.87)$$

are finite-dimensional (in contrast to the case when $r > 1$). The system (2.43)-(2.44), for example, is a linear algebraic system. Therefore, existence of the solution to this system follows from the uniqueness of this solution by Fredholm's alternative. $\Box$

Let us briefly describe the basic steps of the proof.

Step 1. Consider first the case when $P(\lambda) = 1$.


Lemma 4.5  The set of all solutions of the equation

$$Rh := \int_{t-T}^{t} R(x, y)h(y)\,dy = f(x), \qquad t - T \le x \le t, \qquad (4.88)$$

with the kernel $R(x, y) \in \mathcal{R}$ and $P(\lambda) = 1$, in the space $\dot{H}^{-qs} := \dot{H}^{-qs}(D)$, $D = (t - T, t)$, is in one-to-one correspondence with the set of the solutions of the equation

$$\int_{-\infty}^{\infty} R(x, y)h(y)\,dy = f(x), \qquad x \in \mathbb{R}^1, \qquad (4.89)$$

$$h = Q(\mathcal{L})F, \qquad (4.90)$$

$h \in H^{-qs}(\mathbb{R}^1)$ and $\operatorname{supp} h \subset \bar{D} = [t - T, t]$. Here

$$F(x) = \begin{cases} \sum_{j=1}^{qs/2} b_j^- \psi_j^-(x), & x \le t - T, \\ f(x), & t - T \le x \le t, \\ \sum_{j=1}^{qs/2} b_j^+ \psi_j^+(x), & t \le x, \end{cases} \qquad (4.91)$$

$b_j^{\pm}$ are arbitrary constants, and the functions $\{\psi_j^{\pm}\}$, $1 \le j \le qs/2$, form a fundamental system of solutions to equation (4.87) such that

$$\psi_j^+(+\infty) = 0, \qquad \psi_j^-(-\infty) = 0. \qquad (4.92)$$

Proof.  Let $h \in \dot{H}^{-qs}$ be a solution to (4.88). This means that for any $\varphi \in C_0^\infty(\mathbb{R}^1)$ one has

$$(Rh, \varphi) = (F, \varphi) \quad \forall \varphi \in C_0^\infty(\mathbb{R}^1), \qquad (4.93)$$

where the parentheses in (4.93) denote the $L^2(\mathbb{R}^1)$ inner product and $F$ is given by (4.91). One can say that $F$ is defined to be the integral $Rh$ for $x > t$ and $x < t - T$ (see formula (4.88)). This integral for $h \in \dot{H}^{-qs}$ can be considered as the value of the functional $h$ on the test function $R(x, y)$, since, for a fixed $x$, $R(x, y) \in H^{qs}_{\mathrm{loc}}$ outside of an arbitrarily small neighborhood of the point $x$. Since $Q(\mathcal{L})R = \delta(x - y)$, the kernel $R$ does not belong to $H^{qs}$; however, $\delta(x - y)$ can be interpreted as the kernel of the identity operator in $L^2$, so that $(\delta(x - y), \varphi) = \varphi$ in the sense that $(\delta(x - y)\varphi, \psi) = (\varphi, \psi)$ $\forall \varphi, \psi \in L^2$.

Since $h \in \dot{H}^{-qs}$ one can consider it as an element of $H^{-qs}(\mathbb{R}^1)$. Equation (4.93) then is equivalent to (4.89). The correspondence mentioned in Lemma 4.5 can be described as follows. If $h \in H^{-qs}(\mathbb{R}^1)$,


$\operatorname{supp} h \subset [t - T, t]$, and $h$ solves equation (4.89), then its restriction to $D$ solves equation (4.88). Conversely, if $h \in \dot{H}^{-qs}$ solves equation (4.88), then $h$, regarded as an element of $H^{-qs}(\mathbb{R}^1)$ with $\operatorname{supp} h \subset [t - T, t]$, solves equation (4.89). The constants $b_j$ in (4.91) are uniquely determined by the given $h$ from the formula

$$\int_{t-T}^{t} R(x, y)h\,dy = F(x).$$

Lemma 4.5 is proved. $\Box$

Example 4.1 Let

$$\int_{-1}^{1} e^{-|x-y|}h(y)\,dy = f(x), \qquad -1 \le x \le 1. \qquad (4.94)$$

Here $\mathcal{L} = -i\frac{d}{dx}$, $r = 1$, $P(\lambda) = 1$, $Q(\lambda) = \frac{\lambda^2 + 1}{2}$, $Q(\mathcal{L}) = (-\partial^2 + 1)/2$, $\partial = \frac{d}{dx}$. (See Section 2.4.) Equation (4.87) has the solutions

$$\psi^- = \exp(x), \qquad \psi^+ = \exp(-x). \qquad (4.95)$$

Choose $F$ by formula (4.91) with constants $b^{\pm}$, $qs = 2$, $t = 1$, $t - T = -1$. Then

$$h = \frac{1}{2}(-\partial^2 + 1)F = \frac{-f'' + f}{2} + \frac{1}{2}\delta'(x - 1)\left[f(1) - b^+ e^{-1}\right] + \frac{1}{2}\delta(x - 1)\left[f'(1) + b^+ e^{-1}\right] + \frac{1}{2}\delta'(x + 1)\left[b^- e^{-1} - f(-1)\right] + \frac{1}{2}\delta(x + 1)\left[b^- e^{-1} - f'(-1)\right]. \qquad (4.96)$$

For the convenience of the reader let us give all the details of the calculation. By definition of the distributional derivative one has

$$(Q(\mathcal{L})F, \varphi) = (F, Q(\mathcal{L})\varphi) = \frac{b^+}{2}\int_{1}^{\infty} e^{-x}(-\varphi'' + \varphi)\,dx + \frac{1}{2}\int_{-1}^{1} f(x)(-\varphi'' + \varphi)\,dx + \frac{b^-}{2}\int_{-\infty}^{-1} e^{x}(-\varphi'' + \varphi)\,dx.$$

After integrating by parts two times one obtains

$$(Q(\mathcal{L})F, \varphi) = \frac{1}{2}\int_{-1}^{1}(-f'' + f)\varphi\,dx + \frac{b^+ e^{-1}}{2}\left[\varphi'(1) + \varphi(1)\right] + \frac{1}{2}\left[-\varphi'(1)f(1) + \varphi(1)f'(1) + \varphi'(-1)f(-1) - \varphi(-1)f'(-1)\right] + \frac{b^- e^{-1}}{2}\left[-\varphi'(-1) + \varphi(-1)\right]. \qquad (4.97)$$


Formula (4.97) is equivalent to (4.96). Corollary 2.5 follows from Lemma 4.5 immediately. Indeed, the solution of minimal order of singularity one

obtains from formula (4.90) if and only if the constants $b_j$ are chosen so that $F$ has minimal order of singularity, that is, if $F$ is maximally smooth. This happens if and only if the following conditions hold:

$$\left[F^{(j)}(t)\right] = 0, \qquad \left[F^{(j)}(t - T)\right] = 0, \qquad 0 \le j \le \frac{qs}{2} - 1. \qquad (4.98)$$

Here, for example,

$$\left[F^{(j)}(t)\right] := F^{(j)}(t + 0) - F^{(j)}(t - 0) \qquad (4.99)$$

is the jump of $F^{(j)}(x)$ across the point $t$. If conditions (4.98) hold, one can rewrite formula (4.90) as formula (2.47).

Exercise. Check the last statement.

Hint: The calculations are similar to the calculations given in Example 4.1.
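When the constants $b^{\pm}$ are chosen according to (4.98), the $\delta'$ terms in (4.96) vanish and one is left with the minimal-order solution $h = \tfrac12(f - f'') + \tfrac12(f(1) + f'(1))\delta(x - 1) + \tfrac12(f(-1) - f'(-1))\delta(x + 1)$ (this closed form is our own simplification of (4.96) under (4.98)). A numerical sketch checking it for the illustrative choice $f(x) = \cos x$:

```python
import numpy as np

f = np.cos
fp = lambda t: -np.sin(t)
fpp = lambda t: -np.cos(t)
cp = (f(1.0) + fp(1.0)) / 2      # weight of delta(x - 1)
cm = (f(-1.0) - fp(-1.0)) / 2    # weight of delta(x + 1)

y = np.linspace(-1, 1, 200001)
dy = y[1] - y[0]
h_reg = (f(y) - fpp(y)) / 2      # regular part of h

def trap(g):                      # trapezoid rule
    return (g.sum() - 0.5 * (g[0] + g[-1])) * dy

for xx in (-0.7, 0.0, 0.3, 0.9):
    Rh = trap(np.exp(-np.abs(xx - y)) * h_reg) \
         + cp * np.exp(-abs(xx - 1.0)) + cm * np.exp(-abs(xx + 1.0))
    assert abs(Rh - f(xx)) < 1e-6   # (Rh)(x) = f(x) on (-1, 1)
```

The point masses at $x = \pm 1$ supply exactly the boundary contributions that the smooth part $\tfrac12(f - f'')$ misses.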

Step 2. Assume now that $P(\lambda) \not\equiv 1$. Write equation (4.88) as

$$P(\mathcal{L})\int_{t-T}^{t} S(x, y)h(y)\,dy = f(x), \qquad t - T \le x \le t, \qquad (4.100)$$

where

$$S(x, y) = \int_\Lambda Q^{-1}(\lambda)\Phi(x, y, \lambda)\,d\rho(\lambda). \qquad (4.101)$$

Rewrite equation (4.100) as

$$\int_{t-T}^{t} S(x, y)h(y)\,dy = g_0(x) + \sum_{j=1}^{ps} c_j \phi_j. \qquad (4.102)$$

Here $\{\phi_j\}$, $1 \le j \le ps$, is the fundamental system of solutions to equation (4.86), $g_0$ is a particular solution to the equation

$$P(\mathcal{L})g = f, \qquad t - T \le x \le t, \qquad (4.103)$$

and $c_j$, $1 \le j \le ps$, are arbitrary constants.

Apply to equation (4.102) the result of Lemma 4.5; in particular, use formula (4.90) to get the conclusion of Theorem 2.4. Equations (2.43)-(2.44)


are necessary and sufficient for $G$, given by formula (2.38), to have minimal order of singularity and, therefore, for $h$, given by formula (2.37), to have minimal order of singularity. As we have already noted, the solvability of the system (2.43)-(2.44) for the coefficients $c_j$, $1 \le j \le ps$, and $b_j$, $0 \le j \le \frac{qs}{2} - 1$, follows from the fact that the homogeneous system (2.43)-(2.44) has only the trivial solution and from Fredholm's alternative. The fact that for $f(x) = 0$ the system (2.43)-(2.44) has only the trivial solution can be established as in the proof of Theorem 2.1. Corollary 2.1 follows from formulas (2.15)-(2.20) immediately.

Exercise. Give a detailed proof of the uniqueness of the solution of the homogeneous system (2.43)-(2.44).

Hint: A solution to this system generates a solution $h$ to equation (4.88) with $f = 0$, $h \in \dot{H}^{-\alpha}$. Use Parseval's equality to derive from

$$Rh = 0, \qquad t - T \le x \le t, \qquad h \in \dot{H}^{-\alpha}, \qquad (4.104)$$

that h = 0. If h = 0 then cj = bj = 0 for all j.

Proof of Theorem 2.5  The proof is similar to the proof of Theorem 2.4. Equation (2.65),

$$\int_{t-T}^{t} S(x, y)h(y)\,dy = g(x), \qquad t - T \le x \le t, \qquad (4.105)$$

where $g(x)$ is the right-hand side of equation (2.65), can be written as

$$\int_{-\infty}^{\infty} S(x, y)h(y)\,dy = G(x), \qquad -\infty < x < \infty, \qquad (4.106)$$

where $G(x)$ is given by (2.67) and

$$\operatorname{supp} h \subseteq [t - T, t]. \qquad (4.107)$$

There is a one-to-one correspondence between solutions to equation (4.105) in $\dot{\mathcal{H}}^{-\alpha}$ and solutions to equation (4.106) in $\mathcal{H}^{-\alpha}(\mathbb{R}^1)$ with property (4.107). Equation (4.106) can be solved by the formula

$$h = Q(\mathcal{L})EG, \qquad (4.108)$$

and $h$ given by formula (4.108) has property (4.107). This $h$ has minimal order of singularity $\le \alpha = \frac{n - m}{2}$. The constant vectors $b_j$ and $c_j$ solve the linear algebraic system (2.68)-(2.69). That this system is solvable follows


from Fredholm’s alternative and the fact that this system has at most one solution. The fact that this system has at most one solution follows, as in the proof of Theorem 2.4, from Parseval’s equality and the positivity of the kernel R(x, y). Theorem 2.5 is proved. 

4.4 Another approach

The remark we wish to make in this section concerns the case of the positive definite kernel R(x, y) which satisfies the equation

$$Q(x, \partial)R(x, y) = P(x, \partial)\delta(x - y), \qquad (4.109)$$

where $Q(x, \partial)$ and $P(x, \partial)$ are elliptic operators, positive definite in $L^2(\mathbb{R}^r)$, not necessarily commuting, $\operatorname{ord} Q(x, \partial) = n$, $\operatorname{ord} P(x, \partial) = m$, $m < n$, and $R(x, y) \to 0$ as $|x - y| \to \infty$. As in Section 4.1, we wish to study the equation

$$Rh = f, \qquad x \in D, \qquad (4.110)$$

with $f \in H^\alpha(D)$, $\alpha := \frac{n - m}{2}$, to prove that the operator $R : \dot{H}^{-\alpha}(D) \to H^\alpha(D)$ is an isomorphism, and to give analytical formulas for the solution $h$ of minimal order of singularity. Consider equation (4.110) as an equation in $L^2(\mathbb{R}^r)$:

$$Rh = F, \qquad x \in \mathbb{R}^r, \qquad (4.111)$$

with

$$F = \begin{cases} f & \text{in } D, \\ u & \text{in } \Omega := \mathbb{R}^r \setminus D, \end{cases} \qquad (4.112)$$

where

$$Q(x, \partial)u = 0 \ \text{ in } \Omega. \qquad (4.113)$$

Equation (4.113) follows from equation (4.109) and the assumption that $\operatorname{supp} h \subset \bar{D}$. The function $F$ has minimal order of singularity if and only if $u$ solves (4.113) and satisfies the boundary conditions

$$\partial_N^j u = \partial_N^j f \ \text{ on } \Gamma, \qquad 0 \le j \le \frac{n}{2} - 1, \qquad u(\infty) = 0. \qquad (4.114)$$


The problem (4.113)-(4.114) is the exterior Dirichlet problem, which is uniquely solvable if $Q$ is positive definite. If $u$ solves problem (4.113)-(4.114) then $F \in H^{(n-m)/2}(\mathbb{R}^r)$, $\operatorname{supp} Q(x, \partial)F \subset \bar{D}$, $QF \in H^{-(n+m)/2}(\mathbb{R}^r)$. Apply $Q(x, \partial)$ to both sides of equation (4.111) and use equation (4.109) to get

$$P(x, \partial)h = Q(x, \partial)F, \qquad QF \in H^{-(n+m)/2}(\mathbb{R}^r), \qquad \operatorname{supp} QF \subset \bar{D}. \qquad (4.115)$$

The solution to (4.115) in the space $\dot{H}^{-(n-m)/2}(D)$ exists, is unique, and is the solution to equation (4.110) of minimal order of singularity. More details are given in Appendix B.


Chapter 5

Singular Perturbation Theory for a Class of Fredholm Integral Equations Arising in Random Fields Estimation Theory

A basic integral equation of random fields estimation theory by the criterion of minimum of variance of the estimation error is of the form $Rh = f$, where $Rh = \int_D R(x, y)h(y)\,dy$ and $R(x, y)$ is a covariance function. The singular perturbation problem we study consists of finding the asymptotic behavior of the solution to the equation $\varepsilon h(x, \varepsilon) + Rh(x, \varepsilon) = f(x)$ as $\varepsilon \to 0$, $\varepsilon > 0$. The domain $D$ can be an interval or a domain in $\mathbb{R}^n$, $n > 1$. The class of operators $R$ is defined by the class of their kernels $R(x, y)$, which solve the equation $Q(x, D_x)R(x, y) = P(x, D_x)\delta(x - y)$, where $Q(x, D_x)$ and $P(x, D_x)$ are elliptic differential operators. The presentation in this chapter is based on [Ramm and Shifrin (2005)].

5.1 Introduction

Consider the equation

$$\varepsilon h(x, \varepsilon) + Rh(x, \varepsilon) = f(x), \qquad x \in D \subset \mathbb{R}^n, \qquad (5.1)$$

where $D$ is a bounded domain with a sufficiently smooth boundary $\partial D$, and

$$Rg(x) := \int_D R(x, y)g(y)\,dy.$$

In this chapter we study the class $\mathcal{R}$ of kernels $R(x, y)$ which satisfy the equation

$$Q(x, D_x)R(x, y) = P(x, D_x)\delta(x - y) \quad \text{in } \mathbb{R}^n$$



and tend to zero as $|x - y| \to \infty$, where $Q(x, D_x)$ and $P(x, D_x)$ are elliptic differential operators with smooth coefficients, and $\delta(x - y)$ is the delta-function. For technical reasons, below we use the kernels $R(x, y)$ of the same class, but written in a slightly different form (see (5.5)). Specifically, we write

$$R(x, y) = P(y, D_y)G(x, y), \qquad (5.2)$$

where

$$P(y, D_y) = \sum_{|\alpha| \le p} a_\alpha(y)D_y^\alpha, \qquad Q(x, D_x) = \sum_{|\beta| \le q} b_\beta(x)D_x^\beta, \qquad p < q, \qquad (5.3)$$

$$Q(x, D_x)G(x, y) = \delta(x - y). \qquad (5.4)$$

Note that

$$Q(x, D_x)R(x, y) = P(y, D_y)\delta(x - y). \qquad (5.5)$$

In this chapter all the functions are assumed to be real-valued. We assume that the coefficients $a_\alpha(x)$, $b_\beta(x)$ and $f(x)$ are sufficiently smooth functions in $\mathbb{R}^n$; $\alpha = (\alpha_1, \dots, \alpha_n)$ and $\beta = (\beta_1, \dots, \beta_n)$ are multiindices, $|\alpha| = \sum_{i=1}^n \alpha_i$, $|\beta| = \sum_{j=1}^n \beta_j$, $D_y^\alpha = \frac{\partial^{|\alpha|}}{\partial y_1^{\alpha_1} \cdots \partial y_n^{\alpha_n}}$, $D_x^\beta = \frac{\partial^{|\beta|}}{\partial x_1^{\beta_1} \cdots \partial x_n^{\beta_n}}$. Sufficient smoothness of the coefficients means that the integrations by parts we use are justified. The following assumptions hold throughout the chapter:

$$\text{A1)} \quad (Q(x, D_x)\varphi, \varphi) \ge c_1(\varphi, \varphi), \qquad c_1 = \text{const} > 0, \qquad \forall \varphi(x) \in C_0^\infty(\mathbb{R}^n), \qquad (5.6)$$

$$(P(x, D_x)\varphi, \varphi) \ge c_2(\varphi, \varphi), \qquad c_2 = \text{const} > 0, \qquad \forall \varphi(x) \in C_0^\infty(\mathbb{R}^n), \qquad (5.7)$$

where $(\cdot, \cdot)$ is the $L^2(\mathbb{R}^n)$ inner product, and $L^2$ is the real Hilbert space. By $Q^*(x, D_x)$ and $P^*(x, D_x)$ the operators formally adjoint to $Q(x, D_x)$ and $P(x, D_x)$ are denoted. If (5.6) holds, then $q > 0$ is an even integer, and (5.7) implies that $p$ is an even integer, $0 \le p < q$. Define $a := (q - p)/2$. Let $H^\lambda(D)$ be the usual Sobolev space and $\dot{H}^{-\lambda}(D)$ be its dual with respect to $L^2(D) = H^0(D)$.

Denote $\|\varphi\|_\lambda = \|\varphi\|_{H^\lambda(D)}$ for $\lambda > 0$ and $\|\varphi\|_\lambda = \|\varphi\|_{\dot{H}^\lambda(D)}$ for $\lambda < 0$. For the special value $\lambda = a$, denote $H^a(D) = H_+$, $\dot{H}^{-a}(D) = H_-$. Denote


by $(h_1, h_2)_-$ and by $(\cdot, \cdot)$ the inner products in $H_-$ and, respectively, in $L^2(D)$. As in Chapter 8, let us assume that

$$\text{A2)} \quad c_3\|\varphi\|_-^2 \le (R\varphi, \varphi) \le c_4\|\varphi\|_-^2, \qquad c_3 = \text{const} > 0, \qquad \forall \varphi(x) \in C_0^\infty(\mathbb{R}^n). \qquad (5.8)$$

This assumption holds, for example (see Chapter 8), if

$$c_5\|\varphi\|_{(p+q)/2} \le \|Q^*\varphi\|_{-a} \le c_6\|\varphi\|_{(p+q)/2}, \qquad c_5 = \text{const} > 0, \qquad \forall \varphi(x) \in C_0^\infty(\mathbb{R}^n), \qquad (5.9)$$

and

$$c_7\|\varphi\|_{(p+q)/2}^2 \le (PQ^*\varphi, \varphi) \le c_8\|\varphi\|_{(p+q)/2}^2, \qquad c_7 = \text{const} > 0, \qquad \forall \varphi(x) \in C_0^\infty(\mathbb{R}^n). \qquad (5.10)$$

The following result is proved in Chapter 8.

Theorem 5.1  If (5.8) holds, then the operator $R : H_- \to H_+$ is an isomorphism. If $QR = P\delta(x - y)$ and (5.9) and (5.10) hold, then (5.8) holds.

Equation (5.1) and the limiting equation $Rh = f$ are basic in random fields estimation theory, and the kernel $R(x, y)$ in this theory is a covariance function, so $R(x, y)$ is a non-negative definite kernel:

$$(R\varphi, \varphi) \ge 0 \qquad \forall \varphi(x) \in C_0^\infty(\mathbb{R}^n).$$

If $p < q$, then the inequality $(R\varphi, \varphi) \ge C(\varphi, \varphi)$, $C = \text{const} > 0$, $\forall \varphi(x) \in C_0^\infty(\mathbb{R}^n)$, does not hold.

In [Ramm and Shifrin (1991); Ramm and Shifrin (1993); Ramm and Shifrin (1995)] a method was developed for finding the asymptotics of the solution to equation (5.1) with kernel $R(x, y)$ satisfying equation (5.5) with $Q(x, D_x)$ and $P(x, D_x)$ being differential operators with constant coefficients. Our purpose is to generalize this theory to the case of operators with variable coefficients. In Chapter 8 the limiting equation $Rh = f$ is studied for the above class of kernels. In Chapter 2 the class of kernels $R(x, y)$ which are kernels of positive rational functions of an arbitrary selfadjoint elliptic operator in $L^2(\mathbb{R}^n)$ was studied.

In Section 5.2 we prove some auxiliary results. In Section 5.3 the asymptotics of the solution to equation (5.1) is constructed in the case $n = 1$, that is, for one-dimensional integral equations of the class $\mathcal{R}$ defined below formula


(5.1). In Section 5.4 examples of applications of the proposed asymptotical solutions are given. In Section 5.5 the asymptotics of the solution to equa- tion (5.1) is constructed in the case n > 1, and in Section 5.6 examples of applications are given.

5.2 Auxiliary results

Lemma 5.1  Assume (5.4) and suppose that $G(\infty, y) = 0$. Then

$$Q^*(y, D_y)G(x, y) = \delta(x - y). \qquad (5.11)$$

Proof.  Let $\varphi(x) \in C_0^\infty(\mathbb{R}^n)$. Then

$$(Q^*(y, D_y)Q(x, D_x)G(x, y), \varphi(y)) = (\delta(x - y), Q(y, D_y)\varphi(y)) = Q(x, D_x)\varphi(x). \qquad (5.12)$$

Also one has:

$$(Q^*(y, D_y)Q(x, D_x)G(x, y), \varphi(y)) = Q(x, D_x)(Q^*(y, D_y)G(x, y), \varphi(y)). \qquad (5.13)$$

Therefore

$$Q(x, D_x)(Q^*(y, D_y)G(x, y), \varphi(y)) = Q(x, D_x)\varphi(x). \qquad (5.14)$$

Because of (5.6) and of the condition $G(\infty, y) = 0$ this implies

$$(Q^*(y, D_y)G(x, y), \varphi(y)) = \varphi(x) \qquad \forall \varphi(x) \in C_0^\infty(\mathbb{R}^n), \qquad (5.15)$$

so (5.11) follows. $\Box$

Consider now the case $n = 1$:

$$P(y, D_y) = \sum_{i=0}^{p} a_i(y)\frac{d^i}{dy^i}, \qquad Q(x, D_x) = \sum_{j=0}^{q} b_j(x)\frac{d^j}{dx^j}, \qquad x \in \mathbb{R}^1,\ y \in \mathbb{R}^1. \qquad (5.16)$$

In this case $D = (c, d)$, $\bar{D} = [c, d]$.

Lemma 5.2  If $g(y)$ is a smooth function in $\bar{D}$, then

$$\int_c^d \left[P(y, D_y)G(x, y)\right] g(y)\,dy = \int_c^d G(x, y)\left[P^*(y, D_y)g(y)\right]dy + K_2 g(x), \qquad (5.17)$$


where

$$K_2 g(x) := \sum_{k=1}^{p} \sum_{j=1}^{k} (-1)^{j-1} \left[\frac{\partial^{k-j}G(x, y)}{\partial y^{k-j}}\,\frac{d^{j-1}(a_k(y)g(y))}{dy^{j-1}}\right]_c^d, \qquad (5.18)$$

and

$$K_2 = 0 \ \text{ if } p = 0. \qquad (5.19)$$

Proof.  Use definition (5.16) of $P(y, D_y)$ in (5.17), integrate by parts, and get formulas (5.17)-(5.19). $\Box$
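Identity (5.17)-(5.18) can be checked symbolically for a first-order $P$. The concrete functions below are hypothetical stand-ins; in particular a polynomial replaces the Green function $G$, which is legitimate since the integration-by-parts identity holds for any smooth $G$:

```python
import sympy as sp

x, y = sp.symbols('x y')
G = (x + y)**3          # smooth stand-in for G(x, y)
g = y**2
a1 = y                  # P(y, D_y) = a1(y) d/dy, so p = 1 and P* g = -(a1 g)'
c, d = 0, 1

lhs = sp.integrate(a1 * sp.diff(G, y) * g, (y, c, d))            # int [P G] g dy
K2g = (G * a1 * g).subs(y, d) - (G * a1 * g).subs(y, c)          # boundary term (5.18)
rhs = sp.integrate(G * (-sp.diff(a1 * g, y)), (y, c, d)) + K2g   # int G [P* g] dy + K2 g
assert sp.simplify(lhs - rhs) == 0
```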

Lemma 5.3 If g(y) is a smooth function in D, then

$$\int_c^d G(x, y)\left[Q(y, D_y)g(y)\right]dy = \int_c^d \left[Q^*(y, D_y)G(x, y)\right] g(y)\,dy + K_1 g(x), \qquad (5.20)$$

where

$$K_1 g(x) := \sum_{m=1}^{q} \sum_{i=1}^{m} (-1)^{i-1} \left[\frac{d^{m-i}g(y)}{dy^{m-i}}\,\frac{\partial^{i-1}(b_m(y)G(x, y))}{\partial y^{i-1}}\right]_c^d. \qquad (5.21)$$

Proof.  Similarly to Lemma 5.2, integrations by parts yield the desired formulas. $\Box$

Consider the case n > 1.

Lemma 5.4 If P (y, Dy) is defined in (5.3) and g(x) is a smooth function in D, then

$$\int_D \left[P(y, D_y)G(x, y)\right] g(y)\,dy = \int_D G(x, y)\left[P^*(y, D_y)g(y)\right]dy + M_2 g(x), \qquad (5.22)$$

where

$$M_2 g(x) := \sum_{1 \le |\alpha| \le p} \sum_{k=1}^{n} \sum_{\gamma_k=1}^{\alpha_k} (-1)^{\gamma_k + \alpha_{k+1} + \alpha_{k+2} + \cdots + \alpha_n - 1} \int_{\partial D} \frac{\partial^{|\alpha| - \alpha_n - \alpha_{n-1} - \cdots - \alpha_{k+1} - \gamma_k} G(x, y)}{\partial y_1^{\alpha_1}\,\partial y_2^{\alpha_2} \cdots \partial y_{k-1}^{\alpha_{k-1}}\,\partial y_k^{\alpha_k - \gamma_k}}\; \frac{\partial^{\alpha_{k+1} + \alpha_{k+2} + \cdots + \alpha_n + \gamma_k - 1}(a_\alpha(y)g(y))}{\partial y_k^{\gamma_k - 1}\,\partial y_{k+1}^{\alpha_{k+1}}\,\partial y_{k+2}^{\alpha_{k+2}} \cdots \partial y_n^{\alpha_n}}\, N_k(y)\,dS_y. \qquad (5.23)$$

Here $\partial D$ is the boundary of $D$, $y \in \partial D$, $N_k(y)$ is the $k$-th component of the unit normal $N$ to $\partial D$ at the point $y$, pointing into $D' := \mathbb{R}^n \setminus \bar{D}$, and if $\alpha_k = 0$ then the summation over $\gamma_k$ should be dropped.

Proof.  Apply Gauss' formula (i.e., integrate by parts). $\Box$

Lemma 5.5 If Q(x, Dx) is defined in (5.3) and g(y) is a smooth function in D, then

$$\int_D G(x, y)\left[Q(y, D_y)g(y)\right]dy = \int_D \left[Q^*(y, D_y)G(x, y)\right] g(y)\,dy + M_1 g(x), \qquad (5.24)$$

where

$$M_1 g(x) := \sum_{1 \le |\beta| \le q} \sum_{k=1}^{n} \sum_{\gamma_k=1}^{\beta_k} (-1)^{\gamma_k + \beta_{k+1} + \beta_{k+2} + \cdots + \beta_n - 1} \int_{\partial D} \frac{\partial^{\beta_{k+1} + \beta_{k+2} + \cdots + \beta_n + \gamma_k - 1}(b_\beta(y)G(x, y))}{\partial y_k^{\gamma_k - 1}\,\partial y_{k+1}^{\beta_{k+1}}\,\partial y_{k+2}^{\beta_{k+2}} \cdots \partial y_n^{\beta_n}}\; \frac{\partial^{|\beta| - \beta_n - \beta_{n-1} - \cdots - \beta_{k+1} - \gamma_k} g(y)}{\partial y_1^{\beta_1}\,\partial y_2^{\beta_2} \cdots \partial y_{k-1}^{\beta_{k-1}}\,\partial y_k^{\beta_k - \gamma_k}}\, N_k(y)\,dS_y. \qquad (5.25)$$

Here $y \in \partial D$, and if $\beta_k = 0$ then the summation over $\gamma_k$ should be dropped.

Remark 5.1  For any function $g(x)$ smooth in $\bar{D}$, one has

$$Q(x, D_x)K_j g(x) = 0, \qquad x \in (c, d), \qquad j = 1, 2, \qquad (5.26)$$

and

$$Q(x, D_x)M_j g(x) = 0, \qquad x \in D, \qquad j = 1, 2. \qquad (5.27)$$

Formulas (5.26) and (5.27) follow from the definitions of $K_j$ and $M_j$ and from equation (5.4).
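Formulas (5.26)-(5.27) rest on the fact that the boundary terms are built from $G(x, y)$ and its derivatives evaluated at boundary points $y$, and $Q(x, D_x)G(x, y) = 0$ for $x \ne y$ by (5.4). A one-line symbolic check for the kernel used later in Example 5.1, where $Q = (-\partial^2 + a^2)/(2a)$ and $G(x, y) = e^{-a|x-y|}$, on the branch $x > y$:

```python
import sympy as sp

x, c, a = sp.symbols('x c a', positive=True)
G = sp.exp(-a * (x - c))     # G(x, c) for x > c
QG = (-sp.diff(G, x, 2) + a**2 * G) / (2 * a)
assert sp.simplify(QG) == 0  # Q(x, Dx) G(x, c) = 0 away from x = c
```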


5.3 Asymptotics in the case n = 1

To construct asymptotic solutions to equation (5.1) with $R(x, y) \in \mathcal{R}$, we reduce this equation to a differential equation with special, non-standard, boundary conditions.

Theorem 5.2  Equation (5.1) is equivalent to the problem:

$$\varepsilon Q(x, D_x)h(x, \varepsilon) + P^*(x, D_x)h(x, \varepsilon) = Q(x, D_x)f(x), \qquad x \in (c, d), \qquad (5.28)$$

with the following conditions:

$$\varepsilon K_1 h(x, \varepsilon) - K_2 h(x, \varepsilon) = K_1 f(x). \qquad (5.29)$$

Proof.  If $h(x, \varepsilon)$ solves (5.1) and $R(x, y)$ satisfies (5.2), one gets

$$\varepsilon h(x, \varepsilon) + \int_c^d \left[P(y, D_y)G(x, y)\right] h(y, \varepsilon)\,dy = f(x). \qquad (5.30)$$

From (5.30) and (5.17) one gets:

$$\varepsilon h(x, \varepsilon) + \int_c^d G(x, y)\left[P^*(y, D_y)h(y, \varepsilon)\right]dy + K_2 h(x, \varepsilon) = f(x). \qquad (5.31)$$

Applying $Q(x, D_x)$ to (5.31) and using (5.4) and (5.26) yields (5.28). Let us check (5.29). From (5.28) and (5.31) one gets:

$$\varepsilon h(x, \varepsilon) + \int_c^d G(x, y)\,Q(y, D_y)\left[f(y) - \varepsilon h(y, \varepsilon)\right]dy + K_2 h(x, \varepsilon) = f(x). \qquad (5.32)$$

From (5.32) and (5.20) one obtains

$$\varepsilon h(x, \varepsilon) + \int_c^d \left[Q^*(y, D_y)G(x, y)\right]\left(f(y) - \varepsilon h(y, \varepsilon)\right)dy + K_1(f - \varepsilon h)(x, \varepsilon) + K_2 h(x, \varepsilon) = f(x). \qquad (5.33)$$

From (5.33) and (5.11) one concludes:

$$\varepsilon h(x, \varepsilon) + f(x) - \varepsilon h(x, \varepsilon) + K_1 f(x) - \varepsilon K_1 h(x, \varepsilon) + K_2 h(x, \varepsilon) = f(x).$$

This relation yields (5.29).


Let us now assume (5.28) and (5.29) and prove that h(x, ε) solves (5.1). Indeed, (5.2) and (5.17) imply

$$\varepsilon h(x, \varepsilon) + \int_c^d R(x, y)h(y, \varepsilon)\,dy = \varepsilon h(x, \varepsilon) + \int_c^d \left[P(y, D_y)G(x, y)\right]h(y, \varepsilon)\,dy = \varepsilon h(x, \varepsilon) + \int_c^d G(x, y)\left[P^*(y, D_y)h(y, \varepsilon)\right]dy + K_2 h(x, \varepsilon). \qquad (5.34)$$

From (5.34) and (5.28) one gets

$$\varepsilon h(x, \varepsilon) + Rh(x, \varepsilon) = \varepsilon h(x, \varepsilon) + \int_c^d G(x, y)\,Q(y, D_y)\left(f(y) - \varepsilon h(y, \varepsilon)\right)dy + K_2 h(x, \varepsilon). \qquad (5.35)$$

From (5.35) and (5.20) one obtains:

$$\varepsilon h(x, \varepsilon) + Rh(x, \varepsilon) = \varepsilon h(x, \varepsilon) + \int_c^d \left[Q^*(y, D_y)G(x, y)\right]\left(f(y) - \varepsilon h(y, \varepsilon)\right)dy + K_1(f - \varepsilon h)(x, \varepsilon) + K_2 h(x, \varepsilon).$$

This relation and equation (5.11) yield:

$$\varepsilon h(x, \varepsilon) + Rh(x, \varepsilon) = \varepsilon h(x, \varepsilon) + f(x) - \varepsilon h(x, \varepsilon) + K_1 f(x) - \varepsilon K_1 h(x, \varepsilon) + K_2 h(x, \varepsilon),$$

and, using (5.29), one gets (5.1). Theorem 5.2 is proved. $\Box$

This theorem is used in our construction of the asymptotic solution to (5.1). Let us look for this asymptotics in the form:

$$h(x, \varepsilon) = \sum_{l=0}^{\infty} \varepsilon^l (u_l(x) + w_l(x, \varepsilon)) = \sum_{l=0}^{\infty} \varepsilon^l h_l(x, \varepsilon), \qquad (5.36)$$

where the series in (5.36) is understood in the asymptotical sense as follows:

$$h(x, \varepsilon) = \sum_{l=0}^{L} \varepsilon^l (u_l(x) + w_l(x, \varepsilon)) + O(\varepsilon^{L+1}) \quad \text{as } \varepsilon \to 0,$$


where $O(\varepsilon^{L+1})$ is independent of $x$, and $u_l(x)$ and $w_l(x, \varepsilon)$ are some functions. Here $u_0(x)$ is an arbitrary solution to the equation

$$P^*(x, D_x)u_0(x) = Q(x, D_x)f(x). \qquad (5.37)$$

If $u_0(x)$ is chosen, the function $w_0(x, \varepsilon)$ is constructed as the unique solution to the equation:

$$\varepsilon Q(x, D_x)w_0(x, \varepsilon) + P^*(x, D_x)w_0(x, \varepsilon) = 0, \qquad (5.38)$$

which satisfies the conditions

$$\varepsilon K_1 w_0(x, \varepsilon) - K_2 w_0(x, \varepsilon) = K_1 f(x) + K_2 u_0(x). \qquad (5.39)$$

Theorem 5.3  The function $h_0(x, \varepsilon) = u_0(x) + w_0(x, \varepsilon)$ solves the equation

$$\varepsilon h_0(x, \varepsilon) + Rh_0(x, \varepsilon) = f(x) + \varepsilon u_0(x). \qquad (5.40)$$

Proof. From (5.2) and (5.17) one gets:

$$\varepsilon h_0(x, \varepsilon) + Rh_0(x, \varepsilon) = \varepsilon h_0(x, \varepsilon) + \int_c^d \left[P(y, D_y)G(x, y)\right]h_0(y, \varepsilon)\,dy = \varepsilon h_0(x, \varepsilon) + \int_c^d G(x, y)P^*(y, D_y)h_0(y, \varepsilon)\,dy + K_2 h_0(x, \varepsilon). \qquad (5.41)$$

From (5.37) and (5.38) it follows that

$$P^*(y, D_y)h_0(y, \varepsilon) = P^*(y, D_y)(u_0(y) + w_0(y, \varepsilon)) = Q(y, D_y)f(y) - \varepsilon Q(y, D_y)w_0(y, \varepsilon) = Q(y, D_y)\left[f(y) - \varepsilon w_0(y, \varepsilon)\right]. \qquad (5.42)$$

From (5.42) and from the definition of h0(x, ε) one derives:

$$P^*(y, D_y)h_0(y, \varepsilon) = Q(y, D_y)\left[f(y) - \varepsilon h_0(y, \varepsilon) + \varepsilon u_0(y)\right]. \qquad (5.43)$$


From (5.43) and (5.41) one gets:

$$\varepsilon h_0(x, \varepsilon) + Rh_0(x, \varepsilon) = \varepsilon h_0(x, \varepsilon) + \int_c^d G(x, y)\,Q(y, D_y)\left[f(y) - \varepsilon h_0(y, \varepsilon) + \varepsilon u_0(y)\right]dy + K_2 h_0(x, \varepsilon). \qquad (5.44)$$

Equations (5.44) and (5.20) yield:

$$\varepsilon h_0(x, \varepsilon) + Rh_0(x, \varepsilon) = \varepsilon h_0(x, \varepsilon) + \int_c^d \left[Q^*(y, D_y)G(x, y)\right]\left(f(y) - \varepsilon h_0(y, \varepsilon) + \varepsilon u_0(y)\right)dy + K_1\left(f(x) - \varepsilon h_0(x, \varepsilon) + \varepsilon u_0(x)\right) + K_2 h_0(x, \varepsilon). \qquad (5.45)$$

From (5.45) and (5.11) one derives:

$$\varepsilon h_0(x, \varepsilon) + Rh_0(x, \varepsilon) = \varepsilon h_0(x, \varepsilon) + f(x) - \varepsilon h_0(x, \varepsilon) + \varepsilon u_0(x) + K_1 f(x) - \varepsilon K_1 h_0(x, \varepsilon) + \varepsilon K_1 u_0(x) + K_2 h_0(x, \varepsilon). \qquad (5.46)$$

This implies:

$$\varepsilon h_0(x, \varepsilon) + Rh_0(x, \varepsilon) = f(x) + \varepsilon u_0(x) + K_1 f(x) - \varepsilon K_1 w_0(x, \varepsilon) + K_2 u_0(x) + K_2 w_0(x, \varepsilon). \qquad (5.47)$$

Equations (5.47) and (5.39) yield (5.40). Theorem 5.3 is proved. $\Box$

Let us construct higher order approximations. If $l \ge 1$ then $u_l(x)$ is chosen to be an arbitrary particular solution to the equation

$$P^*(x, D_x)u_l(x) = -Q(x, D_x)u_{l-1}(x). \qquad (5.48)$$

After ul(x) is fixed, the function wl(x, ε) is constructed as the unique solution to the equation

$$\varepsilon Q(x, D_x)w_l(x, \varepsilon) + P^*(x, D_x)w_l(x, \varepsilon) = 0, \qquad (5.49)$$

satisfying the conditions

$$\varepsilon K_1 w_l(x, \varepsilon) - K_2 w_l(x, \varepsilon) = -K_1 u_{l-1}(x) + K_2 u_l(x). \qquad (5.50)$$


Theorem 5.4 The function hl(x, ε) = ul(x)+wl(x, ε) solves the equation

$$\varepsilon h_l(x, \varepsilon) + Rh_l(x, \varepsilon) = -u_{l-1}(x) + \varepsilon u_l(x). \qquad (5.51)$$

Proof.  The proof is similar to that of Theorem 5.3 and is omitted. $\Box$

Define

$$H_L(x, \varepsilon) = \sum_{l=0}^{L} \varepsilon^l h_l(x, \varepsilon). \qquad (5.52)$$

Theorem 5.5  The function $H_L(x, \varepsilon)$ solves the equation

$$\varepsilon H_L(x, \varepsilon) + RH_L(x, \varepsilon) = f(x) + \varepsilon^{L+1}u_L(x). \qquad (5.53)$$

Proof.  From (5.52) one gets

$$\varepsilon H_L(x, \varepsilon) + RH_L(x, \varepsilon) = \varepsilon\sum_{l=0}^{L}\varepsilon^l h_l(x, \varepsilon) + \sum_{l=0}^{L}\varepsilon^l Rh_l(x, \varepsilon) = \sum_{l=0}^{L}\varepsilon^l\left[\varepsilon h_l(x, \varepsilon) + Rh_l(x, \varepsilon)\right]. \qquad (5.54)$$

Using (5.40), (5.51) and (5.54) yields (5.53). Theorem 5.5 is proved. $\Box$

Theorem 5.6  If the function $f(x)$ is sufficiently smooth in $\bar{D}$, then it is possible to choose a solution $u_0(x)$ to (5.37) and solutions $u_l(x)$ to (5.48) so that the following inequality holds:

$$\|H_L(x, \varepsilon) - h(x, \varepsilon)\|_- \le C\varepsilon^{L+1}, \qquad (5.55)$$

where $C = \text{const} > 0$ does not depend on $\varepsilon$, but depends on $f(x)$.

Proof.  From (5.1) and (5.53) one obtains

$$\varepsilon(H_L(x, \varepsilon) - h(x, \varepsilon)) + R(H_L(x, \varepsilon) - h(x, \varepsilon)) = \varepsilon^{L+1}u_L(x). \qquad (5.56)$$

From (5.56) it follows that

$$\varepsilon(H_L - h, H_L - h) + (R(H_L - h), H_L - h) = \varepsilon^{L+1}(u_L, H_L - h). \qquad (5.57)$$

Using (5.8) one obtains

$$c_3\|H_L(x, \varepsilon) - h(x, \varepsilon)\|_-^2 \le \varepsilon^{L+1}\|u_L(x)\|_+\,\|H_L(x, \varepsilon) - h(x, \varepsilon)\|_-. \qquad (5.58)$$


Inequality (5.55) follows from (5.58) if the norm $\|u_L(x)\|_+$ is finite.

Consider $L = 0$. If $f(x) \in H^{3(q-p)/2}(D)$ then it is possible to find a solution $u_0(x) \in H^{(q-p)/2}(D)$ of (5.37). Thus the norm $\|u_0(x)\|_+$ is finite. For $L = 1$ suppose that $f(x) \in H^{5(q-p)/2}(D)$. Then there exist a solution $u_0(x) \in H^{3(q-p)/2}(D)$ to (5.37) and a solution $u_1(x) \in H^{(q-p)/2}(D) = H_+$ to (5.48), so that the norm $\|u_1(x)\|_+$ is finite. If $f(x) \in C^\infty(\bar{D})$ then the approximation $H_L(x, \varepsilon)$ satisfying (5.55) can be constructed for an arbitrarily large $L$. $\Box$

5.4 Examples of asymptotical solutions: case n = 1

Example 5.1 Let

$$\varepsilon h(x, \varepsilon) + \int_{-1}^{1} e^{-a|x-y|}r(y)h(y, \varepsilon)\,dy = f(x), \qquad (5.59)$$

where $r(y) \ge C_2 > 0$ is a given function. In this example the operators $P(y, D_y)$ and $Q(x, D_x)$ act on an arbitrary, sufficiently smooth, function $g(x)$ according to the formulas:

$$P(y, D_y)g(y) = r(y)g(y), \quad \text{i.e., } p = 0,$$

and

$$Q(x, D_x)g(x) = -\frac{1}{2a}\frac{d^2 g(x)}{dx^2} + \frac{a}{2}g(x).$$

One has

$$Q(x, D_x)e^{-a|x-y|} = \delta(x - y), \quad \text{so} \quad G(x, y) = e^{-a|x-y|}.$$

Equation (5.37) yields

$$u_0(x) = \frac{-f''(x) + a^2 f(x)}{2a\,r(x)}, \qquad (5.60)$$

and (5.38) takes the form

$$\frac{\varepsilon}{2a}\left(-w_0''(x, \varepsilon) + a^2 w_0(x, \varepsilon)\right) + r(x)w_0(x, \varepsilon) = 0. \qquad (5.61)$$


If one looks for the main term of the asymptotics of $h(x, \varepsilon)$, then one can solve in place of (5.61) the following equation:

$$-\frac{\varepsilon}{2a}w_{0a}''(x, \varepsilon) + r(x)w_{0a}(x, \varepsilon) = 0, \qquad (5.62)$$

where $w_{0a}(x, \varepsilon)$ is the main term of the asymptotics of $w_0(x, \varepsilon)$.

We seek asymptotics of the bounded, as $\varepsilon \to 0$, solutions to (5.61) and (5.62). To construct the asymptotics, one may use the method developed in [Vishik and Lusternik (1962)]. Namely, near the point $x = -1$ one sets $x = y - 1$, $y \ge 0$, and writes (5.62) as:

$$-\frac{\varepsilon}{2a}v_a''(y, \varepsilon) + r(y - 1)v_a(y, \varepsilon) = 0, \qquad (5.63)$$

where $v_a(y, \varepsilon) := w_{0a}(y - 1, \varepsilon)$.

Put $y = t\sqrt{\varepsilon}$ and denote $\varphi_a(t, \varepsilon) := v_a(t\sqrt{\varepsilon}, \varepsilon)$. Then

$$-\frac{1}{2a}\frac{d^2\varphi_a(t, \varepsilon)}{dt^2} + r(t\sqrt{\varepsilon} - 1)\varphi_a(t, \varepsilon) = 0. \qquad (5.64)$$

Neglecting the term $t\sqrt{\varepsilon}$ in the argument of $r$ is possible if we are looking for the main term of the asymptotics of $\varphi_a$. Thus, consider the equation:

$$-\frac{1}{2a}\frac{d^2\varphi_a(t, \varepsilon)}{dt^2} + r(-1)\varphi_a(t, \varepsilon) = 0. \qquad (5.65)$$

Its solution is

$$\varphi_a(t, \varepsilon) = C_1 e^{-\sqrt{2ar(-1)}\,t} + C_2 e^{\sqrt{2ar(-1)}\,t}.$$

Discarding the part of the solution unbounded as $t \to +\infty$, one gets

$$\varphi_a(t, \varepsilon) = C_1 e^{-\sqrt{2ar(-1)}\,t}.$$

Therefore, the main term of the asymptotics of $w_{0a}(x, \varepsilon)$ near the point $x = -1$ is:

$$w_{0a}(x, \varepsilon) = C_1 e^{-\sqrt{2ar(-1)/\varepsilon}\,(1+x)}, \qquad C_1 = \text{const}. \qquad (5.66)$$

Similarly one gets near the point x = 1

$$w_{0a}(x, \varepsilon) = D_1 e^{-\sqrt{2ar(1)/\varepsilon}\,(1-x)}, \qquad D_1 = \text{const}. \qquad (5.67)$$

From (5.66) and (5.67) one derives the main term of the asymptotics of the bounded, as $\varepsilon \to 0$, solution to equation (5.62):

$$w_{0a}(x, \varepsilon) = C_1 e^{-\sqrt{2ar(-1)/\varepsilon}\,(1+x)} + D_1 e^{-\sqrt{2ar(1)/\varepsilon}\,(1-x)}. \qquad (5.68)$$


Now the problem is to find the constants $C_1$ and $D_1$ from condition (5.39). Since $p = 0$, formula (5.19) yields $K_2 = 0$, and (5.39) becomes:

$$\varepsilon K_1 w_0(x, \varepsilon) = K_1 f(x). \qquad (5.69)$$

From (5.69) and (5.21) one gets

$$\varepsilon\left\{\left[w_{0a}(y, \varepsilon)\frac{\partial G(x, y)}{\partial y}\right]_{-1}^{1} - \left[\frac{dw_{0a}(y, \varepsilon)}{dy}\,G(x, y)\right]_{-1}^{1}\right\} = \left[f(y)\frac{\partial G(x, y)}{\partial y}\right]_{-1}^{1} - \left[f'(y)G(x, y)\right]_{-1}^{1}. \qquad (5.70)$$

Note that $\dfrac{\partial G(x, y)}{\partial y} = a e^{-a|x-y|}\operatorname{sgn}(x - y)$, where $\operatorname{sgn}(t) = t/|t|$, so

$$\frac{\partial G(x, 1)}{\partial y} = -a e^{-a(1-x)}, \qquad \frac{\partial G(x, -1)}{\partial y} = a e^{-a(1+x)}. \qquad (5.71)$$

From (5.71) and (5.70) one obtains

$$\varepsilon\left\{-a w_{0a}(1, \varepsilon)e^{-a(1-x)} - a w_{0a}(-1, \varepsilon)e^{-a(1+x)} - w_{0a}'(1, \varepsilon)e^{-a(1-x)} + w_{0a}'(-1, \varepsilon)e^{-a(1+x)}\right\} = -a f(1)e^{-a(1-x)} - a f(-1)e^{-a(1+x)} - f'(1)e^{-a(1-x)} + f'(-1)e^{-a(1+x)}. \qquad (5.72)$$

This implies:

$$\varepsilon\left\{a w_{0a}(1, \varepsilon) + w_{0a}'(1, \varepsilon)\right\} = a f(1) + f'(1),$$

and

$$\varepsilon\left\{-a w_{0a}(-1, \varepsilon) + w_{0a}'(-1, \varepsilon)\right\} = -a f(-1) + f'(-1). \qquad (5.73)$$

Keeping the main terms in the braces, one gets:

$$\sqrt{\varepsilon\,2ar(1)}\,D_1 = f'(1) + af(1),$$

and

$$\sqrt{\varepsilon\,2ar(-1)}\,C_1 = -f'(-1) + af(-1).$$


Therefore

$$C_1 = \frac{-f'(-1) + af(-1)}{\sqrt{\varepsilon\,2ar(-1)}}, \qquad D_1 = \frac{f'(1) + af(1)}{\sqrt{\varepsilon\,2ar(1)}}. \qquad (5.74)$$

From (5.60), (5.68) and (5.74) one finds the main term of the asymptotics of the solution to (5.59):

$$h(x, \varepsilon) \approx \frac{-f''(x) + a^2 f(x)}{2a\,r(x)} + \frac{-f'(-1) + af(-1)}{\sqrt{\varepsilon\,2ar(-1)}}\,e^{-\sqrt{2ar(-1)/\varepsilon}\,(1+x)} + \frac{f'(1) + af(1)}{\sqrt{\varepsilon\,2ar(1)}}\,e^{-\sqrt{2ar(1)/\varepsilon}\,(1-x)}. \qquad (5.75)$$

If $r(x) = \text{const}$, then (5.75) yields the asymptotic formula obtained in [Ramm and Shifrin (1991)].

Example 5.2  Consider the equation

$$\varepsilon h(x, \varepsilon) + \int_c^d G(x, y)h(y, \varepsilon)\,dy = f(x), \qquad (5.76)$$

where $G(x, y)$ solves the problem

$$-\frac{\partial^2 G(x, y)}{\partial x^2} + a^2(x)G(x, y) = \delta(x - y), \qquad G(\infty, y) = 0, \qquad (5.77)$$

and $a^2(x) \ge \text{const} > 0$, $\forall x \in \mathbb{R}^1$. Here $P(y, D_y) = I$, $p = 0$, $Q(x, D_x) = -\frac{d^2}{dx^2} + a^2(x)$, $q = 2$. One can write $G(x, y)$ as

$$G(x, y) = \begin{cases} \varphi_1(x)\varphi_2(y), & x < y, \\ \varphi_2(x)\varphi_1(y), & y < x, \end{cases} \qquad (5.78)$$

where functions ϕ1(x) and ϕ2(x) are linearly independent solutions to the equation Q(x, D )ϕ(x) = 0, satisfying conditions ϕ ( ) = 0, ϕ (+ ) = x 1 −∞ 2 ∞ 0 and

ϕ0 (x)ϕ (x) ϕ (x)ϕ0 (x) = 1 . (5.79) 1 2 − 1 2 By (5.37) one gets

$$u_0(x) = -f''(x) + a^2(x)f(x). \qquad (5.80)$$
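The representation (5.78)-(5.79) can be verified symbolically in the constant-coefficient case $a(x) \equiv a$, where $\varphi_1 = e^{ax}/\sqrt{2a}$, $\varphi_2 = e^{-ax}/\sqrt{2a}$ and $G(x, y) = e^{-a|x-y|}/(2a)$ (a sketch under this simplifying assumption):

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
a = sp.Integer(2)                        # sample constant coefficient a(x) = 2
phi1 = sp.exp(a * x) / sp.sqrt(2 * a)    # phi1(-oo) = 0
phi2 = sp.exp(-a * x) / sp.sqrt(2 * a)   # phi2(+oo) = 0
# Wronskian normalization (5.79)
assert sp.simplify(sp.diff(phi1, x) * phi2 - phi1 * sp.diff(phi2, x)) == 1
G_left = phi1 * phi2.subs(x, y)          # branch x < y of (5.78)
G_right = phi2 * phi1.subs(x, y)         # branch x > y
# Q G = 0 away from the diagonal:
assert sp.simplify(-sp.diff(G_left, x, 2) + a**2 * G_left) == 0
# The jump of -dG/dx across x = y equals 1, i.e. Q G = delta(x - y):
jump = (sp.diff(G_right, x) - sp.diff(G_left, x)).subs(x, y)
assert sp.simplify(-jump) == 1
```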


By (5.38) one obtains

$$\varepsilon\left(-w_0''(x, \varepsilon) + a^2(x)w_0(x, \varepsilon)\right) + w_0(x, \varepsilon) = 0.$$

The main term $w_{0a}(x, \varepsilon)$ of the asymptotics of $w_0(x, \varepsilon)$ solves the equation:

$$-\varepsilon w_{0a}''(x, \varepsilon) + w_{0a}(x, \varepsilon) = 0.$$

Thus

$$w_{0a}(x, \varepsilon) = Ce^{-(x-c)/\sqrt{\varepsilon}} + De^{-(d-x)/\sqrt{\varepsilon}}. \qquad (5.81)$$

Condition (5.39) takes the form (5.69). Using w0a(x, ε) in place of w0(x, ε) in (5.69), one gets, similarly to (5.70), the relation

d ∂G(x, y) d ε w0a(y, ε) w0a(y, ε) G(x, y)c ( ∂y c − ) d ∂G(x, y) d = f(y) f 0(y) G(x, y)c . ∂y c − Keeping the main terms, one gets

d d ∂G(x, y) d εw0a(y, ε) G(x, y)c = f(y) f 0(y) G(x, y)c . (5.82) − ∂y c − From (5.82) and (5.78) one obtains

ε w0 (d, ε)ϕ (x)ϕ (d) + w0 (c, ε)ϕ (x)ϕ (c) {− 0a 1 2 0a 2 1 }

= f(d)ϕ (x)ϕ0 (d) f(c)ϕ (x)ϕ0 (c) f 0(d)ϕ (x)ϕ (d) 1 2 − 2 1 − 1 2

+f 0(c)ϕ2(x)ϕ1(c) . (5.83)

Because ϕ1(x) and ϕ2(x) are linearly independent, it follows from (5.83)

$$-\varepsilon\, w_{0a}'(d,\varepsilon)\,\varphi_2(d) = f(d)\varphi_2'(d) - f'(d)\varphi_2(d),$$

$$\varepsilon\, w_{0a}'(c,\varepsilon)\,\varphi_1(c) = -f(c)\varphi_1'(c) + f'(c)\varphi_1(c). \tag{5.84}$$

Substitute (5.81) into (5.84) and keep the main terms, to get

$$-\varepsilon\,\frac{D}{\sqrt{\varepsilon}}\,\varphi_2(d) = f(d)\varphi_2'(d) - f'(d)\varphi_2(d),$$

$$-\varepsilon\,\frac{C}{\sqrt{\varepsilon}}\,\varphi_1(c) = -f(c)\varphi_1'(c) + f'(c)\varphi_1(c).$$

This yields the final formulas for the coefficients:

$$C = \frac{-f'(c)\varphi_1(c) + f(c)\varphi_1'(c)}{\sqrt{\varepsilon}\,\varphi_1(c)}, \qquad D = \frac{f'(d)\varphi_2(d) - f(d)\varphi_2'(d)}{\sqrt{\varepsilon}\,\varphi_2(d)}. \tag{5.85}$$

From (5.80), (5.81) and (5.85) one gets the main term of the asymptotics of the solution to (5.76):

$$h(x,\varepsilon) \approx -f''(x) + a^2(x) f(x) + \frac{-f'(c)\varphi_1(c) + f(c)\varphi_1'(c)}{\sqrt{\varepsilon}\,\varphi_1(c)}\, e^{-(x-c)/\sqrt{\varepsilon}} + \frac{f'(d)\varphi_2(d) - f(d)\varphi_2'(d)}{\sqrt{\varepsilon}\,\varphi_2(d)}\, e^{-(d-x)/\sqrt{\varepsilon}}. \tag{5.86}$$
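The asymptotic formula (5.86) can be compared with a direct Nyström solution of (5.76) in the constant-coefficient case a(x) ≡ 1, f ≡ 1, c = −1, d = 1, where G(x,y) = e^{−|x−y|}/2 and both boundary-layer coefficients in (5.86) reduce to 1/√ε. A NumPy sketch (grid size, ε and tolerances are my ad hoc choices, not from the text):

```python
import numpy as np

# Nystrom discretization of eps*h + \int_{-1}^{1} G(x,y) h(y) dy = 1,
# with G(x,y) = exp(-|x-y|)/2, i.e. Q = -d^2/dx^2 + 1, f = 1, a = 1.
eps = 1e-3
N = 1601
x = np.linspace(-1.0, 1.0, N)
w = np.full(N, x[1] - x[0]); w[0] *= 0.5; w[-1] *= 0.5      # trapezoid weights
K = 0.5 * np.exp(-np.abs(x[:, None] - x[None, :])) * w[None, :]
h = np.linalg.solve(eps * np.eye(N) + K, np.ones(N))

# main term (5.86): h ~ -f'' + a^2 f + boundary layers of amplitude 1/sqrt(eps)
r = np.sqrt(eps)
h_asym = 1.0 + np.exp(-(x + 1.0) / r) / r + np.exp(-(1.0 - x) / r) / r

assert abs(h[N // 2] - 1.0) < 0.05          # interior: h -> Q f = 1
assert abs(h[0] / h_asym[0] - 1.0) < 0.1    # layer amplitude ~ 1/sqrt(eps) at x = c
```

The interior value tends to Qf = 1, while the boundary value grows like 1/√ε, illustrating how the L²-solution approaches a distributional solution of Rh = f.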

5.5 Asymptotics in the case n > 1

Consider equation (5.1) with R(x, y) ∈ ℛ. The method for constructing the asymptotics of the solution to (5.1) in the multidimensional case is parallel to the one developed in the case n = 1. The proofs are also parallel to those given for the case n = 1, and are omitted for this reason. Let us state the basic results. Theorem 5.7 Equation (5.1) is equivalent to the problem

εQ(x, Dx)h(x, ε) + P ∗(x, Dx)h(x, ε) = Q(x, Dx)f(x) , (5.87)

$$\varepsilon M_1 h(x,\varepsilon) - M_2 h(x,\varepsilon) = M_1 f(x). \tag{5.88}$$

Proof. One uses Lemmas 5.1, 5.4 and 5.5 and formula (5.27) to prove Theorem 5.7. □

To construct the asymptotics of the solution to equation (5.1), let us look for asymptotics of the form:

$$h(x,\varepsilon) = \sum_{l=0}^{\infty} \varepsilon^l\big(u_l(x) + w_l(x,\varepsilon)\big) = \sum_{l=0}^{\infty} \varepsilon^l h_l(x,\varepsilon), \tag{5.89}$$

where u₀(x) is an arbitrary solution to the equation

$$P^*(x, D_x)\, u_0(x) = Q(x, D_x)\, f(x). \tag{5.90}$$


and if some u0(x) is found, then w0(x, ε) is uniquely determined as the solution to the problem

εQ(x, Dx)w0(x, ε) + P ∗(x, Dx)w0(x, ε) = 0 , (5.91)

$$\varepsilon M_1 w_0(x,\varepsilon) - M_2 w_0(x,\varepsilon) = M_1 f(x) + M_2 u_0(x). \tag{5.92}$$

Theorem 5.8 The function h₀(x,ε) = u₀(x) + w₀(x,ε) solves the equation

εh0(x, ε) + Rh0(x, ε) = f(x) + εu0(x) . (5.93)

Let us construct higher-order terms of the asymptotics. Define u_l(x), l ≥ 1, as an arbitrary solution to the equation

$$P^*(x, D_x)\, u_l(x) = -Q(x, D_x)\, u_{l-1}(x). \tag{5.94}$$

After finding u_l(x), one finds w_l(x,ε) as the unique solution to the problem

$$\varepsilon Q(x, D_x)\, w_l(x,\varepsilon) + P^*(x, D_x)\, w_l(x,\varepsilon) = 0, \tag{5.95}$$

$$\varepsilon M_1 w_l(x,\varepsilon) - M_2 w_l(x,\varepsilon) = -M_1 u_{l-1}(x) + M_2 u_l(x). \tag{5.96}$$

Theorem 5.9 The function h_l(x,ε) = u_l(x) + w_l(x,ε) solves the equation

$$\varepsilon h_l(x,\varepsilon) + R h_l(x,\varepsilon) = -u_{l-1}(x) + \varepsilon u_l(x). \tag{5.97}$$

Define

$$H_L(x,\varepsilon) = \sum_{l=0}^{L} \varepsilon^l h_l(x,\varepsilon). \tag{5.98}$$

From Theorems 5.8 and 5.9 one derives

Theorem 5.10 The function H_L(x,ε) solves the equation

$$\varepsilon H_L(x,\varepsilon) + R H_L(x,\varepsilon) = f(x) + \varepsilon^{L+1} u_L(x). \tag{5.99}$$

Theorem 5.11 If the function f(x) is sufficiently smooth in D, then it is possible to choose a solution u₀(x) to (5.90) and solutions u_l(x) to (5.94) so that the following inequality holds:

$$\| H_L(x,\varepsilon) - h(x,\varepsilon) \|_- \le C\varepsilon^{L+1},$$

where C = const > 0 does not depend on ε, but depends on f(x).


5.6 Examples of asymptotic solutions: the case n > 1

Example 5.3 Consider the equation

$$\varepsilon h(x,\varepsilon) + \int_{S_1} G(x,y)\, s(|y|)\, h(y,\varepsilon)\, dy = 1, \tag{5.100}$$

where x = (x₁, x₂), y = (y₁, y₂), |y| = √(y₁² + y₂²), s(|y|) is a known smooth positive function, s(|y|) ≥ C > 0, G(x,y) = (1/2π) K₀(a|x−y|), K₀(r) is the Macdonald function, (−Δ_x + a²)G(x,y) = δ(x−y), and S₁ is the unit disk centered at the origin. In this example P(y, D_y)g(y) = s(|y|)g(y), p = 0, Q(x, D_x) = −Δ_x + a², q = 2. Let us construct the main term of the asymptotics of the solution to (5.100). By (5.90) one gets

$$s(|x|)\, u_0(x) = (-\Delta_x + a^2)\, 1 = a^2.$$

Thus

$$u_0(x) = \frac{a^2}{s(|x|)}. \tag{5.101}$$

Equation (5.91) yields:

$$\varepsilon(-\Delta_x + a^2)\, w_0(x,\varepsilon) + s(|x|)\, w_0(x,\varepsilon) = 0. \tag{5.102}$$

The main term w₀ₐ(x,ε) of the asymptotics of w₀(x,ε) solves the equation

$$-\varepsilon \Delta_x w_{0a}(x,\varepsilon) + s(|x|)\, w_{0a}(x,\varepsilon) = 0. \tag{5.103}$$

In polar coordinates one gets

$$-\varepsilon\left(\frac{\partial^2 w_{0a}(r,\varphi,\varepsilon)}{\partial r^2} + \frac{1}{r}\frac{\partial w_{0a}(r,\varphi,\varepsilon)}{\partial r} + \frac{1}{r^2}\frac{\partial^2 w_{0a}(r,\varphi,\varepsilon)}{\partial\varphi^2}\right) + s(r)\, w_{0a}(r,\varphi,\varepsilon) = 0. \tag{5.104}$$

By radial symmetry w₀ₐ(r,φ,ε) = w₀ₐ(r,ε), so

$$-\varepsilon\left(\frac{d^2 w_{0a}(r,\varepsilon)}{dr^2} + \frac{1}{r}\frac{dw_{0a}(r,\varepsilon)}{dr}\right) + s(r)\, w_{0a}(r,\varepsilon) = 0. \tag{5.105}$$


We construct the asymptotics of the solution to (5.105) using the method of [Vishik and Lusternik (1962)]. Let r = 1 − ϱ. Then

$$-\varepsilon\left(\frac{d^2 w_{0a}(\varrho,\varepsilon)}{d\varrho^2} - \frac{1}{1-\varrho}\frac{dw_{0a}(\varrho,\varepsilon)}{d\varrho}\right) + s(1-\varrho)\, w_{0a}(\varrho,\varepsilon) = 0.$$

Put ϱ = t√ε and keep the main terms, to get

$$-\frac{d^2 w_{0a}(t)}{dt^2} + s(1)\, w_{0a}(t) = 0, \tag{5.106}$$

so

$$w_{0a}(t) = C e^{-\sqrt{s(1)}\, t} + D e^{\sqrt{s(1)}\, t}.$$

Keeping the solution which decays exponentially as t → +∞, one obtains:

$$w_{0a}(t) = C e^{-\sqrt{s(1)}\, t}.$$

Therefore

$$w_{0a}(r,\varepsilon) = C e^{-\sqrt{s(1)/\varepsilon}\,(1-r)}. \tag{5.107}$$

To find the constant C in (5.107) we use condition (5.92). Since p = 0, one concludes M₂ = 0, and (5.92) takes the form

$$\varepsilon M_1 w_{0a}(x,\varepsilon) = M_1 f(x) = M_1 1. \tag{5.108}$$

From (5.25) and (5.108) one gets:

$$\varepsilon\int_{\partial S_1}\left( w_0(y,\varepsilon)\,\frac{\partial G(x,y)}{\partial N_y} - \frac{\partial w_0(y,\varepsilon)}{\partial N_y}\, G(x,y)\right) dl_y = \int_{\partial S_1}\left( 1\cdot\frac{\partial G(x,y)}{\partial N_y} - \frac{\partial 1}{\partial N_y}\, G(x,y)\right) dl_y,$$

where dl_y is the element of arclength of ∂S₁. If one replaces w₀(y,ε) by w₀ₐ(y,ε) in the above formula, then one gets

$$\varepsilon\int_{\partial S_1}\left( w_{0a}(y,\varepsilon)\,\frac{\partial G(x,y)}{\partial N_y} - \frac{\partial w_{0a}(y,\varepsilon)}{\partial N_y}\, G(x,y)\right) dl_y = \int_{\partial S_1} \frac{\partial G(x,y)}{\partial N_y}\, dl_y. \tag{5.109}$$


The main term in (5.109) can be written as:

$$-\varepsilon\int_{\partial S_1} \frac{\partial w_{0a}(y,\varepsilon)}{\partial N_y}\, G(x,y)\, dl_y = \int_{\partial S_1} \frac{\partial G(x,y)}{\partial N_y}\, dl_y. \tag{5.110}$$

By (5.107), for y ∈ ∂S₁ one gets

$$\frac{\partial w_{0a}(y,\varepsilon)}{\partial N_y} = C\sqrt{\frac{s(1)}{\varepsilon}}. \tag{5.111}$$

From (5.110) and (5.111) one obtains

$$-C\sqrt{\varepsilon s(1)}\int_{\partial S_1} G(x,y)\, dl_y = \int_{\partial S_1} \frac{\partial G(x,y)}{\partial N_y}\, dl_y, \qquad \forall x \in S_1. \tag{5.112}$$

For x = 0 and y ∈ ∂S₁ one gets

$$G(0,y) = \frac{1}{2\pi} K_0(a), \qquad \frac{\partial G(0,y)}{\partial N_y} = \frac{1}{2\pi}\frac{dK_0(ar)}{dr}\bigg|_{r=1} = \frac{a}{2\pi} K_0'(ar)\Big|_{r=1} = -\frac{a}{2\pi} K_1(a).$$

These relations and (5.112) imply:

$$-\sqrt{\varepsilon s(1)}\, C K_0(a) = -a K_1(a).$$

Therefore

$$C = \frac{a K_1(a)}{\sqrt{\varepsilon s(1)}\, K_0(a)}. \tag{5.113}$$

From (5.101), (5.107) and (5.113) one finds the main term of the asymptotics of the solution to (5.100):

$$h(x,\varepsilon) \approx \frac{a^2}{s(|x|)} + \frac{a K_1(a)}{\sqrt{\varepsilon s(1)}\, K_0(a)}\, e^{-\sqrt{s(1)/\varepsilon}\,(1-|x|)}. \tag{5.114}$$

If s(|x|) = 1, then (5.114) agrees with the earlier result obtained in [Ramm and Shifrin (1995)].

Example 5.4 Consider the equation
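The Macdonald functions in (5.113)–(5.114) can be evaluated from the integral representation K_ν(a) = ∫₀^∞ e^{−a cosh t} cosh(νt) dt, so the boundary-layer amplitude in (5.114) is easy to tabulate. A stdlib-only sketch (the quadrature cutoff, step, and the choice a = 1, s(1) = 1, ε = 10⁻³ are mine):

```python
import math

def K(nu, a, T=30.0, n=100000):
    # Macdonald function via K_nu(a) = \int_0^infty e^{-a*cosh(t)} * cosh(nu*t) dt
    h = T / n
    s = 0.5 * (math.exp(-a) + math.exp(-a * math.cosh(T)) * math.cosh(nu * T))
    for i in range(1, n):
        t = i * h
        s += math.exp(-a * math.cosh(t)) * math.cosh(nu * t)
    return s * h

a, eps, s1 = 1.0, 1e-3, 1.0
K0, K1 = K(0, a), K(1, a)
assert abs(K0 - 0.421024) < 1e-4      # tabulated K_0(1)
assert abs(K1 - 0.601907) < 1e-4      # tabulated K_1(1)

# boundary-layer amplitude (5.113), and the value of (5.114) at |x| = 1
C = a * K1 / (math.sqrt(eps * s1) * K0)
h_boundary = a**2 / s1 + C
assert 40.0 < C < 50.0                # ~ 45.2 for these parameters
```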

$$\varepsilon h(x,\varepsilon) + \int_{B_1} G(x,y)\, s(|y|)\, h(y,\varepsilon)\, dy = 1, \tag{5.115}$$

where x = (x₁, x₂, x₃), y = (y₁, y₂, y₃), s(|y|) is a smooth positive function, s(|y|) ≥ C₂ > 0, G(x,y) = e^{−a|x−y|}/(4π|x−y|), P(y, D_y)g(y) = s(|y|)g(y), so p = 0, (−Δ_x + a²)G(x,y) = δ(x−y), so Q(x, D_x) = −Δ_x + a², q = 2, and B₁ is the unit ball centered at the origin. The main term of the asymptotics is constructed by the method of Section 5.5. By (5.90) one gets

$$s(|x|)\, u_0(x) = (-\Delta_x + a^2)\, 1 = a^2.$$

Thus

$$u_0(x) = \frac{a^2}{s(|x|)}. \tag{5.116}$$

By (5.91),

$$\varepsilon(-\Delta_x + a^2)\, w_0(x,\varepsilon) + s(|x|)\, w_0(x,\varepsilon) = 0.$$

Keeping the main term w₀ₐ(x,ε) of the asymptotics of w₀(x,ε), one gets

$$-\varepsilon\Delta_x w_{0a}(x,\varepsilon) + s(|x|)\, w_{0a}(x,\varepsilon) = 0.$$

In spherical coordinates this equation for the spherically symmetric solution becomes:

$$-\varepsilon\left(\frac{d^2 w_{0a}(r,\varepsilon)}{dr^2} + \frac{2}{r}\frac{dw_{0a}(r,\varepsilon)}{dr}\right) + s(r)\, w_{0a}(r,\varepsilon) = 0. \tag{5.117}$$

Let r = 1 − ϱ. Then (5.117) can be written as:

$$-\varepsilon\left(\frac{d^2 w_{0a}(\varrho,\varepsilon)}{d\varrho^2} - \frac{2}{1-\varrho}\frac{dw_{0a}(\varrho,\varepsilon)}{d\varrho}\right) + s(1-\varrho)\, w_{0a}(\varrho,\varepsilon) = 0.$$

Put ϱ = t√ε and keep the main terms in the above equation to get

$$-\frac{d^2 w_{0a}(t)}{dt^2} + s(1)\, w_{0a}(t) = 0. \tag{5.118}$$

The exponentially decaying, as t → +∞, solution to (5.118) is:

$$w_{0a}(t) = C e^{-\sqrt{s(1)}\, t}.$$

Therefore

$$w_{0a}(x,\varepsilon) = C e^{-\sqrt{s(1)/\varepsilon}\,(1-|x|)}. \tag{5.119}$$


The constant C in (5.119) is determined from condition (5.92), which in this example can be written as

$$\varepsilon M_1 w_0(x,\varepsilon) = M_1 f(x) = M_1 1. \tag{5.120}$$

Using formulas (5.25) and (5.120) one gets

$$\varepsilon\int_{\partial B_1}\left( w_0(y,\varepsilon)\,\frac{\partial G(x,y)}{\partial N_y} - \frac{\partial w_0(y,\varepsilon)}{\partial N_y}\, G(x,y)\right) dS_y = \int_{\partial B_1} \frac{\partial G(x,y)}{\partial N_y}\, dS_y.$$

Replacing w₀(y,ε) by w₀ₐ(y,ε) and keeping the main terms, one obtains

$$-\varepsilon\int_{\partial B_1} \frac{\partial w_{0a}(y,\varepsilon)}{\partial N_y}\, G(x,y)\, dS_y = \int_{\partial B_1} \frac{\partial G(x,y)}{\partial N_y}\, dS_y. \tag{5.121}$$

From (5.119), for y ∈ ∂B₁ one derives

$$\frac{\partial w_{0a}(y,\varepsilon)}{\partial N_y} = C\sqrt{\frac{s(1)}{\varepsilon}}. \tag{5.122}$$

From (5.122) and (5.121) it follows that

$$-C\sqrt{\varepsilon s(1)}\int_{\partial B_1} G(x,y)\, dS_y = \int_{\partial B_1} \frac{\partial G(x,y)}{\partial N_y}\, dS_y. \tag{5.123}$$

Put x = 0 in (5.123). Let us compute the corresponding integrals:

$$\int_{\partial B_1} G(0,y)\, dS_y = \int_{\partial B_1} \frac{e^{-a|y|}}{4\pi|y|}\, dS_y = \frac{e^{-a}}{4\pi}\int_{\partial B_1} dS_y = e^{-a}. \tag{5.124}$$

Note that:

$$\frac{\partial G(0,y)}{\partial N_y} = \frac{1}{4\pi}\frac{\partial}{\partial r}\left(\frac{e^{-ar}}{r}\right)\bigg|_{r=1} = -\frac{1}{4\pi}\left(a e^{-a} + e^{-a}\right).$$

Thus

$$\int_{\partial B_1} \frac{\partial G(0,y)}{\partial N_y}\, dS_y = -\frac{1}{4\pi}\, e^{-a}(a+1)\int_{\partial B_1} dS_y = -e^{-a}(a+1). \tag{5.125}$$

From (5.123), (5.124) and (5.125) one gets, setting x = 0, the relation

$$-C\sqrt{\varepsilon s(1)}\, e^{-a} = -e^{-a}(a+1).$$


This yields

$$C = \frac{a+1}{\sqrt{\varepsilon s(1)}}. \tag{5.126}$$

From (5.116), (5.119) and (5.126) the main term of the asymptotics of the solution to equation (5.115) follows:

$$h(x,\varepsilon) \approx \frac{a^2}{s(|x|)} + \frac{a+1}{\sqrt{\varepsilon s(1)}}\, e^{-\sqrt{s(1)/\varepsilon}\,(1-|x|)}. \tag{5.127}$$

If s(|x|) = 1, formula (5.127) yields a result obtained in [Ramm and Shifrin (1995)].

Let us summarize our results briefly. In this chapter we constructed the asymptotics of the solution to (5.1) as ε → +0, and demonstrated how the L²-solution to (5.1) tends to a distributional solution of the limiting equation Rh(x) = f(x).
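Formula (5.127) can also be checked numerically for s ≡ 1. For radial h the kernel of (5.115) averages over angles to sinh(a r_<) e^{−a r_>}/(a r_< r_>) — the standard l = 0 separation of e^{−a|x−y|}/(4π|x−y|) — which reduces (5.115) to a one-dimensional equation. A NumPy sketch under these assumptions (grid sizes, a = 1, ε and tolerances are ad hoc):

```python
import numpy as np

# eps*h(r) + \int_0^1 k(r,rho)*h(rho)*rho^2 drho = 1 for radial h, where
# k(r,rho) = sinh(a*min(r,rho))*exp(-a*max(r,rho))/(a*r*rho) is the angular
# integral of exp(-a|x-y|)/(4*pi*|x-y|) over the unit sphere of directions.
a, eps, N = 1.0, 1e-3, 1500
r = (np.arange(N) + 0.5) / N                 # midpoint grid on (0,1)
w = np.full(N, 1.0 / N)
lo = np.minimum(r[:, None], r[None, :])
hi = np.maximum(r[:, None], r[None, :])
K = np.sinh(a * lo) * np.exp(-a * hi) / (a * r[:, None] * r[None, :])
K *= (r**2 * w)[None, :]
h = np.linalg.solve(eps * np.eye(N) + K, np.ones(N))

# main term (5.127) with s = 1
se = np.sqrt(eps)
h_asym = a**2 + (a + 1.0) / se * np.exp(-(1.0 - r) / se)
assert abs(h[N // 2] - a**2) < 0.05               # interior: h -> a^2
assert abs(h[-1] / h_asym[-1] - 1.0) < 0.15       # layer amplitude (a+1)/sqrt(eps)
```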

Chapter 6 Estimation and Scattering Theory

In recent years a number of papers have appeared in which the three-dimensional (3D) inverse scattering problem is associated with the random fields estimation problem. In this chapter we give a brief presentation of direct and inverse scattering theory in the three-dimensional case and outline the connection between this theory and estimation theory. This connection, however, is less natural and significant than in the one-dimensional case, due to the lack of causality in the spatial variables. In Section 6.1 the direct scattering problem is studied, in Section 6.2 the inverse scattering problem is studied, and in Section 6.3 the connection between estimation theory and inverse scattering is discussed.

6.1 The direct scattering problem

6.1.1 The direct scattering problem

Consider the problem

$$\ell_q u - k^2 u := [-\nabla^2 + q(x) - k^2]\, u = 0 \quad \text{in } \mathbb{R}^3, \quad k > 0, \tag{6.1}$$

$$u = \exp(ik\theta\cdot x) + A(\theta',\theta,k)\, r^{-1}\exp(ikr) + o(r^{-1}), \quad r = |x| \to \infty, \quad \theta' = \frac{x}{r}, \tag{6.2}$$

where θ, θ′ ∈ S², S² is the unit sphere in ℝ³, and o(r⁻¹) in (6.2) is uniform in θ, θ′ ∈ S². The function u is called the scattering solution, the function A(θ′, θ, k) is called the scattering amplitude, and the function q(x) is the potential.



Let us assume that

$$q \in Q := \left\{ q : \bar q = q,\ |q| \le c(1+|x|)^{-a},\ a > 3 \right\}. \tag{6.3}$$

The bar in this chapter stands for complex conjugation (and not for mean value). By c we denote various positive constants. By Q_m we denote the following class of q:

$$Q_m := \left\{ q : q^{(j)} \in Q,\ 0 \le |j| \le m \right\}, \tag{6.4}$$

so that Q₀ = Q. The scattering theory is developed for q which may have local singularities and are described by some integral norms, but this is not important for our presentation here. Our purpose is to give a brief outline of the theory for the problem (6.1)-(6.3) with minimum technicalities. The following questions are discussed:

1) selfadjointness of ℓ_q,
2) the nature of the spectrum of ℓ_q,
3) existence and uniqueness of the solution to (6.1)-(6.3),
4) eigenfunction expansion in scattering solutions, and
5) properties of the scattering amplitude.

The operator ℓ_q defined by the differential expression (6.1) on C₀^∞(ℝ³) is symmetric and bounded from below. Let us denote by ℓ_q its closure in H = L²(ℝ³).

Lemma 6.1 The operator ℓ_q is selfadjoint.

Proof. This lemma is a particular case of Lemma 8.5 in Section 8.2.4.

Lemma 6.2 1) The negative spectrum of ℓ_q is discrete and finite. 2) The positive spectrum is absolutely continuous. 3) The point λ = 0 belongs to the continuous spectrum but may not belong to the absolutely continuous spectrum.

Proof. Let us recall Glazman’s lemma:

Lemma 6.3 The negative spectrum of a selfadjoint operator A is discrete and finite if and only if

$$\sup \dim \mathcal{M} < \infty, \tag{6.5}$$

Estimation and Scattering Theory 113

where the supremum is taken over the set of subspaces ℳ such that (Au, u) ≤ 0 for u ∈ ℳ. A proof is, e.g., in [Ramm (1986), p. 330] or [Glazman (1965), §3]. Therefore the first statement of Lemma 6.2 is proved if one proves that

$$N_- := \sup\dim\mathcal{M} < \infty, \qquad \mathcal{M} := \left\{ u : \int_{\mathbb{R}^3} |\nabla u|^2\,dx < \int_{\mathbb{R}^3} q_-(x)\,|u|^2\,dx \right\}, \tag{6.6}$$

where N₋ is the number of negative eigenvalues of ℓ_q counting their multiplicities, and q₋ = max{0, −q(x)}. One has

$$q = q_+(x) - q_-(x), \qquad q_+ = \max\{q, 0\}. \tag{6.7}$$

Let us write

$$\mathcal{M} = \left\{ u : 1 < \int q_-|u|^2\,dx \left( \int |\nabla u|^2\,dx \right)^{-1} \right\}. \tag{6.8}$$

The ratio of the quadratic forms in (6.8) has a discrete spectrum, and the corresponding eigenfunctions solve the problem

$$q_- u = -\lambda \nabla^2 u \quad \text{in } \mathbb{R}^3,$$

which can be written as

$$g_0 q_- u = \lambda u, \qquad g_0 f := \int (4\pi|x-y|)^{-1} f(y)\,dy. \tag{6.9}$$

Let q_-^{1/2} := p. Then (6.9) can be written as

$$A\varphi := p\, g_0\, p\,\varphi = \lambda\varphi, \qquad \varphi := p u. \tag{6.10}$$

The operator A, defined in (6.10), is compact, selfadjoint, and nonnegative-definite in L²(ℝ³) if q ∈ Q. Therefore the number of eigenvalues λ_n of problem (6.10) which satisfy the inequality λ_n > 1 is finite. This number is the dimension of ℳ defined in (6.8). Thus N₋ < ∞, where N₋ is defined in (6.6). Statement 1) of Lemma 6.2 is proved. Only a little extra work is needed to give an estimate of N₋ from above. Namely,

$$A^2\varphi_j = \lambda_j^2\varphi_j, \qquad \mathrm{Tr}\,A^2 = \sum_{j=1}^{\infty}\lambda_j^2 \ \ge\ \sum_{\lambda_j>1}\lambda_j^2 \ \ge\ \sum_{\lambda_j>1} 1 = N_-. \tag{6.11}$$


Thus

$$N_- \le \mathrm{Tr}\,(p\, g_0\, q_-\, g_0\, p) = \iint \frac{q_-(x)\, q_-(y)\, dx\, dy}{(4\pi)^2\, |x-y|^2}. \tag{6.12}$$

The right-hand side of (6.12) is finite if q ∈ Q. Note that if q ∈ Q, then q₋ ∈ Q and q₊ ∈ Q.
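For the square well q = −c in the unit ball B (and 0 outside), the bound (6.12) evaluates in closed form: reducing ∬_{B×B}|x−y|⁻²dx dy to radial variables gives 8π²∬₀¹ rρ ln((r+ρ)/|r−ρ|) dr dρ = 4π², so N₋ ≤ c²·4π²/(16π²) = c²/4. A stdlib sketch verifying the key inner integral ∫₀¹ t ln((1+t)/(1−t)) dt = 1 and evaluating the bound (the example potential and grid are my choices):

```python
import math

# \int_0^1 t*ln((1+t)/(1-t)) dt = 1; midpoint rule (the log singularity
# at t = 1 is integrable and the midpoint grid avoids it)
n = 200000
s = sum((i + 0.5) / n * math.log((1 + (i + 0.5) / n) / (1 - (i + 0.5) / n))
        for i in range(n)) / n
assert abs(s - 1.0) < 1e-3

# hence \iint_{B x B} dx dy/|x-y|^2 = 4*pi^2 and (6.12) gives N_- <= c^2/4
c = 3.0   # well depth; c = 3 is just above the first binding threshold pi^2/4
bound = c**2 / 4.0
assert abs(bound - 2.25) < 1e-12
```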

6.1.2 Properties of the scattering solution

Lemma 6.4 The scattering solution exists and is unique.

Proof. The scattering solution solves the integral equation

$$u = u_0 - gqu, \qquad gf := \int \frac{\exp(ik|x-y|)}{4\pi|x-y|}\, f(y)\, dy, \tag{6.13}$$

where

$$u_0 := \exp(ik\theta\cdot x). \tag{6.14}$$

Conversely, the solution to (6.13) is the scattering solution with

$$A(\theta',\theta,k) = -\frac{1}{4\pi}\int \exp(-ik\theta'\cdot y)\, q(y)\, u(y,\theta,k)\, dy. \tag{6.15}$$

It is not difficult to check that if q ∈ Q then the operator

$$T(k)u := gqu \tag{6.16}$$

is compact in C(ℝ³). Therefore, the existence of the solution to (6.13) follows from the uniqueness of the solution to the homogeneous equation

$$u = -Tu, \quad u \in C(\mathbb{R}^3), \tag{6.17}$$

by Fredholm’s alternative. If u solves (6.17) then u solves equation (6.1) and satisfies the radiation condition

$$\lim_{r\to\infty}\int_{|s|=r}\left|\frac{\partial u}{\partial r} - iku\right|^2 ds = 0. \tag{6.18}$$

Since q̄ = q, the function ū solves equation (6.1), and Green’s formula yields

$$\lim_{r\to\infty}\int_{|s|=r}\left(\bar u\,\frac{\partial u}{\partial r} - u\,\frac{\partial\bar u}{\partial r}\right) ds = 0. \tag{6.19}$$


From (6.18) and (6.19) it follows that

$$\lim_{r\to\infty}\int_{|s|=r}\left[\left|\frac{\partial u}{\partial r}\right|^2 + k^2|u|^2\right] ds = 0. \tag{6.20}$$

Any solution to (6.1) which satisfies condition (6.20) has to vanish iden- tically according to a theorem of Kato [Kato (1959)]. Thus u = 0 and Lemma 6.4 is proved. 

Lemma 6.5 Let f ∈ L²(ℝ³) be arbitrary. Define

$$\tilde f(\xi) := (2\pi)^{-3/2}\int f(x)\,\overline{u(x,\xi)}\, dx, \qquad f_j := (f, u_j), \quad 1 \le j \le N_-, \tag{6.21}$$

where the u_j are the orthonormalized eigenfunctions corresponding to the discrete spectrum of ℓ_q:

$$\ell_q u_j = \lambda_j u_j, \quad 1 \le j \le N_-, \quad \lambda_j < 0, \tag{6.22}$$

$$(u_j, u_m) := \int u_j(x)\,\overline{u_m(x)}\, dx = \delta_{jm}, \tag{6.23}$$

$$\xi \in \mathbb{R}^3, \quad |\xi| = k, \quad \xi = k\theta, \quad \theta \in S^2. \tag{6.24}$$

Then

$$f(x) = (2\pi)^{-3/2}\int \tilde f(\xi)\, u(x,\xi)\, d\xi + \sum_{j=1}^{N_-} f_j u_j(x). \tag{6.25}$$

Formulas (6.21), (6.25) are analogous to the usual Fourier inversion formulas. They reduce to the latter if q(x) = 0. The proof of Lemma 6.5 requires some preparations. We follow the scheme used in [Ramm (1963); Ramm (1963b); Ramm (1965); Ramm (1968b); Ramm (1969d); Ramm (1970); Ramm (1971b); Ramm (1987); Ramm (1988b)] and in [Ramm (1986), p. 47]. Let G(x, y, k) be the resolvent kernel of ℓ_q:

$$(\ell_q - k^2)\, G(x,y,k) = \delta(x-y) \quad \text{in } \mathbb{R}^3. \tag{6.26}$$

This kernel solves the equation

$$G(x,y,k) = g(x,y,k) - \int g(x,z,k)\, q(z)\, G(z,y,k)\, dz. \tag{6.27}$$


This equation is similar to (6.13) and can be written as

$$(I + T)\,G = g, \tag{6.28}$$

where T is defined in (6.16). Therefore, as in the proof of Lemma 6.4, the solution to equation (6.27) exists and is unique in the space C_y(ℝ³) of functions of the form G = c|x−y|⁻¹ + v(x,y), where v(x,y) is continuous, c = const, and ‖G‖ := |c| + max_{x∈ℝ³}|v(x,y)|. The operator T is compact in C_y(ℝ³) if q ∈ Q. The homogeneous equation (6.28) has only the trivial solution. The operator T = T(k) depends continuously on k ∈ C₊ := {k : Im k ≥ 0}. Therefore [I + T(k)]⁻¹ is a continuous function of k in the region C₊ ∩ Δ(k₀), where Δ(k₀) := {k : |k − k₀| < δ}, δ > 0, is a neighborhood of a point k₀ at which the operator I + T(k₀) is invertible. Since for any k > 0 the operator I + T(k) is invertible, it follows that G(x,y,k) is continuous in k in a neighborhood of the positive semiaxis in C₊. The continuity holds for any fixed x, y, x ≠ y, and also in the norm of C_y. This implies that the continuous spectrum of ℓ_q in the interval (0, ∞) is absolutely continuous. From equation (6.27) it follows that

$$G(x,y,k) = g(r)\, u(y,\theta,k)\,[1 + o(1)], \quad r = |x| \to \infty, \quad \frac{x}{|x|} = \theta, \tag{6.29}$$

where g(r) := (4πr)⁻¹ exp(ikr) and u(y,θ,k) is the scattering solution. In fact, o(1) = O(|x|⁻¹) uniformly in y ∈ D, where D ⊂ ℝ³ is an arbitrary fixed bounded domain. Indeed, it follows from (6.27) that

$$u(y,\theta,k) = \exp(-ik\theta\cdot y) - \int \exp(-ik\theta\cdot z)\, q(z)\, G(z,y,k)\, dz. \tag{6.30}$$

The function (6.30) solves equation (6.1):

$$(\ell_q - k^2)\,u = q(y)\exp(-ik\theta\cdot y) - \int \exp(-ik\theta\cdot z)\, q(z)\,\delta(z-y)\, dz = 0,$$

and satisfies condition (6.2) since the integral term in (6.30) satisfies the radiation condition. Therefore, the scattering solution can be defined by formula (6.29). This definition was introduced and used systematically in [R 28)].


The starting formula in the proof of the eigenfunction expansion theorem is the Cauchy formula

$$0 = \frac{1}{2\pi i}\oint_{C_N} R_\lambda f\, d\lambda, \tag{6.31}$$

where

$$R_\lambda := (A - \lambda I)^{-1}, \qquad A = A^* = \ell_q. \tag{6.32}$$

C_N is a contour which consists of the circle γ_N := {λ : |λ| = N}, of a finite number N₋ of circles γ_j := {λ : |λ − λ_j| = δ}, where λ_j < 0, 1 ≤ j ≤ N₋, are the negative eigenvalues of ℓ_q and δ > 0 is a small number such that γ_j does not intersect γ_m for j ≠ m, and of a loop L_N which joins the points N − i0 and N + i0 and goes from N − i0 to 0 and from 0 to N + i0. The circles γ_j, 1 ≤ j ≤ N₋, are run clockwise and γ_N is run counterclockwise. The integral

$$\frac{1}{2\pi i}\oint_{\gamma_j} R_\lambda f\, d\lambda = P_j f, \tag{6.33}$$

where P_j is the orthoprojection in H = L²(ℝ³) onto the eigenspace of A corresponding to the eigenvalue λ_j. Note that there is no minus sign in front of the integral in (6.33) because γ_j is run clockwise and not counterclockwise. One has:

$$\frac{1}{2\pi i}\int_{L_N} R_\lambda f\, d\lambda = \frac{1}{\pi}\int_0^N \mathrm{Im}\, R_{\lambda+i0} f\, d\lambda, \tag{6.34}$$

where we have used the relation

$$R_{\lambda-i0} f = \overline{R_{\lambda+i0} f}. \tag{6.35}$$

Formula (6.35) follows from the selfadjointness of A:

$$R_{\lambda-i0} = R^*_{\lambda+i0}, \tag{6.36}$$

and from the symmetry of the kernel of the operator R_λ(x,y). Finally, for any selfadjoint A one has

$$\lim_{N\to\infty} -\frac{1}{2\pi i}\oint_{\gamma_N} R_\lambda f\, d\lambda = f. \tag{6.37}$$


Indeed, if A is selfadjoint then

$$R_\lambda = \int_{-\infty}^{\infty} (t-\lambda)^{-1}\, dE_t, \tag{6.38}$$

where E_t is the resolution of the identity for A. Substitute (6.38) into (6.37) to get

$$\lim_{N\to\infty} -\frac{1}{2\pi i}\oint_{\gamma_N}\int_{-\infty}^{\infty}\frac{dE_t f}{t-\lambda}\, d\lambda = \lim_{N\to\infty} -\frac{1}{2\pi i}\int_{-\infty}^{\infty} dE_t f \oint_{\gamma_N}\frac{d\lambda}{t-\lambda} = \lim_{N\to\infty}\int_{-N}^{N} dE_t f = f. \tag{6.39}$$

Here we have used the formula

$$\frac{1}{2\pi i}\oint_{\gamma_N}\frac{d\lambda}{t-\lambda} = \begin{cases} -1, & -N < t < N, \\ 0, & t > N \ \text{or}\ t < -N. \end{cases} \tag{6.40}$$

Using (6.31), (6.33), (6.34) and (6.37) one obtains

$$f = \sum_{j=1}^{N_-} f_j u_j(x) + \frac{1}{\pi}\int_0^\infty \mathrm{Im}\, R_{\lambda+i0} f\, d\lambda, \tag{6.41}$$

where

$$f_j = (f, u_j), \quad 1 \le j \le N_-, \tag{6.42}$$

and the sum in (6.41) is the term

$$\sum_j P_j f. \tag{6.43}$$

Let λ = k² in (6.41). Then

$$\mathcal{J} := \frac{1}{\pi}\int_0^\infty \mathrm{Im}\, R_{\lambda+i0} f\, d\lambda = \frac{2}{\pi}\int_0^\infty\left(\int \mathrm{Im}\, G(x,y,k)\, f(y)\, dy\right) k\, dk. \tag{6.44}$$

We wish to show that the term (6.44) is equal to the integral in (6.25). This can be done by expressing Im G(x,y,k) via the scattering solutions. Green’s formula yields:

$$G(x,y,k) - \overline{G(x,y,k)} = \int_{|s|=r}\left( \overline{G(x,s)}\,\frac{\partial G(s,y)}{\partial|s|} - G(s,y)\,\frac{\partial\overline{G(x,s)}}{\partial|s|}\right) ds. \tag{6.45}$$


Take r → ∞ and use (6.29) to get

$$2i\,\mathrm{Im}\,G(x,y,k) = \lim_{r\to\infty}\int_{|s|=r}\Big( \overline{g(r)u(x,\theta,k)}\,ik\,g(r)u(y,\theta,k) + g(r)u(y,\theta,k)\,ik\,\overline{g(r)u(x,\theta,k)}\Big)\, ds = \frac{2ik}{(4\pi)^2}\int_{S^2} u(x,\theta,k)\,\overline{u(y,\theta,k)}\, d\theta. \tag{6.46}$$

Thus

$$\mathrm{Im}\,G(x,y,k) = \frac{k}{16\pi^2}\int_{S^2} u(x,\theta,k)\,\overline{u(y,\theta,k)}\, d\theta. \tag{6.47}$$

Substitute (6.47) into (6.44) to get

$$\mathcal{J} = \frac{1}{(2\pi)^{3/2}}\int_0^\infty\int_{S^2}\tilde f(\xi)\, u(x,\theta,k)\, k^2\, dk\, d\theta = \frac{1}{(2\pi)^{3/2}}\int \tilde f(\xi)\, u(x,\xi)\, d\xi. \tag{6.48}$$

Here ξ = kθ, dξ = k² dk dθ, and f̃(ξ) is given by (6.21). From (6.41), (6.44), and (6.48) formula (6.25) follows. Lemma 6.5 is proved.

Remark 6.1 Let us give a discussion of the passage from (6.44) and (6.47) to (6.48). First note that our argument yields Parseval’s equality:

$$(f,h) = (\tilde f(\xi), \tilde h(\xi)) + \sum_j f_j\,\bar h_j, \tag{6.49}$$

and the formula for the kernel of the operator dE_λ/dλ, where the derivative is understood in the weak sense:

$$\frac{dE_\lambda(x,y)}{d\lambda} = \frac{1}{\pi}\,\mathrm{Im}\,G(x,y,\sqrt{\lambda}) = \frac{\sqrt{\lambda}}{16\pi^3}\int_{S^2} u(x,\theta,\sqrt{\lambda})\,\overline{u(y,\theta,\sqrt{\lambda})}\, d\theta, \qquad \lambda > 0. \tag{6.50}$$

To check (6.49) one writes

$$(f,h) = \sum_j f_j\bar h_j + \left(\int_0^\infty dE_\lambda f,\ \int_0^\infty dE_\mu h\right) = \sum_j f_j\bar h_j + \int_0^\infty d(E_\lambda f, h),$$

where we have used the orthogonality of the spectral family: E(Δ)E(Δ′) = E(Δ ∩ Δ′). Furthermore, using (6.50) one obtains

$$\int_0^\infty d(E_\lambda f, h) = \int \tilde f(\xi)\,\overline{\tilde h(\xi)}\, d\xi.$$


The last two formulas yield (6.49). The passage from (6.44) and (6.47) to (6.48) is clear if f ∈ L²(ℝ³) ∩ L¹(ℝ³): in this case the integral f̃(ξ) converges absolutely. If f ∈ L²(ℝ³), then one can establish formula (6.25) by a limiting argument. Namely, let ℱ be the operator of the generalized Fourier transform, ℱf := [ℱ_d f, ℱ_c f], where the brackets indicate that the transform consists of the set of coefficients {f_j}, corresponding to the discrete spectrum of ℓ_q, and of the function f̃(ξ), corresponding to the continuous spectrum of ℓ_q. The operator ℱ is isometric by (6.49), and since it is defined originally on the set L²(ℝ³) ∩ L¹(ℝ³), dense in L²(ℝ³), it can be uniquely extended by continuity to all of L²(ℝ³). Formula (6.21) is therefore well defined for f ∈ L²(ℝ³). If formula (6.25) is proved for f ∈ L²(ℝ³) ∩ L¹(ℝ³), it remains valid for any f ∈ L²(ℝ³), because the inverse of ℱ is also an isometry, from Ran ℱ onto L²(ℝ³). Let us note finally that Ran ℱ_c = L²(ℝ³), where ℱ_c f := f̃(ξ), and ℱ_c is an isometry from L²(ℝ³) onto L²(ℝ³), with ℱ_c^* ℱ_c f = f − E₀f and ℱ_c ℱ_c^* f̃ = f̃. Here E₀f is the projection of f onto the linear span of the eigenfunctions of ℓ_q, E₀f = Σ_j P_j f. This follows from the formula N(ℱ_c^*) = {0}, the proof of which is the same as in [Ramm (1986), p. 51].
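The contour identity (6.40), which drives the eigenfunction expansion above, is easy to verify numerically: parametrize γ_N counterclockwise and integrate (the radius, test points and grid size are ad hoc choices):

```python
import cmath, math

def contour_integral(t, N=10.0, n=4000):
    # (1/(2*pi*i)) * \oint_{|lambda| = N} d(lambda) / (t - lambda), counterclockwise
    s = 0.0 + 0.0j
    for j in range(n):
        phi = 2 * math.pi * (j + 0.5) / n
        lam = N * cmath.exp(1j * phi)           # point on the circle gamma_N
        dlam = 1j * lam * (2 * math.pi / n)     # d(lambda)
        s += dlam / (t - lam)
    return s / (2j * math.pi)

assert abs(contour_integral(3.0) - (-1.0)) < 1e-6    # t inside gamma_N: -1
assert abs(contour_integral(15.0)) < 1e-6            # t outside gamma_N: 0
```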

6.1.3 Properties of the scattering amplitude

Let us now formulate some properties of the scattering amplitude A(θ′, θ, k).

Lemma 6.6 If q ∈ Q then the scattering amplitude has the properties

$$\overline{A(\theta',\theta,k)} = A(-\theta',-\theta,k), \quad k > 0 \quad (\text{reality}), \tag{6.51}$$

$$A(\theta',\theta,k) = A(-\theta,-\theta',k) \quad (\text{reciprocity}), \tag{6.52}$$

$$\frac{A(\theta',\theta,k) - \overline{A(\theta,\theta',k)}}{2i} = \frac{k}{4\pi}\int_{S^2} A(\theta',\alpha,k)\,\overline{A(\theta,\alpha,k)}\, d\alpha \quad (\text{unitarity}). \tag{6.53}$$

In particular, if θ′ = θ in (6.53), then one obtains the identity

$$\mathrm{Im}\,A(\theta,\theta,k) = \frac{k}{4\pi}\int_{S^2} |A(\theta,\alpha,k)|^2\, d\alpha \quad (\text{optical theorem}). \tag{6.54}$$

Proof. 1) Equation (6.51) follows from the real-valuedness of q(x). Indeed, the functions ū(x,θ,k) and u(x,−θ,k), k > 0, solve the same integral equation


(6.13). Since this integral equation has at most one solution, it follows that

$$\overline{u(x,\theta,k)} = u(x,-\theta,k), \quad k > 0. \tag{6.55}$$

Equation (6.51) follows from (6.55) immediately. 2) The proof of (6.52)-(6.54) is somewhat longer, and since it can be found in [Ramm (1975), p. 54-56], we refer the reader to this book. □

Let us define the S-matrix

$$S = I - \frac{k}{2\pi i}\, A, \tag{6.56}$$

where S : L²(S²) → L²(S²) is considered to be an operator on L²(S²) with the kernel

$$S(\theta',\theta,k) = \delta(\theta-\theta') - \frac{k}{2\pi i}\, A(\theta',\theta,k). \tag{6.57}$$

The unitarity of S,

$$S^* S = I, \tag{6.58}$$

implies

$$\frac{A - A^*}{2i} = \frac{k}{4\pi}\, A^* A, \tag{6.59}$$

which is (6.53) in operator notation.
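The algebra leading from (6.58) to (6.59) can be checked in a finite-dimensional model: take any unitary matrix S (a stand-in for the S-matrix on a discretized sphere), recover A from (6.56), and verify (A − A*)/2i = (k/4π)A*A. A NumPy sketch (the random Hermitian generator, size, and k are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 40, 2.0
H = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
H = (H + H.conj().T) / 2                         # Hermitian generator
w, V = np.linalg.eigh(H)
S = V @ np.diag(np.exp(1j * w)) @ V.conj().T     # S = exp(iH) is unitary

A = (2 * np.pi * 1j / k) * (np.eye(n) - S)       # invert (6.56)
lhs = (A - A.conj().T) / 2j
rhs = (k / (4 * np.pi)) * (A.conj().T @ A)
assert np.allclose(lhs, rhs, atol=1e-10)         # relation (6.59)
```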

6.1.4 Analyticity in k of the scattering solution

Define

$$\varphi := \exp(-ik\theta\cdot x)\, u(x,\theta,k). \tag{6.60}$$

Then φ solves the equation

$$[I + T_\theta(k)]\,\varphi = 1, \tag{6.61}$$

where

$$T_\theta(k)\varphi := \int \frac{\exp[\,ik|x-y| - ik\theta\cdot(x-y)\,]}{4\pi|x-y|}\, q(y)\,\varphi(y)\, dy. \tag{6.62}$$

The operator T_θ(k) : C(ℝ³) → C(ℝ³) is compact and continuous (in the norm of operators) in the parameter k ∈ C₊ := {k : Im k ≥ 0}. If q ∈ Q, the operator T_θ(k) is analytic in k in C₊ since |x−y| − θ·(x−y) ≥ 0. The operator I + T_θ(k) is invertible for some k ∈ C₊, for example, for k ∈ C₊


sufficiently close to the positive real semiaxis, or for k = a + ib, where a and b are real numbers and b > 0 is sufficiently large. Indeed, under the last assumption the norm of the operator T_θ(k) is less than one. Therefore, by a well-known result, the analytic Fredholm theorem, one concludes that [I + T_θ(k)]⁻¹ is a meromorphic in C₊ operator function on C(ℝ³) (see [Ramm (1975), p. 57]). The poles of this function occur at the values k_j at which the operator I + T_θ(k_j) is not invertible. These values are

$$k_j = i\sqrt{|\lambda_j|}, \tag{6.63}$$

where λ_j are the eigenvalues of the operator ℓ_q, and, possibly, the value k = 0. Indeed, if

$$[I + T_\theta(k_j)]\, v = 0, \quad v \in C(\mathbb{R}^3), \quad k_j \in C_+, \tag{6.64}$$

then the function

$$w := \exp(ik\theta\cdot x)\, v, \quad v \in C(\mathbb{R}^3), \tag{6.65}$$

solves the equation

$$w = -Tw, \tag{6.66}$$

where T is defined in (6.16). It follows from (6.65) and (6.66) that w = O(|x|⁻¹). This and equation (6.66) imply that

$$(\ell_q - k^2)\, w = 0 \tag{6.67}$$

and

$$w \in L^2(\mathbb{R}^3). \tag{6.68}$$

Equation (6.67) follows from (6.66) immediately. Equation (6.68) can be easily checked if k ∈ C₊, that is, if k = a + ib, b > 0, a real. Indeed, use (6.66), the assumption q ∈ Q, which implies that q ∈ L²(ℝ³) ∩ L¹(ℝ³), and the boundedness of w, to get:

$$\|w\|^2_{L^2(\mathbb{R}^3)} \le \int dx\left(\int \frac{\exp(-b|x-y|)}{4\pi|x-y|}\, |q|\,|w|\, dy\right)^2 \le c\int dx\int \frac{\exp(-2b|x-y|)}{|x-y|^2}\, |q(y)|\, dy \le c_1. \tag{6.69}$$

Since the operator ℓ_q = −Δ + q(x) is selfadjoint, equations (6.67) and (6.68) imply w = 0 provided that k² is not real. Since k ∈ C₊, the number k² is real if and only if k = i|k|, k² = −|k|² < 0. Equations (6.67) and (6.68)


with k² = −|k|² imply that λ = λ_j is an eigenvalue of ℓ_q. Therefore, the only points at which the operator [I + T_θ(k)]⁻¹ has poles in C₊ are the points (6.63) and, possibly, the point k = 0. One can prove [R, 28h,i)] that if q ∈ Q and ℓ_q ≥ 0, the number λ = 0 is not an eigenvalue of ℓ_q. However, even if ℓ_q ≥ 0, the point λ = 0 may be a resonance (half-bound state) for ℓ_q. This means that the equation

$$\Delta u = qu, \quad u \in C(\mathbb{R}^3) \ \text{and}\ u \notin L^2(\mathbb{R}^3), \tag{6.70}$$

may have a nontrivial solution which is not in L²(ℝ³). In this case the operator I + T_θ(0) is not invertible. Even if q(x) ∈ C₀^∞ the operator ℓ_q may have a resonance at λ = 0. Even if ℓ_q ≥ 0 and q is compactly supported and locally integrable, the operator ℓ_q may have a resonance at λ = 0.

Example 6.1 Let B = {x : |x| ≤ 1, x ∈ ℝ³}. Let u = |x|⁻¹ for |x| ≥ 1. Extend u inside B as a C^∞ real-valued function such that u(x) ≥ δ > 0 in B. This is possible since u = 1 on ∂B. Define

$$q(x) := \frac{\Delta u}{u}. \tag{6.71}$$

Then q ∈ C₀^∞, q = 0 for |x| ≥ 1, q is real-valued, u ∉ L²(ℝ³), and the desired example is constructed. This argument does not necessarily lead to a nonnegative ℓ_q. In order to get ℓ_q ≥ 0 one needs an extra argument, given in [Ramm (1987)]. Let us give a variant of this argument. The inequality ℓ_q ≥ 0 holds if and only if (∗) ∫(|∇φ|² + q(x)|φ|²) dx ≥ 0 for all φ ∈ C₀^∞(ℝ³). It is known that ∫|∇φ|² dx ≥ ∫(4r²)⁻¹|φ|² dx for all φ ∈ C₀^∞(ℝ³), r := |x|. Therefore (∗) holds if (∗∗) (4r²)⁻¹ + q ≥ 0. Choose u = r^{γ−1}(1 + γ − γr), where γ > 0 is a sufficiently small number. Then q, defined by (6.71), satisfies (∗∗), as one can easily check. This q is integrable and ℓ_q ≥ 0. The function u = r⁻¹ for r ≥ 1, u = r^{γ−1}(1 + γ − γr) for r ≤ 1, solves the equation ℓ_q u = 0 in ℝ³, u ∉ L²(ℝ³), ℓ_q ≥ 0, q = 0 for r ≥ 1, and q is locally integrable.
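The "easy check" of (∗∗) can be made concrete. For radial u, Δu = u″ + (2/r)u′, and a short computation (mine, not in the text) gives q = Δu/u = −γ(γ+1)(1 − γ + γr)/(r²(1 + γ − γr)) on (0,1), so (∗∗) reduces to γ(γ+1)(1−γ+γr)/(1+γ−γr) ≤ 1/4. A stdlib sketch verifying this for γ = 0.1, together with the C¹ matching of u at r = 1:

```python
import math

g = 0.1   # gamma, small enough that g*(g+1) <= 1/4

def u_in(r):  return r**(g - 1) * (1 + g - g * r)   # u for r <= 1
def u_out(r): return 1.0 / r                        # u for r >= 1

# C^1 matching at r = 1: both pieces give u(1) = 1 and u'(1) = -1
du_in = (u_in(1 + 1e-6) - u_in(1 - 1e-6)) / 2e-6
assert abs(u_in(1.0) - u_out(1.0)) < 1e-12
assert abs(du_in + 1.0) < 1e-4

# (**): q + 1/(4 r^2) >= 0 on a grid in (0, 1), with q = Delta(u)/u
for i in range(1, 1000):
    r = i / 1000.0
    q = -g * (g + 1) * (1 - g + g * r) / (r**2 * (1 + g - g * r))
    assert q + 1.0 / (4 * r**2) >= 0.0
```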

Exercise: Prove that the numbers (6.63) are simple poles of [I + T_θ(k)]⁻¹.

6.1.5 High-frequency behavior of the scattering solutions

Assume now that q ∈ Q₁. Then the function φ defined in (6.60) can be written as

$$\varphi = 1 + \frac{1}{2ik}\int_0^\infty q(x - r\theta)\, dr + o\!\left(\frac{1}{k}\right), \quad k \to +\infty. \tag{6.72}$$


If q ∈ Q_m, m > 1, more terms in the asymptotic expansion of φ as k → ∞ can be written [Skriganov (1978)]. Formula (6.72) is well known and can be derived as follows.

Proof of formula (6.72). Step 1: Note that

$$\max_{\theta\in S^2}\|T_\theta^2(k)\| \to 0 \quad \text{as}\quad k \to +\infty. \tag{6.73}$$

This can be proved, as in [Ramm (1986), p. 390], by writing the kernel B_θ(x,y,k) of T_θ²:

$$B_\theta(x,y,k) = \int \frac{\exp[\,ik(|x-z| + |z-y|)\,]}{16\pi^2\, |x-z|\,|z-y|}\, q(z)\, dz\; q(y)\,\exp[-ik\theta\cdot(x-y)]. \tag{6.74}$$

Introduce the coordinates s, t, ψ defined by the formulas

$$z_1 = \ell s t + \frac{x_1+y_1}{2}, \qquad z_2 = \ell\sqrt{(s^2-1)(1-t^2)}\,\cos\psi + \frac{x_2+y_2}{2}, \qquad z_3 = \ell\sqrt{(s^2-1)(1-t^2)}\,\sin\psi + \frac{x_3+y_3}{2}, \tag{6.75}$$

where

$$\ell = |x-y|/2, \qquad |x-z| + |z-y| = 2\ell s, \qquad |x-z| - |z-y| = 2\ell t, \qquad \mathcal{J} = \ell^3(s^2 - t^2), \tag{6.76}$$

and 𝒥 is the Jacobian of the transformation (z₁, z₂, z₃) → (s, t, ψ), 1 ≤ s < ∞, −1 ≤ t ≤ 1, 0 ≤ ψ < 2π. In the new coordinates one obtains

$$|B_\theta(x,y,k)| \le \frac{|q(y)|}{16\pi^2}\,\ell\left|\int_1^\infty \exp(2ik\ell s)\, p(s)\, ds\right|, \tag{6.77}$$

where

$$p(s) := \int_0^{2\pi} d\psi \int_{-1}^{1} dt\, q_1(s,t,\psi),$$

and q₁(s,t,ψ) is q(z) in the new coordinates given by formula (6.75). One can choose a sufficiently large number N > 0 such that

$$|x| > N \ \text{or}\ |y| > N \quad \text{implies} \quad \sup_{\theta\in S^2,\ k>0} |B_\theta(x,y,k)| < \varepsilon(N), \tag{6.78}$$


where ε(N) → 0 as N → ∞. If N is fixed, then for |x| ≤ N and |y| ≤ N it follows from (6.77) that

$$|B_\theta| \to 0 \quad \text{as}\quad k \to +\infty, \tag{6.79}$$

since p(s) ∈ L¹(1, ∞). This proves (6.73). Note that in this argument it is sufficient to assume q ∈ Q.

Step 2: If (6.73) holds, one can write

∞ φ = 1 + ( 1)j T j(k)1, (6.80) − θ Xj=1

where the series in (6.80) converges in the norm of C(R3) if k is sufficiently large so that T 2 < 1. Note that if T < 1 then k θ k k k

∞ 1 j j (I + T )− = ( 1) T (6.81) − j=0 X

and the series converges in the norm of operators. If T 2 < 1, formula k k (6.81) remains valid. Indeed

\[
\sum_{j=0}^{\infty}(-1)^j T^j = \sum_{j=0}^{\infty}(-1)^{2j}T^{2j} + T\sum_{j=0}^{\infty}(-1)^{2j+1}T^{2j}
= \left(I-T^2\right)^{-1} - T\left(I-T^2\right)^{-1}
= (I-T)\left(I-T^2\right)^{-1} = (I+T)^{-1}. \tag{6.82}
\]
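The step in (6.82) can be sanity-checked in finite dimensions, where determinants and inverses are computable directly. A minimal Python sketch (an illustrative $2\times 2$ matrix, not an operator from the text) exhibits a $T$ with $\|T\|\ge 1$ but $\|T^2\|<1$ for which the alternating Neumann series still sums to $(I+T)^{-1}$:

```python
# Finite-dimensional sanity check of (6.81)-(6.82): the Neumann series
# sum_j (-1)^j T^j converges to (I + T)^{-1} even when ||T|| >= 1,
# provided ||T^2|| < 1.  Toy 2x2 matrix, illustrative only.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

I = [[1.0, 0.0], [0.0, 1.0]]
T = [[0.0, 2.0], [0.1, 0.0]]   # ||T|| ~ 2, but T^2 = 0.2*I, so ||T^2|| < 1

# Partial sums of sum_{j=0}^{N} (-1)^j T^j
S = [[0.0, 0.0], [0.0, 0.0]]
P = [row[:] for row in I]      # P holds T^j, starting from T^0 = I
for j in range(200):
    sign = 1.0 if j % 2 == 0 else -1.0
    S = [[S[i][l] + sign * P[i][l] for l in range(2)] for i in range(2)]
    P = matmul(P, T)

# Direct inverse of I + T for comparison
M = [[I[i][l] + T[i][l] for l in range(2)] for i in range(2)]
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
Minv = [[M[1][1] / det, -M[0][1] / det], [-M[1][0] / det, M[0][0] / det]]

err = max(abs(S[i][l] - Minv[i][l]) for i in range(2) for l in range(2))
print(err)
```

Here $T^2 = 0.2\,I$, so the even and odd partial sums behave exactly as in the rearrangement (6.82).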

In fact, it is known that the series (6.81) converges and formula (6.81) holds if $\|T^m\| < 1$ for some integer $m \ge 1$. As $k \to \infty$, each term in (6.80) has a higher order of smallness than the previous one. Therefore it is sufficient to consider the first term in the sum (6.80) and to check that

\[
-T_\theta(k)1 = \frac{1}{2ik}\int_0^{\infty} q(x-r\theta)\,dr + o\!\left(\frac{1}{k}\right), \qquad k\to+\infty, \tag{6.83}
\]


in order to prove (6.72).

Step 3: Let us check (6.83). One has

\[
T_\theta(k)1 = \int \frac{\exp\left\{ik\left[|x-y| - \theta\cdot(x-y)\right]\right\}}{4\pi|x-y|}\,q(y)\,dy
= \int_0^{\infty}\frac{r^2\,dr\,\exp(ikr)}{4\pi r}\int_{S^2}\exp\left(ikr\theta\cdot\alpha\right)q(x+r\alpha)\,d\alpha, \tag{6.84}
\]
where we set $y = x+z$, $z = r\alpha$, $\alpha\in S^2$. Use the formula [Ramm (1986), p. 55]:

\[
\int_{S^2}\exp(ikr\theta\cdot\alpha)f(\alpha)\,d\alpha
= \frac{2\pi i}{kr}\left[\exp(-ikr)f(-\theta) - \exp(ikr)f(\theta)\right] + o\!\left(\frac{1}{k}\right), \qquad k\to\infty, \tag{6.85}
\]
which holds if $f \in C^1(S^2)$. From (6.84) and (6.85) one obtains

\[
T_\theta(k)1 = -\frac{1}{2ik}\int_0^{\infty} dr\,q(x-r\theta) + o\!\left(\frac{1}{k}\right), \qquad k\to+\infty, \tag{6.86}
\]
which is equivalent to (6.83). Formula (6.72) is proved. $\Box$

It follows from (6.72) that

\[
u(x,\theta,k) = \exp(ik\theta\cdot x)\left[1 + \frac{1}{2ik}\int_0^{\infty} q(x-r\theta)\,dr + o\!\left(\frac{1}{k}\right)\right], \qquad k\to+\infty, \tag{6.87}
\]
provided that $q\in Q_1$. Another formula, which follows from (6.72), is

\[
\theta\cdot\nabla_x \lim_{k\to+\infty}\left\{2ik\left[\phi(x,\theta,k)-1\right]\right\}
= \theta\cdot\nabla_x\int_0^{\infty} q(x-r\theta)\,dr
= -\int_0^{\infty}\frac{\partial q}{\partial r}\,dr = q(x),
\]
or

\[
q(x) = \theta\cdot\nabla_x \lim_{k\to+\infty}\left\{2ik\left[\phi(x,\theta,k)-1\right]\right\}. \tag{6.88}
\]
Note that the left side does not depend on $\theta$, so that (6.88) is a compatibility condition on the function $\phi(x,\theta,k)$.
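The cancellation behind (6.88) — that $\theta\cdot\nabla_x\int_0^\infty q(x-r\theta)\,dr = q(x)$ for a potential vanishing at infinity — is easy to verify numerically. A sketch with the hypothetical test potential $q(x) = e^{-|x|^2}$ and $\theta = (1,0,0)$ (illustrative data, not from the text):

```python
import math

# Check: for I(x) = \int_0^infty q(x - r*theta) dr one has
# theta . grad I = q(x), since q decays at infinity.
# Toy potential q(x) = exp(-|x|^2), theta = (1, 0, 0).

def q(x1, x2, x3):
    return math.exp(-(x1 * x1 + x2 * x2 + x3 * x3))

def I(x1, x2, x3, h=1e-3, R=12.0):
    # trapezoid rule for \int_0^R q(x1 - r, x2, x3) dr
    n = int(R / h)
    s = 0.5 * (q(x1, x2, x3) + q(x1 - R, x2, x3))
    for i in range(1, n):
        s += q(x1 - i * h, x2, x3)
    return s * h

x = (0.3, 0.5, -0.2)
d = 1e-3
deriv = (I(x[0] + d, x[1], x[2]) - I(x[0] - d, x[1], x[2])) / (2 * d)
print(deriv, q(*x))  # the two numbers agree closely
```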


From (6.87) and (6.15) it follows that
\[
A(\theta',\theta,k) = -\frac{1}{4\pi}\int\exp\left[ik(\theta-\theta')\cdot x\right]q(x)\,dx + O\!\left(\frac{1}{k}\right), \qquad k\to+\infty, \tag{6.89}
\]
provided that $q\in Q_1$. In particular,
\[
A(\theta,\theta,k) = -\frac{1}{4\pi}\int q(x)\,dx + O\!\left(\frac{1}{k}\right), \qquad k\to+\infty. \tag{6.90}
\]

6.1.6 Fundamental relation between $u_+$ and $u_-$

If one defines

\[
u_+ := u(x,\theta,k), \qquad u_- := u(x,-\theta,-k), \tag{6.91}
\]

\[
u_+ = S u_- := u_- + \frac{ik}{2\pi}\int_{S^2} A(\theta',\theta,k)\,u_-(x,\theta',k)\,d\theta'. \tag{6.92}
\]
Let us derive (6.92). We start with the equations

\[
u_+ = u_0 - G^+ q u_0, \qquad u_0 := \exp(ik\theta\cdot x), \tag{6.93}
\]
\[
u_- = u_0 - G^- q u_0, \tag{6.94}
\]
where $G^+ = G$, $G$ is defined by equation (6.27), and

\[
G^- := \overline{G}. \tag{6.95}
\]

One can easily check equations (6.93) and (6.94) by applying the operator $\ell_q - k^2$ to them. Subtract (6.94) from (6.93) and use (6.95) to get

\[
\begin{aligned}
u_+ - u_- &= -2i\,\mathrm{Im}\,G^+\,qu_0
= \frac{ik}{2\pi}\int_{S^2} d\theta'\,u_-(x,\theta',k)\left[-\frac{1}{4\pi}\int \overline{u_-(y,\theta',k)}\,q(y)\,u_0(y,\theta,k)\,dy\right] \\
&= \frac{ik}{2\pi}\int_{S^2} A(\theta',\theta,k)\,u_-(x,\theta',k)\,d\theta'.
\end{aligned}\tag{6.96}
\]
The last equality in (6.96) follows from the definition (6.91), properties (6.51) and (6.52) of the scattering amplitude, and the formula

\[
\mathrm{Im}\,G^+(x,y,k) = \frac{k}{16\pi^2}\int_{S^2} u_-(x,\theta',k)\,\overline{u_-(y,\theta',k)}\,d\theta', \tag{6.97}
\]


which is similar to (6.46) and which follows from (6.46) and (6.91). Let us derive (6.97). Note that, by formulas (6.46) and (6.91), one has

\[
\mathrm{Im}\,G^+(x,y,-k) = -\,\mathrm{Im}\,G^+(x,y,k), \qquad u(x,\theta,-k) = \overline{u(x,\theta,k)}, \tag{6.98}
\]

\[
\begin{aligned}
\mathrm{Im}\,G^+(x,y,k) &= \frac{k}{16\pi^2}\int_{S^2} u(x,\theta,k)\,\overline{u(y,\theta,k)}\,d\theta
= \frac{k}{16\pi^2}\int_{S^2} u_-(x,-\theta,-k)\,\overline{u_-(y,-\theta,-k)}\,d\theta \\
&= \frac{k}{16\pi^2}\int_{S^2} u_-(x,\theta',-k)\,\overline{u_-(y,\theta',-k)}\,d\theta'
= \frac{k}{16\pi^2}\int_{S^2} u_-(x,\theta',k)\,\overline{u_-(y,\theta',k)}\,d\theta'.
\end{aligned}\tag{6.99}
\]
Here we used (6.98). Thus, formula (6.97) is obtained. Note that

\[
-4\pi A(\theta',\theta,k) = \int u_0(y,-\theta',k)\,q(y)\,u(y,\theta,k)\,dy \tag{6.100}
\]
and

\[
-4\pi A(-\theta,-\theta',k) = \int u_0(y,\theta,k)\,q(y)\,u(y,-\theta',k)\,dy
= \int u_0(y,\theta,k)\,q(y)\,\overline{u_-(y,\theta',k)}\,dy. \tag{6.101}
\]
From formula (6.52) it follows that the right sides of (6.100) and (6.101) are equal. This explains the last equality in (6.96).

6.1.7 Formula for $\det S(k)$ and the Levinson Theorem

If $q(x) = \overline{q(x)}$ decays sufficiently fast (for example, if $(1+|x|)q \in Q$, $x\in\mathbb{R}^3$), then the operator $A : L^2(S^2)\to L^2(S^2)$ with kernel $A(\theta',\theta,k)$, $k>0$, is in the trace class and

\[
\det S(k) = \det\left(I + \frac{ik}{2\pi}A\right) = \exp\left(-\frac{ik}{2\pi}\int q(x)\,dx\right)\frac{d(-k)}{d(k)}, \qquad k>0, \tag{6.102}
\]
where

\[
d(k) := \det\nolimits_2\left(I + T(k)\right). \tag{6.103}
\]


The operator $T(k)$ in (6.103) is defined in (6.16), and the symbol $\det_2(I+T)$ is defined in Definition 8.7, p. 300. If $k>0$ and $\overline{q} = q$, then $d(-k) = \overline{d(k)}$, where the bar stands for complex conjugation. Therefore

\[
\det S(k) = \exp\left[2i\delta(k)\right], \tag{6.104}
\]
where
\[
\delta(k) = -\frac{k}{4\pi}\int q(x)\,dx - \beta(k), \qquad \beta(k) := \arg d(k). \tag{6.105}
\]
The Levinson Theorem says that
\[
\delta(0) = \pi\left(m + \frac{\nu}{2}\right), \tag{6.106}
\]
where $m$ is the number of bound states counted with their multiplicities (in other words, $m$ is the dimension of the subspace spanned by the eigenfunctions of $\ell_q$ corresponding to all of its negative eigenvalues), and $\nu = 1$ if $k=0$ is a resonance and $\nu = 0$ otherwise, that is, if $I + T(0)$ is invertible. It is assumed that $\delta(k)$ is normalized in such a way that

\[
\lim_{k\to\infty}\left[\delta(k) + \frac{k}{4\pi}\int q(x)\,dx\right] = 0, \tag{6.107}
\]
or, according to (6.105), that

\[
\lim_{k\to\infty}\beta(k) = 0. \tag{6.108}
\]
Formula (6.106) follows from (6.105) and the argument principle applied to $d(k)$. Formula (6.102) can be derived as follows:

\[
\begin{aligned}
d(-k) &:= \det\nolimits_2\left(I + T(-k)\right)
= \det\nolimits_2\left\{\left[I+T(k)\right]\left[I + \left(I+T(k)\right)^{-1}\left(T(-k)-T(k)\right)\right]\right\} \\
&= \det\nolimits_2\left[I+T(k)\right]\,\det\left[I + \left(I+T(k)\right)^{-1}\left(T(-k)-T(k)\right)\right]\exp\left\{-\mathrm{Tr}\left[T(-k)-T(k)\right]\right\}.
\end{aligned}\tag{6.109}
\]
Here we have used formula (12), which precedes Definition 8.8 on p. 300. The operator $T(-k)-T(k)$ has the kernel $-\dfrac{2i\sin(k|x-y|)}{4\pi|x-y|}\,q(y)$, so that its trace is
\[
\mathrm{Tr}\left[T(-k)-T(k)\right] = -\frac{2ik}{4\pi}\int q(y)\,dy = -\frac{ik}{2\pi}\int q(y)\,dy. \tag{6.110}
\]
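The multiplicative property of $\det_2$ invoked in (6.109) has a finite-dimensional analog that can be checked numerically: defining $\det_2(I+A) := \det(I+A)\,e^{-\mathrm{Tr}\,A}$ for matrices, one has $\det_2\left[(I+A)(I+B)\right] = \det_2(I+A)\,\det_2(I+B)\,\exp\{-\mathrm{Tr}(AB)\}$. A minimal sketch with $2\times 2$ toy matrices (the trace-class setting of Definition 8.8 is not modeled here):

```python
import math

# Finite-dimensional analog of the identity used in (6.109):
# with det2(I+A) := det(I+A) * exp(-Tr A), one has
# det2[(I+A)(I+B)] = det2(I+A) * det2(I+B) * exp(-Tr(AB)).
# 2x2 toy matrices, consistency check only.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det_of_I_plus(A):
    M = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(2)] for i in range(2)]
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def tr(M):
    return M[0][0] + M[1][1]

def det2(A):                      # det2(I + A)
    return det_of_I_plus(A) * math.exp(-tr(A))

A = [[0.3, -0.2], [0.5, 0.1]]
B = [[-0.4, 0.7], [0.2, 0.6]]

# (I+A)(I+B) = I + C with C = A + B + AB
AB = matmul(A, B)
C = [[A[i][j] + B[i][j] + AB[i][j] for j in range(2)] for i in range(2)]

lhs = det2(C)
rhs = det2(A) * det2(B) * math.exp(-tr(AB))
print(lhs, rhs)
```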


Therefore formula (6.109) can be written as

\[
\frac{d(-k)}{d(k)}\exp\left(-\frac{ik}{2\pi}\int q(x)\,dx\right) = \det\left[I + \left(I+T(k)\right)^{-1}\left(T(-k)-T(k)\right)\right]. \tag{6.111}
\]
Finally one proves that

\[
\det\left[I + \left(I+T(k)\right)^{-1}\left(T(-k)-T(k)\right)\right] = \det\left(I + \frac{ik}{2\pi}A\right). \tag{6.112}
\]
Formulas (6.111) and (6.112) imply (6.102). Let us prove (6.112). Let $B := \left(I+T(k)\right)^{-1}$. Then

\[
\tau := \left(I+T(k)\right)^{-1}\left(T(-k)-T(k)\right) = B\left[g(-k)-g(k)\right]q. \tag{6.113}
\]
One has
\[
g(-k)-g(k) = -\frac{2ik\sin(k|x-y|)}{4\pi\,k|x-y|} = -\frac{ik}{2\pi}\cdot\frac{1}{4\pi}\int_{S^2}\exp\left\{ik\theta\cdot(x-y)\right\}d\theta. \tag{6.114}
\]
Furthermore, if $u_0 := \exp(ik\theta\cdot x)$, then

\[
Bu_0 = u(x,\theta,k), \tag{6.115}
\]

where $u(x,\theta,k)$ is the scattering solution (6.1)–(6.2). Therefore the right-hand side of (6.113) is the operator in $L^2(\mathbb{R}^3)$ with the kernel

\[
\tau(x,y) := -\frac{ik}{8\pi^2}\int_{S^2} d\theta\,\exp(-ik\theta\cdot y)\,u(x,\theta,k)\,q(y). \tag{6.116}
\]
Note that
\[
\mathrm{Tr}\,\tau = \int\tau(x,x)\,dx = \frac{ik}{2\pi}\int_{S^2} A(\theta,\theta,k)\,d\theta = \mathrm{Tr}\left(\frac{ik}{2\pi}A\right). \tag{6.117}
\]
From formula 8) of Section 8.3.3 it follows that (6.112) is valid provided that

\[
\mathrm{Tr}\,\tau^j = \mathrm{Tr}\left(\frac{ik}{2\pi}A\right)^j, \qquad j = 1,2,3,\ldots. \tag{6.118}
\]
One can check (6.118) as we checked (6.117). Thus, formula (6.102) is derived.


6.1.8 Completeness properties of the scattering solutions

Theorem 6.1 Let $h(\theta)\in L^2(S^2)$ and assume that

\[
\int_{S^2} h(\theta)\,u(x,\theta,k)\,d\theta = 0 \qquad \forall x\in\Omega_R := \{x : |x| > R\}, \tag{6.119}
\]
where $k>0$ is fixed and $x\in\mathbb{R}^3$. It is assumed that $q\in Q$. Then $h(\theta) = 0$. The same conclusion holds if one replaces $u(x,\theta,k)$ by $u(x,-\theta,-k)$ in (6.119), and if $x\in\mathbb{R}^r$, $r\ge 2$.

Proof. The proof consists of two steps.

Step 1. The conclusion of Theorem 6.1 holds if $u(x,\theta,k)$ is replaced by $u_0(x,\theta,k) := \exp(ik\theta\cdot x)$ in (6.119). Indeed, if

\[
\int_{S^2} h(\theta)\exp(ik\theta\cdot x)\,d\theta = 0 \qquad \forall x\in\Omega_R \tag{6.120}
\]
and a fixed $k>0$, then the Fourier transform of the distribution $h(\theta)\delta_{S^2}$ vanishes for all sufficiently large $|x|$. The distribution $h(\theta)\delta_{S^2}$ is defined by the formula

\[
\int\phi(y)\,h\,\delta_{S^2}\,dy = \int_{S^2}\phi(\theta)\,h(\theta)\,d\theta \qquad \forall\phi\in C_0^{\infty}(\mathbb{R}^3). \tag{6.121}
\]
Since $h(\theta)\delta_{S^2}$ has compact support, its Fourier transform is an entire function of $x$. If this entire function vanishes for all sufficiently large $|x|$, $x\in\mathbb{R}^3$, it vanishes identically. Therefore $h(\theta) = 0$.

Step 2. If (6.119) holds then (6.120) holds. Therefore, by Step 1, $h(\theta) = 0$. In order to prove that (6.119) implies (6.120), let us note that

\[
u_0(x,\theta,k) = \left(I + T(k)\right)u, \tag{6.122}
\]

where $T(k)$ is defined by (6.16). For every $k>0$ the operator $I+T(k)$ is an isomorphism of $C(\mathbb{R}^3)$ onto $C(\mathbb{R}^3)$. Applying the operator $I+T(k)$ to (6.119) and using (6.122), one obtains (6.120). Note that the operator $I+T(k)$ acts on $u(x,\theta,k)$, which is considered as a function of $x$, while $\theta$ and $k$ are parameters. Theorem 6.1 is proved. $\Box$

Theorem 6.1 is used in [R 26)] for a characterization of the scattering data, which we give in Section 6.2.5. Another completeness property of the scattering solution can be formulated. Let

\[
N_D(\ell_q) := \left\{w : w\in H^2(D),\ \ell_q w = 0 \text{ in } D\right\}, \tag{6.123}
\]


where $D\subset\mathbb{R}^3$ is a bounded domain with a sufficiently smooth boundary $\Gamma$; for example, $\Gamma\in C^{1,\alpha}$, $\alpha>0$, suffices.

Theorem 6.2 Let $q\in Q$, where $Q$ is defined in (6.3). The closure in $L^2(D)$ (and in $H^1(D)$) of the linear span of the scattering solutions $\{u(x,\theta,k)\}$, $\forall\theta\in S^2$ and any fixed $k>0$, contains $N_D(\ell_q - k^2)$.

Proof. We first prove the statement concerning the $L^2(D)$ closure. Let $f\in N_D(\ell_q - k^2)$ and assume that

\[
\int_D f\,u(x,\theta,k)\,dx = 0 \qquad \forall\theta\in S^2. \tag{6.124}
\]
Define

\[
v(x) := \int_D G(x,y,k)\,f(y)\,dy, \tag{6.125}
\]
where $G$ is uniquely defined by equation (6.27). Use (6.29) and (6.124) to conclude that

\[
v(x) = O\left(|x|^{-2}\right) \quad \text{as } |x|\to\infty. \tag{6.126}
\]
Since

\[
\left(\ell_q - k^2\right)v = 0 \quad \text{in } \Omega := \mathbb{R}^3\setminus D \tag{6.127}
\]
and (6.126) holds, one concludes, applying Kato's theorem (see [Kato (1959)]), that $v = 0$ in $\Omega$ (see the end of the proof of Lemma 6.4 in subsection 6.1.2). In particular,

\[
v = v_N = 0 \quad \text{on } \Gamma, \tag{6.128}
\]

where $v_N$ is the normal derivative of $v$ on $\Gamma$. It follows from (6.125) that

\[
\left(\ell_q - k^2\right)v = -f \quad \text{in } D. \tag{6.129}
\]
Since

\[
\left(\ell_q - k^2\right)f = 0 \quad \text{in } D \tag{6.130}
\]


by the assumption, one can multiply (6.129) by $\overline{f}$, integrate over $D$, and get

\[
\begin{aligned}
-\int_D |f|^2\,dy &= \int_D \overline{f}\,\left(\ell_q-k^2\right)v\,dy
= \int_D \left(\ell_q-k^2\right)\overline{f}\cdot v\,dx + \int_\Gamma\left(\overline{f}\,v_N - \overline{f}_N\,v\right)ds \\
&= \int_D \overline{\left(\ell_q-k^2\right)f}\;v\,dx = 0.
\end{aligned}\tag{6.131}
\]
Here we have used (6.128) and the real-valuedness of the potential. It follows from (6.131) that $f = 0$. The first statement of Theorem 6.2 is proved. In order to prove the second statement, which deals with completeness in $H^1(D)$, one assumes

\[
\int_D\left(\nabla\overline{f}\cdot\nabla u + \overline{f}\,u\right)dx = 0 \qquad \forall\theta\in S^2 \tag{6.132}
\]
for some $f\in N_D(\ell_q - k^2)$. Integrate (6.132) by parts to get
\[
\int_D\left(-\Delta\overline{f} + \overline{f}\right)u\,dx + \int_\Gamma \overline{f}_N\,u\,ds = 0 \qquad \forall\theta\in S^2. \tag{6.133}
\]
Define

\[
v := \int_D\left(-\Delta\overline{f} + \overline{f}\right)G(x,y,k)\,dy + \int_\Gamma G(x,s,k)\,\overline{f}_N\,ds. \tag{6.134}
\]
Argue as above to conclude that $v = 0$ in $\Omega$ and

\[
v = v_N^- = 0 \quad \text{on } \Gamma, \tag{6.135}
\]

where $v_N^-$ is the limiting value of $v_N$ on $\Gamma$ from $\Omega$. By the jump formula for the normal derivative of the single-layer potential (see, e.g., [Ramm (1986), p. 14]) one has

\[
v_N^+ - v_N^- = \overline{f}_N. \tag{6.136}
\]
Since $v_N^- = 0$, it follows that $v_N^+ = \overline{f}_N$ on $\Gamma$. Thus

\[
v = 0, \qquad v_N^+ = \overline{f}_N \quad \text{on } \Gamma, \tag{6.137}
\]
and

\[
\left(\ell_q - k^2\right)v = \Delta\overline{f} - \overline{f} \quad \text{in } D. \tag{6.138}
\]


Multiply (6.138) by f, integrate over D and then by parts to get

\[
-\int_D\left(|\nabla f|^2 + |f|^2\right)dx + \int_\Gamma f\,\overline{f}_N\,ds
= \int_D\left(\ell_q-k^2\right)f\cdot v\,dx + \int_\Gamma\left(f\,v_N^+ - f_N\,v\right)ds. \tag{6.139}
\]
From (6.130), (6.137) and (6.139) it follows that

\[
\int_D\left(|\nabla f|^2 + |f|^2\right)dx = 0. \tag{6.140}
\]
Thus $f = 0$. Theorem 6.2 is proved. $\Box$

6.2 Inverse scattering problems

6.2.1 Inverse scattering problems

The inverse scattering problem consists of finding $q(x)$ given $A(\theta',\theta,k)$. One should specify for which values of $\theta'$, $\theta$ and $k$ the scattering amplitude is given.

Problem 1 $A(\theta',\theta,k)$ is given for all $\theta',\theta\in S^2$ and all $k>0$. Find $q(x)$.

Problem 2 $A(\theta',\theta,k)$ is given for all $\theta',\theta\in S^2$ and a fixed $k>0$.

Problem 3 $A(\theta',\theta,k)$ is given for a fixed $\theta\in S^2$, all $\theta'\in S^2$ and all $k>0$.

Problem 1 has been studied extensively. We will mention some of the results relevant to estimation theory. Problem 2 has been solved recently [R 29)], but we do not describe the results since they are not connected with the estimation theory. Problem 3 is open, but a partial result is given in [R 29d)].

6.2.2 Uniqueness theorem for the inverse scattering problem

The uniqueness of the solution to Problem 1 follows immediately from formula (6.89). Indeed, if $A(\theta',\theta,k)$ is known for all $\theta',\theta\in S^2$ and all $k>0$, then take an arbitrary $\xi\in\mathbb{R}^3$, an arbitrary sequence $k_n\to+\infty$, and find


a sequence $\theta_n, \theta_n'\in S^2$ such that

\[
\lim_{n\to\infty}\left(\theta_n - \theta_n'\right)k_n = \xi, \qquad k_n\to+\infty. \tag{6.141}
\]
This is clearly possible. Pass to the limit $k_n\to\infty$ in (6.89) to get

\[
-4\pi\lim_{\substack{k_n\to\infty \\ k_n(\theta_n-\theta_n')\to\xi}} A(\theta_n',\theta_n,k_n) = \int\exp(i\xi\cdot x)\,q(x)\,dx. \tag{6.142}
\]
Therefore the Fourier transform of $q$ is uniquely determined. Thus $q$ is uniquely determined. We have proved

Lemma 6.7 If $q\in Q_1$, then the knowledge of $A(\theta',\theta,k)$ on $S^2\times S^2\times\mathbb{R}_+$, $\mathbb{R}_+ := (0,\infty)$, determines $q(x)$ uniquely.

In fact, our proof shows that it suffices to know $A$ for an arbitrary sequence $k_n\to\infty$ and for such $\theta_n'$ and $\theta_n$ that, for any $\xi\in\mathbb{R}^3$, condition (6.141) holds.

The reconstruction of $q(x)$ from the scattering data via formula (6.142) requires the knowledge of the high-frequency data. These data are not easy to collect in quantum mechanics problems, and for very high energies the Schrödinger equation is no longer a good model for the physical processes.

Therefore much effort was spent in order to find a solution to Problem 1 which uses all of the scattering data; to find necessary and sufficient conditions for a function $A(\theta',\theta,k)$ to be the scattering amplitude for a potential $q$ from a certain class, e.g. for $q\in Q_m$ (this is called a characterization problem); and to give a stable reconstruction of $q$ given noisy data.
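The choice of $\theta_n, \theta_n'$ in (6.141) is elementary: once $2k \ge |\xi|$, one can place the two unit vectors symmetrically about the plane orthogonal to $\xi$ so that $k(\theta-\theta') = \xi$ exactly. A sketch of this construction (the values of $\xi$ and $k$ below are arbitrary illustrations):

```python
import math

# Construction behind (6.141): given xi in R^3 and k with 2k >= |xi|,
# build unit vectors theta, theta' with k(theta - theta') = xi exactly:
#   theta  =  (|xi|/2k) xi_hat + sqrt(1 - (|xi|/2k)^2) w,
#   theta' = -(|xi|/2k) xi_hat + sqrt(1 - (|xi|/2k)^2) w,
# where w is any unit vector orthogonal to xi.

def make_directions(xi, k):
    n = math.sqrt(sum(c * c for c in xi))
    assert 2 * k >= n
    xh = [c / n for c in xi]
    # unit vector w orthogonal to xi (Gram-Schmidt on a non-parallel seed)
    seed = [1.0, 0.0, 0.0] if abs(xh[0]) < 0.9 else [0.0, 1.0, 0.0]
    dot = sum(s * c for s, c in zip(seed, xh))
    w = [s - dot * c for s, c in zip(seed, xh)]
    wn = math.sqrt(sum(c * c for c in w))
    w = [c / wn for c in w]
    a = n / (2 * k)
    b = math.sqrt(1 - a * a)
    theta = [a * c1 + b * c2 for c1, c2 in zip(xh, w)]
    theta_p = [-a * c1 + b * c2 for c1, c2 in zip(xh, w)]
    return theta, theta_p

xi = (0.5, -1.0, 2.0)
k = 50.0
t, tp = make_directions(xi, k)
diff = [k * (u - v) for u, v in zip(t, tp)]
print(diff)  # equals xi up to rounding
```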

6.2.3 Necessary conditions for a function to be a scattering amplitude

A number of necessary conditions for $A(\theta',\theta,k)$ to be the scattering amplitude corresponding to $q\in Q_1$ follow from the results of Section 6.1 of this chapter. Let us list some of these necessary conditions:

1) reality, reciprocity and unitarity, that is, formulas (6.51)–(6.54);
2) high-frequency behavior: formulas (6.89), (6.90), (6.142).

Other necessary conditions will be mentioned later (see formulas (6.158) and (6.159) below). Some necessary and sufficient conditions for $A(\theta',\theta,k)$ to be the scattering amplitude for a $q\in Q_1$ were first given in [R 26), 27)].


These conditions cannot be checked algorithmically: they are formulated in terms of the properties of the solutions to certain integral equations whose kernel is the given function $A(\theta',\theta,k)$.

6.2.4 A Marchenko equation (M equation)

Define

\[
\eta(x,\theta,\alpha) := \frac{1}{2\pi}\int_{-\infty}^{\infty}\left[\phi(x,\theta,k)-1\right]\exp(-ik\alpha)\,dk, \tag{6.143}
\]
where $\phi$ is defined by (6.60) and has property (6.72) as $k\to\infty$, provided that $q\in Q_1$.

For simplicity we assume that
\[
\ell_q \text{ has no bound states.} \tag{6.144}
\]

Under this assumption $\phi$ is analytic in $\mathbb{C}_+$ and continuous in $\overline{\mathbb{C}}_+\setminus 0$. Let us assume that $k=0$ is not an exceptional point, that is, $\phi$ is continuous in $\overline{\mathbb{C}}_+$. Start with equation (6.92), which we rewrite as

\[
\phi(x,\theta,k) = \phi(x,-\theta,-k) + \frac{ik}{2\pi}\int_{S^2} A(\theta',\theta,k)\exp\left[ik(\theta'-\theta)\cdot x\right]\phi(x,-\theta',-k)\,d\theta', \tag{6.145}
\]

or
\[
\begin{aligned}
\phi(x,\theta,k) - 1 &= \phi(x,-\theta,-k) - 1
+ \frac{ik}{2\pi}\int_{S^2} A(\theta',\theta,k)\exp\left[ik(\theta'-\theta)\cdot x\right]\left[\phi(x,-\theta',-k)-1\right]d\theta' \\
&\quad + \frac{ik}{2\pi}\int_{S^2} A(\theta',\theta,k)\exp\left[ik(\theta'-\theta)\cdot x\right]d\theta'.
\end{aligned}\tag{6.146}
\]
Take the Fourier transform of (6.146) and use (6.143) to get

\[
\eta(x,\theta,\alpha) = \eta(x,-\theta,-\alpha) + \int_{-\infty}^{\infty}\int_{S^2} B(\alpha-\beta,\theta',\theta,x)\,\eta(x,-\theta',-\beta)\,d\theta'\,d\beta + \eta_0. \tag{6.147}
\]
Here

\[
\eta_0 := \frac{1}{2\pi}\int_{-\infty}^{\infty}\exp(-ik\alpha)\left\{\frac{ik}{2\pi}\int_{S^2} A(\theta',\theta,k)\exp\left[ik(\theta'-\theta)\cdot x\right]d\theta'\right\}dk, \tag{6.148}
\]


and the integral term in (6.147) is

\[
\begin{aligned}
&\frac{1}{2\pi}\int_{-\infty}^{\infty} dk\,\exp(-ik\alpha)\,\frac{ik}{2\pi}\int_{S^2} A(\theta',\theta,k)\exp\left[ik(\theta'-\theta)\cdot x\right]\left[\phi(x,-\theta',-k)-1\right]d\theta' \\
&\qquad = \int_{-\infty}^{\infty} d\beta\int_{S^2} d\theta'\,B(\alpha-\beta,\theta',\theta,x)\,\eta(x,-\theta',-\beta),
\end{aligned}\tag{6.149}
\]
where

\[
B(\alpha,\theta',\theta,x) := \frac{1}{2\pi}\int_{-\infty}^{\infty} dk\,\exp(-ik\alpha)\,\frac{ik}{2\pi}\,A(\theta',\theta,k)\exp\left[ik(\theta'-\theta)\cdot x\right]. \tag{6.150}
\]
The Fourier transform in (6.150) is understood in the sense of distributions. Under the assumption (6.144), the analyticity of $\phi(x,\theta,k)$ in $k$ in the region $\mathbb{C}_+$ and the decay of $\phi-1$ as $|k|\to\infty$, $k\in\mathbb{C}_+$, which follows from (6.72), imply that

\[
\eta(x,\theta,\alpha) = 0 \quad \text{for } \alpha < 0. \tag{6.151}
\]

Therefore the right-hand side of (6.149) can be written as

\[
\int_0^{\infty} d\beta\int_{S^2} d\theta'\,B(\alpha+\beta,\theta',\theta,x)\,\eta(x,-\theta',\beta) := \int_0^{\infty}\mathcal{B}(\alpha+\beta)\,\eta(\beta)\,d\beta. \tag{6.152}
\]
Equation (6.147) now takes the form of the Marchenko equation

\[
\eta(x,\theta,\alpha) = \int_0^{\infty}\mathcal{B}(\alpha+\beta)\,\eta(\beta)\,d\beta + \eta_0, \qquad \alpha>0, \tag{6.153}
\]
where we took into account that

\[
\eta(x,-\theta,-\alpha) = 0 \quad \text{for } \alpha > 0, \tag{6.154}
\]

according to (6.151). The function $\eta_0$ in (6.153) is defined in (6.148), and the integral operator in (6.153) is defined in (6.152). The kernel of the operator in (6.153) is defined by (6.150) and is known if the scattering amplitude is known. If $A(\theta',\theta,k)$ is the scattering amplitude corresponding to a $q\in Q_1^0$, where $Q_1^0$ is the subset of $Q_1$ which consists of the potentials with no bound states, then equation (6.153) has a solution $\eta$ with the following properties: if one defines $\eta$ for $\alpha<0$ by formula (6.151), then the function

\[
\phi(x,\theta,k) := 1 + \int_0^{\infty} d\alpha\,\exp(ik\alpha)\,\eta(x,\theta,\alpha) \tag{6.155}
\]


solves the equation

\[
\nabla_x^2\phi + 2ik\theta\cdot\nabla_x\phi - q(x)\phi = 0, \tag{6.156}
\]
and the function $u := \exp(ik\theta\cdot x)\phi$ solves the Schrödinger equation

\[
\left(\ell_q - k^2\right)u = 0. \tag{6.157}
\]

In particular, the function
\[
\frac{\left(\nabla^2 + k^2\right)u}{u} := q(x) \tag{6.158}
\]
does not depend on $\theta$ (this is a compatibility condition). Another compatibility condition is given by formula (6.88). This formula can be written as

\[
q(x) = -2\theta\cdot\nabla_x\,\eta(x,\theta,+0). \tag{6.159}
\]
Indeed, it follows from (6.155) that

\[
\lim_{k\to\infty}\left\{2ik(\phi-1)\right\} = -2\eta(x,\theta,+0). \tag{6.160}
\]
Formula (6.159) follows from (6.88) and (6.160). The compatibility condition (6.159) and the Marchenko equation (6.153) appeared in [Newton (1982)], where condition (6.159) was called the "miracle" condition (since the left side of (6.159) does not depend on $\theta$). The above derivation is from [Ramm (1992)].
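The support property (6.151) is a Paley–Wiener-type fact: analyticity of $\phi-1$ in $\mathbb{C}_+$ together with decay forces the transform (6.143) to vanish for $\alpha<0$. The following toy computation illustrates this with the hypothetical model function $\phi-1 = (k+i)^{-2}$, chosen only for its analyticity in $\mathbb{C}_+$ (it is not a scattering quantity from the text); residue calculus gives $\eta(\alpha) = -\alpha e^{-\alpha}$ for $\alpha>0$ and $\eta(\alpha)=0$ for $\alpha<0$:

```python
import math, cmath

# Toy illustration of (6.151): for a function analytic in C_+ with decay,
# eta(alpha) = (1/2pi) \int (phi-1) e^{-ik alpha} dk vanishes for alpha < 0.
# Hypothetical model: phi - 1 = 1/(k+i)^2.

def eta(alpha, K=300.0, dk=0.005):
    s = 0.0 + 0.0j
    n = int(2 * K / dk)
    for i in range(n):                 # Riemann sum over [-K, K]
        k = -K + i * dk
        s += cmath.exp(-1j * k * alpha) / (k + 1j) ** 2
    return s * dk / (2 * math.pi)

print(abs(eta(-2.0)))                  # ~ 0 (support on alpha > 0 only)
print(eta(2.0).real, -2.0 * math.exp(-2.0))
```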

6.2.5 Characterization of the scattering data in the 3D inverse scattering problem

Let us write $A\in\mathcal{A}_Q$ if $A := A(\theta',\theta,k)$ is the scattering amplitude corresponding to a potential $q\in Q$. Assuming that $q\in Q$, we have proved in Section 6.1 that equation (6.92), which we rewrite as
\[
v(x,\theta,k) = v(x,-\theta,-k) + \frac{ik}{2\pi}\int_{S^2} A(\theta',\theta,k)\,v(x,-\theta',-k)\,d\theta' + \frac{ik}{2\pi}\int_{S^2} A(\theta',\theta,k)\exp(ik\theta'\cdot x)\,d\theta', \tag{6.161}
\]
has a solution $v$ for all $x\in\mathbb{R}^3$ and all $k>0$, where
\[
v := u(x,\theta,k) - \exp(ik\theta\cdot x) := u - u_0. \tag{6.162}
\]


This v has the following properties:

\[
v = A(\theta',\theta,k)\,g(r) + o\left(r^{-1}\right), \qquad r = |x|\to\infty, \quad \theta' = xr^{-1}, \tag{6.163}
\]
which is equation (6.2), and

\[
\frac{\left(\Delta + k^2\right)\left(u_0+v\right)}{u_0+v} = q(x) \in Q, \tag{6.164}
\]
which is equation (6.158). These properties are necessary for $A\in\mathcal{A}_Q$. It turns out that they are also sufficient for $A\in\mathcal{A}_Q$. Let us formulate the basic result (see [Ramm (1992)]).

Theorem 6.3 For $A\in\mathcal{A}_Q$ it is necessary and sufficient that equation (6.161) have a solution $v$ such that (6.164) holds and

\[
v = A_q(\theta',\theta,k)\,g(r) + o\left(r^{-1}\right), \qquad r = |x|\to\infty, \quad xr^{-1} = \theta'. \tag{6.165}
\]

The function $A_q$ defined by (6.165) is equal to the given function $A(\theta',\theta,k)$, the kernel of equation (6.161), and it is equal to the scattering amplitude corresponding to the function $q(x)$ defined by (6.164). There is at most one solution to equation (6.161) with properties (6.163), (6.164) and (6.165).

Proof. We have already proved the necessity part. Let us prove the sufficiency part. Let $A(\theta',\theta,k)$ be a given function such that equation (6.161) has a solution with properties (6.164) and (6.165). First, it follows that $u$ defined by the formula

\[
u := \exp(ik\theta\cdot x) + v \tag{6.166}
\]
is the scattering solution for the potential $q(x)$ defined by formula (6.164). Since the scattering solution is uniquely determined (see Lemma 6.4 in Section 6.1.2), one concludes that the function $A_q(\theta',\theta,k)$ defined by formula (6.165) is the scattering amplitude corresponding to the potential $q(x)$ defined by formula (6.164). Secondly, let us prove that

\[
A_q(\theta',\theta,k) = A(\theta',\theta,k), \tag{6.167}
\]

where $A(\theta',\theta,k)$ is the given function, the kernel of equation (6.161). Note


that we have proved in Section 6.1 that $v$ satisfies the equation

\[
v(x,\theta,k) = v(x,-\theta,-k) + \frac{ik}{2\pi}\int_{S^2} A_q(\theta',\theta,k)\,v(x,-\theta',-k)\,d\theta' + \frac{ik}{2\pi}\int_{S^2} A_q(\theta',\theta,k)\exp(ik\theta'\cdot x)\,d\theta'. \tag{6.168}
\]
This is equation (6.92) written in terms of $v$. Subtract (6.168) from (6.161) to get

\[
0 = \int_{S^2}\left[A(\theta',\theta,k) - A_q(\theta',\theta,k)\right]u(x,-\theta',-k)\,d\theta' \qquad \forall x\in\mathbb{R}^3. \tag{6.169}
\]
Equation (6.169) and Theorem 6.1 from Section 6.1.8 imply (6.167). The last statement of Theorem 6.3 can be proved as follows. Suppose there are two (or more) solutions $v_j$, $j=1,2$, to equation (6.161) with properties (6.164) and (6.165). Let $q_j(x)$ and $A_j(\theta',\theta,k)$, $j=1,2$, be the corresponding potentials and scattering amplitudes. If $q_1 = q_2$ then $v_1 = v_2$ by the uniqueness of the scattering solution (Lemma 6.4, p. 114). If $q_1\not\equiv q_2$ then $w := v_1 - v_2\not\equiv 0$. The function $w$ solves the equation

\[
w(x,\theta,k) = w(x,-\theta,-k) + \frac{ik}{2\pi}\int_{S^2} A(\theta'',\theta,k)\,w(x,-\theta'',-k)\,d\theta'' \qquad \forall x\in\mathbb{R}^3. \tag{6.170}
\]
Note that

\[
w(x,\theta,k) = \left[A_1(\theta',\theta,k) - A_2(\theta',\theta,k)\right]g(r) + o\left(r^{-1}\right), \qquad r\to\infty, \quad xr^{-1}=\theta', \tag{6.171}
\]
and

\[
w(x,-\theta,-k) = \left[A_1(\theta',-\theta,-k) - A_2(\theta',-\theta,-k)\right]\overline{g(r)} + o\left(r^{-1}\right), \qquad r\to\infty, \quad xr^{-1}=\theta', \tag{6.172}
\]
where $g(r) := r^{-1}\exp(ikr)$. From (6.170), (6.171) and (6.172) it follows that

\[
\left[A_1(\theta',\theta,k) - A_2(\theta',\theta,k)\right]g(r) = B(\theta',\theta,k)\,\overline{g(r)} + o\left(r^{-1}\right), \qquad r\to\infty, \tag{6.173}
\]
where the expression for $B(\theta',\theta,k)$ is not important for our argument. It follows from (6.173) that

\[
A_1(\theta',\theta,k) = A_2(\theta',\theta,k), \tag{6.174}
\]


so that $B(\theta',\theta,k) = 0$. By Lemma 6.7 it follows that $q_1 = q_2$. Theorem 6.3 is proved. $\Box$

Exercise. Prove that if $a\,g(r) = b\,\overline{g(r)} + o\left(r^{-1}\right)$, $r\to\infty$, $k>0$, where $a$ and $b$ do not depend on $r$, then $a = b = 0$.

Hint: Write $a\exp(ikr) = b\exp(-ikr) + o(1)$; choose $r_n = \frac{n\pi}{k}$, $n\to\infty$, and derive that $a = b$. Then choose $r_n' = \frac{n\pi + \frac{\pi}{2}}{k}$, $n\to\infty$, and derive $a = -b$. Thus $a = b = 0$.

Another characterization of the class of scattering amplitudes is given in [Ramm (1992)]. A characterization of the class of scattering amplitudes at a fixed k > 0 is given in [Ramm (1988)].

6.2.6 The Born inversion

The scattering amplitude in the Born approximation is defined to be

\[
A_B(\theta',\theta,k) := -\frac{1}{4\pi}\int\exp\left\{ik(\theta-\theta')\cdot x\right\}q(x)\,dx, \tag{6.175}
\]
which is formula (6.15) with $u(y,\theta,k)$ replaced by $u_0(y,\theta,k) := \exp(ik\theta\cdot y)$. The Born inversion is the inversion for $q(x)$ of the equation

\[
\int\exp\left\{ik(\theta-\theta')\cdot x\right\}q(x)\,dx = -4\pi A(\theta',\theta,k), \tag{6.176}
\]
which comes from setting

\[
A(\theta',\theta,k) = A_B(\theta',\theta,k). \tag{6.177}
\]

The first question is: does a $q(x)\in Q$ exist such that (6.177) holds for all $\theta',\theta\in S^2$ and all $k>0$? The answer is no, unless $q(x) = 0$, so that $A_B(\theta',\theta,k) = A(\theta',\theta,k) = 0$.

Theorem 6.4 Assume that $q\in Q$. If (6.177) holds for all $\theta',\theta\in S^2$ and all $k>0$, then $q(x) = 0$.

Proof. Since $\overline{q} = q$, it follows from (6.175) that

\[
A_B(\theta,\theta,k) - \overline{A_B(\theta,\theta,k)} = 0. \tag{6.178}
\]


From (6.178), (6.177) and (6.54) one concludes that

\[
\int_{S^2}\left|A_B(\theta,\alpha,k)\right|^2 d\alpha = 0 \qquad \forall\theta\in S^2 \text{ and all } k>0. \tag{6.179}
\]
Thus $A_B(\theta,\alpha,k) = 0$ for all $\theta,\alpha\in S^2$ and $k>0$. This and (6.175) imply that $q(x) = 0$. Theorem 6.4 is proved. $\Box$

Remark 6.2 If $q\in Q$ is compactly supported and (6.177) holds for all $\theta',\theta\in S^2$ and a fixed $k>0$, then $q(x) = 0$. This follows from the uniqueness theorem proved in [Ramm (1992)].

It follows from Theorem 6.4 that the scattering amplitude $A(\theta',\theta,k)$ cannot be a function of $p := k(\theta-\theta')$ only. The Born inversion in practice reduces to choosing a $p\in\mathbb{R}^3$, finding $\theta,\theta'\in S^2$ and $k>0$ such that

\[
p = k(\theta-\theta'), \tag{6.180}
\]
and writing equation (6.176) as

\[
\tilde q(p) := \int\exp(ip\cdot x)\,q(x)\,dx = -4\pi\left[A(p) + \eta\right], \tag{6.181}
\]
where
\[
A(p) := -\frac{1}{4\pi}\int q(x)\exp(ip\cdot x)\,dx = -\frac{1}{4\pi}\,\tilde q(p), \tag{6.182}
\]
and $\eta$ is defined as

\[
\eta := A(\theta',\theta,k)\big|_{k(\theta-\theta')=p} - A(p). \tag{6.183}
\]
One then wishes to neglect $\eta$ and compute $q(x)$ by the formula
\[
q(x) = \frac{-4\pi}{(2\pi)^3}\int A(p)\exp(-ip\cdot x)\,dp. \tag{6.184}
\]
However, the data are the values $A(p)+\eta$ or, if the measurements are noisy, the values

\[
A(p) + \eta + \eta_1 := B_\delta(p), \tag{6.185}
\]

where η1 is noise and δ > 0 is defined by formula (6.186) below. The question is: assuming that δ > 0 is known such that

\[
|\eta + \eta_1| < \delta, \tag{6.186}
\]


how does one compute $q_\delta(x)$ such that

\[
\left|q_\delta(x) - q(x)\right| \le \epsilon(\delta) \to 0 \quad \text{as } \delta\to 0. \tag{6.187}
\]
In other words, how does one compute a stable approximation of $q(x)$ given noisy values of $A(p)$ as in (6.185)? This question is answered in [Ramm (1992)]. We present the answer here. Define

\[
q_\delta(x) := -\frac{4\pi}{(2\pi)^3}\int_{|p|\le R(\delta)} B_\delta(p)\exp(-ip\cdot x)\,dp, \qquad R(\delta) = c_0\,\delta^{-\frac{1}{2b}}, \tag{6.188}
\]
where the constants $c_0>0$ and $b>\frac{3}{2}$ will be specified below (see formulas (6.191) and (6.190)).

Theorem 6.5 The following stability estimate holds:

\[
\left|q_\delta(x) - q(x)\right| \le c_1\,\delta^{1-\frac{3}{2b}}, \tag{6.189}
\]
provided that

\[
\left|\tilde q(p)\right| \le c_2\left(1+|p|\right)^{-2b}, \qquad b > \frac{3}{2}. \tag{6.190}
\]

The constants c0 and c1 are given by the formulas

\[
c_0 = \left(\frac{c_2}{4\pi}\right)^{\frac{1}{2b}}, \tag{6.191}
\]

\[
c_1 = \left[\frac{2}{3\pi(4\pi)^{\frac{3}{2b}}} + \frac{1}{2\pi^2(2b-3)(4\pi)^{\frac{3-2b}{2b}}}\right]c_2^{\frac{3}{2b}}. \tag{6.192}
\]
Proof. Using (6.182), (6.185), (6.186) and (6.190), one obtains

\[
\begin{aligned}
\left|q_\delta(x)-q(x)\right| &\le \frac{4\pi}{(2\pi)^3}\left|\int_{|p|\le R(\delta)}\left[B_\delta(p)-A(p)\right]\exp(-ip\cdot x)\,dp - \int_{|p|>R(\delta)}\exp(-ip\cdot x)\,A(p)\,dp\right| \\
&\le \frac{2}{3\pi}\,\delta R^3 + \frac{c_2\,R^{3-2b}}{2\pi^2(2b-3)} := \phi(\delta,R).
\end{aligned}\tag{6.193}
\]


For a fixed δ > 0, minimize φ(δ, R) in R to get

1 1 3 c2 2b φ = c δ − 2b if R(δ) = (6.194) min 1 4πδ   where c1 is given by (52). Theorem 6.5 is proved. 

The practical conclusions, which follow from Theorem 6.5, are:

1) The Born inversion needs a regularization. One way to use a regulariza- tion is given by formula (6.188). If one would take the integral in (6.188) over all of R3, or over too large a ball, the error of the Born inversion might have been unlimited. 2) Even if the error η of the Born approximation for solving the direct scat- tering problem is small, it does not imply that the error of the Born inversion (that is the Born approximation for solving the inverse scatter- ing problem) is small.

The second conclusion can be obtained in a different way, a more gen- eral one. Let (q) = A( ), where is a nonlinear map which sends a B ∗ B potential q Q into a scattering amplitude A. The Born approximation is ∈ a linearization of ( ). Let us write it as ∗

0(q )(q q ) = A (q ). (6.195) B 0 − 0 − B 0

The inverse of the operator 0(q ) is unbounded on the space of functions B 0 with the sup norm. Therefore small in absolute value errors in the data may lead to large errors in the solution q q . The Born approximation − 0 is a linearization around q0 = 0. The distorted wave Born approximation is a linearization around the reference potential q0. In both cases the basic conclusion is the same: without regularization the Born inversion may lead to large errors even if the perturbation q q is small (in which case Born’s − 0 approximation is accurate for solving the direct problem). Let us discuss another way to recover q(x) from A(θ0, θ, k) given for large k. This way has a computational advantage of the following nature. One does not need to find θ0, θ, and k such that (6.180) holds and one integrates over S2 S2 instead of R3 in order to recover q(x) stably from the given × noisy data Aδ(θ0, θ, k):

A (θ0, θ, k) A(θ0, θ, k) δ. (6.196) | δ − | ≤ February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Estimation and Scattering Theory 145

We start with the following known formula ([Saito (1982)])

2 2 lim k A(θ0, θ, k) exp ik(θ0 θ) x dθdθ0 = 2π x y − q(y)dy. k 2 2 { − · } − | − | →∞ S S Z Z Z (6.197) To estimate the rate of convergence in (6.197) one substitutes for A(θ0, θ, k) its expression (6.15) to get

k2 := dyq(y) exp ikθ0 (x y) dθ0 J 4π 2 { · − } − Z ZS exp ikθ (x y) dθ + exp( ikθ x)(φ 1)dθ 2 {− · − } 2 − · − ZS ZS  (6.198)

where φ is defined by (6.60). A simple calculation yields

sin(k x y ) exp ikθ (x y) dθ = 4π | − | . (6.199) 2 { · − } k x y ZS | − | Thus

sin2(k x y ) sin(k x y ) = 4π dyq(y) | − | + | − | J − x y 2 4π x y × Z  | − | | − | exp( ikθ x) [φ(y, θ, k) 1] dθ 2 − · − ZS  := + . (6.200) J1 J2 One has 1 cos(2k x y ) = 4π dyq(y) − | − | J1 − 2 x y 2 Z | − | 2 ∞ = 2π q(y) x y − dy 2π dr cos(2kr)Q(x, r) (6.201) − | − | − Z Z0 where

Q(x, r) := q(x + rα)dα, r = y x . (6.202) 2 | − | ZS Let us assume that

q Q := q : q = q, q + Dq + D2q Q . (6.203) ∈ 2 | | | | | | ∈  February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

146 Random Fields Estimation Theory

Then, integrating by parts, one obtains

∞ 1 ∞ ∂Q dr cos 2krQ(x, r) = − dr sin(2kr) 2k ∂r Z0 Z0 2 1 ∂Q ∞ ∂ Q = cos(2kr) ∞ dr cos(2kr) 4k2 ∂r 0 − ∂r2  Z0  c , k > 1. (6.204) ≤ k2 Here c = const > 0 does not depend on x:

2 ∞ ∂ Q c = max q(x + rα) dα + max dr , (6.205) 3 2 x∈R3 S2 |∇ | x R 0 ∂r r 0 Z ∈ Z ≥

2 ∂ Q a c (1 + x r )− , a > 3 (6.206) ∂r2 ≤ 1 | − |

and

∞ a ∞ a ∞ a dr(1 + x r )− dr (1 + x r )− dr (1 + x r )− 0 | − | ≤ 0 || | − | ≤ || | − | Z Z Z−∞ ∞ a 2 dr(1 + r)− c (6.207) ≤ ≤ 2 Z0 so that the right-hand side of (6.205) is bounded uniformly in x R3. Thus ∈

2 2 = 2π x y q(y)dy + O(k− ), k (6.208) J1 − | − | → ∞ Z provided (6.203) holds. 2 3 Note that if q Lloc(R ) and no a priori information about its smooth- ∈ 2 ness is known, then one obtains only o(1) in place of O(k− ) in (6.208). From (1.72) it follows that

1 ck− , k > 1. (6.209) J2 ≤ Thus, assuming (6.203),

2 1 = 2π x y − q(y)dy + O(k− ), k (6.210) J − | − | → ∞ Z 1 where O(k− ) is uniform in x. February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

Estimation and Scattering Theory 147

Therefore (6.197) can be written as

2 k A(θ0, θ, k) exp ik(θ0 θ) x dθdθ0 2 2 { − · } ZS ZS 2 1 = 2π x y − q(y)dy + O(k− ). (6.211) − | − | Z The equation

2 3 2π x y − q(y)dy = f(x), x R (6.212) − | − | ∈ Z is solvable analytically. Take the Fourier transform of (6.212) to get 1 q˜(p) = p f˜(p). (6.213) −4π3 | | Here we have used the formula π 2 ∞ x 2 := x − exp(ip x)dx = 2π dr exp(i p r cos θ) sin θdθ | | | | · 0 0 | | Z 2 Z Z ∞ sin p r 2π g = 4π dr | | = . (6.214) p r p Z0 | | | | thus 1 q(x) = − exp( ip x) p f˜(p)dp. (6.215) 32π6 − · | | Z 1 Assume for a moment that the term O(k− ) in (71) is absent. Then, ap- plying formula (6.215), taking f(x) to be the left-hand side of (6.211), and taking the Fourier transform of f˜(p), one would obtain 1 q(x) = dp p exp( ip x) −32π6 | | − · × Z 2 3 k A(θ0, θ, k)(2π) δ [p + k(θ0 θ)] dθdθ0 2 2 −  ZS ZS  k3 = 3 dθdθ0A(θ0, θ, k) θ0 θ exp ik(θ0 θ) x (6.216). −4π 2 2 | − | { − · } ZS ZS

This formula appeared in [Somersalo, E. et al. (1988)]. If Aδ is known in place of A, and (6.196) holds, then formula (6.216), with Aδ in place of A, gives k3 qδ(x) := 3 dθdθ0Aδ(θ0, θ, k) θ0 θ exp ik(θ0 θ) x . (6.217) − 4π 2 2 | − | { − · } ZS ZS February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book

148 Random Fields Estimation Theory

1 Neglecting the term O(k− ) in (6.211), one obtains

3 3 1 3 16 q qδ δk dθdθ0 θ θ0 (4π )− = δk , (6.218) | − | ≤ 2 2 | − | 3π ZS ZS where we have used the formula:

θ θ0 dθdθ0 = dθ θ θ0 dθ0 = 4π θ θ0 dθ0 2 2 | − | 2 2 | − | 2 | − | ZS ZS ZS ZS ZS π 64π2 = 8π2 2 2 cos γ sin γdγ = . (6.219) 0 − 3 Z p 1 It is now possible to take into account the term O(k− ) in (6.211). Note that, as follows from (6.72) and (6.200),

1 1 ∞ 1 2 dy q(y) dθ q(x rθ) dr + o(k− ). (6.220) J ≤ 2k | | x y 2 | − | Z | − | ZS Z0 One has

1 ∞ 2 a/2 c dy q(y) x y − dθ 1 + x rθ − dr, a > 3 (6.221) | || − | 2 | − | Z Z0 ZS  and π a/2 sin γdγ 1 + x rθ 2 − dθ = 2π a/2 2 | − | 2 2 ZS Z0 (1 + x + r 2 x r cos γ)  | | | | − | | 1 dt = 2π a/2 1 (1 + x 2 + r 2 2 x rt) Z− | | | | − | |

$$= 2\pi\,\frac{\bigl[1+(|x|-r)^2\bigr]^{-a/2+1} - \bigl[1+(|x|+r)^2\bigr]^{-a/2+1}}{2|x|r\,\bigl(\tfrac{a}{2}-1\bigr)}. \quad (6.222)$$

Moreover,

$$\int_0^\infty\frac{\bigl[1+(|x|-r)^2\bigr]^{-a/2+1} - \bigl[1+(|x|+r)^2\bigr]^{-a/2+1}}{r}\,dr \le \frac{c}{|x|},\qquad |x|\ge 1, \quad (6.223)$$

where $c>0$ is a constant. Therefore

$$\mathcal{J}_2 \le \frac{ck^{-1}}{1+|x|^2}. \quad (6.224)$$

Estimation and Scattering Theory 149

This means that the $L^2(\mathbb{R}^3)$ norm of $\mathcal{J}_2$ as a function of $x$ is $O(k^{-1})$ as $k\to\infty$. Therefore, if one takes into account the $O(k^{-1})$ term in (6.211), one obtains in place of (6.218) the following estimate

$$\|q-q_\delta\|_{L^2(\mathbb{R}^3)} \le c\bigl(\delta k^3 + k^{-1}\bigr),\qquad c = \mathrm{const} > 0. \quad (6.225)$$

Minimization of the right-hand side of (6.225) in $k$ yields

$$\|q-q_\delta\|_{L^2(\mathbb{R}^3)} \le c_1\delta^{1/4}\qquad\text{for } k = k_m = (3\delta)^{-1/4}, \quad (6.226)$$

where $k_m = k_m(\delta)$ is the minimizer of the right-hand side of (6.225) on the interval $k > 0$. Therefore, if the data $A_\delta(\theta',\theta,k)$ are noisy, so that (6.196) holds, one should not take $k$ in formula (6.217) too large. The quasioptimal $k$ is given in (6.226), and formula (6.217) with $k = k_m$ gives a stable approximation of $q(x)$.

Let us finally discuss (6.225). The term $ck^{-1}$ has already been discussed. The first term $c\delta k^3$ has been discussed for the estimate in the sup norm. In the case of the $L^2$ norm one has to estimate the $L^2(\mathbb{R}^3)$ norm of the function

$$h(x) := \int_{S^2}\int_{S^2}a(\theta',\theta)\exp\{ik(\theta'-\theta)\cdot x\}\,d\theta'\,d\theta,\qquad a := A - A_\delta, \quad (6.227)$$

given that

$$|a| \le \delta,\qquad |\nabla_\theta a| + |\nabla_{\theta'}a| \le m_1, \quad (6.228)$$

where $\nabla_\theta$, $\nabla_{\theta'}$ are the first derivatives in $\theta$ and $\theta'$, and $m_1 = \mathrm{const} > 0$. In order to estimate $h$ we use the following formula [Ramm (1986), p. 54]:

$$\int_{S^2}\exp(ikr\theta\cdot\alpha)f(\theta)\,d\theta = \frac{2\pi i}{k}\left[\frac{\exp(-ikr)}{r}f(-\alpha) - \frac{\exp(ikr)}{r}f(\alpha)\right] + o\!\left(\frac{1}{r}\right),\qquad r\to+\infty,\ k>0,\ \alpha\in S^2. \quad (6.229)$$

This formula is proved under the assumption $f\in C^1(S^2)$. From (6.227)-(6.229) one obtains

$$|h(x)| \le \frac{c\delta}{1+|x|^2}. \quad (6.230)$$

Thus

$$\|h\|_{L^2(\mathbb{R}^3)} \le c\delta. \quad (6.231)$$


By $c$ we denote various positive constants. From (6.231) one obtains the first term in (6.225) for the case of the estimate in the $L^2(\mathbb{R}^3)$ norm.
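Two of the constants above can be checked numerically: the sphere integral $\int_{S^2\times S^2}|\theta-\theta'|\,d\theta\,d\theta' = 64\pi^2/3$ used in (6.218)-(6.219), and the minimizer $k_m = (3\delta)^{-1/4}$ of the bound in (6.225). This is a sketch, not from the book; the sample size, grid, and value of $\delta$ are illustrative:

```python
import math, random

random.seed(0)

def rand_unit():
    """Uniform direction on S^2 via normalized Gaussians."""
    while True:
        v = (random.gauss(0, 1), random.gauss(0, 1), random.gauss(0, 1))
        nrm = math.sqrt(sum(c * c for c in v))
        if nrm > 1e-12:
            return tuple(c / nrm for c in v)

# Monte Carlo for \int_{S^2 x S^2} |theta - theta'| dtheta dtheta';
# the exact value from (6.219) is 64*pi^2/3.
N = 100_000
acc = 0.0
for _ in range(N):
    acc += math.dist(rand_unit(), rand_unit())
I = (4 * math.pi) ** 2 * acc / N
print(I, 64 * math.pi ** 2 / 3)   # agree to well under 1 percent

# Grid minimization of the bound delta*k^3 + 1/k from (6.225);
# the minimizer is k_m = (3*delta)^(-1/4), cf. (6.226).
delta = 1e-4
ks = [10 ** (i / 1000) for i in range(-1000, 3001)]
k_best = min(ks, key=lambda k: delta * k ** 3 + 1.0 / k)
print(k_best, (3 * delta) ** (-0.25))  # close up to the grid resolution
```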

6.3 Estimation theory and inverse scattering in $\mathbb{R}^3$

Consider for simplicity the filtering problem which is formulated in Chapter 2. Let

$$\mathcal{U} = s(x) + n(x),\qquad x\in\mathbb{R}^3, \quad (6.232)$$

where the useful signal $s(x)$ has the properties

$$\overline{s(x)} = 0,\qquad \overline{s^*(x)s(y)} = R_s(x,y), \quad (6.233)$$

$$\overline{s(x)n(y)} = 0, \quad (6.234)$$

and the noise is white

$$\overline{n(x)} = 0,\qquad \overline{n^*(x)n(y)} = \delta(x-y). \quad (6.235)$$

In this section the star stands for complex conjugate and the bar stands for the mean value. The optimal linear estimate of $s(x)$ is given by

$$\hat s(x) = \int_D h(x,y)\,\mathcal{U}(y)\,dy. \quad (6.236)$$

Here optimality is understood in the sense of minimum of the variance of the error of the estimate, as in Chapter 2. Other notations are also the same as in Chapter 2. In particular, $D\subset\mathbb{R}^3$ is a bounded domain in which $\mathcal{U}$ is observed. It is proved in Chapter 2 that the optimal $h(x,y)$ solves equation (2.11), which in the present case is of the form:

$$Rh := h(x,z) + \int_D R_s(z,y)h(x,y)\,dy = R_s(z,x),\qquad x,z\in D, \quad (6.237)$$

or, after interchanging $z\to y$ and $y\to z$, this equation takes the form

$$h(x,y) + \int_D R_s(y,z)h(x,z)\,dz = R_s(y,x),\qquad x,y\in D. \quad (6.238)$$

Note that under the assumptions (6.233)-(6.235) one has

$$R(x,y) = R_s(x,y) + \delta(x-y),\qquad f(x,y) = R_s(x,y), \quad (6.239)$$


where $R(x,y)$ and $f(x,y)$ are defined in Section 2.3 (II.1.3). The basic equation (6.238) is a Fredholm equation of the second kind with an operator $R$, $R\ge I$, which is positive definite in $L^2(D)$. It is uniquely solvable in $L^2(D)$ (that is, it has a solution in $L^2(D)$ and the solution is unique). There are many methods to solve (6.238) numerically. In particular, an iterative method can easily be constructed for solving (6.238); this method converges as a geometric series (see Section 3.2, Lemma 3.1). Projection methods can also be constructed and applied to (6.238) (see Section 3.2, Lemma 3.3). In [Levy and Tsitsiklis (1985)] and [Yagle (1988)] attempts are made to use for the numerical solution of equation (6.238) some analogues of Levinson's recursion, which was used in one-dimensional problems when $R_s(x,y) = R_s(x-y)$.

In one-dimensional problems causality plays an important role. In three-dimensional problems causality plays no role: space is isotropic, in contrast with time. Therefore, in order to use ideas similar to Levinson's recursion, one needs to assume that the domain $D$ is parametrized by one parameter. In [Levy and Tsitsiklis (1985)] the authors assume $D$ to be a disc (so that the domain is determined by the radius of the disc; this radius is the parameter mentioned above). Of course, one has to impose severe restrictions on the correlation function $R_s(x,y)$. In [Levy and Tsitsiklis (1985)] it is assumed that

$$R_s(x,y) = R_s(|x-y|). \quad (6.240)$$

This means that $s(x)$ is an isotropic random field. In [Yagle (1988)] the case is considered when $D\subset\mathbb{R}^3$ is a ball and $R_s(x,y)$ solves the equation

$$\Delta_x R_s(x,y) = \Delta_y R_s(x,y), \quad (6.241)$$

where $\Delta_x$ is the Laplacian in $x$. If $R_s(x,y) = R_s(x-y)$ then (6.241) holds. Let us derive a differential equation for the optimal filter $h(x,y)$. The derivation is similar to the one given in subsection 8.4.8 for Kalman filters. Let us apply the operator $\Delta_x - \Delta_y$ to (6.238), assuming (6.241) and taking


$$D = \{z : |z|\le|x|\}:$$

$$0 = (\Delta_x-\Delta_y)h(x,y) + \Delta_x\int_{|z|\le|x|}h(x,z)R_s(y,z)\,dz - \int_{|z|\le|x|}h(x,z)\Delta_z R_s(y,z)\,dz$$

$$= (\Delta_x-\Delta_y)h(x,y) + \int_{|z|\le|x|}(\Delta_x-\Delta_z)h(x,z)R_s(y,z)\,dz - \mathcal{J}, \quad (6.242)$$

where, as we will prove,

$$\mathcal{J} := r^2\int_{S^2}R_s(y,r\beta)\,q(x,r\beta)\,d\beta,\qquad r = |x|. \quad (6.243)$$

Here $S^2$ is the unit sphere in $\mathbb{R}^3$ and

$$q = q(r,\alpha,\beta) := q(x,r\beta) := -2\frac{d}{dr}h(r\alpha,r\beta) - \frac{4}{r}h(r\alpha,r\beta) = -\frac{2}{r^2}\frac{d}{dr}\bigl[r^2h(r\alpha,r\beta)\bigr],\qquad x = r\alpha. \quad (6.244)$$

Let us prove (6.243) and (6.244). Integrate by parts the second integral in (6.242) to get

$$\int_{|z|\le|x|}h(x,z)\Delta_z R_s(y,z)\,dz = \int_{|z|\le|x|}\Delta_z h(x,z)R_s(y,z)\,dz + \int_{|z|=r}\left[h(x,z)\frac{\partial R_s(y,z)}{\partial|z|} - R_s(y,z)\frac{\partial h(x,z)}{\partial|z|}\right]ds. \quad (6.245)$$

The first integral in (6.242) can be written as

$$\mathcal{J}_1 := \left(\frac{d^2}{dr^2} + \frac{2}{r}\frac{d}{dr} + \frac{\Delta^*}{r^2}\right)\int_0^r d\rho\,\rho^2\int_{S^2}h(r\alpha,\rho\beta)R_s(y,\rho\beta)\,d\beta, \quad (6.246)$$

where $\Delta^*$ is the angular part of the Laplacian and $z = \rho\beta$, with $\rho = |z|$


and $\beta\in S^2$. One has

$$\mathcal{J}_1 = \frac{\Delta^*}{r^2}\int_0^r d\rho\,\rho^2\int_{S^2}h(r\alpha,\rho\beta)R_s(y,\rho\beta)\,d\beta + 2r\int_{S^2}h(r\alpha,r\beta)R_s(y,r\beta)\,d\beta$$
$$+ \frac{2}{r}\int_0^r d\rho\,\rho^2\int_{S^2}\frac{\partial h(r\alpha,\rho\beta)}{\partial r}R_s(y,\rho\beta)\,d\beta + \frac{d}{dr}\left[r^2\int_{S^2}h(r\alpha,r\beta)R_s(y,r\beta)\,d\beta\right]$$
$$+ \frac{d}{dr}\int_0^r d\rho\,\rho^2\int_{S^2}\frac{\partial h(r\alpha,\rho\beta)}{\partial r}R_s(y,\rho\beta)\,d\beta$$

$$= \int_{|z|\le r}\Delta_x h(x,z)R_s(y,z)\,dz + 4r\int_{S^2}h(r\alpha,r\beta)R_s(y,r\beta)\,d\beta + r^2\int_{S^2}\frac{d}{dr}\bigl[h(r\alpha,r\beta)\bigr]R_s(y,r\beta)\,d\beta$$
$$+ r^2\int_{S^2}h(r\alpha,r\beta)\frac{d}{dr}R_s(y,r\beta)\,d\beta + r^2\int_{S^2}\left.\frac{\partial h(r\alpha,\rho\beta)}{\partial r}\right|_{\rho=r}R_s(y,r\beta)\,d\beta. \quad (6.247)$$

From (6.245) and (6.247) one obtains

$$-\mathcal{J} = r^2\int_{S^2}\left[-h(r\alpha,r\beta)\frac{\partial}{\partial r}R_s(y,r\beta) + R_s(y,r\beta)\left.\frac{\partial h(r\alpha,\rho\beta)}{\partial\rho}\right|_{\rho=r}\right]d\beta + 4r\int_{S^2}h(r\alpha,r\beta)R_s(y,r\beta)\,d\beta$$
$$+ r^2\int_{S^2}\left[\frac{d}{dr}\bigl[h(r\alpha,r\beta)\bigr]R_s(y,r\beta) + h(r\alpha,r\beta)\frac{d}{dr}R_s(y,r\beta) + \left.\frac{\partial h(r\alpha,\rho\beta)}{\partial r}\right|_{\rho=r}R_s(y,r\beta)\right]d\beta$$
$$= 2r^2\int_{S^2}\frac{d}{dr}\bigl[h(r\alpha,r\beta)\bigr]R_s(y,r\beta)\,d\beta + 4r\int_{S^2}h(r\alpha,r\beta)R_s(y,r\beta)\,d\beta = -r^2\int_{S^2}R_s(y,r\beta)q(x,r\beta)\,d\beta, \quad (6.248)$$

where $q$ is given by (6.244). In order to derive a differential equation for $h$, let us assume that

$$R_s(x,y) = R_s(y,x), \quad (6.249)$$


and write equation (6.238) for $D = \{z : |z|\le|x|\}$ as

$$\mathcal{H}(x,y) + \int_{|z|\le|x|}R_s(y,z)\mathcal{H}(x,z)\,dz = R_s(y,x),\qquad |y|\le|x|. \quad (6.250)$$

Note the restriction $|y|\le|x|$ in (6.250). Multiply (6.250) by $q(r\alpha,r\beta)$, set in (6.250) $x = r\beta$, $r = |x|$, integrate over $S^2$ in $\beta$, and then multiply by $r^2$ to get

$$r^2\int_{S^2}\mathcal{H}(r\beta,y)q(r\alpha,r\beta)\,d\beta + \int_{|z|\le|x|}R_s(y,z)\,r^2\int_{S^2}\mathcal{H}(r\beta,z)q(r\alpha,r\beta)\,d\beta\,dz = r^2\int_{S^2}R_s(y,r\beta)q(r\alpha,r\beta)\,d\beta. \quad (6.251)$$

Define

$$(\Delta_x-\Delta_y)\mathcal{H}(x,y) := \phi(x,y) \quad (6.252)$$

and set $x = r\alpha$ in (6.252). Write equation (6.242) as

$$\phi(r\alpha,y) + \int_{|z|\le|x|}\phi(r\alpha,z)R_s(y,z)\,dz = r^2\int_{S^2}R_s(y,r\beta)q(r\alpha,r\beta)\,d\beta, \quad (6.253)$$

or, in operator form,

$$(I + R_s)\phi = \psi, \quad (6.254)$$

where ψ is the right-hand side of (6.253). Equation (6.251) is of the form

$$(I + R_s)\gamma = \psi, \quad (6.255)$$

where

$$\gamma := r^2\int_{S^2}\mathcal{H}(r\beta,y)q(r\alpha,r\beta)\,d\beta. \quad (6.256)$$

Since the operator $I + R_s \ge I$ is injective, it follows from (6.254) and (6.255) that $\phi = \gamma$. Thus

$$(\Delta_x-\Delta_y)\mathcal{H}(x,y) = r^2\int_{S^2}\mathcal{H}(r\beta,y)q(r\alpha,r\beta)\,d\beta,\qquad |y|\le|x| = r,\ x = r\alpha, \quad (6.257)$$

where $\alpha,\beta\in S^2$ and $q$ is given by (6.244) with $h(r\alpha,r\beta) = \mathcal{H}(r\alpha,r\beta)$. Let us formulate the result:


Lemma 6.8 If $\mathcal{H}(x,y)$ solves equation (6.250) and the assumptions (6.241) and (6.249) hold, then $\mathcal{H}$ solves equation (6.257) with $q(r\alpha,r\beta)$ given by (6.244), where $h(r\alpha,r\beta) = \mathcal{H}(r\alpha,r\beta)$, $\alpha,\beta\in S^2$, $|x| = r$, $x = r\alpha$.

If one defines

$$\mathcal{H}(x,y) = 0\quad\text{for } |y| > |x| \quad (6.258)$$

and puts

$$\tilde{\mathcal{H}}(x,\xi) = (2\pi)^{-3/2}\int\mathcal{H}(x,y)\exp(i\xi\cdot y)\,dy,\qquad \int := \int_{\mathbb{R}^3}, \quad (6.259)$$

where the integral in (6.259) is actually taken over the ball $|y|\le|x|$ because of (6.258), then

$$\mathcal{H}(x,y) = (2\pi)^{-3/2}\int\tilde{\mathcal{H}}(x,\xi)\exp(-i\xi\cdot y)\,d\xi. \quad (6.260)$$

Substitute (6.260) into (6.257) (or, which is the same, Fourier transform (6.257) in the variable $y$) to get

$$(\Delta_x + \xi^2)\tilde{\mathcal{H}}(x,\xi) = r^2\int_{S^2}\tilde{\mathcal{H}}(r\beta,\xi)q(r\alpha,r\beta)\,d\beta,\qquad x = r\alpha. \quad (6.261)$$

Equation (6.261) is a Schrödinger equation with a non-local potential

$$Q\tilde{\mathcal{H}} := r^2\int_{S^2}\tilde{\mathcal{H}}(r\beta,\xi)q(r\alpha,r\beta)\,d\beta. \quad (6.262)$$

Suppose that $\mathcal{H}(x,y)$ is computed for $|y|\le|x|\le a$. Given this $\mathcal{H}(x,y)$, how does one compute the solution $h(x,y)$ to equation (6.238) with $D = B_a = \{x : |x|\le a\}$? Write equation (6.238) for $D = B_a$ and $z = \rho\beta$ as

$$h(x,y,a) + \int_0^a\int_{S^2}R_s(y,\rho\beta)h(x,\rho\beta,a)\,d\beta\,\rho^2d\rho = R_s(y,x). \quad (6.263)$$

Differentiate (6.263) in $a$ to get

$$\frac{\partial h}{\partial a} + \int_0^a\int_{S^2}R_s(y,\rho\beta)\frac{\partial h(x,\rho\beta,a)}{\partial a}\,d\beta\,\rho^2d\rho = -a^2\int_{S^2}R_s(y,a\beta)h(x,a\beta,a)\,d\beta. \quad (6.264)$$


Let $x = a\alpha$ in (6.263). Multiply (6.263) by $-a^2h(z,a\alpha,a)$ and integrate over $S^2$ to get:

$$-a^2\int_{S^2}h(a\alpha,y,a)h(z,a\alpha,a)\,d\alpha + \int_0^a\int_{S^2}R_s(y,\rho\beta)\left[-a^2\int_{S^2}h(a\alpha,\rho\beta,a)h(z,a\alpha,a)\,d\alpha\right]d\beta\,\rho^2d\rho$$
$$= -a^2\int_{S^2}R_s(y,a\alpha)h(z,a\alpha,a)\,d\alpha. \quad (6.265)$$

The operator $I + R_s$ is injective. Therefore equations (6.264) and (6.265) have the same solution, since their right-hand sides are the same. Set $z = x$ and $\alpha = \beta$ in (6.265), compare (6.264) and (6.265), and get

$$\frac{\partial h(x,y,a)}{\partial a} = -a^2\int_{S^2}h(a\beta,y,a)h(x,a\beta,a)\,d\beta. \quad (6.266)$$

Note that

$$h(x,y) = \mathcal{H}(x,y)\qquad\text{for } |y|\le|x| \quad (6.267)$$

according to equation (6.250). Therefore (6.266) can be written as

$$\frac{\partial h(x,y,a)}{\partial a} = -a^2\int_{S^2}\mathcal{H}(a\beta,y,a)h(x,a\beta,a)\,d\beta. \quad (6.268)$$

Equation (6.268) can be used for computing the function $h(x,y,a)$ for all $x,y\in B_a$, given $\mathcal{H}(x,y)$ for $|y|\le|x|\le a$. The value $h(x,a\beta,a)$ can be computed from equation (6.263):

$$h(x,a\beta,a) + \int_0^a\int_{S^2}R_s(a\beta,\rho\theta)h(x,\rho\theta,a)\,d\theta\,\rho^2d\rho = R_s(a\beta,x). \quad (6.269)$$

The function $R_s(a\beta,z)$ is known for all $z,\beta$, and the function $h(x,\rho\theta,a)$, $\rho < a$, is assumed to be computed recursively, as $a$ grows, by equation (6.268). Namely, let us assume that $\mathcal{H}(x,y)$ is computed for all values $|y|\le|x|\le A$, and one wants to compute $h(x,y)$ for all $x,y\in B_A := \{x : |x|\le A\}$. From (6.268) one has

$$h(x,y,(m+1)\tau) = h(x,y,m\tau) - \tau(m\tau)^2\int_{S^2}\mathcal{H}(m\tau\beta,y,m\tau)h(x,m\tau\beta,m\tau)\,d\beta. \quad (6.270)$$

Here $m = 0,1,2,\dots$, and $\tau > 0$ is a small number, the step of the increment of $a$. It follows from (6.263) that

$$h(x,y,0) = R_s(y,x), \quad (6.271)$$


so that

$$h(x,y,\tau) = R_s(y,x), \quad (7.272)$$

$$h(x,y,2\tau) = R_s(y,x) - \tau^3\int_{S^2}\mathcal{H}(\tau\beta,y,\tau)h(x,\tau\beta,\tau)\,d\beta, \quad (6.273)$$

and so on. One can assume that $|y| > |x|$, because for $|y|\le|x|$ one can use (6.267).

A formal connection of the estimation problem with scattering theory can be outlined as follows. Let us assume that there exists a function $\mathcal{H}(x,y)$, $\mathcal{H}(x,y) = 0$ for $|y| > |x|$, such that the function $\phi(x,\theta,k)$ defined by the formula

$$\phi(x,\theta,k) := \exp(ik\theta\cdot x) - \int_{|y|\le|x|}\exp(ik\theta\cdot y)\mathcal{H}(x,y)\,dy \quad (6.274)$$

is a solution to the Schrödinger equation

$$\bigl(\Delta + k^2 - q(x)\bigr)\phi = 0, \quad (6.275)$$

where $\Delta = \nabla^2$ is the Laplacian. This assumption is not justified presently, so our argument is formal. Taking the inverse Fourier transform of (6.274) in the variable $k\theta$, one obtains

$$-\frac{1}{(2\pi)^3}\int_0^\infty dk\,k^2\int_{S^2}\bigl[\phi(x,\theta,k) - \exp(ik\theta\cdot x)\bigr]\exp(-ik\theta\cdot y)\,d\theta = \mathcal{H}(x,y). \quad (6.276)$$

Compute $(\Delta_x-\Delta_y)\mathcal{H}$ formally, taking the derivatives under the integral signs on the left-hand side of (6.276) and using (6.275). The result is

$$(\Delta_x-\Delta_y)\mathcal{H}(x,y) = q(x)\mathcal{H}(x,y). \quad (6.277)$$

One is interested in the solution of (6.277) with the property $\mathcal{H}(x,y) = 0$ for $|y| > |x|$. Define $\tilde{\mathcal{H}}(x,\xi)$ by formula (6.259). Substitute (6.260) into (6.277) and differentiate in $y$ formally under the integral sign to get

$$\bigl(\Delta_x + \xi^2 - q(x)\bigr)\tilde{\mathcal{H}}(x,\xi) = 0. \quad (6.278)$$

Therefore, comparing (6.278) and (6.261), one concludes that the right-hand side of (6.261) reduces to $q(x)\tilde{\mathcal{H}}(x,\xi)$. This means that

$$q(r\alpha,r\beta) = \frac{1}{r^2}\delta(\alpha-\beta)q(x),\qquad x = r\alpha, \quad (6.279)$$


where $\delta(\alpha-\beta)$ is the delta-function. If (6.279) holds, then the non-local potential $Q$ defined by (6.262) reduces to the local potential $q(x)$. Equations (6.244) and (6.279) imply

$$-2\frac{d}{dr}\bigl[r^2\mathcal{H}(r\alpha,r\beta)\bigr] = \delta(\alpha-\beta)q(r\alpha). \quad (6.280)$$

Note that $h(r\alpha,r\beta) = \mathcal{H}(r\alpha,r\beta)$. From (6.280) one obtains

$$\mathcal{H}(r\alpha,r\beta) = -\frac{\delta(\alpha-\beta)}{2r^2}\int_0^r q(\rho\alpha)\,d\rho + R_s(0,0), \quad (6.281)$$

where we have used the equation

$$\mathcal{H}(0,0) = R_s(0,0), \quad (6.282)$$

which follows from (6.250). Let us summarize the basic points of this section:

1) the solution to equation (6.250) solves equation (6.257) provided that the assumptions (6.241) and (6.249) hold;
2) the solution to equation (6.238) with $D = B_a$ is related to the solution to equation (6.250) for $|x|\le a$ by the formulas (6.267) and (6.268);
3) if the solution to equation (6.250) is found, then one can compute the solution to equation (6.238) with $D = B_a$ recursively using (6.270);
4) the solution to equation (6.250) solves the differential equation (6.257), and its Fourier transform solves the Schrödinger equation (6.261) with a non-local potential.
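Independently of the recursion above, the unique solvability of the basic equation (6.238) makes a direct Nyström-type discretization (quadrature nodes plus a dense linear solve) a natural numerical approach. The sketch below is not from the book: the Gaussian kernel $R_s$, the Monte Carlo quadrature over the ball, and all sizes are illustrative assumptions.

```python
import math, random

random.seed(1)

# Monte Carlo quadrature nodes in the ball B_a (illustrative choice)
a, N = 1.0, 150
nodes = []
while len(nodes) < N:
    p = tuple(random.uniform(-a, a) for _ in range(3))
    if math.dist(p, (0.0, 0.0, 0.0)) <= a:
        nodes.append(p)
w = (4.0 / 3.0) * math.pi * a ** 3 / N        # equal weights, |B_a| / N

def R_s(u, v):                                # assumed covariance kernel
    return math.exp(-math.dist(u, v) ** 2)

x = (0.2, 0.0, 0.0)                           # fixed first argument of h(x, .)

# Discretized (6.238): h(x,y) + sum_z w R_s(y,z) h(x,z) = R_s(y,x),
# i.e. the SPD system (I + wK) h = f, solved by Gaussian elimination.
A = [[(1.0 if i == j else 0.0) + w * R_s(nodes[i], nodes[j]) for j in range(N)]
     for i in range(N)]
b = [R_s(y, x) for y in nodes]
for c in range(N):                            # forward elimination (no pivoting: SPD)
    piv = A[c][c]
    for r2 in range(c + 1, N):
        m = A[r2][c] / piv
        for c2 in range(c, N):
            A[r2][c2] -= m * A[c][c2]
        b[r2] -= m * b[c]
h = [0.0] * N
for r2 in range(N - 1, -1, -1):               # back substitution
    s = b[r2] - sum(A[r2][c2] * h[c2] for c2 in range(r2 + 1, N))
    h[r2] = s / A[r2][r2]

# residual of the discretized equation, recomputed from the kernel
res = max(abs(h[i] + w * sum(R_s(nodes[i], nodes[j]) * h[j] for j in range(N))
              - R_s(nodes[i], x)) for i in range(N))
print(res)  # roundoff level
```

The Monte Carlo weights converge slowly in $N$; a product Gauss rule over radius and sphere would be the standard refinement, but the linear-algebra structure is the same.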

Chapter 7 Applications

In this Chapter a number of questions arising in applications are discussed. All sections of this Chapter are independent and can be read separately.

7.1 What is the optimal size of the domain on which the data are to be collected?

Suppose that one observes the signal

$$\mathcal{U}(x) = s(x) + n(x),\qquad x\in D\subset\mathbb{R}^r, \quad (7.1)$$

in a bounded domain $D$ which contains a point $x_0$. Assume for simplicity that $D$ is a ball $B_\ell$ with radius $\ell$ centered at $x_0$. The problem is to estimate $s(x_0)$. As always in this book, $s(x)$ is a useful signal and $n(x)$ is noise. It is clear that if the radius $\ell$, which characterizes the size of the domain of observation, is too large, then time and effort will be wasted in collecting data which do not improve the quality of the estimate significantly. On the other hand, if $\ell$ is too small, then one can improve the estimate by using more data. What is the optimal $\ell$? This question is of interest in geophysics and many other applications.

Let us answer this question using the estimation theory developed in Chapter 2. We assume that the optimal estimate is linear and the optimization criterion is minimum of variance. We also assume that the data are the covariance functions (1.3), that condition (1.2) holds, and that $R(x,y)\in\mathcal{R}$. The optimal estimate is given by formula (2.15). Let us assume for simplicity that $P(\lambda) = 1$, and

$$|R(x,y)| \le c\exp(-a|x-y|),\qquad c > 0,\ |x-y|\ge\epsilon > 0, \quad (7.2)$$



where the last inequality allows growth of $R(x,y)$ as $x\to y$, for example $R(x,y) = (4\pi|x-y|)^{-1}\exp(-a|x-y|)$. Here $c$ and $a$ are positive constants, $a^{-1}$ is the so-called correlation radius, and the function $f(x,y)$ defined by formula (1.3) is smooth. Concerning $f(x,y)$ we assume the same estimate as for $R(x,y)$:

$$|f(x,y)| \le c\exp(-a|x-y|),\qquad c > 0,\ |x-y|\ge\epsilon > 0. \quad (7.3)$$

Under these assumptions the optimal filter is of the form

$$h = Q(\mathcal{L})f + h_s = h_0 + h_s, \quad (7.4)$$

where $h_s$ is the singular part of $h$, which contains terms of the type $b(s)\delta_\Gamma^{(j)}$ (see Section 3.3), and $h_0$ is the regular part of $h$. The optimal estimate is of the form

$$\hat s(x_0) = \int_{B_\ell}h_0(x_0,y)\mathcal{U}(y)\,dy + \int_{B_\ell}h_s(x_0,y)\mathcal{U}(y)\,dy. \quad (7.5)$$

The optimal size ` of the domain B` of observation is the size for which the second term in (7.5) is negligible compared with the first.

Example 7.1 Suppose that $r = 3$, $x_0 = 0$, $R(x,y) = (4\pi|x-y|)^{-1}\exp(-a|x-y|)$, and $|\partial^jf|\le M$, $0\le|j|\le 2$, where $j$ is a multi-index. Then by formula (2.85) one has

$$h(y) = (-\Delta + a^2)f(y) + \left(\frac{\partial f}{\partial|y|} - \frac{\partial u}{\partial|y|}\right)\delta_\Gamma, \quad (7.6)$$

where $h(y) = h(0,y)$, $f(y) = f(0,y)$, $\Gamma = \{x : |x| = \ell\}$. Therefore the optimal estimate is

$$\hat s(0) = \int_{B_\ell}h(y)\mathcal{U}(y)\,dy = \int_{B_\ell}\bigl[(-\Delta+a^2)f\bigr]\mathcal{U}\,dy + \int_\Gamma\left(\frac{\partial f}{\partial|y|} - \frac{\partial u}{\partial|y|}\right)\mathcal{U}(s)\,ds, \quad (7.7)$$

and $u$ is uniquely determined by $f$ as the solution to the Dirichlet problem (2.22)-(2.23), which in our case is

$$(-\Delta + a^2)u = 0\ \text{ if } |x|\ge\ell,\qquad u = f\ \text{ if } |x| = \ell,\qquad u(\infty) = 0. \quad (7.8)$$

The solution to problem (7.8) can be calculated analytically. One gets

$$u(r,\theta) = \sum_{n=0}^\infty f_nY_n(\theta)\frac{h_n(iar)}{h_n(ia\ell)}, \quad (7.9)$$


where $\theta = (\vartheta,\varphi)$ is a point on the unit sphere $S^2$ in $\mathbb{R}^3$ (a unit vector), $\{Y_n(\theta)\}$ is the system of spherical harmonics orthonormalized in $L^2(S^2)$, $f_n := \int_{S^2}f(\ell,\theta)Y_n^*(\theta)\,d\theta$, and $h_n(r)$ is the spherical Hankel function, $h_n(r) := \left(\frac{\pi}{2r}\right)^{1/2}H^{(1)}_{n+1/2}(r)$, where $H^{(1)}_n(r)$ is the Hankel function. The solution (7.9) is a three-dimensional analogue of the solution (2.91).

The second integral on the right-hand side of (7.7) is of order of magnitude $O(\exp(-a\ell))$, while the first integral is of order of magnitude $O(1)$. Therefore, if we wish to be able to neglect the effect of the boundary term on the estimate with accuracy of about 5 percent, then we should choose $\ell = 3/a$. A practical recipe for choosing $\ell$, so that the magnitude of the boundary term in (7.7) is about $\gamma$ percent of the magnitude of the volume term, is

$$\ell = \frac{1}{a}\ln\frac{100}{\gamma}. \quad (7.10)$$
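The recipe (7.10) is trivial to evaluate; a minimal sketch (the value of $a$ is an illustrative assumption, not from the book):

```python
import math

def optimal_radius(a, gamma_percent):
    """Recipe (7.10): ell = (1/a) * ln(100 / gamma)."""
    return math.log(100.0 / gamma_percent) / a

a = 2.0                        # illustrative inverse correlation radius
ell = optimal_radius(a, 5.0)   # 5-percent boundary contribution
print(ell * a)                 # ln(20) ~ 3.0, recovering the ell = 3/a rule
```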

7.2 Discrimination of random fields against noisy background

Suppose that one observes a random field $\mathcal{U}(x)$ which can be of one of the forms

$$\mathcal{U}(x) = s_p(x) + n(x),\qquad p = 0,1. \quad (7.11)$$

Here $s_p(x)$, $p = 0,1$, are deterministic signals and $n(x)$ is a Gaussian random field with zero mean value:

$$\bar n = 0. \quad (7.12)$$

In particular, if $s_0 = 0$, then the problem is to decide whether the observed signal $\mathcal{U}(x)$ contains the signal $s_1(x)$ or is just noise. In order to formulate the discrimination problem analytically, we take as the optimality criterion the principle of maximum likelihood. Other optimal decision rules, such as the Neyman-Pearson or Bayes rules, could be treated similarly. Note that we assume in this section that the noise is Gaussian. This is done because under such an assumption one can calculate the likelihood ratio analytically. Let us first develop the basic tools for solving the discrimination problem. Let

$$R\phi_j := \int_D R(x,y)\phi_j(y)\,dy = \lambda_j\phi_j\quad\text{in } D. \quad (7.13)$$


Here

$$R(x,y) := \overline{n^*(x)n(y)}, \quad (7.14)$$

$$\lambda_1 \ge \lambda_2 \ge \cdots > 0, \quad (7.15)$$

where the $\lambda_j$ are the eigenvalues of the operator $R$, counted according to their multiplicities, and the $\phi_j$ are the corresponding eigenfunctions, normalized in $L^2(D)$. Let us define random variables $n_j$ by the formula:

$$n_j = \lambda_j^{-1/2}\int_D n(x)\phi_j^*(x)\,dx. \quad (7.16)$$

From (7.12) it follows that

$$\bar n_j = 0. \quad (7.17)$$

Moreover, since R(x, y) = R∗(y, x), one has

$$\overline{n_in_j^*} = (\lambda_i\lambda_j)^{-1/2}\int_D\int_D R(y,x)\phi_j(x)\phi_i^*(y)\,dx\,dy = (\lambda_i\lambda_j)^{-1/2}\int_D dx\,\phi_j(x)\lambda_i\phi_i^*(x) = \delta_{ij} = \begin{cases}1, & i = j,\\ 0, & i\ne j.\end{cases} \quad (7.18)$$

The random variables nj are called noncorrelated coordinates of the random field n(x). One has

$$n(x) = \sum_{j=1}^\infty\lambda_j^{1/2}n_j\phi_j(x). \quad (7.19)$$

The series in (7.19) converges in the mean. The random variables nj are Gaussian since the random field n(x) is Gaussian. Define

$$s_{pj} := \lambda_j^{-1/2}\int_D s_p(x)\phi_j^*(x)\,dx,\qquad p = 0,1. \quad (7.20)$$

Then

$$s_p(x) + n(x) = \sum_{j=1}^\infty\lambda_j^{1/2}\bigl[s_{pj} + n_j\bigr]\phi_j(x). \quad (7.21)$$

Let

$$\mathcal{U}_{pj} := s_{pj} + n_j. \quad (7.22)$$


Then

$$\overline{\mathcal{U}_{pj}} = s_{pj},\qquad \overline{|\mathcal{U}_{pj} - s_{pj}|^2} = 1, \quad (7.23)$$

and the $\mathcal{U}_{pj}$ are Gaussian. Let $H_p$ denote the hypothesis that the observed random field is $s_p(x) + n(x)$, $p = 0,1$, and let $f(u_1,\dots,u_n|H_p)$ be the probability density for the random variables $\mathcal{U}_{pj}$ under the assumption that the hypothesis $H_p$ occurred:

$$f(u_1,\dots,u_n|H_p) = (2\pi)^{-n}\exp\left(-\frac{1}{2}\sum_{j=1}^n|u_j - s_{pj}|^2\right). \quad (7.24)$$

Here we used equations (7.23). Since the $\mathcal{U}_{pj}$ are complex-valued, we took $(2\pi)^{-n}$ rather than $(2\pi)^{-n/2}$ as the normalizing constant in (7.24). The likelihood ratio is defined as

$$\ell(u_1,\dots,u_n) = \frac{f(u_1,\dots,u_n|H_1)}{f(u_1,\dots,u_n|H_0)}. \quad (7.25)$$

Therefore

$$\ln\ell(u_1,\dots,u_n) = -\frac{1}{2}\sum_{j=1}^n\bigl(|u_j - s_{1j}|^2 - |u_j - s_{0j}|^2\bigr) = \frac{1}{2}\sum_{j=1}^n\bigl(|s_{0j}|^2 - |s_{1j}|^2\bigr) + \sum_{j=1}^n\mathrm{Re}\,u_j\bigl(s_{1j}^* - s_{0j}^*\bigr). \quad (7.26)$$

We wish to compute the limit of the function (7.26) as $n\to\infty$. If this is done, one can formulate the decision rule based on the maximum likelihood principle. Note first that the system of eigenfunctions $\{\phi_j\}$ of the operator $R$ is complete in $L^2(D)$, since we have assumed that the selfadjoint operator $R : L^2(D)\to L^2(D)$ is positive, that is, $(R\phi,\phi) = 0$ implies $\phi = 0$ (see (7.15)). Indeed, since

$$L^2(D) = c\ell\{\mathrm{Ran}\,R\}\oplus N(R), \quad (7.27)$$

where $c\ell\{\mathrm{Ran}\,R\}$ is the closure of the range of $R$ and $N(R)$ is the null space of $R$, and since $N(R) = \{0\}$ by assumption (7.15), one concludes that the closure of the range of $R$ is the whole space $L^2(D)$. Thus the closure of the linear span of the eigenfunctions $\{\phi_j\}$ is $L^2(D)$, as claimed.
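The construction (7.16)-(7.19) of noncorrelated coordinates can be exercised numerically. The sketch below is not from the book: it assumes explicit eigenpairs on $D = [0,1]$ (cosine eigenfunctions with $\lambda_j = 1/j^2$), simulates the field through the expansion (7.19), recovers the $n_j$ via (7.16) by quadrature, and checks that their sample covariance is close to the identity, as (7.18) predicts.

```python
import math, random

random.seed(2)

J, M, T = 3, 150, 1500              # modes, grid points, Monte Carlo trials
xs = [i / M for i in range(M + 1)]
phi = [[math.sqrt(2) * math.cos((j + 1) * math.pi * x) for x in xs] for j in range(J)]
lam = [1.0 / (j + 1) ** 2 for j in range(J)]
sqlam = [math.sqrt(l) for l in lam]

def trapz(vals):                    # trapezoid rule on the uniform grid
    return (sum(vals) - 0.5 * (vals[0] + vals[-1])) / M

cov = [[0.0] * J for _ in range(J)]
for _ in range(T):
    g = [random.gauss(0, 1) for _ in range(J)]
    # a sample of the field via the expansion (7.19)
    n = [sum(sqlam[j] * g[j] * phi[j][i] for j in range(J)) for i in range(M + 1)]
    # noncorrelated coordinates n_j via (7.16) (real-valued case)
    nj = [trapz([n[i] * phi[j][i] for i in range(M + 1)]) / sqlam[j] for j in range(J)]
    for p in range(J):
        for q2 in range(J):
            cov[p][q2] += nj[p] * nj[q2] / T

print([[round(c, 2) for c in row] for row in cov])  # approximately the identity, cf. (7.18)
```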


Let us assume that

$$R(x,y)\in\mathcal{R}. \quad (7.28)$$

Define the function V (x) as the solution of minimal order of singularity of the equation

$$RV := \int_D R(x,y)V(y)\,dy = s_1(x) - s_0(x),\qquad x\in D. \quad (7.29)$$

Thus

$$V(x) = R^{-1}(s_1 - s_0). \quad (7.30)$$

Using Parseval's equality one gets

$$\sum_{j=1}^\infty\lambda_j^{-1}c_jb_j^* = \int_D c(y)\bigl(R^{-1}b(y)\bigr)^*\,dy, \quad (7.31)$$

where $c(y)$ and $b(y)$ are some functions for which the integral in (7.31) converges,

$$c_j = \int_D c(y)\phi_j^*\,dy,\qquad b_j = \int_D b(y)\phi_j^*(y)\,dy, \quad (7.32)$$

$$\lambda_j^{-1}b_j = \int_D\bigl(R^{-1}b(y)\bigr)\phi_j^*\,dy. \quad (7.33)$$

Therefore, using formulas (7.16), (7.19), (7.20), (7.22), (7.30) and (7.31), one obtains

$$\sum_{j=1}^\infty\mathcal{U}_j\bigl(s_{1j}^* - s_{0j}^*\bigr) = \int_D\mathcal{U}(y)V^*(y)\,dy. \quad (7.34)$$

Let us assume that

$$\int_D s_0\bigl(R^{-1}s_0\bigr)^*\,dx < \infty,\qquad \int_D s_1\bigl(R^{-1}s_1\bigr)^*\,dx < \infty. \quad (7.35)$$


Then, using the selfadjointness of $R^{-1}$, one gets

$$\sum_{j=1}^\infty\bigl(|s_{0j}|^2 - |s_{1j}|^2\bigr) = \int_D\bigl[s_0\bigl(R^{-1}s_0\bigr)^* - s_1\bigl(R^{-1}s_1\bigr)^*\bigr]\,dx$$
$$= -\int_D s_0V^*\,dx + \int_D s_0\bigl(R^{-1}s_1\bigr)^*\,dx - \int_D\bigl(R^{-1}s_1\bigr)s_1^*\,dx = -\int_D s_0V^*\,dx - \int_D\bigl[R^{-1}(s_1-s_0)\bigr]s_1^*\,dx$$
$$= -\int_D s_0V^*\,dx - \int_D s_1^*V\,dx. \quad (7.36)$$

Combining (7.36), (7.34) and (7.26) one obtains

Lemma 7.1 There exists the limit

$$\ln\ell(\mathcal{U}(x)) = \lim_{n\to\infty}\ln\ell(u_1,\dots,u_n) = \mathrm{Re}\int_D\mathcal{U}(x)V^*(x)\,dx - \frac{1}{2}\int_D s_0V^*\,dx - \frac{1}{2}\int_D s_1^*V\,dx. \quad (7.37)$$

If the signals $s_p$, $p = 0,1$, and the kernel $R(x,y)$ are real-valued, then (7.37) reduces to

$$\ln\ell(\mathcal{U}(x)) = \int_D\left[\mathcal{U}(x) - \frac{s_0(x)+s_1(x)}{2}\right]V(x)\,dx. \quad (7.38)$$

Suppose that the quantity on the right-hand side of equation (7.37) (or (7.38)) has been calculated, so that the quantity $\ln\ell(\mathcal{U}(x))$ is known. Then we use

The maximum likelihood criterion: if $\ln\ell(\mathcal{U}(x))\ge 0$, then the decision is that hypothesis $H_1$ occurred; otherwise hypothesis $H_0$ occurred.

Therefore, if

$$\mathrm{Re}\int_D\mathcal{U}(x)V^*(x)\,dx \ge \frac{1}{2}\int_D s_0(x)V^*(x)\,dx + \frac{1}{2}\int_D s_1^*(x)V(x)\,dx, \quad (7.39)$$

then $H_1$ occurred. Here $V$ is given by formula (7.30). If the opposite inequality holds in (7.39), then $H_0$ occurred. If $s_p(x)$, $p = 0,1$, $\mathcal{U}(x)$ and $R(x,y)$ are real-valued, then inequality (7.39) reduces to

$$\int_D\mathcal{U}(x)V(x)\,dx \ge \frac{1}{2}\int_D\bigl[s_0(x) + s_1(x)\bigr]V(x)\,dx. \quad (7.40)$$


The decision rule is: if (7.40) holds, then $H_1$ occurred; otherwise $H_0$ occurred. If one uses some other threshold criterion (such as Bayes, Neyman-Pearson, etc.), then one formulates the decision rule based on the inequality

$$\mathrm{Re}\int_D\mathcal{U}(x)V^*(x)\,dx \ge \ln\kappa + \frac{1}{2}\int_D\bigl[s_0(x)V^*(x) + s_1^*(x)V(x)\bigr]\,dx, \quad (7.41)$$

where $\kappa > 0$ is a constant which is determined by the threshold. (See Section 8.4 for more details.)

The decision rule: Practically, the decision rule based on inequality (7.39) (or (7.41)) can be formulated as follows:

1) given $s_p(x)$, $p = 0,1$, solve equation (7.29) for $V(x)$ by the formulas given in Theorem 2.1;
2) if $V(x)$ is found and $\mathcal{U}(x)$ is measured, then compute the integrals in formula (7.39) and check whether inequality (7.39) holds;
3) if yes, then the decision is that the observed signal is

$$\mathcal{U} = s_1(x) + n(x). \quad (7.42)$$

Otherwise

$$\mathcal{U} = s_0(x) + n(x). \quad (7.43)$$

Example 7.2 Consider the problem of detection of signals against the background of white Gaussian noise. In this case $s_0(x) = 0$ and $R(x,y) = \sigma^2\delta(x-y)$, where $\sigma^2$ is the variance of the noise. The solution to equation (7.29) is therefore

$$V = \sigma^{-2}s_1(x). \quad (7.44)$$

The inequality (7.39) reduces to

$$\mathrm{Re}\int_D\mathcal{U}(x)s_1^*(x)\,dx \ge \frac{1}{2}\int_D|s_1(x)|^2\,dx. \quad (7.45)$$

If (7.45) holds, then the decision is that the observed signal $\mathcal{U}(x)$ is of the form

$$\mathcal{U}(x) = s_1(x) + n(x).$$

Otherwise one decides that

$$\mathcal{U}(x) = n(x).$$
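A small simulation of the matched-filter rule (7.45) on a grid (a sketch, not from the book; the signal, domain, noise level, and white-noise discretization are illustrative assumptions; integrals become sums times the mesh width):

```python
import math, random

random.seed(3)

M = 200
dx = 1.0 / M
s1 = [math.sin(2 * math.pi * i * dx) for i in range(M)]  # assumed signal on D = [0, 1]
sigma = 0.25
E = sum(v * v for v in s1) * dx                # \int_D |s1|^2 dx
threshold = 0.5 * E                            # right-hand side of (7.45)

def detect(signal_present):
    # discrete white noise mimicking R(x,y) = sigma^2 delta(x-y):
    # per-sample variance sigma^2 / dx
    u = [(s1[i] if signal_present else 0.0)
         + random.gauss(0.0, sigma / math.sqrt(dx)) for i in range(M)]
    stat = sum(u[i] * s1[i] for i in range(M)) * dx      # Re \int U s1 dx
    return stat >= threshold

trials = 1000
hit = sum(detect(True) for _ in range(trials)) / trials
false_alarm = sum(detect(False) for _ in range(trials)) / trials
print(hit, false_alarm)   # detection rate high, false-alarm rate low
```

The gap between the two rates reflects the signal-to-noise ratio $\sqrt{E}/\sigma$; shrinking $\sigma$ drives the detection rate toward 1.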


The problem of discrimination between two signals s1(x) and s0(x) against the white Gaussian noise background is solved similarly. If

$$\mathrm{Re}\int_D\mathcal{U}(x)\bigl(s_1 - s_0\bigr)^*\,dx \ge \frac{1}{2}\int_D\bigl(|s_1|^2 - |s_0|^2\bigr)\,dx, \quad (7.46)$$

then the decision is that equation (7.42) holds. Otherwise one decides that (7.43) holds. If all the signals are real-valued, then inequality (7.46) can be written as

$$\int_D(\mathcal{U} - s_0)^2\,dx \ge \int_D(\mathcal{U} - s_1)^2\,dx. \quad (7.47)$$

The decision rule now has a geometrical meaning: if the observed signal $\mathcal{U}$ is closer to $s_1$ in the $L^2(D)$ metric, then the decision is that (7.42) holds; otherwise one decides that (7.43) holds. We have chosen a very simple case in order to demonstrate the decision rule for a problem in which all the calculations can be carried through in an elementary way. But the technique is the same for the general kernels $R\in\mathcal{R}$.

Example 7.3 Consider the problem of detection of a signal with unknown amplitude. Assume that the observed signal is either of the form

$$\mathcal{U}(x) = \gamma s(x) + n(x) \quad (7.48)$$

or

$$\mathcal{U}(x) = n(x). \quad (7.49)$$

The parameter $\gamma$ is unknown, the function $s(x)$ is known, and $n(x)$ is Gaussian noise with covariance function $R(x,y)\in\mathcal{R}$. Given the observed signal $\mathcal{U}(x)$, one wants to decide whether the hypothesis $H_1$ that (7.48) holds is true, or the hypothesis $H_0$ that (7.49) holds is true. Moreover, one wants to estimate the value of $\gamma$. In formula (7.37) take

$$s_1 = \gamma s(x),\qquad s_0 = 0. \quad (7.50)$$

Then, using the equation $\mathrm{Re}\int\mathcal{U}^*V\,dx = \mathrm{Re}\int\mathcal{U}V^*\,dx$, write

$$\ln\ell(\mathcal{U}(x)) = \mathrm{Re}\int_D\mathcal{U}^*(x)V\,dx - \frac{\gamma^*}{2}\int_D s^*V\,dx, \quad (7.51)$$

where $V(x)$ solves the equation

$$\int_D R(x,y)V(y)\,dy = \gamma s(x). \quad (7.52)$$


One finds the estimate of $\gamma$ by the maximum likelihood principle from the equations

$$\frac{\partial\ln\ell}{\partial\gamma} = 0,\qquad \frac{\partial\ln\ell}{\partial\gamma^*} = 0. \quad (7.53)$$

If, again for simplicity, one assumes that the noise is white with variance $\sigma^2 = 1$, so that $R(x,y) = \delta(x-y)$, then the solution to (7.52) is

$$V(x) = \gamma s(x), \quad (7.54)$$

and formula (7.51) reduces to

$$\ln\ell = \mathrm{Re}\,\gamma\int_D\mathcal{U}^*s\,dx - \frac{|\gamma|^2}{2}\int_D|s|^2\,dx. \quad (7.55)$$

Therefore equations (7.53) yield

$$\int_D\mathcal{U}^*s\,dx = \gamma^*\int_D|s|^2\,dx,\qquad \int_D\mathcal{U}s^*\,dx = \gamma\int_D|s|^2\,dx, \quad (7.56)$$

so that the estimate $\hat\gamma$ of $\gamma$ is

$$\hat\gamma = \int_D\mathcal{U}s^*\,dx\left(\int_D|s|^2\,dx\right)^{-1}. \quad (7.57)$$

Exercise. Check that the estimate $\hat\gamma$ is unbiased, that is,

$$\overline{\hat\gamma} = \gamma. \quad (7.58)$$

Hint: Use the equation $\overline{\mathcal{U}} = \gamma s(x)$.
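The estimator (7.57) can also be exercised numerically (a sketch, not from the book; the grid, the signal shape, and the white-noise discretization are illustrative assumptions). Empirically, the sample mean of $\hat\gamma$ is close to $\gamma$, checking (7.58), and the sample variance is close to $\sigma^2/E$, the value computed in the next exercise:

```python
import math, random

random.seed(4)

M = 100
dx = 1.0 / M
s = [math.cos(math.pi * i * dx) for i in range(M)]  # assumed known signal shape
E = sum(v * v for v in s) * dx                      # E = \int_D |s|^2 dx
sigma, gamma = 1.0, 2.5                             # true amplitude is illustrative

def gamma_hat():
    # white noise with R(x,y) = sigma^2 delta(x-y); formula (7.57) discretized
    u = [gamma * s[i] + random.gauss(0.0, sigma / math.sqrt(dx)) for i in range(M)]
    return sum(u[i] * s[i] for i in range(M)) * dx / E

trials = 5000
est = [gamma_hat() for _ in range(trials)]
mean = sum(est) / trials
var = sum((e - mean) ** 2 for e in est) / trials
print(mean, var, sigma ** 2 / E)  # mean near gamma; var near sigma^2/E
```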

Exercise. Calculate the variance:

$$\overline{(\hat\gamma - \overline{\hat\gamma})^2} = \frac{\sigma^2}{E},\qquad E := \int_D|s|^2\,dx. \quad (7.59)$$

Estimate (7.59) shows that the variance of the estimate of $\gamma$ decreases as the energy $E$ of the signal $s(x)$ grows, which is intuitively obvious. Assume that hypothesis $H_0$ occurred. Then the quantity $\hat\gamma$ defined by formula (7.57) is Gaussian with zero mean value, and its variance equals $\frac{\sigma^2}{E}$ by formula (7.59). Therefore

$$\mathrm{Prob}(|\hat\gamma| > b) = 2\,\mathrm{erf}(bE^{1/2}/\sigma), \quad (7.60)$$


where

$$\mathrm{erf}(x) := (2\pi)^{-1/2}\int_x^\infty\exp(-t^2/2)\,dt. \quad (7.61)$$

If one takes the confidence level $\epsilon = 0.95$ and decides that hypothesis $H_0$ occurred if

$$2\,\mathrm{erf}(\hat\gamma E^{1/2}/\sigma) > \epsilon, \quad (7.62)$$

then the decision rule for detection of a known signal with an unknown amplitude against the Gaussian white noise background is as follows:

1) given the observed signal $\mathcal{U}(x)$, calculate $\hat\gamma$ by formula (7.57);
2) calculate the left-hand side of (7.62); if inequality (7.62) holds, then the decision is that (7.49) holds; otherwise the decision is that (7.48) holds.
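Note that the function called erf in (7.61) is the Gaussian upper-tail probability, equal to $\frac{1}{2}\mathrm{erfc}(x/\sqrt{2})$ in the more common notation. A quick check of this identification (a sketch, not from the book):

```python
import math

def tail(x):
    """The 'erf' of (7.61): (2*pi)^(-1/2) * \\int_x^oo exp(-t^2/2) dt."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

print(tail(0.0))         # 0.5: half the Gaussian mass lies above zero
print(2.0 * tail(1.96))  # about 0.05, the familiar two-sided 5 percent point
```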

7.3 Quasioptimal estimates of derivatives of random functions

7.3.1 Introduction

Suppose that the observed signal in a domain $D\subset\mathbb{R}^r$ is

$$\mathcal{U}(x) = s(x) + n(x),\qquad x\in D\subset\mathbb{R}^r, \quad (7.63)$$

where $s(x)$ is a useful signal and $n(x)$ is noise, $\bar s = \bar n = 0$. If one wishes to estimate $\partial^js(x_0)$ optimally by the criterion of minimum of variance, then one has a particular case of the problem studied in Ch. II with $As = \partial^js$ (see formula (I.5)). This estimation problem can be solved by the theory developed in Ch. II. However, the basic integral equation for the optimal filter may be difficult to solve, the optimal filter may be difficult to implement, and the calculation of the optimal filter depends on the analytical details of the behavior of the spectral density of the covariance kernel $R(x,y) = \overline{u^*(x)u(y)}$, $R(x,y)\in\mathcal{R}$. That is, if one changes $\tilde R(\lambda)$ locally a little, it ceases to be a rational function, for example.

One can avoid the above difficulties by constructing a quasioptimal estimate of the derivative which is easy to calculate under a general assumption about the spectral density, which is stable with respect to small local perturbations of the spectral density and depends basically on the asymptotic behavior of this density as $|\lambda|\to\infty$, and which is easy to implement practically.


The notion of quasioptimality will be specified later and it will be shown that the quasioptimal estimate is nearly as good as the optimal one. The basic ideas are taken from [Ramm (1968); Ramm (1972); Ramm (1981); Ramm (1984); Ramm (1985b)].

7.3.2 Estimates of the derivatives

Consider first the one-dimensional case: $\mathcal{U}(t) = s(t) + n(t)$. Assume that

$$|n(t)| \le \delta, \quad (7.64)$$

$$|s''(t)| \le M. \quad (7.65)$$

Let us assume for simplicity that $n(t)$ and $s(t)$ are defined on all of $\mathbb{R}^1$, that $s(t)$ is an unknown deterministic function which satisfies (7.65), and that the noise $n(t)$ is an arbitrary random function which satisfies (7.64). Let $A$ denote the set of all operators $T : C(\mathbb{R}^1)\to C(\mathbb{R}^1)$, linear and nonlinear, where $C(\mathbb{R}^1)$ is the Banach space of continuous functions on $\mathbb{R}^1$ with the norm $\|f\| = \max_{t\in\mathbb{R}^1}|f(t)|$. Let

$$\Delta_h\mathcal{U} := (2h)^{-1}\bigl[\mathcal{U}(t+h) - \mathcal{U}(t-h)\bigr], \quad (7.66)$$

$$h(\delta) := (2\delta/M)^{1/2},\qquad \epsilon(\delta) := (2M\delta)^{1/2}. \quad (7.67)$$

First, let us consider the following problem: given $\mathcal{U}(t)$ and the numbers $\delta > 0$ and $M > 0$ such that (7.64) and (7.65) hold, find an estimate $\hat{\mathcal{U}}$ of $s'(t)$ such that

$$\|\hat{\mathcal{U}} - s'(t)\| \to 0\quad\text{as }\delta\to 0, \quad (7.68)$$

and such that this estimate is the best possible in the sense

$$\|\hat{\mathcal{U}} - s'(t)\| = \inf_{T\in A}\ \sup_{\substack{|s''|\le M\\ |n|\le\delta}}\|T\mathcal{U} - s'\|. \quad (7.69)$$

This means that among all estimates $T\mathcal{U}$ the estimate $\hat{\mathcal{U}}$ is the best one for the class of data given by inequalities (7.64), (7.65). It turns out that this optimal estimate is the estimate (7.70) in the following theorem.

Theorem 7.1 The estimate

Û := ∆_{h(δ)} U (7.70)


has the properties

‖Û − s′‖ ≤ ε(δ) (7.71)

and

inf_{T∈A} sup_{|s″|≤M, |n|≤δ} ‖T U − s′‖ = ε(δ), (7.72)

where ε(δ) and h(δ) are defined by (7.67) and ∆_{h(δ)} U is defined by (7.66).

Proof. One has

|∆_h U − s′| ≤ |∆_h(U − s)| + |∆_h s − s′| ≤ δ/h + Mh/2. (7.73)

Indeed,

|∆_h(U − s)| ≤ [|n(t + h)| + |n(t − h)|]/(2h) ≤ δ/h,

and

|[s(t + h) − s(t − h)]/(2h) − s′| = |[s(t) + s′(t)h + (h²/2)s″(ξ₊)] − [s(t) − s′(t)h + (h²/2)s″(ξ₋)] − 2hs′(t)|/(2h) ≤ 2(h²/2)M/(2h) = Mh/2, (7.74)

where ξ± are the points in the remainder in the Taylor formula and the estimates (7.64), (7.65) were used. For fixed δ > 0 and M > 0, minimize the right side of (7.73) in h > 0 to get

min_{h>0} [δ/h + Mh/2] = δ/h(δ) + Mh(δ)/2 = ε(δ), (7.75)

where h(δ) and ε(δ) are defined in (7.67). This proves inequality (7.71). To prove (7.72), take

s₁ = −(M/2) t[t − 2h(δ)], 0 ≤ t ≤ 2h(δ), (7.76)

and extend it to R¹ so that

|s₁″(t)| ≤ M, |s₁(t)| ≤ δ, ∀t ∈ R¹. (7.77)


Here h(δ) is given by (7.67). The extension of s1 with properties (7.77) is possible since on the interval [0, 2h(δ)] conditions (7.77) hold. Let

s₂(t) = −s₁(t). (7.78)

One has

|s_p″| ≤ M, |s_p| ≤ δ, p = 1, 2. (7.79)

Take U(t) = 0, t ∈ R¹. Then

|U(t) − s_p(t)| ≤ δ, p = 1, 2. (7.80)

Therefore one can consider U(t) as the observed value of both s₁(t) and s₂(t). Let T ∈ A be an arbitrary operator on C(R¹). Denote

T U(t)|_{t=0} = a. (7.81)

One has

sup_{|s″|≤M, |n|≤δ} ‖T U − s′‖ ≥ sup_{|s″|≤M, |n|≤δ} |T U(0) − s′(0)| ≥ max{|a − s₁′(0)|, |a − s₂′(0)|}
≥ (1/2)|s₁′(0) − s₂′(0)| = ε(δ), (7.82)

where ε(δ) is given by (7.67). Taking the infimum over T ∈ A of both sides of (7.82), one obtains

inf_{T∈A} sup_{|s″|≤M, |n|≤δ} ‖T U − s′‖ ≥ ε(δ). (7.83)

From (7.83) and (7.71) the desired equality (7.72) follows. Theorem 7.1 is proved. □
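As a quick numerical illustration of Theorem 7.1 (a sketch of my own, not from the book; the function names and the test signal are hypothetical), the optimal step (7.67) balances the noise term δ/h against the smoothness term Mh/2 in (7.73):

```python
import math
import random

def optimal_derivative(U, t, M, delta):
    """Estimate s'(t) from U = s + n with |n| <= delta and |s''| <= M,
    using the symmetric quotient (7.66) with the optimal step h(delta)
    of (7.67).  By (7.71) the error is at most eps(delta) = sqrt(2*M*delta)."""
    h = math.sqrt(2 * delta / M)
    return (U(t + h) - U(t - h)) / (2 * h)

# Example: s(t) = sin(t), so |s''| <= 1; bounded uniform noise of size delta.
M, delta = 1.0, 1e-6
rng = random.Random(0)
U = lambda t: math.sin(t) + rng.uniform(-delta, delta)
err = abs(optimal_derivative(U, 0.3, M, delta) - math.cos(0.3))
print(err <= math.sqrt(2 * M * delta))   # True: error within the bound (7.71)
```

The bound is attained only in the worst case; for a smooth signal and mild noise the actual error is usually far below ε(δ).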

7.3.3 Derivatives of random functions

Assume now that s(t) is a random function, that s(t) and n(t) are uncorrelated, n̄ = 0, and n = σv, where the variance of v, denoted by D[v], is 1, so that

D[v] = 1, D[n] = σ². (7.84)

The problem is to find a linear estimate L U such that

D[L U − s′] = min (7.85)


given the observed signal U(t) = s(t) + n(t). As was explained in Section 7.3.1, we wish to find a quasioptimal linear estimate of s′ such that this estimate is easy to compute, easy to implement, and nearly as good as the optimal estimate. Let us assume that

D[s^{(m)}(t)] ≤ M_m², (7.86)

where s^{(m)}(t) is the m-th derivative of s(t). Let us seek the quasioptimal estimate among the estimates of the form

∆_h^{(Q)} s := h⁻¹ Σ_{k=−Q}^{Q} A_k^{(Q)} s(t + kh/Q). (7.87)

If m = 2q or m = 2q + 1, let us take Q = q. If one expands the right-hand side of (7.87) in powers of h and requires that the order of the smallness as h → 0 of the function ∆_h^{(Q)} s − s′ be maximal, one obtains the following system for the coefficients A_k^{(Q)}:

Σ_{k=−Q}^{Q} (k/Q)^j A_k^{(Q)} = δ_{1j}, 0 ≤ j ≤ 2Q, (7.88)

where

δ_{1j} = 0 if j ≠ 1, and δ_{1j} = 1 if j = 1.

The system (7.88) is uniquely solvable since its determinant does not vanish: it is a Vandermonde determinant. One can find by solving system (7.88) that

A₀^{(1)} = 0, A_{±1}^{(1)} = ±1/2, (7.89)

A₀^{(2)} = 0, A_{±1}^{(2)} = ±4/3, A_{±2}^{(2)} = ∓1/6, (7.90)

A₀^{(3)} = 0, A_{±1}^{(3)} = ±9/4, A_{±2}^{(3)} = ∓9/20, A_{±3}^{(3)} = ±1/20, (7.91)

A₀^{(4)} = 0, A_{±1}^{(4)} = ±16/5, A_{±2}^{(4)} = ∓4/5, A_{±3}^{(4)} = ±16/105, A_{±4}^{(4)} = ∓1/70. (7.92)

We will need
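The coefficients (7.89)–(7.92) can be checked by solving the Vandermonde system (7.88) directly. The sketch below is my own illustration (exact rational arithmetic via Python's fractions module; the function name is hypothetical):

```python
from fractions import Fraction

def diff_coeffs(Q):
    """Solve the system (7.88): sum_{k=-Q}^{Q} (k/Q)^j A_k = delta_{1j},
    0 <= j <= 2Q, exactly over the rationals.  The matrix is a Vandermonde
    matrix with distinct nodes k/Q, hence nonsingular."""
    nodes = [Fraction(k, Q) for k in range(-Q, Q + 1)]
    n = 2 * Q + 1
    M = [[node ** j for node in nodes] for j in range(n)]
    b = [Fraction(int(j == 1)) for j in range(n)]
    # Gauss-Jordan elimination with exact arithmetic.
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
                b[r] = b[r] - f * b[col]
    return [b[r] / M[r][r] for r in range(n)]   # A_{-Q}, ..., A_{Q}

print(diff_coeffs(1))   # coefficients -1/2, 0, 1/2 of (7.89)
print(diff_coeffs(2))   # coefficients 1/6, -4/3, 0, 4/3, -1/6 of (7.90)
```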

Lemma 7.2 Let m = 2q + 1. Assume that the coefficients A_k^{(q)} in (7.87) satisfy (7.88) with 0 ≤ j ≤ 2q, and let

c_m := [m/((m!)² q^{2m})] Σ_{k=−q}^{q} |A_k^{(q)}|² k^{2m}, m = 2q + 1. (7.93)

Then

D[∆_h^{(q)} s − s′] ≤ γ_m h^{2m−2}, γ_m := c_m M_m², (7.94)

where D is the symbol of variance.

In order to prove this lemma, one needs a simple

Lemma 7.3 Let g_j be random variables and a_j constants. If

D[g_j] ≤ M, 1 ≤ j ≤ n, (7.95)

then

D[ Σ_{j=1}^{n} a_j g_j ] ≤ nM Σ_{j=1}^{n} |a_j|². (7.96)

Proof of Lemma 7.3 Note that

( Σ_{k=1}^{n} |b_k| )² ≤ n Σ_{k=1}^{n} |b_k|² (7.97)

by Cauchy's inequality. Let f(x₁, . . ., x_n) be the probability density of the joint distribution of the random variables g₁, g₂, . . ., g_n. Let us assume without loss of generality


that ḡ_k = 0. Denote dx = dx₁ . . . dx_n, ∫ = ∫_{R^n}, x = (x₁, . . ., x_n). Then

D[ Σ_{j=1}^{n} a_j g_j ] = ∫ | Σ_{j=1}^{n} a_j x_j |² f(x)dx ≤ ∫ Σ_{j=1}^{n} |a_j|² Σ_{j=1}^{n} |x_j|² f dx
= Σ_{j=1}^{n} |a_j|² Σ_{j=1}^{n} ∫ x_j² f(x)dx = Σ_{j=1}^{n} |a_j|² Σ_{j=1}^{n} D[g_j] ≤ nM Σ_{j=1}^{n} |a_j|². (7.98)

Lemma 7.3 is proved. □

Proof of Lemma 7.2 One has

∆_h^{(q)} s − s′ = h^{m−1} [1/(m! q^m)] Σ_{k=−q}^{q} A_k^{(q)} k^m s^{(m)}(t_k), (7.99)

where t_k are the points in the remainder of Taylor's formula. Apply Lemma 7.3 to equation (7.99) and take into account the assumption (7.86) to get

D[∆_h^{(q)} s − s′] ≤ [h^{2m−2} M_m²/((m!)² q^{2m})] (2q + 1) Σ_{k=−q}^{q} |A_k^{(q)}|² k^{2m}, (7.100)

which is equivalent to (7.94). Lemma 7.2 is proved. □

Lemma 7.4 One has

D[∆_h^{(q)} U − s′] ≤ φ(h), (7.101)

where

φ(h) := γ_m h^{2m−2} + σ² h⁻² Σ_{k,j=−q}^{q} A_k^{(q)} A_j^{(q)} R( (k − j)h/q ). (7.102)

Here m = 2q + 1,

R(t − τ) := \overline{v^*(t)v(τ)} (7.103)

is the covariance function of v(t), and conditions (7.84) hold.


Proof. By assumption s and v are uncorrelated. Therefore

D[∆_h^{(q)} U − s′] = D[∆_h^{(q)} s − s′] + σ² D[∆_h^{(q)} v] (7.104)
≤ γ_m h^{2m−2} + σ² h⁻² Σ_{k,j=−q}^{q} A_k^{(q)} A_j^{(q)} R( (k − j)h/q ), (7.105)

where we took into account that the coefficients A_j^{(q)} are real numbers. Lemma 7.4 is proved. □

Definition 7.1 The estimate

L U := ∆_h^{(q)} U (7.106)

is called quasioptimal if h minimizes the function φ(h) defined by (7.102).

Thus, the quasioptimal estimate minimizes a natural majorant φ(h) of the variance of the estimate ∆_h^{(q)} U among all estimates (7.106) with different h. The majorant φ(h) is natural because the equality sign can be attained in (7.101) (for example, if s = 0 then the equality sign is attained in (7.101)).

The quasioptimal filter is easy to calculate: it suffices to find a minimizer of φ(h), h > 0. This filter is easy to implement: one needs only some multiplications, additions, and time-shift elements. We will compare the error estimates for the optimal and quasioptimal filters shortly, but first consider an example.

Example 7.4 Let m = 2, q = 1,

∆_h^{(1)} U = [U(t + h) − U(t − h)]/(2h). (7.107)

By formulas (7.93) and (7.89) for m = 2 and q = 1 one calculates

c₂ = 1/4. (7.108)

Let us assume that the constant M₂² := M in the estimate (7.86) is known, that the variance σ² of the noise is known (see (7.84)), and that the covariance function of v(t) is

R(t) = exp(−|t|). (7.109)

Then formula (7.102) yields

φ(h) = Mh²/4 + (σ²h⁻²/2)[1 − exp(−2h)] = Mh²/4 + (σ²h⁻²/2)[R(0) − R(2h)]. (7.110)


If σ ≪ 1, then the minimizer of the function (7.110) should be small. Assuming h ≪ 1 and using 1 − exp(−2h) ≈ 2h for h ≪ 1, one obtains

φ(h) ≈ Mh²/4 + σ²/h, h > 0. (7.111)

It is easy to check that the function (7.111) attains its minimum at

h_min = (2σ²/M)^{1/3} (7.112)

and

min φ ≈ σ^{4/3} M^{1/3} (2^{−4/3} + 2^{−1/3}) ≈ 1.19 σ^{4/3} M^{1/3}. (7.113)

Note that if the minimizer is small, h_min ≪ 1, then the behavior of the covariance function is important only in a neighborhood of t = 0. But this behavior is determined only by the asymptotic behavior of the spectral density R̃(λ) as |λ| → ∞.

Let us now compare briefly the optimal and quasioptimal estimates. Let us define the spectral density R̃_s(λ) of s(t):
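In practice the quasioptimal step is found by a one-dimensional minimization of φ(h). Here is a sketch of my own for the setting of Example 7.4 (the golden-section routine is a generic minimizer, not taken from the book):

```python
import math

def phi(h, M, sigma):
    """Majorant (7.110) for m = 2, q = 1 and R(t) = exp(-|t|)."""
    return M * h**2 / 4 + sigma**2 / (2 * h**2) * (1 - math.exp(-2 * h))

def golden_min(f, a, b, tol=1e-10):
    """Golden-section search for a minimizer of a unimodal f on [a, b]."""
    g = (math.sqrt(5) - 1) / 2
    while b - a > tol:
        c, d = b - g * (b - a), a + g * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    return (a + b) / 2

M, sigma = 1.0, 0.01
h_star = golden_min(lambda h: phi(h, M, sigma), 1e-6, 10.0)
# For small sigma, h_star approaches the asymptotic value (7.112).
print(h_star, (2 * sigma**2 / M) ** (1 / 3))
```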

R̃_s(λ) := ∫_{−∞}^{∞} exp(−iλt) R_s(t)dt, (7.114)

where

R_s(t − τ) := \overline{s^*(t)s(τ)}. (7.115)

We assume that

0 < R̃_s(λ) ≤ A/(1 + λ²)^a, a ≥ 5/2, (7.116)

and that the spectral density of the noise satisfies

0 < R̃(λ) ≤ B/(1 + λ²)^b, b > 1. (7.117)

One has for the quasioptimal estimate (7.107) formula (7.110), and

M = D[s″(t)] = \overline{[s″(t)]^* s″(τ)}|_{t=τ} = ∂⁴/∂t²∂τ² R_s(t − τ)|_{t=τ}
= (1/2π) ∫_{−∞}^{∞} λ⁴ R̃_s(λ)dλ ≤ (A/2π) ∫_{−∞}^{∞} λ⁴ dλ/(1 + λ²)^a ≤ const·A. (7.118)


The term R(0) − R(2h) in (7.110) can be written as

R(0) − R(2h) = (1/2π) ∫_{−∞}^{∞} R̃(λ)[1 − exp(2iλh)]dλ. (7.119)

Therefore the function φ(h) in (7.110) can be expressed entirely in terms of spectral densities. One has

|R(0) − R(2h)| ≤ (B/2π) ∫_{−∞}^{∞} |1 − exp(2iλh)| dλ/(1 + λ²)^b
= (B/π) ∫_{−∞}^{∞} |sin λh| dλ/(1 + λ²)^b ≤ (Bh/π) ∫_{−∞}^{∞} |sin(λh)/(λh)| |λ| dλ/(1 + λ²)^b
≤ const·Bh. (7.120)

It follows from (7.110), (7.118) and (7.120) that

φ(h) ≤ const (Ah² + Bσ²/h), (7.121)

where const does not depend on A, B, σ and h, but depends on a and b. If a ≥ 5/2 and b > 1, one can take an absolute constant in (7.121). This shows that the estimate (7.121) does not depend much on the details of the behavior of the spectral densities. One can see from (7.121) that

φ_min ≤ const σ^{4/3} A^{1/3} B^{2/3}, h_min = const (Bσ²/A)^{1/3}. (7.122)

This estimate shows how the variance of the noise and the constants A and B influence the behavior of the error as σ → 0. Note that

1 = D[v] = R(0) = (1/2π) ∫_{−∞}^{∞} R̃(λ)dλ ≤ B (1/2π) ∫_{−∞}^{∞} dλ/(1 + λ²)^b = const·B.

Therefore B is of the order of magnitude of 1, and formula (7.122) can be written as

φ_min ≤ const σ^{4/3} A^{1/3}. (7.123)

This estimate holds if h_min ≪ 1, that is, if

σ ≪ A^{1/2}. (7.124)

All these estimates are asymptotic as σ → 0.


The optimal filter h satisfies the equation

∫_{t−T}^{t} [R_s(y − z) + σ²R(y − z)] h(x, z)dz = f(y, x), t − T ≤ y ≤ t, (7.125)

where

f(y, x) := \overline{s^*(y)s′(x)} = ∂/∂x R_s(y − x) = −R_s′(y − x). (7.126)

The error of the optimal estimate can be computed by formula (2.108):

ε²(x) = \overline{|s′(x)|²} − (f, h₀) = −R_s″(0) − (f, h₀), (7.127)

where h₀ is the solution to equation (7.125) of minimal order of singularity. For simplicity let us assume that t − T = −∞ and t = ∞. The error for this physically nonrealizable filter is not larger than for the physically realizable filter. The reasons for considering the nonrealizable filter are: 1) this filter is easy to calculate, and 2) its error gives a lower bound for the error of the realizable filter.

Let x = 0 in (7.125). Since the random functions we consider are assumed stationary, there is no loss of generality in the assumption x = 0. Take the Fourier transform of (7.125) with t − T = −∞, t = +∞, and use the theorem on the Fourier transform of a convolution to get

[R̃_s(λ) + σ²R̃(λ)] h̃₀(λ) = −iλR̃_s. (7.128)

Thus

h̃₀ = −iλR̃_s [R̃_s + σ²R̃]⁻¹. (7.129)

Use (7.129) and apply Parseval's equality to (7.127) to get

ε²(0) = (1/2π) ∫_{−∞}^{∞} λ² R̃_s(λ)dλ − (1/2π) ∫_{−∞}^{∞} λ² R̃_s²(λ)[R̃_s + σ²R̃]⁻¹ dλ
= (σ²/2π) ∫_{−∞}^{∞} λ² R̃_s(λ)R̃(λ)[R̃_s(λ) + σ²R̃(λ)]⁻¹ dλ. (7.130)

It follows from (7.130), (7.116) and (7.117), if we assume that for large |λ| the sign ≤ in (7.116) and (7.117) becomes asymptotic equality, that

ε²(0) ≤ const σ²AB ∫_{−∞}^{∞} λ² dλ / [(1 + λ²)^b A + σ²B(1 + λ²)^a]
≤ const σ² as σ → 0. (7.131)


The estimate (7.113) gives O(σ^{4/3}) as σ → 0, but if one takes larger m the estimate will be O(σ^{α(m)}), where α(m) → 2 as m grows. One can estimate α(m) using formula (7.102). For small h one has φ ≤ γ_m h^{2m−2} + const σ²h⁻¹, so that h_min ∼ σ^{2/(2m−1)} and φ_min ∼ σ^{2(2m−2)/(2m−1)}, so that α(m) = 2 − 2/(2m−1). Therefore α(m) → 2 as m → ∞.

7.3.4 Finding critical points

Before we discuss the case r > 1 of random functions of several variables, let us outline briefly an application of the results of Section 7.3.2 to the problem of finding the extremum of a random function. Assume that the observed function is U(t) = s(t) + n(t), where s(t) is a smooth function defined on the interval [0, 1] which has exactly one maximum on this interval. Such functions s(t) are called univalent. Suppose that this maximum is attained at a point τ. Assume that

|s″| ≤ M, |n(t)| ≤ δ. (7.132)

The problem is to find τ given the signal U(t), 0 ≤ t ≤ 1.

The solution to this problem is:

1) divide the interval [0, 1] by the points t_k = kh, where h = h(δ) is given by (7.67), k = 0, 1, 2, . . .;
2) calculate Û_k := (2h)⁻¹[U(t_k + h) − U(t_k − h)] = ∆_h^{(1)} U(t_k);
3) compute Û_k Û_{k+1}, k = 0, 1, 2, . . .;
4) if

|Û_k| > ε(δ) ∀k (7.133)

and

Û_j Û_{j+1} < 0 for some j, (7.134)

then

t_j < τ < t_j + h. (7.135)

Indeed, from (7.67) and (7.71) one concludes that

Û(t_k) − ε(δ) ≤ s′(t_k) ≤ Û(t_k) + ε(δ). (7.136)

From (7.133), (7.134) and (7.136) it follows that s′(t) changes sign on the interval (t_j, t_{j+1}). This implies (7.135), since s(t) is univalent.
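The four steps above can be sketched in code (my own illustration; `locate_maximum` and the test signal are hypothetical). For simplicity only the sign-change test (7.134) is implemented; in an application one would also check the magnitude condition (7.133):

```python
import math

def locate_maximum(U, M, delta):
    """Localize the maximizer tau of a univalent s(t) on [0, 1] from noisy
    data U = s + n with |n| <= delta and |s''| <= M.  Steps 1)-3): form the
    difference quotients (7.66) on the grid t_k = k*h with the optimal step
    h(delta) of (7.67); a sign change of consecutive quotients brackets tau.
    Returns (t_j, t_j + h), or None if no sign change is detected."""
    h = math.sqrt(2 * delta / M)                       # h(delta) from (7.67)
    ts = [k * h for k in range(1, int(1 / h)) if k * h + h <= 1]
    Uhat = [(U(t + h) - U(t - h)) / (2 * h) for t in ts]
    for t, u0, u1 in zip(ts, Uhat, Uhat[1:]):
        if u0 > 0 > u1:                                # (7.134): sign change
            return (t, t + h)                          # (7.135)
    return None                                        # derivative too flat

# Noise-free illustration: s(t) = -(t - 0.415)**2, so tau = 0.415.
lo, hi = locate_maximum(lambda t: -(t - 0.415) ** 2, M=2.0, delta=1e-4)
print(lo, hi)   # an interval of length h containing tau
```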


If (7.133) is not valid for some k = k₀, then the maximum may be on the interval (t_{k₀} − h, t_{k₀} + h). In this case it may happen that condition (7.134) holds for no j. The above method may not work if the derivative of s(t) is very small (smaller than ε(δ)) in a large neighborhood of τ.

Remark 7.1 Since we used formula (7.71), we assumed that U(t) is defined on all of R¹. If it is defined only on a bounded interval [a, b], then the expression ∆_h^{(1)} U(t) is not defined for t < a + h. In this case one can define

∆̃_h^{(1)} U(t) := h⁻¹[U(t + h) − U(t)], a ≤ t < a + h,
∆̃_h^{(1)} U(t) := ∆_h^{(1)} U(t), a + h ≤ t ≤ b − h,
∆̃_h^{(1)} U(t) := h⁻¹[U(t) − U(t − h)], b − h < t ≤ b. (7.137)

In this case

|∆̃_h^{(1)} U(t) − s′(t)| ≤ 2δ/h + Mh/2, (7.138)

so that the minimizer h̃_min of the right-hand side of (7.138) is

h̃_min = 2(δ/M)^{1/2}, (7.139)

and the minimum of the right-hand side of (7.138) is

ε̃(δ) = 2(Mδ)^{1/2}. (7.140)

Note that ε̃(δ) = 2^{1/2} ε(δ), where ε(δ) is given by (7.67).

7.3.5 Derivatives of random fields

Let us consider the multidimensional case. There are no new ideas in this case, but we give a brief outline of the results for the convenience of the reader. Suppose that U(x) = s(x) + n(x), x ∈ R^r. Let ∇s denote the gradient of s(x) and ‖s‖ = max_{x∈R^r} |s(x)|. Assume that

‖n(x)‖ ≤ δ (7.141)

and

max_{x∈R^r} (d²s(x)θ, θ) ≤ M ∀θ ∈ S₁ := {θ : θ ∈ R^r, θ·θ = 1}, (7.142)

where

(d²s(x)θ, θ) = Σ_{i,j=1}^{r} [∂²s(x)/∂x_i∂x_j] θ_i θ_j.


Define

∆_h U(x) := [U(x + hθ) − U(x − hθ)]/(2h), h > 0. (7.143)

Theorem 7.2 If h(δ) and ε(δ) are given by (7.67), then

|∆_{h(δ)} U(x) − ∇s(x)·θ| ≤ ε(δ) ∀θ ∈ S₁. (7.144)

Moreover,

inf_{T∈A} sup_{s,n} ‖T U(x) − ∇s(x)·θ‖ = ε(δ), (7.145)

and the infimum is attained at T = ∆_{h(δ)}. Here the supremum is taken over all s(x) ∈ C²(R^r) which satisfy (7.142) and all n(x) which satisfy (7.141), and the infimum is taken over the set A of all operators T : C(R^r) → C(R^r), linear or nonlinear.

The proof of this theorem is similar to the proof of Theorem 7.1. The role of the function s₁(t) in formula (7.76) is played by the function

s₁(x) = −(M/2)(|x|² − 2h(δ)x·θ) in B_δ, (7.146)

where B_δ is the ball centered at the point h(δ)θ with radius h(δ), and |x|² = Σ_{j=1}^{r} |x_j|². It is clear that s₁(x) vanishes at the boundary ∂B_δ of the ball, that

(d²s₁(x)θ, θ) ≤ M ∀θ ∈ S₁, (7.147)

and that

|s₁(x)| ≤ δ. (7.148)

Let s₂(x) = −s₁(x) and argue as in the proof of Theorem 7.1 in order to obtain (7.144) and (7.145).
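A minimal sketch of the estimate (7.143) (my own; the function names are hypothetical): the same optimal step h(δ) of (7.67) yields the directional derivative ∇s(x)·θ within ε(δ), by (7.144):

```python
import math

def directional_derivative(U, x, theta, M, delta):
    """Estimate grad s(x) . theta from noisy values U = s + n with
    |n| <= delta and (d^2 s(x) theta, theta) <= M, using the symmetric
    quotient (7.143) with the optimal step h(delta) of (7.67).
    By (7.144) the error is at most eps(delta) = sqrt(2*M*delta)."""
    h = math.sqrt(2 * delta / M)
    xp = [xi + h * ti for xi, ti in zip(x, theta)]
    xm = [xi - h * ti for xi, ti in zip(x, theta)]
    return (U(xp) - U(xm)) / (2 * h)

# Illustration with exact data: s(x) = x0^2 + 3*x1, so grad s(1, 0) = (2, 3).
est = directional_derivative(lambda x: x[0] ** 2 + 3 * x[1],
                             x=(1.0, 0.0), theta=(1.0, 0.0), M=2.0, delta=1e-8)
print(est)   # close to 2.0; error bounded by sqrt(2*M*delta) = 2e-4
```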

7.4 Stable summation of orthogonal series and integrals with randomly perturbed coefficients

7.4.1 Introduction

Consider an orthogonal series

f(x) = Σ_{j=1}^{∞} c_j φ_j(x), x ∈ D ⊂ R^r, (7.149)


where

(φ_j, φ_m) := ∫_D φ_j(x)φ_m^*(x)dx = δ_{jm} (7.150)

c_j := (f, φ_j). (7.151)

Suppose that the data

{b_j := c_j + ε_j}, 1 ≤ j < ∞, (7.152)

are given; that is, the Fourier coefficients of f are known with some errors ε_j. Assume that

ε̄_j = 0, \overline{ε_j^* ε_m} = σ²δ_{jm}, σ = const > 0. (7.153)

The problem is: given the data (7.152), (7.153), estimate f(x).

From the point of view of systems theory, one can interpret this problem as follows. Suppose that the system's response to the signal φ_j(x) is K_j φ_j(x), where K_j is a generalized transmission coefficient of the system. For example, if ω is a continuous analogue of j and φ_j(x) = exp(iωx), then K(iω) is the usual transmission coefficient of the linear system. If there is noise at the output of the system, then one actually receives Σ_{j=1}^{∞} (K_j c_j + ε_j)φ_j(x) at the output, where ε_j is the noise component corresponding to the j-th generalized harmonic φ_j(x).

Let us consider two methods for solving the problem. These methods are easy to use in practice. The first method is to define

f_N := Σ_{j=1}^{N} b_j φ_j(x) (7.154)

and to choose N = N(σ) so that

‖f_{N(σ)}(x) − f(x)‖² = min, ‖f‖² := ∫_D |f|²dx. (7.155)

The second method is to define

g(x) := Σ_{j=1}^{∞} ρ_j(ν) b_j φ_j(x), (7.156)


where ν > 0 is a parameter, and to choose the multipliers ρ_j(ν) so that

‖g − f‖² = min. (7.157)

The same problem can be formulated for orthogonal integrals, that is, for continuous analogues of orthogonal series:

f(x) = ∫_{−∞}^{∞} c(λ)φ(x, λ)dλ (7.158)

and

∫_{R^r} φ(x, λ)φ^*(x, λ′)dx = δ(λ − λ′). (7.159)

We assume that b(λ) is given, b(λ) = c(λ) + ε(λ), \overline{ε^*(λ)ε(λ′)} = σ²δ(λ − λ′), and the problem is to estimate f(x). This can be done in the same way as for the problem for series.

7.4.2 Stable summation of series

Let us consider the first method. Assume that

|φ_j(x)| ≤ c, |c_j| ≤ A j^{−a}, a > 1/2, (7.160)

where c and A are positive constants which do not depend on x and j. Then

‖f_N − f‖² = Σ_{j,j′=1}^{N} \overline{ε_j ε_{j′}^*} (φ_j, φ_{j′}) + Σ_{j=N+1}^{∞} |c_j|²
≤ σ²N + A² Σ_{j=N+1}^{∞} j^{−2a}
≤ σ²N + A² N^{−2a+1}/(2a − 1) := γ(N), a > 1/2. (7.161)

Here we used (7.150), (7.152), (7.153) and (7.160). Let us find N_m for which γ(N) = min, σ and A being fixed. One has

N(σ) := N_m = (A/σ)^{1/a} (7.162)

and

γ_m := γ(N_m) = const A^{1/a} σ^{(2a−1)/a}, (7.163)


where const depends on a but not on A and σ. We have proved

Proposition 7.1 If N(σ) is given by (7.162), then

‖f_{N(σ)}(x) − f(x)‖² ≤ const A^{1/a} σ^{(2a−1)/a}. (7.164)

Therefore formula (7.154) with N = N(σ) gives an estimate of f(x) such that the error of this estimate goes to zero according to (7.164) as σ → 0.
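A small sketch of the first method (my own; the names are hypothetical): compute the truncation level (7.162) and evaluate the majorant γ(N) of (7.161):

```python
def truncation_level(A, sigma, a):
    """Truncation level N(sigma) minimizing the majorant (7.161),
    gamma(N) = sigma^2 N + A^2 N^(1-2a)/(2a-1); the minimizer is
    N = (A/sigma)^(1/a), rounded to an integer here."""
    return max(1, round((A / sigma) ** (1 / a)))

def gamma(N, A, sigma, a):
    """Majorant gamma(N) from (7.161)."""
    return sigma**2 * N + A**2 * N ** (1 - 2 * a) / (2 * a - 1)

A, sigma, a = 1.0, 1e-3, 3.0
N = truncation_level(A, sigma, a)
# The bound (7.164): gamma(N) = O(A^(1/a) sigma^((2a-1)/a)) as sigma -> 0.
print(N, gamma(N, A, sigma, a), A ** (1 / a) * sigma ** ((2 * a - 1) / a))
```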

7.4.3 Method of multipliers

Let us consider the second method. Take

ρ_j(ν) := exp(−νj). (7.165)

These are the convergence multipliers used in Abel's summation of series. Then

J := ‖ Σ_{j=1}^{∞} exp(−νj)b_jφ_j(x) − Σ_{j=1}^{∞} c_jφ_j(x) ‖²

= Σ_{j=1}^{∞} |1 − exp(−jν)|² |c_j|² + σ² Σ_{j=1}^{∞} exp(−2jν)
≤ A² Σ_{j=1}^{∞} |1 − exp(−jν)|² j^{−2a} + σ² exp(−2ν)/(1 − exp(−2ν)). (7.166)

For fixed A and σ one can find ν_m which minimizes the right side of (7.166). If γ_m is the minimum of the right side of (7.166), then the error estimate of the method is

J ≤ γ_m. (7.167)

One can see that γ_m → 0 as σ → 0.
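The minimization over ν can be done numerically. A sketch of my own (the grid search is a crude stand-in for any one-dimensional minimizer) for the majorant on the right side of (7.166):

```python
import math

def majorant(nu, A, sigma, a, terms=10000):
    """Right side of (7.166): bias term plus noise term for the Abel
    multipliers rho_j(nu) = exp(-nu*j); the series is truncated at `terms`
    (the tail is negligible for a > 1/2 and moderate nu)."""
    bias = A**2 * sum((1 - math.exp(-j * nu)) ** 2 / j ** (2 * a)
                      for j in range(1, terms + 1))
    noise = sigma**2 * math.exp(-2 * nu) / (1 - math.exp(-2 * nu))
    return bias + noise

A, sigma, a = 1.0, 1e-3, 1.5
nus = [10 ** (k / 20) for k in range(-80, 0)]   # logarithmic grid in (1e-4, 1)
nu_m = min(nus, key=lambda nu: majorant(nu, A, sigma, a))
gamma_m = majorant(nu_m, A, sigma, a)
print(nu_m, gamma_m)   # gamma_m -> 0 as sigma -> 0
```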

7.5 Resolution ability of linear systems

7.5.1 Introduction

Let us briefly discuss the notion of the resolution ability of a linear system. In optics, Rayleigh gave an intuitive definition of resolution ability: if one has two bright points [δ(x − a) + δ(x + a)]/2 as the input signal and if the


optical system is described by the transmission function h(x, y), so that the output signal is [h(x, a) + h(x, −a)]/2, then the two points can be resolved, according to the Rayleigh criterion, if [h(0, a) + h(0, −a)]/2 ≲ 0.8 h(0, 0).

Note that we took the signal [δ(x − a) + δ(x + a)]/2 rather than δ(x − a) + δ(x + a) in order to compare this signal with a bright point at the origin, δ(x). The sum of the coefficients in front of the delta-functions should therefore be equal to 1, the coefficient in front of δ(x). The factor 0.8 in the Rayleigh criterion is an empirical one.

The transmission function h(x, y) is defined as follows: if s_in(x) is the input signal, then the output signal of the linear system is given by

s_out(x) = ∫_D h(x, y)s_in(y)dy. (7.168)

The domain D in optics is usually the input pupil of the system, or its input vision domain. The optical system is called isoplanatic if h(x, y) = h(x − y). In describing the Rayleigh criterion we assume that h(x, y) has an absolute maximum at x = y, and that the distance 2a between the two points is small, so that both points lie in the region near the origin in which h(x, y) is positive. Suppose now that

s₁(x) = [δ(x − a) + δ(x + a)]/2, (7.169)
s₀(x) = δ(x), (7.170)

and one observes the signal

U_j(x) = ∫_D h(x, y)s_j(y)dy + n(x), (7.171)

where n(x) is the output Gaussian noise, n̄ = 0, \overline{|n|²} = σ² < ∞, and j = 0 (hypothesis H₀) or j = 1 (hypothesis H₁). The problem:

given the observed signal U_j(x), decide whether H₀ or H₁ occurred. (7.172)

If with probability 1 one can make a correct decision no matter how small a > 0 is, then one says that the resolution ability of the system in the sense of Rayleigh is infinite. The traditional intuition (which says that, for a fixed size of the input pupil, an optical system can resolve distances of the order of magnitude of the wavelength) is based on the calculation of the diffraction of a plane wave by a circular hole in a plane. In [R 6)] it was proved that, in the absence of noise, the transmission function of a linear system


can be made as close to δ(x − y) as one wishes, by means of apodization. This means that there is a sequence h_m(x, y) of transmission functions which is a delta-sequence in the sense that

∫_D h_m(x, y)s(y)dy → s(x) as m → ∞

for any continuous s(y).

7.5.2 Resolution ability of linear systems

In this section we apply the theory developed in Section 7.2 in order to show that there exists a linear system or, what is the same for our purposes, a transmission function h(x, y), such that the resolution ability of this system in the sense of Rayleigh is infinite. More precisely, we will formulate a decision rule for discriminating between the hypotheses H₀ and H₁ such that the error of this rule can be made as small as one wishes. The error of the rule is defined to be

α_m := P(γ₁ | H₀), (7.173)

that is, the probability to decide that hypothesis H₁ occurred when in fact H₀ occurred. The meaning of the parameter m, the subscript of α, will be made clear shortly. This parameter is associated with a sequence of linear systems whose resolution ability increases without limit. First let us choose h_m(x, y) so that the sequences

h_m(x, z) = ∫_D h_m(x, y)δ(y − z)dy := δ_m(x, z) (7.174)

are delta-sequences. Then, by formula (7.171), the observed signals become

U₁(x) = [δ_m(x, a) + δ_m(x, −a)]/2 + n(x) := s₁(x) + n(x) (7.175)

or

U₀(x) = δ_m(x, 0) + n(x) := s₀(x) + n(x). (7.176)

Let us apply to the problem (7.172) the decision rule based on formula (7.41).


First one should solve the equation (7.29):

RV := ∫_D R(x, y)V(y)dy = [δ_m(x, a) + δ_m(x, −a)]/2 − δ_m(x, 0) = s₁(x) − s₀(x) := f, (7.177)

where R(x, y) := \overline{n^*(x)n(y)} is the covariance function of the noise. We assume that R(x, y) ∈ R and, for simplicity, that P(λ) = 1. In this case R(x, y) solves the equation

Q(L)R(x, y) = δ(x − y). (7.178)

We also assume that δ_m(x, y) is negligibly small for |x − y| > |a|/2, that the points 0, a, and −a are inside D, and that

ρ(0, Γ) > |a|, ρ(a, Γ) > |a|, ρ(−a, Γ) > |a|, (7.179)

where ρ(x, Γ) is the distance between the point x and Γ = ∂D. In this case one can neglect the singular boundary term of the solution to equation (7.177) and write this solution as

V(x) = Q(L)f. (7.180)

Let us write inequality (7.39):

Re ∫_D U(x)Q(L)f^*(x)dx ≥ (1/2)∫_D δ_m(x, 0)Q(L)f^*(x)dx + (1/2)∫_D {[δ_m^*(x, a) + δ_m^*(x, −a)]/2} Q(L)f(x)dx, (7.181)

where we assume that the coefficients of Q(L) are real. Otherwise one would write [Q(L)f]^* in place of Q(L)f^*. Let us write the expression (7.173) for α_m:

α_m := P(γ₁ | H₀) = P( 2Re ∫_D U₀(x)V^*dx ≥ ∫_D (s₀V^* + s₁^*V)dx ), (7.182)

where V, s₁ and s₀ are given by (7.180), (7.175) and (7.176) respectively.


It follows from (7.182) that

α_m = P( 2Re ∫_D n(x)V^*(x)dx ≥ ∫_D (s₁^* − s₀^*)Q(L)(s₁ − s₀)dx )
= P( 2Re ∫_D n(x)V^*(x)dx ≥ ∫_D (s₁^*Q(L)s₁ + s₀^*Q(L)s₀)dx ). (7.183)

Here we took into account that

∫_D s_j^*Q(L)s_i dx = 0 for i ≠ j, (7.184)

because of the assumption that for m sufficiently large the functions s₁(x) and s₀(x) have practically nonintersecting supports: each of them, together with its derivatives of order ≤ sq, is negligibly small in the region where the other is not small. One can write (7.183) as

α_m = P( Re ∫_D n(x)V^*(x)dx ≥ (3/2)A_m ), (7.185)

where we denote by A_m a positive quantity of the type

A_m = ∫_D δ_m(x, 0)Q(L)δ_m^*(x, 0)dx, (7.186)

and one can write in place of δ_m(x, 0) the functions δ_m(x, a) or δ_m(x, −a). The basic property of A_m is

A_m → +∞ as m → ∞. (7.187)

Indeed, the elliptic operator Q(L) is positive definite on H^{sq/2}(R^r), and one can assume that δ_m(x, 0) ∈ Ḣ^{sq/2}(B_a), B_a := {x : x ∈ R^r, |x| ≤ a}. Therefore

∫_D δ_m(x, 0)Q(L)δ_m^*(x, 0)dx ≥ c ∫_D |δ_m(x, 0)|²dx → +∞. (7.188)

Here c is a positive constant which does not depend on δ_m(x, 0) (it depends on Q(L) only), and the integral in (7.188) tends to infinity because δ_m(x, 0) is a delta-sequence by construction (see formula (7.174) and the line below it). Let us apply the Chebyshev inequality to get:

P( |∫_D n(x)V^*(x)dx| ≥ A_m ) ≤ D[ ∫_D n(x)V^*(x)dx ] / A_m². (7.189)



Here we took into account that n̄ = 0; D[n] stands for the variance of the random quantity n. One has

D[ ∫_D n(x)V^*(x)dx ] = ∫_D ∫_D R(x, y)V(x)V^*(y)dxdy
= ∫_D ∫_D R(x, y)Q(L)f(x)Q(L)f^*(y)dxdy
= ∫_D [ ∫_D Q(L)R(x, y)f(x)dx ] Q(L)f^*(y)dy
= ∫_D [ ∫_D δ(x − y)f(x)dx ] Q(L)f^*(y)dy
= ∫_D f(y)Q(L)f^*(y)dy = (3/2)A_m. (7.190)

Here we used the definition (7.177) of f, formula (7.186), and equalities of the type (7.184) for the functions δ_m(x, 0), δ_m(x, a) and δ_m(x, −a). From (7.189) and (7.190) it follows that

P( |∫_D n(x)V^*(x)dx| ≥ A_m ) ≤ 3/(2A_m) → 0 as A_m → +∞. (7.191)

If X is an arbitrary random variable, then

P{Re X ≥ A_m} ≤ P{|X| ≥ A_m}. (7.192)

From (7.185), (7.187), (7.191) and (7.192) it follows that

α_m ≤ 2/(3A_m) → 0 as m → ∞. (7.193)

Let us compute the probability to take the decision that the hypothesis H0 occurred while in fact H1 occurred. We have

β_m := P(γ₀ | H₁) = P( 2Re ∫_D U₁V^*dx ≤ ∫_D (s₀V^* + s₁^*V)dx )

= P( 2Re ∫_D nV^*dx ≤ ∫_D (s₀V^* − s₁V^*)dx )
= P( Re ∫_D nV^*dx ≤ −(1/2)∫_D fQ(L)f^*dx )
= P( Re ∫_D nV^*dx ≤ −(3/4)A_m ), (7.194)

where we used formula (7.190).


If Re X ≤ −A, then |X| ≥ A. Therefore

P(|X| ≥ A) ≥ P{Re X ≤ −A}. (7.195)

From (7.191), (7.194) and (7.195) one concludes that

β_m ≤ P( |∫_D nV^*dx| ≥ (3/4)A_m ) ≤ (3/2)A_m · 16/(9A_m²) = 8/(3A_m) → 0 as m → ∞. (7.196)

We have proved the following

Theorem 7.3 The problem (7.172) can be solved by the decision rule from Section 7.2, and α_m → 0, β_m → 0 as m → ∞, where α_m is defined in (7.182) and β_m is defined in (7.194).

7.5.3 Optimization of resolution ability

In this section we give a theory of optimization of the resolution ability of linear optical instruments.

Let us consider the resolution ability for the problem of discriminating between two arbitrary signals s₁(x) and s₀(x) which are deterministic functions of x ∈ R^r. The observed signal in a bounded domain D ⊂ R^r is

U(x) = s_j(x) + n(x), j = 0 or j = 1, (7.197)

where n(x) is Gaussian noise,

n̄ = 0, D[n] = σ², \overline{n^*(x)n(y)} = R(x − y). (7.198)

Assume that the linear optical instrument is isoplanatic. This means that its transmission function h(x, y) is of the form h(x − y). In optics r = 2 and D is the input pupil, or entrance pupil, of the instrument.

Consider the case of incoherent signals. For incoherent signals the transmission power function is |h(x − y)|². This means that if I_in(x) := \overline{|s(x)|²} is the intensity of the signal s(x) in the object plane, then in the image plane one has the distribution of the intensity I_out(x) given by

I_out(x) = ∫ |h(x − y)|² I_in(y)dy, ∫ = ∫_{R²}. (7.199)

The bar denotes averaging in phases. This will be explained soon. Let us


briefly derive (7.199). One has

I_out(x) = \overline{| ∫ h(x − y)s_in(y)dy |²} = ∫∫ h(x − y)h^*(x − y′) \overline{s_in(y)s_in^*(y′)} dydy′. (7.200)

Let us now explain the meaning of the average. For incoherent signals by definition we have

s_in(y) = Σ_j A_j(y)e^{iφ_j(y)}, (7.201)

where the A_j(y)e^{iφ_j} are sources which form the signal s_in(y) at the point y, and the φ_j are their phases, which are assumed random, uniformly distributed in the interval [−π, π], and statistically independent, so that \overline{φ_j(y)φ_{j′}(y′)} = δ_{jj′}δ(y − y′), φ̄_j = 0. Under these assumptions one has

\overline{s_in(y)s_in^*(y′)} = Σ_{j,j′} A_j(y)A_{j′}^*(y′) \overline{exp{i[φ_j(y) − φ_{j′}(y′)]}}
= Σ_{j,j′} A_j(y)A_{j′}^*(y′) δ_{jj′} δ(y − y′) = Σ_j |A_j(y)|² δ(y − y′)
= I_in(y)δ(y − y′). (7.202)

Here we took into account that

I_in(y) = \overline{|s_in(y)|²} = Σ_{j,j′} A_j(y)A_{j′}^*(y) \overline{exp{i[φ_j(y) − φ_{j′}(y)]}} = Σ_j |A_j(y)|². (7.203)

From (7.202) and (7.200) one obtains (7.199). Denote

H(x) := |h(x)|² ≥ 0, (7.204)

where h(x) is the transmission function of an isoplanatic linear instrument. Then

I_out(x) = ∫ H(x − y)I_in(y)dy, ∫ := ∫_{R²}, (7.205)

for incoherent signals. Let us assume that

∫ H²(x)dx := E < ∞. (7.206)

One often assumes in applications that

H̃(λ) is negligible outside ∆, (7.207)


where ∆ ⊂ R² is a finite region. The assumption (7.207) means that the instrument filters out the spatial frequencies which do not belong to ∆. Let us finally assume that the noise is a real-valued function which is a perturbation of the observed intensity in the image plane. One can think of n(x), for example, as the noise of the receiver of the intensity.

The problem is: given the observed signal

U(x) = I_j(x) + n(x), j = 1 or j = 0,
I_j(x) = ∫ H(x − y)s_j(y)dy, s_j(x) := I_in,j(x), (7.208)

decide whether it is of the form (7.208) with j = 1 (hypothesis H₁) or with j = 0 (hypothesis H₀).

Applying the decision rule from Section 7.2 and taking into account that the signals are real valued, one solves the equation (7.29)

∫_D R(x, y)V(y)dy = I₁(x) − I₀(x) := I(x) (7.209)

and then checks inequality (7.39) with real-valued signals:

∫_D U(x)V(x)dx ≥ (1/2)∫_D [I₀(x) + I₁(x)]V(x)dx. (7.210)

If (7.210) holds, then the decision is that hypothesis H₁ occurred. Otherwise one concludes that hypothesis H₀ occurred. The error of the first kind of this rule is the probability to decide that H₁ occurred when in fact H₀ occurred:

α := P(γ₁ | H₀)
= P( ∫_D [I₀(x) + n(x)]V(x)dx ≥ (1/2)∫_D [I₀(x) + I₁(x)]V(x)dx )
= P( ∫_D n(x)V(x)dx ≥ (1/2)∫_D I(x)V(x)dx ), (7.211)

where I(x) is given by (7.209). Note that

\overline{∫_D n(x)V dx} = 0, (7.212)


d² := D[ ∫_D n(x)V(x)dx ] = ∫_D ∫_D R(x, y)V(y)V(x)dydx. (7.213)

We assume here that R(x, y) and V(x) are real-valued. Since

∫_D n(x)V(x)dx is Gaussian (because n(x) is), one concludes from (7.211), (7.212) and (7.213) that

α = (1/(d√(2π))) ∫_{d²/2}^{∞} exp(−t²/(2d²)) dt = erf(d/2), (7.214)

where erf(x) is defined in (7.61) and we used the equation

$$d^2 = \int_D I(x)V(x)\,dx, \tag{7.215}$$
which can be easily checked:

$$d^2 := \int_D\left(\int_D R(x,y)V(y)\,dy\right)V(x)\,dx = \int_D I(x)V(x)\,dx$$
because of (7.209). Therefore

$$\alpha = \min \quad\text{for those } H(x) \text{ for which } d = \max. \tag{7.216}$$
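A minimal numerical sketch of this decision rule, under illustrative assumptions not made in the text: $D$ is replaced by a one-dimensional grid, the noise is white Gaussian, so that $R = \sigma^2 I$ and (7.209) gives $V = I/\sigma^2$, and $I_0$, $I_1$ are toy intensities. The empirical error of the first kind is compared with the normal tail $P(Z \ge d/2)$ of (7.214).

```python
import numpy as np
from math import erfc, sqrt

# Illustrative discretization: D is a grid of N points, the noise is white
# Gaussian with covariance R = sigma^2 * I, and I0, I1 are toy intensities.
rng = np.random.default_rng(0)
N, sigma = 100, 1.0
x = np.linspace(0.0, 1.0, N)
I0 = np.sin(2 * np.pi * x)
I1 = I0 + 0.5 * np.exp(-((x - 0.5) ** 2) / 0.02)
I = I1 - I0

V = I / sigma**2                  # solves R V = I when R = sigma^2 * Id, cf. (7.209)
d = sqrt(I @ V)                   # d^2 = (I, V), cf. (7.215)
threshold = 0.5 * (I0 + I1) @ V   # right-hand side of the test (7.210)

trials, hits = 100_000, 0
for _ in range(10):               # batches keep the noise array small
    n = rng.normal(0.0, sigma, size=(trials // 10, N))
    hits += int(np.count_nonzero((I0 + n) @ V >= threshold))
alpha_mc = hits / trials          # empirical error of the first kind

# Theoretical value: the standard normal tail P(Z >= d/2) appearing in (7.214).
alpha_th = 0.5 * erfc(d / (2 * sqrt(2)))
assert abs(alpha_mc - alpha_th) < 0.01
```

Under $H_0$ the statistic $\int n V\,dx$ is $N(0, d^2)$, so the empirical rejection frequency should match the tail probability; the seeded run agrees to within Monte Carlo accuracy.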

For the error of the second kind, which is the probability to decide that $H_0$ occurred while in fact $H_1$ occurred, one has

$$\begin{aligned}
\beta := P(\gamma_0\mid H_1) &= P\left(\int_D (I_1+n)V\,dx < \frac12\int_D (I_0+I_1)V\,dx\right)\\
&= P\left(\int_D nV\,dx < -\frac12\int_D I(x)V(x)\,dx\right)\\
&= \frac{1}{d\sqrt{2\pi}}\int_{-\infty}^{-d^2/2}\exp\left(-\frac{t^2}{2d^2}\right)dt\\
&= \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{-d/2}\exp\left(-\frac{t^2}{2}\right)dt = \operatorname{erf}(d/2) = \alpha.
\end{aligned} \tag{7.217}$$
Therefore both $\alpha$ and $\beta=\alpha$ will be minimized if $H(x)$ solves the following optimization problem:

$$d^2 := \int_D\int_D R(x,y)V(y)V(x)\,dy\,dx = \max \tag{7.218}$$


subject to the conditions

$$\int_D R(x,y)V(y)\,dy = I_1(x)-I_0(x) := I(x) \tag{7.219}$$
and

$$I(x) = \int H(x-y)s(y)\,dy,\qquad s(x) := s_1(x)-s_0(x). \tag{7.220}$$
Let us assume for simplicity that $D$ is the whole plane:

$$D = \mathbb{R}^2. \tag{7.221}$$

Then, taking the Fourier transform of (7.219) and (7.220) yields

$$\tilde R(\lambda)\tilde V(\lambda) = \tilde I(\lambda) = \tilde H(\lambda)\tilde s(\lambda). \tag{7.222}$$

Here the last assumption, (7.221), was used. Thus

$$\tilde V(\lambda) = \tilde H(\lambda)\tilde s(\lambda)\tilde R^{-1}(\lambda). \tag{7.223}$$

Write (7.215) as

$$d^2 = \int I(x)V(x)\,dx = \frac{1}{(2\pi)^2}\int_{\mathbb{R}^2}\tilde R^{-1}|\tilde H(\lambda)|^2|\tilde s|^2\,d\lambda = \max \tag{7.224}$$
and (7.206) as

$$\mathcal{E} = (2\pi)^{-2}\int_{\mathbb{R}^2}|\tilde H|^2\,d\lambda. \tag{7.225}$$
The instrument with the power transmission function $H(x)$ which maximizes the functional (7.224) under the restriction (7.225) will have the maximum resolution power for the problem of discriminating the two given signals $s_1(x)$ and $s_0(x)$. The solution to (7.224) is easy to find: $|\tilde H|^2$ should be parallel to the vector $\tilde R^{-1}|\tilde s|^2$, so
$$|\tilde H|^2 = \tilde R^{-1}|\tilde s|^2\cdot \text{const}, \tag{7.226}$$
where the constant is uniquely determined by condition (7.225):

$$|\tilde H(\lambda)| = \left[\tilde R^{-1}(\lambda)|\tilde s(\lambda)|^2\,\mathcal{E}\right]^{1/2}\left[\frac{1}{2\pi}\left(\int_{\mathbb{R}^2}\tilde R^{-1}|\tilde s|^2\,d\lambda\right)^{1/2}\right]^{-1}. \tag{7.227}$$
Formula (7.227) determines $|\tilde H(\lambda)| := A(\lambda)$ uniquely, so that
$$\tilde H(\lambda) = A(\lambda)\exp[i\varphi(\lambda)], \tag{7.228}$$


where $\varphi(\lambda)$ is the unknown phase of $\tilde H(\lambda)$:

$$\int_{\mathbb{R}^2} H(x)\exp(i\lambda\cdot x)\,dx = A(\lambda)\exp[i\varphi(\lambda)],\qquad H(x)\ge 0. \tag{7.229}$$
Formula (7.224) shows that the resolution power of the optimal instrument depends on $A(\lambda)=|\tilde H(\lambda)|$ only. The phase $\varphi(\lambda)$ does not influence the resolution ability, but it does influence the size of the region $D$ outside of which $H(x)$ is negligibly small. If one writes the equation

$$\int_D H(x)\exp(i\lambda\cdot x)\,dx = A(\lambda)\exp[i\varphi(\lambda)],\qquad H(x)\ge 0, \tag{7.230}$$
and considers it as an equation for $H(x)$ and $\varphi(\lambda)$, given $A(\lambda)$ and $\arg H(x)=0$, then one has a phase retrieval problem. In (7.230) $D\subset\mathbb{R}^2$ is assumed to be a finite region with a smooth boundary. This problem has been studied in [Kl], where some uniqueness theorems are established. However, the numerical solution to this problem has not been studied sufficiently. The condition (7.206) does not seem to have physical meaning. One could assume instead that

$$\int H(x)\,dx = E. \tag{7.231}$$
In this case the constant in (7.226) cannot be found explicitly, in contrast to the case when condition (7.206) is assumed. If (7.231) is assumed, then it follows from (7.205) that

$$I_{\text{out}}(x) \le \max_{y\in\mathbb{R}^2} I_{\text{in}}(y)\,E. \tag{7.232}$$

7.5.4 A general definition of resolution ability

In this section a general definition of resolution ability is suggested. The classical Rayleigh definition deals with very special signals: two bright points versus one bright point. Suppose that the set $M$ of signals which one wishes to resolve is rather large. For example, one can assume that $M$ consists of all functions belonging to $L^2(D)$ or to $C_0^1(D)$, the space of functions which have one continuous derivative in a bounded domain $D\subset\mathbb{R}^r$ and are compactly supported in $D$. Suppose that the linear system is described by its transmission function

$$Lf = \int_D h(x,y)f(y)\,dy. \tag{7.233}$$


Assume that actually one observes the signal

$$\mathcal{U}(x) = Lf + n(x),\qquad \bar n = 0,\quad \mathcal{D}[n] = \sigma^2, \tag{7.234}$$
where $n(x)$ is noise. Let

$$B_\sigma \mathcal{U} = f_\sigma \tag{7.235}$$
denote a mapping which recovers $f$ given $\mathcal{U}(x)$. Let us assume that the operator $L^{-1}$ exists but is unbounded. Then in the absence of noise one can recover $f$ exactly by the formula $f = L^{-1}\mathcal{U}$, but $L^{-1}$ cannot serve as $B_\sigma$ in (7.235): first, because in the presence of noise $\mathcal{U}$ may not belong to the domain of $L^{-1}$; secondly, because even if $L^{-1}\mathcal{U}$ is well defined in the presence of noise, it may give a very poor estimate of $f$ due to the fact that $L^{-1}$ is unbounded. Let us define the resolution ability of the procedure $B_\sigma$ on the class $M$ of input signals as the maximal number $r\in[0,1]$ for which
$$\limsup_{\sigma\to 0}\,\sup_{f\in M}\frac{\mathcal{D}[B_\sigma\mathcal{U}-f]}{\sigma^r} < \infty. \tag{7.236}$$
This definition takes into account the class $M$ of the input signals, the procedure $B_\sigma$ for estimating $f$, and the properties of the system (see the definition (7.234) of $\mathcal{U}$). Therefore all the essential data are incorporated in the definition. The definition makes sense for nonlinear injective mappings $L$ as well. Roughly speaking, the idea of this definition is as follows. If the error in the input signal is $O(\sigma)$ and the identification procedure $B_\sigma$ produces $f_\sigma = B_\sigma\mathcal{U}$ such that, in some suitable norm, $\|f_\sigma - f\| = O(\sigma^r)$ as $\sigma\to 0$, then the larger $r$ is, the better the resolution ability of the procedure $B_\sigma$ is.

Example 7.5 Assume that the transmission function is

$$h(x-y) = \left(\frac{2}{\pi}\right)^{1/2}\frac{\sin(x-y)}{x-y}. \tag{7.237}$$
Note that

$$\tilde h(\lambda) := \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} h(x)e^{-i\lambda x}\,dx = \begin{cases} 1, & |\lambda|\le 1,\\ 0, & |\lambda| > 1.\end{cases} \tag{7.238}$$
Let

$$B_\sigma\mathcal{U} = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\frac{\tilde{\mathcal{U}}(\lambda)\exp(i\lambda x)}{\tilde h(\lambda)+a(\sigma)}\,d\lambda, \tag{7.239}$$


where $a(\sigma)>0$ for $\sigma>0$ will be chosen later. Let $M$ be the set of functions $f(x)\in L^2(-\infty,\infty)$ such that
$$\|\tilde f\|^2_{L^2[-1,1]} \le m,\qquad \tilde f(\lambda)=0 \ \text{for}\ |\lambda|>1, \tag{7.240}$$
where $m>0$ is a constant. Assume that $n(x)$ is an arbitrary function such that

$$n(x)\in L^2(-\infty,\infty),\qquad \|n\| := \left(\int_{-\infty}^{\infty}|n(x)|^2\,dx\right)^{1/2} \le \sigma. \tag{7.241}$$
By Parseval's equality, one has

$$\|B_\sigma\mathcal{U}-f\|^2 = \left\|\frac{\tilde h\tilde f+\tilde n}{a(\sigma)+\tilde h}-\tilde f\right\|^2 = \left\|\frac{\tilde n - a(\sigma)\tilde f}{a(\sigma)+\tilde h}\right\|^2 \le 2\left(\frac{\|\tilde n\|^2}{a^2(\sigma)} + a^2(\sigma)\|\tilde f\|^2\right) \le 2\left(\frac{\sigma^2}{a^2(\sigma)} + a^2(\sigma)m\right). \tag{7.242}$$

Choose

$$a(\sigma) = \sigma^{1/2}. \tag{7.243}$$

Then (7.242) and (7.243) yield

$$\|B_\sigma\mathcal{U}-f\|^2 \le 2(1+m)\sigma. \tag{7.244}$$

Therefore $r=1$ for the procedure $B_\sigma$ defined by formula (7.239):
$$\limsup_{\sigma\to 0}\,\sup_{f\in M}\frac{\|B_\sigma\mathcal{U}-f\|^2}{\sigma} \le 2(1+m) < \infty. \tag{7.245}$$
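The estimate (7.244) can be checked numerically. The sketch below works directly on an illustrative frequency grid (a discretization not taken from the text), evaluating all norms through Parseval's equality; the particular $\tilde f$ is an arbitrary function supported in $[-1,1]$.

```python
import numpy as np

# Illustrative frequency grid; \tilde f is supported in [-1, 1] as in (7.240).
lam = np.linspace(-5.0, 5.0, 4001)
dlam = lam[1] - lam[0]
h_t = (np.abs(lam) <= 1.0).astype(float)           # \tilde h of (7.238)
f_t = np.where(np.abs(lam) <= 1.0, (1.0 - lam**2) ** 2, 0.0)
m = np.sum(f_t**2) * dlam                          # so that (7.240) holds

rng = np.random.default_rng(1)
ratios = []
for sigma in [1e-2, 1e-3, 1e-4]:
    n_t = rng.normal(size=lam.size)
    n_t *= sigma / np.sqrt(np.sum(n_t**2) * dlam)  # ||n|| = sigma, cf. (7.241)
    u_t = h_t * f_t + n_t                          # transform of U = Lf + n
    a = np.sqrt(sigma)                             # a(sigma) = sigma^{1/2}, (7.243)
    b_t = u_t / (h_t + a)                          # transform of B_sigma U, (7.239)
    err2 = np.sum((b_t - f_t) ** 2) * dlam         # ||B_sigma U - f||^2 (Parseval)
    ratios.append(err2 / sigma)
assert max(ratios) <= 2 * (1 + m)                  # the bound (7.244)/(7.245)
```

The ratio $\|B_\sigma\mathcal{U}-f\|^2/\sigma$ stays bounded as $\sigma\to 0$, which is exactly the statement $r=1$ in (7.245).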

7.6 Ill-posed problems and estimation theory

7.6.1 Introduction

In this section we define the notion of an ill-posed problem and give examples of ill-posed problems. Many problems of practical interest can be reduced to solving an operator equation

$$Au = f, \tag{7.246}$$


where $u\in U$, $f\in F$, $A: U\to F$, $U$ and $F$ are Banach spaces, and $A$ is an injective mapping with discontinuous inverse and domain $D(A)\subset U$. Such problems are called ill-posed because small perturbations of $f$ may lead to large perturbations of the solution $u$, due to the fact that $A^{-1}$ is not continuous. In many cases $R(A)$, the range of $A$, is not the whole of $F$. In this case small perturbations of $f$ may lead to an equation which does not have a solution, because the perturbed $f$ does not belong to $R(A)$. The problem (7.246) is said to be well-posed in the sense of Hadamard if $A: D(A)\to F$ is injective and surjective, that is $R(A)=F$, and $A^{-1}: F\to U$ is continuous. The formulation of an ill-posed problem (7.246) can often be given as follows. Assume that the data are $\{\delta, A, f_\delta\}$, where $\delta>0$ is a given number and $f_\delta$ is a $\delta$-approximation of $f$ in the sense

$$\|f - f_\delta\| \le \delta. \tag{7.247}$$
The problem is: given the data $\{\delta, A, f_\delta\}$, find $u_\delta\in U$ such that
$$\|u_\delta - u\| \to 0 \ \text{as}\ \delta\to 0. \tag{7.248}$$
We use $\|\cdot\|$ for the norms in both $U$ and $F$. A more general formulation of the ill-posed problem is the one in which the data are $\{\delta, \eta, A_\eta, f_\delta\}$, where $\delta$ and $f_\delta$ are as above, $\eta>0$ is a positive number, and $A_\eta$ is an approximation of $A$ in a suitable sense; for example, if $A$ is a linear bounded operator one can assume $\|A_\eta - A\|\le\eta$. We will not discuss this more general formulation here (see [I], for example). Ill-posed problems can be formulated as estimation problems. For example, suppose that $A$ is a linear operator, $u$ solves equation (7.246), and one knows a randomly perturbed right-hand side of (7.246), namely

$$f + n, \tag{7.249}$$

where n is a random variable,

$$\bar n = 0,\qquad \overline{n^*(x)n(y)} = \sigma^2\delta(x-y). \tag{7.250}$$
The problem is to estimate $u$ given the data (7.249), (7.250). Let us give a few examples of ill-posed problems of interest in applications.

Example 7.6 Numerical differentiation.


Let
$$Au := \int_a^x u(t)\,dt = f(x). \tag{7.251}$$
Assume that $\delta>0$ and $f_\delta$ are given such that $\|f - f_\delta\|\le\delta$. Here $\|f\| = \max_{a\le x\le b}|f(x)|$, $F = U = C([a,b])$. The problem (7.251) is ill-posed. Indeed, the linear operator $A$ is injective: if $Au=0$ then $u=0$. Its range consists of the functions $f\in C^1[a,b]$ such that $f(a)=0$. Therefore equation (7.251) with $f_\delta$ in place of $f$ has no solutions in $C[a,b]$ if $f_\delta\notin C^1[a,b]$ or $f_\delta(a)\ne 0$. If one takes $f_\delta = f + \delta\sin[\omega(x-a)]$, then the solution to equation (7.251) with $f_\delta$ in place of $f$ exists: $u_\delta = f' + \delta\omega\cos[\omega(x-a)]$. Since $u = f'$, one has $\|u_\delta - u\| = \delta\omega \gg 1$ if $\omega$ is sufficiently large. Therefore a perturbation of $f$ which is small in the norm of $U$ resulted in a perturbation of $u$ which is large in the same norm. Therefore the formula $u_\delta = A^{-1}f_\delta = f'_\delta$ in this example does not satisfy condition (7.248). In Section 7.3 a stable solution to the problem (7.251) is given:
$$u_\delta(x) = \frac{f_\delta(x+h(\delta)) - f_\delta(x-h(\delta))}{2h(\delta)}. \tag{7.252}$$
It is proved that

$$\|f' - u_\delta\| = \|u - u_\delta\| \le \epsilon(\delta) \to 0 \ \text{as}\ \delta\to 0, \tag{7.253}$$
where $h(\delta)$ and $\epsilon(\delta)$ are given by formulas (7.67) (see Theorem 7.1). Formula (7.252) should be modified for $x < a + h(\delta)$ and $x > b - h(\delta)$ (see (7.107)).
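A small numerical experiment illustrating (7.252)-(7.253). The specific step $h(\delta) = (2\delta/M_2)^{1/2}$ and the error bound $(2M_2\delta)^{1/2}$ used below are the standard choices obtained by balancing the noise term $\delta/h$ against the Taylor remainder $M_2 h/2$, where $M_2$ bounds $|f''|$; the constants in the book's formulas (7.67) are not reproduced in this section and may differ.

```python
import numpy as np

# f(x) = sin(x), so M2 = 1 bounds |f''|; f_delta = f + noise, sup-norm <= delta.
rng = np.random.default_rng(2)
M2 = 1.0
errs, bounds = [], []
for delta in [1e-2, 1e-4, 1e-6]:
    h = np.sqrt(2.0 * delta / M2)       # step balancing delta/h against M2*h/2
    x = np.linspace(1.0, 5.0, 101)      # interior points, away from the endpoints
    e_plus = delta * (2 * rng.random(x.size) - 1)    # noise at x + h
    e_minus = delta * (2 * rng.random(x.size) - 1)   # noise at x - h
    u = (np.sin(x + h) + e_plus - np.sin(x - h) - e_minus) / (2 * h)  # (7.252)
    errs.append(np.max(np.abs(u - np.cos(x))))
    bounds.append(np.sqrt(2.0 * M2 * delta))         # = delta/h + M2*h/2
assert all(e <= b for e, b in zip(errs, bounds))
assert errs[2] < errs[0]                # the error shrinks together with delta
```

The naive derivative $f'_\delta$ would amplify the noise without bound, whereas the divided difference with step $h(\delta)\sim\delta^{1/2}$ keeps the error $O(\delta^{1/2})$.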

Remark 7.2 The notion of ill-posedness depends on the topology of $F$. For example, if one takes as $F$ the space of $C^1[a,b]$ functions which satisfy the condition $f(a)=0$, then problem (7.251) is well-posed and $A$ is an isomorphism of $C[a,b]$ onto $F$.

Example 7.7 Stable summation of orthogonal series with perturbed coefficients. Let

$$u(x) = \sum_{j=1}^{\infty} c_j\varphi_j(x),\qquad x\in D\subset\mathbb{R}^r. \tag{7.254}$$
Suppose $\{\varphi_j\}$, $(\varphi_j,\varphi_m)=\delta_{jm}$, is an orthonormal basis of $L^2(D)$; the parentheses denote the inner product in $L^2(D)$. Assume that $u\in L^2(D)$. This happens if and only


if

$$\sum_{j=1}^{\infty}|c_j|^2 < \infty. \tag{7.255}$$
Suppose the perturbed coefficients are given:
$$c_{j\delta} = c_j + \epsilon_j,\qquad |\epsilon_j|\le\delta. \tag{7.256}$$
The problem is: given the data $\{c_{j\delta}\}$ and $\delta>0$, find $u_\delta$ such that (7.248) holds, the norm in (7.248) being the $L^2(D)$ norm. Consider the map $A: L^2(D)\to\ell^\infty$ which sends a function $u(x)\in L^2(D)$ into the sequence $c = \{c_j\} = (c_1,\ldots,c_j,\ldots)\in\ell^\infty$ by formula (7.254). Since, in general, the perturbed sequence $c_\delta = \{c_{j\delta}\}\notin\ell^2$, the series $\sum_{j=1}^\infty c_{j\delta}\varphi_j$ diverges in $L^2(D)$, so that the perturbed sequence $\{c_{j\delta}\}$ may not belong to the range of $A$. It is easy to give examples in which $\{c_{j\delta}\}\in R(A)$ but the function $u_\delta := \sum_{j=1}^\infty c_{j\delta}\varphi_j(x)$ differs from $u$ in the $L^2(D)$ norm as much as one wishes, no matter how small $\delta>0$ is.

Exercise. Construct such an example.
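A sketch of one possible construction (an illustration, not the unique answer): perturb the first $N$ coefficients by $\delta$ each and leave the rest untouched. The perturbed sequence is still square-summable, hence corresponds to an $L^2(D)$ function, yet by Parseval the error is $\delta\sqrt{N}$, which is arbitrarily large for large $N$.

```python
import numpy as np

# eps_j = delta for j <= N, eps_j = 0 for j > N: the perturbed sequence is
# still in l^2 (so {c_{j,delta}} is in R(A)), yet by Parseval the L^2 error
# ||u_delta - u|| = delta * sqrt(N) grows without bound as N grows.
delta = 1e-3
errors = []
for N in [10**2, 10**4, 10**6]:
    eps = np.full(N, delta)                 # |eps_j| <= delta, cf. (7.256)
    errors.append(np.sqrt(np.sum(eps**2)))  # L^2 error via Parseval
assert np.allclose(errors, [0.01, 0.1, 1.0])
```

With $\delta = 10^{-3}$ the error is already of order $1$ for $N = 10^6$, although every individual coefficient is perturbed by at most $10^{-3}$.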

Therefore $A^{-1}$ is unbounded from $\ell^\infty$ into $L^2(D)$, and the problem is ill-posed. Note that if one changes the topology on the set of sequences $\{c_j\}$ from $\ell^\infty$ to $\ell^2$ then the problem becomes well-posed and the operator $A: L^2(D)\to\ell^2$ is an isomorphism. A stable solution to the problem is given in Section 7.4.

Example 7.8 Integral equation of the first kind. Let $A$ be an integral operator, $A: H\to H$, $H = L^2(D)$,
$$Au = \int_D A(x,y)u(y)\,dy = f(x),\qquad x\in D. \tag{7.257}$$
If $A$ is injective, that is, $Au=0$ implies $u=0$, and $A$ is compact, then $A^{-1}$ is not continuous in $H$ and $R(A)$ is not closed in $H$, so that perturbations of $f$ which are small in $H$ may result in large perturbations of $u$, or may lead to an equation which has no solutions in $H$ (this happens if the perturbed $f$ does not belong to $R(A)$). Therefore the problem (7.257) is ill-posed.

Example 7.9 Computation of values of unbounded operators. Suppose $B: F\to U$ is an unbounded linear operator densely defined in $F$ (that is, its domain of definition $D(B)$ is dense in $F$). Suppose $f\in D(B)$,


$Bf = u$. Assume that instead of $f$ we are given a number $\delta>0$ and $f_\delta$ such that $\|f - f_\delta\|\le\delta$.

The problem is: given $f_\delta$ and $B$, compute $u_\delta$ such that $\|u_\delta - u\| = \|u_\delta - Bf\| \to 0$ as $\delta\to 0$.

This problem is ill-posed. If $A^{-1}$ is unbounded and $B = A^{-1}$, the problem (7.257) reduces to the above problem.

Example 7.10 Analytic continuation. Suppose $f(z)$ is analytic in a bounded domain $D$ of the complex plane and continuous in $\bar D$. Assume that $D_1\subset D$ is a strictly inner subdomain of $D$.

The problem is: given $f(z)$ in $D_1$, find $f(z)$ in $D$.

By Cauchy’s formula one has

$$\frac{1}{2\pi i}\int_{\partial D}\frac{f(t)\,dt}{t-z} = f(z),\qquad z\in D_1. \tag{7.258}$$
This is an integral equation of the first kind for the unknown function $f(t)$ on $\partial D$. If $f(t)$ is found, then $f(z)$ in $D$ is determined by Cauchy's formula. Therefore the problem of analytic continuation from the subdomain $D_1$ to the domain $D$ is ill-posed.

Example 7.11 Identification problems. Let $s_{\text{in}}(x)$ be the input signal and $s_{\text{out}}(x)$ the output signal of a linear system with the transmission function $h(x,y)$, that is,

$$\int_D h(x,y)s_{\text{in}}(y)\,dy = s_{\text{out}}(x),\qquad x\in\Delta, \tag{7.259}$$
where $D$ and $\Delta$ are bounded domains in $\mathbb{R}^r$, and the function $h$ is continuous in $\bar D\times\bar\Delta$.

The identification problem is: given $s_{\text{out}}(x)$ and $h(x,y)$, find $s_{\text{in}}(y)$.

Equation (7.259) is an integral equation of the first kind. Therefore the above problem is ill-posed.


Example 7.12 Many inverse problems arising in physics are ill-posed; for example, the inverse scattering problem in three dimensions. The problem consists of finding the potential from the given scattering amplitude (see [R 26] and Chapter 6). Inverse problems of geophysics are often ill-posed [R 2].

Example 7.13 Ill-posed problems in linear algebra. Consider equation (7.246) with $U = \mathbb{R}^n$ and $F = \mathbb{R}^m$, where $\mathbb{R}^m$ is the $m$-dimensional Euclidean space. Let $N(A) = \{u : Au = 0\}$ be the null-space of $A$, and $R(A)$ be the range of $A$. If $N(A)\ne\{0\}$, define the normal solution $u_0$ to (7.246) as the solution orthogonal to $N(A)$:

$$Au_0 = f,\qquad u_0\perp N(A). \tag{7.260}$$

This solution is unique: if $\tilde u_0$ is another solution to (7.260), then

$$A(u_0 - \tilde u_0) = 0,\qquad u_0 - \tilde u_0 \perp N(A).$$

This implies that $u_0 = \tilde u_0$. One can prove that the normal solution can be defined as the solution to (7.246) whose norm is minimal:
$$\min \|u\| = \|u_0\|, \tag{7.261}$$
where the minimum is taken over the set of all solutions to equation (7.246), and the minimizer is unique: $u = u_0$. Indeed, any element $u\in\mathbb{R}^n$ can be uniquely represented as

$$u = u_0 \oplus u_1,\qquad u_0\perp N(A),\quad u_1\in N(A), \tag{7.262}$$

$$\|u\|^2 = \|u_0\|^2 + \|u_1\|^2. \tag{7.263}$$

If $Au = f$ then $Au_0 = f$, and (7.263) implies (7.261). Moreover, the minimum is $\|u_0\|$ and is attained if and only if $u_1 = 0$. The normal solution to the equation $Au = f$ can be defined as the least squares solution:

$$\|Au - f\| = \min,\qquad u\perp N(A), \tag{7.264}$$
in the case when $f\notin R(A)$. This solution exists and is unique. Existence follows from the fact that the minimum in (7.264) is attained at the element $u$ such that $\|Au - f\| = \operatorname{dist}(f, R(A))$ (note that $R(A)$ is a closed subspace of $\mathbb{R}^m$). Uniqueness of the normal solution to (7.264) is proved as above.
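Numerically, the normal solution is delivered by the Moore-Penrose pseudoinverse (the mapping $A^+$ of Definition 7.2 below). A small sketch with an illustrative $2\times 3$ matrix and data $f\notin R(A)$:

```python
import numpy as np

# Illustrative 2x3 system with nontrivial null-space and f outside R(A).
A = np.array([[1.0, 2.0, 0.0],
              [2.0, 4.0, 0.0]])     # rank 1
f = np.array([1.0, 1.0])            # not in R(A) = span{(1, 2)}

u0 = np.linalg.pinv(A) @ f          # Moore-Penrose pseudoinverse: A^+ f

ns = np.array([[2.0, -1.0, 0.0],    # a basis of N(A)
               [0.0, 0.0, 1.0]])
assert np.allclose(ns @ u0, 0.0)    # u0 is orthogonal to N(A), cf. (7.260)

u1 = u0 + 0.3 * ns[0]               # another least squares solution
assert np.isclose(np.linalg.norm(A @ u0 - f), np.linalg.norm(A @ u1 - f))
assert np.linalg.norm(u0) < np.linalg.norm(u1)   # minimal norm, cf. (7.261)
```

Adding any null-space vector to $u_0$ leaves the residual unchanged but strictly increases the norm, in agreement with (7.262)-(7.263).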


Lemma 7.5 The problem of finding the normal solution to (7.246) is well-posed in the sense of Hadamard: for any $f\in\mathbb{R}^m$ there exists a unique normal solution $u_0$ to equation (7.246), and this solution depends continuously on $f$: if
$$Au_0 = f,\qquad Au_{0\delta} = f_\delta,\qquad \|f - f_\delta\|\le\delta, \tag{7.265}$$
then
$$\|u_0 - u_{0\delta}\| \to 0 \ \text{as}\ \delta\to 0. \tag{7.266}$$
Proof. Existence and uniqueness are proved above. Let us prove (7.266). Let
$$f = f_1 \oplus f_2,\qquad f_\delta = f_{\delta 1} \oplus f_{\delta 2}, \tag{7.267}$$
where $f_1\in R(A)$, $f_2\perp R(A)$, and $f_{\delta 1}$, $f_{\delta 2}$ are defined similarly. One has
$$Au_0 = f_1,\qquad Au_{0\delta} = f_{\delta 1}. \tag{7.268}$$

The operator $A: N(A)^\perp \to R(A)$ is an isomorphism. Therefore $A^{-1}: R(A)\to N(A)^\perp$ is continuous. Lemma 7.5 is proved. $\square$

Definition 7.2 The mapping $A^+: f\to u_0$ is called the pseudoinverse of $A$. The normal solution $u_0$ is sometimes called a pseudosolution.

Although we have proved that the problem of finding the normal solution to equation (7.246) is well-posed in the sense of Hadamard when $A: \mathbb{R}^n\to\mathbb{R}^m$, we wish to demonstrate that in practice this problem should be considered as ill-posed in many cases of interest. As an example, take $n = m$, $A: \mathbb{R}^n\to\mathbb{R}^n$, $N(A) = \{0\}$, so that $A$ is injective and, by Fredholm's alternative, $A$ is an isomorphism of $\mathbb{R}^n$ onto $\mathbb{R}^n$. Consider the equations

$$Au = f,\qquad Au_\delta = f_\delta. \tag{7.269}$$
Thus

$$A(u - u_\delta) = f - f_\delta,\qquad u - u_\delta = A^{-1}(f - f_\delta).$$
Therefore

$$\|u - u_\delta\| \le \|A^{-1}\|\,\|f - f_\delta\|. \tag{7.270}$$
Since
$$\|f\| = \|Au\| \le \|A\|\,\|u\|, \tag{7.271}$$


one obtains

$$\frac{\|u - u_\delta\|}{\|u\|} \le \|A^{-1}\|\,\|A\|\,\frac{\|f - f_\delta\|}{\|f\|}. \tag{7.272}$$
Define $\nu(A)$, the condition number of $A$, by

$$\nu(A) := \|A^{-1}\|\,\|A\|. \tag{7.273}$$
Then (7.272) shows that the relative error of the solution can be large, even if the relative error $\|f_\delta - f\|/\|f\|$ of the data is small, provided that $\nu(A)$ is large. Note that the inequality (7.272) is sharp in the sense that the equality sign can be attained for some $u$, $u_\delta$, $f$ and $f_\delta$. The point is that if $\nu(A)$ is very large, then the problem of solving equation (7.246) with $A: \mathbb{R}^n\to\mathbb{R}^n$, $N(A) = \{0\}$, is practically ill-posed, in the sense that small relative perturbations of $f$ may lead to large relative perturbations of $u$.
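This effect is easy to reproduce. Hilbert matrices are a classical family with rapidly growing $\nu(A)$; the sizes and the perturbation below are illustrative.

```python
import numpy as np

# Hilbert matrices: a classical family with huge condition number.
n = 10
A = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])
nu = np.linalg.cond(A)              # nu(A) = ||A^{-1}|| ||A|| in the 2-norm
assert nu > 1e12

u = np.ones(n)
f = A @ u
f_delta = f + 1e-10                 # a tiny perturbation of the data
u_delta = np.linalg.solve(A, f_delta)
rel_u = np.linalg.norm(u - u_delta) / np.linalg.norm(u)
rel_f = np.linalg.norm(f - f_delta) / np.linalg.norm(f)
assert rel_u > 1e3 * rel_f          # relative error enormously amplified
```

A data perturbation of relative size $\sim 10^{-10}$ produces a solution error many orders of magnitude larger, exactly as (7.272) permits when $\nu(A)$ is large.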

7.6.2 Stable solution of ill-posed problems

In this section we sketch some methods for the stable solution of ill-posed problems, that is, for finding $u_\delta$ which satisfies (7.248). First let us prove the following known lemma. By a compactum $M\subset U$ we mean a closed set such that any infinite sequence of its elements contains a convergent subsequence.

Lemma 7.6 Let $M\subset U$ be a compactum. Assume that $A: M\to N$ is closed and injective, $N := AM$. Then $A^{-1}: N\to M$ is continuous.

Remark 7.3 We assume throughout that $U$ and $F$ are Banach spaces, but often the results and proofs are valid for more general topological spaces. These details are not of prime interest for the theory developed in this work, and for this reason we do not give the results in their most general form.

Proof. Let $Au_n = f_n$, $f_n\in N$. Assume that
$$\|f_n - f\| \to 0 \ \text{as}\ n\to\infty. \tag{7.274}$$
We wish to prove that $f\in N$, that is, there exists a $u\in M$ such that $Au = f$, and

$$\|u_n - u\| \to 0. \tag{7.275}$$


Since $u_n\in M$ and $M$ is a compactum, there exists a convergent subsequence, which is denoted again $u_n$, with limit $u$: $\|u_n - u\|\to 0$. Since $M$ is a compactum, it is closed. Therefore $u\in M$. Since $u_n\to u$, $Au_n\to f$, and $A$ is closed, one concludes that $Au = f$. Lemma 7.6 is proved. $\square$

This lemma shows that if one assumes a priori that the set of solutions to equation (7.246) belongs to a compactum $M$, then the operator $A^{-1}$ (which exists on $R(A)$, since we assume $A$ to be injective) is continuous on the set $N := AM$. Therefore an ill-posed problem which is considered under the condition $f\in N$ becomes conditionally well-posed. This leads to the following definition.

Definition 7.3 A quasisolution of equation (7.246) on a compactum $M$ is the solution to the problem
$$\|Au - f\| = \min,\qquad u\in M. \tag{7.276}$$
Here $A: U\to F$ is a linear bounded operator.

The functional $\epsilon(u) := \|Au - f\|$ is continuous and therefore it attains its minimum on the compactum $M$. Thus a quasisolution exists. In order to prove its uniqueness and continuous dependence on $f$, one needs additional assumptions. For example, one can prove

Theorem 7.4 If $A$ is linear, bounded, and injective, $M$ is a convex compactum, and $F$ is strictly convex, then for any $f\in F$ the quasisolution exists, is unique, and depends on $f$ continuously.

The proof of Theorem 7.4 requires some preparation. Recall that $F$ is called strictly convex if and only if $\|u + v\| = \|u\| + \|v\|$ implies that $v = \lambda u$ for some constant $\lambda$.

Exercise. Prove that $\lambda$ has to be positive.

The spaces $L^p(D)$ and $\ell^p$ with $1 < p < \infty$, and Hilbert spaces, are strictly convex, while $L^1(D)$, $C(D)$ and $\ell^1$ are not strictly convex.

Definition 7.4 If $g\in U$ is a vector and $M\subset U$ is a set, then an element $h\in M$ is called the metric projection of $g$ onto $M$ if and only if $\|g - h\| = \inf_{u\in M}\|g - u\|$. The mapping $P: g\to h$ is called the metric projection mapping, $Pg = h$, or $P_M g = h$.

In general, $Pg$ is a set of elements. Therefore the following lemma is of interest.


Lemma 7.7 If $U$ is strictly convex and $M$ is convex then the metric projection mapping onto $M$ is single-valued.

Proof. Suppose $h_1\ne h_2$, $h_j\in Pg$, $j = 1, 2$. Then
$$m := \|h_1 - g\| = \|h_2 - g\| \le \|u - g\|\qquad \forall u\in M.$$
Since $M$ is convex, $(h_1 + h_2)/2\in M$. Thus
$$m \le \left\|g - \frac{h_1 + h_2}{2}\right\| \le \frac12\|g - h_1\| + \frac12\|g - h_2\| = m. \tag{7.277}$$

Therefore, since $U$ is strictly convex, one concludes that $g - h_1 = \lambda(g - h_2)$, where $\lambda$ is a real constant. Since $\|g - h_1\| = \|g - h_2\|$, it follows that $\lambda = \pm 1$. If $\lambda = 1$ then $h_1 = h_2$, contrary to the assumption. If $\lambda = -1$, then

$$g = (h_1 + h_2)/2. \tag{7.278}$$

Since $M$ is convex, equation (7.278) implies that $g\in M$. This is a contradiction, since $g\in M$ implies $Pg = g$. Lemma 7.7 is proved. $\square$

Lemma 7.8 If $U$ is strictly convex and $M$ is a convex compactum then $P: U\to M$ is continuous.

Proof. Suppose $\|g_n - g\|\to 0$ but $\|h_n - h\| \ge \epsilon > 0$, where $h_n = Pg_n$, $h = Pg$. Since $M$ is a compactum, one can assume that $h_n\to h_\infty$, $h_\infty\in M$. Thus $\|h_\infty - h\| \ge \epsilon > 0$. One has
$$\|g - h\| \le \|g - h_\infty\| \tag{7.279}$$
and

$$\|g - h_\infty\| \le \|g - g_n\| + \|g_n - h_n\| + \|h_n - h_\infty\| \to \|g - h\|,\qquad n\to\infty. \tag{7.280}$$

Indeed, $\|g - g_n\|\to 0$, $\|h_n - h_\infty\|\to 0$, and
$$\|g_n - h_n\| = \operatorname{dist}(g_n, M) \to \operatorname{dist}(g, M) = \|g - h\|. \tag{7.281}$$
From (7.279) and (7.280) one obtains $\|g - h_\infty\| = \|g - h\|$. This implies $h_\infty = h$, as in the proof of Lemma 7.7. This contradiction proves Lemma 7.8. $\square$


Exercise. Prove that $\operatorname{dist}(g, M) := \inf_{u\in M}\|g - u\|$ is a continuous function of $g$.

We are ready to prove Theorem 7.4.

Proof of Theorem 7.4 Existence of the solution to (7.276) is already proved. Since $M$ is convex and $A$ is linear, the set $AM := N$ is convex. Since $N$ is convex and $F$ is strictly convex, Lemma 7.7 says that $P_N f$ exists and is unique, while Lemma 7.8 says that $P_N f$ depends on $f$ continuously. Let $Au = P_N f$. Since $A$ is injective, $u = A^{-1}P_N f$ is uniquely defined and, by Lemma 7.6, depends continuously on $f$. Theorem 7.4 is proved. $\square$

It follows from Theorem 7.4 that if $M\subset U$ is a convex compactum which contains the solution $u$ to equation (7.246), if $A$ is an injective linear bounded operator, and if $F$ is strictly convex, then the function

$$u_\delta = A^{-1}P_{AM}f_\delta \tag{7.282}$$

satisfies (7.248). The function $u_\delta$ can be found as the unique solution to the optimization problem (7.276) with $f_\delta$ in place of $f$. One could assume $A$ closed, rather than bounded, in Theorem 7.4. Uniqueness of the quasisolution is not very important for practical purposes. If there is a set $\{u_\delta\}$ of solutions to (7.276) with $f_\delta$ in place of $f$, if $A$ is injective and $Au_0 = f_0$, and if $\|f_\delta - f_0\|\le\delta$, then $\|u_\delta - u_0\|\to 0$ as $\delta\to 0$ for any of the elements of the set $\{u_\delta\}$. Indeed, $\|Au_\delta - f_\delta\| \le \|Au_0 - f_\delta\| = \|f_0 - f_\delta\| \le \delta$. Therefore $\|Au_\delta - Au_0\| \le \|Au_\delta - f_\delta\| + \|f_\delta - f_0\| \le 2\delta$. Since $M$ is compact and $u_\delta, u_0\in M$, the inequality $\|Au_\delta - Au_0\|\le 2\delta$ implies $\|u_\delta - u_0\|\to 0$ as $\delta\to 0$ (see Lemma 7.6). We have finished the description of the first method, the method of quasisolutions, for finding a stable solution to problem (7.246). How does one choose $M$? The choice of $M$ is made in accordance with the a priori knowledge about the solution to (7.246). For instance, in Example 7.6 one can take as $M$ the set of functions $u$ which satisfy the condition

$$|u(a)| \le M_1,\qquad |u'(x)| \le M_2, \tag{7.283}$$

where $M_j$ are constants, $j = 1, 2$. Then

$$Au = \int_a^x u\,dt = f$$


and

$$|f''| \le M_2,\qquad |f'(a)| \le M_1,\qquad f(a) = 0. \tag{7.284}$$
Inequality (7.283) defines a convex compactum $M$ in $L^2[a,b]$, and (7.284) defines $N = AM$ in $L^2[a,b]$. Theorem 7.4 is applicable (since $L^2[a,b]$

is strictly convex) and guarantees that $\|u - u_\delta\|_{L^2[a,b]}\to 0$. A stable approximation of $u = f'(x)$ in the $C[a,b]$ norm is given in Theorem 7.1.

Let us now turn to the second method for constructing $u_\delta$ which satisfies equation (7.248). This is a variational method, also known as a regularization method. While in the first method one needs to solve the variational problem (7.276) with the restriction $u\in M$, in the second method one has to solve a variational problem without restrictions. Consider the functional
$$F(u) := \|Au - f_\delta\|^2 + \gamma\varphi^2(u),\qquad \|f_\delta - f\|\le\delta, \tag{7.285}$$
where $0 < \gamma = \text{const}$ is a parameter, $A$ is a linear bounded operator, and $\varphi(u)$ is a positive strictly convex densely defined functional which defines a norm:
$$\varphi(u)\ge 0,\qquad \varphi(u) = 0 \Rightarrow u = 0,\qquad \varphi(\lambda u) = |\lambda|\varphi(u), \tag{7.286}$$
$$\varphi\left(\frac{u_1 + u_2}{2}\right) < \frac{\varphi(u_1) + \varphi(u_2)}{2} \quad\text{if } u_1\ne\lambda u_2,\ \lambda = \text{const}. \tag{7.287}$$
We also assume that the set of $u\in U$ which satisfy the inequality
$$\varphi(u)\le c \tag{7.288}$$
is compact in $U$. In other words, the closure of the set $\operatorname{Dom}\varphi(u)$ in the norm $\varphi(u)$ is a Banach space $U_\varphi\subset U$ which is dense in $U$, and the imbedding operator $i: U_\varphi\to U$ is compact. One often takes
$$\varphi(u) = \|\mathcal{L}u\|, \tag{7.289}$$
where $\mathcal{L}: U\to U$ is a linear densely defined boundedly invertible operator with compact inverse. An operator is called boundedly invertible if its inverse is a bounded operator defined on all of $U$. Let us assume that $U$ is reflexive, so that from any bounded set of $U$ one can select a subsequence which converges weakly in $U$. We will need a few concepts of nonlinear analysis in order to study the minimization problem $F(u) = \min$.
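A minimal sketch of the regularization method in the Hilbert space setting with $\varphi(u) = \|u\|$, where the minimizer of (7.285) solves the normal equations $(A^*A + \gamma I)u = A^*f_\delta$. The kernel, signals, and sizes are illustrative (a discretized first-kind integral equation in the spirit of Example 7.8), not taken from the text.

```python
import numpy as np

# Discretized first-kind integral equation (cf. Example 7.8) with a
# smoothing Cauchy-type kernel; all sizes and signals are illustrative.
n = 200
t = np.linspace(0.0, 1.0, n)
dt = t[1] - t[0]
A = 0.05 / (0.05**2 + (t[:, None] - t[None, :]) ** 2) * dt
u_true = np.exp(-((t - 0.4) ** 2) / 0.01)
f = A @ u_true

rng = np.random.default_rng(3)
delta = 1e-3
noise = rng.normal(size=n)
f_delta = f + delta * noise / np.linalg.norm(noise)   # ||f_delta - f|| = delta

def tikhonov(gamma):
    # With phi(u) = ||u|| in a Hilbert space, the minimizer of (7.285)
    # solves the normal equations (A^T A + gamma I) u = A^T f_delta.
    return np.linalg.solve(A.T @ A + gamma * np.eye(n), A.T @ f_delta)

u_naive = np.linalg.solve(A, f_delta)   # unregularized inversion: useless
u_reg = tikhonov(delta)                 # gamma(delta) = delta satisfies (7.301)
assert np.linalg.norm(u_reg - u_true) < np.linalg.norm(u_naive - u_true)
assert np.linalg.norm(u_reg - u_true) < 0.1 * np.linalg.norm(u_true)
```

The unregularized solve amplifies the data noise through the near-zero singular values of $A$, while the penalized functional keeps the reconstruction close to $u$.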


Definition 7.5 A functional $F: U\to\mathbb{R}^1$ is called convex if $D(F) := \operatorname{Dom}F$ is a linear set and for all $u, v\in D(F)$ one has
$$F(\lambda u + (1-\lambda)v) \le \lambda F(u) + (1-\lambda)F(v),\qquad 0\le\lambda\le 1. \tag{7.290}$$
Definition 7.6 A functional $F(u)$ is called weakly lower semicontinuous if

$$u_n \rightharpoonup u \ \Rightarrow\ \liminf_{n\to\infty}F(u_n) \ge F(u), \tag{7.291}$$
where $\rightharpoonup$ denotes weak convergence in $U$.

Lemma 7.9 A weakly lower semicontinuous functional $F(u)$ in a reflexive Banach space $U$ is bounded from below on any bounded weakly closed set $M\subset\operatorname{Dom}F$ and attains its minimum on $M$ at a point of $M$.

Note that a set $M$ is weakly closed if $u_n\in M$ and $u_n\rightharpoonup u$ imply $u\in M$.

Proof of Lemma 7.9 Let

$$-\infty \le d := \inf_{u\in M}F(u),\qquad F(u_n)\to d,\quad u_n\in M. \tag{7.292}$$
Since $M$ is bounded and $U$ is reflexive, there exists a weakly convergent subsequence of $u_n$, which we denote $u_n$ again: $u_n\rightharpoonup u$. Since $M$ is weakly closed, one concludes that $u\in M$. Since $F$ is weakly lower semicontinuous, one concludes that

$$d \le F(u) \le \liminf_{n\to\infty}F(u_n) = d. \tag{7.293}$$
Therefore $d > -\infty$, and $F(u) = d$. Lemma 7.9 is proved. $\square$

Lemma 7.10 A weakly lower semicontinuous functional $F(u)$ in a reflexive Banach space attains its minimum on every bounded, closed, and convex set $M$.

Proof. Any such set $M$ in a reflexive Banach space is weakly closed. Thus Lemma 7.10 follows from Lemma 7.9. $\square$

Exercise. Prove the following lemmas.

Lemma 7.11 If $F(u)$ is a weakly lower semicontinuous functional in a reflexive Banach space such that
$$F(u)\to+\infty \ \text{as}\ \|u\|\to\infty, \tag{7.294}$$
then $F(u)$ attains its minimum on any closed convex set $M\subset U$.


Lemma 7.12 A weakly lower semicontinuous functional in a reflexive Banach space attains its minimum on every compactum.

Lemma 7.13 A convex Gâteaux differentiable functional $F(u)$ is weakly lower semicontinuous in a reflexive Banach space.

Definition 7.7 A functional $F(u)$ is Gâteaux differentiable in $U$ if and only if

$$\lim_{t\to+0}t^{-1}[F(x + th) - F(x)] = Ah \tag{7.295}$$
for any $x, h\in U$, where $A: U\to\mathbb{R}^1$ is a linear bounded functional on $U$.

Proof of Lemma 7.13 If $u_n\rightharpoonup u$ then convexity of $F(u)$ implies

$$F(u) \le F(u_n) + F'(u)(u - u_n). \tag{7.296}$$
Pass to the limit infimum in (7.296), noting that $F'(u)(u - u_n)\to 0$ because $F'(u)$ is a bounded linear functional and $u_n\rightharpoonup u$, to get

$$F(u) \le \liminf_{n\to\infty}F(u_n). \tag{7.297}$$
Lemma 7.13 is proved. $\square$

Exercise. Prove that if $F(u)$ is Gâteaux differentiable and convex in the sense (7.290), then

$$F(u) - F(v) \le F'(u)(u - v)\qquad \forall u, v\in\operatorname{Dom}F. \tag{7.298}$$
In fact, if $F(u)$ is Gâteaux differentiable, then

$$(7.290) \Leftrightarrow (7.298) \Leftrightarrow (F'(u) - F'(v),\, u - v) \ge 0. \tag{7.299}$$

The last inequality means that $F'(u)$ is monotone. The parentheses in (7.299) denote the value of the linear functional $F'(u) - F'(v)\in U^*$ at the element $u - v\in U$; by $U^*$ the space of linear bounded functionals on $U$ is denoted. We are now ready to prove the following theorem.

Theorem 7.5 Assume that $A$ is a linear bounded injective operator defined on the reflexive Banach space $U$, and $\varphi(u)$ is a strictly convex weakly lower semicontinuous functional such that the set (7.288) is compact in $U$. Then the minimization problem

$$F(u) = \min, \tag{7.300}$$


where $F(u)$ is defined in (7.285), has a unique solution $u_{\delta,\gamma}$ for any $\gamma > 0$; and if one chooses $\gamma = \gamma(\delta)$ so that

$$\gamma(\delta)\to 0,\qquad \delta^2\gamma^{-1}(\delta)\le m < \infty \ \text{as}\ \delta\to 0, \tag{7.301}$$

where $m = \text{const} > 0$, then $u_\delta := u_{\delta,\gamma(\delta)}$ satisfies (7.248).

Proof. The functional $F(u)\ge 0$. Let $0\le d := \inf_{u\in U}F(u)$, and let $u_n$ be a minimizing sequence:

$$F(u_n)\to d. \tag{7.302}$$
Then

$$d \le F(u_n) \le d + \epsilon \le \delta^2 + \gamma\varphi^2(u_f) + \epsilon,\qquad u_f := A^{-1}f,\quad \forall n > n(\epsilon), \tag{7.303}$$
where $\epsilon > 0$ is a sufficiently small number; see (7.317) below. From (7.301) and (7.303) one concludes that

$$\gamma\varphi^2(u_n) \le d + \epsilon \le \gamma\left[\gamma^{-1}\delta^2 + \varphi^2(u_f)\right] + \epsilon \le c\gamma, \tag{7.304}$$
say with $\epsilon\le\gamma$, so that $\varphi^2(u_n)\le c$, $c := m + \varphi^2(u_f) + 1$. Therefore one can choose a convergent subsequence from the sequence $u_n$. This subsequence is denoted also $u_n$:

$$u_n \xrightarrow{\ U\ } u_0,\qquad u_0 = u_{0\delta} := u_\delta. \tag{7.305}$$
Since $A$ is continuous one has

$$\|Au_n - f_\delta\| \to \|Au_0 - f_\delta\|. \tag{7.306}$$
The lower semicontinuity of $\varphi(u)$ and (7.305) imply

$$\liminf_{n\to\infty}\varphi(u_n) \ge \varphi(u_0). \tag{7.307}$$
Thus

$$d \le F(u_0) \le \liminf_{n\to\infty}F(u_n) = d. \tag{7.308}$$
Therefore the solution to (7.300) exists and the limit (7.305) is a solution. Suppose $v$ is another solution:

$$d = F(v) = F(u_0). \tag{7.309}$$

Then, since F (u) is convex, one has

$$d \le F(\lambda u_0 + (1-\lambda)v) \le \lambda F(u_0) + (1-\lambda)F(v) = d \qquad \forall\lambda\in[0,1]. \tag{7.310}$$


Therefore
$$F\left(\frac{u_0 + v}{2}\right) = \frac{F(u_0) + F(v)}{2} = d. \tag{7.311}$$
This implies that
$$\frac{\|Au_0 - f_\delta\|}{2} + \frac{\|Av - f_\delta\|}{2} = \left\|A\,\frac{u_0 + v}{2} - f_\delta\right\| \tag{7.312}$$

and

$$\varphi\left(\frac{u_0 + v}{2}\right) = \frac{\varphi(u_0) + \varphi(v)}{2}. \tag{7.313}$$

Au f 2 +γφ2(u ) = Au f + µAu 2 +γ(1 + µ)2φ2(u ). k 0 − δ k 0 k 0 − δ 0 k 0 Thus 0 = µ2 Au 2 +2µRe(Au f , Au ) + 2γµφ2(u ) + γµ2φ2(u ). (7.315) k 0 k 0 − δ 0 0 0 Since (7.315) is a quadratic equation it cannot be satisfied for all small µ, since its coefficients are not all zeros. Therefore c = 1, and uniqueness of the solution to (7.300) is established. Let us prove the last statement of Theorem 7.5. Assume that (7.301) 1 holds. Let Au = f u = A− f := uf . One has F (u) F (u ). (7.316) ≥ 0 Therefore

$$\|Au_0 - f_\delta\|^2 + \gamma\varphi^2(u_0) \le \delta^2 + \gamma\varphi^2(u),\qquad u = A^{-1}f. \tag{7.317}$$
Thus, by (7.301),
$$\varphi^2(u_0) \le \varphi^2(u) + m := c \tag{7.318}$$
and, using (7.301) again, one obtains

$$\|Au_0 - f_\delta\|^2 \le \gamma\left[\gamma^{-1}\delta^2 + \phi^2(u)\right] \le c\gamma \to 0 \text{ as } \delta \to 0. \tag{7.319}$$

214 Random Fields Estimation Theory

Similarly, for sufficiently small δ, one has

$$F(u) \ge F(u_\delta), \quad \phi^2(u_\delta) \le c, \tag{7.320}$$

$$\|Au_\delta - f_\delta\|^2 \le c\gamma(\delta) \to 0 \text{ as } \delta \to 0. \tag{7.321}$$
From (7.321) it follows that

$$\|Au_\delta - Au\| \le \|Au_\delta - f_\delta\| + \|f_\delta - Au\| \le \|Au_\delta - f_\delta\| + \delta \le c^{1/2}\gamma^{1/2}(\delta) + \delta \to 0 \text{ as } \delta \to 0. \tag{7.322}$$
Let $M := \{v : v \in U,\ \phi^2(v) \le c\}$. Then $M$ is a compactum, $u_\delta \in M$, $u \in M$ by (7.320) and (7.318). Therefore (7.322) and Lemma 7.6 imply (7.248). Theorem 7.5 is proved. $\Box$
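The scheme of Theorem 7.5 can be illustrated numerically. The sketch below is not from the text: it assumes $U = \mathbb{R}^n$, the stabilizer $\phi(u) = \|u\|$, the a priori choice $\gamma(\delta) = \delta$, and an arbitrary ill-conditioned matrix $A$; the minimizer of $F(u) = \|Au - f_\delta\|^2 + \gamma\|u\|^2$ is then given in closed form by $u_\delta = (A^TA + \gamma I)^{-1}A^Tf_\delta$.

```python
import numpy as np

# Discrete analogue of minimizing F(u) = ||A u - f_delta||^2 + gamma*||u||^2.
# The unique minimizer is u_delta = (A^T A + gamma I)^{-1} A^T f_delta.
np.random.seed(0)
n = 50
x = np.linspace(0, 1, n)
# Hypothetical ill-conditioned A: a discretized smoothing (integral) operator.
A = np.exp(-10 * (x[:, None] - x[None, :]) ** 2) / n
u_true = np.sin(2 * np.pi * x)
f = A @ u_true

errors = []
for delta in (1e-2, 1e-3, 1e-4):
    noise = np.random.randn(n)
    f_delta = f + delta * noise / np.linalg.norm(noise)  # ||f_delta - f|| = delta
    gamma = delta                                        # a priori choice gamma(delta)
    u_delta = np.linalg.solve(A.T @ A + gamma * np.eye(n), A.T @ f_delta)
    errors.append(np.linalg.norm(u_delta - u_true))
# As delta -> 0 (with gamma = gamma(delta) -> 0) the reconstruction error shrinks,
# as Theorem 7.5 asserts.
```

The choice $\gamma(\delta) = \delta$ is only one admissible rate; any $\gamma(\delta) \to 0$ with $\delta^2/\gamma(\delta)$ bounded fits the hypotheses used in the proof.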

Remark 7.4 If one finds for $\gamma = \gamma(\delta)$ not the minimizer $u_\delta = u_{\delta,\gamma(\delta)}$ itself but an approximation to it, say $v_\delta$, such that $F(v_\delta) \le F(u)$, then, as above,

$$\phi^2(v_\delta) \le c, \quad \|Av_\delta - f_\delta\|^2 \le c\gamma(\delta). \tag{7.323}$$

$$\|Av_\delta - Au\| \le c^{1/2}\gamma^{1/2}(\delta) + \delta \to 0. \tag{7.324}$$
As above, from (7.323) and (7.324) it follows that $\|v_\delta - u\| \to 0$ as $\delta \to 0$. Therefore one can use an approximate solution to the minimization problem (7.300) as long as the inequality $F(v_\delta) \le F(u)$ holds.

Remark 7.5 One can assume in Theorem 7.5 that $A$ is not a bounded but a closed linear operator, $D(A) \supset D(\phi)$, and that the set (7.288) is compact in the space $G$, which is $D(A)$ equipped with the graph norm $\|u\|_A := \|u\| + \|Au\|$. One can also assume that $\phi(u)$ is a convex lower weakly semicontinuous functional; the set $\{\phi(u) \le c\}$ is then not necessarily compact. The change in the proof of Theorem 7.5 under this assumption is as follows. From (7.304) it follows that $u_n \rightharpoonup u_0$ in $U$. It follows [Ru, p. 65, Theorem 3.13] that there exists a convex combination $\hat u_n$ of $u_n$ such that $\hat u_n \to u_0$. The sequence $\hat u_n$ is minimizing if $u_n$ is. Therefore equations (7.306)-(7.308) hold with $\hat u_n$ in place of $u_n$. The proof of the uniqueness of the solution to (7.300) is the same as above. One can prove that $u_\delta \rightharpoonup u$ as $\delta \to 0$ and that there is a convex combination $\hat u_\delta$ of $u_\delta$ such that $\hat u_\delta \to u$ as $\delta \to 0$. However, there is no algorithm to compute $\hat u_\delta$ given $u_\delta$.


If $R_{\alpha,\delta} : F \to U$ is the mapping which sends $f_\delta$ into $u_{\alpha,\delta}$, the solution to (7.300), then

$$R_\delta f_\delta := R_{\alpha(\delta),\delta} f_\delta = u_\delta \tag{7.325}$$
satisfies (7.248), that is,

$$\|R_\delta f_\delta - A^{-1}f\| \to 0 \text{ as } \delta \to 0. \tag{7.326}$$

A construction of a family $R_{\alpha,\delta}$ of operators, such that there exists $\alpha(\delta)$ for which (7.326) holds, is used for solving the ill-posed problem (7.246). The family $R_{\alpha,\delta}$ is called a regularizing family for problem (7.246). The error estimate of the approximate solution $u_{\alpha,\delta} := R_\alpha f_\delta$ can be given as follows:

$$\|u_{\alpha,\delta} - u\| \le \|R_\alpha(f_\delta - f)\| + \|R_\alpha Au - u\| \le \omega(\alpha)\delta + \eta(\alpha) := \epsilon(\alpha, \delta).$$
Here we assumed that $R_\alpha$ is a linear operator and that $\|R_\alpha\| \le \omega(\alpha)$. One assumes that $R_\alpha$ and $u$ are such that $\eta(\alpha) \to 0$ and $\omega(\alpha) \to +\infty$ as $\alpha \to 0$. Then there exists $\alpha = \alpha(\delta)$ such that $\alpha(\delta) \to 0$ and $\epsilon(\alpha(\delta), \delta) := \epsilon(\delta) \to 0$ as $\delta \to 0$.

Therefore $R_\alpha$ is a regularizing family for problem (7.246) provided that $\omega(\alpha) \to +\infty$ and $\eta(\alpha) \to 0$ as $\alpha \to 0$. The stable approximation to the solution $u$ of equation (7.246) is $u_\delta := R_{\alpha(\delta)} f_\delta$, where $\alpha(\delta)$ is chosen so that $\epsilon(\alpha, \delta) \ge \epsilon(\alpha(\delta), \delta) := \epsilon(\delta)$ for any $\delta > 0$. One has the error estimate $\|u - u_\delta\| \le \epsilon(\delta)$.

We gave two general methods for constructing such families. The theory can be generalized in several directions:
1) one can consider nonlinear $A$; unbounded $A$, for example, closed densely defined $A$; $A$ given with some error, say $A_\epsilon$ is given such that $\|A - A_\epsilon\| < \epsilon$.
2) one can consider special types of $A$, for example convolution and other special kernels; in this case one can often give a more precise error estimate for the approximate solution.
3) one can study the problem of optimal choice of $\gamma$ and of the stabilizing functional $\phi(u)$ in (7.285).
4) one can study finite-dimensional approximations for solving the ill-posed problem (7.246).
5) one can study methods of solving problem (7.246) which are optimal in a suitable sense.
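The balance between the two terms of the bound $\epsilon(\alpha,\delta) = \omega(\alpha)\delta + \eta(\alpha)$ can be sketched with hypothetical rates (illustrative only, not from the text): $\omega(\alpha) = 1/\alpha$ and $\eta(\alpha) = \alpha$, for which the minimizing choice is $\alpha(\delta) = \sqrt{\delta}$ and $\epsilon(\delta) = 2\sqrt{\delta} \to 0$.

```python
import math

def eps(alpha, delta, omega=lambda a: 1.0 / a, eta=lambda a: a):
    # Total error bound eps(alpha, delta) = omega(alpha)*delta + eta(alpha):
    # noise amplification grows as alpha -> 0, approximation error shrinks.
    return omega(alpha) * delta + eta(alpha)

def best_alpha(delta):
    # For omega = 1/alpha, eta = alpha, minimizing delta/alpha + alpha over
    # alpha > 0 gives alpha(delta) = sqrt(delta), so eps(delta) = 2*sqrt(delta).
    return math.sqrt(delta)

for delta in (1e-2, 1e-4, 1e-6):
    a = best_alpha(delta)
    print(delta, a, eps(a, delta))
```

Any other pair $\omega, \eta$ with the stated limits would be treated the same way; only the resulting rate $\epsilon(\delta)$ changes.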


These questions are studied in many books and papers, and we refer the reader to [Ivanov et al. (1978); Lavrentiev and Romanov (1986); Morozov (1984); Ramm (1968); Ramm (1973b); Ramm (1975); Ramm (1980); Ramm (1981); Ramm (1984); Ramm (1985b); Ramm (1987b); Ramm (1987c); Tanana (1981); Tikhonov (1977)].

In [Ramm (2003a)] and [Ramm (2005)] a new definition of the regularizing family is given. The new definition operates only with the data $\{\delta, f_\delta, \mathcal{K}\}$ and does not use the unknown $f$. The compact $\mathcal{K}$ in this definition is the compact to which the unknown solution belongs. The knowledge of this compact is an a priori information about the solutions of ill-posed problems. One calls $R_{\alpha,\delta}$ a regularizing family if
$$\lim_{\delta \to 0}\ \sup_{\{u :\, u \in \mathcal{K},\ \|Au - f_\delta\| \le \delta\}} \|R_\delta f_\delta - u\| = 0,$$
where $u$ solves the equation $Au = f$, and $R_\delta = R_{\alpha(\delta),\delta}$ for some $0 < \alpha(\delta) \to 0$ as $\delta \to 0$.

7.6.3 Equations with random noise

In this section we look at the problem (7.246) with a linear bounded injective operator from the point of view of estimation theory. Let us consider the equation

Aw = f + n, (7.327)

where $A$ is an injective linear operator on a Hilbert space $H$, and $n$ is noise. Let us assume for simplicity that the noise takes values in $H$. In practice this is not always the case. For example, if $H = L^2(\mathbb{R}^r)$ then a sample function $n(x)$ may belong to $L^2(\mathbb{R}^r)$ locally, but not globally if $n(x)$ does not decay sufficiently fast as $|x| \to \infty$. Therefore the above assumption simplifies the theory. Assume that

$$\overline{n} = 0, \quad \overline{n^*(x)n(y)} = \sigma^2 R(x,y), \tag{7.328}$$

where $\sigma^2 > 0$ is a parameter which characterizes the power of the noise. Let us assume that the solution to (7.327) exists and $f \in \operatorname{Ran} A$. One may try to suggest the following definition.

Definition 7.8 The solution to (7.327) is statistically stable if

$$\mathcal{D}[w - u] \to 0 \text{ as } \sigma \to 0, \tag{7.329}$$
where $u$ solves the equation

$$Au = f. \tag{7.330}$$

Applications 217

We will also use this definition with (7.329) replaced by

$$\overline{\|w - u\|^2} \to 0 \text{ as } \sigma \to 0, \tag{7.331}$$
where $\|\cdot\|$ denotes the norm in $H$. This definition is very restrictive. First, the assumption that equation (7.327) is solvable means that a severe constraint is imposed on the noise. Secondly, the requirement (7.329) is rather restrictive. Let us illustrate this by examples and then consider some less restrictive definition of the stable solution to (7.327). Note that, under the above assumption,

$$w = A^{-1}f + A^{-1}n = u + A^{-1}n. \tag{7.332}$$

Thus

$$\mathcal{D}[w - u] = \mathcal{D}[A^{-1}n]. \tag{7.333}$$
Let us assume that $H = L^2(D)$, $D \subset \mathbb{R}^r$ is a finite region, and $A$ is a selfadjoint compact operator on $H$ with kernel $A(x,y)$,

$$A\phi_j = \lambda_j\phi_j, \quad \lambda_1 \ge \lambda_2 \ge \cdots > 0, \tag{7.334}$$
where

$$\int_D \phi_j(x)\phi_i^*(x)\,dx := (\phi_j, \phi_i) = \delta_{ji}. \tag{7.335}$$
Then

$$A^{-1}n = \sum_{j=1}^\infty \lambda_j^{-1}(n, \phi_j)\phi_j. \tag{7.336}$$
Therefore

$$\mathcal{D}[A^{-1}n] = \sum_{i,j=1}^\infty \lambda_i^{-1}\lambda_j^{-1}\,\overline{\int_D n(t)\phi_j^*(t)\,dt \int_D n^*(z)\phi_i(z)\,dz}\;\phi_j(x)\phi_i^*(x)$$
$$= \sigma^2 \sum_{i,j=1}^\infty \lambda_i^{-1}\lambda_j^{-1} \int_D\int_D R(z,t)\phi_i(z)\phi_j^*(t)\,dz\,dt\;\phi_j(x)\phi_i^*(x)$$
$$= \sigma^2 \int_D\int_D A^{-1}(x,z)R(z,t)A^{-1}(t,x)\,dz\,dt, \tag{7.337}$$


where $A^{-1}(x,y)$ is the kernel of the operator $A^{-1}$ in the sense of distributions and is given by the formula

$$A^{-1}(x,y) = \sum_{j=1}^\infty \lambda_j^{-1}\phi_j^*(x)\phi_j(y). \tag{7.338}$$
For the right-hand side of (7.337) to converge to zero it is necessary and sufficient that the kernel $B(x,y)$ of the operator $A^{-1}RA^{-1}$ be finite for all $x$ and $y = x$: $B(x,x) < \infty$. If one requires in place of (7.329) that

$$\overline{\|w - u\|^2} \to 0 \text{ as } \sigma \to 0, \tag{7.339}$$
where the bar denotes statistical average and

$$\|w\| := (w, w)^{1/2}, \tag{7.340}$$
then the following condition (7.342) will imply (7.339). One has

$$\overline{(A^{-1}n, A^{-1}n)} = \int_D \mathcal{D}[A^{-1}n]\,dx = \sigma^2\,\mathrm{Tr}(A^{-1}RA^{-1}) \to 0 \text{ as } \sigma \to 0, \tag{7.341}$$
provided that $A$ is selfadjoint, positive, and

$$\mathrm{Tr}(A^{-1}RA^{-1}) < \infty. \tag{7.342}$$
Condition (7.342) is a severe restriction on the correlation function $R(x,y)$ of the noise.
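Condition (7.342) is easy to probe in a diagonal toy model (an illustration, not from the text): if the noise covariance is diagonal in the eigenbasis of $A$ with diagonal entries $r_j$, then $\mathrm{Tr}(A^{-1}RA^{-1}) = \sum_j r_j/\lambda_j^2$, so white noise ($r_j \equiv 1$) fails the condition for any compact $A$, while rapidly decaying $r_j$ can satisfy it.

```python
# Diagonal model: A has eigenvalues lambda_j, noise covariance R has
# diagonal entries r_j in the same basis, so
#     Tr(A^{-1} R A^{-1}) = sum_j r_j / lambda_j^2.
def trace_term(lam, r, N):
    return sum(r(j) / lam(j) ** 2 for j in range(1, N + 1))

lam = lambda j: 1.0 / j  # compact A: eigenvalues 1/j -> 0

# White noise, r_j = 1: partial sums behave like sum_j j^2 and diverge.
white = [trace_term(lam, lambda j: 1.0, N) for N in (10, 100, 1000)]

# Smooth noise, r_j = j^{-4}: partial sums of 1/j^2 converge (to pi^2/6).
smooth = [trace_term(lam, lambda j: j ** -4.0, N) for N in (10, 100, 1000)]
```

The same dichotomy is what Example 7.14 expresses in the Fourier domain.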

Example 7.14 Consider the case $H = L^2(\mathbb{R}^1)$,

$$Au := \int_{-\infty}^\infty A(x-y)u(y)\,dy. \tag{7.343}$$
Then

$$A^{-1}f := \frac{1}{2\pi}\int_{-\infty}^\infty \exp(i\lambda x)\,\tilde A^{-1}(\lambda)\tilde f(\lambda)\,d\lambda, \tag{7.344}$$
where

$$f(x) = \frac{1}{2\pi}\int_{-\infty}^\infty \tilde f(\lambda)\exp(i\lambda x)\,d\lambda. \tag{7.345}$$


One obtains (a derivation is given below) the formula
$$\overline{A^{-1}n\,(A^{-1}n)^*} = \sigma^2\int_{-\infty}^\infty \frac{\tilde R(\lambda)}{|\tilde A(\lambda)|^2}\,d\lambda. \tag{7.346}$$
Here we have assumed that

$$n(x) = \int_{-\infty}^\infty \exp(i\lambda x)\,d\zeta(\lambda), \tag{7.347}$$
where $\zeta(\lambda)$ is a random process with orthogonal increments such that
$$\overline{\zeta(\lambda)} = 0, \quad \overline{d\zeta^*(\lambda)\,d\zeta(\lambda)} = \sigma^2\tilde R(\lambda)\,d\lambda, \tag{7.348}$$

$$\overline{d\zeta^*(\lambda)\,d\zeta(\mu)} = 0 \text{ for } \lambda \ne \mu. \tag{7.349}$$
If $B$ is a linear integral operator with convolution kernel:

$$Bn = \int_{-\infty}^\infty B(x-y)n(y)\,dy,$$
and the spectral representation of $n(x)$ is (7.347), then the spectral representation of $Bn$ is

$$Bn = \int_{-\infty}^\infty \left[\int_{-\infty}^\infty B(x-y)\exp(i\lambda y)\,dy\right] d\zeta(\lambda) = \int_{-\infty}^\infty \exp(i\lambda x)\tilde B(\lambda)\,d\zeta(\lambda). \tag{7.350}$$
If $B_1$ and $B_2$ are two linear integral operators with convolution kernels, then

$$\overline{B_1 n\,(B_2 n)^*} = \iint_{-\infty}^\infty \exp(i\lambda x - i\mu x)\tilde B_1(\lambda)\tilde B_2^*(\mu)\,\overline{d\zeta(\lambda)\,d\zeta^*(\mu)} = \sigma^2\int_{-\infty}^\infty \tilde B_1(\lambda)\tilde B_2^*(\lambda)\tilde R(\lambda)\,d\lambda. \tag{7.351}$$
Equation (7.346) is a particular case of (7.351). Note that stationary random functions (7.347) have mean value zero and

$$\mathcal{D}[n(x)] := \overline{|n(x)|^2} = \int_{-\infty}^\infty\int_{-\infty}^\infty e^{ix(\lambda-\mu)}\,\overline{d\zeta^*(\lambda)\,d\zeta(\mu)} = \sigma^2\int_{-\infty}^\infty \tilde R(\lambda)\,d\lambda. \tag{7.352}$$


It follows from formula (7.346) that the variance of the random process $A^{-1}n$ at any point $x$ is finite if and only if the spectral density $\tilde R(\lambda)$ of the noise is such that $\tilde R(\lambda)|\tilde A(\lambda)|^{-2} \in L^1(-\infty, \infty)$. If $\tilde A(\lambda)$ tends to zero as $|\lambda| \to \infty$, the above condition imposes a severe restriction on the noise. For example, if the noise is white, that is, $\tilde R(\lambda) = 1$, then the condition $|\tilde A(\lambda)|^{-2} \in L^1(-\infty, \infty)$ is not satisfied when $\tilde A(\lambda) \to 0$ as $|\lambda| \to \infty$.

Example 7.15 Consider the Hilbert space $H$ of $2\pi$-periodic functions

$$f = \sum_{m\ne 0} f_m \exp(imx) \tag{7.353}$$
with the inner product

$$(f, g) := \sum_{m\ne 0} f_m g_m^*. \tag{7.354}$$
Assume that $f(x) + n(x)$ is given, where $n \in H$,
$$\overline{n(x)} = 0, \quad \overline{n^*(x)n(y)} = \sigma^2 R(x-y), \tag{7.355}$$

R(x + 2π) = R(x). (7.356)

Note that R(x) can be written as

$$R(x) = \sum_{m\ne 0} r_m \exp(-imx). \tag{7.357}$$

Suppose one wants to estimate $f'(x)$ given $f(x) + n(x)$. If one uses the function $\hat u := f'(x) + n'(x)$ as an estimate of $f'$, then the variance of the error of this estimate can be calculated as follows. Let

$$n(x) = \sum_{m\ne 0} n_m \exp(imx), \tag{7.358}$$

where $n_m$, the Fourier coefficients of $n(x)$, are random variables such that

$$\overline{n_m} = 0, \quad \overline{n_m^* n_j} = \sigma^2 r_m \delta_{mj}. \tag{7.359}$$

The numbers rm can be determined easily. Indeed

$$\sigma^2 R(x-y) = \overline{n^*(x)n(y)} = \sum_{m,j}\overline{n_m^* n_j}\exp(ijy - imx) = \sigma^2\sum_m r_m \exp\{-im(x-y)\}. \tag{7.360}$$


From (7.356) and (7.360) one can see that the numbers $r_m$ in (7.359) can be calculated by the formula
$$r_m = \frac{1}{2\pi}\int_{-\pi}^{\pi} R(x)\exp(imx)\,dx. \tag{7.361}$$

Since $R(x)$ is a covariance function, the numbers $r_m$ are nonnegative, as they should be. From (7.358) and (7.359) it follows that

$$\mathcal{D}[n'] = \sigma^2\sum_m m^2 r_m. \tag{7.362}$$
Therefore the numbers $r_m$ have to satisfy the condition that the right side of (7.362) is a convergent series, in order that the estimate $\hat u$ be statistically stable in the sense of Definition 7.8.
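The convergence condition on $\sum_m m^2 r_m$ in (7.362) can be checked directly for sample spectra (the two decay rates below are illustrative choices, not from the text):

```python
# Variance of the differentiated noise: D[n'] = sigma^2 * sum_m m^2 r_m,
# summing over m and -m symmetrically (formula (7.362)).
sigma2 = 1.0

def d_var(r, M):
    return sigma2 * sum(2 * m * m * r(m) for m in range(1, M + 1))

# r_m = m^{-4}: sum m^2 * m^{-4} = sum 1/m^2 converges, so differentiation
# of such noisy data is statistically stable.
stable = d_var(lambda m: m ** -4.0, 10 ** 5)

# r_m = m^{-2}: sum m^2 * m^{-2} diverges; partial sums grow like 2*M.
unstable = [d_var(lambda m: m ** -2.0, M) for M in (100, 1000, 10000)]
```

So whether $\hat u = f' + n'$ is usable depends entirely on the decay of the noise spectrum $r_m$, exactly as the text states.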

Example 7.16 Let $\mathcal{L}$ be a selfadjoint positive operator on a Hilbert space $H$. Assume that the spectrum of $\mathcal{L}$ is discrete:
$$0 \le \lambda_1 \le \lambda_2 \le \cdots, \quad \lambda_m \to \infty \text{ as } m \to \infty. \tag{7.363}$$
Consider the problem

$$u_t = \mathcal{L}u, \quad t > 0, \tag{7.364}$$

u(0) = f. (7.365)

Let $\phi_m$ be the eigenvectors of $\mathcal{L}$:
$$\mathcal{L}\phi_m = \lambda_m\phi_m, \tag{7.366}$$
and assume that the system $\{\phi_m\}$, $1 \le m < \infty$, forms an orthonormal basis of $H$:

$$(\phi_m, \phi_j) = \delta_{mj}. \tag{7.367}$$

The formal solution to problem (7.364)-(7.365) is

$$u = \sum_{m=1}^\infty \exp(\lambda_m t)f_m\phi_m, \tag{7.368}$$

$$f_m := (f, \phi_m). \tag{7.369}$$


Formula (7.368) gives a formal solution to (7.364)-(7.365) in the sense that formal differentiation in t yields

$$u_t = \sum_{m=1}^\infty \lambda_m\exp(\lambda_m t)f_m\phi_m, \tag{7.370}$$
and formal application of the operator $\mathcal{L}$ and formula (7.368) yield
$$\mathcal{L}u = \sum_{m=1}^\infty \lambda_m\exp(\lambda_m t)f_m\phi_m, \tag{7.371}$$
so that (7.370) and (7.371) yield (7.364). Putting $t = 0$ in (7.368), one gets (7.365). Formula (7.368) gives the strong solution to (7.364)-(7.365) in $H$ if and only if the series (7.370) converges in $H$, that is,

$$\sum_{m=1}^\infty \lambda_m^2\exp(2\lambda_m t)|f_m|^2 < \infty, \quad t > 0. \tag{7.372}$$
This implies that the problem (7.364)-(7.365) is very ill-posed. This problem is an abstract heat equation with reversed time. If one takes $-\mathcal{L}$ in place of $\mathcal{L}$ in (7.364), then the problem is analogous to the usual heat equation and is well posed.

Suppose that $H = L^2(D)$, and that the noisy data are given in (7.365): the function $f(x) + n(x)$ in place of $f(x)$, where $n(x)$ is noise. Let

$$n(x) = \sum_{m=1}^\infty n_m\phi_m(x), \tag{7.373}$$

$$n_m := (n(x), \phi_m(x)), \tag{7.374}$$
where the parentheses denote the inner product in $L^2(D)$. It is clear that if $\overline{n(x)} = 0$, then

$$\overline{n_m} = 0 \quad \forall m. \tag{7.375}$$

$$\overline{n^*(x)n(y)} = \sigma^2 R(x,y). \tag{7.376}$$
Then

$$\sum_{m,j=1}^\infty \overline{n_m^* n_j}\,\phi_m^*(x)\phi_j(y) = \sigma^2 R(x,y). \tag{7.377}$$


The kernel R(x, y) is selfadjoint and nonnegative definite, being a covariance function. Let us assume that the matrix

$$r_{mj} := \overline{n_m^* n_j} \tag{7.378}$$

is such that the series (7.377) converges in $L^2(D) \times L^2(D)$. If one uses the formula

$$\hat u := \sum_{m=1}^\infty \exp(\lambda_m t)(f_m + n_m)\phi_m \tag{7.379}$$
for the solution of the problem (7.364)-(7.365) with the noisy initial data, then for the variance of the error of this estimate one obtains

$$\mathcal{D}[u - \hat u] = \sum_{m,j=1}^\infty \exp[(\lambda_m + \lambda_j)t]\,r_{mj}\,\phi_m^*(x)\phi_j(x). \tag{7.380}$$
It is clear from (7.380) that the solution $\hat u$ is not statistically stable, since the series (7.380) may diverge although the series (7.377) converges.

Exercise. Let (7.363)-(7.368) hold, $\|u(0)\| < \epsilon$ and $\|u(T)\| \le c$. Prove that $\|u(t)\| \le \epsilon^{1 - t/T} c^{t/T}$ for $0 \le t \le T$.
Hint: Consider $\varphi(t) := \|u(t)\|^2 = \sum_{m=1}^\infty \exp(2\lambda_m t)|f_m|^2$. Check that $\varphi'' > 0$ and $(\ln\varphi)'' \ge 0$. Thus $\ln\varphi$ is convex. Therefore
$$\ln\varphi[(1-\alpha)\cdot 0 + \alpha T] \le (1-\alpha)\ln\varphi(0) + \alpha\ln\varphi(T), \quad 0 \le \alpha \le 1.$$
Let $\alpha = t/T$. Then the desired inequality follows.
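The interpolation inequality of the exercise can be checked numerically for a truncated spectral sum (the spectrum $\lambda_m = m$ and the coefficients $f_m = e^{-2m}$ below are arbitrary illustrative choices, not from the text):

```python
import math

# ||u(t)||^2 = sum_m exp(2*lambda_m*t)*|f_m|^2 (from (7.368)) is log-convex
# in t, which yields ||u(t)|| <= ||u(0)||^{1-t/T} * ||u(T)||^{t/T}.
lam = list(range(1, 21))                         # hypothetical spectrum lambda_m = m
f = [math.exp(-2.0 * m) for m in range(1, 21)]   # hypothetical coefficients f_m

def norm_u(t):
    return math.sqrt(sum(math.exp(2.0 * l * t) * c * c for l, c in zip(lam, f)))

T = 1.0
eps0, cT = norm_u(0.0), norm_u(T)
bound_holds = all(
    norm_u(k * T / 10) <= eps0 ** (1 - k / 10) * cT ** (k / 10) * (1 + 1e-12)
    for k in range(1, 10)
)
```

Log-convexity of a finite sum of exponentials follows from Hoelder's inequality, so the bound holds exactly; the tolerance only absorbs floating-point rounding.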

The above examples lead to the following question: how does one find a statistically stable estimate of the solution to equation (7.327)? Let us outline a possible answer to this question. We consider linear estimates, but the approach allows one to generalize the theory and to consider nonlinear estimates as well. The approach is similar to the one outlined in Section 2.1. Let us look for a linear statistically stable estimate $\hat u$ of the solution to equation (7.327) of the form

$$\hat u = L(f + n). \tag{7.381}$$


Assume that the injective linear operator $A$ in (7.327) is an integral operator on $H = L^2(D)$, $D \subset \mathbb{R}^r$, with the kernel $A(x,y)$, and
$$Lf = \int_D L(x,y)f(y)\,dy. \tag{7.382}$$
Since $Au = f$, one has:

$$\overline{|\hat u - u|^2} = \overline{|(LA - I)u + Ln|^2} = |(LA - I)u|^2 + \overline{|Ln|^2}, \tag{7.383}$$
where $I$ is the identity operator and the term linear with respect to $n$ vanishes because $\overline{n} = 0$. Let us calculate the last term in (7.383):

$$\overline{|Ln|^2} = \overline{\int_D L^*(x,y)n^*(y)\,dy \int_D L(x,z)n(z)\,dz} = \sigma^2\int_D\int_D L^*(x,y)R(y,z)L'(z,x)\,dy\,dz. \tag{7.384}$$
Here we used the second formula in (7.328) and the standard notation

$$L'(z,x) := L(x,z). \tag{7.385}$$

Recall that the star denotes complex conjugation (and not the adjoint operator). Integrating both sides of (7.383) in $x$ over $D$ yields

$$\epsilon := \overline{\|\hat u - u\|^2} = \|(LA - I)u\|^2 + \sigma^2\,\mathrm{Tr}\,Q, \tag{7.386}$$
where $Q$ is an integral operator with the kernel

$$Q(x,\xi) := \int_D\int_D L^*(x,y)R(y,z)L'(z,\xi)\,dy\,dz, \tag{7.387}$$
and $\mathrm{Tr}\,Q$ stands for the trace of the operator $Q$. This operator is clearly nonnegative definite in $H = L^2(D)$:

$$(Q\phi, \phi) = (L^*RL'\phi, \phi) = (RL'\phi, L'\phi) \ge 0. \tag{7.388}$$

Here we used the fact that $R$ is nonnegative definite; that $(L^*)^\dagger = L'$, where $A^\dagger$ denotes the adjoint operator in $L^2(D)$; and we have assumed that the function $Q(x,\xi)$ is continuous in $x, \xi \in D$. The last assumption and the fact that the kernel $Q(x,\xi)$ is nonnegative definite imply that

$$\mathrm{Tr}\,Q = \int_D Q(x,x)\,dx. \tag{7.389}$$


One wants to choose L such that

$$\epsilon = \min, \tag{7.390}$$

where $\epsilon$ is given in (7.386). If one puts $L = A^{-1}$, then the first term on the right side of (7.386) vanishes and the second is finite if

$$\mathrm{Tr}\left[(A^{-1})^*R(A^{-1})'\right] < \infty. \tag{7.391}$$
We assume that (7.391) holds. We claim that if (7.391) holds and $\sigma \to 0$, then one can choose $L$ so that

$$\epsilon_{\min} := \epsilon(\sigma) \to 0 \text{ as } \sigma \to 0. \tag{7.392}$$
Any such choice of $L$ yields a statistically stable estimate of the solution $u$. Let us prove the claim that a choice of $L$ which implies (7.392) is possible. For simplicity we assume that $A$ is a positive selfadjoint operator on $H$. Put

$$L = (A + \delta I)^{-1}, \tag{7.393}$$

where δ > 0 is a small number. Then the spectral theory yields:

$$\|(LA - I)u\|^2 = \int_0^{\|A\|}\left(\frac{\lambda}{\lambda+\delta} - 1\right)^2 d(E_\lambda u, u) = \int_0^{\|A\|}\frac{\delta^2}{(\lambda+\delta)^2}\,d(E_\lambda u, u), \tag{7.394}$$
where $E_\lambda$ is the resolution of the identity of the operator $A$. Since

$$\int_0^{\|A\|}\frac{\delta^2}{(\lambda+\delta)^2}\,d(E_\lambda u, u) \le \int_0^{\|A\|} d(E_\lambda u, u) = \|u\|^2 < \infty, \tag{7.395}$$
and, as $\delta \to 0$, the integrand in (7.394) tends to zero, one can use the Lebesgue dominated convergence theorem and conclude that

$$\|(LA - I)u\|^2 := \eta(\delta, u) \to 0 \text{ as } \delta \to 0, \tag{7.396}$$
where $L$ is given by (7.393). The claim is proved.
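The convergence (7.396) for the regularizer $L = (A + \delta I)^{-1}$ is easy to observe numerically (the positive selfadjoint matrix $A$ below is a random illustrative choice, not from the text):

```python
import numpy as np

# For positive selfadjoint A and L = (A + delta*I)^{-1}, each spectral
# component of (LA - I)u is multiplied by -delta/(lambda + delta), so
# eta(delta, u) = ||(LA - I)u||^2 -> 0 as delta -> 0 (formula (7.394)).
np.random.seed(1)
n = 30
M = np.random.randn(n, n)
A = M @ M.T + 0.01 * np.eye(n)   # hypothetical positive selfadjoint A
u = np.random.randn(n)

def eta(delta):
    L = np.linalg.inv(A + delta * np.eye(n))
    return np.linalg.norm(L @ A @ u - u) ** 2

etas = [eta(d) for d in (1.0, 1e-2, 1e-4)]
```

Since each factor $\delta^2/(\lambda+\delta)^2$ decreases monotonically as $\delta \downarrow 0$, the computed values decrease monotonically as well.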

Lemma 7.14 If L is defined by (7.393) and (7.391) holds then

$$\limsup_{\delta\to 0}\,\mathrm{Tr}\{L^*RL'\} < \infty. \tag{7.397}$$


Proof. One has

$$\left[(A+\delta I)^{-1}\right]^* R \left[(A+\delta I)^{-1}\right]' = \left[(A+\delta I)^{-1}\right]^* A^* (A^{-1})^* R (A^{-1})' A' \left[(A+\delta I)^{-1}\right]'. \tag{7.398}$$

The operator $(A^{-1})^*R(A^{-1})'$ is in the trace class by (7.391). Moreover, if $A > 0$ and $\delta > 0$, then

$$\left\|\left[(A+\delta I)^{-1}\right]^* A^*\right\| \le 1, \quad \left\|A'\left[(A+\delta I)^{-1}\right]'\right\| \le 1. \tag{7.399}$$

Both inequalities (7.399) can be proved similarly, or reduced one to the other, because $A^* = A'$. Note that $A > 0$ implies

A∗ > 0. (7.400)

Indeed, for any $\phi \in H$, one has

$$(A^*\phi, \phi)^* = (A\phi^*, \phi^*) > 0, \tag{7.401}$$

since $A > 0$ by assumption and $\phi^* \in H$. Therefore

$$(A^*\phi, \phi)^* = (A^*\phi, \phi) > 0 \quad \forall\phi \in H. \tag{7.402}$$
The desired estimate (7.399) follows from the relation:

$$\left\|\left[(A+\delta I)^{-1}\right]^* A^*\right\| = \max_{0 < \lambda \le \|A^*\|}\frac{\lambda}{\lambda+\delta} \le 1. \tag{7.403}$$
Lemma 7.14 is proved. $\Box$

It is now easy to prove the following theorem.

Theorem 7.6 Let $A > 0$ be a bounded operator on $H = L^2(D)$. Assume that condition (7.328) holds and $\mathrm{Tr}\,R < \infty$. Then the estimate
$$\hat u = L(f + n), \tag{7.404}$$

with $L$ given by (7.393), is a statistically stable (in the sense of (7.331)) estimate of the solution to the equation $Au = f$, provided that the parameter $\delta = \delta(\sigma)$ in formula (7.393) is chosen so that

$$\epsilon := \sigma^2\,\mathrm{Tr}(L^*RL') + \eta(\delta, u) = \min. \tag{7.405}$$


Proof. Note that we do not assume in this theorem that condition (7.391) holds. Therefore $\mathrm{Tr}(L^*RL') := \psi(\delta) > 0$ will, in general, satisfy the condition

$$\psi(\delta) \to +\infty \text{ as } \delta \to 0. \tag{7.406}$$

$$\epsilon = \sigma^2\psi(\delta) + \eta(\delta, u). \tag{7.407}$$

From (7.406), (7.407) and (7.396) it follows that the function $\epsilon$, considered as a function of $\delta$ for fixed $\sigma > 0$, attains its minimum at $\delta = \delta(\sigma)$, and

$$\delta(\sigma) \to 0 \text{ as } \sigma \to 0. \tag{7.408}$$
Therefore

$$\epsilon(\sigma) = \epsilon_{\min} = \epsilon(\delta(\sigma)) \to 0 \text{ as } \sigma \to 0. \tag{7.409}$$
Theorem 7.6 is proved. $\Box$

If some estimates for $\psi(\delta)$ and $\eta(\delta, u)$ are found, then an estimate of $\epsilon(\sigma)$ can be obtained. This requires some a priori assumptions about the solution.

Example 7.17 A simple estimate for $\psi(\delta)$ is the following one.

$$\mathrm{Tr}(L^*RL') \le \mathrm{Tr}\,R\,\|L^*\|^2 \le \frac{\mathrm{Tr}\,R}{\delta^2}. \tag{7.410}$$

1 L∗ = L0 δ− , (7.411) k k k k≤ where L is given by (7.393), and the estimate

2 T r(L∗RL0) T rR L . (7.412) ≤ k k We will prove inequality (7.412) later. Let us estimate η(δ, u). To do this, assume that

a A− f c, a = 1 + b, b > 0, (7.413) k k≤ where c > 0 is a constant, and

A a k k a A− f := λ− dEλf, (7.414) Z0 February 12, 2006 10:52 WSPC/Book Trim Size for 9in x 6in book


$$\|A^{-a}f\|^2 = \int_0^{\|A\|}\lambda^{-2a}\,d(E_\lambda f, f) \le c^2. \tag{7.415}$$
Since $Au = f$, $u = A^{-1}f$, it follows from (7.396) and (7.415) that

$$\eta(\delta, u) := \eta(\delta) = \int_0^{\|A\|}\frac{\delta^2}{(\lambda+\delta)^2}\,\lambda^{-2}\,d(E_\lambda f, f)$$
$$= \int_0^{\|A\|}\frac{\delta^2\lambda^{2b}}{(\lambda+\delta)^2}\,\lambda^{-2-2b}\,d(E_\lambda f, f) \le \delta^{2b}\int_0^{\|A\|}\lambda^{-2a}\,d(E_\lambda f, f) \le c^2\delta^{2b}. \tag{7.416}$$
Therefore, under the a priori assumption (7.413) about $f$, one has
$$\epsilon \le \frac{\sigma^2\,\mathrm{Tr}\,R}{\delta^2} + c^2\delta^{2b}. \tag{7.417}$$
The right-hand side in (7.417) attains its minimum in $\delta$ ($\sigma > 0$ being fixed) at

$$\delta_{\min} = \delta(\sigma) = \sigma^{1/a}\left(\frac{\mathrm{Tr}\,R}{c^2 b}\right)^{\frac{1}{2a}}, \tag{7.418}$$

and
$$\epsilon_{\min} = \epsilon(\sigma) \le \mathrm{const}\,\sigma^{2b/a}, \tag{7.419}$$
where const can be written explicitly. Let us finally prove inequality (7.412). This inequality follows from

Lemma 7.15 If $B$ is a linear bounded operator on $H$ and $R \ge 0$ is a trace class operator, then $BR$ and $RB$ are trace class operators and

$$|\mathrm{Tr}(BR)| \le \|B\|\,\mathrm{Tr}\,R, \quad |\mathrm{Tr}(RB)| \le \|B\|\,\mathrm{Tr}\,R. \tag{7.420}$$
Proof. Let us recall that a linear operator $T : H \to H$ is in the trace class if and only if

$$\|T\|_1 := \sum_{j=1}^\infty s_j(T) < \infty, \tag{7.421}$$
where $s_j(T)$ are the $s$-numbers of $T$. These numbers are defined by the equality

$$s_j(T) = \left\{\lambda_j(T^\dagger T)\right\}^{1/2}, \tag{7.422}$$


where $\lambda_1 \ge \lambda_2 \ge \cdots \ge 0$ are the eigenvalues of the nonnegative definite selfadjoint operator $(T^\dagger T)^{1/2}$, and $T^\dagger$ is the adjoint of $T$ in $H$. The minimax principle for the $s$-values is

$$s_{j+1}(T) = \min_{\mathcal{L}_j}\ \max_{\substack{\phi\perp\mathcal{L}_j \\ \phi\ne 0}}\frac{\|T\phi\|}{\|\phi\|}, \tag{7.423}$$
where $\mathcal{L}_j$ runs through all $j$-dimensional subspaces of $H$, and $\phi \perp \mathcal{L}_j$ means that $\phi$ is orthogonal to all elements of $\mathcal{L}_j$. If $B$ is a linear bounded operator on $H$, then

$$s_{j+1}(BT) = \min_{\mathcal{L}_j}\ \max_{\substack{\phi\perp\mathcal{L}_j \\ \phi\ne 0}}\frac{\|BT\phi\|}{\|\phi\|} \le \|B\|\,\min_{\mathcal{L}_j}\ \max_{\substack{\phi\perp\mathcal{L}_j \\ \phi\ne 0}}\frac{\|T\phi\|}{\|\phi\|} = \|B\|\,s_{j+1}(T). \tag{7.424}$$
Therefore, if (7.421) holds, then

$$\|BT\|_1 = \sum_{j=1}^\infty s_j(BT) \le \|B\|\sum_{j=1}^\infty s_j(T) = \|B\|\,\|T\|_1. \tag{7.425}$$
The first part of Lemma 7.15 is proved, since for $R \ge 0$ one has $\mathrm{Tr}\,R = \|R\|_1$. The second part can be reduced to the first. Indeed, $T$ and $T^*$ are simultaneously in the trace class since

$$s_j(T) = s_j(T^*) \quad \forall j. \tag{7.426}$$

$$(TB)^* = B^*T^*. \tag{7.427}$$

Since $\|B\| = \|B^*\|$ and $(\mathrm{Tr}\,T)^* = \mathrm{Tr}\,T^*$, one concludes from (7.427) and (7.425) that

$$|\mathrm{tr}(TB)| = |\mathrm{tr}(B^*T^*)| \le \|B^*\|\,\|T^*\|_1 = \|B\|\,\|T\|_1. \tag{7.428}$$
Take $T = R \ge 0$; then $\|T\|_1 = \mathrm{Tr}\,R$, and the second inequality (7.420) is obtained. Lemma 7.15 is proved. $\Box$
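Inequality (7.420) of Lemma 7.15 can be spot-checked in finite dimensions, where the trace class is all matrices and $\|B\|$ is the spectral (largest singular value) norm; the matrices below are random illustrative choices:

```python
import numpy as np

# Spot-check of |Tr(B R)| <= ||B|| * Tr R for R >= 0 (Lemma 7.15).
np.random.seed(2)
n = 40
B = np.random.randn(n, n)       # arbitrary bounded operator
G = np.random.randn(n, n)
R = G @ G.T                     # nonnegative definite (trace class) R
opnorm_B = np.linalg.norm(B, 2) # spectral norm ||B|| = s_1(B)

lhs1 = abs(np.trace(B @ R))
lhs2 = abs(np.trace(R @ B))
rhs = opnorm_B * np.trace(R)
```

Since $\mathrm{Tr}(BR) = \mathrm{Tr}(RB)$ for matrices, both inequalities in (7.420) reduce to the same numerical check here.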

Additional information about $s$-values can be found in Section 6.3.


7.7 A remark on nonlinear (polynomial) estimates

Let

$$\mathcal{U}(x) = s(x) + n(x), \quad x \in D \subset \mathbb{R}^r. \tag{7.429}$$
Consider the polynomial estimate (filter):

$$A\mathcal{U} := \sum_{j=1}^m H_j\mathcal{U}^{[j]}, \tag{7.430}$$
where

$$H_j\mathcal{U}^{[j]} = \int_D\cdots\int_D h_j(x, \xi_1, \ldots, \xi_j)\,\mathcal{U}(\xi_1)\cdots\mathcal{U}(\xi_j)\,d\xi_1\cdots d\xi_j, \tag{7.431}$$

$$\mathcal{U}^{[j]} = \mathcal{U}(\xi_1)\cdots\mathcal{U}(\xi_j). \tag{7.432}$$
The problem is to find $A$ such that

$$\epsilon := \mathcal{D}[A\mathcal{U} - s] = \min. \tag{7.433}$$
Here $\mathcal{D}$ is the symbol of variance, the assumptions about $s(x)$ and $n(x)$ are the same as in Chapter 1, the optimal estimate is defined by $m$ functions $(h_1, \ldots, h_m)$, and one could consider by the same method the problem of estimating a known operator on $s(x)$, for example $\partial^j s(x)$. Let us substitute (7.430) into (7.433):

$$\epsilon := \overline{\sum_{i,j=1}^m H_j\mathcal{U}^{[j]}\,H_i^*\mathcal{U}^{*[i]}} - 2\,\mathrm{Re}\,\overline{\sum_{i=0}^m H_i^*\mathcal{U}^{*[i]} s(x)} + \overline{|s(x)|^2}$$
$$= \sum_{i,j=1}^m a_{ij}H_jH_i^* - 2\,\mathrm{Re}\sum_{i=0}^m H_i b_i + \overline{|s(x)|^2} = \min. \tag{7.434}$$

Here

$$b_i := \overline{\mathcal{U}^{*[i]} s(x)}, \quad b_i = b_i(x, \xi_1', \ldots, \xi_i'), \tag{7.435}$$

$$a_{ij} := \overline{\mathcal{U}^{[j]}\mathcal{U}^{*[i]}} = \overline{\mathcal{U}(\xi_1)\cdots\mathcal{U}(\xi_j)\,\mathcal{U}^*(\xi_1')\cdots\mathcal{U}^*(\xi_i')}, \tag{7.436}$$


$$\sum_{j=1}^m a_{ij}H_j := \sum_{j=1}^m \int_D\cdots\int_D a_{ij}(\xi_1', \ldots, \xi_i', \xi_1, \ldots, \xi_j)\,h_j(x, \xi_1, \ldots, \xi_j)\,d\xi_1\cdots d\xi_j. \tag{7.437}$$

Note that (7.436) implies that

$$a_{ij}^* = a_{ji}, \quad a_{ij} = a_{ij}(\xi_1', \ldots, \xi_i', \xi_1, \ldots, \xi_j). \tag{7.438}$$

Let

$$h_j(\xi_1, \ldots, \xi_j) + \epsilon_j\eta_j(\xi_1, \ldots, \xi_j) \tag{7.439}$$

be substituted for $h_j$ in (7.437); we suppress exhibiting the dependence on $x$, since $x$ will be fixed. Here the $\epsilon_j$ are numbers. The condition $\epsilon = \min$ at $\epsilon_j = 0$ implies that

$$\sum_{j=1}^m a_{ij}H_j = b_i, \quad 1 \le i \le m. \tag{7.440}$$

This is a system of integral equations for the functions hj (x, ξ1, . . ., ξj ):

$$\sum_{j=1}^m \underbrace{\int_D\cdots\int_D}_{j\ \text{times}} a_{ij}(\xi_1', \ldots, \xi_i', \xi_1, \ldots, \xi_j)\,h_j(x, \xi_1, \ldots, \xi_j)\,d\xi_1\cdots d\xi_j = b_i(x, \xi_1', \ldots, \xi_i'). \tag{7.441}$$
Consider as an example the case of polynomial estimates of degree 2. Then

$$\int_D a_{11}(\xi_1', \xi_1)h_1(\xi_1)\,d\xi_1 + \int_D\int_D a_{12}(\xi_1', \xi_1, \xi_2)h_2(\xi_1, \xi_2)\,d\xi_1 d\xi_2 = b_1(\xi_1'), \tag{7.442}$$

$$\int_D a_{21}(\xi_1', \xi_2', \xi_1)h_1(\xi_1)\,d\xi_1 + \int_D\int_D a_{22}(\xi_1', \xi_2', \xi_1, \xi_2)h_2(\xi_1, \xi_2)\,d\xi_1 d\xi_2 = b_2(\xi_1', \xi_2'). \tag{7.443}$$

If $a_{22}(\xi_1', \xi_2', \xi_1, \xi_2)$ belongs to $\mathcal{R}$, or to some class of operators which can be inverted, one can find $h_2(\xi_1, \xi_2)$ from equation (7.443) in terms of $h_1(\xi_1)$, and then (7.442) becomes an equation for the single function $h_1(\xi_1)$.


In the framework of correlation theory it is customary to consider only linear estimates, because the data (covariance functions) consist of the moments of second order.

Chapter 8 Auxiliary Results

8.1 Sobolev spaces and distributions

8.1.1 A general imbedding theorem

Let $D \subset \mathbb{R}^r$ be a bounded domain with a smooth boundary $\Gamma$. The spaces $L^p(D)$, $p \ge 1$, consist of measurable functions on $D$ such that $\|u\|_{L^p(D)} := \left(\int_D |u|^p\,dx\right)^{1/p} < \infty$. For $p = +\infty$ one has $\|u\|_{L^\infty(D)} := \operatorname{ess\,sup}_{x\in D}|u(x)|$. If in place of the Lebesgue measure a measure $\mu$ is used, then we use $L^p(D,\mu)$ as the symbol for the corresponding space. These are Banach spaces. If

$C_0^\infty(D)$ is the set of infinitely differentiable functions with compact support in $D$, then $C_0^\infty(D)$ is dense in $L^p(D)$, $1 \le p < \infty$. If one defines a mollifier, i.e. a function $0 \le \rho(x) \in C_0^\infty(\mathbb{R}^r)$, $\rho(x) = 0$ for $|x| \ge 1$, $\int\rho(x)\,dx = 1$, $\int := \int_{\mathbb{R}^r}$, e.g.
$$\rho(x) := c\exp\{(|x|^2 - 1)^{-1}\} \text{ for } |x| < 1, \quad \rho(x) = 0 \text{ for } |x| \ge 1, \tag{8.1}$$
where $c$ is the normalizing constant chosen so that $\int\rho\,dx = 1$, then the function
$$u_\epsilon(x) := \epsilon^{-r}\int \rho\left(\frac{x-y}{\epsilon}\right)u(y)\,dy, \quad \epsilon > 0, \tag{8.2}$$
belongs to $C^\infty$ and $\|u_\epsilon - u\|_{L^p_{loc}} \to 0$ as $\epsilon \to 0$. By $L^p_{loc}$ one means the set of functions which belong to $L^p$ on any compact subset of $D$ or $\mathbb{R}^r$. Convergence in $L^p_{loc}(D)$ means convergence in $L^p(\tilde D)$, where $\tilde D$ is an arbitrary compact subset of $D$. By $W^{\ell,p}(D)$, the Sobolev space, one means the Banach space of functions $u(x)$ defined on $D$ with the finite norm

$$\|u\|_{W^{\ell,p}(D)} := \sum_{|j|=0}^{\ell}\|D^j u\|_{L^p(D)}. \tag{8.3}$$



Here $j$ is a multiindex, $j = (j_1, \ldots, j_r)$, $D^j = D_{x_1}^{j_1}\cdots D_{x_r}^{j_r}$, $|j| = j_1 + \cdots + j_r$. The space $C^\infty(D)$ of functions infinitely differentiable in $D$ is dense in $W^{\ell,p}(D)$. By $\dot W^{\ell,p}(D)$ we denote the closure of $C_0^\infty(D)$ in the norm (8.3). By $H^\ell(D)$ we denote $W^{\ell,2}(D)$. This is a Hilbert space with the inner product

$$(u,v)_\ell := \sum_{|j|=0}^\ell \int_D D^j u\,D^j v^*\,dx, \quad \|u\|_\ell := (u,u)_\ell^{1/2}. \tag{8.4}$$

Let $C^\infty(\bar D)$ denote the space of restrictions to $D$ of functions in $C^\infty(\mathbb{R}^r)$. If the boundary of $D$ is not sufficiently smooth, then $C^\infty(\bar D)$ may not be dense in $W^{\ell,p}(D)$. A sufficient condition on $\Gamma$ for $C^\infty(\bar D)$ to be dense in $W^{\ell,p}(D)$ is that $D$ be bounded and star-shaped with respect to a point. $D$ is called star-shaped with respect to a point $O$ if any ray issued from $O$ intersects $\Gamma := \partial D$ at one and only one point. Another sufficient condition for $C^\infty(\bar D)$ to be dense in $W^{\ell,p}(D)$ is that every point of $\Gamma$ have a neighborhood $U$ in which $D \cap U$ is representable, in a suitable Cartesian coordinate system, as $x_r < f(x_1, \ldots, x_{r-1})$, where $f$ is continuous.

Any function $u \in W^{\ell,p}(D)$, $p \ge 1$, $\ell \ge 1$ (possibly modified on a set of Lebesgue $\mathbb{R}^r$-measure zero) is absolutely continuous on almost all straight lines parallel to the coordinate axes, and its distributional first derivatives coincide with the usual derivatives almost everywhere. The spaces $W^{\ell,p}(D)$ are complete.

We say that a bounded domain $D \subset \mathbb{R}^r$ satisfies a uniform interior cone condition if there is a fixed cone $C_D$ such that each point of $\Gamma$ is the vertex of a cone $C_D(x) \subset D$ congruent to $C_D$. A strict cone property holds if $\Gamma$ has a locally finite covering by open sets $\{U_j\}$ and a corresponding collection of cones $\{C_j\}$ such that $\forall x \in U_j \cap \Gamma$ one has $x + C_j \in D$.

According to Calderon's extension theorem, there exists a bounded linear operator $E : W^{\ell,p}(D) \to W^{\ell,p}(\mathbb{R}^r)$ such that $Eu = u$ on $D$ for every $u \in W^{\ell,p}(D)$, provided that $D \in C^{0,1}$. The class $C^{0,1}$ of domains consists of bounded domains $D$ such that each point $x \in \Gamma$ has a neighborhood $U$ with the property that the set $D \cap U$ is represented by the inequality $x_r < f(x_1, \ldots, x_{r-1})$ in a Cartesian coordinate system, and the function $f$ is Lipschitz-continuous. The domains in $C^{0,1}$ have the cone property. The extension theorem holds for a wider class of domains than $C^{0,1}$, but we do not go into details.
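The mollification (8.1)-(8.2) above can be sketched in one dimension; the Riemann-sum approximation, the grid steps, and the discontinuous test function below are arbitrary illustrative choices, not from the text:

```python
import math

# 1D mollification u_eps(x) = (1/eps) * int rho((x - y)/eps) u(y) dy,
# with the standard bump rho of (8.1), computed by a Riemann sum.
def rho(t):
    if abs(t) >= 1.0:
        return 0.0
    return math.exp(1.0 / (t * t - 1.0))

# Normalizing constant so that int rho = 1 (computed numerically).
h0 = 1e-4
Z = sum(rho(-1.0 + h0 * k) * h0 for k in range(int(2 / h0) + 1))

def mollify(u, x, eps, h=1e-3):
    total, y = 0.0, x - eps
    while y <= x + eps:
        total += rho((x - y) / eps) / (Z * eps) * u(y) * h
        y += h
    return total

step = lambda y: 1.0 if y > 0 else 0.0   # a discontinuous u
# Away from the jump the mollified function reproduces u; at the jump it
# smoothly interpolates (here to about 1/2 by symmetry of rho).
left = mollify(step, -0.5, 0.1)
right = mollify(step, 0.5, 0.1)
mid = mollify(step, 0.0, 0.1)
```

This is the smoothing mechanism behind $u_\epsilon \in C^\infty$ and $\|u_\epsilon - u\|_{L^p_{loc}} \to 0$: convolution with a smooth compactly supported bump of shrinking width.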


Let us formulate a general embedding theorem ([Mazja (1986)]).

Theorem 8.1 Let $D \subset \mathbb{R}^r$ be a bounded domain with the cone property, and let $\mu$ be a measure on $D$ such that

$$\sup_{x\in\mathbb{R}^r}\rho^{-s}\mu(D \cap B(x,\rho)) < \infty, \quad B(x,\rho) = \{y : |x - y| \le \rho,\ y \in \mathbb{R}^r\},$$
and $s > 0$. (If $s \le r$ is an integer, then $\mu$ can be the $s$-dimensional Lebesgue measure on $D \cap \Gamma_s$, where $\Gamma_s$ is an $s$-dimensional smooth manifold.) Then, for any $u \in C^\infty(D) \cap W^{\ell,p}(D)$, one has
$$\sum_{j=0}^k \|D^j u\|_{L^q(D,\mu)} \le c\|u\|_{W^{\ell,p}(D)}, \tag{8.5}$$
where $c = \mathrm{const} > 0$ does not depend on $u$. Here the parameters $q, s, \ell, p, k$ satisfy one of the following sets of restrictions:

a) $p > 1$, $0 < r - p(\ell - k) < s \le r$, $q \le sp[r - p(\ell - k)]^{-1}$;
b) $p = 1$, $0 < r - \ell + k \le s \le r$, $q \le s(r - \ell + k)^{-1}$;
c) $p > 1$, $r = p(\ell - k)$, $s \le r$, $q > 0$ is arbitrary.
If
d) $p > 1$, $r < p(\ell - k)$, or
e) $p = 1$, $r \le \ell - k$,
then
$$\sup_{x\in D}\sum_{j=0}^k |D^j u| \le c\|u\|_{W^{\ell,p}(D)}. \tag{8.6}$$
If
f) $p \ge 1$, $(\ell - k - 1)p < r < (\ell - k)p$, and $\lambda := \ell - k - rp^{-1}$, then
$$\sup_{\substack{x,y\in D \\ x\ne y}} |x - y|^{-\lambda}\,|D^k u(x) - D^k u(y)| \le C\|u\|_{W^{\ell,p}(D)}. \tag{8.7}$$
If
g) $(\ell - k - 1)p = r$, then (8.7) holds for all $0 < \lambda < 1$.
The imbedding operator $i : W^{\ell,p}(D) \to W^{k,q}(\Gamma_s \cap D)$ is compact if $s > 0$, $r > (\ell - k)p$, $r - (\ell - k)p < s \le r$, and $q < sp[r - (\ell - k)p]^{-1}$. If $r = (\ell - k)p$, $q \ge 1$, $s \le r$, then the above imbedding operator $i$ is compact. If $r < (\ell - k)p$, then $i : W^{\ell,p}(D) \to C^k(\bar D)$ is compact. The trace operator $i : H^\ell(D) \to H^{\ell - 1/2}(\Gamma)$ is bounded if $\ell > \frac{1}{2}$.


8.1.2 Sobolev spaces with negative indices

We start with a brief exposition of the facts from distribution theory which will be used later. Let $S(\mathbb{R}^r)$ be the Schwartz space of $C^\infty(\mathbb{R}^r)$ functions which decay, with all their derivatives, faster than any negative power of $|x|$ as $|x| \to \infty$, so that
$$|\phi|_m := \max_{x\in\mathbb{R}^r}(1 + |x|)^m\sum_{j=0}^m |D^j\phi(x)| < \infty, \quad 0 \le m < \infty. \tag{8.8}$$
The set of the norms $|\cdot|_m$ defines the topology of $S(\mathbb{R}^r)$. A sequence $\phi_n$ converges to $\phi$ in $S$ if $|\phi_n - \phi|_m \to 0$ as $n \to \infty$ for $0 \le m < \infty$. It is easy to see that $C_0^\infty(\mathbb{R}^r)$ is dense in $S$ in the above sense. The space $S'$ of tempered distributions is the space of linear continuous functionals on $S$. Continuity of a linear functional $f$ on $S$ means that $(f, \phi_n) \to (f, \phi)$ if $\phi_n \to \phi$ in $S$. By $(f, \phi)$ the value of $f$ at the element $\phi$ is denoted. A wider class of distributions is often used, the class $\mathcal{D}'$ of linear continuous functionals over the space $\mathcal{D} = C_0^\infty(\mathbb{R}^r)$ of test functions. Continuity of $f \in \mathcal{D}'$ means that for any compact set $K \subset \mathbb{R}^r$ there exist constants $c$ and $m$ such that $|(f, \phi)| \le c\sum_{|j|\le m}\sup_{x\in K}|D^j\phi|$, $\phi \in C_0^\infty(K)$. By the derivative of a distribution $f \in S'$ one means a distribution $f' \in S'$ such that

$$(f', \phi) = -(f, \phi') \quad \forall\phi \in S. \tag{8.9}$$
Also

$$(D^m f, \phi) = (-1)^{|m|}(f, D^m\phi) \quad \forall\phi \in S.$$
Let $D$ be an open set in $\mathbb{R}^r$, and $S(D)$ be the completion of $C_0^\infty(D)$ in the topology of $S(\mathbb{R}^r)$. A continuous linear functional on $S(D)$ is a distribution in $D$. The space of these distributions is denoted $S'(D)$. If $(F, \phi) = (f, \phi)$ for all $\phi \in S(D)$, some $F \in S'(\mathbb{R}^r)$ and some $f \in S'(D)$, then $f$ is called the restriction of $F$ to $D$, and we write $f = pF$. Since $S(D)$ is, by definition, a closed linear subspace of $S(\mathbb{R}^r)$, the Hahn-Banach theorem says that a linear continuous functional $f \in S'(D)$ can be extended from $S'(D)$ to $S'(\mathbb{R}^r)$. Let us denote this extension by $Ef = F$. The space $S(\mathbb{R}^r)$ is a Frechet space, i.e. a complete locally convex metrizable space, so that the Hahn-Banach theorem holds. If $F \in S'(\mathbb{R}^r)$, then one says that $F = 0$ in an open set $D$ if $(F, \phi) = 0$ for all $\phi \in S(D)$, or, which is equivalent, for all $\phi \in C_0^\infty(D)$. If $D$ is the maximal open set on which $F = 0$, one says that $\Omega := \mathbb{R}^r\setminus D$ is the support of $F$. By $\bar\Omega$ the closure of $\Omega$ is denoted. If $F$

Auxiliary Results 237

is locally integrable, i.e. a usual function, then supp F is the complement of the maximal open set on which F = 0 almost everywhere in the sense of Lebesgue measure in R^r. Note that supp D^m F ⊆ supp F. The Fourier transform of f ∈ S'(R^r) is defined by

(f̃, φ̃*) = (2π)^r (f, φ*),  (8.10)

where the star stands for complex conjugation, the tilde stands for the Fourier transform, and

φ̃ := ∫ φ(x) exp(iλ·x) dx,  ∫ := ∫_{R^r},  φ ∈ S(R^r).

This definition of f̃ is based on the Parseval equality for functions: (φ̃_1, φ̃_2*) = (2π)^r (φ_1, φ_2*), and on the fact that the Fourier transform is a linear continuous bijection of S(R^r) onto itself. Note that

\widetilde{D^m f} = (−iλ)^m f̃  (8.11)

F(f ∗ φ) = f̃ φ̃,  F f := f̃,  (8.12)

where f ∗ φ is the convolution of f ∈ S'(R^r) and φ ∈ S(R^r), defined by f ∗ φ = (f, φ(x − y)). Here x ∈ R^r is fixed and (f, φ(x − y)) denotes the value of f at the element φ(x − y) ∈ S(R^r). The function f ∗ φ is infinitely differentiable.

Any f ∈ S'(R^r) can be represented in the form

f = Σ_{|j|≤m} D^j f_j,  (8.13)

where the f_j are continuous functions which satisfy the inequality

|f_j(x)| ≤ c(1 + |x|)^N,  |j| ≤ m,  (8.14)

m and N are some integers, and c = const > 0. The space H^ℓ(R^r) one can define either as the closure of C_0^∞(R^r) in the norm ‖·‖_ℓ := ‖·‖_{H^ℓ}, or by noting that, for φ ∈ C_0^∞(R^r), one can define an equivalent norm by the formula:

‖φ‖_ℓ² = (2π)^{−r} ∫ (1 + |λ|²)^ℓ |φ̃|² dλ,  (8.15)

and

(u, v)_ℓ := (2π)^{−r} ∫ (1 + |λ|²)^ℓ ũ ṽ* dλ,  (u, v) := (u, v)_0.  (8.16)
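As a numerical illustration of the Fourier characterization (8.15) of the H^ℓ norm (a sketch added here, not part of the original text: the one-dimensional grid, the Gaussian test function, and the FFT discretization are choices of this example), one can check that ℓ = 0 recovers the L² norm by Parseval's equality, while ℓ = 1 adds the norm of the first derivative:

```python
import numpy as np

# Discretize phi(x) = exp(-x^2/2) on a large one-dimensional grid (r = 1).
N, L = 4096, 40.0
x = np.linspace(-L/2, L/2, N, endpoint=False)
dx = x[1] - x[0]
phi = np.exp(-x**2 / 2)

# Approximate the continuous Fourier transform by the FFT; the phase
# coming from the grid offset does not affect |phi~|.
phi_t = np.fft.fft(phi) * dx
lam = 2*np.pi*np.fft.fftfreq(N, d=dx)
dlam = 2*np.pi/(N*dx)

def sobolev_norm_sq(ell):
    # (8.15) with r = 1: ||phi||_ell^2 = (2 pi)^(-1) * integral of (1+lam^2)^ell |phi~|^2
    return np.sum((1 + lam**2)**ell * np.abs(phi_t)**2) * dlam / (2*np.pi)

# ell = 0: integral of exp(-x^2) is sqrt(pi);
# ell = 1: ||phi||^2 + ||phi'||^2 = (3/2)*sqrt(pi).
print(sobolev_norm_sq(0), sobolev_norm_sq(1))
```

For the Gaussian both values agree with the closed forms essentially to machine precision, since the integrand decays superexponentially.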

238 Random Fields Estimation Theory

In (8.15) one can assume −∞ < ℓ < ∞. If f ∈ H^ℓ(R^r) and φ ∈ S(R^r), then

|(f, φ)| ≤ (2π)^{−r} ∫ |f̃||φ̃| dλ ≤ ‖f‖_ℓ ‖φ‖_{−ℓ},  (8.17)

where

‖φ‖_{−ℓ} := (2π)^{−r/2} ( ∫ (1 + |λ|²)^{−ℓ} |φ̃|² dλ )^{1/2}.  (8.18)

It is clear from (8.17) that one can define H^{−ℓ}(R^r) as the closure of S(R^r) in the norm

‖φ‖_{−ℓ} = sup_{f∈H^ℓ(R^r), f≠0} { |(f, φ)| / ‖f‖_ℓ }.  (8.19)

Thus H^{−ℓ}(R^r) is the dual space to H^ℓ(R^r) with respect to the pairing given by the inner product (·, ·)_0. Consider the spaces Ḣ^ℓ(D) and Ḣ^ℓ(Ω), −∞ < ℓ < ∞, of functions belonging to H^ℓ(R^r) with support in D̄ and Ω̄ respectively, Ω := R^r \ D. These spaces are subspaces of H^ℓ(R^r), closed in the H^ℓ(R^r) norm, which can be described as completions of C_0^∞(D) and C_0^∞(Ω) in the norm of H^ℓ(R^r). If f ∈ Ḣ^ℓ(D) then (f, φ) = 0 ∀φ ∈ C_0^∞(Ω).

Consider the space H^ℓ(D), where D ⊂ EH^ℓ, ℓ ≥ 0. This means that D has the property that there exists an extension operator E : H^ℓ(D) → H^ℓ(R^r) with Ef = f in D and ‖Ef‖_{H^ℓ(R^r)} ≤ c‖f‖_{H^ℓ(D)}. A similar definition can be given for D ⊂ EW^{ℓ,p}. Bounded domains D ⊂ C^{0,1} are in the class EH^ℓ (and in EW^{ℓ,p}). The property Ef = f in D for ℓ < 0 means that pEf = f, where Ef ∈ H^ℓ(R^r) ⊂ S'(R^r), ℓ < 0, and p is the restriction operator p : S'(R^r) → S'(D). Thus, for any ℓ, −∞ < ℓ < ∞, we consider the linear space of restrictions of elements f ∈ H^ℓ(R^r), −∞ < ℓ < ∞, to D ⊂ C^{0,1}. Define a norm on this space by the formula

‖f‖_ℓ := ‖f‖_{H^ℓ(D)} = inf_E ‖Ef‖_{H^ℓ(R^r)},  −∞ < ℓ < ∞,  (8.20)

where the infimum is taken over all extensions of f belonging to H^ℓ(R^r). If Ef is such an extension, then Ef + f_− is also such an extension for any f_− ∈ Ḣ^ℓ(Ω).

If ℓ ≥ 0 and D ⊂ EH^ℓ, then the norm (8.20) is equivalent to the usual norm (1.4).

Let (H^ℓ(D))' denote the dual space to H^ℓ(D) with respect to L²(D) = H^0(D). We introduce this notion in an abstract form. Let H_+ and H_0 be a pair of Hilbert spaces, H_+ ⊂ H_0, H_+ dense in H_0 in the H_0 norm, and


‖f‖_+ ≥ ‖f‖_0. Define the dual space H_+' := H_− of H_+ as follows. Note that if f ∈ H_0 and φ ∈ H_+ then

|(f, φ)| ≤ ‖f‖_0 ‖φ‖_0 ≤ ‖f‖_0 ‖φ‖_+,  (8.21)

where (f, φ) = (f, φ)_0 is the inner product in H_0. Define

‖f‖_− := sup_{φ∈H_+, φ≠0} |(f, φ)| ‖φ‖_+^{−1}.  (8.22)

It follows from (8.21) that ‖f‖_− ≤ ‖f‖_0, and that (f, φ) is a bounded linear functional on H_+. By Riesz's theorem, (f, φ)_0 = (If, φ)_+, where I : H_0 → H_+ is a linear bounded operator, ‖If‖_+ ≤ ‖f‖_0. Define H_− as the completion of H_0 in the norm (8.22). Clearly H_+ ⊂ H_0 ⊂ H_−. The triple {H_+, H_0, H_−} is called a rigged triple of Hilbert spaces. The space H_− is called the dual space to H_+ with respect to H_0. The inner product in H_− can be defined by the formula

(f, g)_− := (If, Ig)_+.  (8.23)

Indeed, the operator I was defined as an operator from H_0 into H_+. Therefore, for f, g ∈ H_0 the right side of (8.23) is well defined. Consider the completion of H_0 in the norm ‖f‖_− = ‖If‖_+. Then we obtain H_−. Note that the right side of (8.22) can be written as ‖If‖_+. Therefore, the operator I is now defined as a linear isometry from H_− into H_+. In fact, this isometry is onto H_+. Indeed, suppose (If, φ)_+ = 0 ∀f ∈ H_−. We wish to prove that φ = 0. If f ∈ H_0, then 0 = (If, φ)_+ = (f, φ)_0. Thus, φ = 0. If one considers H_+ as the space of test functions, then H_− is the corresponding space of distributions.

In this sense, we treat (H^ℓ(D))' as the space of distributions corresponding to H_+ = H^ℓ(D). Since Ḣ^ℓ(D) ⊂ H^ℓ(D), one has (H^ℓ(D))' ⊂ (Ḣ^ℓ(D))'. One of the results we will need is the following:

(Ḣ^ℓ(D))' = H^{−ℓ}(D),  −∞ < ℓ < ∞,
(H^ℓ(D))' = Ḣ^{−ℓ}(D),  −∞ < ℓ < ∞.  (8.24)

Here we assume that D ⊂ C^{0,1}. Let us recall that Ḣ^{−ℓ}(D) can be defined as the completion of the set C_0^∞(D) in the norm of H^{−ℓ}(R^r), while H^{−ℓ}(D) can be defined as the space of restrictions of elements of H^{−ℓ}(R^r) to the domain D ⊂ C^{0,1} with the norm (8.20).
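The dual norm (8.22) and the chain H_+ ⊂ H_0 ⊂ H_− can be made concrete in a finite-dimensional model (an illustrative sketch, not from the text: R^n with a weighted inner product plays the role of H_+; the weights and test vectors are invented for the example):

```python
import numpy as np

# Model: H0 = R^n with the standard inner product, and
# (f, g)_+ = sum(w * f * g) with weights w >= 1, so ||f||_+ >= ||f||_0.
rng = np.random.default_rng(0)
n = 5
w = 1.0 + rng.random(n)
f = rng.standard_normal(n)

norm0 = np.sqrt(np.sum(f**2))
norm_minus = np.sqrt(np.sum(f**2 / w))   # closed form for the sup in (8.22)

# The supremum in (8.22) is attained at phi = f / w (Cauchy-Schwarz):
phi = f / w
attained = abs(np.dot(f, phi)) / np.sqrt(np.sum(w * phi**2))

# Random trial vectors never exceed the closed form, and ||f||_- <= ||f||_0:
trials = rng.standard_normal((1000, n))
ratios = np.abs(trials @ f) / np.sqrt(trials**2 @ w)
print(attained, norm_minus, ratios.max(), norm0)
```

The closed form (Σ f_i²/w_i)^{1/2} is the discrete analogue of the weighted-L² description of H_− in (8.66) below.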


Let us now describe a canonical factorization of the isometric operator I : H_− → H_+ defined above. This factorization is

I = p_+ p_−,  (8.25)

where p_− : H_− → H_0 and p_+ : H_0 → H_+ are linear isometries onto H_0 and H_+ respectively. In order to prove (8.25), note that (If, g)_+ = (f, g)_0 for all f, g ∈ H_+. Thus I is a selfadjoint positive linear operator on H_+: (If, f)_+ = (f, f)_0 ≥ 0 (= 0 if and only if f = 0). Let J_+ be the closure of I^{1/2} : H_+ → H_+ considered as an operator in H_0. Since ‖I^{1/2} f‖_+ = ‖f‖_0, the operator I^{1/2} is closable. If f_n ∈ H_+ and f_n converges in H_0 to f ∈ H_0, then I^{1/2} f_n → g in H_+. Let us define J_+ f = I^{1/2} f = g. Then J_+ is defined on all of H_0, it is an isometry: ‖J_+ f‖_+ = ‖f‖_0, and its range is all of H_+. Indeed, suppose (J_+ f, φ)_+ = 0 for all f ∈ H_0 and some φ ∈ H_+. Then (f, φ)_0 = 0 ∀f ∈ H_0. Thus φ = 0. Since J_+ is an isometry, its range is a closed subspace of H_+. We have proved above that the orthogonal complement of the range of J_+ is trivial. Therefore the range of J_+ is H_+. So one can take p_+ = J_+. Define p_− := J_+^{−1} I. Then (8.25) holds and p_− : H_− → H_0 is an isometry with domain H_− and range H_0, while p_+ : H_0 → H_+ is an isometry with domain H_0 and range H_+. One has I = p_+ p_−, which is the desired factorization (8.25).

If i : H_+ → H_0 is the imbedding operator and I is considered as an operator from H_0 into H_+, then i = I*, where I* is the operator adjoint to I : H_0 → H_+, I* : H_+ → H_0. Indeed

(If, g)_+ = (f, g)_0 = (f, ig)_0  ∀f, g ∈ H_+.  (8.26)

If one assumes that the imbedding operator i : H_+ → H_0 is in the Hilbert-Schmidt class as an operator in H_0, then p_+ is in the Hilbert-Schmidt class as an operator in H_0. Indeed, p_+ : H_0 → H_+ is an isometry. Therefore it sends a bounded set in H_0, say the unit ball ‖u‖_0 ≤ 1 of H_0, into a bounded set in H_+: ‖p_+ u‖_+ = ‖u‖_0 ≤ 1 if ‖u‖_0 ≤ 1. Since the imbedding i : H_+ → H_0 is Hilbert-Schmidt, the operator p_+ considered as an operator on H_0 is in the Hilbert-Schmidt class:

(p_+ : H_0 → H_0) = (i : H_+ → H_0)(p_+ : H_0 → H_+).  (8.27)

The right side of (8.27) is the product of an operator in the Hilbert-Schmidt class and a bounded operator. Therefore the product is in the Hilbert-Schmidt class.


Define p_+^{−1} := q_+, q_+ : H_+ → H_0, ‖q_+ f‖_0 = ‖f‖_+, and p_−^{−1} := q_−, q_− : H_0 → H_−, ‖q_− f‖_− = ‖f‖_0. One has

(q_+ h, q_+ g)_0 = (h, g)_+,  h, g ∈ H_+,  (8.28)

(p_+ h, p_+ g)_+ = (h, g)_0,  h, g ∈ H_0,  (8.29)

(p_− h, p_− g)_0 = (h, g)_−,  h, g ∈ H_−,  (8.30)

(q_− h, q_− g)_− = (h, g)_0,  h, g ∈ H_0.  (8.31)

Because p_+ p_− = I, one has

q_− q_+ = I^{−1}.  (8.32)

Note that

(f, q_+ u)_0 = (q_− f, u)_0,  f ∈ H_0, u ∈ H_+,  (8.33)

so that q_− = q_+* is the adjoint to q_+ in H_0. To check (8.33) one writes

(f, q_+ u)_0 = (q_− f, q_− q_+ u)_− = (q_− f, I^{−1} u)_− = (q_− f, u)_0,  (8.34)

which is equation (8.33).
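A finite-dimensional sketch of the Riesz operator I and the factorization (8.25) (added here for illustration and not part of the text: R^n with the weighted inner product (f, g)_+ = Σ w_i f_i g_i, w_i ≥ 1, stands in for H_+ ⊂ H_0; in this model both p_+ and p_− act as multiplication by w^{−1/2}):

```python
import numpy as np

# H0 = R^n, (f, g)_+ = sum(w * f * g), w >= 1.
rng = np.random.default_rng(3)
n = 4
w = 1.0 + rng.random(n)
f = rng.standard_normal(n)

# Riesz operator I of (f, phi)_0 = (I f, phi)_+ :  I f = f / w.
If = f / w

# p_+ (H0 -> H+) is multiplication by w^(-1/2); p_- (H- -> H0) acts the
# same way in this diagonal model, so I = p_+ p_- is multiplication by 1/w.
p = lambda v: v / np.sqrt(w)

print(np.allclose(If, p(p(f))))                        # factorization (8.25)
print(np.isclose(np.sum(w * p(f)**2), np.sum(f**2)))   # p_+ isometry H0 -> H+
```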

8.2 Eigenfunction expansions for elliptic selfadjoint operators

8.2.1 Resolution of the identity and integral representation of selfadjoint operators

Every selfadjoint operator A on a Hilbert space H can be represented as

A = ∫_{−∞}^{∞} λ dE_λ,  (8.35)

where E_λ is a family of orthogonal projection operators such that

E_λ² = E_λ,  E_{−∞} = 0,  E_{+∞} = I,  (8.36)

E_Δ E_{Δ'} = E_{Δ∩Δ'},  (8.37)


where 0 is the zero operator, I is the identity operator, Δ = (a, b], −∞ < a < b < ∞, E_Δ := E_b − E_a. The family E_λ is called the resolution of the identity of A. The domain of definition of A is:

Dom A = {f : f ∈ H, ∫_{−∞}^{∞} λ² d(E_λ f, f) < ∞}.  (8.38)

A function φ(A) is defined as

φ(A) = ∫_{−∞}^{∞} φ(λ) dE_λ,  (8.39)

where

Dom φ(A) = {f : f ∈ H, ∫_{−∞}^{∞} |φ(λ)|² d(E_λ f, f) < ∞}.

The operator integrals (8.35), (8.39) can be understood as improper operator Stieltjes integrals which converge strongly.
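In finite dimension the spectral representation (8.35) and the functional calculus (8.39) reduce to the eigendecomposition of a symmetric matrix. The following sketch (the random matrix and the test function φ(t) = t³ are choices of this example, not from the text) checks that φ(A) built from the spectral data agrees with the ordinary matrix power:

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2                 # selfadjoint (real symmetric) matrix

lam, V = np.linalg.eigh(A)        # A = V diag(lam) V^T

def phi_of_A(phi):
    # finite-dimensional analogue of (8.39): phi(A) = sum_k phi(lam_k) P_k,
    # where P_k are the spectral projectors encoded in the columns of V
    return (V * phi(lam)) @ V.T

print(np.allclose(phi_of_A(lambda t: t**3), A @ A @ A))
```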

8.2.2 Differentiation of operator measures

A family E(Δ) of selfadjoint bounded nonnegative operators, defined on the Borel sets Δ ⊂ R^1 with values in the set B(H) of bounded linear operators on a Hilbert space H, is called an operator measure if E(∪_{j=1}^∞ Δ_j) = Σ_{j=1}^∞ E(Δ_j), where the series on the right converges in the sense of weak convergence of operators, Δ_i ∩ Δ_j = ∅ for i ≠ j, and E(∅) = 0. Assume that for bounded Δ one has Tr E(Δ) < ∞, where Tr A is the trace of A. Then ρ(Δ) := Tr E(Δ) ≥ 0 is a usual (scalar) measure on R^1. Let |Ψ| denote the Hilbert-Schmidt norm of a linear operator Ψ, |Ψ| = (Σ_{j=1}^∞ ‖Ψe_j‖²)^{1/2}, where {e_j} is an orthonormal basis of H. In this chapter the star will stand for the adjoint operator and the bar for complex conjugation or for the closure.

Lemma 8.1 For ρ-a.e. (almost every) λ there exists a HS (Hilbert-Schmidt) operator-valued function Ψ(λ) ≥ 0 with ‖Ψ(λ)‖ ≤ Tr Ψ(λ) ≤ 1 such that

E(Δ) = ∫_Δ Ψ(λ) dρ(λ).  (8.40)

The function Ψ(λ) is uniquely defined ρ-a.e. and can be obtained as a weak


limit

Ψ(λ) = w-lim E(Δ_j) ρ^{−1}(Δ_j)  as Δ_j → λ.  (8.41)

The integral in (8.40) converges in the operator norm for any bounded Δ. The limit in (8.41) means that λ ∈ Δ_j and |Δ_j| → 0, where |Δ_j| is the length of Δ_j and Δ_j is a suitable sequence of intervals.

Let A be a selfadjoint operator on H and E(Δ) its resolution of the identity. In general, Tr E(Δ) is not finite, so that Lemma 8.1 is not applicable. In order to be able to use this lemma, let us take an arbitrary linear densely defined closed operator T on H with the properties: i) Ran T = H and T is injective; ii) T^{−1} ∈ σ_2, where σ_2 is the class of Hilbert-Schmidt operators.

Definition 8.1 A linear compact operator K on H belongs to the class σ_p, K ∈ σ_p, if and only if Σ_{n=1}^∞ s_n^p(K) < ∞, where the s_n(K) are the s-values of K, defined by the formula s_n(K) = λ_n[(K*K)^{1/2}], and λ_n(B) are the eigenvalues of a compact selfadjoint nonnegative operator B, ordered so that λ_1 ≥ λ_2 ≥ ··· ≥ 0. If K ∈ σ_1 it is called a trace class operator; if K ∈ σ_2 it is called a HS operator.

More information about trace class operators is given in Section 8.3.3. Having chosen T as above, define the operator measure θ(Δ) := (T^{−1})* E(Δ) T^{−1}. If A ∈ σ_2, define its HS norm by the formula |A|² := Σ_{j=1}^∞ ‖Ae_j‖², where ‖Af‖ is the norm in H of the vector Af and {e_j} is an orthonormal basis of H. By ‖A‖ denote the norm of A. Note that |AB| ≤ |A|‖B‖, ‖A‖ ≤ |A|, |λA| = |λ||A|, |A + B| ≤ |A| + |B|, and that |A| does not depend on the choice of the orthonormal basis {e_j} of H. Since Tr(B*AB) ≤ ‖A‖|B|², one has Tr θ(Δ) ≤ ‖E(Δ)‖|T^{−1}|² < ∞. Therefore Lemma 8.1 is applicable to θ(Δ), so that

(E(Δ)f, g) = ((T^{−1})* E(Δ) T^{−1} Tf, Tg) = ∫_Δ (Ψ(λ)Tf, Tg) dρ(λ),  (8.42)

where dρ is a nonnegative measure, ρ((−∞, ∞)) < ∞, Ψ(λ) is a nonnegative operator function, Ψ(λ) ≥ 0, and |Ψ(λ)| ≤ Tr Ψ(λ) = 1.

Let, for a fixed λ, φ_α(λ) and ν_α(λ), α = 1, 2, ..., N_λ ≤ ∞, be respectively the orthonormal system of eigenvectors of Ψ(λ) and the corresponding eigenvalues. Then

(Ψ(λ)Tf, Tg) = Σ_{α=1}^{N(λ)} ν_α (Tf, φ_α) \overline{(Tg, φ_α)} = Σ_{α=1}^{N(λ)} (Tf, ψ_α) \overline{(Tg, ψ_α)},  (8.43)


where the bar denotes complex conjugate,

ψ_α := ν_α^{1/2}(λ) φ_α,  Σ_{α=1}^{N_λ} ‖ψ_α‖² = Σ_{α=1}^{N_λ} ν_α = Tr Ψ(λ) = 1.  (8.44)

One can write

(E(Δ)f, g) = ∫_Δ Σ_{α=1}^{N_λ} (Tf, ψ_α(λ)) \overline{(Tg, ψ_α(λ))} dρ(λ).  (8.45)

If F(λ) is a Borel measurable function on Λ, then

(F(A)f, g) = ∫_{−∞}^{∞} F(λ) (Ψ(λ)Tf, Tg) dρ(λ),  (8.46)

where f ∈ D(F(A)) ∩ D(T), g ∈ D(T). If one takes T = q_+ and if the imbedding i : H_+ → H_0 is in the Hilbert-Schmidt class, then the operator P(λ) := T*Ψ(λ)T, which appears in (8.42):

(E(Δ)f, g) = ∫_Δ (P(λ)f, g) dρ(λ),  (8.47)

and in (8.46):

(F(A)f, g) = ∫_{−∞}^{∞} F(λ) (P(λ)f, g) dρ(λ),  (8.48)

can be considered as an operator from H_+ into H_− for any fixed λ ∈ R^1. This operator is in the Hilbert-Schmidt class because it is a product of two bounded operators and a Hilbert-Schmidt operator Ψ(λ): T* = q_− : H_0 → H_− is bounded, T = q_+ : H_+ → H_0 is bounded, and Ψ(λ) : H_0 → H_0 is in the Hilbert-Schmidt class. The range of the operator P(λ) is a generalized eigenspace of the operator A corresponding to the point λ. This eigenspace belongs to H_−. Formula (8.47) can be written as

E(Δ) = ∫_Δ P(λ) dρ(λ),  (8.49)

where the integral (8.49) converges weakly, as follows from formula (8.47), but it also converges in the Hilbert-Schmidt norm of the operators in L(H_+, H_−), where by L(H_+, H_−) we denote the set of linear bounded operators from H_+ into H_−. Indeed

|P(λ)| ≤ ‖T*‖ |Ψ(λ)| ‖T‖ ≤ |Ψ(λ)| ≤ 1,  (8.50)

where we took into account that ‖T‖ = ‖q_+‖ = 1, ‖T*‖ = ‖q_−‖ = 1.


The operator P(λ) is an orthogonal projector in the following sense. If φ ∈ H_+ and

(P(λ)u, φ)_0 = 0  ∀u ∈ H_+,  (8.51)

then

P(λ)φ = 0.  (8.52)

Indeed

0 = (P(λ)u, φ)_0 = (q_−Ψ(λ)q_+u, φ)_0 = (Ψ(λ)q_+u, q_+φ)_0
= (q_+u, Ψ(λ)q_+φ)_0 = (u, q_−Ψ(λ)q_+φ)_0
= (u, P(λ)φ)_0  ∀u ∈ H_+.  (8.53)

Thus, equation (8.52) follows from (8.53). Therefore, if φ is orthogonal to the range of P(λ) then the projection of φ onto the range of P(λ) vanishes.

Let us rewrite formula (8.43) in terms of the generalized eigenvectors. Define

T*ψ_α = q_−ψ_α := η_α ∈ H_−.  (8.54)

Then (8.43) can be rewritten as

(P(λ)f, g) = Σ_{α=1}^{N(λ)} f_α(λ) \overline{g_α(λ)},  (8.55)

where

f_α(λ) := (f, η_α)_0,  f ∈ H_+.  (8.56)

Formula (8.45) becomes

(E(Δ)f, g) = ∫_Δ Σ_{α=1}^{N(λ)} f_α(λ) \overline{g_α(λ)} dρ(λ).  (8.57)

Since P(λ) is in the Hilbert-Schmidt class, it is an integral operator with a kernel Φ(x, y, λ), the (generalized) spectral kernel (see the Remark at the end of Section 3.6). The operator E_λ is an integral operator with the kernel

E(x, y, λ) = ∫_{−∞}^{λ} Φ(x, y, t) dρ(t).  (8.58)


The operator F (A) is an integral operator with the kernel

F(A)(x, y) = ∫_{−∞}^{∞} F(λ) Φ(x, y, λ) dρ(λ).  (8.59)

8.2.3 Carleman operators

An integral operator

Af = ∫ A(x, y) f(y) dy,  ∫ := ∫_{R^r},  (8.60)

is called a Carleman operator if

ν(A) := sup_{x∈R^r} ∫ |A(x, y)|² dy < ∞.  (8.61)

A selfadjoint operator L is called a Carleman operator if there exists a continuous function φ(λ),

φ(λ) ∈ C(Λ),  0 < |φ(λ)| ≤ c  ∀λ ∈ Λ,  (8.62)

where c = const > 0 and Λ is the spectrum of L, such that the operator A = φ(L) is an integral operator with a kernel A(x, y) which satisfies (8.61).

Let H_0 = L²(R^r) and take H_+ = L²(R^r, p(x)), where p(x) ≥ 1 and

∫ p^{−1}(x) dx < ∞.  (8.63)

For example, one can take p(x) = (1 + |x|²)^{(r+ε)/2}, where ε > 0 is any positive number. Then the operator A defined by the formula (8.60) is in the Hilbert-Schmidt class σ_2(H_0, H_−) if the condition (8.63) holds. Indeed, if {f_j}, 1 ≤ j < ∞, is an orthonormal basis of H_0, then

Σ_{j=1}^∞ ‖Af_j‖_−² = Σ_{j=1}^∞ ∫ p^{−1}(x) |∫ A(x, y) f_j(y) dy|² dx
= ∫ dx p^{−1}(x) ∫ |A(x, y)|² dy
≤ ν(A) ∫ p^{−1}(x) dx < ∞,  (8.64)

where Parseval's equality was used to get the second equality and the conditions (8.61) and (8.63) were used to get the final inequality (8.64).
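As a one-dimensional illustration of the Carleman condition (8.61) (a sketch added here, not from the text: the operator (I − d²/dx²)^{−1} on L²(R), whose kernel is e^{−|x−y|}/2, is chosen as the example), the quantity ν(A) can be computed numerically; the inner integral equals 1/4 for every x:

```python
import numpy as np

# Kernel of A = (I - d^2/dx^2)^{-1} on L^2(R):  A(x, y) = exp(-|x-y|)/2.
# (8.61): nu(A) = sup_x integral |A(x, y)|^2 dy
#       = integral exp(-2|t|)/4 dt = 1/4, independent of x.
y = np.linspace(-40.0, 40.0, 400001)
dy = y[1] - y[0]
vals = [np.sum(np.exp(-2*np.abs(x0 - y)) / 4) * dy for x0 in (0.0, 1.3, -2.7)]
print(vals)
```

Since the integral is the same for every x, the supremum in (8.61) is finite and A is a Carleman operator.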


If A = φ(L) and A ∈ σ_2(H_0, H_−), one can use the triple

L²(R^r, p(x)) ⊂ L²(R^r) ⊂ L²(R^r, p^{−1}(x))  (8.65)

for eigenfunction expansions of the operator L. Here

H_+ = L²(R^r, p(x)),  H_0 = L²(R^r),  H_− = L²(R^r, p^{−1}(x)).  (8.66)

Indeed, the basic condition which has to be satisfied for the theory developed in section 1 to be valid is

Tr[(T^{−1})* E(Δ) T^{−1}] < ∞,  (8.67)

provided that Δ is bounded. Since φ(λ) is continuous and (8.62) holds, one has

ξ_Δ(λ) ≤ |φ(λ)|² c(Δ),  λ ∈ Λ,  (8.68)

where ξ∆(λ) is the characteristic function of ∆:

ξ_Δ(λ) = { 1, λ ∈ Δ; 0, λ ∉ Δ },  (8.69)

and c(Δ) = const > 0. Inequality (8.68) implies

E(Δ) ≤ c(Δ) ∫_Λ |φ(λ)|² dE_λ = c(Δ) φ*(L) φ(L).  (8.70)

Therefore

Tr[(T^{−1})* E(Δ) T^{−1}] ≤ c(Δ) Tr[(φ(L)T^{−1})* (φ(L)T^{−1})] = c(Δ) |φ(L)T^{−1}|².  (8.71)

Thus (8.67) holds if

|φ(L)T^{−1}| < ∞.  (8.72)

If T^{−1} = p_+, then condition (8.72) becomes

|φ(L)p_+| < ∞.  (8.73)

In the case of the triple (8.65)-(8.66) the operator p_+ is a multiplication operator given by the formula

p_+f = p^{−1/2}(x) f(x).  (8.74)


Condition (8.73) holds if p(x) satisfies condition (8.63) and φ(L) = A is a Carleman operator, so that condition (8.61) holds. Indeed, inequality (8.73) holds if

∫∫ |A(x, y)|² p^{−1}(y) dy dx < ∞.  (8.75)

We assume that the function φ(λ) is such that (8.61) implies

sup_{y∈R^r} ∫ |A(x, y)|² dx < ∞.  (8.76)

If this is the case then inequality (8.75) holds provided that (8.63) and (8.76) hold. Therefore in this case one can use the triple (8.65) for eigenfunction expansions of the operator L, and the generalized eigenfunctions of L are elements of L²(R^r, p^{−1}(x)), so that they belong to L²_loc(R^r). Inequality (8.61) implies (8.76), for example, if φ(λ) is a real-valued function. In this case A = A*, so that

A(x, y) = \overline{A(y, x)}  (8.77)

and if (8.77) holds then clearly (8.61) implies (8.76). In many applications one takes

φ(λ) = (λ − z)^{−m},  (8.78)

where z is a complex number and m is a sufficiently large positive integer. If φ is so chosen then

A(x, y; z) = ∫_{−∞}^{∞} (λ − z)^{−m} Φ(x, y, λ) dρ(λ)  (8.79)

and

\overline{A(y, x; z)} = \overline{∫_{−∞}^{∞} (λ − z)^{−m} Φ(y, x, λ) dρ(λ)}
= ∫_{−∞}^{∞} (λ − z*)^{−m} \overline{Φ(y, x, λ)} dρ(λ)
= ∫_{−∞}^{∞} (λ − z*)^{−m} Φ(x, y, λ) dρ(λ)
= A(x, y; z*),  (8.80)

where we have used the equation

Φ(x, y, λ) = \overline{Φ(y, x, λ)},  (8.81)


which follows from the assumed selfadjointness of L. Therefore, if both kernels A(x, y; z) and A(x, y; z*) satisfy inequality (8.61), then (8.76) holds.

8.2.4 Elements of the spectral theory of elliptic operators in L²(R^r)

Let

Lu = Σ_{|j|≤s} a_j(x) D^j u,  D^j = D_1^{j_1} ··· D_r^{j_r},  D_p = −i ∂/∂x_p,  (8.82)

where x ∈ R^r, j is a multiindex, a_j(x) ∈ C^{|j|}(R^r),

a_s(x, ξ) := Σ_{|j|=s} a_j(x) ξ^j ≠ 0  for (x, ξ) ∈ R^r × (R^r \ 0),  (8.83)

and assume that L is formally selfadjoint:

L = L*,  (8.84)

that is

(Lφ, ψ) = (φ, Lψ)  ∀φ, ψ ∈ C_0^∞(R^r).  (8.85)

The function a_s(x, ξ) is called the symbol of the elliptic operator (8.82), and condition (8.83) is the ellipticity condition. If (8.83) holds and r ≥ 3, then s is necessarily an even number. Often one assumes that L is strongly elliptic. This means that

Re Σ_{|j|=s} a_j(x) ξ^j ≠ 0  for (x, ξ) ∈ R^r × (R^r \ 0).  (8.86)

The assumptions (8.84) and (8.86) imply that the operator L is bounded from below on C_0^∞(R^r):

(Lu, u)_0 ≥ c_1 ‖u‖²_{H^{s/2}(R^r)} − c_2 ‖u‖²_{L²(R^r)}  ∀u ∈ C_0^∞(R^r),  (8.87)

where c_1 and c_2 are positive constants.

Define the minimal operator in L²(R^r) generated by the formally selfadjoint differential expression (8.82) as the closure of the symmetric operator u → Lu with the domain of definition C_0^∞(R^r). Any densely defined symmetric operator on a Hilbert space H is closable. Recall that L̄, the closure of L, is defined as follows. Let u_n ∈ Dom L := D(L), u_n → u in H and Lu_n → f in H. Then one declares that u ∈ D(L̄) and L̄u = f. This definition implies that L̄ is defined on the closure of D(L) in the graph norm


‖u‖_L := ‖u‖ + ‖Lu‖, and the graph of L̄ is the closure in the graph norm of the graph of L, that is, of the set of ordered pairs {u, Lu}, u ∈ D(L). One says that L is closable if and only if the closure of the set {u, Lu}, u ∈ D(L), is a graph. In other words, L is closable if and only if there is no pair {0, f}, f ≠ 0, in the closure of the graph of L. This means that if

u_n ∈ D(L),  u_n → 0  and  Lu_n → f,  (8.88)

then

f = 0. (8.89)

If L is symmetric and densely defined and φ ∈ D(L), then

(f, φ) = lim_{n→∞} (Lu_n, φ) = lim_{n→∞} (u_n, Lφ) = 0  ∀φ ∈ D(L).  (8.90)

Since D(L) is dense, one concludes that f = 0. Therefore L is closable. We denote its closure by L_m.

Under some assumptions on the coefficients a_j(x) it turns out that L_m is selfadjoint. If no assumptions on the growth of a_j(x) as |x| → ∞ are made, then L_m may fail to be selfadjoint (see [Ma, p. 156] for an example). Let us give some sufficient conditions for L_m to be selfadjoint.

Note that if L is densely defined, symmetric and bounded from below, it always has a (unique) selfadjoint extension with the same lower bound, the Friedrichs extension L_F. This extension is characterized by the fact that its domain of definition belongs to the energy space of the operator L, that is, to the Hilbert space H_L defined as the closure of D(L) in the metric

[u, u] = (Lu, u) + c(u, u),  (8.91)

where c > 0 is a sufficiently large constant such that L + cI is positive definite on D(L).

Since the closure of L is the minimal closed extension of L and since L_F is a closed extension of L, one concludes that if L is bounded from below in H = L²(R^r) and L_m is selfadjoint, then

L_m = L_F.  (8.92)

In order to give conditions for L_m to be selfadjoint, consider first the case when

a_j(x) = a_j = const,  |j| ≤ s.  (8.93)

In this case L is symmetric on C_0^∞(R^r).


Lemma 8.2 If (8.93) holds then L_m is selfadjoint. Its spectrum consists of the set

{λ : λ = Σ_{|j|≤s} a_j ξ^j, ξ ∈ R^r}.  (8.94)

Proof. Let

Fu = (2π)^{−r/2} ∫ exp(−iξ·x) u(x) dx := û(ξ),  (8.95)

u(x) = F^{−1}û = (2π)^{−r/2} ∫ exp(iξ·x) û(ξ) dξ,  (8.96)

F(D_p u) = ξ_p û(ξ),  1 ≤ p ≤ r,  D_p = −i ∂/∂x_p.  (8.97)

Therefore

F( Σ_{|j|≤s} a_j D^j u ) = Σ_{|j|≤s} a_j ξ^j û

and

Lu = F^{−1} L(ξ) F u,  (8.98)

where

L(ξ) := Σ_{|j|≤s} a_j ξ^j.  (8.99)

The operator F of the Fourier transform is unitary in L²(R^r). Formula (8.98) shows that the operator L is unitarily equivalent to the operator of multiplication by the polynomial L(ξ) defined by formula (8.99). This multiplication operator is defined on the set F(C_0^∞(R^r)) of the Fourier transforms of the functions belonging to the set C_0^∞(R^r), that is, to the domain of definition of the operator L. Consider the closure M in L²(R_ξ^r) of the operator of multiplication by the function L(ξ):

Mû := L(ξ)û(ξ).  (8.100)

The domain of definition of M is

D(M) = {û : L(ξ)û(ξ) ∈ L²(R^r)}.  (8.101)


The operator M is clearly selfadjoint since the function L(ξ) is real-valued. Therefore

L_m = F^{−1} M F  (8.102)

is also selfadjoint. Since the spectra of unitarily equivalent operators are identical and the spectrum of M is the set (8.94), the spectrum of L_m is the set (8.94). Lemma 8.2 is proved. □

The following well-known simple lemmas are useful for proving, under suitable assumptions, that L_m is selfadjoint.

Lemma 8.3 Let A and B be linear operators on a Hilbert space H, A selfadjoint, D(A) ⊂ D(B), B symmetric, and

‖Bu‖ ≤ ε‖Au‖ + c‖u‖  ∀u ∈ D(A),  (8.103)

where 0 < ε < 1 is a fixed number and c is a positive number. Then the operator A + B with Dom(A + B) = Dom A is selfadjoint.

Lemma 8.4 Let A be a symmetric densely defined operator in a Hilbert space H. Then Ā, the closure of A, is selfadjoint if and only if one of the following conditions holds:

cℓ Ran(A ± iλ) = H  (8.104)

or

N(A* ± iλ) = {0}.  (8.105)

Here λ > 0 is an arbitrary fixed number, cℓ Ran A is the closure of the range of A,

N(B) = {u : Bu = 0}  (8.106)

and A∗ is the adjoint of A. For convenience of the reader let us prove these lemmas. We start with Lemma 8.4.

Proof of Lemma 8.4 a) The argument is the same for any λ > 0, so let us take λ = 1. Suppose that Ā is selfadjoint. Then A* = Ā* = Ā. If A*u = iu, then

(A*u, u) = i(u, u).  (8.107)


Since A* is selfadjoint, the quadratic form (A*u, u) is real-valued. Thus, equation (8.107) implies u = 0, and (8.105) is established. To derive (8.104) one uses the formula

cℓ Ran(A ∓ i) ⊕ N(A* ± i) = H,  (8.108)

where ⊕ is the symbol of the orthogonal sum of subspaces. From (8.108) and (8.105) one gets (8.104).

b) Assume now that (8.105) holds and prove that Ā is selfadjoint. If (8.105) holds then (8.104) holds, as follows from (8.108). Conversely, (8.104) implies (8.105). Note that

cℓ Ran(A ± i) = Ran(A ± i)  (8.109)

because Ran(A ± i) are closed subspaces. Indeed, if (A ± i)u_n = f_n and f_n → f, then

‖(A ± i)u_{nm}‖ → 0,  n, m → ∞,  (8.110)

where u_{nm} := u_n − u_m. Since A is symmetric, one obtains from (8.110) that

‖(A ± i)u_{nm}‖² = ‖Au_{nm}‖² + ‖u_{nm}‖² → 0,  n, m → ∞.  (8.111)

Therefore u_n is a Cauchy sequence. Let u_n → u. Then, since A is closed, one obtains

(A ± i)u = f.  (8.112)

Therefore Ran(A ± i) are closed subspaces. If (8.105) holds then

Ran(A ± i) = H.  (8.113)

This and the symmetry of A imply that A is selfadjoint. Indeed, let

((A + i)u, v) = (u, f)  ∀u ∈ D(A).  (8.114)

Using (8.113), find w such that

(A − i)w = f.  (8.115)

This is possible because of (8.113). Use (8.114) and the symmetry of A to obtain

((A + i)u, v) = (u, (A − i)w) = ((A + i)u, w).  (8.116)


By (8.113) one has Ran(A + i) = H. This and (8.116) imply v = w. Therefore v ∈ D(A) and A is selfadjoint on D(A). Lemma 8.4 is proved. □

Proof of Lemma 8.3 It is sufficient to prove that, for some λ > 0,

Ran(A + B ± iλ) = H.  (8.117)

One has

A + B ± iλ = [I + B(A ± iλ)^{−1}](A ± iλ).  (8.118)

Equation (8.117) is established as soon as we prove that

‖B(A ± iλ)^{−1}‖ < 1.  (8.119)

Indeed, if (8.119) holds then the operator I + B(A ± iλ)^{−1} is an isomorphism of H onto H and, since A is selfadjoint, Ran(A ± iλ) = H, so

Ran(A + B ± iλ) = Ran{[I + B(A ± iλ)^{−1}](A ± iλ)} = H.  (8.120)

Use the basic assumption (8.103) to prove (8.119). Let

(A + iλ)^{−1}u = f,  B(A + iλ)^{−1}u = Bf.  (8.121)

Then

‖B(A + iλ)^{−1}u‖ = ‖Bf‖ ≤ ε‖Af‖ + c‖f‖
= ε‖A(A + iλ)^{−1}u‖ + c‖(A + iλ)^{−1}u‖
≤ ε‖u‖ + cλ^{−1}‖u‖ = (ε + cλ^{−1})‖u‖.  (8.122)

If ε < 1 and λ > 0 is large enough, then

ε + cλ^{−1} < 1.  (8.123)

Thus (8.119) holds, and Lemma 8.3 is proved. □

In deriving (8.122) we have used the inequalities

‖(A ± iλ)^{−1}‖ ≤ λ^{−1},  λ > 0,  (8.124)

and

‖A(A ± iλ)^{−1}‖ ≤ 1,  (8.125)


both of which follow immediately from the spectral representation of a selfadjoint operator A: if

φ(A) = ∫ φ(t) dE_t,  (8.126)

then

‖φ(A)‖ ≤ max_t |φ(t)|.  (8.127)

Both inequalities (8.124) and (8.125) can be derived in a simple way which does not use the result (8.126)-(8.127). This derivation is left to the reader as an exercise. □
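The resolvent bounds (8.124) and (8.125) are easy to verify numerically for a selfadjoint matrix (a sketch added here; the random symmetric matrix and the values of λ are choices of this example):

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((6, 6))
A = (M + M.T) / 2                 # selfadjoint matrix
I = np.eye(6)

checks = []
for lam in (0.5, 1.0, 10.0):
    R = np.linalg.inv(A + 1j*lam*I)
    checks.append(np.linalg.norm(R, 2) <= 1/lam + 1e-12)      # (8.124)
    checks.append(np.linalg.norm(A @ R, 2) <= 1 + 1e-12)      # (8.125)
print(all(checks))
```

Both bounds follow from the spectral mapping: for real t, |1/(t + iλ)| ≤ 1/λ and |t/(t + iλ)| ≤ 1.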

We now return to the question of the selfadjointness of L_m. If L_m is selfadjoint then L is called essentially selfadjoint. We wish to prove that if the principal part of L is an operator with constant coefficients and the remaining part is an operator with smooth bounded coefficients, then L is essentially selfadjoint. The principal part of L is the differential expression

L_0 := Σ_{|j|=s} a_j(x) D^j.  (8.128)

Let us recall that a polynomial P(ξ) is subordinate to the polynomial Q(ξ) if

|P(ξ)| / (1 + |Q(ξ)|) → 0  as |ξ| → ∞, ξ ∈ R^r.  (8.129)

If (8.129) holds then we write

P ≺≺ Q.  (8.130)

We say that Q(ξ) is stronger than P(ξ) and write

P ≺ Q  (8.131)

if

P̃(ξ)/Q̃(ξ) ≤ c  ∀ξ ∈ R^r,  (8.132)

where c > 0 is a constant, and

P̃(ξ) := ( Σ_{|j|≥0} |P^{(j)}(ξ)|² )^{1/2}.  (8.133)
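For instance (an illustrative sketch in one variable, not from the text), Q(ξ) = ξ² is stronger than P(ξ) = ξ in the sense of (8.131)-(8.133): here P̃(ξ) = (ξ² + 1)^{1/2}, Q̃(ξ) = (ξ⁴ + 4ξ² + 4)^{1/2} = ξ² + 2, and the ratio is bounded, with maximum 1/2 at ξ = 0:

```python
import numpy as np

xi = np.linspace(-1e4, 1e4, 2_000_001)
P_t = np.sqrt(xi**2 + 1)              # P~ for P(xi) = xi   (derivative: 1)
Q_t = np.sqrt(xi**4 + 4*xi**2 + 4)    # Q~ for Q(xi) = xi^2 (derivatives: 2*xi, 2)
ratio = P_t / Q_t
print(ratio.max())                    # bounded, so P is weaker than Q per (8.132)
```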


In the following lemma a characterization of elliptic polynomials is given. A homogeneous polynomial Q(ξ) of degree s is called elliptic if

Q(ξ) ≠ 0  for ξ ∈ R^r \ 0.  (8.134)

Lemma 8.5 A homogeneous polynomial Q(ξ) of degree s is elliptic if and only if it is stronger than every polynomial of degree ≤ s. In particular,

c|ξ|^s ≤ |Q(ξ)|,  (8.135)

where c = const > 0. A proof of this result can be found in [Hörmander (1983-85), vol. II, p. 37].

Lemma 8.6 Assume that

Lu = L_0 u + Σ_{|j|<s} a_j(x) D^j u,  (8.136)

L_0 u = Σ_{|j|=s} a_j D^j u,  (8.137)

where

L_0(ξ) := Σ_{|j|=s} a_j ξ^j is an elliptic polynomial,  (8.138)

and

sup_{x∈R^r} |a_j(x)| ≤ c,  |j| < s.  (8.139)

Then L is essentially selfadjoint on C_0^∞(R^r) and L̄ = L_m is selfadjoint on H^s(R^r).

Proof. By Lemma 8.2, L_0 defined on C_0^∞(R^r) is essentially selfadjoint. If (8.138) holds then inequality (8.135) holds. Therefore the closure of L_0, the operator L_{0m}, is selfadjoint and

Dom L_{0m} = H^s(R^r).  (8.140)

Let us apply Lemma 8.3 with A = L_{0m} and B = L − L_0. The basic condition to check is inequality (8.103) on H^s(R^r). It is sufficient to check this inequality on C_0^∞(R^r), since C_0^∞(R^r) is dense in H^s(R^r). If (8.103) is established for any u ∈ C_0^∞(R^r), then one takes u ∈ H^s(R^r) and a sequence


u_n ∈ C_0^∞(R^r) such that ‖u_n − u‖_{H^s(R^r)} → 0, n → ∞, and passes to the limit n → ∞ in (8.103). This yields inequality (8.103) for any u ∈ H^s(R^r).

In order to check inequality (8.103) it is sufficient to prove that

‖a_j(x) D^j u‖ ≤ ε‖L_0 u‖ + c(ε)‖u‖  ∀u ∈ C_0^∞(R^r),  (8.141)

for any ε > 0, however small, and any j such that |j| < s. Using (8.139) and Parseval's equality one obtains:

‖a_j(x) D^j u‖ ≤ c‖D^j u‖ = c‖ξ^j û‖.  (8.142)

On the other hand, Parseval's equality, condition (8.138) and inequality (8.135) yield

‖L_0 u‖ = ‖L_0(ξ)û‖ ≥ c‖|ξ|^s û‖.  (8.143)

If |j| < s then

|ξ^j| ≤ ε|ξ|^s,  |ξ| > R = R(ε).  (8.144)

In the region |ξ| ≤ R one estimates

|ξ^j û| ≤ c(R)|û|,  (8.145)

where, for example, one can take c(R) = R^{|j|}. Therefore

∫_{|ξ|≤R} |ξ^j û|² dξ ≤ c²(R)‖û‖² = c²(R)‖u‖²,  (8.146)

and

∫_{|ξ|>R} |ξ^j û|² dξ ≤ ε² ∫ |L_0(ξ)û|² dξ = ε²‖L_0 u‖²,  (8.147)

where Parseval's equality and estimates (8.144) and (8.145) were used. From (8.142), (8.146) and (8.147) one obtains the desired inequality (8.141). Lemma 8.6 is proved. □

Remark 8.1 Note that the method of the proof allows one to relax condition (8.139). For example, one could use some integral inequalities to estimate the Fourier transform.

Let us now prove the following result.

Lemma 8.7 If $\mathcal L$ defined by (8.82) has smooth coefficients, uniformly bounded in $\mathbb R^r$, such that the principal part of $\mathcal L$ is an elliptic operator of uniformly constant strength, then $\mathcal L$ is essentially selfadjoint on $C_0^\infty(\mathbb R^r)$ and $\mathcal L_m$ is selfadjoint on $H^s(\mathbb R^r)$.

Proof. Let us recall that the principal part $\mathcal L_0$ of $\mathcal L$,
$$\mathcal L_0u=\sum_{|j|=s}a_j(x)D^ju, \tag{8.148}$$
is an elliptic operator of uniformly constant strength if the principal symbol

$$a_s(x,\xi):=\sum_{|j|=s}a_j(x)\xi^j$$
satisfies the ellipticity condition (8.83) and the condition of uniformly constant strength
$$\frac{\tilde a_s(x,\xi)}{\tilde a_s(y,\xi)}\le c\qquad\forall x,y\in\mathbb R^r, \tag{8.149}$$
where $c$ does not depend on $x$, $y$, and

$$\tilde a_s(x,\xi):=\Big(\sum_{|j|\ge 0}|D_\xi^ja_s(x,\xi)|^2\Big)^{1/2}. \tag{8.150}$$
By Lemma 8.4 the operator $\mathcal L_m$ is selfadjoint on $H^s(\mathbb R^r)$ if the equations
$$(\mathcal L_m\pm i\lambda)u=f \tag{8.151}$$
are solvable in $H^s(\mathbb R^r)$ for any $f\in C_0^\infty(\mathbb R^r)$ and for some $\lambda>0$. Indeed, in this case $\operatorname{Ran}(\mathcal L_m\pm i\lambda)\supset C_0^\infty(\mathbb R^r)$ and therefore is dense in $H=L^2(\mathbb R^r)$. Since $\mathcal L_m$ is symmetric, $\operatorname{Ran}(\mathcal L_m\pm i\lambda)$ is closed in $L^2(\mathbb R^r)$ and, being dense in $L^2(\mathbb R^r)$, has to coincide with $L^2(\mathbb R^r)$. This implies, by Lemma 8.4, that $\mathcal L_m$ is selfadjoint.

Existence of the solution to (8.151) in $H^s(\mathbb R^r)$ for any $f\in C_0^\infty(\mathbb R^r)$ follows from the existence of the fundamental solution $\mathcal E(x,y,\lambda)$:
$$(\mathcal L_m\pm i\lambda)\mathcal E(x,y,\lambda)=\delta(x-y)\quad\text{in }\mathbb R^r, \tag{8.152}$$
and the estimate

$$|\mathcal E(x,y,\lambda)|\le\begin{cases}c|x-y|^{s-r},&\text{if }r\text{ is odd or }r>s,\ |x-y|\le 1,\\ c|x-y|^{s-r}+c_1|\log|x-y||,&\text{if }r\text{ is even and }r\le s,\ |x-y|\le 1,\end{cases} \tag{8.153}$$
where $c$ and $c_1$ are positive constants, and

$$|\mathcal E(x,y,\lambda)|\le c\exp(-a(\lambda)|x-y|)\quad\text{if }|x-y|\ge 1, \tag{8.154}$$
where $c>0$ is a constant and $a(\lambda)>0$ is a constant, $a(\lambda)\to+\infty$ as $\lambda\to+\infty$. Also $\mathcal E(x,y,\lambda)$ is smooth away from the diagonal $x=y$, and the following estimates hold:

$$|D^j\mathcal E(x,y,\lambda)|\le c\exp(-a(\lambda)|x-y|)\quad\text{if }|x-y|\ge 1, \tag{8.155}$$

$$|D^j\mathcal E(x,y,\lambda)|\le\begin{cases}c_0+c_1|x-y|^{s-r-|j|}&\text{if }s\ne r+|j|,\ |x-y|\le 1,\\ c_0+c_1|\log|x-y||&\text{if }s=r+|j|,\ |x-y|\le 1.\end{cases} \tag{8.156}$$
Indeed, if there exists the fundamental solution with the properties (8.152)-(8.156) then

$$u=\int_{\mathbb R^r}\mathcal E(x,y,\lambda)f(y)\,dy \tag{8.157}$$
solves (8.151) and $u\in H^s(\mathbb R^r)$, so that $\mathcal L_m$ is selfadjoint.

Existence of the fundamental solution with the properties (8.152)-(8.156) for elliptic selfadjoint operators with constant coefficients can be established if one uses the Fourier transform [Hörmander (1983-85), vol. I, p. 170], and for operators of uniformly constant strength it is established in [Hörmander (1983-85), vol. II, p. 196]. Thus, Lemma 8.7 is proved. $\Box$

It is not difficult now to establish that the operator $(\mathcal L-i\lambda)^{-N}:=\phi(\mathcal L)$, $\lambda>0$, is a Carleman operator if $N>\frac{r}{2s}$. Indeed, it follows from the estimate (8.156) that the singularity of the kernel of the operator $\phi(\mathcal L)$ is $O(|x-y|^{Ns-r})$, so that this kernel is locally in $L^2$ if $N>\frac{r}{2s}$. On the other hand, the estimate (8.154) implies that the kernel of $\phi(\mathcal L)$ is in $L^2(\mathbb R^r)$ globally. Since the constants in the inequalities (8.154) and (8.156) do not depend on $x$, one concludes that $\phi(\mathcal L)$ is a Carleman operator. Let us formulate this result as Lemma 8.8.

Lemma 8.8 Suppose that $N>\frac{r}{2s}$ and the assumptions of Lemma 8.7 hold. Then the operator $(\mathcal L-i\lambda)^{-N}$, $\lambda>0$, is a Carleman operator.


8.3 Asymptotics of the spectrum of linear operators

In this section we develop an abstract theory of perturbations preserving asymptotics of the spectrum of linear operators. As a by-product a proof of Theorem 2.3 is obtained.

8.3.1 Compact operators

8.3.1.1 Basic definitions

Let $H$ be a separable Hilbert space and $A:H\to H$ be a linear operator. The set of all bounded operators we denote $L(H)$. The set of all linear compact operators on $H$ we denote $\sigma_\infty$. Let us recall that an operator $A$ is called compact if it maps bounded sets into relatively compact sets. It is well known that $A$ is compact if and only if one of the following conditions holds:

1) $f_n\rightharpoonup f$, $g_n\rightharpoonup g$ implies $(Af_n,g_n)\to(Af,g)$;
2) $f_n\rightharpoonup f$ implies $Af_n\to Af$;
3) from any bounded sequence of elements of $H$ one can select a subsequence $f_n$ such that

$$(Af_{nm},f_{nm})\ \text{converges as }n,m\to\infty,$$
where

$$f_{nm}:=f_n-f_m.$$
By $\rightharpoonup$ we denote weak convergence and $\to$ stands for convergence in the norm of $H$ (strong convergence).

If $A$ is compact and $B$ is a bounded linear operator then $AB$ and $BA$ are compact. A linear combination of compact operators is compact. If $A_n\in\sigma_\infty$ and $\|A_n-A\|\to 0$ as $n\to\infty$, then $A\in\sigma_\infty$. The operator $A$ is compact if and only if $A^*$ is compact. In this section we denote the adjoint operator by $A^*$. If $H$ is a separable Hilbert space and $A$ is compact then there exists a sequence $A_n$ of finite rank operators such that $\|A_n-A\|\to 0$. An operator $B$ is called a finite rank operator if $\operatorname{rank}B:=\dim\operatorname{Ran}B<\infty$. If $A$ is compact then $A^*A\ge 0$ is compact and selfadjoint. The spectrum of a selfadjoint compact operator is discrete, the eigenvalues $\lambda_n(A^*A)$ are nonnegative and have at most one limit point $\lambda=0$. We define the singular values of a compact operator $A$ ($s$-values of $A$) by the equation

$$s_j(A)=\lambda_j^{1/2}(A^*A). \tag{8.158}$$
One has

$$s_1(A)\ge s_2(A)\ge\cdots\ge 0. \tag{8.159}$$
Note that

$$s_1(A)=\|A\| \tag{8.160}$$

and if A = A∗ then

$$s_j(A)=|\lambda_j(A)|. \tag{8.161}$$
The following properties of the $s$-values are known:

$$s_j(A)=s_j(A^*), \tag{8.162}$$
$$s_j(BA)\le\|B\|\,s_j(A), \tag{8.163}$$
$$s_j(AB)\le\|B\|\,s_j(A), \tag{8.164}$$
for any bounded linear operator $B$. Obviously

$$s_j(cA)=|c|\,s_j(A),\qquad c=\text{const}. \tag{8.165}$$
Any bounded linear operator $A$ can be represented as

$$A=U|A|, \tag{8.166}$$
where

$$|A|:=(A^*A)^{1/2} \tag{8.167}$$

and $U$ is a partial isometry which maps $\operatorname{Ran}(A^*)$ onto $\operatorname{Ran}A$. Representation (8.166) is called the polar representation of $A$. The operator $|A|$ is selfadjoint. If $A$ is compact then $|A|$ is compact. Let $\phi_j$ be its eigenvectors and $s_j=s_j(A)$ be its eigenvalues:

$$|A|\phi_j=s_j\phi_j,\qquad(\phi_j,\phi_m)=\delta_{jm}. \tag{8.168}$$
Then

$$|A|=\sum_{j=1}^\infty s_j(A)(\cdot,\phi_j)\phi_j, \tag{8.169}$$
where the series (8.169) converges to $|A|$ in the norm of operators:
$$\Big\||A|-\sum_{j=1}^n s_j(A)(\cdot,\phi_j)\phi_j\Big\|\to 0\quad\text{as }n\to\infty. \tag{8.170}$$
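In finite dimensions the objects in (8.158)-(8.170) can be checked directly. The following NumPy sketch (the matrix, its size, and the random seed are illustrative choices, not data from the text) verifies that the $s$-values are the square roots of the eigenvalues of $A^*A$, that $s_1(A)=\|A\|$ and $s_j(A)=s_j(A^*)$, and that the truncated expansion converges in operator norm:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))

# (8.158): s_j(A) = lambda_j^{1/2}(A*A), eigenvalues in decreasing order
s_from_eig = np.sqrt(np.sort(np.linalg.eigvalsh(A.conj().T @ A))[::-1])
s = np.linalg.svd(A, compute_uv=False)
assert np.allclose(s_from_eig, s)

# (8.160): s_1(A) = ||A||, and (8.162): s_j(A) = s_j(A*)
assert np.isclose(s[0], np.linalg.norm(A, 2))
assert np.allclose(s, np.linalg.svd(A.conj().T, compute_uv=False))

# (8.169)-(8.170): the n-term truncation approximates in operator norm,
# with error s_{n+1}(A) (the best-rank-n property, cf. Lemma 8.11 below)
U, sv, Vh = np.linalg.svd(A)
for n in range(1, 6):
    A_n = U[:, :n] @ np.diag(sv[:n]) @ Vh[:n, :]
    assert np.isclose(np.linalg.norm(A - A_n, 2), sv[n])
```

The matrix decomposition used here is the singular value decomposition, which is exactly the finite-dimensional form of the canonical representation (8.172) below.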

Let

ψj := Uφj. (8.171)

Then

$$A=\sum_{j=1}^\infty s_j(\cdot,\phi_j)\psi_j. \tag{8.172}$$
Formula (8.172) is the canonical representation of a compact operator $A$. Note that

(ψj, ψm) = δjm

since $U$ is a partial isometry. It follows from (8.172) that $A$ is a limit in the norm of operators of the finite rank operators $\sum_{j=1}^n s_j(\cdot,\phi_j)\psi_j$. Moreover

$$A^*=\sum_{j=1}^\infty s_j(A)(\cdot,\psi_j)\phi_j. \tag{8.173}$$
In the formulas (8.172) and (8.173) the summation is actually taken over all $j$ for which $s_j(A)\ne 0$. If $\operatorname{rank}A<\infty$, then $s_j(A)=0$ for $j>\operatorname{rank}A$. If $A$ is compact and normal, that is $A^*A=AA^*$, then its eigenvectors form an orthonormal basis of $H$,

$$A=\sum_{j=1}^\infty\lambda_j(A)(\cdot,\phi_j)\phi_j,\qquad A\phi_j=\lambda_j(A)\phi_j, \tag{8.174}$$
and

$$s_j(A)=|\lambda_j(A)|. \tag{8.175}$$

8.3.1.2 Minimax principles and estimates of eigenvalues and singular values

Lemma 8.9 Let $A$ be a selfadjoint and compact operator on $H$. Let

$$\lambda_1^+\ge\lambda_2^+\ge\cdots \tag{8.176}$$
be its positive eigenvalues counted according to their multiplicities and $\phi_j$ the corresponding eigenvectors,
$$A\phi_j=\lambda_j^+\phi_j. \tag{8.177}$$
Then
$$\lambda_{j+1}^+=\min_{L_j}\max_{\phi\perp L_j}\frac{(A\phi,\phi)}{(\phi,\phi)}, \tag{8.178}$$
where $L_n\subset H$ is an $n$-dimensional subspace. The maximum in (8.178) is attained on the subspace
$$\mathcal L_j(A):=\operatorname{span}\{\phi_1,\dots,\phi_j\} \tag{8.179}$$
spanned by the first $j$ eigenvectors of $A$ corresponding to the positive eigenvalues.

Remark 8.2 The maximum may be attained not only on the subspace (8.179). The sign $\phi\perp L$ means that $\phi$ is orthogonal to the subspace $L$.

Lemma 8.10 If $A\in\sigma_\infty$ then
$$s_{j+1}(A)=\min_{L_j}\max_{\phi\perp L_j}\frac{\|A\phi\|}{\|\phi\|}. \tag{8.180}$$
Lemma 8.10 follows immediately from Lemma 8.9 and from the definition of the $s$-values given by formula (8.158).

Lemma 8.11 If $A\in\sigma_\infty$ then
$$s_{j+1}(A)=\min_{K\in\mathcal K_j}\|A-K\|, \tag{8.181}$$
where $\mathcal K_j$ is the set of operators of rank $\le j$. The following inequalities for eigenvalues and singular values of compact operators are known.

Lemma 8.12 If $A$ and $B$ are selfadjoint compact operators and $A\ge B$, that is,
$$(A\phi,\phi)\ge(B\phi,\phi)\qquad\forall\phi\in H, \tag{8.182}$$
then
$$\lambda_j^+(A)\ge\lambda_j^+(B). \tag{8.183}$$


Lemma 8.13 If A and B are selfadjoint and compact operators, then

$$\lambda_{m+n-1}^+(A+B)\le\lambda_m^+(A)+\lambda_n^+(B), \tag{8.184}$$
and
$$\lambda_{m+n-1}^-(A+B)\ge\lambda_m^-(A)+\lambda_n^-(B). \tag{8.185}$$
Moreover
$$|\lambda_j^+(A)-\lambda_j^+(A+B)|\le\|B\| \tag{8.186}$$
and
$$|\lambda_j^-(A)-\lambda_j^-(A+B)|\le\|B\|, \tag{8.187}$$
where $\lambda_1^-(A)\le\lambda_2^-(A)\le\cdots<0$ are the negative eigenvalues of a selfadjoint compact operator $A$ counted according to their multiplicities.

Lemma 8.14 If A is compact and B is a finite rank operator, rank B = ν, then

$$s_{j+\nu}(A)\le s_j(A+B)\le s_{j-\nu}(A). \tag{8.188}$$
Lemma 8.15 If $A,B\in\sigma_\infty$ then
$$s_{m+n-1}(A+B)\le s_m(A)+s_n(B), \tag{8.189}$$
$$s_{m+n-1}(AB)\le s_m(A)s_n(B),$$
$$|s_n(A)-s_n(B)|\le\|A-B\|. \tag{8.190}$$
Lemma 8.16 If $A\in\sigma_\infty$ then
$$\prod_{j=1}^n|\lambda_j(A)|\le\prod_{j=1}^n s_j(A). \tag{8.191}$$
Lemma 8.17 If $A,B\in\sigma_\infty$ and $f(x)$, $0\le x<\infty$, is a real-valued nondecreasing, convex, and continuous function vanishing at $x=0$, then
$$\sum_{j=1}^n f(s_j(A+B))\le\sum_{j=1}^n f(s_j(A)+s_j(B)) \tag{8.192}$$
for all $n=1,2,\dots,\infty$.


In particular, if f(x) = x, one obtains

$$\sum_{j=1}^n s_j(A+B)\le\sum_{j=1}^n s_j(A)+\sum_{j=1}^n s_j(B) \tag{8.193}$$
for all $n=1,2,\dots,\infty$. If $f(x)$, $f(0)=0$, $0\le x<\infty$, is such that the function $\phi(t):=f(\exp(t))$ is convex, $-\infty<t<\infty$, then
$$\sum_{j=1}^n f(s_j(AB))\le\sum_{j=1}^n f(s_j(A)s_j(B)) \tag{8.194}$$
for all $n=1,2,\dots,\infty$. In particular, if $f(x)=x$, then
$$\sum_{j=1}^n s_j(AB)\le\sum_{j=1}^n s_j(A)s_j(B) \tag{8.195}$$
for all $n=1,2,\dots,\infty$.

Lemma 8.18 Let $A,B\in\sigma_\infty$ and
$$\lim_{n\to\infty}n^as_n(A)=c, \tag{8.196}$$
where $a>0$ and $c=\text{const}>0$. Assume that
$$\lim_{n\to\infty}n^as_n(B)=0. \tag{8.197}$$
Then
$$\lim_{n\to\infty}n^as_n(A+B)=c.$$
Proofs of the above results can be found in [GK].
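Several of the inequalities above are easy to probe numerically in finite dimensions. The sketch below (matrix sizes and the seed are arbitrary illustrative choices) checks (8.189) and its product analogue, the Lipschitz property (8.190), and the partial-sum inequalities (8.193) and (8.195) for a random pair of matrices:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 8))
B = rng.standard_normal((8, 8))
sA = np.linalg.svd(A, compute_uv=False)
sB = np.linalg.svd(B, compute_uv=False)
sS = np.linalg.svd(A + B, compute_uv=False)   # s-values of the sum
sP = np.linalg.svd(A @ B, compute_uv=False)   # s-values of the product

# (8.189) and its product analogue, with 1-based indices m, n
for m in range(1, 9):
    for n in range(1, 10 - m):
        assert sS[m + n - 2] <= sA[m - 1] + sB[n - 1] + 1e-12
        assert sP[m + n - 2] <= sA[m - 1] * sB[n - 1] + 1e-9

# (8.190): |s_n(A) - s_n(B)| <= ||A - B||
assert np.all(np.abs(sA - sB) <= np.linalg.norm(A - B, 2) + 1e-12)

# (8.193) and (8.195): the partial-sum inequalities (case f(x) = x)
for n in range(1, 9):
    assert sS[:n].sum() <= sA[:n].sum() + sB[:n].sum() + 1e-9
    assert sP[:n].sum() <= (sA[:n] * sB[:n]).sum() + 1e-9
```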

8.3.2 Perturbations preserving asymptotics of the spectrum of compact operators

8.3.2.1 Statement of the problem

Here we are interested in the following question. Suppose that $A$ and $Q$ are linear compact operators on $H$, and $B$ is defined by the formula
$$B=A(I+Q). \tag{8.198}$$


Question 1: Under what assumptions are the singular values of $B$ asymptotically equivalent to the singular values of $A$ in the following sense:
$$\lim_{n\to\infty}\frac{s_n(B)}{s_n(A)}=1. \tag{8.199}$$
Assume now that
$$s_n(A)=cn^{-p}\big(1+O(n^{-p_1})\big),\qquad n\to\infty, \tag{8.200}$$
where $p$ and $p_1$ are positive numbers, and $c>0$ is a constant.

Question 2: Under what assumptions is the asymptotics of the singular values of B given by the formula

$$s_n(B)=cn^{-p}\big(1+O(n^{-q})\big),\qquad n\to\infty\,? \tag{8.201}$$
When is $q=p_1$?

We will answer these questions and give some applications of the results.
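Before answering, a finite-dimensional sketch may make Question 1 concrete. Here $A$ is diagonal with $s_n(A)=n^{-p}$ and $Q$ is a small, quickly decaying diagonal perturbation playing the role of a compact operator with $N(I+Q)=\{0\}$; all numerical values are illustrative choices, not data from the text:

```python
import numpy as np

N, p = 400, 1.0
n = np.arange(1, N + 1)
A = np.diag(n ** (-p))                 # s_n(A) = n^{-p}
Q = np.diag(0.5 * n ** (-2.0))         # ||Q|| = 0.5 < 1, so N(I+Q) = {0}
B = A @ (np.eye(N) + Q)                # B = A(I + Q), as in (8.198)

sA = np.linalg.svd(A, compute_uv=False)
sB = np.linalg.svd(B, compute_uv=False)
ratio = sB / sA                        # here ratio_n = 1 + 0.5 n^{-2}

# the ratio approaches 1 along the spectrum, as asked in (8.199)
assert np.all(ratio >= 1.0 - 1e-12)
assert abs(ratio[199] - 1) < abs(ratio[9] - 1)
assert abs(ratio[-1] - 1) < 1e-3
```

In this diagonal example the remainder $s_n(B)/s_n(A)-1$ decays like $n^{-2}$, faster than required; Theorem 8.4 below quantifies what survives in general.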

8.3.2.2 A characterization of the class of linear compact operators

We start with a theorem which gives a characterization of the class of linear compact operators on $H$. In order to formulate this theorem let us introduce the notion of a limit dense sequence of subspaces. Let
$$L_n\subset L_{n+1}\subset\cdots,\qquad\dim L_n=n, \tag{8.202}$$
be a sequence of finite-dimensional subspaces of $H$ such that
$$\rho(f,L_n)\to 0\ \text{as }n\to\infty\ \text{for any }f\in H, \tag{8.203}$$
where $\rho(f,L)$ is the distance from $f$ to the subspace $L$.

Definition 8.2 A sequence of subspaces $L_n$ is called limit dense in $H$ if the conditions (8.202) and (8.203) hold.

Theorem 8.2 A linear operator $A:H\to H$ is compact if and only if there exists a limit dense in $H$ sequence of subspaces $L_n$ such that
$$\sup_{h\perp L_n}\frac{\|Ah\|}{\|h\|}\to 0\quad\text{as }n\to\infty,\quad h\in H. \tag{8.204}$$


If (8.204) holds for a limit dense in H sequence Ln then it holds for every limit dense in H sequence of subspaces.

Proof. Sufficiency. Assume that Ln is a limit dense in H sequence of subspaces and condition (8.204) holds. We wish to prove that A is compact. Let Pn denote the orthoprojector in H onto Ln. Condition (8.204) can be written as

$$\gamma_n:=\sup_{\|h\|=1,\ h\perp L_n}\|Ah\|\to 0\quad\text{as }n\to\infty. \tag{8.205}$$
Therefore
$$\|A-AP_n\|=\sup_{\|h\|\le 1}\|Ah-AP_nh\|=\sup_{\|h\|\le 1}\|A(I-P_n)h\|=\sup_{\substack{g=(I-P_n)h\\ \|g\|\le 1}}\|Ag\|\le\sup_{g\perp L_n,\ \|g\|\le 1}\|Ag\|=\gamma_n\to 0. \tag{8.206}$$
Therefore $A$ is the norm limit of the sequence of operators $AP_n$. The operator $AP_n$ is of finite rank $\le n$. Therefore $A$ is compact. Note that in the sufficiency part of the argument the assumption that the sequence $L_n$ is limit dense in $H$ is not used. In fact, if condition (8.204) holds for any sequence of subspaces $L_n\subset L_{n+1}$ then $A$ is compact, as we have proved above.

Necessity. Assume now that $A$ is compact and $\{L_n\}$ is a limit dense in $H$ sequence of subspaces. We wish to derive (8.204). We have
$$\sup_{h\perp L_n,\ \|h\|=1}\|Ah\|=\sup_{P_nh=0,\ \|h\|=1}\|Ah-AP_nh\|\le\sup_{\|h\|=1}\|A(I-P_n)h\|=\|A(I-P_n)\|\to 0\ \text{as }n\to\infty. \tag{8.207}$$
The last conclusion follows from the well known result which is formulated as Proposition 8.1.

Proposition 8.1 If $A$ is compact and the selfadjoint orthoprojections $P_n$ converge strongly to the identity operator $I$, then

$$\|A(I-P_n)\|\to 0\quad\text{as }n\to\infty. \tag{8.208}$$
Note that $P_n\to I$ strongly if and only if the sequence $L_n$ is limit dense in $H$.

Let us prove Proposition 8.1. Let $A$ be compact and $B_n^*=B_n\to 0$ strongly. In our case $B_n=I-P_n$. Represent $A=K+F_\epsilon$, where $K$ is a finite rank operator and $\|F_\epsilon\|<\epsilon$. Then
$$\|AB_n\|\le\|F_\epsilon B_n\|+\|KB_n\|\le c\epsilon+\|KB_n\|. \tag{8.209}$$
Here $c\ge\|B_n\|$ does not depend on $n$ and $\epsilon$. Choose $n$ sufficiently large. Then
$$\|KB_n\|<\epsilon, \tag{8.210}$$
since $B_n\to 0$ strongly and $K$ is a finite rank operator. Indeed,
$$KB_nh=\sum_{j=1}^m s_j(B_nh,\phi_j)\psi_j, \tag{8.211}$$
where
$$K:=\sum_{j=1}^m s_j(\cdot,\phi_j)\psi_j,\qquad s_j=\text{const}. \tag{8.212}$$
It is known that $B_n\to 0$ strongly does not imply $B_n^*\to 0$ strongly, in general. But since we have assumed that $B_n=B_n^*$, we have
$$\|KB_nh\|\le\sum_{j=1}^m|s_j|\,\|\psi_j\|\,\|h\|\,\|B_n\phi_j\|\le\epsilon(n)\|h\|, \tag{8.213}$$
where $\epsilon(n)\to 0$ as $n\to\infty$ because
$$\|B_n\phi_j\|\to 0\ \text{as }n\to\infty,\quad 1\le j\le m. \tag{8.214}$$
Proposition 8.1 and Theorem 8.2 are proved. $\Box$

Note that in the proof of the necessity part the assumption that the sequence $L_n$ is limit dense in $H$ plays the crucial role: it allows one to claim that $P_n\to I$ strongly. If the sequence $L_n$ is not limit dense then there may exist a fixed vector $h\ne 0$ such that $\|Ah\|>0$ and $h\perp\bigcup_{n=1}^\infty L_n$. In this case condition (8.204) does not hold. This is the case, for example, if $h$ is the first eigenvector of a selfadjoint compact operator $A$, and $L_n:=\operatorname{span}\{\phi_2,\dots,\phi_{n+1}\}$, where $A\phi_j=\lambda_j\phi_j$.
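The role of the limit dense assumption can be seen in coordinates. In the sketch below (an illustrative truncation, not from the text) $A$ is the diagonal compact operator with entries $1/j$; for a diagonal operator the supremum in (8.205) over unit vectors orthogonal to $L_n$ is just the largest diagonal entry whose coordinate direction is still available:

```python
import numpy as np

N = 500
d = 1.0 / np.arange(1, N + 1)          # eigenvalues of the compact operator A

def gamma(n, skip_first):
    # gamma_n = sup { ||A h|| : ||h|| = 1, h orthogonal to L_n } for
    # L_n = span(e_1,...,e_n)        (limit dense case), or
    # L_n = span(e_2,...,e_{n+1})    (e_1 is missed forever)
    if skip_first:
        idx = [0] + list(range(n + 1, N))
    else:
        idx = list(range(n, N))
    return d[idx].max()

# limit dense subspaces: gamma_n = 1/(n+1) -> 0, so (8.204)-(8.205) hold
assert gamma(10, False) == d[10]
assert gamma(100, False) < gamma(10, False)

# subspaces that never capture e_1: h = e_1 stays orthogonal to every L_n
# and ||A e_1|| = 1, so gamma_n does not decay and (8.204) fails
assert gamma(10, True) == 1.0 and gamma(100, True) == 1.0
```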

8.3.2.3 Asymptotic equivalence of s-values of two operators

We are now ready to answer Question 1. Recall that $N(A):=\{u:\ Au=0\}$.


Theorem 8.3 Assume that $A,Q\in\sigma_\infty$, $N(I+Q)=\{0\}$ and $\operatorname{rank}A=\infty$. Then
$$\lim_{n\to\infty}\frac{s_n\{A(I+Q)\}}{s_n(A)}=1 \tag{8.215}$$
and
$$\lim_{n\to\infty}\frac{s_n\{(I+Q)A\}}{s_n(A)}=1. \tag{8.216}$$
Proof. By the minimax principle for singular values one has
$$s_{n+1}\{A(I+Q)\}=\min_{L_n}\max_{\phi\perp L_n}\frac{\|A(I+Q)\phi\|}{\|\phi\|}=\min_{L_n}\max_{\phi\perp L_n}\Big(\frac{\|A(I+Q)\phi\|}{\|(I+Q)\phi\|}\cdot\frac{\|(I+Q)\phi\|}{\|\phi\|}\Big)\le\max_{\phi\perp M_n}\frac{\|A(I+Q)\phi\|}{\|(I+Q)\phi\|}\Big(1+\max_{\phi\perp M_n}\frac{\|Q\phi\|}{\|\phi\|}\Big)=s_{n+1}(A)(1+\epsilon_n), \tag{8.217}$$
where
$$\epsilon_n:=\max_{\phi\perp M_n}\frac{\|Q\phi\|}{\|\phi\|}\to 0\quad\text{as }n\to\infty. \tag{8.218}$$
Here $M_n$ is so chosen that the condition $\phi\perp M_n$ is equivalent to the condition $(I+Q)\phi\perp\mathcal L_n(A)$, where $\mathcal L_n(A)$ is the linear span of the first $n$ eigenvectors of the operator $(A^*A)^{1/2}$:
$$M_n:=(I+Q^*)\mathcal L_n(A). \tag{8.219}$$
Since $N(I+Q)=\{0\}$ and $Q$ is compact, the operator $I+Q$ is an isomorphism of $H$ onto $H$ and so is $I+Q^*$. Therefore the limit dense in $H$ sequence of the subspaces $\mathcal L_n(A)$ is mapped by the operator $I+Q^*$ onto a limit dense in $H$ sequence of the subspaces $M_n$. Indeed, suppose that $f\perp M_n\ \forall n$, that is,
$$(f,(I+Q^*)\phi_j)=0\quad\forall j, \tag{8.220}$$
where $\{\phi_j\}$ is the set of all eigenvectors of the operator $(A^*A)^{1/2}$, including the eigenvectors corresponding to the eigenvalue $\lambda=0$ if zero is an eigenvalue of $(A^*A)^{1/2}$. Then
$$((I+Q)f,\phi_j)=0\quad\forall j. \tag{8.221}$$
Since the set of all the eigenvectors of the operator $(A^*A)^{1/2}$ is complete in $H$, we conclude that

(I + Q)f = 0. (8.222)

This implies that $f=0$ since $I+Q$ is an isomorphism. The fact that $\epsilon_n\to 0$ follows from the compactness of $Q$ and Theorem 8.2. Let $B=A(I+Q)$. Since $(I+Q)^{-1}=I+Q_1$, where $Q_1:=-Q(I+Q)^{-1}$ is a compact operator, one has $A=B(I+Q_1)$. Therefore one obtains as above the inequality:

$$s_{n+1}(A)\le s_{n+1}(B)(1+\delta_n),\qquad\delta_n\to 0\ \text{as }n\to\infty. \tag{8.223}$$
From (8.217) and (8.223) equation (8.215) follows. The proof of (8.216) reduces to (8.215) if one uses property (8.162) of $s$-values. Theorem 8.3 is proved. $\Box$

The result given in Theorem 8.3 is optimal in some sense. Namely, if $Q$ is not compact but, for example, an operator with small norm, then the conclusion of Theorem 8.3 does not hold in general (take, for instance, $Q=\epsilon I$, where $I$ is the identity operator). The assumption $\operatorname{rank}A=\infty$ is necessary, since if $\operatorname{rank}A<\infty$ one has only a finite number of nonzero singular values. The assumption $N(I+Q)=\{0\}$ is often easy to verify and it is natural. It could be dropped if one assumes a rate of decay of $s_n(A)$,
$$s_n(A)\sim cn^{-p},\qquad p>0,$$
but we do not go into detail.

8.3.2.4 Estimate of the remainder

Let us now answer the second question.

Theorem 8.4 Assume that $A$ and $Q$ are linear compact operators on $H$, $N(I+Q)=\{0\}$, $B:=A(I+Q)$,
$$s_n(A)=cn^{-p}\big(1+O(n^{-p_1})\big)\quad\text{as }n\to\infty, \tag{8.224}$$


where p, p1 and c are positive numbers, and

$$\|Qf\|\le c\|Af\|^a\|f\|^{1-a},\qquad a>0. \tag{8.225}$$
Then

$$s_n(B)=cn^{-p}\big(1+O(n^{-q})\big), \tag{8.226}$$

where
$$q:=\min\Big(p_1,\ \frac{pa}{1+pa}\Big). \tag{8.227}$$
In particular,
$$\text{if }\ \frac{pa}{1+pa}>p_1\ \text{ then }\ q=p_1, \tag{8.228}$$
and therefore not only the main term of the asymptotics of $s_n(A)$ is preserved but the order of the remainder as well.

Remark 8.3 The estimate (8.227) of the remainder in (8.226) is sharp in the sense that it is attained for some $Q$.

Proof. Let $n$ and $m$ be integers. It follows from (8.180) that
$$s_{n+m+1}(B)=\min_{L_{n+m}}\max_{h\perp L_{n+m}}\frac{\|Bh\|}{\|h\|}\le\max_{h\perp M_n}\frac{\|A(I+Q)h\|}{\|(I+Q)h\|}\cdot\Big(1+\max_{h\perp\mathcal L_m(A)}\frac{\|Qh\|}{\|h\|}\Big). \tag{8.229}$$
Here, as in the proof of Theorem 8.3, $M_n$ is defined by formula (8.219), and $\mathcal L_m(A)$ is the linear span of the first $m$ eigenvectors of the operator $(A^*A)^{1/2}$. This means that we have chosen $L_{n+m}$ to be the direct sum of the subspaces $M_n+\mathcal L_m(A)$. Since the sequence $\mathcal L_m(A)$ is limit dense in $H$ one can use Theorem 8.2 and conclude from (8.229) that
$$s_{n+m+1}(B)\le s_{n+1}(A)(1+\epsilon_m),\qquad\epsilon_m\to 0\ \text{as }m\to\infty, \tag{8.230}$$
and
$$\epsilon_m=\max_{h\perp\mathcal L_m(A)}\frac{\|Qh\|}{\|h\|}\le c\max_{h\perp\mathcal L_m(A)}\frac{\|Ah\|^a}{\|h\|^a}=c\,s_{m+1}^a(A). \tag{8.231}$$
Therefore
$$s_{n+m+1}(B)\le s_{n+1}(A)\big(1+c\,s_{m+1}^a(A)\big). \tag{8.232}$$


Unfortunately our assumptions now do not allow us to use an argument similar to the one used at the end of the proof of Theorem 8.3. The reason is that our assumptions are no longer symmetric with respect to $A$ and $B$. For example, inequality (8.225) is not assumed with $B$ in place of $A$. In applications it is often possible to establish inequality (8.225) with $B$ in place of $A$, and in this case the argument can be simplified: one can use by symmetry the estimate (8.232) in which $B$ and $A$ exchange places. With the assumptions formulated in Theorem 8.4 we proceed as follows. Write

$$A=B(I+Q_1),\qquad Q_1=-Q(I+Q)^{-1}. \tag{8.233}$$
Choose
$$M_{1n}:=(I+Q_1^*)\mathcal L_n(B) \tag{8.234}$$
and use the inequalities similar to (8.229)-(8.232) to obtain
$$s_{n+m+1}(A)\le s_n(B)\big(1+c_1s_{m+1}^a(A)\big). \tag{8.235}$$
It follows from (8.232) and (8.235) that

$$s_{n+2m+1}(A)\le s_{n+m+1}(B)\big(1+c_1s_{m+1}^a(A)\big)\le s_{n+1}(A)\big(1+c\,s_{m+1}^a(A)\big)\big(1+c_1s_{m+1}^a(A)\big)\le s_{n+1}(A)\big(1+c_2s_{m+1}^a(A)\big), \tag{8.236}$$
where we took into account that
$$0<s_m(A)\to 0\quad\text{as }m\to\infty, \tag{8.237}$$
so that $s_m^{2a}(A)\le s_m^a(A)$ for all sufficiently large $m$. Therefore, for all sufficiently large $m$ one has

$$\frac{s_{n+2m+1}(A)}{s_{n+m+1}(A)}\big[1+O(s_m^a(A))\big]\le\frac{s_{n+m+1}(B)}{s_{n+m+1}(A)}\le\frac{s_{n+1}(A)}{s_{n+m+1}(A)}\big[1+O(s_m^a(A))\big]. \tag{8.238}$$

Choose
$$m=n^{1-x},\qquad 0<x<1. \tag{8.239}$$


It follows from (8.224) and (8.239) that

$$\frac{s_{n+m}(A)}{s_n(A)}=\Big(\frac{n+m}{n}\Big)^{-p}\big[1+O((n+m)^{-p_1})\big]\big[1+O(n^{-p_1})\big]=1+O\Big(\frac{m}{n}\Big)+O(n^{-p_1})=1+O(n^{-x})+O(n^{-p_1}). \tag{8.240}$$
From (8.240) and (8.238) one obtains

$$\frac{s_{n+m+1}(B)}{s_{n+m+1}(A)}=1+O(n^{-p_1})+O(n^{-x})+O(n^{-(1-x)pa}). \tag{8.241}$$
Let

$$q:=\min\{p_1,\ x,\ (1-x)pa\}. \tag{8.242}$$
Then

$$\frac{s_{n+m+1}(B)}{s_{n+m+1}(A)}=1+O(n^{-q}). \tag{8.243}$$
Since
$$\frac{n+m+1}{n}\sim 1\quad\text{as }n\to\infty, \tag{8.244}$$
it follows from (8.243) that formula (8.226) holds. Choose now $x$, $0<x<1$, such that
$$\min\big(x,\ (1-x)pa\big)\ \text{is maximal}. \tag{8.245}$$
An easy calculation shows that (8.245) holds if
$$x=\frac{pa}{1+pa}. \tag{8.246}$$
Therefore
$$q=\min\Big(p_1,\ \frac{pa}{1+pa}\Big).$$
This is formula (8.227). Theorem 8.4 is proved. $\Box$
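The elementary optimization behind (8.245)-(8.246) is easy to verify numerically; the sample values of $p$ and $a$ and the grid below are illustrative only:

```python
import numpy as np

# g(x) = min(x, (1-x)*p*a) on (0,1) is maximized at x* = pa/(1+pa),
# where its maximal value is also pa/(1+pa)
for p, a in [(0.5, 1.0), (2.0, 0.3), (1.0, 1.0)]:
    pa = p * a
    x_star = pa / (1.0 + pa)
    xs = np.linspace(1e-6, 1 - 1e-6, 100001)
    g = np.minimum(xs, (1 - xs) * pa)
    assert abs(xs[np.argmax(g)] - x_star) < 1e-4
    assert abs(g.max() - pa / (1.0 + pa)) < 1e-4
```

The function $x\mapsto x$ increases while $x\mapsto(1-x)pa$ decreases, so the minimum of the two is largest exactly where they cross, i.e. at $x=(1-x)pa$.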

We leave it to the reader as an exercise to check the validity of the Remark.

Hint. A trivial example, which shows that for some $Q$ the order of the remainder in (8.226) is attained, is the following one. Let $A>0$ be a selfadjoint compact operator. In this case $s_j(A)=\lambda_j(A)$. Take $Q=\phi(A)$. Then

$$\lambda_n(B)=\lambda_n\{A[1+\phi(A)]\}=\lambda_n(A)[1+\phi(\lambda_n)] \tag{8.247}$$
by the spectral mapping theorem. If one chooses $\phi(\lambda)$ such that
$$\phi(\lambda_n)=o(n^{-p_1}), \tag{8.248}$$
then $q=p_1$ is the order of the remainder in the formula
$$\lambda_n(B)=cn^{-p}\big(1+O(n^{-p_1})\big). \tag{8.249}$$

8.3.2.5 Unbounded operators

Note that the results of Theorems 8.3 and 8.4 can be used in the cases when the operators we are interested in are unbounded. For example, suppose that $\mathcal L$ is an elliptic selfadjoint operator in $H=L^2(D)$ of order $s$ and $\ell$ is a selfadjoint in $H$ differential operator of lower order, $\operatorname{ord}\ell=m$. We wish to check that
$$\lim_{n\to\infty}\frac{\lambda_n(\mathcal L+\ell)}{\lambda_n(\mathcal L)}=1. \tag{8.250}$$
Since $\lambda_n(\mathcal L)\to+\infty$ and $\lambda_n(\mathcal L+cI)=\lambda_n(\mathcal L)+c$, where $c$ is a constant, one can take $\mathcal L+cI$ in place of $\mathcal L$ in (8.250) and choose $c>0$ such that the operator $\mathcal L+cI$ is positive definite in $H$. Then the operator $A:=(\mathcal L+cI)^{-1}$ is compact in $H$. Moreover

$$\mathcal L+cI+\ell=\big[I+\ell(\mathcal L+cI)^{-1}\big](\mathcal L+cI),$$
so that
$$B:=(\mathcal L+cI+\ell)^{-1}=(\mathcal L+cI)^{-1}\big[I+\ell(\mathcal L+cI)^{-1}\big]^{-1}. \tag{8.251}$$
If $\operatorname{ord}\ell<\operatorname{ord}\mathcal L$ then the operator
$$S:=\ell(\mathcal L+cI)^{-1}\ \text{is compact in }H. \tag{8.252}$$
One can always choose the constant $c>0$ such that $N(I+S)=\{0\}$, so that

$$(I+S)^{-1}=I+Q, \tag{8.253}$$

where Q is compact in H. Then (8.251) can be written as

$$B=A(I+Q), \tag{8.254}$$


and the assumptions of Theorem 8.3 are satisfied. In fact, since A and B are selfadjoint and λn(A) and λn(B) are positive for all sufficiently large n, one has

$$s_n(B)=\lambda_n(B),\quad s_n(A)=\lambda_n(A),\qquad\forall n>n_0. \tag{8.255}$$
By Theorem 8.3 one has
$$\lim_{n\to\infty}\frac{\lambda_n(B)}{\lambda_n(A)}=1. \tag{8.256}$$
Since $\lambda_n(B^{-1})=\lambda_n^{-1}(B)$, it follows from (8.256) that
$$\lim_{n\to\infty}\frac{\lambda_n(B^{-1})}{\lambda_n(A^{-1})}=1. \tag{8.257}$$
This is equivalent to (8.250) because, as was mentioned above,
$$\lim_{n\to\infty}\frac{\lambda_n(\mathcal L+\ell+cI)}{\lambda_n(\mathcal L+\ell)}=1 \tag{8.258}$$
for any constant $c$.

8.3.2.6 Asymptotics of eigenvalues

In this section we prove some theorems about perturbations preserving asymptotics of the spectrum. In order to formulate these theorems, in which unbounded operators appear, we need some definitions. Let $A$ be a closed linear operator, densely defined in a Hilbert space $H$; $D(A)$ is its domain of definition, $R(A)$ is its range, $N(A)=\{u:\ Au=0\}$ is its null-space, and $\sigma(A)$ is its spectrum.

Definition 8.3 We say that the spectrum of $A$ is discrete if it consists of isolated eigenvalues with the only possible limit point at infinity, each of the eigenvalues being of finite algebraic multiplicity, and for each eigenvalue $\lambda_j$ the whole space can be represented as a direct sum of the root subspace $\mathcal M_j$ corresponding to $\lambda_j$ and a subspace $H_j$ which is invariant with respect to $A$ and in which the operator $A-\lambda_jI$ has a bounded inverse. In this case $\lambda_j$ is called a normal eigenvalue.

The root linear manifold of the operator $A$ corresponding to the eigenvalue $\lambda$ is the set
$$\mathcal M_\lambda:=\{u:\ (A-\lambda I)^nu=0\ \text{for some }n\}. \tag{8.259}$$


The algebraic multiplicity ν(λ) of the eigenvalue λ is

$$\nu(\lambda):=\dim\mathcal M_\lambda. \tag{8.260}$$
If $\mathcal M_\lambda$ is closed in $H$ it is called the root subspace. The geometric multiplicity $n(\lambda)$ of $\lambda$ is the dimension of the eigenspace corresponding to $\lambda$, $n(\lambda)=\dim N(A-\lambda I)$. If $\epsilon>0$ is small enough, so that there is only one eigenvalue in the disc $|z-\lambda|<\epsilon$, then
$$P_\lambda:=-\frac{1}{2\pi i}\int_{|z-\lambda|=\epsilon}R(z)\,dz,\qquad R(z):=(A-zI)^{-1}, \tag{8.261}$$
is the projection, that is, $P_\lambda^2=P_\lambda$. The subspace $P_\lambda H$ is invariant for $A$, $P_\lambda$ commutes with $A$, $P_\lambda A=AP_\lambda$, and the spectrum of the restriction of $A$ onto $P_\lambda H$ consists of only one point $\lambda$, which is its eigenvalue of algebraic multiplicity $\nu(\lambda)$.

An example of operators with discrete spectrum is the class of operators for which the operator $(A-\lambda_0I)^{-1}$ is compact for some $\lambda_0\in\mathbb C$. Such are elliptic operators in a bounded domain. If $A$ is an operator with discrete spectrum then

$$|\lambda_n(A)|\to\infty\quad\text{as }n\to\infty. \tag{8.262}$$
Thus, for any constant $c$,
$$\lim_{n\to\infty}\frac{\lambda_n(A+cI)}{\lambda_n(A)}=1. \tag{8.263}$$
Therefore it is not too restrictive to assume that $A^{-1}$ exists and is compact: if $A^{-1}$ does not exist then choose $c$ such that $(A+cI)^{-1}$ exists and study the asymptotics of $\lambda_n(A+cI)=\lambda_n(A)+c$. Note that if $(A-\lambda_0I)^{-1}$ is compact for some $\lambda_0$, then $(A-\lambda I)^{-1}$ is compact for any $\lambda$ for which $A-\lambda I$ is invertible. This follows from the resolvent identity
$$(A-\lambda I)^{-1}=(A-\lambda_0I)^{-1}+(\lambda-\lambda_0)(A-\lambda I)^{-1}(A-\lambda_0I)^{-1}. \tag{8.264}$$
If $A^{-1}$ is compact we define the singular values of $A$ by the formula
$$s_n(A)=\big[s_n(A^{-1})\big]^{-1}. \tag{8.265}$$

If
$$A=A^*\ge m>0, \tag{8.266}$$


then we denote by $H_A$ the Hilbert space which is the completion of $D(A)$ in the norm $\|u\|_A=(Au,u)^{1/2}$. Clearly $H_A\subset H$, $\|u\|\le m^{-1/2}\|u\|_A$, and $(u,v)_A:=(Au,v)$ is the inner product in $H_A$. The inner product can also be written as $(u,v)_A=(A^{1/2}u,A^{1/2}v)$. If $B=B^*\ge-m$, then by $H_B$ we mean the Hilbert space which is the completion of $D(B)$ in the norm $\|u\|_B:=((B+m+1)u,u)^{1/2}$. All unbounded operators are always assumed densely defined in $H$.

Theorem 8.5 Let $A=A^*\ge m>0$ be a linear closed operator with discrete spectrum, $T$ be a linear operator, $D(A)\subset D(T)$, $B:=A+T$, $D(B)=D(A)$. Assume that $A^{-1}T$ is compact in $H_A$, $B=B^*$, and $H_A\subset D(T)$. Then
$$\lim_{n\to\infty}\frac{\lambda_n(B)}{\lambda_n(A)}=1. \tag{8.267}$$
The conclusion (8.267) remains valid if $A\ge-m$ and $[A+(m+1)I]^{-1}T$ is compact in $H_A$.

Remark 8.4 If $T>0$ then $A^{-1}T$ is compact in $H_A$ if and only if the imbedding operator $i:H_A\to H_T$ is compact. By $H_T$ we mean the Hilbert space which is the completion of $D(T)$ in the norm $(Tu,u)^{1/2}$. If $T$ is not positive but $|(Tf,f)|\le(Qf,f)$ for some $Q>0$ and all $f\in D(T)$, $D(T)\subset D(Q)$, and if the imbedding $i:H_A\to H_Q$ is compact, then $A^{-1}T$ is compact.

The reader can prove these statements as an exercise or find a proof in [Glazman (1965), §4]. To prove Theorem 8.5 we need a lemma.

Lemma 8.19 If the operator $A^{-1}T$ is compact in $H_A$ then $H_A=H_B$ and the spectrum of $B$ is discrete.

Assuming the validity of this lemma let us prove Theorem 8.5 and then prove the Lemma.

Proof of Theorem 8.5. Let us use the symbol $\perp_A$ for orthogonality in $H_A$ and $\perp$ for orthogonality in $H$. If $\mathcal L_n(A)$ is the linear span of the first $n$ eigenvectors of $A^{-1}$ then $f\perp\mathcal L_n(A)$ is equivalent to $f\perp_A\mathcal L_n(A)$. Indeed, if $A^{-1}\phi_j=\lambda_j\phi_j$, $\lambda_j\ne 0$, then
$$0=(f,\phi_j)=\lambda_j(f,A\phi_j)=\lambda_j(f,\phi_j)_A\ \Leftrightarrow\ (f,\phi_j)_A=0.$$


Note also that

$$\inf\{\alpha(1+\beta)\}\ge(1-\sup\beta)\inf\alpha\quad\text{if }\alpha\ge 0\text{ and }-1<\beta<1. \tag{8.268}$$
We will use the following statement:

$$\gamma_n:=\sup_{f\perp\mathcal L_n(A)}\frac{|(Tf,f)|}{(f,f)_A}=\sup_{f\perp_A\mathcal L_n(A)}\frac{|(A^{-1}Tf,f)_A|}{(f,f)_A}\to 0\quad\text{as }n\to\infty, \tag{8.269}$$
which follows from Theorem 8.2 and the assumed compactness of $A^{-1}T$ in $H_A$. We are now ready to prove Theorem 8.5. By the minimax principle one has
$$\lambda_{n+1}(B)=\sup_{L_n}\inf_{f\perp L_n}\frac{(Bf,f)}{(f,f)}\ge\inf_{f\perp\mathcal L_n(A)}\frac{(Bf,f)}{(f,f)}=\inf_{f\perp\mathcal L_n(A)}\frac{(Af,f)}{(f,f)}\Big[1+\frac{(Tf,f)}{(Af,f)}\Big]\ge\lambda_{n+1}(A)(1-\gamma_n),\qquad\gamma_n\to 0\ \text{as }n\to\infty, \tag{8.270}$$
where we have used (8.268) and (8.269). By symmetry, for all sufficiently large $n$, one has
$$\lambda_{n+1}(A)\ge\lambda_{n+1}(B)(1-\delta_n),\qquad\delta_n\to 0\ \text{as }n\to\infty. \tag{8.271}$$
We leave it to the reader as an exercise to check that under the assumptions of Theorem 8.5 the operator $(B+cI)^{-1}T$ is compact if $B+cI$ is invertible. From (8.270) and (8.271) the desired conclusion (8.267) follows. If $A\ge-m$ and $[A+(m+1)I]^{-1}T$ is compact in $H_A$ then we argue as above and obtain in place of (8.270) the following inequality:

$$\lambda_{n+1}\{B+(m+1)I\}\ge\lambda_{n+1}\{A+(m+1)I\}(1-\gamma_n),\qquad\gamma_n\to 0\ \text{as }n\to\infty. \tag{8.272}$$
Since $\lambda_n(A+cI)=\lambda_n(A)+c$, $\lambda_n(A)\to+\infty$ and $\lambda_n(B)\to+\infty$, inequality (8.272) implies
$$\lambda_{n+1}(B)[1+o(1)]\ge\lambda_{n+1}(A)[1+o(1)](1-\gamma_n)\quad\text{as }n\to\infty, \tag{8.273}$$
and the rest of the argument is the same as above. Theorem 8.5 is proved. $\Box$


Proof of Lemma 8.19. Note that $B=A(I+S)$, where $S:=A^{-1}T$ is compact in $H_A$. Let us represent $S$ in the form $S=Q+F$, where $\|Q\|_A<1$ and $F$ is a finite rank operator. The operator $S$ is selfadjoint in $H_A$. Indeed,

$$(Sf,g)_A=(Tf,g)=(f,Tg)=(f,Sg)_A,$$

where we used the symmetry of $T=B-A$ on $H$. We choose $Q$ and $F$ to be selfadjoint in $H_A$. The operator $I+Q$ is positive definite in $H_A$, while

$$(Fu,u)_A=\sum_{j=1}^N a_j|(u,\phi_j)_A|^2$$

for some orthonormal in $H_A$ set of functions $\phi_j$, some constants $a_j$, and some number $N=\operatorname{rank}F$. Since $D(A)$ is dense in $H_A$, one can find $v_j\in D(A)$ such that $\|\phi_j-v_j\|_A<\epsilon$, where $\epsilon>0$ is arbitrarily small. Then
$$|(Fu,u)_A|\le\sum_{j=1}^N|a_j|\big[|(u,\phi_j-v_j)_A|+|(u,Av_j)|\big]^2\le c_1\epsilon\|u\|_A^2+c_2\|u\|^2, \tag{8.274}$$

where c1 and c2 are some positive constants which do not depend on u. It follows from (8.274) that

(Bu, u) = (A(I + Q)u, u) + (AF u, u)

$$=((I+Q)u,u)_A+(Fu,u)_A\ge c_0\|u\|_A^2-c_1\epsilon\|u\|_A^2-c_2\|u\|^2, \tag{8.275}$$
where $c_0>0$. It follows from (8.275), if one chooses $\epsilon$ so small that $c_0-c_1\epsilon>0$, that $B$ is bounded from below in $H$. Since, clearly, $(Bu,u)\le c(u,u)_A$, one concludes that the metrics of $H_A$ and $H_B$ are equivalent, so that $H_A=H_B$.

It remains to be proved that the spectrum of $B$ is discrete. Since $B$ is selfadjoint it is sufficient to prove that no point of the spectrum $\sigma(B)$ of $B$ belongs to the essential spectrum $\sigma_{ess}(B)$. Recall that $\lambda\in\sigma_{ess}(B)$, where $B=B^*$, if and only if $\dim E(\Delta_\epsilon)H=\infty$ for any $\epsilon>0$, where $\Delta_\epsilon=(\lambda-\epsilon,\lambda+\epsilon)$ and $E(\Delta)$ is the resolution of the identity corresponding to the selfadjoint operator $B$. Assume that $\lambda\in\sigma(B)$ and $\dim E(\Delta_{\epsilon_n})H=\infty$ for some sequence $\epsilon_n\to 0$. Then there exists an orthonormal sequence $u_n\in H$ such that $\|Bu_n-\lambda u_n\|\to 0$ as $n\to\infty$. Thus
$$\|u_n+A^{-1}Tu_n-\lambda A^{-1}u_n\|\to 0\quad\text{as }n\to\infty. \tag{8.276}$$

280 Random Fields Estimation Theory

Since $\|u_n\| = 1$ we have
$$(Au_n, u_n) + (Tu_n, u_n) - \lambda(u_n, u_n) \to 0, \quad n \to \infty. \tag{8.277}$$
If $A^{-1}T$ is compact in $H_A$, we have proved that, for any $\epsilon > 0$,

$$|(Tu, u)| \le \epsilon\|u\|_A^2 + c(\epsilon)\|u\|^2, \quad u \in H_A. \tag{8.278}$$
It follows from (8.277) and (8.278) that

$$\|u_n\|_A \le c, \tag{8.279}$$
where $c > 0$ is a constant which does not depend on $n$. Since $A^{-1}T$ is compact in $H_A$, inequality (8.279) implies that a subsequence of the sequence $u_n$ exists (we denote this subsequence again by $u_n$) such that $A^{-1}Tu_n$ converges in $H_A$ and, therefore, in $H$. Since the set $u_n$ is orthonormal, $u_n$ converges weakly to zero in $H$:

$$u_n \rightharpoonup 0, \quad n \to \infty. \tag{8.280}$$
Therefore

$$\|A^{-1}Tu_n\| \to 0 \quad \text{as } n \to \infty. \tag{8.281}$$
From (8.281) and (8.276) it follows that

$$\|u_n - \lambda A^{-1}u_n\| \to 0 \quad \text{as } n \to \infty, \tag{8.282}$$
where $u_n$ is an orthonormal sequence. This means that if $\lambda \neq 0$ then $\lambda \in \sigma_{ess}(A)$, which is a contradiction since, by assumption, $A$ does not have essential spectrum. If $\lambda = 0$ then (8.282) cannot hold since $\|u_n\| = 1$. Therefore $B$ does not have essential spectrum and its spectrum is discrete. Lemma 8.19 is proved. $\Box$

Example 8.1 Let $A$ be the Dirichlet Laplacian $-\Delta$ in a bounded domain $D \subset \mathbb{R}^r$ and $B = -\Delta + q(x)$, where $q(x)$ is a real-valued function. In this case $Tu = q(x)u$ is a multiplication operator. The condition that $A^{-1}T$ be compact in $H_A$ means that $(-\Delta)^{-1}q$ is compact in $\overset{\circ}{H}{}^1(D)$. This condition holds if and only if $(-\Delta)^{-1/2}q$ is compact in $H = L^2(D)$, $D \subset \mathbb{R}^r$. The operator $(-\Delta)^{-1/2}q(x)$ is compact in $L^2(D)$ provided that $q \in L^\gamma(D)$, $\gamma > r$, and Theorem 8.5 asserts that, in this case,
$$\lim_{n\to\infty} \frac{\lambda_n(-\Delta + q)}{\lambda_n(-\Delta)} = 1. \tag{8.283}$$
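As a finite-dimensional sketch (not from the book) of the conclusion (8.283), one can discretize the one-dimensional Dirichlet Laplacian on $(0, \pi)$ by finite differences and add a bounded potential $q$. For symmetric matrices Weyl's inequality gives $|\lambda_n(A + \operatorname{diag} q) - \lambda_n(A)| \le \max|q|$, so the ratio of eigenvalues tends to $1$ because $\lambda_n(A) \to \infty$.

```python
import numpy as np

# Hypothetical 1D illustration of (8.283): -d^2/dx^2 on (0, pi) with Dirichlet
# conditions, discretized by second-order finite differences, plus a bounded
# real-valued potential q(x) with |q| <= 1.
N = 300                                  # number of interior grid points
h = np.pi / (N + 1)
x = h * np.arange(1, N + 1)

main = 2.0 / h**2 * np.ones(N)
off = -1.0 / h**2 * np.ones(N - 1)
A = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)   # Dirichlet Laplacian

q = np.cos(3 * x)                        # bounded potential, |q| <= 1
B = A + np.diag(q)

lam_A = np.linalg.eigvalsh(A)            # ascending eigenvalues
lam_B = np.linalg.eigvalsh(B)
ratio = lam_B / lam_A                    # approaches 1 as n grows
print(ratio[19], ratio[199])
```

Since $\lambda_{20}(A) \approx 400$ and the perturbation is bounded by $1$, the ratio is already within $1/400$ of $1$ at $n = 20$, and much closer further along the spectrum.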


In the calculation of the $L^p$ class to which $q$ belongs we have used the known imbedding theorem which says that the imbedding $i: W^{k,p}(D) \to L^{rp/(r-kp)}(D)$ is compact for $kp < r$, where $W^{k,p}(D)$ is the Sobolev space of functions with derivatives of order $\le k$ belonging to $L^p(D)$. If $q \in L^\gamma(D)$ and $u \in L^2(D)$ then $qu \in L^p(D)$,
$$\int_D |qu|^p\,dx \le \left(\int_D |q|^{p\alpha}\,dx\right)^{1/\alpha}\left(\int_D |u|^{p\beta}\,dx\right)^{1/\beta},$$
where $\alpha > 1$, $\beta = \frac{\alpha}{\alpha - 1}$, $p\beta = 2$, $p\alpha = \gamma$. Thus
$$p = \frac{2(\alpha - 1)}{\alpha}, \qquad p = \frac{\gamma}{\alpha}, \qquad \text{so that } \alpha = 1 + \frac{\gamma}{2}$$
and $p = \frac{2\gamma}{\gamma + 2}$. On the other hand, if $qu \in L^p(D)$ then $(-\Delta)^{-1/2}qu \in W^{1,p}(D) \subset L^{rp/(r-p)}(D)$. If $\frac{rp}{r-p} > 2$, that is $p > \frac{2r}{r+2}$, then $\gamma > r$. The condition on $q(x)$ for which (8.283) holds can be relaxed.

In the next theorem we assume compactness of $A^{-1}T$ and $TA^{-1}$ in $H$ rather than in $H_A$.

Theorem 8.6 Assume that $A = A^* \ge m > 0$ is an operator with discrete spectrum, $D(A) \subset D(T)$, $B = A + T$, $D(B) = D(A)$, $B$ is normal, $0 \notin \sigma(B)$, and the operator $A^{-1}T$ is compact in $H$. Then the spectrum of $B$ is discrete and
$$\lim_{n\to\infty} \frac{\lambda_n(B)}{\lambda_n(A)} = 1. \tag{8.284}$$
Proof. First we prove that the spectrum of $B$ is discrete. Since $A$ is selfadjoint, positive definite, and its spectrum is discrete, it follows that $A^{-1}$ is compact. Let

$$Au + Tu = \lambda u + f, \qquad u + A^{-1}Tu = \lambda A^{-1}u + A^{-1}f. \tag{8.285}$$

Since $0 \notin \sigma(B)$, the operator $I + A^{-1}T$ has a bounded inverse. Therefore
$$u = \lambda(I + A^{-1}T)^{-1}A^{-1}u + (I + A^{-1}T)^{-1}A^{-1}f. \tag{8.286}$$

The operator $(I + A^{-1}T)^{-1}A^{-1}$ is compact, being a product of a bounded and a compact operator. Equations (8.285) and (8.286) are equivalent. Therefore

$$(B - \lambda I)^{-1} = \left[I - \lambda(I + A^{-1}T)^{-1}A^{-1}\right]^{-1}(I + A^{-1}T)^{-1}A^{-1}. \tag{8.287}$$


It follows from (8.287) that $\lambda \in \sigma(B)$ if and only if $\lambda^{-1} \in \sigma(F)$, $F := (I + A^{-1}T)^{-1}A^{-1}$. Since $F$ is compact, each $\lambda$ is an isolated eigenvalue of finite algebraic multiplicity and $\sigma(B)$ is discrete. In this part of the argument we did not use the assumption that $B$ is normal. If $B$ is normal then $|\lambda_n(B)| = s_n(B)$, where $s_n(B)$ are the singular values of $B$. Since $B = A(I + A^{-1}T)$ and $A^{-1}T$ is compact, since $s_n(B) = s_n^{-1}(B^{-1})$, and since $A^{-1}$ is compact, we can apply Theorem 8.3 and get

$$\lim_{n\to\infty} \frac{s_n(B^{-1})}{s_n(A^{-1})} = \lim_{n\to\infty} \frac{s_n(A)}{s_n(B)} = 1. \tag{8.288}$$

Since A > 0, we have sn(A) = λn(A). Therefore the desired result (8.284) will be proved if we prove that

$$\lim_{n\to\infty} \frac{|\lambda_n(B)|}{\lambda_n(B)} = 1. \tag{8.289}$$
Let us prove (8.289). Let

$$A\phi_j + T\phi_j = \lambda_j\phi_j. \tag{8.290}$$

Since B is normal we can assume that

$$(\phi_j, \phi_m) = \delta_{jm}. \tag{8.291}$$

Rewrite (8.290) as

$$\phi_j + A^{-1}T\phi_j = \lambda_j A^{-1}\phi_j. \tag{8.292}$$

Multiply (8.292) by φj to get

$$1 + (A^{-1}T\phi_j, \phi_j) = \lambda_j(A^{-1}\phi_j, \phi_j). \tag{8.293}$$

Since $A^{-1}T$ is compact and $\phi_j \rightharpoonup 0$ as $j \to \infty$, we have
$$(A^{-1}T\phi_j, \phi_j) \to 0 \quad \text{as } j \to \infty. \tag{8.294}$$
Note that $(A^{-1}\phi_j, \phi_j) > 0$. Therefore it follows from (8.293) that

$$\frac{\operatorname{Im}\lambda_j}{\operatorname{Re}\lambda_j} = \frac{\operatorname{Im}(A^{-1}T\phi_j, \phi_j)}{1 + \operatorname{Re}(A^{-1}T\phi_j, \phi_j)} \to 0 \quad \text{as } j \to \infty. \tag{8.295}$$

This implies (8.289). Theorem 8.6 is proved. $\Box$
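A finite-dimensional sketch of Theorem 8.6 (not from the book), in the selfadjoint special case (a selfadjoint operator is normal): $A$ has eigenvalues $n^2 \to \infty$ and $T$ is a fixed bounded symmetric perturbation, so $A^{-1}T$ plays the role of a compact operator and $\lambda_n(B)/\lambda_n(A) \to 1$ by Weyl's inequality $|\lambda_n(A+T) - \lambda_n(A)| \le \|T\|$.

```python
import numpy as np

# Illustrative finite-dimensional model: A = diag(1, 4, 9, ...), T symmetric
# with spectral norm 5.  The eigenvalue ratio tends to 1 along the spectrum.
rng = np.random.default_rng(0)
N = 400
lam_A = np.arange(1, N + 1, dtype=float) ** 2
A = np.diag(lam_A)

T = rng.standard_normal((N, N))
T = (T + T.T) / 2
T *= 5.0 / np.linalg.norm(T, 2)          # normalize so that ||T|| = 5

lam_B = np.linalg.eigvalsh(A + T)        # ascending eigenvalues of B = A + T
ratio = lam_B / lam_A
print(ratio[9], ratio[299])
```

At $n = 10$ the deviation from $1$ is at most $5/100$; at $n = 300$ it is at most $5/90000$.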


8.3.2.7 Asymptotics of eigenvalues (continuation)

In this section we continue to study perturbations preserving asymptotics of the spectrum of linear operators. Let us give a criterion for compactness of the resolvent $(A - \lambda I)^{-1} := R(\lambda)$ for $\lambda \notin \sigma(A)$, where $A$ is a closed densely defined linear operator in $H$.

Theorem 8.7 The operator $(A - \lambda I)^{-1}$, $\lambda \notin \sigma(A)$, is compact if and only if the operator $(I + A^*A)^{-1}$ is compact.

Proof. Sufficiency. Suppose $(I + A^*A)^{-1}$ is compact and $\lambda \notin \sigma(A)$. Let
$$\|g_n\| \le c, \qquad (A - \lambda I)^{-1}g_n = f_n. \tag{8.296}$$
Then

$$\|f_n\| \le c \quad \text{and} \quad \|Af_n\| \le c, \tag{8.297}$$
where $c$ denotes various positive constants. Therefore

$$\left\|(I + A^*A)^{1/2}f_n\right\|^2 = \|f_n\|^2 + \|Af_n\|^2 \le c. \tag{8.298}$$

The operators $(I + A^*A)^{-1}$ and $(I + A^*A)^{-1/2}$ are selfadjoint positive operators. They are simultaneously compact or non-compact. Therefore if $(I + A^*A)^{-1}$ is compact then $(I + A^*A)^{-1/2}$ is compact, and (8.298) implies that the sequence $\{f_n\}$ is relatively compact. Therefore the operator $(A - \lambda I)^{-1}$, $\lambda \notin \sigma(A)$, maps any bounded sequence $g_n$ into a relatively compact sequence $f_n$. This means that $(A - \lambda I)^{-1}$ is compact.

Necessity. Assume that $(A - \lambda I)^{-1}$ is compact and $\|h_n\| \le c$. Then the sequence $(A - \lambda I)^{-1}h_n$ is relatively compact. We wish to prove that the sequence $q_n := (I + A^*A)^{-1}h_n$ is relatively compact. The sequence $(I + A^*A)q_n = h_n$ is bounded. Thus

$$((I + A^*A)q_n, q_n) = \|q_n\|^2 + \|Aq_n\|^2 \le c. \tag{8.299}$$
Define $p_n := (A - \lambda I)q_n$, $q_n = (A - \lambda I)^{-1}p_n$. We have
$$\|p_n\| \le \|Aq_n\| + |\lambda|\,\|q_n\| \le c, \tag{8.300}$$
where $c$ denotes various constants. From (8.300) and compactness of $(A - \lambda I)^{-1}$ it follows that the sequence $q_n = (A - \lambda I)^{-1}p_n$ is relatively compact. Theorem 8.7 is proved. $\Box$


Remark 8.5 Let $T$ be a linear operator in $H$, $D(A) \subset D(T)$, and let $0 \notin \sigma(A)$.

Definition 8.4 If for any sequence $f_n$ such that $\|f_n\| + \|Af_n\| \le c$

the sequence $Tf_n$ is relatively compact, then $T$ is called $A$-compact. In other words, $T$ is $A$-compact if it is a compact operator from the space $G_A$ into $H$. The space $G_A$ is the closure of $D(A)$ in the graph norm $\|f\|_{G_A} := \|f\| + \|Af\|$. If $A$ is closed, which we assume, then $D(A) = G_A$ is a Banach space if it is equipped with the graph norm.

Proposition 8.2 The operator $T$ is $A$-compact if and only if the operator $TA^{-1}$ is compact in $H$.

Proof. Suppose $T$ is $A$-compact. Let $\|f_n\| \le c$, and define $g_n = A^{-1}f_n$. Then $\|g_n\| + \|Ag_n\| \le c$. Therefore the sequence $Tg_n$ is relatively compact. This means that the sequence $TA^{-1}f_n$ is relatively compact. Therefore $TA^{-1}$ is compact in $H$. Conversely, suppose $TA^{-1}$ is compact in $H$ and $\|f_n\| + \|Af_n\| \le c$. Then the sequence $Tf_n = TA^{-1}Af_n$ is relatively compact. Proposition 8.2 is proved. $\Box$

8.3.2.8 Asymptotics of s-values

In this section we prove

Theorem 8.8 Let $A$ be a closed linear operator in $H$. Suppose that $\sigma(A)$, the spectrum of $A$, is discrete and $0 \notin \sigma(A)$. Let $T$ be a linear operator, $D(A) \subset D(T)$, $B = A + T$, $D(B) = D(A)$. If the operator $TA^{-1}$ is compact then $B$ is closed. If, in addition, $A^{-1}$ is compact and, for some number $k \notin \sigma(A)$, the operator $B + kI$ is injective, then $\sigma(B)$ is discrete and
$$\lim_{n\to\infty} \frac{s_n(B)}{s_n(A)} = 1. \tag{8.301}$$
The following lemma is often useful.

Lemma 8.20 Suppose that $\{f_n\} \in H$ is a bounded sequence which does not contain a convergent subsequence. Then there is a sequence $\psi_m = f_{n_{m+1}} - f_{n_m}$ such that
$$\psi_m \rightharpoonup 0 \quad \text{as } m \to \infty \tag{8.302}$$


and $\{\psi_m\}$ does not contain a convergent subsequence.

Proof. Since $\{f_n\}$ is bounded we can assume that it converges weakly:

$$f_n \rightharpoonup f \tag{8.303}$$

(passing to a subsequence and using the well known fact that bounded sets in a Hilbert space are relatively weakly compact). Since $\{f_n\}$ does not contain a convergent subsequence, one can find a subsequence such that

$$\|f_{n_m} - f_{n_k}\| \ge \epsilon > 0 \quad \text{for all } m \neq k. \tag{8.304}$$
If

$$\psi_m := f_{n_{m+1}} - f_{n_m}, \tag{8.305}$$
then (8.303) implies (8.302), and the sequence $\{\psi_m\}$ does not contain a convergent subsequence because

$$\|\psi_m\| \ge \epsilon > 0, \tag{8.306}$$

and if there were a convergent subsequence $\psi_{m_j}$, it would have to converge to zero, since its weak limit is zero. Lemma 8.20 is proved. $\Box$

This lemma can be found in [Glazman (1965), §5], where it is used in the proof of the following result: if $A$ is a closed linear operator in $H$ and $K$ is a compact operator, then $\sigma_c(A + K) = \sigma_c(A)$, where $\sigma_c(A)$ is the continuous spectrum of $A$, that is, the set of points $\lambda$ such that there exists a bounded sequence $\psi_m \in D(A)$ which does not contain a convergent subsequence and which has the property $\|A\psi_m - \lambda\psi_m\| \to 0$ as $m \to \infty$.

Proof of Theorem 8.8 (1) Let us first prove that B is closed. Assume that

$$f_n \to f, \qquad Bf_n = Af_n + Tf_n \to g, \tag{8.307}$$
and $f_n \in D(B) = D(A)$. Suppose we have the estimate
$$\|Af_n\| \le c. \tag{8.308}$$
Then the sequence $Tf_n = TA^{-1}Af_n$ contains a convergent subsequence, since $TA^{-1}$ is compact. This and the second equation (8.307) imply that the sequence $\{Af_n\}$ contains a convergent subsequence, which we denote


again $Af_n$. Since $A$ is closed by the assumption, we conclude that $f \in D(A) = D(B)$ and

$$Af + Tf = g,$$

where we took into account that

$$\lim_{n\to\infty} Tf_n = \lim_{n\to\infty} TA^{-1}Af_n = TA^{-1}Af = Tf.$$
Thus, the operator $B$ is closed provided that (8.308) holds. Let us prove inequality (8.308). Suppose

$$\|Af_n\| \to \infty, \quad n \to \infty. \tag{8.309}$$
Define
$$g_n := \frac{f_n}{\|Af_n\|}, \qquad \|g_n\| \to 0, \qquad \|Ag_n\| = 1. \tag{8.310}$$
Equation (8.307) implies

$$Ag_n + Tg_n \to 0, \quad n \to \infty. \tag{8.311}$$
As above, compactness of the operator $TA^{-1}$ and the last equation (8.310)

imply that one can assume that the subsequence $Tg_{n_m}$, which we denote again $Tg_n$, converges in $H$. This and equation (8.311) imply that $Ag_n$ converges to an element $h$:

$$Ag_n \to h. \tag{8.312}$$
Since $A$ is closed and $g_n \to 0$, one concludes that $h = 0$. This is a contradiction:

$$1 = \lim_{n\to\infty}\|Ag_n\| = \|h\| = 0.$$
This contradiction proves estimate (8.308). We have proved that the operator $B$ is closed.

(2) Let us prove that $\sigma(B)$ is discrete. We have

$$(B - \lambda I)^{-1} = (A + T - \lambda I)^{-1} = (A + kI)^{-1}(I + Q - \mu S)^{-1}, \tag{8.313}$$
where

$$Q := TS, \qquad S = (A + kI)^{-1}, \tag{8.314}$$

$$\mu = \lambda + k, \qquad k \notin \sigma(A). \tag{8.315}$$


The operators S and Q are compact. If B + kI is injective then I + Q is injective. Since Q is compact this implies that I + Q is an isomorphism of H onto H. Therefore

$$(I + Q - \mu S)^{-1} = (I + Q)^{-1}(I - \mu K)^{-1}, \tag{8.316}$$
where

$$K := S(I + Q)^{-1} \quad \text{is compact.} \tag{8.317}$$

Therefore the set of $\mu$ for which the operator $B - \lambda I$ is not invertible is a discrete set, namely the set of the characteristic values of the compact operator $K$. Recall that $\mu_j$ is a characteristic value of $K$ if

$$\phi_j = \mu_j K\phi_j, \qquad \phi_j \neq 0. \tag{8.318}$$
Thus the set $\{\mu_j\}$ has its only possible limit point at infinity. Each $\mu_j$ is an isolated characteristic value of $K$ of finite algebraic multiplicity, and therefore $\lambda_j = \mu_j - k$ is an isolated eigenvalue of $B$ of finite algebraic multiplicity. Finally, the projection operator (8.261) corresponding to $\lambda_j$ is finite dimensional, so that $\lambda_j$ is a normal eigenvalue. We have proved that $\sigma(B)$ is discrete.

(3) Let us prove the last statement of the theorem, i.e., formula (8.301). We have

$$s_n(B) = s_n^{-1}(B^{-1}) = s_n^{-1}\left(A^{-1}(I + TA^{-1})^{-1}\right). \tag{8.319}$$
We can assume without loss of generality that $k = 0$. In this case the operator $I + TA^{-1}$ is invertible, and since $TA^{-1}$ is compact one can write $(I + TA^{-1})^{-1} = I + S$, where $S$ is a compact operator. The operator $A^{-1}$ is compact by the assumption. We can now apply Theorem 8.3 and obtain

$$\lim_{n\to\infty} \frac{s_n\left(A^{-1}(I + S)\right)}{s_n(A^{-1})} = 1. \tag{8.320}$$
This is equivalent to the desired result (8.301). $\Box$
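A matrix sketch of (8.301) (not from the book): $A$ is invertible with s-values $n^2$, and $T$ is a low-rank perturbation. For a rank-$3$ perturbation, singular-value interlacing gives $s_{n+3}(A) \le s_n(A + T) \le s_{n-3}(A)$, so the ratio $s_n(A + T)/s_n(A)$ approaches $1$ along the part of the spectrum where the s-values grow.

```python
import numpy as np

# Illustrative model: A = diag(n^2), T a random rank-3 matrix.
rng = np.random.default_rng(1)
N = 200
A = np.diag(np.arange(1, N + 1, dtype=float) ** 2)

U = rng.standard_normal((N, 3))
V = rng.standard_normal((N, 3))
T = U @ V.T                                   # rank-3 perturbation

s_A = np.sort(np.linalg.svd(A, compute_uv=False))       # ascending s-values
s_B = np.sort(np.linalg.svd(A + T, compute_uv=False))
ratio = s_B / s_A
print(ratio[150], ratio[190])
```

Interlacing bounds the ratio at index $n$ (ascending) between $((n-3)/n)^2$ and $((n+3)/n)^2$, which squeeze to $1$ for large $n$.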

8.3.2.9 Asymptotics of the spectrum for quadratic forms

In this section we study perturbations preserving spectral asymptotics for quadratic forms. As a motivation for this study let us consider the following classical problem.


Let $D \subset \mathbb{R}^r$ be a bounded domain with a smooth boundary $\Gamma$. Consider the problems

$$(-\Delta + 1)u_j = \lambda_j u_j \ \text{in } D, \qquad u_j = 0 \ \text{on } \Gamma, \tag{8.321}$$

$$(-\Delta + 1)u_j = \mu_j u_j \ \text{in } D, \qquad u_{jN} + \sigma u_j = 0 \ \text{on } \Gamma, \tag{8.322}$$
where $\sigma = \sigma(s) \in C^1(\Gamma)$ and $N$ is the outer normal to $\Gamma$. The question is: how does one see that
$$\lim_{n\to\infty} \frac{\mu_n}{\lambda_n} = 1\,? \tag{8.323}$$
The usual argument uses relatively complicated variational estimates. The eigenvalues $\lambda_n$ are minima of the ratio of the quadratic forms
$$\frac{\int_D [|\nabla u|^2 + |u|^2]\,dx}{\int_D |u|^2\,dx} = \min, \quad u \in \overset{\circ}{H}{}^1(D), \tag{8.324}$$
while the $\mu_n$ are minima of the ratio
$$\frac{\int_D [|\nabla u|^2 + |u|^2]\,dx + \int_\Gamma \sigma|u|^2\,ds}{\int_D |u|^2\,dx} = \min, \quad u \in H^1(D). \tag{8.325}$$
The desired conclusion (8.323) follows immediately from the abstract result we will prove and from the fact that the quadratic form $\int_\Gamma \sigma|u|^2\,ds$ is compact with respect to the quadratic form $\int_D [|\nabla u|^2 + |u|^2]\,dx$.

Let $A[u, v]$ and $T[u, v]$ be quadratic forms in a Hilbert space which are bounded from below, $T[u, u] \ge 0$ and $A[u, u] > m\|u\|^2$, $m > 0$. Assume that $D[A] \subset D[T]$, where $D[A]$ is the domain of definition of the form $A$, and that the form $A$ is closed and densely defined in $H$. The form $A$ is called closed if $D[A]$ is closed in the norm

$$\|u\|_A := \{A[u, u]\}^{1/2}. \tag{8.326}$$
If $A[u, u]$ is not positive definite but bounded from below, $A[u, u] \ge -m\|u\|^2$ for all $u \in D[A]$, then the norm $\|u\|_A$ is defined by
$$\|u\|_A = \{A[u, u] + (m + 1)(u, u)\}^{1/2}. \tag{8.327}$$
The following proposition is well known (see e.g. [Kato (1995)]).

Proposition 8.3 Every closed quadratic form $A[u, v]$ which is bounded from below is generated by a uniquely defined selfadjoint operator $A$.


This means that

$$A[u, v] = (Au, v) \quad \forall u \in D(A), \ v \in D[A],$$
and $D(A) \subset D[A] \subset H$ is dense in $D[A]$ in the norm (8.327). The spectrum of a closed quadratic form which is bounded from below is the spectrum of the corresponding selfadjoint operator $A$.

Definition 8.5 A quadratic form $T$ is called $A$-compact if from any sequence $f_n$ such that $\|f_n\|_A \le c$ one can select a subsequence $f_{n_k}$ such that $T[f_{n_k} - f_{n_m}, f_{n_k} - f_{n_m}] \to 0$ as $m, k \to \infty$.

Theorem 8.9 If $A[u, u]$ is a closed positive definite quadratic form in $H$ with discrete spectrum $\lambda_n(A)$, and $T[u, u]$ is a positive $A$-compact quadratic form, $D(A) \subset D(T)$, then the form $B[u, u] := A[u, u] + T[u, u]$, $D[B] = D[A]$, is closed, its spectrum is discrete, and
$$\lim_{n\to\infty} \frac{\lambda_n(B)}{\lambda_n(A)} = 1. \tag{8.328}$$
The conclusions of the theorem remain valid if $T[u, u]$ is not positive but $|T[u, u]| \le T_1[u, u]$ and $T_1$ is $A$-compact.

We need a couple of lemmas for the proof.

Lemma 8.21 Under the assumptions of Theorem 8.9 the quadratic form $T[u, u] > 0$ can be represented as

$$T[u, v] = [Tu, v], \tag{8.329}$$

where [u, v] is the inner product in HA := D[A] and T > 0 is a compact selfadjoint operator in HA.

Proof. Consider the quadratic form $T[u, v]$ in the Hilbert space $H_A$. Since $T[u, u]$ is $A$-compact, it is bounded in $H_A$. If $T[u, v]$ is not closed in $H_A$, consider its closure and denote it again by $T[u, v]$. By Proposition 8.3 there exists an operator $T > 0$, selfadjoint in $H_A$, such that (8.329) holds. Let us prove that $T$ is compact in $H_A$. Suppose $\|u_n\|_A \le c$. Since $T[u, u]$ is $A$-compact there exists a subsequence, which we denote $u_n$ again, such that

$$T[u_n - u_m, u_n - u_m] \to 0, \quad n, m \to \infty.$$
Thus
$$[T(u_n - u_m), u_n - u_m] \to 0, \quad n, m \to \infty. \tag{8.330}$$


Since $T > 0$ is selfadjoint, $T^{1/2}$ is well defined and (8.330) can be written as

$$\left\|T^{1/2}(u_n - u_m)\right\|_A \to 0, \quad n, m \to \infty. \tag{8.331}$$

This implies that $T^{1/2}$ is compact in $H_A$. Therefore $T$ is compact. Lemma 8.21 is proved. $\Box$

Lemma 8.22 Under the assumptions of Theorem 8.9 one has $H_B = H_A$.

Proof. It is sufficient to prove that

$$T[u, u] \le \epsilon A[u, u] + c(\epsilon)\|u\|^2 \quad \forall \epsilon > 0. \tag{8.332}$$
If (8.332) holds then

$$(1 - \epsilon)A[u, u] - c(\epsilon)\|u\|^2 \le B[u, u] \le (1 + \epsilon)A[u, u] + c(\epsilon)\|u\|^2 \le c_2(\epsilon)A[u, u],$$
so that the norm $\left\{B[u, u] + c(\epsilon)\|u\|^2\right\}^{1/2}$ is equivalent to the norm $\|u\|_A$. This means that $H_B = H_A$. The proof of (8.332) is the same as the corresponding argument in the proof of Lemma 8.19, used in the proof of Theorem 8.5. Lemma 8.22 is proved. $\Box$

Proof of Theorem 8.9. We need only prove formula (8.328) and the fact that $B$ has a discrete spectrum. The other conclusions of Theorem 8.9 have been proved in Lemmas 8.21 and 8.22. Since the form $B[u, u]$ is bounded from below in $H$, we may assume that it is positive definite. If not, we choose a constant $m$ such that $B_m[u, u] := B[u, u] + m(u, u)$ is positive definite. Since $\lambda_n(B_m) = \lambda_n(B) + m$ and since $\lambda_n(A) \to +\infty$, the equation
$$\lim_{n\to\infty} \frac{\lambda_n(B_m)}{\lambda_n(A)} = 1$$
is equivalent to (8.328). Note first that the spectrum of the form $B[u, u]$ is discrete. Indeed, the following known proposition (Rellich's lemma) implies this.

Proposition 8.4 Let $B[u, u]$ be a positive definite closed quadratic form in $H$. The spectrum of $B$ is discrete if and only if the imbedding operator $i: H_B \to H$ is compact.

For the convenience of the reader we prove Proposition 8.4 after we finish the proof of Theorem 8.9.


Returning to the proof of Theorem 8.9, we note that $A$ has a discrete spectrum by the assumption. Therefore $i: H_A \to H$ is compact. Since $H_A = H_B$, the imbedding $i: H_B \to H$ is compact. By Proposition 8.4 this implies that the spectrum of $B$ is discrete. To prove formula (8.328) we use the minimax principle:
$$\lambda_{n+1}(B) = \sup_{L_n}\inf_{u \perp L_n} \frac{B[u, u]}{(u, u)} \ge \inf_{u \perp \mathcal{L}_n(A)} \frac{A[u, u]}{(u, u)}\left(1 + \frac{T[u, u]}{A[u, u]}\right) \ge \lambda_{n+1}(A)\left(1 - \sup_{u \perp \mathcal{L}_n(A)} \frac{[Tu, u]}{\|u\|_A^2}\right) = \lambda_{n+1}(A)(1 - \gamma_n), \quad \gamma_n \to 0, \ n \to \infty. \tag{8.333}$$
Here we used Theorem 8.2 and denoted by $\mathcal{L}_n(A)$ the linear span of the first $n$ eigenvectors of the operator $A$ generated by the quadratic form $A[u, u]$. Interchanging $A$ and $B$ we get

$$\lambda_{n+1}(A) \ge \lambda_{n+1}(B)(1 - \delta_n), \qquad \delta_n \to 0, \ n \to \infty. \tag{8.334}$$
From (8.333) and (8.334) formula (8.328) follows. The last statement of Theorem 8.9 follows from the following Proposition 8.5.

Proposition 8.5 If $|T[u, u]| \le T_1[u, u]$ and $T_1$ is $A$-compact, then the operator $A^{-1}T$ is compact in $H_A$.

We will prove this proposition after the proof of Proposition 8.4. Proposition 8.5 granted, the proof of the last statement of Theorem 8.9 is quite similar to the one given above and is left to the reader. Theorem 8.9 is proved. $\Box$

Proof of Proposition 8.4. Assume that the spectrum of $B[u, u]$ is discrete. Then the corresponding selfadjoint operator $B$ has only isolated eigenvalues $0 < m \le \lambda_n(B) \to +\infty$. Therefore the operator $B^{-1}$ is compact in $H$. This implies that $B^{-1/2}$ is compact in $H$. Assume that $\|u_n\|_B \le c$, that is, $\|B^{1/2}u_n\| \le c$. Then the sequence $u_n = B^{-1/2}B^{1/2}u_n$ contains a convergent in $H$ subsequence. Thus, the imbedding $i: H_B \to H$ is compact.

Conversely, suppose $i: H_B \to H$ is compact. Then any sequence $u_n$ such that $\|u_n\|_B = \|B^{1/2}u_n\| < c$ contains a convergent subsequence. This


means that $B^{-1/2}$ is compact. Since $B^{-1/2}$ is selfadjoint, it follows that $B^{-1}$ is compact. Since $B \ge m > 0$, this implies that the spectrum of $B$ is discrete. Proposition 8.4 is proved. $\Box$

Proof of Proposition 8.5. Denote $Q := A^{-1}T$, $Q_1 := A^{-1}T_1$. Then

$$|[Qu, u]| \le [Q_1 u, u], \tag{8.335}$$

where the brackets denote the inner product in HA. The operator Q1 is nonnegative and compact in HA. Indeed,

$$[Q_1 u, u] = (T_1 u, u) \ge 0,$$
so $Q_1 \ge 0$ in $H_A$. Suppose $\|u_n\|_A \le c$. Then $u_n$ contains a subsequence which is Cauchy in $H_{T_1}$, that is,

$$(T_1 u_{n_m} - T_1 u_{n_k}, u_{n_m} - u_{n_k}) \to 0, \quad m, k \to \infty. \tag{8.336}$$
Thus

$$[Q_1(u_{n_m} - u_{n_k}), u_{n_m} - u_{n_k}] \to 0, \quad m, k \to \infty. \tag{8.337}$$
Since $Q_1 \ge 0$, equation (8.337) implies that $Q_1^{1/2}$ is compact in $H_A$. Therefore $Q_1$ is compact in $H_A$. Conversely, if $Q_1$ is compact in $H_A$ then $T_1 \ge 0$ is $A$-compact. Indeed, if $\|u_n\|_A \le c$ then $T_1 u_n = AQ_1 u_n$, so that there is a subsequence $u_{n_m}$ such that

$$(T_1(u_{n_m} - u_{n_k}), u_{n_m} - u_{n_k}) = [Q_1(u_{n_m} - u_{n_k}), u_{n_m} - u_{n_k}] \to 0$$
as $m, k \to \infty$, because $Q_1$ is compact in $H_A$.

So we have proved that

$$T_1 \ge 0 \ \text{is } A\text{-compact if and only if } A^{-1}T_1 \ \text{is compact in } H_A. \tag{8.338}$$

Let us now prove that $Q$ is compact in $H_A$; this is the conclusion of Proposition 8.5. According to Section 8.3.1.1 it is sufficient to prove that if $f_n \rightharpoonup 0$ and $g_n \rightharpoonup 0$ in $H_A$ then

$$[Qf_n, g_n] \to 0, \quad n \to \infty. \tag{8.339}$$
Indeed, if

$$\{f_n \rightharpoonup 0 \ \text{and} \ g_n \rightharpoonup 0\} \Rightarrow [Qf_n, g_n] \to 0, \tag{8.340}$$


then

$$\{f_n \rightharpoonup f \ \text{and} \ g_n \rightharpoonup g\} \Rightarrow [Qf_n, g_n] \to [Qf, g], \tag{8.341}$$
that is, $Q$ is compact. To check that (8.340) implies (8.341) one writes

$$[Qf_n, g_n] = [Q(f_n - f), g_n] + [Qf, g_n] = [Q(f_n - f), g_n - g] + [Q(f_n - f), g] + [Qf, g_n - g] + [Qf, g] = [Q(f_n - f), g_n - g] + [f_n - f, Q^*g] + [Qf, g_n - g] + [Qf, g]. \tag{8.342}$$

It follows from (8.342) that (8.340) implies (8.341). Let us check that (8.335) implies (8.340). One uses the well known polarization identity
$$[Qf, g] = \frac{1}{4}\left\{[Q(f+g), f+g] - [Q(f-g), f-g] - i[Q(f+ig), f+ig] + i[Q(f-ig), f-ig]\right\}. \tag{8.343}$$
It is clear from (8.343) and (8.335) that
$$|[Qf_n, g_n]| \le \frac{1}{4}\big\{|[Q_1(f_n+g_n), f_n+g_n]| + |[Q_1(f_n-g_n), f_n-g_n]| + |[Q_1(f_n+ig_n), f_n+ig_n]| + |[Q_1(f_n-ig_n), f_n-ig_n]|\big\} \to 0, \quad n \to \infty. \tag{8.344}$$

The last conclusion follows from the assumed compactness of Q1 and the fact that if fn * 0 and gn * 0 then any linear combination c1fn + c2gn converges weakly to zero. Proposition 8.5 is proved. 

Example 8.2 It is now easy to see that (8.323) holds. Indeed, the imbedding $i: H^1(D) \to L^2(\Gamma, |\sigma|)$ is compact. Therefore the quadratic form $\int_\Gamma \sigma|u|^2\,ds$ is $A$-compact, where $A[u, u] = \int_D \left(|\nabla u|^2 + |u|^2\right)dx$. From this and Theorem 8.9 formula (8.323) follows. $\Box$
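As a hypothetical one-dimensional illustration of (8.321)-(8.323) (not from the book), one can discretize $-u'' + u$ on $(0, 1)$ with Dirichlet conditions and with the Robin condition $u_N + \sigma u = 0$. The Robin condition is imposed by eliminating ghost values, $u_0 = u_1/(1 + \sigma h)$ and $u_N = u_{N-1}/(1 + \sigma h)$, which perturbs the Dirichlet matrix only in its two corner entries by a nonpositive diagonal amount, so $\mu_n \le \lambda_n$ and, by rank-two interlacing, $\mu_n/\lambda_n \to 1$.

```python
import numpy as np

# Dirichlet vs. Robin eigenvalues of -u'' + u on (0, 1), finite differences.
N = 200
h = 1.0 / N
sigma = 1.0

main = 2.0 / h**2 * np.ones(N - 1) + 1.0          # interior nodes 1..N-1
off = -1.0 / h**2 * np.ones(N - 2)
A = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)     # Dirichlet matrix

B = A.copy()
corr = (1.0 / (1.0 + sigma * h)) / h**2            # ghost-point elimination
B[0, 0] -= corr
B[-1, -1] -= corr                                  # Robin matrix

lam = np.linalg.eigvalsh(A)    # Dirichlet eigenvalues, ascending
mu = np.linalg.eigvalsh(B)     # Robin eigenvalues, ascending
print(mu[0] / lam[0], mu[59] / lam[59])
```

The first ratio is far from $1$ (the Robin ground state is much lower), while further along the spectrum the ratio is already close to $1$, as (8.323) predicts.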

8.3.2.10 Proof of Theorem 2.3

In this section we prove Theorem 2.3. First let us note that if

$$R(x, y) = \int_{-\infty}^{\infty} \omega(\lambda)\Phi(x, y, \lambda)\,d\rho(\lambda), \tag{8.345}$$


where $\omega(\lambda) \in C(\mathbb{R}^1)$, $\omega(\infty) = 0$, then the operator $R: L^2(D) \to L^2(D)$, where $D \subset \mathbb{R}^r$ is a bounded domain and
$$Rh := \int_D R(x, y)h(y)\,dy, \tag{8.346}$$
is compact. This is proved in Section 4.2 (cf. the argument after formula (4.71)). If $\omega(\lambda) \ge 0$ then $R = R^* \ge 0$. Suppose that
$$\omega_1(\lambda) = \omega(\lambda)[1 + \phi(\lambda)], \qquad \phi(\infty) = 0, \tag{8.347}$$
where $\phi(\lambda) \in C(\mathbb{R}^1)$, $1 + \phi(\lambda) > 0$. Then the corresponding operator $R_1$ can be written as

$$R_1 = R(I + Q), \tag{8.348}$$

where Q is a compact operator with the kernel

$$Q(x, y) = \int_{-\infty}^{\infty} \phi(\lambda)\Phi(x, y, \lambda)\,d\rho(\lambda). \tag{8.349}$$
By Theorem 8.3 one has
$$\lim_{n\to\infty} \frac{s_n(R_1)}{s_n(R)} = 1. \tag{8.350}$$
Note that the operator $I + Q$ is injective since $1 + \phi(\lambda) > 0$. Since $R_1 \ge 0$ and $R \ge 0$, one has $s_n(R_1) = \lambda_n(R_1)$, $s_n(R) = \lambda_n(R)$. Therefore (8.350) can be written as
$$\lim_{n\to\infty} \frac{\lambda_n(R_1)}{\lambda_n(R)} = 1. \tag{8.351}$$
Therefore it is sufficient to prove formula (2.31) for $\omega(\lambda) = (1 + \lambda^2)^{-a/2}$.

Secondly, let us note that if one defines

$$N(\lambda) := \sum_{\lambda_n \le \lambda} 1, \tag{8.352}$$
then the formulas

$$\lambda_n = cn^p[1 + o(1)] \quad \text{as } n \to +\infty, \qquad c = \text{const} > 0, \ p > 0, \tag{8.353}$$
and

$$N(\lambda) = c^{-1/p}\lambda^{1/p}[1 + o(1)], \quad \lambda \to +\infty, \ p > 0, \tag{8.354}$$


are equivalent. This follows from the fact that the function $N(\lambda)$ is the inverse function of $\lambda(N) := \lambda_N$ in the sense that $N(\lambda_N) = N$ and $\lambda(N(\lambda)) = \lambda_N$. Therefore if one knows the asymptotics of $N(\lambda)$ then one knows the asymptotics of $\lambda_n$ and vice versa. In [Ramm (1975), p. 339] it is proved that if $o(1)$ in (8.353) is $O(n^{-p_1})$, $p_1 > 0$, then $o(1)$ in (8.354) is $O\left(\lambda^{-p_1/p}\right)$.

Thirdly, let us recall the well known fact that an elliptic operator $\mathcal{L}$ of order $s$ with smooth coefficients and regular boundary conditions in a bounded domain $D \subset \mathbb{R}^r$ with a smooth boundary (these are the regularity assumptions) has a discrete spectrum and

$$N(\lambda, \mathcal{L}) = \gamma\lambda^{r/s}[1 + o(1)], \quad \lambda \to +\infty, \tag{8.355}$$
where $N(\lambda, \mathcal{L})$ is defined by (8.352) with $\lambda_n = \lambda_n(\mathcal{L})$, and $\gamma = \text{const} > 0$ is defined by formula (2.32). By formula (8.353) one obtains

$$\lambda_n(\mathcal{L}) = \gamma^{-s/r}n^{s/r}[1 + o(1)], \quad n \to +\infty. \tag{8.356}$$
The operator $R$ is a rational function of $\mathcal{L}$, so that by the spectral mapping theorem one obtains

$$\lambda_n(R) = \lambda_n^{-a}(\mathcal{L})[1 + o(1)] = \gamma^{as/r}n^{-as/r}[1 + o(1)], \quad n \to \infty. \tag{8.357}$$
This is formula (2.31).

For the function $\omega(\lambda) = (1 + \lambda^2)^{-a/2}$ and even $a$, a proof of formula (2.31) is given in [Ramm (1980), p. 62]. This proof goes as follows. The problem

$$R\phi_n := \int_D R(x, y)\phi_n(y)\,dy = \lambda_n\phi_n(x), \quad x \in D, \tag{8.358}$$
is equivalent to the problem

$$\int_{\mathbb{R}^r} R(x, y)\phi_n(y)\,dy = \begin{cases} \lambda_n\phi_n(x), & x \in D, \\ u_n(x), & x \in \Omega, \end{cases} \tag{8.359}$$
where

$$\phi_n(x) := 0 \ \text{in } \Omega, \tag{8.360}$$

$$Q(\mathcal{L})u_n = 0 \ \text{in } \Omega, \tag{8.361}$$


$$u_n(\infty) = 0, \qquad \partial_N^j u_n = \lambda_n\partial_N^j\phi_n \ \text{on } \Gamma, \quad 0 \le j \le \frac{as}{2} - 1, \tag{8.362}$$
where $Q(\lambda) := (1 + \lambda^2)^{a/2}$. The equivalence means that every solution to (8.358) generates the solution to (8.359)-(8.362) and vice versa. The problem (8.359)-(8.362) can be written as

$$\lambda_n Q(\mathcal{L})\phi_n = \chi_D(x)\phi_n(x) \ \text{in } \mathbb{R}^r, \tag{8.363}$$
where

$$\chi_D(x) = \begin{cases} 1, & x \in D, \\ 0, & x \in \Omega. \end{cases}$$
This problem has been studied in [Tulovskii (1979)] and formula (8.357) has been established.

For the general case of $\omega(\lambda) = (1 + \lambda^2)^{-a/2}$, $a > 0$, one can use results from the spectral theory of elliptic pseudo-differential operators. Under suitable regularity assumptions the following formula for the number $N(\lambda) := \#\{\lambda_n : \lambda_n \le \lambda\}$ of eigenvalues of such an operator $R$ in a bounded domain $D \subset \mathbb{R}^r$ is valid:
$$N(\lambda) = (2\pi)^{-r}\,\text{meas}\{(x, \xi) \in D \times \mathbb{R}^r : r(x, \xi) < \lambda\}\,[1 + o(1)], \quad \lambda \to +\infty. \tag{8.364}$$
Here meas is the Lebesgue measure, and $r(x, \xi)$ is the symbol of the pseudo-differential operator $R$. This means that

$$Rh := (2\pi)^{-r}\iint \exp\{i(x - y)\cdot\xi\}\,r(x, \xi)h(y)\,dy\,d\xi, \qquad \int := \int_{\mathbb{R}^r}. \tag{8.365}$$
The symbol of the elliptic operator (2.5) is $\sum_{|j| \le s} a_j(x)(i\xi)^j$. Only the principal symbol, that is, $\sum_{|j| = s} a_j(x)\xi^j$, defines the main term of the asymptotics of $N(\lambda)$. Since $s = \operatorname{ord}\mathcal{L}$ is even, one chooses $\mathcal{L}$ so that $\mathcal{L}_0(x, \xi) := \sum_{|j| = s} a_j(x)\xi^j > 0$ for $\xi \neq 0$. For example, one chooses $\mathcal{L} = -\Delta$ rather than $\Delta$. In this case
$$(2\pi)^{-r}\,\text{meas}\{(x, \xi) \in D \times \mathbb{R}^r : \mathcal{L}_0(x, \xi) < \lambda\} = \lambda^{r/s}(2\pi)^{-r}\int_D \eta\,dx = \gamma\lambda^{r/s}, \tag{8.366}$$
where $\eta$ is given by (2.33) for the operator $\mathcal{L}_0$ in the selfadjoint form. The asymptotic behavior of the function $N(\lambda)$ has been studied extensively for wide classes of differential and pseudo-differential operators (see,


e.g., [Levitan (1971); Hörmander (1983-85); Safarov and Vassiliev (1997); Shubin (1986)] and references therein).
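Two facts used in the proof above can be illustrated numerically; this is an illustrative sketch, not part of the book's argument. First, the equivalence of (8.353) and (8.354) for an exact power law $\lambda_n = cn^p$; second, the spectral-mapping asymptotics (8.357) in the model case $\mathcal{L} = -d^2/dx^2$ on $(0, \pi)$ with Dirichlet conditions, where $\lambda_n(\mathcal{L}) = n^2$ exactly ($r = 1$, $s = 2$).

```python
import numpy as np

# (i) Equivalence of (8.353) and (8.354): for lambda_n = c*n^p the counting
#     function N(lambda) agrees with c^{-1/p} * lambda^{1/p}.
c, p = 2.0, 1.5
n = np.arange(1, 10001, dtype=float)
lam = c * n ** p

counts, preds = {}, {}
for K in (10, 100, 5000):
    level = lam[K - 1]                            # level = lambda_K
    counts[K] = int(np.count_nonzero(lam <= level))
    preds[K] = c ** (-1.0 / p) * level ** (1.0 / p)

# (ii) Spectral mapping (8.357): with omega(lambda) = (1 + lambda^2)^{-a/2}
#      and lambda_n(L) = n^2, one gets lambda_n(R) = (1 + n^4)^{-a/2},
#      so n^{2a} * lambda_n(R) -> 1.
a = 1.0
m = np.arange(1, 2001, dtype=float)
lam_R = (1.0 + m ** 4) ** (-a / 2.0)
scaled = m ** (2 * a) * lam_R                     # tends to 1
print(counts, scaled[-1])
```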

8.3.3 Trace class and Hilbert-Schmidt operators

In this section we summarize, for the convenience of the reader, some results on Hilbert-Schmidt and trace-class operators. This material can be used for reference. One writes $A \in \sigma_p$, $1 \le p < \infty$, if

$$\sum_{j=1}^{\infty} s_j^p(A) < \infty.$$

8.3.3.1 Trace class operators

The operators $A \in \sigma_1$ are called trace class (or nuclear) operators. The operators $A \in \sigma_2$ are called Hilbert-Schmidt (HS) operators. The class $\sigma_\infty$ denotes the class of compact operators. If $A \in \sigma_p$ then $A \in \sigma_q$ for $q \ge p$. We summarize some basic known results about trace class and HS operators.

Lemma 8.23 $A \in \sigma_1$ if and only if the sum $\operatorname{Tr}A := \sum_{j=1}^{\infty}(A\phi_j, \phi_j)$ is finite for any orthonormal basis of $H$; it does not depend on the choice of the basis. In fact $\operatorname{Tr}A = \sum_{j=1}^{\nu(A)} \lambda_j(A)$, where $\nu(A)$ is the sum of algebraic multiplicities of the eigenvalues of $A$.

The following properties of the trace are useful.

Lemma 8.24 If $A_j \in \sigma_1$, $j = 1, 2$, then

1) $\operatorname{Tr}(c_1A_1 + c_2A_2) = c_1\operatorname{Tr}A_1 + c_2\operatorname{Tr}A_2$, $c_j = \text{const}$, $j = 1, 2$;
2) $\operatorname{Tr}A^* = \overline{\operatorname{Tr}A}$, where the bar stands for complex conjugation;
3) $\operatorname{Tr}(A_1A_2) = \operatorname{Tr}(A_2A_1)$;
4) $\operatorname{Tr}(B^{-1}AB) = \operatorname{Tr}A$, where $B$ is a linear isomorphism of $H$ onto $H$;
5) $\operatorname{Tr}(A_1A_2) \ge 0$ if $A_1 \ge 0$ and $A_2 \ge 0$;
6) $\operatorname{Tr}(A_1A_2)^{1/2} \le \dfrac{\operatorname{Tr}A_1 + \operatorname{Tr}A_2}{2}$ if $A_1 \ge 0$ and $A_2 \ge 0$;
7) $|\operatorname{Tr}A| \le \sum_{j=1}^{\infty} s_j(A) := \|A\|_1$;
8) $\|A\|_1 = \sup\sum_{j=1}^{\infty} |(Af_j, h_j)|$, where the sup is taken over all orthonormal bases $\{f_j\}$ and $\{h_j\}$ of $H$;
9) $\|c_1A_1 + c_2A_2\|_1 \le |c_1|\,\|A_1\|_1 + |c_2|\,\|A_2\|_1$;


10) Assume that $A_j \in \sigma_1$, $1 \le j < \infty$, $\|A_j\|_1 \le c$, where $c$ does not depend on $j$, and $A_j \rightharpoonup A$. Then $A \in \sigma_1$ and $\|A\|_1 \le \sup_j\|A_j\|_1$; the symbol $\rightharpoonup$ denotes weak convergence of operators;
11) if $A \in \sigma_1$ and $B$ is a bounded linear operator, then
$$\|AB\|_1 \le \|A\|_1\|B\|, \qquad \|BA\|_1 \le \|A\|_1\|B\|.$$

8.3.3.2 Hilbert-Schmidt operators

The operators $A \in \sigma_2$ are called Hilbert-Schmidt (HS) operators.

Lemma 8.25 $A \in \sigma_2$ if and only if the sum $\|A\|_2^2 := \sum_{j=1}^{\infty}\|A\phi_j\|^2$ is finite for any orthonormal basis $\{\phi_j\}$ of $H$. If $A \in \sigma_2$ then
$$\|A\|_2^2 = \sum_{j=1}^{\infty} s_j^2(A).$$
If $c_j = \text{const}$ and $A_j \in \sigma_2$ then
$$\|c_1A_1 + c_2A_2\|_2 \le |c_1|\,\|A_1\|_2 + |c_2|\,\|A_2\|_2.$$
If $A \in \sigma_2$ and $B$ is a bounded linear operator, then
$$\|A\|_2 = \|A^*\|_2, \qquad \|AB\|_2 \le \|A\|_2\|B\|. \tag{8.367}$$

Lemma 8.26 $A \in \sigma_2$ if and only if there exists an orthonormal basis $\{\phi_j\}$ of $H$ for which
$$\sum_{j=1}^{\infty}\|A\phi_j\|^2 < \infty. \tag{8.368}$$
In particular, if (8.368) holds for one orthonormal basis of $H$, then it holds for every orthonormal basis of $H$.

Lemma 8.27 $A \in \sigma_1$ if and only if there exists an orthonormal basis of $H$ for which

$$\sum_{j=1}^{\infty}\|A\phi_j\| < \infty. \tag{8.369}$$
However, if (8.369) holds for one orthonormal basis of $H$, it may not hold for another orthonormal basis of $H$.


Example 8.3 Let $H = \ell^2$, $f = c\left(1, \frac{1}{2}, \ldots, \frac{1}{n}, \ldots\right)$, where $c = \text{const} > 0$ is chosen so that $\|f\| = 1$. Let $A$ be the orthogonal projection onto the one-dimensional subspace spanned by $f$, and let $\{\phi_j\}$, $\phi_{ji} = \delta_{ij}$, be an orthonormal basis of $\ell^2$. Then $A\phi_j = \frac{c}{j}f$, so $\sum_{j=1}^{\infty}\|A\phi_j\| = \sum_{j=1}^{\infty}\frac{c}{j} = \infty$.

Lemma 8.28 $A \in \sigma_1$ if and only if it can be represented in the form $A = A_1A_2$, where $A_j \in \sigma_2$, $j = 1, 2$.

Lemma 8.29 The classes $\sigma_1$ and $\sigma_2$ are ideals in the algebra $\mathcal{L}(H)$ of all linear bounded operators on $H$. If $H$ is a separable Hilbert space, then the set $\sigma_\infty$ of all compact operators on $H$ is the only closed proper non-zero ideal in $\mathcal{L}(H)$. An ideal is called proper if it is not $\mathcal{L}(H)$ itself. The closedness is understood as closedness in the norm of linear operators on $H$.
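For finite matrices every operator is trace class and Hilbert-Schmidt, so the identities quoted above can be checked directly; a small numerical sketch (not from the book):

```python
import numpy as np

# Checks of the trace and HS facts: Tr A = sum of eigenvalues (Lemma 8.23),
# Tr(A A2) = Tr(A2 A), Tr(B^{-1} A B) = Tr A (Lemma 8.24, items 3-4), and
# ||A||_2^2 = sum_j s_j(A)^2 = sum_j ||A phi_j||^2 for any orthonormal basis
# (Lemma 8.25): here the standard basis and a random orthonormal basis Q.
rng = np.random.default_rng(2)
n = 6
A = rng.standard_normal((n, n))
A2 = rng.standard_normal((n, n))
Binv = rng.standard_normal((n, n)) + n * np.eye(n)   # well-conditioned B

tr = np.trace(A)
s = np.linalg.svd(A, compute_uv=False)
hs2 = np.sum(s ** 2)                                 # ||A||_2^2 via s-values
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))     # random orthonormal basis
print(tr, hs2)
```

In the test below, `np.sum(A ** 2)` is $\sum_j \|A e_j\|^2$ for the standard basis, and `np.sum((A @ Q) ** 2)` is the same sum over the columns of the orthogonal matrix $Q$.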

8.3.3.3 Determinants of operators

Definition 8.6 If $A \in \sigma_1$ then
$$d(\mu) := \det(I - \mu A) := \prod_{j=1}^{\nu(A)}[1 - \mu\lambda_j(A)].$$
One has

1) $|d(\mu)| \le \exp(|\mu|\,\|A\|_1)$;
2) $d(\mu) = \exp\left\{-\int_0^\mu \operatorname{Tr}\left[A(I - \lambda A)^{-1}\right]d\lambda\right\}$, if the operator $I - \lambda A$, $0 \le \lambda \le \mu$, is invertible;
3) $\det(I - A) = \lim_{n\to\infty}\det\left[\delta_{ij} - (A\phi_i, \phi_j)\right]_{i,j=1,\ldots,n}$, where $\{\phi_j\}$ is an arbitrary orthonormal basis of $H$;
4) $\det(I - AB) = \det(I - BA)$ if $AB \in \sigma_1$, $BA \in \sigma_1$, $A \in \sigma_\infty$, $B \in \mathcal{L}(H)$;
5) $\det[(I - A)(I - B)] = \det[(I - B)(I - A)]$, $A, B \in \sigma_1$;
6) if $A(z) \in \sigma_1$ is an analytic operator function in a domain $\Delta$ of the complex plane, then $d(1, z) := \det(I - A(z))$ is analytic in $\Delta$; here $d(\mu, z) := \det(I - \mu A(z))$;
7) $\dfrac{d}{dz}\operatorname{Tr}F(A(z)) = \operatorname{Tr}\left\{F'(A(z))\dfrac{dA(z)}{dz}\right\}$, where $F(\lambda)$ is holomorphic in a domain which contains the spectrum of $A(z)$ for all $z \in \Delta$, and $F(0) = 0$;
8) $\det(I + A) = \exp\{\operatorname{Tr}\log(I + A)\}$, $A \in \sigma_1$, where $\log(I + A)$ can be defined by analytic continuation of $\log(I + zA)$. This function is well defined for $|z|\,\|A\| < 1$.
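For matrices, Definition 8.6 and property 4) can be verified directly; an illustrative check (not from the book):

```python
import numpy as np

# det(I - mu*A) equals the product of (1 - mu*lambda_j(A)) over the
# eigenvalues, and det(I - AB) = det(I - BA).
rng = np.random.default_rng(3)
n = 5
A = rng.standard_normal((n, n))
Bm = rng.standard_normal((n, n))
mu = 0.3

lhs = np.linalg.det(np.eye(n) - mu * A)
rhs = np.prod(1.0 - mu * np.linalg.eigvals(A))   # complex product, real up
                                                 # to rounding (conjugate pairs)
d_AB = np.linalg.det(np.eye(n) - A @ Bm)
d_BA = np.linalg.det(np.eye(n) - Bm @ A)
print(lhs, rhs.real)
```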

300 Random Fields Estimation Theory

If $A \in \sigma_2$, then the series $\sum_{j=1}^{\infty} |\lambda_j(A)|$ may diverge and Definition 8.6 is not applicable. One gives

Definition 8.7
$$d_2(\mu) := \det{}_2(I - \mu A) := \prod_{j=1}^{\nu(A)} \left\{[1 - \mu\lambda_j(A)]\exp[\mu\lambda_j(A)]\right\}.$$
One has:

9) $|d_2(\mu)| \le \exp\left\{\frac{|\mu|^2}{2}\operatorname{Tr}(A^*A)\right\}$.
10) $d_2(1) = \lim_{n\to\infty} \det\left[\delta_{ij} - (A\varphi_i, \varphi_j)\right]_{1\le i,j\le n}\exp\left[\sum_{j=1}^{n}(A\varphi_j, \varphi_j)\right]$, where $\{\varphi_j\}$ is an arbitrary orthonormal basis of $H$.
11) if $A, B \in \sigma_2$ and $I - C = (I - A)(I - B)$, then
$$\det{}_2(I - C)\exp[\operatorname{Tr}(AB)] = \det{}_2(I - A)\,\det{}_2(I - B).$$
If $B \in \sigma_1$, then
12) $\det{}_2(I - C) = \det{}_2(I - A)\,\det(I - B)\exp\{\operatorname{Tr}[(I - A)B]\}$.

Definition 8.8 If $A \in \sigma_p$, then
13)
$$d_p(\mu) := \det{}_p(I - \mu A) := \prod_{j=1}^{\nu(A)}\left\{[1 - \mu\lambda_j(A)]\exp\left[\sum_{m=1}^{p-1}\frac{\mu^m}{m}\lambda_j^m(A)\right]\right\}.$$

Carleman's inequality: If $A \in \sigma_2$, $\lambda_j$ are the eigenvalues of $A$ counted according to their multiplicities, $|\lambda_1| \ge |\lambda_2| \ge \cdots$, and $\varphi_\lambda(A) := \prod_{j=1}^{\infty}(1 - \lambda_j\lambda^{-1})\exp(\lambda_j\lambda^{-1})$, then
14) $\|\varphi_\lambda(A)(A - \lambda I)^{-1}\| \le |\lambda|^{-1}\exp\left[\frac12\left(1 + |\lambda|^{-2}\|A\|_2^2\right)\right]$.

8.4 Elements of probability theory

8.4.1 The probability space and basic definitions

A probability space is a triple $\{\Omega, U, P\}$, where $\Omega$ is a set, $U$ is a sigma-algebra of its subsets, and $P$ is a measure on $U$ such that $P(\Omega) = 1$; that is, it is a measure space $\{\Omega, U\}$ equipped with a normalized countably additive


measure

$$P\left(\bigcup_{j=1}^{\infty} A_j\right) = \sum_{j=1}^{\infty} P(A_j), \qquad A_j \cap A_m = \emptyset \ \text{ for } j \ne m.$$
A random variable $\xi$ is a $U$-measurable function on $\Omega$, that is, a $U$-measurable map $\Omega \to \mathbb{R}^1$. A random vector $\xi$ is a $U$-measurable map $\Omega \to \mathbb{R}^r$. A distribution function of $\xi$ is $F(x) = P(\xi < x)$. It has the properties:
$$F(-\infty) = 0, \qquad F(+\infty) = 1, \qquad F \ \text{is nondecreasing},$$
$$F(x + 0) - F(x) = P(\xi = x), \qquad F(x - 0) = F(x).$$
The probability density $f(x) := F'(x)$ is defined in the classical sense if $F(x)$ is absolutely continuous, so that $F(x) = \int_{-\infty}^{x} f(t)\,dt$. If $\xi$ is a discrete random variable, that is, $\xi$ takes values in a discrete set of points, then its distribution function is $F(x) = \sum_i P_i\,\theta(x - x_i)$, where
$$\theta(x) := \begin{cases} 1, & x > 0 \\ 0, & x \le 0, \end{cases} \qquad P_i > 0, \qquad \sum_i P_i = 1.$$
The probability density for this distribution is $f(x) = \sum_i P_i\,\delta(x - x_i)$, where $\delta(x)$ is the delta-function. The probability density has the properties:
1) $f \ge 0$,
2) $\int_{-\infty}^{\infty} f\,dt = 1$.
A random vector $\xi = (\xi_1, \dots, \xi_r)$ has a distribution function

F (x1, . . ., xr) := P (ξ1 < x1, . . ., ξr < xr).

This function has characteristic properties

1) $F(+\infty, \dots, +\infty) = 1$,
2) $F(x_1, \dots, x_m = -\infty, \dots, x_r) = 0$ for any $1 \le m \le r$,
3) $F(x_1, \dots, x_m = +\infty, \dots, x_r) = F(x_1, \dots, x_{m-1}, x_{m+1}, \dots, x_r)$,
4) $F$ is continuous from the left, i.e. $F(x_1, \dots, x_m - 0, \dots, x_r) = F(x_1, \dots, x_m, \dots, x_r)$, and nondecreasing in each of the variables $x_1, \dots, x_r$.

The probability density f(x1, . . ., xr) is defined by the formula

$$f(x_1, \dots, x_r) = \frac{\partial^r F(x_1, \dots, x_r)}{\partial x_1 \cdots \partial x_r},$$


so that

$$F(x_1, \dots, x_r) = \int_{-\infty}^{x_1} dx_1 \cdots \int_{-\infty}^{x_r} dx_r\, f(x_1, \dots, x_r).$$
This formula holds if the measure defined by the distribution function $F$ is absolutely continuous with respect to the Lebesgue measure in $\mathbb{R}^r$, i.e. if $P(\xi \in \Delta) = 0$ for any $\Delta \subset \mathbb{R}^r$ such that $\operatorname{meas}\Delta = 0$, where $\operatorname{meas}$ is the Lebesgue measure in $\mathbb{R}^r$.

Example 8.4 A random vector $\xi$ is said to be uniformly distributed in a set $\Delta \subset \mathbb{R}^r$ if its probability density is
$$f(x) = \begin{cases} 0, & x \notin \Delta \\[2pt] \dfrac{1}{\operatorname{meas}\Delta}, & x \in \Delta. \end{cases} \tag{8.370}$$
Example 8.5 A random vector $\xi$ is said to be Gaussian (or normally distributed, or normal) if its probability density is

$$f(x) = (2\pi)^{-r/2}\,[\det C]^{1/2}\exp\left\{-\frac12\sum_{i,j=1}^{r} c_{ij}(x_i - m_i)(x_j - m_j)\right\}. \tag{8.371}$$

Here $C = (c_{ij})$ is a positive definite matrix, $M[\xi] = \bar{\xi} = m = (m_1, \dots, m_r)$, and the matrix $C^{-1}$ is the covariance matrix of $\xi$:

$$C^{-1} := \left(c^{(-1)}_{ij}\right) = \left(\overline{(\xi_i - m_i)(\xi_j - m_j)}\right), \qquad \bar{\xi}_i = m_i, \tag{8.372}$$
where all the quantities are real-valued, and the bar denotes the mean value, defined as follows:

$$\bar{\xi}_i := \int x_i\,dF, \qquad \int := \int_{\mathbb{R}^r}.$$
If $g(x_1, \dots, x_r)$ is a measurable function defined on $\mathbb{R}^r$, $g : \mathbb{R}^r \to \mathbb{R}^1$, then $\eta = g(\xi_1, \dots, \xi_r)$ is a random variable with the distribution function

$$F_\eta(x) := P(\eta < x) = \int_{g(x_1,\dots,x_r) < x} dF(x_1, \dots, x_r),$$

where F (x1, . . ., xr) is the distribution function for the vector (ξ1, . . ., ξr). One has

$$\overline{g(\xi_1, \dots, \xi_r)} = \int g(x_1, \dots, x_r)\,dF(x_1, \dots, x_r).$$
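In Example 8.5 the matrix $C$ in (8.371) is the *precision* matrix: by (8.372) the covariance is $C^{-1}$. A quick Monte Carlo sketch (with an arbitrary illustrative $C$ and mean $m$) confirms this relationship:

```python
import numpy as np

# Sketch of Example 8.5: samples drawn with covariance C^{-1} should
# reproduce the covariance matrix C^{-1} of (8.372) empirically.
# C and m below are arbitrary illustrative choices.
rng = np.random.default_rng(1)

C = np.array([[2.0, 0.5],
              [0.5, 1.0]])        # positive definite precision matrix
m = np.array([1.0, -2.0])         # mean vector
cov = np.linalg.inv(C)            # covariance matrix C^{-1}, as in (8.372)

x = rng.multivariate_normal(m, cov, size=200_000)
emp_mean = x.mean(axis=0)
emp_cov = np.cov(x.T)

assert np.allclose(emp_mean, m, atol=0.02)
assert np.allclose(emp_cov, cov, atol=0.02)
```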


In particular the variance of a random variable is defined as

$$D[\xi] := \overline{(\xi - \bar{\xi})^2} = \int_{-\infty}^{\infty} (x - m)^2\,dF(x),$$
where
$$\bar{\xi} = m = \int_{-\infty}^{\infty} x\,dF(x).$$
Let us define conditional probabilities. If $A$ and $B$ are random events, then
$$P(A\,|\,B) := \frac{P(AB)}{P(B)}, \tag{8.373}$$

where $AB$ is the event $A \cap B$, and $P(A\,|\,B)$ is called the conditional probability of $A$ under the assumption that $B$ occurred. The conditional mean

value (expectation) of a random variable $\xi = f(u)$ under the assumption that $B$ occurred is defined by

$$M(\xi\,|\,B) = P^{-1}(B)\int_B f(u)\,P(du). \tag{8.374}$$

In particular, if

$$\xi = \begin{cases} 1, & \text{if } A \text{ occurs} \\ 0, & \text{otherwise,} \end{cases}$$

then (8.374) reduces to (8.373).
If $A = \bigcup_{j=1}^{n} E_j$, $E_j \cap E_{j'} = \emptyset$ for $j \ne j'$, then
$$P(A) = \sum_{j=1}^{n} P(A\,|\,E_j)\,P(E_j) \tag{8.375}$$

and

$$P(E_j\,|\,A) = P(A\,|\,E_j)\,P(E_j)\,P^{-1}(A). \tag{8.376}$$

This is the Bayes formula. The conditional distribution function of a random variable $\xi$ with respect to an event $A$ is defined as

$$F_\xi(x\,|\,A) := \frac{P(\{\xi < x\} \cap A)}{P(A)}. \tag{8.377}$$
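The total probability formula (8.375) and the Bayes formula (8.376) can be exercised with a minimal two-event partition; the numbers below are made up for illustration:

```python
# Numerical check of (8.375) and (8.376) with a two-event partition of
# Omega; all probabilities are illustrative.
p_E = [0.3, 0.7]                  # P(E_1), P(E_2)
p_A_given_E = [0.9, 0.2]          # P(A | E_j)

# Total probability (8.375)
p_A = sum(pa * pe for pa, pe in zip(p_A_given_E, p_E))

# Bayes formula (8.376): P(E_j | A) = P(A | E_j) P(E_j) / P(A)
post = [pa * pe / p_A for pa, pe in zip(p_A_given_E, p_E)]

assert abs(p_A - 0.41) < 1e-12
assert abs(sum(post) - 1.0) < 1e-12
```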



One has

$$M\{\xi\,|\,A\} = \int_{-\infty}^{\infty} x\,dF_\xi(x\,|\,A). \tag{8.378}$$
The characteristic function of a random variable $\xi$ with the distribution function $F(x)$ is defined by

$$\varphi(t) := M[\exp(it\xi)] = \int_{-\infty}^{\infty} \exp(itx)\,dF(x). \tag{8.379}$$
It has the properties

1) $\varphi(0) = 1$, $|\varphi(t)| \le 1$, $-\infty < t < \infty$, $\varphi(-t) = \varphi^*(t)$,
2) $\varphi(t)$ is uniformly continuous on $\mathbb{R}^1$,
3) $\varphi(t)$ is positive definite in the sense

$$\sum_{j,m=1}^{n} \varphi(t_j - t_m)\,z_j z_m^* \ge 0 \tag{8.380}$$
for any complex numbers $z_j$ and real numbers $t_j$, $1 \le j \le n$.

Theorem 8.10 (Bochner–Khintchine) A function $\varphi(t)$ is a characteristic function if and only if it has properties 1)–3).

One has
$$\lim_{T\to\infty} \frac{1}{2T}\int_{-T}^{T}\varphi(t)\exp(-itx)\,dt = 0$$
if $F(x)$ is continuous at the point $x$. More generally,
$$\lim_{T\to\infty} \frac{1}{2T}\int_{-T}^{T}\varphi(t)\exp(-itx)\,dt = F(x+0) - F(x).$$
If conditions 2) and 3) hold but condition 1) does not hold, then one still has the formula

$$\varphi(t) = \int_{-\infty}^{\infty} \exp(itx)\,dF(x), \tag{8.381}$$
where $F(x)$ is a monotone nondecreasing function, but the condition $F(+\infty) = 1$ does not hold and is replaced by $F(+\infty) < \infty$.

Let us define the notion of a random function. Let $\{\Omega, U, P\}$ be the probability space and let $\xi(t, \omega)$, $\omega \in \Omega$, be a family of random variables depending on a parameter $t \in D \subset \mathbb{R}^r$. The space $X$ to which the variable $\xi(t, \omega)$ belongs for a fixed $\omega \in \Omega$ is called the phase space of the random


function $\xi(t, \omega)$. If $r = 1$ the random function is called a random process; if $r > 1$ it is called a random field. Usually one writes $\xi(t)$ in place of $\xi(t, \omega)$. If $X = \mathbb{R}^m$ then $\xi(t)$ is a vector random field; if $m = 1$ then $\xi(t)$ is a scalar random field. One assumes that $X$ is a measurable space $(X, B)$, where $B$ is the Borel sigma-algebra generated by all open sets of $X$. If one takes $n$ points $t_1, \dots, t_n \in D \subset \mathbb{R}^r$, then one obtains a random vector $\{\xi(t_1, \omega), \dots, \xi(t_n, \omega)\}$. Let $F(t_1, \dots, t_n)$ be the distribution function of this random vector. For various choices of the points $t_j$ one obtains various distribution functions. The collection of these functions is consistent in the following sense:

1) $F_{\xi_1,\dots,\xi_n}(x_1, \dots, x_{m-1}, x_m \in X, x_{m+1}, \dots, x_n) = F(x_1, \dots, x_{m-1}, x_{m+1}, \dots, x_n)$,
2) $F_{\xi_1,\dots,\xi_n}(x_1, \dots, x_n) = F_{\xi_{i_1},\dots,\xi_{i_n}}(x_{i_1}, \dots, x_{i_n})$
for any permutation $(i_1, \dots, i_n)$ of the set $(1, \dots, n)$.

The following theorem gives conditions for a family $F(x_1, \dots, x_n)$ of functions to be the family of finite-dimensional distribution functions corresponding to a random function $\xi(t)$.

Theorem 8.11 (Kolmogorov) A family of functions $F$ is the family of finite-dimensional distribution functions of some random function if and only if the family is compatible and each of the functions has the characteristic properties 1)–4) of a distribution function.

Moment functions of a random function are defined as

$$m_j(t_1, \dots, t_j) = M[\xi(t_1)\cdots\xi(t_j)] = \overline{\xi(t_1)\cdots\xi(t_j)}. \tag{8.382}$$

Especially often one uses the mean values

$$M[\xi(t, \omega)] = m(t) = \overline{\xi(t)} \tag{8.383}$$

and covariance function

$$R(t, \tau) := \overline{[\xi(t) - m(t)]^*\,[\xi(\tau) - m(\tau)]}. \tag{8.384}$$
The characteristic property of the class of covariance functions is positive definiteness in the sense
$$\sum_{i,j=1}^{n} R(t_i, t_j)\,z_i^* z_j \ge 0 \tag{8.385}$$


for any choice of real $t_i$ and complex $z_i$. The star stands for complex conjugate.

8.4.2 Hilbert space theory

Let us assume that a family of random variables with finite second moments is equipped with the inner product defined as

$$(\xi, \eta) := \overline{\xi\eta^*} \tag{8.386}$$

and the norm is defined as

$$\|\xi\| = (\xi, \xi)^{1/2}. \tag{8.387}$$
Then the random variables belong to the space $L^2 = L^2(\Omega, U, P)$. Convergence in this space is defined as convergence in the norm (8.387). If $\xi(t)$ is a random function, then its correlation function is defined as

$$B(t, \tau) := \overline{\xi^*(t)\xi(\tau)}. \tag{8.388}$$

If $\overline{\xi(t)} = 0$, then $B(t, \tau) = R(t, \tau)$, where $R(t, \tau)$ is the covariance function (8.384). The characteristic property of the class of correlation functions is positive definiteness in the sense (8.385). A random function $\xi(t) \in L^2(\Omega, U, P)$ is called continuous (in the $L^2$ sense) at a point $t_0$ if

$$\|\xi(t) - \xi(t_0)\| \to 0 \quad \text{as} \quad \rho(t, t_0) \to 0, \tag{8.389}$$

where $\rho(t, t_0)$ is the distance between $t$ and $t_0$.

Lemma 8.30 For (8.389) to hold it is necessary and sufficient that B(t, τ) be continuous at the point (t0, t0).

A random function $\xi(t) \in L^2$ is called differentiable (in the $L^2$ sense) at a point $t_0$ if there exists in $L^2$ the limit

$$\xi'(t_0) := \operatorname{l.i.m.}_{\epsilon\to 0}\,\epsilon^{-1}\left[\xi(t_0 + \epsilon) - \xi(t_0)\right], \tag{8.390}$$
where l.i.m. (limit in mean) stands for the limit in $L^2$.

Lemma 8.31 For (8.390) to hold it is necessary and sufficient that


$\dfrac{\partial^2 B(t_0, t_0)}{\partial t\,\partial\tau}$ exists, that is,

$$\frac{\partial^2 B(t_0, t_0)}{\partial t\,\partial\tau} := \lim_{\substack{\epsilon_1\to 0\\ \epsilon_2\to 0}} \frac{1}{\epsilon_1\epsilon_2}\left[B(t_0 + \epsilon_1, t_0 + \epsilon_2) - B(t_0, t_0 + \epsilon_2) - B(t_0 + \epsilon_1, t_0) + B(t_0, t_0)\right]. \tag{8.391}$$
If $\dfrac{\partial^2 B(t,\tau)}{\partial t\,\partial\tau}$ exists, then
$$\overline{\xi'^*(t)\xi'(\tau)} = \frac{\partial^2 B(t, \tau)}{\partial t\,\partial\tau},$$

$$\overline{\xi'^*(t)\xi(\tau)} = \frac{\partial B(t, \tau)}{\partial t}. \tag{8.392}$$
Let $\xi(x)$, $x \in D \subset \mathbb{R}^r$, be a random function and let $\mu(x)$ be a finite measure. The Lebesgue integral of $\xi(x)$ is defined as

$$\int_D \xi(x)\,d\mu(x) = \operatorname{l.i.m.}_{n\to\infty}\int_D \xi_n(x)\,d\mu(x), \tag{8.393}$$
where $\xi_n(x) \le \xi_{n+1}(x)$, $\xi_n(x) \in L^2$, and $\xi_n(x) \xrightarrow{P} \xi(x)$, that is,

$$\lim_{n\to\infty} P(|\xi(x) - \xi_n(x)| > \epsilon) = 0 \quad \forall x \in D, \tag{8.394}$$
for every $\epsilon > 0$. If $\mu(D) < \infty$ and
$$\int_D B(x, x)\,d\mu(x) < \infty, \tag{8.395}$$
then

$$\int_D \overline{|\xi(x)|^2}\,d\mu(x) = \int_D B(x, x)\,d\mu(x). \tag{8.396}$$
Assume that $\xi(x) \in L^2$ and $B(x, y)$ is continuous in $D \times D$, where $D \subset \mathbb{R}^r$ is a finite domain. Then, by Mercer's theorem (see p. 61), one has
$$B(x, y) = \sum_{j=1}^{\infty} \lambda_j\,\varphi_j(x)\varphi_j^*(y), \tag{8.397}$$
where

$$B\varphi_j = \lambda_j\varphi_j, \qquad \lambda_1 \ge \lambda_2 \ge \cdots > 0, \tag{8.398}$$


$$(\varphi_j, \varphi_m) = \int_D \varphi_j\varphi_m^*\,dx = \delta_{jm}, \tag{8.399}$$

$$B\varphi := \int_D B(x, y)\varphi(y)\,dy. \tag{8.400}$$
Put

$$\xi_n := \int_D \xi(x)\varphi_n(x)\,dx. \tag{8.401}$$
Then

$$\overline{\xi_n^*\xi_m} = \int_D\int_D B(x, y)\varphi_m(y)\varphi_n^*(x)\,dy\,dx = \lambda_m\delta_{nm}, \tag{8.402}$$
and

$$\overline{\xi^*(x)\xi_n} = \int_D B(x, y)\varphi_n(y)\,dy = \lambda_n\varphi_n(x). \tag{8.403}$$

Lemma 8.32 The series

$$\xi(x) = \sum_{j=1}^{\infty} \xi_j\,\varphi_j^*(x) \tag{8.404}$$
converges in $L^2$ for every $x \in D$ if the function $B(x, y)$ is continuous in $D \times D$.

Remark 8.6 If one defines $B_1(t, \tau)$ by the formula

$$B_1(t, \tau) := \overline{\xi(t)\xi^*(\tau)}, \tag{8.405}$$

so that $B_1 = B^*$, then formulas (8.397)–(8.400) hold for $B_1$; in formula (8.401) one puts $\varphi_n^*$ in place of $\varphi_n$, and in formula (8.404) one puts $\varphi_j$ in place of $\varphi_j^*$.

Let us define the notion of a stochastic, or random, measure. Consider a random function $\zeta(x) \in L^2$, $x \in D$. Let $B$ be a sigma-algebra of Borel subsets of $D$. Suppose that to any $\Delta \in B$ there corresponds a random variable $\mu(\Delta)$ with the properties

1) $\mu(\Delta) \in L^2$, $\mu(\emptyset) = 0$,
2) $\mu(\Delta_1 \cup \Delta_2) = \mu(\Delta_1) + \mu(\Delta_2)$ if $\Delta_1 \cap \Delta_2 = \emptyset$,
3) $\overline{\mu(\Delta_1)\mu^*(\Delta_2)} = m(\Delta_1 \cap \Delta_2)$,
where $m(\Delta)$ is a certain deterministic function on $B$. Note that


$$m(\Delta) = \overline{|\mu(\Delta)|^2} \ge 0$$
and
$$m(\Delta_1 \cup \Delta_2) = \overline{|\mu(\Delta_1) + \mu(\Delta_2)|^2} = m(\Delta_1) + m(\Delta_2),$$
provided that $\Delta_1 \cap \Delta_2 = \emptyset$, so that $m(\Delta)$ has some of the basic properties of a measure. It is called the structural function of $\mu(\Delta)$, and $\mu(\Delta)$ is called an elementary orthogonal stochastic measure. Assume that $m(\Delta)$ is semiadditive in the following sense: for any $\Delta \in B$, the inclusion $\Delta \subset \bigcup_{j=1}^{\infty}\Delta_j$, $\Delta_j \in B$, implies
$$m(\Delta) \le \sum_{j=1}^{\infty} m(\Delta_j). \tag{8.406}$$
Then $m(\Delta)$ can be extended to a Borel measure on $B$. Let $f(x) \in L^2(D, B, m)$. Define the stochastic integral as the following limit:

$$\int_D f(x)\,\zeta(dx) = \operatorname{l.i.m.}\int_D f_n(x)\,\zeta(dx), \tag{8.407}$$

where fn(x) is a sequence of simple functions such that

$$\|f - f_n\|_{L^2(D,m)} := \left(\int_D |f - f_n|^2\,m(dx)\right)^{1/2} \to 0, \qquad n \to \infty. \tag{8.408}$$
A simple function is a function of the type

$$f_n(x) = \sum_{j=1}^{n} c_j\,\chi_{A_j}(x), \tag{8.409}$$
where $c_j = \mathrm{const}$ and $\chi_{A_j}$ is the characteristic function of the set $A_j \in B$.

Lemma 8.33 If $f_i \in L^2(D, m)$, $i = 1, 2$, and $c_i = \mathrm{const}$, then

$$\int_D (c_1 f_1 + c_2 f_2)\,\zeta(dx) = c_1\int_D f_1\,\zeta(dx) + c_2\int_D f_2\,\zeta(dx) \tag{8.410}$$
and

$$\overline{\int_D f_1(x)\,\zeta(dx)\int_D f_2^*(x)\,\zeta^*(dx)} = \int_D f_1(x)f_2^*(x)\,m(dx). \tag{8.411}$$
Using the notion of the stochastic integral one can construct an integral representation of random functions.
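The isometry (8.411) can be seen numerically by building an elementary orthogonal stochastic measure from independent Gaussian increments on a grid of $[0, 1]$, so that the structural function is Lebesgue measure. The grid and the two trial integrands below are illustrative choices:

```python
import numpy as np

# Monte Carlo sketch of the isometry (8.411): zeta(dx) is modeled by
# independent N(0, dx) increments on a grid of [0, 1], so m(dx) = dx.
rng = np.random.default_rng(2)

n_bins, n_paths = 200, 20_000
dx = 1.0 / n_bins
x = (np.arange(n_bins) + 0.5) * dx

f1 = np.sin(2 * np.pi * x)          # two real "simple" integrands on the grid
f2 = x

# zeta(bin) ~ N(0, dx), independent across bins and across sample paths
zeta = rng.normal(0.0, np.sqrt(dx), size=(n_paths, n_bins))

I1 = zeta @ f1                       # stochastic integrals, one per path
I2 = zeta @ f2

lhs = np.mean(I1 * I2)               # E[ int f1 dzeta * int f2 dzeta ]
rhs = np.sum(f1 * f2) * dx           # int f1 f2 m(dx) on the same grid

assert abs(lhs - rhs) < 0.02
```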


Suppose that a random function $\xi(x)$, $x \in D$, has the covariance function of the form
$$B(x, y) = \int_\Lambda g^*(x, \lambda)g(y, \lambda)\,m(d\lambda), \tag{8.412}$$
where $m(d\lambda)$ is a Borel measure on the set $\Lambda$, $g(x, \lambda) \in L^2(\Lambda, m(d\lambda))$ for all $x \in D$, and the set of functions $\{g(x, \lambda),\ x \in D\}$ is complete in $L^2(\Lambda, m(d\lambda))$.

Lemma 8.34 Under the above assumptions there exists an orthogonal stochastic measure $\zeta(d\lambda)$ such that

$$\xi(x) = \int_\Lambda g(x, \lambda)\,\zeta(d\lambda). \tag{8.413}$$

Equation (8.413) holds with probability one, and $m(\Delta) := \int_\Delta m(d\lambda)$ is the structural function corresponding to $\zeta(d\lambda)$. There is an isometric isomorphism between $L^2(\Lambda, m(d\lambda))$ and $L^2_\xi$, where $L^2_\xi$ is the closure of the set of random variables of the form $\sum_{j=1}^{n} c_j\zeta(\Delta_j)$, $\Delta_j \subset \Lambda$, in the norm (8.387). This isomorphism is established by the correspondence
$$\xi(x) \leftrightarrow g(x, \lambda), \qquad \zeta(\Delta) \leftrightarrow \chi_\Delta(\lambda). \tag{8.414}$$
If $h_i(\lambda) \in L^2(\Lambda, m(d\lambda))$, $i = 1, 2$, then

$$(h_1, h_2)_{L^2(\Lambda,m)} := \int_\Lambda h_1 h_2^*\,m(d\lambda) = \overline{\xi_1\xi_2^*}, \tag{8.415}$$
where

$$\xi_i = \int_\Lambda h_i(\lambda)\,\zeta(d\lambda). \tag{8.416}$$
This theory extends to the case of random vector-functions.

8.4.3 Estimation in Hilbert space $L^2(\Omega, U, P)$

Let $L^2_\xi$ be a subspace of $L^2(\Omega, U, P)$, let $\eta \in L^2(\Omega, U, P)$ be a random variable, and suppose that we want to give the best estimate of $\eta$ by an element of $L^2_\xi$, that is, to find $\eta_0 \in L^2_\xi$ such that

$$\delta := \|\eta - \eta_0\| = \inf_{\varphi\in L^2_\xi}\|\eta - \varphi\|, \tag{8.417}$$
where the norm is defined by (8.387). The element $\eta_0 \in L^2_\xi$ does exist, is unique, and is the projection of $\eta$ onto the subspace $L^2_\xi$ in the Hilbert space $L^2(\Omega, U, P)$.


The error of the estimate, the quantity δ defined by (8.417), can be calculated analytically in some cases. For example, if (ξ1, . . ., ξn) is a finite set of random variables then

$$\eta_0 = -\frac{1}{\Gamma}\,
\begin{vmatrix}
(\xi_1, \xi_1) & \dots & (\xi_1, \xi_n) & \xi_1 \\
\vdots & & \vdots & \vdots \\
(\xi_n, \xi_1) & \dots & (\xi_n, \xi_n) & \xi_n \\
(\eta, \xi_1) & \dots & (\eta, \xi_n) & 0
\end{vmatrix}, \tag{8.418}$$

where $\Gamma = \Gamma(\xi_1, \dots, \xi_n)$ is the Gramian of $(\xi_1, \dots, \xi_n)$:

$$\Gamma := \begin{vmatrix}
(\xi_1, \xi_1) & \dots & (\xi_1, \xi_n) \\
\vdots & & \vdots \\
(\xi_n, \xi_1) & \dots & (\xi_n, \xi_n)
\end{vmatrix}. \tag{8.419}$$

One has
$$\delta^2 = \frac{\Gamma(\xi_1, \dots, \xi_n, \eta)}{\Gamma(\xi_1, \dots, \xi_n)}. \tag{8.420}$$
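The formulas (8.418)–(8.420) can be exercised in a finite-dimensional model of $L^2$: represent the random variables as vectors in $\mathbb{R}^N$ with the Euclidean inner product, compute the projection by solving the Gram system, and compare the error with the determinant ratio (8.420). All dimensions below are illustrative.

```python
import numpy as np

# Sketch of (8.418)-(8.420): model L^2 by R^N, take xi_1..xi_n and eta as
# concrete vectors, and compare the projection error with the Gramian
# ratio delta^2 = Gamma(xi_1,...,xi_n,eta) / Gamma(xi_1,...,xi_n).
rng = np.random.default_rng(3)

N, n = 50, 3
xi = rng.standard_normal((n, N))     # the "random variables" xi_1..xi_n
eta = rng.standard_normal(N)

G = xi @ xi.T                        # Gramian (8.419)
b = xi @ eta                         # ((eta, xi_1), ..., (eta, xi_n))

coef = np.linalg.solve(G, b)         # coefficients of the projection
eta0 = coef @ xi                     # best estimate eta_0
delta2 = np.dot(eta - eta0, eta - eta0)

# Bordered Gramian Gamma(xi_1,...,xi_n,eta)
Gb = np.block([[G, b[:, None]], [b[None, :], np.dot(eta, eta)]])
delta2_det = np.linalg.det(Gb) / np.linalg.det(G)

assert abs(delta2 - delta2_det) < 1e-6
assert abs(np.dot(eta - eta0, xi[0])) < 1e-8   # orthogonality (8.421)
```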

The optimal estimate $\eta_0 \in L^2_\xi$ satisfies the orthogonality equation
$$(\eta_0 - \eta,\ \xi(x)) = 0 \quad \forall x \in D, \tag{8.421}$$
which means geometrically that $\eta - \eta_0$ is orthogonal to $L^2_\xi$. Equation (1.6) for the optimal filter is a particular case of (8.421): if, in the notation of Chapter 1,

$$\eta_0(z) = \int_D h(z, y)\,U(y)\,dy, \qquad \eta = s(z), \tag{8.422}$$

$$\int_D h(z, y)\,\overline{U(y)U^*(x)}\,dy = \overline{s(z)U^*(x)},$$

$$\int_D h(z, y)R(x, y)\,dy = f(x, z). \tag{8.423}$$
This is equation (1.6).


8.4.4 Homogeneous and isotropic random fields

If $\xi(x)$ is a random field and

$$\overline{\xi(x)} = 0, \qquad \overline{\xi^*(x)\xi(y)} = R(x - y), \tag{8.424}$$
then $\xi(x)$ is called a (wide-sense) homogeneous random field. It is called a homogeneous random field if for any $n$ and any $x_1, \dots, x_n, x$ the distribution function of the $n$ random variables $\xi(x_1 + x), \dots, \xi(x_n + x)$ does not depend on $x$. Here $x \in \mathbb{R}^r$; or, if $x \in D \subset \mathbb{R}^r$, then one assumes that $x, y \in D$ implies $x + y \in D$. The function $R(x)$ is positive definite in the sense (8.380). Therefore, by the Bochner–Khintchine theorem, there exists a monotone nondecreasing function $F(y)$, $F(+\infty) < \infty$, such that
$$R(x) = \int_{\mathbb{R}^r}\exp(ix\cdot y)\,dF(y), \qquad \int := \int_{\mathbb{R}^r}, \tag{8.425}$$

$$x\cdot y = \sum_{j=1}^{r} x_j y_j. \tag{8.426}$$
One often writes $dF(y) = F(dy)$ to emphasize that $F$ determines a measure on $\mathbb{R}^r$. Monotonicity in the case $r > 1$ is understood as monotonicity in each of the variables. If $r > 1$, then a positive definite function $R(x)$ is the Fourier transform of a positive finite measure on $\mathbb{R}^r$. This measure is given by the function $F(x)$, which satisfies the characteristic properties 2)–4) of a distribution function. It follows from (8.425) that

$$0 < R(0) = F(\mathbb{R}^r) < \infty, \tag{8.427}$$

$$|R(x)| \le R(0), \tag{8.428}$$

$$R(-x) = R^*(x). \tag{8.429}$$
A homogeneous random field is called isotropic if

$$\overline{\xi^*(x)\xi(y)} = \overline{\xi^*(gx)\xi(gy)} \tag{8.430}$$
for all $x, y \in \mathbb{R}^r$ and all $g \in SO(r)$, where $SO(r)$ is the group of rotations of $\mathbb{R}^r$ around the origin. Equation (8.430) for homogeneous random fields is equivalent to

$$R(x) = R(gx) \quad \forall g \in SO(r). \tag{8.431}$$


This means that

$$R(x) = R(|x|), \tag{8.432}$$
where $|x| = (x_1^2 + \cdots + x_r^2)^{1/2}$ is the length of the vector $x$. This and formula (8.425) imply that $dF(y) = d\varphi(|y|)$. If
$$\int |R(x)|\,dx < \infty, \tag{8.433}$$
then $dF = f(y)\,dy$ and $f(y)$ is continuous:

$$f(y) = (2\pi)^{-r}\int\exp(-ix\cdot y)R(x)\,dx. \tag{8.434}$$
If

$$\int |R|^2\,dx < \infty, \tag{8.435}$$
then $dF = f(y)\,dy$, $f(y) \in L^2(\mathbb{R}^r)$, and formula (8.434) holds in the $L^2$ sense. It is known that
$$\int_{|x|=\rho}\exp(ix\cdot y)\,ds = \left(\frac{2\pi\rho}{|y|}\right)^{r/2}|y|\,J_{(r-2)/2}(\rho|y|), \tag{8.436}$$

where $J_n(t)$ is the Bessel function and $ds$ is the element of the surface area of the sphere $|x| = \rho$ in $\mathbb{R}^r$. Using formula (8.436) one obtains

Lemma 8.35 Assume that $R(\rho)$ is a continuous function. This function is a correlation function of a homogeneous isotropic random field in $\mathbb{R}^r$ if and only if it is of the form

$$R(\rho) = 2^{(r-2)/2}\,\Gamma\!\left(\frac{r}{2}\right)\int_0^{\infty}\frac{J_{(r-2)/2}(\lambda\rho)}{(\lambda\rho)^{(r-2)/2}}\,dg(\lambda), \tag{8.437}$$
where $g(\lambda)$ is a monotone nondecreasing bounded function, $g(+\infty) < \infty$, and $\Gamma(z)$ is the Gamma-function. If $r = 2$, formula (8.437) becomes

$$R(\rho) = \int_0^{\infty} J_0(\lambda\rho)\,dg(\lambda); \tag{8.438}$$
for $r = 3$ one gets

$$R(\rho) = \int_0^{\infty}\frac{\sin(\lambda\rho)}{\lambda\rho}\,dg(\lambda). \tag{8.439}$$
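The two special cases quoted for (8.437) amount to classical Bessel identities, which can be verified numerically; the kernel form with the factor $2^{(r-2)/2}\Gamma(r/2)$ is assumed here as the standard normalization of the isotropic correlation kernel:

```python
import numpy as np
from scipy.special import gamma, jv

# Check that the isotropic kernel 2^{(r-2)/2} Gamma(r/2) J_{(r-2)/2}(t)/t^{(r-2)/2}
# in (8.437) reduces to J_0(t) for r = 2 and to sin(t)/t for r = 3.
def kernel(t, r):
    nu = (r - 2) / 2.0
    return 2.0**nu * gamma(r / 2.0) * jv(nu, t) / t**nu

t = np.linspace(0.1, 20.0, 500)

assert np.allclose(kernel(t, 2), jv(0, t), atol=1e-10)
assert np.allclose(kernel(t, 3), np.sin(t) / t, atol=1e-10)
```

The $r = 3$ case uses $J_{1/2}(t) = \sqrt{2/(\pi t)}\,\sin t$, for which the normalizing constants cancel exactly.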


From formula (8.425) and Lemma 8.34 it follows that a homogeneous random field admits the spectral representation of the form

$$\xi(x) = \int_{\mathbb{R}^r}\exp(ix\cdot y)\,\zeta(dy), \tag{8.440}$$
where $\zeta(dy)$ is an orthogonal stochastic measure on $\mathbb{R}^r$. If the random field is homogeneous, isotropic, and continuous in the $L^2$ sense, then

$$\xi(x) = c_r\sum_{m=0}^{\infty}\sum_{j=1}^{h(m,r)} S_{m,j}(\theta)\int_0^{\infty}\frac{J_{m+(r-2)/2}(\lambda|x|)}{(\lambda|x|)^{(r-2)/2}}\,\zeta_{mj}(d\lambda). \tag{8.441}$$
Here $c_r = \mathrm{const}$, $S_{m,j}(\theta)$ is the system of spherical harmonics orthonormalized in $L^2(S^{r-1})$, $S^{r-1}$ is the unit sphere in $\mathbb{R}^r$, $\theta \in S^{r-1}$, and
$$h(m, r) = (2m + r - 2)\,\frac{(m + r - 3)!}{(r - 2)!\,m!}, \qquad r \ge 2,$$
is the number of linearly independent spherical harmonics corresponding to the fixed $m$. For example, if $r = 3$ then $h(m, 3) = 2m + 1$. The stochastic orthogonal measures $\zeta_{mj}(d\lambda)$ have the properties

$$\overline{\zeta_{mj}(d\lambda)} = 0, \tag{8.442}$$

$$\overline{\zeta_{mj}(\Delta_1)\zeta_{pq}^*(\Delta_2)} = \delta_{mp}\delta_{jq}\,m(\Delta_1 \cap \Delta_2), \tag{8.443}$$
where $\Delta_1$ and $\Delta_2$ are arbitrary Borel sets in the interval $(0, \infty)$, and $m(\Delta)$ is a finite measure on $(0, \infty)$. If $\xi(x)$ is a homogeneous random field with correlation function (8.425), then one can apply a differential operator $Q(-i\partial)$, $\partial = (\partial_1, \dots, \partial_r)$, $\partial_j = \frac{\partial}{\partial x_j}$, to $\xi(x)$ in the $L^2$ sense if and only if

$$\int |Q(y)|^2\,F(dy) < \infty. \tag{8.444}$$
If condition (8.444) holds, then $Q(-i\partial)\xi(x)$ is a homogeneous random field, its correlation function is $Q^*(-i\partial)Q(-i\partial)R(x)$, and the corresponding spectral density is $|Q(y)|^2 f(y)$, where $F(dy) = f(y)\,dy$. By the spectral density of the homogeneous random field with correlation function (8.425) one means the function $f(y)$ defined by $F(dy) = f(y)\,dy$ in the case when $F(dy)$ is absolutely continuous with respect to Lebesgue's measure, so that $f(y) \in L^1(\mathbb{R}^r)$.
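The spectral representation (8.440) suggests a simple way to simulate a homogeneous process ($r = 1$): discretize the stochastic measure as a random cosine series weighted by the spectral density. The sketch below uses the density $f(y) = 1/(\pi(1+y^2))$, whose correlation function is $R(x) = e^{-|x|}$; all grid sizes are illustrative.

```python
import numpy as np

# Spectral simulation sketch of a homogeneous random process (r = 1):
# xi(x) = int exp(ixy) zeta(dy) discretized as a random cosine/sine series
# over a one-sided frequency grid, with spectral density
# f(y) = 1/(pi (1 + y^2)), so that R(x) = exp(-|x|).
rng = np.random.default_rng(4)

n_freq, n_paths = 200, 20_000
y = np.linspace(0.075, 30.0, n_freq)       # one-sided frequency grid
dy = y[1] - y[0]
f = 1.0 / (np.pi * (1.0 + y**2))           # two-sided spectral density

amp = np.sqrt(2.0 * f * dy)
a = rng.standard_normal((n_paths, n_freq))
b = rng.standard_normal((n_paths, n_freq))

def field(x):
    # one sample of xi(x) per path
    return a @ (amp * np.cos(y * x)) + b @ (amp * np.sin(y * x))

xi0, xi1 = field(0.0), field(1.0)
R_emp = np.mean(xi0 * xi1)                 # Monte Carlo estimate of R(1)

assert abs(R_emp - np.exp(-1.0)) < 0.04
```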


8.4.5 Estimation of parameters

Let $\xi$ be a random variable with the distribution function $F(x, \theta)$, which depends on a parameter $\theta$. The problem is to estimate the unknown $\theta$ given $n$ sample values of $\xi$. The estimated value of $\theta$ is denoted $\hat\theta = \hat\theta(x_1, \dots, x_n)$, where $x_j$, $1 \le j \le n$, are the observed values of $\xi$. What are the properties one seeks in an estimate? If $\rho(\theta, \hat\theta)$ is the risk function which measures the distance of the estimate $\hat\theta$ from the true value of the parameter $\theta$, then the estimate is good if

$$\bar\rho(\theta, \hat\theta) \le \bar\rho(\theta, \hat\theta_1), \tag{8.445}$$

where θˆ1 is any other estimate, and

$$\bar\rho(\theta, \hat\theta) = \int\rho\big(\theta, \hat\theta(x_1, \dots, x_n)\big)\,dF(x_1, \theta)\cdots dF(x_n, \theta), \tag{8.446}$$
and $F(x, \theta)$ is the distribution function of $\xi$ for a fixed $\theta$. A minimax estimate is one for which

$$\sup_\theta\,\bar\rho(\theta, \hat\theta) = \min, \tag{8.447}$$

where θ runs through the set Θ in which it takes values. A Bayes estimate is the one for which

$$\int_\Theta\bar\rho(\theta, \hat\theta)\,d\mu(\theta) = \min, \tag{8.448}$$
where $\mu(\theta)$ is an a priori given distribution function on the set $\Theta$. This means that one prescribes a priori more weight to some values of $\theta$. An unbiased estimate is one for which

$$\overline{\hat\theta} = \theta. \tag{8.449}$$

An efficient estimate is the one for which

$$\overline{|\theta - \hat\theta|^2} \le \overline{|\theta - \hat\theta_1|^2} \quad \text{for any } \hat\theta_1. \tag{8.450}$$
The Cramer–Rao inequality gives a lower bound for the variance of the estimate:
$$\sigma_\theta^2(\hat\theta) := \overline{|\hat\theta - \theta|^2} \ge \frac{1}{nI(\theta)}, \tag{8.451}$$


where one assumes that (8.453) holds,

$$I(\theta) := \int\left(\frac{\partial\log p(x, \theta)}{\partial\theta}\right)^2 p(x, \theta)\,\nu(dx), \tag{8.452}$$

and one assumes that $dF$ has a density $p(x, \theta)$ with respect to a $\sigma$-finite measure $\nu(dx)$:
$$dF = p(x, \theta)\,\nu(dx), \tag{8.453}$$
and that $p(x, \theta)$ is differentiable in $\theta$. A measure $\nu$ on a set $E$ in a measure space is called $\sigma$-finite if $E$ is a countable union of sets $E_j$ with $\nu(E_j) < \infty$. In particular, if the measure $\nu$ is concentrated on the discrete finite set of points $y_1, \dots, y_n$, then
$$I(\theta) = \sum_{j=1}^{n}\left(\frac{\partial\log p(y_j, \theta)}{\partial\theta}\right)^2 p(y_j, \theta). \tag{8.454}$$

The quantity $I(\theta)$ is called the information quantity. A measure $\nu$ is called concentrated on a set $A \subset E$ if $\nu(B) = \nu(B \cap A)$ for every $B \subset E$, that is, if $\nu(B) = 0$ whenever $B \cap A = \emptyset$. Sometimes an estimate $\hat\theta$ is called efficient if the equality sign holds in (8.451). An estimate $\hat\theta$ is called sufficient if $dF(x, \theta) = p(x, \theta)\nu(dx)$ and

$$p(x_1, \theta)\cdots p(x_n, \theta) = g_\theta(\hat\theta)\,h(x_1, \dots, x_n),$$

where $g_\theta$ and $h$ are nonnegative functions, $h$ does not depend on $\theta$, and $g_\theta$ depends on $x_1, \dots, x_n$ only through $\hat\theta = \hat\theta(x_1, \dots, x_n)$. Suppose that the volume of the sample grows, i.e. $n \to \infty$. An estimate $\hat\theta(x_1, \dots, x_n) := \hat\theta_n$ is called consistent if

$$\lim_{n\to\infty} P\big(|\hat\theta_n - \theta| > \epsilon\big) = 0 \quad \text{for every } \epsilon > 0. \tag{8.455}$$
There are many methods for constructing estimates. The maximum-likelihood estimate of $\theta$ is the estimate obtained from the equations
$$\frac{\partial\log L(\theta, x_1, \dots, x_n)}{\partial\theta_j} = 0, \qquad 1 \le j \le m. \tag{8.456}$$

Here θ is a vector parameter θ = (θ1, . . ., θm), the function L(θ, x1, . . ., xn) is called the likelihood function and is defined by

$$L(\theta, x_1, \dots, x_n) := \prod_{i=1}^{n} p(x_i, \theta), \tag{8.457}$$
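For a concrete instance of (8.456)–(8.457) and the Cramér–Rao bound (8.451), take $\xi \sim N(\theta, 1)$: the likelihood equation gives $\hat\theta = \frac{1}{n}\sum x_i$, and the Fisher information (8.452) is $I(\theta) = 1$, so the bound is $1/n$ and is attained. A Monte Carlo sketch (sample sizes illustrative):

```python
import numpy as np

# MLE sketch for (8.456)-(8.457): for N(theta, 1) samples the likelihood
# equation d/dtheta log L = sum(x_i - theta) = 0 gives theta_hat = mean(x),
# and I(theta) = 1 in (8.452), so the Cramer-Rao bound (8.451) is 1/n.
rng = np.random.default_rng(5)

theta, n, n_rep = 2.0, 50, 20_000
x = rng.normal(theta, 1.0, size=(n_rep, n))
theta_hat = x.mean(axis=1)               # solves the likelihood equation

var_hat = np.mean((theta_hat - theta) ** 2)

assert abs(var_hat - 1.0 / n) < 0.001    # attains the bound 1/n
```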


where p(x, θ) is the density of dF defined by dF = p(x, θ)dx. Cramer proved that if:

1) $\dfrac{\partial^k\log p(x, \theta)}{\partial\theta^k}$, $k \le 3$, exists for all $\theta \in \Theta$ and almost all $x \in \mathbb{R}^1$;
2) $\left|\dfrac{\partial^k p(x, \theta)}{\partial\theta^k}\right| \le g_k(x)$, where $g_k(x) \in L^1(\mathbb{R}^1)$, $k = 1, 2$, and $\sup_{\theta\in\Theta}\int_{-\infty}^{\infty} g_3(x)\,p(x, \theta)\,dx < \infty$;
3) $I(\theta)$ is positive and finite for every $\theta \in \Theta$, where $I(\theta)$ is defined by (8.452) with $\nu(dx) = dx$;

then equation (8.456) has a solution $\hat\theta(x_1, \dots, x_n)$ which is a consistent, asymptotically efficient, and asymptotically Gaussian estimate of $\theta$. Here asymptotic efficiency is understood in the sense that inequality (8.451) becomes an equality asymptotically as $n \to \infty$. More precisely, define
$$\operatorname{eff}(\hat\theta) := \left[nI(\theta)\,\sigma_\theta^2(\hat\theta)\right]^{-1}. \tag{8.458}$$
Then the estimate $\hat\theta$ is asymptotically efficient if

$$\lim_{n\to\infty}\operatorname{eff}(\hat\theta_n) = 1. \tag{8.459}$$

The estimate θˆn is asymptotically Gaussian in the sense that

$$[nI(\theta)]^{1/2}\left[\hat\theta(x_1, \dots, x_n) - \theta\right] \sim N(0, 1) \quad \text{as } n \to \infty, \tag{8.460}$$
where $N(0, 1)$ is the Gaussian distribution with zero mean value and variance one, and we assumed for simplicity that $\theta$ is a scalar parameter. We do not discuss other methods for constructing estimates (such as the method of moments, the minimum $\chi^2$ method, confidence intervals, the Bayes estimates, etc.).

8.4.6 Discrimination between hypotheses

One observes $n$ values $x_1, \dots, x_n$ of a random quantity $\xi$, and assumes that there are two hypotheses, $H_0$ and $H_1$, about $\xi$. If $H_0$ occurs, then the probability density of the observed values is $f_n(x_1, \dots, x_n\,|\,H_0)$; otherwise it is $f_n(x_1, \dots, x_n\,|\,H_1)$. Given the observed values $x_1, \dots, x_n$, one has to decide whether $H_0$ or $H_1$ occurred. Let us denote by $\gamma_i$, $i = 0, 1$, the decision that $H_i$ occurred. The decision $\gamma_0$ is taken if $(x_1, \dots, x_n) \in D_0$, where $D_0$ is a certain domain in $\mathbb{R}^n$. The choice of such a domain is the choice of the


decision rule. If $(x_1, \dots, x_n) \notin D_0$, then the decision $\gamma_1$ is taken. The error $\alpha_{10}$ of the first kind is defined as

$$\alpha_{10} = P(\gamma_1\,|\,H_0) = \int_{D_1} f(x\,|\,H_0)\,dx, \qquad x = (x_1, \dots, x_n), \tag{8.461}$$

where $D_1 = \mathbb{R}^n \setminus D_0$. Thus $\alpha_{10}$ is the probability of taking the decision that $H_1$ occurred when in fact $H_0$ occurred. The error of the second kind is

$$\alpha_{01} = P(\gamma_0\,|\,H_1) = \int_{D_0} f(x\,|\,H_1)\,dx, \qquad x = (x_1, \dots, x_n). \tag{8.462}$$

The conditional probabilities to take the right decisions are

$$P(\gamma_0\,|\,H_0) = 1 - \alpha_{10}, \qquad P(\gamma_1\,|\,H_1) = 1 - \alpha_{01}. \tag{8.463}$$

One cannot decrease both $\alpha_{10}$ and $\alpha_{01}$ without limit: if $\alpha_{10}$ decreases, then $D_1$ decreases, therefore $D_0$ increases and $\alpha_{01}$ increases. The problem is to choose a decision rule that is optimal in some sense. Let us describe some approaches to this problem. The Neyman–Pearson approach gives the decision rule which minimizes $\alpha_{01}$ under the condition that $\alpha_{10} \le \alpha$, where $\alpha$ is a fixed confidence level. Let us define the likelihood ratio

$$\ell(x) := \frac{f(x\,|\,H_1)}{f(x\,|\,H_0)}, \qquad x = (x_1, \dots, x_n), \tag{8.464}$$

and the threshold c > 0 which is given by the equation

$$P\big(\ell(x) \ge c\,|\,H_0\big) = \alpha. \tag{8.465}$$

The Neyman–Pearson decision rule is:
$$\text{if } \ell(x) < c, \text{ then } H_0 \text{ occurred; otherwise } H_1 \text{ occurred}. \tag{8.466}$$
If $\varphi(t)$ is a monotone increasing function, then $\ell(x) < c$ if and only if $\varphi(\ell(x)) < \varphi(c)$. In particular, the rule

$$\text{if } \log\ell(x) < \log c, \text{ then } H_0 \text{ occurred; otherwise } H_1 \text{ occurred} \tag{8.467}$$
is equivalent to (8.466). Assume that the a priori probability $p_0$ of $H_0$ is known, so that the a priori probability of $H_1$ is $1 - p_0$. Then the maximum a posteriori probability decision rule is:
$$\text{if } \ell(x) < \frac{p_0}{1 - p_0}, \text{ then } H_0 \text{ occurred; otherwise } H_1 \text{ occurred}. \tag{8.468}$$


If no a priori information about p0 is known, then one can use the maximum likelihood decision rule

$$\text{if } \ell(x) < 1, \text{ then } H_0 \text{ occurred; otherwise } H_1 \text{ occurred}. \tag{8.469}$$
All these rules are threshold rules with various thresholds; other decision rules are discussed in the literature.
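A concrete Neyman–Pearson test, with $H_0 : N(0, 1)$ and $H_1 : N(1, 1)$ chosen for illustration: the likelihood ratio (8.464) is monotone in $\sum x_i$, so the rule (8.466) reduces to a threshold on $\sum x_i$ chosen so that the first-kind error (8.461) equals $\alpha$, as in (8.465).

```python
import numpy as np
from scipy.stats import norm

# Neyman-Pearson sketch for H0: N(0,1) vs H1: N(1,1) with n observations.
# l(x) = exp(sum(x) - n/2) is monotone in sum(x), so the rejection region
# {l >= c} is {sum(x) >= T}, with T fixed by the level alpha under H0.
rng = np.random.default_rng(6)

n, alpha, n_rep = 25, 0.05, 100_000
T = np.sqrt(n) * norm.ppf(1.0 - alpha)   # sum(x) ~ N(0, n) under H0

s0 = rng.normal(0.0, 1.0, size=(n_rep, n)).sum(axis=1)   # data under H0
s1 = rng.normal(1.0, 1.0, size=(n_rep, n)).sum(axis=1)   # data under H1

alpha10 = np.mean(s0 >= T)    # empirical error of the first kind (8.461)
alpha01 = np.mean(s1 < T)     # empirical error of the second kind (8.462)

assert abs(alpha10 - alpha) < 0.005
assert alpha01 < 0.005        # sum(x) ~ N(25, 25) under H1, far above T
```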

8.4.7 Generalized random fields

A generalized random field $\xi(x)$, $x \in \mathbb{R}^r$, is defined as follows. Suppose that $\{\varphi_1(x), \dots, \varphi_m(x)\}$ is a set of $C_0^\infty(\mathbb{R}^r)$ functions, and to each such set there corresponds a random vector $\{\xi(\varphi_1), \dots, \xi(\varphi_m)\}$, such that the distribution functions for all such vectors, for all $m$ and all choices of $\{\varphi_1, \dots, \varphi_m\}$, are consistent. Then one says that a generalized random field $\xi$ is defined. The theories of generalized random functions of one and several variables are similar. A linear combination $\sum_{j=1}^{N} c_j\xi_j(x)$ of generalized random functions is defined by the formula
$$\left(\sum_{j=1}^{N} c_j\xi_j(x)\right)(\varphi) = \sum_{j=1}^{N} c_j\xi_j(\varphi).$$
Similarly, if $\psi \in C^\infty(\mathbb{R}^r)$, then $(\psi\xi)(\varphi) := \xi(\psi\varphi)$ and $\xi(x+h)(\varphi) := \xi(\varphi(x - h))$. If $\overline{\xi(\varphi)} := m(\varphi)$ is a continuous linear functional on $C_0^\infty$, then $m := \overline{\xi}$ is called the mean value of $\xi$,

$$m(\varphi) = \int x\,dF, \quad \text{where } F(x) = P\{\xi(\varphi) < x\}. \tag{8.470}$$
Recall that $m(\varphi)$ is called continuous on $C_0^\infty$ if $\varphi_n(x) \to \varphi(x)$ implies $m(\varphi_n) \to m(\varphi)$, where $\varphi_n \to \varphi$ means that all $\varphi_n$ and $\varphi$ vanish outside of a fixed compact set $E$ of $\mathbb{R}^r$ and $\max_{x\in E}|\varphi_n^{(j)} - \varphi^{(j)}| \to 0$ as $n \to \infty$ for all multiindices $j$. The correlation functional of a generalized random field is defined as

$$B(\varphi, \psi) = \overline{\xi^*(\varphi)\xi(\psi)}. \tag{8.471}$$
The covariance functional is defined as

$$R(\varphi, \psi) = \overline{[\xi^*(\varphi) - m^*(\varphi)][\xi(\psi) - m(\psi)]} = B(\varphi, \psi) - m^*(\varphi)m(\psi). \tag{8.472}$$
Both functionals are nonnegative definite:

$$B(\varphi, \varphi) \ge 0, \qquad R(\varphi, \varphi) \ge 0.$$


If the random vectors $\{\xi(\varphi_1), \dots, \xi(\varphi_m)\}$ are Gaussian for all $m$, then $\xi$ is called a generalized random Gaussian field. If $B(\varphi, \psi)$ and $m(\varphi)$ are bilinear and, respectively, linear functionals continuous on $C_0^\infty(\mathbb{R}^r)$, and $R(\varphi, \varphi) \ge 0$, then there is a generalized random Gaussian field for which $B(\varphi, \psi)$ is the correlation functional and $m(\varphi)$ is the mean value functional. A generalized random field is called homogeneous (stationary) if the random vectors $\{\xi(\varphi_1(x+h)), \dots, \xi(\varphi_m(x+h))\}$ and $\{\xi(\varphi_1(x)), \dots, \xi(\varphi_m(x))\}$ have the same distribution function for any $h \in \mathbb{R}^r$. The mean value functional for a homogeneous generalized random field is

$$m(\varphi) = \mathrm{const}\int\varphi\,dx,$$
and its correlation functional is

$$B(\varphi, \psi) = \int\tilde\varphi(\lambda)\tilde\psi^*(\lambda)\,\mu(d\lambda),$$
where $\tilde\varphi(\lambda)$ is the Fourier transform of $\varphi(x)$, and $\mu(d\lambda)$ is a positive measure on $\mathbb{R}^r$ satisfying, for some $p \ge 0$, the condition
$$\int_{\mathbb{R}^r}\left(1 + |\lambda|^2\right)^{-p}\mu(d\lambda) < \infty.$$
The measure $\mu$ is called the spectral measure of $\xi$. One can introduce the spectral representation of the generalized random field similar to (8.440). An important example of a Gaussian generalized random field is the Brownian motion.

8.4.8 Kalman filters

Let us start with the basic equation for the optimal Wiener filter for the filtering problem:

$$\sigma^2 h(t, \tau) + \int_{t_0}^{t} h(t, \tau')R_s(\tau, \tau')\,d\tau' = f(\tau, t), \qquad t_0 < \tau < t, \tag{8.473}$$
where $U = s + n$ is the observed signal, $s$ and $n$ are uncorrelated,
$$\overline{n^*(t)n(\tau)} = \sigma^2\delta(t - \tau), \qquad R_s(\tau, t) := \overline{s^*(\tau)s(t)},$$
$$\overline{s(t)} = \overline{n(t)} = 0, \qquad f(\tau, t) := \overline{U^*(\tau)s(t)}. \tag{8.474}$$


The optimal estimate of s is

$$\hat{s}(t) = \int_{t_0}^{t} h(t, \tau)U(\tau)\,d\tau. \tag{8.475}$$
The error of this estimate is

$$\tilde{s}(t) := s(t) - \hat{s}(t), \tag{8.476}$$

t [s˜(t)] = R (t, t) h∗(t, τ)f(τ, t)dτ. (8.477) D s − Zt0 Let us assume that s(t) satisfies the following differential equation

s˙(t) = A(t)s + w, (8.478)

where for simplicity we assume all functions to be scalar functions, w to be white noise, and

w = 0, w (t)w(τ) = Qδ(t τ), Q = const > 0. (8.479) ∗ − One could assume that

= Hs(t) + n, (8.480) U where n is white noise, and H is a linear operator, but the argument will be essentially the same, and, for simplicity, we assume (8.479). Note that

$$f(\tau, t) = \overline{U^*(\tau)s(t)} = \overline{[s^*(\tau) + n^*(\tau)]\,s(t)} = R_s(\tau, t), \tag{8.481}$$
assuming that the noise $n(\tau)$ and the signal $s(t)$ are uncorrelated:

$$\overline{n^*(\tau)s(t)} = 0. \tag{8.482}$$

Also

$$R(\tau, t) := \overline{U^*(\tau)U(t)} = R_s(\tau, t) + \sigma^2\delta(t - \tau), \tag{8.483}$$
provided that (8.482) holds and

\[
\overline{n^*(\tau)n(t)} = \sigma^2\delta(t-\tau),\qquad \overline{n} = 0. \tag{8.484}
\]

322 Random Fields Estimation Theory

To derive a differential equation for the optimal impulse function, differentiate (8.473) in $t$, using (8.481):

\[
\frac{\partial R_s(\tau,t)}{\partial t} = h(t,t)R(\tau,t) + \int_{t_0}^{t}\frac{\partial h(t,\tau')}{\partial t}\,R(\tau,\tau')\,d\tau'. \tag{8.485}
\]
For $\tau < t$ equation (8.483) becomes

\[
R(\tau,t) = R_s(\tau,t),\qquad \tau < t. \tag{8.486}
\]
This and equation (8.473) yield, after multiplication by $h(t,t)$:

\[
h(t,t)R(\tau,t) = \int_{t_0}^{t} h(t,t)h(t,\tau')\,R(\tau,\tau')\,d\tau'. \tag{8.487}
\]
From (8.478) one obtains
\[
\frac{\partial}{\partial t}R_s(\tau,t) = A\,R_s(\tau,t),\qquad \tau < t, \tag{8.488}
\]
where one used the equation

\[
\overline{s^*(\tau)w(t)} = 0 \quad\text{for } \tau < t. \tag{8.489}
\]
To derive (8.489), note that equation (8.478) implies

\[
s(t) = \psi(t,t_0)s(t_0) + \int_{t_0}^{t}\psi(t,\tau)\,w(\tau)\,d\tau, \tag{8.490}
\]
where $\psi(t,\tau)$ is the transition function for the operator $\frac{d}{dt}-A$. Since

\[
\overline{w^*(t)w(\tau)} = 0 \quad\text{for } \tau < t, \tag{8.491}
\]
it follows from (8.490) that (8.489) holds. From (8.488), (8.481) and (8.473) one has
\[
\frac{\partial}{\partial t}R_s(\tau,t) = A\int_{t_0}^{t} h(t,\tau')\,R(\tau,\tau')\,d\tau'. \tag{8.492}
\]
From (8.485), (8.492) and (8.487) one obtains

\[
0 = \int_{t_0}^{t}\Bigl[-Ah(t,\tau') + h(t,t)h(t,\tau') + \frac{\partial h(t,\tau')}{\partial t}\Bigr]R(\tau,\tau')\,d\tau' \tag{8.493}
\]
for $\tau < t$. Since $R$ is positive definite (see (8.483)), equation (8.493) has only the trivial solution, and one gets
\[
\frac{\partial h(t,\tau)}{\partial t} = Ah(t,\tau) - h(t,t)h(t,\tau),\qquad \tau < t. \tag{8.494}
\]


This is a differential equation for the optimal filter h(t, τ). Let us find a differential equation for the optimal estimate sˆ(t) defined by (8.475):

\[
\dot{\hat s} = h(t,t)U(t) + \int_{t_0}^{t}\frac{\partial h(t,\tau)}{\partial t}\,U(\tau)\,d\tau, \tag{8.495}
\]
where $\dot f := \frac{df}{dt}$. From (8.494) and (8.495) one gets

\[
\begin{aligned}
\dot{\hat s} &= h(t,t)U(t) + \int_{t_0}^{t}\bigl[Ah(t,\tau) - h(t,t)h(t,\tau)\bigr]U(\tau)\,d\tau\\
&= h(t,t)U(t) + A\hat s(t) - h(t,t)\hat s(t)\\
&= A\hat s(t) + h(t,t)\bigl[U(t) - \hat s(t)\bigr].
\end{aligned} \tag{8.496}
\]

This is the differential equation for $\hat s(t)$. The initial condition is $\hat s(t_0) = 0$, according to (8.475). Let us express $h(t,t)$ in terms of the variance of the error (8.476). If this is done, then (8.496) can be used for computations. From (8.473) and (8.481) one obtains:

\[
R_s(\tau,t) = \sigma^2 h(t,\tau) + \int_{t_0}^{t} R_s(\tau,\tau')\,h(t,\tau')\,d\tau'. \tag{8.497}
\]
Put $\tau = t$ in (8.497), assume that $h(t,\tau)$ is real-valued, and use (8.477) to get

\[
h(t,t) = \sigma^{-2}\,D[\tilde s(t)]. \tag{8.498}
\]

Let us finally derive a differential equation for h(t, t). From (8.476), (8.478) and (8.496) it follows that

\[
\begin{aligned}
\dot{\tilde s} &= As + w - A\hat s - h(t,t)\bigl[s(t) + n(t) - \hat s(t)\bigr]\\
&= \bigl[A - h(t,t)\bigr]\tilde s(t) + w - h(t,t)\,n(t).
\end{aligned} \tag{8.499}
\]

The solution to (8.499) is

\[
\tilde s(t) = \psi(t,t_0)\tilde s(t_0) + \int_{t_0}^{t}\psi(t,\tau)\bigl[w(\tau) - h(\tau,\tau)\,n(\tau)\bigr]\,d\tau. \tag{8.500}
\]


One obtains, using (8.499), that

\[
\begin{aligned}
\dot h(t,t) &= \sigma^{-2}\Bigl(\overline{\dot{\tilde s}^*(t)\tilde s(t)} + \overline{\tilde s^*(t)\dot{\tilde s}(t)}\Bigr)\\
&= \sigma^{-2}\,2\,\mathrm{Re}\Bigl\{\bigl[A^*(t)-h(t,t)\bigr]\sigma^2 h(t,t) + \overline{w^*(t)\tilde s(t)} - h(t,t)\,\overline{n^*(t)\tilde s(t)}\Bigr\}\\
&= \bigl[A^*(t)+A(t)\bigr]h(t,t) - 2h^2(t,t) + Q\sigma^{-2} + h^2(t,t)\\
&= \bigl[A^*(t)+A(t)\bigr]h(t,t) - h^2(t,t) + Q\sigma^{-2},
\end{aligned} \tag{8.501}
\]
where we assumed that $w$, $n$ and $s(t_0)$ are uncorrelated, took into account that $h(t,t)>0$ (see (8.498)), and used formula (8.500) to get
\[
\overline{w^*(t)\tilde s(t)} = \frac{1}{2}\,Q\,\psi(t,t) = \frac{Q}{2} \tag{8.502}
\]
and
\[
\overline{n^*(t)\tilde s(t)} = -\frac{h(t,t)}{2}\,\sigma^2. \tag{8.503}
\]
Note that $\psi(t,t)=1$, and the factor $\frac12$ in (8.502) and (8.503) appeared because we used the formula $\int_{t_0}^{t}\delta(t-\tau)f\,d\tau = \frac12 f(t)$.

Equation (8.501) is the Riccati equation for $h(t,t)$. Equations (8.494), (8.496) and (8.501) define Kalman's filter. This filter consists in computing the optimal estimate (8.475) by solving the differential equation (8.496), in which $h(t,t)$ is obtained by solving the Riccati equation (8.501). The initial data for equation (8.501) is

\[
h(t_0,t_0) = \sigma^{-2}D[\tilde s(t_0)] = \sigma^{-2}D[s(t_0)] = \sigma^{-2}R_s(t_0,t_0). \tag{8.504}
\]
Here we used equation (8.476) and took into account that $\hat s(t_0)=0$. The ideas of the derivation of Kalman's filter are the same for random vector-functions; in this case $A(t)$ is a matrix. For random fields there is no similar theory, due to the fact that there is no causality in the space variables, in contrast with the time variable.
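The scheme (8.496), (8.501), (8.504) can be sketched numerically. The sketch below is our illustration, not the book's material: it assumes a real scalar $A$, uses forward-Euler discretization, and illustrative parameter values; the gain $g(t) := h(t,t)$ converges to the positive root of $2Ag - g^2 + Q/\sigma^2 = 0$:

```python
import math
import random

def kalman_scalar(A=-1.0, Q=0.5, sigma=0.3, T=5.0, dt=1e-3, Rs0=1.0, seed=0):
    """Forward-Euler integration of the Riccati equation (8.501),
        g' = 2 A g - g^2 + Q / sigma^2,   g := h(t, t),
    coupled with the estimate equation (8.496) on simulated data for
    the scalar model s' = A s + w, observed as U = s + n."""
    rng = random.Random(seed)
    g = Rs0 / sigma ** 2                  # initial gain, eq. (8.504)
    s = rng.gauss(0.0, math.sqrt(Rs0))    # initial signal, variance R_s(t0, t0)
    s_hat = 0.0                           # \hat s(t_0) = 0, see (8.475)
    for _ in range(int(T / dt)):
        w = rng.gauss(0.0, math.sqrt(Q * dt))      # signal white noise increment
        v = rng.gauss(0.0, sigma * math.sqrt(dt))  # observation noise increment
        dU = s * dt + v                            # increment of observed U
        s_hat += A * s_hat * dt + g * (dU - s_hat * dt)    # eq. (8.496)
        g += (2.0 * A * g - g * g + Q / sigma ** 2) * dt   # eq. (8.501)
        s += A * s * dt + w                                # model (8.478)
    g_star = A + math.sqrt(A * A + Q / sigma ** 2)  # steady state of (8.501)
    return g, g_star, s, s_hat
```

Because the stationary point of the Euler iteration coincides with the root $g^* = A + \sqrt{A^2 + Q/\sigma^2}$, the computed gain settles on $g^*$ to high accuracy.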

Appendix A

Analytical Solution of the Basic Integral Equation for a Class of One-Dimensional Problems

In this Section we develop the theory for a class of random processes, because this theory is analogous to the estimation theory for random fields, developed in the next Section. Let

\[
Rh = f,\qquad 0\le x\le L,\qquad Rh = \int_0^L R(x,y)\,h(y)\,dy, \tag{?}
\]

where the kernel $R(x,y)$ satisfies the equation $QR = P\delta(x-y)$. Here $Q$ and $P$ are formal differential operators of order $n$ and $m<n$, respectively; $n$ and $m$ are nonnegative even integers, $n>0$, $m\ge 0$;
\[
Qu := q_n(x)u^{(n)} + \sum_{j=0}^{n-1}q_j(x)u^{(j)},\qquad Ph := h^{(m)} + \sum_{j=0}^{m-1}p_j(x)h^{(j)},\qquad q_n(x)\ge c>0;
\]
the coefficients $q_j(x)$ and $p_j(x)$ are smooth functions defined on $\mathbb{R}$; $\delta(x)$ is the delta-function; $f\in H^\alpha(0,L)$, $\alpha:=\frac{n-m}{2}$, where $H^\alpha$ is the Sobolev space.

An algorithm for finding analytically the unique solution $h\in\dot H^{-\alpha}(0,L)$ to (?) of minimal order of singularity is given. Here $\dot H^{-\alpha}(0,L)$ is the dual space to $H^\alpha(0,L)$ with respect to the inner product of $L^2(0,L)$. Under suitable assumptions it is proved that $R:\dot H^{-\alpha}(0,L)\to H^\alpha(0,L)$ is an isomorphism.

Equation (?) is the basic equation of random processes estimation theory. Some of the results are generalized to the case of the multidimensional equation (?), in which case this is the basic equation of random fields estimation theory. The presentation in Appendix A follows the paper [Ramm (2003)].



A.1 Introduction

In Chapter 2 estimation theory for random fields and processes is constructed. The estimation problem for a random process is as follows. Let $u(x) = s(x) + n(x)$ be a random process observed on the interval $(0,L)$, where $s(x)$ is a useful signal and $n(x)$ is noise. Without loss of generality we assume that $\overline{s(x)} = \overline{n(x)} = 0$, where the overbar stands for the mean value,

\[
\overline{u^*(x)u(y)} := R(x,y),\qquad R(x,y) = R(y,x),\qquad \overline{u^*(x)s(y)} := f(x,y),
\]
and the star here stands for complex conjugation. The covariance functions $R(x,y)$ and $f(x,y)$ are assumed known. One wants to estimate $s(x)$ optimally in the sense of minimum of the variance of the estimation error. More precisely, one seeks a linear estimate

\[
Lu = \int_0^L h(x,y)\,u(y)\,dy, \tag{A.1}
\]
such that

\[
\overline{|(Lu)(x) - s(x)|^2} = \min. \tag{A.2}
\]
This is a filtering problem. Similarly one can formulate the problem of optimal estimation of $(As)(x)$, where $A$ is a known operator acting on $s(x)$. If $A=I$, where $I$ is the identity operator, then one has the filtering problem; if $A$ is the differentiation operator, then one has the problem of optimal estimation of the derivative of $s$; if $As = s(x+x_0)$, then one has an extrapolation problem, etc. The kernel $h(x,y)$ is, in general, a distribution. As in Chapter 1, one derives a necessary condition for $h$ to satisfy (A.2):

\[
\int_0^L R(x,y)\,h(y,z)\,dy = f(x,z),\qquad 0\le x,z\le L. \tag{A.3}
\]
Since $z$ enters as a parameter in (A.3), the basic equation of estimation theory is:

\[
Rh := \int_0^L R(x,y)\,h(y)\,dy = f(x),\qquad 0\le x\le L. \tag{A.4}
\]
The operator in $L^2(0,L)$ defined by (A.4) is symmetric. In Chapter 1 it is assumed that the kernel

\[
R(x,y) = \int_{-\infty}^{\infty}\frac{P(\lambda)}{Q(\lambda)}\,\Phi(x,y,\lambda)\,d\rho(\lambda), \tag{A.5}
\]

Analytical Solution of the Basic Integral Equation 327

where $P(\lambda)$ and $Q(\lambda)$ are positive polynomials, $\Phi(x,y,\lambda)$ and $d\rho(\lambda)$ are the spectral kernel and, respectively, the spectral measure of a selfadjoint ordinary differential operator $\ell$ in $L^2(\mathbb{R})$, $\deg Q(\lambda) = q$, $\deg P(\lambda) = p < q$, $p\ge 0$, $\mathrm{ord}\,\ell := \sigma > 0$, $x,y\in\mathbb{R}^r$, $r\ge 1$, and $\ell$ is a selfadjoint elliptic operator in $L^2(\mathbb{R}^r)$.

It is proved in Chapter 4 that the operator $R:\dot H^{-\alpha}(0,L)\to H^{\alpha}(0,L)$, $\alpha := \frac{q-p}{2}\,\sigma$, is an isomorphism. By $H^{\alpha}(0,L)$ the Sobolev space $W^{\alpha,2}(0,L)$ is denoted, and $\dot H^{-\alpha}(0,L)$ is the dual space to $H^{\alpha}(0,L)$ with respect to the $L^2(0,L) := H^0(0,L)$ inner product. Namely, $\dot H^{-\alpha}(0,L)$ is the space of distributions $h$ which are bounded linear functionals on $H^{\alpha}(0,L)$. The norm of $h\in\dot H^{-\alpha}(0,L)$ is given by the formula
\[
\|h\|_{\dot H^{-\alpha}(0,L)} = \sup_{g\in H^{\alpha}(0,L)}\frac{|(h,g)|}{\|g\|_{H^{\alpha}(0,L)}}, \tag{A.6}
\]
where $(h,g)$ is the $L^2(0,L)$ inner product if $h$ and $g$ belong to $L^2(0,L)$. One can also define $\dot H^{-\alpha}(0,L)$ as the subset of the elements of $H^{-\alpha}(\mathbb{R})$ with support in $[0,L]$.

We generalize the class of kernels $R(x,y)$ defined in (A.5): we do not use the spectral theory, do not assume $\ell$ to be selfadjoint, and do not assume that the operators $Q$ and $P$ commute. We assume that

\[
QR = P\delta(x-y), \tag{A.7}
\]
where $Q$ and $P$ are formal differential operators of orders $n$ and $m$ respectively, $n>m\ge 0$, $n$ and $m$ are even integers, $\delta(x)$ is the delta-function,
\[
Qu := \sum_{j=0}^{n}q_j(x)u^{(j)},\quad q_n(x)\ge c>0,\qquad Ph := h^{(m)} + \sum_{j=0}^{m-1}p_j(x)h^{(j)}, \tag{A.8}
\]

where $q_j$ and $p_j$ are smooth functions defined on $\mathbb{R}$. We also assume that the equation $Qu=0$ has $\frac{n}{2}$ linearly independent solutions $u_j^-\in L^2(-\infty,0)$ and $\frac{n}{2}$ linearly independent solutions $u_j^+\in L^2(0,\infty)$. In particular, this implies that if $Qh=0$, $h\in H^{\alpha}(\mathbb{R})$, $\alpha>0$, then $h=0$, and the same conclusion holds for $h\in H^{\beta}(\mathbb{R})$ for any fixed real number $\beta$, including negative $\beta$, because any solution to the equation $Qh=0$ is smooth: it is a linear combination of $n$ linearly independent solutions to this equation, each of which is smooth and none of which belongs to $L^2(\mathbb{R})$.


Let us assume that R(x, y) is a selfadjoint kernel such that

\[
c_1\|\varphi\|_-^2 \le (R\varphi,\varphi) \le c_2\|\varphi\|_-^2,\qquad c_1 = \mathrm{const} > 0,\quad \forall\varphi\in C_0^\infty(\mathbb{R}), \tag{A.9}
\]
where $(\cdot,\cdot)$ is the $L^2(\mathbb{R})$ inner product, $\|\varphi\|_- := \|\varphi\|_{H^{-\alpha}(\mathbb{R})} =: \|\varphi\|_{-\alpha}$, $\alpha := \frac{n-m}{2}$, $\|\varphi\|_\beta := \|\varphi\|_{H^\beta(\mathbb{R})}$, and we use below the notation $\|\varphi\|_+ := \|\varphi\|_{H^{\alpha}(0,L)} =: \|\varphi\|_{H^+}$. The spaces $H^{\alpha}(0,L)$ and $\dot H^{-\alpha}(0,L)$ are dual to each other with respect to the $L^2(0,L)$ inner product, as was mentioned above. If $\varphi\in\dot H^{-\alpha}(0,L)$, then $\varphi\in H^{-\alpha}(\mathbb{R})$, and inequality (A.9) holds for such $\varphi$. For this reason we also use (for example, in the proof of Theorem A.1 below) the notation $H^-$ for the space $\dot H^{-\alpha}(0,L)$. Assumption (A.9) holds, for example, for the equation

\[
Rh = \int_{-1}^{1}e^{-|x-y|}\,h(y)\,dy = f(x),\qquad -1\le x\le 1.
\]
Its solution of minimal order of singularity is

\[
h(x) = \tfrac12\bigl(-f'' + f\bigr) + \tfrac12\,\delta(x+1)\bigl[-f'(-1)+f(-1)\bigr] + \tfrac12\,\delta(x-1)\bigl[f'(1)+f(1)\bigr].
\]
One can see that the singular part of the solution is a distribution supported at the boundary of the domain. Assumption (A.9) holds if the following inequalities (A.10) and (A.11) hold:

\[
c_3\|\varphi\|_{-\alpha+n} \le \|Q^*\varphi\|_{-\alpha} \le c_4\|\varphi\|_{-\alpha+n},\qquad c_3,c_4=\mathrm{const}>0,\quad \forall\varphi\in C_0^\infty(\mathbb{R}), \tag{A.10}
\]

\[
c_5\|\varphi\|_{\frac{n+m}{2}}^2 \le (PQ^*\varphi,\varphi) \le c_6\|\varphi\|_{\frac{n+m}{2}}^2,\qquad \forall\varphi\in C_0^\infty(\mathbb{R}), \tag{A.11}
\]

where $Q^*$ is the formal adjoint of $Q$, and $c_5$ and $c_6$ are positive constants independent of $\varphi\in C_0^\infty(\mathbb{R})$. The right inequality in (A.11) is obvious because $\mathrm{ord}\,PQ^* = n+m$, and the right inequality in (A.10) is obvious because $\mathrm{ord}\,Q^* = n$. Let us formulate our basic results.

Theorem A.1 If (A.9) holds, then the operator $R$, defined in (A.4), is an isomorphism of $\dot H^{-\alpha}(0,L)$ onto $H^{\alpha}(0,L)$, $\alpha=\frac{n-m}{2}$.

Theorem A.2 If (A.7), (A.10) and (A.11) hold, then (A.9) holds and $R:\dot H^{-\alpha}(0,L)\to H^{\alpha}(0,L)$ is an isomorphism.

Theorem A.3 If (A.7), (A.10) and (A.11) hold, and $f\in H^{\alpha}(0,L)$, then the solution to (A.4) in $\dot H^{-\alpha}(0,L)$ does exist, is unique, and can be


calculated analytically by the following formula:

\[
h = \int_0^x G(x,y)\,Qf\,dy + \sum_{j=0}^{n-\alpha-1}\Bigl[a_j^-(-1)^j G_y^{(j)}(x,0) + a_j^+(-1)^j G_y^{(j)}(x,L)\Bigr], \tag{A.12}
\]

where $a_j^\pm$ are some constants and $G(x,y)$ is the unique solution to the problem

\[
PG = \delta(x-y),\qquad G(x,y) = 0 \ \text{ for } x<y. \tag{A.13}
\]

The constants $a_j^\pm$ are uniquely determined from the condition $h(x) = 0$ for $x > L$.

Remark A.1 The solution $h\in\dot H^{-\alpha}(0,L)$ is the solution to equation (A.4) of minimal order of singularity.

Remark A.2 If $P=1$ in (A.7), then the solution $h$ to (A.4) of minimal order of singularity, $h\in\dot H^{-n/2}(0,L)$, can be calculated by the formula $h=QF$, where $F$ is given by (A.22) (see below) and $u_+$ and $u_-$ are the unique solutions of the problems: $Qu_+=0$ for $x>L$, $u_+^{(j)}(L)=f^{(j)}(L)$, $0\le j\le \frac{n}{2}-1$, $u_+(\infty)=0$; and $Qu_-=0$ for $x<0$, $u_-^{(j)}(0)=f^{(j)}(0)$, $0\le j\le \frac{n}{2}-1$, $u_-(-\infty)=0$.
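As a concrete illustration of Remark A.2 (a worked case of ours, not from the text): take $Q = 1 - d^2/dx^2$, $P=1$, so $QR=\delta$ gives the kernel $R(x,y)=e^{-|x-y|}/2$; then $u_-(x)=f(0)e^{x}$, $u_+(x)=f(L)e^{L-x}$, and the jumps of $F'$ at the endpoints give $h = QF = f - f'' + [f(0)-f'(0)]\,\delta(x) + [f(L)+f'(L)]\,\delta(x-L)$. The sketch below checks numerically that this $h$ satisfies $\int_0^L R(x,y)h(y)\,dy = f(x)$ (Simpson quadrature, split at the kernel kink $y=x$; the test function is illustrative):

```python
import math

def apply_R_to_h(x, f, fp, fpp, L):
    """(Rh)(x) for R(x,y) = exp(-|x-y|)/2 and the Remark A.2 solution
    h = f - f'' + [f(0) - f'(0)] delta(x) + [f(L) + f'(L)] delta(x - L).
    Smooth part by Simpson's rule (split at y = x), deltas evaluated exactly."""
    def simpson(a, b, n=2000):
        if b - a < 1e-14:
            return 0.0
        step = (b - a) / n
        g = lambda y: 0.5 * math.exp(-abs(x - y)) * (f(y) - fpp(y))
        s = g(a) + g(b)
        s += 4.0 * sum(g(a + (2 * k - 1) * step) for k in range(1, n // 2 + 1))
        s += 2.0 * sum(g(a + 2 * k * step) for k in range(1, n // 2))
        return s * step / 3.0
    val = simpson(0.0, x) + simpson(x, L)
    val += 0.5 * math.exp(-x) * (f(0.0) - fp(0.0))        # delta at x = 0
    val += 0.5 * math.exp(-(L - x)) * (f(L) + fp(L))      # delta at x = L
    return val
```

Integrating by parts twice shows the smooth part equals $f(x)$ minus exactly the two boundary exponentials, which the delta terms restore; so `apply_R_to_h(x, f, fp, fpp, L)` reproduces $f(x)$ for smooth $f$.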

A.2 Proofs

Proof of Theorem A.1. The set $C_0^\infty(0,L)$ is dense in $\dot H^{-\alpha}(0,L)$ (in the norm of $H^{-\alpha}(\mathbb{R})$). Using the right inequality in (A.9), one gets:
\[
\|R\|_{H^-\to H^+} = \sup_{h\in H^-}\frac{(Rh,h)}{\|h\|_-^2} \le c_2, \tag{A.14}
\]
by the symmetry of $R$ in $L^2(0,L)$. This implies $\|R\|_{H^-\to H^+}\le c_2$. Using the left inequality in (A.9), one gets $c_1\|h\|_-^2 \le \|Rh\|_+\|h\|_-$, so

\[
c_1\|h\|_- \le \|Rh\|_+. \tag{A.15}
\]
Therefore
\[
\|R^{-1}\|_{H^+\to H^-} \le \frac{1}{c_1}. \tag{A.16}
\]
Consequently, the range $\mathrm{Ran}(R)$ of $R$ is a closed subspace of $H^+$. In fact, $\mathrm{Ran}(R)=H^+$. Indeed, if $\mathrm{Ran}(R)\ne H^+$, then there exists a $g\in H^-$ such


that $0 = (R\psi,g)$ $\forall\psi\in H^-$. Taking $\psi=g$ and using the left inequality in (A.9), one gets $\|g\|_-=0$, so $g=0$. Thus $\mathrm{Ran}(R)=H^+$.

Theorem A.1 is proved. $\square$

Proof of Theorem A.2. From (A.7) and (A.8) it follows that the kernel $R(x,y)$ defines a pseudodifferential operator of order $-2\alpha = m-n$. In particular, this implies the right inequality in (A.9); in this argument inequalities (A.10) and (A.11) were not used. Let us prove that (A.10) and (A.11) imply the left inequality in (A.9). One has

\[
\|Q^*\varphi\|_{-\alpha} \le C\|\varphi\|_{n-\alpha},\qquad \forall\varphi\in C_0^\infty(\mathbb{R}), \tag{A.17}
\]

because $\mathrm{ord}\,Q^* = n$. Inequality (A.10) reads:

\[
c_3\|\varphi\|_{-\alpha+n} \le \|Q^*\varphi\|_{-\alpha} \le c_4\|\varphi\|_{-\alpha+n},\qquad \forall\varphi\in C_0^\infty(\mathbb{R}), \tag{A.18}
\]

where $c_3$ and $c_4$ are positive constants. If (A.18) holds, then $Q^*: H^{-\alpha+n}(\mathbb{R})\to H^{-\alpha}(\mathbb{R})$ is an isomorphism of $H^{-\alpha+n}(\mathbb{R})$ onto $H^{-\alpha}(\mathbb{R})$, provided that $N(Q):=\{w: Qw=0,\ w\in H^{\alpha}(\mathbb{R})\}=\{0\}$. Indeed, if the range of $Q^*$ is not all of $H^{-\alpha}(\mathbb{R})$, then there exists a $w\ne 0$, $w\in H^{\alpha}(\mathbb{R})$, such that $(Q^*\varphi,w)=0$ $\forall\varphi\in C_0^\infty(\mathbb{R})$, so $Qw=0$. If $Qw=0$ and $w\in H^{\alpha}(\mathbb{R})$, then, as was mentioned below formula (A.8), it follows that $w=0$. This proves that $\mathrm{Ran}(Q^*)=H^{-\alpha}(\mathbb{R})$.

Inequality (A.11) is necessary for the left inequality in (A.9) to hold. Indeed, let $\psi=Q^*\varphi$, $\varphi\in C_0^\infty(\mathbb{R})$; then (A.9) implies
\[
c_5\|\varphi\|_{-\alpha+n}^2 \le c\|Q^*\varphi\|_{-\alpha}^2 \le (RQ^*\varphi,Q^*\varphi) = (QRQ^*\varphi,\varphi) = (PQ^*\varphi,\varphi), \tag{A.19}
\]
where $c>0$ here (and elsewhere in this paper) stands for various estimation constants. Because $-\alpha+n = \frac{n+m}{2}$, inequality (A.19) is the left inequality in (A.11). The right inequality in (A.11) is obvious because the order of the operator $PQ^*$ equals $n+m$.

Let us now prove that inequalities (A.11) and (A.10) are sufficient for the left inequality in (A.9) to hold. Using the right inequality in (A.10) and the left inequality in (A.11), one gets:

\[
c\|\psi\|_{-\alpha}^2 \le c_5\|\varphi\|_{\frac{n+m}{2}}^2 \le (PQ^*\varphi,\varphi) = (R\psi,\psi),\qquad \psi=Q^*\varphi,\quad \forall\varphi\in C_0^\infty(\mathbb{R}). \tag{A.20}
\]


Let us prove that the set $\{\psi=Q^*\varphi,\ \varphi\in C_0^\infty(\mathbb{R})\}$ is dense in $\dot H^{-\alpha}(0,L)$. Assume the contrary. Then there is an $h\in\dot H^{-\alpha}(0,L)$, $h\ne 0$, such that $(Q^*\varphi,h)=0$ for all $\varphi\in C_0^\infty(\mathbb{R})$. Thus, $(\varphi,Qh)=0$ for all $\varphi\in C_0^\infty(\mathbb{R})$. Therefore $Qh=0$, and, by the argument given below formula (A.8), it follows that $h=0$. This contradiction proves that the set $\{Q^*\varphi,\ \varphi\in C_0^\infty(\mathbb{R})\}$ is dense in $\dot H^{-\alpha}(0,L)$.

Consequently, (A.20) implies the left inequality in (A.9). The right inequality in (A.9) is an immediate consequence of the observation we made earlier: (A.7) and (A.8) imply that $R$ is a pseudodifferential operator of order $-2\alpha = -(n-m)$.

Theorem A.2 is proved. $\square$

Proof of Theorem A.3. Equations (A.4) and (A.7) imply

\[
Ph = g := QF. \tag{A.21}
\]

Here
\[
F := \begin{cases} u_-, & x<0,\\ f, & 0\le x\le L,\\ u_+, & x>L,\end{cases} \tag{A.22}
\]
where
\[
Qu_- = 0,\qquad x<0, \tag{A.23}
\]

\[
Qu_+ = 0,\qquad x>L, \tag{A.24}
\]

and $u_-$ and $u_+$ are chosen so that $F\in H^{\alpha}(\mathbb{R})$. This choice is equivalent to the conditions:

\[
u_-^{(j)}(0) = f^{(j)}(0),\qquad 0\le j\le \alpha-1, \tag{A.25}
\]

\[
u_+^{(j)}(L) = f^{(j)}(L),\qquad 0\le j\le \alpha-1. \tag{A.26}
\]
If $F\in H^{\alpha}(\mathbb{R})$, then $g := QF\in H^{\alpha-n}(\mathbb{R}) = H^{-\frac{n+m}{2}}(\mathbb{R})$, and, by (A.22), one gets:

\[
g = Qf + \sum_{j=0}^{n-\alpha-1}\Bigl[a_j^-\delta^{(j)}(x) + a_j^+\delta^{(j)}(x-L)\Bigr], \tag{A.27}
\]
where $a_j^\pm$ are some constants. There are $n-\alpha = \frac{n+m}{2}$ constants $a_j^+$ and the same number of constants $a_j^-$.


Let G(x, y) be the fundamental solution of the equation

\[
PG = \delta(x-y) \ \text{ in } \mathbb{R}, \tag{A.28}
\]
which vanishes for $x<y$:

\[
G(x,y) = 0 \ \text{ for } x<y. \tag{A.29}
\]

Claim. Such G(x, y) exists and is unique. It solves the following Cauchy problem:

\[
PG = 0,\quad x>y;\qquad G_x^{(j)}(x,y)\Big|_{x=y+0} = \delta_{j,m-1},\quad 0\le j\le m-1, \tag{A.30}
\]

satisfies condition (A.29), and can be written as

\[
G(x,y) = \sum_{j=1}^{m}c_j(y)\,\varphi_j(x),\qquad x>y, \tag{A.31}
\]
where $\varphi_j(x)$, $1\le j\le m$, is a linearly independent system of solutions to the equation:

P ϕ = 0. (A.32)

Proof of the claim. The coefficients cj(y) are defined by conditions (A.30):

\[
\sum_{j=1}^{m}c_j(y)\,\varphi_j^{(k)}(y) = \delta_{k,m-1},\qquad 0\le k\le m-1. \tag{A.33}
\]
The determinant of the linear system (A.33) is the Wronskian $W(\varphi_1,\dots,\varphi_m)\ne 0$, so the $c_j(y)$ are uniquely determined from (A.33). The fact that the solution to (A.30), which satisfies (A.29), equals the solution to (A.28)–(A.29) follows from the uniqueness of the solutions to (A.28)–(A.29) and (A.30)–(A.29), and from the observation that the solution to (A.28)–(A.29) solves (A.30)–(A.29). The uniqueness of the solution to (A.30)–(A.29) is a well-known result.
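The construction (A.31)–(A.33) is easy to carry out explicitly. A sketch for the hypothetical choice $P = d^2/dx^2 - \kappa^2$ (so $m=2$, with fundamental system $\varphi_1 = e^{\kappa x}$, $\varphi_2 = e^{-\kappa x}$), for which the causal Green's function is $G(x,y) = \sinh(\kappa(x-y))/\kappa$ for $x>y$:

```python
import math

def green_coeffs(y, kappa):
    """Solve the 2x2 system (A.33) for c_1(y), c_2(y):
         c_1 phi_1(y)  + c_2 phi_2(y)  = 0   (derivative order j = 0)
         c_1 phi_1'(y) + c_2 phi_2'(y) = 1   (j = m - 1 = 1).
    Its determinant is the Wronskian W(phi_1, phi_2) = -2 kappa != 0."""
    a11, a12 = math.exp(kappa * y), math.exp(-kappa * y)
    a21, a22 = kappa * a11, -kappa * a12
    det = a11 * a22 - a12 * a21       # Wronskian, equals -2 kappa
    c1 = -a12 / det                    # Cramer's rule with right side (0, 1)
    c2 = a11 / det
    return c1, c2

def G(x, y, kappa=1.5):
    """Causal fundamental solution: G = 0 for x < y, condition (A.29)."""
    if x < y:
        return 0.0
    c1, c2 = green_coeffs(y, kappa)
    return c1 * math.exp(kappa * x) + c2 * math.exp(-kappa * x)
```

Substituting the coefficients back gives $G(x,y) = \bigl(e^{\kappa(x-y)} - e^{-\kappa(x-y)}\bigr)/(2\kappa) = \sinh(\kappa(x-y))/\kappa$, and the Cauchy data $G(y+0,y)=0$, $G_x(y+0,y)=1$ of (A.30) are satisfied by construction.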


Let us prove uniqueness of the solution to (A.28)–(A.29). If there were two solutions, $G_1$ and $G_2$, to (A.28)–(A.29), then their difference $G := G_1 - G_2$ would solve the problem:
\[
PG = 0 \ \text{ in } \mathbb{R},\qquad G = 0 \ \text{ for } x<y. \tag{A.34}
\]

By the uniqueness of the solution to the Cauchy problem, it follows that $G\equiv 0$. Note that this conclusion holds in the space of distributions as well, because equation (A.34) has only classical solutions, as follows from the ellipticity of $P$. Thus the claim is proved. $\square$

From (A.21) and (A.27) – (A.29) one gets:

\[
\begin{aligned}
h &= \int_0^x G(x,y)\,Qf\,dy + \int_0^x G(x,y)\sum_{j=0}^{n-\alpha-1}\Bigl[a_j^-\delta^{(j)}(y) + a_j^+\delta^{(j)}(y-L)\Bigr]\,dy\\
&= \int_0^x G(x,y)\,Qf\,dy + \sum_{j=0}^{n-\alpha-1}(-1)^j\Bigl[G_y^{(j)}(x,y)\Big|_{y=0}a_j^- + G_y^{(j)}(x,y)\Big|_{y=L}a_j^+\Bigr]\\
&:= \int_0^x G(x,y)\,Qf\,dy + H(x).
\end{aligned} \tag{A.35}
\]
It follows from (A.35) that $h\in H^{-\alpha}(\mathbb{R})$ and
\[
h = 0 \ \text{ for } x<0, \tag{A.36}
\]
that is, $(h,\varphi)=0$ $\forall\varphi\in C_0^\infty(\mathbb{R})$ such that $\mathrm{supp}\,\varphi\subset(-\infty,0)$. In order to guarantee that $h\in\dot H^{-\alpha}(0,L)$ one has to satisfy the condition
\[
h = 0 \ \text{ for } x>L. \tag{A.37}
\]

Conditions (A.36) and (A.37) together are equivalent to $\mathrm{supp}\,h\subset[0,L]$. Note that although $Qf\in\dot H^{-\frac{n+m}{2}}(0,L)$, so that $Qf$ is a distribution, the integral $\int_0^x G(x,y)\,Qf\,dy = \int_{-\infty}^{x}G(x,y)\,Qf\,dy$ is well defined as the unique solution to the problem $Pw = Qf$, $w=0$ for $x<0$.

Let us prove that conditions (A.36) and (A.37) determine the constants $a_j^\pm$, $0\le j\le\frac{n+m}{2}-1$, uniquely. If this is proved, then Theorem A.3 is proved, and formula (A.35) gives an analytical solution to equation (A.4) in $\dot H^{-\alpha}(0,L)$, provided that an

algorithm for finding the $a_j^\pm$ is given. Indeed, an algorithm for finding $G(x,y)$ consists of solving (A.29)–(A.30). Solving (A.29)–(A.30) is accomplished analytically by solving the linear algebraic system (A.33) and then using


formula (A.31). We assume that $m$ linearly independent solutions $\varphi_j(x)$ to (A.32) are known. Let us derive an algorithm for the calculation of the constants $a_j^\pm$, $0\le j\le\frac{n+m}{2}-1$, from conditions (A.36)–(A.37). Because of (A.29), condition (A.36) is satisfied automatically by $h$ defined in (A.35). To satisfy (A.37) it is necessary and sufficient to have

\[
\int_0^L G(x,y)\,Qf\,dy + H(x) \equiv 0 \qquad\text{for } x>L. \tag{A.38}
\]

By (A.31), and because the system $\{\varphi_j\}_{1\le j\le m}$ is linearly independent, equation (A.38) is equivalent to the following set of equations:

\[
\int_0^L c_k(y)\,Qf\,dy + \sum_{j=0}^{\frac{n+m}{2}-1}(-1)^j\Bigl[c_k^{(j)}(0)\,a_j^- + c_k^{(j)}(L)\,a_j^+\Bigr] = 0,\qquad 1\le k\le m. \tag{A.39}
\]

Let us check that there are exactly $m$ independent constants $a_j$ and that all the constants $a_j$ are uniquely determined by the linear system (A.39). If there are $m$ independent constants $a_j$, and the other constants can be linearly represented through these, then the linear algebraic system (A.39) is uniquely solvable for these constants, provided that the corresponding homogeneous system has only the trivial solution. If $f=0$, then $h=0$, as follows from Theorem A.1, and $g=0$ in (A.27). Therefore $a_j=0$ $\forall j$, and system (A.39) determines the constants $a_j$ $\forall j$ uniquely.

Finally, let us prove that there are exactly $m$ independent constants $a_j$. Indeed, in formula (A.21) there are $\frac{n}{2}$ linearly independent solutions $u_j^-\in L^2(-\infty,0)$, so

\[
u_- = \sum_{j=1}^{n/2}b_j^-u_j^-, \tag{A.40}
\]

and, similarly, u+ in (A.21) is of the form

\[
u_+ = \sum_{j=1}^{n/2}b_j^+u_j^+, \tag{A.41}
\]


where $u_j^+\in L^2(0,\infty)$. The condition $F\in H^{\alpha}(\mathbb{R})$ implies
\[
\sum_{j=1}^{n/2}b_j^-(u_j^-)^{(k)} = f^{(k)} \ \text{ at } x=0,\qquad 0\le k\le \alpha-1 = \frac{n-m}{2}-1, \tag{A.42}
\]
and

\[
\sum_{j=1}^{n/2}b_j^+(u_j^+)^{(k)} = f^{(k)} \ \text{ at } x=L,\qquad 0\le k\le \frac{n-m}{2}-1. \tag{A.43}
\]
Equations (A.42) and (A.43) imply that there are $\frac{n}{2}-\frac{n-m}{2} = \frac{m}{2}$ independent constants $b_j^-$ and $\frac{m}{2}$ independent constants $b_j^+$, and the remaining constants $b_j^-$ and $b_j^+$ can be represented through these $m$ constants by solving the linear systems (A.42) and (A.43) with respect to, say, the first $\frac{n-m}{2}$ constants — for example, for system (A.42), with respect to the constants $b_j^-$, $1\le j\le\frac{n-m}{2}$. This can be done uniquely because the matrices of the linear systems (A.42) and (A.43) are nonsingular: they are Wronskians of the linearly independent solutions $\{u_j^-\}_{1\le j\le\frac{n-m}{2}}$ and $\{u_j^+\}_{1\le j\le\frac{n-m}{2}}$.

The constants $a_j$ can be expressed in terms of the $b_j$ and $f$ by linear relations. Thus, there are exactly $m$ independent constants $a_j$. This completes the proof of Theorem A.3. $\square$

Remark A.3 In Chapter 5 a theory of singular perturbations for the equations of the form

\[
\varepsilon h_\varepsilon + Rh_\varepsilon = f \tag{A.44}
\]

is developed for a class of integral operators with convolution kernels $R(x,y) = R(x-y)$. This theory can be generalized to the class of kernels $R(x,y)$ studied here. The basic interesting problem is: for any $\varepsilon>0$ equation (A.44) has a unique solution $h_\varepsilon\in L^2(0,L)$; how can one find the asymptotic behavior of $h_\varepsilon$ as $\varepsilon\to 0$? The limit $h$ of $h_\varepsilon$ as $\varepsilon\to 0$ should solve the equation $Rh=f$, and, in general, $h$ is a distribution, $h\in\dot H^{-\alpha}(0,L)$. The theory presented in Chapter 5 allows one to solve the above problem for the class of kernels studied here.

Remark A.4 Theorems A.1 and A.2 and their proofs remain valid in the case when equation (A.4) is replaced by the equation

\[
Rh := \int_D R(x,y)\,h(y)\,dy = f,\qquad x\in D. \tag{A.45}
\]


Here $D\subset\mathbb{R}^r$, $r>1$, is a bounded domain with a smooth boundary $S$, $\bar D$ is the closure of $D$, and $R(x,y)$ solves (A.7), where $P$ and $Q$ are uniformly elliptic differential operators with smooth coefficients, $\mathrm{ord}\,P = m\ge 0$, $\mathrm{ord}\,Q = n>m$, and the equation $Qh=0$ has only the trivial solution in $H^{\beta}(\mathbb{R}^r)$ for any fixed real number $\beta$. Under the above assumptions, one can prove that the operator defined by the kernel $R(x,y)$ is an elliptic pseudodifferential operator of order $-2\alpha$, where $\alpha := \frac{n-m}{2}$. We do not assume that $P$ and/or $Q$ are selfadjoint or that $P$ and $Q$ commute. An analog of Remark A.3 holds for the multidimensional equation (A.44) as well. Equation (A.45) is the basic integral equation of random fields estimation theory.

Appendix B

Integral Operators Basic in Random Fields Estimation Theory

B.1 Introduction

The theory of integral equations has been well developed since the beginning of the last century. Of special interest are the classes of integral equations which can be solved in closed form or reduced to boundary-value problems for differential equations. There are relatively few such classes of integral equations. They include equations with convolution kernels whose domain of integration is the whole space; these equations can be solved by applying the Fourier transform. Another class of integral equations solvable in closed form is the Wiener-Hopf equations. Yet another class consists of one-dimensional equations with special kernels (singular integral equations which are reducible to Riemann-Hilbert problems for analytic functions, equations with logarithmic kernels, etc.). (See e.g. [Zabreiko et. al. (1968)], [Gakhov (1966)].) In Chapter 5 a new class of multidimensional integral equations is introduced. Equations of this class are solvable in closed form or reducible to a boundary-value problem for elliptic equations. This class consists of equations (B.3) (see below), whose kernels $R(x,y)$ are kernels of positive rational functions of an arbitrary selfadjoint elliptic operator in $L^2(\mathbb{R}^n)$, where $n\ge 1$. In Appendix A this theory is generalized to the class of kernels $R(x,y)$ which solve the problem $QR = P\delta(x-y)$, where $\delta(x)$ is the delta-function, $Q$ and $P$ are elliptic differential operators, and $x\in\mathbb{R}^1$. Ellipticity in this case means that the coefficient in front of the senior derivative does not vanish. In Appendix A integral equations (B.3) with kernels of the above class are solved in closed form by reducing them to a boundary-value problem for an ODE. Our aim is to generalize the approach proposed in Appendix A to the multidimensional equations (B.3) whose kernel solves the equation $QR = P\delta(x-y)$ in $\mathbb{R}^n$, where $n>1$. This is



not only of theoretical interest, but also of great practical interest, because, as shown in Chapter 1, equations (B.3) are the basic equations of random fields estimation theory. Thus, solving such equations with a larger class of kernels amounts to solving estimation problems for a larger class of random fields. The kernel $R(x,y)$ is the covariance function of a random field. The class of kernels $R$ which solve the equation $QR = P\delta(x-y)$ in $\mathbb{R}^n$ contains the class of kernels introduced and studied in Chapters 1–4.

Our theory is not only basic in random fields estimation theory, but can also be considered as a contribution to the general theory of integral equations. Any new class of integral equations which can be solved analytically or reduced to some boundary-value problem is certainly of interest, and potentially can be used in many applied areas.

For the convenience of the reader, the notation and auxiliary material are put in Section B.4. This Appendix follows closely the paper [Kozhevnikov and Ramm (2005)].

Let $P$ be a differential operator in $\mathbb{R}^n$ of order $\mu$,

\[
P := P(x,D) := \sum_{|\alpha|\le\mu}a_\alpha(x)\,D^\alpha,
\]
where $a_\alpha(x)\in C^\infty(\mathbb{R}^n)$. The polynomials

\[
p(x,\xi) := \sum_{|\alpha|\le\mu}a_\alpha(x)\,\xi^\alpha \qquad\text{and}\qquad p_0(x,\xi) := \sum_{|\alpha|=\mu}a_\alpha(x)\,\xi^\alpha
\]
are called, respectively, the symbol and the principal symbol of $P$. Suppose that the symbol $p(x,\xi)$ belongs to the class $SG^{(\mu,0)}(\mathbb{R}^n)$ consisting of all $C^\infty$ functions $p(x,\xi)$ on $\mathbb{R}^n\times\mathbb{R}^n$ such that for any multiindices $\alpha,\beta$ there exists a constant $C_{\alpha,\beta}$ such that

\[
\bigl|D_x^\alpha D_\xi^\beta p(x,\xi)\bigr| \le C_{\alpha,\beta}\,\langle\xi\rangle^{\mu-|\beta|}\,\langle x\rangle^{-|\alpha|},\qquad x,\xi\in\mathbb{R}^n,\quad \langle\xi\rangle := \bigl(1+|\xi|^2\bigr)^{1/2}. \tag{B.1}
\]

It is known (cf. [Wloka et. al. (1995), Prop. 7.2]) that the map $P(x,D):\mathcal{S}(\mathbb{R}^n)\to\mathcal{S}(\mathbb{R}^n)$ is continuous, where $\mathcal{S}(\mathbb{R}^n)$ is the Schwartz space of smooth rapidly decaying functions. Let $H^s(\mathbb{R}^n)$ ($s\in\mathbb{R}$) be the usual Sobolev space. The operator $P(x,D)$ acts naturally on the Sobolev spaces: it is (cf. [Wloka et. al. (1995), Sec. 7.6]) a bounded operator from $H^s(\mathbb{R}^n)$ to $H^{s-\mu}(\mathbb{R}^n)$ for all $s\in\mathbb{R}$. The operator $P(x,D)$ is called elliptic if $p_0(x,\xi)\ne 0$ for any $x\in\mathbb{R}^n$ and $\xi\in\mathbb{R}^n\setminus\{0\}$.

Integral Operators Basic in Random Fields Estimation Theory 339

Let $P(x,D)$ and $Q(x,D)$ both be elliptic differential operators of even orders $\mu$ and $\nu$ respectively, $0\le\mu<\nu$, with symbols satisfying (B.1) (for $Q(x,D)$ we replace $p$ and $\mu$ in (B.1) by $q$ and $\nu$, respectively). The case $\mu\ge\nu$ is simpler: it leads to an elliptic operator perturbed by a compact integral operator in a bounded domain.

We assume also that $P(x,D)$ and $Q(x,D)$ are invertible operators; that is, there exist bounded inverse operators $P^{-1}(x,D): H^{s-\mu}(\mathbb{R}^n)\to H^s(\mathbb{R}^n)$ and $Q^{-1}(x,D): H^{s-\nu}(\mathbb{R}^n)\to H^s(\mathbb{R}^n)$ for all $s\in\mathbb{R}$.

Let $R := Q^{-1}(x,D)\,P(x,D)$. The invertibility of $P(x,D)$ and $Q(x,D)$ implies that $R$ is an invertible pseudodifferential operator of negative order $\mu-\nu$, acting from $H^s(\mathbb{R}^n)$ onto $H^{s+\nu-\mu}(\mathbb{R}^n)$ ($s\in\mathbb{R}$).

Since $P$ and $Q$ are elliptic, their orders $\mu$ and $\nu$ are even for $n>2$. If $n=2$, we assume that $\mu$ and $\nu$ are even numbers. Therefore, the number $a := (\nu-\mu)/2 > 0$ is an integer.

Let $\Omega$ denote a bounded connected open set in $\mathbb{R}^n$ with a smooth boundary $\partial\Omega$ (a $C^\infty$-class surface), and let $\bar\Omega = \Omega\cup\partial\Omega$ be its closure. The smoothness restriction on the domain can be weakened, but we do not go into detail. The restriction $R_\Omega$ of the operator $R$ to the domain $\Omega\subset\mathbb{R}^n$ is defined as

\[
R_\Omega := r_\Omega R\, e_\Omega, \tag{B.2}
\]

where $e_\Omega$ is the extension by zero to $\Omega_- := \mathbb{R}^n\setminus\bar\Omega$ and $r_\Omega$ is the restriction to $\Omega$. It is known (cf. [Grubb (1990), Th. 3.11, p. 312]) that the operator $R_\Omega$ defines a continuous mapping

\[
R_\Omega: H^s(\Omega)\to H^{s+\nu-\mu}(\Omega)\qquad (s>-1/2),
\]
where $H^s(\Omega)$ is the space of restrictions of elements of $H^s(\mathbb{R}^n)$ to $\Omega$, with the usual infimum norm (see Section B.4). The pseudodifferential operator $R$ of negative order $\mu-\nu$ and its restriction $R_\Omega$ can be represented as integral operators with a kernel $R(x,y)$:

\[
Rh = \int_{\mathbb{R}^n}R(x,y)\,h(y)\,dy,\qquad R_\Omega h = \int_\Omega R(x,y)\,h(y)\,dy\quad (x\in\Omega),
\]
where $R(x,y)\in C^\infty(\mathbb{R}^n\times\mathbb{R}^n\setminus\mathrm{Diag})$, and $\mathrm{Diag}$ is the diagonal in $\mathbb{R}^n\times\mathbb{R}^n$. Moreover, $R(x,y)$ has a weak singularity:

\[
|R(x,y)| \le C|x-y|^{-\sigma},\qquad n+\mu-\nu \le \sigma < n.
\]


For $n+\mu-\nu<0$, $R(x,y)$ is continuous. Let $\gamma := n+\mu-\nu$ and $r_{xy} := |x-y|\to 0$. Then $R(x,y) = O(r_{xy}^{-\gamma})$ if $\gamma$ is odd, or if $n$ is even and $\nu<n$; and $R(x,y) = O(r_{xy}^{-\gamma}\log r_{xy})$ if $n$ is even and $\nu>n$. In Chapter 1, the equation

\[
R_\Omega h = f\in H^a(\Omega),\qquad h\in H_0^{-a}(\Omega),\qquad a = \frac{\nu-\mu}{2}, \tag{B.3}
\]
is derived as a necessary and sufficient condition for the optimal estimate of random fields by the criterion of minimum of the variance of the error of the estimate. The kernel $R(x,y)$ is a known covariance function, and $h(x,y)$ is the distributional kernel of the operator of the optimal filter. The kernel $h(x,y)$ should be of minimal order of singularity, because only in this case does this kernel solve the estimation problem: the variance of the error of the estimate is infinite for solutions to equation (B.3) which do not have minimal order of singularity. In Chapters 1–4, equation (B.3) was studied under the assumption that $P$ and $Q$ are polynomial functions of a selfadjoint elliptic operator defined in the whole space. In Appendix A some generalizations of this theory are given; in particular, the operators $P$ and $Q$ are not necessarily selfadjoint and commuting. In this Appendix an extension to multidimensional integral equations of some results from Appendix A is given.

We want to prove that, under some natural assumptions, the operator $R_\Omega$ is an isomorphism of the space $H_0^{-a}(\Omega)$ onto $H^a(\Omega)$, where $a = (\nu-\mu)/2>0$, and $H_0^s(\Omega)$, $s\in\mathbb{R}$, denotes the subspace of $H^s(\mathbb{R}^n)$ that consists of the elements supported in $\bar\Omega$. To prove the isomorphism property, we reduce the integral equation (B.3) to an equivalent elliptic exterior boundary-value problem. Since we look for a solution $u$ belonging to the space $H^a(\Omega_-) = H^{(\nu-\mu)/2}(\Omega_-)$, and the differential operator $Q$ is of order $\nu$, $Qu$ should belong to some Sobolev space of negative order. This means that we need results on the solvability of equation (B.3) in Sobolev spaces of negative order. Such spaces, as well as the solvability in them of elliptic differential boundary value problems in bounded domains, have been investigated in [Roitberg (1996)] and later in [Kozlov et. al. (1997)].
The case of pseudodifferential boundary value problems has been studied in [Kozhevnikov (2001)]. In [Erkip and Schrohe (1992)] and in [Schrohe (1999)] the solvability of elliptic differential and pseudodifferential boundary value problems on unbounded manifolds, and in particular in exterior domains, has been established.

Integral Operators Basic in Random Fields Estimation Theory 341

These solvability results have been obtained in weighted Sobolev spaces of positive order s. To obtain the isomorphism property, we need similar solvability results for exterior domains in weighted Sobolev spaces of negative order. The definition of these spaces can be found in Section B.4 (cf. [Roitberg (1996)]).

B.2 Reduction of the basic integral equation to a boundary-value problem

In Theorem B.1 the differentiation D_n^j along the normal to the boundary is used. This operator is defined in Section B.4.

Theorem B.1 Integral equation (B.3) is equivalent to the following system (B.4), (B.5), (B.6):

Qu = 0 in Ω_−,
D_n^j u = D_n^j f on ∂Ω, 0 ≤ j ≤ a − 1.  (B.4)

P h = QF, h ∈ H_0^{−a}(Ω),  (B.5)

where u ∈ H^a(Ω_−) is an extension of f:

F ∈ H^a(R^n), F := { f ∈ H^a(Ω) in Ω; u ∈ H^a(Ω_−) in Ω_− }.  (B.6)

Proof. Let h ∈ H_0^{−a}(Ω) solve equation (B.3), R_Ω h = f ∈ H^a(Ω). Let us define F := Q^{−1} P h. Since h ∈ H_0^{−a}(Ω), it follows that P h ∈ H^{−a−µ}(R^n) and F = Q^{−1} P h ∈ H^{−a−µ+ν}(R^n) = H^a(R^n). We have f = R_Ω h = r_Ω Q^{−1} P h = r_Ω F, so F is an extension of f. Therefore, F can be represented in the form (B.6). Furthermore, since F = Q^{−1} P h, then P h = QF, that is, h solves (B.5). Since h ∈ H_0^{−a}(Ω), then QF = P h ∈ H_0^{a−ν}(Ω). It follows that Qu = 0 in Ω_−. Since F ∈ H^a(R^n), we get D_n^j u = D_n^j f on ∂Ω, 0 ≤ j ≤ a − 1. This means that u ∈ H^a(Ω_−) solves the boundary-value problem (B.4). Thus, it is proved that any solution to (B.3) solves problem (B.4), (B.5).

Conversely, let a pair (u, h) ∈ H^a(Ω_−) × H_0^{−a}(Ω) solve system (B.4), (B.5), (B.6). Since P h = QF, then Rh = Q^{−1} P h = F. It follows from (B.6) that R_Ω h = (Rh)|_Ω = F|_Ω = f, i.e. h solves (B.3). □

Remark B.1 If µ > 0, the boundary value problem (B.4) is underdetermined, because Q is an elliptic operator of order ν which needs ν/2 boundary


conditions, but we have only a (a < ν/2) conditions in (B.4). Therefore, the next step is to transform equation (B.5) into µ/2 extra boundary conditions for the boundary value problem (B.4). This will be done in Theorem B.2.

Let us define κ(ξ′, λ) := (1 + |ξ′|² + λ²)^{1/2}. Choose a function ρ(τ) ∈ S(R) with supp F^{−1}ρ ⊂ R_− and ρ(0) = 1. Let a > 2 sup_τ |∂_τ ρ(τ)|. Let Ξ^t_{+,λ} denote a family (λ ∈ R_+, t ∈ Z) of order-reducing pseudodifferential operators Ξ^t_{+,λ} := F^{−1} χ_+^t(ξ, λ) F, where

χ_+(ξ, λ) := κ(ξ′, λ) ρ(ξ_n / (aκ(ξ′, λ))) + iξ_n

are their symbols. It has been proved in [Grubb (1996), Sec. 2.5] that the operator Ξ^t_{+,λ} maps the space S(R̄^n_+) := { u ∈ S(R^n) : supp u ⊂ R̄^n_+ } onto itself and has the following isomorphism properties for s ∈ R:

Ξ^t_{+,λ} : H^s(R^n) ≅ H^{s−t}(R^n),  (B.7)

Ξ^t_{+,λ} : H_0^s(R̄^n_+) ≅ H_0^{s−t}(R̄^n_+).  (B.8)

It is known ([Grubb (1996)], [Schrohe (1999)]) that, using Ξ^t_{+,λ} and an appropriate partition of unity, one can obtain, for sufficiently large λ, an operator Λ^t_+ which is an isomorphism:

Λ^t_+ : H^s(R^n) ≅ H^{s−t}(R^n), ∀s ∈ R,

and

Λ^t_+ : H_0^s(Ω) ≅ H_0^{s−t}(Ω), ∀s ∈ R.  (B.9)

Lemma B.1 Let P(x, D) be an invertible differential operator of order µ, that is, there exists the inverse operator P^{−1}(x, D) which is bounded: P^{−1}(x, D) : H^{s−µ}(R^n) → H^s(R^n) for all s ∈ R. Then a solution h to the equation

P(x, D) h = g, g ∈ H_0^{−a−µ}(Ω),

belongs to the space H_0^{−a}(Ω) if and only if g satisfies the following µ/2 boundary conditions:

r_∂Ω D_n^j Λ_+^{−a−µ/2} P^{−1}(x, D) g = 0 (j = 0, ..., µ/2 − 1).


Proof. Necessity. Let h = P^{−1}(x, D) g, h ∈ H_0^{−a}(Ω), solve the equation P(x, D) h = g, g ∈ H_0^{−a−µ}(Ω). By (B.9), we have Λ_+^{−a−µ/2} h ∈ H_0^{µ/2}(Ω). Therefore, r_∂Ω D_n^j Λ_+^{−a−µ/2} h = 0 (j = 0, ..., µ/2 − 1).

Sufficiency. Assume that the equalities r_∂Ω D_n^j Λ_+^{−a−µ/2} h = 0 (j = 0, ..., µ/2 − 1) hold. Since g ∈ H_0^{−a−µ}(Ω) ⊂ H^{−a−µ}(R^n), we have h = P^{−1}(x, D) g ∈ H^{−a}(R^n). Therefore, Ψ := Λ_+^{−a−µ/2} h ∈ H^{µ/2}(R^n). Since r_∂Ω D_n^j Ψ = 0 (j = 0, ..., µ/2 − 1), we have Ψ = Ψ_+ + Ψ_−, where Ψ_+ := e_Ω r_Ω Ψ ∈ H_0^{µ/2}(Ω) and Ψ_− := e_{Ω_−} r_{Ω_−} Ψ ∈ H_0^{µ/2}(Ω_−). Since Λ_+^{ν/2} : H_0^{µ/2}(Ω) ≅ H_0^{−a}(Ω), it follows that Λ_+^{ν/2} Ψ_+ ∈ H_0^{−a}(Ω). Moreover, Λ_+^{ν/2} is a differential operator with respect to the variable x_n, hence supp Ψ_− ⊂ Ω̄_− implies supp (Λ_+^{ν/2} Ψ_−) ⊂ Ω̄_−. Since P is a differential operator,

supp (P Λ_+^{ν/2} Ψ_−) ⊂ supp (Λ_+^{ν/2} Ψ_−) ⊂ Ω̄_−.

On the other hand, we have

Φ := P Λ_+^{ν/2} Ψ = P Λ_+^{ν/2} (Ψ_+ + Ψ_−) = P Λ_+^{ν/2} Ψ_+ + P Λ_+^{ν/2} Ψ_−.

For any ϕ ∈ C_0^∞(Ω_−) one has:

0 = ⟨Φ, ϕ⟩ = ⟨P Λ_+^{ν/2} Ψ_+, ϕ⟩ + ⟨P Λ_+^{ν/2} Ψ_−, ϕ⟩ = ⟨P Λ_+^{ν/2} Ψ_−, ϕ⟩.

Thus, supp (P Λ_+^{ν/2} Ψ_−) ⊂ Ω̄. It follows that supp (P Λ_+^{ν/2} Ψ_−) ⊂ ∂Ω.

For any Ψ_− ∈ C_0^∞(Ω_−), we have P Λ_+^{ν/2} Ψ_− ∈ C^∞(R^n) and supp (P Λ_+^{ν/2} Ψ_−) ⊂ ∂Ω. Therefore, P Λ_+^{ν/2} Ψ_− = 0. Since P is invertible, Λ_+^{ν/2} Ψ_− = 0 for Ψ_− ∈ C_0^∞(Ω_−). Since C_0^∞(Ω_−) is dense in H_0^{µ/2}(Ω_−), one gets Λ_+^{ν/2} Ψ_− = 0 for Ψ_− ∈ H_0^{µ/2}(Ω_−). It follows that

h = Λ_+^{ν/2} Ψ = Λ_+^{ν/2} Ψ_+ + Λ_+^{ν/2} Ψ_− = Λ_+^{ν/2} Ψ_+ ∈ H_0^{−a}(Ω).

Lemma B.1 is proved. □

Let F ∈ C^∞(Ω̄) ∩ S(Ω̄_−). Assume that F has finite jumps F_k of the normal derivative of order k (k = 0, 1, ...) on ∂Ω. For x_0 ∈ ∂Ω, we will use the following notation:

F_0(x_0) := [F]_∂Ω(x_0) := lim_{ε→+0} (F(x_0 + εn) − F(x_0 − εn)),

F_k(x_0) := [D_n^k F]_∂Ω(x_0).


Let f ∈ C^∞(Ω̄) and u ∈ S(Ω̄_−), and define γ_k f(x_0) := r_∂Ω D_n^k f(x_0), γ_k u(x_0) := r_∂Ω D_n^k u(x_0).

Let δ_∂Ω denote the Dirac measure supported on ∂Ω, that is, a distribution acting as

(δ_∂Ω, ϕ) := ∫_∂Ω ϕ(x) dS, ϕ(x) ∈ C_0^∞(R^n).

It is known that for any differential operator Q of order ν there exists a representation Q = Σ_{j=0}^ν Q_j D_n^j, where Q_j is a tangential differential operator of order ν − j (cf. Section B.4). We denote by {D^α F}(x) the classical derivative at the points where it exists. The following Lemma B.2 is essentially known, but for the convenience of the reader a short proof of this lemma is given.

Lemma B.2 The following equality holds for the distribution QF:

QF = {QF} − i Σ_{j=0}^ν Σ_{k=0}^{j−1} Q_j D_n^k (F_{j−1−k} δ_∂Ω).  (B.10)

Proof. Let cos(n, x_j) denote the cosine of the angle between the exterior unit normal vector n to the boundary ∂Ω of Ω and the x_j-axis. We use the known formulas

∫_Ω ∂u/∂x_j dx = ∫_∂Ω u(x) cos(n, x_j) dσ, u(x) ∈ C^∞(Ω̄), j = 1, ..., n,

∫_{Ω_−} ∂v/∂x_j dx = −∫_∂Ω v(x) cos(n, x_j) dσ, v(x) ∈ C_0^∞(Ω̄_−), j = 1, ..., n,

where dσ is the surface measure on ∂Ω. Applying these formulas to the products u(x)ϕ(x) and v(x)ϕ(x), where ϕ(x) ∈ C_0^∞(R^n), u(x) ∈ C^∞(Ω̄), v(x) ∈ C_0^∞(Ω̄_−), we get

∫_Ω ∂u/∂x_j ϕ(x) dx = −∫_Ω u(x) ∂ϕ/∂x_j dx + ∫_∂Ω u(x) ϕ(x) cos(n, x_j) dσ, j = 1, ..., n,  (B.11)


∫_{Ω_−} ∂v/∂x_j ϕ(x) dx = −∫_{Ω_−} v(x) ∂ϕ/∂x_j dx − ∫_∂Ω v(x) ϕ(x) cos(n, x_j) dσ, j = 1, ..., n.  (B.12)

By (B.11), (B.12), we have

(∂F/∂x_j, ϕ) = −(F, ∂ϕ/∂x_j) = −∫_{R^n} F(x) ∂ϕ/∂x_j dx

= ∫_{R^n} {∂F/∂x_j}(x) ϕ(x) dx + ∫_∂Ω [F]_∂Ω(x) cos(n, x_j) ϕ(x) dS

= ({∂F/∂x_j} + [F]_∂Ω cos(n, x_j) δ_∂Ω, ϕ(x)), ϕ(x) ∈ C_0^∞(R^n).

This means,

∂F/∂x_j = {∂F/∂x_j} + [F]_∂Ω cos(n, x_j) δ_∂Ω, j = 1, ..., n.

It follows that D_n F = {D_n F} − i F_0 δ_∂Ω. Furthermore, using the last formula we have D_n² F = D_n({D_n F}) − i D_n(F_0 δ_∂Ω) = {D_n² F} − i F_1 δ_∂Ω − i D_n(F_0 δ_∂Ω), and so on. By induction one gets:

D_n^j F = {D_n^j F} − i Σ_{k=0}^{j−1} D_n^k (F_{j−1−k} δ_∂Ω) (j = 1, 2, ...).

Substituting this formula for D_n^j F into the representation Q = Σ_{j=0}^ν Q_j D_n^j, we get (B.10). Lemma B.2 is proved. □
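The one-dimensional case of the jump formula (one interface point instead of ∂Ω, d/dx instead of D_n, and without the factor i) can be verified symbolically by pairing both sides with a test function. The piecewise function F and the test function ϕ below are our own illustrative choices:

```python
import sympy as sp

x = sp.symbols('x')
phi = (1 - x**2)**2          # test function, supported in [-1, 1]
Fm, Fp = x**2, x**2 + 1      # F on each side of the interface at x = 0
jump = (Fp - Fm).subs(x, 0)  # F_0 = exterior value minus interior value = 1

# Distributional derivative paired with phi: <F', phi> = -<F, phi'>
lhs = -(sp.integrate(Fm*sp.diff(phi, x), (x, -1, 0))
        + sp.integrate(Fp*sp.diff(phi, x), (x, 0, 1)))
# Classical derivative plus jump * delta at the interface
rhs = (sp.integrate(sp.diff(Fm, x)*phi, (x, -1, 0))
       + sp.integrate(sp.diff(Fp, x)*phi, (x, 0, 1))
       + jump*phi.subs(x, 0))
assert sp.simplify(lhs - rhs) == 0   # F' = {F'} + F_0 * delta
```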

Denoting in the sequel by f^0 and u^0 the extensions by zero to R^n of functions f(x) ∈ C^∞(Ω̄), u(x) ∈ S(Ω̄_−), and using Lemma B.2, we obtain the following formulas:

(Qf)^0 = Q f^0 − i Σ_{j=1}^ν Σ_{k=0}^{j−1} Q_j D_n^k ((D_n^{j−1−k} f)|_∂Ω δ_∂Ω), f ∈ C^∞(Ω̄),  (B.13)


(Qu)^0 = Q u^0 + i Σ_{j=1}^ν Σ_{k=0}^{j−1} Q_j D_n^k ((D_n^{j−1−k} u)|_∂Ω δ_∂Ω), u ∈ S(Ω̄_−),  (B.14)

where (D_n^j f)|_∂Ω := r_∂Ω D_n^j f. Using these formulas one can define the action of the operator Q upon the elements of the spaces H^{s,ν}(Ω) and H^{s,ν}(Ω_−) (s ∈ R) (defined in Section B.4) as follows (cf. [Kozlov et. al. (1997), Sect. 3.2], [Roitberg (1996), Sect. 2.4]):

Q(f, ψ) := Q f^0 − i Σ_{j=1}^ν Σ_{k=0}^{j−1} Q_j D_n^k (ψ_{j−k} δ_∂Ω), (f, ψ) ∈ H^{s,ν}(Ω),  (B.15)

Q(u, φ) := Q u^0 + i Σ_{j=1}^ν Σ_{k=0}^{j−1} Q_j D_n^k (φ_{j−k} δ_∂Ω), (u, φ) ∈ H^{s,ν}(Ω_−).  (B.16)

It is known ([Roitberg (1996)], [Kozlov et. al. (1997)], [Kozhevnikov (2001)]) that Q, defined respectively in (B.15) and (B.16), is a bounded mapping

Q : H^{s,ν}(Ω) → ℋ^{s−ν}(Ω) and Q : H^{s,ν}(Ω_−) → ℋ^{s−ν}(Ω_−).

Moreover, Q is respectively the closure of the mapping f → Q(x, D) f (f ∈ C^∞(Ω̄)) or u → Q(x, D) u (u ∈ S(Ω̄_−)) between the corresponding spaces.

Let W_{mℓ} (m = 1, ..., µ/2, ℓ = a + 1, ..., ν) be the operator acting as follows:

W_{mℓ}(φ) := iγ_{m−1} Λ_+^{−a−µ/2} P^{−1} Σ_{j=ℓ}^ν Q_j D_n^{j−ℓ} (φ δ_∂Ω), φ ∈ C^∞(∂Ω),  (B.17)

where γ_k is the restriction to ∂Ω of D_n^k (cf. Section B.4). The mapping W_{mℓ} is a pseudodifferential operator of order m − µ + ν/2 − 1 − ℓ. Therefore, for any real s, this mapping is a bounded operator:

W_{mℓ} : H^s(∂Ω) → H^{s−m+µ−ν/2+1+ℓ}(∂Ω).

For (f, ψ) ∈ H^{a,ν}(Ω), one has g := Q(f, ψ) ∈ H_0^{a−ν}(Ω), and we set

w_{a+m} := γ_{m−1} Λ_+^{−a−µ/2} P^{−1} g (m = 1, ..., µ/2),  (B.18)


where the operator γ_{m−1} Λ_+^{−a−µ/2} P^{−1}(x, D) is a trace operator of order m − 1 − a − 3µ/2. It follows that w_{a+m} ∈ H^{µ/2−m+1/2}(∂Ω).

Theorem B.2 Integral equation (B.3)

R_Ω h = f ∈ H^a(Ω), h ∈ H_0^{−a}(Ω),

is equivalent to the following boundary-value problem:

Qu = 0 in Ω_−,
D_n^j u = D_n^j f on ∂Ω, 0 ≤ j ≤ a − 1,  (B.19)
Σ_{ℓ=a+1}^ν W_{mℓ}(γ_{ℓ−1} u) = w_{a+m} on ∂Ω, 1 ≤ m ≤ µ/2,

where the functions u, f and h are related by the formulas

h = P^{−1} QF, F ∈ H^a(R^n), F := { f ∈ H^a(Ω) in Ω; u ∈ H^a(Ω_−) in Ω_− }.

Proof. Our starting point is Theorem B.1. Consider the equation P h = QF, h ∈ H_0^{−a}(Ω). Since F ∈ H^a(R^n) and Qu = 0 in Ω_− by (B.4), then QF ∈ H_0^{a−ν}(Ω) = H_0^{−a−µ}(Ω). By Lemma B.1, a solution h to the equation P h = QF ∈ H_0^{−a−µ}(Ω) belongs to the space H_0^{−a}(Ω) if and only if QF satisfies the following µ/2 boundary conditions:

r_∂Ω D_n^{m−1} Λ_+^{−a−µ/2} P^{−1} QF = 0, m = 1, ..., µ/2.  (B.20)

Since F = f^0 + u^0, one has QF = Q f^0 + Q u^0. Substituting the last expression into (B.20), we have

γ_{m−1} Λ_+^{−a−µ/2} P^{−1} Q u^0 = −γ_{m−1} Λ_+^{−a−µ/2} P^{−1} Q f^0, m = 1, ..., µ/2.

From (B.15) and (B.16), one gets:

iγ_{m−1} Λ_+^{−a−µ/2} P^{−1} Σ_{j=1}^ν Σ_{k=0}^{j−1} Q_j D_n^k (φ_{j−k} δ_∂Ω)
= γ_{m−1} Λ_+^{−a−µ/2} P^{−1} Q(f, ψ)
+ iγ_{m−1} Λ_+^{−a−µ/2} P^{−1} Σ_{j=1}^ν Σ_{k=0}^{j−1} Q_j D_n^k (ψ_{j−k} δ_∂Ω).  (B.21)


Since F := { f ∈ H^a(Ω) in Ω; u ∈ H^a(Ω_−) in Ω_− } and F ∈ H^a(R^n), it follows that γ_{j−1} u = γ_{j−1} f, j = 1, ..., a. Therefore, φ_j = γ_{j−1} u = γ_{j−1} f = ψ_j, j = 1, ..., a.

We identify the space H^a(Ω) with the subspace of H^{a,(ν)}(Ω) of all (f, ψ) = (f, ψ_1, ..., ψ_ν) such that ψ_{a+1} = ... = ψ_ν = 0. Let (f, ψ) belong to this subspace and (u, φ) = (u, φ_1, ..., φ_ν) ∈ H^{a,(ν)}(Ω_−). Then we can rewrite (B.21) as

−γ_{m−1} Λ_+^{−a−µ/2} P^{−1} Q(f, ψ) + Σ_{j=1}^ν Σ_{ℓ=a+1}^j iγ_{m−1} Λ_+^{−a−µ/2} P^{−1} Q_j D_n^{j−ℓ} (φ_ℓ δ_∂Ω) = 0.  (B.22)

Changing the order of the summation,

Σ_{j=1}^ν Σ_{ℓ=a+1}^j = Σ_{ℓ=a+1}^ν Σ_{j=ℓ}^ν,

we get

Σ_{ℓ=a+1}^ν Σ_{j=ℓ}^ν iγ_{m−1} Λ_+^{−a−µ/2} P^{−1} Q_j D_n^{j−ℓ} (φ_ℓ δ_∂Ω) = γ_{m−1} Λ_+^{−a−µ/2} P^{−1} Q(f, ψ),  (B.23)

where m = 1, ..., µ/2. In view of (B.17) and (B.18), formula (B.23) can be rewritten as µ/2 equations

Σ_{ℓ=a+1}^ν W_{mℓ}(φ_ℓ) = w_{a+m} on ∂Ω, m = 1, ..., µ/2.

Since φ_j = γ_{j−1} u for u ∈ S(Ω̄_−), we get

Σ_{ℓ=a+1}^ν W_{mℓ}(γ_{ℓ−1} u) = w_{a+m} on ∂Ω, m = 1, ..., µ/2.

These equations define µ/2 extra boundary conditions for the boundary-value problem

Qu = 0 in Ω_−, D_n^j u = D_n^j f on ∂Ω, 0 ≤ j ≤ a − 1.


Theorem B.2 is proved. 
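The change of summation order used in the proof, Σ_{j=1}^ν Σ_{ℓ=a+1}^j = Σ_{ℓ=a+1}^ν Σ_{j=ℓ}^ν, can be sanity-checked by enumerating the index set; the values ν = 6, a = 2 below are an arbitrary illustrative choice:

```python
# Both orders of summation run over the same set of index pairs
# {(j, l) : 1 <= j <= nu, a+1 <= l <= j}.
nu, a = 6, 2
lhs = {(j, l) for j in range(1, nu + 1) for l in range(a + 1, j + 1)}
rhs = {(j, l) for l in range(a + 1, nu + 1) for j in range(l, nu + 1)}
assert lhs == rhs
```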

B.3 Isomorphism property

We look for a solution u ∈ H^a(Ω_−) to the boundary-value problem (B.19). Let us consider the following non-homogeneous boundary-value problem associated with (B.19):

Qu = w in Ω_−,
γ_0 B_j u := γ_0 D_n^{j−1} u = w_j on ∂Ω, 1 ≤ j ≤ a,
γ_0 B_{a+m} u := Σ_{ℓ=a+1}^ν W_{mℓ}(γ_{ℓ−1} u) = w_{a+m} on ∂Ω, 1 ≤ m ≤ µ/2,  (B.24)

where w and w_j, j = 1, ..., ν/2, are arbitrary elements of the corresponding Sobolev spaces (see Theorems B.3 and B.4 below).

For the formulation of the Shapiro-Lopatinskii condition we need some notation. Let ε > 0 be a sufficiently small number. Denote by U (an ε-conic neighborhood) the union of all balls B(x, ε⟨x⟩), centered at x ∈ ∂Ω with radius ε⟨x⟩. Let y = (y′, y_n) = (y_1, ..., y_{n−1}, y_n) be normal coordinates in an ε-conic neighborhood U of ∂Ω, that is, ∂Ω may be identified with {y_n = 0}, y_n is the normal coordinate, and the normal derivative D_n is D_{y_n} near ∂Ω. Each differential operator on R^n with SG-symbol can be written in U as a

differential operator with respect to D_{y′} and D_{y_n}:

Q = Σ_{j=0}^ν Q_j(y, D_{y′}) D_{y_n}^j,

where Q_j(y, D_{y′}) are differential operators with symbols belonging to SG^{(ν,0)}(R^n). Let

q(y, ξ) = q(y, ξ′, ξ_n) = Σ_{j=0}^ν q_j(y, ξ′) ξ_n^j

be the symbol of Q, where ξ′ and ξ_n are the cotangent variables associated with y′ and y_n.

Assumption 1. We assume that the operator Q is md-properly elliptic (cf. [Erkip and Schrohe (1992), Assumption 1, p. 40]), that is, for all large |y| + |ξ′| the polynomial q(y, ξ′, z) with respect to the complex variable z has exactly ν/2 zeros with positive imaginary parts τ_1(y′, ξ′), ..., τ_{ν/2}(y′, ξ′).
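Assumption 1 can be tested symbolically for a concrete operator. The sketch below (our own illustration, with our own variable names) checks that the symbol of I − ∆ (ν = 2) has exactly one root in z in the upper half-plane, while the symbol of −∆ alone degenerates at ξ′ = 0, where it reduces to z² with a real double root:

```python
import sympy as sp

z = sp.symbols('z')
t = sp.symbols('t', positive=True)   # t plays the role of |xi'|

# Symbol of I - Delta as a polynomial in z = xi_n: q = 1 + t^2 + z^2 (nu = 2)
roots = sp.solve(sp.Eq(1 + t**2 + z**2, 0), z)
upper = [w for w in roots if sp.im(w.subs(t, 3)) > 0]
assert len(roots) == 2 and len(upper) == 1   # exactly nu/2 = 1 upper root

# Symbol of -Delta alone: at xi' = 0 it reduces to z^2, which has the real
# double root z = 0, so md-proper ellipticity fails for the Laplacian.
assert sp.solve(sp.Eq(z**2, 0), z) == [0]
```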


We conclude from Assumption 1 that the polynomial q(y, ξ′, z) has no real zeros and has exactly ν/2 zeros with negative imaginary part for all large |y| + |ξ′|.

In particular, the Laplacian ∆ in the space R^n (n ≥ 2) is elliptic in the usual sense but not md-properly elliptic, while the operator I − ∆ is md-properly elliptic. Let

χ(y′, ξ′) := (1 + Σ_{i,j=1}^{n−1} ξ_i g^{ij}(y) ξ_j)^{1/2},

where g = (g_{ij}) is a Riemannian metric on ∂Ω and (g^{ij}) := g^{−1}. We denote

q^+(y′, ξ′, z) := Π_{j=1}^{ν/2} (z − χ(y′, ξ′)^{−1} τ_j(y′, ξ′)).

Consider the operators B_m (m = 1, ..., ν/2) from (B.24). Each of them is of the form

B_m = Σ_{j=0}^{ν−1} B_{mj}(y′, D_{y′}) D_{y_n}^j

in the normal coordinates y = (y′, y_n) = (y_1, ..., y_{n−1}, y_n) in an ε-conic neighborhood of ∂Ω. Here B_{mj}(y′, D_{y′}) is a pseudodifferential operator of order ρ_m − j (ρ_m ∈ N) acting on ∂Ω. Let b_{mj}(y′, ξ′) denote the principal symbol of B_{mj}(y′, D_{y′}). The operators B_m in the boundary-value problem (B.24) are operators of this type. We set

b_m(y′, ξ′, z) := Σ_{j=0}^{ν−1} b_{mj}(y′, ξ′) χ(y′, ξ′)^{−ρ_m+j} z^j.

Define the following polynomials with respect to z:

r_m(y′, ξ′, z) = Σ_{j=1}^{ν/2} r_{mj}(y′, ξ′) z^{j−1}

as the residues of b_m(y′, ξ′, z) modulo q^+(y′, ξ′, z), i.e. we get r_{mj}(y′, ξ′) by representing b_m(y′, ξ′, z) in the form

b_m(y′, ξ′, z) = q_m(z) q^+(y′, ξ′, z) + Σ_{j=1}^{ν/2} r_{mj}(y′, ξ′) z^{j−1},


where q_m(z) is a polynomial in z.

Assumption 2. (Shapiro-Lopatinskii condition) The determinant det(r_{mj}(y′, ξ′)) is bounded and bounded away from zero, that is, there exist two positive constants c and C such that

0 < c ≤ |det(r_{mj}(y′, ξ′))| ≤ C.

Remark B.2 The following Theorem B.3 has been proved in [Erkip and Schrohe (1992), Th. 3.1] in the more general case of SG-manifolds. The latter include the exteriors of bounded domains, which are a particular case of SG-manifolds; this particular case was chosen for simplicity of the exposition. Moreover, the results in [Erkip and Schrohe (1992), Th. 3.1], [Schrohe (1999)] have been obtained for operators acting in weighted Sobolev spaces. The usual Sobolev spaces in Theorem B.3 are particular cases of the weighted Sobolev spaces with weight of order zero.
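For the simplest boundary operators the residues r_{mj} can be computed directly with polynomial division. The sketch below is our own illustration for Q = I − ∆ (ν = 2, τ_1 = iχ, so the normalized polynomial is q^+ = z − i), with the Dirichlet condition (B_1 = I, ρ_1 = 0, normalized symbol b_1(z) = 1) and the Neumann condition (B_1 = D_n, ρ_1 = 1, normalized symbol b_1(z) = z); in both cases |det(r_mj)| = 1, so Assumption 2 holds:

```python
import sympy as sp

z = sp.symbols('z')

# For Q = I - Delta (nu = 2): tau_1 = i*chi, so after normalization q^+ = z - i.
qplus = z - sp.I

# Dirichlet condition B_1 = I (rho_1 = 0): normalized symbol b_1(z) = 1.
r_dir = sp.rem(sp.Integer(1), qplus, z)   # residue of b_1 modulo q^+
assert r_dir == 1                         # det(r_mj) = 1

# Neumann condition B_1 = D_n (rho_1 = 1): normalized symbol b_1(z) = z.
r_neu = sp.rem(z, qplus, z)               # z = 1*(z - i) + i
assert r_neu == sp.I                      # |det(r_mj)| = 1
```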

Theorem B.3 (cf. [Erkip and Schrohe (1992), Th. 3.1], [Schrohe (1999)]). If the differential operator Q of even order ν satisfies Assumptions 1 and 2, that is, Q is md-properly elliptic and the Shapiro-Lopatinskii condition holds for the operators (Q, γ_0B_1, ..., γ_0B_{ν/2}), then the mapping

(Q, γ_0B_1, ..., γ_0B_{ν/2}) : H^s(Ω_−) → H^{s−ν}(Ω_−) × Π_{j=1}^{ν/2} H^{s−ρ_j−1/2}(∂Ω), s ≥ ν,

is a Fredholm operator.

Assumption 3. The Fredholm operator (Q, γ_0B_1, ..., γ_0B_{ν/2}) has trivial kernel and cokernel.

For example, if the kernel R(x, y) has the property (Rh, h) ≥ c‖h‖²_{H_0^{−a}} for all h ∈ H_0^{−a}, where c = const > 0 does not depend on h, then the operator in Assumption 3 is invertible (see Chapter 1).

Corollary B.1 Under the assumptions of Theorem B.3 and, in addition, under Assumption 3, for any s ∈ R there exists a bounded (Poisson) operator

K : Π_{j=0}^{m−1} H^{s+2m−j−1/2}(∂Ω) → H^{s+2m}(Ω_−)  (B.25)

which gives the unique solution u = Kχ to the boundary-value problem

Qu = 0 in Ω_−, γ_0B_1 u = χ_1, ..., γ_0B_{ν/2} u = χ_{ν/2},  (B.26)


with

χ = (χ_1, ..., χ_{ν/2}) ∈ Π_{j=0}^{ν/2−1} H^{s+ν−j−1/2}(∂Ω).

More precisely, the operator u = Kχ solves the problem with s < 0 in the sense that u = Kχ is the limit in the space H^{s+ν}(Ω_−) of a sequence u_n in H^ν(Ω_−) with

Qu_n = 0, γ_0B_j u_n = χ_{j,n} (j = 1, ..., m), lim_{n→∞} χ_n = χ

in Π_{j=0}^{ν/2−1} H^{s+ν−j−1/2}(∂Ω).

Proof. The statement of the Corollary is an immediate consequence of Theorem B.3, due to the fact that the solution operator to the boundary-value problem (B.26) with the homogeneous equation Qu = 0 in Ω_− is a Poisson operator. The latter acts in the full scale of Sobolev spaces [Schrohe (1999)], that is, (B.25) holds for all s ∈ R. □

Theorem B.4 Under the assumptions of Theorem B.3 and, in addition, under Assumption 3, the mapping R_Ω, defined in the Introduction, is an isomorphism: H_0^{−a}(Ω) → H^a(Ω).

Proof. Let us consider the operator (Q, γ_0B_1, ..., γ_0B_{ν/2}) generated by the boundary value problem (B.24). Taking into account that

ρ_j = ord B_j = j − 1 for j = 1, ..., a,
ρ_j = ord B_j = j − µ + ν/2 − 2 for j = a + 1, ..., ν/2,

one concludes by Theorem B.3 that the mapping

(u, φ) ↦ (Q(u, φ), γ_0B_1(u, φ), ..., γ_0B_{ν/2}(u, φ)) = (w, w_1, ..., w_{ν/2})

is a Fredholm operator. It maps the space H^s(Ω_−) to the space

H^{s−ν}(Ω_−) × Π_{j=1}^a H^{s−j+1/2}(∂Ω) × Π_{j=a+1}^{ν/2} H^{s−j+µ−ν/2+3/2}(∂Ω) (s ≥ ν).

Assumption 3 implies that this mapping is an isomorphism. By the Corollary, the operator K, solving the boundary-value problem

Qu = 0, γ_0B_j u = χ_j (j = 1, ..., m),

is a Poisson operator

K : Π_{j=0}^{m−1} H^{s+2m−j−1/2}(∂Ω) → H^{s+2m}(Ω_−) (s ∈ R).


Choosing s = a and using Theorem B.2, we conclude that for any f ∈ H^a(Ω) the function u is the unique solution to the boundary-value problem (B.19). Therefore, again by Theorem B.2, the operator R_Ω is an isomorphism of the space H_0^{−a}(Ω) onto H^a(Ω). Theorem B.4 is proved. □

Example B.1 Let P = I be the identity operator (its order µ = 0) and Q = I − ∆ (ν = ord Q = 2). Then, by Theorem B.4, the corresponding operator R_Ω is an isomorphism: H_0^{−1}(Ω) → H^1(Ω).

Under the assumptions of Theorem B.4 there exists a unique solution to the integral equation (B.3). Let us find this solution. Examples of analytical formulas for the solution to the integral equation (B.3) can be found in [Ramm (1990)]. Analytical formulas for the solution, in the cases when the corresponding boundary-value problems are solvable analytically, can be obtained only for domains Ω of special shape, for example when Ω is a ball, and for special operators Q and P, for example operators with constant coefficients. We give such a formula for the solution of equation (B.3) assuming P = I and Q = −∆ + a²I. Consider the equation

R_Ω h(x) = ∫_Ω exp(−a|x − y|) / (4π|x − y|) h(y) dy = f(x), x ∈ Ω ⊂ R³, a > 0,  (B.27)

with the kernel R(x, y) := exp(−a|x − y|)/(4π|x − y|), P = I, and Q = −∆ + a²I. By formula (2.24), one obtains the unique solution to equation (B.27) in H_0^{−1}(Ω):

h(x) = (−∆ + a²)f + (∂f/∂n − ∂u/∂n) δ_∂Ω,  (B.28)

where u is the unique solution to the exterior Dirichlet boundary-value problem

(−∆ + a²)u = 0 in Ω_−, u|_∂Ω = f|_∂Ω.  (B.29)

For any ϕ ∈ C_0^∞(R^n) one has:

((−∆ + a²)Rh, ϕ) = (Rh, (−∆ + a²)ϕ) = ∫_Ω f(−∆ + a²)ϕ dx + ∫_{Ω_−} u(−∆ + a²)ϕ dx =


= ∫_Ω (−∆ + a²)f ϕ dx + ∫_{Ω_−} (−∆ + a²)u ϕ dx

− ∫_∂Ω (f ∂_n ϕ − ∂_n f ϕ) ds + ∫_∂Ω (u ∂_n ϕ − ∂_n u ϕ) ds

= ∫_Ω (−∆ + a²)f ϕ dx + ∫_∂Ω (∂_n f − ∂_n u) ϕ ds,

where the condition u = f on ∂Ω was used. Thus, we have checked that formula (B.28) gives the unique solution in H_0^{−1}(Ω) to equation (B.27). This solution has minimal order of singularity.
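The verification above can be repeated in closed form in a one-dimensional analog, where the kernel exp(−a|x − y|)/(2a) solves (−d²/dy² + a²)R = δ(x − y), and the exterior problem (B.29) has the explicit decaying solutions f(0)e^{ay} for y < 0 and f(1)e^{−a(y−1)} for y > 1. The sketch below is our own adaptation of (B.28) to Ω = (0, 1); the choices f = y² and a = 1 are arbitrary. It first re-checks that the 3D kernel of (B.27) satisfies QR = 0 away from the diagonal, then reconstructs f from h:

```python
import sympy as sp

# (i) The kernel of (B.27) satisfies (-Delta + a^2) R = 0 for r = |x - y| > 0.
r, a = sp.symbols('r a', positive=True)
R3 = sp.exp(-a*r) / (4*sp.pi*r)
lap = sp.diff(r**2 * sp.diff(R3, r), r) / r**2    # radial Laplacian in R^3
assert sp.simplify(-lap + a**2 * R3) == 0

# (ii) 1D analog of (B.28) on Omega = (0, 1) with kernel exp(-|x-y|)/2 (a = 1).
x, y = sp.symbols('x y', positive=True)
f = y**2                                          # smooth test data
h_reg = -sp.diff(f, y, 2) + f                     # regular part (-f'' + a^2 f)

# Boundary weights (d_n f - d_n u), with u the decaying exterior solution
# equal to f at the endpoints:
f0, f1 = f.subs(y, 0), f.subs(y, 1)
df0, df1 = sp.diff(f, y).subs(y, 0), sp.diff(f, y).subs(y, 1)
c0 = f0 - df0                                     # delta weight at x = 0
c1 = df1 + f1                                     # delta weight at x = 1

# R_Omega h: split the integral at y = x to resolve |x - y|
I = (sp.integrate(sp.exp(-(x - y))/2 * h_reg, (y, 0, x))
     + sp.integrate(sp.exp(-(y - x))/2 * h_reg, (y, x, 1)))
total = I + c0*sp.exp(-x)/2 + c1*sp.exp(-(1 - x))/2
assert sp.simplify(total - f.subs(y, x)) == 0     # f is recovered: R_Omega h = f
```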

B.4 Auxiliary material

We denote by R the set of real numbers and by C the set of complex numbers. Let Z := {0, ±1, ±2, ...}, N := {0, 1, ...}, N_+ := {1, 2, ...}, R^n := {x = (x_1, ..., x_n) : x_i ∈ R, i = 1, ..., n}.

Let α be a multi-index, α := (α_1, ..., α_n), α_j ∈ N, |α| := α_1 + ... + α_n, i := √−1; D_j := i^{−1} ∂/∂x_j; D^α := D_1^{α_1} D_2^{α_2} ... D_n^{α_n}.

Let C^∞(Ω̄) be the space of functions in Ω infinitely differentiable up to the boundary. A normal vector field n(x) = (n_1(x), ..., n_n(x)) is defined in a neighborhood of the boundary ∂Ω as follows: for x_0 ∈ ∂Ω, n(x_0) is the unit normal to ∂Ω, pointing into the exterior of Ω. We set

n(x) := n(x_0) for x of the form x = x_0 + s n(x_0) =: ζ(x_0, s),

where x_0 ∈ ∂Ω, s ∈ (−δ, δ). Here δ > 0 is taken so small that the representation of x in terms of x_0 ∈ ∂Ω and s ∈ (−δ, δ) is unique and smooth, that is, ζ is bijective and C^∞ with C^∞ inverse, from ∂Ω × (−δ, δ) to the set ζ(∂Ω × (−δ, δ)) ⊂ R^n.

We call differential operators tangential when, for x ∈ ζ(∂Ω × (−δ, δ)), they are either of the form

Af = Σ_{j=1}^n a_j(x) ∂f/∂x_j(x) + a_0(x) f with Σ_{j=1}^n a_j(x) n_j(x) = 0,


or they are products of such operators. The derivative along n is denoted ∂_n:

∂_n f := Σ_{j=1}^n n_j(x) ∂f/∂x_j(x)

for x ∈ ζ(∂Ω × (−δ, δ)). Let D_n := i^{−1} ∂_n.

Let Ω_− := R^n \ Ω̄ denote the exterior of the domain Ω, and let r_∂Ω, r_Ω be respectively the restriction operators to ∂Ω, Ω: r_∂Ω f := f|_∂Ω, r_Ω f := f|_Ω.

Let S(R^n) be the space of rapidly decreasing functions, that is, the space of all u ∈ C^∞(R^n) such that

sup_{|α|≤k} sup_{x∈R^n} (1 + |x|²)^m |D^α u(x)| < ∞ for all k, m ∈ N.

Let S(Ω̄_−) be the space of restrictions of the elements u ∈ S(R^n) to Ω̄_− (this space is equipped with the factor topology).

Let u ∈ C^∞(Ω̄) and v ∈ S(Ω̄_−); then we set γ_k u := r_∂Ω D_n^k u = D_n^k u|_∂Ω, γ_k v := r_∂Ω D_n^k v = D_n^k v|_∂Ω.

Let H^s(R^n) (s ∈ R) be the usual Sobolev space:

H^s(R^n) := { f ∈ S′ | F^{−1}(1 + |ξ|²)^{s/2} F f ∈ L_2(R^n) },

‖f‖_{H^s(R^n)} := ‖F^{−1}(1 + |ξ|²)^{s/2} F f‖_{L_2(R^n)},

where F denotes the Fourier transform, f(x) ↦ F_{x→ξ} f(x) = ∫_{R^n} e^{−ixξ} f(x) dx, F^{−1} its inverse, and S′ = S′(R^n) denotes the space of tempered distributions, which is dual to the space S(R^n).

Let H^s(Ω) and H^s(Ω_−) (0 ≤ s ∈ R) be respectively the spaces of restrictions of elements of H^s(R^n) to Ω and Ω_−. The norms in the spaces H^s(Ω) and H^s(Ω_−) are defined by the relations

‖f‖_{H^s(Ω)} := inf ‖g‖_{H^s(R^n)} (s ≥ 0),

‖f‖_{H^s(Ω_−)} := inf ‖g‖_{H^s(R^n)} (s ≥ 0),

where the infimum is taken over all elements g ∈ H^s(R^n) which are equal to f in Ω, respectively in Ω_−.
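The Fourier definition of the H^s(R^n) norm can be approximated numerically with an FFT when f decays rapidly. The helper below is our own sketch (not from the book), checked on a Gaussian, for which ‖f‖_{H^0} = π^{1/4} and ‖f‖²_{H^1} = (3/2)√π exactly:

```python
import numpy as np

def sobolev_norm(f_vals, L, s):
    """Approximate the H^s(R) norm of a rapidly decaying sampled function.

    f_vals: samples on a uniform grid covering [-L/2, L/2); the periodic
    FFT stands in for the Fourier transform F, which is adequate when f
    is negligible at the ends of the window.
    """
    n = f_vals.size
    dx = L / n
    xi = 2*np.pi*np.fft.fftfreq(n, d=dx)       # frequency grid, FFT ordering
    f_hat = np.fft.fft(f_vals) * dx            # approximates int e^{-ix xi} f dx
    # ||f||^2 = (2 pi)^{-1} int (1 + xi^2)^s |F f|^2 d xi, with d xi = 2 pi / L
    return np.sqrt(np.sum((1 + xi**2)**s * np.abs(f_hat)**2) / L)

# Gaussian check: F exp(-x^2/2) = sqrt(2 pi) exp(-xi^2/2)
xg = -20.0 + 40.0*np.arange(2048)/2048
fg = np.exp(-xg**2/2)
assert abs(sobolev_norm(fg, 40.0, 0) - np.pi**0.25) < 1e-6
assert abs(sobolev_norm(fg, 40.0, 1) - np.sqrt(1.5*np.sqrt(np.pi))) < 1e-6
```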


By H_0^s(Ω) (s ∈ R) and H_0^s(Ω_−) we denote the closed subspaces of the space H^s(R^n) which consist of the elements with supports respectively in Ω̄ or in Ω̄_−, that is,

H_0^s(Ω) := { f ∈ H^s(R^n) : supp f ⊆ Ω̄ } ⊂ H^s(R^n), s ∈ R,

H_0^s(Ω_−) := { f ∈ H^s(R^n) : supp f ⊆ Ω̄_− } ⊂ H^s(R^n), s ∈ R.

We define the spaces

ℋ^s(Ω) := H^s(Ω) for s > 0, ℋ^s(Ω) := H_0^s(Ω) for s ≤ 0,

ℋ^s(Ω_−) := H^s(Ω_−) for s > 0, ℋ^s(Ω_−) := H_0^s(Ω_−) for s ≤ 0.

For s ≠ k + 1/2 (k = 0, 1, ..., ℓ − 1), we define the spaces H^{s,ℓ}(Ω) and H^{s,ℓ}(Ω_−) respectively as the sets of all

(u, φ) = (u, φ_1, ..., φ_ℓ) and (v, ψ) = (v, ψ_1, ..., ψ_ℓ),

where u ∈ ℋ^s(Ω), v ∈ ℋ^s(Ω_−), and φ = (φ_1, ..., φ_ℓ), ψ = (ψ_1, ..., ψ_ℓ) are vectors in Π_{j=1}^ℓ H^{s−j+1/2}(∂Ω) satisfying the condition

φ_j = D_n^{j−1} u|_∂Ω, ψ_j = D_n^{j−1} v|_∂Ω for j < min(s, ℓ).

The norms in H^{s,ℓ}(Ω) and H^{s,ℓ}(Ω_−) can be defined as

‖(u, φ)‖²_{H^{s,ℓ}(Ω)} = ‖u‖²_{ℋ^s(Ω)} + Σ_{j=1}^ℓ ‖φ_j‖²_{H^{s−j+1/2}(∂Ω)},

‖(v, ψ)‖²_{H^{s,ℓ}(Ω_−)} = ‖v‖²_{ℋ^s(Ω_−)} + Σ_{j=1}^ℓ ‖ψ_j‖²_{H^{s−j+1/2}(∂Ω)}.

Since the components φ_j and ψ_j with index j < s + 1/2 cannot be chosen independently of u and v, we can identify H^{s,ℓ}(Ω) and H^{s,ℓ}(Ω_−) with the following spaces.


For s ≠ k + 1/2 (k = 0, 1, ..., ℓ − 1),

H^{s,ℓ}(Ω) = ℋ^s(Ω) if ℓ = 0,
H^{s,ℓ}(Ω) = ℋ^s(Ω) if 1 ≤ ℓ < s + 1/2,
H^{s,ℓ}(Ω) = ℋ^s(Ω) × Π_{j=[s+1/2]+1}^ℓ H^{s−j+1/2}(∂Ω) if 0 < s + 1/2 < ℓ,
H^{s,ℓ}(Ω) = ℋ^s(Ω) × Π_{j=1}^ℓ H^{s−j+1/2}(∂Ω) if s < 1/2,

and similarly

H^{s,ℓ}(Ω_−) = ℋ^s(Ω_−) if ℓ = 0,
H^{s,ℓ}(Ω_−) = ℋ^s(Ω_−) if 1 ≤ ℓ < s + 1/2,
H^{s,ℓ}(Ω_−) = ℋ^s(Ω_−) × Π_{j=[s+1/2]+1}^ℓ H^{s−j+1/2}(∂Ω) if 0 < s + 1/2 < ℓ,
H^{s,ℓ}(Ω_−) = ℋ^s(Ω_−) × Π_{j=1}^ℓ H^{s−j+1/2}(∂Ω) if s < 1/2.

Finally, for s = k + 1/2 (k = 0, 1, ..., ℓ − 1), we define the spaces H^{s,ℓ}(Ω), H^{s,ℓ}(Ω_−) by the method of complex interpolation.

Let us note that for s ≠ k + 1/2 (k = 0, 1, ..., ℓ − 1), the spaces H^{s,ℓ}(Ω), H^{s,ℓ}(Ω_−) are the completions of C^∞(Ω̄), S(Ω̄_−) respectively in the norms

‖(u, γ_0 u, ..., γ_{ℓ−1} u)‖²_{H^{s,ℓ}(Ω)} = ‖u‖²_{ℋ^s(Ω)} + Σ_{j=0}^{ℓ−1} ‖γ_j u‖²_{H^{s−j−1/2}(∂Ω)},

‖(v, γ_0 v, ..., γ_{ℓ−1} v)‖²_{H^{s,ℓ}(Ω_−)} = ‖v‖²_{ℋ^s(Ω_−)} + Σ_{j=0}^{ℓ−1} ‖γ_j v‖²_{H^{s−j−1/2}(∂Ω)}.


Bibliographical Notes

The estimation theory optimal by the criterion of minimum of the error variance was created by N. Wiener (1942) for stationary random processes and for an infinite interval of the time of observation. A large bibliography can be found in [Kailath (1974)]. The theory was essentially finished by the mid-sixties for a finite interval of the time of observation and for stationary random processes with rational spectral density. Many attempts were made in the engineering literature to construct a generalization of the Wiener theory for the case of random fields. The reason is that such a theory is needed in many applications, e.g., TV and optical signal processing, geophysics, underwater acoustics, radiophysics, etc. The attempts to give an analytical estimation theory in the engineering literature (see [Ekstrom (1982)] and references therein) were based on some type of scanning, and the problem was not solved as an optimization problem for random fields.

The first analytical theory of random fields estimation and filtering, which is a generalization of Wiener's theory, was developed in the series of papers [Ramm (1969); Ramm (1969b); Ramm (1969c); Ramm (1970b); Ramm (1970c); Ramm (1970d); Ramm (1971); Ramm (1971c); Ramm (1973c); Ramm (1975); Ramm (1976); Ramm (1978); Ramm (1978b); Ramm (1978c); Ramm (1978d); Ramm (1979); Ramm (1980b); Ramm (1980c); Ramm (1984b); Ramm (1985); Ramm (1987d); Ramm (2002); Ramm (2003); Kozhevnikov and Ramm (2005)] and in [Ramm (1980), Chapter 1]. This theory is presented in Chapters 2-4. Its applications are given in Chapter 7, and its generalizations to a wider class of random fields are given in Appendices A and B. The material in Chapter 3 is based on the paper [Ramm (1985)]. The material in Section 7.7 is taken from [Ramm (1980)], where a reference to the paper Katznelson, J. and Gould, L., Construction of nonlinear filters and control systems, Information and Control,



5, (1962), 108-143, can be found together with a critical remark concerning this paper. In Section 7.2 the paper [Ramm (1973b)] is used; in Section 7.3.4 the papers [Ramm (1968); Ramm (1973b); Ramm (1978); Ramm (1981); Ramm (1985b); Ramm (1984); Ramm (1987b); Ramm (1987c)] are used; the stable differentiation formulas (7.66), (7.67) and (7.71) were first given in [Ramm (1968)]; in Section 7.5 the papers [Ramm (1969b); Ramm (1970b); Ramm (1970c); Ramm (1970d)] are used. There is a large literature in which various aspects of the theory presented in Section 7.6 are discussed, see [Fedotov (1982)], [Ivanov et. al. (1978)], [Lattes and Lions (1967)], [Lavrentiev and Romanov (1986)], [Morozov (1984)], [Payne (1975)], [Tanana (1981)], [Tikhonov (1977)] and references therein. The presentation in Section 7.6 is self-contained and partly based on [Ramm (1981)].

The class R of random fields has been introduced by the author in 1969 [Ramm (1969b); Ramm (1970b); Ramm (1970c); Ramm (1970d)]. It was found (see [Molchan (1975)], [Molchan (1974)]) that Gaussian random fields have the Markov property if and only if they are in the class R and P(λ) = 1 (see formula (1.10)).

Chapter 5 contains a singular perturbation theory for the class of integral equations basic in estimation theory. This chapter is based on [Ramm and Shifrin (2005)] (see also [Ramm and Shifrin (1991)], [Ramm and Shifrin (1993)], [Ramm and Shifrin (1995)]).

Random fields have been studied extensively [Adler (1981)], [Gelfand and Vilenkin (1968)], [Koroljuk (1978)], [Pitt (1971)], [Rosanov (1982)], [Vanmarcke (1983)], [Wong (1986)], [Yadrenko (1983)], but there is no intersection between the theory given in this book and the material presented in the literature. In the presentation of the material in Section 8.1.2 the author used the book [Berezanskij (1968)]. Theorem 8.1 in Section 8.1.1 is taken from [Mazja (1986), p.
60], and the method of obtaining the eigenfunction expansion theorem in Section 8.2 is taken from [Berezanskij (1968)].

There is a large literature on the material presented in Section 8.2.4. Of course, it is not possible in this book to cover this material in depth (and it was not our goal). Only some facts, useful for a better understanding of the theory presented in this book, are given. For second-order elliptic operators a number of stronger conditions sufficient for L to be selfadjoint or essentially selfadjoint are known (see [Kato (1981)]).

The assumption that L is selfadjoint is basic for the eigenfunction expansion theory developed in [Berezanskij (1968)]. In some cases an eigenfunction expansion theory sufficient for our purposes can be developed

Bibliographical Notes 361

for certain non-selfadjoint operators (see [Ramm (1981b); Ramm (1981c); Ramm (1982); Ramm (1983)]). We have discussed the case when D, the domain of observation, is finite. For the Schrödinger operator the spectral and scattering theory in some domains with infinite boundaries is developed in [Ramm (1963b); Ramm (1963); Ramm (1965); Ramm (1968b); Ramm (1969d); Ramm (1970); Ramm (1971b); Ramm (1987); Ramm (1988b)]. The material in Section 8.3.1 is well known. The proofs of the results about s-values can be found in [Gohberg and Krein (1969)]. The material in Section 8.3.2 belongs to the author [Ramm (1980); Ramm (1981b); Ramm (1981c)], and the proofs of all of the results are given in detail. The material in Section 8.3.3 is known, and proofs of the results can be found in [Gohberg and Krein (1969)], [Konig (1986)], and [Pietsch (1987)]. In Section 8.4 some reference material in probability theory and statistics is given. One can find much more material in this area in [Koroljuk (1978)]. The purpose of Chapter 6 is to explain some connections between estimation and scattering theory. In the presentation of the scattering theory in Section 6.1 the papers [Ramm (1963b); Ramm (1963); Ramm (1965); Ramm (1968b); Ramm (1969d)] are used, where the scattering theory was developed for the first time in some domains with infinite boundaries. Most of the results in Section 6.1 are well known, except for Theorems 6.1 and 6.2, which are taken from [Ramm (1987e); Ramm (1987f); Ramm (1988); Ramm (1988c); Ramm (1989)]. It is not possible here to give a bibliography on scattering theory. Povzner (1953-1955) and then Ikebe (1960) studied the scattering problem in R^3. Much work has been done since then (see [Hörmander (1983-85)], vol. II, IV, and references therein).
A short and self-contained presentation of the scattering theory given in Section 6.1 may be useful for many readers who would like to get quick access to basic results and do not want to worry about extra assumptions on the rate of decay of the potential. Lemma 6.1 in Section 6.2 is well known, equation (6.13) is derived in [Newton (1982)], our presentation partly follows [Ramm (1987e); Ramm and Weaver (1987)], and Theorem 6.1 is taken from [Ramm (1987e)]. A connection between estimation and scattering theory for one-dimensional problems has been known for quite a while. In [Levy and Tsitsiklis (1985)] and [Yagle (1988)] some multidimensional problems of

362 Random Fields Estimation Theory

estimation theory were discussed. In Section 6.3 some of the ideas from [Yagle (1988)] are used. Our arguments are given in more detail than in [Yagle (1988)]. The estimation problems discussed in [Levy and Tsitsiklis (1985)] and [Yagle (1988)] are the problems in which the noise has a white component and the covariance function has a special structure. In [Levy and Tsitsiklis (1985)] it is assumed that r = 2 and R(x, y) = R(|x − y|), that is, the random field is isotropic, and in [Yagle (1988)] r ≥ 2 and ∆_x R(x, y) = ∆_y R(x, y). The objective in these papers is to develop a generalization of the Levinson recursion scheme for estimation in the one-dimensional case. The arguments in [Levy and Tsitsiklis (1985)] and [Yagle (1988)] are not applicable in the case when the noise is colored, that is, when there is no white component in the noise. There has been much work on the efficient inversion of Toeplitz matrices [Friedlander et al. (1979)]. These matrices arise when one discretizes equation (3.4). However, as ε → 0, one cannot invert the corresponding Toeplitz matrix, since its condition number grows quickly as ε → 0. If ε = 1 in (3.4), then there are many efficient ways to solve equation (3.4). It would be of interest to compare the numerical efficiency of various methods. It may be that an iterative method, or a projection method, will be more efficient than the discretization method with equidistant nodes used together with the efficient method of inverting the resulting Toeplitz matrix. The results in Appendix A are taken from [Ramm (2003)], and the results in Appendix B are taken from [Kozhevnikov and Ramm (2005)]. The author has tried to make the material in this book accessible to a large audience. The material from the theory of elliptic pseudodifferential equations (see, e.g., [Hörmander (1983-85)]) was not used. The class R of kernels is a subset of the set of pseudodifferential operators.
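The growth of the condition number described above is easy to observe numerically. The sketch below is not from the book: the kernel exp(-|x - y|) (a typical covariance of class-R type), the interval [0, 1], and the grid size are illustrative choices. Discretizing the operator in (εI + R)h = f on an equidistant grid yields a symmetric Toeplitz matrix, and the condition number of the discretized system grows as ε → 0.

```python
import numpy as np

# Nystrom discretization of (eps*I + R)h = f on [0, 1] with the
# illustrative covariance kernel R(x, y) = exp(-|x - y|); on an
# equidistant grid the discretized kernel is a symmetric Toeplitz matrix.
n = 200
x = np.linspace(0.0, 1.0, n)
w = x[1] - x[0]                      # quadrature weight
R = np.exp(-np.abs(x[:, None] - x[None, :])) * w

for eps in (1.0, 1e-2, 1e-4):
    A = eps * np.eye(n) + R
    # cond(A) ~ (eps + lambda_max)/(eps + lambda_min) blows up as
    # eps -> 0, so direct inversion of the Toeplitz matrix is unstable
    print(f"eps = {eps:g}, cond(A) = {np.linalg.cond(A):.3e}")
```

For ε = 1 the system is well conditioned and any standard solver works; as ε shrinks, the growth of cond(A) is what rules out naive inversion of the discretized Toeplitz system.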
The results obtained in this book concerning equations in the class R are final in the sense that an exact description of the range of the operators with kernels in the class R is given and analytical formulas for the solutions of these equations are obtained. The general theory of pseudodifferential operators does not provide analytical formulas for the solutions. It was possible to derive such formulas in this book because of the special structure of the kernels in the class R. In [Eskin (1981), §27] an asymptotic solution is obtained for a class of pseudo-differential equations with a small parameter.

Bibliography

Adler, R. (1981). The geometry of random fields, J. Wiley, New York.
Agmon, S. (1982). Lectures on exponential decay of solutions of second-order elliptic equations, Princeton Univ. Press, Princeton.
Akhieser, N. (1965). Lectures on approximation theory, Nauka, Moscow.
Aronszajn, N. (1950). Theory of reproducing kernels, Trans. Am. Math. Soc., 68, pp. 337-404.
Berezanskij, Yu. (1968). Expansions in eigenfunctions of selfadjoint operators, Amer. Math. Soc., Providence, RI.
Deimling, K. (1985). Nonlinear functional analysis, Springer Verlag, New York.
Ekstrom, M. (1982). Realizable Wiener filtering in two dimensions, IEEE Trans. on acoustics, speech and signal processing, 30, pp. 31-40.
Ekstrom, M. and Woods, J. (1976). Two-dimensional spectral factorization with applications in recursive digital filtering, ibid., 2, pp. 115-128.
Erkip, A. and Schrohe, E. (1992). Normal solvability of elliptic boundary-value problems on asymptotically flat manifolds, J. of Functional Analysis, 109, pp. 22-51.
Eskin, G. (1981). Boundary value problems for elliptic pseudodifferential equations, Amer. Math. Soc., Providence, RI.
Fedotov, A. (1982). Linear ill-posed problems with random errors in the data, Nauka, Novosibirsk.
Friedlander, B., Morf, M., Kailath, T. and Ljung, L. (1979). New inversion formulas for matrices classified in terms of their distance from Toeplitz matrices, Linear algebra and its applications, 27, pp. 31-60.
Gakhov, F. (1966). Boundary-value problems, Pergamon Press, Oxford.
Glazman, I. (1965). Direct methods of qualitative spectral analysis of singular differential operators, Davey, New York.
Gohberg, I. and Krein, M. (1969). Introduction to the theory of linear nonselfadjoint operators, AMS, Providence.
Gilbarg, D. and Trudinger, N. (1977). Elliptic partial differential equations of second order, Springer Verlag, New York.
Gelfand, I. and Vilenkin, N. (1968). Generalized functions, vol. 4, Acad. Press, New York.



Grubb, G. (1990). Pseudo-differential problems in Lp spaces, Commun. in Partial Differ. Eq., 15 (3), pp. 289-340.
Grubb, G. (1996). Functional calculus of pseudodifferential boundary problems, Birkhäuser, Boston.
Hörmander, L. (1983-85). The analysis of linear partial differential operators, vol. I-IV, Springer Verlag, New York.
Ivanov, V., Vasin, V. and Tanana, V. (1978). Theory of linear ill-posed problems and applications, Nauka, Moscow.
Kato, T. (1995). Perturbation theory for linear operators, Springer Verlag, New York.
Kato, T. (1981). Spectral theory of differential operators, North Holland, Amsterdam (ed. Knowles, I. and Lewis, R.), pp. 253-266.
Kato, T. (1959). Growth properties of solutions of the reduced wave equation, Comm. Pure Appl. Math., 12, pp. 403-425.
Kailath, T. (1974). A view of three decades of linear filtering theory, IEEE Trans. on inform. theory, IT-20, pp. 145-181.
Kantorovich, L. and Akilov, G. (1980). Functional analysis, Pergamon Press, New York.
Klibanov, M. (1985). On uniqueness of the determination of a compactly supported function from the modulus of its Fourier transform, Dokl. Acad. Sci. USSR, 32, pp. 668-670.
Konig, H. (1986). Eigenvalue distribution of compact operators, Birkhäuser, Stuttgart.
Koroljuk, V. (ed.) (1978). Reference book in probability and statistics, Naukova Dumka, Kiev.
Kozhevnikov, A. (2001). Complete scale of isomorphisms for elliptic pseudodifferential boundary-value problems, J. London Math. Soc. (2), 64, pp. 409-422.
Kozhevnikov, A. and Ramm, A.G. (2005). Integral operators basic in random fields estimation theory, Intern. J. Pure and Appl. Math., 20, N3, pp. 405-427.
Kozlov, V., Maz'ya, V. and Rossmann, J. (1997). Elliptic boundary-value problems in domains with point singularities, AMS, Providence.
Krasnoselskii, M., et al. (1972). Approximate solution of operator equations, Walters-Noordhoff, Groningen.
Lattes, R. and Lions, J. (1967). Méthode de quasi-réversibilité et applications, Dunod, Paris.
Lavrentiev, M., Romanov, V. and Shishatskii, S. (1986). Ill-posed problems of mathematical physics and analysis, Amer. Math. Soc., Providence, RI.
Levitan, B. (1971). Asymptotic behavior of spectral function of elliptic equation, Russ. Math. Survey, 6, pp. 151-212.
Levy, B. and Tsitsiklis, J. (1985). A fast algorithm for linear estimation of two-dimensional random fields, IEEE Trans. Inform. Theory, IT-31, pp. 635-644.
Mazja, V. (1986). Sobolev spaces, Springer Verlag, New York.
Molchan, G. (1975). Characterization of Gaussian fields with Markov property, Sov. Math. Doklady, 12, pp. 563-567.
Molchan, G. (1974). L-Markov Gaussian fields, ibid., 15, pp. 657-662.


Morozov, V. (1984). Methods for solving incorrectly posed problems, Springer Verlag, New York.
Naimark, M. (1969). Linear differential operators, Nauka, Moscow.
Newton, R. (1982). Scattering of waves and particles, Springer Verlag, New York.
Payne, L.E. (1975). Improperly posed problems, Part. Dif. Equ., Regional Conf. Appl. Math., Vol. 22, SIAM, Philadelphia.
Pietsch, A. (1987). Eigenvalues and s-numbers, Cambridge Univ. Press, Cambridge.
Piterbarg, L. (1981). Investigation of a class of integral equations, Diff. Uravnenija, 17, pp. 2278-2279.
Pitt, L. (1971). A Markov property for Gaussian processes with a multidimensional parameter, Arch. Rat. Mech. Anal., 43, pp. 367-391.
Preston, C. (1967). Random fields, Lect. notes in math., N34, Springer Verlag, New York.
Ramm, A.G. (1963). Spectral properties of the Schrödinger operator in some domains with infinite boundaries, Doklady Acad. Sci. USSR, 152, pp. 282-285.
Ramm, A.G. (1963b). Investigation of the scattering problem in some domains with infinite boundaries I, II, Vestnik, 7, pp. 45-66; 19, pp. 67-76.
Ramm, A.G. (1965). Spectral properties of the Schrödinger operator in some infinite domains, Mat. Sbor., 66, pp. 321-343.
Ramm, A.G. (1968). On numerical differentiation, Math., Izvestija vuzov, 11, pp. 131-135.
Ramm, A.G. (1968b). Some theorems on analytic continuation of the Schrödinger operator resolvent kernel in the spectral parameter, Izv. Ac. Nauk. Arm. SSR, Mathematics, 3, pp. 443-464.
Ramm, A.G. (1969). Filtering of nonstationary random fields in optical systems, Opt. and Spectroscopy, 26, pp. 808-812.
Ramm, A.G. (1969b). Apodization theory, Optics and Spectroscopy, 27, pp. 508-514.
Ramm, A.G. (1969c). Filtering of nonhomogeneous random fields, ibid., 27, pp. 881-887.
Ramm, A.G. (1969d). Green's function study for differential equation of the second order in domains with infinite boundaries, Diff. eq., 5, pp. 1509-1516.
Ramm, A.G. (1970). Eigenfunction expansion for nonselfadjoint Schrödinger operator, Doklady, 191, pp. 50-53.
Ramm, A.G. (1970b). Apodization theory II, Opt. and Spectroscopy, 29, pp. 390-394.
Ramm, A.G. (1970c). Increasing of the resolution ability of the optical instruments by means of apodization, ibid., 29, pp. 594-599.
Ramm, A.G. (1970d). On resolution ability of optical systems, ibid., 29, pp. 794-798.
Ramm, A.G. (1971). Filtering and extrapolation of some nonstationary random processes, Radiotech. i Electron., 16, pp. 80-87.
Ramm, A.G. (1971b). Eigenfunction expansions for exterior boundary problems, ibid., 7, pp. 737-742.


Ramm, A.G. (1971c). On multidimensional integral equations with the translation kernel, Diff. eq., 7, pp. 2234-2239.
Ramm, A.G. (1972). Simplified optimal differentiators, Radiotech. i Electron., 17, pp. 1325-1328.
Ramm, A.G. (1973). On some class of integral equations, ibid., 9, pp. 931-941.
Ramm, A.G. (1973b). Optimal harmonic synthesis of generalized Fourier series and integrals with randomly perturbed coefficients, Radiotechnika, 28, pp. 44-49.
Ramm, A.G. (1973c). Discrimination of random fields in noises, Probl. peredaci informacii, 9, pp. 22-35.
Ramm, A.G. (1975). Approximate solution of some integral equations of the first kind, Diff. eq., 11, pp. 582-586.
Ramm, A.G. (1976). Investigation of a class of integral equations, Doklady Acad. Sci. USSR, 230, pp. 283-286.
Ramm, A.G. (1978). A new class of nonstationary processes and fields and its applications, Proc. 10 all-union sympos. "Methods of representation and analysis of random processes and fields", Leningrad, 3, pp. 40-43.
Ramm, A.G. (1978b). On eigenvalues of some integral equations, Diff. Equations, 15, pp. 932-934.
Ramm, A.G. (1978c). Investigation of a class of systems of integral equations, Proc. Intern. Congr. on appl. math., Weimar, DDR, pp. 345-351.
Ramm, A.G. (1978d). Investigation of some classes of integral equations and their application, in "Abel inversion and its generalizations", ed. N. Preobrazhensky, Siberian Dep. of Acad. Sci. USSR, Novosibirsk, pp. 120-179.
Ramm, A.G. (1979). Linear filtering of some vectorial nonstationary random processes, Math. Nachrichten, 91, pp. 269-280.
Ramm, A.G. (1980). Theory and applications of some new classes of integral equations, Springer Verlag, New York.
Ramm, A.G. (1980b). Investigation of a class of systems of integral equations, Journ. Math. Anal. Appl., 76, pp. 303-308.
Ramm, A.G. (1980c). Analytical results in random fields filtering theory, Zeitschr. Angew. Math. Mech., 60, pp. T361-T363.
Ramm, A.G. (1981). Stable solutions of some ill-posed problems, Math. Meth. in Appl. Sci., 3, pp. 336-363.
Ramm, A.G. (1981b). Spectral properties of some nonselfadjoint operators, Bull. Am. Math. Soc., 5, N3, pp. 313-315.
Ramm, A.G. (1981c). Spectral properties of some nonselfadjoint operators and some applications, in "Spectral theory of differential operators", Math. Studies, North Holland, Amsterdam, ed. I. Knowles and R. Lewis, pp. 349-354.
Ramm, A.G. (1982). Perturbations preserving asymptotics of spectrum with a remainder, Proc. A.M.S., 85, N2, pp. 209-212.
Ramm, A.G. (1983). Eigenfunction expansions for some nonselfadjoint operators and the transport equation, J. Math. Anal. Appl., 92, pp. 564-580.
Ramm, A.G. (1984). Estimates of the derivatives of random functions, J. Math. Anal. Appl., 102, pp. 244-250.


Ramm, A.G. (1984b). Analytic theory of random fields estimation and filtering, Proc. of the intern. sympos. on Mathematics in systems theory (Beer Sheva, 1983), Lecture notes in control and inform. sci., N58, Springer Verlag, pp. 764-773.
Ramm, A.G. (1985). Numerical solution of integral equations in a space of distributions, J. Math. Anal. Appl., 110, pp. 384-390.
Ramm, A.G. (1985b). Estimates of the derivatives of random functions II (with T. Miller), J. Math. Anal. Appl., 110, pp. 429-435.
Ramm, A.G. (1986). Scattering by obstacles, Reidel, Dordrecht.
Ramm, A.G. (1987). Sufficient conditions for zero not to be an eigenvalue of the Schrödinger operator, J. Math. Phys., 28, pp. 1341-1343.
Ramm, A.G. (1987b). Optimal estimation from limited noisy data, Journ. Math. Anal. Appl., 125, pp. 258-266.
Ramm, A.G. (1987c). Signal estimation from incomplete data, Journ. Math. Anal. Appl., 125, pp. 267-271.
Ramm, A.G. (1987d). Analytic and numerical results in random fields estimation theory, Math. Reports of the Acad. of Sci., Canada, 9, pp. 69-74.
Ramm, A.G. (1987e). Characterization of the scattering data in multidimensional inverse scattering problem, in: Inverse Problems: An Interdisciplinary Study (ed. P. Sabatier), Acad. Press, New York, pp. 153-167.
Ramm, A.G. (1987f). Completeness of the products of solutions to PDE and uniqueness theorems in inverse scattering, Inverse Problems, 3, L77-L82.
Ramm, A.G. (1988). Multidimensional inverse problems and completeness of the products of solutions to PDE, J. Math. Anal. Appl., 134, 1, pp. 211-253.
Ramm, A.G. (1988b). Conditions for zero not to be an eigenvalue of the Schrödinger operator, J. Math. Phys., 29, pp. 1431-1432.
Ramm, A.G. (1988c). Recovery of potential from the fixed energy scattering data, Inverse Problems, 4, pp. 877-886.
Ramm, A.G. (1989). Multidimensional inverse scattering problems and completeness of the products of solutions to homogeneous PDE, Zeitschr. f. angew. Math. u. Mech., T305, N4-5, T13-T22.
Ramm, A.G. (1990). Random fields estimation theory, Longman Scientific and Wiley, New York, pp. 1-273.
Ramm, A.G. (1990b). Stability of the numerical method for solving the 3D inverse scattering problem with fixed energy data, Inverse Problems, 6, L7-L12; J. reine angew. Math., 414, (1991), pp. 1-21.
Ramm, A.G. (1990c). Is the Born approximation good for solving the inverse problem when the potential is small?, J. Math. Anal. Appl., 147, pp. 480-485.
Ramm, A.G. (1991). Symmetry properties of scattering amplitudes and applications to inverse problems, J. Math. Anal. Appl., 156, pp. 333-340.
Ramm, A.G. (1992). Multidimensional inverse scattering problems, Longman, New York (Russian edition, Mir, Moscow, 1993).
Ramm, A.G. (1996). Random fields estimation theory, MIR, Moscow, pp. 1-352.
Ramm, A.G. (2002). Estimation of Random Fields, Theory of Probability and Math. Statistics, 66, pp. 95-108.


Ramm, A.G. (2003). Analytical solution of a new class of integral equations, Differential and Integral Equations, 16, N2, pp. 231-240.
Ramm, A.G. (2003a). On a new notion of regularizer, J. Phys. A, 36, pp. 2191-2195.
Ramm, A.G. (2004). One dimensional inverse scattering and spectral problems, Cubo Math. Journ., 6, N1, pp. 313-426.
Ramm, A.G. (2005). Inverse problems, Springer, New York.
Ramm, A.G. and Shifrin, E.I. (1991). Asymptotics of the solution to a singularly perturbed integral equation, Appl. Math. Lett., 4, pp. 67-70.
Ramm, A.G. and Shifrin, E.I. (1993). Asymptotics of the solutions to singularly perturbed integral equations, Journal of Mathematical Analysis and Applications, 178, N2, pp. 322-343.
Ramm, A.G. and Shifrin, E.I. (1995). Asymptotics of the solutions to singularly perturbed multidimensional integral equations, Journal of Mathematical Analysis and Applications, 190, N3, pp. 667-677.
Ramm, A.G. and Shifrin, E.I. (2005). Singular perturbation theory for a class of Fredholm integral equations arising in random fields estimation theory, Journal of Integral Equations and Operator Theory.
Ramm, A.G. and Weaver, O. (1987). A characterization of the scattering data in 3D inverse scattering problem, Inverse Problems, 3, L49-L52.
Ramm, A.G. and Weaver, O. (1989). Necessary and sufficient condition on the fixed energy data for the potential to be spherically symmetric, Inverse Problems, 5, pp. 445-447.
Roitberg, Ya. A. (1996). Elliptic boundary-value problems in the spaces of distributions, Kluwer, Dordrecht.
Rosanov, Yu. (1982). Markov random fields, Springer Verlag, New York.
Rudin, W. (1973). Functional Analysis, McGraw Hill, New York.
Safarov, Yu. and Vassiliev, D. (1997). The asymptotic distribution of eigenvalues of partial differential operators, American Mathematical Society, Providence, RI.
Saito, Y. (1982). Some properties of the scattering amplitude and the inverse scattering problem, Osaka J. Math., 19, pp. 527-547.
Schrohe, E. (1987). Spaces of weighted symbols and weighted Sobolev spaces on manifolds, in: Pseudo-Differential Operators (ed. Cordes, H.O., Gramsch, B. and Widom, H.), Springer LN Math. 1256, Springer Verlag, Berlin, pp. 360-377.
Schrohe, E. (1999). Fréchet algebra techniques for boundary value problems on noncompact manifolds, Math. Nachr., 199, pp. 145-185.
Shubin, M. (1986). Pseudodifferential operators and spectral theory, Springer Verlag, New York.
Skriganov, M. (1978). High-frequency asymptotics of the scattering amplitude, Sov. Physics, Doklady, 241, pp. 326-329.
Somersalo, E., et al. (1988). Inverse scattering problem for the Schrödinger equation in three dimensions, IMA preprint 449, pp. 1-7.
Tanana, V. (1981). Methods for solving operator equations, Nauka, Moscow.


Tikhonov, A. and Arsenin, V. (1977). Solutions of ill-posed problems, Winston, Washington.
Tulovskii, V. (1979). Asymptotic distribution of eigenvalues of differential equations, Matem. Sborn., 89, pp. 191-206.
Vanmarcke, E. (1983). Random fields: analysis and synthesis, MIT Press, Cambridge.
Vishik, M.I. and Lusternik, L. (1962). Regular degeneration and boundary layer for linear differential equations with a small parameter, Amer. Math. Soc. Transl., 20, pp. 239-264.
Wloka, J.T. (1987). Partial differential equations, Cambridge University Press.
Wloka, J.T., Rowley, B. and Lawruk, B. (1995). Boundary value problems for elliptic systems, Cambridge University Press.
Wong, E. (1986). In search of multiparameter Markov processes, in: Communications and Networks (ed. I.F. Blake and H.V. Poor), Springer Verlag, New York, pp. 230-243.
Yadrenko, M. (1983). Spectral theory of random fields, Optimization Software, New York.
Yagle, A. (1988). Connections between 3D inverse scattering and linear least-squares estimation of random fields, Acta Appl. Math., 13, N3, pp. 267-289.
Yagle, A. (1988b). Generalized split Levinson, Schur, and lattice algorithms for 3D random fields estimation problem (preprint).
Zabreiko, P., et al. (1968). Integral equations, Reference Text, Nauka, Moscow.


Symbols

Spaces
H_+ ⊂ H_0 ⊂ H_-, 239
C^ℓ(D), 236
L^p(D), L^p(D, µ), 233
W^{ℓ,p}(D), 233
H^ℓ(D), 238
D = C_0^∞, 236
D', 236
S, 236
S', 236
Ḣ^ℓ(D), 238
H^{-ℓ}(D), Ḣ^{-ℓ}(D), 239
V, 70
W, 71
Ḣ^ℓ(D) = Ḣ^1(D) ∩ H^ℓ(D)

Classes of domains
C^{0,1}
with cone property
EW^{ℓ,p}, 238

Classes of operators
σ_p, 243
σ_1 trace class, 243
σ_2 Hilbert-Schmidt class, 243
σ_2(H_1, H_2)



L elliptic operator, 255

Special functions
J_ν(x), 26
K_0(x), 27
h_n(r) spherical Hankel functions, 160
Y_n(Θ) spherical harmonics, 160
δ(x) delta function, 4

Symbols used in the definition of class R
dρ spectral measure of L, 3
P(λ), Q(λ) polynomials, 3
p = deg P(λ), 3
q = deg Q(λ), 3
s = ord L, 3
R class of kernels, 3
Φ(x, y, λ) spectral kernel of L, 3
Λ spectrum of L, 3

Various symbols
R^r Euclidean r-dimensional space, 1
S^2 the unit sphere in R^3, 161
s(x) useful signal, 1
n(x) noise, 1
U(x) observed signal, 1
h(x, y) optimal filter, 2
H_0, H_1 hypotheses, 163
ℓ(u_1, . . ., u_n) the likelihood ratio, 163
N(A) null-space of A, 275
Ran A range of A, 275
D(A) domain of A, 275
σ(A) spectrum of A, 275
→ strong convergence, 39
⇀ weak convergence, 260

R_{R^r}, 9

Index

Approximation of the kernel, 46
Asymptotic efficiency, 317
Bochner-Khintchine theorem, 304
Characteristic function, 304
Characterization of the scattering data, 138
Completeness property of the scattering solution, 131
Conditional mean value, 303
Correlation function, 306
Covariance function, 305
Direct scattering problem, 111
Distributions, 236
Elliptic estimate, 76
Estimate, Bayes', 315
Estimate, efficient, 315
Estimate, maximum likelihood, 316
Estimate, minimax, 315
Estimate, unbiased, 315
Estimation in Hilbert space, 310
Integral representation of random functions, 309
Inverse scattering problem, 134
Iterative method, 38
Mean value, 305
Mercer's theorem, 61
Moment functions, 305
Order of singularity, 3
Projection methods, 39
Random function, 305
Reproducing kernel, 59
Rigged triple of Hilbert spaces, 239
Singular support, 15
Sobolev spaces, 233
Solution of (2.12) of minimal order of singularity (mos solution), 14
Spectral density, 314
Spectral measure, 18
Stochastic integral, 309
Transmission problem, 15
Variance, 303
Weakly lower semicontinuous, 210
