
Data types in science

Discrete data (data tables): observations, calculations
Continuous data: analytic functions, analytic solutions


From continuous to discrete …

From discrete to continuous???

[Photo: soon after Hurricane Isabel (September 2003)]


What do we want from discrete sets of data?

Quite often we need to know function values at any arbitrary point x.
Can we generate values for any x between x1 and xn from a data table?

If you think that the data values fi in the data table are free from
errors, then interpolation lets you find an approximate value for f(x)
at any point x within the interval x1...xn.

[Figure: data points y(x) plotted vs. x]

Key point!!!

The idea of interpolation is to select a function g(x) such that
1. g(xi) = fi for each data point i
2. this function is a good approximation for any other x between the
   original data points

[Figure: data points y(x) vs. x with an interpolating curve]

Applications of approximating functions: interpolation,
differentiation, integration

What is a good approximation?

What can we consider as a good approximation to the original data if
we do not know the original function?

Important to remember

Interpolation ≡ Approximation
There is no exact and unique solution.
The actual function is NOT known and CANNOT be determined from the
tabular data.
Data points may be interpolated by an infinite number of functions.


Two-step procedure

1. Select a function g(x)
2. Find coefficients

Step 1: selecting a function g(x)

We should have a guideline to select a reasonable function g(x). We
need some ideas about the data!!! g(x) may have some standard form or
be specific for the problem.


Some ideas for selecting g(x)

- Most interpolation methods are grounded on the 'smoothness' of the
  interpolated functions. However, it does not work all the time.
- Practical approach: what physics lies beneath the data.

Linear combination is the most common form of g(x)

A linear combination of elementary functions: polynomials, or
trigonometric, or exponential functions, or rational functions, …

g(x) = a1 h1(x) + a2 h2(x) + a3 h3(x) + ...

Three of the most common approximating functions:
- Polynomials
- Trigonometric functions
- Exponential functions

Approximating functions should have the following properties:
- easy to determine
- easy to evaluate
- easy to differentiate
- easy to integrate


Linear interpolation: C++ example

double int1(double x, double xi[], double yi[], int imax)
{
  double y;
  int j;
  // if x is outside the xi[] interval
  if (x <= xi[0]) return y = yi[0];
  if (x >= xi[imax-1]) return y = yi[imax-1];
  // loop to find j so that xi[j-1] < x < xi[j]
  j = 0;
  while (j <= imax-1)
  {
    if (xi[j] >= x) break;
    j = j + 1;
  }
  y = yi[j-1] + (yi[j] - yi[j-1])*(x - xi[j-1])/(xi[j] - xi[j-1]);
  return y;
}

Note: a bisection approach is much more efficient for searching an array.
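The note about bisection can be sketched as follows; this version (the function name find_interval is my own choice, not from the slides) could replace the linear while-loop in int1():

```cpp
#include <cassert>

// Return j such that xi[j-1] <= x < xi[j], for xi[] sorted ascending
// and x strictly inside (xi[0], xi[imax-1]).  Bisection needs only
// O(log imax) comparisons instead of a linear scan over the array.
int find_interval(const double xi[], int imax, double x)
{
    int lo = 0, hi = imax - 1;          // invariant: xi[lo] <= x < xi[hi]
    while (hi - lo > 1)
    {
        int mid = (lo + hi) / 2;
        if (xi[mid] <= x) lo = mid;
        else              hi = mid;
    }
    return hi;                          // x lies in [xi[hi-1], xi[hi]]
}
```

The gain matters for long tables: for 10^6 points the loop runs about 20 times instead of up to a million.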

Linear interpolation: conclusions

Linear interpolation may work well for very smooth functions, when the
second and higher derivatives are small.

It is worthwhile to note that for each data interval one has a
different set of coefficients a0 and a1. This is the principal
difference from data fitting, where the same function, with the same
coefficients, is used to fit the data points on the whole interval
[x1,xn].

We may improve the quality of linear interpolation by increasing the
number of data points xi on the interval. HOWEVER!!! It is much better
to use higher-order interpolation.

[Figure: linear interpolation of f(x) = sin(x^2) with data points]

Linear vs. quadratic interpolation

Example from F. S. Acton, "Numerical Methods that Work":
"A table of sin(x) covering the first quadrant, for example, requires
541 pages if it is to be linearly interpolable to eight decimal
places. If quadratic interpolation is used, the same table takes only
one page having entries at one-degree intervals."

Polynomial interpolation

The number of data points minus one defines the order of
interpolation. Thus, linear (or two-point) interpolation is
first-order interpolation.

Properties of polynomials

Weierstrass theorem: if f(x) is a continuous function in the closed
interval a ≤ x ≤ b, then for every ε > 0 there exists a polynomial
Pn(x), where the value of n depends on the value of ε, such that for
all x in the closed interval a ≤ x ≤ b

|Pn(x) − f(x)| < ε

System of equations for the second order

- We need three points for second-order interpolation
- Freedom to choose points?
- This system can be solved analytically

Solutions for the second order and any arbitrary order (Lagrange
interpolation):

f(x) ≈ f1 λ1(x) + f2 λ2(x) + ... + fn λn(x)

λi(x) = ∏_{j=1, j≠i}^{n} (x − xj)/(xi − xj)

Lagrange interpolation: C++ example

double polint(double x, double xi[], double yi[], int isize, int npoints)
{
  double lambda[isize];
  double y;
  int j, is, il;
  // check order of interpolation
  if (npoints > isize) npoints = isize;
  // if x is outside the xi[] interval
  if (x <= xi[0]) return y = yi[0];
  if (x >= xi[isize-1]) return y = yi[isize-1];
  // loop to find j so that xi[j-1] < x < xi[j]
  j = 0;
  while (j <= isize-1)
  {
    if (xi[j] >= x) break;
    j = j + 1;
  }
  // shift j to correspond to (npoints-1)th interpolation
  j = j - npoints/2;
  // if j is outside of the range [0, ..., isize-1]
  if (j < 0) j = 0;
  if (j+npoints-1 > isize-1) j = isize-npoints;
  y = 0.0;
  for (is = j; is <= j+npoints-1; is = is+1)
  {
    lambda[is] = 1.0;
    for (il = j; il <= j+npoints-1; il = il+1)
    {
      if (il != is) lambda[is] = lambda[is]*(x-xi[il])/(xi[is]-xi[il]);
    }
    y = y + yi[is]*lambda[is];
  }
  return y;
}

[Figure: f(x) = cos(3x)/[0.4+(x−2)^2] with data points and
interpolations of orders 1, 3, 5 and 7]

Note on Lagrange interpolation

It is possible to use the Lagrange formula straightforwardly, as in
the example above. There is a better algorithm for constructing the
same interpolating polynomial: Neville's algorithm. Neville's
algorithm is sometimes confused with Aitken's algorithm (the latter is
now considered obsolete).

Important!

Moving from first order to third and 5th order improves the
interpolated values toward the original function. However, 7th-order
interpolation, instead of being closer to the function f(x), produces
wild oscillations. This situation is not uncommon for high-order
polynomial interpolation.

Rule of thumb: do not use high-order interpolation. Fifth order may be
considered a practical limit. If you believe that the accuracy of
5th-order interpolation is not sufficient for you, then you should
rather consider some other method of interpolation.
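Since Neville's algorithm is mentioned but not shown, here is a minimal sketch (the function name and the vector interface are my own choices, not from the slides):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Neville's algorithm: evaluate, at the point x, the unique polynomial
// passing through (xi[i], yi[i]), i = 0..n-1, by repeatedly combining
// neighboring lower-order estimates.  p[i] starts as the 0th-order
// estimate yi[i]; after pass m it holds the order-m estimate for the
// points i..i+m.
double neville(const std::vector<double>& xi,
               const std::vector<double>& yi, double x)
{
    std::vector<double> p = yi;
    int n = static_cast<int>(xi.size());
    for (int m = 1; m < n; ++m)
        for (int i = 0; i + m < n; ++i)
            p[i] = ((x - xi[i + m]) * p[i] + (xi[i] - x) * p[i + 1])
                 / (xi[i] - xi[i + m]);
    return p[0];
}
```

The result is the same polynomial as the Lagrange formula gives; the recurrence just organizes the arithmetic so each order reuses the previous one.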

Spline interpolation

- The idea of spline interpolation is reminiscent of very old
  mechanical devices used by draftsmen to get a smooth shape.
- It is like securing a strip of elastic material (a metal or plastic
  ruler) between knots (or nails).
- The final shape is quite smooth.
- Cubic spline interpolation is used in most plotting software (the
  cubic spline gives the smoothest result).

Cubic spline vs. polynomials

- One of the principal drawbacks of polynomial interpolation is the
  discontinuity of derivatives at data points xj.
- The procedure for deriving the coefficients of a spline
  interpolation uses information from all data points, i.e. nonlocal
  information, to guarantee global smoothness of the interpolated
  function up to some order of derivatives.

Coefficients for spline interpolation

The interpolated function on the interval [xj,xj+1] is presented in a
cubic spline form. For each interval we need a set of three parameters
bj, cj and dj. Since there are (n-1) intervals, one has to have 3n-3
equations for deriving the coefficients for j=1,n-1. The fact that
gj(xj)=fj(xj) imposes (n-1) equations.

What is different from polynomial interpolation? The way we are
looking for the coefficients!
- Polynomial interpolation: 3 coefficients for 4 data points
- Spline interpolation: 3 coefficients for each interval

Central idea of spline interpolation

The interpolated function g(x) has continuous first and second
derivatives at each of the n-2 interior points xj.

Now we have more equations

(n-1) + 2(n-2) = 3n-5 equations, but we need 3n-3 equations.

[Figure: data points y(x) vs. x]

A little math …

s_i(x) = a_i + b_i(x − x_i) + (c_i/2)(x − x_i)^2 + (d_i/6)(x − x_i)^3   (1)

x_{i−1} ≤ x ≤ x_i,  i = 1,2,...,N.

Need to find a_i, b_i, c_i, d_i.

s_i'(x)   = b_i + c_i(x − x_i) + (d_i/2)(x − x_i)^2
s_i''(x)  = c_i + d_i(x − x_i)
s_i'''(x) = d_i

By the definition (interpolation): a_i = f(x_i)

1) s(x) must be continuous at x_i: s_i(x_i) = s_{i+1}(x_i), i.e.

a_i = a_{i+1} + b_{i+1}(x_i − x_{i+1}) + (c_{i+1}/2)(x_i − x_{i+1})^2
      + (d_{i+1}/6)(x_i − x_{i+1})^3

Cubic spline boundary conditions

Possibilities to fix two more conditions:
- Natural spline: the second-order derivatives are zero on the
  boundaries.
- Input values for the first-order derivatives f(1)(x) at the
  boundaries.
- Input values for the second-order derivatives f(2)(x) at the
  boundaries.

The system of equations to find the coefficients

Using h_i = x_i − x_{i−1}:

1) continuity of s(x) gives

h_i b_i − (h_i^2/2) c_i + (h_i^3/6) d_i = f_i − f_{i−1},  i = 1,2,...,N   (2)

2) s_i'(x_i) = s_{i+1}'(x_i) gives

h_i c_i − (h_i^2/2) d_i = b_i − b_{i−1},  i = 2,3,...,N   (3)

3) s_i''(x_i) = s_{i+1}''(x_i) gives

h_i d_i = c_i − c_{i−1},  i = 2,3,...,N   (4)

Additional equations (conditions at the ends, natural spline):
s''(a) = s''(b) = 0, i.e. s_1''(x_0) = 0 and s_N''(x_N) = 0, which gives

c_1 − d_1 h_1 = 0,  c_N = 0

so that (4) extends to

h_i d_i = c_i − c_{i−1},  i = 1,2,...,N,  c_0 = c_N = 0   (5)

and we keep

h_i c_i − (h_i^2/2) d_i = b_i − b_{i−1},  i = 2,3,...,N   (6)
h_i b_i − (h_i^2/2) c_i + (h_i^3/6) d_i = f_i − f_{i−1},  i = 1,2,...,N   (7)

Solve the system for the c_i. Consider equation (7) for points i and i−1:

b_i = (h_i/2) c_i − (h_i^2/6) d_i + (f_i − f_{i−1})/h_i
b_{i−1} = (h_{i−1}/2) c_{i−1} − (h_{i−1}^2/6) d_{i−1} + (f_{i−1} − f_{i−2})/h_{i−1}

and subtract the second equation from the first:

b_i − b_{i−1} = (1/2)(h_i c_i − h_{i−1} c_{i−1})
              − (1/6)(h_i^2 d_i − h_{i−1}^2 d_{i−1})
              + (f_i − f_{i−1})/h_i − (f_{i−1} − f_{i−2})/h_{i−1}

Use this difference b_i − b_{i−1} in the right side of (6):

h_i c_i + h_{i−1} c_{i−1} − (2h_i^2/3) d_i − (h_{i−1}^2/3) d_{i−1}
  = 2[(f_i − f_{i−1})/h_i − (f_{i−1} − f_{i−2})/h_{i−1}]   (8)

Then from (5)

h_i^2 d_i = h_i (c_i − c_{i−1}),   h_{i−1}^2 d_{i−1} = h_{i−1}(c_{i−1} − c_{i−2})

Substitute into (8):

h_{i−1} c_{i−2} + 2(h_{i−1} + h_i) c_{i−1} + h_i c_i
  = 6[(f_i − f_{i−1})/h_i − (f_{i−1} − f_{i−2})/h_{i−1}]   (9)

for i = 2,3,...,N, with c_0 = c_N = 0.

This is a tridiagonal system of equations (use the Thomas method).
Then

d_i = (c_i − c_{i−1})/h_i,
b_i = (h_i/2) c_i − (h_i^2/6) d_i + (f_i − f_{i−1})/h_i,   i = 1,2,...,N.

Implementation

Recommendation: do not write your own spline program but get one from
advanced program libraries or books.
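As an illustration of the tridiagonal solve, here is a compact natural cubic spline sketch. It uses the common textbook formulation in terms of the second derivatives M_i = s''(x_i) rather than the slides' (b, c, d) coefficients, but the system has the same tridiagonal structure and is solved with the Thomas method; all names are my own:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Natural cubic spline via the second-derivative formulation:
// solve the tridiagonal system for M[i] = S''(x[i]) (with M[0]=M[N]=0)
// using the Thomas algorithm, then evaluate the piecewise cubic at xq.
double spline_eval(const std::vector<double>& x,
                   const std::vector<double>& y, double xq)
{
    int N = static_cast<int>(x.size()) - 1;   // N intervals
    std::vector<double> h(N), a(N+1), b(N+1), c(N+1), r(N+1), M(N+1, 0.0);
    for (int i = 0; i < N; ++i) h[i] = x[i+1] - x[i];

    // interior rows: h[i-1] M[i-1] + 2(h[i-1]+h[i]) M[i] + h[i] M[i+1] = rhs
    for (int i = 1; i < N; ++i) {
        a[i] = h[i-1];
        b[i] = 2.0 * (h[i-1] + h[i]);
        c[i] = h[i];
        r[i] = 6.0 * ((y[i+1]-y[i])/h[i] - (y[i]-y[i-1])/h[i-1]);
    }
    // Thomas algorithm: forward elimination, then back substitution
    for (int i = 2; i < N; ++i) {
        double m = a[i] / b[i-1];
        b[i] -= m * c[i-1];
        r[i] -= m * r[i-1];
    }
    for (int i = N-1; i >= 1; --i)
        M[i] = (r[i] - c[i] * M[i+1]) / b[i];

    // locate the interval containing xq (linear scan for brevity)
    int i = 0;
    while (i < N-1 && xq > x[i+1]) ++i;

    double A = (x[i+1] - xq) / h[i], B = (xq - x[i]) / h[i];
    return A*y[i] + B*y[i+1]
         + ((A*A*A - A)*M[i] + (B*B*B - B)*M[i+1]) * h[i]*h[i] / 6.0;
}
```

For straight-line data all right-hand sides vanish, so every M_i = 0 and the spline reduces exactly to linear interpolation, which is a handy sanity check.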

Example: spline interpolation

[Figure: f(x) = sin(x), data points, 1st-order and spline
interpolation]

Comments on spline interpolation

Generally, a spline does not have advantages over polynomial
interpolation when used for smooth, well-behaved data, or when the
data points are close on the x scale. The advantage of the spline
comes into the picture when dealing with "sparse" data, when there are
only a few points for smooth functions, or when the number of points
is close to the number of expected maximums.

Examples: spline and polynomial interpolation

Quality of interpolation: average difference from f(x)

           spline    1st      3rd      5th      7th
Example 1  0.0004    0.0171   0.0013   0.0012   0.0004
Example 2  0.0526    0.1410   0.0848   0.1434   0.2808

[Figures: f(x) = sin(x^2/1.5) and f(x) = exp(cos(x)) with data points,
spline, and 1st-, 3rd-, 5th- and 7th-order interpolations]

Divided differences

The first divided difference at point i:

f[x_i, x_{i+1}] = (f_{i+1} − f_i)/(x_{i+1} − x_i)

The second divided difference:

f[x_i, x_{i+1}, x_{i+2}] = (f[x_{i+1}, x_{i+2}] − f[x_i, x_{i+1}])/(x_{i+2} − x_i)

In general:

f[x_i, x_{i+1}, ..., x_n] = (f[x_{i+1}, ..., x_n] − f[x_i, ..., x_{n−1}])/(x_n − x_i)

Other notations: f_i^(0) = f_i, f_i^(1) = f[x_i, x_{i+1}],
f_i^(2) = f[x_i, x_{i+1}, x_{i+2}]

Divided difference polynomials

Let's define polynomials Pn(x) whose coefficients are divided
differences:

P_n(x) = f_i^(0) + (x − x_i) f_i^(1) + (x − x_i)(x − x_{i+1}) f_i^(2) + ...
         + (x − x_i)(x − x_{i+1})...(x − x_{i+n−1}) f_i^(n)

It is easy to show that Pn(x) passes exactly through the data points:

P_n(x_i) = f_i,  P_n(x_{i+1}) = f_{i+1},  ...
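The construction above can be sketched in C++ (the function name and the in-place coefficient update are my own choices): build the coefficients f[x0], f[x0,x1], ..., then evaluate the Newton-form polynomial with a Horner-like loop.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Compute the divided-difference coefficients c[i] = f[x0,...,xi]
// in place (c starts out holding the f values), then evaluate
//   P(x) = c0 + c1 (x-x0) + c2 (x-x0)(x-x1) + ...
// at the query point xq.
double newton_divdiff(const std::vector<double>& x,
                      std::vector<double> c,      // passed by value: f values
                      double xq)
{
    int n = static_cast<int>(x.size());
    for (int j = 1; j < n; ++j)                   // j-th divided differences
        for (int i = n - 1; i >= j; --i)
            c[i] = (c[i] - c[i-1]) / (x[i] - x[i-j]);
    double p = c[n-1];                            // Horner-like evaluation
    for (int i = n - 2; i >= 0; --i)
        p = p * (xq - x[i]) + c[i];
    return p;
}
```

Updating the array from the bottom up lets each pass overwrite only entries that are no longer needed, so no 2-D table is required for evaluation.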

Comments

Polynomials satisfy a uniqueness theorem: a polynomial of degree n
passing exactly through n+1 points is unique.

The data points in the divided difference polynomials do not have to
be in any specific order. However, more accurate results are obtained
for interpolation if the data are arranged in order of closeness to
the interpolated point.

Divided difference coefficients for four points

Example: Fortran

double precision d(n,n), x(n), f(n)
integer i,j
......
! initialization of d(n,1)
d = 0.0
do i=1,n
  d(i,1) = f(i)
end do
! calculations
do j=2,n
  do i=1,n-j+1
    d(i,j)=(d(i+1,j-1)-d(i,j-1))/(x(i+1+j-2)-x(i))
  end do
end do
! print results
do i=1,n
  write(*,200) (d(i,j),j=1,n-i+1)
end do
200 format (5f10.6)

Example data:

x: 3.20, 3.30, 3.35, 3.40, 3.50, 3.60, 3.65, 3.70
f: 0.312500, 0.303030, 0.298507, 0.294118,
   0.285714, 0.277778, 0.273973, 0.270270

Resulting table of divided differences:

0.312500 -0.094700  0.028265 -0.007311 -0.006774 ...
0.303030 -0.090460  0.026802 -0.009343  0.010684 ...
0.298507 -0.087780  0.024934 -0.006138 -0.001741 ...
0.294118 -0.084040  0.023399 -0.006660 -0.000053
0.285714 -0.079360  0.021734 -0.006676
0.277778 -0.076100  0.020399
0.273973 -0.074060
0.270270

Equally spaced xi

Fitting approximating polynomials to tabular data is much easier when
the values xi are equally spaced.

The forward difference relative to point i:
Δf_i = f_{i+1} − f_i

The backward difference relative to point i+1:
∇f_{i+1} = f_{i+1} − f_i

The central difference relative to point i+1/2:
δf_{i+1/2} = f_{i+1} − f_i

Example: table of differences for four points

Difference tables are useful for evaluating the quality of a set of
tabular data:
- smooth differences => "good" data
- non-monotonic differences => possible errors in the original data,
  or the step in x is too large, or maybe there is a singularity in f(x)

The Newton difference polynomials

The Newton forward difference polynomial:

P_n(x) = f_0 + s Δf_0 + [s(s−1)/2!] Δ^2 f_0 + [s(s−1)(s−2)/3!] Δ^3 f_0 + ...
         + [s(s−1)(s−2)...(s−[n−1])/n!] Δ^n f_0

The Newton backward difference polynomial:

P_n(x) = f_0 + s ∇f_0 + [s(s+1)/2!] ∇^2 f_0 + [s(s+1)(s+2)/3!] ∇^3 f_0 + ...
         + [s(s+1)(s+2)...(s+[n−1])/n!] ∇^n f_0

where s = (x − x_0)/h, x = x_0 + s h.

A major advantage of the Newton forward and backward difference
polynomials is that each higher-order polynomial is obtained from the
previous lower-degree polynomial simply by adding the new term.

Other difference polynomials:
- Stirling centered-difference polynomials
- Bessel centered-difference polynomials
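The forward difference polynomial can be sketched in C++, building each Δ^j f_0 in place and adding one term per order, which illustrates the "just add the next term" property (the function name and interface are my own, for illustration):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Newton forward difference polynomial for equally spaced data:
//   P(x) = f0 + s*Δf0 + s(s-1)/2! * Δ²f0 + ...,  s = (x - x0)/h.
// f holds the tabulated values at x0, x0+h, x0+2h, ...
double newton_forward(const std::vector<double>& f, double x0, double h,
                      double x)
{
    std::vector<double> d = f;            // d[0] holds Δ^j f0 as we go
    double s = (x - x0) / h;
    double p = d[0], term = 1.0;
    int n = static_cast<int>(d.size());
    for (int j = 1; j < n; ++j) {
        for (int i = 0; i < n - j; ++i)   // build the j-th forward differences
            d[i] = d[i+1] - d[i];
        term *= (s - (j - 1)) / j;        // accumulates s(s-1)...(s-j+1)/j!
        p += term * d[0];                 // higher order = previous + new term
    }
    return p;
}
```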

Example: compare Lagrange and divided difference interpolations

Quality of Lagrange interpolation: average difference from f(x)

order:  1st      3rd      5th      7th
        0.1410   0.0848   0.1434   0.2808

Quality of divided difference interpolation: average difference from f(x)

order:  1        2        3        4        5
        0.1410   0.1484   0.0848   0.1091   0.1433
order:  6        7        8        9
        0.2022   0.2808   0.3753   0.4526

Rational function interpolation

- A rational function g(x) is a ratio of two polynomials:

g(x) = (a_0 + a_1 x + a_2 x^2 + ... + a_n x^n) / (b_0 + b_1 x + b_2 x^2 + ... + b_m x^m)

- Often rational function interpolation is a more powerful method
  compared to polynomial interpolation.

The procedure has two principal steps

- First, we need to choose the powers of the numerator and the
  denominator, i.e. n and m. You may need to attempt a few trials
  before coming to a conclusion.
- Once we know the number of parameters, we need to find the
  coefficients.

When is using rational functions a good idea?

Rational functions may interpolate well functions with poles, that is,
with zeros of the denominator:

b_0 + b_1 x + b_2 x^2 + ... + b_m x^m = 0

Example for n=2 and m=1

g(x) = (a_0 + a_1 x + a_2 x^2) / (b_0 + b_1 x)

- We need five coefficients, i.e. 5 data points.
- We should fix one of the coefficients, since only the ratio makes
  sense.
- If we choose, for example, b_0 as a fixed number c, then we need 4
  data points to solve a system of equations to find the coefficients.

Finding the coefficients using 4 data points

f(x_1)(c + b_1 x_1) = a_0 + a_1 x_1 + a_2 x_1^2
f(x_2)(c + b_1 x_2) = a_0 + a_1 x_2 + a_2 x_2^2
f(x_3)(c + b_1 x_3) = a_0 + a_1 x_3 + a_2 x_3^2
f(x_4)(c + b_1 x_4) = a_0 + a_1 x_4 + a_2 x_4^2

- After a few rearrangements the system may be written in the
  traditional form for a system of linear equations.
- Very stable and robust programs for rational function interpolation
  can be found in many standard program libraries.
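A sketch of how the n=2, m=1 system might be assembled and solved; the function name, the fixed b0 = c convention, and the use of plain Gaussian elimination with partial pivoting are my own illustrative choices, not prescribed by the slides:

```cpp
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

// For g(x) = (a0 + a1 x + a2 x^2) / (c + b1 x), four data points
// (xk, fk) give the linear system
//   a0 + a1 xk + a2 xk^2 - fk xk b1 = c fk,  k = 0..3.
// Returns the solution {a0, a1, a2, b1}.
std::vector<double> rational_coeffs(const double x[4], const double f[4],
                                    double c)
{
    double A[4][5];                              // augmented 4x5 matrix
    for (int k = 0; k < 4; ++k) {
        A[k][0] = 1.0; A[k][1] = x[k]; A[k][2] = x[k]*x[k];
        A[k][3] = -f[k]*x[k]; A[k][4] = c*f[k];
    }
    for (int p = 0; p < 4; ++p) {                // elimination with pivoting
        int best = p;
        for (int r = p+1; r < 4; ++r)
            if (std::fabs(A[r][p]) > std::fabs(A[best][p])) best = r;
        for (int j = 0; j < 5; ++j) std::swap(A[p][j], A[best][j]);
        for (int r = p+1; r < 4; ++r) {
            double m = A[r][p] / A[p][p];
            for (int j = p; j < 5; ++j) A[r][j] -= m * A[p][j];
        }
    }
    std::vector<double> v(4);
    for (int p = 3; p >= 0; --p) {               // back substitution
        v[p] = A[p][4];
        for (int j = p+1; j < 4; ++j) v[p] -= A[p][j] * v[j];
        v[p] /= A[p][p];
    }
    return v;
}
```

Sampling a known rational function of this form and recovering its coefficients is a simple way to test such a routine.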

Interpolation in two or more dimensions

- bilinear interpolation
- bicubic spline
- …

Applications for interpolation

- Interpolation has many applications in physics, science, and
  engineering.
- Interpolation is a cornerstone of numerical methods (integration,
  differentiation, Ordinary Differential Equations, Partial
  Differential Equations).
- Two-dimensional interpolation methods are widely used in image
  processing, including digital cameras.
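The bilinear case can be sketched in a few lines (naming and argument order are my own): interpolate linearly in x along the two grid edges, then linearly in y between those results.

```cpp
#include <cassert>
#include <cmath>

// Bilinear interpolation on the rectangle [x1,x2] x [y1,y2] with
// corner values f11=f(x1,y1), f21=f(x2,y1), f12=f(x1,y2), f22=f(x2,y2).
double bilinear(double x1, double x2, double y1, double y2,
                double f11, double f21, double f12, double f22,
                double x, double y)
{
    double t = (x - x1) / (x2 - x1);   // fractional position in x
    double u = (y - y1) / (y2 - y1);   // fractional position in y
    double lower = f11 + t * (f21 - f11);   // linear in x along y = y1
    double upper = f12 + t * (f22 - f12);   // linear in x along y = y2
    return lower + u * (upper - lower);     // then linear in y
}
```

The result is independent of whether one interpolates in x first or in y first, which is a useful property to verify.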

Are we there yet?

What is extrapolation?

If you are interested in function values outside the range x1...xn,
then the problem is called extrapolation. Generally this procedure is
much less accurate than interpolation. You know how difficult it is to
extrapolate (foresee) the future, for example, for the stock market.

What is data fitting?

If the data values fi(xi) are a result of experimental observation
with some errors, then data fitting may be a better way to proceed.

In data fitting we let g(xi) differ from the measured values fi at the
points xi, having one function only to fit all data points; i.e. the
function g(x) fits the whole set of data. Data fitting may reproduce
the trend of the data well, even correcting some experimental errors.
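As a contrast with interpolation, here is a minimal least-squares sketch: a single straight line g(x) = a + b x fitted to all data points at once via the normal equations (the slides do not give this code; the names are illustrative):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Least-squares straight-line fit g(x) = a + b x to all data points.
// Unlike interpolation, g(xi) need not equal fi at the data points;
// one function with one set of coefficients covers the whole data set.
void linear_fit(const std::vector<double>& x, const std::vector<double>& y,
                double& a, double& b)
{
    int n = static_cast<int>(x.size());
    double sx = 0, sy = 0, sxx = 0, sxy = 0;
    for (int i = 0; i < n; ++i) {
        sx += x[i]; sy += y[i]; sxx += x[i]*x[i]; sxy += x[i]*y[i];
    }
    b = (n*sxy - sx*sy) / (n*sxx - sx*sx);   // slope from normal equations
    a = (sy - b*sx) / n;                     // intercept
}
```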

