Data types in science
Discrete data (data tables) Experiment Interpolation Observations Calculations Continuous data Analytics functions Analytic solutions
1 2
From continuous to discrete … From discrete to continuous? ??
soon after hurricane Isabel (September 2003)
3 4
If you think that the data values f in the data table What do we want from discrete sets i are free from errors, of data? then interpolation lets you find an approximate 5 value for the function f(x) at any point x within the Data points interval x1...xn. 4 Quite often we need to know function values at 3 any arbitrary point x y(x) 2 Can we generate values 1 for any x between x1 and xn from a data table? 0 0123456 x 5 6
1 Key point!!! Applications of approximating functions The idea of interpolation is to select a function g(x) such that
Data points 1. g(xi)=fi for each data 4 point i
2. this function is a good 3 approximation for any interpolation differentiation integration
other x between y(x) 2 original data points
1
0123456
x 7 8
What is a good approximation? Important to remember
What can we consider as Interpolation ≡ Approximation a good approximation to There is no exact and unique solution the original data if we do The actual function is NOT known and CANNOT not know the original be determined from the tabular data. function? Data points may be interpolated by an infinite number of functions
9 10
Step 1: selecting a function g(x) Two step procedure We should have a guideline to select a Select a function g(x) reasonable function g(x). We need some ideas about data!!! Find coefficients g(x) may have some standard form or be specific for the problem
11 12
2 Linear combination is the most common Some ideas for selecting g(x) form of g(x) & linear combination of elementary functions, or Most interpolation methods trigonometric, or exponential functions, or are grounded on rational functions, … ‘smoothness’of g(x) a h (x) a h (x) a h (x) ... interpolated functions. = 1 1 + 2 2 + 3 3 + However, it does not work all the time Three of most common approximating functions & Polynomials Practical approach, i.e. & Trigonometric functions what physics lies beneath & Exponential functions the data 13 14
Approximating functions should have following properties Linear Interpolation: Idea & It should be easy to determine & It should be easy to evaluate & It should be easy to differentiate & It should be easy to integrate
15 16
Example: C++ Linear Interpolation: coefficients double int1(double x, double xi[], double yi[], int imax) { double y; int j; // if x is ouside the xi[] interval if (x <= xi[0]) return y = yi[0]; if (x >= xi[imax-1]) return y = yi[imax-1]; // loop to find j so that x[j-1] < x < x[j] j = 0; while (j <= imax-1) { if (xi[j] >= x) break; j = j + 1; } y = yi[j-1]+(yi[j]-yi[j-1])*(x-xi[j-1])/(xi[j]-xi[j-1]); return y; }
17 18 bisection approach is much more efficient to search an array
3 Linear interpolation: conclusions
1.0 The linear interpolation may work well for very smooth functions when the second and higher derivatives are
0.5 small. It is worthwhile to note that for the each data interval ) 2 0.0 one has a different set of coefficients a0 and a1.
sin(x This is the principal difference from data fitting where -0.5 the same function, with the same coefficients, is used f(x)=sin(x2) Data points to fit the data points on the whole interval [x1,xn]. linear interpolation -1.0 We may improve quality of linear interpolation by increasing number of data points x on the interval. 0123 i x HOWEVER!!! It is much better to use higher-odrder 19 interpolations. 20
Linear vs. Quadratic interpolations Polynomial Interpolation
example from F.S.Acton “Numerical methods that work” “A table of sin(x) covering the first quadrant, for example, requires 541 pages if it is to be linearly interpolable to eight decimal places. If quadratic interpolation is used, the same table takes only one page having entries at one-degree intervals.”
The number of data points minus one defines the order of interpolation. Thus, linear (or two-point interpolation) is the first order interpolation 21 22
Properties of polynomials System of equations for the second order
Weierstrass theorem: If f(x) is a continuous function in the closed interval a ≤ x ≤ b then for every ε > 0 there exists a
polynomial Pn(x), where the value on n depends on the value of ε , such that for all x in the closed interval a ≤ x ≤ b Ö We need three points for the second order interpolation Pn (x) − f (x) < ε Ö Freedom to choose points? Ö This system can be solved analytically
23 24
4 Lagrange Interpolation: C++ Solutions for the second order double polint(double x, double xi[], double yi[], int isize, int npoints) { double lambda[isize]; double y; int j, is, il; // check order of interpolation if (npoints > isize) npoints = isize; and any arbitrary order (Lagrange interpolation) // if x is ouside the xi[] interval if (x <= xi[0]) return y = yi[0]; if (x >= xi[isize-1]) return y = yi[isize-1]; f (x) ≈ f1λ1(x) + f2λ2 (x) + fnλn (x) K // loop to find j so that x[j-1] < x < x[j] n x − x j = 0; (x) j λi = ∏ while (j <= isize-1) j(≠i)=1 xi − x j { if (xi[j] >= x) break; j = j + 1; 25 } 26
Example: C++ (cont.) 2.5 // shift j to correspond to (npoint-1)th interpolation f(x) j = j - npoints/2; 2.0 data points 1-st order // if j is ouside of the range [0, ... isize-1] 1.5 3-rd order if (j < 0) j=0; 5-th order ] if (j+npoints-1 > isize-1 ) j=isize-npoints; 2 1.0 7-th order y = 0.0; for (is = j; is <= j+npoints-1; is = is+1) 0.5
{ 0.0 lambda[is] = 1.0; for (il = j; il <= j+ npoints-1; il = il + 1) -0.5
{ cos(3x)/[0.4+(x-2) -1.0 if(il != is) lambda[is] = lambda[is]* (x-xi[il])/(xi[is]-xi[il]); -1.5 } y = y + yi[is]*lambda[is]; -2.0 } 0123 return y; x } 27 28
note on Lagrange interpolation Important!
It is possible to use Lagrange formula Moving from the first -order to the third and 5th order straightforwardly, like in the example above improves interpolated values to the original function. However, the 7th order interpolation instead being There is a better algorithm - Neville’s algorithm closer to the function f(x) produces wild oscillations. (for constructing the same interpolating This situation is not uncommon for high-order polynomial) polynomial interpolation. Sometimes Neville’s algorithm confused with Rule of thumb: do not use high order interpolation. Aitken’s algorithm Fifth order may be considered as a practical limit. (the latter now considered obsolete). If you believe that the accuracy of the 5th order interpolation is not sufficient for you, then you should rather consider some other method of interpolation. 29 30
5 Cubic Spline Spline vs. Polynomials Ö The idea of spline interpolation is reminiscent of very old mechanical devices used by draftsmen to Ö One of the principal drawbacks of the polynomial get a smooth shape. interpolation is related to discontinuity of derivatives at Ö It is like securing a strip of elastic material (metal data points xj. or plastic ruler) between knots (or nails). Ö The procedure for deriving coefficients of spline Ö The final shape is quite smooth. interpolations uses information from all data points, Ö Cubic spline interpolation is used in most plotting i.e. nonlocal information to guarantee global smoothness software. (cubic spline gives most smooth result) in the interpolated function up to some order of derivatives.
31 32
Equations Coefficients for spline interpolation
the interpolated function on [xj,xj+1] interval is presented in a cubic spline form For the each interval we need to have a set of three
parameters bj, cj and dj. Since there are (n-1) intervals, one has to have 3n-3 equations for deriving the coefficients for j=1,n-1. What is different from polynomial interpolation? The fact that gj(xj)=fj(xj) imposes (n-1) equations. … the way we are looking for the coefficients! Polynomial interpolation: 3 coefficient for 4 data points Spline interpolation: 3 coefficients for the each interval
33 34
Central idea to spline interpolation Now we have more equations (n-1) + 2(n-2) = 3n-5 equations the interpolated function g(x) has continuous the first Data points and second derivatives at each of the n-2 interior 4
points xj. 3 Data points 4
y(x) 2
3 1
y(x) 2 0123456 but we need 3n-3 equations x
1
0123456 x 35 36
6 a little math … Cubic spline boundary conditions c d s (x) = a + b (x − x ) + i (x − x )2 + i (x − x )3 (1) i i i i 2 i 6 i Possibilities to fix two more conditions xi−1 ≤ x ≤ xi i = 1,2,KN.
Need to find ai , bi , ci , di . • Natural spline - the second order derivatives are d s '(x) = b + c (x − x ) + i (x − x )2 zero on boundaries. i i i i 2 i • Input values for the first order f(1)(x) derivates at si ''(x) = ci + di (x − xi ) boundaries si '''(x) = di • Input values for the second order f(2)(x) derivates at by the definition (interpolation) a = f (x ) boundaries i i 1) s(x) must be continuous at xi si (xi ) = si+1(xi ) c d a = a + b (x − x ) + i+1 (x − x )2 + i+1 (x − x )3 i i+1 i+1 i i+1 2 i i+1 6 i i+1
37 38
The system of equations to find coefficients
hidi = ci − ci−1 i = 1,2,KN c0 = cN = 0 (5) using hi = xi − xi−1 2 2 3 hi hi hi hici − di = bi − bi−1 i = 2,3,KN (6) hibi − ci + di = fi − fi−1 (2) 2 2 6 2 3 hi hi 2) si '(xi ) = si+1'(xi ) i = 1,2,KN −1 h b − c + d = f − f i = 1,2, N (7) i i 2 i 6 i i i−1 K d c h − i h2 = b − b i = 2,3, N (3) Solve the system for c i = 1,2, N −1 i i 2 i i i−1 K i K Consider two equations (7) for points i and i −1 3) si ''(xi ) = si+1''(xi ) 2 hi hi fi − fi−1 dihi = ci − ci−1 i = 2,3,KN (4) b = c + d = i 2 i 6 i h additional equations (conditions at the ends) i h h2 f f s''(a) = s''(b) = 0 i−1 i−1 i−1 − i−2 bi−1 = ci−1 + di−1 = 2 6 hi−1 s1''(x0 ) = 0, sN ''(xN ) = 0 c1 − d1h1 = 0, cN = 0 and sustract the second equation from the first 1 1 f − f f − f b b (h c h c ) (h2d h2 d ) i i−1 i−1 i−2 39 i − i−1 = i i − i−1 i−1 − i i − i−1 i−1 + − 40 2 6 hi hi−1
Use the difference bi − bi−1 in the right side of (6) h2 2h2 ⎛ f − f f − f ⎞ h c + h c − i−1 d − i d = 2⎜ i i−1 − i−1 i−2 ⎟ (8) i i i−1 i−1 i−1 i ⎜ ⎟ Implementation 2 3 ⎝ hi hi−1 ⎠ Then from (5) 2 Recommendation – do not write your own hi di = hi (ci − ci−1), 2 spline program but get one from …. hi−1di−1 = hi−1(ci−1 − ci−2 ) Substitute in (8) Advanced program libraries! ⎛ f − f f − f ⎞ h c 2(h h )c h c 6⎜ i+1 i i i−1 ⎟ (9) Books i−1 i−2 + i−1 + i i−1 + i i = ⎜ − ⎟ ⎝ hi+1 hi ⎠ for i = 1,2,KN −1, c0 = cN = 0 This is a tridiagonal system of equations (use Thomas method) 2 ci − ci−1 hi hi fi − fi−1 then di = , bi = ci − di + for i = 1,2,KN, hi 2 6 hi 41 42
7 Example: Spline interpolation Comments to Spline interpolation
1.0 Generally, spline does not have advantages over polynomial f(x) data points interpolation when used for smooth, well behaved data, or 1-st order 0.5 spline when data points are close on x scale. The advantage of spline comes in to the picture when
0.0 dealing with ''sparse'' data, when
sin(x) there are only a few points for smooth functions -0.5 or when the number of points is close to the number of expected maximums. -1.0
0123456 43 44 x
Example: Example: spline and Quality of interpolation: average difference from f(x) spline and Quality of interpolation: average difference from f(x) polynomial spline 1st 3rd 5th 7th polynomial spline 1st 3rd 5th 7th 0.0004 0.0171 0.0013 0.0012 0.0004 0.0526 0.1410 0.0848 0.1434 0.2808 interpolation interpolation 3 4 data points data points f(x) f(x) spline spline 1rd order 1rd order 3th order 2 3th order 2 5th order 5th order 7th order
7th order /1.5) 2
1 0 f(x)=sin(x f(x)=exp(cos(x))
0 -2 012345 012345 45 46 x x
Divided Differences Divided Difference Polynomials the first divided difference at point i Let’s define a polynomials Pn(x) where the fi 1 − fi coefficients are divided differences f [x , x ] = + i i+1 x − x i+1 i P (x) = f (0) + (x − x ) f (1) + (x − x )(x − x ) f (2) + the second divided difference n i i i i i+1 i K (x x )(x x ) (x x ) f (n) f [x , x ] − f [x , x ] + − i − i+1 K − i+n−1 i f [x , x , x ] = i+1 i+2 i i+1 i i+1 i+2 x x i+2 − i It is easy to show that Pn(x) passes exactly in general through the data points f [x , x , x ] − f [x , x , x ] i+1 i+2 K n i i+1 K n+i−1 P (x ) f , P (x ) f f [xi , xi+1,Kxn ] = n i = i n i+1 = i+1 K xn − xi other notations (0) (1) (2) fi = fi , fi = f [xi , xi+1], fi = f [xi , xi+1, xi+2 ] 47 48
8 Comments Divided Difference coefficients for four points Polynomials satisfy a uniqueness theorem – a polynomial of degree n passing exactly through n+1 points is unique.
The data points in the divided difference polynomials do not have to be in any specific order. However, more accurate results are obtained for interpolation if the data are arranged in order of closeness to the interpolated point
49 50
Example: Fortran Example: double precision d(n,n), x(n), f(n) x: 3.20,3.30,3.35,3.40,3.50,3.60,3.65,3.70 integer i,j f: 0.312500,0.303030,0.298507,0.294118, ...... 0.285714,0.277778,0.273973,0.270270 d = 0.0 ! initialization of d(n,1) do i=1,n 0.312500 -0.094700 0.028265 -0.007311 -0.006774 ... d(i,1) = f(i) 0.303030 -0.090460 0.026802 -0.009343 0.010684 ... end do 0.298507 -0.087780 0.024934 -0.006138 -0.001741 ... ! calculations 0.294118 -0.084040 0.023399 -0.006660 -0.000053 do j=2,n 0.285714 -0.079360 0.021734 -0.006676 do i=1,n-j+1 0.277778 -0.076100 0.020399 d(i,j)=(d(i+1,j-1)-d(i,j-1))/(x(i+1+j-2)-x(i)) 0.273973 -0.074060 end do 0.270270 end do ! print results do i=1,n write(*,200) (d(i,j),j=1,n-i+1) end do 51 52 200 format (5f10.6)
Equally spaced xi example: table of differences Fitting approximating polynomials to tabular for four points
data is much easier when the values xi are equally spaced the forward difference relative to point i
fi+1 − fi = Δfi the backward difference relative to point i +1
fi+1 − fi = ∇fi+1 the central difference relative to point i +1/ 2
fi+1 − fi = δfi+1/ 2
53 54
9 Difference tables are useful for evaluating the quality of a set of tabular data smooth difference => “good” data not monotonic differences => possible errors in the original data, or the step in x is too large, or may be a singularity in f(x)
55 56
The Newton Difference Polynomials the Newton forward difference polynomials A major advantage of the Newton forward and s(s −1) s(s −1)(s − 2) backward difference polynomials is that each higher P (x) = f + sΔf + Δ2 f + Δ3 f + n 0 0 2! 0 3! 0 order polynomial is obtained from the previous lower-
s(s −1)(s − 2) (s − [n −1]) n degree polynomials simply by adding the new term + + K Δ f K n! 0 the Newton backward difference polynomials Other difference polynomials: s(s +1) s(s +1)(s + 2) P (x) = f + s∇f + ∇2 f + ∇3 f + n 0 0 2! 0 3! 0 # Strirling centered-difference polynomials s(s +1)(s + 2) (s + [n −1]) # Bessel centered-difference polynomials + + K ∇n f K n! 0
where s = (x − x0 ) / h, x = x0 + sh 57 58
Example: compare Lagrange and divided Rational Function Interpolation differences interpolations A rational function g(x) is a ratio of two polynomials Quality of Lagrange interpolation: average difference from f(x) Often rational function interpolation is a more powerful 1st 3rd 5th 7th method compared to polynomial interpolation. 0.1410 0.0848 0.1434 0.2808
Quality of interpolation: average difference from f(x) a + a x + a x2 + a xn Orders of divided difference interpolation g(x) = 0 1 2 K n 1 2 3 4 5 2 m b0 + b1x + b2 x + bm x 0.1410 0.1484 0.0848 0.1091 0.1433 K 6 7 8 9 0.2022 0.2808 0.3753 0.4526
59 60
10 The procedure has two principal steps When using rational functions is a good idea?
On the first step we need to choose powers for the Rational functions may well interpolate functions with numerator and the denominator, i.e. n and m. You poles may need to attempt a few trials before coming to a 2 n a0 + a1x + a2 x +Kan x conclusion. g(x) = 2 m b0 + b1x + b2 x +Kbm x Once we know the number of parameters we need to find the coefficients that is with zeros of the denominator
2 m b0 + b1x + b2 x +Kbm x = 0
61 62
Example for n=2 and m=1 Finding coefficients using 4 data points
2 2 a + a x + a x f(x1)(c+ b1x1)= a0 +a1x1 +a2x1 g(x) = 0 1 2 2 b0 + b1x f(x2)(c+ b1x2)= a0 +a1x2 +a2x2
2 f(x3)(c+ b1x3)= a0 +a1x3 +a2x We need five coefficients i.e. 5 data points 3 2 We should fix one of the coefficients since only f(x4)(c+ b1x4)= a0 +a1x4 +a2x4 the ratio makes sense. After a few rearrangements the system may be
If we choose, for example, b0 as a fixed number c written in the traditional form for a system of linear then we need 4 data points to solve a system of equations. equations to find the coefficients Very stable and robust programs for rational function interpolation can be found in many standard program libraries. 63 64
Interpolation in two or more dimensions Applications for Interpolation
• bilinear interpolation Interpolation has many applications both in physics, • bicubic interpolation science,and engineering. • bicubic spline Interpolation is a corner's stone in numerical integration (integrations, differentiation, Ordinary Differential •… Equations, Partial Differential Equations). Two-dimensional interpolation methods are widely used in image processing, including digital cameras.
65 66
11 Are we there yet? What is extrapolation?
If you are interested in function values outside the
range x1...xn then the problem is called extrapolation. Generally this procedure is much less accurate than interpolation. You know how it is difficult to extrapolate (foresee) the future, for example, for the stock market.
67 68
What is data fitting?
If data values fi(xi) are a result of experimental observation with some errors, then data fitting may be a better way to proceed.
In data fitting we let g(xi) to differ from measured values
fi at xi points having one function only to fit all data points; i.e. the function g(x) fits all the set of data. Data fitting may reproduce well the trend of the data, even correcting some experimental errors.
69
12