
1 Numerical Differentiation: Finite-Differences.

So far we have computed derivatives by “symbolic differentiation”. For example,

$$f(x) = e^{x^2 + \sin(x)} \quad\Longrightarrow\quad f'(x) = e^{x^2 + \sin(x)}\,(2x + \cos(x)).$$

This option is fine when $f$ is known analytically and it is feasible for us to differentiate it. However, what if our function is too complicated to differentiate? What if it is given by samples (say, an image)? And what if the function is unknown (as in the field of differential equations)? For these reasons, we wish to be able to compute the derivative of $f$ at some $x_0$ numerically, given some other values of $f$. Let us recall the definition of the first derivative:

$$f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}.$$

In principle, we can use this definition and approximate the value of the first derivative using a finite $h > 0$:
$$f'(x) \approx D_h(f) = \frac{f(x+h) - f(x)}{h}.$$
This is called a (first order) forward difference approximation (we will soon see why it is called a first order approximation). This approximation has an error associated with it, which theoretically goes to $0$ as $h \to 0$ (we will see that in practice this is not the case). A similar approximation is given by

$$f'(x) \approx D_{-h}(f) = \frac{f(x) - f(x-h)}{h},$$
and is called a (first order) backward difference approximation.
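In Matlab, both approximations are one-liners. The following is a minimal sketch (the test function is the one from the introduction; the handle names Dh and Dmh are our own, for illustration):

% Forward and backward difference approximations of f'(x).
f   = @(x) exp(x.^2 + sin(x));                  % the example function from above
df  = @(x) exp(x.^2 + sin(x)).*(2*x + cos(x));  % its exact derivative, for comparison
Dh  = @(f,x,h) (f(x+h) - f(x))./h;              % forward difference
Dmh = @(f,x,h) (f(x) - f(x-h))./h;              % backward difference
x = 0.5; h = 1e-6;
[Dh(f,x,h) - df(x), Dmh(f,x,h) - df(x)]         % both errors are O(h)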

1.1 Construction and truncation errors in finite difference approximations

How did we construct this approximation, and is there a way to quantify the error? The answer lies in the Taylor expansion.

$$
\begin{aligned}
f(x) &= f(x_0) + f'(x_0)(x - x_0) + \tfrac{1}{2!} f''(x_0)(x - x_0)^2 + \dots \qquad (1)\\
&= f(x_0) + f'(x_0)(x - x_0) + \tfrac{1}{2!} f''(\xi)(x - x_0)^2, \quad \xi \in (x, x_0)\\
&= f(x_0) + f'(x_0)(x - x_0) + O((x - x_0)^2).
\end{aligned}
$$

In the last equality we assume that $f''(\xi)$ is bounded. The term $\tfrac{1}{2!} f''(\xi)(x - x_0)^2$ is called the truncation error of the approximation. Using the $h$ notation we can write

$$
\begin{aligned}
f(x_0 + h) &= f(x_0) + f'(x_0)h + \tfrac{1}{2!} f''(x_0)h^2 + \dots \qquad (2)\\
&= f(x_0) + f'(x_0)h + \tfrac{1}{2!} f''(\xi)h^2, \quad \xi \in (x_0, x_0 + h)\\
&= f(x_0) + f'(x_0)h + O(h^2).
\end{aligned}
$$

Now we will examine the forward-difference approximation using the Taylor series above:

$$
\begin{aligned}
\frac{f(x_0 + h) - f(x_0)}{h} &= \frac{f(x_0) + f'(x_0)h + \tfrac{1}{2!} f''(\xi)h^2 - f(x_0)}{h}, \quad \xi \in (x_0, x_0 + h) \qquad (3)\\
&= f'(x_0) + \tfrac{1}{2!} f''(\xi)h, \quad \xi \in (x_0, x_0 + h)\\
&= f'(x_0) + O(h).
\end{aligned}
$$

Similarly, for a backward-difference approximation we will get

$$
\begin{aligned}
\frac{f(x_0) - f(x_0 - h)}{h} &= \frac{f(x_0) - f(x_0) + f'(x_0)h - \tfrac{1}{2!} f''(\xi)h^2}{h}, \quad \xi \in (x_0 - h, x_0) \qquad (4)\\
&= f'(x_0) - \tfrac{1}{2!} f''(\xi)h, \quad \xi \in (x_0 - h, x_0)\\
&= f'(x_0) + O(h).
\end{aligned}
$$

Definition 1 (Asymptotic order of approximation). The asymptotic order of approximation of a given finite difference approximation is the power of $h$ in the truncation error.

In our examples above, the power of $h$ in the truncation error is 1, hence the approximations are of first order.
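The order can also be estimated numerically: for a method of order $p$, halving $h$ divides the error by about $2^p$, so $\log_2$ of the error ratio approaches $p$. A small sketch (the test function and point are our own choice):

% Empirical order of the forward difference: p ~ log2(e(h)/e(h/2)).
f = @(x) sin(x); df = @(x) cos(x); x = 1; h = 0.1;
e1 = abs((f(x+h)   - f(x))/h     - df(x));   % error with step h
e2 = abs((f(x+h/2) - f(x))/(h/2) - df(x));   % error with step h/2
p  = log2(e1/e2)                             % close to 1: first order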

1.1.1 A general description of finite-difference approximations

Generally, we would like to approximate the $k$-th derivative of a function $f$ using its values at some given points, say $f(x_0), f(x_0 + h), f(x_0 - h), f(x_0 + 2h), f(x_0 + 3h)$, etc., where $h > 0$. It is common to denote

$$x_i = x_0 + ih,$$
but note that $i$ does not have to be positive or even an integer:

$$x_1 = x_0 + h, \quad x_2 = x_0 + 2h, \quad x_{-1} = x_0 - h, \quad x_{1/2} = x_0 + \tfrac{1}{2}h.$$

The set of points $\{x_i\}$ is called the computational grid, and for convenience, we will define $h$ such that the indices $i$ are integers.

Example 1. Let $f(x) = x^3$. Approximate $f'(4)$ by forward and backward differences using $h = 0.1$. For these approximations we need $f(x_0), f(x_1), f(x_{-1})$:

$$f(x_{-1}) = f(3.9) = 59.319, \quad f(x_0) = f(4) = 64, \quad f(x_1) = f(4.1) = 68.921.$$

Note that the right answer is $f'(x) = 3x^2$, $f'(4) = 48$. Using the two approximations we get:

$$FD = \frac{68.921 - 64}{0.1} = 49.21, \qquad BD = \frac{64 - 59.319}{0.1} = 46.81.$$

Here, the FD approximation overestimated the true value and the BD approximation underestimated it. This can be explained using the truncation error. Both are of first order.
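These numbers are easy to reproduce in Matlab; the following is a direct transcription of the computation above:

% Reproducing Example 1.
f = @(x) x.^3; x0 = 4; h = 0.1;
FD = (f(x0+h) - f(x0))/h      % 49.21, overestimates f'(4) = 48
BD = (f(x0) - f(x0-h))/h      % 46.81, underestimates f'(4) = 48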

Central difference approximation for the first derivative It seems that the error in the example above is quite high, and each of the approximations above did not use all three points. Can we improve the approximation using the same points (and the same $h$)? It turns out that it is possible!

Let $f \in C^3$. We will look at the points $f(x_1), f(x_0), f(x_{-1})$, and try to find an approximation
$$f'(x_0) \approx a f(x_1) + b f(x_0) + c f(x_{-1}).$$

We will recall that

$$
\begin{aligned}
(a\cdot)\quad & f(x_1) = f(x_0) + f'(x_0)h + \tfrac{1}{2} f''(x_0)h^2 + \tfrac{1}{3!} f'''(\xi_1)h^3, \quad \xi_1 \in (x_0, x_0 + h)\\
(b\cdot)\quad & f(x_0) = f(x_0)\\
(c\cdot)\quad & f(x_{-1}) = f(x_0) - f'(x_0)h + \tfrac{1}{2} f''(x_0)h^2 - \tfrac{1}{3!} f'''(\xi_2)h^3, \quad \xi_2 \in (x_0 - h, x_0)
\end{aligned}
$$

and we will now fit the coefficients to match the approximation as well as we can, cancelling the leading error terms:

$$
\begin{aligned}
(f(x_0)) \quad & 0 = a + b + c\\
(f'(x_0)) \quad & 1 = a \cdot h + c \cdot (-h)\\
(f''(x_0)) \quad & 0 = (a + c)\,\tfrac{1}{2}h^2
\end{aligned}
$$

We get that $a = \frac{1}{2h}$, $b = 0$, $c = -\frac{1}{2h}$, and the central difference approximation is (not surprisingly)
$$f'(x_0) \approx \frac{f(x_0 + h) - f(x_0 - h)}{2h}.$$
The error is given by the sum of the truncation errors:

$$a\left(\tfrac{1}{3!} f'''(\xi_1)h^3\right) - c\left(\tfrac{1}{3!} f'''(\xi_2)h^3\right) = \tfrac{1}{3!} f'''(\xi)h^2, \quad \xi \in (x_0 - h, x_0 + h),$$
which is $O(h^2)$.

The equality is not straightforward; it holds by a version of the intermediate value theorem. The central difference approximation is then of second order:

$$f'(x_0) = \frac{f(x_0 + h) - f(x_0 - h)}{2h} + O(h^2).$$

Example 2 (Example 1 continued). Let $f(x) = x^3$. Approximate $f'(4)$ by a central difference using $h = 0.1$. For the approximation we need $f(x_0), f(x_1), f(x_{-1})$:

$$f(x_{-1}) = f(3.9) = 59.319, \quad f(x_0) = f(4) = 64, \quad f(x_1) = f(4.1) = 68.921.$$

Note that the right answer is $f'(x) = 3x^2$, $f'(4) = 48$. Using the approximation we get:

$$CD = \frac{68.921 - 59.319}{0.2} = 48.0100.$$

Here, the approximation is of second order, and is much more accurate than the previous ones.

In a similar way, if $f \in C^5$ we can also get a 4th order approximation for the first derivative:

$$f'(x_0) = \frac{-f(x_0 + 2h) + 8f(x_0 + h) - 8f(x_0 - h) + f(x_0 - 2h)}{12h} + O(h^4).$$
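A short sketch comparing the second order central difference with this fourth order formula (the test function $\exp(x)$ and the step sizes are our own choice):

% Second vs. fourth order approximations of f'(x0) for f = exp.
f = @(x) exp(x); x0 = 1; df_exact = exp(1);
for h = [1e-1 1e-2 1e-3]
    cd2 = (f(x0+h) - f(x0-h)) / (2*h);
    cd4 = (-f(x0+2*h) + 8*f(x0+h) - 8*f(x0-h) + f(x0-2*h)) / (12*h);
    fprintf('h=%.0e  err2=%.2e  err4=%.2e\n', h, abs(cd2-df_exact), abs(cd4-df_exact));
end
% err2 shrinks by ~10^2 per factor-10 decrease in h; err4 by ~10^4.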

Central difference approximation for the second derivative Similarly to the previous approximation, let $f \in C^4$. We will look at the points $f(x_1), f(x_0), f(x_{-1})$, and try to find an approximation for the second derivative

$$f''(x_0) \approx a f(x_1) + b f(x_0) + c f(x_{-1}).$$

Again, we will use

$$
\begin{aligned}
(a\cdot)\quad & f(x_1) = f(x_0) + f'(x_0)h + \tfrac{1}{2} f''(x_0)h^2 + \tfrac{1}{3!} f'''(x_0)h^3 + \tfrac{1}{4!} f^{(4)}(\xi_1)h^4, \quad \xi_1 \in (x_0, x_0 + h)\\
(b\cdot)\quad & f(x_0) = f(x_0)\\
(c\cdot)\quad & f(x_{-1}) = f(x_0) - f'(x_0)h + \tfrac{1}{2} f''(x_0)h^2 - \tfrac{1}{3!} f'''(x_0)h^3 + \tfrac{1}{4!} f^{(4)}(\xi_2)h^4, \quad \xi_2 \in (x_0 - h, x_0)
\end{aligned}
$$

and we will now fit the coefficients to match the approximation as well as we can, cancelling the leading error terms:

$$
\begin{aligned}
(f(x_0)) \quad & 0 = a + b + c\\
(f'(x_0)) \quad & 0 = a \cdot h + c \cdot (-h)\\
(f''(x_0)) \quad & 1 = (a + c)\,\tfrac{1}{2}h^2
\end{aligned}
$$

We get $a = c = \frac{1}{h^2}$ and $b = -\frac{2}{h^2}$, or

$$f''(x_0) \approx \frac{f(x_0 - h) - 2f(x_0) + f(x_0 + h)}{h^2}.$$

Note that because $a - c = 0$, the $O(h^3)$ terms are also cancelled, just like the $O(h)$ terms. This is typical for symmetric approximations. The truncation error that we are left with is given by:

$$a\,\tfrac{1}{4!} f^{(4)}(\xi_1)h^4 + c\,\tfrac{1}{4!} f^{(4)}(\xi_2)h^4 = \tfrac{2}{4!} f^{(4)}(\xi)h^2, \quad \xi \in (x_0 - h, x_0 + h),$$
which is $O(h^2)$.

Overall, the central difference approximation for the second derivative is also of second order:

$$f''(x_0) = \frac{f(x_0 - h) - 2f(x_0) + f(x_0 + h)}{h^2} + O(h^2).$$
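Since the remaining truncation error involves $f^{(4)}$, the stencil is exact (up to rounding) for polynomials of degree at most 3. A quick check with the cubic from Example 1:

% The central second difference is exact for cubics, since f^(4) = 0.
f = @(x) x.^3; x0 = 4; h = 0.1;
d2 = (f(x0-h) - 2*f(x0) + f(x0+h)) / h^2   % 24, exactly f''(4) = 6*4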

A second order forward approximation for the first derivative Let us assume that we are given $f(x_0), f(x_1), f(x_2)$. We will try to find an approximation

$$f'(x_0) \approx a f(x_0) + b f(x_1) + c f(x_2).$$

We will recall that

$$
\begin{aligned}
(a\cdot)\quad & f(x_0) = f(x_0)\\
(b\cdot)\quad & f(x_1) = f(x_0) + f'(x_0)h + \tfrac{1}{2} f''(x_0)h^2 + \tfrac{1}{3!} f'''(\xi_1)h^3, \quad \xi_1 \in (x_0, x_0 + h)\\
(c\cdot)\quad & f(x_2) = f(x_0) + 2f'(x_0)h + \tfrac{1}{2} f''(x_0)(2h)^2 + \tfrac{1}{3!} f'''(\xi_2)(2h)^3, \quad \xi_2 \in (x_0, x_0 + 2h)
\end{aligned}
$$

and we will now fit the coefficients to match the approximation as well as we can, cancelling the leading error terms:

$$
\begin{aligned}
(f(x_0)) \quad & 0 = a + b + c\\
(f'(x_0)) \quad & 1 = b \cdot h + c \cdot (2h)\\
(f''(x_0)) \quad & 0 = (b + 4c)\,\tfrac{1}{2}h^2
\end{aligned}
$$

We get that $a = \frac{-3}{2h}$, $b = \frac{4}{2h}$, $c = \frac{-1}{2h}$. The error is given by the sum of the truncation errors:
$$b\left(\tfrac{1}{3!} f'''(\xi_1)h^3\right) + c\left(\tfrac{1}{3!} f'''(\xi_2)(2h)^3\right) = O(h^2).$$
This one-sided approximation is then of second order:

$$f'(x_0) = \frac{4f(x_0 + h) - 3f(x_0) - f(x_0 + 2h)}{2h} + O(h^2).$$
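The same recipe extends to any stencil: write one matching equation per power of $h$ and solve the resulting linear system. The following helper is our own illustrative sketch (not from the text); row $j+1$ of the matrix holds the Taylor coefficients $s_i^j / j!$, and the right-hand side selects the $k$-th derivative:

% Finite difference weights for f^(k)(x0) on the nodes x0 + s(i)*h.
function w = fdWeights(s, k, h)
n = numel(s);
A = zeros(n);
for j = 0:n-1
    A(j+1,:) = s(:)'.^j / factorial(j);   % coefficient of f^(j)(x0)*h^j
end
rhs = zeros(n,1); rhs(k+1) = 1;           % keep only the f^(k)(x0) term
w = (A \ rhs) / h^k;
return;

For example, fdWeights([0 1 2], 1, h) returns $[-3, 4, -1]/(2h)$, the weights derived above, and fdWeights([-1 0 1], 2, h) returns $[1, -2, 1]/h^2$.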

1.2 Numerical errors and optimal h

So far we have seen approximations of the form

$$f^{(k)}(x_0) = \sum_i a_i f(x_i) + O(h^p),$$
where $p$ is the order of approximation. We will always have the condition

$$\sum_i a_i = 0,$$
coming from the equation for the $f(x_0)$ terms. This means that we get a loss of significance when approximating derivatives. Assume that there is an error bounded by $\Delta$ in the computed (or sampled) values $\tilde f(x_i)$; then the numerical error in the discrete approximation is bounded by
$$R_N = \left|\sum_i a_i \tilde f(x_i) - \sum_i a_i f(x_i)\right| \le \sum_i |a_i| \Delta.$$
Each of the coefficients $a_i$ is $O(\frac{1}{h^k})$; this is how the coefficients for approximating the $k$-th derivative come out. Hence, $R_N$ decreases as $h$ increases. On the other hand, the truncation error $R_T = O(h^p)$ decreases as $h$ decreases. This means that there is an optimal value of $h$ for the total error, which is given by the sum of both error types:

$$R_{\text{total}} = R_N + R_T = O\!\left(\frac{1}{h^k}\right) + O(h^p).$$
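For instance, a back-of-the-envelope sketch for the second derivative central difference ($k = p = 2$), assuming $|f^{(4)}| \le M_4$ near $x_0$ and the data error bound $\Delta$ from above: here $\sum_i |a_i| = 4/h^2$, so
$$R_{\text{total}}(h) \approx \frac{4\Delta}{h^2} + \frac{M_4}{12} h^2, \qquad \frac{dR_{\text{total}}}{dh} = 0 \;\Longrightarrow\; h^* = \left(\frac{48\Delta}{M_4}\right)^{1/4}.$$
With $\Delta$ around the double precision roundoff level ($\approx 10^{-16}$) and $M_4 = O(1)$, this gives $h^*$ on the order of $10^{-4}$ to $10^{-3}$, far larger than the smallest representable steps.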

The next code and Fig. 1 demonstrate that we should not choose $h$ to be as small as possible (in fact it should be chosen fairly large). As we increase the order of approximation (increase $p$ while $k$ stays fixed), we get a smaller total error with the same $h$, and in fact the optimal $h$ gets even larger (to reduce $R_N$). We will not deal with this issue further.

Figure 1: The error in practice for approximating the second derivative of exp(x) using central difference, as a function of h.

% Matlab example for the error with respect to h
function optimal_h_demo()
f = @(x) exp(x);
x = 1;
f_tagayim = exp(x);                 % exact value: (exp(x))'' = exp(x)
h_grid = logspace(-10,1,50);        % step sizes spanning many orders of magnitude
f_tagayim_approx = discreteSecondDerivative(f,x,h_grid);
loglog(h_grid, abs(f_tagayim_approx - f_tagayim));
title('Total error of discrete approximation - optimal h demo')
xlabel('h'); ylabel('R_{total}')
return;

function f_tagayim_approx = discreteSecondDerivative(f,x,h)
% Second order central difference approximation of f''(x), vectorized in h.
f_tagayim_approx = (f(x+h) - 2*f(x) + f(x-h))./h.^2;
return;