<<

Mathematical methods in Theoretical Physics

Sommersemester 2014 Technische Universit¨at Berlin

Dr. Gernot Schaller

July 14, 2014 2 Contents

1 Integration of Functions 7 1.1 Heuristic Introduction into ...... 7 1.1.1 Differentiation over C ...... 7 1.1.2 Integration in the C ...... 10 1.1.3 Cauchyintegraltheorem ...... 11 1.1.4 Cauchy’sintegralformula ...... 12 1.1.5 Examples ...... 13 1.1.6 TheLaurentExpansion ...... 15 1.1.7 TheResidueTheorem ...... 16 1.2 UsefulIntegrationTricks ...... 20 1.2.1 StandardIntegrals ...... 20 1.2.2 IntegrationbyDifferentiation ...... 21 1.2.3 IntegrationbySeriesExpansion ...... 22 1.3 TheEulerMacLaurinsummationformula ...... 23 1.3.1 Bernoullifunctions ...... 23 1.3.2 TheEuler-MacLaurinformula ...... 25 1.3.3 ApplicationExamples ...... 26 1.4 NumericalIntegration ...... 28 1.4.1 TheTrapezoidalRule...... 29 1.4.2 TheSimpsonRule ...... 30

2 Integral Transforms 35 2.1 FourierTransform...... 35 2.1.1 ContinuousFourierTransform ...... 35 2.1.2 ImportantFourierTransforms ...... 36 2.1.3 ...... 37 2.1.4 DiscreteFourierTransform...... 38 2.2 LaplaceTransform ...... 40 2.2.1 Definition ...... 40 2.2.2 Properties ...... 43 2.2.3 Applications...... 46

3 Ordinary Differential Equations 51 3.1 Definition ...... 51 3.2 LinearODEs ...... 52 3.2.1 ConstantCoefficients ...... 52 3.2.2 PropertiesoftheMatrixExponential ...... 53

3 4 CONTENTS

3.2.3 NumericalComputation ...... 55 3.3 NonlinearODEs...... 57 3.3.1 SeparablenonlinearODEs ...... 57 3.3.2 Fixed-PointAnalysis ...... 58 3.4 NumericalSolution ...... 64 3.4.1 Runge-Kuttaalgorithm...... 65 3.4.2 Leapfrogintegration ...... 66 3.4.3 Adaptivestepsizecontrol...... 66 3.5 AnoteonLargeSystems...... 68

4 Partial Differential Equations 69 4.1 SeparationAnsatz ...... 70 4.1.1 DiffusionEquation ...... 71 4.1.2 DampedWaveEquation ...... 73 4.2 FourierTransform...... 75 4.2.1 Reaction-DiffusionEquation ...... 75 4.2.2 UnboundedWaveEquation ...... 76 4.2.3 Fokker-Planckequation...... 77 4.3 Coordinatesystems...... 78 4.3.1 CylindricalCoordinates ...... 78 4.3.2 SphericalCoordinates ...... 79 4.3.3 Schr¨odingerEquation...... 80 4.4 Green’sfunctions ...... 81 4.4.1 PoissonEquation ...... 82 4.4.2 WaveEquation ...... 83 4.4.3 Time-dependent Reaction-Diffusion Equation ...... 85 4.5 NonlinearEquations ...... 86 4.5.1 Fisher-Kolmogorov-Equation...... 87 4.5.2 Korteweg-de-Vriesequation ...... 88 4.6 NumericalSolution: FiniteDifferences ...... 89 4.6.1 Explicit Forward-Time Discretization ...... 90 4.6.2 Implicit Centered-Time discretization ...... 91 4.6.3 HigherDimensions ...... 93 4.6.4 NonlinearPDEs...... 93

5 Master Equations 97 5.1 RateEquations ...... 97 5.1.1 Example 1: Fluctuating two-level system ...... 98 5.1.2 Example2: Interactingquantumdots...... 98 5.2 DensityMatrixFormalism ...... 99 5.2.1 DensityMatrix ...... 99 5.2.2 DynamicalEvolutioninaclosedsystem ...... 100 5.2.3 CompletelyPositivelinearmaps ...... 102 5.3 LindbladMasterEquation ...... 103 5.3.1 Master Equation for a cavity in a thermal bath ...... 105 5.3.2 MasterEquationforadrivencavity ...... 106 5.3.3 Phenomenologic Identification of Jump Terms ...... 107 CONTENTS 5

5.4 Example: Single-Electron-Transistor ...... 110 5.5 Entropy ...... 111 6 CONTENTS

This lecture aims at providing physics students and neighboring disciplines with a heuristic toolbox that can be used to tackle all kinds of problems. The lecture will try to be as self-contained as possible and aims at providing rather recipes than mathematical proofs. It is not an original contribution but an excerpt of many papers, other lectures, and own experiences. These have not yet been included as references (footnotes describing famous scientists originate from Wikipedia and solely serve for better identification). As successful learning requires practice, a number of exercises will be given during the lecture, the solution to these exercises may be turned in in the next lecture (computer algebra may be used if applicable), for which students may earn points. Students having earned 50 % of the points at the end of the lecture are entitled to three ECTS credit points. The lecture script will be made available online at http://wwwitp.physik.tu-berlin.de/~ schaller/. Corrections and suggestions for improvements should be addressed to [email protected]. A tentative schedule of the lecture is as follows:

Integration • Integral Transforms • Ordinary Differential Equations • Partial Differential Equations • Statistics • Operators • Chapter 1

Integration of Functions

In this chapter, we first briefly review the properties of complex numbers and functions of complex numbers. We will find that these properties can be used to solve e.g. challenging integrals in an elegant fashion.

1.1 Heuristic Introduction into Complex Analysis

Complex analysis treats complex-valued functions of a complex variable, i.e., maps from C to C. As will become obvious in the following, there are fundamental differences compared to real functions, which however can be used to simplify many relationships extremely. In the following, we will therefore for a complex-valued function f(z) always use the partition

z = x +iy : x,y R { ∈ } f(z) = u(x,y)+iv(x,y) : u,v : R2 R , (1.1) { → } where x and y denote real and imaginary part of the complex variable z, and u and v the real and imaginary part of the function f(z), respectively.

1.1.1 Differentiation over C The differentiability of complex-valued functions is treated in complete analogy to the real case. One fundamental difference however is that the limit

f(z′) f(z) f ′(z) := lim − (1.2) z′ z z z → ′ − has in the complex plane an infinite number of ways to approach z′ z, whereas in the real case there are just two such paths (left and right limit). In analogy to the→ real case we define the derivative of a complex function f(z) via the difference quotient when the limit in Eq. (1.2)

a.) exists and

b.) is independent of the chosen path.

In this case, the function f(z) is then called in z complex differentiable. If f(z) is complex- differentiable in a Uε-environment of z0, it is also called holomorphic in z0. Demanding that a function is holomorphic has strong consequences, see Fig. 1.1. It implies

7 8 CHAPTER 1. INTEGRATION OF FUNCTIONS

iy

z’=z+i ∆ y

Figure 1.1: For the complex-differentiable func- tion f(x +iy)= u(x,y)+iv(x,y) we consider the path z′ z once parallel to the real and once z z’=z+ ∆ x parallel→ to the imaginary axis. For a complex- differentiable function, the result must be inde- pendent of the chosen path. x

that the derivative can be expressed by u(x +∆x,y)+ iv(x +∆x,y) u(x,y) iv(x,y) f ′(z) = lim − − ∆x 0 → ∆x u(x +∆x,y) u(x,y) v(x +∆x,y) v(x,y) = lim − +i lim − ∆x 0 ∆x 0 → ∆x → ∆x ∂u ∂v = +i . (1.3) ∂x ∂x On the other hand one must also have u(x,y +∆y)+iv(x,y +∆y) u(x,y) iv(x,y) f ′(z) = lim − − ∆y 0 → i∆y u(x,y +∆y) u(x,y) v(x,y +∆y) v(x,y) = lim − +i lim − ∆y 0 ∆y 0 → i∆y → i∆y ∂u ∂v = i + , (1.4) − ∂y ∂y since f(z) has been assumed complex differentiable. Now, comparing the real and imaginary parts of Eqns. (1.3) and (1.4) one obtains the Cauchy 1-Riemann 2 differential equations ∂u ∂v ∂u ∂v = , = , (1.5) ∂x ∂y ∂y −∂x which – in essence – generate the full complex analysis. The following theorem holds regarding complex differentiability:

Box 1 (Complex differentiability) The function f(z) = u(x,y)+iv(x,y) is in z0 complex differentiable if and only if:

1Augustin-Louis Cauchy (1789–1857) was an extremely productive french mathematician with pioneering con- tributions to analysis and functional calculus. He is regarded as the first modern mathematician since he strict methodical approaches to . 2Georg Friedrich (1826–1866) was a german mathematician with numerous contributions to analysis, differential geometry, functional calculus and number theory. 1.1. HEURISTIC INTRODUCTION INTO COMPLEX ANALYSIS 9

1. real part u(x0,y0) and imaginary part v(x0,y0) are in (x0,y0) totally differentiable and 2. the Cauchy-Riemannschen differential equations (u = v , u = v ) in (x ,y ) hold. x y y − x 0 0 Then, the derivative is given by f ′(z )= u (x ,y )+iv (x ,y )= v (x ,y ) iu (x ,y ). 0 x 0 0 x 0 0 y 0 0 − y 0 0

Many functions can be analytically continued from the real case to the full complex plane C: the sine function (holomorphic in the full complex plane) • ∞ ( 1)nz2n+1 sin(z)= − (1.6) (2n + 1)! n=0 X the cosine function (holomorphic in the full complex plane) • ∞ ( 1)nz2n cos(z)= − (1.7) (2n)! n=0 X the (holomorphic in C) • ∞ zn exp(z) = , (1.8) n! n=0 X ∞ (iz)2n (iz)2n+1 cos(z)+isin(z) = + = exp(iz) (1.9) (2n)! (2n + 1)! n=0 X   The last relation reduces for z R to the well-known Euler 3- Moivre 4 formula. ∈ We would like to summarize a few properties of the complex derivative: 1. Since the definition is analogous to the real derivative we also have in the complex plane C the product rule, quotient rule, and the chain rule.

5 2. Real- and Imaginary part of holomorphic functions obey the Laplace equation: uxx + uyy = 0= vxx +vyy. This is a direct consequence of the Caucy-Riemann differential equations (1.5). This also implies that when a is known at the boundary of an area G C, it is fixed for all x G. ⊂ ∈ 3. It is easy to show that all in z are in C holomorphic.

2 2 2 4. There are fundamental differences to R . For example, the function f(z) = z∗z = x + y is totally differentiable in R2 but not holomorphic, since the Cauchy-Riemann differential equations (1.5) are not obeyed.

3Leonhard Euler (1707–1783) was a swiss mathematician with important contributions to analysis. His son Johann Albrecht Euler also became a mathematician but contributed also to astronomy. 4Abraham de Moivre (1667–1754) was a french mathematician most notably known for the formula n [cos(z)+isin(z)] = cos(nz)+isin(nz). 5Pierre-Simon (Marquis de) Laplace (1749–1827) was a french mathematician, physicist and astronomer. He contributed greatly to astronomy and physics, which forced him to developed new mathematical tools, which are still important today. 10 CHAPTER 1. INTEGRATION OF FUNCTIONS

iy z 2

Figure 1.2: In contrast to the real numbers a path c(t) between points z1 and z2 in the complex plane z must be specified to fix the integral. This can be 1 conveniently done by choosing a parametrization of a contour c(t)= x(t)+iy(t) with parameter t. x

1.1.2 Integration in the complex plane C As was already the case with differentiation, also for the integration a path must be fixed – in stark contrast to the real case, see Fig. 1.2. As in the real case the integral along a contour is defined via the Riemann sum. For its practical calculation however one uses a parametrization of the integration contour c by a curve c(t):[α, β] R c C in the following way: ⊂ → ⊂ β f(z)dz = f[c(t)]˙c(t)dt. (1.10) Zc Zα This maps complex integrals to real ones. Fundamental example of complex calculus We consider the closed integral in counterclockwise (mathematical) rotation along a circle with n radius R and center z0 over the function f(z)=(z z0) : n Z. The parametrization of the it − ∈ contour is given by z(t)= z0 + Re : t =0 ... 2π. This implies 2π f(z)dz = i Rn+1ei(n+1)tdt ISR(z0) Z0 n+1 2π R ei(n+1)t : n = 1 = n+1 0 6 − 2πi : n = 1  − n 2πi : n = 1 (z z0) dz = − (n Z) . (1.11) − 0 : else ∈ ISR(z0)  For n = 1 the vanishing of the integral follows from the periodicity of the exponential func- tion (1.86 ).− For n = 1 the function is just constant along the contour, such that the value of the integral is given by− the length of the path. Many theorems of complex calculus may be traced back to this fundamental example. In the complex plane the integral has similiar properties as in the real case: 1. Since contour integrals are mapped to real integrals via parametrizations of the path it follows that the integral is linear. 2. For the same reason, the sign of an integral changes when the contour is traversed in the opposite direction (e.g. clockwise in the fundamental example).

R3 ~ 3. The analogy to path integrals in (for example regarding the work W = c F d~r ) leads to the valid question: R When are complex integrals independent of the chosen path? 1.1. HEURISTIC INTRODUCTION INTO COMPLEX ANALYSIS 11

1.1.3 Cauchy integral theorem

Box 2 (Cauchy’s integral theorem) Let G C be a bounded and simply connected (no holes) area. Let f in G be holomorphic and let c be a closed⊂ curve that is fully contained in G. Then one has:

f(z)dz =0 . (1.12) Ic

A vanishing integral over a closed curve does of course also imply that the integral between two points is independent on the path (simply form a closed curve by adding two baths between two points). Here, we would just like to sketch the proof of Cauchy’s theorem. First we separate real and imaginary parts:

f(z)dz = [u(x,y)dx v(x,y)dy]+i [u(x,y)dy + v(x,y)dx] . (1.13) − Ic Ic Ic To both integrals we can apply the two-dimensional Stokes theorem ∂F ∂F F~ d~r = [F (x,y)dx + F (x,y)dy]= y x dxdy . (1.14) x y ∂x − ∂y I∂A I∂A ZZA   When we do now in Eq. (1.13) identify in the first integral F = u(x,y),F = v(x,y) and in the x y − second integral Fx = v(x,y),Fy = u(x,y) we obtain ∂v ∂u ∂u ∂v f(z)dz = dxdy +i dxdy . (1.15) −∂x − ∂y ∂x − ∂y Ic ZZA(c)   ZZA(c)   The terms in round brackets vanish for holomorphic functions f(z), since they constitute nothing but the Cauchy-Riemann differential equations (1.5). Cauchy’s integral theorem is useful for the calculation of many contour integrals. Consider, 1 for example, the function f(z)=(z z0)− . It is holomorphic for all z = z0 but has at z = z0 a − 6 1 pole. When now z0 is not enclosed by the integration contour c, one has c(z z0)− dz = 0. If, in contrast, z is inside the contour specified by c, the pole can be cut out by a− modified integration 0 H path c′ (as depicted in Fig. 1.3) to render f holomorphic in the area surrounded by c′. Then, the area enclosed by the modified curve c′ = c + c1 + S¯ε(z0)+ c2 does no longer contain a pole of f. Furthermore, for a closed path c and a closed circle S¯ε(z0) the integrals over c1 and c2 mutually cancel as f(z) is continuous and these paths are traversed with opposite direction. Since the circle S¯ε(z0) is traversed with negative (clockwise) orientation, one obtains – with the fundamental example (1.11) – for the complete integral

f(z)dz =0= f(z)dz + f(z)dz = f(z)dz 2πi . (1.16) ¯ − Ic′ Zc ZSε(z0) Zc This implies for arbitrary integration contours c:

1 0 : z0 / c (z z0)− dz = ∈ , (1.17) − 2πi : z0 c Ic  ∈ i.e., whereas the fundamental example was only valid for a circular contours, the above result is much more general. 12 CHAPTER 1. INTEGRATION OF FUNCTIONS

c 2 c z 1 0 c

S ε(z0 ) Figure 1.3: Appropriate excision of a singularity at z = z0. The modified curve c′ = c + c1 + S¯ε(z0)+ c2 does no longer enclose a singularity. In addition, it is directly obvious that the contri- butions c1 and c2 cancel each other since the are traversed at opposite orientation. Therefore, one actually has f(z)dz + f(z)dz = 0. c S¯ε(z0) H H 1.1.4 Cauchy’s integral formula

Box 3 (Cauchy’s integral formula) Let G C be bounded and f in G be holomorphic and ⊂ continous at the boundary ∂G. Let G be bounded by a finite number of disjoint curves ci. Then one has: f(z) 2πif(z ) : z G dz = 0 0 ∈ . (1.18) z z 0 : z0 / G I∂G − 0  ∈

Also here we would like to provide a sketch for the proof: Since the case z0 / G can be mapped to Cauchy’s integral theorem we will only consider the interesting case z G∈. Then, the integrand 0 ∈ has a pole at z = z0, which – compare Fig. 1.3 – is excised by a small circle from G. In analogy to the previous example one obtains following Cauchy’s integral theorem

f(z) 0 f(z + εeiϕ) dz = lim 0 iεeiϕdϕ ε 0 iϕ ∂G z z0 − → 2π εe I − 2π Z iϕ = i lim f(z0 + εe )dϕ ε 0 0 → Z 2π = i f(z0)dϕ =2πif(z0) . (1.19) Z0 A first observation regarding Eq. (1.18) is that all values of the function f(z) in G are completely determined by its values at the boundary. This is a consequence of the Cauchy-Riemann differential equations as these imply that f(x,y) is a solution to the Laplace equation (which is completely determined by given boundary conditions [f(∂G)]). Therefore, Eq. (1.18) can be regarded a useful method to solve two-dimensional boundary problems on Laplace’s equation. Eq. (1.18) can be easily generalized to poles of higher order. Particularly interesting is of ∂ course only the case z0 G. By performing multiple derivatives on both sides of Eq. (1.18) ∈ ∂z0 1.1. HEURISTIC INTRODUCTION INTO COMPLEX ANALYSIS 13 one obtains f(z) 2πif ′(z ) = 1 dz , 0 · (z z )2 I∂G − 0 f(z) 2πif ′′(z ) = 1 2 dz, 0 · · (z z )3 I∂G − 0 f(z) 2πif ′′′(z ) = 1 2 3 dz ... (1.20) 0 · · · (z z )4 I∂G − 0 For the n-th derivative one obtains Cauchy’s general integral formula: f(z) 2πi dz = f (n)(z ) (n =0, 1, 2,...) , (1.21) (z z )n+1 n! 0 I∂G − 0 which holds of course only for z0 G. With this formula also integrals around higher-order poles can be calculated. ∈

1.1.5 Examples Cauchy’s integral theorem and Cauchy’s integral formula are often used to calculate real-valued integrals – these just have to be suitably continued into the complex plane. In the following, we will demonstrate this analytic continuation with some examples.

1. The integral

∞ dx 1 I = mit f(x)= (1.22) x2 +1 x2 +1 Z−∞ has the solution π, since the antiderivative of f(x) is arctan. However, also without knowing this one can find the solution directly with Eq. (1.18). The analytic continuation of f(x) to the full complex plane is given by 1 1 f(z)= = . (1.23) z2 +1 (z i)(z + i) − Obviously, there exist two poles at z = i. When we supplement the integral along the real axis with the semicircle in the upper complex± half-plane as depicted in Fig. 1.4, we see that only one pole (z = +i) resides inside the integration contour. In the limit limR however →∞2 the upper semicircle does not contribute, since f(z) decays in the far-field as R− , whereas the length of the semicircle only grows linearly in R. This implies

∞ dx F (z) 1 I = 2 = dz with F (z)= x +1 c z i z +i Z−∞ I − 2πi = 2πiF (i) = = π. (1.24) 2i

2. When we consider the integral over the parametrization of a conic section, i.e., over an ellipse

2π dϕ 2π dϕ I = = : 0 ε< 1 (1.25) 1+ ε cos ϕ 1+ ε (eiϕ + e iϕ) ≤ Z0 Z0 2 − 14 CHAPTER 1. INTEGRATION OF FUNCTIONS

y

Figure 1.4: Closure of a complex integral in the i upper complex half-plane. It is essential to ver- −R R x ify that the integrand vanishes in the upper arc −i of radius R faster than 1/R as R , since otherwise its contribution cannot be→ neglected. ∞

we can use the substitution z(ϕ)= eiϕ to map the integral to a closed integral along the unit circle around the origin 1 dz 2 dz I = = i z + εz z + 1 εi z2 + 2 z +1 IS1(0) 2 z IS1(0) ε 2 dz =  , (1.26) εi (z z )(z z ) IS1(0) − 1 − 2 1 1 where z1 = + 2 1 : inside S1(0) −ε rε − 1 1 z2 = 2 1 : outside S1(0) (1.27) −ε − rε − With Eq. (1.18) it follows that 2 1 2π 1 2π I = 2πi = = . (1.28) 2 εi z1 z2 ε 1 1 √1 ε − ε2 − − q 3. The integral

∞ dx 1 ∞ dx 1 ∞ dz R I = 2 2 2 = 2 2 2 = 2 2 : a (1.29) 0 (x + a ) 2 (x + a ) 2 (z + ia) (z ia) ∈ Z Z−∞ Z−∞ − can as before be closed by a semicircle in the complex plane, see Fig. 1.5. Here the integrand vanishes even faster than in our first example, such that the contribution of the upper semicircle need not be considered for the same reasons. However, here we have a second order pole residing at z = +ia, such that Cauchy’s general integral formula should be used 1 dz 1 1 d 1 I = lim = 2πi 2 R (z +ia)2(z ia)2 2 1! dz (z +ia)2 →∞ Ic − z=ia 2πi π = − = . (1.30) (z +ia)3 4a3 z=ia

4. Using the same contour as before we can also solve the integral (assuming k > 0) ikz ikz ∞ cos(kx)dx ∞ e dz ∞ e dz I = = = , (1.31) x2 + a2 ℜ z2 + a2 (z +ia)(z ia) Z−∞ Z−∞ Z−∞ − 1.1. HEURISTIC INTRODUCTION INTO COMPLEX ANALYSIS 15

y

ia

−R R x −ia Figure 1.5: Closure of a contour in the upper complex plane. We have implicitly assumed here that a> 0.

which e.g. shows up in the calculation of many Fourier 6 transforms. Above, we have already used that the integral over the imaginary part vanishes due to symmetry arguments (odd function). Here, we have just as in Fig. 1.5 two poles, but of first order at z = ia. Furthermore, in contrast to the previous cases we have to be cautious: The contour should± be closed in the upper half-plane, since in the lower half-plane (y < 0) the exponential function (exp(ikz) = exp(ikx)exp( ky)) would not be bounded. When we would close in the lower half-plane, the contribution− from the semi-circle would therefore not vanish. Closing therefore in the upper half-plane, we obtain

ikz e π ka I =2πi = e− . (1.32) z +ia a z=ia

For k < 0 we would have to choose the opposite contour, but the symmetry of the cos

function suggests that the result must be the same and we simply have to replace k k in →| | the result.

1.1.6 The Laurent Expansion

Box 4 (Laurent Expansion) Let f(z) be holomorphic inside the (doughnut) K(z ,r,R)= z C : 0 r< z z

a. The ak do not depend on ρ and b. for all z K(z ,r,R) it follows that f is given by ∈ 0 ∞ f(z)= a (z z )n . (1.34) n − 0 n= X−∞

6Jean Baptiste Joseph Fourier (1768–1830) was a french mathematician and physicist. He was interested in heat conductance properties of solids (Fourier’s law) and his Fourier analysis is a basis of modern physics and technology. 16 CHAPTER 1. INTEGRATION OF FUNCTIONS

The above expansion is also called Laurent 7 -Expansion. In contrast to the conventional Taylor 8 expansion we also observe terms with poles (ak with k < 0). From the definition of the coefficients above it furthermore follows that the Laurent-expansion is unique. This in turn implies that in case f(z) can at z = z0 be expanded into a Taylor-, then this is also the Laurent-series. In this case it just happens that many coefficients vanish ak<0 = 0. Here, we just want to show that the calculation of the expansion coefficients ak is self-consistent. For this we insert the expansion (1.34) in Eq. (1.33)

∞ l ∞ 1 (z z0) 1 l k 1 − − ak = al − k+1 dz = al (z z0) dz . (1.35) 2πi (z z0) 2πi − l= Sρ(z0) l= Sρ(z0) X−∞ I − X−∞ I Using the fundamental example of complex calculus from Eq. (1.11) it follows that all terms except one (for l k 1= 1, i.e., for l = k) in the sum vanish. This leads to − − − 1 ∞ a = 2πi a δ = a . (1.36) k 2πi l kl k l= X−∞ This shows that the definition of the expansion coefficients is self-consistent. The Laurent-expansion is often used to classify isolated singularities. Isolated singularities have in their Uε-environment no further singularity – a typical counter-example is the function f(z)= √z, which is non-holomorphic on the complete negative real axis (where it has no isolated singularities). To summarize, it is said that f(z) has at z = z0 a

curable singularity, when the limit limz z0 =: f(z0) exists and is independent on the • → approaching path. In the Laurent series all coefficients ak<0 vanish in this case. A popular example for this situation is sin(z)/z, which has a curable singularity at z = 0.

pole of order m, when in the Laurent series of f(z) around z0 the first non-vanishing • 3 coefficient is a m, i.e., when a m =0 , ak< m = 0. For example, the function f(z)=1/(z i) − − 6 − − has a third order pole at z = i. , when in the Laurent of f(z) there is no first non- • vanishing coefficient, as is e.g. the case for f(z) = exp(1/z) at z = 0.

1.1.7 The

Box 5 (Residue) Let the Laurent expansion of a function f(z) be known around z0. Then, the expansion coefficient a 1 is called the residue of f at z0, i.e., − 1 Resf(z)= a 1 = f(z)dz. (1.37) z=z0 − 2πi ISρ(z0)

7Pierre Alphonse Laurent (1813–1854) was a french mathematician working as an engineer in the french army. His only known contribution – the investigation of the convergence properties of the Laurent series expansion – was published post mortem. 8Brook Taylor (1685–1731) was a british mathematician. He found his expansion already in 1712 but its impor- tance for differential calculus was noted much later. 1.1. HEURISTIC INTRODUCTION INTO COMPLEX ANALYSIS 17

z1

z2

z4 Figure 1.6: The singularities in G can be excised with circles of radii ρk, which im- n plies ∂G f(z)dz = k=1 S¯ (z ) f(z)dz = − ρk k n ¯ + k=1 S (z ) f(z)dz. Here, Sρk (zk) denote the z3 H ρk k P H clockwise orientation as depicted and S (z ) P H ρk k the positive orientation (counterclockwise), which compensates the sign. The two integral contribu- tions along each cut (red) cancel each other as they are traversed with opposite orientation.

1 Obviously we have for the fundamental example with f(z)=(z z0)− for the residue Resf(z)=1. − z=z0 Furthermore, the integral theorem by Cauchy also implies that the residue vanishes at places where f(z) is holomorphic. The residue theorem combines all theorems stated before and is one of the most important tools in mathemetics.

Box 6 (Residue Theorem) Let G C be an area bounded by a finite number of piecewise continuous curves Let furthermore f be⊂ at the boundary ∂G and also inside G holomorphic – up to a finite number of isolated singularities z1,z2,...,zn 1,zn that are all in G. Then one has: { − } n f(z)dz =2πi Resf(z) . (1.38) ∂G z=zk I Xk=1

Here we would also like to provide a sketch of the proof. We will need the Laurent expansion and Cauchy’s integral theorem. To obtain an area within which the function is completely holomorphic we excise the singularities as demonstrated in Fig. 1.6. The ρk have to be chosen such that the Laurent-expansion of f around zk converges on the circle described by Sρk (zk). Now we insert at each singularity for f(z) the Laurent expansion around the respective singularity

n n ∞ ∞ l f(z)dz = al (z zk) dz = al2πiδl, 1 − − ∂G k=1 l= Sρk (zk) k=1 l= I −∞ I −∞ X Xn X X = 2πi Resf(z) , (1.39) z=zk Xk=1 18 CHAPTER 1. INTEGRATION OF FUNCTIONS where we have used the fundamental example of complex calculus (1.11). Therefore, the residue theorem allows one do calculate path integrals that enclose multiple singularities. Very often, poles show up in these integrals, and the formula for the calculation of residues in Eq. (1.37) is rather unpractical for this purpose, since it requires the parametrization of an integral. Therefore, we would like to discuss a procedure that allows the efficient calculation of residues at poles of nth order. Let f(z) have at z = z0 a pole of order m. Then, f(z) can be represented as

a m a 2 a 1 f(z)= − + ... + − + − + a + ... . (1.40) (z z )m (z z )2 (z z ) 0 − 0 − 0 − 0

We are mainly interested in the coefficient a 1. Therefore we first multiply the above equation − with (z z )m and then (m 1) times perform the derivative for z, i.e., − 0 − m m 2 m 1 m (z z0) f(z)= a m + ... + a 2(z z0) − + a 1 (z z0) − + a0(z z0) + ... − − − − m 1 − − − d − m m! m 1 (z z0) f(z)=(m 1)! a 1 + a0(z z0)+ ... . (1.41) dz − − − − 1! −

When in addition we do now also perform the limit z z0 we see that all unwanted terms vanish and we obtain a formula for the fast calculation of residues→ of poles of m-th order.

Box 7 (Convenient calculation of residues) Given that the function f(z0) has at z = z0 a pole of order m, the residue can be obtained via

m 1 1 d − m Resf(z)= lim m 1 (z z0) f(z) . (1.42) z=z0 (m 1)! z z0 dz − − → −

In the following, we will discuss the residue theorem with the help of a few examples

4 z 1. The Laurent expansion of the function f(z) = z− e around z = 0 can be directly found from the Taylor expansion of the exponential function

∞ n 4 z 4 z 4 1 2 1 3 1 4 z− e = z− = z− 1+ z + z + z + z + ... n! 2 6 24 n=0   1 X1 1 1 1 = + + + + + ... (1.43) z4 z3 2z2 6z 24 and we can directly read off Resf(z)=1/6. This implies that e.g. for the closed integral z=0 along the unit circle centered at the origin (which contains the pole at z = 0)

ez πi dz =2πiResf(z)= . (1.44) z4 z=0 3 IS1(0) 2. For the integral

cos(z) I = dz = f(z)dz (1.45) z3(z π)2 IS7(π) − IS7(π) 1.1. HEURISTIC INTRODUCTION INTO COMPLEX ANALYSIS 19

iy

ib ia −R R

−ia x −ib

Figure 1.7: The contour integral (1.49) must be closed in the lower complex plane, as otherwise the contribution due to the semicircle would not vanish. Two poles are enclosed by the contour.

one has to calculate the residues at two poles, since both poles are enclosed by the integration contour S7(π). It is therefore useful to employ formula (1.42). At z = 0 we have a third-order pole, i.e., 1 d2 cos(z) Resf(z) = lim (z 0)3 z=0 z 0 2! dz2 − z3(z π)2 → − 1 cos(z) 4sin(z) 6cos(z) 6 π2 = lim − + + = − . (1.46) 2 z 0 (z π)2 (z π)3 (z π)4 2π4 →  − − −  Analogously we proceed at z = π, where we have a second order pole 1 d cos(z) Resf(z) = lim (z π)2 z=π z π 1! dz − z3(z π)2 → − sin(z) 3cos(z) 3 = lim − + − = . (1.47) z π z3 z4 π4 →   Together we obtain cos(z) 12 π2 (12 π2)i I = dz =2πi − = − . (1.48) z3(z π)2 2π4 π3 IS7(π) − 3. The Fourier transform ikx ikz ∞ e ∞ e F (k)= − dx = − dz (1.49) (x2 + a2)(x2 + b2) (z ia)(z ib)(z +ia)(z +ib) Z−∞ Z−∞ − − can be calculated using the residue theorem. First we assume just for simplicity a > 0, b > 0, and k > 0: Symmetry arguments suggest that the value of the integral can only depend on a , b , and k (the latter comes in since the imaginary part of the integral vanishes). When| | | choosing| | the| integration contour we have to be cautious, since exp( ikz)= exp( ikx) exp(+ky) for k > 0 is bounded only for z with negative real part y < 0. Therefore,− we have− to close the Fourier integral in the lower complex half-plane, see Fig. 1.7. Inside this contour we have two poles of first order at z = ia und z = ib, and from Eq. (1.42) we obtain − − ikz ka e− e− Res f(z) = lim = − , z= ia z ia (z2 + b2)(z ia) 2ia(b2 a2) − →− − − e ikz e kb Res f(z) = lim − = − . (1.50) z= ib z ib (z2 + a2)(z ib) 2ib(b2 a2) − →− − − 20 CHAPTER 1. INTEGRATION OF FUNCTIONS

From the residue theorem we therefore obtain

ikx kb ka ∞ e− ae− be− F (k) = dx = 2πi − (x2 + a2)(x2 + b2) − 2iab(b2 a2) Z−∞ − π kb ka = − ae− be− , (1.51) ab(b2 a2) − −  where the sign arises from the negative orientation of the contour. With different assumptions on a, b, and k we might have to choose a different contour, direct inspection of the original integral however tells us that the result can only depend on the absolute values a , b , and k of these parameters. | | | | | | Many popular Fourier transforms can be calculated with the residue theorem, e.g. that of a Lorentz 9 function.

Exercise 1 (Fourier Transform Example) Using the residue theorem, calculate the (inverse) Fourier transform of a Lorentzian function (δ > 0, ǫ R, τ R) ∈ ∈ + 1 ∞ 1 δ iωτ e− dω. (1.52) √2π π (ω ǫ)2 + δ2 Z−∞ − Do not forget the case τ < 0.

1.2 Useful Integration Tricks

Very often, integrals cannot be obviously solved by standard methods. Before one has to revert to numerical methods, there are a few well-known tricks to solve integrals or to map them to well-known standard integrals, which we will summarize here.

1.2.1 Standard Integrals 1. The integral (a> 0, b R) ∈ 2 2 2 √ax b + b ax +bx  2√a  4a I1(a,b) = e− dx = e− − Z Z 2 2 b y2 dy π b = e 4a e− = e 4a (1.53) √a a Z r is a well-known standard, since it corresponds to a non-normalized Gaussian 10 distribution b centered at x∗ = 2a . We note that the integral does not converge for a<= 0. 9Hendrik Antoon Lorentz (1853–1928) was a dutch physicist known for the theoretical explanation of the Zeeman effect and for the transformation equations in special relativity. The distribution is sometimes however also called Breit-Wigner distribution or Cauchy distribution. 10Johann Carl Friedrich Gauß (1777–1855) was a german mathematician, astronomer, physicist and geodesist. His extreme mathematical talent was soon recognized. Since he only published results after he was convinced that the theory was complete, many important contributions were discovered in his diary after his death. 1.2. USEFUL INTEGRATION TRICKS 21

2. We can use Cauchy’s integral formula to solve the integral over a Lorentzian function (a R, b> 0 for simplicity) ∈ 1 1 I (a,b) = dx = dx 2 (x a)2 + b2 (x a +ib)(x a ib) Z − Z − − − 1 π = 2πi = . (1.54) x a +iδ b x=a+ib − The result of the integral is of course the same when b < 0, such that the more general π expression would be I2(a,b)= b . | | 3. The half-sided integral (b> 0)

∞ bx 1 I (b)= e− dx = (1.55) 3 b Z0 is readily solved as the antiderivative is well-known.

1.2.2 Integration by Differentiation As many known integrals involve parameters, it is a popular method to see how these integrals transform when we consider derivatives with respect to these parameters. Here, we generalize the integrals introduced before. 1. Since the exponential function mainly reproduces itself under differentiation, we observe that performing a derivative with respect to the parameter b produces a factor of x in front of the exponential. We therefore directly conclude that for any (x) = a + a x + ... + a xN , integrals of the type (a> 0, b R) P 0 1 N ∈ 2 ax2+bx ∂ π b I (a,b)= (x)e− dx = e 4a , (1.56) 1 P P ∂b a Z   r where we have used Eq. (1.53). 2. Similarly, we can solve the integral over higher powers of our second standard example (a R, b> 0, n N) ∈ ∈ n n 1 n ∂ 1 I2(a,b) = 2 2 dx =( 1) n! 2 dx (x a) + b − ∂B (x a) + B 2 Z     Z B=b − n − n ∂ π = ( 1) n! , (1.57) − ∂B √ 2   B B=b

where we have used Eq. (1.54). Alternatively, we might have solved this using the residue

theorem.

N 3. Also half-sided integrals with polynomials (x) = a0 + a1x + ... + aN x can be solved by differentiation P

∞ bx ∂ 1 I (b) = (x)e− dx = . (1.58) 3 P P −∂b b Z0   22 CHAPTER 1. INTEGRATION OF FUNCTIONS

Exercise 2 (Integration by Differentiation) Show that a Gaussian distribution

1 (x µ)2/(2σ2) P (x)= e− − (1.59) √2πσ has mean µ = x = xP (x)dx and variance σ2 = (x x )2 = (x µ)2P (x)dx. h i h −h i i − R R

1.2.3 Integration by Series Expansion Sometimes, series expansions of the integrand (or parts of it) lead to analytically solvable integrals. Here, we will consider series expansion of the integral

f(x) I = e− dx (1.60) Z where we assume that the function f(x) is positive for large x, has a single minimum at x0 and that the integral converges. The dominant contribution to the integral can be expected from the minimum of f(x), such that we may expand

2 (x x0) f(x)= f(x )+ f ′′(x ) − + (x,x ) , (1.61) 0 0 2 R 0 n n where the remainder is given by (x,x )= ∞ (x x ) f (x )/(n!). Above, we have used that R 0 n=3 − 0 0 we expand around a minimum, where f ′(x0) = 0. The integral then becomes P 2 f(x0) (x,x0) f (x0)(x x0) /2 I = e− e−R e− ′′ − dx. (1.62) Z We note its similarity to our example in Eq. (1.53). Indeed, when we neglect the remainder (x,x ) 0, the integral just becomes R 0 →

f(x0) 2π I e− . (1.63) ≈ sf ′′(x0)

We note that at x0 the function f(x) must have a minimum (implying that f ′′(x0) > 0) to make the expansion work. This expansion is often called saddle point (or stationary phase) approxima- tion. However, we furthermore note that we can solve this integral to arbitrary accuracy using a systematic expansion

n (n) 2 f(x0) ∞ y f (x0)/(n!) f ′′(x0)y /2 I = e− e− Pn=3 e− dy (1.64) Z of the first exponential in the integrand. Approximations of similar spirit can be applied to integrands that are given by a product of a simple weight function (which in our previous example was the Gaussian) and a complicated function (which was the exponential of the remainder before). For example, the current through a single quantum dot can be represented by the following integral Γ Γ 1 Γ/2 I = L R [f (ω) f (ω)] dω, (1.65) Γ +Γ L − R π (ω ǫ)2 + (Γ/2)2 L R Z − 1.3. THE EULER MACLAURIN SUMMATION FORMULA 23

1 βα(ω µα) where Γα and fα(ω) = e − +1 − are bare tunneling rates and Fermi functions of lead α L,R , held at inverse temperatures β and chemical potentials µ . The abbreviation Γ = ∈ { }   α α ΓL +ΓR thus describes the overall coupling strength. Clearly, the current vanishes in equilibrium when both temperatures and chemical potentials are equal. Moreover, we note with using that for small coupling strength we can approximate the Lorentzian function by a Dirac-Delta distribution 1 ǫ δ(x) = lim , (1.66) ǫ 0 2 2 → π x + ǫ it becomes obvious that for weak couplings we can approximate the current by

ΓLΓR I [fL(ǫ) fR(ǫ)] , (1.67) ≈ ΓL +ΓR − which is also the result of a simple master equation approach.

Exercise 3 (Saddle Point Approximation) Expand the integral (µ R) ∈

1 (x µ)2/(2σ2) x4 I = e− − − dx (1.68) √2πσ Z for small σ > 0 up to the first two non-vanishing terms.

1.3 The Euler MacLaurin summation formula

Sums and Integrals are obviously related. For summations over integers, this relation can be made explicit by the Euler-MacLaurin 11 summation formula, which enables to calculate integrals by sums or sums by integrals.

1.3.1 Bernoulli functions We first introduce the Bernoulli 12 numbers as expansion coefficients of the series

x ∞ B = n xn , (1.69) ex 1 n! n=0 − X where it is easy to show that the smallest Bernoulli numbers are given by 1 1 1 B = 1 , B = , B = , B =0 , B = , B =0 , 0 1 −2 2 6 3 4 −30 5 1 1 5 B = , B =0 , B = , B =0 , B = , ... (1.70) 6 42 7 8 −30 9 10 66 11Colin Maclaurin (1698–1746) was a scottish mathematician, physicist and geodesist. He published the series in 1742 in a treatise on Newton’s infinitesimal calculus. 12Jakob I. Bernoulli (1655–1705) was one of many famous members of the Bernoulli family with contributions to probability theory, variational calculus and . 24 CHAPTER 1. INTEGRATION OF FUNCTIONS

We note that all odd Bernoulli numbers Bn 3 vanish. The Bernoulli numbers can be conveniently ≥ calculated via a recursion formula (without proof)

n 1 1 − n +1 B = B , B =1 . (1.71) n −n +1 k k 0 Xk=0   We also note that the Laurent-expansion (1.33) also enables us to calculate the Bernoulli numbers via the contour integral n! z dz B = . (1.72) n 2πi ez 1 zn+1 ISǫ(0) −

Exercise 4 (Bernoulli numbers) Show that Eq. (1.72) is consistent with the definition of the Bernoulli numbers (1.69) and calculate B0, B1, and B2 explicitly.

Now, we extend the for the Bernoulli numbers by considering the function

xesx ∞ B (s) F (x,s)= = n xn , (1.73) ex 1 n! n=0 − X which defines the Bernoulli functions Bn(s) implicitly as Taylor expansion coefficients (recall that F (x,s) has a curable singularity at x = 0). These Bernoulli functions have interesting analytic properties, we first state the simple ones

1. It is obvious that Bn(0) = Bn. 2. When we consider the limit s = 1, we can obtain a further relation between the Bernoulli functions and Bernoulli numbers

xex x ( x) ∞ B ∞ ( 1)nB F (x, 1) = = = − = n ( x)n = − n xn ex 1 1 e x e( x) 1 n! − n! − − n=0 n=0 − − − X X ∞ B (1) = n xn , (1.74) n! n=0 X which – since this has to hold for all x – yields B (1) = ( 1)nB . n − n 3. Considering the limit x 0, we can fix the first Bernoulli function → F (0,s)=1= B0(s) . (1.75)

4. A recursive relation between the Bernoulli functions can be obtained by considering the partial derivative

∂ xesx ∞ B (s) F (x,s) = x = n xn+1 ∂s ex 1 n! n=0 − X ∞ B (s) ∞ B (s) = n′ xn = n′ xn , (1.76) n! n! n=0 n=1 X X 1.3. THE EULER MACLAURIN SUMMATION FORMULA 25

where in the last step we have used B0(x) = 1. Now, we can rename in the second equation the summation index n = m +1, such that we can write

∞ B′ (s) ∞ B′ (s) F (x,s)= m+1 xm+1 = n+1 xn+1 , (1.77) (m + 1)! (n + 1)! m=0 n=0 X X

from which we infer the relation Bn′ +1(s)=(n + 1)Bn(s). Altogether, we summarize again

B (0) = B , B (1) = ( 1)nB , n n n − n B0(s) = 1 , Bn′ +1(s)=(n + 1)Bn(s) . (1.78) Together, these formulas can be used to calculate the Bernoulli functions recursively. For a recur- sive calculation of Bernoulli functions, we note that these are completely defined by the relation

1 n 1 ≥ B0(s) = 1 , Bn′ (s)= nBn 1(s) , Bn(s)ds = 0 . (1.79) − Z0

Exercise 5 (Bernoulli recursions) Show the validity of the last of Eqns. (1.79). You may use that all higher odd Bernoulli numbers vanish B =0 p 1. 2p+1 ∀ ≥

For completeness, we note the first few Bernoulli functions explicitly 1 1 3 x B (x) = 1 , B (x)= x , B (x)= x2 x + , B (x)= x3 x2 + , 0 1 − 2 2 − 6 3 − 2 2 1 B (x) = x4 2x3 + x2 . (1.80) 4 − − 30 1.3.2 The Euler-MacLaurin formula We want to represent the integral

1 I = f(x)dx, (1.81)

Z0 in terms of a sum involving only the derivatives of f(x). It can be shown easily that any integral along a finite range can be mapped to the above standard form. Heuristically, we may insert the

Bernoulli function B0(x)=1= B1′ (x)

1 1

I = f(x)B0(x)dx = f(x)B1′ (x)dx. (1.82) Z0 Z0 Integration by parts yields

1 1 1 f(0) + f(1) I =[f(x)B (x)] f ′(x)B (x)dx = f ′(x)B (x)dx, (1.83) 1 0 − 1 2 − 1 Z0 Z0 26 CHAPTER 1. INTEGRATION OF FUNCTIONS

where we have – temporarily used B1(0) = 1/2 and B1(1) = +1/2 – to demonstrate that the first term simply yields combinations of f(0)− and f(1). We can go on with this game by using

Bn(x)=1/(n + 1)Bn′ +1(x) and performing again an integration by parts 1 1 1 I = [f(x)B (x)] f ′(x)B′ (x)dx 1 0 − 2 2 Z0 1 1 1 1 = f(x)B (x) f ′(x)B (x) + f ′′(x)B′ (x)dx 1 − 2 2 2 3 3  0 · Z0 1 1 1 1 1 (3) = f(x)B (x) f ′(x)B (x)+ f ′′(x)B (x) f (x)B′ (x)dx 1 − 2 2 2 3 3 − 4! 4  0 0 ·1 Z n m 1 n 1 (m 1) ( 1) − ( 1) (n) = f − (x)B (x) − + − f (x)B (x)dx, (1.84) m m! n! n "m=1 # 0 X 0 Z where in the last line we have generalized to arbitrary integers n 1. We can further use properties ≥ of the Bernoulli numbers, e.g. that B3(0) = B3 = 0and B3(1) = B3 = 0 and similiar for all higher odd Bernoulli functions evaluated at either 0 or 1. Therefore,− after separating the nonvanishing terms with B1(0/1), only the even contributions need to be considered 1 q 2m 1 2q 1 f(0) + f(1) (2p 1) ( 1) − ( 1) ) (2q) I = + f − (x)B (x) − + − f (x)B (x)dx 2 2p (2p)! (2q)! 2q " p=1 # 0 X 0 Z q 1 f(0) + f(1) B2p (2p 1) 1 1 (2q) = f − (x) + f (x)B (x)dx, (1.85) 2 − (2p)! 0 (2q)! 2q p=1 Z0 X   which finally allows us to summarize the Euler-MacLaurin formula.

Box 8 (Euler-MacLaurin formula) The Euler-MacLaurin formula relates the integral of a function f(x)

1 I = f(x)dx Z0 q 1 f(0) + f(1) B2p (2p 1) (2p 1) 1 (2q) = f − (1) f − (0) + f (x)B (x)dx, (1.86) 2 − (2p)! − (2q)! 2q p=1 Z0 X   where q > 1 is an integer, with the values of f(x) and its derivatives at the end points of the interval and an error term involving the Bernoulli functions.

The formula can be extremely useful in practice when cleverly applied, as will be outlined below.

1.3.3 Application Examples

1 1 (2q) 1. In many cases, the remainder term (2q)! f (x)B2q(x)dx can be neglected once q is chosen 0 large enough. This holds true exactlyR when f(x) is a polynomial in x, such that above a 1.3. THE EULER MACLAURIN SUMMATION FORMULA 27

certain q all derivatives simply vanish. However, also when the derivatives of f(x) do not inflate with q, the last term can be neglected for large q and we obtain the Euler-MacLaurin approximation

1 ∞ f(0) + f(1) B2p (2p 1) (2p 1) f(x)dx f − (1) f − (0) . (1.87) ≈ 2 − (2p)! − Z0 p=1 X   In practice of course one will have to cut this series at a certain p.

b 2. The Euler-MacLaurin formula is useful to evaluate series of the form m=a f(m) (with integers b>a). To see this, we consider the integral instead P b 1 b − m+1 I = f(x)dx = f(x)dx. (1.88) a m=a m Z X Z For each integration interval we can introduce the function g(x)= f(m + x), such that one has for a single interval

m+1 1 f(y)dy = g(x)dx Zm Z0 ∞ f(m)+ f(m + 1) B2p (2p 1) (2p 1) = + f − (m) f − (m + 1) .(1.89) 2 (2p)! − p=1 X   For the total integral this now implies

b 1 − ∞ f(m)+ f(m + 1) B2p (2p 1) (2p 1) I = + f − (m) f − (m + 1) , (1.90) 2 (2p)! − m=a ( p=1 ) X X   where we can now separately evaluate the terms. The values f(m) combine to

b 1 − f(m)+ f(m + 1) f(a) f(b) = + f(a +1)+ f(a +2)+ ... + f(b 1) + 2 2 − 2 m=a X f(a)+ f(b) b = + f(m) , (1.91) − 2 m=a X where we have of course implicitly assumed convergence of the series. All derivatives of f(x) cancel when evaluated at intermediate integers

b 1 − ∞ ∞ B2p (2p 1) (2p 1) B2p (2p 1) (2p 1) f − (m) f − (m + 1) = f − (a) f − (b) , (1.92) (2p)! − (2p)! − m=a p=1 p=1 X X   X   such that we can approximate a sum by an integral as

b b ∞ f(a)+ f(b) B2p (2p 1) (2p 1) f(m)= + f(x)dx f − (a) f − (b) . (1.93) 2 − (2p)! − m=a Za p=1 X X   28 CHAPTER 1. INTEGRATION OF FUNCTIONS

Exercise 6 (Finite Series) Show that

N 1 n3 = N 2(1 + N)2 . (1.94) 4 n=1 X

The above formula is also useful to obtain asymptotic expansions of infinite series. When e.g. we consider the limit b and a = 0 and assume that all derivatives for the function vanish at infinity, one obtains→ the ∞ reduced version

∞ ∞ f(0) ∞ B2p (2p 1) f(m)= + f(x)dx f − (0) . (1.95) 2 − (2p)! m=0 0 p=1 X Z X

Exercise 7 (Infinite Series Expansion) Show that

∞ 1 1 1 ∞ B = + + 2p . (1.96) (z + m)2 z 2z2 z2p+1 m=0 p=1 X X

3. Finally, we note here that the Euler-MacLaurin formula can be used to implement integrals numerically, which we will discuss in the next section.

1.4 Numerical Integration

If anything else (exact analytical solution, expansion of parts of the integrand, etc.) fails, one has to resort to numerical approaches. This is represented here as a last chance, since numerical integration may suffer from a variety of problems. The material in this section is a strongly-reduced excerpt from

W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes in • C, Cambridge University Press (1994) which is available on-line at no cost. In particular, numerical integration is only useful when the integrand is sufficiently smooth and bounded and when the location of potential peaks is well known. The simplest approach to integrate a function numerically is to approximate it with a number of equally spaced sampling points

b N b a f(x)dx f(a +(b a)i/N) − , (1.97) ≈ − N i=0 Za X which becomes exact as the number of sampling points approaches infinity N . Unfortunately, as numerical calculations should be performed in finite time, one will have to→ use ∞ a finite number of sampling points. Figure 1.8 demonstrates potential problems with numerical integration. 1.4. NUMERICAL INTEGRATION 29

f(x)

Figure 1.8: When the number of sampling points is finite, fine details (e.g. the peak) of the inte- grand may be missed. In regions where the in- tegrand does not vary strongly, the number of sampling points could be reduced (to reduce the number of possibly expensive function calls). To maintain a high level of accuracy, the number of sampling points should be increased where the a b integrand changes strongly.

We further note here that solving the integral is equivalent to solving the differential equation dy = f(x) , with y(a)=0 (1.98) dx for the solution y(b). Methods for the solution of differential equations have been adapted to treat strongly varying f(x). Therefore, if the integrand has many peaks – with positions of the peaks possibly unknown, a suitable solver for differential equations in combination with an adaptive stepsise is advisable. Finally, we note that, to cope with such problems, modern computer algebra systems (e.g. Mathematica) provide several options regarding the integration method, the position of potential singularities or peaks, the required accuracy, the number of sampling points etc. It is usually a good idea to use them.

1.4.1 The Trapezoidal Rule We will discuss the case of equally spaced sampling points here. For a short integration interval δ we can approximate any well-behaved function by a lower-order polynomial, such that the remainder term in the Euler-MacLaurin formula can be safely neglected. Then we can apply the Euler- MacLaurin formula in Box 8 with f(x)= g(xi + xδ)δ and y = xi + xδ to yield

xi+δ ∞ g(xi)+ g(xi + δ) B2p (2p 1) (2p 1) 2p g(y)dy = (b a) g − (x + δ) g − (x ) δ . (1.99) 2 − − (2p)! i − i Zxi p=1 X   The factor δ2p arises due to the chain rule. We note that δ becomes smaller when we increase the number of sampling points. Now we consider the discretization of the interval [a,b] with N intervals such that x = a + nδ with δ = (b a)/N and n 0, 1,...,N , see Fig. 1.9. When n − ∈ { } applying the above equation to every interval, we see that for all internal points x1,...,xN 1 the − terms involving Bernoulli numbers cancel, and only the external contributions remain

b N a+nδ I = g(y)dy = g(y)dy a a+(n 1)δ Z n=1 Z − X N 1 − ∞ 2p g(a)+ g(b) (2p 1) (2p 1) (b a) = δ + g(a + nδ) + g − (a) g − (b) − . (1.100) 2 − N 2p " n=1 # p=1 X X   30 CHAPTER 1. INTEGRATION OF FUNCTIONS

Figure 1.9: Sketch of a smooth function approx- imated by N + 1 sampling points separated by distance δ.

First one can see that the second summation in the above equation can be neglected when N , as for each power of 1/N only two terms remain. Then, the derivatives of g(y) need not→ ∞ be computed. This is the essence of the trapezoidal rule.

Box 9 (Trapezoidal Rule) The integral of a function g(y) can be approximated by N +1 sam- pling points with interval spacings δ =(b a)/N − N 1 b g(a)+ g(b) − 1 I = g(y)dy = δ + g(a + nδ) + . (1.101) 2 O{N 2 } a " n=1 # Z X

One obvious advantage of using the simple trapezoidal rule – where all mesh points are equally • contributing to the integral – is that when we refine the grid at which f(x) is evaluated, the previous result need not be discarded but can be used. Furthermore, the sum rule is extremely simple to implement, and one might have guessed • this rule from the beginning by centering the integration intervals at the sampling points. We see that doubling the number of sampling points reduces the error by a factor of four. • 1.4.2 The Simpson Rule A further interesting observation stemming from the Euler-MacLaurin summation formula (or from the fact that the higher odd Bernoulli numbers vanish) is that only even powers of δ =(b a)/N occur in Eq. (1.100), i.e., formally we can write for the integral approximation with N points−

N 1 − g(a) g(b) α 4 S = 2δ + g(a + n2δ) + + + N − , (1.102) N 2 2 N 2 O{ } " " n=1 # # X where we use ∆ = (b a)/N =2δ. The fact that only even powers of 1/N occur has the beautiful consequence, that by− cleverly combining summation r esults using the discretization width ∆ and a half discretization width δ = ∆/2 as depicted in Fig. 1.10 we can eliminate the leading order error term. Formally, we can express the result for the half-width discretization as 1.4. NUMERICAL INTEGRATION 31

Figure 1.10: Sketch of the approximation of the integral in the shaded region with discretization width ∆ = (b a)/N (black circles) and δ =∆/2 (red circles).− The coarse discretization requires N +1 sampling points, whereas the finer grid has 2N + 1 sampling points.

2N 1 − g(a) g(b) α 4 S = δ + g(a + nδ) + + + N − . (1.103) 2N 2 2 4N 2 O{ } " n=1 ! # X

Consequently, we see that in the combination of Eqns. (1.102) and (1.103)

4 1 S = S S 3 2N − 3 N 4 1 1 1 4 4 1 = δ g(a) 2 g(a) + g(a + δ) + g(a +2δ) 2 g(a +2δ) + ... 3 2 − 3 2 3 3 − 3       h 4 1 4 + g(a + (2N 2)δ) 2 g(a + (2N 2)δ) + g(a + (2N 1)δ) 3 − − 3 − 3 −     4 1 1 1 4 + g(b) 2 g(b) + N − 3 2 − 3 2 O{ }   1 4 2i = δ g(a)+ g(a + δ)+ g(a +2δ)+ ... 3 3 3 h2 4 1 4 + g(a + (2N 2)δ)+ g(a + (2N 1)δ)+ g(b) + N − . (1.104) 3 − 3 − 3 O{ } i Since there is no error contribution of power 1/N 3, the actual error of this discretization approxi- mation scales as 1/N 4, which is a significant improvement over the trapezoidal rule. Writing this as a sum of M =2N + 1 terms (where M is obviously always odd) we obtain an approximation for the integral that is known as Simpson’s rule 13 .

Box 10 (Simpson’s rule) When one approximates the value of an integral with M = 2N +1

13Thomas Simpson (1710–1761) was a british mathematician with contributions to interpolation and numerical integration. It should be noted that the Simpson rule was actually introduced by Johannes Kepler already in 1615 who used it to estimate the content of wine barrels. 32 CHAPTER 1. INTEGRATION OF FUNCTIONS

0 10

-1 f(x) = (x+1) 1/2 -2 f(x) = (x+1) 10 Figure 1.11: Scaling of the errors of the analyti- f(x) = tanh(x) 10 cally solvable integrals 0 g(x)dx with the num- -4 ber of sampling points in the interval [0, 10]. Solid 10 R symbols denote the errors of the Simpson rule -6 (compare Box 1.105), whereas hollow symbols 10 show the errors resulting from the simple trape- integration error

-8 zoidal rule (Box 1.101) with the same number of 10 sampling points. The functions chosen are ana-

-10 lytic in the integration interval. Dashed lines in 10 0 1 2 3 4 10 10 10 10 the background denote the N − scaling. number of sampling points in [0,10]

equally spaced sampling points, an efficient method is given by

b 1 4 2 g(x)dx = δ g(a)+ g(a + δ)+ g(a +2δ)+ ... 3 3 3 Za h 2 4 1 4 + g(a + (2N 2)δ)+ g(a + (2N 1)δ)+ g(b) + N − , 3 − 3 − 3 O{ } b a b a i δ = − = − . (1.105) 2N M 1 −

Fig. 1.11 demonstrates the scaling of the error that is obtained when comparing Simpson and trapezoidal rules with analytically solvable integrals. In particular, the superior scaling of the Simpson rule becomes visible when it is compared with the trapezoidal rule used with the same number of sampling points. Since it yields higher accuracy with the same number of sampling points, Simpson’s rule should be given preference over the trapezoidal rule. We finally note that when the stepsize is further reduced in a numerical algorithm, the previous results need not be discarded despite the fact that in the Simpson rule not all sampling points are treated equally. This can simply be achieved by separately treating contributions from boundary points, even and odd internal points

δ Ibdy = N [g(a)+ g(b)] , N 3 4 Iodd = δ [g(a + δ )+ ... + g(a + (2N 1)δ )] , N 3 N N − N 2 Ievn = δ [g(a +2δ )+ ... + g(a + (2N 2)δ )] , (1.106) N 3 N N − N

bdy odd evn such that the integral is approximated by IN = IN + IN + IN . For the refined calculation 1 with half the discretization width δ2N = 2 δN , one actually only has to sample the function at the odd new odd points, i.e., to calculate I2N . The other contribution can be obtained from the previous 1.4. NUMERICAL INTEGRATION 33 results 1 Ibdy = Ibdy , 2N 2 N 1 1 Ievn = Iodd + Ievn . (1.107) 2N 4 N 2 N This can be used to recursively calculate new points of the function, starting e.g. with only three sampling points b a δ δ = − , Sbdy = 1 [g(a)+ g(b)] , 1 2 1 3 4 Sevn = 0 , Sodd = δ g(a + δ ) . (1.108) 1 1 3 1 1 In the next step, the other contributions are iteratively calculated

1 bdy 1 bdy δn = δn 1 , Sn = Sn 1 , (1.109) 2 − 2 − evn 1 odd 1 evn odd 4 Sn = Sn 1 + Sn 1 , Sn = δn [g(a + δn)+ g(a +3δn)+ ... + g(b 3δn)+ g(b δn)] . 4 − 2 − 3 − −

bdy evn odd The sum In = Sn + Sn + Sn then defines the approximation to the integral, and the absolute difference

∆= I I (1.110) | n+1 − n| can be used as an error criterion to halt further refinements.

Exercise 8 (Simpson Rule) Using Eqns. (1.108) and (1.109), approximate the integral

+1 1 x2/2 I = e− dx (1.111) √2π Z1 − bdy evn odd numerically by calculating Sn , Sn , Sn , and In up to n =3. You should not need more than 9 function evaluations. 34 CHAPTER 1. INTEGRATION OF FUNCTIONS Chapter 2

Integral Transforms

2.1 Fourier Transform

Fourier transforms are extremely important in modern science and technology. They are widely used in data and image analysis, image and sound compression.

2.1.1 Continuous Fourier Transform There are multiple ways to define the Fourier transform of a function. They can all be transformed into each other with comparably little effort. Therefore, we will just pick one here that is convenient for us.

Box 11 (Fourier transform) The Fourier transform of a function f(t) is defined as

1 F (ω)= f(t)e+iωtdt. (2.1) √2π Z The inverse Fourier transform is then given by

1 iωt f(t)= F (ω)e− dω. (2.2) √2π Z

We summarize a few properties of the Fourier Transform (FT)

1. The Fourier transform of a function exists when it is square-integrable, i.e., when f(t) 2dt < . This is a sufficient condition (but not necessary). One necessary (but not| sufficient)| ∞ R condition for the existence of the FT is limt f(t)=0. →±∞ 2. The Fourier representation of the Dirac 1 -Delta function is

1 δ(ω)= e+iωtdt. (2.3) 2π Z 1Paul Dirac (1902–1984) was an influential british theoretical physicist and one of the founders of quantum mechanics.

35 36 CHAPTER 2. INTEGRAL TRANSFORMS

3. When f(t) and F (ω) are FT pairs, it follows that the Fourier transform of the n-th derivative is given by ( iω)nF (ω), i.e., − 1 1 F (ω)= f(t)e+iωtdt = ( iω)nF (ω)= f (n)(t)e+iωtdt. (2.4) √2π ⇒ − √2π Z Z This property will prove extremely important later-on, since it can be used to convert differ- ential equations into algebraic ones. The basic idea is to solve the simpler algebraic equation in frequency space and perform an inverse Fourier transform.

Exercise 9 (FT of a derivative) Show the validity of Eq. (2.4).

4. The FT is real-valued when f ∗(t)=+f( t). − 5. Shifts in the time-domain imply a product with an exponential in the frequency domain

iaω f(t a) e− F (ω) . (2.5) − ⇔ 6. The FT obeys the Parseval 2 relation

f(t)g∗(t)dt = F (ω)G∗(ω)dω. (2.6) Z Z The above integral is nothing but a scalar product on the space of square-integrable functions. The Fourier transform leaves this scalar product invariant and is thus unitary. In particular when f(t)= g(t), it reduces to f(t) 2dt = F (ω) 2dω. | | | | R R Exercise 10 (Parseval relation) Show the validity of Eq. (2.6).

2.1.2 Important Fourier Transforms The Fourier Transform of a Gaussian is again a (not normalized) Gaussian • 2 2 µ /(2σ ) 2 2 1 2 2 e− σ µ (t µ) /(2σ ) (ω i 2 ) f(t) = e− − , F (ω)= e− 2 − σ , (2.7) √2πσ √2π where we note that the width of the Gaussian in the frequency domain is just the inverse of the width σ in the time domain.

Exercise 11 (FT of a Gaussian) Show the validity of Eq. (2.7).

As a rule of thumb, this implies that the FT of a flat function is peaked whereas the FT of a peaked function is flat. An extreme case of this is the Dirac-Delta distribution.

2Marc-Antoine Parseval (1755–1836) was a french mathematician who published this relation without proof, which he thought was obvious. 2.1. FOURIER TRANSFORM 37

The FT of an exponentially decaying function is a Lorentzian •

1 iǫt δ t 1 δ f(t)= e− e− | | , F (ω)= . (2.8) √2π π (ω ǫ)2 + δ2 − Here, δ plays the role of a width in the frequency domain.

The FT of of a rectangular function is a bandfilter function • 1 : µ δ/2 t µ + δ/2 e+iωµ δω f(t)= − ≤ ≤ , F (ω)= sinc , (2.9) 0: else √2π 2    where δ describes the width in the time domain.

2.1.3 Convolution The Fourier transform maps single functions to other single functions. In contrast, the convolution maps two functions to a single function. It is therefore discarding information and cannot be inverted.

Box 12 (Convolution) The convolution of functions F (ω) and G(ω) can be defined as

C(ω)= F (Ω)G(ω Ω)dΩ= F (ω Ω)G(Ω)dΩ . (2.10) − − Z Z

When F (ω) and G(ω) are Fourier transforms of f(t) and g(t), respectively, it follows that the FT of a product in the time-domain is a convolution

+iωt C(ω)= f(t)g(t)e dt = F (ω′)G(ω ω′)dω′ = F (ω ω′)G(ω′)dω′ . (2.11) − − Z Z Z One obvious consequence is that when we sample only an interval of the function, e.g. the interval [T ∗ δ/2,T ∗ + δ/2] we can write this as a the FT of a product of two functions: the original function− and the rectangular function g(t) from Eq. (2.9). The FT over a finite interval is thus given by a convolution in the frequency domain

T ∗+δ/2 1 +iωt 1 +iωt 1 +iω T δω′ f(t)e dt = f(t)g(t)e dt = F (ω ω′)e ′ ∗ sinc dω′ .(2.12) √2π √2π 2π − 2 T Zδ/2 Z Z   ∗− This justifies the name bandfilter functions for the sinc-function. Likewise, it follows that the inverse FT of a product in the spectral domain is a convolution in the time domain

iωt F (ω)G(ω)e− dω = f(t′)g(t t′)dt′ = f(t t′)g(t′)dt′ . (2.13) − − Z Z Z 38 CHAPTER 2. INTEGRAL TRANSFORMS

Exercise 12 (Convolution in the FT) Show the validity of Eqns. (2.11) and (2.13).

A second obvious consequence is that the convolution theorem can be used to to calculate the FT of a product of a simple periodic function and a decaying function. As a very simple example we consider here the functions (Ω R, δ > 0) ∈ 1 δ t f(t) = sin(Ωt) , g(t)= e− | | . (2.14) √2π It is straightforward to deduce their FTs from our previous results π 1 δ F (ω)=i [δ(ω Ω) δ(ω + Ω)] , G(ω)= . (2.15) 2 − − π ω2 + δ2 r The evaluation of the convolution integral is thus particularly simple in this case

1 1 δ t +iωt FG(ω) = sin(Ωt) e− | |e dt √2π √2π Z 1 i π 1 δ ′ ′ ′ ′ ′ ′ = F (ω )G(ω ω )dω = [δ(ω Ω) δ(ω + Ω)] 2 2 dω √2π − √2π 2 π − − (ω ω′) + δ Z r Z − i δ δ = 2π (ω Ω)2 + δ2 − (ω + Ω)2 + δ2  −  i 4δωΩ = . (2.16) 2π [(ω Ω)2 + δ2][(ω + Ω)2 + δ2] −

Exercise 13 (Convolution) Show that

1 1 t2/(2σ2) +iωt 1 σ2(ω Ω)2/2 σ2(ω+Ω)2/2 cos(Ωt) e− e dt = e− − + e− . (2.17) √2π √2πσ 2√2π Z h i

Similiar decompositions exist whenever the function f(t) is periodic. In this case, one can use +inωt the decomposition of f(t)= n cne into a to compute the FT of f(t)g(t). P 2.1.4 Discrete Fourier Transform When the function f(t) has significant support only in a finite interval [ ∆/2, +∆/2], we can approximate the integral in the continuous Fourier transform by a sum of−N terms (using δ = ∆/(N 1)) − 1 1 +∆/2 F (ω) = f(t)e+iωtdt f(t)e+iωtdt √2π ≈ √2π ∆/2 Z Z− +(N 1)/2 1 − f(kδ)e+iωkδδ, (2.18) ≈ √2π k= (N 1)/2 −X− 2.1. FOURIER TRANSFORM 39

and the function F (ω) is then obtained from a number of discrete values fk = f(kδ). Many natural signals are a priori discrete instead of continuous, and there is consequently also a unitary variant of the discrete Fourier transform.

Box 13 (Discrete Fourier Transform) Let a0,a1,...,aN 1 be a list of N (complex-valued) { − } numbers. Then, their discrete Fourier Transform (DFT) is given by

N 1 − jk 1 +2πi a˜ = e N a , (2.19) k √ j N j=0 X where k 0, 1,...,N 1 . The original numbers can be recovered from the inverse DFT ∈ { − } N 1 − jk 1 2πi aj = e− N a˜k , (2.20) √N Xk=0 where j 0, 1,...,N 1 . ∈ { − }

The reason why also the DFT is invertible lies in the identity

N 1 − 1 2πi ℓk δ = e N , (2.21) k,0 N Xℓ=0 which can be shown by employing the finite

N 1 xN+1 S = xℓ = − . (2.22) N 1 x Xℓ=0 − We note for completeness that the above relation holds for x = 1 and is evident from SN+1 = N+1 6 1+ xSN = SN + x . The DFT corresponds to the multiplication with an N N unitary matrix, i.e., when we T × T arrange the numbers in vectors a˜ = (˜a0,..., a˜N 1) and a =(a0,...,aN 1) , we have the relation − − a˜ = Ua with the matrix

1 1 1 2 1 (N 1) 2πi 2πi 2πi · − e N· e N· ... e N 2 1 2 2 . 2πi · 2πi · . 1  e N e N .  U = . (2.23) √ . .. . N  . . .  (N 1) 1 (N 1) (N 1)  2πi − · 2πi − · −   e N ...... e N      The inverse transformation is then given by a = U †a˜. Performing this matrix-vector multiplication 2 would thus require N multiplications of the numbers aj. However, we just note here that when N is a powerO{ of two,} i.e., N = 2n, the DFT can be performed much more efficiently with complexity N ln N . The corresponding algorithm is known as Fast Fourier Transform (FFT), and extensionsO{ of it have} found widespread application in data analysis and image and sound compression (jpeg, mp3). Due to the similiar structure, we just note here that many properties of the continuous FT also hold for the DFT. 40 CHAPTER 2. INTEGRAL TRANSFORMS 2.2 Laplace Transform

2.2.1 Definition

Box 14 (Laplace Transform) The Laplace transform (LT) of a function f(t) is given by

∞ st F (s)= f(t)e− dt. (2.24) Z0 It exists when the integral converges, which defines a region of convergence s S C. For t> 0, it can be inverted via ∈ ⊂

γ+i 1 ∞ f(t)= F (s)e+stds. (2.25) 2πi γ i Z − ∞ Here, γ R must be chosen such that the integration path lies within the region of convergence S. ∈

The integral of the inverse Laplace transform is also called Bromwich 3 integral. We note that unlike the FT, where forward and backward transformations are essentially of the same difficulty, computing the inverse LT may be much more tedious than computing the LT. Furthermore, since only positive times enter in the Laplace transform, the inverse LT can only be used to reconstruct the function f(t) for t> 0. We start by considering some examples:

1. To elucidate what is meant with region of convergence, we consider the simple Laplace transform (α C) ∈ (α s)t ∞ αt ∞ (α s)t e − (s)> (α) 1 f(t)= e , F (s)= e − dt = ℜ =ℜ . (2.26) α s s α Z0 0 − − C In this case, the region of convergence is defined by S = s : (s) > (α) . By definition, it must not contain any poles or other singularit{ies.∈ The integrationℜ ℜ contour} of the inverse LT must lie within this region of convergence, i.e., we have to choose γ such that also γ > (α) , see Fig. 2.1. We note that the inverse LT can be evaluated using the residue theoremℜ once} the integration contour is closed. For this, it is necessary to continue the LT F (s) to the full complex plane and to close the contour in the left half plane, since in the right half plane the integrand F (s)est is not suppressed. Provided that the Laplace transform has only a finite number of poles in the complex plane, we can use the residue theorem, which implies for the inverse Laplace transform

f(t)= ResF (s)est = esitResF (s) , (2.27) s=si s=si i i X X st where si are the poles of the Laplace transform F (s) – recall that e is analytic in C – that are contained by the integration contour. We note that when the poles of F (s) have a vanishing or even negative real part, the above decomposition yields a convenient way to

3Thomas John I’Anson Bromwich (1875–1929) was a british mathematician and physicist with contributions to Maxwells equations. 2.2. LAPLACE TRANSFORM 41

Figure 2.1: For a α, the region of convergence for the Laplace transform is given by all complex numbers with a larger real part than α (shaded area). It must not contain any singularities (by definition). The integration con- tour for the inverse Laplace transform must lie within this region of convergence γ > (α). The inverse LT can be evaluated using the residueℜ the- orem when the contour is closed (green arc).

approximate the long-term behaviour of the function f(t). For our particular simple example this means est f(t) = Res = eαt , (2.28) s=α s α − which demonstrates consistency. 2. As a second simple example we consider the function f(t)= tneαt (2.29) with n 0, 1, 2,... and α C. The Laplace transform exists for (s) > (α) and can by found by∈{ differentiation} ∈ ℜ ℜ n ∞ n αt st ∂ 1 n! F (s)= t e e− dt = = . (2.30) ∂α s α (s α)n+1 Z0   − − In the complex plane, it has a pole of order n +1 at s = α. The residue necessary for the calculation of the inverse Laplace transform can be found by direct inspection

st (s α)t ∞ m e n! αt e − n! αt n!t m n 1 = e = e (s α) − − , (2.31) (s α)n+1 (s α)n+1 m! − m=0 − − X which yields estn! Res = eαttn (2.32) s=α (s α)n+1 − and thus consistently reproduces our original function.

1 3. As a more complex example, we consider the function f(t) = √t . First we consider the existence of the Laplace transform st ∞ e F (s)= − dt (2.33) √t Z0 42 CHAPTER 2. INTEGRAL TRANSFORMS

For large times, the integrand decays. In addition, we note that though for small times the integrand diverges, the integral is still finite, as it can be upper-bounded for (s) > 0 ℜ 1 st st 1 st st e ∞ e e ∞ e F (s) = − dt + − dt − dt + − dt | | √t √t ≤ √t √t Z0 Z1 Z0 Z1 1 s 1 ∞ st e− dt + e− dt =2+ , (2.34) ≤ √t s Z0 Z1

where we have employed the triangle inequality z1 + z2 z1 + z2 (which can be recursively applied to larger sums and therefore also holds| for the|≤| integ|ral).| | It is not difficult to show that the actual value of the integral is given by

st ∞ e π F (s)= − dt = . (2.35) √t s Z0 r

Exercise 14 (Substitution) Show validity of Eq. (2.35).

We note that F (s) has no on the negative real axis but a full branch cut: By employing the polar representation s = s e+iφ we see that on the positive real axis we have | | π π π π lim = lim = lim = . (2.36) (s) 0+ s (s) 0 s φ 0 s eiφ s ℑ → r ℑ → − r → r| | r| | However, on the negative real axis the two limits do not coincide

π π iφ π iπ/2 π lim = lim e− = e− = i , (s) 0+ s φ +π s s − s ℑ → r → r| | r| | r| | π π iφ π +iπ/2 π lim = lim e− = e = +i . (2.37) (s) 0 s φ π s s s ℑ → − r →− r| | r| | r| | To compute the inverse Laplace transform, we therefore have to excise the branch cut with a more complicated integration contour, see Fig. 2.2. The Bromwich integral can then be represented as

γ+i 1 ∞ π +st f(t)= e ds = I1(t) I2(t) I3(t) , (2.38) 2πi γ i s − − − Z − ∞ r where the individual contributions have to be parametrized explicitly. The first integral becomes

0 0 1 π +xt 1 π +xt I1(t) = lim e dx = − e dx ǫ 0+ 2πi x +iǫ 2π x → Z−∞ r Z−∞ r 0 yt − 1 π yt 1 ∞ e− 1 = e− dy = dy = − , (2.39) 2π + y −2√π 0 √y 2√t Z ∞ r Z 2.2. LAPLACE TRANSFORM 43

Figure 2.2: When the branch cut is excised, the contour encloses a region where the Laplace transform is holomorphic. Since in addition the green arcs do not contribute, the integral along the closed contour must vanish, and the Bromwich integral (solid black) can be repre- sented as f(t)= I (t) I (t) I (t). − 1 − 2 − 3

where in the last step we have used the previous result (2.35), this time assuming positive time. The integral along the semicircle essentially vanishes because the path is so short, and the moderate divergence of the integrand at the origin cannot compensate for this

π/2 1 − π +iφ +iφ I2(t) = lim exp ρe t ρe idφ =0 . (2.40) 2πi ρ 0+ ρe+iφ → Z+π/2 r  Finally, the third integral yields the same contribution as the first one

1 −∞ π +xt 1 −∞ π +xt I3(t) = lim e dx = e dx ǫ 0+ 2πi x iǫ 2π x → 0 0 Z ytr − Z r− 1 ∞ e− 1 = dy = − . (2.41) −2√π √y 2√t Z0 The inverse Laplace transform therefore consistently becomes 1 f(t)= . (2.42) √t Inverting the Laplace transform may evidently become tedious when branch cuts are involved.

2.2.2 Properties Since the exponential function occurs in the Laplace-transform, too, we observe that many inter- esting properties of the Fourier transform are also present in similiar form in the Laplace transform, some of which we summarize below 1. When f(t) and F (s) are Laplace transform pairs, the Laplace transform of the derivative f ′(t) is related to both F (s) and the initial value f(0)

2 f ′(t) sF (s) f(0) , f ′′(t) s F (s) sf(0) f ′(0) . (2.43) ⇔ − ⇔ − − 44 CHAPTER 2. INTEGRAL TRANSFORMS

Exercise 15 (Laplace transform of a derivative) Show validity of Eq. (2.43).

As with the FT, this formula can be applied recursively to yield for higher derivatives

n (n) n k 1 (n k) f (t) s F (s) s − f − (0) . (2.44) ⇔ − Xk=1 As with the FT, this property also allows to convert differential equations into algebraic ones. However, as the main difference we see here that the initial values of f(t) are also entering these equation. The LT therefore conveniently allows to solve initial value problems: The LT of an unknown function is computed formally and solved for in the spectral domain. However, sometimes one does not need the full time-dependent solution but would be satisfied with initial limt 0 f(t) or final values limt f(t). To obtain these, the following properties may → →∞ be helpful.

2. The LT obeys the initial value theorem

lim f(t)= lim sF (s) . (2.45) t 0 s → →∞ Provided that f(t) has a Taylor expansion at t = 0, we can use our example in Eq. (2.30) for α = 0 to see that

∞ (n) ∞ (n) f (0) ∞ n st f (0) F (s)= t e− dt = , (2.46) n! sn+1 n=0 0 n=0 X Z X such that f (1)(0) f (2)(0) lim sF (s)= f (0)(0)+ lim + + ... = f(0) . (2.47) s s s s2 →∞ →∞  

3. Provided that the limit limt f(t) exists, or that the Laplace transform has only a finite →∞ number of poles in the left complex half plane (with either negative real part or at s = 0), the Laplace transform obeys the final value theorem

lim f(t) = lim sF (s) . (2.48) t s 0 →∞ → We consider the Laplace transform of a derivative to see this

∞ st lim f ′(t)e− dt = lim [sF (s) f(0)] = lim sF (s) f(0) . (2.49) s 0 s 0 − s 0 − → Z0 → →

Provided that the long-term limit exists, f ′(t) must vanish for large times, and we can exchange limit and integral. The above equation then becomes

∞ st ∞ lim f ′(t)e− dt = f ′(t)dt = lim f(t) f(0) , (2.50) s 0 t − Z0 → Z0 →∞ and comparison with the previous Equation eventually yields the final value theorem (2.48). 2.2. LAPLACE TRANSFORM 45

4. The Laplace transform of an integral of a function can be easily computed once the LT of ∞ st the original function is known, i.e., when F (s)= 0 f(t)e− dt, we conclude t R ∞ st F (s) f(t′)dt′ e− dt = . (2.51) s Z0 Z0  This can be seen by performing an integration by parts

t t ∞ st ∞ 1 d st f(t′)dt′ e− dt = f(t′)dt′ − e− dt s dt Z0 Z0  Z0 Z0    t ∞ 1 st 1 ∞ st = f(t′)dt′ − e− + f(t)e− dt s s Z0   0 Z0 F (s) = , (2.52) s

d t where we have used that dt 0 f(t′)dt′ = f(t). 5. Shifts in the time-domain (a>R 0) imply a product with an exponential in the s-domain

as f(t a)Θ(t a) e− F (s) , (2.53) − − ⇔ where Θ(x) denotes the Heaviside-Theta function.

Exercise 16 (Time-shift) Show the validity of Eq. (2.53).

6. The Laplace transform of a convolution of two functions can be written as a simple product. First, we note that the Laplace transform only considers functions with a positive argument, so the convolution reduces to

+ t ∞ (f g)(t)= f(τ)g(t τ)dτ = f(τ)g(t τ)dτ . (2.54) ∗ − 0 − Z−∞ Z The Laplace transform of such a convolution becomes

t ∞ st ∞ st FG(s) = (f g)(t)e− dt = f(τ)g(t τ)dτ e− dt 0 ∗ 0 0 − Z t Z Z  ∞ st ∞ ∞ st = dt dτf(τ)g(t τ)e− = dτ dtf(τ)g(t τ)e− , (2.55) − − Z0 Z0 Z0 Zτ where in the last step we have used that the two-dimensional integral covers the sector with τ,t 0 bounded by the lines τ = 0, τ = t to change the order of the integration, see also Fig.≥2.3. After substitution, we can relate the LT to the separate Laplace transforms

∞ sτ ∞ s(t τ) FG(s) = dτf(τ)e− dtg(t τ)e− − = F (s)G(s) . (2.56) − Z0 Zτ The structure of the equations is thus considerably simplified, which justifies the importance of the Laplace transform for differential equations with a memory kernel. 46 CHAPTER 2. INTEGRAL TRANSFORMS

Figure 2.3: Sketch of the integration region in Eq. (2.55), which covers only the shaded area. The same region is covered by the integrals t ∞ ∞ ∞ 0 dt 0 dτ[...] and 0 dτ τ dt[...]. R R R R 2.2.3 Applications Here, we briefly demonstrate the use of the Laplace transform for a few examples.

1. First we discuss the simple equation

x˙ = ax, (2.57)

at for which we know the solution is x(t)= e x0. Formally performing the Laplace transform ∞ st F (s)= 0 x(t)e− dt yields R sF (s) x(0) = aF (s) , (2.58) − which we can solve for the Laplace transform x F (s)= 0 , (2.59) s a − which has a pole at s = a in the complex plane. The inverse Laplace transform can be obtained from the residue theorem – Eq. (2.27) and yields

at x0 at x(t)= e Res = e x0 . (2.60) s=a s a − 2. Now, we consider the well-known equation of motion for a harmonic oscillator of mass m, dampening constant γ0, spring constant κ0, which is in addition subject to a periodic driving force of amplitude α0 and frequency Ω γ κ α x¨ + 0 x˙ + 0 x = 0 cos(Ωt) , (2.61) m m m where x = x(t) denotes the time-dependent position of the oscillator. We perform the Laplace ∞ st transform F (s)= 0 x(t)e− dt on the equation above and with γ = γ0/m, κ = κ0/m, and R 2.2. LAPLACE TRANSFORM 47

α = α0/m we obtain

2 α ∞ (s iΩ)t (s+iΩ)t s F (s) sx v + γ [sF (s) x ]+ κF (s) = e− − + e− dt − 0 − 0 − 0 2 Z0   α 1 1  = + 2 s iΩ s + iΩ αs −  = , (2.62) s2 + Ω2 where we have used properties of the Laplace transform of a derivative and introduced initial position x0 = x(0) and initial velocity v0 =x ˙(0). In the above equation, we can now solve for the Laplace transform algebraically αs s + γ 1 F (s)= + x + v . (2.63) (s2 + Ω2)(s2 + γs + κ) s2 + γs + κ 0 s2 + γs + κ 0 We denote the roots of the polynomial by γ γ 2 γ 4κ s = κ = 1 1 , (2.64) ± − 2 ± 2 − 2 − ± − γ2 r   r  where we can see that the s may have s large imaginary part (for weak dampening) or can ± also become purely real (for strong dampening). Since we can write the Laplace transform by using four poles αs s + γ 1 F (s)= + x0 + v0 ,(2.65) (s + iΩ)(s iΩ)(s s+)(s s ) (s s+)(s s ) (s s+)(s s ) − − − − − − − − − − its inverse is readily computed via the residue theorem – or Eq. (2.27) – to yield

iΩt α( iΩ) +iΩt α(+iΩ) x(t) = e− − + e ( 2iΩ)( iΩ s+)( iΩ s ) (+2iΩ)(iΩ s+)(iΩ s ) − − − − − − − − − +s+t αs+ +s t αs +e + e − − (s+ + iΩ)(s+ iΩ)(s+ s ) (s + iΩ)(s iΩ)(s s+) − − − − − − − − +s+t s+ + γ +s t s + γ + e + e − − x0 (s+ s ) (s s+)  − − − −  +s+t 1 +s t 1 + e + e − v0 . (2.66) s+ s s s+  − − − −  Now, the dependence of the pole positions s on the model parameters tells us what long- ± term dynamics can be expected, see also Fig. 2.4. Unless the dampening vanishes or is infinite, two poles will have a negative real part, and the corresponding contributions to the full solution will be negligible for large times. The asymptotic solution will in this case be given by α e iΩt e+iΩt x(t) − + → 2 (iΩ + s+)(iΩ + s ) (iΩ s+)(iΩ s )  − − − −  2(κ Ω2) cos(Ωt)+2γΩ sin(Ωt) = − . (2.67) γ2Ω2 +(κ Ω2)2 − 2 Noting that κ = κ0/m = ω0 denotes the squared eigenfrequency of the undamped oscillator, it is clearly visible that the asymptotic oscillation amplitude may become large when the driving frequency matches the oscillator eigenfrequency. 48 CHAPTER 2. INTEGRAL TRANSFORMS

Figure 2.4: The Laplace transform of the damped and driven oscillator has four poles. With turn- ing on damping γ, the two poles (blue) at the original oscillator eigenfrequencies i√κ = iω0 move into the left half plane (arrows)± until± they collide at s∗ = √κ when γ = √4κ. When damping is further− increased, they move in op- posite direction along the negative real axis. For physical parameters (κ > 0, γ > 0), none of the poles ever aquires a positive real part. For most parameter constellations, only the two poles aris- ing from the driving (red) will have a vanishing real part.

3. As a third example we will consider an integro-differential equation

t ρ˙ = g(τ)ρ(t τ)dτ , (2.68) − Z0 where we simply note that the change of ρ at time t depends also on all previous values of ρ(t). The function g(τ) – sometimes called memory kernel – weights these influences from the past. The convolution property tells us

sF (s) ρ = G(s)F (s) , (2.69) − 0 where

∞ sτ G(s)= g(τ)e− dτ (2.70) Z0 denotes the Laplace transform of the kernel function. For simplicity we choose here a simple kernel of the form

ατ g0 g(τ)= g τe− , G(s)= . (2.71) 0 (s + α)2

The kernel peaks at τ = 1/α, which can then be considered as a characteristic delay time. The Laplace transform of the solution therefore becomes

ρ ρ (s + α)2 ρ (s + α)2 F (s)= 0 = 0 = 0 , (2.72) s G(s) s(s + α)2 g (s s )(s s )(s s ) − − 0 − 1 − 2 − 3 2 where si denote the roots of the polynomial s(s + α) g0. Since the polynomial is only of third order, these can be computed exactly with the Cardano− 4 formulas. Since we aim for

4Gerolamo Cardano (1501–1576) was an italian scholar who worked as a doctor, philosopher and mathematician. He is considered as one of the last polymaths. 2.2. LAPLACE TRANSFORM 49

simple expressions, we use a power series ansatz to find the lowest order corrections to the poles for small g0 g g g s = 0 + g3/2 , s = α i 0 0 + g3/2 . (2.73) 1 α2 O{ 0 } 2/3 − ± α − 2α2 O{ 0 } r

Exercise 17 (Approximate roots) Use a power series ansatz to find Eqns. (2.73). It might be useful to non-dimensionalize first.

Consequently, we see that the long-term dynamics of the solution is dominated by the pole with the largest real part, and we find ρ (s + α)2 ResF (s)= 0 1 , (2.74) s=s1 (s s )(s s ) 1 − 2 1 − 3 which implies for the long-term asymptotics ρ (s + α)2 ρ(t) es1t 0 1 . (2.75) → (s s )(s s ) 1 − 2 1 − 3 4. Finally, we may also apply the convolution theorem to memories that have a strongly peaked delay kernel, e.g. to the differential equation

ρ˙(t)= g ρ(t τ )Θ(t) , (2.76) 0 − D

where τD > 0 is a delay time. We can write the above equation also as a convolution

t ρ˙(t)= g(τ)ρ(t τ)dτ, g(τ)= g δ(τ τ ) . (2.77) − 0 − D Z0 The Laplace transform of the kernel function becomes

sτD G(s)= g0e− , (2.78) such that the Laplace transform of the solution is ρ ρ F (s)= 0 = 0 . (2.79) s G(s) s g e sτD − − 0 − The denominator now has infinitely many roots sn (corresponding to first order poles) which in general will have to be determined numerically, and the full solution is then given by

+ ∞ snt ρ0 ρ(t)= e Res sτ . (2.80) s=sn s g e D n= 0 − X−∞ − In particular, only one pole – which we formally denote as s0 – has a positive real part and will thus dominate the long-term dynamics

t ρ0 →∞ s0t ρ(t) e Res sτ . (2.81) → s=s0 s g e D − 0 − 50 CHAPTER 2. INTEGRAL TRANSFORMS

1 10 ] 0

Figure 2.5: Plot of the full solution (2.84) (solid ρ

black, in units of ρ0) versus time (in units of delay (t) [ ρ time τD). For times smaller than the delay time, nothing happens. Afterwards the growth dynam- g τ = 0.1

solution 0 D ics is not immediately exponential but quickly ap- g τ = 0.5 0 0 D 10 τ g0 D = 1.0 proches the asymptotic limit obtained by consid- τ g0 D = 2.0 ering the dominant pole (dashed red). For short 1 pole (long-term asymptotics) times the dynamics is poorly captured if only a 11 poles 0 1 2 3 4 5 τ finite number of poles is considered in Eq. (2.80). dimensionless time t [ D]

Alternatively, the full inverse LT for finite times can be obtained by expanding the Laplace transform

∞ n ∞ n ρ0 1 ρ0 g0 nsτD g0 nsτD F (s)= g = e− = ρ0 e− . (2.82) s 1 0 e sτD s s sn+1 s − n=0 n=0 − X   X We can perform the inverse Laplace transform for each term separately

n n 1 nsτD g0 g0 n − e− = (t nτ ) Θ(t nτ ) , (2.83) L sn+1 n! − D − D   where we have used previous results.

Exercise 18 (Shifted Laplace Transform) Show the validity of Eq. (2.83).

The solution eventually becomes

∞ gn ρ(t)= 0 (t nτ )nΘ(t nτ )ρ . (2.84) n! − D − D 0 n=0 X For significant delay, the solution can be quite different from simple exponential growth, see Fig. 2.5. In particular, it is straightforward to see that we can truncate the solution for short times after a few terms. Chapter 3

Ordinary Differential Equations

3.1 Definition

In short, any equation involving a function and also its derivatives with respect to a single variable may be called ordinary differential equation.

Box 15 (Ordinary differential Equation) An ordinary differential equation (ODE) of order n is defined by the equation

(n) f(t,x(t),x′(t),x′′(t),...,x (t))=0 , (3.1) where f(t,x′(t),...) is a function. When the equation can be solved for the highest derivative, i.e.,

(n) (n 1) x (t)= F (t,x(t),x′(t),...,x − (t)) (3.2) it is called explicit.

Any explicit m-th order ODE can always be mapped to a coupled system of first order ODEs. To see this, we introduce auxiliary variables

(n 1) y1(t)= x(t) , y2(t)= x′(t) , ..., yn(t)= x − (t) , (3.3) which yields the system

y1′ (t) = y2(t) , y2′ (t)= y3(t) , ..., yn′ 1(t)= yn(t) , − yn′ (t) = F (t,y1(t),...,yn(t)) . (3.4)

Above, only the first derivatives occur, but we now have a system of n equations instead of a single equation with n derivatives. As a simple example, we consider the harmonic oscillator

mx¨ + γx˙ + κx = g(t) , (3.5) which by introducing

y1(t)= x(t) , y2(t)=x ˙(t) (3.6)

51 52 CHAPTER 3. ORDINARY DIFFERENTIAL EQUATIONS can be converted into the system

γ κ g(t) y˙ (t)= y (t) , y˙ (t)= y (t)+ y (t)+ . (3.7) 1 2 2 −m 2 m 1 m

There exist many types of ODEs and only few of them can be solved analytically.

3.2 Linear ODEs

A special case of ODEs occurs when the functions yi(t) occur only linearly in the ODE, i.e.,

y˙1 a11(t) ... a1N (t) y1 f1(t) . . . . .  .  =  . .   .  +  .  . (3.8) y˙n aN1(t) ... aNN (t) yn fn(t)                 The system is called homogeneous when fi(t) = 0. In what follows, we will switch to a vector representation (marked by bold symbols)

y˙ = A(t)y + f(t) . (3.9)

We can immediately exploit the linearity to deduce that when we have found two solutions to the homogeneous system, any combination of these will also be a solution of the homogeneous system. Furthermore, when yH (t) denotes any solution (or the general solution) of the homogeneous system and yS(t) any particular solution of the inhomogeneous system

y˙H = A(t)yH (t) , y˙S = A(t)yS(t)+ f(t) , (3.10) it is directly evident that the combination of both y(t)= yH (t)+yS(t) will also solve the differential equation (3.9).

3.2.1 Constant Coefficients The case becomes much simpler when the coefficients in the linear system are constant A(t) A and f(t) f. Then, the solution for the homogeneous system reads → → At yH (t)= e c , (3.11) where c is any vector. A special solution of the inhomogeneous system can then be found by solving the time-independent equation AyS + f = 0, and by using the initial condition y(0) = y0 we can fix the vector c. Thus, the solution is given by

At At y(t)= e y + 1 e yS . (3.12) 0 −   Evidently, the computation of the exponential of a matrix is important to calculate the solution of linear ODEs with constant coefficients. 3.2. LINEAR ODES 53

3.2.2 Properties of the The exponential of a matrix is defined via the power series expansion

∞ An eA = . (3.13) n! n=0 X We summarize some properties of the matrix exponential

1. When two matrices commute [A, B]= AB BA = 0 we have − eA+B = eAeB . (3.14)

It may seem trivial but is sadly often forgotten: Usually matrices do not commute, and the above property does not hold.

2. The matrix exponential is always invertible, and one has

A 1 A e − = e− . (3.15)  1 3. When there exists a similiarity transform C = BAB− one can compute the matrix expo- nential via

C A 1 e = Be B− . (3.16)

This is particularly useful if eA is simple to compute.

4. When the matrix if of block-diagonal form, its exponential can be computed by exponenti- ating the blocks separately

B1 B1 0 e 0 B = ... = eB = ... , (3.17)   ⇒   Bk 0 Bk 0 e         which may significantly – in particular when the dimension of the blocks is one (diagonal matrix) – reduce the effort.

5. The determinant of a matrix exponential can be computed by exponentiating the trace

A Tr A det e = e { } . (3.18)  6. One has d eAt = AeAt . (3.19) dt Note that this does not hold when A is time-dependent. 54 CHAPTER 3. ORDINARY DIFFERENTIAL EQUATIONS

In particular when two matrices do not commute but when their commutator is small or just a C-number, the Baker1- Campbell2- Hausdorff3 formula eX eY = eZ(X,Y ) , with 1 1 1 Z(X,Y ) = X + Y + [X,Y ]+ [X, [X,Y ]] [Y, [X,Y ]] + ... (3.20) 2 12 − 12 may be useful. In particular when [X,Y ] C we may exactly truncate the series. A simple consequence of the expansion is the Lie4 product∈ formula

N eA+B = lim eA/N eB/N , (3.21) N →∞ which has applications e.g. in path integrals. Very often, one encounters transformations of the +X X form e Ye− . These can be conveniently treated with the nested commutator decomposition

∞ +X X 1 1 1 e Ye− = Y +[X,Y ]+ [X, [X,Y ]] + [X, [X, [X,Y ]]] + ... = [X,Y ] 2! 3! n! n n=0 X with [X,Y ]m = X, [X,Y ]m 1 , and [X,Y ]0 = Y . (3.22) − The above expansion is also known as nested commutator expansion for obvious reasons.

Exercise 19 (nested commutator expansion) Prove the nested commutator expan- sion (3.22).

Finally, we note that the matrix exponential can always be computed. When A is diagonalizable, we may directly compute the matrix exponential via the (simple) • exponential of the diagonalized matrix – cf. Eq. (3.16). Even when A is not diagonalizable, it can be represented in Jordan5 normal form • λ 1 0 J 0 1 ...... 1 ..   B = UAU − =  .  with Jordan blocks Ji = . . (3.23) .. 1 0 Jk      0 λ        Since these Jordan blocks can be decomposed as Ji = λ1 + Ni with nilpotent matrices Ni k (meaning that Ni = 0 for sufficiently large exponent k), their exponential can be explicitly Ji λ Ni λ k ℓ computed via e = e e = e ℓ=0 Ni /ℓ! . h i 1Henry Frederick Baker (1856–1956) wasP a british mathematician with contributions to algebraic geometry. He elaborated on the formula in 1902. 2John Edward Campbell (1862–1924) was a british mathematician who first introduced the formula in 1897. 3Felix Hausdorff (1868–1942) was a german mathematician who also acted as a philosopher and writer (under the pseudonym Paul Mongr´e). He systematized the formula geometrically in 1906. 4Sophus Lie (1842–1899) was a norwegian mathematician best known for his contributions to continuous trans- formation (Lie-) groups. 5Marie Ennemond Camille Jordan (1838–1922) was a french mathematician who initially worked as an engineer and persued his mathematical studies in his spare time. He contributed to analysis, group theory, and topology. 3.2. LINEAR ODES 55

3.2.3 Numerical Computation The problem with using the Taylor series expansion (3.13) for direct numerical computation of the matrix exponential is that in general convergence is poor. In particular for matrices with a large norm, intermediate contributions to the series may become very large – possibly exceeding numerical ranges, and since very large numbers are used, also numerical roundoff errors become a serious issue. It has therefore become common practice to use other schemes to approximate the exponential of a matrix, here we will discuss the scaling and squaring algorithm in combination with Pad´eapproximation and an algorithm for computing the inverse of a matrix.

Scaling and Squaring To make all algorithms converge well, we require that the norm of the matrix is neither too small nor too large but rather in the order of one. We therefore make use of the property σ eA = eA/σ , (3.24) such that A /σ 1. It is then particularly helpful to choose σ =2N with integer N, since then – supposedk onek is≈ given a suitable approximation

N B eA/2 = eA/σ , (3.25) 0 ≈ A it takes only N matrix multiplications to recover e from B0 N 1 N 2 2 A/2 − 2 A/2 − 2 A B1 = B0 = e , B2 = B1 = e , ... ,BN = BN 1 = e . (3.26) − Pad´eapproximation In principle, we could simply use the Taylor series once A /σ is small. However, if speed is relevant, much can be gained by approximating the exponentik k al by Pad´e6 approximation. The basic idea is to approximate any function

∞ n f(x)= cnx (3.27) n=0 X by the ratio of two polynomials – pkm(x) being a polynomial of order k and qkm a polynomial of order m – such that 1 0 1 m m 1 0 1 k k f(x) [q (x)]− p (x) = f(x) q + q x + ... + q x − p + p x + ... + p x − km km − km km km km km km = xk+m+1 . (3.28) O{  }    Provided that computing the inverse of a matrix does not provide significant additional cost (see below), this method is particularly useful, since the Pad´e approximants to the exponential are explicitly known

k (k + m j)!k! p (x) = − (+x)j , km (k + m)!(k j)!j! j=0 X − m (k + m j)!m! q (x) = − ( x)j . (3.29) km (k + m)!(m j)!j! − j=0 X − 6Henri Eug`ene Pad´e(1863–1953) was a french mathematician who contributed approximation techniques using rational functions. 56 CHAPTER 3. ORDINARY DIFFERENTIAL EQUATIONS

In particular, it is obvious that the trivial m = 0 case recovers the Taylor approximation. In practice, it is preferred to choose k = m. In particular, when k = m is large, we note that qmm(X) approximates exp( X/2), whereas p (X) approximates exp(+X/2). Therefore, p (X) may − mm mm serve as a first guess to the inverse of qmm(X) in iterative algorithms (see below). When we apply the matrix inversion to the Pad´eapproximation of the matrix exponential, we note that one is already given a first guess for the matrix inverse of Q = q (x) by M mm 0 ≈ pmm(x), provided m is large enough. This becomes evident when noting that for large m, qmm(X) approximates exp( X/2), whereas p (X) exp(+X/2) (poorly). − mm

Exercise 20 (Pad´eapproximants) Show that

(2m j)!m! 1 lim − = . (3.30) m (2m)!(m j)! 2j →∞ −

Matrix inversion

Multiple algorithms for matrix inversion exist. They may become somewhat sophisticated but have an optimal complexity of N 3 (e.g. LU decomposition). Here, we are seeking for a simpler algorithm that is comparably easyO{ to} implement.

So let Q denote the matrix to be inverted and let M0 be a first guess for its inverse. For the iteration to converge, the initial guess M0 must be close to the true inverse. Defining the initial residual as R = 1 M Q, which implies M Q = 1 R, we use − 0 0 − 1 1 1 1 1 1 1 Q− = Q− (M0− M0)=(Q− M0− )M0 =(M0Q)− M0 =(1 R)− M0 2 3 − = (1 + R + R + R + ...)M0 , (3.31) where we have only used the geometric series – which converges only when the norm of the initial residual is small. When we now define the partial approximation as

n k Mn = R M0 , (3.32) " # Xk=0 1 such that M = Q− , it is not difficult to find recursion relations between the Mn. One of them ∞ is a well-known iteration method for finding the inverse of a matrix Q

M =2M M QM , (3.33) 2n+1 n − n n which converges (quadratically) to the inverse of Q – provided that M0 is sufficiently close to the 1 1 true inverse. Trivially we see that for Mn = Q− we also obtain M2n+1 = Q− from Eq. (3.33).

Exercise 21 (matrix inversion) Show validity of Eq. (3.33). 3.3. NONLINEAR ODES 57

We finally note that the iteration will converge when the initial residual is small, i.e., when one already has a sufficiently good estimate of the inverse. In the general case, convergence is not guaranteed. However, the initial choice

Q† M0 = (3.34) Q 1 Q k k k k∞ with the row and column norms

A 1 = max aij , A = max aij (3.35) k k i | | k k∞ j | | j ! i ! X X will lead to (possibly slow) convergence of the iteration scheme (3.33). This follows from realizing that the initial residual

Q†Q R = 1 M0Q = 1 (3.36) − − Q 1 Q k k k k∞ has only eigenvalues ranging between zero and one.

3.3 Nonlinear ODEs

In essence, nonlinear ODEs can only be solved in special cases, e.g. if certain symmetries are present.

3.3.1 Separable nonlinear ODEs A special case appears when one only has a single first order nonlinear ODE that is separable, i.e., of the form dx x˙ = = f(x)g(t) , (3.37) dt where f(x) is a function of x and g(t) a function of time. Separating the differentials in the derivative yields dx = g(t)dt, (3.38) f(x) such that integrating the r.h.s. from t0 to t yields with the substitution x(t) the relation x(t) dx t = g(t)dt. (3.39) f(x) Zx0 Zt0 In population dynamics, for example, one models the growth in worlds with limited resources (e.g. nutrients, space) by the logistic growth equation dN N(t) = αN(t) 1 , (3.40) dt − C   where α is the initial growth rate and C denotes a carrying capacity. 58 CHAPTER 3. ORDINARY DIFFERENTIAL EQUATIONS

Exercise 22 (Logistic Growth) Find the solution of the logistic growth equation (3.40).

3.3.2 Fixed-Point Analysis When it is not possible to solve a nonlinear ODE exactly – which is the general case – some insight can still be gained when the ODE admits for the calculation of stationary states (which unfortunately is not always possible). So let there be in the ODE system without explicit time- dependence a stationary pointx ¯

x˙ i = fi(x1,...,xn) , where fi(¯x1,..., x¯n)=0 . (3.41)

Then, we can learn about the dynamics around this stationary point by expanding linearly xi(t)= x¯i + yi(t), which yields ∂f y˙ = i y , (3.42) i ∂x j j j X which constitutes a simple linear system

∂f1 ... ∂f1 ∂x1 ∂xn . .. . y˙ = Ay ,A =  . . .  (3.43) ∂fn ... ∂fn  ∂x1 ∂xn    At that has the simple solution y(t)= e y0. Consequently, the eigenvalues of the matrix A can tell a lot about the behaviour of the solution. One can use these to classify the character of the fixed point, for example: When the real part of all eigenvalues is negative, the fixed point is called attractive or • stable. Then, all solutions starting in the region where the linearization is valid will decay towards the fixed point.

When the real part of all eigenvalues is positive, the fixed point is repulsive or unstable. • In this case, all solutions will move away from the fixed point and will thus eventually leave the linearization region.

When the real part of all eigenvalues vanishes, the solution will oscillate in closed orbits • around the fixed point. Therefore, the fixed point is then called elliptic or neutrally stable.

Lotka-Volterra model We illustrate this classification with a simple example for a Lotka7 -Volterra8 model dN dP = N(a bP ) , = P (cN d) , (3.44) dt − dt − 7Alfred James Lotka (1880–1949) was an austrian-american chemist. 8Vito Volterra (1860–1940) was an italian mathematician and physicist with contributions to integral equations. 3.3. NONLINEAR ODES 59

N and P denote populations of prey and predators, respectively. The positive constants can be interpreted as proliferation capability of prey (a), the impact of the predators on the prey population (b), the ability of predators to proliferate in presence of prey (c), and the death of predators in absence of prey (d). It is straightforward to identify the fixed points

N¯1 = 0 , P¯1 =0 ,

N¯2 = d/c, P¯2 = a/b, (3.45) and the linearization N = N¯i + ni and P = P¯i + pi yields the linear systems d n +a 0 n d n 0 db/c n 1 = 1 , 2 = − 2 , (3.46) dt p1 0 d p1 dt p2 +ac/b 0 p2    −        and we see that the first fixed point is actually a saddle point, since in one direction (prey) it predicts unbounded growth and in the other direction (predators) exponential dampening. The eigenvalues for the second fixed point have only imaginary parts, such that it is classified as elliptic. Since the model is particularly simple, we can obtain the trajectories exactly.

Exercise 23 (Lotka-Volterra model) Introduce dimensionless variables

d c b τ = at, α = , n(τ)= N(t) , p(τ)= P (t) (3.47) a d a and then obtain from Eqns. (3.44) an equation of the form f(n(τ),p(τ))=0.

Kepler model Another popular example is the Kepler9 problem of a point particle in a radially symmetric po- tential. Adjusting the coordinate system such that the dynamics only occurs in the x y-plane, the equations of motions read − ∂V (r) ∂V (r) mx¨ + =0 , my¨ + =0 , (3.48) ∂x ∂y and we could directly map it to a system of first-order differential equations with four variables. However, exploiting the radial symmetry of the potential it is favorable to use polar coordinates

x = r cos(φ) , y = r sin(φ) . (3.49)

We can now expressx ¨ andy ¨ in terms of derivatives of r and φ with respect to time, which yields the equations V (r) 0 =r ¨cos(φ) 2˙rφ˙ sin(φ) rφ¨ sin(φ) rφ˙2 cos(φ)+ ′ cos(φ) , (3.50) − − − m V (r) 0 =r ¨sin(φ)+2˙rφ˙ cos(φ)+ rφ¨ cos(φ) rφ˙2 sin(φ)+ ′ sin(φ) . (3.51) − m 9Johannes Kepler (1571–1630) was a german astronomer, philosopher, and theologist. He is best known for the Kepler laws describing the planetary motion. 60 CHAPTER 3. ORDINARY DIFFERENTIAL EQUATIONS

At first glance, this does not seem a great improvement over the initial equations, but it is easy to see that by cleverly combining the above equations, the situation can be greatly simplified. The combination cos(φ)(3.50) + sin(φ)(3.51) yields V (r) r¨ rφ˙2 + ′ =0 , (3.52) − m and when combining sin(φ)(3.50) cos(φ)(3.51) we obtain − 1 d 2˙rφ˙ rφ¨ = r2φ˙ =0 , (3.53) − − −r dt   which implies existence of the conserved quantity

2 r φ˙ = C0 = const , (3.54) which is just related to the conservation of angular momentum. Therefore, we can reduce the system to a single differential equation of second order for r 1 2 V (r) C2 V (r) V (r) r¨ r2φ˙ + ′ =r ¨ 0 + ′ =r ¨ + eff′ =0 , (3.55) − r3 m − r3 m m   where we have introduced the effective potential mC2 V (r)= V (r)+ 0 . (3.56) eff 2r2 Not knowing anything about theoretical mechanics, we would map this to two coupled first order ODEs via y1 = r and y2 =r ˙

1 ∂Veff (y1) y˙1 = y2 , y˙2 = , (3.57) −m ∂y1 and it is immediately obvious that at a fixed point, y2 =r ˙ = 0 and also the derivative of the effective potential should vanish Veff′ (r) = 0. When we insert the gravitational potential γ mC2 V (r)= + 0 , (3.58) eff − r 2r2 we see that the fixed point describes an orbit with a positive radius and furthermore that it is an elliptic fixed point.

Exercise 24 (Fixed Point) Show that Eq. (3.57) in combination with an effective potential like Eq. (3.58) has an elliptic fixed point.

This implies that e.g. the distance between earth and sun will not approach a stationary value but rather hover around a mean, which characterizes the elliptical orbit. The radial equation can be converted into a first order equation by taking energy conservation into account

1 d 1 2 mr¨ + V ′ (t)= mr˙ + V (r) =0 , (3.59) eff r˙ dt 2 eff   which enables one to obtain a separable equation forr ˙. This approach however is rather sophisti- cated, such that commonly one aims for obtaining an equation for dr/dφ. 3.3. NONLINEAR ODES 61

Reservoir Relaxation Model As our last example for nonlinear differential equations, we consider a nonlinear relaxation model describing the evolution of temperature and chemical potential of an always-thermalized finite electronic reservoir. The idea is that the electronic reservoir is coupled to a small system (that itself may be coupled to further reservoirs), through which it experiences a flux of charge IM and energy IE. The total electron number and energy in the reservoir are represented as

N = (ω)f(ω)dω, E = (ω)ωf(ω)dω, (3.60) D D Z Z where temperature T =1/β and chemical potential µ enter implicitly in the Fermi 10 function 1 f(ω)= β(ω µ) . (3.61) e − +1 Total conservation of charge and energy implies that given charge and energy currents into the reservoir ∂N ∂N dβ ∂E ∂E dβ N˙ = J = µ˙ + T,˙ E˙ = J = µ˙ + T˙ (3.62) M ∂µ ∂β dT E ∂µ ∂β dT one can calculate the change of reservoir charge and energy. Here however, we will be interested in the change of reservoir temperature and chemical potential, for which we can obtain a differential equation by solving the above equations forµ ˙ and T˙ . This requires us to solve for the coefficients first ∂N = (ω)f(ω)[1 f(ω)]dωβ = β, ∂µ D − I1 Z ∂N = (ω)f(ω)[1 f(ω)](ω µ)dω = , ∂β − D − − −I2 Z ∂E = (ω)ωf(ω)[1 f(ω)]dωβ =( + µ )β, ∂µ D − I2 I1 Z ∂E = (ω)ωf(ω)[1 f(ω)](ω µ)dω = µ . (3.63) ∂β − D − − −I3 − I2 Z Here, we have defined three integrals

= (ω)f(ω)[1 f(ω)]dω, = (ω)(ω µ)f(ω)[1 f(ω)]dω, I1 D − I2 D − − Z Z = (ω)(ω µ)2f(ω)[1 f(ω)]dω, (3.64) I3 D − − Z which in the wide-band limit (ω)= D can be solved exactly D D π2 D π2 = = DT , =0 , = = DT 3 . (3.65) I1 β I2 I3 3 β3 3

10Enrico Fermi (1901 – 1954) was an italian physicist known for many contributions to statistical mechanics, quantum theory and nuclear theory. He also supervised the construction of the worlds first nuclear reactor. 62 CHAPTER 3. ORDINARY DIFFERENTIAL EQUATIONS

Figure 3.1: Sketch of two reservoirs (boxes left and right) coupled (dashed lines) via an unspec- ified quantum system (middle). Since particle and energy currents are assumed as conserved, the four thermodynamic variables µL/R(t) and TL/R(t) can be reduced to two, e.g. only the dif- ference in chemical potentials and the difference in temperatures.

Exercise 25 (Fermi integrals) Show validity of Eq. (3.65). You might want to use that

2 2 ∞ ln (x) π = . (3.66) (x + 1)2 3 Z0

From these, we obtain a simple relation between currents and thermodynamic parameters

JM 1 0 µ˙ = D 2 . (3.67) J µ π T T˙  E   3   We can directly invert the matrix containing the heat and charge capacities to solve for the first derivatives

µ˙ 1 1 0 JM ˙ = 3 µ 3 1 . (3.68) T D 2 2 JE    − π T π T   Although we have represented this using a matrix, we stress that the resulting ODE is highly nonlinear, since the currents may themselves depend in a highly nonlinear fashion on the reservoir temperature. When we have now two reservoirs (labeled L and R) that are connected by some structure, all parameters become reservoir-specific, see Fig. 3.1. However, any reasonable two- R L terminal setup should realistically obey particle conservation JM = JM = JM and also energy R L − conservation JE = JE = JE. The two currents will depend on all thermodynamic variables, resulting in − 1 1 µ˙ L = JM (µL,µR,TL,TR) , µ˙ R =+ JM (µL,µR,TL,TR) , −DL DR ˙ 1 3 µL(t) 1 3 1 TL = + 2 JM (µL,µR,TL,TR) 2 JE(µL,µR,TL,TR) , DL π TL(t) − DL π TL(t) ˙ 1 3 µR(t) 1 3 1 TR = 2 JM (µL,µR,TL,TR)+ 2 JE(µL,µR,TL,TR) . (3.69) −DR π TR(t) DR π TR(t) These equations can however be further reduced. For example, from the conservation of total particle number N˙ L + N˙ R =0= DLµ˙ L + DRµ˙ R we obtain that e.g. the weighted average of the chemical potentials

DL DR µL(t)+ µR(t)=¯µ =const (3.70) DL + DR DL + DR 3.3. NONLINEAR ODES 63

is a constant of motion – where Dα denotes the charge capacity of reservoir α. This equation can be used to represent the chemical potentials µL(t) and µR(t) using only the bias voltage V (t)= µ (t) µ (t). From the conservation of energy we obtain that L − R D D π2 D π2 D E = L µ2 (t)+ R µ2 (t)+ L T 2(t)+ R T 2 (t) = const 2 L 2 R 3 2 L 3 2 R 2 DL + DR 2 1 DLDR 2 π 2 2 = µ¯ + V (t)+ DLTL(t)+ DRTR(t) 2 2 DL + DR 6   (3.71) is also a constant of motion.

Exercise 26 (Relaxation constants) Using Eq. (3.69) show that (3.70) and (3.71) are con- stants of motion.

We consider now the difference of the squared temperatures

τ (t)= T 2(t) T 2 (t) , (3.72) diff L − R 2 and we note that we can reconstruct Tα(t) from τdiff (t) and V (t). Since Tα(t) > 0, this also allows to uniquely express the temperatures by τdiff (t) and V (t). Now, combining the first two of Eqns. (3.69) yields an equation for the bias voltage

DL + DR V˙ = JM (V,τdiff ) , (3.73) − DLDR and to obtain an equation for τdiff we use the last two equations in (3.69) to find

d 6 D + D D D 6 D + D τ = L R µ¯ + R − L V J (V,τ ) L R J (V,τ ) . (3.74) dt diff π2 D D D D M diff − π2 D D E diff  L R L R  L R For completeness we note that we can express all thermodynamic parameters in terms of the two 2 dynamical variables V and τdiff

DR DL µL(t) =µ ¯ + V (t) , µR(t)=¯µ V (t) , DL + DR − DL + DR 2 6 Erest DR 3 DLDR 2 TL(t) = 2 + τdiff (t) 2 2 V (t) , π DL + DR DL + DR − π (DL + DR)

2 6 Erest DL 3 DLDR 2 TR(t) = 2 τdiff (t) 2 2 V (t) , (3.75) π DL + DR − DL + DR − π (DL + DR) where we have defined the constant E = E DL+DR µ¯2. We note that positivity of the tem- rest − 2 peratures yields constraints V (t) and τdiff (t). A further immediate observation is that the voltage changes only when charges are flowing from one reservoir to another. In contrast, the temperature can also change when either energy or matter or both are flowing. To solve the resulting set of ODEs one has to specify the dependence of JM and JE on the thermodynamic parameters. 64 CHAPTER 3. ORDINARY DIFFERENTIAL EQUATIONS

Figure 3.2: Temporal evolution of the bias volt- 0,1 0,1 age V (t) (black) and the temperature difference T T (red) for different ratios of channel ener- L − R 0,08 0,08 gies ǫ2 = αǫ = ǫ1 (solid, dashed, and dash-dotted, respectively). After an initial evolution phase the 0,06 0,06 system reaches a pseudo-equilibrium that is per- sistent only for ǫ1 = ǫ2 (solid curves). Whenever 0,04 0,04 ε = ε the channel energies are different, the pseudo- 2 ε = 1.5 ε bias voltage [a.u.] 2 ε = 2 ε equilibrium eventually relaxes to thermal equilib- 0,02 2 0,02 rium. During the pseudo-equilibrium phase (in- temperature difference [a.u.] termediate plateaus), part of the initial tempera- 0 0 0,1 1 10 100 1000 10000 1e+05 ture gradient has been converted into a voltage. time t [a.u.]

A useful example is the single-electron transistor that has been treated previously. Here, we have two reservoirs with temperatures TL, TR and chemical potentials µL and µR, respecively. These are connected via a single quantum dot, through which the current reads J = γ [f (ǫ) f (ǫ)] , J = ǫJ , (3.76) M L − R E M where γ encoded details of the coupling strength to the respective reservoirs into a single factor and where ǫ was the on-site energy of the quantum dot. The so-called tight-coupling property JE = ǫJM follows from the fact that a single quantum dot only has a single transition frequency ǫ. This can be compared with a more complicated structure, e.g. two quantum dots connecting the two reservoirs in parallel without direct interaction. Then, the currents have the structure J = γ [f (ǫ ) f (ǫ )] + γ [f (ǫ ) f (ǫ )] , M 1 L 1 − R 1 2 L 2 − R 2 J = ǫ γ [f (ǫ ) f (ǫ )] + ǫ γ [f (ǫ ) f (ǫ )] . (3.77) E 1 1 L 1 − R 1 2 2 L 2 − R 2 These do not exhibit the tight-coupling property J = ǫJ – unless the ǫ are equal. Nevertheless, E 6 M i also here equilibrium V = 0 and τdiff = 0 will evidently lead to vanishing currents and therefore to fixed points. Now, by initializing the system e.g. with a temperature gradient in the absence of a charge gradient it is possible to generate (at least temporally) a voltage, i.e., to extract work. The temporal evolution of such a system is depicted in Fig. 3.2. It is visible that in the tight- coupling limit, it is possible to convert e.g. an initial temperature gradient into work (a persistent voltage). However, it should realistically be kept in mind that the tight-coupling property is never exactly fulfilled and relaxation into final equilibrium may thus be expected. Nevertheless, even these more realistic systems show a distinct timescale separation between initial charge separation and discharging of the system.

3.4 Numerical Solution

There are many numerical solution methods for nonlinear ODEs on the market. They can of course also be applied to linear ODEs with time-dependent or constant coefficients. Here, we will just discuss the simplest implementations that are not optimized for a particular problem. To fix the notation, we will consider systems of first order differential equations (linear or nonlinear with possibly time-dependent coefficients) x˙ = f(x(t),t) , (3.78) 3.4. NUMERICAL SOLUTION 65 where bold symbols denote vectors – or vector-valued functions, and where f can also explicitly depend on time. In this section, we will for convenience drop the explicit bold notation for vectors. When time is discretized in slices of width ∆t, we can also discretize the derivative forward in time dx x x t = ℓ∆t, x(t )= x , (t )= ℓ+1 − ℓ . (3.79) ℓ ℓ ℓ dt ℓ ∆t Thus, the differential equation can directly be converted into an iterative scheme when the known values xℓ and tℓ are inserted in the right-hand side x = x +∆tf(x ,t )+ ∆t2 . (3.80) ℓ+1 ℓ ℓ ℓ O{ } This first order forward-time (also called Euler discretization) scheme is however not very accurate and may behave quite unstable, unless the discretization width is chosen extremely small (which implies a large numerical effort). It should not be recommended. In principle one is free to insert values from the future in the right-hand side, too. If the function f permits, one might solve the resulting equation

xℓ+1 = xℓ +∆tf(xℓ+1,tℓ+1) (3.81) for xℓ+1 and obtain another iterative scheme, provided we can solve the resulting equation for xℓ+1 either directly or via some numerical algorithm. Such schemes are called implicit ones. Finally, we remark that it is easily possible to mix forward and backward time discretizations, which will become relevant for the numerical solution of partial differential equations. We will illustrate this issue with the simple differential equation for exponential decayx ˙ = αx(t) with α> 0. The simple forward-time Euler discretization yields − x = (1 α∆t) x . (3.82) ℓ+1 − ℓ When now the timestep is chosen too large such that α∆t 2 one can easily see that instead ≥ αt of approaching the value zero – as the exact solution x(t) = e− x0 would suggest – the solution explodes as xℓ+1 = 1 α∆t xℓ . A smaller timestep would not exhibit this behaviour. Alterna- tively, we could| use| an| implicit− || scheme| by evaluating the right-hand side of the differential equation in the future x x = α∆tx , which for this simple equation can be solved for ℓ+1 − ℓ − ℓ+1 1 x = x . (3.83) ℓ+1 1+ α∆t ℓ We note that the solution will not explode for large α∆t, such that regarding stability, an implicit scheme is favorable. Finally, the right-hand side can also be evaluated symmetrically xℓ+1 xℓ = α∆t(x + x )/2, which yields − − ℓ+1 ℓ 2 α∆t x = − x (3.84) ℓ+1 2+ α∆t ℓ and thus also predicts stable behaviour.

3.4.1 Runge-Kutta algorithm When it is not possible to invert the function f, we will use a forward-time discretization scheme. We see that by simply computing the derivative at an intermediate point between xℓ and xℓ+1, we can increase the order of the scheme 1 1 k =∆tf(x ,t ) , k =∆tf(x + k ,t + ∆t) , x = x + k + ∆t3 . (3.85) 1 ℓ ℓ 2 ℓ 2 1 ℓ 2 ℓ+1 ℓ 2 O{ } 66 CHAPTER 3. ORDINARY DIFFERENTIAL EQUATIONS

The above scheme is called second order Runge11 -Kutta12 (or midpoint) method. The fact that it is second order in ∆t is visible from the Taylor-expansion

dx d2x ∆t2 x = x + ∆t + + ∆t3 ℓ+1 ℓ dt dt2 2 O{ } t=tℓ t=tℓ 2 2 ∂f( x,t) dx ∆t ∂f(x,t) ∆t 3 = xℓ + f(x ℓ,tℓ)∆t + + + ∆t , (3.86) ∂x dt 2 ∂t 2 O{ } t=tℓ t=tℓ

which coincides with what we would obtain by expanding Eq. (3.85) for small ∆t. In effect, it uses two function calls – one at the initial time tℓ and one at half a time-step tℓ +∆t/2 to propagate the solution by one timestep. The by far most-used algorithm is however a fourth-order Runge-Kutta method

k1 = ∆tf(xℓ,tℓ) , k2 =∆tf(xℓ + k1/2,tℓ +∆t/2) ,

k3 = ∆tf(xℓ + k2/2,tℓ +∆t/2) , k4 =∆tf(xℓ + k3,tℓ +∆t) , k k k k x = x + 1 + 2 + 3 + 4 + ∆t5 . (3.87) ℓ+1 ℓ 6 3 3 6 O{ } It uses four function evaluations to propagate the solution by one timestep but has a much larger accuracy. Nevertheless, it should be used in combination with an adaptive timestep.

3.4.2 Leapfrog integration Many problems of classical mechanics can be cast into the formx ¨ = F (x)/m = f(x) or – equiva- lently –x ˙ = v andv ˙ = f(x). The leapfrog scheme works by evolving position and velocity in an alternating fashion – thus leap-frogging over each other

x_i = x_{i−1} + v_{i−1/2} ∆t ,   v_{i+1/2} = v_{i−1/2} + f(x_i) ∆t .   (3.88)

Here, x_i is the position at time t_i and v_{i+1/2} is the velocity at time t_i + ∆t/2. It is also possible to apply the scheme with velocities and positions evaluated at the same times:

x_{i+1} = x_i + v_i ∆t + (f(x_i)/2) ∆t² ,   v_{i+1} = v_i + [(f(x_i) + f(x_{i+1}))/2] ∆t .   (3.89)

Leapfrog integration is a second-order method but has the nice property that it is symmetric in time (provided a constant timestep is used). This implies that the initial conditions can be recovered from any later state.
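A minimal sketch of the staggered scheme (3.88), assuming a harmonic force f(x) = −x purely for illustration:

    def leapfrog(f, x, v_half, dt, steps):
        # evolve x_i and v_{i+1/2} according to Eq. (3.88)
        for _ in range(steps):
            x = x + v_half * dt           # x_{i-1} -> x_i using v_{i-1/2}
            v_half = v_half + f(x) * dt   # v_{i-1/2} -> v_{i+1/2} using f(x_i)
        return x, v_half

    # harmonic oscillator; the first half-step velocity from a Taylor expansion
    dt, x0, v0 = 0.01, 1.0, 0.0
    v_half = v0 + 0.5 * dt * (-x0)
    print(leapfrog(lambda x: -x, x0, v_half, dt, steps=628))  # roughly one period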

3.4.3 Adaptive stepsize control

To keep a certain promise on the total error, the timestep width in explicit integration schemes can be increased (to proceed faster with less numerical effort) and decreased (to improve accuracy) where applicable. The implementation of an adaptive stepsize requires one to estimate the error of each timestep. Such an estimate can be gained from comparing two small steps of size ∆t/2 with a single large step of size ∆t, as shown in Fig. 3.3.

¹¹Carl David Tolmé Runge (1856–1927) was a German mathematician and physicist.
¹²Martin Wilhelm Kutta (1867–1944) was a German mathematician.

Figure 3.3: Program flow chart of an adaptive stepsize control in explicit integration schemes, such as e.g. fourth-order Runge-Kutta. The overhead for stepsize control in a single loop is about 50% of the computational cost. However, since the increase of the timestep also allows one to proceed much faster through regions where f(x,t) is essentially flat, adaptive stepsize control will quite often lead to a reduced overall runtime. [Flow chart: compute the step; if the error estimate is too large, discard the step and reduce the stepsize; otherwise accept the step and possibly increase the stepsize.]


At first glance, it seems that adaptive stepsize control will triple the computational cost. However, this is not true: since we keep the solution x_{ℓ+1} obtained from the two small timesteps, the actual discretization width of the algorithm is ∆t/2, and the additional overhead due to the stepsize control is therefore only roughly 50% per loop. In addition, the possibility to increase the timestep when the error is negligible (which happens e.g. when f(x,t) is flat) usually leads to a significantly faster solution when stepsize control is applied.
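A minimal sketch of the step-doubling logic of Fig. 3.3, built on the rk4_step function above; the tolerance and the adjustment factors are arbitrary illustrative choices:

    def adaptive_step(f, x, t, dt, tol=1e-8):
        # one adaptive step: compare one large step with two half steps (Fig. 3.3)
        big = rk4_step(f, x, t, dt)
        half = rk4_step(f, x, t, dt / 2)
        small = rk4_step(f, half, t + dt / 2, dt / 2)
        err = abs(small - big)                    # local error estimate
        if err > tol:                             # discard and reduce stepsize
            return adaptive_step(f, x, t, dt / 2, tol)
        if err < tol / 100:                       # error negligible: enlarge next step
            return small, t + dt, 2 * dt
        return small, t + dt, dt                  # accept, keep the stepsize

    x, t, dt = 1.0, 0.0, 0.5
    while t < 10.0:
        x, t, dt = adaptive_step(lambda x, t: -x, x, t, dt)

Note that, as described in the text, the result of the two half steps is kept, so the work invested in the comparison is not wasted.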

3.5 A note on Large Systems

When the ODE system involves many variables – e.g. in a molecular dynamics simulation of many interacting constituents – stability may become a more and more important issue. For illustration, we consider a chain of coupled harmonic oscillators with the equation of motion

0 = m z̈_i + (∂/∂z_i)(k/2)(z_i − z_{i−1})² + (∂/∂z_i)(k/2)(z_{i+1} − z_i)² = m z̈_i − k (z_{i+1} − 2z_i + z_{i−1}) .   (3.90)

We could now map this to a large system of first-order ODEs and try to solve it with some explicit integration scheme. In consequence, the timesteps for all oscillators would have to be chosen very small even if only a single oscillator were strongly displaced. Thus, it is very likely that such explicit methods are not suitable for large systems. In fact, one can notice that the harmonic oscillator chain can also be read as a differential equation for a function z(x,t) depending on two variables. When we discretize space in intervals ∆x, this can be written as

0 = ∂²z(x_i,t)/∂t² − (k∆x)(∆x/m) [z(x_{i+1}) − 2z(x_i) + z(x_{i−1})]/∆x² .   (3.91)

In the limit ∆x → 0, m → 0, and k → ∞ such that the average spring constant k∆x and the mass density m/∆x remain finite – implying that κ = k∆x²/m also remains finite – this becomes the wave equation

0 = ∂²z(x,t)/∂t² − κ ∂²z(x,t)/∂x² .   (3.92)

Exercise 27 (Discretization of second derivative) Show that the second derivative can be approximated by

f″(x) = lim_{∆x→0} [f(x + ∆x) − 2f(x) + f(x − ∆x)]/∆x² .   (3.93)

In fact, the wave equation is a simple example of a partial differential equation, where derivatives with respect to more than a single variable occur. This opens up a whole class of systems, which we discuss in the next chapter.

Chapter 4

Partial Differential Equations

As we have seen before, partial differential equations (PDEs) often arise from ordinary differential equations in the continuum limit. In essence, a partial differential equation can be defined implicitly, e.g. in two dimensions by the equation

F(x, y, g(x,y), ∂g/∂x, ∂g/∂y, ∂²g/∂x², ...) = 0 .   (4.1)

Generally, we can formally write an n-th order PDE as

F(x, g(x), Dg(x), D²g(x), ..., Dⁿg(x)) = 0 .   (4.2)

A partial differential equation should fulfil the following properties:

• The sought-after function should depend on at least two variables.
• Derivatives should occur with respect to all variables.
• The equation should only contain derivatives and the function itself (no integrals).

An important subclass of PDEs are the linear ones of second order, which in the case of a function φ(x,y) we can write as

A ∂²φ/∂x² + B ∂²φ/∂x∂y + C ∂²φ/∂y² + D ∂φ/∂x + E ∂φ/∂y + F φ(x,y) = G(x,y) .   (4.3)

It is a general feature of PDEs that their solutions may exhibit discontinuities in the function or its derivatives. The surfaces through which these discontinuities occur are called characteristics of the PDE, and depending on the prefactors of the second derivatives one can further classify the PDE in two dimensions:

• When B² − 4AC > 0, the PDE is called hyperbolic. An example for such a PDE is the previously discussed equation that arises in the continuum limit for coupled harmonic oscillators, i.e., the wave equation

∂²φ/∂t² − κ ∂²φ/∂x² = 0 .   (4.4)

Here, we obviously have B² − 4AC = 4κ > 0.

• When B² − 4AC = 0, the PDE is called parabolic. An example for such a PDE is the diffusion equation

D ∂²φ/∂x² − ∂φ/∂t = 0 .   (4.5)

Exercise 28 (Diffusion Equation) Consider a chain of coupled compartments with concentrations c_i(t). These obey the rate equation dynamics

ċ_i = R c_{i+1}(t) + R c_{i−1}(t) − 2R c_i(t) .   (4.6)

Show that when R → ∞ and ∆x → 0 such that D = R∆x² remains constant, this implies that the diffusion equation holds for c(x,t), where c(x_i,t) = c_i(t).

• Finally, when B² − 4AC < 0, the PDE is called elliptic. An example for this is the Poisson equation

∂²φ/∂x² + ∂²φ/∂y² = ρ(x,y) ,   (4.7)

where we have A = C = 1 and B = 0, such that B² − 4AC = −4 < 0.

A nice property of PDEs is the uniqueness of their solutions: given that one has found any solution satisfying the PDE and the boundary conditions (treating time and other variables on an equal footing, the term boundary conditions may also include the initial condition), then it is the solution. Therefore, any ansatz for finding solutions of PDEs is valuable.

4.1 Separation Ansatz

A popular approach to PDEs is to reduce their dimension by trying to solve them with a separation ansatz. Here, one assumes that the solution is separable, i.e., that it can be written in product form. For example, when spatial and temporal contributions separate, the solution could be written as Φ(x,t) = φ(x)ψ(t). One simply inserts this ansatz into the PDE and tries to obtain separate differential equations for φ(x) and ψ(t), which due to their reduced dimensionality can be solved much more easily. The general aim is to reduce the PDE to the form

F_1(φ(x), Dφ, D²φ, ...) = F_2(ψ(t), ψ̇, ψ̈, ...) .   (4.8)

Then, since this relation should hold for all x and t, both sides of the equation must equal some constant α. The separate differential equations are then given by

α = F_1(φ(x), Dφ, D²φ, ...) ,   α = F_2(ψ(t), ψ̇, ψ̈, ...) .   (4.9)

4.1.1 Diffusion Equation

As an example, we discuss how to solve the one-dimensional diffusion equation

∂ρ/∂t = D ∂²ρ/∂x²   (4.10)

in the domain x ∈ [−L/2, +L/2] with the initial condition ρ(x,0) = ρ_0(x) and von Neumann¹ boundary conditions

∂ρ/∂x(−L/2, t) = ∂ρ/∂x(+L/2, t) = 0 ,   (4.11)

using the separation ansatz ρ(x,t) = X(x)T(t). The diffusion equation can then be brought into the form

(1/D) T′(t)/T(t) = X″(x)/X(x) ≡ −k² ,   (4.12)

and we obtain the two separate equations

T′(t) + Dk² T(t) = 0 ,   X″(x) + k² X(x) = 0 .   (4.13)

These are now ordinary differential equations. The first is readily solved by T(t) = e^{−Dk²t}, where we have fixed the constant arising from the initial condition to one. Since T(t) and X(x) enter as a product, we are free to transfer this constant to X(x). The general solution of the second equation is X(x) = A e^{+ikx} + B e^{−ikx}, where we have to fix A and B and the unknown k from the initial and boundary conditions. The boundary conditions yield two equations:

A/B = e^{+ikL} ,   A/B = e^{−ikL} ,   (4.14)

which can only be satisfied simultaneously for discrete values k → nπ/L. Consequently, we have B_n = e^{±inπ} A_n = (−1)ⁿ A_n. Therefore, depending on whether n is even or odd, we will have B = +A or B = −A, respectively. Since the discreteness of k also maps to the time-dependent part, we note that the general solution can be written as

ρ(x,t) = Σ_n e^{−D(n²π²/L²)t} A_n [e^{+iπnx/L} + (−1)ⁿ e^{−iπnx/L}]
        = Σ_n e^{−D(n²π²/L²)t} a_n e^{+iπnx/L} [1 + (−1)ⁿ e^{−i2πnx/L}] = Σ_n ρ_n(x,t) ,   (4.15)

where n in principle runs over all integers. We note that in the above Fourier-type series, terms with even n lead to even contributions, ρ_n(x,t) = ρ_n(−x,t), whereas terms with odd n lead to odd contributions, ρ_n(x,t) = −ρ_n(−x,t). The above solution is constructed to obey the boundary conditions but also has to match the initial condition, e.g.

ρ_0(x) = 1/L + (1/L) cos(2πx/L) = Σ_n A_n [e^{+iπnx/L} + (−1)ⁿ e^{−iπnx/L}] .   (4.16)

¹John von Neumann (1903 – 1957) was a Hungarian mathematician and physicist considered one of the founding fathers of computing.

Figure 4.1: Solution of the diffusion equation with (no-flux) von Neumann boundary conditions. The initial unimodal concentration profile quickly decays into a flat distribution. The area under all curves is the same – implying conservation of total mass. [Plot: dimensionless concentration Lρ(x,t) vs. position x [L] for Dt/L² = 0.0, 0.01, 0.1, 1.0.]

We immediately see that this is an even function of x and can therefore conclude that A_{2n+1} = 0. By direct comparison we can identify

A_0 = 1/(2L) ,   A_{+2} = 1/(2L) ,   (4.17)

whereas all other coefficients must vanish. The full solution is then given by

ρ(x,t) = 1/L + (1/L) cos(2πx/L) e^{−D(4π²/L²)t} ,   (4.18)

and it satisfies the boundary condition and the initial condition (which of course must be chosen compatible with each other), see also Fig. 4.1. We see that the initial unimodal shape quickly decays to the constant value 1/L. For general initial conditions we note that by introducing the functions

g_n(x) = e^{+2πinx/L}/√L ,   (4.19)

we directly note their orthonormality relation over the interval [−L/2, +L/2]:

∫_{−L/2}^{+L/2} g_n*(x) g_m(x) dx = δ_{nm} ,   (4.20)

such that we can determine the even Fourier coefficients from general initial conditions as

a_n = (1/L) ∫_{−L/2}^{+L/2} ρ_0(x) e^{−i2πnx/L} dx .   (4.21)
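As a numerical cross-check (a sketch with NumPy; all names and parameter values are illustrative), one can compute the coefficients (4.21) for the profile (4.16) on a grid and reconstruct the decaying solution:

    import numpy as np

    L, D, N, nmax = 1.0, 1.0, 256, 8
    x = np.linspace(-L/2, L/2, N, endpoint=False)
    rho0 = 1/L + np.cos(2*np.pi*x/L)/L                 # initial profile, Eq. (4.16)

    n = np.arange(-nmax, nmax + 1)
    # a_n from Eq. (4.21); the mean over a uniform periodic grid approximates (1/L)*integral
    a = np.array([(rho0 * np.exp(-2j*np.pi*m*x/L)).mean() for m in n])

    def rho(t):
        # sum over decaying Fourier modes, cf. Eq. (4.15)
        modes = a[:, None] * np.exp(2j*np.pi*np.outer(n, x)/L) \
                           * np.exp(-D*(2*np.pi*n[:, None]/L)**2 * t)
        return modes.sum(axis=0).real

    print(np.allclose(rho(0.0), rho0))  # reproduces the initial condition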

Exercise 29 (Orthonormality relation) Show validity of Eq. (4.20).

Had we prescribed not the derivative of the function at the boundaries (which inhibits an unphysical flux out of the considered domain) but the value of the function itself – so-called Dirichlet² boundary conditions – we would also have found a discretizing condition on k.

²Johann Peter Gustav Lejeune Dirichlet (1805 – 1859) was a German mathematician with contributions to number theory and Fourier series.

If, alternatively, we wanted to solve the diffusion equation not in a box but in full space (thereby discarding the von Neumann boundary conditions), we would not obtain a discretizing constraint on k. Then k may assume any continuous value, and we can write the solution as

ρ(x,t) = ∫ A(k) e^{+ikx} e^{−Dk²t} dk .   (4.22)

From the initial condition, we can then obtain A(k).

Exercise 30 (Diffusion in an unbounded domain) Solve the one-dimensional diffusion equation in an unbounded domain with the initial condition

ρ_0(x) = δ(x − x_0) .   (4.23)

4.1.2 Damped Wave Equation

Things become a bit more complicated when higher derivatives occur, e.g. with respect to time. As an example, we consider the one-dimensional wave equation (4.4) supplemented with a damping term:

∂²z/∂t² + γ ∂z/∂t − κ ∂²z/∂x² = 0 .   (4.24)

The function z(x,t) may e.g. describe the transversal displacement of a guitar string, where the damping γ arises from internal friction and collisions with air molecules. Consistent with this picture, we assume Dirichlet boundary conditions z(−L/2,t) = z(+L/2,t) = 0 at all times and an initially triangular and stationary displacement:

z(x,0) = a {[1 − 2x/L] Θ(x) + [1 + 2x/L] Θ(−x)} ,   ∂z/∂t(x,0) = 0 .   (4.25)

A separation ansatz z(x,t) = X(x)T(t) yields the equations

T″(t) + γ T′(t) + κk² T(t) = 0 ,   X″(x) + k² X(x) = 0 .   (4.26)

We first turn our attention to the spatial equation. We have a similar situation as with the diffusion equation before, i.e., the solution is of the form X(x) = A e^{+ikx} + B e^{−ikx}. The Dirichlet boundary conditions now yield

A/B = −e^{+ikL} ,   A/B = −e^{−ikL} ,   (4.27)

where we again have to discretize k = nπ/L, such that B = −A e^{±ikL} = (−1)^{n+1} A. We can directly solve the temporal equation via T(t) = t_+ e^{λ_+ t} + t_− e^{λ_− t}, where

λ_± = −γ/2 ± √[(γ/2)² − κk²] = −γ/2 ± ik√κ + O{γ²} .   (4.28)

Above, we have expanded the solution for small γ, where the string is still expected to vibrate. As before, we can demand T(0) = 1 to transfer the normalization question to X(x), but due to the initially vanishing transverse string velocity we also have T′(0) = 0. This fixes the coefficients

t_− = +λ_+/(λ_+ − λ_−) = 1/2 + iγ/(4k√κ) + O{γ²} ,
t_+ = −λ_−/(λ_+ − λ_−) = 1/2 − iγ/(4k√κ) + O{γ²} .   (4.29)

Since k depends on n, all these variables become n-dependent too, and the temporal solution can be written as T_n(t) = t_+^{(n)} e^{λ_+^{(n)} t} + t_−^{(n)} e^{λ_−^{(n)} t}, such that the general solution becomes

z(x,t) = Σ_n T_n(t) A_n [e^{+inπx/L} + (−1)^{n+1} e^{−inπx/L}]
        = Σ_n [T_{2n}(t) A_{2n} 2i sin(2nπx/L) + T_{2n+1}(t) A_{2n+1} 2 cos((2n+1)πx/L)] ,   (4.30)

which has to match the initial condition in Eq. (4.25). We can calculate the coefficients from the orthonormality of the functions

g_n(x) = √(2/L) sin(nπx/L) for n ∈ {2, 4, 6, ...} ,   g_n(x) = √(2/L) cos(nπx/L) for n ∈ {1, 3, 5, ...}   (4.31)

over the interval [−L/2, +L/2].

Exercise 31 (Function orthonormality) Show that the functions in Eq. (4.31) are indeed orthonormal over the interval [−L/2, +L/2].

Since the initial condition (4.25) can be written as

z(x,0) = Σ_{n=0}^∞ [8a/(π²(1+2n)²)] cos[(2n+1)πx/L] ,   (4.32)

we can directly read off

A_{2n+1} = 4a/(π²(1+2n)²) ,   A_{2n} = 0 ,   (4.33)

which solves the PDE:

z(x,t) = Σ_{n=0}^∞ T_{2n+1}(t) [8a/(π²(2n+1)²)] cos[(2n+1)πx/L] .   (4.34)

The vibration of the guitar string with an initial excitation symmetric in x will thus support half of the eigenmodes of the string, which follow damped oscillations, see Fig. 4.2.
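The series (4.34) converges quickly and can be summed numerically; a minimal sketch (illustrative parameter values, using the small-γ expressions (4.28)–(4.29)):

    import numpy as np

    L, kappa, gamma, a = 1.0, 1.0, 0.01, 1.0

    def z(x, t, nmax=200):
        # sum the damped-string series, Eq. (4.34)
        out = np.zeros_like(x)
        for n in range(nmax):
            k = (2*n + 1) * np.pi / L
            lam_p = -gamma/2 + 1j*k*np.sqrt(kappa)    # Eq. (4.28) to O(gamma^2)
            lam_m = -gamma/2 - 1j*k*np.sqrt(kappa)
            tp, tm = -lam_m/(lam_p - lam_m), +lam_p/(lam_p - lam_m)   # Eq. (4.29)
            Tn = (tp*np.exp(lam_p*t) + tm*np.exp(lam_m*t)).real
            out += 8*a/(np.pi**2 * (2*n + 1)**2) * Tn * np.cos(k*x)
        return out

    x = np.linspace(-0.5, 0.5, 201)
    print(z(x, 0.0).max())   # approximately a: the triangular peak at x = 0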

Figure 4.2: Solution of the wave equation with Dirichlet boundary conditions. The initial triangular displacement profile oscillates and is slowly damped (dashed curves: after 5 periods). The points at which the derivative of the function appears discontinuous form a hyperbolic causality cone, consistent with the classification of the wave equation. Parameters: γ = 0.01κ. [Plot: transversal displacement z(x,t) [a] vs. longitudinal position x [L] for several times.]

4.2 Fourier Transform

We have seen that in finite domains the solutions of PDEs are given by Fourier series. This is connected to the fact that any periodic function – which is equivalent to considering the function only in the interval of one period – can be represented by a Fourier series. The natural consequence is that for unbounded domains one may expect a continuous Fourier representation to hold. As an alternative to the separation ansatz, we may exploit this and use the properties of the Fourier transform to reduce derivatives to algebraic expressions, similar to the use of the Laplace transform with ODEs.

4.2.1 Reaction-Diffusion Equation The separation ansatz does not always work. Consider, for example, the diffusion equation sup- plemented with a non-separable reaction term

∂c ∂2c = D + r(x,t) , (4.35) ∂t ∂x2 where c(x,t) is the concentration of a substance, D is the diffusion constant, and r(x,t) is the rate at which the substance is produced. An alternative interpretation would be that c(x,t) is the temperature, D the heat conductivity, and r(x,t) could describe heat sources and sinks. Since we consider the spatial domain as unbounded, we perform a continous FT 1 1 C(ω,t)= c(x,t)e+iωxdx, R(ω,t)= r(x,t)e+iωxdx, (4.36) √2π √2π Z Z which transfers the reaction-diffusion equation to ∂C = D( iω)2C(ω,t)+ R(ω,t) . (4.37) ∂t − Here, we have used property (2.4) of the FT. Since this equation does no longer contain a derivative with respect to ω, it is actually an ODE with ω as a parameter, which we can hopefully solve with a suitable method. For example, considering a Lorentzian and time-independent profile of the reaction rate, we can easily compute the FT from adapting Eq. (2.8)

r(x,t) = r(x) = R_0 δ/(x² + δ²)   ⇒   R(ω,t) = R(ω) = (R_0/√(2π)) e^{−δ|ω|} .   (4.38)

When the concentration of the substance initially vanishes everywhere, c(x,0) = 0, the same holds true for its FT, C(ω,0) = 0, and we can write the solution of the ODE as

C(ω,t) = [(1 − e^{−Dω²t})/ω²] (R_0/(√(2π) D)) e^{−δ|ω|} .   (4.39)

The solution is then represented via the inverse FT integral

c(x,t) = (1/(2π)) (R_0/D) ∫ [(1 − e^{−Dω²t})/ω²] e^{−δ|ω|} e^{−iωx} dω .   (4.40)

Since in this case we already know the inverse FT of parts of the solution, it may be helpful to employ the convolution theorem.

Exercise 32 (Bacterial Colony) Assume that bacteria follow undirected motion but may in addition proliferate. Solve the resulting equation for the bacterial density

∂n/∂t = D ∂²n/∂x² + α n(x,t)   (4.41)

with the initial condition n(x,0) = δ(x − x_0) in an unbounded spatial domain.

4.2.2 Unbounded Wave Equation

As a second example for the use of the Fourier transform, we consider again the wave equation (4.4), without damping and with initial conditions ∂z/∂t(x,0) = 0 and z(x,0) = a δ(x). Such an initial excitation could e.g. correspond to an infinitely long steel bar that is locally excited with a hammer at time t = 0. A simple FT Z(ω,t) = F_x(z(x,t)) converts the equation into an ODE,

Z̈ + κω² Z(ω,t) = 0 ,   (4.42)

with the initial conditions Z(ω,0) = a/√(2π) and Ż(ω,0) = 0. Therefore, we have reduced the problem to a harmonic oscillator, which is solved by

Z(ω,t) = (a/√(2π)) cos(√κ ωt)   (4.43)

in accordance with the initial conditions. The inverse FT now yields

z(x,t) = (1/√(2π)) ∫ Z(ω,t) e^{−iωx} dω = (a/2) [δ(x − √κ t) + δ(x + √κ t)] ,   (4.44)

and we see that the initial pulse propagates with velocity c = √κ along the steel bar.

4.2.3 Fokker-Planck equation

We have seen that undirected motion leads in the continuum representation to the diffusion equation. When, for whatever reason, the rates at which particles move in a particular direction depend on time and space (varying motility) and possibly also on the direction (drift of the medium), we can describe the flow between the compartments with the rate equation

Ṗ_i = R_{i−1}(t) P_{i−1} + L_{i+1}(t) P_{i+1} − [R_i(t) + L_i(t)] P_i .   (4.45)

Here, R_i(t) describes the rate to move right from compartment i to compartment i+1, and L_i(t) the rate to move left from compartment i to compartment i−1. It is straightforward to see that the total probability is conserved, Σ_i Ṗ_i = 0. Furthermore, we can write the rate equation as

Ṗ_i = {[R_{i−1}(t) + L_{i−1}(t)] P_{i−1} + [R_{i+1}(t) + L_{i+1}(t)] P_{i+1} − 2[R_i(t) + L_i(t)] P_i}
     + {L_i(t) P_i − L_{i−1}(t) P_{i−1}} − {R_{i+1}(t) P_{i+1} − R_i(t) P_i} .   (4.46)

Here, we see that the first line appears like a second derivative of a function [r(x,t) + l(x,t)] P(x,t), whereas the second line contains discretized versions of the first derivatives of l(x,t)P(x,t) and r(x,t)P(x,t), respectively. This motivates us to consider the Fokker³-Planck⁴ equation

∂ρ/∂t = ∂/∂x [v(x,t) ρ(x,t)] + ∂²/∂x² [d(x,t) ρ(x,t)] ,   (4.47)

where v(x,t) describes the drift and d(x,t) the undirected part of the motility. We supplement the Fokker-Planck equation with a self-amplifying term – corresponding e.g. to proliferation of a bacterial species. When we assume that the drift v and the motility D are constant, the equation becomes

∂ρ/∂t = D ∂²ρ/∂x² + v ∂ρ/∂x + α ρ(x,t) .   (4.48)

This equation could e.g. model a pipe with flow velocity v, in which at time t = 0 a bacterial population is placed:

ρ(x,0) = δ(x − x_0) .   (4.49)

After the Fourier transform, the equation assumes the form

∂ρ(ω,t)/∂t = [α − iωv − ω²D] ρ(ω,t) ,   (4.50)

and the initial condition transfers to ρ_0(ω) = e^{iωx_0}/√(2π). The equation in frequency space is therefore solved by

ρ(ω,t) = e^{(α − iωv − ω²D)t} e^{iωx_0}/√(2π) ,   (4.51)

and we can calculate the solution from the inverse FT:

ρ(x,t) = (1/(2π)) ∫ e^{(α − iωv − ω²D)t} e^{−iω(x − x_0)} dω = [e^{αt}/(√(2π)√(2Dt))] e^{−(x − x_0 + vt)²/(4Dt)} ,   (4.52)

which is just an inflating Gaussian that propagates with mean µ = x_0 − vt and at the same time spreads. From the normalization of the Gaussian we conclude that the total bacterial population

N(t) = ∫ ρ(x,t) dx = e^{αt}   (4.53)

indeed simply follows an exponential growth law.

³Adriaan Daniël Fokker (1887 – 1972) was a Dutch physicist and musician. He derived the equation in 1913 but also contributed to special and general relativity.
⁴Max Karl Ernst Ludwig Planck (1858 – 1947) was a German theoretical physicist with numerous contributions. His fame rests on his original contributions to quantum theory.

4.3 Coordinate systems

Due to symmetries present in many problems, the use of adapted coordinate systems may be particularly helpful. The differential operators occurring in PDEs strongly depend on the coordinate system used. Therefore, we state here the most important differential operators in cylindrical and spherical coordinates, too.

4.3.1 Cylindrical Coordinates

In cylindrical coordinates (equivalent to 2d polar coordinates), the transformation equations read

x = ρ cos φ ,   y = ρ sin φ ,   z = h ,   (4.54)

such that only two coordinates are actually transformed. The inverse transformations become

ρ = √(x² + y²) ,   φ = arctan(y/x) ,   h = z .   (4.55)

The chain rule implies for the derivatives

∂/∂x = (∂ρ/∂x) ∂/∂ρ + (∂φ/∂x) ∂/∂φ + (∂h/∂x) ∂/∂h = cos φ ∂/∂ρ − (sin φ/ρ) ∂/∂φ ,
∂/∂y = sin φ ∂/∂ρ + (cos φ/ρ) ∂/∂φ ,
∂/∂z = ∂/∂h ,   (4.56)

and with the unit vectors

e_ρ = (cos φ, sin φ, 0)ᵀ ,   e_φ = (−sin φ, cos φ, 0)ᵀ ,   e_h = (0, 0, 1)ᵀ   (4.57)

we can write the gradient of a function as

∇ = e_ρ ∂/∂ρ + e_φ (1/ρ) ∂/∂φ + e_h ∂/∂h .   (4.58)

For the Laplace operator we thus obtain

∆ = ∂²_ρ + (1/ρ) ∂_ρ + (1/ρ²) ∂²_φ + ∂²_h .   (4.59)

Exercise 33 (Laplacian in cylindrical coordinates) Show that Eq. (4.59) indeed represents the Laplace operator in cylindrical coordinates.

The volume element for radial length dρ, angular length ρ dφ, and height dh becomes

dV = ρ dρ dφ dh .   (4.60)

4.3.2 Spherical Coordinates

From the transformation equations

x = r sin θ cos φ ,   y = r sin θ sin φ ,   z = r cos θ   (4.61)

we can readily deduce the inverse (keeping in mind that the spherical coordinates have to obey r > 0, 0 < θ < π, and 0 < φ < 2π):

r = √(x² + y² + z²) ,   θ = arctan(√(x² + y²)/z) ,   φ = arctan(y/x) .   (4.62)

The chain rule then implies that the differential operators become

∂/∂x = sin θ cos φ ∂/∂r + (cos θ cos φ/r) ∂/∂θ − (sin φ/(r sin θ)) ∂/∂φ ,
∂/∂y = sin θ sin φ ∂/∂r + (cos θ sin φ/r) ∂/∂θ + (cos φ/(r sin θ)) ∂/∂φ ,
∂/∂z = cos θ ∂/∂r − (sin θ/r) ∂/∂θ .   (4.63)

Introducing the unit vectors

e_r = (sin θ cos φ, sin θ sin φ, cos θ)ᵀ ,   e_θ = (cos θ cos φ, cos θ sin φ, −sin θ)ᵀ ,   e_φ = (−sin φ, cos φ, 0)ᵀ ,   (4.64)

we can therefore also write the gradient operator as

∇ = (∂/∂x, ∂/∂y, ∂/∂z)ᵀ = e_r ∂/∂r + e_θ (1/r) ∂/∂θ + e_φ (1/(r sin θ)) ∂/∂φ .   (4.65)

We have written the above definition of the gradient assuming that the derivative always acts to the right. However, when higher-order derivatives are to be expressed in spherical coordinates, it is important to note that the prefactors also depend on the coordinates themselves. Consider, for example, the Laplacian differential operator

∆ = ∂²/∂x² + ∂²/∂y² + ∂²/∂z² = (1/r²) ∂/∂r (r² ∂/∂r) + (1/(r² sin θ)) ∂/∂θ (sin θ ∂/∂θ) + (1/(r² sin²θ)) ∂²/∂φ² .   (4.66)

Here, the Laplace operator cannot simply be obtained by calculating the scalar product of two gradient operators (sometimes one writes ∆ = ∇·∇), since the basis vectors depend on the coordinates. Using simple trigonometric relations, it is actually not difficult to show that the above representation of the Laplace operator follows from the chain rule.

For completeness, we also present the volume element in spherical coordinates. It can be constructed from a segment in the ranges r ∈ [r, r+dr], θ ∈ [θ, θ+dθ], and φ ∈ [φ, φ+dφ]. It has radial length dr, height r dθ in z-direction, and width r sin θ dφ in the x-y plane, such that we have

dV = dx dy dz = r² sin θ dr dθ dφ ,   (4.67)

which has to be used in three-dimensional integrals.

4.3.3 Schrödinger Equation

As an example for the usefulness of adapted coordinate systems, we study the Schrödinger⁵ equation

iℏ ∂Ψ(r,t)/∂t = [−(ℏ²/2m) ∆ + V(r,t)] Ψ(r,t) ,   (4.68)

where Ψ denotes the wave function, ℏ the Planck constant, m the mass of the particle, and V a potential. For a time-independent potential V(r,t) = V(r), the time evolution can be separated with the ansatz Ψ(r,t) = F(r)T(t), which yields

iℏ T′(t)/T(t) = (1/F(r)) [−(ℏ²/2m) ∆ + V(r)] F(r) = E   (4.69)

for a constant E. Whereas the time-dependent part yields the trivial evolution T(t) = e^{−iEt/ℏ}, the spatial equation corresponds to an eigenvalue equation for a differential operator. When the potential is radially symmetric, V(r) = V(r), the spatial equation becomes

[−(ℏ²/2m) ((1/r²) ∂/∂r (r² ∂/∂r) + (1/(r² sin θ)) ∂/∂θ (sin θ ∂/∂θ) + (1/(r² sin²θ)) ∂²/∂φ²) + V(r)] F(r,θ,φ) = E F(r,θ,φ) .   (4.70)

Above, we see that a further separation ansatz F(r,θ,φ) = R(r) Y(θ,φ) yields the equations

0 = [−(ℏ²/2m) (1/r²) ∂/∂r (r² ∂/∂r) + V(r) − E + ℏ²λ/(2mr²)] R(r) ,
0 = [(1/sin θ) ∂/∂θ (sin θ ∂/∂θ) + (1/sin²θ) ∂²/∂φ² + λ] Y(θ,φ) ,   (4.71)

and the radial part can now be treated as a conventional ODE. Here, the constant λ was introduced by the separation ansatz and is related to the total angular momentum. Finally, we can also treat the angular part with a further separation ansatz Y(θ,φ) = Θ(θ)Φ(φ). This eventually yields

0 = [∂²/∂φ² + m] Φ(φ) ,
0 = [sin θ ∂/∂θ (sin θ ∂/∂θ) + λ sin²θ − m] Θ(θ) ,   (4.72)

where m is an additional constant arising from the separation ansatz; it is related to the z-component of the angular momentum. Therefore, recognizing the spherical symmetry has enabled us to map a 4-dimensional PDE to the solution of four independent ODEs. As is discussed in many quantum mechanics textbooks, these can be solved with some effort.

⁵Erwin Rudolf Josef Alexander Schrödinger (1887 – 1961) was an Austrian physicist known as one of the founders of quantum mechanics.

4.4 Green’s functions

We have already seen that the solution of PDEs becomes even more complicated once inhomogeneous terms appear. In a nutshell, the method of Green's⁶ functions allows one to construct the solution of linear inhomogeneous PDEs from the solution of particularly simple inhomogeneous PDEs. Let z be a point in the domain we consider and let D_z be a linear differential operator that may contain functions of z and derivatives with respect to all components of z. For example, for the diffusion equation D_{x,t} ρ(x,t) = 0 we would have D_{x,t} = ∂/∂t − D ∂²/∂x². To such a linear differential operator we can associate a Green's function as follows.

Box 16 (Green's function) For a linear differential operator D_z, the Green's function G(z,s) is defined as the solution of the equation

D_z G(z,s) = δ(z − s) .   (4.73)

Then, the inhomogeneous problem

D_z ρ(z) = φ(z)   (4.74)

is solved by

ρ(z) = ∫ G(z,s) φ(s) ds .   (4.75)

It is then straightforward to verify this claim:

D_z ρ(z) = D_z ∫ G(z,s) φ(s) ds = ∫ [D_z G(z,s)] φ(s) ds = ∫ δ(z − s) φ(s) ds = φ(z) .   (4.76)

We note that, given the Green's function, it may of course still be difficult or impossible to solve the integral for arbitrary source terms φ(s). Nevertheless, it provides a straightforward recipe to

obtain the solution and has proven useful in many cases. Many variants of the Green's function formalism exist, tailored to specific problems. Here, we will only discuss a few representatives.

⁶George Green (1793 – 1841) was a British mill owner and mathematician. With the inheritance from his father he could pursue his mathematical interests in his forties and nevertheless contributed strongly.

4.4.1 Poisson Equation

Considering the long-term limit of reaction-diffusion equations or electrostatic problems, one arrives at the Poisson⁷ equation, which e.g. in three dimensions assumes the form

(∂²/∂x² + ∂²/∂y² + ∂²/∂z²) φ(x,y,z) = 4π ρ(x,y,z) .   (4.77)

The Green's function must accordingly obey

(∂²/∂x² + ∂²/∂y² + ∂²/∂z²) G(x,y,z,x′,y′,z′) = δ(x − x′) δ(y − y′) δ(z − z′) ,   (4.78)

and we perform a three-dimensional FT, yielding

G(ω, r′) = −e^{+iω·r′}/((2π)^{3/2} ω²) .   (4.79)

The actual solution is obtained via the inverse FT:

G(r, r′) = −(1/(2π)³) ∫ dω_x dω_y dω_z e^{−iω·(r − r′)}/(ω_x² + ω_y² + ω_z²) = G(r − r′) .   (4.80)

To perform the inverse FT it is useful to switch to spherical coordinates, which yields

G(r − r′) = −(1/4π) 1/|r − r′| .   (4.81)

Exercise 34 (Poisson Green's function) Show that Eq. (4.81) is valid by calculating the inverse FT. You might want to use that ∫_0^∞ sin(x)/x dx = π/2.

Consequently, for an arbitrary source distribution, the potential is determined by

φ(x,y,z) = −∫ ρ(x′,y′,z′)/√[(x − x′)² + (y − y′)² + (z − z′)²] dx′ dy′ dz′ .   (4.82)

We note that the Green's function depends on possible boundary conditions. Fortunately, within electrostatics it is sometimes possible to enforce boundary conditions using the method of image charges.

⁷Siméon Denis Poisson (1781 – 1840) was a French mathematician, geodesist and physicist.

Exercise 35 (Spherically-symmetric charge distributions) For a charge density that is spherically symmetric (i.e., ρ(r′,θ′,φ′) = ρ(r′)) and localized (i.e., ρ(r′ ≥ R) = 0), show that outside the charge distribution the potential resembles that of a point charge, i.e.,

φ(r) = −Q/r for r ≥ R ,   Q = 4π ∫_0^R ρ(r′) r′² dr′ .   (4.83)

4.4.2 Wave Equation

The wave equation can also be written with the d'Alembert⁸ operator:

✷φ(r,t) = [(1/c²) ∂²/∂t² − ∆] φ(r,t) ,   (4.84)

where c denotes the wave velocity (when we derive the wave equation from the Maxwell equations, c indeed becomes the velocity of light). An inhomogeneous wave equation

✷φ(r,t) = f(r,t)   (4.85)

can arise when e.g. a chain of oscillators is driven or when electromagnetic signals are sent. To find the Green's function, we have to solve

✷G(r,t) = δ(t − t_0) δ(x − x_0) δ(y − y_0) δ(z − z_0) .   (4.86)

We note that the direction of time is not inherent in the above equation. Fundamental principles such as causality will therefore have to be enforced to find the correct Green's function. Since the ✷-operator is translationally invariant, we can set t_0 = 0 and r_0 = 0. A four-fold FT yields the equation

[−ω²/c² + k²] G(k,ω) = (1/2π)² ,   (4.87)

and to obtain the original Green's function we have to perform the inverse FT (later on, we can re-insert r → r − r_0 and t → t − t_0):

G(r,t) = (1/2π)⁴ ∫ dω ∫ d³k e^{−i(k·r + ωt)}/(k² − ω²/c²) .   (4.88)

We first consider the temporal integral

I(k,t) = ∫ e^{−iωt}/(k² − ω²/c²) dω .   (4.89)

Here, we have the situation that two poles ω* = ±ck lie directly on the real axis, such that one may question the mere existence of the integral. Closer inspection of the integrand however yields

⁸Jean-Baptiste le Rond d'Alembert (1717 – 1783) was a French mathematician, physicist and philosopher, also known for his co-authorship of the Encyclopédie.

Figure 4.3: Sketch of the integration contour in Eq. (4.89) in the complex plane. To ensure causality, the poles are shifted by an infinitesimal amount into the lower complex plane (red dots), such that they contribute only for positive times, where the contour has to be closed using the lower arc. For negative times, the contour is closed in the upper half plane, and the Green's function correspondingly vanishes.

that the imaginary and real parts separately have positive and negative contributions at the poles that may cancel. Therefore, we consider shifting the integration contour by ω → ω ± iε. At this point, we introduce causality as a boundary condition on the Green's function: for t < 0, the Green's function should vanish, since a source term should only influence the solution in the future but not in the past. For t < 0, the contour will be closed in the upper half plane, and for t > 0 the lower half plane will be the correct choice. Therefore, to obtain the so-called retarded Green's function, we consider ω → ω + iε, which moves the poles to ω_{1/2} = −iε ± ck in the lower half plane, see also Fig. 4.3. For t < 0, the integral correspondingly vanishes, whereas for positive times we have to take the two poles into account:

I(k,t) = −Θ(t) 2πi lim_{ε→0} [Res_{ω=ck−iε} e^{−iωt} c²/(c²k² − (ω+iε)²) + Res_{ω=−ck−iε} e^{−iωt} c²/(c²k² − (ω+iε)²)]
       = −2πi Θ(t) c² [e^{+ickt}/(2ck) − e^{−ickt}/(2ck)]
       = 2π Θ(t) (c/k) sin(ckt) .   (4.90)

The retarded Green's function therefore becomes

G(r,t) = Θ(t) (1/2π)³ ∫ d³k (c/k) sin(ckt) e^{−ik·r}
       = Θ(t) (1/2π)² ∫_0^∞ dk ∫_0^π dθ sin θ · ck sin(ckt) e^{−ikr cos θ} ,   (4.91)

where we have introduced spherical coordinates and already performed the trivial azimuthal integration. Next, we perform the remaining angular integral,

∫_0^π sin θ e^{−ikr cos θ} dθ = ∫_{−1}^{+1} e^{−ikrx} dx = 2 sin(kr)/(kr) ,   (4.92)

such that the Green's function becomes

G(r,t) = 2 (1/2π)² Θ(t) (c/r) ∫_0^∞ sin(ckt) sin(kr) dk = (1/2π)² Θ(t) (c/r) ∫_{−∞}^{+∞} sin(ckt) sin(kr) dk
       = (1/2π)² Θ(t) (c/r) (−1/4) ∫ [e^{+ickt} − e^{−ickt}] [e^{+ikr} − e^{−ikr}] dk
       = (1/2π)² Θ(t) (c/r) (−1/4) · 4π [δ(r + ct) − δ(r − ct)]
       = (Θ(t)/4π) (c/r) δ(r − ct) ,   (4.93)

where in the last step we have used that the first δ-function cannot contribute for ct > 0. From translational invariance, we conclude

G(r,t,r_0,t_0) = (Θ(t − t_0)/4π) (c/|r − r_0|) δ(|r − r_0| − c(t − t_0)) .   (4.94)

We note that this Green's function obeys causality as the boundary condition. We have not specified any initial condition for the field, implicitly assuming that the field initially vanishes. The full solution of the inhomogeneous wave equation

✷φ(r,t) = f(r,t)   (4.95)

therefore becomes

φ(r,t) = ∫ (Θ(t − t′)/4π) (c/|r − r′|) δ(|r − r′| − c(t − t′)) f(r′,t′) d³r′ dt′ .   (4.96)

For example, for a point-like sender source, f(r,t) = f(t) δ³(r − r_0), we obtain

φ(r,t) = ∫ (Θ(t − t′)/4π) (c/|r − r_0|) δ(|r − r_0| − c(t − t′)) f(t′) dt′
       = (1/4π) (1/|r − r_0|) ∫_0^∞ δ(|r − r_0| − τ) f(t − τ/c) dτ
       = (1/4π) f(t − |r − r_0|/c)/|r − r_0| .   (4.97)

Correspondingly, we see that the field at point r can only be influenced by the sender signal that was emitted at time t − |r − r_0|/c, which again motivates the name retarded Green's function. Furthermore, we see that the field is suppressed as the distance between sender and receiver is increased.

4.4.3 Time-dependent Reaction-Diffusion Equation

To solve the time-dependent reaction-diffusion equation

∂ρ/∂t = D ∂²ρ/∂x² + r(x,t)   (4.98)

for arbitrary time- and space-dependent reaction rates with the Green's function formalism, we will have to specify an initial condition ρ(x,0) = ρ_0(x). To find the Green's function, defined by

[∂/∂t − D ∂²/∂x²] G(x,t,x′,t′) = δ(x − x′) δ(t − t′) ,   (4.99)

we perform a combined Fourier transformation in space and Laplace transformation in time:

G(ω,s,x′,t′) = (1/√(2π)) ∫ dx ∫_0^∞ dt G(x,t,x′,t′) e^{+iωx} e^{−st} .   (4.100)

The Green's function has to match the initial condition, which becomes G_0(ω) = (1/√(2π)) ∫ dx ρ_0(x) e^{+iωx}. Therefore, the reaction-diffusion equation becomes

s G(ω,s,x′,t′) − G_0(ω) = (−iω)² D G(ω,s,x′,t′) + (1/√(2π)) e^{+iωx′ − st′} Θ(t′) ,   (4.101)

which is solved by

G(ω,s,x′,t′) = G_0(ω)/(ω²D + s) + (Θ(t′)/√(2π)) e^{+iωx′ − st′}/(ω²D + s) .   (4.102)

To obtain the actual Green's function, we have to perform an inverse Laplace transform,

G(ω,t,x′,t′) = e^{−Dtω²} G_0(ω) + (Θ(t′)Θ(t − t′)/√(2π)) e^{−D(t − t′)ω² + iωx′} ,   (4.103)

and an inverse Fourier transform:

G(x,t,x′,t′) = (1/√(2π)) ∫ G(ω,t,x′,t′) e^{−iωx} dω .   (4.104)

For example, when the concentration vanishes initially, G0(ω) = 0, the Green’s function becomes a truncated Gaussian

G(x,t,x′,t′) = [Θ(t′)Θ(t − t′)/(√(2π)√(2D(t − t′)))] e^{−(x − x′)²/[4D(t − t′)]} .   (4.105)

The general solution for vanishing initial conditions is then obtained from the Green's function via

ρ(x,t) = ∫ G(x,t,x′,t′) r(x′,t′) dx′ dt′ = ∫ dx′ ∫_0^t dt′ [r(x′,t′)/(√(2π)√(2D(t − t′)))] e^{−(x − x′)²/[4D(t − t′)]} ,   (4.106)

and we see that the solution depends on the reaction rate at all previous times, weighted by a Gaussian in space and time. When we consider for example the rate r(x′,t′) = δ(x′ − x_0) Θ(t′ − t_0), we obtain

ρ(x,t) = Θ(t − t_0) ∫_{t_0}^t [1/(√(2π)√(2D(t − t′)))] e^{−(x − x_0)²/[4D(t − t′)]} dt′ .   (4.107)

Alternatively, we could be interested in the metabolic secretion of a bacterial species that evolves according to our example for the Fokker-Planck equation. Then, the production rate of metabolic outputs would be proportional to the bacterial concentration (4.52), and the concentration would have to be determined from the integral

ρ(x,t) = ∫ dx′ ∫_0^t dt′ [e^{αt′}/(√(2π)√(2D_B t′))] e^{−(x′ − x_0 + vt′)²/(4D_B t′)} [1/(√(2π)√(2D(t − t′)))] e^{−(x − x′)²/[4D(t − t′)]} ,   (4.108)

where D_B denotes the motility of the bacteria.
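As an illustration, integrals like Eq. (4.107) are easily evaluated by numerical quadrature; a minimal sketch with SciPy (all parameter values are arbitrary illustrative choices):

    import numpy as np
    from scipy.integrate import quad

    D, x0, t0 = 1.0, 0.0, 0.0

    def rho(x, t):
        # concentration from a point source switched on at t0, Eq. (4.107);
        # substituting tau = t - t' leaves an integrable 1/sqrt(tau) singularity
        if t <= t0:
            return 0.0
        integrand = lambda tau: np.exp(-(x - x0)**2 / (4*D*tau)) \
                                / (np.sqrt(2*np.pi) * np.sqrt(2*D*tau))
        val, _ = quad(integrand, 0.0, t - t0)
        return val

    print(rho(0.5, 1.0))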

4.5 Nonlinear Equations

With a few exceptions, the study of nonlinear PDEs is limited to numerical approaches. A well-established approach is to look for travelling-wave solutions, which may arise either exactly or asymptotically. To be precise, a travelling-wave solution has in one spatial dimension the form

u(x,t) = u(x − vt) = u(z) ,   (4.109)

where v is an a priori unknown wavespeed.

Exercise 36 (Travelling Waves) Show that a transformation of the form (4.109) transforms a differential operator according to

D_{x,t} = Σ_{nm} a_{nm} ∂_x^n ∂_t^m   →   D_z = Σ_{nm} a_{nm} (−v)^m ∂_z^{n+m} .   (4.110)

4.5.1 Fisher-Kolmogorov Equation

A prominent example of an equation exhibiting travelling waves is the Fisher⁹-Kolmogorov¹⁰ equation

∂u/∂t = D ∂²u/∂x² + α u(x,t) [C − u(x,t)] .   (4.111)

Here, D represents the diffusion constant as usual, α corresponds to a growth rate, and C denotes a (homogeneous) carrying capacity. It is straightforward to bring the equation into dimensionless form by introducing τ = αt, y = √(α/D) x, and n = u/C, which yields

∂n/∂τ = ∂²n/∂y² + n(1 − n) .   (4.112)

We note that the homogeneous stationary solutions are n(y,τ) = 0 and n(y,τ) = 1. A travelling-wave ansatz of the form n(y,τ) = n(y − vτ) then transforms the PDE into an ODE of the form

n″(z) + v n′(z) + n(z)[1 − n(z)] = 0 .   (4.113)

Without loss of generality we constrain ourselves to positive wavespeeds,

v > 0 .   (4.114)

When we find a solution to the ODE with constant v > 0 that is also stable in the presence of perturbations, the travelling waves for v and −v are both solutions. We can qualitatively study the equivalent system of coupled first-order ODEs,

n′(z) = m(z) ,   m′(z) = −v m(z) − n(z)[1 − n(z)] ,   (4.115)

near its fixed points m̄_1 = 0, n̄_1 = 0 and m̄_2 = 0, n̄_2 = 1 with a fixed-point stability analysis. Near the fixed points, we obtain the linearized systems

(∆ṅ_1, ∆ṁ_1)ᵀ = ( 0 +1 ; −1 −v ) (∆n_1, ∆m_1)ᵀ ,   (∆ṅ_2, ∆ṁ_2)ᵀ = ( 0 +1 ; +1 −v ) (∆n_2, ∆m_2)ᵀ ,   (4.116)

where the semicolon separates the matrix rows.

⁹Sir Ronald Aylmer Fisher (1890 – 1962) was a British statistician and evolutionary biologist, who contributed to the foundations of population genetics.
¹⁰Andrey Nikolaevich Kolmogorov (1903 – 1987) was a Soviet mathematician with contributions to multiple fields.

Figure 4.4: Phase plane for Eq. (4.116). For v ≥ 2, the fixed point at the origin is stable, whereas the other fixed point is always a saddle node. To obtain a trajectory connecting the fixed points that is completely contained in the quadrant n(z) > 0 and n′(z) < 0, it is therefore necessary that v ≥ 2.

By studying the eigenvalues of these matrices, we can infer the qualitative behaviour of the solution near the fixed points. At the origin, we obtain the eigenvalues

λ_1^± = −v/2 ± √[(v/2)² − 1] ,   (4.117)

which are both real and negative for v ≥ 2. For 0 < v < 2 they acquire an imaginary part, such that trajectories spiral into the origin and n(z) would also assume negative – i.e., unphysical – values. At the second fixed point, the eigenvalues read

λ_2^± = −v/2 ± √[(v/2)² + 1] .   (4.118)

Since λ_2^+ > 0 and λ_2^− < 0 for all v > 0, the second fixed point is a saddle node. To obtain trajectories travelling from the second fixed point to the first in the physically relevant quadrant specified by m(z) = n′(z) < 0 and n(z) > 0, it is therefore necessary that v ≥ 2, see Fig. 4.4. This means that a corresponding travelling-wave solution should have a (dimensionless) minimum wavespeed v ≥ 2. In the original frame this implies the condition

v_wave ≥ 2√(αD) ,   (4.119)

such that the minimum wave speed is given by twice the square root of the product of proliferation rate and diffusion constant. This raises the question which wavespeed is actually obtained for a given initial condition. At the front of the travelling wave, where n(y,τ) ≪ 1, we can use the exponential ansatz n(y,τ) = A e^{−a(y − vτ)} (with A > 0 and a > 0) and neglect the quadratic part of the dimensionless Fisher-Kolmogorov equation (4.112). For an initial condition with decay slope a this yields the condition v = a + 1/a, and we see that the minimum wavespeed is obtained for a = 1.

4.5.2 Korteweg-de Vries equation

Another prominent nonlinear equation is the one proposed by Korteweg¹¹ and de Vries¹² to describe the dynamics of waves in long channels. These waves were observed to keep their shape over very large distances despite friction and dispersive forces.

¹¹Diederik Johannes Korteweg (1848 – 1941) was a Dutch mathematician.
¹²Gustav de Vries (1866 – 1934) was a Dutch mathematician.

In its dimensionless form, the Korteweg-de Vries (KdV) equation reads

∂u/∂t + 6 u(x,t) ∂u/∂x + ∂³u/∂x³ = 0   (4.120)

and thus constitutes an example of a nonlinear PDE of third order. As before, we aim to find travelling-wave solutions and therefore use the transformation z = x − vt, where v is the wave speed. This yields the ODE

u‴(z) + [6u(z) − v] u′(z) = 0 ,   (4.121)

to which one aims to find a solution. It turns out that multiple solutions exist (the physical one is then of course selected by the initial and boundary conditions). Here, we just describe one way of constructing such solitonic solutions. A simple approach is to assume an exponential dependence, and to prevent the solution from blowing up at z → ±∞ it is reasonable to use the ansatz

u(z) = [Σ_{n=−a}^{+A} a_n e^{nz}] / [Σ_{n=−b}^{+B} b_n e^{nz}] .   (4.122)

Since the solution should be finite at z → ±∞, it follows that we have to demand A = B and a = b. Then, already the simplest ansatz with A = B = 1 and a = b = 1, i.e., an ansatz of the form

u(z) = (a_{−1} e^{−z} + a_0 + a_{+1} e^{+z}) / (e^{−z} + b_0 + b_{+1} e^{+z}) ,   (4.123)

makes it possible to find a solution to Eq. (4.121). Above, we have already eliminated the parameter b_{−1}, since it is redundant. Using this ansatz in the KdV equation, sorting the result in powers of e^{nz}, and equating all coefficients of e^{nz} to zero eventually yields a set of equations for the remaining parameters. One particular solution (which is not simply constant) is

u(z) = [4(v − 1) + 4b_0(v + 5) e^z + b_0²(v − 1) e^{2z}] / [6 (2 + b_0 e^z)²] .   (4.124)

Here, the velocity v of the wave and the parameter b_0 are both arbitrary. The velocity however is related to the baseline of the solution, lim_{z→±∞} u(z) = (v − 1)/6. We note that since the dimensionless KdV equation (4.121) is invariant with respect to a shift of the solution when we also change the velocity, we can construct other solutions by adding a constant. In particular, one soliton with compact support is obtained by setting v = 1, yielding

u(z) = 4 b_0 e^z / (2 + b_0 e^z)² .   (4.125)

Exercise 37 (Soliton solution) Show that the soliton (4.125) solves Eq. (4.121) with v =1.

4.6 Numerical Solution: Finite Differences

The numerical solution of PDEs is usually an enormous challenge, such that the use of supercomputers is often advisable. We will discuss a few simple approaches at an introductory level.

Here, we discretize both time and space in equal intervals of size ∆t and ∆x, ∆y, ∆z. As a convention, we will label the solution at the discrete sampling points by u_{abc}^{(n)} = u(x_a, y_b, z_c, t_n). Suitable approximations for the derivatives occurring in a PDE are then difference quotients; e.g. in three spatial dimensions,

∂u(x,y,z,t)/∂t ≈ [u_{a,b,c}^{(n+1)} − u_{a,b,c}^{(n)}]/∆t ,   ∂u(x,y,z,t)/∂x ≈ [u_{a+1,b,c}^{(n)} − u_{a−1,b,c}^{(n)}]/(2∆x)   (4.126)

would be a forward-time approximation to the time derivative and a centered-space approximation to the spatial derivative. Often, one is given an initial distribution of the solution and wants to evolve it in time. Depending on which kind of approximation is inserted for the time derivative, we obtain different numerical solution schemes. We will illustrate this with the example of the 1d diffusion equation (the generalization to higher dimensions is straightforward).

4.6.1 Explicit Forward-Time Discretization

A forward-time approximation would yield

[u_a^{(n+1)} − u_a^{(n)}]/∆t = D [u_{a+1}^{(n)} − 2u_a^{(n)} + u_{a−1}^{(n)}]/∆x² ,   (4.127)

which we can immediately convert into an iterative equation for the propagated solution:

u_a^{(n+1)} = u_a^{(n)} + D (∆t/∆x²) [u_{a+1}^{(n)} − 2u_a^{(n)} + u_{a−1}^{(n)}] .   (4.128)

Thus, to propagate the solution of the diffusion equation by one timestep, one needs the local value of the function u_a^{(n)} and the values of its nearest neighbours u_{a±1}^{(n)}. Encoding the boundary conditions requires some special care: assuming a discretization with a ∈ {0, 1, ..., N−1, N}, the boundary conditions will constrain the external nodes a = 0 and a = N, such that these need not be explicitly evolved.

• To encode Dirichlet boundary conditions, we simply fix the boundary nodes to constant values u_0^{(n)} = u_left and u_N^{(n)} = u_right and apply the iteration equation only to the internal nodes u_a^{(n)} with 1 ≤ a ≤ N−1.

• To encode no-flux von Neumann boundary conditions, we also evolve only the internal nodes and replace in the iteration equation u_0^{(n)} = u_1^{(n)} and u_N^{(n)} = u_{N−1}^{(n)}. For general von Neumann boundary conditions one can demand the differences u_0^{(n)} − u_1^{(n)} and u_N^{(n)} − u_{N−1}^{(n)} to be constant.

• For periodic boundary conditions, we can e.g. eliminate u_N^{(n)} = u_0^{(n)}, identify u_{N+1}^{(n)} = u_1^{(n)} and u_{−1}^{(n)} = u_{N−1}^{(n)}, and evolve only the nodes with 0 ≤ a ≤ N−1 with the iteration equation.

We note that for a forward-time and centered-space discretization we have not treated time and space on an equal footing. This enables one to arrive at a simple iteration equation but unfortunately comes at a costly price: when the timestep is too large or the spatial resolution becomes too fine, the scheme becomes unstable. We see this from a von Neumann stability analysis of the discretization scheme, where as an ansatz one locally linearizes the PDE (here this is not necessary, since the diffusion equation is linear) and then considers the dynamics of modes of the form

u_a^{(n)} = Aⁿ e^{iak} ,   (4.129)

where k ∈ R and A is unknown. Thus, the ansatz is oscillatory in space (note that this simply corresponds to an expansion in Fourier modes) and exponential in time; if, for any k, we obtain |A| > 1, the solution may diverge in time and the scheme is considered unstable. Inserting the ansatz into Eq. (4.128) we obtain

A^{n+1} e^{ika} = [1 − 2D (∆t/∆x²)] Aⁿ e^{ika} + D (∆t/∆x²) Aⁿ e^{ika} (e^{−ik} + e^{+ik}) ,   (4.130)

which implies

A = 1 + 2D (∆t/∆x²) [cos(k) − 1] .   (4.131)

Demanding that |A| ≤ 1 for all k eventually yields the Courant¹³ condition

2D ∆t/∆x² ≤ 1 ,   (4.132)

which has to be satisfied to render the iteration (4.128) stable. This condition, unfortunately, is quite prohibitive: for large D, we will have to choose a small timestep and a coarse spatial discretization, and if we want to increase the spatial resolution (decrease ∆x), we must at the same time also increase the temporal resolution (decrease ∆t). Furthermore, it is a necessary condition for stability but not a sufficient one. When the dimension is increased, the Courant condition becomes even more prohibitive:

2D (∆t/∆x² + ∆t/∆y² + ∆t/∆z²) ≤ 1 .   (4.133)
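A minimal sketch of the explicit scheme (4.128) with no-flux von Neumann boundaries, respecting the Courant condition (4.132); all parameter values are illustrative:

    import numpy as np

    D, dx, N = 1.0, 1.0, 101
    dt = 0.4 * dx**2 / (2 * D)           # safely below the Courant limit (4.132)
    u = np.zeros(N); u[N // 2] = 1.0     # delta-like initial profile

    def ftcs_step(u):
        # one forward-time centered-space step, Eq. (4.128)
        un = u.copy()
        un[0], un[-1] = un[1], un[-2]    # no-flux von Neumann boundary nodes
        u[1:-1] = un[1:-1] + D * dt / dx**2 * (un[2:] - 2*un[1:-1] + un[:-2])
        u[0], u[-1] = u[1], u[-2]
        return u

    for _ in range(1000):
        u = ftcs_step(u)
    print(u.sum())   # total mass is (approximately) conserved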

Exercise 38 (Courant condition) Show that the Courant condition (4.133) arises for the 3d diffusion equation under explicit forward-time discretization.

4.6.2 Implicit Centered-Time Discretization

We can also center the time derivative around t, which however would involve values from the previous timestep, ∂u/∂t ≈ (u^{(n+1)} − u^{(n−1)})/(2∆t). Alternatively, we can evaluate the r.h.s. of the diffusion equation at both times t_n and t_{n+1}. This is called the Crank¹⁴-Nicolson¹⁵ scheme, which for the one-dimensional diffusion equation reads

u_a^{(n+1)} = u_a^{(n)} + (D∆t/(2∆x²)) [u_{a+1}^{(n)} − 2u_a^{(n)} + u_{a−1}^{(n)}] + (D∆t/(2∆x²)) [u_{a+1}^{(n+1)} − 2u_a^{(n+1)} + u_{a−1}^{(n+1)}] .   (4.134)

We note that the temporal derivative is now centered around t + ∆t/2, which has been taken into account by the average on the r.h.s. A von Neumann stability analysis u_a^{(n)} = Aⁿ e^{ika} yields

A = [1 − (D∆t/∆x²)(1 − cos k)] / [1 + (D∆t/∆x²)(1 − cos k)] ,   (4.135)

¹³Richard Courant (1888 – 1972) was a German mathematician.
¹⁴John Crank (1916 – 2006) was a British mathematical physicist.
¹⁵Phyllis Nicolson (1917 – 1968) was a British mathematician.

Figure 4.5: Comparison of Crank-Nicolson solutions of the diffusion equation for Dirichlet (black), von Neumann (red), and periodic (blue) boundary conditions for different times t ∈ {25, 100, 200}∆t (from bold to thin). Whereas for Dirichlet boundary conditions (initiated with a δ-profile at x = 20∆x) the total mass is not conserved, this is different for periodic and (no-flux) von Neumann boundary conditions (initiated at x = 80∆x). Other parameters: D∆t/∆x² = 0.1. [Plot: concentration profile vs. length [∆x].]

which for all k ∈ R and for all D∆t/∆x² > 0 obeys |A| ≤ 1. The scheme is therefore unconditionally stable. This allows one to choose large timesteps (which would however also lead to decreased accuracy). Unfortunately, the scheme is more difficult to evolve, since it requires the inversion of a matrix: arranging all terms with n+1 on the left and those with n on the right, we obtain for the internal nodes (1 ≤ a ≤ N−1) the equations

−α u_{a−1}^{(n+1)} + (1 + 2α) u_a^{(n+1)} − α u_{a+1}^{(n+1)} = +α u_{a−1}^{(n)} + (1 − 2α) u_a^{(n)} + α u_{a+1}^{(n)} ,   (4.136)

where we have defined α = (D/2) ∆t/∆x². When we assume e.g. Dirichlet boundary conditions u_0^{(n+1)} = u_0^{(n)} and u_N^{(n+1)} = u_N^{(n)} for the external nodes, this can be written with a matrix A(α):

A(+α) (u_0^{(n+1)}, ..., u_N^{(n+1)})ᵀ = A(−α) (u_0^{(n)}, ..., u_N^{(n)})ᵀ ,   (4.137)

where A(α) is the tri-diagonal matrix with internal rows (..., −α, 1+2α, −α, ...) and first and last rows (1, 0, ..., 0) and (0, ..., 0, 1). Fortunately, the matrix to be inverted is tri-diagonal. We also note that it is diagonally dominant, such that its inverse will always exist. For other boundary conditions, we would obtain some minor modifications, but in any case, for such matrices adapted algorithms exist that solve the corresponding linear system extremely efficiently. We note that the initial conditions must be chosen consistent with the boundary conditions. When the concentration does not vanish at the boundary, the choice of boundary condition may lead to fundamentally different behaviour, see Fig. 4.5.
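A minimal sketch of the propagation (4.136)–(4.137) with Dirichlet boundaries, using a dense solve for clarity (a banded solver would exploit the tri-diagonal structure); parameter values are illustrative:

    import numpy as np

    D, dx, dt, N = 1.0, 1.0, 10.0, 100     # note the large timestep: still stable
    alpha = 0.5 * D * dt / dx**2

    def cn_matrix(alpha, N):
        # tri-diagonal matrix A(alpha) of Eq. (4.137), Dirichlet boundaries
        A = np.eye(N + 1)
        for a in range(1, N):
            A[a, a-1], A[a, a], A[a, a+1] = -alpha, 1 + 2*alpha, -alpha
        return A

    Aplus, Aminus = cn_matrix(alpha, N), cn_matrix(-alpha, N)
    u = np.zeros(N + 1); u[20] = 1.0       # delta profile as in Fig. 4.5
    for _ in range(100):
        u = np.linalg.solve(Aplus, Aminus @ u)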

Exercise 39 (Boundary conditions) Find the matrix A(α) for von Neumann boundary conditions (keeping the spatial derivative at the boundaries constant implies u_1^{(n+1)} − u_0^{(n+1)} = u_1^{(n)} − u_0^{(n)} and u_N^{(n+1)} − u_{N−1}^{(n+1)} = u_N^{(n)} − u_{N−1}^{(n)}) and for periodic boundary conditions (which imply u_0^{(n+1)} = u_N^{(n+1)} and u_0^{(n)} = u_N^{(n)}).

The fact that a matrix has to be inverted to propagate the solution is common to all implicit schemes. This is usually compensated by the fact that implicit schemes tend to be more stable than explicit ones. In higher dimensions, however, the matrix to be inverted will no longer be tri-diagonal but rather band-diagonal. There exist adapted methods to map the band-diagonal matrices to multiple tri-diagonal ones. However, since generally only a few nearest neighbors are involved in the representation of spatial derivatives, the matrix to be inverted will always be extremely sparse. Therefore, an iterative algorithm such as e.g. the inverse iteration in Eq. (3.33) may be applicable and even quite efficient when the sparsity of the matrix is exploited in the matrix multiplications. We also note that when one is interested in a Poisson-type problem,

∆ρ(x) = f(x) ,   (4.138)

we can map it to the steady-state problem of a reaction-diffusion equation, ∂_t ρ = D ∂²_x ρ − D f(x). To find the solution of the Poisson equation very quickly, a large timestep is desirable, such that an implicit scheme would be advisable.

4.6.3 Higher Dimensions

Whereas in one spatial dimension it is straightforward to collect the discretized values in a single vector and to find the nearest neighbours of a node, the latter requires a little thought in higher dimensions. To facilitate the construction of such schemes, we introduce here the index function, which takes the indices of a node as input and returns the position where its value is stored in a vector as output. Clearly, in one dimension we have

I(a) = a ,   1 ≤ a ≤ N_x ,   (4.139)

where N_x is the total number of nodes. In two dimensions, this becomes

I(a,b) = (b − 1) N_x + a ,   1 ≤ a ≤ N_x ,   1 ≤ b ≤ N_y ,   (4.140)

which takes all values in 1 ≤ I(a,b) ≤ N_x N_y. Finally, in three dimensions we have

I(a,b,c) = (c − 1) N_x N_y + (b − 1) N_x + a ,   1 ≤ a ≤ N_x ,   1 ≤ b ≤ N_y ,   1 ≤ c ≤ N_z ,   (4.141)

which takes values in 1 ≤ I(a,b,c) ≤ N_x N_y N_z. Of course, this is just one possible Cartesian representation. Unfortunately, this has the consequence that spatial derivatives will in higher dimensions not generally lead to a tri-diagonal form of the matrix that should be inverted. Nevertheless, the matrix will always be quite sparse (assuming a sufficiently large number of integration points).
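A minimal sketch of the three-dimensional index function (4.141) together with the nearest-neighbour bookkeeping it enables (function names are illustrative):

    def index3(a, b, c, Nx, Ny):
        # map node indices (a, b, c), 1-based as in Eq. (4.141), to a vector position
        return (c - 1) * Nx * Ny + (b - 1) * Nx + a

    def neighbours(a, b, c, Nx, Ny):
        # vector positions of the six nearest neighbours of an interior node
        return [index3(a - 1, b, c, Nx, Ny), index3(a + 1, b, c, Nx, Ny),
                index3(a, b - 1, c, Nx, Ny), index3(a, b + 1, c, Nx, Ny),
                index3(a, b, c - 1, Nx, Ny), index3(a, b, c + 1, Nx, Ny)]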

4.6.4 Nonlinear PDEs

When parts of the PDE are nonlinear, it is – at least concerning the nonlinear terms – usually simpler to use a forward-time discretization. As an example, we consider the Fisher-Kolmogorov

Figure 4.6: Numerical solution of the Fisher-Kolmogorov equation (4.112). For small times (black, τ = {0.0, 0.1, ..., 1.0}∆τ), the initially sharply peaked distribution first smoothens (diffusion-dominated regime). For intermediate times (red, τ = {2, 3, ..., 10}∆τ), exponential growth in the central region sets in, establishing two phases where n(x,t) ∈ {0, 1}. Afterwards (blue, τ = {12, 14, ..., 30}∆τ), the phase boundary propagates like a travelling wave at dimensionless speed v ≈ 2 (length of arrows: 20∆x). Due to the periodic boundary conditions, the two wave fronts collide. Other parameters: ∆x = 1, ∆τ = 0.1. [Plot: dimensionless concentration vs. position [∆x].]

equation (4.112). The linearised version around the fixed values n̄_1 = 0 and n̄_2 = 1 becomes, with n(y,τ) = n̄_i + n_i(y,τ),

∂n_1/∂τ = ∂²n_1/∂y² + n_1(y,τ) ,   ∂n_2/∂τ = ∂²n_2/∂y² − n_2(y,τ) .   (4.142)

When we use a conventional forward-time discretization on the linearized versions, a von Neumann stability analysis yields the condition

|A| = |1 − 2(∆τ/∆y²)[1 − cos(k)] ± ∆τ| ≤ 1 .   (4.143)

Therefore, it is important to keep in mind that for a given discretization width ∆y, the timestep ∆τ must remain bounded. One can easily test that when the timestep is chosen too large, the solution will diverge. For the full Fisher-Kolmogorov equation we obtain the iteration scheme

u_a^{(n+1)} = u_a^{(n)} + (∆τ/∆y²) [u_{a+1}^{(n)} − 2u_a^{(n)} + u_{a−1}^{(n)}] + ∆τ u_a^{(n)} (1 − u_a^{(n)}) .   (4.144)

When supplemented with periodic boundary conditions, this yields the dynamics shown in Fig. 4.6. We see that the speed of the resulting travelling wave is slightly larger than v_min = 2.

Similarly, we can consider the evolution of solitons in the KdV equation (4.120). Here, a forward-time discretization yields (we start from the form ∂_t u = −6u ∂_x u − ∂³_x u)

u_a^{(n+1)} = u_a^{(n)} − 3(∆t/∆x) u_a^{(n)} [u_{a+1}^{(n)} − u_{a−1}^{(n)}] − (∆t/(2∆x³)) [u_{a+2}^{(n)} − 2u_{a+1}^{(n)} + 2u_{a−1}^{(n)} − u_{a−2}^{(n)}] .   (4.145)

When we assume periodic boundary conditions u_{a+N}^{(n)} = u_a^{(n)}, there is no need to find other discretizations of the third derivative at the boundary. Unfortunately, when we neglect the nonlinear term, a von Neumann stability analysis u_a^{(n)} = Aⁿ e^{ika} yields |A| ≥ 1 for all k, i.e., a forward discretization scheme is always unstable – regardless of the timestep width. Therefore, implicit methods must be used for these types of problems. But even among implicit methods there exists a variety of different approaches. The simplest is a semi-implicit method. Rewriting the nonlinear term in the KdV equation, we obtain

∂u/∂t = −3 ∂u²/∂x − ∂³u/∂x³ .   (4.146)

Treating all linear terms in a centered fashion and only the nonlinear term explicitly (forward-time), we obtain a system that requires only a matrix inversion for its propagation. For this, we first write numerical approximations to the first and third derivatives as

∂u²/∂x ≈ [ (u_{a+1}^(n))² − (u_{a−1}^(n))² ] / (2∆x) ,
∂³u/∂x³ ≈ [ u_{a+2}^(n) − 2 u_{a+1}^(n) + 2 u_{a−1}^(n) − u_{a−2}^(n) ] / (4∆x³) + [ u_{a+2}^(n+1) − 2 u_{a+1}^(n+1) + 2 u_{a−1}^(n+1) − u_{a−2}^(n+1) ] / (4∆x³) .   (4.147)

Sorting all forward-time terms to the left and backward-time terms to the right, we obtain

u_a^(n) − (∆t/(4∆x³)) [ u_{a+2}^(n) − 2 u_{a+1}^(n) + 2 u_{a−1}^(n) − u_{a−2}^(n) ] − 3 (∆t/(2∆x)) [ (u_{a+1}^(n))² − (u_{a−1}^(n))² ]
   = u_a^(n+1) + (∆t/(4∆x³)) [ u_{a+2}^(n+1) − 2 u_{a+1}^(n+1) + 2 u_{a−1}^(n+1) − u_{a−2}^(n+1) ] .   (4.148)

When we now apply a von-Neumann stability analysis u_a^(n) = A^n e^{ika} and neglect the nonlinear terms, this yields

A = [ 1 − 2iα (sin(2k) − 2 sin(k)) ] / [ 1 + 2iα (sin(2k) − 2 sin(k)) ] ,   (4.149)

where α = ∆t/(4∆x³). Therefore, we conclude that |A| = 1 for all values of k. Unfortunately, any slight modification (recall that we have neglected the nonlinear terms completely) might spoil this fragile stability. Furthermore, we see that – assuming periodic boundary conditions and eliminating the redundant node – to propagate the system from n to n+1, we have to invert a matrix of the form

       (  1    −2α   +α                 −α   +2α )
       ( +2α    1    −2α   +α                 −α )
       ( −α    +2α    1    −2α   +α              )
  A =  (         ⋱     ⋱     ⋱     ⋱     ⋱       )  .   (4.150)
       ( +α           −α   +2α    1    −2α       )
       ( −2α   +α            −α   +2α    1       )

Inversion of this matrix is always possible when |α| = |∆t/(4∆x³)| < 1/6: In this case, the matrix is diagonally dominant, and the Gershgorin 16 circle theorem guarantees that all eigenvalues have positive real part. Unfortunately, maintaining a reasonable spatial resolution then requires keeping the timestep quite small. The next possible improvement would be to treat also the nonlinear terms using a Crank-Nicolson discretization scheme. For simplicity, we will assume periodic boundary conditions u_N^(n) = u_0^(n) throughout and eliminate the node u_0^(n) from our considerations. Moving all terms to the right yields the nonlinear equation

16 Semyon Aronovich Gershgorin (1901 – 1933) was a Soviet mathematician.

Figure 4.7: Numerical solution of the KdV equation with the soliton (4.125) with b_0 = 20. After t = 100, the soliton (v = 1) has completed one cycle (periodic boundary conditions), and for the semi-implicit discretization (4.148) its shape has slightly changed (thin curves in lighter colors, ∆x = 0.1, ∆t = 0.00025). This error is not present in the fully implicit Crank-Nicolson treatment (4.151) with Newton iteration (bold curves, ∆x = 0.1, ∆t = 0.25). Beyond increased accuracy, the fully implicit Crank-Nicolson scheme was also about 1000 times faster. (Axes: solution u(x,t) versus dimensionless position x; curves shown at t = 0, 10, 20, 30, 40, 100.)

0 = −u_a^(n) + α [ u_{a+2}^(n) − 2 u_{a+1}^(n) + 2 u_{a−1}^(n) − u_{a−2}^(n) ] + 3β [ (u_{a+1}^(n))² − (u_{a−1}^(n))² ]
    + u_a^(n+1) + α [ u_{a+2}^(n+1) − 2 u_{a+1}^(n+1) + 2 u_{a−1}^(n+1) − u_{a−2}^(n+1) ] + 3β [ (u_{a+1}^(n+1))² − (u_{a−1}^(n+1))² ]
  = f_a(u_1^(n+1), ..., u_N^(n+1)) ,   (4.151)

where α = ∆t/(4∆x³) as before and β = ∆t/(4∆x). The spatial index obeys 1 ≤ a ≤ N, and the periodic boundary conditions imply a + N =̂ a. This equation can be solved for x = u^(n+1) using a fixed-point method such as e.g. the Newton iteration. The Jacobian 17 of the vector-valued function f(x) becomes

        ( ∂f_1/∂x_1  ⋯  ∂f_1/∂x_N )           (   0     +x_2                   −x_N )
J(x) =  (     ⋮      ⋱      ⋮     )  = A + 6β (  −x_1     0    +x_3                  )   (4.152)
        ( ∂f_N/∂x_1  ⋯  ∂f_N/∂x_N )           (           ⋱     ⋱      ⋱             )
                                              (                  ⋱      0     +x_N   )
                                              (  +x_1                −x_{N−1}   0    )

(n) That is, given the vector u^(n) we can e.g. define the initial guess x_0 = u^(n) and then iterate

x_{n+1} = x_n − J^{−1}(x_n) f(x_n)   (4.153)

until convergence |x_{n+1} − x_n| < ε is reached. The final value then corresponds to the next timestep, u^(n+1) = x_N. We note that we need not necessarily compute the inverse matrix explicitly: it suffices to solve the linear system J(x_n) ∆x = −f(x_n) for ∆x and then set x_{n+1} = x_n + ∆x. Although every timestep now requires an iterative solution, this comes with the advantage of treating all derivatives in a centered fashion, which greatly improves accuracy. Furthermore, we observe numerically that the timestep can now be chosen much larger, compare Fig. 4.7.
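A compact sketch of this procedure (our Python transcription; the stencil helpers are our own naming, and we solve J ∆x = −f rather than inverting J, as suggested above):

import numpy as np

def residual_and_jacobian(x, u_old, alpha, beta):
    """Residual f(x) of Eq. (4.151) and the Jacobian (4.152), periodic boundaries."""
    N = len(x)
    def third(v):    # u_{a+2} - 2 u_{a+1} + 2 u_{a-1} - u_{a-2}
        return np.roll(v, -2) - 2*np.roll(v, -1) + 2*np.roll(v, 1) - np.roll(v, 2)
    def sqdiff(v):   # u_{a+1}^2 - u_{a-1}^2
        return np.roll(v, -1)**2 - np.roll(v, 1)**2
    f = (-u_old + alpha*third(u_old) + 3*beta*sqdiff(u_old)
         + x + alpha*third(x) + 3*beta*sqdiff(x))
    J = np.eye(N)
    for a in range(N):
        J[a, (a+2) % N] += alpha
        J[a, (a-2) % N] += -alpha
        J[a, (a+1) % N] += -2*alpha + 6*beta*x[(a+1) % N]
        J[a, (a-1) % N] += +2*alpha - 6*beta*x[(a-1) % N]
    return f, J

def crank_nicolson_step(u_old, alpha, beta, eps=1e-12, maxiter=20):
    """Newton iteration (4.153): advance u^(n) to u^(n+1)."""
    x = u_old.copy()                       # initial guess x_0 = u^(n)
    for _ in range(maxiter):
        f, J = residual_and_jacobian(x, u_old, alpha, beta)
        dx = np.linalg.solve(J, -f)        # solve J dx = -f
        x += dx
        if np.max(np.abs(dx)) < eps:       # convergence criterion
            break
    return x                               # = u^(n+1)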

17 Carl Gustav Jacob Jacobi (1804–1851) was a German mathematician.

Chapter 5

Master Equations

Many processes in nature are stochastic. In classical physics, this may be due to our incomplete knowledge of the system: due to the unknown microstate of e.g. a gas in a box, the collisions of gas particles with the domain wall will appear random. In quantum theory, the evolution equations themselves involve amplitudes rather than observables at the lowest level, such that a stochastic evolution is intrinsic. A detailed understanding of such processes therefore requires a description that includes the random events in a probabilistic way. For dynamical systems, the probabilities associated with events may evolve in time, and the determining equation for such a process is called a master equation.

5.1 Rate Equations

If the master equation is a system of linear ODEs P˙ = A P, we will call it a rate equation.

Box 17 (Rate Equation) A rate equation is a first order differential equation describing the time evolution of probabilities, e.g. for discrete events

dP_k/dt = Σ_ℓ [ T_{kℓ} P_ℓ − T_{ℓk} P_k ] ,   P˙ = T P ,   (5.1)

where the T_{kℓ} > 0 are transition rates from event ℓ to event k. In matrix representation, this implies

     ( −Σ_{i≠1} T_{i1}      T_{12}          ...       T_{1N}          )
T =  (   T_{21}           −Σ_{i≠2} T_{i2}   ...       T_{2N}          )   (5.2)
     (    ⋮                                  ⋱         ⋮              )
     (   T_{N1}             ...              ...     −Σ_{i≠N} T_{iN}  )

The rate equation is said to satisfy detailed balance, when for the stationary state T P¯ =0 the equality TkℓP¯ℓ = TℓkP¯k holds for all pairs (k,ℓ) separately.


Furthermore, when the transition matrix T_{kℓ} is symmetric, all processes are reversible at the level of the rate equation description. Here, we will use the term master equation in more general terms, describing any system of coupled ODEs for probabilities. This is more general than a rate equation, since, for example, the Markovian quantum master equation does in general not only involve probabilities (diagonals of the density matrix) but also coherences (off-diagonals). It is straightforward to show that rate equations conserve the total probability,

d/dt Σ_k P_k = Σ_{kℓ} ( T_{kℓ} P_ℓ − T_{ℓk} P_k ) = Σ_{kℓ} ( T_{ℓk} P_k − T_{ℓk} P_k ) = 0 ,   (5.3)

where we have exchanged the summation indices in the first term. Beyond this, all probabilities must remain positive, which is also respected by a normal rate equation: Evidently, the solution of a rate equation is continuous, such that when initialized with valid probabilities 0 ≤ P_i(0) ≤ 1, all probabilities are non-negative initially. Let us assume that after some time, the probability P_k is the first to approach zero (such that all others are non-negative). Its time derivative is then always non-negative,

dP_k/dt |_{P_k = 0} = + Σ_ℓ T_{kℓ} P_ℓ ≥ 0 ,   (5.4)

which implies that P_k = 0 is repulsive, and negative probabilities are prohibited. Finally, the probabilities must remain smaller than one throughout the evolution. This however follows immediately from Σ_k P_k = 1 and P_k ≥ 0 by contradiction. In conclusion, a rate equation of the form (5.1) automatically preserves the sum of probabilities and also keeps 0 ≤ P_i(t) ≤ 1 – a valid initialization provided. That is, under the evolution of a rate equation, probabilities remain probabilities.
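A minimal numerical sketch of this structure (Python; the example rates are arbitrary choices of ours): given the off-diagonal rates T_{kℓ}, the diagonal of (5.2) is fixed by probability conservation.

import numpy as np

def rate_matrix(T_offdiag):
    """Build the generator (5.2) from a matrix of off-diagonal rates T[k, l] (rate l -> k)."""
    T = np.array(T_offdiag, dtype=float)
    np.fill_diagonal(T, 0.0)
    T -= np.diag(T.sum(axis=0))          # each column sums to zero -> Eq. (5.3)
    return T

T = rate_matrix([[0.0, 0.5],
                 [2.0, 0.0]])            # hypothetical rates T_10 = 2, T_01 = 0.5
assert np.allclose(T.sum(axis=0), 0.0)   # total probability is conserved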

5.1.1 Example 1: Fluctuating two-level system

Let us consider a system of two possible events, to which we associate the time-dependent probabilities P_0(t) and P_1(t). These events could for example be the two conformations of a molecule, the configurations of a spin, the two states of an excitable atom, etc. To introduce some dynamics, let the transition rate from 0 → 1 be denoted by T_{10} > 0 and the rate of the inverse transition 1 → 0 be denoted by T_{01} > 0. The associated master equation is then given by

d/dt ( P_0 )   ( −T_{10}   +T_{01} ) ( P_0 )
     ( P_1 ) = ( +T_{10}   −T_{01} ) ( P_1 ) .   (5.5)

Exercise 40 (Temporal Dynamics of a two-level system) Calculate the solution of Eq. (5.5). What is the stationary state?

5.1.2 Example 2: Interacting quantum dots

Imagine a double quantum dot, where the Coulomb interaction energy is so large that the doubly occupied state can be omitted from the considerations. In essence, only three states remain.

Let |0⟩ denote the empty, |L⟩ the left-occupied, and |R⟩ the right-occupied state, respectively. Now assume the two quantum dots to be tunnel-coupled to adjacent reservoirs but not among themselves. When we assume the transition rate to load the system from reservoir α ∈ {L,R} to be given by a product of a bare tunneling rate Γ_α and the occupation probability 0 < f_α < 1, it is also reasonable to assume that the rate of the inverse process is given by Γ_α (1 − f_α), i.e., by the product of the bare tunneling rate and the probability 1 − f_α to have a free space in the reservoir. The total rate matrix then reads

        ( −f_L        1−f_L      0 )         ( −f_R      0      1−f_R    )
T = Γ_L ( +f_L      −(1−f_L)     0 )  + Γ_R  (  0        0        0      )  .   (5.6)
        (  0            0        0 )         ( +f_R      0    −(1−f_R)   )

In fact, a microscopic derivation can be used to confirm the above-mentioned assumptions, and the parameters f_α become the Fermi functions

f_α = 1 / ( e^{β_α (ε_α − μ_α)} + 1 )   (5.7)

with inverse temperature β_α, chemical potential μ_α, and dot level energy ε_α.

Exercise 41 (Detailed balance) Determine whether the rate matrix (5.6) obeys detailed balance.

5.2 Density Matrix Formalism

5.2.1 Density Matrix

Suppose one wants to describe a quantum system where the system state is not exactly known. That is, there is an ensemble of known normalized states {|Φ_i⟩}, but there is uncertainty about which of these states the system is in. Such systems can be conveniently described by the density matrix.

Box 18 (Density Matrix) The density matrix can be written as

ρ = Σ_i p_i |Φ_i⟩⟨Φ_i| ,   (5.8)

where 0 ≤ p_i ≤ 1 denote the probabilities to be in the state |Φ_i⟩, with Σ_i p_i = 1. Note that we require the states to be normalized (⟨Φ_i|Φ_i⟩ = 1) but not generally orthogonal (⟨Φ_i|Φ_j⟩ ≠ δ_{ij} is allowed).

Formally, any matrix fulfilling the properties

• self-adjointness: ρ† = ρ
• normalization: Tr{ρ} = 1
• positivity: ⟨Ψ|ρ|Ψ⟩ ≥ 0 for all vectors |Ψ⟩

can be interpreted as a valid density matrix.

For a pure state, one has p_ī = 1 for a particular ī and thereby ρ = |Φ_ī⟩⟨Φ_ī|. Evidently, a density matrix is pure if and only if ρ = ρ². The expectation value of an operator for a known state |Ψ⟩,

⟨A⟩ = ⟨Ψ|A|Ψ⟩ ,   (5.9)

can be obtained conveniently from the corresponding pure density matrix ρ = |Ψ⟩⟨Ψ| by simply computing the trace

⟨A⟩ ≡ Tr{Aρ} = Tr{ρA} = Tr{ A |Ψ⟩⟨Ψ| }
    = Σ_n ⟨n|A|Ψ⟩⟨Ψ|n⟩ = ⟨Ψ| ( Σ_n |n⟩⟨n| ) A |Ψ⟩
    = ⟨Ψ|A|Ψ⟩ .   (5.10)

When the state is not exactly known but only its probability distribution is, the expectation value is obtained by computing the weighted average

⟨A⟩ = Σ_i P_i ⟨Φ_i|A|Φ_i⟩ ,   (5.11)

where P_i denotes the probability to be in state |Φ_i⟩. The prescription of obtaining expectation values by calculating traces of operators with the density matrix is also consistent with mixed states:

⟨A⟩ ≡ Tr{Aρ} = Tr{ A Σ_i p_i |Φ_i⟩⟨Φ_i| } = Σ_i p_i Tr{ A |Φ_i⟩⟨Φ_i| }
    = Σ_i p_i Σ_n ⟨n|A|Φ_i⟩⟨Φ_i|n⟩ = Σ_i p_i ⟨Φ_i| ( Σ_n |n⟩⟨n| ) A |Φ_i⟩
    = Σ_i p_i ⟨Φ_i|A|Φ_i⟩ .   (5.12)

We finally note that the (real-valued) diagonal entries of the density matrix are often called populations, whereas the (complex-valued) off-diagonal entries are termed coherences. However, these notions always depend on the chosen basis.
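Numerically, Eq. (5.12) is just the trace of a matrix product. A small sketch (Python; the states, weights, and observable are arbitrary choices of ours):

import numpy as np

ket0 = np.array([1.0, 0.0], dtype=complex)
ket1 = np.array([0.0, 1.0], dtype=complex)
psi = (ket0 + 1j * ket1) / np.sqrt(2)              # a normalized example state

# statistical mixture: |0> with p = 0.6 and |psi> with p = 0.4
rho = 0.6 * np.outer(ket0, ket0.conj()) + 0.4 * np.outer(psi, psi.conj())

A = np.diag([1.0, -1.0]).astype(complex)           # an example observable
expval = np.trace(A @ rho).real                    # Tr{A rho}, Eq. (5.12)
weighted = (0.6 * (ket0.conj() @ A @ ket0) + 0.4 * (psi.conj() @ A @ psi)).real
assert np.isclose(expval, weighted)                # both give the weighted average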

Exercise 42 (Superposition vs. Localized States) Calculate the density matrix for a statistical mixture of the states |0⟩ and |1⟩ with probabilities p_0 = 3/4 and p_1 = 1/4. What is the density matrix for a statistical mixture of the superposition states |Ψ_a⟩ = √(3/4) |0⟩ + √(1/4) |1⟩ and |Ψ_b⟩ = √(3/4) |0⟩ − √(1/4) |1⟩ with probabilities p_a = p_b = 1/2?

5.2.2 Dynamical Evolution in a closed system

The evolution of a pure state vector in a closed quantum system is described by the evolution operator U(t), as e.g. for the Schrödinger equation

|Ψ̇(t)⟩ = −i H(t) |Ψ(t)⟩   (5.13)

the time evolution operator

U(t) = τ̂ exp{ −i ∫_0^t H(t′) dt′ }   (5.14)

may be defined as the solution to the operator equation

U̇(t) = −i H(t) U(t) .   (5.15)

For constant H(t) = H, we simply have the solution U(t) = e^{−iHt}. Similarly, a pure-state density matrix ρ = |Ψ⟩⟨Ψ| would evolve according to the von-Neumann equation

ρ̇ = −i [H(t), ρ(t)]   (5.16)

with the formal solution ρ(t) = U(t) ρ(0) U†(t), compare Eq. (5.14). When we simply apply this evolution equation to a density matrix that is not pure, we obtain

ρ(t) = Σ_i p_i U(t) |Φ_i⟩⟨Φ_i| U†(t) ,   (5.17)

i.e., transitions between the (now time-dependent) state vectors |Φ_i(t)⟩ = U(t)|Φ_i⟩ are impossible with unitary evolution. This means that the von-Neumann evolution equation does yield the same dynamics as the Schrödinger equation if the latter is restarted on the different initial states.

Exercise 43 (Preservation of density matrix properties by unitary evolution) Show that the von-Neumann equation (5.16) preserves self-adjointness, trace, and positivity of the density matrix.

The measurement process can also be generalized similarly. For a quantum state |Ψ⟩, measurements are described by a set of measurement operators {M_m}, each corresponding to a certain measurement outcome, and with the completeness relation Σ_m M†_m M_m = 1. The probability of obtaining result m is given by

p_m = ⟨Ψ| M†_m M_m |Ψ⟩   (5.18)

and after the measurement with outcome m, the quantum state is collapsed:

|Ψ⟩ →(m) M_m |Ψ⟩ / √( ⟨Ψ| M†_m M_m |Ψ⟩ ) .   (5.19)

The projective measurement is just a special case of this with M_m = |m⟩⟨m|.

Box 19 (Measurements with density matrix) For a set of measurement operators {M_m} corresponding to different outcomes m and obeying the completeness relation Σ_m M†_m M_m = 1, the probability to obtain result m is given by

p_m = Tr{ M†_m M_m ρ } ,   (5.20)

and the action of the measurement on the density matrix – provided that result m was obtained – can be summarized as

ρ →(m) ρ′ = M_m ρ M†_m / Tr{ M†_m M_m ρ } .   (5.21)

It is therefore straightforward to see that the descriptions by the Schrödinger equation and by the von-Neumann equation, with their respective measurement postulates, are equivalent. The density matrix formalism conveniently includes statistical mixtures in the description, but at the cost of quadratically increasing the number of state variables.
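As a numerical illustration of Box 19 for a projective measurement in the computational basis (a sketch; the input state is an arbitrary valid density matrix of our choosing):

import numpy as np

rho = np.array([[0.7, 0.2], [0.2, 0.3]], dtype=complex)   # an example density matrix
M = [np.diag([1.0, 0.0]).astype(complex),                 # M_0 = |0><0|
     np.diag([0.0, 1.0]).astype(complex)]                 # M_1 = |1><1|
assert np.allclose(sum(m.conj().T @ m for m in M), np.eye(2))   # completeness

for outcome, Mm in enumerate(M):
    pm = np.trace(Mm.conj().T @ Mm @ rho).real            # Eq. (5.20)
    rho_post = Mm @ rho @ Mm.conj().T / pm                # Eq. (5.21)
    print(outcome, pm, np.trace(rho_post).real)           # post-state has unit trace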

Exercise 44 (Preservation of density matrix properties by measurement) Show that the measurement postulate preserves self-adjointness, trace, and positivity of the density matrix.

5.2.3 Completely Positive linear maps

A very convenient approach to general linear maps of the density matrix that preserve its properties is the so-called Kraus 1 representation.

Box 20 (Kraus representation) The Kraus representation of a linear map of the density matrix is given by

ρ′ = Σ_α K_α ρ K†_α   (5.22)

with Σ_α K†_α K_α = 1 and otherwise arbitrary Kraus operators K_α. It preserves all density matrix properties.

It is straightforward to show that the properties trace, hermiticity, and positivity of ρ are also inherited by ρ′. That is, under the evolution of a Kraus map, density matrices remain density matrices.
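This can also be checked numerically, e.g. for the standard amplitude-damping Kraus pair (our example, not taken from the lecture):

import numpy as np

g = 0.3                                            # damping strength, arbitrary
K0 = np.array([[1.0, 0.0], [0.0, np.sqrt(1 - g)]], dtype=complex)
K1 = np.array([[0.0, np.sqrt(g)], [0.0, 0.0]], dtype=complex)
assert np.allclose(K0.conj().T @ K0 + K1.conj().T @ K1, np.eye(2))  # Kraus condition

rho = np.array([[0.4, 0.3j], [-0.3j, 0.6]])        # a valid density matrix
rho_p = K0 @ rho @ K0.conj().T + K1 @ rho @ K1.conj().T   # Kraus map, Eq. (5.22)

assert np.isclose(np.trace(rho_p).real, 1.0)       # trace preserved
assert np.allclose(rho_p, rho_p.conj().T)          # hermiticity preserved
assert np.all(np.linalg.eigvalsh(rho_p) >= -1e-12) # positivity preserved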

Exercise 45 (Kraus map) Show that the Kraus map preserves the density matrix properties.

Special cases of Kraus maps are the evolution under measurement and also the unitary evolution of a closed system. Below, we will see that specific non-unitary differential formulations of the density matrix evolution can also be related to Kraus maps.

1 Karl Kraus (1938 – 1988) was a German theoretical physicist.

5.3 Lindblad Master Equation

Any dynamical evolution equation for the density matrix should (at least in some approximate sense) preserve its interpretation as a density matrix, i.e., trace, hermiticity, and positivity must be preserved. By construction, the measurement postulate and unitary evolution preserve these properties. However, more general evolutions are conceivable. If we constrain ourselves to linear master equations that are local in time and have constant coefficients, the most general evolution that preserves trace, self-adjointness, and positivity of the density matrix is given by a Lindblad 2 form.

Box 21 (Lindblad form) A master equation of Lindblad form has the structure

ρ̇ = Lρ = −i [H, ρ] + Σ_{α,β=1}^{N²−1} γ_{αβ} ( A_α ρ A†_β − (1/2) { A†_β A_α , ρ } ) ,   (5.23)

where the hermitian operator H = H† can be interpreted as an effective Hamiltonian and γ_{αβ} = γ*_{βα} is a positive semidefinite matrix, i.e., it fulfills Σ_{αβ} x*_α γ_{αβ} x_β ≥ 0 for all vectors x (or, equivalently, all eigenvalues of (γ_{αβ}) are non-negative, λ_i ≥ 0).

Exercise 46 (Trace and Hermiticity preservation by Lindblad forms) Show that the Lindblad form master equation preserves trace and hermiticity of the density matrix.

The Lindblad type master equation can be written in simpler form: As the dampening matrix γ is hermitian, it can be diagonalized by a suitable unitary transformation U, such that Σ_{αβ} U_{α′α} γ_{αβ} (U†)_{ββ′} = δ_{α′β′} γ_{α′} with γ_α ≥ 0 representing its non-negative eigenvalues. Using this unitary operation, a new set of operators can be defined via A_α = Σ_{α′} U_{α′α} L_{α′}. Inserting this decomposition in the master equation, we obtain

ρ̇ = −i [H, ρ] + Σ_{α,β=1}^{N²−1} γ_{αβ} ( A_α ρ A†_β − (1/2) { A†_β A_α , ρ } )
  = −i [H, ρ] + Σ_{α′,β′} [ Σ_{αβ} γ_{αβ} U_{α′α} U*_{β′β} ] ( L_{α′} ρ L†_{β′} − (1/2) { L†_{β′} L_{α′} , ρ } )
  = −i [H, ρ] + Σ_α γ_α ( L_α ρ L†_α − (1/2) { L†_α L_α , ρ } ) ,   (5.24)

where γ_α denote the N²−1 non-negative eigenvalues of the dampening matrix. Evidently, the representation of a master equation is not unique: any other unitary operation would lead to a different non-diagonal form which however describes the same master equation. In addition, we

2 Göran Lindblad (1940–) is a Swedish physicist.

note here that the master equation is not only invariant under unitary transformations of the operators A_α, but in the diagonal representation also under inhomogeneous transformations of the form

L_α → L′_α = L_α + a_α ,
H → H′ = H + (1/2i) Σ_α γ_α ( a*_α L_α − a_α L†_α ) + b ,   (5.25)

with complex numbers a_α and a real number b. The first of the above equations can be exploited to choose the Lindblad operators L_α traceless.

Exercise 47 (Shift invariance) Show the invariance of the diagonal representation of a Lindblad form master equation (5.24) with respect to the transformation (5.25).

We would like to demonstrate the preservation of positivity here. Since preservation of hermiticity follows directly from the Lindblad form, we can – at any time – formally write the density matrix in its spectral representation,

ρ(t) = Σ_j P_j(t) |Ψ_j(t)⟩⟨Ψ_j(t)|   (5.26)

with probabilities P_j(t) ∈ R (we still have to show that these remain positive) and time-dependent eigenstates obeying unitary evolution,

|Ψ̇_j(t)⟩ = −i K(t) |Ψ_j(t)⟩ ,   K(t) = K†(t) .   (5.27)

Exercise 48 (Evolution of Eigenstates is Unitary) Show that orthonormality of time-dependent eigenstates implies unitary evolution, i.e., a hermitian operator K(t).

The hermitian conjugate equation also implies that ⟨Ψ̇_j| = +i ⟨Ψ_j| K. Therefore, the time derivative of the density matrix becomes

ρ̇ = Σ_j [ Ṗ_j |Ψ_j⟩⟨Ψ_j| − i P_j K |Ψ_j⟩⟨Ψ_j| + i P_j |Ψ_j⟩⟨Ψ_j| K ] ,   (5.28)

and sandwiching the density matrix yields ⟨Ψ_j(t)| ρ̇ |Ψ_j(t)⟩ = Ṗ_j(t). On the other hand, we can also use the Lindblad equation to obtain

Ṗ_j = −i ⟨Ψ_j|H|Ψ_j⟩ P_j + i P_j ⟨Ψ_j|H|Ψ_j⟩
    + Σ_α γ_α [ ⟨Ψ_j| L_α ( Σ_k P_k |Ψ_k⟩⟨Ψ_k| ) L†_α |Ψ_j⟩ − ⟨Ψ_j| L†_α L_α |Ψ_j⟩ P_j ]
    = Σ_k ( Σ_α γ_α |⟨Ψ_j| L_α |Ψ_k⟩|² ) P_k − ( Σ_α γ_α ⟨Ψ_j| L†_α L_α |Ψ_j⟩ ) P_j .   (5.29)

This is nothing but a rate equation with positive but time-dependent transition rates

R_{k→j}(t) = Σ_α γ_α |⟨Ψ_j(t)| L_α |Ψ_k(t)⟩|² ≥ 0 ,   (5.30)

and with our arguments from Sec. 5.1 it follows that the positivity of the eigenvalues P_j(t) is granted. Unfortunately, the basis within which this simple rate equation holds is time-dependent and also only known after solving the master equation and diagonalizing the solution. It is therefore not very practical on most occasions. However, in many applications a basis admitting a rate equation representation can be found, which we will discuss below. Usually (but not always), the coherences in this case just decay and can therefore be neglected in the long-term limit. However, even in these fortunate cases it may become difficult to find the basis where the long-term density matrix becomes diagonal (i.e., the pointer basis) without solving the full master equation. Formally, when the density matrix is written as a vector (first containing the populations and then the coherences), a master equation acts like a matrix – often called the Liouvillian 3. A rate equation representation then means that the Liouvillian simply has block form, letting populations and coherences evolve independently. In fact, the most common quantum-optical procedure for deriving a Lindblad master equation (successively applying the Born 4, Markov 5, and secular approximations) very often (in case of non-degenerate system Hamiltonians) leads to a simplified rate equation description in the energy eigenbasis of the system Hamiltonian.

5.3.1 Master Equation for a cavity in a thermal bath

Consider the Lindblad form master equation

ρ̇_S = −i [ Ω a†a , ρ_S ] + γ (1 + n_B) ( a ρ_S a† − (1/2) a†a ρ_S − (1/2) ρ_S a†a )
      + γ n_B ( a† ρ_S a − (1/2) a a† ρ_S − (1/2) ρ_S a a† ) ,   (5.31)

with bosonic operators [a, a†] = 1, Bose-Einstein bath occupation n_B = [ e^{βΩ} − 1 ]^{−1}, and cavity frequency Ω. In the Fock-space representation, these operators act as a†|n⟩ = √(n+1) |n+1⟩ and a|n⟩ = √n |n−1⟩ (where 0 ≤ n < ∞), such that the above master equation couples only the diagonals of the density matrix ρ_n = ⟨n| ρ_S |n⟩ to each other,

ρ̇_n = γ (1 + n_B) [ (n+1) ρ_{n+1} − n ρ_n ] + γ n_B [ n ρ_{n−1} − (n+1) ρ_n ]
    = γ n_B n ρ_{n−1} − γ [ n (1 + n_B) + (n+1) n_B ] ρ_n + γ (1 + n_B) (n+1) ρ_{n+1} ,   (5.32)

in a tri-diagonal form. This is a simple rate equation, and its structure makes it particularly easy to calculate its stationary state recursively, since the boundary solution n_B ρ̄_0 = (1 + n_B) ρ̄_1 implies for all n the relation

ρ̄_{n+1} / ρ̄_n = n_B / (1 + n_B) = e^{−βΩ} ,   (5.33)

3 Joseph Liouville (1809 – 1882) was a French mathematician.
4 Max Born (1882–1970) was a German physicist and mathematician.
5 Andrei Andrejewitsch Markow (1856–1922) was a Russian mathematician with contributions to probability calculus and analysis.

i.e., the stationary state is a thermalized Gibbs 6 state with the reservoir temperature. We can however also relate this rate equation to the Fokker-Planck equation (4.47). Defining the quantities

d_n = (γ/2) [ n (1 + n_B) + (n+1) n_B ] ,   v_n = γ [ n − n_B ] ,   (5.34)

it is visible that we can write the master equation as

ρ̇_n = [ v_{n+1} ρ_{n+1} − v_{n−1} ρ_{n−1} ] / 2 + ( d_{n+1} ρ_{n+1} − 2 d_n ρ_n + d_{n−1} ρ_{n−1} ) .   (5.35)

The first term already looks like a first derivative and the second term like a second derivative; just the appropriate scaling is missing. Noting that v_n and d_n have dimensions of inverse time whereas ρ_n is dimensionless, this final step is performed by introducing continuous functions ρ(x,t), v(x), and d(x) with

ρ(n∆x, t) = ρ_n(t) / ∆x ,   v(n∆x) = v_n ∆x ,   d(n∆x) = d_n ∆x² ,   (5.36)

which eventually yields the Fokker-Planck equation in one spatial dimension

∂_t ρ(x,t) = ∂_x [ v(x) ρ(x,t) ] + ∂_x² [ d(x) ρ(x,t) ]   (5.37)

with a linear velocity profile

v(x) = γ (x − x_0) ,   x_0 = n_B ∆x ,   (5.38)

that changes sign at x_0, and also a linearly increasing mobility profile

d(x) = γ x x_0 + (γ/2) (x + x_0) ∆x .   (5.39)

Since x = n∆x ≥ 0, the mobility is always positive. Thus, by solving the master equation for its stationary state, we have also solved the Fokker-Planck equation with spatially dependent velocity and diffusion properties! Such mapping possibilities quite generally exist for master equations that couple only the next neighbours.
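The thermal ratio (5.33) can also be verified numerically by truncating the Fock space and computing the kernel of the tri-diagonal generator (5.32) (a sketch; the truncation dimension and parameters are our choices):

import numpy as np

gamma, nB, Nmax = 1.0, 0.5, 60
L = np.zeros((Nmax, Nmax))
for n in range(Nmax):
    up = gamma * nB * (n + 1) if n < Nmax - 1 else 0.0   # absorption n -> n+1 (cut at top)
    down = gamma * (1 + nB) * n                          # emission n -> n-1
    L[n, n] = -(up + down)
    if n > 0:
        L[n, n - 1] = gamma * nB * n                     # gain from rho_{n-1}
    if n < Nmax - 1:
        L[n, n + 1] = gamma * (1 + nB) * (n + 1)         # gain from rho_{n+1}

w, V = np.linalg.eig(L)                                  # stationary state: eigenvalue 0
rho_bar = np.real(V[:, np.argmin(np.abs(w))])
rho_bar /= rho_bar.sum()
print(rho_bar[1] / rho_bar[0], nB / (1 + nB))            # both reproduce Eq. (5.33)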

5.3.2 Master Equation for a driven cavity

When the cavity is driven with a laser and simultaneously coupled to a vacuum bath (n_B = 0), one may obtain the master equation

ρ̇_S = −i [ Ω a†a + (P/2) e^{+iωt} a + (P*/2) e^{−iωt} a† , ρ_S ] + γ ( a ρ_S a† − (1/2) a†a ρ_S − (1/2) ρ_S a†a ) ,   (5.40)

where the laser frequency is denoted by ω and its amplitude by P. The transformation ρ = e^{+iωa†at} ρ_S e^{−iωa†at} maps to a time-independent master equation,

ρ̇ = −i [ (Ω − ω) a†a + (P/2) a + (P*/2) a† , ρ ] + γ ( a ρ a† − (1/2) a†a ρ − (1/2) ρ a†a ) .   (5.41)

This equation obviously couples coherences and populations in the Fock-space representation, where a|n⟩ = √n |n−1⟩ and a†|n⟩ = √(n+1) |n+1⟩. Nevertheless, it can be mapped to a closed set of equations for certain expectation values, e.g. ⟨a†a⟩, ⟨a⟩, and ⟨a†⟩.

6 Josiah Willard Gibbs (1839 – 1903) was an American physicist, chemist, and mathematician.

Exercise 49 (Coherent state) Derive from (5.41) a closed set of equations for ⟨a†a⟩, ⟨a⟩, and ⟨a†⟩ and show that

lim_{t→∞} ⟨a†a⟩ = |P|² / ( γ² + 4(Ω−ω)² ) ,   lim_{t→∞} ⟨ (Ω−ω) a†a + (P/2) a + (P*/2) a† ⟩ = − (Ω−ω) |P|² / ( γ² + 4(Ω−ω)² ) .

To obtain a rate equation representation, we use the invariance of the Lindblad equation under shifts of the Lindblad operators and a simultaneous modification of the Hamiltonian (5.25). When we consider the transformation

ρ̇ = −i [ H + (γ/2i) ( α* a − α a† ) , ρ ] + γ [ (a + α) ρ (a† + α*) − (1/2) { (a† + α*)(a + α) , ρ } ]   (5.42)

with α = P* / ( 2(Ω−ω) − iγ ), we see that the master equation can be written as

ρ̇ = −i [ (Ω − ω) ã†ã , ρ ] + γ ( ã ρ ã† − (1/2) { ã†ã , ρ } ) ,   (5.43)

where ã = a + α denotes the transformed annihilation operator. In the transformed basis, the corresponding rate equation reads

ρ̇_n = + γ (n+1) ρ_{n+1} − γ n ρ_n .   (5.44)

Here, we observe that there are only transitions from n → n−1 but no opposite transitions. Consequently, the system will relax to the ground state (n = 0) of the modified Hamiltonian

H′ = (Ω−ω) a†a + [ (Ω−ω) P / ( 2(Ω−ω) + iγ ) ] a + [ (Ω−ω) P* / ( 2(Ω−ω) − iγ ) ] a†
   = (Ω−ω) ã†ã − [ (Ω−ω) |P|² / ( 4(Ω−ω)² + γ² ) ] 1 .   (5.45)

The energy of the ground state is given by E_0 = − (Ω−ω) |P|² / ( 4(Ω−ω)² + γ² ), which is consistent with the previously obtained result.

5.3.3 Phenomenologic Identification of Jump Terms

Given a master equation without microscopic derivation, it is essential to interpret the transitions that are induced by the individual terms. Assuming that the density matrix at time t is given by a pure state ρ(t) = |Ψ_i⟩⟨Ψ_i|, the probability P_f = ⟨Ψ_f| ρ(t+∆t) |Ψ_f⟩ to find it in an orthogonal state after an infinitesimally short time interval ∆t is given by

P_f = + Σ_α γ_α ∆t ⟨Ψ_f| L_α |Ψ_i⟩⟨Ψ_i| L†_α |Ψ_f⟩ .   (5.46)

Therefore, it appears quite natural to define the transition rate as

R_{i→f} = Σ_α γ_α |⟨Ψ_f| L_α |Ψ_i⟩|² .   (5.47)

We note that due to ⟨Ψ_i|Ψ_f⟩ = 0, the above definition is invariant with respect to shifts of the Lindblad operators (5.25). Therefore, these terms in the master equation are often also called jump terms. The quantum jumps go along with changes in e.g. the particle or energy content of the system,

∆N_S = ⟨Ψ_f| N |Ψ_f⟩ − ⟨Ψ_i| N |Ψ_i⟩ ,   ∆E_S = ⟨Ψ_f| H |Ψ_f⟩ − ⟨Ψ_i| H |Ψ_i⟩ ,   (5.48)

which – when the system is only coupled to a single reservoir – must have been transferred into the reservoir (conservation of total energy and particle number provided). For multiple reservoirs ν however, one often has a simple additive decomposition of the bare rates,

γ_α = Σ_ν γ_α^(ν) ,   (5.49)

where γ_α^(ν) describes the coupling to reservoir ν only. This allows one to associate to the jump term

J_α^(ν) ρ = γ_α^(ν) L_α ρ L†_α   (5.50)

in the master equation a transfer of energy and/or particles to the reservoir ν due to jump α. Very often, one is interested in the statistics of energy and/or particle transfers. To track these, it is useful to introduce the concept of a conditioned density matrix. For example, being interested in the statistics of particle transfers, we can write the total joint density matrix as ρ_SB = Σ_n ρ^(n) ⊗ ρ_B^(n). Tracing out the reservoir, one arrives at the additive decomposition

ρ = Σ_n ρ^(n) ,   (5.51)

where ρ^(n) denotes the density matrix of the system provided that net n particles have been transferred into the monitored reservoir. However, since we have identified the jump terms that lead to the exchange of particles with the reservoir, we can set up a differential equation for the conditioned density matrices from the Lindblad master equation,

ρ̇^(n) = L_0 ρ^(n) + L_+ ρ^(n−1) + L_− ρ^(n+1) .   (5.52)

Above, we have assumed that at most one particle is transferred in single quantum jumps (which is typical for the weak-coupling limit). Here, L_+ (L_−) describes the part of the Liouvillian that increases (decreases) the number of particles in the monitored reservoir by one. Consequently, L_0 contains all remaining terms (which either leave the particle number invariant or describe jump processes to other reservoirs), and the total Liouvillian is just given by L = L_0 + L_+ + L_−. If one is only interested in the total probability that n particles have been transferred, this is obtained by tracing out the system state,

P_n(t) = Tr{ ρ^(n)(t) } .   (5.53)

It is in general quite difficult to obtain the P_n(t) explicitly. However, a description in terms of moments or cumulants is sometimes much simpler to obtain. The translational invariance of the conditioned master equation (5.52) suggests performing a Fourier transform,

ρ(χ, t) = Σ_n ρ^(n)(t) e^{+inχ} ,   (5.54)

and the FT variable χ is in this context for obvious reasons called the counting field. We note that to obtain moments of the distribution P_n(t), it suffices to calculate suitable derivatives with respect to the counting field,

⟨n^k⟩_t = Σ_n n^k P_n(t) = (−i ∂_χ)^k Tr{ ρ(χ, t) } |_{χ=0} .   (5.55)

After the FT, the conditioned master equation becomes

ρ̇(χ, t) = [ L_0 + L_+ e^{+iχ} + L_− e^{−iχ} ] ρ(χ, t) = L(χ) ρ(χ, t) ,   (5.56)

where the counting field only enters as an ordinary parameter. Assuming that ρ^(n)(0) = δ_{n,0} ρ_0, this is solved by ρ(χ,t) = e^{L(χ) t} ρ_0, and consequently, the moment-generating function for P_n(t) is given by

M(χ, t) = Tr{ e^{L(χ) t} ρ_0 } ,   ⟨n^k⟩_t = (−i ∂_χ)^k M(χ, t) |_{χ=0} .   (5.57)

Alternatively, one may characterize a distribution completely using cumulants. These are obtained by derivatives of the cumulant-generating function, which is defined via the logarithm,

C(χ, t) = log Tr{ e^{L(χ) t} ρ_0 } ,   ⟨⟨n^k⟩⟩_t = (−i ∂_χ)^k C(χ, t) |_{χ=0} .   (5.58)

Up to now, these expressions are not particularly useful, as computing the exponential of a Liouvillian is cumbersome (it is in general not hermitian and may not admit diagonalization). In the long-term limit, one may assume that the initial state is not of relevance, such that in practice one often uses the long-term versions where the initial state is replaced by the stationary one,

M(χ, t) = Tr{ e^{L(χ) t} ρ̄ } ,   C(χ, t) = log Tr{ e^{L(χ) t} ρ̄ } ,   (5.59)

where ρ̄ is obtained from L(0) ρ̄ = 0. Assuming that there exists only a single stationary state (ergodicity), the cumulant-generating function may even be further simplified. To see this, we note that the Liouvillian L(0) must have eigenvalues with negative real part and a single vanishing eigenvalue describing the unique stationary state. Continuing this for non-vanishing χ, there must exist one eigenvalue λ(χ) with λ(0) = 0 that has the largest real part of all (at least in a surrounding of χ = 0). Now, we use that this dominant eigenvalue occurs in the Jordan block form as a separate entry,

C(χ, t) = log Tr{ Q(χ) e^{Q^{−1}(χ) L(χ) Q(χ) t} Q^{−1}(χ) ρ_0 } → log Tr{ Q(χ) ( e^{λ(χ)t}  0 ; 0  0 ) Q^{−1}(χ) ρ_0 }
        = log [ e^{λ(χ)t} f(χ) ] = λ(χ) t + log f(χ) ≈ λ(χ) t .   (5.60)

Here, Q(χ) is the similarity transformation that maps to the Jordan block form, and we have used that for sufficiently large times the only surviving eigenvalue is the one with the largest real part. Therefore, to learn about the statistics P_n(t), it suffices to obtain the dominant eigenvalue of the counting-field dependent Liouvillian.

5.4 Example: Single-Electron-Transistor

The possibly simplest example for a master equation is the single-electron transistor, for which we have already discussed the current (1.65) from a thermodynamic perspective. Starting from a microscopic derivation (which we omit here), one obtains a Lindblad master equation acting in the Hilbert space of the central quantum dot (with basis vectors |0⟩ and |1⟩ denoting the empty and filled dot, respectively). For tunnel-couplings to two leads, it can be simplified to the form

ρ̇ = −i [ ε d†d , ρ ] + ( Γ_L f_L + Γ_R f_R ) ( d† ρ d − (1/2) { d d† , ρ } )
    + ( Γ_L (1 − f_L) + Γ_R (1 − f_R) ) ( d ρ d† − (1/2) { d†d , ρ } ) ,   (5.61)

where H = ε d†d is the Hamiltonian of the dot with on-site energy ε, d (d†) are fermionic annihilation (creation) operators, Γ_ν are bare tunneling rates, and f_ν = [ e^{β_ν (ε − μ_ν)} + 1 ]^{−1} are the Fermi functions of reservoir ν. Arranging the matrix elements of the density matrix in a vector,

     ( ρ_00  ρ_01 )        ( ρ_00 )
ρ =  ( ρ_10  ρ_11 )   ⇒    ( ρ_11 )  ,   (5.62)
                           ( ρ_01 )
                           ( ρ_10 )

it is straightforward to see that the Lindblad equation can be represented as a superoperator of the form

               ( −Γ_ν f_ν       +Γ_ν (1−f_ν)      0                  0                )
L = Σ_{ν∈{L,R}}( +Γ_ν f_ν       −Γ_ν (1−f_ν)      0                  0                )  .   (5.63)
               (  0               0             −Γ_ν/2 + iε/2        0                )
               (  0               0               0                −Γ_ν/2 − iε/2      )

We observe that here coherences and populations evolve independently, and that the coherences just decay. In fact, the assumptions used in the microscopic derivation of the master equation for this model are not compatible with initial superpositions of empty and filled electronic states, such that we may neglect the coherences completely.

Exercise 50 (Lindblad superoperator) Show that the superoperator (5.63) is consistent with the Lindblad equation (5.61).

Identifying the jump terms, e.g. Γ_R (1 − f_R) d ρ d† for jumps from the dot into the right reservoir and Γ_R f_R d† ρ d for jumps from the right reservoir into the dot, we see that these map to off-diagonal terms in the population part of the Liouvillian,

L_+ = ( 0   Γ_R (1−f_R) )      L_− = (  0         0 )
      ( 0   0           ) ,          ( Γ_R f_R    0 ) ,

L_0 = ( −Γ_L f_L − Γ_R f_R     +Γ_L (1−f_L)                    )
      ( +Γ_L f_L               −Γ_L (1−f_L) − Γ_R (1−f_R)      ) .   (5.64)

Altogether, this leads after Fourier transformation to the counting-field dependent Liouvillian

L(χ) = ( −Γ_L f_L − Γ_R f_R                 +Γ_L (1−f_L) + Γ_R (1−f_R) e^{+iχ} )
       ( +Γ_L f_L + Γ_R f_R e^{−iχ}         −Γ_L (1−f_L) − Γ_R (1−f_R)         ) .   (5.65)
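A numerical sketch of the procedure (Python; the parameter values are arbitrary, and this only checks – it does not replace – the analytic calculation of the following exercise): diagonalize L(χ), pick the eigenvalue with the largest real part, and differentiate at χ = 0.

import numpy as np

GL, GR, fL, fR = 1.0, 0.5, 0.9, 0.1      # arbitrary rates and Fermi functions

def liouvillian(chi):
    """Counting-field dependent Liouvillian of Eq. (5.65)."""
    return np.array([
        [-GL*fL - GR*fR, GL*(1 - fL) + GR*(1 - fR)*np.exp(+1j*chi)],
        [GL*fL + GR*fR*np.exp(-1j*chi), -GL*(1 - fL) - GR*(1 - fR)]])

def dominant_eigenvalue(chi):
    w = np.linalg.eigvals(liouvillian(chi))
    return w[np.argmax(w.real)]          # lambda(chi), with lambda(0) = 0

eps = 1e-6                               # central difference for I = -i dlambda/dchi
current = (-1j * (dominant_eigenvalue(eps) - dominant_eigenvalue(-eps)) / (2*eps)).real
print(current)                           # stationary particle current to the right lead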

Exercise 51 (Current) Calculate the dominant eigenvalue of the Liouvillian (5.65) and determine the time derivative of the first cumulant, I = d⟨⟨n⟩⟩_t/dt = d/dt [ (−i ∂_χ) λ(χ) t ] |_{χ=0} = (−i ∂_χ) λ(χ) |_{χ=0}.

In essence, we have obtained an evolution equation enabling us to calculate the evolution of the joint probability distribution P_{na}(t), describing the probability for having transferred n particles to the right reservoir while the system is in state a ∈ {0, 1}.

5.5 Entropy

The previous transport examples show that the rate matrix decomposes additively with respect to the reservoirs. This is quite generic in the weak-coupling limit between system and reservoir. Furthermore, we note that the ratio of backward and forward rates triggered by reservoir ν (characterized by inverse temperature β_ν and chemical potential μ_ν) yields a simple Boltzmann 7 factor,

L_{ji}^(ν) / L_{ij}^(ν) = e^{ −β_ν [ (ε_j − ε_i) − μ_ν (n_j − n_i) ] } .   (5.66)

Here, L_{ji}^(ν) is the transition rate from state i to state j, and the states are characterized by energies ε_i and particle contents n_i. Indeed, when we compare with Eq. (5.63), we obtain with ε_0 = 0, ε_1 = ε, n_0 = 0, and n_1 = 1 the relation

L_{01}^(ν) / L_{10}^(ν) = Γ_ν (1 − f_ν) / ( Γ_ν f_ν ) = (1 − f_ν) / f_ν = e^{β_ν (ε − μ_ν)} = e^{ −β_ν [ (ε_0 − ε_1) − μ_ν (n_0 − n_1) ] } .   (5.67)

Similarly, we find this principle in the cavity coupled to a thermal bath (5.32),

L_{n,n−1}^(ν) / L_{n−1,n}^(ν) = γ n_B n / ( γ (1 + n_B) n ) = n_B / (1 + n_B) = e^{−β(Ω − μ)} = e^{ −β [ (ε_n − ε_{n−1}) − μ (n_n − n_{n−1}) ] } ,   (5.68)

though there we had set the chemical potential to zero. Eq. (5.66) has interesting consequences for the evolution of the Shannon 8 entropy,

S = −Σ_i P_i ln P_i .   (5.69)

Before we proceed further, we briefly review basic properties of the Shannon entropy.

7 Ludwig Eduard Boltzmann (1844–1906) was an Austrian physicist who reformulated thermodynamics.
8 Claude Elwood Shannon (1916 – 2001) was an American mathematician and engineer. He is the founder of information theory.

• Obviously, we have S ≥ 0, since P_i ≥ 0 and, due to P_i ≤ 1, also −ln P_i ≥ 0. The Shannon entropy vanishes when P_i = δ_{iī} for some ī.

• We always have the side constraint that all probabilities are normalized. Using the method of Lagrange multipliers, i.e., maximizing S′ = S + λ (1 − Σ_i P_i) with respect to P_i and λ, we obtain

P_i^max = 1/M ,   S^max = ln M ,   (5.70)

where M is the total number of states (i ∈ {1,...,M}).

• When we add the side constraint that also the energy should be fixed, ⟨E⟩ = Σ_i E_i P_i, the distribution maximizing the Shannon entropy becomes

P_i^max = e^{−βE_i} / Z(β) ,   S^max = +β ⟨E⟩ + ln Z(β) ,   (5.71)

with the partition function Z(β) = Σ_i e^{−βE_i}.

• With also the side constraint that the total number of particles is fixed, ⟨N⟩ = Σ_i N_i P_i, the Shannon entropy is maximized by

P_i^max = e^{−β(E_i − μN_i)} / Z(β,μ) ,   S^max = β [ ⟨E⟩ − μ ⟨N⟩ ] + ln Z(β,μ) ,   (5.72)

with the partition function Z(β,μ) = Σ_i e^{−β(E_i − μN_i)}.

We see that as more side constraints are added, the entropy is further reduced. For example, for an equidistant energy spectrum labeled by the number of particles, E_n = Ωn, N_n = n, where n is constrained by 0 ≤ n ≤ ∞, one obtains explicitly

Z(β,μ) = 1 / ( 1 − e^{−β(Ω−μ)} ) ,   ⟨E⟩ − μ⟨N⟩ = (Ω − μ) / ( e^{β(Ω−μ)} − 1 ) ,   (5.73)

where we have used that ⟨E⟩ − μ⟨N⟩ = −∂_β ln Z(β,μ). Now, the function x/(e^x − 1) − ln[1 − e^{−x}] is always positive, diverges in the high-temperature limit (x → 0), and vanishes in the low-temperature limit x → ∞.

Using simple algebraic transformations and the local detailed balance property (5.66), it is straightforward to obtain a balance equation

Ṡ = −d/dt Σ_i P_i ln P_i = −Σ_i Ṗ_i ln P_i
  = Σ_{ij} Σ_ν L_{ij}^(ν) P_j ln( P_j / P_i )
  = + Σ_{ij} Σ_ν L_{ij}^(ν) P_j ln( L_{ij}^(ν) P_j / ( L_{ji}^(ν) P_i ) ) + Σ_{ij} Σ_ν L_{ij}^(ν) P_j ln( L_{ji}^(ν) / L_{ij}^(ν) )
  →(t→∞) + Σ_{ij} Σ_ν L_{ij}^(ν) P̄_j ln( L_{ij}^(ν) P̄_j / ( L_{ji}^(ν) P̄_i ) ) + Σ_{ij} Σ_ν L_{ij}^(ν) P̄_j ln( L_{ji}^(ν) / L_{ij}^(ν) ) ,   (5.74)

where the first term is non-negative and the logarithm in the second term equals −β_ν [ (ε_j − ε_i) − μ_ν (n_j − n_i) ].

In the above lines, we have simply used trace conservation, Σ_i L_{ij}^(ν) = 0, and finally the local detailed balance property (5.66). The latter enables us to identify, in the long-term limit, the second term with energy and matter currents: when multiplied by the inverse temperature of the corresponding reservoir, they combine to an entropy flow, which motivates the definition below.

Box 22 (Entropy Flow) For a rate equation of the type (5.1), the entropy flow from reservoir ν is defined as

Ṡ_e^(ν) = + Σ_{ij} L_{ij}^(ν) P̄_j [ −β_ν ( (ε_j − ε_i) − μ_ν (n_j − n_i) ) ]
        = β_ν ( I_E^(ν) − μ_ν I_M^(ν) ) ,   (5.75)

where energy currents I_E^(ν) and matter currents I_M^(ν) associated to reservoir ν count positive when entering the system.

The remaining contribution corresponds to a production term. We note that it is always positive, which can be deduced from the formal similarity to the Kullback 9 - Leibler 10 divergence of two probability distributions or – more directly – using the logarithmic sum inequality.

Exercise 52 (Logarithmic Sum Inequality) Show that for non-negative a_i and b_i

Σ_{i=1}^n a_i ln( a_i / b_i ) ≥ a ln( a / b )

with a = Σ_i a_i and b = Σ_i b_i.

Its positivity is perfectly consistent with the second law of thermodynamics, and we therefore identify the remaining contribution as the entropy production.

Box 23 (Entropy Production) For a rate equation of the type (5.1), the average entropy pro- duction is defined as

Ṡ_i = Σ_{ij} Σ_ν L_{ij}^(ν) P̄_j ln( L_{ij}^(ν) P̄_j / ( L_{ji}^(ν) P̄_i ) ) ≥ 0 .   (5.76)

It is always positive and at steady state balanced by the entropy flow.

When the dimension of the system's Hilbert space is finite and the rate equation (5.1) approaches a stationary state, its Shannon entropy will also approach a constant value, Ṡ = 0.

9 Solomon Kullback (1907 – 1994) was an American cryptanalyst.
10 Richard Leibler (1914 – 2003) was an American cryptanalyst.

Therefore, at steady state the entropy production in the system must be balanced by the entropy flow through its terminals

Ṡ_i = −Ṡ_e = −Σ_ν β_ν ( I_E^(ν) − μ_ν I_M^(ν) ) .   (5.77)

The above formula conveniently relates the entropy production to energy and matter currents from the terminals into the system. Evidently, the entropy production is thus related to the heat currents Q̇^(ν) = I_E^(ν) − μ_ν I_M^(ν), which can be determined from a master equation by means of full counting statistics. We note here that identifying the entropy production in a system is no purely academic exercise: in the long-term limit it is additive in the respective entropy flows, and their identification allows e.g. for the definition of thermodynamically meaningful (and bounded) efficiencies of thermoelectric nano-scale devices.
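As a closing numerical illustration (a sketch with arbitrary parameters; the two-state stationary solution used below is elementary), the entropy production (5.77) of the single-electron transistor of Sec. 5.4 can be evaluated from the stationary state and the reservoir-resolved currents:

import numpy as np

eps_d = 1.0                                  # dot level; all parameter values arbitrary
GL, GR = 1.0, 0.5
bL, bR, muL, muR = 1.0, 2.0, 0.5, -0.5       # inverse temperatures, chemical potentials

fL = 1.0 / (np.exp(bL * (eps_d - muL)) + 1.0)
fR = 1.0 / (np.exp(bR * (eps_d - muR)) + 1.0)

# stationary populations of the two-state rate equation behind Eq. (5.63)
rate_in = GL * fL + GR * fR                  # empty -> filled
rate_out = GL * (1 - fL) + GR * (1 - fR)     # filled -> empty
P0 = rate_out / (rate_in + rate_out)
P1 = 1.0 - P0

S_i = 0.0
for G, f, b, mu in [(GL, fL, bL, muL), (GR, fR, bR, muR)]:
    IM = G * f * P0 - G * (1 - f) * P1       # matter current into the system
    IE = eps_d * IM                          # each electron carries energy eps_d
    S_i -= b * (IE - mu * IM)                # Eq. (5.77)
print(S_i)                                   # non-negative, in line with Eq. (5.76)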