<<

1 Introduction

1.1 Terminology and classification of PDEs

A differential eqation is an equation satisfied by a function u, which involves besides u its derivates as well. If u depends only on one variable, e. g., the time t, we call the equation ordinary differential equation (ODE). n If it depends on several variables, e. g., on z = (z1, z2, ..., zn) ∈ R , then there occur partial derivatives

∂u(z) ∂iu(z) := , ∂xi and we call the equation partial differential equation (PDE). n A general second-order, linear PDE for a function u(z), z ∈ R is

n n X X − ∂iaij(z)∂ju + bi(z)∂iu + c(z)u = f(z), (1.1) i,j=1 i=1

In case aij(z), bi(z) and c(z) are independent of z, we have a PDE with constant coeffi- cients. d Note, that for time-dependent problems in R we have z = (x, t) and n = d + 1 where d x ∈ R . For time-independent problems z = x. n If we are searching for u defined in the open set O ⊂ R then we need the regularity 2 1 0 assumption u ∈ C (O), aij ∈ C (O), bi, c, f ∈ C (O) such that all derivatives exist in classical sense. These regularity assumptions will be reduced later when we will formulate the PDE in weak sense. 2 For u ∈ C (O) the partial derivates can be switched, i. e., ∂i∂j = ∂j∂i. If aij are not symmetric we can symmetrize the coefficients by

new orig orig aij := (aij + aji )/2, and by adjusting the remaining coefficients bi such that the form of the PDE remains (1.1). n So, we can assume that the coefficient A(z) = {aij(z)}i,j=1 is symmetric. PDEs are classified into

• In elliptic equations an incident at a point z influences all point in their neigh- bourhood.

5 • In parabolic equations there is in one direction from a point z for which the influence is only for larger values. This is often the time direction, where only later times are influenced.

• In hyberbolic equations there are areas of directions around a point z which are influenced.

The principal part (Hauptteil) of the PDE

n X − ∂iaij(z)∂ju i,j=1 is mainly responsible for their classification.

Definition 1.1 (Classification of second-order PDEs). Consider a second-order, linear PDE of the form (1.1) with the symmetric coefficient matrix A(z).

1. The equation is said to be elliptic at z ∈ O if the eigenvalues of A(z) are all non-zero and of same sign, so e. g., for positive-definite A(z).

2. The equation is said to be parabolic at z ∈ O if one eigenvalue of A(z) vanishes whereas the others are all non-zero and of same sign, e. g., A(z) is positive semi- definite, but not positive definite, and the of (A(z), b(z)) is equal to n.

3. The equation is said to be hyberbolic at z ∈ O if A(z) has only non-zero eigenval- ues, whereas n − 1 of one sign and one of the other sign.

A partial differential equation (1.1) is said to be elliptic, parabolic or hyberbolic in a set O of points z, if it has the property for all z ∈ O.

The classification covers most (linear) physical models , but is not complete.

Remark 1.2. Do not mix second-order elliptic PDEs with elliptic bilinear forms, which we will discuss later.

1.2 Partial differential equations as mathematical models

A mathematical model is the description of a system using mathematical language. Mathematical models are obtained by a combination of first principles and constitutive equations. First principles are law of nature like conservative equations, actio = reactio, etc. They are based on basic assumptions. Constitutive equations represent material properties and are based often on measure- ments. They include empirical parameters.

6 1.2.1 Heat transfer equation We have as first principle the convervation of energy which correspond in absence of work to the conservation of temperature ∂u (x, t) + div j(x, t) = f(x, t) (1.2) ∂t where u → temperature [u] = 1K W j → heat flux [j] = 1 m2 W f → heat source/sink [f] = 1 m3 and as constitutive equation Fourier’s law

j(x, t) = −A(x) grad u(x, t), (1.3) saying, that the heat flux is proportional to the temperature gradient. The matrix A represents the heat conductivity (Leitf¨ahigkeitsmatrix). We call the material

homogen : A(x) = A, isotrop : A(x) = α(x) · 1, and inhomogen or anisotrop otherwise. α(x) is the conductivity. The heat equation is parabolic if A is positive definite. For a unique solution we have to complete the system by initial conditions

u(x, 0) = u0(x), and boundary conditions. We may describe the temperature on the boundary of the domain of interest Ω

u(x, t) = g(x, t) for x ∈ ∂Ω or the heat flux

j(x, t) · n(x) = h(x, t) for x ∈ ∂Ω.

Here, n(x) is the normalised outer normal vector.

1.2.2 Diffusion and Poisson equation ∂u The static limit of the heat equation, i. e., with ∂t = 0 in (1.3), we get the elliptic system

j = −A(x) grad u , (FL) div j = f, (EL) which has to be equipped with the above boundary conditions. The same system arises in

7 • Electrostatics: u → electric potential [u] = 1V As j → displacement current (D)[j] = 1 m2 As f → charge density (ρ)[f] = 1 m3 Here, A stands for the dielectric tensor, which is usually designated by . The relationship (EL) is Gauss’ law, and (FL) arises from Faraday’s law curl E = 0 and the linear constitutive law D = E. • Stationary electric currents: u → electric potential [u] = 1V A j → electric current [j] = 1 m2 Inside conductors, where tensor A represents the conductivity. The source term f usually vanishes and excitation is solely provided by non-homogeneous boundary conditions. In this context (FL) arises from Ohm’s and Amp`erescircuit law and (EL) is a consequence of the conservation of charge. A similar elliptic system is j = −A(x) grad u , (FL) div j = f − c(x)u, (EL) which appear in mol • Molecular diffusion: u → concentration [u] = 1 m3 mol j → flux [j] = 1 m2s mol f → production/consumption rate [f] = 1 m3s Here A stands for the diffusion constant and, if non-zero, c denotes a so-called reaction coefficient. The equation (EL) guarantees the conservation of total mass of the relevant species.

1.2.3 Wave equation For the vertical displacement of a thin membran can be modelled by the first principle, mass times acceleration is the force, ∂2u m(x) (x, t) = div σ(x, t) + f(x, t), ∂t2 and Hook’s law as constitutive equation σ(x, t) = A(x) grad u(x, t). Here, we have u → vertical displacement [u] = 1m σ → stress vector [σ] = 1J f → external force [f] = 1N grad u → (local) deformation of the membran ∂u ∂t → (local) speed of the membran ∂2u ∂t2 → (local) acceleration kg and the density of mass m(x) ([m] = 1 m2 ).

8 Equation Modelled system Classification Advection equation Transport of particles in a flow. Hyberbolic Navier-Stokes equations Flow of fluids. Hyberbolic Euler Flow of incompressible fluids. Hyberbolic Shallow water equations Flow in shallow domains (earth atmosphere). Hyberbolic Stokes equations Static limit of flow of viscous flows. Elliptic Elasticity equations Stresses in elastic bodies. Elliptic Maxwell’s equations Electromagnetic fields. Hyberbolic Time-harmonic Maxwell’s equations Electromagnetic fields. Elliptic Eddy current modell Electromagnetic fields in low frequency. Elliptic Schr¨odinger’sequation Electron structure of matters. Elliptic Black-Scholes equation Prices of stocks. Parabolic

Table 1.1: Some examples for systems modelled by PDEs.

The second-order PDE is ∂2u m(x) (x, t) = div(A(x) grad u) + f, (1.4) ∂t2 and for homogeneous isotrope material with A = 1 we have ∂2u m(x) (x, t) = ∆u + f, ∂t2 which is hyperbolic. We have to complete the PDE with initial conditions

u(x, 0) = u0(x), and boundary conditions. We may describe the displacement on the boundary of the membran u(x, t) = g(x, t) for x ∈ ∂Ω or the boundary stress σ(x, t) · n(x) = h(x, t) for x ∈ ∂Ω.

1.2.4 Time-harmonic wave-equation −i ωt −i ωt −i ωt Let f = Re(fbe ) as well as g = Re(gbe ), h = Re(bh e ) for some (angular) + frequency ω ∈ R . With the independance of the coefficient function of t we can write −i ωt u = Re(ube ) and insertion into (1.4) leads to the time-harmonic wave-equation 2 − div(A(x) grad ub) − ω m(x)ub = f,b (1.5) which is a linear elliptic PDE of second order. The boundary condition transfer to ub where g and h are replaced by gb and bh. Remark 1.3. Linear parabolic or hyberbolic PDEs of second-order get elliptic in their static limit or in their time-harmonic form.

9 1.2.5 Further models Further systems modelled by PDEs are given in Table 1.1.

1.3 Well-posedness

Definition 1.4 (Hadamard’s well-posedness). A problem is said to be well-posed (kor- rekt gestellt) if

1. it has a unique solution u,

2. the solution depends continuously on the given data f, i. e.,

kuk ≤ Ckfk with a constant independant of u and f. Otherwise the problem is ill-posed.

Corollary 1.5. Small pertubations in the data of linear well-posed problems leads to small pertubations of the solution.

Proof. Let fe = f + δf with kδfk  1 (very small). Then ue is the solution of the system with data fe and δu = ue − u the solution of the system with data δf. So kδuk ≤ Ckδfk  1.

Example 1.6 (Ill-posed problem). Let Ω = [0, 1]×[0, ∞) and u the solution of the PDE

∆u = 0 1 u(x , 0) = sin(nx ) 1 n 1 ∂2u(x1, 0) = 0, i. e., instead of boundary conditions on all boundaries we prescibe “initial condition” on the boundary x2 = 0. For n → ∞ the boundary data gets smaller and smaller, whereas the solution 1 u(x , x ) = sin(nx ) cosh(nx ) 1 2 n 1 2 explodes for a fixed x2 > 0.

10 1.4 Some norms and spaces

Repeating the definition of a and a .

Definition 1.7 (Vector space). A vector space over R or C is a set X, whose elements are called vectors, for which the operations addition and multiplication are defined, and for any vectors x, y ∈ X and λ ∈ R or C it holds

(i) x + y ∈ X, (ii) λx ∈ X, (iii) 0 ∈ X.

Definition 1.8 (Norm). Let X be a real (or complex) vector space. We call k·k : X → R a norm on X if

kxk = 0 ⇐⇒ x = 0, (definiteness), (N1) kxk ≥ 0 ∀x ∈ X, (N2) kλxk = |λ|kxk ∀λ ∈ R(or λ ∈ C), (homogeneity), (N3) kx + yk ≤ kxk + kyk ∀x, y ∈ X, (triangle inequality). (N4)

11 2 Finite differences scheme for second-order elliptic PDEs

2.1 The strong formulation

d We consider the problem the connected bounded open set Ω ⊂ R with boundary ∂Ω and the PDE with f ∈ C(Ω), g ∈ C(∂Ω), 0 ≤ c ∈ C(Ω)

Lu = −∆u + cu = f in Ω, (2.1a) u = g on ∂Ω. (2.1b)

Lemma 2.1 (Basic Maximum (minimum) principle for c = 0). Let u ∈ C2(Ω) ∩ C(Ω) be solution of (2.1) with c = 0. If f ≤ 0 (f ≥ 0) in Ω, then the maximum (minimum) of u in Ω is attained on the boundary ∂Ω. Furthermore, if the maximum (minimum) is attained at an interior point of Ω, then the function u is constant.

Proof. First, we carry out the proof for the stronger assumption f < 0 in Ω. Suppose that there exists some xe ∈ Ω (a maximum point in the interior) such that

u(xe) = sup u(x) > sup u(x). (2.2) x∈Ω x∈∂Ω

With

d 2 (2.1a) X ∂ u 0 > f(x) = (Lu)(x) = −∆u(x) = − (x) e e ∂x2 e i=1 i

12 which is a contradiction to (2.2), as for a maximum point for all 1 ≤ i ≤ d

∂2u 2 (xe) ≤ 0. ∂xi

Now, let f ≤ 0, and xe is again the maximum of u in the interior and (2.2) holds. As the u on the boundary is smaller than the maximal value we can find a sufficiently small β > 0 such that the function

d X 2 w = u + β (xi − xei) i=1 attains its maximum at an interior point x0. Since

Lw = fw = f − dβ < f the maximum of w cannot be attained at an interior point, and we have a contradiction as well, i. e.,(2.2) does not hold and u attains its maximum on the boundary. Let the maximum is denoted by M = maxx∈∂Ω. There might be still points xe in the interior with u(xe) = M. Assume that u is not constant in Ω. So there exists a ball B ⊂ Ω of positive radius R and mid-point x0 with supx∈B2 u(x) < M for any proper subset B2 ( B, but with a boundary point xe ∈ ∂B for which u(xe) = M. Let G := B\B(x0,R/2) the open bounded domain with outer boundary Γe and inner boundary Γ. Let furthermore

2 v(x) := e−λ|x−x0| > 0 with λ > 0 large enough such that

2 −λ|x−x0| Lv = λ(d − λ|x − x0|)e < 0 for all x ∈ G. Note, that

v(x) = 0 on Γe, (2.3) and so

wε := u + εv coincides with u on Γ.e Since supx∈Γ u(x) < M by assumption we can choose a ε > 0 so small such that

wε(x) − wε(xe) = wε(x) − M < 0 in G. (2.4)

Since Lwε = f + εLv < 0 the maximum of wε in G is on ∂G, which is by (2.3) and (2.4) only attained in xe. Hence,

grad wε(xe) = 0

13 and as grad v(xe) =6 0 it holds

grad u(xe) =6 0, i. e., xe is not a maximum point which contradicts the assumptions. So there exists no interior maximum point if u is not constant. For f ≥ 0 we consider ue := −u and apply the previous steps. Example 2.2. Consider an open bounded set Ω = (−1, 1)2 and the Poisson equation

−∆u = −4 < 0 in Ω,

2 2 which is solved by the function u = u(x1, x2) = x1 + x2 with appropiate boundary condi- tion. Since f ≤ 0 in Ω the solution u attains its maximum on the boundary ∂Ω. This is indeed true.

Remark 2.3 (Non-local influence of sources). Let f ≤ 0 and g = 0. As the maximum of the solution u of (2.1) is attained on the boundary, there exists no subdomain G ⊂ Ω with |G| > 0 where u ≡ 0. Any point in the domain Ω “feels” the source. Lemma 2.4 (Basic Maximum (minimum) principle for c ≥ 0). Let u ∈ C2(Ω) ∩ C(Ω) be solution of (2.1) with c ≥ 0. Let furthermore f ≤ 0 (f ≥ 0) in Ω. Then if u has a non-negative maximum (non-positive minimum) in Ω is attained on the boundary ∂Ω. Furthermore, if the positive maximum (negative minimum) is attained at an interior point of Ω, then the function u is constant. Corollary 2.5 (Comparison principle). Let u, v ∈ C2(Ω) ∩ C(Ω) solve the equations Lu = fu and Lv = fv, respectively, and

fu ≤ fv in Ω u ≤ v on ∂Ω.

Then, u ≤ v in Ω.

14 Proof. Let w := v − u and so Lw = fv − fu ≤ 0. If the minimum of w in Ω is positive, then w is positive in Ω and the corollary is true. If the minimum of w in Ω is not positive, then by the minimum principle (Lemma 2.4) it is attained on boundary. But w ≥ 0 on the boundary, i. e., the minimal value is not below 0, and so w ≥ 0 in Ω.

Corollary 2.6 (Uniqueness). Let u a solution of the boundary value problem (2.1). Then u is unique. Proof. Assume that v =6 u solves (2.1). Then, w := v − u solves (2.1) with f ≡ 0 and g ≡ 0. Applying the comparison principle we conclude that 0 ≥ w ≥ 0 which contradicts the assumption.

Definition 2.7 (Classical solution). If u ∈ C2(Ω) ∩ C(Ω) satisfy (2.1) (or even (1.1)) in a pointwise sense, and the prescribed boundary conditions, then these functions are called a classical solution of the boundary value problem. In general there does not exist a classical solution. A special case with a classical solution is that of the second-order elliptic PDE with smooth boundary and constant coefficient functions, e. g.,(2.1). Here, we can make even a statement about the regularity of the solution. k ∞ Lemma 2.8 (Elliptic shift theorem). Let f ∈ C (Ω), k ∈ N0, g ∈ C (∂Ω) and ∂Ω ∈ C∞. Let u the unique solution of (2.1). Then u ∈ Ck+2(Ω). Lemma 2.9 (Continuous extension theorem). Let u be the solution of (2.1) and u ≡ 0 in the open bounded subset G ⊂ Ω. Then, u ≡ 0 in Ω. Proof. See [2].

2.2 Approximation by the Finite Difference Method (FDM)

If the solution is smooth enough we can replace the derivatives by difference quotients.

2.2.1 Difference quotients n+1 Let h > 0 and x ∈ R. Let I = [x − h, x + h], u ∈ C for some n ≥ 0. Taylor’s theorem gives 2 3 n 0 h 00 h 000 (±h) (n) u(x ± h) = u(x) ± hu (x) + u (x) ± u (x) + ... + u (x) + Rn(u; x, h) 2 3! n! with the remainder Z x±h n+1 1 n (n+1) (±h) (n+1) Rn(u; x, h) = (x − t) u (t) dt = u (ξ) for some ξ ∈ I. n! x (n + 1)! The forward difference for u ∈ C3(I) is u(x + h) − u(x) h (D+u)(x) := = u0(x) + u00(x) + O(h2), h 2

15 whereas the backward difference is given by

u(x) − u(x − h) h (D−u)(x) := = u0(x) − u00(x) + O(h2). h 2

The central difference for u ∈ C5(I) is

u(x + h) − u(x − h) h2 (D0)(x) := = u0(x) + u000(x) + O(h4). 2h 3! For the second derivative u00(x) we get for u ∈ C4(I):

u(x + h) − 2u(x) + u(x − h) (D+D−u)(x) := = u00(x) + O(h2). h2 In two dimensions we have for u ∈ C4([−h, h] × [−h, h] + x) the five-point-stencil for the Laplacian

u(x1 + h, x2) + u(x1 − h, x2) + u(x1, x2 + h) + u(x1, x2 − h) − 4u(x1, x2) (∆u)(x) = +O(h2). h2 | {z } =: Λu

Let us also specify the non-equidistant finite difference for u ∈ C4(I)

2 u(x + h ) − u(x) u(x) − u(x − h ) h − h u00(x) = R − L − R L u000(x) + O(h2). hL + hR hR hL 3

The Laplacian can be approximated for u ∈ C3([−h, h] × [−h, h] + x)   2 u(x1 + h , x2) − u(x1, x2) u(x1, x2) − u(x1 − h , x2) (∆u)(x) = R − L hL + hR hR hL   2 u(x1, x2 + h ) − u(x1, x2) u(x1, x2) − u(x1, x2 − h ) + T − B + O(h). hB + hT hT hB

∗ We call the difference quotient Λ u, which generalises Λu (special case h = hL = hR = hT = hB).

2.3 The Finite Difference method

The PDE (2.1) is approximated on a uniform grid

Th := {x = (jh, ih): x ∈ Ω}, (2.5) whereas the boundary has the grid

Gh := {x = (jh, x2) or x = (x1, ih) for some x1, x2 ∈ R : x ∈ ∂Ω}.

16 Figure 2.1: Grid Th for a computational domain Ω and boundary grid Gh.

The union is T h := Th ∪ Gh (see Fig. 2.1), and |T h| its cardinality. −2 −1 The cardinalities of Th and Gh behave asymptotically like |Th| ∼ h and |Gh| ∼ h . We call a grid function a function vh ∈ T h. We ask the finite difference approximant uh to u to fulfill

(Lhuh)(x) = f(x) for all x ∈ Th, (2.6a)

uh(x) = g(x) for all x ∈ Gh, (2.6b) with

∗ (Lhuh)(x) := −Λ uh(x) + cuh(x),

∗ and Λ meaning the general difference quotient for ∆ with h ≥ hL, hR, hT , hB > 0 the > > > > minimal values such that x − (hL, 0) , x + (hR, 0) , x + (0, hT ) , x − (0, hB) ∈ T h.

2.3.1 Existence and uniqueness

Lemma 2.10 (Discrete maximum principle). Let f ≤ 0 (f ≥ 0) on Th. If the maximum (minimum) of the solution uh of (2.6) is non-negative (non-positive) it is attained on Gh.

Proof. Let Lhuh ≤ 0 for all x ∈ Th, and xe ∈ Th such that

uh(xe) = max uh(x) x∈T h and uh(xe) ≥ 0.

17 As consequence of the fact that xe is a maximum and an interior point it is   2 u (x1 + h , x2) − u (x1, x2) u (x1, x2) − u (x1 − h , x2) h e R e h e e − h e e h e L e ≤ 0, hL + hR hR hL   2 u (x1, x2 + h ) − u (x1, x2) u (x1, x2) − u (x1, x2 − h ) h e e T h e e − h e e h e e B ≤ 0, hB + hT hT hB and so

∗ 0 ≥ (Lhuh)(xe) = −(Λ uh)(xe) + c(xe)uh(xe) ≥ c(xe)uh(xe) ≥ 0

So, it remains only (Lhuh)(xe) = 0, which is possible for

c(xe) = 0 and uh(x) = uh(xe) or c(xe) > 0 and uh(x) = uh(xe) = 0 for all points x in the stencil of xe. > If xe + (hR, 0) ∈ GT the statement of the lemma is proved, otherwise we repeat the > previous steps for xe + (hR, 0) instead of xe until the (right) boundary of the domain is reached, and the lemma holds. For f ≥ 0 we consider ueh := −uh and apply the previous steps.

Corollary 2.11. The only solution of (2.6) with f ≡ 0 on Th and g ≡ 0 on Gh is uh ≡ 0.

Proof. As f ≤ 0 there can be only a non-negative maximum of uh on the boundary. In view of the boundary conditions uh ≤ 0. Vice-versa with f ≥ 0 we have uh ≥ 0 and the proof is complete.

Corollary 2.12 (Existence and uniqueness). The system of equations (2.6) provides a unique solution uh.

Proof. We have shown in Corollary 2.11 that the null space of the FDM matrix is empty, and as the matrix is finite-dimensional the range is the full and a solutions exists for any f and g.

Lemma 2.13 (Discrete comparison principle). Let uh,1 and uh,2 the solutions of (2.6) with f = f1, g = g1 or f = f2, g = g2, respectively. If for the data it hold the inequalities

|f1(x)| ≤ f2(x) ∀x ∈ Th,

|g1(x)| ≤ g2(x) ∀x ∈ Gh.

Then,

|uh,1(x)| ≤ uh,2(x) ∀x ∈ T h.

18 Proof. Let vh := uh,1 − uh,2, then

Lhvh = f1 − f2 ≤ 0 on Th,

vh = g1 − g2 ≤ 0 on Gh.

If there is a non-negative maximum, it is attained on Gh, and so vh(x) ≤ 0 for all x ∈ T h. Otherwise, the maximum is negative and so even vh(x) < 0 for all x ∈ T h. Repeating the same steps with wh := −uh,1 − uh,2 finishes the proof.

2.3.2 Convergence The finite difference method provides approximation on grid points only, therefore we can also only use norms on the grid to measure the error. Grid functions are measured by the `∞ norm

kvhk∞ := max |vh(x)| x∈T h or the weighted `2 norm defined by 1/2 1  X 2 kvhk2,h := p |vh(x)| ∼ hkvhk2. |Th| x∈T h The weight is to make the norm independent of h, so that we can compare different mesh widths. For example, the constant function 1 has same norm on meshes T h1 and

T h2 even if h1 =6 h2. Note, that this holds also true for meshes of different domains in 2 difference to the L -norm. We use the same norms also on Th. By the Cauchy-Schwarz inequality we have for any grid function vh

kvhk2,h ≤ kvhk∞. (2.7)

0 Let v ∈ C (Ω). Then, we call the grid function (Rhv) ∈ T h its restriction to T h. With the consistency error we measure the accuracy if the exact solution would be inserted in the numerical scheme. Definition 2.14 (Consistency error). Let u be the solution of (2.1). The consistency error is defined as

kLuh − Lhuk = kf − Lhuk with k · k some norm on the mesh Th. The consistency error is of order k if for h → 0 it can be bounded by C hk for some constant C > 0. Lemma 2.15 (Estimate of the consistency error). Let u ∈ C3+`(Ω) with ` = 0, 1 the solution of (2.1). There exist two constants C∞,C2 > 0 independent of h such that on grids Th the consistency error can be bounded as

kf − Lhuk∞ ≤ C∞h, 1+`/2 kf − Lhuk2,h ≤ C2 h .

19 Proof. Since for any x ∈ Th

∗ f − Lhu = −∆u + Λ u, we have to bound the error of the finite difference approximation of the Laplacian. The non-equidistant finite difference quotient Λ∗, which is needed close to the bound- ary, approximates the Laplacian point-wise with O(h), and dominates so the consistency error in the `∞ norm. For u ∈ C4(Ω) we have on O(h−2) points with the difference quotient Λ an point-wise error of O(h2) and on O(h−1) points with the difference quotient Λ∗ a point-wise error of O(h) and so

∗ 2 2 −2 4 −1 2  3 k∆u − Λ uk2,h ≤ O(h ) · O(h )O(h ) + O(h )O(h ) = O(h ).

The error on the boundary dominates again. If u ∈ C3(Ω) the pointwise error is anyway O(h) and we loose half an order.

As the numerical scheme works with finite precision we have to consider the stability of the scheme, i. e., the influence of small pertubation of the data to the solution, as well.

Lemma 2.16 (Stability of the FDM in the `∞-norm). There exists a constant C > 0 only depending on the size of Ω such that for the solution uh of (2.6) it holds

diam(Ω)2 ku k∞ ≤ kfk∞ + kgk∞. h 4

Proof. Without loss of generality let 0 ∈ Ω and let R = diam(Ω). Then, for any constants α, β > 0 the grid function

2 2 2 vh = α(R − x1 − x2) + β (2.8)

∗ is positive for all x ∈ T h. Inserting (2.8) into Λ we have

∗ ∗ 2 2 (Λ vh)(x) = −Λ (x1 + x2)   2 2 2 2  2 (x1 + hR) − x1 x1 − (x1 − hL) = −α 2 − 2 hL + hR hR hL  2 2 2 2   2 (x2 + hT ) − x2 x2 − (x2 − hB) + 2 − 2 hT + hB hT hB = −4 α < 0.

Thus, vh is by Corollary 2.11 the unique solution of

2 2 2  (Lhvh)(x) = α 4 + c(x)(R − x1 − x2) + β =: fe > 0 on Th, 2 2 2 vh(x) = α(R − x1 − x2) + β =: ge > 0 on Gh.

20 1 For α := 4 kfk∞ and β := kgk∞ we have

|f(x)| ≤ fe(x) ∀x ∈ Th, |g(x)| ≤ ge(x) ∀x ∈ Gh. By Lemma 2.13 we have the inequality

|uh(x)| ≤ vh(x) ∀x ∈ T h, and so

2 2 R ku k∞ ≤ kv k∞ ≤ αR + β = kfk∞ + kgk∞. h h 4

Lemma 2.17 (Stability of the FDM in the weighted `2-norm). Let g ≡ 0 and Th a uniform grid. There exists a constant C > 0 only depending on the size of Ω such that for the solution uh of (2.6) it holds diam(Ω)2 ku k ≤ kfk . h 2,h 2π2 2,h

Proof. As both grid functions uh and f have same length it suffices to show the lemma directly for the `2 norm. Since uh ≡ 0 on Gh by (2.6b), it remains the linear system of equations (2.6a) with N = |Th| unknowns (without values on the boundary) and N equations:

Auh = f, where we give to the points in Th an order. The matrix A is symmetric (for uniform grid Th), and so its eigenvalues are real. The eigenvalues of A are in fact even positive. Assume there exists a λ ≤ 0 such that

(A − λ)vh = 0, where vh is the corresponding eigenvector and with the same notation grid function. Then extending it with zero onto Gh the grid function vh is unique solution of

(Lh,λvh)(x) := ((Lh − λ)vh)(x) = 0 for all x ∈ Th,

vh(x) = 0 for all x ∈ Gh,

? the with Lh,λ = −Λ +(c−λ). By uniqueness vh vanishes, is consequently no eigenvector and we have only positive eigenvalues. If the inverse of A exists, it holds (without absolute value)

−1 −1 −1 kA k2 = λmax(A ) = max λk(A ). 1≤k≤N

21 For the corresponding eigenvectors vh,k it is

−1 −1 −1 −1 A vh,k = λk(A )vh,k ⇔ λk (A )vh,k = Avh,k,

−1 −1 i. e., λk(A) = λk (A ) and so

−1 −1 1 kA k2 = max λk (A) = , 1≤k≤N λh,1 where we assume an order of the eigenvalues of A. The discrete eigenvalues λh,k are upper bounds to the eigenvalues λk of the continuous problem in Ω which can be proved by the minimax principle for variational problems using piecewise-linear eigenfunctions as shown for example by Kuttler [3]. With the minimax principle it can been shown as well that the continuous eigenvalues increase if the problem is stated in a subdomain. So, the lowest discrete eigenvalue λh,1 can be bounded from below by the lowest continuous eigenvalue λ1 in the domain Ω, which can be bounded from below by the lowest continuous eigenvalue in the square do- main with side lengths diam(Ω), which is larger than 2π2/(diam(Ω))2 (the corresponding eigenfunction for c = 0 is sin(πx1/R) sin(πx2/R)). Remark 2.18. In case of non-uniform grids we can rely at least on the bound diam(Ω)2 ku k ≤ kfk∞, h 2,h 4 which is consequence of Lemma 2.16 and (2.7). Remark 2.19 (Minimax principle). For the variational eigenvalue problem a(u, v) = λ hu, vi with the symmetric positive-semidefinite bilinear form a the k-eigenvalue is a(u, u) λk = min max 2 . Hk⊂H u∈Hk kuk dim(Hk)=k The finite difference method is stable and so if consistent it convergences for h → 0. Lemma 2.20 (Convergence of the finite difference method). Let u ∈ C3+`(Ω), ` = 0, 1 the solution of (2.1), and uh the solution of (2.6). Then, there exists C > 0 such that

1+`/2 kuh − uk∞ ≤ C h, kuh − uk2,h ≤ C h . Proof. We have the systems

(Lhuh)(x) = f(x) for all x ∈ Th (Lhu)(x) = (Lhu)(x) for all x ∈ Th

uh(x) = g(x) for all x ∈ Gh u(x) = g(x) for all x ∈ Gh, and so the grid function uh − u is unique solution of

(Lh(uh − u))(x) = (f − Lhu)(x) for all x ∈ Th

(uh − u)(x) = 0 for all x ∈ Gh.

22 With the stability estimates in Lemma 2.16 and 2.17, and the estimates of the consistency error we conclude in

kuh − uk∞ ≤ C(Ω)kf − Lhuk∞ ≤ C h, 1+`/2 kuh − uk2,h ≤ C(Ω)kf − Lhuk2,h ≤ C h .

2.3.3 Implementational issues The representation of (2.6) with separation of finite differences and boundary approxi- mation is advantageously for the theory. However, the size of the system can be easily reduced by writing only a system for the internal unknown uh on Th, this is by ommit- ing (2.6b) and by moving the known values of uh on Gh to the right hand side of (2.6a). For example in 1D with the step sizes hL, h, . . . , h, hR and c ≡ 0 we have instead of the linear system

 2h−1 2(h−1+h−1) 2h−1  − L L − L ··   hL+h hL+h hL+h f   1  −2 −2 −2     · −h 2h −h ···  uL  f2     .   ......   u1   .   . . .     .  −2 −2 −2  .     · · · −h 2h −h ·   .  = fN−1        2h−1 2(h−1+h−1) 2h−1  u f  R R R   N   N  · · · · − h +h h +h − h +h    R R R  uR  gL   1 ······    gR ······ 1 the reduced one

−1 −1  −1  −1 2(hL +h ) 2hL   − · 2hL hL+h hL+h f1 + gL   hL+h  −2 −2 −2       −h 2h −h ··  u1  f2    .  .   ......   .  =  .  .  . . .   .   .  −2 −2 −2    · · · −h 2h −h  uN  fN−1    −1 −1 −1    2h 2(h +h−1)  2hR R R fN + gR · · · · − hR+h hR+h hR+h The system becomes in case of uniform grids symmetric, where linear solvers for sym- matrices, like conjugate gradients (CG), could be used. For writing down a linear system of equations we have to order the unknowns and equations. This is in one space straight forward (see last example). In two space dimensions often rectangular domains are used in practise, i. e., 0 ≤ j < N1, 0 ≤ i < N2, and global index is

I(i, j) = iN1 + j.

23 More general grids (for example that in Fig. (2.1)) may we cut into layers with jmin(i) ≤ j ≤ jmax(i), 0 ≤ i < N2, and the global index is defined recursively

I(0, jmin(0)) = 0,

I(i, jmin(i)) = I(i − 1, jmax(i − 1)) + 1,

I(i, j) = I(i, jmin(i)) + j − jmin(i).

2.3.4 Discussion Pros • The FDM is easy to implement. • The system matrices are sparse with O(h−d) non-zero entries out of O(h−2d) entries which are even locally repeating in almost the whole domain.

• Convergent solutions in the `∞ and weighted `2 norm.

Contras • Limitation of the application to C2 continuous solutions (of the strong problem). • Limitations of the grid to simple geometries and continuous material function σ(x) in case of a div σ(x) grad operator. • The system matrices are for non-uniform grids not symmetric. • Only point-wise approximation.

Extension • The FDM can be extended to higher orders by finite difference approximation of higher order which leads to larger stencils and so denser matrices. The FDM of higher orders convergence with a higher rate if the solution has higher regularity. • C2 continuous approximations can be obtained by interpolating the solution, e. g., with cubic B-splines [4] (small support) or cubic Z-splines [5] (derivatives on grid points coincides with finite differences). A cubic B-spline function for equidistant points (see Fig. 2.2(a)) is given by  2 3 1  1 − 6x − 6|x| for |x| ≤ 2 ,  3 1 S(x) = 2(1 − |x|) for 2 ≤ |x| ≤ 1,  0 for 1 ≤ |x|. The cubic Z-spline basis function for equidistant points (see Fig. 2.2(a)) is given by  5 2 3 3  1 − 2 x + 2 |x| for |x| ≤ 1,  1 2 Z2(x) = 2 (2 − |x|) (1 − |x|) for 1 ≤ |x| ≤ 2,  0 for 2 ≤ |x|.

24 1 1

0.8 0.8

0.6 0.6 0.4 0.4 0.2 0.2 0 0 (a) −3 −2 −1 0 1 2 3 (b) −3 −2 −1 0 1 2 3

Figure 2.2: Cubic B-spline (a) and Z-spline (b) basis functions for equidistant points.

Level 0 Level 1 Level 2 Level 3 Figure 2.3: Finite difference grids with two materials, one with a circular shape.

References

[1] Lawrence C. Evans. Partial Differential Equations, volume 19 of Graduate Studies in Mathematics. American Mathematical Society, 1998. [2] William McLean. Strongly Elliptic Systems and Boundary Integral Equations. Cambridge University Press, 2000. [3] J. R. Kuttler and V. G. Sigillito. Eigenvalues of the Laplacian in two dimensions. SIAM Review, 26(2):pp. 163–193, 1984. [4] Carl de Boor. A Practical Guide to Splines. Springer Verlag, New York, 2001. [5] Julian B. Sagredo. Z-splines: moment conserving cardinal spline interpolation of compact support for arbitrarily spaced data. SAM Report 2003-10, ETH Z¨urich, Seminar for Applied Mathematics, Aug 2003.

25 3 Elliptic Boundary Value Problems

d We consider the problem the connected bounded open set Ω R with boundary ∂Ω ⊂ and the PDE

Lu = div σ grad u + cu = f in Ω, (3.1a) − u = g on ΓD, (3.1b)

σ grad u n = h on ΓN , (3.1c) · σ grad u n + βu = h on ΓR, (3.1d) · where ∂Ω = ΓD ΓN ΓR, c 0 and > σ1 σ σ0 > 0. The last boundary ∪ ∪ ≥ ∞ ≥ ≥ condition (3.1c) is the Robin boundary condition which models an resistant (absorbing) d outer medium in R Ω. \ 3.1 Domains

We consider the partial differential equations of interest on bounded, connected, and d open subsets of (the affine space) R , d = 1, 2, 3. These are called the (spatial) domains of related boundary value problem and will be denoted by Ω. The topological closure of Ω is Ω and its boundary ∂Ω := Ω Ω. A domain has an unbounded open complement 0 d \ Ω := R Ω. \ Example 3.1. For d = 1 admissible domains will be open intervals ]a, b[, a > b, and ∂Ω = a, b . { } The minimum requirement for a boundary of a domain is that it is C0 continuous and closed, i. e., the boundary of the boundary is empty. However, meaningful boundary values for solutions of partial differential equations can only be imposed if we make additional assumptions on ∂Ω. First, we recall that a d m function f : U R R , d, m N, is Lipschitz continuous, if there is a γ > 0 such ⊂ 7→ ∈ that

d f(ξ) f(η) γ ξ η ξ, η R . | − | ≤ | − | ∀ ∈ d Definition 3.2. A domain Ω R is called a Lipschitz domain, if there exists a finite ⊂ covering of ∂Ω of open d-dimensional rectangles such that for every x ∂Ω we can U ∈ find an open neighborhood U such that there is a bijective mapping ∈ U T d Φ = (Φ1,..., Φd) : R := ξ R , ξk < 1 U, { ∈ | | } 7→ which satisfies

26 1. Both Φ and Φ−1 are Lipschitz continuous.

2. U ∂Ω = Φ( ξ R : xd = 0 ). ∩ { ∈ } 3. U Ω = Φ( ξ R : ξd < 0 ). ∩ { ∈ } 0 4. U Ω = Φ( ξ R : ξd > 0 ). ∩ { ∈ } We call the boundary of a Lipschitz domain a Lipschitz boundary. If Φ can be chosen k to be k-times continuously differentiable, k N, then Ω is said to be of class C . ∈ Lipschitz domains have a boundary which is “slighly” smoother than only continuous. This excludes especially domains with fractal boundaries for which a finite covering of ∂Ω is not possible. For those domains we cannot assume the statements in following of the lecture.

Figure 3.1: Domain whose boundary is the limit of a Koch curve is not Lipschitz (bounded domain, but boundary of infinite length).

There are a few examples of simple domains that do not qualify as Lipschitz domains.

Figure 3.2: Domains that are not Lipschitz: slit domain (left), cusp domain (middle), crossing edges (right).

A profound result from measure theory asserts that a Lipschitz continuous function ∞ with values in R is differentiable almost everywhere with partial derivatives in L .

27 d Therefore we can be define the exterior vectorfield n : ∂Ω R by 7→ ∂Φ ∂ξ (ξ) n(x) := d for almost all x ∂Ω and ξ = Φ−1x . (3.2) ∂Φ (ξ) ∈ | ∂ξd | In almost all numerical computations only a special type of Lipschitz domains will be relevant, namely domains that can be described in the widely used CAD (computer-aided design) software packages. Definition 3.3. In the case d = 2 a connected domain Ω is called a curvilinear Lips- chitz polygon, if Ω is a Lipschitz domain, and there are a finite number of open subsets Γk ∂Ω, k = 1,...,P , P N, such that ⊂ ∈ ∂Ω := Γ1 ΓP , Γk Γl = if k = l , ∪ · · · ∪ ∩ ∅ 6 1 and for each k 1,...,P there is a C -diffeomorphism Φk : [0, 1] Γk. ∈ { } 7→ The boundary segments are called edges, their endpoints are the vertices of Ω. A tangential direction can be defined for all points of an edge including the endpoints, which gives rise to the concept of an angle at a vertex, see Fig. 3.3. The mappings Φk can be regarded as smooth parametrizations of the edges. An analoguous notion exists

Figure 3.3: Curvilinear polygon with added angles at vertices in three dimensions. To give a rigorous definition we appeal to the intuitive concept 3 of a closed polyhedron in R , which can be obtained as a finite union of convex hulls 3 of finitely many points in R . The surface of a polyhedron can be split into flat faces. Moreover, it is clear what is meant by “edges” and “vertices”. 3 Definition 3.4. A connected domain Ω R is called a curved Lipschitz polyhedron, if ⊂

28 1. Ω is a Lipschitz domain,

2. there is a continuous bijective mapping Φ : ∂Π ∂Ω , where Π is a polyhedron 7→ with flat faces F1,...,FP , P N. ∈ 1 3. Φ : F k Φ(F k) is a C -diffeomorphism for every k = 1,...,P . 7→ We call Φ(Fk) the (open) face Γk of Ω. Further, Φ takes edges and vertices of Π to edges and vertices of Ω. Based on the parametrization we can introduce the surface measure dS and define integrals of measurable functions on Γ

P f(x) dS(x) := f(Φ(ξ)) det(DΦ(ξ)DΦ(ξ)>) dξ , (3.3) | | ZΓ k=1 Z−1 XFk=Φ (Γk) q where dξ is the (d 1)-dimensional Lebesgue measure on the flat face Fk. − d Definition 3.5. A subset Ω R is called a computational domain if it is of class ⊂ Ck, k 1, or ≥ – a bounded connected interval for d = 1.

– a curvilinear Lipschitz polygon for d = 2.

– a curved Lipschitz polyhedron for d = 3.

3.2 Linear differential operators

d Let α N0 be a multi-index, i. e., a vector of d non-negative integers: ∈ n α = (α1, . . . , αd) N0 . ∈ Set α := α1 + + αn and denote by | | ··· α1 αn α ∂ ∂ ∂ := α1 αn ∂x1 ··· ∂xn the partial derivative of order α . Remember that for sufficiently smooth functions | | all partial derivatives commute. Provided that the derivatives exist, the gradient of a d function f :Ω R R is the column vector ⊂ 7→ ∂f ∂f > grad f(x) := (x),..., (x) , x Ω . ∂x ∂x ∈  1 1  d d The divergence of a vector field f = (f1, . . . , fd):Ω R R is the function ⊂ 7→ d ∂f div f(x) := k (x) , x Ω . ∂xk ∈ k=1 X

29 The differential operator ∆ := div grad is known as Laplacian. In the case d = 3 the ◦3 3 rotation of a vectorfield f :Ω R R is given by ⊂ 7→ ∂f3 ∂f2 (x) (x) ∂x2 − ∂x3   ∂f1 ∂f3 curl f(x) := (x) (x) , x Ω . ∂x − ∂x  ∈  3 1   ∂f ∂f   2 1   (x) (x) ∂x1 − ∂x2    Notation 3.6. Bold roman typeface will be used for vector-valued functions, whereas plain print tags R(C)-valued functions. For the k-th component of a vector valued func- tion f we write fk or, in order to avoid ambiguity, (f)k.

3.3 Integration by parts

d Below we assume that Ω R , d = 1, 2, 3, is bounded and an interval for d = 1, a ⊂ Lipschitz polygon for d = 2, and a Lipschitz polyhedron for d = 3. Throughout we T adopt the notation n = (n1, . . . , nd) for the exterior unit normal vectorfield that is defined almost everywhere on Γ := ∂Ω, see Sect. 3.1.

Theorem 3.7 (Gauß’ theorem). If f (C1(Ω))d (C(Ω))d, then ∈ ∩ div f dx = f, n dS. h i ZΩ ZΓ Proof. Please consult [5, 15] and [6]. § By the product rule

div(u f) = u div f + grad u, f h i for u C1(Ω), f (C1(Ω))d, we deduce the first Green formula ∈ ∈ f, grad u + div f u dx = f, n udS (FGF) h i h i ZΩ ZΓ for all f (C1(Ω))d (C0(Ω))d and all u C1(Ω) C0(Ω). Plugging in the special ∈ ∩ ∈ ∩ f = fk, k = 1, . . . , d, k the k-th unit vector, we get ∂u ∂f f + u dx = fu n dS (IPF) ∂ξ ∂ξ k ZΩ k k ZΓ for f, u C1(Ω) C0(Ω). We may also plug f = grad v into (FGF), which yields ∈ ∩ grad v, grad u + ∆v u dx = grad v, n u dS (3.4) h i h i ZΩ ZΓ

30 for all v C2(Ω) C1(Ω), u C1(Ω) C0(Ω). ∈ ∩ ∈ ∩ In three dimensions, d = 3, another product rule ( stands for the antisymmetric × vector product)

div(u f) = curl u, f u, curl f × h i − h i for continuously differentiable vectorfields u, f (C1(Ω))3 can be combined with Gauss’ ∈ theorem, and we arrive at

curl u, f u, curl f dx = u f, n dS (CGF) h i − h i h × i ZΩ ZΓ Remark 3.8. For d = 1 Gauss’ theorem boils down to the fundamental theorem of calculus, and (FGF) becomes the ordinary integration by parts formula.

3.4 Distributional derivatives

Example 3.9. Consider heat conduction in a plane wall composed of two layers of equal thickness and with heat conductivity coefficients κ1 and κ2. The inside of the wall is kept at temperature u = u1, the outside at u = 0. Provided that the width and the height of the wall are much greater than its thickness, a one-dimensional model can be used. After spatial scaling it boils down to

d d j = κ(x) u , j = 0 , u(0) = 0 , u(1) = u1 , (3.5) − dx dx where

1 κ1 for 0 < x < 2 , κ(x) =  1 κ2 for 2 < x < 1 . An obvious “physical solution” that guarantees the continuity of the heat flux is

2u1κ2x 1 for 0 < x < 2 , κ1 + κ2 u(x) =  (3.6) 2u1κ1(x 1) 1  − + u1 for 2 < x < 1 . κ1 + κ2  Evidently, this solution is not differentiable and can not be a “classical solution”.

d Definition 3.10 (Test function space). For a non-empty open set Ω R we denote ⊂ C∞(Ω) := v C∞(Ω) : supp v := x Ω: v(x) = 0 Ω 0 { ∈ { ∈ 6 } ⊂ } the space of test functions, which is the space of functions C∞(Ω) with compact support.

31 1 0 2

0.8 0.5 1.5 − 0.6 ) ) ) x x x ( ( (

1 0 1 j u − u 0.4

1.5 0.5 0.2 − κ1 = 1 κ2 = 2 κ1 = 1 κ2 = 2 κ1 = 1 κ2 = 2 0 2 0 0 0.2 0.4 0.6 0.8 1 − 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 x x x

Figure 3.4: Non-classical solution u(x) of Example 3.9 for κ1 = 1 and κ2 = 2.

Lemma 3.11. Two integrable functions f and g defined in the bounded set Ω are almost everywhere equal if and only if

fv dx = gv dx ZΩ ZΩ for all test functions v C∞(Ω). ∈ 0 We consider in the following functions to be equivalentif they are equal almost every- where.

2 n 2 Definition 3.12. Let u L (Ω) and α N0 . A function w L (Ω) is called the weak ∈ ∈ ∈ derivative or distributional derivative ∂αu (of order α ) of u, if | | wv dx = ( 1)|α| u ∂αv dx v C∞(Ω) . − ∀ ∈ 0 ZΩ ZΩ Based on this definition, all linear differential operators introduced in Sect. 3.2 can be given a weak/distributional interpretation. For example, the “weak” gradient grad u of a function u L2(Ω) will be vectorfield w (L2(Ω))d with ∈ ∈ w, v dx = u div v dx v (C∞(Ω))d . (3.7) h i − ∀ ∈ 0 ZΩ ZΩ This can be directly concluded from (FGF). The same is true of the “weak divergence” w L2(Ω) of a vectorfield u (L2(Ω))d ∈ ∈ w v dx = u, grad v dx v C∞(Ω) . (3.8) − h i ∀ ∈ 0 ZΩ ZΩ The term “weak derivatives” is justified, because this concept is a genuine generaliza- tion of the classical derviative.

Theorem 3.13. If u Cm(Ω), then all weak derivatives of order m agree in L2(Ω) with the corresponding∈ classical derviatives. ≤

Proof. Clear by a straightforward application of (IPF).

32 Hence, without changing notations, all derivatives will be understood as weak deriva- tives in the sequel. Straight from the definition we also infer that all linear differential operators in weak sense commute.

Remark 3.14. If u has a continuous m-th classical derivative, m N0, in Ω except on a ∈ piecewise smooth q-dimensional submanifold, q < d, of Ω, and u has a weak m-th deriva- tive in Ω, then the latter agrees with the pointwise classical derivative almost everywhere in Ω. The following lemmata settle when “piecewise derivatives” of piecewise smooth func- tions can be regarded as their weak derivative. d Lemma 3.15. Let Ω R be bounded with Lipschitz boundary and assume a partition ⊂ Ω = Ω1 Ω2, Ω1 Ω2 = , where both sub-domains are supposed to have a Lipschitz ∪ ∩ ∅ 2 boundary, too. Assume that the restriction of the function u L (Ω) to Ωl, l = 1, 2, 1 ∈ 1 belongs to C (Ωl) and that u|Ωl can be extended to a function in C (Ωl). Then u possesses weak derivatives ∂u , k = 1, . . . , d, if and only if u C0(Ω). In this ∂xk case ∈ ∂ u|Ω if x Ω1 , ∂u ∂xk 1 (x) = ∈ L2(Ω). (3.9) ∂xk  ∂  ∈ u|Ω if x Ω2  ∂xk 2 ∈  Proof. Using locally supported test functions in the definition of the weak derivative, it is clear that (3.9) supplies the only meaningful candidate for the weak derivative of u. Then we appeal to (FGF) and the fact that any crossing direction nΣ of the interface Σ := ∂Ω1 ∂Ω2 will be parallel to the exterior unit normal of one sub-domain, and ∩ anti-parallel to that of the other. Thus, we get the identity

grad u, v dx = grad u, v dx + grad u, v dx h cl i h cl i h cl i ZΩ ZΩ1 ZΩ2 = u div v dx + [u] v , nΣ dS + u [v] , nΣ dS, − Σ h{ }Σ i { }Σ h Σ i ZΩ ZΣ ZΣ =hv,nΣi =0 where [u] C0(Σ) and u C0(Σ) stand| for{z the jump} and mean of| u {zacross} Σ and Σ ∈ { }Σ ∈ gradcl denotes the “classical gradient” of a sufficiently smooth function. Thanks to the assumptions on u this will be a continuous function on Σ. As

∞ [u] v, nΣ dS = 0 v C (Ω) [u] = 0 , Σ h i ∀ ∈ 0 ⇔ Σ ΣZ the assertion of the lemma follows from the definition of the weak gradient. Example 3.16. The weak derivative of the temperature distribution from Example 3.9 is given by (cf. Figure 3.4)

−1 1 0 2u1κ2(κ1 + κ2) if 0 < ξ < 2 , u (ξ) = −1 1 (2u1κ1(κ1 + κ2) if 2 < ξ < 1 .

33 Remark 3.17. The fact that a function has classical derivatives in subdomains and even the fact that the classical derivatives can be extended to a function in Ω which is in L2(Ω), does not guarantee that it has a weak derivative in the whole domain Ω. For ∞ example a function which is piecewise C (Ωl), l = 1, 2 (notation like in Lemma 3.9), but discontinuous over Σ, has not a weak gradient.

Corollary 3.18. Under the geometric assumptions of the previous lemma let u|Ωl belong m m−1 to C (Ωl) with possible extension to C (Ωl). Then α α 2 d m−1 ∂ u exists and ∂ u L (Ω) α N0, α m u C (Ω) . ∈ ∀ ∈ | | ≤ ⇔ ∈ Lemma 3.19. We retain the assumptions of Lemma 3.15 with the exception that u is 2 d 1 d replaced by a vectorfield u (L (Ω)) with restrictions u (C (Ωl)) that can be ∈ |Ωl ∈ extended to continuously differentiable functions on Ωl, l = 1, 2. Then u has a weak divergence div u L2(Ω), if and only if the normal component of u ∈ is continuous across Σ := ∂Ω1 ∂Ω2. Its divergence agrees with the classical divergence on the sub-domains. ∩ If d = 3, u has a weak rotation curl u (L2(Ω))3, if and only if the tangential ∈ components of u are continuous across Σ. The combined rotations on the sub-domains yield the weak rotation.

3.5 Weak formulations

3.5.1 Pure Dirichlet boundary conditions Interpreting the div in (3.1a) as weak derivative we can equivalently write

σ grad u, grad v + c uv dx = fv dx (3.10) h i ZΩ ZΩ for test functions v C∞(Ω). Additionally the solution u has to fulfill the Dirichlet ∈ 0 boundary conditions (3.1b), which all trial functions (dt. Ansatzfunktionen) – functions to try if they solve (3.10) – have to fulfill. Boundary conditions which are present in constraint to the space of trial or test functions are called essential.

3.5.2 Neumann boundary conditions Multiplying (3.1a) with a test function v C∞(Ω) C(Ω) C∞(Ω) we get ∈ ∩ ⊂ 0 σ grad u, grad v + c uv dx σ grad u, n v dS = fv dx, (FWP) h i − h i ZΩ Z∂Ω ZΩ and we can incorporate the Neumann boundary condition (3.1c) in the variational for- mulation. For pure Neumann boundary conditions, i. e.,ΓN = ∂Ω, this reads

σ grad u, grad v + c uv dx = fv dx + h v dS. (3.11) h i ZΩ ZΩ Z∂Ω

34 Boundary conditions which are present in the variational formulation are called nat- ural. To see that (3.12) is equivalent to (3.1a) and (3.1c) we choose first test functions v C∞(Ω). Then, the term on ∂Ω disappears, and with the definition of the weak ∈ 0 divergence we get (3.1a). Now, let v C∞(Ω) C∞(∂Ω). Integrating by parts and ∈ ∩ using the fact that (3.1a) holds we get

σ grad u, n v dS = hvdS h i Z∂Ω Z∂Ω which is by Lemma 3.11 equivalent to (3.1c).

3.5.3 Robin boundary conditions For the case of pure Robin boundary conditions, where Neumann b.c. are included with β = 0, inserting (3.1d) into (FWP) leads to

σ grad u, grad v + c uv dx + βu v dS = fv dx + h v dS. (3.12) h i ZΩ Z∂Ω ZΩ Z∂Ω Also Robin boundary conditions are natural.

3.5.4 Primal formulation for the first order system The static limit of the heat equation was given also as first order system

j = σ(x) grad u , (FL) − div j = f, (EL)

In the case of the first order system comprised of (FL) and (EL) the derivation of the formal is more subtle. The idea is to test both equations with smooth vectorfields, for (FL), and functions, for (EL), respectively, and integrate over Ω, but apply integration by parts to only one of the two equations: this equation is said to be cast in weak form, whereas the other is retained in strong form. If we cast (EL) in weak form and use the strong form of (FL) we get

j, q dx = σ grad u, q dx q , h i − h i ∀ ZΩ ZΩ (3.13) j, grad v + j, n v dS = fv dx v . − h i h i ∀ ZΩ ZΓ ZΩ Choosing q = grad v we can merge both equations, and the result will coincide with (FWP). This formal weak formulation is called primal.

35 3.5.5 Dual formulation for the first order system The alternative is to cast (FL) in weak form and keep (EL) strongly, which results in the dual formal weak formulation:

σ−1j, q + u div q dx = u q, n dS q , − h i ∀ ZΩ Z∂Ω (3.14)

div j v dx = fv dx v . ∀ ZΩ ZΩ Here, elimination of u is not possible. Note, that the Dirichlet of u appear in the variational formulation, i. e., here the Dirichlet boundary conditions are natural. Contrary, the trace of j is absent and Neumann boundary conditions are essential. For β > 0 the Robin boundary condi- tions (3.1d) can be inserted in the variational formulation as u = β−1(h j, n ), hence, − h i they are natural.

3.6 Linear and bilinear forms

A key role in the theory of vector spaces is played by the associated homomorphisms, which are called linear mappings in this particular context.

Definition 3.20. Let V,W be real vector spaces. A mapping

T : V W is called a (linear) operator, if T(λv + µw) = λ T v + µ T w for all • 7→ v, w V , λ, µ R. If W = R, then T is a . ∈ ∈ b : V V R is called a bilinear form, if for every w V both v b(w, v) • × 7→ ∈ 7→ and v b(v, w) are linear forms on V . 7→ Linear forms play a crucial role in our investigations of variational problems:

Definition 3.21. The dual V 0 of a V is the normed vector space L(V, R) of continuous linear forms on V . The is equipped with the (operator) norm f(v) f V 0 = sup | |. k k 06=v∈V v V k k 0 Notation 3.22. For f V we will usually write f, v V 0×V instead of f(v), v V ∈ h i ∈ n (“ pairing”). The notation , is reserved for the Euklidean inner product in R and will designate the Euklideanh· ·i norm. | · | Given some linear form f a linear variational problem seeks u V such that ∈

b(u, v) = f, v 0 v V. (LVP) h iV ×V ∀ ∈

36 Definition 3.23. A bilinear form b on a vector space V is called symmetric, if

b(v, w) = b(w, v) v, w V. ∀ ∈ A special class of symmetric bilinear forms often occurs in practical variational prob- lems: Definition 3.24. A bilinear form b on a real vector space is positive definite, if for all v V ∈ b(v, v) > 0 v = 0 . ⇔ 6 A symmetric and positive definite bilinear form is called an inner product.

Definition 3.25. A continuous bilinear form b on a normed space V is called V -elliptic with ellipticity constant γe > 0, if

2 b(v, v) γe v v V. | | ≥ k kV ∀ ∈ 3.7 Definition of Sobolev spaces

3.7.1 Banach spaces Definition 3.26. A normed vector space V is complete, if every Cauchy sequence vk k V has a limit v in V . A complete normed vector space is called a Banach {space} .⊂

p m Example 3.27. The function spaces L (Ω), 1 p , and C (Ω), m N0, are Banach spaces. ≤ ≤ ∞ ∈

3.7.2 Hilbert spaces For a symmetric positive definite bilinear form a an inner product a induces a norm through

1 v := a(v, v) 2 v V. k ka ∈ The fundamental Cauchy-Schwarz-inequality

a(v, w) v w v, w V (CSI) ≤ k ka k ka ∀ ∈ ensures that a will always be continuous with norm 1 with respect to the energy norm. Moreover, we have Pythagoras’ theorem

a(v, w) = 0 v 2 + w 2 = v + w 2 . ⇔ k ka k ka k ka In the context of elliptic partial differential equations a norm that can be derived from a V -elliptic bilinear form is often dubbed energy norm. Vector spaces that yield Banach spaces when endowed with an energy norm offer rich structure.

37 Definition 3.28. A is a whose norm is induced by an inner product.

Lemma 3.29. The dual of a Hilbert space is a Hilbert space as well. Exercise 3.1. If an inner product a is V -elliptic and continuous in Banach space V then (V, ) is a Hilbert space. k·ka Notation 3.30. In the sequel, the symbol H is reserved for Hilbert spaces. When H is a Hilbert space, we often write ( , )H to designate its inner product. · · 3.7.3 Sobolev spaces In Sect. 3.5 we learned that the formal variational problem associated with the pure homogeneous Neumann problem for (3.1) is: seek u :Ω R such that 7→ σ grad u, grad v + c uv dx = fv dx v . (3.15) h i ∀ ZΩ ZΩ We already know that grad has to be used in distributional sense. The concrete spaces have deliberately been omitted in (3.15), because we want to heed the guideline for- mulated in the context of Example 3.9 and set out from (3.15) and design the “ideal” space. It goes without saying that the investigation of (3.15) is easiest, when we regard a Hilbert space H equipped whose inner product coincides with the bilinear form, and consequently which is equipped with the energy norm

v 2 := σ grad v, grad v + c v 2 dx (3.16) k ke h i | | ZΩ as its norm. So we arrive at the preliminary “definition”

H := v :Ω R : weak grad v exists and energy norm (3.16) of v < . { 7→ ∞} Definition 3.31 (Sobolev space H1(Ω)). The H1(Ω) is the space of square integrable functions Ω R with square integrable weak gradients: → H1(Ω) := v L2(Ω) : (the weak gradient) grad v exists and grad v L2(Ω) { ∈ ∈ } with norm

2 2 2 2 v 1 := v 2 + D grad v 2 k kH (Ω) k kL (Ω) k kL (Ω) where D = diam(Ω) is introduced to match units, which is often omitted in the definition. Note, that the H1(Ω)-norm and the energy norm are equivalent if the latter is based on an H1(Ω)-elliptic and H1(Ω)-continuous bilinear form, i. e., there exists two constants C1,C2 > 0 such that

1 C1 v e v 1 C2 v e v H (Ω). k k ≤ k kH (Ω) ≤ k k ∀ ∈

38 d Definition 3.32. For m N0 and Ω R we define the Sobolev space of order m as ∈ ⊂ Hm(Ω) := v L2(Ω) : ∂αv L2(Ω), α m , { ∈ ∈ ∀ | | ≤ } equipped with the norm

1 α 2 2 v m := ∂ v 2 . k kH (Ω) k kL (Ω) |αX|≤m  A vector field is said to belong to Hm(Ω), if this is true of each of its components.

d Notation 3.33. For all m N0 and Ω R ∈ ⊂ 1 α 2 2 v 1 := ∂ v 2 | |H (Ω) k kL (Ω) |αX|=m  denotes a semi-norm on Hm(Ω).

The Sobolev spaces are a promising framework for variational problems, see [7, Thm. 3.1]:

m Theorem 3.34. The Sobolev spaces H (Ω), m N0, are Hilbert spaces with the inner product ∈

α α m (u, v) m := (∂ u, ∂ v) 2 u, v H (Ω) . H (Ω) L (Ω) ∈ |αX|≤m The above Sobolev spaces are based on all partial derivatives up to a fixed order. We can as well rely on some partial derivatives of any linear differential operator in the definition of a Sobolev-type space.

∞ l ∞ k Definition 3.35. If D :(C (Ω)) (C (Ω)) , l, k N, is a linear differential operator 7→ ∈ of order m, m N, we write ∈ H(D; Ω) := u (Hm−1(Ω))l : D u (L2(Ω))k , { ∈ ∈ } where the corresponding norm on this space is given by

1 2 2 2 u := u m−1 + D u 2 . k kH(D;Ω) k kH (Ω) k kL (Ω)   The of D in H(D; Ω) will be denoted by

H(D 0; Ω) := u H(D; Ω) : D u = 0 . { ∈ } An analogue of Thm. 3.34 holds true for such spaces H(D; Ω).

39 Example 3.36. The most important representatives of spaces covered by Def. 3.35 are H(div; Ω) := u (L2(Ω))d : div u L2(Ω) , (3.17) { ∈ ∈ } H(curl; Ω) := u (L2(Ω))3 : curl u (L2(Ω))3 , (3.18) { ∈ ∈ } H(∆, Ω) := v H1(Ω) : ∆u L2(Ω) . (3.19) { ∈ ∈ } and, derived from them, H(div 0; Ω) and H(curl 0; Ω). Remark 3.37. On an intersection of Hilbert spaces we use the product norm: u 2 := u 2 + u 2 , u V W. (3.20) k kV ∩W k kV k kW ∈ ∩ For instance, this can be used to introduce the Hilbert space H(div; Ω) H(curl; Ω). ∩ 3 From now we confine Ω R to the class of computational domains according to ⊂ Def. 3.5 from Sect. 3.1. We recall a definition from Definition 3.38. A subspace U of a normed space V is called dense, if  > 0, v V : u U : v u  . ∀ ∈ ∃ ∈ k − kV ≤ This means that elements of a dense subspace can arbitrarily well approximate ele- ments of a normed vector space. Then we can state a key result in the theory of Sobolev spaces, the famous Meyers- Serrin theorem, whose proof is way beyond the scope of these lecture notes, see [7, Thm. 3.6]: ∞ m Theorem 3.39. The space C (Ω) is a dense subspace of H (Ω) for all m N0. ∈ Moreover, the space (C∞(Ω))d is a dense subspace of H(div; Ω) and H(curl; Ω) (d = 3 in the latter case). How can we make such a bold claim. The answer is offered by the procedure of completion, by which for every normed space one can construct a Banach space, of which the original space will become a dense subspace, see [8, Thm. 2.3]. In addition, the completion of a normed space is unique, which means that the completion of a space is completely determined by the normed space itself: the procedure of completion adds no extra particular properties. Thanks to Thm. 3.39 we can give an alternative definition of the Sobolev spaces. Corollary 3.40. The spaces Hm(Ω), H(div; Ω), and H(curl; Ω) (for d = 3) can be obtained by the completion of spaces of smooth functions with respect to the corresponding norms.

Remark 3.41. Thm. 3.39 also paves the way for an important technique of proving relationships between norms. If we have an assertion that boils down to and (in)equality of the form A(v) B(v) or A(v) = B(v) (3.21) ≤ claimed for all functions u of a Sobolev space and involving continuous expressions A, B, then it suffices to prove (3.21) for the dense subspace of smooth functions.

40 That means for the variational formulation, e. g.,(3.15), we can replace the space of test function by a Sobolev space. 1 ∞ Furthermore, we define as H0 (Ω) as the completion of C0 (Ω) with respect to the H1(Ω)-norm.

3.8 The Dirichlet principle

Consider the homogeneous Neumann boundary value problem for (3.1), i.e., ΓN = ∂Ω and h 0, and assume that c is strictly positive. Then the boundary term in (FWP) ≡ can be dropped and we get the variational problem: seek u H1(Ω) such that ∈ σ grad u, grad v + c uv dx = fv dx v H1(Ω) . (3.22) h i ∀ ∈ ZΩ ZΩ Evidently, its associated bilinear form

a(u, v) = σ grad u, grad v + c uv dx h i ZΩ is symmetric positive definite.

Lemma 3.42. The solution $u$ of (3.22) with $V = H^1(\Omega)$ can be characterised by
\[ u = \operatorname*{arg\,min}_{v\in V} J(v) \quad\text{with}\quad J(v) = \tfrac{1}{2} a(v, v) - \langle f, v\rangle_{V'\times V} . \]
Proof. If $u$ denotes the solution of (3.22), a simple calculation shows
\begin{align*}
J(v) - J(u) &= \tfrac{1}{2} a(v, v) - \langle f, v\rangle_{V'\times V} - \tfrac{1}{2} a(u, u) + \langle f, u\rangle_{V'\times V} \\
&= \tfrac{1}{2} a(v, v) - a(u, v) - \tfrac{1}{2} a(u, u) + a(u, u) \\
&= \tfrac{1}{2} a(v, v) - \tfrac{1}{2} a(u, v) - \tfrac{1}{2} a(u, v) + \tfrac{1}{2} a(u, u) \\
&= \tfrac{1}{2} a(v - u, v - u) =: \tfrac{1}{2} \|v - u\|_a^2 \quad \forall v \in V.
\end{align*}
This shows that $u$ will be the unique global minimizer of $J$.
Conversely, it is easy to establish that $J$ is strictly convex and coercive ($J(v)$ tends to $+\infty$ for $\|v\| \to \infty$), which implies existence and uniqueness of a global minimizer $u$. Now, consider the function
\[ h_v(\tau) := J(u + \tau v), \qquad \tau \in \mathbb{R},\ v \in V. \]
Since $h_v$ is smooth (it is a quadratic function) and $u$ is a global minimizer,
\[ \frac{d}{d\tau} h_v(\tau)\Big|_{\tau=0} = 0 , \]
which is equivalent to (3.22), since $v$ can be chosen arbitrarily.
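The identity $J(v) - J(u) = \tfrac{1}{2}\|v-u\|_a^2$ has an exact finite-dimensional analogue: for a symmetric positive definite matrix $\mathbf{A}$ the quadratic functional $J(\boldsymbol{\mu}) = \tfrac{1}{2}\boldsymbol{\mu}^\top\mathbf{A}\boldsymbol{\mu} - \boldsymbol{\varphi}^\top\boldsymbol{\mu}$ is minimized exactly by the solution of $\mathbf{A}\boldsymbol{\mu} = \boldsymbol{\varphi}$. The following sketch (illustrative only, not part of the notes) verifies this numerically.

```python
import numpy as np

rng = np.random.default_rng(0)

# A random symmetric positive definite matrix and right-hand side
n = 8
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)          # SPD by construction
phi = rng.standard_normal(n)

def J(v):
    """Discrete energy functional J(v) = 1/2 v^T A v - phi^T v."""
    return 0.5 * v @ A @ v - phi @ v

u = np.linalg.solve(A, phi)          # minimizer = solution of A u = phi

# Check J(v) - J(u) = 1/2 ||v - u||_A^2 for a few random v
for _ in range(3):
    v = rng.standard_normal(n)
    lhs = J(v) - J(u)
    rhs = 0.5 * (v - u) @ A @ (v - u)
    print(f"J(v)-J(u) = {lhs:.6e}   0.5*||v-u||_A^2 = {rhs:.6e}")
```

All printed differences are non-negative and the two columns agree to machine precision, mirroring the calculation in the proof of Lemma 3.42.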

41 This means that a solution of (3.22) will be a global minimizer of the energy func- tional J(v). This hints at the general fact that

Selfadjoint elliptic boundary value problems are closely related to minimization problems for convex functionals on a function space.

This accounts for their pervasive presence in mathematical models, because the state of many physical systems is characterized by some quantity (energy, entropy) achieving a minimum.
Let us try to elaborate the connection for the second-order scalar elliptic boundary value problem (3.1). For $g \in C^0(\Gamma)$ define the affine subset of $H^1(\Omega)$
\[ H^1_{\Gamma_D,g}(\Omega) := \{ u \in H^1(\Omega) : u = g \text{ on } \Gamma_D \}. \]
Consider the strictly convex functional $J : H^1_{\Gamma_D,g}(\Omega) \mapsto \mathbb{R}$,
\[ J(v) := \tfrac{1}{2}\int_\Omega \sigma \langle \operatorname{grad} v, \operatorname{grad} v\rangle + c\,|v|^2 \,dx - \int_\Omega fv \,dx + \tfrac{1}{2}\int_{\Gamma_R} \beta\, v^2 \,dS - \int_{\Gamma_N \cup \Gamma_R} hv \,dS . \]
A necessary and sufficient criterion for $u$ to be a global minimizer of $J$ is
\[ \frac{d}{d\tau} J(u + \tau v)\Big|_{\tau=0} = 0 \quad \forall v \in H^1_{\Gamma_D,0}(\Omega) , \]
which is equivalent to the linear variational problem: seek $u \in H^1_{\Gamma_D,g}(\Omega)$ such that
\[ \int_\Omega \sigma \langle \operatorname{grad} u, \operatorname{grad} v\rangle + c\,uv \,dx + \int_{\Gamma_R} \beta\, uv \,dS = \int_\Omega fv \,dx + \int_{\Gamma_N \cup \Gamma_R} hv \,dS \quad \forall v \in H^1_{\Gamma_D,0}(\Omega) . \tag{3.23} \]

Therefore, numerical methods for variational problems are also suitable for solving a certain class of optimization problems. In Section 3.5 we have also shown the equivalence of the variational formulation and the second-order elliptic boundary value problem (3.1). Thus, the variational problem (3.23) emerges as the link between the PDE and the minimization problem, see Fig. 3.5.

BVP for selfadjoint elliptic PDE $\;\Leftrightarrow\;$ symmetric, positive definite variational problem $\;\Leftrightarrow\;$ minimization of strictly convex (energy) functional

Figure 3.5: Relationship between minimization problems, variational problems, and el- liptic boundary value problems

42 3.9 Theory of variational formulations

3.9.1 The Riesz representation theorem

Theorem 3.43 (Riesz representation theorem). Let $H$ be a Hilbert space and $\varphi \in H'$ an arbitrary linear form on $H$. Then there exists a unique element $u \in H$ such that
\[ \varphi(v) = (u, v)_H \quad \text{for all } v \in H . \]
Moreover, $\|\varphi\|_{H'} = \|u\|_H$.

The Riesz representation theorem tells us that to any element of $H'$ an element of $H$ can be assigned whose norm coincides, i.e., there is an isometric isomorphism between a Hilbert space and its dual, "$H = H'$" – the Riesz isomorphism.
For variational formulations with symmetric, positive definite bilinear forms the Riesz representation theorem guarantees existence and uniqueness of the solution, and its continuous dependence, in the energy norm, on the right-hand-side functional. If the bilinear form is in addition elliptic and continuous on the test and trial space $H$, so that the energy norm is equivalent to the $H$-norm, the solution is also bounded in the $H$-norm itself:
\[ \sqrt{\gamma}\,\|u\|_H \le \sqrt{(u,u)_a} = \sup_{v\in H} \frac{|f(v)|}{\sqrt{(v,v)_a}} = \sup_{v\in H} \frac{|f(v)|}{\|v\|_H}\cdot\frac{\|v\|_H}{\sqrt{(v,v)_a}} \le \frac{1}{\sqrt{\gamma}}\,\|f\|_{H'} . \]

3.9.2 The inf-sup conditions

Recall that the dual $V'$ of $V$ consists of all bounded linear forms on $V$. We can also consider the bidual of $V$, the dual $V''$ of $V'$, that is, the space of bounded linear forms on $V'$. It is equipped with the operator norm

v(g) v, g V 00×V 0 v V 00 = sup | | = sup | h i |. (3.24) k k 0 g V 0 0 g V 0 g∈V \{0} k k g∈V \{0} k k Obviously V V 00. If any element v00 V 00 can be identified with an element v V ⊂ ∈ ∈ such that for all g V 0 ∈ 00 00 v (g) = v , g = g, v 0 = g(v) V 00×V 0 h iV ×V 00 00 and v V 00 = v V , i. e., V and V are isomorphic, then V is called reflexive and we k k k k write V V 00. ' Example 3.44. The space Lp(Ω) is reflexive for any 1 < p < . However, L∞(Ω) and ∞ L1(Ω) are not reflexive.

Corollary 3.45. All Hilbert spaces are reflexive.

Proof. By Lemma 3.29 the dual $H'$ of a Hilbert space $H$ is again a Hilbert space, and so is $H''$. By the Riesz representation theorem we can identify with any $u'' \in H''$ a linear form $f \in H'$ with the same norm by
\[ (f, w)_{H'} = \langle u'', w \rangle_{H''\times H'} \quad w \in H' , \]
and, applying the theorem once more, a function $u \in H$, again with the same norm.

Throughout this section $U$, $V$ will stand for reflexive Banach spaces with norms $\|\cdot\|_U$, $\|\cdot\|_V$, and $b$ will designate a continuous bilinear form on $U \times V$, that is $b \in L(U \times V, \mathbb{R})$, or a continuous sesquilinear form on $U \times V$, that is $b \in L(U \times V, \mathbb{C})$.
Given some $f \in V'$, a linear variational problem seeks $u \in U$ such that
\[ b(u, v) = \langle f, v\rangle_{V'\times V} \quad \forall v \in V . \tag{LVP} \]

Theorem 3.46. The following statements are equivalent:

(i) For all $f \in V'$ the linear variational problem (LVP) has a unique solution $u_f \in U$ that satisfies
\[ \|u_f\|_U \le \frac{1}{\gamma_s} \|f\|_{V'} , \tag{3.25} \]
with $\gamma_s > 0$ independent of $f$.

(ii) The bilinear form $b$ satisfies the inf-sup conditions
\[ \exists \gamma_s > 0 :\; \inf_{w\in U\setminus\{0\}} \sup_{v\in V\setminus\{0\}} \frac{|b(w, v)|}{\|v\|_V \|w\|_U} \ge \gamma_s , \tag{IS1} \]
\[ \forall v \in V\setminus\{0\} :\; \sup_{w\in U\setminus\{0\}} |b(w, v)| > 0 . \tag{IS2} \]

Proof.

‚ (ii) (i): Injectivity (Uniqueness) ⇒ 0 Let u1, u2 U be two solutions of (LVP) for the same f V . Then b(u1 u2, v) = 0 ∈ ∈ − for all v V , and from (IS1) we immediately infer u1 = u2. ∈ ƒ (ii) (i): Closedness of the range ⇒ To prove existence of solutions of (LVP) we define the following subspace of V 0, which correspond to the to b associate operator:

0 0 V := g V : w U : b(w, v) = g, v 0 v V . b ∈ ∃ ∈ h iV ×V ∀ ∈ ∞  0 0 Let gk k=1 be a Cauchy-sequence in Vb . Since V is a Banach space and hence { } 0 complete, gk will converge to some g V . By definition ∈ k N : wk U : b(wk, v) = gk, v 0 v V. (3.26) ∀ ∈ ∃ ∈ h iV ×V ∀ ∈

44 Thanks to the inf-sup condition (IS1), we have for any k, m N ∈ 1 gk gm, v V 0×V 1 wk wm U sup h − i = gk gm V 0 . k − k ≤ γs v∈V \{0} v V γs k − k k k ∞ Hence, wk is a Cauchy-sequence, too, and will converge to some w U. The { }k=1 ∈ continuity of b and of the duality pairing makes it possible to pass to the limit on both sides of (3.26)

b(w, v) = g, v 0 v V, h iV ×V ∀ ∈ 0 0 0 which reveals that g V . Since gk V has its limit in V , it is a closed subspace ∈ b ∈ b b of V 0. „ (ii) (i): Surjectivity (Existence) ⇒ 0 0 0 0 Now, assume that Vb = V . As Vb V is closed, a corollary of the Hahn-Banach 6 ⊂ 00 theorem, see [9, Satz 4.1], confirms the existence of v0 V V (V reflexive!) ∈ ' such that 0 0 = v0(g) = g(v0) = g, v0 0 g V . h iV ×V ∀ ∈ b 0 By definition of V this means b(w, v0) = 0 for all w U and contradicts (IS2). b ∈ (ii) (i): Stability estimate ⇒

By (IS1),
\[ \|u_f\|_U \le \frac{1}{\gamma_s} \sup_{v\in V\setminus\{0\}} \frac{|b(u_f, v)|}{\|v\|_V} = \frac{1}{\gamma_s} \sup_{v\in V\setminus\{0\}} \frac{|\langle f, v\rangle_{V'\times V}|}{\|v\|_V} = \frac{1}{\gamma_s} \|f\|_{V'} . \]

(i) $\Rightarrow$ (ii):
Fix some $w \in U$ and denote by $g_w \in V'$ the continuous functional $v \mapsto b(w, v)$, i.e., $w$ is the unique solution of

b(w, v) = gw, v 0 v V h iV ×V ∀ ∈ from which we conclude (IS1) by (3.25) 1 1 b(w, v) w V gw V 0 = sup | | . k k ≤ γs k k γs v v∈V \{0} k kV Now, fix some v V 0 . Then, ∈ \{ } v, g V 00×V 0 0 < v V 00 = sup h i and so 0 < sup v, g V 00×V 0 k k g∈V 0\{0} g V 0 g∈V 0\{0} h i k k By the reflexivity of V we have

0 < sup g, v V 0×V = sup b(w, v) , (3.27) g∈V 0\{0} h i g∈V 0\{0} | |

where w is the unique solution of b(w, v) = g, v 0 , v V , and (IS2) follows. h iV ×V ∀ ∈

Definition 3.47 (Ellipticity of sesquilinear forms). A sesquilinear form $b(\cdot,\cdot)$ is called $V$-elliptic, if there exist a $\gamma > 0$ and a $\sigma \in \mathbb{C}$ with $|\sigma| = 1$, such that for all $v \in V$
\[ \operatorname{Re}\bigl(\sigma\, b(v, v)\bigr) \ge \gamma \|v\|_V^2 . \]

Lemma 3.48 (Lax-Milgram). Let $V$ be a reflexive Banach space. Let the bilinear form $b : V \times V \to \mathbb{R}$ or sesquilinear form $b : V \times V \to \mathbb{C}$ be $V$-elliptic (see Def. 3.25 and Def. 3.47). Then, the variational problem (LVP) has for any $f \in V'$ a unique solution $u \in V$ with
\[ \|u\|_V \le \frac{1}{\gamma} \|f\|_{V'} . \]

Proof. The $V$-ellipticity of the sesquilinear form $b$ implies for any $v \in V$
\[ \gamma \|v\|_V^2 \le \operatorname{Re}(\sigma b(v, v)) \le |\sigma b(v, v)| = |b(v, v)| , \tag{3.28} \]
which holds directly for a bilinear form $b$. Then, (IS1) holds since
\[ \sup_{v\in V\setminus\{0\}} \frac{|b(w, v)|}{\|v\|_V} \ge \frac{|b(w, w)|}{\|w\|_V} \overset{(3.28)}{\ge} \gamma \|w\|_V . \]
Furthermore, for any $v \in V\setminus\{0\}$
\[ \sup_{w\in V\setminus\{0\}} |b(w, v)| \ge |b(v, v)| \ge \gamma \|v\|_V^2 > 0 , \]
and (IS2) holds. Applying Thm. 3.46 completes the proof.

3.10 Wellposedness for variational problem of second order elliptic PDEs

3.10.1 Boundary value problems with Dirichlet data

In Sec. 3.8 we have derived a variational problem for the second-order BVP (3.1). Strictly speaking, (3.23) does not match the definition (LVP) of a linear variational problem, because the unknown is sought in an affine space rather than a vector space. Yet, (3.23) can easily be converted into the form (LVP) by using an extension $u_g \in H^1(\Omega)$ of the Dirichlet data, that is, $u_g = g$ on $\Gamma_D$, and plugging $u := u_g + \delta u$ into (3.23), where, now, the offset $\delta u \in H^1_{\Gamma_D,0}(\Omega)$ assumes the role of the unknown function and $u_g$ will show up in an extra contribution to the right-hand-side functional. This raises the questions:

• In which sense do we ask for the equality $u_g = g$?
• For which functions $g$ does such an extension exist?
• How can we prescribe the zero trace on $\Gamma_D$ in $H^1_{\Gamma_D,0}(\Omega)$?

46 3.10.2 Traces In order to give a meaning to essential boundary conditions in the context of Sobolev spaces, we have to investigate “restrictions” of their functions onto ∂Ω or parts of it. By m m a trace operator Rm on a Sobolev space H (Ω) we mean linear mapping from H (Ω) into a subspace of L2(∂Ω), such that

\[ (R_m u)(x) = u(x) \quad \forall x \in \partial\Omega, \ \forall u \in C^\infty(\Omega) . \]
In a sense, a trace operator is the extension to $H^m(\Omega)$ of the plain pointwise restriction $u|_{\partial\Omega}$ of a smooth function $u$ onto $\partial\Omega$. It is by no means obvious that such trace operators exist (as continuous mappings $H^m(\Omega) \mapsto L^2(\partial\Omega)$).

Example 3.49. For $u \in L^2(\Omega)$ a continuous trace operator cannot be defined. In particular, a trace inequality of the form
\[ \exists \gamma_t > 0 :\; \bigl\| u|_{\partial\Omega} \bigr\|_{L^2(\partial\Omega)} \le \gamma_t \|u\|_{L^2(\Omega)} \quad \forall u \in C^\infty(\Omega) \tag{3.29} \]
remains elusive. Indeed, let $\Omega = ]0;1[^2$ and, for $0 < h < 1$, define
\[ v(x) := \begin{cases} 0 & \text{if } h \le x_1 \le 1,\ 0 \le x_2 \le 1, \\ 1 - \dfrac{x_1}{h} & \text{if } 0 \le x_1 \le h,\ 0 \le x_2 \le 1 . \end{cases} \]
Then we can compute
\[ 1 = \int_0^1 |v(0, x_2)|^2 \,dx_2 \le \bigl\| v|_{\partial\Omega} \bigr\|_{L^2(\partial\Omega)}^2 , \]
and
\[ \|v\|_{L^2(\Omega)}^2 = \int_0^h \Bigl(1 - \frac{x_1}{h}\Bigr)^2 dx_1 \overset{y = x_1/h}{=} h \int_0^1 (1 - y)^2 \,dy = \frac{h}{3} . \]
If (3.29) were true, there would exist a constant $\gamma_t > 0$ such that $1 \le \tfrac{1}{3}\gamma_t^2 h$. For $h \to 0$ we obtain a contradiction.
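The blow-up in Example 3.49 can also be observed numerically. The following sketch (illustrative only, not part of the notes) evaluates the two norms computed above in closed form for a sequence of $h$ and shows that the ratio $\|v|_{\partial\Omega}\|_{L^2(\partial\Omega)} / \|v\|_{L^2(\Omega)}$ grows like $h^{-1/2}$, so no constant $\gamma_t$ can work.

```python
import numpy as np

def norms(h):
    """Exact L2 norms of v(x) = max(1 - x1/h, 0) on Omega = (0,1)^2 and on its boundary."""
    vol_sq = h / 3.0                 # int_0^h (1 - x1/h)^2 dx1, x2 integrates to 1
    bnd_sq = 1.0 + 2.0 * h / 3.0     # edge {x1=0} contributes 1, edges {x2=0},{x2=1} give h/3 each
    return np.sqrt(vol_sq), np.sqrt(bnd_sq)

for h in [1e-1, 1e-2, 1e-3, 1e-4]:
    nv, nb = norms(h)
    print(f"h={h:8.1e}   ||v||_Omega={nv:.4e}   ||v||_bnd={nb:.4e}   ratio={nb/nv:8.1f}")
```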

A continuous trace operator can only be found, if we have control of derivatives of the argument functions:

Theorem 3.50. The trace operator $R_1$ is continuous from $H^1(\Omega)$ into $L^2(\partial\Omega)$, that is,
\[ \exists \gamma_t > 0 :\; \bigl\| u|_{\partial\Omega} \bigr\|_{L^2(\partial\Omega)} \le \gamma_t \|u\|_{H^1(\Omega)} \quad \forall u \in C^\infty(\Omega) . \]
More precisely, the following multiplicative trace inequality holds true:
\[ \exists \gamma(\Omega) > 0 :\; \| R_1 u \|_{L^2(\partial\Omega)}^2 \le \gamma(\Omega)\, \|u\|_{L^2(\Omega)} \bigl\{ \|u\|_{L^2(\Omega)} + \|\operatorname{grad} u\|_{L^2(\Omega)} \bigr\} \quad \forall u \in H^1(\Omega) . \]

Notation 3.51. Writing $\gamma(\Omega)$ we hint that the "constant" $\gamma$ may only depend on the domain $\Omega$.

47 Notation 3.52. The trace operator R1 is often suppressed in expressions like Γ v . . . dS, when it is clear that the restriction of the function v H1(Ω) to a part of the boundary ∈ R Γ is used. 1 If Γ denotes a part of ∂Ω with positive measure, we can restrict R1 u, u H (Ω), to ∈ Γ, write RΓ for the resulting operator, and trivially have the continuity 1 γ > 0 : RΓ u 2 γ u 1 u H (Ω) . ∃ k kL (Γ0) ≤ k kH (Ω) ∀ ∈ Given the continuity of the trace operator, we can introduce the following closed subspace of H1(Ω). Definition 3.53. For m N and any part Γ of the boundary ∂Ω of Ω with Γ > 0 we define ∈ | | m 1 α 2 d HΓ (Ω) := v H (Ω) : RΓ(∂ v) = 0 in L (Γ), α N0, α < m . { ∈ ∀ ∈ | | } m m If Γ = ∂Ω we write H0 (Ω) = HΓ (Ω). 1 1 Obviously, the spaces HΓ(Ω) are closed subspaces of H (Ω). Another important den- sity result holds true, see [7, Thm. 3.7]. Theorem 3.54. The functions in C∞(Ω) whose support does not intersect Γ form a m ∞ m dense subspace of H (Ω), m N. In particular, C0 (Ω) is a dense subspace of H0 (Ω). Γ ∈ 2 By Thm. 3.50 the trace operator R1 maps continuosly into L (Γ). This raises the issue, whether it is also onto. The answer is negative. The question is, how we can characterize the range of the trace operator R1. Let Γ temporarily stand for a connected component of the boundary of Ω. We start by introducing a norm ∞ v 1/2 = inf w 1 : w C (Ω), w = v (3.30) k kH (Γ) {k kH (Ω) ∈ |Γ } on the space of restrictions of smooth functions to Γ. It is highly desirable that this norm is intrinsic to Γ, that is, switching to another domain Ω, for which Γ is also a connected component of the boundary, and using (3.30) produces an equivalent norm. ∞ e Definition 3.55. The completion of C (Ω) with respect to the norm 1/2 is |Γ k·kH (Γ) designated by H 1/2(Γ). The next theorem shows that the definition of the H 1/2(Γ)-norm is really intrinsic, see [7, 3]. § Theorem 3.56. The space H 1/2(Γ) is a Hilbert space and can be equipped with the (equivalent) Sobolev-Slobodeckij-norm 2 2 v(x) v(y) v 1/2 := | − | dS(x)dS(y) . k kH (Γ) x y d ZΓ ZΓ | − |

1 1/2 Theorem 3.57. The trace operator R1 : H (Ω) H (Γ) is continuous and surjective 1/2 7→1 and has a bounded right inverse F1 : H (Γ) H (Ω), ie., R1 F1 = Id. 7→ ◦

48 3.10.3 Dual spaces By definition H1(Ω) L2(Ω) and so for f L2(Ω) by the Cauchy-Schwarz-inequality 1 ⊂ ∈ the linear form H (Ω) R → v fvdx 7→ ZΩ 0 is bounded in (H1(Ω)) . The same holds for subspaces H1 (Ω) or H1(Ω). ΓD,0 0 It is easy to verify that

Ω uv dx u H−1(Ω) := sup (3.31) k k v∈H1(Ω)\{0} v 1 0 kR kH (Ω) defines a norm on L2(Ω). Definition 3.58. The completion of L2(Ω) with respect to the norm given by (3.31) is called H−1(Ω). Lemma 3.59. The space H−1(Ω) is a Hilbert space, which is isometrically isomorphic 1 0 to (H0 (Ω)) . Remark 3.60. By construction we have the continuous and dense embeddings 0 H1(Ω) L2(Ω) (H1(Ω)) = H−1(Ω) . 0 ⊂ ⊂ 0 Sometimes, such an arrangement is called a Gelfand triple.

Notation 3.61. Often the integral Ω ... dx is used to denote the duality pairing of H−1(Ω) and H1(Ω). 0 R 1 1/2 The same considerations that above targeted H0 (Ω) can be applied to H (Γ) on a surface without boundary. This will yield the Hilbert space H−1/2(Γ), which contains L2(Γ) and is (isometrically isomorphic to the) dual to H 1/2(Γ). As before, the integral

Γ ... dS is often used to indicate the corresponding duality pairing. Dual spaces play a crucial role when it comes to defining traces of vectorfields. R ∞ d Lemma 3.62. The normal components trace Rn for u (C (Ω)) , defined by Rn u(x) := ∈ u(x), n(x) for all x ∂Ω, can be extended to a continuous and surjective operator h i −1/2∈ Rn : H(div; Ω) H (∂Ω). 7→ Proof. Pick some u (C∞(Ω))d. By (FGF) and the Cauchy-Schwarz inequality in L2(Ω) ∈ we find

Γ u, n v dS u, n H−1/2(Γ) = sup h i kh ik 1/2 v 1/2 v∈H (Γ)\{0} R k kH (Γ) 1 = sup u, grad F1 v + div u F1 v dx 1/2 v 1/2 h i v∈H (Γ)\{0} k kH (Γ) ZΩ F1 v 1 k kH (Ω) u H(div;Ω) F1 H1/2(Γ)7→H1(Ω) u H(div;Ω) , ≤ v 1/2 k k ≤ k k k k k kH (Γ)

49 where v := F1 v, which means that R1 v = v, see Thm. 3.57. Applying the principle explained in Remark 3.41 shows the continuity of Rn asserted in the Lemma. −1/2 To confirme that Rn is onto H (Γ), wee rely on the symmetric positive definite linear variational problem: seek u H1(Ω) such that ∈

1 grad u, grad v + uv dx = h R1 v dS v H (Ω) . (3.32) h i ∀ ∈ ZΩ ZΓ Here, h is an arbitrary function from H−1/2(Γ) and, clearly, the boundary integral has to be understood in the sense of a duality pairing. By the results of Ch. ?? the variational problem (3.32) has a unique solution in H1(Ω). Testing with v C∞(Ω) and recalling the definition of weak derivatives, see Def. 3.12, ∈ 0 we find that

div(grad u) + u = 0 in L2(Ω) . − Note that div has to be understood as differential operator in the sense of distributions according to (3.8). This shows div(grad u) L2(Ω), which allows to apply (FGF) to ∈ (3.32):

div(grad u) + u v dx + grad u, n R1 v dS = h R1 v dS − h i ZΩ ZΓ ZΓ =0  for all v C∞| (Ω). By{z a density} argument, this amounts to h = grad u, n in H−1/2(Γ). ∈ h i

Moreover, the trace theorems for H1(Ω) and H(div; Ω) enable us to apply the argu- ment elaborated in Remark 3.41 to (FGF). Hence this integration by parts formula is seen to hold for all f H(div; Ω) and u H1(Ω): ∈ ∈ f, grad u + div f u dx = f, n udS f H(div; Ω), u H1(Ω) . (FGF) h i h i ∀ ∈ ∈ ZΩ ZΓ ∞ 3 Lemma 3.63. The tangential components trace Rt for u (C (Ω)) , defined by ∈ Rt u(x) := u(x) n(x) for all x ∂Ω, can be extended to a continuous and surjec- × ∈−1/2 3 −1/2 3 tive operator Rt : H(curl; Ω) (H (∂Ω)) , where (H (Γ)) is understood as the 1 7→ dual of (H /2(Γ))3.

3.10.4 Second order elliptic PDE with c c0 > 0 ≥ The previous discussions give sense to the variational formulation: Seek u H1 (Ω) ∈ ΓD,g such that

σ grad u, grad v + c uv dx + β uv = fv dx + hv dS v H1 (Ω) . h i ∀ ∈ ΓD,0 ZΩ ZΓR ZΩ ZΓN ∪ΓR (3.23)

50 1/2 1 For g H (ΓD) an extension ug H (Ω) exists, and homogeneous or inho- • ∈ ∈ mogeneous Dirichlet trace is prescribed in the spaces H1 (Ω) and H1 (Ω) = ΓD,0 ΓD,g H1 (Ω) + u are well-defined. ΓD,0 g

−1/2 For h H (ΓN ΓR) the respective linear form is the duality pairing with • −1/2 ∈ ∪ H (ΓN ΓR) and hence continuous. ∪ For f L2(Ω) the respective linear form is continuous as L2(Ω) (H1 (Ω))0. • ∈ ⊂ ΓD,0 1 In case of c c0 > 0 the bilinear form is H (Ω)-elliptic (and so also for subspaces like ≥ H1 (Ω)) with an ellipticity constant ΓD,0

\[ \gamma = \min\Bigl( \inf_{x\in\Omega} \sigma(x),\ \inf_{x\in\Omega} c(x) \Bigr) , \]
and well-posedness of (3.23) follows from Lax-Milgram's lemma with
\[ \|u\|_{H^1(\Omega)} \le \frac{1}{\gamma} \Bigl( \|f\|_{L^2(\Omega)} + \|g\|_{H^{1/2}(\Gamma_D)} + \|h\|_{H^{-1/2}(\Gamma_N \cup \Gamma_R)} \Bigr) . \]

What about the case $c = 0$? We have "ellipticity" with respect to the $H^1(\Omega)$-seminorm, but the control of the $L^2(\Omega)$-norm is missing:
\[ b(u, u) \ge \gamma_1 |u|_{H^1(\Omega)}^2 \overset{?}{\ge} \gamma_2 \Bigl( |u|_{H^1(\Omega)}^2 + \|u\|_{L^2(\Omega)}^2 \Bigr) . \]

3.10.5 The inequalities of Poincaré and Friedrichs

Lemma 3.64 (The Friedrichs inequality). Let $\Omega$ be an open bounded domain and $|\Gamma_D| > 0$. Then there exists a constant $C_F(\Omega, \Gamma_D)$ such that for all $u \in H^1_{\Gamma_D,0}(\Omega)$
\[ \|u\|_{L^2(\Omega)}^2 \le C_F(\Omega, \Gamma_D)\, \operatorname{diam}(\Omega)^2\, \|\operatorname{grad} u\|_{L^2(\Omega)}^2 . \]
This gives well-posedness in the case of Dirichlet boundary conditions (at least on part of the boundary) and $c \ge 0$.
For pure Neumann boundary conditions and $c = 0$, that is
\[ \operatorname{div} j = -\operatorname{div}(\sigma \operatorname{grad} u) = f \quad\text{in } \Omega, \qquad j\cdot n = -\sigma\, \partial_n u = h \quad\text{on } \partial\Omega , \tag{3.33} \]
the solution can only be fixed up to an additive constant. Using exactly the constant function as test function in the associated variational formulation we get a compatibility condition
\[ \int_\Omega f \,dx = \int_{\partial\Omega} h \,dS , \]

which the data of the problem has to fulfill. One can see this also by applying Gauss' theorem to $j$.
The constant can be fixed by demanding a vanishing mean value of $u$, i.e., we switch to the subspace
\[ H^1_*(\Omega) := \Bigl\{ v \in H^1(\Omega) : \int_\Omega v \,dx = 0 \Bigr\} \]
and we can apply

Lemma 3.65 (The Poincaré inequality). Let $\Omega$ be an open bounded domain. Then there exists a constant $C_P(\Omega)$ such that for all $u \in H^1(\Omega)$
\[ \|u - u_\Omega\|_{L^2(\Omega)}^2 \le C_P(\Omega)\, \operatorname{diam}(\Omega)^2\, \|\operatorname{grad} u\|_{L^2(\Omega)}^2 , \]
where $u_\Omega = \frac{1}{|\Omega|} \int_\Omega u(x) \,dx$ is the mean of $u$ in $\Omega$.

Remark 3.66. One is tempted to ensure uniqueness of solutions of (3.33) by demanding $u(x_0) = 0$ for some $x_0 \in \Omega$. However, this approach is flawed (fehlerhaft), because the mapping $u \mapsto u(x_0)$ is unbounded on $H^1(\Omega)$ (there are unbounded functions in $H^1(\Omega)$ for $d > 1$). Therefore, such a strategy may lead to severely ill-conditioned linear systems of equations when employed in the context of a Galerkin discretization.

The general rule is that in order to impose constraints one has to resort to functionals/operators/mappings that are continuous on the relevant function spaces.

In case of Robin boundary conditions the constant is not in the kernel of the formula- tion, as adding a constant would harm the Robin boundary condition. This is addressed by the following lemma.

1 Lemma 3.67. Let f : H (Ω) R be a continuous functional which fixes constants and →2 for which we have f(λv) = λ f(v) for λ R , i. e., a constant function c vanishes | | | | | | ∈ if f(c) = 0. Then, it exists a constant C = C(f, Ω) > 0 such that for all u H1(Ω) we have ∈

2 2 u 2 C grad u 2 + f(u) . k kL (Ω) ≤ k kL (Ω) | |   Proof. Assume that the assertion of the lemma was false. Then, we can find a sequence ∞ 1 un , un H (Ω) such that for all n N { }n=1 ∈ ∈ 2 1 2 1 2 grad un 2 + f(un) un 2 un 1 . (3.34) k kL (Ω) | | ≤ nk kL (Ω) ≤ nk kH (Ω) n o

As the functionals on the left and right side are quadratic, we can consider instead the associated normalised sequence with un 1 = 1. k kH (Ω)

52 1 As the sequence un is bounded, it exists a subsequence weakly convergent in H (Ω) [10, Appendix A.3.8], i. e., for k → ∞ 1 1 un * u in H (Ω) which means (un , v) 1 (u, v) 1 v H (Ω), k k H (Ω) → H (Ω) ∀ ∈ where the limit is denoted by u. 2 1 By continuity of the functional grad v 2 + f(v) : H (Ω) R we can take the k kL (Ω) | | → limit on both sides of (3.34), and it holds

2 grad u 2 + f(u) = 0. (3.35) k kL (Ω) | | This implies that grad u 0 and u is a constant. Since f(u) = 0 the constant is zero, ≡ and u 0. ≡ For bounded domains H1(Ω) is compactly embedded in L2(Ω), which mean that for any weakly convergent sequence in H1(Ω) there exists a subsequence strongly convergent in L2(Ω) (Rellich-Kondrachov theorem, [11, Chapter 2]). Denoting the subsequence again un it holds for k k → ∞

un 2 u 2 = 0. k k kL (Ω) → k kL (Ω)

This mean that the subsequence un we have un 1 = 1, grad un 2 0 k k k kH (Ω) k k kL (Ω) → and un 2 0, which is not possible. k k kL (Ω) → So, also for the case c = 0 the bilinear form of the variational problem (3.23) is H1 (Ω)-elliptic. ΓD,0

3.11 Discrete variational formulations

A first step towards finding a practical algorithm for the approximate solution of (LVP) is to convert it into a discrete variational problem. We use the attribute “discrete” in the sense that the solution can be characterized by a finite number of real (or complex) numbers. Given a real Banach space V with norm and a bilinear form b L(V V, R) k·kV ∈ × we pursue a Galerkin discretization of (LVP). Its gist (wesentliche) is to replace V in (LVP) by finite dimensional subspaces. The most general approach relies on two subspaces of V :

\[ W_n \subset V : \text{"trial space"}, \quad \dim W_n = N, \qquad V_n \subset V : \text{"test space"}, \quad \dim V_n = N, \qquad N \in \mathbb{N} . \]

Notation 3.68. Throughout, a subscript $n$ will be used to label "discrete entities" like the above finite dimensional trial and test spaces, and their elements. Often we will consider sequences of such spaces; in this case $n$ will assume the role of an index.

Figure 3.6: $b$-orthogonality of the error $e_n = u - u_n$ with respect to $V_n$, if $b$ is an inner product.

0 Given the two spaces Wn and Vn and some f V the discrete variational problem ∈ corresponding to (LVP) reads: seek un Wn such that ∈ b(un, vn) = f, vn 0 vn Vn . (DVP) h iV ×V ∀ ∈ This most general approach, where Wn = Vn is admitted, is often referred to as Petrov- 6 . In common parlance, the classical Galerkin discretization implies that trial and test space agree. If, moreover, b provides an inner product on V , the method is known as Ritz-Galerkin scheme. If, for given b and f both (LVP) and (DVP) have unique solutions u V and un Wn, ∈ ∈ respectively, then a simple subtraction reveals

b(u un, vn) = 0 vn Vn . (3.36) − ∀ ∈ Abusing terminology, this property is called Galerkin orthogonality, though the term orthogonality is only appropriate, if b is an inner product on V . Sloppily speaking, the discretization error en := u un is “orthogonal” to the test space Vn, see Fig. 3.6. − Theorem 3.69. Let V be a Banach space and b L(V V, R) satisfy the inf-sup ∈ × conditions (IS1) and (IS2) from Thm. 3.46. Further, assume that

b(wn, vn) γn > 0 : inf sup | | γn . (DIS) ∃ wn∈Wn vn wn ≥ vn∈Vn\{0} k kV k kV Then, for every f V 0 V 0 the discrete variational problem (DVP) has a unique ∈ ⊂ n solution un that satisfies the stability estimate

1 1 f(vn) un V f V 0 = sup | | , (3.37) k k ≤ γn k k n γn v ∈V vn n n k kV and the quasi-optimality estimate

b V ×V 7→R u un V 1 + k k inf u wn V , (3.38) k − k ≤ γ wn∈Wn k − k  n  where u V solves (LVP). ∈

54 Proof. Following ‚ in the proof of Thm. 3.46 it is clear that (DIS) implies the uniqueness of un, and similarly to the stability estimate (3.37) follows. Since dim Vn = dim Wn, in the finite dimensional setting this implies existence of un (cf. [10, Lemma A.9]). It remains to show (3.38). For any wn Wn we can estimate using the triangle ∈ inequality and (3.37):

u un u wn + wn un k − kV ≤ k − kV k − kV 1 b(wn u + u un, vn) u wn V + sup | − − | ≤ k − k γn vn vn∈V \{0} k kV 1 b(wn u, vn) 1 b(u un, vn) u wn V + sup | − | + sup | − |. ≤ k − k γn vn γn vn vn∈V \{0} k kV vn∈V \{0} k kV The last term vanishes due to Galerkin orthogonality (3.36) and with the continuity of b we obtain (3.37).

Remark 3.70. One can not conclude (DIS) from (IS1) and (IS2) because the supremum is taken over a much smaller set. Remark 3.71. It goes without saying that for a V -elliptic bilinear form b, cf. Def. 3.25, the assumptions of Thm. 3.69 are trivially satisfied. Moreover, we can choose γn equal to the ellipticity constant γe in this case.

Thm. 3.69 provides an a-priori estimate for the norm of the discretization error en. It reveals that the Galerkin solution will be quasi-optimal, that is, for arbitrary f the norm of the discretization can be bounded by a constant times the best approximation error

inf u vn V , wn∈Wn k − k of the exact solution u w.r.t. Wn. It is all important that this constant must not depend on f. Now, let us consider the special case that V is a Hilbert space. To hint at this, we write H instead of V . It is surprising that under exactly the same assumptions on b, Wn, and Vn as have been stated in Thm. 3.69, the mere fact that the norm of V = H arises from an inner product, permits us to get a stronger a-priori error estimate [12]. Theorem 3.72. If V is a Hilbert space we obtain the sharper a-priori error estimate b V ×V 7→R u un V k k inf u vn V , (3.39) k − k ≤ γn vn∈Vn k − k if the assumptions of Thm. 3.69 are satisfied. Theorem 3.73 (C´eas’sLemma). If the bilinear form b is V -elliptic the estimate b V ×V 7→R u un V k k inf u vn V , (3.40) k − k ≤ γe vn∈Vn k − k holds.

55 The other special case is that of Ritz-Galerkin discretization aimed at a symmetric, positive definite bilinear form b. Then the Ritz-Galerkin method will furnish an optimal solution in the sense that un is the best approximation of u in Vn, i. e., the element in VN nearest to the exact solution u in the energy norm. Corollary 3.74. If b is an inner product in V , with which V becomes a Hilbert space 0 H, and Vn = Wn, then (DVP) will have a unique solution un for any f H . It satisfies ∈

u un b inf u vn b , k − k ≤ vn∈Vn k − k where is the energy norm derived from b. k·kb Proof. Existence and uniqueness are straightforward. It is worth noting that the estimate can be obtained in a simple fashion from Galerkin orthogonality (3.36) and the Cauchy- Schwarz inequality

2 u un = b(u un, u vn) u un u vn , k − kb − − ≤ k − kb k − kb for any vn Vn. ∈ Remark 3.75. Many of our efforts will target asymptotic a-priori estimates that ∞ ∞ involve sequences Vn , Wn of test and trial spaces. Then it will be the principal { }n=1 { }n=1 objective to ensure that the constant γn is bounded away from zero uniformly in n. This will guarantee asymptotic quasi-optimality of the Galerkin solution: the estimate (3.37) will hold with a constant independent of n. Notice that the norm of b that also enters (3.38) does not depend on the finite dimensional trial and test spaces.

In the sequel we will take for granted that b and Vn,Wn meet the requirements of Thm. 3.69. The “Galerkin orthogonality” (3.36) suggests that we examine the so-called Galerkin projection Pn : V Wn defined by 7→

b(Pnw, vn) = b(w, vn) vn Vn . (3.41) ∀ ∈ Proposition 3.76. Under the assumption of Thm. 3.69, the equation (3.41) defines a −1 continuous projection Pn : V V with norm Pn γ b . 7→ k kV 7→V ≤ n k kV ×V 7→R Proof. As a consequence of Thm. 3.69, Pn is well defined. Its linearity is straightforward 2 and the norm bound can be inferred from (3.37). Also Pn = Pn is immediate from the definition.

The Galerkin projection connects the two solutions u and un of (LVP) and (DVP), respectively, through

un = Pnu . (3.42)

Definition 3.77. If H is a Hilbert space we call two subspaces V,W H orthogonal, ⊂ and write V W , if (v, w)H = 0 for all v V , w W . A linear operator P : H H ⊥ ∈ ∈ 7→ is an orthogonal projection, if P 2 = P and Ker(P ) Range(P ). ⊥

56 Proposition 3.78. If b L(V V, R) is an inner product on V and the assumptions of Thm. 3.69 are satisfied, then∈ the× Galerkin projection associated with b is an orthogonal projection with respect to the inner product b. Proof. According to Def. 3.77 and the previous proposition, we only have to check that Ker(Pn) Wn = Range(Pn). As Pn is a projector there is a decomposition of V in a ⊥ direct sum [10, Lemma A.38]

V = Range(Pn) Range(Id Pn), ⊕ − i. e., Range(Pn) Range(Id Pn) = 0 . By Galerkin orthogonality (3.36) and (3.42) ∩ − { } this direct sum is orthogonal. As Ker(Pn) = Range(Id Pn) − v Ker(Pn) Pnv = 0 (Id Pn)v = v v Range(Id Pn) . ∈ ⇔ ⇔ − ⇔ ∈ − the proposition holds.

3.12 The algebraic setting

The variational problem (DVP) may be discrete, but it is by no means amenable to straightforward computer implementation, because an abstract concept like a finite di- mensional vector space has no algorithmic representation. In short, a computer can only handle vectors (arrays) of finite length and little else. We adopt the setting of Sect. 3.11. The trick to convert (DVP) into a problem that can be solved on a computer is to introduce ordered bases

1 N BV := pn, . . . , pn of Vn , { 1 N } N := dim Vn = dim Wn . BW := q , . . . , q of Wn , { n n } Remember that a basis of a finite dimensional vector space is a maximal set of linearly indepedent vectors. By indexing the basis vectors with consecutive integers we indicate that the order of the basis vectors will matter. Lemma 3.79. The following is equivalent:

(i) The discrete variational problem (DVP) has a unique solution un Wn. ∈ (ii) The linear system of equations

Bµ = ϕ (LSE)

with N k j N,N B := b(qn, pn) R , (3.43) j,k=1 ∈   N ϕ := f, pk N , (3.44) n 0 R V ×V k=1 ∈ D E  N N has a unique solution µ = (µk) R . k=1 ∈

57 Then

N k un = µkqn . k=1 X Proof. Due to the basis property we can set

N N k k un = µkqn , vn = νkpn, µk, νk R , ∈ k=1 k=1 X X in (DVP). Hence, (DVP) becomes: seek µ1, . . . , µN such that

N k N j N j b( µkq , νjp ) = f, νjp k=1 n j=1 n j=1 n V 0×V X X  X  for all ν1, . . . , νN R. We can now exploit the linearity of b and f: ∈ N N N k j j µkνjb(qn, pn) = νj f, pn V 0×V . (3.45) j=1 k=1 j=1 X X X Next, plug in special test vectors given by (ν1, . . . , νN ) = l, l 1,...,N , where l is N ∈ { } the l-th unit vector in R . This gives us

N k l l µkb(qn, pn) = f, pn , l = 1,...,N. (3.46) V 0×V k=1 X D E N As the special test vectors span all of R and thanks to the basis property, we conclude that (3.45) and (3.46) are equivalent. On the other hand, (3.46) corresponds to (LSE), as is clear by recalling the rules of matrix vector multiplication. × Note that in (3.43) j is the row index, whereas k is the column index. Consequently, the element in the j-th row and k-th column of the matrix B in (LSE) is given by k j b(qn, pn). The columns correspond to the trial functions where the rows correspond to the test functions.

k ↓ ····· ····· j ·····   → · · ∗ · · ·····   Notation 3.80. Throughout, bold greek symbols will be used for vectors in some Euk- n lidean vector space R , n N, whereas bold capital roman font will designate matrices. ∈ The entries of a matrix M will either be written in small roman letters tagged by two subscripts: mij or will be denoted by (M)ij.
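As a concrete illustration of (LSE) and the index convention in (3.43) (this example is not part of the notes), consider the 1D model bilinear form $b(u,v) = \int_0^1 u'v'\,dx$ with $V_n = W_n$ spanned by the hat functions on a uniform grid with homogeneous Dirichlet conditions, and $f \equiv 1$. For this basis the entries $b(q_n^k, p_n^j)$ can be worked out by hand and give the familiar tridiagonal matrix; the sketch below assembles $\mathbf{B}$ and $\boldsymbol{\varphi}$ and solves $\mathbf{B}\boldsymbol{\mu} = \boldsymbol{\varphi}$.

```python
import numpy as np

# Interior hat functions on a uniform grid of (0,1), mesh width h = 1/(N+1)
N = 9                        # number of basis functions = unknowns
h = 1.0 / (N + 1)
nodes = np.linspace(h, 1.0 - h, N)

# (B)_{jk} = b(q^k, p^j) = int_0^1 (q^k)'(p^j)' dx  ->  tridiag(-1, 2, -1)/h
B = (np.diag(2.0 * np.ones(N)) +
     np.diag(-np.ones(N - 1), 1) +
     np.diag(-np.ones(N - 1), -1)) / h

# (phi)_j = <f, p^j> = int_0^1 1 * p^j dx = h   (area under one hat function)
phi = h * np.ones(N)

mu = np.linalg.solve(B, phi)          # coefficients of u_n in the hat basis

# For -u'' = 1, u(0)=u(1)=0 the exact solution is u(x) = x(1-x)/2; the nodal
# values of the Galerkin solution should be extremely close to it here.
print(np.abs(mu - nodes * (1.0 - nodes) / 2.0).max())
```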

Corollary 3.81. The matrix $\mathbf{B}$ of (LSE) is regular if and only if the bilinear form $b$ and the trial/test spaces $W_n$/$V_n$ satisfy the assumptions of Thm. 3.69.

Thus, we have arrived at the final “algebraic problem” (LSE) through the two stage process outlined in Fig. 3.7. It is important to realize that the choice of basis does not affect the discretization error at all: the latter solely depends on the choice of trial and test spaces. Also, Cor. 3.81 teaches that some properties of B will only depend on Vn, too.

(Continuous) variational problem (LVP): b(u, v) = f, v 0 h i×V V

§ Choose finite dimensional trial and test spaces Wn and Vn. ¤ ¦ ¥

Discrete variational problem (DVP): b(un, vn) = f, vn 0 h iV ×V

§ Choose bases of Wn and Vn ¤ ¦ ¥

Algebraic problem (LSE): Bµ = ϕ

Figure 3.7: Overview of stages involved in the complete Galerkin discretization of an abstract variational problem.

However, the choice of basis may have a big impact on other properties of the resulting matrix B in (LSE).

Example 3.82. If $b$ induces an inner product on $V_n = W_n$, then a theorem from linear algebra (Gram-Schmidt orthogonalisation) tells us that we can always find a $b$-orthonormal basis of $V_n$. Evidently, with respect to this basis the matrix associated with $b$ according to (3.43) will be the $N \times N$ identity matrix $\mathbf{I}$.

Lemma 3.83. Consider a fixed bilinear form $b$ and finite dimensional trial/test space $W_n = V_n$ for the discrete variational problem (DVP). We choose two different bases $\mathfrak{B} := \{p_n^1, \ldots, p_n^N\}$ and $\widetilde{\mathfrak{B}} := \{\tilde p_n^1, \ldots, \tilde p_n^N\}$ of $W_n = V_n$, for which
\[ \tilde p_n^j = \sum_{k=1}^N s_{jk}\, p_n^k \quad\text{with } \mathbf{S} = (s_{jk})_{j,k=1}^N \in \mathbb{R}^{N,N} \text{ regular.} \]

59 Relying on these bases we convert (DVP) into two linear systems of equations Bµ = ϕ and Bµ = ϕ, respectively. If the discrete variational problem (DVP) possesses a unique solution, then the two lineare systemse e and their respective solutions are related by B = SBST , ϕ = Sϕ , µ = S−T µ . (3.47) Proof. Using (3.43) and the linearity of b we get e e e N N N N m l k j T blm = b(pn , pn) = smkb(pn, pn)slj = sljbjk smk = (SBS )lm , k=1 j=1 k=1 j=1 X X XX  e e e (SB)lk which gives the relationship between B and B. The other| {z relationships} are as straight- forward. e Remark 3.84. The lemma reveals that all possible Galerkin matrices B from (LSE) that we can obtain for a given discrete variational problem (DVP) form a congruence class of matrices. It is exactly the invariants of congruence classes that are invariants of Galerkin matrices: symmetry, regularity, positive definiteness, and the total dimensions of eigenspaces belonging to positive, negative, and zero eigenvalues. Its not an similarity transformation, so eigenvalues are not invariant (in case of an orthogonalisation the eigenvalues are all one). N,N An important property of a regular matrix M R , N N, is its spectral con- ∈ ∈ dition number κ(M) = M M−1 , | | N where, by default, R is equipped with the Euklidean vector norm, and M will always | | stand for the operator norm associated with the Euklidean vector norm. Remember that we are working on computers with finite precision algebra and that the spectral condition number measures the influence of small relative errors in the right hand side vector to the solution vector. µ µ ϕ ϕ | − | B B−1 | − | µ ≤ | | ϕ | | | | e e B Choice of basis of Vn, Wn influences the condition number For the remainder of this section we assume that trial and test spaces and their re- 1 N spective bases agree. Any choice of basis B := pn, . . . , pn for Vn spawns a coefficient N { } isomorphism Cn : R Vn by 7→ N k Cnµ = µkpn . k=1 X It can be used to link the operator form of the discrete variational problem and the associated linear system of equations (LSE) w.r.t. B.

60 Definition 3.85. For a linear operator T : V W on vector real spaces V,W we introduce its adjoint operator T0 by 7→

W 0 V 0 0 7→ T : V R  h 7→ . 7→ v h, T v 0   7→ h iW ×W Lemma 3.86. We have 

0 B = C Bn Cn , n ◦ ◦ 0 where Bn : Vn V is the operator related to the bilinear form b on Vn, 7→ n

Bn wn, vn 0 = b(wn, vn) wn Wn, vn Vn, h iVn×Vn ∀ ∈ ∈ and B is the coefficient matrix for b w.r.t. the basis B of Vn.

N N Proof. Pick any µ, ξ R and remember that R is its own dual with the duality ∈ pairing given by the Euklidean inner product, that is

T µ, ξ N ∗ N = µ ξ . h i(R ) ×R Owing to Def. 3.85 and (3.43) we find

T µ Bξ = b(Cnξ, Cnµ) = Bn Cnξ, Cnµ 0 h iVn×Vn 0 T 0 = C Bn Cnξ, µ = µ (C Bn Cn)ξ . n (RN )0×RN n

Lemma 3.87. If B is a matrix according to (3.43) based on a bilinear form b and a trial/test space Vn (equipped with basis B) that satisfy the assumptions of Thm. 3.69, then

2 −1 2 b V ×V 7→R κ(B) Cn N C N k k , R 7→V n Vn7→ ≤ k k R γn where Cn is the coefficient isomorphism belonging to basis B, and γn is the inf-sup constant from (DIS).

Proof. By the definition of the operator norm,

b(wn, vn) Bn wn V 0 = sup | | γn wn V wn Vn , k k n v ∈V vn ≥ k k ∀ ∈ n n k kV which means

−1 0 −1 −1 γn Bn fn fn V 0 fn Vn Bn 0 γn . V ≤ k k n ∀ ∈ ⇒ Vn7→Vn ≤

61 The operator norm of Bn can be bounded by

B 0 = b . k kV 7→V k kV ×V 7→R By Lemma 3.86 we have

0 −1 −1 −1 0 −1 B = C Bn Cn , B = C B (C ) . n ◦ ◦ ◦ n ◦ n Then the submultiplicativity of operator norms finish the proof.

The condition number thus depends on the chosen basis via $\mathsf{C}_n$, as well as on the discretisation space via $\gamma_n$ and on the problem itself via $\|b\|_{V\times V \mapsto \mathbb{R}}$.
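To see Lemma 3.83 and the influence of the basis in action, here is a small numerical sketch (illustrative only, with an arbitrarily chosen transformation $\mathbf{S}$): it checks the relations (3.47) for the 1D model problem from before and compares $\kappa(\mathbf{B})$ with $\kappa(\widetilde{\mathbf{B}})$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Galerkin matrix and load vector of the 1D model problem in the hat basis
N = 20
h = 1.0 / (N + 1)
B = (np.diag(2.0 * np.ones(N)) + np.diag(-np.ones(N - 1), 1)
     + np.diag(-np.ones(N - 1), -1)) / h
phi = h * np.ones(N)
mu = np.linalg.solve(B, phi)

# Change of basis: tilde{p}^j = sum_k s_jk p^k with a regular S (unit upper triangular)
S = np.eye(N) + np.triu(rng.standard_normal((N, N)), 1)
B_t = S @ B @ S.T                                    # (3.47)
phi_t = S @ phi
mu_t = np.linalg.solve(B_t, phi_t)

# Same Galerkin solution in both bases: mu = S^T mu_t  (i.e. mu_t = S^{-T} mu)
print("solutions agree:", np.allclose(mu, S.T @ mu_t))

# Symmetry/definiteness survive the congruence, the condition number does not
print("kappa(B)       =", np.linalg.cond(B))
print("kappa(B_tilde) =", np.linalg.cond(B_t))
```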

References

[5] O. Forster. Analysis 3. Integralrechnung im R^n mit Anwendungen. Vieweg-Verlag, Wiesbaden, 3rd edition, 1984.
[6] H. König. Ein einfacher Beweis des Integralsatzes von Gauss. Jahresber. Dtsch. Math.-Ver., 66:119–138, 1964.
[7] J. Wloka. Partial Differential Equations. Cambridge University Press, Cambridge, UK, 1987.
[8] F. Hirzebruch and W. Scharlau. Einführung in die Funktionalanalysis, volume 296 of BI Hochschultaschenbücher. Bibliographisches Institut, Mannheim, 1971.
[9] H. W. Alt. Lineare Funktionalanalysis. Springer-Verlag, Heidelberg, 1985.
[10] P. Šolín. Partial Differential Equations and the Finite Element Method. Wiley-Interscience, Hoboken, USA, 2006.
[11] D. Braess. Finite Elements: Theory, Fast Solvers, and Applications in Solid Mechanics. Cambridge University Press, 3rd edition, 2007.
[12] J. Xu and L. Zikatanov. Some observations on Babuška and Brezzi theories. Report AM222, PennState Department of Mathematics, College Park, PA, September 2000. To appear in Numer. Math.
[13] T. Kato. Estimation of iterated matrices with application to von Neumann condition. Numer. Math., 2:22–29, 1960.
[14] W. Hackbusch. Theorie und Numerik elliptischer Differentialgleichungen. B.G. Teubner-Verlag, Stuttgart, 1986.
[15] C. Schwab. p- and hp-Finite Element Methods. Theory and Applications in Solid and Fluid Mechanics. Numerical Mathematics and Scientific Computation. Clarendon Press, Oxford, 1998.

62 4 Primal Finite Element Methods

In Section 3.11 we saw that the Galerkin discretization of a linear variational problem (LVP) posed on a Banach space V entails finding suitable finite dimensional trial and test spaces W ,V V . In this context, “suitable” means that some discrete inf-sup- n n ⊂ conditions have to be satisfied, see Thm. 3.69. In this chapter we only consider linear variational problems that arise from the primal weak formulation of boundary value problems as discussed in Sect. 3.5, see (3.23) and Sec. 3.10.1). These variational problems are set in Sobolev spaces and feature elliptic bilinear forms according to Def. 3.25. Hence, if trial and test space agree, which will be the case throughout this chapter, stability of the discrete variational problem is not an issue, cf. Remark 3.71. In this setting, the construction of Vn has to address two major issues 1. In light of Thm. 3.69 V must be able to approximate the solution u V of the n ∈ linear variational problem well in the norm of V .

2. The space Vn must possess a basis BV that allows for efficient assembly of a stiffness matrix with desirable properties (e.g. well conditioned and/or sparse, cf. Sect. 3.12).

The finite element methods tries to achieve these goals by employing

spaces of functions that are piecewise smooth and “simple” and • locally supported basis function of these spaces. • 4.1 Meshes

For the remainder of this section let Ω Rd, d = 1, 2, 3, stand for a computational ⊂ domain according to Def. 3.5.

Definition 4.1. A mesh of $\Omega$ is a collection $\mathcal{M} = \{K_i\}_{i=1}^M$, $M := \sharp\mathcal{M} \ge 1$,¹ of connected open subsets $K_i \subset \Omega$ such that

• the closure of each $K_i$ is the $C^\infty$-diffeomorphic image of a closed $d$-dimensional polytope (that is, the convex hull of finitely many points in $\mathbb{R}^d$),

• $\bigcup_i \overline{K_i} = \overline{\Omega}$ and $K_i \cap K_j = \emptyset$ if $i \ne j$, $i, j \in \{1, \ldots, M\}$.

¹ $\sharp$ denotes the cardinality of a finite set

Figure 4.1: Cell types in 2D and 3D: (a) triangle, (b) quadrilateral, (c) tetrahedron, (d) prism, (e) pyramid, (f) hexahedron.

The Ki are called cells of the mesh. Remark 4.2. Sometimes the smoothness requirement on the diffeomorphism is relaxed and mapping that are continuous but only piecewise C∞ are admitted. Following the terminology of Sect. 3.1, each cell is an interval (d = 1), a Lipschitz polygon (d = 2), or a Lipschitz polyhedron (d = 3). Therefore, we can refer to vertices, edges (d > 1), and faces (d = 3) of a cell appealing to the geometric meaning of the terms. See Fig. 4.1 for a list of cell types in 2 and 3 dimensions. Meshes are a crucial building block in the design of the finite dimensional trial and test spaces used in the finite element method. Furthermore, they provide subdomains for integration to build the system matrix and vector of the right hand side. Definition 4.3. For d = 3 the set of (topological) faces of a mesh is given by M ( ) := interior(K K ), 1 i < j ] (geometric) faces ∂Ω , F M i ∩ j ≤ ≤ M ∪ { ⊂ } the set of (topological) edges of is defined as M ( ) := interior(F F 0),F,F 0 ( ),F = F 0 , E M ∩ ∈ F M 6 whereas the set of (topological) nodes is

( ) := E E0,E,E0 ( ),E = E0 . N M ∩ ∈ E M 6 

65 Similarly, we can define sets of edges and nodes for d = 2, and the set of nodes for d = 1. Often the term face is used for components of a mesh of dimension d 1, that is, for faces in three dimensions, edges in two dimensions, and nodes in one− dimension. We still write ( ) for the set of these (generalized) faces. F M

Figure 4.2: Two-dimensional mesh and the sets of edges (red) and nodes (blue)

Remark 4.4. The “is contained in the closure of” incidence relations ( ) N M × ( ) true, false , ( ) ( ) true, false , etc., describe the topology of a mesh.E M 7→ The { locations} of nodesE M and× F shapeM 7→ of cells { are features} connected wity the geometry of the mesh. We can define the nodes (E) of an edge, the edges (F ) of a face and N E the faces (K) of a cell. F Already contained in the definition of a mesh is the notion of reference cells. By them we mean a finite set K1,..., KP , P N, of d-dimensional polytopes such that all ∈ cells of the mesh can be obtained from one of the Ki under a suitable diffeomorphism. We recall that a mappingb b d d b d,d d Φ : R R , ξ Fξ + τ , F R regular , τ R (AFF) 7→ 7→ ∈ ∈ represents a bijective affine mapping of d-dimensional Euklidean space. An affine map- ping from a triangle is a triangle, and an affine mapping of a square is a parallelogram. Definition 4.5. A mesh of Ω Rd is called affine equivalent, if all its cells arise M ⊂ as affine images of a single d-dimensional (reference) polytope. A family of meshes is affine equivalent, if this is true for each of its members {Mn}n∈N and if the same reference polytope can be chosen for all n, n N. M ∈ To map from a square to a general (straight) quadrilateral we need a bilinear mapping 2 2 Φ : R R , ξ Fξ + τ + τ Qξ1ξ2. (4.1) 7→ 7→ Note, that for the bilinear mapping to a non-convex quadrilateral (one angle 180◦) is ≥ not a diffeorphism. With the so called blending techniques a mapping for curved cells for given parametri- sation of curved edges can be defined:

x (ξ1, ξ2) = Φ ξ = (1 ξ2) x1(ξ1) + ξ1 x2(ξ2) + ξ2 x3(1 ξ1) + (1 ξ1) x4(1 ξ2) K K − − − − (1 ξ1)(1 ξ2) p ξ1(1 ξ2) p ξ1ξ2 p (1 ξ1)ξ2 p . − − − 0 − − 1 − 2 − − 3

66 p2

x3(t)

x2(t) ξ2 p3 1 K

ΦK x4(t)

Kˆ p0 p1 x1(t)

0 ξ 0 1 1

Figure 4.3: Mapping for curved quadrilateral.

Definition 4.6. A mesh = K M of Ω is called a triangulation, if K K , M { i}i=1 i ∩ j 1 i < j M, agrees with a geometric face/edge/vertex of Ki or Kj. The cells of a triangulation≤ ≤ are sometimes called elements.

Figure 4.4: A mesh that is not a triangulation. The arrow points at the culprit.

An important concept is that of the orientation of the geometric objects of a trian- gulation. Definition 4.7. Orienting an edge amounts to prescribing a direction. The orientation of a face for d = 3 can be fixed by specifying an ordering of the edges along its boundary. If all geometric objects of a triangulation are equipped with an orientation, we will call it an oriented triangulation. Example 4.8. In the case of a conforming simplicial triangulation the orientation of all geometric objects can be fixed by sorting the vertices. This will induce an ordering of the vertices of all cells, edges, and faces, which, in turns, defines their orientation.

67 Definition 4.9. A triangulation := K M of Ω is called conforming, if K K , M { i}i=1 i ∩ j 1 i < j M, is a (geometric) face of both K and K . ≤ ≤ i j Unless clearly stated otherwise we will tacitly assume that all meshes that will be used for the construction of finite element spaces in the remainder of this chapter are conforming.

Figure 4.5: A triangulation that is not conforming and possesses two hanging nodes.

Definition 4.10. A node of a mesh that is located in the interior of a geometric face of one of its cells is known as hangingM (dangling) node. In the case of triangulations we can distinguish special classes: simplicial triangulations that entirely consist of triangles (d = 2) or tetrahedra • (d = 3), whose edges/faces might be curved, nevertheless. quadrilateral (d = 2) and hexahedral (d = 3) triangulations, which only com- • prise cells of these shapes. Curved edges or faces are admitted, again. Corollary 4.11. Any family of simplicial triangulations is affine equivalent. Remark 4.12. A quadrilaterial triangulation need not be affine equivalent, because, for instance, there is no affine map taking a square to general trapezoid. 3 Exercise 4.1. Let ν1,..., ν4 R stand for the coordinate vectors of the four vertices of ∈ a tetrahedron K. Determine the affine mapping (AFF) that takes the “unit tetrahedron” 0 1 0 0 K := convex 0 , 0 , 1 , 0          0 0 0 1  b         to K. When will the matrix F of this affine mapping be regular? 2 Exercise 4.2. Let ν1,..., ν4 R denote the coordinate vectors belonging to the four ∈ vertices of a quadrilateral K in the plane. Determine an analytic description of a simple 2 2 smooth mapping Φ : R R from the “reference square” 7→ 0 1 0 1 K := convex , , , 0 0 1 1         to K. Compute the Jacobianb DΦ and its .

68 Figure 4.6: Examples of triangular and quadrilateral meshes in two dimensions

The term grid is often used as a synonym for triangulation, but we will reserve it for meshes with a locally translation invariant structure. These are, for instance, meshes whose cells are quadrilaterals (d = 2) or hexahedra (d = 3) with parallel sides.

Figure 4.7: Example of triangular and quadrilateral grids in two dimensions

Automatic mesh generation is a challenging subject, which deals with the design of algorithms that create a mesh starting from a description of Ω. Such a description can be given

in terms of geometric primitives (ball, brick, etc.) whose unions or intersections • constitute Ω.

by means of a parameterization of the faces of Ω. • through a function f : Rd R, whose sign indicates whether a point is located • 7→ inside Ω or outside.

by a mesh covering the surface of Ω and a direction of the exterior unit normal. • Various strategies can be employed for automatic grid generation:

advancing front method that build cells starting from the boundary. •

69 Delaunay refinement techniques that can create a mesh starting from a mesh for • ∂Ω or a “cloud” of points covering Ω.

the quadtree (d = 2) or octree (d = 3) approach, which fills Ω with squares/cubes • of different sizes supplemented by special measures for resolving the boundary.

mapping techniques that split Ω into sub-domains of “simple” shape (curved tri- • angles, parallelograms, bricks), endow those with parametric grids and glue these together.

Remark 4.13. Traditional codes for the solution of boundary value problems based on the finite element method usually read the geometry from a file describing the topolgy and geometry of the underlying mesh. Then an approximate solution is computed and written to file in order to be read by post-processing tools like visualization software, see Fig. 4.8. Parameters

Finite element solver Post-processor Mesh generator (computational (e.g. visualization) kernel)

Figure 4.8: Flow of data in traditional finite element simulations

Remark 4.14. A typical file format for a mesh of a simplicial conforming triangulation of a two-dimensional polygonal domain is the following:

# Two-dimensional simplicial mesh N N # Number of nodes ∈ ξ1 η1 # Coordinates of first node

ξ2 η2 # Coordinates of second node . .

ξN ηN # Coordinates of N-th node (4.2) M N # Number of triangles ∈ 1 1 1 n1 n2 n3 X1 # Indices of nodes of first triangle 2 2 2 n1 n2 n3 X2 # Indices of nodes of second triangle . . M M M n1 n2 n3 XM # Indices of nodes of M-th triangle

70 Here, Xi, i = 1,...,M, is an additional piece of information that may, for instance, describe what kind of material properties prevail in triangle #i. In this case Xi may be an integer index into a look-up table of material properties or the actual value of a coefficient function inside the triangle. Additional information about edges located on ∂Ω may be provided in the following form:

K N # Number of edges on ∂Ω ∈ 1 1 n1 n2 Y1 # Indices of endpoints of first edge 2 2 n1 n2 Y2 # Indices of endpoints of second edge (4.3) . . K K n1 n2 YK # Indices of endpoints of K-th edge where Yk, k = 1,...,K, provides extra information about the type of boundary conditions to be imposed on edge #k. Some file formats even list all edges of the mesh in the format (4.3). Please note that the ordering of the nodes in the above file formats implies an orien- tation of triangles and edges.
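A possible reader for the file layout sketched in (4.2) and (4.3) (a sketch only: it assumes whitespace-separated records with '#' comments, that the node indices are 1-based, and that the boundary-edge block (4.3) is present; none of this is prescribed by the notes):

```python
import numpy as np

def read_mesh(filename):
    """Read nodes, triangles (with extra tag X_i) and boundary edges (with tag Y_k)."""
    with open(filename) as fh:
        tokens = []
        for line in fh:
            line = line.split("#", 1)[0]      # strip comments
            tokens.extend(line.split())
    it = iter(tokens)

    n_nodes = int(next(it))
    nodes = np.array([[float(next(it)), float(next(it))] for _ in range(n_nodes)])

    n_tri = int(next(it))
    triangles, tri_tags = [], []
    for _ in range(n_tri):
        triangles.append([int(next(it)) - 1 for _ in range(3)])   # 1-based -> 0-based
        tri_tags.append(next(it))             # material index or coefficient value

    n_edges = int(next(it))
    edges, edge_tags = [], []
    for _ in range(n_edges):
        edges.append([int(next(it)) - 1, int(next(it)) - 1])
        edge_tags.append(next(it))            # boundary-condition type

    return nodes, np.array(triangles), tri_tags, np.array(edges), edge_tags
```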

For a comprehensive account on mesh generation see [16]. An interesting algorithm for Delaunay meshing is described in [17, 18]. Free mesh generation software is also available, just to name some, netgen, gmsh, triangle, emc2. However, the most sophisticated mesh generation tools are commercial products and their algorithmic details are classified.

4.2 Linear finite elements on triangular meshes

4.2.1 Basis functions

Figure 4.9: FE basis function $b_P(x)$ (hat function attached to the node $P$, with value 1 at $P$).

Let the domain $\Omega \subset \mathbb{R}^2$ be a polygon with a triangular conforming mesh $\mathcal{M}$.
We define the finite element space of piecewise linear, continuous functions as
\[ S^1(\Omega, \mathcal{M}) := \{ u \in C^0(\Omega) : u(x)|_K = a + bx_1 + cx_2 \ \forall K \in \mathcal{M} \}. \]

Proposition 4.15 (Properties of $S^1(\Omega, \mathcal{M})$).
• It holds $S^1(\Omega, \mathcal{M}) \subset H^1(\Omega)$.
• $u \in S^1(\Omega, \mathcal{M})$ is uniquely defined by the values $u(P)$ at the nodes $P \in \mathcal{N}(\mathcal{M})$.
• $N = \dim S^1(\Omega, \mathcal{M}) = \sharp\mathcal{N}(\mathcal{M}) < \infty$.
• $S^1(\Omega, \mathcal{M}) = \operatorname{span}\{ b_P(x) : P \in \mathcal{N}(\mathcal{M}) \}$, where the hat functions are defined by
\[ b_P \in S^1(\Omega, \mathcal{M}), \qquad b_P(P') = \delta_{P P'} \quad \forall P' \in \mathcal{N}(\mathcal{M}). \]

Let $\mathbf{b}$ be the vector of the basis functions (for a fixed numbering). Then, an arbitrary FE function $v \in S^1(\Omega, \mathcal{M})$ can be written as
\[ v(x) = \sum_{P \in \mathcal{N}(\mathcal{M})} v(P)\, b_P(x) = \mathbf{v}^\top \mathbf{b}(x), \]
and so also the solution
\[ u_n(x) = \sum_{P \in \mathcal{N}(\mathcal{M})} u_n(P)\, b_P(x) = \boldsymbol{\mu}^\top \mathbf{b}(x). \]

4.2.2 Assembling of system matrix and load vector

The support of a basis function $b_P(x)$ consists of only a few triangles, i.e., for a pair of basis functions the integrals in the bilinear form reduce to a small set of cells.
Changing the point of view, we consider the shape functions, which are the restrictions of a basis function to one cell $K \in \mathcal{M}$. For $S^1(\Omega, \mathcal{M})$ these are exactly three, one for each node of $K$. The shape functions can be defined on a single cell $K$ as
\[ N_{K,P_j(K)}(x) = \widehat{N}_j(\Phi_K^{-1} x), \tag{4.4} \]
where the $\widehat{N}_j$ are the element shape functions defined by
\[ \widehat{N}_0(\xi) = 1 - \xi_1 - \xi_2, \qquad \widehat{N}_j(\xi) = \xi_j, \quad j = 1, 2, \tag{4.5} \]
on the reference element, which is the triangle with vertices $(0,0)$, $(1,0)$, $(0,1)$. $P_j(K)$ is the $j$-th node of the triangle $K$, $j = 0, 1, 2$, with coordinate $p_j$. Then, the (affine) element mapping is
\[ x = \Phi_K(\xi) = p_0 + \xi_1 (p_1 - p_0) + \xi_2 (p_2 - p_0) = \tau_K + \mathbf{F}_K\, \xi , \]

with $\mathbf{F}_K = (p_1 - p_0,\ p_2 - p_0)$ and $\tau_K = p_0$.
As the element mapping is affine, the shape functions are in fact linear as well. The basis functions result from the shape functions by glueing, i.e.,
\[ b_P(x) = \begin{cases} N_{K,P_j(K)}(x) & x \in K,\ P = P_j(K), \\ 0 & \text{otherwise.} \end{cases} \]
This can be expressed by the connectivity or T-matrices
\[ \mathsf{T}_K \in \mathbb{R}^{3,N}, \qquad (\mathsf{T}_K)_{ki} = \begin{cases} 1, & P_k(K) = P_i(\mathcal{M}), \\ 0, & \text{otherwise.} \end{cases} \]
These matrices give the relation between the local numbering of the (element) shape functions and the global numbering of the basis functions. They have the form
\[ \mathsf{T}_K = \begin{pmatrix} \cdots & 1 & \cdots & \cdots & \cdots & \cdots & \cdots \\ \cdots & \cdots & \cdots & 1 & \cdots & \cdots & \cdots \\ \cdots & \cdots & \cdots & \cdots & \cdots & 1 & \cdots \end{pmatrix}. \]
For the T-matrices an extremely sparse format is used: for linear finite elements on triangles we only have to store the three global indices $P_{i_0}$, $P_{i_1}$ and $P_{i_2}$ of the three nodes of the triangle.
Now, we can express a basis function in one cell $K$ as

\[ b_{P_i}(x)\big|_K = \sum_{k=1}^{3} (\mathsf{T}_K)_{ki}\, N_{K,P_k(K)}(x), \tag{4.6} \]
and the system matrix is given by
\[ \begin{aligned}
(\mathbf{B})_{ij} = b(b_{P_j}, b_{P_i}) &= \sum_K b\bigl(b_{P_j}|_K,\, b_{P_i}|_K\bigr) \\
&= \sum_K b\Bigl( \sum_{k=1}^{3} (\mathsf{T}_K)_{kj} N_{K,P_k(K)},\ \sum_{\ell=1}^{3} (\mathsf{T}_K)_{\ell i} N_{K,P_\ell(K)} \Bigr) \\
&= \sum_K \sum_{k=1}^{3} \sum_{\ell=1}^{3} (\mathsf{T}_K)_{kj} (\mathsf{T}_K)_{\ell i}\, b\bigl(N_{K,P_k(K)},\, N_{K,P_\ell(K)}\bigr) \\
&= \sum_K (\mathsf{T}_K)_{\cdot,i}^\top\, \mathbf{B}_K\, (\mathsf{T}_K)_{\cdot,j}
\end{aligned} \tag{4.7} \]
as a sum of "weighted" element matrices,
\[ \mathbf{B} = \sum_K \mathsf{T}_K^\top \mathbf{B}_K \mathsf{T}_K . \tag{4.8} \]
This means we have to integrate only over the shape functions, and to sum up the contributions over all the cells while scattering them to the right position in the system matrix.
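In code one rarely forms the $3\times N$ matrices $\mathsf{T}_K$ explicitly; storing the three global node indices per triangle and scattering the $3\times 3$ element matrix into $\mathbf{B}$ realises exactly the sum (4.8). A minimal sketch (illustrative only; it uses the element mass matrix for $c \equiv 1$ from Sect. 4.2.4 below as a stand-in element matrix, and a dense matrix instead of a sparse format):

```python
import numpy as np

def element_mass_matrix(p0, p1, p2):
    """Element matrix of int_K u v dx on a straight triangle (c = 1), cf. Sect. 4.2.4."""
    area = 0.5 * abs((p1[0] - p0[0]) * (p2[1] - p0[1]) - (p2[0] - p0[0]) * (p1[1] - p0[1]))
    return area * (np.full((3, 3), 1.0 / 12.0) + np.diag([1.0 / 12.0] * 3))

def assemble(coords, triangles, element_matrix):
    """B = sum_K T_K^T B_K T_K, realised by scattering via global node indices."""
    N = coords.shape[0]
    B = np.zeros((N, N))                    # use scipy.sparse in real codes
    for tri in triangles:                   # tri = (i0, i1, i2): global node indices
        BK = element_matrix(*(coords[i] for i in tri))
        for k_loc, k_glob in enumerate(tri):
            for l_loc, l_glob in enumerate(tri):
                B[l_glob, k_glob] += BK[l_loc, k_loc]
    return B

# Two triangles forming the unit square
coords = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
triangles = [(0, 1, 2), (0, 2, 3)]
B = assemble(coords, triangles, element_mass_matrix)
print(B.sum())   # = int_Omega 1*1 dx = |Omega| = 1, a cheap consistency check
```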

73 4.2.3 Element stiffness matrix The bilinear form in the variational problem (3.23) consists of three parts, which con- tribute to the system matrix. The first part

a(u, v) = σ grad u, grad v dx h i ZΩ is constituting the stiffness matrix A. We transform the derivatives to K. By chain rule of differentiation and with

∂ ∂ > ∂ ∂ > = , b , = , , ∇ ∂x ∂x ∇ ∂ξ ∂ξ  1 2   1 2  we have b

∂x1 ∂x2 ∂ξ1 NK,Pj (x(ξ)) = ∂x1 NK,Pj (x) + ∂x2 NK,Pj (x) ∂ξ1 ∂ξ1 ∂x1 ∂x2 ∂ξ2 NK,Pj (x(ξ)) = ∂x1 NK,Pj (x) + ∂x2 NK,Pj (x) ∂ξ2 ∂ξ2 and so ∂Φ > N (x(ξ)) = K N (x), (4.9) ∇ K,Pj ∂ξ ∇ K,Pj   > b DΦK and so for triangular cells | {z }

\[ \nabla N_{K,P_j}(x) = \mathbf{F}_K^{-\top}\, \widehat\nabla N_{K,P_j}(x(\xi)) = \mathbf{F}_K^{-\top}\, \widehat\nabla \widehat{N}_j(\xi) . \]
For a single cell $K$ the entries of the element stiffness matrix $\mathbf{A}_K$ are thus given by
\[ \begin{aligned}
a_K(N_{K,P_j}, N_{K,P_i}) &= \int_K \sigma(x)\, (\nabla N_{K,P_j}(x))^\top \nabla N_{K,P_i}(x) \,dx \\
&= \int_{\widehat K} \sigma(\Phi_K \xi)\, (\widehat\nabla \widehat{N}_j(\xi))^\top \mathbf{F}_K^{-1} \mathbf{F}_K^{-\top}\, \widehat\nabla \widehat{N}_i(\xi)\, |\mathbf{F}_K| \,d\xi \\
&= \int_{\widehat K} \sigma(\Phi_K \xi)\, (\widehat\nabla \widehat{N}_j(\xi))^\top \operatorname{adj}(\mathbf{F}_K)\operatorname{adj}(\mathbf{F}_K)^\top\, \widehat\nabla \widehat{N}_i(\xi)\, |\mathbf{F}_K|^{-1} \,d\xi ,
\end{aligned} \tag{4.10} \]
where we use the relation between the inverse and the adjugate matrix (dt. Adjunkte),
\[ \mathbf{F}_K^{-1} = |\mathbf{F}_K|^{-1} \operatorname{adj}(\mathbf{F}_K), \qquad \operatorname{adj}\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}. \]
Note that the gradients of the element shape functions are constant vectors,
\[ \widehat\nabla \widehat{N}_0 = \begin{pmatrix} -1 \\ -1 \end{pmatrix}, \qquad \widehat\nabla \widehat{N}_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \widehat\nabla \widehat{N}_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{4.11} \]
So we can simplify
\[ (\mathbf{A}_K)_{ij} = a_K(N_{K,P_j}, N_{K,P_i}) = \frac{\sigma_K}{2}\, (\widehat\nabla \widehat{N}_j)^\top \operatorname{adj}(\mathbf{F}_K)\operatorname{adj}(\mathbf{F}_K)^\top\, \widehat\nabla \widehat{N}_i\, |\mathbf{F}_K|^{-1} \]
with $\sigma_K = \frac{1}{|K|}\int_K \sigma(x)\,dx$ the average heat conduction in $K$ (note that $|\mathbf{F}_K| = 2|K|$). The average is trivial for constant material coefficients and may be obtained by numerical quadrature for more general smooth functions. Due to the special values of $\widehat\nabla \widehat{N}_j$ (see (4.11)) we can simplify even further [19]:
\[ \mathbf{A}_K = \frac{\sigma_K}{4|K|}\, \mathbf{D}_K^\top \mathbf{D}_K \]
with a matrix $\mathbf{D}_K$ of coordinate differences,
\[ \mathbf{D}_K = \begin{pmatrix} y_1 - y_2 & y_2 - y_0 & y_0 - y_1 \\ x_2 - x_1 & x_0 - x_2 & x_1 - x_0 \end{pmatrix}. \]
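A direct transcription of the formula $\mathbf{A}_K = \frac{\sigma_K}{4|K|}\mathbf{D}_K^\top\mathbf{D}_K$ (a sketch for illustration, assuming constant $\sigma$ on the triangle):

```python
import numpy as np

def element_stiffness(p0, p1, p2, sigma=1.0):
    """A_K = sigma/(4|K|) D_K^T D_K for a straight triangle with vertices p0, p1, p2."""
    (x0, y0), (x1, y1), (x2, y2) = p0, p1, p2
    D = np.array([[y1 - y2, y2 - y0, y0 - y1],
                  [x2 - x1, x0 - x2, x1 - x0]])
    area = 0.5 * abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))
    return sigma / (4.0 * area) * D.T @ D

# Reference triangle (0,0), (1,0), (0,1): gives 1/2 * [[2,-1,-1],[-1,1,0],[-1,0,1]]
A = element_stiffness((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
print(A)
print("row sums (should vanish, constants lie in the kernel):", A.sum(axis=1))
```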

m(u, v) = c uv dx. ZΩ The element mass matrix MK for the cell K can be computed as

\[ m_K(N_{K,P_j}, N_{K,P_i}) = \int_K c(x)\, N_{K,P_j}(x)\, N_{K,P_i}(x) \,dx = \int_{\widehat K} c(\Phi_K \xi)\, \widehat N_j(\xi)\, \widehat N_i(\xi)\, |\mathbf{F}_K| \,d\xi . \tag{4.12} \]
In case of a constant function $c_K$ in the cell $K$ we can write
\[ (\mathbf{M}_K)_{ij} = c_K\, |K| \begin{cases} \tfrac{1}{6}, & i = j, \\ \tfrac{1}{12}, & i \ne j. \end{cases} \]

4.2.5 Element load vector

The element load vector is related to the linear form

\[ \ell_K(v) = \int_K f v \,dx . \]
For a general smooth function $f$ we use numerical quadrature to evaluate the integrals. The simplest quadrature rule is
\[ \int_K f v \,dx \approx |K|\, f(x_K)\, v(x_K) , \]
where $x_K$ is the barycenter (dt. Schwerpunkt) of the cell $K$. As this quadrature rule is only exact for linear functions, and the shape functions are already linear, it will provide reasonable results only if $f$ is (almost) constant in $K$.
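For the linear shape functions the barycenter rule gives a particularly simple element load vector, since each $N_{K,P_j}$ has the value $1/3$ at the barycenter. A small sketch (illustrative only):

```python
import numpy as np

def element_load(p0, p1, p2, f):
    """Barycenter-rule approximation of (int_K f N_j dx)_{j=0,1,2} on a straight triangle."""
    p0, p1, p2 = map(np.asarray, (p0, p1, p2))
    area = 0.5 * abs((p1[0] - p0[0]) * (p2[1] - p0[1]) - (p2[0] - p0[0]) * (p1[1] - p0[1]))
    xK = (p0 + p1 + p2) / 3.0                        # barycenter
    return area * f(xK) * np.full(3, 1.0 / 3.0)      # each linear shape function equals 1/3 at xK

print(element_load((0, 0), (1, 0), (0, 1), lambda x: 2.0))   # -> [1/3, 1/3, 1/3]
```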

75 4.3 Higher order finite elements on curved cells

4.3.1 Linear finite elements on quadrilateral cells The element shape functions for the linear finite elements on quadrilateral cells are

\[ \widehat N_0(\xi) = (1 - \xi_1)(1 - \xi_2), \qquad \widehat N_1(\xi) = \xi_1 (1 - \xi_2), \qquad \widehat N_2(\xi) = \xi_1 \xi_2, \qquad \widehat N_3(\xi) = (1 - \xi_1)\, \xi_2 , \]
which span on the square $\widehat K = (0,1)^2$ the space
\[ \mathcal{Q}_1(\widehat K) := \operatorname{span}\{ 1, \xi_1, \xi_2, \xi_1\xi_2 \} . \]
The element shape functions are linear along the edges of the square $[0,1]^2$, and, hence, the shape functions resulting after the bilinear mapping (4.1), defined as for triangles by (4.4), are linear along the edges of $K$ with value 1 at one of the four nodes, respectively. This allows us to glue the shape functions of the cells around a node into globally continuous basis functions. These basis functions span the space
\[ S^1(\Omega, \mathcal{M}) = \{ u \in C^0(\Omega) : u(x)|_K \circ \Phi_K(\xi) \in \mathcal{Q}_1(\widehat K) \} . \]
Since the inverse of the bilinear mapping (4.1) is in general not polynomial, the basis functions are in general no longer piecewise polynomials; on each cell they are only the image of a polynomial under $\Phi_K$.
The matrix assembly is similar to that for triangles, except that the T-matrices have 4 rows and the element matrices follow from (4.10) and (4.12) with $\mathbf{F}_K$ replaced by the Jacobian matrix $D\Phi_K(\xi)$, which is no longer constant, but linear in $\xi_1$ and $\xi_2$. In the formula for the stiffness matrix the inverse of the Jacobian determinant appears, which is not a polynomial. Even for constant material functions $\sigma(x)$ and $c(x)$, the use of numerical quadrature rules becomes attractive.

4.3.2 Numerical quadrature for quadrilateral cells

The integration is over the reference cell [0,1]^2, so a tensor product of 1D quadrature rules can be applied. The quadrature rules are usually given on the interval [-1, 1]. So, we simply transform 1D integrals via

\int_0^1 f(\xi) \, d\xi = \frac{1}{2} \int_{-1}^{1} f\Big(\frac{\xi+1}{2}\Big) \, d\xi \approx \frac{1}{2} \sum_{j=1}^{n} w_j \, f\Big(\frac{\xi_j+1}{2}\Big),    (4.13)

where w_j are the weights and \xi_j the abscissas of the quadrature rule.
Accurate quadrature rules for smooth functions are variants of Gauss quadrature. The most well-known is the Gauss-Legendre rule, for which the abscissas are the zeros of the n-th Legendre polynomial P_n(\xi) and the weights are given by (see [20, page 887])

w_j = \frac{2}{(1 - \xi_j^2) \, [P_n'(\xi_j)]^2}.

The Gauss-Legendre rule is exact for polynomials of degree 2n-1 and the remainder (for the interval [0,1]) is

R_n = \frac{(n!)^4}{(2n+1) \, [(2n)!]^3} \, f^{(2n)}(\xi), \qquad 0 < \xi < 1.

The zeros of the Legendre polynomials are tabulated (see [20, page 921ff]) and there is an algorithm in the Numerical Recipes [21] (see the Matlab version gauleg.m on the webpage of the lecture). The Gauss-Lobatto rule is exact only for polynomials of degree 2n-3, but it includes both end-points.
For the square [0,1]^2 we have the product quadrature rule

\int_0^1 \int_0^1 f(\xi) \, d\xi_1 \, d\xi_2 = \frac{1}{4} \int_{-1}^{1} \int_{-1}^{1} f\Big(\frac{\xi_1+1}{2}, \frac{\xi_2+1}{2}\Big) \, d\xi_1 \, d\xi_2 \approx \frac{1}{4} \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} w_i w_j \, f\Big(\frac{\xi_i+1}{2}, \frac{\xi_j+1}{2}\Big),

with n_1 and n_2 quadrature points in the two directions.
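A minimal sketch of this tensor-product rule, using NumPy's Gauss-Legendre nodes and weights on [-1, 1] transformed to [0, 1] as in (4.13); the function names are illustrative choices.

```python
import numpy as np

def gauss_01(n):
    """Gauss-Legendre nodes and weights transformed from [-1, 1] to [0, 1]."""
    xi, w = np.polynomial.legendre.leggauss(n)   # nodes/weights on [-1, 1]
    return 0.5 * (xi + 1.0), 0.5 * w

def quad_unit_square(f, n1, n2):
    """Tensor-product Gauss quadrature of f(x, y) over (0, 1)^2."""
    x, wx = gauss_01(n1)
    y, wy = gauss_01(n2)
    X, Y = np.meshgrid(x, y, indexing="ij")
    return np.einsum("i,j,ij->", wx, wy, f(X, Y))

# n-point Gauss is exact up to degree 2n - 1 in each variable:
print(quad_unit_square(lambda x, y: x**3 * y**5, 2, 3))   # exact value 1/24
```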

4.3.3 Linear finite elements on curved cells

For curved cells the mapping from the reference cell, either triangle or square, is even more general than the bilinear mapping. This influences only the Jacobian matrix D\Phi_K(\xi), and numerical quadrature will be essential to compute the element matrix entries. The continuity of the basis functions is assured if, for neighbouring cells K_1 and K_2, the mapping from a reference interval [0,1] to the common edge, induced by the mappings \Phi_{K_1} and \Phi_{K_2}, is the same. This is also true for mixed meshes of curved triangles and quadrilaterals.
Let us define the local space for triangles as

\mathcal{P}_1(\hat K) := \operatorname{span}\{ 1, \xi_1, \xi_2 \}.

We are now in the position to introduce a general definition of the space

S^1(\Omega, \mathcal{M}) = \{ u \in H^1(\Omega) : \; u(x)|_K \circ \Phi_K(\xi) \in \mathcal{P}_1(\hat K) \text{ if } K \text{ is a triangle, or } u(x)|_K \circ \Phi_K(\xi) \in \mathcal{Q}_1(\hat K) \text{ if } K \text{ is a quadrilateral} \}.

For curved triangular cells we also have to rely on numerical quadrature.

4.3.4 Numerical quadrature for triangular cells

Numerical quadrature rules for triangular cells are usually defined on the triangle with nodes (-1,-1), (1,-1), (-1,1), to which integrals over \hat K can easily be transformed:

\int_0^1 \int_0^{1-\xi_1} f(\xi) \, d\xi_2 \, d\xi_1 = \frac{1}{4} \int_{-1}^{1} \int_{-1}^{-\xi_1} f\Big(\frac{\xi_1+1}{2}, \frac{\xi_2+1}{2}\Big) \, d\xi_2 \, d\xi_1 \approx \frac{1}{4} \sum_{j=1}^{n} w_j \, f\Big(\frac{\xi_{1,j}+1}{2}, \frac{\xi_{2,j}+1}{2}\Big).

Let us define generalisations of \mathcal{P}_1 and \mathcal{Q}_1.

Definition 4.16. Given a domain K \subset \mathbb{R}^d, d \in \mathbb{N}, we write

\mathcal{P}_m(K) := \operatorname{span}\{ \xi \in K \mapsto \xi^{\alpha} := \xi_1^{\alpha_1} \cdots \xi_d^{\alpha_d}, \; \alpha \in \mathbb{N}_0^d, \; |\alpha| \le m \}

for the vector space of d-variate polynomials of (total) degree m, m \in \mathbb{N}_0. If m = (m_1, \ldots, m_d)^{\top} \in \mathbb{N}_0^d we designate by

\mathcal{Q}_m(K) := \operatorname{span}\{ \xi \in K \mapsto \xi_1^{\alpha_1} \cdots \xi_d^{\alpha_d}, \; 0 \le \alpha_k \le m_k, \; 1 \le k \le d \}

the space of tensor product polynomials of maximal degree m_k in the k-th coordinate direction.
There are Gauss quadrature rules for triangles which are exact for polynomials of maximal total degree 1, 2, ..., 5 and which use only 1, 3, 4, 6 and 7 points, respectively, see e. g. page 141 in Šolín [22]. To obtain higher accuracies in a systematic manner there are, to name just the best known,

• representations of the integral over a reference square combined with a tensor-product quadrature rule,

• approximated Fekete points, where the points lying on the three edges correspond to Gauss-Lobatto points, and which are tabulated.

Quadrature rules for triangles can be obtained from quadrature rules on a square using the Duffy transformation, see Fig. 4.10. When K = \operatorname{convex}\{ \binom{0}{0}, \binom{1}{0}, \binom{0}{1} \} and \hat K = \,]0,1[^2, then

\int_K f(\xi_1, \xi_2) \, dx = \int_{\hat K} f\big( \hat\xi_1 (1 - \hat\xi_2), \, \hat\xi_2 \big) \, (1 - \hat\xi_2) \, d\hat\xi.

If f \in \mathcal{P}_m(K), then the integrand on the right hand side belongs to \mathcal{Q}_{m,m+1}(\hat K).
The use of numerical quadrature inevitably introduces another approximation, which will contribute to the overall discretization error. The general rule is that

    The error due to numerical quadrature must not dominate the total discretization error in the relevant norms.

Remark 4.17. An alternative to numerical quadrature is polynomial interpolation of coefficients, source functions and (the inverse of) the Jacobian matrix, followed by analytical evaluation of the localised integrals.
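A minimal sketch of a Duffy-transformed quadrature rule on the reference triangle: a tensor Gauss rule on (0,1)^2 is mapped with \xi_1 = \hat\xi_1(1-\hat\xi_2), \xi_2 = \hat\xi_2, and the Jacobian (1-\hat\xi_2) is absorbed into the weights. Function names are illustrative choices.

```python
import numpy as np

def gauss_01(n):
    xi, w = np.polynomial.legendre.leggauss(n)   # nodes/weights on [-1, 1]
    return 0.5 * (xi + 1.0), 0.5 * w

def duffy_rule(n):
    """Quadrature on the triangle {xi1, xi2 >= 0, xi1 + xi2 <= 1} obtained from a
    tensor Gauss rule on (0,1)^2 via xi1 = s*(1 - t), xi2 = t."""
    s, ws = gauss_01(n)
    t, wt = gauss_01(n)
    S, T = np.meshgrid(s, t, indexing="ij")
    WS, WT = np.meshgrid(ws, wt, indexing="ij")
    x, y = S * (1.0 - T), T
    w = WS * WT * (1.0 - T)                      # Jacobian of the Duffy map
    return x.ravel(), y.ravel(), w.ravel()

x, y, w = duffy_rule(4)
print(w.sum())             # area of the triangle: 0.5
print(np.sum(w * x * y))   # integral of xi1*xi2 over the triangle: 1/24
```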

Figure 4.10: "Duffy transformation" of a square into a triangle: \xi_1 = \hat\xi_1 (1 - \hat\xi_2), \; \xi_2 = \hat\xi_2.

4.3.5 Higher order finite elements

The linear finite elements on triangles and quadrilaterals are a special case of finite elements of higher order. Their basis functions are identified with nodes P of the mesh and are linear in local coordinates along each edge. To obtain higher order finite elements we enrich the space of element shape functions by element shape functions

• identified with a particular edge E, on which they have polynomial behaviour and which vanish along all other edges, and so also at the end-points of E,

• identified with a single cell K and vanishing on all edges E. These are called interior or bubble functions.

The former can be written for \hat K = [0,1]^2 as

\hat N_{e,1,j}(\xi) = (1 - \xi_2) \, P_j(\xi_1), \quad lower edge,
\hat N_{e,2,j}(\xi) = \xi_1 \, P_j(\xi_2), \quad right edge,
\hat N_{e,3,j}(\xi) = \xi_2 \, P_j(\xi_1), \quad upper edge,
\hat N_{e,4,j}(\xi) = (1 - \xi_1) \, P_j(\xi_2), \quad left edge,

and the latter as

\hat N_{c,i,j}(\xi) = P_{i,j}(\xi_1, \xi_2),

where the P_j(\xi) are polynomials vanishing at \xi \in \{0, 1\} and the P_{i,j}(\xi) are polynomials vanishing on \partial\hat K. Note that the choice of the basis influences the condition number of the resulting matrices.
Remember that we are interested in a one-to-one relation between shape functions in neighbouring cells. This can be achieved by choosing one polynomial P and setting P_j(\xi) = P(\pm\xi), depending on whether the \xi_1- or \xi_2-direction along the edge coincides with the global orientation of the edge or not.
In the following we will discuss the use of a family of shape functions based on integrated Legendre polynomials, which are either symmetric or anti-symmetric and therefore allow for fast matrix assembly and a simple matching.

4.3.6 Integrated Legendre polynomials as basis in quadrilaterals

The Legendre polynomials are defined on [-1, 1] and orthogonal w.r.t. the L^2-inner product. We can define them by

L_0(\xi) = 1, \qquad L_1(\xi) = \xi, \qquad (j+1) L_{j+1}(\xi) = (2j+1)\,\xi L_j(\xi) - j L_{j-1}(\xi), \quad j \ge 1,

and

\int_{-1}^{1} L_j(\xi) L_i(\xi) \, d\xi = \frac{2}{2j+1} \, \delta_{ij}.

The integrated Legendre polynomials are

\hat L_0(\xi) = 1, \qquad \hat L_1(\xi) = \xi, \qquad \hat L_j(\xi) = \int_{-1}^{\xi} L_{j-1}(t) \, dt = \frac{1}{2j-1} \big( L_j(\xi) - L_{j-2}(\xi) \big), \quad j \ge 2.

The 1D element shape functions are then defined on [0,1] as

\hat N_0(\xi) = 1 - \xi, \qquad \hat N_1(\xi) = \xi,
\hat N_j(\xi) = \sqrt{\frac{2j-1}{2}} \, \hat L_j(2\xi - 1) = \frac{1}{\sqrt{2(2j-1)}} \big( L_j(2\xi-1) - L_{j-2}(2\xi-1) \big), \quad j \ge 2.

As \hat L_{2j}, j \ge 1, contains only even degree monomials and \hat L_{2j+1} only odd degree monomials, we have for j \ge 1

\hat N_{2j}(1 - \xi) = \hat N_{2j}(\xi), \qquad \hat N_{2j+1}(1 - \xi) = -\hat N_{2j+1}(\xi).

The 2D element shape functions for \mathcal{Q}_p([0,1]^2), p \ge 1, are given by

\hat N_{(p+1)i+j}(\xi) = \hat N_j(\xi_1) \, \hat N_i(\xi_2), \qquad 0 \le i, j \le p.

For the element shape functions identified with one of the four vertices we have i, j \le 1, for those related to one of the edges either i \le 1 or j \le 1, and for i, j \ge 2 the element shape functions are interior.
The basis on the reference element is hierarchic, meaning that for increasing polynomial degree only new functions appear while the former remain. In case of a constant function \sigma and a parallelogram cell the element stiffness and mass matrices have O(p) non-zero entries. Otherwise, the basis with integrated Legendre polynomials leads to moderately increasing condition numbers for increasing maximal polynomial degree p.
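A minimal sketch of evaluating these 1D shape functions via the Legendre recurrence; the function name `shape_functions_1d` is an illustrative choice.

```python
import numpy as np

def shape_functions_1d(xi, p):
    """Evaluate the 1D shape functions N_0, ..., N_p based on integrated Legendre
    polynomials at points xi in [0, 1] (rows = functions, columns = points)."""
    xi = np.asarray(xi, dtype=float)
    t = 2.0 * xi - 1.0                       # map [0, 1] -> [-1, 1]
    # Legendre polynomials L_0, ..., L_p via the three-term recurrence
    L = np.zeros((p + 1, t.size))
    L[0] = 1.0
    if p >= 1:
        L[1] = t
    for j in range(1, p):
        L[j + 1] = ((2 * j + 1) * t * L[j] - j * L[j - 1]) / (j + 1)
    N = np.zeros((p + 1, t.size))
    N[0] = 1.0 - xi
    if p >= 1:
        N[1] = xi
    for j in range(2, p + 1):
        # scaled integrated Legendre polynomial: (L_j - L_{j-2}) / sqrt(2(2j-1))
        N[j] = (L[j] - L[j - 2]) / np.sqrt(2.0 * (2 * j - 1))
    return N

# N_j vanishes at both end points for j >= 2
print(shape_functions_1d(np.array([0.0, 0.5, 1.0]), 4).round(3))
```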

Figure 4.11: The first 1D element shape functions based on integrated Legendre polynomials: (a) the linear element shape functions \hat N_0(\xi), \hat N_1(\xi); (b) the even element shape function \hat N_2(\xi); (c) the odd element shape function \hat N_3(\xi).

In the T-matrix there is exactly one entry with value 1

• for each element shape function identified with a node,

• for each element shape function identified with an edge, if the function along the edge is symmetric, and

• for each element shape function identified with a cell.

If the function is anti-symmetric along the edge we have to relate the local \xi_1- or \xi_2-direction to the (global) orientation of the edge. If they coincide, the entry in the T-matrix is 1, otherwise -1. The latter is equivalent to mirroring the element shape function w.r.t. the perpendicular bisector of the edge (dt. Mittelsenkrechte). This means that the global orientation of the edge, which is known to the two neighbouring cells, decides the direction of the shape functions.

4.3.7 Integrated Legendre polynomials as basis in triangles

A similar basis for triangles can be defined using the

Definition 4.18 (Barycentric coordinates). Given d+1 points p_0, \ldots, p_d \in \mathbb{R}^d that do not lie in a hyperplane, the barycentric coordinates \lambda_1 = \lambda_1(x), \ldots, \lambda_{d+1} = \lambda_{d+1}(x) \in \mathbb{R} of x \in \mathbb{R}^d are uniquely defined by

\lambda_1 + \cdots + \lambda_{d+1} = 1, \qquad \lambda_1 p_0 + \cdots + \lambda_{d+1} p_d = x.

The barycentric coordinates of a point x can be obtained by solving

\begin{pmatrix} p_{0,1} & \cdots & p_{d,1} \\ \vdots & & \vdots \\ p_{0,d} & \cdots & p_{d,d} \\ 1 & \cdots & 1 \end{pmatrix} \begin{pmatrix} \lambda_1 \\ \vdots \\ \lambda_d \\ \lambda_{d+1} \end{pmatrix} = \begin{pmatrix} x_1 \\ \vdots \\ x_d \\ 1 \end{pmatrix},    (4.14)

which shows their uniqueness and existence if the points p_j are not coplanar. The convex hull of p_0, \ldots, p_d can be described by

\operatorname{convex}\{p_0, \ldots, p_d\} = \{ x \in \mathbb{R}^d : \; 0 \le \lambda_i(x) \le 1, \; 1 \le i \le d+1 \}.
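A minimal sketch of computing barycentric coordinates by solving the (d+1)x(d+1) system (4.14); the function name is an illustrative choice.

```python
import numpy as np

def barycentric_coordinates(p, x):
    """Solve the system (4.14) for the barycentric coordinates of x with respect
    to the points p_0, ..., p_d (rows of p)."""
    p = np.asarray(p, dtype=float)              # shape (d+1, d)
    d = p.shape[1]
    A = np.vstack([p.T, np.ones(d + 1)])        # coordinates stacked over a row of ones
    b = np.append(np.asarray(x, dtype=float), 1.0)
    return np.linalg.solve(A, b)

# reference triangle: lambda = (1 - xi1 - xi2, xi1, xi2)
tri = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
print(barycentric_coordinates(tri, [0.2, 0.3]))   # -> [0.5, 0.2, 0.3]
```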

For the reference triangle \hat K we have

\lambda_1 = 1 - \xi_1 - \xi_2, \qquad \lambda_2 = \xi_1, \qquad \lambda_3 = \xi_2,

and the three element shape functions identified with the nodes P_0(K), P_1(K), P_2(K) are given by (4.5) or

\hat N_0(\lambda) = \lambda_1, \qquad \hat N_1(\lambda) = \lambda_2, \qquad \hat N_2(\lambda) = \lambda_3.

They vanish on the opposite edge, where \lambda_j = 0.
The element shape functions identified with the edge opposite to the node P_j have to vanish on the two neighbouring edges, where \lambda_{j-1} = 0 and \lambda_{j+1} = 0 (indices meant modulo 3). So we can write

\hat N_{e,1,j}(\lambda) = \lambda_2 \lambda_3 \, P_j(\lambda_1), \quad edge opposite to node P_0,
\hat N_{e,2,j}(\lambda) = \lambda_1 \lambda_3 \, P_j(\lambda_2), \quad edge opposite to node P_1,
\hat N_{e,3,j}(\lambda) = \lambda_1 \lambda_2 \, P_j(\lambda_3), \quad edge opposite to node P_2.

The interior element shape functions vanish on all three edges and can be written as

\hat N_{c,i,j}(\lambda) = \lambda_1 \lambda_2 \lambda_3 \, P_{i,j}(\lambda).

The functions P_j(\lambda) can be chosen as integrated Legendre polynomials \hat L_0, \ldots, \hat L_{p-2}, and P_{i,j}(\lambda) as tensor products of integrated Legendre polynomials in \lambda_1 and \lambda_2 (\lambda_3 is dependent) with maximal total polynomial degree p-3.
The choice of integrated Legendre polynomials is less natural here than for quadrilaterals, as the integration domain is not of tensor-product form. The following result will be important for computing element matrices with constant coefficients:

Lemma 4.19. For any non-degenerate triangle K and \beta_1, \beta_2, \beta_3 \in \mathbb{N},

\int_K \lambda_1^{\beta_1} \lambda_2^{\beta_2} \lambda_3^{\beta_3} \, dx = 2 |K| \cdot \frac{\beta_1! \, \beta_2! \, \beta_3!}{(\beta_1 + \beta_2 + \beta_3 + 2)!}.

Proof. The first step amounts to a transformation of K to the "unit triangle" \hat K := \operatorname{convex}\{ \binom{0}{0}, \binom{1}{0}, \binom{0}{1} \}, which leads to

\int_K \lambda_1^{\beta_1} \lambda_2^{\beta_2} \lambda_3^{\beta_3} \, dx = 2 |K| \int_0^1 \int_0^{1-\xi_2} \xi_1^{\beta_1} \xi_2^{\beta_2} (1 - \xi_1 - \xi_2)^{\beta_3} \, d\xi_1 \, d\xi_2 =: 2 |K| \, I.

Note that the barycentric coordinates evaluated on K equal the barycentric coordinates on the reference cell, which are \xi_1, \xi_2, 1 - \xi_1 - \xi_2. Transforming \hat K to (0,1)^2 by the Duffy transformation \xi_1 = \hat\xi_1 (1 - \hat\xi_2), \xi_2 = \hat\xi_2 (see Fig. 4.10) we get

I = \int_0^1 \int_0^1 \hat\xi_1^{\beta_1} (1 - \hat\xi_2)^{\beta_1} \, \hat\xi_2^{\beta_2} \, \big( 1 - \hat\xi_2 - \hat\xi_1 (1 - \hat\xi_2) \big)^{\beta_3} \, (1 - \hat\xi_2) \, d\hat\xi_1 \, d\hat\xi_2
  = \int_0^1 \hat\xi_1^{\beta_1} (1 - \hat\xi_1)^{\beta_3} \, d\hat\xi_1 \; \int_0^1 \hat\xi_2^{\beta_2} (1 - \hat\xi_2)^{\beta_1 + \beta_3 + 1} \, d\hat\xi_2 = B(\beta_1 + 1, \beta_3 + 1) \, B(\beta_2 + 1, \beta_1 + \beta_3 + 2),

where we used Euler's beta function

B(\alpha, \beta) := \int_0^1 t^{\alpha-1} (1 - t)^{\beta-1} \, dt, \qquad 0 < \alpha, \beta < \infty.

Using the known formula

B(\alpha, \beta) = \frac{\Gamma(\alpha) \, \Gamma(\beta)}{\Gamma(\alpha + \beta)},

\Gamma the Gamma function, we end up with

I = \frac{\Gamma(\beta_1 + 1) \, \Gamma(\beta_3 + 1)}{\Gamma(\beta_1 + \beta_3 + 2)} \cdot \frac{\Gamma(\beta_2 + 1) \, \Gamma(\beta_1 + \beta_3 + 2)}{\Gamma(\beta_1 + \beta_2 + \beta_3 + 3)}.

Then, \Gamma(n) = (n-1)! finishes the proof.
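A small numerical cross-check of Lemma 4.19 on the unit triangle (|K| = 1/2), comparing the factorial formula with a Duffy-transformed Gauss rule; purely illustrative, function names are mine.

```python
import numpy as np
from math import factorial

def exact_moment(b1, b2, b3, area=0.5):
    """Lemma 4.19: integral of lambda1^b1 * lambda2^b2 * lambda3^b3 over a triangle."""
    return 2.0 * area * factorial(b1) * factorial(b2) * factorial(b3) \
        / factorial(b1 + b2 + b3 + 2)

def quad_moment(b1, b2, b3, n=10):
    """Same integral on the unit triangle via a Duffy-transformed Gauss rule,
    with barycentric coordinates (1 - x - y, x, y)."""
    s, ws = np.polynomial.legendre.leggauss(n)
    s, ws = 0.5 * (s + 1.0), 0.5 * ws
    S, T = np.meshgrid(s, s, indexing="ij")
    WS, WT = np.meshgrid(ws, ws, indexing="ij")
    x, y, w = S * (1.0 - T), T, WS * WT * (1.0 - T)
    lam1 = 1.0 - x - y
    return np.sum(w * lam1**b1 * x**b2 * y**b3)

print(exact_moment(2, 3, 1), quad_moment(2, 3, 1))   # both ~ 2.98e-4
```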

4.4 Conforming finite element basis on non-conforming meshes

In non-conforming meshes hanging nodes and hanging edges appear that do not carry basis functions. The associated shape functions do not contribute by a one-to-one relation to a basis function. However, they are needed to represent hat functions, or basis functions identified with an edge, on a parent edge.

Example 4.20 (T-matrices for an irregular mesh). Consider three cells L, K, I, each with three shape functions, and four global basis functions; the hanging node is marked with \circ. In the T-matrices T_L and T_K of the two small cells, the row belonging to the shape function at the hanging node carries the entries 1/2, 1/2 in the columns of the two basis functions at the end-points of the parent edge, while every other row contains a single entry 1.

Let the non-conforming mesh be generated by a refinement procedure from a conforming mesh. The interior basis functions vanishing on all edges follow as for conforming meshes. Only non-hanging nodes and non-hanging edges carry basis functions. For the integration routine we have to represent each basis function by shape functions on the cells of the mesh. For the cells in the support of such a basis function we can find parent cells such that each edge in the interior of the support has only two adjacent cells. On this conforming level we can construct the representation by T-matrices as introduced in the former sections.

Figure 4.12: Subdivision variants on the reference square: (a) vertical, (b) horizontal, (c) into four children.

To represent the basis functions on each cell of the finest level we can introduce a relation between the shape functions on different levels. So let K be a parent cell of a cell K'. Then, we can represent each shape function defined on K in the cell K' as

N_{K,i}(x)\big|_{K'} = \sum_j (S_{K'K})_{ji} \, N_{K',j}(x),

where an S-matrix is involved. The shape functions N_{K,i} are defined as element shape functions \hat N_i on the reference element \hat K and then transformed to K with the element map \Phi_K. As K' is a part of K we can write a representation between \hat K and \hat K' = \Phi_K^{-1} K', where for x \in K'

\hat N_i\big( \underbrace{\Phi_K^{-1} x}_{\in \hat K} \big) = \sum_j (S_{K'K})_{ji} \, \hat N_j\big( \underbrace{\Phi_{K'}^{-1} x}_{\in \hat K} \big),

or

\hat N_i\big( \underbrace{\Phi_{\hat K'} \xi}_{\in \hat K'} \big) = \sum_j (S_{K'K})_{ji} \, \hat N_j\big( \underbrace{\xi}_{\in \hat K} \big).

Obviously, we can also write S_{\hat K' \hat K} instead of S_{K'K}. For each refinement variant we have such an S-matrix. The situation simplifies when restricting to subdivisions with division ratio 1/2 only, where we have those of Fig. 4.12 for quadrilateral cells.

Figure 4.13: The representation of \hat N_2(\xi) evaluated in [0, 1/2] by element shape functions: (a) the even element shape function \hat N_2(\xi); (b) \hat N_2(\xi) evaluated on (0, 1/2) mapped to (0, 1); (c) -\frac{\sqrt{6}}{4} \hat N_1(\xi) and \frac{1}{4} \hat N_2(\xi).

S-matrices in 1D

The reference interval \hat I = [0,1] is subdivided into the left part \hat I' = [0, 1/2] and the right part \hat I^\star = [1/2, 1]. So for the left part we have \Phi_{\hat I'}(\xi) = \xi/2.

Example 4.21. If we take for instance the 1D element shape function (see Fig. 4.13)

\hat N_2(\xi) = \sqrt{6} \, \xi (\xi - 1),

we search for (S_{\hat I' \hat I})_{2,j} such that for \xi \in \hat I

\hat N_2\Big(\frac{\xi}{2}\Big) = \sum_{j=0}^{2} (S_{\hat I' \hat I})_{2j} \, \hat N_j(\xi).

We find

\hat N_2\Big(\frac{\xi}{2}\Big) = \frac{\sqrt{6}}{4} \, \xi (\xi - 2) = -\frac{\sqrt{6}}{4} \, \hat N_1(\xi) + \frac{1}{4} \, \hat N_2(\xi),

and so

(S_{\hat I' \hat I})_{2,\cdot} = \Big( 0, \; -\frac{\sqrt{6}}{4}, \; \frac{1}{4} \Big).

In general the 1D S-matrices can be obtained by evaluating the element shape functions at Chebyshev points transformed to [0,1] and at the mapped points \Phi_{\hat I'}(\xi_\ell), and solving a (small) linear system. Note that in case of a hierarchic basis \{\hat N_j\}_j the 1D S-matrix for a maximal polynomial degree p is just a sub-block of the 1D S-matrix for maximal polynomial degrees q > p. So the S-matrices only have to be computed once, for the maximal polynomial degree in the space.
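A minimal sketch of this evaluate-and-solve procedure for the left child interval, assuming the integrated-Legendre shape functions from Sect. 4.3.6; the coefficients of \hat N_i(\xi/2) are returned in column i. Function names are illustrative choices.

```python
import numpy as np

def shapes_1d(xi, p):
    """1D integrated-Legendre shape functions N_0, ..., N_p at points xi in [0, 1]."""
    xi = np.asarray(xi, dtype=float)
    t = 2.0 * xi - 1.0
    L = [np.ones_like(t), t]
    for j in range(1, p):
        L.append(((2 * j + 1) * t * L[j] - j * L[j - 1]) / (j + 1))
    N = [1.0 - xi, xi] + [(L[j] - L[j - 2]) / np.sqrt(2.0 * (2 * j - 1))
                          for j in range(2, p + 1)]
    return np.vstack(N)

def s_matrix_1d_left(p):
    """S-matrix for the left child interval [0, 1/2]: column i solves
    N_i(xi / 2) = sum_j S_{ji} N_j(xi) at p + 1 Chebyshev sample points."""
    k = np.arange(p + 1)
    xi = 0.5 * (1.0 - np.cos((2 * k + 1) * np.pi / (2 * (p + 1))))  # points in (0, 1)
    V = shapes_1d(xi, p)            # V[j, m] = N_j(xi_m)
    V_half = shapes_1d(0.5 * xi, p)
    return np.linalg.solve(V.T, V_half.T)

S = s_matrix_1d_left(2)
print(S[:, 2])    # representation of N_2(xi/2): approx (0, -sqrt(6)/4, 1/4)
```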

S-matrices in 2D

For element shape functions with (functional) tensor product structure, i. e.,

\hat N_{k\ell} = \hat N_k \otimes \hat N_\ell,

the 2D S-matrices are tensor products as well. For an anisotropic subdivision they take the form

S_{\hat K' \hat K} = S_{\hat I' \hat I} \otimes I,

where I is the identity matrix. For a subdivision in both directions we have a tensor product like

S_{\hat K_d \hat K} = S_{\hat I' \hat I} \otimes S_{\hat I^\star \hat I}.

4.5 Local and global degrees of freedom

Let us collect the shape functions N_{K,i} of a cell K in the local trial space \Pi_K, which has a particular dimension. The space \Pi_K contains the polynomials transformed from the reference element.
We call a linear functional (C^\infty(K))^l \to \mathbb{R} a local degree of freedom if a set \Sigma_K of local degrees of freedom provides a basis of the dual space (\Pi_K)', i. e.

\# \Sigma_K = \dim \Pi_K \quad \text{and} \quad \forall v \in \Pi_K : \; \big( l(v) = 0 \;\; \forall l \in \Sigma_K \big) \Rightarrow v = 0.

The property that the local degrees of freedom fix the function in the local trial space is called unisolvence. We can always choose the basis \Sigma_K such that

l_m^K(N_{K,j}) = \delta_{mj}, \qquad m, j \in \{1, \ldots, \dim \Pi_K\}.    (4.15)

Example 4.22 (Linear finite elements). The local degrees of freedom can be chosen as point evaluations at the nodes,

l_j^K(v) := v(p_j(K)),    (4.16)

which are well-defined for v \in L^\infty(K).

Example 4.23 (Higher order finite elements on quadrilaterals). The local degrees of freedom of the four nodal shape functions can be chosen as (4.16). Then, for bounded functions v,

v_e(x) := v(x) - \sum_{j=1}^{4} l_j^K(v) \, N_{K,j}(x)

is zero at all four nodes.
Note that the integrated Legendre polynomials vanish at both end points for i \ge 2, i. e. \hat L_i(\pm 1) = 0, and so for any j \in \mathbb{N}, integration by parts gives

\int_{-1}^{1} \hat L_i'(\xi) \hat L_j'(\xi) \, d\xi = - \int_{-1}^{1} \hat L_i(\xi) \hat L_j''(\xi) \, d\xi.

On the other hand

\int_{-1}^{1} \hat L_i'(\xi) \hat L_j'(\xi) \, d\xi = \int_{-1}^{1} L_{i-1}(\xi) L_{j-1}(\xi) \, d\xi = \frac{2}{2j-1} \, \delta_{ij}.

So

\int_0^1 \hat N_i(\xi) \hat N_j''(\xi) \, d\xi = \sqrt{(2i-1)(2j-1)} \int_{-1}^{1} \hat L_i(\xi) \hat L_j''(\xi) \, d\xi = -\sqrt{(2i-1)(2j-1)} \, \frac{2}{2j-1} \, \delta_{ij} = -2 \, \delta_{ij}.

To fix the trace on each edge E_i(K), i = 1, 2, 3, 4, in case v_e were a shape function identified with this edge, we introduce for j = 0, \ldots, p-2

l_{e,i,j}^K(v) = -\frac{1}{2} \int_0^1 \underbrace{v_e(\Phi_{E_i(K)} \xi)}_{=: \hat v_e(\xi)} \, \hat N_{j+2}''(\xi) \, d\xi
            = -\frac{1}{2} \int_{E_i(K)} v_e(x) \, \hat N_{j+2}''\big(\Phi_{E_i(K)}^{-1} x\big) \, \big| D\Phi_{E_i(K)}(\Phi_{E_i(K)}^{-1} x) \big|^{-1} \, dS(x),    (4.17)

where \Phi_{E_i(K)} is the mapping from the reference interval [0,1] to E_i(K), introduced as the restriction of \Phi_K to the respective edge of \hat K, respecting the (global) orientation of E_i(K) (if the local \xi_1- or \xi_2-direction along the edge in \hat K is opposite to the global orientation, \xi_1 or \xi_2, respectively, is replaced by 1-\xi_1 or 1-\xi_2).
Then, in case v \in \Pi_K, the function

v_c(x) := v_e(x) - \sum_{i=1}^{4} \sum_{j=0}^{p-2} l_{e,i,j}^K(v) \underbrace{N_{e,i,j}(x)}_{= \hat N_{e,i,j}(\Phi_K^{-1} x)}

vanishes on all edges of K. Similarly, we define for i, j = 0, \ldots, p-2

l_{c,i,j}^K(v) = \frac{1}{4} \int_{\hat K} \underbrace{v_c(\Phi_K \xi)}_{=: \hat v_c(\xi)} \, \hat N_{i+2}''(\xi_1) \hat N_{j+2}''(\xi_2) \, d\xi
            = \frac{1}{4} \int_K v_c(x) \, \hat N_{i+2}''(\xi_1) \hat N_{j+2}''(\xi_2) \, \big| D\Phi_K(\Phi_K^{-1} x) \big|^{-1} \, dx,

with \xi = \Phi_K^{-1} x.

Definition 4.24. A finite element is a triple (K, ΠK , ΣK ) such that

(i) K is a cell of a mesh of the computational domain Ω Rd. M ⊂ (ii) Π (C∞(K))l is the trial space with dim Π < . K ⊂ K ∞

87 (iii) ΣK is a set of local degrees of freedom. A finite element is called V -conforming, if

(iv) for any face F of K the degrees of freedom localized on F uniquely determine the natural trace R u of a u Π onto F . |F ∈ K Definition 4.25. Let K be a cell of a mesh and F be a face/edge/node of the mesh M that is contained in K. Given a local trial space Π , Π (C∞(K))l, and a set Σ of K K ⊂ K local degrees of freedom, a linear functional l Σ is called localized/supported on ∈ K F or associated with F , if

l(v) = 0 v (C∞(K))l, supp(v) F = . ∀ ∈ ∩ ∅

Notation 4.26. The d.o.f. localized on a face F of K form the set ΣK (F ). By duality, localized degrees of freedom permit us to talk about “local shape functions associated with faces/edges/nodes”. To obtain V -conforming basis functions the local degrees of freedom localised on a face/edge/node on the cells sharing face/edge/node has to be matched to global de- grees of freedoms. We have unisolvence, i. e., the global degrees of freedom determine the basis functions.

Remark 4.27. The “matching condition” for local d.o.f. is equivalent to demanding that the related local shape function can be “sewn together” across intercell faces to yield a function in V .

The construction of finite element spaces can be started from local/global shape func- tions or equivalently via the approach of degrees of freedom (“dual view”, see Fig. 4.14).

Local degree of freedom ←→ Global degree of freedom

Local shape functions Global shape functions ←→ Figure 4.14: Duality of degrees of freedom and shape functions

The concepts of degrees of freedom may be helpful to obtain finite element spaces which are Cm continuous, m > 0, or where only the tangential components or normal component of vector fields are continuous (H(curl, Ω) or H(div, Ω) conforming.

References

[16] P.J. Frey and P.-L. George. Mesh generation. Application to finite elements. Hermes Science Publishing, Oxford, UK, 2000.

[17] J. Ruppert. A Delaunay refinement algorithm for quality 2-dimensional mesh generation. J. Algorithms, 18(3):548–585, 1995.
[18] J.R. Shewchuk. Triangle: Engineering a 2D quality mesh generator and Delaunay triangulator. In Ming C. Lin and Dinesh Manocha, editors, Applied Computational Geometry: Towards Geometric Engineering, volume 1148 of Lecture Notes in Computer Science, pages 203–222. Springer-Verlag, May 1996.
[19] Christoph Schwab. Einführung in die Numerik partieller Differentialgleichungen: Stationäre Probleme. Vorlesungsmanuskript, WS 2000/01.
[20] Milton Abramowitz and Irene A. Stegun. Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. Dover, New York, ninth Dover printing, tenth GPO printing edition, 1964.
[21] W.H. Press, B.P. Flannery, S.A. Teukolsky, W.T. Vetterling, et al. Numerical Recipes, volume 3. Cambridge University Press, Cambridge, 2007.
[22] Pavel Šolín. Partial Differential Equations and the Finite Element Method. Wiley-Interscience, Hoboken, USA, 2006.

89 5 Basic Finite Element Theory

5.1 Discretisation error is bounded by the interpolation error

If the discrete variational problem is well-posed (see Theorem 3.69) the discretisation error is bounded "quasi-optimally" by the best approximation error:

\| u - u_n \|_V \le \Big( 1 + \frac{\| b \|_{V \times V \to \mathbb{R}}}{\gamma_n} \Big) \inf_{w_n \in W_n} \| u - w_n \|_V.    (3.38)

Following the usual way, one estimates the best approximation error by the error of the interpolation of the solution u into the finite trial space W_n,

\inf_{w_n \in W_n} \| u - w_n \|_V \le \| u - I_h u \|_V,    (5.1)

where I_h : V \to W_n, or at least I_h : V_S \to W_n with V_S \subset V, is some interpolation operator. The latter interpolation operators can be used if the solution is in fact in a smaller subspace V_S of V. So, it all comes down to defining some interpolation of u in the discrete space W_n and estimating its error.

5.2 The Bramble-Hilbert lemma

A generalisation of Poincaré's inequality (Lemma 3.65) and a best approximation result for polynomials in Sobolev spaces, similar to the pointwise statement of Taylor's theorem, is the

Lemma 5.1 (Bramble-Hilbert lemma). If \Omega \subset \mathbb{R}^d is a bounded Lipschitz domain and m \in \mathbb{N}, then

\exists \gamma = \gamma(m, \Omega) > 0 : \quad \inf_{p \in \mathcal{P}_{m-1}(\Omega)} \| v - p \|_{H^m(\Omega)} \le \gamma \, | v |_{H^m(\Omega)} \qquad \forall v \in H^m(\Omega).

Proof. The proof can be found in [23].

m In other words, the norm on the quotient space H (Ω)/ −1(Ω) is equivalent to the Pm seminorm m . |·|H (Ω) Note, that for ` > m we have

inf v p Hm(Ω) inf v p Hm(Ω) γ v Hm(Ω) . p∈P`−1(Ω) k − k ≤ p∈Pm−1(Ω) k − k ≤ | | There is also a version of the Bramble-Hilbert lemma for tensor-product polynomials [24]:

90 Lemma 5.2. If Ω Rd is a bounded Lipshitz-domain and m N, then ⊂ ∈ 1/2 d m 2 ∂ v m γ = γ(m, Ω) > 0 : inf v p Hm(Ω) γ m v H (Ω) . ∃ p∈Qm−1(Ω) k − k ≤ ∂ξ 2 ∀ ∈ i=1 i L (Ω)! X

5.3 The interpolation operator of Raviart-Thomas

Let the finite element space consist of V -conforming finite elements. The interpolation operator of Raviart-Thomas is defined as

N (In v)(x) := lj(v)bj(x) =1 Xj where lj are the global degrees of freedom. As the global degrees of freedom result by matching of local degrees of freedom we have for the restriction on one cell K

dim ΠK (I v)(x) := lK (v)N (x), x K. n j K,j ∈ =1 Xj 5.4 The interpolation error estimates on simplices

We need three properties of In on simplices: The shape functions are polynomials on K,Π = (K). • K Pm The interpolation operator I preserves polynomials, i. e., for p (K) • n ∈ Pm

dim ΠK dim ΠK dim ΠK dim ΠK In p = In( αiNK,i) = αi lj(NK,i) NK,j = αjNK,j = p. i=1 j=1 i=1 j=1 X X X δij X | {z } t The interpolation operator In is continuous for functions in H (K) for some t R. • ∈ Since the local degrees of freedom for linear finite elements as well as those of higher orders involve point evaluations we have t = 1 for d = 1 and t = 2 for d = 2. We have the following theorem [25, Thm. 6.2].

Theorem 5.3. If and only if d/2 < m, m N, then Hm(Ω) is continuously embedded ∈ in C0(Ω).

91 With the Bramble-Hilbert-Lemma, the triangle inequality and t m+1 we can estimate ≤

u In v H1(K) inf (u p) In(u p) H1(K) k − k ≤ Pm(K) k − − − k

inf u p H1(K) + In(u p) H1(K) ≤ Pm(K) k − k k − k  (1 + C(m, K)) inf u p Ht(K) ≤ Pm(K) k − k

(1 + C(m, K)) inf u p t γ(t, K)(1 + C(m, K)) u t , Pt−1(K) k − kH (K) ≤ | |H (K) ≤  (1 + C(m, K)) inf u p m+1 γ(m, K)(1 + C(m, K)) u m+1 .  Pm(K) k − kH (K) ≤ | |H (K)  B How does the constant in the estimate depend on K, especially the size of K ? To answer this question we consider the interpolation operator on the usual reference element, which has a fixed size and angles.

Interpolation operator on the reference element We have defined the shape functions NK,j as from a reference element transformed element shape functions

NK,j(ΦK (ξ)) = Nj(ξ).

Equivalently, we can define local degrees of freedomb on K with

K lj(v) = lj (v) b where v(ξ) := v(Φ (ξ)) is the pullback of v. K b b Then, we can define an interpolation operator on K as

b dim Π K b (I v)(ξ) := lj(v)Nj(ξ), (5.2) =1 Xj b b b which equalise the transformedb interpolation operatorb

dim ΠK dim ΠK K In v(ξ) := (In v)(x(ξ)) = lj (v)NK,j(x(ξ)) = l(v)Nj(ξ) = (I v)(ξ). (5.3) =1 =1 Xj Xj d b b b bb Estimate on the reference element Repeating the steps, we went on K, on the ref- erence element K we obtain estimates for the interpolation error for I. Let us measure the error in the Hr-norm with r t m + 1 (t 1 for d = 1 and t 2 for d = 2, 3). ≤ ≤ ≥ ≥ b b

u Iu = inf (u p) I(u p) Hr(Kb) Hr(Kb) − p∈Pm(Kb) − − −

(1 + C) inf u p γ(m)(1 + C) u . b bb b b b Ht(Kb ) Ht(Kb) ≤ p∈Pm(Kb) k − k ≤ | | b b

92 Transformation techniques Let us transform Sobolev norms between the reference triangle K and the triangle K and vice-versa.

Lemma 5.4. If ΦK : K K is an affine mapping ξ FK ξ + τ , then, for all m N0, b 7→ 7→ ∈ m + d m m −1/2 m u b d F det(F ) u m u H (K) , | |Hm(Kb) ≤ d k K k | K | | |H (K) ∀ ∈   m + d m −1 m 1/2 m ub m d F det(F ) u u H (K) . | |H (K) ≤ d k K k | K | | |Hm(Kb) ∀ ∈   with F denoting the matrix norm of FK associatedb with the Euclidean vectorb norm. k kK ∞ Proof. Without loss of generality we can assume that u C (K). Let α Nd with ∈ ∈ α = m, m N0. Remember that the Gateaux derivative in direction δ is | | ∈ u(ξ + εδ) Du(ξ)(δ) := lim . ε→0 ε m d b d So, the m-th Gateaux-derivative D : R R R allows to express b × · · · × 7→ ∂αu = Dmu(ξ)(δ1,..., δm) ,

1 α1 α1+1 α1+α2 with δ = = δ = e1, δ = = δ = e2, etc. Here, e designates the k-th ··· b ··· b k unit vector in Rd. This means for example 2 3 ∂ξ1 ∂ξ2 u(ξ) = D u(ξ)(e1, e2, e2). So, we deduce that b b α m m 1 m k d k ∂ u(ξ) D u(ξ) := sup D u(ξ)(δ ,..., δ ), δ R , δ = 1 , | | ≤ k k { ∈ } (instead of the particular choice of δ1,..., δm we take the supremum) which implies b b b m + d u 2 = ∂αu(ξ) 2 dξ Dmu(ξ) 2 dξ . (5.4) Hm(Kb) | | Kb | | ≤ m Kb k k |αX|=m Z   Z The chain ruleb gives b b

m 1 m m D u(ξ)(δ ,..., δ ) = D u(ΦK (ξ))(FK δ1,..., FK δm) . (5.5) By linearity of the derivative we have b Dmu(ξ) F m Dmu(Φ(ξ)) . k k ≤ k K k k k Next, we use the transformation formula for multidimensional integrals and apply it to b (5.4): m + d u 2 F 2m Dmu(x) 2 det(F ) −1 dx | |Hm(Kb) ≤ m k K k k k | K |   ZK (5.6) m + d b F 2m det(F ) −1 Dmu(x) 2 dx . ≤ m k K k | K | k k   ZK

93 Then, observe that m m 1 m k d k D u(x) = sup D u(x)(δ ,..., δ ), δ R , δ = 1 k k { ∈ } d d m 1 m k d k sup D u(δ eα1 , . . . , δ eαm ) , δ R , δ = 1 ≤ { ··· | α1 αm | ∈ } α =1 α =1 X1 Xm d d Dmu(e ,..., e ) dm max ∂αu(x) , ≤ ··· | α1 αm | ≤ {| | α =1 α =1 X1 Xm Finally,

m 2 m α 2 2m α 2 2m 2 D u(x) dx d max ∂ u(x) dx d ∂ u(x) dx = d u Hm(K). K k k ≤ K {| | ≤ K | | | | Z Z |αX|=m Z

Remark 5.5. If Φ is a general C∞-diffeomorphism K K, then the analogue of (5.5) m 7→ m will involve derivatives of u from Du up to D u and derivatives DΦK up to D ΦK . Thus, in order to estimate the Sobolev-seminorm u b , we have to resort to the full | |Hm(Kb) Sobolev norm u m and vice versa. k kH (K) b Estimate on a cell K With the transformation techniques we can relate the interpo- lation error on K with the that of the push-backed function r 2 2 u I u = u I u r − n Hr(K) | − n |H (K) `=0 Xr 2 ` + d 2` 2 d2` F−1 det(F ) u I u ≤ d k K k | K | − n Hr(Kb) `=0   X Ibub 2 b d r + d 2r |{z} 2 C(r) d2r F−1 det(F ) u Iu . (5.7) ≤ d k K k | K | − Hr(Kb)  

Using the estimate on the reference element we get b bb

r + d r 1 u I u C(m) dr F−1 det(F ) /2 u , − n Hr(K) ≤ d k K k | K | | |Ht(Kb)   and transformed back to K b

r + d t + d r+t −1 r t u I u C(m) d F F u t . (5.8) − n Hr(K) ≤ d d k K k k K k | |H (K)   

C(m,d)

The estimate depends on| the size and{z shape of the} triangle through FK . We will see in the following that F = O(h ) and F−1 = O(h−1), where h is the cell diameter. k K k K k K k K K

B Convergence in terms of hK is only expected if the solution has higher regularity (even for d = 1 where the interpolation operator is continuous in Ht(K) for t = 1).

Estimates of the simplicial element mapping

Let us consider an affine equivalent simplicial triangulation \mathcal{M}, see Def. 4.5. We fix a reference simplex \hat K and find affine mappings \Phi_K : \hat K \to K, \Phi_K(\xi) := F_K \xi + \tau_K, for each K \in \mathcal{M}. In light of the general strategy outlined above, we have to establish bounds for \| F_K \|, \| F_K^{-1} \|, |\det(F_K)| and |\det(F_K)|^{-1} that depend on controllable geometric features of \mathcal{M}.

Definition 5.6. Given a cell K of a mesh \mathcal{M} we define its diameter

h_K := \sup\{ |x - y|, \; x, y \in K \},

and the maximum radius of an inscribed ball

r_K := \sup\{ r > 0 : \; \exists x \in K : \; |x - y| < r \Rightarrow y \in K \}.

The ratio h_K / r_K is called the shape regularity measure \rho_K of K.
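A minimal sketch of these quantities for a triangle, where the diameter is the longest edge and the largest inscribed ball is the incircle with radius r = 2 * area / perimeter; the function name is an illustrative choice.

```python
import numpy as np

def shape_regularity(vertices):
    """Diameter h_K, inradius r_K and shape regularity measure rho_K = h_K / r_K
    of a triangle given by its three vertices (rows of `vertices`)."""
    v = np.asarray(vertices, dtype=float)
    edges = [np.linalg.norm(v[i] - v[j]) for i, j in ((0, 1), (1, 2), (2, 0))]
    h_K = max(edges)                      # diameter = longest edge
    area = 0.5 * abs((v[1, 0] - v[0, 0]) * (v[2, 1] - v[0, 1])
                     - (v[2, 0] - v[0, 0]) * (v[1, 1] - v[0, 1]))
    r_K = 2.0 * area / sum(edges)         # incircle radius
    return h_K, r_K, h_K / r_K

# equilateral triangle: rho_K = 2 * sqrt(3) ~ 3.46
print(shape_regularity([[0.0, 0.0], [1.0, 0.0], [0.5, np.sqrt(3) / 2]]))
```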

Figure 5.1: Diameter h_K and inscribed-ball radius r_K for a triangular cell.

Lemma 5.7. If K,K Rd, d = 2, 3, are a generic non-degenerate simplices and ⊂ Φ : K K, Φ (ξ) := F ξ + τ , the associated bijective affine mapping, then K 7→ K K d b d−1 d d hK 1−d hK rK K hK hK d−1 b ρ = det(FK ) = | | = ρ , (5.9) h K hd ≤ | | K ≤ h rd−1 h Kb  Kb  Kb | | Kb Kb  Kb  h h h h F K = 1 ρ K , bF−1 Kb = 1 ρ Kb . (5.10) K 2r 2 Kb h K 2r 2 K h k k ≤ Kb Kb ≤ K K

Proof. The inequalities (5.9) can be concluded from the volume formula for simplices by elementary geometric considerations. Write ζ K for the center of the largest inscribed ball of K. Then estimates (5.10) ∈ follow from b 1 −1 b FK = sup FK ξ , ξ = 1 = r sup FK (ξ ζ) , ξ ζ = 2r {| | | | } 2 Kb {| − | | − | Kb } 1 −1 = r sup ΦK (ξ) ΦK (ζ) , ξ ζ = 2r hK /2r , 2 Kb {| − | | − | Kb } ≤ Kb because both Φ(ξ) and Φ(ζ) lie inside K. A role reversal of K and K establishes the other estimate. b

95 The shape regularity measure of a simplex can be calculated from bounds for the smallest and largest angles enclosed by edge/face normals. We give the result for two dimensions:

Lemma 5.8. If the smallest angle of a triangle K is bounded from below by α > 0, then

−1 −1 sin(α/2) ρ 2 sin(α/2) . ≤ K ≤

Figure 5.2: Angle condition for the shape regularity of a triangle.

Proof. It is immediate from Fig. 5.2 that

1/2h sin(α/2) l sin(α/2) = r h sin(α/2) . K ≤ K ≤ K

Lemma 5.7 clearly shows that uniform shape-regularity of the cells is key to achieving a uniform behavior of the Sobolev under transformation to a reference element.

Definition 5.9. Given a mesh its meshwidth can be computed by M

hM := max h ,K , { K ∈ M} whereas its shape regularity measure is defined as

ρM := max ρ ,K . { K ∈ M} Here, the notations from Def. 5.6 have been used. Moreover, the quasi-uniformity measure of is the quantity M 0 −1 µM := max h /h 0 ,K,K = max h ,K min h ,K . { K K ∈ M} { K ∈ M} · { K ∈ M} Remark 5.10. Usually software for simplicial mesh generation employs elaborate algo- rithms to ensure that the angles of the triangles/tetrahedra do not become very small or close to π. Hence, it is not unreasonable to assume good shape regularity of simplicial meshes that are used for finite element computations.

96 The choice of reference simplices is arbitrary. So we may just opt for

0 1 0 K := convex , , for d = 2 , (5.11) 0 0 1       b 0 1 0 0 K := convex 0 , 0 , 1 , 0 for d = 3 . (5.12)          0 0 0 1  b         Corollary 5.11. Let be a simplicial triangulation and choose the reference simplex M according to (5.11) and (5.12), respectively. Then the affine mappings Φ : K K, K 7→ Φ (ξ) := F ξ + τ , K , satisfy K K K ∈ M b 1−d ρM d d −1 −1 d hM det(FK ) hM , FK hM , FK ρMµM hM . µM ≤ | | ≤ k k ≤ ≤

Interpolation error estimate in the mesh

The interpolation error can be decomposed into the contributions from all triangles of \mathcal{M}; using (5.8) and Corollary 5.11 and summing the errors from each triangle we get

Theorem 5.12. Let I_n stand for the finite element interpolation operator belonging to the finite element space \mathcal{S}_m(\mathcal{M}) on a simplicial mesh \mathcal{M}. Then, for 2 \le t \le m+1 and 0 \le r \le t,

\exists \gamma = \gamma(t, r, m, \rho_{\mathcal{M}}, \mu_{\mathcal{M}}) : \quad \| u - I_n u \|_{H^r(\Omega)} \le \gamma \, h_{\mathcal{M}}^{t-r} \, | u |_{H^t(\Omega)} \qquad \forall u \in H^t(\Omega).

5.5 A-priori error estimates for finite elements

Now we are in the position to conclude an a-priori error estimate for the discretisation error of the finite element solution u_n. Applying (3.38), (5.1) and Theorem 5.12 we end up with

\| u - u_n \|_{H^1(\Omega)} \le \gamma \, h_{\mathcal{M}}^{t-1} \, | u |_{H^t(\Omega)} \qquad \text{for } 2 \le t \le m+1 \text{ and } u \in H^t(\Omega),    (5.13)

where \gamma = \gamma(\Omega, \gamma_n, \| b \|, \rho_{\mathcal{M}}, \mu_{\mathcal{M}}).
Let us discuss this a-priori finite element discretization error estimate:

1. The estimate (5.13) hinges on the fact that the exact solution u is “smoother” (in terms of Sobolev norms) than merely belonging to H1(Ω). For general f H−1(Ω) ∈ this must never be taken for granted. However, for a restricted class of problems (3.1) with extra smoothness of the right hand side, e.g. f Hr(Ω), elliptic shift ∈ theorems may guarantee that u Ht(Ω) for t > r. For instance, for smooth Ω ∈ we can expect u Hr+2(Ω). ∈ Example 5.13. For d = 1 we have u Hr+2(Ω), if f Hr(Ω). ∈ ∈

97 2. The bound from (5.13) can be converted into an asymptotic a priori error estimate by considering a sequence n, n N, of simplicial meshes of Ω. They M ∈ are assumed to be uniformly shape-regular, that is,

γ > 0 : ρMn < γ n N . ∃ ∀ ∈ Moreover, the meshes are to become infinitely fine

hM 0 as n . n → → ∞ Then the statement of (5.13) can be expressed by

t−1 u un 1 = O(h ) for n . (5.14) k − kH (Ω) Mn → ∞ If (5.14) holds, common parlance says that the h-version finite element solutions enjoy convergence of the order t 1 as the meshwidth tends to zero. − Remark 5.14. With considerable extra effort, more sophisticated best approximation estimates can be derived: for m, t 1 we have ≥ min{m+1,t}−1 hM inf u vn H1(Ω) γ(ρM, µM) u Ht(Ω) . (5.15) vn∈Sm(M) k − k ≤ m k k   This paves the way for a-priori error estimates for the p-version of H1-conforming ele- ments.

5.6 Duality techniques

Let us deal with the variational problem

b(u, v) := A grad u, grad v dx = fv dx v H1(Ω) . (5.16) h i ∀ ∈ 0 ZΩ ZΩ and its Galerkin discretization based on the finite element space ( ) on a simplicial Sm M mesh . Now, we aim to establish an estimate of the discretization error in the L2(Ω)- M norm. This is beyond the scope of the theory presented in Sec. 3.9 (inf-sup-conditions, quasi- optimality) and will rely on particular techniques for elliptic boundary value problems.

Assumption 5.15. We assume that (5.16) is 2-regular, that is, all u H1(Ω) with ∈ 0 div(A grad u) L2(Ω) satisfy − ∈ 2 u H (Ω) and u 2 γ div(A grad u) 2 , ∈ k kH (Ω) ≤ k kL (Ω) with a constant γ = γ(A, Ω) > 0 independent of u.

98 We write un for the unique solution of the discrete variational problem

u ( ) H1(Ω) : b(u , v ) = fv dx v ( ) . n ∈ Sm M ∩ 0 n n n ∀ n ∈ Sm M ZΩ Write u H1(Ω) for the exact solution of (5.16) and e := u u H1(Ω) for the ∈ 0 h − n ∈ 0 discretization error. From Sect. 3.11 we recall the Galerkin orthogonality (3.36)

b(e , v ) = 0 v ( ) . h n ∀ n ∈ Sm M The solution w H1(Ω) of the dual linear variational problem ∈ 0

w H1(Ω) : b(w, v) = e v dx v H1(Ω) , (5.17) ∈ 0 h ∀ ∈ 0 ZΩ will be a solution of the the elliptic boundary value problem

div(A grad w) = e in Ω , w = 0 on Γ . − h Since e L2(Ω), by Assumption 5.15 we know h ∈ 2 w H (Ω) , w 2 γ e 2 , (5.18) ∈ k kH (Ω) ≤ k hkL (Ω) with γ = γ(Ω, A) > 0. Next, we plug v = eh into (5.17) and arrive at

2 eh L2(Ω) = b(w, eh) = inf b(w vn, eh) , k k vn∈Sm(M) − where Galerkin orthogonality came into play. We may now plug in vn := In w, where I is the finite element interpolation operator for ( ). Then we can use the the n Sm M continuity of b in H1(Ω) and the interpolation error estimate of Thm. 5.12 for r = 1 and t = 2:

2 e 2 b(w I w, e ) γ w I w 1 e 1 γ hM w 2 e 1 . k hkL (Ω) ≤ − n h ≤ k − n kH (Ω) · k hkH (Ω) ≤ | |H (Ω) · k hkH (Ω)

Here, the final constant γ will depend on A, m, ρM, and µM, but not on u or un. Eventually, we resort to the 2-regularity in the form of estimate (5.18) and cancel one power of e 2 . k hkL (Ω) This technique is known as duality technique, because it relies on the dual varia- tional problem (5.17). Sometimes the term “Aubin-Nitsche trick” can be found. Sum- ming up we have proved the following result:

Theorem 5.16. Assuming 2-regularity according to Assumption 5.15, we obtain

\| u - u_n \|_{L^2(\Omega)} \le \gamma \, h_{\mathcal{M}} \, \| u - u_n \|_{H^1(\Omega)},

where the constant \gamma > 0 depends on \Omega, A, m, \rho_{\mathcal{M}}, \mu_{\mathcal{M}}.

99 Remark 5.17. Thm. 5.16 tells us that under suitable assumptions in the h-version of finite elements we can gain another power of hM when measuring the discretization error in the L2(Ω)-norm. More generally, often we can expect that, sloppily speaking,

the weaker the norm of the discretization error that we consider the faster it will converge to zero as hM 0. → What remains to be settled is whether Assumption 5.15 is reasonable. This is part of elliptic regularity theory. In particular, we have the following result [26]

1 Theorem 5.18. If the computational domain Ω Rd is convex or has C -boundary and ⊂ A C1(Ω), then the elliptic boundary value problem belonging to (5.16) is 2-regular. ∈ 5.7 Estimates for quadrature errors

As explained in Sect. 4.3.2 and 4.3.4, usually the finite element discretization of (3.10) or (3.12) will rely on local numerical quadrature for the computation of the stiffness matrix and of the load vector. The use of numerical quadrature will inevitably perturb the finite element Galerkin solution and introduce another contribution to the total discretization error, which is called consistency error. We have already stressed that the choice of the local quadra- ture rule is guided by the principle that

the error due to numerical quadrature must not dominate the total discretiza- tion error (in the relevant norms).

As far as the h-version of finite elements in concerned this guideline can be rephrased as follows:

the impact of numerical quadrature must not affect the order of convergence in terms of the meshwidth.

5.7.1 Abstract estimates We consider a linear variational problem (LVP) on a Banach space V

u V : b(u, v) = f, v 0 v V, ∈ h iV ×V ∀ ∈ with bilinear form b L(V V, R) satisfying the inf-sup conditions (IS1), (IS2) and ∈ × f V 0, see Sect. 3.9. Existence and uniqueness of a solution u V are guaranteed by ∈ ∈ Thm. 3.46. Based on V V , dim(V ) < , we arrive at the discrete variational problem (DVP), n ⊂ n ∞ see Sect. 3.6.

u V : b(u , v ) = f, v 0 v V . n ∈ n n n h niV ×V ∀ n ∈ n

100 We assume the discrete inf-sup condition (DIS) to be satisfied, which implies existence and uniqueness of un. From an abstract point of view the application of numerical quadrature and an inexact boundary approximation in a finite element context means that the discrete variational problem will suffer a perturbation

u V : b(u , v ) = f, v v V , (5.19) n ∈ n n n n V 0×V ∀ n ∈ n 0 with a bilinear formeb L(Vn eVne, R) and f e V . The perturbation destroys Galerkin ∈ × ∈ n orthogonality and leads to extra terms in the discretization error estimate of Cor. 3.74. e e Theorem 5.19 (First Strang’s lemma). Beside the assumptions on b and b stated above we demand that b satisfies (DIS) with constant γn. Then, (5.19) will have a unique solution un Wn, which satisfies the a-priori error estimate e ∈ e e b(wn, vn) b(wn, vn) u un V γ inf u wn V + sup | − | k − k ≤ wn∈Wn k − k ∈ v vn Vn k nkV  e  e f, vn × 0 f, vn × 0 + sup | V V − V V | , vn∈Vn vn V k k e  with γ = γ( b , γ ) > 0 k k n Proof. Similarly to the proof of quasi-optimality in Theorem 3.69 we use triangle in- equality and (DIS) to estimate

u u u w + w u k − nkV ≤ k − nkV k n − nkV 1 b(wn un, vn) u w + sup − . e ≤ k − nkV γ e v n vn∈Vn\{0} k nkV e e With b(w u , v ) = (b(u, v ) b(u , v )) + (b(w , v ) b(w , v )) + b(w u, v ) n − n n n − n n n n − n n n − n 0 b b b = ( f, vn V ×V f, vn V ×V 0 ) + ( (wn, vn) (wn, wn)) + (wn u, vn), e e h i e−e e − − the continuity of b and as wn WeN has been arbitrarye we conclude in the statement of ∈ the lemma.

The two terms

b(w , v ) b(w , v ) f, vn 0 f, vn 0 sup | n n − n n | , sup | V ×V − V ×V | , vn∈Vn vn V vn∈Vn vn V k ke k k e are called consistency (error) terms. They have to be tackled, when we aim to gauge the impact of numerical quadrature or inexact boundary representation quantitatively.

101 Remark 5.20. Note, that ellipticity of b

b(vn, ven) γ1 v V ≥ k k for some positive constant γ1 impliese (DIS). This property for the pertubed bilinear form is called h-ellipticity. In the h-version of finite elements (mesh refinement) we want γ1 to be indepedent of the meshwidth (“uniform h-ellipticity”).

Remark 5.21. If b is still an continuous bilinear form on V V , then the estimate of the theorem can be simplified in the following way: × e u u u w + w u k − nkV ≤ k − nkV k n − kV 1 b(wn u + u un, vn) u w + sup − − , e ≤ k − nkV γ e v n vn∈Vn\{0} k nkV e e b 1 + u wn V + R(u) ≤ γn k − k | |  e  with the residual term

b(u, v ) f(v ) R(u) = sup n − n . vn∈Vn vn V e k k e 5.7.2 Uniform h-ellipticity Let us consider the variational problem

b(u, v) := A grad u, grad v dx = fv dx v H1(Ω) . (5.16) h i ∀ ∈ 0 ZΩ ZΩ discretized by means of finite elements of uniform polynomial degree m on a simplicial triangulation of a polygonal/polyhedral computational domain Ω. M Applying local quadrature rules of the form

PK f(x) dx K ωK f(πK ) , (NUQ) ≈ | | l l Ω ∈M =1 Z KX Xl the perturbed bilinear form for u , v 0( ) reads n n ∈ Sm, M

PK b(u , v ) := K ωK A(πK ) grad u (πK ), grad v (πK ) . (5.20) n n | | l l n l n l K∈M l=1 X X e For the analysis we must rely on a certain smoothness of the coefficient function A:

Assumption 5.22. The restriction of the coefficient function A :Ω Rd,d to any cell 7→ K belongs to Cm(K)d,d and can be extended to a function Cm(K)d,d. ∈ M ∈

102 Lemma 5.23. Let A satisfy

2 T d γ µ µ A(x)µ µ R and almost all x Ω , (UPD) | | ≤ ∀ ∈ ∈ K with γ > 0 and Assumption 5.22, and let the local quadrature weights ωl be positive. If the local quadrature rules are exact for polynomials up to degree 2m 2, then − 2 b(v , v ) γ v 1 v ( ) . n n ≥ | n|H (Ω) ∀ n ∈ Sm M Proof. Since A is uniformlye positive definite and the quadrature weights are positive

PK PK b(v , v ) K ωK γ grad v (πK ) 2 γ K ωK grad v (πK ) 2 n n ≥ | | l | n l | ≥ | | l | n l | K∈M l=1 K∈M l=1 X 2 X X X e = γ v 1 , | n|H (Ω) d because on each K we know grad v −1(K) so that the numerical quadrature ∈ M n ∈ Pm of grad v 2 is exact. | n| References

[23] P.G. Ciarlet. The Finite Element Method for Elliptic Problems, volume 4 of Studies in Mathematics and its Applications. North-Holland, Amsterdam, 1978. [24] K.T. Smith. Inequalities for formally positive integro-differential forms. Bull. Am. Math. Soc., 67:368–370, 1961. [25] J. Wloka. Partial differential equations. Cambridge University Press, Cambridge, UK, 1987. [26] P. Grisvard. Elliptic Problems in Nonsmooth Domains. Pitman, Boston, 1985.

103 6 Adaptive Finite Elements

In this chapter we only consider the primal variational formulation of a second order elliptic boundary value problem

∆u = f in Ω , u = 0 or grad u, n = 0 on Γ . (6.1) − h i 2 in a bounded polygon Ω R with Lipschitz boundary Γ. ⊂ Definition 6.1. A Galerkin discretization of a variational problem is called adaptive, if it employs a trial space Vn that is based on non-uniform meshes or non-uniform poly- nomial degree of the finite elements. We distinguish

a priori adapted finite element spaces, which aim to take into account known • features of the exact solution.

a posteriori adapted finite element spaces, whose construction relies on the data • of the problem.

The next example shows that a posteriori adaptivity can dramatically enhance accuracy:

Example 6.2. If we knew the continuous solution u V of the linear varitional problem ∈ (LVP), we could just choose Vn := span u and would end up with a perfect Galerkin discretization. { }

Three basic policies can be employed to achieve a good fit of the finite element space and the continuous solution:

adjusting of the mesh while keeping the type of finite elements (h-adaptivity). • M adjusting the local trial spaces (usually by raising/lowering the local polynomial • degree) while retaining a single mesh (p-adaptivity).

combining both of the above approaches (hp-adaptivity). • 6.1 Regularity of solutions of second-order elliptic boundary value problems

If the geometry does not interfere, the solution of (6.1) is as smooth as the data f permit:

Theorem 6.3. If \partial\Omega is smooth (i. e. \partial\Omega has a parameter representation with C^\infty functions), then for the solution u of (6.1) it holds that

f \in H^k(\Omega) \;\Rightarrow\; u \in H^{k+2}(\Omega) \qquad \text{for } k \in \mathbb{N}_0,

and

\forall k \in \mathbb{N}_0 \; \exists \gamma = \gamma(\Omega, k) : \quad \| u \|_{H^{k+2}(\Omega)} \le \gamma(\Omega, k) \, \| f \|_{H^k(\Omega)} \qquad \forall f \in H^k(\Omega).

Similar results hold for Neumann boundary conditions on the whole of \partial\Omega.
If \partial\Omega has corners (as in the case of a polygonal domain), these results no longer hold. The solution becomes singular, meaning that some (higher) derivative is not square integrable.

Figure 6.1: Polygon \Omega and notation for the corners \pi_j (polar coordinates (r_j, \theta_j), interior angle \omega_j, boundary pieces \Gamma_{j-1}, \Gamma_j).

Theorem 6.4. Let \Omega \subset \mathbb{R}^2 be a polygon with J corners \pi_j. Denote the polar coordinates at the corner \pi_j by (r_j, \theta_j) and the inner angle at the corner \pi_j by \omega_j, as in Figure 6.1. Additionally, let f \in H^{-1+s}(\Omega) with s \ge 1 integer^1 and s \ne \lambda_{jk}, where the \lambda_{jk} are given by the singular exponents

\lambda_{jk} = \frac{k \pi}{\omega_j} \qquad \text{for } k \in \mathbb{N}.    (6.2)

Then, we have the following decomposition [27] of the solution u H1(Ω) of the ∈ 0 Dirichlet problem (6.1) into a regular part (i. e. with the regularity one would expect

1The result holds for s > 0 non-integer as well. Since we only defined the spaces Hk(Ω) for k integer, we do not go into the details here.

from a smooth boundary according to Thm. 6.3) and finitely many so-called singular functions s_{jk}(r, \theta):

u = u^0 + \sum_{j=1}^{J} \psi(r_j) \sum_{\lambda_{jk} < s} \alpha_{jk} \, s_{jk}(r_j, \theta_j).    (6.3)

\lambda_{jk} non-integer: \quad s_{jk}(r, \theta) = r^{\lambda_{jk}} \sin(\lambda_{jk} \theta),

\lambda_{jk} \in \mathbb{N}, \; \omega_j \notin \{\pi, 2\pi\}: \quad s_{jk}(r, \theta) = r^{\lambda_{jk}} \ln r \, \sin(\lambda_{jk} \theta).

Note that for \lambda_{jk} \in \mathbb{N}, \omega_j \notin \{\pi, 2\pi\}, the singular function s_{jk}(r, \theta) is not harmonic (\Delta s_{jk} \ne 0) and does not fulfil the Dirichlet boundary condition for \theta = \omega_j. The former is cured by adding the smooth function r^{\lambda_{jk}} \theta \cos(\lambda_{jk} \theta), and the latter by adding the harmonic polynomial \mathcal{H}_{\lambda_{jk}}(x, y) of degree \lambda_{jk} satisfying the boundary data [28].
For the homogeneous Neumann problem in (6.1), \sin has to be replaced by \cos and vice-versa.

Remark 6.5. The coefficients αjk in (6.3) depend only on f and are called (generalised) stress intensity factors.

Remark 6.6. At first glance, the decomposition (6.3) appears to be very special and restricted to the problem (6.1). Yet, similar decompositions with suitable sjk(r, θ) hold for all elliptic boundary value problems of the form

-\operatorname{div}(A \operatorname{grad} u) = f \text{ in } \Omega, \qquad u = 0 \text{ on } \Gamma_D, \qquad \langle A \operatorname{grad} u, n \rangle = 0 \text{ on } \Gamma_N.

Generally, s_{jk}(r, \theta) = r^{\lambda_{jk}} \Theta_{jk}(\theta) is a non-trivial solution of the homogeneous differential equation in an infinite sector S with its tip at the singular point.

Figure 6.2: Corner \pi_j with changing boundary conditions (u = 0 and \partial u / \partial n = 0) and the infinite sector S_\omega.

Example 6.7. Consider -\Delta u = f in \Omega with mixed boundary conditions at \pi_j. Let \pi_j \in \partial\Omega be a boundary point where the type of the boundary conditions changes from Dirichlet to Neumann (cf. Figure 6.2). In the infinite sector

S_\omega = \{ (r, \theta) : \; 0 < r < \infty, \; 0 < \theta < \omega \}

we are looking for non-trivial solutions of the homogeneous problem

\Delta s = 0 \text{ in } S_\omega, \qquad \frac{\partial s}{\partial n}\Big|_{\theta=0} = 0, \qquad s\big|_{\theta=\omega} = 0

of the form s(r, \theta) = r^\lambda \Theta(\theta). Using s = r^\lambda \Theta(\theta), it follows in S_\omega that

0 = \Delta s = r^{\lambda-2} \big( \Theta'' + \lambda^2 \Theta \big) \qquad \text{for } r > 0,

i. e. the pairs (\lambda, \Theta(\theta)) are eigenpairs of the Sturm-Liouville problem

\mathcal{L}\Theta = \Theta'' + \lambda^2 \Theta = 0 \text{ in } (0, \omega), \qquad \Theta'(0) = 0, \qquad \Theta(\omega) = 0.

One recalculates that the eigenpairs are explicitly given by

\lambda_k = (k - 1/2) \frac{\pi}{\omega}, \qquad \Theta_k(\theta) = \cos(\lambda_k \theta), \qquad k = 1, 2, 3, \ldots

Note: even if \omega = \pi, i. e. for changing boundary conditions along a straight edge, there exists a singularity r^{1/2} \cos(\theta/2).

Figure 6.3: Cracked panel.

Example 6.8. Consider the pure Neumann problem for -\Delta u = f on a domain with a crack (tip of the crack at the origin as in Figure 6.3). Here \omega = 2\pi and therefore \lambda_k = \frac{k\pi}{2\pi} = \frac{k}{2}, and

u \equiv u^0 + \sum_{k=1}^{\infty} \alpha_k \, r^{k/2} \cos\Big( \frac{k\theta}{2} \Big).

107 Remark 6.9. Note that the singular functions sjk(r, θ) in (6.3) have a singularity at r = 0 whereas they are smooth for r > 0. Therefore, the solution u of the Poisson problem (6.1) with a smooth right hand side f is smooth in the interior of Ω. The singular behaviour of u is restricted to the corners πj.

Remark 6.10. The decomposition of the solution in Theorem 6.4 shows that for ωj > π α the following holds: λj1 = π/ωj < 1. Additionally, it follows from (∂ sj1)(rj, θj) −|α| ∼ rλj1 for r 0 that the derivative ∂α of the singular functions s for α = 2 is not j j → jk | | square integrable since λ 1 α < 1, i. e. for α = 2 we have j − | | − | | α 2 −2−ε 1 (∂ s 1)(r , θ ) r / L (Ω). | j j j | ∼ j ∈ The shift theorem Thm. 6.3 does no longer hold.

References. Corner and edge singularities for solutions of elliptic problems are dis- cussed in [29, 30, 27].

6.2 Convergence of finite element solutions

Let u ( ) stand for the Galerkin solution of (6.1) obtained by means of La- n ∈ Sm Mn grangian finite elements of uniform polynomial degree m N on the mesh n. Tem- ∈ M porarily, we will allow d 1, 2, 3 . ∞ ∈ { } Let denote a uniformly shape-regular and quasi-uniform family of triangu- {Mn}n=1 lations of the polygon Ω such that h := hM 0 as n . From Sect. 5.5 we know n n → → ∞ that, if the continuous solution u satisfies u Ht(Ω), t 2, we have, as n , the ∈ ≥ → ∞ asymptotic error estimate

min(m+1,t)−1 u u 1 γh u t , (6.4) k − nkH (Ω) ≤ n | |H (Ω) with γ > 0 independent of n and u. For a unified analysis of the h-version and p-version of finite elements and, in partic- ular, on non-uniform meshes it is no longer meaningful state a-priori error estimates in terms of the meshwidth. Hence, let us measure the “costs” involved in a finite element scheme by the dimension of the finite element space, whereas the “gain” is gauged by the accuracy of the finite element solution in the H1-norm. For the h-version we first assume a uniformly shape- regular and quasi-uniform family ∞ of simplicial meshes. In the case of finite {Mn}n=1 elements of polynomial degree m we have the crude estimates

d + m −d Nn := dim( m( n)) ] n ] n h , S M ≤ d · M ⇒ M ≈ Mn   with constants depending on shape-regtularity and m. Thus, if t m + 1 we get ≥ asymptotically

−m/d u u 1 γN . (6.5) k − nkH (Ω) ≤ n

108 The constant γ depends on Ω, A and the bounds for ρMn , µMn . This reveals an al- gebraic asymptotic convergence rate of the h-version of finite elements for second order elliptic problems. However, even for small m the regularity u Hm+1(Ω) cannot be taken for granted. ∈ Consider d = 2 and remember that from Sect. 6.1 it is merely known that for f ∈ Hk−2(Ω): 0 u = u + using (6.6) 0 k with a smooth part u H (Ω), k 2, and with a singular part using, which is a (finite!) ∈ ≥ sum of singular functions s(ri, θi), which have the explicit form

s(r, θ) = rλ Θ(θ) , (6.7) with piecewise smooth Θ, where 0 < λ < k 1 (we assume here that log r terms are − absent). The singular functions (6.7) are only poorly approximated by finite element functions on sequences of quasi-uniform meshes. For the singular functions s(r, θ) as in (6.7) and with (r, θ) denoting polar coordinates at a vertex of Ω the (optimal) error estimate

min(m,λ) − min(m,λ)/2 min s vn H1(Ω) γhn γNn vn∈Sm(Mn) k − k ≤ ≤ holds, where again N := dim ( ) = O(h−2) denotes the number of degrees of n Sm Mn n freedom. For a sequence ∞ of quasi uniform meshes one therefore observes only the {Mn}n=1 suboptimal convergence rate

min(m,λ∗) − min(m,λ∗)/2 u u 1 γh γN , (6.8) k − nkH (Ω) ≤ n ≤ n where λ∗ = min λ : j = 1, . . . , J, k = 1, 2,... , as h 0 (or for N ), instead { jk } n → n → ∞ of the optimal asymptotic convergence rate (6.5) supported by the polynomial degree of the finite element space. Since often λ∗ < 1, one observes even for the simple piecewise linear (m = 1) elements a reduced convergence rate, and for m > 1 we hardly ever get the optimal asymptotic −m/d rate O(Nn ).

Remark 6.11. If the exact solution u is very smooth, that is, t 1, raising the poly-  d −d nomial degree m is preferable (p-version), because we have Nn m hM and, thus the estimate (see Remark 5.14) ≈

min{m+1,t}−1 hM inf u vn H1(Ω) γ(ρM, µM) u Ht(Ω) , (5.15) vn∈Sm(M) k − k ≤ m k k   gives asymptotically for t m ≤ → ∞ (t−1)/d u u 1 γN . (6.9) k − nkH (Ω) ≤ n

109 For p-FEM the regularity of the solution determine the rate of convergence, where for h-FEM the rate is mainly determined by the polynomial degree. For large t this is clearly superior to (6.5). The bottom line is that low Sobolev regularity of the exact solution suggests the use of the h-version of finite elements, whereas in the case of very smooth solutions the p-version is more efficient (w.r.t. the dimension of the finite element space).

Remark 6.12. If the exact solution is analytic in Ω, that is, it is C∞ and can be expanded into a locally convergent power series in each point of Ω, then the p-version yields an exponential asymptotic convergence rate

0 β u u 1 γ exp( γ N ) , (6.10) k − nkH (Ω) ≤ − n with γ, γ0, β > 0 only depending on problem parameters and the fixed triangulation, but independent of the polynomial degree m of the finite elements.

6.3 A priori adaptivity by graded meshes

The developments of Sect. 6.1 give plenty of information about the structure of the solutions of (6.3) for smooth data f. It is the gist of a priori adaptive schemes to take into account this information when picking the finite element space. This can overcome the poor performance of finite elements on quasi-uniform meshes pointed out in Sect. 6.2.

Figure 6.4: Polygon Ω with corner π_j, the subdomain conv(π_j, κ_1, κ_2) adjacent to it, and its representation by an affine map Φ from the standard triangle K̂.

One option is judicious (vernünftig) mesh refinement towards the vertices of the polygon. Consider the polygon Ω shown in Fig. 6.4. In Ω, consider any vertex π_j (in Fig. 6.4 we chose a convex corner; the approach to a re-entrant corner at π_{j+1} is indicated in Fig. 6.5). We denote again by (r, θ) polar coordinates at the vertex π_j, and by s(r, θ) a singular function as in (6.7)

s(r, θ) = r^λ Θ(θ)

Figure 6.5: Mapping for a re-entrant corner at π_{j+1}.

with a smooth Θ(θ). The triangle K = conv(π_j, κ_1, κ_2) denotes a neighbourhood of the vertex π_j in Ω (shown shaded in Fig. 6.4). By means of an affine map Φ the triangle K is mapped onto the reference triangle K̂ with polar coordinates (r̂, θ̂). The singular function s(r, θ) in Ω is transformed by Φ into

ŝ(r̂, θ̂) = r̂^λ Θ̂(θ̂) in K̂,

with the same exponent λ but with another C^∞-function Θ̂(θ̂):

Example 6.13. Let π_j = (0, 0), let (x_1, x_2) stand for the coordinates in Ω,

x_1 = r cos θ,   x_2 = r sin θ,

(x_1, x_2)^T = F (ξ_1, ξ_2)^T = r̂ F (cos θ̂, sin θ̂)^T,

and

s(r, θ) = r^λ Θ(cos θ, sin θ)

denote the singular function in (6.7). To prove that s(r, θ) is, in the coordinates ξ_1, ξ_2, once again of the form (6.7), let F = (f_{ij})_{1≤i,j≤2}. Then

r^2 = x_1^2 + x_2^2 = (f_{11} ξ_1 + f_{12} ξ_2)^2 + (f_{21} ξ_1 + f_{22} ξ_2)^2 = r̂^2 { (f_{11} cos θ̂ + f_{12} sin θ̂)^2 + (f_{21} cos θ̂ + f_{22} sin θ̂)^2 },

and

r^λ = r̂^λ { (f_{11} cos θ̂ + f_{12} sin θ̂)^2 + (f_{21} cos θ̂ + f_{22} sin θ̂)^2 }^{λ/2} = r̂^λ Θ̂_1(θ̂),

with a smooth (analytic) function Θ̂_1(θ̂). Analogously, we have that Θ(θ) = Θ̂_2(θ̂) with a smooth function Θ̂_2(θ̂).

Due to the transformation theorem it is therefore sufficient to investigate the finite element approximation of s(r, θ) in (6.7) on the reference domain K̂ as shown in Figure 6.6. In the case of a re-entrant corner, the reference domain consists of three triangles, see Fig. 6.5, and the ensuing considerations can be applied to each of them.

In what follows we show that by using so-called algebraically graded meshes M_n^β at the vertices of Ω the optimal asymptotic behavior O(N_n^{−m/2}) of the best approximation error of finite elements of uniform global degree m can be retained for singular functions as well.

Figure 6.6: Construction of graded meshes M_n^β for β > 1: (a) graded mesh on (0, 1); (b) graded mesh on the reference triangle K̂.

Definition 6.14. A family {M_n^β}_{n=1}^∞ of meshes of a computational domain Ω ⊂ R^2 is called algebraically graded with respect to π ∈ Ω and grading factor β ≥ 1 if

(i) the meshes are uniformly shape-regular, and

(ii) with constants independent of n and h_n := h_{M_n^β},

∀ K ∈ M_n^β with π ∉ K:   h_K ≈ n^{−1} dist(π, K)^{1−1/β}.

We will describe the concrete construction of algebraically graded meshes M_n^β, n ∈ N, with grading factor β ≥ 1 on the reference domain K̂ with respect to the vertex (0, 0)^T, see Fig. 6.6:

Algorithm 6.15 (Graded mesh on reference triangle). On K̂ = conv{(0, 0)^T, (1, 0)^T, (0, 1)^T} we proceed as follows:

1. Construct a partition 0 = τ_0^n < τ_1^n < ··· < τ_n^n = 1 of (0, 1) by setting τ_j^n := (j/n)^β, j = 1, . . . , n.

2. Use this partition to define the layers

L_j = { ξ ∈ K̂ : τ_{j−1}^n < ξ_1 + ξ_2 < τ_j^n },   j = 1, . . . , n.

3. Equip each layer L_j, j = 1, . . . , n, with a simplicial triangulation M_n^β|_{L_j} such that

a) their union yields a simplicial triangulation of K̂,

b) the shape-regularity measure of M_n^β|_{L_j} (see Def. 5.9) is uniformly bounded independently of j and n,

c) for each K ∈ M_n^β|_{L_j} we have h_K ≈ τ_j − τ_{j−1} with constants independent of j and n, and

d) M_n^β|_{L_1} consists of a single triangle K* adjacent to (0, 0)^T.

Remark 6.16. For β = 1 the meshes M_n^β are quasi-uniform with meshwidth 1/n.

Lemma 6.17. Fix β > 1. Then, with constants only depending on the bounds on the shape-regularity measure ρ(M_n^β|_{L_j}) and the quasi-uniformity measure µ(M_n^β|_{L_j}) we find

h_K ≈ (β/n) (j/n)^{β−1} = (β/n) ((j/n)^β)^{1−1/β}   ∀ K ∈ M_n^β|_{L_j},   (6.11)

and

♯ M_n^β|_{L_j} ≈ j.   (6.12)

Proof. Pick n ∈ N and j ∈ {2, . . . , n}. Then, with the monotonically increasing function f(t) = (t/n)^β and its derivative f′(t) = (β/n)(t/n)^{β−1} we have τ_j − τ_{j−1} = f(j) − f(j − 1), which by the mean value theorem can be bounded from above by f′(j) and from below by f′(j − 1). Thus,

(β/n) ((j − 1)/n)^{β−1} ≤ τ_j − τ_{j−1} ≤ (β/n) (j/n)^{β−1}.

Together with h_{K*} = n^{−β} we conclude the first assertion of the lemma. To confirm the second, we start with the volume formula

2 |L_j| = (j/n)^{2β} − ((j − 1)/n)^{2β} ≈ (2β/n) (j/n)^{2β−1}.   (6.13)

As a consequence of (6.11), the area of a triangle K ⊂ L_j is

2 |K| ≈ h_K^2 ≈ (β^2/n^2) (j/n)^{2β−2}   ∀ K ∈ M_n^β|_{L_j}.   (6.14)

None of the constants depends on n and j. Dividing (6.13) by (6.14) yields (6.12).

Corollary 6.18. The family {M_n^β}, n ∈ N, β ≥ 1, of meshes emerging from the construction of Algorithm 6.15 is algebraically graded with respect to (0, 0)^T and grading factor β.

Corollary 6.19. The algebraically graded meshes M_n^β, n ∈ N, of K̂ constructed as above for β ≥ 1 satisfy

h_n := h_{M_n^β} ≈ β/n,   ♯ M_n^β ≈ n^2,

with constants independent of n.

As a consequence, h(M_n^β) → 0 for n → ∞, if β ≥ 1. Moreover, for fixed m ∈ N, we get from Cor. 6.19 that

N_n = dim(S_m(M_n^β)) ≤ γ ♯ M_n^β ≤ γ n^2

holds with a constant independent of n.
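The construction and the bounds of Lemma 6.17 are easy to check numerically. The following sketch builds only the one-dimensional partition τ_j^n = (j/n)^β underlying the layers (the values n = 50 and β = 2.5 are arbitrary illustrative choices) and verifies the two-sided bound on the layer widths used in the proof:

import numpy as np

n, beta = 50, 2.5                                   # arbitrary illustrative values
j = np.arange(1, n + 1)
tau = (j / n) ** beta                               # partition tau_j = (j/n)^beta from Algorithm 6.15
dtau = np.diff(np.concatenate(([0.0], tau)))        # layer widths tau_j - tau_{j-1}
lower = beta / n * ((j - 1) / n) ** (beta - 1)      # f'(j-1), lower bound
upper = beta / n * (j / n) ** (beta - 1)            # f'(j),   upper bound
assert np.all(lower <= dtau + 1e-14) and np.all(dtau <= upper + 1e-14)
print("innermost layer width:", dtau[0], " = n**(-beta) =", n ** -beta)
print("outermost layer width:", dtau[-1], " ~ beta/n =", beta / n)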

Approximation of the regular part. As, again by Cor. 6.19, n^{−1} ≤ γ N_n^{−1/2}, we deduce from Thm. 5.12 that for the regular part u_0 ∈ H^{m+1}(K̂) of the decomposition (6.6) of the solution u ∈ H_0^1(Ω) of −∆u = f it holds

min_{v_n ∈ S_m(M_n^β)} ‖u_0 − v_n‖_{H^1(K̂)} ≤ ‖u_0 − I_n u_0‖_{H^1(K̂)} ≤ γ h_n^m ≤ γ N_n^{−m/2},   (6.15)

with γ = γ(m, ρ_{M_n^β}). Here I_n is the finite element interpolation operator for S_m(M_n^β). This implies that the regular part u_0 of the solution u can also be approximated on algebraically β-graded meshes at the optimal rate (6.4), independently of the size of β ≥ 1 (the size of the constant γ in the error estimate (6.15) depends of course on β and possibly grows strongly as β → ∞; for fixed β and n → ∞ the convergence rate in (6.15) is optimal, however).

Approximation of the singular part. Let us now consider the singular part of u in the decomposition (6.6). According to (6.3) the singular part u_sing is a finite sum of terms of the form (6.7) (the treatment of terms of the form r^λ |log r| Θ(θ) is left to the reader as an exercise), where Θ(θ) ∈ C^∞([0, ω]) is assumed without loss of generality, and where λ > 0.

Theorem 6.20. Let s(r, θ) = r^λ Θ(θ) with λ > 0 and Θ ∈ C^∞([0, π/2]) for (r, θ) ∈ K̂ as in Fig. 6.6. Let further

β > max{m/λ, 1}.

Then there holds, as N_n = dim(S_m(M_n^β)) → ∞,

min_{v_n ∈ S_m(M_n^β)} |s − v_n|_{H^1(K̂)} ≤ γ(m, λ, β) N_n^{−m/2},

i. e. for β > m/λ the optimal asymptotic convergence rate (6.4) for smooth solutions u is recovered.

Proof. The proof relies on local estimates for the interpolation error s − I_n s. For the sake of simplicity we restrict the discussion to the case m = 1 and leave the generalization to arbitrary polynomial degree to the reader. Hence, I_n designates linear interpolation on M_n^β.

Let K* ∈ M_n^β denote the triangle which contains the origin (0, 0)^T in its closure. We are going to demonstrate that the contribution of this triangle to the interpolation error is negligible. First, using polar coordinates, observe that

|s|^2_{H^1(K*)} ≤ ∫_0^{τ_1} ∫_0^{π/2} [ (∂s/∂r)^2 + (1/r · ∂s/∂θ)^2 ] r dθ dr = ∫_0^{τ_1} ∫_0^{π/2} [ (λ r^{λ−1} Θ(θ))^2 + (r^{λ−1} Θ′(θ))^2 ] r dθ dr ≤ γ(Θ) ∫_0^{τ_1} ( λ^2 r^{2λ−1} + r^{2λ−1} ) dr = γ(Θ) (1 + λ^2) (1/n)^{2βλ}.

Since I_n(s) is linear on K*, we find

|I_n s|^2_{H^1(K*)} = |K*| τ_1^{−2} ( s(τ_1, 0)^2 + s(0, τ_1)^2 ) = (1/2) τ_1^{2λ} ( Θ(0)^2 + Θ(π/2)^2 ) = γ(Θ) τ_1^{2λ} = γ(Θ) (1/n)^{2βλ}.

Thus, a simple application of the triangle inequality yields

|s − I_n s|^2_{H^1(K*)} ≤ γ(Θ) (1/n)^{2βλ} ≤ γ N_n^{−βλ} ≤ γ N_n^{−m},

since β > m/λ.

Next, consider K̂ \ K*. For x ∈ K̂ \ K* define the piecewise constant function h(x) by

h(x) = h_K   ∀ x ∈ K, K ∈ M_n^β.

Then, the local interpolation error estimate of Thm. 5.12 (with r = 1, m = 1, t = 2) yields

|s − I_n s|^2_{H^1(K̂\K*)} = Σ_{K ∈ M_n^β, K ≠ K*} |s − I_n s|^2_{H^1(K)}

≤ γ Σ_{K ∈ M_n^β, K ≠ K*} h_K^2 |s|^2_{H^2(K)} = γ ∫_{K̂\K*} h^2 |D^2 s|^2 dx.

The construction of the graded mesh implies that at any point ξ ∈ K̂ \ K* it holds

r = |ξ| ≥ (√2/2) (j/n)^β for some j ∈ N,   h(ξ) ≤ γ n^{−1} ((j/n)^β)^{1−1/β} ≤ γ n^{−1} r^{1−1/β},   |D^2 s| ≤ γ r^{λ−2},

with γ > 0 independent of n.

Hence, for n ≫ 1 and λ > β^{−1},

∫_{K̂\K*} (h(ξ))^2 |D^2 s|^2 dξ ≤ γ n^{−2} ∫_{(√2/2) n^{−β}}^{1} r^{2(1−1/β)+2λ−4} r dr ≤ γ n^{−2} [ r^{2λ−2/β} ]_{(√2/2) n^{−β}}^{1} ≤ γ n^{−2} ≤ γ N_n^{−1},

and using λ > β^{−1} this implies the assertion for m = 1. For general m ≥ 1 one obtains, in the case λ > m/β,

|s − I_n s|^2_{H^1(K̂\K*)} ≤ γ n^{−2m} [ r^{2λ−2m/β} ]_{(√2/2) n^{−β}}^{1} ≤ γ ( n^{−2m} + n^{−2m} (n^{−β})^{2λ−2m/β} ) = γ ( n^{−2m} + n^{−2βλ} ) ≤ γ n^{−2m} ≤ γ N_n^{−m},

since βλ > m.

Remark 6.21. From Thm. 6.20 we learn that either λ > m and β = 1 or λ < m and β > m/λ will ensure the optimal rate of convergence of the finite element solution in terms of the dimension of the finite element space.
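The mechanism behind Thm. 6.20 and Remark 6.21 can already be observed in a one-dimensional analogue. The following sketch is an illustration with assumed parameters and uses the one-dimensional rates, which differ from the two-dimensional ones: it interpolates u(x) = x^λ, 1/2 < λ < 3/2, by piecewise linears and evaluates the exact H^1-seminorm of the interpolation error; on uniform partitions the error decays only like N^{−(λ−1/2)}, whereas on graded partitions x_j = (j/n)^β with β sufficiently large the optimal rate N^{−1} is recovered.

import numpy as np

lam = 0.6                 # singular exponent, 1/2 < lam < 3/2: u in H^1 but not in H^2

def h1_interp_error(nodes, lam):
    """Exact H^1-seminorm of u - I_h u for u(x) = x**lam and piecewise linear I_h."""
    a, b = nodes[:-1], nodes[1:]
    int_du2 = lam ** 2 / (2 * lam - 1) * (b ** (2 * lam - 1) - a ** (2 * lam - 1))
    slope = (b ** lam - a ** lam) / (b - a)            # slope of the interpolant per element
    return np.sqrt(np.sum(int_du2 - slope ** 2 * (b - a)))

for beta in (1.0, 1.5 / (lam - 0.5)):                  # uniform vs. strongly graded partition
    print(f"beta = {beta:.2f}")
    prev = None
    for n in (16, 32, 64, 128, 256):
        nodes = (np.arange(n + 1) / n) ** beta         # graded partition x_j = (j/n)^beta
        err = h1_interp_error(nodes, lam)
        rate = np.log2(prev / err) if prev else float("nan")
        print(f"  N = {n:4d}   |u - I_n u|_H1 = {err:.3e}   observed rate = {rate:.2f}")
        prev = err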

Corollary 6.22. For K̂ as shown in Fig. 6.6 and k_max such that λ_{k_max} = max{λ_k : λ_k < m}, we have u = u_0 + u_sing with u_0 ∈ H^{m+1}(K̂) and

u_sing = Σ_{k=1}^{k_max} α_k r^{λ_k} Φ_k(θ),

where we assume that α_1 ≠ 0, 0 < λ_1 ≤ λ_2 ≤ . . ., and Φ_k ∈ C^∞([0, ω]). Then for m ≥ 1, β > max{1, m/λ_1} it holds

min_{v ∈ S_m(M_n^β)} |u_sing − v|_{H^1(K̂)} ≤ γ N_n^{−m/2}   with γ = γ(m, α_k, β).   (6.16)

Remark 6.23.

1. If m/λ1 > 1 we achieve with the grading factor

β > m/λ1 (6.17)

the same convergence rate as for smooth solutions u with a quasi-uniform triangulation.

2. For fixed ♯M we have to increase β (i. e. the grading must be more pronounced) if a) m is raised and b) the singular exponent λ_1 is reduced.

3. Usually the singular exponent λ_1 > 0 is unknown. Thm. 6.20 shows, however, that β > 1 only has to be chosen sufficiently large in order to compensate for the effect of the corner singularity on the convergence rate of the FEM. The precise value of λ_1 is not needed.

4. No refinement is required if λ_1 ≥ m. This is the case, e. g., for m = 1 and the Laplace equation in convex polygons Ω, where λ_j := π/ω_j > 1. For m > 1, mesh refinement near vertices is in general also required in convex domains; see the sketch after this remark.
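As a small illustration of the grading rule (6.17): for the Laplace equation the corner exponents are λ_j = π/ω_j (item 4 above), so the required grading factor can be read off directly from the largest interior angle; the concrete angle below (a re-entrant corner of an L-shaped domain) is an assumed example.

import math

omega_max = 3 * math.pi / 2               # assumed largest interior angle (L-shaped domain)
lambda_1 = math.pi / omega_max            # leading corner exponent for the Laplacian
for m in (1, 2, 3, 4):
    beta_min = max(1.0, m / lambda_1)     # grading required by (6.17) / Thm. 6.20
    print(f"degree m = {m}: lambda_1 = {lambda_1:.3f}, choose beta > {beta_min:.2f}")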

The preceding analysis at a single vertex can be transferred to the general polygon: let Ω ⊂ R^2 denote a polygon with straight sides and

f ∈ H^{k−2}(Ω),   k ≥ 2.

Let further u ∈ H_0^1(Ω) denote the solution of the homogeneous Dirichlet problem for −∆u = f. Then, for every m ≥ 1 there exists a β > 1 such that for k ≥ m + 1 it holds

inf_{v_n ∈ S_m(M_n^β)} |u − v_n|_{H^1(Ω)} ≤ γ N_n^{−m/2},

for N_n = dim(S_m(M_n^β)) → ∞.
