Part I — INTEGRATION

Mathematical Analysis III⇤ 1 The challenges of a good definition of integra- Daniel Ueltschi tion Integration is a natural operation, it is the reverse of derivation. It is intuitively December 5, 2016 t defined as “the area below the curve”. If f is a nice so that g(t)= a f(x)dx makes sense, we have R g0(t)=f(t). (1.1) The intuitive definition of integration becomes confusing when looking at irreg- Contents ular functions, such as b a dx 1 dx 1 The challenges of a good definition of integration 2 (a) f(x)=1/x , with a>0. Can we make sense of 0 xa and b xa ?

5 2 Step functions 3 R R 4

3 Convergence of of functions 5 3

2

4 Methods of integration 12 1

5 Characterisation of regulated functions 15 1 2 3 4

1 b 1 6 Improper and Riemann 17 (b) f(x)=cosx . What is 0 cos x dx?

1.0 7 Uniform and pointwise convergence 19 R

0.5 8 Functions of two variables 22

0.2 0.4 0.6 0.8 1.0 1.2 1.4 9 Series of functions 25 -0.5 10 Normed vector spaces 31 -1.0 11 Inner products 35 1 1 1 (c) Two variations of the previous example: g(x)=x cos x and h(x)= x cos x . It 12 Linear maps (operators) 37 is a good exercise to draw these functions!

13 Open and closed sets 39 0ifx Q, (d) f(x)= 2 1ifx R Q. 14 The contraction mapping theorem 41 ( 2 \ It is natural to wonder whether one really wants to integrate such functions? There are good reasons to answer positively. As is well-known, R is a more conve- nient of numbers than Q. In a similar fashion, we will consider various spaces of functions and these spaces should be large enough so as to contain the limits

⇤These lecture notes are based on past notes by various lecturers at the University of Warwick, of suitable Cauchy sequences. We are therefore seeking to define integration in a including Oleg Zaboronski and Sergey Nazarenko. general fashion.

1 2 2 Step functions Let us now point out important properties of integrals.

We have just argued that integration needs to be defined very generally... but we Proposition 2.2. (Additivity) For any ' S[a, b] and c (a, b), we have 2 2 now consider a restricted class of functions. b c b Definition 2.1. A partition P of the [a, b] is a set of numbers, '(x)dx = '(x)dx + '(x)dx. a a c P = p0,p1,...,pk such that a = p0

Notice that if ' is constant on each subinterval of P ,andQ is a refinement of Proof. Let P = p0,...,pk be a partition that is compatible with ', (it is then { } P , then ' is also constant on the subintervals of Q. We let S[a, b] denote the set of compatible with ↵' + ). Let 'i, i be the values of ', on the interval (pi 1,pi). all step functions on the interval [a, b]. It has the structure of a : Then

b k Proposition 2.1. If f,g S[a, b] and ↵, , then ↵f + g S[a, b]. R ↵'(x)+ (x) dx = ↵'i + i (pi pi 1) 2 2 2 Za i=1 X Proof. Let P apartitionthatiscompatiblewithf,andQ a partition that is com- k k (2.3) = ↵ 'i(pi pi 1)+ i(pi pi 1) patible with g. Then P Q is a refinement of both P and Q, where f,g are both constant on subintervals.[ It is then clear that ↵f + g is constant on subintervals i=1 i=1 Xb b X of P Q, so it is a step function. [ = ↵ '(x)dx + (x)dx. It may be worth pointing out that many spaces of functions are vector spaces, Za Za such as the space B[a, b] of bounded functions or the space C[a, b] of continuous functions. Note also that Proposition 2.4. (Fundamental theorem of calculus for step functions) Let C[a, b] B[a, b]andS[a, b] B[a, b]. (2.1) ' S[a, b] and P = p ,...,p a partition compatible with '. Consider ⇢ ⇢ 2 { 0 k} We now define the of step functions. the function I :[a, b] R ! Definition 2.3. (Integral of step functions) Let ' S[a, b] and P = t 2 I(t)= '(x)dx. p0,...,pk a compatible partition; we let 'i denote the value of ' in the { } a interval (pi 1,pi), i =1,...,k. The integral is defined as Z Then b k (a) I(t) is continuous on [a, b]. k '(x)dx = 'i(pi pi 1). (b) I is di↵erentiable on i=1(pi 1,pi) and I0(t)='(t). [ a i=1 Z X Notice that the definition of the integral does not depend on the choice of the Proof. Let 'i be the value of ' on the interval (pi 1,pi), i =1,...,k. Let t 2 (pj 1,pj); we have partition that is compatible with ' (why?). We also define the integral for b

5 6 f(y ) > ". Since (x )isbounded,ithasaconvergentsubsequence(x )byBolzano- Recall that a function is piecewise continuous if it is continuous at all points of n | n nk Weierstrass theorem. Let u =limk xn . Since the interval, except for a finite number of points. Using a partition that contains all !1 k points of discontinuity, we can extend Proposition 3.3 to certain piecewise continuous (3.4) ynk u ynk xnk + xnk u 0, functions: | |  | | | | ! we also have ynk u. Further, u [a, b] because the interval is closed (here we Corollary 3.4. Let f :[a, b] R be a function and P = p0,...,pk a ! 2 ! { } used the hypothesis that the interval is closed). We now have sequences (xn ), (yn ), partition such that f is continuous on each interval (pi 1,pi), and that it k k numbers " > 0andu [a, b], such that can be extended to a on [pi 1,pi]. Then f is regulated. 2

f(xnk ) f(ynk ) > "; Here, we invite the reader to think about functions on [0, 1] that are not regulated. • Can you conceive one that is bounded and continuous at every point save one? And x u, y u. • nk ! nk ! can you prove the impossibility for a sequence of step functions to converge to this Then f is not continuous at u, hence not continuous on [a, b]. peculiar function? We can now define the integral of regulated functions. We now introduce the class of functions for which integration will be defined. Definition 3.4. The integral of f R[a, b] is defined as 2 Definition 3.3. A function f :[a, b] R is regulated if there exists a ! b b sequence ('n) of step functions, 'n S[a, b], that converges uniformly to f, 2 f(x)dx =lim 'n(x)dx, i.e. f 'n 0 as n . a n a 1 Z !1 Z Alternatively,k k f!is regulated!1 if for every " > 0, there exists ' S[a, b] such 2 where ('n)n 1 is any sequence of step functions that converges uniformly to that f ' < ". k k1 f. Let R[a, b] denote the set of regulated functions on [a, b]. It is not hard to check This definition raises two questions. Does the limit exists, and is it independent that R[a, b] is a vector space. Using the triangle inequality, we have of the choice of the sequence of step functions? The answer is yes:

f = f ' + ' f ' + ' < , (3.5) k k1 k k1 k k1 k k1 1 Proposition 3.5. Let f R[a, b] and ('n)n 1, ( n)n 1 be sequences of 2 b so regulated functions are bounded. R[a, b] contains all step functions, and also all step functions that converge uniformly to f. Then limn a 'n and b !1 continuous functions: limn n exist, and they are equal. !1 a R Proposition 3.3. If f :[a, b] R is continuous, then it is regulated. R ! Proof. Using Proposition 3.1, we have

b b b Proof. If f is continuous, then it is uniformly continuous by Lemma 3.2. For every 'n(x)dx 'm(x)dx = 'n(x)dx 'm(x) dx 'n 'm (b a). (3.7) " > 0, there exists > 0suchthat f(x) f(y) < " for all x y < . Let a a a k k1 | | | | Z Z Z P = p0,...,pk be a partition of [a, b] such that pi pi 1 < for all i =1,...,k. { } Since 'n 'm 'n f + f 'm , we have that for every " > 0, there Let ' S[a, b] defined by k k1 k k1 k" k1 b b 2 exists N such that 'n 'm < b a for all m, n > N. Then a 'n a 'm < " k k1 for m, n > N, so that b ' is a Cauchy sequence of numbers, and it therefore '(x)=f(pi 1)ifx [pi 1,pi), and '(b)=f(b). (3.6) a n n 1 R R 2 converges. R We have f(x) '(x) = f(x) f(pi 1) < " since x pi 1 < , so that f ' < " Finally, the two sequences ('n)and('n)givethesameintegral: 1 indeed. | | | | | | k k b b 'n(x)dx n(x)dx 'n n (b a) 'n f + f n (b a) 0 a a k k1  k k1 k k1 ! Z Z (3.8) ⇥ ⇤ as n . !1 7 8 We established in Propositions 2.2 and 2.3 some properties of integrals in the uniformly in x [0, ). Hence uniform continuity. The case ↵ =0needsspecial case of step functions. These extend readily to integral of regulated functions. treatment, but it2 is trivial.1 Recall the devil’s staircase. Is it a regulated function? It is not a step function, Proposition 3.6. (Additivity) For any f R[a, b] and c (a, b), we have and it is not clear whether it is continuous (it is, in fact). But it is definitely 2 2 b c b monotone increasing, which is enough. f(x)dx = f(x)dx + f(x)dx. a a c Proposition 3.9. If f :[a, b] R is monotone, then it is regulated. Z Z Z !

Proposition 3.7. (Linearity) For any f,g R[a, b] and ↵, (a, b), we Proof. We can suppose that f is nondecreasing. Let k N,and 2 2 2 have i b b b (3.13) qi = f(a)+ k f(b) f(a) , ↵f(x)+g(x) dx = ↵ fi(x)dx + g(x)dx. a a a Z Z Z where i =1,...,k. Then let pi [a, b] be a number such that f(x) qi if xpi (pi is unique when f is increasing; it is not unique when f The proof is immediate, using that if 'n f and n g, then ↵'n + n ↵f + g. Then it follows from Proposition 2.3.! ! ! has flat parts). It is possible that pi+1 = pi. We define

Proposition 3.8. Let f R[a, b] and m, M R such that m f(x) M qi if x (pi 1,pi)forsomepi 1 11. For ↵ > 1, observe that 2 x y x ↵ ↵ ↵ ↵ ↵ ↵ 1 (3.11) F (x) F (y) = f(t)dt f(t)dt = f(t)dt f x y . (3.16) f(x+) f(x)=(x+) x = x 1+ x 1 x 1+↵ x 1 = ↵x , | | a a y k k1| | Z Z Z which diverges as x . No uniform⇥ continuity ⇤ in this⇥ case. For⇤↵ [0, 1], we use !1 2 Then for every " > 0, we can choose = "/ f to get F (x) F (y) < " for all 1 ↵ ↵ ↵ ↵ 1 k k | | 0 f(x + ) f(x)=x 1+ 1 x 1+↵ 1 = ↵x ↵, (3.12) x, y with x y < .  x  x  | | ⇥ ⇤ ⇥ ⇤ 9 10 We can now formulate a version of the fundamental theorem of calculus. 4 Methods of integration

Theorem 3.11. Let f R[a, b], and assume that f is continuous at c The main two methods are the integration by parts and integration by substitution. x 2 2 (a, b). Then F (x)= a f is di↵erentiable at c, and F 0(c)=f(c). They both allow to replace the integral of a function by that of another function which is hopefully simpler. The goal of this section is to establish the validity of the R corresponding formulæ. c+h Proof. We have F (c + h) F (c)= c f(t)dt. Since f is continuous at c, for every We define the product of two functions by (fg)(x)=f(x)g(x). " > 0thereexists > 0 such that f(c) " f(t) f(c)+" for t c < . Then R   | | Proposition 4.1. If f,g R[a, b], then fg R[a, b]. c+h 2 2 h f(c) " f(t)dt h( f)c)+" . (3.17)  c  Z Proof. First, let us note that fg f g . Let ('n), ( n)beregulated k k1 k k1k k1 Consequently, functions such that 'n f and n g uniformly. Then 'n n is a step function F (c + h) F (c) for all n,and ! ! " f(c) " (3.18)  h  F (c+h) F (c) fg 'n n fg 'ng + 'ng 'n n 1 1 1 for all h < . This is precisely the meaning of limh 0 h = f(c). k k k k k k (4.1) | | ! f 'n g + 'n g n . If f C[a, b], it is regulated by Proposition 3.3, and Theorem 3.11 shows that k k1k k1 k k1k k1 2 its primitive is di↵erentiable at any point of (a, b). Another useful version of the Since 'n f , it is bounded. Then everything tends to 0 as n , so fg k k1 !k k1 !1 fundamental theorem of calculus is the following. is indeed equal to the limit of a sequence of step functions.

Theorem 3.12. Let f C[a, b] and F a di↵erentiable function on [a, b] 2 Theorem 4.2. (Integration by parts) Let f,g be continuously di↵erentiable such that F 0(x)=f(x) for all x (a, b). Then 2 functions on [a, b]. Then b b b f(x)dx = F (b) F (a). b f 0g = fg fg0. a Z a a a Z Z Proof. Consider the function g on [a, b], It is perharps worth clarifying the notation used in the theorem; written explic- itly, the equation is x g(x)=F (x) f(t)dt. (3.19) b b a f 0(x)g(x)dx = f(b)g(b) f(a)g(a) f(x)g0(x)dx. (4.2) Z Za Za This function is di↵erentiable and g0(x)=F 0(x) f(x)=0.Thisimpliesthatg is constant (this follows from the mean value theorem), so that g(a)=g(b), i.e., Proof. Recall that (fg)0(x)=f 0(x)g(x)+f(x)g0(x). By Theorem 3.12 (the second version of the fundamental theorem of calculus), we have b b b b F (a)=F (b) f(t)dt. (3.20) (fg)(b) (fg)(a)= (fg)0(x)dx = f 0g + fg0. (4.3) Za Za Za Za Rearranging this identity, we get the formula of integration by parts. This theorem is amazingly useful and versatile. Let us describe a nice application to the Gamma function :[0, ) [0, ), defined by 1 ! 1

1 x 1 t (x)= t e dt. (4.4) Z0 11 12 Notice that [0, ) is not a closed interval, and is not regulated around 0 when Further, we have x<1. A correct1 definition for the integral above is d b F g(x) = F 0 g(x) g0(x)=f g(x) g0(x), (4.10) x 1 t dx (x)= lim lim t e dt. (4.5) a 0+ b ! !1 Za so that d d One can check that the limits exist, and that it can be taken in any order. We use f g(x) g0(x)dx = F g(x) . (4.11) t x 1 t c c Theorem 4.2 with f 0(t)= e and g(t)=t . Then f(t)= e and g0(t)= Z x 2 (x 1)t ,andweobtain This is identical to the right side of Eq. (4.9). Let us review a few examples. 1 t x 1 (x)= e t dt 0 b 2 2 Z (a) Consider the integral a cos(x )xdx.Wetakef(x)=cosx and g(x)=x , 1 2 t x 1 1 t x 2 (4.6) 1 b = e t +(x 1) e t dt R 1 2 0 0 and Theorem 4.3 shows that the integral is equal to cos t dt = 2 sin b Z 2 a2 =(x 1)(x 2). 2 Z sin a . b b t x 1 b d The first term in the middle equation is equal to lima 0+ limb ( e t ) , which 2 1 2 ! !1 a We can also proceed more directly and write cos(x )xdx = 2 sin(x ) dx, is 0 if x>1. This shows that for all x>0, we have a a dx Z Z ⇣ ⌘ which obviously gives the same result. (x +1)=x(x). (4.7) b b d (b) sin x cos xdx = 1 sin2 x dx = 1 sin2 b sin2 a . 1 2 2 t t 1 a a dx Since (1) = 0 e dt = e =1,weobtain(n +1)=n! for integer n. The Z Z 0 ⇣ ⌘ Gamma function generalises factorials to real numbers! ⇡ ⇡ R sin x d sin x sin x We now discuss the method of substitution. It is based on the chain rule for (c) e cos x dx = e dx =e =0. 0 dx 0 derivatives. Z ⇣ ⌘ If you find the answer 0 to be disappointing, evaluate the integral from 0 to Theorem 4.3. Let f C[a, b] and g :[c, d] [a, b] be di↵erentiable. Then ⇡/2instead! 2 ! e e d g(d) 1 2 1 d 3 1 (d) log x dx = 3 log x dx = 3 . f g(x) g0(x)dx = f(t)dt. x dx Z1 Z1 Zc Zg(c) ⇣ ⌘ It is natural to approximate an integral by a direct sum. It is tempting to Note that the above formula can also be written as decompose the interval [a, b] into smaller intervals, and to approximate the function b a 1 by a constant within each interval. Namely, given n N, let ai = a + i n , i = b g (b) n b a 2 1,...,n. The question is whether f(a ) is a good approximation for f.We f(x)dx = f g(t) g0(t)dt, (4.8) i=1 n i b a g 1(a) Z Z check that this converges to a f as n . P !1 1 where g (a)isequaltoanyvaluey [c, d] such that g(y)=a. R 2 Proposition 4.4. Let f R[a, b]. Then 2 Proof. The claim follows from the first version of the fundamental theorem of calcu- n b lus, Theorem 3.11, and from the chain rule for derivatives. Let F (x)= x f. Then b a a lim f(ai)= f(t)dt. n n F is di↵erentiable with F 0(x)=f(x), and !1 i=1 a R X Z g(d) f(t)dt = F g(d) F g(c) . (4.9) Zg(c)

13 14 Proof. We first prove it for step function, then we use a standard continuity ar- Notice that f(x )mayormaynotexist. gument. If ' S[a, b], there exists P = p ,...,p such that ' is constant on ± 2 { 0 k} (pi 1,pi). Then Proposition 5.1. Let f :[a, b] R. It is regulated i↵ for every x (a, b), f(x+) and f(x ) exist, and f(a!+) and f(b ) exist. 2 b n b a b a '(t)dt '(ai) 2k ' . (4.12) a n  n k k1 This is a beautifully explicit criterion; the proof is not easy and we treat both Z i=1 X implications separately. Indeed, the di↵erences occur only at the points of discontinuity of ',andthemax- imum jump is 2' . This goes to 0 as n , for any fixed '; the claim holds Proof that f regulated implies that left and right limits exist: Let x (a, b]andxn k1 !1 1 2 ! therefore for step functions. x . Let 'k S[a, b] such that f ' < k .Foranyk, there exists k such that 2 k k1 If f R[a, b], then for any " > 0, there exists ' S[a, b] such that f ' < ". 'k is constant on (x k,x). Then there exists Nk such that xn x < k if n>Nk, 1 | | Then 2 2 k k and 2 (5.2) f(xm) f(xn) < k n b a b n b a n b a f(ai) f(t)dt f(ai) '(ai) for all m, n > Nk.So(f(xn))n 1 is Cauchy, hence convergent. Let f(x ) denote its n  n n i=1 a i=1 i=1 X Z X X limit. n b b a If (xn0 )isanothersequencethatconvergestox , we can consider the mixed + '(ai) '(t)dt (4.13) n a sequence (xn”) = (x1,x10 ,x2,x20 ,...). We have xn” x , so f(xn”) is Cauchy by i=1 Z the above argument, hence convergent. All its subsequences! converge to the same Xb b + '(t)dt f(t)dt . limit, so f(xn0 ) f(x ) as well. a a ! Z Z Proof that the existence of left and right limits implies that f is regulated: The idea The first and third terms of the right side are less than f ' (b a). The second 1 is to consider the set A", defined for " > 0by term vanishes in the limit n . We can make the rightk sidek as small as we wish !1 by first choosing " small enough, then n large enough. (5.3) A" = c [a, b]: ' S[a, b] such that f ' [a,c] < " . 2 9 2 1 n o Here, f denotes the restriction of f to the interval [a, c]. The set A is the 5 Characterisation of regulated functions [a,c] " maximal interval that contains a, and such that f can be approximated by a step A natural question is whether a given function is regulated. So far, we have seen function. The goal is to show that A" =[a, b] for any " > 0. that: (i) A" = : Since f(a+) exists, there exists > 0suchthat f(x) f(a+) < " for all x 6 (a,; a + ). We can take | | Step functions are regulated (this immediately follows from the definition of 2 • regulated functions). f(a)ifx = a, Continuous functions on closed intervals are regulated (Proposition 3.3). '(x)= f(a+) if x (a, a + ), (5.4) • 8 2 > Piecewise continuous functions, with some extra conditions, are regulated. We have ' S[a, b]and (' f) [a,a+] < ", so A" contains the interval [a, a + ]. • 2 1 A function such as f(x)=sin1 on (0, 1] (and f(0) given some value) is not (ii) sup A = b: Assume that sup A = c 0. Combining ' and It turns out that a rather simple criterion holds true, namely, that left and right 0 ' , we get an approximation of f on [a, c + ], so that sup A >c, contradiction. limits of the function exist at every point. We use the notation 0 " (iii) b A": Since f(b )exists,wecanrepeattheargumentin(i)togetanapproxi- f(x+) = lim f(y)andf(x )= lim f(y). (5.1) 2 y x+ y x mating step function on [b ,b] for some > 0. There is also an approximating step ! ! 15 16 function on [a, b ] by (i) and (ii). They can be combined to give an approximation It is not hard to verify that A f does not depend on the midpoint e (a, b). on [a, b]. 2 Let us discuss a few examples.R Then A" =[a, b] for any " > 0, so f R[a, b]. 2 (a) f(x)= 1 cos px on (0, ). The primitive is F (x)=2sinpx,and b f(x)dx = px 1 a 2sinpb 2sinpa. We can take the limit a 0+, but not the limit b . 1 ! R !1 6 Improper and Riemann integrals 1 Then 0 f(x)dx exists as an improper integral, but not 1 f(x)dx. There exist interesting functions that are not regulate, but whose integrals are well- (b) Can weR modify the function above so that the improperR integral exists? Con- defined. An example is the Gamma function, (x)= 1 tx 1 e t dt; the integrant 1 s 0 sider f(x)=x 2 cos px where s is a fixed parameter. For which s does the is not always regulated at 0 (if x<1) and the interval is infinite. Let us consider a R improper integral exists? few examples. We can gain insight using integration by parts; namely, 1 s (a) x dx, s>0. The function is not regulated; we can calculate explicitly; if 0 b 1 1 2sinpx b b 2sinpx s =1, cos px dx = + s dx. (6.4) R 6 1 s s a s+1 s 1 1 s 1 1 1 s a x px x a x x dx = x = (1 a ). (6.1) Z Z 1 s a 1 s Za In order to take the limit a 0+, the first term of the right side gives the ! As a 0+, the latter converges i↵ s 1. We would like to declare that condition s 1 ; the second term gives s< 1 . As for the limit b ,the 1 s! 1   2 2 !1 0 x dx = 1 s for s [0, 1). first term gives the condition s>0; this also implies that the second term 2 1 R 1 1 converges. Thus both limits exist if s (0, 2 ). (b) 0 x dx, which is the case s =1ofitem(a).Wehave 2 1/2 1 (c) If we consider an integral of a function such as f(x)= x 1 , we can R 1 1 | | x dx =logx = log a. (6.2) decompose it as a Za 2 dx 1 dx 2 dx This tends to as a 0+, so this integral should not converge. (6.5) 1/2 = 1/2 + 1/2 , 1 ! 0 x 1 0 x 1 1 x 1 b 1 s s Z | | Z | | Z | | (c) 1 x dx. The interval is infinite, but we can consider 1 x and attempt to let b . If s =1,weget and use the notion of improper integrals for each terms (this integral is then R !1 6 R equal to 4). b s 1 1 s (6.3) x dx = 1 s (b 1). We now discuss the integral in the sense of Riemann. The idea is to find good 1 Z upper and lower bounds of the integral using step functions. 1 This converges (to s 1 )i↵ s>1. Definition 6.2. Let f :[a, b] R, and define 1 b 1 (d) 1 x dx. Since x dx = log b, this tends to as b , so the integral ! 1 1 1 !1 b does not converge. The upper sum Uf =inf a ' : ' S[a, b] and '(x) f(x) x R R • [a, b] . 2 8 2 R Definition 6.1. Let A be an interval on R (it can be open, closed, neither, b The lower sum Lf =sup ' : ' S[a, b] and '(x) f(x) x finite, infinite) and f a function A R that is regulated on all closed • a 2  8 2 ! e [a, b] . intervals [c, d] A. Let a =infA and b =supA.Iflimc a+ f(x)dx R ⇢ ! c d We say that f is Riemann-integrable i↵ U = L , in which case we define exists for some e (a, b), and limd b e f(x)dx exists for some e (a, b), f f 2 ! R 2 b then the improper integral f(x)dx exists, and is defined to be f = U = L . A R a f f R e d HowR does Riemann integration compare with the integral of regulated functions? f(x)dx =lim f(x)dx +lim f(x)dx. c a+ d b It turns out to be more general. ZA ! Zc ! Ze 17 18 Proposition 6.1. If f R[a, b], then it is Riemann-integrable, and both Definition 7.1. Let A R and f1,f2,... be functions A R. We say 2 ⇢ ! definitions give the same value. that (fn)n 1 converges to f pointwise if fn(x) f(x) for every x A. ! 2 It is worth comparing pointwise and uniform convergence. Proof. Since f R[a, b], for any n there exists ' S[a, b] such that 2 n 2 fn f pointwise means that for all x A, for all " > 0, there exists Nx," 1 1 • ! 2 'n(x) f(x) 'n(x)+ , (6.6) such that f (x) f(x) < " for all n>N . n   n | n | x," for all x [a, b]. A shifted step function is still a step function, so we get fn f uniformly means that " > 0, there exists N" such that fn(x) f(x) < 2 • ! 8 | | " for all x A and all n>Nx,". b b 2 b a b a 'n Lf Uf 'n + . (6.7) Uniform convergence implies pointwise convergence, but the converse is not al- n    n Za Za ways true. Here are some examples (please draw them!). b Letting n , we see that Lf = Uf , and they are both equal to a f. 1 !1 1 nx if x [0, n ], (a) On the interval [0, 1], fn(x)= 2 It is not hard to check Let us look at some examples. R 0ifx ( 1 , 1]. ( 2 n 1 1ifx =0, (a) f(x)= px on [0, 1] is still not Riemann, since there exists no step function that the pointwise limit f exists and is given by f(x)= that is larger than f. 0ifx (0, 1]. ( 2 Convergence is not uniform, since fn f =1foralln, so it does not tend 1ifx =2 n for some n =0, 1, 2,... k k1 (b) On the interval [0, 1], let f(x)= We to 0. (0otherwise. (b) On the interval [0, 1], f (x)=xn. The pointwise limit f exists and is given observe that f(0+) does not exist, so it is not regulated. But we can use n 0ifx [0, 1), by f(x)= 2 Convergence is not uniform, since we again have N 1ifx 2 , (1ifx =1. 'N (x)=  (6.8) f(x)ifx>2 N , fn f =1foralln. ( k k1

1 N (c) Let g be a (non identically zero) function R R such that g(x)=0for to get Uf 'N =2 , hence Uf 0. Using the 0 step function, we get !  0  x > 1, and let fn(x)=g(x n). We have limn fn(x)=0forallx, Lf 0. Then f is Riemann-integrable and its integral is zero. | | !1 R but fn = g , which does not depend on n and does not tend to 0. k k1 k k1 1 Convergence is then pointwise but not uniform. (c) The function sin x on [0, 1] is Riemann-integrable, although it is not regulated. Recall that a metric space (i.e. a set with a notion of distance between its ele- 1ifx R Q, (d) The function f(x)= 2 \ on the interval [0, 1] is not Riemann- ments) is complete if every Cauchy sequence converges within the space. We know 0ifx Q, that ( , )iscomplete. ( 2 R integrable, and not regulated. Its integral cannot be defined as improper |·| 1 integral either. It is Lebesgue-integrable,though,and 0 f =1. Proposition 7.1. The space of regulated function with the sup norm, (R[a, b], ), is complete. R k · k1 7 Uniform and pointwise convergence Proof. Let (fn)n 1 be a sequence of regulated functions, that is Cauchy with re- “Analysis is the art to take limits”. Sequences of functions can converge in several spect to the sup norm. We show that (a) (fn)convergestosomefunctionf;(b) di↵erent ways, and this diversity should be embraced as to make the theory more convergence is uniform; (c) the limit function f is regulated. interesting and far-reaching. We have already seen the notion of uniform conver- (a) fn converges pointwise. For every fixed x,wehave fm(x) fn(x) fm fn , | | k k1 gence: fn f uniformly if fn f 0asn . Another useful notion is so (fn(x))n 1 is a Cauchy sequence of real numbers. It converges, and we define ! k k1 ! !1 pointwise convergence: (f(x)=limn fn(x)foreachx [a, b]. !1 2 19 20 (b) f f uniformly. For every " > 0, there exists N such that 1.0 1.0 1.0 n ! " 0.8 0.8 0.8 0.6 0.6 0.6

fm(x) " fn(x) fm(x)+", (7.1) 0.4 0.4 0.4

  0.2 0.2 0.2 for all x and all m, n > N . Taking the limit m , we get 0.2 0.4 0.6 0.8 1.0 0.2 0.4 0.6 0.8 1.0 0.2 0.4 0.6 0.8 1.0 " !1

f(x) " fn(x) f(x)+", (7.2) The functions f0,f1,f2 are illustrated above. At each iteration, the oblique   straight lines are replaced by two oblique lines and a flat line in the centre. It is for all x and all n>N". Then fn f 0indeed. apparent that all fn are continuous. Further, k k1 ! (c) f R[ab]. Let ' be step functions such that ' f as m , uniformly. 2 n,m n,m ! n !1 1 fn+1 fn =max sup fn+1(x) fn(x) , sup fn+1(x) fn(x) , sup fn+1(x) fn(x) Then there exists mn such that 'n,mn fn < n . The sequence ('n,mn )converges 1 k k1 k k x [0, 1 ) | | x [ 1 , 2 ] | | x ( 2 ,1] | | uniformly to f: n 2 3 2 3 3 2 3 o =sup fn+1(x) fn(x) x [0, 1 ) | | 'n,mn f 'n,mn fn + fn f (7.3) 2 3 k k1 k k1 k k1 1 1 =sup fn(3x) fn 1(3x) 2 2 and the right side goes to 0 as n , so f is regulated. x [0, 1 ) | | !1 2 3 1 = fn fn 1 . Next, a great theorem of analysis. 2 k k1 (7.6) Theorem 7.2. Let A be an arbitrary interval in (open, closed, neither, R n Iterating, we get fn+1 fn 2 f1 f0 . It follows that (fn)n 1 is a Cauchy finite, or infinite) and C(A) be the space of continuous functions A R. k k1  k k1 Then (C(A), ) is complete. ! sequence of continuous functions (with respect to the sup norm), so it converges to k · k1 a continuous function by Theorem 7.2. Notice that f is a continuous function that is constant on intervals whose total Proof. Let (fn)n 1 be a Cauchy sequence of functions in C(A)(withrespectwith length is equal to 1! the sup norm). Proceeding exactly as in the previous proof, we get the existence of afunctionf : A R such that fn f 0. There remains to verify that f is 1 continuous. ! k k ! 8 Functions of two variables Let x A. For every " > 0, there exists fn in the above sequence such that 2 " " We consider functions f : D where D =[a, b] [c, d], and explore the properties fn f < . Further, there exists > 0suchthat fn(y) fn(x) < for all R k k1 3 | | 3 b ! d b ⇥ d b y A with y x < . Then of objects such as a f(x, t)dx, dt a f(x, t)dx,and c a f(x, t)dx dt. First, let us 2 | | recall the notion of continuity. R R R ⇥R ⇤ f(y) f(x) f(y) fn(y) + fn(y) fn(x) + fn(x) f(x) ", (7.4) | |  | | | | | |  2 Definition 8.1. Let f : D R, where D is an arbitrary set in R . for all y A with y x < . This shows that f is indeed continuous. ! 2 | | f is continuous at (x ,t ) D if for every " > 0, there exists > 0 • 0 0 2 The devil’s staircase function is a beautiful application of Theorem 7.2, which such that f(x, t) f(x0,t0) < " for all (x, t) D with x x0 + t can be used to establish that it is continuous. We can define the devil’s staircase as t < . | | 2 | | | 0| the limit of functions fn :[0, 1] [0, 1], where f0(x)=x,and ! f is uniformly continuous if for every " > 0, there exists > 0 such 1 1 • f (3x)ifx [0, ), that f(x, t) f(x0,t0) < " for all (x, t), (x0,t0) D with x x0 + t 2 n 3 | | 2 | | | 1 2 1 2 f (x)= (7.5) t0 < . n+1 8 2 if x [ 3 , 3 ], | 1 1 2 2 <> + fn(3x 2) if x ( , 1]. 2 2 2 3 As in the one-variable case (Lemma 3.2), one can prove that continuity implies :> uniform continuity if D is a closed domain.

21 22 Lemma 8.1. Assume that f is uniformly continuous on D, and let I : Proof. Let b y d d y [c, d] R be defined by I(t)= f(x, t)dx. Then I is uniformly continuous ! a F (y)= dx dtf(x, t) dt dx (f(x, t). (8.2) on the interval [c, d]. R Za Zc Zc Za We show that F (y)=0forally [a, b]. It is clear that F (a)=0,andusingthe 2 b fundamental theorem of calculus, we have Proof. We have I(s) I(t)= f(x, s) f(x, t) dx. We know that for every " > 0, a there exists > 0suchthat f(x, s) f(y, t) < " whenever x y + s t < . d d d y R ⇥ ⇤ (8.3) For this ,wehave I(s) I(|t) < "(b a)whenever| s t <| , so| I is| uniformly | F 0(y)= dtf(y, t) dt dxf(x, t). c dy c a continuous indeed. | | | | Z Z Z d d The second term of the right side is of the form dy c g(y, t)dt, with g(y, t)= Next, we exchange derivative and integral. y f(x, t)dx. The function g is continuous in y, t, and its derivative @g = f(y, t) a R @y is also continuous. Then we can use Proposition 8.2 to exchange derivative and Proposition 8.2. Assume that f(x, t) and @f (x, t) are continuous on [a, b] R @t ⇥ integral. We get [c, d]. Then for all t (c, d), we have d d 2 F 0(y)= dtf(y, t) dtf(y, t)=0. (8.4) d b b @f c c f(x, t)dx = (x, t)dx. Z Z dt @t We have shown that F is constant, hence 0. Za Za In most cases, one can exchange the order of integration. But here is a coun- b b @f @f terexample: On the domain, [0, 2] [0, 1], let Proof. Let F (t)= a f(x, t)dx and G(t)= a @t (x, t)dx. (Since f and @t are ⇥ xy(x2 y2) continuous, these integrals exist.) We have (x2+y2)3 if (x, y) =(0, 0), R R f(x, y)= 6 (8.5) F (t + h) F (t) b f(x, t + h) f(x, t) @f (0ifx = y =0. G(t)= (x, t) dx h a h @t This function behaves badly at (0, 0), as it diverges both to + and ; see the Z h i (8.1) 1 1 b @f @f illustration. = (x, ⌧) (x, t) dx. a @t @t Z h i We used the mean-value theorem for the last line, and ⌧ is a number that depends @f on x, t, h, which satisfies ⌧ [t, t + h]. Since @t is uniformly continuous, the last integrant vanishes as h 0.2 It follows that F is di↵erentiable, and its derivative is G. ! Next, a version of Fubini theorem, which allows to exchange the order of inte- gration.

Theorem 8.3. Assume that f :[a, b] [c, d] R is continuous. Then ⇥ ! b d d b f(x, t)dt dx = f(x, t)dx dt. a c c a Integrals can be calculated explicitly (by substitution or integration by parts) Z hZ i Z hZ i and we find Physicists have another way to denote multiple integrals: The above identity b d d b 2 1 1 is written a dx c dtf(x, t)= c dt a dxf(x, t). This is less elegant but more dx dyf(x, y)= , convenient, and we will often use it. 0 0 5 Z Z (8.6) R R R R 1 2 1 dy dxf(x, y)= . 20 Z0 Z0 23 24 Here, order of integration matters! Definition 9.1. Let A be an arbitrary interval in R, and let (fk)k 0 be functions A R. ! We now discuss the exchange of limits and derivatives: If fn f, does fn0 1 ! converge to f 0? An equivalent question is whether (C [a, b], ) is complete? Here, The series of functions k1=0 fk converges pointwise if 1 k·k1 • n C [a, b] denotes the space of continuously di↵erentiable functions on the interval limn fk(x) exists for each x A. !1 k=0 P 2 [a, b]. Let us look at an example. The seriesP of functions k1=0 fk converges uniformly if there exists On the interval [ 1, 1], consider f (x)= x2 + 1 . It is clear that f (x) x • n n n ! | | a function S : A R such that pointwise. Further, we have ! P q n

2 2 1 lim fk S =0. fn(x) f(x) n 1 n fn(x) f(x) = | | = (8.7) !1 k=0 1 | | fn(x)+f(x) 2 1  pn X | | x + n + x | | b b q We have seen that, if g g uniformly, then g g. This immediately n ! a n ! a for all x, so that fn f 0asn . The functions fn are clearly di↵erentiable implies that, if fk is a series of functions that converges uniformly, we have k2 1 k11/2! !1 k R R (with fn0 (x)=(x + n ) x), they converge uniformly, but the limit f(x)= x is 1 | | not di↵erentiable. This shows that C [ 1, 1] is not complete with respect to the sup P 1 b b 1 norm. fk(x)dx = fk(x) dx. (9.1) a a But there are many cases where the limit is di↵erentiable. Xk=0 Z Z ⇣Xk=0 ⌘ If all functions f are continuous and the series f converges uniformly, it follows 1 k k k Theorem 8.4. Assume that (fn)n 1 is a sequence of functions in C [a, b] from Theorem 7.2 that the limiting function is continuous. And if fk are di↵eren- such that (fn) and (fn0 ) are Cauchy with respect to the sup norm. Then there tiable functions and f , f converge uniformly,P the limiting function is also 1 k k k k0 exists f C [a, b] such that f f and f 0 f 0, uniformly. 2 n ! n ! di↵erentiable by Theorem 8.4. Now, a simple criterionP forP the uniform convergence of series of functions. It is Proof. By Theorem 7.2 (C[a, b] is complete), there exist continuous functions f and enough for many applications. g such that fn f and fn0 g uniformly. There remains to check that f is ! ! Proposition 9.1. Let (fk)k 0 be functions A R, and assume that di↵erentiable, and that f 0 = g.Wehave ! k1=0 fk < . Then k1=0 fk converges uniformly. x x x k k1 1 g = lim fn0 =lim fn0 =lim fn(x) fn(a) = f(x) f(a). (8.8) P P n n n Za Za Za Proof. For any m

25 26 This theorem can be used in reverse order: Given a regulated function f on [0, 2⇡], one can define the Fourier coecients (ak)and(bk); if the latter are absolutely summable, the function f can be written as a convergent Fourier series. Jean Baptiste Joseph Fourier (1768–1830) Proof. Let fk = ak cos kx + bk sin kx. Then fk < ak + bk , and Proposition k k1 | | | | 9.1 implies that the Fourier series converges uniformly. Let f = k fk denote the limiting function. Since f is the uniform limit of continuous functions, it is continuousP by Theorem Joseph Fourier hesitated between religion and mathematics, and became an ardent 7.2. 2⇡-periodicity is immediate, since cos kx and sin kx are 2⇡-periodic functions revolutionary by 1793. He was arrested in 1794 during the Terror, but he escaped the for all k . guillotine thanks to political changes (Robespierre was guillotined on 28 July 1794). N The formulæ2 for the coecients is not hard to establish, knowing that integrals He studied at Ecole´ Normale with Lagrange, Laplace, Monge, and was arrested and freed again in 1795. can be interchanged with series that converge uniformly. In 1798, he joined Napol´eon in his invasion (in French: “exp´edition”) of Egypt. It 1 2⇡ 1 2⇡ 1 started well, until Nelson destroyed the French fleet in the Battle of the Nile (1 f(x)coskx dx = a cos px + b sin px cos kx dx ⇡ ⇡ p p August 1798). Napol´eon returned to Paris in 1799, but Fourier and the rest of the Z0 Z0 p=0 expeditionary force remained there until 1801. X (9.3) 1 1 2⇡ 1 1 2⇡ Back in France, he became Prefect of Is`ere (Grenoble) and he worked on the De- = a cos px cos kx dx + b sin px cos kx dx. ⇡ p ⇡ p scription of Egypt. Between 1804 and 1807, Fourier wrote On the propagation of p=0 Z0 p=0 Z0 heat in solid bodies. This work was controversial at the time because of issues with X X Fourier series. One can check that

When Napol´eon marched through Grenoble in the “hundred days”, Fourier fled 2⇡ instead of welcoming him. He was nonetheless nominated Prefect of the Rhˆone. cos px cos kx dx = ⇡p,k if (p, k) =(0, 0), Shortly after this, Waterloo, and the end of Napol´eon’s epopee. 0 6 Z (9.4) In 1822 he published his essay Th´eorie analytique de la chaleur, which Lord Kelvin 2⇡ sin px cos kx dx =0, described as “a great mathematical poem”. Z0 Let us now discuss Fourier series. The goal is to express f :[0, 2⇡] R as a where p,k =1ifp = k, and 0 otherwise, is called “Kronecker’s symbol”. The last ! series of sines and cosines. The main result is the following. line in Eq. (9.3) is then equal to ak. The case of bk is similar. If a function f is not continuous on [0, 2⇡] (or if f(0) = f(2⇡)), its Fourier Theorem 9.2. Let (ak)k 0 and (bk)k 0 be sequences of numbers such that 6 coecients cannot be absolutely summable — that leads otherwise to a contradic- ak < and bk < . Then k | | 1 k | | 1 tion. If f is continuous, the situation is not clear. But if we assume f to be twice P 1P di↵erentiable, it is definitely enough: The series ak cos kx + bk sin kx converges uniformly on R. • k=0 Proposition 9.3. Assume that f C2[0, 2⇡], and also that f(0) = f(2⇡) X 2 The limiting function, f(x), is continuous and 2⇡-periodic, i.e. f(x + and f 0(0) = f 0(2⇡). Then • 2⇡)=f(x). 2 f 00 ak , bk k k1 . We have a = 1 2⇡ f(x)dx, and for all k =1, 2, 3,..., | | | |  k2 • 0 2⇡ 0 R 2⇡ 1 2 f 00 a = f(x)coskx dx, Proof. We only prove that a k k1 , the other case being similar. Integrating k ⇡ | k|  k2 Z0 1 2⇡ b = f(x)sinkx dx. k ⇡ Z0

27 28 k2t by parts, and observing that the boundary terms vanish, we have and similarly for bk. The solutions are ak(t)=ak e . Let us then define

2⇡ 1 1 a = f(x)coskx dx k2t k2t k T (x, t)= ak e cos kx + bk e sin kx . (9.10) ⇡ 0 Z k=0 1 1 2⇡ 1 2⇡ X = f(x)sinkx f 0(x)sinkx dx ⇡ k 0 k 0 We can now proceed backwards and check that this series converges absolutely, so it (9.5) h Z 2⇡ i 1 1 2⇡ 1 defines a function that indeed solves the heat equation. This establishes existence! = f 0(x)cos kx + f 00(x)coskx dx It is possible to prove uniqueness, although this is more dicult and we leave it ⇡k k 0 k 0 h 2⇡ Z i aside. Of special interest is to study the solution above. As the time increases, the 1 00 Fourier coecients become smaller, it converges better and the function becomes = 2 f (x)coskx dx. ⇡k 0 smoother. In the limit t , we see that a (t),b (t) 0forallk 1, and the Z !1 k k ! 1 2⇡ 2 function converges to the constant function a0, which is equal to the average initial Then ak 2 f 00(x) dx 2 f 00 . 1 2⇡ ⇡k 0 k 1 | |  | |  k k temperature, a0 = 2⇡ 0 T0(x)dx. Let us make useR of Fourier series in order to study the heat equation. It describes R the evolution of temperature in a metallic rod. If T (x)denotesthetemperatureat x, the heat flux between x and x + h is proportional to T (x) T (x + h). The change of temperature at x is then proportional to T (x) T (x + h)+T (x) T (x h). This suggests that the time derivative of T is proportional to the second derivative with respect to space. We consider here a ring of perimeter 2⇡. Let T (x, t)thetemperatureatx [0, 2⇡)andtimet 0. Its evolution satisfies the heat equation 2 @ @2 T (x, t)= T (x, t). (9.6) @t @x2

The initial condition is give by the function T0(x). Let us assume that T0 is su- 2 ciently smooth (e.g. C )sothatitisgivenbyaFourierseries,T0(x)= k1=0(ak cos kx+ bk sin kx). We also assume that at all times, we have P 1 T (x, t)= ak(t)coskx + bk(t)sinkx . (9.7) k=0 X Assuming we can di↵erentiate inside the sums, the time first- and space second derivatives are

@ 1 T (x, t)= a˙ (t)coskx + b˙ (t)sinkx , @t k k k=0 X (9.8) @2 1 T (x, t)= k2a (t)coskx k2b (t)sinkx . @x2 k k k=0 X

Here,a ˙ k(t), b˙k(t)denotethetimederivatives.Equatingallterms,wegetordinary di↵erential equations for each modes, namely

2 a˙ (t)= k a (t),a(0) = a , (9.9) k k k k 29 30 n It is a good exercise to show that for all x R ,wehavelimp x p = x . Part II — NORMS AND INNER PRODUCTS 2 !1 k k k k1 Definition 10.2. Let (V, ) be a normed space. k · k The reader will have noticed the similarities between the absolute values of num- A sequence (vn)n 1 in V converges to v V i↵ limn v vn =0. bers, and the sup norm of functions. Both allow to define convergence and Cauchy • 2 !1 k k

sequences. A shared property is the triangle inequality which was used over and A sequence (vn)n 1 in V is a Cauchy sequence if for every " > 0, over again. It is worth formalising this structure for general vector spaces. • there exists N such that v v < " for all m, n > N . " k m nk " A normed vector space is complete i↵ every Cauchy sequence con- 10 Normed vector spaces • verges; a complete normed space is called a .

Recall that a vector space (over R)isasetwhereelementscanbeaddedwithone another, and multiplied by numbers.

Definition 10.1. Let V a vector space over R. A function : V R is a norm if k · k ! Stefan Banach (1892–1945) (i) it is positive: v 0 for all v V , and v =0i↵ v =0; k k 2 k k (ii) it is homogeneous: ↵v = ↵ v for all ↵ R and v V ; k k | | k k 2 2 Stefan Banach is one of the most important 20th century mathematicians. He was (iii) it satisfies the triangle inequality: u+v u + v for all u, v V . born in Krak´owand he moved in 1910 to Lemberg (then in the Habsburg empire) k kk k k k 2 in order to start university studies. He believed then that there was nothing new A vector space equipped with a norm is a normed space. to discover in mathematics, so he studied engineering. He spent World War I in Krak´ow, where he met Hugo Steinhaus, who became a close friend and collaborator. Notice that (ii) and (iii) imply that v 0: We have 0 = 0 = v v He moved back to Lw´ow(now in Poland) in 1920. With other mathematicians, he k k k k k k v + v =2 v , for all v V . founded the Lw´owSchool of Mathematics, headquartered at the Scottish Caf´e. k k k k k k 2 Special cases of normed vector spaces are (R, )or(C, ). More interesting In 1939–41 Lw´owwas occupied by the Soviet Union and Banach became a cor- n |·| |·| are norms on V = R ; vectors are denoted x =(x1,...,xn). In cases (a) and (b), it responding member of the Academy of Sciences of Ukraine. But after the Nazi is not hard to check that all axioms for a norm are satisfied. invasion in 1941, universities in Lw´owwere closed and Banach avoided deportation by working in a typhus research institute. The Soviets reoccupied Lw´owin 1944. (a) The sup norm x =maxi=1,...,n xi . As other Poles, Banach was preparing to leave the town, now called Lviv, but he k k1 | | died of lung cancer in 1945. (c) The 1-norm x = n x . k k1 i=1 | i| Banach essentially founded linear analysis, and he obtained such fundamental re- sults as the Hahn–Banach, Banach–Steinhaus, and Banach–Alaoglu theorems. The P n 2 1/2 (d) The Euclidean norm x 2 = i=1 xi . This norm is induced by an Banach–Tarski paradox is a mathematically rigorous theorem that relies on the inner product and thek trianglek inequality| | follows from the Cauchy-Schwarz P axiom of choice and that states that a three-dimensional unit ball can be cut in inequality, as will be seen later. finitely-many pieces, these pieces moved and rotated, and reassembled so as to give 1/p two unit balls. This puzzling result raises questions about the validity of the axiom (e) More generally, the p-norm x = n x p . The triangle inequality p i=1 i of choice. holds for p 1andisknownask kMinkowski inequality| | : P n n n One easily proves that if limn vn exists, it is unique: If vn v 0and 1/p 1/p 1/p k k! p p p x + y x + y . (10.1) vn v0 0, then v v0 v vn + vn v0 0; then v v0 =0,sothat | i i|  | i| | i| k k! k kk k k k! k k i=1 i=1 i=1 v = v0. ⇣X ⌘ ⇣X ⌘ ⇣X ⌘ n Notice that (R, )isaBanachspace,andalso(R , p)forp [1, ]. The The proof of this inequality is a bit dicult. space of step functions|·| with the sup norm, (S[a, b], ),k is· notk complete,2 hence1 not k·k1 31 32 n Banach. But (R[a, b], )and(C[a, b], )areBanachspaces,byProposition Let x R denote the limit of the subsequence (x`m ). The triangle inequality 1 1 7.1 and Theorem 7.2 respectively.k · k k · k implies that2

x x`m x x`m . (10.5) Definition 10.3. Two norms and 0 on the vector space V are k k1 k k1 k k1 k · k k · k equivalent is there exist k, K > 0 such that for all v V , This tends to 0 as m , so that x =1since x` =1forall`. But we also 2 1 1 have !1 k k k k k v v 0 K v . x x x x K x x 0, (10.6) k kk k  k k k kk `m k k `m k k `m k! The motivation for this notion is that equivalent norms give the same notion of where we used the bound that we have already established. Since x` 0, this k k! convergence (they induce the same topology). Indeed, if (vn)n 1 satisfies vn v implies that x =0,hencex =0and x =0,contradiction.Theconclusionis k k! k k k k1 0, then v v 0 0forallnorms 0 that are equivalent to . Notice also that k>0. k n k ! k · k k · k that if and 0 are equivalent, and 0 and 00 are equivalent, then k · k k · k k · k k · k k · k Since all finite-dimensional vector spaces are isomorphic to n for some n,this and 00 are also equivalent. Further, if (V, )isBanach,and and 0 are R k · k k · k k · k k · k proposition shows that all norms are equivalent in finite-dimensional spaces. Things equivalent, then (V, 0)isalsoBanach. k · k are much more interesting in infinite-dimensional spaces, such as spaces of sequences. Now comes a disappointing result. Given x =(xk)k 0, we define n Proposition 10.1. All norms on are equivalent. 1 1/p R p x p = xk and `p = (xk)k 0 : x p < . • k k | | k k 1 k=0 n ⇣X ⌘ Proof. We show that all norms are equivalent to . Let (ej)j=1 denote the usual n k · k1 n x =supxk and ` = (xk)k 0 : x < . This is the space of all basis of R , where x =(x1,...,xn)= j=1 xjej. Then • k k1 k N | | 1 k k1 1 2 n nP n bounded sequences. x = xjej xj ej x ej . (10.2) k k  | | k kk k1 k k One can check that (`p, p)arenormedvectorspacesforp [0, ]. (Minkowski j=1 j=1 j=1 k · k 2 1 X X X inequality holds for p 1, and it is not hard to prove for p =1, 2, .) Notice also n 1 We can define K = ej ; this number is independent of x,and x K x that `p `q when p " > 0foralln,but x q 0. For k k k k ! x n instance, consider k =inf k k : x R 0 . (10.3) 1/p x 2 \{ } (n) (k log n) if 1 k n, nk k1 o xk =   (10.7) x 0otherwise. k k ( Since k x ,wehavek x x for all x; this is what we want to show, but  k k1 k k1 k k we need to check that k =0.Byrescaling,wehavethat (n) 1 n 1 1/p (n) 6 Then x p = , which tends to 1 as n ;but x q = k k log n k=1 k !1 k k n 1/p n q/p 1/q k =inf x : x R and x =1 . (10.4) (log n) k . Since q/p > 1, the sum is bounded uniformly in n, so k k 2 k k1 k=1 P the prefactor makes it go to 0 as n . We proceed ab absurdo. If k =0,thereexistsasequence( x`)` 1 such that P !1 x =1forall`,and x` 0as` . k k1 k k! !1 There exists a subsequence (x`m )m 1 that converges with respect to the . k · k1 Indeed, by Bolzano-Weierstrass, there exists a subsequence of (x`)suchthatthefirst components converge; again by Bolzano-Weierstrass, there exists a subsequence of the subsequence such that the second components converge (the first components still converge). We can continue to extract subsequences until all components con- verge. It is not hard to check that the resulting subsequence converges with respect to . k · k1 33 34 11 Inner products

Definition 11.1. An inner product (or scalar product) on the vector Cauchy– space V is a function , : V V R such that Bunyakovsky– h· ·i ⇥ ! Schwarz inequality (i) it is symmetric: x, y = y, x for all x, y V ; h i h i 2 (ii) it is linear: x, ↵y + z = ↵ x, y + x, z for all ↵, R and x, y V ; h i h i h i 2 The full setting and the inequality came only progressively. Augustin-Louis Cauchy 2 (1789–1857) obtained it for sums in 1821. V´iktor ⌫kovleviq Bunk´ovskii$ (Vik- (iii) it satisfies x, x 0 for all x V , and x, x =0implies that x =0. h i 2 h i tor Yakovlevich Bunyakovsky, 1804–1889) extended it to integrals in 1859. Hermann Amandus Schwarz (1843–1921) proposed a modern proof in 1888. This definition is for real inner products.1 Inner product spaces enjoy many more properties than normed spaces, so it Definition 11.2. The induced norm of an inner product is helps to know whether a given norm is the induced norm for some inner product. It follows from Theorem 11.1 (c) that a necessary condition is that the norm satisfies x = x, x . k k h i the parallelogram identity. It turns out that it is a sucient condition. A vector space with an inner product,p which is complete with respect to the Proposition 11.2. If a norm satisfies the parallelogram identity, then there induced norm, is called a . exists an inner product such that it is the induced norm. The inner product is given by Theorem 11.1. x, y = 1 x + y 2 x 2 y 2 h i 2 k k k k k k (a) The induced norm is a norm. = 1 x + y 2 x y 2 . 4 k k k k (b) The inner product satisfies Cauchy–Bunk´ovskii$-Schwarz in- The two forms of the inner product are identical because of the parallelogram equality: identity.2 x, y x y . h i k kk k Proof. The proofs of properties (i) and (iii) of the inner product are immediate, but (c) The induced norm satisfies the parallelogram identity: linearity is harder to establish than could have perhaps been expected. We start with x + y 2 + x y 2 =2 x 2 +2 y 2. k k k k k k k k 2 2 2 2 2 2 x+y+z x y z = x+y+z + x+y z x z+y x z y . (11.2) k k k k k k k k k k k k Proof. (a) and (c) are easy and are left as an exercise. Using the parallelogram identity, we then get For (b), we use the one inequality at hand, applied to x + ↵y: 2 2 2 2 2 2 2 2 2 x + y + z x y z =2 x + y +2 z 2 x z 2 y 0 x + ↵y, x + ↵y = x +2↵ x, y + ↵ y . (11.1) k k k k k k k k k k k k (11.3) h i k k h i k k =2 x + y 2 +2 y 2 2 x y 2 2 z 2. k k k k k k k k This is nonnegative for all ↵ R, so this quadratic polynomial cannot have two ze- 2This proposition remains true for complex vector spaces, but the inner product is given by the 2 2 2 2 roes. This means that the discriminant is nonpositive, that is, 4 x, y 4 x y polarisation identity 0. This is precisely the Cauchy-Schwarz inequality. h i k k k k  1 2 2 2 2 x, y = 4 x + y x y i x +iy +i x iy . 1In quantum mechanics, one needs complex vector spaces where inner products satisfy x, y = h i k k k k k k k k y, x ; they are antilinear in the first variable and linear in the second. h i The proof is nearly identical. h i

35 36 m n The second equation is obtained by exchanging y and z. Averaging the two right Let V = R and V 0 = R ,andT an n m matrix. Then sides, we get • ⇥ T11 T1m x1 . ··· . . Tx = . . . , (12.2) T T xm 2 2 2 2 2 2 ✓ n1 nm ◆✓ ◆ x + y + z x y z = x + y x y + x + z x z . (11.4) ··· k k k k k k k k k k k k which gives a vector with n components. We have just proved that for all x, y, z V , 1 2 Adi↵erential operator: the map T : C [a, b] C[a, b] where Tf = f 0. • ! x, y + z = x, y + x, z . (11.5) h i h i h i An integral operator: the map T : C[a, b] C[a, b] where the image Tf is the • function ! There remains to prove that x, ↵y = ↵ x, y . Eq. (11.5) proves it for ↵ N. Since x h i h i 2 x, y = x, y , this holds for ↵ Z. Further, by the homogeneity of norms, we (Tf)(x)= f(t)dt, (12.3) h i h i 2 have Za 1 2 2 2 ↵x, ↵y = ↵(x + y) ↵(x y) = ↵ x, y . (11.6) with x [a, b]. h i 4 k k k k h i 2 Now let ↵ = p/q with p, q N. Using both equations above, we have 2 Definition 12.1. Let (V, ) and (V 0, 0) be normed spaces. The oper- 1 p (11.7) k · k k · k x, ↵y = q2 qx, qy = q x, y . ator norm of the linear map T : V V 0 is defined as h i h i h i ! So linearity holds for all ↵ Q. A continuity argument extends it to all ↵ R. Let Tx 0 ↵ be rational numbers that2 converge to ↵; since the norm is continuous, we2 have T =supk k =sup Tx 0. n k k x V,x=0 x x V, x =1 k k 2 6 k k 2 k k x, ↵y = 1 x + ↵y 2 x ↵y 2 h i 4 k k k k We say that T is bounded if T < . 1 2 2 k k 1 =lim x + ↵ny x ↵ny n 4 k k k k !1 The reader should verify that both supremums above give the same value. It =limx, ↵ny (11.8) n h i follows from the definition that Tx 0 T x , a useful inequality. Before we !1 k k k kk k =lim↵n x, y state the next proposition, let us recall that a map f between normed space (V, ) n k·k !1 h i and (V 0, 0)iscontinuous at y V if for every " > 0, there exists > 0such = x, y . k · k 2 h i that f(x) f(y) 0 < " for all x V such that x y < . Here, the map f is not necessarilyk linear.k 2 k k We used linearity with respect to rational numbers, which we checked in Eq. (11.7). Linear maps are continuous i↵ they are bounded.

Proposition 12.1. Let (V, ) and (V 0, 0) be normed spaces and T : k · k k · k V V 0 be a linear map. The following are equivalent: 12 Linear maps (operators) ! (a) T is continuous at 0 (here, 0 V ). Functional analysis (also called linear analysis) is the domain of mathematics that 2 studies vector spaces or arbitrary dimensions and linear maps between these spaces. (b) T is continuous. This field originates from quantum mechanics, but it has much broader applications. A linear map (or operator)betweenvectorspacesV and V 0 is a map T : V (c) T < . ! k k 1 V 0 that satisfies T (↵x + y)=↵T (x)+T (y), (12.1) Proof. It is clear that (b) implies (a). The converse implication follows from linear- for all numbers ↵, R and vectors x, y V . It is customary to write Txinstead of ity: Continuity at 0 implies that for all " > 0, there exists > 0 such that T (x). Linear maps are2 quite peculiar, but2 they include many interesting examples: Tx Ty 0 = T (x y) 0 < " (12.4) k k k k

37 38 whenever x y < . This means that T is also continuous at y. The set B(x, )= y V : x y is called the closed ball of radius Next, wek show k that (a) implies (c). For x V with x =1,wehave centred on x. Heuristically,2 openk sets arek sets that do not contain their boundary, 2 k k while closed sets contain their boundary. 1 " Tx 0 = T (x) 0 < . (12.5) Let us review some examples. k k k k The upper bound holds for all x (such that x =1),sothat T < " < . (a) On R, the interval (a, b)= x R : a 0 such that B(f(x), ") U. ⇢ • that B(x, ) ⇢ U. 2 Since f is continuous, there exists > 0suchthaty B(x, )= f(y) ⇢ 1 2 1 ) 2 B(f(x), ") U. Then y f (U). This shows that B(x, ) f (U), so we have A subset U V is closed if its complement U c = V U is open. ⇢ 1 2 ⇢ • ⇢ \ proved that f (U)isopen.

39 40 1 We now assume that f (U)isopenforanyopenU V 0. Let x V and " > 0. Proof. We start by verifying that the sequence (xn)ofthetheoremisCauchy.We 1 ⇢ 2 Since f (B(f(x), ")) is open, it contains the ball B(x, )forsome > 0. Then have f(B(x, )) B(f(x), "), so f is continuous at x. ⇢ n xn+1 xn = f(xn) f(xn 1) ⌘ xn xn 1 ⌘ x1 x0 . (14.1) Here is a useful characterisation of closed sets, that indeed suggests that they k k k k k k··· k k include boundary points. If m

n 1 Proposition 13.3. A subset U of the normed space (V, ) is closed i↵ 1 ⌘m k · k k x U for all n, and x x, imply that x U. xm xn xk xk+1 ⌘ x1 x0 = x1 x0 . (14.2) n 2 n ! 2 k k k k k k 1 ⌘ k k kX=m kX=m 1 Then (0, 1] R is not closed since the sequence ( )n 1 converges to 0, but 0 2 n We see that for every " > 0, we can find N large enough so that xm xn < " for does not belong to (0, 1]. all m, n > N. k k The sequence (xn)convergessinceV is complete; let x denote the limit. It Proof. We prove that: U not closed there exists (xn)inU with xn x/U. , ! 2 belongs to U since U is closed, and it is fixed point by continuity of f: : If U is not closed, V U is not open, so there exists x V U such that ) \ 2 \ B(x, ) V U for all > 0. It follows that for all n, there exists xn U such that 6⇢ \ 2 f(x)= lim f(xn)= lim xn+1 = x. (14.3) x x < 1 , so x x. n n k n k n n ! !1 !1 : Let (xn)inU such that xn x/U.Forany > 0, the ball B(x, )contains Finally, the fixed point is unique: If x, z are two fixed points, we have x( for n large enough; but x / V! U2, so V U is not open, and U is not closed. n n 2 \ \ x z = f(x) f(z) ⌘ x z , (14.4) k k k k k k 14 The contraction mapping theorem which implies that x z =0. k k The proof shows that convergence of the sequence (xn)isexponentiallyfast. Definition 14.1. Let (V, ) be normed space and U V a closed subset. Let us illustrate the theorem with a few applications. First, consider the equation k · k ⇢ ax A map f : U U is a contraction on U if there exists ⌘ < 1 such that e = x with the parameter a (0, 1). We are looking for solution(s) x [0, ). ! ax 2 2 1 Let f(x)= e ; by the mean-value theorem, f(x) f(y) ⌘ x y  k k ax ay e e = f 0(⇠) x y (14.5) | | | || | for all x, y U. 2 a⇠ with ⇠ between x and y. Then f 0(⇠) = a e a, so that f(x) f(y)

41 42 we see that the equation above is the fixed point equation Tx = x.Weseeksucient Let T denote the Picard operator conditions for the existence of a unique solution. We have T :C[t0 ,t0 + ] C[t0 ,t0 + ], b ! Tf Tg =sup k(x, y) f(y) g(y) dy t (14.13) k k1 x [a,b] a x x0 + F s, x(s) ds. 2 Z 7! t0 b Z sup k(x, y) f(y) g(y) dy (14.8) The integral equation becomes Tx = x, so we are looking for fixed points of T .We  x [a,b] a | | 2 Z equip C[t0 ,t0 + ] with the sup norm (so it is complete indeed). Let us assume b f g sup k(x, y) dy. that F is Lipschitz-continuous with respect to the second variable, namely that k k1 x [a,b] a | | 2 Z F (s, t) F (s, u) L t u , (14.14) We see that T is a contraction if | |  | | b for a constant L that does not depend on s, t, u. Then sup k(x, y) dy<1. (14.9) t x [a,b] a | | 2 Z Tx(t) Ty(t) = F s, x(s) F s, y(s) ds The contraction mapping theorem implies the existence of a unique fixed point. Zt0 t ⇣ ⌘ It is worth doing the same exercise with norm 1 instead of .Wefind (14.15) k · k k · k1 L x(s) y(s) ds that T is a contraction provided  Zt0 b L t t0 x y . sup k(x, y) dx<1. (14.10)  | | k k1 a y [a,b] | | Z 2 This shows that Tx Ty L x y , so T is a contraction if L < 1. ⇣ ⌘ k k1  k k1 This condition is distinct from the one before, so we can solve other equations. Hence the existence of a unique solution. We have just proved the Picard-Lindel¨of But a non-trivial problem occurs here, namely that the space (C[a, b], )isnot theorem. k · k1 complete; it must first be completed, in a similar way that R is the completion of Theorem 14.2. Let D =[t a, t + a] [x b, x + b]. Assume that Q, so the contraction mapping theorem can be applied. 0 0 ⇥ 0 0 F : D R is continuous in the first variable and Lipschitz-continuous ! Ordinary di↵erential equations. Consider an ordinary di↵erential equation of in the second variable. Then there exists > 0 such that the di↵erential dx 1 the form equation = F (t, x(t)) has a unique solution x C [t0 ,t0 + ]. dt 2 dx = F t, x(t) , It is worth pointing out that the conclusion of this theorem is not obvious. For dt (14.11) dx instance, the equation dt = x(t)hastwosolutionswitht [0, 1]: x 0and x(t0)=x 0. 2 ⌘ x(t)= 1 t2. The theorem does not apply because the function F (t)=pt is not 4 p Here, x is a function on R, x0 R is the initial conditions, and F is a function Lipschitz. 2 R2 R.Wearelookingforsucient conditions that guarantees the existence of a di↵erentiable! function x that solves this equation. We use the contraction mapping Newton-Raphson method. This is a method that helps to solve the equation theorem. f(x)=0,wheref is a function [a, b] R. The idea is to start with a reasonable ! It helps to use an integral equation; assuming that x is a solution to the equation guess, x0, and to apply the following iteration: above, we can integrate and we get f(xn) t x = x . (14.16) n+1 n f (x ) x(t) x(t )= F s, x(s) ds, (14.12) 0 n 0 Zt0 Then we hope that xn tends to the solution as n . Draw a picture to see why! or x(t)=x(t )+ t F s, x(s) ds. Conversely, if x solves the integral equation, then The method does not always work. A counter-example!1 is the function g(x)= 0 t0 it solves the di↵erential equation by the fundamental theorem of calculus (Theorem x1/3, with x =0.Onegetsx = 2x , which does not define a convergent R 0 n+1 n 3.11; we should assume here that F is continuous). sequence. 6

43 44 2 1 We assume that f C (R), that g0(x) =0forallx,andthat Notice that " can be taken to be arbitrarily small! Taking " = 10 , convergence is so 2 6 fast that the number of correct digits doubles at every step! f(x)f 00(x) (14.17) ⌘ =sup| 2 | < 1. x R f 0(x) Jacobi algorithm. Our last application of the contraction mapping theorem is a 2 method that allows to compute the solution to the linear algebra equation As we shall see, it is enough to guarantee that the equation f(x)=0hasaunique solution, and that x converges to it. n Ax = b, (14.24) We work with the Banach space (R, ). Let us introduce |·| where x, b are vectors in Rn,andA is an n n matrix. Assuming that A is invertible, f(x) 1 ⇥ 1 g(x)=x . (14.18) the solution is x = A b, but computation of A is tricky. Jacobi algorithm consists f (x) 0 of decomposing A = D + R, where D contains the diagonal of A and R the o↵- The fixed point equation g(x)=x is equivalent to f(x)=0.Bythemean-value diagonal terms: theorem, there exists ⇠ such that a a a a 0 0 0 a a 11 12 ··· 1n 11 ··· 12 ··· 1n f(⇠)f 00(⇠) a21 a22 a2n 0 a22 a21 0 a2n g(x) g(y) = g0(⇠) x y = x y ⌘ x y . (14.19) 2 A = 0 . . ··· . 1 = 0 . . 1 + 0 . . ··· . 1 . | || | f 0(⇠) | |  | | ...... B C B C B C Ban1 an2 annC B 0 annC Ban1 an2 0 C It follows that the equation has a unique solution, and that the sequence (xn)n 0 B ··· C B C B ··· C @ A @ A @ A(14.25) converges exponentially fast to it. In fact, it can be checked that convergence is 1 much faster: Using Taylor expansion with remainder, we have We assume that all diagonal terms di↵er from zero. The matrix D is easy to compute, since 1 f(x + a) a11 0 0 g(x + a)=x + a 1 ··· f 0(x + a) 1 0 a22 D = 0 1 . (14.26) 1 2 . .. f(x)+af 0(x)+ 2 a f 00(⇠1) . . = x + a B 1C f 0(x)+af 00(⇠2) B 0 ann C (14.20) B C f (x)+ 1 af (⇠ ) n@ n A f 0(x)+af 00(⇠2) 0 2 00 1 We now introduce the map f : R R by = x + a a ! f 0(x)+af 00(⇠2) f 0(x)+af 00(⇠2) 1 1 f(x)=D (b Rx). (14.27) f 00(⇠2) f 00(⇠1) = x + a2 2 . f 0(x)+af 00(⇠2) The fixed point equation f(x)=x amounts to Dx = b R(x), which is equivalent n Here, ⇠ , ⇠ are close to x, ⇠ x < a . Let to (D + R)x = b, i.e. Ax = b. Let be any norm on R ; recall that it induces 1 2 | 1,2 | | | k · k the operator norm on matrices M =supx =1 Mx . Then we have 1 k k k k k k f 00(x + y2) 2 f 00(x + y1) C =sup . (14.21) 1 1 f(x) f(y) = D R(x y) D R x y . (14.28) y1 , y2 < a f 0(x)+af 00(x + y2) | | | | | | k k k kk kk k 2 1 Then g(x + a) x Ca . Let us now look at the sequence (xn)definedby We make the new assumption D R < 1sothatf is a contraction. This allows | |  2 k k xn+1 = g(xn), and let an = x xn . We have just shown that an+1 Can. Let us to calculate the first terms of the inductive sequence xn+1 = f(xn), which gives a 1 | |  now assume that a0 C " for some " < 1 (since xn x, we can choose n large good approximation to the solution of Eq. (14.24).  1 ! 1 enough so that xn x

Banach fixed point theorem, 41 Jacobi algorithm, 46 Banach space, 32 , 38 linear map, 37 Lipschitz-continuous, 44 Cauchy–Schwarz inequality, 35 , 39 Minkowski inequality, 31 completeness Newton-Raphson method, 44 continuous functions, 21 norm, 31 regulated functions, 20 equivalent, 33 continuous induced, 35 uniformly, 6 supremum, 5 contraction, 41 normed space, 31 contraction mapping theorem, 41 convergence open ball, set, 39 pointwise, 20 operator, 37 operator norm, 38 devil’s staircase, 21 parallelogram identity, 35 fixed point, 41 Picard operator, 44 Fourier series, 27 Picard-Lindel¨of theorem, 44 Fredholm integral equation, 42 pointwise convergence, 20 Fubini theorem, 23 polarisation identity, 36 fundamental theorem of calculus, 11 primitive, 10 heat equation, 29 regulated function, 7 Hilbert space, 35 characterisation, 16 improper integral, 17 , 18 indefinite integral, 10 scalar product, 35 induced norm, 35 Schwarz inequality, 35 inequality series of functions, 26 Cauchy–Schwarz, 35 step function, 3 Minkowski, 31 supremum norm, 5 inner product, 35 integral improper, 17 of regulated function, 8 of step function, 3 Riemann, 18 integration by parts, 12

47