FOURIER ANALYSIS

T.K.SUBRAHMONIAN MOOTHATHU

Contents

1. Introduction
2. Convolution and approximate identities
3. Fourier series: preliminaries, and a divergence result
4. Sufficient conditions for pointwise convergence of Fourier series
5. Cesàro summability and Abel summability
6. Weak type boundedness for maximal functions
7. Fourier series: pointwise convergence of Cesàro and Abel sums
8. Pointwise convergence of Fourier series for functions of bounded variation
9. Convolution is a smoothing operation
10. Topologies on the spaces $\mathcal{D} = C_c^\infty(\mathbb{R})$ and $\mathcal{E} = C^\infty(\mathbb{R})$
11. The Schwartz space $\mathcal{S}$
12. Distributions: preliminaries
13. Convolution and distributions
14. Some structure theorems about distributions
15. Fourier transform on $\mathbb{R}$: basics
16. Fourier transform: sufficient conditions for pointwise inversion
17. Fourier transform on $\mathcal{S}$, $L^2(\mathbb{R})$, and on distributions
18. Fourier transform of measures
19. Poisson summation formula
20. Two theorems of Wiener
21. Sketch: interpolation and the $L^p$-theory of Fourier series

1. Introduction

Abstract Harmonic Analysis, the generalization of Fourier Analysis, refers generally to the representation theory of locally compact topological groups which are not necessarily abelian. Harmonic Analysis done on Euclidean groups such as the torus and $\mathbb{R}^n$ (which are abelian) is usually called Fourier Analysis, and this is what we plan to study here. We will discuss the basic aspects of Fourier Analysis from the perspective of pure Mathematics, making use of tools from Measure Theory and Functional Analysis (and hence these two subjects are prerequisites for this course). A little bit of knowledge about Topological Groups will also be needed for the next section.

Fourier series on the circle will be discussed first. Fourier integral on R, distributions, and a few advanced topics will be discussed afterwards. For a study deeper than what we present here, the student may refer to, for instance, the following books: (i) Y. Katznelson, An Introduction to Harmonic Analysis, (ii) L. Grafakos, Classical Fourier Analysis, (iii) M.A. Pinsky, Introduction to Fourier Analysis and Wavelets, (iv) F.G. Friedlander, Introduction to the Theory of Distributions.

Recall that if X is a Hilbert space with orthonormal basis $\{e_n : n \in \mathbb{N}\}$, then any $x \in X$ has the series representation $x = \sum_{n=1}^{\infty} \langle x, e_n\rangle e_n$. The idea of representing functions on the circle as a Fourier series is quite similar. We sketch this briefly.

Let $\mathbb{T} = \mathbb{R}/\mathbb{Z}$ denote the unit circle in the complex plane, which we parametrize either as $[0, 1)$ or as $[-1/2, 1/2)$. We remark that in several textbooks $\mathbb{T}$ is parametrized either as $[0, 2\pi)$ or as $[-\pi, \pi)$, and then one gets an extra factor of $\frac{1}{2\pi}$ in certain integral expressions. Irrespective of how we parametrize $\mathbb{T}$, keep in mind that $\mathbb{T}$ is a compact metric space. Note that $C(\mathbb{T}) = \{f : \mathbb{T} \to \mathbb{C} : f \text{ is continuous}\}$ is a Banach space w.r.to the supremum norm $\|\cdot\|_\infty$ defined as $\|f\|_\infty = \sup\{|f(t)| : t \in \mathbb{T}\}$.

We equip $\mathbb{T} = [0, 1)$ with the Lebesgue measure, and consider the Hilbert space $L^2(\mathbb{T})$ of square integrable complex valued functions on $\mathbb{T}$, where the inner product is $\langle f, g\rangle = \int_0^1 f(t)\overline{g(t)}\,dt$. Let $e_n(t) = e^{2\pi i n t}$ for $n \in \mathbb{Z}$. We see that $\langle e_m, e_n\rangle = \int_0^1 e_{m-n}(t)\,dt$ is equal to 0 if $m \neq n$ and to 1 if $m = n$, which means $\{e_n : n \in \mathbb{Z}\}$ is an orthonormal set in $L^2(\mathbb{T})$.

Since $\mathrm{span}\{e_n : n \in \mathbb{Z}\}$ is a subalgebra of $C(\mathbb{T})$ separating points, vanishing nowhere, and closed under complex conjugation, it follows by the Stone-Weierstrass theorem (Theorem 7.33 of Rudin, Principles of Mathematical Analysis) that $\mathrm{span}\{e_n : n \in \mathbb{Z}\}$ is dense in $(C(\mathbb{T}), \|\cdot\|_\infty)$. Since $\|\cdot\|_2 \le \|\cdot\|_\infty$, the topology induced on $C(\mathbb{T})$ by $\|\cdot\|_\infty$ is stronger than the $L^2$-topology. Also $C(\mathbb{T})$ is dense in $L^2(\mathbb{T})$, a fact from Measure Theory. It follows that $\mathrm{span}\{e_n : n \in \mathbb{Z}\}$ is dense in $L^2(\mathbb{T})$. Thus $\{e_n : n \in \mathbb{Z}\}$ is an orthonormal basis for the Hilbert space $L^2(\mathbb{T})$.

Hence any $f \in L^2(\mathbb{T})$ has a series representation $f = \sum_{n\in\mathbb{Z}} a_n e_n$, where $a_n = \langle f, e_n\rangle = \int_0^1 f e_{-n} = \int_0^1 f(t)e^{-2\pi i n t}\,dt$. By a change of variable $\theta = 2\pi t$ (which amounts to parametrizing $\mathbb{T}$ as $[0, 2\pi)$), we can also obtain the classical formula $a_n = \frac{1}{2\pi}\int_0^{2\pi} f(\theta)e^{-in\theta}\,d\theta$.

Observe that the representation $f = \sum_{n\in\mathbb{Z}} a_n e_n$ means only that the sequence $\left(\sum_{n=-N}^{N} a_n e_n\right)_{N=0}^{\infty}$ of partial sums converges to f in the $L^2$-norm. Since $L^p$-convergence ($1 \le p < \infty$) does not imply pointwise convergence, the series $\sum_{n\in\mathbb{Z}} a_n e_n(t)$ may not converge to $f(t)$ at points $t \in \mathbb{T}$. Therefore, among other things, it is natural to discuss the following: (i) finding sufficient conditions for the pointwise convergence of the Fourier series, (ii) rate of convergence of the Fourier series, (iii) other types of convergence that may hold even in the absence of pointwise convergence.

2. Convolution and approximate identities

We start with a little abstract theory that is applicable to both $\mathbb{T}$ and $\mathbb{R}^n$. Let G be a locally compact second countable abelian group throughout this section. Read the basic theory of such groups from relevant books (for instance, G admits a complete separable metric). The most important fact is that such a group G has a Haar measure µ on it, which means (i) µ ≠ 0 is a Borel measure on G (i.e., defined on the Borel σ-algebra of G), (ii) (local finiteness) µ(K) < ∞ for every compact set K ⊂ G, and (iii) (translation invariance) µ(A + x) = µ(A) for every Borel set A ⊂ G and every x ∈ G. Moreover any other measure on G satisfying the above properties must be of the form cµ for some c > 0. A Haar measure µ on G is always regular, which means

µ(A) = sup{µ(K): K ⊂ A and K is compact} = inf{µ(U): A ⊂ U ⊂ G and U is open}.

Note that the Lebesgue measure is a Haar measure on T and Rn.

Definition: Let µ be a Haar measure on G. For $1 \le p < \infty$, let $L^p(G) = \{f : G \to \mathbb{C} : \int_G |f|^p\,d\mu < \infty\}$ with the usual convention that we identify f and g if they agree µ-almost everywhere. For $f \in L^p(G)$, let $\|f\|_p = \left(\int_G |f|^p\,d\mu\right)^{1/p}$. When p = 2, $L^2(G)$ is a Hilbert space with the inner product $\langle f, g\rangle = \int_G f\overline{g}\,d\mu$. Similarly one defines $L^\infty(G) = \{f : G \to \mathbb{C} : f \text{ is measurable, and bounded } \mu\text{-almost everywhere}\}$ with norm defined as $\|f\|_\infty = \inf\{M > 0 : |f(x)| \le M \text{ for } x \text{ outside a } \mu\text{-null set}\}$. Let $C(G) = \{f : G \to \mathbb{C} : f \text{ is continuous}\}$ and $C_c(G) = \{f \in C(G) : f \text{ has compact support}\}$, where the support of f, denoted as supp(f), is the closure of the set $\{x \in G : f(x) \neq 0\}$. The following is a standard fact.

Exercise-1: Let $1 \le p < \infty$. Then $C_c(G)$ is dense in $L^p(G)$. In particular, if G is compact, then $C(G)$ is dense in $L^p(G)$. [Hint: Find the proof from a textbook. The idea of one proof is roughly as follows. The indicator function $1_K$ of a compact set K ⊂ G can be approximated by members of $C_c(G)$ using Urysohn's lemma. Since the measure µ is regular, the indicator function $1_A$ of any Borel set A ⊂ G with µ(A) < ∞ also has such approximations. And a general $f \in L^p(G)$ can be approximated by simple functions, i.e., functions of the form $\sum_{j=1}^{k} a_j 1_{A_j}$ where $A_j \subset G$ are Borel.]

The translation invariance of the Haar measure has the following important consequence:

Exercise-2: Let $f \in L^1(G)$, and $f_y(x) = f(x - y)$ for $y \in G$. Then,
(i) $\int_G f_y\,d\mu = \int_G f(x - y)\,d\mu(x) = \int_G f\,d\mu$, $f_y \in L^1(G)$, and $\|f_y\|_1 = \|f\|_1$ for every $y \in G$.
(ii) For each $f \in L^1(G)$, the map $y \mapsto f_y$ from G to $L^1(G)$ is continuous.
[Hint: (i) The equality $\int f_y\,d\mu = \int f\,d\mu$ is clear when f is an indicator function $1_A$ with µ(A) < ∞. And any $f \in L^1(G)$ may be approximated by linear combinations of such indicator functions. (ii) Since $\|f_y - g_y\|_1 = \|(f - g)_y\|_1 = \|f - g\|_1$, and since $C_c(G)$ is dense in $L^1(G)$ by Exercise-1, it suffices to consider $f \in C_c(G)$ and establish the continuity of $y \mapsto f_y$ at $y = 0 \in G$. Let K = supp(f), the (compact) support of f, and A ⊂ G be a compact symmetric neighborhood of 0 ∈ G. Note that µ(K + A) < ∞ since K + A is compact. As $f \in C_c(G)$ is uniformly continuous, given ε > 0, we can find a symmetric neighborhood U ⊂ G of 0 with U ⊂ A such that $a - b \in U$ implies $|f(a) - f(b)| < \varepsilon/\mu(K + A)$. Then, for $y \in U$, we have $\|f - f_y\|_1 = \int_G |f(x) - f(x - y)|\,d\mu(x) \le \int_{K+A} (\varepsilon/\mu(K + A))\,d\mu = \varepsilon$.]

Remark: As G is locally compact and second countable, G is σ-compact, and consequently the Haar measure µ on G is σ-finite. Hence Fubini's theorem holds for µ (this will be used repeatedly).

Certain Banach spaces admit an associative multiplication operation that distributes over addition, making the Banach space a Banach algebra. For instance, $l^\infty$ has the pointwise multiplication

(an)(bn) := (anbn), and C(T) has the pointwise multiplication (fg)(x) := f(x)g(x). The Banach space L1(G) also admits a multiplication called convolution:

Definition: Let µ be a Haar measure on G. The convolution $f * g$ of $f, g \in L^1(G)$ is defined as $f * g(x) = \int_G f(y)g(x - y)\,d\mu(y) = \int_G f(y)g_y(x)\,d\mu(y)$, which can be roughly thought of as a weighted average of f by the function $y \mapsto g_y$. Applying Fubini's theorem to $\Phi(x, y) := |f(y)g_y(x)|$, we see that
$$\int_G\int_G \Phi\,d\mu\,d\mu = \int_G |f(y)|\left(\int_G |g_y(x)|\,d\mu(x)\right)d\mu(y) = \int_G |f(y)|\,\|g\|_1\,d\mu(y) = \|f\|_1\|g\|_1 < \infty.$$
Hence $(f * g)(x)$ is well-defined for µ-almost every $x \in G$ and $f * g \in L^1(G)$ with $\|f * g\|_1 \le \|f\|_1\|g\|_1$.

Exercise-3: Let µ be a Haar measure on G and let $f, g, h \in L^1(G)$. Then,
(i) $f * g \in L^1(G)$ with $\|f * g\|_1 \le \|f\|_1\|g\|_1$. (already done above)
(ii) (Commutativity) $f * g = g * f$. [Hint: Substitute $u = x - y$ in the integral representing $f * g$.]
(iii) (Associativity) $f * (g * h) = (f * g) * h$.
(iv) (Linearity in each variable) $(af + bg) * h = a(f * h) + b(g * h)$ for $a, b \in \mathbb{C}$.

(v) (Convolution commutes with translation) (f ∗ g)z = fz ∗ g = f ∗ gz for every z ∈ G.
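The properties in Exercise-3 can be checked numerically on a small example. The following is a minimal sketch, assuming we take G to be the finite cyclic group Z/nZ, on which the counting measure is a Haar measure and every integral becomes a finite sum. The helper names `convolve`, `translate`, `norm1`, and `close` are illustrative and are not part of the notes.

```python
# Convolution on the finite cyclic group Z/nZ, with counting measure as Haar measure.
# A function on the group is a list of (complex) numbers indexed by 0, ..., n-1.

def convolve(f, g):
    """(f * g)(x) = sum over y of f(y) g(x - y), with indices taken modulo n."""
    n = len(f)
    return [sum(f[y] * g[(x - y) % n] for y in range(n)) for x in range(n)]

def translate(f, z):
    """f_z(x) = f(x - z)."""
    n = len(f)
    return [f[(x - z) % n] for x in range(n)]

def norm1(f):
    """L^1 norm with respect to counting measure."""
    return sum(abs(v) for v in f)

def close(u, v, tol=1e-12):
    """Entrywise comparison up to a small rounding tolerance."""
    return all(abs(a - b) <= tol for a, b in zip(u, v))

f = [1, -2, 0.5j, 3, 0, 1 + 1j]
g = [0, 1, 1, -1j, 2, 0]
h = [2, 0, -1, 0, 1j, 1]

fg = convolve(f, g)
commutative = close(fg, convolve(g, f))                              # Exercise-3(ii)
associative = close(convolve(f, convolve(g, h)), convolve(fg, h))    # Exercise-3(iii)
norm_bound = norm1(fg) <= norm1(f) * norm1(g) + 1e-12                # Exercise-3(i)
commutes_with_translation = (                                        # Exercise-3(v)
    close(translate(fg, 2), convolve(translate(f, 2), g))
    and close(translate(fg, 2), convolve(f, translate(g, 2)))
)
print(commutative, associative, norm_bound, commutes_with_translation)
```

A finite group is used only because it makes every identity exactly computable; the same statements on T or R^n require the measure-theoretic arguments of the exercise.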

Remark: If G is compact, then $L^p(G) \subset L^1(G)$ for $1 < p < \infty$ since µ(G) < ∞. This inclusion does not hold when G is non-compact such as $\mathbb{R}^n$. But still $f * g$ is defined for f, g belonging to certain $L^p(G)$ spaces, and the result below gives information about the location of $g * f$ in such cases.

[101] Let µ be a Haar measure on G.
(i) If $g \in L^1(G)$ and $f \in C_c(G)$, then $g * f \in L^\infty(G)$ and $g * f$ is uniformly continuous.
(ii) (Minkowski's inequality) Let $1 \le p \le \infty$. If $g \in L^1(G)$ and $f \in L^p(G)$, then $g * f \in L^p(G)$ with $\|g * f\|_p \le \|f\|_p\|g\|_1$.
(iii) (Young's inequality) Let $1 \le p, q, r \le \infty$ be such that $\frac{1}{p} + \frac{1}{r} = \frac{1}{q} + 1$. If $g \in L^r(G)$ and $f \in L^p(G)$, then $f * g \in L^q(G)$ and $\|f * g\|_q \le \|g\|_r\|f\|_p$.

Proof. (i) Since $f \in C_c(G)$, f is uniformly continuous and bounded. We have $|g * f(x)| \le \|g\|_1\|f\|_\infty$. Given ε > 0, choose a symmetric neighborhood U ⊂ G of 0 ∈ G such that $|f(x) - f(y)| < \varepsilon/\|g\|_1$ whenever $x - y \in U$. Then
$$|(g * f)(a) - (g * f)(b)| \le \int_G |g(y)|\,|f(a - y) - f(b - y)|\,d\mu(y) < (\varepsilon/\|g\|_1)\int_G |g(y)|\,d\mu(y) = \varepsilon$$
whenever $a - b \in U$, and hence $g * f$ is uniformly continuous.

(ii) The case p = 1 is already done, and the case p = ∞ is easy. So assume $1 < p < \infty$. Let $q = p/(p - 1)$ so that $\frac{1}{p} + \frac{1}{q} = 1$. Let $h(y) = f(x - y)$ and ν be the Borel measure on G given by $\nu(A) = \int_A |g|\,d\mu$. Applying Hölder's inequality w.r.to ν, we have
$$|g * f(x)| \le \int_G |f(x - y)|\,|g(y)|\,d\mu(y) = \int_G 1 \cdot |h|\,d\nu \le \|1\|_{L^q(\nu)}\|h\|_{L^p(\nu)} = \|g\|_1^{1/q}\left(\int_G |f(x - y)|^p|g(y)|\,d\mu(y)\right)^{1/p}.$$
Since $p/q = p - 1$, we get $\int_G |g * f(x)|^p\,d\mu(x) \le \|g\|_1^{p-1} C$, where by Fubini and translation invariance,
$$C = \int_G\int_G |f(x - y)|^p|g(y)|\,d\mu(y)\,d\mu(x) = \int_G\left(\int_G |f(x - y)|^p\,d\mu(x)\right)|g(y)|\,d\mu(y)$$
$$= \int_G\left(\int_G |f(x)|^p\,d\mu(x)\right)|g(y)|\,d\mu(y) = \int_G \|f\|_p^p\,|g(y)|\,d\mu(y) = \|g\|_1\|f\|_p^p.$$
Thus $\|g * f\|_p^p \le \|g\|_1^p\|f\|_p^p$, or $\|g * f\|_p \le \|g\|_1\|f\|_p$.

(iii) This is a generalization of (ii) with a similar, but more complicated proof, which is left as a reading assignment. See Theorem 1.2.12 of Grafakos, Classical Fourier Analysis. □
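The inequalities in [101](ii) and [101](iii) can be tested on concrete data. The sketch below assumes the finite cyclic group Z/nZ with counting measure as Haar measure (so the norms are the usual l^p norms); the functions `convolve` and `norm` and the sample vectors are illustrative choices, not taken from the notes. A numerical check of this kind is of course no substitute for the proof.

```python
# Numerical check of Minkowski's and Young's inequalities on Z/nZ (counting measure).

def convolve(f, g):
    """(f * g)(x) = sum over y of f(y) g(x - y), with indices taken modulo n."""
    n = len(f)
    return [sum(f[y] * g[(x - y) % n] for y in range(n)) for x in range(n)]

def norm(f, p):
    """The l^p norm for 1 <= p < infinity."""
    return sum(abs(v) ** p for v in f) ** (1.0 / p)

f = [3, -1, 0, 2, 0.5, -4, 1, 0]
g = [1, 2, 0, 0, -1, 0.5, 0, 3]

# Exponents with 1/p + 1/r = 1/q + 1, namely 1/2 + 2/3 = 1/6 + 1.
p, r, q = 2.0, 1.5, 6.0
young_lhs = norm(convolve(f, g), q)
young_rhs = norm(g, r) * norm(f, p)

# The case r = 1, q = p is Minkowski's inequality from part (ii).
minkowski_lhs = norm(convolve(g, f), p)
minkowski_rhs = norm(g, 1.0) * norm(f, p)

print(young_lhs <= young_rhs, minkowski_lhs <= minkowski_rhs)
```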

In general, L1(G) may not have a unit element for convolution, i.e., there may not exist g ∈ L1(G) with f ∗ g = f for every f ∈ L1(G); see the Remark after [103]. However, L1(G) has what is called approximate identities: a parametrized family of functions which, in the limiting case, behaves as a unit element for convolution. First we will define an approximate identity formally; its behavior justifying the name ‘approximate identity’ will be proved in [102] below.

Definition: Let µ be a Haar measure on G, and let $0 < b < \infty$. A parametrized family $\{K_a : 0 < a < b\}$ in $L^1(G)$ is an approximate identity for $L^1(G)$ if the following three properties are satisfied:
(A1) (Normalization) $\int_G K_a\,d\mu = 1$ for every a.
(A2) ($L^1$-boundedness) $\sup\{\|K_a\|_1 : 0 < a < b\} < \infty$.
(A3) ($L^1$-concentration at 0 ∈ G) For any neighborhood U ⊂ G of 0 ∈ G, $\lim_{a\to 0}\int_{G\setminus U} |K_a|\,d\mu = 0$.

Note that if $K_a \ge 0$, then (A2) follows from (A1). Sometimes the approximate identity satisfies an additional property:
(A4) ($L^\infty$-concentration at 0 ∈ G) For any neighborhood U ⊂ G of 0 ∈ G, $\lim_{a\to 0}\sup\{|K_a(x)| : x \in G \setminus U\} = 0$. (Note that (A4) implies (A3) when µ(G) < ∞, for instance when G is compact.)

Remark: Soon we will encounter explicit examples of approximate identities on $L^1(\mathbb{T})$ in relation with the theory of Fourier series. For the moment note that if $g \in L^1(\mathbb{R}^n)$ satisfies $g \ge 0$ and $\int_{\mathbb{R}^n} g\,d\mu = 1$, then we can obtain an approximate identity $\{K_a\}_{a>0}$ for $L^1(\mathbb{R}^n)$ by putting $K_a(x) = a^{-n}g(x/a)$. We have $\int_{\mathbb{R}^n} K_a(x)\,dx = \int_{\mathbb{R}^n} a^{-n}g(x/a)\,dx = \int_{\mathbb{R}^n} g(y)\,dy = 1$ by the change of variable $y = x/a$, where we used the fact that $a^{-n}$ is the determinant of the Jacobian of the map $x \mapsto x/a$ from $\mathbb{R}^n$ to $\mathbb{R}^n$. And $\int_{|x|>\delta} |K_a(x)|\,dx = \int_{|y|>\delta/a} |g(y)|\,dy \to 0$ as $a \to 0$.

Remark: In the above definition, we parametrized the approximate identity with $a \in (0, b)$ and considered the limit as $a \to 0$. However, we can define approximate identity using other parametrizations and other limiting processes; for instance, we may parametrize with $n \in \mathbb{N}$ and consider the limit as $n \to \infty$, or parametrize with $r \in (0, 1)$ and consider the limit as $r \to 1$.

A general strategy: While estimating integral expressions in Fourier Theory, the following strategy will be often followed: split the integral into two parts, one part for a neighborhood of 0, and the other part for the region outside; then estimate each integral separately.

[102] (Justifying the name approximate identity) Let µ be a Haar measure on G, and let {Ka : 0 < a < b} be an approximate identity on L1(G).

(i) If $f \in C_c(G)$, then $\lim_{a\to 0}\|K_a * f - f\|_\infty = 0$.
(ii) If $1 \le p < \infty$ and $f \in L^p(G)$, then $\lim_{a\to 0}\|K_a * f - f\|_p = 0$.
(iii) Assume in addition that G is compact and $\{K_a : 0 < a < b\}$ satisfies the $L^\infty$-concentration condition (A4). If $g \in L^1(G)$ satisfies $\lim_{x\to 0} g(x) = c$, then $\lim_{a\to 0}\int_G K_a(x)g(x)\,d\mu(x) = c$.

Proof. (i) Using condition (A1) of an approximate identity, we note that $K_a * f(x) - f(x) = \int_G K_a(y)(f(x - y) - f(x))\,d\mu(y)$. Let ε > 0 be given, and $M = \sup\{\|K_a\|_1 : 0 < a < b\}$. Since $f \in C_c(G)$ is uniformly continuous, we may choose a neighborhood U ⊂ G of 0 ∈ G such that $|f(x - y) - f(x)| \le \frac{\varepsilon}{2M}$ for every $y \in U$ and $x \in G$. Using condition (A3), choose $a_0 \in (0, b)$ such that $\int_{G\setminus U} |K_a|\,d\mu < \frac{\varepsilon}{4\|f\|_\infty}$ for every $a \in (0, a_0)$. Then for $0 < a < a_0$, we have
$$|K_a * f(x) - f(x)| \le \int_U |K_a(y)|\,|f(x - y) - f(x)|\,d\mu(y) + \int_{G\setminus U} |K_a(y)|\,|f(x - y) - f(x)|\,d\mu(y)$$
$$\le \frac{\varepsilon}{2M}\int_U |K_a(y)|\,d\mu(y) + 2\|f\|_\infty\int_{G\setminus U} |K_a(y)|\,d\mu(y) \le \varepsilon/2 + \varepsilon/2 = \varepsilon$$
by the choice of M and $a_0$. As this holds for every $x \in G$, $\|K_a * f - f\|_\infty \le \varepsilon$ for every $a \in (0, a_0)$.

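(ii) If $f \in L^p(G)$ and $g \in C_c(G)$, we have $\|K_a * f - f\|_p \le \|K_a * (f - g)\|_p + \|K_a * g - g\|_p + \|g - f\|_p$. Since $C_c(G)$ is dense in $L^p(G)$, we can make $\|g - f\|_p$ arbitrarily small by choosing g suitably. By [101](ii), $\|K_a * (f - g)\|_p \le \|K_a\|_1\|f - g\|_p \le M\|f - g\|_p$, where $M = \sup\{\|K_a\|_1 : 0 < a < b\}$. Finally, $\|K_a * g - g\|_p \le \|K_a * g - g\|_\infty \to 0$ as $a \to 0$ by part (i); this inequality between norms uses µ(G) ≤ 1, as for G = T, and for a general G one argues instead with Minkowski's integral inequality, $\|K_a * g - g\|_p \le \int_G |K_a(y)|\,\|g_y - g\|_p\,d\mu(y)$, splitting the integral over U and G \ U as in part (i) and using the $L^p$-continuity of $y \mapsto g_y$. From these observations, it follows that $\|K_a * f - f\|_p \to 0$ as $a \to 0$.

(iii) We have $\int_G K_a(x)g(x)\,d\mu(x) - c = \int_G K_a(x)(g(x) - c)\,d\mu(x)$ by (A1). Let $M = \sup\{\|K_a\|_1 : 0 < a < b\}$. Given ε > 0, choose a neighborhood U ⊂ G of 0 ∈ G such that $|g(x) - c| < \frac{\varepsilon}{2M}$ for $x \in U$. Then choose $a_0 \in (0, b)$ by (A4) such that $\sup\{|K_a(x)| : x \in G \setminus U\} < \frac{\varepsilon}{2(\|g\|_1 + |c|\mu(G))}$ for every $a \in (0, a_0)$, where the compactness of G ensures that µ(G) < ∞. Then, for $0 < a < a_0$, as in the proof of (i),
$$\left|\int_G K_a g\,d\mu - c\right| \le \frac{\varepsilon}{2M}\int_U |K_a|\,d\mu + \frac{\varepsilon}{2(\|g\|_1 + |c|\mu(G))}\int_{G\setminus U} |g(x) - c|\,d\mu \le \varepsilon/2 + \varepsilon/2 = \varepsilon. \qquad\square$$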
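To see [102](i) at work, one can compute the sup-norm error for a concrete kernel. The following is a minimal sketch on G = R, assuming the kernel $K_a(x) = a^{-1}g(x/a)$ built from the tent function $g(x) = \max(0, 1 - |x|)$ as in the Remark on dilations, and taking f to be the same tent function (which lies in $C_c(\mathbb{R})$). The function names and the midpoint-rule quadrature are illustrative assumptions. Since this f is Lipschitz with constant 1, the error is at most $\int K_a(y)|y|\,dy = a/3$.

```python
# Illustration of [102](i) on G = R with K_a(x) = g(x/a)/a, g(x) = max(0, 1 - |x|).

def tent(x):
    return max(0.0, 1.0 - abs(x))

def kernel(a, x):
    """K_a(x) = a^{-1} g(x/a); it is supported in [-a, a] and has integral 1."""
    return tent(x / a) / a

def smoothed(f, a, x, m=400):
    """Midpoint-rule approximation of (K_a * f)(x), the integral of K_a(y) f(x - y) over [-a, a]."""
    h = 2.0 * a / m
    total = 0.0
    for j in range(m):
        y = -a + (j + 0.5) * h
        total += kernel(a, y) * f(x - y)
    return total * h

def sup_error(f, a):
    """max |K_a * f(x) - f(x)| over a grid of points x in [-2, 2]."""
    grid = [i / 100.0 for i in range(-200, 201)]
    return max(abs(smoothed(f, a, x) - f(x)) for x in grid)

scales = [0.5, 0.1, 0.02]
errors = [sup_error(tent, a) for a in scales]
for a, err in zip(scales, errors):
    print(f"a = {a}: sup error about {err:.5f} (bound a/3 = {a / 3:.5f})")
```

The printed errors shrink with a, in line with $\|K_a * f - f\|_\infty \to 0$.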

3. Fourier series: preliminaries, and a divergence result

Recall that we parametrize the circle $\mathbb{T} = \mathbb{R}/\mathbb{Z}$ as $[0, 1)$ (sometimes also as $[-1/2, 1/2)$) with addition performed modulo 1. Let $e_r(t) = e^{2\pi i r t}$ for $r \in \mathbb{R}$, and note that $\overline{e_r} = e_{-r}$, where the bar stands for complex conjugation. Since the Lebesgue measure on $\mathbb{T}$ is finite, we have $L^p(\mathbb{T}) \supset L^q(\mathbb{T})$ for $1 \le p \le q < \infty$, and thus the largest among them is $L^1(\mathbb{T})$. Whenever needed, it will be implicitly assumed that any $f \in L^1(\mathbb{T})$ is extended to the whole of $\mathbb{R}$ with period 1.

Definition: The Fourier series of $f \in L^1(\mathbb{T})$ is formally defined as $\sum_{n\in\mathbb{Z}} \hat{f}(n)e_n$, where $\hat{f}(n) = \int_0^1 f(t)\overline{e_n(t)}\,dt = \int_0^1 f(t)e_{-n}(t)\,dt$. Here $\hat{f}(n)$ is called the nth Fourier coefficient of f for $n \in \mathbb{Z}$.

Remark: Note that $\hat{f}(n)e_n + \hat{f}(-n)e_{-n} = (\hat{f}(n) + \hat{f}(-n))\cos 2\pi nt + i(\hat{f}(n) - \hat{f}(-n))\sin 2\pi nt$. Hence we have the formal equality of Fourier series
$$\sum_{n\in\mathbb{Z}} \hat{f}(n)e_n = \hat{f}(0) + \sum_{n=1}^{\infty} (\hat{f}(-n)e_{-n} + \hat{f}(n)e_n) = A_0 + \sum_{n=1}^{\infty} (A_n\cos 2\pi nt + B_n\sin 2\pi nt),$$
where $A_0 = \hat{f}(0) = \int_{\mathbb{T}} f(t)\,dt$, $A_n = \hat{f}(n) + \hat{f}(-n) = \int_{\mathbb{T}} f(t)(e_{-n}(t) + e_n(t))\,dt = 2\int_{\mathbb{T}} f(t)\cos 2\pi nt\,dt$ and $B_n = i(\hat{f}(n) - \hat{f}(-n)) = \int_{\mathbb{T}} if(t)(e_{-n}(t) - e_n(t))\,dt = 2\int_{\mathbb{T}} f(t)\sin 2\pi nt\,dt$. Observe that if f is an even function, then $B_n = 0$ for all n since sin is an odd function; and if f is an odd function, then $A_n = 0$ for all n since cos is an even function.

Exercise-4: Restricted to $L^2(\mathbb{T})$, the map $f \mapsto (\hat{f}(n))_{n\in\mathbb{Z}}$ from $L^2(\mathbb{T})$ to $l^2(\mathbb{Z})$ is an isometric isomorphism of Hilbert spaces. [Hint: See the Introduction, and use Parseval's identity.]

Example: Let $\mathbb{T} = [0, 1)$ and $f : \mathbb{T} \to \mathbb{C}$ be $f(t) = t$. We have $\hat{f}(0) = \int_0^1 t\,dt = 1/2$. For $n \in \mathbb{Z} \setminus \{0\}$, integration by parts gives
$$\hat{f}(n) = \int_0^1 te_{-n}(t)\,dt = (-2\pi in)^{-1}\left(te_{-n}(t)\Big|_0^1 - \int_0^1 e_{-n}(t)\,dt\right) = (-2\pi in)^{-1}(1 - 0) = (-2\pi in)^{-1}.$$
Also $\|f\|_2^2 = \int_0^1 t^2\,dt = 1/3$. Now $\|f\|_2^2 = \sum_{n\in\mathbb{Z}} |\hat{f}(n)|^2$ by Exercise-4. Hence $1/3 = (1/4) + 2\sum_{n=1}^{\infty} (4\pi^2n^2)^{-1}$. Simplification yields $\pi^2/6 = \sum_{n=1}^{\infty} n^{-2}$.
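The computations in the Example above can be cross-checked numerically. The sketch below compares a midpoint-rule approximation of $\hat{f}(n)$ for f(t) = t with the closed form $(-2\pi in)^{-1}$, and then evaluates truncated versions of the Parseval sum and of $\sum n^{-2}$. The function names, the number of quadrature nodes, and the truncation level N are illustrative assumptions.

```python
import cmath
import math

def fourier_coefficient(f, n, m=2000):
    """Midpoint-rule approximation of the integral of f(t) e^{-2 pi i n t} over [0, 1)."""
    h = 1.0 / m
    total = 0j
    for j in range(m):
        t = (j + 0.5) * h
        total += f(t) * cmath.exp(-2j * math.pi * n * t)
    return total * h

def exact_coefficient(n):
    """Closed form from the Example: 1/2 for n = 0 and (-2 pi i n)^{-1} otherwise."""
    return 0.5 if n == 0 else 1.0 / (-2j * math.pi * n)

def identity(t):
    return t

# Largest discrepancy between quadrature and closed form for |n| <= 5.
max_gap = max(abs(fourier_coefficient(identity, n) - exact_coefficient(n))
              for n in range(-5, 6))

# Parseval: 1/3 = 1/4 + 2 * sum_{n >= 1} 1 / (4 pi^2 n^2), truncated after N terms.
N = 10000
parseval_partial = 0.25 + 2.0 * sum(1.0 / (4.0 * math.pi ** 2 * n ** 2)
                                    for n in range(1, N + 1))
basel_partial = sum(1.0 / n ** 2 for n in range(1, N + 1))

print(max_gap, parseval_partial, basel_partial, math.pi ** 2 / 6)
```

The truncated sums approach 1/3 and $\pi^2/6$ from below, with a tail of size less than 1/N in the second case.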

Exercise-5: (Basic properties of the Fourier coefficients) Let $f, g \in L^1(\mathbb{T})$ and $a, b \in \mathbb{C}$. Then, for every $n \in \mathbb{Z}$ we have:
(i) (Linearity) $\widehat{af + bg}(n) = a\hat{f}(n) + b\hat{g}(n)$.
(ii) $\hat{\bar{f}}(n) = \overline{\hat{f}(-n)}$.
(iii) If $f_s \in L^1(\mathbb{T})$ is defined as $f_s(t) = f(t - s)$ for $s \in \mathbb{T}$, then $\hat{f_s}(n) = \hat{f}(n)e_{-n}(s) = \hat{f}(n)e_n(-s)$.
(iv) (Another expression for the Fourier coefficient) $\hat{f}(n) = \frac{1}{2}\int_0^1 [f(t) - f(t + \frac{1}{2n})]e_{-n}(t)\,dt$ for $n \neq 0$.
(v) $e_n * f = f * e_n = \hat{f}(n)e_n$.
(vi) (Fourier coefficient of convolution is product of Fourier coefficients) $\widehat{f * g}(n) = \hat{f}(n)\hat{g}(n)$.
[Hint: (iii) $\hat{f_s}(n) = \int_{\mathbb{T}} f(t - s)e_{-n}(t)\,dt = \int_{\mathbb{T}} f(y)e_{-n}(y + s)\,dy = \hat{f}(n)e_{-n}(s)$ by putting $t = y + s$. (iv) In $\hat{f}(n) = \int_{\mathbb{T}} f(t)e_{-n}(t)\,dt$, substitute $t = y + \frac{1}{2n}$ to get $\hat{f}(n) = -\int_{\mathbb{T}} f(y + \frac{1}{2n})e_{-n}(y)\,dy$ and add this to the first expression for $\hat{f}(n)$. (v) $f * e_n(t) = \int f(s)e_n(t - s)\,ds = (\int f(s)e_{-n}(s)\,ds)e_n(t) = \hat{f}(n)e_n(t)$. (vi) Using (v) we have $\widehat{f * g}(n)e_n = (f * g) * e_n = f * (g * e_n) = f * (\hat{g}(n)e_n) = \hat{g}(n)(f * e_n) = \hat{g}(n)\hat{f}(n)e_n$. And we may cancel $e_n \neq 0$ from both ends.]

The smoother the function f, the faster the rate of convergence of $(\hat{f}(n))$ to 0 as $|n| \to \infty$.

[103] (Rate of decay of Fourier coefficients) Let $f \in L^1(\mathbb{T})$. Then,
(i) $|\hat{f}(n)| \le \|f\|_1$ for every $n \in \mathbb{Z}$ and hence $(\hat{f}(n))_{n\in\mathbb{Z}} \in l^\infty(\mathbb{Z})$ (this is improved below). This has the following consequence by linearity: if $(f_k) \to f$ in $L^1(\mathbb{T})$, then $(\hat{f_k}(n)) \to \hat{f}(n)$ uniformly in n.
(ii) (Riemann-Lebesgue lemma) $\lim_{|n|\to\infty} \hat{f}(n) = 0$, i.e., $(\hat{f}(n))_{n\in\mathbb{Z}} \in c_0(\mathbb{Z}) := \{(x_n) : \lim_{|n|\to\infty} x_n = 0\}$.
(iii) (Generalized Riemann-Lebesgue lemma) $\lim_{r\in\mathbb{R};\,|r|\to\infty} \int_{\mathbb{T}} f(t)e_r(t)\,dt = 0$, and consequently $\lim_{r\in\mathbb{R};\,|r|\to\infty} \int_{\mathbb{T}} f(t)\cos 2\pi rt\,dt = 0$ and $\lim_{r\in\mathbb{R};\,|r|\to\infty} \int_{\mathbb{T}} f(t)\sin 2\pi rt\,dt = 0$.
(iv) If f is k-times differentiable with $f^{(k)} \in L^1(\mathbb{T})$, then we have $\widehat{f^{(k)}}(n) = (2\pi in)^k\hat{f}(n)$, and hence $\lim_{|n|\to\infty} |n|^k\hat{f}(n) = 0$ (this means $(\hat{f}(n))$ goes to 0 faster than $(|n|^{-k})$).

Proof. (i) $|\hat{f}(n)| \le \int_{\mathbb{T}} |fe_{-n}|\,dt = \int_{\mathbb{T}} |f|\,dt = \|f\|_1$ since $|e_{-n}| = 1$.

(ii) Since $C(\mathbb{T})$ is dense in $L^1(\mathbb{T})$ by Exercise-1, and since $|\hat{f}(n) - \hat{g}(n)| \le \|f - g\|_1$, it suffices to prove the result for $f \in C(\mathbb{T})$. And the result in this case follows from Exercise-5(iv), where we noted the expression $\hat{f}(n) = \frac{1}{2}\int_{\mathbb{T}} [f(t) - f(t + \frac{1}{2n})]e_{-n}(t)\,dt$: the integrand tends to 0 uniformly as $|n| \to \infty$ because f is uniformly continuous.

(iii) To prove the first statement, note as above that it suffices to prove for $f \in C(\mathbb{T})$, and then note $\int_{\mathbb{T}} f(t)e_r(t)\,dt = \frac{1}{2}\int_{\mathbb{T}} [f(t) - f(t - \frac{1}{2r})]e_r(t)\,dt$ for $r \in \mathbb{R} \setminus \{0\}$ as in the hint of Exercise-5(iv). To prove the second statement, assume f is real valued and note $e_r(t) = \cos 2\pi rt + i\sin 2\pi rt$.

(iv) Integration by parts gives
$$\hat{f}(n) = \int_0^1 fe_{-n}\,dt = (-2\pi in)^{-1}f(t)e_{-n}(t)\Big|_0^1 + (2\pi in)^{-1}\int_0^1 f'e_{-n}\,dt = 0 + (2\pi in)^{-1}\int_0^1 f'e_{-n}\,dt = (2\pi in)^{-1}\widehat{f'}(n).$$
That is, $\widehat{f'}(n) = 2\pi in\hat{f}(n)$, and inductively $\widehat{f^{(k)}}(n) = (2\pi in)^k\hat{f}(n)$. Finally, the assertion $\lim_{|n|\to\infty} |n|^k\hat{f}(n) = 0$ follows by applying part (ii) to $f^{(k)}$. □
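The decay statement [103](iv) can be observed numerically. The sketch below uses the hypothetical test function $h(t) = (t(1 - t))^2$ on [0, 1), chosen here (it does not appear in the notes) because its periodic extension is twice continuously differentiable while $h'''(t) = 24t - 12$ jumps at the integers. Three integrations by parts, together with the coefficients of $t \mapsto t$ computed in the Example, give $\hat{h}(n) = -3/(2\pi^4n^4)$ for $n \neq 0$, so $n^2|\hat{h}(n)| \to 0$ as predicted by (iv) with k = 2. The quadrature routine is an illustrative midpoint rule.

```python
import cmath
import math

def fourier_coefficient(f, n, m=2000):
    """Midpoint-rule approximation of the integral of f(t) e^{-2 pi i n t} over [0, 1)."""
    h = 1.0 / m
    return h * sum(f((j + 0.5) * h) * cmath.exp(-2j * math.pi * n * (j + 0.5) * h)
                   for j in range(m))

def h_func(t):
    """h(t) = (t(1 - t))^2 on [0, 1); its periodic extension is twice continuously differentiable."""
    return (t * (1.0 - t)) ** 2

rows = []
for n in (1, 2, 4, 8, 16):
    c = fourier_coefficient(h_func, n)
    predicted = -3.0 / (2.0 * math.pi ** 4 * n ** 4)   # from three integrations by parts
    rows.append((n, abs(c), n ** 2 * abs(c), abs(c - predicted)))

for n, size, scaled, gap in rows:
    print(f"n = {n:2d}  |h^(n)| = {size:.3e}  n^2 |h^(n)| = {scaled:.3e}  gap = {gap:.1e}")
```

By contrast, the coefficients of $t \mapsto t$ decay only like $1/|n|$, which is consistent with (iv) because the periodic extension of that function is discontinuous.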

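Remark: We may explain why there is no multiplicative identity for convolution in $L^1(\mathbb{T})$. Suppose there is $f \in L^1(\mathbb{T})$ with $f * g = g$ for every $g \in L^1(\mathbb{T})$. Consider $g = e_0 + \sum_{n\neq 0} e_n/n^2 \in C(\mathbb{T}) \subset L^1(\mathbb{T})$, and note that $\hat{g}(0) = 1$ and $\hat{g}(n) = 1/n^2$ for $n \neq 0$, so that $\hat{g}(n) \neq 0$ for every $n \in \mathbb{Z}$. If $g = f * g$, then $\hat{g}(n) = \widehat{f * g}(n) = \hat{f}(n)\hat{g}(n)$ and hence we must have $\hat{f}(n) = 1$ for every $n \in \mathbb{Z}$, which contradicts the Riemann-Lebesgue lemma.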

For later use, we note down the following consequences of [103](iii).

[104] Let $g, h \in L^1(\mathbb{T})$ and $b < c$ be in $\mathbb{T}$. Then,
(i) $\lim_{N\to\infty} \int_b^c g(t)\sin(2N + 1)\pi t\,dt = 0$.
(ii) If the function $t \mapsto t^{-1}h(t)$ is bounded almost everywhere in a neighborhood of 0, then $\lim_{N\to\infty} \int_b^c t^{-1}h(t)\sin(2N + 1)\pi t\,dt = 0$.

Proof. (i) Note that $\sin(2N + 1)\pi t = \sin 2\pi(N + \frac{1}{2})t$, and apply [103](iii) to $1_{(b,c)}g \in L^1(\mathbb{T})$.
(ii) Let $g(t) = t^{-1}h(t)$. By hypothesis, there are $\delta \in (0, 1/2)$ and M > 0 such that $|g(t)| < M$ for almost every $t \in (-\delta, \delta)$. Hence $\int_{\mathbb{T}} |g(t)|\,dt \le \int_{|t|<\delta} M\,dt + \int_{|t|>\delta} \delta^{-1}|h(t)|\,dt \le 2\delta M + \delta^{-1}\|h\|_1 < \infty$ and thus $g \in L^1(\mathbb{T})$. Apply part (i) to g. □

Question: When does the Fourier series converge pointwise?

Definition: The Nth Dirichlet kernel¹ $D_N \in C(\mathbb{T})$ is defined as $D_N = \sum_{n=-N}^{N} e_{-n} = \sum_{n=-N}^{N} e_n$. Note that $D_N(t) = 1 + 2\sum_{n=1}^{N} \cos 2\pi nt$, and therefore $D_N$ is a real valued even function. The Nth partial sum $s_N(f)$ of the Fourier series of $f \in L^1(\mathbb{T})$ is defined as $s_N(f) = \sum_{n=-N}^{N} \hat{f}(n)e_n$.

Exercise-6: ($s_N$ in terms of $D_N$) For $f \in L^1(\mathbb{T})$, we have:
(i) $s_N(f) = D_N * f$, and consequently $s_N(f)$ is real valued whenever f is real valued.
(ii) If $\mathbb{T}$ is parametrized as $[-1/2, 1/2)$, then $s_N(f, a) = \int_0^{1/2} D_N(t)[f(a + t) + f(a - t)]\,dt$ for every $a \in \mathbb{T}$.
[Hint: (i) $s_N(f) = f * (\sum_{n=-N}^{N} e_n) = f * D_N = D_N * f$ by Exercise-5(v). (ii) $s_N(f, a) = D_N * f(a) = (\int_{-1/2}^{0} + \int_0^{1/2})D_N(t)f(a - t)\,dt$. Now the substitution $t = -y$ converts the first integral into $\int_0^{1/2} D_N(y)f(a + y)\,dy$ since $D_N$ is even.]

¹See the end of Section 8 for a picture of the graph of $D_N$.

Remark: $s_N : L^1(\mathbb{T}) \to L^1(\mathbb{T})$ is linear, but is not a positive operator (since $D_N \not\geq 0$). For any $N \in \mathbb{N}$, we can construct $f \in L^1(\mathbb{T})$ with $f \ge 0$ and $s_N(f) \not\geq 0$ as follows. Choose $b \in \mathbb{T}$ with $D_N(b) < 0$, and then choose ε > 0 and δ > 0 by the continuity of $D_N$ such that $D_N(t) < -\varepsilon$ for every $t \in [b, b + 2\delta]$. Note that $s_N(f, a) = \int_0^1 D_N(t)f(a - t)\,dt = \int_0^1 D_N(y)f(a + y)\,dy$ since $D_N$ is even. If we take $f = 1_{[b+\delta,\,b+2\delta]}$, then $s_N(f, a) = \int_{b+\delta-a}^{b+2\delta-a} D_N(y)\,dy < -\varepsilon\delta < 0$ for every $a \in (0, \delta)$.

Below we show that even though $\{D_N\}$ fails to satisfy property (A2) in the definition of an approximate identity, $\{D_N\}$ satisfies (A1) and a property similar to (A3).

[105] (i) $D_N(t) = \dfrac{\sin(2N + 1)\pi t}{\sin \pi t}$ for $t \in \mathbb{T} \setminus \mathbb{Z}$ and $D_N(t) = 2N + 1$ for $t \in \mathbb{Z}$.
(ii) (Normalization property (A1) holds) $\int_0^1 D_N(t)\,dt = 1$.
(iii) ($L^1$-boundedness (A2) fails) $\|D_N\|_1 \to \infty$ as $N \to \infty$.
(iv) (A property similar to the $L^1$-concentration (A3) holds) Let $0 < \delta < 1/2$. Then we have $\lim_{N\to\infty} \int_{-1/2}^{-\delta} D_N(t)\,dt = 0 = \lim_{N\to\infty} \int_{\delta}^{1/2} D_N(t)\,dt$; more generally $\lim_{N\to\infty} \int_{-1/2}^{-\delta} D_N(t)h(t)\,dt = 0 = \lim_{N\to\infty} \int_{\delta}^{1/2} D_N(t)h(t)\,dt$ for every $h \in L^1(\mathbb{T})$.

Proof. (i) $2i\sin \pi t\,D_N(t) = (e_{1/2}(t) - e_{-1/2}(t))D_N(t) = e_{N+1/2}(t) - e_{-(N+1/2)}(t) = 2i\sin(2N + 1)\pi t$.

(ii) $\int_0^1 D_N(t)\,dt = \sum_{n=-N}^{N} \int_0^1 e_n = 1$ since $\int_0^1 e_0 = 1$ and $\int_0^1 e_n = 0$ for $n \neq 0$.

(iii) Since $\sin x < x$ for $x > 0$, we have $\|D_N\|_1 = \int_0^1 |D_N(t)|\,dt \ge \int_0^1 |\sin(2N + 1)\pi t|(\pi t)^{-1}\,dt$. Put $y = (2N + 1)\pi t$ and note $dy/y = dt/t$. We see
$$\|D_N\|_1 \ge \int_0^{(2N+1)\pi} |\sin y|(\pi y)^{-1}\,dy \ge \sum_{n=1}^{2N+1} \int_{(n-1)\pi}^{n\pi} |\sin y|(n\pi^2)^{-1}\,dy = \sum_{n=1}^{2N+1} 2(n\pi^2)^{-1} \to \infty$$
as $N \to \infty$. We used: $\int_{(n-1)\pi}^{n\pi} |\sin y|\,dy = 2$.

(iv) Since $\int_{-1/2}^{-\delta} D_N(t)h(t)\,dt = \int_{\delta}^{1/2} D_N(t)h(-t)\,dt$ because $D_N$ is even, it suffices to show $\lim_{N\to\infty} \int_{\delta}^{1/2} D_N(t)h(t)\,dt = 0$. Note by (i) that $\int_{\delta}^{1/2} D_N(t)h(t)\,dt = \int_{\mathbb{T}} g(t)\sin(2N + 1)\pi t\,dt$, where the function g defined as $g(t) = 1_{(\delta,1/2)}(t)h(t)/\sin \pi t$ belongs to $L^1(\mathbb{T})$ since $\sin \pi t > \sin \pi\delta$ for $t \in (\delta, 1/2)$. Therefore, $\int_{\delta}^{1/2} D_N(t)h(t)\,dt = \int_{\mathbb{T}} g(t)\sin(2N + 1)\pi t\,dt \to 0$ as $N \to \infty$ by [104](i). □
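Parts (i) to (iii) of [105] lend themselves to a quick numerical check. The following sketch compares the cosine sum with the closed form of $D_N$, confirms that the integral over [0, 1] equals 1, and prints the $L^1$ norms for a few values of N to show their slow growth. The function names, sample points, and midpoint-rule resolution are illustrative assumptions.

```python
import math

def dirichlet_sum(N, t):
    """D_N(t) = 1 + 2 * sum_{n=1}^{N} cos(2 pi n t)."""
    return 1.0 + 2.0 * sum(math.cos(2.0 * math.pi * n * t) for n in range(1, N + 1))

def dirichlet_closed(N, t):
    """Closed form sin((2N+1) pi t) / sin(pi t) from [105](i); valid when t is not an integer."""
    return math.sin((2 * N + 1) * math.pi * t) / math.sin(math.pi * t)

def midpoint_integral(func, m=20000):
    """Midpoint rule on [0, 1]; the midpoints (j + 1/2)/m are never integers."""
    h = 1.0 / m
    return h * sum(func((j + 0.5) * h) for j in range(m))

# (i): the two expressions for D_5 agree at a few non-integer points.
closed_form_gap = max(abs(dirichlet_sum(5, t) - dirichlet_closed(5, t))
                      for t in (0.013, 0.25, 0.5, 0.731, 0.999))

# (ii) and (iii): the integral stays equal to 1 while the L^1 norm grows with N.
orders = [1, 4, 16, 64]
integrals = [midpoint_integral(lambda t, N=N: dirichlet_closed(N, t)) for N in orders]
l1_norms = [midpoint_integral(lambda t, N=N: abs(dirichlet_closed(N, t))) for N in orders]
for N, total, size in zip(orders, integrals, l1_norms):
    print(f"N = {N:2d}  integral = {total:.6f}  L1 norm = {size:.4f}")
```

For N = 1 the norm can be computed by hand as $1/3 + 2\sqrt{3}/\pi$, which gives a sanity check on the quadrature.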

To be alert about the possible failure of pointwise convergence of the Fourier series even in the case of continuous functions, we establish a negative result in the beginning itself. To use in the proof, keep in mind the Uniform boundedness theorem: if (Tα)α∈J is a pointwise bounded family of bounded linear operators from a Banach space to a normed space, then sup{∥Tα∥ : α ∈ J} < ∞.

Remark: The family $\{D_N\}$ as $N \to \infty$ cannot be an approximate identity on $L^1(\mathbb{T})$ because of the failure of $L^1$-boundedness proved in [105](iii) above. This is the essential reason behind the failure of pointwise convergence for a general Fourier series.

[106] (Failure of pointwise convergence of Fourier series even for continuous functions) Parametrize

T as [−1/2, 1/2). There is f ∈ C(T) with sup{|sN (f, 0)| : N ∈ N} = ∞. So the sequence (sN (f, 0)) of Fourier partial sums of f at 0 does not converge to f(0).

Proof. Define linear functionals $\phi_N : (C(\mathbb{T}), \|\cdot\|_\infty) \to \mathbb{C}$ as $\phi_N(f) = s_N(f, 0)$. They are bounded since $|\phi_N(f)| \le \sum_{n=-N}^{N} |\hat{f}(n)| \le (2N + 1)\|f\|_\infty$. We need to show $(\phi_N)$ is not pointwise bounded. By the Uniform boundedness theorem, it suffices to show $\sup\{\|\phi_N\| : N \in \mathbb{N}\} = \infty$. For this purpose, we will show $\|\phi_N\| \ge \|D_N\|_1$ for every $N \in \mathbb{N}$. This suffices since $\|D_N\|_1 \to \infty$ by [105](iii).

Fix $N \in \mathbb{N}$ and let $h \in L^1(\mathbb{T})$ be such that $hD_N = |D_N|$, i.e., we take $h(t) = 1$ if $D_N(t) \ge 0$ and $h(t) = -1$ if $D_N(t) < 0$. We may find a sequence $(f_k)$ in $C(\mathbb{T})$ such that $\|f_k\|_\infty \le 1$ and $(f_k) \to h$ pointwise (check). Note that $\phi_N(f_k) = s_N(f_k, 0) = \int_{\mathbb{T}} f_k(t)D_N(0 - t)\,dt = \int_{\mathbb{T}} f_k(t)D_N(t)\,dt$ since $D_N(-t) = D_N(t)$. Since the functions $f_kD_N$ are dominated by $|D_N| \in L^1(\mathbb{T})$, we get by the Lebesgue dominated convergence theorem that
$$\lim_{k\to\infty} \phi_N(f_k) = \lim_{k\to\infty} \int_{\mathbb{T}} f_k(t)D_N(t)\,dt = \int_{\mathbb{T}} h(t)D_N(t)\,dt = \int_{\mathbb{T}} |D_N(t)|\,dt = \|D_N\|_1.$$
Since $\|f_k\|_\infty \le 1$, we conclude $\|\phi_N\| \ge \|D_N\|_1$, and we are done. □

Remark: However, $\|s_N(f)\|_\infty$ cannot grow very fast: it is known that $\lim_{N\to\infty} \dfrac{\|s_N(f)\|_\infty}{\log N} = 0$ for $f \in C(\mathbb{T})$, see Proposition 1.6.6 of M.A. Pinsky, Introduction to Fourier Analysis and Wavelets.

Remark: We sketch a little history. Kolmogorov gave an example of $f \in L^1(\mathbb{T})$ whose Fourier series diverges almost everywhere. In contrast, Carleson showed that the Fourier series of any $f \in L^2(\mathbb{T})$ converges pointwise to f for almost every $t \in \mathbb{T}$. Hunt extended this result to every $f \in L^p(\mathbb{T})$ for every $1 < p < \infty$. These results are beyond our scope, and we will not prove them. However, we will prove pointwise convergence of Fourier series under some smoothness assumption (we need to prevent f from oscillating too much). And also a little later, we will prove the Fejér-Lebesgue theorem, which says that the averages of partial sums of the Fourier series of any $f \in L^1(\mathbb{T})$ converge to f pointwise almost everywhere (i.e., outside a Lebesgue null set).

4. Sufficient conditions for pointwise convergence of Fourier series

Philosophy: If f ∈ L1(T) satisfies some smoothness condition that prevents f from oscillating too much, then we may expect (sN (f)) to converge to f pointwise.

We start with some basic observations.

Exercise-7: Let $f \in L^1(\mathbb{T})$, $w \in \mathbb{C}$ and $a \in \mathbb{T} = [-1/2, 1/2)$.
(i) For any $\delta \in (0, 1/2)$ we have
$$s_N(f, a) - w = \int_{\mathbb{T}} D_N(t)(f(a - t) - w)\,dt = \left(\int_{|t|<\delta} + \int_{|t|>\delta}\right)D_N(t)(f(a - t) - w)\,dt.$$
(ii) $\lim_{N\to\infty} s_N(f, a) = w$ ⇔ there exists $\delta \in (0, 1/2)$ such that $\lim_{N\to\infty} \int_{|t|<\delta} D_N(t)(f(a - t) - w)\,dt = 0$.
(iii) (Riemann localization principle) If $f \equiv 0$ in a neighborhood of a, then $\lim_{N\to\infty} s_N(f, a) = 0$.
[Hint: (i) $w = \int_{\mathbb{T}} D_N(t)w\,dt$ since $\int_{\mathbb{T}} D_N(t)\,dt = 1$. Also, $s_N(f) = D_N * f$. (ii) As $N \to \infty$, the integral $\int_{|t|>\delta}$ in (i) goes to 0 by [105](iv) since the function $t \mapsto f(a - t) - w$ belongs to $L^1(\mathbb{T})$.]

Improving Exercise-7(ii), we get:

[107] (Pointwise convergence criterion - 1) Let $f \in L^1(\mathbb{T})$, $w \in \mathbb{C}$ and $a \in \mathbb{T} = [-1/2, 1/2)$. Then,
(i) $\lim_{N\to\infty} s_N(f, a) = w$ ⇔ there exists $\delta \in (0, 1/2)$ with $\lim_{N\to\infty} \int_{|t|<\delta} t^{-1}(f(a - t) - w)\sin(2N + 1)\pi t\,dt = 0$.
(ii) (Dini's test) If $t \mapsto t^{-1}(f(a - t) - w)$ belongs to $L^1(\mathbb{T})$, then $\lim_{N\to\infty} s_N(f, a) = w$.
(iii) If $t \mapsto t^{-1}(f(a - t) - w)$ is bounded a.e. in a neighborhood of 0, then $\lim_{N\to\infty} s_N(f, a) = w$.

Proof. (i) By Exercise-7(ii), $\lim_{N\to\infty} s_N(f, a) = w$ ⇔ there exists $\delta \in (0, 1/2)$ such that $\lim_{N\to\infty} \int_{|t|<\delta} (f(a - t) - w)D_N(t)\,dt = 0$. Note that the integrand can be written as
$$(f(a - t) - w)D_N(t) = \left(\frac{1}{\sin \pi t} - \frac{1}{\pi t} + \frac{1}{\pi t}\right)(f(a - t) - w)\sin(2N + 1)\pi t.$$
Using l'Hopital rule, check $\lim_{t\to 0}\left(\frac{1}{\sin t} - \frac{1}{t}\right) = 0$. Hence $g : \mathbb{T} \to \mathbb{C}$ defined as $g(0) = 0$ and $g(t) = \frac{1}{\sin \pi t} - \frac{1}{\pi t}$ for $t \neq 0$ is bounded on $\mathbb{T} = [-1/2, 1/2)$ and continuous on $(-1/2, 1/2)$; in particular $g \in L^\infty(\mathbb{T})$. Therefore, $t \mapsto g(t)(f(a - t) - w)$ belongs to $L^1(\mathbb{T})$, and so by [104](i), we have $\lim_{N\to\infty} \int_{|t|<\delta} g(t)(f(a - t) - w)\sin(2N + 1)\pi t\,dt = 0$ for any $\delta \in (0, 1/2)$. The required result follows.

(ii) Use part (i), and apply [104](i) to $g(t) := t^{-1}(f(a - t) - w)$.

(iii) Use part (i), and apply [104](ii) to $h(t) := f(a - t) - w$. □

Definition: Let $f \in L^1(\mathbb{T})$ (assumed to be extended to the whole of $\mathbb{R}$ with period 1), and $a \in \mathbb{T}$. We say² f is Lipschitz at a if there exist $\lambda \ge 1$ and a neighborhood $U \subset \mathbb{T}$ of a such that $|f(a) - f(b)| \le \lambda|a - b|$ for every $b \in U$. Note that if f is differentiable at a, then f is Lipschitz at a (indeed, define $g(a) = f'(a)$ and $g(t) = (f(t) - f(a))/(t - a)$ for $t \neq a$; then $f(t) - f(a) = g(t)(t - a)$, and g is continuous at a so that $|g(t)| \le \lambda := |g(a)| + 1$ in a neighborhood of a). We say f is Hölder continuous at a if there exist $\lambda \ge 1$, $\alpha > 0$, and a neighborhood $U \subset \mathbb{T}$ of a such that $|f(a) - f(b)| \le \lambda|a - b|^\alpha$ for every $b \in U$.

[108] Let $f \in L^1(\mathbb{T})$ and $a \in \mathbb{T}$. Then each of the following implies $\lim_{N\to\infty} s_N(f, a) = f(a)$:
(i) f is differentiable at a.
(ii) f is Lipschitz at a.
(iii) f is Hölder continuous at a.

²These definitions are to be understood modulo a null set since $f \in L^1(\mathbb{T})$.

Proof. Let $g(t) = t^{-1}(f(a - t) - f(a))$. For (i) and (ii), note that g is bounded a.e. in a neighborhood of 0, and apply [107](iii). For (iii), it suffices by [107](ii) to show that $g \in L^1(\mathbb{T})$. By hypothesis, there exist $\lambda \ge 1$, $\alpha > 0$ and $\delta \in (0, 1/2)$ such that $|f(a - t) - f(a)| \le \lambda|t|^\alpha$ whenever $|t| < \delta$. Then
$$\int_{\mathbb{T}} |g(t)|\,dt \le \int_{|t|<\delta} \lambda|t|^{\alpha-1}\,dt + \int_{|t|>\delta} \delta^{-1}|f(a - t) - f(a)|\,dt \le 2\lambda\alpha^{-1}\delta^\alpha + \delta^{-1}(\|f\|_1 + |f(a)|) < \infty. \qquad\square$$

Definition: Let f ∈ L1(T). We say f is piecewise continuous if for every open interval (a, b) ⊂ T, the map f is continuous on (a, b) except possibly for finitely many jump discontinuities, and if the limits f(a+), f(b−) exist. We say f is piecewise C1 if both f and f ′ are piecewise continuous. Next we will prove Dirichlet’s theorem about convergence of Fourier series for piecewise C1 functions.

Exercise-8: Let $f \in L^1(\mathbb{T})$, $a \in \mathbb{T}$, and $w \in \mathbb{C}$.
(i) For any $\delta \in (0, 1/2)$ we have
$$s_N(f, a) - w = \int_0^{1/2} D_N(t)[f(a + t) + f(a - t) - 2w]\,dt = \left(\int_{0<t<\delta} + \int_{t>\delta}\right)D_N(t)[f(a + t) + f(a - t) - 2w]\,dt.$$
(ii) $\lim_{N\to\infty} s_N(f, a) = w$ ⇔ there exists $\delta \in (0, 1/2)$ with $\lim_{N\to\infty} \int_0^{\delta} D_N(t)[f(a + t) + f(a - t) - 2w]\,dt = 0$.
[Hint: (i) $s_N(f, a) = \int_0^{1/2} D_N(t)[f(a + t) + f(a - t)]\,dt$ by Exercise-6. Also, $w = 2\int_0^{1/2} D_N(t)w\,dt$ since $\int_{\mathbb{T}} D_N(t)\,dt = 1$ and $D_N$ is even. (ii) As $N \to \infty$, the integral $\int_{t>\delta}$ in (i) goes to 0 by [105](iv) since the function $t \mapsto f(a + t) + f(a - t) - 2w$ belongs to $L^1(\mathbb{T})$.]

We use a splitting of the integrand different from the one used in [107] for the next:

Exercise-9: (Pointwise convergence criterion - 2) Let $f \in L^1(\mathbb{T})$, $a \in \mathbb{T}$, and assume the limits $f(a+)$, $f(a-)$ exist. If
$$g_1(t) := \frac{f(a+t) - f(a+)}{\sin \pi t} \quad\text{and}\quad g_2(t) := \frac{f(a-t) - f(a-)}{\sin \pi t}$$
are bounded almost everywhere in $(0, \delta)$ for some $\delta \in (0, 1/2)$, then $\lim_{N\to\infty} s_N(f, a) = [f(a+) + f(a-)]/2$. [Hint: Let $g = 1_{(0,\delta)}(g_1 + g_2)$ and $w = [f(a+) + f(a-)]/2$. Then $g \in L^1(\mathbb{T})$ and $\int_0^{\delta} D_N(t)[f(a+t) + f(a-t) - 2w]\,dt = \int_0^{\delta} g(t)\sin(2N+1)\pi t\,dt \to 0$ as $N \to \infty$ by [104](i). Now apply Exercise-8(ii).]

[109] (Dirichlet's theorem) If $f \in L^1(\mathbb{T})$ is a piecewise $C^1$ function, then $\lim_{N\to\infty} s_N(f, a) = [f(a+) + f(a-)]/2$ for every $a \in \mathbb{T}$. (Note: $[f(a+) + f(a-)]/2 = f(a)$ if $f$ is continuous at $a$.)

Proof. Let $g_1(t) = \dfrac{f(a+t) - f(a+)}{\sin \pi t}$ and $g_2(t) = \dfrac{f(a-t) - f(a-)}{\sin \pi t}$. By l'Hôpital's rule, we get
$$\lim_{t\to 0+} g_1(t) = \lim_{t\to 0+} \frac{f'(a+t)}{\pi\cos \pi t} = \pi^{-1} f'(a+) \quad\text{and}\quad \lim_{t\to 0+} g_2(t) = \lim_{t\to 0+} \frac{-f'(a-t)}{\pi\cos \pi t} = -\pi^{-1} f'(a-).$$
Therefore $g_1, g_2$ are bounded a.e. in $(0, \delta)$ for some $\delta \in (0, 1/2)$. Now the result follows by Exercise-9.
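As a numerical illustration of [109], the following sketch evaluates the partial sums $s_N(f, a)$ for the particular choice $f = 1_{[0,1/2)}$ on $\mathbb{T} = [0, 1)$, which is an example introduced here and not taken from the notes. It assumes the closed-form coefficients $\hat{f}(0) = 1/2$ and $\hat{f}(n) = (1 - (-1)^n)/(2\pi i n)$ for $n \ne 0$; the function name `partial_sum` is hypothetical.

```python
import math

def partial_sum(N, a):
    """s_N(f, a) for f = indicator of [0, 1/2) on T = [0, 1).

    Assumes f^(0) = 1/2 and f^(n) = (1 - (-1)^n) / (2*pi*i*n) for n != 0.
    The terms for n and -n are paired so that the sum is real.
    """
    total = 0.5
    for n in range(1, N + 1, 2):  # even n contribute nothing
        # f^(n) e_n(a) + f^(-n) e_{-n}(a) = (2 / (pi n)) * sin(2 pi n a)
        total += 2.0 / (math.pi * n) * math.sin(2.0 * math.pi * n * a)
    return total

if __name__ == "__main__":
    for a in (0.0, 0.25, 0.5):
        print(a, partial_sum(2001, a))
```

At the jumps $a = 0$ and $a = 1/2$ the values should stay near the midpoint $1/2$, while at the continuity point $a = 1/4$ they should approach $f(1/4) = 1$.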

It is also possible to extend (with some effort) the arguments from Exercise-8 along the lines of [107] to prove Jordan's theorem on the convergence of Fourier series for functions of bounded variation. However, we will give a different proof of Jordan's theorem a little later.

5. Cesàro summability and Abel summability

In spite of the negative result [106], it will be shown a little later that the Fourier series of $f \in L^1(\mathbb{T})$ converges with respect to certain other notions of summability. In this section, which is pure Analysis without any Fourier theory, we briefly discuss two such notions: Cesàro summability and Abel summability.

Motivating observations: (i) The series $1 - 1 + 1 - 1 + 1 - 1 + \cdots$ is not convergent in the usual sense since the sequence of partial sums is $(1, 0, 1, 0, 1, 0, \ldots)$. However, the averages of the partial sums form the sequence $(1, 1/2, 2/3, 1/2, 3/5, 1/2, \ldots)$, which converges to $1/2$. (ii) Saying that $\sum_{n=0}^{\infty} a_n$ is convergent is the same as saying that the power series $f(z) = \sum_{n=0}^{\infty} a_n z^n$ converges at $z = 1$. If $f$ has radius of convergence $\ge 1$, then even if $\sum_{n=0}^{\infty} a_n z^n$ does not converge at $z = 1$, the limit $\lim_{r\to 1^-} \sum_{n=0}^{\infty} a_n r^n$ may exist.

Definition: Let $a_n \in \mathbb{C}$ for $n \ge 0$, $s_N = \sum_{n=0}^{N} a_n$, and $s \in \mathbb{C}$. (i) We say $\sum_{n=0}^{\infty} a_n$ is Cesàro summable to $s$ if $\lim_{N\to\infty} \sigma_N = s$, where $\sigma_N = (N+1)^{-1}\sum_{n=0}^{N} s_n$. (ii) We say $\sum_{n=0}^{\infty} a_n$ is Abel summable to $s$ if $\sum_{n=0}^{\infty} a_n r^n$ converges for every $r \in (0, 1)$ and $\lim_{r\to 1^-} \sum_{n=0}^{\infty} a_n r^n = s$.

Exercise-10: Let $a_n \in \mathbb{C}$ for $n \ge 0$, $s_N = \sum_{n=0}^{N} a_n$, and $\sigma_N = (N+1)^{-1}\sum_{n=0}^{N} s_n$. Then,
(i) $\sigma_N = (N+1)^{-1}\sum_{n=0}^{N}\left(\sum_{m=0}^{n} a_m\right) = (N+1)^{-1}\sum_{n=0}^{N}(N+1-n)a_n = \sum_{n=0}^{N}\left(1 - \frac{n}{N+1}\right)a_n$.
(ii) $\sum_{n=0}^{\infty} a_n r^n = (1-r)\sum_{n=0}^{\infty} s_n r^n = (1-r)^2\sum_{n=0}^{\infty}(n+1)\sigma_n r^n$ for $0 < r < 1$.
Further, for $p = 1, 2$ and $n \ge 0$, let $a_{p,n} \in \mathbb{C}$, and define $a_{3,n} = \sum_{i+j=n} a_{1,i}a_{2,j}$ for $n \ge 0$ (this is called the Cauchy product, and can be thought of as a discrete version of convolution).
(iii) If the series $\sum_{n=0}^{\infty} a_{1,n}$ and $\sum_{n=0}^{\infty} a_{2,n}$ are convergent to $a, b \in \mathbb{C}$ respectively, then their Cauchy product $\sum_{n=0}^{\infty} a_{3,n}$ defined above may not converge (see 3.49 of Rudin, Principles of Mathematical Analysis for this), but $\sum_{n=0}^{\infty} a_{3,n}$ is Cesàro summable to $ab$.
(iv) Even if the series $\sum_{n=0}^{\infty} a_{1,n}$ and $\sum_{n=0}^{\infty} a_{2,n}$ are Cesàro summable to $a, b \in \mathbb{C}$ respectively, the series $\sum_{n=0}^{\infty} a_{3,n}$ may not be Cesàro summable. Show that Abel summability does not imply Cesàro summability by considering $\sum_{n=0}^{\infty} a_{1,n} = \sum_{n=0}^{\infty} a_{2,n} = 1 - 1 + 1 - 1 + 1 - 1 + \cdots$, and noting that $\sum_{n=0}^{\infty} a_{3,n} = 1 - 2 + 3 - 4 + 5 - 6 + \cdots$ is Abel summable to $1/4$ but is not Cesàro summable.
[Hint: (ii) $s_n - s_{n-1} = a_n$ and $(n+1)\sigma_n - n\sigma_{n-1} = s_n$. (iii) Let $s_{p,N} = \sum_{n=0}^{N} a_{p,n}$ for $p = 1, 2, 3$. Check that $\sum_{n=0}^{N} s_{3,n} = \sum_{i+j\le N}(N+1-i-j)a_{1,i}a_{2,j} = \sum_{i+j=N} s_{1,i}s_{2,j}$. Also the hypothesis says $(s_{1,n}) \to a$ and $(s_{2,n}) \to b$. Hence
$$\Big|ab - \frac{1}{N+1}\sum_{n=0}^{N} s_{3,n}\Big| = \Big|\frac{1}{N+1}\sum_{i+j=N}(ab - s_{1,i}s_{2,j})\Big| \le \frac{1}{N+1}\sum_{\substack{i+j=N \\ \min\{i,j\}\le N_0}} |ab - s_{1,i}s_{2,j}| + \frac{1}{N+1}\sum_{\substack{i+j=N \\ \min\{i,j\}> N_0}} |ab - s_{1,i}s_{2,j}|,$$
where the last two sums can be made arbitrarily small if $N_0$ is large and $N > N_0$ is very large compared to $N_0$. (iv) For $1 - 2 + 3 - 4 + \cdots$, the averages of the partial sums form the sequence $(1, 0, 2/3, 0, 3/5, 0, 4/7, 0, \ldots)$, which does not converge; but their averages converge to $1/4$.]
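The two motivating series can be checked numerically. The sketch below is illustrative only: the helper names `cesaro_means` and `abel_sum` are hypothetical, and the Abel sum is approximated by truncating the power series, which is an assumption that is reasonable only for $r$ not too close to 1.

```python
def cesaro_means(terms):
    """Return the Cesaro means sigma_0, ..., sigma_N of the series with the given terms."""
    means, partial, running = [], 0.0, 0.0
    for n, a in enumerate(terms):
        partial += a            # s_n = a_0 + ... + a_n
        running += partial      # s_0 + ... + s_n
        means.append(running / (n + 1))
    return means

def abel_sum(coeff, r, n_terms=20000):
    """Truncated power series sum_{n < n_terms} coeff(n) * r**n, for 0 < r < 1."""
    return sum(coeff(n) * r ** n for n in range(n_terms))

grandi = [(-1) ** n for n in range(1000)]                 # 1 - 1 + 1 - 1 + ...
alternating = [(-1) ** n * (n + 1) for n in range(1000)]  # 1 - 2 + 3 - 4 + ...

if __name__ == "__main__":
    print(cesaro_means(grandi)[:6])       # starts 1, 1/2, 2/3, 1/2, 3/5, 1/2
    print(cesaro_means(alternating)[:7])  # starts 1, 0, 2/3, 0, 3/5, 0, 4/7
    print(abel_sum(lambda n: (-1) ** n * (n + 1), 0.999))  # close to 1/4
```

The Cesàro means of the first series settle near $1/2$, those of the second keep oscillating, and the truncated Abel sum of the second is close to $1/(1+r)^2$, which tends to $1/4$ as $r \to 1^-$.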

[110] (i) If a complex series $\sum_{n=0}^{\infty} a_n$ converges to $s \in \mathbb{C}$, then $\sum_{n=0}^{\infty} a_n$ is Cesàro summable to $s$.
(ii) If $\sum_{n=0}^{\infty} a_n$ is Cesàro summable to $s \in \mathbb{C}$, then $\sum_{n=0}^{\infty} a_n$ is Abel summable to $s$.

Proof. Let $s_N = \sum_{n=0}^{N} a_n$, and $\sigma_N = (N+1)^{-1}\sum_{n=0}^{N} s_n$.

(i) Let $\varepsilon > 0$ be given. Since $(s_n - s) \to 0$ by hypothesis, there is $M > 0$ with $|s_n - s| \le M$ for every $n \ge 0$. Choose $k \in \mathbb{N}$ such that $|s_n - s| < \varepsilon$ for every $n \ge k$. Then choose $m > k$ such that $(k+1)M/(N+1) < \varepsilon$ for every $N \ge m$. Then for every $N > m$, we have
$$|\sigma_N - s| = \Big|\frac{1}{N+1}\sum_{n=0}^{N}(s_n - s)\Big| \le \Big|\frac{1}{N+1}\sum_{n=0}^{k}(s_n - s)\Big| + \Big|\frac{1}{N+1}\sum_{n=k+1}^{N}(s_n - s)\Big| \le \frac{(k+1)M}{N+1} + \frac{N-k}{N+1}\,\varepsilon < \varepsilon + \varepsilon.$$

(ii) Given $\lim \sigma_N = s$. By Exercise-10(ii), we have $\sum_{n=0}^{\infty} a_n r^n = (1-r)^2\sum_{n=0}^{\infty}(n+1)\sigma_n r^n$ for $0 < r < 1$. Since $(1-r)^{-2} = \sum_{n=0}^{\infty}(n+1)r^n$, we may write $s = (1-r)^2\sum_{n=0}^{\infty}(n+1)s\,r^n$. Then for any $N \in \mathbb{N}$, we have
$$\Big|\sum_{n=0}^{\infty} a_n r^n - s\Big| \le \Big(\sum_{n=0}^{N} + \sum_{n=N+1}^{\infty}\Big)(n+1)(1-r)^2 r^n|\sigma_n - s| = J_1(N, r) + J_2(N, r), \text{ say.}$$
Given $\varepsilon > 0$, choose $N \in \mathbb{N}$ large enough so that $|\sigma_n - s| < \varepsilon$ for $n \ge N$. Then $J_2(N, r) \le \varepsilon$ since $\sum_{n=N+1}^{\infty}(n+1)r^n \le \sum_{n=0}^{\infty}(n+1)r^n = (1-r)^{-2}$. If $M := \max\{(n+1)|\sigma_n - s| : n \le N\}$, then $J_1(N, r) \le M\sum_{n=0}^{N}(1-r)^2 \to 0$ as $r \to 1^-$.

Remark: Tauber showed that if $\sum_{n=0}^{\infty} a_n$ is Abel summable to $s \in \mathbb{C}$ and $\lim_{n\to\infty} na_n = 0$, then $\sum_{n=0}^{\infty} a_n = s$. This generated many other results of the same type. A result in which a weaker notion of summability, together with some additional condition, yields the convergence of the original series is now called a Tauberian theorem$^3$. In [111] below, we present the Hardy-Littlewood Tauberian theorem (which improves Tauber's result), with a simplified proof due to Karamata and H. Wielandt.

Exercise-11: Let $f = 1_{(1/2,1)} : (0, 1) \to \mathbb{R}$ and $\varepsilon > 0$. Then there exist real polynomials $p_1$ and $p_2$ such that
(i) $p_1 \le f \le p_2$,
(ii) $p_1(0) = 0 = p_2(0)$ and $p_1(1) = 1 = p_2(1)$ (this implies $t(1-t)$ is a factor of $p_2 - p_1$), and
(iii) the polynomial $q(t) = \dfrac{p_2(t) - p_1(t)}{t(1-t)}$ satisfies $\int_0^1 q(t)\,dt < \varepsilon$.
[Hint: Let $F : (0, 1) \to \mathbb{R}$ be $F(t) = \dfrac{f(t) - t}{t(1-t)}$, which is bounded and continuous except for a jump discontinuity at $1/2$. First approximate $F$ by continuous functions, and then apply the Weierstrass approximation theorem to find polynomials $h_1, h_2$ such that $h_1 \le F \le h_2$ and $\int_0^1 (h_2 - h_1) < \varepsilon$. Note that $f(t) = t + t(1-t)F(t)$. Let $p_j(t) = t + t(1-t)h_j(t)$ for $j = 1, 2$. Then $p_1(0) = 0 = p_2(0)$, $p_1(1) = 1 = p_2(1)$, $p_1 \le f \le p_2$, and $\int_0^1 \dfrac{p_2(t) - p_1(t)}{t(1-t)}\,dt = \int_0^1 (h_2 - h_1) < \varepsilon$.]

$^3$ The terminology Tauberian theorem has a more general meaning, which we may see later.

[111] (i) (Hardy's Tauberian theorem) If a complex series $\sum_{n=0}^{\infty} a_n$ is Cesàro summable to $s \in \mathbb{C}$ and $\sup_{n\in\mathbb{N}} |na_n| < \infty$, then $\sum_{n=0}^{\infty} a_n = s$.
(ii) (Hardy-Littlewood Tauberian theorem) If a complex series $\sum_{n=0}^{\infty} a_n$ is Abel summable to $s \in \mathbb{C}$ and $\sup_{n\in\mathbb{N}} |na_n| < \infty$, then $\sum_{n=0}^{\infty} a_n = s$.

Proof. As (i) follows from (ii) and [110], it suffices to prove (ii). After a translation of the function $\sum_{n=0}^{\infty} a_n z^n$ (that is, replacing $a_0$ by $a_0 - s$), we may assume $s = 0$. Let $C > 0$ be such that $\sup_n |na_n| \le C$.

Step-1: Let $\mathcal{F} = \{f : (0, 1) \to \mathbb{R} : \lim_{r\to 1^-}\sum_{n=0}^{\infty} a_n f(r^n) = 0\}$, which is a real vector space. The hypothesis says $\lim_{r\to 1^-}\sum_{n=0}^{\infty} a_n r^n = 0$, and a substitution $r = t^k$ gives $\lim_{t\to 1^-}\sum_{n=0}^{\infty} a_n t^{kn} = 0$. Thus all the maps $t \mapsto t^k$ for $k \in \mathbb{N}$ belong to $\mathcal{F}$, and hence $\mathcal{F}$ contains all polynomials $p$ with $p(0) = 0$, i.e., all $p$ without the constant term. Let $f = 1_{(1/2,1)} : (0, 1) \to \mathbb{R}$. Note that for $1/2 < r < 1$, the sum $\sum_{n=0}^{\infty} a_n f(r^n) = \sum_{n:\, r^n > 1/2} a_n$ is a partial sum of $\sum_{n=0}^{\infty} a_n$, and the number of terms in it tends to infinity as $r \to 1^-$. Therefore what we need to show is that $f \in \mathcal{F}$, i.e., that $\lim_{r\to 1^-}\sum_{n=0}^{\infty} a_n f(r^n) = 0$. We will achieve this by approximating $f$ with polynomials.

Step-2: Given $\varepsilon > 0$, choose polynomials $p_1$ and $p_2$ for $f$ as specified by Exercise-11 such that the polynomial $q(t) = \dfrac{p_2(t) - p_1(t)}{t(1-t)} = \sum_{k=0}^{m} b_k t^k$ (say) satisfies $\sum_{k=0}^{m} \dfrac{b_k}{k+1} = \int_0^1 q(t)\,dt < \varepsilon/C$. For our upcoming estimate, observe that $\dfrac{1 - r^n}{1 - r} \le n$, or $\dfrac{1 - r^n}{n} \le 1 - r$, for $0 < r < 1$, and hence
$$\frac{p_2(r^n) - p_1(r^n)}{n} = \frac{r^n(1 - r^n)q(r^n)}{n} \le r^n(1-r)q(r^n) = (1-r)\sum_{k=0}^{m} b_k r^{(k+1)n}.$$

Step-3: Since $|a_n| \le \dfrac{C}{n}$ and $f - p_1 \le p_2 - p_1$, the last estimate from Step-2 gives
$$\Big|\sum_{n=0}^{\infty} a_n\big(f(r^n) - p_1(r^n)\big)\Big| \le C\sum_{n=0}^{\infty}\frac{p_2(r^n) - p_1(r^n)}{n} \le C\sum_{k=0}^{m}\sum_{n=0}^{\infty}(1-r)b_k r^{n+kn} = C\sum_{k=0}^{m} b_k\,\frac{1-r}{1 - r^{k+1}},$$
which tends to $C\sum_{k=0}^{m}\dfrac{b_k}{k+1} < \varepsilon$ as $r \to 1^-$. Also, $\lim_{r\to 1^-}\sum_{n=0}^{\infty} a_n p_1(r^n) = 0$ since $p_1 \in \mathcal{F}$. Thus we conclude $\limsup_{r\to 1^-}\big|\sum_{n=0}^{\infty} a_n f(r^n)\big| \le \varepsilon$. By considering $\sum_{n=0}^{\infty} a_n\big(p_2(r^n) - f(r^n)\big)$, we can show similarly that $\liminf_{r\to 1^-}\sum_{n=0}^{\infty} a_n f(r^n) \ge -\varepsilon$. Since $\varepsilon > 0$ is arbitrary, $\lim_{r\to 1^-}\sum_{n=0}^{\infty} a_n f(r^n) = 0$.

6. Weak type boundedness for maximal functions

We will touch upon maximal functions and their relation to pointwise convergence. This will be used in the next section to prove Fejér's result about the Cesàro summability of Fourier series. Suppose we have a sequence of operators $(T_n)$ defined on $L^1(X, \mu)$ for some measure space $(X, \mu)$. Their maximal function $T^*$ is defined as the supremum of the $T_n$'s in an appropriate sense. We are interested in finding conditions that will ensure that $T^*$ has some sort of boundedness behavior.

Definition: Let $\hat{\mathbb{C}} = \mathbb{C} \cup \{\infty\}$. For a measurable space $X$, let $\mathcal{M}(X, \hat{\mathbb{C}}) = \{g : X \to \hat{\mathbb{C}} : g \text{ is measurable}\}$, and similarly define $\mathcal{M}(X, \mathbb{R})$, $\mathcal{M}(X, [0, \infty])$, etc.

Definition: Let $(X, \mu)$, $(Y, \nu)$ be measure spaces. A map $T : L^1(X, \mu) \to \mathcal{M}(Y, \hat{\mathbb{C}})$ (not necessarily linear) is weak (1, 1) if there is $C > 0$ such that for every $\alpha > 0$ and $f \in L^1(X, \mu)$ we have $\nu(\{y \in Y : |Tf(y)| > \alpha\}) \le C\|f\|_1/\alpha$. Note that if $T$ is weak (1, 1), then $\nu(\{y \in Y : |Tf(y)| > n\}) \to 0$ as $n \to \infty$, and hence $Tf$ is finite almost everywhere, i.e., $\nu(\{y \in Y : Tf(y) = \infty\}) = 0$. The inclusion operator $I : L^1(X, \mu) \to \mathcal{M}(X, \mathbb{C})$ is weak (1, 1) with constant $C = 1$: if we fix $f \in L^1(X, \mu)$ and put $A_\alpha = \{x \in X : |f(x)| > \alpha\}$, then $\mu(A_\alpha) = \int_{A_\alpha} 1\,d\mu \le \int_{A_\alpha} |f/\alpha|\,d\mu \le \|f\|_1/\alpha$ for $\alpha > 0$.

Remark: (i) More generally, the condition $\nu(\{y \in Y : |Tf(y)| > \alpha\})^{1/q} \le C\|f\|_p/\alpha$ defines weak $(p, q)$ maps, but we do not need this more general concept. (ii) If the supremum of a sequence of linear maps is weak (1, 1), then there is a useful conclusion, which is stated below.

[112] Let $(X, \mu)$ be a measure space, and $T_n : L^1(X, \mu) \to \mathcal{M}(X, \hat{\mathbb{C}})$ be linear. If the maximal function $T^* : L^1(X, \mu) \to \mathcal{M}(X, [0, \infty])$ of $\{T_n : n \in \mathbb{N}\}$ defined as $T^*f(x) = \sup\{|T_n f(x)| : n \in \mathbb{N}\}$ is weak (1, 1), then the set $\mathcal{F} := \{f \in L^1(X, \mu) : (T_n f) \to f \text{ pointwise } \mu\text{-a.e.}\}$ is closed in $L^1(X, \mu)$.

Proof. Let $(f_k)$ be a sequence in $\mathcal{F}$ converging to $f \in L^1(X, \mu)$ in the $\|\cdot\|_1$-norm. Note that $\{x \in X : \limsup_n |T_n f(x) - f(x)| > 0\} = \bigcup_{m=1}^{\infty} A_m$, where $A_m := \{x \in X : \limsup_n |T_n f(x) - f(x)| > 2/m\}$. Hence it suffices to show $\mu(A_m) = 0$ for each $m \in \mathbb{N}$. We have
$$\limsup_{n\to\infty} |T_n f(x) - f(x)| \le \limsup_{n\to\infty} |T_n(f - f_k)(x)| + \limsup_{n\to\infty} |T_n f_k(x) - f(x)| \le T^*(f - f_k)(x) + |(f - f_k)(x)|$$
for a.e. $x \in X$ since $T_n$ is linear and $(T_n f_k) \to f_k$ pointwise a.e. Therefore
$$\mu(A_m) \le \mu(\{x \in X : T^*(f - f_k)(x) > 1/m\}) + \mu(\{x \in X : |(f - f_k)(x)| > 1/m\}) \le Cm\|f - f_k\|_1 + m\|f - f_k\|_1,$$
where $C > 0$ is given by the weak (1, 1) property of $T^*$, and in the last term we used the fact that the inclusion operator is weak (1, 1) with constant 1. As the above inequality is true for every $f_k$ and since $\|f - f_k\|_1 \to 0$, we conclude that $\mu(A_m) = 0$.

We will soon see that a particular maximal function that we are going to consider is lower semicontinuous, which motivates Exercise-12, a small diversion from our main theme.

Exercise-12: Let $X$ be a metric space (or, more generally, a topological space). A function $f : X \to \mathbb{R}$ (or $f : X \to [-\infty, \infty]$) is upper semicontinuous if $\{x \in X : f(x) < \alpha\}$ is open in $X$ for every $\alpha \in \mathbb{R}$, and is lower semicontinuous if $\{x \in X : f(x) > \alpha\}$ is open in $X$ for every $\alpha \in \mathbb{R}$.

For example, $1_A$ is upper semicontinuous if $A \subset X$ is closed, and $1_U$ is lower semicontinuous if $U \subset X$ is open. Since a union of open sets is open and an intersection of closed sets is closed, the infimum of a family of upper semicontinuous functions is upper semicontinuous, and the supremum of a family of lower semicontinuous functions is lower semicontinuous. By the same reasoning, the pointwise limit of a decreasing sequence of upper semicontinuous functions is upper semicontinuous, and the pointwise limit of an increasing sequence of lower semicontinuous functions is lower semicontinuous. Now, consider a function $f : X \to \mathbb{R}$, where $X$ is a metric space.
(i) $f$ is upper semicontinuous $\Leftrightarrow$ for each $x \in X$ and $\varepsilon > 0$, there is a neighborhood $U \subset X$ of $x$ such that $f(y) < f(x) + \varepsilon$ for every $y \in U$ $\Leftrightarrow$ $\limsup_{k\to\infty} f(x_k) \le f(x)$ whenever $(x_k) \to x$ in $X$.
(ii) If $f$ is upper semicontinuous with $X$ compact, then $f$ is bounded above and attains its maximum.

(iii) If a sequence (fn) of upper semicontinuous functions from X to R converges uniformly to f, then f is upper semicontinuous. (iv) If f is upper semicontinuous and sup f(X) < ∞, then for each x ∈ X we have that f(x) = inf{g(x): g ∈ C(X, R) and f ≤ g}. (v) Formulate and prove the corresponding statements for lower semicontinuity. For example, a lower semicontinuous function on a compact space is bounded below and attains its minimum.

[Hint: (ii) Let $U_n = \{x \in X : f(x) < n\}$. Then extracting a finite subcover of the open cover $\{U_n : n \in \mathbb{N}\}$ of $X$, we see $f$ is bounded above. Let $M = \sup f(X)$, and let $(x_k)$ be a sequence in $X$ with $\lim_k f(x_k) = M$. By compactness, we may assume $(x_k) \to x \in X$, and then $f(x) \ge \limsup f(x_k) = M$ by upper semicontinuity. (iii) Let $x \in X$ and $\varepsilon > 0$. Choose $n$ large with $\|f - f_n\|_\infty < \varepsilon/3$ and then choose a neighborhood $U$ of $x$ with $f_n(y) < f_n(x) + \varepsilon/3$ for every $y \in U$. Then $f(y) < f_n(y) + \varepsilon/3 < f_n(x) + 2\varepsilon/3 < f(x) + \varepsilon$ for every $y \in U$. (iv) After a translation, assume $f \le -1$. Fix $b \in X$, and let $\varepsilon \in (0, 1)$. Since $A := \{x \in X : f(x) \ge f(b) + \varepsilon\}$ is closed and $b \notin A$, there is a continuous $h : X \to [0, 1]$ with $h(b) = 1$ and $h(A) = \{0\}$. Consider $g \in C(X, \mathbb{R})$ defined as $g(x) = (f(b) + \varepsilon)h(x)$, which satisfies $g(b) = f(b) + \varepsilon$. If $x \in A$, then $f(x) \le -1 < 0 = g(x)$; and if $x \in X \setminus A$, then $f(x) < f(b) + \varepsilon \le g(x)$ since $f(b) + \varepsilon < 0$ and $0 \le h \le 1$. Thus $f \le g$.]

We will introduce the Hardy-Littlewood maximal function on $\mathbb{R}^n$ instead of on $\mathbb{R}$ as there is no extra cost for this in the proofs, but we will use it only in one dimension.

Definition: A measurable function $f : \mathbb{R}^n \to \mathbb{C}$ is said to be locally integrable if $f1_K \in L^1(\mathbb{R}^n)$, i.e., if $\int_K |f|\,d\mu < \infty$, for every compact set $K \subset \mathbb{R}^n$. Let $L^1_{loc}(\mathbb{R}^n)$ be the collection of all locally integrable functions on $\mathbb{R}^n$. Clearly $L^\infty(\mathbb{R}^n) \subset L^1_{loc}(\mathbb{R}^n)$ and hence $L^1_{loc}(\mathbb{R}^n) \ne L^1(\mathbb{R}^n)$. We may also see $L^p(\mathbb{R}^n) \subset L^1_{loc}(\mathbb{R}^n)$ for $1 \le p < \infty$ as follows. Assume $1 < p < \infty$ (the case $p = 1$ is trivial) and $\frac{1}{p} + \frac{1}{q} = 1$. Then for any compact $K \subset \mathbb{R}^n$, we have $1_K \in L^q(\mathbb{R}^n)$ and hence by Hölder's inequality we obtain $\int_K |f|\,d\mu = \int_{\mathbb{R}^n} |f|1_K\,d\mu \le \|f\|_p\|1_K\|_q < \infty$.

Remark: If $\nu$ is a locally finite Borel measure on $\mathbb{R}^n$ absolutely continuous with respect to the Lebesgue measure $\mu$, then by the Radon-Nikodym theorem there is a measurable function $f : \mathbb{R}^n \to [0, \infty)$ with $\nu(A) = \int_A f\,d\mu$ for every Borel set $A \subset \mathbb{R}^n$. Evidently, $f \in L^1_{loc}(\mathbb{R}^n)$; and conversely any nonnegative $f \in L^1_{loc}(\mathbb{R}^n)$ defines a locally finite Borel measure on $\mathbb{R}^n$ that is absolutely continuous with respect to $\mu$.

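Definition: The Hardy-Littlewood maximal function $M : L^1_{loc}(\mathbb{R}^n) \to \mathcal{M}(\mathbb{R}^n, [0, \infty])$ is defined as $Mf(a) = \sup_{r>0} \dfrac{1}{\mu(B(a, r))}\int_{B(a,r)} |f|\,d\mu$ for $f \in L^1_{loc}(\mathbb{R}^n)$ and $a \in \mathbb{R}^n$, where $\mu$ is the Lebesgue measure on $\mathbb{R}^n$, and the measurability of $Mf$ is ensured by Exercise-13 below. Moreover, for $f \in L^1(\mathbb{R}^n)$, $Mf$ is finite almost everywhere, i.e., $\mu(\{a \in \mathbb{R}^n : Mf(a) = \infty\}) = 0$, as a consequence of [113] below. When $n = 1$, we have $Mf(a) = \sup_{r>0} \dfrac{1}{2r}\int_{a-r}^{a+r} |f(t)|\,dt$ for $f \in L^1_{loc}(\mathbb{R})$ and $a \in \mathbb{R}$. Also note that $Mf(a) = \sup_{0<r<1/2} \dfrac{1}{2r}\int_{a-r}^{a+r} |f(t)|\,dt$ for $f \in L^1(\mathbb{T})$ and $a \in \mathbb{T}$.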

Remark: (i) If $f \equiv c \ne 0$, then $Mf \equiv |c| \notin L^1(\mathbb{R}^n)$. (ii) If $f : \mathbb{R} \to \mathbb{C}$ is $f = 1_{(0,\infty)}$, then $Mf(a) = 1$ for $a > 0$ and $Mf(a) = 1/2$ for $a \le 0$; here, $Mf$ is not continuous. (iii) $M$ is sublinear: $M(f + g) \le Mf + Mg$ (since $|f + g| \le |f| + |g|$) and $M(cf) = |c|Mf$. Check that $M$ is not linear.

Exercise-13: Fix $f \in L^1_{loc}(\mathbb{R}^n)$ and let $Mf$ be as defined above. Verify that the set $\{x \in \mathbb{R}^n : Mf(x) > \alpha\}$ is open for each $\alpha > 0$. Thus the function $Mf$ on $\mathbb{R}^n$ is lower semicontinuous and hence Borel measurable. Also, $Mf(x) \le \liminf_{k\to\infty} Mf(x_k)$ whenever $(x_k) \to x$ in $\mathbb{R}^n$.

The following technical fact from Euclidean Measure Theory is needed in the next proof.

Exercise-14: Let $B_1, \ldots, B_k \subset \mathbb{R}^n$ be finitely many balls, and $\mu$ be the Lebesgue measure on $\mathbb{R}^n$. Then there is a pairwise disjoint subcollection $\{B_j : j \in F\}$ for some $F \subset \{1, \ldots, k\}$ such that $\mu\big(\bigcup_{j=1}^{k} B_j\big) \le 3^n\mu\big(\bigcup_{j\in F} B_j\big) = 3^n\sum_{j\in F}\mu(B_j)$. [Hint: Assume $\mu(B_1) \ge \cdots \ge \mu(B_k)$ so that $r(1) \ge \cdots \ge r(k)$ for the radii. Let $j_1 = 1$. Having chosen $j_1, \ldots, j_s$, let $j_{s+1}$ be the smallest $j > j_s$ such that $B_j$ is disjoint from $\bigcup_{i=1}^{s} B_{j_i}$. Put $F = \{j_1, j_2, \ldots\}$. If $m \in \{1, \ldots, k\} \setminus F$, then $B_m$ intersects $B_{j_i}$ for some $j_i < m$. Then $r(m) \le r(j_i)$ and $B_m \subset 3^*B_{j_i}$, where $3^*B_{j_i}$ is the ball concentric to $B_{j_i}$ with radius $3r(j_i)$. Hence $\mu\big(\bigcup_{j=1}^{k} B_j\big) \le \mu\big(\bigcup_{j\in F} 3^*B_j\big) \le 3^n\sum_{j\in F}\mu(B_j) = 3^n\mu\big(\bigcup_{j\in F} B_j\big)$.]
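The greedy selection in the hint of Exercise-14 can be sketched in one dimension, where balls are open intervals. This is an illustration under assumptions made here, not part of the notes: intervals are given as hypothetical `(center, radius)` pairs, and the helper names `select_disjoint` and `union_length` are invented for the example.

```python
def select_disjoint(intervals):
    """Greedy selection from the hint of Exercise-14, for open intervals in R.

    `intervals` is a list of (center, radius) pairs. Intervals are visited in
    order of decreasing radius, and one is kept if it is disjoint from all
    intervals kept so far.
    """
    chosen = []
    for c, r in sorted(intervals, key=lambda cr: -cr[1]):
        # two open intervals are disjoint iff their centers are at least r + r2 apart
        if all(abs(c - c2) >= r + r2 for c2, r2 in chosen):
            chosen.append((c, r))
    return chosen

def union_length(intervals):
    """Lebesgue measure of the union of the open intervals (c - r, c + r)."""
    total, right = 0.0, float("-inf")
    for lo, hi in sorted((c - r, c + r) for c, r in intervals):
        if hi > right:
            total += hi - max(lo, right)
            right = hi
    return total

if __name__ == "__main__":
    balls = [(0.0, 1.0), (0.5, 0.4), (1.8, 0.9), (5.0, 0.2), (5.1, 0.3)]
    kept = select_disjoint(balls)
    print(kept, union_length(balls), 3 * sum(2 * r for _, r in kept))
```

With $n = 1$ the inequality of Exercise-14 reads: the length of the union is at most $3$ times the total length of the selected intervals.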

[113] The Hardy-Littlewood maximal function $M : L^1(\mathbb{R}^n) \to \mathcal{M}(\mathbb{R}^n, [0, \infty])$ is weak (1, 1) with constant $C = 3^n$. Similarly, $M : L^1(\mathbb{T}) \to \mathcal{M}(\mathbb{T}, [0, \infty])$ is weak (1, 1).

Proof. Fix $f \in L^1(\mathbb{R}^n)$ and let $A_\alpha = \{x \in \mathbb{R}^n : Mf(x) > \alpha\}$ for $\alpha > 0$. We need to show $\mu(A_\alpha) \le 3^n\|f\|_1/\alpha$. Since $\mu(A_\alpha) = \sup\{\mu(K) : K \subset A_\alpha \text{ compact}\}$, it suffices to show $\mu(K) \le 3^n\|f\|_1/\alpha$ for an arbitrary compact set $K \subset A_\alpha$. By the definition of $Mf$, for each $a \in K \subset A_\alpha$ there is a ball $B$ centered at $a$ with $\dfrac{1}{\mu(B)}\int_B |f|\,d\mu > \alpha$, or equivalently $\mu(B) < \dfrac{1}{\alpha}\int_B |f|\,d\mu$. As $K$ is compact, we may cover $K$ with finitely many such balls $B_1, \ldots, B_k$. By Exercise-14, choose a pairwise disjoint subcollection $\{B_j : j \in F\}$ for some $F \subset \{1, \ldots, k\}$ with $\mu\big(\bigcup_{j=1}^{k} B_j\big) \le 3^n\sum_{j\in F}\mu(B_j)$. Then,
$$\mu(K) \le \mu\Big(\bigcup_{j=1}^{k} B_j\Big) \le 3^n\sum_{j\in F}\mu(B_j) \le \frac{3^n}{\alpha}\sum_{j\in F}\int_{B_j} |f|\,d\mu \le \frac{3^n}{\alpha}\int_{\mathbb{R}^n} |f|\,d\mu = \frac{3^n\|f\|_1}{\alpha},$$
where the last inequality is by the disjointness of the collection $\{B_j : j \in F\}$.

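Definition: Parametrize $\mathbb{T} = [-1/2, 1/2)$. For $r \in (0, 1/2)$, define the Lebesgue kernel $L_r := \dfrac{1}{2r}1_{[-r,r]} \in L^1(\mathbb{T})$, and note $\|L_r\|_1 = 1$. See that $L_r * f(a) = \dfrac{1}{2r}\int_{\mathbb{T}} f(t)1_{[-r,r]}(a - t)\,dt = \dfrac{1}{2r}\int_{a-r}^{a+r} f(t)\,dt$ is a local average of $f$ at $a \in \mathbb{T}$ for $f \in L^1(\mathbb{T})$. The Lebesgue maximal function $L^* : L^1(\mathbb{T}) \to \mathcal{M}(\mathbb{T}, [0, \infty])$ is defined as $L^*f(t) = \sup_{0<r<1/2} |L_r * f(t)| = \sup_{0<r<1/2} \dfrac{1}{2r}\Big|\int_{t-r}^{t+r} f(s)\,ds\Big|$. Comparing with the definition of $M$ on $\mathbb{T}$, we have $L^*|f| = Mf$.

Exercise-15: Let $f \in L^1(\mathbb{T})$ and $t \in \mathbb{T}$. Then,
(i) $|L_r * f| \le L_r * |f| \le L^*|f| = 1 \cdot L^*|f| = \|L_r\|_1 L^*|f| = \|L_r\|_1 Mf$.
(ii) If $K = \sum_{j=1}^{k} c_j L_{r_j}$ is a convex combination, then $|K * f| \le K * |f| \le \|K\|_1 L^*|f| = \|K\|_1 Mf$.
(iii) If $K \in L^1(\mathbb{T})$ is a nonnegative even function decreasing on $[0, 1/2)$, then $K/\|K\|_1$ can be approximated by convex combinations of $L_r$'s and hence $|K * f| \le K * |f| \le \|K\|_1 L^*|f| = \|K\|_1 Mf$.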

7. Fourier series: pointwise convergence of Cesàro and Abel sums

Philosophy: We know $\{D_N\}$ is not an approximate identity. But the averages of the $D_N$'s will form an approximate identity, and this will lead to the Cesàro summability of the Fourier series.

Definition: Recall the Dirichlet kernel $D_N = \sum_{n=-N}^{N} e_n$, and define the $N$th Fejér kernel
$$F_N := (N+1)^{-1}\sum_{n=0}^{N} D_n = (N+1)^{-1}\sum_{n=0}^{N}\sum_{k=-n}^{n} e_k = \sum_{n=-N}^{N}\Big(1 - \frac{|n|}{N+1}\Big)e_n.$$
For $f \in L^1(\mathbb{T})$, recall the Fourier partial sum $s_N(f) = \sum_{n=-N}^{N} \hat{f}(n)e_n = D_N * f$, and define the $N$th Fejér mean
$$\sigma_N(f) := (N+1)^{-1}\sum_{n=0}^{N} s_n(f) = F_N * f = \sum_{n=-N}^{N}\Big(1 - \frac{|n|}{N+1}\Big)e_n * f = \sum_{n=-N}^{N}\Big(1 - \frac{|n|}{N+1}\Big)\hat{f}(n)e_n,$$
where we used the fact $e_n * f = f * e_n = \hat{f}(n)e_n$ from Exercise-5(v).

Remark: $\sigma_N(f)$ is real valued when $f$ is; and $\sigma_N(f) \ge 0$ when $f \ge 0$ since $F_N \ge 0$.

[114] (i) $F_N(t) = \dfrac{\sin^2 (N+1)\pi t}{(N+1)\sin^2 \pi t}$, and in particular $F_N \ge 0$ for every $N \ge 0$.
(ii) If we parametrize $\mathbb{T} = [-1/2, 1/2)$, then $F_N$ is an even function for every $N \ge 0$.
(iii) $\|F_N\|_1 = \int_{\mathbb{T}} F_N(t)\,dt = 1$ for every $N \ge 0$.
(iv) For $0 < \delta < t < 1/2$, we have $F_N(t) \le \dfrac{1}{(N+1)\sin^2 \pi\delta} \to 0$ as $N \to \infty$ uniformly in $t \in (\delta, 1/2)$.
(v) $\{F_N : N \ge 0\}$ is a nonnegative approximate identity for $L^1(\mathbb{T})$ satisfying also the $L^\infty$-concentration condition (A4) from page 5.
(vi) $\sigma_N(f, a) = \int_0^{1/2} F_N(t)[f(a+t) + f(a-t)]\,dt$ for every $f \in L^1(\mathbb{T})$ and $a \in \mathbb{T}$.

Proof. (i) Recall $D_N(t) = \dfrac{\sin(2N+1)\pi t}{\sin \pi t}$ by [105], and $2\sin A\sin B = \cos(A - B) - \cos(A + B)$. Then, we see
$$(N+1)F_N(t)\cdot 2\sin^2 \pi t = \sum_{n=0}^{N} 2D_n(t)\sin^2 \pi t = \sum_{n=0}^{N} 2\sin(2n+1)\pi t\,\sin \pi t = \sum_{n=0}^{N}[\cos 2n\pi t - \cos(2n+2)\pi t] = \cos 0 - \cos(2N+2)\pi t = 2\sin^2 (N+1)\pi t.$$

(ii) This follows from (i).

(iii) $F_N = \sum_{n=-N}^{N}\big(1 - \frac{|n|}{N+1}\big)e_n$; and $\int_{\mathbb{T}} e_n(t)\,dt = 0$ for $n \ne 0$, and $= 1$ for $n = 0$. Statement (iv) also follows from (i), and (v) is a summary of what is proved above.

(vi) Recall $s_N(f, a) = \int_0^{1/2} D_N(t)[f(a+t) + f(a-t)]\,dt$ by Exercise-6. Now use the facts that $\sigma_N(f) = (N+1)^{-1}\sum_{n=0}^{N} s_n(f)$ and $F_N = (N+1)^{-1}\sum_{n=0}^{N} D_n$.
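The identity in [114](i) and the normalization in [114](iii) can be spot-checked numerically. The sketch below is only a sanity check under the convention $e_n(t) = e^{2\pi i n t}$; the function names are hypothetical.

```python
import math

def fejer_from_sum(N, t):
    """F_N(t) = sum over |n| <= N of (1 - |n|/(N+1)) e_n(t), written with cosines."""
    total = 1.0
    for n in range(1, N + 1):
        # the terms for n and -n combine to 2 (1 - n/(N+1)) cos(2 pi n t)
        total += 2.0 * (1.0 - n / (N + 1)) * math.cos(2.0 * math.pi * n * t)
    return total

def fejer_closed_form(N, t):
    """sin^2((N+1) pi t) / ((N+1) sin^2(pi t)), valid when t is not an integer."""
    return math.sin((N + 1) * math.pi * t) ** 2 / ((N + 1) * math.sin(math.pi * t) ** 2)

if __name__ == "__main__":
    for N in (0, 3, 10):
        for t in (0.05, 0.2, 0.37, -0.45):
            print(N, t, fejer_from_sum(N, t), fejer_closed_form(N, t))
```

Since $F_N$ is a trigonometric polynomial of degree $N$, an equally spaced Riemann sum with more than $N$ nodes over one period reproduces $\int_{\mathbb{T}} F_N = 1$ up to rounding.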

Since {FN : N ≥ 0} is an approximate identity satisfying (A4), we may deduce the following:

[115] (Fejér's theorem) (i) If $f \in C(\mathbb{T})$, then $\|\sigma_N(f) - f\|_\infty \to 0$ as $N \to \infty$.
(ii) Let $1 \le p < \infty$. If $f \in L^p(\mathbb{T})$, then $\|\sigma_N(f) - f\|_p \to 0$ as $N \to \infty$.
(iii) Let $f \in L^1(\mathbb{T})$ (assumed to be extended to $\mathbb{R}$ with period 1). If the limits $f(a+)$ and $f(a-)$ exist at $a \in \mathbb{T}$, then $\lim_{N\to\infty} \sigma_N(f, a) = [f(a+) + f(a-)]/2$. In particular, if $f$ is continuous at $a \in \mathbb{T}$, then $\lim_{N\to\infty} \sigma_N(f, a) = f(a)$.

Proof. (i) and (ii): We know $\sigma_N(f) = F_N * f$. Now use [114](v) and [102].

(iii) Let $g : \mathbb{T} \to \mathbb{C}$ be $g(t) = [f(a+t) + f(a-t)]/2$ for the given $a \in \mathbb{T}$. Then $g \in L^1(\mathbb{T})$ with $\|g\|_1 \le \|f\|_1$. By [114](vi), $\sigma_N(f, a) = 2\int_0^{1/2} F_N(t)g(t)\,dt = \int_{\mathbb{T}} F_N(t)g(t)\,dt$ since $F_N$, $g$ are even, and hence $\lim_{N\to\infty} \sigma_N(f, a) = \lim_{t\to 0} g(t) = [f(a+) + f(a-)]/2$ by [114](v) and [102](iii).

Seminar topic: Let $f_0, f_1, f_2 \in C(\mathbb{T})$ be $f_0 \equiv 1$, $f_1(t) = \sin 2\pi t$ and $f_2(t) = \cos 2\pi t$. Korovkin's (second) theorem states that if $T_n : C(\mathbb{T}) \to C(\mathbb{T})$ are positive linear maps for $n \in \mathbb{N}$ with $\lim_{n\to\infty} \|T_n f_j - f_j\|_\infty = 0$ for $j = 0, 1, 2$, then $\lim_{n\to\infty} \|T_n f - f\|_\infty = 0$ for every $f \in C(\mathbb{T})$. Present a proof of Korovkin's theorem (see for example, M. Uchiyama, Proof of Korovkin's theorems via inequalities, Amer. Math. Monthly, 110, (2003)). Apply Korovkin's theorem to the sequence $(\sigma_N)$ of positive linear maps of $C(\mathbb{T})$ to give another proof of [115](i).

Definition: A trigonometric polynomial is a finite linear combination of $e_n$'s with complex scalars. For example, $D_N = \sum_{n=-N}^{N} e_n$ is a trigonometric polynomial. If $g = \sum c_n e_n$ is a trigonometric polynomial (where the sum is finite), then for any $f \in L^1(\mathbb{T})$, $g * f = f * g = \sum c_n(f * e_n) = \sum c_n\hat{f}(n)e_n$, which is again a trigonometric polynomial. In particular, $\sigma_N(f) = F_N * f$ is a trigonometric polynomial for every $f \in L^1(\mathbb{T})$; in fact, $\sigma_N(f) = \sum_{n=-N}^{N}\big(1 - \frac{|n|}{N+1}\big)\hat{f}(n)e_n$.

Exercise-16: (i) $\{g \in C(\mathbb{T}) : g \text{ is a trigonometric polynomial}\}$ is dense in both $(C(\mathbb{T}), \|\cdot\|_\infty)$ and $L^p(\mathbb{T})$ for $1 \le p < \infty$.
(ii) (Uniqueness of Fourier coefficients) If $f, g \in L^1(\mathbb{T})$ and $\hat{f}(n) = \hat{g}(n)$ for every $n \in \mathbb{Z}$, then $f = g$.
(iii) (Fourier inversion) If $f \in L^1(\mathbb{T})$ is such that $(\hat{f}(n)) \in l^1(\mathbb{Z})$, i.e., if $\sum_{n=-\infty}^{\infty} |\hat{f}(n)| < \infty$, then $g := \sum_{n=-\infty}^{\infty} \hat{f}(n)e_n \in C(\mathbb{T})$ and $f = g$ almost everywhere.
(iv) $\mathcal{F} : L^1(\mathbb{T}) \to c_0(\mathbb{Z})$ given by $f \mapsto (\hat{f}(n))_{n\in\mathbb{Z}}$ is linear and injective, but is not surjective.
[Hint: (i) Use [115](i) and [115](ii) respectively after noting that $\sigma_N(f) = \sum_{n=-N}^{N}\big(1 - \frac{|n|}{N+1}\big)\hat{f}(n)e_n$ is a trigonometric polynomial. (ii) By considering $f - g$, we may assume $g = 0$. Use [115](ii) and the above expression for $\sigma_N(f)$. (iii) Use (ii) after noting $\hat{g}(n) = \hat{f}(n)$ for every $n$. (iv) Linearity is clear, and $\mathcal{F}$ is injective by (ii). If $\mathcal{F}$ is also surjective, then $\mathcal{F}^{-1}$ should be a bounded linear operator by the Inverse mapping theorem since the spaces $L^1(\mathbb{T})$ and $c_0(\mathbb{Z})$ are Banach. But $\hat{D}_N(n) = 1$ if $|n| \le N$, and $= 0$ if $|n| > N$, so that $\|(\hat{D}_N(n))\|_\infty = 1$; and we know $\|D_N\|_1 \to \infty$ by [105].]

Exercise-17: (Wiener's density theorem for $\mathbb{T}$) Let $f \in L^1(\mathbb{T})$. Then $\{f * g : g \in L^1(\mathbb{T})\}$ is dense in $L^1(\mathbb{T})$ $\Leftrightarrow$ $\hat{f}(n) \ne 0$ for every $n \in \mathbb{Z}$. [Hint: If $\hat{f}(n_0) = 0$, then $\widehat{f * g}(n_0) = \hat{f}(n_0)\hat{g}(n_0) = 0$ for every $g \in L^1(\mathbb{T})$, and consequently $\{f * g : g \in L^1(\mathbb{T})\}$ cannot be dense in $L^1(\mathbb{T})$. Conversely, suppose $\hat{f}(n) \ne 0$ for every $n \in \mathbb{Z}$. Given $h \in L^1(\mathbb{T})$, define $g_N = \sum_{n=-N}^{N}\big(1 - \frac{|n|}{N+1}\big)\big(\hat{h}(n)/\hat{f}(n)\big)e_n$. Since $f * e_n = \hat{f}(n)e_n$, we see $f * g_N = \sum_{n=-N}^{N}\big(1 - \frac{|n|}{N+1}\big)\hat{h}(n)e_n = \sigma_N(h) \to h$ in $L^1(\mathbb{T})$ by [115](ii).]

Now we prepare ourselves to prove Lebesgue's extension of Fejér's theorem, and also the Dirichlet-Jordan theorem about functions of bounded variation. Even though $\{F_N\}$ is an approximate identity, $F_N$ is not decreasing on $[0, 1/2)$.$^4$ Therefore, in order to make use of Exercise-15(iii), we will define below $K_N \ge F_N$ such that the $K_N$'s are even and decreasing on $[0, 1/2)$.

$^4$ See the end of Section 8 for a picture of the graph of $F_N$.

Exercise-18: (i) (Trigonometric facts) $|\sin (N+1)\pi t| \le (N+1)\sin \pi t$ and $\sin \pi t \ge 2t$ for $0 \le t \le 1/2$.
(ii) For $0 \le |t| \le \frac{1}{2(N+1)}$, we have $F_N(t) \le \dfrac{(N+1)^2\sin^2 \pi t}{(N+1)\sin^2 \pi t} = N + 1 =: K_N(t)$. And for $\frac{1}{2(N+1)} \le |t| < 1/2$, we have $F_N(t) \le \dfrac{1}{(N+1)(2t)^2} = \dfrac{1}{4(N+1)t^2} =: K_N(t)$. The function $K_N : \mathbb{T} \to \mathbb{C}$ defined in this manner satisfies the following: $K_N \ge F_N \ge 0$, $K_N \in C(\mathbb{T})$, $K_N$ is an even function that decreases on $[0, 1/2)$, and
$$\|K_N\|_1 = 2\int_0^{1/(2N+2)} (N+1)\,dt + \frac{2}{4(N+1)}\int_{1/(2N+2)}^{1/2} t^{-2}\,dt = (1 - 0) + \Big(1 - \frac{1}{N+1}\Big) \le 2.$$
(iii) $|F_N * f| \le F_N * |f| \le K_N * |f| \le 2L^*|f| = 2Mf$ by parts (i) and (ii) and by Exercise-15(iii), where $M$ is the Hardy-Littlewood maximal function in one dimension.
(iv) The Fejér maximal function $F^*$ is defined as $F^*f(t) = \sup_N |F_N * f(t)|$ for $f \in L^1(\mathbb{T})$ and $t \in \mathbb{T}$. Then $F^*f \le 2L^*|f| = 2Mf$ by (iii).
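The majorant $K_N$ of Exercise-18(ii) can be compared with $F_N$ on a grid. The following sketch is a numerical sanity check only; the function names are hypothetical, and the $L^1$ norm is approximated by a midpoint rule rather than computed exactly.

```python
import math

def fejer(N, t):
    """Fejer kernel F_N(t) via the closed form of [114](i), with F_N(0) = N + 1."""
    s = math.sin(math.pi * t)
    if abs(s) < 1e-15:
        return float(N + 1)
    return math.sin((N + 1) * math.pi * t) ** 2 / ((N + 1) * s * s)

def majorant(N, t):
    """K_N(t) from Exercise-18(ii), for |t| <= 1/2."""
    t = abs(t)
    cutoff = 1.0 / (2 * (N + 1))
    if t <= cutoff:
        return float(N + 1)
    return 1.0 / (4 * (N + 1) * t * t)

def majorant_norm(N, points=20000):
    """Midpoint-rule approximation of the integral of K_N over [-1/2, 1/2)."""
    h = 1.0 / points
    return h * sum(majorant(N, -0.5 + (k + 0.5) * h) for k in range(points))

if __name__ == "__main__":
    for N in (1, 5, 20):
        print(N, majorant_norm(N), 2.0 - 1.0 / (N + 1))
```

The printed pairs should agree closely, matching the value $2 - \frac{1}{N+1}$ obtained in Exercise-18(ii).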

[116] (i) The Fejér maximal function $F^*$ and the Lebesgue maximal function $L^*$ are weak (1, 1).
(ii) (Lebesgue's theorem about the pointwise convergence of Fejér means) Let $f \in L^1(\mathbb{T})$. Then $(\sigma_N(f)) \to f$ pointwise almost everywhere.
(iii) (Lebesgue differentiation theorem on $\mathbb{T}$) Let $f \in L^1(\mathbb{T})$. Then $\lim_{r\to 0} \dfrac{1}{2r}\int_{a-r}^{a+r} f(t)\,dt = f(a)$ for almost every $a \in \mathbb{T}$; in particular, this holds at every $a \in \mathbb{T}$ at which $f$ is continuous.

Proof. (i) We know that $L^*f \le L^*|f| = Mf$; and $F^*f \le 2L^*|f| = 2Mf$ by Exercise-18(iv). Also $M$ is weak (1, 1) by [113].

(ii) $(\sigma_N(f)) \to f$ pointwise (in fact, uniformly) for every $f \in C(\mathbb{T})$ by [115](i). Also $C(\mathbb{T})$ is dense in $L^1(\mathbb{T})$ by Exercise-1. Apply [112] to $F^*$ after putting $T_N f = F_N * f = \sigma_N(f)$: the closed set given by [112] contains the dense subspace $C(\mathbb{T})$, and hence it is all of $L^1(\mathbb{T})$.

(iii) Let $f \in L^1(\mathbb{T})$ be continuous at $a \in \mathbb{T}$, and $\varepsilon > 0$. Choose $\delta > 0$ such that $|f(a) - f(t)| < \varepsilon$ whenever $|a - t| < \delta$. Then for $0 < r < \delta$ we have
$$|f(a) - L_r * f(a)| = \Big|f(a) - \frac{1}{2r}\int_{a-r}^{a+r} f(t)\,dt\Big| = \Big|\frac{1}{2r}\int_{a-r}^{a+r} (f(a) - f(t))\,dt\Big| \le \frac{1}{2r}\int_{a-r}^{a+r} |f(a) - f(t)|\,dt \le \frac{1}{2r}\,\varepsilon \cdot 2r = \varepsilon.$$
Hence $\lim_{r\to 0} L_r * f(a) = \lim_{r\to 0} \dfrac{1}{2r}\int_{a-r}^{a+r} f(t)\,dt = f(a)$. In particular, this holds at every $a \in \mathbb{T}$ when $f \in C(\mathbb{T})$. Now use the denseness of $C(\mathbb{T})$ in $L^1(\mathbb{T})$ and apply [112] to $L^*$ after putting $T_n f = L_{r_n} * f$ for any decreasing sequence $(r_n)$ in $(0, 1/2)$ converging to 0.

Remark: A point $a \in \mathbb{T}$ with $\lim_{r\to 0} \dfrac{1}{2r}\int_{a-r}^{a+r} f(t)\,dt = f(a)$ is called a Lebesgue point for $f \in L^1(\mathbb{T})$.

Definition: For $0 < r < 1$, define the Poisson kernel$^5$ $P_r = \sum_{n\in\mathbb{Z}} r^{|n|}e_n$, which belongs to $C(\mathbb{T})$ by the uniform convergence of the series. Keeping in mind the expressions in Exercise-7(iii), for $f \in L^1(\mathbb{T})$ and $0 < r < 1$, define the $r$th Abel mean of $f$ as
$$A_r(f) = (1-r)\sum_{N=0}^{\infty} r^N s_N(f) = (1-r)\sum_{N=0}^{\infty} r^N\sum_{n=-N}^{N} \hat{f}(n)e_n = \sum_{n\in\mathbb{Z}} \hat{f}(n)r^{|n|}e_n = P_r * f,$$
where we used the fact $\hat{f}(n)e_n = f * e_n = e_n * f$ from Exercise-5.

$^5$ See the end of Section 8 for a picture of the graph of $P_r$.

[117] (i) (Another expression for $P_r$) $P_r(t) = \dfrac{1 - r^2}{1 + r^2 - 2r\cos 2\pi t}$.
(ii) $\{P_r : 0 < r < 1\}$ as $r \to 1$ is an approximate identity on $L^1(\mathbb{T})$ satisfying $P_r \ge 0$ and also the following $L^\infty$-concentration condition (A4): $\lim_{r\to 1} \sup\{|P_r(t)| : |t| > \delta\} = 0$ for every $\delta \in (0, 1/2)$.

Proof. (i)
$$P_r(t) + 1 = \sum_{n=0}^{\infty} r^n e_{-n}(t) + \sum_{n=0}^{\infty} r^n e_n(t) = \frac{1}{1 - re_{-1}(t)} + \frac{1}{1 - re_1(t)} = \frac{2 - 2r\cos 2\pi t}{1 + r^2 - 2r\cos 2\pi t}$$
since $e_{-1}(t) + e_1(t) = 2\cos 2\pi t$. Hence $P_r(t) = \dfrac{2 - 2r\cos 2\pi t}{1 + r^2 - 2r\cos 2\pi t} - 1 = \dfrac{1 - r^2}{1 + r^2 - 2r\cos 2\pi t}$.

(ii) $\int_{\mathbb{T}} P_r = \int_{\mathbb{T}} \sum_{n\in\mathbb{Z}} r^{|n|}e_n = \sum_{n\in\mathbb{Z}} \int_{\mathbb{T}} r^{|n|}e_n = 1$ since $\int_{\mathbb{T}} e_n = 0$ for $n \ne 0$; here the interchange of summation and integral is justified by the uniform convergence of the series. Thus $\{P_r\}$ satisfies condition (A1) of an approximate identity. By (i), the numerator $1 - r^2$ is positive and the denominator satisfies $1 + r^2 - 2r\cos 2\pi t \ge 1 + r^2 - 2r = (1 - r)^2 > 0$, so $P_r(t) > 0$, and hence the $L^1$-boundedness condition (A2) follows from (A1). For $\delta \in (0, 1/2)$ and $\delta < |t| \le 1/2$, we have $1 + r^2 - 2r\cos 2\pi t = (1 - r)^2 + 2r(1 - \cos 2\pi t) > 2r(1 - \cos 2\pi\delta)$. Hence $\lim_{r\to 1} \sup\{|P_r(t)| : |t| > \delta\} \le \lim_{r\to 1} \dfrac{1 - r^2}{2r(1 - \cos 2\pi\delta)} = 0$, establishing (A4); and this implies property (A3) also.
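The closed form in [117](i) and the concentration property (A4) can be illustrated numerically. In this sketch the series is truncated after `n_terms` terms, which is an assumption that is harmless for the values of $r$ used; the function names are hypothetical.

```python
import math

def poisson_from_series(r, t, n_terms=2000):
    """Truncation of P_r(t) = sum over n in Z of r^|n| e_n(t), written with cosines."""
    total = 1.0
    for n in range(1, n_terms):
        total += 2.0 * r ** n * math.cos(2.0 * math.pi * n * t)
    return total

def poisson_closed_form(r, t):
    """(1 - r^2) / (1 + r^2 - 2 r cos(2 pi t)), as in [117](i)."""
    return (1.0 - r * r) / (1.0 + r * r - 2.0 * r * math.cos(2.0 * math.pi * t))

if __name__ == "__main__":
    for r in (0.3, 0.9):
        for t in (0.0, 0.1, 0.25, -0.4):
            print(r, t, poisson_from_series(r, t), poisson_closed_form(r, t))
    # away from t = 0 the kernel becomes small as r -> 1
    print(poisson_closed_form(0.9, 0.1), poisson_closed_form(0.999, 0.1))
```

The last line shows the value at $t = 0.1$ shrinking as $r$ increases towards 1, in line with (A4).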

Remark: It can also be shown that for each r, the Poisson kernel Pr decreases on [0, 1/2).

[118] (i) If $f \in C(\mathbb{T})$, then $\lim_{r\to 1^-} \|A_r(f) - f\|_\infty = 0$.
(ii) Let $1 \le p < \infty$. If $f \in L^p(\mathbb{T})$, then $\lim_{r\to 1^-} \|A_r(f) - f\|_p = 0$.
(iii) If $f \in L^1(\mathbb{T})$ and if the limits $f(a+)$ and $f(a-)$ exist at a point $a \in \mathbb{T}$, then $\lim_{r\to 1^-} A_r(f, a) = [f(a+) + f(a-)]/2$. In particular, if $f \in L^1(\mathbb{T})$ is continuous at $a \in \mathbb{T}$, then $\lim_{r\to 1^-} A_r(f, a) = f(a)$.
(iv) If $f \in L^1(\mathbb{T})$, then $A_r(f, a) \to f(a)$ as $r \to 1^-$ for almost every $a \in \mathbb{T}$.

Proof. Statements (i) and (ii) follow from the fact [117](ii) that $\{P_r : 0 < r < 1\}$ as $r \to 1$ is an approximate identity, by [102]. Statements (iii) and (iv) follow from [115](iii) and [116](ii) respectively, since Cesàro summability implies Abel summability by [110] (or imitate the proofs of [115](iii) and [116](ii)).

Remark: The Poisson kernel appears naturally in the theory of partial differential equations, for example in solving the Laplace equation on the unit disc.

8. Pointwise convergence of Fourier series for functions of bounded variation

Here we will present Jordan's theorem about pointwise convergence of Fourier series for a function $f \in L^1(\mathbb{T})$ of bounded variation.

Definition: For a function $f : \mathbb{T} \to \mathbb{C}$, and $[a, b] \subset \mathbb{T}$, let $V_a^b(f) = \sup\big\{\sum_{j=1}^{k} |f(a_j) - f(a_{j-1})| : k \in \mathbb{N} \text{ and } a = a_0 \le a_1 \le \cdots \le a_{k-1} \le a_k = b\big\}$ be the total variation of $f$ in $[a, b]$. We say $f$ is of bounded variation if $V_0^1(f) < \infty$.

Remark: Let $f \in L^1(\mathbb{T})$ be of bounded variation. It is known that we may write $f = f_1 - f_2 + i(f_3 - f_4)$, where the $f_j$'s are real valued and monotone increasing. A monotone function is differentiable almost everywhere and has only jump discontinuities. Thus $f$ is differentiable almost everywhere and the limits $f(a+)$ and $f(a-)$ exist for every $a \in \mathbb{T}$ (see my notes Measure Theory for these facts). We aim to show $\lim_{N\to\infty} s_N(f, a) = [f(a+) + f(a-)]/2$ for every $a \in \mathbb{T}$.

[119] Let f ∈ L1(T) be a function of bounded variation. Then,

(i) (Dirichlet's theorem) $\lim_{N\to\infty} \sigma_N(f, a) = [f(a+) + f(a-)]/2$ for every $a \in \mathbb{T}$.
(ii) $|n\hat{f}(n)| \le V_0^1(f)/2$ for every $n \in \mathbb{Z}$.

(iii) (Jordan’s theorem) limN→∞ sN (f, a) = [f(a+) + f(a−)]/2 for every a ∈ T.

Proof. (i) By the Remark above, the limits f(a+) and f(a−) exist for every a ∈ T. So the result follows by [115](iii).

(ii) We may assume $n \ne 0$, and put $r = \dfrac{1}{2|n|}$. First note that
$$\hat{f}(n) = \sum_{k=0}^{2|n|-1} \int_{kr}^{(k+1)r} f(t)e_{-n}(t)\,dt = \sum_{k=0}^{2|n|-1} \int_0^r f(s + kr)e_{-n}(s + kr)\,ds.$$
Since $e_{-n}(s + kr) = e_{-n}(s)$ when $k$ is even, and $e_{-n}(s + kr) = -e_{-n}(s)$ when $k$ is odd, we get
$$\hat{f}(n) = \int_0^r \Big(\sum_{j=0}^{|n|-1} [f(s + 2jr) - f(s + (2j+1)r)]\Big)e_{-n}(s)\,ds.$$
Therefore,
$$|\hat{f}(n)| \le \int_0^r \sum_{j=0}^{|n|-1} |f(s + 2jr) - f(s + (2j+1)r)|\,ds \le \int_0^r V_0^1(f)\,ds = V_0^1(f)\,r = \frac{V_0^1(f)}{2|n|}.$$

(iii) This follows from parts (i) and (ii) above, and Hardy's Tauberian theorem [111](i).
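The decay estimate in [119](ii) can be tested on a concrete example. The sketch below uses the sawtooth $f(t) = t$ on $[0, 1)$ extended with period 1, an example chosen here and not taken from the notes; its total variation over one period is taken to be $V_0^1(f) = 2$ (a rise of 1 plus a jump of 1), and the coefficients are approximated by a midpoint rule. The function names are hypothetical.

```python
import cmath
import math

def fourier_coefficient(f, n, points=20000):
    """Midpoint-rule approximation of the integral of f(t) exp(-2 pi i n t) over [0, 1)."""
    h = 1.0 / points
    return h * sum(
        f((k + 0.5) * h) * cmath.exp(-2j * math.pi * n * (k + 0.5) * h)
        for k in range(points)
    )

def sawtooth(t):
    """f(t) = t on [0, 1), extended to R with period 1."""
    return t - math.floor(t)

if __name__ == "__main__":
    variation = 2.0  # assumed value of V_0^1 for the sawtooth
    for n in range(1, 6):
        c = fourier_coefficient(sawtooth, n)
        print(n, abs(n * c), variation / 2)
```

For this $f$ one has $|n\hat{f}(n)| = 1/(2\pi)$ for every $n \ne 0$, which is comfortably below the bound $V_0^1(f)/2 = 1$.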

Remark: Let f ∈ L1(T) be of bounded variation. Then as remarked above, f is differentiable a.e., and hence continuous a.e.; now, Jordan’s theorem implies limN→∞ sN (f, a) = f(a) for a.e. a ∈ T.
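As a numerical illustration of [119](ii) and (iii), the following sketch uses the sawtooth f(t) = t on [0, 1), extended periodically. This function is an assumption chosen for the example and is not taken from the notes. Its total variation over one period is 2 (a rise of 1 followed by a jump of 1), so part (ii) predicts |n f̂(n)| ≤ 1. The helper names and the midpoint quadrature are illustrative choices only.

```python
import cmath
import math

def sawtooth(t):
    """f(t) = t on [0, 1), extended 1-periodically; it has a jump of size 1 at t = 0."""
    return t - math.floor(t)

def fourier_coefficient(f, n, samples=4096):
    """Midpoint-rule approximation of the integral of f(t) * exp(-2*pi*i*n*t) over [0, 1]."""
    h = 1.0 / samples
    total = 0j
    for k in range(samples):
        t = (k + 0.5) * h
        total += f(t) * cmath.exp(-2j * math.pi * n * t)
    return total * h

def partial_sum(f, a, N, samples=4096):
    """Symmetric partial sum s_N(f, a): sum over |n| <= N of f^(n) * exp(2*pi*i*n*a)."""
    total = 0j
    for n in range(-N, N + 1):
        total += fourier_coefficient(f, n, samples) * cmath.exp(2j * math.pi * n * a)
    return total.real
```

With these helpers, partial_sum(sawtooth, 0.25, 50) should land near f(0.25) = 0.25, while partial_sum(sawtooth, 0.0, 50) should land near the midpoint 1/2 of the jump, in line with Jordan's theorem.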

Seminar topic/ reading assignment: (i) If $(a_n)_{n \in \mathbb{Z}}$ is a sequence of nonnegative numbers such that $a_{-n} = a_n$, $a_{n+1} - a_n \ge a_n - a_{n-1}$ and $\lim_{n\to\infty} a_n = 0$, then there is a nonnegative function f ∈ L1(T) with $\hat{f}(n) = a_n$ for every n ∈ Z. (ii) $\sum_{n=2}^{\infty} \frac{\sin 2\pi nt}{\log n} = \sum_{|n| \ge 2} \frac{\mathrm{sgn}(n)\,e_n(t)}{2i \log |n|}$ is a convergent trigonometric series that is not the Fourier series of any f ∈ L1(T); see sections 4.1 and 4.2 of Y. Katznelson, An Introduction to Harmonic Analysis. (iii) Gibbs-Wilbraham phenomenon; see section 1.2.8 of M.A. Pinsky, Introduction to Fourier Analysis and Wavelets.

Remark: We saw at the beginning of these notes that the Fourier series converges in L2(T). About general $L^p$-convergence, the following is known: if 1 < p < ∞, then $\lim_{N\to\infty} \|f - s_N(f)\|_p = 0$ for every $f \in L^p(\mathbb{T})$; but there are $f \in L^1(\mathbb{T})$ for which the sequence $(s_N(f))$ does not converge to f in L1(T).

Figure 1. Rough shape of the graphs of DN , FN , and Pr.

In sections 9-14, we will introduce the theory of distributions, and this will have an interplay with the theory of Fourier transform that we will discuss afterwards. The theory of distributions is usually developed on an open subset of Rn. But we will stick to the one dimensional space R in order to convey the ideas in the simplest way without notational distractions. After grasping the one dimensional case, for applications the student should read the multidimensional theory, which is built upon more or less the same ideas, from relevant books⁶. One serious difference that we observe when we move from T to R is that we have Lp(T) ⊂ L1(T) for 1 ≤ p < ∞, but there are no inclusion relations among Lp(R) for 1 ≤ p ≤ ∞ since the Lebesgue measure of R is infinite; and we have to pay special attention to the decay properties of a function value f(x) as |x| → ∞. Special subspaces defined in terms of various decay properties will play an important role, and we start by sketching the theory of such subspaces.

9. Convolution is a smoothing operation

In this section, we will establish the following philosophy in various forms: f ∗ g is at least as smooth as f and g, and often smoother. As a warm-up, we first look at different types of continuous functions on R.

Definition: Recall that the support of a function f : R → C is the closed set supp(f) := $\overline{\{x \in \mathbb{R} : f(x) \ne 0\}}$ (the closure is taken in R). Also recall C(R) = {f : R → C : f is continuous}, and Cc(R) = {f ∈ C(R): f has compact support}.

Let C0(R) = {f ∈ C(R) : lim|x|→∞ f(x) = 0} (the space of continuous functions vanishing at ∞), and Cb(R) = {f ∈ C(R): f is bounded}. Note that all these spaces are complex vector spaces.

Exercise-19: (i) $C_c(\mathbb{R}) \subset C_0(\mathbb{R}) \subset C_b(\mathbb{R}) = C(\mathbb{R}) \cap L^\infty(\mathbb{R})$.

(ii) Every f ∈ C0(R) is uniformly continuous, but f ∈ Cb(R) may not be uniformly continuous.

(iii) $C_c(\mathbb{R}) \subset L^p(\mathbb{R})$ for 1 ≤ p ≤ ∞.

(iv) For 1 ≤ p < ∞, C0(R) is not a subset of $L^p(\mathbb{R})$, and $C_b(\mathbb{R}) \cap L^p(\mathbb{R})$ is not a subset of C0(R).

(v) C0(R) is a closed vector subspace of $(L^\infty(\mathbb{R}), \|\cdot\|_\infty)$, and hence is a Banach space.

(vi) Cc(R) is dense in both $(C_0(\mathbb{R}), \|\cdot\|_\infty)$ and $(L^p(\mathbb{R}), \|\cdot\|_p)$ for 1 ≤ p < ∞.

[Hint: (ii) If f ∈ Cb(R) is with f(n) = 0 and f(n + 1/n) = 1 for n ≥ 2, then f is not uniformly continuous. (iv) If f ∈ C0(R) is with f(x) = 1/n for $10^n \le x \le 10^n + n$ ∀n ∈ N, then $f \notin L^1(\mathbb{R})$. (vi) Given f ∈ C0(R) and ε > 0, choose N ∈ N such that |f(x)| < ε for |x| ≥ N. Choose continuous g : R → [0, 1] with g ≡ 1 on [−N, N] and g(x) = 0 for |x| ≥ N + 1. Then fg ∈ Cc(R) and ∥f − fg∥∞ ≤ ε. The denseness of Cc(R) in $L^p(\mathbb{R})$ is already noted in Exercise-1.]

⁶ E.g.: F.G. Friedlander, Introduction to the Theory of Distributions, or L. Grafakos, Classical Fourier Analysis.

If $f, g \in L^1(\mathbb{R})$, then we know by Exercise-3(i) that $f * g \in L^1(\mathbb{R})$ with ∥f ∗ g∥1 ≤ ∥f∥1∥g∥1. To supplement this, first we observe:

Exercise-20: (i) Fix $f \in L^p(\mathbb{R})$, 1 ≤ p < ∞. Then $a \mapsto f_a$ from R to $L^p(\mathbb{R})$ is uniformly continuous.

(ii) If $f \in L^1(\mathbb{R})$ and $g \in L^\infty(\mathbb{R})$, then $f * g \in L^\infty(\mathbb{R})$ with ∥f ∗ g∥∞ ≤ ∥f∥1∥g∥∞, and moreover f ∗ g is uniformly continuous.

[Hint: (i) As in Exercise-2(ii), we may assume f ∈ Cc(R). Let N ∈ N be such that supp(f) ⊂ [−(N − 1), N − 1]. Given ε > 0, choose δ ∈ (0, 1) such that |x − y| < δ implies $|f(x) - f(y)|^p < \frac{\varepsilon}{2N}$. Then for a, b ∈ R with |a − b| < δ, we have

$$\|f_a - f_b\|_p^p = \int_{\mathbb{R}} |f(x - a) - f(x - b)|^p\,dx = \int_{\mathbb{R}} |f(y) - f(y + a - b)|^p\,dy = \int_{-N}^{N} |f(y) - f(y + a - b)|^p\,dy < \frac{\varepsilon}{2N} \cdot 2N = \varepsilon.$$

(ii) $|f * g(x)| = |\int_{\mathbb{R}} f(y)g(x - y)\,dy| \le \int_{\mathbb{R}} |f(y)|\,\|g\|_\infty\,dy = \|f\|_1\|g\|_\infty$. Define $F \in L^1(\mathbb{R})$ as F(y) = f(−y). Note that $|g * f(a) - g * f(b)| = |\int_{\mathbb{R}} g(y)(F_a(y) - F_b(y))\,dy| \le \|g\|_\infty\|F_a - F_b\|_1$, and use (i).]

[120] (i) If f, g ∈ Cc(R), then f ∗ g ∈ Cc(R) and supp(f ∗ g) ⊂ supp(f) + supp(g).

(ii) If $f \in L^1(\mathbb{R})$ and g ∈ C0(R), then f ∗ g ∈ C0(R).

(iii) Let 1 < p, q < ∞ and $\frac{1}{p} + \frac{1}{q} = 1$. If $f \in L^p(\mathbb{R})$ and $g \in L^q(\mathbb{R})$, then f ∗ g ∈ C0(R) and ∥f ∗ g∥∞ ≤ ∥f∥p∥g∥q.

(iv) Let 1 ≤ p < ∞. If $f \in L^p(\mathbb{R})$ and g ∈ Cc(R), then f ∗ g ∈ C0(R).

Proof. (i) Let h = f ∗ g, K = supp(f) and L = supp(g). Then $\mathrm{supp}(g_y) = y + L$. Since

$$|h(a) - h(b)| \le \int_K |f(y)|\,|g(a - y) - g(b - y)|\,dy \le \|f\|_\infty \int_K |g(a - y) - g(b - y)|\,dy,$$

we may deduce that h is continuous by the uniform continuity of g. If x ∉ K + L, then $g_y(x) = 0$ for every y ∈ K and hence $h(x) = \int_K f(y)g_y(x)\,dy = 0$. This shows supp(h) ⊂ K + L.

(ii) f ∗ g is defined by Exercise-20(ii) since $g \in C_0(\mathbb{R}) \subset L^\infty(\mathbb{R})$. By Exercise-19(vi), choose (fn) and (gn) in Cc(R) with ∥f − fn∥1 → 0 and ∥g − gn∥∞ → 0. Let $M = \sup_n \|g_n\|_\infty < \infty$. We know by (i) that fn ∗ gn ∈ Cc(R) ⊂ C0(R). With the help of Exercise-20(ii), we see

$$|f * g(x) - f_n * g_n(x)| \le |f * g(x) - f * g_n(x)| + |f * g_n(x) - f_n * g_n(x)| \le \|f\|_1\|g - g_n\|_\infty + M\|f - f_n\|_1 \to 0$$

as n → ∞. Since C0(R) is closed w.r.to ∥ · ∥∞ by Exercise-19(v), we conclude that f ∗ g ∈ C0(R).

(iii) Let h(x) = g(−x) so that $f * g(x) = \int_{\mathbb{R}} f(y)h_x(y)\,dy$. Applying Hölder's inequality and noting $\|h_x\|_q = \|g\|_q$, we may deduce that $f * g \in L^\infty(\mathbb{R})$ with ∥f ∗ g∥∞ ≤ ∥f∥p∥g∥q. The proof that f ∗ g ∈ C0(R) is similar to the one given for (ii). Choose sequences (fn) and (gn) in Cc(R) with $\|f - f_n\|_p \to 0$ and $\|g - g_n\|_q \to 0$. Let $M = \sup_n \|g_n\|_q < \infty$. By what is proved so far,

$$\|f * g - f_n * g_n\|_\infty \le \|f * (g - g_n)\|_\infty + \|(f - f_n) * g_n\|_\infty \le \|f\|_p\|g - g_n\|_q + \|f - f_n\|_p M \to 0$$

as n → ∞. Since fn ∗ gn ∈ Cc(R) ⊂ C0(R) and since C0(R) is a Banach space, we get that f ∗ g ∈ C0(R).

(iv) This follows from (ii) and (iii) since $C_c(\mathbb{R}) \subset L^q(\mathbb{R})$. □
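The support inclusion in [120](i) and the bound ∥f ∗ g∥∞ ≤ ∥f∥1∥g∥∞ from Exercise-20(ii) can be checked numerically on a simple pair of functions. The sketch below is only an illustration: the two tent functions, the helper names, and the midpoint Riemann sum are assumptions made for this example.

```python
def tent(y):
    """Continuous function supported on [-1, 1]."""
    return max(0.0, 1.0 - abs(y))

def narrow_tent(y):
    """Continuous function supported on [0, 0.5] with maximum value 1."""
    return max(0.0, 1.0 - abs(4.0 * y - 1.0))

def convolve_at(f, g, x, lo=-1.0, hi=1.0, steps=2000):
    """Midpoint Riemann sum for (f*g)(x) = integral of f(y) g(x - y) dy,
    assuming supp(f) is contained in [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (k + 0.5) * h) * g(x - (lo + (k + 0.5) * h))
                   for k in range(steps))

def l1_norm(f, lo=-1.0, hi=1.0, steps=2000):
    """Midpoint Riemann sum for the L1 norm of f, assuming supp(f) is in [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(abs(f(lo + (k + 0.5) * h)) for k in range(steps))
```

Here supp(tent) + supp(narrow_tent) = [−1, 1.5], so convolve_at(tent, narrow_tent, x) vanishes for x outside that interval, and its absolute value never exceeds l1_norm(tent) times the maximum of narrow_tent.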

Remark: We mention an application of [120](iii). Claim: If A ⊂ R has positive Lebesgue measure, then A − A contains a neighborhood of 0. Proof: We may assume µ(A) < ∞. Let $f = 1_A$ and g(x) = f(−x), and note $f, g \in L^2(\mathbb{R})$. Note that $0 < \mu(A) = \int f = \int f^2 = f * g(0)$. Since f ∗ g is continuous by [120](iii), there is a neighborhood U ⊂ R of 0 such that f ∗ g(x) > 0 for every x ∈ U. But $0 < f * g(x) = \int f(y)g(x - y)\,dy = \int f(y)f(y - x)\,dy$ implies x ∈ A − A, and hence U ⊂ A − A.

Definition: Let $C_c^m(\mathbb{R}) = \{f \in C_c(\mathbb{R}) : f \text{ is } m\text{-times continuously differentiable}\}$, $C_0^m(\mathbb{R}) = \{f \in C_0(\mathbb{R}) : f \text{ is } m\text{-times continuously differentiable}\}$, and $C_0^\infty(\mathbb{R}) = \bigcap_{m=1}^{\infty} C_0^m(\mathbb{R})$. More importantly, for our future discussion of distributions, we define the space E of smooth functions, and the space D of smooth functions with compact support as follows:

Let $\mathcal{E} = C^\infty(\mathbb{R}) = \{f : \mathbb{R} \to \mathbb{C} : f \text{ is infinitely often differentiable}\}$, and $\mathcal{D} = C_c^\infty(\mathbb{R}) = C_c(\mathbb{R}) \cap \mathcal{E} = \bigcap_{m=1}^{\infty} C_c^m(\mathbb{R})$.

Notation: Let Df = f′, the derivative of f : R → C when f is differentiable.

Fact from Analysis: (Differentiating under the integral - see Theorem 7.40 in Apostol, Mathematical Analysis) If ϕ : [a, b] × [c, d] → C and the partial derivative $\frac{\partial\phi}{\partial x}$ are continuous, then h : [a, b] → C defined as $h(x) := \int_c^d \phi(x, y)\,dy$ is differentiable and $Dh(x) = \int_c^d \frac{\partial\phi}{\partial x}(x, y)\,dy$.

[121] (i) If f ∈ Cc(R) and $g \in C_c^1(\mathbb{R})$, then $f * g \in C_c^1(\mathbb{R})$ and D(f ∗ g) = f ∗ Dg.

(ii) Let k, m ≥ 0. If $f \in C_c^k(\mathbb{R})$ and $g \in C_c^m(\mathbb{R})$, then $f * g \in C_c^{k+m}(\mathbb{R})$ and $D^{i+j}(f * g) = D^i f * D^j g$ for 0 ≤ i ≤ k and 0 ≤ j ≤ m. In particular, if f ∈ Cc(R) and g ∈ D, then f ∗ g ∈ D and $D^m(f * g) = f * D^m g$ for every m ∈ N.

(iii) Let 1 ≤ p ≤ ∞. If $f \in L^p(\mathbb{R})$ and $g \in C_c^m(\mathbb{R})$ for some m ∈ N, then $f * g \in C_0^m(\mathbb{R})$ and $D^j(f * g) = f * D^j g$ for 1 ≤ j ≤ m (when p = ∞, the decay at infinity can fail, for example f ≡ 1 gives a constant f ∗ g; in that case f ∗ g is bounded and m-times continuously differentiable, and the derivative formula still holds). In particular, if $f \in L^p(\mathbb{R})$ and g ∈ D, then $f * g \in C_0^\infty(\mathbb{R})$ and $D^m(f * g) = f * D^m g$ for every m ∈ N.

(iv) If $f \in L^1(\mathbb{R})$ and $g \in C_0^\infty(\mathbb{R})$, then $f * g \in C_0^\infty(\mathbb{R})$ and $D^m(f * g) = f * D^m g$ for every m ∈ N.

Proof. (i) Since $g \in C_c^1(\mathbb{R})$, we have Dg ∈ Cc(R) and hence f ∗ Dg ∈ Cc(R) by [120](i). It remains to show D(f ∗ g) = f ∗ Dg. Put ϕ(x, y) := f(y)g(x − y). Then ϕ and $\frac{\partial\phi}{\partial x}$ are continuous, where $\frac{\partial\phi}{\partial x}(x, y) = f(y)Dg(x - y)$. For a fixed x ∈ R, ϕ(x, y) = 0 for y outside a compact interval since f and g have compact supports. By applying the Fact mentioned above to ϕ, we get D(f ∗ g) = f ∗ Dg.

(ii) This follows by the repeated application of (i) since the convolution is symmetric.

(iii) We know by [120](iv) that f ∗ g ∈ C0(R). It remains to show f ∗ g is m-times continuously differentiable and Dj(f ∗ g) = f ∗ Djg for 1 ≤ j ≤ m.

Case-1: $f \in L^1(\mathbb{R})$. Fix x ∈ R, and we claim that D(f ∗ g)(x) = f ∗ Dg(x). Let (tn) be a sequence of non-zero reals converging to 0. Define $h_n(y) = t_n^{-1}(g(x + t_n - y) - g(x - y))$. Since $t_n^{-1}(f * g(x + t_n) - f * g(x)) = \int_{\mathbb{R}} f(y)h_n(y)\,dy$, it remains to show $\lim_{n\to\infty} \int_{\mathbb{R}} f(y)h_n(y)\,dy = f * Dg(x)$.

By mean value theorem applied to g, we observe that ∥hn∥∞ ≤ ∥Dg∥∞ for every n ∈ N. Also, if we put h(y) = Dg(x − y), then we see that (fhn) → fh pointwise. Since fhn is dominated by the integrable function ∥Dg∥∞|f|, we conclude by Lebesgue dominated convergence theorem that $\lim_{n\to\infty} \int_{\mathbb{R}} f(y)h_n(y)\,dy = \int_{\mathbb{R}} f(y)h(y)\,dy = \int_{\mathbb{R}} f(y)Dg(x - y)\,dy = f * Dg(x)$, which proves the claim. Since Dg ∈ Cc(R), we have D(f ∗ g) = f ∗ Dg ∈ C0(R) also by [120](iv), and thus $f * g \in C_0^1(\mathbb{R})$. Now repeating the argument with Dg, $D^2 g$, etc. in place of g, we get the desired result.

Case-2: $f \in L^p(\mathbb{R})$, where p ∈ (1, ∞). Let K = supp(g). Since $f * g(x) = \int_{\mathbb{R}} 1_{x-K}(y)f(y)g(x - y)\,dy$ and since $y \mapsto 1_{x-K}(y)f(y)$ belongs to $L^1(\mathbb{R})$ by Hölder's inequality, the result follows by case-1.

Case-3: $f \in L^\infty(\mathbb{R})$. Let K = supp(g). Since $f * g(x) = \int_{\mathbb{R}} 1_{x-K}(y)f(y)g(x - y)\,dy$ and since $y \mapsto 1_{x-K}(y)f(y)$ belongs to $L^1(\mathbb{R})$, the result follows again by case-1.

(iv) Use the argument in case-1 of the proof of (iii), and use [120](ii) to say f ∗ Dg ∈ C0(R). □
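The identity D(f ∗ g) = f ∗ Dg from [121](i) can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: f is a tent function (continuous, not differentiable), g(x) = (1 − x²)⁴ on [−1, 1] is a hypothetical choice of a compactly supported continuously differentiable function, the integral is a midpoint Riemann sum, and the left-hand derivative is a central difference.

```python
def tent(y):
    """f: continuous, supported on [-1, 1], not differentiable at -1, 0, 1."""
    return max(0.0, 1.0 - abs(y))

def poly_bump(x):
    """g(x) = (1 - x^2)^4 on [-1, 1] and 0 elsewhere; continuously differentiable."""
    return (1.0 - x * x) ** 4 if abs(x) < 1.0 else 0.0

def poly_bump_derivative(x):
    """Dg(x) = -8 x (1 - x^2)^3 on [-1, 1] and 0 elsewhere."""
    return -8.0 * x * (1.0 - x * x) ** 3 if abs(x) < 1.0 else 0.0

def convolve_at(f, g, x, lo=-1.0, hi=1.0, steps=2000):
    """Midpoint Riemann sum for (f*g)(x), assuming supp(f) is contained in [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (k + 0.5) * h) * g(x - (lo + (k + 0.5) * h))
                   for k in range(steps))

def central_difference(F, x, t=1e-3):
    """Symmetric difference quotient approximating DF(x)."""
    return (F(x + t) - F(x - t)) / (2.0 * t)
```

Comparing central_difference(lambda u: convolve_at(tent, poly_bump, u), x) with convolve_at(tent, poly_bump_derivative, x) at a few points x shows the two sides agreeing up to discretization error, even though f itself has corners.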

We will see below that R has an approximate identity {Hδ : δ > 0} with the additional property that Hδ ∈ D for every δ > 0. This tool is useful in approximating $L^p$-functions by members of D.

Exercise-21: (i) Let h : R → R be $h(x) = e^{-1/x}\,1_{(0,\infty)}(x)$. Then h ≥ 0 and h ∈ E.

(ii) For a < b, let $h_{a,b}(x) = h(x - a)h(b - x)$. Then $h_{a,b} \ge 0$, $h_{a,b} \in \mathcal{D}$ and $\mathrm{supp}(h_{a,b}) = [a, b]$.

(iii) Let $H(x) = c \cdot h_{-1,1}(x) = c\,e^{-2/(1-x^2)}\,1_{(-1,1)}(x)$, where c > 0 is a constant chosen so that $\int_{\mathbb{R}} H(x)\,dx = 1$. Then H ≥ 0, H ∈ D and supp(H) = [−1, 1].

(iv) For δ > 0, let $H_\delta(x) = \delta^{-1}H(x/\delta)$. Then $H_\delta \ge 0$, $\int_{\mathbb{R}} H_\delta = 1$, $H_\delta \in \mathcal{D}$ and $\mathrm{supp}(H_\delta) = [-\delta, \delta]$. Thus {Hδ : δ > 0} as δ → 0 is a nonnegative approximate identity on R satisfying the $L^1$-concentration condition in a strong sense (the family {Hδ : δ > 0} is also called a mollifier).

(v) If f ∈ Cc(R), then ∥f − f ∗ Hδ∥∞ → 0 as δ → 0.

(vi) Let 1 ≤ p < ∞. If $f \in L^p(\mathbb{R})$, then ∥f − f ∗ Hδ∥p → 0 as δ → 0.

[Hint: (i) For x > 0, show inductively that $h^{(n)}(x) = p_n(1/x)h(x)$ for some polynomial $p_n$. (ii)-(iv) are easy consequences of (i). Statements (v) and (vi) follow from part (iv) and [102].]
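The construction in Exercise-21 can be followed step by step in code. The sketch below is a rough numerical illustration, not part of the notes: the normalizing constant c is obtained by a midpoint rule with an arbitrarily chosen number of nodes, the test function is a tent function, and all helper names are hypothetical.

```python
import math

def h(x):
    """h(x) = exp(-1/x) for x > 0 and 0 otherwise."""
    return math.exp(-1.0 / x) if x > 0 else 0.0

def h_ab(x, a, b):
    """h_{a,b}(x) = h(x - a) h(b - x), supported on [a, b]."""
    return h(x - a) * h(b - x)

STEPS = 2000  # number of midpoint nodes on [-1, 1] (an arbitrary choice)
NODES = [-1.0 + (k + 0.5) * (2.0 / STEPS) for k in range(STEPS)]
# Normalizing constant c, computed with the same midpoint rule used below.
C = 1.0 / ((2.0 / STEPS) * sum(h_ab(y, -1.0, 1.0) for y in NODES))

def H(x):
    """H = c * h_{-1,1}, with integral approximately 1."""
    return C * h_ab(x, -1.0, 1.0)

def H_delta(x, delta):
    """H_delta(x) = H(x / delta) / delta, supported on [-delta, delta]."""
    return H(x / delta) / delta

def mollify(f, x, delta):
    """Midpoint approximation of (f * H_delta)(x) = integral of H_delta(y) f(x - y) dy."""
    step = 2.0 * delta / STEPS
    return step * sum(H_delta(delta * y, delta) * f(x - delta * y) for y in NODES)

def tent(x):
    """A continuous compactly supported function with Lipschitz constant 1."""
    return max(0.0, 1.0 - abs(x))
```

Since tent has Lipschitz constant 1 and H_delta is supported on [−δ, δ], the difference |mollify(tent, x, delta) − tent(x)| stays below δ (up to rounding), which is the uniform convergence of part (v) in this special case.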

[122] $\mathcal{D} := C_c^\infty(\mathbb{R})$ is dense in $(L^p(\mathbb{R}), \|\cdot\|_p)$ for 1 ≤ p < ∞.

Proof. Since Cc(R) is dense in $(L^p(\mathbb{R}), \|\cdot\|_p)$ by Exercise-1, it suffices to show f ∈ Cc(R) can be approximated in $L^p$-norm by members of D. We know ∥f − f ∗ Hδ∥p → 0 by Exercise-21(vi). Since f ∈ Cc(R) and Hδ ∈ D, we also have f ∗ Hδ ∈ D by [121](ii), and we are done. □

10. Topologies on the spaces $\mathcal{D} = C_c^\infty(\mathbb{R})$ and $\mathcal{E} = C^\infty(\mathbb{R})$

The theory of distributions (to be introduced soon) is based on the three spaces D ⊂ S ⊂ E, where $\mathcal{D} := C_c^\infty(\mathbb{R})$, $\mathcal{E} := C^\infty(\mathbb{R})$, and S is the Schwartz space (to be defined in the next section). In this section, we will introduce suitable topologies on D and E, and will mention a few basic properties of these spaces. First we will review a few selected facts from Functional Analysis that we need.

Definition: Let X be a topological vector space (over C), i.e., X is a vector space having a Hausdorff topology, and the maps (a, x) ↦ ax from C × X to X and (x, y) ↦ x + y from X² to X are continuous. (i) X is locally convex if 0 ∈ X has a neighborhood base consisting of convex open sets. (ii) X is a Fréchet space if X is locally convex and admits a complete metric.

Remark: Many function spaces appearing in Analysis do not admit any natural structure of a Banach space, but they retain two nice properties of a Banach space - local convexity and the existence of an admissible complete metric - so that they are Fréchet spaces (we will see examples shortly). One advantage of a Fréchet space is that Baire category theorem, and consequently many classic theorems in Functional Analysis based on Baire category theorem (Open mapping theorem, Uniform boundedness theorem, etc.) hold good on Fréchet spaces - see Rudin, Functional Analysis.

Definition: Let X be a vector space (over C). A function p : X → [0, ∞) is a seminorm if p(ax) = |a|p(x) and p(x + y) ≤ p(x) + p(y) for every a ∈ C and x, y ∈ X. The first property implies p(x) = 0 if x = 0 (take a = 2). If the converse (that is, p(x) = 0 ⇒ x = 0) also holds, then p becomes a norm on X. An easy way to produce a seminorm p is: choose any linear functional ϕ : X → C and put p(x) = |ϕ(x)|. A useful observation about a seminorm p is the following: since p(y) ≤ p(x) + p(y − x) and p(x) ≤ p(y) + p(x − y), we get |p(x) − p(y)| ≤ p(x − y).

Definition: Let X be a vector space (over C), and P be a family of seminorms on X. (i) P is separating if the only x ∈ X with p(x) = 0 ∀ p ∈ P is x = 0.

(ii) The topology generated by P on X is the smallest topology on X that makes every p ∈ P continuous. Sets of the form {x ∈ X : pj(x) < ε for 1 ≤ j ≤ k}, where ε > 0 and p1, . . . , pk ∈ P , form a base at 0 ∈ X for this topology. Using this, observe that the topology generated by a family of seminorms is always locally convex.

(iii) P is directed if ∀ p1, p2 ∈ P , ∃ p3 ∈ P with p3 ≥ max{p1, p2}. If P is directed, then basic neighborhoods of 0 ∈ X have the form {x ∈ X : p(x) < ε} for some ε > 0 and p ∈ P .

[123] (Working knowledge about a topology specified by seminorms) Let X be a topological vector space, where the topology is generated by a directed family P of seminorms. Then, (i) A seminorm q on X (not necessarily a member of P ) is continuous iff there exist p ∈ P and C > 0 such that q(x) ≤ Cp(x) for every x ∈ X. Consequently, a linear functional ϕ : X → C is continuous iff there exist p ∈ P and C > 0 such that |ϕ(x)| ≤ Cp(x) for every x ∈ X.

(ii) Let P be separating and countable, say P = {pk : k ∈ N}, and let Uk = {x ∈ X : pk(x) < 1/k}.

Then {Uk : k ∈ N} is a local base at 0 for X (we remark that even when X is metrizable, it is more convenient to use the seminorms than the metric).

(iii) If P = {pk : k ∈ N} is separating, then X is metrizable with a translation invariant metric $d(x, y) := \sum_{k=1}^{\infty} 2^{-k} \min\{1, p_k(x - y)\}$. Here, translation invariance means d(x + z, y + z) = d(x, y).

(iv) Let P = {pk : k ∈ N} be separating. Then, (xn) → x in X ⇔ for every k ∈ N, there is n0 ∈ N such that pk(x − xn) < 1/k for every n ≥ n0; and (xn) is Cauchy in X ⇔ for every k ∈ N, there is n0 ∈ N such that pk(xm − xn) < 1/k for every m, n ≥ n0.

Proof. (i) Suppose q is continuous, and let U ⊂ X be a basic neighborhood of 0 ∈ X with q(U) ⊂ [0, 1). We may assume U = {x ∈ X : p(x) < ε} for some ε > 0 and p ∈ P since P is directed. We claim that C := 2/ε works. Consider x ∈ X. If p(x) = 0, then p(ax) = ap(x) = 0 so that ax ∈ U for every a > 0, which implies aq(x) = q(ax) < 1 for every a > 0; hence q(x) = 0 and trivially q(x) ≤ Cp(x). If p(x) > 0, then for $a := \frac{\varepsilon}{2p(x)}$ we have ax ∈ U and aq(x) = q(ax) < 1; this implies q(x) < 1/a = Cp(x). Conversely, if the given condition holds, then q is continuous at 0 ∈ X; and the continuity at a general point follows by observing that |q(x) − q(y)| ≤ q(x − y). The second assertion about a linear functional follows by applying what is already proved to q(x) := |ϕ(x)|.

(ii) Consider a basic neighborhood U = {x ∈ X : pj(x) < ε for 1 ≤ j ≤ m} of 0, where pj ∈ P and

ε > 0. Let n ≥ m be such that 1/n < ε, and then choose pk ≥ max{p1, . . . , pn} using the fact that

P is directed. Then clearly 0 ∈ Uk ⊂ U, and this shows {Uk : k ∈ N} is a local base at 0.

(iii) Translation invariance and symmetry of d are clear. Triangle inequality of d follows from that of pk's. We have d(x, x) = 0 since pk(0) = 0. If d(x, y) = 0, then pk(x − y) = 0 for every k ∈ N, which implies x − y = 0 or x = y since P is separating. Thus d is a translation invariant metric. Now we will verify using Uk's from (ii) that d induces the same topology. Since $\min\{1, p_k(x)\} \le 2^k d(0, x)$, we have $B_d(0, \frac{1}{k2^k}) \subset U_k$. For the other direction, for a given ε > 0, choose m ∈ N with 1/m < ε/2 and $\sum_{j=m+1}^{\infty} 2^{-j} < \varepsilon/2$. Then choose k > m with $p_k \ge \max\{p_1, \ldots, p_m\}$ by the directedness of P. For any x ∈ Uk, we now have $p_j(x) \le p_k(x) < 1/k$ for 1 ≤ j ≤ m and hence $d(0, x) \le \sum_{j=1}^{m} 2^{-j}/k + \sum_{j=m+1}^{\infty} 2^{-j} < 1/k + \varepsilon/2 < \varepsilon$, which shows $U_k \subset B_d(0, \varepsilon)$.

(iv) This follows from (ii). □
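The metric of [123](iii) can be computed approximately for the seminorms ρN(f) = max{|f(x)| : |x| ≤ N} on C(R). The sketch below is an illustration under assumptions: each maximum is taken over a finite grid, the series is truncated after a fixed number of terms (the omitted tail is smaller than 2 raised to minus that number), and the function names are hypothetical.

```python
def rho(f, N, grid_per_unit=10):
    """Seminorm rho_N(f) = max |f(x)| over [-N, N], approximated on a uniform grid
    that contains both endpoints."""
    points = [-N + k / grid_per_unit for k in range(2 * N * grid_per_unit + 1)]
    return max(abs(f(x)) for x in points)

def frechet_distance(f, g, terms=30):
    """Truncation of d(f, g) = sum over k >= 1 of 2^(-k) * min(1, rho_k(f - g))."""
    difference = lambda x: f(x) - g(x)
    return sum(2.0 ** (-k) * min(1.0, rho(difference, k)) for k in range(1, terms + 1))
```

For example, the functions x ↦ x/n are unbounded on R, so they do not tend to 0 in the sup norm, yet frechet_distance(lambda x: x / n, lambda x: 0.0) shrinks as n grows, reflecting uniform convergence on compact sets.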

We now apply these tools from Functional Analysis to the function spaces of our interest:

Exercise-22: (Examples) (i) Let $\rho_N(f) = \max\{|f(x)| : |x| \le N\}$ for f ∈ C(R) and N ∈ N. Then P = {ρN : N ∈ N} is a directed separating family of seminorms on C(R), and C(R) is a Fréchet space w.r.to the metric induced by P . Also, the topology on C(R) obtained in this way (the topology of uniform convergence on compact sets) coincides with the compact-open topology on C(R).

(ii) For $f \in \mathcal{E} := C^\infty(\mathbb{R})$ and N ∈ N, let $p_N(f) = \max\{|D^j f(x)| : 0 \le j \le N \text{ and } |x| \le N\}$. Then P = {pN : N ∈ N} is a directed separating family of seminorms on E, and E is a Fréchet space w.r.to the metric induced by P . If we put UN = {f ∈ E : pN (f) < 1/N}, then {UN : N ∈ N} is a local base at 0 for E. For fn, f ∈ E, we have (fn) → f in E iff for each N ∈ N, there is n0 ∈ N such that pN (f − fn) < 1/N for every n ≥ n0. A linear functional ϕ : E → C is continuous iff there exist C > 0 and N ∈ N such that |ϕ(f)| ≤ CpN (f) for every f ∈ E.

(iii) $\mathcal{D}_k := \{f \in \mathcal{E} : \mathrm{supp}(f) \subset [-k, k]\}$ is a closed vector subspace of E (hence a Fréchet space), and note $\bigcup_{k=1}^{\infty} \mathcal{D}_k = \mathcal{D}$. The seminorms pN from part (ii) become norms when restricted to Dk, and {Dk ∩ UN : N ∈ N} is a local base at 0 for Dk, where UN is as in (ii). For f, fn ∈ Dk, we have (fn) → f in Dk iff for each N ∈ N, there is n0 ∈ N such that pN (f − fn) < 1/N for every n ≥ n0.

[Hint: (ii) Completeness of $\mathcal{E} = C^\infty(\mathbb{R})$: if (fn) is Cauchy in E, then $(D^j f_n)_{n=1}^{\infty}$ is Cauchy in the Fréchet space C(R) for each j ≥ 0; put $g_j := \lim_n D^j f_n$ and deduce $D^j g_0 = g_j$ for every j ∈ N using Theorem 7.17 in Rudin, Principles of Mathematical Analysis. (iii) Let ϕx : E → C be the continuous functional ϕx(f) = f(x). Then $\mathcal{D}_k = \bigcap_{|x|>k} \ker(\phi_x)$, and hence Dk is closed.]

Remark: By Exercise-22(ii), (fn) → f in E ⇔ for each N ∈ N and j ≥ 0, $(D^j f_n) \to D^j f$ uniformly on [−N, N] as n → ∞ ⇔ for each j ≥ 0, $(D^j f_n) \to D^j f$ uniformly on compact subsets of R. In higher dimension, if we consider $\mathcal{E}(W) := C^\infty(W, \mathbb{C})$ for an open set $W \subset \mathbb{R}^m$, then (fn) → f in E(W) ⇔ for each multi-index α, $(D^\alpha f_n) \to D^\alpha f$ uniformly on compact subsets of W.

Exercise-23: (i) (Existence of smooth bump function) Given 0 < a < b, there exists g ∈ D such that 0 ≤ g ≤ 1, g ≡ 1 on [−a, a], and supp(g) ⊂ [−b, b]. (ii) D is dense in E.

[Hint: (i) By Exercise-21, there is f ∈ E with f(x) = 0 for x ≤ 0 and f(x) > 0 for x > 0. Let h(x) = f(b − x)/[f(b − x) + f(x − a)]. Then h ≡ 1 on (−∞, a], h ≡ 0 on [b, ∞), and h((a, b)) = (0, 1). Put g(x) = h(|x|). (ii) Let f ∈ E. Pick g ∈ D by (i) with g ≡ 1 on [−1, 1]. Define gn(x) = g(x/n). Then fgn ∈ D, and fgn ≡ f on [−n, n] so that pN (f − fgn) = 0 for n > N, giving (fgn) → f in E.]
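The recipe in the hint of Exercise-23(i) translates directly into code. The following sketch assumes the concrete choice f(x) = exp(−1/x) for x > 0 (the function from Exercise-21(i)); the names step_down, bump and cutoff are hypothetical helpers introduced only for this illustration.

```python
import math

def f(x):
    """Smooth function with f(x) = 0 for x <= 0 and f(x) > 0 for x > 0."""
    return math.exp(-1.0 / x) if x > 0 else 0.0

def step_down(x, a, b):
    """h(x) = f(b - x) / (f(b - x) + f(x - a)): equals 1 for x <= a and 0 for x >= b.
    The denominator is positive because a < b."""
    return f(b - x) / (f(b - x) + f(x - a))

def bump(x, a, b):
    """g(x) = h(|x|): equals 1 on [-a, a], vanishes outside (-b, b), values in [0, 1]."""
    return step_down(abs(x), a, b)

def cutoff(x, n):
    """g_n(x) = g(x / n) with a = 1 and b = 2, so g_n equals 1 on [-n, n]."""
    return bump(x / n, 1.0, 2.0)
```

Multiplying any smooth function by cutoff(., n) leaves it unchanged on [−n, n] and gives it compact support, which is the approximation used in part (ii) of the hint.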

Remark: As D ≠ E, it follows by Exercise-23(ii) that D is not closed (complete) in E (another argument for this is by using Baire category theorem, after noting that the proper closed subspaces Dk must be nowhere dense in D). Intuitively, D is not closed in E because: if (fn) is a sequence in D converging to f ∈ E, then supp(fn) can get bigger with n so that in the limiting case supp(f) may fail to be compact. Since it is desirable to work with complete spaces, we will now try to put a (sequentially complete) topology Tind called the inductive limit topology on D. This topology Tind will be such that if (fn) is Cauchy in (D, Tind), then there will be a uniform bound for supp(fn) for every n, and this will ensure that the support of f := lim fn is also compact.

As before, let Dk = {f ∈ D : supp(f) ⊂ [−k, k]} and let Tk denote the topology on Dk (subspace topology induced from E). Keep in mind that (Dk, Tk) is a Fréchet space.

Definition: The inductive limit topology Tind on D is defined as the finest (strongest) locally convex topology on D such that the inclusions Dk ⊂ D become continuous for every k ∈ N.

Remark: By definition, Tind is stronger than the subspace topology on D induced by E. Consequently, Dk is closed in (D, Tind) for each k ∈ N by Exercise-22(iii).

Exercise-24: (i) The collection of all convex balanced sets U ⊂ D such that Dk ∩ U is open in Dk forms a local base at 0 for the locally convex space (D, Tind). (Here, U is said to be balanced if cU ⊂ U for all c ∈ C with |c| ≤ 1.)

(ii) The subspace topology induced on Dk from Tind coincides with the original topology Tk of Dk. Thus U ⊂ D is open in D iff Dk ∩ U is open in Dk for every k ∈ N.

[Hint: We leave this as a reading assignment - see 6.4 in Rudin, Functional Analysis.]

Definition: Let X,Y be topological vector spaces. (i) A ⊂ X is a bounded subset if for every neighborhood U of 0 ∈ X, there is c > 0 such that cA ⊂ U. (ii) A sequence (xn) in X is Cauchy if for every neighborhood U ⊂ X of 0, there is n0 ∈ N such that xn − xm ∈ U for every m, n ≥ n0; and X is (sequentially) complete if every Cauchy sequence in X converges to some element of X. (iii) A linear map T : X → Y is bounded if T (A) is bounded in Y whenever A ⊂ X is bounded.

[124] (i) A ⊂ D is bounded ⇔ there is k ∈ N such that A ⊂ Dk and A is bounded in Dk.

(ii) (fn) → f in D ⇔ there is k ∈ N such that f, fn ∈ Dk for every n ∈ N and (fn) → f in Dk.

(iii) D is sequentially complete.

(iv) D is not metrizable.

Proof. In (i) and (ii), we will prove only the implication ‘⇒’ since ‘⇐’ is a direct consequence of the continuity of the inclusion Dk ⊂ D.

(i) Suppose A ⊂ D is bounded. If A is not a subset of Dk for any k, choose fk ∈ A \ Dk for every k ∈ N. Then there are xk ∈ R with |xk| > k and εk := |fk(xk)/k| > 0 for every k ∈ N. Let U = {f ∈ D : |f(xk)| < εk for every k ∈ N}. We claim that Dk ∩ U is a neighborhood of 0 in Dk for every k ∈ N. Given k ∈ N, choose m ∈ N such that |xj| < m and 1/m < εj for 1 ≤ j ≤ k. Then Um := {f ∈ Dk : pm(f) < 1/m} ⊂ U, and this establishes the claim. Therefore by Exercise-24(ii), U is a neighborhood of 0 in D. Since (fk) is bounded, we must have (fk/k) → 0. But fk/k ∉ U for any k ∈ N by the definition of U, a contradiction. This contradiction establishes that A ⊂ Dk for some k ∈ N. To show A is bounded in Dk, consider a neighborhood V of 0 in Dk. Then V = Dk ∩ U for some neighborhood U of 0 in D by Exercise-24(ii). Let c > 0 be with cA ⊂ U. Then cA ⊂ cDk ∩ U = Dk ∩ U = V , where cDk = Dk since Dk is a vector subspace.

(ii) This follows from part (i) and Exercise-24(ii) since {f} ∪ {fn : n ∈ N} is bounded.

(iii) This follows from (i) and the completeness of Dk since a Cauchy sequence is bounded.

(iv) If D is metrizable, then it is a complete metric space by (iii). Now $\mathcal{D} = \bigcup_{k=1}^{\infty} \mathcal{D}_k$ is a countable union of proper closed vector subspaces (∵ Dk ≠ D by Exercise-21(ii)). And a proper closed vector subspace must be nowhere dense. Hence we arrive at a contradiction by Baire category theorem. □

Remark: More explicitly, [124](ii) means: (fn) → f in D ⇔ there is k ∈ N such that supp(f) and supp(fn) are subsets of [−k, k] for every n ∈ N and $\lim_{n\to\infty} \|D^j f - D^j f_n\|_\infty = 0$ for every j ≥ 0. In higher dimension, if we consider $\mathcal{D}(W) := C_c^\infty(W, \mathbb{C})$ for an open set $W \subset \mathbb{R}^m$, then (fn) → f in D(W) ⇔ there is a compact set K ⊂ W such that supp(f), supp(fn) ⊂ K for every n, and $(D^\alpha f_n) \to D^\alpha f$ uniformly on K for each multi-index α.

Seminar topic: The spaces E, Dk and D have the Heine-Borel property: closed and bounded subsets are compact (see 1.46 and 6.7 in Rudin, Functional Analysis).

Even though D is not metrizable, continuity can be characterized using sequences:

Exercise-25: Let Y be a locally convex vector space and T : D → Y be linear. Then TFAE:

(i) T is continuous.

(ii) $T|_{\mathcal{D}_k}$ is continuous for each k ∈ N.

(iii) T is bounded.

(iv) (fn) → 0 in D implies (T fn) → 0 in Y .

[Hint: (i) ⇔ (ii) can be deduced using Exercise-24(ii) (a general property of inductive limit topology), and (i) ⇔ (iii) is routine Functional Analysis. We have (iv) ⇒ (ii) since Dk is metrizable.]

Remark: Since $p_N(D^j f) \le p_{N+j}(f)$, the map $f \mapsto D^j f$ is continuous on Dk for every k, j ∈ N, and hence $f \mapsto D^j f$ from D to D is continuous for every j ∈ N by Exercise-25.

11. The Schwartz space S

We have Cc(R) ⊂ C0(R) ⊂ C(R), where the members f ∈ C0(R) are defined in terms of the decay of f at infinity. Similarly, now we will define⁷ the Schwartz space S of smooth functions with D ⊂ S ⊂ E, where the members f ∈ S are defined by requiring that f and all its derivatives decay rapidly at infinity. Later we will see that S is the natural domain for the Fourier transform.

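Exercise-26: Write $x^i f$ for the function $x \mapsto x^i f(x)$. For f ∈ C(R), the following are equivalent:

(i) $x^i f \in L^\infty(\mathbb{R})$ for i = 0, 1, 2, ....

(ii) $(1 + |x|)^i f \in L^\infty(\mathbb{R})$ for i = 0, 1, 2, ....

(iii) $x^i f \in C_0(\mathbb{R})$ for i = 0, 1, 2, ....

(iv) $(1 + |x|)^i f \in C_0(\mathbb{R})$ for i = 0, 1, 2, ....

[Hint: Since $|x|^i \le (1 + |x|)^i \le \sum_{j=0}^{i} c_j |x|^j \le C|x|^i$ for |x| ≥ 1, we get (i) ⇔ (ii) and (iii) ⇔ (iv). If $\|x^{i+1} f\|_\infty < \infty$, then $\lim_{|x|\to\infty} |x|^i |f(x)| = 0$, and this shows (i) ⇒ (iii).]

Definition: Define the Schwartz space S as below and note $\mathcal{D} \subset \mathcal{S} \subset C_0^\infty(\mathbb{R}) \subset \mathcal{E}$.

$$\mathcal{S} := \{f \in \mathcal{E} : x^i D^j f \in L^\infty(\mathbb{R})\ \forall\, i, j \ge 0\} = \{f \in \mathcal{E} : (1 + |x|)^i D^j f \in L^\infty(\mathbb{R})\ \forall\, i, j \ge 0\}$$
$$= \{f \in \mathcal{E} : x^i D^j f \in C_0(\mathbb{R})\ \forall\, i, j \ge 0\} = \{f \in \mathcal{E} : (1 + |x|)^i D^j f \in C_0(\mathbb{R})\ \forall\, i, j \ge 0\}.$$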

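Example: The map $x \mapsto e^{-x^2}$ belongs to S, but $x \mapsto (1 + x^2)^{-1}$ is not a member of S; neither is the smooth map $x \mapsto e^{-1/x^2}\,1_{(0,\infty)}(x)$, which tends to 1 as x → ∞.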
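Membership in S requires sup |x|^i |D^j f(x)| to be finite for all i, j. The sketch below looks only at the case j = 0 and compares the Gaussian x ↦ exp(−x²) (a standard member of S, taken here as an assumed example) with x ↦ 1/(1 + x²). The supremum is approximated by a maximum over a finite grid on a window [−radius, radius]; all names are hypothetical.

```python
import math

def gaussian(x):
    return math.exp(-x * x)

def cauchy_like(x):
    return 1.0 / (1.0 + x * x)

def weighted_sup(f, i, radius, points_per_unit=50):
    """Grid approximation of max |x|^i |f(x)| over the window [-radius, radius]."""
    n = int(radius * points_per_unit)
    return max(abs(k / points_per_unit) ** i * abs(f(k / points_per_unit))
               for k in range(-n, n + 1))
```

For the Gaussian, weighted_sup(gaussian, 3, radius) stops changing once the window is moderately large, whereas weighted_sup(cauchy_like, 3, radius) keeps growing roughly like the radius, so x³/(1 + x²) is unbounded and 1/(1 + x²) fails the decay requirement.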

Exercise-27: If we put $s_{i,j}(f) = \|x^i D^j f\|_\infty$, then the family $\{s_{i,j} : i, j \ge 0\}$ of seminorms defines a locally convex topology on the vector space S. Moreover:

(i) The family $\{s_{i,j} : i, j \ge 0\}$ of seminorms is separating but is not a directed family. However, if we put $q_N = \sum_{0 \le i,j \le N} s_{i,j}$, then $\{q_N : N \in \mathbb{N}\}$ is a separating directed family of seminorms on S inducing the same topology on S.

(ii) Let $V_N = \{f \in \mathcal{S} : q_N(f) < 1/N\} = \{f \in \mathcal{S} : \sum_{0 \le i,j \le N} s_{i,j}(f) < 1/N\}$. Then $\{V_N : N \in \mathbb{N}\}$ is a local base at 0 for S.

(iii) A linear functional ϕ : S → C is continuous ⇔ there exist C > 0 and N ∈ N such that $|\phi(f)| \le Cq_N(f) = C\sum_{0 \le i,j \le N} s_{i,j}(f)$ for every f ∈ S.

(iv) For f, fn ∈ S, we have (fn) → f in S ⇔ for every N ∈ N, there is n0 ∈ N such that $q_N(f - f_n) = \sum_{0 \le i,j \le N} s_{i,j}(f - f_n) < 1/N$ for every n ≥ n0.

(v) S admits an invariant metric that is complete. Thus S is a Fréchet space.

[Hint: After verifying (i), deduce the other statements using [123] and the hint of Exercise-22(ii).]

⁷ L. Schwartz, who introduced the theory of distributions, is different from H.A. Schwarz of the Cauchy-Schwarz inequality.

Exercise-28: (i) D is dense in S.

(ii) The inclusions $\mathcal{D}_k \subset \mathcal{S}$, (hence) $\mathcal{D} \subset \mathcal{S}$, $\mathcal{S} \subset (C_0^\infty(\mathbb{R}), \|\cdot\|_\infty)$, and $\mathcal{S} \subset \mathcal{E}$ are all continuous.

(iii) For 1 ≤ p < ∞ we have $\mathcal{S} \subset L^p(\mathbb{R})$ and the inclusion is continuous.

(iv) S is closed under the following linear maps: $f \mapsto x^i f$ for every i ≥ 0 (and hence under $f \mapsto gf$ for any polynomial g), and $f \mapsto D^j f$ for every j ≥ 0. These maps are continuous on S.

[Hint: (i) Let f ∈ S. By Exercise-23(i), there is g ∈ D, 0 ≤ g ≤ 1, with g ≡ 1 on [−1, 1]. Define gn(x) = g(x/n) so that fgn ∈ D and fgn ≡ f on [−n, n]. Note that $D^j(f - fg_n) = \sum_{k=0}^{j} c_k D^k f\, D^{j-k}(1 - g_n)$. Let $M = \max_{0 \le k \le j} \|D^{j-k}(1 - g)\|_\infty$. Then $\sup_n \|D^{j-k}(1 - g_n)\|_\infty \le M$ for 0 ≤ k ≤ j. Therefore, $s_{i,j}(f - fg_n) \le \sup_{|x|>n} M \sum_{k=0}^{j} c_k |x^i D^k f(x)| \to 0$. (ii) Fix N > k. Let $M = \max\{|x^i| : 0 \le i \le N, |x| \le N\}$. Then $q_N(f) = \sum_{0 \le i,j \le N} s_{i,j}(f) \le \sum_{0 \le i,j \le N} M p_N(f)$ for f ∈ Dk. Hence $\mathcal{D}_k \subset \mathcal{S}$ is continuous. Next, $\mathcal{S} \subset C_0^\infty(\mathbb{R})$ and $\mathcal{S} \subset \mathcal{E}$ are continuous since $\|f\|_\infty = q_0(f)$ and $p_N(f) \le q_N(f)$. (iii) Let $g(x) = 1/(1 + |x|)^2$. Note $g \in L^p(\mathbb{R})$. Let C > 0 be with $(1 + |x|)^2 \le C|x|^2$ for |x| ≥ 1. For f ∈ S, we have $\int_{\mathbb{R}} |f|^p \le \int_{|x| \le 1} |f|^p + \int_{|x|>1} |Cx^2 gf|^p \le 2s_{0,0}(f)^p + C^p s_{2,0}(f)^p \|g\|_p^p$. (iv) For continuity, note $s_{i_1,j}(x^{i_2} f) \le s_{i_1+i_2,j}(f)$ and $s_{i,j_1}(D^{j_2} f) \le s_{i,j_1+j_2}(f)$.]

[125] If f, g ∈ S, then f ∗ g ∈ S. And the bilinear map (f, g) ↦ f ∗ g from S² to S is continuous.

Proof. Let $f, g \in \mathcal{S} \subset C_0^\infty(\mathbb{R}) \cap L^1(\mathbb{R})$. The second inclusion ensures f ∗ g is defined. Fix i, j ≥ 0. We need to show $x^i D^j(f * g) \in L^\infty(\mathbb{R})$. Let $h = D^j g$, which belongs to S. By [121](iv), $f * g \in C_0^\infty(\mathbb{R})$ and $D^j(f * g) = f * h$. Writing $x^i = (y + (x - y))^i = \sum_{k=0}^{i} c_k y^k (x - y)^{i-k}$, observe that

$$x^i (f * h)(x) = \int_{\mathbb{R}} x^i f(y)h(x - y)\,dy = \sum_{k=0}^{i} c_k \int_{\mathbb{R}} y^k f(y)(x - y)^{i-k} h(x - y)\,dy = \sum_{k=0}^{i} c_k\, (x^k f * x^{i-k} h)(x).$$

Since f, h ∈ S, we have $x^k f, x^{i-k} h \in \mathcal{S} \subset C_0^\infty(\mathbb{R}) \cap L^1(\mathbb{R})$. So by [121](iv) we get $x^k f * x^{i-k} h \in C_0^\infty(\mathbb{R})$. Hence $x^i (f * h) \in C_0^\infty(\mathbb{R}) \subset L^\infty(\mathbb{R})$, and this completes the proof that f ∗ g ∈ S.

To establish the continuity of the bilinear map (f, g) ↦ f ∗ g, consider sequences (fn) → 0 and (gn) → 0 in S. In view of Exercise-27(iv), we need to show $\lim_{n\to\infty} s_{i,j}(f_n * g_n) = 0$ for every i, j ≥ 0. From the above arguments, $s_{i,j}(f_n * g_n) = \|x^i D^j(f_n * g_n)\|_\infty \le \sum_{k=0}^{i} c_k \|x^k f_n * x^{i-k} D^j g_n\|_\infty$. By Exercise-20, $\|x^k f_n * x^{i-k} D^j g_n\|_\infty \le \|x^k f_n\|_1 \|x^{i-k} D^j g_n\|_\infty$. The right hand side goes to 0 because: $(x^k f_n) \to 0$ and $(x^{i-k} D^j g_n) \to 0$ in S by Exercise-28(iv), and then $\|x^k f_n\|_1 \to 0$ and $\|x^{i-k} D^j g_n\|_\infty \to 0$ since the inclusions $\mathcal{S} \subset L^1(\mathbb{R})$ and $\mathcal{S} \subset (C_0^\infty(\mathbb{R}), \|\cdot\|_\infty)$ are continuous by parts (iii) and (ii) of Exercise-28. Hence $\lim_{n\to\infty} s_{i,j}(f_n * g_n) = 0$. □

Remark: Similarly, it can be shown that the map (f, g) ↦ f ∗ g is bilinear and continuous from $\mathcal{D}^2$ to $\mathcal{D}$, from $\mathcal{D} \times C_0^\infty(\mathbb{R})$ to $C_0^\infty(\mathbb{R})$, etc. (try to write the proofs for some of them).

12. Distributions: preliminaries

Motivations for introducing distributions: (i) Two fundamental operations in Calculus are differentiation and integration. People say that the theory of distributions is a completion of the theory of ordinary differentiation just as the theory of Lebesgue integration is a completion of the theory of Riemann integration.

(ii) We will show that all distributions (also called generalized functions) are differentiable in a certain sense⁸. It will turn out that all locally integrable functions and all locally finite Borel measures are distributions, and hence we can differentiate them!

(iii) Certain partial differential equations may have meaningful solutions that are differentiable only almost everywhere. The framework of distributions provides a proper place for such solutions.

(iv) Distributions provide a mathematical framework in which certain 'functions' such as the Dirac delta δ that are not functions in the ordinary sense (δ(0) = ∞ and δ ≡ 0 on R \ {0}) obtain the proper status of functions.

(v) The theory of Fourier transform can be elegantly developed playing with the Schwartz class and tempered distributions (this is the approach in Grafakos, Classical Fourier Analysis).

For a topological vector space X, let X′ = {ϕ : X → C : ϕ is linear and continuous} be its dual. We will write ⟨ϕ, x⟩ for ϕ(x) when x ∈ X and ϕ is a map from X to C.

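Definition: Recall the spaces $\mathcal{D} \subset \mathcal{S} \subset \mathcal{E}$, where $\mathcal{D} = (C_c^\infty(\mathbb{R}), \mathcal{T}_{ind})$, $\mathcal{E} = C^\infty(\mathbb{R})$, and $\mathcal{S}$ is the Schwartz space. Let $\mathcal{D}'$, $\mathcal{S}'$, and $\mathcal{E}'$ be their duals respectively. We have $\mathcal{D}' \supset \mathcal{S}' \supset \mathcal{E}'$. The members of $\mathcal{D}'$ are called distributions, the members of $\mathcal{S}'$ are called tempered distributions, and the members of $\mathcal{E}'$ are called distributions with compact support (this terminology will be clarified shortly). From the earlier theory, we have the following characterization of members of $\mathcal{D}'$, $\mathcal{S}'$, and $\mathcal{E}'$.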

[126] Recall that $p_N(f) = \max_{0 \le k \le N} \max\{|f^{(k)}(x)| : |x| \le N\}$, $s_{i,j}(f) = \|x^i D^j f\|_\infty$, and $q_N = \max_{0 \le i,j \le N} s_{i,j}$.

(i) A linear functional $\phi : \mathcal{D} \to \mathbb{C}$ belongs to $\mathcal{D}'$ ⇔ $\phi|_{\mathcal{D}_k}$ is continuous for each $k \in \mathbb{N}$ ⇔ for each $k \in \mathbb{N}$, there exist $C > 0$ and $N \in \mathbb{N}$ such that $|\langle \phi, f\rangle| \le C p_N(f)$ for every $f \in \mathcal{D}_k$.

(ii) A linear functional $\phi : \mathcal{S} \to \mathbb{C}$ belongs to $\mathcal{S}'$ ⇔ there exist $C > 0$ and $N \in \mathbb{N}$ such that $|\langle \phi, f\rangle| \le C q_N(f) \le C \sum_{0 \le i,j \le N} s_{i,j}(f)$ for every $f \in \mathcal{S}$.

(iii) A linear functional $\phi : \mathcal{E} \to \mathbb{C}$ belongs to $\mathcal{E}'$ ⇔ there exist $C > 0$ and $N \in \mathbb{N}$ such that $|\langle \phi, f\rangle| \le C p_N(f)$ for every $f \in \mathcal{E}$. Consequently, for every $\phi \in \mathcal{E}'$, there is $N \in \mathbb{N}$ with the following property: whenever $f \in \mathcal{E}$ satisfies $D^k f = 0$ for $0 \le k \le N$, then $\langle \phi, f\rangle = 0$ (the smallest such N is sometimes called the order of ϕ; thus every $\phi \in \mathcal{E}'$ is of finite order).

⁸ The generalized derivative of functions in the sense of distributions is often called the weak derivative.

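Examples: (i) (Distributions generalize the notion of locally integrable functions) We will show $\mathcal{E} \subset L^1_{loc}(\mathbb{R}) \subset \mathcal{D}'$. The first inclusion is clear. Now, any $g \in L^1_{loc}(\mathbb{R})$ induces a linear functional $\phi_g : \mathcal{D} \to \mathbb{C}$ by the expression $\langle \phi_g, f\rangle = \phi_g(f) := \int_{\mathbb{R}} f(x) g(x)\,dx$. Fix $k \in \mathbb{N}$. Then $\|f\|_\infty \le p_k(f)$ for every $f \in \mathcal{D}_k$, and for $C := \int_{-k}^{k} |g|$ we have $|\langle \phi_g, f\rangle| \le C p_k(f)$. Thus $\phi_g$ is continuous by [126](i) and $\phi_g \in \mathcal{D}'$. From now onwards we will just write $\langle g, f\rangle$ for $\langle \phi_g, f\rangle$ if no confusion can arise.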

(ii) (Tempered distributions generalize the notion of $L^p$-functions and polynomials) Let $1 \le p \le \infty$, and we will show $L^p(\mathbb{R}) \subset \mathcal{S}'$. If $g \in L^\infty(\mathbb{R})$, then by the hint of Exercise-28(iii), there is $C > 0$ such that $\|f\|_1 \le C \sum_{0 \le i,j \le 2} s_{i,j}(f)$ for every $f \in \mathcal{S}$, and hence $|\langle g, f\rangle| \le \|g\|_\infty \int |f| \le \|g\|_\infty C \sum_{0 \le i,j \le 2} s_{i,j}(f)$. Therefore g (more precisely, the functional $f \mapsto \langle g, f\rangle$) belongs to $\mathcal{S}'$. Next suppose $1 \le p < \infty$, let $g \in L^p(\mathbb{R})$ and let $1 < q \le \infty$ be such that $\frac{1}{p} + \frac{1}{q} = 1$. Let $h(x) = (1 + |x|)^{-2}$. Since $h \in L^q(\mathbb{R})$, we get by Hölder's inequality that $\int_{\mathbb{R}} |gh| \le \|g\|_p \|h\|_q$. Let $C > 0$ be with $(1 + |x|)^2 \le C|x|^2$ for $|x| \ge 2$. Since $\int_{|x| \le 2} (1 + |x|)^2\,dx \le \int_{|x| \le 2} 9\,dx = 36$, we have

$|\langle g, f\rangle| \le \left(\int_{|x| \le 2} + \int_{|x| > 2}\right) |gh| (1 + |x|)^2 |f|\,dx \le (36\, s_{0,0}(f) + C s_{2,0}(f)) \|g\|_p \|h\|_q$,

which shows $g \in \mathcal{S}'$ by [126](ii). More generally, if $g : \mathbb{R} \to \mathbb{R}$ satisfies $\int_{\mathbb{R}} |g(x)| (1 + |x|)^{-m}\,dx < \infty$ for some $m \ge 2$, then by an argument similar to the one given above, we can show $g \in \mathcal{S}'$. In particular, if $g : \mathbb{R} \to \mathbb{R}$ is a polynomial, then $g \in \mathcal{S}'$.

(iii) (Distributions with compact support generalize the notion of $L^p$-functions with bounded support⁹) If $g \in \mathcal{D}_k \subset L^1(\mathbb{R})$, then $|\langle g, f\rangle| \le p_k(f) \int_{-k}^{k} |g| \le \|g\|_1 p_k(f)$ for $f \in \mathcal{E}$. This shows $\mathcal{D} \subset \mathcal{E}'$. More generally, if $1 \le p \le \infty$ and $g \in L^p(\mathbb{R})$ satisfies $\mathrm{supp}(g) \subset [-N, N]$ for some $N \in \mathbb{N}$, then putting $h = 1_{[-N,N]}$ we see $|\langle g, f\rangle| \le p_N(f) \int_{-N}^{N} |g| \le p_N(f) \int_{\mathbb{R}} |gh| \le \|g\|_p \|h\|_q\, p_N(f)$ for $f \in \mathcal{E}$, where $\frac{1}{p} + \frac{1}{q} = 1$. This shows that for $1 \le p \le \infty$, any $g \in L^p(\mathbb{R})$ with bounded support belongs to $\mathcal{E}'$.

(iv) The inclusions $\mathcal{E}' \subset \mathcal{S}' \subset \mathcal{D}'$ are proper. Let $g(x) = e^{x^2}$. Then $g \in L^1_{loc}(\mathbb{R}) \subset \mathcal{D}'$, but $g \notin \mathcal{S}'$ because for $f \in \mathcal{S}$ defined as $f(x) = e^{-x^2}$, we have $\langle g, f\rangle = \int_{\mathbb{R}} fg = \int_{\mathbb{R}} 1 = \infty$. Similarly, if $g \equiv 1$, then $g \in \mathcal{S}'$ (being a polynomial) but $g \notin \mathcal{E}'$ since $g \in \mathcal{E}$ and $\int_{\mathbb{R}} g g = \infty$.

⁹ When we say a general function has bounded/compact support, it means the function is identically zero outside a bounded/compact set.

(v) (Locally finite measures are distributions) A Borel measure µ on R is locally finite¹⁰ if $\mu(K) < \infty$ for every compact set $K \subset \mathbb{R}$. If µ is a locally finite Borel measure on R, then $\mu \in \mathcal{D}'$, where we identify µ with the linear functional $f \mapsto \langle \mu, f\rangle := \int_{\mathbb{R}} f\,d\mu$. This linear functional is indeed continuous since for each $k \in \mathbb{N}$, we have $|\langle \mu, f\rangle| \le (2k+1)\mu([-k,k]) \|f\|_\infty \le (2k+1)\mu([-k,k])\, p_k(f)$ for every $f \in \mathcal{D}_k$. If in addition µ is compactly supported, then a similar argument gives $\mu \in \mathcal{E}'$.

In particular, for each $a \in \mathbb{R}$, the Dirac measure $\delta_a$ defined as $\delta_a(Y) = 1$ if $a \in Y$ and $\delta_a(Y) = 0$ if $a \notin Y$ belongs to $\mathcal{E}'$, and $\langle \delta_a, f\rangle = \int_{\mathbb{R}} f\,d\delta_a = f(a)$, which is just the evaluation map at a.

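(vi) (Cauchy's principal value $PV\frac{1}{x}$) The map $x \mapsto 1/x$ is locally integrable on $\mathbb{R} \setminus \{0\}$, but not on R because $\int_0^1 \frac{1}{x}\,dx = \lim_{\varepsilon \to 0} \int_\varepsilon^1 \frac{1}{x}\,dx = \lim_{\varepsilon \to 0} \log x \big|_\varepsilon^1 = \infty$. Still, there is a distribution corresponding to this map, which we introduce now. Let $f \in \mathcal{D}$, and write $f(x) = f(0) + x g(x)$ for $g \in \mathcal{E}$. Since $f(-\varepsilon) - f(\varepsilon) = -\varepsilon(g(-\varepsilon) + g(\varepsilon))$, integration by parts yields

$\int_{|x| > \varepsilon} x^{-1} f(x)\,dx = \left(\int_{-\infty}^{-\varepsilon} + \int_{\varepsilon}^{\infty}\right) x^{-1} f(x)\,dx = -\varepsilon (g(-\varepsilon) + g(\varepsilon)) \log \varepsilon - \left(\int_{-\infty}^{-\varepsilon} + \int_{\varepsilon}^{\infty}\right) Df(x) \log |x|\,dx$.

Since $g(0) = Df(0)$, the function g is bounded near 0 and we have $\lim_{\varepsilon \to 0} \varepsilon (g(-\varepsilon) + g(\varepsilon)) \log \varepsilon = 0$. Hence $\lim_{\varepsilon \to 0} \int_{|x| > \varepsilon} x^{-1} f(x)\,dx = -\int_{-\infty}^{\infty} Df(x) \log |x|\,dx$. The distribution $PV\frac{1}{x}$, called Cauchy's principal value, is defined as

$\langle PV\frac{1}{x}, f\rangle = \lim_{\varepsilon \to 0} \int_{|x| > \varepsilon} x^{-1} f(x)\,dx = -\int_{\mathbb{R}} Df(x) \log |x|\,dx$

for $f \in \mathcal{D}$. The function $h(x) := \log |x|$ belongs to $L^1_{loc}(\mathbb{R})$ because $\int_0^b \log x\,dx = \lim_{\varepsilon \to 0} \int_\varepsilon^b \log x\,dx = \lim_{\varepsilon \to 0} [x \log x - x]_\varepsilon^b = b \log b - b$. Hence h is (induces) a distribution, and $\langle PV\frac{1}{x}, f\rangle = -\langle h, Df\rangle$, from which we may deduce that $PV\frac{1}{x}$ is indeed a distribution.

Idea behind the definition of various operations on distributions: If $T : \mathcal{D} \to \mathcal{D}$ is a linear operator, then correspondingly there is a transpose operator $T^t : \mathcal{D}' \to \mathcal{D}'$ given by $T^t\phi(f) = \phi(Tf)$ for $\phi \in \mathcal{D}'$ and $f \in \mathcal{D}$. The defining expression $T^t\phi(f) = \phi(Tf)$ may be written as $\langle T^t\phi, f\rangle = \langle \phi, Tf\rangle$. When $\phi = \phi_g$ for a genuine function g, the defining expression becomes $\int (T^t g)(x) f(x)\,dx = \int g(x) (Tf)(x)\,dx$. This observation tells us how to define various operations on distributions.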

Definition: Let $\phi \in \mathcal{D}'$ and $f \in \mathcal{D}$.

(i) (Translation) Recall the notation $f_y(x) = f(x - y)$. Observe that $\int g_y(x) f(x)\,dx = \int g(z) f_{-y}(z)\,dz$. Hence we define the translation $\phi_y$ of ϕ as $\langle \phi_y, f\rangle = \langle \phi, f_{-y}\rangle$ for $y \in \mathbb{R}$.

(ii) (Dilation/scaling) Let $a \in \mathbb{R} \setminus \{0\}$. Since $\int g(ax) f(x)\,dx = \int g(z) f(z/a) |a|^{-1}\,dz$, we define the dilation $\phi(a\,\cdot)$ of ϕ as $\langle \phi(a\,\cdot), f\rangle = \langle \phi, |a|^{-1} f(\cdot/a)\rangle$. Remark: In $\mathbb{R}^n$, $\langle \phi(a\,\cdot), f\rangle := \langle \phi, |a|^{-n} f(\cdot/a)\rangle$.

(iii) (Reflection) Let $\tilde{f}(x) = f(-x)$. Since $\int_{\mathbb{R}} \tilde{g} f = \int_{\mathbb{R}} g \tilde{f}$, we define $\langle \tilde{\phi}, f\rangle = \langle \phi, \tilde{f}\rangle$.

(iv) (Smooth multiplication) Since $\int (hg) f = \int g (hf)$, we define $h\phi \in \mathcal{D}'$ as $\langle h\phi, f\rangle = \langle \phi, hf\rangle$ for $h \in \mathcal{E}$. If $h \in \mathcal{D}$, then $h\phi \in \mathcal{E}'$, and $h\phi$ is called a localization of ϕ.

(v) (Differentiation) Integration by parts gives $\int (Dg) f = gf \big|_{-\infty}^{\infty} - \int g\,Df = -\int g\,Df$ since f has compact support. Hence we define the derivative $D\phi$ of ϕ as $\langle D\phi, f\rangle = -\langle \phi, Df\rangle$ (do not forget the minus sign!). If ϕ comes from a function g, i.e., if $\langle \phi, f\rangle = \int fg$, then $D\phi$ is called the distributional derivative of g, and is also written as Dg. In this sense, every $g \in L^1_{loc}(\mathbb{R})$ has a distributional derivative.

¹⁰ A locally finite Borel measure on spaces such as $\mathbb{R}^n$ is generally called a Radon measure.

Remark: The above operations are also defined for ϕ ∈ S′ and ϕ ∈ E′ with a modification: we need to assume h ∈ D in (iv) to ensure that hϕ ∈ S′ when ϕ ∈ S′.
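The transpose recipe above can be mimicked numerically. The following is only a sketch under stated assumptions: a distribution is modelled as a Python callable acting on test functions, the operator D on test functions is replaced by a central difference, and all names (`bump`, `delta`, `translate`, `reflect`, `multiply`, `D`) are hypothetical helpers introduced for illustration.

```python
import math

def bump(x):
    """Test function in D: smooth, supported in [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def bump_prime(x):
    """Exact derivative of bump, used only to check the numerical result."""
    return bump(x) * (-2.0 * x / (1.0 - x * x) ** 2) if abs(x) < 1.0 else 0.0

def derivative(f, step=1e-5):
    """Central-difference stand-in (an approximation) for D on test functions."""
    return lambda x: (f(x + step) - f(x - step)) / (2.0 * step)

def delta(a):
    """Dirac measure at a: <delta_a, f> = f(a)."""
    return lambda f: f(a)

def translate(phi, y):
    """<phi_y, f> = <phi, f_{-y}>, where f_{-y}(x) = f(x + y)."""
    return lambda f: phi(lambda x: f(x + y))

def reflect(phi):
    """<phi~, f> = <phi, f~>, where f~(x) = f(-x)."""
    return lambda f: phi(lambda x: f(-x))

def multiply(h, phi):
    """<h phi, f> = <phi, h f> for a smooth function h."""
    return lambda f: phi(lambda x: h(x) * f(x))

def D(phi):
    """<D phi, f> = -<phi, D f>; note the minus sign."""
    return lambda f: -phi(derivative(f))

shifted = translate(delta(0.0), 0.3)(bump)            # equals bump(0.3): translating delta_0 gives delta_{0.3}
d_delta = D(delta(0.3))(bump)                         # close to -bump'(0.3)
scaled = multiply(lambda x: x, D(delta(0.0)))(bump)   # <x D(delta_0), f> = -(x f)'(0) = -f(0)
```

The last line illustrates the identity x·Dδ0 = −δ0, which follows directly from definitions (iv) and (v).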

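Example: (i) Let $h = 1_{(0,\infty)} : \mathbb{R} \to \mathbb{R}$, which is called the Heaviside function. We have $\langle Dh, f\rangle = -\langle h, Df\rangle = -\int_0^\infty Df = f(0) = \langle \delta_0, f\rangle$ by the Fundamental theorem of calculus for $f \in \mathcal{D}$. Hence $Dh = \delta_0$, the Dirac measure at 0. Note further that $\langle D^m \delta_a, f\rangle = (-1)^m D^m f(a)$ for $f \in \mathcal{E}$, $a \in \mathbb{R}$ and $m \in \mathbb{N}$.

(ii) From the identity $\langle PV\frac{1}{x}, f\rangle = -\langle h, Df\rangle = \langle Dh, f\rangle$ with $h(x) = \log|x|$ obtained in an earlier discussion, we find that $PV\frac{1}{x}$ is the distributional derivative of the locally integrable function $x \mapsto \log |x|$.

(iii) Let $g : \mathbb{R} \to \mathbb{C}$ be absolutely continuous (then $g \in L^1_{loc}(\mathbb{R}) \subset \mathcal{D}'$), $h \in L^1_{loc}(\mathbb{R})$, and assume $g'(x) = h(x)$ for almost every $x \in \mathbb{R}$. For any $f \in \mathcal{D}$, integration by parts gives $\langle Dg, f\rangle = -\langle g, Df\rangle = -\int g\,Df = -gf \big|_{-\infty}^{\infty} + \int fh = 0 + \int fh = \langle h, f\rangle$. Thus $Dg = h$ in the sense of distributions.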
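As a numerical sanity check of the pairings ⟨Dh, f⟩ = −∫_0^∞ Df = f(0) and ⟨PV 1/x, f⟩ = −∫ Df(x) log|x| dx, one can evaluate both sides by quadrature. This is a sketch, not part of the theory: the test function f(x) = (1 + x)e^{−x²} is a hypothetical choice that is Schwartz rather than compactly supported, the integrals are truncated to [−8, 8], and `midpoint` is an illustrative helper. For this f, the odd part of f(x)/x cancels, so the principal value equals ∫ e^{−x²} dx = √π.

```python
import math

def f(x):
    # Hypothetical test function (Schwartz, not compactly supported).
    return (1.0 + x) * math.exp(-x * x)

def df(x):
    # Exact derivative Df of the test function above.
    return (1.0 - 2.0 * x - 2.0 * x * x) * math.exp(-x * x)

def midpoint(func, a, b, n):
    # Composite midpoint rule; the nodes never coincide with a or b.
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

# <Dh, f> = -<h, Df> = -int_0^infty Df, expected to be close to f(0).
heaviside_pairing = -midpoint(df, 0.0, 8.0, 80_000)

# -<log|x|, Df>; an even number of cells keeps x = 0 off the midpoint grid.
log_pairing = -midpoint(lambda x: df(x) * math.log(abs(x)), -8.0, 8.0, 160_000)

# Symmetric truncation of int f(x)/x dx to eps < |x| < 8.
eps = 1e-3
pv_truncated = (midpoint(lambda x: f(x) / x, -8.0, -eps, 80_000)
                + midpoint(lambda x: f(x) / x, eps, 8.0, 80_000))
```

Shrinking `eps` moves `pv_truncated` towards `log_pairing`, in line with the limit defining the principal value.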

Exercise-29: (Smooth partition of unity) If K ⊂ R is a nonempty compact set and U1,...,Uk ⊂ R are nonempty open sets covering K, then there are g1, . . . , gk ∈ D such that

(i) $0 \le g_j \le 1$ and $\mathrm{supp}(g_j) \subset U_j$ for $1 \le j \le k$.

(ii) $\sum_{j=1}^{k} g_j(x) = 1$ for every $x \in K$.

Here, we say {gj : 1 ≤ j ≤ k} is a smooth partition of unity for K subordinate to {Uj : 1 ≤ j ≤ k}. [Hint: See my notes Introduction to Manifolds.]

Now we will explain why the members of E′ are called ‘distributions with compact support’.

Definition We say ϕ ∈ D′ vanishes (or ϕ is 0) in an open set U ⊂ R if ⟨ϕ, f⟩ = 0 for every f ∈ D with supp(f) ⊂ U. We define the support of ϕ, supp(ϕ), as the complement of the largest open subset U ⊂ R on which ϕ vanishes. Exercise-30(i) below ensures that this definition is meaningful.

Example: For the Dirac measure $\delta_a$ at $a \in \mathbb{R}$, we have $\mathrm{supp}(\delta_a) = \{a\}$.

Exercise-30: (i) If $\phi \in \mathcal{D}'$ vanishes on open sets $U_1, \ldots, U_k \subset \mathbb{R}$, then ϕ vanishes on $\bigcup_{j=1}^{k} U_j$.

(ii) If $\phi \in \mathcal{D}'$ and $f \in \mathcal{D}$ have disjoint supports, then $\langle \phi, f\rangle = 0$.

(iii) Let $\phi \in \mathcal{D}'$, and $f, g \in \mathcal{D}$. If f and g agree on a neighborhood of supp(ϕ), then $\langle D^j\phi, f\rangle = \langle D^j\phi, g\rangle$ for every $j \ge 0$.

(iv) If $g \in \mathcal{D}$, and $\phi_g \in \mathcal{D}'$ is given as $\langle \phi_g, f\rangle = \int_{\mathbb{R}} fg$, then $\mathrm{supp}(\phi_g) = \mathrm{supp}(g)$.

[Hint: (i) Let $f \in \mathcal{D}$ be with $K := \mathrm{supp}(f) \subset \bigcup_{j=1}^{k} U_j$. Let $\{g_j : 1 \le j \le k\}$ be a smooth partition of unity for K subordinate to $\{U_j : 1 \le j \le k\}$ given by Exercise-29. Then $f = \sum_{j=1}^{k} g_j f$, and by hypothesis $\langle \phi, g_j f\rangle = 0$ for $1 \le j \le k$. Hence $\langle \phi, f\rangle = 0$. Statement (iii) is a consequence of (ii) and linearity. (iv) Assume g is real valued. Let $K = \mathrm{supp}(g)$ and $L = \mathrm{supp}(\phi_g)$. Clearly, $L \subset K$. If $L \ne K$, using the smoothness of g, find a nondegenerate closed interval $J \subset K \setminus L$ such that $g > 0$ (or $g < 0$) on J. Let $h \in \mathcal{D}$ be a bump function such that $h \ge 0$, $h \equiv 1$ on a neighborhood of J where $g > 0$, and $\mathrm{supp}(h) \cap L = \emptyset$. Then $0 < \int_J g \le \int_{\mathbb{R}} hg = \langle \phi_g, h\rangle = 0$, a contradiction.]

Remark: In Exercise-30(iii), agreement on supp(ϕ) is not sufficient: $f(x) = x$ and $g \equiv 0$ agree on $\mathrm{supp}(\delta_0) = \{0\}$, but $\langle D\delta_0, f\rangle = -\langle \delta_0, Df\rangle = -Df(0) = -1 \ne 0 = -Dg(0) = \langle D\delta_0, g\rangle$.

[127] (Justification of a name) Let ϕ ∈ D′. Then ϕ ∈ E′ ⇔ supp(ϕ) is compact.

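Proof. ⇒: If supp(ϕ) is not compact, there is a sequence $(f_k)$ in $\mathcal{D}$ such that $[-k, k] \cap \mathrm{supp}(f_k) = \emptyset$ and $a_k := \langle \phi, f_k\rangle \ne 0$ for every $k \in \mathbb{N}$. Let $h_k = a_k^{-1} f_k$. Since $[-N, N] \cap \mathrm{supp}(D^j h_k) = \emptyset$ for every $k \ge N$ and every $j \ge 0$, we have $\lim_{k\to\infty} p_N(h_k) = 0$ for every N, and thus $(h_k) \to 0$ in $\mathcal{E}$. On the other hand, $\langle \phi, h_k\rangle = 1 \not\to 0$. Hence $\phi \notin \mathcal{E}'$. We add the remark that though $(h_k) \to 0$ in $\mathcal{E}$, we have $(h_k) \not\to 0$ in $\mathcal{D}$ since $\bigcup_{k=1}^{\infty} \mathrm{supp}(h_k)$ is not a bounded set.

⇐: Assume supp(ϕ) is compact, and let $k \in \mathbb{N}$ be with $\mathrm{supp}(\phi) \subset (-k, k)$. Since $\phi \in \mathcal{D}'$, by [126](i) there exist $C > 0$ and $N \in \mathbb{N}$ such that $|\langle \phi, f\rangle| \le C p_N(f)$ for every $f \in \mathcal{D}_k$. Let $h \in \mathcal{D}$, $h \ge 0$, be a bump function with $h \equiv 1$ on a neighborhood of supp(ϕ), and $\mathrm{supp}(h) \subset [-k, k]$. Extend ϕ to $\mathcal{E}$ by putting $\langle \phi, f\rangle = \langle \phi, hf\rangle$ for $f \in \mathcal{E}$. The extension is linear. For every $f \in \mathcal{E}$, we have $hf \in \mathcal{D}_k$, and $|\langle \phi, f\rangle| = |\langle \phi, hf\rangle| \le C p_N(hf) \le C C' p_N(f)$, where the constant $C' = C'_h$ is obtained by using the product rule for $D^j(hf)$ for $0 \le j \le N$. By [126](iii), the extension ϕ belongs to $\mathcal{E}'$. □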

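[128] A summary of some important inclusions that we have seen so far is:

(i) $\mathcal{D} \subset \mathcal{S} \subset \mathcal{E} \subset L^1_{loc}(\mathbb{R}) \subset \mathcal{D}'$.

(ii) $\mathcal{S} \subset L^p(\mathbb{R}) \subset \mathcal{S}'$ for $1 \le p < \infty$; and also $\mathcal{S} \subset C_0^\infty(\mathbb{R}) \subset L^\infty(\mathbb{R}) \subset \mathcal{S}'$.

(iii) $\mathcal{D} \subset \mathcal{E}' \subset \mathcal{S}' \subset \mathcal{D}'$.

Remark: By Exercise-23(ii), we know $\mathcal{D}$ is dense in $\mathcal{E}$. A similar argument will show $\mathcal{E}'$ is dense in $\mathcal{D}'$ as follows. Given $\phi \in \mathcal{D}'$, choose $g \in \mathcal{D}$ with $g \equiv 1$ on $[-1, 1]$, put $g_n(x) = g(x/n)$, and check that $g_n\phi \in \mathcal{E}'$. We have $\langle g_n\phi, f\rangle = \langle \phi, g_n f\rangle \to \langle \phi, f\rangle$ for every $f \in \mathcal{D}$, and this shows $(g_n\phi) \to \phi$ in $\mathcal{D}'$, completing the argument. Later we will see that $\mathcal{D}$ is dense in $\mathcal{D}'$, and $\mathcal{S}$ is dense in $\mathcal{S}'$.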

13. Convolution and distributions

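Recall the notation $\tilde{g}(y) = g(-y)$ so that $g(x - y) = \tilde{g}_x(y)$. For $h, g \in \mathcal{D}$ observe that $h * g(x) = \int h(y) g(x - y)\,dy = \int h(y) \tilde{g}_x(y)\,dy = \langle h, \tilde{g}_x\rangle$. This motivates the following definition.

Definition: (Convolution of a distribution by a smooth function) Let $\phi \in \mathcal{D}'$ and $g \in \mathcal{D}$; or $\phi \in \mathcal{S}'$ and $g \in \mathcal{S}$; or $\phi \in \mathcal{E}'$ and $g \in \mathcal{E}$. Define $\phi * g : \mathbb{R} \to \mathbb{C}$ as $\phi * g(x) = \langle \phi, \tilde{g}_x\rangle$. To prove basic facts about convolution, we require the following technical result:

Exercise-31: Let $\phi \in \mathcal{D}'$, $G \in C^\infty(\mathbb{R}^2)$, and assume that for each $x \in \mathbb{R}$, there are $\delta > 0$ and $k \in \mathbb{N}$ with $\mathrm{supp}(G(x + t, \cdot)) \subset [-k, k]$ for every $t \in (-\delta, \delta)$. Then $h : \mathbb{R} \to \mathbb{C}$ defined as $h(x) = \langle \phi, G(x, \cdot)\rangle$ belongs to $\mathcal{E}$ and $D^m h = \langle \phi, \frac{\partial^m G(x, \cdot)}{\partial x^m}\rangle$ for every $m \in \mathbb{N}$. [Hint: Fix $x \in \mathbb{R}$, let $\delta > 0$ and $k \in \mathbb{N}$ be as given. Note that $t^{-1}(h(x + t) - h(x)) = \langle \phi, t^{-1}(G(x + t, \cdot) - G(x, \cdot))\rangle$ since ϕ is linear. Check by hypothesis that $t^{-1}(G(x + t, \cdot) - G(x, \cdot)) \to \frac{\partial G(x, \cdot)}{\partial x}$ in $\mathcal{D}$ as $t \to 0$. Hence by the continuity of ϕ we get $\lim_{t\to 0} t^{-1}(h(x + t) - h(x)) = \langle \phi, \lim_{t\to 0} t^{-1}(G(x + t, \cdot) - G(x, \cdot))\rangle = \langle \phi, \frac{\partial G(x, \cdot)}{\partial x}\rangle$. Now repeat.]

Remark: The following is an important result in the theory of distributions. It tells you that even if a distribution ϕ is given by a very rough function (for example, by a member of $L^1_{loc}(\mathbb{R})$), a convolution $\phi * g$ of ϕ by an appropriate smooth function g produces a smooth function.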

[129] (i) Let ϕ ∈ D′ and g ∈ D; or ϕ ∈ S′ and g ∈ S; or ϕ ∈ E′ and g ∈ E. Then ϕ ∗ g ∈ E. (ii) If ϕ ∈ E′ and g ∈ D, then ϕ ∗ g ∈ D. (iii) If ϕ ∈ E′ and g ∈ S, then ϕ ∗ g ∈ S. (iv) In all the above cases, we have Dm(ϕ ∗ g) = ϕ ∗ Dmg for every m ∈ N.

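Proof. Step-1 : First suppose $\phi \in \mathcal{D}'$ and $g \in \mathcal{D}$. Define $G : \mathbb{R}^2 \to \mathbb{C}$ as $G(x, y) = \tilde{g}_x(y) = g(x - y)$. Then the hypothesis of Exercise-31 is satisfied. Hence $\phi * g \in \mathcal{E}$. Also $\frac{\partial^m G}{\partial x^m}(x, y) = \frac{\partial^m g(x - y)}{\partial x^m} = \widetilde{(D^m g)}_x(y)$, and hence $D^m(\phi * g)(x) = \langle \phi, \widetilde{(D^m g)}_x\rangle = \phi * D^m g(x)$ for every $m \in \mathbb{N}$.

Step-2 : Let $\phi \in \mathcal{E}'$ and $g \in \mathcal{E}$. Let $h \in \mathcal{D}$ be a bump function with $h \equiv 1$ on a neighborhood V of supp(ϕ). Then $\langle \phi, f\rangle = \langle \phi, hf\rangle$ for every $f \in \mathcal{E}$. In particular, $\phi * g(x) = \langle \phi, h\tilde{g}_x\rangle$. Letting $G(x, y) = h(y) g(x - y)$ and applying Exercise-31, we deduce that $\phi * g \in \mathcal{E}$. Since $h \equiv 1$ on a neighborhood V of supp(ϕ), we also have $\frac{\partial^m G}{\partial x^m}(x, y) = 1 \cdot \frac{\partial^m g(x - y)}{\partial x^m} = \widetilde{(D^m g)}_x(y)$ for $y \in V$, and hence $D^m(\phi * g)(x) = \langle \phi, \widetilde{(D^m g)}_x\rangle = \phi * D^m g(x)$ for every $m \in \mathbb{N}$.

Step-3 : Let $\phi \in \mathcal{E}'$ and $g \in \mathcal{D}$. Then there is $k \in \mathbb{N}$ such that ϕ and $\tilde{g}_x$ have disjoint supports for $|x| \ge k$, and hence $\phi * g(x) = \langle \phi, \tilde{g}_x\rangle = 0$ for $|x| \ge k$, which shows supp(ϕ ∗ g) is also compact.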

Step-4 : (hint) Let ϕ ∈ S′ and g ∈ S. An argument similar to (but slightly more computational than) that in step-1 will show ϕ∗g ∈ E and Dm(ϕ∗g) = ϕ∗Dmg; see Exercise 2.3.5(a) in Grafakos, Classical Fourier Analysis.

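Step-5 : If $\phi \in \mathcal{E}' \subset \mathcal{S}'$ and $g \in \mathcal{S}$, then $\phi * g \in \mathcal{E}$ and $D^m(\phi * g) = \phi * D^m g$ as hinted in step-4. To show $\phi * g \in \mathcal{S}$, we need to show $s_{i,j}(\phi * g) < \infty$ for every $i, j \ge 0$. Since $D^j(\phi * g) = \phi * D^j g$ and $D^j g \in \mathcal{S}$, we may assume $j = 0$. Thus we need to show $\|x^i (\phi * g)\|_\infty < \infty$. Since $g \in \mathcal{S}$, we have $(1 + |x|)^i D^m g \in \mathcal{S}$ and hence for every $N \in \mathbb{N}$, there is $\beta_N > 0$ such that $\|(1 + |x|)^i D^m g\|_\infty \le \beta_N$ for $0 \le m \le N$. Since $\phi \in \mathcal{E}'$, there exist $C > 0$ and $N \in \mathbb{N}$ by [126](iii) such that $|\langle \phi, f\rangle| \le C p_N(f) = C \max\{|D^m f(y)| : 0 \le m \le N, |y| \le N\}$ for every $f \in \mathcal{E}$. Hence

$|x^i (\phi * g)(x)| = |x^i \langle \phi, \tilde{g}_x\rangle| = |\langle \phi, x^i \tilde{g}_x\rangle| \le C \max\left\{\left|x^i \frac{\partial^m g}{\partial y^m}(x - y)\right| : 0 \le m \le N, |y| \le N\right\} \le C \beta_N \max\left\{\frac{|x|^i}{(1 + |x - y|)^i} : |y| \le N\right\}$.

From this estimate, we conclude that $\|x^i (\phi * g)\|_\infty < \infty$ by observing the following: if $|x| \ge 2N$, then $\frac{|x|^i}{(1 + |x - y|)^i} \le \frac{|x|^i}{(1 + |x|/2)^i} \le 2^i$ for $|y| \le N$. □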

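Example: Let $g \in \mathcal{E}$. Then $\delta_0 * g(x) = \langle \delta_0, \tilde{g}_x\rangle = \tilde{g}_x(0) = g(x - 0) = g(x)$. That is, $\delta_0 * g = g$.

Definition and Remark: For $f, g, h \in \mathcal{D}$, note that $\langle h * g, f\rangle = \int (h * g)(x) f(x)\,dx = \int\!\!\int h(y) g(x - y) f(x)\,dx\,dy = \int\!\!\int h(y) f(x) \tilde{g}(y - x)\,dx\,dy = \int h(y) (f * \tilde{g})(y)\,dy = \langle h, f * \tilde{g}\rangle$. This motivates us to define $\langle \phi * g, f\rangle = \langle \phi, f * \tilde{g}\rangle$ for various types of distributions ϕ and appropriate smooth functions g, f. We may check that this is compatible with our earlier definition of $\phi * g$ as follows: $\langle \phi * g, f\rangle = \int (\phi * g)(x) f(x)\,dx = \int \langle \phi, \tilde{g}_x\rangle f(x)\,dx = \int \langle \phi, f(x)\tilde{g}_x\rangle\,dx = \langle \phi, \int f(x)\tilde{g}_x(\cdot)\,dx\rangle = \langle \phi, y \mapsto \int f(x) g(x - y)\,dx\rangle = \langle \phi, f * \tilde{g}\rangle$, where taking the integral sign inside $\langle\,,\rangle$ is justified by considering the limit of the Riemann sums defining the integral and using the linearity and continuity of ϕ.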
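The pointwise formula ϕ ∗ g(x) = ⟨ϕ, y ↦ g(x − y)⟩ is easy to experiment with. The sketch below treats three distributions as Python callables on functions: δ0, its derivative Dδ0 (via a central difference, which is only an approximation), and the Heaviside function 1_{(0,∞)} (via a midpoint rule truncated to [0, 12]). The function g(x) = e^{−x²} and all helper names are assumptions made for illustration; g is Schwartz rather than compactly supported.

```python
import math

def g(x):
    return math.exp(-x * x)

def dg(x):
    # Exact derivative of g, used only for comparison.
    return -2.0 * x * math.exp(-x * x)

def delta0(f):
    # <delta_0, f> = f(0)
    return f(0.0)

def d_delta0(f, step=1e-5):
    # <D delta_0, f> = -Df(0), with Df approximated by a central difference.
    return -(f(step) - f(-step)) / (2.0 * step)

def heaviside(f, upper=12.0, n=12_000):
    # <1_(0,infty), f> = int_0^infty f, truncated to [0, upper] (midpoint rule).
    h = upper / n
    return h * sum(f((k + 0.5) * h) for k in range(n))

def convolve(phi, func):
    # (phi * func)(x) = <phi, y -> func(x - y)>
    return lambda x: phi(lambda y: func(x - y))

x0 = 0.7
identity_value = convolve(delta0, g)(x0)       # delta_0 * g = g
derivative_value = convolve(d_delta0, g)(x0)   # (D delta_0) * g = Dg
smoothed_step = convolve(heaviside, g)(x0)     # int_{-infty}^{x0} g, a smooth function of x0
expected_step = 0.5 * math.sqrt(math.pi) * (1.0 + math.erf(x0))
```

The last pair illustrates the smoothing effect: convolving the discontinuous Heaviside function with g gives a scaled error function.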

Exercise-32: (i) Let $\phi \in \mathcal{D}'$. Then $T : \mathcal{D} \to \mathcal{E}$ defined as $Tg = \phi * g$ is linear, continuous, and commutes with translations (i.e., $\phi * g_y = (\phi * g)_y$ for every $y \in \mathbb{R}$).

(ii) (An important fact about convolution) Conversely, if a continuous linear map $T : \mathcal{D} \to \mathcal{E}$ commutes with translations, then there is a unique $\phi \in \mathcal{D}'$ such that $Tg = \phi * g$ for every $g \in \mathcal{D}$.

[Hint: (ii) Define $\phi \in \mathcal{D}'$ as $\langle \phi, g\rangle = T\tilde{g}(0)$ for $g \in \mathcal{D}$. Since the reflection of $\tilde{g}_x$ is $g_{-x}$ and T commutes with translation, we have $Tg(x) = (Tg)_{-x}(0) = T(g_{-x})(0) = \langle \phi, \tilde{g}_x\rangle = \phi * g(x)$ for every $x \in \mathbb{R}$.]

Remark: There are results similar to Exercise-32(ii) in other settings. For instance, Theorem 2.5.2 of Grafakos, Classical Fourier Analysis says in particular (with a more involved proof) that if 1 ≤ p, q ≤ ∞ and T : Lp(R) → Lq(R) is a bounded linear operator commuting with translations, then there is a unique ϕ ∈ S′ such that T g = ϕ ∗ g for every g ∈ Lp(R).

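It is also possible to define the convolution of two distributions when at least one of them has compact support. For this, one has to develop the rather technical theory of tensor product of two distributions. The main points are stated without proof in [130] below. The student may refer to Chapter 4 of F.G. Friedlander, Introduction to the Theory of Distributions for the proofs. For $f, g : \mathbb{R} \to \mathbb{C}$, let $f \otimes g : \mathbb{R}^2 \to \mathbb{C}$ be $f \otimes g(x, y) = f(x) g(y)$. Let $\mathcal{D}(\mathbb{R}^2) = C_c^\infty(\mathbb{R}^2)$. It is known that $\mathrm{span}\{f \otimes g : f, g \in \mathcal{D}\}$ is dense in $\mathcal{D}(\mathbb{R}^2)$.

[130] Let $\phi, \psi \in \mathcal{D}'$, with at least one having compact support. Then there is a unique distribution $\phi \otimes \psi \in \mathcal{D}'(\mathbb{R}^2)$ called the tensor product of ϕ and ψ with the defining property that $\langle \phi \otimes \psi, f \otimes g\rangle = \langle \phi, f\rangle \langle \psi, g\rangle$ for every $f, g \in \mathcal{D}$. Moreover, we have:

(i) For every $F \in \mathcal{D}(\mathbb{R}^2)$, $\langle \phi \otimes \psi, F\rangle = \langle \phi, x \mapsto \langle \psi, F(x, \cdot)\rangle\rangle = \langle \psi, y \mapsto \langle \phi, F(\cdot, y)\rangle\rangle$.

(ii) $\mathrm{supp}(\phi \otimes \psi) = \mathrm{supp}(\phi) \times \mathrm{supp}(\psi) \subset \mathbb{R}^2$.

(iii) $(\phi, \psi) \mapsto \phi \otimes \psi$ is linear and continuous in each variable.

(iv) $\frac{\partial^k}{\partial x^k} \frac{\partial^m}{\partial y^m} (\phi \otimes \psi) = D^k\phi \otimes D^m\psi$ for every $k, m \in \mathbb{N}$.

(v) $\delta_x \otimes \delta_y = \delta_{(x,y)}$ for every $(x, y) \in \mathbb{R}^2$.

Remark: The idea of proof is to use the first equality in (i) as the definition of $\phi \otimes \psi$, and then to prove that this indeed defines a distribution with the listed properties.

Note that $\langle h * g, f\rangle = \int (h * g)(z) f(z)\,dz = \int\!\!\int h(x) g(z - x) f(z)\,dx\,dz = \int\!\!\int h(x) g(y) f(x + y)\,dx\,dy$.

[131] Let $\phi, \psi \in \mathcal{D}'$, with at least one of them having compact support. Then their convolution $\phi * \psi \in \mathcal{D}'$ is defined as $\langle \phi * \psi, f\rangle = \langle \phi \otimes \psi, (x, y) \mapsto f(x + y)\rangle$ for $f \in \mathcal{D}$. We have:

(i) $\langle \phi * \psi, f\rangle = \langle \phi, x \mapsto \langle \psi, f(x + \cdot)\rangle\rangle = \langle \psi, y \mapsto \langle \phi, f(\cdot + y)\rangle\rangle$, and hence $\phi * \psi = \psi * \phi$.

(ii) $\mathrm{supp}(\phi * \psi) \subset \mathrm{supp}(\phi) + \mathrm{supp}(\psi)$.

(iii) $(\phi, \psi) \mapsto \phi * \psi$ is linear and continuous in each variable.

(iv) $D^m(\phi * \psi) = D^m\phi * \psi = \phi * D^m\psi$ for every $m \in \mathbb{N}$.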

(v) ϕ ∗ δ0 = ϕ.

Proof. All are essentially direct consequences of [130]. We just indicate a proof for (ii). Let $f \in \mathcal{D}$ be with $\mathrm{supp}(f) \cap (\mathrm{supp}(\phi) + \mathrm{supp}(\psi)) = \emptyset$. Then the support of the map $(x, y) \mapsto f(x + y)$ is disjoint from $\mathrm{supp}(\phi) \times \mathrm{supp}(\psi)$, and hence by [130](ii) we get $\langle \phi * \psi, f\rangle = 0$. □

[132] (i) D (considered as a subset of E′) is dense in D′. (ii) S is dense in S′.

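Proof. (i) By the Remark after [128], $\mathcal{E}'$ is dense in $\mathcal{D}'$. So it suffices to show $\mathcal{D}$ is dense in $\mathcal{E}'$. Let $\phi \in \mathcal{E}'$. Pick $g \in \mathcal{D}$ with $g \ge 0$ and $\int g = 1$, and put $g_n(x) = n\,g(nx)$. Then $(g_n) \to \delta_0$ in $\mathcal{D}'$ (check: $\langle g_n, f\rangle = \int g(z) f(z/n)\,dz \to f(0)$). Therefore $(\phi * g_n) \to \phi * \delta_0 = \phi$ by [131](iii) and [131](v). Also $\phi * g_n \in \mathcal{D}$ by [129](ii).

(ii) An argument similar to that in (i) works since $\mathcal{D} \subset \mathcal{S} \subset \mathcal{S}' \subset \mathcal{D}'$. □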

14. Some structure theorems about distributions

[133] (i) Let $\phi \in \mathcal{E}'$ be with $\mathrm{supp}(\phi) = \{0\}$. Then there exists $N \in \mathbb{N}$ such that for every $f \in \mathcal{E}$ with $D^j f(0) = 0$ for $0 \le j \le N$, we have $\langle \phi, f\rangle = 0$.

(ii) (Structure theorem for distribution supported on a point) Let $\phi \in \mathcal{E}'$ be with $\mathrm{supp}(\phi) = \{a\}$. Then $\phi = \sum_{j=0}^{N} c_j D^j \delta_a$ for finitely many constants $c_j \in \mathbb{C}$, where $\delta_a$ is the Dirac distribution at a.

Proof. (i) Since $\phi \in \mathcal{E}'$, there exist $C > 0$ and $N \in \mathbb{N}$ such that $|\langle \phi, f\rangle| \le C p_N(f)$ for every $f \in \mathcal{E}$. Fix $f \in \mathcal{E}$ with $D^j f(0) = 0$ for $0 \le j \le N$. Let $g \in \mathcal{E}$ be such that $g(x) = 0$ for $|x| \le 1$ and $g(x) = 1$ for $|x| \ge 2$. Put $g_k(x) = g(kx)$. Then $g_k(x) = 0$ for $|x| \le 1/k$ and $g_k(x) = 1$ for $|x| \ge 2/k$. Since $f g_k \equiv 0$ in a neighborhood of 0, we have $\mathrm{supp}(\phi) \cap \mathrm{supp}(f g_k) = \emptyset$, and hence $\langle \phi, f g_k\rangle = 0$ by Exercise-30. Therefore, $|\langle \phi, f\rangle| = |\langle \phi, f - f g_k\rangle| \le C p_N(f - f g_k)$ for every $k \in \mathbb{N}$. Thus to prove $\langle \phi, f\rangle = 0$, it suffices to show $\lim_{k\to\infty} p_N(f - f g_k) = 0$. We make four observations:

(a) Since $f = f g_k$ for $|x| > 2/k$, $p_N(f - f g_k) = \max\{|D^j(f - f g_k)(x)| : |x| \le 2/k, 0 \le j \le N\}$.

(b) $D^j(f - f g_k) = D^j(f(1 - g_k)) = \sum_{i=0}^{j} c_{ij} D^i f\, D^{j-i}(1 - g_k)$ by product rule.

(c) $\|D^{j-i}(1 - g_k)\|_\infty \le k^{j-i} \|D^{j-i}(1 - g)\|_\infty$ for $0 \le i \le j \le N$ and every $k \in \mathbb{N}$.

(d) There is a constant $M_1 > 0$ with $\max\{|D^i f(x)| : |x| \le 2/k\} \le \frac{M_1}{k^{N+1-i}}$ for $0 \le i \le N$ since $D^j f(0) = 0$ for $0 \le j \le N$ (to see this, consider $|x_i| \le 2/k$, and pick $|x_{i+1}| \le 2/k$ with $|x_i D^{i+1} f(x_{i+1})| = |D^i f(x_i) - D^i f(0)| = |D^i f(x_i)|$ by Mean value theorem, and repeat this till one gets $x_{N+1}$).

Combining (c) with (d), we get a constant $M_2 > 0$ such that $\max\{|D^i f(x) D^{j-i}(1 - g_k)(x)| : |x| \le 2/k\} \le \frac{M_2}{k^{N+1-j}} \le \frac{M_2}{k}$ for $0 \le i \le j \le N$. Combining this with (a) and (b), we get a constant $M > 0$ such that $p_N(f - f g_k) \le M/k \to 0$ as $k \to \infty$.

(ii) After a translation assume $a = 0$. Let $N \in \mathbb{N}$ be as given by part (i). Consider $f \in \mathcal{E}$. By Taylor's theorem, there is $h \in \mathcal{E}$ such that $f(x) = \sum_{j=0}^{N} D^j f(0) \frac{x^j}{j!} + h(x)$ for $x \in \mathbb{R}$. Differentiating this repeatedly and substituting $x = 0$, see $D^j h(0) = 0$ for $0 \le j \le N$. Hence $\langle \phi, h\rangle = 0$ by (i), and therefore $\langle \phi, f\rangle = \sum_{j=0}^{N} D^j f(0) \langle \phi, \frac{x^j}{j!}\rangle$. Recall that $\langle D^j \delta_0, f\rangle = (-1)^j \langle \delta_0, D^j f\rangle = (-1)^j D^j f(0)$ and put $c_j = (-1)^j \langle \phi, \frac{x^j}{j!}\rangle$. Then $\langle \phi, f\rangle = \sum_{j=0}^{N} c_j \langle D^j \delta_0, f\rangle$ for $f \in \mathcal{E}$, and hence $\phi = \sum_{j=0}^{N} c_j D^j \delta_0$. □

Exercise-33: Let $\beta \in \mathcal{D}'$ be the distribution induced by the constant function 1 (which is locally integrable), i.e., $\langle \beta, f\rangle = \int_{\mathbb{R}} f \cdot 1 = \int_{\mathbb{R}} f$ for $f \in \mathcal{D}$. Then $\ker(\beta) = \{f \in \mathcal{D} : \int_{\mathbb{R}} f = 0\}$ is a vector subspace of $\mathcal{D}$ having codimension one. Let $J : \ker(\beta) \to \mathcal{D}$ be $Jf(x) = \int_{-\infty}^{x} f(y)\,dy$. Then J is well-defined, i.e., $Jf \in \mathcal{D}$ for $f \in \ker(\beta)$, and J is linear and continuous. Moreover, $DJf = f$ for $f \in \ker(\beta)$ and $JDf = f$ for $f \in \mathcal{D}$. In particular, $\{f \in \mathcal{D} : \int_{\mathbb{R}} f = 0\} = \ker(\beta) = D(\mathcal{D})$, the range of the differentiation operator $D : \mathcal{D} \to \mathcal{D}$.

[Hint: To show J is continuous, show $J : \mathcal{D}_k \cap \ker(\beta) \to \mathcal{D}$ are continuous at 0 for $k \in \mathbb{N}$.]

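[134] (i) (Every distribution has a primitive) If $\phi \in \mathcal{D}'$, then there is $\psi \in \mathcal{D}'$ with $D\psi = \phi$.

(ii) (Determined up to a constant) If $\phi, \psi_1, \psi_2 \in \mathcal{D}'$ are with $D\psi_1 = \phi = D\psi_2$, then $\psi_1 - \psi_2$ is a constant in the sense that there is $c \in \mathbb{C}$ with $\langle \psi_1 - \psi_2, f\rangle = \langle c, f\rangle = c \int f$ for every $f \in \mathcal{D}$.

(iii) If $D\psi = 0$ for $\psi \in \mathcal{D}'$, then ψ is a constant, i.e., there is $c \in \mathbb{C}$ with $\langle \psi, f\rangle = \langle c, f\rangle = c \int f$ for $f \in \mathcal{D}$.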

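Proof. (i) Let β and J be as in Exercise-33. Fix $h \in \mathcal{D}$ with $\langle \beta, h\rangle = \int_{\mathbb{R}} h = 1$. Then h spans a one-dimensional space complementary to ker(β). Define the projection $P : \mathcal{D} \to \ker(\beta)$ as $Pf = f - \langle \beta, f\rangle h$ (check that $\langle \beta, Pf\rangle$ is indeed zero). Then any $f \in \mathcal{D}$ can be written as $f = \langle \beta, f\rangle h + Pf$.

We define $\psi : \mathcal{D} \to \mathbb{C}$ as $\langle \psi, f\rangle = -\langle \phi, JPf\rangle$, which is obviously linear. If $(f_n) \to 0$ in $\mathcal{D}$, then $Pf_n = f_n - \langle \beta, f_n\rangle h \to 0$ in $\ker(\beta) \subset \mathcal{D}$, and hence $(JPf_n) \to 0$ in $\mathcal{D}$ by the continuity of J. Therefore, $\langle \psi, f_n\rangle = -\langle \phi, JPf_n\rangle \to 0$ in $\mathbb{C}$. This shows that ψ is continuous at 0, and hence $\psi \in \mathcal{D}'$. For $f \in \mathcal{D}$, we have $PDf = Df$ since $Df \in \ker(\beta)$, and also $JDf = f$ by Exercise-33; hence $\langle D\psi, f\rangle = -\langle \psi, Df\rangle = \langle \phi, JPDf\rangle = \langle \phi, JDf\rangle = \langle \phi, f\rangle$. Thus $D\psi = \phi$.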

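(ii) Write $f \in \mathcal{D}$ as $f = \langle \beta, f\rangle h + Pf$ as above. We have $\langle \psi_1 - \psi_2, Pf\rangle = \langle \psi_1 - \psi_2, DJPf\rangle = -\langle D(\psi_1 - \psi_2), JPf\rangle = -\langle 0, JPf\rangle = 0$, and therefore $\langle \psi_1 - \psi_2, f\rangle = \langle \psi_1 - \psi_2, \langle \beta, f\rangle h\rangle = c \langle \beta, f\rangle = c \int f$, where $c := \langle \psi_1 - \psi_2, h\rangle$.

(iii) This follows from (ii) since we also have $D0 = 0$. □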
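As an illustration of the construction in (i), take $\phi = \delta_0$. The worked computation below is a sketch using only the definitions of β, J, P and the fixed $h \in \mathcal{D}$ with $\int h = 1$; the constant c is notation introduced here.

```latex
\begin{align*}
\langle \psi, f\rangle
  &= -\langle \delta_0, JPf\rangle = -(JPf)(0)
   = -\int_{-\infty}^{0} \bigl(f(y) - \langle \beta, f\rangle\, h(y)\bigr)\,dy \\
  &= -\int_{-\infty}^{0} f + \Bigl(\int_{\mathbb{R}} f\Bigr)\Bigl(\int_{-\infty}^{0} h\Bigr)
   = \int_{0}^{\infty} f \;-\; c \int_{\mathbb{R}} f,
  \qquad c := \int_{0}^{\infty} h = 1 - \int_{-\infty}^{0} h .
\end{align*}
```

So the primitive produced by the proof is $\psi = 1_{(0,\infty)} - c$, the Heaviside function shifted by a constant that depends on the choice of h. This agrees with $D1_{(0,\infty)} = \delta_0$ and with the uniqueness up to a constant in (ii).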

[135] (Local structure theorem for distributions - every distribution is locally a finite order derivative of a continuous function) Let $\phi \in \mathcal{D}'$ and $a > 0$. Then there exist $g \in C_c(\mathbb{R})$ with $\mathrm{supp}(g) \subset [-a, a]$ and an integer $m \ge 0$ such that for every $f \in \mathcal{D}$ with $\mathrm{supp}(f) \subset (-a, a)$ (note the open bracket), we have $\langle \phi, f\rangle = \langle D^m g, f\rangle$.

Proof. Let $\Gamma = \{f \in \mathcal{D} : \mathrm{supp}(f) \subset (-a, a)\}$ and let $k \in \mathbb{N}$ be with $k \ge a$. By [126](i), there are $C > 0$ and $N \ge k$ such that $|\langle \phi, f\rangle| \le C p_N(f)$ for every $f \in \mathcal{D}_k$, and in particular for every $f \in \Gamma$. Fix $f \in \Gamma$, $0 \le j \le N$, and let $x_0 \in (-a, a)$ be with $|D^j f(x_0)| = \|D^j f\|_\infty$. By mean value theorem, there is $y_0 \in (-a, x_0)$ with $D^{j+1} f(y_0) = \frac{D^j f(x_0) - D^j f(-a)}{x_0 + a} = \frac{D^j f(x_0) - 0}{x_0 + a}$, and hence $\|D^j f\|_\infty = |D^j f(x_0)| = |x_0 + a|\,|D^{j+1} f(y_0)| \le (2a + 1) \|D^{j+1} f\|_\infty$. Applying this observation repeatedly, we see $p_N(f) = \max\{|D^j f(x)| : 0 \le j \le N, |x| \le N\} \le (2a + 1)^N \|D^N f\|_\infty$ for $f \in \Gamma$. Putting $C_0 = C(2a + 1)^N$, we conclude $|\langle \phi, f\rangle| \le C_0 \|D^N f\|_\infty$ for every $f \in \Gamma$.

Further note that $D^N f \in \Gamma$ for $f \in \Gamma$ and $D^N f(x) = \int_{-a}^{x} D^{N+1} f(y)\,dy$ for $x \in (-a, a)$, which shows $\|D^N f\|_\infty \le \|D^{N+1} f\|_1$ for $f \in \Gamma$. Hence $|\langle \phi, f\rangle| \le C_0 \|D^{N+1} f\|_1$ for every $f \in \Gamma$. Let $D^{N+1}(\Gamma) = \{D^{N+1} f : f \in \Gamma\}$ and define the linear functional $\psi : D^{N+1}(\Gamma) \to \mathbb{C}$ as $\langle \psi, D^{N+1} f\rangle = \langle \phi, f\rangle$. If $f_1, f_2 \in \Gamma$ and $D^{N+1} f_1 = D^{N+1} f_2$, then $f_1 = f_2$ since $D^j f_1$ and $D^j f_2$ have compact supports for every $j \ge 0$, and therefore ψ is well-defined. From the estimate above, $|\langle \psi, D^{N+1} f\rangle| \le C_0 \|D^{N+1} f\|_1$, and thus $\psi : (D^{N+1}(\Gamma), \|\cdot\|_1) \to \mathbb{C}$ is also continuous.

By Hahn-Banach theorem, ψ has a continuous linear extension $\psi : L^1(-a, a) \to \mathbb{C}$. Since $L^\infty(-a, a)$ is the dual of $L^1(-a, a)$, there is $h \in L^\infty(-a, a)$ such that $\langle \psi, f\rangle = \int_{-a}^{a} fh$ for every $f \in L^1(-a, a)$. Put $h(x) = 0$ for $|x| \ge a$ and note $h \in \mathcal{D}'$. For $f \in \Gamma$ we have

$\langle \phi, f\rangle = \langle \psi, D^{N+1} f\rangle = \int_{-a}^{a} h\,D^{N+1} f = \int_{\mathbb{R}} h\,D^{N+1} f = \langle h, D^{N+1} f\rangle = (-1)^{N+1} \langle D^{N+1} h, f\rangle = \langle D^{N+1} h_0, f\rangle$,

where $h_0 = (-1)^{N+1} h$. If we define $g : \mathbb{R} \to \mathbb{C}$ as $g(x) = \int_{-\infty}^{x} h_0(y)\,dy$, then g is continuous with $\mathrm{supp}(g) \subset [-a, a]$ and $Dg = h_0$. And $\langle \phi, f\rangle = \langle D^{N+1} h_0, f\rangle = \langle D^{N+2} g, f\rangle$ for every $f \in \Gamma$. □

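[136] (Global structure theorem for distributions with compact support - every distribution with compact support is a finite sum of finite order derivatives of continuous functions) If $\phi \in \mathcal{E}'$ and $b > 0$ is with $\mathrm{supp}(\phi) \subset (-b, b)$, then there are finitely many functions $f_j \in C_c(\mathbb{R})$ with $\mathrm{supp}(f_j) \subset (-b, b)$ such that $\phi = \sum_{j=0}^{m} D^j f_j$.

Proof. Let $0 < a < b$ be chosen with $\mathrm{supp}(\phi) \subset (-a, a)$. By [135], there are $g \in C_c(\mathbb{R})$ with $\mathrm{supp}(g) \subset [-a, a] \subset (-b, b)$ and $m \ge 0$ such that $\langle \phi, f\rangle = \langle D^m g, f\rangle$ for every $f \in \mathcal{D}$ with $\mathrm{supp}(f) \subset (-a, a)$. Let $h \in \mathcal{D}$ be a bump function satisfying $h \equiv 1$ on a neighborhood of supp(ϕ) and $\mathrm{supp}(h) \subset (-a, a)$. Then for any $f \in \mathcal{E}$, we have $\mathrm{supp}(fh) \subset (-a, a)$, and

$\langle \phi, f\rangle = \langle \phi, fh\rangle = \langle D^m g, fh\rangle = (-1)^m \langle g, D^m(fh)\rangle = (-1)^m \sum_{j=0}^{m} c_j \langle g, D^j f\, D^{m-j} h\rangle$.

Putting $g_j = (-1)^m c_j\, g\, D^{m-j} h$, which does not depend on f, we see that $g_j \in C_c(\mathbb{R})$ with $\mathrm{supp}(g_j) \subset [-a, a] \subset (-b, b)$ and

$\langle \phi, f\rangle = (-1)^m \sum_{j=0}^{m} c_j \int g\, D^j f\, D^{m-j} h = \sum_{j=0}^{m} \int g_j D^j f = \sum_{j=0}^{m} \langle g_j, D^j f\rangle = \sum_{j=0}^{m} (-1)^j \langle D^j g_j, f\rangle$.

Letting $f_j = (-1)^j g_j$, we get the required result. □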

Definition: We say g ∈ C(R) is of polynomial growth if there are constants C > 0, M > 0 such that |g(x)| ≤ C(1 + |x|)M for every x ∈ R.

[137] (Structure theorem for tempered distributions - every tempered distribution is a finite order derivative of a continuous function of polynomial growth) If ϕ ∈ S′, then there exist m ∈ N and g ∈ C(R) of polynomial growth such that ⟨ϕ, f⟩ = ⟨Dmg, f⟩ for every f ∈ S.

Proof. Step-1 : (Sketch) First assume $\mathrm{supp}(\phi) \subset (0, \infty)$. Since $\phi \in \mathcal{S}'$, there exist $C > 0$ and $N \in \mathbb{N}$ such that $|\langle \phi, f\rangle| \le C \sum_{0 \le i,j \le N} s_{i,j}(f)$ for every $f \in \mathcal{S}$. Choose $h_0 \in \mathcal{E}$ with $h_0 \equiv 1$ in a neighborhood of supp(ϕ) and $\mathrm{supp}(h_0) \subset (0, \infty)$. Since $\langle \phi, f\rangle = \langle \phi, h_0 f\rangle$, we can find $C' = C'(h_0) > 0$ by applying the product rule of differentiation in $s_{i,j}(h_0 f)$ so that

$|\langle \phi, f\rangle| = |\langle \phi, h_0 f\rangle| \le C' \sum_{0 \le i,j \le N} \sup\{|x^i D^j f(x)| : x > 0\}$ for every $f \in \mathcal{S}$. (∗)

Let $h : \mathbb{R} \to \mathbb{C}$ be $h(x) = x^N/N!$ for $x > 0$ and $h(x) = 0$ for $x \le 0$ so that $D^N h = 1_{(0,\infty)}$ and $D^{N+1} h = \delta_0$. Let $g = \phi * h$, which is defined as $\phi * h(x) = \langle \phi, y \mapsto h(x - y)\rangle$; see the beginning of the previous section. One then verifies that the continuous function g is of polynomial growth with the help of (∗), and checks $D^{N+1} g = D^{N+1}(\phi * h) = \phi * D^{N+1} h = \phi * \delta_0 = \phi$; see section 8.3 of F.G. Friedlander, Introduction to the Theory of Distributions for the details.

Step-2 : In the general case, choose $h_1, h_2 \in \mathcal{E}$ with $h_1, h_2 \ge 0$, $\mathrm{supp}(h_1) \subset (-\infty, 1)$, $\mathrm{supp}(h_2) \subset (0, \infty)$, and $\mathrm{supp}(h_1) \cup \mathrm{supp}(h_2) = \mathbb{R}$. Putting $g_j = h_j/(h_1 + h_2)$ for $j = 1, 2$, we see $g_j \in \mathcal{E}$, $0 \le g_1, g_2 \le 1$, $g_1 + g_2 = 1$, $\mathrm{supp}(g_1) \subset (-\infty, 1)$ and $\mathrm{supp}(g_2) \subset (0, \infty)$; in other words, $\{g_1, g_2\}$ is a smooth partition of unity for R subordinate to the open cover $\{(-\infty, 1), (0, \infty)\}$. Since $\phi = g_1\phi + g_2\phi$ with $\mathrm{supp}(g_1\phi) \subset (-\infty, 1)$ and $\mathrm{supp}(g_2\phi) \subset (0, \infty)$, we may apply the argument in step-1 to each of $g_1\phi$ and $g_2\phi$ to deduce the required result about ϕ. □

15. Fourier transform on R: basics

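For $r \in \mathbb{R}$, let $e_r : \mathbb{R} \to \mathbb{C}$ be $e_r(x) = e^{2\pi i r x}$ for $x \in \mathbb{R}$. Note that $e_r \in L^\infty(\mathbb{R})$ for every $r \in \mathbb{R}$. Recall from Exercise-5 that $\hat{f}(n) = \hat{f}(n) \cdot 1 = \hat{f}(n) e_n(0) = f * e_n(0)$ for $f \in L^1(\mathbb{T})$. This motivates the following definition.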

Definition: The Fourier transform $\hat{f} : \mathbb{R} \to \mathbb{C}$ of $f \in L^1(\mathbb{R})$ is defined as $\hat{f}(y) = f * e_y(0) = \int_{\mathbb{R}} f(x) e_{-y}(x)\,dx$ for $y \in \mathbb{R}$. For example, let $u \in (0, \infty)$ and $f = 1_{[-u,u]}$. Then $\hat{f}(0) = 2u$ and $\hat{f}(y) = \int_{-u}^{u} e_{-y}(x)\,dx = \frac{e_{-y}(-u) - e_{-y}(u)}{2\pi i y} = \frac{\sin 2\pi u y}{\pi y}$ for $y \in \mathbb{R} \setminus \{0\}$.

Remark: The integral defining $\hat{f}(y)$ is a global integral, over the whole of R. Hence, if we change f on a small interval, the value of $\hat{f}(y)$ may change for every $y \in \mathbb{R}$.
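The closed form for the transform of the indicator of [−u, u] can be checked against a direct quadrature of ∫_{−u}^{u} e_{−y}(x) dx; the quotient (e_{−y}(−u) − e_{−y}(u))/(2πiy) simplifies to sin(2πuy)/(πy). The sketch below uses a plain midpoint rule; the helper names and the sample values of u and y are arbitrary choices for illustration.

```python
import cmath
import math

def e(r, x):
    # Character e_r(x) = exp(2*pi*i*r*x) from the text.
    return cmath.exp(2j * math.pi * r * x)

def fourier_indicator(u, y, n=20_000):
    # Midpoint-rule approximation of int_{-u}^{u} e_{-y}(x) dx.
    h = 2.0 * u / n
    return h * sum(e(-y, -u + (k + 0.5) * h) for k in range(n))

def closed_form(u, y):
    # sin(2*pi*u*y) / (pi*y) for y != 0, and 2u at y = 0.
    return 2.0 * u if y == 0 else math.sin(2.0 * math.pi * u * y) / (math.pi * y)

u = 1.5
errors = [abs(fourier_indicator(u, y) - closed_form(u, y))
          for y in (0.0, 0.25, 1.0, 3.7)]

# The closed form is bounded by 1/(pi*|y|), so the transform decays at infinity.
decay_ok = all(abs(closed_form(u, y)) <= 1.0 / (math.pi * y)
               for y in (1.0, 10.0, 100.0))
```

The transform is real here because the indicator is an even function, in agreement with the evenness statement in the exercises on properties of the Fourier transform.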

Exercise-34: (Properties of the Fourier transform - I) Let $f, g \in L^1(\mathbb{R})$.

(i) $\hat{f} \in L^\infty(\mathbb{R})$, $\hat{f}$ is uniformly continuous, and $\|\hat{f}\|_\infty \le \|f\|_1$.

(ii) (Continuity) If $(f_k) \to f$ in $L^1(\mathbb{R})$, then $\|\hat{f} - \hat{f_k}\|_\infty \to 0$ as $k \to \infty$.

(iii) (Reflection and Linearity) $\widehat{\tilde{f}} = \widetilde{\hat{f}}$ (where $\tilde{f}(x) = f(-x)$), and $\widehat{af + bg} = a\hat{f} + b\hat{g}$ for $a, b \in \mathbb{C}$.

(iv) If f is an even (odd) function, then so is $\hat{f}$.

(v) $\hat{f}(y) = \frac{1}{2} \int_{\mathbb{R}} [f(x) - f(x + \frac{1}{2y})] e_{-y}(x)\,dx$ for $y \in \mathbb{R} \setminus \{0\}$.

[Hint: (i) Note $\|e_y\|_\infty = 1$, and use an argument similar to that for Exercise-20. (ii) Use $\|\hat{f}\|_\infty \le \|f\|_1$. (iv) If f is even, then $\hat{f}(-y) = \int f(x) e_y(x)\,dx = \int f(-x) e_y(-x)\,dx = \int f(x) e_{-y}(x)\,dx = \hat{f}(y)$. (v) Substitute $x = z + \frac{1}{2y}$ in the integral expression for $\hat{f}(y)$ to get $\hat{f}(y) = -\int_{\mathbb{R}} f(z + \frac{1}{2y}) e_{-y}(z)\,dz = -\int_{\mathbb{R}} f(x + \frac{1}{2y}) e_{-y}(x)\,dx$ and add this to the original expression for $\hat{f}(y)$.]

Exercise-35: (Properties of the Fourier transform - II) Let $f, g \in L^1(\mathbb{R})$, and let $f_a(x) = f(x - a)$.

(i) $\widehat{e_a f} = (\hat{f})_a$ and $\widehat{(f_a)} = e_{-a}\hat{f}$ for every $a \in \mathbb{R}$.

(ii) $f * e_y = \hat{f}(y) e_y$ for every $y \in \mathbb{R}$.

(iii) $\widehat{f * g} = \hat{f}\hat{g}$.

(iv) $\int_{\mathbb{R}} f(x) \hat{g}(x)\,dx = \int_{\mathbb{R}} \hat{f}(x) g(x)\,dx$.

(v) Let $a \ne 0$. If $g(x) = f(ax)$, then $\hat{g}(y) = |a|^{-1} \hat{f}(y/a)$ (so if $g(x) = f(x/a)$, then $\hat{g}(y) = |a| \hat{f}(ay)$).

[Hint: (i) $\widehat{e_a f}(y) = \int e_a(x) f(x) e_{-y}(x)\,dx = \int f(x) e_{-(y-a)}(x)\,dx = \hat{f}(y - a)$, and $\widehat{f_a}(y) = \int f(x - a) e_{-y}(x)\,dx = \int f(z) e_{-y}(z + a)\,dz = e_{-y}(a) \int f(z) e_{-y}(z)\,dz = e_{-a}(y) \hat{f}(y)$. Deduce (iii) from (ii) as follows: $\widehat{f * g}(y) e_y = f * g * e_y = f * \hat{g}(y) e_y = \hat{g}(y) f * e_y = \hat{g}(y) \hat{f}(y) e_y$ and cancel $e_y \ne 0$ from both ends as in Exercise-5. Use Fubini's theorem to prove (iv). For (v), put $z = ax$ in the integral.]
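Parts (i) and (v) of Exercise-35 can be sanity-checked numerically before proving them. The following sketch approximates the Fourier integral by a midpoint rule on [−12, 12]; the function f(x) = (1 + x)e^{−πx²}, the parameter values, and the helper `fourier` are assumptions made for illustration (the dilation factor s is taken positive).

```python
import cmath
import math

def fourier(func, y, lo=-12.0, hi=12.0, n=24_000):
    # Midpoint-rule approximation of int func(x) * exp(-2*pi*i*y*x) dx.
    h = (hi - lo) / n
    total = 0j
    for k in range(n):
        x = lo + (k + 0.5) * h
        total += func(x) * cmath.exp(-2j * math.pi * y * x)
    return total * h

def f(x):
    # Hypothetical integrable function, neither even nor odd.
    return (1.0 + x) * math.exp(-math.pi * x * x)

a, s, y = 0.7, 2.0, 0.4

# Translation: (f_a)^(y) = e_{-a}(y) * f^(y), with f_a(x) = f(x - a).
translation_lhs = fourier(lambda x: f(x - a), y)
translation_rhs = cmath.exp(-2j * math.pi * a * y) * fourier(f, y)

# Modulation: (e_a f)^(y) = f^(y - a).
modulation_lhs = fourier(lambda x: cmath.exp(2j * math.pi * a * x) * f(x), y)
modulation_rhs = fourier(f, y - a)

# Dilation with s > 0: g(x) = f(s x) has g^(y) = f^(y / s) / s.
dilation_lhs = fourier(lambda x: f(s * x), y)
dilation_rhs = fourier(f, y / s) / s
```

Both sides of each identity are computed with the same quadrature, so the comparison tests the identities themselves rather than any closed form for the transform.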

Exercise-36: (Fourier transform and differentiation) Let $f \in L^1(\mathbb{R})$.

(i) If $Df \in L^1(\mathbb{R})$, then $\lim_{|x|\to\infty} f(x) = 0$ and $\widehat{Df}(y) = 2\pi i y \hat{f}(y)$ for $y \in \mathbb{R}$.

(ii) If $xf \in L^1(\mathbb{R})$, then $\hat{f}$ is differentiable and $D\hat{f}(y) = -2\pi i\, \widehat{(xf)}(y)$ for $y \in \mathbb{R}$.

(iii) More generally, we have: $D^m f \in L^1(\mathbb{R}) \Rightarrow \widehat{D^m f}(y) = (2\pi i y)^m \hat{f}(y)$; and $x^m f \in L^1(\mathbb{R}) \Rightarrow \hat{f}$ is m-times differentiable with $D^m \hat{f}(y) = (-2\pi i)^m\, \widehat{(x^m f)}(y)$.

[Hint: (i) Since $Df \in L^1(\mathbb{R})$, for every $\varepsilon > 0$ there is $M > 0$ with $|f(b) - f(a)| = |\int_a^b Df| \le \int_a^b |Df| < \varepsilon$ for $M \le a < b$. So $\lim_{x\to\infty} f(x)$ exists. This limit must be 0 since $f \in L^1(\mathbb{R})$. Similarly $\lim_{x\to-\infty} f(x) = 0$. To derive the expression for $\widehat{Df}(y)$, do integration by parts. (ii) Since $|e_{-t}(x) - 1| \le |2\pi t x|$ and since $xf \in L^1(\mathbb{R})$, Lebesgue dominated convergence theorem gives $\lim_{t\to 0} \frac{\hat{f}(y + t) - \hat{f}(y)}{t} = \int_{\mathbb{R}} f(x) e_{-y}(x) \lim_{t\to 0} \left[\frac{e_{-t}(x) - 1}{t}\right] dx = -2\pi i \int_{\mathbb{R}} x f(x) e_{-y}(x)\,dx = -2\pi i\, \widehat{(xf)}(y)$.]

Remark: From Exercise-35(i) and Exercise-36, and some of the future results, we see the following are pairs of dual operations for the Fourier transform: (i) Translation and rotation (multiplication by a unimodular scalar). (ii) Differentiation and multiplying with x. (iii) Convolution and taking pointwise product.

Example: Let $f \in L^1(\mathbb{R})$ be $f(x) = e^{-\pi x^2}$ (so $f(0) = 1$). We will show $\hat{f} = f$. We have $Df(x) = -2\pi x f(x)$ and hence $f$ is the unique solution to $Dg(x) + 2\pi x g(x) = 0$ with initial condition $g(0) = 1$. Thus it suffices to show $\hat{f}$ also satisfies this equation. Since $xf \in L^1(\mathbb{R})$ and using $-2\pi x f(x) = Df(x)$, we get by Exercise-36 that $D\hat{f}(y) = \int (-2\pi i x f(x))e_{-y}(x)\,dx = \int i\,Df(x)e_{-y}(x)\,dx = i\,\widehat{Df}(y) = i \cdot 2\pi i y\hat{f}(y) = -2\pi y\hat{f}(y)$. Also $\hat{f}(0) = \int_{\mathbb{R}} f(x)\,dx = 1$ by integration over the complex plane (consider $(\int_{\mathbb{R}} f(x)\,dx)(\int_{\mathbb{R}} f(y)\,dy) = \int_{\mathbb{C}} e^{-\pi|z|^2}\,dz$ and use polar coordinates).
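The identity $\hat{f} = f$ for $f(x) = e^{-\pi x^2}$ can be sanity-checked numerically. The following is only a rough sketch, assuming the convention $\hat{f}(y) = \int f(x)e^{-2\pi i x y}\,dx$ used in these notes; the helper names `fourier_transform`, `gaussian` and `max_gaussian_error`, the truncation to $[-8, 8]$ and the midpoint rule are illustrative choices, not part of the notes.

```python
import cmath
import math


def fourier_transform(f, y, half_width=8.0, n=4000):
    """Midpoint-rule approximation of the integral of f(x) * exp(-2*pi*i*x*y)
    over [-half_width, half_width] (the tail is assumed to be negligible)."""
    h = 2.0 * half_width / n
    total = 0j
    for k in range(n):
        x = -half_width + (k + 0.5) * h
        total += f(x) * cmath.exp(-2j * math.pi * x * y)
    return total * h


def gaussian(x):
    """f(x) = exp(-pi * x^2)."""
    return math.exp(-math.pi * x * x)


def max_gaussian_error(ys=(0.0, 0.5, 1.0, 2.0)):
    """Largest deviation |f^(y) - f(y)| over a few sample frequencies."""
    return max(abs(fourier_transform(gaussian, y) - gaussian(y)) for y in ys)


if __name__ == "__main__":
    print(max_gaussian_error())
```

The deviation should be at the level of rounding error, since the midpoint rule is very accurate for smooth, rapidly decaying integrands.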

Remark: The second part of Exercise-36(iii) says that the faster the decay of f at ∞, the smoother fb is. Result [138](ii) below says that the smoother f is, the faster the decay of fb at ∞.

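[138] (i) (Riemann-Lebesgue lemma) Let $f \in L^1(\mathbb{R})$. Then $\lim_{|y|\to\infty}\hat{f}(y) = 0$, and hence $\hat{f} \in C_0(\mathbb{R})$. Also, $\lim_{|y|\to\infty}\int_{\mathbb{R}} f(x)\cos 2\pi x y\,dx = 0$ and $\lim_{|y|\to\infty}\int_{\mathbb{R}} f(x)\sin 2\pi x y\,dx = 0$.
(ii) (Smoother functions, faster decay) If $D^j f \in L^1(\mathbb{R})$ for $0 \le j \le m$, then $\lim_{|y|\to\infty}|y|^m\hat{f}(y) = 0$.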

Proof. (i) If $f = 1_{[a,b]}$, then $\hat{f}(y) = \frac{e_{-y}(a) - e_{-y}(b)}{2\pi i y} \to 0$ as $|y| \to \infty$. By linearity, $\hat{f}(y) \to 0$ as $|y| \to \infty$ for any step function $f \in L^1(\mathbb{R})$. Also it is known that step functions are dense in $L^1(\mathbb{R})$ (see my notes Measure Theory), so the general case follows from the continuity property in Exercise-34(ii). Another proof: We may assume $f \in C_c(\mathbb{R})$ because $C_c(\mathbb{R})$ is dense in $L^1(\mathbb{R})$. Now use the expression for $\hat{f}(y)$ from Exercise-34(v) and the uniform continuity of $f$, as in the proof of [103](ii), to deduce $\lim_{|y|\to\infty}\hat{f}(y) = 0$.

(ii) This follows from (i) and Exercise-36(i). 

Remark: By the above results, $f \mapsto \hat{f}$ from $L^1(\mathbb{R})$ to $(C_0(\mathbb{R}), \|\cdot\|_\infty)$ is linear and continuous.

16. Fourier transform: sufficient conditions for pointwise inversion

Definition: For $g \in L^1(\mathbb{R})$, the Fourier inverse transform $g^\vee$ of $g$ is defined as $g^\vee(y) = \hat{g}(-y) = \int_{\mathbb{R}} g(x)e_y(x)\,dx$. By Exercise-34(i) and [138], we see that $g^\vee \in C_0(\mathbb{R})$; in particular, $g^\vee$ is bounded and uniformly continuous. Also note that if $f \in L^1(\mathbb{R})$ is an even function, then $f^\vee(y) = \int f(x)e_y(x)\,dx = \int f(-x)e_y(x)\,dx = \int f(z)e_{-y}(z)\,dz = \hat{f}(y)$, i.e., $f^\vee = \hat{f}$ when $f$ is even.

Question and Remark: If $f \in L^1(\mathbb{R})$, can we expect the equality $(\hat{f})^\vee = f$? If we want this equality to hold everywhere, a necessary condition (by the observation in the above paragraph) is that $f \in C_0(\mathbb{R})$. Even if we demand equality only almost everywhere, a necessary condition is that $f \in L^\infty(\mathbb{R})$. Another point to note is, formally $(\hat{f})^\vee(x) = \int_{-\infty}^{\infty}\hat{f}(y)e_x(y)\,dy$, but this integral may not be defined if $\hat{f}$ is not an $L^1$-function. With our knowledge that $\hat{f} \in L^\infty(\mathbb{R})$, an integral that we can always define for $f \in L^1(\mathbb{R})$ is $\int_{-u}^{u}\hat{f}(y)e_x(y)\,dy$ for $u \in (0, \infty)$; also we may investigate whether this integral converges to $f(x)$ as $u \to \infty$. This motivates the following definitions.

Definition: (i) For $u \in (0, \infty)$, the continuous Dirichlet kernel $D_u : \mathbb{R} \to \mathbb{C}$ is defined as $D_u = \widehat{1_{[-u,u]}}$, and hence $D_u \in C_0(\mathbb{R})$. From an earlier calculation, $D_u(y) = \frac{e_{-y}(-u) - e_{-y}(u)}{2\pi i y} = \frac{\sin 2\pi u y}{\pi y}$ for $y \in \mathbb{R} \setminus \{0\}$, and $D_u(0) = 2u$. Thus $D_u$ is a real valued function, and also $D_u(-y) = D_u(y)$.

(ii) For $f \in L^1(\mathbb{R})$ and $u > 0$, the $u$th Fourier partial integral $s_u(f) : \mathbb{R} \to \mathbb{C}$ is defined as $s_u(f, a) = \int_{-u}^{u}\hat{f}(x)e_a(x)\,dx$ for $a \in \mathbb{R}$. We now look for conditions that yield $\lim_{u\to\infty} s_u(f, a) = f(a)$.

Exercise-37: (Various expressions for $s_u(f)$) Let $f \in L^1(\mathbb{R})$, $u > 0$ and $a \in \mathbb{R}$. Then,
(i) $s_u(f, a) = \int_{-\infty}^{\infty} f(a + x)\frac{\sin 2\pi u x}{\pi x}\,dx = \int_{-\infty}^{\infty} f(a - x)\frac{\sin 2\pi u x}{\pi x}\,dx = D_u * f(a)$.
(ii) $s_u(f, a) = \int_0^{\infty}[f(a + x) + f(a - x)]\frac{\sin 2\pi u x}{\pi x}\,dx = \int_0^{\infty}[f(a + x) + f(a - x)]D_u(x)\,dx$.
[Hint: (i) By Exercise-35(i), $e_a\hat{f} = \widehat{f_{-a}}$, and for $g := 1_{[-u,u]}$, we have $\hat{g} = D_u$. Hence $s_u(f, a) = \int_{-u}^{u}\hat{f}(x)e_a(x)\,dx = \int_{\mathbb{R}}\widehat{f_{-a}}(x)g(x)\,dx = \int_{\mathbb{R}} f_{-a}(x)\hat{g}(x)\,dx = \int_{-\infty}^{\infty} f(a + x)D_u(x)\,dx$ by Exercise-35(iv). Replacing $x$ with $-x$ gives $s_u(f, a) = \int_{\mathbb{R}} f(a - x)D_u(x)\,dx = f * D_u(a)$ since $D_u$ is an even function.]

[139] (i) Let $g \in L^1(\mathbb{R}) \cap L^\infty(\mathbb{R})$ be such that $g \ge 0$ and $\int_{\mathbb{R}} g = 1$. Let $g_t(x) = t^{-1}g(x/t)$ for $t > 0$. Then $\{g_t : t > 0\}$ as $t \to 0$ is an approximate identity for $L^1(\mathbb{R})$. Moreover, for any $f \in L^1(\mathbb{R})$ we have $\lim_{t\to 0} f * g_t(x) = f(x)$ for a.e. $x \in \mathbb{R}$.
(ii) (Fourier inversion theorem) Assume $f, \hat{f} \in L^1(\mathbb{R})$, and let $f_0 = (\hat{f})^\vee$. Then, $f_0 \in C_0(\mathbb{R})$, $f = f_0$ almost everywhere, and also $\widehat{(f^\vee)} = f_0$. Moreover, $f, \hat{f} \in L^1(\mathbb{R}) \cap L^\infty(\mathbb{R})$.

Proof. (i) We know $\{g_t : t > 0\}$ as $t \to 0$ is an approximate identity for $L^1(\mathbb{R})$. We see $\lim_{t\to 0} f * g_t(x) = f(x)$ for a.e. $x \in \mathbb{R}$ by noting that $|f(x) - f * g_t(x)| = |\int_{\mathbb{R}}(f(x) - f(x - y))g_t(y)\,dy| \le \int_{\mathbb{R}}|f(x) - f_y(x)|\,t^{-1}g(y/t)\,dy = \int_{\mathbb{R}}|f(x) - f_{tz}(x)|\,g(z)\,dz \le \|f - f_{tz}\|_1\|g\|_\infty \to 0$ as $t \to 0$ by Exercise-20.

(ii) Direct evaluation will not work since the complex valued exponential function on $\mathbb{R}$ does not belong to $L^1(\mathbb{R})$. Therefore the proof becomes a little involved, where we need to insert a suitable approximate identity into the integral. We know $f_0 \in C_0(\mathbb{R})$. Let $K(x) = e^{-\pi x^2}$, and $K_t(x) = t^{-1}K(x/t) = t^{-1}e^{-\pi x^2/t^2}$ for $t > 0$. Then $\{K_t : t > 0\}$ as $t \to 0$ is an approximate identity. Since $\hat{K} = K$, we have $h_t(y) := \hat{K_t}(y) = \hat{K}(ty) = K(ty) = e^{-\pi t^2 y^2}$ by Exercise-35(v). Since $K, K_t$ are even functions, $K_t^\vee = \hat{K_t}$, and therefore $\hat{h_t} = K_t$.

Fix $z \in \mathbb{R}$. Since $(h_t) \to 1$ pointwise as $t \to 0$, Lebesgue dominated convergence theorem yields that $f_0(z) = \int_{\mathbb{R}}\hat{f}(y)e_z(y)\,dy = \lim_{t\to 0}\int_{\mathbb{R}}\hat{f}(y)e_z(y)h_t(y)\,dy$. By Exercise-35 and the evenness of $K_t$, we see $\int_{\mathbb{R}}\hat{f}(y)e_z(y)h_t(y)\,dy = \int_{\mathbb{R}}\widehat{f_{-z}}(y)h_t(y)\,dy = \int_{\mathbb{R}} f_{-z}(y)\hat{h_t}(y)\,dy = \int_{\mathbb{R}} f(y + z)K_t(y)\,dy = \int_{\mathbb{R}} f(-y + z)K_t(y)\,dy = f * K_t(z) \to f(z)$ for a.e. $z \in \mathbb{R}$ as $t \to 0$ by part (i). In the last step, one can also take the limit along a sequence $(t_n) \to 0$ after noting that $\|f - f * K_t\|_1 \to 0$ as $t \to 0$, and that $L^1$-convergence implies pointwise convergence a.e. along a subsequence.

Since $f^\vee(y) = \hat{f}(-y)$, we also have $\widehat{(f^\vee)} = f_0$. These imply that $f, \hat{f} \in L^1(\mathbb{R}) \cap L^\infty(\mathbb{R})$. □

The following is an important technical fact for Fourier Theory.

[140] (i) Let $h(x) = \int_0^x y^{-1}\sin y\,dy$. Then $\lim_{x\to\infty} h(x) = \pi/2$.

(ii) $\int_0^\infty D_u(x)\,dx = \int_0^\infty\frac{\sin 2\pi u x}{\pi x}\,dx = 1/2$, and therefore $\int_{-\infty}^{\infty} D_u(x)\,dx = \int_{-\infty}^{\infty}\frac{\sin 2\pi u x}{\pi x}\,dx = 1$.

Proof. (i) Note that $h$ is monotone on $(n\pi, (n + 1)\pi)$, and the differences $h((n + 1)\pi) - h(n\pi) = \int_{n\pi}^{(n+1)\pi} y^{-1}\sin y\,dy$ alternate in sign and decrease to 0 in absolute value. Hence $\lim_{x\to\infty} h(x)$ exists. Thus it suffices to show $\lim_{n\to\infty} h(x_n) = \pi/2$ for some sequence $(x_n) \to \infty$. From the earlier parts of the notes, we know that the discrete Dirichlet kernel11 $D_N \in C(\mathbb{T})$ satisfies $1/2 = \int_0^{1/2} D_N(t)\,dt = \int_0^{1/2}\frac{\sin(2N + 1)\pi t}{\sin \pi t}\,dt$. Since $\lim_{N\to\infty}\int_0^{1/2}\left(\frac{1}{\sin \pi t} - \frac{1}{\pi t}\right)\sin(2N + 1)\pi t\,dt = 0$ by [104], we get $1/2 = \lim_{N\to\infty}\int_0^{1/2}\frac{\sin(2N + 1)\pi t}{\pi t}\,dt$, and hence $\pi/2 = \lim_{N\to\infty}\int_0^{1/2} t^{-1}\sin(2N + 1)\pi t\,dt = \lim_{N\to\infty}\int_0^{(N + \frac{1}{2})\pi} y^{-1}\sin y\,dy = \lim_{N\to\infty} h((N + \frac{1}{2})\pi)$ by putting $(2N + 1)\pi t = y$. For another proof of (i) using complex integration, see Example 2.7 in Chapter 5 of Conway, Functions of one Complex Variable.

(ii) Putting $x = \frac{t}{2\pi u}$, we see $\int_0^\infty D_u(x)\,dx = \frac{1}{\pi}\int_0^\infty t^{-1}\sin t\,dt$. Now apply (i). □

Exercise-38: (Sufficient conditions for the pointwise Fourier inversion) Let $f \in L^1(\mathbb{R})$ and $a \in \mathbb{R}$.
(i) $\lim_{u\to\infty} s_u(f, a) = f(a)$ ⇔ there is $\delta > 0$ such that $\lim_{u\to\infty}\int_{-\delta}^{\delta}(f(a + x) - f(a))\frac{\sin 2\pi u x}{\pi x}\,dx = 0$.
(ii) (Dini's test) If $x \mapsto \frac{f(a + x) - f(a)}{x}$ is in $L^1(-\delta, \delta)$ for some $\delta > 0$, then $\lim_{u\to\infty} s_u(f, a) = f(a)$.
(iii) If $x \mapsto \frac{f(a + x) - f(a)}{x}$ is bounded a.e. in a neighborhood of 0, then $\lim_{u\to\infty} s_u(f, a) = f(a)$.
(iv) If $f$ is differentiable at $a$, or Lipschitz/Hölder continuous at $a$, then $\lim_{u\to\infty} s_u(f, a) = f(a)$.
(v) If $f$ is piecewise $C^1$ on each bounded interval, then $\lim_{u\to\infty} s_u(f, a) = [f(a+) + f(a-)]/2$.
[Hint: (i) $f(a) = \int_{\mathbb{R}} f(a)D_u(x)\,dx$ by [140](ii), and hence $s_u(f, a) - f(a) = (\int_{-\delta}^{\delta} + \int_{|x| > \delta})(f(x + a) - f(a))D_u(x)\,dx$, where the second integral goes to 0 as $u \to \infty$ by [138]. The proofs of other statements are also similar to what we wrote for the pointwise convergence of Fourier series. Refer to G. Bachman, L. Narici and E. Beckenstein, Fourier and Wavelet Analysis for some helpful hints.]

Definition: For $v > 0$, define the continuous Fejér kernel $F_v$ on $\mathbb{R}$ as $F_v(x) = \frac{1}{v}\int_{u=0}^{v} D_u(x)\,du = \left.\frac{-\cos 2\pi u x}{2\pi^2 v x^2}\right|_{u=0}^{v} = \frac{1 - \cos 2\pi v x}{2\pi^2 v x^2} \ge 0$. Note that $F_v$ is an even function, and by the identity $1 - \cos 2\theta = 2\sin^2\theta$ we have $F_v(x) = \frac{\sin^2 \pi v x}{\pi^2 v x^2}$. We see $\int_{\mathbb{R}} F_v(x)\,dx = \frac{1}{v}\int_{u=0}^{v}\int_{x\in\mathbb{R}} D_u(x)\,dx\,du = \frac{1}{v}\int_0^v 1\,du = 1$ by [140](ii) and an interchange of the integrals. Since $F_v(x) = vF_1(vx)$, the family $\{F_v : v > 0\}$ also satisfies the $L^1$-concentration condition (A3). Thus $\{F_v : v > 0\}$ as $v \to \infty$ is an approximate identity on $\mathbb{R}$. For $f \in L^1(\mathbb{R})$ and $v > 0$, the continuous Fejér mean $\sigma_v(f)$ of $f$ is defined as $\sigma_v(f, x) = \frac{1}{v}\int_{u=0}^{v} s_u(f, x)\,du = \frac{1}{v}\int_{u=0}^{v}\int_{y\in\mathbb{R}} D_u(y)f(x - y)\,dy\,du = F_v * f(x)$, where the last equality is by an interchange of the integrals.
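The closed form of the Fejér kernel and its normalization can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names, the midpoint rule, and the truncation of the real line to $[-200, 200]$ are choices made here, not taken from the notes. Because $F_v$ decays only like $1/x^2$, the truncated mass is expected to fall slightly short of 1.

```python
import math


def dirichlet(u, x):
    """Continuous Dirichlet kernel D_u(x) = sin(2*pi*u*x) / (pi*x), with D_u(0) = 2u."""
    if x == 0.0:
        return 2.0 * u
    return math.sin(2.0 * math.pi * u * x) / (math.pi * x)


def fejer_closed_form(v, x):
    """F_v(x) = sin^2(pi*v*x) / (pi^2 * v * x^2), with limiting value F_v(0) = v."""
    if x == 0.0:
        return v
    return math.sin(math.pi * v * x) ** 2 / (math.pi ** 2 * v * x * x)


def fejer_by_averaging(v, x, n=20000):
    """(1/v) * integral over u in [0, v] of D_u(x), via the midpoint rule."""
    h = v / n
    return sum(dirichlet((k + 0.5) * h, x) for k in range(n)) * h / v


def fejer_mass(v, half_width=200.0, n=200000):
    """Midpoint-rule approximation of the integral of F_v over [-half_width, half_width]."""
    h = 2.0 * half_width / n
    return sum(fejer_closed_form(v, -half_width + (k + 0.5) * h) for k in range(n)) * h


if __name__ == "__main__":
    print(fejer_by_averaging(3.0, 0.37), fejer_closed_form(3.0, 0.37))
    print(fejer_mass(1.0))
```

The first line printed compares the averaged Dirichlet kernels with the closed form at one sample point; the second shows the truncated integral of $F_1$.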

11 The same notation is used for the discrete and continuous Dirichlet kernels: $D_N(t) = \frac{\sin(2N + 1)\pi t}{\sin \pi t}$, $D_u(x) = \frac{\sin 2\pi u x}{\pi x}$.

Exercise-39: (i) For $v > 0$, let $g_v : \mathbb{R} \to \mathbb{R}$ be $g_v(x) = 1 - \frac{|x|}{v}$ for $-v < x < v$, and $g_v(x) = 0$ elsewhere. Then $F_v = \hat{g_v} = g_v^\vee \in C_0(\mathbb{R})$.
(ii) For $f \in L^1(\mathbb{R})$, $v > 0$, and $a \in \mathbb{R}$, we have $\sigma_v(f, a) = F_v * f(a) = \int_{-v}^{v}(1 - \frac{|y|}{v})\hat{f}(y)e_a(y)\,dy$.
(iii) If $f \in C_c(\mathbb{R})$, then $\|f - F_v * f\|_\infty \to 0$ as $v \to \infty$.
(iv) Let $1 \le p < \infty$. If $f \in L^p(\mathbb{R})$, then $\|f - F_v * f\|_p \to 0$ as $v \to \infty$.
(v) Let $f \in L^1(\mathbb{R})$ and assume $f$ is continuous at $a \in \mathbb{R}$. Then $\lim_{v\to\infty}\sigma_v(f, a) = f(a)$.
(vi) If $f \in L^1(\mathbb{R})$, then $\lim_{v\to\infty}\sigma_v(f, a) = f(a)$ for a.e. $a \in \mathbb{R}$.
(vii) (Uniqueness) If $f, g \in L^1(\mathbb{R})$ and $\hat{f} = \hat{g}$, then $f = g$ a.e., and hence $f = g$ in $L^1(\mathbb{R})$.
[Hint: (i) Replacing $x$ with $-x$ in $\int_{-v}^{0}$, and integrating by parts, $\hat{g_v}(y) = \int_0^v(1 - \frac{x}{v})[e_y(x) + e_{-y}(x)]\,dx = \int_0^v(1 - \frac{x}{v}) \cdot 2\cos 2\pi x y\,dx = 0 + \frac{1}{v}\int_0^v\frac{\sin 2\pi x y}{\pi y}\,dx = \frac{1}{v}\int_0^v D_x(y)\,dx = F_v(y)$. And $g_v^\vee = \hat{g_v}$ since $g_v$ is even. (ii) Since $F_v(a - z) = g_v^\vee(a - z) = \int g_v(y)e_{a-z}(y)\,dy$, we get $F_v * f(a) = \int F_v(a - z)f(z)\,dz = \int\int g_v(y)e_a(y)e_{-y}(z)f(z)\,dz\,dy = \int g_v(y)e_a(y)\hat{f}(y)\,dy = \int_{-v}^{v}(1 - \frac{|y|}{v})\hat{f}(y)e_a(y)\,dy$. (iii) and (iv): They follow from [102] since $\{F_v : v > 0\}$ as $v \to \infty$ is an approximate identity. (v) Similar to the initial part of the proof of [116](v). Statement (vi) follows from [139](i), and (vii) from (ii) and (vi).]

Remark: The space $C_{bu}(\mathbb{R}) := \{f \in C(\mathbb{R}) : f \text{ is bounded and uniformly continuous}\}$ is closed in $L^\infty(\mathbb{R})$. Also $C_{bu}(\mathbb{R})$ satisfies the following two properties: $\|f_y\|_\infty = \|f\|_\infty$, and $y \mapsto f_y$ from $\mathbb{R}$ to $(C_{bu}(\mathbb{R}), \|\cdot\|_\infty)$ is continuous for each $f \in C_{bu}(\mathbb{R})$. At the abstract level, these two are the properties going into the proof of [102](i). Hence, extending Exercise-39(iii), we can also establish that $\|f - F_v * f\|_\infty \to 0$ as $v \to \infty$ for every $f \in C_{bu}(\mathbb{R})$, and in particular for every $f \in C_0(\mathbb{R})$.

Now we prove the analogue of Dirichlet-Jordan theorem (result [118]) with a different proof, for which we will make use of the following fact.

Fact: If $g : [a, b] \to \mathbb{R}$ is increasing, then $g$ is differentiable almost everywhere, and for any bounded real function $h$ on $[a, b]$, we have $\int_a^b h\,dg = \int_a^b h(x)Dg(x)\,dx$, where $\int_a^b h\,dg$ is the Riemann-Stieltjes integral w.r.to $g$ (see Theorem 6.17 in Rudin, Principles of Mathematical Analysis). If $g : [a, b] \to \mathbb{C}$ is of bounded variation, then we can write $g = g_1 - g_2 + i(g_3 - g_4)$, where the $g_j$'s are monotone increasing, and hence we can define the Riemann-Stieltjes integral w.r.to $g$, and one has $\int_a^b h\,dg = \int_a^b h(x)Dg(x)\,dx$ for any bounded real function $h$ on $[a, b]$ in this case also.

[141] (Dirichlet-Jordan theorem for $\mathbb{R}$) Let $f \in L^1(\mathbb{R})$ be of bounded variation in every compact interval $[a, b] \subset \mathbb{R}$. Then $\lim_{u\to\infty} s_u(f, a) = [f(a+) + f(a-)]/2$ for every $a \in \mathbb{R}$; in particular, $\lim_{u\to\infty} s_u(f, a) = f(a)$ for a.e. $a \in \mathbb{R}$ (since a function of bounded variation, being a linear combination of monotone functions, is differentiable a.e., and hence continuous a.e.).

Proof. Fix $a \in \mathbb{R}$ and let $g(x) = f(a + x) + f(a - x)$. Then $g \in L^1(\mathbb{R})$, and $g$ is of bounded variation on any compact interval $[a, b] \subset \mathbb{R}$. Fix $\delta > 0$. By Exercise-37(ii), $s_u(f, a) = (\int_0^\delta + \int_\delta^\infty)g(x)D_u(x)\,dx$. As a consequence of Riemann-Lebesgue lemma, $\lim_{u\to\infty}\int_\delta^\infty g(x)D_u(x)\,dx = 0$ (see [105](iv) for a comparison). Thus it suffices to show $\lim_{u\to\infty}\int_0^\delta g(x)D_u(x)\,dx = [f(a+) + f(a-)]/2$. Let $h_u(x) = \int_0^x D_u(y)\,dy$ so that $Dh_u = D_u$. Integration by parts yields

$\int_0^\delta g(x)D_u(x)\,dx = \int_0^\delta g(x)Dh_u(x)\,dx = g(x)h_u(x)\big|_0^\delta - \int_0^\delta Dg(x)h_u(x)\,dx.$ (∗)

Let $H(x) = \int_0^x\frac{\sin t}{\pi t}\,dt$. By [140] note that $H(0) = 0$, $H(2\pi u x) = h_u(x)$, and $\lim_{x\to\infty} H(x) = 1/2$. Now $g(x)h_u(x)\big|_0^\delta = g(\delta-)H(2\pi u\delta) \to g(\delta-)/2$ as $u \to \infty$. Also,

$\lim_{u\to\infty}\int_0^\delta Dg(x)h_u(x)\,dx = \lim_{u\to\infty}\int_0^\delta h_u(x)\,dg = \lim_{u\to\infty}\int_0^\delta H(2\pi u x)\,dg = \int_0^\delta\frac{1}{2}\,dg = \frac{g(\delta-) - g(0+)}{2}$

by the Fact above and Lebesgue dominated convergence theorem. Using these in (∗), we get $\lim_{u\to\infty}\int_0^\delta g(x)D_u(x)\,dx = g(0+)/2 = [f(a+) + f(a-)]/2$. □

Exercise-40: If $f \in L^1(\mathbb{R})$, then $\lim_{u\to\infty}\int_a^b s_u(f, x)\,dx = \int_a^b f(x)\,dx$ for every $a < b$ in $\mathbb{R}$.

[Hint: This is essentially a consequence of the fact that $D_u$ is even and hence $s_u$ behaves self-adjointly: $\int s_u(f)g = \int f s_u(g)$. Indeed, letting $g = 1_{(a,b)}$, we see $\int_a^b s_u(f, x)\,dx = \int_{\mathbb{R}} s_u(f, x)g(x)\,dx = \int_{\mathbb{R}} f * D_u(x)g(x)\,dx = \int_{\mathbb{R}}\int_{\mathbb{R}} f(y)D_u(x - y)g(x)\,dx\,dy = \int_{\mathbb{R}}\int_{\mathbb{R}} f(y)D_u(y - x)g(x)\,dx\,dy = \int_{\mathbb{R}} f(y)\,g * D_u(y)\,dy = \int_{\mathbb{R}} f(y)s_u(g, y)\,dy$. Also, since $g \in L^1(\mathbb{R})$ is of bounded variation, $s_u(g) \to g$ pointwise a.e., with $f s_u(g)$ bounded by an integrable function. Hence $\int_a^b s_u(f) = \int_{\mathbb{R}} f s_u(g) \to \int_{\mathbb{R}} fg = \int_a^b f$ as $u \to \infty$.]

17. Fourier transform on S, L2(R), and on distributions

Recall that the Schwartz space $S$ is a subset of $L^p(\mathbb{R})$ for $1 \le p \le \infty$. We denote by $\langle\cdot,\cdot\rangle_2$ the $L^2$-inner product.

[142] If $f \in S$, then $\hat{f} \in S$. The Fourier transform map $F : S \to S$ given by $Ff = \hat{f}$ is linear and bijective, and satisfies the unitary condition $FF^* = I = F^*F$, where $F^* : S \to S$ is the inverse Fourier transform $F^*f = f^\vee$. In particular, $\langle Ff, Fg\rangle_2 = \langle f, g\rangle_2$ and $\|Ff\|_2 = \|f\|_2$ for $f, g \in S$.

Proof. Let $f \in S$ and $i, j \ge 0$. We need to show $\|y^i D^j\hat{f}\|_\infty < \infty$. By Exercise-36, $y^i D^j\hat{f}(y) = y^i(-2\pi i)^j\,\widehat{(x^j f)}(y) = (-1)^j(2\pi i)^{j-i}\,\widehat{D^i(x^j f)}(y)$. And $\|\widehat{D^i(x^j f)}\|_\infty \le \|D^i(x^j f)\|_1 < \infty$ since we have $D^i(x^j f) \in S \subset L^1(\mathbb{R})$ as $f \in S$. For $f, g \in S$, we see by Fubini's theorem that

$\langle Ff, g\rangle_2 = \int\int f(x)e_{-y}(x)\overline{g(y)}\,dx\,dy = \int\int f(x)\overline{g(y)e_y(x)}\,dx\,dy = \int f(x)\overline{F^*g(x)}\,dx = \langle f, F^*g\rangle_2,$

and hence $F^*$ is indeed the adjoint of $F$. Since $S \subset L^1(\mathbb{R})$, we have $F^*F = I$ and similarly $FF^* = I$ by [139]. This gives $\langle Ff, Fg\rangle_2 = \langle f, F^*Fg\rangle_2 = \langle f, g\rangle_2$ for $f, g \in S$. □

Remark: A corollary of [142] is that $\Gamma := \{f \in L^1(\mathbb{R}) : \mathrm{supp}(\hat{f}) \text{ is compact}\}$ is dense in $L^1(\mathbb{R})$. Proof: Since $D \subset S \subset L^1(\mathbb{R})$ are dense inclusions, and since $F^* : S \to S$ is an isomorphism, the set $\{g^\vee : g \in D\}$ is dense in $L^1(\mathbb{R})$. And $\{g^\vee : g \in D\} \subset \Gamma$ since $\widehat{g^\vee} = g$, which has compact support.

[143] (Plancherel's theorem - Fourier transform on $L^2(\mathbb{R})$) Let $F : S \to S$ be the Fourier transform $Ff = \hat{f}$. Since $S$ is dense in $L^2(\mathbb{R})$, we see by [142] that $F$ has a unique extension $F : L^2(\mathbb{R}) \to L^2(\mathbb{R})$ as a unitary operator, i.e., as a bijective linear map satisfying $\langle Ff, Fg\rangle_2 = \langle f, g\rangle_2$ for $f, g \in L^2(\mathbb{R})$. In particular, $\|Ff\|_2 = \|f\|_2$ holds for every $f \in L^2(\mathbb{R})$. The unique extension $F^* : L^2(\mathbb{R}) \to L^2(\mathbb{R})$ of the inverse Fourier transform $F^* : S \to S$ is the inverse of $F$ on $L^2(\mathbb{R})$. Moreover, the extension $F$ on $L^2(\mathbb{R})$ satisfies $Ff = \hat{f}$ a.e. for every $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$.

Proof. All except the last line are evident. To prove the last line, consider $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$. We claim that there is a sequence $(f_n)$ in $D$ such that $\|f - f_n\|_1 \to 0$ and $\|f - f_n\|_2 \to 0$ as $n \to \infty$. Since $(f 1_{[-n,n]}) \to f$ in both $L^1(\mathbb{R})$ and $L^2(\mathbb{R})$, it suffices to consider the case where $f$ has compact support. Choose $g \in D$ with $g \ge 0$ and $\int g = 1$; let $g_n(x) = ng(nx)$; and consider $f_n = f * g_n$ to establish the claim. Now, $\|\hat{f} - \hat{f_n}\|_\infty \le \|f - f_n\|_1 \to 0$ and $\|Ff - Ff_n\|_2 = \|f - f_n\|_2 \to 0$ as $n \to \infty$ by Exercise-34(i) and [142]. But $Ff_n = \hat{f_n}$ since $f_n \in D \subset S$, and therefore $Ff = \hat{f}$ a.e. □
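The norm identity $\|Ff\|_2 = \|f\|_2$ in [143] can be sanity-checked numerically on a concrete pair. The sketch below takes $f(x) = e^{-2\pi a|x|}$ and uses, as an assumption stated here rather than derived in the code, the closed form $\hat{f}(y) = a/(\pi(a^2 + y^2))$, which follows from a direct integration. The helper names `midpoint` and `plancherel_sides` and the truncation intervals are illustrative choices.

```python
import math


def midpoint(f, lo, hi, n):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return sum(f(lo + (k + 0.5) * h) for k in range(n)) * h


def plancherel_sides(a=1.0):
    """Return (||f||_2^2, ||f^||_2^2) for f(x) = exp(-2*pi*a*|x|),
    assuming f^(y) = a / (pi * (a^2 + y^2)); both integrals are truncated."""
    f_sq = lambda x: math.exp(-4.0 * math.pi * a * abs(x))
    fhat_sq = lambda y: (a / (math.pi * (a * a + y * y))) ** 2
    lhs = midpoint(f_sq, -10.0, 10.0, 200000)
    rhs = midpoint(fhat_sq, -200.0, 200.0, 400000)
    return lhs, rhs


if __name__ == "__main__":
    print(plancherel_sides(1.0))
```

Both numbers should be close to $1/(2\pi a)$, the exact value of $\int_{\mathbb{R}} e^{-4\pi a|x|}\,dx$.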

Example: Let $f(x) = e^{-2\pi a|x|}$ on $\mathbb{R}$, where $a > 0$. Then $\hat{f}(y) = \int_{-\infty}^{0} e^{2\pi(a - iy)x}\,dx + \int_0^{\infty} e^{-2\pi(a + iy)x}\,dx = \frac{1}{2\pi(a - iy)} + \frac{1}{2\pi(a + iy)} = \frac{a}{\pi(a^2 + y^2)}$. We can now use the Plancherel identity $\|\hat{f}\|_2^2 = \|f\|_2^2$ to evaluate the following integral: $\int_{\mathbb{R}}\frac{1}{(a^2 + y^2)^2}\,dy = a^{-2}\pi^2\|\hat{f}\|_2^2 = a^{-2}\pi^2\|f\|_2^2 = a^{-2}\pi^2\left(\int_{-\infty}^{0} e^{4\pi a x}\,dx + \int_0^{\infty} e^{-4\pi a x}\,dx\right) = a^{-2}\pi^2 \cdot \left(\frac{1}{4\pi a} + \frac{1}{4\pi a}\right) = \frac{\pi}{2a^3}$.

Three natural classes of approximate identities on $L^1(\mathbb{R})$ can be obtained in a unified fashion as demonstrated in the Exercise below.

Exercise-41: (0) (Facts about integrals) Let $K : \mathbb{R} \to \mathbb{C}$ be any of the following: $K(x) = \frac{1 - \cos x}{\pi x^2}$, or $K(x) = \frac{1}{\pi(1 + x^2)}$, or $K(x) = \frac{e^{-x^2/4}}{\sqrt{4\pi}}$. In each case $K \ge 0$ and $\int_{\mathbb{R}} K = 1$, so that we can manufacture approximate identities from $K$ as stated below.
(i) (Fejér kernel) In the first case, let $K_u(x) = 2\pi u K(2\pi u x) = \frac{1 - \cos 2\pi u x}{2\pi^2 u x^2}$. Then $\{K_u : u > 0\}$ as $u \to \infty$ is an approximate identity for $L^1(\mathbb{R})$.
(ii) (Poisson kernel) In the second case, let $K_t(x) = t^{-1}K(x/t) = \frac{t}{\pi(t^2 + x^2)}$. Then $\{K_t : t > 0\}$ as $t \to 0$ is an approximate identity for $L^1(\mathbb{R})$.

(iii) (Gaussian kernel) In the third case, let $K_t(x) = t^{-1/2}K(x/\sqrt{t}) = \frac{e^{-x^2/4t}}{\sqrt{4\pi t}}$. Then $\{K_t : t > 0\}$ as $t \to 0$ is an approximate identity for $L^1(\mathbb{R})$.

The mathematical formulation of Heisenberg's uncertainty principle says roughly the following: for every $a, b \in \mathbb{R}$, it is impossible for $f$ to be concentrated about $a$ and $\hat{f}$ to be concentrated about $b$ simultaneously. In other words, if $f$ is concentrated, then $\hat{f}$ should spread out, and vice versa.

[144] (i) (Uncertainty principle - qualitative form) If $f, \hat{f} \in C_c(\mathbb{R})$, then $f \equiv 0$.
(ii) (Uncertainty principle - quantitative form) Let $f \in L^2(\mathbb{R})$ satisfy the following decay conditions: $xf, Df \in L^2(\mathbb{R})$ and $x|f|^2$ vanishes at $\infty$. Then $f, \hat{f} \in L^1(\mathbb{R})$, and for every $a, b \in \mathbb{R}$, we have $\left(\int_{\mathbb{R}}|x - a|^2|f(x)|^2\,dx\right)\left(\int_{\mathbb{R}}|y - b|^2|\hat{f}(y)|^2\,dy\right) \ge \frac{\|f\|_2^4}{16\pi^2}$, i.e., $16\pi^2\|(x - a)f\|_2^2\|(y - b)\hat{f}\|_2^2 \ge \|f\|_2^4$.

Proof. (i) Check that $g : \mathbb{C} \to \mathbb{C}$ defined as $g(z) = \int_{\mathbb{R}} f(t)e_{-z}(t)\,dt$ is complex analytic by differentiating under the integral sign, etc. We have $g(u + i0) = \hat{f}(u) = 0$ for $u \in \mathbb{R} \setminus \mathrm{supp}(\hat{f})$. Since $\mathbb{R} \setminus \mathrm{supp}(\hat{f})$ contains an interval (and hence is a set containing a limit point), we get $g \equiv 0$. Then $\hat{f}(u) = g(u) = 0$ for every $u \in \mathbb{R}$. Hence $f \equiv 0$ by Fourier inversion [139] and the continuity of $f$.

(ii) Since $xf \in L^2$, we may write $f = (1 + x^2)^{1/2}f \times (1 + x^2)^{-1/2}$ and apply Cauchy-Schwarz inequality to see $f \in L^1(\mathbb{R})$. Similarly, $Df \in L^2(\mathbb{R})$ implies $y\hat{f} \in L^2(\mathbb{R})$ by Exercise-36(i) and [143], and this in turn (by writing $\hat{f} = (1 + y^2)^{1/2}\hat{f} \times (1 + y^2)^{-1/2}$) implies $\hat{f} \in L^1(\mathbb{R})$.

By Exercise-35(i), a translation in $f$ corresponds to multiplying $\hat{f}$ with a unimodular complex scalar, and a translation in $\hat{f}$ corresponds to multiplying $f$ with a unimodular complex scalar. So, after a translation in both $f$ and $\hat{f}$, we may assume $a = 0 = b$. Thus it suffices to show $16\pi^2\|xf\|_2^2\|y\hat{f}\|_2^2 \ge \|f\|_2^4$. Note that $xf\overline{Df} \in L^1(\mathbb{R})$ by Cauchy-Schwarz inequality since $xf, Df \in L^2(\mathbb{R})$. We claim that $16\pi^2\|xf\|_2^2\|y\hat{f}\|_2^2 \ge \left(2\int|xf\overline{Df}|\right)^2 \ge \left(2\mathrm{Re}\int xf\overline{Df}\right)^2 = \|f\|_2^4$.

We have $4\pi^2\|y\hat{f}\|_2^2 = \|2\pi i y\hat{f}\|_2^2 = \|\widehat{Df}\|_2^2 = \|Df\|_2^2$ by Exercise-36 and [143], and therefore $16\pi^2\|xf\|_2^2\|y\hat{f}\|_2^2 = 4\|xf\|_2^2\|Df\|_2^2 \ge \left(2\int|xf\overline{Df}|\right)^2$ by Cauchy-Schwarz. Next, note that $D|f|^2 = D(f\bar{f}) = f \cdot \overline{Df} + Df \cdot \bar{f} = 2\mathrm{Re}(f\overline{Df})$, and consequently $\int_{-u}^{u}|f|^2 = \int_{-u}^{u}|f|^2 \cdot 1 = x|f|^2\big|_{-u}^{u} - 2\mathrm{Re}\int_{-u}^{u} xf\overline{Df}$. Letting $u \to \infty$ and using the vanishing of $x|f|^2$ at infinity, we conclude $\|f\|_2^2 = 0 - 2\mathrm{Re}\int xf\overline{Df}$ so that $\|f\|_2^4 = \left(2\mathrm{Re}\int xf\overline{Df}\right)^2$. □

Remark: (i) If $f \in L^2(\mathbb{R})$ is with $\|f\|_2 = 1$, then $\inf_{a\in\mathbb{R}}\int_{\mathbb{R}}(x - a)^2|f(x)|^2\,dx =: \sigma^2(f)$ is the variance of $f$ in the language of Probability Theory, and hence [144](ii) says $\sigma^2(f)\sigma^2(\hat{f}) \ge 1/(16\pi^2)$. (ii) Every $f \in S$ satisfies the hypothesis of [144](ii).

Now we wish to define Fourier transform of distributions. Recall the identity $\int\hat{f}g = \int f\hat{g}$. This suggests the definition $\langle\hat{\phi}, f\rangle = \langle\phi, \hat{f}\rangle$. However, note that if $f \in D \setminus \{0\}$, then $\hat{f}$ is not in $D$ by the uncertainty principle. Therefore, to have a symmetric situation, we define Fourier transform only for tempered distributions (and this includes distributions with compact support).

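Definition and Example: For $\phi \in S'$ we define $\hat{\phi} \in S'$ as $\langle\hat{\phi}, f\rangle = \langle\phi, \hat{f}\rangle$. (i) If $\phi \in S'$ comes from an $L^1$-function $g$, then $\hat{\phi} = \hat{g}$ since $\int f\hat{g} = \int\hat{f}g$. (ii) Consider the Dirac measure $\delta_a \in E' \subset S'$. We have $\langle\hat{\delta_a}, f\rangle = \langle\delta_a, \hat{f}\rangle = \hat{f}(a) = \int fe_{-a} = \langle e_{-a}, f\rangle$, and hence $\hat{\delta_a}$ is the tempered distribution induced by the function $e_{-a}$. In particular, $\hat{\delta_0} = e_0 = 1$, which means $\langle\hat{\delta_0}, f\rangle = \int f$ for $f \in S$.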

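Remark: (i) $\phi \mapsto \hat{\phi}$ from $S'$ to itself is a sequentially continuous linear isomorphism due to [142]. (ii) Since $L^p(\mathbb{R}) \subset S'$ for $1 \le p \le \infty$ by [128], the Fourier transform of every $f \in L^p(\mathbb{R})$ ($1 \le p \le \infty$) is now defined in the sense of distributions.

Exercise-42: Let $\phi, \psi \in S'$. Then,
(i) $\widehat{(a\phi + b\psi)} = a\hat{\phi} + b\hat{\psi}$, and $\hat{\tilde{\phi}} = \tilde{\hat{\phi}}$.
(ii) $D\hat{\phi} = \widehat{(-2\pi i x\phi)}$ and hence (by invertibility) $\widehat{(2\pi i x\phi)} = -D\hat{\phi}$.
(iii) $\widehat{(D\phi)} = 2\pi i y\hat{\phi}$.
(iv) $\widehat{(e_a\phi)} = (\hat{\phi})_a$ and $\widehat{(\phi_a)} = e_{-a}\hat{\phi}$.
[Hint: (ii) $\langle D\hat{\phi}, f\rangle = -\langle\hat{\phi}, Df\rangle = -\langle\phi, \widehat{(Df)}\rangle = -\langle\phi, 2\pi i y\hat{f}\rangle = \langle -2\pi i x\phi, \hat{f}\rangle = \langle\widehat{(-2\pi i x\phi)}, f\rangle$. (iii) $\langle\widehat{(D\phi)}, f\rangle = \langle D\phi, \hat{f}\rangle = -\langle\phi, D\hat{f}\rangle = -\langle\phi, -2\pi i\,\widehat{(xf)}\rangle = \langle\hat{\phi}, 2\pi i x f\rangle = \langle 2\pi i y\hat{\phi}, f\rangle$. (iv) $\langle\widehat{(e_a\phi)}, f\rangle = \langle\phi, e_a\hat{f}\rangle = \langle\phi, \widehat{(f_{-a})}\rangle = \langle\hat{\phi}, f_{-a}\rangle = \langle(\hat{\phi})_a, f\rangle$, and $\langle\widehat{(\phi_a)}, f\rangle = \langle\phi_a, \hat{f}\rangle = \langle\phi, (\hat{f})_{-a}\rangle = \langle\phi, \widehat{e_{-a}f}\rangle = \langle\hat{\phi}, e_{-a}f\rangle = \langle e_{-a}\hat{\phi}, f\rangle$.]

Exercise-43: (i) $\hat{f} * \tilde{g} = \widehat{(f\hat{g})}$ for $f, g \in S$ (this is needed for the proof of (ii)).
(ii) Let $g \in S$ and $\phi \in S'$. Then $\widehat{\phi * g} = \hat{g}\hat{\phi}$ and $\hat{\phi} * \hat{g} = \widehat{g\phi}$.
[Hint: (i) As $(\tilde{g})^\vee = \hat{g}$, we have $(\hat{f} * \tilde{g})^\vee = (\hat{f})^\vee(\tilde{g})^\vee = f\hat{g}$ by the product rule $(h_1 * h_2)^\vee = h_1^\vee h_2^\vee$. Now use invertibility in $S$. (ii) $\langle\widehat{\phi * g}, f\rangle = \langle\phi * g, \hat{f}\rangle = \langle\phi, \hat{f} * \tilde{g}\rangle = \langle\phi, \widehat{(f\hat{g})}\rangle = \langle\hat{\phi}, f\hat{g}\rangle = \langle\hat{g}\hat{\phi}, f\rangle$, and $\langle\hat{\phi} * \hat{g}, f\rangle = \langle\hat{\phi}, f * \tilde{\hat{g}}\rangle = \langle\hat{\phi}, f * g^\vee\rangle = \langle\phi, \widehat{f * g^\vee}\rangle = \langle\phi, \hat{f}\,\widehat{(g^\vee)}\rangle = \langle\phi, \hat{f}g\rangle = \langle g\phi, \hat{f}\rangle = \langle\widehat{g\phi}, f\rangle$.]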

Exercise-44: (i) If ϕ ∈ E′ ⊂ S′, then ϕb is (the restriction to R of) a complex analytic function g, i.e., ⟨ϕ,b f⟩ = ⟨g, f⟩ for f ∈ S′. In particular, ϕb ∈ E. Also, Dmϕb has polynomial growth at ∞ ∀ m ≥ 0. (ii) (Product rule) If ϕ, ψ ∈ E′, then ϕ ∗ ψ ∈ E′, and ⟨ϕ[∗ ψ, f⟩ = ⟨ϕ,b f⟩⟨ψ,b f⟩ for f ∈ S. m m [Hint: (i) g : C → C given by g(z) = ⟨ϕ, e−z⟩ is complex analytic, and D g(z) = ⟨ϕ, (−2πiy) e−z⟩. ′ m m m Since ϕ ∈ E , there exist C > 0 and N ≥ 0 with |D g(z)| = |⟨ϕ, (−2πiy) e−z⟩| ≤ CpN ((−2πiy) e−z), 12 from which the polynomial growth property follows . Since e−y(x) = e−x(y), and since the integral representing fb(y) can be approximated by Riemann sums, we get ⟨ϕ,b f⟩ = ⟨ϕ, fb⟩ = ∫ ∫ ∫ ′ ϕ( f(x)e−x(·)dx) = f(x)ϕ(e−x(·))dx = f(x)g(x)dx = ⟨g, f⟩. (ii) We know ϕ ∗ ψ ∈ E . By ∫ ∫ [ [131], we see ⟨ϕ ∗ ψ, f⟩ = f(x + y)ϕ(e−x)ψ(e−y)dxdy = ⟨ϕ, f⟩⟨ψ, f⟩ by interchanging ϕ and ψ with the integrals as above.]

12 See p.119 of Grafakos, Classical Fourier Analysis for the computational details.

18. Fourier transform of measures

In this and the remaining sections, we select a few topics related to Fourier Analysis and give a very brief sketch about them, often with partial or skipped proofs. These sketches are intended as appetizers for the students to learn more about advanced topics related to Fourier Analysis.

Definition: Let $(X, \mathcal{A})$ be a measurable space. A complex measure on $(X, \mathcal{A})$ is a map $\mu : \mathcal{A} \to \mathbb{C}$ satisfying the following: $\mu(A) = \sum_{k=1}^{\infty}\mu(A_k)$ whenever $A = \bigcup_{k=1}^{\infty} A_k$ is a measurable partition. For example, if $f \in L^1(\mathbb{R})$, then $\mu(A) := \int_A f(x)\,dx$ defines a complex Borel measure on $\mathbb{R}$ (hint: use Lebesgue dominated convergence theorem to get countable additivity). This also shows that the modulus of a complex measure does not satisfy the monotonicity property: $A \subset B$ does not imply $|\mu(A)| \le |\mu(B)|$. For instance, if $f = 1_{[0,1]} - 1_{[2,3]}$ and $d\mu = f\,dx$, then $\mu(\mathbb{R}) = 0$ but $\mu([0, 1]) = 1$.

We will show that the modulus of a complex measure can always be dominated by a finite (positive) measure in an optimal manner. In the proof, we will use the following fact.

Fact: (see Lemma 6.3 in Rudin, Real and Complex Analysis) If $z_1, \ldots, z_N$ are finitely many complex numbers, then there is $F \subset \{1, \ldots, N\}$ with $|\sum_{k\in F} z_k| \ge \pi^{-1}\sum_{k=1}^{N}|z_k|$.

[145] (Domination by finite positive measure) Let $\mu$ be a complex measure on a measurable space $(X, \mathcal{A})$. For $A \in \mathcal{A}$, let $\beta(A) = \sup\sum_{k=1}^{\infty}|\mu(A_k)|$, where the supremum is taken over all measurable partitions $A = \bigcup_{k=1}^{\infty} A_k$. Then, $\beta$ is a finite positive measure on $(X, \mathcal{A})$ with $|\mu(X)| \le \beta(X)$.

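Proof. Clearly $\beta(\emptyset) = 0$. Also note $|\mu(A)| \le \beta(A)$ for $A \in \mathcal{A}$ since $A = A \cup \emptyset \cup \emptyset \cup \emptyset \cup \cdots$ is also a measurable partition. To check countable additivity for $\beta$, consider $A \in \mathcal{A}$ and a measurable partition $A = \bigcup_{k=1}^{\infty} A_k$. For any measurable partition $A = \bigcup_{n=1}^{\infty} B_n$, we have that $A_k = \bigcup_{n=1}^{\infty}(A_k \cap B_n)$ and $B_n = \bigcup_{k=1}^{\infty}(A_k \cap B_n)$ are measurable partitions of $A_k$ and $B_n$, and hence $\sum_{n=1}^{\infty}|\mu(B_n)| \le \sum_{n=1}^{\infty}|\sum_{k=1}^{\infty}\mu(A_k \cap B_n)| \le \sum_{k=1}^{\infty}\sum_{n=1}^{\infty}|\mu(A_k \cap B_n)| \le \sum_{k=1}^{\infty}\beta(A_k)$. Taking supremum over all measurable partitions $A = \bigcup_{n=1}^{\infty} B_n$, we get $\beta(A) \le \sum_{k=1}^{\infty}\beta(A_k)$. To prove the reverse inequality, consider $0 \le c_k < \beta(A_k)$, and choose a measurable partition $A_k = \bigcup_{n=1}^{\infty} C_{k,n}$ of $A_k$ with $c_k < \sum_{n=1}^{\infty}|\mu(C_{k,n})|$ for each $k \in \mathbb{N}$. As $A = \bigcup_{k,n=1}^{\infty} C_{k,n}$ is a measurable partition, we get $\sum_{k=1}^{\infty} c_k \le \sum_{k,n=1}^{\infty}|\mu(C_{k,n})| \le \beta(A)$, and it follows that $\sum_{k=1}^{\infty}\beta(A_k) \le \beta(A)$ by the choice of $c_k$'s.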

If $\beta(X) = \infty$, we derive a contradiction as follows. Given $M > 0$, choose a measurable partition $X = \bigcup_{k=1}^{\infty} A_k$ with $\sum_{k=1}^{N}|\mu(A_k)| > \pi M$ for some $N \in \mathbb{N}$. Applying the Fact mentioned above with $z_k = \mu(A_k)$, find $F \subset \{1, \ldots, N\}$ such that for $B := \bigcup_{k\in F} A_k$ we have $|\mu(B)| = |\sum_{k\in F}\mu(A_k)| \ge \pi^{-1}\sum_{k=1}^{N}|\mu(A_k)| > M$. As $M > 0$ is arbitrary, we must have $\sup\{|\mu(B)| : B \in \mathcal{A}\} = \infty$. Therefore, we can find a sequence $(B_n)$ in $\mathcal{A}$ such that $|\mu(B_1)| \ge 1$ and $|\mu(B_{n+1})| \ge 1 + \sum_{j=1}^{n}|\mu(B_j)|$. Put $C_1 = B_1$ and $C_{n+1} = B_{n+1} \setminus \bigcup_{j=1}^{n} B_j$. Then $|\mu(C_n)| \ge 1$ for every $n \in \mathbb{N}$ and hence the series $\sum_{n=1}^{\infty}\mu(C_n)$ cannot converge to any complex number. On the other hand, we should have $\mu(\bigcup_{n=1}^{\infty} C_n) = \sum_{n=1}^{\infty}\mu(C_n)$ since the $C_n$'s are disjoint. This is the required contradiction. □

Remark: (i) In the above, $|\mu|(X) = \beta(X)$ is called the total variation of $\mu$, and $\beta$ is called the total variation measure of $\mu$ (often $\beta$ is denoted as $|\mu|$; note that $|\mu(A)| \le |\mu|(A)$, but equality may not hold). (ii) Because of [145], the collection of complex measures does not include all positive measures since a positive measure need not be finite!

Definition: If $X$ is a metric space, let $M(X)$ denote the collection of all complex Borel measures on $X$. For $\mu \in M(\mathbb{R})$, its Fourier transform $\hat{\mu} : \mathbb{R} \to \mathbb{C}$ is defined as $\hat{\mu}(y) = \int_{\mathbb{R}} e_{-y}(x)\,d\mu(x)$ for $y \in \mathbb{R}$. Similarly, the $n$th Fourier coefficient of $\mu \in M(\mathbb{T})$ is defined as $\hat{\mu}(n) = \int_{\mathbb{T}} e_{-n}(x)\,d\mu(x)$ for $n \in \mathbb{Z}$. Note that if $\mu \in M(\mathbb{R})$ is absolutely continuous w.r.to the Lebesgue measure, then by Radon-Nikodym theorem, there is $f \in L^1(\mathbb{R})$ with $d\mu = f\,dt$ and hence $\hat{\mu}(y) = \int_{\mathbb{R}} e_{-y}(t)f(t)\,dt = \hat{f}(y)$.

Remark: The Fourier transform of a measure shares many (but not all) of the properties of the Fourier transform of a function. This is not surprising because we can identify $f \in L^1(\mathbb{R})$ with $f\,dx \in M(\mathbb{R})$. Also, it can be shown that $M(\mathbb{R})$ is a Banach space w.r.to the norm $\|\mu\| := |\mu|(\mathbb{R})$.

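Exercise-45: (Properties - I) Let $\mu \in M(\mathbb{R})$, and let $|\mu| = \beta$ as in [145]. Then,
(i) $|\int_{\mathbb{R}} f\,d\mu| \le \int_{\mathbb{R}}|f|\,d|\mu|$.
(ii) $\hat{\mu} \in L^\infty(\mathbb{R})$ with $\|\hat{\mu}\|_\infty \le |\mu|(\mathbb{R})$.
(iii) $\hat{\mu} : \mathbb{R} \to \mathbb{C}$ is uniformly continuous.
(iv) In general, Riemann-Lebesgue lemma fails for $\hat{\mu}$, i.e., $\hat{\mu}$ may not vanish at infinity; for example, $\hat{\delta_0}(y) = \int_{\mathbb{R}} e_{-y}\,d\delta_0 = \delta_0(e_{-y}) = 1$ for every $y \in \mathbb{R}$. However, if $\mu$ is absolutely continuous w.r.to the Lebesgue measure, then $\lim_{|y|\to\infty}\hat{\mu}(y) = 0$ by Radon-Nikodym theorem.
(v) If $\mu$ has compact support, then $\hat{\mu} \in E$ with $D^m\hat{\mu}(y) = (-2\pi i)^m\,\widehat{x^m\mu}(y)$ for $m \in \mathbb{N}$ and $y \in \mathbb{R}$.
(vi) If $f, \hat{f} \in L^1(\mathbb{R})$ (for instance, if $f \in S$), then $\int_{\mathbb{R}} f\,d\mu = \int_{\mathbb{R}}\hat{f}(y)\hat{\mu}(-y)\,dy$.
[Hint: (ii) $|\hat{\mu}(y)| \le \int_{\mathbb{R}}|e_{-y}(x)|\,d|\mu|(x) = \int_{\mathbb{R}} 1\,d|\mu| = |\mu|(\mathbb{R})$. (iii) $|\hat{\mu}(y + t) - \hat{\mu}(y)| \le \int_{\mathbb{R}}|e_{-t}(x) - 1|\,d|\mu|(x)$. (v) Let $\mathrm{supp}(\mu) \subset (-b, b)$. We have $t^{-1}(\hat{\mu}(y + t) - \hat{\mu}(y)) = \int_{-b}^{b} e_{-y}(x)t^{-1}(e_{-t}(x) - 1)\,d\mu(x)$, $|t^{-1}(e_{-t}(x) - 1)| \le 2\pi|x|$, and $x \mapsto 2\pi x$ is integrable on $(-b, b)$ w.r.to $|\mu|$. By dominated convergence theorem, $D\hat{\mu}(y) = \int_{-b}^{b} e_{-y}(x)(-2\pi i x)\,d\mu = -2\pi i\hat{\nu}(y)$, where $d\nu = x\,d\mu$. (vi) $f = (\hat{f})^\vee$ by Fourier inversion, and hence $\int_{\mathbb{R}} f\,d\mu = \int_{\mathbb{R}}\int_{\mathbb{R}}\hat{f}(y)e_y(x)\,d\mu(x)\,dy = \int_{\mathbb{R}}\hat{f}(y)\hat{\mu}(-y)\,dy$ by Fubini's theorem.]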

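Exercise-46: (Properties - II) Let $\mu, \nu$ be complex measures on $\mathbb{R}$. Then,
(i) (Linearity) $\widehat{a\mu + b\nu} = a\hat{\mu} + b\hat{\nu}$ for $a, b \in \mathbb{C}$.
(ii) $\widehat{\mu * \nu} = \hat{\mu}\hat{\nu}$, where $\mu * \nu(A) := \int_{\mathbb{R}}\int_{\mathbb{R}} 1_A(x + y)\,d\mu(x)\,d\nu(y) = \int_{\mathbb{R}}\mu(A - y)\,d\nu(y) = \int_{\mathbb{R}}\nu(A - x)\,d\mu(x)$.
(iii) $\int_{\mathbb{R}}\hat{\mu}\,d\nu = \int_{\mathbb{R}}\hat{\nu}\,d\mu$.
(iv) (Uniqueness) If $\hat{\mu} = \hat{\nu}$, then $\mu = \nu$.
[Hint: (ii) $\mu * \nu$ satisfies $\int_{\mathbb{R}} f\,d(\mu * \nu) = \int_{\mathbb{R}}\int_{\mathbb{R}} f(x + y)\,d\mu(x)\,d\nu(y)$ for $f \in L^1(\mathbb{R}, \mu) \cap L^1(\mathbb{R}, \nu)$. Hence $\widehat{\mu * \nu}(z) = \int_{\mathbb{R}}\int_{\mathbb{R}} e_{-z}(x + y)\,d\mu(x)\,d\nu(y) = \int_{\mathbb{R}}\hat{\mu}(z)e_{-z}(y)\,d\nu(y) = \hat{\mu}(z)\hat{\nu}(z)$. (iii) Use Fubini's theorem. (iv) By Exercise-45(vi), $\int_{\mathbb{R}} f\,d\mu = \int_{\mathbb{R}} f\,d\nu$ for every $f \in D \subset S$. Now $1_{[a,b]}$ can be approximated by members of $D$, and hence $\mu([a, b]) = \nu([a, b])$ for every $a < b$.]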

Remark: (i) Analogues of Exercise-45 and Exercise-46 hold for complex measures on $\mathbb{T}$. (ii) $\mu \in M(\mathbb{T})$ is called a Rajchman measure if Riemann-Lebesgue lemma holds for $\hat{\mu}$, i.e., if $\lim_{|n|\to\infty}\hat{\mu}(n) = 0$. For example, if $\mu \in M(\mathbb{T})$ is absolutely continuous w.r.to the Lebesgue measure, then $\hat{\mu} = \hat{f}$ for some $f \in L^1(\mathbb{T})$ by Radon-Nikodym theorem, and therefore $\mu$ is a Rajchman measure. By a theorem of Neder, every Rajchman measure $\mu$ is continuous in the sense that $\mu(\{a\}) = 0$ $\forall\, a \in \mathbb{T}$.

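Seminar topic: (Bochner's theorem) Let $g : \mathbb{R} \to \mathbb{C}$ be continuous. Then $g = \hat{\mu}$ for some $\mu \in M(\mathbb{R})$ with $\mu \ge 0$ iff $g$ is positive definite in the sense that $\sum_{j,k=1}^{n} g(x_j - x_k)z_j\overline{z_k} \ge 0$ for any finitely many points $x_1, \ldots, x_n \in \mathbb{R}$ and scalars $z_1, \ldots, z_n \in \mathbb{C}$ (see section 2.8 in Katznelson, An Introduction to Harmonic Analysis).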

Going back to the theory of Fourier series, we may now supplement [119] as follows:

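[146] Let $f \in C(\mathbb{T})$ be of bounded variation. Then,
(i) Let $\mu \in M(\mathbb{T})$ be given by the Riemann-Stieltjes integral w.r.to $f$, i.e., $\mu(A) = \int_{\mathbb{T}} 1_A\,df$ for Borel subsets $A \subset \mathbb{T}$. Then $2\pi|n\hat{f}(n)| \le |\hat{\mu}(n)| \le |\mu|(\mathbb{T})$ for every $n \in \mathbb{Z}$.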

(ii) limN→∞ ∥f − sN (f)∥∞ = 0, i.e., (sN (f)) → f uniformly.

Proof. (i) Recall that a function of bounded variation is differentiable a.e. Now, integration by parts gives $2\pi i n\hat{f}(n) = 2\pi i n\int_0^1 f(t)e_{-n}(t)\,dt = 0 + \int_0^1 e_{-n}(t)Df(t)\,dt = \int_0^1 e_{-n}(t)\,df(t) = \int_0^1 e_{-n}(t)\,d\mu(t) = \hat{\mu}(n)$. And use (the analogue of) Exercise-45(ii).

(ii) $\lim_{N\to\infty}\|f - \sigma_N(f)\|_\infty = 0$ by [115], and $\sup_n|n\hat{f}(n)| < \infty$ by part (i). Now use Hardy's Tauberian theorem [111] (to be precise, a uniform version of [111]) to deduce $\lim_{N\to\infty}\|f - s_N(f)\|_\infty = 0$. □

Remark: On Fourier series, termwise integration is always allowed, but termwise differentiation is allowed only under extra hypotheses. We explain: (i) Let $f \in L^1(\mathbb{T})$ and let $F : \mathbb{T} \to \mathbb{C}$ be $F(t) = \int_0^t f(s)\,ds$. Then $F$ is absolutely continuous, and hence is a continuous function of bounded variation (see 5.4 of Royden, Real Analysis). By [146], the Fourier series of $F$ converges to $F$ uniformly. Since uniform convergence allows the interchange of integration and summation, the Fourier series of $F$ is obtained by termwise integration of the Fourier series of $f$. (ii) Since termwise differentiation of $\sum_{n\in\mathbb{Z}}\hat{f}(n)e_n$ brings an additional $n$ to the numerator, the resulting series may not converge for a general $f \in L^1(\mathbb{T})$. However, if we assume some smoothness condition, say assume $f \in L^1(\mathbb{T})$ is piecewise $C^2$, then it can be shown that the series obtained by termwise differentiation of $\sum_{n\in\mathbb{Z}}\hat{f}(n)e_n(t)$ converges pointwise to $[Df(t+) + Df(t-)]/2$ for every $t \in \mathbb{T}$.

19. Poisson summation formula

If $f \in L^1(\mathbb{R})$, we may ask what the series $\sum_{n\in\mathbb{Z}}\hat{f}(n)e_n$ represents. Poisson summation formula relates this series to the periodization of $f$ defined below.

Definition: For $f \in L^1(\mathbb{R})$, its periodization $f_P$ on $\mathbb{T}$ is formally defined as $f_P(t) = \sum_{k\in\mathbb{Z}} f(t + k)$ for $t \in \mathbb{T} = [0, 1)$ (we may also view $f_P$ as defined on the whole of $\mathbb{R}$ with period 1, given by the same series). For example, the periodization of the continuous Dirichlet kernel is the discrete Dirichlet kernel (see p.223 of M.A. Pinsky, Introduction to Fourier Analysis and Wavelets).

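[147] Let $f \in L^1(\mathbb{R})$ and $f_P$ be its periodization on $\mathbb{T}$ defined above. Then,
(i) $f_P(t) \in \mathbb{C}$, i.e., the series $\sum_{k\in\mathbb{Z}} f(t+k)$ is convergent, for a.e. $t \in \mathbb{T}$. Also, $f_P \in L^1(\mathbb{T})$ with $\widehat{f_P}(n) = \hat{f}(n)$ for every $n \in \mathbb{Z}$.
(ii) (Poisson summation formula) Assume in addition that $\sum_{n\in\mathbb{Z}} |\hat{f}(n)| < \infty$. Then (after modifying on a null set), $f_P \in C(\mathbb{T})$ with $\sum_{k\in\mathbb{Z}} f(t+k) = f_P(t) = \sum_{n\in\mathbb{Z}} \hat{f}(n)e_n(t)$ for every $t \in \mathbb{T}$.

Proof. (i) As $f \in L^1(\mathbb{R})$, we see $\infty > \int_{\mathbb{R}} |f(x)|\,dx = \sum_{k\in\mathbb{Z}} \int_k^{k+1} |f(x)|\,dx = \sum_{k\in\mathbb{Z}} \int_0^1 |f(x+k)|\,dx$. Since the series is absolutely convergent, we may interchange summation and integration to get $\int_0^1 \big(\sum_{k\in\mathbb{Z}} |f(x+k)|\big)\,dx < \infty$. This shows $f_P(x)$ is finite a.e. and $f_P|_{[0,1)} \in L^1(\mathbb{T})$. Similarly, an interchange of series and integration yields $\widehat{f_P}(n) = \int_0^1 f_P(t)e_{-n}(t)\,dt = \int_0^1 \big(\sum_{k\in\mathbb{Z}} f(t+k)\big)e_{-n}(t)\,dt = \sum_{k\in\mathbb{Z}} \int_0^1 f(t+k)e_{-n}(t)\,dt = \sum_{k\in\mathbb{Z}} \int_k^{k+1} f(y)e_{-n}(y)\,dy = \int_{\mathbb{R}} f(y)e_{-n}(y)\,dy = \hat{f}(n)$, where the substitution $y = t + k$ uses $e_{-n}(y - k) = e_{-n}(y)$ for $k \in \mathbb{Z}$.
(ii) By part (i), $\sum_{n\in\mathbb{Z}} |\widehat{f_P}(n)| = \sum_{n\in\mathbb{Z}} |\hat{f}(n)| < \infty$, and hence by Exercise-16(iii) we obtain $f_P(t) = \sum_{n\in\mathbb{Z}} \widehat{f_P}(n)e_n(t) = \sum_{n\in\mathbb{Z}} \hat{f}(n)e_n(t)$ for $t \in \mathbb{T}$. □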
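As a numerical illustration of [147](ii), here is a minimal sketch in Python. It assumes the Gaussian $f(x) = e^{-\pi x^2}$, which is not discussed in the notes but is a standard example satisfying $\hat{f} = f$ under the convention $\hat{f}(y) = \int_{\mathbb{R}} f(x)e_{-y}(x)\,dx$. The function names and the truncation levels `K`, `N` are our own illustrative choices.

```python
import cmath
import math

def gaussian(x):
    """f(x) = exp(-pi x^2); assumed example with fhat = f for the convention used here."""
    return math.exp(-math.pi * x * x)

def periodization(f, t, K=20):
    """Truncated periodization: sum of f(t + k) over |k| <= K."""
    return sum(f(t + k) for k in range(-K, K + 1))

def fourier_side(fhat, t, N=20):
    """Truncated sum of fhat(n) e_n(t) over |n| <= N, where e_n(t) = exp(2 pi i n t)."""
    return sum(fhat(n) * cmath.exp(2j * math.pi * n * t) for n in range(-N, N + 1))

def poisson_gap(t):
    """Absolute difference between the two sides of the Poisson summation formula at t."""
    return abs(periodization(gaussian, t) - fourier_side(gaussian, t))

if __name__ == "__main__":
    for t in (0.0, 0.1, 0.25, 0.5):
        print(t, periodization(gaussian, t), fourier_side(gaussian, t).real)
```

Both columns printed should agree up to rounding, since the terms discarded by the truncation are far below machine precision for this rapidly decaying $f$.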

As an application, we mention below a case of recovering $g : \mathbb{R} \to \mathbb{C}$ from just knowing $g|_{\mathbb{Z}}$.

Exercise-47: (Sampling formula - simple form) Let $f \in L^1(\mathbb{T})$ be with $\sum_{n\in\mathbb{Z}} |\hat{f}(n)| < \infty$, and $g : \mathbb{R} \to \mathbb{C}$ be $g(y) = \int_{-1/2}^{1/2} f(t)e_y(t)\,dt$ for $y \in \mathbb{R}$. Then $g(x) = \sum_{n\in\mathbb{Z}} g(n)\,\frac{\sin \pi(x-n)}{\pi(x-n)}$ for $x \in \mathbb{R}$, where the quotient is read as 1 when $x = n$.
[Hint: Parametrize $\mathbb{T} = [-1/2, 1/2)$ and extend $f$ to $\mathbb{R}$ by putting $f = 0$ for $|t| > 1/2$. Then $f \in L^1(\mathbb{R})$ and $\hat{f}(y) := \int_{\mathbb{R}} f(x)e_{-y}(x)\,dx = g(-y)$ for $y \in \mathbb{R}$. Poisson summation formula for the periodization $f_P \in L^1(\mathbb{T})$ gives $f_P(t) = \sum_{n\in\mathbb{Z}} \hat{f}(n)e_n(t) = \sum_{n\in\mathbb{Z}} g(-n)e_n(t) = \sum_{n\in\mathbb{Z}} g(n)e_{-n}(t)$ for $t \in \mathbb{T}$. Since $f_P = f$ on $\mathbb{T}$, we get $g(x) = \int_{-1/2}^{1/2} f(t)e_x(t)\,dt = \int_{-1/2}^{1/2} \big(\sum_{n\in\mathbb{Z}} g(n)e_{-n}(t)\big)e_x(t)\,dt = \sum_{n\in\mathbb{Z}} g(n)\int_{-1/2}^{1/2} e_{x-n}(t)\,dt$, where the interchange of integration and series is justified by uniform convergence. Now note that $\int_{-1/2}^{1/2} e_{x-n}(t)\,dt = \frac{e_{x-n}(t)}{2\pi i(x-n)}\Big|_{t=-1/2}^{1/2} = \frac{\sin \pi(x-n)}{\pi(x-n)}$.]
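The sampling formula of Exercise-47 can be tried out numerically. The sketch below assumes the triangle function $f(t) = 1 - 2|t|$ on $[-1/2, 1/2)$, our own choice and not from the notes. A direct integration by parts gives $g(y) = (1 - \cos \pi y)/(\pi y)^2$ with $g(0) = 1/2$, so that $\sum_n |\hat{f}(n)| = \sum_n |g(-n)| < \infty$. The truncation level `N` is illustrative.

```python
import math

def g(y):
    """g(y) = integral over [-1/2, 1/2] of (1 - 2|t|) e_y(t) dt = (1 - cos(pi y)) / (pi y)^2."""
    if y == 0:
        return 0.5
    return (1.0 - math.cos(math.pi * y)) / (math.pi * y) ** 2

def sinc(u):
    """sin(pi u) / (pi u), with the value 1 at u = 0."""
    if u == 0:
        return 1.0
    return math.sin(math.pi * u) / (math.pi * u)

def reconstruct(x, N=2000):
    """Truncated sampling series: sum of g(n) sinc(x - n) over |n| <= N."""
    return sum(g(n) * sinc(x - n) for n in range(-N, N + 1))

if __name__ == "__main__":
    for x in (0.3, 1.7, -2.45):
        print(x, g(x), reconstruct(x))
```

Only the integer samples $g(n)$ enter `reconstruct`, yet the printed values should match $g(x)$ at non-integer points up to the truncation error of the series.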

Remark: In the language of Physics, Exercise-47 is called the sampling of a bandlimited signal, where a bandlimited signal is a signal whose Fourier transform has compact support; note that if we think of $f$ in Exercise-47 as defined on the whole of $\mathbb{R}$, then part of the hypothesis is $\operatorname{supp}(f) \subset [-1/2, 1/2]$, where $f$ is (formally, by Fourier inversion) the Fourier transform of the signal $g$.

20. Two theorems of Wiener

We will present two theorems of Wiener in Fourier Theory using tools from the theory of Banach algebras¹³. This will also provide an opportunity for the student to see the fruitful interaction among different branches of Mathematics. A few facts about Banach algebras will be briefly mentioned below.

Definition: A Banach algebra is a complex Banach space $\Gamma$ admitting an associative multiplication operation '$\cdot$' that satisfies the following for $s, t \in \Gamma$: (i) (submultiplicative property of norm) $\|s \cdot t\| \le \|s\|\|t\|$, and (ii) (bilinearity of product) $(s, t) \mapsto s \cdot t$ is linear w.r.to addition in each variable. If the multiplication is also commutative, then $\Gamma$ is called a commutative Banach algebra. If there is a multiplicative identity, then $\Gamma$ is called a unital Banach algebra. It may be noted that there is a simple procedure by which a multiplicative unit (say) $u$ can be added to any non-unital Banach algebra $\Gamma$ to convert it into a (slightly larger) unital Banach algebra $\Gamma + \mathbb{C}u$.

Some examples of commutative Banach algebras:
(i) $\mathbb{C}$ with usual multiplication.
(ii) $L^\infty(\mathbb{R})$ with pointwise product as multiplication.
(iii) $C(K) := \{\text{all continuous } f : K \to \mathbb{C}\}$ with sup-norm and pointwise multiplication, where $K$ is a compact Hausdorff space.
(iv) $L^1(\mathbb{R})$ with convolution as multiplication (recall that $\|f * g\|_1 \le \|f\|_1\|g\|_1$ for $f, g \in L^1(\mathbb{R})$).
(v) $l^1(\mathbb{Z})$ with discrete convolution $(x * y)(k) := \sum_{n\in\mathbb{Z}} x(k-n)y(n)$ as multiplication, whose multiplicative identity is the element $(\ldots, 0, 0, 1, 0, 0, \ldots)$, where 1 is at the 0th place.
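Example (v) is easy to experiment with on finitely supported sequences. The following is a minimal sketch; representing a sequence as a Python dict from index to value is our own assumption, made only for illustration.

```python
def convolve(x, y):
    """Discrete convolution (x * y)(k) = sum_n x(k - n) y(n) for finitely supported sequences.

    A sequence is a dict mapping an integer index to a complex value; missing indices are 0.
    """
    out = {}
    for m, xm in x.items():
        for n, yn in y.items():
            out[m + n] = out.get(m + n, 0) + xm * yn
    return out

def norm1(x):
    """The l^1 norm: sum of |x(n)|."""
    return sum(abs(v) for v in x.values())

# The multiplicative identity (..., 0, 0, 1, 0, 0, ...) with 1 at the 0th place.
delta0 = {0: 1}

x = {-1: 2, 0: -1, 3: 0.5}
y = {0: 1j, 2: 3}

if __name__ == "__main__":
    print(convolve(x, delta0) == x)                       # identity element
    print(norm1(convolve(x, y)) <= norm1(x) * norm1(y))   # submultiplicativity
```

The same dict representation also shows $v_i * v_j = v_{i+j}$ for the standard basis vectors, since `convolve({i: 1}, {j: 1})` has the single entry `{i + j: 1}`.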

Definition: Let $\Gamma$ be a commutative Banach algebra and $\Gamma^* = \{\text{all continuous linear } \phi : \Gamma \to \mathbb{C}\}$. We say $\phi \in \Gamma^* \setminus \{0\}$ is a multiplicative functional if $\phi(st) = \phi(s)\phi(t)$ for $s, t \in \Gamma$. If $K$ is a compact Hausdorff space and $a \in K$, then the evaluation map $\phi_a : C(K) \to \mathbb{C}$ given by $\phi_a(f) = f(a)$ is a multiplicative functional on $C(K)$; also, $\ker(\phi_a) = \{f \in C(K) : f(a) = 0\}$ is a maximal ideal in $C(K)$.

We will use the following facts about a commutative unital Banach algebra Γ in the sequel.

Fact-1: Maximal ideals in Γ are precisely the kernels of multiplicative functionals on Γ.

¹³ Originally these theorems were proved using the tools of Fourier Analysis, with rather complicated proofs.

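Fact-2: An element $t \in \Gamma$ is invertible w.r.to multiplication iff $\phi(t) \neq 0$ for every multiplicative functional $\phi$ on $\Gamma$.

Fact-3: Let $\mathcal{M}(\Gamma) = \{\text{all multiplicative functionals on } \Gamma\} \simeq \{\text{all maximal ideals in } \Gamma\}$, which is called the Gelfand space of $\Gamma$. We have $\mathcal{M}(\Gamma) \subset$ the unit sphere of the dual $\Gamma^*$, and $(\mathcal{M}(\Gamma), \text{weak}^*)$ is a compact Hausdorff space. Also, the evaluation map (called the Gelfand map) $E : \Gamma \to C(\mathcal{M}(\Gamma))$ given by $E_t(\phi) = \phi(t)$ for $t \in \Gamma$ and $\phi \in \mathcal{M}(\Gamma)$ embeds $\Gamma$ in $C(\mathcal{M}(\Gamma))$. If $\Gamma$ has no multiplicative unit, then we can only say that $\mathcal{M}(\Gamma)$ is a locally compact Hausdorff space, and $\|\phi\| \le 1$ for every $\phi \in \mathcal{M}(\Gamma)$ (if $\|\phi\| > 1$, there is $t \in \Gamma$ with $\|t\| < 1 < |\phi(t)|$; then $\|t^n\| \le \|t\|^n \to 0$, but $|\phi(t^n)| = |\phi(t)|^n \to \infty$, a contradiction to the continuity of $\phi$).

[148] (i) $\phi \in \mathcal{M}(l^1(\mathbb{Z}))$ $\Leftrightarrow$ $\exists\, z \in \mathbb{C}$ with $|z| = 1$ such that $\phi(x) = \sum_{n\in\mathbb{Z}} x(n)z^n$. Thus $\mathcal{M}(l^1(\mathbb{Z})) = \mathbb{T}$.
(ii) (Wiener's theorem about invertibility) Let $C_1(\mathbb{T}) = \{f \in C(\mathbb{T}) : (\hat{f}(n)) \in l^1(\mathbb{Z})\}$. If $f \in C_1(\mathbb{T})$ is non-vanishing on $\mathbb{T}$, then $1/f \in C_1(\mathbb{T})$.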

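Proof. (i) Here we think of $\mathbb{T}$ as $\mathbb{T} = \{z \in \mathbb{C} : |z| = 1\}$. For $z \in \mathbb{T}$, let $\phi_z : l^1(\mathbb{Z}) \to \mathbb{C}$ be $\phi_z(x) = \sum_{n\in\mathbb{Z}} x(n)z^n$, which is linear and continuous, and not identically zero. Now, $\phi_z(x * y) = \sum_{k\in\mathbb{Z}} (x * y)(k)z^k = \sum_{k\in\mathbb{Z}}\sum_{n\in\mathbb{Z}} x(k-n)y(n)z^k = \sum_{m\in\mathbb{Z}}\sum_{n\in\mathbb{Z}} x(m)y(n)z^{m+n} = \phi_z(x)\phi_z(y)$ and thus $\phi_z \in \mathcal{M}(l^1(\mathbb{Z}))$. Conversely, consider $\phi \in \mathcal{M}(l^1(\mathbb{Z}))$. Let $\{v_n : n \in \mathbb{Z}\}$ be the standard basis of $l^1(\mathbb{Z})$, where $v_n(k) = 1$ for $k = n$ and $v_n(k) = 0$ for $k \neq n$. Verify that $v_i * v_j = v_{i+j}$, and $v_0$ is the multiplicative identity for convolution in $l^1(\mathbb{Z})$. Let $z = \phi(v_1)$. Then $z \in \mathbb{T}$: indeed $|\phi(v_{\pm 1})| \le \|\phi\|\|v_{\pm 1}\|_1 \le 1$ by Fact-3, while $\phi(v_1)\phi(v_{-1}) = \phi(v_0) = 1$. Now $\phi(v_2) = \phi(v_1 * v_1) = \phi(v_1)\phi(v_1) = z^2$ and inductively $\phi(v_n) = z^n$ for $n \in \mathbb{N}$. Moreover, $\phi(v_0) = 1 = z^0$ and $\phi(v_{-n}) = 1/\phi(v_n) = z^{-n}$ for $n \in \mathbb{N}$ since $v_n * v_{-n} = v_0$. Since $\operatorname{span}\{v_n : n \in \mathbb{Z}\}$ is dense in $l^1(\mathbb{Z})$, it follows by the linearity and continuity of $\phi$ that $\phi(x) = \sum_{n\in\mathbb{Z}} x(n)z^n$ for every $x \in l^1(\mathbb{Z})$. It can also be shown that the correspondence $\mathcal{M}(l^1(\mathbb{Z})) \leftrightarrow \mathbb{T}$ is a homeomorphism.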

(ii) Here we parametrize $\mathbb{T}$ as $\mathbb{T} = [0, 1)$. We may identify $C_1(\mathbb{T})$ with $l^1(\mathbb{Z})$ by the correspondence $f \leftrightarrow (\hat{f}(n))_{n\in\mathbb{Z}}$. In this correspondence, the convolution in $l^1(\mathbb{Z})$ corresponds to pointwise product in $C_1(\mathbb{T})$ by Fourier inversion. If $f \in C_1(\mathbb{T})$ is non-vanishing, then $f(t) = \sum_{n\in\mathbb{Z}} \hat{f}(n)e_n(t) = \sum_{n\in\mathbb{Z}} \hat{f}(n)(e^{2\pi i t})^n \neq 0$ for every $t \in \mathbb{T} = [0, 1)$. This means by (i) that $\phi(f) \neq 0$ for every $\phi \in \mathcal{M}(l^1(\mathbb{Z}))$. Then by Fact-2, $f$ is invertible w.r.to multiplication in $C_1(\mathbb{T})$. Hence $1/f \in C_1(\mathbb{T})$. □
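Wiener's theorem [148](ii) can be observed numerically on a simple example. The sketch below assumes $f(t) = 2 + \cos 2\pi t$, i.e., $\hat{f}(0) = 2$ and $\hat{f}(\pm 1) = 1/2$ (our own choice, non-vanishing on $\mathbb{T}$). It approximates the Fourier coefficients of $1/f$ by a Riemann sum, and checks that they act as a convolution inverse of $(\hat{f}(n))$ in $l^1(\mathbb{Z})$. The grid size `M` and the cutoff 30 are illustrative assumptions.

```python
import cmath
import math

def e(n, t):
    """The character e_n(t) = exp(2 pi i n t)."""
    return cmath.exp(2j * math.pi * n * t)

def f(t):
    """Assumed example: f = 2 e_0 + (e_1 + e_{-1}) / 2 = 2 + cos(2 pi t), which never vanishes."""
    return 2.0 + math.cos(2 * math.pi * t)

def coeff_of_reciprocal(n, M=256):
    """Riemann-sum approximation of the n-th Fourier coefficient of 1/f on M grid points."""
    return sum(e(-n, j / M) / f(j / M) for j in range(M)) / M

def convolve(x, y):
    """Discrete convolution of finitely supported sequences stored as dicts."""
    out = {}
    for m, xm in x.items():
        for n, yn in y.items():
            out[m + n] = out.get(m + n, 0) + xm * yn
    return out

x = {-1: 0.5, 0: 2.0, 1: 0.5}                                  # Fourier coefficients of f
y = {n: coeff_of_reciprocal(n) for n in range(-30, 31)}        # truncated coefficients of 1/f
prod = convolve(x, y)
# l^1 distance between x * y and the identity (1 at index 0, 0 elsewhere).
residual = sum(abs(v - (1 if k == 0 else 0)) for k, v in prod.items())

if __name__ == "__main__":
    print("l^1 norm of truncated coefficients of 1/f:", sum(abs(v) for v in y.values()))
    print("l^1 distance of x * y from the identity:", residual)
```

The coefficients of $1/f$ decay geometrically in this example, so the truncated sequence `y` already inverts `x` up to rounding error.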

To prove the second theorem of Wiener, first we identify multiplicative functionals on $L^1(\mathbb{R})$. We start with a little abstract theory that generalizes [148](i).

Definition: Let $G$ be a locally compact second countable abelian group (examples: $\mathbb{Z}$, $\mathbb{T}$, $\mathbb{R}$). Define its dual group $\hat{G} = \{\text{all continuous group homomorphisms } \alpha : G \to \mathbb{T}\}$, where the group operation is pointwise multiplication in $\mathbb{T}$. Any $\alpha \in \hat{G}$ is called a character of $G$.

Fact-4: (i) $\alpha \in \hat{\mathbb{R}}$ iff there is $y \in \mathbb{R}$ with $\alpha(x) = e_y(x)$, and hence $\hat{\mathbb{R}} = \mathbb{R}$. (ii) $\alpha \in \hat{\mathbb{Z}}$ iff there is $z \in \mathbb{T}$ with $\alpha(n) = z^n$, and hence $\hat{\mathbb{Z}} = \mathbb{T}$. By duality, $\hat{\mathbb{T}} = \mathbb{Z}$.

[149] (i) Let $G$ be a locally compact second countable abelian group equipped with Haar measure. Then we may identify $\hat{G}$ with $\mathcal{M}(L^1(G))$, where $\alpha \in \hat{G}$ corresponds to $\phi_\alpha \in \mathcal{M}(L^1(G))$ given by $\phi_\alpha(f) = \int_G f\alpha$ for $f \in L^1(G)$.
(ii) (Fourier transform gives all multiplicative functionals) $\phi \in \mathcal{M}(L^1(\mathbb{R}))$ iff there is $y \in \mathbb{R}$ with $\phi(f) = \hat{f}(y)$ for $f \in L^1(\mathbb{R})$.

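Proof. (i) All integrations considered below are w.r.to the Haar measure $\mu$. We know that $L^1(G)^* = L^\infty(G)$ and any $\phi \in L^1(G)^*$ is given by $\phi(f) = \int fg$ for some $g \in L^\infty(G)$. Since $\hat{G} \subset L^\infty(G)$, it follows from the definition of $\phi_\alpha$ that $\phi_\alpha \in L^\infty(G) = L^1(G)^*$ for $\alpha \in \hat{G}$. If $K \subset G$ is a compact set of positive measure, then $\phi_\alpha(\overline{\alpha}1_K) = \int_K |\alpha|^2\,d\mu = \mu(K) > 0$, and hence $\phi_\alpha \neq 0$. Moreover, $\phi_\alpha(f * g) = \int (f * g)\alpha = \int\!\!\int f(x-y)g(y)\alpha(x)\,d\mu(x)\,d\mu(y) = \int\!\!\int f(z)g(y)\alpha(y+z)\,d\mu(y)\,d\mu(z) = \phi_\alpha(f)\phi_\alpha(g)$ by Fubini since $\alpha(y+z) = \alpha(y)\alpha(z)$, and thus $\phi_\alpha \in \mathcal{M}(L^1(G))$ for $\alpha \in \hat{G}$.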

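For $\alpha, \beta \in \hat{G}$, if $\phi_\alpha = \phi_\beta$, then $\alpha$ must coincide with $\beta$ in $L^\infty(G)$, which means $\alpha = \beta$ almost everywhere. And then $\alpha = \beta$ everywhere since $\alpha, \beta$ are continuous. This shows $\alpha \mapsto \phi_\alpha$ is injective.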

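Now consider $\phi \in \mathcal{M}(L^1(G))$. We need to show $\phi = \phi_\alpha$ for some $\alpha \in \hat{G}$. Let $g \in L^\infty(G)$ be with $\phi(f) = \int fg$ and $h \in L^1(G)$ be with $\phi(h) = 1$. Write $h_y(x) = h(x - y)$. Since $\phi$ is multiplicative, we observe $\phi(f) = 1\cdot\phi(f) = \phi(h)\phi(f) = \phi(h * f) = \int\!\!\int h(x-y)f(y)g(x)\,d\mu(x)\,d\mu(y) = \int f(y)\int h_y(x)g(x)\,d\mu(x)\,d\mu(y) = \int f(y)\phi(h_y)\,d\mu(y)$, which suggests that we define $\alpha : G \to \mathbb{T}$ as $\alpha(y) := \phi(h_y)$. Since $\phi$ and $y \mapsto h_y$ are continuous, $\alpha$ is continuous. Since translation commutes with convolution, $\alpha(y+z) = \phi(h_{y+z}) = \phi(h)\phi(h_{y+z}) = \phi(h * h_{y+z}) = \phi(h_z * h_y) = \phi(h_z)\phi(h_y) = \alpha(z)\alpha(y)$, which shows $\alpha(y+z) = \alpha(y)\alpha(z)$. It remains to show $|\alpha| = 1$. Since $\|\phi\| \le 1$ by Fact-3, $|\alpha(y)| \le \|h_y\|_1 = \|h\|_1$ by Exercise-2. For any $n \in \mathbb{N}$, we see $|\alpha(y)^n| = |\alpha(ny)| \le \|h\|_1$, and similarly $|\alpha(-y)^n| \le \|h\|_1$. Letting $n \to \infty$, this forces $|\alpha(y)| \le 1$ and $|\alpha(-y)| \le 1$. Since $\alpha(y)\alpha(-y) = \alpha(0) = \phi(h) = 1$, we must have $|\alpha(y)| = 1$.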

(ii) By Fact-4, any character $\alpha$ of $\mathbb{R}$ is of the form $\alpha = e_y$ for some $y \in \mathbb{R}$. Since $\overline{e_y} = e_{-y}$, part (i) applied to $G = \mathbb{R}$ yields the required result. □

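Exercise-48: Let $\Gamma = \mathbb{C}\delta_0 + L^1(\mathbb{R})$ be the unital Banach algebra obtained by attaching the multiplicative unit $\delta_0$ for convolution to $L^1(\mathbb{R})$. Let $\phi_0 : \Gamma \to \mathbb{C}$ be $\phi_0(c\delta_0 + f) = c$. Then,
(i) $\mathcal{M}(\Gamma) = \{\phi_0\} \cup \{\psi : \psi(c\delta_0 + f) = \phi_0(c\delta_0 + f) + \phi(f) \text{ for some } \phi \in \mathcal{M}(L^1(\mathbb{R}))\}$.
(ii) If $\psi \in \mathcal{M}(\Gamma) \setminus \{\phi_0\}$, then there is $y \in \mathbb{R}$ such that $\psi(c\delta_0 + f) = c + \hat{f}(y)$.
(iii) Let $w \in \mathcal{D} \subset \mathcal{S}$ be with $0 \le w \le 1$, and let $v = w^\vee \in \mathcal{S} \subset L^1(\mathbb{R})$. Let $f \in L^1(\mathbb{R})$ be such that $\hat{f}$ is non-vanishing, and put $f_1(x) = \overline{f(-x)}$. Then $\phi_0(\delta_0 - v + f * f_1) = 1 \neq 0$. Also, for any $\psi \in \mathcal{M}(\Gamma) \setminus \{\phi_0\}$ there is $y \in \mathbb{R}$ with $\psi(\delta_0 - v + f * f_1) = 1 - \hat{v}(y) + \widehat{f * f_1}(y) = 1 - w(y) + |\hat{f}(y)|^2 > 0$ since $\widehat{f * f_1} = \hat{f}\,\hat{f_1} = \hat{f}\,\overline{\hat{f}} = |\hat{f}|^2$. Consequently, $\delta_0 - v + f * f_1$ is invertible in $\Gamma$ by Fact-2.
[Hint: For (ii), use (i) and [149](ii). Compute $\psi(\delta_0 - v + f * f_1)$ in (iii) using (ii).]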

[150] (Wiener's theorem about translates) For $f \in L^1(\mathbb{R})$, the following are equivalent:
(i) The Fourier transform $\hat{f}$ is non-vanishing.
(ii) $\{f * g : g \in L^1(\mathbb{R})\}$ is dense in $L^1(\mathbb{R})$.
(iii) $\operatorname{span}\{f_y : y \in \mathbb{R}\}$ is dense in $L^1(\mathbb{R})$, where $f_y(x) = f(x - y)$.

Proof. (i) $\Rightarrow$ (ii): Let $\Lambda = \{h \in L^1(\mathbb{R}) : \operatorname{supp}(\hat{h}) \text{ is compact}\}$, which is dense in $L^1(\mathbb{R})$ by the Remark after [142]. Hence it suffices to show that for every $h \in \Lambda$, there is $g \in L^1(\mathbb{R})$ with $h = f * g$. So consider $h \in \Lambda$. Let $\Gamma = \mathbb{C}\delta_0 + L^1(\mathbb{R})$, and $w, v, f_1 \in L^1(\mathbb{R})$ be as in Exercise-48. We may assume $w \equiv 1$ in a neighborhood of $\operatorname{supp}(\hat{h})$. By Exercise-48, $\delta_0 - v + f * f_1$ is invertible in $\Gamma$.

Let $u \in \Gamma$ be with $(\delta_0 - v + f * f_1) * u = \delta_0$, and then $(\delta_0 - v + f * f_1) * u * h = h$. The Fourier transform of $(\delta_0 - v) * u * h$ is $(1 - w)\hat{u}\hat{h}$, which is 0 since we assume $w \equiv 1$ in a neighborhood of $\operatorname{supp}(\hat{h})$. By the uniqueness of Fourier transform, we must have $(\delta_0 - v) * u * h = 0$, and hence $f * f_1 * u * h = h$. Writing $u = c\delta_0 + g_0$ with $g_0 \in L^1(\mathbb{R})$, we get $f * (f_1 * ch + f_1 * g_0 * h) = h$. Letting $g \in L^1(\mathbb{R})$ be the bracketed expression, we arrive at the desired conclusion $f * g = h$.

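(ii) $\Rightarrow$ (iii): (Sketch) From (ii) and the inequality $\|f * g\|_1 \le \|f\|_1\|g\|_1$, it follows that $\{f * g : g \in C_c(\mathbb{R})\}$ is also dense in $L^1(\mathbb{R})$ since $\overline{C_c(\mathbb{R})} = L^1(\mathbb{R})$. Consider $g \in C_c(\mathbb{R})$ and assume $\operatorname{supp}(g) \subset [a, b]$. We have $f * g(x) = \int_a^b f(x-y)g(y)\,dy = \int_a^b f_y(x)g(y)\,dy$. If $a = a_0 \le a_1 \le \cdots \le a_{k-1} \le a_k = b$ is a sufficiently fine partition of $[a, b]$, and $h(x) := \sum_{j=1}^k f_{a_j}(x)g(a_j)(a_j - a_{j-1})$, then $h$ approximates $f * g$ in $L^1(\mathbb{R})$; and also $h \in \operatorname{span}\{f_y : y \in \mathbb{R}\}$. Therefore $\operatorname{span}\{f_y : y \in \mathbb{R}\}$ is dense in $L^1(\mathbb{R})$.

(iii) $\Rightarrow$ (i): Suppose $\hat{f}(z) = 0$ for some $z \in \mathbb{R}$. Then $\widehat{f_y}(z) = e_{-y}(z)\hat{f}(z) = 0$, and hence $\hat{g}(z) = 0$ for every $g \in \Lambda := \operatorname{span}\{f_y : y \in \mathbb{R}\}$. Since $g \mapsto \hat{g}$ from $L^1(\mathbb{R})$ to $C_0(\mathbb{R})$ is continuous, and since there are $g \in L^1(\mathbb{R})$ with $\hat{g}(z) \neq 0$ (use Fourier inversion), it follows that $\Lambda$ cannot be dense in $L^1(\mathbb{R})$. □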

21. Sketch: interpolation and the $L^p$-theory of Fourier series

Operators on $L^p(\mathbb{T})$ and $L^p(\mathbb{R}^n)$ play a significant role in the modern theory of Fourier Analysis. While dealing with such operators, two interpolation theorems are of basic importance: the Riesz-Thorin theorem and the Marcinkiewicz theorem. We will give a brief sketch of the former, which is needed in our discussion of $L^p$-convergence of Fourier series.

Exercise-49: Let $(X, \mu)$ be a $\sigma$-finite measure space and let $L^p = L^p(X, \mu)$.
(i) Let $1 \le p < q \le \infty$. If $f \in L^p$ and $A = \{x : |f(x)| \le 1\}$, then $f1_A \in L^q$.
(ii) Let $1 \le p < q \le \infty$. If $f \in L^q$ and $B = \{x : |f(x)| > 1\}$, then $f1_B \in L^p$.
(iii) If $1 \le p < q < r \le \infty$, then $L^p \cap L^r \subset L^q \subset L^p + L^r$.
[Hint: (i) Since $|f1_A| \le 1$, we have $|f1_A|^q \le |f1_A|^p \le |f|^p$. (ii) Let $r = q/p > 1$ and $s > 1$ be with $\frac{1}{r} + \frac{1}{s} = 1$. We have $|f|^p \in L^r$, and $1_B \in L^s$ since $\mu(B) < \infty$. Hence by Hölder's inequality, $\int |f1_B|^p = \int |f|^p 1_B \le \||f|^p\|_r\|1_B\|_s < \infty$. (iii) If $f \in L^p \cap L^r$, then $f1_{\{x : |f(x)| \le 1\}} \in L^q$ by (i) and $f1_{\{x : |f(x)| > 1\}} \in L^q$ by (ii) so that (their sum) $f \in L^q$. Similar reasoning gives $L^q \subset L^p + L^r$.]

The proof of Riesz-Thorin theorem is based on the following fact from Complex Analysis:

Fact: (Three lines theorem) Consider the vertical strip $S = \{z \in \mathbb{C} : 0 < \operatorname{Re}(z) < 1\}$. Let $h : \overline{S} \to \mathbb{C}$ be a bounded continuous function analytic on $S$. Let $M_t = \sup\{|h(z)| : \operatorname{Re}(z) = t\}$ for $0 \le t \le 1$. Then $M_t \le M_0^{1-t}M_1^t$ for $0 \le t \le 1$.

[151] (Riesz-Thorin interpolation theorem) Let $(X, \mu)$ and $(Y, \nu)$ be $\sigma$-finite measure spaces. Let $1 \le p_0 \le p_1 \le \infty$, $1 \le q_0 \le q_1 \le \infty$, and $T : L^{p_0}(\mu) + L^{p_1}(\mu) \to L^{q_0}(\nu) + L^{q_1}(\nu)$ be a linear operator such that $T$ restricted to $L^{p_j}(\mu)$ is a bounded linear operator into $L^{q_j}(\nu)$ with operator norm $M_j$ for $j = 0, 1$. For $0 < t < 1$, define $1 \le p_t, q_t \le \infty$ by $\frac{1}{p_t} = \frac{1-t}{p_0} + \frac{t}{p_1}$ and $\frac{1}{q_t} = \frac{1-t}{q_0} + \frac{t}{q_1}$. Then $T$ restricted to $L^{p_t}(\mu)$ is a bounded linear operator into $L^{q_t}(\nu)$ with operator norm $\le M_0^{1-t}M_1^t$, i.e., $\|Tf\|_{q_t} \le M_0^{1-t}M_1^t\|f\|_{p_t}$ for every $f \in L^{p_t}(\mu)$.
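To get a feel for the exponents in [151], the following minimal sketch computes $p_t$ from $\frac{1}{p_t} = \frac{1-t}{p_0} + \frac{t}{p_1}$ in exact rational arithmetic, together with the bound $M_0^{1-t}M_1^t$. The function names, and the use of `float("inf")` to stand for the exponent $\infty$, are our own illustrative assumptions.

```python
from fractions import Fraction

INF = float("inf")

def reciprocal(p):
    """1/p as an exact Fraction, with 1/inf read as 0."""
    return Fraction(0) if p == INF else 1 / Fraction(p)

def interpolated_exponent(p0, p1, t):
    """Return p_t with 1/p_t = (1 - t)/p0 + t/p1 (INF if that reciprocal is 0)."""
    t = Fraction(t)
    inv = (1 - t) * reciprocal(p0) + t * reciprocal(p1)
    return INF if inv == 0 else 1 / inv

def norm_bound(M0, M1, t):
    """The interpolated bound M0^(1-t) * M1^t on the operator norm."""
    t = float(t)
    return M0 ** (1 - t) * M1 ** t

if __name__ == "__main__":
    # Interpolating halfway between L^2 and L^4.
    print(interpolated_exponent(2, 4, Fraction(1, 2)))   # 8/3
    print(norm_bound(1.0, 4.0, Fraction(1, 2)))
```

As $t$ runs over $(0, 1)$, the value $p_t$ sweeps out every exponent strictly between $p_0$ and $p_1$, which is how boundedness at two exponents passes to all exponents in between.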

Proof. (Sketch) Observe that since $L^{p_t}(\mu) \subset L^{p_0}(\mu) + L^{p_1}(\mu)$ by Exercise-49, $T$ is defined on $L^{p_t}(\mu)$.

For $s = 0, t, 1$, let $r_s > 1$ be such that $\frac{1}{q_s} + \frac{1}{r_s} = 1$. Note that the operator norm on $L^{p_t}(\mu)$ that we need to estimate is equal to $M_t := \sup\{|\int_Y (Tf)g\,d\nu| : f \in L^{p_t}(\mu) \text{ and } g \in L^{r_t}(\nu) \text{ have unit norm}\}$. By approximation, it is enough to consider simple functions $f, g$ of unit norm in the expression for $M_t$. Let $f = \sum_{j=1}^N a_je^{i\alpha_j}1_{A_j} \in L^{p_t}(\mu)$ and $g = \sum_{k=1}^N b_ke^{i\beta_k}1_{B_k} \in L^{r_t}(\nu)$ be simple functions of unit norm, where $a_j, b_k \ge 0$. Let $f_j = e^{i\alpha_j}1_{A_j}$ and $g_k = e^{i\beta_k}1_{B_k}$ so that $f = \sum_{j=1}^N a_jf_j$ and $g = \sum_{k=1}^N b_kg_k$. Let $S = \{z \in \mathbb{C} : 0 < \operatorname{Re}(z) < 1\}$. For $z \in \overline{S}$, define $p_z, r_z$ by the condition that $\frac{1}{p_z} = \frac{1-z}{p_0} + \frac{z}{p_1}$ and $\frac{1}{r_z} = \frac{1-z}{r_0} + \frac{z}{r_1}$. Note that since $a_j, b_k \ge 0$, we may define the quantities $a_j^{p_t/p_z}$ and $b_k^{r_t/r_z}$ and they depend analytically on $z$ for $z \in S$. Let $h : \overline{S} \to \mathbb{C}$ be defined as

$$h(z) = \int_Y \Big(T\Big(\sum_{j=1}^N a_j^{p_t/p_z}f_j\Big)\Big)\Big(\sum_{k=1}^N b_k^{r_t/r_z}g_k\Big)\,d\nu = \sum_{j,k=1}^N a_j^{p_t/p_z}b_k^{r_t/r_z}\int_Y (Tf_j)g_k\,d\nu,$$

which is clearly analytic on $S$. The proof is completed by verifying that $h$ is continuous and bounded on $\overline{S}$, and then applying the Three lines theorem mentioned above. See the book of Grafakos or Pinsky for the computational details. □

Question: Let $1 \le p < \infty$, $f \in L^p(\mathbb{T})$ and $s_N(f) = \sum_{n=-N}^N \hat{f}(n)e_n$. Does $(s_N(f)) \to f$ in $L^p(\mathbb{T})$? We know the answer to be YES when $p = 2$, and it is known that the answer is NO when $p = 1$. To investigate other cases, we introduce certain operators.

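Definition: Let $F(\mathbb{T})$ be the collection of all trigonometric polynomials on $\mathbb{T}$, where note that $F(\mathbb{T})$ is dense in $L^p(\mathbb{T})$ for $1 \le p < \infty$. We write $f \in F(\mathbb{T})$ as $f = \sum_{n\in\mathbb{Z}} \hat{f}(n)e_n$ with the understanding that it is a finite sum, i.e., $\hat{f}(n) = 0$ except for finitely many $n \in \mathbb{Z}$. We define the Hilbert transform $H$ and the Riesz projection $P$ from $L^p(\mathbb{T})$ to itself ($1 \le p < \infty$) by defining them on the dense subspace $F(\mathbb{T})$ (see also [152] below):
(i) $H(\sum_{n\in\mathbb{Z}} \hat{f}(n)e_n) = i(\sum_{n=-\infty}^{-1} - \sum_{n=1}^{\infty})\hat{f}(n)e_n$. If we define $\operatorname{sgn}(0) = 0$, $\operatorname{sgn}(n) = 1$ and $\operatorname{sgn}(-n) = -1$ for $n \in \mathbb{N}$, then note that $H(\sum_{n\in\mathbb{Z}} \hat{f}(n)e_n) = -i\sum_{n\in\mathbb{Z}} \operatorname{sgn}(n)\hat{f}(n)e_n$.
(ii) $P(\sum_{n\in\mathbb{Z}} \hat{f}(n)e_n) = \sum_{n=1}^{\infty} \hat{f}(n)e_n$.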
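On trigonometric polynomials, both $H$ and $P$ act on the coefficient sequence only, so they are easy to write down. The following minimal sketch stores $f \in F(\mathbb{T})$ as a Python dict $n \mapsto \hat{f}(n)$ (our own illustrative representation) and checks the identity $f + iHf = \hat{f}(0) + 2Pf$ on one sample polynomial.

```python
def sgn(n):
    """sgn(0) = 0, sgn(n) = 1 and sgn(-n) = -1 for n in N."""
    return (n > 0) - (n < 0)

def hilbert(f):
    """H f on coefficients: fhat(n) is multiplied by -i sgn(n)."""
    return {n: -1j * sgn(n) * c for n, c in f.items()}

def riesz(f):
    """P f on coefficients: keep only the indices n >= 1."""
    return {n: c for n, c in f.items() if n >= 1}

def combine(f, g, a=1, b=1):
    """Coefficients of the trigonometric polynomial a f + b g."""
    return {n: a * f.get(n, 0) + b * g.get(n, 0) for n in set(f) | set(g)}

# A sample trigonometric polynomial (hypothetical coefficients).
f = {-2: 1 + 1j, 0: 3, 1: -2j, 4: 0.5}

lhs = combine(f, hilbert(f), 1, 1j)               # f + i H f
rhs = combine({0: f.get(0, 0)}, riesz(f), 1, 2)   # fhat(0) + 2 P f

if __name__ == "__main__":
    for n in sorted(set(lhs) | set(rhs)):
        print(n, lhs.get(n, 0), rhs.get(n, 0))
```

The negative-index coefficients cancel in `lhs`, the zeroth one is untouched, and the positive-index ones double, which is exactly the identity relating $H$ and $P$.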

[152] Let $1 < p < \infty$ (note that we have excluded 1). Then the Hilbert transform $H$ and the Riesz projection $P$ are bounded linear operators on $L^p(\mathbb{T})$.

Proof. (Sketch) Clearly $H, P$ are linear. For $f \in F(\mathbb{T})$, note that $f + iHf = \hat{f}(0) + 2Pf$, and $|\hat{f}(0)| \le \int_0^1 |f| = \int_0^1 1\cdot|f| \le \|f\|_p$ by Hölder's inequality. Therefore it suffices to show $H$ is bounded. We outline the structure of proof.

Step-1 : By Parseval, $\|Hf\|_2^2 \le \sum_{n\in\mathbb{Z}} |\hat{f}(n)|^2 = \|f\|_2^2$ for $f \in F(\mathbb{T})$ and hence $H$ is bounded on $L^2(\mathbb{T})$.

Step-2 : One shows by some computation that $H$ is bounded on $L^{2k}(\mathbb{T})$ for every integer $k \ge 2$. The proof is left as a reading assignment; see for instance, Lemma 3.3.4 of M.A. Pinsky, Introduction to Fourier Analysis and Wavelets.

Step-3 : From the above two steps and Riesz-Thorin interpolation theorem, it follows that $H$ is bounded on $L^p(\mathbb{T})$ for $2 \le p < \infty$.

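Step-4 : If $1 < p < 2$, choose $q > 2$ with $\frac{1}{p} + \frac{1}{q} = 1$. As the adjoint of $H$ is $-H$, $\|Hf\|_p = \sup\{|\int fHg| : g \in F(\mathbb{T}) \text{ and } \|g\|_q = 1\} \le \sup\{\|f\|_p\|Hg\|_q : g \in F(\mathbb{T}) \text{ and } \|g\|_q = 1\} \le \|f\|_p\|H\|_{q\to q}$ by Hölder's inequality. As $\|H\|_{q\to q} < \infty$ by step-3, we get $\|H\|_{p\to p} < \infty$, where $\|H\|_{p\to p}$ is the operator norm on $L^p(\mathbb{T})$. □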

[153] Let $1 < p < \infty$ (note that we have excluded 1). Then,
(i) $\sup_{N\in\mathbb{N}} \|s_N\|_{p\to p} < \infty$, and (ii) $\lim_{N\to\infty} \|f - s_N(f)\|_p = 0$ for every $f \in L^p(\mathbb{T})$.

Proof. (i) Let $P_N$ be the operator on $L^p(\mathbb{T})$ specified by $P_N(\sum_{n\in\mathbb{Z}} \hat{f}(n)e_n) = \sum_{n=0}^{2N} \hat{f}(n)e_n$ for $f = \sum_{n\in\mathbb{Z}} \hat{f}(n)e_n \in F(\mathbb{T})$. Observe that $e_{-N}\sum_{n=0}^{2N} \widehat{fe_N}(n)e_n = \sum_{n=-N}^{N} \hat{f}(n)e_n$, i.e., $e_{-N}P_N(fe_N) = s_N(f)$ for $f \in F(\mathbb{T})$. Since multiplication by $e_{\pm N}$ does not change the $L^p$-norm, it suffices to show $\sup_{N\in\mathbb{N}} \|P_N\|_{p\to p} < \infty$. For $f \in F(\mathbb{T})$, we have $P_Nf = (\sum_{n=0}^{\infty} - \sum_{n=2N+1}^{\infty})\hat{f}(n)e_n = \sum_{n=0}^{\infty} \hat{f}(n)e_n - e_{2N}\sum_{n=1}^{\infty} \hat{f}(n+2N)e_n = \hat{f}(0) + Pf - e_{2N}P(fe_{-2N})$, where $P$ is the Riesz projection. As $P$ is bounded by [152] and $|\hat{f}(0)| \le \|f\|_p$, we get $\sup_{N\in\mathbb{N}} \|P_N\|_{p\to p} \le 1 + 2\|P\|_{p\to p} < \infty$.

(ii) Let $f \in L^p(\mathbb{T})$ and $\varepsilon > 0$. Let $M = \sup_{N\in\mathbb{N}} \|s_N\|_{p\to p}$, which is finite by part (i). Choose a trigonometric polynomial $g$ with $\|f - g\|_p < \varepsilon$. We have $s_N(g) = g$ for all large $N \in \mathbb{N}$. Therefore, $\|f - s_N(f)\|_p \le \|f - g\|_p + \|s_N(g) - s_N(f)\|_p \le \|f - g\|_p + \|s_N\|_{p\to p}\|f - g\|_p < (1 + M)\varepsilon$ for all large $N \in \mathbb{N}$, which shows $\lim_{N\to\infty} \|f - s_N(f)\|_p = 0$. □
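The two algebraic identities used in the proof of [153](i), namely $e_{-N}P_N(fe_N) = s_N(f)$ and $P_Nf = \hat{f}(0) + Pf - e_{2N}P(fe_{-2N})$, can be checked on coefficient sequences. The sketch below again stores a trigonometric polynomial as a dict $n \mapsto \hat{f}(n)$; this representation, the helper names and the sample polynomial are hypothetical and used only for illustration.

```python
def shift(f, m):
    """Coefficients of f e_m: the coefficient fhat(n) moves to index n + m."""
    return {n + m: c for n, c in f.items()}

def partial_sum(f, N):
    """s_N(f): keep the indices -N <= n <= N."""
    return {n: c for n, c in f.items() if -N <= n <= N}

def P_N(f, N):
    """P_N f: keep the indices 0 <= n <= 2N."""
    return {n: c for n, c in f.items() if 0 <= n <= 2 * N}

def riesz(f):
    """Riesz projection P f: keep the indices n >= 1."""
    return {n: c for n, c in f.items() if n >= 1}

def P_N_via_riesz(f, N):
    """fhat(0) + P f - e_{2N} P(f e_{-2N}), computed on coefficients."""
    out = {0: f.get(0, 0)}
    for n, c in riesz(f).items():
        out[n] = out.get(n, 0) + c
    for n, c in shift(riesz(shift(f, -2 * N)), 2 * N).items():
        out[n] = out.get(n, 0) - c
    return out

def same(f, g, tol=1e-12):
    """True if the two coefficient sequences agree up to tol (missing indices count as 0)."""
    return all(abs(f.get(n, 0) - g.get(n, 0)) < tol for n in set(f) | set(g))

# Sample polynomial with fhat(n) = n + i for |n| <= 6 (hypothetical data).
f = {n: complex(n, 1) for n in range(-6, 7)}
N = 2

if __name__ == "__main__":
    print(same(partial_sum(f, N), shift(P_N(shift(f, N), N), -N)))
    print(same(P_N(f, N), P_N_via_riesz(f, N)))
```

Both checks are pure index bookkeeping, which reflects the fact that the uniform bound on $\|s_N\|_{p\to p}$ is inherited entirely from the boundedness of $P$.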

Remark: Similarly we can define Hilbert transform and Riesz projection on $L^p(\mathbb{R})$ for $1 < p < \infty$, and use their boundedness to prove that $\lim_{u\to\infty} \|f - s_u(f)\|_p = 0$ for every $f \in L^p(\mathbb{R})$, $1 < p < \infty$. In the proof, instead of $F(\mathbb{T})$, one should use a suitable dense subset of $L^p(\mathbb{R})$, for instance $\{f \in L^1(\mathbb{R}) \cap L^p(\mathbb{R}) : \operatorname{supp}(\hat{f}) \text{ is compact}\}$.

Further reading: (i) The Hilbert transform is a prototype of an important class of operators called multipliers; see section 3.6 in Grafakos, Classical Fourier Analysis. (ii) Modern theory of Fourier Analysis depends heavily on singular integral operators in whose study a basic result is Calderón-Zygmund decomposition; see section 4.3 in Grafakos, Classical Fourier Analysis. (iii) Applications of Fourier Theory to Probability Theory can be found in Chapter 5 of M.A. Pinsky, Introduction to Fourier Analysis and Wavelets. Fourier theory in higher dimension and many other interesting topics can also be found in the books of Grafakos and Pinsky. See also J. Duoandikoetxea, Fourier Analysis. For a more abstract general theory, see Rudin, Fourier Analysis on Groups.

*****