Fejer’s Theorem

S. Ziskind

1 Discussion

We provide a very brief overview of Fourier series, and prove Fejer’s Theorem, which illustrates the use of Cesaro means in establishing a pointwise convergence property of such series. Several consequences of the theorem will also be noted. The overview is drawn from [1], while the proof of Fejer’s Theorem is based on [2].

It must be noted that we ignore technical issues involving the definition of integrals using Lebesgue measure instead of Riemann sums. This is discussed in [1] and [3] among many other books, but is out of scope for this memo.

2 Fourier Series

The general goal of harmonic analysis is to represent functions as the sum of simpler functions. Here we consider expressing periodic functions on a fixed interval as the sum of sines and cosines, which leads to the study of Fourier Series.

To make things specific let us take the interval to be $[-\pi, \pi]$, with the understanding that any function of interest satisfies $f(x + 2\pi) = f(x)$. The obvious periodic functions on the interval are $\sin nx$ and $\cos nx$, where $n \ge 0$ is an integer. Throughout, $i = \sqrt{-1}$.

Because of Euler’s identity $e^{i\theta} = \cos\theta + i\sin\theta$, we have

$$\cos\theta = \frac{e^{i\theta} + e^{-i\theta}}{2} \quad\text{and}\quad \sin\theta = \frac{e^{i\theta} - e^{-i\theta}}{2i},$$

which says that we can equivalently break periodic functions into a sum of the exponentials $e^{inx}$, for integral values of $n$, both positive and negative. Since the exponentials are easier to work with, we switch to this approach, and define the functions

$$e_n(x) = e^{inx}, \quad\text{for integral } n.$$

Note that $\overline{e_n} = e_{-n}$, where the overbar indicates complex conjugation.

A direct calculation shows that

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} e_k(x)\,\overline{e_j(x)}\,dx = \delta_{kj},$$

where $\delta_{kj}$, the Kronecker delta, is either 1 or 0, depending on whether $j = k$ or not.
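The direct calculation can be sketched as follows for integers $k \ne j$; the case $k = j$ is immediate, since the integrand is then identically 1. This worked step is an illustration added here and only uses the definition $e_n(x) = e^{inx}$.

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} e_k(x)\,\overline{e_j(x)}\,dx = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{i(k-j)x}\,dx = \frac{1}{2\pi}\left[\frac{e^{i(k-j)x}}{i(k-j)}\right]_{-\pi}^{\pi} = \frac{\sin\big((k-j)\pi\big)}{(k-j)\pi} = 0.$$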

Recalling basic calculus, the integral should make us think of a Riemann sum of products, and linear algebra tells us that the sum of products is a dot product (a.k.a. inner product, a.k.a. scalar product), where the conjugation reminds us to think of complex Euclidean space of many dimensions, $\mathbb{C}^n$. When working in $n$ complex dimensions we define the inner product of two vectors as $\langle X, Y\rangle = \sum_{k=1}^{n} x_k\,\overline{y_k}$, and then use this to define the norm of a vector as $\|X\| = \sqrt{\langle X, X\rangle}$, and distance via $\|X - Y\|$. These definitions align with the ordinary geometric notions in 2 and 3 dimensions, and extend angle measure and the like into $n$ dimensions.

It is natural to extend the notions of inner product, norm and distance into infinitely many dimensions, i.e. the space of infinite sequences of complex numbers, $\mathbb{C}^\infty$, by letting

$$\langle X, Y\rangle = \sum_{k=1}^{\infty} x_k\,\overline{y_k} \quad\text{and}\quad \|X\|_2^2 = \sum_{k=1}^{\infty} |x_k|^2,$$

but we face a new problem: infinite series don’t always converge. The solution is obvious: restrict attention to (two-sided) infinite sequences for which

$$\sum_{k=-\infty}^{\infty} |x_k|^2 < \infty.$$

In addition to making norms finite, the restriction also makes all dot products finite. The space of such sequences, along with the associated dot product, is known as $\ell^2$, the space of square summable sequences (extending in both directions).

The natural continuous analogy replaces a sum with an integral, and as before we need to restrict attention to those functions on $[-\pi, \pi]$ for which

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} |f(t)|^2\,dt < \infty,$$

and we define the inner product by

$$\langle f, g\rangle = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\,\overline{g(t)}\,dt.$$

These functions comprise the space $L^2$, the space of square integrable functions. (The integrals are taken in the sense of Lebesgue, not Riemann.)

Working again by analogy to the finite dimensional spaces, we see that the functions $e_k$ are orthogonal and of unit length. Thus we can treat them as basis vectors in $L^2$ and form the projection of a vector onto any of these vectors as

$$\hat f(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\,\overline{e_n(t)}\,dt = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\,e_{-n}(t)\,dt = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\,e^{-int}\,dt.$$

The numbers $\hat f(n)$ are the Fourier Coefficients of $f$, and the sequence of all of them, $\{\hat f(n)\}_{-\infty}^{\infty}$, is the Fourier Series of $f$. The following results, given without proof (see [1]), show that $L^2$ and $\ell^2$ behave strikingly like $\mathbb{C}^n$.

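Completeness: $L^2$ is complete, with $\{e_n\}_{-\infty}^{\infty}$ acting as an orthonormal basis.

Plancherel’s Theorem: $\|f\|_2 = \|\hat f\|_2$.

Parseval’s Theorem: $\frac{1}{2\pi}\int_{-\pi}^{\pi} f\,\overline{g} = \sum_{n=-\infty}^{\infty} \hat f(n)\,\overline{\hat g(n)}$.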

Theorem: If we define

$$s_n(x) = \sum_{k=-n}^{n} \hat f(k)\,e_k(x),$$

then $\|f - s_n\|_2 \to 0$ as $n \to \infty$.
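As a concrete illustration of the coefficients and of the theorem above, the following minimal sketch uses the sawtooth $f(t) = t$ on $[-\pi, \pi]$, for which integration by parts gives $\hat f(n) = i(-1)^n/n$ for $n \ne 0$ and $\hat f(0) = 0$. The example function, the midpoint-rule integrator, and all function names are assumptions made for this sketch; they do not come from [1] or [2]. The error formula $\|f - s_n\|_2^2 = \|f\|_2^2 - \sum_{|k| \le n} |\hat f(k)|^2$ follows from orthonormality of the $e_k$.

```python
import cmath
import math


def fourier_coefficient(f, n, samples=4000):
    """Approximate f_hat(n) = (1/2pi) * integral_{-pi}^{pi} f(t) e^{-int} dt.

    Uses the midpoint rule, an illustrative choice for this sketch.
    """
    h = 2 * math.pi / samples
    total = 0j
    for j in range(samples):
        t = -math.pi + (j + 0.5) * h
        total += f(t) * cmath.exp(-1j * n * t)
    return total * h / (2 * math.pi)


def sawtooth(t):
    """Example function f(t) = t on [-pi, pi] (hypothetical test case)."""
    return t


def exact_sawtooth_coefficient(n):
    """Closed form i(-1)^n / n for n != 0, and 0 for n = 0."""
    return 0j if n == 0 else 1j * (-1) ** n / n


def l2_error_squared(n):
    """||f - s_n||_2^2 for the sawtooth, via ||f||^2 - sum_{|k|<=n} |f_hat(k)|^2."""
    norm_squared = math.pi ** 2 / 3  # (1/2pi) * integral of t^2 over [-pi, pi]
    return norm_squared - 2 * sum(1 / k ** 2 for k in range(1, n + 1))
```

Evaluating `l2_error_squared(n)` for increasing `n` shows the error shrinking toward 0 (roughly like $2/n$), in line with the theorem, even though the sawtooth's periodic extension has a jump at $\pm\pi$.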

3 Fejer’s Theorem

The last theorem of the preceding section may be re-stated as: “The symmetric partial sums of the Fourier series of an $L^2$ function converge to the function in the $L^2$ norm.”

In many ways this is the most natural sense of convergence for an $L^2$ function’s Fourier series, but there is a more basic form of convergence that this doesn’t address. We ask: do the partial sums converge pointwise to $f$, i.e. does $s_n(x) \to f(x)$? More generally, we can ask when this happens, for what sort of functions, and how often. This turns out to be a surprisingly difficult and deep issue.

It has long been known that pointwise convergence can fail at some points, even for continuous functions. Even worse, there is an example of a function satisfying $\int |f| < \infty$ for which $s_n(x)$ diverges at every single value of $x$! This made people suspect that the general problem was hopeless. Nevertheless, Lennart Carleson proved, in 1966, that the partial sums do in fact converge for almost every value of $x$ when $f \in L^2$, so in particular for continuous $f$. This theorem is deep, complex, and far beyond the scope of this memo.

Prior to Carleson, Fejer (1904) found a relatively simple way of capturing the value of a function from its Fourier series in the pointwise sense, but he needed to smooth the values of the partial sums. Given the sums $s_n(x)$, define the Cesaro Means of this sequence as

$$\sigma_n(x) = \frac{1}{n}\sum_{k=0}^{n-1} s_k(x).$$
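To see what this averaging does, here is a minimal sketch that applies the definition of $\sigma_n$ to two plain numerical sequences rather than to Fourier partial sums. The sequences and the function name `cesaro_means` are hypothetical examples chosen for illustration.

```python
def cesaro_means(seq):
    """Return [sigma_1, ..., sigma_N], where sigma_n = (s_0 + ... + s_{n-1}) / n."""
    means = []
    running_total = 0.0
    for n, s in enumerate(seq, start=1):
        running_total += s
        means.append(running_total / n)
    return means


# The sequence 1, 0, 1, 0, ... has no limit, yet its means settle at 1/2.
oscillating = [(k + 1) % 2 for k in range(1000)]

# A convergent sequence keeps its limit under averaging: 1 + 1/(k+1) -> 1.
convergent = [1 + 1 / (k + 1) for k in range(1000)]
```

Averaging can therefore assign a value to a sequence that fails to converge, while leaving the limit of a convergent sequence unchanged; both features are used in this memo.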

Fejer’s Theorem: If $f$ is a bounded periodic function on $[-\pi, \pi]$ that is continuous at $x$, then the Cesaro means of the symmetric partial sums of its Fourier series converge pointwise to $f$ at $x$, i.e. $\sigma_n(x) \to f(x)$. If $f$ is continuous on the whole interval then the convergence is uniform.

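Proof - Step 1: We start the proof of Fejer’s Theorem by evaluating

$$\begin{aligned}
s_n(x) &= \sum_{k=-n}^{n} \hat f(k)\,e_k(x) = \sum_{k=-n}^{n} e_k(x)\,\frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\,\overline{e_k(t)}\,dt \\
&= \frac{1}{2\pi}\int_{-\pi}^{\pi} \Big(\sum_{k=-n}^{n} e_k(x - t)\Big) f(t)\,dt = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\,D_n(x - t)\,dt,
\end{aligned}$$

where $D_n(y)$, the Dirichlet Kernel, is defined and evaluated (noting it to be a geometric series) as

$$D_n(y) = \sum_{k=-n}^{n} e^{iky} = \cdots = \frac{\sin\big((n + \tfrac{1}{2})y\big)}{\sin(y/2)}.$$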

We will not use it in this form, but simply note its graph for n=10.

[Figure 1: Dirichlet’s Kernel, n=10]

It should be noted that $s_n$ and $\sigma_n$ are both expressed as integrals of the form

$$\int f(t)\,g(x - t)\,dt,$$

where $g$ is either the Dirichlet kernel or the Fejer kernel (the latter is defined in Step 2 below). In general, an integral of this form is called the convolution of $f$ and $g$. In essence it takes the function $f$ and overlays it with the function $g$, but with $g$ shifted so that it is centered at $x$. Because of periodicity, any part of either $D_n$ or $K_n$ that is shifted over the edge of $[-\pi, \pi]$ simply wraps around and reappears on the other side of the interval. Much more can be said about convolution, and it is important for many applications (such as signal processing).

An equivalent form of $D_n(y)$, more useful for the next step of the proof, is that

$$\begin{aligned}
D_n(y) &= \sum_{k=-n}^{n} e^{iky} = \sum_{k=0}^{n} e^{iky} + \sum_{k=0}^{n} e^{-iky} - 1 = \frac{1 - e^{i(n+1)y}}{1 - e^{iy}} + \frac{1 - e^{-i(n+1)y}}{1 - e^{-iy}} - 1 \\
&= \frac{(1 - e^{i(n+1)y})(1 - e^{-iy}) + (1 - e^{-i(n+1)y})(1 - e^{iy})}{(1 - e^{iy})(1 - e^{-iy})} - 1 \\
&= \cdots = \frac{\cos ny - \cos(n+1)y}{1 - \cos y}.
\end{aligned}$$
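The two closed forms of the Dirichlet kernel can be checked numerically against the defining sum. The sketch below is an illustration only; the function names are hypothetical, and the closed forms are evaluated away from multiples of $2\pi$, where both denominators vanish (there $D_n = 2n + 1$ directly from the sum).

```python
import cmath
import math


def dirichlet_sum(n, y):
    """D_n(y) from the definition: sum of e^{iky} for k = -n..n (real-valued)."""
    return sum(cmath.exp(1j * k * y) for k in range(-n, n + 1)).real


def dirichlet_sine_form(n, y):
    """sin((n + 1/2) y) / sin(y / 2); requires y not a multiple of 2*pi."""
    return math.sin((n + 0.5) * y) / math.sin(y / 2)


def dirichlet_cosine_form(n, y):
    """(cos(ny) - cos((n+1)y)) / (1 - cos y); requires y not a multiple of 2*pi."""
    return (math.cos(n * y) - math.cos((n + 1) * y)) / (1 - math.cos(y))
```

For example, at `n = 10` and `y = 0.7` all three functions agree to within rounding error.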

Proof - Step 2: Next we observe that

$$\begin{aligned}
\sigma_n(x) &= \frac{1}{n}\sum_{k=0}^{n-1} s_k(x) = \frac{1}{n}\sum_{k=0}^{n-1} \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\,D_k(x - t)\,dt \\
&= \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\,\Big\{\frac{1}{n}\sum_{k=0}^{n-1} D_k(x - t)\Big\}\,dt = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\,K_n(x - t)\,dt,
\end{aligned}$$

where $K_n(y)$, the Fejer Kernel, is defined as

$$K_n(y) = \frac{1}{n}\sum_{k=0}^{n-1} D_k(y).$$

Using the second expression for the Dirichlet kernel, we note that

$$(n+1)K_{n+1}(y) - nK_n(y) = D_n(y) = \frac{\cos ny - \cos(n+1)y}{1 - \cos y}.$$

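Given this expression for $K_n$, we note first that $K_1(y) \equiv 1$, and then that

$$K_2(y) = \frac{1}{2}\Big\{1\cdot K_1(y) + \frac{\cos y - \cos 2y}{1 - \cos y}\Big\} = \frac{1}{2}\Big\{\frac{1 - \cos 2y}{1 - \cos y}\Big\},$$

$$K_3(y) = \frac{1}{3}\Big\{2K_2(y) + \frac{\cos 2y - \cos 3y}{1 - \cos y}\Big\} = \frac{1}{3}\Big\{\frac{1 - \cos 3y}{1 - \cos y}\Big\},$$

and by induction (and the double angle formula) we find

$$K_n(y) = \frac{1}{n}\Big[\frac{1 - \cos ny}{1 - \cos y}\Big] = \frac{1}{n}\Big[\frac{\sin(ny/2)}{\sin(y/2)}\Big]^2.$$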

Here is its graph for n = 10.

[Figure 2: Fejer’s Kernel, n=10]
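The closed form for $K_n$ can be checked against its definition as an average of Dirichlet kernels. The sketch below is illustrative; the function names are hypothetical, and $y$ is assumed not to be a multiple of $2\pi$ so that $\sin(y/2) \ne 0$.

```python
import math


def dirichlet(k, y):
    """D_k(y) = sin((k + 1/2) y) / sin(y / 2), for y not a multiple of 2*pi."""
    return math.sin((k + 0.5) * y) / math.sin(y / 2)


def fejer_average(n, y):
    """K_n(y) from the definition: (D_0(y) + ... + D_{n-1}(y)) / n."""
    return sum(dirichlet(k, y) for k in range(n)) / n


def fejer_closed_form(n, y):
    """K_n(y) = (1/n) * (sin(ny/2) / sin(y/2))^2."""
    return (math.sin(n * y / 2) / math.sin(y / 2)) ** 2 / n
```

In particular `fejer_average(1, y)` returns 1 for every admissible `y`, matching $K_1 \equiv 1$.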

With this final expression for the Fejer Kernels we can see that they have three important properties:

1. $K_n \ge 0$
2. $\frac{1}{2\pi}\int_{-\pi}^{\pi} K_n(y)\,dy = 1$
3. If $0 < \delta < \pi$, then $\lim_{n\to\infty}\big\{\max_{\delta \le |y| \le \pi} K_n(y)\big\} = 0$

The first is obvious from the fact that $K_n$ is a square, the second follows from letting $f \equiv 1$, and the third can be seen by noting that, for $\delta \le |y| \le \pi$,

$$K_n(y) \le \frac{1}{n}\Big[\frac{1}{\sin(y/2)}\Big]^2 \le \frac{1}{n}\Big[\frac{1}{\sin(\delta/2)}\Big]^2.$$

A sequence of functions satisfying these three properties is known as an approximate identity, and the properties are just what we need to prove Fejer’s Theorem. (A physicist would say that the limit of an approximate identity is the Dirac Delta “Function”, except that she wouldn’t put quotes around the word Function.)
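The three properties can also be observed numerically. The following minimal sketch samples the Fejer kernel on a grid; the midpoint-rule average, the grid sizes, and the function names are assumptions of this illustration, not part of the proof.

```python
import math


def fejer_kernel(n, y):
    """K_n(y) = (1/n) * (sin(ny/2) / sin(y/2))^2, with K_n = n where sin(y/2) = 0."""
    s = math.sin(y / 2)
    if abs(s) < 1e-12:
        return float(n)  # limiting value at multiples of 2*pi
    return (math.sin(n * y / 2) / s) ** 2 / n


def mean_value(n, samples=20000):
    """Midpoint-rule estimate of (1/2pi) * integral of K_n over [-pi, pi]."""
    h = 2 * math.pi / samples
    total = sum(fejer_kernel(n, -math.pi + (j + 0.5) * h) for j in range(samples))
    return total * h / (2 * math.pi)


def tail_max(n, delta, samples=2000):
    """Largest sampled value of K_n on delta <= y <= pi (K_n is even in y)."""
    step = (math.pi - delta) / samples
    return max(fejer_kernel(n, delta + j * step) for j in range(samples + 1))
```

With `delta = 0.5`, the sampled maximum `tail_max(n, 0.5)` stays below the bound $1/(n \sin^2(\delta/2))$ and so shrinks as `n` grows, while `mean_value(n)` stays at 1.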

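Proof - Step 3: Let the bound on $f$ be denoted $\|f\|_\infty$. Substituting $t \to x - t$ in the integral for $\sigma_n$ (which is permitted by periodicity), using the second property of the kernels, and choosing a small $\delta > 0$, we can write

$$\begin{aligned}
|\sigma_n(x) - f(x)| &= \Big|\frac{1}{2\pi}\int_{-\pi}^{\pi} \big[f(x - t) - f(x)\big]\,K_n(t)\,dt\Big| \\
&\le \frac{1}{2\pi}\int_{-\delta}^{\delta} |f(x - t) - f(x)|\,K_n(t)\,dt + \frac{1}{2\pi}\int_{\delta \le |t| \le \pi} |f(x - t) - f(x)|\,K_n(t)\,dt \\
&\le \frac{1}{2\pi}\int_{-\delta}^{\delta} |f(x - t) - f(x)|\,K_n(t)\,dt + 2\|f\|_\infty \max_{\delta \le |t| \le \pi} K_n(t).
\end{aligned}$$

By the assumed continuity of $f$ at $x$, the first term can be made arbitrarily small by choosing $\delta$ small: since $K_n \ge 0$ and its normalized integral is 1, the first term is at most $\max_{|t| \le \delta} |f(x - t) - f(x)|$. The second term can then be made small, for large $n$, using property 3. If $f$ is continuous on the entire interval then the choice of $\delta$ made to shrink the first term can be made independently of $x$, because a continuous function on a closed interval is uniformly continuous. Once that choice is made the second term can be made small independently of $x$. Fejer’s Theorem is proved.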
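The effect of the smoothing can be seen on a function with jumps. The sketch below uses the square wave $f(x) = \operatorname{sign}(x)$ on $(-\pi, \pi)$, an example chosen for this illustration and not taken from the source. Its coefficients are $\hat f(k) = -2i/(\pi k)$ for odd $k$ and 0 for even $k$, so that $s_n(x) = \sum_{k \text{ odd},\, k \le n} \frac{4}{\pi k}\sin kx$. The function names are hypothetical.

```python
import math


def square_wave_partial_sum(n, x):
    """s_n(x) for f = sign(x): sum over odd k <= n of (4 / (pi k)) sin(kx)."""
    return sum(4 / (math.pi * k) * math.sin(k * x) for k in range(1, n + 1, 2))


def square_wave_cesaro_mean(n, x):
    """sigma_n(x) = (s_0(x) + ... + s_{n-1}(x)) / n, straight from the definition."""
    return sum(square_wave_partial_sum(m, x) for m in range(n)) / n
```

Near the jump at $x = 0$ the partial sums overshoot the value 1 (the Gibbs phenomenon), whereas the Cesaro means never leave $[-1, 1]$: because $K_n \ge 0$ with normalized integral 1, $\sigma_n(x)$ is a weighted average of values of $f$. At a point of continuity such as $x = \pi/2$ the means approach $f(x) = 1$, as the theorem asserts.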

4 Consequences of Fejer’s Theorem

Weierstrass Polynomial Approximation: Any continuous function on a closed interval can be uniformly approximated by a polynomial. Explicitly, if $f$ is continuous on the interval $[a, b]$ and $\delta > 0$ is given, then there is a polynomial $p(x)$ for which $|f(x) - p(x)| < \delta$ for every $x \in [a, b]$. Proof: Using Fejer’s Theorem we approximate $f$ uniformly as a sum of exponentials, or equivalently as a sum of sines and cosines. But on any interval we can approximate sine or cosine uniformly using its Taylor series, whose partial sums are polynomials. Done!

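Smooth Convergence: Let $f$ be a periodic function on $[-\pi, \pi]$ that has first and second derivatives, both continuous. Then $s_n(x) \to f(x)$ uniformly. (In other words, we don’t need to smooth the partial sums via Cesaro means.) Proof: Apply the formula for integration by parts, $\int u\,dv = uv - \int v\,du$, which requires that both $u'$ and $v'$ exist and are continuous, to find that, for $n \ne 0$,

$$\begin{aligned}
\hat f(n) &= \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)\,e^{-inx}\,dx = \frac{1}{2\pi}\left[\frac{f(x)\,e^{-inx}}{-in}\right]_{-\pi}^{\pi} - \frac{1}{2\pi}\int_{-\pi}^{\pi} f'(x)\,\frac{e^{-inx}}{-in}\,dx \\
&= \frac{1}{2\pi}\int_{-\pi}^{\pi} f'(x)\,\frac{e^{-inx}}{in}\,dx
\end{aligned}$$

because of the periodicity of $f$. Doing this a second time we see that

$$\hat f(n) = \frac{-1}{2\pi n^2}\int_{-\pi}^{\pi} f''(x)\,e^{-inx}\,dx$$

and consequently $|\hat f(n)| \le \frac{\|f''\|}{2\pi}\,\frac{1}{n^2}$ (reading $\|f''\|$ as $\int_{-\pi}^{\pi} |f''(x)|\,dx$), which means that the series $s_n(x)$ converges absolutely and uniformly (since $|e_n(x)| = 1$, the bound does not depend on $x$), and hence converges. On the other hand, by writing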

$$\begin{aligned}
\Big|\frac{1}{n}\sum_{k=1}^{n} x_k - L\Big| &= \Big|\frac{1}{n}\sum_{k=1}^{n} (x_k - L)\Big| \le \Big|\frac{1}{n}\sum_{k=1}^{N} (x_k - L)\Big| + \frac{n - N}{n}\max_{k > N} |x_k - L| \\
&\le \Big|\frac{1}{n}\sum_{k=1}^{N} (x_k - L)\Big| + \max_{k > N} |x_k - L|
\end{aligned}$$

and letting first $N$ become large to shrink the second term and then letting $n$ get large to shrink the first, we readily see that the Cesaro means of a convergent sequence will also converge, and to the same limit. In our situation we know that $s_n(x)$ converges to something, and the Cesaro means converge, by Fejer’s Theorem, to $f(x)$. Consequently, the value to which $s_n(x)$ converges must also be $f(x)$.
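The $1/n^2$ decay that drives this proof can be observed on a concrete smooth periodic function. The sketch below uses $f(x) = e^{\cos x}$, for which $f''(x) = (\sin^2 x - \cos x)\,e^{\cos x}$; the choice of function, the midpoint-rule integrator, and all names are assumptions made for illustration only.

```python
import cmath
import math


def f(x):
    """Example smooth 2*pi-periodic function (hypothetical test case)."""
    return math.exp(math.cos(x))


def f_second_derivative(x):
    """f''(x) = (sin^2 x - cos x) * exp(cos x)."""
    return (math.sin(x) ** 2 - math.cos(x)) * math.exp(math.cos(x))


def integrate(g, samples=4000):
    """Midpoint-rule estimate of the integral of g over [-pi, pi]."""
    h = 2 * math.pi / samples
    return sum(g(-math.pi + (j + 0.5) * h) for j in range(samples)) * h


def fourier_coefficient(n):
    """f_hat(n) = (1/2pi) * integral of f(x) e^{-inx} over [-pi, pi]."""
    return integrate(lambda x: f(x) * cmath.exp(-1j * n * x)) / (2 * math.pi)


def decay_bound(n):
    """(1 / (2 pi n^2)) * integral of |f''(x)| over [-pi, pi], for n != 0."""
    return integrate(lambda x: abs(f_second_derivative(x))) / (2 * math.pi * n ** 2)
```

For each `n` from 1 upward, `abs(fourier_coefficient(n))` stays below `decay_bound(n)`; for this particular function the coefficients in fact decay much faster than $1/n^2$, which is consistent with the bound being only an upper estimate.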

References

[1] H. Dym and H.P. McKean, “Fourier Series and Integrals”, Academic Press, 1972

[2] K. Hoffman, “Banach Spaces of Analytic Functions”, Prentice-Hall, 1962

[3] H. Royden, “Real Analysis”, Macmillan, 1963
