
MODULAR FORMS AND THE FOUR SQUARES THEOREM

AARON LANDESMAN

Contents
1. Introduction
2. Definition of modular Forms
2.1. Preliminaries
2.2. A First Definition of Modular Forms
2.3. The action of SL2(Z) on H
2.4. Three Other Ways to Think about Modular Forms
3. The Space of modular Forms
3.1. Eisenstein Series
3.2. Notation for Spaces of Modular Forms
3.3. A Bound on the Dimension of Modular Forms of a Fixed Weight
3.4. A Complete Characterization of Modular Forms
4. Modular Forms of Higher Levels
4.1. Congruence Subgroups
4.2. A More Refined Definition of Modular Forms
5. The space M2(Γ0(4))
5.1. Computing the of G2(τ)
5.2. The almost-invariance of G2
5.3. An Almost Invariant SL2(Z) function from G2
5.4. A Bounding Theorem
5.5. The elements of M2(Γ0(4))
6. Theta Functions
7. The Sum of Four Squares Theorem
References

1. Introduction

The main goal of this paper is to introduce modular forms and to obtain an explicit formula for the number of ways to write a positive integer as a sum of four squares. Along the way, we shall also explicitly describe all modular forms with respect to the principal congruence subgroup SL2(Z).

2. Definition of modular Forms

Modular forms are naturally viewed in several different contexts: as holomorphic functions satisfying transformation equations, as sections of line bundles on Riemann surfaces, and as functions on lattices. In this section, we shall describe these different perspectives for thinking about modular forms and explain how they relate. But first we must build up the necessary background.

2.1. Preliminaries.

Definition 2.1.1. Let F be a ring. The nth general linear group over F, $GL_n(F)$, is the group of invertible n × n matrices with coefficients in F, viewed as a group via multiplication and inversion of matrices.

Definition 2.1.2. Again, with F a ring, the nth special linear group over F, notated $SL_n(F)$, is the group of n × n matrices with coefficients in F with determinant 1.

Definition 2.1.3. Let F be a ring and let Z(n, F) be the subgroup of scalar matrices of $GL_n(F)$, that is, the invertible multiples of the identity. The projective linear group of dimension n over F is $PGL_n(F) = GL_n(F)/Z(n, F)$.

Remark 2.1.4. Let $I_n$ denote the n × n identity matrix. Note that in the case that F = R or C, we have $Z(n, F) = F^{\times} \cdot I_n$, which is formed by the nonzero scalar diagonal matrices. In the further case that n = 2, which we shall mostly be dealing with in this paper, we have $PGL_2(\mathbb{R}) \cong GL_2(\mathbb{R})/Z(2, \mathbb{R})$ and $PGL_2(\mathbb{C}) \cong GL_2(\mathbb{C})/Z(2, \mathbb{C})$, and the matrices $I_2, -I_2$ are the only two elements of Z(2, F) with determinant 1.

Definition 2.1.5. Let $\hat{\mathbb{C}}$ denote the Riemann sphere, also known as the one point compactification of C. The group of fractional linear transformations, notated FLT, consists of the rational functions $f : \hat{\mathbb{C}} \to \hat{\mathbb{C}}$ of the form $f(z) = \frac{az+b}{cz+d}$ with $ad - bc \neq 0$. They form a group under composition, because FLT is the image of the natural surjection $GL_2(\mathbb{C}) \to FLT$. Note that the point of $\hat{\mathbb{C}}$ not in C is labeled ∞, and any value of the form $\frac{ax+b}{0}$ is labeled ∞.

Lemma 2.1.6. The group of fractional linear transformations is naturally isomorphic to $PGL_2(\mathbb{C})$.

Proof. Observe that we have a natural map $\varphi : GL_2(\mathbb{C}) \to FLT$, mapping the matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ to the function $f(z) = \frac{az+b}{cz+d}$. Clearly $Z(2, \mathbb{C}) \subset \ker \varphi$. Hence, we have an induced map $\bar{\varphi} : PGL_2(\mathbb{C}) = GL_2(\mathbb{C})/Z(2, \mathbb{C}) \to FLT$. This map is clearly surjective, by choosing the same (a, b, c, d) for both groups. It only remains to check that only the identity of $PGL_2(\mathbb{C})$ acts trivially. This holds because if $\frac{az+b}{cz+d} = z$ for all z, then $cz^2 + (d - a)z - b = 0$ identically, so $b = c = 0$ and $a = d$, and the matrix lies in $Z(2, \mathbb{C})$.

Remark 2.1.7. As described above, we obtain that P GL2(C) acts naturally on the Riemann sphere by fractional linear transformations.
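The correspondence between matrices and fractional linear transformations can be checked numerically. The following is a minimal sketch, not part of the original argument: the helper names `mat_mul` and `flt` and the marker `INFINITY` are hypothetical, and matrices are stored as tuples (a, b, c, d). It illustrates that composing two transformations agrees with applying the matrix product, and that a scalar multiple of a matrix gives the same transformation.

```python
INFINITY = None  # hypothetical marker for the point at infinity of the Riemann sphere


def mat_mul(g, h):
    """Product of two 2x2 matrices stored as tuples (a, b, c, d)."""
    a, b, c, d = g
    e, f, p, q = h
    return (a * e + b * p, a * f + b * q, c * e + d * p, c * f + d * q)


def flt(g, z):
    """Apply z -> (az + b)/(cz + d) on the Riemann sphere."""
    a, b, c, d = g
    if z is INFINITY:
        # The value at infinity is a/c, or infinity again when c = 0.
        return INFINITY if c == 0 else complex(a) / c
    denominator = c * z + d
    if denominator == 0:
        return INFINITY
    return (a * z + b) / denominator


g = (1, 2, 3, 4)
h = (0, -1, 1, 0)
z = 0.5 + 1.0j

composed = flt(g, flt(h, z))         # apply h first, then g
via_product = flt(mat_mul(g, h), z)  # apply the single matrix gh
scaled = flt((2, 4, 6, 8), z)        # the scalar multiple 2g of g
```

Up to floating point rounding, `composed`, `via_product` agree, and `scaled` agrees with `flt(g, z)`, which is the numerical shadow of the map factoring through $PGL_2(\mathbb{C})$.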

Lemma 2.1.8. Let $Aut(\hat{\mathbb{C}})$ denote the set of invertible meromorphic maps $f : \hat{\mathbb{C}} \to \hat{\mathbb{C}}$. We have $Aut(\hat{\mathbb{C}}) \cong FLT$.

Proof. Clearly all fractional linear transformations are automorphisms, with their inverse given by their inverse in the group of fractional linear transformations, which we saw above was $PGL_2(\mathbb{C})$. Conversely, suppose $T \in Aut(\hat{\mathbb{C}})$. Then, T must have precisely one pole and precisely one zero. Suppose the pole is p and the zero is q. Note that we must have p, q distinct. Then, the function $S = T \cdot \frac{z-p}{z-q}$ has no zeros and no poles. Note this is written as a product of meromorphic functions, with removable singularities filled in. That is, in S(z), we cancel off common factors from the numerator and denominator. Then, S is a function on the Riemann sphere with no poles or zeros. Hence, by Liouville's theorem, it must be constant, and so S = c, which implies $T = \frac{cz - cq}{z - p}$. This is a fractional linear transformation, and multiplying the numerator and denominator by the same constant we can arrange that the resulting determinant is 1.

Notation 2.1.1. Let Im(z) denote the imaginary part of z.

Definition 2.1.9. The upper half plane is the subset $H = \{z \in \mathbb{C} \mid Im(z) > 0\}$.

Lemma 2.1.10. Let $g = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL_2(\mathbb{R})$. Then, for $z \in \mathbb{C}$ with $cz + d \neq 0$, $Im(gz) = \frac{Im(z)}{|cz+d|^2}$.

Proof. Observe that
$$g(z) = \frac{az+b}{cz+d} = \frac{(az+b)(d + c\bar{z})}{|cz+d|^2} = \frac{bd + ac|z|^2 + Re(z)(ad+bc) + i(ad-bc)Im(z)}{|cz+d|^2} = \frac{bd + ac|z|^2 + Re(z)(ad+bc) + i\,Im(z)}{|cz+d|^2}.$$
Hence, $Im(gz) = \frac{Im(z)}{|cz+d|^2}$.

Corollary 2.1.11. Let Aut(H) denote the group of meromorphic functions sending H to H with a meromorphic inverse on H. The restrictions to H of the fractional linear transformations coming from $SL_2(\mathbb{R})$ define a subgroup of Aut(H).

Proof. Clearly, such an FLT maps R ∪ {∞} → R ∪ {∞}, and by the previous lemma, it maps i to something with positive imaginary part. Therefore, it must map all of H → H, and since it is an automorphism of $\hat{\mathbb{C}}$ it restricts to an automorphism of H.

2.2. A First Definition of Modular Forms.

The main idea of modular forms is that they have a certain invariance under composition with fractional linear transformations.

Definition 2.2.1. Let Γ = $SL_2(\mathbb{Z})$ act on the upper half plane via fractional linear transformations. Suppose γ ∈ Γ, $\gamma(z) = \frac{az+b}{cz+d}$. A weakly modular function of weight k is a complex meromorphic function $f : H \to \mathbb{C}$ satisfying $f(\gamma(z)) = (cz+d)^k f(z)$ for all γ ∈ Γ.

Proposition 2.2.2. The only weakly modular function of odd weight k is 0.

Proof. Let f be such a weakly modular function. Then, applying the matrix −I, we see $f(z) = (-1)^k f(z) = -f(z)$ if k is odd. This implies f(z) = 0 identically on H.

We'd like to say that a modular function is simply a weakly modular function that is also meromorphic at infinity, where infinity is thought of as lying very far in the imaginary direction. We can also translate H to the unit disk by a linear fractional transformation, which translates R ∪ {∞} to the boundary of the unit disk. Now we develop a tiny bit of Fourier analysis to make this notion precise.

Notation 2.2.1. We shall use D to denote the complex unit disk D = {z ∈ C||z| < 1}, and D∗ to denote the punctured complex unit disk, D∗ = D − {0}.

Lemma 2.2.3. Any holomorphic function $f : H \to \mathbb{C}$ satisfying f(z) = f(z + 1) can be written as $f(z) = g(e^{2\pi i z})$ for $g : D^* \to \mathbb{C}$ holomorphic.

Proof. The map $z \mapsto q = e^{2\pi i z}$ sends H onto $D^*$. Locally, it has a holomorphic inverse, given by a branch of the logarithm, $z = \frac{1}{2\pi i} \log q$. Therefore, locally we can write $g(q) = f\left(\frac{1}{2\pi i} \log q\right) = f(z)$. This is well defined on the strip 0 ≤ Re(z) < 1. Moreover, two branches of the logarithm change z by an integer, and since f(z) = f(z + 1), the value g(q) does not depend on the branch; equivalently, $f(z) = f(z - \lfloor Re(z) \rfloor)$. So, g(q) is holomorphic because it is locally holomorphic.

Notation 2.2.2. Let $S = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$, $T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ as elements of $SL_2(\mathbb{Z})$.

Remark 2.2.4. Note that any weakly modular function f satisfies f(z) = f(z + 1) because $f(z+1) = f(Tz) = 1^k f(z) = f(z)$ by definition of weakly modular. Therefore, we can write $f(z) = g(e^{2\pi i z}) = g(q)$ with $g : D^* \to \mathbb{C}$. If g(q) is holomorphic on a punctured neighborhood of 0, we can expand it as a Laurent series around 0, $g(q) = \sum_{n=-\infty}^{\infty} a_n q^n$.

Definition 2.2.5. A function f : H → C satisfying f(z) = f(z + 1) is meromorphic at infinity if, when we write $f(z) = g(q) = \sum_{n=-\infty}^{\infty} a_n q^n$ as above, we have $a_n = 0$ for all n ≤ N, for some $N \in \mathbb{Z}$. Equivalently, we can say that g(q) has a pole or a removable singularity at 0.

Definition 2.2.6. A function f : H → C satisfying f(z) = f(z + 1) is holomorphic at infinity if, when we write $f(z) = g(q) = \sum_{n=-\infty}^{\infty} a_n q^n$ as above, we have $a_n = 0$ for all n < 0. Equivalently, we can say that g(q) has a removable singularity at 0.

Definition 2.2.7. A modular function is a weakly modular function f : H → C that is meromorphic at infinity.

Definition 2.2.8. A modular form is a weakly modular function f : H → C which is holomorphic on H, and is holomorphic at infinity.

2.3. The action of SL2(Z) on H.

The purpose of the next several theorems is to gain a better understanding of the group $PSL_2(\mathbb{Z})$.

Definition 2.3.1. A fundamental domain for a group Γ acting on H is an open set D ⊂ H so that the closure of D, notated $\bar{D}$, satisfies $\Gamma \bar{D} = H$, no two points in D are in the same Γ orbit, and for any point $p \in \partial D = \bar{D} - D$, p is not in the same orbit as any point in D and p is only in the same orbit as finitely many other points in ∂D.

Theorem 2.3.2. The set $D = \{z \in H \mid |z| > 1, -\frac{1}{2} < Re(z) < \frac{1}{2}\}$ is a fundamental domain for $PSL_2(\mathbb{Z})$ acting as fractional linear transformations.

Proof. Let the subgroup of $SL_2(\mathbb{Z})$ generated by S, T be denoted by H. We will first show that if z lies in the upper half plane, it is in the same orbit as some element of $\bar{D}$. By Lemma 2.1.10, $Im(gz) = \frac{Im(z)}{|cz+d|^2}$. However, there are only finitely many pairs of integers (c, d) for which |cz + d| ≤ 1, so Im(γ(z)) takes only finitely many values that are at least Im(z) as γ ranges over H. Therefore, there is some g ∈ H such that Im(g(z)) attains its maximum over H. We may apply an appropriate power of T so that $-\frac{1}{2} < Re(T^k g(z)) \le \frac{1}{2}$. Of course, this has the same imaginary part as g(z), so we may replace g by $T^k g$. Note that we must have |g(z)| ≥ 1, because if |g(z)| < 1, then $Im(Sg(z)) = \frac{Im(g(z))}{|g(z)|^2} > Im(g(z))$, contradicting maximality of g.

All that remains to check is that no two elements of D are in the same $SL_2(\mathbb{Z})$ orbit. Suppose we have $z \in \bar{D}$ and $g(z) \in \bar{D}$. We may assume Im(g(z)) ≥ Im(z), but by Lemma 2.1.10, this implies that |cz + d| ≤ 1. This can only happen when c = ±1, 0. This is because $Im(z) > \frac{1}{2}$, and so if |c| ≥ 2, then |Im(cz + d)| > 1, and so |cz + d| > 1. In the case c = 0, we must have d = ±1. This implies a = d, so g acts as a translation by an integer, and since the real part of a point of D is constrained by $|Re(z)| < \frac{1}{2}$, the only way z can be in the same orbit as another point is if $|Re(z)| = \frac{1}{2}$ and z ∈ ∂D. Next, if c = ±1, we can assume without loss of generality c = 1. Then we need |z + d| ≤ 1. If d = ±1, this forces z to be one of the two corner points with $|Re(z)| = \frac{1}{2}$ and |z| = 1, which lie on ∂D. If d = 0, then |z| = 1, and b = −1. In the special case that z is a cube root of unity, that is $|Re(z)| = \frac{1}{2}$, we may have a = ±1, 0, but in all these cases g relates z to other points on the boundary of D. Otherwise, we must have a = 0, and in this case g relates z = α + βi to −α + βi, which again lies on the boundary. Therefore, no two points in the interior are in the same $SL_2(\mathbb{Z})$ orbit.

Corollary 2.3.3. Let H be the subgroup of SL(2, Z) generated by S, T. Then, H = PSL(2, Z).

Proof. Let z ∈ D. For a given $g \in PSL_2(\mathbb{Z})$, take w = gz. Then, as we showed in the proof of Theorem 2.3.2, there exists some h ∈ H with $hw \in \bar{D}$. Therefore, since D is a fundamental domain, we must have hw = z, which means $h = g^{-1}$, and hence $g = h^{-1} \in H$. Therefore H = PSL(2, Z).

Definition 2.3.4 (Modular Form). A modular form of weight k is a holomorphic function on the upper half plane satisfying f(z + 1) = f(z) and $f\left(\frac{-1}{z}\right) = z^k f(z)$, which is also holomorphic at infinity.

Lemma 2.3.5. Definition 2.3.4 is equivalent to our original Definition 2.2.8.

Proof. Since S, T generate $PSL_2(\mathbb{Z})$, having $f(\gamma(z)) = (cz+d)^k f(z)$ for all γ is equivalent to having this hold just for γ = S, T, as we are assuming in this definition.

2.4. Three Other Ways to Think about Modular Forms.

The following definition perhaps illuminates the reason for calling it a modular "form" instead of simply a "function." Since all modular forms of weight 2k + 1 are 0, at least when our definition involves relations for all of $SL_2(\mathbb{Z})$, we can restrict our attention to even weight modular forms. Therefore, we can make the following definition.

Definition 2.4.1 (Modular Form). A modular form of weight 2k is a holomorphic form $\omega = f(z)\,dz^k$ defined on H that is $SL_2(\mathbb{Z})$ invariant, and f(z) is holomorphic at infinity.

Remark 2.4.2. In the above definition, $dz^k$ should be read as $(dz)^k$ and not $d(z^k)$, as it should really be thought of as a section of $\Omega_H^{\otimes k}$.
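The reduction used in the proofs of Theorem 2.3.2 and Corollary 2.3.3 (recenter with a power of T, then apply S whenever the point lies inside the unit circle) can be sketched as a short procedure. This is an illustrative floating point sketch only; the function name `reduce_to_fundamental_domain` and the `max_steps` safeguard are hypothetical and not taken from the text.

```python
import math


def reduce_to_fundamental_domain(z, max_steps=10_000):
    """Move z (with Im z > 0) into the closure of D using only S and T.

    Returns the reduced point and the list of moves applied, in order.
    """
    if z.imag <= 0:
        raise ValueError("z must lie in the upper half plane")
    moves = []
    for _ in range(max_steps):
        # Apply a power of T so that -1/2 < Re(z) <= 1/2.
        k = math.ceil(z.real - 0.5)
        if k != 0:
            z = z - k
            moves.append(("T", -k))
        # Outside or on the unit circle: z is in the closure of D.
        if abs(z) >= 1:
            return z, moves
        # Inside the unit circle: S strictly increases the imaginary part.
        z = -1 / z
        moves.append(("S", 1))
    raise RuntimeError("no reduction found within max_steps")


w, moves = reduce_to_fundamental_domain(0.3 + 0.01j)
```

Each application of S inside the unit circle raises the imaginary part, which is the same maximality idea the proof relies on; the loop stops as soon as the recentered point has absolute value at least 1.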

Lemma 2.4.3. Definition 2.4.1 is equivalent to Definition 2.2.8

Proof. Observe that $\frac{d}{dz} \frac{az+b}{cz+d} = \frac{a(cz+d) - c(az+b)}{(cz+d)^2} = \frac{ad-bc}{(cz+d)^2} = \frac{1}{(cz+d)^2}$. Therefore, if $f(gz) = (cz+d)^{2k} f(z)$, then $f(gz)\,d(gz)^k = (cz+d)^{2k} f(z)\,d(gz)^k = (cz+d)^{2k} f(z) (cz+d)^{-2k}\,dz^k = f(z)\,dz^k$. Similarly, this invariance precisely implies f transforms as in the definition of modular form by the same computation. So, ω is $SL_2(\mathbb{Z})$ invariant if and only if f(z) is a modular form of weight 2k.

Definition 2.4.4. A modular form of weight 2k is a holomorphic form $\omega = f(z)\,dz^k$ defined on $H/SL_2(\mathbb{Z})$, and f(z) is holomorphic at infinity.

Lemma 2.4.5. Definition 2.4.4 is equivalent to Definition 2.4.1.

Proof. By definition, forms on H which are G invariant are equivalent to forms on H/G.

In addition, we can also view modular forms as acting on the set of lattices. To justify this, we shall now see that the space of lattices is equivalent to $H/SL_2(\mathbb{Z})$.

Definition 2.4.6. Let V be a finite dimensional real vector space. A lattice Γ ⊂ V is a discrete subgroup under addition of finite rank, which generates V as an R vector space.

Notation 2.4.1. Denote R to be the set of lattices in C. Let L denote pairs of nonzero complex numbers $(\omega_1, \omega_2) \in (\mathbb{C}^*)^2$ such that $Im(\omega_1/\omega_2) > 0$.

Lemma 2.4.7. The set R is in bijection with $L/SL_2(\mathbb{Z})$, where $SL_2(\mathbb{Z})$ acts on these two dimensional vectors of L by left multiplication.

Proof. First, define the map F : L → R sending $(\omega_1, \omega_2) \mapsto \mathbb{Z}\omega_1 \oplus \mathbb{Z}\omega_2$. First note that all elements of the image are lattices because $Im(\omega_1/\omega_2) \neq 0$, or equivalently, $(\omega_1, \omega_2)$ generates C as an R vector space. We have to show that this map is surjective, and that any two elements mapping to the same lattice are in the same $SL_2(\mathbb{Z})$ orbit. First, let us show surjectivity. Given any lattice, we can find two elements $\omega_1, \omega_2$ that generate it as a Z module. These elements clearly must be independent over R, and hence, after possibly reordering, we can arrange so that $Im(\omega_1/\omega_2) > 0$. Finally, we just have to check that two pairs mapping to the same lattice are in the same $SL_2(\mathbb{Z})$ orbit. First, it's clear that any two pairs which are in the same $SL_2(\mathbb{Z})$ orbit generate the same lattice, since multiplication by an element of $SL_2(\mathbb{Z})$ is simply a change of Z basis, with inverse given by its inverse in $SL_2(\mathbb{Z})$. Conversely, suppose we have $(\omega_1, \omega_2)$ and $(\eta_1, \eta_2)$ both of which generate the same lattice. We must have matrices g, h with integer entries, mapping each basis to the other. In particular, g and h must be mutual inverses. But since all integer matrices have integer determinants, this implies both g and h must have determinant ±1. However, the fact that $Im(\omega_1/\omega_2) > 0$, $Im(\eta_1/\eta_2) > 0$ implies that the change of basis matrix must have positive determinant, and hence $g, h \in SL_2(\mathbb{Z})$.

Corollary 2.4.8. Denoting $\Lambda = R/\mathbb{C}^*$, where $\mathbb{C}^*$ acts on L by $\lambda(\omega_1, \omega_2) = (\lambda\omega_1, \lambda\omega_2)$, we obtain a bijection $H/SL_2(\mathbb{Z}) \cong \Lambda$.

Proof. Clearly $L/\mathbb{C}^* \cong H$, where $\mathbb{C}^*$ acts by scalar multiplication, since the point $(\omega_1, \omega_2)/\mathbb{C}^*$ can be identified with the quotient $\omega_1/\omega_2$, which by assumption has positive imaginary part, and hence lies in the upper half plane. Therefore, quotienting both sides of the bijection from the previous lemma by the action of $\mathbb{C}^*$, we obtain the corollary.

Definition 2.4.9 (Modular Form). Let Γ ∈ R, $\lambda \in \mathbb{C}^*$. A modular function of weight k is a function F : R → C satisfying $F(\lambda\Gamma) = \lambda^{-k}F(\Gamma)$, where F is holomorphic as a function on H after the identification of the previous corollary, that is, $\tau \mapsto F(\mathbb{Z}\tau \oplus \mathbb{Z})$ is holomorphic on H.

Lemma 2.4.10. Definition 2.4.9 is equivalent to Definition 2.2.8.

Proof. Strictly speaking this is not precisely the same definition as the modular forms we have seen before, since this has a different domain and range, but we shall show that these two notions are equivalent. First, note that $F(\omega_1, \omega_2) = F(\mathbb{Z}\omega_1 \oplus \mathbb{Z}\omega_2)$ is $SL_2(\mathbb{Z})$ invariant, as we showed in Lemma 2.4.7. Therefore, to F we can associate the function f : H → C, defined by $f(\omega_1/\omega_2) = \omega_2^k F(\omega_1, \omega_2)$. Then, observe that f is a modular function as defined in Definition 2.2.8. Writing $\tau = \omega_1/\omega_2$, we have
$$f(\gamma(\tau)) = F(\gamma(\tau), 1) = F\left(\frac{a\omega_1 + b\omega_2}{c\omega_1 + d\omega_2}, 1\right) = (c\tau + d)^k F(a\tau + b, c\tau + d) = (c\tau + d)^k F(\tau, 1) = (c\tau + d)^k f(\tau),$$
which shows that f is a modular function as defined above. Similarly, we see that if f is modular in the sense defined above, then F is a modular function in this sense.

These equivalence classes should be thought of as points in the moduli space of isomorphism classes of elliptic curves. That is, any elliptic curve can be represented by the quotient of C by a lattice Λ, and two elliptic curves are isomorphic if and only if their lattices are scalar multiples of each other. This allows us to view modular functions as functions on isomorphism classes of elliptic curves.
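The passage between bases of a lattice and points of the upper half plane can be illustrated numerically. In this minimal sketch the helper names `change_basis` and `mobius`, and the sample values, are hypothetical choices made for illustration. It checks that changing the basis $(\omega_1, \omega_2)$ by a matrix of $SL_2(\mathbb{Z})$ moves the ratio $\omega_1/\omega_2$ by the corresponding fractional linear transformation, and that rescaling the basis leaves the ratio unchanged.

```python
def change_basis(gamma, w1, w2):
    """Left multiplication of the column vector (w1, w2) by gamma = (a, b, c, d)."""
    a, b, c, d = gamma
    return a * w1 + b * w2, c * w1 + d * w2


def mobius(gamma, tau):
    """The fractional linear transformation attached to gamma."""
    a, b, c, d = gamma
    return (a * tau + b) / (c * tau + d)


gamma = (2, 1, 1, 1)            # determinant 2*1 - 1*1 = 1
w1, w2 = 1 + 2j, 3 - 1j         # sample basis with Im(w1/w2) > 0
tau = w1 / w2

n1, n2 = change_basis(gamma, w1, w2)
new_tau = n1 / n2               # ratio of the new basis

scaled_tau = (2j * w1) / (2j * w2)  # rescaling the lattice by 2i
```

Here `new_tau` agrees with `mobius(gamma, tau)` and still has positive imaginary part, while `scaled_tau` equals `tau`, matching the two quotients taken in Lemma 2.4.7 and Corollary 2.4.8.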

3. The Space of modular Forms

In this section, we will describe the surprisingly simple complete classification of all modular forms on H. Essentially, all modular forms are given by Eisenstein series, which we will now define. Furthermore, the Eisenstein series of higher weights are generated by the Eisenstein series of weights 4 and 6.

3.1. Eisenstein Series.

Definition 3.1.1. The Eisenstein series of weight k is a function R → C, sending a lattice Γ to $G_k(\Gamma) = \sum_{\gamma \in \Gamma, \gamma \neq 0} \frac{1}{\gamma^k}$.

Lemma 3.1.2. For k > 2, the Eisenstein series converges absolutely.

Proof. First, note that given a lattice Γ, we can find a constant $c_\Gamma$ such that there are at most $r^2 c_\Gamma$ points γ ∈ Γ with |γ| < r. This implies that there is another constant $k_\Gamma$ so that there are at most $r k_\Gamma$ points with r < |γ| ≤ r + 1. Then, in order to show absolute convergence, we can write
$$\sum_{\gamma \in \Gamma, \gamma \neq 0} \frac{1}{|\gamma|^k} = \sum_{n=0}^{\infty} \sum_{n < |\gamma| \le n+1} \frac{1}{|\gamma|^k},$$
and it suffices to show the right hand series converges. Clearly, to show this converges, it suffices to show $\sum_{n=1}^{\infty} \sum_{n < |\gamma| \le n+1} \frac{1}{|\gamma|^k}$ converges, as the n = 0 term is finite. However,
$$\sum_{n < |\gamma| \le n+1} \frac{1}{|\gamma|^k} < \sum_{n < |\gamma| \le n+1} \frac{1}{n^k} \le n k_\Gamma \frac{1}{n^k} = k_\Gamma \frac{1}{n^{k-1}}.$$
Then, $\sum_{n=1}^{\infty} \sum_{n < |\gamma| \le n+1} \frac{1}{|\gamma|^k} \le \sum_{n=1}^{\infty} k_\Gamma \frac{1}{n^{k-1}}$, which converges so long as k > 2.
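The role of the condition k > 2 can be seen in a small numerical experiment. The sketch below is only an illustration under stated assumptions: it uses the square lattice generated by 1 and i, sums over the box |m|, |n| ≤ N rather than over annuli, and the function name `abs_partial_sum` is hypothetical. It compares how much the sum of absolute values grows when the box is doubled, for k = 4 and for k = 2.

```python
def abs_partial_sum(w1, w2, k, N):
    """Sum of 1/|m*w1 + n*w2|^k over nonzero (m, n) with |m|, |n| <= N."""
    total = 0.0
    for m in range(-N, N + 1):
        for n in range(-N, N + 1):
            if (m, n) != (0, 0):
                total += 1.0 / abs(m * w1 + n * w2) ** k
    return total


# Growth of the partial sums when the box is doubled from N = 40 to N = 80.
growth_k4 = abs_partial_sum(1j, 1, 4, 80) - abs_partial_sum(1j, 1, 4, 40)
growth_k2 = abs_partial_sum(1j, 1, 2, 80) - abs_partial_sum(1j, 1, 2, 40)
```

For k = 4 the added contribution is tiny, consistent with absolute convergence, while for k = 2 doubling the box still adds a contribution of order one in this range, which is why the lemma requires k > 2.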

Notation 3.1.1. We shall now abuse notation, notating $G_k$ both as a function on lattices and on the upper half plane under the correspondence from the previous section. We shall write $G_k(z)$ to view it as a function on H, and $G_k(\Gamma)$ to view it as a function on lattices.

Lemma 3.1.3. Under the correspondence of functions on lattices with functions on H, $G_k(\Gamma)$ becomes the function $G_k(z) = \sum_{m,n \in \mathbb{Z}, (m,n) \neq (0,0)} \frac{1}{(mz+n)^k}$.

Proof. Let a lattice $\Gamma = \omega_1\mathbb{Z} \oplus \omega_2\mathbb{Z}$. Then, under the correspondence from the previous section, by letting $z = \frac{\omega_1}{\omega_2}$, we obtain the function
$$G_k(z) = \omega_2^k G_k(\omega_1\mathbb{Z} \oplus \omega_2\mathbb{Z}) = \omega_2^k \sum_{\gamma \in \Gamma, \gamma \neq 0} \frac{1}{\gamma^k} = \omega_2^k \sum_{(m,n) \neq (0,0)} \frac{1}{(m\omega_1 + n\omega_2)^k} = \sum_{(m,n) \neq (0,0)} \frac{1}{\left(m\frac{\omega_1}{\omega_2} + n\right)^k} = \sum_{(m,n) \neq (0,0)} \frac{1}{(mz+n)^k},$$
where all sums run over $m, n \in \mathbb{Z}$.

Definition 3.1.4. The Riemann zeta function is defined to be
$$\zeta(z) = \sum_{n > 0, n \in \mathbb{Z}} n^{-z}.$$
Note that it is absolutely convergent for $z \in \mathbb{C}$, Re(z) > 1. In particular it is convergent on reals greater than 1.

Theorem 3.1.5. The Eisenstein Series $G_k(z)$ is a modular function of weight k, for k ≥ 3, and $G_k(\infty) = 2 \cdot \zeta(k)$ for k even, and 0 for k odd.

Proof. First, let us check it satisfies the transformation property. Clearly the Eisenstein series is preserved under the map z ↦ z + 1. For the other invariance, we need to check $G_k\left(\frac{-1}{z}\right) = z^k G_k(z)$. Indeed,
$$G_k\left(\frac{-1}{z}\right) = \sum_{(m,n) \neq (0,0)} \frac{1}{\left(m \frac{-1}{z} + n\right)^k} = z^k \sum_{(m,n) \neq (0,0)} \frac{1}{(-m + nz)^k} = z^k \sum_{(m,n) \neq (0,0)} \frac{1}{(mz+n)^k} = z^k G_k(z),$$
where all sums run over $m, n \in \mathbb{Z}$.

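Next, we shall check $G_k(z)$ is holomorphic in H. Defining
$$f_t = \sum_{m,n \in \mathbb{Z},\ |m|, |n| < t,\ (m,n) \neq (0,0)} \frac{1}{(mz+n)^k},$$
each $f_t$ is a finite sum of holomorphic functions on H. For $x \in \bar{D}$ we have the estimate $|mx+n|^2 = m^2|x|^2 + 2mn\,Re(x) + n^2 \ge m^2 - |mn| + n^2 = \big|\,|m|w + |n|\,\big|^2$, where $w = e^{2\pi i/3}$ is a cube root of unity. Recall that $f_t \to f$ uniformly on D means that $\forall \epsilon > 0, \exists T$ such that $\forall t > T$ and all x ∈ D, $|f_t(x) - f(x)| < \epsilon$. Using the above estimate, if we choose T such that $\sum_{\max(|m|,|n|) \ge t} \frac{1}{|\,|m|w + |n|\,|^k} < \epsilon$ for t > T (which can clearly be done since the series converges at w by Lemma 3.1.2), then the same T works for all points x ∈ D because $|f(x) - f_t(x)| \le \sum_{\max(|m|,|n|) \ge t} \frac{1}{|mx+n|^k} \le \sum_{\max(|m|,|n|) \ge t} \frac{1}{|\,|m|w + |n|\,|^k} < \epsilon$. Therefore, we have the sequence $\{f_t\} \to G_k(z)$ uniformly, and since all $f_t$ are holomorphic on H, so is $G_k(z)$.

Finally, to check $G_k(z)$ is holomorphic at infinity, by Riemann's removable singularity theorem, it suffices to check it is bounded. However, again using the estimate from the previous paragraph, we have that it is bounded in the fundamental domain, by the corresponding sum at w, a cube root of unity. Then, using the fact that $G_k(z) = G_k(z+1)$, we have that the value at infinity is bounded, and so it must be holomorphic.

Finally, we just have to check the value $G_k(\infty)$. We know since the series is uniformly convergent, its value at ∞ is the term by term limit. For m ≠ 0, we have $\lim_{Im(z) \to \infty} \frac{1}{(mz+n)^k} = 0$. For even k, we have
$$G_k(\infty) = \lim_{Im(z) \to \infty} \sum_{(n,m) \neq (0,0)} \frac{1}{(mz+n)^k} = \sum_{(n,m) \neq (0,0)} \lim_{Im(z) \to \infty} \frac{1}{(mz+n)^k} = \sum_{n \neq 0} n^{-k} = 2 \sum_{n > 0} n^{-k} = 2\zeta(k).$$
In the case that k were odd, we would reach 0 at the penultimate step in the above computation.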
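Both statements of Theorem 3.1.5 can be sanity checked with truncated sums. The following is a minimal sketch, assuming a symmetric truncation |m|, |n| ≤ N and the known value ζ(4) = π⁴/90; the function name `eisenstein_partial` and the sample points are hypothetical. Because the truncation box is symmetric under (m, n) ↦ (n, −m), the truncated sums satisfy the transformation under z ↦ −1/z up to rounding.

```python
import math


def eisenstein_partial(z, k, N):
    """Truncated Eisenstein series: sum of (m*z + n)^(-k) over |m|, |n| <= N."""
    total = 0j
    for m in range(-N, N + 1):
        for n in range(-N, N + 1):
            if (m, n) != (0, 0):
                total += 1 / (m * z + n) ** k
    return total


k, N = 4, 30
z = 0.1 + 1.3j

lhs = eisenstein_partial(-1 / z, k, N)    # G_k(-1/z), truncated
rhs = z ** k * eisenstein_partial(z, k, N)  # z^k G_k(z), truncated

high_up = eisenstein_partial(20j, k, N)   # a point far up the imaginary axis
two_zeta_4 = math.pi ** 4 / 45            # 2 * zeta(4), using zeta(4) = pi^4 / 90
```

The quantities `lhs` and `rhs` agree to rounding error, and `high_up` is close to `two_zeta_4`, in line with the value $G_k(\infty) = 2\zeta(k)$ for even k.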

Corollary 3.1.6. For odd k, Gk = 0. Proof. It is a modular form of odd weight, hence 0. Of course, we could also see this by observing explicitly that the sum is 0, by pairing off a point in the lattice with its negative.  3.2. Notation for Spaces of Modular Forms.

Notation 3.2.1. Denote the vector space of modular forms of weight k by Mk. Note that it is a vector space since linear combinations of modular forms of weight k are again obviously modular forms of weight k.

Definition 3.2.1. A modular form f is called a cusp form if $a_0 = 0$, where $a_0$ is the constant term in the Fourier expansion of f at ∞.

Definition 3.2.2. Denote the vector space of cusp forms of weight k by $M_k^0$. Again, this too is a vector space, because $M_k$ is a vector space, and linear combinations of functions which vanish at infinity also vanish at infinity.

Definition 3.2.3. Denote the modular form algebra $M = \bigoplus_{k \in \mathbb{Z}} M_k$, and the ideal $M^0 = \bigoplus_{k \in \mathbb{Z}} M_k^0$.

Notation 3.2.2. Denote $\Delta = 60^3 (G_4(z))^3 - 27 \cdot 140^2 (G_6(z))^2$.

Lemma 3.2.4. $\Delta \in M_{12}^0$.

Proof. Clearly $G_4(z)^3$ is a modular form of weight 12 = 4 · 3, because it is the cube of a modular form of weight 4, and $(G_6(z))^2$ is also a modular form of weight 12 because it is the square of a modular form of weight 6. Therefore, Δ is indeed a modular form of weight 12. To check it is also a cusp form, we need to check that its value at infinity is 0. However, $\Delta(\infty) = 60^3 (2\zeta(4))^3 - 27 \cdot 140^2 \cdot (2\zeta(6))^2 = 60^3 \cdot \left(\frac{\pi^4}{45}\right)^3 - 27 \cdot 140^2 \cdot \left(\frac{2\pi^6}{945}\right)^2 = 0$.

Lemma 3.2.5. $M_k \cong 0$ for $k \in 2\mathbb{Z} + 1$.

Proof. This is simply a restatement of the fact that there are no nonzero odd weight modular forms.

3.3. A Bound on the Dimension of Modular Forms of a Fixed Weight.

Above we have described the Eisenstein series of even weights. It will turn out that products of these account for all modular forms. We shall go about showing this by first obtaining an upper bound on the dimensions of modular forms of varying weights, and then showing that the Eisenstein Series generate a space of exactly this dimension.

Definition 3.3.1. The vanishing order of a function f at a point p, denoted $v_p(f)$, is the maximum integer n for which we can write $f(z) = g(z) \cdot (z-p)^n$ for g holomorphic.

Lemma 3.3.2. Note that the vanishing order is a valuation. That is, $v_p(fg) = v_p(f) + v_p(g)$ and $v_p(f + g) \ge \min(v_p(f), v_p(g))$.

Proof. After writing out Taylor series for f and g, the lemma is immediate.

Lemma 3.3.3. Let f be a modular form. If p = γ(q) then $v_p(f) = v_q(f)$. Therefore, we have a well defined value $v_t(f)$ for $t \in H/SL_2(\mathbb{Z})$.

Proof. Since $f(\gamma(z)) = (cz+d)^k f(z)$, and since $v_p(cz+d) = 0$ for all p ∈ H, we have $v_{\gamma(p)}(f) = v_p(f)$. Thus, we naturally have an induced valuation on the quotient $H/SL_2(\mathbb{Z})$.

Definition 3.3.4. Let G be a group acting on H. For p ∈ H define ep to be the order of the stabilizer of p under the G action.

Remark 3.3.5. Under the action of G = $SL_2(\mathbb{Z})$, by 2.3.2, we understand that in the fundamental domain, $e_p = 2$ for p = i, $e_p = 3$ for p = w, with w a sixth root of unity, and otherwise $e_p = 1$.

Theorem 3.3.6. For f a nonzero modular function of weight k,
$$v_\infty(f) + \sum_{p \in H/SL_2(\mathbb{Z})} \frac{1}{e_p} v_p(f) = \frac{k}{12}.$$

Proof. To prove this, we shall integrate the function $\frac{f'(z)}{f(z)}$. First observe that since f is a nonzero modular form, there is some neighborhood of ∞, call it N, on which it has no zeros. Observe that the zeros of f are exactly the poles of $\frac{f'(z)}{f(z)}$, and the residue of $\frac{f'(z)}{f(z)}$ at such a point is the multiplicity of the zero. Since $\bar{D} - N$ is compact, f can only have finitely many zeros in that region. Therefore, the sum above is a finite sum, hence well defined. Define F to be a set containing $\bar{D} - N$ and of the form $\bar{D} \cap \{z \in H \mid Im(z) < K\}$ for some real constant K. We shall then compute the contour integral $\int_{\partial F} \frac{f'(z)}{f(z)}\,dz$ in two different ways.

Define A, C to be the third root of unity and sixth root of unity, respectively, on the boundary of our fundamental domain, and define B to be the point i.

Next, we have to deal with poles on the boundary. If there are poles on the boundary, other than the points A, B, C, they will come in pairs, which are reflections across the imaginary axis. We can then simply integrate along small semicircles around them, and take the limit as the radius approaches 0. Note that $\frac{f'(z)}{f(z)}$ has the same residue at both of them, because the order of f is well defined on points that are equivalent under the $SL_2(\mathbb{Z})$ action. Hence, we can assume there are no such points on the boundary. Next, we take care of poles at A, B, C. If there is a pole at B, then if we replace the contour going through B by a circular arc of small radius around B, in the limit as the radius goes to 0, the arc approaches a half circle, and hence the integration around it contributes $\frac{1}{2} Res_B\left(\frac{f'}{f}\right) = \frac{1}{2} v_B(f)$. Next, we know there is a pole at A if and only if there is a pole at C, and if we replace the contour going through A and C by small circular arcs around A and C, in the limit as their radii go to 0, they each approach $\frac{1}{6}$ of a circle, hence they contribute together $\frac{1}{6} Res_C\left(\frac{f'}{f}\right) + \frac{1}{6} Res_A\left(\frac{f'}{f}\right) = \frac{1}{3} Res_A\left(\frac{f'}{f}\right) = \frac{1}{3} v_A(f)$.

Now, using the residue theorem and adding the poles inside the contour, we have $\int_{\partial F} \frac{f'(z)}{f(z)}\,dz = 2\pi i \cdot \sum_{p \in H/SL_2(\mathbb{Z})} \frac{1}{e_p} v_p(f)$. So, it remains to show that the integral along ∂F is equal to $2\pi i\left(-v_\infty(f) + \frac{k}{12}\right)$. Indeed, let us rewrite ∂F as the union of $C^1$ curves CD, DE, EA, AB, BC, where $D = \frac{1}{2} + Ki$, $E = -\frac{1}{2} + Ki$, with K as defined in the first paragraph. Then, observe that $\int_{ED} \frac{f'}{f} = 2\pi i\,v_\infty(f)$, because when we transform the upper half plane to the unit disk by the coordinate $q = e^{2\pi i z}$, the line segment ED travels around the origin exactly once, and since there are no other zeros in the region, the integral is exactly $2\pi i\,v_\infty(f)$. Hence, $\int_{DE} \frac{f'}{f} = -2\pi i\,v_\infty(f)$.

So, our remaining goal is to show $\int_{EA,AB,BC,CD} \frac{f'}{f} = 2\pi i \frac{k}{12}$. First, observe that because f(z + 1) = f(z), we have $\int_{EA} \frac{f'}{f} = \int_{DC} \frac{f'}{f}$ and so $\int_{EA+CD} \frac{f'}{f} = 0$. So, we just need to show $\int_{AB,BC} \frac{f'}{f} = 2\pi i \frac{k}{12}$. To complete this, note $f\left(\frac{-1}{z}\right) = z^k f(z)$, and so making the change of variable $y = -\frac{1}{z}$, which carries the arc from A to B to the arc from C to B, we see
$$\int_A^B \frac{f'(z)}{f(z)}\,dz = \int_C^B \frac{f'(y) y^k + k y^{k-1} f(y)}{y^k f(y)}\,dy = \int_C^B \frac{f'(y)}{f(y)}\,dy + k \int_C^B \frac{1}{y}\,dy.$$
Therefore,
$$\int_{AB,BC} \frac{f'}{f} = \int_C^B \frac{f'(y)}{f(y)}\,dy + k \int_C^B \frac{dy}{y} + \int_B^C \frac{f'(y)}{f(y)}\,dy = k \int_C^B \frac{dy}{y} = \int_{i\pi/3}^{i\pi/2} k\,dz = 2\pi i \cdot \frac{k}{12},$$
as we wanted to show.

3.4. A Complete Characterization of Modular Forms.

Corollary 3.4.1. For k < 0,Mk = 0. Proof. If there were some form of negative weight, by 3.3.6 it would have to have some vp(f) < 0. However, since it is holomorphic, we must have vp(f) ≥ 0. So, there are no negative weight forms. 

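Corollary 3.4.2. $M_2 = 0$.

Proof. From 3.3.6 applied in weight 2, $v_\infty(f) + \sum_{p \in H/SL_2(\mathbb{Z})} \frac{1}{e_p} v_p(f) = \frac{2}{12} = \frac{1}{6}$. However, since all $v_p(f) \ge 0$, $v_p(f) \in \mathbb{Z}$, and all $\frac{1}{e_p} \ge \frac{1}{3}$, every nonzero term on the left is at least $\frac{1}{3}$, so the above formula has no solutions, hence there cannot be any nonzero modular forms of weight 2.

Lemma 3.4.3. For $k \in 2\mathbb{Z}$, if $M_k^0 \neq 0$, then $M_k \cong M_k^0 \oplus \mathbb{C} \cdot G_k$.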

Proof. First, define the functional $\xi : M_k \to \mathbb{C}$, ξ(f) = f(∞). This is clearly linear, and by definition $M_k^0 = \ker \xi$. Since the image of ξ is a subspace of C, dim(Im(ξ)) ≤ 1. In the case $M_k^0 \neq 0$, dim(Im(ξ)) = 1, and so by rank nullity, $\dim M_k = 1 + \dim M_k^0$.

Lemma 3.4.4. Recall Δ defined in Notation 3.2.2. In Lemma 3.2.4 we have shown Δ is a cusp form of weight 12. There is an isomorphism $M_{k-12} \cong M_k^0$, defined by f ↦ Δ · f.

Proof. Clearly multiplication by Δ defines a map $F : M_{k-12} \to M_k$, f ↦ Δ · f, and since Δ is a cusp form, it follows that the product of Δ with any modular form is a cusp form, and so the image lies in $M_k^0$. By 3.3.6 we have $v_\infty(\Delta) + \sum_{p \in H/SL_2(\mathbb{Z})} \frac{1}{e_p} v_p(\Delta) = 1$. Since it has a zero at ∞, it cannot have any other zeros. Therefore, we can define an inverse map, division by Δ, $G : M_k^0 \to M_{k-12}$, $g \mapsto \frac{g}{\Delta}$. This is well defined because for any $g \in M_k^0$, we know g(∞) = 0, and since Δ has a zero of order 1 at ∞, g/Δ cannot have a pole at ∞. Furthermore, since Δ is holomorphic and non-vanishing on H, g/Δ is also holomorphic on H. Therefore, g/Δ is holomorphic on H, holomorphic at infinity, and is the quotient of two modular forms, hence it is itself a modular form, as desired. Clearly, G, F are mutual inverses, and so F defines an isomorphism.

Theorem 3.4.5. For k odd, for k < 0, and for k = 2, we have $M_k \cong 0$. In the case k = 0, we have $M_0 \cong \mathbb{C}$. For k = 4, 6, 8, 10, $M_k \cong \mathbb{C} \cdot G_k$. And for k ≥ 12, $M_k \cong \mathbb{C} \cdot G_k \oplus \Delta \cdot M_{k-12}$.

Proof. We have already proved that for k odd, k < 0, and k = 2 that $M_k \cong 0$ in the above lemmas. For k = 0, 4, 6, 8, 10, we know $M_k^0 \cong M_{k-12} = 0$, and so we obtain $\dim M_k \le 1$, by 3.4.3. However, we have explicitly produced modular functions of weight k, namely the constants C if k = 0, and $G_k(z)$ otherwise. Therefore, $\dim M_k = 1$, and it is generated by C if k = 0, and $G_k$ otherwise.

It only remains to prove that for k ≥ 12, $M_k \cong \mathbb{C} \cdot G_k \oplus \Delta \cdot M_{k-12}$. By the previous lemma, we obtain that $M_k^0 \cong \Delta \cdot M_{k-12}$. Further, by Lemma 3.4.3 we know that there is at most one linearly independent non-cusp form of any given weight. However, once again, we have produced such a modular form $G_k(z)$ which is not a cusp form because ζ(k) ≠ 0 for k > 0, $k \in 2\mathbb{Z}$. Therefore, $M_k \cong \mathbb{C} \cdot G_k \oplus \Delta \cdot M_{k-12}$.

Corollary 3.4.6. For k > 0,
$$\dim M_k = \begin{cases} \lfloor k/12 \rfloor & \text{if } k \equiv 2 \bmod 12 \\ \lfloor k/12 \rfloor + 1 & \text{if } k \equiv 0, 4, 6, 8, 10 \bmod 12 \\ 0 & \text{otherwise.} \end{cases}$$

Proof. The above theorem tells us this is the case for k < 12, and induction and the last statement of the theorem tells us the dimension counting holds for k ≥ 12. 
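The induction in this proof can be mirrored directly in a few lines. The sketch below, with hypothetical function names `dim_closed_form` and `dim_recursive`, compares the closed formula of Corollary 3.4.6 with the recursion coming from Theorem 3.4.5, namely that the dimension is 1 for k = 0, 4, 6, 8, 10 and increases by one each time the weight increases by 12.

```python
def dim_closed_form(k):
    """Closed formula of Corollary 3.4.6 (also returns 1 for k = 0)."""
    if k < 0 or k % 2 == 1:
        return 0
    if k % 12 == 2:
        return k // 12
    return k // 12 + 1


def dim_recursive(k):
    """Dimension computed from the structure theorem by induction on k."""
    if k < 0 or k % 2 == 1 or k == 2:
        return 0
    if k < 12:
        return 1  # k = 0, 4, 6, 8, 10: spanned by the constants or by G_k
    return 1 + dim_recursive(k - 12)  # M_k = C*G_k + Delta*M_{k-12}


table = {k: dim_closed_form(k) for k in range(0, 40, 2)}
```

For instance the table gives dimension 0 in weight 2, dimension 1 in weights 4 through 10 and in weight 14, and dimension 2 in weight 12.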

We are now able to completely characterize modular forms in terms of $G_4$ and $G_6$.

Corollary 3.4.7. $M \cong \mathbb{C}[G_4(z), G_6(z)]$.

Proof. We clearly have a homomorphism $\varphi : \mathbb{C}[G_4(z), G_6(z)] \to M$, since $G_4(z), G_6(z)$ are modular forms, and products and linear combinations of modular forms are modular forms. We wish to show φ is an isomorphism. First, let us show φ is a surjection. Note that $\Delta \in \mathbb{C}[G_4(z), G_6(z)]$, since it is a linear combination of powers of $G_4, G_6$, so by the theorem, it suffices to show that $G_k \in \mathbb{C}[G_4(z), G_6(z)]$. However, we may note that for all k ≥ 4, $k \in 2\mathbb{Z}$, we can write a form $G_4^a G_6^b$, which is a form of weight 4a + 6b. Clearly, for any even k ≥ 4, we can find nonnegative integers a, b so that 4a + 6b = k, and so this gives us some modular form of weight k. However, it is not a cusp form because $G_4$ and $G_6$ are not cusp forms. Hence, by our structure theorem above, we can write it uniquely as f + g where $g \in M_k^0$, $f \in \mathbb{C} G_k$. Inducting on k, we may assume that $g \in \mathbb{C}[G_4(z), G_6(z)]$. This implies that $f \in \mathbb{C}[G_4(z), G_6(z)]$. However, by our structure theorem, f is a nonzero constant multiple of $G_k(z)$. Therefore, $G_k(z) \in \mathbb{C}[G_4(z), G_6(z)]$, completing the proof of surjectivity. To complete the proof, we just need to show φ is also an injection. Since φ is a map of graded algebras, it suffices to show φ is an injection on the kth graded component. However, by Corollary 3.4.6, we precisely know the dimension of $M_k$, and it is simple to see that the kth graded component of $\mathbb{C}[G_4(z), G_6(z)]$ has the same dimension. Since the two dimensions are the same, and φ is surjective on graded components, it must also be injective on graded components. Therefore, φ is also injective, and hence an isomorphism.
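The dimension count used for injectivity can be made concrete by listing monomials. In this minimal sketch, the function name `monomial_exponents` is hypothetical; it lists the exponent pairs (a, b) with 4a + 6b = k, that is, the monomials $G_4^a G_6^b$ of weight k, and compares their number with the formula of Corollary 3.4.6.

```python
def monomial_exponents(k):
    """All pairs (a, b) of nonnegative integers with 4a + 6b = k."""
    return [
        (a, (k - 4 * a) // 6)
        for a in range(k // 4 + 1)
        if (k - 4 * a) % 6 == 0
    ]


def expected_dimension(k):
    """Dimension of M_k for even k >= 0, following Corollary 3.4.6."""
    return k // 12 + (0 if k % 12 == 2 else 1)


weight_12 = monomial_exponents(12)   # G_6^2 and G_4^3
weight_14 = monomial_exponents(14)   # only G_4^2 * G_6
counts_match = all(
    len(monomial_exponents(k)) == expected_dimension(k)
    for k in range(0, 200, 2)
)
```

In weight 12 there are two monomials, $G_6^2$ and $G_4^3$, matching the two dimensional space spanned by $G_{12}$ and Δ.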

4. Modular Forms of Higher Levels
It seems above we have completely solved the subject of modular forms. However, we have made the stringent requirement that these forms transform properly under all of $SL_2(\mathbb{Z})$. The next step is to slightly loosen the requirements, and only require that the functions transform properly under certain finite index subgroups of $SL_2(\mathbb{Z})$. It turns out that there are many more rich ideas, but first we have to specify the correct subgroups, called congruence subgroups. We will then use the more general modular forms to deduce the number of ways to write a number as a sum of four squares.

4.1. Congruence Subgroups. Definition 4.1.1. The Principal Congruence Subgroup of level N is denoted

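\[ \Gamma(N) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL_2(\mathbb{Z}) : \begin{pmatrix} a & b \\ c & d \end{pmatrix} \equiv \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \bmod N \right\}, \]
where congruence mod $N$ is taken entry-wise.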

Lemma 4.1.2. The index [SL2(Z) : Γ(N)] < ∞.

Proof. Observe Γ(N) is the kernel of the map SL2(Z) → SL2(Z/NZ), where the latter is a finite group. Hence, the kernel Γ(N) has finite index.  Definition 4.1.3. A Congruence Subgroup of level N is a group Γ with Γ(N) ⊂ Γ ⊂ SL2(Z).
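To make the finiteness concrete, the following sketch counts $SL_2(\mathbb{Z}/N\mathbb{Z})$ by brute force for small $N$. It assumes the standard fact, not proved in the text, that the reduction map $SL_2(\mathbb{Z}) \to SL_2(\mathbb{Z}/N\mathbb{Z})$ is surjective, so that this count equals the index $[SL_2(\mathbb{Z}) : \Gamma(N)]$. The closed formula $N^3 \prod_{p \mid N} (1 - p^{-2})$ is likewise quoted as a known result, and both function names are illustrative.

```python
from itertools import product


def order_SL2_mod(N):
    """Count matrices (a b; c d) over Z/NZ with ad - bc = 1 (mod N)."""
    return sum(1 for a, b, c, d in product(range(N), repeat=4)
               if (a * d - b * c) % N == 1 % N)


def order_formula(N):
    """N^3 * prod_{p | N} (1 - 1/p^2), computed in exact integer arithmetic."""
    result, n, p = N ** 3, N, 2
    while n > 1:
        if n % p == 0:
            # p divides N, so p^2 divides N^3 and the division is exact.
            result = result // (p * p) * (p * p - 1)
            while n % p == 0:
                n //= p
        p += 1
    return result


for N in range(1, 7):
    print(N, order_SL2_mod(N), order_formula(N))
```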

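Example 4.1.1. The subgroups of $SL_2(\mathbb{Z})$ defined by
\[ \Gamma_1(N) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL_2(\mathbb{Z}) : \begin{pmatrix} a & b \\ c & d \end{pmatrix} \equiv \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix} \bmod N \right\} \]
\[ \Gamma_0(N) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL_2(\mathbb{Z}) : \begin{pmatrix} a & b \\ c & d \end{pmatrix} \equiv \begin{pmatrix} a & b \\ 0 & d \end{pmatrix} \bmod N \right\} \]
are congruence subgroups of level $N$ with $\Gamma(N) \subset \Gamma_1(N) \subset \Gamma_0(N) \subset SL_2(\mathbb{Z})$.

4.2. A More Refined Definition of Modular Forms. We are now ready to define modular forms with respect to arbitrary congruence subgroups. Essentially, the definition will be the same as before, but the forms need only transform properly with respect to these subgroups.

Definition 4.2.1. For $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL_2(\mathbb{Z})$ and $\tau \in \mathbb{H}$, the factor of automorphy is denoted $j(\gamma, \tau) = c\tau + d$.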

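Definition 4.2.2. A weight $k$ operator is a map $[\gamma]_k : \mathrm{Hom}(\mathbb{H}, \mathbb{C}) \to \mathrm{Hom}(\mathbb{H}, \mathbb{C})$ with $\gamma \in SL_2(\mathbb{Z})$ defined by $(f[\gamma]_k)(\tau) = j(\gamma, \tau)^{-k} f(\gamma(\tau))$. Here we are denoting $[\gamma]_k(f) = f[\gamma]_k$.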

Definition 4.2.3. For Γ a set of matrices in SL2(Z), a meromorphic function f : H → C is invariant of weight k with respect to Γ if f[γ]k = f, ∀γ ∈ Γ.

Definition 4.2.4. For $\Gamma$ a set of matrices in $SL_2(\mathbb{Z})$, a meromorphic function $f : \mathbb{H} \to \mathbb{C}$ is weakly modular of weight $k$ with respect to $\Gamma$ if $f$ is invariant of weight $k$ with respect to $\Gamma$ and $f$ is holomorphic.

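Lemma 4.2.5. For $\gamma, \gamma' \in SL_2(\mathbb{R})$, the following statements hold:
(1) $j(\gamma\gamma', \tau) = j(\gamma, \gamma'(\tau))\, j(\gamma', \tau)$
(2) $(\gamma\gamma')(\tau) = \gamma(\gamma'(\tau))$
(3) $[\gamma\gamma']_k = [\gamma]_k [\gamma']_k$
(4) $\operatorname{Im}(\gamma(\tau)) = \dfrac{\operatorname{Im}(\tau)}{|j(\gamma, \tau)|^2}$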

Proof. We have already seen the fourth and second statements above. The other two can be seen by direct computation. That is, simply write out matrices for $\gamma, \gamma'$, write out their corresponding factors of automorphy, multiply them, and verify the equalities.
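The direct computation can also be spot-checked numerically. The sketch below represents a matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ as a tuple `(a, b, c, d)` and evaluates both sides of each statement at a sample point; the helper names, the test function `f`, and the sample values are all illustrative choices rather than anything taken from the text.

```python
def act(g, tau):
    """Fractional linear action of g = (a, b, c, d) on tau."""
    a, b, c, d = g
    return (a * tau + b) / (c * tau + d)


def j(g, tau):
    """Factor of automorphy j(g, tau) = c * tau + d."""
    a, b, c, d = g
    return c * tau + d


def mul(g, h):
    """Matrix product of two 2x2 matrices stored as 4-tuples."""
    a, b, c, d = g
    e, f, p, q = h
    return (a * e + b * p, a * f + b * q, c * e + d * p, c * f + d * q)


def slash(f, g, k):
    """Weight k operator: (f[g]_k)(tau) = j(g, tau)^(-k) * f(g(tau))."""
    return lambda tau: j(g, tau) ** (-k) * f(act(g, tau))


g, h, tau = (2, 1, 1, 1), (1, -3, 0, 1), 0.3 + 0.8j
f = lambda t: t ** 2 + 1j  # arbitrary test function on the upper half plane

# Each printed value is |left side - right side| for statements (1)-(4).
print(abs(j(mul(g, h), tau) - j(g, act(h, tau)) * j(h, tau)))
print(abs(act(mul(g, h), tau) - act(g, act(h, tau))))
print(abs(slash(f, mul(g, h), 2)(tau) - slash(slash(f, g, 2), h, 2)(tau)))
print(abs(act(g, tau).imag - tau.imag / abs(j(g, tau)) ** 2))
```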

Corollary 4.2.6. If a function $f$ is weakly modular with respect to a set of matrices $\Gamma$, then it is weakly modular with respect to the group generated by $\Gamma$.

Lemma 4.2.7. For all congruence subgroups $\Gamma \subset SL_2(\mathbb{Z})$ there exists a nonzero $h \in \mathbb{Z}$ so that the matrix $m_h = \begin{pmatrix} 1 & h \\ 0 & 1 \end{pmatrix} \in \Gamma$.

Proof. Every congruence subgroup by definition contains some principal congruence subgroup $\Gamma(N)$, which contains the matrix $\begin{pmatrix} 1 & N \\ 0 & 1 \end{pmatrix}$.

Notation 4.2.1. Throughout this section we shall simply use h = h(Γ) to be the minimal positive h from the previous lemma.

Proposition 4.2.8. For $f$ a weakly modular function of weight $k$ with respect to $\Gamma$, $f$ is $h\mathbb{Z}$-periodic and there exists a function $g : D - \{0\} \to \mathbb{C}$ so that $f(\tau) = g(e^{2\pi i \tau / h})$.

Proof. The proof is the same as 2.2.3. 

Definition 4.2.9. A weakly modular function $f$ is holomorphic at $\infty$ if $g$ has a removable singularity at $q = 0$, or equivalently $f$ has a Fourier expansion of the form $f(\tau) = \sum_{n=0}^{\infty} a_n e^{2\pi i n \tau / h}$.

Definition 4.2.10. Let $\Gamma \subset SL_2(\mathbb{Z})$ be a congruence subgroup. A function $f : \mathbb{H} \to \mathbb{C}$ is a modular form of weight $k$ with respect to $\Gamma$ if $f$ is holomorphic, $f$ is weakly modular of weight $k$ with respect to $\Gamma$, and $f[\alpha]_k$ is holomorphic at $\infty$ for all $\alpha \in SL_2(\mathbb{Z})$.

Definition 4.2.11. With $\Gamma, f$ as above, we say $f$ is a cusp form of weight $k$ with respect to $\Gamma$ if $f$ is a modular form such that $a_0 = 0$ in the Fourier expansion of $f[\alpha]_k$ for all $\alpha \in SL_2(\mathbb{Z})$.

Notation 4.2.2. We shall denote the space of modular forms of weight $k$ with respect to $\Gamma$ by $M_k(\Gamma)$ and the space of cusp forms by $S_k(\Gamma)$. Then, the space of all modular forms is $M(\Gamma) = \bigoplus_k M_k(\Gamma)$ and the space of cusp forms is $S(\Gamma) = \bigoplus_k S_k(\Gamma)$.

5. The space $M_2(\Gamma_0(4))$
In this section we describe the space of modular forms of weight 2 on the congruence subgroup $\Gamma_0(4)$. This section will be devoted to showing that two explicit independent forms exist. However, it is a somewhat deeper result, involving theory, that these span the full space of modular forms, and so we shall omit the proof. The reason for studying this space is that one of its elements turns out to count the number of ways to write a positive integer as a sum of four squares, as shall be discussed in the following sections.

5.1. Computing the Fourier Series of $G_2(\tau)$. We shall first develop some standard trigonometric identities in order to go between the conditionally convergent Eisenstein series $G_2(\tau)$ and its Fourier series.

Definition 5.1.1. The divisor function is denoted $\sigma(n) = \sum_{d \mid n,\, d > 0} d$.

Lemma 5.1.2. We have the following identities for $\pi \cot \pi z$:
(1) $\pi \cot \pi z = \dfrac{1}{z} + \sum_{d=1}^{\infty} \left( \dfrac{1}{z - d} + \dfrac{1}{z + d} \right)$ for $z \notin \mathbb{Z}$,
(2) $\pi \cot \pi z = \pi i - 2\pi i \sum_{m=0}^{\infty} e^{2\pi i z m}$ for $z \in \mathbb{H}$,
where the first sum converges absolutely with its terms paired as shown, and the second sum is a geometric series in $e^{2\pi i z}$.

Proof. (Sketch) The first identity follows from a standard argument, so I will only give a sketch of the proof. It is easy to see that $\pi \cot \pi z = \pi \frac{\cos \pi z}{\sin \pi z}$ has poles exactly where its denominator has zeros. That is, its poles are exactly at the integer points, and we can see that its residues are all 1. The residues are easy to compute because by periodicity they are all the same, and at the origin the residue can be calculated as $\lim_{z \to 0} z \cdot \pi \frac{\cos \pi z}{\sin \pi z} = 1$. This tells us that the principal parts at the poles of both sides of identity (1) agree. Then, consider the function
\[ f(z) = \pi \cot \pi z - \left( \frac{1}{z} + \sum_{d=1}^{\infty} \left( \frac{1}{z - d} + \frac{1}{z + d} \right) \right). \]
We wish to show it is 0. We would like to show it is bounded on the strip $0 \leq \operatorname{Re}(z) \leq 1$, because it will then follow by periodicity that it is bounded everywhere. This boundedness in the limit $\operatorname{Im}(z) \to +\infty$ can be established by writing
\[ \pi \cot \pi z = \pi \frac{\cos \pi z}{\sin \pi z} = i\pi \frac{e^{-i\pi z} + e^{i\pi z}}{e^{i\pi z} - e^{-i\pi z}}, \]
and noting that both the numerator and the denominator are dominated by the term $e^{-i\pi z}$ as $z \to i\infty$. Therefore, both $\pi \cot \pi z$ and $f(z) - \pi \cot \pi z$ are bounded on the strip $0 \leq \operatorname{Re}(z) \leq 1$, and so their sum $f(z)$ is bounded. This tells us $f(z)$ is constant by Liouville's Theorem, and checking the value of $f(z)$ at any single point tells us it is 0. The easiest point to check is the limit as $z \to i\infty$.

The second identity is much more straightforward. As before we write
\[ \pi \cot \pi z = \pi \frac{\cos \pi z}{\sin \pi z} = i\pi \frac{e^{-i\pi z} + e^{i\pi z}}{e^{i\pi z} - e^{-i\pi z}} = -i\pi \frac{e^{2\pi i z} + 1}{1 - e^{2\pi i z}} = \pi i - 2\pi i \sum_{m=0}^{\infty} e^{2\pi i z m}, \]
by expanding this as a geometric series.

Lemma 5.1.3. Let
\[ \mathbb{Z}'_c = \begin{cases} \mathbb{Z} & \text{if } c \neq 0, \\ \mathbb{Z} - \{0\} & \text{if } c = 0. \end{cases} \]
Then, for all $\tau \in \mathbb{H}$, the function
\[ G_2(\tau) = \sum_{c \in \mathbb{Z}} \sum_{d \in \mathbb{Z}'_c} \frac{1}{(c\tau + d)^2} \]
converges conditionally and satisfies
\[ G_2(\tau) = 2\zeta(2) - 8\pi^2 \sum_{n=1}^{\infty} \sigma(n) e^{2\pi i \tau n}. \]

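Proof. Using Lemma 5.1.2 we have
\[ \frac{1}{z} + \sum_{d=1}^{\infty} \left( \frac{1}{z - d} + \frac{1}{z + d} \right) = \pi i - 2\pi i \sum_{m=0}^{\infty} e^{2\pi i z m}, \]
and so taking the derivative of both sides (as is justified by uniform convergence) and multiplying by $-1$ yields the identity
\[ \sum_{d=-\infty}^{\infty} \frac{1}{(z + d)^2} = -4\pi^2 \sum_{m=0}^{\infty} m e^{2\pi i m z}, \]
where the doubly infinite sum is interpreted as summing in the order of increasing $|d|$. Therefore,
\[ \begin{aligned} \sum_{c \in \mathbb{Z}} \sum_{d \in \mathbb{Z}'_c} \frac{1}{(c\tau + d)^2} &= \sum_{d \in \mathbb{Z}'_0} \frac{1}{(0\tau + d)^2} + \sum_{c < 0} \sum_{d \in \mathbb{Z}'_c} \frac{1}{(c\tau + d)^2} + \sum_{c > 0} \sum_{d \in \mathbb{Z}'_c} \frac{1}{(c\tau + d)^2} \\ &= \sum_{d \neq 0} \frac{1}{d^2} + 2 \sum_{c > 0} \sum_{d \in \mathbb{Z}} \frac{1}{(c\tau + d)^2} \\ &= 2 \cdot \frac{\pi^2}{6} - 2 \sum_{c > 0} 4\pi^2 \sum_{m=0}^{\infty} m e^{2\pi i m c \tau} \\ &= \frac{\pi^2}{3} - 8\pi^2 \sum_{c=1}^{\infty} \sum_{m=0}^{\infty} m e^{2\pi i m c \tau}. \end{aligned} \]
Now, note that $\sum_{c=1}^{\infty} \sum_{m=0}^{\infty} m e^{2\pi i m c \tau}$ is absolutely convergent. Indeed, write $x = |e^{2\pi i \tau}| = e^{-2\pi \operatorname{Im}(\tau)}$, so that $x < 1$ because $\tau \in \mathbb{H}$. For a fixed $c$ we have $\sum_{m=0}^{\infty} m |e^{2\pi i m c \tau}| = \frac{x^c}{(1 - x^c)^2}$, which is at most $2x^c$ for all sufficiently large $c$. Hence the tail of $\sum_c \sum_{m} m |e^{2\pi i m c \tau}|$ is bounded above by the geometric series $2 \sum_c x^c$, which surely converges. This tells us that after carrying out the inner summations in the initial order the remaining double sum is absolutely convergent, although this does not mean our original sum was absolutely convergent. However, since this sum is absolutely convergent, we can rearrange its terms to obtain
\[ \sum_{c=1}^{\infty} \sum_{m=0}^{\infty} m e^{2\pi i m c \tau} = \sum_{n=1}^{\infty} \sum_{d > 0,\, d \mid n} d\, e^{2\pi i \tau n}. \]
Therefore, this tells us
\[ \sum_{c \in \mathbb{Z}} \sum_{d \in \mathbb{Z}'_c} \frac{1}{(c\tau + d)^2} = \frac{\pi^2}{3} - 8\pi^2 \sum_{n=1}^{\infty} \sum_{d > 0,\, d \mid n} d\, e^{2\pi i \tau n}, \]
and the sum is conditionally convergent.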
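Two steps of this proof lend themselves to a numerical sanity check: the differentiated cotangent identity $\sum_d (z+d)^{-2} = -4\pi^2 \sum_m m e^{2\pi i m z}$, including its sign, and the rearrangement of the double sum into a series with coefficients $\sigma(n)$. The sketch below truncates every sum, so agreement is only expected up to the truncation error noted in the comments. All function names, truncation bounds and the sample point are illustrative assumptions.

```python
import cmath
import math


def sigma(n):
    """Sum of the positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)


def lattice_row_sum(z, D=100000):
    """Symmetric partial sum of 1/(z + d)^2 over |d| <= D (tail is about 2/D)."""
    return sum(1 / (z + d) ** 2 for d in range(-D, D + 1))


def lipschitz_series(z, M=40):
    """-4 pi^2 * sum_{m=1}^{M} m e^{2 pi i m z}, using a running power of q."""
    q = cmath.exp(2j * math.pi * z)
    total, qm = 0j, 1 + 0j
    for m in range(1, M + 1):
        qm *= q
        total += m * qm
    return -4 * math.pi ** 2 * total


def lambert_double_sum(tau, C=40, M=40):
    """Truncated double sum of m * q^(m c) over 1 <= c <= C, 1 <= m <= M."""
    q = cmath.exp(2j * math.pi * tau)
    total, qc = 0j, 1 + 0j
    for c in range(1, C + 1):
        qc *= q                  # qc = q^c
        term = 1 + 0j
        for m in range(1, M + 1):
            term *= qc           # term = q^(c m)
            total += m * term
    return total


def divisor_series(tau, N=40):
    """Truncated series sum_{n=1}^{N} sigma(n) q^n."""
    q = cmath.exp(2j * math.pi * tau)
    total, qn = 0j, 1 + 0j
    for n in range(1, N + 1):
        qn *= q
        total += sigma(n) * qn
    return total


z = 0.3 + 0.5j
print(abs(lattice_row_sum(z) - lipschitz_series(z)))   # limited by the 2/D tail
print(abs(lambert_double_sum(z) - divisor_series(z)))  # essentially rounding error
```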

5.2. The almost-invariance of $G_2$. The next step on the way to producing elements of $M_2(\Gamma_0(4))$ is to create a function which is invariant. It turns out $G_2$ isn't quite invariant, and we shall see how invariance fails in this subsection. Not to worry though, with a few tricks, actual invariant functions can be found.

Lemma 5.2.1. $\left( G_2 \left[ \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \right]_2 \right)(\tau) = G_2(\tau)$.

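Proof. Using the previous Lemma 5.1.3, we have the explicit Fourier expansion
\[ G_2(\tau) = \frac{\pi^2}{3} - 8\pi^2 \sum_{n=1}^{\infty} \sum_{d > 0,\, d \mid n} d\, e^{2\pi i \tau n}, \]
and so
\[ G_2(\tau + 1) = \frac{\pi^2}{3} - 8\pi^2 \sum_{n=1}^{\infty} \sum_{d > 0,\, d \mid n} d\, e^{2\pi i (\tau + 1) n} = \frac{\pi^2}{3} - 8\pi^2 \sum_{n=1}^{\infty} \sum_{d > 0,\, d \mid n} d\, e^{2\pi i \tau n} = G_2(\tau). \]

Lemma 5.2.2. $\left( G_2 \left[ \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \right]_2 \right)(\tau) = G_2(\tau) - \dfrac{2\pi i}{\tau}$.

Proof. First, note that
\[ \begin{aligned} \left( G_2 \left[ \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \right]_2 \right)(\tau) &= \tau^{-2} G_2\left( \frac{-1}{\tau} \right) \\ &= \tau^{-2} \sum_{c \in \mathbb{Z}} \sum_{d \in \mathbb{Z}'_c} \frac{1}{\left( c \cdot \frac{-1}{\tau} + d \right)^2} \\ &= \sum_{c \in \mathbb{Z}} \sum_{d \in \mathbb{Z}'_c} \frac{1}{(-c + d\tau)^2} \\ &= \sum_{d \in \mathbb{Z}} \sum_{c \in \mathbb{Z}'_d} \frac{1}{(c\tau - d)^2} \\ &= \sum_{d \in \mathbb{Z}} \sum_{c \in \mathbb{Z}'_d} \frac{1}{(c\tau + d)^2} \\ &= \sum_{d \neq 0} \frac{1}{d^2} + \sum_{d \in \mathbb{Z}} \sum_{c \neq 0} \frac{1}{(c\tau + d)^2} \\ &= 2 \cdot \frac{\pi^2}{6} + \sum_{d \in \mathbb{Z}} \sum_{c \neq 0} \frac{1}{(c\tau + d)^2}. \end{aligned} \]
Next, observe that for each fixed $c \neq 0$ we have a telescoping series
\[ \sum_{d \in \mathbb{Z}} \frac{1}{(c\tau + d)(c\tau + d + 1)} = \sum_{d \in \mathbb{Z}} \left( \frac{1}{c\tau + d} - \frac{1}{c\tau + d + 1} \right) = 0. \]
Hence, we can subtract this from $G_2$ to obtain
\[ \begin{aligned} G_2(\tau) &= \frac{\pi^2}{3} + \sum_{c \neq 0} \sum_{d \in \mathbb{Z}} \frac{1}{(c\tau + d)^2} + 0 \\ &= \frac{\pi^2}{3} + \sum_{c \neq 0} \sum_{d \in \mathbb{Z}} \frac{1}{(c\tau + d)^2} - \sum_{c \neq 0} \sum_{d \in \mathbb{Z}} \frac{1}{(c\tau + d)(c\tau + d + 1)} \\ &= \frac{\pi^2}{3} + \sum_{c \neq 0} \sum_{d \in \mathbb{Z}} \left( \frac{1}{(c\tau + d)^2} - \frac{1}{(c\tau + d)(c\tau + d + 1)} \right) \\ &= \frac{\pi^2}{3} + \sum_{c \neq 0} \sum_{d \in \mathbb{Z}} \frac{1}{(c\tau + d)^2 (c\tau + d + 1)}. \end{aligned} \]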

Now, we may observe that the terms of this double sum are on the order of $\frac{1}{(c\tau + d)^3}$, and therefore by the same argument as in 3.1.5 it converges absolutely. We may therefore rearrange terms in the series, to obtain

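\[ \begin{aligned} G_2(\tau) &= \frac{\pi^2}{3} + \sum_{d \in \mathbb{Z}} \sum_{c \neq 0} \frac{1}{(c\tau + d)^2 (c\tau + d + 1)} \\ &= \frac{\pi^2}{3} + \sum_{d \in \mathbb{Z}} \sum_{c \neq 0} \left( \frac{1}{(c\tau + d)^2} - \frac{1}{(c\tau + d)(c\tau + d + 1)} \right) \\ &= \frac{\pi^2}{3} + \sum_{d \in \mathbb{Z}} \sum_{c \neq 0} \frac{1}{(c\tau + d)^2} - \sum_{d \in \mathbb{Z}} \sum_{c \neq 0} \frac{1}{(c\tau + d)(c\tau + d + 1)} \\ &= \tau^{-2} G_2\left( \frac{-1}{\tau} \right) - \sum_{d \in \mathbb{Z}} \sum_{c \neq 0} \frac{1}{(c\tau + d)(c\tau + d + 1)}. \end{aligned} \]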

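Hence, to prove the claim, it suffices to show
\[ -\sum_{d \in \mathbb{Z}} \sum_{c \neq 0} \frac{1}{(c\tau + d)(c\tau + d + 1)} = \frac{2\pi i}{\tau}. \]
Or equivalently,
\[ -\lim_{N \to \infty} \sum_{d=-N}^{N-1} \sum_{c \neq 0} \frac{1}{(c\tau + d)(c\tau + d + 1)} = \frac{2\pi i}{\tau}. \]
Note that for $N$ fixed, this sum converges absolutely. So, reversing the orders of the summations gives a telescoping sum, which works out to
\[ \begin{aligned} -\sum_{d=-N}^{N-1} \sum_{c \neq 0} \frac{1}{(c\tau + d)(c\tau + d + 1)} &= -\sum_{c \neq 0} \sum_{d=-N}^{N-1} \frac{1}{(c\tau + d)(c\tau + d + 1)} \\ &= -\sum_{c \neq 0} \sum_{d=-N}^{N-1} \left( \frac{1}{c\tau + d} - \frac{1}{c\tau + d + 1} \right) \\ &= -\sum_{c \neq 0} \frac{1}{c\tau - N} + \sum_{c \neq 0} \frac{1}{c\tau + N}. \end{aligned} \]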

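The next step is to employ our identities for cotangent from 5.1.2. Recall
\[ \pi \cot \pi z = \frac{1}{z} + \sum_{d=1}^{\infty} \left( \frac{1}{z - d} + \frac{1}{z + d} \right). \]
Therefore, taking $z = N/\tau$,
\[ -\sum_{c \neq 0} \frac{1}{c\tau - N} + \sum_{c \neq 0} \frac{1}{c\tau + N} = \frac{1}{\tau} \sum_{c \neq 0} \left( \frac{1}{\frac{N}{\tau} - c} + \frac{1}{\frac{N}{\tau} + c} \right) = \frac{2\pi}{\tau} \cot\left( \frac{\pi N}{\tau} \right) - \frac{2}{N}. \]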

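Finally,
\[ \begin{aligned} -\lim_{N \to \infty} \sum_{d=-N}^{N-1} \sum_{c \neq 0} \frac{1}{(c\tau + d)(c\tau + d + 1)} &= \lim_{N \to \infty} \left( \frac{2\pi}{\tau} \cot\left( \frac{\pi N}{\tau} \right) - \frac{2}{N} \right) \\ &= \lim_{N \to \infty} \frac{2\pi}{\tau} \cot\left( \frac{\pi N}{\tau} \right) \\ &= \frac{2\pi}{\tau} \lim_{N \to \infty} i\, \frac{e^{2\pi i N / \tau} + 1}{e^{2\pi i N / \tau} - 1} \\ &= \frac{2\pi i}{\tau}, \end{aligned} \]
where the last limit uses that $\operatorname{Im}(1/\tau) < 0$, so $|e^{2\pi i N/\tau}| \to \infty$. This is what we wanted to show.

Lemma 5.2.3. $(G_2[\gamma]_2)(\tau) = G_2(\tau) - \dfrac{2\pi i c}{c\tau + d}$ with $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL_2(\mathbb{Z})$.

Proof. By 2.3.3 we have that $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ generate the action of $SL_2(\mathbb{Z})$ by linear fractional transformations. We have just seen above that this lemma holds for the special case of the two generators. Therefore, to prove this lemma, it suffices to show that if this lemma is satisfied for $\gamma, \eta$, it is also satisfied for $\gamma^{-1}$ and $\gamma \cdot \eta$.

First, let us show that it holds for $\gamma \cdot \eta$. Write $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, $\eta = \begin{pmatrix} e & f \\ g & h \end{pmatrix}$, and write $f = G_2$ for brevity. Using the general fact proved in 4.2.5 that $f[\gamma]_2[\eta]_2 = f[\gamma \cdot \eta]_2$, we would like to show that $f[\gamma \cdot \eta]_2(\tau) = f(\tau) - \frac{2\pi i (ce + dg)}{(ce + dg)\tau + cf + dh}$. Indeed, since $f[\gamma]_2[\eta]_2 = f[\gamma \cdot \eta]_2$, we have
\[ \begin{aligned} f[\gamma]_2[\eta]_2(\tau) &= \left( f(\tau) - \frac{2\pi i c}{c\tau + d} \right)[\eta]_2 \\ &= f(\tau) - \frac{2\pi i g}{g\tau + h} - (g\tau + h)^{-2} \frac{2\pi i c}{c \cdot \frac{e\tau + f}{g\tau + h} + d} \\ &= f(\tau) - \frac{2\pi i c + 2\pi i g (c(e\tau + f) + d(g\tau + h))}{(g\tau + h)(c(e\tau + f) + d(g\tau + h))}. \end{aligned} \]
Therefore, to show multiplicativity holds, we need to show
\[ \frac{2\pi i c + 2\pi i g (c(e\tau + f) + d(g\tau + h))}{(g\tau + h)(c(e\tau + f) + d(g\tau + h))} = \frac{2\pi i (ce + dg)}{(ce + dg)\tau + cf + dh}, \]
or equivalently, since $c(e\tau + f) + d(g\tau + h) = (ce + dg)\tau + cf + dh$,
\[ \frac{2\pi i c + 2\pi i g (c(e\tau + f) + d(g\tau + h))}{g\tau + h} = 2\pi i (ce + dg). \]
Clearing the denominator, we need to show only
\[ 2\pi i c + 2\pi i g (c(e\tau + f) + d(g\tau + h)) = 2\pi i (ce + dg)(g\tau + h). \]
After expanding and cancelling matching terms, we obtain that it suffices to show $2\pi i (c + gcf) = 2\pi i ceh$, which holds because $\eta \in SL_2(\mathbb{Z})$ and so $1 + gf = eh$, as its determinant is 1.

To complete the proof of the lemma, it suffices to show this law is closed under inversion. To do this, it suffices to check that the inverses of the generators $S, T$ both satisfy this law, because any element of $SL_2(\mathbb{Z})$ can be written as a product of $S$, $T$, and their inverses. This is easy to see, since $S$ is actually self inverse as a linear fractional transformation and the same argument as given in Lemma 5.2.1 shows $T^{-1}$ satisfies this law.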
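The transformation law of Lemma 5.2.3 can be tested numerically from the Fourier series of Lemma 5.1.3. The sketch below evaluates $G_2$ through the truncated expansion $\frac{\pi^2}{3} - 8\pi^2 \sum \sigma(n) e^{2\pi i \tau n}$ and compares both sides of the lemma for a few matrices, again stored as tuples `(a, b, c, d)`. The names `G2`, `slash2`, `defect`, the truncation at 200 terms, and the sample point are assumptions of this illustration.

```python
import cmath
import math


def sigma(n):
    """Sum of the positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)


def G2(tau, terms=200):
    """Truncated series pi^2/3 - 8 pi^2 sum sigma(n) q^n with q = e^{2 pi i tau}."""
    q = cmath.exp(2j * math.pi * tau)
    total, qn = 0j, 1 + 0j
    for n in range(1, terms + 1):
        qn *= q
        total += sigma(n) * qn
    return math.pi ** 2 / 3 - 8 * math.pi ** 2 * total


def slash2(f, g, tau):
    """Weight 2 operator: (c tau + d)^(-2) * f(g(tau)) for g = (a, b, c, d)."""
    a, b, c, d = g
    return (c * tau + d) ** (-2) * f((a * tau + b) / (c * tau + d))


def defect(g, tau):
    """Left side minus right side of Lemma 5.2.3; should be close to 0."""
    a, b, c, d = g
    return slash2(G2, g, tau) - (G2(tau) - 2j * math.pi * c / (c * tau + d))


tau = 0.3 + 0.8j
for g in [(1, 1, 0, 1), (0, -1, 1, 0), (2, 1, 1, 1), (3, -1, 4, -1)]:
    print(g, abs(defect(g, tau)))
```

As a further check, applying the lemma to $S$ at $\tau = i$ gives $-G_2(i) = G_2(i) - 2\pi$, so the series should evaluate to $G_2(i) = \pi$.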

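5.3. An Almost Invariant $SL_2(\mathbb{Z})$ function from $G_2$.

Lemma 5.3.1. For $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL_2(\mathbb{Z})$, the function $G_2(\tau) - \dfrac{\pi}{\operatorname{Im}(\tau)}$ satisfies
\[ \left( G_2(\tau) - \frac{\pi}{\operatorname{Im}(\tau)} \right)[\gamma]_2 = G_2(\tau) - \frac{\pi}{\operatorname{Im}(\tau)}. \]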

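Proof. We showed in the previous Lemma 5.2.3 that $(G_2[\gamma]_2)(\tau) = G_2(\tau) - \frac{2\pi i c}{c\tau + d}$. So, to complete this proof, it suffices to show $\frac{\pi}{\operatorname{Im}(\tau)}[\gamma]_2 = \frac{\pi}{\operatorname{Im}(\tau)} - \frac{2\pi i c}{c\tau + d}$. To see this, observe
\[ \begin{aligned} (c\tau + d)^{-2} \frac{\pi}{\operatorname{Im}\left( \frac{a\tau + b}{c\tau + d} \right)} &= (c\tau + d)^{-2} \frac{\pi}{\operatorname{Im}\left( \frac{(a\tau + b)(c\bar\tau + d)}{(c\tau + d)(c\bar\tau + d)} \right)} \\ &= (c\tau + d)^{-2} \frac{\pi}{\frac{\operatorname{Im}(a\tau d + bc\bar\tau)}{(c\tau + d)(c\bar\tau + d)}} \\ &= (c\tau + d)^{-2} \frac{\pi}{\frac{\operatorname{Im}(\tau)(ad - bc)}{(c\tau + d)(c\bar\tau + d)}} \\ &= (c\tau + d)^{-2} \frac{\pi}{\frac{\operatorname{Im}(\tau)}{(c\tau + d)(c\bar\tau + d)}} \\ &= \frac{\pi (c\bar\tau + d)}{(c\tau + d) \operatorname{Im}(\tau)}. \end{aligned} \]
So, it suffices to show that
\[ \frac{\pi (c\bar\tau + d)}{(c\tau + d) \operatorname{Im}(\tau)} = \frac{\pi}{\operatorname{Im}(\tau)} - \frac{2\pi i c}{c\tau + d}. \]
Note that
\[ \frac{\pi}{\operatorname{Im}(\tau)} - \frac{2\pi i c}{c\tau + d} = \frac{(c\tau + d)\pi - 2\pi i c \operatorname{Im}(\tau)}{\operatorname{Im}(\tau)(c\tau + d)}. \]
So, we need to show
\[ \frac{\pi (c\bar\tau + d)}{(c\tau + d) \operatorname{Im}(\tau)} = \frac{(c\tau + d)\pi - 2\pi i c \operatorname{Im}(\tau)}{\operatorname{Im}(\tau)(c\tau + d)}, \]
or equivalently, $\pi (c\bar\tau + d) = (c\tau + d)\pi - 2\pi i c \operatorname{Im}(\tau)$. Subtracting $(c\tau + d)\pi$ from both sides, we obtain that it suffices to show $\pi c (\bar\tau - \tau) = -2\pi i c \operatorname{Im}(\tau)$, which is of course true because $\tau - \bar\tau = 2i \operatorname{Im}(\tau)$, completing the proof.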
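As a numerical illustration of Lemma 5.3.1, the sketch below forms $G_2(\tau) - \pi/\operatorname{Im}(\tau)$ from the truncated Fourier series of Lemma 5.1.3 and checks weight 2 invariance at a sample point. The name `G2_star`, the truncation, and the chosen matrices are assumptions made for the example.

```python
import cmath
import math


def sigma(n):
    """Sum of the positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)


def G2(tau, terms=200):
    """Truncated series pi^2/3 - 8 pi^2 sum sigma(n) q^n with q = e^{2 pi i tau}."""
    q = cmath.exp(2j * math.pi * tau)
    total, qn = 0j, 1 + 0j
    for n in range(1, terms + 1):
        qn *= q
        total += sigma(n) * qn
    return math.pi ** 2 / 3 - 8 * math.pi ** 2 * total


def G2_star(tau):
    """The non-holomorphic correction G_2(tau) - pi / Im(tau)."""
    return G2(tau) - math.pi / tau.imag


def slash2(f, g, tau):
    """Weight 2 operator: (c tau + d)^(-2) * f(g(tau)) for g = (a, b, c, d)."""
    a, b, c, d = g
    return (c * tau + d) ** (-2) * f((a * tau + b) / (c * tau + d))


tau = 0.3 + 0.8j
for g in [(0, -1, 1, 0), (2, 1, 1, 1), (3, -1, 4, -1)]:
    # Each printed value is |(G2*[g]_2)(tau) - G2*(tau)|, expected near 0.
    print(g, abs(slash2(G2_star, g, tau) - G2_star(tau)))
```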

5.4. A Bounding Theorem. We have above produced an $SL_2(\mathbb{Z})$ invariant function, but it is not holomorphic. We will later show that the functions $G_{2,N}(\tau) = G_2(\tau) - N G_2(N\tau)$ satisfy $G_{2,N} \in M_2(\Gamma_0(N))$. They are holomorphic on the upper half plane, which will deal with the issue of holomorphicity away from $\infty$. To deal with the problem of holomorphicity at $\infty$, we will first need a crucial theorem, stating that if we can asymptotically bound the Fourier coefficients of a function by a polynomial, then it is holomorphic at infinity.

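Lemma 5.4.1. Let $f : \mathbb{H} \to \mathbb{C}$ be holomorphic on $\mathbb{H}$, satisfying $f(z + 1) = f(z)$. If there exist positive constants $C, r$ such that the Fourier expansion of $f$ satisfies $f(\tau) = \sum_{n=0}^{\infty} a_n e^{2\pi i n \tau / N}$, with $|a_n| < C n^r$ for all $n \geq 1$, then, writing $y = \operatorname{Im}(\tau)$,
\[ |f(\tau)| \leq C_0 + C \left( \int_{t=0}^{\infty} t^r e^{-2\pi t y / N}\, dt + \frac{C}{y^r} \right). \]

Proof. Write $\tau = x + iy$. Note that
\[ |f(\tau)| = \left| \sum_{n=0}^{\infty} a_n e^{2\pi i n \tau / N} \right| \leq \sum_{n=0}^{\infty} |a_n| |e^{2\pi i n \tau / N}| < |a_0| + \sum_{n=1}^{\infty} C n^r e^{-2\pi n y / N}. \]
Defining the function $g(t) = t^r e^{-2\pi t y / N}$, we may observe that
\[ g'(t) > 0 \text{ for } t \in \left( 0, \frac{rN}{2\pi y} \right), \qquad g'(t) < 0 \text{ for } t > \frac{rN}{2\pi y}. \]
We can see this because $g'(t) = t^{r-1} e^{-2\pi t y / N} \left( r - t \cdot \frac{2\pi y}{N} \right)$. Let $k = \lfloor \frac{rN}{2\pi y} \rfloor$. Since $g(t)$ is increasing on $(0, k)$ it follows that
\[ \sum_{n=1}^{k-1} n^r e^{-2\pi n y / N} = \sum_{n=1}^{k-1} g(n) \leq \int_{t=0}^{k} g(t)\, dt, \]
and additionally since $g(t)$ is decreasing on $(k + 1, \infty)$, it follows that
\[ \sum_{n=k+2}^{\infty} n^r e^{-2\pi n y / N} = \sum_{n=k+2}^{\infty} g(n) \leq \int_{t=k+1}^{\infty} g(t)\, dt. \]
Combining these two facts yields that
\[ \begin{aligned} \sum_{n=1}^{\infty} n^r e^{-2\pi n y / N} &\leq g(k) + g(k+1) + \sum_{n=1}^{k-1} g(n) + \sum_{n=k+2}^{\infty} g(n) \\ &\leq g(k) + g(k+1) + \int_{t=0}^{k} g(t)\, dt + \int_{t=k+1}^{\infty} g(t)\, dt \\ &\leq g(k) + g(k+1) + \int_{t=0}^{\infty} g(t)\, dt. \end{aligned} \]
Both boundary terms are at most the maximum value of $g$, which is attained at $t = \frac{rN}{2\pi y}$ and equals $\left( \frac{rN}{2\pi y} \right)^r e^{-r} = O(y^{-r})$; here the factor $e^{-r}$ is a constant. So, enlarging $C$ if necessary and choosing $C_0$ large enough, it follows that
\[ |f(\tau)| \leq |a_0| + C \left( \int_{t=0}^{\infty} g(t)\, dt + O(y^{-r}) \right) < C_0 + C \left( \int_{t=0}^{\infty} t^r e^{-2\pi t y / N}\, dt + \frac{C}{y^r} \right). \]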

Theorem 5.4.2. Let $\Gamma$ be a congruence subgroup of level $N$ and $f : \mathbb{H} \to \mathbb{C}$ such that $f$ is holomorphic on $\mathbb{H}$, is meromorphic at $\infty$, and $f$ satisfies $f[\gamma]_k = f$ for all $\gamma \in \Gamma$. If there exist positive constants $C, r$ such that the Fourier expansion of $f$ satisfies $f(\tau) = \sum_{n=0}^{\infty} a_n e^{2\pi i n \tau / N}$, with $|a_n| < C n^r$ for all $n \geq 1$, then $f \in M_k(\Gamma)$.

Proof. To prove this, by the definition of modular form 4.2.10, we only have to show that f[α]k is holomorphic at infinity for all α ∈ SL2(Z), since we are assuming it is holomorphic elsewhere and translation invariant. We are in the situation of 5.4.1 and so we may apply the result that

\[ |f(\tau)| \leq C_0 + C \left( \int_{t=0}^{\infty} t^r e^{-2\pi t y / N}\, dt + \frac{C}{y^r} \right). \]

Next, since $f$ is holomorphic, and fractional linear transformations are holomorphic, it follows that for $\alpha = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL_2(\mathbb{Z})$, $f[\alpha]_k(\tau) = (c\tau + d)^{-k} f(\alpha(\tau))$ is holomorphic as well. Furthermore, it is weight $k$ invariant after composing with any element of $\alpha \Gamma \alpha^{-1}$ because $f$ was $\Gamma$ invariant. Therefore, we have a Laurent series expansion $f[\alpha]_k(\tau) = \sum_{n \in \mathbb{Z}} b_n e^{2\pi i n \tau / N}$. To complete the proof, we need to show this is holomorphic at infinity, or equivalently that $b_n = 0$ for $n < 0$. Writing $q_N = e^{2\pi i \tau / N}$, this is equivalent to the condition that $\lim_{q_N \to 0} (f[\alpha]_k)(\tau) \cdot e^{2\pi i \tau / N} = 0$, because this precisely means there are no negative terms in the Laurent series expansion, as if there were this limit would be infinite. First, let us note that using the previous Lemma 5.4.1, as $q_N \to 0$,
\[ \begin{aligned} |(f[\alpha]_k)(\tau)| &= |(c\tau + d)^{-k} f(\alpha(\tau))| \\ &= O(\tau^{-k})\, O(\operatorname{Im}(\alpha(\tau))^{-r}) \\ &= O(y^{-k})\, O(y^r) \\ &= O(y^{r-k}). \end{aligned} \]
This implies that for $D$ a constant,
\[ |(f[\alpha]_k)(\tau) \cdot e^{2\pi i \tau / N}| \leq D |y^{r-k}| |e^{2\pi i \tau / N}| = D |y^{r-k}| e^{-2\pi y / N}, \]
which clearly goes to 0 as $y \to \infty$, since the exponential term dominates the polynomial term. This means the Laurent series has no negative terms, and so $f[\alpha]_k$ is indeed holomorphic at infinity.

5.5. The elements of $M_2(\Gamma_0(4))$. We are finally ready to produce two independent elements of $M_2(\Gamma_0(4))$. Explicitly, these elements are $G_2(\tau) - 2G_2(2\tau)$ and $G_2(\tau) - 4G_2(4\tau)$, as we shall soon show.

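Lemma 5.5.1. The function $G_{2,N}(\tau) = G_2(\tau) - N G_2(N\tau)$ satisfies $G_{2,N} \in M_2(\Gamma_0(N))$ for all $N > 0$.

Proof. First, note that for $\gamma \in \Gamma_0(N)$, writing $\gamma = \begin{pmatrix} a & b \\ Nc & d \end{pmatrix}$ and $\eta = \begin{pmatrix} a & Nb \\ c & d \end{pmatrix} \in SL_2(\mathbb{Z})$, we can write $N\gamma(\tau) = \frac{aN\tau + bN}{cN\tau + d} = \eta(N\tau)$. This tells us that for $\gamma \in \Gamma_0(N)$, we have that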

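\[ \begin{aligned} G_{2,N}[\gamma]_2(\tau) &= (Nc\tau + d)^{-2} \left( G_2(\gamma(\tau)) - N G_2(N\gamma(\tau)) \right) \\ &= (cN\tau + d)^{-2} G_2(\gamma(\tau)) - (cN\tau + d)^{-2} N G_2(N\gamma(\tau)) \\ &= G_2(\tau) - \frac{2\pi i N c}{Nc\tau + d} - (c(N\tau) + d)^{-2} N G_2(N\gamma(\tau)) \\ &= G_2(\tau) - \frac{2\pi i N c}{Nc\tau + d} - (c(N\tau) + d)^{-2} N G_2(\eta(N\tau)) \\ &= G_2(\tau) - \frac{2\pi i N c}{Nc\tau + d} - N \left( G_2(N\tau) - \frac{2\pi i c}{c(N\tau) + d} \right) \\ &= G_2(\tau) - \frac{2\pi i N c}{Nc\tau + d} - N G_2(N\tau) + \frac{2\pi i N c}{c(N\tau) + d} \\ &= G_2(\tau) - N G_2(N\tau), \end{aligned} \]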

where we have crucially used Lemma 5.2.3, both for $G_2(\tau)$ and $G_2(N\tau)$. Next, to see it is holomorphic, simply note that $G_2(\tau)$ has a convergent Fourier series given in 5.1.3. Therefore, $N G_2(N\tau)$ has a convergent Fourier series as well, which means their difference is holomorphic. It only remains to show it is holomorphic at infinity. Indeed, it is for this sole purpose we have proved Theorem 5.4.2 above. We have just shown that $G_{2,N}$ is both holomorphic and $\Gamma_0(N)$ invariant. So, we only need to show the final condition of 5.4.2 is met. That is, we must show that in the Fourier expansion of $G_{2,N}$, the coefficient $a_n$ is bounded by $C n^r$, for some $C, r$. However, we know the coefficient, at least for $n \geq 1$, is bounded in absolute value by $8\pi^2 \sigma(n)$. So, it suffices to show $\sigma(n) < C n^r$. However, this is quite obvious because $\sigma(n) \leq \sum_{i=1}^{n} i \leq n^2$. Hence, the function $G_{2,N}$ is holomorphic at infinity and in fact $G_{2,N} \in M_2(\Gamma_0(N))$.
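The coefficient bound can be made explicit. By Lemma 5.1.3, for $n \geq 1$ the $n$th Fourier coefficient of $G_{2,N}$ is $-8\pi^2 \left( \sigma(n) - N\sigma(n/N) \right)$, where the second term is present only when $N \mid n$. The sketch below checks that this integer factor equals the sum of the divisors of $n$ not divisible by $N$, and that it is bounded by $\sigma(n) \leq n^2$. The function names are illustrative.

```python
def sigma(n):
    """Sum of the positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)


def coeff_G2N(n, N):
    """sigma(n) - N * sigma(n / N), the n-th coefficient of G_{2,N} over -8 pi^2."""
    return sigma(n) - (N * sigma(n // N) if n % N == 0 else 0)


def restricted_divisor_sum(n, N):
    """Sum of the positive divisors d of n with N not dividing d."""
    return sum(d for d in range(1, n + 1) if n % d == 0 and d % N != 0)


for n in range(1, 13):
    print(n, coeff_G2N(n, 2), coeff_G2N(n, 4), sigma(n), n * n)
```

The identity holds because multiplying each divisor of $n/N$ by $N$ lists exactly the divisors of $n$ that are multiples of $N$.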

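Lemma 5.5.2. The functions $G_{2,2}, G_{2,4}$ have series expansions
\[ G_{2,2}(\tau) = -\frac{\pi^2}{3} \left( 1 + 24 \sum_{n=1}^{\infty} \left( \sum_{0 < d \mid n,\, 2 \nmid d} d \right) e^{2\pi i \tau n} \right), \]
\[ G_{2,4}(\tau) = -\pi^2 \left( 1 + 8 \sum_{n=1}^{\infty} \left( \sum_{0 < d \mid n,\, 4 \nmid d} d \right) e^{2\pi i \tau n} \right). \]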

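Proof. From 5.1.3, we have the explicit Fourier series
\[ G_2(\tau) = 2\zeta(2) - 8\pi^2 \sum_{n=1}^{\infty} \sigma(n) e^{2\pi i \tau n} = \frac{\pi^2}{3} - 8\pi^2 \sum_{n=1}^{\infty} \sigma(n) e^{2\pi i \tau n}. \]
Hence,
\[ \begin{aligned} G_{2,2}(\tau) &= G_2(\tau) - 2G_2(2\tau) \\ &= \frac{\pi^2}{3} - 8\pi^2 \sum_{n=1}^{\infty} \sigma(n) e^{2\pi i \tau n} - 2 \left( \frac{\pi^2}{3} - 8\pi^2 \sum_{n=1}^{\infty} \sigma(n) e^{2\pi i 2\tau n} \right) \\ &= -\frac{\pi^2}{3} - 8\pi^2 \sum_{n=1}^{\infty} \left( \sigma(n) e^{2\pi i \tau n} - 2\sigma(n) e^{2\pi i 2\tau n} \right) \\ &= -\frac{\pi^2}{3} - 8\pi^2 \sum_{n=1}^{\infty} \left( \sigma(n) - \sum_{d \mid n,\, 2 \mid d} d \right) e^{2\pi i \tau n} \\ &= -\frac{\pi^2}{3} - 8\pi^2 \sum_{n=1}^{\infty} \left( \sum_{d \mid n,\, 2 \nmid d} d \right) e^{2\pi i \tau n} \\ &= -\frac{\pi^2}{3} \left( 1 + 24 \sum_{n=1}^{\infty} \left( \sum_{0 < d \mid n,\, 2 \nmid d} d \right) e^{2\pi i \tau n} \right). \end{aligned} \]
Similarly, the same computation applied to $G_{2,4}(\tau) = G_2(\tau) - 4G_2(4\tau)$ gives
\[ G_{2,4}(\tau) = -\pi^2 \left( 1 + 8 \sum_{n=1}^{\infty} \left( \sum_{0 < d \mid n,\, 4 \nmid d} d \right) e^{2\pi i \tau n} \right). \]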

Theorem 5.5.3. The space M2(Γ0(4)) is two dimensional. Proof. Omitted for difficulty. See [2, p. 108], exercise 3.9.3, although understanding this proof may actually involve reading the first hundred pages. 

Corollary 5.5.4. $G_{2,2}, G_{2,4}$ form a basis for $M_2(\Gamma_0(4))$.

Proof. Since the space is two dimensional, and we have explicitly produced these forms, which are independent as can be seen by looking at the first two coefficients of their Fourier expansions, they must form a basis.

6. Theta Functions
Theta functions are yet another way to produce modular forms. Like Eisenstein series, they appear as infinite sums, but their transformation properties are not quite as immediate. In particular, we will be interested in the Jacobi theta function, whose fourth power counts the number of ways to write integers as a sum of four squares and lives in $M_2(\Gamma_0(4))$.

Definition 6.0.5. Let $\Gamma$ be a lattice and $q$ be a quadratic form $q : \Gamma \to \mathbb{Z}$ which is positive definite. Then the theta function of the lattice $\Gamma$ with quadratic form $q$ is $\theta_\Gamma(z) = \sum_{x \in \Gamma} e^{\pi i z q(x)}$.

Notation 6.0.1. We shall notate the particular case of $\Gamma = \mathbb{Z}$, $q(x) = 2x^2$ by
\[ \theta(z) = \sum_{x \in \mathbb{Z}} e^{2\pi i z x^2}. \]
We shall also define $\theta(\tau, k) = \theta(\tau)^k$.

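Lemma 6.0.6. The series $\theta(\tau, k)$ are absolutely convergent.

Proof. It suffices to show $\theta(\tau)$ is absolutely convergent in the upper half plane. This is equivalent to showing $\sum_{x \in \mathbb{Z}} |e^{2\pi i z x^2}| = \sum_{x \in \mathbb{Z}} e^{-2\pi \operatorname{Im}(z) x^2} < \infty$. So, taking $\alpha = e^{-2\pi \operatorname{Im}(z)} < 1$ it suffices to show $\sum_{x > 0} \alpha^{x^2}$ converges. But since $\alpha < 1$, this sum is dominated by the geometric series $\sum_{x > 0} \alpha^x$, which converges. Hence, the theta function converges absolutely.

Lemma 6.0.7. Writing $\theta(\tau, k)$ as a Fourier series, $\theta(\tau, k) = \sum_{n=0}^{\infty} a_{n,k} e^{2\pi i n \tau}$, the coefficients $a_{n,k}$ satisfy $a_{n,k} = |\{ v \in \mathbb{Z}^k : n = \sum_{i=1}^{k} v_i^2 \}|$.

Proof. We see that the coefficient of $e^{2\pi i n z}$ in the expansion
\[ \left( \sum_{x \in \mathbb{Z}} e^{2\pi i z x^2} \right)^k = \sum_{n \geq 0} \left( \sum_{(a_1, \ldots, a_k) \in \mathbb{Z}^k,\ \sum_{i=1}^{k} a_i^2 = n} 1 \right) e^{2\pi i z n} \]
is $\sum_{(a_1, \ldots, a_k),\ \sum_{i=1}^{k} a_i^2 = n} 1 = |\{ v \in \mathbb{Z}^k : n = \sum_{i=1}^{k} v_i^2 \}|$.

Lemma 6.0.8. We have the identities $\theta(\tau + 1, a) = \theta(\tau, a)$ and $\theta(\tau, a + b) = \theta(\tau, a)\theta(\tau, b)$.

Proof. To show $\theta(\tau + 1, a) = \theta(\tau, a)$, it suffices to show $\theta(\tau + 1) = \theta(\tau)$, which holds because $\theta(\tau) = \sum_{n \in \mathbb{Z}} e^{2\pi i n^2 \tau} = \sum_{n \in \mathbb{Z}} e^{2\pi i n^2 (\tau + 1)} = \theta(\tau + 1)$. The second identity holds simply because $x^{a+b} = x^a \cdot x^b$.

Definition 6.0.9. The Fourier transform of a Lebesgue integrable function $f : \mathbb{R}^n \to \mathbb{C}$ is $\mathcal{F}(f)(x) = \int_{\mathbb{R}^n} f(t) e^{-2\pi i x \cdot t}\, dt$.

Lemma 6.0.10 (Poisson Summation). For $f : \mathbb{R} \to \mathbb{C}$ a Schwartz function (in particular for the Gaussian used in the proof of Lemma 6.0.11) we have $\sum_{n \in \mathbb{Z}} f(n) = \sum_{n \in \mathbb{Z}} \mathcal{F}f(n)$.

Proof. Define $F(x) = \sum_{n \in \mathbb{Z}} f(n + x)$. Observe that $F(x) = F(x + 1)$ and so writing $F$ as its Fourier series gives
\[ \begin{aligned} F(x) &= \sum_{k \in \mathbb{Z}} \left( \int_0^1 \sum_{n \in \mathbb{Z}} f(t + n) e^{-2\pi i k t}\, dt \right) e^{2\pi i k x} \\ &= \sum_{k \in \mathbb{Z}} \left( \sum_{n \in \mathbb{Z}} \int_0^1 f(t + n) e^{-2\pi i k t}\, dt \right) e^{2\pi i k x} \\ &= \sum_{k \in \mathbb{Z}} \left( \sum_{n \in \mathbb{Z}} \int_n^{n+1} f(t) e^{-2\pi i k t}\, dt \right) e^{2\pi i k x} \\ &= \sum_{k \in \mathbb{Z}} \left( \int_{\mathbb{R}} f(t) e^{-2\pi i k t}\, dt \right) e^{2\pi i k x} \\ &= \sum_{k \in \mathbb{Z}} \mathcal{F}(f)(k) e^{2\pi i k x}. \end{aligned} \]
We crucially used the fact that $f$ has rapidly decreasing derivatives (i.e. is a Schwartz function) to interchange the sum and the integral. Plugging in $x = 0$ yields $\sum_{n \in \mathbb{Z}} f(n) = F(0) = \sum_{k \in \mathbb{Z}} \mathcal{F}(f)(k)$.

Lemma 6.0.11. $\theta\left( \dfrac{-1}{4\tau} \right) = \sqrt{-2i\tau}\, \theta(\tau)$.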

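Proof. Fix $t > 0$ and define $f(x) = e^{-\pi t x^2}$. Observe
\[ \begin{aligned} \hat{f}(y) &= \int_{-\infty}^{\infty} e^{-\pi t x^2 - 2\pi i x y}\, dx \\ &= \int_{-\infty}^{\infty} e^{-t\pi \left( x + \frac{iy}{t} \right)^2} e^{-\frac{\pi y^2}{t}}\, dx \\ &= e^{-\frac{\pi y^2}{t}} \int_{-\infty}^{\infty} e^{-t\pi \left( x + \frac{iy}{t} \right)^2}\, dx \\ &= e^{-\frac{\pi y^2}{t}} \int_{-\infty + \frac{iy}{t}}^{\infty + \frac{iy}{t}} e^{-t\pi x^2}\, dx \\ &= e^{-\frac{\pi y^2}{t}} \int_{-\infty}^{\infty} e^{-t\pi x^2}\, dx \\ &= \frac{1}{\sqrt{t}} e^{-\frac{\pi y^2}{t}}. \end{aligned} \]
Then, the Poisson summation formula tells us
\[ \sum_{n \in \mathbb{Z}} \frac{1}{\sqrt{t}} e^{-\frac{\pi n^2}{t}} = \sum_{n \in \mathbb{Z}} \mathcal{F}(f)(n) = \sum_{n \in \mathbb{Z}} f(n) = \sum_{n \in \mathbb{Z}} e^{-\pi t n^2}. \]
Next, note that since $\theta(z) = \sum_{n \in \mathbb{Z}} e^{2\pi i n^2 z}$, the right side is $\theta\left( \frac{it}{2} \right)$ and the left side is $\frac{1}{\sqrt{t}} \theta\left( \frac{i}{2t} \right)$. Taking $s = \frac{-t}{2i} = \frac{it}{2}$, we get $\theta(s) = \frac{1}{\sqrt{-2is}} \theta\left( \frac{1}{-4s} \right)$, which implies $\theta\left( \frac{1}{-4s} \right) = \sqrt{-2is}\, \theta(s)$. This tells us the equation holds true when $s = \frac{it}{2}$ with $t > 0$, i.e. when $s$ lies on the positive imaginary axis. However, both sides of the equation are holomorphic functions of $s$ on the upper half plane, and since they agree on the positive imaginary axis, they must agree everywhere.

Corollary 6.0.12. $\theta\left( \dfrac{\tau}{4\tau + 1}, 4 \right) = (4\tau + 1)^2\, \theta(\tau, 4)$.

Proof. To check this, it suffices to show $\theta\left( \frac{t}{4t + 1} \right) = \sqrt{4t + 1}\, \theta(t)$. Indeed,
\[ \begin{aligned} \theta\left( \frac{t}{4t + 1} \right) = \theta\left( -\frac{1}{4\left( \frac{-1}{4t} - 1 \right)} \right) &= \sqrt{2i\left( \frac{1}{4t} + 1 \right)}\, \theta\left( \frac{-1}{4t} - 1 \right) \\ &= \sqrt{2i\left( \frac{1}{4t} + 1 \right)}\, \theta\left( \frac{-1}{4t} \right) \\ &= \sqrt{2i\left( \frac{1}{4t} + 1 \right)} \cdot \sqrt{-2it}\, \theta(t) \\ &= \sqrt{4t + 1}\, \theta(t). \end{aligned} \]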
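Both transformation rules can be checked numerically from the defining series. The sketch below truncates $\theta(\tau) = \sum_{n \in \mathbb{Z}} e^{2\pi i n^2 \tau}$ once the terms fall below double precision, then compares the two sides of Lemma 6.0.11 and of Corollary 6.0.12. It assumes the principal branch of the square root, which is the natural choice here because $-2i\tau$ has positive real part for $\tau$ in the upper half plane; the corollary is tested only in its fourth power form, which involves no branch choice. The function name `theta`, the cutoff and the sample point are illustrative.

```python
import cmath
import math


def theta(tau, max_n=200):
    """Truncation of theta(tau) = sum over n in Z of e^{2 pi i n^2 tau}."""
    total = 1 + 0j
    for n in range(1, max_n + 1):
        exponent = 2j * math.pi * n * n * tau
        if exponent.real < -700:  # remaining terms are below double precision
            break
        total += 2 * cmath.exp(exponent)  # n and -n contribute equally
    return total


tau = 0.2 + 0.7j

# Lemma 6.0.11 with the principal square root.
lhs = theta(-1 / (4 * tau))
rhs = cmath.sqrt(-2j * tau) * theta(tau)
print(abs(lhs - rhs))

# Corollary 6.0.12 in fourth power form.
lhs4 = theta(tau / (4 * tau + 1)) ** 4
rhs4 = (4 * tau + 1) ** 2 * theta(tau) ** 4
print(abs(lhs4 - rhs4))
```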



7. The Sum of Four Squares Theorem
Finally, we will use the calculations above to derive that the fourth power of the Jacobi theta function lies in $M_2(\Gamma_0(4))$, from which it will follow that it is a linear combination of Eisenstein series, whose coefficients we have found above. We can then relate these to find the explicit formula for the number of ways to write an integer as a sum of four squares.
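Before going through the argument, the formula that this section arrives at in Theorem 7.0.16 can be sanity-checked by brute force. The sketch below counts integer solutions of $a^2 + b^2 + c^2 + d^2 = n$ directly, with order and signs counted as in Lemma 6.0.7, and compares the count with $8 \sum_{d \mid n,\, 4 \nmid d} d$. The function names are illustrative, and the exhaustive search is only practical for small $n$.

```python
from itertools import product
from math import isqrt


def r4_bruteforce(n):
    """Number of (a, b, c, d) in Z^4 with a^2 + b^2 + c^2 + d^2 = n."""
    m = isqrt(n)
    rng = range(-m, m + 1)
    return sum(1 for v in product(rng, repeat=4) if sum(x * x for x in v) == n)


def r4_formula(n):
    """8 times the sum of the positive divisors of n not divisible by 4."""
    return 8 * sum(d for d in range(1, n + 1) if n % d == 0 and d % 4 != 0)


for n in range(1, 13):
    print(n, r4_bruteforce(n), r4_formula(n))
```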

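Theorem 7.0.13. The group $\Gamma_0(4)$ is generated by the matrices $\pm \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \pm \begin{pmatrix} 1 & 0 \\ 4 & 1 \end{pmatrix}$.

Proof. Take any matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_0(4)$. It suffices to show this can be formed by products of the matrices above. Observe that
\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \cdot \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}^n = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \cdot \begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} a & e \\ c & nc + d \end{pmatrix}, \]
where $e = na + b$. So, by this identity, unless $c = 0$, we can choose an appropriate $n$ so that $|nc + d| \leq |c|/2$. Note that $c \equiv 0 \bmod 4$ and $d$ must be odd for the determinant to be 1, so actually we obtain the strict inequality $|nc + d| < |c|/2$. Note that if $c = 0$, in order for the determinant to be 1, up to sign, the matrix must be upper triangular with 1s on the diagonal, which means it is a power of $\pm \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$. So, the above tells us we may assume $|d| < |c|/2$. Furthermore, we also see,
\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \cdot \begin{pmatrix} 1 & 0 \\ 4 & 1 \end{pmatrix}^n = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \cdot \begin{pmatrix} 1 & 0 \\ 4n & 1 \end{pmatrix} = \begin{pmatrix} f & b \\ c + 4nd & d \end{pmatrix}, \]
where $f = a + 4nb$. By similar reasoning to the above, we can choose $n$ so that $|c + 4nd| < 2|d|$. Hence, we can arrange $|c|/2 < |d|$. Now, given any matrix, we can perform multiplications of the above two types so that at every step we decrease the absolute value of either $c$ or $d$. This implies that after finitely many steps, we can arrange for $c = 0$, at which point we know the matrix is generated by $\pm \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$. Therefore, all of $\Gamma_0(4)$ is generated by the claimed matrices.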
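The descent in this proof is effectively an algorithm, and it can be sketched as follows. Matrices are stored as nested tuples, `T(n)` and `U(n)` denote the $n$th powers of the two generators, and `reduce_gamma0_4` right-multiplies by such powers until the lower left entry vanishes, recording each step. The rounding rule used to choose $n$ and all function names are choices made for this illustration, not taken from the text.

```python
def mul(A, B):
    """Product of two 2x2 integer matrices given as nested tuples."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def T(n):
    """n-th power of the generator (1 1; 0 1)."""
    return ((1, n), (0, 1))


def U(n):
    """n-th power of the generator (1 0; 4 1)."""
    return ((1, 0), (4 * n, 1))


def reduce_gamma0_4(M):
    """Return (sign, m, steps) with M * S_1 * ... * S_r == sign * T(m)."""
    (a, b), (c, d) = M
    assert a * d - b * c == 1 and c % 4 == 0
    steps, cur = [], M
    while cur[1][0] != 0:
        c, d = cur[1]
        if 2 * abs(d) > abs(c):
            k = (2 * d + c) // (2 * c)   # nearest integer to d / c
            steps.append(("T", -k))      # new d is d - k*c, with |d - k*c| < |c| / 2
            cur = mul(cur, T(-k))
        else:
            k = (c + 2 * d) // (4 * d)   # nearest integer to c / (4 d)
            steps.append(("U", -k))      # new c is c - 4*k*d, with |c - 4*k*d| < 2 |d|
            cur = mul(cur, U(-k))
    sign = cur[0][0]                     # diagonal entries are both +1 or both -1
    return sign, sign * cur[0][1], steps


def rebuild(sign, m, steps):
    """Invert the recorded steps to recover the original matrix."""
    R = T(m)
    for name, n in reversed(steps):
        R = mul(R, T(-n) if name == "T" else U(-n))
    return tuple(tuple(sign * x for x in row) for row in R)


M = mul(mul(mul(T(3), U(-2)), T(1)), U(5))
print(M, reduce_gamma0_4(M))
```

Each step strictly decreases $|c|$ or $|d|$, which is why the loop terminates; `rebuild` then expresses the input as $\pm$ a word in the two generators.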

Corollary 7.0.14. The function $\theta(\tau, 4) \in M_2(\Gamma_0(4))$.

Corollary 7.0.15. The theta function satisfies $\theta(\tau, 4) = \dfrac{-1}{\pi^2} G_{2,4}(\tau)$.

Proof. Recall that we have the Fourier expansion $\theta(\tau, 4) = \left( 1 + 2e^{2\pi i \tau} + \cdots \right)^4 = 1 + 8e^{2\pi i \tau} + \cdots$. Comparing this to the beginning of the expansions $G_{2,2}(\tau) = -\frac{\pi^2}{3} \left( 1 + 24e^{2\pi i \tau} + \cdots \right)$ and $G_{2,4}(\tau) = -\pi^2 \left( 1 + 8e^{2\pi i \tau} + \cdots \right)$, we see that when we write $\theta(\tau, 4) = aG_{2,2} + bG_{2,4}$ and solve for $a, b$ using the linear constraints determined by the first two coefficients of the Fourier series, we find $a = 0$, $b = \frac{-1}{\pi^2}$, which implies $\theta(\tau, 4) = \frac{-1}{\pi^2} G_{2,4}(\tau)$.

Theorem 7.0.16. For $n \geq 1$, there are $8 \sum_{d \mid n,\, d \notin 4\mathbb{Z}} d$ ways to write $n$ as a sum of 4 squares.

Proof. We saw in 6.0.7 that the $n$th Fourier coefficient of $\theta(\tau, 4)$ is the number of ways to write $n$ as a sum of four squares. However, we saw in 5.5.2 that the $n$th Fourier coefficient of $G_{2,4}(\tau)$ is $-\pi^2 \cdot 8 \cdot \sum_{d \mid n,\, d \notin 4\mathbb{Z}} d$. Since $\theta(\tau, 4) = \frac{-1}{\pi^2} G_{2,4}(\tau)$, when we equate the $n$th Fourier coefficients of these two series, we obtain that the number of ways to write $n$ as a sum of four squares is equal to $8 \sum_{d \mid n,\, d \notin 4\mathbb{Z}} d$.

References
[1] Serre, J.-P. A Course in Arithmetic. New York: Springer, 1973.
[2] Diamond, Fred, and Shurman, Jerry. A First Course in Modular Forms. New York: Springer, 2005.