MIT Undergraduate Math Association Magazine

Fall 2013

A Note from the UMA President

The MIT Undergraduate Math Association is pleased to bring you our Fall 2013 issue of the UMA Magazine! The UMA has published a magazine for the math community at MIT periodically for over 25 years. Now, after a short hiatus, we are resuming the tradition. In the Spring, we held a survey to gauge your opinions about life as a math major at MIT and interests for our magazine. This issue contains everything you asked for, and more:

• Four articles by current MIT undergraduates about math that we find interesting.
• An interview with Professor Andrew Sutherland, focusing on his work and the recent progress toward the Twin Prime Conjecture.
• Interviews with mathematics students who graduated from MIT and are pursuing various fields of study.
• The more interesting results from our survey.
• Some mathematical nuggets and jokes.

We hope that you find the content useful and interesting! An online version is available at the UMA website http://web.mit.edu/uma/www/, which will also include unabridged copies of our four interviews. We are already in the process of making the next edition of this magazine! If you are interested in contributing, please contact us at [email protected].

The UMA also holds various events throughout the year. Our first two lectures of the fall semester will be given by Prof. Scott Aaronson on Tuesday, September 10th at 5 pm, and by Prof. Henry Cohn on Tuesday, September 17th at 5 pm, locations TBA. To hear more about these and other UMA events, subscribe to our mailing list [email protected], either directly, or by asking us to add you at [email protected].

Have a great semester!
Josh Alman
UMA President

The UMA Magazine writing and editing team

The UMA officers: President Josh Alman, Vice President Mitchell Lee, Treasurer Leon Zhou, and Secretary Felipe Hernandez.

Holden Lee, last year’s Vice President, who continued to help us despite having graduated.

Soohyun Park from USWIM, who helped to organize our math articles.

Ping Ngai Chung, who contributed an article.

Our staff advisor Prof. Ju-Lee Kim.

Special thanks to Prof. Andrew Sutherland, Leonid Chindelevitch, Delong Meng, and David B. Rush for being interviewed for this magazine.

Table of Contents

Combinatorial Applications of Network Flow / Josh Alman

Generalized Fibonacci Sequences / Soohyun Park

The Young-Frobenius Identity / Mitchell Lee

Sieve Methods and the Twin Prime Conjecture / Ping Ngai Chung

Interview with Prof. Andrew Sutherland / Felipe Hernandez

Interview with Leonid Chindelevitch / Holden Lee

Interview with Delong Meng / Holden Lee

Interview with David B. Rush / Holden Lee

Math Major Survey Responses / Holden Lee and Soohyun Park

Nuggets and Jokes / Leon Zhou

Combinatorial Applications of Network Flow
Josh Alman

Network flow is one of the most important problems in computer science and operations research, and also has applications to other subjects such as ecology and physics. However, the problem is inherently combinatorial, and has some cool applications to combinatorial problems. In this article, I will first introduce the theory behind network flows, and then present some of my favorite applications. This article is meant to be self-contained, and requires no previous knowledge of network flows. A reader familiar with network flows may want to skip to the section on applications.

1 Introduction

1.1 Definitions

A network is a complete directed graph $G = (V, E)$, together with an edge capacity function $c : E \to \mathbb{R}_{\geq 0}$ and two distinguished vertices, the source $s$ and the sink $t$.

If an edge $e$ goes from vertex $a$ to vertex $b$, we will write $e^- = a$ and $e^+ = b$. If $e$ is an edge, then $e^R$ will denote its reverse edge, namely $(e^R)^- = e^+$ and $(e^R)^+ = e^-$.

A flow $f$ in the network is a map $f : E \to \mathbb{R}$ with the following properties:

1. For each edge $e \in E$, $f(e) \leq c(e)$,
2. For each edge $e \in E$, $f(e) = -f(e^R)$,
3. For each vertex $v$ other than $s$ and $t$:
\[ \sum_{e^+ = v} f(e) = 0. \]

If $e$ is an edge, and $e^- = a$ and $e^+ = b$, we will often write $f(a, b) := f(e)$.

The value of the flow, denoted $|f|$, is the net amount of flow into $t$:
\[ |f| = \sum_{e^+ = t} f(e). \]

A network can be thought of as a series of water pipes connected to each other. The capacity of a pipe represents how much water can flow through the pipe in that direction per unit of time. Then, a flow corresponds to water going through the pipes, so that the water is conserved everywhere other than the source and the sink. The value of the flow is the amount of water going from the source to the sink per unit of time.

From the definitions and this intuition, we can see that:
\[ |f| = \sum_{e^+ = t} f(e) = -\sum_{e^- = t} f(e) = \sum_{e^- = s} f(e) = -\sum_{e^+ = s} f(e). \]

1.2 The Ford-Fulkerson Algorithm

A natural question to ask is: given a network, what is the maximum possible value of a flow through it? In computer science, one seeks fast algorithms to answer this question. Here, we give one simple algorithm, the Ford-Fulkerson algorithm, which yields important insights into the problem.

The algorithm is based on paths through a graph called augmenting paths. Given a flow $f_0$ in our network, an augmenting path is a list $e_0, e_1, e_2, \ldots, e_k$ of distinct edges such that:

1. $e_i \neq e_j^R$ for any $0 \leq i, j \leq k$, $i \neq j$,
2. $e_0^- = s$, and $e_k^+ = t$,
3. $e_i^+ = e_{i+1}^-$ for each $0 \leq i < k$,
4. $f_0(e_j) < c(e_j)$ for each $0 \leq j \leq k$.

In other words, it is a path of edges from $s$ to $t$ which are not filled to capacity. We can augment $f_0$ by this augmenting path as follows: we let $\mathrm{ag} = \min\{c(e_j) - f_0(e_j) \mid 0 \leq j \leq k\} > 0$, and then we get a new flow $f_1$ defined by:
\[ f_1(e) = \begin{cases} f_0(e) + \mathrm{ag} & \text{if } e \in \{e_0, e_1, \ldots, e_k\}, \\ f_0(e) - \mathrm{ag} & \text{if } e \in \{e_0^R, e_1^R, \ldots, e_k^R\}, \\ f_0(e) & \text{otherwise.} \end{cases} \]

We can see by definition that $f_1$ is a new valid flow with $|f_1| = |f_0| + \mathrm{ag}$. From this, it is clear that whenever there is an augmenting path in our flow, the flow does not have the maximum possible value. It turns out that the converse is true as well, and this is how the Ford-Fulkerson algorithm works.

In the Ford-Fulkerson algorithm, we start with the ‘empty’ flow $f_0$, where $f_0(e) = 0$ for all edges $e$. Then, while there exists an augmenting path, we replace $f_0$ by augmenting it by that path. We repeat until there are no augmenting paths left. Then, the resulting flow is a flow of maximum possible value.

Of course, it is not even clear that this algorithm will ever terminate. However, it will if we restrict ourselves to rational capacities, meaning $c(e) \in \mathbb{Q}_{\geq 0}$ for all edges $e$:

Lemma 1. If all the edge capacities in a network are rational numbers, then the Ford-Fulkerson algorithm will terminate in a finite number of steps.

Proof. If we multiply all the capacities by a value $d > 0$, apply the Ford-Fulkerson algorithm to get a flow in the resulting network, then divide the capacities and the resulting flows by $d$, the result is the same as if we had just applied the Ford-Fulkerson algorithm. Hence, since $d$ can be the LCM of all the denominators of the edge capacities, we can assume without loss of generality that all the edge capacities are integers.

Let $M = \sum_{e^+ = t} c(e)$. Then, we can see that $M$ is an upper bound on $|f|$ for any flow $f$, since it is the most flow that can come into the sink from the edges connected to it.

Next, notice that at each iteration of the Ford-Fulkerson algorithm, the amount $\mathrm{ag}$ that we augment an augmenting path by will always be a positive integer. Indeed, all the edge capacities are integers, and we can see by induction that all the flow values will always be integers.

Our initial flow $f_0$ has value $|f_0| = 0$, and at each iteration we increase the value of the flow by $\mathrm{ag} \geq 1$, but the flow cannot increase beyond $M$, so we can only take a finite number of steps.

Hence, for the remainder of this paper we restrict ourselves to the case where all edge capacities are rational. It should be noted that the main results in the next subsection will turn out to hold true even without this restriction.

1.3 Max Flow–Min Cut

To prove the correctness of the Ford-Fulkerson algorithm, we need one more definition. A cut $C$ is a set $C = \{s, v_1, v_2, \ldots, v_m\}$ of vertices containing $s$ but not $t$. We can extend the definitions of the capacity function $c$ and a flow $f$ for a cut $C$ to define the capacity of $C$, and the flow going out of $C$, respectively, as follows:
\[ c(C) = \sum_{\{e \in E \mid e^- \in C,\ e^+ \notin C\}} c(e), \qquad f(C) = \sum_{\{e \in E \mid e^- \in C,\ e^+ \notin C\}} f(e). \]

An important property that one should verify is that for any flow $f$ and any cut $C$:
\[ |f| = f(C) \leq c(C). \tag{1} \]
See, for instance, Figure 1.

[Figure 1: drawing of a network on the vertices s, a, b, c, d, t with edge labels of the form x/y; the drawing is not reproduced here.]

Figure 1: A network with a flow. The notation x/y on an edge means that edge has capacity y, and flow x along it. All edges not shown have capacity 0, and have flow equal to negative the flow in the opposite direction if such a flow is shown, or 0 otherwise. The flow shown has value 7. The cut {s,a} also has capacity 7.

Now we can prove the following important theorem, which implies the correctness of the Ford-Fulkerson algorithm:

Theorem 1 (Max Flow–Min Cut). For any network $G = (V, E)$ with capacity function $c : E \to \mathbb{Q}_{\geq 0}$, source $s$, and sink $t$:
\[ \max_{\text{flows } f} \{|f|\} = \min_{\text{cuts } C} \{c(C)\}. \]

Proof. The fact that
\[ \max_{\text{flows } f} \{|f|\} \leq \min_{\text{cuts } C} \{c(C)\} \]
follows from (1). Now, to complete the proof, we will show that:
\[ \max_{\text{flows } f} \{|f|\} \geq \min_{\text{cuts } C} \{c(C)\}. \]

Consider any flow $f^*$ that results from the Ford-Fulkerson algorithm. There are no augmenting paths in $f^*$. Let $C^*$ be the set of all vertices we can reach from $s$ by a path that uses only directed edges that are not filled to capacity.

First, notice that if $C^*$ contained $t$, then there would be an augmenting path. Hence, we must have $t \notin C^*$, and so $C^*$ is a cut.

Second, consider any edge $e$ such that $e^- \in C^*$ and $e^+ \notin C^*$. We must have that $f^*(e) = c(e)$, since otherwise, using this edge, we would see that $e^+$ can be reached from $s$ using only edges that are not filled to capacity, and so $e^+ \in C^*$, a contradiction. Summing this result over all such edges $e$ shows that $f^*(C^*) = c(C^*)$.

We finally have our desired result:
\[ \max_{\text{flows } f} \{|f|\} \geq |f^*| = f^*(C^*) = c(C^*) \geq \min_{\text{cuts } C} \{c(C)\}. \]

From the above argument, we see that the Ford-Fulkerson algorithm did indeed give us a flow $f^*$ of maximum value. This has another important corollary:

Corollary 1 (Max Flow Integrality). If the network has only integer capacities, meaning $c(e) \in \mathbb{Z}_{\geq 0}$ for all edges $e$, then there is an integer flow $f$, with $f(e) \in \mathbb{Z}$ for all edges $e$, which attains the maximum possible value.

Proof. This follows from the third paragraph in the proof of Lemma 1.

2 Applications

We now have the theory in place to prove some interesting combinatorial results relatively easily! In each, one should pay special attention to how augmenting paths translate to the problem at hand, although we will not need to explicitly mention them.

We begin with a classic theorem in matching theory. The result is often intuitively stated as follows: Consider a collection of n men and n women, such that each woman has a list of men she is willing to marry, and each man is willing to marry any woman who desires him. Then, we can match up the men and women to be happily married if and only if, for any subset of k of the women, there are at least k men who are on at least one of their lists.

Problem 1 (Hall’s Marriage Lemma). If $H = (V_0, E_0)$ is an undirected bipartite graph, with bipartition $V_0 = A \cup B$, and $|A| = |B|$, then there exists a bijection $m : A \to B$ such that $(a, m(a)) \in E_0$ for all $a \in A$ if and only if, for every subset $D \subseteq A$, we have that:
\[ |D| \leq \bigl|\{b \in B \mid (a, b) \in E_0 \text{ for some } a \in D\}\bigr|. \]

Proof. For the only if direction, suppose that there is such a bijection $m$. Then for all subsets $D \subseteq A$, define $m(D) := \{m(a) \mid a \in D\}$. Since $(a, m(a)) \in E_0$ for all $a \in A$,
\[ m(D) \subseteq \{b \in B \mid (a, b) \in E_0 \text{ for some } a \in D\}. \]
Therefore,
\[ |D| = |m(D)| \leq \bigl|\{b \in B \mid (a, b) \in E_0 \text{ for some } a \in D\}\bigr|. \]

For the if direction, we are going to construct a network where there is a node for each element of $V_0$, and flow can go between two nodes in the network if they are adjacent in $H$. We will then find that there exists a flow corresponding exactly to the matching that we want.

Consider the network $G = (V, E)$, where $V = V_0 \cup \{s, t\}$, where $s$ is the source and $t$ is the sink, $E = \{(x, y) \mid x, y \in V,\ x \neq y\}$, and the capacity function $c : E \to \mathbb{Z}_{\geq 0}$ given by
\[ c(e) = \begin{cases} 1 & \text{if } e^- = s \text{ and } e^+ \in A, \\ 1 & \text{if } e^+ = t \text{ and } e^- \in B, \\ \ell & \text{if } (e^-, e^+) \in E_0, \\ 0 & \text{otherwise.} \end{cases} \]

If we choose $\ell$ to be a sufficiently large integer ($\ell > |A|$ will suffice), then I claim that a minimum cut has capacity $|A|$.

First note that the cut $C = \{s\}$ achieves this value, as does the cut $C^* = \{s\} \cup V_0$. Since we picked $\ell$ to be greater than $|A|$, any cut $C'$ of smaller capacity would need to be of the form $C' = \{s\} \cup A' \cup B'$, where $A' \subseteq A$, $B' \subseteq B$, and
\[ \{b \in B \mid (a, b) \in E_0 \text{ for some } a \in A'\} \subseteq B'. \]

The last condition is needed so that we do not have any edges of capacity $\ell$ crossing our cut. Thus, by the hypothesis, we have that $|B'| \geq |A'|$. Each edge from $s$ to a vertex in $A \setminus A'$, as well as each edge from a vertex in $B'$ to $t$, contributes 1 to $c(C')$. Hence:
\[ c(C') = (|A| - |A'|) + |B'| \geq (|A| - |B'|) + |B'| = |A|, \]
as desired.

By the Max Flow–Min Cut Theorem, we know that the maximum possible flow value is $|A|$. Hence, by the Max Flow Integrality Corollary, there must be an integer flow $f^*$ with $|f^*| = |A|$. It must have $f^*(s, a) = 1$ for each $a \in A$ so that the flow leaving $s$ is $|A|$, and no other flow entering any $a \in A$ since all the other capacities coming in are 0. Similarly, $f^*(b, t) = 1$ for each $b \in B$ and no other flow is leaving any $b \in B$. Since $f^*$ is an integer flow, for each $a \in A$ there is one $b \in B$ such that $f^*(a, b) = 1$, and for each $b \in B$ there is one $a \in A$ such that $f^*(a, b) = 1$. This defines our desired bijection $m$.

Next, here is a problem I was first shown by my friend Vlad Firoiu:

Problem 2. Let $M$ be an $m \times n$ matrix of real numbers such that each row and each column sums to an integer. Then, there exists an $m \times n$ matrix $N$ of integers that has the same row and column sums as $M$.

Proof. Notice that if we had such an $M$ and $N$, we could add any integer to the same entry in both without changing the result. Hence, we assume without loss of generality that each entry of $M$ is between 0 and 1, inclusive. Let $r_i$ be the sum of row $i$ for each $1 \leq i \leq m$, and $c_j$ be the sum of column $j$ for each $1 \leq j \leq n$, and let $S = \sum_i r_i = \sum_j c_j$.

Let $A = \{a_1, a_2, \ldots, a_m\}$ and $B = \{b_1, b_2, \ldots, b_n\}$ be sets of vertices corresponding to the rows and columns, respectively, and consider the network $G = (V, E)$ where $V = \{s, t\} \cup A \cup B$, $s$ is the source, $t$ is the sink, $E = \{(x, y) \mid x, y \in V,\ x \neq y\}$, and $c : E \to \mathbb{Z}_{\geq 0}$ is the capacity function defined by:
\[ c(e) = \begin{cases} r_i & \text{if } e^- = s \text{ and } e^+ = a_i, \\ c_j & \text{if } e^+ = t \text{ and } e^- = b_j, \\ 1 & \text{if } e^- \in A,\ e^+ \in B, \\ 0 & \text{otherwise.} \end{cases} \]

First, note that the cut $C = \{s\}$ has $c(C) = S$. Next, consider the flow $f : E \to \mathbb{R}$ defined by:
\[ f(e) = \begin{cases} r_i & \text{if } e^- = s \text{ and } e^+ = a_i, \\ c_j & \text{if } e^+ = t \text{ and } e^- = b_j, \\ M_{ij} & \text{if } e^- = a_i,\ e^+ = b_j, \\ -f(e^R) & \text{if } e^R \text{ falls into one of the previous cases,} \\ 0 & \text{otherwise.} \end{cases} \]

We can see that $f$ is a valid flow with $|f| = S$. Since we found the cut $C$ with $c(C) = S$, we know by the Max Flow–Min Cut theorem that the maximum possible value of a flow in this network is $S$. By the Max Flow Integrality Corollary, there must be an integer flow $f^*$ with $|f^*| = S$. It is easy to check that the matrix $N$ defined by $N_{ij} = f^*(a_i, b_j)$ satisfies the desired conditions.

Finally, here is a problem I wrote that appeared on the HMMT Invitational Competition last year. While there is a nice elementary solution to it on the HMMT website, it can also be solved in a straightforward way using the ideas from this article. I leave it to you as an exercise.

Problem 3. Let $S$ be a set of size $n$, and $k$ be a positive integer. For each $1 \leq i \leq kn$, there is a subset $S_i \subset S$ such that $|S_i| = 2$. Furthermore, for each $s \in S$, there are exactly $2k$ values of $i$ such that $s \in S_i$. Show that it is possible to choose one element from $S_i$ for each $1 \leq i \leq kn$ such that every element of $S$ is chosen exactly $k$ times.
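The Ford-Fulkerson procedure of Section 1.2 and the network built in the proof of Hall's Marriage Lemma can be tried out on a small example. The following is only a minimal sketch, not an efficient implementation: the function names `ford_fulkerson` and `hall_matching`, the vertex labels, and the sample edge set are illustrative assumptions, and the labels "s" and "t" are assumed not to occur among the vertices of the bipartite graph. Capacities are assumed to be integers, and only the edges from A to B are given capacity ℓ.

```python
from collections import defaultdict


def ford_fulkerson(capacity, s, t):
    """Max flow for integer capacities given as {(u, v): c}; returns (value, flow)."""
    cap = defaultdict(int, capacity)   # edges that are not listed have capacity 0
    flow = defaultdict(int)            # skew-symmetric: flow[v, u] == -flow[u, v]
    nodes = {x for edge in capacity for x in edge} | {s, t}

    def augmenting_path():
        # Depth-first search from s along edges that are not filled to capacity.
        parent = {s: None}
        stack = [s]
        while stack and t not in parent:
            u = stack.pop()
            for v in nodes:
                if v not in parent and flow[u, v] < cap[u, v]:
                    parent[v] = u
                    stack.append(v)
        if t not in parent:
            return None
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        return path[::-1]

    value = 0
    while (path := augmenting_path()) is not None:
        ag = min(cap[e] - flow[e] for e in path)   # the amount "ag" from the text
        for u, v in path:
            flow[u, v] += ag
            flow[v, u] -= ag
        value += ag
    return value, flow


def hall_matching(A, B, E0):
    """Perfect matching A -> B via the network from the proof of Problem 1, or None."""
    ell = len(A) + 1                               # any integer larger than |A|
    capacity = {("s", a): 1 for a in A}
    capacity.update({(b, "t"): 1 for b in B})
    capacity.update({(a, b): ell for (a, b) in E0})
    value, flow = ford_fulkerson(capacity, "s", "t")
    if value < len(A):
        return None                                # Hall's condition fails
    return {a: b for (a, b) in E0 if flow[a, b] == 1}


# Hypothetical example graph: a2 is only adjacent to b1, which forces the matching.
E0 = {("a1", "b1"), ("a1", "b2"), ("a2", "b1"), ("a3", "b2"), ("a3", "b3")}
print(hall_matching(["a1", "a2", "a3"], ["b1", "b2", "b3"], E0))
```

The depth-first search visits each vertex at most once, so the path it returns uses distinct edges and never an edge together with its reverse, as the definition of an augmenting path requires.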

Generalized Fibonacci Sequences
Soohyun Park

A generalization of the familiar Fibonacci sequence has interesting combinatorial and number-theoretic properties. We will look at a tiling interpretation similar to one which exists for the Fibonacci sequence and investigate the period length of these sequences modulo n. Specifically, the study of period lengths modulo n has important applications related to random number generators and is used in some primality tests.

1 Introduction

The Fibonacci numbers were introduced by Leonardo Fibonacci in the 13th century in a problem about the number of offspring of a pair of rabbits:

“How many pairs of rabbits can be produced by a pair of rabbits in a year if they produce a new pair every month? Each pair becomes productive a month after it is born.”

Letting $n$ be the number of months passed and $\{F_n\}$ be the sequence satisfying the recurrence $F_n = F_{n-1} + F_{n-2}$ with $F_0 = 0$ and $F_1 = 1$, we obtain the number of rabbits. Aside from possibly being useful to study rabbit populations (probably not true), the Fibonacci numbers also come up in many other places. For example, an input of consecutive Fibonacci numbers makes up the worst-case scenario for the Euclidean algorithm.

2 A tiling interpretation

Another way to interpret the $(n+1)$st term of the Fibonacci sequence is the number of ways to tile a $1 \times n$ board with dominoes and monominoes. A combinatorial interpretation related to tiling also exists for the following generalization of the Fibonacci sequence.

Definition 1. The generalized Fibonacci polynomials are polynomials in $s$ and $t$ defined by $f_0(s, t) = 0$ and $f_1(s, t) = 1$ with the recurrence $f_n(s, t) = s f_{n-1}(s, t) + t f_{n-2}(s, t)$ for $n \geq 2$. The sequences given by fixed values of $s, t \in \mathbb{Z}$ are called the generalized Fibonacci sequences.

Here are some examples of generalized Fibonacci sequences.

Example 1. Setting $s = t = 1$, we obtain the Fibonacci sequence: 0, 1, 1, 2, 3, ...

Example 2. If we set $s = 2$ and $t = -1$, we have the following sequence: 0, 1, 2, 3, 4, ...

We consider linear tilings, which are coverings of a row of squares by dominoes and monominoes (single squares). The set of linear tilings of a row of $n$ squares is denoted by $L_n$. For example, two possible linear tilings for a row of four squares are two dominoes or two squares to the left of a domino.

Definition 2. Given a tiling $T$ which consists of $a$ monominoes and $b$ dominoes, its weight is
\[ \mathrm{wt}(T) = s^a t^b. \]
The empty tiling is defined to have weight 1.

The weights of tilings can be used to relate linear tilings to generalized Fibonacci polynomials. The relation below follows from the definition of a generalized Fibonacci polynomial and can be proved by showing that the initial conditions and recurrence relation are satisfied.

Theorem 1. For $n \geq 0$, we have $f_{n+1}(s, t) = \sum_{T \in L_n} \mathrm{wt}(T)$.

The relation above yields a relatively simple formula for $f_n(s, t)$.

Theorem 2. $f_n(s, t) = \sum_{k \geq 0} \binom{n-k-1}{k} s^{n-2k-1} t^k$.

Proof. By Theorem 1, it suffices to show that the number of tilings of $n - 1$ squares with weight $s^{n-2k-1} t^k$ is $\binom{n-k-1}{k}$. This is true since the number of ways to arrange the $n - 2k - 1$ monominoes and $k$ dominoes is $\binom{(n-2k-1)+k}{k} = \binom{n-k-1}{k}$.

Remark 1. Taking $s = t = 1$ in Theorem 2, we recover an expression for the Fibonacci sequence:
\[ F_n = \sum_{k \geq 0} \binom{n-k-1}{k}. \]

In the remainder of the paper, we will simply write $f_n = f_n(s, t)$.

3 Periodicity modulo n

In this section, we will find the period of generalized Fibonacci sequences modulo $n$ for squarefree $n$. Since generalized Fibonacci sequences are defined by a second-order recurrence, each pair of consecutive terms modulo $n$ completely determines the remainder of the sequence. Hence, instead of considering the sequence of terms modulo $n$, we will instead consider the sequence of pairs of adjacent terms modulo $n$; these two sequences have the same period. We can see that this new sequence of pairs is periodic by noting that there are only $n^2$ possibilities for consecutive pairs $(f_k, f_{k+1})$ modulo $n$. Hence, the pairs of consecutive terms must repeat modulo $n$. It is natural, then, to look at when this sequence starts to be periodic.

Theorem 3. Fix $s, t \in \mathbb{Z}$ and let $r$ be a positive integer with $\gcd(r, t) = 1$. Then there exists a positive integer $m$ such that $r \mid f_m$.
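The tiling results of Section 2 (Theorems 1 and 2) can be sanity-checked numerically for integer values of s and t. The sketch below is an illustration only; the helper names `f_rec`, `f_binomial`, `tilings`, and `weight_sum` are assumptions made here, and a tiling is encoded as a tuple of tile lengths (1 for a monomino, 2 for a domino).

```python
from math import comb


def f_rec(n, s, t):
    """f_n(s, t) from the recurrence in Definition 1."""
    a, b = 0, 1                       # f_0, f_1
    for _ in range(n):
        a, b = b, s * b + t * a
    return a


def f_binomial(n, s, t):
    """f_n(s, t) from the binomial sum in Theorem 2."""
    return sum(comb(n - k - 1, k) * s ** (n - 2 * k - 1) * t ** k
               for k in range((n - 1) // 2 + 1))


def tilings(n):
    """Yield every linear tiling of a row of n squares as a tuple of tile lengths."""
    if n == 0:
        yield ()
    if n >= 1:
        for rest in tilings(n - 1):
            yield (1,) + rest
    if n >= 2:
        for rest in tilings(n - 2):
            yield (2,) + rest


def weight_sum(n, s, t):
    """Sum of wt(T) = s^a t^b over all tilings T of a row of n squares."""
    return sum(s ** T.count(1) * t ** T.count(2) for T in tilings(n))


for n in range(8):
    # Theorem 1: f_{n+1} equals the weighted count; Theorem 2: the binomial formula.
    print(n, f_rec(n + 1, 3, 2), weight_sum(n, 3, 2), f_binomial(n + 1, 3, 2))
```

With s = 2 and t = -1 the same functions reproduce Example 2, where the n-th term is n itself.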

Proof. By the periodicity argument above, we know that we can find $p, q$ ($p < q$) such that $f_p \equiv f_q \pmod{r}$ and $f_{p+1} \equiv f_{q+1} \pmod{r}$. Now, $t f_{p-1} = f_{p+1} - s f_p$ and $t f_{q-1} = f_{q+1} - s f_q$. So, $t f_{p-1} \equiv t f_{q-1} \pmod{r}$. Since $\gcd(r, t) = 1$, this implies that $f_{p-1} \equiv f_{q-1} \pmod{r}$. Applying a similar argument, $f_{p-2} \equiv f_{q-2} \pmod{r}$. Repeating this $p$ times, $0 = f_{p-p} \equiv f_{q-p} \pmod{r}$, which implies the desired result.

This means that the Fibonacci sequence is purely periodic modulo $r$ for any $r \geq 2$. Figure 1 depicts some values of the period of the Fibonacci sequence modulo $n$, for small values of $n$.

[Figure 1: plot of π(n) against n; horizontal axis n from 0 to 60, vertical axis π(n) from 0 to 300.]

Figure 1: The period π(n) of the Fibonacci sequence modulo n for 1 ≤ n ≤ 67.

So how do we find these periods? Although an algorithm can be used to find them, no explicit expression is known for the period modulo $n$. We will investigate periods of generalized Fibonacci sequences by looking at the matrix
\[ U = \begin{pmatrix} s & t \\ 1 & 0 \end{pmatrix}. \]
Note that
\[ U \begin{pmatrix} f_{k+1} \\ f_k \end{pmatrix} = \begin{pmatrix} f_{k+2} \\ f_{k+1} \end{pmatrix}. \]
This means that the matrix shifts pairs of consecutive terms to the next such pair (i.e. can relate consecutive pairs linearly). Since the period is the smallest $m$ such that shifting by $m$ indices does not change the residue modulo $n$, it is the smallest $m$ such that $U^m \equiv I \pmod{n}$. In the case where $n$ is a prime, we can look at eigenvalues of this matrix to find some divisibility properties of the period.

First, we define the number-theoretic notion of multiplicative order, or just order for short.

Definition 3. The order modulo $n$, denoted $\mathrm{ord}_n(a)$, of a number $a$ relatively prime to $n$ is the smallest positive integer $m$ such that $a^m \equiv 1 \pmod{n}$.

Theorem 4. Let $p$ be an odd prime, $k(p)$ be the period modulo $p$, and $D := s^2 + 4t$ be the discriminant of the characteristic polynomial of $U$. Assume that $D \not\equiv 0 \pmod{p}$ and $t \neq 0$. If $D$ is a quadratic residue mod $p$, then $k(p) \mid p - 1$. Otherwise, we have that $k(p) \mid 2(p+1) \cdot \mathrm{ord}_p(t^2)$. Note that this divides $p^2 - 1$, which is the order of $\mathbb{F}_{p^2}^\times$.

Proof. If $D$ is a quadratic residue mod $p$, then the discriminant is a square in $\mathbb{F}_p$ and the eigenvalues are in $\mathbb{F}_p$. Also, since $D \not\equiv 0 \pmod{p}$, the eigenvalues are distinct and $U$ is diagonalizable. So, we can consider the period to be the order of the eigenvalues in $\mathbb{F}_p$. By Fermat’s little theorem, we have that $U^{p-1} \equiv I \pmod{p}$.

When $D$ is not a quadratic residue mod $p$, the eigenvalues are not in $\mathbb{F}_p$. In this case, we look at the field extension $\mathbb{F}_{p^2} = \{a + b\gamma : a, b \in \mathbb{F}_p\}$, where $\gamma$ is one of the roots of the characteristic polynomial of the recurrence. If $\bar{\gamma}$ is the other root of the characteristic polynomial, then it follows from the properties of the Frobenius automorphism that $\bar{\gamma} = \gamma^p$ in $\mathbb{F}_{p^2}$. Using this with $\gamma \bar{\gamma} = -t$, it follows that $\gamma^{2(p+1)} = t^2$ in $\mathbb{F}_p$. Since the order of the eigenvalues in $\mathbb{F}_{p^2}^\times$ divides $2(p+1) \cdot \mathrm{ord}_p(t^2)$, we are done.

We can also use similar ideas to relate the periods of generalized Fibonacci sequences over different moduli.

Theorem 5. The period of generalized Fibonacci sequences modulo $\mathrm{lcm}(m_1, m_2)$ is equal to the least common multiple of periods modulo $m_1$ and $m_2$.

This means that we can find the period modulo $n$ for any squarefree $n$. In addition, similar methods are used to study the periodicity of general linear recurrences.

References

[1] T. Amdeberhan, X. Chen, V. H. Moll, B. E. Sagan, Generalized Fibonacci polynomials and Fibonomial coefficients, arXiv:1306.6511 (2013) 1-19.

[2] S. Gupta, P. Rockstroh, F. E. Su, Splitting fields and periods of Fibonacci sequences modulo primes, Math. Mag. 85 (2012) 130-135.

[3] V. E. Hoggatt and C. T. Long, Divisibility properties of generalized Fibonacci polynomials, Fibonacci Quart. 12 (1974) 113-120.
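The periods discussed in Section 3 are easy to compute by brute force, which gives a way to experiment with Theorems 4 and 5. The following is a minimal sketch under stated assumptions: the function names `period`, `order`, and `theorem4_bound` are illustrative, the period is found by stepping through consecutive pairs until the starting pair (0, 1) recurs, and quadratic residues are detected with Euler's criterion, which is a standard fact not spelled out in the article.

```python
from math import gcd


def period(s, t, n):
    """Period modulo n of f_k = s*f_{k-1} + t*f_{k-2} with f_0 = 0, f_1 = 1."""
    if gcd(t, n) != 1:
        raise ValueError("need gcd(t, n) = 1 for the sequence to be purely periodic")
    start = (0, 1 % n)
    pair, k = start, 0
    while True:
        # Shift the pair (f_k, f_{k+1}) to (f_{k+1}, f_{k+2}) modulo n.
        pair = (pair[1], (s * pair[1] + t * pair[0]) % n)
        k += 1
        if pair == start:
            return k


def order(a, n):
    """Multiplicative order of a modulo n (Definition 3); assumes gcd(a, n) = 1, n > 1."""
    m, x = 1, a % n
    while x != 1:
        x = x * a % n
        m += 1
    return m


def theorem4_bound(s, t, p):
    """The multiple of k(p) promised by Theorem 4, for an odd prime p."""
    D = s * s + 4 * t
    if D % p == 0 or t % p == 0:
        raise ValueError("Theorem 4 assumes D and t are nonzero modulo p")
    if pow(D, (p - 1) // 2, p) == 1:     # Euler's criterion: D is a quadratic residue
        return p - 1
    return 2 * (p + 1) * order(t * t, p)


# Fibonacci case s = t = 1 (D = 5, so p = 5 is excluded).
for p in [3, 7, 11, 13, 17, 19, 23, 29]:
    k = period(1, 1, p)
    print(p, k, theorem4_bound(1, 1, p) % k == 0)

# Theorem 5 for the moduli 2 and 3.
lcm_2_3 = period(1, 1, 2) * period(1, 1, 3) // gcd(period(1, 1, 2), period(1, 1, 3))
print(period(1, 1, 6) == lcm_2_3)
```

The brute-force search takes at most n² steps, matching the counting argument that there are only n² possible pairs modulo n.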

The Young-Frobenius Identity
Mitchell Lee

Young tableaux¹ were introduced by the Reverend Alfred Young in 1900 as part of his series of papers on invariant theory, a branch of algebra dealing with actions on algebraic varieties. Subsequently, they were adopted by Georg Frobenius to describe the representations of the symmetric group. Since then, they have become indispensable in invariant theory, representation theory, and algebraic geometry, and the study of Young tableaux has become a central part of combinatorics. However, despite the ubiquity of Young tableaux, they can be defined in an elementary fashion, and we will even be able to prove some interesting things about them with a bit of linear algebra.

¹ The word tableau is borrowed from French, where it means “picture.” The plural of “tableau” is “tableaux,” pronounced in the same way.

Let $n \geq 0$ be an integer. Then, a partition of $n$ is a sequence $\lambda = (\lambda_1, \ldots, \lambda_k)$ of integers such that $\lambda_1 \geq \cdots \geq \lambda_k > 0$ and $\lambda_1 + \cdots + \lambda_k = n$. More informally, it is a way of writing $n$ as the sum of some positive integers, without regard to their order. For example, the partitions of the number 5 are (5), (4,1), (3,2), (3,1,1), (2,2,1), (2,1,1,1), (1,1,1,1,1). The numbers $\lambda_1, \ldots, \lambda_k$ are called the parts of the partition $\lambda$, and we use the notation $\lambda \vdash n$ to denote that $\lambda$ is a partition of $n$. It is interesting to note that there is no simple known formula for the number $p(n)$ of partitions of the number $n$.

The Young diagram (or just “diagram” for short) of the partition $\lambda = (\lambda_1, \ldots, \lambda_k)$ is a figure consisting of $k$ left-justified rows of boxes, the $i$th of which has $\lambda_i$ boxes. We make no distinction between a partition and its corresponding Young diagram; in particular, we will use the same letter $\lambda$ to refer to both the partition and its Young diagram. Table 1 depicts examples of Young diagrams. In this article, we will draw Young diagrams with the first row at the top, a convention known as English notation. However, it is worth mentioning that some authors draw Young diagrams with the first row at the bottom, which is known as French notation.

A Young tableau $T$ is a Young diagram $\lambda$, where the boxes of $\lambda$ are filled in with the numbers $1, \ldots, n$ in some order. The diagram $\lambda$ is called the shape of the tableau $T$. The tableau $T$ is called standard if the numbers written in the boxes increase along every row and column. For example, the following is a standard Young tableau of shape $\lambda = (4, 3, 3, 1)$:

1 4 6 7
2 5 8
3 9 10
11

Naturally, the first question we should ask is: How many standard Young tableaux are there? Denote the number of standard Young tableaux of a given shape $\lambda$ by $f^\lambda$.² (Table 2 shows some examples of $f^\lambda$.) The number of standard Young tableaux of size $n$ is then
\[ \sum_{\lambda \vdash n} f^\lambda, \]
where the sum ranges over all partitions $\lambda$ of $n$. As discussed earlier, no one knows how to count the partitions of size $n$ by a simple formula, so it would be perfectly reasonable to expect that we cannot count standard Young tableaux of size $n$ either. However, there is a formula for this quantity which, at first glance, seems to come from nowhere:
\[ \sum_{\lambda \vdash n} f^\lambda = \sum_{k=0}^{\lfloor n/2 \rfloor} \frac{n!}{2^k \, (n-2k)! \, k!}. \]
(This number is also equal to the number of functions $f : \{1, \ldots, n\} \to \{1, \ldots, n\}$ which are their own inverses.) But an even more striking equation governs the sum of the squares of the $f^\lambda$:
\[ \sum_{\lambda \vdash n} \left( f^\lambda \right)^2 = n!. \]
This celebrated result is called the Young-Frobenius identity.

² Quite surprisingly, there is actually quite a simple formula for $f^\lambda$:
\[ f^\lambda = \frac{n!}{\prod_{x \in \lambda} h(x)}, \]
where the product is over all boxes $x$ of $\lambda$ and $h(x)$ is the number of boxes of $\lambda$ either directly below or directly to the right of $x$, including $x$ itself. This is known as the hook-length formula, and has seen many different proofs since Robinson and Thrall discovered it independently on the same day in 1953.

The Young-Frobenius identity was first proved using properties of the irreducible representations of the symmetric group $S_n$. In 1961, Schensted gave a brilliant combinatorial proof of the identity: he exhibited a bijection, now known as the Robinson-Schensted correspondence, between the set of permutations of $\{1, \ldots, n\}$ and the set of pairs $(P, Q)$ of standard Young tableaux of size $n$, where $P$ and $Q$ have the same shape $\lambda$. In what follows, we will prove the Young-Frobenius identity using some elementary linear algebra.

The set of all Young diagrams of all sizes (including the empty Young diagram $\emptyset$, which has no boxes and size 0) is denoted $Y$. If $\lambda$ and $\mu$ are both Young diagrams, we write $\lambda \geq \mu$ if every box of $\mu$ is also a box of $\lambda$. For example,

[pictured example: two Young diagrams related by ≥; diagrams not reproduced.]

The ordering $\leq$ is a partial order, and it gives $Y$ the structure of a lattice, which is a special kind of partially ordered set. Indeed, $Y$ is often called Young’s lattice. We say that $\lambda$ covers $\mu$, written $\lambda \succ \mu$, if $\lambda$ can be formed from $\mu$ by adding only one box.

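λ            rows of boxes, top to bottom
(5)          □□□□□
(4,1)        □□□□ / □
(3,2)        □□□ / □□
(3,1,1)      □□□ / □ / □
(2,2,1)      □□ / □□ / □
(2,1,1,1)    □□ / □ / □ / □
(1,1,1,1,1)  □ / □ / □ / □ / □

Table 1: The Young diagrams for the partitions of 5.

λ      (5)  (4,1)  (3,2)  (3,1,1)  (2,2,1)  (2,1,1,1)  (1,1,1,1,1)
f^λ     1     4      5       6        5         4           1

Table 2: The values of f^λ for the partitions λ of 5.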
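The entries of Table 2, and both summation identities stated in the article, can be checked with the hook-length formula from the footnote. This is a small sketch for experimentation rather than part of the proof; the helper names `partitions` and `num_syt` are assumptions introduced here, and a partition is represented as a tuple of its parts in weakly decreasing order.

```python
from math import factorial


def partitions(n, largest=None):
    """Yield the partitions of n as weakly decreasing tuples of positive integers."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def num_syt(shape):
    """f^lambda via the hook-length formula: n! divided by the product of hook lengths."""
    n = sum(shape)
    # Column lengths of the diagram (the conjugate partition).
    cols = [sum(1 for row in shape if row > j) for j in range(shape[0])] if shape else []
    hooks = 1
    for i, row in enumerate(shape):
        for j in range(row):
            right = row - j - 1          # boxes directly to the right of box (i, j)
            below = cols[j] - i - 1      # boxes directly below box (i, j)
            hooks *= right + below + 1   # the hook also contains the box itself
    return factorial(n) // hooks


print([num_syt(lam) for lam in partitions(5)])

for n in range(1, 8):
    counts = [num_syt(lam) for lam in partitions(n)]
    involutions = sum(factorial(n) // (2 ** k * factorial(n - 2 * k) * factorial(k))
                      for k in range(n // 2 + 1))
    # Young-Frobenius identity, and the formula for the total number of tableaux.
    print(n, sum(c * c for c in counts) == factorial(n), sum(counts) == involutions)
```

The generator lists the partitions of 5 in the same order as the tables, so the first printed list can be compared with Table 2 entry by entry.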

Recall that $\lambda \succ \mu$ means that $\lambda$ can be formed from $\mu$ by adding only one box. For example,

[pictured example: two Young diagrams related by ≻; diagrams not reproduced.]

Figure 1 depicts Young’s lattice and its covering relation. It contains a segment down from the diagram $\lambda$ to the diagram $\mu$ if $\lambda \succ \mu$. The importance of the covering relation is that standard Young tableaux $T$ of shape $\lambda$ are in correspondence with sequences $(\lambda_0, \ldots, \lambda_n)$ of Young diagrams with
\[ \emptyset = \lambda_0 \prec \lambda_1 \prec \cdots \prec \lambda_{n-1} \prec \lambda_n = \lambda. \]
The standard Young tableau $T$ corresponds to the sequence $(\lambda_0, \ldots, \lambda_n)$ where $\lambda_k$ is the Young diagram consisting of all the boxes of $T$ which are filled with a number less than or equal to $k$. For example, the standard Young tableau

1 3 4 6
2 5

corresponds to the sequence, written here as partitions,
\[ \bigl( \emptyset,\ (1),\ (1,1),\ (2,1),\ (3,1),\ (3,2),\ (4,2) \bigr). \]

So we have reduced the Young-Frobenius identity to a statement about the covering relation on Young’s lattice.

We now establish the following properties (i) and (ii). These two properties make $Y$ into a differential poset. The reason for this terminology will become more clear shortly.

(i) If $\lambda$ is a Young diagram, then the number of Young diagrams covering $\lambda$ is exactly one more than the number of Young diagrams covered by $\lambda$.

(ii) If $\lambda$ and $\mu$ are two different Young diagrams, then the number of Young diagrams covering both $\lambda$ and $\mu$ is exactly equal to the number of Young diagrams covered by both $\lambda$ and $\mu$.

First, we prove (i). If $\lambda$ is a Young diagram, then a corner of $\lambda$ is a box inside $\lambda$ at the end of both its row and column. A dual corner of $\lambda$ is a box outside of $\lambda$ which is bordered by boxes of $\lambda$ above it and to its left. Figure 2 illustrates these definitions. The Young diagrams covered by $\lambda$ can be obtained by starting with $\lambda$ and removing a box at a corner. The Young diagrams covering $\lambda$ can be obtained by starting with $\lambda$ and adding a box at a dual corner. Imagine walking up the lower-right boundary of a Young diagram $\lambda$. You will see dual corners to your right and corners to your left. They will alternate, starting and ending with a dual corner. The number of dual corners of $\lambda$ is thus one more than the number of corners of $\lambda$. Put a different way, the number of Young diagrams covering $\lambda$ is one more than the number of Young diagrams covered by $\lambda$.

Now, we prove (ii). Suppose that $\lambda$ and $\mu$ are two different Young diagrams, which are both covered by the Young diagram $\nu$. A little bit of thought reveals that $\nu = \mu \cup \lambda$; that is, the boxes of $\nu$ are exactly those boxes contained in either $\lambda$ or $\mu$. The Young diagram $\mu \cap \lambda$, whose boxes are exactly those boxes contained in both $\lambda$ and $\mu$, is then covered by both $\lambda$ and $\mu$. So if there is a Young diagram covering both $\lambda$ and $\mu$, then (1) that diagram is unique, and (2) there is also a Young diagram covered by both $\lambda$ and $\mu$. A similar argument reveals that if there is a Young diagram covered by both $\lambda$ and $\mu$, then (1) that diagram is unique, and (2) there is also a Young diagram covering both $\lambda$ and $\mu$. This implies that the number of Young diagrams covering both $\lambda$ and $\mu$ is exactly equal to the number of Young diagrams covered by both $\lambda$ and $\mu$ (and furthermore that it is less than or equal to 1).

The importance of the two properties (i) and (ii) was first recognized by Professor Richard Stanley in his 1988 paper on differential posets, which presents a generalization of the Young-Frobenius identity and related results. The interested reader can find the paper on Professor Stanley’s website; here we will stick to the specific case of Young tableaux. It turns out that the statements (i) and (ii) translate very nicely into the language of linear algebra. Let $\mathbb{R}[Y]$ be the vector space of all


Figure 1: A Hasse diagram of the bottom part of Young’s lattice.

formal linear combinations of elements of Y with real coefficients.³ In other words, $\mathbb{R}[Y]$ is the vector space over the real numbers that has Y as a basis. For example, this vector space contains elements such as a single Young diagram, 0, and real linear combinations of several Young diagrams [example diagrams not recoverable].

³ In other applications of Young's lattice, it is more useful to consider linear combinations of elements of Y with complex coefficients. These form a vector space called $\mathbb{C}[Y]$. However, that will not be necessary here.

Figure 2: A sample Young diagram with the corners colored gray. The dual corners are filled by dots.

The covering relation $\prec$ induces two linear operators U (for up) and D (for down) on $\mathbb{R}[Y]$. If λ is a Young diagram, then let

$$U\lambda := \sum_{\mu \succ \lambda} \mu,$$

the sum of all the diagrams covering λ, and

$$D\lambda := \sum_{\mu \prec \lambda} \mu,$$

the sum of all the diagrams covered by λ. (In particular, $D\emptyset = 0$.) Since a linear operator on a vector space is uniquely determined by its values on a basis, U and D extend uniquely to linear operators on all of $\mathbb{R}[Y]$. [Worked examples of U and D applied to small diagrams not recoverable.]

The coefficient of µ in Uλ is 1 if it is possible to start at λ, take a step up in Young's lattice (see Figure 1), and end up at µ; otherwise, it is 0. Similarly, the coefficient of µ in Dλ is 1 if it is

possible to start at λ, take a step down in Young's lattice, and end up at µ; otherwise, it is 0.

If λ is a Young diagram of size n, then the coefficient of λ in $U^n \emptyset$ is the number of ways to start at $\emptyset$, take n steps up, and then end up at λ. In other words, it is exactly $f^\lambda$, the number of standard Young tableaux of shape λ. That is,

$$U^n \emptyset = \sum_{\lambda \vdash n} f^\lambda \lambda, \tag{1}$$

where the sum is over all partitions λ of n.

Similarly, if λ is any partition of n, then the coefficient of $\emptyset$ in $D^n \lambda$ is the number of ways to start at λ, take n steps down, and then end up at $\emptyset$. Again, this is equal to $f^\lambda$. Therefore, for any partition λ of n,

$$D^n \lambda = f^\lambda \emptyset. \tag{2}$$

Combining equations (1) and (2),

$$\begin{aligned}
D^n U^n \emptyset &= D^n \Bigl( \sum_{\lambda \vdash n} f^\lambda \lambda \Bigr) \\
&= \sum_{\lambda \vdash n} f^\lambda D^n \lambda \\
&= \Bigl( \sum_{\lambda \vdash n} \bigl(f^\lambda\bigr)^2 \Bigr) \emptyset.
\end{aligned}$$

Thus, to prove the Young-Frobenius identity, it is enough to show that $D^n U^n \emptyset = n!\, \emptyset$.

The next step is to translate the combinatorial properties (i) and (ii) of Y into statements involving U and D. First, if λ and µ are any Young diagrams, then the coefficient of µ in U(Dλ) is the number of ways to start at λ in Young's lattice, take a step down, take a step up, and then end up at µ. In other words, it is the number of Young diagrams covered by both λ and µ. On the other hand, the coefficient of µ in D(Uλ) is the number of ways to start at λ, take a step up, take a step down, and then end up at µ. In other words, it is the number of Young diagrams covering both λ and µ.

Therefore, the coefficient of µ in $(DU - UD)\lambda$ is the number of Young diagrams covering both λ and µ, minus the number of Young diagrams covered by both λ and µ. By properties (i) and (ii), this is equal to

$$\begin{cases} 1 & \text{if } \mu = \lambda, \\ 0 & \text{otherwise.} \end{cases}$$

So $(DU - UD)\lambda = \lambda$ for all Young diagrams λ. Since a linear operator is completely determined by its values on a basis, we conclude that

$$DU - UD = I, \tag{3}$$

where I is the identity operator on $\mathbb{R}[Y]$.⁴

⁴ A similar equation relates differentiation and multiplication by x on the vector space $\mathbb{R}[x]$ of all polynomials in the variable x with real coefficients:

$$\frac{d}{dx}\, x - x\, \frac{d}{dx} = I.$$

This is the justification for the term "differential poset." In a loose sense, D is $\partial / \partial U$. Furthermore, it is interesting to note that if A and B are operators on a finite-dimensional real vector space, then $AB - BA \neq I$. We leave the proof as an exercise for the reader.

All that is left is to work through some algebra with D and U. First, we leave it to the reader to verify, by a standard inductive argument using (3), that $DU^n - U^n D = nU^{n-1}$ for all integers $n \geq 1$. Recall that we reduced the Young-Frobenius identity to the statement $D^n U^n \emptyset = n!\, \emptyset$. We proceed by induction on n. The base case n = 0 is trivial. For the inductive step n > 0,

$$\begin{aligned}
D^n U^n \emptyset &= \bigl( D^{n-1}(DU^n - U^n D) + D^{n-1} U^n D \bigr) \emptyset \\
&= \bigl( n D^{n-1} U^{n-1} + D^{n-1} U^n D \bigr) \emptyset \\
&= n D^{n-1} U^{n-1} \emptyset + D^{n-1} U^n D \emptyset \\
&= n (n-1)!\, \emptyset + 0 \\
&= n!\, \emptyset
\end{aligned}$$

as desired, where the last term vanishes because $D\emptyset = 0$. This completes the proof of the Young-Frobenius identity.
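The argument above can be checked by brute force on small cases. The following is a minimal sketch, not part of the original article: it represents a Young diagram as a tuple of row lengths and an element of R[Y] as a dictionary from diagrams to coefficients. The function names (`up`, `down`, `apply`, `commutator`, `f_values`) are illustrative choices made here.

```python
from collections import Counter
from math import factorial


def up(shape):
    """Young diagrams covering `shape` (add a box at a dual corner)."""
    rows = list(shape)
    out = []
    for i in range(len(rows) + 1):
        new_len = (rows[i] if i < len(rows) else 0) + 1
        # Row i may grow only if it stays no longer than the row above it.
        if i == 0 or rows[i - 1] >= new_len:
            out.append(tuple(rows[:i] + [new_len] + rows[i + 1:]))
    return out


def down(shape):
    """Young diagrams covered by `shape` (remove a box at a corner)."""
    rows = list(shape)
    out = []
    for i in range(len(rows)):
        below = rows[i + 1] if i + 1 < len(rows) else 0
        # Row i ends in a corner if it is strictly longer than the row below.
        if rows[i] > below:
            new = rows[:i] + [rows[i] - 1] + rows[i + 1:]
            out.append(tuple(r for r in new if r > 0))
    return out


def apply(step, vec):
    """Extend `up` or `down` linearly to a combination {shape: coefficient}."""
    result = Counter()
    for shape, coeff in vec.items():
        for nxt in step(shape):
            result[nxt] += coeff
    return result


def commutator(shape):
    """(DU - UD) applied to a single basis diagram, with zero terms dropped."""
    vec = Counter({shape: 1})
    du = apply(down, apply(up, vec))
    ud = apply(up, apply(down, vec))
    diff = {s: du[s] - ud[s] for s in set(du) | set(ud)}
    return {s: c for s, c in diff.items() if c != 0}


def f_values(n):
    """U^n applied to the empty diagram: maps each shape of size n to f^lambda."""
    vec = Counter({(): 1})
    for _ in range(n):
        vec = apply(up, vec)
    return vec


for n in range(7):
    f = f_values(n)
    # Equation (3) on every diagram of size n.
    assert all(commutator(shape) == {shape: 1} for shape in f)
    # The Young-Frobenius identity: the squares of the f^lambda sum to n!.
    assert sum(c * c for c in f.values()) == factorial(n)
    # D^n U^n (empty) = n! (empty).
    vec = f
    for _ in range(n):
        vec = apply(down, vec)
    assert dict(vec) == {(): factorial(n)}
```

A finite check like this proves nothing for general n, but it is a quick way to catch a sign or indexing slip when working through the algebra by hand.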

Sieve Methods and the Twin Prime Conjecture

Ping Ngai Chung

Yitang Zhang's recent solution [9] to the "weak" twin prime conjecture has reignited interest in this well-known conjecture. In this article, we take this opportunity to introduce the fundamental tool for the so-called twin-prime type problems, the sieve methods. We shall introduce the basic setting of sieve theory, and then an application to the famous Brun's theorem. In the second-to-last section, we include a brief discussion of its limitation, the parity problem. We inevitably omit most of the details to give a clearer big picture. We refer the readers to the excellent expositions by Cojocaru-Murty [2], Friedlander-Iwaniec [3] and Iwaniec-Kowalski [7] for a more detailed account. For more advanced readers interested in Zhang's proof, we recommend Ram Murty's nice article [8].

1 Introduction

1.1 Motivation

Sieve theory is a fundamental tool to detect the additive properties of integers with few prime factors, for instance the primes themselves. Since prime numbers are "multiplicative" in nature, it is hard in general to answer simple "additive" questions about primes, like when p + 2 is prime given a prime p (twin prime conjecture) or when an integer is the sum of two primes (Goldbach's conjecture), even using very powerful tools about primes. For instance, while the prime number theorem gives a precise description of the asymptotic behavior of the primes, it gives almost no information about the above two questions, in the sense that one can easily construct many infinite sequences of positive integers in place of the primes, which satisfy the asymptotic behavior of the primes, yet fail to satisfy the predictions of both the twin prime conjecture and the Goldbach conjecture.

There was very little progress on either conjecture until 1915, when Viggo Brun introduced Brun's sieve. As a corollary, he proved that there are infinitely many integers n such that n and n + 2 each have at most nine prime factors. He also showed that all sufficiently large even integers are the sum of two integers, each having at most nine prime factors. These are considered tremendous advances towards the two well-known conjectures. Another consequence, Brun's theorem, will be discussed in section 2. This fundamental work eventually led to the development of modern sieve theory.

The general idea of sieve theory is as follows: given a finite set of integers $\mathcal{A}$, we remove the elements in $\mathcal{A}$ that are congruent to some fixed residue(s) modulo p for each prime p in a given set of primes $\mathcal{P}$. Then we count the number of remaining elements. These are usually integers with few prime factors. For instance, to count the number of primes in $\mathcal{A}$, it suffices to remove all the elements congruent to 0 modulo each small prime p, i.e. the primes not greater than the square root of the largest element in $\mathcal{A}$. The flexibility of the choice of $\mathcal{A}$ and $\mathcal{P}$ allows us to keep track of different additive properties of the integers with few prime factors.

1.2 The general setting

We now give the general setting of sieve theory, following the discussion in [2]. Throughout the article, we let N be a real number, $\mathcal{A}$ be a set of natural numbers at most N and $\mathcal{P}$ be a set of primes. We always use p to denote a prime number, and d to denote a square-free number, that is, an integer not divisible by any perfect square greater than 1. We adopt the usual asymptotic notations $O$ and $\ll$ relative to the parameter N as N goes to infinity. For each prime $p \in \mathcal{P}$, we fix ω(p) residue classes modulo p and label them as "distinguished". Let $\mathcal{A}_p$ denote the set of elements of $\mathcal{A}$ belonging to at least one of the distinguished residue classes modulo p. For any positive real number z, and square-free integer d that is a product of primes in $\mathcal{P}$, we define

$$\mathcal{A}_d := \bigcap_{p \mid d} \mathcal{A}_p, \qquad \omega(d) := \prod_{p \mid d} \omega(p), \qquad P(z) := \prod_{p \in \mathcal{P},\ p < z} p.$$

We denote

$$S(\mathcal{A}, \mathcal{P}, z) := \Bigl| \mathcal{A} \setminus \bigcup_{p \mid P(z)} \mathcal{A}_p \Bigr|.$$

The primary goal of sieve theory is to estimate $S(\mathcal{A}, \mathcal{P}, z)$. Intuitively, $S(\mathcal{A}, \mathcal{P}, z)$ is the number of remaining elements in the "ambient set" $\mathcal{A}$ after "sifting" out elements satisfying certain congruence conditions. As we shall see in later sections, an estimate of this quantity has many consequences for the gaps between primes under careful choices of $\mathcal{A}$ and $\mathcal{P}$.

1.3 Sieve of Eratosthenes

The sieve of Eratosthenes is a well-known algorithm to generate primes. It was first described by Nicomachus in his Introduction to Arithmetic. We first review its main ideas, which are essential for all the subsequent sieves.

We consider the set of positive integers greater than 1 and less than a fixed positive real number N, and iteratively mark the elements that are certainly primes and composites, until all the primes can be identified. In each iteration, we mark the

smallest unmarked element as prime, and mark all its multiples (except itself) as composites.

Although this algorithm gives us much more information about the primes, we are primarily interested in counting the number of primes less than N. This can be obtained by the inclusion-exclusion principle. Using the language of sieve theory, we take $\mathcal{A}$ to be the set of all the integers less than N and $\mathcal{P}$ to be the set of all primes. For each prime $p \in \mathcal{P}$, we distinguish the residue class congruent to 0 modulo p. Let µ be the Möbius function, i.e.

$$\mu(n) := \begin{cases} (-1)^r & \text{if } n \text{ is square-free and has } r \text{ prime factors,} \\ 0 & \text{otherwise.} \end{cases}$$

Then for each z > 0, the number of primes less than N satisfies

$$\pi(N) \leq S(\mathcal{A}, \mathcal{P}, z) + z = \sum_{d \mid P(z)} \mu(d)\, |\mathcal{A}_d| + z. \tag{1}$$

In fact, since the smallest prime factor of a composite number is at most its square root, $S(\mathcal{A}, \mathcal{P}, z) + \pi(z) - 1$ gives the exact value of π(N) for $N > z > N^{1/2}$.

One may further compute $S(\mathcal{A}, \mathcal{P}, z)$ asymptotically. Since $|\mathcal{A}_d| = \lfloor N/d \rfloor$ and $\lfloor x \rfloor = x + O(1)$,

$$\begin{aligned}
\sum_{d \mid P(z)} \mu(d)\, |\mathcal{A}_d| &= \sum_{d \mid P(z)} \mu(d) \left\lfloor \frac{N}{d} \right\rfloor \\
&= N \sum_{d \mid P(z)} \frac{\mu(d)}{d} + O(2^z) \\
&= N \prod_{p < z} \left( 1 - \frac{1}{p} \right) + O(2^z).
\end{aligned} \tag{2}$$

We observe that the main term is the expected value of $S(\mathcal{A}, \mathcal{P}, z)$, if the events of sifting out elements in $\mathcal{A}$ divisible by each prime in $\mathcal{P}$ are mutually independent. This seems like a promising way to estimate π(N) at first glance. However, the error term is so large that we can get a nontrivial estimate only if $z = O(\log N)$. To illustrate the insufficiency, we take $z = c \log N$ for some small c > 0; then (1) and (2) give

$$\pi(N) \ll \frac{N}{\log\log N},$$

an estimate much weaker than the actual behavior $\pi(N) \sim N/\log N$ given by the prime number theorem.

1.4 The formal statement

The above discussion gives a rough idea of the modern treatment of the sieve of Eratosthenes. One may hope that a more careful analysis of the cancellation among the error terms may give a better bound. This is indeed true, when combined with "Rankin's trick" [2, Sec. 5.3]. Furthermore, the ideas can be generalized to the case with more than one distinguished residue class modulo each prime in $\mathcal{P}$, given certain regularity conditions. We shall first state the main theorem, and then discuss its important parts. The statement itself may look unnatural to first-time readers, but hopefully the remarks that follow will elucidate the key points of this theorem, and we shall give a direct application in the next section.

Suppose there is a real number X such that

$$|\mathcal{A}_d| = X\, \frac{\omega(d)}{d} + R_d$$

for some real number $R_d$. In practice, one can usually take $X \approx |\mathcal{A}|$. Then we have the following theorem.

Theorem 2. (The sieve of Eratosthenes) Assume the above notation. Suppose that the following conditions are satisfied:

1. $|R_d| = O(\omega(d))$,

2. for some $\kappa \geq 0$,
$$\sum_{p \mid P(z)} \frac{\omega(p) \log p}{p} \leq \kappa \log z + O(1),$$

3. for some positive real number y, $\#\mathcal{A}_d = 0$ for every square-free d > y.

Then

$$S(\mathcal{A}, \mathcal{P}, z) = X \prod_{p \mid P(z)} \left( 1 - \frac{\omega(p)}{p} \right) + O\bigl(E(X, y, z, \kappa)\bigr),$$

where $E(X, y, z, \kappa)$ is an explicit function of X, y, z, κ (cf. Theorem 5.4.1 in [2]).

We give the following remarks on this statement.

Remark 1. Condition 1 ensures that the elements in $\mathcal{A}_d$ are roughly well-distributed in $\mathcal{A}$. From the probabilistic perspective, this condition requires that the events of sifting out distinguished residue classes modulo each prime in $\mathcal{P}$ are "sufficiently" independent.

Remark 2. The κ in condition 2 is typically called the sieve dimension. In practice, it is the weighted average of the values of ω(p) for $p \in \mathcal{P}$. In particular, if ω(p) is constant for all except finitely many primes in $\mathcal{P}$, one may take κ to be that constant.

Remark 3. The error terms in sieve-theoretical statements are usually fairly complicated. The important point in practice is to find a z = z(X) such that the main term still dominates. In the above theorem, one may take z much greater than log X, and yet slightly smaller than any positive power of X. In other sieves, the error term can be further improved so that we can take $z = X^c$ for some small constant c > 0, perhaps the best that one can hope for.

2 Brun's theorem

The twin prime conjecture predicts that there are infinitely many pairs of primes that differ by 2. As mentioned in the first section, sieve theory is capable of answering certain twin-prime type questions, that is, different weaker forms of this conjecture. To illustrate this point, we shall use the theorem to prove one of the first major achievements of modern sieve theory, Brun's theorem.
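Before turning to Brun's theorem, the counting argument of Section 1.3 can be tried on a small case. The sketch below is an illustration added here, under two simplifying assumptions that are not in the article: N and z are integers, and the ambient set is {1, ..., N}, so that the number of multiples of d is exactly floor(N/d). The function names are hypothetical.

```python
from itertools import combinations
from math import prod


def primes_below(n):
    """Sieve of Eratosthenes: all primes p < n."""
    is_prime = [True] * max(n, 2)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p < n:
        if is_prime[p]:
            # Mark the multiples of p as composite. Starting at p*p is enough,
            # because smaller multiples have a smaller prime factor.
            for m in range(p * p, n, p):
                is_prime[m] = False
        p += 1
    return [k for k in range(n) if is_prime[k]]


def sifted_count(N, z):
    """S(A, P, z) by direct sifting: elements of {1..N} with no prime factor < z."""
    small = primes_below(z)
    return sum(1 for a in range(1, N + 1) if all(a % p for p in small))


def legendre_sum(N, z):
    """The sum of mu(d) * |A_d| over all d dividing P(z), as in equation (1)."""
    small = primes_below(z)
    total = 0
    for r in range(len(small) + 1):
        for combo in combinations(small, r):
            d = prod(combo)                # square-free d with r prime factors
            total += (-1) ** r * (N // d)  # mu(d) = (-1)^r and |A_d| = floor(N/d)
    return total


N, z = 100, 11
S = sifted_count(N, z)
assert S == legendre_sum(N, z)
# For sqrt(N) < z, the survivors are 1 and the primes in [z, N],
# so S + pi(z) - 1 recovers the number of primes up to N.
print(S, S + len(primes_below(z)) - 1)
```

The inclusion-exclusion sum has one term for every divisor of P(z), that is, 2 to the power of the number of primes below z. Each term contributes a rounding error of size O(1), which is the source of the O(2^z) error term in (2) and the reason z cannot be taken large in this naive form.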

Theorem 3. (Brun's theorem) The sum

$$\sum_{p,\ p+2 \text{ both prime}} \frac{1}{p}$$

converges.

We recall that one of the proofs that there are infinitely many primes is to show that the sum of the reciprocals of all primes diverges. Brun's theorem shows that this approach is unable to solve the twin prime conjecture. This was not known until the landmark paper of Viggo Brun in 1919, who solved this problem by introducing a variant of the sieve of Eratosthenes, now known as Brun's sieve. This paper is generally regarded as the start of modern sieve theory. Nonetheless, it was later found that the theorem can be proved using Theorem 2, a combination of the elementary sieve of Eratosthenes and Rankin's trick. In fact, Brun's theorem is an immediate consequence of the following theorem.

Theorem 4. The number of primes $p \leq N$ such that p + 2 is prime is

$$\ll \frac{N (\log\log N)^2}{\log^2 N}.$$

Proof. Let $\mathcal{A}$ be the set of natural numbers at most N and $\mathcal{P}$ be the set of all primes. Let z = z(N) be a positive real number to be chosen. For each p < z, we distinguish the residue classes 0 and −2 modulo p. Then ω(p) = 2 for all primes p with 2 < p < z, ω(2) = 1 and ω(p) = 0 for all $p \geq z$. $\mathcal{A}_p$ is the set of all natural numbers at most N congruent to 0 or −2 modulo p for each prime p < z, and empty for all $p \geq z$. Hence

$$|\mathcal{A}_d| = N\, \frac{\omega(d)}{d} + O(1)$$

for all square-free integers d divisible only by primes less than z. We take X = N in Theorem 2.

Now we check the three conditions of Theorem 2.

1. Since $|R_d| = O(1)$, the first condition is satisfied.

2. We leave it as an exercise to the readers that
$$\sum_{p \leq z} \frac{\log p}{p} = \log z + O(1).$$
A proof can be found in [2, Thm. 1.4.3]. With this in mind, condition 2 is satisfied with κ = 2.

3. Suppose that $\mathcal{A}_d$ is nonempty for some square-free d. Then there is some $x \in \mathcal{A}_d$. By the definition of $\mathcal{A}_d$, we have $d \mid x(x+2)$ and $x \leq N$, so $d \leq x(x+2) \leq N(N+2)$. Thus, we can take y = N(N + 2).

Hence by Theorem 2,

$$S(\mathcal{A}, \mathcal{P}, z) = N \prod_{p < z} \left( 1 - \frac{\omega(p)}{p} \right) + O\bigl(E(N, z)\bigr).$$

By Theorem 1.4.4 in [2],

$$\sum_{p < z} \frac{1}{p} = \log\log z + O(1).$$

Hence, since ω(p) = 2 for every odd prime p < z,

$$\prod_{p < z} \left( 1 - \frac{\omega(p)}{p} \right) \leq \exp\Bigl( - \sum_{p < z} \frac{\omega(p)}{p} \Bigr) \ll (\log z)^{-2}.$$

If we write the expression of E(N, z) explicitly, we shall see that one can choose z so that

$$\log z = c\, \frac{\log N}{\log\log N}$$

for some small positive constant c, and deduce that

$$S(\mathcal{A}, \mathcal{P}, z) \ll \frac{N (\log\log N)^2}{\log^2 N}.$$

The proof is complete by observing that the number of twin primes is at most $z + S(\mathcal{A}, \mathcal{P}, z)$.

3 The parity problem

Despite being a powerful tool for the twin-prime type problems, the sieve methods have a notorious obstacle that seems difficult to overcome: the parity problem. Roughly speaking, the sieve methods are incapable of distinguishing numbers with an odd number of prime factors from numbers with an even number of prime factors. This problem was first identified by Atle Selberg in 1949, who demonstrated it using the following example (cf. [2, Ex. 7.20]).

Let $\Phi_{\text{odd}}(N, z)$ be the number of natural numbers at most N with an odd number of prime factors, counted with multiplicity, such that all its prime factors are greater than z. Similarly define $\Phi_{\text{even}}(N, z)$. Then any known sieve method would yield the same upper bound $(2 + o(1)) N / \log N$ for both $\Phi_{\text{odd}}(N, \sqrt{N})$ and $\Phi_{\text{even}}(N, \sqrt{N})$. Nonetheless, it is a direct consequence of the prime number theorem that

$$\Phi_{\text{even}}(N, \sqrt{N}) = 0, \qquad \Phi_{\text{odd}}(N, \sqrt{N}) = (1 + o(1))\, \frac{N}{\log N}.$$

The problem is perhaps the reason why it is hard for sieves to distinguish primes and "almost primes", that is, products of two distinct primes. This has many significant consequences. A notable example is Chen's theorem [1], which states that there are infinitely many primes p such that p + 2 is either prime or almost prime. The parity problem suggests that it might be difficult to eliminate all the primes p such that p + 2 is almost prime, and hence difficult to solve the twin prime conjecture using the known sieve methods.

4 Final remarks

Of course, it is too early to say that the conjecture cannot be settled by any sieve methods, given that the parity problem is still not very well understood. Starting from 1996, Friedlander and Iwaniec [4] developed some parity-sensitive sieves, attempting to break the parity problem. More recently, Goldston, Graham, Pintz and Yıldırım [6] showed the analogous

conjecture for almost primes, and remarked that their methods are much more successful for almost primes than for primes. They showed that there are infinitely many pairs of almost primes at most 6 apart. By contrast, it was not until recent work by Zhang [9], based on earlier work by Goldston, Pintz and Yıldırım (GPY) [5] using similar methods, that such a constant was shown to exist for primes, and the best known bound is still on the order of thousands. These results are all based on a variant of Selberg's sieve, combined with other important number-theoretic facts. It remains a wide open problem to push the sieve methods beyond the approach of GPY.
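As a numerical companion to Theorems 3 and 4, the following sketch (added here for illustration, with hypothetical function names) counts the primes p up to N with p + 2 prime, compares the count with the quantity N (log log N)^2 / (log N)^2 from Theorem 4, and prints the partial sums of the series in Brun's theorem. The implied constant in Theorem 4 is not specified in the article, so only the ratio is printed.

```python
from math import log


def prime_flags(n):
    """Sieve of Eratosthenes: flags[k] is True exactly when k is prime (0 <= k <= n)."""
    flags = [True] * (n + 1)
    flags[0:2] = [False, False]
    p = 2
    while p * p <= n:
        if flags[p]:
            for m in range(p * p, n + 1, p):
                flags[m] = False
        p += 1
    return flags


def twin_primes(N):
    """Primes p <= N such that p + 2 is also prime."""
    flags = prime_flags(N + 2)
    return [p for p in range(2, N + 1) if flags[p] and flags[p + 2]]


def brun_partial_sum(N):
    """Partial sum of 1/p over primes p <= N with p + 2 prime (the series in Theorem 3)."""
    return sum(1.0 / p for p in twin_primes(N))


def theorem4_scale(N):
    """N (log log N)^2 / (log N)^2, the bound of Theorem 4 without its implied constant."""
    return N * log(log(N)) ** 2 / log(N) ** 2


for N in (10**3, 10**4, 10**5):
    count = len(twin_primes(N))
    print(N, count, round(count / theorem4_scale(N), 3), round(brun_partial_sum(N), 4))
```

The partial sums grow very slowly, which is consistent with convergence but of course does not prove it; likewise the computation says nothing about whether the list of twin primes is infinite.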

References

[1] J.-R. Chen, On the representation of a large even integer as the sum of a prime and a product of at most two primes, Scientia Sinica 16 (1973) 157-176.

[2] A. Cojocaru, M. Murty, An introduction to sieve methods and their applications. London Mathematical Society Student Texts, 66. Cambridge University Press, Cambridge, 2006. xii+224 pp. ISBN: 978-0-521-64275-3; 0-521-61275-6.

[3] J. Friedlander, H. Iwaniec, Opera de cribro, American Mathematical Society Colloquium Publications, 57. American Mathematical Society, Providence, RI, 2010. xx+527 pp. ISBN: 978-0-8218-4970-5.

[4] J. Friedlander, H. Iwaniec, Using a parity-sensitive sieve to count prime values of a polynomial, Proc. Nat. Acad. Sci. U.S.A. 94 (1997), no. 4, 1054–1058.

[5] D. Goldston, J. Pintz, C. Yıldırım, Primes in tuples. I. Ann. of Math. (2) 170 (2009), no. 2, 819-862.

[6] D. Goldston, S. Graham, J. Pintz, C. Yıldırım, Small gaps between products of two primes. Proc. Lond. Math. Soc. (3) 98 (2009), no. 3, 741-774.

[7] H. Iwaniec and E. Kowalski, Analytic number theory, American Mathematical Society Colloquium Publications, 53. American Mathematical Society, Providence, RI, 2004. xii+615 pp. ISBN: 0-8218-3633-1.

[8] R. Murty, The Twin Prime Problem and Generalizations, http://www.ias.ac.in/resonance/August2013/p712-731.pdf

[9] Yitang Zhang, Bounded gaps between primes, Ann. of Math., to appear.

Interview with Prof. Andrew Sutherland

Felipe Hernandez

Professor Andrew Sutherland is a Principal Research Scientist in the MIT math department. His research focuses on computational number theory.

Felipe: How did you get involved with mathematics (and computational number theory)?

Prof. Sutherland: When I applied to college I knew I wanted to be either a mathematician, a computer scientist, or a physicist. I ended up getting my undergraduate degree in mathematics, and then going to graduate school in mathematics, but I took a lot of computer science courses along the way. Part way through graduate school, I got an opportunity to work on a software project, and this turned into starting a company with two friends who were also affiliated with MIT. What I initially thought was going to be a summer project turned into a yearlong project, then a multi-year project, and I wound up spending twelve years working in the software industry. In 2000, I finally sold my share of the company. I had really enjoyed the process, and it was a fun ride, but I was ready to do something else. I had friends who wanted me to start another software company, but I decided I wanted to take a break and spend some time figuring out what I wanted to do with the rest of my life. During that break I came to the realization that if I was stuck on a desert island, as long as I had (along with food and water) a pad of paper, a pencil, and some interesting math problems to think about, I'd be happy as a clam. That's what made me decide to come back to MIT and finish my PhD.

I had spent my first few years in graduate school focused on theoretical computer science. Ironically, it wasn't until I had been working in the software industry for a while that I got interested in number theory. The software I developed was a massively distributed peer-to-peer messaging system for financial transactions. Security played a critical role, as you might expect, and as the person who had the strongest math background in our company, I was tasked with that aspect of the business. I got very familiar with some of the practical aspects of cryptography, but I found that the mathematics behind it was in many ways more interesting to me than the cryptographic protocols themselves. I got interested in elliptic curve cryptography, which is a technology that has been around for a while but is only now really coming into vogue; Google recently changed the protocols they use for secure access to their services (Gmail, Google search, YouTube, etc.) to use elliptic curve cryptography.

Cryptographic applications aside, the mathematics of elliptic curves is a truly fascinating subject in its own right, and a lot of recent progress in mathematics has arisen out of the theory that's been developed around elliptic curves and related topics in number theory (the proof of Fermat's Last Theorem, for example). I guess it was seeing the beauty of the mathematics behind the cryptography that got me interested in number theory.

Felipe: Yitang Zhang has recently proved that the gap between primes is bounded. That is, there are infinitely many pairs of primes that are within 70 million of each other. You've been involved in an effort to make the gap smaller; can you tell us a bit more about that?

Prof. Sutherland: It's been amazing, quite an adventure. There's a whole story to this. I was on a plane traveling to Emory University to give a talk, and Zhang had just given his talk at Harvard on May 13. He gave the talk almost unannounced, in a last-minute seminar. I saw the email about it, but it wasn't really clear to me whether this was serious, and in any case I couldn't make it to the talk because I was traveling to Emory. When I landed I sent an e-mail to a friend who was at the seminar to see if Zhang had really done it. Yes, he assured me, it looks good; it's been submitted to Annals, the referees have gone over it, and they think it's correct. I thought wow, that's really cool, but I didn't think too much more about it right then.

I flew to Chicago that night to attend a workshop there, and while I was checking into the hotel, the clerk asked me what I did. I told him I was a mathematician, and he asked "what kind of math?" "Number theory," I replied. Then he said, "ah, 70 million, pretty good, huh?" And my jaw just about hit the floor. This random hotel clerk checking me in knows about the prime gaps result that had only been announced two days ago. It made me realize that this was a big deal, but I still didn't think too much more about it, since Zhang's proof really lies in the realm of analytic number theory, which is not my specialty.

A couple weeks later I noticed on Scott Morrison's blog that people had started chipping away at Zhang's bound. Now 70 million is an interesting number for a computational number theorist. The typical situation is that first you have no bound at all, and then one day somebody comes along and proves an ineffective bound, meaning that we know there is a bound, but we have no idea what it is. The next thing that happens (if we're lucky) is that somebody is able to give an explicit bound, but it's astronomically large, way beyond the range of anything that could be used in a practical computation, and things stop there.

Zhang gave a bound that is small enough to be computationally interesting, and the gap between 70 million and 2 (which would prove the twin prime conjecture) seems a lot smaller than the gap between infinity and 70 million. So a couple of people started to chip away at the bound 70 million, which, if you read Zhang's paper carefully, actually turns out to be more like 60 million. Scott Morrison was able to reduce the bound further, and then Terry Tao got involved (you can see a complete record of the progress at http://michaelnielsen.org/polymath1/index.php?title=Bounded_gaps_between_primes).

Now Terry Tao is not only a math prodigy and an individual genius (with a Fields medal to show for it), he is also a proponent of open, communal mathematics, an approach that has also been advocated by Tim Gowers. The prime gaps problem is perfect for this sort of work; rather than having lots of different people individually publishing separate papers that reduce the bound in small increments, it's much more efficient to have everyone contribute their ideas and let them feed off of each other. So on June 4th, Terry proposed a Polymath project to work on this problem.

There are three main parts to Zhang's proof, each of which can be independently optimized. One group set to work on optimizing the parameter known as ϖ, which then determines a parameter k0, which can also be optimized. The last piece of the proof, which is what gives you the bound 70 million, involves estimating a parameter known as H. When I got involved, Scott Morrison had brought k0 down from Zhang's value of 3.5 million to 341,640. This brought the prime gap bound H down to about 5 million. To compute H, you need to solve a combinatorial optimization problem, and when I looked at this I saw that they were getting into the range where a lot of interesting algorithms can be applied, so I got involved. Since then you can see that there have been a lot of different people contributing, back and forth. Since June 4, when the polymath project officially started, we've brought H down from 5 million to somewhere around 60,000.

Felipe: So where is H now?

Prof. Sutherland: I just posted a new result on the blog, which hasn't yet been posted on the records page. The new bound on H is 60,732. I left a program running on my computer overnight, and when I came in this morning it hadn't found anything new. But I just checked again about half an hour before you came, because I knew you were going to ask about this, and it had found the new record. It's only a small improvement, and I'm sure other people will find better records soon [Note: within an hour of the interview, the bound had decreased to 60,726]. It's just one step in a process that will hopefully end at 2, or at least close to it (Zhang's method of proof actually can't do any better than 16). My guess is we won't actually get down to 16, but we'll try to make it as small as possible, and there's certainly still a lot of room for improvement [Note: by the time of this publication, the bound had been reduced to 4,680].

Felipe: As a number theorist, do you have a favorite number?

Prof. Sutherland: I guess the simple answer would be 7, but not for any of the reasons that most people would say 7. This is related to one of my favorite research results, which was a combination of some nice number theory and some serendipitous computation. There are certain equations that have solutions over the rational numbers if and only if they have solutions modulo every prime (and also solutions over the real numbers). When you have a problem that has that property, you say a local-global principle applies. But there are some very interesting examples where a local-global principle doesn't apply. This result is concerned with a particular local-global principle related to elliptic curves that almost always applies; it holds everywhere except in one very special case that occurs at the prime 7. Exactly one isomorphism class of elliptic curves violates this local-global principle, and only does so at the prime 7.

I had been trying to prove this local-global principle for quite some time; I had been talking to Nick Katz at Princeton about this problem, and I almost had a proof. I had narrowed things down to the point that I knew that if there was a violation of this local-global principle, then it had to occur within a set of examples that I could conceivably hope to search. My expectation was that I wouldn't find any counter-examples, and then I could just say yes, the local-global principle holds, and everybody would say "that's nice", but it really wouldn't have been all that interesting. But then I found an exception, a single case where this local-global principle does not hold, and I was able to prove that this is the only exception. The existence of this one exception at the prime 7 made the paper so much more interesting, because it showed that this particular local-global principle really isn't a trivial thing, it's rather delicate. So that's why 7 is my favorite number.

Felipe: What do you think about the use of computation in theoretical mathematics?

Prof. Sutherland: I think it's starting to become a lot more common. One of the other big projects that I worked on with Kiran Kedlaya (formerly at MIT) was looking at Sato-Tate distributions. These are distributions associated to the number of solutions to certain equations when you consider them modulo primes. There are conjectures about what happens in the general case, but in the process of testing these conjectures we found a bunch of exceptional cases. In order to make sense of these we had to gather massive amounts of data, and it took a lot of work to improve the algorithms we were using in order to make this possible; it's often not enough to just use a bigger computer, algorithmic breakthroughs can be crucial, and that was very true with this project.

I sometimes describe what I do as building telescopes for mathematicians. If you only look at these Sato-Tate distributions using primes up to, say, 100,000, you really can't see what's going on. You then extend your algorithms to handle primes up to a million, but find that it's still not good enough. But then when you get up to about a billion, the picture starts to crystallize, as you can see at http://math.mit.edu/~drew/g2SatoTateDistributions.html.

Now you can really see the shape of the distribution, you can compute its moments, and you find that the moments appear to converge to integers. Then you can start to develop a theory that explains where these distributions are coming from. It turns out they are related to certain groups of matrices, a random matrix model. In the cases we looked at, these were compact Lie groups made up of 4 × 4 complex matrices that are both unitary and symplectic. We ended up classifying all the possible matrix groups that can arise for certain types of equations, those related to genus 2 curves, and working in collaboration with Francesc Fité and Victor Rotger we were eventually able to prove that there are exactly 52 matrix groups associated to genus 2 curves, and we can say exactly what the Sato-Tate distributions are in each case.

Without the computational tools, without the telescope, we would have had no idea what was really going on, and no hope of proving our classification theorem. I sometimes give an analogy to astronomy. You can spend years sitting in a windowless basement trying to figure out how the universe works by developing a theory of cosmology, but if you never go out and look up at the sky, it's hard to make a lot of progress. I think computers give mathematicians the ability to look up at the sky and see things, and not just to see things but also to prove things. In our work on classifying Sato-Tate distributions the computations would often reveal cases that we were missing and then we'd have to generalize our theory. But then our new theory would predict cases that we hadn't found with the computer, and that would force us to either search harder, or to find a mathematical obstruction, a proof that some cases can't arise. Eventually we were able to get both sides, the theory and the computation, to match up perfectly.
There are other examples at MIT where computation has played a key role in obtaining results in pure mathematics; the work by David Vogan and others on the exceptional Lie group E8 comes to mind. I think the whole nature of research in mathematics is changing as a result of computation. Consider Gauss, who some might regard as one of the “purest” of pure mathematicians. If Gauss were alive today I’m quite sure he would have the fastest computer he could find sitting on his desk cranking away. Gauss used to do massive computations by hand, computing thousands and thousands of prime numbers in the process of trying to understand their distribution. People tend to think of him as a genius whose theorems just came to him out of the blue, but no, he was constantly guided by computations. As Manin observed, one of the things that made Gauss such a great mathematician was that he had a better computer in his head than most of us do. But the good news is that today we don’t need to have a better computer in our head, the one on our desk will do just fine, and the faster our computers get the more we can see.
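
For readers who want to try the "telescope" idea at toy scale, here is a minimal Python sketch. It is not the genus 2 computation described in the interview. As a simplifying assumption it uses a single elliptic curve (genus 1), y^2 = x^3 + x + 1, picked arbitrarily for illustration, and it counts points by brute force rather than with any of the fast algorithms Prof. Sutherland mentions. The function names are invented for this example. For each prime p it computes a_p = p + 1 - #E(F_p), normalizes by sqrt(p), and prints the first four moments. For a curve of this kind without complex multiplication, those moments are expected to drift toward 0, 1, 0, 2 as the bound on p grows, which is a small instance of moments appearing to converge to integers.

```python
from math import sqrt


def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n (assumes n >= 2)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            # Cross out the multiples of i, starting at i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]


def frobenius_trace(a, b, p):
    """a_p = p + 1 - #E(F_p) for E: y^2 = x^3 + a*x + b, with p an odd prime."""
    total = 0
    for x in range(p):
        rhs = (x * x * x + a * x + b) % p
        if rhs:
            # Euler's criterion: rhs^((p-1)/2) mod p is 1 exactly when
            # rhs is a nonzero square mod p (two points), else no points.
            total += 1 if pow(rhs, (p - 1) // 2, p) == 1 else -1
    return -total


def normalized_traces(a, b, bound):
    """a_p / sqrt(p) over odd primes p <= bound where the curve stays smooth."""
    disc = 4 * a ** 3 + 27 * b ** 2
    return [
        frobenius_trace(a, b, p) / sqrt(p)
        for p in primes_up_to(bound)
        if p > 2 and disc % p != 0
    ]


def moment(values, k):
    """k-th sample moment of a list of numbers."""
    return sum(v ** k for v in values) / len(values)


if __name__ == "__main__":
    # Widening the "telescope": watch how the moments change with the bound.
    for bound in (100, 1000, 3000):
        xs = normalized_traces(1, 1, bound)
        print(bound, [round(moment(xs, k), 3) for k in (1, 2, 3, 4)])
```

With bounds this small the moments are still noisy, which matches the point made above: the picture only sharpens when the range of primes is pushed much further, and that is where better algorithms matter more than a bigger computer.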

Interview with Leonid Chindelevitch

Holden Lee

Leonid Chindelevitch received his Ph.D. in mathematics from MIT in 2010. He is currently an applied mathematician working as a postdoctoral fellow at the Harvard School of Public Health, and writes a weekly blog http://www.mathophilia.com where he explains mathematics to a broader audience. A recording of this interview can be found at http://mathophilia.com/turning-the-tables-an-interview-with-yours-truly/.

Holden: Leonid, could you tell us a little about what you do right now?

Leonid: I’m currently working at the Harvard School of Public Health as a postdoctoral fellow, building models of infectious diseases. In particular I’m working on HIV and tuberculosis, on population-level models that look at how diseases are transmitted between people, and what different interventions achieve in terms of controlling those epidemics.

Holden: How did you get interested in math and biology?

Leonid: I’ve always liked math; I’ve always had an inclination for doing math since I was 5 years old. My dad was trained as a mathematician, and he liked putting these math books around me so that I would naturally develop a curiosity... I’ve done math in various forms since that age.

The biology applications were something I discovered fairly late, during freshman year of undergrad at McGill University when I took an introductory class in computer science. Until then I didn’t have much programming experience. I really liked the teacher; his name was Mathieu. In one of the classes towards the end of the semester, he talked about his own research in computational biology and bioinformatics, and that’s how I got interested. I talked to him afterwards and we decided to do a project together over the summer. I really did not enjoy biology in high school. It was too much memorization for me. I didn’t realize that really interesting mathematics could be done to answer biological questions.

Holden: Could you give an example?

Leonid: The first problem I worked on was the problem of reconstructing the common ancestor of a bunch of species. As input you have a set of binary strings, 0s and 1s. The 0s represent gaps, or missing nucleotides, and the 1s represent nucleotides that are present in the genetic sequence of a particular species. We want to reconstruct the tree of how the species evolved over time from a common ancestor, in a way that minimizes the cost of the changes that took place. Basically there’s a cost you can assign to every mutation: the mutation could take a bunch of nucleotides and delete them forming gaps, converting 1s into 0s, or turn the gaps into nucleotides, converting 0s to 1s. From this formulation, it sounds like a problem on binary strings, but there is this really interesting underlying biology that has to do with the way the species evolved, and the costs of the different mutations are actually informed by what we know about the biology.

But then the problem itself, once you’ve defined the cost, becomes a purely mathematical problem, and an algorithmic problem, because these are very long sequences, millions of characters for twenty species. Even though this problem has very interesting biological meaning, as a first approximation it is a purely mathematical problem. It’s similar to a biology problem challenge given in the TopCoder competition, which is basically a competition for those who develop algorithms. It was interesting to see that people with no biological background but with very good knowledge of math and computer science are able to make a significant dent in solving a biological problem. This was a problem on the immune system but very similar to the one I just described.

Holden: You went to grad school at MIT and afterwards you worked in industry, a startup, nonprofit, and finally academia. Could you tell us about each of these experiences?

Leonid: I went to work at a pharmaceutical company for a year after I finished my Ph.D. I really enjoyed working there. I worked on actual research problems with very clear motivations (identifying targets for drugs) but at the same time these were problems that were driven by the curiosity that our team had. It was very nice to work with other people. It was a very collaborative environment so there were people who did experiments, people who were really good programmers, people who were interested in interpreting the results and data, and businesspeople as well.

It was a good experience, but ultimately I wanted to do something slightly more ambitious, perhaps, and so I had the idea of trying to do a startup for a while. My friend had an idea about something that we could do and we developed it together. It was a different problem that a lot of pharmaceutical companies were interested in: how do we tell by looking at a drug trial that failed, what is the difference between the patients who responded well to the drug and the patients who didn’t respond well? It could be a trial that failed, or it could be a trial that succeeded in a small subpopulation. How can we look at the genetic sequences your patients had (there’s about 3 million nucleotides in our genome), how do we find
the relevant places that allow us to explain the differences we observe? It’s very much a computational problem, and we developed some interesting approaches towards it. We tried to do a startup around it. It ended up being a very interesting experience but we decided to leave it behind in the end.

Around the same time I worked at a research center at a hospital, also in this area, and there I got to do some really interesting human genetics. I basically tried to understand circadian rhythms, which are periodic rhythms that the majority of living organisms have. Some people call them biological clocks, cyclic processes that make us wake up, go to sleep and get food at a certain time. There were many interesting problems in that field as well, so I ended up working on that for a while.

In addition, I was involved with a nonprofit organization, the one non-mathematical endeavor I was involved with during that time. This was a campaign trying to get justice for the survivors of the Bhopal disaster, the largest industrial disaster in history. It happened in India in 1984. I was heavily involved in the campaign, doing a lot of organization, awareness, protests, all kinds of stuff. That also was very interesting.

But ultimately what made me come back to academia is the realization that the thing I worked on in grad school towards the end, modeling infectious diseases, was the thing I was most interested in. Since I hadn’t worked on it for several years, I decided to take the opportunity to go back to infectious disease research, and do some mathematical modeling of that. I spent a fair bit of time and finally found this postdoctoral opportunity, and that’s what I’m doing now. It finally feels like the search is over.

One thing I want to recommend to people currently in the math program at MIT is that there might be a lot of pressure to figure out exactly what it is you want to do, but don’t rush into decisions. It’s okay to try something for a while, and realize you like it but maybe not that much, or realize that you don’t like it, and that’s okay. Life is long, especially now that we’re pushing our life expectancy higher. You’re probably going to be working for a long time. Take the time to experiment with different things, and ideally find something you’re really passionate about and you want to be working on. Whatever area of math it is, that’s great; if it happens to be something other than math, that’s great too. The most important thing is that you feel that you’re making a contribution on the one hand, and on the other hand you’re also keeping yourself happy.

Holden: Could you tell us a little about your blog mathophilia and what inspired you to start it?

Leonid: It was a confluence of factors. One is that I actually ran my last marathon in October, the Lowell Marathon, and I knew from my previous experience that the day you finish a marathon you feel this emptiness of sorts, a void that comes from having achieved a particular goal, and having to turn the page from that. I looked for another project that would be exciting for me to start after I finished training for the marathon and actually running it.

But the main factor was this article I saw in the New York Times last summer, “Is Algebra Necessary?” A fairly prominent political scientist said that kids are failing high school because of algebra, and algebra isn’t something they’re going to use later on in their life. He said we should just stop teaching algebra because what is it going to do if we’re just going to fail kids out of high school because of it? I felt this article was very misguided, because there were some fundamental gaps in terms of the public’s perception of mathematics education. I wanted to start a conversation with people who are well-educated and who have a general appreciation for mathematics but who are not mathematicians, and try to convey to them the value that mathematics, math education, and math research bring to society. That was my overarching goal.

I wanted to explain and educate the public, but not so much educate as start a dialogue. The responses to the article made it clear that there’s a lot of fear of mathematics in society. There’s a lot of misunderstanding in general, and not just of math but of mathematicians. People apply all these stereotypes to mathematicians, like we have no social skills, and we’re strange people. Some of those can be positive too, but there’s a lot of negative stereotypes. I wanted to make the point, or start a dialogue around the topic that the people the stereotypes are based on may not be representative of the general population of mathematicians. We’re all different people, and there’s no use generalizing in that way, especially if it makes you reject that area of knowledge, or if it confirms your negative beliefs about mathematics and what it can do with you. Those were the goals. I’m still working on figuring out exactly the direction for the blog, I’ve done a few different things, it’s been good, I’ve been posting every Thursday. I enjoy the process.

Holden: How can the math community raise math awareness to the general public or improve math education?

Leonid: Two great questions. I don’t have the answers, but I have some thoughts on both of these. As far as raising math awareness, I don’t think we should simplify the stuff that we do so much that it becomes accessible. There are some efforts in that direction, but there are some technical areas of math that would just take a very long time to explain to someone. The motivation and purpose behind the math is something that forms a much better narrative. It would be very good for us to learn how to tell stories. I’m a big believer in telling a story and creating a narrative, and this can be as simple as identifying something related to your field that has permeated everyday life. For example, if you’re doing coding theory, then convey to people that the way we have music recorded on CDs is very much dependent on the results in this particular field. Pretty much every field has some kind of application to everyday life or application to understanding the world that we live in. That’s also very valuable; for example, string theory helps answer the question: what is the structure of the physical universe? How to craft a narrative, how to tell a story, is a valuable
skill for us to have. Public speaking is also valuable: it’s not a skill I think anyone is born with, but rather a skill you develop like any other; taking complex path integrals is also a skill, and something you can learn. I think those two things would definitely help, if we pay attention to them, and find opportunities to reach out to the public, in whatever way we can. That’s for your first question.

As far as mathematics education, that’s a very tricky one. I would say to think for yourself how you personally got interested in math and what it was that worked for you, what created the spark, what made you decide you want to do math. Chances are, at some point you had a great teacher and mentor, and they supported your interest, or helped you develop it. My suggestion would be to think of ways to pass it on, either mentor high school students, anybody in your family who has interest, or just random people who reach out to you and ask you about the stuff that you do. If you can convey that interest, that will encourage people to become curious, and once you become curious that’s really all you need, because then you read on your own, discover on your own, and if you really like it you continue doing it.

Holden: What other things do you do besides math research, and do they relate to math in your head in any way?

Leonid: Other than the research and the blog, I pretty much have one thing I do pretty seriously, which is classical guitar. I’m very excited, because I’m playing my first solo concert later this month. That’s something I’ve been doing since I was nine. There’s certainly mathematical structure that comes up in the way scales are constructed, the way that rhythm is defined, and the way that there’s this whole science/art of counterpoint, which is basically the progression in your melody, the progression in the different voices in your piece. Probably for composing there’s a lot more (I’ve only composed one piece), composing is all about patterns and math is all about patterns. I wouldn’t say I have been taking a lot of time to explore these connections although I ended up interested in them.

Holden: Anything else you want to say?

Leonid: I’d like to reiterate one thing I learned from my mentors: take the long-term view on things, and don’t get discouraged. What we do is one of the hardest activities that humans have come up with; it can be really frustrating when you’re hitting your head against the wall working on a problem or trying to understand something. It’s good to go back to the original source of inspiration that you had for going into math or the field you’re in, to draw strength from that and not give up.

Holden: Thank you.

Interview with Delong Meng

Holden Lee

Delong Meng graduated from MIT with an S.B. in mathematics in June 2013. He is currently pursuing a Ph.D. in Economics at Stanford.

Holden: You came to MIT planning on doing math, but then decided to pursue economics. Could you tell us what led to the change?

Delong: I’ll just speak from a historical perspective. I met Gabriel Carroll [economist with a Ph.D. from MIT] in high school, and I asked him why are you doing economics? He said the problems he works on are similar to Olympiad combinatorics problems, so I became curious about economics.

My father would always say: do math or you’re not going to make money. He said economics is even easier and you’ll make more money, so why not do economics? When I came to MIT I saw some Chinese IPhO and IMO [International Physics/Math Olympiad] gold medalists, all doing economics, and I asked them, why are you doing economics? Their answer was basically the same as the one my father gave: if you do physics, you can’t find a job, so why not do economics? When I first came to MIT I thought this attitude was really bad. But that’s the Chinese way of doing things.

I was really confused, but at first I just started doing a lot of math. I thought since I did the IMO, and since I’m at MIT with all these resources, I can do well in math. Freshman year, I think I made some really bad mistakes in approaching math: I thought it was just like MOP [Math Olympiad Program] where people compete with others and solve problems. I didn’t even know what it meant to learn math. Since a lot of people took 18.701, I also took 18.701. But it turned out that a lot of people had learned linear algebra or group theory before, so taking 18.701 was good for them, but I didn’t have experience in any math beyond high school math. I found it very hard. Even then I still wanted to just study math: I thought that in high school I did math to get into college, but now I got into MIT, maybe I should just work hard. What really killed me was commutative algebra. I took it sophomore year, and thought it was a good class, but it turned out I actually didn’t understand it at all. After that class I took more math classes. I don’t think I actually understood them.

So I think my transition to economics is that I tried very hard to do math, but somehow it didn’t work the way I expected, and I felt I didn’t actually understand what I was taking. Sophomore summer, I did an economics UROP, and that was just like an Olympiad problem, so I thought, if I’m not doing very well in math, then I might just try economics.

But being a math major here helped me a lot with doing economics. For one thing, the undergrad research I did boiled down to a discrete math problem. I think the reason I could solve those problems is because coming from a math background, I wasn’t scared of solving those problems. I think that often when economic theorists see complicated math they try to avoid it, while for me I could spend an entire summer just thinking about one problem. That’s something that I got from doing math.

Holden: One thing that attracted you to economics was the math problems you found there. Was there anything else?

Delong: I can always say it’s more applied, but do I really care that it’s more applied? But it’s easier to explain to people why I’m doing what I’m doing. The economics I do is very mathematical, but there is a whole spectrum of things from very theoretical to very applied. I think it gave me the potential in the future to think about big economics problems, such as how the government affects the economy.

Holden: What advice do you have for students coming to MIT who want to major in math but also might be interested in other areas, to decide what they want to do in the future?

Delong: I’m sure everyone says “do what you like, go explore,” and I think for me, the lesson is actually, go explore. When I came in I did not actually explore. I wanted to explore, but deep in my heart I was set on doing pure math. Also don’t rush yourself; that’s another lesson I learned. Freshman year, I thought if I didn’t take 18.701 I’d be behind. But actually, in my junior year I took a lot of GIRs, and that’s a waste of time for a junior. Freshman year, take all the basic classes and start exploring different things. I think that would be better.

Holden: What do you think were your most memorable experiences here in the past four years, what did you learn? How do you think you’ve grown as a person?

Delong: In terms of academics, I’d say the most memorable experience was sophomore fall, taking commutative algebra. That term I think I had no idea what I was doing. I thought I was smart, taking this fancy class, but it wasn’t good for me. Academically, that’s one thing I learned: from now on to go one solid step at a time.

Grown as a person? I had some deep issues personally; I had a bad relationship with my parents, and had to come to terms with my identity in terms of being a Chinese immigrant, and having gone to school in the US. A lot of the time I tried to avoid those, but they still came back and confused me. One thing that helped me was that in junior year, this
guy in my church did a discipleship training with me, and we worked through some deep issues. I think having met more people outside the MOP community helped. Freshman year, I came in, I chose to be in Random Hall because I saw that’s where all the math majors lived. Pecker was the math floor. Over the past four years it changed a lot; now it’s not a math floor anymore. Just seeing that people exist outside math: that helped me understand that I can’t just know one group of people.

Holden: Could you tell us about the research you did?

Delong: With Richard Stanley it was an algebraic combinatorics problem, on reduced decompositions of the symmetric group. Several of his students had worked on the same thing.

For economics, my sophomore summer I did research on learning in repeated auctions. Basically, the idea is that you have this game. People have studied Nash equilibrium, but always argue about whether bidders will play Nash equilibrium. One way we can justify Nash equilibrium is to consider repeated bidding where people start off with some strategy and, as they see how their opponents behave, adjust their strategy. In the problem I studied, they respond to the distribution of their opponent’s history. I basically proved that in the system their strategies will converge to the Nash equilibrium. The problem sounds like an analysis problem but the gist of it is still very combinatorial. You boil the problem down to a sequence of matrices; once the first few columns converge, then the next column will converge.

The other problem I did was matching. In school districts in Boston, when people apply to public schools, they have to go through a centralized matching system where students will submit their preferences. The school board takes into account all these preferences and assigns you to a school. There are schools that are very popular, so you need some kind of mechanism to decide which students get into which schools. There’s this algorithm called the Top Trading Cycle algorithm which gives that. But in this algorithm, you need schools to rank students, and in reality, what the schools do is just run a randomized lottery for all the students. What I studied is how could schools run their lotteries? There are two ways to do it: one way is to have a centralized lottery where the school board says, here is a ranking of all the students, now all the schools will use that ranking. The other is for each school to generate its own lottery. Which one is better for the students? Intuitively you’d think that you want multiple lotteries, because if you get screwed at one school, you want a chance at another school, but actually if you look at the entire distribution, it’s not clear that one is better than the other. I proved an equivalence result under some assumptions on how students rank their schools.

Holden: What do you see yourself doing in the future?

Delong: If you asked me this question ten years ago, I would say, be a professor. I guess that’s always my default answer. I haven’t really thought about any alternatives. But I don’t know, in grad school, especially if I do economics, things may change. There are a lot of jobs other than professors.

Holden: What other things do you like to do besides math and economics?

Delong: I like listening to Chinese songs, just to relax. I try to go to the gym. I also read the New York Times.

Holden: Do you have any other advice for MIT math majors?

Delong: Don’t copy psets.

Holden: Thank you, Delong.
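
The school-matching question raised in this interview (one shared lottery versus a separate lottery per school, with seats then assigned by Top Trading Cycles) can be explored with a small simulation. The Python sketch below is only an illustration and does not reproduce Delong’s proof or his assumptions. The market size, the uniformly random student preferences, and all function names are assumptions made up for this example. It implements a textbook form of Top Trading Cycles for schools with several seats, then estimates how often students receive their first choice under each lottery design.

```python
import random


def top_trading_cycles(prefs, priorities, capacity):
    """Assign students to schools with Top Trading Cycles.

    prefs:      student -> list of all schools, most preferred first
    priorities: school  -> list of all students, highest priority first
    capacity:   school  -> number of seats (total seats >= number of students)
    """
    seats = dict(capacity)
    unassigned = set(prefs)
    assignment = {}
    while unassigned:
        open_schools = [s for s in seats if seats[s] > 0]
        # Each student points at the best school that still has a seat ...
        wants = {st: next(s for s in prefs[st] if seats[s] > 0) for st in unassigned}
        # ... and each open school points at its highest-priority remaining student.
        favors = {s: next(st for st in priorities[s] if st in unassigned)
                  for s in open_schools}
        # Follow student -> school -> student pointers until a student repeats.
        path, position = [], {}
        st = min(unassigned)
        while st not in position:
            position[st] = len(path)
            path.append(st)
            st = favors[wants[st]]
        # Everyone on the cycle receives the school they point at.
        for member in path[position[st]:]:
            assignment[member] = wants[member]
            seats[wants[member]] -= 1
            unassigned.discard(member)
    return assignment


def single_lottery(students, schools, rng):
    """One shared random ranking of students, used by every school."""
    order = list(students)
    rng.shuffle(order)
    return {s: list(order) for s in schools}


def separate_lotteries(students, schools, rng):
    """An independent random ranking of students for each school."""
    ranking = {}
    for s in schools:
        order = list(students)
        rng.shuffle(order)
        ranking[s] = order
    return ranking


def first_choice_rate(lottery, trials=2000, seed=0):
    """Share of students placed at their first choice, over random toy markets."""
    rng = random.Random(seed)
    students, schools = list(range(6)), ["A", "B", "C"]
    capacity = {s: 2 for s in schools}
    hits = 0
    for _ in range(trials):
        # Assumption for illustration: preferences are uniformly random.
        prefs = {st: rng.sample(schools, len(schools)) for st in students}
        result = top_trading_cycles(prefs, lottery(students, schools, rng), capacity)
        hits += sum(result[st] == prefs[st][0] for st in students)
    return hits / (trials * len(students))


if __name__ == "__main__":
    print("single lottery:    ", first_choice_rate(single_lottery))
    print("separate lotteries:", first_choice_rate(separate_lotteries))
```

A single summary number such as the first-choice rate is a much coarser comparison than looking at the entire distribution of outcomes, so any gap or agreement seen here should be read as a starting point for experiments rather than as evidence about the theorem.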

Interview with David B Rush

Holden Lee

David B Rush graduated from MIT with an S.B. in mathematics in June 2013. He is continuing as a graduate student in mathematics at MIT starting this fall.

Holden: David, could you tell us a bit about yourself? Tell us about your interest in math. Where do you want to go with math?

David: I liked math ever since I was little, but I didn’t seriously contemplate any kind of career in mathematics until college, in part because I forestalled any contemplation of careers in general until college. Prior to that, I focused on what captivated my interests. When I did get to college, I decided that if I were going to do something mathematical I wanted it to be research, in part because I’ve always had a very personal relationship with my work. Math is impersonal as disciplines go, and I felt like the aspect of mathematics which entails the most self-expression (for lack of a better word) is trying to make original contributions to our understanding. That’s what I’m most predisposed to at the moment, so in terms of what I envision for my future in mathematics, I think it would be something academic. However, as you are well aware, the number of jobs in academia is small, so I may contemplate alternative plans at some point in the future.

Holden: What kind of math are you most interested in?

David: That’s a good question because I’ve just been to visit various graduate schools and one of the factors I’ve had to consider in terms of making the decision is what kind of math I would be working on and what the best place would be to work on that particular kind of math. My undergrad research is in algebraic combinatorics, but it may be a good idea to try to branch out, at least in the beginning years of grad school, to see what else I might be interested in doing. I picked combinatorics at the undergraduate level rather serendipitously in that it was the first thing I had an opportunity to try (and also because it’s more accessible to undergraduates than some other fields).

Algebraic combinatorics jibed very closely with what my strengths were, in part because when you study algebraic objects from a combinatorial perspective, you can glean insights into the structure of things that you don’t fully understand. Obviously you’re not going to understand them as well as some of the people who’ve been working in the field for a long time will, but if you have the specific combinatorial model, just by looking at the model almost the way we did in Olympiads, you can obtain results that people who didn’t have as revealing of a picture might have missed. Most of my work was in combinatorial models for representation theory, so the paper I wrote last summer, for instance, has to do with Kashiwara crystals, which give a combinatorial model for representations of Lie algebras. Even if you don’t remember at any given time all the details about how the weights relate to the roots, you can still draw the Kashiwara crystal, assign a Young tableau to each vertex, and give some combinatorial insight that might be relevant to the representation theory without being as well-versed in the representation theory as your advisor.

Holden: How did you develop your ability to do research?

David: First, I should point out it’s not clear to me that I have real ability to do research yet. I think you can’t really give instructions for research, because the process of creativity itself, by definition, can’t be systematized. It’s something people have to discover on their own.

The mathematics I was doing in research was a combination of the skills I developed in college, solving problem set problems, in conjunction with the skills we developed with Olympiads, where you had to just generate insights without as much understanding of the context, but I think the experience I found most similar to math research wasn’t mathematical at all. In projects in high school history classes, I would have to answer some question with some thesis, and support it with evidence gathered from looking around in the library. I took those assignments very seriously, and that was the first time I cultivated any abilities that could be described as research.

Obviously there’s some difference in how your work manifests itself just based on the subject matter. I’m sure a mathematician would point out that when you’re writing a history paper, while you’re subject to the rules of logic, you’re not necessarily subject to the rigor of proof. But the research process was still very similar in that I would read papers and if I saw something interesting, I would try to develop the analogous argument to whatever situation I was examining. If I saw a reference in that article to some other article where it said “here’s how such-and-such a claim is justified, here’s a related argument that was made towards a different goal,” I would look at that. Ultimately I think that one of my strengths is the ability to synthesize things. In both of the projects I was engaged in, I took the understanding that other people had accumulated over a number of years and put it together in a way that they hadn’t expected.

People have different kinds of research skills. People like John Nash, for instance, write a paper and it’s just a totally new idea that isn’t necessarily sophisticated because the field doesn’t exist yet. The original paper on Nash equilibria is
inventing a field, so it has maybe five references. I certainly would like to be able to learn about other people’s experiences doing research a different way, but I think any quest for a set of explicit instructions is going to be a red herring. The best counsel I can give to undergraduates who wish to do research is to be very patient and be very persistent. The first time I tried to do a research project it was a disaster, but I didn’t give up. That’s important; you have to believe in yourself.

Holden: How do you think your four years at MIT have changed you? What are your most memorable experiences?

David: One of the most important things I had to do in college was to find new goals. The process of finding new goals is difficult in any environment. For a lot of people, especially intelligent people for whom things tend to come fairly easily, it’s very hard to find new goals in a context where everybody seems so unbelievably talented. People suspect processes in life are far more deterministic than they are, so if you’re in an environment like MIT, you might be afraid of trying some new thing because there are already 50 experts in that new thing at MIT and you think to yourself, well gosh, I could never be one of them, and it doesn’t occur to you that maybe a few years back maybe 10 of the 50 were experts at the time. I think it’s ironic but I would say that MIT, in collecting so many talented students, can be very inspiring but at the same time can be very disillusioning. As a result you have to really listen to your heart to find what you’re interested in doing.

I’ve appreciated the environment where everyone around us is very smart, but at the same time I’m cautious of jumping to conclusions about it. MIT celebrates a very narrow form of intelligence, and it encourages its community to cultivate it at the expense of exploring other intellectual outlets which I suspect would be helpful not only in our lives but also in our intellectual pursuits. I’ve found some of my intellectual talents were not particularly well-regarded at MIT. I was very lucky to be somebody who’s predisposed towards mathematics (being the language of science, math is universally valued here), but at the same time, I like other things, like teaching and long, not necessarily goal-oriented, conversations. It was therefore important for me to keep in mind that being MIT-smart is one thing, and it’s certainly important (if you aren’t mathematically competent or competent in your chosen area of expertise then you won’t be able to achieve much of anything at all), but somehow, competence isn’t the only thing that matters in the world, and especially in any creative endeavor, I think it’s only one prerequisite among several others.

Holden: What other things are you interested in?

David: I’m interested in many things. I don’t know if I have as many hobbies as some people tend to; there’s not that much besides what I’m working on at the moment that I really pursue that determinedly. I do like to ski, or to sing, or play piano recreationally, or things of that nature, and I’m also very interested in trying to be at least somewhat broadly educated. I try to keep track of what’s going on in the world, if for no other reason than I’m interested in people and in hearing their stories, and that’s true at both a local and global level. I also think it’s important to pass on whatever glimpses of insight a person might have been so fortunate to come across, which is one of the reasons why I’m agreeing to do an interview such as this (although I’m a bit skeptical that anyone is going to altogether learn very much from it). I took a number of non-math classes at MIT because I’m interested in the arts and humanities, and I had a semi-canonical education in those fields in high school, which was very valuable to me. I missed out on that in college, but on the other hand I had other offerings that I maybe wouldn’t have stumbled across if I had gone to a more conventional college.

I hope I can continue to be interested in many things in graduate school at MIT. But if not, I can always walk over to Harvard where people do run the risk of speculating without necessarily too much grounding in what they’re speculating in, but at the same time, display an admirable interest (often lacking here) in other subjects where their expertise is valuable. I audited a course last semester at Harvard Law School which was jointly taught by Noah Feldman from the law school, and from the math department, called The Nature of Evidence, and I think I learned quite a bit from surveying the various academic disciplines and asking what constitutes meaningful questions through the lens of how the practitioners judge evidence. Barry Mazur is one of the most intellectually alive people I’ve ever met. At the same time as having the wealth of knowledge that he must as a professor of 75, I felt like in spirit he’s younger than I am, and that in itself was an inspiration.

Holden: What are your goals now?

David: Well, there are career goals and there are life goals. As an aspiring academic you attempt to unify them more than you might be asked to in other pursuits. But at the same time, it’s not clear to me that your career should encompass your entire life.

As far as career goals, it’s pretty standard: I try to do something reasonably significant in mathematics that I could be proud of myself for having done, and after I’ve done that, if I’m confident that I understand the process of how somebody gets there better than I do now, then I would certainly love to give back, not only through teaching, but through other forms of involvement in education and higher education. Throughout the process, I’d like to try to be as intellectually alive as someone like Barry Mazur. It is not something I’m going to achieve, but it’s definitely something that motivates me.

Of course, I think the most important thing is to have meaningful relationships with people. Again, that’s going to be at a local and global level, locally with your friends, relationships with people you know, and trying to be honest and loyal, and globally, too. That means not just understanding how your work fits in the broader project of human inquiry

but also understanding how it fits in the present. What is humanity doing today and how is my work contributing to that; can I use whatever particular expertise I have to contribute more to the best that I can? I don’t think that should be at the expense of the local, though. Some people, especially as they’re growing up, tend to discard or undervalue the importance of friendship and mentorship. That I’m even in a position to make any of the contributions that I might be able to is much less a credit to me than to my parents and the mentors and teachers that I’ve had the privilege of knowing over the years. So it’s really as simple as acknowledging that other people have supported me and committing to do the same for the present and future generations.

Holden: What do you think were the most important things your mentors taught you?

David: Again, I’m somewhat skeptical of what I see as the conceptualization of teaching today as instruction. The most important mentors I had gave me the tools I needed to teach myself. When one is teaching, it’s important to (a) show that there is a real subject, in order to gain the trust of the students, and prove that you have a reason to be there. Part of showing there is a real subject is demonstrating to the student their own incompetence, but at the same time, it’s also important to (b) empower them and give them the confidence that they can meet the new standard that you set up for them. It’s when there’s simultaneously a lack of standards and low expectations for whether the students will meet whatever amorphous standards there are that students become the most cynical and least interested in pushing themselves to learn whatever it is you’re hoping to impart.

So I think that the combination of demonstrating and empowering is the essence of teaching. Many mentors did that at MIT for me. Steve Kleiman, who has very high standards as an extraordinary mathematician, manifested those standards for people in his class, and he was very careful about praising me for work that I had done. One time when I gave him something, he said, “This is an abomination,” but he didn’t just say that, he also explained why and told me to fix it. I didn’t have explicit instructions on how to fix it but because I heard his reasoning I was able to figure out on my own what needed to be done.

My advisor, Mike Sipser, was also very helpful to me in terms of giving me a view of math from someone who’s in the practice as opposed to someone who’s thinking of joining it. In many ways he helped me escape some of the confining aspects of MIT’s preconception of what intelligence is. Sipser is someone who has achieved a great deal in the discipline despite not having done math contests and not necessarily having some of the same skills that we would expect someone who’s as accomplished mathematically as he is to have. He said, it’s not just about problem-solving; it’s also about asking interesting questions, and having taste in questions. He explained what that meant to him. Now could he teach me what it meant to have taste? No, of course not. But by listening to what he understood taste to be, I started to develop the beginnings of aesthetic sensibilities in mathematics of my own.

Anyway, I’m not only a lifelong student of math; I’m also a lifelong student of teaching. Presumably I’ll have a better answer for you at some point in the future.

Math Major Survey Responses

Holden Lee and Soohyun Park

In spring 2013, we sent out a survey to math undergraduates and received 28 responses. We asked respondents to rate the math program at MIT, to share math-related experiences and advice, and write about what math meant to them. The majority of respondents were satisfied with math at MIT. However, respondents also gave suggestions for improving the math program, which we hope both math students and faculty will take note of.

What year are you?

Year    Number of responses
1       6
2       7
3       9
4       2

What major are you? Double majors included 6-1, 6-3, 9, 15, and 21M.

Major             Number of responses
18 general        10
18 applied        2
18 theoretical    10
18C               2

1 General statistics

On a scale of 1 (pure) to 6 (applied), where do you fit?

[Figure: histogram of the number of responses at each value from 1 to 6.]

On a scale of 1 (awful! ☹) to 6 (amazing! ☺), what is the overall quality of math teaching at MIT?

[Figure: histogram of the number of responses at each value from 1 to 6.]

How satisfied are you with the math community at MIT?

[Figure: histogram of the number of responses at each value from 1 to 6.]

How satisfied are you with the math program at MIT?

[Figure: histogram of the number of responses at each value from 1 to 6.]

Check everything that you’ve been involved with.

Activity             Number of responses
REU                  6
Job in finance       3
Other internship     8
Teaching             10
Math competitions    17
UROP                 12
Other                2

2 Math classes at MIT

What is your favorite area of math?

Area                                         Number of responses
Algebra (including representation theory)    3
Analysis                                     4
Combinatorics                                2
Computer science (Theoretical)               1
Logic                                        1
Number theory                                1
Numerical analysis                           1
Probability and Statistics                   2
Topology (including algebraic topology)      4

What are your favorite math classes you’ve taken at MIT, and why?

• 18.014 and 18.024 Calculus, Multivariable Calculus with Theory - (2 votes) (1) We had the pleasure of having Clark Barwick as our professor for both of these courses, and he was an amazing teacher. We were pushed beyond what we thought we could do and learned a lot of neat math in the process. (2) Our lecturer was very good at connecting with students.

• 18.100B Real Analysis - because the problems were very interesting, the course materials and lectures were well-organized, and the material was genuinely interesting and applicable.

• 18.424 Seminar in Information Theory - Fantastic professor (Prof. Shor), really unusual material compared to most classes on topics relevant in stochastic modeling, myriad and diverse applications to numerous areas in other fields, seminar format, opportunity to give presentations and work on a substantial paper. Overall an excellent experience.

• Other responses: 18.152 (Intro to PDE), 18.310 (Principles of Applied Math I), 18.312 (Algebraic Combinatorics), 18.330 (Numerical Analysis), 18.404 (Theory of Computation), 18.702 (Algebra II), 18.704 (Seminar in Algebra), 18.901 (Topology), 18.906 (Algebraic Topology II), 18.712 (Intro to Representation Theory)

Let y = how much you feel like you’re supposed to learn in all the math courses you’ve taken. Let x = how much you can remember and apply. Compute x/y.

[Figure: histogram of the number of responses for each value of x/y from 0 to 1 in steps of 0.1.]

3 Extended responses

The math community at MIT

The math community at MIT is “very large and talented compared to other schools.” However, for someone who hasn’t gone to the top high schools or done high-level contest math, it can be difficult to fit in. One person mentioned difficulties finding math undergrads to relate to: “I just want a community where people can just hang out, but not talk about math.” Another found that other communities (dorm, club) were stronger, and that many people in the math community preferred not to speak English.

What does math mean to you? How have your conceptions of math, and what it means to do math, changed in your time at MIT?

Several respondents gained new respect for math during their time at MIT. One mentioned that “mathematics must be the most creative subject in the world.” They also gained respect for aspects of math they might not have previously valued, such as computational mathematics.

Ted Hilk wrote that math “is about mental models (what many people tend to hand-wave somewhat dismissively as ‘intuition’) and an unambiguous means of expressing them... What matters is structures, the means by which they act on themselves and each other, and the analogies between them.” He writes that math helps him learn other classes like 2.003 quickly, and that the models and analogies in mathematics help him come up with new trading strategies.

Semon Rezchikov says, “Math is about continually creating a new idea you didn’t have before. You have some base thought, you formalize it, and you push it very far, farther than you could have if you didn’t formalize it, until you have this basic novel concept that is really valuable. The dream is somehow, it will help us do science. Like von Neumann, he would just go around, talk to different scientists, figure out what their essential problem is, and he solved it by working with them. He did this repeatedly, and he totally changed the world.”

On the other hand, the focus on theoretical math turned off some students from the subject. One respondent said that “professors and students throw out the notions of practicality in search for some broader abstraction and new theorems to prove” and hence felt that “too much of the highest level math is too far removed from applications for my interests.”

What advice do you have for a freshman math major?

Practical advice included:

• Apply for summer activities earlier!

• Don’t think about what looks the best on a resume, find what you are actually interested in and do that instead.

• Do stuff outside of classes!

• Make more time to do extra problems and really immerse yourself in the material taught in your classes.

• Don’t take a UROP until you are qualified to take one you are actually interested in. Don’t just look at UROP ads, ask a professor. Try to get an internship as early as you can (ex. freshman programs at Google).

Several respondents recommended that students skip to the interesting parts: test out of GIRs, and take prerequisites early (or try not to take them, if they’re boring). Respondents recommended taking 18.701/2, 18.100B, and 18.440 early. However, it is also important to build a solid foundation for the hard math classes by starting with easy intro classes.

Finally, a respondent said, “Don’t give up because it’s hard. If you keep at it, it will be easy someday, and you will scarcely believe how much you’ve advanced. Just keep grinding. The end product is worth it.”

What are you dissatisfied with about math at MIT?

Teaching and advising topped the list: not all professors are great at teaching. “Lesson plans lack practical motivation for the material and largely consist of regurgitation of material from the book. Many students skip classes for this reason.” Moreover, “professors, grad students, and TA’s... make students feel very uncomfortable about asking questions and not understanding. I have only experienced this in the math department and many seem to have an air of elitism about them.” Problem sets can be poorly designed: “There is a focus on doing a small number of difficult problems rather than doing enough easier problems to practice and really learn the material.”

Nuggets and Jokes

Leon Zhou

1 Nuggets

Claim. For all natural numbers $a$, $a \not< a$.

Proof. We use the von Neumann ordinal definition of the natural numbers¹, and proceed by strong induction.
In the base case, $0 = \emptyset$, so nothing can be less than it, including itself.
Now suppose that for all $b < a$, we know that $b \not< b$. Suppose $a < a$. Then by the inductive hypothesis, $a \not< a$.

¹ In the von Neumann ordinal definition of $\mathbb{N}$, we set $0 = \{\}$, and inductively define $n + 1 = n \cup \{n\} = \{0, 1, \ldots, n\}$. Then $a < b$ if and only if $a \in b$.

$$\sum_{i=1}^{n} i = \binom{n+1}{2}$$

(Source: http://mathoverflow.net/questions/8846/proofs-without-words)

By Ben Zinberg

Claim. (Fermat’s Little Theorem): If $p$ is prime and $a > 1$ is any integer, then $a^p - a$ is divisible by $p$.

Proof. Let $S$ be the set of all ordered $p$-tuples of the integers between 1 and $a$. Then $S$ has $a^p$ elements.
For each element of $S$, write out the coordinates of the tuple in a circle like in Figure 1. You can think of this as a necklace with $p$ beads, each of which can be any of $a$ different colors. Let $N$ be the set of all such necklaces; then $N$ also has $a^p$ elements, of course. We’ll consider two necklaces equivalent if one can be rotated to look like the other.
Since $p$ is prime, every such necklace is equivalent to exactly $p$ total necklaces, corresponding to the $p$ different ways the necklace can be rotated, EXCEPT for those necklaces which consist of beads which are all the same color. In other words, the $a^p - a$ non-monochromatic necklaces can be split up into disjoint sets of size $p$, each of which contains necklaces which are all equivalent to each other. In other words, $a^p - a$ is divisible by $p$.

2 Hilarious math jokes!

Riddle: What’s the difference between a mathematician and a large pizza? (Answer below!)

A biologist, a physicist, and a mathematician are sitting on a bench outside an empty house. As they are watching, two people walk into the house. Fifteen minutes later three people walk out.
The biologist says, “They must have reproduced.”
The physicist says, “There must have been a measurement error.”
The mathematician says, “I guess the house wasn’t empty after all.”
The biologist says, “Yeah, that makes more sense.”

There are 10 types of people in the world: Those who can count, and those who can’t.

Answer to riddle: One is a human being who studies mathematics for a living, and the other is an oven-baked, flat, round bread typically topped with a tomato sauce, cheese and various toppings (Source: Wikipedia).

Figure 1: An example necklace with $p = 7$ beads of $a = 3$ colors.
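For readers who would like to see the necklace argument for Fermat’s Little Theorem in action, here is a small brute-force sketch in Python. It is only an illustration: the function names (`rotations`, `orbit_sizes`, `check_necklace_argument`) are made up for this example, and the approach of listing every necklace is practical only for small $p$ and $a$. It groups all $a^p$ necklaces into rotation classes and checks that, for prime $p$, each class has size 1 (monochromatic) or size $p$.

```python
from itertools import product


def rotations(necklace):
    """Return the set of all rotations of a tuple of bead colors."""
    return {necklace[i:] + necklace[:i] for i in range(len(necklace))}


def orbit_sizes(p, a):
    """Split all a**p necklaces (p beads, colors 1..a) into rotation classes.

    Returns a list containing the size of each equivalence class.
    """
    seen = set()
    sizes = []
    for necklace in product(range(1, a + 1), repeat=p):
        if necklace in seen:
            continue  # already counted as a rotation of an earlier necklace
        orbit = rotations(necklace)
        seen |= orbit
        sizes.append(len(orbit))
    return sizes


def check_necklace_argument(p, a):
    """Check the counting argument for a prime p; return the number of size-p classes."""
    sizes = orbit_sizes(p, a)
    monochromatic = sizes.count(1)
    full = sizes.count(p)
    # For prime p, every class has size 1 or size p.
    assert monochromatic + full == len(sizes)
    # The size-1 classes are exactly the a single-color necklaces.
    assert monochromatic == a
    # The remaining a**p - a necklaces split into classes of size p.
    assert full * p == a**p - a
    return full


# Parameters of Figure 1: p = 7 beads, a = 3 colors.
print(check_necklace_argument(7, 3))  # (3**7 - 3) / 7 = 312
```

The primality of $p$ really is needed: `orbit_sizes(4, 2)` contains a class of size 2 (the alternating necklace and its rotation), and indeed $2^4 - 2 = 14$ is not divisible by 4.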

Thanks for reading! ☺