Lectures on the Calabi-Yau Landscape
Jiakang Bao1,2,∗ Yang-Hui He1,3,4,† Edward Hirst1,2,‡ Stephen Pietromonaco5§
1Department of Mathematics, City, University of London, EC1V 0HB, UK 2Department of Physics, Imperial College London, SW7 2AZ, UK 3Merton College, University of Oxford, OX14JD, UK 4School of Physics, NanKai University, Tianjin, 300071, P.R. China 5Department of Mathematics, University of British Columbia, V6T 1Z2, Canada
Abstract In these lecture notes, we survey the landscape of Calabi-Yau threefolds, and the use of machine learning to explore it. We begin with the compact portion of the landscape, focusing in particular on complete intersection Calabi-Yau varieties (CICYs) and elliptic fibrations. Then we examine non-compact Calabi-Yau manifolds which are manifest in Type II superstring theories. They arise as representation varieties of quivers, used to describe gauge theories in the bulk familiar four dimensions. Finally, given the huge amount of Calabi-Yau data, whether and how machine learning can be applied to algebraic geometry and string landscape is also discussed. These notes are directed to the beginning graduate student interested in mathematics and in physics, and are based on lectures given by the 2nd author at the 2019 PIMS Summer School on Algebraic Geometry in High-Energy Physics at the University of Saskatchewan. arXiv:2001.01212v2 [hep-th] 4 Feb 2020
∗[email protected] †[email protected] ‡[email protected] §[email protected]
Contents
1 Introduction
I Compact Calabi-Yau Landscape
2 Calabi-Yau Geometry in Math and Physics
2.1 Topological Data
2.2 String Compactifications
3 C.I.C.Y.
3.1 Cyclic Calabi-Yau Threefolds
3.2 CICY Calabi-Yau Threefolds
4 Elliptically Fibered Calabi-Yau Threefolds
5 Additional Regions of the Compact Landscape
II Non-compact Calabi-Yau Landscape
6 String Theory Structures
6.1 D-branes
6.2 Quivers
6.3 An Orbifold Example: $\mathbb{C}^3/\mathbb{Z}_3$
6.4 McKay Correspondence
7 Algebraic Geometry Viewpoint
7.1 Brane Tilings
7.2 Dessin d'Enfants
8 Non-compact Calabi-Yau Summary
III Machine-Learning the Landscape
9 Performance Measures: Hypersurfaces in $W\mathbb{P}^4$
10 Learning CICYs
10.1 Distinguishing Elliptic Fibrations
11 A Digression: Group Theory
11.1 Learning Cayley Tables
11.2 Learning Finite Simple Groups
12 Summary and Outlook
A Some Complex Geometry
A.1 Kähler Manifolds
A.2 Chern Classes
B Toric Varieties
C Introduction to Machine Learning
C.1 Text recognition
C.2 Neural Networks
C.3 Support Vector Machines
C.4 Decision Trees
C.5 Types of Machine Learning
References
1 Introduction
Superstring theories require spacetime to be 10-dimensional, so they must be reduced to an effectively 4-dimensional theory. The standard solution, string compactification, generalizes Kaluza-Klein compactification and takes the extra six dimensions to form a Calabi-Yau (CY) manifold. In this way, the study of Calabi-Yau manifolds and algebraic geometry has entered theoretical physics.
In order to avoid an excess of symmetries in our observed 4-dimensional universe, isometries of the geometry, which lead to extra graviphotons, are not allowed [1]. This leaves us with the only option of manifolds of complex dimension 3, which are required to have a Kähler structure and vanishing first Chern class ($c_1 = 0$). As will be explained in §2.2, we also want the manifold to be Ricci-flat. However, given a Kähler manifold with zero $c_1$, the existence of a (unique) Kähler metric in the same Kähler class with vanishing Ricci form is not self-evident.
Following the work of Calabi [2] and Yau [3,4], mathematicians achieved great success in studying CY manifolds. Later, physicists realized the crucial role CY manifolds play in fundamental physics, as mentioned above. Discoveries in physics made it possible to reconstruct the Standard Model from compactifications and also led to mirror symmetry, which is now an active interface between mathematics and physics [5]. More details and discussion of the physical predictions from CY manifolds can be found in [1].
Nowadays, thanks to the information age, we are able to let machines help us learn the structure of CY manifolds, owing to the large volume of data compiled since the mid-1980s by physicists and mathematicians. This brings computer science and data science into this interdisciplinary area.
The notes are organized as follows. In Part I, we focus mainly on the compact CY landscape. We start with background on Calabi-Yau geometry and then turn our attention to complete intersection Calabi-Yaus (CICYs). In Part II, we consider the non-compact case, where further physics and mathematics, such as quivers and toric varieties, and their relations are discussed. Finally, in Part III, we apply machine learning to the study of the CY landscape. Along with a quick introduction to machine learning, we apply this technique to different topics in mathematics. Some prerequisites are provided in the appendices.
Part I Compact Calabi-Yau Landscape
Some basic topological and geometric facts are given in Appendix A. For a far more detailed treatment of what follows, we refer the reader to [6–9].
2 Calabi-Yau Geometry in Math and Physics
The story of Calabi-Yau manifolds originates in the mid-1950s with the following conjecture of Eugenio Calabi.
Conjecture 2.1. (The Calabi Conjecture) Let $(X, g, \omega)$ be a compact Kähler manifold, and fix $R \in \Omega^{1,1}(X)$ such that $[R] = c_1(T_X) \in H^{1,1}(X)$. Then there exists a unique Kähler metric $\tilde{g}$ with Kähler form $\tilde{\omega}$ such that $[\omega] = [\tilde{\omega}]$, and
$$R = \mathrm{Ric}(\tilde{\omega})$$
where $\mathrm{Ric}(\tilde{\omega})$ is the Ricci form of $\tilde{\omega}$. The power of this conjecture is that it describes complicated geometric data (curvature) in terms of simpler topological data (Chern classes). For example, in complex dimension 1, this conjecture reduces to the Gauss-Bonnet theorem for Riemann surfaces, which says that the curvature is determined completely by the genus. In higher dimensions, the conjecture is that the curvature is controlled by the first Chern class (of the tangent bundle). Calabi himself proved the uniqueness part of his conjecture, but the existence remained an open problem for 20 years before Shing-Tung Yau completed the proof, for which he received the Fields Medal in 1982.
Theorem 2.2. (Yau) The Calabi conjecture holds.
We will be primarily interested in the special case of $R = 0$, in which we say that $X$ admits a Ricci-flat metric. In general relativity, Riemannian manifolds with Ricci-flat metrics are vacuum solutions of Einstein's equations (that is, solutions without matter and energy). We are therefore interested in such manifolds which are Kähler. This leads us to the definition of a Calabi-Yau manifold¹.
Definition 2.3. Let $X$ be a compact Kähler manifold with $\dim_{\mathbb{C}}(X) = n$. We say $X$ is a Calabi-Yau n-fold if it admits a Ricci-flat metric² of strictly $SU(n)$ holonomy.
¹ In fact, the word "Calabi-Yau" was coined by physicists later [5] for Ricci-flat Kähler manifolds.
² Yau's proof of the Calabi conjecture was not constructive, and to date, there is not a single compact Calabi-Yau manifold where the Ricci-flat metric is known explicitly (outside of trivial cases of tori). This is an important open problem.
Let us give some low-dimensional examples of Calabi-Yau manifolds:
1. The only Calabi-Yau manifold of (complex) dimension 1 is an elliptic curve. Thus, there is a single topological type.
2. The Calabi-Yau manifolds of complex dimension 2 are called K3 surfaces. A simple construction is as a smooth quartic hypersurface in $\mathbb{P}^3$. All K3 surfaces are simply connected, and diffeomorphic to one another; so there is only one topological type. (Note that 4-dimensional tori are indeed Ricci-flat, but they do not satisfy the condition on the holonomy group in the definition.)
Proposition 2.4. For $X$ as in the definition, the following are equivalent³:
1. $X$ is a Calabi-Yau n-fold.
2. The first Chern class of $X$ vanishes; $c_1(T_X) = 0$.
3. There exists a covariantly constant spinor on $X$.
4. There exists a non-vanishing holomorphic n-form on $X$.
5. $X$ is a smooth projective algebraic variety with trivial canonical line bundle $\omega_X \cong \mathcal{O}_X$, where $\omega_X = \bigwedge^n T_X^*$, and which additionally satisfies $H^k(X, \mathcal{O}_X) = 0$ for $0 < k < n$.
The final characterization in the proposition is clearly the preferable one in algebraic geometry. We can remove the hypothesis of projectivity, which results in non-compact Calabi-Yau manifolds, of interest to us in Part II. We could also allow for mild singularities, which inevitably arise when studying families of Calabi-Yau manifolds.
Remark 2.5. One must beware of mildly different definitions of Calabi-Yau. Our definition excludes all tori (in particular, abelian varieties) and, for example, the threefold K3×E; the product of a K3 surface and an elliptic curve. These spaces admit Ricci-flat metrics, though of holonomy strictly contained in SU(n). In physics, this will translate into the low-energy theory having enhanced supersymmetry. Both abelian threefolds and K3×E are of interest in enumerative geometry.
³ There are some subtleties in these statements. The second one is actually weaker. For instance, complex tori of dimension greater than one have vanishing first Chern class, but they fail to satisfy the fifth condition. On the other hand, people often count these as Calabi-Yaus as they have trivial holonomies and infinite fundamental groups. Moreover, there are non-algebraic K3 surfaces that fail the fifth condition even though they are simply connected with holonomy $SU(2)$ [10]. In any case, definitions differ across the literature. This will not be an issue in our applications.
2.1 Topological Data
One can assign to a complex manifold $X$ the Hodge cohomology groups
$$H^{p,q}(X) := H^q(X, \Omega_X^p)$$
with Hodge numbers $h^{p,q}$ the corresponding dimensions. If $X$ is compact and Kähler, the topological Euler characteristic is given by
$$\chi(X) = \sum_{p,q=0}^{\dim X} (-1)^{p+q} h^{p,q}. \tag{2.1}$$
If $X$ is a compact Calabi-Yau threefold, then due to various symmetries [7–9] the only relevant Hodge numbers are $h^{1,1}$ and $h^{2,1}$. By Proposition 2.4, $X$ is a smooth projective variety with vanishing $h^{1,0}$, $h^{2,0}$ and therefore by the Hodge decomposition
$$H^{1,1}(X) \cong H^2(X, \mathbb{C}).$$
We can choose an integral basis $\{J_k\}_{k=1,\dots,h^{1,1}}$ of $H^2(X, \mathbb{C})$ such that the Kähler cone is $\mathcal{K} = \left\{ \sum_k t_k J_k \;\middle|\; t_k \in \mathbb{R}_{>0} \right\}$. In other words, the quantity $h^{1,1}$ measures the number of Kähler classes on $X$ (or by dualizing, the number of curve classes). Using the Calabi-Yau condition, we similarly have
$$H^{2,1}(X) \cong H^1(X, T_X).$$
The cohomology group on the right encodes the infinitesimal deformations of the complex/algebraic structure of $X$. Therefore on a Calabi-Yau threefold, the Hodge number $h^{2,1}$ measures the dimension of the space of complex/algebraic deformations, while $h^{1,1}$ measures the dimension of the Kähler cone. The two Hodge numbers determine the topological Euler characteristic via (2.1)
$$\chi(X) = 2(h^{1,1} - h^{2,1}). \tag{2.2}$$
Using the chosen basis of $\mathcal{K}$ we define the triple intersection form of $X$
$$d_{rst} = \int_X J_r \wedge J_s \wedge J_t.$$
This integral can be hard to compute in general, but we can use the following result [6, Thm. 1.3]. If we have an embedding $f : X \hookrightarrow A$ with $A$ a smooth projective variety of dimension $m + 3$, then for all $\omega \in H^k(A)$
$$\int_X \omega|_X = \int_A \omega \wedge \eta$$
where $\eta$ is an $(m, m)$-form which when restricted to $X$ is the top Chern class of the normal bundle $N_{X/A}$. For our purposes, $A$ will be a simpler space than $X$ itself; for example, a projective space or product of projective spaces.
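The passage from (2.1) to (2.2) can be checked numerically. The sketch below assumes the standard Hodge diamond of a compact Calabi-Yau threefold ($h^{0,0} = h^{3,3} = h^{3,0} = h^{0,3} = 1$, $h^{1,1} = h^{2,2}$, $h^{2,1} = h^{1,2}$, all other entries zero); the function names are illustrative and not from the notes. The two Hodge pairs used are those quoted in Example 3.5.

```python
def calabi_yau_threefold_hodge_diamond(h11, h21):
    """Hodge numbers h^{p,q} (0 <= p, q <= 3), assuming the standard CY3 diamond."""
    h = [[0] * 4 for _ in range(4)]
    h[0][0] = h[3][3] = 1      # connectedness and its dual
    h[3][0] = h[0][3] = 1      # the holomorphic 3-form and its conjugate
    h[1][1] = h[2][2] = h11    # Kahler classes
    h[2][1] = h[1][2] = h21    # complex structure deformations
    return h


def euler_characteristic(hodge):
    """Alternating sum (2.1), with p and q running from 0 to dim X."""
    size = len(hodge)
    return sum((-1) ** (p + q) * hodge[p][q]
               for p in range(size) for q in range(size))


# Hodge pairs (h11, h21) quoted in Example 3.5 of the notes.
for name, (h11, h21) in {"Schoen": (19, 19), "Tian-Yau": (14, 23)}.items():
    chi = euler_characteristic(calabi_yau_threefold_hodge_diamond(h11, h21))
    assert chi == 2 * (h11 - h21)   # agrees with (2.2)
    print(name, chi)                # Schoen 0, then Tian-Yau -18
```

Starting the sum at $p, q = 0$ matters: the corner entries contribute $1 + 1 - 1 - 1 = 0$, which is what makes the result collapse to $2(h^{1,1} - h^{2,1})$.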
For any Kähler threefold, the total Chern class can be written in the chosen basis of $\mathcal{K}$ as
$$c(T_X) = 1 + \sum_{r=1}^{h^{1,1}} [c_1(T_X)]_r J_r + \sum_{r,s=1}^{h^{1,1}} [c_2(T_X)]_{rs} J_r \wedge J_s + \sum_{r,s,t=1}^{h^{1,1}} [c_3(T_X)]_{rst} J_r \wedge J_s \wedge J_t.$$
Moreover, the topological Euler characteristic of a Kähler manifold $X$ is the integral over $X$ of the top Chern class of $T_X$. Using the triple intersection form, we can therefore express
$$\chi(X) = \sum_{r,s,t=1}^{h^{1,1}} d_{rst} [c_3(T_X)]_{rst}.$$
For a Calabi-Yau threefold, of course $c_1(T_X) = 0$, so that leaves $c_2(T_X)$ to be independently specified.
Theorem 2.6 (Wall). The topological type of a compact Calabi-Yau threefold is completely determined by the Hodge numbers $h^{p,q}$, the triple intersection form $d_{rst}$, and the second Chern class $c_2(T_X)$.
It is convenient to contract $c_2$ with $d$ by defining $[c_2(T_X)]_r := \sum_{s,t} [c_2(T_X)]_{st} d_{rst}$. It suffices to record this contraction instead of the individual components $[c_2(T_X)]_{rs}$. Therefore, by Theorem 2.6, the data determining the topological type of a Calabi-Yau threefold is:
$$(h^{1,1}, h^{2,1}), \quad [c_2(T_X)]_r, \quad d_{rst}, \qquad r, s, t = 1, \dots, h^{1,1}. \tag{2.3}$$
Recall from Section 2 that for Calabi-Yau manifolds of dimensions 1 and 2, there is in each case a single topological type. Does this pattern persist in dimension 3? Spectacularly, no. The lower bound on the number of topological types of Calabi-Yau threefolds is currently around 500,000,000! But there is the following conjecture.
Conjecture 2.7 (Yau). The number of topological types of Calabi-Yau threefolds is finite⁴.
In other words, there are finitely many possibilities for the values in the data set (2.3).
Remark 2.8. Beware that even after fixing the topological type of the Calabi-Yau, there is still generally a moduli space of algebraic/complex structures on the variety of fixed type. This is typical of moduli problems: specify as much discrete data as possible, which fixes the topological type, and then study families of complex structures.
⁴ In fact, this conjecture is made for CY n-folds of any dimension. It is certainly true for n = 1, 2.
2.2 String Compactifications
Calabi-Yau threefolds entered physics through string theory in the late 80s. The consistency of the physical string theories (Type I, Type IIA, Type IIB, and the Heterotic theories) remarkably requires that the (real) dimension of spacetime be 10. So we have to contend with the fact that we only observe 4 dimensions. The idea behind string compactifications is to decompose the 10-dimensional spacetime $M_{10}$ as
$$M_{10} = M_4 \times X \tag{2.4}$$
where $M_4$ is our 4-dimensional spacetime, and $X$ is a compact 6-dimensional manifold. The vague intuition should be that the extra 6 dimensions of $X$ are tightly curled-up and unobservable at small energies.
If $X$ is a complex threefold, then it has real dimension 6. But why do we want $X$ to additionally be Calabi-Yau? It is because Calabi-Yau manifolds are those admitting Ricci-flat metrics. In general relativity Ricci-flat manifolds correspond to a vacuum configuration of spacetime, i.e. a universe without matter or energy. Therefore, compactifying on a Calabi-Yau threefold $X$, as in (2.4), models a string theory vacuum.
Let us tie this back in with our exploration of the Calabi-Yau landscape. The vague principle one should keep in mind is:
As $X$ varies over the compact Calabi-Yau landscape, the physics observed in $M_4$ changes. In other words, the topology and geometry of $X$ dictate physical phenomena in spacetime.
3 Complete Intersections in Products of Projective Spaces (CICYs)
In this section we begin constructing our first examples of compact Calabi-Yau threefolds. The simplest (and most famous) Calabi-Yau threefold is the quintic, and more generally, the cyclic manifolds. Subsuming these examples is the important class of complete intersections in products of projective spaces, or CICYs for short. After constructing these geometries, we show how certain crucial topological information is encoded in the defining equations.
3.1 Cyclic Calabi-Yau Threefolds
Let us now construct the most straightforward example of a Calabi-Yau manifold in each dimension. Let $f(x_0, \dots, x_n)$ be a homogeneous degree $d$ polynomial, or equivalently, a section of the line bundle $\mathcal{O}_{\mathbb{P}^n}(d)$. The vanishing locus of the section defines a degree $d$ hypersurface $X$ in the projective space $\mathbb{P}^n$.
Theorem 3.1. (The Adjunction Formula) Let $X \subset \mathbb{P}^n$ be a smooth, closed subvariety of codimension $m$. The canonical bundle of $X$ is given by
$$\omega_X = \Lambda^m N_{X/\mathbb{P}^n} \otimes_{\mathcal{O}_X} \mathcal{O}_{\mathbb{P}^n}(-n-1)\big|_X \tag{3.1}$$
where $N_{X/\mathbb{P}^n}$ is the normal bundle of $X$ in $\mathbb{P}^n$ [11].
Since $X$ is a divisor cut out by a section of $\mathcal{O}_{\mathbb{P}^n}(d)$, the normal bundle is the line bundle $N_{X/\mathbb{P}^n} = \mathcal{O}_{\mathbb{P}^n}(d)|_X$. Therefore, the canonical bundle will be trivial if and only if $d = n + 1$. By the Lefschetz hyperplane theorem, $\pi_1(X)$ is trivial. We have therefore shown the following.
Proposition 3.2. A homogeneous polynomial of degree $n + 1$ in the $n + 1$ projective coordinates on $\mathbb{P}^n$ defines a compact Calabi-Yau n-fold as a divisor $X \subset \mathbb{P}^n$.
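Spelled out, the triviality criterion behind Proposition 3.2 is a one-line computation. The following is a sketch for a hypersurface (codimension $m = 1$), using only (3.1) and the normal bundle of a degree $d$ divisor given above:

```latex
\begin{align*}
\omega_X &= N_{X/\mathbb{P}^n} \otimes \mathcal{O}_{\mathbb{P}^n}(-n-1)\big|_X \\
         &= \mathcal{O}_{\mathbb{P}^n}(d)\big|_X \otimes \mathcal{O}_{\mathbb{P}^n}(-n-1)\big|_X \\
         &= \mathcal{O}_{\mathbb{P}^n}(d-n-1)\big|_X .
\end{align*}
```

Hence $\omega_X \cong \mathcal{O}_X$ when $d = n + 1$; for $n = 4$ this is the degree 5 case.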
Since we are interested in dimension 3, of most importance here will be the quintic Calabi-Yau threefold constructed from a quintic polynomial in $\mathbb{P}^4$. For example, the Fermat quintic is the vanishing locus of
$$f(x_0, x_1, x_2, x_3, x_4) = x_0^5 + x_1^5 + x_2^5 + x_3^5 + x_4^5. \tag{3.2}$$
Remark 3.3. Note that saying "the" quintic is somewhat misleading, as we actually get a family of Calabi-Yau threefolds by varying the coefficients in the quintic polynomial. However, these correspond to various complex structures on the same underlying topological type. It is conventional to refer to the entire family as "the quintic." Similarly, note that certain quintic polynomials give singular varieties. Unless mentioned otherwise, we will assume we are working with a smooth member of the family, for example (3.2).
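As a hedged illustration that is not taken from the notes, the Chern classes of a smooth degree $d$ hypersurface $X \subset \mathbb{P}^n$ can be computed from the standard formula $c(T_X) = (1+J)^{n+1}/(1+dJ)$, where $J$ is the hyperplane class restricted to $X$ and $\int_X J^{n-1} = d$. The function names below are hypothetical. For the quintic the sketch confirms $c_1 = 0$ and gives the Euler characteristic as the integral of the top Chern class.

```python
from math import comb


def hypersurface_chern_coefficients(n, d):
    """Coefficients [c_0, ..., c_{n-1}] of c(T_X) in powers of J.

    Assumes the standard formula c(T_X) = (1 + J)^(n+1) / (1 + d J),
    truncated at J^(n-1) because X has complex dimension n - 1.
    """
    dim = n - 1
    numerator = [comb(n + 1, k) for k in range(dim + 1)]   # (1 + J)^(n+1)
    inverse = [(-d) ** k for k in range(dim + 1)]          # 1 / (1 + d J)
    return [sum(numerator[i] * inverse[k - i] for i in range(k + 1))
            for k in range(dim + 1)]


def hypersurface_euler_characteristic(n, d):
    """Integral of the top Chern class, using the fact that J^(n-1) integrates to d."""
    return d * hypersurface_chern_coefficients(n, d)[n - 1]


print(hypersurface_chern_coefficients(4, 5))    # [1, 0, 10, -40]
print(hypersurface_euler_characteristic(4, 5))  # -200
```

The same functions applied to the quartic in $\mathbb{P}^3$ and the cubic in $\mathbb{P}^2$ return Euler characteristics 24 and 0, matching a K3 surface and an elliptic curve.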
How can we generalize the quintic Calabi-Yau? The quintic is a hypersurface, and the most immediate generalization of a hypersurface is a complete intersection $X \subset \mathbb{P}^n$, which means the codimension of $X$ equals the number of polynomials cutting it out. This is the most ideal intersection, though it is quite rare in the world of varieties.
Suppose we have $k$ homogeneous polynomials $\{f_i\}_{i=1,\dots,k}$ on $\mathbb{P}^n$ with $q_i \in \mathbb{Z}_{\geq 0}$ the degree of $f_i$. The vanishing locus of the $f_i$ produces a compact Calabi-Yau threefold as a complete intersection in $\mathbb{P}^n$ if
$$\begin{aligned} k &= n - 3 && \text{(Complete intersection condition)} \\ n + 1 &= \sum_{i=1}^{k} q_i && \text{(Generalization of Adjunction)} \end{aligned} \tag{3.3}$$
One can show the fundamental group is trivial using a generalization of the Lefschetz hyperplane theorem [6, Thm. 1.4]. We call such a manifold a cyclic Calabi-Yau threefold. A notation which will prove helpful in the following section is to denote a collection of degrees as
$$M = [\, q_1 \;\; q_2 \;\; \cdots \;\; q_k \,]$$
with $X_M$ the corresponding cyclic Calabi-Yau. Note that $n$ can be recovered from the condition $n = k + 3$. Clearly, (3.3) defines a rather constrained combinatorial problem, and it turns out there are only 5 solutions. In the notation above, these are:
$$[\,5\,], \quad [\,2\;\,4\,], \quad [\,3\;\,3\,], \quad [\,3\;\,2\;\,2\,], \quad [\,2\;\,2\;\,2\;\,2\,].$$
The first example is the quintic, the second example is the complete intersection of a quadric and a quartic in $\mathbb{P}^5$, the third example is the complete intersection of two cubics in $\mathbb{P}^5$, etc.
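The count of 5 can be reproduced by brute force. The sketch below makes one assumption that is not spelled out in the text: every degree satisfies $q_i \geq 2$, since a degree 1 equation only cuts $\mathbb{P}^n$ down to $\mathbb{P}^{n-1}$ and gives nothing new. The function name and the search bound `max_equations` are illustrative.

```python
def cyclic_degree_lists(max_equations=10):
    """Enumerate degree lists [q_1, ..., q_k] solving (3.3).

    For each k the ambient space is P^n with n = k + 3, and the degrees
    must sum to n + 1. Assumption: every q_i >= 2 (linear equations are
    discarded as redundant). Lists are returned in non-decreasing order,
    so [3 2 2] from the text appears here as [2, 2, 3].
    """
    def partitions(total, parts, smallest):
        # Non-decreasing lists of `parts` integers >= smallest summing to total.
        if parts == 0:
            if total == 0:
                yield []
            return
        for first in range(smallest, total // parts + 1):
            for rest in partitions(total - first, parts - 1, first):
                yield [first] + rest

    solutions = []
    for k in range(1, max_equations + 1):
        n = k + 3
        solutions.extend(partitions(n + 1, k, 2))
    return solutions


print(cyclic_degree_lists())
# [[5], [2, 4], [3, 3], [2, 2, 3], [2, 2, 2, 2]]
```

The search terminates on its own: $k$ degrees of size at least 2 sum to at least $2k$, while (3.3) requires the sum to be $k + 4$, so no solutions exist for $k > 4$.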
3.2 CICY Calabi-Yau Threefolds
We can achieve a far greater generalization of the quintic by considering complete intersections not just in the ambient space $\mathbb{P}^n$, but rather in a product of projective spaces
$$A = \mathbb{P}^{n_1} \times \cdots \times \mathbb{P}^{n_m}.$$
Suppose we have $k$ multi-homogeneous polynomials $\{f_i\}_{i=1,\dots,k}$ on $A$, with multi-degrees $q_j^i \in \mathbb{Z}_{\geq 0}$ where $i = 1, \dots, k$ and $j = 1, \dots, m$. In words, $q_j^i$ is the degree of the $i$-th polynomial on the $j$-th factor of $A$. Generalizing the notation for cyclic Calabi-Yau threefolds, we package the data into the configuration matrix
$$M = \begin{bmatrix} q_1^1 & q_1^2 & \cdots & q_1^k \\ q_2^1 & q_2^2 & \cdots & q_2^k \\ \vdots & \vdots & \ddots & \vdots \\ q_m^1 & q_m^2 & \cdots & q_m^k \end{bmatrix} \tag{3.4}$$
We define $X_M \subset A$ to be the vanishing locus of the $\{f_i\}_{i=1,\dots,k}$. The projective variety $X_M$ is a Calabi-Yau threefold if the following conditions hold
$$\begin{aligned} k &= \sum_{i=1}^{m} n_i - 3 && \text{(Complete intersection condition)} \\ n_j + 1 &= \sum_{i=1}^{k} q_j^i, \quad \text{for all } j = 1, \dots, m && \text{(Generalization of Adjunction)} \end{aligned} \tag{3.5}$$
Such an $X_M$ is called a CICY, which refers to a Calabi-Yau threefold realized as a complete intersection in a product of projective spaces. One is faced with the following combinatorial problem:
Problem 3.4. Can we classify all configuration matrices (3.4) up to equivalence and redundancies?
This represents one of the earliest big-data problems in the world of pure mathematics and physics. It was undertaken in the late 1980s by Candelas, Lutken, Schimmrigk and others [12]. Let us briefly survey the landscape of CICYs that were discovered:
• There are 7890 CICYs corresponding to 7890 inequivalent configuration matrices. The smallest matrix is 1 × 1 (corresponding to the quintic) and they reach a maximum of 12 rows or 15 columns.
• $q_j^i \in [0, 5]$ for all $i, j$.
• There are 266 distinct Hodge pairs $(h^{1,1}, h^{2,1})$.
• There are 70 distinct Euler characteristics χ ∈ [−200, 0].
• The transpose of a configuration matrix is again a configuration matrix.
• The 5 cyclic Calabi-Yau threefolds are the only ones with a single row. In other words, there are only 5 complete intersection Calabi-Yau threefolds in a single projective space.
Example 3.5. Consider the following configuration matrix
$$S = \begin{bmatrix} 1 & 1 \\ 3 & 0 \\ 0 & 3 \end{bmatrix} \tag{3.6}$$
From the conditions (3.5), it is straightforward to check that $S$ corresponds to a compact Calabi-Yau threefold $X_S$ which is cut out of $\mathbb{P}^1 \times \mathbb{P}^2 \times \mathbb{P}^2$ by two equations of multi-degrees $(1, 3, 0)$ and $(1, 0, 3)$, respectively. This is a CICY, which we call the Schöen manifold, after Chad Schöen [13]. The two relevant Hodge numbers are $h^{2,1} = h^{1,1} = 19$, and therefore, $\chi(X_S) = 0$. In the next section we will see that $X_S$ is also an elliptic fibration. The transpose of the matrix (3.6)
$$TY = \begin{bmatrix} 1 & 3 & 0 \\ 1 & 0 & 3 \end{bmatrix} \tag{3.7}$$
of course also corresponds to a CICY, called the Tian-Yau manifold $X_{TY}$. The Hodge numbers are $h^{1,1} = 14$, $h^{2,1} = 23$ and therefore, $\chi(X_{TY}) = -18$. The Tian-Yau manifold carries a free $G = \mathbb{Z}/3\mathbb{Z}$ action which preserves the Calabi-Yau structure. As a result, the quotient $X_{TY}/G$ is a smooth compact Calabi-Yau threefold (though not a CICY) which has a special Euler characteristic $\chi = -6$, see [7–9]. At the time, this quotient was taken seriously as a candidate for the geometry of the universe! Unfortunately, it has some problems in its matter content.
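The check of conditions (3.5) for $S$ and $TY$ is mechanical and can be sketched in a few lines. The representation chosen here (a list of rows, one row per projective factor) and the function names are assumptions made for illustration. The adjunction condition is used to read off each $n_j$ from its row sum, after which only the complete intersection condition remains to be tested.

```python
def ambient_dimensions(config):
    """Dimensions n_j of the projective factors, from n_j + 1 = sum_i q_j^i."""
    return [sum(row) - 1 for row in config]


def is_cy_threefold_configuration(config):
    """Test the complete intersection condition k = sum_j n_j - 3 of (3.5)."""
    num_equations = len(config[0])          # k = number of columns
    if any(len(row) != num_equations for row in config):
        raise ValueError("configuration matrix must be rectangular")
    dims = ambient_dimensions(config)
    return all(n >= 1 for n in dims) and num_equations == sum(dims) - 3


S = [[1, 1], [3, 0], [0, 3]]                # matrix (3.6)
TY = [list(col) for col in zip(*S)]         # its transpose, matrix (3.7)

print(ambient_dimensions(S), is_cy_threefold_configuration(S))    # [1, 2, 2] True
print(ambient_dimensions(TY), is_cy_threefold_configuration(TY))  # [3, 3] True
```

Under this reading $TY$ lives in $\mathbb{P}^3 \times \mathbb{P}^3$ and is cut out by three equations. A single quartic, `[[4]]`, fails the test because it defines a surface rather than a threefold.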
In general, it is difficult to compute the Hodge numbers for the CICY dataset (in the above example, we gave them without proof). We present this topological data, along with the Euler characteristic for CICYs, in Figure 1. The Hodge numbers are presented as frequency plots. Interestingly, the distribution of $h^{1,1}$ is somewhat Gaussian while that of $h^{2,1}$ is somewhat Poisson.
Figure 1: CICY topological data. Panels: (a) $h^{1,1}$, (b) $h^{2,1}$, (c) $\chi$, (d).
All CICYs have non-positive Euler characteristic. One weak form of the mirror symmetry conjecture is that compact Calabi-Yau threefolds come in pairs with opposite Euler characteristics. Therefore, if one put too much stock in the CICY dataset, one might wrongly conclude that all Calabi-Yau manifolds have non-positive Euler characteristic! We clearly have to venture further in the landscape to encounter the mirror partners of the CICYs.
4 Elliptically Fibered Calabi-Yau Threefolds
Elliptic curves are among the most beautiful objects in mathematics. They provide a link between the fields of geometry, number theory, algebra, and even physics. In fact, as we saw in Section 2, an elliptic curve is the unique Calabi-Yau manifold in dimension 1. The notion of an elliptic fibration should be thought of as elliptic curves moving in a family. To understand this vague intuition, let us start with some basics. Let $\Lambda \subset \mathbb{C}$ be a full-rank lattice. Topologically, the quotient space $\mathbb{C}/\Lambda$ is a complex torus, or a Riemann surface of genus 1. The following important proposition says that all such Riemann surfaces arise from cubic curves in the projective plane, i.e. cubic plane curves.
Proposition 4.1. Riemann surfaces of genus 1 are in bijection with smooth cubic hypersurfaces in $\mathbb{P}^2$, i.e. smooth vanishing loci of homogeneous degree 3 polynomials in 3 variables [14].
By the degree-genus formula for plane curves, any smooth cubic hypersurface in $\mathbb{P}^2$ has genus 1. Conversely, given a complex torus of the form $\mathbb{C}/\Lambda$, the Weierstrass ℘-function $\wp(\tau, z)$ associated to $\Lambda$ gives an embedding into $\mathbb{P}^2$, and the differential equation satisfied by $\wp(\tau, z)$ implies that the image satisfies a cubic equation.
Consider the complex threefold $X \subset \mathbb{P}^2 \times \mathbb{P}^2$ defined by the vanishing locus of the following bi-homogeneous degree $(1, 3)$ polynomial
$$a_0 x_0^3 + a_1 x_1^3 + a_2 x_2^3 = 0. \tag{4.1}$$
Here $(a_0 : a_1 : a_2)$ are coordinates on the first factor of $\mathbb{P}^2$ and $(x_0 : x_1 : x_2)$ are coordinates on the second. Notice that for any point $(a_0 : a_1 : a_2) \in \mathbb{P}^2$ the above equation becomes a cubic in (the second) $\mathbb{P}^2$. Therefore, the map $\pi : X \to \mathbb{P}^2$ defined by projection onto the first factor is surjective and all fibers are cubics in $\mathbb{P}^2$. This motivates the following definition.
Definition 4.2. An elliptic fibration is a morphism⁵ $\pi : X \to B$ between smooth algebraic varieties $X, B$ such that a generic fiber of $\pi$ is a smooth elliptic curve. We call $X$ the total space and $B$ the base.
An elliptically fibered Calabi-Yau threefold is a Calabi-Yau threefold $X$ together with the structure of an elliptic fibration $\pi : X \to B$. One should think of an elliptic fibration $\pi : X \to B$ as a family of elliptic curves parameterized by the base $B$. However, over certain loci in the base, the elliptic curves can degenerate to singular curves. In virtually all interesting fibrations in algebraic geometry, one has to allow for singular fibers. For example, looking back to (4.1), the fiber above the point $(1 : -1 : 0) \in \mathbb{P}^2$ is
$$x_0^3 - x_1^3 = (x_0 - x_1)(x_0^2 + x_0 x_1 + x_1^2)$$
which is not a smooth cubic: it is the union of a line and a conic.
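A short check, added here as a sketch rather than taken from the notes, locates all singular fibers of (4.1) with the Jacobian criterion. The symbol $F_a$ for the fiber equation is introduced only for this computation.

```latex
F_a(x) = a_0 x_0^3 + a_1 x_1^3 + a_2 x_2^3, \qquad
\nabla F_a = \left( 3 a_0 x_0^2,\; 3 a_1 x_1^2,\; 3 a_2 x_2^2 \right).
% A singular point of the fiber is a nonzero (x_0 : x_1 : x_2) with
% \nabla F_a = 0, i.e. a_i x_i^2 = 0 for every i.
% If all a_i \neq 0, this forces x_0 = x_1 = x_2 = 0, so the fiber is smooth.
% If some a_i = 0, the coordinate point with x_i = 1 and the other x_j = 0
% lies on the fiber and is singular.
\text{Singular fibers lie exactly over } \{\, a_0 a_1 a_2 = 0 \,\} \subset \mathbb{P}^2 .
```

The point $(1 : -1 : 0)$ used above has $a_2 = 0$, and the corresponding fiber is indeed singular at $(0 : 0 : 1)$, where the line and the conic meet.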
Example 4.3. Let $Y \subset \mathbb{P}^1 \times \mathbb{P}^2$ be the vanishing locus of the bi-homogeneous degree $(1, 3)$ polynomial
$$a_0 f(x_0, x_1, x_2) + a_1 g(x_0, x_1, x_2) = 0$$
where $f, g$ are generic homogeneous cubic polynomials. Since for any point $(a_0 : a_1) \in \mathbb{P}^1$ the above equation becomes a cubic in $\mathbb{P}^2$, the projection onto the first factor $\pi : Y \to \mathbb{P}^1$ defines an elliptic fibration called a rational elliptic surface.
Example 4.4. Recall from Example 3.5 that the Schöen manifold $X_S \subset \mathbb{P}^1 \times \mathbb{P}^2 \times \mathbb{P}^2$ is the vanishing locus of homogeneous polynomials of multi-degree $(1, 3, 0)$ and $(1, 0, 3)$ respectively. Let $Y$ be a rational elliptic surface from Example 4.3. We can define a map
$$\pi : X_S \to Y \subset \mathbb{P}^1 \times \mathbb{P}^2$$
by projecting onto the vanishing locus of the multi-degree $(1, 3, 0)$ polynomial. The fiber over a point in $\mathbb{P}^1 \times \mathbb{P}^2$ is a cubic in $\mathbb{P}^2$ since we have to impose the second equation defining $X_S$. Therefore, $X_S$ is an elliptically fibered Calabi-Yau threefold.
⁵ Strictly speaking, we want the map $\pi$ to be flat and proper. These are technical algebro-geometric conditions ensuring we have a nice family of projective curves of arithmetic genus 1.
The above example illustrates that there are CICYs which are also elliptically fibered Calabi-Yau threefolds. See Figure 2, where "S" denotes the Schöen manifold. According to [7–9], there is a common belief that "most" Calabi-Yau threefolds are elliptically fibered. It is an active area of research to determine precisely which Calabi-Yau threefolds are elliptically fibered.
5 Additional Regions of the Compact Landscape
Unfortunately, there are many important classes of compact Calabi-Yau threefolds which we cannot discuss in detail here. Most notable are the Calabi-Yau hypersurfaces in 4-dimensional toric varieties. Their classification was undertaken in the late 1990s by Kreuzer-Skarke (KS), and resulted in one of the biggest datasets seen in pure mathematics. For details on the KS dataset, we refer the reader to [7–9].
In Figure 2 we summarize the portions of the Calabi-Yau landscape mentioned in this survey. The point marked "S" denotes the Schöen manifold, which is both an elliptic fibration and a CICY. The point marked "Q" is the quintic, which is both a CICY and a hypersurface in a toric variety. The points labelled "×" denote compact Calabi-Yau threefolds not falling into any of these groups.
Figure 2: The compact Calabi-Yau threefold landscape. Regions: Calabi-Yau Threefolds, Elliptic Fibration, KS Toric Hypersurface, CICY; marked points: S, Q.
Part II Non-compact Calabi-Yau Landscape
6 String Theory Structures
6.1 D-branes
D-branes occur in Type IIB Superstring theory as the Dirichlet boundary conditions of open strings. A D-brane is hence the hyperplane traced out by the allowed movement of the endpoint of an open string. The dimensionality of the D-brane defines the restriction on the directions in which the string endpoint can move: a Dp brane only allows string endpoints to move in its (p+1)-dimensional world-volume. For example, a D0 brane is a spatial point moving through time, and fixes the endpoint of the string. Similarly, a D1 brane is a spatial line, forming a sheet as it is traced through time, and restricts the string endpoint to positions on this line at all times.
Figure 3: A graphic representation of a D-brane [15]. The vertical axis gives full Minkowski space, $\mathbb{R}^{1,3}$, such that a vertical line is the D3 brane considered in Superstring theory. Further theories may use higher dimensional branes indicated by the vertical line's extension into a plane along the dk axis. The remaining dimensions of the theory are extra, and only endpoints of open strings are restricted to the D-brane as shown.
The world-volume of a D-brane supports a tensor form of dimension (p+1), which can be integrated over the spatial dimensions to give a conserved charge, known as the Chan-Paton factor of the brane. The form in consideration connects the brane with a $U(1)$-bundle, such that enhanced gauge symmetry arises as the branes are stacked. In the stacking process, the world-volumes of $N$ D-branes are overlaid in spacetime in the limit of infinitesimal separation, and the total brane gauge group enhances via $U(1)^N \mapsto U(N)$. Here the gauge connection on the branes generalises to a higher rank tensor as the string endpoints can be connected across multiple branes in the stack. This becomes important in defining the quiver representation, which is used in the following machine-learning analysis.
D-branes are important in the brane-world physical interpretation of Type II Superstring theory. In the 10-dimensional spacetime of the Type IIB superstrings, the endpoints are restricted to lie on a D3 brane, whose world-volume is the familiar $\mathbb{R}^{1,3}$ Minkowski space of general relativity and other theories. The remaining six dimensions form a non-compact Calabi-Yau space, such that $X_{10} = \mathbb{R}^{1,3} \times X_6$. The Standard Model lives on the D3 brane (or stack of $N$ D3 branes), and only interacts with the $X_6$ Calabi-Yau space via gravitation. The simplest case of a non-compact Calabi-Yau 3-fold is $\mathbb{C}^3$, which is trivially Ricci-flat. Beyond that, orbifolds are a natural candidate. Orbifolds are formed by taking the quotient of a manifold by the action of a discrete group. These spaces are discussed further in Appendix B [7–9].
6.2 Quivers

A quiver, Q, is a multi-digraph whose sets of nodes and arrows have finite cardinalities $N_0$ and $N_1$ respectively. The quiver represents a gauge theory, where each node has an associated $U(N_i)$ gauge group. The product of all node gauge groups gives the full gauge group of the theory. Each arrow is associated with a field, $X_{ij}$, in the bi-fundamental representation of the gauge groups associated with the nodes connected by the arrow. The fields transform according to the Young tableaux $(\square, \overline{\square})$ of the two node groups. The superpotential, W, of the theory the quiver represents leads to a set of polynomials, $\{\partial_{X_{ij}} W = 0\}$, which physically give the vacuum state of the theory.

Importantly, the representation variety of the quiver is the Vacuum Moduli Space of the gauge theory. A quiver’s representation variety is the gauge invariant quotient of the quiver’s representations, with relations from the superpotential, and quotiented by a product group of complex General Linear transformations. Geometric invariant theory (GIT) is generally used to construct moduli spaces by considering the quotients of algebraic varieties by groups.

This representation variety is an affine variety, such that the polynomial set whose zero-locus defines the variety generates the corresponding prime ideal. Conversely, the Vacuum Moduli Space of a gauge theory is a geometric space with a vacuum state of the gauge theory associated to each point in the space. This moduli space often forms a manifold known as the vacuum manifold of the theory.

The space of quivers and superpotentials, (Q, W), produces a space of representation varieties, which hence give all the Vacuum Moduli Spaces of the gauge theory in question. Each of the Vacuum Moduli Spaces of a supersymmetric gauge theory is a non-compact Calabi-Yau manifold, and hence this is how the non-compact Calabi-Yau landscape naturally arises in superstring theory. A simple example of a quiver is the “clover”, which represents the famous N = 4 Super Yang-Mills theory, shown in figure 4. The superpotential for this example is $W = \mathrm{Tr}\,([X,Y]Z)$, which leads to the simplest Vacuum Moduli Space, $\mathbb{C}^3$ [7–9].

Figure 4: The quiver for N = 4 Super Yang-Mills theory, with three adjoint fields: X, Y, Z [7–9].
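To make the multi-digraph description concrete, the following is a minimal sketch of a quiver as a data structure, instantiated on the clover quiver. The class name `Quiver`, its fields and the rank value `N = 2` are illustrative assumptions, not notation from these notes.

```python
from collections import Counter


class Quiver:
    """A multi-digraph: nodes carry gauge ranks, arrows are (tail, head) pairs."""

    def __init__(self, ranks, arrows):
        self.ranks = dict(ranks)      # node label -> N_i of the U(N_i) factor
        self.arrows = list(arrows)    # one (tail, head) pair per field X_ij

    @property
    def n_nodes(self):                # N_0 in the text
        return len(self.ranks)

    @property
    def n_arrows(self):               # N_1 in the text
        return len(self.arrows)

    def adjacency(self):
        """Number of arrows for each ordered pair of nodes."""
        return Counter(self.arrows)

    def adjoints(self):
        """Arrows that start and end on the same node (adjoint fields)."""
        return [a for a in self.arrows if a[0] == a[1]]


N = 2  # hypothetical rank of the single gauge group
clover = Quiver({1: N}, [(1, 1), (1, 1), (1, 1)])   # the fields X, Y, Z
```

Here all three arrows of the clover are loops on the single node, which is the graph-theoretic statement that X, Y and Z are adjoint fields.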
6.3 An Orbifold Example: $\mathbb{C}^3/\mathbb{Z}_3$

Here we consider a typical example of a quiver gauge theory commonly used in association with the AdS/CFT correspondence, as it is the worldvolume theory of a D3 brane in the bulk spacetime. The non-compact Calabi-Yau manifold examined is given by the toric variety $\mathbb{C}^3/\mathbb{Z}_3$. This quotient structure makes the manifold an orbifold; the algebraic geometry structure is explained further in Appendix B. The $U(1)^3$ quiver in question is shown in figure 5 and contains 9 fields.
Figure 5: The $U(1)^3$ quiver with 9 fields denoted by the 3 sets of 3 arrows [16].
Since 3 fields exist on each of the 3 edges, there are correspondingly $3^3 = 27$ gauge invariant operators possible, associated with all the closed cycles in the quiver. In this theory, each of the gauge invariant operators appears in the superpotential as a product of the fields in the corresponding cycle, giving

$$ W = \sum_{\alpha,\beta,\gamma=1}^{3} \varepsilon_{\alpha\beta\gamma} X_{12}^{\alpha} X_{23}^{\beta} X_{31}^{\gamma} , \tag{6.1} $$

for the totally antisymmetric rank 3 tensor $\varepsilon_{\alpha\beta\gamma}$, where the Greek indices run from 1 to 3 for each of the 3 arrows between each pair of nodes. Each field has subscripts to denote the nodes it is in representations of. There are also 9 F-term equations of motion from the superpotential, which are

$$ 0 = \sum_{\beta,\gamma=1}^{3} \varepsilon_{\alpha\beta\gamma} X_{23}^{\beta} X_{31}^{\gamma} = \sum_{\alpha,\gamma=1}^{3} \varepsilon_{\alpha\beta\gamma} X_{12}^{\alpha} X_{31}^{\gamma} = \sum_{\alpha,\beta=1}^{3} \varepsilon_{\alpha\beta\gamma} X_{12}^{\alpha} X_{23}^{\beta} , \tag{6.2} $$

where each term represents 3 equations, one for each value of the uncontracted index. These equations arise from imposing $0 = \partial_X W$ for each of the fields, X.

The 27 gauge invariant operators are redefined as coordinates of $\mathbb{C}^{27}$, denoted $y_{\alpha\beta\gamma}$. Elimination with the F-term equations via low degree polynomial interpolation [16] leads to a system of 17 linear equations and 27 quadratic equations. Further elimination via trivial substitution with the 17 linear equations reduces the system to 27 equations in 10 variables, thus giving the $\mathbb{C}^{10}$ space. These equations are recognised as the standard Veronese embedding $\mathbb{P}^2 \hookrightarrow \mathbb{P}^9$, which can be affinised into a $\mathbb{C}^3$ embedding in the $\mathbb{C}^{10}$ found above. This embedding corresponds to there existing exactly 10 degree 3 monomials in 3 variables, such that each one then corresponds to a dimension in $\mathbb{C}^{10}$. These equations then give the degree 9 irreducible variety which defines the 3 dimensional orbifold $\mathbb{C}^3/\mathbb{Z}_3$ [17].

The exponents of the 10 degree 3 monomials in 3 variables give vectors in the fan of the toric variety definition (noting that the orbifold being abelian makes it also toric). Taking the rays of this fan gives three coplanar vectors, which in the plane correspond to points which in turn define the toric diagram. These are {(1, 0), (0, 1), (−1, −1)}, which are plotted in figure 6; the dual of this diagram then gives the orbifold’s toric diagram [7–9,16].
Figure 6: The toric diagram dual for the $\mathbb{C}^3/\mathbb{Z}_3$ orbifold [7–9]. The origin is denoted in the diagram centre, and the toric diagram can be retrieved as this diagram’s dual.
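The counting in this subsection can be checked by hand on a small numerical sketch. It assumes one particular solution branch of the F-terms (6.2), namely that the three triplets of fields are all proportional to a common vector v; this branch, the random seed and the scalings 2, 3, 5 are illustrative assumptions and not the elimination procedure of [16]. On that branch the 27 operators become totally symmetric, leaving 10 independent ones, and counting products of cubic monomials gives 27 independent quadratic relations among them.

```python
import itertools
import random


def epsilon(a, b, c):
    """Totally antisymmetric rank-3 tensor on indices 0, 1, 2 (eps_012 = +1)."""
    return (a - b) * (b - c) * (c - a) // 2


def f_terms(X12, X23, X31):
    """All 9 derivatives of W = eps_abc X12^a X23^b X31^c for commuting fields."""
    r = range(3)
    d12 = [sum(epsilon(a, b, c) * X23[b] * X31[c] for b in r for c in r) for a in r]
    d23 = [sum(epsilon(a, b, c) * X12[a] * X31[c] for a in r for c in r) for b in r]
    d31 = [sum(epsilon(a, b, c) * X12[a] * X23[b] for a in r for b in r) for c in r]
    return d12 + d23 + d31


# Assumed solution branch: all three triplets are parallel to one vector v.
rng = random.Random(0)
v = [rng.randint(1, 9) for _ in range(3)]
X12, X23, X31 = ([s * x for x in v] for s in (2, 3, 5))   # arbitrary scalings
f_values = f_terms(X12, X23, X31)

# The 27 gauge invariant operators y_abc = X12^a X23^b X31^c on this branch.
y = {(a, b, c): X12[a] * X23[b] * X31[c]
     for a, b, c in itertools.product(range(3), repeat=3)}
is_symmetric = all(y[k] == y[tuple(sorted(k))] for k in y)
n_independent = len({tuple(sorted(k)) for k in y})

# Quadratic relations among the 10 cubic monomials in 3 variables:
# (number of unordered pairs of cubics) minus (number of distinct sextic products).
cubics = [m for m in itertools.product(range(4), repeat=3) if sum(m) == 3]
pairs = list(itertools.combinations_with_replacement(cubics, 2))
sextics = {tuple(p + q for p, q in zip(m1, m2)) for m1, m2 in pairs}
n_quadrics = len(pairs) - len(sextics)
```

The values `n_independent` and `n_quadrics` reproduce the 10 variables and 27 quadratic equations quoted in the text, while 27 − 10 accounts for the 17 linear equations.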
The corresponding brane tiling and dessin d’enfant can then be formed from the toric diagram; these objects are addressed in section 7 [18].
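As a small illustration of how much information the three lattice points carry, the sketch below computes the area and lattice-point content of the triangle with vertices (1, 0), (0, 1), (−1, −1). The helper names are hypothetical. The comparison with the quiver relies on a standard fact that is not stated in these notes: for toric quiver theories, twice the area of the toric diagram equals the number of gauge nodes.

```python
from math import gcd


def twice_area(pts):
    """Twice the area of a lattice polygon with vertices in cyclic order (shoelace)."""
    n = len(pts)
    s = sum(pts[i][0] * pts[(i + 1) % n][1] - pts[(i + 1) % n][0] * pts[i][1]
            for i in range(n))
    return abs(s)


def boundary_points(pts):
    """Number of lattice points on the boundary of the polygon."""
    n = len(pts)
    return sum(gcd(abs(pts[(i + 1) % n][0] - pts[i][0]),
                   abs(pts[(i + 1) % n][1] - pts[i][1])) for i in range(n))


def interior_points(pts):
    """Pick's theorem in the form 2A = 2I + B - 2, solved for I."""
    return (twice_area(pts) - boundary_points(pts) + 2) // 2


diagram = [(1, 0), (0, 1), (-1, -1)]
vertex_sum = tuple(sum(c) for c in zip(*diagram))   # centroid sits at the origin
normalised_area = twice_area(diagram)               # compare with the 3 quiver nodes
n_interior = interior_points(diagram)               # the origin is the only one
```

The single interior point is the origin marked at the centre of figure 6, and the normalised area 3 matches the three nodes of the $U(1)^3$ quiver.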
6.4 McKay Correspondence

The McKay correspondence concerns a discrete finite subgroup $G \subset SU(2)$. Taking the tensor product of the defining 2 complex dimensional representation of G with each irrep of G, and then decomposing this tensor product into irreps, makes the correspondence manifest: the decomposition coefficients form the adjacency matrix of one of the simply-laced Dynkin diagrams.

Dynkin diagrams represent the root system of the gauge group’s Lie algebra. To be simply-laced means there is at most one edge between any pair of nodes, which represents a restriction on the angles between the fundamental roots. Specifically, the simply-laced Dynkin diagrams are $A_n$, $D_n$, $E_n$, where the first two are series of diagrams for $n \in \mathbb{Z}^+$, whilst $E_n$ refers to three of the exceptional Lie algebras. The Dynkin diagrams in question are affine-extended, which is canonically achieved by central extension of the original Lie algebra. This amounts to introducing an additional imaginary root, which increases the dimensionality of the root system. These are hence represented with an additional node, and denoted: $\tilde{A}_n$, $\tilde{D}_n$, and $\tilde{E}_n$
respectively. In the special case of simply-laced diagrams, the Dynkin diagrams correspond exactly to their Coxeter diagrams, which represent Coxeter groups, defined by reflection symmetries.

This is relevant because the representation varieties of the affine Dynkin diagrams formulated as quivers are Calabi-Yau 2-folds (a.k.a. K3-surfaces). We can then produce orbifolds from these, described by McKay quivers, such that they have the form $\mathbb{C} \times (\mathbb{C}^2/G)$. These orbifolds are hence also candidates for the extra dimensions in superstring theory. However, where $\mathbb{C}^3$ leads to N = 4 Super Yang-Mills theory on the D3 brane, these orbifolds produce N = 2 supersymmetric QFTs.

When taking quotients to produce the orbifolds in question, relations between the invariants of the orbifold give rise to algebraic singularities. In $\mathbb{C}^2$ these are the du Val singularities. Smoothing out these singularities through desingularisation requires the resolution map between the canonical bundle and canonical sheaf to be crepant, meaning that no discrepancy divisor is needed with the resolution map. When this crepant resolution map is established, metrics and other physically relevant measures can be written explicitly for some special cases.

These crepant resolutions are key in generalising the quotient process to act on Calabi-Yau 3-folds (as $\mathbb{C}^3/G$), introducing further orbifolds into the Calabi-Yau spectrum. However, in this case the manifolds are related to one another by mirror symmetry and in particular flop transitions. These orbifolds correspond to N = 1 super-conformal gauge theories, hence also extending the practical applications of studying the Calabi-Yau landscape with respect to examining topical theories in physics.

The quotient product and crepant resolution methods extend the landscape of non-compact Calabi-Yau manifolds from only $\mathbb{C}^3$ to also include a plethora of orbifolds.
Physicists interpret the manifold landscape as representation varieties of quivers, which indicate the equivalent gauge-field theories [7–9].
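For cyclic groups the tensor product decomposition of this subsection can be carried out with characters alone. The sketch below is a minimal illustration for $G = \mathbb{Z}_n$; the function name `mckay_matrix` and the encoding of the defining representation as a list of charges (weights) are assumptions made here. With weights (1, −1) it gives the $\mathbb{Z}_n \subset SU(2)$ case, and with weights (1, 1, 1) for n = 3 (the usual choice for $\mathbb{C}^3/\mathbb{Z}_3$, assumed rather than taken from the text) it gives three arrows between consecutive nodes, as in figure 5.

```python
import cmath


def mckay_matrix(n, weights):
    """a[i][j] = multiplicity of irrep rho_j in R (x) rho_i for G = Z_n,
    where R is the sum of the one-dimensional irreps rho_w over w in weights."""
    def chi(k, g):
        # Character of rho_k evaluated on the g-th power of the generator.
        return cmath.exp(2j * cmath.pi * k * g / n)

    a = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            inner = sum(sum(chi(w, g) for w in weights)
                        * chi(i, g) * chi(j, g).conjugate()
                        for g in range(n)) / n
            a[i][j] = round(inner.real)
    return a


affine_A2 = mckay_matrix(3, (1, -1))      # Z_3 in SU(2): a triangle of single edges
affine_A3 = mckay_matrix(4, (1, -1))      # Z_4 in SU(2): a square of single edges
orbifold_z3 = mckay_matrix(3, (1, 1, 1))  # C^3/Z_3: three arrows i -> i+1
```

The first two matrices are symmetric with entries 0 or 1, i.e. adjacency matrices of the affine $\tilde{A}$-type diagrams, whereas the last one is not symmetric and therefore describes a genuinely directed quiver.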
7 Algebraic Geometry Viewpoint
7.1 Brane Tilings

The method to connect the quivers of a gauge theory with the toric diagram (see Appendix B) of the relevant Calabi-Yau that makes up the remaining dimensions in the full 10d superstring spacetime exists for both directions [19, 20]. Deriving the toric diagram from the quiver is more straightforward and follows the clockwise process depicted in figure 7. The converse process from toric diagram to quiver, “geometric engineering”, was originally computationally demanding, with exponential time complexity. This process was streamlined by introducing the concept of brane tiling. The brane tiling concept was derived from noticing a consistent relation between the numbers of nodes, edges, and superpotential terms, $(N_0, N_1, N_2)$ respectively:

$$ N_0 - N_1 + N_2 = 0 . \tag{7.1} $$
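Relation (7.1) is easy to verify on the examples met so far. In the sketch below the counts for $\mathbb{C}^3$ and $\mathbb{C}^3/\mathbb{Z}_3$ are read off from section 6, with the number of superpotential terms of (6.1) obtained by counting the non-vanishing components of $\varepsilon_{\alpha\beta\gamma}$. The conifold entry (2 nodes, 4 fields, 2 superpotential terms) is the standard Klebanov-Witten data and is an assumption here, since it is not spelled out in this part of the notes.

```python
import itertools


def epsilon(a, b, c):
    """Totally antisymmetric rank-3 tensor on indices 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) // 2


# Each non-zero component of epsilon contributes one monomial to W in eq. (6.1).
n_terms_z3 = sum(1 for a, b, c in itertools.product(range(3), repeat=3)
                 if epsilon(a, b, c) != 0)

# (N_0, N_1, N_2) = (nodes, arrows, superpotential terms)
quivers = {
    "C^3 (clover)": (1, 3, 2),        # W = Tr(XYZ) - Tr(YXZ)
    "conifold": (2, 4, 2),            # assumed standard data, see lead-in
    "C^3/Z_3": (3, 9, n_terms_z3),
}

euler = {name: n0 - n1 + n2 for name, (n0, n1, n2) in quivers.items()}
```

Every entry of `euler` vanishes, in agreement with the Euler characteristic of a torus.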
Figure 7: A pictorial representation of the process that links the quiver and superpotential (Q,W) to the Toric diagram of the equivalent non-compact Calabi-Yau manifold [7–9]. This specific example is for the conifold considered previously.
Relation (7.1) holds for all quivers whose representation variety is a toric variety (as for those considered in string theory). It was associated with the Euler characteristic of a torus, and this allowed the quiver and superpotential to be encoded as a bipartite graph tiling on a (genus g = 1) torus. The connection of brane tilings to quivers follows a simple algorithm, whilst the map from brane tilings to toric diagrams is an epimorphism, with the orbit of tilings which share the same toric diagram related by Seiberg duality [21]. Seiberg duality relates an “electric” and a “magnetic” theory, stating that under RG flow they both approach the same IR fixed point. Therefore they represent the same theory at low energies. In our context it represents the relation between two quiver gauge theories, where some additional fields are integrated out/introduced, which graphically corresponds to contracting/expanding parts of the brane tilings. This concept is exemplified in figure 8.
Figure 8: The contraction of part of a brane tiling [21], corresponding to integrating out a massive field to relate two quiver gauge field theories via Seiberg duality.
More mathematically, the Seiberg duality process corresponds to cluster mutation of the graph-theoretic quiver objects. Through a series of steps of reorienting and reassigning arrows associated with a node in the quiver, and adjusting the gauge group size by the number of fields, a different (dual) quiver is formed [22]. This cluster mutation process is a generalisation of Seiberg duality. Multiple actions of the cluster mutation for different nodes create “mutation classes” of quivers. Their equivalent brane tilings are connected by a process known as urban renewal, again a mathematical generalisation of the integrating out/introduction of fields in the physical application of Seiberg duality. These dual quivers are related in that their tilings correspond to the same toric diagram under the epimorphism previously mentioned. Tilings are an important step in the geometric engineering process.

The quiver duality concept may also be thought of as monodromy of wrapped 3-cycles in the dual theory via another duality known as mirror symmetry. Mirror symmetry connects mirror dual Calabi-Yau manifolds in different superstring theories, where they lead to the same resulting physics. In this case the D3 brane on one Calabi-Yau 3-fold is mirror dual to a D6 brane with 3 dimensions identified (3-cycle wrapping) on the dual Calabi-Yau 3-fold. This concept has been shown to be practical in topological string theory, where the mirror symmetry concept has been mathematically well defined [23].

Mirror symmetry allows calculation of certain complicated invariants by performing easier calculations in the dual theory. A key example is Gromov-Witten invariants, which arise in symplectic geometry which also satisfies the ’almost complex’ structure requirements. The almost complex structure is a looser condition than Kähler geometry, in that only the tangent space is required to be smooth linear complex, and not necessarily the underlying space. These invariants are calculated from pseudoholomorphic curves, which are the symplectic equivalent of distances in Riemannian geometry. More general quantities are usually expressed in terms of the Gromov-Witten invariants, which are difficult to compute, but can be reduced to simpler integrals in the mirror dual theory [24,25].
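Returning to the cluster mutation described at the start of this discussion, the arrow-reassignment step can be sketched with the Fomin-Zelevinsky rule acting on a skew-symmetric exchange matrix, where the entry B[i][j] counts arrows from node i to node j (negative for the opposite direction). This encoding, the function name `mutate`, the choice of the $\mathbb{C}^3/\mathbb{Z}_3$ quiver as input and the rank rule $N_c \to N_f - N_c$ are assumptions made for illustration; the rule applies to quivers without loops or 2-cycles, and the superpotential is not tracked.

```python
def mutate(B, k):
    """Fomin-Zelevinsky mutation of a skew-symmetric exchange matrix B at node k."""
    n = len(B)
    out = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == k or j == k:
                # Arrows touching the mutated node are reversed.
                out[i][j] = -B[i][j]
            else:
                # Add composite arrows i -> k -> j (the sum below is always even).
                out[i][j] = B[i][j] + (abs(B[i][k]) * B[k][j]
                                       + B[i][k] * abs(B[k][j])) // 2
    return out


# C^3/Z_3 quiver: three arrows 0 -> 1, three arrows 1 -> 2, three arrows 2 -> 0.
B = [[0, 3, -3],
     [-3, 0, 3],
     [3, -3, 0]]

dual = mutate(B, 0)
arrow_counts = sorted(abs(dual[i][j]) for i in range(3) for j in range(i + 1, 3))

# Rank of the dualised node: N_c -> N_f - N_c, with N_f = (incoming arrows) x (rank).
ranks = [1, 1, 1]
n_flavours = sum(max(B[j][0], 0) * ranks[j] for j in range(3))
new_rank = n_flavours - ranks[0]
```

Mutating twice at the same node returns the original matrix, reflecting the fact that Seiberg duality applied twice gives back the original theory.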
7.2 Dessin d’Enfants

Bipartite tilings are the algebraic geometry equivalent of Grothendieck’s Dessin d’Enfants from number theory. This interpretation can be useful for categorising the tilings, and hence the quiver gauge theories. Mathematically the dessins are interpreted using Belyi maps, β, which map from a smooth compact Riemann surface (described as a hyperelliptic curve of complex numbers), Σ, to projective space, $\mathbb{P}^1$, such that [26]

$$ \beta : \Sigma \longmapsto \mathbb{P}^1 . \tag{7.2} $$
A dessin is then formed from the preimage of a Belyi map which has three ramification points, where a ramification point is an element of Σ where the local Taylor expansion of β starts at order ≥ 2 and corresponds to degeneration of the map. Under the $SL(2,\mathbb{C})$ symmetry of $\mathbb{P}^1$, the three ramification points are transformed to (0, 1, ∞), and the dessin is formed by associating the preimages of 0 to black nodes, preimages of 1 to white nodes, and preimages of the (0,1) interval to edges. Therefore the dessin is a bipartite graph drawn on the Riemann surface Σ, such that

$$ \beta^{-1}(0) \to \bullet , \qquad \beta^{-1}(1) \to \circ , \qquad \beta^{-1}(0,1) \to - . \tag{7.3} $$
The dessins can be categorised by their passports, which are the collections of the ramification data, represented as

$$ \left\{ r_0(1), r_0(2), \ldots, r_0(B) \,\middle|\, r_1(1), r_1(2), \ldots, r_1(W) \,\middle|\, r_\infty(1), r_\infty(2), \ldots, r_\infty(I) \right\} , \tag{7.4} $$

such that $r_i(j)$ is the ramification value (order of the leading term in the Taylor expansion) of the jth preimage of value i in the image of the Belyi map. The total numbers of preimage points are (B, W, I) for the ramification points (0, 1, ∞) respectively. Note also here that the Riemann-Hurwitz formula sets B = W for the genus 1 torus we are working on. The actual value of each ramification point then gives the valency of each node in the dessin.

The passport does not identify the dessins exactly; a more effective way of representing the dessins independently is combinatorially, as permutation triples. Permutation triples encode the dessin information by creating elements of the symmetric group which are the products of all cycles containing either the white nodes, $\sigma_W$, or the black nodes, $\sigma_B$. An additional object, $\sigma_\infty$, in the symmetric group is defined also, such that:
$$ \sigma_W \cdot \sigma_B \cdot \sigma_\infty = 1_d , \tag{7.5} $$

for $1_d$ the identity element of the symmetric group, $S_d$, of the d edges in the dessin. The group elements $\sigma_\infty$ are then associated to cycles about faces of the dessin under this symmetric group [27].

In supersymmetric QFTs, R-symmetry connects the fields in the theory via their R-charge of the supersymmetric representations. For the tiling, each edge has an R-charge from the field it represents in the superpotential interpretation of the tiling. Under the symmetry, these charges must satisfy

$$ \sum_{i \in E_n} R_i = 2 , \qquad \sum_{i \in E_f} (1 - R_i) = 2 , \tag{7.6} $$

for $E_n$ the edges bounding any node in question, and $E_f$ the edges bounding any face in question. These relations in terms of the tiling are equivalent to Euler’s relation as in equation 7.1.

Isoradial embedding is a method for constructing the tiling which automatically satisfies these required conditions on the R-charges. Nodes are organised on the circumferences of intersecting tessellated circles such that the angles subtended by triangles formed with adjacent nodes and the circles’ centres satisfy $\theta_i = \pi R_i / 2$. This causes the conditions in 7.6 to translate to basic geometric conditions on the total angle around a point and the total interior angle of a polygon respectively. These conditions fix the nodes’ positions up to rotation about the circles; this is then fixed by performing a-maximisation of the function

$$ a(R_i) = \sum_{i \in E} (R_i - 1)^3 , \tag{7.7} $$

for E the set of all edges in the tiling. Maximising this equation over the $R_i$ partition is equivalent to minimising the conic base volume.
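The permutation triples of equation (7.5) lend themselves to a short computational check. The sketch below builds $\sigma_\infty$ from $\sigma_B$ and $\sigma_W$, counts nodes, edges and faces from the cycle structure, and obtains the genus from the Euler characteristic. The specific permutations are assumptions chosen to model the hexagonal tiling of $\mathbb{C}^3$ (one black node, one white node, three edges) and a square tiling for the conifold; the list encoding of permutations and the composition convention are likewise illustrative. The last lines check the R-charge conditions (7.6) for the hexagonal tiling with all $R_i = 2/3$.

```python
from fractions import Fraction


def compose(p, q):
    """Permutation product acting as x -> p[q[x]]."""
    return [p[q[x]] for x in range(len(q))]


def inverse(p):
    inv = [0] * len(p)
    for x, y in enumerate(p):
        inv[y] = x
    return inv


def n_cycles(p):
    """Number of cycles of a permutation, fixed points included."""
    seen, count = set(), 0
    for start in range(len(p)):
        if start not in seen:
            count += 1
            x = start
            while x not in seen:
                seen.add(x)
                x = p[x]
    return count


def dessin_counts(sigma_B, sigma_W):
    """Return (nodes, edges, faces, genus) of the dessin with the given triple."""
    d = len(sigma_B)
    sigma_inf = inverse(compose(sigma_W, sigma_B))
    # Check eq. (7.5): sigma_W . sigma_B . sigma_inf is the identity on d edges.
    assert compose(compose(sigma_W, sigma_B), sigma_inf) == list(range(d))
    nodes = n_cycles(sigma_B) + n_cycles(sigma_W)
    faces = n_cycles(sigma_inf)
    genus = (2 - (nodes - d + faces)) // 2
    return nodes, d, faces, genus


three_cycle = [1, 2, 0]            # the cycle (0 1 2)
four_cycle = [1, 2, 3, 0]          # the cycle (0 1 2 3)
c3_tiling = dessin_counts(three_cycle, three_cycle)
conifold_tiling = dessin_counts(four_cycle, four_cycle)

# R-charge conditions (7.6) on the hexagonal tiling with R_i = 2/3 on every edge.
R = Fraction(2, 3)
node_sum = 3 * R                   # three edges meet at each node
face_sum = 6 * (1 - R)             # the single face is a hexagon with six sides
```

Both examples come out with genus 1, and reading (faces, edges, nodes) as $(N_0, N_1, N_2)$ reproduces the counts (1, 3, 2) and (2, 4, 2) that satisfy relation (7.1).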
8 Non-compact Calabi-Yau Summary
The non-compact Calabi-Yau landscape makes itself of manifest importance in superstring theory through the interpretation of quiver gauge theories. Within this, the manifolds make up the additional dimensional space in the theories’ brane world interpretation. Using generalizations of the McKay correspondence, the association from the manifolds to the quivers is made through their representation varieties. Orbifolds can then be introduced into the landscape using group quotients and crepant resolution.

Alternatively the manifolds may be considered more algebraically as toric varieties, generally defined using a fan structure on a lattice. The general toric variety construction allows formation of more Calabi-Yau manifolds, including the conifold. From this interpretation toric diagrams can be formed from the varieties, which aid in manifold classification, especially in the context of brane tilings.

Brane tilings are a useful geometric interpretation of the quiver and superpotential properties, and particularly streamline the process of calculating the physical theories represented by toric diagrams. Seiberg duality and mirror symmetry also become important concepts in this consideration of forming physical quiver gauge theories.

Finally these tilings can be considered in parallel to dessin d’enfants from number theory. This interpretation in terms of Belyi maps and their ramifications offers some explanation of structure associated with the underlying theories’ supersymmetry. This interconnection between these interpretations of the Calabi-Yau manifolds is well depicted in the example in figure 9 for the conifold.

Points of interest for further investigation include the interpretation of Seiberg duality in terms of the dessin structure, and how the use of dessins may relate the absolute Galois group (important in the theory of dessins) to the physical theories of the tilings. Beyond these, the parallels between the physical, algebraic geometric, and number theoretic structures offer many sources of inspiration.
Figure 9: The conifold Calabi-Yau manifold interpreted in terms of: (a) its underlying physical theory in terms of quiver and superpotential; (b) the representation variety’s toric diagram (note it is the equivalent dual diagram that is shown); (c) the Belyi pair used to encode the dessin tiling structure; and (d) the brane tiling on a torus [7–9].
Part III Machine-Learning the Landscape
As we have seen above, different areas in mathematics, including algebraic geometry, representation theory and even number theory, have appeared in our study of CY manifolds in theoretical physics. As the extra six dimensions are believed to be “wrapped” as a CY 3-fold under string compactification, people began to search for the possible CY3’s, and have so far collected a gigantic list of CY3’s from reflexive polytopes, estimated at order $10^{10}$ [8,9]. Furthermore, the number of string vacua in the landscape is astonishingly of order $10^{500}$ for type IIB theory$^{6}$ [28]. Thus, the power of computers and algorithms is urgently needed for this interdisciplinary research. A different version of “WWJD” has now been raised: what would Jython do?
9 Performance Measures: Hypersurfaces in $W\mathbb{P}^4$
In Appendix C, machine learning is briefly introduced. Whatever approach the machine adopts for the learning, we always need to know how well it performs. Let us quantify its performance using the following example. Recall the weighted complex projective space $W\mathbb{P}^4$ in (B.2)$^{7}$. Our input for each hypersurface in $W\mathbb{P}^4$ is a 5-vector of co-prime positive integers which determines the space. Let us consider a simple query of whether the Hodge number $h^{2,1} > 50$. Geometrically, we are searching for CY3’s with a relatively large number of complex deformations. Our data D consists of 7555 5-vectors, $x_i$, each resulting in a binary output, $y_i$. For example, $(\{1, 1, 1, 1, 1\} \to 1) \in \{(x_i \to y_i)\} = D$ as $h^{2,1} = 101 > 50$ in this case. On the other hand, we have $(\{2, 2, 3, 3, 5\} \to 0)$ since $h^{2,1} = 43 < 50$ here. In [29], the Hodge numbers are computed using the Landau-Ginzburg method. However, such a procedure would take hours, and this is just a very simple query.

Now that our data D is fully known, we can split it into a training set T and a validation set V, viz, $D = T \sqcup V$. We can then establish a machine-learning algorithm and check how well it performs. This procedure is known as cross validation. To quantify the accuracy, we make the following definitions.
Definition 9.1. Let $V = \{(x_i \to y_i)\}$ where $y_i$ is the actual correct output for input $x_i$, and let $y_i^{\mathrm{pred}}$ be the output predicted by the machine-learning model on $x_i$, with i running from 1 to N. Then the precision p is the percentage that $y_i^{\mathrm{pred}}$ agrees with $y_i$:

$$ p := \frac{1}{N} \left| \{ y_i^{\mathrm{pred}} = y_i \} \right| \in [0, 1]. \tag{9.1} $$
$^{6}$ For F-theory, the number of flux vacua arising from elliptic fourfolds even rockets to at least $10^{272000}$.
$^{7}$ In terms of the notation in (B.2), this is $\mathbb{CP}(a_0, a_1, a_2, a_3)$. For brevity, we will henceforth denote it as $W\mathbb{P}^4$.
Definition 9.2. Consider $y_i$ and $y_i^{\mathrm{pred}}$ as vectors $y$ and $y^{\mathrm{pred}}$ respectively. Then the cosine distance is

$$ d_C := \frac{y \cdot y^{\mathrm{pred}}}{|y| \, |y^{\mathrm{pred}}|} \in [-1, 1]. \tag{9.2} $$
If the cosine of the angle between the two vectors is 1, i.e. $d_C = 1$, we have a complete agreement. If $d_C = -1$, then it is the worst fit. If $d_C = 0$, then it is a random correlation.
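Definitions 9.1 and 9.2 translate directly into a few lines of code. The following is a minimal sketch in plain Python; the function names and the eight-element toy vectors are made up for illustration and are not data from the notes.

```python
import math


def precision(y, y_pred):
    """Fraction of entries on which prediction and truth agree, eq. (9.1)."""
    return sum(a == b for a, b in zip(y, y_pred)) / len(y)


def cosine_distance(y, y_pred):
    """Cosine of the angle between y and y_pred viewed as vectors, eq. (9.2)."""
    dot = sum(a * b for a, b in zip(y, y_pred))
    norm = math.sqrt(sum(a * a for a in y)) * math.sqrt(sum(b * b for b in y_pred))
    return dot / norm


# Hypothetical toy labels (1 = query true, 0 = query false).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

p = precision(y_true, y_pred)
d_C = cosine_distance(y_true, y_pred)
```

In this toy case six of the eight predictions agree, so p = 0.75, and the two vectors share three of their four non-zero entries, so $d_C$ is also 0.75.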
Now we take 2000 samples (out of 7555) from D, which is approximately 25%, to be our training data. Then we establish our MLP and test it using the remaining data. The detailed Python code can be found in [7–9]. It turns out that there are only 375 errors in our experiment, which gives $p = (5555 - 375)/5555 \simeq 93.25\%$, and the cosine distance $d_C$ is 0.91. This is a quite impressive result with such a high accuracy. Remarkably, the running time is less than one minute on an ordinary laptop$^{8}$!

To make sure that our machine-learning makes satisfying predictions, we need to introduce the Matthews correlation coefficient (MCC). Firstly, we have:
Definition 9.3. Let $\{(x_i \to y_i)\}$ be categorical data, where $y_i \in \{1, 2, \ldots, k\}$ takes value in k categories. Then the confusion matrix is a k × k matrix where 1 is added to the (a, b)-th entry if the actual value of y is a while the predicted $y^{\mathrm{pred}}$ is b.
As a result, we want the confusion matrix to be diagonal ideally. In our binary case, the confusion matrix is 2 × 2, and we have this table:
                       Actual True (1)        Actual False (0)
Predicted True (1)     True Positive (tp)     False Positive (fp)
Predicted False (0)    False Negative (fn)    True Negative (tn)          (9.3)
Then we can define:
Definition 9.4. For binary classifications, the Matthews correlation coefficient is the square root of the normalized χ-squared, that is,

$$ \phi := \sqrt{\frac{\chi^2}{N}} = \frac{tp \cdot tn - fp \cdot fn}{\sqrt{(tp + fp)(tp + fn)(tn + fp)(tn + fn)}} \in [-1, 1]. \tag{9.4} $$
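The binary confusion matrix (9.3) and the coefficient (9.4) can be computed as in the sketch below. The function names and toy data are illustrative assumptions. The second example uses a heavily imbalanced toy data set in which the model always predicts 0; the convention of returning φ = 0 when the denominator vanishes is a common choice assumed here, not one stated in the notes.

```python
import math


def confusion_counts(y, y_pred):
    """Entries (tp, fp, fn, tn) of the binary confusion matrix (9.3)."""
    tp = sum(a == 1 and b == 1 for a, b in zip(y, y_pred))
    fp = sum(a == 0 and b == 1 for a, b in zip(y, y_pred))
    fn = sum(a == 1 and b == 0 for a, b in zip(y, y_pred))
    tn = sum(a == 0 and b == 0 for a, b in zip(y, y_pred))
    return tp, fp, fn, tn


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient, eq. (9.4); 0 if the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Balanced toy example (hypothetical labels).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
counts = confusion_counts(y_true, y_pred)
phi = mcc(*counts)

# Imbalanced toy example: 1 positive in 1000, model always answers 0.
y_rare = [1] + [0] * 999
y_lazy = [0] * 1000
accuracy_lazy = sum(a == b for a, b in zip(y_rare, y_lazy)) / len(y_rare)
phi_lazy = mcc(*confusion_counts(y_rare, y_lazy))
```

The balanced example gives (tp, fp, fn, tn) = (3, 1, 1, 3) and φ = 0.5, whereas the always-false model reaches 99.9% accuracy with φ = 0.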
$^{8}$ This can also be done using Mathematica with a high accuracy as well. In particular, Mathematica has machine learning built into its core system from version 11.2, and now Mathematica 12 has been released with detailed documentation on machine learning.

Such a definition can also be generalized to k × k confusion matrices [30, 31]. If the MCC returns 1, then we have a perfect prediction. If the MCC is −1, then our fit is a complete disagreement. If φ = 0, then it is a random prediction. It is crucial to notice that other measures such as p and $d_C$, unlike MCC, are not useful when the sizes of the two classes differ too much, i.e., when we have imbalanced data. For example, suppose only 0.1% of the data is to be classified as true. Then our algorithm would naively train a model predicting false for any input. Nevertheless, the accuracy p would still reach 99.9%$^{9}$. In our case of hypersurfaces in $W\mathbb{P}^4$, we have φ = 0.84, which indicates a quite good prediction.

Now one may wonder how well our NN will behave when we change the number of samples in the training data. This can be analyzed via learning curves:
Definition 9.5. Let $D = \{(x_i \to y_i)\}$ have N data-points. We choose cross validation by taking γN data-points randomly as training data T, with some γ ∈ (0, 1]. Then the remaining (1−γ)N data-points form the validation data V. The performance of the machine-learning algorithm, upon training on T and validating on V, is a function L(γ) measured by any goodness of fit as aforementioned. The learning curve is then the plot of L(γ) against γ.
In practice, γ is chosen discretely. Moreover, for each γ, we repeat the random sampling of γN data-points a number of times for statistical stability, so there are error bars associated to our points on the curve. The learning curve of our hypersurfaces in $W\mathbb{P}^4$ case is depicted in Fig. 10. As we can see, there is a large error for training data less than 10%, as the NN has not seen enough data for valid predictions. However, from 20%, our predictions become really well-behaved. The curve then ascends steadily as both measures approach 1.

Figure 10: The learning curves for machine learning whether $h^{2,1} > 50$ for a hypersurface in $W\mathbb{P}^4$. We repeat cross validation 10 times at each incremental interval of 5%.
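The procedure of Definition 9.5 can be sketched without any machine-learning library. The example below is only a stand-in: it uses a nearest-centroid classifier instead of the MLP of the notes, and a synthetic data set of 5-vectors with a made-up labelling rule (sum of entries above 27) instead of the actual $W\mathbb{P}^4$ data. All function names are hypothetical. What it does reproduce is the structure of the experiment: for each γ in steps of 5%, repeat 10 random splits and record the mean and spread of the precision.

```python
import random
import statistics


def make_toy_data(n, rng):
    """Hypothetical stand-in data: 5-vectors of small integers with a simple label."""
    data = []
    for _ in range(n):
        x = [rng.randint(1, 10) for _ in range(5)]
        data.append((x, int(sum(x) > 27)))
    return data


def fit_centroids(train):
    """Nearest-centroid 'model': mean input vector of each class seen in training."""
    groups = {}
    for x, y in train:
        groups.setdefault(y, []).append(x)
    return {y: [sum(col) / len(xs) for col in zip(*xs)] for y, xs in groups.items()}


def predict(centroids, x):
    """Label of the centroid closest to x in squared Euclidean distance."""
    return min(centroids,
               key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])))


def learning_curve(data, gammas, repeats, rng):
    """List of (gamma, mean precision, standard deviation) over random splits."""
    curve = []
    for gamma in gammas:
        scores = []
        for _ in range(repeats):
            shuffled = data[:]
            rng.shuffle(shuffled)
            k = max(1, int(gamma * len(data)))
            train, valid = shuffled[:k], shuffled[k:]
            model = fit_centroids(train)
            hits = sum(predict(model, x) == y for x, y in valid)
            scores.append(hits / len(valid))
        curve.append((gamma, statistics.mean(scores), statistics.stdev(scores)))
    return curve


rng = random.Random(1)
data = make_toy_data(400, rng)
gammas = [0.05 * i for i in range(1, 20)]          # 5%, 10%, ..., 95%
curve = learning_curve(data, gammas, repeats=10, rng=rng)
```

Plotting the second entry of each tuple against the first, with the third as an error bar, gives a learning curve of the kind shown in Fig. 10; any other goodness of fit, such as φ, could be recorded in the same loop.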
10 Learning CICYs
Let us now focus on the CICY dataset of 7890 inequivalent complete intersection CY 3-folds in products of (unweighted) complex projective spaces. As discussed in §3.2, a CICY is represented by a matrix, whose entries are 0 to 5, with number of rows ranging from 1 to 12 and number of columns ranging from 1 to 15. In terms of computer graphics, it is a 12 × 15 pixelated image with 6 different colours (or 6 shades of grey in a greyscale image). As an example, the CICY of 8 equations in $(\mathbb{P}^1)^5 \times (\mathbb{P}^2)^3$ is the matrix in Fig. 11 such that we have the image as in Fig. 12.

$^{9}$ There are also other measures (especially required for imbalanced data) such as F-score. However, MCC is the most informative one as it includes all the four categories in confusion matrices [32].
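1 1 0 0 0 0 0 0 0 0 0 0
1 0 1 0 0 0 0 0 0 0 0 0
0 0 0 1 0 1 0 0 0 0 0 0
0 0 0 0 1 0 1 0 0 0 0 0
0 0 0 0 0 0 2 0 0 0 0 0
0 1 1 0 0 0 0 1 0 0 0 0
1 0 0 0 0 1 1 0 0 0 0 0
0 0 0 1 1 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0

Figure 11: The matrix representing the CICY in $(\mathbb{P}^1)^5 \times (\mathbb{P}^2)^3$.

Figure 12: The corresponding image of Fig. 11, where purple pixels are 0, green, 1 and red, 2.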
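The zero-padded matrix of Fig. 11 can be unpacked again with a few lines of code. The sketch below strips the padding, reads off the dimension $n_r$ of each projective factor from the Calabi-Yau condition that the entries of row r sum to $n_r + 1$, and computes the dimension of the complete intersection. The variable names are illustrative, and the rows are entered exactly as they appear in the figure.

```python
padded_rows = [
    "110000000000", "101000000000", "000101000000", "000010100000",
    "000000200000", "011000010000", "100001100000", "000110010000",
    "000000000000", "000000000000", "000000000000", "000000000000",
    "000000000000", "000000000000", "000000000000",
]
padded = [[int(ch) for ch in row] for row in padded_rows]

# Remove the zero padding: keep non-zero rows, then the columns they use.
rows = [row for row in padded if any(row)]
used_columns = [j for j in range(len(padded[0])) if any(row[j] for row in rows)]
config = [[row[j] for j in used_columns] for row in rows]

# Calabi-Yau condition: each row sums to n_r + 1 for the factor P^{n_r}.
factor_dims = [sum(row) - 1 for row in config]
n_equations = len(used_columns)
dimension = sum(factor_dims) - n_equations
```

The result is five factors of dimension 1 and three of dimension 2, cut by 8 equations, giving a 3-fold in $(\mathbb{P}^1)^5 \times (\mathbb{P}^2)^3$ as stated in the text.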
As a matter of fact, we need a CNN to take advantage of the pixelation of CICYs. Nevertheless, for this task, we are only using the graphic images to emphasize that even though our computers have no knowledge of algebraic geometry, they can still “learn” to make good predictions, and we will keep using MLP in our analysis.

Similar to §9, the machine learns the binary query of whether the Hodge number $h^{1,1} > 5$. The input would be 12 × 15 matrices, and the output is again either 0 or 1. Now we take 4000 random samples (≈ 50%) as our training data, and test the remaining 3890 data-points as validation. The learning only takes about 5 minutes while the performance is remarkable. The accuracy p is 97% and the cosine distance $d_C$ reaches 0.98. For MCC, we have φ = 0.87. Varying the numbers of samples in T yields the learning curves depicted in Fig. 13.
Figure 13: The learning curves for machine learning whether $h^{1,1} > 5$ for CICYs.
We can see that there is a huge discrepancy between p and φ for small γ’s. This is due to the great disparity between the sizes of the two classes ($h^{1,1} \le 5$ and $h^{1,1} > 5$). Indeed, Fig. 1 with the distribution of $h^{1,1}$’s verifies our argument. As aforementioned, MCC would be much more useful in this case. For larger γ’s, we do see the ascent of both curves, approaching 1.

Let us now make our problem more sophisticated and compute the precise values of $h^{1,1}$. Here we try three different methods and compare their results:
• NN Classifier: As $h^{1,1} \in [0, 19]$, the output is a 20-channel classifier (cf. the 10-channel classifier in text recognition) with each neuron mapping to 0 or 1. The detailed architecture is described in [7–9,33].
• NN Regressor: The output is some real number, which means it is continuous. There are certain parameters known as hyperparameters, such as the number of hidden layers, that need to be optimized by hand before training. This is discussed in detail in Appendix C in [33].
• SVM Classifier: The output is one of the possible values of $h^{1,1}$, that is, some integer between 0 and 19. The hyperparameter optimization is also discussed in [33].
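The 20-channel output of the NN classifier amounts to a one-hot encoding of $h^{1,1}$. The sketch below shows the encoding and the decoding by picking the channel with the largest score; the function names and the example score vector are hypothetical, and the actual architecture is the one described in [7–9,33].

```python
N_CLASSES = 20   # h^{1,1} takes integer values in [0, 19]


def one_hot(h11, n_classes=N_CLASSES):
    """Target vector with a single 1 in the channel labelled by h11."""
    vec = [0] * n_classes
    vec[h11] = 1
    return vec


def decode(scores):
    """Predicted h11: index of the channel with the largest score."""
    return max(range(len(scores)), key=scores.__getitem__)


target = one_hot(7)

# Hypothetical network output: mostly small scores, largest one in channel 7.
scores = [0.01] * N_CLASSES
scores[7] = 0.6
scores[8] = 0.2
predicted = decode(scores)
```

A regressor, by contrast, would output a single real number that has to be rounded to an integer before it can be compared with the true $h^{1,1}$.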
We can now plot the three learning curves as in Fig. 14. We see that the NN classifier performs best in the machine-learning. Again, it is impressive that such training on an ordinary laptop takes only about 10 minutes and the validation only takes a few seconds. For reference, we plot the histograms of frequencies of predicted and actual $h^{1,1}$’s with validation sets of sizes 20% and 80% of the total data respectively for the three methods in Fig. 15.

Figure 14: The learning curves generated by averaging over 100 different random cross validation splits.
Figure 15: The frequencies of $h^{1,1}$’s.
10.1 Distinguishing Elliptic Fibrations

As CICYs may admit elliptic fibration (EF) structures, these elliptic fibrations were machine-learnt in [34]. Here we will contemplate the 643 CICY 3-folds with $h^{1,1} \le 4$, since all those with $h^{1,1} > 4$ can be (obviously) elliptically fibred [35–38]. As we have an unbalanced dataset (with 53 non-elliptic and 590 elliptic), we would like to enhance the 53 non-elliptic cases. Notice that CICY configurations are the same up to row and column permutations. We can therefore take 10 random permutations (independently) of both rows and columns on each of the 53 configuration matrices, which yields $10^2 \times 53 = 5300$ non-elliptic cases with output 0. We also perform 3 such permutations on each of the 590 elliptic configurations, so that we have $3^2 \times 590 = 5310$ elliptic cases with output 1. Moreover, as these CICYs can all be represented by configuration matrices with 6 rows and 7 columns, the input will be a 6 × 7 matrix.
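The enhancement by permutations can be sketched as follows. The function `permuted_copies` and the 6 × 7 toy matrix are hypothetical; in particular the toy matrix is not a real CICY configuration. Note that independently drawn random permutations may occasionally coincide, so the copies produced this way are not guaranteed to be pairwise distinct.

```python
import random


def permuted_copies(matrix, k, rng):
    """k random row permutations times k random column permutations: k^2 copies."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    row_perms = [rng.sample(range(n_rows), n_rows) for _ in range(k)]
    col_perms = [rng.sample(range(n_cols), n_cols) for _ in range(k)]
    return [[[matrix[r][c] for c in cp] for r in rp]
            for rp in row_perms for cp in col_perms]


rng = random.Random(0)
toy = [[(i + j) % 3 for j in range(7)] for i in range(6)]   # hypothetical 6 x 7 input
copies = permuted_copies(toy, 10, rng)

# Sizes of the enhanced classes quoted in the text.
n_non_elliptic = 10 ** 2 * 53
n_elliptic = 3 ** 2 * 590
```

Each copy has the same multiset of row sums as the original, which reflects the fact that permuting rows and columns does not change the underlying configuration.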
Now we are dealing with our familiar binary queries. Following a similar recipe as above, we can draw the learning curves in Fig. 16. The error bars for up to about 25% look ugly as there is inadequate training data. However, from 30% and above, machine-learning gives us a pretty nice result with high accuracy. Again, each training only takes a few seconds.

Figure 16: The learning curves for the (enhanced) ten thousand data-points on EFs.

It is also worth noting that we can make a control test for this problem. We just arbitrarily choose 53 configuration matrices out of the 643 and assign 0 or 1 to these 53 matrices randomly. Then we have the learning curves depicted in Fig. 17. We see that the machine-learning is poorly behaved (with ∼ 50% precision and φ ∼ 0), which shows that there is no inherent pattern for the machine to find in the control test. In contrast, EF is truly not a random property.

Figure 17: The learning curves on a control set of a randomly chosen property.
11 A Digression: Group Theory
11.1 Learning Cayley Tables

Now we would like to apply machine-learning to more basic problems in mathematics. Let us first start with recognizing Cayley tables $C$ [39] among Latin squares $L$. Allowing permutations of rows and columns, the number of Cayley tables of size $n$ is $\#C_n = 1, 2, 6, 48, 120, 1440, \ldots$$^{10}$ These Cayley tables form a subset of the Latin squares$^{11}$ $L$. The number of Latin squares grows as $\#L_n = 1, 2, 12, 576, 161280, 812851200, \ldots$ Comparing $\#C_n$ with $\#L_n$, we see that the probability of a Latin square being a Cayley table is essentially 0 from $n$ as small as 5. This is important for our algorithm: we can choose $n \geq 5$ and assign $L \to 0$ and $C \to 1$. Thus, we are back to our familiar binary query. A more detailed description of the algorithms can be found in [40]. Here, we just present the learning curves in Fig. 18.

Figure 18: The learning curves for n = 8 Latin squares. We vary the training size from 500 to 6000 in increments of 500.

Both measures show that we obtain a perfect result when ∼ 25% of the total data is used as training data.
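The Latin square counts quoted above are small enough to check by brute force for $n \leq 4$. The sketch below (the function name is hypothetical) counts Latin squares by adding rows one at a time, and then forms the ratio $\#C_n / \#L_n$ directly from the values quoted in the text; it does not attempt to count Cayley tables itself.

```python
from itertools import permutations


def count_latin_squares(n):
    """Count n x n Latin squares on the symbols 0..n-1 by row-by-row backtracking."""
    rows = list(permutations(range(n)))

    def extend(square):
        if len(square) == n:
            return 1
        total = 0
        for row in rows:
            # A new row is allowed only if no column repeats a symbol
            # already placed in that column by an earlier row.
            if all(row[c] != prev[c] for prev in square for c in range(n)):
                total += extend(square + [row])
        return total

    return extend([])


# Values of #C_n and #L_n for n = 1, ..., 6, copied from the text.
cayley_counts = [1, 2, 6, 48, 120, 1440]
latin_counts = [1, 2, 12, 576, 161280, 812851200]

# Fraction of Latin squares that are Cayley tables, according to those values.
ratios = [c / l for c, l in zip(cayley_counts, latin_counts)]

for n in range(1, 5):
    print(n, count_latin_squares(n), ratios[n - 1])
```

With the quoted values, the ratio is below $10^{-3}$ at $n = 5$ and below $10^{-5}$ at $n = 6$, so a randomly drawn Latin square of that size can safely be given the label 0.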
$^{10}$ If we naively consider all possible permutations, then $\#C_n$ should be $(n!)^2 \times \#G$ (where $\#G$ denotes the number of groups of order $n$). However, Cayley tables always have degeneracies under permutations, which leads to only $(n!) \times \#G$ distinct matrices, though this is not obvious for non-symmetric matrices (which correspond to non-abelian groups).

$^{11}$ A Latin square is an $n \times n$ matrix filled with $n$ symbols (here $1, 2, \ldots, n$), each of which appears exactly once in each row and in each column.

11.2 Learning Finite Simple Groups

After recognizing the Cayley tables, one may wonder how machine-learning would perform when studying other group properties. We now focus on the problem of finite simple groups. We know that it is not straightforward to determine whether a finite group is simple by just contemplating its Cayley table. However, there shouldn’t be random properties in group theory, so we would like to see how machines behave in this task. The detailed treatment can be found in [40]. Here, we report the learning curves in Fig. 19.

Figure 19: The learning curves for identifying whether a finite group of order n ≤ 70 is simple. There are 20 simple groups out of the total 602 groups.

We see that even at a low percentage of training data, the machine can still make very good predictions directly from Cayley tables without knowing the Sylow theorems. More examples of studying group properties and algebraic structures via machine-learning can be found in [40]. Recently, similar explorations have also been done in number theory [41].
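For contrast with the learned classifier, simplicity can also be decided deterministically from a Cayley table. The following is a brute-force sketch, not the method of [40]: it uses the fact that a group is simple exactly when it is non-trivial and the normal closure of every non-identity element is the whole group. It assumes that rows and columns of the table are indexed by the elements $0, \ldots, n-1$ in the same order; all function names are hypothetical.

```python
from itertools import permutations


def cyclic_table(n):
    """Cayley table of the cyclic group Z_n under addition mod n."""
    return [[(i + j) % n for j in range(n)] for i in range(n)]


def table_from_permutations(perms):
    """Cayley table of a permutation group; perms must be closed under composition."""
    index = {p: k for k, p in enumerate(perms)}
    size = len(perms[0])
    # Entry (i, j) is the index of the composition x -> p_i[p_j[x]].
    return [[index[tuple(p[q[x]] for x in range(size))] for q in perms] for p in perms]


def is_even(p):
    """True if the permutation p has an even number of inversions."""
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return inversions % 2 == 0


def identity_element(table):
    n = len(table)
    return next(e for e in range(n) if all(table[e][x] == x for x in range(n)))


def normal_closure(table, g):
    """Smallest normal subgroup containing g, as a set of element indices."""
    n = len(table)
    e = identity_element(table)
    inverse = [table[x].index(e) for x in range(n)]
    # Conjugacy class of g: all elements x g x^{-1}.
    conj_class = {table[table[x][g]][inverse[x]] for x in range(n)}
    # In a finite group, products of generators already give the generated subgroup.
    closure, frontier = {e}, [e]
    while frontier:
        h = frontier.pop()
        for c in conj_class:
            product = table[h][c]
            if product not in closure:
                closure.add(product)
                frontier.append(product)
    return closure


def is_simple(table):
    n = len(table)
    if n == 1:
        return False  # the trivial group is not counted as simple
    e = identity_element(table)
    return all(len(normal_closure(table, g)) == n for g in range(n) if g != e)


s3 = table_from_permutations(list(permutations(range(3))))
a5 = table_from_permutations([p for p in permutations(range(5)) if is_even(p)])
print(is_simple(cyclic_table(5)), is_simple(cyclic_table(6)), is_simple(s3), is_simple(a5))
```

The cyclic group of order 5 and the alternating group $A_5$ pass the test, while $\mathbb{Z}_6$ and $S_3$ do not.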
12 Summary and Outlook
As argued in [8,9], any computational algebraic geometry problem is machine-learnable, as it is in essence a finite number of steps of finding kernels and cokernels of integer matrices. Thus, machines can be pretty well-behaved although they know nothing about algebraic geometry. On the other hand, this also shows that our properties in algebraic geometry are not random. Otherwise, our machine-learning would fail, just like the control case in §10.1. Hence, we expect that the machine is so far not able to learn number theory, in which elusive prime numbers play a key role. As a sanity check, when we input a bunch of prime numbers and train our machine to validate larger primes in our data, we only achieve a terrible 0.1% accuracy. After all, AI does not stand for all-powerful incredibility. Machine learning is good at matrix/tensor manipulations (and this is why TensorFlow is given such a name). Anyway, algebraic geometry is an area where machine learning can show its power, as we have seen above.

Despite our success in machine-learning, we still do not know why this works. Unlike in other disciplines in science, we do not know what is going on among neurons and their connections, and yet we can still get good predictions from them regardless of the theoretical intractability. An accuracy of almost 1 certainly cannot satisfy mathematicians, but machine learning still bypasses the expensive steps for practical purposes.

After decades of research, the string landscape now solidly resides in the era of big data. CY manifolds are only a small portion of the heterotic landscape. There is plenty of room at the bottom where NNs can act as classifiers or predictors for generalized Kähler geometry, for stable holomorphic bundles, for quivers and brane tilings and so forth. The landscape is still on the threshold of benefiting from data science.
Acknowledgement
We would like to thank Thomas Creutzig and Steve Rayan for organizing the “PIMS - USaskatchewan Summer School on Algebraic Geometry in High-Energy Physics”, which provided a wonderful atmosphere for mathematicians and physicists, students and experts, to interact and collaborate. YHH would like to thank STFC for grant ST/J00037X/1. EH would like to thank STFC for the PhD studentship.
A Some Complex Geometry
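Let us consider a complex manifold $M$ of complex dimension $m$. Then the set of $(p,q)$-forms, $\Omega^{p,q}(M)$, obviously forms an abelian group under addition. As $M$ is also a $2m$-dimensional real manifold, we can decompose the familiar exterior derivative into two pieces, $d = \partial + \bar\partial$, such that $\partial$ acts on the holomorphic part of a $(p,q)$-form while $\bar\partial$ acts on the antiholomorphic part, that is, $\partial : \Omega^{p,q}(M) \to \Omega^{p+1,q}(M)$ and $\bar\partial : \Omega^{p,q}(M) \to \Omega^{p,q+1}(M)$. It follows from $d^2 = 0$ that $\bar\partial^2 = 0$ (as well as $\partial^2 = 0$ and $\partial\bar\partial + \bar\partial\partial = 0$). We may therefore construct the cochain complex:

$$0 \xrightarrow{\bar\partial} \Omega^{p,0}(M) \xrightarrow{\bar\partial} \Omega^{p,1}(M) \xrightarrow{\bar\partial} \cdots \xrightarrow{\bar\partial} \Omega^{p,m}(M) \xrightarrow{\bar\partial} 0. \tag{A.1}$$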
Then
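Definition A.1. The Dolbeault cohomology group is defined as

$$H^{p,q}_{\bar\partial}(M) := \frac{\ker\left(\bar\partial : \Omega^{p,q}(M) \to \Omega^{p,q+1}(M)\right)}{\mathrm{Im}\left(\bar\partial : \Omega^{p,q-1}(M) \to \Omega^{p,q}(M)\right)}. \tag{A.2}$$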
The dimensions of the Dolbeault cohomology groups are known as the Hodge numbers, $h^{p,q} := \dim H^{p,q}_{\bar\partial}(M)$. Not all the Hodge numbers are independent. For any compact complex manifold, Kodaira-Serre duality yields [14]

$$h^{p,q} = h^{m-p,m-q}. \tag{A.3}$$

In particular, we have

$$h^{0,0} = h^{m,m} = 1. \tag{A.4}$$

Also, one can prove that in this case the Dolbeault cohomology groups are always of finite dimension, viz., $h^{p,q} < \infty$.
A.1 Kähler Manifolds

As we will see, a Kähler structure puts more constraints on the Hodge numbers. First, we need to introduce
Definition A.2. A Hermitian metric $g$ on the complex manifold $M$ with complex structure $J$ is a Riemannian metric satisfying $g(Jv, Jw) = g(v, w)$ for any vector fields $v, w$ on $M$. Equivalently, in terms of (complex) components, $g_{\alpha\beta} = g_{\bar\alpha\bar\beta} = 0$. Then the Hermitian form is the 2-form defined by $\omega(v, w) := g(Jv, w)$. Equivalently, in terms of (complex) components$^{12}$,

$$\omega_{ab} = i g_{\alpha\bar\beta} - i g_{\bar\alpha\beta}.$$

Therefore, $\omega$ is also a (1,1)-form.
As the pure holomorphic and antiholomorphic components of the Hermitian metric vanish, it is not hard to see that $\omega_{ab} = -\omega_{ba}$. Now we are able to define

Definition A.3. The Hermitian metric $g$ is Kähler if $d\omega = 0$, and $\omega$ is called a Kähler form. A complex manifold is a Kähler manifold if it admits a Kähler metric.
In some literature, a Kähler manifold is defined as a complex manifold having a symplectic form (being closed, bilinear, non-degenerate and antisymmetric). In fact, bilinearity and non-degeneracy come from the Riemannian metric, antisymmetry follows from the Hermiticity of the metric as mentioned above, and closedness is precisely the condition $d\omega = 0$.
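The antisymmetry invoked here can also be checked without coordinates. A short derivation, assuming only $J^2 = -1$, the symmetry of $g$, and the Hermitian condition $g(Jv, Jw) = g(v, w)$ of Definition A.2:

```latex
\begin{align*}
\omega(w, v) &= g(Jw, v) \\
             &= g(J^2 w, Jv) && \text{(Hermitian condition applied to $Jw$ and $v$)} \\
             &= -g(w, Jv)    && \text{(since $J^2 = -1$)} \\
             &= -g(Jv, w)    && \text{(symmetry of $g$)} \\
             &= -\omega(v, w).
\end{align*}
```

Non-degeneracy follows in the same spirit: $\omega(v, \cdot) = g(Jv, \cdot)$ vanishes only if $Jv = 0$, and $J$ is invertible.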
It is worth remarking that since $d\omega = 0$ (which is the same as $\partial_\alpha g_{\beta\bar\gamma} = \partial_\beta g_{\alpha\bar\gamma}$ along with its conjugate equation), we can equivalently write, locally, $\omega = i\partial\bar\partial K$ for some real scalar function $K$ known as the Kähler potential. For Kähler manifolds, the Hodge numbers further satisfy

$$h^{p,q} = h^{q,p}, \tag{A.5}$$

$$h^{p,p} \geq 1. \tag{A.6}$$
Then (A.3) and (A.5) yield

$$h^{p,q} = h^{m-q,m-p}. \tag{A.7}$$

If the manifold is further Calabi-Yau, one can show that $h^{m,0} = h^{0,m} = 1$, and $h^{m,p} = h^{p,m} = h^{0,p} = h^{p,0} = 0$ for $0 < p < m$ [6].

Kähler geometry is ubiquitous in physics. We care about Kähler manifolds since they preserve holomorphicity under parallel transport of vectors, and a Kähler structure is essential for a manifold to be Calabi-Yau.
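The constraints above fix almost the whole table of Hodge numbers of a Calabi-Yau threefold. The following is a minimal sketch, assuming $m = 3$, so that only $h^{1,1}$ and $h^{2,1}$ remain free. The function names are hypothetical, and the Euler number formula $\chi = \sum_{p,q} (-1)^{p+q} h^{p,q}$ is the standard one for compact Kähler manifolds rather than something derived in this appendix.

```python
def cy3_hodge_numbers(h11, h21):
    """Table h[p][q], 0 <= p, q <= 3, of Hodge numbers of a Calabi-Yau threefold."""
    m = 3
    h = [[0] * (m + 1) for _ in range(m + 1)]
    h[0][0] = h[m][m] = 1      # (A.4)
    h[m][0] = h[0][m] = 1      # Calabi-Yau condition
    h[1][1] = h[2][2] = h11    # paired by (A.3)
    h[2][1] = h[1][2] = h21    # paired by (A.5)
    # All remaining entries vanish: h^{p,0} = h^{0,p} = h^{m,p} = h^{p,m} = 0 for 0 < p < m.
    return h


def satisfies_symmetries(h):
    """Check (A.3), (A.5) and (A.7) for every entry of the table."""
    m = len(h) - 1
    return all(
        h[p][q] == h[m - p][m - q] == h[q][p] == h[m - q][m - p]
        for p in range(m + 1)
        for q in range(m + 1)
    )


def euler_number(h):
    """Alternating sum of Hodge numbers, with sign (-1)^(p+q)."""
    size = len(h)
    return sum((-1) ** (p + q) * h[p][q] for p in range(size) for q in range(size))


# The quintic threefold has (h^{1,1}, h^{2,1}) = (1, 101).
quintic = cy3_hodge_numbers(1, 101)
print(satisfies_symmetries(quintic), euler_number(quintic))
```

In this table the alternating sum collapses to $\chi = 2(h^{1,1} - h^{2,1})$, which gives $-200$ for the quintic.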
A.2 Chern Classes

Another important concept for Calabi-Yau manifolds is that of Chern classes.
$^{12}$ We use Latin indices for real coordinates and Greek ones for complex coordinates.
Definition A.4. Given the complex vector bundle $E$ over the complex manifold $M$ of complex dimension $m$ and the gauge group (aka structure group) $G$, let $F = dA + A \wedge A$ be the field strength (aka curvature 2-form) of the gauge potential (aka connection) $A$. We define the total Chern class as$^{13}$

$$c(E) = \det\left(I + \frac{i}{2\pi} F\right). \tag{A.8}$$