On Computing with Perron Numbers: A Summary of New Discoveries in Cyclotomic Perron Numbers and New Computer Algorithms for Continued Research

A Thesis Presented to the Honors Tutorial College at Ohio University

In Partial Fulfillment of the Requirements for Graduation from the Honors Tutorial College with a Bachelor of Science Degree in Mathematics (BS 1903) and a Bachelor of Science Degree in Computer Science (BS 1929)

By William C. Kanieski

May 3, 2021

© 2021 William C. Kanieski. All Rights Reserved. This thesis has been approved by the Ohio University Mathematics Department, the School of Electrical Engineering and Computer Science, and the Honors Tutorial College.

Dr. Alexei Davydov Director of Studies, Mathematics Thesis Adviser

Dr. David Chelberg Director of Studies, Computer Science Thesis Adviser

Dr. Donal Skinner Dean, Honors Tutorial College


Contents

Part I The Mathematics of Cyclotomic Perron Numbers

1 Introduction to Cyclotomic Integers and Conductors
  1.1 Real Cyclotomic Integers
  1.2 Conductor n
2 Defining and Representing Perron Numbers
3 Properties of Perron Numbers
  3.1 Matrices
  3.2 Fusion Graphs
  3.3 Arithmetic Properties
4 Representing Cyclotomic Integers
  4.1 Generating Sets
  4.2 Length of the Perron Vector
5 Checking if a Cyclotomic Integer is Perron
6 Minimality of Perron Numbers
  6.1 Definition of Minimality
  6.2 Relation to Prime Numbers in the Integers
7 Minimality of Quantum Numbers
8 Potential Future Applications

Part II Algorithms Regarding Perron Numbers and Polynomials

9 Efficiency and Computational Complexity
10 Minimal Perron Vector Algorithms
11 Other Computations Related to Perron Numbers
12 Perron Research Algorithms Involving Polynomials
  12.1 Overview of Polynomials in Computer Science
  12.2 Polynomial Structures
  12.3 Polynomial Operations
    12.3.1 Arithmetic within Polynomials
    12.3.2 Arithmetic within Expressions and Mods
  12.4 Lexing and Parsing
13 Research Summary
14 Conclusion

Abstract

Although they are linked to many areas of mathematics, Perron numbers remain a field in which relatively little research has been conducted. Having their roots in number theory and linear algebra, Perron numbers manifest themselves everywhere from complex patterns in nature to solutions of higher-order polynomial equations. This paper discusses our research into cyclotomic Perron numbers and their properties, and is divided into two parts. In Part I, we introduce the concept of Perron numbers and their properties in algebra and number theory, as well as their relevance to fusion graphs and their spectral radii. Unlike previous research, we represent cyclotomic Perron numbers as vectors of integers, rather than as algebraic expressions or irrational decimals. With this easier-to-understand vector representation, we describe cyclotomic Perron numbers of a given conductor as points that lie within a “cone” in a multidimensional Cartesian coordinate space. We then define minimal Perron numbers as points that lie on the boundary of this cone, which can be added together to produce all other Perron-number points within the cone. Drawing comparisons to factoring positive integers into prime numbers, we provide new procedures for testing whether a Perron number is minimal. We then detail our findings for various conductors and extrapolate the patterns we discover to explain previously unnoticed properties of cyclotomic Perron numbers. In Part II, we delve deeper into the algorithms we used in the course of our research and explain their relevance to Perron numbers. We begin by expanding our description of our algorithms for finding minimal Perron vectors and providing a few general programs for testing whether a cyclotomic integer is Perron. We go on to illustrate the importance of large polynomial computations to our research, and discuss some C++ programs we wrote to solve these problems quickly and accurately. We describe the data structures and algorithms involved, including the time and space complexities of each mathematical operation and the grammar used to parse polynomial expressions. We then conclude the paper with a passage on how our discoveries and programs may be useful both to future researchers studying Perron numbers and to computer scientists writing programs with algebra in mind.

Part I The Mathematics of Cyclotomic Perron Numbers

1 Introduction to Cyclotomic Integers and Conductors

1.1 Real Cyclotomic Integers

Before we can examine Perron numbers, we must define some terms in algebra and number theory. We will use these definitions to show where Perron numbers fit into mathematics, how they are categorized, and their special properties. First, we must define algebraic numbers. An algebraic number is any number, real or complex, that can be expressed as a solution of a polynomial equation with rational coefficients. Of these, an algebraic integer is an algebraic number that can be expressed as a root of a polynomial equation with integer coefficients such that the leading coefficient is 1, meaning the polynomial must be monic. The minimal polynomial of an algebraic integer x is the monic polynomial P(x) with integer coefficients and the least number of factors such that x is a root of P(x). For example, the minimal polynomial of the algebraic integer x = 1 would be x − 1 = 0, and not x^2 − 3x + 2 = (x − 1)(x − 2) = 0, which has an extra factor. Similarly, the minimal polynomial of the algebraic integer x = √2 would be x^2 − 2 = 0 and not just x − √2 = 0, as √2 is not an integer coefficient. Additionally, the rational algebraic number x = 1/2 would not be considered an algebraic integer, as its minimal polynomial is x − 1/2 = 0, one of whose coefficients is not an integer. A special kind of algebraic integer results from the equation x^n − 1 = 0. By the Fundamental Theorem of Algebra, there are n roots to this equation. Of these, 1 is the only solution that is a real number, along with −1 if n is even; the rest are complex numbers lying on the unit circle in the complex plane. These are known as the complex nth roots of unity. In order to understand them, we must examine Euler’s Identity, which extends exponentiation to imaginary exponents. It states that e^(iθ) = cos(θ) + i sin(θ) and yields the interesting equation e^(iπ) + 1 = 0. This means that e^(2πi) = 1, and therefore e^(2πki) = (e^(2πi))^k = 1^k = 1 as long as k is an integer. Therefore, the complex nth roots of unity must all be of the form e^(2πki/n) = cos(2πk/n) + i sin(2πk/n). Here,

for the sake of convenience, we will refer to these roots as powers of εn = cos(2π/n) + i sin(2π/n). This is because εn is guaranteed to be a primitive nth root of unity, meaning that all other nth roots of unity can be expressed as integer exponents of εn. From here, we are ready to define cyclotomic numbers. A cyclotomic number is a number that can be written as a sum of rational multiples of powers of the complex nth roots of unity. Examples of these would include ε5 + ε5^3, (3/4)i, or any rational number. Of the set of cyclotomic numbers, we define cyclotomic integers as cyclotomic numbers that are also algebraic integers. In the course of this paper, we will be focusing solely on real cyclotomic integers, whose imaginary part is 0; if we refer to cyclotomic integers, we will assume that they are real unless specified otherwise. We note that adding a power of εn to its reciprocal results in

εn^k + εn^−k = cos(2πk/n) + i sin(2πk/n) + cos(−2πk/n) + i sin(−2πk/n) = cos(2πk/n) + i sin(2πk/n) + cos(2πk/n) − i sin(2πk/n) = 2 cos(2πk/n),

causing the cosines to combine and the sines to cancel each other out, thus removing the imaginary component of the complex number. Indeed, according to Washington (1982), all cyclotomic integers are sums of integer multiples of the complex roots of unity; furthermore, all real cyclotomic integers can be expressed as sums of powers of εn + εn^−1 = 2 cos(2π/n). This is an extremely important property of cyclotomic integers, and we will make good use of it when constructing and categorizing Perron numbers later on.

1.2 Conductor n

In the course of our research, we came up with a method for categorizing cyclotomic integers into similar groups known as conductors. In this section, we will discuss what a conductor is and how it can be used to describe the properties of cyclotomic integers as well as Perron numbers. From previous sections, we can surmise that the easiest way to group cyclotomic integers would be based on the nth roots of unity of which they are comprised. For instance, ε7^−1 + ε7 = ε7^6 + ε7 would be of conductor 7, ε3 + ε9 = ε9^3 + ε9 would be of conductor 9, and 2 cos(π/5) = 2 cos(2π/10) = ε10 + ε10^−1 = −ε5^2 − ε5^−2 would be of conductor 5. Indeed, upon close examination, our method would seem to work well for even conductors. For odd ones, however, we run into some difficulties. For instance, let us look at the complex fifth roots of unity in Figure 1. Because they are generated from the sines and cosines of the angle 2π/5, which is one fifth of the unit circle, they form a regular pentagon in the complex plane (outlined in blue), which is a pattern we can extrapolate to regular n-gons for all complex nth roots

of unity. The first thing we notice about this pentagon is that it does not contain the point −1. Furthermore, we notice that twice the real part of ε5, 2 cos(2π/5) ≈ 0.618033989, is less in absolute value than that of ε5^2, which is 2 cos(4π/5) ≈ −1.618033989, despite the latter being negative. We want our primitive root to be positive and greater in absolute value than all the other roots in the graph. This will become important later on in the paper, when we discuss the properties of Perron numbers in Section 2.

[Figure: the unit circle in the complex plane, showing the ten tenth roots of unity cos(kπ/5) + i sin(kπ/5) for k = 0, . . . , 9, with the points 1, −1, i, and −i marked on the axes; the five fifth roots of unity among them form a regular pentagon.]

Figure 1: The fifth roots of unity interspersed with the tenth roots of unity.

We can fix these discrepancies by shifting the focus of our graph to the tenth roots of unity, which include the fifth roots of unity as even exponents. Now, we can use ε10 as our primitive root instead of ε5. This ensures that our

graph includes −1 as a power of ε10, and will make studying Perron numbers and other cyclotomic integers much easier. Therefore, the conductor of a cyclotomic integer x is the smallest n such that x can be expressed as a sum of integer multiples of the complex nth roots of unity. In essence, k = 2n is the minimal degree of the polynomial x^k − 1 = 0 that is guaranteed to contain all the nth roots of unity and their negatives.

2 Defining and Representing Perron Numbers

Now that we are familiar with some key terminology, we are ready to examine Perron numbers themselves. We return our attention to algebraic integers. By the Fundamental Theorem of Algebra, for any minimal polynomial P(x) of degree n, the variable x will have n − 1 (not necessarily distinct) companion roots. These are known as the Galois conjugates of x. For instance, the monic cubic equation x^3 − x^2 − 2x + 1 = 0 has three roots, all of which are algebraic integers: x1 = 2 cos(π/7) ≈ 1.801937736, x2 = 2 cos(3π/7) ≈ 0.4450418679, and x3 = 2 cos(5π/7) ≈ −1.246979604. Thus the Galois conjugates of x1 are x2 and x3, and vice versa for the other two. Knowing these terms, we define a Perron number as follows: A Perron number is a positive real algebraic integer whose absolute value is greater than or equal to the absolute values of its Galois conjugates. In the case of x1, x2, and x3, the absolute value of x1 is greater than that of x2 and x3; therefore, x1 is Perron. Other more famous Perron numbers include the Golden Ratio φ = (1 + √5)/2 ≈ 1.618033989, which manifests itself in common patterns in nature, such as Golden Rectangles and Golden Spirals. A Golden Rectangle is one in which the ratio of the length to the width is equivalent to the Golden Ratio. If an infinite series of squares of decreasing size is inscribed in such a rectangle, and circular arcs are drawn to connect their diagonals, a Golden Spiral is formed. Golden Spirals can be found in the shape of nautilus shells, the way that seeds arrange themselves in the centers of sunflowers, and the structures of many spiral galaxies including our own (Livio, 2003). The minimal polynomial of the Golden Ratio is x^2 − x − 1, and its sole Galois conjugate is (1 − √5)/2 ≈ −0.618033989 = 1 − φ. It is just one specimen of the vast and expansive field of Perron numbers.
Although some of our research applies to all Perron numbers in general, our main focus for this paper will be cyclotomic Perron numbers, or Perron numbers that are also cyclotomic integers. If we refer to a Perron number of conductor n, we will automatically assume that the Perron num- ber is cyclotomic, as it will be composed of a sum of complex 2nth roots of

unity.

3 Properties of Perron Numbers

3.1 Matrices

Matrices are a trademark facet of linear algebra, and frequently appear in various fields of mathematics. Two matrices can be added or subtracted as long as they have the same number of rows and columns, and multiplied as long as the number of columns in the first equals the number of rows in the second. A vector can be treated as a matrix with a single column and a nonzero number of rows. Multiplying an n × n matrix by an n-dimensional vector yields another n-dimensional vector, as shown below:

    [ a11 a12 . . . a1n ]   [ x1 ]   [ a11·x1 + a12·x2 + ··· + a1n·xn ]
    [ a21 a22 . . . a2n ] × [ x2 ] = [ a21·x1 + a22·x2 + ··· + a2n·xn ]
    [  .   .  . . .  .  ]   [  . ]   [               .                ]
    [ an1 an2 . . . ann ]   [ xn ]   [ an1·x1 + an2·x2 + ··· + ann·xn ]

Here, the subscripts denote the coordinates of the elements in the original matrix and the ellipses indicate that more elements may be present than are seen on the page. Matrices and vectors can also be multiplied by scalar, or non-matrix, values like 2, −5/2, √2, or π. This multiplies all their coefficients by the scalar in question. Occasionally, if a matrix A is multiplied by a vector ~v, its product will be another vector whose coefficients are a scalar multiple λ of the original ~v. In this case, λ is referred to as an eigenvalue of A, and ~v is its corresponding eigenvector. For example, consider the following matrix/vector equation:

    [ 3 1 ]   [ 4 ]   [ 3·4 + 1·4 ]   [ 16 ]       [ 4 ]
    [ 2 2 ] × [ 4 ] = [ 2·4 + 2·4 ] = [ 16 ]  =  4·[ 4 ]

In this case, (4, 4) is an eigenvector of the matrix above, and 4 is its corresponding eigenvalue. Using linear algebra, the eigenvalues of an n × n matrix can be calculated by finding the roots of a polynomial equation with λ as the sole variable. This is known as the characteristic polynomial of the matrix and can

be found by computing the determinant of the matrix with λ subtracted from all of its diagonal entries, where the row number equals the column number. Denoted by |A| for a matrix A, the determinant of a 2 × 2 matrix is

    | a b |
    | c d | = ad − bc.

The determinant of a 3 × 3 matrix

    | a b c |
    | d e f |
    | g h i |

would be calculated by shaving off the top row (a b c), multiplying all of its even-numbered terms by −1, multiplying each term by the determinant of the 2 × 2 matrix formed by cutting the corresponding column out of the remaining rows

    | d e f |
    | g h i |,

and adding the results together:

    | a b c |         | e f |         | d f |         | d e |
    | d e f |  =  a · | h i |  −  b · | g i |  +  c · | g h |
    | g h i |

We would calculate the determinant of a 4 × 4 matrix in a similar manner, but using the determinants of 3 × 3 sub-matrices instead of 2 × 2 ones. In this way, determinants are recursive, meaning that one must compute several (n − 1) × (n − 1) determinants in order to calculate the determinant of an n × n matrix.

For example, let us try computing the characteristic polynomial of the matrix from our earlier eigenvector example,

    [ 3 1 ]
    [ 2 2 ].

Subtracting λ from the diagonals, we find that our characteristic polynomial is

    | 3 − λ    1    |
    |   2    2 − λ  | = (3 − λ)(2 − λ) − 2 · 1 = λ^2 − 5λ + 4 = (λ − 1)(λ − 4).

Setting this equal to 0, we find that the solutions 1 and 4 are the eigenvalues of our matrix. Expanding this process to larger matrices with n diagonal entries, we find that the characteristic polynomial of an n × n matrix must always be of degree n. Thus, by the Fundamental Theorem of Algebra, an n × n matrix can have up to n distinct eigenvalues, since it will produce a characteristic polynomial of degree n with up to n distinct roots. Of these, we will refer to the eigenvalue with the largest absolute value as the matrix’s spectral radius.

Perron numbers were first used by German mathematician Oskar Perron when describing the eigenvalues of square matrices whose coefficients were positive real numbers. In his 1907 paper Zur Theorie der Matrices, Perron

came up with a theorem stating that the spectral radius of a positive real matrix must be one of its eigenvalues (Perron, 1907). Fellow German mathematician Georg Frobenius would later generalize this statement to include not just positive real matrices, but all irreducible matrices with non-negative coefficients. This spectral radius would later be referred to as the Perron eigenvalue, or Perron number, of the matrix. The theorem would later be known as the Frobenius-Perron, or Perron-Frobenius, Theorem (Gannon & Schopieray, 2019). Later on, Doug Lind would prove the converse of the Perron-Frobenius Theorem to be true as well. Lind’s Theorem states that, given any Perron number λ, there exists some matrix with non-negative integer coefficients that is irreducible and has λ as a Perron eigenvalue (Thurston, 2014). We will be making use of both of these theorems when discussing the link between matrices and fusion graphs, and how both are related to Perron numbers.

3.2 Fusion Graphs

In mathematics, a graph is a collection of vertices connected by adjoining edges. These edges can be directed, meaning they only point in one direction from one node to another, or they can be bidirectional. Both vertices and edges can be weighted, meaning that they have numerical values associated with them. Here, we will be referring to graphs with weighted vertices and bidirectional edges, although the patterns we notice will be just as applicable to directed graphs as well. An interesting relationship exists between graphs and square matrices. Normally, when we want to represent a graph, we draw a series of circles in somewhat arbitrary positions connected by lines or arrows of varying lengths. Let us suppose that we want a general way of representing a graph that does not require a pen and paper or some graph drawing software. One option would be to give each vertex a letter or number, sort them, and use them to label the rows and columns of a table. Then, choosing a vertex from the top row and a vertex from the leftmost column, we could record their relationship by how many edges there were between them:

[Figure: a graph on vertices A–F. A and B are joined by a double edge; B is joined to C, D, and E (a triple edge to E); C and D are each joined to E; and E and F are joined by a double edge.]

Figure 2: A bidirectional graph with six vertices and eleven edges.

      A  B  C  D  E  F
    A 0  2  0  0  0  0
    B 2  0  1  1  3  0
    C 0  1  0  0  1  0
    D 0  1  0  0  1  0
    E 0  3  1  1  0  2
    F 0  0  0  0  2  0

Figure 3: The adjacency matrix corresponding to Figure 2.

Examining this table, we note that without the row and column labels it is essentially just a square matrix with non-negative values. This matrix is referred to as the adjacency matrix of the graph, and is unique to only that graph. Now that we have established a connection between matrices and graphs, we return to the Perron-Frobenius Theorem from earlier. We define the spectral radius of a graph to be the spectral radius of its corresponding adjacency matrix. To understand the relationship between the structure of a graph and its spectral radius, we must assign weights to each of the graph’s vertices. However, these weights must follow a special rule: If the spectral radius is r and the weight of a node i is wi, then for each node i the weights of its neighbors must all add up to the product of the weight wi and r (Davydov & Kanieski, n.d.-b). For example, suppose that we want to construct a graph with spectral radius √3. We note that √3 is Perron, as it is greater than or equal to its one and only Galois conjugate −√3. We start out with a single node of

weight 1. Multiplying 1 by √3, we find that the sum of the nodes connected to 1 must be √3. Experimenting, we decide to add a node of weight √3 to the 1 node. √3 · √3 = 3, meaning that the sum of the nodes connected to √3 must equal 3. Since we already have a 1, we decide to connect a 2 to the other side. The nodes connected to 2 must add to 2√3, and we already have a single √3 attached. We then decide to attach another √3 to the end, and the only node we have left that we can still attach is another 1. We have the resulting graph:

    1 — √3 — 2 — √3 — 1

Figure 4: A simple graph of spectral radius √3.

Similarly, given a graph without its corresponding adjacency matrix, it is still possible for us to use algebra to calculate its spectral radius. For example, let us consider Figure 5 below. Giving the graph weights, we start by labeling each of the nodes with only one neighbor with a 1. With this in mind, the center nodes forming the triangle must have weights equal to the spectral radius, since they are the only vertices connected to each of the 1 nodes. Because the product of each node and the spectral radius r must equal the sum of its neighbors, we have the equation r^2 = 2r + 1. Solving this with the quadratic formula and selecting the larger of the two roots for the sake of being both positive and Perron, we have that the spectral radius must be r = 1 + √2.

[Figure: three nodes of weight 1 + √2 joined in a triangle, each also joined to its own pendant node of weight 1.]

Figure 5: A fusion graph of spectral radius 1 + √2.

From here, we can apply the Perron-Frobenius Theorem and Lind’s Theorem to draw some interesting conclusions regarding these graphs. First, we will define irreducible matrices in the context of graphs. An irreducible matrix is one whose corresponding graph is connected, meaning that there exists a path of edges in the graph between any two of its vertices (Davydov & Kanieski, n.d.-b). In this context, the Perron-Frobenius Theorem states that every connected graph has a distinct spectral radius, which must be a Perron number. Furthermore, Lind’s Theorem states that for every Perron number, we can construct at least one graph that possesses it as a spectral radius. We have demonstrated both of these theorems in action with our previous two graphs. Given a Perron number as a spectral radius, we were able to construct a graph; given a graph, we were able to determine the Perron number that was its spectral radius. Studying the spectral radius property of a graph, we notice that we can refer to the weights of its vertices as distinct elements in a set. In some cases, like with the graph of spectral radius √3 in Figure 4, we can see an interesting pattern: If any two nodes in the graph are multiplied

together, the resulting product is always a sum of integer multiples of other elements. For example, 2 · 2 = 2 + 2, √3 · √3 = 3 = 1 + 2, and √3 · 2 = √3 + √3. Additionally, we can see from the graph of spectral radius 1 + √2 in Figure 5 that (1 + √2)(1 + √2) = 3 + 2√2 = 1 + 2(1 + √2). This property is known as a fusion rule (Verlinde, 1988). For the purposes of this paper, we have chosen to classify graphs whose nodes possess this property as fusion graphs (Davydov & Kanieski, n.d.-b). Fusion rules and their associated Perron numbers appear in many areas of not just mathematics, but physics as well, and constitute another major developing field of research. According to Etingof et al. (2017), Perron numbers serve as the dimensions of Grothendieck rings, and alongside fusion rules form the combinatoric basis of the study of finite tensor categories, which appear in numerous areas of mathematics and of which fusion graphs are a part. In physics, they play a role in conformal field theories, or quantum field theories that are invariant under conformal transformations (Verlinde, 1988). In computer science, finding nonabelian (non-commutative) fusion rules could hold the key to building a topological quantum computer, which would protect quantum memory from decoherence and decay by locking it into “braids” using tensor categories (Rowell & Zhenghan, 2018). As of right now, however, the information we have on fusion rules is very limited and many aspects of them are still not well understood. Future research into the properties of Perron numbers might potentially unlock more secrets about fusion rules, and thereby provide a new path to navigate previously uncharted territories in multiple scientific disciplines.

3.3 Arithmetic Properties

An interesting property of Galois conjugates is that they are additive and multiplicative. This means that the Galois conjugate of a + b is equal to the Galois conjugate of a plus the Galois conjugate of b; likewise, the Galois conjugate of ab is equal to the Galois conjugate of a times the Galois conjugate of b. If a and b are positive and at least as large as the absolute values of all of their respective Galois conjugates, then by extension a + b and ab each dominate all of their respective Galois conjugates as well. By definition, this means that if you add or multiply two Perron numbers, your result will always be another Perron number. Perron numbers are thus closed under addition and multiplication (Davydov & Kanieski, n.d.-a). This property makes finding and generating Perron numbers relatively easy, even though the majority of them are irrational and cannot be easily expressed even using nested radicals. If we have a finite set of Perron numbers, we can generate an infinite number of them by adding and multiplying existing ones that we already know. In this case, we would refer to our set as a generating set. If the elements of our generating set are linearly independent, meaning they cannot be expressed as sums of multiples of one another, then our generating set is referred to as a basis. However, we will see later on that the elements of our generating sets are not always linearly independent for all conductors. There are multiple different generating sets that we can use to produce Perron numbers, each of which comes with its own pros and cons, and exploring these will allow us to distinguish different properties of Perron numbers and interpret their results.

4 Representing Cyclotomic Integers

4.1 Generating Sets

Cyclotomic Perron numbers are of particular interest to us not only due to their ease of construction and examination, but also because of their unique properties. So far, every cyclotomic Perron number has been found to be the spectral radius of some fusion graph, although this has yet to be proven for all cases (Etingof et al., 2017). Additionally, cyclotomic Perron numbers can be represented not just as irrational numbers, but as vectors of integers in a given generating set. For example, we could represent a + bφ as the two-dimensional vector (a, b), preserving only the coefficients of 1 and φ. This allows for the exact values of cyclotomic Perron numbers to be stored in memory as the integer coefficients of an array, which we can then add or subtract or use to compute their Galois conjugates. There are three main methods we can use to express these Perron numbers as vectors, each of which has its own strengths and weaknesses. We will discuss them below.

One way we could construct our generating set in conductor n would be using exponents of x = 2 cos(π/n). We will refer to this as the exponential representation, in which the vector (a1, a2, . . . , ak) represents a1 + a2·x + a3·x^2 + ··· + ak·x^(k−1). All cyclotomic integers of conductor n can be represented in this manner because, as we will see later on, all cyclotomic integers that are sums of the complex nth roots of unity will always be expressible in terms of polynomials of 2 cos(π/n). This representation has the advantage of x^k being relatively easy to compute.

Another approach we could use would be to write the cyclotomic integers as sums of multiples of cosines. We will refer to this as the trigonometric representation, in which the vector (a1, a2, . . . , ak) represents a1 + 2a2 cos(π/n) + 2a3 cos(2π/n) + ··· + 2ak cos((k−1)π/n). We know from previous sections that all cyclotomic integers of conductor n can be expressed as sums of ε2n^k + ε2n^−k = 2 cos(kπ/n). Therefore, this representation spans all cyclotomic integers of conductor n and is a valid generating set. It has the advantage that each coefficient is somewhat easy to calculate via trigonometry, and we can use trigonometric identities on our numbers to explain some of their properties.

Both of these representations have positive qualities that might make us want to use them. However, when we get to minimality later on, we will see that they do not always provide the best description of a cyclotomic integer and its composition. Instead, the elements of our set will be the quantum numbers [k]εn, which we have defined in our research as:

    [k]εn = (εn^k − εn^−k) / (εn − εn^−1)

Quantum numbers follow the interesting recursive pattern that [1]εn = 1, [2]εn = εn + εn^−1 = 2 cos(2π/n), and [k]εn = [2]εn · [k − 1]εn − [k − 2]εn. This means that all quantum numbers are expressible as polynomials in terms of 1 and [2]εn; for instance, [3]εn = [2]εn^2 − 1 and [4]εn = [2]εn^3 − 2[2]εn. They are of interest to us because, in addition to spanning all cyclotomic integers of conductor n, the product of any two quantum numbers can be expressed as a sum of quantum numbers, meaning that quantum numbers follow a fusion rule (Davydov & Kanieski, n.d.-a). This means that any graph whose spectral radius is a quantum number must be a fusion graph. Additionally, we will see that, for small conductors, quantum numbers are often minimal. From now on, we will be using the set of quantum numbers as our generating set for expressing cyclotomic integers of conductor n. For the sake of convenience, we will refer to [k]εn as quantum k.

4.2 Length of the Perron Vector

The length of a Perron vector for conductor n depends on how many distinct Galois conjugates ε2n possesses. For cyclotomic integers, the kth Galois conjugate of εn^m = cos(2πm/n) + i sin(2πm/n) is εn^mk = cos(2πmk/n) + i sin(2πmk/n). Because sine and cosine functions are periodic with rational multiples of π, this means that Galois conjugates repeat themselves. For instance, the 3rd and 21st Galois conjugates of 2 cos(π/12) = (√6 + √2)/2 are 2 cos(π/4) = √2 and 2 cos(7π/4) = √2 respectively, which are both equal. Therefore, there are only a limited number of distinct Galois conjugates and thus a limited number of coefficients for the Perron vector representation.

17 k To calculate the total number of distinct Galois conjugates of εn, we must k make sure that the polynomial we are using for εn remains minimal. For the complex fourth roots of unity, the imaginary unit i = ε4 has three Galois conjugates: 1, −1, and −i. However, only one of these, −i, is necessary as a companion root to make the minimal polynomial x2 + 1 = (x − i)(x + i) = 0. This is because the other factors x + 1 and x − 1 are already minimal polynomials themselves, meaning that the polynomial x4−1 containing them is not minimal itself. The same goes with the complex ninth roots of unity: 3 6 2 −1 The third and sixth Galois conjugates ε9 = ε3 and ε9 = ε3 = ε3 are the complex cube roots of unity, meaning that together they form the minimal 2 −1 2 polynomial x +x+1 = (x−ε3)(x−ε3 ). We must therefore divide x +x+1 as well as the trivial x − 1 out of the “minimal” polynomial x9 − 1 for the ninth roots of unity, resulting in the new minimal polynomial x6 + x3 + 1. We will refer to such minimal polynomials for complex nth roots of unity as cyclotomic polynomials. In order for a kth Galois conjugate to be unique to an nth root of unity, k and n cannot share any factors other than 1, meaning they must be coprime. Therefore, the number of roots of the nth cyclotomic polynomial must be the total number of positive integers greater than or equal to 1 but less than n that are coprime to n. This number is referred to as Euler’s totient function φ(n), and can be computed by taking the prime factorization of n and subtracting 1 from a single copy of each prime number. For example, if n = 2250 = 2 · 32 · 53, then φ(n) = (2 − 1) · (3 − 1) · 3 · (5 − 1) · 52 = 600. For conductor n, we would thus expect our Perron vectors to all be of dimension φ(2n), and consist of φ(2n) integer coordinates. However, we note that half of our Galois conjugates are simply negatives of the other half, as demonstrated by the mirror image pentagons in Figure 1. 
This means that integer multiples of these opposing pairs of Galois conjugates combine together into just one coefficient. Therefore, the dimension of a Perron vector of conductor n is k = φ(2n)/2. For odd integers n, this reduces to φ(n)/2. In the course of our research, we will focus primarily on conductors with odd n, as they follow more consistent patterns than those with even n.
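To make the computation above concrete, here is a small Python sketch (our own illustration, not the thesis code) that computes Euler's totient from the prime factorization and the resulting Perron-vector dimension:

```python
# Illustrative sketch (not the thesis code): Euler's totient phi(n) from the
# prime factorization, and the Perron-vector dimension k = phi(2n)/2.

def phi(n):
    """Euler's totient: for each prime power p^e dividing n,
    multiply in (p - 1) * p^(e - 1)."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            result *= (p - 1) * p ** (e - 1)
        p += 1
    if n > 1:            # leftover prime factor
        result *= n - 1
    return result

def perron_dimension(n):
    """Dimension k = phi(2n)/2 of a Perron vector for conductor n."""
    return phi(2 * n) // 2

print(phi(2250))            # 600, matching the worked example
print(perron_dimension(7))  # 3
```

For odd n, phi(2 * n) equals phi(n), so this agrees with the simplified formula φ(n)/2 above.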

5 Checking if a Cyclotomic Integer is Perron

By definition, a number is Perron if it is positive and greater than the absolute values of all its Galois conjugates. Let σ_n(a) be the nth Galois conjugate of a. Given a vector representing a cyclotomic Perron number p with a generating set (x_1, x_2, . . . , x_k), we can find σ_n(p) simply by conjugating each element in the generating set to get (σ_n(x_1), σ_n(x_2), . . . , σ_n(x_k)). Therefore, we can generate a series of inequalities utilizing the coefficients of the vector for p which, if satisfied, will determine whether or not p is Perron. Take the inequalities for Perron vectors of conductor 7. Let β = x_2 = 2 cos(π/7) be quantum 2 for conductor 7. Then, for the odd-indexed Galois conjugates of β (which have the same absolute values as their negated even-indexed counterparts), we have the following:

σ_3(β) = β^2 − β − 1
σ_5(β) = 2 − β^2

We thus have that if the number (a, b, c) = a + bβ + c(β^2 − 1) is Perron, it must be greater than the absolute values of the numbers formed by replacing β with β^2 − β − 1 or 2 − β^2. In addition, it must also be positive. This results in the following inequalities:

a + bβ + c(β^2 − 1) ≥ 0
a + bβ + c(β^2 − 1) ≥ σ_3(a + bβ + c(β^2 − 1)) = (a + c) − (b + c)β + b(β^2 − 1)
a + bβ + c(β^2 − 1) ≥ −σ_3(a + bβ + c(β^2 − 1)) = −(a + c) + (b + c)β − b(β^2 − 1)
a + bβ + c(β^2 − 1) ≥ σ_5(a + bβ + c(β^2 − 1)) = (a + b + c) + cβ − (b + c)(β^2 − 1)
a + bβ + c(β^2 − 1) ≥ −σ_5(a + bβ + c(β^2 − 1)) = −(a + b + c) − cβ + (b + c)(β^2 − 1)

which can be rewritten as

a + bβ + c(β^2 − 1) ≥ 0
(1 + 2β − β^2)b + (β^2 + β − 2)c ≥ 0
2a + (β^2 − 1)b + (β^2 − β)c ≥ 0
(β^2 + β − 2)b + (2β^2 − β − 3)c ≥ 0
2a + (−β^2 + β + 2)b + (β + 1)c ≥ 0

We note that the first inequality is redundant, as it can be shown that any vector (a, b, c) whose coefficients satisfy the other four inequalities must evaluate to a positive real number. Now, at last, we have a consistent and well-thought-out method for not only representing Perron numbers, but computing their Galois conjugates and verifying that they are Perron in the first place. We will make use of these properties later on when studying the distributions of Perron numbers in k-dimensional space for conductor n.
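As a sanity check, the conductor-7 inequalities can be compared numerically against the direct Perron definition. The following Python sketch (our own illustration, using a small floating-point tolerance so that exact boundary cases pass) confirms that the two tests agree on a grid of small integer vectors:

```python
import math

beta = 2 * math.cos(math.pi / 7)

def perron_by_definition(a, b, c):
    """Compare p = a + b*beta + c*(beta^2 - 1) directly with the absolute
    values of its Galois conjugates (beta replaced by 2cos(m*pi/7))."""
    def value(t):
        return a + b * t + c * (t * t - 1)
    p = value(beta)
    return all(p >= abs(value(2 * math.cos(m * math.pi / 7))) - 1e-9
               for m in (3, 5))

def perron_by_inequalities(a, b, c):
    """The four non-redundant rewritten inequalities for conductor 7."""
    b2 = beta * beta
    tests = [
        (1 + 2 * beta - b2) * b + (b2 + beta - 2) * c,
        2 * a + (b2 - 1) * b + (b2 - beta) * c,
        (b2 + beta - 2) * b + (2 * b2 - beta - 3) * c,
        2 * a + (-b2 + beta + 2) * b + (beta + 1) * c,
    ]
    return all(t >= -1e-9 for t in tests)

# The two tests agree on a grid of small integer vectors:
agree = all(perron_by_definition(a, b, c) == perron_by_inequalities(a, b, c)
            for a in range(-3, 4) for b in range(-3, 4) for c in range(-3, 4))
print(agree)   # True
```

Each pair of inequalities p ∓ σ_m(p) ≥ 0 is equivalent to p ≥ |σ_m(p)|, which is why the two tests coincide.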

6 Minimality of Perron Numbers

6.1 Definition of Minimality

For a given conductor n, it will be useful for us to have a general model of how its Perron numbers are distributed. From previous sections, we note that cyclotomic integers of conductor n can each be expressed as an ordered set of k = φ(n)/2 integers (a_1, a_2, . . . , a_k). Conveniently, we find that we can treat these integers as coordinates in k-dimensional space. For instance, we could express √5 = −1 + 2φ as the integer point (−1, 2) in two-dimensional space for conductor 5. Likewise, if β = 2 cos(π/7), then β^9 = 19 + 42β + 47(β^2 − 1) would be the integer point (19, 42, 47) in three-dimensional space for conductor 7. For higher dimensions, the coordinates of these points and their locations relative to one another become very difficult to visualize. However, we note that in order to be Perron, the coordinates of a cyclotomic integer must satisfy a series of inequalities. When graphed in k-dimensional space, these inequalities become (k − 1)-dimensional boundaries. For conductor 5, the boundary is a single pair of lines; for conductor 7, it is a series of five planes. In general, for k dimensions these boundaries form a k-dimensional "cone" whose vertex is the origin and whose edges branch outward in a positive direction. We will refer to these cones in our research as Perron cones (Davydov & Kanieski, n.d.-a).
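The β^9 expansion above can be verified by repeated multiplication, reducing with the relation β^3 = β^2 + 2β − 1 satisfied by β = 2 cos(π/7). A short Python sketch (helper names are our own):

```python
# Illustrative sketch: powers of beta = 2cos(pi/7) in the basis (1, beta, beta^2),
# reduced with the relation beta^3 = beta^2 + 2*beta - 1, then converted to the
# quantum basis (1, beta, beta^2 - 1) used for Perron vectors.

def beta_power(exp):
    """Coefficients [c0, c1, c2] of beta^exp over the basis (1, beta, beta^2)."""
    c = [1, 0, 0]                      # beta^0 = 1
    for _ in range(exp):
        c0, c1, c2 = c
        # multiplying by beta shifts coefficients; beta^3 -> beta^2 + 2*beta - 1
        c = [-c2, c0 + 2 * c2, c1 + c2]
    return c

c0, c1, c2 = beta_power(9)
# Since c2*beta^2 = c2*(beta^2 - 1) + c2, the quantum-basis vector is:
print((c0 + c2, c1, c2))   # (19, 42, 47)
```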

Figure 6: Perron cone inequalities and boundaries for conductor 5 (the region y ≥ −2x, y ≥ 0).

Figure 7: Perron cone boundaries for conductor 7.

Examining these graphs, we recall our earlier observation that Perron numbers are closed under addition. Indeed, if we were to add any two points in the Perron region of one of these graphs (where adding entails adding corresponding coordinates together), we would inevitably end up with another point within the boundaries of the Perron cone. Adding a point to any point near the boundary would result in a sum closer to the center of the cone than the boundary point; similarly, adding any two points lying near the same boundary would only result in another point lying farther along the boundary, as it would still have to abide by the inequalities. However, it becomes clear that not every point in the Perron cone can be expressed as a sum of other points in this manner. A trivial example would be 1 itself, as by definition all Perron numbers must be greater than or equal to 1 and 0 is not considered Perron. We thus define a new term: if a Perron number of a given conductor cannot be expressed as a sum of smaller Perron numbers, then it is said to be minimal. Minimality is important to the study of cyclotomic Perron numbers for a number of reasons. Although the boundaries of Perron cones are composed of smooth planes (or their equivalent in higher dimensions), the actual points lying on the planes almost never have integer coordinates. This is because the majority of conductors have Perron inequalities involving irrational coefficients, such as the ones demonstrated for conductor 7. Expressing these inequalities as equations and trying to find the roots, we find that there is no integer solution. Therefore, the actual integer boundary points are not on the boundary itself, but merely very close to it, with no integer gaps in between. Furthermore, because this boundary is irrational, there exist infinitely many boundary points that cannot be expressed as sums of other boundary points.
Therefore, just as there are infinitely many prime numbers, there are generally infinitely many minimal Perron numbers for a given conductor n. Conductors 3 and 5 are the only odd-numbered exceptions to this rule. Consider the representations of these numbers in their quantum bases. For instance, 2 cos(π/3) = 1 is the quantum 2 of conductor 3, and since the dimension k = φ(3)/2 = 1, Perron numbers in conductor 3 are all one-dimensional, consisting solely of the set of positive integers. Because all positive integers can be written as sums of ones, 1 is the only minimal Perron number for conductor 3. For conductor 5, the only Galois conjugate of φ is just 1 − φ, and we know from the previous section that the only inequalities necessary for a point (a, b) = a + bφ are b ≥ 0 and b ≥ −2a. In this case, there are three

minimal Perron numbers: (1, 0) = 1, (0, 1) = φ, and (−1, 2) = 2φ − 1 = √5. Because all these inequalities are either linear or quadratic (k ≤ 2), none of them involve multiples of the irrational quantum numbers themselves coming into contact with the integer coefficients. Therefore, because there are no other odd values of n for which φ(n)/2 ≤ 2, conductors 3 and 5 are the only odd ones for which there are finitely many minimal elements. Any other conductors n with finitely many minimals occur only when n is the product of 3 or 5 and a power of 2. We say that these conductors are finitely generated (Davydov & Kanieski, n.d.-a).

6.2 Relation to Prime Numbers in the Integers

We have seen numerous examples in our research of how primes exhibit themselves in patterns among cyclotomic Perron numbers. Prime factorizations are key to finding minimal polynomials of cyclotomic integers and computing Euler's totient function to determine the dimension of a conductor. Here, we will be using the properties of prime numbers again to better understand minimal Perron numbers and how to detect them. A simple algorithm exists for determining whether or not a positive integer n is prime. We would divide n by all the positive integers between 2 and n − 1, and see if our result was also an integer. If we wanted to narrow the range of possible factors, we could decrease the upper bound of our interval from n − 1 down to ⌊√n⌋, the greatest integer less than or equal to √n. This works because dividing n by any number greater than √n would result in a quotient less than √n, which we would have already checked. This prime-finding algorithm takes advantage of a simple property of the positive integers known as closure under multiplication: whenever two positive integers are multiplied, the result is always another positive integer. Therefore, to determine if a number is composite instead of prime, all we have to do is find a single nontrivial pair of integers that has our number as its product. Each factor will either be prime or be another composite integer with its own prime factorization. This means that the original number will either be prime or a product of multiple primes. In this way, we can draw a key analogy between the well-known properties of prime numbers and those of the emerging field of Perron numbers. Minimal Perron numbers are to their conductors what prime numbers are to the natural numbers. The only difference here is that, instead of multiplication, Perron numbers rely primarily on addition to generate new elements.
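The trial-division primality test just described can be sketched in a few lines of Python (illustrative only):

```python
import math

def is_prime(n):
    """Trial division by every integer from 2 up to floor(sqrt(n))."""
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True

print(is_prime(413))   # False: 413 = 7 * 59
print(is_prime(59))    # True
```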
Just as 2 is always prime, 1 is always minimal. If we want to prove that 413 is composite, then all we have to do is find a pair of integer factors

(7 and 59) that multiply to produce it. Likewise, if we want to prove that (−2, −1, 1, 1, 1) is not minimal in conductor 11, we must find two Perron addends of which it is the sum. It is from these observations that we construct our algorithm for detecting minimality. In order to prove that a Perron number p is non-minimal (the equivalent of a natural number being composite), we need only show that there is at least one pair of Perron numbers a and b such that p = a + b. It does not matter if a and b are minimal themselves, only that they are each Perron. Because Perron numbers are all positive by definition, if p = a + b, then a and b must both lie between 0 and p. If a ≤ p/2, then b ≥ p/2, and vice versa; therefore, when searching for our first candidate, we need only check Perron numbers between 0 and p/2, as any candidate larger than p/2 would have a companion addend within that range. This is analogous to only having to check prime factors less than or equal to the square root of the number being tested. If such a pair a and b is found, then p is not minimal. Otherwise, we conclude that it is. Of course, we note that there are some key differences between primes and minimal cyclotomic Perron numbers. The most notable is that, unlike prime factorizations, the decomposition of a Perron number into a sum of minimals is not always unique. For instance, the Perron number (0, 2) in conductor 5 can be written either as (0, 1) + (0, 1) or as (−1, 2) + (1, 0), all of whose addends are minimal. However, the Perron property of closure under addition remains analogous to the closure of the natural numbers under multiplication, so the integrity of our minimal decomposition process still holds.

7 Minimality of Quantum Numbers

In the course of our research, Dr. Davydov and I made a very important discovery. When we were first studying the minimality of Perron numbers, we always assumed that all quantum numbers in the quantum generating set were minimal. This assumption influenced many of our initial approaches to handling Perron numbers and interpreting their data. For instance, one of the primary reasons we chose to use a quantum representation rather than an exponential one was because we believed that it would result in all the unit vectors (1, 0, 0,... ), (0, 1, 0,... ), (0, 0, 1,... ),... being minimal, making our generating set for each conductor a basis. This would also result in all Perron vectors with all non-negative coefficients being minimal, which would make them easier to detect. Indeed, for conductors 5 and 7

this seems to be the case: (1, 0) and (0, 1) are both minimal for conductor 5, and (1, 0, 0), (0, 1, 0), and (0, 0, 1) are all minimal for conductor 7. We believed that this pattern would extend to higher conductors as well, and sought to test this hypothesis in our work. When we first started producing lists of minimal Perron numbers, our data initially appeared to support our hypothesis. Quantum 2, as expected, was minimal for all conductors. This was also the case for quantum 3 and quantum 4. However, by the time we reached conductor 11 and were able to look at quantum 5, we discovered something strange: quantum 5, or (0, 0, 0, 0, 1) = (1, 0, 0, 0, 0) + (−1, 0, 0, 0, 1), was not minimal, because (−1, 0, 0, 0, 1) was Perron. The same occurred for quantum 6: starting with conductor 13, (−1, 0, 0, 0, 0, 1) was Perron, meaning that (0, 0, 0, 0, 0, 1) was not minimal. Even more strangely, after conductor 21, quantum 6 went back to being minimal again and remained that way for every conductor afterward. The same thing happened to quantum 8 and quantum 10. A bizarre pattern was emerging.
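This discovery is easy to check numerically. The sketch below (our own illustration; it evaluates the quantum basis as x_k = sin(kπ/11)/sin(π/11) and uses a small tolerance so that boundary cases such as 1 itself count as Perron) confirms that (−1, 0, 0, 0, 1) is Perron in conductor 11:

```python
import math

def is_perron_11(coeffs, tol=1e-9):
    """Perron test for a quantum-basis vector of conductor 11: the value must
    be at least the absolute value of each Galois conjugate."""
    def value(m):
        t = m * math.pi / 11
        return sum(a * math.sin((k + 1) * t) / math.sin(t)
                   for k, a in enumerate(coeffs))
    p = value(1)
    # one representative from each +/- pair of Galois conjugates
    return all(p >= abs(value(m)) - tol for m in (3, 5, 7, 9))

print(is_perron_11((-1, 0, 0, 0, 1)))   # True: so quantum 5 is not minimal
print(is_perron_11((-2, 0, 0, 0, 1)))   # False: subtracting 2 goes too far
```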

Figure 8: Highest integer subtracted from quantum 8 while still being Perron. Note the way the graph dips back down to 0 after conductor 33.

Figure 9: Highest integer subtracted from quantum 11 while still being Perron. Note how the graph appears to level out at y = 4.

Intrigued by this new development, we wrote programs in Python to detect whether or not a quantum number was minimal for a given odd conductor x. If the quantum number was not minimal, the program would return the largest integer y it could subtract from the quantum number with the answer still being Perron. We found two distinct trends for quantum numbers. If the quantum number was odd, y would appear to follow an almost logistic, S-shaped curve that started from 0, increased roughly linearly, and eventually leveled out to a maximum value where it would stay for all future conductors. If the quantum number was even, something radically different would happen: the graph of y would rapidly increase from 0, hit a maximum value, and then decrease in a pattern reminiscent of the function y = 1/x until it hit 0 again. The higher n was for a quantum number n, the larger the maximum value for y would be on the graph.
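A minimal sketch of this kind of probe follows, assuming the quantum-basis conjugates σ_m(x_k) = sin(kmπ/n)/sin(mπ/n) for odd m coprime to n (function names are ours, and this reproduces only the integer-subtraction experiment, not full minimality testing):

```python
import math

def quantum_value(n, coeffs, m=1):
    """Value of a quantum-basis vector for odd conductor n, evaluated at the
    m-th Galois conjugate (m = 1 gives the number itself)."""
    t = m * math.pi / n
    return sum(a * math.sin((k + 1) * t) / math.sin(t)
               for k, a in enumerate(coeffs))

def is_perron(n, coeffs, tol=1e-9):
    """p is Perron if it is at least the absolute value of every conjugate."""
    p = quantum_value(n, coeffs)
    return all(p >= abs(quantum_value(n, coeffs, m)) - tol
               for m in range(3, n, 2) if math.gcd(m, n) == 1)

def largest_subtractable(n, q):
    """Largest integer y such that (quantum q) - y is still Perron in
    conductor n; returns 0 when quantum q passes this probe. Assumes q >= 2."""
    y = 0
    while True:
        coeffs = [0] * q
        coeffs[0] = -(y + 1)
        coeffs[q - 1] = 1
        if not is_perron(n, coeffs):
            return y
        y += 1

print(largest_subtractable(11, 5))   # 1: quantum 5 minus 1 is Perron
print(largest_subtractable(7, 2))    # 0: quantum 2 stays minimal here
```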

Figure 10: Highest integer subtracted from quantum 67 while still being Perron. Integer points have been hidden for the sake of clarity. Note how the graph appears to level off at 24.

Figure 11: Highest integer subtracted from quantum 100 while still being Perron. Integer points have been hidden for the sake of clarity. Note the almost hyperbolic behavior of the graph as it tends toward 0.

Next, instead of integers, we tried subtracting integer multiples y of quantum 2. This time, the exact opposite of our previous trials took place: For even quantum numbers, y would eventually level out at some value. For odd quantum numbers, however, y would peak and then rapidly approach 0.

Figure 12: Highest integer multiple of quantum 2 subtracted from quantum 10 while still being Perron.

Figure 13: Highest integer multiple of quantum 2 subtracted from quantum 11 while still being Perron.

Examining the nature of the Perron cone, we have some reasoning for why y increases with conductor n. In general, the larger n is, the more

dimensions the Perron cone has and the more space there is between the coordinate axes representing the quantum numbers and the hyperplanes representing the inequalities. Indeed, the jagged nature of parts of the graphs is due in part to the number of dimensions φ(n)/2 suddenly decreasing for certain values of n, for example φ(11)/2 = 5 but φ(15)/2 = 4, leaving less room for other Perron numbers.

8 Potential Future Applications

Understanding minimality in cyclotomic Perron numbers will play a crucial role in continued research into fusion rules. Plotting minimal Perron vectors for different conductors will help us gain insight into how cyclotomic Perron numbers add and multiply together. This in turn could lead to new discoveries in fusion rules, which would revolutionize many fields of science and mathematics if we knew more about them. Our experiments with the minimality of quantum numbers have established the difficulty of finding a basis for a conductor n containing only minimal values. At the same time, they have given us a better idea of the nature of Perron cones and their properties. In the future, this could lead to new properties of Perron numbers being discovered, resulting in new theorems in algebra and number theory. In the next section of this document, we will discuss the programs and algorithms we used to perform these computations with Perron numbers. We will also discuss research we undertook related to polynomials involving Perron numbers, and how we were able to streamline our programs to accommodate large data sets.

Part II

Algorithms Regarding Perron Numbers and Polynomials

9 Efficiency and Computational Complexity

In computer science, efficiency is generally measured by two factors: time complexity and space complexity. Time complexity refers to the asymptotic rate at which the total number of instructions carried out by the program increases with input size. Space complexity refers to the asymptotic rate at which the total amount of auxiliary data used by the program increases with input size. Both of these complexities are generally treated as irregular, approximate functions of the input size, whose bounds are given by the definitions below:

1. Given two functions f(n) and g(n), f(n) ∈ O(g(n)) if and only if there exist a constant c and an initial value n0 such that f(n) ≤ cg(n) for all n ≥ n0. If, for every constant c > 0, there is an n0 such that f(n) < cg(n) for all n ≥ n0, then we have the stronger condition that f(n) ∈ o(g(n)).

2. f(n) ∈ Ω(g(n)) if and only if there exist a constant c and an initial value n0 such that f(n) ≥ cg(n) for all n ≥ n0. If, for every constant c > 0, there is an n0 such that f(n) > cg(n) for all n ≥ n0, then we have the stronger condition that f(n) ∈ ω(g(n)).

3. f(n) ∈ Θ(g(n)) if and only if there exist two constants c1 and c2 and an initial value n0 such that c1 g(n) ≥ f(n) ≥ c2 g(n) for all n ≥ n0. If f(n) ∈ Θ(g(n)), then it is also true that g(n) ∈ Θ(f(n)), and that f(n) ∈ O(g(n)) and f(n) ∈ Ω(g(n)) as well.

For the purposes of this paper, we will be using O notation, as it serves as an upper bound on the runtime of our algorithms. However, for most of these algorithms it would be just as true and valid to use Ω or Θ notation. In general, if f(n) ∈ O(g(n)), then g(n) ∈ Ω(f(n)). To prove that f(n) ∈ O(g(n)), it suffices to show that

lim_{n→∞} f(n)/g(n)

is finite; if the limit is 0, we have the stronger statement that f(n) ∈ o(g(n)). This can be done using calculus and L'Hôpital's Rule if necessary, namely that lim_{x→a} f(x)/g(x) = lim_{x→a} f′(x)/g′(x) if f(a)/g(a) is indeterminate.

With these notations for time complexity, we can make some general observations about specific functions. First, we note that if a and b are constants, then an^k ∈ O(bn^k). If h ≤ k, then n^h ∈ O(n^k). We observe that O follows many of the same properties as the ≤ operator, namely that

1. If f(n) ∈ O(g(n)) and g(n) ∈ O(h(n)), then f(n) ∈ O(h(n)).

2. If f(n) ∈ O(g(n)) and h(n) ∈ O(k(n)), then f(n) + h(n) ∈ O(g(n) + k(n)) and f(n)h(n) ∈ O(g(n)k(n)).

3. If f(n) ∈ O(g(n)) and a > 0, then (f(n))a ∈ O((g(n))a).

Additionally, because lim_{x→∞} (ln x)/x = lim_{x→∞} (1/x)/1 = 0 by L'Hôpital's Rule, we have that ln n ∈ O(n); in fact, using the same technique we can show that (ln n)^k ∈ O(n^h) for all k ≥ 1 and h > 0. Also, because log_b a = (log_c a)/(log_c b) for all valid bases c, if b and c are constants, then so is 1/(log_c b). Therefore, all logarithms of the same number n are constant multiples of one another; this means that, with respect to O, all logarithms are the same regardless of base. O(log n), O(ln n), and O(lg n) all represent the same time complexity, as each function grows at the same asymptotic rate, and they can be used interchangeably.

It will help us to memorize some key time complexities in order to describe functions in our program. For instance, sorting an array of size n using a comparison-based sorting algorithm will always take Ω(n log n) comparisons. We can deduce this from two key observations. For an optimal comparison-based sorting algorithm, we would want to have a chart showing all n! possible orderings of n elements and how to get between them based on whether or not certain elements to be compared are already in order. For instance, if we had an array {a, b, c, . . .} and we saw that a > b, then we would have to swap the elements a and b in the array; otherwise, we would leave a and b in their current order and move on. Therefore, we can think of sorting as making a series of yes-or-no decisions. These decisions form a binary decision tree, as shown below.

a ≤ b?

Yes: b ≤ c? No: a ≤ c?

Yes: a ≤ b ≤ c No: a ≤ c? Yes: b ≤ a ≤ c No: b ≤ c?

Yes: a ≤ c ≤ b No: c ≤ a ≤ b Yes: b ≤ c ≤ a No: c ≤ b ≤ a

Figure 14: A binary decision tree for sorting an array of 3 elements. Note there are 3! = 6 possible results and the tree has a depth of ⌈lg 6⌉ = 3.

The maximum depth of a binary tree of n elements would just be n, which would represent a chain of nodes all on one side of the tree. However, the minimum depth of a binary tree occurs when as many nodes as possible have two children; this maximizes the number of nodes on each level, flattening out the tree. The minimum depth is the number of times we can cut the tree in half before being left with individual nodes: in mathematical terms, ⌈lg n⌉, the least integer greater than or equal to the log base 2 of n. The depth of our decision tree is important in that it determines the number of comparisons we have to make in order to sort the array, and the total number of comparisons determines the runtime of our sorting algorithm. If we wanted to sort an array of n elements, the depth of our binary decision tree would therefore be ⌈lg(n!)⌉. We note that lg(n!) = lg(1 · 2 · 3 · · · n) = lg 1 + lg 2 + lg 3 + · · · + lg n ≤ lg n + lg n + lg n + · · · + lg n = n lg n. Thus ⌈lg(n!)⌉ ≤ ⌈n lg n⌉, meaning that the number of comparisons made by an optimal comparison-based sorting algorithm is O(n log n). Seeing that this argument relies on the depth of the decision tree being as small as possible, the same quantity also serves as a lower bound: comparison-based sorting algorithms in general require Ω(n log n) comparisons. In addition to sorting, another key feature of our algorithms will be the ability to merge two sorted arrays together. Merging involves choosing the least element of each array, putting it at the end of an accumulator array, and moving on to the next one. For instance, merging {1, 3, 5} and {2, 4, 6} would result in {1, 2, 3, 4, 5, 6}. If the elements of one array get exhausted before the other one is empty, we can just append the elements

of the remaining array to the accumulator array, taking advantage of their already being sorted. Therefore, the number of comparisons we have to make is not constant; however, it is bounded above by the combined length of the two arrays, which occurs when the arrays happen to exhaust themselves at roughly the same time. The time complexity of merging an array of size m with an array of size n is therefore O(m + n). For the purposes of our programs, we need only memorize two key time complexities: O(n) and O(n log n). In the long run, a function f(n) growing like n log n will always exceed a function g(n) = kn in value after a certain crossover point. For instance, if f(n) = n log₁₀ n and g(n) = 5n, then f(100) = 200 < g(100) = 500, but f(1,000,000) = 6,000,000 > g(1,000,000) = 5,000,000. Therefore, for data sets whose size is a large integer n, a runtime of Θ(n log n) will always eventually take longer than a runtime of Θ(n). We therefore want to avoid O(n log n) algorithms in our code where possible, and use O(n) algorithms instead, with a runtime f(n) = kn for some relatively small k.
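The merge step described above might look like this in Python (an illustrative sketch):

```python
def merge(left, right):
    """Merge two sorted lists using at most len(left) + len(right) comparisons."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # One side is exhausted; the other is already sorted, so append the rest.
    result.extend(left[i:])
    result.extend(right[j:])
    return result

print(merge([1, 3, 5], [2, 4, 6]))   # [1, 2, 3, 4, 5, 6]
```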

10 Minimal Perron Vector Algorithms

Needless to say, decomposing large quantities of Perron numbers for multiple different conductors would be very difficult to do by hand. Therefore, we wrote programs in C++ to automate their computation. Without going into detail on the code itself, we will discuss the general nature of the program and the algorithms involved. To compute all the minimals of conductor n within a certain range, we must first exhaustively enumerate its cyclotomic Perron numbers. We begin by choosing an interval [a, b] for the coefficients of the corresponding Perron vectors. Next, we must enumerate all vectors from (a, a, . . . , a) to (b, b, . . . , b) and determine which ones are Perron based on the inequalities for the given conductor. This is the equivalent of enumerating all integers from 00000 to 99999 to determine which ones are prime: the coefficients of our vectors are like digits, with a being the lowest like 0, and b being the highest like 9. We then store these vectors in an array and sort them based on the algebraic numerical values to which they correspond. Afterward, starting with the first element p in the array, we subtract the previous elements from p one at a time. For each previous element e ≤ p/2, we take the difference d = p − e and plug it back into our inequalities. If d is Perron, we know that p is not minimal and we halt the algorithm. Otherwise, we continue subtracting smaller Perron numbers until there are none left. If we run out of candidates to subtract, then p is deemed minimal.

There are a few variations on this algorithm. For instance, we might take an approach analogous to the Sieve of Eratosthenes and only subtract the candidates less than or equal to p/2 that have already been marked by the algorithm as minimal. This drastically decreases the runtime of our program, especially for larger conductors where large numbers of non-minimal Perron numbers might exist in between the minimal ones. Additionally, we can generalize our algorithm to all odd conductors even if we have not had the chance to compute the inequalities ourselves. To do this, all we have to do is multiply the vector of our coefficients by the matrix of the elements of our generating set and their Galois conjugates. Suppose we have a Perron number p represented by the Perron vector (a_1, a_2, . . . , a_k) of conductor n whose quantum 2 is x_2 = 2 cos(π/n) and whose quantum k is x_k. Here, we recall from Section 4.1 that the mth Galois conjugate of x_2 is 2 cos(mπ/n), and that x_k follows the recursive formula x_k = x_2 x_{k−1} − x_{k−2}. Because of the additive and multiplicative properties of Galois conjugates, to find σ_m(x_k), the mth Galois conjugate of x_k, we need only substitute σ_m(x_2) for x_2 in our recursive formula. We thus construct our matrix as follows:

 2      1 x x − 1 . . . xk a1 p 2 1 σ2(x) σ2(x − 1) . . . σ2(xk)  a2  σ2(p)   2      1 σ3(x) σ3(x − 1) . . . σ3(xk)  a3  σ3(p)    ×   =   ......   .   .  . . . . .   .   .  2 1 σn−1(x) σn−1(x − 1) . . . σn−1(xk) ak σn−1(p)

From the Perron vector, this matrix multiplication efficiently computes the value of p itself as well as each of its Galois conjugates. It does this by taking each coefficient of the Perron vector and multiplying it by its corresponding quantum number or that quantum number's ith Galois conjugate, adding together the results to get the final sum. If p is greater than the absolute values of all of the numbers below it in the resulting vector, then it is Perron. There are, however, some caveats to our algorithm, particularly with how the Perron vectors are sorted. We take it for granted that Perron numbers with small values have small coefficients. For instance, we do not expect to see a Perron number between 1 and 5 that has 1000000 as one of the coefficients in its Perron vector, with some equally large negative coefficient to balance it out. Indeed, because of the general shape of a Perron cone, this does not happen. While points like the one we described do exist, they are by their very nature almost never Perron; due to being unusually small in absolute value for their coefficients, they are likely to be

much less than at least one if not more of their Galois conjugates. However, it is possible to choose a range for the coefficients of a Perron vector and have a Perron number whose coefficients lie just slightly outside of that range and whose numerical value is less than that of some of the chosen numbers. This excluded Perron number could potentially be a component in the decomposition of another larger Perron number, which would end up being incorrectly recorded as minimal. Thankfully, we can avoid this error by updating our database of Perron numbers as the algorithm progresses. If one of the differences d that we calculate turns out to be Perron, but we do not have it listed because its coefficients are too large, we can add it to our array for future reference. This can also be mitigated by having the coefficients of the smaller quantum numbers span a wider range than those of the larger ones, ensuring that any variability in the larger quantum numbers is balanced out by larger variabilities in the smaller ones. We have seen this phenomenon occur a few times in our research, but it has always been easily fixed by increasing the range of the coefficients. Indeed, it showed up the least often for Perron vectors in the quantum basis, thus justifying our choice to use them. We can generalize this algorithm using pseudocode familiar to computer scientists. Here, we will use the walrus operator := to denote setting a variable equal to a value, and use normal text within parentheses to display comments that are not part of our code but that give a general picture of what is going on.
The algorithm to generate minimal cyclotomic Perron numbers of conductor n whose coefficients lie between a and b would thus be:

k := phi(n) / 2
M := a matrix with n - 1 rows and k columns
(Make Perron matrix)
for 1 <= i <= n - 1:
    M[i - 1][0] := 1
    M[i - 1][1] := 2cos(pi * i / n)
for 0 <= i < n - 1:
    for 2 <= j < k:
        M[i][j] := M[i][j - 1] * M[i][1] - M[i][j - 2]
S := all k-dimensional vectors with coefficients between a and b
P := set of vectors that are Perron (currently empty)
minimals := set of Perron vectors that are minimal (currently empty)
(Find out which cyclotomic integers are Perron)
for each s in S:
    v := M * s
    if v[0] >= |v[i]| for all 1 <= i < n - 1:
        add s to P, paired with its corresponding value v[0]
sort P based on the corresponding values of its vectors
(Find out which Perron numbers are minimal)
for each p of P:
    for each d of minimals less than or equal to p/2:
        if p - d is Perron: (Check using the Perron matrix from earlier)
            do not add p to minimals
            add p - d to P if it is not already there
    otherwise, add p to minimals
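As an illustration of this procedure, the following Python sketch specializes it to conductor 5, where the Perron inequalities reduce to b ≥ 0 and 2a + b ≥ 0 (the coefficient box and helper names are our own choices, not the thesis C++ code):

```python
import math

PHI = (1 + math.sqrt(5)) / 2          # quantum 2 of conductor 5

def is_perron5(a, b):
    """Perron test for (a, b) = a + b*PHI: b >= 0, 2a + b >= 0, value > 0."""
    return b >= 0 and 2 * a + b >= 0 and a + b * PHI > 1e-9

def minimals5(cap, lo=-4, hi=8):
    """Minimal Perron numbers of conductor 5 with value at most cap, found by
    enumerating a coefficient box and testing pairwise decompositions."""
    perron = [(a, b) for a in range(lo, hi + 1) for b in range(lo, hi + 1)
              if is_perron5(a, b)]
    perron.sort(key=lambda v: v[0] + v[1] * PHI)
    minimals = []
    for (a, b) in perron:
        if a + b * PHI > cap + 1e-9:
            continue
        # p is non-minimal if p = s + t with both s and the difference Perron
        if not any(is_perron5(a - c, b - d) for (c, d) in perron):
            minimals.append((a, b))
    return minimals

print(minimals5(4.5))   # [(1, 0), (0, 1), (-1, 2)]
```

This recovers exactly the three minimal elements of conductor 5 listed in Section 6.1.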

11 Other Computations Related to Perron Numbers

In addition to computing Perron numbers and examining their properties, our research involved studying various other aspects of matrices and Galois theory. Of particular relevance to fusion rules were computing general formulas for the trace and norm of a cyclotomic integer of conductor n. Like the terms spectral radius and Perron number, these two terms originate from matrices and linear algebra. Here, however, they will be used in the context of cyclotomic integers. The trace tr(p) of a cyclotomic integer p is equal to the sum of p and its distinct Galois conjugates. In conductor 5, for instance, the trace of a + bφ is tr(a + bφ) = (a + bφ) + (a + b(1 − φ)) = 2a + b. Similarly, the norm n(p) is equal to p multiplied by each of its Galois conjugates. In this case, n(a + bφ) = (a + bφ)(a + b(1 − φ)) = a^2 + ab − b^2. Trace and norm are useful to us in that they provide a manner of expressing a cyclotomic integer as a single integer, as the algebraic terms such as φ inevitably cancel out in the final expression. In addition to Perron numbers, the trace and norm for conductor n have played important roles in much of our research into fusion rules and fusion rings (Davydov & Kanieski, n.d.-a). The issue with trace and norm is that they become increasingly difficult to compute by hand for larger conductors. In conductor 7, the trace of a + bβ + cβ^2, where β = 2 cos(π/7), is 3a + b + 5c. Its norm, on the other hand, is the gargantuan a^3 − b^3 + c^3 + a^2 b + 5a^2 c − 2ab^2 + 6ac^2 − b^2 c + 2bc^2 + abc. Here, not only do we have to multiply a + bβ + cβ^2 by all of its Galois conjugates, but we must simplify the result according to the minimal polynomial relation β^3 =

37 β2 + 2β − 1. This means repeatedly reducing higher powers of β down to exponents less than or equal to 2, resulting in even more multiplications after the initial product of the conjugates. For conductors 7 and 9, this process is very difficult to do by hand. By the time we reach conductor 11, where the π 5 4 3 2 minimal polynomial of 2 cos( 11 ) is x = x + 4x − 3x − 3x + 1 and there are five Galois conjugates total, it becomes almost humanly impossible. For this very reason, we have chosen to write our own libraries in C++ for adding, subtracting, multiplying, dividing, and simplifying polynomials involving Perron numbers. These programs have allowed us to learn more about the cyclotomic Perron numbers of each conductor and their properties than we would have otherwise been able to by hand. In the future, we might expand upon them to delve deeper into abstract algebra and Galois theory, and investigate other aspects of mathematics that would otherwise be hin- dered by difficult computations. In the coming sections, we will discuss some of the polynomial algorithms we have written, analyze their computational efficiency, and investigate ways in which they might be improved later on.
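As a numeric sanity check of the conductor-5 formulas above (our own illustration, not the thesis code; the helper names phi, trace5, and norm5 are ours), φ = (1 + √5)/2 has the single Galois conjugate 1 − φ, so trace and norm can be evaluated directly:

```cpp
// Numeric check of tr(a + b*phi) = 2a + b and n(a + b*phi) = a^2 + ab - b^2
// in conductor 5, using the conjugate pair phi and 1 - phi.
#include <cassert>
#include <cmath>

double phi() { return (1.0 + std::sqrt(5.0)) / 2.0; }

double trace5(long a, long b) {          // sum of the two conjugates
    return (a + b * phi()) + (a + b * (1.0 - phi()));
}

double norm5(long a, long b) {           // product of the two conjugates
    return (a + b * phi()) * (a + b * (1.0 - phi()));
}
```

For a = 3, b = 7 this gives trace 2·3 + 7 = 13 and norm 9 + 21 − 49 = −19, matching the closed forms.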

12 Perron Research Algorithms Involving Polynomials

12.1 Overview of Polynomials in Computer Science

To students of both mathematics and computer science, it becomes readily apparent that the principal difference between the two fields is their levels of abstraction. Mathematics is an inherently abstract science, consisting mostly of formulae written, used, and proven by humans. Such esoteric concepts as topology, partial differential equations, and (ironically) computability theory can be represented mathematically as symbolic statements on a page, which can be read and understood by people who know what they mean. For instance, we know what π is and what many of its properties are, even though we will never have its exact decimal representation.

Computers, on the other hand, focus mainly on performing rapid calculations involving concrete values whose approximate representations are stored in memory. It is difficult to get a computer to understand what an irrational or transcendental number is or what its properties are. In fact, just adding rational numbers whose denominators are not powers of 2 can prove to be a challenge for a computer: if we add 0.1 and 0.2 to one another in C++ in the form of 64-bit floating point decimals, we get 0.30000000000000004441 instead of 0.3. This is because the computer is not adding 0.1 and 0.2,

but rather the closest 64-bit floating point decimals it can find for each of them. If such basic tasks are hard to represent in computer science, we ask ourselves, how will we ever get computers to understand polynomials?

In actuality, polynomials have been at the root of programming ever since the dawn of computer science. In 1821, computing pioneer Charles Babbage began designing his first “Difference Engine,” a mechanical contraption consisting of interconnected gears and wheels with decimal digits on them. This Difference Engine was designed to calculate polynomial functions of variables entered by the user, which would then be printed out into a table using a printing apparatus. Unfortunately, the machine was far too complicated to build at the time, and it was not deemed efficient enough to justify its size and cost. Only one seventh of the machine was ever assembled, and government funding for Difference Engine No. 1 was cut in 1842 (Swade, n.d.).

However, the question of polynomials in computer science was revived in the mid-twentieth century, as electronic computers were slowly becoming more advanced and widespread. Of particular interest to computer scientists and mathematicians alike was the implementation of factoring algorithms for polynomials of one or more variables. However, when traditional factoring algorithms were implemented on early computers, they were found to be radically inefficient in terms of time complexity. By the early 1980s, computer scientists and mathematicians had managed to engineer new polynomial algorithms that were much more efficient, even if some still ran in exponential time. A good example would be Elwyn Berlekamp’s 1967 algorithm for factoring univariate polynomials of degree n with r factors whose coefficients are elements of the field Zₚ (the set of integers modulo a prime p), which runs in only O(n³ + prn²) time (Kaltofen, 1982).
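Returning briefly to the floating-point example at the start of this section, the behavior is easy to reproduce (sumAsPrinted is our hypothetical helper, not part of the thesis code):

```cpp
// The doubles nearest 0.1 and 0.2 do not sum to the double nearest 0.3;
// printing with 17 significant digits exposes the rounded result.
#include <cstdio>
#include <string>

std::string sumAsPrinted() {
    char buf[32];
    std::snprintf(buf, sizeof buf, "%.17g", 0.1 + 0.2);
    return std::string(buf);   // "0.30000000000000004"
}
```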
Computer science had embraced the abstract nature of mathematics, and the field of computer algebra was born. Although our primary focus is not factoring polynomials, but rather adding, multiplying, and simplifying them given certain constraints, our goals are the same. We want algorithms that are efficient in terms of both runtime and memory. As we study Perron numbers of increasingly high conductors, this will become an absolute necessity. The primary reason we are writing this program is to perform calculations in seconds that would take us hours or days to do by hand. With this in mind, we move on to the design of our polynomial program, and why we have chosen our current algorithms.

12.2 Polynomial Structures

The Polynomial data structure includes multiple nested C++ classes, each with their own arithmetic and lexicographic operations. It relies on several native C++ data types:

1. int — 32-bit signed integer on most platforms (where the long type is often the same size)

2. double — 64-bit floating point decimal

3. long long — 64-bit signed integer

4. long double — floating point decimal with higher accuracy than a double, whose size can vary based on the compiler

5. std::string — character string of arbitrary length; can be empty as in "" or contain a word as in "name"

6. std::vector — dynamic array of objects of a given type, which can be sorted and change in size as elements are added and removed

7. union — object that can hold a value of any one of several alternative data types, one at a time

8. enum — an enumerated data type; used in conjunction with a union to record which data type the union currently contains

The first object class here is the Variable, which consists solely of a std::string representing the variable’s name and an int representing the integral power to which it is raised. Variable objects can only be multiplied or divided by other Variable objects of the same name; the resulting Variable possesses the same name as its predecessors, but a different exponent. Variables can also be raised to the power of positive or negative integers, which are then multiplied by their current exponents. Because exponentiation comes first in the order of operations, Variables are the simplest object class in a Polynomial and form the basis of most Polynomial computations.

The second object class is the Constant, which is on the same level of the hierarchy as a Variable. A Constant contains a C++ union object, which is capable of being either a long long integer, two long longs representing the numerator and denominator of a fraction, or a long double representing a non-integer decimal number. It also contains an enum data type that denotes which type of data the union holds. Constant objects can be added,

subtracted, multiplied, divided, or raised to integral powers, and possess the same comparison operations as the real numbers R.

The next level of the hierarchy is the Term, which consists of a Constant coefficient followed by a sorted std::vector of Variables and an int representing the total sum of the exponents of the Term’s Variables. Like Variables, Terms can be raised to any integral power. Unlike Variables, any two Terms can be multiplied or divided regardless of their contents. They are considered equal if they contain the same Variables and exponents in the same order. If two Terms are equal they can be added or subtracted, yielding a new Term with the same Variables as its predecessors but a different coefficient. These properties give Terms a greater versatility than Variables, but they still require Variable and Constant operations to function.

The next level of the hierarchy is the Polynomial itself, which consists solely of a std::vector of Term objects. This dynamic array, however, is different from the one a Term contains in that it has a “bookend” element at the very end called TERM MAX. This object is a Term that cannot be parsed from user input and is designed such that it is “greater than” all other Terms. It serves as the rightmost element of the sorted array, is used for increased efficiency in Polynomial multiplication, and is never seen by the user. Polynomial objects can be compared for equality but otherwise possess no ordering system. They can be added, subtracted, multiplied, modded via a quotient ring, raised to positive integral powers, or subjected to Euclidean polynomial division.

The top level of the hierarchy contains two classes which are interdependent on one another for their functionality. The first of these,
The first of these, the Expression class, is similar to the Constant class in that it is designed to accommodate three possible data types: a single Constant, representing a constant value; a single Polynomial, representing a polynomial value; and a numerator and denominator Polynomial, together representing a fraction of two Polynomials. Unfortunately, because the Polynomial data type can change in size, these objects cannot be contained simultaneously in a union object like the one in the Constant class; instead, the Expression contains them all separately, possessing a single Constant and two Polynomials, and any of these that are not being used are set to 0 to minimize extra space in memory. Like the Constant class, the Expression contains an enum data type that stores whether or not it represents a constant value, a polynomial, or a fraction of two polynomials. The other class at the top of the hierarchy is the Mod class, which is used for modding and computing the greatest common divisor of two polynomials.

It consists of a Variable x followed by a std::vector of Expressions and another std::vector of ints representing the associated powers of x of which each Expression is the coefficient. For instance, if the Mod were supposed to represent the equation a⁴ = a³ − 3a + b + 4 = 1 · a³ + (−3) · a¹ + (b + 4) · a⁰, x would be a⁴, the vector of Expressions would be {b + 4, −3, 1}, and the vector of associated exponents would be {0, 1, 3}. Together with Expressions, Mods form the level of the code with which the user interacts. All user input in our program is recorded either in the form of Expressions or Mods.

We have chosen the above representation of polynomial expressions as C++ data types for several reasons. In order to provide support for abstract variables while still allowing efficient computation of constant elements, we have made Terms and Constants separate classes; Constant coefficients within Terms always come at the very beginning so that they can be added, subtracted, multiplied, or divided separately from Variables, which can have different names and need to be treated separately. Constants are capable of being integers, fractions, or decimals with the intention of maximizing the possibilities of what we can enter as input: a user might want to enter a polynomial with integer, fractional, decimal, or mixed Constant coefficients, whose hierarchy is defined in terms of decreasing accuracy of what their data represents.

The nesting of our above data types occurs according to the reverse of the “PEMDAS” order of operations: the strongest operation (exponentiation) is built into the Variable class, while the next-strongest operation (multiplication) is built into the Term class, with interconnected Variables.
The addition and subtraction operations, each occurring at the bottom of the PEMDAS hierarchy, are incorporated into the Polynomial class with interconnected Terms, and the invisible TERM MAX element exists at the end as a bookend to use during Polynomial multiplication. Because of the sheer complexity of polynomial division and the fact that a quotient of arbitrarily many polynomials can always be reduced down to a single fraction with a numerator and a denominator, the Expression class is our penultimate object, and we have the Mod class as our very last object so that we can perform substitutions involving Expressions. With this structure in mind, we will now examine how these classes operate and how we can use them to do computations related to cyclotomic Perron numbers.
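The hierarchy described in this section might be skeletonized as follows. This is our hypothetical reconstruction, not the thesis source; member names are illustrative:

```cpp
// Sketch of the nested class hierarchy: Variable and Constant at the bottom,
// Term above them, Polynomial at the top of the arithmetic core.
#include <string>
#include <vector>

struct Variable {                // a name raised to an integer power, e.g. x^3
    std::string name;
    int exponent;
};

struct Constant {                // integer, fraction, or decimal value
    enum class Kind { Integer, Fraction, Decimal } kind;
    struct Frac { long long num, den; };
    union {                      // anonymous union tagged by `kind`
        long long i;             // Kind::Integer
        Frac f;                  // Kind::Fraction
        long double d;           // Kind::Decimal
    };
};

struct Term {                    // coefficient times a sorted Variable product
    Constant coefficient;
    std::vector<Variable> variables;   // kept lexicographically sorted
    int totalDegree;                   // sum of the Variables' exponents
};

struct Polynomial {              // sorted Terms, ending in the TERM MAX bookend
    std::vector<Term> terms;
};
```

The union inside Constant mirrors the design choice discussed above: the three numeric representations share storage, with the enum recording which one is live.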

12.3 Polynomial Operations

12.3.1 Arithmetic within Polynomials

Simple Operations Each Polynomial sub-class has its own algorithms to carry out arithmetic operations. These operations, when carried out, should yield the answer in simplest form: for instance, a + b + a must simplify to 2a + b and (a + b)(a − b) must simplify to a² − b². In order to carry out this simplification, equivalent Variables and Terms must be merged together, and sums of zero and products of one must be removed to save space. To achieve this efficiently, all Variables within a Term must be sorted lexicographically, meaning that they must follow an easily predictable alphabetical order. The same must apply to all Terms within a Polynomial. The manners in which these objects are sorted will be discussed below under Ordering Methods, but for now all we need to know is that these elements must remain sorted even after mathematical operations are applied to them. This way, the Terms and Polynomials only have to be sorted once after they are read from user input, and any Term or Polynomial resulting from mathematical operations on existing objects will be sorted upon creation.

Variable and Constant operations are the most basic, as neither of them contains a dynamic array that requires sorting. A Variable can be raised to any integral power; the resulting Variable will have the same name as its predecessor, but its exponent will be multiplied by that power. Multiplying two Variables of the same name adds their exponents, while division subtracts them. Addition, subtraction, multiplication, and division can occur between any two Constants regardless of type. Operations between two fractions are carried out in the same way as they would be on paper, with addition and subtraction requiring a common denominator and multiplication and division multiplying the corresponding numerators and denominators.
Integer and floating point operations for integers and decimals are carried out in the normal manner by the processor. Division by zero is allowed, resulting in the values INFINITY or INDETERMINATE depending on whether or not the numerator is also zero. These two values can be computed and stored in memory; however, they cannot be subjected to further mathematical operations, and will throw exceptions if arithmetic is attempted with them. If the two Constants are of different types, then the algorithm copies both of them and casts the copy that is lower on the hierarchy to the same type as the higher one, with integers becoming fractions and fractions becoming

decimals.

Exponentiation of Constants is carried out efficiently using bitwise operations on a positive exponent, taking advantage of its binary representation to minimize the number of necessary multiplications or divisions. For instance, 22 = 2⁴ + 2² + 2¹ = 0b00010110 in 8-bit binary, meaning that a²² = a¹⁶ · a⁴ · a². Thus, instead of multiplying a by itself 22 times, we can take a and repeatedly square it to get a single multiplicative factor that we can multiply with the accumulator based on whether the nth bit from the right of the binary representation is a 0 or a 1. Negative exponents can be dealt with by taking the reciprocal of their corresponding positive exponents. As the number of bits in a number is roughly its binary logarithm and at most two multiplications occur per bit, this reduces the total number of multiplications necessary from O(n) to O(log n).

Addition and subtraction of Terms containing the same Variables with the same exponents occurs in a similar manner to the multiplication of like Variables, in that it is the Constant coefficients that are added and subtracted and the sorted dynamic array remains the same. Multiplication of Terms, however, requires more effort. One option would be to combine the two dynamic Variable arrays together, sort them, and combine adjacent Variables with the same name. However, for Terms with m and n Variables respectively, this would take O((m + n) log(m + n)) time, mainly from having to sort the m + n total elements. Our multiplication algorithm instead runs in only O(m + n) time, taking advantage of the fact that the two dynamic arrays are already sorted, and works like the merging step of the merge sort algorithm. In this case, the dynamic arrays are treated as queues, with only the first Variable elements of the two queues being considered.
During each insertion, the lesser of the two Variables alphabetically is inserted into the back of the product Term’s dynamic array, and its successor takes its place in its parent Term. If the Variable equals the last Variable in the array, then the two are multiplied together; if the last Variable has a zero exponent, then it is popped off the back before the next Variable is inserted. Once one Term runs out of Variables, the other Term inserts all of its remaining Variables in order, only checking the first remaining one to see if it can be multiplied with the back of the product array. This ensures that the resulting product Term is completely sorted. Division occurs in the same manner, but involves dividing the product by each Variable if it comes from the divisor. Addition and subtraction of Terms within a Polynomial follow the exact same procedure as Term multiplication and division, just with a different operation and data type. Additionally, we would ignore the TERM MAXs in each addend when merging, and a TERM MAX

would have to be appended to the back of the final Polynomial sum. The pseudocode for multiplying Terms a and b would be the following:

i := 0
j := 0
a_size := the number of Variables in a
b_size := the number of Variables in b
min := some Variable
product := an empty Term whose only Variable is 1
(Begin merging process)
while i < a_size and j < b_size:
    (Find minimum element)
    if a[i] < b[j]:
        min := a[i]
        increment i
    else:
        min := b[j]
        increment j
    (Merge Variables together if they have the same name)
    if min has the same name as the last Variable v of product:
        v := v * min
    else:
        (Pop Variables with an exponent of 0)
        if the last Variable v of product equals 1:
            pop v off the back of product
        push min to the back of product
(If there are any Variables remaining in a, push them all into the product)
if i < a_size:
    if a[i] has the same name as the last Variable v of product:
        v := v * a[i]
    else:
        if the last Variable v of product equals 1:
            pop v off the back of product
        push a[i] to the back of product
    push Variables a[i + 1] to a[a_size - 1] onto the back of product
(If there are any Variables remaining in b, push them all into the product)
else if j < b_size:
    if b[j] has the same name as the last Variable v of product:
        v := v * b[j]
    else:
        if the last Variable v of product equals 1:
            pop v off the back of product
        push b[j] to the back of product
    push Variables b[j + 1] to b[b_size - 1] onto the back of product
if the last Variable v of product equals 1:
    pop v off the back of product
return product

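As a runnable counterpart to the pseudocode above, here is a simplified sketch of the O(m + n) merge (our own illustration, not the thesis code), with a Variable reduced to a (name, exponent) pair and the Constant coefficient omitted:

```cpp
// Merge two sorted Variable lists into their product in O(m + n) time:
// like-named Variables have their exponents added, and factors that cancel
// to exponent 0 are popped, as in the Term-multiplication pseudocode.
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using Variable = std::pair<std::string, int>;   // (name, exponent)

std::vector<Variable> multiplyTerms(const std::vector<Variable>& a,
                                    const std::vector<Variable>& b) {
    std::vector<Variable> product;
    auto push = [&](const Variable& v) {
        if (!product.empty() && product.back().first == v.first)
            product.back().second += v.second;   // same name: add exponents
        else {
            if (!product.empty() && product.back().second == 0)
                product.pop_back();              // drop x^0 factors
            product.push_back(v);
        }
    };
    size_t i = 0, j = 0;
    while (i < a.size() && j < b.size())         // standard sorted merge
        push(a[i].first < b[j].first ? a[i++] : b[j++]);
    while (i < a.size()) push(a[i++]);
    while (j < b.size()) push(b[j++]);
    if (!product.empty() && product.back().second == 0)
        product.pop_back();
    return product;
}
```

For example, a²b⁻¹ times ab merges to a³, with the b factors cancelling.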
Figure 15: Diagram for Term multiplication algorithm showing how the Terms are alphabetically merged together.

(b³ + d³ + bc + a + ab⁻¹ + 1 − a⁻¹b)

+

(c² − a + 2 + b⁻¹c + a⁻¹c)

=

b³ + d³ + c² + bc + ab⁻¹ + 3 + b⁻¹c − a⁻¹b + a⁻¹c

Figure 16: Diagram demonstrating Polynomial addition. Note the similarities to Term multiplication, with sorted Terms being merged together instead of Variables.

Multiplication Polynomial multiplication, like addition, utilizes merging and takes advantage of each dynamic array of Terms being already sorted. However, instead of just two queues, for Polynomials p = a + b + c + d and q = e + f + g + h we have one queue for each Term contained in p, namely a, b, c, and d. Each queue has the same size as the number of Terms in q, and all the queues combined will serve as a sort of multiplication table for us to use with the distributive property. To save memory, however, only the first element of each queue is actually stored in memory, along with its index in the queue. Taking advantage of the property of our lexicographic ordering that if w < x and y < z, then wy < xz, we have the following table showing the general order that the product Terms must follow in the sorted Polynomial product:

×          e           f           g           h           TERM MAX
a          ae       <  af       <  ag       <  ah       <  TERM MAX
           <           <           <           <
b          be       <  bf       <  bg       <  bh       <  TERM MAX
           <           <           <           <
c          ce       <  cf       <  cg       <  ch       <  TERM MAX
           <           <           <           <
d          de       <  df       <  dg       <  dh       <  TERM MAX
           <           <           <           <
TERM MAX   TERM MAX    TERM MAX    TERM MAX    TERM MAX    TERM MAX

Figure 17: Multiplication table showing the chain of < signs for sorted terms in the final product. Note the presence of the TERM MAXs at the boundaries of the table.

Here, each row is one of our queues, and only the first element of the queue is actually stored in memory. The actual multiplication table would just be the array {ae, be, ce, de, TERM MAX}; if we were to pop ae off the top of it and merge it into the product, our new table would be {af, be, ce, de, TERM MAX}, with af being computed on the fly by the algorithm. We notice, however, that the ordering in our table is only a partial order, meaning that there exist elements x, y, and z such that z < x and z < y, but we have no definitive way of telling whether x < y or y < x. For instance, we know that af and be are both greater than and should come after ae, but we would have to check in order to confirm or disprove that af < be. However, before we merge ce or any other element in the c queue into our product, we will have to merge be, and no matter what happens,

the very first element we will have to merge will be ae, as it will always be less than all other Terms in the multiplication table. Therefore, when we merge a Term in the first column of a row in this table, we “unlock” the row beneath it; otherwise, that row and all the rows beneath it stay dormant, and do not have to be considered yet. Our multiplication algorithm therefore keeps track of the “bottom” of the table, which is the last row/queue that has been unlocked. If a Term is selected from the bottom row, the new bottom becomes the row under it. Once the bottom hits TERM MAX, it will not move any further, as TERM MAX will never be selected for merging into the product by the algorithm. The same goes for if any of the queues hit TERM MAX, in which case the algorithm will be prevented from going off the right end of the table. The pseudocode for multiplying Polynomials a and b would thus be:

rows := the number of Terms in a
columns := the number of Terms in b
(Means the only Term is TERM MAX; one of the Polynomials is 0)
if rows == 1 or columns == 1:
    return 0
product := an empty Polynomial whose only Term is 0
(Total number of Terms that will have to be merged together)
table_size := (rows - 1) * (columns - 1)
(Only the first column of the multiplication table is stored in memory)
(Once a Term in that column is merged, the next Term in its row takes its place)
mult_table := an array with "rows" Terms total
(Indices of the elements of each row in the multiplication column)
indices := an array of 0s whose length is "rows"
(Compute first column of the multiplication table)
for 0 <= i < rows:
    mult_table[i] := a[i] * b[0]
mult_table[rows - 1] := TERM MAX
(Variable holding the row number you are on)
(Forms the "bottom" of the table)
(If you hit the bottom row, you "unlock" the next row under it)
bottom := 0
(Index of the minimum element to be merged)
min_index := 0
min := some Term
for 0 <= i < table_size:

    (Compute the minimum Term in mult_table and its index)
    min := mult_table[0]
    min_index := 0
    for 1 <= j <= bottom:
        if mult_table[j] < min:
            min := mult_table[j]
            min_index := j
    (If you reach the bottom, unlock the next row)
    if min_index == bottom:
        increment bottom
    (Replace the minimum Term with whatever comes next in its row)
    increment indices[min_index]
    (Merge Terms if possible)
    if min shares the same Variables as the last Term t of product:
        t := t + min
    else:
        (Remove any Terms equaling 0 before pushing min)
        if the last Term t of product equals 0:
            pop t off the back of product
        push min to the back of product
    (Compute min's replacement)
    mult_table[min_index] := a[min_index] * b[indices[min_index]]
if the last Term t of product equals 0:
    pop t off the back of the product
push TERM MAX to the back of product
return product

If p has m Terms and q has n Terms, then the number of comparisons this algorithm has to make is just O(mn), as there will be m or fewer comparisons for each of n rows. This is much more efficient than if we were to multiply all the Terms, re-sort them, and then merge them together, which would require O(mn log(mn)) comparisons due to having to sort mn elements. In addition, by only storing the first column of the multiplication table, we have ensured that the space complexity (the extra memory required by the algorithm) is only O(n), rather than O(mn); because of this, multiplying two Polynomials with a thousand Terms each would only require a thousand extra Terms in memory, rather than a million. We have therefore arrived at a faster and more practical method for polynomial multiplication.

Division Euclidean polynomial division is probably the most bizarre and elaborate operation implemented for the Polynomial class. For polynomials in only one variable x, division is very similar to long division of decimal numbers, with coefficients of powers of x replacing decimal digits. However, when the polynomial contains more than one variable, the process becomes a lot murkier. We could feasibly think of 2, 10, and −5 as the digits of 2x² + 10x − 5; what would be the “digits” of p⁹ − 10abc + 71t² − 99?

To answer this question, we must make use of a concept known as Gröbner bases (Sturmfels, 1996). Suppose we have a numerator n and a denominator d for our polynomial division, and we want to compute the quotient and remainder. Our Gröbner basis will be the set of all possible polynomial multiples of d. If we were to add or subtract any elements in this basis, we would get another polynomial multiple of d; likewise, if we multiplied any element in our Gröbner basis by another polynomial, we would get another element that was already in our set. In terms of abstract algebra, we would refer to this as an ideal of our ring of polynomials (Eder & Hofmann, 2021).

The multivariate division process occurs in the form of a series of reductions. Here, we will define some terminology for a polynomial p: lc(p) will be the leading coefficient of p, lt(p) will be the leading term of p, and lm(p) will be the leading monomial of p. For instance, if p = 2ab + c − 7, we have lc(p) = 2, lt(p) = 2ab, and lm(p) = ab. We note that lm(p) = lt(p)/lc(p). A polynomial f is said to be reduced by g if there are no terms in f that are divisible by the leading monomial lm(g). This reduced result will be the remainder from the multivariate division, and the reductions that we take to get this remainder will produce our quotient (Sturmfels, 1996).
Before we begin our reduction, we must have the terms in both our numerator and denominator sorted according to some consistent lexicographic ordering. We will be using the one we’ve already made for our Polynomial class for the sake of convenience. In order for the algorithm to work, we must make sure that there are no negative exponents in the numerator or denominator. These can be eliminated by multiplying the numerator and denominator by enough positive powers of the offending variables to cancel them out. For instance, if we are dividing ay⁻¹ + ax⁻¹ by y⁻¹ + x⁻¹, we can multiply both by xy to get the improved quotient (ax + ay)/(x + y). From here, we are safe to proceed with the rest of the algorithm. If f is the polynomial in the numerator and g is the polynomial in the denominator, then we must find the first term m in f that is divisible by

lm(g). If t = m/lt(g), then the reduction is red(f, g) = f − tg. At this point, t is added to the quotient, and the next reduction is performed on the polynomial resulting from red(f, g). Once the polynomial can no longer be reduced by g, the division algorithm is over, and whatever is left over is the remainder.

For example, we can try dividing a² − b² by a − b. We see that the first term in a² − b² divisible by a is a², and that a²/a = a. Thus, our first reduction is a² − b² − a(a − b) = a² − b² − a² + ab = ab − b², and our quotient is currently a. We find that the next term divisible by a is ab, and that ab/a = b. Therefore, our second reduction is ab − b² − b(a − b) = ab − b² − ab + b² = 0, and our quotient becomes a + b. At this point, our remainder is 0, so there is nothing left to reduce. This means that our final quotient for (a² − b²)/(a − b) is a + b. We have achieved multivariate polynomial division.

Expressions Unfortunately, Gr¨obnerbasis division is often inconsistent when the numerator and denominator do not evenly divide one another. For instance, if we were to divide a + b by b + c using this method, we would get a quotient of 1 and a remainder of a − c. However, if we renamed b to d and altered the lexicographic order, we would get that a + d divided by c + d had a quotient of 0 and a remainder of a + d, as no reductions would be possible. For this reason, we use Expressions to represent quotients of Polynomials a+b that do not evenly divide each other, such as b+c . However, if we have two Polynomials that have a factor in common, such as a in a2 + ab = a(a + b) and ac + ad = a(c + d), we want this factor to be divided out of the nu- merator and denominator to simplify the fraction. This will require us to compute the polynomial greatest common denominator, and to use Gr¨obnerbasis division to factor it out of the numerator and denominator. For polynomials a(x) and b(x) in one variable x, Euclid’s GCD algorithm involves finding a quotient q1(x) and remainder r1(x) such that a(x) = b(x)q1(x) + r1(x), and then finding another quotient/remainder pair q2(x), r2(x) such that q1(x) = r1(x)q2(x) + r2(x). The process is repeated until the remainder polynomial is 0, in which case the last quotient is the greatest common divisor. This same process can be performed with polynomials of more than one variable. We can do this by reducing the polynomial to a single variable by treating all the other variables as coefficients of powers of that variable. This first variable can be chosen as the first variable appearing in the polynomial, or it can be selected at random. For example, ax2 +bx2 −ax+cx+d+e−f =

52 (a + b)x2 + (−a + c)x1 + (d + e − f)x0. This is the perfect job for our aforementioned Mod class. Here, if Polynomial p has n Terms, the n Terms can be sorted with respect to ascending powers of x in O(n log n) time to form the Mod. Afterward, we can perform polynomial long division in one variable to get the quotient and remainder, and rinse and repeat until we are done. Any fractions of polynomials that end up as coefficients of powers of x can be recursively simplified using this same process. For example, suppose we wanted to calculate the GCD of ac + ad + bc + bd = (a + b)(c + d) and a2 − b2 = (a + b)(a − b), which we can see from their factorizations is just a + b. Converting these two into polynomials of a, we would have (c + d)a + (bc + bd) and a2 + (−b2). Because a2 − b2 has a higher degree in a than (c + d)a + (bc + bd), we will use the former as the dividend and the latter as the divisor. For the sake of convenience, we also want our divisor to be monic in terms of a so that the leading terms can divide each bc+bd other: Dividing out c+d, our new divisor would be a+ c+d . We would then bc+bd recursively apply our algorithm to get c+d = b; if we expressed bc + bd and c + d both in terms of c and divided them, we would get 0 as a remainder, indicating c + d was the GCD and could be divided out via Gr¨obnerbasis b 2 2 division to get the fraction 1 . Dividing a − b by a + b, we would get a quotient of a − b and a remainder of 0, indicating that a + b is our greatest common divisor as expected. Now that we have a method of simplifying fractions within Expressions, we can perform the same operations on them that we can with Polynomials. Operations within Expressions are similar to those in Constants in that the three data types follow a hierarchy. If a Constant is paired with a Polynomial in an operation between Expressions, the Constant will be converted into a Polynomial by wrapping it an empty Term within a Polynomial object. 
Likewise, if a Polynomial is paired with a fraction of two Polynomials, it is itself converted into a fraction by giving it a denominator of 1. With Expression fractions, the same rules as regular fractions apply to the Polynomials in the numerator and denominator: a/b ± c/d = (ad ± bc)/(bd), (a/b) · (c/d) = (ac)/(bd), (a/b) / (c/d) = (ad)/(bc), and (a/b)^n = a^n/b^n. The difference is, with our polynomial GCD algorithm in place, we now have methods for simplifying the results: a/(a + b) + b/(a + b) will now equal 1, instead of remaining (a + b)/(a + b) forever. Gröbner basis division is still available in the Expression class. However, it can only occur between two Expressions that are both single Polynomials. In this case, the // operator (implemented in the C++ code using the | operator) will return the quotient of the two Polynomials, and is akin to integer division. The % operator will return the remainder.

Substitution

Up until now, many of our program's polynomial features have already been implemented in various other platforms and languages. The TI-Nspire calculator, for instance, already has polynomial addition, subtraction, and multiplication, as well as multivariate Euclidean division (Polynomial Toolkit for TI-Nspire, n.d.). However, these operations by themselves would not be enough to attempt the large experiments we have conducted with Perron numbers. In order to efficiently compute trace and norm, we will need our polynomial program to support another feature: modding. Modding out a polynomial p with respect to a variable x involves simplifying p by substituting individual x variables of a certain exponent or higher with equivalent expressions that have a lower degree in that variable. For example, suppose we have the equality x^2 = x + 1. Then, within our polynomials, the simplification ax^3 + bx^2 + cx + d = ax(x + 1) + b(x + 1) + cx + d = ax^2 + ax + bx + b + cx + d = a(x + 1) + (a + b + c)x + (b + d) = (2a + b + c)x + (a + b + d) can be made. In our program, we can have multiple modding constraints for substitution at the same time; in this case, the program applies them to the Polynomial objects in the order in which they are entered. Modding involves both Polynomial addition and multiplication. However, it has more steps and is generally more complicated and time-consuming. Modding requires three operands: a Polynomial p, a Variable power v^k to replace within p, and a Polynomial m to substitute for v^k. The first step of the algorithm sweeps through p to find the highest exponent n of v in p; if n < k, then the algorithm simply returns p, as it is already in its simplest form. If not, then the algorithm proceeds to the second step, which involves creating an array that pairs all exponents from k to n of v with their associated irreducible substitution Polynomials.
For instance, if p = x^5 + ax − 9 and we knew that x^2 = x + 1, our table would contain the pairings:

x^2 = x + 1
x^3 = 2x + 1
x^4 = 3x + 2
x^5 = 5x + 3

In this case, the nth pairing would be calculated by multiplying the (n − 1)th pairing by v and checking to see if the first Term of the result was a multiple of v^k. If it was, then the algorithm would replace the first Term with the simplified Polynomial version of itself and add it back in: For

instance, x^6 = 5x^2 + 3x would become 5(x + 1) + 3x = 8x + 5. These pairings are then all stored in memory so that the algorithm can look them up later instead of having to recalculate them each time they are found in the main Polynomial. The next step involves sweeping through the original Polynomial and replacing each Term containing the Variable to be modded out with its corresponding substitution Polynomial. These substitution Polynomials are all added together, and the original Terms in the parent Polynomial are deleted. The remaining Terms of the parent Polynomial are then added to this sum, yielding the final answer. This algorithm can be made more efficient by expressing both the Polynomial to be simplified and the modding Polynomial in terms of Mods with x as the base variable. This ensures that the Terms in each Polynomial are already sorted based on powers of x, meaning that we do not have to go back through the Polynomial to search for them. Of course, there still exist many languages that support modding polynomials. Wolfram, for instance, possesses a function called PolynomialMod that allows a user to reduce a polynomial p modulo a polynomial q (Polynomial Algebra-Wolfram Language Documentation, n.d.). We also note that modding can be achieved to some extent by taking the remainder of a polynomial Euclidean division: For instance, if we wanted to simplify x^3 subject to x^2 = x + 1, we could just divide x^3 by x^2 − x − 1 and take the remainder 2x + 1 as our answer. Therefore, any language that supports Euclidean division with Gröbner bases should also support modding. However, there are a number of advantages that our program's method of substitution affords us. Our program allows us to choose which specific variable gets modded out, as well as its associated exponent. For instance, modding out a polynomial with respect to a + b + c would give us different results based on the Gröbner basis we used and its associated lexicographic ordering.
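The reduction step can be modeled compactly. Rather than storing the pairing table explicitly, the sketch below rewrites the leading term using the previous power at each pass, which computes the same pairings (x^3 = 2x + 1, x^4 = 3x + 2, ...) implicitly. The function name and the dense coefficient representation are illustrative, not the thesis's actual classes:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reduce p subject to the relation x^k = m(x), where deg(m) < k.
// Coefficients are stored lowest degree first: {c0, c1, ...}.
// Each pass rewrites the current leading term c*x^(k+s) as c*x^s*m(x),
// mirroring the "multiply the previous pairing by v" step in the text.
std::vector<long long> modReduce(std::vector<long long> p,
                                 std::size_t k,
                                 const std::vector<long long>& m) {
    while (p.size() > k) {                 // degree of p is still >= k
        long long c = p.back();
        p.pop_back();
        std::size_t shift = p.size() - k;  // x^(k+shift) = x^shift * m(x)
        for (std::size_t i = 0; i < m.size(); ++i)
            p[shift + i] += c * m[i];
    }
    return p;
}
```

With the constraint x^2 = x + 1, reducing x^5 yields 5x + 3, in agreement with the table of pairings, and reducing x^5 + 4x − 9 yields 9x − 6.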
Here, we could specify the constraint as a = −b − c, or b = −a − c, or c = −a − b, and not have to worry about ambiguity. By storing these constraints as Mod objects, we can also have several of these at a time and chain them together: For instance, a^2 = 5 could be paired with x = 1/2 + a/2 to give a representation of the Golden Ratio. This allows us to store algebraic numbers like (1 + √5)/2 as Variable objects, which can then be treated as exact values whose properties are determined by their associated Mods. In this manner, we can compute with Perron numbers directly by storing their minimal polynomials in memory as Mods, and adding and multiplying the corresponding Variables accordingly.

Additionally, embedding our algorithms in the C++ language using custom-made functions and classes gives us more control over how the computations are carried out. This allows us to feed our programs massive amounts of data in the form of large polynomials and multiple modding constraints, and also allows us to extend C++ with Polynomial objects. Borrowing from the features C++ already supports, we now have access to conditionals, loops, and recursive functions involving Polynomial expressions. We have used such C++ code to generate a massive list of all the minimal polynomials of quantum 2 for prime conductors between 2 and 200, many of which we ended up using in our research (Davydov & Kanieski, n.d.-a). By writing our own polynomial tool, we were able to perform experiments and gather data that would have been difficult or impossible using other polynomial programs.

Exponentiation

In comparison with multiplication and division, Polynomial exponentiation to positive integer powers is surprisingly easy. The Multinomial Expansion Theorem, an extension of the Binomial Expansion Theorem, gives us a reliable formula for computing the Terms in the final Polynomial without actually having to perform all of the tedious multiplications. The formula is as follows:

$$\left(\sum_{k=1}^{m} a_k\right)^{n} = \sum_{i_1 + i_2 + \cdots + i_m = n} \binom{n}{i_1, i_2, \cdots, i_m} \prod_{j=1}^{m} a_j^{i_j}$$

Here, $\binom{n}{i_1, i_2, \cdots, i_m} = \frac{n!}{i_1! \, i_2! \cdots i_m!}$ is referred to as the multinomial coefficient (of which the binomial coefficient $\binom{n}{k} = \frac{n!}{k!(n-k)!}$ is the special case for two terms). The second sum iterates over all possible ways to write the exponent n as a sum of m non-negative integers, where m is the number of terms in the original polynomial. Simple multiplication and division can be used to compute the multinomial coefficient, and the generation of the exponent sequences can be done recursively: For each 0 ≤ i ≤ n, add i to the sequence in the denominator and run the algorithm again on n − i until all m slots have been filled. Applying this exponentiation formula to our program, however, we find that there is no general definitive way to order the resulting Terms or multinomial coefficients, meaning that the resulting Polynomial will not always be in order. Therefore, the Polynomial must be sorted immediately after computation. However, compared to the exponential time saved by removing all those distributive multiplications, the additional O(n log n) time from the sorting is barely noticeable.
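The coefficient computation and the recursive generation of exponent sequences can be sketched as follows. The helper names are our own; computing the multinomial coefficient as a product of binomial coefficients keeps every intermediate division exact:

```cpp
#include <cassert>
#include <functional>
#include <vector>

// n! / (i1! i2! ... im!), computed as a product of binomial coefficients
// so that every intermediate division is exact in integer arithmetic.
long long multinomial(int n, const std::vector<int>& parts) {
    long long result = 1;
    int remaining = n;
    for (int p : parts) {
        for (int j = 1; j <= p; ++j)              // multiply by C(remaining, p)
            result = result * (remaining - p + j) / j;
        remaining -= p;
    }
    return result;
}

// Visit every way of writing n as an ordered sum of m non-negative integers,
// i.e. every index tuple (i1, ..., im) in the inner sum of the theorem.
void compositions(int n, int m, std::vector<int>& parts,
                  const std::function<void(const std::vector<int>&)>& visit) {
    if (m == 1) {                 // the last slot takes whatever is left
        parts.push_back(n);
        visit(parts);
        parts.pop_back();
        return;
    }
    for (int i = 0; i <= n; ++i) {
        parts.push_back(i);
        compositions(n - i, m - 1, parts, visit);
        parts.pop_back();
    }
}
```

As a sanity check, expanding (a + b + c)^2 produces six exponent tuples whose multinomial coefficients sum to 3^2 = 9.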

12.3.2 Arithmetic within Expressions and Mods

Ordering Methods

Variables possess two different ordering methods, which are used in different contexts:

1. If two Variables are part of the same Term, then they are sorted alphabetically regardless of exponent.

2. If two Variables are part of different Terms, then how they are sorted depends on their exponents. If both exponents are positive, then the Variables are sorted in alphabetical order; if they are both negative, the Variables are sorted in reverse alphabetical order. If they have different signs, the Variable with the positive exponent always comes first.

Two Variable objects are said to be equivalent if they share the same name. Variable equivalence is used to determine whether two Variables are compatible to be multiplied or divided. If they share the same name and the same exponent, they are said to "match" each other, which is how they are compared between two Terms to determine equality. The reason for having two separate Variable orderings is to preserve both efficient Term addition and efficient Variable multiplication. Because the Variables are algebraic and can be substituted for real numbers, they must follow a total ordering that obeys the arithmetic principles of multiplication. This means that:

1. If a ≤ b and b ≤ a, then a = b. Furthermore, for any a and b, either a ≤ b or b ≤ a must be true.

2. If a < b and b < c, then a < c.

3. If a < b, then for all c, ac < bc.

4. If a < b and c < d, then ac < bd.

5. If a = b and c > 1, then a < bc. If c < 1, then ac < b.

6. If a < b, then b^-1 < a^-1.

These properties are essential to the efficiency of the algorithms used for addition and multiplication of Terms and Polynomials. In order to avoid having to re-sort the Variables in each Term and the Terms in each Polynomial every time they are subjected to an operation, the dynamic arrays must be ordered prior to computation and result in ordered dynamic arrays after computation. In essence, these ordering principles must be preserved under multiplication in order for the algorithms not to have to repeatedly sort them. The second Variable ordering method can be shown to possess this property: For instance, the reverse alphabetical order rule holds when you apply Property 6 to a < b, since the letter "a" comes before the letter "b" in the alphabet. Furthermore, if a < b, we should be able to multiply both sides by either a^-1 or b^-1 in accordance with Property 3 to get that ab^-1 < 1 and, more interestingly, that 1 < a^-1 b. In ordinary algebra, if we multiply the sorted polynomial a + b by the sorted polynomial b^-1 + a^-1 using the Distributive Property and add the terms in lexicographic order, then we get that the resulting polynomial ab^-1 + 1 + 1 + a^-1 b = ab^-1 + 2 + a^-1 b also follows this ordering principle. However, this Variable ordering principle is ill-equipped for handling Variables within the same Term. Suppose we want to multiply abc by cb^-1 a^-1. If we simply try to merge the Variables together in this sorted order, the result we get is abc^2 b^-1 a^-1, which is not fully simplified. In order to reduce this expression down to c^2 by having adjacent equal Variables cancel one another out, we would have to sort them after multiplication. This would mean that, in general, we would have to periodically sort the Variables in every Term. This would significantly increase the computational complexity of the program, making it run noticeably slower. Thus, standard alphabetical order is used for Variables within a single Term, but not for separate Terms.
Terms have only one ordering method, wherein they are first compared by the sums of the exponents of their Variables. If one Term has a greater exponent sum than another, then it is considered "less" than its partner and is placed towards the front of the Polynomial. If two Terms have the same exponent sum, then their individual Variables are compared using the second ordering method described above. If, while iterating through each Term, one of their dynamic arrays runs out of Variables (such as the empty Term representing 1), then the sign of the exponent in the next Variable of the other Term determines the ordering: If it is positive, the nonempty Term must come before the empty one. If it is negative, the nonempty one must come after instead. This is why the inequality ab^-1 < 1 < a^-1 b previously mentioned holds. In a manner similar to Polynomials, Terms can be thought of as having a bookend Variable at the back of their dynamic arrays representing the number 1. Using the first Variable ordering method, 1 is greater than all other Variables that can possibly be ordered alphabetically; however, using the second ordering method, 1 is still greater than all Variables with positive exponents but becomes less than all Variables with negative exponents.
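The two comparison methods can be sketched as simple predicates. The struct and function names here are our own simplifications; the thesis's implementation operates on its full Variable class:

```cpp
#include <cassert>
#include <string>

struct Variable {
    std::string name;
    int exponent;   // non-zero
};

// Ordering between different Terms (the second method): Variables with
// positive exponents come first, in alphabetical order; Variables with
// negative exponents come last, in reverse alphabetical order.
bool beforeAcrossTerms(const Variable& a, const Variable& b) {
    bool aPos = a.exponent > 0, bPos = b.exponent > 0;
    if (aPos != bPos) return aPos;       // positive exponent always first
    if (aPos) return a.name < b.name;    // both positive: alphabetical
    return a.name > b.name;              // both negative: reverse alphabetical
}

// Ordering within a single Term (the first method): plain alphabetical order
// regardless of exponent, so equal names land adjacent and can cancel.
bool beforeWithinTerm(const Variable& a, const Variable& b) {
    return a.name < b.name;
}
```

Under beforeAcrossTerms, for example, the Variables of the product discussed above sort as c^2, then b^-1, then a^-1, exactly the order appearing in abc^2 b^-1 a^-1.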

12.4 Lexing and Parsing

One of the most important components of our programs involving polynomials and matrices is how we interpret user input. Ideally, we want the user to be able to enter any arbitrary polynomial expression, such as x + x - 3 or (a + b)^3 / b^2 or k * (k + 1) * (k + 2), and have the program be able to understand and evaluate it. Similarly, we want to have some way for the user to enter constraints such as x^2 = x + 1 or i^2 = -1 for modding. In order to achieve this, we must invent our own formal language containing all possible polynomial expressions that can be evaluated and excluding all invalid input strings. This language must follow a set of rules known as a formal grammar, which will show us how to form valid strings according to the syntax of our polynomials. First, we must acknowledge the functionality of our program. It should take a std::string as user input, transform it into a series of Expressions in memory connected by various operators and other symbols, perform the specified computations, and output the result back to the user as another std::string. However, before the std::string can be parsed into an Expression, it would be helpful for us to cut the character string up into relevant components such as numbers, variable names, parentheses, operators, equal signs, and line breaks. These components are known in the world of computer science as tokens and are akin to words in a sentence in that they are indivisible units of meaning. By transforming user input from a string of letters and numbers into a sequence of tokens, we make it easier for the program to interpret it later on. This transformation is called lexical analysis, or lexing, and an algorithm that carries out this process is known as a lexer. The lexer for our polynomial language will be relatively simple.
Contiguous substrings of letters not separated by spaces, such as a, pi, or reallylongunbrokenname, will be lexed in as variable names, which we will represent in our grammar as <name>s. Unbroken sequences of digits will be lexed as integers, or <int>s; if said sequences contain a single decimal point, they will be lexed as decimals, or <dec>s, instead. Each operator +, -, *, /, //, %, = and the parentheses (, ) will have their own tokens. Line breaks will be used to separate individual lines, which may contain either modding constraints or output statements, as we will discuss later. After the input is lexed, it is then parsed into polynomials. Parsing will involve analyzing the string of tokens produced by the lexer according to our language and converting it into abstract syntax in the form of Expression objects in memory. In order to achieve this, our language will make use of a context-free grammar. In such a grammar, there are two types of symbols: terminal symbols formed by single tokens, such as a, 1.2, +, or variablename, that cannot be further broken down by the parser; and nonterminal symbols that can be further reduced into sequences of terminal symbols, such as a + b, 20 - 2^4, or (a - b) * (c - d). In a context-free grammar, each rule shows how to parse a single nonterminal symbol down into a sequence of terminal and nonterminal symbols, which can themselves be recursively parsed according to other rules in the grammar (Appel, 2004). In terms of notation, we will say that A ::== B (read "A reduces to B") if the parser can "unfold" and simplify the nonterminal symbol A into B, where B might be a single symbol or a sequence of symbols. If A can be reduced to multiple distinct expressions, such as B, C, D, or E, we would write this rule as A ::== B | C | D | E. For example, when parsing a Variable, which consists of a variable name followed by a ^ and an integer exponent, we might have the rule <var> ::== <name> ^ <exp>, where <name> is a terminal symbol representing a variable name token and <exp> is a nonterminal symbol which might be further unfolded into a positive or negative integer. We note, however, that the above rule will not allow us to parse Variables that do not have an exponent attached to them, such as x or pi.
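A stripped-down version of such a lexer might look like this (the token kinds and names are our own simplifications, not the thesis's actual classes):

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

enum class TokKind { Name, Int, Dec, Op, LParen, RParen, Equals };

struct Token {
    TokKind kind;
    std::string text;
};

// Group letters into names and digit runs (with at most one '.') into
// ints/decimals; everything else becomes a one- or two-character token.
std::vector<Token> lex(const std::string& input) {
    std::vector<Token> out;
    std::size_t i = 0;
    while (i < input.size()) {
        unsigned char c = input[i];
        if (std::isspace(c)) { ++i; continue; }
        if (std::isalpha(c)) {
            std::size_t j = i;
            while (j < input.size() && std::isalpha((unsigned char)input[j])) ++j;
            out.push_back({TokKind::Name, input.substr(i, j - i)});
            i = j;
        } else if (std::isdigit(c)) {
            std::size_t j = i;
            bool dot = false;
            while (j < input.size()) {
                if (std::isdigit((unsigned char)input[j])) ++j;
                else if (input[j] == '.' && !dot) { dot = true; ++j; }
                else break;
            }
            out.push_back({dot ? TokKind::Dec : TokKind::Int, input.substr(i, j - i)});
            i = j;
        } else if (c == '(') { out.push_back({TokKind::LParen, "("}); ++i; }
        else if (c == ')') { out.push_back({TokKind::RParen, ")"}); ++i; }
        else if (c == '=') { out.push_back({TokKind::Equals, "="}); ++i; }
        else if (c == '/' && i + 1 < input.size() && input[i + 1] == '/') {
            out.push_back({TokKind::Op, "//"});
            i += 2;
        } else { out.push_back({TokKind::Op, std::string(1, (char)c)}); ++i; }
    }
    return out;
}
```

For instance, lexing the constraint x^2 = x + 1 produces the seven tokens x, ^, 2, =, x, +, 1.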
Therefore, we want to make the ^ <exp> symbols at the end of the rule optional; in other words, we want ^ <exp> to also be reducible to nothing, or in this case the empty string "". In this case, we can split our single rule into two rules:

<var> ::== <name> <varexp>

<varexp> ::== "" | ^ <exp>

We can construct a similar rule for parsing exponents with or without negative signs:

<exp> ::== - <int> | <int>

where <int> is a terminal symbol representing a positive integer formed by contiguous decimal digits. When creating the grammar for our polynomial language, however, we have to be careful to avoid ambiguity; in other words, we want to make sure that, given a single token string, only a single rule will be able to parse it. Otherwise, our program will become confused and not know which rule to use. For instance, suppose our grammar was

<poly> ::== "" | <name> | <int> | <dec> | <poly> + <poly> | <poly> - <poly> | <poly> * <poly>

We can see immediately that this is not a good idea. If we were to take the polynomial expression a + b - c * d, which one of these rules would we use to parse it? After all, a, a + b, and a + b - c would all be nonterminal symbols; which one of them would we choose to represent the first <poly> in the expression? For this reason, when we have multiple rules for a single nonterminal symbol, we want to make sure that all of them begin with a distinct sequence of terminal symbols, since terminal symbols cannot be further simplified and can only be interpreted in one way. Otherwise, ambiguity would inevitably arise from the parser not knowing how to partition the <poly> into smaller <poly>s. For polynomial addition, we could rewrite our rules into two groups:

<poly> ::== "" | <name> <polysum> | <int> <polysum> | <dec> <polysum>

<polysum> ::== "" | + <name> <polysum> | + <int> <polysum> | + <dec> <polysum>

In this grammar, the parser knows to shave off the first token of the <poly> and parse the rest as a continuation of the sum, if it exists. For the continuation of the sum in <polysum>, we could either parse it as the empty string "" if it is absent, or search for a + and find the next term, whether it be a variable, integer, or decimal, to add to the rest of the expression. This grammar is completely unambiguous: Each input string would be parsed by a unique sequence of rules without any uncertainty. We can thus develop our grammar following the reverse of the "PEMDAS" order of operations. We begin with a general polynomial expression <expr>. We want to express this as a sum or difference of <term>s. To do this, we will use the same format as our unambiguous grammar for parsing addition, splitting the <expr> into a <term> and an <exprsum>:

<expr> ::== <term> <exprsum>

<exprsum> ::== + <term> <exprsum> | - <term> <exprsum> | ""

Next, we want to express each <term> as a product, quotient, or remainder of <signed>s. We will do this in the same manner as with addition and subtraction, splitting each <term> into a <signed> and a <termprod>:

<term> ::== <signed> <termprod>

<termprod> ::== * <signed> <termprod> | / <signed> <termprod> | // <signed> <termprod> | % <signed> <termprod> | ""

We also want to give support for negative signs in front of variables and constants. This operation is essentially just multiplying the value in question by −1, so it is on the same level as multiplication and division. Therefore, we will define a <scalar> to be the absolute value of a <signed>:

<signed> ::== - <scalar> | <scalar>

This is where we come to the highest level of the order of operations: exponents and parentheses. At this point, all we should be left with are <name>s, <int>s, and <dec>s that are either by themselves or raised to integer powers. However, this is where parentheses come in: At the <scalar> level, it is possible to have another polynomial expression altogether within parentheses that still needs to be parsed. We thus incorporate this as a fourth rule when parsing scalars:

<scalar> ::== ( <expr> ) <scalarexp> | <name> <scalarexp> | <int> <scalarexp> | <dec> <scalarexp>

<scalarexp> ::== ^ <int> | ^ - <int> | ""
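To make the layering concrete, here is a toy recursive-descent parser that follows the same expr/term/signed/scalar structure but evaluates numeric input directly. Our real parser builds Expression objects instead, and omits nothing from the grammar; this sketch skips // and %, and all names in it are illustrative:

```cpp
#include <cassert>
#include <cctype>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <string>

// One method per grammar level: expr (sums) -> term (products) ->
// signedScalar (unary minus) -> scalar (parentheses, numbers, exponents).
struct Parser {
    std::string s;
    std::size_t pos = 0;

    char peek() {                            // next non-space character
        while (pos < s.size() && s[pos] == ' ') ++pos;
        return pos < s.size() ? s[pos] : '\0';
    }

    double expr() {                          // <term> { (+|-) <term> }
        double v = term();
        for (char c = peek(); c == '+' || c == '-'; c = peek()) {
            ++pos;
            v = (c == '+') ? v + term() : v - term();
        }
        return v;
    }

    double term() {                          // <signed> { (*|/) <signed> }
        double v = signedScalar();
        for (char c = peek(); c == '*' || c == '/'; c = peek()) {
            ++pos;
            v = (c == '*') ? v * signedScalar() : v / signedScalar();
        }
        return v;
    }

    double signedScalar() {                  // - <scalar> | <scalar>
        if (peek() == '-') { ++pos; return -scalar(); }
        return scalar();
    }

    double scalar() {                        // ( <expr> ) | number, then optional ^ exponent
        double v;
        if (peek() == '(') {
            ++pos;
            v = expr();
            if (peek() != ')') throw std::runtime_error("expected )");
            ++pos;
        } else {
            peek();                          // skip any leading spaces
            std::size_t start = pos;
            while (pos < s.size() &&
                   (std::isdigit((unsigned char)s[pos]) || s[pos] == '.')) ++pos;
            v = std::stod(s.substr(start, pos - start));
        }
        if (peek() == '^') {
            ++pos;
            bool neg = (peek() == '-');
            if (neg) ++pos;
            std::size_t start = pos;
            while (pos < s.size() && std::isdigit((unsigned char)s[pos])) ++pos;
            v = std::pow(v, (neg ? -1.0 : 1.0) * std::stod(s.substr(start, pos - start)));
        }
        return v;
    }
};

double evaluate(const std::string& input) {
    Parser p;
    p.s = input;
    return p.expr();
}
```

Because each level only calls the level below it (or recurses back to the top inside parentheses), the order of operations falls out of the call structure with no ambiguity.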

This leaves us with the following grammar, where <line> is the original expression encompassing the entire line and $ is the end of the string:

Figure 18: The context-free grammar used for parsing a polynomial expression in our language.

The final feature we want to incorporate into our grammar is the ability to parse mod expressions, such as x^2 = x + 1. In our language, we are entering polynomial and modding expressions line by line. If we enter a modding expression, we want the program to add it to the modding table; if we enter a regular polynomial expression, we want the program to evaluate it and output the result. Our parser then must have the additional rule:

<line> ::== <expr> $
         | <var> = <expr> $

We note, however, that this grammar is ambiguous: A <name> and an <exp> could just as easily be part of another <expr>. Therefore, when parsing each line, we need to look up to 5 tokens into the string in order to detect the = classifying the line as a modding expression. These would be a <name>, a ^, a -, an <int>, and then an =, although some of these could potentially be absent. To do this, we must upgrade our grammar from an LL(1) grammar to an LL(5) grammar. An LL(k) grammar is a context-free grammar that allows the parser to look ahead k symbols when parsing the string (Appel, 2004). Previously, our language had used an LL(1) grammar, as it was only allowed to look at the next single token in the sequence. Now that it can look ahead 5 symbols, we can determine whether our line is a modding statement used for substitution or a polynomial expression that should produce output. There are, of course, ways we could potentially improve this grammar. For instance, it would be helpful to be able to have mathematical expressions within exponents, such as a^(3 + 2). Additionally, it might be interesting later on to add syntax for loops and conditionals; a mathematician might want to know what would happen if you took φ, squared it, added 1, and repeated the process 50 times, or find out on what iteration the expression finally exceeded one million. In the coming years, we intend to modify and augment our language to include more features that might be of use to mathematicians studying cyclotomic Perron numbers and polynomials.
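The five-token lookahead can be sketched as a simple predicate over the lexed line (the token representation is simplified to plain strings for illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Classify a lexed line: it is a modding constraint exactly when an "="
// appears among its first five tokens. A constraint's left-hand side is at
// most a name, a ^, a -, and an integer, so LL(5) lookahead suffices.
bool isModdingLine(const std::vector<std::string>& tokens) {
    std::size_t limit = tokens.size() < 5 ? tokens.size() : 5;
    for (std::size_t i = 0; i < limit; ++i)
        if (tokens[i] == "=") return true;
    return false;
}
```

A line such as x ^ 2 = x + 1 is then routed to the modding table, while x + x - 3 is evaluated and printed.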

13 Research Summary

As an undergraduate in the Honors Tutorial College at Ohio University, I have done extensive research with Dr. Alexei Davydov regarding cyclotomic Perron numbers of conductor n and their properties. Using C++ and Galois theory, I was able to write innovative programs and algorithms to compute cyclotomic Perron numbers, detect minimal Perron numbers, and demonstrate patterns in minimal Perron numbers of various conductors, which I then plotted using Python. To better understand the behavior of Perron numbers in the algebraic world, I wrote my own C++ polynomial classes with an emphasis on efficiency of computation and a focus on simplifying polynomials of Perron numbers via modding. I wrote my own algorithms for polynomial addition and multiplication, as well as a context-free grammar for parsing user-entered polynomial expressions. I also used Gröbner bases and multivariate polynomial GCD to implement polynomial division, and used multinomial coefficients to implement fast polynomial exponentiation. These projects have formed my senior thesis for the HTC, and I intend to continue research into both Perron numbers and computer algebra in the future.

14 Conclusion

In this paper, we have made several discoveries in an area of Perron numbers that has received relatively little attention from the mathematical community. By studying the distribution of minimal cyclotomic Perron numbers for various conductors and developing algorithms for decomposing a Perron number into a sum of minimals, we have obtained a large amount of data that has revealed many previously undiscovered properties of these numbers. In the future, this data might be used to better understand fusion rules and categories, and could possibly lead to further developments in such diverse fields as category theory and quantum computing. Computer scientists might one day utilize our programs as an alternative coding representation of polynomials and cyclotomic integers, and might expand upon our algorithms to better understand algebraic entities. The world of mathematics is continuing to expand, and with the help of new research from computer science, knowledge about Perron numbers will only increase in the coming years.

References

Appel, A. W. (2004). Sections 3.1–3.2. In Modern compiler implementation in ML. Cambridge University Press.
    Section 3.1 discusses and gives examples of context-free grammars and how to represent them. Section 3.2 discusses LL(k) parsers and how they can be used for predictive parsing. We will be using this source to explain the polynomial parser we wrote and its grammar rules.

Davydov, A., & Kanieski, W. (n.d.-a). Cyclotomic Perron numbers and fusion rings. (in preparation)
    This is my joint paper with Dr. Alexei Davydov pertaining to cyclotomic Perron numbers. In this paper we give an algebraic background for Perron numbers before delving into their properties. We introduce the concept of minimality and discuss the geometries of Perron cones for different conductors.

Davydov, A., & Kanieski, W. (n.d.-b). Fusion graphs. (in preparation)
    This is my joint paper with Dr. Alexei Davydov regarding fusion rules and fusion graphs. Here, we introduce the concept of fusion graphs and give some key examples. We link fusion graphs to fusion rules and cyclotomic Perron numbers of various conductors.

Eder, C., & Hofmann, T. (2021). Efficient Gröbner bases computation over principal ideal rings. Journal of Symbolic Computation, 103, 1–13.
    This source introduces Gröbner bases and their relation to ideals. It describes how to use them to carry out Euclidean polynomial division with multiple variables, a technique which we have programmed into our polynomial expansion tool.

Etingof, P. I., Gelaki, S., Nikshych, D., & Ostrik, V. (2017). Tensor categories (Vol. 205). American Mathematical Society.
    This book mentions some key mathematical applications of Perron numbers, including as dimensions of Grothendieck rings and of rings with a basis in which the structure constants are non-negative integers. These form the foundation of the combinatorics needed to study tensor categories, which are important in numerous mathematical fields as well as theoretical computer science and quantum computing.

Gannon, T., & Schopieray, A. (2019). Algebraic number fields generated by Frobenius-Perron dimensions in fusion rings.
    The article mentions that Perron numbers and fusion graphs have many applications in other fields, such as topological quantum computing, invariants of knots and 3-manifolds, subfactor theory, vertex operator algebras, and conformal field theory. It also introduces the concept of a Perron-Frobenius dimension of elements of a fusion ring and goes on to illustrate some of their properties. It also includes the primary statement of the Perron-Frobenius theorem, namely that if M is a square matrix of positive real numbers, then the spectral radius of M is one of its eigenvalues.

Kaltofen, E. (1982). Factorization of polynomials. Computing Supplementum, 95–113. doi: 10.1007/978-3-7091-3406-1_8
    This article begins with an introduction to the history of computer algorithms for factoring polynomial equations. It gives several different approaches and explains how polynomial algorithms have changed since the beginning of computer algebra.

Livio, M. (2003). The Golden Ratio: The story of phi, the world's most astonishing number. Broadway Books.
    Livio covers in great depth the history and properties of the Golden Ratio, a Perron number of conductor 5. Among other things, he discusses many different instances of the Golden Ratio occurring in geometric patterns in nature and art, and gives mathematical justification for why this is the case.

Perron, O. (1907). Zur Theorie der Matrices. Mathematische Annalen, 64(2), 248–263. doi: 10.1007/bf01449896
    This paper is Oskar Perron's original 1907 article pertaining to the theory of matrices and referencing Perron's theorem.

Polynomial Algebra-Wolfram Language Documentation. (n.d.). Retrieved from https://reference.wolfram.com/language/guide/PolynomialAlgebra.html
    This website lists and explains many polynomial-related functions supported by Wolfram Mathematica. We compare some of these functions to our own, especially those related to modding, and describe why ours are more pertinent to research into Perron numbers.

Polynomial Toolkit for TI-Nspire. (n.d.). Retrieved from https://compasstech.com.au/TNS Authoring/poly.html
    This website lists the numerous polynomial-related functions supported by the TI-Nspire and their purposes, as well as giving helpful links to related documents. We cite this source as a comparison for our own polynomial tool, and how our program has algorithms specific to Perron numbers that the TI-Nspire lacks.

Rowell, E. C., & Wang, Z. (2018). Mathematics of topological quantum computing. Bulletin (New Series) of the American Mathematical Society, 55(2), 183–238.

    Braided fusion rules are important to topological quantum computing in that they aid in "knotting" the information together in quantum states that cannot decay due to being locked into a topological structure.

Sturmfels, B. (1996). Gröbner basics. In Gröbner bases and convex polytopes. American Math. Soc.
    This source describes the reduction process in Gröbner bases and how it allows us to implement multivariate polynomial division. It also mentions the Buchberger algorithm and discusses different lexicographic orderings.

Swade, D. (n.d.). The Engines. Retrieved from https://www.computerhistory.org/babbage/engines/
    This source mentions Charles Babbage's difference engine as an early example of a computer capable of performing polynomial functions. We cite it here to give an example of how polynomial mathematics has been a part of programming since its inception.

Thurston, W. P. (2014). Entropy in dimension one. In A. Bonifant, M. Lyubich, & S. Sutherland (Eds.), Frontiers in complex dynamics: In celebration of John Milnor's 80th birthday. Princeton University Press.
    Thurston mentions several key points about Perron numbers, notably Doug Lind's theorem proving the converse of the Frobenius-Perron theorem to be true. He also gives more applications for the Frobenius-Perron theorem, such as for studying the topological entropy of a map.

Verlinde, E. (1988). Fusion rules and modular transformations in 2D conformal field theory. Nuclear Physics B, 300, 360–376. doi: 10.1016/0550-3213(88)90603-7
    Verlinde gives some examples of fusion rules in physics and how they are important to conformal field theory. He defines a fusion rule as pertaining to a ring where the product of any two elements results in a sum of integer multiples of other elements, and shows how fusion rules are linked to matrices.

Washington, L. C. (1982). Introduction to cyclotomic fields. Springer.
    In this paper, Washington states the theorem that all real cyclotomic integers are sums of even multiples of cosines of rational multiples of pi. This is important when considering our quantum bases for representing Perron numbers of various conductors.
