
SIAM J. DISCRETE MATH. © 2017 Society for Industrial and Applied Mathematics
Vol. 31, No. 2, pp. 783–795

H-REPRESENTATION OF THE KIMURA-3 POLYTOPE FOR THE m-CLAW TREE∗

MARIE MAUHAR†, JOSEPH RUSINKO‡, AND ZOE VERNON§

Abstract. Given a group-based Markov model on a tree, one can compute the vertex representation of a polytope describing a toric variety associated with the algebraic statistical model. In the cases of $\mathbb{Z}_2$ and $\mathbb{Z}_2 \times \mathbb{Z}_2$, these polytopes have applications in the field of phylogenetics. We provide a half-space representation of the polytope for the $m$-claw tree where $G = \mathbb{Z}_2 \times \mathbb{Z}_2$. This choice of group corresponds to the Kimura-3 model of evolution.

Key words. facet description, polytopes, group-based phylogenetic models

AMS subject classification. 52B20

DOI. 10.1137/15M1051890

1. Introduction. Phylogenetic trees depict evolutionary relationships between proteins, genes, or organisms. Many tree reconstruction methods assume evolution is described as a Markov process along the edges of a tree. The Markov matrices define the probability that a characteristic changes along an edge of the tree. The probability of observing a particular collection of characteristics at the leaves of the tree can be computed as a polynomial in the (unknown) entries of the transition matrices. See [20] for an overview of this algebraic statistical viewpoint on phylogenetics.

Given a tree and an evolutionary model, invariants are polynomial relationships satisfied by the expected pattern frequencies occurring in sequences evolving along the tree under the Markov model [6, 11]. Algebraic geometry provides a framework for computing the complete set of invariants as the elements of a prime ideal that defines an algebraic variety. Many classical varieties arise in the study of phylogenetic models [10], as do modern objects such as conformal blocks [18] and Berenstein–Zelevinsky triangles [16].

For some models of evolution, known as group-based models, there is a finite group which acts freely and transitively on the set of states in the transition matrices. Two such models are the binary symmetric model ($G = \mathbb{Z}_2$) and the Kimura-3 model ($G = \mathbb{Z}_2 \times \mathbb{Z}_2$), the latter of which accounts for differences in DNA mutation rates between transitions and both types of transversions [14]. In a group-based model the transition matrices can be simultaneously diagonalized [11]. After the change of coordinates induced by these diagonalizations, the variety is seen to be toric [21, 23]. Being toric, the variety has an associated polytope whose combinatorial structure describes the geometry of the variety. We refer the reader to [7, 13] for background on toric varieties and polytopes.

∗Received by the editors December 10, 2015; accepted for publication (in revised form) January 31, 2017; published electronically April 20, 2017.
http://www.siam.org/journals/sidma/31-2/M105189.html
Funding: This material is based upon work supported by the National Science Foundation under grants DMS-1358534 and DMS-1616186.
†Department of Mathematics, Lenoir-Rhyne University, Hickory, NC 28601 (marie.mauhar@alumni.lr.edu).
‡Department of Mathematics and Computer Science, Hobart and William Smith Colleges, Geneva, NY 14456 ([email protected]).
§Department of Mathematics, Washington University, St. Louis, MO 63130 (zoe.vernon@wustl.edu).

For a fixed tree and finite group there is an algorithm for computing the vertices of this polytope (i.e., the vertex or V-representation) [5, 8]. An equivalent description of the polytope as the collection of points satisfying a set of linear inequalities is known as the half-space or H-representation. The H-representation for group-based models is known only in the case of the binary symmetric model [2].

Finding an H-representation is difficult even for simple classes of trees. Beyond the binary symmetric model, such a representation is unknown even for a tree with one interior node and $m$ leaves, a tree referred to as the $m$-claw tree. Claw trees play an important role in phylogenetics, as the variety associated with any tree can be computed by a sequence of toric fiber products of the varieties associated to claw trees [23, 24].

In this article we provide an H-representation of the polytope associated to the $m$-claw tree under the Kimura-3 model. This description of the Kimura-3 polytope builds on a well-known identification of the polytope for the binary symmetric model with the demihypercube. This work complements the growing body of knowledge about the geometry of the Kimura-3 variety [3, 15, 20].

Outline of the paper. In section 2 we review the vertex description of the polytope associated with the binary symmetric and Kimura-3 models. We introduce a polytope $\Delta(m)$ that we propose as the H-representation of the polytope associated with the Kimura-3 model for an arbitrary claw tree. We show that if $\Delta(m)$ is integral, then it is the H-representation. In section 3 we introduce an isomorphism between $\Delta(m)$ and a polytope $\Delta'(m)$, described in terms of a $3 \times m$ matrix whose row and column coordinates satisfy a set of inequalities reminiscent of those that define the demihypercube. We then identify a connection between the number of integral coordinates of a point in $\Delta'(m)$ and the number of facets on which it lies. This connection is utilized in section 4 to prove the main result of our paper: that $\Delta(m)$ is an integral polytope and thus provides an H-representation of the Kimura-3 polytope associated with a claw tree. We conclude with a collection of open problems about the combinatorics and geometry of group-based models.

2. Polytopes for group-based models. For a fixed tree and group-based model with group $G$, the V-representation of the polytope can be computed using an algorithm described in [5, 8], which we review here. For each leaf, one defines a map $g : G \to \mathbb{R}^{|G|-1}$, where the identity maps to $(0, 0, \dots, 0)$ and each nonidentity element maps to a distinct standard basis vector of the integral lattice $\mathbb{Z}^{|G|-1}$. Under this identification, the polytope associated to the $m$-claw tree is the convex hull in $\mathbb{R}^{m(|G|-1)}$ of all possible labelings of the leaves with $m$ group elements that sum to the identity. Throughout this paper, when discussing claw trees we assume that the number of leaves is at least three. We view the coordinates of $\mathbb{R}^{m(|G|-1)}$ as entries of a $(|G|-1) \times m$ matrix $M$ whose columns are indexed by the leaves of the tree.

Definition 2.1 (after [23] as described in [8]). The Kimura polytope, denoted $K(m)$, is the polytope associated to the $m$-claw tree with the group $\mathbb{Z}_2 \times \mathbb{Z}_2$. The vertices of $K(m) \subseteq \mathbb{R}^{3m}$ are in bijection with collections of $m$ elements of $\mathbb{Z}_2 \times \mathbb{Z}_2$ such that the sum of these elements is the identity. We identify the elements of $\mathbb{Z}_2 \times \mathbb{Z}_2$ with column vectors in a $3 \times m$ matrix $M$ under the map $g : \mathbb{Z}_2 \times \mathbb{Z}_2 \to \mathbb{R}^3$ given by $g(0,0) = (0,0,0)$, $g(1,0) = (1,0,0)$, $g(0,1) = (0,1,0)$, and $g(1,1) = (0,0,1)$.

Our construction of a facet description for $K(m)$ was inspired by the H-representation for the polytope associated to the group $\mathbb{Z}_2$. In the case of $\mathbb{Z}_2$ the polytope is the demihypercube, which has a well-known H-representation.
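The vertex description in Definition 2.1 can be illustrated by brute-force enumeration. The following is a minimal sketch, not the algorithm of [5, 8]: the names `G` and `kimura_vertices` are hypothetical, and each vertex is stored as a tuple of three rows of length m. Since the first m − 1 labels can be chosen freely and the last one is then forced, the enumeration should produce 4^(m−1) vertices.

```python
from itertools import product

# The map g of Definition 2.1: each element of Z2 x Z2 becomes a column in R^3.
# (The dictionary name G is an illustrative choice, not notation from the paper.)
G = {(0, 0): (0, 0, 0), (1, 0): (1, 0, 0), (0, 1): (0, 1, 0), (1, 1): (0, 0, 1)}


def kimura_vertices(m):
    """Return the vertices of K(m), each as a tuple of 3 rows of length m."""
    vertices = []
    for labels in product(G, repeat=m):
        first = sum(a for a, _ in labels) % 2
        second = sum(b for _, b in labels) % 2
        # Keep only labelings whose sum in Z2 x Z2 is the identity (0, 0).
        if first == 0 and second == 0:
            columns = [G[x] for x in labels]
            # Transpose the m columns into 3 rows.
            vertices.append(tuple(zip(*columns)))
    return vertices
```

For example, `kimura_vertices(3)` contains the matrix with rows (1, 0, 0), (0, 1, 0), (0, 0, 1), which corresponds to the labeling (1, 0), (0, 1), (1, 1).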

Definition 2.2 (see [2]). The H-representation of the demihypercube is given by
\[
DH(m) = \Big\{ d \in [0,1]^m : \sum_{i \in A} d_i \le |A| - 1 + \sum_{i \notin A} d_i \Big\},
\]
where $A$ ranges over all subsets of $\{1, 2, \dots, m\}$ of odd cardinality.
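As an illustration of Definition 2.2, the sketch below tests membership in DH(m) by checking every odd-cardinality subset directly. The function names are hypothetical and indices are 0-based; the enumeration is exponential in m, so it is meant only for small examples. For 0/1 points, the inequality indexed by the support of the point fails exactly when the number of ones is odd.

```python
from itertools import combinations, product


def in_demihypercube(d):
    """Check the inequalities of Definition 2.2 for a point d in R^m."""
    m = len(d)
    if any(x < 0 or x > 1 for x in d):
        return False
    total = sum(d)
    for size in range(1, m + 1, 2):          # odd cardinalities only
        for A in combinations(range(m), size):
            inside = sum(d[i] for i in A)
            outside = total - inside
            if inside > size - 1 + outside:  # the A-inequality is violated
                return False
    return True


def integral_points(m):
    """All 0/1 points of DH(m), found by exhaustive search."""
    return [d for d in product((0, 1), repeat=m) if in_demihypercube(d)]
```

For m = 4 the search returns the eight 0/1 vectors with an even number of ones.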

To connect the two polytopes, we note that an element of $\mathbb{Z}_2 \times \mathbb{Z}_2$ is the identity if and only if its image is the identity under all homomorphisms from $\mathbb{Z}_2 \times \mathbb{Z}_2$ to $\mathbb{Z}_2$. Therefore the sum of the images of all of the group elements defining a vertex of the Kimura polytope must be the identity under the three nontrivial homomorphisms from $\mathbb{Z}_2 \times \mathbb{Z}_2$ to $\mathbb{Z}_2$. We formalize this relationship in Theorem 4.4, where we show that the following inequalities provide an H-representation for the Kimura-3 polytope.

Definition 2.3. The polytope $\Delta(m)$ is the set of points in $\mathbb{R}^{3m}$ satisfying $x_{ij} \ge 0$ for all $1 \le i \le 3$ and $1 \le j \le m$ and $\sum_{i=1}^{3} x_{ij} \le 1$ for all $1 \le j \le m$, as well as the following A-inequalities, where $A$ ranges over all odd cardinality subsets of $\{1, 2, 3, \dots, m\}$:
\[
\begin{aligned}
\sum_{j \in A} (x_{1j} + x_{2j}) &\le |A| - 1 + \sum_{j \notin A} (x_{1j} + x_{2j}), \\
\sum_{j \in A} (x_{1j} + x_{3j}) &\le |A| - 1 + \sum_{j \notin A} (x_{1j} + x_{3j}), \\
\sum_{j \in A} (x_{2j} + x_{3j}) &\le |A| - 1 + \sum_{j \notin A} (x_{2j} + x_{3j}).
\end{aligned}
\]
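Definition 2.3 translates directly into a membership test. The sketch below is a finite sanity check for small m, not a proof, and all names in it are hypothetical. It compares, for every labeling of the leaves by elements of Z2 × Z2, whether the corresponding 0/1 matrix satisfies the inequalities of ∆(m) with whether the labels sum to the identity.

```python
from itertools import combinations, product

# Columns assigned to group elements by the map g of Definition 2.1.
COLUMNS = {(0, 0): (0, 0, 0), (1, 0): (1, 0, 0), (0, 1): (0, 1, 0), (1, 1): (0, 0, 1)}
# Row pairs (0-based) appearing in the three families of A-inequalities.
ROW_PAIRS = [(0, 1), (0, 2), (1, 2)]


def in_delta(M):
    """Test the inequalities of Definition 2.3 for a 3 x m matrix M (list of rows)."""
    m = len(M[0])
    if any(M[i][j] < 0 for i in range(3) for j in range(m)):
        return False
    if any(M[0][j] + M[1][j] + M[2][j] > 1 for j in range(m)):
        return False
    for r, s in ROW_PAIRS:
        y = [M[r][j] + M[s][j] for j in range(m)]
        total = sum(y)
        for size in range(1, m + 1, 2):      # odd cardinality subsets A
            for A in combinations(range(m), size):
                inside = sum(y[j] for j in A)
                if inside > size - 1 + (total - inside):
                    return False
    return True


def matrix_of(labels):
    """Build the 3 x m matrix whose columns are the images of the labels under g."""
    return [list(row) for row in zip(*(COLUMNS[g] for g in labels))]


def sums_to_identity(labels):
    return all(sum(g[c] for g in labels) % 2 == 0 for c in (0, 1))


def integral_points_match(m):
    """True if a labeling lies in Delta(m) exactly when it sums to the identity."""
    return all(in_delta(matrix_of(labels)) == sums_to_identity(labels)
               for labels in product(COLUMNS, repeat=m))
```

Calling `integral_points_match(3)` or `integral_points_match(4)` runs the comparison over all 4^m labelings; the cost grows quickly with m because every odd subset is enumerated.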

We first show that $\Delta(m)$ contains the Kimura-3 polytope.

Proposition 2.4. The polytope $K(m)$ is contained in $\Delta(m)$.

Proof. Let $v$ be a vertex of $K(m)$ corresponding to a choice $(g_1, g_2, \dots, g_m)$ of elements of $\mathbb{Z}_2 \times \mathbb{Z}_2$ which sum to the identity. By the identification in Definition 2.1, $v$ satisfies $x_{ij} \ge 0$ and $\sum_{i=1}^{3} x_{ij} \le 1$ for all $1 \le j \le m$. Without loss of generality, we prove that $v$ satisfies the A-inequality
\[
\sum_{j \in A} (x_{1j} + x_{2j}) \le |A| - 1 + \sum_{j \notin A} (x_{1j} + x_{2j}).
\]

Letting $X = \{k \in A \mid g_k = (1,0)\}$ and $Y = \{k \in A \mid g_k = (0,1)\}$, we have $|X| + |Y| \le |A|$. We divide the proof into two cases according to the parity of $|X|$ and $|Y|$. If $|X|$ and $|Y|$ are of the same parity, then $|X| + |Y| \ne |A|$, since $|X| + |Y|$ is even and $|A|$ is odd. Therefore,
\[
\sum_{j \in A} (x_{1j} + x_{2j}) = |X| + |Y| \le |A| - 1 \le |A| - 1 + \sum_{j \notin A} (x_{1j} + x_{2j}).
\]

If $|X|$ and $|Y|$ are of opposite parity, then, without loss of generality, we assume that $|X|$ is odd. This implies
\[
\sum_{i \in X \cup Y} g_i = (1, 0).
\]

In order to neutralize $(1,0)$ in $\mathbb{Z}_2 \times \mathbb{Z}_2$, the remaining elements must sum to $(1,0)$. The elements indexed by $A$ but not by $X \cup Y$ are all $(0,0)$ or $(1,1)$, and no sum of such elements equals $(1,0)$. Hence there must exist an $l \notin A$ with $g_l = (1,0)$ or $g_l = (0,1)$, that is, with $x_{1l} + x_{2l} = 1$. Therefore,
\[
\sum_{j \in A} (x_{1j} + x_{2j}) = |X| + |Y| \le |A| = |A| - 1 + (x_{1l} + x_{2l}) \le |A| - 1 + \sum_{j \notin A} (x_{1j} + x_{2j}).
\]
Since $v$ satisfies all of the defining inequalities of $\Delta(m)$, we have $v \in \Delta(m)$. Containment of $K(m)$ in $\Delta(m)$ follows from the convexity of $\Delta(m)$, since $K(m)$ is the convex hull of its vertices.

The opposite inclusion, that $\Delta(m) \subseteq K(m)$, is not immediately clear. However, the following shows that the inclusion holds as long as $\Delta(m)$ is integral.

Proposition 2.5. If $\Delta(m)$ is integral, then $\Delta(m) = K(m)$.

Proof. Assume $v$ is an integral vertex of $\Delta(m)$ which is not a vertex of $K(m)$. Using Definition 2.1, we can identify $v$ with a sequence of group elements $g_1, \dots, g_m$. Since $v$ is not a vertex of $K(m)$, we have $\sum_{k=1}^{m} g_k \ne (0,0)$. Without loss of generality we assume that the first coordinate of this sum is 1. It follows that $A = \{l \mid g_l = (1,0) \text{ or } (1,1)\}$ has odd cardinality. By construction, we have $\sum_{j \in A} (x_{1j} + x_{3j}) = |A|$ and $\sum_{j \notin A} (x_{1j} + x_{3j}) = 0$. This contradicts the hypothesis that $v$ was an element of $\Delta(m)$, since $v$ does not satisfy
\[
\sum_{j \in A} (x_{1j} + x_{3j}) \le |A| - 1 + \sum_{j \notin A} (x_{1j} + x_{3j});
\]
therefore, all integral vertices of $\Delta(m)$ are also vertices of $K(m)$. By convexity this shows that if $\Delta(m)$ is integral, then it is a subset of $K(m)$. Combining this with Proposition 2.4 yields the equality of polytopes.

3. An alternative description of the Kimura-3 polytope.

3.1. Coordinate change from $\Delta(m)$ to $\Delta'(m)$. We have exchanged the problem of converting a V-representation to an H-representation with another problem: recognizing when a rational polytope is integral. However, the relationship between the A-inequalities of $\Delta(m)$ and those of the demihypercube suggests a change of coordinates that allows us to demonstrate integrality. The A-inequalities for the demihypercube and for $\Delta(m)$ share the same underlying indexing structure.
However, a more subtle connection with the demihypercube is revealed after a change of coordinates motivated by the group homomorphisms $\mathbb{Z}_2 \times \mathbb{Z}_2 \to \mathbb{Z}_2$.

Definition 3.1. Define a map $f : \mathbb{R}^{3m} \to \mathbb{R}^{3m}$ by $f(M) = M'$, where $M'$ is a $3 \times m$ matrix with $x'_{1j} = x_{1j} + x_{2j}$, $x'_{2j} = x_{1j} + x_{3j}$, and $x'_{3j} = x_{2j} + x_{3j}$ for $1 \le j \le m$.

Definition 3.2. Define the polytope $\Delta'(m) = f(\Delta(m))$.

The map $f$ is not an isomorphism of the underlying lattices. However, the following computations show that $\Delta(m)$ and $\Delta'(m)$ are isomorphic and that they are either both integral or both nonintegral polytopes.

Lemma 3.3. Let $f_j : \mathbb{R}^3 \to \mathbb{R}^3$ be the restriction of $f$ to a map from column $j$ of the matrix $M$ to column $j$ of $M'$. The map $f_j$ is an isomorphism which maps the unit 3-simplex in $\mathbb{R}^3$ to $DH(3)$.

Proof. The vertices of the unit 3-simplex are $\{(0,0,0), (1,0,0), (0,1,0), (0,0,1)\}$. The map $f_j : \mathbb{R}^3 \to \mathbb{R}^3$ is given by $f_j(x_{1j}, x_{2j}, x_{3j}) = (x_{1j} + x_{2j},\, x_{1j} + x_{3j},\, x_{2j} + x_{3j})$. The function $f_j$ is an isomorphism with inverse
\[
f_j^{-1}(x, y, z) = \left( \frac{x + y - z}{2},\; \frac{x + z - y}{2},\; \frac{y + z - x}{2} \right).
\]

The second part of the claim follows from applying the change of coordinates to the vertices of the 3-simplex, which are sent to $(0,0,0)$, $(1,1,0)$, $(1,0,1)$, and $(0,1,1)$, the vertices of $DH(3)$.

Corollary 3.4. The polytopes $\Delta(m)$ and $\Delta'(m)$ are isomorphic.

Proof. It follows directly from Lemma 3.3 that the map $f$ in Definition 3.1 is an affine isomorphism which maps the vertices of $\Delta(m)$ to the vertices of $\Delta'(m)$.

Corollary 3.5. The polytope $\Delta(m)$ is integral if and only if $\Delta'(m)$ is integral.

Proof. It is clear that if $\Delta(m)$ is integral, then $\Delta'(m)$ must also be integral. Now assume that $P \in \Delta(m)$ is a nonintegral vertex which maps to an integral vertex $P'$ of $\Delta'(m)$. Without loss of generality we assume that $x_{1j}$ is nonintegral. As $P$ is an element of $\Delta(m)$, we know that the coordinates of $P$ lie in the interval $[0,1]$. In order for $P$ to be nonintegral and $P'$ to be integral we must therefore have $x_{1j} + x_{2j} = 1$ and $x_{1j}, x_{2j} > 0$. Similarly it follows that $x_{1j} + x_{3j} = 1$ and $x_{2j} + x_{3j} = 1$ as well. Combining these three equations yields $x_{1j} = x_{2j} = x_{3j} = \frac{1}{2}$. However, this implies $x_{1j} + x_{2j} + x_{3j} > 1$, so $P$ could not have been an element of $\Delta(m)$.

Proposition 3.6. The facets of $\Delta'(m)$ are described by the inequalities

\[
\sum_{j \in A} x'_{ij} \le |A| - 1 + \sum_{j \notin A} x'_{ij},
\]
where $A$ is any subset of $\{1, 2, \dots, m\}$ of odd cardinality and $1 \le i \le 3$, and
\[
\sum_{i \in B} x'_{ij} \le |B| - 1 + \sum_{i \notin B} x'_{ij},
\]
where $B$ is a subset of $\{1, 2, 3\}$ of odd cardinality and $1 \le j \le m$. We call the former row facets and the latter column facets.

Proof. Applying the change of coordinates in Lemma 3.3 to the defining inequalities of $\Delta(m)$ yields the collection of defining inequalities

\[
\sum_{j \in A} x'_{ij} \le |A| - 1 + \sum_{j \notin A} x'_{ij},
\]
where $A$ is any subset of $\{1, 2, \dots, m\}$ of odd cardinality and $1 \le i \le 3$, and
\[
\sum_{i \in B} x'_{ij} \le |B| - 1 + \sum_{i \notin B} x'_{ij},
\]
where $B$ is a subset of $\{1, 2, 3\}$ of odd cardinality and $1 \le j \le m$. As this collection of inequalities defines $\Delta'(m)$, it follows that the set of facets must be associated to a subset of these inequalities.

Now assume that $A$ is the index set associated to a defining inequality of $\Delta'(m)$. Without loss of generality we assume that the defining inequality is associated to the first row of $M'$, and therefore is given by $\sum_{j \in A} x'_{1j} \le |A| - 1 + \sum_{j \notin A} x'_{1j}$. We define a point $P_A \in \mathbb{R}^{3m}$ with coordinates $P_{1,j} = 1$ if $j \in A$, and $0$ otherwise, along with $P_{2,k} = P_{3,k} = \frac{1}{2}$ for all $1 \le k \le m$. That the inequality indexed by $A$ defines a facet follows from the observation that $P_A$ does not satisfy the inequality indexed by $A$ but does satisfy all other defining inequalities of $\Delta'(m)$.

Not only does $\Delta'(m)$ have a convenient description in terms of the rows and columns of a matrix, but also the following observation will be key to our investigation of the integrality of $\Delta'(m)$.
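Because the coordinate change of Definition 3.1 acts on each column separately, Lemma 3.3 and the half-integer point from the proof of Corollary 3.5 can be checked with a few lines of exact arithmetic. This is a minimal sketch with hypothetical function names, using Python's `fractions.Fraction` to avoid rounding.

```python
from fractions import Fraction


def f_col(x1, x2, x3):
    """The column map f_j of Lemma 3.3."""
    return (x1 + x2, x1 + x3, x2 + x3)


def f_col_inverse(x, y, z):
    """The inverse of f_j, as given in the proof of Lemma 3.3."""
    half = Fraction(1, 2)
    return ((x + y - z) * half, (x + z - y) * half, (y + z - x) * half)


# Vertices of the unit 3-simplex and their images under f_j.
SIMPLEX_VERTICES = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
images = [f_col(*v) for v in SIMPLEX_VERTICES]

# The integral point (1, 1, 1) has a nonintegral preimage whose coordinates
# sum to 3/2, so the preimage violates the column constraint of Delta(m).
preimage = f_col_inverse(1, 1, 1)
```

Here `images` lists the four 0/1 vectors of length three with an even number of ones, and `preimage` is the point with all coordinates equal to 1/2 that is excluded in the proof of Corollary 3.5.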

Lemma 3.7. The polytope $\Delta'(m)$ is contained in the unit hypercube $[0,1]^{3m}$.

Proof. This follows by noting that the inequalities defining the column facets of $\Delta'(m)$ are precisely those that define the three-dimensional demihypercube, which is contained in the unit cube. Since every column of three coordinates lies within $[0,1]^3$, we must have $\Delta'(m)$ contained in $[0,1]^{3m}$.

3.2. Supporting-demihypercubes. For a point $P$ in $\Delta'(m)$, we examine the connection between the number of integral coordinates in a row (resp., column) of $P$ and the number of row facets (resp., column facets) that $P$ lies on.

To explain this connection, we restrict our attention to the coordinates of a particular row of $P$. Let $P|_r$ be the canonical projection of $P$ into $\mathbb{R}^m$ corresponding to the coordinates of row $r$ of the matrix. We introduce the notion of a supporting-demihypercube to describe the projection of $\Delta'(m)$ onto a particular row space. Without loss of generality the proofs in this section refer to the restrictions to the rows of $P$; however, the analogous results hold when restricted to the columns of $P$ as well.

Definition 3.8. For $1 \le r \le 3$, the supporting-demihypercube is defined by

\[
SDH(m) = \{ P|_r : P \in \Delta'(m) \} \subseteq \mathbb{R}^m.
\]

The hyperplanes corresponding to the row facets define supporting hyperplanes of the polytope. We do not assume that $SDH(m) = DH(m)$, as there may be additional facets of $SDH(m)$ induced by the restriction that $P \in \Delta'(m)$. We do make use of the fact that coordinates of $SDH(m)$ must lie between zero and one, which follows from Lemma 3.7.

3.3. Integrality of coordinates in the supporting-demihypercube. In this subsection we demonstrate a positive correlation between the number of integral coordinates of a point $P \in SDH(m)$ and the number of supporting hyperplanes corresponding to A-inequalities on which $P$ lies.

Lemma 3.9. Every nonintegral point $P$ of $SDH(m)$ has at least two nonintegral coordinates.

Proof. Assume that $P$ is a nonintegral point of $SDH(m)$ with a single nonintegral coordinate $p_k$, and let $I = \{i_1, i_2, \dots, i_l\}$ denote the indices of coordinates of $P$ where $p_i = 1$. The proof that there is a second nonintegral coordinate is broken into cases based on the parity of $|I|$.

If $|I|$ is odd, then the A-inequality corresponding to $I$ is $|I| \le |I| - 1 + p_k$. It follows that $p_k \ge 1$, which contradicts $p_k$ being a nonintegral coordinate of $P \in SDH(m)$.

If $|I|$ is even, let $A = I \cup \{k\}$. Then the corresponding A-inequality is $|I| + p_k \le |I|$, so $p_k \le 0$. This contradicts $p_k$ being a nonintegral coordinate of $P \in SDH(m)$. Consequently, $P$ cannot have exactly one nonintegral coordinate.

Therefore, no point in $SDH(m)$ may have a single nonintegral coordinate. Moreover, the following lemma demonstrates that select points have at most two nonintegral coordinates.

Lemma 3.10. If a point $P$ lies on two supporting hyperplanes of $SDH(m)$ corresponding to A-inequalities, then $P$ has at most two nonintegral coordinates.

Proof. In this proof we use the notation $A'$ to denote the complement of a set $A$ in $\{1, 2, \dots, m\}$. We use the notation $A \setminus B$ to denote the relative complement of a set $B$ in a set $A$. Assume that $P$ is a point in $SDH(m)$ which lies on supporting hyperplanes $H_A$ and $H_B$ corresponding to distinct index sets $A$ and $B$. Let $p_1, \dots, p_m$ denote the coordinates of $P$. Then,
\[
H_A : \sum_{i \in A \setminus B} p_i + \sum_{j \in A \cap B} p_j = |A \setminus B| + |A \cap B| - 1 + \sum_{k \in B \setminus A} p_k + \sum_{l \in (A \cup B)'} p_l
\]
and
\[
H_B : \sum_{k \in B \setminus A} p_k + \sum_{j \in A \cap B} p_j = |B \setminus A| + |A \cap B| - 1 + \sum_{i \in A \setminus B} p_i + \sum_{l \in (A \cup B)'} p_l.
\]

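Adding these two equations and cancelling the sums over $A \setminus B$ and $B \setminus A$, which appear on both sides, gives the following equation:
\[
2 \sum_{j \in A \cap B} p_j = |A \setminus B| + |B \setminus A| + 2|A \cap B| - 2 + 2 \sum_{l \in (A \cup B)'} p_l.
\]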

Since $A$ and $B$ are distinct sets of odd cardinality, it follows that $|A \setminus B| + |B \setminus A| \ge 2$. Because every coordinate of $P$ lies in $[0,1]$, the left-hand side of the above equation is at most $2|A \cap B|$, while the right-hand side is at least $2|A \cap B|$. In order for the equation to hold for a point $P \in SDH(m)$, we must therefore have $p_j = 1$ for all $j \in A \cap B$, $p_l = 0$ for all $l \in (A \cup B)'$, and $|A \setminus B| + |B \setminus A| = 2$. Consequently, there are at most two nonintegral coordinates, and they would have to be indexed by the two elements of $(A \setminus B) \cup (B \setminus A)$.

Furthermore, in the following two propositions we show that a point which lies on three supporting hyperplanes corresponding to A-inequalities must be integral, while a point that lies on exactly two supporting hyperplanes corresponding to A-inequalities has exactly two nonintegral coordinates.

Proposition 3.11. If a point $P$ lies on three supporting hyperplanes of $SDH(m)$ corresponding to A-inequalities, then $P$ is integral and lies on at least $m$ supporting hyperplanes corresponding to A-inequalities.

Proof. Assume that $P$ lies on supporting hyperplanes of $SDH(m)$ corresponding to odd cardinality sets $A$, $B$, and $C$. Repeated application of Lemma 3.10 yields
\[
p_i = \begin{cases} 1, & i \in (A \cap B) \cup (A \cap C) \cup (B \cap C), \\ 0 & \text{otherwise.} \end{cases}
\]

Notice that the integral point $P$ cannot have an odd number of coordinates with the value one or it would not satisfy the A-inequality where $A = \{i \mid p_i = 1\}$. So, we may let $I = \{i_1, i_2, \dots, i_{2k}\}$ denote the indices of coordinates of $P$ at which $p_i = 1$, and $J = \{j_{2k+1}, \dots, j_m\}$ denote the indices of coordinates where $p_j = 0$. The result follows from checking that $P$ lies on the $m$ supporting hyperplanes corresponding to the sets $I \setminus \{i_l\}$ for $1 \le l \le 2k$, and $I \cup \{j_l\}$ for $2k + 1 \le l \le m$.

Proposition 3.12. $P$ lies on exactly two supporting hyperplanes of $SDH(m)$ corresponding to A-inequalities if and only if $P$ has exactly two nonintegral coordinates.

Proof. Assume that $P$ lies on exactly two supporting hyperplanes of $SDH(m)$ corresponding to A-inequalities. Then, by Lemma 3.10, $P$ has at most two nonintegral

coordinates, and by Lemma 3.9, $P$ cannot have only one nonintegral coordinate. The claim that $P$ has exactly two nonintegral coordinates then follows from noting that if $P$ were integral, then it would have to lie on the $m$ hyperplanes constructed in the proof of Proposition 3.11.

Now let $P \in SDH(m)$ have exactly two nonintegral coordinates, $p_k$ and $p_l$. Let $I = \{i \mid p_i = 1\}$. If $|I|$ is odd, consider the sets $A_1 = I$ and $A_2 = I \cup \{k, l\}$. Since $P \in SDH(m)$, it must satisfy the inequalities corresponding to $A_1$ and $A_2$. Therefore,
\[
\sum_{i \in A_1} p_i \le |A_1| - 1 + \sum_{i \notin A_1} p_i
\]
and
\[
\sum_{i \in A_2} p_i \le |A_2| - 1 + \sum_{i \notin A_2} p_i.
\]
But these reduce to
\[
|I| \le |I| - 1 + p_k + p_l
\]
and
\[
|I| + p_k + p_l \le |I| + 2 - 1.
\]

From this it follows that $p_l + p_k = 1$, and that both of these inequalities are in fact equalities.

Now assume that $|I|$ is even. Consider $A_1 = I \cup \{k\}$ and $A_2 = I \cup \{l\}$. Since $P \in SDH(m)$, it must satisfy the inequalities corresponding to $A_1$ and $A_2$. Therefore,
\[
\sum_{i \in A_1} p_i \le |A_1| - 1 + \sum_{i \notin A_1} p_i
\]
and
\[
\sum_{i \in A_2} p_i \le |A_2| - 1 + \sum_{i \notin A_2} p_i.
\]
But these reduce to
\[
|I| + p_k \le |I| + 1 - 1 + p_l
\]
and
\[
|I| + p_l \le |I| + 1 - 1 + p_k.
\]

From this it follows that $p_l = p_k$, and that both of these inequalities are equalities.

By Proposition 3.11, $P$ cannot lie on any additional supporting hyperplanes associated to A-inequalities.

In summary, if a point in a supporting-demihypercube lies on three or more supporting hyperplanes corresponding to A-inequalities, then it must be integral. If it lies on exactly two supporting hyperplanes corresponding to A-inequalities, then it must have exactly two nonintegral coordinates. Moreover, no point in the supporting-demihypercube can have exactly one nonintegral coordinate.

3.4. Classification of supporting hyperplanes. In addition to knowing how the supporting structure constrains coordinate integrality, the proof of the integrality of $\Delta'(m)$ requires an additional classification of the supporting hyperplanes of $SDH(m)$.
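The counts established in subsection 3.3 can be reproduced on small examples by listing the A-inequalities that hold with equality at a given point. The sketch below is illustrative only: the function name `tight_sets` and the three sample points are hypothetical, indices are 0-based, and exact fractions are used so that equality tests are reliable.

```python
from fractions import Fraction
from itertools import combinations


def tight_sets(p):
    """Odd-cardinality sets A whose A-inequality holds with equality at p."""
    m = len(p)
    total = sum(p)
    tight = []
    for size in range(1, m + 1, 2):
        for A in combinations(range(m), size):
            inside = sum(p[i] for i in A)
            if inside == size - 1 + (total - inside):
                tight.append(set(A))
    return tight


half = Fraction(1, 2)
third = Fraction(1, 3)

integral = [1, 1, 0, 0, 0]           # integral, with an even number of ones
odd_case = [1, half, half, 0, 0]     # |I| = 1 is odd and p_k + p_l = 1
even_case = [1, 1, third, third, 0]  # |I| = 2 is even and p_k = p_l
```

For the integral point, the tight sets are the five sets that differ from I = {0, 1} in exactly one element, matching the m hyperplanes in the proof of Proposition 3.11. Each of the two nonintegral points has exactly two tight sets, namely the sets A_1 and A_2 used in the proof of Proposition 3.12.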

Definition 3.13. Let $P$ be a point in $SDH(m)$ with exactly two nonintegral coordinates, $p_i$ and $p_j$. Assume that $P$ lies on a supporting hyperplane $H$ corresponding to a set $A$. If $i$ and $j$ are both elements of $A$ or both elements of $A'$, we call $H$ a same supporting hyperplane or S-supporting hyperplane. Otherwise, we call $H$ an opposite or O-supporting hyperplane.

This classification of supporting hyperplanes is dependent on a choice of a point and does not universally classify supporting hyperplanes into two disjoint sets. However, as we apply the concept only to the case when a point has been specified, we suppress the unneeded notation that would indicate that the classification is a function of a point $P$.

Lemma 3.14. Let $P$ be a point in $SDH(m)$ with exactly two nonintegral coordinates, $p_k$ and $p_l$. Let $I = \{i \mid p_i = 1\}$. If $|I|$ is odd, then $P$ lies on two S-supporting hyperplanes; if $|I|$ is even, then $P$ lies on two O-supporting hyperplanes.

Proof. This follows directly from the proof of Proposition 3.12.

Definition 3.15. Let $P$ be a point in $\Delta'(m)$. We define $k(P)$ as the number of nonintegral coordinates in $P$, and $\omega(P)$ as the sum of the number of rows and columns of the matrix representation of $P$ which contain a nonintegral coordinate.

Lemma 3.16. If $P \in \Delta'(m)$, then $k(P) \ge \omega(P)$ with equality met only when $P$ lies on exactly two supporting hyperplanes corresponding to A-inequalities for each row (resp., column) containing a nonintegral coordinate.

Proof. Let $P$ be a point in $\Delta'(m)$. We compute $\omega(P)$ by counting the number of rows and columns which contain a nonintegral coordinate of $P$. We can compute $k(P)$ by summing the number of nonintegral coordinates which occur in each row and column and then dividing by two. By Lemma 3.9, if a row (resp., column) contains a nonintegral coordinate, there must be at least two nonintegral coordinates in that row (resp., column). Therefore $k(P) \ge \omega(P)$.
In order for $\omega(P)$ to equal $k(P)$ we would need each row (resp., column) that contains a nonintegral coordinate to contain exactly two nonintegral coordinates. By Proposition 3.12, this can occur only when $P$ lies on exactly two supporting hyperplanes corresponding to A-inequalities for each row (resp., column) containing a nonintegral coordinate.

4. Proof of the facet description for the Kimura-3 polytope. In this section we show that $\Delta'(m)$ is an integral polytope by demonstrating that every nonintegral point of $\Delta'(m)$ lies in the interior of a line segment contained in $\Delta'(m)$. More formally, we observe that a point $P$ is not a vertex of $\Delta'(m)$ if there exist a nonzero vector $v \in \mathbb{R}^{3m}$ and an $\varepsilon > 0$ such that $P + \lambda v \in \Delta'(m)$ whenever $\lambda \in (-\varepsilon, \varepsilon)$. The following two lemmas will serve as tools for demonstrating that a nonintegral point $P$ is not a vertex of $\Delta'(m)$.

Lemma 4.1. Let $P \in \Delta'(m)$. If the number of nonintegral coordinates is greater than the sum of the number of rows and columns which contain nonintegral coordinates ($k(P) > \omega(P)$), then $P$ is not a vertex of $\Delta'(m)$.

Proof. Let $P$ be a point in $\Delta'(m)$ such that $k(P) > \omega(P)$. We construct a vector $v$ and an $\varepsilon > 0$ such that $P + \lambda v \in \Delta'(m)$ for all $\lambda \in (-\varepsilon, \varepsilon)$. We show that such a vector $v$ exists by explicitly defining its coordinates.

We set $v_{ij} = 0$ if $x'_{ij}$ is an integral coordinate of $P$. We set up a linear system of equations to solve for the remaining coordinates of $v$. This homogeneous system will have at most $\omega(P)$ homogeneous linear equations and $k(P)$ unknowns. For convenience, we linearly reorder the coordinates of $P$ such that $x_1, x_2, \dots, x_k$ are nonintegral.

Let $M$ be the $3 \times m$ matrix representation of $P$. For each row or column of $M$ containing nonintegral coordinates, $P$ may lie on zero, one, or two of the associated facets. If $P$ does not lie on any facet, then there exists an $\varepsilon$ such that $P + \lambda v$ will satisfy the corresponding inequalities for any choice of $v$ whenever $\lambda \in (-\varepsilon, \varepsilon)$.

If $P$ lies on a single facet, $H$, of a particular row or column, then $H$ defines a single homogeneous linear relation on $v_1, v_2, \dots, v_k$, since $P + v$ would satisfy the same linear relation so long as $\sum_{j \in A} v_j - \sum_{j \notin A} v_j = 0$, where $A$ is the indexing set which defines the facet. Thus we get a single homogeneous restriction for each row or column for which $P$ lies on a single facet.

Additionally, if $P$ lies on two facets in a particular row or column, then by Proposition 3.12 that row or column contains two nonintegral coordinates corresponding to indices $i$ and $j$. Therefore, by Lemma 3.14, $P$ may not lie on both an O-supporting hyperplane and an S-supporting hyperplane. Moreover, by the proof of Proposition 3.12, the coordinates $v_1, \dots, v_k$ must satisfy a single linear homogeneous relationship in order for $P + \lambda v$ to satisfy the equalities which define these two facets. Explicitly, $v_i - v_j = 0$ for O-supporting hyperplanes and $v_i + v_j = 0$ for S-supporting hyperplanes.

Therefore we get a system of up to $\omega(P)$ homogeneous linear equations in $k(P)$ unknowns. By the hypothesis, $k(P) > \omega(P)$, so there exists a nontrivial solution for $v$. Since $v_{i,j} = 0$ whenever $P_{i,j}$ is integral, we may choose an $\varepsilon$ such that $\lambda v + P$ does not violate any of the remaining inequalities for $\lambda \in (-\varepsilon, \varepsilon)$. It follows that $\lambda v + P \in \Delta'(m)$ for all $\lambda \in (-\varepsilon, \varepsilon)$.
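The quantities $k(P)$ and $\omega(P)$ of Definition 3.15, which control the count of unknowns against equations in the proof above, are easy to compute for a concrete matrix. The following sketch uses a hypothetical function name and two example matrices chosen only for illustration; entries are assumed to be integers or exact fractions.

```python
from fractions import Fraction


def k_and_omega(M):
    """Return (k(P), omega(P)) for a matrix P given as a list of rows."""
    rows, cols = len(M), len(M[0])
    # True where the entry is not an integer.
    nonint = [[Fraction(x).denominator != 1 for x in row] for row in M]
    k = sum(sum(row) for row in nonint)
    rows_hit = sum(any(row) for row in nonint)
    cols_hit = sum(any(nonint[i][j] for i in range(rows)) for j in range(cols))
    return k, rows_hit + cols_hit


half = Fraction(1, 2)

# Four nonintegral entries in a 2 x 2 block: two rows and two columns are hit.
block = [[half, half, 0], [half, half, 0], [0, 0, 0]]

# Every entry nonintegral: nine unknowns against three rows plus three columns.
full = [[half] * 3 for _ in range(3)]
```

For `block` the two quantities agree, which is the boundary case that Lemma 4.1 does not cover, while for `full` the strict inequality $k(P) > \omega(P)$ holds and the dimension count of the proof applies.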
Lemma 4.1 is sufficient for constructing an interval for most nonintegral points of $\Delta'(m)$. When $k(P) = \omega(P)$ we explicitly construct an interval containing $P$ using the following lemma.

Lemma 4.2. If $P \in \Delta'(m)$ has one of the configurations of nonintegral coordinates shown in (1), then $P$ is not a vertex of $\Delta'(m)$:
\[
P_1 = \begin{pmatrix}
\mathbf{x}'_{11} & \mathbf{x}'_{12} & x'_{13} & \cdots & x'_{1m} \\
\mathbf{x}'_{21} & \mathbf{x}'_{22} & x'_{23} & \cdots & x'_{2m} \\
x'_{31} & x'_{32} & x'_{33} & \cdots & x'_{3m}
\end{pmatrix},
\quad
P_2 = \begin{pmatrix}
\mathbf{x}'_{11} & \mathbf{x}'_{12} & x'_{13} & \cdots & x'_{1m} \\
\mathbf{x}'_{21} & x'_{22} & \mathbf{x}'_{23} & \cdots & x'_{2m} \\
x'_{31} & \mathbf{x}'_{32} & \mathbf{x}'_{33} & \cdots & x'_{3m}
\end{pmatrix},
\tag{1}
\]
where coordinates in bold are the only nonintegral coordinates of $P$.

Proof. Given a point $P \in \Delta'(m)$ with a configuration of nonintegral coordinates as displayed in (1), we demonstrate that $P$ is not a vertex of $\Delta'(m)$. In these two cases each row and column has at most two nonintegral coordinates, so we use the S- and O-supporting hyperplane descriptions to assist in the construction.

To build the interval, we first note that for each configuration in (1) there exists a Hamiltonian cycle,
\[
(p_1, p_2, \dots, p_k) = x'_{11}, x'_{21}, \dots, x'_{11},
\]
in the graph with vertices corresponding to nonintegral coordinates, and edges connecting nonintegral coordinates in the same row or column.

In the case of $P_1$ (resp., $P_2$) we reorder the coordinates of $v = (v_1, v_2, v_3, v_4, \dots, v_{3m})$ (resp., $v = (v_1, \dots, v_6, \dots, v_{3m})$) with the first four (resp., six) coordinates corresponding to the nonintegral coordinates of $P_1$ (resp., $P_2$) in the order of the Hamiltonian cycle.

Using the reordered coordinates, we construct a vector $v$ as follows. First set $v_1 = 1$, and for $1 \le i \le 3$ (resp., $5$) assign
\[
v_{i+1} = \begin{cases}
v_i & \text{if } p_i \text{ and } p_{i+1} \text{ lie on an O-supporting hyperplane,} \\
-v_i & \text{if } p_i \text{ and } p_{i+1} \text{ lie on an S-supporting hyperplane.}
\end{cases}
\]

We set $v_i = 0$ for all remaining coordinates. Such a collection is consistent with the set of linear constraints defined in the proof of Lemma 4.1 applied to each consecutive pair of nonintegral coordinates. The constraint induced by $v_1$ and $v_4$ (resp., $v_6$) is consistent only if there is an even number of S-supporting hyperplanes; otherwise $v_1$ would have to simultaneously be $1$ and $-1$.

To show that there is an even number of S-supporting hyperplanes we let $I$ denote the set of indices such that $x'_{ij} = 1$. We can compute $|I|$ by taking half of the number of coordinates with value 1 in each row, plus half of the number of coordinates with value 1 in each column. By Lemma 3.14, every O-row and O-column must contain an even number of coordinates with value 1, so the total number of locations with coordinate 1 in an O-row or O-column must be even. Since $|I|$ must be an integer, it follows that the total number of coordinates with value 1 in an S-row or S-column must then also be even. By Lemma 3.14, every S-row and S-column must contain an odd number of 1's. Therefore, the total number of S-rows and S-columns must be even. This confirms the consistency of the vector $v$. We now choose an $\varepsilon$ which forces $\lambda v + P$ to satisfy the remaining inequalities for $\lambda \in (-\varepsilon, \varepsilon)$. Thus $P$ is not a vertex of $\Delta'(m)$.

We utilize Lemma 4.2 to prove that $\Delta'(m)$ is integral.

Proposition 4.3. The polytope $\Delta'(m)$ is integral.

Proof. Let $P$ be a nonintegral point in $\Delta'(m)$. By Proposition 3.11 there exists a row or column for which the restriction of $P$ to that row or column lies on two or fewer supporting hyperplanes of $SDH(m)$. Since these hyperplanes are described by the same set of inequalities which define the facets of $\Delta'(m)$, it follows that there exists a row or column for which $P$ lies on two or fewer row or column facets. If every nonintegral coordinate is on a row and column for which $P$ lies on exactly two facets, then by Proposition 3.12 there are exactly two nonintegral coordinates in each such row and column.
Without loss of generality we assume that the first nonintegral coordinate is $x_{11}$. As there must be exactly one additional nonintegral coordinate in the first row and column, we assume that $x_{12}$ and $x_{21}$ are also nonintegral. By reordering the remaining coordinates we can thus assume $P$ has one of the configurations in (1). Then, by Lemma 4.2, $P$ is not a vertex of $\Delta'(m)$.

Otherwise, there must exist a row or column for which $P$ lies on fewer than two facets and thus has more than two nonintegral coordinates in that row or column. This ensures that $k(P) > \omega(P)$, and thus by Lemma 4.1, $P$ is not a vertex of $\Delta'(m)$. Therefore the polytope $\Delta'(m)$ must be integral.

Theorem 4.4. The polytope $\Delta(m)$ is an H-representation of $K(m)$.

Proof. By Proposition 4.3, the polytope $\Delta'(m)$ is integral. Applying Corollary 3.5 shows that $\Delta(m)$ is also integral. Finally, by Proposition 2.5 we have $K(m) = \Delta(m)$.

5. Conclusion. Phylogenetic varieties have been used to answer identifiability questions [1, 17] and to develop tree reconstruction algorithms [9, 12, 22]. Greater

understanding of the geometry of phylogenetic varieties, including an understanding of the singularity locus, has improved the speed and accuracy of these reconstruction methods [4, 19]. Additionally, in light of recent connections with conformal blocks and Berenstein–Zelevinsky triangles, a deeper understanding of phylogenetic varieties is also of interest to algebraic geometers [16].

The H-representation provides a new vantage point for understanding the Kimura-3 variety. The authors hope this will lead to a better understanding of the geometry and associated biology of the Kimura-3 variety and of group-based models in general. To this end, we describe open problems of both mathematical and biological interest.

We begin with a fundamental question about the geometry of the Kimura-3 variety.

Open Problem 1. Classify the singularity structure of the Kimura-3 varieties. (See [2] for an analogous study in the binary symmetric case.)

In the binary symmetric case, the variety associated to any tree with m leaves is deformation equivalent to the variety associated to the m-claw tree [2]. While such a relationship does not hold in the Kimura-3 case [15], one would still like to understand the geometric relationship among varieties associated to different n-leaf trees. From a biological perspective this problem can be posed as follows.

Open Problem 2. Describe the geometric relationship between two Kimura-3 varieties whose trees differ by a single nearest neighbor interchange.

These geometric questions are closely related to the combinatorics of the polytopes themselves. We hope that the H-representation will help provide an answer to the following combinatorial problem.

Open Problem 3. Compute the f-vector and Hilbert polynomial of the polytope associated to the Kimura-3 model for the m-claw tree.

Open Problem 4. Describe the H-representation for the polytope associated to the m-claw tree for an arbitrary finite abelian group.

Acknowledgments.
The authors would like to thank Valery Alexeev, Wayne Anderson, Kaie Kubjas, and Mateusz Michalek for their comments and suggestions during the development of this work. A special thanks also to the anonymous reviewers, whose time and careful editing greatly improved the quality of this manuscript.

REFERENCES

[1] E. S. Allman, S. Petrovic, J. A. Rhodes, and S. Sullivant, Identifiability of two-tree mixtures for group-based models, IEEE/ACM Trans. Comput. Biol. Bioinform., 8 (2011), pp. 710–722.
[2] W. Buczynska and J. A. Wisniewski, On the geometry of binary symmetric models of phylogenetic trees, J. European Math. Soc., 9 (2007), pp. 609–635.
[3] M. Casanellas and J. Fernández-Sánchez, Geometry of the Kimura 3-parameter model, Adv. Appl. Math., 41 (2008), pp. 265–292.
[4] M. Casanellas and J. Fernández-Sánchez, Relevant phylogenetic invariants of evolutionary models, J. Math. Pures Appl., 96 (2011), pp. 207–229.
[5] M. Casanellas, J. Fernández-Sánchez, and M. Michałek, Local Description of Phylogenetic Group-Based Models, preprint, https://arxiv.org/abs/1402.6945, 2014.
[6] J. Cavender and J. Felsenstein, Invariants of phylogenies in a simple case with discrete states, J. Classification, 4 (1987), pp. 57–71.
[7] D. A. Cox, J. B. Little, and H. K. Schenck, Toric Varieties, AMS, Providence, RI, 2011.
[8] M. Donten-Bury and M. Michałek, Phylogenetic invariants for group-based models, J. Algebraic Statist., 3 (2012), pp. 44–63.
[9] N. Eriksson, Tree construction using singular value decomposition, in Algebraic Statistics for Computational Biology, Cambridge University Press, New York, 2005, pp. 347–358.
[10] N. Eriksson, K. Ranestad, B. Sturmfels, and S. Sullivant, Phylogenetic algebraic geometry, in Projective Varieties with Unexpected Properties, Walter de Gruyter, Berlin, 2005, pp. 237–255.
[11] S. N. Evans and T. Speed, Invariants of some probability models used in phylogenetic inference, Ann. Statist., 21 (1993), pp. 355–377.
[12] J. Fernández-Sánchez and M. Casanellas, Invariant versus classical quartet inference when evolution is heterogeneous across sites and lineages, System. Biol., 65 (2015), pp. 280–291.
[13] W. Fulton, Introduction to Toric Varieties, Ann. Math. Stud. 131, Princeton University Press, Princeton, NJ, 1993.
[14] M. Kimura, A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences, J. Molec. Evolution, 16 (1980), pp. 111–120.
[15] K. Kubjas, Hilbert polynomial of the Kimura 3-parameter model, J. Algebraic Statist., 3 (2012), pp. 64–69.
[16] K. Kubjas and C. Manon, Conformal blocks, Berenstein–Zelevinsky triangles, and group-based models, J. Algebraic Combin., 40 (2014), pp. 861–886.
[17] C. Long and S. Sullivant, Identifiability of 3-class Jukes–Cantor mixtures, Adv. Appl. Math., 64 (2015), pp. 89–110.
[18] C. A. Manon, The Algebra of Conformal Blocks, preprint, https://arxiv.org/abs/0910.0577, 2009.
[19] F. A. Matsen, Fourier transform inequalities for phylogenetic trees, IEEE/ACM Trans. Comput. Biol. Bioinform., 6 (2009), pp. 89–95.
[20] M. Michałek, Toric geometry of the 3-Kimura model for any tree, Adv. Geom., 14 (2014), pp. 11–30.
[21] L. Pachter and B. Sturmfels, Algebraic Statistics for Computational Biology, Cambridge University Press, Cambridge, UK, 2005.
[22] J. P. Rusinko and B. Hipp, Invariant based quartet puzzling, Alg. Molec. Biol., 7 (2012), p. 35.
[23] B. Sturmfels and S. Sullivant, Toric ideals of phylogenetic invariants, J. Comput. Biol., 12 (2005), pp. 204–228.
[24] S. Sullivant, Toric fiber products, J. Algebra, 316 (2007), pp. 560–577.