arXiv:1811.09553v3 [math.AG] 1 Oct 2020

HIGHER-DISTANCE COMMUTING VARIETIES

MADELEINE ELYZE, ALEXANDER GUTERMAN^{1,2,3}, RALPH MORRISON^{4}, AND KLEMEN ŠIVIC^{5}

Abstract. The commuting variety of matrices over a given field is a well-studied object in linear algebra and algebraic geometry. As a set, it consists of all pairs of square matrices with entries in that field that commute with one another. In this paper we generalise the commuting variety by using the commuting distance of matrices. We show that over an algebraically closed field, each of our sets does indeed form a variety. We compute the dimension of the distance-2 commuting variety and characterize its irreducible components. We also work over other fields, showing that the distance-2 commuting set is a variety but that the higher distance commuting sets may or may not be varieties, depending on the field and on the size of the matrices.

Keywords: commuting variety, commuting distance, commuting graph.
MSC: 14M12, 15A27.
Date: October 1, 2020.

1 Lomonosov Moscow State University, Moscow, 119991, Russia.
2 Moscow Center for Fundamental and Applied Mathematics, Moscow, 119991, Russia.
3 Moscow Institute of Physics and Technology, Dolgoprudny, 141701, Russia.
4 Williams College, Williamstown, MA 01267 USA
5 University of Ljubljana, Faculty of Mathematics and Physics, Jadranska 19, SI-1000 Ljubljana, Slovenia.

1. Introduction

Let $k$ be a field, $n \ge 2$ an integer, and $\mathrm{Mat}_{n\times n}(k)$ the set of all $n \times n$ matrices over $k$; sometimes we will simply write $\mathrm{Mat}_{n\times n}$ when $k$ is implicit. We are interested in those pairs of matrices $(A, B) \in \mathrm{Mat}^2_{n\times n}$ which commute with one another under matrix multiplication. We identify $\mathrm{Mat}_{n\times n}$ with the $n^2$-dimensional space $k^{n^2}$, so that a pair of matrices $(A, B) \in \mathrm{Mat}^2_{n\times n}$ corresponds to a point in $2n^2$-dimensional space. As a set, the commuting variety consists of all points $(A, B) \in \mathrm{Mat}^2_{n\times n}$ such that $AB = BA$. It has the structure of an affine variety, meaning that it is the solution set of a finite collection of polynomial equations. In particular, it is defined by the $n^2$ equations of the form $(XY - YX)_{ij} = 0$, where $X$ and $Y$ are $n \times n$ matrices filled with the $2n^2$ variables $x_{ij}$ and $y_{ij}$. This variety is known to be irreducible [9, 15], meaning that it cannot be written as the union of two proper subvarieties. This study of commutativity of pairs of matrices can be generalised to varieties of commuting triples of matrices [10], or commuting $p$-tuples of matrices.

In this paper we introduce other generalizations of the commuting matrix variety, which are defined using the commuting distance of matrices, and prove that these generalizations are indeed affine varieties. To define them, we need a measure of how close two matrices are to commuting with one another. For instance, if a matrix is scalar, that is, a constant multiple of the identity matrix, then all matrices commute with it. We will use the commuting distance; one fruitful approach for studying all pairs of commuting matrices is to study the commuting graph $\Gamma(\mathrm{Mat}_{n\times n})$ over $k$ [2].
This graph has vertices corresponding to the $n \times n$ non-scalar matrices over $k$, with an edge between $A$ and $B$ if and only if $AB = BA$. With a view towards defining distance, a natural question is whether $\Gamma(\mathrm{Mat}_{n\times n})$ is connected, and if so what its diameter is; that is, what is the supremum of the lengths of all shortest paths between pairs of vertices. The answer depends both on the field $k$ and on the integer $n$. As noted in [4, Remark 8] and discussed in Proposition 2.14, for $n = 2$ the graph $\Gamma(\mathrm{Mat}_{n\times n})$ is disconnected. However, it was shown in [3] that if $n \ge 3$ and $k$ is an algebraically closed field, then $\Gamma(\mathrm{Mat}_{n\times n})$ is connected, and in fact $\mathrm{diam}(\Gamma(\mathrm{Mat}_{n\times n})) = 4$. The same holds when $k = \mathbb{R}$ [17]. Note that the situation is quite different for other fields. For example, the graph $\Gamma(\mathrm{Mat}_{n\times n})$ over the rationals $\mathbb{Q}$ is not connected [1, Remark 8].

Whenever $\Gamma(\mathrm{Mat}_{n\times n}(k))$ is connected, for any two non-scalar matrices $A, B \in \mathrm{Mat}_{n\times n}$ we can define $d_k(A, B)$ to be the distance between the corresponding vertices on the graph $\Gamma(\mathrm{Mat}_{n\times n}(k))$; we will drop the subscript $k$ and simply write $d(A, B)$ when the field is implicit. Under this definition, $d(A, B) = 0$ if and only if $A = B$, and $d(A, B) = 1$ if and only if $A \neq B$ and $A$ and $B$ commute. We extend the function $d(\cdot, \cdot)$ to take any two $n \times n$ matrices as input by letting $d(A, B) = 1$ whenever $A$ or $B$ is a scalar matrix (unless $A = B$, in which case $d(A, B) = 0$). If $\Gamma(\mathrm{Mat}_{n\times n})$ is not connected, then we take $d(A, B) = \infty$ whenever $A$ and $B$ are non-scalar matrices in different components of the graph.

We define the distance-$d$ commuting set $C_n^d(k)$ to be the set of pairs of $n \times n$ matrices with entries in $k$ that have commuting distance at most $d$. That is,
$$C_n^d(k) := \{(A, B) \mid d(A, B) \le d\} \subset \mathrm{Mat}^2_{n\times n}.$$
When $k$ is clear from context, we abbreviate this notation to $C_n^d$. Perhaps the most familiar of these sets is $C_n^1$, which is simply the $n \times n$ commuting variety.
Our main theorem shows that regardless of our choice of $d$, the set $C_n^d$ is still an affine variety, at least over a field that is algebraically closed.

Theorem 1.1. Let $k$ be an algebraically closed field. Then for any $d \ge 0$, the set $C_n^d$ is an affine variety in the $2n^2$-dimensional affine space $\mathrm{Mat}^2_{n\times n}$. The same result holds for any field if $d \le 2$.

There do exist choices of $k$, $n$, and $d$ such that $C_n^d(k)$ is not a variety, as illustrated in the following proposition suggested to the authors by Jan Draisma.

Proposition 1.2. Let $n \ge 3$ and $d \ge 4$. The set $C_n^d(k)$ is a variety if and only if $k$ is finite or $\mathrm{diam}(\Gamma(\mathrm{Mat}_{n\times n}(k))) \le d$.

The final remaining case of $d = 3$ for infinite, non-algebraically closed fields is more subtle. It turns out that such commuting sets are not always varieties.

Proposition 1.3. The distance-3 commuting set of pairs of $4 \times 4$ real matrices $C_4^3(\mathbb{R})$ is not a variety.

The fact that $C_n^d$ is a variety over many fields has a number of nice consequences. For instance, every variety has a well-defined notion of dimension, meaning that this set is a reasonable geometric object. Moreover, if we are working over $k = \mathbb{C}$, or $k = \mathbb{R}$ with $d \le 2$, every variety is closed in the usual Euclidean topology, so we know that these sets are closed.

It turns out that many instances of Theorem 1.1 are readily proven. In the case of $d = 0$, we have that $C_n^0 = \{(A, B) \mid A = B\}$, which is defined by the $n^2$ equations of the form $(X - Y)_{ij} = 0$, and is thus an affine variety. As already noted, $C_n^1$ is the commuting variety.

In the case of $n = 2$ and $d \ge 2$, any two matrices $A$ and $B$ with $d(A, B) \le d$ must in fact satisfy $d(A, B) \le 1$ (see Proposition 2.14), so $C_n^d = C_n^1$. Since $C_n^1$ is a variety, the theorem holds in this case. Finally, if $n \ge 3$ and $d \ge 4$ then we have
$$C_n^d = C_n^4 = \mathrm{Mat}^2_{n\times n}$$
since $d(A, B) \le 4$ for all $A$ and $B$ [3], [17]. This set is indeed a variety: it is the vanishing locus of the zero polynomial.

Thus, it remains to show that Theorem 1.1 holds in the case where $n \ge 3$, and either $d = 2$ or $d = 3$; and to prove our results for more general fields. After presenting useful background material on varieties and on commuting distance in Section 2 (as well as proving Proposition 1.2), we handle the $d = 2$ case in Section 3. We also give an explicit decomposition of the distance-2 commuting variety into irreducibles, and compute its dimension. The $d = 3$ case, for both algebraically closed fields and otherwise, is studied in Section 4.

2. Affine varieties and commuting distance

2.1. Affine varieties. We devote the first part of this section to presenting the tools and results from algebraic geometry that we will use. Most of our notation and terminology comes from [5].

Let $k$ be a field, and let $R = k[x_1, \dots, x_n]$ be the polynomial ring in $n$ variables over $k$. We refer to $k^n$ as $n$-dimensional affine space. Let $f_1, \dots, f_s \in R$. The vanishing locus of $f_1, \dots, f_s$ is the set of points in $k^n$ at which all the polynomials vanish:
$$V_k(f_1, \dots, f_s) := \{(a_1, \dots, a_n) \in k^n \mid f_i(a_1, \dots, a_n) = 0 \text{ for } 1 \le i \le s\}.$$
We refer to any set of this form as an affine variety. Given an ideal $I \subset R$, we define its vanishing locus as
$$V_k(I) := \{(a_1, \dots, a_n) \in k^n \mid f(a_1, \dots, a_n) = 0 \text{ for all } f \in I\}.$$

If $I = \langle f_1, \dots, f_s \rangle$, then we have $V_k(I) = V_k(f_1, \dots, f_s)$ [5, Proposition 2.5.9]. Since every ideal $I \subset R$ can be generated by finitely many polynomials [12], we could equivalently define an affine variety to be any set of the form $V_k(I)$ where $I$ is an ideal in $R$.

Example 2.1. We consider $\mathrm{Mat}^2_{2\times 2}$ as 8-dimensional affine space, each point of which corresponds to a pair of $2 \times 2$ matrices
$$(A, B) = \left( \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}, \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} \right).$$
Our ring of polynomials can then be written as $R = k[x_{11}, x_{12}, x_{21}, x_{22}, y_{11}, y_{12}, y_{21}, y_{22}]$, where the variable $x_{ij}$ (respectively $y_{ij}$) corresponds to the coordinate $a_{ij}$ (respectively $b_{ij}$). For shorthand, we could also write $R = k[X, Y]$, where $X = \begin{pmatrix} x_{11} & x_{12} \\ x_{21} & x_{22} \end{pmatrix}$ and $Y = \begin{pmatrix} y_{11} & y_{12} \\ y_{21} & y_{22} \end{pmatrix}$. Consider the variety $V(f_1, f_2, f_3, f_4)$, where

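$$\begin{aligned}
f_1 &= x_{11}y_{11} + x_{12}y_{21} - x_{11}y_{11} - x_{21}y_{12}, \\
f_2 &= x_{11}y_{12} + x_{12}y_{22} - x_{12}y_{11} - x_{22}y_{12}, \\
f_3 &= x_{21}y_{11} + x_{22}y_{21} - x_{11}y_{21} - x_{21}y_{22}, \\
f_4 &= x_{21}y_{12} + x_{22}y_{22} - x_{12}y_{21} - x_{22}y_{22}.
\end{aligned}$$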

These polynomials are simply the entries of $XY - YX$. It follows that $f_1(A, B) = f_2(A, B) = f_3(A, B) = f_4(A, B) = 0$ if and only if $AB - BA = 0$, that is, if and only if $A$ and $B$ commute. Thus, $V_k(f_1, f_2, f_3, f_4)$ is the $2 \times 2$ commuting variety. In fact, since $f_1 = -f_4$, we have $V_k(f_1, f_2, f_3, f_4) = V_k(f_1, f_2, f_3)$.
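As a quick sanity check of Example 2.1, the following Python sketch evaluates $f_1, \dots, f_4$ at integer points and compares them with the entries of $AB - BA$. The helper names (`matmul2`, `commutator_polys`, `commutator_entries`) and the sample matrices are illustrative choices of ours, not part of the original text.

```python
import random

def matmul2(P, Q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator_polys(A, B):
    """Evaluate f1, f2, f3, f4 from Example 2.1 at the point (A, B)."""
    (x11, x12), (x21, x22) = A
    (y11, y12), (y21, y22) = B
    f1 = x11*y11 + x12*y21 - x11*y11 - x21*y12
    f2 = x11*y12 + x12*y22 - x12*y11 - x22*y12
    f3 = x21*y11 + x22*y21 - x11*y21 - x21*y22
    f4 = x21*y12 + x22*y22 - x12*y21 - x22*y22
    return [f1, f2, f3, f4]

def commutator_entries(A, B):
    """Entries of AB - BA, read row by row."""
    AB, BA = matmul2(A, B), matmul2(B, A)
    return [AB[i][j] - BA[i][j] for i in range(2) for j in range(2)]

rng = random.Random(0)  # fixed seed so the check is reproducible
for _ in range(100):
    A = [[rng.randint(-5, 5) for _ in range(2)] for _ in range(2)]
    B = [[rng.randint(-5, 5) for _ in range(2)] for _ in range(2)]
    f = commutator_polys(A, B)
    assert f == commutator_entries(A, B)  # the f_i are the entries of XY - YX
    assert f[0] == -f[3]                  # f1 = -f4, so f4 is redundant

# A commuting pair: B = A^2 is a polynomial in A, so every f_i vanishes.
A = [[1, 2], [3, 4]]
B = matmul2(A, A)
print(commutator_polys(A, B))
```

Random testing of this kind is only evidence, not a proof; the identity $f_1 = -f_4$ itself follows from the fact that $XY - YX$ has trace zero.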

If a set $S \subset k^n$ is an affine variety, we say that it is Zariski-closed. This terminology comes from the Zariski topology on $k^n$, which defines a subset of $k^n$ to be closed if it is equal to $V_k(f_1, \dots, f_s)$ for some $f_1, \dots, f_s \in k[x_1, \dots, x_n]$. This defines a topology because the union of finitely many varieties is a variety; because the intersection of any collection of varieties is a variety; and because $\emptyset = V_k(1)$ and $k^n = V_k(0)$ are both varieties.

Definition 2.2. For a general set $S \subset k^n$, the Zariski closure of $S$, denoted $\overline{S}$, is the smallest Zariski-closed set (that is, the smallest variety) containing $S$.

Example 2.3. Consider the set $S = \{(a, a) \mid a \neq 0\} \subset \mathbb{R}^2$. This set is not a variety: if a polynomial $f(x, y)$ vanishes on the set $S$, then since polynomials are continuous functions we also have $f(0, 0) = 0$. It follows that any variety containing $S$ must also contain the point $(0, 0)$. In fact, the Zariski closure of $S$ is simply $S$ together with this point:

$$\overline{S} = S \cup \{(0, 0)\} = V_{\mathbb{R}}(x - y).$$

In the previous example over $\mathbb{R}$, the Zariski closure of our set happened to equal its closure in the usual Euclidean topology. When working over $\mathbb{R}$ or $\mathbb{C}$, since polynomials are continuous functions, all Zariski-closed sets are closed in the Euclidean topology, so the Zariski closure always contains the Euclidean closure. This containment can be strict, although in certain special cases we have equality, such as for constructible sets over $\mathbb{C}$.

Definition 2.4. We say that a set in $k^n$ is constructible if it is defined by a finite boolean combination of polynomial conditions. In other words, it can be written as a finite combination of intersections and unions of affine varieties and their complements.

For example, the set $S$ from Example 2.3 is constructible, since $S = V_{\mathbb{R}}(x - y) \setminus V_{\mathbb{R}}(x, y)$.

Proposition 2.5 (Corollary 1 in Section 1.10 of [16]). Suppose $S \subset \mathbb{C}^n$ is a constructible set. Then the Zariski closure of $S$ is equal to the Euclidean closure of $S$.

A variety $V$ is called irreducible if it is irreducible in the Zariski topology, i.e. if it cannot be written as a union $V = V_1 \cup V_2$ of two proper subvarieties $V_1 \subsetneq V$ and $V_2 \subsetneq V$. Any variety $V$ can be written as a finite union of irreducible varieties $V_1, \dots, V_s$, unique up to reordering, with no $V_i$ contained in any $V_j$ for $i \neq j$ [5, Theorem 4.6.4].

The ideal-variety correspondence [5, Chapter 4] allows us to connect geometric operations on affine varieties to algebraic operations on ideals. Often these correspondences only hold up to taking Zariski closures.

Proposition 2.6 (Theorem 3.2.3 in [5]). Let $k$ be an algebraically closed field, and let $I \subset k[x_1, \dots, x_n]$ be an ideal. Let $1 \le m \le n$, and let $\pi : k^n \to k^m$ denote projection onto the first $m$ coordinates. Then

$$\overline{\pi(V_k(I))} = V_k(I \cap k[x_1, \dots, x_m]).$$
In other words, up to Zariski closure, projecting a variety corresponds to eliminating variables.
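To make Proposition 2.6 concrete, here is a small worked instance. The ideal below is an illustrative choice of ours (a standard parametrized cuspidal cubic), not an example taken from the paper; the variables are ordered as $(x, y, t)$ so that projecting onto the first two coordinates eliminates $t$.

```latex
% Illustrative example (our own choice): eliminate t from a parametrized curve.
I = \langle x - t^2,\; y - t^3 \rangle \subset k[x, y, t],
\qquad
\pi : k^3 \to k^2, \quad \pi(x, y, t) = (x, y).

% The variety and its projection:
V_k(I) = \{ (t^2, t^3, t) \mid t \in k \},
\qquad
\pi(V_k(I)) = \{ (t^2, t^3) \mid t \in k \}.

% Eliminating t leaves the single relation y^2 = x^3:
I \cap k[x, y] = \langle y^2 - x^3 \rangle,
\qquad
\overline{\pi(V_k(I))} = V_k(y^2 - x^3).
```

In this instance the projection is already Zariski-closed: if $y^2 = x^3$ with $x \neq 0$ then $t = y/x$ satisfies $t^2 = x$ and $t^3 = y$, and the only point with $x = 0$ is the origin. So the closure in Proposition 2.6 changes nothing here, although in general it cannot be omitted.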

Proposition 2.7 (Theorem 4.4.10 in [5]). Let $k$ be an algebraically closed field, and let $I, J \subset k[x_1, \dots, x_n]$ be ideals. Let
$$I : J^\infty = \{f \in k[x_1, \dots, x_n] \mid \text{for all } g \in J, \text{ there is } N \ge 0 \text{ such that } f g^N \in I\}.$$
Then $I : J^\infty$ is an ideal, and
$$V_k(I : J^\infty) = \overline{V_k(I) \setminus V_k(J)}.$$

Definition 2.8. The ideal $I : J^\infty$ is called the saturation of $I$ with respect to $J$.

So, up to Zariski closure, taking a set-theoretic difference of varieties corresponds to saturating one ideal with respect to another.

We now build up some tools that will help us study the distance-$d$ commuting sets. We briefly discuss projective algebraic geometry [5, Chapter 8]. Although our main results are stated in the language of affine varieties, projective geometry will be used in the proofs of Theorem 3.8 and Proposition 4.4. Again we choose a field $k$ and an integer $n$. The role of affine space is now played by projective space $\mathbb{P}^n_k$, which as a set is all equivalence classes of $(n + 1)$-tuples $(a_0 : a_1 : \cdots : a_n)$ where $a_i \in k$, not all equal to 0. The equivalence is defined by $(a_0 : a_1 : \cdots : a_n) \sim (b_0 : b_1 : \cdots : b_n)$ if and only if there exists a nonzero constant $\lambda$ with $a_i = \lambda b_i$ for all $i$. There are a number of other constructions of projective space, such as the one given in the example below.

Example 2.9. Let $V$ be a non-zero vector space. Then the projectivization of $V$, denoted $\mathbb{P}(V)$, is the set of one-dimensional vector subspaces of $V$.

In order to talk about polynomials vanishing at a point of projective space, we must consider polynomials that are homogeneous; that is, polynomials where all terms have equal degree. This is because if $f \in k[x_0, \dots, x_n]$ is homogeneous with each term having degree $d$, then
$$f(\lambda a_0, \lambda a_1, \dots, \lambda a_n) = \lambda^d f(a_0, a_1, \dots, a_n)$$
for any $a_i \in k$ and any $\lambda$. Thus, $f$ vanishing at the tuple $(a_0 : \cdots : a_n)$ implies it vanishes for all other tuples in the same equivalence class.

We can consider a product of a projective space and an affine space:
$$k^m \times \mathbb{P}^n_k.$$
A subset of such a space is Zariski-closed if and only if it is the vanishing locus of a collection of polynomials that are homogeneous in the variables corresponding to the coordinates of $\mathbb{P}^n_k$. A key result that we will use is that when projecting away from projective spaces, Zariski-closed sets are mapped to Zariski-closed sets, at least over an algebraically closed field.

Proposition 2.10 (Proposition 8.5.5 and Theorem 8.5.6 in [5]). Let $k$ be an algebraically closed field, and let $\pi : k^m \times \mathbb{P}^n_k \to k^m$ be the projection map onto the affine space. Then $\pi$ maps Zariski-closed sets to Zariski-closed sets.

In the language of morphisms, such a projection $\pi$ is a proper morphism. We can generalise this to have more copies of affine and projective spaces, and the result still holds as long as we are projecting away from projective spaces.

Note that this result does not hold when projecting away from affine spaces rather than projective spaces: the projection map $\pi : \mathbb{C} \times \mathbb{C} \to \mathbb{C}$ defined by $\pi(a, b) = a$ sends the hyperbola $V_{\mathbb{C}}(xy - 1)$ to the set $\{a \in \mathbb{C} \mid a \neq 0\}$, which is not a variety. Moreover, an analog of Proposition 2.10 does not hold for the field of real numbers, as the following example shows.

Example 2.11. Let $S = \{([x_0 : x_1], y) \mid x_0^2 = y x_1^2\} \subset \mathbb{P}^1(\mathbb{R}) \times \mathbb{R}$. Then $S$ is Zariski-closed, but the projection to the second component is the set of non-negative reals, which is not Zariski-closed.

Several of our results in Section 4 will study varieties over the real numbers. To help with that endeavor we close this subsection with the following result.

Lemma 2.12. Let $k$ be a field, and let $K$ be an algebraic extension of $k$. If $V \subset K^n$ is an affine variety in $K^n$, then $V \cap k^n$ is a variety in $k^n$.

Proof. Let $V = V_K(f_1, \dots, f_r)$ be an affine variety in $K^n$, where $f_i \in K[x_1, \dots, x_n]$ for all $i$. Since $K$ is an algebraic extension of $k$, there exists a finite extension $k'$ of $k$ containing all coefficients of all $f_i$. Note that $W := V_{k'}(f_1, \dots, f_r) = V_K(f_1, \dots, f_r) \cap (k')^n$, since both sets are the set of all points in $(k')^n$ where $f_1, \dots, f_r$ simultaneously vanish.

Let $\alpha_1, \dots, \alpha_s$ be a vector space basis for $k'$ over $k$. We can write each $f_i$ as
$$f_i = \sum_{j=1}^{s} \alpha_j g_{ij},$$
where $g_{ij} \in k[x_1, \dots, x_n]$ for all $i, j$. We claim that $W \cap k^n = V_k(g_{11}, \dots, g_{rs})$. Certainly $W \cap k^n \supset V_k(g_{11}, \dots, g_{rs})$: if all $g_{ij}$ vanish at a point, then all $f_i$ do the same. For the other direction, suppose $p \in W \cap k^n$, so that $f_i(p) = 0$ for all $i$. This means that
$$\sum_{j=1}^{s} \alpha_j g_{ij}(p) = 0$$
for all $i$. Since $\alpha_1, \dots, \alpha_s$ is a vector space basis for $k'$ over $k$ and each $g_{ij}(p)$ lies in $k$, this implies that $g_{ij}(p) = 0$ for all $i$ and $j$. Thus $p \in V_k(g_{11}, \dots, g_{rs})$, giving us the other direction of containment. This proves that $W \cap k^n$ is a variety; since $W \cap k^n = (V \cap (k')^n) \cap k^n = V \cap k^n$, this completes the proof. □

2.2. Commuting distance. Now we present and develop material about commuting distance and commuting graphs. Let $A, B \in \mathrm{Mat}_{n\times n}$. We say that $A$ and $B$ commute if $AB = BA$. For notation, we will sometimes write $A \leftrightarrow B$ to indicate that $A$ and $B$ commute.

Given two non-scalar matrices $A$ and $B$, define the commuting distance $d(A, B)$ to be the distance between $A$ and $B$ on the commuting graph $\Gamma(\mathrm{Mat}_{n\times n})$, so that $d(A, B) = 0$ if and only if $A = B$, and $d(A, B) = 1$ if and only if $A \neq B$ and $A$ and $B$ commute. If $A$ and $B$ do not commute, then $d(A, B) = d$, where $d$ is the smallest natural number, if it exists, such that there exist $d - 1$ non-scalar matrices $C_1, \dots, C_{d-1}$ with $d(A, C_1) = d(C_{d-1}, B) = d(C_i, C_{i+1}) = 1$ for $1 \le i \le d - 2$. Such a commuting chain could be written as
$$A \leftrightarrow C_1 \leftrightarrow C_2 \leftrightarrow \cdots \leftrightarrow C_{d-1} \leftrightarrow B.$$
If there does not exist such a chain, we define $d(A, B) = \infty$. We note that it is vital to require that all $C_i$ are non-scalar: otherwise, we would have $d(A, B) \le 2$ for all $A$ and $B$. We extend the definition of the distance function to allow scalar matrices as input by defining $d(A, A) = 0$ and $d(A, B) = d(B, A) = 1$ whenever $A$ is a scalar matrix and $A \neq B$.

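Example 2.13. Consider the following three matrices over $\mathbb{C}$:
$$A = \begin{pmatrix} 1 & 2 & 0 \\ 3 & 4 & 0 \\ 0 & 0 & 5 \end{pmatrix}, \quad B = \begin{pmatrix} 1 & 1 & 0 \\ 2 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}, \quad C = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$
Direct computation shows that $A \leftrightarrow C$ and $B \leftrightarrow C$, so $d(A, C) = 1$ and $d(B, C) = 1$. Similarly we can verify that $AB \neq BA$. Since $A$ and $B$ do not commute but they do commute with the common non-scalar matrix $C$, we have $d(A, B) = 2$.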
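The "direct computation" in Example 2.13 can be replayed with a few lines of Python. This is only a sketch using plain nested lists; the helper names `matmul`, `commute`, and `is_scalar` are our own.

```python
def matmul(P, Q):
    """Product of two square matrices given as nested lists."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commute(P, Q):
    """True when PQ = QP."""
    return matmul(P, Q) == matmul(Q, P)

def is_scalar(P):
    """True when P is a constant multiple of the identity matrix."""
    n = len(P)
    return all(P[i][j] == (P[0][0] if i == j else 0)
               for i in range(n) for j in range(n))

# The three matrices of Example 2.13.
A = [[1, 2, 0], [3, 4, 0], [0, 0, 5]]
B = [[1, 1, 0], [2, 2, 0], [0, 0, 3]]
C = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]

# A <-> C <-> B is a commuting chain through a non-scalar matrix ...
chain_ok = commute(A, C) and commute(C, B) and not is_scalar(C)
print("A <-> C <-> B with C non-scalar:", chain_ok)
# ... while A and B themselves do not commute, so d(A, B) = 2.
print("A and B commute:", commute(A, B))
```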

Proposition 2.14. For any field $k$ let $A, B \in \mathrm{Mat}_{2\times 2}$. Then either $d(A, B) \le 1$ or $d(A, B) = \infty$.

Proof. It will suffice to show that there do not exist any matrices $A$ and $B$ with $d(A, B) = 2$: if two matrices are at finite distance at least 2, then a subchain of a shortest commuting chain between them yields two matrices at distance exactly 2. Suppose for the sake of contradiction that there exist $A$ and $B$ with $d(A, B) = 2$. Let $C$ be a non-scalar matrix such that $A \leftrightarrow C \leftrightarrow B$. Since $C$ is not a scalar matrix, it is not the root of a polynomial of degree 1, meaning that the minimal polynomial of $C$ has degree at least 2. The minimal polynomial thus coincides with the characteristic polynomial of $C$ up to a scalar factor. Since $C$ is a matrix with equal minimal and characteristic polynomials, the matrices commuting with $C$ are precisely those of the form $p(C)$, where $p \in k[x]$ is a polynomial [14, Theorem 3.2.4.2]. It follows that $A = p(C)$ and $B = q(C)$ for some polynomials $p$ and $q$. But $p(C)$ and $q(C)$ commute, so $d(A, B) \le 1$, a contradiction. □

Since over any field there are pairs of $2 \times 2$ matrices that do not commute (for instance, $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ and $\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$), we immediately get the following result originally noted in [4, Remark 8]:

Corollary 2.15. The graph Γ(Mat2×2) is disconnected, regardless of the field k.
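Proposition 2.14 and Corollary 2.15 can be observed directly on a small finite field, where the commuting graph can be enumerated. The sketch below builds $\Gamma(\mathrm{Mat}_{2\times 2})$ over $\mathbb{F}_3$ and runs a breadth-first search from every vertex. The choice of $\mathbb{F}_3$, the flat-tuple encoding of matrices, and all function names are assumptions made here for illustration.

```python
from collections import deque
from itertools import product

P = 3  # work over the finite field F_3 (an illustrative choice)

def matmul(A, B):
    """Multiply 2x2 matrices stored as flat tuples (a, b, c, d), modulo P."""
    a, b, c, d = A
    e, f, g, h = B
    return ((a*e + b*g) % P, (a*f + b*h) % P,
            (c*e + d*g) % P, (c*f + d*h) % P)

def is_scalar(A):
    """True when A is a multiple of the identity."""
    a, b, c, d = A
    return b == 0 and c == 0 and a == d

# Vertices of the commuting graph: all non-scalar 2x2 matrices over F_3.
vertices = [M for M in product(range(P), repeat=4) if not is_scalar(M)]

# Edges join distinct commuting matrices.
neighbors = {A: [B for B in vertices
                 if B != A and matmul(A, B) == matmul(B, A)]
             for A in vertices}

def distances_from(A):
    """Breadth-first search: commuting distance from A to every reachable vertex."""
    dist = {A: 0}
    queue = deque([A])
    while queue:
        M = queue.popleft()
        for N in neighbors[M]:
            if N not in dist:
                dist[N] = dist[M] + 1
                queue.append(N)
    return dist

# Largest finite distance that occurs, and whether one search reaches everything.
max_finite = max(max(distances_from(A).values()) for A in vertices)
connected = len(distances_from(vertices[0])) == len(vertices)
print("largest finite distance:", max_finite)
print("graph connected:", connected)
```

Every connected component found this way is a clique, which is the graph-theoretic content of Proposition 2.14: no two $2 \times 2$ matrices are at distance exactly 2.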

Once $n \ge 3$, the connectedness of $\Gamma(\mathrm{Mat}_{n\times n})$ is guaranteed by $k$ being algebraically closed or real closed (the latter meaning that $\overline{k}/k$ is a nontrivial finite extension). The graph $\Gamma(\mathrm{Mat}_{n\times n})$ can be disconnected in other cases, such as over the field $\mathbb{Q}$ of rational numbers [1, Remark 8].

Theorem 2.16 (Corollary 7 in [3]). Let $k$ be an algebraically closed field, and let $n \ge 3$. Then $\Gamma(\mathrm{Mat}_{n\times n})$ is connected, and $\mathrm{diam}(\Gamma(\mathrm{Mat}_{n\times n})) = 4$.

Below we provide a short proof of this fact for completeness. To prove that the diameter is at most 4, one constructs for any $A$ and $B$ a commuting chain $A \leftrightarrow C \leftrightarrow D \leftrightarrow E \leftrightarrow B$, where $C$, $D$, and $E$ are non-scalar. Choose $C$ and $E$ to be rank 1 matrices built out of left and right eigenvectors of $A$ and of $B$, respectively. Then choose $D$ to be a rank 1 matrix commuting with both $C$ and $E$, which is possible for any two rank 1 matrices as long as $n \ge 3$. Thus, we have $d(A, B) \le 4$ for all $A$ and $B$. The same result for real closed fields is proved in [17].

One way to see that the diameter is indeed equal to 4 (rather than to 2 or 3) is to note that an elementary Jordan matrix is always at distance four from its transpose, as proven in [3]. The authors of [8] go on to give a necessary condition for when two matrices can be at the maximal distance of four from one another.

A matrix $A$ is said to be non-derogatory if its characteristic polynomial is equal to its minimal polynomial. Equivalently, a matrix is said to be non-derogatory if for each distinct eigenvalue $\lambda$ of $A$ there is only one Jordan block corresponding to $\lambda$ in the Jordan normal form for $A$ [6, Chapter 7]. If a matrix is not non-derogatory, we say it is derogatory.

Theorem 2.17. [8, Theorem 1.1] Let $n \ge 3$ and $k$ be algebraically closed. Then the following statements are equivalent for a non-scalar matrix $A \in \mathrm{Mat}_{n\times n}$.
(i) $A$ is non-derogatory.
(ii) $A$ commutes only with elements of $k[A]$.
(iii) There exists a matrix $B \in \mathrm{Mat}_{n\times n}$ such that $d(A, B) = 4$.
We close this section by proving Proposition 1.2.

Proof of Proposition 1.2. In affine space over a finite field, any set is a variety, as it is the union of finitely many points. Thus we will assume $k$ is infinite.

If $\mathrm{diam}(\Gamma(\mathrm{Mat}_{n\times n}(k))) \le d$, then $C_n^d(k)$ is all of $2n^2$-dimensional affine space, which is indeed a variety, namely $V(0)$. Conversely, assume $\mathrm{diam}(\Gamma(\mathrm{Mat}_{n\times n}(k))) > d$. This means that $C_n^d(k) \subsetneq \mathrm{Mat}^2_{n\times n}$. Since $d \ge 4$, we know that $k$ is not algebraically closed. We will show that the Zariski closure of $C_n^4(k)$ is $\mathrm{Mat}^2_{n\times n}$, implying that $C_n^4(k)$ is not a variety.

Indeed, we first argue that the set of all pairs $(A, B)$, both diagonalizable over $k$, forms a Zariski dense set in $\mathrm{Mat}^2_{n\times n}(k)$. First we note that $k$ is Zariski dense in $\overline{k}$: this is because any infinite subset of a field is Zariski dense in that field. It follows that $\mathrm{GL}_n(k)$ is Zariski dense in $\mathrm{GL}_n(\overline{k})$, and that $k^n$ is Zariski dense in $\overline{k}^n$. This means that the image of the map
$$\mathrm{GL}_n(k) \times k^n \to \mathrm{Mat}_{n\times n}(k), \qquad (g, (\lambda_1, \dots, \lambda_n)) \mapsto g \, \mathrm{diag}(\lambda_1, \dots, \lambda_n) \, g^{-1}$$
is Zariski-dense in the image of the corresponding map over $\overline{k}$, which is Zariski dense in $\mathrm{Mat}_{n\times n}(\overline{k})$. Thus the set of all $k$-diagonalizable matrices forms a Zariski dense subset of $\mathrm{Mat}_{n\times n}(k)$.

Let $(A, B)$ be a pair of matrices with entries in $k$, both diagonalizable over $k$; this means in particular that each has an eigenvalue in $k$, and so commutes with a rank 1 matrix with entries in $k$. If $C$ is a rank 1 matrix commuting with $A$, and $D$ is a rank 1 matrix commuting with $B$, then since $n \ge 3$ there exists a rank 1 matrix $E$ commuting with both $C$ and $D$. As all rank 1 matrices are non-scalar, $d(A, B) \le 4$. As this holds on a Zariski dense subset of $\mathrm{Mat}^2_{n\times n}(k)$, this means that $\overline{C_n^4(k)} = \mathrm{Mat}^2_{n\times n}(k)$. Since $d \ge 4$ we have
$$\mathrm{Mat}^2_{n\times n}(k) = \overline{C_n^4(k)} \subset \overline{C_n^d(k)} \subset \mathrm{Mat}^2_{n\times n}(k),$$
implying $\overline{C_n^d(k)} = \mathrm{Mat}^2_{n\times n}(k)$, thus completing the proof. □

3. The distance-2 commuting set

Fix $n \ge 3$, and let $k$ be any field. In this section we will prove that $C_n^2(k)$ is a variety.
In Proposition 3.1, we show that it is the vanishing locus of certain minors of a $2n^2 \times n^2$ matrix with linear polynomials as its entries. We then present a more geometric construction of $C_n^2$ as an affine variety over $\mathbb{C}$. The proof that this construction actually gives $C_n^2$, rather than some other set, relies on Proposition 3.1, and so does not constitute an independent proof that $C_n^2$ is a variety. We then find the irreducible decomposition of $C_n^2$ over $\mathbb{C}$, as well as its dimension.

Proposition 3.1. Over any field, the set $C_n^2$ is an affine variety. In particular, it is defined by minors of a $2n^2 \times n^2$ matrix with linear entries.

Proof. We will explicitly give the desired collection of polynomials that define the set $C_n^2$. Let $A, B, C$ be $n \times n$ matrices with entries given by $a_{ij}$, $b_{ij}$, and $c_{ij}$, respectively. We have $A \leftrightarrow C$ if and only if
$$\sum_{k=1}^{n} a_{ik} c_{kj} - \sum_{\ell=1}^{n} a_{\ell j} c_{i\ell} = 0$$
for $1 \le i, j \le n$. Similarly, $B \leftrightarrow C$ if and only if
$$\sum_{k=1}^{n} b_{ik} c_{kj} - \sum_{\ell=1}^{n} b_{\ell j} c_{i\ell} = 0$$
for $1 \le i, j \le n$. Flattening the matrix $C$ into a vector $\mathbf{c}$, we can write these equations together as
$$M_{A,B} \, \mathbf{c} = \mathbf{0},$$
where $M_{A,B}$ is a $2n^2 \times n^2$ matrix with entries in $k[\{a_{ij}, b_{ij}\}]$. Thus $A \leftrightarrow C \leftrightarrow B$ if and only if the flattening of $C$ is in the nullspace of $M_{A,B}$. It follows that the set of matrices commuting with both $A$ and $B$ is a vector space (namely the vector space of solutions of the above equation) whose dimension is equal to the nullity of the matrix $M_{A,B}$. The dimension of this vector space is at least 1, since every scalar matrix commutes with all matrices. Therefore, the nullity of $M_{A,B}$ is strictly greater than 1 if and only if $A$ and $B$ commute with a mutual non-scalar matrix, which is equivalent to $d(A, B) \le 2$. By the rank-nullity theorem, $d(A, B) \le 2$ if and only if the rank of $M_{A,B}$ is at most $n^2 - 2$. This means that $(A, B) \in C_n^2$ if and only if the $(n^2 - 1) \times (n^2 - 1)$ minors of $M_{A,B}$ are all equal to 0. These minors are polynomials in the entries of $A$ and $B$. Thus, the vanishing locus of these polynomials in $\mathrm{Mat}^2_{n\times n}$ is precisely $C_n^2$. □
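The rank criterion in this proof is easy to try out numerically. The following Python sketch assembles $M_{A,B}$ for matrices with rational entries and tests whether its rank is at most $n^2 - 2$, using exact Gaussian elimination over $\mathbb{Q}$. It checks the rank directly rather than evaluating minors, which is equivalent but far cheaper. The function names and the row-major flattening convention are choices made here, and the test matrices are only examples.

```python
from fractions import Fraction

def commutator_matrix(A):
    """Matrix M_A with M_A * vec(C) = vec(AC - CA), where vec reads C row by row."""
    n = len(A)
    M = [[0] * (n * n) for _ in range(n * n)]
    for i in range(n):
        for j in range(n):
            row = i * n + j                    # equation for entry (i, j)
            for k in range(n):
                M[row][k * n + j] += A[i][k]   # + a_ik * c_kj
                M[row][i * n + k] -= A[k][j]   # - a_kj * c_ik
    return M

def rank(rows):
    """Rank over the rationals, by exact Gaussian elimination."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(rk + 1, len(m)):
            if m[r][col] != 0:
                factor = m[r][col] / m[rk][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk

def distance_at_most_two(A, B):
    """True when rank(M_{A,B}) <= n^2 - 2, i.e. when d(A, B) <= 2."""
    n = len(A)
    M_AB = commutator_matrix(A) + commutator_matrix(B)  # stack M_A on top of M_B
    return rank(M_AB) <= n * n - 2

# The pair from Example 2.13 shares the non-scalar commuting matrix diag(1, 1, 0).
A = [[1, 2, 0], [3, 4, 0], [0, 0, 5]]
B = [[1, 1, 0], [2, 2, 0], [0, 0, 3]]
print(distance_at_most_two(A, B))

# A nilpotent Jordan block and its transpose share only scalar commuting matrices.
J = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
Jt = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]
print(distance_at_most_two(J, Jt))
```

Since the rank of a matrix with rational entries does not change under field extension, the answer for rational input is the same whether one works over $\mathbb{Q}$, $\mathbb{R}$, or $\mathbb{C}$.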

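To see the techniques of this proof more explicitly, we construct the matrix $M_{A,B}$ for the case of $n = 3$.

Example 3.2. Let $A, C, B \in \mathbb{C}^{3\times 3}$ such that
$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}, \quad B = \begin{pmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{pmatrix}, \quad C = \begin{pmatrix} c_{11} & c_{12} & c_{13} \\ c_{21} & c_{22} & c_{23} \\ c_{31} & c_{32} & c_{33} \end{pmatrix}.$$
Then $AC - CA = 0$ if and only if the following nine equations are all satisfied: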

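$$\begin{aligned}
a_{13}c_{31} + a_{12}c_{21} + a_{11}c_{11} - c_{13}a_{31} - c_{12}a_{21} - c_{11}a_{11} &= 0 \\
a_{13}c_{32} + a_{12}c_{22} + a_{11}c_{12} - c_{13}a_{32} - c_{12}a_{22} - c_{11}a_{12} &= 0 \\
a_{13}c_{33} + a_{12}c_{23} + a_{11}c_{13} - c_{13}a_{33} - c_{12}a_{23} - c_{11}a_{13} &= 0 \\
a_{23}c_{31} + a_{22}c_{21} + a_{21}c_{11} - c_{23}a_{31} - c_{22}a_{21} - c_{21}a_{11} &= 0 \\
a_{23}c_{32} + a_{22}c_{22} + a_{21}c_{12} - c_{23}a_{32} - c_{22}a_{22} - c_{21}a_{12} &= 0 \\
a_{23}c_{33} + a_{22}c_{23} + a_{21}c_{13} - c_{23}a_{33} - c_{22}a_{23} - c_{21}a_{13} &= 0 \\
a_{33}c_{31} + a_{32}c_{21} + a_{31}c_{11} - c_{33}a_{31} - c_{32}a_{21} - c_{31}a_{11} &= 0 \\
a_{33}c_{32} + a_{32}c_{22} + a_{31}c_{12} - c_{33}a_{32} - c_{32}a_{22} - c_{31}a_{12} &= 0 \\
a_{33}c_{33} + a_{32}c_{23} + a_{31}c_{13} - c_{33}a_{33} - c_{32}a_{23} - c_{31}a_{13} &= 0.
\end{aligned}$$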

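These equations can be expressed as $M_A \mathbf{c} = \mathbf{0}$, where
$$M_A = \begin{pmatrix}
0 & -a_{21} & -a_{31} & a_{12} & 0 & 0 & a_{13} & 0 & 0 \\
-a_{12} & a_{11}-a_{22} & -a_{32} & 0 & a_{12} & 0 & 0 & a_{13} & 0 \\
-a_{13} & -a_{23} & a_{11}-a_{33} & 0 & 0 & a_{12} & 0 & 0 & a_{13} \\
a_{21} & 0 & 0 & a_{22}-a_{11} & -a_{21} & -a_{31} & a_{23} & 0 & 0 \\
0 & a_{21} & 0 & -a_{12} & 0 & -a_{32} & 0 & a_{23} & 0 \\
0 & 0 & a_{21} & -a_{13} & -a_{23} & a_{22}-a_{33} & 0 & 0 & a_{23} \\
a_{31} & 0 & 0 & a_{32} & 0 & 0 & a_{33}-a_{11} & -a_{21} & -a_{31} \\
0 & a_{31} & 0 & 0 & a_{32} & 0 & -a_{12} & a_{33}-a_{22} & -a_{32} \\
0 & 0 & a_{31} & 0 & 0 & a_{32} & -a_{13} & -a_{23} & 0
\end{pmatrix}$$
and $\mathbf{c} = (c_{11}, c_{12}, c_{13}, c_{21}, c_{22}, c_{23}, c_{31}, c_{32}, c_{33})^T$; we remark that $M_A = A \otimes I - I \otimes A^T$, where $\otimes$ denotes the Kronecker product of matrices [13, Chapter 4]. Similarly, a matrix $M_B$ encodes the equation $BC - CB = 0$. The matrix $M_{A,B}$ is obtained by stacking $M_A$ on top of $M_B$. So, in the $n = 3$ case our equation $M_{A,B} \mathbf{c} = \mathbf{0}$ is
$$\begin{pmatrix}
0 & -a_{21} & -a_{31} & a_{12} & 0 & 0 & a_{13} & 0 & 0 \\
-a_{12} & a_{11}-a_{22} & -a_{32} & 0 & a_{12} & 0 & 0 & a_{13} & 0 \\
-a_{13} & -a_{23} & a_{11}-a_{33} & 0 & 0 & a_{12} & 0 & 0 & a_{13} \\
a_{21} & 0 & 0 & a_{22}-a_{11} & -a_{21} & -a_{31} & a_{23} & 0 & 0 \\
0 & a_{21} & 0 & -a_{12} & 0 & -a_{32} & 0 & a_{23} & 0 \\
0 & 0 & a_{21} & -a_{13} & -a_{23} & a_{22}-a_{33} & 0 & 0 & a_{23} \\
a_{31} & 0 & 0 & a_{32} & 0 & 0 & a_{33}-a_{11} & -a_{21} & -a_{31} \\
0 & a_{31} & 0 & 0 & a_{32} & 0 & -a_{12} & a_{33}-a_{22} & -a_{32} \\
0 & 0 & a_{31} & 0 & 0 & a_{32} & -a_{13} & -a_{23} & 0 \\
0 & -b_{21} & -b_{31} & b_{12} & 0 & 0 & b_{13} & 0 & 0 \\
-b_{12} & b_{11}-b_{22} & -b_{32} & 0 & b_{12} & 0 & 0 & b_{13} & 0 \\
-b_{13} & -b_{23} & b_{11}-b_{33} & 0 & 0 & b_{12} & 0 & 0 & b_{13} \\
b_{21} & 0 & 0 & b_{22}-b_{11} & -b_{21} & -b_{31} & b_{23} & 0 & 0 \\
0 & b_{21} & 0 & -b_{12} & 0 & -b_{32} & 0 & b_{23} & 0 \\
0 & 0 & b_{21} & -b_{13} & -b_{23} & b_{22}-b_{33} & 0 & 0 & b_{23} \\
b_{31} & 0 & 0 & b_{32} & 0 & 0 & b_{33}-b_{11} & -b_{21} & -b_{31} \\
0 & b_{31} & 0 & 0 & b_{32} & 0 & -b_{12} & b_{33}-b_{22} & -b_{32} \\
0 & 0 & b_{31} & 0 & 0 & b_{32} & -b_{13} & -b_{23} & 0
\end{pmatrix}
\begin{pmatrix} c_{11} \\ c_{12} \\ c_{13} \\ c_{21} \\ c_{22} \\ c_{23} \\ c_{31} \\ c_{32} \\ c_{33} \end{pmatrix}
= \mathbf{0},$$
where $\mathbf{0}$ denotes the zero vector of length 18.

To find the defining equations for $C_3^2$ in the polynomial ring $k[x_{11}, \dots, x_{33}, y_{11}, \dots, y_{33}]$, we would consider the matrix
$$\begin{pmatrix}
0 & -x_{21} & -x_{31} & x_{12} & 0 & 0 & x_{13} & 0 & 0 \\
-x_{12} & x_{11}-x_{22} & -x_{32} & 0 & x_{12} & 0 & 0 & x_{13} & 0 \\
-x_{13} & -x_{23} & x_{11}-x_{33} & 0 & 0 & x_{12} & 0 & 0 & x_{13} \\
x_{21} & 0 & 0 & x_{22}-x_{11} & -x_{21} & -x_{31} & x_{23} & 0 & 0 \\
0 & x_{21} & 0 & -x_{12} & 0 & -x_{32} & 0 & x_{23} & 0 \\
0 & 0 & x_{21} & -x_{13} & -x_{23} & x_{22}-x_{33} & 0 & 0 & x_{23} \\
x_{31} & 0 & 0 & x_{32} & 0 & 0 & x_{33}-x_{11} & -x_{21} & -x_{31} \\
0 & x_{31} & 0 & 0 & x_{32} & 0 & -x_{12} & x_{33}-x_{22} & -x_{32} \\
0 & 0 & x_{31} & 0 & 0 & x_{32} & -x_{13} & -x_{23} & 0 \\
0 & -y_{21} & -y_{31} & y_{12} & 0 & 0 & y_{13} & 0 & 0 \\
-y_{12} & y_{11}-y_{22} & -y_{32} & 0 & y_{12} & 0 & 0 & y_{13} & 0 \\
-y_{13} & -y_{23} & y_{11}-y_{33} & 0 & 0 & y_{12} & 0 & 0 & y_{13} \\
y_{21} & 0 & 0 & y_{22}-y_{11} & -y_{21} & -y_{31} & y_{23} & 0 & 0 \\
0 & y_{21} & 0 & -y_{12} & 0 & -y_{32} & 0 & y_{23} & 0 \\
0 & 0 & y_{21} & -y_{13} & -y_{23} & y_{22}-y_{33} & 0 & 0 & y_{23} \\
y_{31} & 0 & 0 & y_{32} & 0 & 0 & y_{33}-y_{11} & -y_{21} & -y_{31} \\
0 & y_{31} & 0 & 0 & y_{32} & 0 & -y_{12} & y_{33}-y_{22} & -y_{32} \\
0 & 0 & y_{31} & 0 & 0 & y_{32} & -y_{13} & -y_{23} & 0
\end{pmatrix}$$
and use as the defining equations all determinants of $8 \times 8$ submatrices. This yields $\binom{9}{8} \cdot \binom{18}{8} = 393\,822$ polynomial equations.

Remark 3.3. Specializing to the cases of $k = \mathbb{R}$ or $k = \mathbb{C}$, the fields of real or complex numbers, Proposition 3.1 leads to a nice property: since affine varieties are closed in the Euclidean topology, the set $C_n^2$ is closed. For instance, if $\{A_i\}_{i=1}^{\infty} \subset \mathbb{C}^{n\times n}$ and $\{B_i\}_{i=1}^{\infty} \subset \mathbb{C}^{n\times n}$ are sequences of matrices such that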

(i) $\lim_{i\to\infty} A_i = A$ and $\lim_{i\to\infty} B_i = B$, and
(ii) for each $i$, there exists a non-scalar matrix $C_i$ such that $A_i \leftrightarrow C_i \leftrightarrow B_i$,
then there must exist a non-scalar matrix $C \in \mathbb{C}^{n\times n}$ such that $A \leftrightarrow C \leftrightarrow B$. (The same holds replacing $\mathbb{C}$ with $\mathbb{R}$.) This is not immediately obvious even with the additional assumption that the limit $\lim_{i\to\infty} C_i$ exists, since the limit of non-scalar matrices could be a scalar matrix.
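The subtlety at the end of Remark 3.3 can be seen in a toy case. In the sketch below we take the constant sequences $A_i = A$ and $B_i = B$ from Example 2.13 and the hypothetical choice $C_i = I + \frac{1}{i} P$ with $P = \mathrm{diag}(1, 1, 0)$; these sequences and all helper names are our own illustration, not taken from the paper. Each $C_i$ is non-scalar and commutes with $A$ and $B$, yet the $C_i$ converge to the scalar matrix $I$.

```python
from fractions import Fraction

def matmul(P, Q):
    """Product of two square matrices given as nested lists."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commute(P, Q):
    """True when PQ = QP."""
    return matmul(P, Q) == matmul(Q, P)

def is_scalar(P):
    """True when P is a constant multiple of the identity matrix."""
    n = len(P)
    return all(P[i][j] == (P[0][0] if i == j else 0)
               for i in range(n) for j in range(n))

A = [[1, 2, 0], [3, 4, 0], [0, 0, 5]]
B = [[1, 1, 0], [2, 2, 0], [0, 0, 3]]
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
P = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]

def C(i):
    """C_i = I + (1/i) P: non-scalar, and commuting with both A and B."""
    return [[I3[r][c] + Fraction(1, i) * P[r][c] for c in range(3)]
            for r in range(3)]

for i in (1, 10, 1000):
    Ci = C(i)
    assert commute(A, Ci) and commute(Ci, B) and not is_scalar(Ci)

# Largest entrywise gap between C_i and the identity; it equals 1/i, so the
# C_i converge to the scalar matrix I even though no C_i is scalar.
gap = max(abs(C(1000)[r][c] - I3[r][c]) for r in range(3) for c in range(3))
print(gap)

# In this toy case, rescaling recovers a non-scalar limit: i * (C_i - I) = P.
rescaled = [[1000 * (C(1000)[r][c] - I3[r][c]) for c in range(3)]
            for r in range(3)]
print(rescaled == P)
```

Here the conclusion of the remark is rescued by subtracting the scalar part and rescaling, but this is specific to the example; in general the statement rests on $C_n^2$ being a variety, as established in Proposition 3.1.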

We now provide a direct algebro-geometric construction for $C_n^2$ over the field $\mathbb{C}$.

Proposition 3.4. Inside of the $3n^2$-dimensional space $\mathrm{Mat}^3_{n\times n}$, let
$$T = \{(A, C, B) \mid A \leftrightarrow C \leftrightarrow B\}.$$
Let $S$ be the set of all triples $(A, C, B)$ where $C$ is a scalar matrix. Finally, let
$$\pi : \mathrm{Mat}^3_{n\times n} \to \mathrm{Mat}^2_{n\times n}$$
be projection onto the $A$ and $B$ coordinates, so that $\pi(A, C, B) = (A, B)$. Then both $T$ and $S$ are affine varieties in $3n^2$-dimensional space, and
$$C_n^2 = \pi(T \setminus S).$$

Proof. Let $X$, $Z$, and $Y$ be matrices of variables corresponding to the coordinates of $A$, $C$, and $B$, respectively. Then $T = V(I)$, where the ideal $I$ is generated by the polynomials $(XZ - ZX)_{ij}$ and $(YZ - ZY)_{ij}$; and $S = V(J)$, where the ideal $J$ is generated by the polynomials $z_{11} - z_{ii}$ (for $2 \le i \le n$) and $z_{ij}$ (for $i \neq j$). The fact that $C_n^2 = \pi(T \setminus S)$ follows by definition. □

Now we wish to take the set-theoretic difference of one variety from another, and then project the resulting set to a lower dimensional space. It is possible to do this by manipulating ideals, at least up to Zariski closure: by Propositions 2.6 and 2.7, the set-minus operation corresponds to ideal saturation, and projection corresponds to elimination of variables. In particular,
$$V(I : J^\infty) = \overline{T \setminus S}, \quad \text{and} \quad V((I : J^\infty) \cap k[X, Y]) = \overline{\pi\left(\overline{T \setminus S}\right)}.$$
We will show that we may remove the Zariski closures, so that this ideal does in fact define $C_n^2$.

Proposition 3.5. When $k = \mathbb{C}$, we have
$$V((I : J^\infty) \cap k[X, Y]) = \pi(T \setminus S).$$

Proof. Let $R = T \setminus S$. Then we wish to show
$$\overline{\pi(\overline{R})} = \pi(R).$$

Certainly we have $\pi(R) \subset \overline{\pi(\overline{R})}$. Since $\pi(R) = C_n^2$ is a variety by Proposition 3.1, it is Zariski closed. Thus if we can show that $\pi(\overline{R}) \subset \pi(R)$, it will follow that $\overline{\pi(\overline{R})} \subset \pi(R)$, since $\overline{\pi(\overline{R})}$ is the smallest Zariski-closed set containing $\pi(\overline{R})$.

Suppose that $(A, B) \in \pi(\overline{R})$. It follows that there exists $C$ such that $(A, C, B) \in \overline{R}$. Since $R$ is a constructible set, its Zariski closure is equal to its closure in the Euclidean topology by Proposition 2.5. It follows that there exists a sequence of points $\{(A_i, C_i, B_i)\}_{i=0}^{\infty} \subset R \subset T$ such that $\lim_{i\to\infty}(A_i, C_i, B_i) = (A, C, B)$. Then $\lim_{i\to\infty}(A_i, B_i) = (A, B)$. Since $(A_i, B_i) \in \pi(R) = C_n^2$, and since $C_n^2$ is closed in the Euclidean topology, we have that $(A, B) = \lim_{i\to\infty}(A_i, B_i) \in C_n^2$. So, $\pi(\overline{R}) \subset C_n^2$. This completes the proof. □

If the field $k$ is algebraically closed, we now determine the decomposition of $C_n^2$ into irreducible varieties, as well as its dimension. The key players will be the varieties $\overline{Z_i}$ for

$1 \le i \le \lfloor n/2 \rfloor$, where $Z_i$ is the set of all pairs of matrices that commute with a common rank $i$ matrix that squares to itself:
$$Z_i = \{(A, B) \mid \exists P \in \mathrm{Mat}_{n\times n} \text{ s.t. } P^2 = P,\ \operatorname{rank}(P) = i,\ AP = PA,\ BP = PB\}.$$
We start with the following two lemmas.

Lemma 3.6. When $k$ is algebraically closed, we have
$$C_n^2 = \overline{Z_1} \cup \cdots \cup \overline{Z_{\lfloor n/2 \rfloor}}.$$

Proof. By definition $Z_i \subset C_n^2$ for all $i$, and since $C_n^2$ is a variety we have $\overline{Z_i} \subset C_n^2$ as well. This gives one direction of containment.

For the other direction, let $(A, B) \in C_n^2$, so that there exists a non-scalar matrix $C' \in \mathrm{Mat}_{n\times n}$ commuting with both $A$ and $B$. Then there exists a maximal matrix $C \in \mathrm{Mat}_{n\times n}$ commuting with both $A$ and $B$. Here maximality is considered in the sense of [7], that is, the centralizer of no non-scalar matrix properly contains the centralizer of $C$. Indeed, if $C'$ is not maximal, then its centralizer is contained in the centralizer of some non-scalar maximal matrix $C$. Hence both $A$ and $B$ commute with $C$. Since $k$ is algebraically closed, by [7, Theorem 3.2] $C$ is either a nontrivial idempotent matrix, or a nonzero matrix that squares to the zero matrix.

In the first case, since $C$ and $I - C$ are maximal idempotents which commute with $A$ and $B$, we have $(A, B) \in Z_i$ where $i = \min\{\operatorname{rank}(C), \operatorname{rank}(I - C)\}$.

Now assume that $C$ is a nonzero matrix that squares to 0. Permuting the rows and columns in the Jordan canonical form of $C$ in order to collect the zero rows together, we obtain that $C$ is similar to the matrix
$$C'' = \begin{pmatrix} 0 & I & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix},$$
where the first two block rows and columns are of size $i$ and the last ones are of size $n - 2i$ (possibly 0). Without loss of generality we further assume that $C = C''$. Commutativity then implies that
$$A = \begin{pmatrix} A_1 & A_2 & A_3 \\ 0 & A_1 & 0 \\ 0 & A_4 & A_5 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} B_1 & B_2 & B_3 \\ 0 & B_1 & 0 \\ 0 & B_4 & B_5 \end{pmatrix}$$
for some submatrices $A_j$ and $B_j$. Now we consider
$$X = \begin{pmatrix} 0 & 0 & 0 \\ 0 & I & 0 \\ 0 & 0 & I \end{pmatrix}, \quad Y = \begin{pmatrix} 0 & 0 & 0 \\ 0 & A_2 & A_3 \\ 0 & 0 & 0 \end{pmatrix} \quad\text{and}\quad Z = \begin{pmatrix} 0 & 0 & 0 \\ 0 & B_2 & B_3 \\ 0 & 0 & 0 \end{pmatrix}.$$
Then $C + \lambda X$ commutes with $A + \lambda Y$ and with $B + \lambda Z$ for each $\lambda \in k$, so $(A + \lambda Y, B + \lambda Z) \in C_n^2$ for each $\lambda \in k$. Moreover, for $\lambda \ne 0$ the matrix $C + \lambda X$ is a multiple of an idempotent of rank $n - i$, therefore $(A + \lambda Y, B + \lambda Z) \in Z_i$ for each $\lambda \ne 0$, and consequently $(A, B) \in \overline{Z_i}$. □

Lemma 3.7. For $1 \le i \le \lfloor n/2 \rfloor$, the variety $\overline{Z_i}$ is irreducible, with dimension $n^2 + i^2 + (n - i)^2$.

Proof. Fix $P$ to be an idempotent of rank $i$. The set $Z_P = \{(A, B) \mid AP = PA,\ BP = PB\} \subset C_n^2$ is an affine space of dimension $2(i^2 + (n - i)^2)$, and is therefore irreducible. Now, the group $GL_n$ acts on $\mathrm{Mat}_{n\times n}$ by conjugation, which induces a $GL_n$-action on $\mathrm{Mat}_{n\times n}^2$ by simultaneous conjugation: $(g, (A, B)) \mapsto (gAg^{-1}, gBg^{-1})$. Let $\cdot$ denote this action. Consider the orbit of $Z_P$ under this action. Since any two idempotent matrices of the same rank are similar, we have that $Z_i = GL_n \cdot Z_P$. Since $GL_n$ is a connected group acting on an irreducible variety $Z_P$, we have that $Z_i$ is irreducible. (To see this, let $\varphi : GL_n \times \mathrm{Mat}_{n\times n}^2 \to \mathrm{Mat}_{n\times n}^2$ denote the morphism of the group action, so that $Z_i$ is the image of $GL_n \times Z_P$ under $\varphi$. Since $GL_n$ and $Z_P$ are irreducible, so too is their product, and so too is the image of the product under $\varphi$.)

We are now ready to compute the dimension of $Z_i$. Choose $P$ to be the idempotent matrix $\begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix}$, where $I$ is the $i \times i$ identity matrix. Consider the following map:

$$\varphi : GL_n \times Z_P \to Z_i, \qquad (g, (A, B)) \mapsto g \cdot (A, B) = (gAg^{-1}, gBg^{-1}).$$

This map is surjective since $Z_i = GL_n \cdot Z_P$, so we can determine the dimension of $Z_i$ by computing the generic dimension of a fibre of $\varphi$. To accomplish this, we note that the set of all pairs $(A, B) \in C_n^2$ where $A$ has $n$ distinct eigenvalues is an open subset of $C_n^2$; this open set intersects every $Z_i$. Assume that $\varphi(g, (A, B)) = \varphi(g', (A', B'))$, where $A$ has $n$ distinct eigenvalues. Conjugating our four block diagonal matrices $A$, $B$, $A'$, $B'$ with an element of $GL_i \times GL_{n-i}$, we may assume without loss of generality that $A$ is a diagonal matrix, since it has $n$ distinct eigenvalues. Since $A'$ is similar to $A$, it is also diagonalizable. Since it must already be block diagonal, we know that there exists $h \in GL_i \times GL_{n-i}$ such that $hA'h^{-1}$ is diagonal. Now, since $A$ and $hA'h^{-1}$ are similar diagonal matrices, there exists a permutation matrix $\sigma \in GL_n$ with $\sigma h A' h^{-1} \sigma^{-1} = A$.

Now, the equality $\varphi(g, (A, B)) = \varphi(g', (A', B'))$ is equivalent to $A' = g'^{-1} g A g^{-1} g'$ and $B' = g'^{-1} g B g^{-1} g'$. Thus, when $g'$ is chosen, we have that $A'$ and $B'$ are uniquely determined by $A$, $B$, and $g$. Note that $g'$ has to satisfy the equation $\sigma h g'^{-1} g A g^{-1} g' h^{-1} \sigma^{-1} = A$, i.e. that $A$ commutes with $g^{-1} g' h^{-1} \sigma^{-1}$. As $A$ is diagonal with $n$ distinct eigenvalues, we have that $g' = g D \sigma h$ for some invertible diagonal matrix $D$. Since the permutation matrices normalize the set of diagonal matrices, we have that $g' = g \sigma D' h$ for some invertible diagonal matrix $D'$. Letting $S_n$ denote the group of permutation matrices, we have that $g' \in g S_n (GL_i \times GL_{n-i})$. Thus the fibre $\varphi^{-1}(\varphi(g, (A, B)))$ is a subvariety of the variety that is parametrized by the product $S_n(GL_i \times GL_{n-i})$. Since $S_n$ is finite, the dimension of each irreducible component of the fibre is at most $\dim(GL_i \times GL_{n-i}) = i^2 + (n - i)^2$, giving us an upper bound on dimension. For a lower bound, consider the set
$$\{(gh, (h^{-1}Ah, h^{-1}Bh)) \mid h \in GL_i \times GL_{n-i}\},$$
which contains $(g, (A, B))$. This set has dimension $i^2 + (n - i)^2$, and is a subset of the fibre $\varphi^{-1}(\varphi(g, (A, B)))$.

Thus when $A$ has $n$ distinct eigenvalues, we have that the dimension of the fibre containing $(g, (A, B))$ is $i^2 + (n - i)^2$. As this result holds on the open dense set where $A$ has $n$ distinct eigenvalues, we have that $\dim(Z_i) = n^2 + i^2 + (n - i)^2$ by [11, Theorem 11.12]. □

It immediately follows that $\dim(C_n^2) = n^2 + 1^2 + (n - 1)^2 = 2n^2 - 2n + 2$, as this is the largest dimension of any $Z_i$. These lemmas also allow us to characterize the decomposition of $C_n^2$ into irreducibles.

Theorem 3.8. Let $k$ be an algebraically closed field. The unique irredundant decomposition of $C_n^2$ into irreducible varieties is
$$C_n^2 = \overline{Z_1} \cup \cdots \cup \overline{Z_{\lfloor n/2 \rfloor}}.$$

Proof. By Lemmas 3.6 and 3.7, all we need to show is that no $\overline{Z_i}$ is contained in any $\overline{Z_j}$ for $i \ne j$. For each $i = 1, \dots, \lfloor n/2 \rfloor$, define
$$W_i = \{(A, B) \in \mathrm{Mat}_{n\times n}^2 \mid \exists \text{ subspaces } U, V \subset k^n \text{ s.t. } \dim(U) = i,\ \dim(V) = n - i,\ AU \subset U,\ AV \subset V,\ BU \subset U,\ BV \subset V\}.$$

We have that $Z_i$ is contained in $W_i$ for each $i$. Moreover, the set $W_i$ is the image of the projection of the set
$$W_i' = \{(A, B, U, V) \in \mathrm{Mat}_{n\times n} \times \mathrm{Mat}_{n\times n} \times \mathrm{Gr}(n, i) \times \mathrm{Gr}(n, n - i) \mid AU \subset U,\ AV \subset V,\ BU \subset U,\ BV \subset V\}$$
to the first two factors, where $\mathrm{Gr}$ denotes the Grassmannian. Recall that the Grassmannian $\mathrm{Gr}(n, j)$ is the set of all $j$-dimensional subspaces of an $n$-dimensional vector space. It has a natural structure of a projective variety given by the Plücker embedding $\mathrm{Gr}(n, j) \to \mathbb{P}(\wedge^j(k^n)) = \mathbb{P}^{\binom{n}{j} - 1}$ that sends a subspace $U \in \mathrm{Gr}(n, j)$ with a basis $\{u_1, \dots, u_j\}$ to $u_1 \wedge \cdots \wedge u_j \in \mathbb{P}(\wedge^j(k^n))$, see [11, Chapter 6]. Note that this map is well-defined, as the wedge product $u_1 \wedge \cdots \wedge u_j$ is independent of the chosen basis. If $U \in \mathrm{Gr}(n, i)$ is represented in Plücker coordinates by $u_1 \wedge \cdots \wedge u_i$, then $AU \subset U$ if and only if $(A + sI)u_1 \wedge \cdots \wedge (A + sI)u_i$ is a multiple of $u_1 \wedge \cdots \wedge u_i$ for each $s \in k$ [18, Theorem 2.2], which is equivalent to the vanishing of all $2 \times 2$ minors of the $2 \times \binom{n}{i}$ matrix
$$\begin{pmatrix} (A + sI)u_1 \wedge \cdots \wedge (A + sI)u_i \\ u_1 \wedge \cdots \wedge u_i \end{pmatrix}$$
for all $s \in k$. This is a polynomial condition which is homogeneous in the $u$-coordinates, so $W_i'$ is a closed subset of $k^{2n^2} \times \mathbb{P}^{\binom{n}{i} - 1} \times \mathbb{P}^{\binom{n}{n-i} - 1}$. By Proposition 2.10 its projection onto the first two factors, which is $W_i$, is closed, and $\overline{Z_i} \subset W_i$. Now, for $1 \le i < j \le \lfloor n/2 \rfloor$ there exists a pair of matrices $(A, B) \in Z_j$ that do not have an $(n - i)$-dimensional common invariant subspace, implying $Z_j \not\subset \overline{Z_i}$. On the other hand, $\overline{Z_i}$ cannot be a subset of $\overline{Z_j}$ since $\dim Z_i > \dim Z_j$. This completes the proof. □

One consequence of this theorem is that $C_n^2$ is an irreducible variety if and only if $n \le 3$: for $n = 2$ we have $C_2^2 = C_2^1$ and for $n = 3$ we have $C_3^2 = \overline{Z_1}$; but for $n \ge 4$ the variety $C_n^2$ has at least two distinct irreducible components, namely $\overline{Z_1}$ and $\overline{Z_2}$.
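The dimension count of Lemma 3.7 and the component count of Theorem 3.8 are easy to tabulate. The following is a minimal sketch, not part of the original argument; the function names `dim_Z`, `components`, and `dim_C2` are hypothetical helpers introduced only for illustration.

```python
def dim_Z(n, i):
    """Dimension n^2 + i^2 + (n - i)^2 of the closure of Z_i (Lemma 3.7)."""
    return n * n + i * i + (n - i) ** 2


def components(n):
    """Pairs (i, dim) for the component indices i = 1, ..., floor(n/2) of Theorem 3.8."""
    return [(i, dim_Z(n, i)) for i in range(1, n // 2 + 1)]


def dim_C2(n):
    """Dimension of the distance-2 commuting variety: the largest component dimension."""
    return max(d for _, d in components(n))


if __name__ == "__main__":
    for n in range(2, 7):
        # One component means the variety is irreducible (this happens for n = 2, 3).
        print(n, components(n), dim_C2(n))
```

Since $i^2 + (n-i)^2$ decreases as $i$ grows towards $n/2$, the maximum is always attained at $i = 1$, which recovers the closed form $2n^2 - 2n + 2$.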

4. The distance-3 commuting set

To study pairs of matrices with commuting distance at most 3, we first define the notion of polynomially commuting matrices.

Definition 4.1. Fix n ≥ 3, and let A and B be two matrices in $\mathrm{Mat}_{n\times n}$. Then A and B polynomially commute if there exist polynomials p, q ∈ k[x] such that
(i) p(A) ↔ q(B), and
(ii) 1 ≤ deg(p), deg(q) ≤ n − 1.

Note that without the degree bounds on p and q, every pair of matrices would polynomially commute with one another: if deg(p) = 0, then p(A) would be a scalar matrix; and if deg(p) = n we could choose p equal to the characteristic polynomial of A, so that p(A) is the zero matrix. In either case, we would have p(A) commuting with every matrix. However, unlike with our definition of commuting distance, it is not forbidden for p(A) or q(B) to be a scalar matrix. This will come into play in Lemma 4.2.

We also remark that we may assume that p(x) and q(x) have no constant term. This is because for any constants λ and µ, two matrices C and D commute if and only if C − λI and D − µI commute. Eliminating the constant term from p and q only affects p(A) and q(B) by subtracting off a constant multiple of the identity matrix.

Let $PC_n \subset \mathrm{Mat}_{n\times n}^2$ denote the set of all pairs of polynomially commuting matrices. To prove that $C_n^3$ is an affine variety over an algebraically closed field, we will first show that $C_n^3 = PC_n$. We will then show that $PC_n$ is an affine variety, at least over an algebraically closed field. We begin with the following result.

Lemma 4.2. Let $A, B \in \mathrm{Mat}_{n\times n}$ such that at least one of $A$ and $B$ is derogatory. Then $(A, B) \in PC_n$.

Proof. Assume that $A$ is derogatory; a symmetric argument will hold when $B$ is derogatory. Let $p \in k[x]$ be the minimal polynomial of $A$. Then by the definition of derogatory matrices, $p(x)$ is not equal to the characteristic polynomial of $A$, which has degree $n$. Since the minimal polynomial divides the characteristic polynomial, we have $\deg(p) < n$. Combined with the fact that the minimal polynomial of any matrix is non-constant, this implies that $1 \le \deg(p) \le n - 1$. Now let $q(x) = x$. We then have $A \leftrightarrow p(A) \leftrightarrow q(B) \leftrightarrow B$, since the zero matrix $p(A)$ commutes with all other matrices and since $B$ commutes with itself. Thus $A$ and $B$ polynomially commute. □

Proposition 4.3. Suppose $k$ is algebraically closed. Then the distance-3 commuting set is equal to the set of polynomially commuting pairs of matrices:
$$C_n^3 = PC_n.$$

Proof. Let $(A, B) \in \mathrm{Mat}_{n\times n}^2$. We claim that $(A, B) \in C_n^3$ if and only if $(A, B) \in PC_n$. First suppose at least one of $A$ and $B$ is derogatory. Then by Theorem 2.17, we have $(A, B) \in C_n^3$, and by Lemma 4.2, we have $(A, B) \in PC_n$. Thus our claim holds in this case.

Now suppose $A$ and $B$ are both non-derogatory. By Theorem 2.17, $A$ commutes with polynomials in $A$ only, and $B$ commutes with polynomials in $B$ only. Now, $(A, B) \in C_n^3$ if and only if there exist non-scalar matrices $C$ and $D$ such that $A \leftrightarrow C \leftrightarrow D \leftrightarrow B$. Since $A$ and $B$ commute with polynomials in themselves only, such $C$ and $D$ exist if and only if there are polynomials $p(x), q(x) \in k[x]$ such that $C = p(A)$ and $D = q(B)$. In fact, $p$ and $q$ must have degree at least 1, since $C$ and $D$ are non-scalar. Moreover, $p$ may be chosen to have degree at most $n - 1$, since the characteristic polynomial is annihilating, so any power $A^m$ with $m \ge n$ can be written as a combination of $I, A, A^2, \dots, A^{n-1}$. A similar argument shows we may take $\deg(q) \le n - 1$. Thus, the desired matrices $C$ and $D$ exist if and only if $(A, B) \in PC_n$. This completes our proof. □
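The chain used in the proof of Lemma 4.2 can be checked on a small instance. The sketch below is illustrative only: the matrices `A` and `B` and the helper names `matmul`, `poly_at`, and `commute` are assumptions chosen for this example, with `A` a derogatory diagonal matrix whose minimal polynomial is $(x-1)(x-2)$ and `B` an arbitrary matrix that does not commute with `A`.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def poly_at(coeffs, A):
    """Evaluate p(A) by Horner's rule; coeffs are listed from highest degree down."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for c in coeffs:
        result = matmul(result, A)
        for i in range(n):
            result[i][i] += c  # add c times the identity
    return result


def commute(X, Y):
    return matmul(X, Y) == matmul(Y, X)


# Hypothetical example data (not taken from the paper).
A = [[1, 0, 0], [0, 1, 0], [0, 0, 2]]  # derogatory: minimal polynomial (x - 1)(x - 2)
B = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]  # arbitrary choice that does not commute with A

p = [1, -3, 2]  # p(x) = x^2 - 3x + 2, of degree 2 = n - 1
q = [1, 0]      # q(x) = x

pA = poly_at(p, A)  # the zero matrix, since p annihilates A
qB = poly_at(q, B)  # equals B

if __name__ == "__main__":
    print(commute(A, B), commute(A, pA), commute(pA, qB), commute(qB, B))
```

Note that here $p(A)$ is the zero matrix, so the chain certifies polynomial commutativity in the sense of Definition 4.1 without exhibiting a non-scalar intermediate matrix.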
To show that $C_n^3$ is an affine variety in the case of an algebraically closed field, it remains to show the following proposition.

Proposition 4.4. Suppose $k$ is algebraically closed. Then the set $PC_n$ is an affine variety.

Proof. Suppose $A$ and $B$ polynomially commute, and let $p$ and $q$ be polynomials satisfying all the conditions of our definition. As previously noted, we may assume that $p$ and $q$ have no constant term. Write
$$p(x) = \sum_{i=1}^{n-1} c_i x^i, \qquad q(x) = \sum_{i=1}^{n-1} d_i x^i.$$

The condition $p(A)q(B) - q(B)p(A) = 0$ can then be written as

$$\begin{aligned}
0 &= \left(\sum_{i=1}^{n-1} c_i A^i\right)\left(\sum_{i=1}^{n-1} d_i B^i\right) - \left(\sum_{i=1}^{n-1} d_i B^i\right)\left(\sum_{i=1}^{n-1} c_i A^i\right) \\
&= \left(\sum_{i=1}^{n-1}\sum_{j=1}^{n-1} c_i d_j A^i B^j\right) - \left(\sum_{i=1}^{n-1}\sum_{j=1}^{n-1} c_i d_j B^j A^i\right) \\
&= \sum_{i=1}^{n-1}\sum_{j=1}^{n-1} c_i d_j \left(A^i B^j - B^j A^i\right).
\end{aligned}$$

Thinking of the $c_i$, the $d_j$, the entries of $A$, and the entries of $B$ as variables, this equation really consists of $n^2$ polynomials (one for each coordinate) in those variables. These equations are bilinear in the $c_i$ and the $d_j$ variables. Note that the set of polynomials with no constant term and degree at most $n - 1$ is a vector space, whose nonzero elements are exactly the polynomials with no constant term and degree between 1 and $n - 1$. Call this vector space $V$, and let $\mathbb{P}(V)$ be the projectivization of $V$, so that two polynomials are identified if and only if they differ by a constant multiple. The equivalence class of a polynomial $p \in V$ in $\mathbb{P}(V)$ will be denoted by $\langle p \rangle$. Consider the incidence correspondence

$$\Phi = \{(A, B, \langle p \rangle, \langle q \rangle) \mid p(A)q(B) = q(B)p(A)\} \subset \mathrm{Mat}_{n\times n}^2 \times \mathbb{P}(V) \times \mathbb{P}(V).$$

Note that $\Phi$ is Zariski closed: it is defined by the equations we derived from $p(A)q(B) - q(B)p(A) = 0$, which are indeed bilinear in the coordinates defining $p$ and $q$. Consider the projection map
$$\pi : \mathrm{Mat}_{n\times n}^2 \times \mathbb{P}(V) \times \mathbb{P}(V) \to \mathrm{Mat}_{n\times n}^2.$$
By Proposition 2.10, $\pi(Z) \subset \mathrm{Mat}_{n\times n}^2$ is Zariski-closed whenever $Z$ is. Thus, the image $\pi(\Phi) = PC_n$ is Zariski-closed in $\mathrm{Mat}_{n\times n}^2$. We conclude that $PC_n$ is an affine variety. □

Combining Propositions 4.3 and 4.4, we have the following result, which completes the proof of Theorem 1.1.

Proposition 4.5. Over an algebraically closed field, the distance-3 commuting set $C_n^3$ is an affine variety.

Remark 4.6. We briefly mention another method of proof for Proposition 4.5 with the additional assumption that the characteristic of $k$ does not divide $n$, suggested to the authors by Eyal Markman. Consider all quadruples $(A, B, C, D)$ with $A \leftrightarrow B \leftrightarrow C \leftrightarrow D$, where $B$ and $C$ are nonscalar. Without loss of generality, we may assume that $B$ and $C$ have trace zero, since adding a scalar matrix to each does not affect our assumptions. The set of all trace zero nonscalar matrices, taken up to scaling, can be identified with an $(n^2 - 2)$-dimensional projective space, and so we may think of our set of quadruples as a Zariski-closed subset of $\mathrm{Mat}_{n\times n} \times \mathbb{P}^{n^2-2} \times \mathbb{P}^{n^2-2} \times \mathrm{Mat}_{n\times n}$. Projecting away from the projective spaces yields $C_n^3$, which must therefore be Zariski-closed; here we use the fact that no nonzero scalar matrix has trace 0 since the characteristic of $k$ does not divide $n$.

The fact that the distance-$d$ commuting sets over algebraically closed fields are all varieties allows us to deduce the following fact.

Corollary 4.7. If $k = \mathbb{C}$ or $k = \mathbb{R}$, then
$$\{(A, B) \mid d(A, B) \le 3\} \subset \mathrm{Mat}_{n\times n}(k)^2$$
is a set of measure 0. Phrased contrapositively, a "random" pair of matrices $(A, B)$ satisfies $d(A, B) = 4$.

Proof. For $k = \mathbb{C}$, this follows from the fact that any proper subvariety of affine space over $\mathbb{C}$ or $\mathbb{R}$ has measure 0, and the fact that at least one pair of matrices has $d(A, B) = 4$.

For $k = \mathbb{R}$, we note that $C_n^3(\mathbb{R}) \subset C_n^3(\mathbb{C}) \cap \mathrm{Mat}_{n\times n}(\mathbb{R})^2$: this is because if there is a commuting chain over $\mathbb{R}$ connecting $A$ and $B$, the same commuting chain exists over $\mathbb{C}$. By Lemma 2.12, we know that $C_n^3(\mathbb{C}) \cap \mathrm{Mat}_{n\times n}(\mathbb{R})^2$ is a variety since $\mathbb{C}$ is algebraic over $\mathbb{R}$; it is in fact a proper variety by the usual examples of distance-4 pairs of matrices. Thus $C_n^3(\mathbb{C}) \cap \mathrm{Mat}_{n\times n}(\mathbb{R})^2$ has measure 0 in $\mathrm{Mat}_{n\times n}(\mathbb{R})^2$, and so too does its subset $C_n^3(\mathbb{R})$. □

Just as with $C_n^2$, we have an algebro-geometric construction of $C_n^3$ over $\mathbb{C}$.

Proposition 4.8. Let $k = \mathbb{C}$, let $I$ be the ideal generated by the $3n^2$ polynomials $(XZ - ZX)_{ij}$, $(ZW - WZ)_{ij}$, and $(YW - WY)_{ij}$, let $J_1$ be the ideal generated by the polynomials $z_{11} - z_{ii}$ (for $2 \le i \le n$) and $z_{ij}$ (for $i \ne j$), and let $J_2$ be generated by the polynomials $w_{11} - w_{ii}$ (for $2 \le i \le n$) and $w_{ij}$ (for $i \ne j$). Let $J = J_1 \cap J_2$ be the intersection of the two ideals. Then
$$C_n^3 = \mathbb{V}((I : J^{\infty}) \cap k[X, Y]).$$

Proof. Inside of the $4n^2$-dimensional space $\mathrm{Mat}_{n\times n}^4$, we consider the set $Q = \{(A, C, D, B) \mid A \leftrightarrow C \leftrightarrow D \leftrightarrow B\}$. Let $S$ be the set of all quadruples $(A, C, D, B)$ where at least one of $C$ and $D$ is a scalar matrix. Finally, let $\pi : \mathrm{Mat}_{n\times n}^4 \to \mathrm{Mat}_{n\times n}^2$ be projection onto the $A$ and $B$ coordinates, so that $\pi(A, C, D, B) = (A, B)$. Then, by definition,
$$C_n^3 = \pi(Q \setminus S).$$
As in the $d = 2$ case, we can show that $Q$ and $S$ are affine varieties in $4n^2$-dimensional space. Let $X$, $Z$, $W$, and $Y$ be matrices of variables corresponding to the coordinates of $A$, $C$, $D$, and $B$, respectively. Then $Q = \mathbb{V}(I)$.

By construction, $\mathbb{V}(J_1)$ is the set of all quadruples $(A, C, D, B)$ where $C$ is a scalar matrix, and $\mathbb{V}(J_2)$ is the set of all quadruples $(A, C, D, B)$ where $D$ is a scalar matrix. Since $J = J_1 \cap J_2$, we have $\mathbb{V}(J) = \mathbb{V}(J_1) \cup \mathbb{V}(J_2)$ [5, Theorem 4.3.15]. Thus, $S = \mathbb{V}(J_1) \cup \mathbb{V}(J_2) = \mathbb{V}(J)$. Using Propositions 2.6 and 2.7, we have
$$\mathbb{V}(I : J^{\infty}) = \overline{Q \setminus S} \quad\text{and}\quad \mathbb{V}((I : J^{\infty}) \cap k[X, Y]) = \overline{\pi\left(\overline{Q \setminus S}\right)}.$$
An argument identical to that in the proof of Proposition 3.5 shows that we may remove the Zariski closures in $\overline{\pi(\overline{Q \setminus S})}$, so that
$$\mathbb{V}((I : J^{\infty}) \cap k[X, Y]) = \overline{\pi\left(\overline{Q \setminus S}\right)} = \pi(Q \setminus S) = C_n^3,$$
as desired. □

We now study the dimension of the distance-3 commuting set over an algebraically closed field. In a forthcoming paper, we will provide a computation of its dimension as being precisely 2n2 − 2; here we content ourselves with a lower bound.

Lemma 4.9. Let $V \subset \mathrm{Mat}_{n\times n}$ denote the set of all derogatory matrices. We have that $V$ is a variety of dimension $n^2 - 3$.

Proof. First we argue that $V$ is a variety. Recall that a matrix $A$ is derogatory if and only if its characteristic polynomial is not equal to its minimal polynomial, that is, if and only if its minimal polynomial has degree at most $n - 1$. This means that there is a nontrivial linear dependence between $I, A, A^2, \dots, A^{n-1}$. Flattening these $n$ square matrices and forming an $n \times n^2$ matrix $B$, the existence of such a nontrivial linear dependence is equivalent to $B$ having rank at most $n - 1$. This rank condition is equivalent to the vanishing of all size $n$ minors of $B$, which are polynomials in the entries of $A$. The set of all derogatory matrices is thus a variety defined by these polynomials.

Consider the set $V'$ of all matrices similar to $\operatorname{diag}(\lambda_1, \lambda_1, \lambda_2, \dots, \lambda_{n-1})$ for some pairwise distinct $\lambda_1, \dots, \lambda_{n-1} \in k$. We claim that $V'$ is dense in $V$. To see this, we can write any derogatory matrix in Jordan canonical form, and then perturb the coefficients to guarantee $n - 1$ distinct eigenvalues. Now we observe that $V'$ is the image of the map $\varphi : GL_n \times k^{n-1} \to \mathrm{Mat}_{n\times n}$ that sends $(g, (\lambda_1, \dots, \lambda_{n-1}))$ to $g \operatorname{diag}(\lambda_1, \lambda_1, \lambda_2, \dots, \lambda_{n-1}) g^{-1}$. If
$$g \operatorname{diag}(\lambda_1, \lambda_1, \lambda_2, \dots, \lambda_{n-1}) g^{-1} = h \operatorname{diag}(\mu_1, \mu_1, \mu_2, \dots, \mu_{n-1}) h^{-1},$$
then $\mu_1 = \lambda_1$ and the sequence $(\mu_2, \dots, \mu_{n-1})$ is a permutation of $(\lambda_2, \dots, \lambda_{n-1})$. As in the proof of Lemma 3.7 we then see that there exists a permutation matrix $\sigma \in S_{n-2}$ such that for $\rho = \begin{pmatrix} I & 0 \\ 0 & \sigma \end{pmatrix} \in S_n$ the matrix $\rho^{-1} h^{-1} g$ commutes with $\operatorname{diag}(\lambda_1, \lambda_1, \lambda_2, \dots, \lambda_{n-1})$ and is hence of the form $\begin{pmatrix} S & 0 \\ 0 & D \end{pmatrix}$ for some $2 \times 2$ invertible matrix $S$ and some $(n - 2) \times (n - 2)$ invertible diagonal matrix $D$. Hence,
$$h = g \begin{pmatrix} S^{-1} & 0 \\ 0 & D^{-1} \end{pmatrix} \begin{pmatrix} I & 0 \\ 0 & \sigma^{-1} \end{pmatrix},$$
which shows that every fibre of $\varphi$ is isomorphic to $GL_2 \times (k^*)^{n-2} \times S_{n-2}$, where $k^*$ denotes the nonzero elements of $k$. This has dimension $n + 2$, and so by [11, Theorem 11.12] it follows that $\dim(V') = n^2 - 3$, as desired. □
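The rank criterion for derogatory matrices can be tested directly with exact arithmetic. The sketch below rests on the standard fact that a matrix is derogatory exactly when $I, A, \dots, A^{n-1}$ are linearly dependent; the helper names `matmul`, `rank`, and `is_derogatory` are hypothetical, and exact rational arithmetic is assumed so that no numerical tolerance is needed.

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def rank(rows):
    """Rank of a matrix of Fractions, by Gaussian elimination."""
    rows = [list(r) for r in rows]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def is_derogatory(A):
    """True when I, A, ..., A^(n-1), flattened to rows, have rank at most n - 1."""
    n = len(A)
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    flattened = []
    for _ in range(n):
        flattened.append([Fraction(x) for row in power for x in row])
        power = matmul(power, A)
    return rank(flattened) <= n - 1


if __name__ == "__main__":
    print(is_derogatory([[1, 0, 0], [0, 1, 0], [0, 0, 2]]))  # repeated eigenvalue, diagonal
    print(is_derogatory([[0, 1, 0], [0, 0, 1], [0, 0, 0]]))  # a single nilpotent Jordan block
```

The rank computation is equivalent to checking that all maximal minors of the flattened matrix vanish, which is the polynomial condition that makes the set of derogatory matrices a variety.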
It immediately follows from this lemma that $\dim(C_n^3) \ge 2n^2 - 3$, since $V \times \mathrm{Mat}_{n\times n} \subset C_n^3$. The following result gives us a sharper bound.

Proposition 4.10. We have $\dim(C_n^3) \ge 2n^2 - 2$.

Proof. As previously remarked, we have $\dim(C_n^3) \ge 2n^2 - 3$ since $V \times \mathrm{Mat}_{n\times n} \subset C_n^3$. Suppose for the sake of contradiction that $\dim(C_n^3) = 2n^2 - 3$. The dimension of a variety is by definition the largest possible dimension of an irreducible component of that variety. Therefore there exists an irreducible component $\mathcal{C}$ of $V \times \mathrm{Mat}_{n\times n}$ of dimension $2n^2 - 3$. It is an irreducible subvariety of $C_n^3$, therefore it is contained in an irreducible component $\widetilde{\mathcal{C}}$ of $C_n^3$. As $\dim C_n^3 = 2n^2 - 3$ by our assumption, it follows that $\dim \widetilde{\mathcal{C}} = 2n^2 - 3$. Now, $\mathcal{C}$ and $\widetilde{\mathcal{C}}$ are varieties of the same dimension, with $\widetilde{\mathcal{C}}$ irreducible and $\mathcal{C} \subset \widetilde{\mathcal{C}}$, so we get $\mathcal{C} = \widetilde{\mathcal{C}}$ by [5, Proposition 9.4.10]. It follows that $\mathcal{C}$ is an irreducible component of $V \times \mathrm{Mat}_{n\times n}$ and of $C_n^3$.

Take an arbitrary pair $(A, B) \in \mathcal{C}$. Thus there exist non-scalar matrices $C$ and $D$ such that $A \leftrightarrow C \leftrightarrow D \leftrightarrow B$. The centralizer of any matrix contains a non-derogatory matrix, as if $J = J_{m_1}(\lambda_1) \oplus \cdots \oplus J_{m_t}(\lambda_t)$ is a matrix in the Jordan canonical form, then $J_{m_1}(\mu_1) \oplus \cdots \oplus J_{m_t}(\mu_t)$ commutes with $J$ and it is non-derogatory if the eigenvalues $\mu_1, \dots, \mu_t$ are pairwise distinct. We can therefore choose a non-derogatory matrix $E$ commuting with $C$. It follows that $(A + \lambda E, B) \in C_n^3$ for every $\lambda \in k$. In Lemma 4.9 we have shown that the set of non-derogatory matrices is open in $\mathrm{Mat}_{n\times n}$ in the Zariski topology. As $E$ is non-derogatory, it therefore follows that $A + \lambda E$ is non-derogatory for all but finitely many scalars $\lambda \in k$. For each such $\lambda$ the pair $(A + \lambda E, B)$ does not belong to $\mathcal{C}$. Therefore there exists another irreducible component $\mathcal{C}'$ of $C_n^3$, different from $\mathcal{C}$, such that $(A + \lambda E, B) \in \mathcal{C}'$ for infinitely many $\lambda \in k$. In the Zariski topology the closure of an infinite number of points on a line is equal to the whole line. As $\mathcal{C}'$ is closed, it therefore has to contain $(A + \lambda E, B)$ for each $\lambda \in k$. In particular $(A, B) \in \mathcal{C}'$, so $\mathcal{C} \subsetneq \mathcal{C}'$. This is a contradiction to $\mathcal{C}$ being a component of $C_n^3$. □

We close with a brief study of the distance-3 commuting set over the field of real numbers. We start with the following example.

Example 4.11. Let
$$A = \begin{pmatrix} 1 & -1 & 0 & 3 \\ -1 & 1 & 0 & -1 \\ -2 & 2 & 0 & -4 \\ 0 & 0 & 0 & -2 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 1 & 0 & 0 \\ -1 & 1 & 0 & 0 \\ 0 & 0 & -1 & 1 \\ 0 & 0 & -1 & -1 \end{pmatrix} \in \mathrm{Mat}_{4\times 4}(\mathbb{R}).$$
We claim that $d_{\mathbb{R}}(A, B) = 4$ and $d_{\mathbb{C}}(A, B) = 3$. Indeed, $A$ is of rank 2 with the eigenvalues $\{2, -2, 0, 0\}$, so it is derogatory, and by Theorem 2.17 $d_{\mathbb{C}}(A, B) \le 3$.

Assume now that $A \leftrightarrow C \leftrightarrow D \leftrightarrow B$ for some $C, D \in \mathrm{Mat}_{4\times 4}(\mathbb{R})$. Then direct computations show that $C$ is of the form
$$\begin{pmatrix} a & b & c & d \\ b + 2c & a - 2c & c & b + c \\ e & f & a - b - 2c + \frac{e}{2} - \frac{f}{2} & b + c - d + \frac{e}{2} + \frac{f}{2} \\ 0 & 0 & 0 & a - c - d \end{pmatrix}$$
and $D$ is of the form
$$\begin{pmatrix} a' & b' & 0 & 0 \\ -b' & a' & 0 & 0 \\ 0 & 0 & c' & d' \\ 0 & 0 & -d' & c' \end{pmatrix}.$$
It can be verified straightforwardly that no two non-scalar such real matrices $C$ and $D$ commute. Thus $d_{\mathbb{R}}(A, B) \ge 4$. It follows that $d_{\mathbb{C}}(A, B) = 3$ and $d_{\mathbb{R}}(A, B) = 4$.

This example, combined with the following lemma, implies Proposition 1.3, which states that $C_4^3(\mathbb{R})$ is not a variety.

Lemma 4.12. Let $k$ be an infinite field, and suppose that there exist $n \times n$ matrices $A$ and $B$ with $d_k(A, B) = 4$ where $A$ has rank at most $n - 2$. Then $C_n^3(k)$ is not a variety.

Proof. Suppose for the sake of contradiction that $C_n^3(k)$ is a variety. Then so too is its intersection with any other variety. Let $V$ be the variety obtained by intersecting $C_n^3(k)$ with the codimension-$n^2$ linear space $L$ defined by fixing the first $n^2$ coordinates to be our given matrix $A$. Identifying $L$ with $k^{n^2}$, we can think of $V$ as a variety in $n^2$-dimensional affine space. By our hypotheses, it is a proper subvariety; for instance, it does not contain the point $B$.

By the argument from the proof of Proposition 1.2, the set of $n \times n$ matrices over an infinite field $k$ that have an eigenvalue in $k$ forms a Zariski-dense set. Let $B'$ be such a matrix. Then $B'$ commutes with some rank 1 matrix $D$. Since $A$ is an $n \times n$ matrix of rank at most $n - 2$, there exists a rank 1 matrix $C = uv^T$ that commutes with both $A$ and $D$: simply choose $u$ to be orthogonal to the row space of $A$ and the row space of $D$, and choose $v$ to be orthogonal to the column space of $A$ and the column space of $D$. Thus we have a commuting chain $A \leftrightarrow C \leftrightarrow D \leftrightarrow B'$, where both $C$ and $D$ have rank 1 and are therefore nonscalar. Thus $d_k(A, B') \le 3$. Since $A$ is distance at most 3 from all matrices in a Zariski-dense set, we have that $V$ is Zariski-dense in $L$. This is a contradiction to the fact that it is a proper subvariety. □

It is currently an open problem whether $C_n^3(\mathbb{R})$ is a variety for $n = 3$ and for $n \ge 5$. One possible tool for approaching this is the following result.

Proposition 4.13. Let $k$ be a field and $n \ge 3$ such that $d_k(A, B) = d_{\overline{k}}(A, B)$ for all $(A, B) \in \mathrm{Mat}_{n\times n}(k)^2$. Then the distance-3 commuting set $C_n^3(k)$ is an affine variety.

Proof. From our assumption on $d_k$ and $d_{\overline{k}}$, we have that

$$C_n^3(k) = C_n^3(\overline{k}) \cap \mathrm{Mat}_{n\times n}(k)^2.$$

By Lemma 2.12, if $k$ is a field and $V \subset \overline{k}^m$ is an affine variety, then so is $V \cap k^m$. We know by Proposition 4.5 that $C_n^3(\overline{k})$ is an affine variety, so its intersection with the affine space $\mathrm{Mat}_{n\times n}(k)^2$ is as well. We conclude that $C_n^3(k)$ is an affine variety. □

We remark that if $d_{\overline{k}}(A, B) \le 2$, we have $d_k(A, B) = d_{\overline{k}}(A, B)$. This follows by definition if $d_{\overline{k}}(A, B) \le 1$. We also have $d_k(A, B) \le 2$ if and only if the rank of the matrix $M_{A,B}$ from Section 3 is at most $n^2 - 2$; since rank is independent of field, this is equivalent to $d_{\overline{k}}(A, B) \le 2$. It follows that $d_k(A, B) = 2$ if and only if $d_{\overline{k}}(A, B) = 2$.

Thus the only way to have $d_k(A, B) \ne d_{\overline{k}}(A, B)$ is with $d_k(A, B) \ge 4$ and $d_k(A, B) > d_{\overline{k}}(A, B) \ge 3$. This can indeed happen, as illustrated in Example 4.11. At present we do not know of any instances of $k \ne \overline{k}$ where $d_k(A, B) = d_{\overline{k}}(A, B)$ for all $A, B \in \mathrm{Mat}_{n\times n}(k)$, at least for some fixed $n > 2$. The following example shows that we do not always have $d_{\mathbb{R}}(A, B) = d_{\mathbb{C}}(A, B)$ for the case of $3 \times 3$ matrices. This does not immediately imply that $C_3^3(\mathbb{R})$ is not a variety; rather, it indicates that any proof that it is a variety cannot rely on the previous proposition.
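The rank criterion quoted above lends itself to a direct computation. The following sketch builds a $2n^2 \times n^2$ coefficient matrix for the linear system $AC = CA$, $BC = CB$ in the entries of $C$ and tests whether its rank is at most $n^2 - 2$. The row and column ordering, the helper names, and the two test pairs are assumptions made for illustration and need not match the layout of $M_{A,B}$ in Section 3; only the rank matters.

```python
from fractions import Fraction


def commutator_rows(A):
    """Coefficient matrix of the linear map C -> AC - CA, with C flattened row by row."""
    n = len(A)
    M = [[Fraction(0)] * (n * n) for _ in range(n * n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                M[i * n + j][k * n + j] += A[i][k]  # (AC)_ij = sum_k A_ik C_kj
                M[i * n + j][i * n + k] -= A[k][j]  # (CA)_ij = sum_k C_ik A_kj
    return M


def rank(rows):
    """Rank of a matrix of Fractions, by Gaussian elimination."""
    rows = [list(r) for r in rows]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def distance_at_most_two(A, B):
    """True when some non-scalar matrix commutes with both A and B."""
    n = len(A)
    return rank(commutator_rows(A) + commutator_rows(B)) <= n * n - 2


if __name__ == "__main__":
    A = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]
    B = [[0, 1, 0], [0, 0, 0], [0, 0, 5]]   # shares the commuting matrix diag(0, 0, 1) with A
    J = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]   # only scalar matrices commute with both A and J
    print(distance_at_most_two(A, B), distance_at_most_two(A, J))
```

Because the computation uses only field operations on the entries of $A$ and $B$, the resulting rank is the same over $k$ and over any extension field, which is the point of the remark.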

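Example 4.14. Let
$$A = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad B = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix} \in \mathrm{Mat}_{3\times 3}(\mathbb{R}).$$
We claim that $d_{\mathbb{R}}(A, B) = 4$ and $d_{\mathbb{C}}(A, B) = 3$. Assume that $A \leftrightarrow C \leftrightarrow D \leftrightarrow B$ for some $C$ and $D$. Then direct computations show that $C$ is of the form
$$\begin{pmatrix} a & b & 0 \\ -b & a & 0 \\ 0 & 0 & c \end{pmatrix}$$
and $D$ is of the form
$$\begin{pmatrix} a' & b' & c' \\ -b' & a' & -b' \\ 0 & 0 & a' - c' \end{pmatrix}.$$
A straightforward calculation shows that no two non-scalar such real matrices $C$ and $D$ commute, so $d_{\mathbb{R}}(A, B) = 4$. On the other hand, the complex matrices of this form
$$C = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & i \end{pmatrix} \quad\text{and}\quad D = \begin{pmatrix} 0 & 1 & i \\ -1 & 0 & -1 \\ 0 & 0 & -i \end{pmatrix}$$
commute, which shows that $d_{\mathbb{C}}(A, B) = 3$.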
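The complex commuting chain in Example 4.14 can be confirmed mechanically. This is a minimal sketch using Python's built-in complex numbers; all entries are Gaussian integers of small size, so the floating-point arithmetic involved is exact here. The helper names `matmul`, `commute`, and `is_scalar` are hypothetical.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commute(X, Y):
    return matmul(X, Y) == matmul(Y, X)


def is_scalar(X):
    """True when X is a constant multiple of the identity matrix."""
    n = len(X)
    return all(X[i][j] == (X[0][0] if i == j else 0)
               for i in range(n) for j in range(n))


# The matrices of Example 4.14.
A = [[0, 1, 0], [-1, 0, 0], [0, 0, 0]]
B = [[0, -1, 0], [1, 0, 1], [0, 0, 0]]
C = [[0, 1, 0], [-1, 0, 0], [0, 0, 1j]]
D = [[0, 1, 1j], [-1, 0, -1], [0, 0, -1j]]

if __name__ == "__main__":
    # Each adjacent pair in the chain A, C, D, B should commute.
    print(commute(A, C), commute(C, D), commute(D, B))
    # A and B themselves do not commute, and C, D are non-scalar.
    print(commute(A, B), is_scalar(C), is_scalar(D))
```

This only verifies the upper bound $d_{\mathbb{C}}(A, B) \le 3$ coming from the explicit chain; the claim that no real chain of length 3 exists is the separate calculation described in the example.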

Acknowledgements. The first author was supported by the Clare Boothe Luce Program. The investigations of the second author are financially supported by RSF grant 17-11-01124. The fourth author is partially supported by Slovenian Research Agency (ARRS) grants N1-0103 and P1-0222. The authors would like to thank Ngoc Tran, who organised the discussions that led to this project; as well as Spencer Backman and Martin Ulirsch, who took part in those discussions. They also thank Jan Draisma, Tom Garrity, Jake Levinson, and Eyal Markman for many helpful discussions and ideas about higher distance commuting varieties.

References

[1] S. Akbari, H. Bidkhori, and A. Mohammadian. Commuting graphs of matrix algebras. Comm. Algebra, 36(11):4020–4031, 2008.
[2] S. Akbari, M. Ghandehari, M. Hadian, and A. Mohammadian. On commuting graphs of semisimple rings. Linear Algebra Appl., 390:345–355, 2004.
[3] S. Akbari, A. Mohammadian, H. Radjavi, and P. Raja. On the diameters of commuting graphs. Linear Algebra Appl., 418(1):161–176, 2006.
[4] S. Akbari and P. Raja. Commuting graphs of some subsets in simple rings. Linear Algebra Appl., 416(2-3):1038–1047, 2006.
[5] D. A. Cox, J. Little, and D. O'Shea. Ideals, varieties, and algorithms. Undergraduate Texts in Mathematics. Springer, Cham, fourth edition, 2015. An introduction to computational algebraic geometry and commutative algebra.
[6] C. G. Cullen. Matrices and linear transformations. Dover Books on Advanced Mathematics. Dover Publications, Inc., New York, second edition, 1990.
[7] G. Dolinar, A. Guterman, B. Kuzma, and P. Oblak. Extremal matrix centralizers. Linear Algebra Appl., 438(7):2904–2910, 2013.
[8] G. Dolinar, B. Kuzma, and P. Oblak. On maximal distances in a commuting graph. Electron. J. Linear Algebra, 23:243–256, 2012.
[9] R. M. Guralnick. A note on commuting pairs of matrices. Linear and Multilinear Algebra, 31(1-4):71–75, 1992.
[10] R. M. Guralnick and B. A. Sethuraman. Commuting pairs and triples of matrices and related varieties. Linear Algebra Appl., 310(1-3):139–148, 2000.
[11] J. Harris. Algebraic geometry: a first course, volume 133 of Graduate Texts in Mathematics. Springer-Verlag, New York, 1992.
[12] D. Hilbert. Über die Theorie der algebraischen Formen. Math. Ann., 36(4):473–534, 1890.
[13] R. A. Horn and C. R. Johnson. Topics in Matrix Analysis. Cambridge University Press, 1991.
[14] R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge University Press, second edition, 2017.
[15] T. S. Motzkin and O. Taussky. Pairs of matrices with property L. II. Trans. Amer. Math. Soc., 80:387–401, 1955.
[16] D. Mumford. The red book of varieties and schemes, volume 1358 of Lecture Notes in Mathematics. Springer-Verlag, Berlin, expanded edition, 1999. Includes the Michigan lectures (1974) on curves and their Jacobians, with contributions by Enrico Arbarello.
[17] Y. N. Shitov. Distances on the commuting graph of the ring of real matrices. Mat. Zametki, 103(5):765–768, 2018.
[18] M. Tsatsomeros. A criterion for the existence of common invariant subspaces of matrices. Linear Algebra Appl., 322(1-3):51–59, 2001.