On Optimality of CSS Codes for Transversal T Narayanan Rengaswamy, Graduate Student Member, IEEE, Robert Calderbank, Fellow, IEEE, Michael Newman, and Henry D

Home , CSS code, Robert Calderbank

On Optimality of CSS Codes for Transversal T Narayanan Rengaswamy, Graduate Student Member, IEEE, Robert Calderbank, Fellow, IEEE, Michael Newman, and Henry D. Pﬁster, Senior Member, IEEE

Abstract—In order to perform universal fault-tolerant quan- unitaries on each physical qubit of the code. Since errors do tum computation, one needs to implement a logical non-Clifford not propagate within codeblocks during such an operation, gate. Consequently, it is important to understand codes that these gates are naturally fault-tolerant. Hence, transversal im- implement such gates transversally. In this paper, we adopt an algebraic approach to characterize all stabilizer codes for plementations of logical gates are highly desirable. However, which transversal T and T † gates preserve the codespace. the Eastin-Knill theorem shows that there is no QECC that Our Heisenberg perspective reduces this question to a finite detects at least 1 error and possesses a universal set of logical geometry problem that translates to the design of certain classical gates that can be realized via transversal operations [2], [3]. codes. We prove three corollaries of this result: (a) For any Therefore, there is a need to balance the operations that can be non-degenerate [[n, k, d]] stabilizer code supporting a physical transversal T (which might not be logical T ), there exists an implemented transversally and the operations for which other [[n, k, d]] CSS code with the same property; (b) Triorthogonal fault-tolerant mechanisms must be devised. codes form the most general family of CSS codes that realize In general, logical Clifford gates are easier to implement logical transversal T via physical transversal T ; (c) Triorthog- than logical T gates. This is because self-dual CSS codes, onality is necessary for physical transversal T on a CSS code i.e., CSS codes where the pure X-type and pure Z-type to realize the logical identity. The main tool we use is a recent characterization of a particular family of diagonal gates in the stabilizers are constructed from the same classical code, admit Clifford hierarchy that are efficiently described by symmetric a transversal implementation of the logical Clifford group, matrices over rings of integers [N. Rengaswamy et al., Phys. but not of the logical T gate. On the other hand, there Rev. A 100, 022304]. We refer to these operations as Quadratic are some code families such as triorthogonal codes [4] and Form Diagonal (QFD) gates. Our framework generalizes all color codes [5] that realize the logical T gate transversally. existing stabilizer code constructions that realize logical gates via transversal T . We provide several examples of codes and briefly Therefore, a common strategy is to utilize these codes to discuss connections to decreasing monomial codes, pin codes, perform magic state distillation and state injection to apply generalized triorthogonality and quasitransversality. We partially the logical T gate on the data [4], [6], [7]. By this approach, extend these results towards characterizing all stabilizer codes circuits on the error-corrected quantum computer will only ` that support transversal π/2 Z-rotations. In particular, using consist of Clifford operations, augmented by ancillary magic Ax’s theorem on residue weights of polynomials, we provide an alternate characterization of logical gates induced by transversal states, and these operations can be realized transversally. π/2` Z-rotations on a family of quantum Reed-Muller codes. There has also been interest in employing logical smaller We also briefly discuss a general approach to analyze QFD gates angle rotations, compared to the π/8 rotation of the T gate [8]. that might lead to a characterization of all stabilizer codes that This poses heavier requirements on the distillation code, but support any given physical transversal 1- or 2-local diagonal gate. can also result in shorter gate sequences during compilation. In contrast to the difficulty of small-angle logical rotations, the Index Terms—Stabilizer codes, transversal gates, triorthogonal fidelity of physical rotations can increase at finer angles [9], codes, CSS codes, QFD gates, non-degenerate codes, binary helping to mitigate the burden of magic state distillation with polynomials cumbersome codes. Such trapped-ion systems are leading can- didates to realize a universal fault-tolerant quantum computer I.INTRODUCTION and most experimental efforts focus on stabilizer codes. Thus, Quantum error correction is vital to build a universal, fault- it may be profitable to further understand stabilizer codes arXiv:1910.09333v3 [quant-ph] 18 Aug 2021 tolerant quantum computer. Since such a device must process supporting smaller-angle transversal Z-rotations as well. the information stored in it, we need to devise schemes that In this paper, we take steps towards systematically un- fault-tolerantly perform unitary operations on the protected derstanding the construction of general stabilizer codes that information. For an [[n, k, d]] quantum error-correcting code, support physical transversal T and T † gates as logical opera- any unitary operation on the k logical qubits must be realized tors, and then discuss extensions to transversal finer angle Z- via an operation on the n physical qubits that preserves the rotations. We also briefly discuss a general method to analyze code subspace. A transversal gate is one in which the physical other diagonal gates that have an efficient representation operation decomposes into a tensor product of individual using symmetric matrices over rings of integers [10] (see Section III-A), which we refer to as Quadratic Form Diagonal N. Rengaswamy, R. Calderbank, and H.D. Pfister are with the Depart- (QFD) gates. These form a subgroup of all diagonal gates in ment of Electrical and Computer Engineering, Duke University, Durham, North Carolina 27708, USA. M. Newman is with the Departments of the Clifford hierarchy which were characterized earlier by Cui Electrical and Computer Engineering, Chemistry and Physics, Duke Univer- et al. [11]. We have shown that all 1- and 2-local diagonal sity, Durham, North Carolina 27708, USA. Email: {narayanan.rengaswamy, gates in the Clifford hierarchy are QFD gates [10]. Fault- robert.calderbank, henry.pfister}@duke.edu, [email protected] Part of this work was presented at the 2020 IEEE International Symposium tolerance makes it natural to partition the physical qubits into on Information Theory [1]. small groups and employ “generalized” transversal gates that 2 split into operations on these individual groups. Indeed, such fixed even when the lattice size is increased, i.e., code a scheme has been recently explored by Jochym-O’Connor et distance is increased. However, codes such as quantum al. [12], and can be used to construct a universal set of fault- Reed-Muller codes are typically non-degenerate since tolerant gates [13]. In fact, they showed that if we allow the each stabilizer generally has weight at least equal to the partition to change during computation, then we can obtain a code distance. universal set of logical gates through transversal operations 2) Triorthogonal codes form the most general family of alone. More precisely, on a concatenated code, where the CSS codes that realize logical transversal T from physi- (fixed) partition decides which is the inner code and which is cal transversal T (see Theorem 14 and Corollary 15). the outer code, a transversal operation on the outer code can be Here, by “logical transversal T ” we mean that the effectively used to realize the operation fault-tolerantly on the induced logical operation applies T to each logical qubit. overall concatenated code, although the overall operation is not 3) Triorthogonality is necessary for physical transversal transversal. Therefore, our general approach to analyze QFD T on a CSS code to realize the logical identity (see gates allows one to investigate codes that support transversal Theorem 12). An additional condition on the logical 1- and 2-local diagonal gates, on a partition of qubits into X operators distinguishes this case from triorthogonal groups of at most two. This paper is a proof-of-concept for codes where the logical operation is also a transversal the important case of Z-rotations. T . Several works have studied the problem of realizing non- These results suggest that, for the problem of distilling magic trivial logical operators via physical Z-rotations [4], [14], [15], states using physical transversal T , CSS codes might indeed [16], [17]. These works approach this problem by restricting be optimal. We emphasize that we are able to make such themselves to Calderbank-Shor-Steane (CSS) codes and then conclusions because we focus on the effects of physical op- examining the action of these gates on the basis states of these erations directly on the stabilizer (and logical Pauli) group(s), codes. When a π/2` Z-rotation gate rather than just on the basis states of CSS codes, i.e., by taking a “Heisenberg” perspective rather than a “Schrodinger”¨ −ıπ π π 2πı/2` exp Z = cos · I2 − ı sin · Z ≡ diag 1, e perspective. 2` 2` 2` (1) We believe this result opens the way to leverage the rich classical literature on self-dual codes [18], [19], the acts on a qubit in the computational basis {|0i , |1i}, it picks MacWilliams identities [20] and the McEliece theorems on 2πı up a phase of exp 2` when acting on |1i and it leaves |0i divisibility of weights [21], [22], to potentially construct undisturbed. Hence, a transversal application of this gate on an new stabilizer codes with transversal gates. Furthermore, this n 2πı n-qubit state |vi , v ∈ Z2 , picks up the phase exp ` wH (v) , perspective is a new tool for arguing about the best possible Pn 2 where wH (v) := i=1 vi denotes the Hamming weight of scaling achievable for rates and distances of stabilizer codes v. Therefore, by engineering the Hamming weights of the supporting transversal T gates, or even general π/2` Z- binary vectors describing the superposition in the CSS basis rotations. states, these works determined sufficient conditions for such Among several examples, we construct a [[16, 3, 2]] code transversal Z-rotations to realize logical operators on these where transversal T realizes the logical CCZ (up to Pauli codes. corrections; see Section III-D). This code belongs to the In contrast to these previous works, we take a Heisenberg compass code family studied in [23]. This is also closely approach to this problem by examining the action of the related to Campbell’s [[8, 3, 2]] color code [24] that is defined physical operation on the stabilizer group defining the code, on a 3-dimensional cube, and it can be interpreted as three naturally generalizing the aforementioned strategy. Conse- such cubes in a chain. (The construction can be extended quently, we are able to derive necessary and sufficient condi- to a chain of arbitrary number of cubes.) As we show in tions for any stabilizer code to support a physical transversal Example 3, the [[8, 3, 2]] code belongs to a family of [[2m, m, 2]] T,T † gate, without restricting ourselves to CSS codes (see quantum Reed-Muller codes defined on m-dimensional cubes. Theorems 2, 6). When applied to CSS codes, these conditions However, as we discuss in Example 5, the [[16, 3, 2]] code can translate to constructing a pair of classical codes CX and CZ be constructed using the (classical) formalism of decreasing such that CZ contains a self-dual code supported on each monomial codes that was introduced by Bardet et al. [25], [26]. codeword in CX . Concretely, this result allows us to prove This formalism generalizes Reed-Muller and polar codes [27], the following corollaries which broadly form “converses” to and provides a general framework for synthesizing a large the sufficient conditions derived in the aforementioned works. family of codes via evaluations of polynomials. Recently, 1) Given an [[n, k, d]] non-degenerate stabilizer code sup- Krishna and Tillich [28] have exploited this framework to porting a physical transversal T,T † gate, there exists construct triorthogonal codes from punctured polar codes an [[n, k, d]] CSS code supporting the same operation for magic state distillation. Thus, the [[16, 3, 2]] code forms (see Corollary 10). An [[n, k, d]] stabilizer code is non- an interesting example because it points towards a general degenerate if every stabilizer element has weight at least application of the formalism of decreasing monomial codes d. For degenerate stabilizer codes, this statement holds for transversal Z-rotations, where the logical X and Z strings under an additional assumption on the stabilizer gener- are not necessarily identical as in the standard presentation ators. Note that the toric and color codes are degenerate of triorthogonal codes. Such asymmetry in logical operators codes because the weights of the stabilizer generators are and hence the X- and Z-distances of the codes, which can 3 also exist in triorthogonal codes, might be useful in scenarios E(a, b) := ıα1β1 Xα1 Zβ1 ⊗ · · · ⊗ ıαnβn Xαn Zβn of biased noise as well [29]. Hence, this formalism provides T = ıab mod 4D(a, b). (4) more flexibility in designing codes as well as analyzing them. 2 Finally, we extend this approach beyond T gates and es- The unitaries E(a, b) are Hermitian and satisfy E(a, b) = IN , tablish conditions for a stabilizer code to support a transversal where N := 2n, but the unitaries D(a, b) can have order 1, 2 or π/2` Z-rotation (see Theorem 17). However, the conditions we 4. Although Pauli operators are usually represented as binary derive involve trigonometric quantities on the weights of vec- vectors, we use the generalized notation of integer vectors to tors describing the stabilizer, and we are unable to distill finite essentially keep track of phases and signs more carefully, as geometric conditions without making a simplifying assump- we discussed in [10]. The n-qubit Heisenberg-Weyl group (or κ tion. Therefore, we have yet to establish a full generalization Pauli group) is defined as HWN := {ı D(a, b) for all a, b ∈ ` n ` of Theorems 2 and 6 to general π/2 Z-rotations. Note that Z2 , κ ∈ Z4}. We use the notation Z2` = {0, 1, 2,..., 2 − 1} we only discuss Z-rotations of the form in (1) because non- to denote the set of integers modulo 2` for some integer ` > 0. n trivial error-detecting stabilizer codes only support rotations For v = [v1, v2, . . . , vn] ∈ Z2 , let |vi = ev = |v1i ⊗ |v2i ⊗ belonging to the Clifford hierarchy [12], [11]. However, we are · · · ⊗ |vni denote the standard basis vector with entry 1 in the m m r v 0 hv| = |vi† able to study a family of [[2 , r , 2 ]] quantum Reed-Muller position indexed by and elsewhere, and let , the codes, where 1 ≤ r ≤ m/2 and r divides m, and provide Hermitian transpose of |vi. An arbitrary n-qubit quantum state P N an alternative perspective that highlights the logical operation can be written as |ψi = n αv |vi ∈ , where αv ∈ v∈Z2 C C −ıπ P 2 realized by a transversal exp m/r Z gate. This recovers the satisfy n |αv| = 1 as per the Born rule [30], and 2 v∈Z2 C well-known [[8, 3, 2]] code of Campbell [24], and also provides denotes the field of complex numbers. It is easy to check new information about the family of codes discussed in [14], that X |0i = |1i ,X |1i = |0i ,Z |0i = |0i ,Z |1i = − |1i. P [15]. By the “CSS sufficiency” intuition above, this ties back Hence, we can write E(a, 0) = n |v ⊕ ai hv| ,E(0, b) = v∈Z2 to the “Schrodinger”¨ perspective of past works. P vbT n (−1) |vi hv|. Throughout the paper, ⊕ denotes mod- v∈Z2 The paper is organized as follows. Section II establishes ulo 2 addition and + denotes the usual addition over integers. the necessary background and notation, which includes the Also, all binary and integer-valued vectors will be row vectors Pauli group, QFD gates, and stabilizer codes. Section III-A while complex-valued vectors will be column vectors. For outlines the general approach to analyze stabilizer codes that n x = [x1, x2, . . . , xn], y = [y1, y2, . . . , yn] ∈ Z2 , x ∗ y = support a given physical QFD gate. Section III-B establishes [x1y1, x2y2, . . . , xnyn]. the necessary and sufficient conditions for a stabilizer code Using the fact that XZ = −ZX, we can prove the † to support a given pattern of T and T gates on the physical following identities for Pauli matrices (e.g., see [31]). qubits. Section III-C derives conditions for physical transversal bcT −adT mod 4 T to realize logical transversal T on CSS codes, and proves E(a, b)E(c, d) = ı E(a + c, b + d) that triorthogonal codes are the most general CSS codes that = (−1)h[a,b],[c,d]is E(c, d)E(a, b), (5) satisfy this property. Sections III-D and III-E discuss examples 0 I T Z h[a, b], [c, d]i := [a, b]Ω[c, d]T , Ω := n , (6) of codes where physical transversal realizes logical CC s I 0 gates, and explain how these codes satisfy the conditions n a(b+2x)T axT derived in Theorem 2. The discussion on related work in [14] E(a, b + 2x) = ı D(a, b + 2x) = (−1) E(a, b). is provided at the end of Section III-E. Section III-F partially (7) extends the results in Section III-B and explicitly derives Therefore, two Pauli matrices E(a, b),E(c, d) commute if and the logical operation realized by transversal Z-rotations on only if h[a, b], [c, d]is = 0, and they anti-commute otherwise. the aforementioned family of quantum Reed-Muller codes. The Pauli operators { √1 E(a, b), a, b ∈ n} form an N Z2 The relation to quantum pin codes and quasitransversality orthonormal basis for all unitary matrices under the trace is discussed at the end of Section III-F. Finally, Section IV † † inner product hA, BiTr := Tr(A B), where A represents concludes the paper and discusses future directions. the Hermitian transpose of A. Therefore, given any matrix U ∈ UN , where UN denotes the group of N × N unitary II.PRELIMINARIESAND NOTATION matrices, we can express it as A. The Pauli or Heisenberg-Weyl Group X 1 1 √ √ The single qubit Pauli operators are the unitaries U = Tr E(a, b) · U · E(a, b). (8) n N N a,b∈Z2 1 0 0 1 1 0 I := ,X := ,Z := ,Y := ıXZ, Note that Tr(E(a, b)) = 0 unless E(a, b) = E(0, 0) = I , in 2 0 1 1 0 0 −1 N which case Tr(E(0, 0)) = N. (2) P If U = n φv |vi hv| is diagonal (in the standard √ v∈ 2 2 2 2 Z where ı := −1. They satisfy X = Z = Y = I2. coordinate basis), then for a 6= 0 we observe that Let A ⊗ B denote the Kronecker product between matrices Tr(E(a, b)U) A and B. For n qubits, given a = [α1, α2, . . . , αn], b =   [β , β , . . . , β ] ∈ n, where denotes the ring of integers, 1 2 n Z Z abT X X v0bT 0 0 we define the operators := Tr ı |v ⊕ ai hv| · (−1) |v i hv | · U v∈ n v0∈ n α β α β α β Z2 Z2 D(a, b) := X 1 Z 1 ⊗ X 2 Z 2 ⊗ · · · ⊗ X n Z n , (3) (9) 4

T X T = ıab (−1)vb hv| U |v ⊕ ai = 0. (10) explicitly calculates their action on an n-qubit (Hermitian) n v∈Z2 Pauli matrix to be

Hence, Tr(E(a, b)U) 6= 0 if and only if a = 0, and (`) (`) † φ(R,a,b,`) (`−1) T τ E(a, b)(τ ) = ξ E(a , b + a R) τ , P vb R R 0 0 0 R˜(R,a,`) Tr(E(0, b)U) = v∈ n (−1) φv. Z2 (15) φ(R, a, b, `) := (1 − 2`−2)a RaT B. Quadratic Form Diagonal (QFD) Gates 0 0 `−1 T T + 2 (a0b + b0a ), (16) The Clifford hierarchy of unitary operators was defined 1 1 R˜(R, a, `) := (1 + 2`−2)D − (D RD by Gottesman and Chuang [6] in order to demonstrate that a0R a¯0 a0 + D RD + 2D ). (17) universal quantum computation can be realized via quantum a0 a¯0 a0RDa0 teleportation if one has access to Bell-state preparation, Bell- basis measurements and arbitrary single-qubit operations. The Here, Dx represents a diagonal matrix with the diagonal set first level C(1) of the hierarchy is defined to be the Pauli group, to the vector x, and x¯ = 1 − x with 1 representing the vector (1) (`) i.e., C = HWN . For ` ≥ 2, the levels C are defined whose entries are 1. We write a = a0 + 2a1 + 4a2 + . . . , b = n n recursively as b0 +2b1 +4b2 +... ∈ Z with ai, bi ∈ Z2 . With this notation, b + a R is an integer sum and the definition of E(a, b) has (`) : † (`−1) 0 0 C = {U ∈ UN : UE(a, b)U ∈ C ∀ E(a, b) ∈ HWN }. been suitably generalized to integer vectors a, b as we did (11) in [10]. Note that D(a, b) is unaffected by this generalization T By this definition, the second level is the Clifford group, since it does not have an overall phase factor ıab and (2) 2 2 C = CliffN , that is fundamental to quantum computation. X = Z = I2, i.e., D(a, b) = D(a0, b0). Whenever we only Equivalently, the Clifford group can also be defined as the consider binary vectors a, b we will replace a0 in the above `−2 T automorphism group of HWN . Any given g ∈ CliffN satisfies expressions with a, so then φ(R, a, b, `) = (1 − 2 )aRa . Equation (15) naturally extends the action of the Clifford † gE(a, b)g = ±E([a, b]Fg), (12) group given in (12) to a large class of diagonal unitaries 2n×2n T in the Clifford hierarchy. In this case the symplectic matrix where Fg ∈ Z2 satisfies Fg Ω Fg = Ω and hence is In R called a binary symplectic matrix [31]. CliffN can be generated ΓR = is defined over Z2` and still satisfies 1 0 In using the unitaries Hadamard, Phase , Controlled-Z (CZ) and T ΓR ΩΓR = Ω (mod 2). The results of [10] also show that Controlled-NOT (CX) defined respectively as (`) (`) τR ∈ C , and that all 1- and 2-local diagonal unitaries 1 1 1 1 0 in the Clifford hierarchy can be represented using an integer H := √ ,P := , 2 1 −1 0 ı symmetric matrix R. There are also many higher-locality diagonal gates that can be represented in this manner, and CZab := |0i h0| ⊗ (I2)b + |1i h1| ⊗ Zb, a a some examples are provided in [10]. Henceforth, we will refer CX := |0i h0| ⊗ (I ) + |1i h1| ⊗ X . (13) a→b a 2 b a b to these type of diagonal unitaries as Quadratic Form Diagonal The subscripts for CZ and CX denote the indices of the two (QFD) gates. Z qubits involved, and C is symmetric with respect to both Example 1. Consider n = 1, ` = 3, ξ = eıπ/4,R = [ 1 ] qubits. However, for CX, a → b indicates that when the (`) and hence τ = T . Since this is diagonal in the standard control qubit a is in state |1i the target qubit b is flipped by R basis, it commutes with Z = E(0, 1). For E(1, 0) with a = 1, applying the X gate, and when a is in state |0i the target b is φ(R, a, b, `) = (1 − 2)aRaT = −1 and R˜(R, a, `) = (1 + left undisturbed. Note that H† = H and HZH = X implies 2)D −(D RD +D RD +2D ) = [ 1 ]. Hence, (15) implies CZ = ((I ) ⊗ H ) CX ((I ) ⊗ H ). It is well-known 1 0 1 1 0 1 ab 2 a b a→b 2 a b TXT † = τ (3)E(1, 0)(τ (3))† = e−ıπ/4E(1, 1)τ (2) = that Clifford unitaries along with any unitary from a higher R R R˜(R,1,3) level, say from C(3), can be used to approximate any unitary e−ıπ/4YP . It is easy to see that the phase gate can be (I2+Z) (I2−Z) operator arbitrarily well, and hence form a universal set for expressed as P = 2 + ı 2 . Therefore, quantum computation [32]. The widely used choice for the non-Clifford unitary is the “π/8” gate or the T gate, defined Y + ıX ı(Y − ıX) TXT † = e−ıπ/4 + as 2 2 √ −ıπ/8 e−ıπ/4 1 0 1/4 e 0 − ıπ Z T := = P = Z ≡ = e 8 . = [(1 + ı)(Y + X)] 0 eıπ/4 0 eıπ/8 2 1 (1 − ı) X + Y (14) = √ (1 + ı)(Y + X) = √ , (18) 2 2 2 The work in [10] considered diagonal unitaries of the (`) P vRvT mod 2` form τ = n ξ |vi hv|, where ` ∈ >1, R v∈Z2 Z which is a well-known identity cast in our framework. We will 2πı † −ıπ/4 ξ := exp( 2` ), and R is a symmetric matrix over Z2` . It use the identity TXT = e YP extensively in this paper. (`) 1We use the notation “P ” for the phase gate and reserve the commonly It will be convenient to expand τR in the Pauli basis. Using used notation of “S” for stabilizer groups. the observation we made earlier for a general diagonal unitary 5

n U, for x ∈ Z2 we define the generators of S that produce E(aj, bj), i.e., jE(aj, bj) = Q νtE(ct, dt) for a unique subset J. (`) 1 h (`)i 1 X T T t∈J⊆{1,...,r} c := √ Tr E(0, x)τ = √ (−1)vx ξvRv CSS (Calderbank-Shor-Steane) code R,x n R n A is a special type of 2 2 n v∈Z2 stabilizer code defined by a stabilizer S whose generators split (`) 1 X (`) into strictly X-type and strictly Z-type operators. Consider ⇒ τ = √ c E(0, x). (19) R n R,x two classical binary codes C ,C such that C ⊂ C , and 2 n 1 2 2 1 x∈Z2 ⊥ ⊥ ⊥ ⊥ let C1 ,C2 represent their respective dual codes (C1 ⊂ C2 ). Using this Pauli expansion for τ (`−1) in (15), and assuming Define the stabilizer S := hνcE(c, 0), νdE(0, d), c ∈ C2, d ∈ R˜(R,a,`) ⊥ n C1 i for some suitable νc, νd ∈ {±1}. Let C1 be an [n, k1] a, b ∈ Z2 , we get ⊥ code and C2 be an [n, k2] code such that C1 and C2 can (`) (`) † τR E(a, b)(τR ) correct up to t errors. Then S defines an [[n, k1 − k2, ≥ 2t + 1]] CSS code3 that we will represent as CSS(X,C ; Z,C⊥). = ξφ(R,a,b,`)E(a, b + aR) τ (`−1) (20) 2 1 R˜(R,a,`) k2×n ⊥ (n−k1)×n If G2 ∈ Z2 and G1 ∈ Z2 represent generator 1 X (`−1) ⊥ = ξφ(R,a,b,`)E(a, b + aR) · √ c E(0, x) matrices for the codes C2 and C1 , respectively, then a binary n R˜(R,a,`),x 2 n generator matrix for S can be written as x∈Z2 (21) n n ⊥ 1 φ(R,a,b,`) X (`−1) −axT G1 n − k1 = √ ξ c ı E(a, b + aR + x). GS = . (25) n R˜(R,a,`),x 2 n G2 k2 x∈Z2 (22) The primary challenge here is to determine which coefficients III.MAIN RESULTS AND DISCUSSION are non-zero for given R, a, `, and also their values. Given a stabilizer code, we need to develop a scheme to perform fault-tolerant universal quantum computation on C. Stabilizer Codes the logical qubits protected by the code, because otherwise the code can only be used as a quantum memory. In [31], A stabilizer group S is a commutative subgroup of the Pauli we produced a systematic algorithm that can synthesize all group HWN with Hermitian elements that does not include (equivalence classes of) Clifford circuits on the physical qubits −IN . If S has dimension r, then it can be generated as S = n that realize a given logical Clifford operator (on the logical hνiE(ci, di); i = 1, . . . , ri, where νi ∈ {±1}, ci, di ∈ Z2 and qubits). We will refer to this as the Logical Clifford Synthesis E(ci, di),E(cj, dj) commute for all i 6= j, i, j ∈ {1, . . . , r}, 4 T T (LCS) algorithm . As mentioned before, for universal quantum i.e., h[ci, di], [cj, dj]is = cidj + dicj = 0 (mod 2). Rec- computation, we also need to determine a way to synthesize ollect that commuting N × N Hermitian matrices can be circuits that realize at least one non-Clifford logical operator. simultaneously diagonalized and hence have a common basis N Moreover, the simplest example of a fault-tolerant circuit of eigenvectors that span C . Given a stabilizer group S, is a transversal operator, that splits as a tensor product of the corresponding stabilizer code [33] is the subspace V (S) individual single-qubit operators on the physical qubits of the spanned by all eigenvectors in the common eigenbasis of S N code, since errors on individual qubit lines do not spread to that have eigenvalue +1, i.e., V (S) := {|ψi ∈ C : g |ψi = other qubits. Since fault-tolerant realizations of logical non- |ψi for all g ∈ S}. The subspace V (S) is called an [[n, k, d]] Clifford operators are harder to produce, we begin by asking a stabilizer code because it encodes k := n − r logical qubits question that is motivated by the easiest non-Clifford operator into n physical qubits. The minimum distance d is defined to to engineer. 2 be the minimum weight of any operator in NHWN (S) \ S. What kind of stabilizer codes support a transversal operator Here, NHWN (S) denotes the normalizer of S inside HWN , composed of T and T † gates on the physical qubits? κ NHW (S) := {ı E(a, b) ∈ HWN : E(a, b)E(c, d)E(a, b) = N In other words, what structure is needed in the stabilizer E(c, d) for all νE(c, d) ∈ S, κ ∈ }. (23) Z4 S so that the code subspace V (S) is preserved under the Given a Hermitian Pauli matrix E(c, d) and ν ∈ {±}, it application of a given pattern of T and T † (and identity) gates IN +νE(c,d) on the physical qubits? This reverses the strategy employed in is easy to show that 2 is the projector on to the ν- eigenspace of E(c, d). Therefore, the projector on to the code the LCS algorithm, where we translated a logical operator to subspace V (S) of the stabilizer code defined by S is given by a physical operator. The above question is more practically motivated because single-qubit Z-rotations are some of the r 2r Y (IN + νiE(ci, di)) 1 X easiest examples of non-Clifford gates that can be performed Π := = E(a , b ), (24) S 2 2r j j j in the lab, e.g., for trapped ion systems these gates are actually i=1 j=1 3We say the distance is at least 2t + 1 because distance of the code is the where j ∈ {±1} in the last equality is a character of the ⊥ ⊥ minimum weight of any vector in (C1 \ C2) ∪ (C2 \ C1 ), and not just group S, and hence is determined by the product of signs of ⊥ C1 ∪ C2 . 4Our implementation of the LCS algorithm, along with several 2Weight of a Pauli operator refers to the number of qubits on which it acts general purpose subroutines, is available at: https://github.com/nrenga/ non-trivially, i.e., as X,Y or Z. symplectic-arxiv18a. 6 native operations [34], and we want to make the maximum use can use the above approach to derive codes accordingly; if of them. Assuming the stabilizer has the necessary structure, the operation is also a QFD gate, then we can exactly use we also need to determine what logical operator is realized the above equations. However, solving the above equality for by the given pattern of T and T † gates. We also address arbitrary R, ` might be hard. In this paper, we solve the (3) this question for some codes and logical operators, e.g., we equality completely when τR is a transversal combination of consider the case when a transversal application of the T gate T and T † gates. We also provide a nearly complete solution realizes the logical transversal T on all the k logical qubits for the transversal application of higher level Z-rotations. ⊥ encoded by a CSS(X,C2; Z,C1 ) code. This establishes a tight connection with triorthogonal codes defined by Bravyi and B. Stabilizer Codes that Support T and T † Gates Haah [4]. Subsequently, we provide a partial answer to the extension of the above question to Z-rotations above level 3 We begin with a formula for the physical transversal T gate of the Clifford hierarchy [8], [14], [15], [16], [17]. Finally, which, given several applications, is of independent interest. m m r we produce a family of [[2 , , 2 ]] quantum Reed-Muller n n r Lemma 1. Let E(a, b) ∈ HWN ,N = 2 , for some a, b ∈ Z2 . codes, where 1 ≤ r ≤ m/2 and r divides m, and show that the Then the transversal T gate acts on E(a, b) as transversal π/2m/r Z-rotation is a logical operator on these † 1 X T codes. Furthermore, we also derive the exact logical operation T ⊗nE(a, b) T ⊗n = (−1)by E(a, b ⊕ y), 2wH (a)/2 realized by this transversal gate on these codes. ya We begin by outlining the general strategy for understanding (30) when a physical QFD gate preserves a stabilizer codespace. T where wH (a) = aa is the Hamming weight of a, and y a denotes that y is contained in the support of a. A. General Approach for QFD Gates Proof. This result is a special case of Lemma 4, which we The key idea in addressing the above question is the follow- prove in Appendix II-B. ing: a physical operator U ∈ UN preserves the code subspace of a stabilizer code defined by a stabilizer group S if and only Using this lemma, we state our first result which partially † if UΠSU = ΠS. This is because two operators preserve each answers the above question. others’ eigenspaces if and only if they commute. Here, we say an operator A preserves the eigenspace of another operator B Theorem 2 (Transversal T ). Let S = hνiE(ci, di); i = if it holds that for any eigenvector v of B with eigenvalue b, 1, . . . , ri define an [[n, n − r, d]] stabilizer code, with arbitrary νi ∈ {±1}, and denote the elements of S by jE(aj, bj), j = Av is also an eigenvector of B with eigenvalue b. Thus, U is r a valid logical operator for S if and only if 1, 2,..., 2 . If the transversal application of the T gate preserves the code space V (S) and hence realizes a logical r r 2 2 operation on V (S), then the following are true. † 1 X † 1 X UΠSU = jUE(aj, bj)U = jE(aj, bj) 2r 2r 1) For any jE(aj, bj) ∈ S with non-zero aj, wH (aj) is j=1 j=1 even, where wH (aj) represents the Hamming weight of = ΠS. (26) n aj ∈ Z2 . (`) 2) For any jE(aj, bj) ∈ S with non-zero aj, define Zj := If U = τ for some ` ≥ 2 and R symmetric over ` , then R Z2 {z a : E(0, z) ∈ S for some ∈ {±1}}5. Then by (15) we need j z z Zj contains its dual computed only on the support of aj, 2r i.e., on the ambient dimension w (a ). Equivalently, Z (`) (`) 1 X (`) (`) H j j τ Π (τ )† = τ E(a , b )(τ )† (27) R S R 2r j R j j R contains a dimension wH (aj)/2 self-dual code Aj that j=1 is supported on aj, i.e., there exists a subspace Aj ⊆ Zj 2r T 1 1 such that yz = 0 (mod 2) for any y, z ∈ Aj (including X φ(R,aj ,bj ,`) = j √ ξ 2r 2n y = z) and dim(Aj) = wH (aj)/2. j=1 3) Let Z˜ ⊆ wH (aj ) denote the subspace Z where T j Z2 j X (`−1) −aj x c ı E(aj, bj + ajR + x) all positions outside the support of aj are punctured R˜(R,aj ,`),x x∈ n n ˜ ⊥ Z2 (dropped). Then, for each z ∈ Z2 such that z˜ ∈ (Zj) r zzT (28) for some j ∈ {1,..., 2 }, we have z = ı , i.e., T 2r ızz E(0, z) ∈ S. Here, (Z˜ )⊥ denotes the dual of Z 1 X j j = r jE(aj, bj). (29) taken over this punctured space with ambient dimension 2 ⊥ j=1 wH (aj). (Also, Zj ⊇ (Z˜j) with zeros added outside This shows one important use of the formula (15) derived the support of aj.) in [10]. As mentioned at the end of Section II-B, the primary Conversely, if the first two conditions above are satisfied, and problem here is to determine which coefficients are non-zero if the third condition holds for all z ∈ Aj instead of just the for given R, a, `, and also their values. In principle, we can 5 Although using the above notation it is true that zE(0, z) = solve for all the conditions on S that are necessary (and 0 r j0 E(aj0 , bj0 ) for some j ∈ {1, 2,..., 2 } with aj0 = 0, we use the sufficient) for this equality. So, if we want to take advantage notation with z for convenience and also because it actually refers to pure of an operation that we can do “easily” in the lab, then we Z-type stabilizers. 7

dual of (the punctured) Zj, then transversal T preserves the If the Z-stabilizer generators were instead taken to be code space V (S) and hence induces a logical operation. Z1Z2,Z3Z4,Z5Z6, then the superposition above in |00iL ⊗6 will be (|000000i + |111111i). Therefore, T X1X3X5 will Proof. See Appendix II-A. be a valid logical operator (that still implements the logical identity). Remark 3. Note that, since A is supported on a , the ambient j j Given that S has the necessary structure given by The- dimension of vectors in A is essentially w (a ). So A j H j j orem 2, note that we can freely add another Z-stabilizer is a [w (a ), w (a )/2] self-dual code embedded in n. H j H j Z2 generator that commutes with X⊗6, e.g., Z Z Z Z ↔ The last point is requiring that the Z-stabilizers arising from 1 3 4 6 [1, 0, 1, 1, 0, 1] ∈/ ZS. This does not affect the transversal T vectors in the subspaces Aj have the correct sign, given by ⊗n ⊗n † T property: once T ΠS(T ) = ΠS, the mapping ΠS 7→ ΠS · ıwH (z) = ızz ∈ {±1}. If this is not taken care of, then (IN +E(0,z)) still preserves the equality since (I + E(0, z)) an appropriate Pauli operator has to be applied before and 2 N is diagonal. after transversal T in order to make a valid logical operator. Indeed, this Pauli operator is essentially fixing the signs of Now we generalize Lemma 1 and Theorem 2 to T and T † the Z-stabilizers as required. Hence, although the necessary gates, which addresses the initial question completely. and sufficient directions of the theorem differ in the last sign Lemma 4. For N = 2n and a, b ∈ n, let E(a, b) ∈ HW condition, for all practical purposes one can take the signs Z2 N and choose t , t ∈ n such that t ∗ t = 0 (i.e., supp(t ) ∩ to be imposed on all of A instead of just its subspace that 1 7 Z2 1 7 1 j supp(t ) = ∅). Define t = t + 7t ∈ {0, 1, 7}n, t0 = t + t ∈ is identified with (Z˜ )⊥. These Pauli corrections preserve the 7 1 7 1 7 j n. Then the physical operation T ⊗t acts on E(a, b) as code parameters [[n, n − r, d]]. Z2 ⊗t ⊗t† Let us now look at a simple example constructed using this T E(a, b) T 1 T theorem that will clarify the requirements above. X (b+t7)y = 0 (−1) E(a, b ⊕ y), (34) 2wH (a∗t )/2 Example 2. Define a [[6, 2, 2]] CSS code by the following y(a∗t0) stabilizer generator matrix. where T ⊗t denotes that T (resp. T † = T 7) is applied to the  1 1 1 1 1 1 0 0 0 0 0 0  qubits in the support of t1 (resp. t7). 0 0 0 0 0 0 1 1 0 0 0 0 G =   . (31) Proof. See Appendix II-B. S  0 0 0 0 0 0 0 0 1 1 0 0    n 0 0 0 0 0 0 0 0 0 0 1 1 Corollary 5. Let E(a, b) ∈ HWN for N = 2 and a, b ∈ n 0 n Z2 . For j, j ∈ {1, 2,..., 7}, let tj ∈ Z2 and assume that 0 The right half of the last 3 rows form the generators tj ∗ tj0 = 0 (i.e., supp(tj) ∩ supp(tj0 ) = ∅) for j 6= j . Define 7 of ZS for this code. Since there is only one non-trivial P n ˜ ˜ ˜ t := j=1 jtj ∈ Z8 , t1 := t1 +t5, t2 := t2 +t6, t3 := t3 +t7 ∈ a in this case, we see that Z = A with a = n ⊗t j S 1 1 Z2 . Then the physical operation T acts on E(a, b) as [1, 1, 1, 1, 1, 1]. Hence, the stabilizer generators are X⊗6 = ⊗t ⊗t† X1X2 ··· X6, −Z1Z2, −Z3Z4, −Z5Z6, since the generators T E(a, b) T ⊗6 T of ZS have weight 2. Multiplying X and the product of (−1)a(t3+t4+t5+t6) ⊗6 = these three Z-stabilizers, we see that Y ∈ S. We can define w (a∗(t˜ +t˜ ))/2 ¯ ¯ 2 H 1 3 the logical X operators for this code to be X1 = X1X2, X2 = X ˜ T (−1)(b+t3)z E (a, b ⊕ z) , (35) X3X4, since these are linearly independent and commute with ˜ ˜ ˜ ˜ all stabilizers. Using the identity we observed in Example 1, (a∗t2)z(a∗(t1+t2+t3)) we see that where T ⊗t denotes that T j is applied to the qubits in the ⊗6 ⊗6 † −ı·2π/4 support of tj. T X1X2(T ) = e (Y1P1)(Y2P2) Proof. See Appendix II-C. ≡ −ı(X1X2)(P1P2), (32) Theorem 6 (Transversal T (t)). Let S = hν E(c , d ); i = since −Z Z ∈ S. We observe that (P P )X⊗6(P P )† = i i i 1 2 1 2 1 2 1, . . . , ri define an [[n, n−r, d]] stabilizer code as in Theorem 2. Y Y X X X X ≡ X⊗6 up to the stabilizer −Z Z , so 1 2 3 4 5 6 1 2 Let t = t + 7t , t ∗ t = 0, with supports of t , t ∈ n P P indeed preserves V (S). But (P P )(X X )(P P )† = 1 7 1 7 1 7 Z2 1 2 1 2 1 2 1 2 indicating the qubits on which T and T † = T 7 are applied, Y Y = (X X )(−Z Z ) ≡ X X , and P P obviously 1 2 1 2 1 2 1 2 1 2 respectively. Define t0 = t + t ∈ n. If the application of commutes with X¯ , so P P is essentially the logical identity 1 7 Z2 2 1 2 the T ⊗t gate realizes a logical operation on V (S), then the gate. A similar reasoning holds for P P . Therefore, up to a 3 4 following are true. global phase, the transversal T preserves the logical operators 0 X¯ and X¯ , so in this case the transversal T gate realizes just 1) For any jE(aj, bj) ∈ S with non-zero aj, wH (aj ∗t ) is 1 2 0 the logical identity (up to a global phase). This can also be even, where wH (aj ∗t ) represents the Hamming weight 0 n of (aj ∗ t ) ∈ Z2 . checked explicitly by writing the logical basis states |x1x2iL for x ∈ : 2) For any jE(aj, bj) ∈ S with non-zero aj, define i Z2 0 Zj,t0 := {z (aj ∗ t ): zE(0, z) ∈ S for some z ∈ 1 x1 x2 {±1}}. Then Z 0 contains its dual computed only on |x1x2i = X¯ X¯ · √ (|010101i + |101010i) . (33) j,t L 1 2 0 2 the support of (aj ∗ t ), i.e., on the ambient dimension 8

T 0 wH (z)+2t7z 0 wH (aj ∗ t ). Equivalently, Zj contains a dimension ı E(0, z) ∈ S. Notice that this means (C2 ∗ t ) ⊂ 0 ⊥ ⊥ 0 ⊗t wH (aj ∗ t )/2 self-dual code Aj,t0 that is supported on C1 ⊂ C2 , since (x ∗ t ) ∈ Ax,t0 . Then T is a valid logical 0 ⊥ (aj ∗ t ), i.e., there exists a subspace Aj,t0 ⊆ Zj,t0 such operator for CSS(X,C2; Z,C1 ). If for all x ∈ C2 and for all T T ⊗t0 that yz = 0 (mod 2) for any y, z ∈ Aj,t0 (including z ∈ Ax,t0 we have t7z ≡ 0 (mod 2), then T (which is 0 y = z) and dim(Aj,t0 ) = wH (aj ∗ t )/2. composed of only T gates) is also a valid logical operator, as 0 ˜ wH (aj ∗t ) the sign constraints for E(0, z) are now independent of t . 3) Let Zj,t0 ⊆ Z2 denote the subspace Zj,t0 where 7 0 all positions outside the support of (aj ∗t ) are punctured Remark 8. Intuitively, a CSS-T code (for transversal T ) is n ˜ ⊥ (dropped). Then, for each z ∈ Z2 such that z˜ ∈ (Zj,t0 ) T T determined by two classical codes C2 ⊂ C1 such that for every for some j ∈ {1,..., 2r}, we have = ızz +2t7z , i.e., z codeword x ∈ C2, there exists a dimension wH (x)/2 self-dual zzT +2t zT ⊥ ı 7 E(0, z) ∈ S. Here, (Z˜ 0 ) denotes the dual ⊥ j,t code in C1 supported on x. This also means that C1 ∗ C2 ⊆ 0 ⊥ of Zj,t taken over this punctured space with ambient C for the following reason. For a ∈ C1, x ∈ C2, it is clear 0 ⊥ 1 dimension w (a ∗ t ).(Z 0 ⊇ (Z˜ 0 ) with zeros ⊥ H j j,t j,t that a is orthogonal to every vector in C1 . In particular, a is added outside the support of (a ∗ t0).) ⊥ j orthogonal to the self-dual code Cx ⊂ C1 supported on x. T T Conversely, if the first two conditions above are satisfied, and But, for any z ∈ Cx, we have az = (a ∗ x)z = 0. This ⊥ if the third condition holds for all z ∈ Aj,t0 instead of just the means a ∗ x ∈ Cx ⊂ C1 since Cx is self-dual. We think that dual of (the punctured) Zj,t0 , then transversal T preserves the this observation might make it more convenient to derive some code space V (S) and hence induces a logical operation. properties of CSS-T codes, since there is a good literature on the star product [35]. Proof. The proof is along the same lines as for Theorem 2, but adapted suitably to the general case in Lemma 4. Corollary 7 also suggests that there might not be a significant advantage in working with general stabilizer codes, Notice that the above two results reduce to Lemma 1 and rather than just CSS codes, as far as T and T † gates are Theorem 2, respectively, when t = [1, 1,..., 1] and t = 1 j concerned. This is because, by Theorem 6, there is always [0, 0,..., 0] for j = 2, 3,..., 7. The main difference is that in a large asymmetry required between the number of stabilizer this general scenario, the conditions in Theorem 2 are applied elements that have at least one X (or Y ) in them, and the to the intersection of the support of a and (t + t ). j 1 7 number of purely Z-type stabilizer elements. Hence, altering Example 2 (contd.). Assume that now we want to apply T and the pure X-type stabilizers into X,Y -type stabilizers might † T according to t1 = [1, 0, 1, 0, 1, 0] and t7 = [0, 1, 0, 1, 0, 1], not provide much gain, say, in terms of the distance of 0 respectively. Since t = t1 +t7 = [1, 1,..., 1], aj ∗(t1 +t7) = the code. The next corollary confirms this intuition for non- aj always and so the first two conditions of Theorem 6 reduce degenerate stabilizer codes. to the transversal T case. However, the last condition needs the Definition 9. An [[n, k, d]] stabilizer code is non-degenerate if sign for the Z-stabilizer generators to be ı2+2 = 1, so we need ⊗6 every stabilizer element has weight at least d. to change the stabilizer to be S = hX ,Z1Z2,Z3Z4,Z5Z6i. Then the superposition for |00iL will indeed be (|000000i + The general notion of degeneracy is that, for a distance d |111111i), and it is easy to verify that T ⊗t fixes the logical non-degenerate code, there cannot be two weight at most t = d−1 basis states |00iL , |01iL , |10iL , |11iL, so that it also realizes b 2 c errors that have the same action on the code space, i.e., the logical identity. their product produces a stabilizer (whose weight is at most 2t < d). This is what makes degeneracy a purely quantum In principle, we can generalize Theorem 6 to the case of phenomenon since on a classical linear code two weight t arbitrary powers of T by using Corollary 5. However, the errors cannot combine to produce a codeword on a d ≥ 2t + 1 derivation is more complicated and the final conditions are code. Therefore, our definition directly states that all stabilizers not fully clear because the summation in Corollary 5 is over have weight at least d, which can be seen to be equivalent to a coset and not a subspace as in Lemma 4. Hence, this the above (cf. [36]). generalization still remains open. Note that the toric and color codes are degenerate (CSS) Using these results, we can refine the CSS construction to codes because the weights of the stabilizer generators are fixed produce codes that support a desired pattern of T and T † gates. even when the lattice size is increased, i.e., code distance is in- Note that the first two conditions in Theorem 6 only depend creased. However, codes such as quantum Reed-Muller codes on (t + t ) and not individually on t and t , i.e., on the 1 7 1 7 are typically non-degenerate since each stabilizer generally has union of their supports. Hence, any pattern of T and T † on weight at least equal to the code distance. the support of (t1 + t7) will preserve the code subspace, up to an initial Pauli application that produces the right signs for the Corollary 10 (Sufficiency of CSS-T Codes). Consider an Z-stabilizers as prescribed by the last condition in Theorem 6. [[n, k, d]] non-degenerate stabilizer code generated by the   n AB Corollary 7 (CSS-T Codes). Let t1, t7 ∈ Z2 be such that 0 matrix GS = C 0  that satisfies the transversal T (t) t1 ∗ t7 = 0, and define t = t1 + 7t7, t = t1 + t7. Consider a ⊥ 0 0 D code CSS(X,C2; Z,C1 ) with stabilizer S, such that wH (x∗t ) property (Theorem 6). Then the CSS code generated by G = ⊥   S is even for all x ∈ C2. For each x ∈ C2, let C1 contain A 0 0 a dimension wH (x ∗ t )/2 self-dual code Ax,t0 supported on C 0 has parameters [[n, ≥ k, ≥ d]] and also satisfies the 0   (x ∗ t ). Moreover, for all x ∈ C2 and for each z ∈ Ax,t0 , let 0 D 9 transversal T (t) property for the same t ∈ {0, 1, 7}n. which is all 0s in the remaining matrix. In other words, let x1, x2, x3, x4 be binary variables that also represent degree- Proof. See Appendix II-D. 1 monomials, with x1 being the least significant bit and x4 Remark 11 (Degenerate Codes). Observe that the arguments being the most significant bit. Then, the rows of G2 from above can be extended to the case when the given stabilizer top to bottom are x1, x2, x3, x4 respectively, with the first code is degenerate, but now the distance of the new CSS code coordinate removed. Similarly, since the dual of RM(1, 4) is ⊥ constructed above is lower bounded only by the minimum the [16, 11, 4] extended Hamming code RM(2, 4), the dual C1 weight of D, which can be strictly less than d. More explicitly, of the punctured RM(1, 4) code C1 is obtained by shortening ⊥ ⊥ the minimum weight of hA, Ci \ D is still d since this space RM(2, 4). Therefore, the rows of G1 must be the degree-1 is strictly outside the stabilizer but in the normalizer of the monomials x1, x2, x3, x4 and the degree-2 monomials xixj given stabilizer code. So the distance of the CSS code mainly for i < j, with the first coordinate removed. ⊥ depends on the minimum weight of D \hA, Ci and the vector Since all vectors in C2 have weight 8, the first condition of z at the end can be assumed to be taken from this subspace. As Theorem 2 is satisfied. Now consider, say, the X-stabilizer a result, such a z with weight less than d cannot also belong arising from the monomial x1 belonging to C2. By direct ⊥ ⊥ to B since otherwise this would contradict the assumption observation of the rows of G1 , we see that the monomials that the given stabilizer code has distance d. Therefore, under x1, x1x2, x1x3, and x1x4 are linearly independent vectors the assumption that for the given stabilizer code any vector contained in the support of x1. If we project these vectors ⊥ ⊥ z ∈ D \ (hA, Ci ∪ B ) has weight at least d, the above onto just the support of x1, i.e., drop the x1 in the description corollary can be extended to the degenerate case. We leave of the monomials, then these 4 vectors form the monomials the more general problem of addressing the full extension of 1, x˜1, x˜2, x˜3 in the space of 3 binary variables. By definition, the above corollary to the degenerate case for future work. these generate the Reed-Muller code RM(1, 3), which is also the [8, 4, 4] extended Hamming code that is self-dual. Since Motivated by the above corollary, all our examples in all other codewords in C are also degree-1 polynomials, the this paper are CSS-T codes (including the [[6, 2, 2]] code in 2 same argument as above can be applied to them. Therefore, Example 2). The [[6, 2, 2]] code is not just a corner case the second condition of Theorem 2 is satisfied as well. Finally, where the transversal T gate realizes the logical identity. The since all generating codewords of C⊥ have Hamming weight following result provides necessary and sufficient conditions 1 4 or 8, the last condition of Theorem 2 produces no negative for this to happen. signs for the Z-stabilizers. Hence, the [[15, 1, 3]] quantum Theorem 12 (Logical Identity). Let S be the stabilizer for Reed-Muller code supports the transversal T gate, and it can ⊥ † an [[n, k, d]] CSS-T code CSS(X,C2; Z,C1 ). Let the logical be checked that this realizes the logical T gate on the single Pauli X group be X¯ = hE(xi, 0); i = 1, . . . , ki. Then the encoded qubit. transversal T gate on the n physical qubits realizes the logical We can also construct CSS codes where the physical identity operation if and only if the following are true. transversal T realizes logical transversal T . In fact, triorthog- 1) For each E(x, 0) ∈ X¯, ıwH (x)E(0, x) must be a stabi- onal codes introduced by Bravyi and Haah [4] serve exactly lizer. this purpose, although they allow for an additional Clifford 2) For each E(x, 0) ∈ X¯ and aE(a, 0) ∈ S, correction beyond the strict (physical) transversal T operation. ıwH (x∗a)E(0, x ∗ a) must be a stabilizer. As our next result, using our methods we show a “converse” 3) For any two logical Paulis E(x, 0),E(y, 0) ∈ X¯, that triorthogonality is not only sufficient but also necessary if ıwH (x∗y)E(0, x ∗ y) must be a stabilizer. we desire to realize logical transversal T via (strict) physical 4) For any two X-type stabilizers E(a, 0),E(b, 0) ∈ S, transversal T (using a CSS-T code). We first repeat the ıwH (a∗b)E(0, a ∗ b) must be a stabilizer. definition of a triorthogonal matrix for clarity. Proof. See Appendix II-E. We will see shortly that the last Definition 13 (Triorthogonality [4]). A p × q binary matrix G three conditions essentially constitute the property of tri- is said to be triorthogonal if and only if the support of any pair orthogonality for the generator matrix G1 for the classical bi- and triple of its rows has even overlap, i.e., wH (Ga ∗Gb) ≡ 0 nary code C1. See the proof for a more detailed argument. (mod 2) for any two rows Ga and Gb for 1 ≤ a < b ≤ p, and wH (Ga ∗ Gb ∗ Gc) ≡ 0 (mod 2) for all triples of rows C. Realizing Logical T Gates with Transversal T Ga,Gb,Gc for 1 ≤ a < b < c ≤ p. Let us begin by constructing the well-known [[15, 1, 3]] Before we state our next result, let us clarify some unfortu- (punctured) quantum Reed-Muller code [37], [38] that sup- nate ambiguity in the terminology of triorthogonal codes. First, ports a transversal T , using the conditions in Theorem 6. The a binary triorthogonal code is one that has a generator matrix construction is shown in Fig. 1. that satisfies the above triorthogonality property. Second, a The generator matrix G2 for the simplex code C2 that quantum CSS triorthogonal code as defined by Bravyi and produces all X-type stabilizers (which are all weight 8) is Haah starts with a binary triorthogonal code and adds con- ⊥ formed by the first 4 rows of G1 , as shown in Fig. 1. straints to some of its rows that are used to describe generators Notice that this is obtained by shortening RM(1, 4): take the for the logical X operators of the resulting CSS code. This generator matrix for the Reed-Muller code RM(1, 4), remove family of codes satisfy the property that when transversal T is the first row of all 1s, and then remove the first column applied to the physical qubits of the triorthogonal code, along 10

15 15 F2 F2 101010101010101 x1 10 4 011001100110011 x2   000111100001111 x3 ⊥ 000000011111111 x C1 = Punctured C2 = Hamming   4 001000100010001 x x RM(1, 4) code G⊥ :=   1 2 1 000010100000101 x x 1 1   1 3 000000001010101 x x   1 4 ⊥ 000001100000011 x2x3 C2 = Simplex C1 = Even weight   000000000110011 x2x4 code subcode 000000000001111 x x 4 10 3 4

⊥ ⊥ {0} {0} The ﬁrst 4 rows of G1 form G2, so C2 ⊂ C1

⊥ Fig. 1: The CSS(X,C2; Z,C1 ) construction for the [[15, 1, 3]] quantum Reed-Muller code.

T with possibly an additional Clifford operation (correction), xy ≡ 0 (mod 2) for x 6= y and wH (a∗(x∗y)) ≡ 0, wH (z∗(x∗ then it induces a transversal T on the logical qubits. But, in y)) ≡ 0 (mod 2) for any a ∈ C2, z ∈ C1/C2, z∈ / {x, y}, all the literature, sometimes the terminology is more casual where because E(0, x∗y) needs to commute with X-type stabilizers triorthogonal codes are described as codes that realize logical and logical X operators. Similarly, the other conditions of transversal T via physical transversal T . Finally, we note that triorthogonality can be derived from Theorem 12. the notion of triorthogonality is closely related to triply-even Therefore, the essential difference between transversal T re- codes, but they are distinct objects [39]. alizing logical transversal T or logical identity is the following: for the former we need the Hamming weight condition above Theorem 14 (Logical Transversal T ). Let S be the stabilizer which in part implies w (x ) ≡ 1 (mod 8), while for the for an [[n, k, d]] CSS-T code CSS(X,C ; Z,C⊥). Let G = H i 2 1 1 ıwH (x)E(0, x) ∈ S w (x) ≡ 0 G latter we need which implies H C1/C2 be a generator matrix for the classical code C ⊃ 2 G 1 (mod ), and these are mutually contradictory. Note that even 2 if we permit a Clifford correction and omit the Hamming C2 such that the rows xi, i = 1, . . . , k, of GC /C form a 1 2 weight condition above, the proof of Theorem 14 implies that generating set for the coset space C1/C2 that produces the the constraint wH (xi) ≡ 1 (mod 8) is still necessary, so the logical X group of the CSS-T code, i.e., X¯ = hE(xi, 0); i = 1, . . . , ki. Then the physical transversal T gate realizes the contradiction remains. Even in the Bravyi-Haah recipe, they logical transversal T gate, without any Clifford correction as impose that wH (xi) ≡ 1 (mod 2). We will construct a Reed- Muller family of CSS-T codes shortly, where we explicitly in [4], if and only if the matrix G1 is triorthogonal and the following condition holds true: state a condition that differentiates between when the physical transversal T realizes the logical identity and when it realizes k M some non-trivial logical operator. x = cixi, ci ∈ {0, 1} Finally, with respect to the earlier discussion of degenerate i=1 codes, there appears to be limited examples of degenerate ⇒ wH (x ⊕ a) ≡ wH (c) (mod 8) for all a ∈ C2. (36) triorthogonal codes. Prominently, the 3D color codes are degenerate and realize logical T via physical combinations Proof. See Appendix II-F. of T and T −1 on different qubits. Thus, these codes are Corollary 15. The triorthogonal construction introduced by triorthogonal (which permits diagonal Clifford corrections that Bravyi and Haah in [4] is the most general CSS-T family that can change T to T −1) [40]. realizes logical transversal T from physical transversal T .

Proof. See Appendix II-G. D. Realizing Logical CCZ via Transversal T The controlled-controlled-Z (CCZ) gate is defined as the While the Hamming weight condition above can be hard to unitary CCZ := diag(1, 1, 1, 1, 1, 1, 1, −1), which is a 3-qubit check in practice, using the Bravyi-Haah recipe still implies gate that applies the Pauli Z operator on the third qubit if and that one has to calculate a final Clifford correction. We sus- only if the first two qubits are in state |1i. Similar to the CZ pect that CSS-T codes constructed using classical monomial gate, this unitary is symmetric with respect to all the three codes, such as Reed-Muller codes or more general decreasing qubits involved in the operation. monomial codes [25], [28], might possess simple ways to check the Hamming weight condition above, since the weight Example 3 ([[8, 3, 2]] Color Code). First we revisit the con- distribution of some of these codes are known. struction of the [[8, 3, 2]] color code of Campbell [24], since it Observe that triorthogonality is a common condition for is now well-known, and show how it satisfies Theorem 2. The realizing either logical transversal T or logical identity from code can be defined by considering the 8 physical qubits to physical transversal T , since the last three conditions in be the vertices of a cube. There is a single X-type stabilizer Theorem 12 constitute the property of triorthogonality. Indeed, generator that is defined by X on all the vertices. There are xyT if ı E(0, x ∗ y) ∈ S for any x, y ∈ C1/C2, then we need 4 independent Z-type generators that are defined by Z on the 11

all but one non-zero vector in C2 are weight-8 and there is ⊥ one vector which is all 1s. The code C1 is generated by 10 weight-4 vectors, 9 of which are represented as plaquette operators and the last one corresponds to the vertical string Z5Z6Z7Z8 (or Z13Z14Z15Z16 in Fig. 2(b)). Consider the first X-type generator X1X2 ··· X8. In its support, there are 3 plaquette weight-4 strings and 1 vertical weight-4 string, all of which are linearly independent and have mutually even overlap, Hence, these clearly form a self-dual code which is in fact the [8, 4, 4] extended Hamming code. This can be checked for all the other vectors in C2, so the first two conditions of Theorem 2 are satisfied. The last condition imposes no negative signs to the Z-type stabilizers since all of them have weight at least 4. Therefore, transversal T preserves the code subspace. To see that the realized logical operator is CCZ, consider ⊗16 ¯ Fig. 2: A [[16, 3, 2]] CSS-T code where transversal T realizes the action of T on X1 = X1X2X3X4. Recollect that CCZ logical CCZ (up to logical Paulis). (a) The 3 weight-8 X-type on qubits a, b, c maps Xa 7→ Xa CZbc,Xb 7→ Xb CZac,Xc 7→ stabilizer generators. (b) The 10 weight-4 Z-type stabilizer Xc CZab, and CZ on qubits e, f maps Xe 7→ XeZf ,Xf 7→ generators. (c) The 3 X-type logical Pauli generators. (d) The Xf Ze. corresponding 3 Z-type logical Pauli generators. The red and ⊗16 ⊗16† − ıπ ·4 T X¯ T = e 4 (Y Y Y Y )(P P P P ) blue colors are included to differentiate between X-type and 1 1 2 3 4 1 2 3 4 ¯ † † † † Z-type operators, respectively, for visual clarity. = −X1(P1 P2 P3 P4 ). (37) † † † † ¯ We need to show that U := P1 P2 P3 P4 ≡ CZ23. Recollecting P †XP = −Y vertices of (4 independent) faces of the cube. So the X-type that , we notice that [8, 1, 8] † stabilizers come from the classical repetition code, UX¯2U = Y1Y2X5X6X9X10X13X14 ≡ −X¯2Z¯3, (38) which can be written as the Reed-Muller code RM(0, 3). It is UX¯ U † = Y Y X X X X X X ≡ −X¯ Z¯ , (39) easy to verify that the Z-type stabilizers come from the [8, 4, 4] 3 2 3 6 7 10 11 14 15 3 2 extended Hamming code, which is also the self-dual Reed- since Z1Z2Z13Z14,Z2Z3Z14Z15 ∈ S. Thus, up to signs, we Muller code RM(1, 3). So, there are 14 elements of weight 4, † † † † ¯ ⊗16 have verified that P1 P2 P3 P4 ≡ CZ23. Hence, T acts like one element of weight 8, and the identity. By appropriately logical CCZ on X¯1, and similar calculations can be done to defining the logical X strings from faces of the cube, it can verify the other relations for CCZ. In this case, the signs can be shown that transversal T realizes logical CCZ on this be fixed by checking the relations for (X¯2X¯3) CC¯Z (X¯2X¯3), code. This code is also a special case of Theorem 19 for since this is the precise logical operator realized by T ⊗16. m = 3, r = 1, which generalizes to any m (and r = 1) by the conditions of the theorem. Thus, this is a family of [[2m, m, 2]] Example 5 ([[16, 3, 2]] Decreasing Monomial Code). An equiv- ⊥ alent [[16, 3, 2]] code can be constructed as a decreasing CSS(X,C2; Z,C ) codes defined on m-dimensional cubes as 1 monomial code as follows, using the monomial description C2 = RM(0, m),C1 = RM(1, m), similar to the 3D code above. of Reed-Muller codes we discuss in Appendix I. Define the code C2 as the space generated by the monomials G2 = Example 4 ([[16, 3, 2]] Bacon-Shor-like Code). Now we con- {1, x1, x2}, and the code C1 as the space generated by struct a [[16, 3, 2]] Bacon-Shor-like code using the conditions of G1 = G2 ∪ {x3, x4, x1x2}. Hence, the logical X group Theorem 2 and show that the transversal T realizes the logical is generated by GX = {x3, x4, x1x2}. While Reed-Muller CCZ gate (up to Paulis). In particular, this code belongs codes always include all monomials up to some degree as to the compass code family studied in [23]. Although the the generators, decreasing monomial codes with maximum [[8, 3, 2]] code is smaller while having essentially the same degree r might include only some of the degree r monomials properties, we will demonstrate shortly that the [[16, 3, 2]] code among the generators. However, the code must include all can be constructed from decreasing monomial codes [25], [26]. monomials of degree up to r −1, and the degree r terms must While this framework has been recently used by Krishna and be chosen according to a partial order as described in [26]. Tillich [28] to construct triorthogonal codes from punctured In the construction above, both C2 and C1 are decreasing polar codes, this example has non-identical logical X and Z monomial codes. Using the formalism in [26], it is easy to see ⊥ ⊥ generators unlike the standard presentation of triorthogonal that the dual codes C1 and C2 are generated respectively by ⊥ codes [4]. G1 = {1, x1, x2, x3, x4, x1x2, x1x3, x1x4, x2x3, x2x4} and ⊥ ⊥ The construction of the [[16, 3, 2]] CSS-T code G2 = G1 ∪ {x3x4, x1x2x3, x1x2x4}. So the logical Z group ⊥ CSS(X,C2; Z,C1 ) is shown in Fig. 2. The code C2 is is generated by GZ = {x1x2x4, x1x2x3, x3x4}, where we generated by three weight-8 vectors that are represented as have rewritten the generators in an order such that they form the vertical rectangles in Fig. 2(a). It is easy to check that corresponding pairs with logical X generators in GX . In other 12

words, we see that the corresponding entries in GX and GZ holds for all the other codewords in C2 which are degree- ⊥ multiply to the full monomial x1x2x3x4, which is the only 1 polynomials as well. Finally, the generators of C1 have monomial whose evaluation has odd weight, and hence the Hamming weights 8, 16, 32 and 64, so there are no negative pairs anti-commute as required. Similarly, multiplying terms signs introduced by the last condition of Theorem 2. Thus, from GX and GZ that are not pairs does not yield the full this is indeed a CSS-T code. monomial, thereby ensuring they have even overlap. Finally, In order to determine the logical operation realized by we see that the product of the three logical X generators transversal T , we initially wrote a computer program to produces the full monomial, which means their triple product generate the vectors in the superposition of all the 215 logical has odd weight. This is precisely one of the requirements in the computational basis states. (Later, in Theorem 19, we derive generalized triorthogonality conditions established by Haah the exact logical operation analytically.) Then we calculated and Hastings [14], which is a special case of quasitransver- the action of T ⊗64 on them by just computing the Hamming sality established earlier by Campbell and Howard [15], in weights of the vectors in the superposition. The effective logi- order to ensure that transversal T performs a logical CCZ on cal operation was a diagonal unitary with entries ±1, and there a CSS code. The other requirements in their conditions can were 13, 888 entries that were (−1) in the diagonal (out of the also be quickly verified simply using the fact that the only 215 = 32, 768). We determined the Boolean function encoding monomial of odd weight is the full monomial. the locations of 1 and −1 in the diagonal, and simplified the To see that this also satisfies Theorem 2, consider for exam- naive 13, 888 term sum-of-products (SOP) expression into a ple the X-stabilizer corresponding to the monomial x1 ∈ G2. 1991 term SOP expression using the software “Logic Friday”. ⊥ We observe that the elements x1, x1x2, x1x3, x1x4 ∈ G1 Subsequently, we used “Mathematica” to convert this Boolean are supported on x1. When we project down to x1, i.e., function into its algebraic normal form (ANF) and obtained consider only the support of x1, we get the monomials the following polynomial. 1, x˜1 = x2, x˜2 = x3, x˜3 = x4 that precisely generate the self-dual code RM(1, 3). A similar analysis can be made for q(v1, . . . , v15) = v1v10v15 + v1v11v14 + v1v12v13 ⊥ other elements in C2. Moreover, since the elements in G1 have + v2v7v15 + v2v8v14 + v2v9v13 4, 8 16 weights , or , the last condition of Theorem 2 does not + v3v6v15 + v3v8v12 + v3v9v11 introduce any negative signs for the Z-stabilizers. Therefore, + v v v + v v v + v v v we have used the decreasing monomial codes formalism to 4 6 14 4 7 12 4 9 10 produce an equivalent [[16, 3, 2]] code, where only the logical + v5v6v13 + v5v7v11 + v5v8v10. (40) X and Z generators have changed in comparison to the Therefore, the logical diagonal gate can be represented as construction above. We believe this is not just one special U L |v ··· v i = (−1)q(v1,...,v15) |v ··· v i . This implies case but points to a general construction of CSS codes using 1 15 L 1 15 L that the gate decomposes into exactly 15 CCZ gates on this formalism that support transversal Z-rotations. the logical qubits, and hence belongs to the 3rd level of the Clifford hierarchy. More interestingly, note that the CSS (3) E. Realizing Products of C gates with Transversal T superposition of a given logical computational basis state

We demonstrate two examples where transversal T realizes |v1 ··· v15iL consists of all vectors in the corresponding coset Q15 ¯ vi a logical diagonal gate at the 3rd level that is not a single of C2 in C1 generated by i=1 Xi , i.e., the binary vector elementary gate but a product of elementary gates. These codes representations xi of the logical operators X¯i = E(xi, 0). have been partially discussed in recent works [14], [15] but we Therefore, the diagonal of U L encodes exactly which cosets describe the general family and later derive the exact logical of RM(1, 6) in RM(2, 6) have all vectors of weight exactly operation realized by transversal T on these codes. 4 mod 8 (diagonal entry −1) and which cosets have all vectors of weight 0 mod 8 (diagonal entry 1). Since the above Example 6 ([[64, 15, 4]] Reed-Muller Code). Consider a phase polynomial is a codeword in RM(3, 15) of degree 3, CSS(X,C ; Z,C⊥) code where C = RM(1, 6) ⊂ C = 2 1 2 1 this suggests a deeper connection to RM codes where the RM(2, 6) and therefore C⊥ = RM(3, 6) ⊂ C⊥ = RM(4, 6). 1 2 coset weight distribution modulo 8 is encoded exactly by a The distance of this code is the minimum of the minimum codeword in RM(3, 15). Theorem 19 explores this connection distances of C and C⊥. It is well-known that the minimum 1 2 more rigorously than the empirical approach described above. distance of RM(r, m) is 2m−r, so the distance of this CSS- T code is 4. Therefore, this gives a [[64, 15, 4]] code. Let us Example 7 ([[128, 21, 4]] Reed-Muller Code). Similarly, we quickly check the conditions in Theorem 2. The code C2 constructed a [[128, 21, 4]] Reed-Muller CSS-T code by setting ⊥ is generated by degree-1 monomials in 6 binary variables C2 = RM(1, 7) ⊂ C1 = RM(2, 7), and hence C1 = ⊥ x1, x2, . . . , x6, and all of its codewords have even weight. RM(4, 7) ⊂ C2 = RM(5, 7). The X-stabilizers are generated Using the same strategy that we used for the [[15, 1, 3]] RM by degree-1 monomials, and logical X operators are given by ⊥ code, consider the monomial x1 in C2. Since C1 is generated the coset representatives for C1/C2, which are degree-2 poly- by all monomials of degree less than or equal to 3, it contains nomials. This implies, for any a, b ∈ C2 and x, y ∈ C1/C2, the monomials x1(1), x1(xi), x1(xixj) for i, j ∈ {2, 3, 4, 5, 6} a ∗ b, a ∗ x, x ∗ y are either degree 2, 3, or 4 polynomials, all ⊥ and i < j. If we project down to x1, then the monomials of which belong to C1 by the definition of RM codes. So 1, x˜ix˜ix˜j for i, j ∈ {1, 2, 3, 4, 5} and i < j exactly generate C1 is a triorthogonal code that also satisfies the first condition the code RM(2, 5) which is self-dual. A similar analysis of Theorem 12, and hence transversal T realizes the logical 13 identity. However, the code supports the application of the F. Stabilizer Codes that Support Transversal Z-Rotations T gate on the physical qubits corresponding to the pattern We first generalize Lemma 1 to transversal π/2` Z- prescribed by any degree-1 polynomial, as can be verified rotations, that again could be of independent interest. from the conditions in Theorem 6 by setting t7 = 0 and t1 n n a degree-1 polynomial. Although this example does not fit in Lemma 16. Let E(a, b) ∈ HWN for N = 2 , a, b ∈ Z2 . (`) ıπ Theorem 19 that concerns strictly with transversal Z-rotations, Then transversal τ[1] = exp 2` Z , ` ≥ 2, acts on E(a, b) as we verified computationally that the logical gate is non-trivial † τ (`)E(a, b) τ (`) in this case. In In wH (y) Example 8 (Reed-Muller Family). We can generalize this 1 X 2π byT m m r = tan ` (−1) E(a, b ⊕ y), construction as a Reed-Muller family of [[n = 2 , , 2 ]] 2π wH (a) 2 r sec 2` ya CSS-T codes defined by C2 = RM(r−1, m),C1 = RM(r, m), (41) ⊥ ⊥ and hence C1 = RM(m − r − 1, m) ⊂ C2 = RM(m − r, m). T Using the conditions in Theorem 2 and Theorem 12, we see where wH (a) = aa is the Hamming weight of a, and y a m denotes that y is contained in the support of a. that we need r ≤ 3 for transversal T to be supported, and also r > m−1 for transversal T to not realize the 3 Proof. See Appendix II-H. logical identity. This appears to imply that there is exactly one integer value of r that provides a valid code, but this For ` = 3 (transversal T ), the cosine term produced a −w (a )/2 need not be true since decreasing monomial codes correspond 2 H j factor which we were able to ensure was an precisely to non-integer values of r. For example, one can integer by enforcing aj to have even Hamming weight. Then w (a )/2 take m = 9, r = 3 to obtain a valid [[512, 84, 8]] code. Indeed, we produced 2 H j copies of each stabilizer element in ⊥ order to cancel this factor and thereby reproduced the code notice that C1 = RM(5, 9) contains the code RM(4, 8) in the support of any degree-1 polynomial, and RM(4, 8) contains the projector. However, for ` > 3, extending this idea requires that 2π wH (a) self-dual code “RM(3.5, 8)”, which is generated by all degree sec 2` cancel the sum of (signed) tangents acquired at most 3 monomials as well as the first half of all degree for each copy of the stabilizer element. This leads us to an 4 monomials when they are arranged in lexicographic order. extension of Theorem 2. Once again, Theorem 19 provides the logical gate realized by Theorem 17 (Transversal Z-rotations). Let S = transversal T on this [[512, 84, 8]] code. hνiE(ci, di); i = 1, . . . , ri define an [[n, n − r]] stabilizer : n The family of quantum Reed-Muller codes in Example 8 code as in Theorem 2. Let ZS = {z ∈ Z2 : zE(0, z) ∈ S} appears in recent work by Haah and Hastings [14], and for some {z}. For any jE(aj, bj) ∈ S with non-zero aj, : Campbell and Howard [15], [16], where in [14] they focused define the subspace Zj = {v ∈ ZS : v aj} and the set : n on distilling CCZ magic states from these codes via physical Wj = {y ∈ Z2 : y aj, y∈ / Zj}. Then the transversal application of the exp ıπ Z gate realizes a logical operation transversal T . For this reason, they needed the logical CCZs to 2` be on distinct triples of (logical) qubits and they provided an on V (S) if and only if the following are true for all such analytic construction that guarantees a [[2m, 3(2m/3−2), 2m/3]] aj 6= 0: w (v) w (a ) quantum Reed-Muller code satisfying this constraint. How- X 2π H 2π H j ı tan = sec , (42) ever, using a computational search strategy, they show that in v 2` 2` certain cases the number of logical qubits can be increased to v∈Zj w (v⊕y) produce more disjoint CCZs. For example, for m = 9, they X 2π H ı tan = 0 for all y ∈ W , (43) expand the [[512, 18, 8]] code from the analytic construction v 2` j that yields 6 logical CCZs into a [[512, 30, 8]] code that yields v∈Zj

10 logical CCZs on disjoint triples of logical qubits. where v ∈ {±1} is the sign of E(0, v) in the stabilizer S. In Theorem 19, we generalize the quantum Reed-Muller Proof. See Appendix II-I. family from Example 8 for general π/2` Z-rotations, and also prove the exact logical operator realized on these codes. The extension here is only partial in the sense that the condi- We believe that, when applied to the transversal T scenario, tions on the stabilizer involve trigonometric quantities and we this result allows one to analytically derive the above codes still have to distill ﬁnite geometric constraints on the vectors in [14] by using combinatorial arguments to carefully “peel describing the stabilizer elements, similar to Theorems 2 and 6. off” the additional CCZs that either overlap on qubits involved However, under the assumption that Zj is a self-dual code and in existing ones or violate the generalized triorthogonality v = 1 for all v ∈ Zj, we are able to deduce the following constraints [14]. This peeling procedure effectively drops all condition on the Hamming weights of v and aj. logical qubits that are not involved in the maximum number Lemma 18. Let C be an [m, m/2] self-dual code and ` ≥ 2. max of disjoint CCZs, k , obtainable on these codes. Moreover, wH (v) m CCZ Then P ı tan 2π = sec 2π if and only if each this approach might lead to an exact characterization of kmax v∈C 2` 2` CCZ v ∈ C satisﬁes w (v) = m/2 or (m − 2w (v)) is divisible for any m, without involving the Lovasz Local Lemma that H H by 2`. appears to provide guarantees only for large m. We leave this investigation for future work. Proof. See Appendix II-J. 14

Finally, although we do not have the full extension of = |v1iL ⊗ |v2iL ⊗ · · · ⊗ |v15iL . (47) Theorem 2 yet, we consider a family of quantum Reed- For this code, the rows of G are evaluations of Muller codes QRM(r, m) that supports π/2` Z-rotations from C1/C2 the m = 15 degree r = 2 monomials, namely the Clifford hierarchy, and we also explicitly construct the m/3 logical operations induced by transversal Z-rotations on these x1x2, x1x3, x1x4, . . . , x5x6. So, the polynomial f ∈ codes. The code QRM(r, m) is a CSS code deﬁned by RM(r, m) above is a linear combination of degree r = 2 monomials, and possibly lower degree monomials that corre- C2 = RM(r − 1, m) and C1 = RM(r, m). Hence, we can 15 identify the following relationships: spond to just X-type stabilizers. Hence, vf ∈ Z2 exactly de- scribes which corresponding rows of GC1/C2 are chosen in this X-type stabilizers ↔ c ∈ RM(r − 1, m), linear combination. Therefore, if f = x1x2 + x3x4 + x5x6 +

Z-type stabilizers ↔ c ∈ RM(m − r − 1, m), (smaller degree terms), then |vx1x2 vx3x4 vx5x6 iL = |111iL X-type logical operators ↔ c ∈ RM(r, m), and other qubits are set to |0iL, so q(f) = 1. But if f = x1x2 + x3x4 + x5x6 + x3x5 + x4x6 + (smaller degree terms), Z-type logical operators ↔ c ∈ RM(m − r, m). (44) then q(f) = 0 as this polynomial corresponds to two CCZs The parameters for QRM(r, m) are given by applying the phase −1. m m min{r,m−r} k [[2 , r , 2 ]]. Recollect that for vf ∈ Z2 , the For stating the general result, it will be convenient to replace CSS basis states are the monomial subscripts with binary vectors p1, p2, p3 = p v v v 1 X m/r. So, for example, for the first term x1x2 x3x4 x5x6 |vf iL ≡ vf · GC1/C2 ⊕ c |C2| these index vectors are given by p1 = [1, 1, 0, 0, 0, 0], p2 = c∈C2 [0, 0, 1, 1, 0, 0], p = [0, 0, 0, 0, 1, 1], which each have Ham- 1 3 X ming weight r = 2 and sum up to 1. = vf · GC1/C2 ⊕ y · G2 , (45) |C2| y∈ k2 x Z2 Define µ(x) := (−1) , and let νp(s) denote the largest integer t such that pt divides s. where GC1/C2 denotes the generator matrix for the linear subspace of coset representatives for C2 in C1, and G2 denotes Theorem 19. Suppose that 1 ≤ r ≤ m/2 and r divides the generator matrix for the code C2. For QRM(r, m), the rows m. Then, the transversal exp ıπ Z gate is a logical of G correspond to degree r monomials, each identifying 2m/r C1/C2 operator for QRM(r, m). Moreover, up to local corrections, a logical qubit. Hence, any polynomial f comprised of these the corresponding logical operator acts on a computational monomials corresponds to a distinct logical computational basis state by basis state |vf iL. So a non-trivial logical X operator is described by a degree r polynomial f, but only the degree |vf iL 7→ µ(q(f)) |vf iL , (48) r terms will determine which logical qubits are acted upon. P d where f = d∈ m adx ∈ RM(r, m), ad ∈ {0, 1} with ad = Also, this implies that if a particular degree r term is present Z2 0 for all wH (d) > r, and in f, then the corresponding logical qubit is set to |1iL in |v i . m/r f L X Y q(f) ≡ q(v ) = v (mod 2), (49) Example 6 (contd.). Before we state the general result, let f pj j=1 us setup the notation through the [[64, 15, 4]] example from (p1,...,pm/r )∈P Section III-E. Recollect that in this case we have m = 6 and m Pm/r where P := {(p1, . . . , pm/r): pj ∈ Z2 , pj = r = 2, so the logical qubits can be identified with the degree j=1 1, wH (pj) = r}. In particular, deg(q) = m/r and so by [11], 2 monomials that define generators for logical X operators. it is a gate from the (m/r)-th level of the Clifford hierarchy. Hence, the polynomial in (40) defining the logical gate realized by physical transversal T can be represented in monomial Proof. See Appendix II-K. subscripts as q(f) ≡ q(v ) f In general, the above theorem states that the logical gate

= vx1x2 vx3x4 vx5x6 + vx1x2 vx3x5 vx4x6 + vx1x2 vx3x6 vx4x5 polynomial q(f) consists of all terms such that the monomials in the subscripts of each term form a unique partition of + vx x vx x vx x + vx x vx x vx x + vx x vx x vx x 1 3 2 4 5 6 1 3 2 5 4 6 1 3 2 6 4 5 m variables into m/r groups of r variables each. Therefore, + v v v + v v v + v v v x1x4 x2x3 x5x6 x1x4 x2x5 x3x6 x1x4 x2x6 x3x5 the number of terms in the polynomial q(f), and hence the + vx1x5 vx2x3 vx4x6 + vx1x5 vx2x4 vx3x6 + vx1x5 vx2x6 vx3x4 number of gates in the induced logical operator is given by + v v v + v v v + v v v , m! . x1x6 x2x3 x4x5 x1x6 x2x4 x3x5 x1x6 x2x5 x3x4 (r!)m/r ( m )! (46) r Remark 20 (Quasitransversality). In [15], [16], Campbell and where each term in the polynomial corresponds to a logical (3) Howard considered diagonal gates UF ∈ C that can be CCZ gate acting on the three logical qubits indexed by the P F (x) ıπ/4 expressed as UF = x∈ k ω |xi hx|, where ω = e three monomial subscripts, and the sum corresponds to a Z2 and F (x) = L(x) + 2Q(x) + 4C(x) (mod 8) is a weighted product of such gates (in the logical unitary space). In the polynomial with L(x) linear (mod 8), Q(x) quadratic (mod notation of (40), v v v ≡ v v v and so on, x1x2 x3x4 x5x6 1 10 15 4) and C(x) cubic (mod 2) polynomials. So L corresponds which means v = [v , v , . . . , v ], i.e., f x1x2 x1x3 x5x6 to single-qubit Z-rotations, Q corresponds to controlled Z-

|vf iL = |vx1x2 iL ⊗ |vx1x3 iL ⊗ · · · ⊗ |vx5x6 iL rotations, and C corresponds to CCZ gates. They define a 15 quantum code to be “F -quasitransversal” if there exists a Clif- of polynomials. Although these logical operations involve ford g such that gT ⊗n acting on the physical qubits realize the products of overlapping many-controlled-Z gates, it will be logical gate UF . In [15, Lemma 1], they provided the following interesting to investigate their utility in magic state distillation sufficient condition for a CSS code to be F -quasitransversal, and other proposals for universal quantum computation. In which we rewrite in our notation (e.g., see (45)): certain systems, finer angle rotations often have better fidelity than coarser angle rotations. Hence, these native resources in w (x · G ⊕ y · G ) ∼ F (x) (mod 8), y ∈ k2 , H C1/C2 2 c Z2 the lab could be leveraged in combination with these codes to (50) potentially obtain better circuit decompositions. where the subscript “c” implies that the two sides are Clifford equivalent, i.e., there exists a weighted polynomial F˜ such that ACKNOWLEDGMENT by replacing F (x) with F (x) + 2F˜(x) above, we can replace ∼c with equality. This article is dedicated to the memory of David Poulin. We would like to thank Kenneth Brown and Anirudh Now, observe that u = x · GC1/C2 ⊕ y · G2 ∈ C1 exactly corresponds to u = ev(f) for some f ∈ RM(r, m) above Krishna for helpful discussions. Kenneth Brown suggested us (with x = vf ). Hence, we note that (213) exactly matches to think about a Bacon-Shor-like code on a 4 × 4 lattice, the (quasi)transversality condition above (with equality and and this led to the first construction of the [[16, 3, 2]] code. thereby no Clifford correction). Therefore, QRM(m/3, m) is Subsequently, during his visit to Duke University, Anirudh 4q(f)-(quasi)transversal. Krishna demonstrated that an equivalent [[16, 3, 2]] code can be constructed using decreasing monomial codes. We would also Remark 21 (Quantum Pin Codes). Vuillot and Breuck- like to thank Andrew Landahl and Earl Campbell for carefully mann [17] recently introduced “Quantum Pin Codes” as an reading earlier drafts of this manuscript and providing helpful abstract framework to synthesize stabilizer codes that sup- feedback. We are also thankful to Jeongwan Haah for encour- port transversal, or partially transversal, physical Z-rotations. aging us to clarify the existence of the self-dual code in the These codes are inspired by topological constructions such as proof of Theorem 2; this motivated us to greatly improve this color codes [5], but the abstraction extends beyond algebraic proof. N.R. would like to thank Christophe Vuillot, Jean-Pierre topology while retaining transversality properties. The authors Tillich, Earl Campbell, Mark Howard, Michael Beverland, produce several new codes using this formalism. We note Andrew Landahl and Andrew Cross for fruitful discussions that the above result regarding QRM(r, m) codes applies to and feedback. The connections to pin codes, generalized tri- a general family of quantum pin codes as discussed in [17, orthogonality and quasitransversality were made during these Section V-D]. conversations. N.R. would like to thank Christophe Vuillot, Jean-Pierre Tillich, and Earl Campbell also for their hospitality IV. CONCLUSION during his visits to their research groups. In this paper, we used the recent characterization of This work was supported in part by the National Sci- quadratic form diagonal (QFD) gates [10] to derive necessary ence Foundation (NSF) under Grant Nos. 1718494 and and sufficient conditions for a stabilizer code to support a 1908730. The research of M.N. was supported under the physical transversal T gate. Our Heisenberg approach allowed grant ODNI/IARPA LogiQ program (W911NF-16-1-0082). us to generalize all such existing constructions. Using this, Any opinions, findings, conclusions, and recommendations we showed that for any non-degenerate stabilizer code with expressed in this material are those of the authors and do not this property, there exists an equivalent CSS code that also necessarily reflect the views of these sponsors. possesses this property. So for magic state distillation via transversal T on non-degenerate stabilizer codes, CSS codes REFERENCES are essentially optimal. We also showed that triorthogonal codes form the most general family of CSS codes that real- [1] N. Rengaswamy, R. Calderbank, M. Newman, and H. D. Pfister, ize logical transversal T via physical transversal T . Among “Classical coding problem from transversal T gates,” Accepted to IEEE Int. Symp. Inform. Theory, 2020. [Online]. Available: several examples, we constructed a [[16, 3, 2]] code using http://arxiv.org/abs/2001.04887 the decreasing monomial formalism, and demonstrated how [2] B. Eastin and E. Knill, “Restrictions on Transversal Encoded Quantum to check that transversal T realizes logical CCZ with the Gate Sets,” Phys. Rev. Lett., vol. 102, no. 11, p. 110502, 2009. [Online]. Available: http://arxiv.org/abs/0811.4262 help of generalized triorthogonality conditions. This points [3] B. Zeng, A. Cross, and I. L. Chuang, “Transversality Versus to a possibly general construction of CSS codes supporting Universality for Additive Quantum Codes,” IEEE Trans. Inform. transversal T using this formalism. Theory, vol. 57, no. 9, pp. 6272–6284, 2011. [Online]. Available: https://arxiv.org/abs/0706.1382 We then extended the above results beyond T gates, and [4] S. Bravyi and J. Haah, “Magic-state distillation with low overhead,” derived trigonometric stabilizer conditions for the code to Phys. Rev. A, vol. 86, no. 5, p. 052329, 2012. [Online]. Available: support a transversal π/2` Z-rotation. However, we were http://arxiv.org/abs/1209.2426 [5] A. Kubica and M. E. Beverland, “Universal transversal gates with color only able to reduce this to finite geometric conditions under codes: A simplified approach,” Phys. Rev. A, vol. 91, no. 3, p. 032330, some assumptions. Finally, we considered a family of quantum 2015. [Online]. Available: https://arxiv.org/abs/1410.0069 Reed-Muller codes and determined the exact logical operation [6] D. Gottesman and I. L. Chuang, “Demonstrating the viability of universal quantum computation using teleportation and single-qubit induced by transversal Z-rotations on these codes from an operations,” Nature, vol. 402, no. 6760, pp. 390–393, 1999. [Online]. alternate viewpoint, using Ax’s theorem on residue weights Available: http://www.nature.com/articles/46503 16

[7] S. Bravyi and A. Kitaev, “Universal quantum computation with ideal [31] N. Rengaswamy, R. Calderbank, S. Kadhe, and H. D. Pfister, “Synthesis Clifford gates and noisy ancillas,” Phys. Rev. A, vol. 71, no. 2, p. 022316, of Logical Clifford Operators via Symplectic Geometry,” in Proc. IEEE 2005. [Online]. Available: https://arxiv.org/abs/quant-ph/0403025 Int. Symp. Inform. Theory, Jun 2018, pp. 791–795. [Online]. Available: [8] A. J. Landahl and C. Cesare, “Complex instruction set computing http://arxiv.org/abs/1803.06987v1 architecture for performing accurate quantum $z$ rotations with less [32] P. Boykin, T. Mor, M. Pulver, V. Roychowdhury, and F. Vatan, “On magic,” arXiv preprint arXiv:1302.3240, 2013. [Online]. Available: universal and fault-tolerant quantum computing: a novel basis and a http://arxiv.org/abs/1302.3240 new constructive proof of universality for Shor’s basis,” in 40th Annu. [9] Y. Nam, J.-S. Chen, N. C. Pisenti, K. Wright, C. Delaney, D. Maslov, Symp. Found. Comput. Sci. (Cat. No.99CB37039), 1999, pp. 486–494. K. R. Brown, S. Allen, J. M. Amini, J. Apisdorf, K. M. Beck, A. Blinov, [Online]. Available: https://arxiv.org/abs/quant-ph/9906054 V. Chaplin, M. Chmielewski, C. Collins, S. Debnath, K. M. Hudek, [33] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum A. M. Ducore, M. Keesan, S. M. Kreikemeier, J. Mizrahi, P. Solomon, Information. Cambridge University Press, 2010. M. Williams, J. D. Wong-Campos, D. Moehring, C. Monroe, and [34] N. M. Linke, D. Maslov, M. Roetteler, S. Debnath, C. Figgatt, K. A. J. Kim, “Ground-state energy estimation of the water molecule on a Landsman, K. Wright, and C. Monroe, “Experimental comparison of trapped-ion quantum computer,” npj Quantum Inf., vol. 6, no. 1, p. 33, two quantum computing architectures,” Proceedings of the National 2020. [Online]. Available: http://arxiv.org/abs/1902.10171 Academy of Sciences, vol. 114, no. 13, pp. 3305–3310, 2017. [Online]. [10] N. Rengaswamy, R. Calderbank, and H. D. Pfister, “Unifying Available: https://www.pnas.org/content/114/13/3305/ the Clifford hierarchy via symmetric matrices over rings,” Phys. [35] H. Randriambololona, “On products and powers of linear codes under Rev. A, vol. 100, no. 2, p. 022304, 2019. [Online]. Available: componentwise multiplication,” Algorithmic arithmetic, geometry, and http://arxiv.org/abs/1902.04022 coding theory, vol. 637, pp. 3–78, 2015. [11] S. X. Cui, D. Gottesman, and A. Krishna, “Diagonal gates in the [36] D. Gottesman, “The heisenberg representation of quantum computers,” Clifford hierarchy,” Phys. Rev. A, vol. 95, no. 1, p. 012329, 2017, in Intl. Conf. on Group Theor. Meth. Phys. International Press, [Online]. Available: http://arxiv.org/abs/1608.06596. [Online]. Available: Cambridge, MA, 1998, pp. 32–43. [Online]. Available: https: https://journals.aps.org/pra/pdf/10.1103/PhysRevA.95.012329 //arxiv.org/abs/quant-ph/9807006 [12] T. Jochym-O’Connor, A. Kubica, and T. J. Yoder, “Disjointness of [37] J. T. Anderson, G. Duclos-Cianci, and D. Poulin, “Fault-Tolerant Stabilizer Codes and Limitations on Fault-Tolerant Logical Gates,” Conversion between the Steane and Reed-Muller Quantum Codes,” Phys. Rev. X, vol. 8, no. 2, p. 021047, 2018, [Online]. Available: Phys. Rev. Lett., vol. 113, no. 8, p. 080501, 2014, [Online]. Available: https://arxiv.org/pdf/1710.07256.pdf. http://arxiv.org/abs/1403.2734. [13] T. Jochym-O’Connor and R. Laflamme, “Using concatenated quantum [38] D. Quan, L. Zhu, C.-X. Pei, and B. C. Sanders, “Fault-tolerant codes for universal fault-tolerant quantum gates,” Physical review letters, conversion between adjacent Reed–Muller quantum codes based on vol. 112, no. 1, p. 010505, 2014. gauge fixing,” J. Phys. A Math. Theor., vol. 51, no. 11, p. 115305, [14] J. Haah and M. B. Hastings, “Codes and Protocols for Distilling $T$, 2018. [Online]. Available: http://arxiv.org/abs/1703.03860 controlled-$S$, and Toffoli Gates,” Quantum, vol. 2, p. 71, 2017. [39] J. Haah, “Towers of generalized divisible quantum codes,” Phys. [Online]. Available: https://arxiv.org/abs/1709.02832 Rev. A, vol. 97, no. 4, p. 042327, 2018. [Online]. Available: [15] E. T. Campbell and M. Howard, “Unified framework for magic state https://arxiv.org/abs/1709.08658 distillation and multiqubit gate synthesis with reduced resource cost,” [40] F. H. E. Watson, E. T. Campbell, H. Anwar, and D. E. Browne, Phys. Rev. A, vol. 95, no. 2, p. 022316, Feb 2017. [Online]. Available: “Qudit colour codes and gauge colour codes in all spatial dimensions,” https://journals.aps.org/pra/pdf/10.1103/PhysRevA.95.022316 Phys. Rev. A, vol. 92, no. 2, p. 022312, 2015. [Online]. Available: [16] ——, “Unifying Gate Synthesis and Magic State Distillation,” Phys. http://arxiv.org/abs/1503.08800 Rev. Lett., vol. 118, no. 6, p. 060501, Feb 2017. [Online]. Available: [41] J. Ax, “Zeroes of Polynomials Over Finite Fields,” Am. J. https://journals.aps.org/prl/pdf/10.1103/PhysRevLett.118.060501 Math., vol. 86, no. 2, p. 255, Apr 1964. [Online]. Available: [17] C. Vuillot and N. P. Breuckmann, “Quantum Pin Codes,” arXiv preprint https://www.jstor.org/stable/2373163?origin=crossref arXiv:1906.11394, 2019. [Online]. Available: http://arxiv.org/abs/1906. [42] X.-d. Hou, “Polynomials meeting Ax’s bound,” Acta Arithmetica, vol. 11394 176, no. 1, pp. 65–80, 2016. [18] E. M. Rains and N. J. Sloane, “Self-dual codes,” arXiv preprint arXiv:math/0208001, 2002. [19] G. Nebe, E. M. Rains, and N. J. A. Sloane, Self-dual codes and invariant theory. Springer, 2006, vol. 17. [20] F. J. MacWilliams and N. J. A. Sloane, The Theory of Error-Correcting Codes. North-Holland, Amsterdam, 1977. [21] R. McEliece, “Weights modulo 8 in binary cyclic codes,” Jet Propulsion Lab, vol. 11, pp. 86–88, 1972. [22] R. J. McEliece, “Weight congruences for p-ary cyclic codes,” Discrete Mathematics, vol. 3, no. 1-3, pp. 177–192, 1972. [23] M. Li, D. Miller, M. Newman, Y. Wu, and K. R. Brown, “2d compass codes,” Physical Review X, vol. 9, no. 2, p. 021041, 2019. [24] E. T. Campbell, “The smallest interesting colour code,” 2016, blog post. [Online]. Available: https://earltcampbell.com/2016/09/26/ the-smallest-interesting-colour-code/ [25] M. Bardet, V. Dragoi, A. Otmani, and J.-P. Tillich, “Algebraic properties of polar codes from a new polynomial formalism,” in Proc. IEEE Int. Symp. Inform. Theory. IEEE, 2016, pp. 230–234. [Online]. Available: http://arxiv.org/abs/1601.06215 [26] M. Bardet, V. Dragoi, A. Otmani, and J.-P. Tillich, “Algebraic Prop- erties of Polar Codes From a New Polynomial Formalism,” [Online]. Available: http://arxiv.org/pdf/1601.06215.pdf. [27] E. Arıkan, “Channel polarization: A method for constructing capacity- achieving codes for symmetric binary-input memoryless channels,” IEEE Trans. Inform. Theory, vol. 55, no. 7, pp. 3051–3073, July 2009. [28] A. Krishna and J.-P. Tillich, “Magic state distillation with punctured polar codes,” arXiv preprint arXiv:1811.03112, 2018. [Online]. Available: http://arxiv.org/abs/1811.03112 [29] D. K. Tuckett, S. D. Bartlett, and S. T. Flammia, “Ultrahigh Error Threshold for Surface Codes with Biased Noise,” Phys. Rev. Lett., vol. 120, no. 5, p. 050505, 2018. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevLett.120.050505 [30] M. M. Wilde, Quantum Information Theory. Cambridge University Press, 2013. 17

APPENDIX I CLASSICAL REED-MULLER CODES

Given an integer m ≥ 1, let x1, x2, . . . , xm be binary variables and we adopt the convention that x1 represents the least signiﬁcant bit (LSB) and xm represents the most signiﬁcant bit (MSB). These variables can also be interpreted as monomials of degree 1, and we can construct degree t monomials xi1 xi2 ··· xit where ij ∈ {1, . . . , m}. The set of all monomials in m variables is denoted by Mm. A degree t polynomial f on m variables is a binary linear combination of monomials such that the maximum degree term(s) has (have) degree t. Any polynomial f ∈ F2[x1, . . . , xm] can be associated one-to-one to its 2m evaluation vector ev(f) := [ f(x , . . . , x )] m ∈ . Note that the unique degree 0 monomial is taken to be 1 m 1 (xm,...,x1)∈F2 F2 whose evaluation vector is the all-1s vector. For 0 ≤ r ≤ m, the binary Reed-Muller code RM(r, m) is generated by evaluation vectors of all monomials on m binary variables with degree at most r, i.e., 2m RM(r, m) := {ev(f) ∈ F2 : f ∈ F2[x1, . . . , xm], deg(f) ≤ r} 2m = hev(f) ∈ F2 : f ∈ Mm, deg(f) ≤ ri. (51) Pr m Hence, the dimension of RM(r, m) is given by k = t=0 t . It is well-known that the minimum distance of RM(r, m) is 2m−r and that the dual of RM(r, m) is RM(m − r − 1, m) [20]. If ev(f) ∈ RM(r, m), then we also write f ∈ RM(r, m).

APPENDIX II PROOFSFOR ALL RESULTS In all the proofs below we will use a few observations or identities repeatedly, so we mention them here. T T (O1) As discussed in Section II-A, for a, b, x ∈ Zn we have E(a, b + 2x) = (−1)ax and E(a + 2x, b) = (−1)bx E(a, b). When multiplying two Pauli matrices we have the identities E(a, b)E(c, d) = (−1)h[a,b],[c,d]is E(c, d)E(a, b) (52) T T = ıbc −ad mod 4E(a + c, b + d) (53) T T = ıbc −ad mod 4E(a + c, (b ⊕ d) + 2(b ∗ d)) (54) T T T = ıbc −ad mod 4(−1)(a+c)(b∗d) E((a ⊕ c) + 2(a ∗ c), b ⊕ d) (55) T T T T = ıbc −ad mod 4(−1)(a⊕c)(b∗d) +(b⊕d)(a∗c) E(a ⊕ c, b ⊕ d). (56) n P xvT I ⊥ (O2) For any binary subspace A ⊆ Z2 , due to symmetry in binary subspaces we have x∈A(−1) = |A| · (v ∈ A ). In other words, we have |x ∈ A: xvT ≡ 0| = |x ∈ A: xvT ≡ 1| if and only if v∈ / A⊥. n n n (O3) For a ∈ Z2 , a set such as A := {x ∈ Z2 : x a} is a subspace of Z2 since x1, x2 a ⇒ (x1 ⊕ x2) a. Hence, using T P xv I ⊥ wH (a) I (O2) we observe that x∈A(−1) = |A| · (v ∈ A ) = 2 · (v a¯), where a¯ is the ones’ complement of a. n (O4) Let a, b ∈ Z2 have disjoint supports so that a ∗ b = 0. Then we observe that     X vaT vbT (i) X X v aT X v bT ı (−ı) =  ı a   (−ı) b  (57) n v∈Z2 w(a⊕b) vaa vbb     X X w (v ) X w (v ) =  ı H a   (−ı) H b  (58) w(a⊕b) vaa vbb (ii) = 2n−wH (a)−wH (b)(1 + ı)wH (a)(1 − ı)wH (b). (59) n In step (i) we split any v ∈ Z2 uniquely as v = va ⊕ vb ⊕ w, where va is supported only on a, vb is supported only on b, and w is supported outside the support of a and b. In step (ii), for each of the two sums over va and vb, we notice T T that the inner products vaa and vbb take values q ∈ {0, 1, . . . , wH (va)} and q ∈ {0, 1, . . . , wH (vb)} according to the n Hamming weights wH (va) and wH (vb), respectively. Hence, there are q vectors va a (resp. vb b) that produce the inner product q, and this is captured in the binomial expansion of (1 + ı)wH (a) (resp. (1 − ı)wH (b)).

A. Proof of Theorem 2 We will make use of the above observations in the following proof. We proved in [10] that given a tensor product τ (`) ⊗ R1 τ (`) ⊗ · · · ⊗ τ (`), the result is also of the form τ (`) with R2 Rn R   R1 0 ··· 0  0 R2 ··· 0  R =   . (60)  . . .. .   . . . .  0 0 ··· Rn 18

Hence, for an n-qubit system, the transversal application of T corresponds to R = In and ` = 3. Based on the discussion in Section III-A, for the case of transversal T , we need

2r 1 X T ⊗nΠ (T ⊗n)† = T ⊗nE(a , b )(T ⊗n)† (61) S 2r j j j j=1 2r 1 1 T X X bj y = j · (−1) E(aj, bj ⊕ y) (62) 2r 2wH (aj )/2 j=1 yaj   2r 1 X j  X b yT  = E(a , b ) + (−1) j E(a , b ⊕ y) (63) r wH (aj )  j j j j  2 2   j=1 2 yaj y6=0 2r 1 X = E(a , b ), (64) 2r j j j j=1 where only for the last step we have assumed that transversal T preserves the code subspace. Note that whenever aj = 0, the denominator is 1 and the inner summation is trivial since only 0 0. Therefore, each such stabilizer E(0, bj) is retained unchanged (as we would expect since T ⊗n is diagonal and commutes with diagonal Paulis), and we only need to analyze the case aj 6= 0. r For any index j = h ∈ {1, 2,..., 2 }, first we observe that we need wH (ah) to be even in order to make the denominator wH (a )/2 an integer, which can be canceled by producing 2 h copies of the stabilizer element E(ah, bh) (with the appropriate r sign h ∈ {±1}) in the summation over all 2 stabilizer elements. This clearly shows that the first condition in the theorem is necessary for transversal T to preserve the code space. (We will call the sum over j ∈ {1,..., 2r} as the “outer summation” and the sum over y aj as the “inner summation”; also, for a given j = h we refer to the corresponding E(ah, bh) as the 0 0 “outer summation term”.) The only way to produce copies of E(ah, bh) is through stabilizers E(ah, bh) such that bh ⊕ y = bh 0 for some y ah. (Note: These two stabilizers correspond to two different outer summation terms for some indices j and j , where aj = aj0 = ah but bj = bh is distinct from bj0 = bh ⊕ y.) If two such Paulis E(ah, bh ⊕ y),E(ah, bh ⊕ z) must belong T T to S then we need E(ah, bh ⊕ y) and E(ah, bh ⊕ z) to commute, which means we need ahy ≡ ahz (mod 2). Since y = 0 T T must be included, we need ahz = zz = wH (z) ≡ 0 (mod 2) for all such choices z ah. wH (a )−1 There are nh := 2 h such even weight vectors z ah but only a subset of them might correspond to stabilizers in the inner summation. Hence, let us define

Zh := {z ah : zE(0, z) ∈ S for some z ∈ {±1}} (65) as in the statement of the theorem. It is clear that Zh is a binary subspace of even weight vectors, so let its dimension be th ≤ wH (ah) − 1. As mentioned above, every z ∈ Zh provides an outer summation term E(ah, bh ⊕ z) that produces a p copy of E(ah, bh) in its corresponding inner summation. So we need at least 2 h such z’s in order to cancel the factor in the denominator, where ph := wH (ah)/2, i.e., th ≥ ph. In order to prove that the second and third conditions in the theorem are necessary for transversal T to preserve the code space, we need to ensure two things based on the equality imposed in (64):

r wH (a )/2 (a) For any h ∈ {1, 2,..., 2 }, over all outer summation indices j there are a net of 2 h copies of E(ah, bh) (with the right sign) so that this stabilizer element is preserved under conjugation of the code projector by transversal T . r (b) For any h ∈ {1, 2,..., 2 }, for each y∈ / Zh and y ah, over all outer summation indices j there are exactly the same number of positive and negative copies of the non-stabilizer element E(ah, bh ⊕ y), which cancel out. Let us express these two conditions mathematically by collecting signs of the respective Pauli elements. For notational clarity, if the outer summation term is E(ah, bh) then we write its sign as (ah,bh) and similarly if the term is E(ah, bh ⊕ z) then we write its sign as (ah,bh⊕z). For conditions (a) and (b), we respectively require

T T X (bh⊕z)z X bhz wH (ah)/2 (ah,bh⊕z)(−1) = (ah,bh⊕z)(−1) = (ah,bh)2 , (66)

z∈Zh z∈Zh T T T T X (bh⊕z)(z⊕y) bhy X bhz zy (ah,bh⊕z)(−1) = (−1) (ah,bh⊕z)(−1) (−1) = 0. (67)

z∈Zh z∈Zh

T T If we ignore the overall sign (−1)bhy in the second equation, then we see that it is impossible to have (−1)zy = 1 for all z ∈ Zh, because otherwise we have a contradiction between the two conditions. Therefore, since y∈ / Zh, by the symmetry of T T binary vectors spaces (observation (O2)), we must have |{z ∈ Zh : zy ≡ 0}| = |{z ∈ Zh : zy ≡ 1}|. To proceed further we will ﬁnd the following lemma about binary subspaces to be useful. 19

n Lemma 22. For even n, let C be an [n, k ≥ 2 ] binary linear code comprising only of even weight codewords. Then the following are equivalent: 1) C contains its dual C⊥. n 2) C contains an [n, 2 ] self-dual subcode Cs. n T T 3) Any vector y ∈ Z2 \ C satisfies |{x ∈ C : xy ≡ 0}| = |{x ∈ C : xy ≡ 1}|. k×n Proof: First we show that 2) implies 3). Let G ∈ Z2 be a generator matrix for C with rows g1, g2, . . . , gk such that n g1, g2, . . . , gn/2 form a submatrix Gs that generates the self-dual subcode Cs. Then any y ∈ Z2 \ C cannot be orthogonal to T all rows of Gs, since otherwise y ∈ Cs. Let yg1 = 1 (mod 2) without loss of generality. Then for any other gi such that T 0 0 0 0 ygi = 1 (mod 2), one can replace gi with gi ⊕ g1 to form a new matrix G with rows g1, g2, g3, . . . , gk that still spans C 0 0 T T (i.e., some gi can be equal to gi). Hence, now y(gi) = 0 (mod 2) for all i = 2, 3, . . . , k and yx = 1 (mod 2) for all x in 0 0 0 T T the coset g1 ⊕ span(g2, g3, . . . , gk). This shows that |{x ∈ C : xy ≡ 0}| = |{x ∈ C : xy ≡ 1}|. n Next we show that 3) implies 1) and 1) implies 2). Now we start by assuming that every y ∈ Z2 \ C satisfies |{x ∈ C : xyT ≡ 0}| = |{x ∈ C : xyT ≡ 1}|. This means that all vectors orthogonal to C are contained in C, i.e., C⊥ ⊆ C. Without loss of generality, assume that we have a generator matrix G for C with rows g1, g2, . . . , gk such that g1, g2, . . . , gn−k form a submatrix G⊥ that generates the dual code C⊥. Then we can enlarge C⊥ into a dimension (n − k + 1) self-orthogonal code ⊥ ⊥ by adding gn−k+1 to the rows of G , since gn−k+1 ∈ C is dual to C and to itself (as it has even weight). Now notice that all vectors orthogonal to this new C⊥ are still in the rowspace of G since the dual of the span of the first (n − k) rows are ⊥ contained in the rowspace of G. Hence, it is possible to find a row between gn−k+2 and gk that can be used to enlarge C into a dimension (n − k + 2) self-orthogonal code. Intuitively, as C⊥ is enlarged, its dual keeps shrinking starting from C at the beginning. We can continue this process and, since we assumed k ≥ n/2, it stops only when C⊥ is a dimension n/2 self-dual code, as required. wH (a ) Let Z˜h denote the subspace Zh where all the indices outside the support of ah are punctured, i.e., Z˜h ⊆ {0, 1} h . ⊥ Setting C = Z˜h in Lemma 22, we see that Z˜h must contain its dual (Z˜h) and also a dimension wH (ah)/2 self-dual code ˜s Zh. In other words, Zh must contain a dimension wH (ah)/2 self-dual code in the support of ah. This proves the necessity of the second condition in the theorem.

Let us compute the sign (ah,bh⊕z) of E(ah, bh ⊕ z) assuming that hE(ah, bh), zE(0, z) ∈ S. Using (5), (7) we calculate

T −ahz hE(ah, bh) · zE(0, z) = hzı E(ah, bh + z) (68) zzT = hzı E(ah, (bh ⊕ z) + 2(bh ∗ z)) (69) T T zz ah(bh∗z) = hzı (−1) E(ah, bh ⊕ z) (70) T T zz bhz = hzı (−1) E(ah, bh ⊕ z) (71) T T zz bhz ⇒ (ah,bh⊕z) = (ah,bh)(0,z)ı (−1) , (72) where we have used observation (O1) and for the fourth equality we have assumed that z ah. Now we start analyzing the T T T sum for condition (a) above by observing that wH (z ⊕ v) = zz + vv − 2zv .

T X bhz Γ := (ah,bh⊕z)(−1) (73)

z∈Zh T T T X zz bhz bhz = (ah,bh)(0,z)ı (−1) (−1) (74)

z∈Zh X zzT = (ah,bh) (0,z)ı (75)

z∈Z˜h 2 X zzT +vvT ⇒ Γ = (0,z)(0,v)ı (76)

z,v∈Z˜h T X wH (z⊕v) (z⊕v)v = (0,z⊕v)ı (−1) (77)

z,v∈Z˜h   X uuT X uvT = (0,u)ı  (−1)  (u := z ⊕ v) (78)

u∈Z˜h v∈Z˜h ˜ X uuT = |Zh| (0,u)ı (by observation (O3)). (79) ⊥ u∈(Z˜h)

T ˜ ⊥ ˜ ⊥ wH (ah)/2 uu We need this sum over (Zh) to be |(Zh) | so that |Γ| = 2 as required. Therefore, it is clear that we need (0,u) := ı ⊥ ⊥ for all u ∈ (Z˜h) . Again, note that these vectors u essentially represent n-qubit pure Z-type stabilizers even though (Z˜h) is 20

˜ ⊥ ˜⊥ a space where all indices outside the support of ah have been punctured. The subtlety is that (Zh) 6= (Zh ) since the latter computes the dual over {0, 1}n, which is not what we want here. Let us also verify that condition (b) above is satisﬁed. T T X bhz zy ∆ := (ah,bh⊕z)(−1) (−1) (80)

z∈Zh T T T T X zz bhz bhz zy = (ah,bh)(0,z)ı (−1) (−1) (−1) (81)

z∈Zh X zzT zyT = (ah,bh) (0,z)ı (−1) (82)

z∈Z˜h T X wH (w⊕u) (w⊕u)y = (ah,bh) (0,w⊕u)ı (−1) (83) ⊥ w∈Z˜h/(Z˜h) ⊥ u∈(Z˜h) X wwT wyT = (ah,bh) (0,w)ı (−1) ⊥ w∈Z˜h/(Z˜h)   X uuT (w⊕y)uT  (0,u)ı (−1)  (84) ⊥ u∈(Z˜h) = 0, (85) T ⊥ since uw = 0 for all u ∈ (Z˜h) and, again, by symmetry of binary vector spaces and because y∈ / Z˜h, we have |u ∈ ⊥ T ⊥ T (Z˜h) : uy ≡ 0| = |u ∈ (Z˜h) : uy ≡ 1| (observation (O2)). Thus, this establishes the necessity of the third condition in the theorem. uuT ˜s Now for the converse we assume that the first two conditions in the theorem hold true and that (0,u) := ı for all u ∈ Zh, ˜s ˜ where Zh is the self-dual code present inside Zh. Once again, we just have to show that conditions (a) and (b) above are ˜ ⊥ ˜s satisfied under these assumptions. For (b), the above calculation for ∆ itself suffices since (Zh) ⊆ Zh, so we are only left wH (ah)/2 to show that Γ = (ah,bh)2 . We observe that X zzT Γ = (ah,bh) (0,z)ı (86)

z∈Z˜h X X wwT +uuT −2wuT = (ah,bh) (0,w⊕u)ı (87) ˜ ˜s ˜s w∈Zh/Zh u∈Zh   X wwT X uuT wuT = (ah,bh) (0,w)ı  (0,u)ı (−1)  (88) ˜ ˜s ˜s w∈Zh/Zh u∈Zh   X wwT X wuT = (ah,bh) (0,w)ı  (−1)  (89) ˜ ˜s ˜s w∈Zh/Zh u∈Zh ˜s = (ah,bh)|Zh| (90)

wH (ah)/2 = (ah,bh)2 , (91) ˜ ˜s ˜s since only w = 0 ∈ Zh/Zh is orthogonal to all u ∈ Zh (observation (O2)). This completes the proof of the theorem. B. Proof of Lemma 4 We will use the observations listed at the beginning of Appendix II to complete this proof. As mentioned earlier in the proof of Theorem 2, we proved in [10] that given a tensor product of diagonal unitaries τ (`) ⊗ τ (`) ⊗ · · · ⊗ τ (`), the result is also of R1 R2 Rn (`) the form τR with   R1 0 ··· 0  0 R2 ··· 0  R =   . (92)  . . .. .   . . . .  0 0 ··· Rn

Hence, for an n-qubit system, the transversal application of T gate corresponds to R = In and ` = 3. Similarly, it is easy to ⊗t see that the symmetric matrix corresponding to T is R = Dt, where t = t1 + 7t7 and Dt is a diagonal n × n matrix with the diagonal set to t. Then, according to (15), we see that ˜ R(Dt, a, 3) = 3Da∗t − (D1−aDtDa + DaDtD1−a + 2Da∗t) = Da∗t = Da∗t1 + 7Da∗t7 , (93) 21

T T T T φ(Dt, a, b, 3) = −aDta = −at = −at1 − 7at7 = −wH (a ∗ t1) + wH (a ∗ t7)(mod 8). (94)

(2) 1 T T T √ P vx vDa∗t1 v vDa∗t7 v n We need to calculate c = n n (−1) ı (−ı) for all x ∈ . For x = 0, we observe Da∗t,x 2 v∈Z2 Z2

(2) 1 X vD vT vD vT c = √ ı a∗t1 (−ı) a∗t7 (95) Da∗t,0 n 2 n v∈Z2 1 X T T = √ ıv(a∗t1) (−ı)v(a∗t7) (96) n 2 n v∈Z2 1 = √ 2n−wH (a∗t1)−wH (a∗t7)(1 + ı)wH (a∗t1)(1 − ı)wH (a∗t7) (observation (O4)) (97) 2n √ ıπ [w (a∗t )−w (a∗t )] n−w (a∗t0) /2 ± ıπ = e 4 H 1 H 7 2( H ) (since e 4 = (1 ± ı)/ 2). (98)

0 n For the case x 6= 0, we calculate as follows. Recollect that we have deﬁned t = t1 + t7 ∈ Z2 .

(2) 1 X T c = √ ıv[(a∗t1)+3(a∗t7)+2x] (99) Da∗t,x n 2 n v∈Z2 1 X T X T = √ ıv[(a∗t1)+3(a∗t7)+2x] (−1)wx (observation (O4)) (100) 2n v(a∗t0) w(a∗t0)     n −w (a∗t0) X v[(a∗t )+3(a∗t )+2x]T X w[(a∗t )+3(a∗t )+2x]T 0 = 2 2 H  ı 1 7   ı 1 7  , x (a ∗ t ). (101) v(a∗t0)⊕x wx In the last step, unless x (a ∗ t0) it is easy to see that the inner summation in the second step vanishes (observation (O3)). Furthermore, we have used this property of x to split the sum into indices on the support of x and the others (observation 0 (O4)). For convenience, let us denote by A the support of (a ∗ t ) ⊕ x, by A1 the support of (a ∗ t1) ⊕ (x ∗ a ∗ t1), and by A7 0 T the support of (a ∗ t7) ⊕ (x ∗ a ∗ t7). Note that A = supp(a ∗ t ) \ supp(x) and hence vx = 0. For simplicity, we will write n {v ∈ Z2 : supp(v) ⊆ A} as v A. Then, using observation (O4), we can write the ﬁrst sum above as X T T X T X T ıv(a∗t1) (−ı)v(a∗t7) = ız1(a∗t1) (−ı)z2(a∗t7) (102)

vA z1A1 z2A7 = (1 + ı)wH (a∗t1)−wH (x∗a∗t1)(1 − ı)wH (a∗t7)−wH (x∗a∗t7) (103) ıπ [(w (a∗t )−w (a∗t ))−(w (x∗a∗t )−w (x∗a∗t ))] w (a∗t0)−w (x) /2 = e 4 H 1 H 7 H 1 H 7 2( H H ) . (104) Now, again using observation (O4), we can calculate the second sum similarly as follows. X T X T X T ıw[(a∗t1)+3(a∗t7)+2x] = (−ı)ww ızz (105)

wx wx∗(a∗t1) zx∗(a∗t7) = (1 − ı)wH (x∗a∗t1)(1 + ı)wH (x∗a∗t7) (106) −ıπ [w (x∗a∗t )−w (x∗a∗t )] w (x)/2 = e 4 H 1 H 7 2 H . (107) Combing these two results and substituting back we get,

( 0 ıπ (n−wH (a∗t ))/2 [(wH (a∗t1)−wH (a∗t7))−2(wH (x∗a∗t1)−wH (x∗a∗t7))] 0 2 e 4 if x (a ∗ t ), c(2) = (108) Da∗t,x 0 otherwise.

(`) Recollect from (22) that the action of τR on a Pauli matrix E(a, b) is given by

(`) (`) 1 X (`−1) T τ E(a, b)(τ )† = √ ξφ(R,a,b,`) c ı−ax E(a, b + aR + x). (109) R R n R˜(R,a,`),x 2 n x∈Z2 Hence, using observation (O1), we can calculate the action of T ⊗t on E(a, b) under conjugation to be † T ⊗tE(a, b) T ⊗t

1 −ıπ T [wH (a∗t1)−wH (a∗t7)] X (2) −ax = √ e 4 c ı E(a, b + aDt + x) (110) 2n Da∗t,x x(a∗t0) 0 (n−wH (a∗t ))/2 2 T T X −x[(a∗t1)−(a∗t7)] −x[(a∗t1)+(a∗t7)] = √ ı E(a, b + (a ∗ t1) + 7(a ∗ t7) + x) (111) 2n x(a∗t0) 22

1 T X x(a∗t1) 0 = 0 (−1) E(a, b + ((a ∗ t ) + x) + 6(a ∗ t7)) (112) 2wH (a∗t )/2 x(a∗t0)

1 T T X x(a∗t1) +a(a∗t7) 0 0 = 0 (−1) E (a, b + ((a ∗ t ) ⊕ x + 2(x ∗ (a ∗ t )))) (113) 2wH (a∗t )/2 x(a∗t0)

1 T X x(a∗t1) +wH (a∗t7) 0 = 0 (−1) E (a, b + ((a ∗ t ) ⊕ x + 2x)) (114) 2wH (a∗t )/2 x(a∗t0)

1 T T X x(a∗t1) +wH (a∗t7)+xa 0 0 = 0 (−1) E (a, b ⊕ ((a ∗ t ) ⊕ x) + 2b ∗ ((a ∗ t ) ⊕ x)) (115) 2wH (a∗t )/2 x(a∗t0) 1 T T 0 T X x(a∗t7) +(a∗t7)(a∗t7) +a[b∗((a∗t )⊕x)] 0 = 0 (−1) E (a, b ⊕ ((a ∗ t ) ⊕ x)) (116) 2wH (a∗t )/2 x(a∗t0) 1 T 0 T X (a∗t7)[(a∗t7)⊕x] +a[b∗((a∗t )⊕x)] 0 = 0 (−1) E (a, b ⊕ ((a ∗ t ) ⊕ x)) (117) 2wH (a∗t )/2 x(a∗t0)

1 T T X (a∗t7)[y⊕(a∗t1)] +a(b∗y) 0 = 0 (−1) E (a, b ⊕ y)(y := (a ∗ t ) ⊕ x) (118) 2wH (a∗t )/2 y(a∗t0)

1 T T X y(a∗t7) +a(b∗y) = 0 (−1) E (a, b ⊕ y) (119) 2wH (a∗t )/2 y(a∗t0)

1 T X (b+t7)y = 0 (−1) E (a, b ⊕ y) . (120) 2wH (a∗t )/2 y(a∗t0)

0 T T The last step follows from y (a ∗ t ) ⇒ y a ⇒ (b ∗ y) a and y(a ∗ t7) = wH ((y ∗ a) ∗ t7) = wH (y ∗ t7) = yt7 .

C. Proof of Corollary 5 P7 0 We have t = j=1 jtj with tj ∗ tj0 = 0 for all j 6= j . Then we can write T ⊗t = T t1+t3+t5+t7 P t2+t3+t6+t7 Zt4+t5+t6+t7 ,

t4+t5+t6+t7 ⊗t ⊗t † where Z = E(0, t4 + t5 + t6 + t7). Now we can compute T E(a, b)(T ) as follows. Firstly we observe

T a(t4+t5+t6+t7) E(0, t4 + t5 + t6 + t7) · E(a, b) · E(0, t4 + t5 + t6 + t7) = (−1) E(a, b). (121)

q † q n Next, using observation (O1), and observing that P E(a, b)(P ) = E(a, b + (a ∗ q)) for q ∈ Z2 , we can calculate T P t2+t3+t6+t7 · (−1)a(t4+t5+t6+t7) E(a, b) · (P †)t2+t3+t6+t7 T a(t4+t5+t6+t7) = (−1) E(a, b + (a ∗ (t2 + t3 + t6 + t7))) (122) T T a(t4+t5+t6+t7) +a(b∗(t2+t3+t6+t7)) = (−1) E(a, b ⊕ (a ∗ (t2 + t3 + t6 + t7))) (123) T T a(t4+t5+t6+t7) +a(b∗(t˜2+t˜3)) = (−1) E(a, b ⊕ (a ∗ (t˜2 + t˜3))) (124) T ˜ ˜ = (−1)a(t4+t5+t6+t7) +wH (a∗b∗t2)+wH (a∗b∗t3)E(a, c), (125) where we have defined c := b ⊕ (a ∗ (t˜2 + t˜3)) for convenience. Finally, we can invoke Lemma 4 with the t1, t7 in that lemma taken respectively to be t1 + t3 + t5 + t7 and 0 here. Then we have T ˜ ˜ T t1+t3+t5+t7 · (−1)a(t4+t5+t6+t7) +wH (a∗b∗t2)+wH (a∗b∗t3)E(a, c) · (T †)t1+t3+t5+t7 T a(t4+t5+t6+t7) +wH (a∗b∗t˜2)+wH (a∗b∗t˜3) (−1) X cyT = ˜ ˜ (−1) E(a, c ⊕ y) (126) 2wH (a∗(t1+t3))/2 ya∗(t˜1+t˜3) a(t +t +t +t )T +w (a∗b∗t˜ )+w (a∗b∗t˜ ) (−1) 4 5 6 7 H 2 H 3 T ˜ ˜ T X by +y(a∗(t2+t3)) ˜ ˜ = ˜ ˜ (−1) E(a, b ⊕ [(a ∗ (t2 + t3)) ⊕ y]) (127) 2wH (a∗(t1+t3))/2 ya∗(t˜1+t˜3) a(t +t +t +t )T +w (a∗(t˜ +t˜ )) (−1) 4 5 6 7 H 2 3 T T X bz +z(a∗(t˜2+t˜3)) = ˜ ˜ (−1) E(a, b ⊕ z), (128) 2wH (a∗(t1+t3))/2 (a∗t˜2)za∗(t˜1+t˜2+t˜3) where we have defined z := (a ∗ (t˜2 + t˜3)) ⊕ y. In the last step, first we observe that (a ∗ t˜2) is always present in z and is unaffected by the range of y above since t˜1 ∗ t˜2 = 0 = t˜3 ∗ t˜2. Hence, we write that z contains and is contained in (a ∗ t˜2). 23

Next, since the change of variables does not change the support of y inside (a ∗ t˜1), the new variable z also loops over all vectors contained in (a ∗ t˜1). Finally, since z adds (a ∗ t˜3) to y (modulo 2), we see that y undergoes a ones’ complement in the support of (a ∗ t˜3). However, since y takes all possible values in the support of (a ∗ t˜3), we conclude that z still retains ˜ T ˜ T that property. Furthermore, under the change of variables, we have cancelled the new factors (−1)b(a∗t2) and (−1)b(a∗t3) ˜ ˜ correspondingly with the existing factors (−1)wH (b∗a∗t2) and (−1)wH (b∗a∗t3). Now we observe that T T z(a ∗ (t˜2 + t˜3)) = wH (z ∗ (a ∗ t˜2)) + wH ((z ∗ a) ∗ t˜3) = wH (a ∗ t˜2) + wH (z ∗ t˜3) = wH (a ∗ t˜2) + t˜3z , (129) since clearly z a. Substituting this back, and noting that t˜3 = t3 + t7, we obtain a(t +t +t +t )T † (−1) 3 4 5 6 T ⊗t ⊗t X (b+t˜3)z T E(a, b) T = ˜ ˜ (−1) E (a, b ⊕ z) , (130) 2wH (a∗(t1+t3))/2 (a∗t˜2)z(a∗(t˜1+t˜2+t˜3)) which is the ﬁnal expression given in the statement of the lemma.

D. Proof of Corollary 10 For convenience, we will use the notation A, B, C, D to also refer to the subspaces hAi, hBi, hCi, hDi generated by the matrices A, B, C, D, respectively. We assume that A and C are disjoint, that B and D are disjoint, and that C and D have full rank, all without loss of generality. Note that if some of these are not satisﬁed, then we can perform suitable row operations to subsume rows a 0 into C 0 and rows 0 b into 0 D. We will prove the result for t = [1, 1,..., 1], i.e., transversal T , but the extension to any t ∈ {0, 1, 7}n is straightforward as we comment at the end of the proof. Firstly, it is clear that the CSS code has the transversal T property because this depends on the subspace hA, Ci being even and the existence of a self-dual code in D within the support of each vector in the subspace hA, Ci. Dropping B does not affect these properties. It is also clear that the CSS code still encodes k qubits if A has full rank, but if this is violated then some rows of the stabilizer matrix are removed to provide room for more than k logical qubits (without affecting the transversal T property). Secondly, the distance of the CSS code is lower bounded by the minimum of the minimum weights of hA, Ci⊥ and D⊥. Since 0 hA, Ci⊥ belongs to the normalizer of the given stabilizer code, and D has minimum weight at least d by non- degeneracy, we know that the minimum weight of hA, Ci⊥ is at least d. Originally, hB,Di⊥ 0 is in the normalizer of the given stabilizer, so the minimum weight of hB,Di⊥ is at least d as well. Since C 0 was initially in the stabilizer, we know that C ⊂ hB,Di⊥ and minimum weight of C must be at least d by non-degeneracy. However, A ⊂ D⊥ and A 0 was not originally part of the stabilizer, so it appears that A might have vectors of weight less than d. But by non-degeneracy, minimum weight of D is d. So the minimum weight of any self-dual code Cx ⊂ D is d, for any x ∈ hA, Ci. Hence, this 6 means that wH (x) ≥ d since x ∈ Cx . Now consider a z ∈ D⊥. We want to show that a minimal weight z has weight at least d. Suppose, for the sake of contradiction, that wH (z) is minimal but is strictly less than d. Now consider any x ∈ hA, Ci and look at the projection (x ∗ z). By assumption, z is orthogonal to D and hence is orthogonal to Cx. But the inner product of z with any vector in Cx only depends on the projection (x ∗ z). Therefore, (x ∗ z) is orthogonal to Cx, and hence belongs to Cx as Cx is self-dual. Observe that wH (x ∗ z) ≤ wH (z) < d, which implies that we have found an element (x ∗ z) ∈ Cx that has weight less than d. This is a contradiction since minimum weight of Cx is d, and this completes the proof for transversal T . Note that for 0 any other t = t1 + 7t7, as Theorem 6 suggests, we simply replace x ∈ hA, Ci in the above argument with (x ∗ t ), where 0 t = t1 + t7.

E. Proof of Theorem 12 GC1/C2 Lk Let G1 = be a generator matrix for the code C1. Then by the CSS construction, the vectors x = i=1 cixi, G2 where ci ∈ {0, 1} and xi form the k rows of the coset generator matrix GC1/C2 , determine all logical X operators E(x, 0) ⊥ for the code CSS(X,C2; Z,C1 ). Similarly, vectors a ∈ C2 determine the X-type stabilizers E(a, 0) for the code. Therefore, (x ⊕ a) ∈ C1 represents all possible X-type representatives of all logical X operators for the CSS-T code. Recollect that by ⊥ ⊥ ˜⊥ the CSS-T conditions, C2 ⊂ C1 ⇒ a ∈ C1 as well, and in fact a˜ belongs to the dual Za of the (punctured) subspace Za in ⊥ wH (a) C1 that is supported on a (see Theorem 2 for notation), so that ı E(0, a) ∈ S. By assumption, we have transversal T acting trivially on the logical qubits, so transversal P = T 2 must also act trivially on the logical qubits. Using this fact, and the identity PXP † = Y , let us observe the action of transversal P on a logical X representative E(x ⊕ a, 0). We have † P ⊗nE(x ⊕ a, 0) P ⊗n = E(x ⊕ a, x ⊕ a) = E(x ⊕ a, 0) · ıwH (x⊕a)E(0, x ⊕ a), (131) T where the second equality follows from the identity (5). Note that wH (x⊕a) = wH (x)+wH (a)−2xa ≡ wH (x)+wH (a) (mod 4), since xaT ≡ 0 (mod 2) due to the fact that logical X operators must commute with Z-type stabilizers, and ıwH (a)E(0, a) ∈

6Strictly speaking, we are only concerned with the minimum weight of D⊥ \ hA, Ci and not of D⊥, i.e., we do not require that the new CSS code is also non-degenerate. However, the above shows that the minimum weight of hA, Ci is already at least d due to Theorem 2. 24

T S. (Recollect that E(x, 0) and E(0, a) commute if and only if their symplectic inner product h[x, 0], [0, a]is = xa = 0.) Hence, ıwH (x⊕a)E(0, x ⊕ a) = ıwH (x)E(0, x) · ıwH (a)E(0, a). Since P ⊗n must act trivially on logical qubits, we require that P ⊗nE(x ⊕ a, 0) (P ⊗n)† ≡ E(x ⊕ a, 0), where the equivalence class is deﬁned by multiplication with stabilizer elements. For this equivalence to be true, we need ıwH (x)E(0, x) ∈ S. This proves the ﬁrst condition stated in the theorem. The above ensures that P ⊗n acts like the logical identity. Let us now examine T ⊗n along similar lines by using the identity TXT † = e−ıπ/4YP . We require

† ıπ ⊗n ⊗n − wH (x⊕a) x⊕a T E(x ⊕ a, 0) T = e 4 E(x ⊕ a, x ⊕ a) P (132) − ıπ w (x⊕a) w (x⊕a) x⊕a = e 4 H E(x ⊕ a, 0) · ı H E(0, x ⊕ a) P (133) − ıπ w (x⊕a) x⊕a w (x⊕a) = e 4 H E(x ⊕ a, 0) P · ı H E(0, x ⊕ a) (134) ≡ E(x ⊕ a, 0). (135) Here the notation P x⊕a means that the phase gate is applied to the qubits in the support of (x⊕a). From the above calculation for P ⊗n, we know that ıwH (x⊕a)E(0, x ⊕ a) ∈ S, so for the last equivalence to be true, we need to ensure that P x⊕a acts like the logical identity. Let us examine its action on an arbitrary logical X representative E(y ⊕ b, 0) for y ∈ C1/C2, b ∈ C2. We require † P x⊕aE(y ⊕ b, 0) P x⊕a = E(y ⊕ b, (y ⊕ b) ∗ (x ⊕ a)) (136) = E(y ⊕ b, 0) · ıwH ((y⊕b)∗(x⊕a))E(0, (y ⊕ b) ∗ (x ⊕ a)) (137) ≡ E(y ⊕ b, 0). (138) Observe that this is satisfied for the case y = x, b = a by the arguments above for P ⊗n. Clearly, the constraint we need is that ıwH ((y⊕b)∗(x⊕a))E(0, (y ⊕ b) ∗ (x ⊕ a)) ∈ S. Since this must hold for all valid x, y, a, b, by setting a = b = 0 we obtain the third condition of the theorem. Similarly, by setting a = y = 0 and x = y = 0 respectively, we obtain the second and fourth conditions of the theorem. It can be verified that these alone ensure that ıwH ((y⊕b)∗(x⊕a))E(0, (y ⊕ b) ∗ (x ⊕ a)) ∈ S for all combinations of x, y, a, b, since we can split (y ⊕ b) ∗ (x ⊕ a) = (y ∗ x) ⊕ (y ∗ a) ⊕ (b ∗ x) ⊕ (b ∗ a) and using the Hamming weight identity we used above. Finally, we will show that the last three conditions amount to triorthogonality. wH (y∗x) ⊥ Note that since we need ı E(0, y ∗ x) ∈ S, it must be true that (y ∗ x) ∈ C1 since by the CSS construction pure ⊥ Z-type stabilizers arise from the code C1 . As logical X operators and X-type stabilizers must each commute with Z-type T stabilizers, by the symplectic inner product constraint we can see that this implies z(y ∗ x) = wH (z ∗ y ∗ x) ≡ 0 (mod 2) wH (b∗a) for any z ∈ C1/C2 or z ∈ C2. Similarly, since we need ı E(0, b ∗ a) ∈ S, we also have wH (z ∗ b ∗ a) ≡ 0 for any z ∈ C1/C2 or z ∈ C2. These are exactly the triorthogonality conditions in Definition 13. The first condition of Definition 13 ⊥ wH (y∗x) wH (y∗a) follows from the facts that C2 ⊂ C1 , and since ı E(0, y ∗ x), ı E(0, y ∗ a) ∈ S must be Hermitian, the phase T T has to be ±1, which implies wH (y ∗ x) = xy ≡ 0, wH (y ∗ a) = ay ≡ 0 (mod 2) for any x, y ∈ C1/C2 and a ∈ C2. Hence, triorthogonality of G1 is a necessary condition for transversal T to realize the logical identity on a CSS-T code.

F. Proof of Theorem 14

The proof uses a very similar strategy as for Theorem 12. As we observed there, (x⊕a) ∈ C1 for x ∈ C1, a ∈ C2 represents all possible X-type representatives of all logical X operators for the CSS-T code. Recollect that by the CSS-T conditions, ⊥ ⊥ ˜⊥ ⊥ C2 ⊂ C1 ⇒ a ∈ C1 as well, and in fact a˜ belongs to the dual Za of the (punctured) subspace Za in C1 that is supported on a (see Theorem 2 for notation), so that ıwH (a)E(0, a) ∈ S. By assumption, we have physical transversal T acting as transversal T on the logical qubits, so physical transversal P = T 2 must also act as transversal P on the logical qubits. Using this fact, and the identity PXP † = Y , let us observe the action of transversal P on a logical X representative E(x ⊕ a, 0). We require † P ⊗nE(x ⊕ a, 0) P ⊗n = E(x ⊕ a, x ⊕ a) = E(x ⊕ a, 0) · ıwH (x⊕a)E(0, x ⊕ a) (139) ≡ ıwH (c)E(x ⊕ a, 0)E(0, z), (140)

⊥ ⊥ where the second equality follows from (5), and z ∈ C2 /C1 is the corresponding logical Z string for the logical X string 7 T x . This also means xz ≡ wH (c) (mod 2) since the respective pairs of logical X and Z must anti-commute. Note that T T wH (x ⊕ a) = wH (x) + wH (a) − 2xa ≡ wH (x) + wH (a) (mod 4) because xa ≡ 0 (mod 2) due to the fact that logical X operators must commute with Z-type stabilizers (and ıwH (a)E(0, a) ∈ S). (Recollect that E(x, 0) and E(0, a) commute if and T wH (x⊕a) wH (x) wH (a) only if their symplectic inner product h[x, 0], [0, a]is = xa = 0.) Hence, ı E(0, x⊕a) = ı E(0, x)·ı E(0, a), wH (x)−wH (c) and for the last equivalence to be true above, we need ı E(0, x ⊕ z) ∈ S. Therefore, wH (x) − wH (c) ≡ 0 (mod 2) for this matrix to be Hermitian. Observing the case when wH (c) = 1 and hence x = xi for some i ∈ {1, . . . , k}, this implies ⊥ T T ⊥ that wH (xi) must be odd. Moreover, since a ∈ C2 ⊂ C1 , we see that xia = 0, and zia = 0 because zi ∈ C2 . Also, for any

7 Lk Qk ¯ ci † † Using the notation x = i=1 cixi in the theorem statement, E(x ⊕ a, 0) = E(a, 0) i=1 Xi . Since (P ⊗ P )(X ⊗ X)(P ⊗ P ) = Y ⊗ Y = 2 ı (X ⊗ X)(Z ⊗ Z) physically, the calculation above translates this to the logical level to produce wH (c), where X¯i = E(xi, 0) and Z¯i = E(0, zi). 25

T ⊥ xj ∈ C1/C2 that is distinct from xi, we have zixj = 0 by deﬁnition, and since the above condition means xi ⊕ zi ∈ C1 , we T ¯ also have xixj = 0. So we can assume zi = xi to be the corresponding logical Z string for Xi as well, in which case the above condition becomes trivial and the required equivalence is satisﬁed. In this scenario, since xi ⊕ zi = 0 ⇒ E(0, xi ⊕ zi) = IN , we need wH (xi) ≡ 1 (mod 4) exactly as for a non-trivial stabilizer group we need to make sure that −IN ∈/ S. ⊗n The above ensures that P acts desirably and so E(x ⊕ a, x ⊕ a) indeed corresponds to the logical Y operator Y¯c Lk ⊗n corresponding to the given logical X string x = i=1 cixi ∈ C1/C2. Let us now examine T along similar lines by using the identity TXT † = e−ıπ/4YP . We require

† † ıπ ıπ ⊗n ⊗n ⊗n ⊗n − wH (x⊕a) x⊕a − wH (c) T X¯c T = T E(x ⊕ a, 0) T = e 4 E(x ⊕ a, x ⊕ a) P ≡ e 4 Y¯c P¯c, (141) ¯ Lk x⊕a where Pc denotes the logical phase gate corresponding to the given logical X string x = i=1 cixi, and the notation P means that the phase gate is applied to the qubits in the support of (x ⊕ a). We have veriﬁed above that the Y¯c condition is x⊕a x⊕a satisﬁed, and when P acts on X¯c = E(x ⊕ a, 0), it indeed acts desirably. Therefore, to verify P ≡ P¯c, we just need to x⊕a ensure that P acts like the logical identity on all E(y ⊕ b, 0) for y 6= x ∈ C1/C2 and any b ∈ C2. This part of the proof is identical to the corresponding arguments in the proof of Theorem 12, and so this proves that triorthogonality of G1 is a necessary condition. Finally, we observe that we need the condition wH (x ⊕ a) ≡ wH (c) (mod 8) for the above equivalence to hold, and this completes the proof of the theorem.

G. Proof of Corollary 15 The Bravyi-Haah construction allows for a Clifford correction after the transversal T gate in order to exactly realize logical transversal T . In order to prove the equivalence of their construction to Theorem 14, we need to show that, by requiring their Clifford correction to be trivial, we arrive at the same conditions as listed above. Since triorthogonality of G1 is a common constraint in both Theorem 14 and the Bravyi-Haah construction, we are left to verify that the Hamming weight condition above coincides with the condition for the Clifford correction to be trivial. Lk+k2 Let C2 be an [n, k2] code so that the number of rows in G2 is k2. Let y = i=1 diyi with di = ci, yi = xi for i = 1, . . . , k Q(d) and yi = ai for i > k, where ai are the rows of G2. The Clifford correction depends on the phase ı and is trivial when k+k X2 X Q(d) := Γidi − 2 Γijdidj ≡ 0 (mod 4), (142) i=1 i k, Substituting for Γi and 2Γij, we get

k k2 k+k2 X (wH (xi) − 1) X wH (aj) X c + d − d d (y yT ) ≡ 0 (mod 4) (143) i 2 k+j 2 i j i j i=1 j=1 i

k k2 k+k2 X X X T ⇔ ciwH (xi) − wH (c) + dk+jwH (aj) − 2 didj(yiyj ) ≡ 0 (mod 8) (144) i=1 j=1 i

⇔ wH (x ⊕ a) ≡ wH (c)(mod 8). (145)

T Lk Lk2 For the last step, using the fact that wH (x ⊕ a) = wH (x) + wH (a) − 2xa recursively for x = i=1 cixi, a = j=1 dk+jaj, it can be veriﬁed that

 k k  M M2 wH (x ⊕ a) = wH  cixi ⊕ dk+jaj (146) i=1 j=1 T k !  k  k !  k  M M2 M M2 = wH cixi + wH  dk+jaj − 2 cixi  dk+jaj (147) i=1 j=1 i=1 j=1

 k ! k !T  M M = c1wH (x1) + wH cixi − 2c1x1 cixi  i=2 i=2     T  k2 k2  M M  + dk+1wH (a1) + wH  dk+jaj − 2dk+1a1  dk+jaj  j=2 j=2 26

T k !  k+k  M M2 − 2 diyi  djyj (148) i=1 j=k+1

 k !  k  M M2 = c1wH (x1) + wH cixi + dk+1wH (a1) + wH  dk+jaj (149) i=2 j=2  T  T  T  k ! k+k2 k ! k+k2  M M M M  − 2 d1y1 diyi + dk+1yk+1  djyj + diyi  djyj  (150) i=2 j=k+2 i=1 j=k+1 . . (continue recursion for i = 2, . . . , k − 1 and j = k + 2, . . . , k + k2 − 1)

k k2 k+k2 X X X T = ciwH (xi) + dk+jwH (aj) − 2 didj(yiyj ). (151) i=1 j=1 i

H. Proof of Lemma 16 In this proof, we will see that the ideas used to calculate the coefﬁcients for the transversal T gate generalize to the (`) 1 0 1 0 transversal application of power-of-2 roots of T , namely τ := ` = with R = [ 1 ]. When this Z-rotation R 0 e2πı/2 0 ξ `−2 T is applied transversally, we have R = In as before for the T gate. Then from (15) we get φ(R, a, b, `) = (1 − 2 )aIna = `−2 ˜ ˜ `−2 `−2 (1 − 2 )wH (a) and R := R(R, a, `) = (1 + 2 )DaIn − (D1−aInDa + DaInD1−a + 2DaInDa ) = (1 + 2 )Da − (0 + `−2 0 + 2Da) = (2 − 1)Da. n First consider the case a 6= [1, 1,..., 1] and let x ∈ Z2 such that supp(x) * supp(a), so that there exists some j ∈ {1, . . . , n} satisfying aj = 0, xj = 1. Once again, let x˜ = [x1, x2, . . . , xj−1, xj+1, . . . , xn] and similarly for a,˜ v˜. Then we observe that

(`−1) 1 X vxT 2 vRv˜ T c `−2 = √ (−1) (ξ ) (152) (2 −1)Da,x n 2 n v∈Z2 1 X T `−2 T = √ (−1)vx ξ2(2 −1)va (153) n 2 n v∈Z2 1 X T `−1 T 1 X T `−1 T = √ (−1)v˜x˜ ξ(2 −2)˜va˜ − √ (−1)v˜x˜ ξ(2 −2)˜va˜ (154) n n 2 n−1 2 n−1 v˜∈F2 v˜∈F2 (vj =0) (vj =1) = 0. (155)

So we only need to consider coefﬁcients corresponding to x a. For the remaining calculations, it is convenient to note that 2`−1−2 −2 n ξ = −ξ . Once again, for the case x = [0, 0,..., 0] and arbitrary a ∈ Z2 , we observe that

−2 wH (a) 1 T 1 √ 1 − ξ (`−1) X −2 va n−wH (a) −2 wH (a) n c `−2 = √ (−ξ ) = √ 2 (1 − ξ ) = 2 . (156) (2 −1)Da,0 n n 2 n 2 2 v∈Z2

T For x 6= [0, 0,..., 0], and x a, let j ∈ supp(x) ⊆ supp(a) so that xj = aj = 1 and ejx = 1. Then for each v such that T T vx ≡ 0, we have (v ⊕ ej)x ≡ 1. So we calculate

(`−1) c `−2 (2 −1)Da,x 1 X T T = √ (−1)vx (−ξ−2)va (157) n 2 n v∈Z2 1 X T 1 X T = √ (−ξ−2)va − √ (−ξ−2)va (158) n n 2 n 2 n v∈Z2 v∈Z2 vxT ≡0 vxT ≡1 1 X h T T i = √ (−ξ−2)va − (−ξ−2)(v⊕ej )a (159) n 2 n v∈Z2 vxT ≡0 27

1 X h T T i = √ (−ξ−2)va − (−ξ−2)(v+ej −2(v∗ej ))a (160) n 2 n v∈Z2 vxT ≡0 1 X T = √ (−ξ−2)va 1 − (−ξ−2)aj −2vj aj (161) n 2 n v∈Z2 vxT ≡0 −2 2 (1 + ξ ) X T (1 + ξ ) X T = √ (−ξ−2)vã˜ + √ (−ξ−2) · (−ξ−2)vã˜ (162) 2n 2n v˜:v ˜x˜T ≡0 v˜:v ˜x˜T ≡1 (vj =0) (vj =1)  −2 1 + ξ 1 X −2 vã˜T  √ √ (−ξ ) if x = ej, j ∈ supp(a),  n−1  2 2 n−1  v˜∈  F2    = (163) −2  1 + ξ  1 X −2 vã˜T 1 X −2 vã˜T   √ √ (−ξ ) − √ (−ξ )  if wH (x) ≥ 2, x a  2  2n−1 2n−1    v˜∈ n−1 v˜∈ n−1   F2 F2 v˜x˜T ≡0 v˜x˜T ≡1  (`−1) −2 1 + ξ c `−2 if x = ej, j ∈ supp(a), = √ × (2 −1)Da˜ ,0 (`−1) (164) 2 c `−2 if wH (x) ≥ 2, x a. (2 −1)Da˜ ,x˜

When wH (x) = 1, i.e., x = ej, we calculate −2 √ −2 wH (a)−1 −2 √ −2 wH (a) (`−1) (1 + ξ ) n−1 1 − ξ 1 + ξ n 1 − ξ c `−2 = √ 2 = 2 . (165) (2 −1)Da,ej 2 2 1 − ξ−2 2 We can simplify the ﬁrst factor as −2 2 −2 2 2π 2π 2π 1 + ξ 1 − ξ 1 + ξ − ξ − 1 −2ı sin `−1 −2ı sin ` cos ` 2π · = = 2 = 2 2 = −ı cot . (166) 1 − ξ−2 1 − ξ2 1 − ξ−2 − ξ2 + 1 2π 2 2π 2` 2 1 − cos 2`−1 2 sin 2` Therefore, in general we have

 −2 wH (x) −2 wH (a)−wH (x)  1 + ξ p n−w (x) 1 − ξ (`−1)  √ 2 H if x a, c `−2 = 2 (167) (2 −1)Da,x 2 0 otherwise  2π wH (x) √ 1 − ξ−2 wH (a)  −ı cot 2n if x a, = 2` 2 (168) 0 otherwise. Using this expression, we proceed to calculate the action of τ (`) on an arbitrary Pauli operator E(a, b). In τ (`)E(a, b)(τ (`))† In In 1 T φ(In,a,b,`) X (`−1) −ax = √ ξ c `−2 ı E(a, b + aIn + x) (169) n (2 −1)Da,x 2 n x∈Z2 wH (x) −2 wH (a) 1 `−2 X 2π √ 1 − ξ = √ ξ(1−2 )wH (a) −ı cot 2n ı−wH (x)E(a, b + a + x) (170) n 2` 2 2 xa w (a) xxT ξ(1 − ξ−2) H X 2π = − cot E(a, b + a + x) (171) 2ı 2` xa T −1 wH (a) xx ξ − ξ X 2π T T = − cot (−1)a(b∗(a⊕x)) +ax E(a, b ⊕ (a ⊕ x)) (172) 2ı 2` xa w (a) T 2π ! H (a−y)(a−y) 2ı sin ` X 2π T = 2 cot (−1)a(b∗y) E(a, b ⊕ y) (173) 2ı 2` ya wH (a) wH (a) wH (y) 2π 2π X 2π T = sin cot tan (−1)a(b∗y) E(a, b ⊕ y) (174) 2` 2` 2` ya 28

wH (a) wH (y) 2π X 2π T = cos tan (−1)a(b∗y) E(a, b ⊕ y) (175) 2` 2` ya wH (y) 1 X 2π T = tan (−1)by E(a, b ⊕ y). (176) 2π wH (a) 2` sec 2` ya Note that we have used the fact that y := a ⊕ x = a − x since x a, and also that axT = xxT , ayT = yyT .

I. Proof of Theorem 17 As in the proof of Theorem 2, we need to determine necessary and sufﬁcient conditions for the equality r † 2 † (`) (`) 1 X (`) (`) τ ΠS τ = jτ E(aj, bj) τ (177) In In 2r In In j=1 r 2 wH (y) 1 2π T X j X bj y = tan (−1) E(aj, bj ⊕ y) (178) 2r 2π wH (aj ) 2` j=1 sec 2` yaj 2r 1 X = E(a , b ) (179) 2r j j j j=1

= ΠS. (180) 0 Note that for j 6= j , either aj 6= aj0 or at least bj 6= bj0 if aj = aj0 . By commutativity of stabilizers, for v1, v2 ∈ Zj, we need h[aj, bj ⊕ v1], [aj, bj ⊕ v2]is = 0 which implies aj(v1 ⊕ v2) = 0 (mod 2), or equivalently, wH (v1 ⊕ v2) = 0 (mod 2). Hence, all vectors in Zj must have even weight for any aj 6= 0. T (j) (j) (j) bj v Let v E(aj, bj ⊕ v) ∈ S for v ∈ Zj, for some v ∈ {±1}. Note that it is unclear if we need v = j(−1) always, as the expression above might suggest, and from the calculation in Theorem 2. In general, if we collect all the coefﬁcients for 2π wH (aj ) E(aj, bj), then dividing it by sec 2` must leave j alone in the numerator, i.e., wH (v) wH (aj ) X T 2π 2π + (j)(−1)(bj ⊕v)v tan = sec (181) j v 2` j 2` v∈Zj \{0} wH (v) wH (aj ) X T 2π 2π ⇒ (j)(−1)bj v tan = sec (since w (v) even). (182) v 2` j 2` H v∈Zj

Let vE(0, v) ∈ S for v ∈ Zj, for some aj, and v ∈ {±1}. Then we calculate T −aj v jE(aj, bj) · vE(0, v) = jvı E(aj, bj + v) (183) vvT = jvı E(aj, bj ⊕ v + 2(bj ∗ v)) (184) T T vv bj v = jvı (−1) E(aj, bj ⊕ v) (185) (j) =: v E(aj, bj ⊕ v). (186) T T (j) vv bj v T Hence, v = jvı (−1) , where wH (v) = vv is even. (Note that by the property of any non-trivial stabilizer, E(0, 0) = IN has sign +1 always, so 0 = 1.) Substituting this value back, we get the condition

wH (v) wH (aj ) X T 2π 2π ıvv tan = sec (187) v 2` 2` v∈Zj w (v) w (a ) X 2π H 2π H j ⇒ ı tan = sec , (188) v 2` 2` v∈Zj for some choices of v ∈ {±1}, except when v = 0 as commented above. This proves the ﬁrst condition in the theorem. Now we need to ensure that all terms E(aj, bj ⊕y) for y ∈ Wj get cancelled properly, for all aj, so that the code projector ΠS is indeed preserved. Observe that terms corresponding to the coset {bj ⊕ y : y aj} can only appear in the inner summations r (j) corresponding to those i ∈ {1, 2,..., 2 } where iE(ai, bi) = v E(aj, bj ⊕ v) with v ∈ Zj. Moreover, for every y ∈ Bj, (j) the Pauli term E(aj, bj ⊕ y) appears exactly once in the inner summation corresponding to v E(aj, bj ⊕ v). We just have to collect these coefﬁcients and set their sum to zero. Hence, for each y ∈ Bj we need

wH (v⊕y) X T 2π (j)(−1)(bj ⊕v)(v⊕y) tan = 0 (189) v 2` v∈Zj 29

wH (v⊕y) X T T T 2π ⇒ ıvv (−1)bj v +(bj ⊕v)(v⊕y) tan = 0 (190) j v 2` v∈Zj w (v⊕y) X 2π H ⇒ ı tan = 0. (191) v 2` v∈Zj T In the last equation, we can reduce the exponent wH (v ⊕ y) to just wH (v) − 2vy since the additional term wH (y) does not affect the equality to zero. This proves the second condition in the theorem.

J. Proof of Lemma 18

Pm m−i i P m−wH (v) wH (v) The weight enumerator of the code C is WC (x, y) = i=0 Aix y = v∈C x y , where Ai is the number 1 of codewords in C of Hamming weight i. The MacWilliams identities for a self-dual code are WC (x, y) = |C| WC (x+y, x−y), where |C| is the number of codewords in C. Then we observe that w (v) m X 2π H 2π ı tan = sec (192) 2` 2` v∈C w (v) m X 2π H 2π ⇒ ı tan cos = 1 (193) 2` 2` v∈C w (v) m−w (v) X 2π H 2π H ⇒ ı sin cos = 1 (194) 2` 2` v∈C 2π 2π ⇒ W cos , ı sin = 1 (195) C 2` 2` m−w (v) w (v) 1 X 2π 2π H 2π 2π H ⇒ cos + ı sin cos − ı sin = 1 (196) |C| 2` 2` 2` 2` v∈C m−2w (v) 1 X 2π 2π H ⇒ cos + ı sin = 1, (197) |C| 2` 2` v∈C −2πı 2π 2π where the last step follows from the fact that exp 2` = cos 2` − ı sin 2` . We note that wH (v) is even for all v ∈ C and observe three cases.

1) If wH (v) = m/2, then the term v contributes 1 to the sum. 2(m−2wH (v))π 2(m−2wH (v))π 2) If wH (v) < m/2, then the term v contributes cos 2` + ı sin 2` .

2(2wH (v)−m)π 2(2wH (v)−m)π 3) If wH (v) > m/2, then the term v contributes cos 2` − ı sin 2` . Since the all-1s vector is always present in a self-dual code, we pair the terms v and w = 1 ⊕ v such that wH (v) < m/2 2(m−2wH (v))π 2(m−2wH (v))π and wH (w) = m − wH (v). Hence, the term w contributes cos 2` − ı sin 2` . Therefore, we have the 1 P 2(m−2wH (v))π condition |C| v∈C cos 2` = 1 that is satisﬁed if and only if each term in the sum equals 1. Indeed, this happens ` if and only if each v ∈ C satisﬁes either wH (v) = m/2 or 2 divides (m − 2wH (v)).

K. Proof of Theorem 19

From the example above and (45), we realize that any f ∈ RM(r, m) corresponds to a vector u = vf · GC1/C2 ⊕ y · G2 ∈ C1 2πı ıπ : t in the CSS superposition for |vf iL. Deﬁne ζt = e and note that on any state |ui, transversal exp 2m/r Z maps

wH (u) |ui 7→ ζ m |ui . (198) 2 r

k2 By fixing vf we fix the degree r terms in f, and by sweeping over all y ∈ Z2 we exhaust all choices of degree at most (r − 1) terms in f, thereby examining all states |ui in the CSS superposition corresponding to |vf iL. For the logical operator m/r to be well-defined as a diagonal gate acting as per (48), we need to show that wH (u) (mod 2 ) depends only on the degree r terms in f. Thus, we are interested in ν2(wH (u)) for different u in a single coset of RM(r − 1, m) in RM(r, m). First, let us consider QRM(1, m) separately for simplicity. Here, if |ui = |00 ··· 0i, then for any w ∈ u + RM(0, m), ν2(wH (w)) = m and so |wi 7→ |wi. However, if u = ev(f) with deg(f) = 1, then for any w ∈ u + RM(0, m) = {u, u ⊕ 1}, w corresponds to a codimension-1 affine plane so that ν2(wH (w)) = m − 1, and so |wi 7→ − |wi. Hence, the logical diagonal unitary has diagonal entries (1, −1, −1,..., −1), which is equivalent to (−1, 1, 1,..., 1) up to a global phase of (−1). Thus, ıπ up to local corrections (i.e., a logical transversal X gate correction), transversal application of physical Z-rotation exp 2m/r Z implements a logical Cm−1Z gate. This captures the [[8, 3, 2]] code we discussed previously in Section III-D. 30

m m/r Now consider the more general case where r ≥ 2. We are interested in calculating wH (u) = 2 − N(f) (mod 2 ), where N(f) denotes the number of zeros of f over F2. Then, following the proofs of Ax’s theorem [41], [22], [42], note that 1 X N(f) = µ(x f(x , . . . , x )) (199) 2 0 1 m m+1 x=(x0,x1,...,xm)∈F2   1 X X = µ x a xd (200) 2  0 d  m+1 d∈ m x∈F2 F2 1 X Y = µ x a xd (201) 2 0 d m+1 d∈ m x∈F2 F2 1 X Y = 1 − 2x a xd . (202) 2 0 d m+1 d∈ m x∈F2 F2

Deﬁne the function t on F2 by t(0) = 1, t(1) = −2. Then distributing the product, we can express N(f) as 1 X X Y i(d) N(f) = t(i(d))a (x xd)i(d) , (203) 2 d 0 i m+1 d∈ m x∈F2 F2 m where the summation over indicators i runs over all Boolean functions i: F2 → F2. We want to calculate ν2(wH (u)), where m u = ev(f), and we observe that ν2(wH (u)) = ν2(2 − N(f)) = ν2(N(f)). Hence, we are interested in the 2-adic valuation of N(f), so we group terms from this sum into products and rewrite this as       1 X Y i(d) Y X Y i(d) N(f) = a t(i(d)) x xd . (204) 2  d     0  i d∈ m d∈ m m+1 d∈ m F2 F2 x∈F2 F2 Observe that for each function i, the ﬁrst product is binary, and the remaining two terms are each powers of 2, so the whole term is a power of 2 and has a 2-adic valuation. The last term is a power of 2 because it is precisely the Hamming weight of Q di(d) the monomial m x0x . d∈F2 hQ i hP Q d i(d)i Now we are interested in the quantity ν2 m t(i(d)) m+1 (x0x ) , since the smallest value among all d∈F2 x∈F2 d indicating functions i will determine ν2(N(f)). Observe that when i is the zero function, this quantity takes the maximal value m+1 of 2 , and hence does not affect ν2(N(f)). When i is not the zero function, we can calculate     X Y di(d) X ν2  x0x  = m − wH  i(d)d and (205) m+1 d∈ m d∈ m x∈F2 F2 F2   Y X ν2  t(i(d)) = i(d). (206) m m d∈F2 d∈F2

hQ i hP Q di(d)i P P So we conclude that ν2 m t(i(d)) m+1 m x0x = m − wH m i(d)d + m i(d). Now, d∈F2 x∈F2 d∈F2 d∈F2 d∈F2 because f ∈ RM(r, m), ad in the first term of each i in N(f) ensures that only terms with deg(d) ≤ r survive. Hence, we P P have wH m i(d)d ≤ r m i(d), with equality occurring only when all d with i(d) = 1 are disjoint degree r terms, d∈F2 d∈F2 i.e., weight r vectors. From this, we can conclude         Y X Y i(d) X 1 X ν t(i(d)) x xd ≥ m − w i(d)d + w i(d)d (207) 2    0  H   r H   d∈ m m+1 d∈ m d∈ m d∈ m F2 x∈F2 F2 F2 F2 m ≥ . (208) r t m 1 P The second inequality holds because m − t + − = (m − t)(1 − ) ≥ 0, since t = wH m i(d)d ≤ m and r ≥ 2. r r r d∈F2 P P Furthermore, because r|m, we have equality if and only if wH m i(d)d = m and m i(d) = m/r. In other d∈F2 d∈F2 words, the 2-adic valuation of N(f) is solely determined by those functions i for which exactly m/r disjoint terms d, each of weight r, have i(d) = 1. Put together, these conditions exactly define the coefficient products appearing in q(f) in (49). Let P 0 denote the set of all such i satisfying these conditions, so that this set has a bijective mapping to the set P defined in the theorem statement. Then returning to N(f), and noting that only those terms i which contribute 2m/r matter, we see that m wH (u) = 2 − N(f) (209) 31

m ≡ −N(f) (mod 2 r ) (210) m X Y i(d) m r −1 r ≡ 2 ad (mod 2 ) (211) 0 m i∈P d∈F2 m/r m X Y r −1 = 2 vpj (212)

(p1,...,pm/r )∈P j=1 m −1 m = 2 r q(f) (mod 2 r ), (213)

0 by construction of P and P . Here, u determines which ad = 1, or equivalently which vpj = 1 (u ↔ vf ), since u = ev(f) points to a specific coset of RM(r − 1, m) in RM(r, m). As q(f) is oblivious to lower-order terms in f (that correspond to m X-type stabilizers), each coset indeed has a well-defined weight residue (mod 2 r ), and thus the induced logical operation is also well-defined. Accordingly, by (198), the logical action on (logical) computational basis vectors is defined by

2m/r−1q(f) |vf iL 7→ ζ2m/r |vf iL = µ(q(f)) |vf iL . (214) This completes the proof.