
Conditions for the existence of quantum error correction codes

by

Daniel Mark Lucas

A dissertation submitted to the graduate faculty

in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Major: Mathematics

Program of Study Committee:

Yiu Tung Poon, Major Professor
Domenico D’Alessandro
Leslie Hogben
Sung Yell Song
Eric Weber

The student author, whose presentation of the scholarship herein was approved by the program of study committee, is solely responsible for the content of this dissertation. The Graduate College will ensure this dissertation is globally accessible and will not permit alterations after a degree is conferred.

Iowa State University

Ames, Iowa

2020

Copyright © Daniel Mark Lucas, 2020. All rights reserved.

TABLE OF CONTENTS

ABSTRACT

CHAPTER 1. INTRODUCTION

CHAPTER 2. BACKGROUND
2.1 Basics of Linear Algebra
2.2 Algebraic Geometry
2.3 Classical Error Correction
2.4 Postulates of Quantum Mechanics

CHAPTER 3. QUANTUM ERROR CORRECTION
3.1 Quantum Confusability Graph and Zero Error Capacity
3.2 Quantum Error Correcting Codes

CHAPTER 4. NUMERICAL RANGE GENERALIZATIONS
4.1 Conditions for the Non-Emptiness of Λk(A1, ..., Am)
4.2 Geometry of Λk(A1, ..., Am)

CHAPTER 5. FUTURE WORK

REFERENCES

ABSTRACT

In this dissertation, we consider quantum errors, represented by quantum channels defined on the space of n × n complex matrices, Mn. We begin with a quantum channel T defined on Mn. The Kraus operators of this channel can be used to generate an operator system S, which is guaranteed to have a basis of Hermitian matrices A1, ..., Am in Mn. Given k, m ≥ 1, we find a lower bound on n above which there is always an n × k matrix U with orthonormal columns such that U∗A1U, ..., U∗AmU are diagonal k × k matrices. We then extend this to a lower bound on n above which there is always a k-dimensional quantum error correction code for any quantum channel T on Mn with an associated operator system S with dim(S) ≤ m + 1. Such a bound is equivalent to a lower bound on n above which the joint rank-k numerical range of any m Hermitian matrices in Mn is non-empty. So, we further extend our result to a lower bound on n above which the rank-k numerical range of any m Hermitian matrices in Mn is star-shaped.

CHAPTER 1. INTRODUCTION

The latter half of the 20th century saw both the invention and the explosive growth of computing technology. This had such a profound impact on the world that the period from then to the present has been dubbed the Information Age. Electronic computers were first invented in the 1940s, and at the time were incredibly expensive, took up rooms full of space, and had barely a fraction of the power of today’s scientific calculators. They were also very prone to errors, and were widely considered nothing more than an interesting theoretical idea. However, the invention of the integrated circuit in 1959 and then the microprocessor in 1971 allowed computers to develop from slow, expensive behemoths only owned by government agencies, major universities and research companies to comparatively fast, cheap, personal computers the size of a briefcase.

Even then, computers likely wouldn’t have had the impact they’ve had if not for the invention of the internet in the ’60s and ’70s. The development of secure, widely accessible and reliably error-free electronic communication changed computers from an information storage and processing device to one of our primary means of communication and a method of engaging in instantaneous transactions. While the idea of packing a transmission with additional data to allow for the detection and correction of errors is initially simple, the discovery of more efficient ways of achieving this has drastically improved transmission rates. Even in the last few years, there has been a leap forward in transmission rates that was only possible because of the application of drastically more efficient error correction algorithms.

As we developed faster and more efficient computing technology and algorithms, we began to discover more and more problems that classical computers are incapable of solving efficiently. A well-known example of one of these problems is the problem of factoring the product of large primes. For two large enough primes a, b, the product ab can take months to factor, even for some of the most powerful supercomputers in existence. Through the use of a quantum principle called superposition, quantum computers can test multiple inputs simultaneously, which allows us to solve some problems like this in a matter of minutes or even seconds. However, just as in the early development of classical computers, error correction is a serious hurdle that must be crossed. Quantum states are very sensitive to errors in the hardware, and are particularly vulnerable to interference from outside the system.

In addition, the development of quantum computing will actually give rise to new problems. The reason the example problem of factoring the product of large primes is so well-known is that much of our internet security depends on the difficulty of this problem. The development of privately available quantum or hybrid quantum-classical computers will necessitate a restructuring of security protocols for online interactions. While there are quantum-proof security protocols in classical computing, there is also a phenomenon in quantum mechanics called the wave function collapse that allows for the creation of relatively simple protocols that are theoretically unbreakable. This makes the development of reliable quantum communication an important endeavor, and so there is a looming need for more efficient and robust quantum error correction algorithms. However, some of the same properties of quantum mechanics that make quantum computing such a promising technology also give rise to serious difficulties when it comes to correcting quantum communication errors. The wave function collapse means we have to be able to correct any errors without directly measuring the state that was sent through the communication channel, and because of the superposition principle, errors are continuous linear functions on a vector space, rather than discrete functions on a finite field. While these are difficult problems to overcome, there have been exciting developments in this field over the last several decades.

In this dissertation, we will discuss the development of the field of quantum error correction from the late 20th century to the present. We start with a description of classical error correction, then give a brief description of the basic principles of quantum mechanics. One of these principles, called entanglement, allows us to obtain some information about a quantum state without directly measuring it. We describe how this fact allows us to import some of the algorithms from classical error correction and adjust them for use in quantum error correction. We describe a method of quantum error correction that can be done without measurement at all, provided we know what errors might occur in transmission and the probability of each error occurring. Finally, we describe the theoretical work that has been done to determine general conditions on a quantum communication channel under which the existence of an error correction code is guaranteed.

CHAPTER 2. BACKGROUND

2.1 Basics of Linear Algebra

There are a variety of ways that quantum states can be represented. This dissertation approaches the topic primarily through the lens of linear algebra, and we introduce here some of the basic definitions and concepts that we will use later.

An inner product space V is an n-dimensional vector space over a field F equipped with a function ⟨·, ·⟩ : V × V → F s.t.

1. For x, y ∈ V, ⟨x, y⟩ is the complex conjugate of ⟨y, x⟩

2. For x, y, z ∈ V, a ∈ F, a(⟨x, z⟩ + ⟨y, z⟩) = ⟨a(x + y), z⟩

3. For all x ∈ V, ⟨x, x⟩ ≥ 0 and ⟨x, x⟩ = 0 iff. x = 0

Such a function is called an inner product on V. For H = Cn, we define the inner product as ⟨v, w⟩ = w∗v = Σ_{i=1}^n w̄_i v_i. The inner product induces a norm ||·|| = √⟨·, ·⟩ on V. If V is complete with respect to the metric induced by this norm, it is called a Hilbert space. For a Hilbert space H, we denote by M(H) the space of linear functions A : H → H. When H = Cn, we may write Mn, instead. For A ∈ M(H), the Hermitian adjoint of A is the function A∗ ∈ M(H) s.t. ⟨Av, w⟩ = ⟨v, A∗w⟩ for all v, w ∈ H. For H = Cn, this can be explicitly defined as the conjugate transpose of A.

Let A ∈ Mn. We call A normal if A∗A = AA∗. We call A Hermitian if A∗ = A, and we denote the space of all n × n Hermitian matrices as Hn. We call A positive semidefinite, denoted A ≥ 0, if ⟨Av, v⟩ ≥ 0 for all v ∈ Cn. We call A unitary if A∗A = I, where I denotes the identity function on Cn, and we denote the set of all unitary n × n matrices Un. We call A scalar if A = λI for some λ ∈ C and diagonal if ai,j = 0 whenever i ≠ j, where ai,j is the (i, j)-th entry of A. If A is diagonal, we often describe it using the entries of its main diagonal as A = diag(a1,1, ..., an,n). We denote the space of all n × n diagonal matrices as Dn. The following is a well-known and very useful theorem about Hermitian and positive semidefinite matrices called the Spectral Decomposition Theorem:

Theorem 2.1. Let A ∈ Mn. Then A is normal if and only if there is a unitary matrix U ∈ Mn such that

U∗AU = diag(d1,1, ..., dn,n) for some d1,1, ..., dn,n ∈ C

A is Hermitian if and only if it is normal and d1,1, ..., dn,n ∈ R, and A is positive semidefinite if and only if it is Hermitian and di,i ≥ 0 for 1 ≤ i ≤ n.
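As a quick numerical illustration of the Hermitian case of Theorem 2.1, the following Python/numpy sketch (an added example, with a randomly generated matrix) uses numpy.linalg.eigh, which returns real eigenvalues and a unitary U whose columns are eigenvectors:

    import numpy as np

    n = 4
    # Build a random Hermitian matrix A = (B + B*)/2.
    B = np.random.randn(n, n) + 1j * np.random.randn(n, n)
    A = (B + B.conj().T) / 2

    # eigh returns real eigenvalues d and a unitary U of eigenvectors.
    d, U = np.linalg.eigh(A)

    # U* A U should equal diag(d_{1,1}, ..., d_{n,n}) up to rounding error.
    D = U.conj().T @ A @ U
    assert np.allclose(D, np.diag(d), atol=1e-10)
    assert np.allclose(np.imag(d), 0)      # Hermitian matrices have real spectra
    print("U*AU is diagonal with real entries:", np.round(d, 3))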

More generally, we can define a matrix space Cm×n of linear maps A : Cn → Cm. This space is itself a Hilbert space, but to define the inner product, we need to define the trace of a square matrix A ∈ Mn.

Definition 2.2. Let A ∈ Mn. Then tr(A) = Σ_{i=1}^n ai,i.

We can then define the inner product of the matrix space Cm×n as ⟨A, B⟩ = tr(B∗A) for A, B ∈ Cm×n. The trace also has several useful properties that may come into play later.

• For B, C ∈ Mn, a ∈ C, tr(aB + C) = a · tr(B) + tr(C).

• For two matrices A, B s.t. AB ∈ Mn, tr(AB) = tr(BA).

• For A ∈ Mn, tr(A∗) is the complex conjugate of tr(A).

Let P ∈ Hn be s.t. P² = P. Then P is called an orthogonal projection, and there is a k ≤ n and a U ∈ Cn×k such that U∗U = Ik, where Ik is the identity function on Ck, and P = UU∗. Conversely, given any k-dimensional subspace K ⊂ Cn, there is an orthogonal projection P ∈ Mn such that Pv ∈ K for every v ∈ Cn and for every w ∈ K, there is a v ∈ Cn such that Pv = w.

For two matrices A ∈ Cm×n,B ∈ Cp×q, the tensor product of A and B is the block matrix A ⊗ B = (ai,jB)i,j. We can then define the tensor product of two matrix spaces,

Cm×n ⊗ Cp×q, as the space of all linear combinations of elements of the form A ⊗ B for A ∈ Cm×n,B ∈ Cp×q. It is common practice to call the Kronecker product the tensor product for the sake of simplicity, rather than maintaining the distinction. This product has some very useful properties:

• For A, B ∈ Cm×n,C ∈ Cp×q,

(A ⊗ C) + (B ⊗ C) = (A + B) ⊗ C and (C ⊗ A) + (C ⊗ B) = C ⊗ (A + B).

• For A ∈ Cm×n,B ∈ Cp×q, c ∈ C, c(A ⊗ B) = (cA) ⊗ B = A ⊗ (cB).

• For A ∈ Cm×n,B ∈ Cn×m,C ∈ Cp×q,D ∈ Cq×p,(A ⊗ C)(B ⊗ D) = (AB) ⊗ (CD).

• For A ∈ Cm×n,B ∈ Cp×q,(A ⊗ B)∗ = (A∗) ⊗ (B∗).
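The following short numpy check (an added illustration with randomly chosen matrices of compatible sizes) verifies the mixed-product property and the adjoint property listed above:

    import numpy as np

    m, n, p, q = 2, 3, 2, 4
    A = np.random.randn(m, n) + 1j * np.random.randn(m, n)   # A in C^{m x n}
    B = np.random.randn(n, m) + 1j * np.random.randn(n, m)   # B in C^{n x m}
    C = np.random.randn(p, q) + 1j * np.random.randn(p, q)   # C in C^{p x q}
    D = np.random.randn(q, p) + 1j * np.random.randn(q, p)   # D in C^{q x p}

    # Mixed-product property: (A ⊗ C)(B ⊗ D) = (AB) ⊗ (CD)
    lhs = np.kron(A, C) @ np.kron(B, D)
    rhs = np.kron(A @ B, C @ D)
    assert np.allclose(lhs, rhs)

    # Adjoint property: (A ⊗ C)* = A* ⊗ C*
    assert np.allclose(np.kron(A, C).conj().T,
                       np.kron(A.conj().T, C.conj().T))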

2.2 Algebraic Geometry

For our main result, we use some theorems from the field of algebraic geometry and so we introduce some of the basic concepts and theorems here. All of these definitions and theorems come from [13].

Definition 2.3. Let V be a vector space of dimension n+1 over C. The set of 1-dimensional subspaces of V is called the n-dimensional projective space over C, and is denoted Pn. 7

An element of Pn can be described using any non-zero vector in the subspace. As such, for v ∈ Pn, λ ∈ C \ {0}, λ · v = v. Because of this particular quirk of projective space, homogeneous functions play a very important role in algebraic geometry, as we will see soon.

Definition 2.4. Let V be an n-dimensional vector space over a field F. A function f defined on V is homogeneous if there is a d ∈ N such that for every a ∈ F,

f(ax1, ax2, ..., axn) = a^d f(x1, ..., xn)

Then d is called the degree of f.

We call a homogeneous polynomial f : Cn+1 → C of degree d a form on Pn. This is equivalent to the condition that every term of f has total degree d. We can now define a projective variety.

Definition 2.5. A subset X ⊂ Pn is a projective variety if there is a finite set of forms f1, ..., fk on Pn such that

X = {v ∈ Pn : f1(v) = ··· = fk(v) = 0}

These definitions extend intuitively to the direct product of two projective spaces Pm × Pn. A polynomial f : Cm+1 × Cn+1 → C is a form on Pm × Pn if there are some p, q ∈ N s.t. for any (u, v) ∈ Cm+1 × Cn+1 and μ, λ ∈ C, f(λu, μv) = λ^p μ^q f(u, v). That is, f is homogeneous in each set of indeterminates independently. Then a subset X ⊂ Pm × Pn is a projective variety if there is some set of forms {f1, ..., fk} on Pm × Pn s.t.

X = {(v, w) ∈ Pm × Pn : f1(v, w) = ··· = fk(v, w) = 0}

For any m, n ∈ N, there is an embedding called the Segre embedding that maps Pm × Pn isomorphically to a projective variety in P(m+1)(n+1)−1. The proof that this embedding is an isomorphism of projective varieties is on page 55 of [13], and relies on a lot of groundwork that is laid earlier in the book, and so we will not include it here. However, we will describe the map itself.

Let N = (m+1)(n+1)−1. For w ∈ PN, index the entries of w as wi,j, where i = 0, ..., m and j = 0, ..., n. Then the Segre embedding is defined by the map ϕ : Pm × Pn → PN, where ϕ(u, v) = (ui vj)i,j. So, Pm × Pn can be thought of as a projective variety of P(m+1)(n+1)−1. Now, for the main result we use in this dissertation, we will need a definition of the dimension of a projective variety, which requires some setup.
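As a concrete sketch (an added illustration, not part of [13]), the Segre map applied to a pair of coordinate vectors is simply the flattened outer product (ui vj)i,j, and rescaling either vector returns the same projective point:

    import numpy as np

    u = np.array([1.0, 2.0])          # homogeneous coordinates of a point in P^1
    v = np.array([3.0, 0.0, -1.0])    # homogeneous coordinates of a point in P^2

    # Segre embedding: phi(u, v) = (u_i v_j)_{i,j}, a point of P^{(1+1)(2+1)-1} = P^5.
    w = np.outer(u, v).flatten()
    print(w)                          # [ 3.  0. -1.  6.  0. -2.]

    # Rescaling u and v only rescales w, so the same point of P^5 is obtained.
    w2 = np.outer(5 * u, 2 * v).flatten()
    assert np.linalg.matrix_rank(np.vstack([w, w2])) == 1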

Definition 2.6. Given a projective variety X ⊂ Pn, a subset Y ⊂ X is a subvariety of X if it is a projective variety.

Definition 2.7. A projective variety X ⊂ Pn is reducible if there are two proper subvarieties X1, X2 ⊊ X such that X1 ∪ X2 = X. It is irreducible otherwise.

Definition 2.8. The dimension of a projective variety X ⊂ Pn is the largest integer d such that there is a strictly decreasing chain Y0 ⊋ ··· ⊋ Yd ⊋ ∅ of length d of irreducible projective subvarieties of X.

It is quite straightforward to build this sequence of projective subvarieties for Pm × Pn. Let Y0 = Pm × Pn, for 1 ≤ k ≤ m, let Yk = {(v, w) ∈ Pm × Pn : v1 = ··· = vk = 0}, and for m + 1 ≤ k ≤ m + n, let Yk = {(v, w) ∈ Pm × Pn : v1 = ··· = vm = w1 = ··· = wk−m = 0}. So, the dimension of Pm × Pn is m + n.

This last proposition from [13] is the goal of this section. As with the Segre embedding, the proof of this proposition relies on the groundwork that is laid earlier in the book, and so we do not include it. However, this is a specific application of a very well-known theorem from algebraic geometry called Bezout’s Theorem.

Proposition 2.9. If r ≤ n, then r forms have a common zero on an n-dimensional projective variety.

2.3 Classical Error Correction

The basic problem of classical error correction is fairly straightforward. One party, commonly referred to as Alice, transmits a package of n bits to another party, commonly referred to as Bob. Somewhere along the communication line, which is called a noisy channel, one or more errors occur, and as a result, the package Bob receives is not the same as the package Alice sent. In this scenario, we then have two alphabets, or sets. One of possible n-bit packages Alice could send, {x1, ..., xN} = X, and one of possible packages Bob could receive, {y1, ..., yM} = Y. We can then describe the noisy channel using the column stochastic matrix N = (p(yi|xj))i,j, where p(yi|xj) is the probability that Bob receives yi, given that Alice sent xj. The problem of error correction is to find some subset S ⊂ X such that if Alice sends some xi ∈ S, Bob can determine with an acceptably low probability of error which xi Alice sent.

To achieve this, we need to construct the confusability graph of the channel. A graph (V, E) is defined as a set of vertices V connected by a set of edges E ⊆ V × V such that (v, v) ∉ E for all v ∈ V and if (v, w) ∈ E, (w, v) ∈ E. We can construct the confusability graph of N, GN, by letting X be the set of vertices and defining the edge set as

E := {(xi, xj) ∈ X × X : i ≠ j, ∃ ℓ ∈ {1, ..., M} s.t. p(yℓ|xi) · p(yℓ|xj) > 0}

While we define the edge set this way to allow for a more concise explanation, in practical application, there are sometimes errors that are possible, but exceedingly unlikely, so it can be more practical to begin by artificially setting p(yℓ|xi) = 0 if p(yℓ|xi) < r for some probability threshold r ≥ 0. We then define the edge set using the simpler matrix. Then clearly, if (xi, xj) ∉ E and Bob receives some yℓ ∈ Y such that p(yℓ|xi) > 0, then Bob can assume that Alice did not send xj. If S ⊂ X is such that E ∩ (S × S) = ∅, we call S an independent set. Then, if Alice sends some xi ∈ S, Bob can determine without error which xi Alice sent. One useful tool in finding such independent sets is the adjacency matrix of the confusability graph, which is defined as

(MN)i,j = 1 if (xi, xj) ∈ E, and (MN)i,j = 0 if (xi, xj) ∉ E

Then, for a set S = {xi ∈ X : i ∈ α}, α ⊂ {1, ..., |X|}, S is independent if and only if the submatrix B[α] = ((MN)i,j)i,j∈α is a zero matrix.

Example. A simple, though admittedly inefficient, example of an error correcting algorithm involves a 3-bit noisy channel in which it is assumed that at most one bit will be corrupted.

The vertex set is X = {x0, x1, ..., x7}, where xi represents the number i for 0 ≤ i ≤ 7. Since at most one error can occur, two numbers can be confused if they differ by two or fewer bits. For example, 000 and 011, after being sent through the noisy channel, both have a chance of becoming the numbers 010 or 001. We call the number of bits that differ between two numbers the Hamming distance, denoted dH(x, y). So, the confusability graph of this noisy channel is GN = (X, E), where E = {(x, y) ∈ X × X : dH(x, y) ≤ 2, x ≠ y}.

We then have the adjacency matrix

MN =
[ 0 1 1 1 1 1 1 0 ]
[ 1 0 1 1 1 1 0 1 ]
[ 1 1 0 1 1 0 1 1 ]
[ 1 1 1 0 0 1 1 1 ]
[ 1 1 1 0 0 1 1 1 ]
[ 1 1 0 1 1 0 1 1 ]
[ 1 0 1 1 1 1 0 1 ]
[ 0 1 1 1 1 1 1 0 ]

Clearly, the only independent sets with more than one element are {x0, x7}, {x1, x6}, {x2, x5}, {x3, x4}. Any of these sets can be used to design an error correcting algorithm, but the most intuitive method is to use

x0 and x7, which represent the numbers 000 and 111. Then, whatever bit Alice wants to send, she simply triples it and sends it through the noisy channel. Whatever Bob receives, he chooses the value that is in the majority, and if the assumption that at most one bit is corrupted holds true, that is the value that Alice wanted to send.
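A short Python check (an added illustration of the example above) builds the Hamming-distance confusability graph and confirms that its largest independent sets are exactly the four listed pairs:

    import itertools

    # Vertices are the 3-bit numbers 0..7; an edge joins x != y when d_H(x, y) <= 2.
    def hamming(x, y):
        return bin(x ^ y).count("1")

    V = range(8)
    adj = [[1 if x != y and hamming(x, y) <= 2 else 0 for y in V] for x in V]

    def independent(S):
        return all(adj[x][y] == 0 for x, y in itertools.combinations(S, 2))

    # Brute-force search over all subsets (feasible for 8 vertices).
    best = max((S for r in range(1, 9) for S in itertools.combinations(V, r)
                if independent(S)), key=len)
    maximum = [S for S in itertools.combinations(V, len(best)) if independent(S)]
    print(maximum)   # [(0, 7), (1, 6), (2, 5), (3, 4)]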

Of course, as soon as such an algorithm is found, the question of optimization immediately arises. Given two finite sets X and Y and a noisy channel N : X → Y, what is the largest cardinality we can achieve for an independent set S ⊂ X? This value is called the zero error capacity.

Definition 2.10. Let N : X → Y be a noisy channel between two finite alphabets X and

Y , and let GN = (X,E) be the confusability graph of N . Then the zero error capacity of N is

α(N) = max{|S| : S ⊂ X, E ∩ (S × S) = ∅}

This value is certainly important, but note that part of the definition of a noisy channel involves a specific number of bits being transmitted, and only some of those bits can actually contain meaningful information. If Alice needs to transmit more bits than the channel can reliably support in a single transmission, she will need to send more than one package. This changes the way we mathematically represent the channel. Let N be a noisy channel through which Alice sends n-bit numbers from an alphabet X and Bob receives numbers

from an alphabet Y , and let GN = (X,E) be the confusability graph of the channel. Suppose Alice sends two n-bit numbers through the channel. Assuming the channel is the same when Alice transmits each number and that errors occur independently, we have a new channel

N × N : X × X → Y × Y. There are then three ways that two elements (x1, x2), (x1′, x2′) ∈ X × X could be confused. Either (x1, x1′), (x2, x2′) ∈ E, or (x1, x1′) ∈ E and x2 = x2′, or x1 = x1′ and (x2, x2′) ∈ E.

To describe the confusability graph of N × N, we will need a few more tools from graph theory. First, let G = (V, E) be a graph, and let x, y ∈ V. We write x ≃ y if (x, y) ∈ E or x = y. We can now define the strong product of two graphs.

Definition 2.11. Let G = (V1, E1), H = (V2, E2) be graphs. Let V = V1 × V2 and define E ⊂ V × V as

E := {((a, b), (c, d)) ∈ V × V : a ≃ c, b ≃ d, (a, b) ≠ (c, d)}

Then the strong product of G and H is G ⊠ H = (V, E).

Clearly, GN×N = GN ⊠ GN. In a similar way, we can extend this to k uses of the noisy channel, and then the channel is N^k with the confusability graph GN ⊠ ··· ⊠ GN (k factors).
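In matrix form the strong product is easy to compute; the following sketch (an added illustration, using the standard identity for 0/1 adjacency matrices) builds the adjacency matrix of G ⊠ H as (AG + I) ⊗ (AH + I) − I:

    import numpy as np

    def strong_product(AG, AH):
        # (a, b) ~ (c, d) iff a ≃ c, b ≃ d and (a, b) != (c, d),
        # where x ≃ y means x = y or x ~ y.
        n, m = AG.shape[0], AH.shape[0]
        return np.kron(AG + np.eye(n), AH + np.eye(m)) - np.eye(n * m)

    # Example: G = H = a single edge on two vertices (K2); the strong product is K4.
    K2 = np.array([[0, 1], [1, 0]])
    print(strong_product(K2, K2).astype(int))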

2.4 Postulates of Quantum Mechanics

In order to describe error correction in quantum communication, we need a basic description of quantum mechanics, how information is encoded in quantum states, and how to represent an error in quantum communication. The field of quantum mechanics is based on a set of axiomatic postulates. These are described in a variety of different ways, each of which represents a particular interpretation of the physical facts that have been observed through experimentation. The following postulates are taken from [2], and they represent one interpretation of quantum mechanics that is particularly convenient for use in the field of quantum information theory.

Since we are introducing this in the context of quantum information theory, we will use what is called Bra-ket notation in this section. In this notation, standard basis vectors in Cn are written as |0⟩, ..., |n − 1⟩, the conjugate transpose of a vector |x⟩ ∈ Cn is written as ⟨x|, and for two vectors |x⟩, |y⟩ ∈ Cn, the inner product is written ⟨y|x⟩ and the tensor product can either be written |x⟩|y⟩ or |xy⟩.

Postulate 2.1. Associated to any isolated physical system is a Hilbert space H known as the state space of the system. The system is completely described by its state vector |v⟩ ∈ H such that ||v|| = 1.

The simplest quantum mechanical system, and the system which we will be most concerned with, is the qubit. A qubit has a two-dimensional state space. Suppose |0⟩ and |1⟩ form an orthonormal basis for that state space. Then an arbitrary state vector in the state space can be written

|ψ⟩ = a|0⟩ + b|1⟩ for some a, b ∈ C such that |a|² + |b|² = 1

In quantum information theory, we usually work using qubits for a variety of reasons, one of which is that it allows us to import many of the algorithms from classical computing with fewer adjustments. Intuitively, the states |0⟩ and |1⟩ are analogous to the two values 0 and 1 which a bit may take. The way a qubit differs from a bit is that superpositions of these two states, of the form a|0⟩ + b|1⟩, can also exist, in which it is not possible to say that the qubit is definitely in the state |0⟩, or definitely in the state |1⟩.

For example, when a photon hits a half-silvered mirror, it can either be reflected or pass through, and we can represent these two states using the standard basis vectors of C2, e1 and e2, respectively. Then the principle of superposition states that any unit vector in C2 also describes a possible state of the photon. In other words, it is possible that the photon passes through and is reflected, and therefore occupies two different physical locations at the same time.

Postulate 2.2. The evolution of a closed quantum system (a system that is protected from uncontrolled outside forces) is described by a unitary transformation. That is, the state |ψ⟩ of the system at time t1 is related to the state |ψ′⟩ of the system at time t2 by a unitary operator U which depends only on the times t1 and t2,

|ψ′⟩ = U|ψ⟩

One of the more mysterious aspects of quantum mechanics is the fact that simply measuring a quantum state actually changes it. This effect is called the wave function collapse, and it is described in the next postulate.

Postulate 2.3. Quantum measurements are described by a collection {Am} ⊂ Mn of measurement operators, called a measurement system. These are operators acting on the state space of the system being measured. The index m refers to the measurement outcomes that may occur in the experiment. If the state of the quantum system is |ψ⟩ immediately before the measurement then the probability that result m occurs is

p(m) = ⟨ψ|Am∗Am|ψ⟩,

and the state of the system after the measurement is

Am|ψ⟩ / √(⟨ψ|Am∗Am|ψ⟩)

The measurement operators satisfy the completeness equation,

Σ_m Am∗Am = I

A simple example of a measurement system is M0 = |0⟩⟨0|, M1 = |1⟩⟨1|. Then, given a state |ψ⟩ = a|0⟩ + b|1⟩, the probability that the measurement outcome is 0 is |a|², and the probability that it is 1 is |b|².
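A minimal numpy sketch of this example (an added illustration, with arbitrarily chosen amplitudes) computes the outcome probabilities and the post-measurement state:

    import numpy as np

    a, b = 0.6, 0.8j                     # any amplitudes with |a|^2 + |b|^2 = 1
    psi = np.array([a, b])               # |psi> = a|0> + b|1>

    M0 = np.array([[1, 0], [0, 0]])      # |0><0|
    M1 = np.array([[0, 0], [0, 1]])      # |1><1|

    # Completeness: M0* M0 + M1* M1 = I
    assert np.allclose(M0.conj().T @ M0 + M1.conj().T @ M1, np.eye(2))

    p0 = np.vdot(psi, M0.conj().T @ M0 @ psi).real   # = |a|^2
    p1 = np.vdot(psi, M1.conj().T @ M1 @ psi).real   # = |b|^2
    post0 = M0 @ psi / np.sqrt(p0)                   # state after outcome 0, here |0>
    print(p0, p1, post0)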

This brings us to another interesting problem in quantum mechanics, that of distinguishing between different quantum states. A set of states {|x1⟩, ..., |xk⟩} is perfectly distinguishable if there is a measurement system {A1, ..., Ak} such that ⟨xi|Aj∗Aj|xi⟩ = δi,j. The following well-known result is proved in [12], and gives a much simpler equivalent condition to this definition:

Theorem 2.12. A set {|x1⟩, ..., |xk⟩} ⊂ Cn is perfectly distinguishable if and only if it is orthonormal.

Proof. Suppose {|x1⟩, ..., |xk⟩} ⊂ Cn is orthonormal, and define Ai = |xi⟩⟨xi|. Then

⟨xi|Aj∗Aj|xi⟩ = |⟨xi|xj⟩|² = δi,j

So {|x1⟩, ..., |xk⟩} is perfectly distinguishable.

Suppose {|x1⟩, ..., |xk⟩} ⊂ Cn is perfectly distinguishable. Then there is some measurement system {A1, ..., Ak} such that ⟨xi|Aj∗Aj|xi⟩ = δi,j. For |xi⟩, |xj⟩, there is some |y⟩ ∈ Cn s.t. ||y|| = 1, ⟨xj|y⟩ = 0 and |xi⟩ = a|xj⟩ + b|y⟩, where |a|² + |b|² = 1. Then, since ⟨xj|Ai∗Ai|xj⟩ = 0,

1 = ⟨xi|Ai∗Ai|xi⟩ = ⟨axj + by|Ai∗Ai|axj + by⟩ = |b|²⟨y|Ai∗Ai|y⟩ ≤ |b|²

So, a = 0. Then ⟨xi|xj⟩ = 0, and therefore {|x1⟩, ..., |xk⟩} is orthonormal.

The next postulate deals with composite quantum systems made up of two or more quantum systems.

Postulate 2.4. The state space of a composite physical system is the tensor product of the state spaces of the component physical systems. Moreover, if we have systems numbered 1 through n, and system number i is prepared in the state |ψi⟩, then the joint state of the total system is

|ψ1⟩ ⊗ |ψ2⟩ ⊗ ··· ⊗ |ψn⟩.

This allows us to describe one of the most interesting features of quantum mechanics, entanglement. Consider the two qubit state

|ψ⟩ = (1/√2)(|00⟩ + |11⟩).

There are no |x⟩, |y⟩ ∈ C2 such that |ψ⟩ = |x⟩|y⟩, since that would require the existence of x1, x2, y1, y2 ∈ C s.t. x1y1 = x2y2 = 1/√2, which implies x1, x2, y1, y2 are all nonzero, and x1y2 = x2y1 = 0, which requires at least two of x1, x2, y1, y2 to be zero. This illustrates what it means for a quantum state to be entangled.

Definition 2.13. Let |v⟩ ∈ Cm ⊗ Cn. We call |v⟩ separable if there are |x⟩ ∈ Cm, |y⟩ ∈ Cn such that |v⟩ = |x⟩|y⟩. We call |v⟩ entangled otherwise.

Entanglement is one of the most incredible and mysterious aspects of quantum mechanics, because when two particles are entangled, operations performed on one particle affect the other particle regardless of the distance between them. For example, consider the state |ψ⟩ defined earlier. If the second qubit is measured and gives the state |0⟩, that change can be represented by the measurement operator U = I2 ⊗ (|0⟩⟨0|), and the resulting state of the entire system is

U|ψ⟩ / √(⟨ψ|U∗U|ψ⟩) = |00⟩

So both qubits underwent a wave function collapse, not just the second qubit.
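A brief numpy sketch of this computation (an added illustration) shows the collapse of both qubits of the Bell state when the second qubit is measured and gives |0⟩:

    import numpy as np

    # |psi> = (|00> + |11>)/sqrt(2) in C^4, with basis order |00>, |01>, |10>, |11>.
    psi = np.array([1, 0, 0, 1]) / np.sqrt(2)

    I2 = np.eye(2)
    P0 = np.array([[1, 0], [0, 0]])          # |0><0| on the second qubit
    U = np.kron(I2, P0)                      # measurement operator I_2 ⊗ |0><0|

    post = U @ psi
    post = post / np.sqrt(np.vdot(psi, U.conj().T @ U @ psi).real)
    print(post)                              # [1. 0. 0. 0.] = |00>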

Now that we have a way of representing any given quantum state, we can describe how numbers are encoded in quantum states. If we have a number a ∈ N such that a < 2^n, we can write a as an n-bit binary number, and then represent it using n qubits as the state |a⟩. For example, if we want to represent the number 3 using a system of 5 qubits, we use the state |00011⟩ ∈ C32.

We cannot always determine uniquely the state of a quantum system, and so we consider the collection of possible states |x1⟩, ..., |xr⟩ that the system may be in and the probabilities p1, ..., pr, where Σ_{i=1}^r pi = 1, that each state is the actual state of the system. We then describe the state of the system using the matrix ρ = Σ_{i=1}^r pi|xi⟩⟨xi|. If r > 1, we call ρ a mixed state, otherwise it is called a pure state. We call such a matrix a density matrix.

Let ρ = Σ_{i=1}^r pi|xi⟩⟨xi| for some |x1⟩, ..., |xr⟩ ∈ Cn, p1, ..., pr ∈ [0, 1] s.t. Σ_{i=1}^r pi = 1, and let |v⟩ ∈ Cn. Then

⟨v|ρ|v⟩ = Σ_{i=1}^r pi⟨v|xi⟩⟨xi|v⟩ = Σ_{i=1}^r pi|⟨v|xi⟩|² ≥ 0,

so ρ ≥ 0. Furthermore, tr(ρ) = Σ_{i=1}^r pi · tr(|xi⟩⟨xi|) = Σ_{i=1}^r pi · ⟨xi|xi⟩ = 1.

With this new representation of quantum states, we need an equivalent representation of measurement for density matrices. We maintain the same definition of measurement system. Let {Ai : 1 ≤ i ≤ r} be a measurement system. Then for a pure state ρ = |x⟩⟨x|, we have

tr(AiρAi∗) = tr(Ai|x⟩⟨x|Ai∗) = ⟨x|Ai∗Ai|x⟩ = p(i),

so, the probability that measuring the mixed state ρ will result in outcome i is p(i) = tr(AiρAi∗).

Definition 2.14. The mixed states ρ1, ..., ρd ∈ Mn are perfectly distinguishable if there is a measurement system {A1, ..., Ak}, k ≥ d, such that tr(AiρjAi∗) = δi,j for 1 ≤ i, j ≤ d.

There is, again, a well-known and simple equivalent condition in [12].

Theorem 2.15. A collection of density matrices {ρ1, ..., ρd} ⊂ Mn is perfectly distinguishable if and only if ρiρj = 0 for i ≠ j.

Proof. Suppose {ρ1, ..., ρd} ⊂ Mn are density matrices such that ρiρj = 0 for i ≠ j. For 1 ≤ i ≤ d − 1, let Pi be the orthogonal projection onto the subspace ran(ρi) = {ρiv : v ∈ Cn}. Then Σ_{i=1}^{d−1} Pi is an orthogonal projection, so Pd = I − Σ_{i=1}^{d−1} Pi is an orthogonal projection. Then we have Σ_{i=1}^d Pi = I, so {P1, ..., Pd} is a measurement system.

Since ran(ρd) ⊂ ran(Pd) and ran(ρi) = ran(Pi) for 1 ≤ i ≤ d − 1,

tr(PiρjPi∗) = tr(Piρj) = tr(δi,jρj) = δi,j

Now suppose {ρ1, ..., ρd} ⊂ Mn is perfectly distinguishable. Then there is a measurement system {A1, ..., Ad} such that tr(AiρjAi∗) = δi,j. First of all, note that for any A ≥ 0, there is a U ∈ Un and a diagonal D ≥ 0 s.t. A = U∗DU. Then tr(A) = tr(U∗DU) = tr(D) = Σ_{i=1}^n di,i, so tr(A) = 0 if and only if A = 0. We will use this fact later in the proof.

Now, we want to show that tr(AiρjAi∗) = tr(ρjAi∗Ai) = 0 if and only if ρjAi∗Ai = 0. One direction is obvious, so we only need to prove that ρjAi∗Ai = 0 if tr(ρjAi∗Ai) = 0. For ease of reference let P = ρj and Q = Ai∗Ai, and suppose tr(PQ) = 0. Then P, Q ≥ 0, so there is a unitary U such that U∗PU = D = diag(d1, ..., dk, 0, ..., 0) ≥ 0. Let Q̃ = U∗QU. Then

Σ_{i=1}^k di q̃i,i = tr(DQ̃) = tr((U∗PU)(U∗QU)) = tr(PQ) = 0,

so q̃i,i = 0 for 1 ≤ i ≤ k. We can write Q̃ as the block matrix

Q̃ = [ Q̃1,1   Q̃1,2 ]
    [ Q̃1,2∗  Q̃2,2 ]

where Q̃1,1 ∈ Mk. Since Q̃ ≥ 0, Q̃1,1 ≥ 0, and therefore Q̃1,1 = 0. Then for v ∈ Ck and h ∈ Cn−k,

⟨(v, h), Q̃(v, h)⟩ = ⟨v, Q̃1,2h⟩ + ⟨h, Q̃1,2∗v + Q̃2,2h⟩

For some r > 0, let v = −rQ̃1,2h. Then the inner product becomes

−2r||Q̃1,2h||² + ⟨h, Q̃2,2h⟩

Let r → ∞. Since Q̃ ≥ 0, this inner product must always be nonnegative, so ||Q̃1,2h||² = 0. Since h was an arbitrary vector, Q̃1,2 = 0. So,

Q̃ = [ 0   0    ]
    [ 0   Q̃2,2 ]

Then for all v, w ∈ Cn, Dv ∈ span(e1, ..., ek) and Q̃w ∈ span(ek+1, ..., en), so

0 = v∗DQ̃w = v∗U∗PUU∗QUw = v∗U∗PQUw

Since U ∈ Un, this is equivalent to v∗PQw = 0 for all v, w ∈ Cn. Then PQ = 0. Note that this proof is still valid for any P, Q ≥ 0, P, Q ∈ Mn.

So, tr(ρjAi∗Ai) = 0 if and only if ρjAi∗Ai = 0. We also have tr(ρiAi∗Ai) = 1 = tr(ρi), since ρi is a density matrix. Then tr(ρi(I − Ai∗Ai)) = 0. Since I − Ai∗Ai = Σ_{j≠i} Aj∗Aj ≥ 0 and ρi ≥ 0, ρi(I − Ai∗Ai) = 0, and therefore ρi = ρiAi∗Ai. Then, for i ≠ j, we have

ρiρj = ρiAi∗Aiρj = ρi · 0 = 0,

since tr(Ai∗Aiρj) = tr(AiρjAi∗) = 0.

CHAPTER 3. QUANTUM ERROR CORRECTION

3.1 Quantum Confusability Graph and Zero Error Capacity

Since quantum communication sends information encoded in quantum states, any error that occurs during transmission would be a physical event that transforms the original quantum state into a new quantum state. We call such an operation a quantum channel, and we represent it using a linear map Φ : M(H) → M(K), where M(H) is the space of endomorphisms on H. Since Φ must map quantum states to quantum states, we must have Φ(ρ) ≥ 0 if ρ ≥ 0 and tr(Φ(ρ)) = tr(ρ) for all ρ ∈ M(H). We call such maps positive and trace preserving. Furthermore, since we can take a physical operation and apply it to every particle in a multipartite state simultaneously, Φ must also be completely positive.

Definition 3.1. Let Φ : M(H) → M(K) be linear. Then Φ is completely positive if (In ⊗ Φ)(ρ) ≥ 0 for every n ≥ 1 and every positive semidefinite ρ ∈ Mn ⊗ M(H).

We have a very convenient representation for completely positive maps in the following well-known theorem proven by Choi in [1], usually referred to as Choi’s Theorem:

Theorem 3.2. Let Φ : Mn → Md be linear. Then the following are equivalent:
(1) Φ is completely positive.
(2) CΦ = [Φ(eiej∗)]_{i,j=1}^n ≥ 0, where e1, ..., en are the standard basis vectors for Cn.
(3) There are some F1, ..., Fr ∈ Cd×n s.t. Φ(X) = Σ_{i=1}^r FiXFi∗ for all X ∈ Mn.

Proof. We begin by proving that (1)⇒(2). Let Φ : Mn → Md be completely positive, and let Q = (eiej∗)_{i,j=1}^n, where e1, ..., en are the standard basis vectors for Cn. Then

Q∗ = ((ejei∗)∗)_{i,j=1}^n = (eiej∗)_{i,j=1}^n = Q

and

Q² = (Σ_{k=1}^n (eiek∗)(ekej∗))_{i,j=1}^n = (Σ_{k=1}^n eiej∗)_{i,j=1}^n = nQ

Then for any eigenvector v of Q, if λ is its corresponding eigenvalue, we have

λ²v = Q²v = nQv = nλv,

so λ = 0 or n, and therefore Q ≥ 0. Then, since Φ is completely positive,

(In ⊗ Φ)(Q) = [Φ(eiej∗)]_{i,j=1}^n = CΦ ≥ 0

Now we prove (2)⇒(3). Let Φ : Mn → Md be linear such that CΦ ≥ 0, and let v1, ..., vK ∈ Cnd be mutually orthogonal eigenvectors of CΦ, scaled so that the norm of each is the square root of its associated eigenvalue. Then

CΦ = Σ_{i=1}^K vivi∗

Then for 1 ≤ i ≤ K, there is a set of n vectors h1^(i), ..., hn^(i) ∈ Cd such that vi is the block column with blocks h1^(i), ..., hn^(i). Then for 1 ≤ i ≤ K, define Bi = [h1^(i) ··· hn^(i)]. Then Bi ∈ Cd×n, so if we define

Ψ(X) = Σ_{i=1}^K BiXBi∗,

Ψ is a linear map from Mn to Md. Since Ψ(eiej∗) = Σ_{ℓ=1}^K hi^(ℓ)(hj^(ℓ))∗ is equal to the (i, j)-th d × d block submatrix of CΦ, we have Ψ(eiej∗) = Φ(eiej∗) for 1 ≤ i, j ≤ n.

Since the set {eiej∗ : 1 ≤ i, j ≤ n} is a basis for Mn and Φ, Ψ are both linear maps,

Φ(X) = Ψ(X) = Σ_{i=1}^K BiXBi∗ for all X ∈ Mn.

Finally, we prove (3)⇒(1). Let Q = (Xi,j)_{i,j=1}^p ∈ Mp(Mn) be positive semidefinite. Then

(Ip ⊗ Φ)(Q) = (Φ(Xi,j))_{i,j} = Σ_{k=1}^r (FkXi,jFk∗)_{i,j} = Σ_{k=1}^r (Ip ⊗ Fk)Q(Ip ⊗ Fk)∗ ≥ 0,

since each term is of the form BQB∗ with Q ≥ 0, and the sum of positive semidefinite matrices is positive semidefinite. Then (Ip ⊗ Φ) is positive for arbitrary p, and therefore Φ is completely positive.

Then, if Φ : Mn → Md is completely positive, we have

tr(Φ(X)) = tr(Σ_{i=1}^r FiXFi∗) = tr((Σ_{i=1}^r Fi∗Fi)X)

This is equal to tr(X) for all X ∈ Mn iff. Σ_{i=1}^r Fi∗Fi = In.
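The following numpy sketch (an added illustration, using a one-qubit bit-flip channel with an arbitrarily chosen flip probability) walks through Choi's Theorem in miniature: it builds CΦ, checks CΦ ≥ 0, recovers Kraus operators from the scaled eigenvectors, and checks the trace-preserving condition Σ Fi∗Fi = In:

    import numpy as np

    n = 2
    p = 0.25
    # Bit-flip channel: Phi(X) = (1-p) X + p sx X sx.
    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    kraus = [np.sqrt(1 - p) * np.eye(2, dtype=complex), np.sqrt(p) * sx]

    def phi(X):
        return sum(F @ X @ F.conj().T for F in kraus)

    # Choi matrix C_Phi = [Phi(e_i e_j*)]_{i,j}.
    E = np.eye(n)
    C = np.block([[phi(np.outer(E[:, i], E[:, j])) for j in range(n)] for i in range(n)])
    assert np.all(np.linalg.eigvalsh(C) >= -1e-12)          # C_Phi >= 0

    # Kraus operators B_i from eigenvectors scaled by the square roots of the eigenvalues.
    w, V = np.linalg.eigh(C)
    recovered = [np.sqrt(max(w[i], 0)) * V[:, i].reshape(n, n).T
                 for i in range(n * n) if w[i] > 1e-12]

    # The recovered operators give the same channel ...
    X = np.random.randn(n, n) + 1j * np.random.randn(n, n)
    assert np.allclose(sum(B @ X @ B.conj().T for B in recovered), phi(X))
    # ... and the original Kraus operators satisfy sum_i F_i* F_i = I_n.
    assert np.allclose(sum(F.conj().T @ F for F in kraus), np.eye(n))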

Now that we have a way to describe errors, we turn to the theory of correcting them. The most obvious method is to take our inspiration from classical error correction. We want to

find a set {x1, ..., xk} of perfectly distinguishable states such that {Φ(x1x1∗), ..., Φ(xkxk∗)} are perfectly distinguishable. The existence of such a set would allow Bob to determine without error which state Alice sent, although it would not necessarily allow Bob to actually recover that state.

Given a quantum channel Φ : Mn → Md, what is the largest perfectly distinguishable subset {x1, ..., xk} ⊂ Cn s.t. {Φ(x1x1∗), ..., Φ(xkxk∗)} is perfectly distinguishable? By analogy to classical error correction, we define the zero error capacity of a quantum channel as follows:

Definition 3.3. Let Φ : Mn → Md be a quantum channel. Then the zero error capacity of Φ is

α(Φ) = max{k ∈ N : ∃ {x1, ..., xk} ⊂ Cn s.t. for i, j ∈ [k], i ≠ j, xi∗xj = Φ(xixi∗)Φ(xjxj∗) = 0}

The following theorem from [12] gives a useful equivalent condition to the existence of such a subset:

Theorem 3.4. Let Φ : Mn → Md be a TPCP map defined as Φ(X) = Σ_{i=1}^r FiXFi∗ for all X ∈ Mn, and let {x1, ..., xk} ⊂ Cn be orthonormal. Then for i ≠ j, 1 ≤ i, j ≤ k, Φ(xixi∗)Φ(xjxj∗) = 0 if and only if xi∗Fℓ∗Fmxj = 0 for 1 ≤ ℓ, m ≤ r.

Proof. Suppose Φ(xixi∗) · Φ(xjxj∗) = 0 for i ≠ j. Recall from the proof of 2.15 that since Φ(xixi∗) ≥ 0 for 1 ≤ i ≤ k, Φ(xixi∗)Φ(xjxj∗) = 0 if and only if tr(Φ(xixi∗)Φ(xjxj∗)) = 0. We have

tr((Σ_{ℓ=1}^r Fℓxixi∗Fℓ∗)(Σ_{m=1}^r Fmxjxj∗Fm∗)) = Σ_{ℓ,m=1}^r tr(Fℓxixi∗Fℓ∗Fmxjxj∗Fm∗)
= Σ_{ℓ,m=1}^r ⟨Fmxj, Fℓxi⟩ · tr(Fℓxixj∗Fm∗)
= Σ_{ℓ,m=1}^r ⟨Fmxj, Fℓxi⟩ · ⟨Fℓxi, Fmxj⟩
= Σ_{ℓ,m=1}^r |⟨Fmxj, Fℓxi⟩|²

So, tr(Φ(xixi∗)Φ(xjxj∗)) = 0 if and only if |⟨Fmxj, Fℓxi⟩| = 0 for 1 ≤ ℓ, m ≤ r. Then Φ(xixi∗)Φ(xjxj∗) = 0 if and only if Fℓxi ⊥ Fmxj for 1 ≤ ℓ, m ≤ r.

This equivalent condition leads us to the concept of operator systems, which are defined as follows:

Definition 3.5. Let S ⊂ Mn be a subspace of Mn. Then S is an operator system if:
(1) I ∈ S
(2) If A ∈ S, A∗ ∈ S.

Let Φ : Mn → Md be a TPCP channel defined as Φ(X) = Σ_{i=1}^r FiXFi∗, and define

SΦ = span(Fi∗Fj : 1 ≤ i, j ≤ r)

Clearly, if Fi∗Fj ∈ SΦ, then (Fi∗Fj)∗ = Fj∗Fi ∈ SΦ, and since Φ is trace-preserving, Σ_{i=1}^r Fi∗Fi = I, so SΦ is an operator system. For an operator system S, an orthonormal set {x1, ..., xk} ⊂ Cn s.t.

tr(Axixj∗) = xj∗Axi = 0

for all i ≠ j, 1 ≤ i, j ≤ k and all A ∈ S is called an S-independent set of size k. Then the condition given in Theorem 3.4 is equivalent to the condition that {x1, ..., xk} is an SΦ-independent set.

Note that, while a quantum channel Φ defines an operator system SΦ, different quantum channels can have the same operator system. A trivial example would be Φ(X) = InXIn and Ψ(X) = [In; 0] X [In 0], where [In; 0] denotes the 2n × n matrix with In stacked on top of a zero block. So, when we find an SΦ-independent set of size k, we can use the same set to design an error correcting algorithm for any quantum channel that shares an operator system with Φ.

Operator systems are also sometimes called noncommutative graphs in the context of error correction because they can be seen as a generalization of the confusability graph of a noisy channel, which we considered in classical error correction. We can illustrate this fact by constructing the operator system of a classical noisy channel. Let N : X → Y be a classical noisy channel, and for 1 ≤ i ≤ |X|, 1 ≤ j ≤ |Y|, define Fi,j = √(p(yj|xi)) êjei∗, where e1, ..., e|X| and ê1, ..., ê|Y| are the standard basis vectors of C|X| and C|Y|, respectively. Define ΦN : M|X| → M|Y| as

ΦN(ρ) = Σ_{i=1}^{|X|} Σ_{j=1}^{|Y|} Fi,jρFi,j∗ for ρ ∈ M|X|

By construction, ΦN is completely positive, and

Σ_{i,j} Fi,j∗Fi,j = Σ_i (Σ_j p(yj|xi)) eiei∗ = Σ_i eiei∗ = I

So, ΦN is also trace-preserving, and is therefore a quantum channel. We can then define the operator system of N as

SN = span(Fh,i∗Fj,ℓ : 1 ≤ h, j ≤ |X|, 1 ≤ i, ℓ ≤ |Y|)

If we represent xi ∈ X with eiei∗ for 1 ≤ i ≤ |X|, we have

ΦN(eiei∗) = Σ_{j=1}^{|Y|} p(yj|xi) êjêj∗

So, for 1 ≤ h < i ≤ |X|,

ΦN(eheh∗)ΦN(eiei∗) = (Σ_{j=1}^{|Y|} p(yj|xh) êjêj∗)(Σ_{j=1}^{|Y|} p(yj|xi) êjêj∗) = Σ_{j=1}^{|Y|} p(yj|xh) · p(yj|xi) êjêj∗

This is zero if and only if p(yj|xh) · p(yj|xi) = 0 for 1 ≤ j ≤ |Y|, so if GN is the confusability graph of N, by Theorem 3.4, a set T = {xi1, ..., xik} ⊂ X is independent if and only if {ei1, ..., eik} is an SN-independent set. So we see that in the field of error correction, SN acts as a generalization of GN.

3.2 Quantum Error Correcting Codes

We now come to the problem of quantum error correction. When describing quantum error correction algorithms, it is common practice to return to representing quantum states as vectors, written in bra-ket notation, since it makes descriptions of these algorithms more clear.

Operations performed on quantum states are called quantum gates in this context, and are represented using unitary matrices. It is also convenient to represent these gates graphically, particularly the CNOT gate that we will use repeatedly in this section. A gate U that is only applied to a single qubit is depicted as

[Circuit diagram: a single horizontal wire passing through a box labeled U]

When a quantum gate U has dimension 2^n for some n > 1, we use a similar graphical representation, simply extending it to cover all of the affected qubits. We can also abuse this notation somewhat and use the same graphical representation to represent the state undergoing a quantum channel. The gates are applied from left to right, and we will now describe the CNOT gate, which is used several times in this section. The CNOT gate has a control bit and a target bit. If the control bit is 0, the gate does nothing. If the control bit is 1, the gate changes the value of the target bit from 1 to 0 or vice versa. In the simple example of a CNOT gate on two qubits, this is represented by the matrix

CNOT = [ 1 0 0 0 ]
       [ 0 1 0 0 ]
       [ 0 0 0 1 ]
       [ 0 0 1 0 ]

Of course, it becomes much more difficult to describe this matrix when dealing with more than two qubits, particularly if there are intervening bits between the control bit and the target bit. This is where the graphical representation becomes much more convenient. The following is a CNOT gate applied on a system of three qubits, with the first bit as the control and the third as the target:

[Circuit diagram: three horizontal wires; a • on the first wire is joined by a vertical line to an ⊕ on the third wire.]

We use the • to indicate the control bit and the ⊕ to represent the target. To indicate the initial state of a qubit, we write its quantum state as a ket vector at the beginning of the line that represents it. For example, if the qubit is in the state |ψ⟩, we would write

[Circuit diagram: a single wire labeled |ψ⟩ at its left end]
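As a concrete sketch of how these gates are built (an added illustration using standard projector constructions, not anything specific to this thesis), the CNOT matrix above, and its three-qubit version with control on the first qubit and target on the third, can be assembled with numpy:

    import numpy as np

    I2 = np.eye(2)
    sx = np.array([[0, 1], [1, 0]])        # bit flip
    P0 = np.array([[1, 0], [0, 0]])        # |0><0|
    P1 = np.array([[0, 0], [0, 1]])        # |1><1|

    # Two-qubit CNOT: identity when the control is |0>, flip the target when it is |1>.
    CNOT = np.kron(P0, I2) + np.kron(P1, sx)
    print(CNOT.astype(int))                # matches the 4 x 4 matrix above

    # Same recipe with control on qubit 1 and target on qubit 3 of a three-qubit system.
    CNOT_13 = np.kron(P0, np.kron(I2, I2)) + np.kron(P1, np.kron(I2, sx))
    e = np.eye(8)
    assert np.allclose(CNOT_13 @ e[:, 4], e[:, 5])   # |100> -> |101>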

We can now give an example of a simple quantum error correction algorithm which draws its inspiration from the example of classical error correction that we discussed earlier. Just as the classical example uses a 3-bit channel, this algorithm involves a quantum channel

Φ : M8 → M8 that has a low probability of “flipping” one of the qubits in the standard basis vectors |000⟩, |001⟩, ..., |111⟩. For example, if it flips the second bit, it would map |i0j⟩ to |i1j⟩ and vice versa for i, j ∈ {0, 1}.

By analogy to the classical case, Alice can transmit a single qubit, |ψ⟩ = a|0⟩ + b|1⟩ ∈ C2, without error by encoding it with two qubits in the state |0⟩ using the following algorithm:

[Circuit diagram: encoding. The qubit |ψ⟩ acts as the control of two CNOT gates whose targets are two ancilla qubits prepared in |0⟩.]

This results in the entangled state a|000⟩ + b|111⟩, which Alice then sends to Bob. It is at this point that we see one of the difficulties of designing quantum error correction algorithms based on analogous algorithms from classical error correction. In the classical case, Bob can simply inspect the number he received and, based on the majority rule, determine the original value. However, in the quantum case, measuring the quantum state would cause a wave function collapse, changing the state Bob received. To avoid this, Bob must add two qubits in the state |0⟩ to the system and perform some operations to entangle them with the state he has received using the following algorithm:

[Circuit diagram: syndrome detection. Two ancilla qubits in |0⟩ are appended, and CNOT gates from the received qubits onto the ancillas record the parities that make up the error syndrome.]

Bob then measures the last two qubits. The value he gets is called the error syndrome, and it is used to determine what, if any, error occurred. If he gets |00⟩, then no error occurred, so he has the state a|000⟩ + b|111⟩. If he gets |01⟩, then the last qubit was flipped, so he has the state a|001⟩ + b|110⟩. If he gets |10⟩, then the second qubit was flipped, and if he gets |11⟩, then the first qubit was flipped. He can then apply a simple quantum channel to flip the affected qubit back. Finally, he applies the same algorithm that Alice used in reverse to get the state (a|0⟩ + b|1⟩)|00⟩, which will allow him to independently manipulate the state with which Alice started. All together, this algorithm is represented as follows, with the meter symbol representing measurement:

[Circuit diagram: Encoding, Syndrome detection, Decoding. The qubit |ψ⟩ controls two CNOT gates onto two |0⟩ ancillas (encoding); the three qubits pass through Φ; two further |0⟩ ancillas are entangled with them by CNOT gates and measured to obtain the syndrome; the syndrome correction is applied; and the encoding circuit is run in reverse, returning |ψ⟩ on the first wire.]

At its foundation, this algorithm works as a continuous extension of the discrete classical error correction algorithm. Here, Alice maps her single qubit onto a subspace of a larger state space which can be protected from the errors that might occur during transmission through Φ. In general, if the dimension of such a subspace is k, we call it a k-dimensional quantum error correction code. Knill and Laflamme give a very useful equivalent condition in [4].

Theorem 3.6. Let K ⊂ Cn be a nonzero subspace and let P : Cn → K be the orthogonal projection onto K. Let Φ : Mn → Mn be a TPCP map given by Φ(X) = Σ_{i=1}^r FiXFi∗ for all X ∈ Mn. Then there is a TPCP map Ψ : Mn → Mn such that Ψ(Φ(vv∗)) = vv∗ for all v ∈ K iff. there is some Λ ∈ Mr such that PFi∗FjP = λi,jP for 1 ≤ i, j ≤ r.

We will prove one direction of this statement below. The other direction will be proved in Theorem 3.8.

Proof. Suppose there is a quantum channel Ψ : Mn → Mn s.t. Ψ(Φ(vv∗)) = vv∗ for all v ∈ K. Then Ψ admits the form Ψ(ρ) = Σ_{i=1}^q EiρEi∗, with Σ_{i=1}^q Ei∗Ei = I. We can define a quantum channel η : Mn → Mn as

η(ρ) = PρP = Ψ(Φ(PρP)) = Σ_{i=1}^q Σ_{j=1}^r (EiFjP)ρ(EiFjP)∗

Since the operator system of a quantum channel is determined by the quantum channel, Sη = span(PFi∗Eh∗EjFℓP : 1 ≤ i, ℓ ≤ r, 1 ≤ h, j ≤ q) = span(P). Then there is some

{αh,i,j,ℓ : 1 ≤ i, ℓ ≤ r, 1 ≤ h, j ≤ q} ⊂ C

s.t. PFi∗Eh∗EjFℓP = αh,i,j,ℓP for 1 ≤ i, ℓ ≤ r, 1 ≤ h, j ≤ q. Then for 1 ≤ i, j ≤ r, since Ψ is trace-preserving by assumption,

PFi∗FjP = PFi∗(Σ_{h=1}^q Eh∗Eh)FjP = Σ_{h=1}^q PFi∗Eh∗EhFjP = (Σ_{h=1}^q αh,i,h,j) P

Then, if we define Λ ∈ Mr s.t. λi,j = Σ_{h=1}^q αh,i,h,j, one direction is proved.
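As an added numerical illustration of the Knill–Laflamme condition, the sketch below checks PFi∗FjP = λi,jP for the three-qubit bit-flip channel that appears as the example later in this section (the probabilities used here are arbitrary placeholders):

    import numpy as np

    I2 = np.eye(2)
    sx = np.array([[0, 1], [1, 0]])
    p = [0.85, 0.05, 0.05, 0.05]                      # assumed error probabilities
    F = [np.sqrt(p[0]) * np.eye(8),
         np.sqrt(p[1]) * np.kron(sx, np.kron(I2, I2)),
         np.sqrt(p[2]) * np.kron(I2, np.kron(sx, I2)),
         np.sqrt(p[3]) * np.kron(I2, np.kron(I2, sx))]

    e = np.eye(8)
    # Orthogonal projection onto K = span(|000>, |111>).
    P = np.outer(e[:, 0], e[:, 0]) + np.outer(e[:, 7], e[:, 7])

    # Knill-Laflamme: P F_i* F_j P must be a scalar multiple of P for all i, j.
    for i in range(4):
        for j in range(4):
            M = P @ F[i].conj().T @ F[j] @ P
            lam = M[0, 0]                             # candidate lambda_{i,j}
            assert np.allclose(M, lam * P)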

While this kind of algorithm works well in theory, it requires both an encoding and decoding operation and an algorithm for determining the communication error without measuring the state that was actually transmitted. This leaves the process open to human error, particularly in the operation of measuring to get the syndrome. Much of this human error can be avoided if we can design a quantum channel that can recover the correct quantum state without measurement. In [6], Li, Nakahara, Poon, Sze and Tomita found that whenever there is a recovery channel that can correct Φ with a syndrome measurement, we can also construct a recovery channel that can correct Φ without a syndrome measurement. This construction utilizes the partial trace, which is defined as follows:

Definition 3.7. Let H = H1 ⊗ ··· ⊗ Hr, let A : H → H be linear and for some i ∈ {1, ..., r}, let e1, ..., en be the standard basis vectors for Hi. Then the i-th partial trace is

tri(A) = Σ_{j=1}^n (I_{H1} ⊗ ··· ⊗ ej∗ ⊗ ··· ⊗ I_{Hr}) A (I_{H1} ⊗ ··· ⊗ ej ⊗ ··· ⊗ I_{Hr})

In particular, this construction from [6] uses the first partial trace, and for the sake of clarity, we can describe its action more simply. Let A ∈ Mn, B ∈ Mm. Then

tr1(A ⊗ B) = tr(A) · B.
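A small numpy sketch of the first partial trace (an added illustration, implemented by reshaping into tensor factors) and of the identity tr1(A ⊗ B) = tr(A) · B:

    import numpy as np

    def tr1(M, d1, d2):
        # First partial trace of M acting on C^{d1} ⊗ C^{d2}:
        # tr1(M) = sum_j (e_j* ⊗ I) M (e_j ⊗ I).
        return M.reshape(d1, d2, d1, d2).trace(axis1=0, axis2=2)

    A = np.random.randn(3, 3) + 1j * np.random.randn(3, 3)
    B = np.random.randn(2, 2) + 1j * np.random.randn(2, 2)
    assert np.allclose(tr1(np.kron(A, B), 3, 2), np.trace(A) * B)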

Theorem 3.8. Let Φ : Mn → Mn be a quantum channel defined as Φ(ρ) = Σ_{i=1}^r FiρFi∗. Suppose K ⊂ Cn is an error correcting code of dimension k, and let P ∈ Mn be an orthogonal projection onto K. Let W ∈ Cn×k be s.t. W∗W = Ik and P = WW∗, so that any ρ ∈ Mn s.t. PρP = ρ has the form Wρ̃W∗ for some ρ̃ ∈ Mk. Then there is a unitary R ∈ Mn and a positive definite matrix D ∈ Mq with q ≤ min{r, n/k} such that for any density matrix ρ̃ ∈ Mk,

R∗Φ(Wρ̃W∗)R = (D ⊗ ρ̃) ⊕ 0_{n−qk}

In particular, if n is a multiple of k so that Mn can be regarded as Mn/k ⊗ Mk, then

R∗Φ(Wρ̃W∗)R = D̃ ⊗ ρ̃ with D̃ = D ⊕ 0_{n/k−q}

and a recovery quantum channel can be constructed as the map Ψ : Mn → Mn defined by

Ψ(ρ) = W(tr1(R∗ρR))W∗

Proof. Suppose K ⊂ Cn is an error correcting code of dimension k, and let P be an orthogonal projection onto K. Then by Theorem 3.6, there is a Λ ∈ Mr s.t. PFi∗FjP = λi,jP for 1 ≤ i, j ≤ r. Then we have

Λ ⊗ P = (λi,jP)_{i,j=1}^r = (PFi∗FjP)_{i,j=1}^r = [PF1∗; ...; PFr∗][F1P ··· FrP]

where [PF1∗; ...; PFr∗] denotes the block column with blocks PF1∗, ..., PFr∗. Since any matrix of the form BB∗ is positive semidefinite, Λ ⊗ P ≥ 0, so Λ ≥ 0. Suppose Λ has rank q. Then there is a unitary U ∈ Mr such that U∗ΛU = D ⊕ 0_{r−q} for some diagonal matrix D ≥ 0. Define

Ej = Σ_{i=1}^r ui,jFi for j = 1, ..., r

Let F = [F1 ··· Fr], and let E = [E1 ··· Er]. Then E = F(U ⊗ In) and for any ρ ∈ Mn,

Φ(ρ) = Σ_{j=1}^r FjρFj∗ = F(Ir ⊗ ρ)F∗ = F(U ⊗ In)(Ir ⊗ ρ)(U ⊗ In)∗F∗ = E(Ir ⊗ ρ)E∗ = Σ_{j=1}^r EjρEj∗

So Φ(ρ) = Σ_{j=1}^r EjρEj∗. Furthermore,

PEi∗EjP = Σ_{ℓ,m=1}^r ū_{ℓ,i}u_{m,j}PFℓ∗FmP = (Σ_{ℓ,m=1}^r ū_{ℓ,i}u_{m,j}λℓ,m) P = (U∗ΛU)i,jP = di,jP for 1 ≤ i, j ≤ r

We can assume without loss of generality that Ej = Fj, and so PFi∗FjP = di,jP for 1 ≤ i, j ≤ r. Then we can replace the matrix F with F = [F1 ··· Fq]. Since P = WW∗ with W∗W = Ik, it follows that

W∗Fi∗FjW = di,jIk and (Iq ⊗ W)∗F∗F(Iq ⊗ W) = D ⊗ Ik

Define an n × qk matrix R1 = F(Iq ⊗ W)(D^{−1/2} ⊗ Ik), where D^{−1/2} = diag(1/√d1,1, ..., 1/√dq,q). Then R1∗R1 = Iqk. Let R2 be an n × (n − qk) matrix s.t. R = [R1 R2] is unitary. Now for any ρ ∈ Mn with PρP = ρ, ρ = Wρ̃W∗ for some ρ̃ ∈ Mk. For j > q, W∗Fj∗FjW = dj,jIk = 0, so FjW = 0. Then

Φ(ρ) = Σ_{j=1}^r Fj(Wρ̃W∗)Fj∗ = Σ_{j=1}^q Fj(Wρ̃W∗)Fj∗ = F(Iq ⊗ (Wρ̃W∗))F∗   (1)

It follows that

R∗Φ(ρ)R = R∗F(Iq ⊗ W)(Iq ⊗ ρ̃)(Iq ⊗ W)∗F∗R
= R∗R1(D^{1/2} ⊗ Ik)(Iq ⊗ ρ̃)(D^{1/2} ⊗ Ik)R1∗R
= [D^{1/2} ⊗ Ik; 0](Iq ⊗ ρ̃)[D^{1/2} ⊗ Ik  0] = (D ⊗ ρ̃) ⊕ 0_{n−qk}

Finally, since P = P(Σ_{i=1}^r Fi∗Fi)P = (Σ_{i=1}^r di,i)P, we have Σ_{i=1}^r di,i = 1. Then

tr1(R∗Φ(Wρ̃W∗)R) = (Σ_{i=1}^r di,i) ρ̃ = ρ̃,

so if we define Ψ(ρ) = W(tr1(R∗ρR))W∗, then Ψ(Φ(Wρ̃W∗)) = Wρ̃W∗ for all ρ̃ ∈ Mk.

For an example of this type of error correction algorithm, recall the noisy channel described in the quantum error correction algorithm above. If we explicitly describe this channel, we can construct a recovery channel without a syndrome measurement. Let

F0 = √p0 I8,   F1 = √p1 σx ⊗ I2 ⊗ I2,
F2 = √p2 I2 ⊗ σx ⊗ I2,   F3 = √p3 I2 ⊗ I2 ⊗ σx,

where

σx = [ 0 1 ]
     [ 1 0 ]

and p0 + p1 + p2 + p3 = 1. Then for a 3-qubit state ρ, the action of the noisy channel on ρ is defined as Φ(ρ) = Σ_{i=0}^3 FiρFi∗. As in the first error correction method, Alice takes the state a|0⟩ + b|1⟩ and encodes it in the state a|000⟩ + b|111⟩. Then if ρ is the density matrix she is sending through Φ, we have

ρ = [ a 0 ··· 0 b ]
    [ 0 0 ··· 0 0 ]
    [ ⋮ ⋮     ⋮ ⋮ ]
    [ 0 0 ··· 0 0 ]
    [ c 0 ··· 0 d ]

and

Φ(ρ) = [ p0a  0    0    0    0    0    0    p0b ]
       [ 0    p3a  0    0    0    0    p3b  0   ]
       [ 0    0    p2a  0    0    p2b  0    0   ]
       [ 0    0    0    p1d  p1c  0    0    0   ]
       [ 0    0    0    p1b  p1a  0    0    0   ]
       [ 0    0    p2c  0    0    p2d  0    0   ]
       [ 0    p3c  0    0    0    0    p3d  0   ]
       [ p0c  0    0    0    0    0    0    p0d ]

We now construct a quantum recovery channel that is inverse to Φ on the subspace K = span(|000⟩, |111⟩). Let P = [e1 0 ··· 0 e8], the matrix whose first and last columns are e1 and e8 and whose remaining columns are zero, and define

G1 = P,   G2 = P [e2 e1 0 0 0 0 e8 e7],
G3 = P [e3 0 e1 0 0 e8 0 e6],   G4 = P [e5 0 0 e8 e1 0 0 e4]

Then we have Σ_{i=1}^4 Gi∗Gi = I8, so if we define Ψ : M8 → M8 as Ψ(ρ) = Σ_{i=1}^4 GiρGi∗, then Ψ is a trace-preserving, completely positive (TPCP) map. Furthermore, for any density matrix ρ ∈ span(vv∗ : v ∈ K, ||v|| = 1), Ψ(Φ(ρ)) = ρ. Putting all of this together, we have the following algorithm:

[Circuit diagram: Encoding and Decoding. |ψ⟩ controls two CNOT gates onto two |0⟩ ancillas (encoding); the three qubits pass through Φ and then through the recovery channel Ψ; the same two CNOT gates are applied again (decoding), leaving |ψ⟩ on the first wire and |0⟩ on the ancillas.]

Just as with the first error correcting method, Alice begins by encoding the state she wishes to send onto a subspace of C8 which can be protected from error. However, after the state has gone through the channel, instead of needing to entangle additional qubits and measure to get a syndrome, then correct the error, then finally decode the result to get the original state, Bob can simply apply Ψ, then decode the result. This is far more efficient and less prone to human error.
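For completeness, a short numpy verification of this example (an added illustration; the probabilities pi below are arbitrary choices) confirms that Σ Gi∗Gi = I8 and that Ψ inverts Φ on the code subspace:

    import numpy as np

    I2 = np.eye(2)
    sx = np.array([[0, 1], [1, 0]])
    p = [0.7, 0.1, 0.1, 0.1]                             # assumed p_0, ..., p_3
    F = [np.sqrt(p[0]) * np.eye(8),
         np.sqrt(p[1]) * np.kron(sx, np.kron(I2, I2)),
         np.sqrt(p[2]) * np.kron(I2, np.kron(sx, I2)),
         np.sqrt(p[3]) * np.kron(I2, np.kron(I2, sx))]

    e = np.eye(8)
    P = np.outer(e[:, 0], e[:, 0]) + np.outer(e[:, 7], e[:, 7])
    # G_2, G_3, G_4 map the corrupted code words back onto |000> and |111>.
    G = [P,
         np.outer(e[:, 0], e[:, 1]) + np.outer(e[:, 7], e[:, 6]),   # flip of qubit 3
         np.outer(e[:, 0], e[:, 2]) + np.outer(e[:, 7], e[:, 5]),   # flip of qubit 2
         np.outer(e[:, 0], e[:, 4]) + np.outer(e[:, 7], e[:, 3])]   # flip of qubit 1

    assert np.allclose(sum(g.conj().T @ g for g in G), np.eye(8))   # Psi is trace preserving

    def channel(K, rho):
        return sum(k @ rho @ k.conj().T for k in K)

    # Encoded state a|000> + b|111> and its density matrix.
    a, b = 0.6, 0.8
    psi = a * e[:, 0] + b * e[:, 7]
    rho = np.outer(psi, psi.conj())
    assert np.allclose(channel(G, channel(F, rho)), rho)            # Psi(Phi(rho)) = rho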

In both methods, the subspace K that can be protected against the error channel Φ depends entirely on SΦ, and not directly on Φ itself. However, the actual recovery quantum channel that Knill and Laflamme construct in their proof does depend on the specific

F1, ..., Fr, not just SΦ. The construction given by Li, Nakahara, Poon, Sze and Tomita in Theorem 3.8 is simpler and far more flexible, as it corrects not just Φ, but also any

Ψ: Mn → Mn s.t. SΨ ⊂ SΦ. The following theorem, also from [6], states this more precisely:

Theorem 3.9. Let R, W, F1, ..., Fr be as described in Theorem 3.8. If Φ̃ is another quantum channel Φ̃(ρ) = Σ_{j=1}^s F̃jρF̃j∗, where the operators F̃j are linear combinations of F1, ..., Fr, then there is a D̃ ≥ 0 such that for any density matrix ρ̃ ∈ Mk and ρ = Wρ̃W∗, we have

R∗Φ̃(ρ)R = (D̃ ⊗ ρ̃) ⊕ 0

Proof. Using the same notation as in Theorem 3.8, for j = 1, ..., s, let

F̃j = Σ_{i=1}^r ti,jFi for some t1,j, ..., tr,j ∈ C

Recall that FjW = 0 for j > q. Then F̃jW = Σ_{i=1}^q ti,jFiW for j = 1, ..., s. Define a q × s matrix T = (ti,j)i,j. For any density matrix ρ = Wρ̂W∗ ∈ Mn,

Φ̃(ρ) = Σ_{j=1}^s F̃jWρ̂W∗F̃j∗
= Σ_{j=1}^s (Σ_{i=1}^q ti,jFi) Wρ̂W∗ (Σ_{h=1}^q th,jFh)∗
= Σ_{h,i=1}^q Fi (Σ_{j=1}^s ti,j t̄h,j) Wρ̂W∗ Fh∗
= F(TT∗ ⊗ Wρ̂W∗)F∗

Note that this is the same as equation (1) from the proof of Theorem 3.8, but with TT∗ replacing Iq. Then, letting D̃ = D^{1/2}TT∗D^{1/2} ≥ 0,

R∗Φ̃(ρ)R = (D̃ ⊗ ρ̂) ⊕ 0

CHAPTER 4. NUMERICAL RANGE GENERALIZATIONS

The equivalent conditions from Theorems 3.4 and 3.6 are closely related to a particular generalization of the numerical range of a matrix A ∈ Mn, defined as

W(A) = {v∗Av : v ∈ Cn, ||v|| = 1}

First, for A1, ..., Am ∈ Mn, we define the joint numerical range as

W(A1, ..., Am) = {x ∈ Cm : ∃ v ∈ Cn s.t. v∗v = 1 and v∗Aiv = xi for 1 ≤ i ≤ m}

Let Pk(n) be the set of all orthogonal projections P : Cn → K s.t. dim(K) = k. Then, define the rank-k numerical range of A as

Λk(A) = {λ ∈ C : ∃ P ∈ Pk(n) s.t. PAP = λP}

We then further generalize this to the rank-k joint numerical range of a set of matrices.

Definition 4.1. Let A1, ..., Am ∈ Mn. Then the rank-k joint numerical range of A1, ..., Am is defined as

m Λk(A1, ..., Am) = {u ∈ C : ∃ P ∈ Pk(n) s.t. PAiP = uiP for 1 ≤ i ≤ m}

Then by Theorem 3.6, the existence of a k-dimensional code for a quantum channel Φ : Mn → Md, defined as Φ(ρ) = Σ_{i=1}^r FiρFi∗, is equivalent to the condition

Λk(Fi∗Fj : 1 ≤ i, j ≤ r) ≠ ∅

Suppose P ∈ Pk(n) is s.t. PFi∗FjP = ai,jP for some ai,j ∈ C, for 1 ≤ i, j ≤ r. Then, by the linearity of matrix operations, for any B ∈ SΦ, PBP = λP for some λ ∈ C. So, if dim(SΦ) = m + 1 and {I, B1, ..., Bm} is a basis for SΦ, Λk(Fi∗Fj : 1 ≤ i, j ≤ r) ≠ ∅ if and only if Λk(B1, ..., Bm) ≠ ∅. Furthermore, since B ∈ SΦ if and only if B∗ ∈ SΦ, we can choose a basis of SΦ composed of Hermitian matrices, which simplifies the calculations involved in determining whether Λk(B1, ..., Bm) ≠ ∅.

The equivalent condition from Theorem 3.4 is related to a further generalization of the rank-k joint numerical range. Let Dk be the space of diagonal matrices in Mk. Then, for A ∈ Mn, define Dk(A) as

Dk(A) = {D ∈ Dk : ∃ U ∈ Cn×k s.t. U∗U = Ik, U∗AU = D}

Then, for A1, ..., Am ∈ Mn, define Dk(A1, ..., Am) as

Dk(A1, ..., Am) = {(D1, ..., Dm) ∈ (Dk)^m : ∃ U ∈ Cn×k s.t. U∗U = Ik, U∗AiU = Di for 1 ≤ i ≤ m}

Then, by Theorem 3.4, there exists a set of perfectly distinguishable quantum states {x1, ..., xk} ⊂ Cn such that {Φ(x1x1∗), ..., Φ(xkxk∗)} are perfectly distinguishable if and only if Dk(Fi∗Fj : 1 ≤ i, j ≤ r) ≠ ∅. We can again turn to the operator system of Φ, SΦ, and find a basis of Hermitian matrices {I, B1, ..., Bm} ⊂ SΦ. Then Dk(Fi∗Fj : 1 ≤ i, j ≤ r) ≠ ∅ if and only if Dk(B1, ..., Bm) ≠ ∅.

4.1 Conditions for the Non-Emptiness of Λk(A1, ..., Am)

We now come to an important question in the field of quantum error correction: for what triples (n, k, m) ∈ N^3 is a k-dimensional code guaranteed to exist for any quantum channel Φ : Mn → Md with dim(SΦ) = m + 1? We have seen above that this is equivalent to asking when Λk(A1, ..., Am) ≠ ∅ for any A1, ..., Am ∈ Hn, where Hn is the set of all Hermitian

matrices in Mn. A common approach to answering this question is to find a relation between

n, k and m that guarantees that Dk(B1, ..., Bm) ≠ ∅ for some basis {I, B1, ..., Bm} ⊂ SΦ, and then extending that to a relation that guarantees a k-dimensional code for Φ using a specific application of Tverberg's Theorem [14], which is written below.

Theorem 4.2. Let m, k ∈ N, let n ≥ (m + 1)(k − 1) + 1 and let D1, ..., Dm ∈ Mn(R) be diagonal. Then Λk(D1, ..., Dm) ≠ ∅.

Tverberg also proves that this bound is minimal. That is, for n < (m + 1)(k − 1) + 1,

there are D1, ..., Dm ∈ Dn such that Λk(D1, ..., Dm) = ∅. For ease of reference, let r(k, m)

be the minimum value of n s.t. Λk(A1, ..., Am) ≠ ∅ for any A1, ..., Am ∈ Hn. In [5], Knill, Laflamme and Viola found the following bound:

Theorem 4.3. Let n ≥ (m + 1)(k − 1) + 1, and let A1, ..., Am ∈ Hn. Then Dk(A1, ..., Am) ≠ ∅.

Proof. The proof is short, and so we will include it. Let v1 ∈ C^n be a unit vector, and for some ℓ, 1 ≤ ℓ < k, assume that we can choose v1, ..., vℓ ∈ C^n so that, for some I = C^{(0)}, C^{(1)}, ..., C^{(m)} ∈ Mℓ, letting A0 = I,

vi^*Ahvj = c^{(h)}i,j δi,j   for 0 ≤ h ≤ m, 1 ≤ i, j ≤ ℓ.

Then, since ℓ(m + 1) ≤ (k − 1)(m + 1) < n, there is a unit vector vℓ+1 that is orthogonal to Ahvi for 0 ≤ h ≤ m, 1 ≤ i ≤ ℓ. Then, by induction, there are v1, ..., vk ∈ C^n s.t., for some I = C^{(0)}, ..., C^{(m)} ∈ Mk,

vi^*Ahvj = c^{(h)}i,j δi,j   for 0 ≤ h ≤ m, 1 ≤ i, j ≤ k.

Let V = [ v1 ··· vk ]. Then V ∈ C^{n×k}, V^*V = Ik and V^*AiV is diagonal for 1 ≤ i ≤ m, so Dk(A1, ..., Am) ≠ ∅.
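The proof above is constructive, and the greedy step is easy to implement. The sketch below (illustrative only; the random test data and tolerances are our own choices) repeatedly picks a unit vector orthogonal to every Ahvi chosen so far, with A0 = I included, exactly as in the induction.

import numpy as np

def random_hermitian(n, rng):
    X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (X + X.conj().T) / 2

def greedy_simultaneous_compression(mats, k, rng=None):
    # Greedy construction from the proof of Theorem 4.3: each new column is a unit
    # vector orthogonal to A_h v_i for all previously chosen v_i (A_0 = I included).
    rng = rng or np.random.default_rng(0)
    n = mats[0].shape[0]
    ops = [np.eye(n, dtype=complex)] + [np.asarray(A, dtype=complex) for A in mats]
    v = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    cols = [v / np.linalg.norm(v)]
    while len(cols) < k:
        # Rows of M are the conjugated constraints (A_h v_i)^*; a new column w must
        # satisfy M w = 0, i.e. lie in the null space of M, which is nontrivial
        # whenever the number of rows is less than n.
        M = np.array([(A @ u).conj() for A in ops for u in cols])
        if M.shape[0] >= n:
            raise ValueError("n is too small for this greedy step")
        _, _, Vh = np.linalg.svd(M)
        w = Vh[-1].conj()
        cols.append(w / np.linalg.norm(w))
    return np.column_stack(cols)

rng = np.random.default_rng(1)
m, k = 2, 3
n = (m + 1) * (k - 1) + 1                     # the bound in Theorem 4.3
mats = [random_hermitian(n, rng) for _ in range(m)]
V = greedy_simultaneous_compression(mats, k, rng)
print(np.linalg.norm(V.conj().T @ V - np.eye(k)))                          # ~0
print(max(np.max(np.abs(np.triu(V.conj().T @ A @ V, 1))) for A in mats))   # ~0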

We can then use Theorem 4.2 to get

r(k, m) ≤ (m + 1)((m + 1)(k − 1) + 1 − 1) + 1 = (m + 1)^2(k − 1) + 1.

Because Knill, Laflamme and Viola apply Theorem 4.2 in a slightly different way, the bound they give in their paper is k − 1 higher than this, but this is the essence of their approach. In [8], Li and Poon improve this bound to r(k, m) ≤ (m + 1)^2(k − 1) by letting the first vector they choose be an eigenvector of A1. Finally, in [15], Weaver proves that Λk(A1, ..., Am) ≠ ∅ for all A1, ..., Am ∈ Hn if (k − 1)(m + 1) + 1 ≤ ⌈n/m⌉. His proof of this condition uses a similar technique to that used in our proof of Theorem 4.9. We can see that this condition is equivalent to r(k, m) ≤ m(m + 1)(k − 1) + 1 if we consider the case n = m(m + 1)(k − 1) + 1. Then we have

⌈(m(m + 1)(k − 1) + 1)/m⌉ = ⌈(m + 1)(k − 1) + 1/m⌉ = (m + 1)(k − 1) + 1,

so for n ≥ m(m + 1)(k − 1) + 1, Weaver's condition is satisfied.
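A quick arithmetic check of the computation above (our own sanity check, nothing more):

# ceil((m(m+1)(k-1)+1)/m) equals (m+1)(k-1)+1 on a small grid of m and k.
for m in range(1, 10):
    for k in range(2, 10):
        n = m * (m + 1) * (k - 1) + 1
        assert (n + m - 1) // m == (m + 1) * (k - 1) + 1   # integer ceiling of n/m
print("ceiling computation confirmed on the tested range")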

We improve on this upper bound for r(k, m) in the next few theorems. First, there is a U ∈ C^{n×k} such that U^*U = Ik and U^*AiU is diagonal for 1 ≤ i ≤ m if and only if there is an orthonormal set {u1, ..., uk} ⊂ C^n such that 0 = ui^*uj = ui^*A1uj = ··· = ui^*Amuj for 1 ≤ i, j ≤ k with i ≠ j. Since we are concerned with finding when Dk is non-empty for some basis of the operator system of a quantum channel Φ, we can assume that A1, ..., Am ∈ Hn. So, we only need to consider 1 ≤ i < j ≤ k.

Let n ∈ N, let A1, ..., A2n−3 ∈ Hn and define x = [ x1 ··· xn ]^* and y = [ y1 ··· yn ]^T, where x1, ..., xn, y1, ..., yn are indeterminates. Then

x^*y = Σ_{i=1}^{n} xiyi,
x^*A1y = Σ_{i,j=1}^{n} (A1)i,j xiyj,
  ⋮
x^*A2n−3y = Σ_{i,j=1}^{n} (A2n−3)i,j xiyj

are 2(n − 1) forms on the 2(n − 1)-dimensional projective variety P^{n−1} × P^{n−1}. Then, by Proposition 2.9, these forms have a common zero, which is represented by some (u, v) ∈ (C^n \ {0})^2 s.t.

u^*v = u^*A1v = ··· = u^*A2n−3v = 0.

This proves the following theorem:

Theorem 4.4. Let n ≥ 2. Then for any A1, ..., A2n−3 ∈ Hn, D2(A1, ..., A2n−3) ≠ ∅.

This result is actually quite surprising if we view it in a slightly different context. Consider the vector space span(v, A1v, ..., A2n−3v). By the result above, for an appropriate choice of v ∈ C^n, this span of 2n − 2 vectors has dimension at most n − 1. So, given any n matrices B1, ..., Bn ∈ span(A1, ..., A2n−3), the vectors B1v, ..., Bnv are linearly dependent, and there is a nonzero c ∈ C^n s.t. v is a zero eigenvector of Σ_{i=1}^{n} ciBi.

To give a more specific example, given A1, A2, A3 ∈ H3, we can find a unitary matrix U ∈ U3 such that the leading principal 2 × 2 submatrices of U^*A1U, U^*A2U and U^*A3U are all diagonal. Even for n this small, this result was previously unknown.

Furthermore, the next two theorems will show that m = 2n − 3 is the largest value of m s.t. D2(A1, ..., Am) ≠ ∅ for any A1, ..., Am ∈ Hn. For A ∈ Mn and α, β ⊂ [n] = {1, ..., n}, let A[α : β] = (ai,j)i∈α, j∈β.

Theorem 4.5. For any n ∈ N, there is a V ∈ U(n) s.t. det(V[α : β]) ≠ 0 for all α, β ⊂ [n] s.t. |α| = |β|.

Proof. Let α, β ⊂ [n] such that |α| = |β| = k. Define f : U(n) → C as f(V) = det(V[α : β]) for all V ∈ U(n). Then f is a polynomial in the entries of V, and is therefore continuous. So, {V ∈ U(n) : det(V[α : β]) ≠ 0} = f^{−1}(C \ {0}) is open.

Let V ∈ U(n) be s.t. det(V[α : β]) = 0, let r = rank(V[α : β]) and let S ⊂ α with |S| = r be s.t. {V[i : β] : i ∈ S} spans the rows of V[α : β]. Since V is unitary, its columns are linearly independent, so the columns of V[[n] : β] are linearly independent and the rows of V[[n] : β] span a k-dimensional space.

Then, there is a p ∈ [n] \ α s.t. V[p : β] is not in the span of the rows of V[α : β]. Let q ∈ α \ S, and define Up,q : [0, 1] → U(n) as follows: let (Up,q(t))p,q = (Up,q(t))q,p = it, let (Up,q(t))p,p = (Up,q(t))q,q = √(1 − t²), and let (Up,q(t))i,j = δi,j for all other entries.

For any ε > 0, let t < √((ε/8)(2 − ε/8)), and define V^{(1)} = Up,q(t)V. Then rank(V^{(1)}[α : β]) = r + 1 and, using the matrix norm ||A|| = tr(A^*A) for every A ∈ Mn,

||V − V^{(1)}|| = ||(I − Up,q(t))V|| = ||I − Up,q(t)|| = 4(1 − √(1 − t²)) < 4(1 − √(1 − ε/4 + ε²/64)) = ε/2.

Using a similar method, we can construct V^{(2)}, ..., V^{(k−r)} s.t. V^{(k−r)} ∈ U(n), rank(V^{(k−r)}[α : β]) = k, and

||V − V^{(k−r)}|| ≤ ||V − V^{(1)}|| + Σ_{i=2}^{k−r} ||V^{(i−1)} − V^{(i)}|| < Σ_{i=1}^{k−r} ε/2^i < ε.

So, the set {V ∈ U(n) : det(V[α : β]) ≠ 0} is open and dense in U(n). Since there are only finitely many pairs α, β ⊂ [n] with |α| = |β|, by the Baire category theorem,

{V ∈ U(n) : det(V[α : β]) ≠ 0 ∀ α, β ⊂ [n] s.t. |α| = |β|} = ∩_{α,β⊂[n], |α|=|β|} {V ∈ U(n) : det(V[α : β]) ≠ 0}

is dense in U(n), and is therefore non-empty.
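Both the conclusion of Theorem 4.5 and the perturbation used in its proof can be probed numerically. The sketch below (illustrative only; the dimension, indices and tolerances are arbitrary choices) draws a random unitary via a QR factorization, checks that all of its square submatrices are nonsingular, and verifies that Up,q(t) is unitary.

import itertools
import numpy as np

def random_unitary(n, rng):
    # QR of a complex Gaussian matrix yields a random unitary matrix.
    Z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    Q, _ = np.linalg.qr(Z)
    return Q

def all_square_minors_nonzero(V, tol=1e-12):
    # Check det(V[alpha : beta]) != 0 for every pair alpha, beta with |alpha| = |beta|.
    n = V.shape[0]
    for size in range(1, n + 1):
        for rows in itertools.combinations(range(n), size):
            for cols in itertools.combinations(range(n), size):
                if abs(np.linalg.det(V[np.ix_(rows, cols)])) < tol:
                    return False
    return True

def U_pq(n, p, q, t):
    # The unitary perturbation U_{p,q}(t) used in the proof above.
    U = np.eye(n, dtype=complex)
    U[p, p] = U[q, q] = np.sqrt(1 - t ** 2)
    U[p, q] = U[q, p] = 1j * t
    return U

rng = np.random.default_rng(0)
V = random_unitary(4, rng)
print(all_square_minors_nonzero(V))                   # expected: True for a generic unitary
U = U_pq(4, 0, 2, 0.1)
print(np.linalg.norm(U.conj().T @ U - np.eye(4)))     # ~0: U_{p,q}(t) is unitary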

Having established the existence of such a matrix, we can now use it to construct

A1, ..., A2n−2 ∈ Hn s.t. D2(A1, ..., A2n−2) = ∅.

Theorem 4.6. Let n ≥ 2, and let V ∈ U(n) be s.t. det(V[α : β]) ≠ 0 for any α, β ⊂ [n] s.t. |α| = |β|. Then, if ei is the i-th standard basis vector for C^n and vi is the i-th column of V,

D2(e1e1^*, ..., en−1en−1^*, v1v1^*, ..., vn−1vn−1^*) = ∅.

Proof. Since every 1 × 1 submatrix of V is nonsingular, every entry vi,j of V is nonzero, so ei^*vj ≠ 0 for 1 ≤ i, j ≤ n. Let S, T ⊂ [n] s.t. |S| + |T| = k ≤ n. Then V[[n] \ S : T] is an (n − |S|) × (k − |S|) matrix, and so contains a nonsingular (k − |S|) × (k − |S|) submatrix by the conditions on V. Then dim(span(ei, vj : i ∈ S, j ∈ T)) = k, since restricting any vanishing linear combination of these vectors to the coordinates in [n] \ S forces the coefficients of the vj to vanish, and then the coefficients of the ei vanish as well.

Let x, y ∈ C^n be nonzero, and suppose x^*y = x^*eiei^*y = x^*vivi^*y = 0 for 1 ≤ i ≤ n − 1. Let P = {e1, ..., en−1, v1, ..., vn−1}. For any vector v ∈ C^n, x^*vv^*y = 0 iff either x ⊥ v or y ⊥ v, so the union of x^⊥ ∩ P and y^⊥ ∩ P covers P. Since dim(span(ei, vj : i ∈ S, j ∈ T)) = k for any S, T ⊂ [n] such that |S| + |T| = k ≤ n, and since dim(x^⊥) = n − 1, we have |x^⊥ ∩ P| ≤ n − 1. Similarly, |y^⊥ ∩ P| ≤ n − 1; since |P| = 2n − 2, it follows that x^⊥ ∩ y^⊥ ∩ P = ∅ and |x^⊥ ∩ P| = |y^⊥ ∩ P| = n − 1.

Then there are S, T ⊂ [n − 1], possibly empty, such that x ⊥ span(ei, vj : i ∈ S, j ∈ T) and |S| + |T| = n − 1. Since |S ∪ {n}| + |T| = n,

dim(span(ei, vj : i ∈ S ∪ {n}, j ∈ T)) = n,

so {ei, vj : i ∈ S ∪ {n}, j ∈ T} is a basis for C^n.

Then, since x ⊥ span(ei, vj : i ∈ S, j ∈ T) and x ≠ 0, we must have x^*en ≠ 0, as otherwise x would be orthogonal to a basis of C^n. Similarly, y^*en ≠ 0. Since x^*ei = 0 for all i ∈ S and y^*ei = 0 for all i ∈ [n − 1] \ S,

x^*y = x̄nyn = (x^*en)(en^*y) ≠ 0.

This contradicts x^*y = 0, so D2(e1e1^*, ..., en−1en−1^*, v1v1^*, ..., vn−1vn−1^*) = ∅.

Unfortunately, for k > 2, we cannot construct indeterminate vectors u1, ..., uk over C s.t. ui^*uj is holomorphic for every i, j ∈ [k] with i < j, so we cannot apply Proposition 2.9 in the same manner to the problem for k > 2. However, we can directly construct the first k − 2 vectors, which results in the following bound:

Theorem 4.7. Let m ≥ 1, k ≥ 2 and n ≥ ⌈(m + 3)/2⌉ + m(k − 2). Then Dk(A1, ..., Am) ≠ ∅ for any A1, ..., Am ∈ Hn.

Proof. We proceed by induction on k. For k = 2, the result follows from Theorem 4.4.

Suppose the result holds for some k ≥ 2. Let m ≥ 1, let n ≥ ⌈(m + 3)/2⌉ + m((k + 1) − 2), let A0 = I, A1, ..., Am ∈ Hn and let S = span(I, A1, ..., Am). Without loss of generality, we can assume that A1 = diag(a1, ..., an) is diagonal. Let u1 = e1. Then A1u1 = a1u1, so dim(Su1) = d ≤ m. Then there is a W ∈ U(n) such that W(Su1) = span(ei : 1 ≤ i ≤ d).

Let B1, ..., Bm be the bottom right (n − d) × (n − d) submatrices of WA1W^*, ..., WAmW^*, respectively. Then, since

n − d ≥ n − m ≥ ⌈(m + 3)/2⌉ + m(k − 2),

by the inductive assumption, there are orthonormal v2, ..., vk+1 ∈ C^{n−d} such that

0 = vj^*vi = vj^*B1vi = ··· = vj^*Bmvi   for all 2 ≤ i < j ≤ k + 1.

Let uj = W^*[0; vj] (the vector in C^n whose first d entries are 0 and whose last n − d entries are vj) for 2 ≤ j ≤ k + 1. Then for j = 2, ..., k + 1 and ℓ = 0, 1, ..., m, uj^*Aℓu1 = (Wuj)^*(WAℓu1) = 0, while for 2 ≤ i < j ≤ k + 1, uj^*ui = vj^*vi and uj^*Aℓui = vj^*Bℓvi for 1 ≤ ℓ ≤ m. Therefore,

0 = uj^*ui = uj^*A1ui = ··· = uj^*Amui   for all 1 ≤ i < j ≤ k + 1.

So, if we let U = [ u1 ··· uk+1 ], then U^*U = I_{k+1} and U^*AiU is diagonal for 1 ≤ i ≤ m. Then D_{k+1}(A1, ..., Am) ≠ ∅.

While we do not know in general whether this bound is optimal, we have seen that it is optimal for k = 2 and clearly optimal for m = 1, and we will now show that it is also optimal for m = 2.

Theorem 4.8. Let k ≥ 2 and let

A1 = [ Ik−1     0    ]          A2 = [   0    Ik−1 ]
     [  0    −Ik−1   ],              [ Ik−1     0  ]

Then Dk(A1,A2) = ∅.

Proof. Let k ≥ 2, n = 2k − 2, and let A1, A2 be defined as in the statement of the theorem. Suppose to the contrary that Dk(A1, A2) ≠ ∅. Then there exists an orthonormal set {u1, ..., uk} of vectors in C^n such that

ui^*uj = ui^*A1uj = ui^*A2uj = 0   for 1 ≤ i < j ≤ k.

Writing ui = [ ui^{(1)} ; ui^{(2)} ], where ui^{(1)}, ui^{(2)} ∈ C^{k−1}, we have

ui^*uj = (ui^{(1)})^*uj^{(1)} + (ui^{(2)})^*uj^{(2)},
ui^*A1uj = (ui^{(1)})^*uj^{(1)} − (ui^{(2)})^*uj^{(2)},
ui^*A2uj = (ui^{(1)})^*uj^{(2)} + (ui^{(2)})^*uj^{(1)}.

So we have

(ui^{(1)})^*uj^{(1)} = (ui^{(2)})^*uj^{(2)} = (ui^{(1)})^*uj^{(2)} + (ui^{(2)})^*uj^{(1)} = 0   for 1 ≤ i < j ≤ k.   (2)

By rearranging the indices, we may assume that for some 0 ≤ m1 ≤ m2 ≤ k, we have ui^{(1)} = 0 for 1 ≤ i ≤ m1, uj^{(2)} = 0 for m1 < j ≤ m2, and ui^{(1)}, ui^{(2)} ≠ 0 for all m2 < i ≤ k. Then ui^{(2)}, uj^{(1)} ≠ 0 for all 1 ≤ i ≤ m1 < j ≤ m2. By (2), {ui^{(2)}, uj^{(1)} : 1 ≤ i ≤ m1 < j ≤ m2} ∪ {ui^{(1)} : m2 < i ≤ k} is an orthogonal family of k nonzero vectors in C^{k−1}, a contradiction.
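The three displayed identities in the proof amount to splitting each vector into its top and bottom halves, and they are easy to sanity-check numerically (the value of k and the random vectors below are our own choices):

import numpy as np

def theorem_4_8_matrices(k):
    # A1 = I_{k-1} (+) (-I_{k-1}); A2 has the identity blocks on the anti-diagonal.
    I, Z = np.eye(k - 1), np.zeros((k - 1, k - 1))
    return np.block([[I, Z], [Z, -I]]), np.block([[Z, I], [I, Z]])

rng = np.random.default_rng(0)
k = 4
A1, A2 = theorem_4_8_matrices(k)
u = rng.standard_normal(2 * (k - 1)) + 1j * rng.standard_normal(2 * (k - 1))
w = rng.standard_normal(2 * (k - 1)) + 1j * rng.standard_normal(2 * (k - 1))
u1, u2, w1, w2 = u[:k - 1], u[k - 1:], w[:k - 1], w[k - 1:]
print(np.isclose(u.conj() @ w,      u1.conj() @ w1 + u2.conj() @ w2))
print(np.isclose(u.conj() @ A1 @ w, u1.conj() @ w1 - u2.conj() @ w2))
print(np.isclose(u.conj() @ A2 @ w, u1.conj() @ w2 + u2.conj() @ w1))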

Finally, we would like to extend these results to obtain an upper bound on r(k, m), that is, a threshold above which Λk(A1, ..., Am) ≠ ∅ for all A1, ..., Am ∈ Hn. However, in this case, in addition to the condition that there exists a U ∈ C^{n×k} s.t. U^*U = Ik and ui^*uj = ui^*Aℓuj = 0 for 1 ≤ i < j ≤ k, 1 ≤ ℓ ≤ m, we have the condition ui^*Aℓui = uj^*Aℓuj for 1 ≤ i < j ≤ k, 1 ≤ ℓ ≤ m. This condition cannot be expressed as a polynomial over C, and so we cannot apply Proposition 2.9. However, we can apply Theorem 4.2.

Let m ∈ N, k ≥ 2 and let q = (m + 1)(k − 1) + 1. Then by Theorem 4.7, if n ≥ ⌈(m + 3)/2⌉ + m(q − 2), then Dq(A1, ..., Am) ≠ ∅ for any A1, ..., Am ∈ Hn. Since U^*AU ∈ Hq whenever A ∈ Hn and U ∈ C^{n×q} has orthonormal columns, and since diagonal Hermitian matrices have real entries, we then have Λk(A1, ..., Am) ≠ ∅ by Theorem 4.2. This proves the following result:

Theorem 4.9. Let m ∈ N, k ≥ 2, and let n ≥ ⌈(m + 3)/2⌉ + m((m + 1)(k − 1) − 1). Then for any A1, ..., Am ∈ Hn, Λk(A1, ..., Am) ≠ ∅.

So r(k, m) ≤ ⌈(m + 3)/2⌉ + m((m + 1)(k − 1) − 1). This result improves on the bound r(k, m) ≤ m(m + 1)(k − 1) + 1 proved in [15] by ⌊(m − 1)/2⌋, where ⌊x⌋ = max{a ∈ Z : a ≤ x}, and is therefore the best known bound for general m, k. However, there are some techniques that can improve on this bound for smaller values of m. First, due to [3] and [9], we have r(k, 1) = 2k − 1 and r(k, 2) = 3k − 2. Using these bounds and some additional techniques, we can also get better bounds for 3 ≤ m ≤ 7.

Theorem 4.10. Let k ≥ 1 and n ≥ 6k − 5. Then for any A1, A2, A3 ∈ Hn, Λk(A1, A2, A3) ≠ ∅.

Proof. For m = 3, n ≥ 2(3k − 2) − 1 = 6k − 5, let A1, A2, A3 ∈ Hn. Since r(k, 1) = 2k − 1, we have n ≥ r(3k − 2, 1), so we can find a P1 ∈ P_{3k−2}(n) s.t. P1A1P1 = λP1 for some λ ∈ C. Since P1 ∈ P_{3k−2}(n), it can be written P1 = UU^* for some U ∈ C^{n×(3k−2)} such that U^*U = I_{3k−2}, and we have U^*A1U = λI_{3k−2}.

Letting B2 = U^*A2U, B3 = U^*A3U ∈ H_{3k−2}, we have Λk(B2, B3) ≠ ∅ because r(k, 2) = 3k − 2, so there is a P2 = VV^* ∈ Pk(3k − 2) s.t. (P2B2P2, P2B3P2) = (c1P2, c2P2) for some c1, c2 ∈ C. Then, letting P = UVV^*U^*, we have P ∈ Pk(n) and (PA1P, PA2P, PA3P) = (λP, c1P, c2P), so Λk(A1, A2, A3) ≠ ∅.

A similar technique can be applied for m = 4, this time separating A1, A2, A3, A4 ∈ Hn into two sets of two. This gives r(k, 4) ≤ 9k − 8. We utilize similar techniques for m = 5, 6, 7, 8, resulting in the table below. Its columns give m, then the upper bound on r(k, m) we can get using these techniques, which we will call r′(k, m), then the bound r(k, m) ≤ m(m + 1)(k − 1) + 1 found in [15], and finally the new bound from Theorem 4.9. Note that for m = 8, our bound and the bound from [15] are actually better than r′(k, m), and the difference r′(k, m) − (⌈(m + 3)/2⌉ + m((m + 1)(k − 1) − 1)) gets progressively larger as m increases past 8.

Table 4.1 First column: Number of matrices, Second column: Best upper bound on r(k, m) achievable for m matrices using techniques similar to the proof of Theorem 4.10, Third column: Weaver’s bound for m matrices, Fourth column: The bound from Theorem 4.9 for m matrices

m    r′(k, m)      m(m + 1)(k − 1) + 1    ⌈(m + 3)/2⌉ + m((m + 1)(k − 1) − 1)
1    2k − 1        2k − 1                 2k − 1
2    3k − 2        6k − 5                 6k − 5
3    6k − 5        12k − 11               12(k − 1)
4    9k − 8        20k − 19               20(k − 1)
5    18k − 17      30k − 29               30k − 31
6    27k − 26      42k − 41               42k − 43
7    54k − 53      56k − 55               56k − 58
8    81k − 80      72k − 71               72k − 74
9    144k − 146    90k − 89               90k − 93
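The last two columns of Table 4.1, and the ⌊(m − 1)/2⌋ improvement noted after Theorem 4.9, can be regenerated by the short script below (a check of the arithmetic only; the r′(k, m) column comes from the ad hoc constructions above and is not recomputed here):

import math

def weaver_bound(k, m):
    return m * (m + 1) * (k - 1) + 1                       # the bound from [15]

def thm_4_9_bound(k, m):
    return math.ceil((m + 3) / 2) + m * ((m + 1) * (k - 1) - 1)

for m in range(1, 10):
    # Both bounds are affine in k, so two evaluations recover "coeff*k + const".
    rows = []
    for f in (weaver_bound, thm_4_9_bound):
        coeff, const = f(1, m) - f(0, m), f(0, m)
        rows.append(f"{coeff}k {'+' if const >= 0 else '-'} {abs(const)}")
    print(m, *rows)
    # The k-independent improvement claimed after Theorem 4.9:
    assert weaver_bound(5, m) - thm_4_9_bound(5, m) == (m - 1) // 2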

4.2 Geometry of Λk(A1, ..., Am)

Having considered the question of when Λk(A1, ..., Am) ≠ ∅ for any A1, ..., Am ∈ Hn, we now turn to the geometric properties Λk(A1, ..., Am) possesses when it is, in fact, non-empty. Since every point in Λk(A1, ..., Am) represents a k-dimensional quantum error correcting code for any Φ : Mn → Mn s.t. SΦ ⊂ span(I, A1, ..., Am), having some understanding of the geometry of Λk(A1, ..., Am) could be very useful. There are two geometric properties that have been the primary focus of research with respect to this question, namely convexity and being star-shaped.

We call a set S ⊂ C^n convex if, for any x, y ∈ S and any t ∈ [0, 1], tx + (1 − t)y ∈ S. Then the convex hull of S is conv(S) := ∩{T ⊆ C^n : S ⊂ T, T is convex}. From [7], we have that Λ1(A1, ..., Am) is convex for every A1, ..., Am ∈ Hn when m = 3 and n ≥ 3. We also have the following result from [10]:

Theorem 4.11. Suppose A ∈ Mn and 1 ≤ r < k. Then

Λk(A) = ∩{Λ_{k−r}(X^*AX) : X ∈ C^{n×(n−r)}, X^*X = I_{n−r}}.

Letting r = k − 1, we get Λk(A) = ∩{W(X^*AX) : X ∈ C^{n×(n−k+1)}, X^*X = I_{n−k+1}}. Since the numerical range of every square matrix is convex by the Toeplitz-Hausdorff Theorem

[16], Λk(A) is convex for every matrix A ∈ Mn. Furthermore, any matrix A ∈ Mn can be written as A = A1 + iA2 for some A1, A2 ∈ Hn, so for any U ∈ C^{n×k} s.t. U^*U = Ik and U^*AU = U^*A1U + iU^*A2U = λIk, we have (U^*A1U, U^*A2U) = (λ1Ik, λ2Ik) with λ1 + iλ2 = λ. By the Spectral Decomposition Theorem, A1 = U1^*D1U1 for some D1 = diag(d1,1, ..., dn,n) ∈ R^{n×n} and U1 ∈ Un, so for any x ∈ C^n,

x^*A1x = x^*U1^*D1U1x = Σ_{i=1}^{n} di,i |(U1x)i|² ∈ R.

So, λ1, λ2 ∈ R. Let a, b ∈ Λk(A), and write a = a1 + ia2, b = b1 + ib2 with a1, a2, b1, b2 ∈ R. Then, since Λk(A) is convex, for every t ∈ [0, 1] there is a Ut ∈ C^{n×k} with Ut^*Ut = Ik s.t.

Ut^*A1Ut + iUt^*A2Ut = Ut^*AUt = (ta + (1 − t)b)Ik
                              = (t(a1 + ia2) + (1 − t)(b1 + ib2))Ik
                              = (ta1 + (1 − t)b1)Ik + i(ta2 + (1 − t)b2)Ik.

Then Ut^*A1Ut = (ta1 + (1 − t)b1)Ik and Ut^*A2Ut = (ta2 + (1 − t)b2)Ik, and therefore Λk(A1, A2) is convex.
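The Cartesian decomposition A = A1 + iA2 and the way a scalar compression splits into its real and imaginary parts can be checked on a synthetic example, built (by our own choice) so that compressing onto the first k coordinates is scalar by construction:

import numpy as np

rng = np.random.default_rng(0)
n, k = 5, 2
lam = 0.7 - 1.3j
B = rng.standard_normal((n - k, n - k)) + 1j * rng.standard_normal((n - k, n - k))
# A is built so that compressing by the first k coordinates gives lam * I_k.
A = np.block([[lam * np.eye(k), np.zeros((k, n - k))],
              [np.zeros((n - k, k)), B]])
A1 = (A + A.conj().T) / 2          # Hermitian part
A2 = (A - A.conj().T) / (2j)       # also Hermitian, with A = A1 + i A2
U = np.eye(n, dtype=complex)[:, :k]
print(np.allclose(U.conj().T @ A @ U, lam * np.eye(k)))
print(np.allclose(U.conj().T @ A1 @ U, lam.real * np.eye(k)))
print(np.allclose(U.conj().T @ A2 @ U, lam.imag * np.eye(k)))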

On the other hand, we can have Λk(A1, ..., Am) be nonconvex for m ≥ 3. We find the following example in [8]:

Example. Let

B1 = [ 1   0 ]      B2 = [ 0  1 ]      B3 = [  0  i ]
     [ 0  −1 ],          [ 1  0 ],          [ −i  0 ].

We can see by straightforward calculation that Λ1(B1, B2, B3) = W(B1, B2, B3) = {v ∈ R^3 : ||v|| = 1}, which is not convex. Then, for m ≥ 3, k > 1, let Aj = Bj ⊗ Ik for j = 1, 2, 3, and let Ai = 0_{2k} for 3 < i ≤ m. Then Λk(A1, ..., Am) ≅ W(B1, B2, B3), and is therefore not convex.
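This example is easy to verify by sampling (our own check): every sampled point of W(B1, B2, B3) has Euclidean norm 1, while the midpoint of (1, 0, 0) and (−1, 0, 0), both of which belong to the set, is the origin and does not.

import numpy as np

B1 = np.array([[1, 0], [0, -1]], dtype=complex)
B2 = np.array([[0, 1], [1, 0]], dtype=complex)
B3 = np.array([[0, 1j], [-1j, 0]], dtype=complex)

rng = np.random.default_rng(0)
for _ in range(5):
    v = rng.standard_normal(2) + 1j * rng.standard_normal(2)
    v /= np.linalg.norm(v)
    p = np.array([np.real(v.conj() @ B @ v) for B in (B1, B2, B3)])
    print(np.round(p, 3), "norm =", round(np.linalg.norm(p), 6))
# Every sampled point has norm 1, while the midpoint of (1,0,0) and (-1,0,0)
# is the origin, which has norm 0 and is therefore not in W(B1, B2, B3).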

Finally, in [7], Li and Poon found that for n > 1, m ≥ 4, there are A1, ..., Am ∈ Hn s.t.

Λ1(A1, ..., Am) is not convex.

Theorem 4.12. Suppose A1,A2,A3 are Hermitian matrices such that {I,A1,A2,A3} is lin- early independent. Then there exists a Hermitian matrix A4 such that W (A1,A2,A3,A4) is not convex.

So, the conditions under which Λk(A1, ..., Am) is convex for all A1, ..., Am ∈ Hn are quite limited. However, for large enough n, Λk(A1, ..., Am) does have a weaker property: it is star-shaped.

Definition 4.13. Let S ⊂ Cn. Then S is star-shaped if there is an x ∈ S s.t. for all y ∈ S, t ∈ [0, 1], tx + (1 − t)y ∈ S. We call x a star center of S.

In [8], Li and Poon proved the following condition under which Λk(A1, ..., Am) is star-shaped for all A1, ..., Am ∈ Hn:

Theorem 4.14. Let A1, ..., Am ∈ Hn, and let k ∈ N. If Λk̂(A1, ..., Am) ≠ ∅ for some k̂ ≥ (m + 2)k, then Λk(A1, ..., Am) is star-shaped, and every element of conv(Λk̂(A1, ..., Am)) is a star-center for Λk(A1, ..., Am).

We can combine this with the result proved in Theorem 4.9, and we get

Theorem 4.15. Let m ≥ 1, k ≥ 2, let n ≥ ⌈(m + 3)/2⌉ + m((m + 1)((m + 2)k − 1) − 1) and let A1, ..., Am ∈ Hn. Then Λk(A1, ..., Am) is star-shaped.

Previously, the best known bound was n ≥ (m + 1)^2((m + 2)k − 1), from [8]. The difference between this bound and ours is

(m + 1)(m + 2)k − ⌈(m + 5)/2⌉,

which is quite significant even at low values of m and k.
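A quick script (arithmetic check only, on a small grid of our own choosing) confirming the displayed difference:

import math

for m in range(1, 10):
    for k in range(2, 8):
        old = (m + 1) ** 2 * ((m + 2) * k - 1)
        new = math.ceil((m + 3) / 2) + m * ((m + 1) * ((m + 2) * k - 1) - 1)
        assert old - new == (m + 1) * (m + 2) * k - math.ceil((m + 5) / 2)
print("difference formula verified on the tested range")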

CHAPTER 5. FUTURE WORK

There are several questions that are still open for investigation in this area. First, the techniques used in the proof of Theorem 4.10 show that finding a lower bound on n s.t. Dk(A1, ..., Am) ≠ ∅ for all A1, ..., Am ∈ Hn, and then applying Theorem 4.2 to replace k with (m + 1)(k − 1) + 1, is inefficient. That is, it results in bounds that are significantly higher for some values of m, k than bounds that can be achieved using different techniques. However, the only alternate techniques found thus far either cannot extend beyond low values of m or k, or are only superior for low values of m. Finding a new technique that improves the bounds we have for general m, k would be a significant step forward.

Second, it is still an open question whether we have the best bound for n above which Dk(A1, ..., Am) ≠ ∅ for all A1, ..., Am ∈ Hn. We know the best bound for k = 2 and m = 2, and the best bound for m = 1 and k = 1 is trivial. However, we do not know if the methods used to extend our bounds to higher values of m and k are optimal. The particular application of Bézout's theorem that we use does give us an indication of where we could start, though.

Let k, m > 2, n ∈ N, let A0 = I, A1, ..., Am ∈ Hn and let v ∈ C^n. Then, if dim(span(v, A1v, ..., Amv)) = d ≤ m, for any α ⊂ {0, ..., m} with |α| = d + 1 there is some c^{(α)} ∈ C^{d+1} s.t. Σ_{i∈α} ci^{(α)}Aiv = 0. Each such equation is composed of n forms on P^{n−1} × P^d, and there are p = (m+1 choose d+1) such equations, for a total of p · n forms on P^{n−1} × (P^d)^p. Then, by Bézout's Theorem, if p · n ≤ n − 1 + p · d, there is always a nonzero solution. Some exploration into these numbers shows that the inequality is always satisfied for d = m (as would be expected, since we can always choose v to be an eigenvector of A1), while for d = m − 1 the inequality becomes n ≤ m − 2/m, which is equivalent to n ≤ m − 1 for m ≥ 2.

This bound obviously has some holes in it, since we know that Dk(A1, ..., Am) ≠ ∅ for A1, ..., Am ∈ Hm. However, it does give us a set of forms to study, and hopefully we can find some properties of the projective variety given by this set of forms.

Finally, the quantum channels we deal with are often artificially general. In real world applications, there are particular errors that are likely to occur if any error occurs at all, and most other errors are so unlikely that they can be ignored. For instance, the bit flip error σx = [[0, 1], [1, 0]] and the phase flip error σz = [[1, 0], [0, −1]] have been widely studied because they occur more often, and there is a specific type of operator system that assumes that no more than r errors of this type occur during the transmission. Another example that has been studied is an error caused by a stable fault in the communication line, which would cause the same error to be applied to every qubit in the quantum state. We would represent this using the quantum channel Φ(ρ) = (X^{⊗r})ρ(X^{⊗r})^*, where X ∈ U_{2^s} and rs is the number of qubits in the system (a small numerical sketch of this channel is given after this paragraph). The operator systems produced by such quantum channels can sometimes be corrected much more efficiently than just a general quantum channel Φ : Mn → Mn with dim(SΦ) = m + 1. There are a lot of important questions about the general case that could provide a baseline if they were solved, but when it comes to applying the theory practically, some communication between pure mathematicians and the physicists dealing directly with the physical problem can certainly help frame the problem better.
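As a concrete illustration of this last channel (a minimal sketch with s = 1 and r = 2; the helper name is our own), the following builds Φ(ρ) = (X^{⊗r})ρ(X^{⊗r})^* with numpy:

import numpy as np
from functools import reduce

def repeated_error_channel(X, r):
    # Return the map rho -> (X tensor ... tensor X) rho (X tensor ... tensor X)^*.
    Xr = reduce(np.kron, [X] * r)
    return lambda rho: Xr @ rho @ Xr.conj().T

# Example with s = 1: the same bit flip applied to each of r = 2 qubits.
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
Phi = repeated_error_channel(sigma_x, 2)
rho = np.zeros((4, 4), dtype=complex)
rho[0, 0] = 1.0                               # the state |00><00|
print(np.real_if_close(np.diag(Phi(rho))))    # all mass moves to |11><11|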

REFERENCES

[1] M.D. Choi, Completely Positive Linear Maps on Complex Matrices, Linear Algebra and its Applications, 10:3 (1975), 285-290

[2] F.M. DeAssis, E.B. Guedes, R.A.C. Medeiros, Quantum Zero-Error Information Theory, Springer, 2016

[3] K. Fan and G. Pall, Embedding Conditions for Hermitian and Normal Matrices, Canad. J. Math. 9 (1957), 298-304

[4] E. Knill, R. Laflamme, A Theory of Quantum Error-Correcting Codes, Physical Review A., 55:2 (1997), 900-911

[5] E. Knill, R. Laflamme, L. Viola, Theory of Quantum Error Correction for General Noise, Phys Rev Lett., 84:11 (2000), 2525-2528

[6] C.K. Li, M. Nakahara, Y.T. Poon, N.S. Sze, H. Tomita, Recovery in quantum error correction for general noise without measurement, Quantum Information and Computation, 12 (2012), 149-158

[7] C.K. Li, Y.T. Poon, Convexity of the joint numerical range, J. Matrix Analysis Appl., 21 (1999), 668-678

[8] C.K. Li, Y.T. Poon, Generalized Numerical Ranges and Quantum Error Correction, J. Operator Theory, 66:2 (2011), 335-351

[9] C.K. Li, Y.T. Poon, N.S. Sze, Condition for the higher rank numerical range to be non-empty, Linear and Multilinear Algebra, 57 (2009), 365-368

[10] C.K. Li, Y.T. Poon, N.S. Sze, Higher rank numerical ranges and low rank perturbations of quantum channels, Journal of Mathematical Analysis and Applications, 348:2 (2008), 843-855

[11] C.K. Li, N.S. Sze, Canonical forms, higher rank numerical ranges, totally isotropic subspaces, and matrix equations, Proc. Amer. Math. Soc., 136:9 (2008), 3013-3023

[12] V. Paulsen, Entanglement and Non-Locality, lecture notes, Winter 2016. Retrieved from http://www.math.uwaterloo.ca/~vpaulsen/EntanglementAndNonlocality_LectureNotes_7.pdf

[13] I.R. Shafarevich, Basic Algebraic Geometry 1: Varieties in Projective Space, 3rd ed., Springer, 2013

[14] H. Tverberg, A generalization of Radon’s theorem, J. Lond. Math. Soc., 41 (1966), 123-128.

[15] N. Weaver, The "Quantum" Turán Problem for Operator Systems, Pacific Journal of Mathematics, 301 (2019), 335-349

[16] F. Zhang, Matrix Theory: Basic Results and Techniques, 2nd ed., Springer, 2011