
Convex Set Approximation Problems in Quantum Information

by Eric Fernandes

A Thesis presented to The University of Guelph

In partial fulfilment of requirements for the degree of Master of Science in Mathematics and Statistics

Guelph, Ontario, Canada
© Eric Fernandes, May 2020

ABSTRACT

CONVEX SET APPROXIMATION PROBLEMS IN QUANTUM INFORMATION

Eric Fernandes
University of Guelph, 2019

Advisor: Dr. Rajesh Pereira
Co-Advisor: Dr. Bei Zeng

This thesis investigates methods to approximate convex sets by minimizing the Hausdorff metric between a set and certain subsets. We begin by giving a lower bound for the Hausdorff metric between a hypersphere and a circumscribed simplex. We show that this bound is achieved by the regular simplex. Next, we form a lower bound on the Hausdorff distance between the convex hull of the joint numerical range of positive operator-valued measures and the probability simplex. An entanglement witness is a linear functional that separates the convex compact set of separable states from certain entangled states. We investigate the applications of our methods by exploring the problem of finding a polytope generated by entanglement witnesses that has minimal distance to the set of separable states.

Dedication

Sister Bridget

Acknowledgements

Thank you, Dr. Rajesh Pereira, for your guidance and insight throughout my master’s experience. Your dedication to your field is inspirational and has encouraged me to remain motivated for any challenges to come. I would like to thank Dr. Bei Zeng for the opportunity to work in the field of quantum information and for making my transition from physics to math seamless. Thank you to Ali, Comfort, David, and Kat for being welcoming friends in the department. I would like to thank the lads at 93 Moss for reminding me to take breaks from studies; I couldn’t ask for a better group of brothers. Also an honourable mention to my friends at 748 Scottsdale for allowing me to frequently down spike you in Smash Ultimate.

Contents

Abstract

Dedication

Acknowledgements

1 Introduction and Preliminaries
1.1 Outline
1.2 Mathematical Foundations
1.2.1 Euclidean Space
1.2.2 Affine Geometry
1.2.3 Convexity
1.2.4 Hilbert Space
1.2.5 Half Spaces
1.2.6 Simplex
1.2.7 Hausdorff Distance
1.2.8 Joint Numerical Range
1.3 Quantum Information

2 An Application of Blaschke’s Selection Theorem
2.1 Two problems in convex optimization
2.2 The Blaschke Selection Theorem
2.3 Main Results

3 Optimal Simplex Enclosing a Hypersphere
3.1 Introduction
3.2 Circumradius and Inradius
3.3 Circumcircle and Incircle of a Triangle
3.4 Generalization of Euler’s Triangle Inequality
3.5 Application of Klamkin and Tsintsifas’ Result on the Hausdorff distance
3.6 Conclusion

4 Standard Simplex and the Joint Numerical Range of POVMs
4.1 Introduction
4.2 A Hausdorff metric inequality

5 Entanglement Witnesses and Further Work
5.1 Introduction
5.2 Separability criterion for low dimension
5.3 Entanglement witnesses
5.3.1 Problem with the entanglement witnesses
5.4 Application of Hausdorff distance with the entanglement witness

Bibliography

Chapter 1

Introduction and Preliminaries

1.1 Outline

Convexity often arises in the mathematics used in quantum information theory. Convex analysis is the study of the properties of convex sets and convex functions, and it has major applications in optimization. This thesis explores approximations of convex sets and their applications to quantum systems. We study two convex sets A and B, where B is a subset of A. One of these sets is fixed, and the other varies among a class of sets. We find a member of that class that minimizes the distance between A and B. Since A and B are sets, the most natural way to define the distance between them is the Hausdorff distance. By bounding the Hausdorff distance, we develop optimal set approximations as well as an understanding of the properties of each set.

Quantum information is the manipulation, communication, and storage of information using the laws of quantum mechanics. Quantum mechanics is vastly different from classical mechanics because of phenomena such as superposition and entanglement. Entangled particles can have an effect on one another, even when separated by great distances. Particles that do not exhibit this phenomenon are called separable. Even though entanglement is difficult to grasp intuitively, there are mathematical descriptions that help develop an idea of how the process behaves. Whether a particle is entangled or separable, it can be described as a state in a Hilbert space: quantum states can be described as unit vectors in a Hilbert space. Quantum states that are separable form a convex set. These convex sets are often irregularly shaped and difficult to approximate. Entanglement witnesses are used to detect entanglement. An entanglement witness is a linear functional that separates entangled states from the set of separable states. This method of detecting entanglement stems from the Hahn-Banach theorem, or the separating hyperplane theorem. The Hahn-Banach theorem dates back almost a century and is a central result in functional analysis. It has been used in quantum information for the last couple of decades.

We begin Chapter 1 by explaining the mathematical definitions that are applied later in the thesis. We introduce methods used to analyze convex sets, such as the Hausdorff distance. The Hausdorff distance allows us to measure how close two bounded sets are to each other. This aids in analyzing set approximations and understanding the properties of our sets. A complete inner product space is introduced so that vectors can be used to describe states in quantum information.

In Chapter 2, we introduce two general problems that apply to the remainder of the thesis. We apply two completely different methods to solve special cases of these problems in Chapter 3 and Chapter 4. Using the Blaschke Selection Theorem, we show that both of these problems always have a solution.

In Chapter 3, we study the problem of finding the simplex containing the closed unit ball of dimension n which has minimal Hausdorff distance to the closed unit ball. In order to do this, we introduce the geometric properties and definitions that arise with convex sets and their enclosed convex subsets. The inradius and circumradius are the main parameters used in this chapter to calculate the distance between these objects. A relationship between these radii, introduced by Klamkin and Tsintsifas in [16], inspires the main result of this chapter, which bounds the Hausdorff distance between the hypersphere and the simplex that contains it.

In Chapter 4, the simplex, or more specifically the probability simplex, is fixed while the convex hull of the joint numerical range of positive operator-valued measures (POVMs) is enclosed inside it. We make statements about types of POVMs and bound the distance between the set and the convex hull. Chapter 4 is where the main theorem of the thesis is proposed and proved.

Finally, Chapter 5 is focused on further research. We begin with the problem of separability of pure states. Separability of mixed states is more difficult and requires operational entanglement criteria. Operational entanglement criteria are ways to identify entanglement other than Schmidt decomposition. There are still limitations to identifying entanglement with this method. This introduces the idea of the entanglement witness separating entangled states from separable states. An entanglement witness is a linear functional used to distinguish separable states from entangled states. We conclude the thesis with an application of the Hausdorff distance to entanglement witnesses.

1.2 Mathematical Foundations

This chapter introduces the basic mathematical tools used throughout the thesis. We begin with an introduction to the types of spaces and operations used in these spaces. We also introduce the notion of entanglement and other quantum properties.

1.2.1 Euclidean Space

Definition 1.1. [14] A norm on a vector space V is a rule which, given any v ∈ V, specifies a real number ||v|| such that

1. ||v|| > 0 if v ≠ 0, and ||0|| = 0;

2. ||av|| = |a| · ||v|| for any v ∈ V and any scalar a;

3. ||v + w|| ≤ ||v|| + ||w|| for any v, w ∈ V.

A normed space is a vector space V with a given norm.

Remark 1.2. The norm is used to measure the distance between two vectors.

Lemma 1.3. (Cauchy-Schwarz Inequality) For any vectors $v, w \in \mathbb{R}^n$,

$$\left(\sum_{i=1}^{n} v_i w_i\right)^{2} \le \left(\sum_{i=1}^{n} |v_i|^2\right)\left(\sum_{i=1}^{n} |w_i|^2\right).$$

Definition 1.4. [11] A Cauchy sequence of elements of a normed space is a sequence $(x_n)$ such that for any $\varepsilon > 0$ there is a number $N$ such that $\|x_n - x_m\| < \varepsilon$ for all $n, m \ge N$.

Definition 1.5. [11] A normed space M is called complete if every Cauchy sequence in M converges in M.

Theorem 1.6. [2] (Bolzano-Weierstrass) Each bounded sequence of real (or complex) numbers contains a convergent subsequence.

1.2.2 Affine Geometry

We assume the reader has a basic understanding of linear algebra, including such topics as linear dependence and independence, dimension, linear maps and so forth. For the majority of the thesis, we will be working in a space of dimension n. When studying problems that are invariant under translations, it is more natural to work in the setting of affine geometry.

Definition 1.7. Let $\{v_1, v_2, \dots, v_m\}$ be elements of $\mathbb{R}^n$. Then $\{v_1, v_2, \dots, v_m\}$ are affinely independent if, whenever $\sum_{i=1}^{m} k_i v_i = 0$ for real numbers $k_1, k_2, \dots, k_m$ that satisfy $\sum_{i=1}^{m} k_i = 0$, we have $k_1 = k_2 = \dots = k_m = 0$.

The points $\{v_1, v_2, \dots, v_m\}$ are affinely dependent if there exist such $k_1, \dots, k_m$ with $k_i \ne 0$ for some $i$.
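Definition 1.7 can be tested computationally through a standard equivalent formulation that is not stated in the text: the points are affinely independent exactly when the differences $v_i - v_1$ are linearly independent. The following is a minimal sketch under that assumption; the helper names `rank` and `affinely_independent` are illustrative only.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        # Find a row at or below position r with a non-zero entry in column c.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def affinely_independent(points):
    """Assumed criterion: the differences v_i - v_1 are linearly independent."""
    base = points[0]
    diffs = [[a - b for a, b in zip(p, base)] for p in points[1:]]
    return rank(diffs) == len(diffs) if diffs else True

triangle = [(0, 0), (1, 0), (0, 1)]         # vertices of a 2-simplex
collinear = [(0, 0), (1, 1), (2, 2)]        # three points on one line
square = [(0, 0), (1, 0), (0, 1), (1, 1)]   # four points in the plane
```

In the plane the triangle passes the test while the collinear triple and the four corners of the square do not, in line with the bound of n + 1 points in Remark 1.8.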

Remark 1.8. Any maximal affinely independent subset of $\mathbb{R}^n$ must have n + 1 elements.

Definition 1.9. [30] A mapping $T : \mathbb{R}^n \to \mathbb{R}^m$ is called an affine transformation if $T(\lambda x + \mu y) = \lambda T(x) + \mu T(y)$ whenever $x, y \in \mathbb{R}^n$ and $\lambda + \mu = 1$.

In addition to the origin 0 := (0, 0,..., 0) the vector whose entries are all ones 1 := (1, 1,..., 1) is also worth noting.

1.2.3 Convexity

Definition 1.10. [11] Let V be a real vector space. Then a linear functional is a function F : V → R with the following properties:

1. F (v + w) = F (v) + F (w) for all v, w ∈ V.

2. F (αv) = αF (v) for all v ∈ V and all constants α ∈ R.

In this thesis most of the vector spaces will be real, and hence α will be a real number. If the vector space is complex, then F maps V into the complex numbers.

Definition 1.11. [11] A set P in a normed linear space H is a hyperplane if there exists a non-zero continuous linear functional f and a scalar α such that

P = {x ∈ H | f(x) = α}.

Remark 1.12. These hyperplanes can be used to create half-spaces, which will be explored later on.

Definition 1.13. [11] A subset S of a normed space is said to be open if for each x ∈ S there is a δ > 0 such that y ∈ S whenever ‖x − y‖ < δ.

Another way to think about an open set S is that every point in S can be surrounded by a ball of points that also lie in S.

Definition 1.14. [11] A set S is said to be closed if it contains all of its limit points.

It can be shown that the set S is closed if and only if the complement of S is open.

Definition 1.15. [11] A subset S of $\mathbb{R}^n$ is said to be compact if every sequence in S has a subsequence that converges to an element contained in S.

Definition 1.16. [11] A subset S of a normed space N is called bounded if there is a number M such that ‖x‖ ≤ M for all x ∈ S.

Definition 1.17. [30] Let x and y be distinct points of Rn then the subset

{λx + µy | λ, µ ≥ 0, λ + µ = 1}

of the line through x and y is called the line segment joining x and y.

Definition 1.18. [11] A set S in a vector space is called convex if, for any x, y ∈ S, λx + (1 − λ)y ∈ S for all λ ∈ [0, 1].

Remark 1.19. The line segment joining any two members of a convex set lies entirely in the set.

Definition 1.20. [30] A point x is said to be a convex combination of points $a_1, \dots, a_m \in \mathbb{R}^n$ if there exist scalars $\lambda_1, \dots, \lambda_m \ge 0$ with $\lambda_1 + \dots + \lambda_m = 1$ such that

$$x = \lambda_1 a_1 + \dots + \lambda_m a_m.$$

Remark 1.21. Convex sets are closed under convex combinations.

Definition 1.22. [30] The convex hull of a set $A \subset \mathbb{R}^n$, denoted conv(A), is the intersection of all convex sets in $\mathbb{R}^n$ containing A.

Remark 1.23. The convex hull of a set in two dimensions can be thought of as wrapping a rubber band around the extreme points of the set. It is the smallest convex set that contains the set.

Theorem 1.24. [4] (Carathéodory’s theorem) Let a ∈ conv(A), where A is a subset of $\mathbb{R}^n$. Then a can be expressed as a convex combination of n + 1 or fewer points of A.

Proof. There exists points a1, . . . , am of A and scalars λ1, . . . , λm ≥ 0 with λ1 + ··· + λm = 1 such that

a = λ1a1 + ··· + λmam.

We assume that this representation of a is chosen so that a cannot be expressed as a convex combination of fewer than m points of A. It follows that no two of the points $a_1, \dots, a_m$ are equal and that each $\lambda_i > 0$. Suppose that $m > n + 1$. Then, since $A \subseteq \mathbb{R}^n$, the set $\{a_1, \dots, a_m\}$ must be affinely dependent, and so there exist scalars $\mu_1, \dots, \mu_m$, not all zero, such that

$$0 = \mu_1 a_1 + \dots + \mu_m a_m, \qquad \mu_1 + \dots + \mu_m = 0.$$

Let $t > 0$ be such that the scalars $\lambda_1 + \mu_1 t, \dots, \lambda_m + \mu_m t$ are non-negative with at least one of them zero; such a t exists since the λ’s are all positive and at least one of the µ’s is negative. The equation

$$a = (\lambda_1 + \mu_1 t) a_1 + \dots + (\lambda_m + \mu_m t) a_m,$$

when zero coefficient terms are omitted, exhibits a as a convex combination of fewer than m points of A. This contradiction to the minimality of m shows that $m \le n + 1$.

1.2.4 Hilbert Space

Definition 1.25. [11] A (real or complex) inner product space is a (real or complex) vector space V with an inner product specified. An inner product is a rule which, given any $x, y \in V$, specifies a number $\langle x, y \rangle$ such that

1. $\langle x, x \rangle$ is real and positive for all $x \ne 0$, and $\langle 0, 0 \rangle = 0$;

2. $\langle x, y \rangle = \overline{\langle y, x \rangle}$ for all $x, y \in V$;

3. $\langle ax, y \rangle = a \langle x, y \rangle$ for any scalar a and $x, y \in V$;

4. $\langle x, y + z \rangle = \langle x, y \rangle + \langle x, z \rangle$ for all $x, y, z \in V$.

Remark 1.26. For a real space, the bar in $\overline{\langle y, x \rangle}$ indicating complex conjugation is redundant since the inner products are all real. These properties are generally the same as those that govern the scalar product in ordinary vector algebra. However, in a complex space the inner product is not linear but conjugate linear in its second argument.

A Hilbert space is a complete inner product space. Hilbert spaces are important in quantum mechanics because any quantum state can be described as a unit vector in Hilbert space.

Definition 1.27. Let $H_A$ and $H_B$ be complex Hilbert spaces, and consider a continuous linear operator $A : H_A \to H_B$. Then the adjoint of A is the operator $A^* : H_B \to H_A$ satisfying $\langle Ax, y \rangle = \langle x, A^* y \rangle$ for all $x \in H_A$ and $y \in H_B$.

Remark 1.28. A Banach space is a complete normed vector space. A Hilbert space is a Banach space.

A Hilbert space H has a norm defined by $\|v\| = \sqrt{\langle v, v \rangle}$. The Hilbert spaces in this thesis are finite dimensional. A vector ψ ∈ H is a column vector denoted as $|\psi\rangle$. Correspondingly, $\langle\phi|$ will denote a linear functional on H defined by the inner product

$$\langle\phi| : \psi \mapsto \langle\phi|\psi\rangle, \qquad \psi \in H.$$

This is known as bra-ket notation and was developed by Dirac as the common notation for quantum states. The set of continuous linear functionals forms another Hilbert space, called the dual space $H^*$ of H, whose elements should be thought of as row vectors $|\psi\rangle^*$, which we identify with $\langle\psi|$ [13].

Definition 1.29. A bounded linear operator $A : V \to W$ between two Banach spaces is a linear map that satisfies the inequality

$$\|Av\| \le C\|v\|$$

for all v ∈ V, where C is a constant independent of the choice of v ∈ V.

Remark 1.30. Bounded linear operators $A : H_A \to H_B$ are continuous.

Proof. Let $\varepsilon > 0$. If $v, w \in V$ satisfy $\|v - w\| < \varepsilon / C$, then $\|Av - Aw\| \le C\|v - w\| < \varepsilon$. Therefore, not only is A continuous, but uniformly continuous as well.

Definition 1.31. Given a normed vector space V the operator norm of A is given by

||A|| = sup{||Ax|| | ||x|| ≤ 1}

where x ∈ V.

Observation 1.32. [11] A matrix A is said to be Hermitian if $A = A^*$, where $A^* = \overline{A}^{T}$ is the conjugate transpose. We note the following observations:

1. $A + A^*$, $AA^*$, and $A^*A$ are all Hermitian.

2. If A is Hermitian then $A^k$ is Hermitian for all $k \in \mathbb{Z}^+$.

3. If A is Hermitian and non-singular then $A^{-1}$ is also Hermitian.

Definition 1.33. [29] A square matrix A is unitary if $AA^* = A^*A = I$.

The product of two unitary matrices is also a unitary matrix. The inverse of a unitary matrix is again a unitary matrix. Unitary matrices form a group under matrix multiplication.

Definition 1.34. [29] A square matrix A is normal if $AA^* = A^*A$.

Theorem 1.35. (Cartesian Decomposition) If M is a square complex matrix, then there exist unique Hermitian matrices $H_1$ and $H_2$ such that $M = H_1 + iH_2$.

Proof. Assume M is a square complex matrix. Choose $H_1 = \frac{M + M^*}{2}$ and $H_2 = \frac{M - M^*}{2i}$, both of which are Hermitian, so that

$$H_1 + iH_2 = \frac{M + M^*}{2} + i\,\frac{M - M^*}{2i} = \frac{M + M^* + M - M^*}{2} = M.$$

Now suppose $H_1 + iH_2 = H_3 + iH_4$, where $H_k$ is Hermitian for $k = 1, 2, 3, 4$. Then $H_1 - H_3 = i(H_4 - H_2)$. Let $N = H_1 - H_3 = i(H_4 - H_2)$. On the one hand $N^* = H_1^* - H_3^* = H_1 - H_3 = N$, but on the other hand $N^* = [i(H_4 - H_2)]^* = -i(H_4 - H_2) = -N$. This means that $N = -N$, therefore $N = 0$, and so $H_1 = H_3$ and $H_2 = H_4$. So the Cartesian decomposition exists and is unique.

Remark 1.36. A matrix is normal if and only if the Hermitian matrices in its Cartesian decomposition commute.
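The construction in the proof of Theorem 1.35 can be carried out numerically. The following is a minimal sketch for matrices stored as nested Python lists of complex numbers; the helper names and the sample matrix `M` are illustrative assumptions, not part of the thesis.

```python
def dagger(M):
    """Conjugate transpose of a square matrix given as a list of rows."""
    n = len(M)
    return [[M[j][i].conjugate() for j in range(n)] for i in range(n)]

def cartesian_decomposition(M):
    """Return (H1, H2) with H1 = (M + M*)/2 and H2 = (M - M*)/(2i)."""
    n = len(M)
    Ms = dagger(M)
    H1 = [[(M[i][j] + Ms[i][j]) / 2 for j in range(n)] for i in range(n)]
    H2 = [[(M[i][j] - Ms[i][j]) / 2j for j in range(n)] for i in range(n)]
    return H1, H2

def is_hermitian(H, tol=1e-12):
    """Check H = H* entrywise up to a small tolerance."""
    Hs = dagger(H)
    n = len(H)
    return all(abs(H[i][j] - Hs[i][j]) <= tol for i in range(n) for j in range(n))

# Hypothetical sample matrix; any square complex matrix would do.
M = [[1 + 2j, 3 - 1j], [0.5j, -2 + 0j]]
H1, H2 = cartesian_decomposition(M)
# Recombine H1 + i*H2, which should reproduce M.
recombined = [[H1[i][j] + 1j * H2[i][j] for j in range(2)] for i in range(2)]
```

Both parts come out Hermitian and `recombined` agrees with `M` up to rounding, matching the existence half of the proof.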

1.2.5 Half Spaces

Definition 1.37. [26] Let W be a subspace of $\mathbb{R}^n$. Its orthogonal complement is the subspace

$$W^{\perp} = \{v \in \mathbb{R}^n \mid v \cdot w = 0 \ \text{for all } w \in W\}.$$

The symbol $W^{\perp}$ is sometimes read as "W perp."

Remark 1.38. X⊥ is a closed subspace of the Hilbert space H, for any subset X of H.

Theorem 1.39. [7] (Orthogonal Projection) If E is a closed subspace of a Hilbert space H, then every x ∈ H can be uniquely written as x = y + z with y ∈ E and $z \in E^{\perp}$. The vector y is called the projection of x onto E.

Theorem 1.40. [7] (The Riesz Representation Theorem) For every continuous linear functional f on a Hilbert space H, there is a unique v ∈ H such that $f(x) = \langle x, v \rangle$ for all x ∈ H.

Proof. [7] Let $N = \{x \in H \mid f(x) = 0\}$. If N = H, then f is identically zero, and v = 0 gives $f(x) = \langle x, v \rangle$ for all x. If N ≠ H, we show that the orthogonal complement $N^{\perp}$ is one-dimensional by showing that every pair of vectors in $N^{\perp}$ is linearly dependent. Take any $x, y \in N^{\perp}$. Then $ax + by \in N^{\perp}$ for any numbers a, b, since $N^{\perp}$ is a subspace (Definition 1.37); but the vector $z = f(y)x - f(x)y$ lies in N since f(z) = 0. Thus z lies in both N and $N^{\perp}$, and is therefore perpendicular to itself and hence zero. Thus given any $x, y \in N^{\perp}$ we have constructed a linear combination of x and y which is zero, so $N^{\perp}$ is one-dimensional.

Let e be an element of $N^{\perp}$ with $\|e\| = 1$; then {e} is an orthonormal basis for $N^{\perp}$, and any $v \in N^{\perp}$ can be written $v = \langle v, e \rangle e$. Now, by the Orthogonal Projection Theorem, any x ∈ H can be written x = y + z, with $y \in N^{\perp}$ and z ∈ N. Then $f(x) = f(y) + f(z) = \langle y, e \rangle f(e) + 0 = \langle x - z, e \rangle f(e) = \langle x, e \rangle f(e)$. Thus $f(x) = \langle x, v \rangle$ as required, where $v = \overline{f(e)}\,e$ (in a real space this is simply $v = f(e)e$). To prove uniqueness, suppose $f(x) = \langle x, u \rangle = \langle x, v \rangle$ for all x ∈ H. Taking x = u − v gives $\langle u - v, u \rangle = \langle u - v, v \rangle$, hence $\|u - v\| = 0$ and therefore u = v.

Lemma 1.41. Let K be a nonempty closed convex subset of Rn. Then there exists a unique vector in K of minimum norm.

Proof. Let $\varepsilon = \inf\{\|x\| \mid x \in K\}$. Let $\{x_j\}$ be a sequence in K such that $\|x_j\| \to \varepsilon$. Note that $(x_i + x_j)/2$ is in K since K is convex, and so $\|x_i + x_j\|^2 \ge 4\varepsilon^2$. Since

$$\|x_i - x_j\|^2 = 2\|x_i\|^2 + 2\|x_j\|^2 - \|x_i + x_j\|^2 \le 2\|x_i\|^2 + 2\|x_j\|^2 - 4\varepsilon^2 \to 0$$

as $i, j \to \infty$, $\{x_i\}$ is a Cauchy sequence and so, because K is closed, it has a limit x in K. By continuity of the norm, x has norm ε. It is the unique element in K having norm ε: if y is in K and has norm ε, then $\|x - y\|^2 \le 2\|x\|^2 + 2\|y\|^2 - 4\varepsilon^2 = 0$ and therefore x = y.

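Theorem 1.42. (Separating hyperplane theorem). Let A and B be two disjoint nonempty convex subsets of $\mathbb{R}^n$. Suppose A is compact and B is closed. Then there exist a nonzero vector v and a real number α such that

$$\langle x, v \rangle \ge \alpha \quad \text{and} \quad \langle y, v \rangle \le \alpha$$

for all x in A and y in B.

Proof. We use Lemma 1.41. Given disjoint nonempty convex sets A, B, let $K = A + (-B) = \{x - y \mid x \in A, y \in B\}$.

Since −B is convex and the sum of convex sets is convex, K is convex. By Lemma 1.41, the closure $\overline{K}$ of K, which is convex, contains a vector v of minimum norm. Since $\overline{K}$ is convex, for any n in K the line segment $v + t(n - v)$, $0 \le t \le 1$, lies in $\overline{K}$, and so $\|v\|^2 \le \|v + t(n - v)\|^2 = \|v\|^2 + 2t\langle v, n - v \rangle + t^2\|n - v\|^2$. Dividing by $t > 0$ gives $0 \le 2\langle v, n \rangle - 2\|v\|^2 + t\|n - v\|^2$, and letting $t \to 0$ gives $\langle n, v \rangle \ge \|v\|^2$. Hence, for any x in A and y in B, we have $\langle x - y, v \rangle \ge \|v\|^2$. Thus, if v is nonzero, the proof is complete since

$$\inf_{x \in A} \langle x, v \rangle \ge \|v\|^2 + \sup_{y \in B} \langle y, v \rangle.$$

Finally, v is indeed nonzero: because A is compact and B is closed, K is closed, and $0 \notin K$ since A and B are disjoint.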

Definition 1.43. Given a real normed space X, a bounded linear functional $f : X \to \mathbb{R}$ and $\alpha \in \mathbb{R}$, we set

$$H_{f,\alpha} := \{y \in X : f(y) \le \alpha\},$$

where $H_{f,\alpha}$ is a closed half-space.

Remark 1.44. Every closed convex set is an intersection of half-spaces.

Definition 1.45. If S is a subset of a normed vector space V, then x is an interior point of S if there exists an open ball centered at x which is completely contained in S. The set of interior points of S is denoted Int(S).

Definition 1.46. A supporting hyperplane of a set S in Rn is a hyperplane with the following properties:

1. S is entirely contained in one of the two closed half-spaces bounded by the hyperplane.

2. S has at least one boundary point on the hyperplane.

Definition 1.47. Given a nonempty convex set A, a point a0 is a supporting point if a0 is a point on the boundary of A and there is a supporting hyperplane containing a0.

Theorem 1.48. (Separation Theorem) Let V be a vector space and let A ⊆ V be convex and nonempty. If a0 ∈ A\Int(A) then a0 is a supporting point of A.

In other words, if S is a compact convex set in a finite dimensional Banach space and ρ is a point in the space with ρ ∉ S, then there exists a hyperplane that separates ρ from S.

1.2.6 Simplex

A geometric k-simplex is the convex hull of k + 1 affinely independent points $\{x_0, \dots, x_k\}$ in $\mathbb{R}^n$.

Definition 1.49. [3] The unit simplex is the n-dimensional simplex which is the convex hull of the zero vector and the unit vectors, $\{0, e_1, \dots, e_n\} \subset \mathbb{R}^n$. It can be expressed as the set of vectors that satisfy $x \ge 0$, $\mathbf{1}^T x \le 1$.

Definition 1.50. [3] The probability simplex is the (n − 1)-dimensional simplex which is the convex hull of the unit vectors $\{e_1, \dots, e_n\} \subset \mathbb{R}^n$. It is the set of vectors that satisfy $x \ge 0$, $\mathbf{1}^T x = 1$.

Remark 1.51. [3] Vectors in the probability simplex correspond to probability distributions on a set with n elements, with $x_i$ denoting the probability of the $i$-th element.
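The inequality descriptions in Definitions 1.49 and 1.50 translate directly into membership tests. The following sketch assumes vectors are given as Python lists of floats; the function names and the tolerance `tol` are illustrative choices rather than anything prescribed by the text.

```python
def in_unit_simplex(x, tol=1e-12):
    """x >= 0 componentwise and 1^T x <= 1 (Definition 1.49)."""
    return all(xi >= -tol for xi in x) and sum(x) <= 1 + tol

def in_probability_simplex(x, tol=1e-12):
    """x >= 0 componentwise and 1^T x = 1 (Definition 1.50)."""
    return all(xi >= -tol for xi in x) and abs(sum(x) - 1) <= tol

distribution = [0.2, 0.3, 0.5]   # a probability distribution on three outcomes
sub_distribution = [0.2, 0.3]    # entries sum to less than one
```

Here `distribution` lies in the probability simplex of $\mathbb{R}^3$, while `sub_distribution` lies in the unit simplex of $\mathbb{R}^2$ but not in the probability simplex.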

1.2.7 Hausdorff Distance

Definition 1.52. [24] Let A and B be non-empty compact subsets of a normed vector space $(V, \|\cdot\|)$. Then the Hausdorff distance $d_H(A, B)$ is given by

$$d_H(A, B) = \max\Big(\max_{x \in A} \min_{y \in B} \|y - x\|,\ \max_{y \in B} \min_{x \in A} \|x - y\|\Big).$$

The Hausdorff distance is defined on the compact subsets of V. The Hausdorff metric is complete if the original space is complete.

Remark 1.53. The Hausdorff distance is the greatest of all distances from a point in one set to the closest point in the other set.

Observation 1.54. Let c ∈ (X, d) and A ⊂ X be compact. Then dH ({c},A) is the radius of the smallest closed ball centered at the point c that contains A.
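For finite point sets, which are compact, the max-min formula of Definition 1.52 can be evaluated directly. The following is a minimal sketch using the Euclidean norm; the function names and sample sets are illustrative assumptions.

```python
import math

def directed_distance(A, B):
    """Largest distance from a point of A to its nearest point of B."""
    return max(min(math.dist(a, b) for b in B) for a in A)

def hausdorff_distance(A, B):
    """Hausdorff distance between two finite, non-empty point sets."""
    return max(directed_distance(A, B), directed_distance(B, A))

A = [(0.0, 0.0), (1.0, 0.0)]
B = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0)]
d_AB = hausdorff_distance(A, B)

# Observation 1.54 with a single centre point c and a finite set S.
c = (0.0, 0.0)
S = [(3.0, 4.0), (1.0, 0.0), (0.0, -2.0)]
d_cS = hausdorff_distance([c], S)
```

Every point of A is also in B, so the distance is governed by the point (1, 2) of B, which lies at distance 2 from A. In the second example the value is 5, the radius of the smallest closed ball centered at c that contains S.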

To help express the following definitions of distances between sets, we introduce the idea of λ-neighbourhood.

Definition 1.55. Let A be a set in $\mathbb{R}^n$ and let $\lambda \ge 0$. Then the λ-neighbourhood $(A)_\lambda$ of A is the set $A + \lambda U$, where U denotes the closed unit ball $\{x \in \mathbb{R}^n : \|x\| \le 1\}$.

The Hausdorff distance $d_H(A, B)$ between two non-empty, compact sets A, B can be restated in terms of the λ-neighbourhoods as

$$d_H(A, B) = \inf\{\lambda \ge 0 : A \subseteq (B)_\lambda \text{ and } B \subseteq (A)_\lambda\},$$

that is, every a ∈ A lies within distance λ of some b ∈ B, and every b ∈ B lies within distance λ of some a ∈ A. The Hausdorff metric is translation invariant; we have $d_H(A + x, B + x) = d_H(A, B)$ for all x.

Theorem 1.56. [30] Let A be a set in $\mathbb{R}^n$ and let $\lambda \ge 0$. Then $(A)_\lambda$ is convex when A is convex.

The proof of this follows from closed balls being convex, the sum of convex sets being convex, and the fact that a set remains convex if multiplied by a scalar.

Lemma 1.57. [21] Let A be a non-empty closed convex set in $\mathbb{R}^n$ and let $y \notin A$. Then there exists a point $\bar{x} \in A$ with minimum distance from y (recall Lemma 1.41), that is, $\|y - \bar{x}\| \le \|y - x\|$ for all x ∈ A.

Theorem 1.58. [30] Let A and B be non-empty compact subsets of $\mathbb{R}^n$ with $d_H(A, B) = \lambda$. Then $A \subseteq (B)_\lambda$ and $B \subseteq (A)_\lambda$.

Proof. Let a ∈ A. For any $\varepsilon > 0$, $A \subseteq (B)_{\lambda + \varepsilon}$, whence there is a point b of B for which $\|a - b\| \le \lambda + \varepsilon$, and so $\inf\{\|a - b\| \mid b \in B\} \le \lambda$. Then Lemma 1.57 shows that there exists some point $b_0$ of B such that $\|a - b_0\| \le \lambda$. Thus $a \in (B)_\lambda$ and $A \subseteq (B)_\lambda$. Similarly $B \subseteq (A)_\lambda$.

Proposition 1.59. Let A and B be the closed balls B[a; r] and B[b; s] respectively in $\mathbb{R}^n$. Then $d_H(A, B) = \|b - a\| + |s - r|$.

Proof. Without loss of generality let r ≤ s. Then

$$A \subseteq B - (b - a) \subseteq (B)_{\|b - a\|}$$

and

$$B = A + (b - a) + (s - r)U \subseteq (A)_{\|b - a\| + |s - r|},$$

where U denotes the closed unit ball. Hence $d_H(A, B) \le \|b - a\| + |s - r|$.

Conversely, B contains a point whose distance from a is $\|b - a\| + s$. Thus, if $\lambda \ge 0$ and $B \subseteq (A)_\lambda = B[a; \lambda + r]$, then $\|b - a\| + s \le \lambda + r$, whence $d_H(A, B) \ge \|b - a\| + s - r$. Thus $d_H(A, B) = \|b - a\| + s - r$, i.e. $d_H(A, B) = \|b - a\| + |s - r|$.
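As a sanity check of Proposition 1.59, consider the one-dimensional case, where closed balls are intervals. The sketch below relies on the standard formula for the Hausdorff distance between two compact intervals (the larger of the two endpoint gaps), which is an assumption brought in for this check and is not stated in the text.

```latex
\begin{align*}
A &= [a - r,\ a + r], \qquad B = [b - s,\ b + s], \\
d_H(A, B) &= \max\{\, |(a - r) - (b - s)|,\ |(a + r) - (b + s)| \,\} \\
          &= \max\{\, |(a - b) + (s - r)|,\ |(a - b) - (s - r)| \,\} \\
          &= |b - a| + |s - r|.
\end{align*}
% Numerical instance: a = 0, r = 1, b = 3, s = 2 gives A = [-1, 1], B = [1, 5],
% so d_H(A, B) = max{2, 4} = 4 = |3 - 0| + |2 - 1|.
```

The last step uses the identity $\max\{|u + w|, |u - w|\} = |u| + |w|$ for real numbers u and w.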

1.2.8 Joint Numerical Range

Let H be a complex Hilbert space, possibly infinite-dimensional, and let B(H) be the set of bounded linear operators on H. For an operator A ∈ B(H), the numerical range of A is the set

$$W(A) = \{\langle Ax, x \rangle \mid x \in H,\ \|x\| = 1\}.$$

The numerical range is a bounded and convex subset of the complex plane and is not necessarily closed for operators on infinite-dimensional spaces. In finite dimensions the numerical range is closed.

Definition 1.60. [5] Let $A = (A_1, \dots, A_m) \in B(H)^m$ be an m-tuple of n by n Hermitian matrices. Then the joint numerical range of A is

$$W(A) = \{(v^* A_1 v, \dots, v^* A_m v) \mid v \in \mathbb{C}^n,\ \|v\| = 1\}.$$

W (A) is a subset of Rm. Unlike the single operator case, the joint numerical range is in general not convex.

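Proposition 1.61. Let M be an n by n complex matrix with Cartesian decomposition $M = H_1 + iH_2$. Then $W(M) = W(H_1, H_2)$.

Proof. Assume M is a square complex matrix. Then by Theorem 1.35 it has a Cartesian decomposition with

$$H_1 = \frac{M + M^*}{2}, \qquad H_2 = \frac{M - M^*}{2i}.$$

Then, identifying the complex plane with $\mathbb{R}^2$,

\begin{align*}
W(M) &= \{ z^* M z \mid z \in \mathbb{C}^n,\ \|z\| = 1 \} \\
&= \{ (\mathrm{Re}(z^* M z),\ \mathrm{Im}(z^* M z)) \mid z \in \mathbb{C}^n,\ \|z\| = 1 \} \\
&= \left\{ \left( \frac{z^* M z + \overline{z^* M z}}{2},\ \frac{z^* M z - \overline{z^* M z}}{2i} \right) \,\middle|\, z \in \mathbb{C}^n,\ \|z\| = 1 \right\} \\
&= \left\{ \left( \frac{z^* M z + z^* M^* z}{2},\ \frac{z^* M z - z^* M^* z}{2i} \right) \,\middle|\, z \in \mathbb{C}^n,\ \|z\| = 1 \right\} \\
&= \left\{ \left( z^* \frac{M + M^*}{2} z,\ z^* \frac{M - M^*}{2i} z \right) \,\middle|\, z \in \mathbb{C}^n,\ \|z\| = 1 \right\} \\
&= \{ (z^* H_1 z,\ z^* H_2 z) \mid z \in \mathbb{C}^n,\ \|z\| = 1 \} \\
&= W(H_1, H_2).
\end{align*}

Remark 1.62. The numerical range of an arbitrary complex matrix can be viewed as the joint numerical range of the two Hermitian matrices from its Cartesian decomposition.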
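The identity behind Proposition 1.61 can be observed pointwise: for a unit vector z, the real and imaginary parts of $z^* M z$ are the two quadratic forms of the Hermitian parts of M. The following is a minimal sketch that samples a few random unit vectors; the sample matrix, the seed and the helper names are illustrative assumptions.

```python
import math
import random

def quad_form(A, z):
    """Return z* A z for a square matrix A (list of rows) and a vector z."""
    n = len(z)
    return sum(z[i].conjugate() * A[i][j] * z[j] for i in range(n) for j in range(n))

def hermitian_parts(M):
    """H1 = (M + M*)/2 and H2 = (M - M*)/(2i), computed entrywise."""
    n = len(M)
    H1 = [[(M[i][j] + M[j][i].conjugate()) / 2 for j in range(n)] for i in range(n)]
    H2 = [[(M[i][j] - M[j][i].conjugate()) / 2j for j in range(n)] for i in range(n)]
    return H1, H2

def random_unit_vector(n, rng):
    """Normalised complex Gaussian vector, used only to sample the unit sphere."""
    z = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    norm = math.sqrt(sum(abs(c) ** 2 for c in z))
    return [c / norm for c in z]

rng = random.Random(0)
M = [[1 + 1j, 2 + 0j], [0j, -1j]]   # hypothetical non-Hermitian example
H1, H2 = hermitian_parts(M)
points = []
for _ in range(5):
    z = random_unit_vector(2, rng)
    w = quad_form(M, z)                           # a point of W(M)
    pair = (quad_form(H1, z), quad_form(H2, z))   # matching point of W(H1, H2)
    points.append((w, pair))
```

For every sampled z, both entries of `pair` are real up to rounding and equal the real and imaginary parts of `w`.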

1.3 Quantum Information

Entanglement of Pure States

Quantum systems can be described as unit vectors in a Hilbert space. When two quantum systems interact they create a composite system. The composite system of both parties is described by vectors in the tensor product $H_A \otimes H_B = H$. One Hilbert space, denoted $H_A$ and of dimension $d_A$, is controlled by Alice. The other Hilbert space, denoted $H_B$ and of dimension $d_B$, is controlled by Bob. This means that any vector in this shared space can be written as

$$|\psi\rangle = \sum_{i,j=1}^{d_A, d_B} c_{ij}\, |a_i\rangle \otimes |b_j\rangle \in H$$

with a complex $d_A$ by $d_B$ matrix $C = (c_{ij})$. In order to keep the notation simple, the tensor product is written as $|a_i\rangle \otimes |b_j\rangle = |a_i b_j\rangle$. We can now define the condition for separable and entangled pure states.

Definition 1.63. Let $|\psi\rangle \in H$ be a pure state. Then $|\psi\rangle$ is a separable pure state if we can find states $|\phi_A\rangle \in H_A$ and $|\phi_B\rangle \in H_B$ such that

$$|\psi\rangle = |\phi_A\rangle \otimes |\phi_B\rangle$$

holds. Otherwise $|\psi\rangle$ is entangled.

If a state is separable, any measurements on its components are uncorrelated. This means that such states can be easily prepared in a local way: Alice produces a state $|\phi_A\rangle$ and Bob independently produces a state $|\phi_B\rangle$. If Alice measures an observable A and Bob measures an observable B, then the outcomes will be completely independent of each other.

Definition 1.64. [1] An observable is an operator that corresponds to a physical quantity, such as energy, spin, or position.

Entanglement and separability are very simple to distinguish when dealing with a bipartite system of pure states. Before proceeding to define separability for mixed states we need a tool called Schmidt decomposition.

Lemma 1.65. [19] (Schmidt Decomposition). Let $|\psi\rangle = \sum_{i,j=1}^{d_A, d_B} c_{ij} |a_i b_j\rangle \in H$ be a vector in the tensor product of two Hilbert spaces. If $d_A = d_B$ then there exists an orthonormal basis $|\alpha_i\rangle$ of $H_A$ and an orthonormal basis $|\beta_j\rangle$ of $H_B$ such that

$$|\psi\rangle = \sum_{k=1}^{R} \lambda_k |\alpha_k \beta_k\rangle$$

holds, with positive real coefficients $\lambda_k$. The $\lambda_k$ are uniquely determined as the square roots of the nonzero eigenvalues of the matrix $CC^{\dagger}$, where $C = (c_{ij})$.

The number R ≤ min{dA, dB} is called the Schmidt rank of |ψi. If the λk are pairwise different, then |αki and |βki are unique up to a phase. Note that separable states correspond to Schmidt rank one. Therefore, this type of decomposition can be used to tell whether a given pure state is separable or entangled.
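For two qubits ($d_A = d_B = 2$) the Schmidt coefficients can be computed in closed form from the 2 by 2 matrix $CC^{\dagger}$. The following is a minimal sketch restricted to that case; the function names, the eigenvalue tolerance and the two sample states are illustrative assumptions.

```python
import math

def cc_dagger_eigenvalues_2x2(C):
    """Eigenvalues of C C^dagger for a 2x2 coefficient matrix C, largest first."""
    # Entries of the Hermitian matrix C C^dagger = [[a, b], [conj(b), d]].
    a = abs(C[0][0]) ** 2 + abs(C[0][1]) ** 2
    d = abs(C[1][0]) ** 2 + abs(C[1][1]) ** 2
    b = C[0][0] * C[1][0].conjugate() + C[0][1] * C[1][1].conjugate()
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    return [mean + radius, max(mean - radius, 0.0)]

def schmidt_coefficients_2x2(C):
    """Square roots of the eigenvalues of C C^dagger."""
    return [math.sqrt(e) for e in cc_dagger_eigenvalues_2x2(C)]

def schmidt_rank_2x2(C, tol=1e-12):
    """Number of eigenvalues of C C^dagger above a small tolerance."""
    return sum(1 for e in cc_dagger_eigenvalues_2x2(C) if e > tol)

s = 1 / math.sqrt(2)
product_state = [[0.5, 0.5], [0.5, 0.5]]   # (|0> + |1>)/sqrt(2) tensored with itself
bell_state = [[s, 0.0], [0.0, s]]          # (|00> + |11>)/sqrt(2)
```

The product state has Schmidt rank one and is therefore separable, while the Bell state has Schmidt rank two, with both coefficients equal to $1/\sqrt{2}$, and is entangled.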

Entanglement of Mixed States

A more general and applicable situation occurs when one does not know the exact state of a quantum system. Suppose the system is in an unknown state $|\phi_i\rangle$, with some probability $p_i$ of this state occurring. This situation is described by a density operator $\rho = \sum_i p_i |\phi_i\rangle\langle\phi_i|$ with $\sum_i p_i = 1$ and $p_i \ge 0$.

In a given basis the density operator is represented by a complex matrix. This matrix is positive semidefinite and Hermitian, since all the operators $|\phi_i\rangle\langle\phi_i|$ are positive and Hermitian. We are led to a geometrical picture of the set of all states forming a convex set, since every positive semidefinite matrix of trace one can be interpreted as the density operator of some state. This means that given two states $\rho_1$ and $\rho_2$, their convex combination is also a state. This property also holds for multiple states. Given some $p_i \ge 0$ with $\sum_i p_i = 1$, the convex combination $\sum_i p_i \rho_i$ of some states is again a state. Coefficients $p_i \ge 0$ with the property $\sum_i p_i = 1$ are called convex weights.

Definition 1.66. Let ρ be a density operator on a composite system $H = H_A \otimes H_B$. Then ρ is a product state if there exist states $\rho^A$ for Alice and $\rho^B$ for Bob such that $\rho = \rho^A \otimes \rho^B$. The mixed state ρ is separable if there are convex weights $p_i$ and product states $\rho_i^A \otimes \rho_i^B$ such that

$$\rho = \sum_i p_i\, \rho_i^A \otimes \rho_i^B;$$

otherwise the state is entangled.

Notice that the separable states are convex combinations of product states. This means that the set of separable states forms a convex set. Suppose a quantum system is in one of a number of states $|\phi_i\rangle$, where i is an index, with respective probabilities $p_i$.

Theorem 1.67. [18] (Characterization of density operators) An operator ρ is the density operator associated to some ensemble $\{p_i, |\phi_i\rangle\}$ if and only if it satisfies the conditions

1. ρ is a positive operator.

2. ρ has a trace equal to one.

Remark 1.68. The first property is obvious and the second property is easily proved with the definition of a trace. Given an arbitrary orthonormal basis $\{\Phi_i\}$, the trace is $\mathrm{tr}(\rho) = \sum_i \langle\Phi_i|\rho|\Phi_i\rangle$.

Remark 1.69. The density operator provides a means for describing quantum systems whose state is not completely known.
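The two conditions of Theorem 1.67 can be checked numerically for a single qubit. The sketch below builds ρ from an ensemble and then tests that it is Hermitian, has trace one and is positive semidefinite, using the fact that a 2 by 2 Hermitian matrix is positive semidefinite exactly when its trace and determinant are non-negative. The function names and the sample ensemble are illustrative assumptions.

```python
import math

def density_operator(ensemble):
    """rho = sum_i p_i |phi_i><phi_i| for an ensemble [(p_i, phi_i), ...] of unit vectors."""
    n = len(ensemble[0][1])
    rho = [[0j] * n for _ in range(n)]
    for p, phi in ensemble:
        for r in range(n):
            for c in range(n):
                rho[r][c] += p * phi[r] * phi[c].conjugate()
    return rho

def is_density_matrix_2x2(rho, tol=1e-12):
    """Hermitian, trace one, and positive semidefinite (2x2 case only)."""
    hermitian = (abs(rho[0][1] - rho[1][0].conjugate()) <= tol
                 and abs(rho[0][0].imag) <= tol
                 and abs(rho[1][1].imag) <= tol)
    trace = (rho[0][0] + rho[1][1]).real
    det = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    # With trace equal to one, positivity of a 2x2 Hermitian matrix reduces to det >= 0.
    return hermitian and abs(trace - 1) <= tol and det >= -tol

s = 1 / math.sqrt(2)
# Equal mixture of |0> and (|0> + |1>)/sqrt(2).
rho = density_operator([(0.5, [1.0, 0.0]), (0.5, [s, s])])
not_a_state = [[1.5, 0.0], [0.0, -0.5]]   # trace one but with a negative eigenvalue
```

The matrix `rho` passes the check, whereas `not_a_state` fails it because it has a negative eigenvalue even though its trace equals one.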

Chapter 2

An Application of Blaschke’s Selection Theorem

2.1 Two problems in convex optimization

In this chapter, we introduce the two problems in convex approximation that we will consider in the thesis.

Problem 2.1. (Nearest Subset Problem) Let V be a finite-dimensional real vector space and let C be a non-empty compact convex subset of V . Let F be a nonempty family of compact convex subsets of C. Further suppose F is a closed set in the Hausdorff metric. Find the set in F that has minimal Hausdorff distance to C.

Problem 2.2. (Nearest Superset Problem) Let V be a finite-dimensional real vector space and let C be a non-empty compact convex subset of V . Let F be a nonempty family of compact convex subsets of V all of which contain C as a subset. Further suppose F is a closed set in the Hausdorff metric. Find the set in F that has minimal Hausdorff distance to C.

In Chapter 3, we solve the special case of Problem 2.2 where $V = \mathbb{R}^n$, C is an n-dimensional hypersphere and F is the collection of simplices which contain the hypersphere. In Chapter 4, we solve the special case of Problem 2.1 where $V = \mathbb{R}^n$, C is the probability simplex and F is the family of convex hulls of certain joint numerical ranges of POVMs. In Chapter 5, we explore the special case of Problem 2.2 where V is the real vector space of mn by mn Hermitian matrices, C is the set of separable states on $\mathbb{C}^m \otimes \mathbb{C}^n$ and F is the family of simplices whose faces are entanglement witnesses. It should be noted that there is no general method for finding the solution to these problems; the methods of solution in Chapter 3 and Chapter 4 are different. However, we can show that these problems always have a (possibly nonunique) solution. Our main tool is the Blaschke selection theorem, which we introduce next.

2.2 The Blaschke Selection Theorem

Theorem 2.3. Let V be a finite dimensional real vector space. Any bounded sequence of compact convex subsets of V contains a subsequence which converges in the Hausdorff metric.

If a metric space is complete, then the associated Hausdorff metric for that space is also complete. That means the Blaschke selection theorem can be extended to any Banach space. See the paper [22] for a simple proof. The Blaschke selection theorem allows us to guarantee the existence of a subsequence of convex sets that converges, in the Hausdorff metric, to a limit which is again a convex set.

2.3 Main Results

We are now ready to prove our main results. We note that similar applications of the Blaschke selection theorem have appeared in the literature; see for instance [10, Proposition 1] or [20].

Theorem 2.4. The nearest subset problem always has a solution.

Proof. Let V be a finite-dimensional real vector space and let C be a non-empty compact convex subset of V. Let F be a nonempty family of compact convex subsets of C. Further suppose F is closed in the Hausdorff metric. Let $k = \inf_{S \in F} d_H(S, C)$. Then for all $n \in \mathbb{N}$, there exists $S_n \in F$ such that $d_H(S_n, C) < k + \frac{1}{n}$. Since $S_n \subseteq C$ for all $n \in \mathbb{N}$, $\{S_n\}_{n=1}^{\infty}$ is a bounded sequence of compact convex sets which has a convergent subsequence by the Blaschke selection theorem. Let T be the limit of this subsequence; then $T \in F$ since F is closed, and $d_H(T, C) = k$. Therefore T is a solution.

Theorem 2.5. The nearest superset problem always has a solution.

Proof. Let V be a finite-dimensional real vector space and let C be a non-empty compact convex subset of V. Let F be a nonempty family of compact convex subsets of V all of which contain C as a subset. Further suppose F is closed in the Hausdorff metric. Let $k = \inf_{S \in F} d_H(S, C)$. Then for all $n \in \mathbb{N}$, there exists $S_n \in F$ such that $d_H(S_n, C) < k + \frac{1}{n}$. This Hausdorff metric inequality implies that $S_n \subseteq C_{(k+1)}$. Since $C_{(k+1)}$ is a bounded set, $\{S_n\}_{n=1}^{\infty}$ is a bounded sequence of compact convex sets which has a convergent subsequence by the Blaschke selection theorem. Let T be the limit of this subsequence; then $T \in F$ since F is closed, and $d_H(T, C) = k$. Therefore T is a solution.

Remark 2.6. We may not have uniqueness of the solution because the convergent subsequence, and hence its limit, is not necessarily unique. This means that our solution T may also not be unique.

Chapter 3

Optimal Simplex Enclosing a Hypersphere

3.1 Introduction

The overarching theme of this thesis is using the Hausdorff metric to minimize the distance between objects. This chapter is concerned with enclosing a hypersphere inside of a simplex while using the Hausdorff metric to minimize the distance between the hypersphere and the simplex. Since the hypersphere is very symmetric, we expect that the minimizing simplex should also be as symmetric as possible. We prove that, as suspected, a regular simplex minimizes the distance between an enclosed hypersphere and a simplex. A simplex is regular if all of its edges have the same length.

3.2 Circumradius and Inradius

Let C be a compact convex set in $\mathbb{R}^n$ that has a non-empty interior. This set C is called a convex body. Let r be the supremum of the set of radii of closed balls lying in C, and R be the infimum of the set of radii of closed balls in $\mathbb{R}^n$ containing C. Then r is called the inradius of C and R is the circumradius of C. It is obvious that r and R are both positive real numbers with $R \ge r$.

Theorem 3.1. Let C be a convex body in $\mathbb{R}^n$ with inradius r and circumradius R. Then C contains a closed ball of radius r and is contained in a unique closed ball of radius R.

Proof. The definition of R implies that for each $j = 1, 2, \ldots$, there exists an $a_j \in \mathbb{R}^n$ and an $R_j > 0$ such that $C \subseteq B[a_j; R_j]$ and $R_j < R + \frac{1}{j}$. The sequence $\{R_j\}_{j=1}^{\infty}$ converges to R and the sequence $\{a_j\}_{j=1}^{\infty}$ is bounded. Thus, by the Bolzano-Weierstrass Theorem, there is a subsequence $\{a_{i_j}\}$ that converges to some point a of $\mathbb{R}^n$. It follows from the previous example that $B[a_{i_j}; R_{i_j}] \to B[a; R]$ as $j \to \infty$. We show that $C \subseteq B[a; R]$. Let $c \in C$. Since $C \subseteq B[a_{i_j}; R_{i_j}]$, $\|c - a_{i_j}\| \le R_{i_j}$. Letting $j \to \infty$ in the last inequality, we find that $\|c - a\| \le R$. Thus $c \in B[a; R]$ and $C \subseteq B[a; R]$. The proof that C contains a closed ball of radius r is similar to the one we have just given.

Suppose that C lies in both of the closed balls $B[a; R]$ and $B[b; R]$ of radius R in $\mathbb{R}^n$. Then, for each x in C,

$$\left\| x - \tfrac{1}{2}(a + b) \right\|^2 \le R^2 - \tfrac{1}{4}\|a - b\|^2.$$

Hence $C \subseteq B\left[\tfrac{1}{2}(a + b); \sqrt{R^2 - \tfrac{1}{4}\|a - b\|^2}\right]$. Since C cannot lie in a closed ball of radius less than R, we must have a = b. Thus, there is precisely one closed ball of radius R in $\mathbb{R}^n$ which contains C.
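The inequality used in the uniqueness argument follows from the parallelogram law. As a short worked step, assuming only that $x \in C$ so that $\|x - a\| \le R$ and $\|x - b\| \le R$:

```latex
\begin{align*}
\left\| x - \tfrac{1}{2}(a+b) \right\|^2
  &= \tfrac{1}{2}\|x-a\|^2 + \tfrac{1}{2}\|x-b\|^2 - \tfrac{1}{4}\|a-b\|^2 \\
  &\le \tfrac{1}{2}R^2 + \tfrac{1}{2}R^2 - \tfrac{1}{4}\|a-b\|^2
   = R^2 - \tfrac{1}{4}\|a-b\|^2 .
\end{align*}
```

If $a \ne b$ the right-hand side is strictly less than $R^2$, which is what forces $a = b$.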

3.3 Circumcircle and Incircle of a Triangle

Definition 3.2. [15] The circumcenter is the point at which the perpendicular bisectors of the sides of a triangle intersect and which is equidistant from the three vertices.

For a generic triangle ABC, we denote the sides BC, CA, AB by a, b, c respectively.

Figure 3.1: [36]

The circumcircle of a triangle ABC is the unique circle passing through each of the vertices. The center of the circle is called the circumcenter O, and is the intersection of the perpendicular bisectors of each side [36]. The circumradius R is given by the law of sines:

$$2R = \frac{a}{\sin(A)} = \frac{b}{\sin(B)} = \frac{c}{\sin(C)}.$$

Definition 3.3. The semiperimeter of a polygon is half of its perimeter.

Definition 3.4. [15] The incenter is the single point in which the three bisectors of the interior angles of a triangle intersect and which is the center of the inscribed circle.

The incircle is tangent to each of the three sides a, b, c (without extension). The incenter I, is the intersection of the bisectors of the three angles [36]. The inradius r of the triangle ABC is determined by the area of the triangle [35].

$$r = \frac{2\Delta}{a + b + c} = \frac{\Delta}{s},$$

where $\Delta = \frac{1}{2} bc \sin(A)$ and $s = \frac{a+b+c}{2}$.

Figure 3.2:

3.4 Generalization of Euler’s Triangle Inequality

Given a triangle, extend two sides in the opposite direction of their common vertex. The circle tangent to these two lines and to the other side of the triangle, is called the excircle, or sometimes referred to as the escribed circle. The radius of this circle is called the exradius.

Figure 3.3:

Definition 3.5. [33] Let a be a side length, A be an angle, $\Delta$ be the area, and $s = \frac{a+b+c}{2}$ be the semiperimeter; then the exradius $r_a$ is defined to be

$$r_a = \frac{\Delta}{s - a}.$$

Theorem 3.6. (Euler’s Triangle Inequality) Let ∆ABC be an arbitrary triangle with circumradius R and inradius r. Then R ≥ 2r with equality holding if and only if ∆ABC is equilateral.

Proof. We will use the relations between the inradius and exradii (ra, rb, rc) to prove the inequality. The following are standard identities, and their proofs can be obtained from any book on trigonometry.

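$$\frac{1}{r_a} + \frac{1}{r_b} + \frac{1}{r_c} = \frac{1}{r} \tag{3.1}$$

$$r_a + r_b + r_c - r = 4R. \tag{3.2}$$

The Cauchy-Schwarz inequality implies that

$$(r_a + r_b + r_c)\left(\frac{1}{r_a} + \frac{1}{r_b} + \frac{1}{r_c}\right) \ge 9, \tag{3.3}$$

with equality if and only if $r_a = r_b = r_c$. Hence from (3.1) and (3.2) we have

$$\frac{4R + r}{r} \ge 9,$$

which gives the desired inequality $R \ge 2r$. Equality holds if and only if $r_a = r_b = r_c$, that is, when all sides are equal.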
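The identities (3.1) and (3.2) and the resulting inequality can be checked numerically. The sketch below is an illustration only: it computes the area with Heron's formula and the circumradius as $R = abc/(4\Delta)$, which is equivalent to the law of sines combined with $\Delta = \frac{1}{2}bc\sin(A)$; the function name `triangle_radii` and the sample triangles are assumptions made for the example.

```python
import math

def triangle_radii(a, b, c):
    """Return (R, r, (r_a, r_b, r_c)) for a triangle with side lengths a, b, c."""
    s = (a + b + c) / 2                                   # semiperimeter
    area = math.sqrt(s * (s - a) * (s - b) * (s - c))     # Heron's formula
    R = a * b * c / (4 * area)                            # circumradius
    r = area / s                                          # inradius
    exradii = (area / (s - a), area / (s - b), area / (s - c))
    return R, r, exradii

# A 3-4-5 right triangle and an equilateral triangle as sample inputs.
R, r, (ra, rb, rc) = triangle_radii(3.0, 4.0, 5.0)
lhs_31 = 1 / ra + 1 / rb + 1 / rc        # should equal 1 / r
lhs_32 = ra + rb + rc - r                # should equal 4 * R
R_eq, r_eq, _ = triangle_radii(1.0, 1.0, 1.0)
```

For the 3-4-5 triangle the inequality is strict, while the equilateral triangle gives $R = 2r$ up to rounding.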

This was not the method that Euler used to prove his triangle inequality. Before introducing his technique we denote by I and O the incenter and circumcenter of the triangle, and by R and r the circumradius and inradius, respectively. In his paper [17] he introduced the formula:

$$R(R - 2r) = (OI)^2. \tag{3.4}$$

Since $(OI)^2 \ge 0$, this results in the inequality $R \ge 2r$. The relationship in equation 3.4 is easily proved when the right constructions are made.

Figure 3.4: [17]

While referring to figure 3.4, we first show that $\angle IBX = \frac{1}{2}(\angle A + \angle B) = \angle BIX$, so $BX = IX$. The triangles AIZ and YXB are similar, since they are right-angled and the angles at A and Y are equal. Hence,

$$\frac{XY}{BX} = \frac{AI}{ZI}$$

and so $2Rr = (AI)(BX) = (AI)(IX)$. By intersecting chords, $(AI)(IX) = (PI)(IQ) = (R - OI)(R + OI)$, and the relationship is complete. All of these definitions and inequalities can be applied to higher dimensional problems.

Definition 3.7. [31] The circumsphere is a sphere circumscribed about a given solid.

Definition 3.8. [31] The insphere is a sphere inscribed in a given solid.

Definition 3.9. [32] The circumradius of a simplex or regular polytope is the radius of the circumsphere, which is the hypersphere that passes through all vertices.

Definition 3.10. [34] The inradius of a convex polytope is the radius of the insphere, which is the largest hypersphere that will fit inside the polytope.

Euler’s triangle inequality can be generalized to an n-dimensional simplex.

Theorem 3.11. (Generalization of Euler’s Triangle Inequality) [16] Let S be an n-dimensional simplex. Let R and r be the circumradius and inradius respectively; then $R \ge nr$, with equality when the simplex is regular.

Proof. [16] Let $A_i$ and $F_i$ (for $i = 1, 2, \ldots, n + 1$) denote, respectively, the vertices and the opposite (n − 1)-dimensional faces of an n-dimensional simplex of volume V; we also write $F_i$ for the content of the face. Also, let $h_i$ and $r_i$ denote the distances from $A_i$ and from the circumcenter O of the simplex, respectively, to the face opposite $A_i$. Since $R + r_i \ge h_i$, we have $R\sum F_i + \sum r_i F_i \ge \sum h_i F_i$. The volume of the n-dimensional simplex is given by $h_i F_i / n$, so by evaluating the volume of the simplex in three different ways we obtain, for every subscript i,

$$nV = h_i F_i = \sum r_i F_i = r \sum F_i.$$

Hence,

$$R \sum F_i \ge \sum h_i F_i - \sum r_i F_i = (n + 1)nV - nV = nr \sum F_i,$$

or $R \ge nr$. There is equality if and only if $R + r_i = h_i$ for all i, or equivalently, if and only if the altitudes are all concurrent at the circumcenter. Such simplexes, whose altitudes are concurrent, are said to be orthocentric.

3.5 Application of Klamkin and Tsintsifas’ Result on the Hausdorff distance

Klamkin and Tsintsifas determined in [16] that R ≥ nr where R is the circumradius, n is the dimension of the simplex and r is the inradius of the insphere. We use this result to derive an inequality between the Hausdorff distance from the insphere to the polytope, in terms of R and r.

Lemma 3.12. Let P be a convex polytope, Si be its inscribed hypersphere and Sc be its circumscribed hypersphere then

dH (P,Si) ≥ R − r

where r is the radius of $S_i$ and R is the radius of $S_c$. Equality holds if $S_i$ and $S_c$ have the same centre.

Proof. Since Si ⊆ P ⊆ Sc, the Hausdorff distance from the polytope P to the insphere Si is less than or equal to the Hausdorff distance from the insphere Si to the circumsphere Sc written as

dH (P,Si) ≤ dH (Si,Sc)

Similarly, the Hausdorff distance from the polytope P to the circumsphere $S_c$ is less than or equal to the Hausdorff distance from the insphere $S_i$ to the circumsphere $S_c$, written as

dH (P,Sc) ≤ dH (Si,Sc)

From the definition of a circumsphere, the Hausdorff distance from $S_c$ to the polytope is also R. This does not apply to the definition of the insphere because in this case, the insphere lacks uniqueness. The Hausdorff distance from the insphere to the polytope requires an inequality built from previous definitions. Using the triangle inequality,

R ≤ dH ({Ci},P ) ≤ dH ({Ci},Si) + dH (Si,P ) = r + dH (Si,P )

Subtracting the radius r from both sides results in

dH (Si,P ) ≥ R − r.

Theorem 3.13. Let P be a simplex, Si be its inscribed hypersphere of radius r and n be the dimension of the simplex then

dH (Si,P ) ≥ (n − 1)r

with equality if P is a regular simplex.

Proof. Let R be the circumradius, r be the inradius, and n be the dimension of the simplex, then the proof from theorem 3.11 results in R ≥ nr with equality if and only if the simplex is regular. We can now use lemma 3.12 to obtain our desired result.
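Theorems 3.11 and 3.13 can be illustrated numerically. The following is a sketch under stated assumptions rather than part of the proof: it assumes NumPy, takes the incenter to be the facet-content-weighted average of the vertices, and uses the fact that, for a ball contained in a simplex, the farthest point of the simplex from the ball is a vertex. All function names are hypothetical.

```python
import math
import numpy as np

def facet_contents(v):
    """Content F_i of the facet opposite vertex i, via Gram determinants."""
    n = v.shape[1]
    out = []
    for i in range(n + 1):
        facet = np.delete(v, i, axis=0)
        E = facet[1:] - facet[0]                  # edge vectors spanning the facet
        gram = max(np.linalg.det(E @ E.T), 0.0)   # guard against tiny negative rounding
        out.append(math.sqrt(gram) / math.factorial(n - 1))
    return np.array(out)

def insphere_hausdorff(vertices):
    """Return (d_H(S_i, P), r) for the simplex P with the given vertices."""
    v = np.asarray(vertices, dtype=float)
    n = v.shape[1]
    F = facet_contents(v)
    V = abs(np.linalg.det(v[1:] - v[0])) / math.factorial(n)
    r = n * V / F.sum()                           # from nV = r * sum F_i
    incenter = F @ v / F.sum()
    return np.linalg.norm(v - incenter, axis=1).max() - r, r

def circumradius(vertices):
    """Radius of the sphere through all vertices (Definition 3.9)."""
    v = np.asarray(vertices, dtype=float)
    A = 2 * (v[1:] - v[0])
    b = (v[1:] ** 2).sum(axis=1) - (v[0] ** 2).sum()
    return np.linalg.norm(v[0] - np.linalg.solve(A, b))

def regular_simplex(n):
    """Vertices of a regular n-simplex in R^n with edge length sqrt(2)."""
    alpha = (1 + math.sqrt(n + 1)) / n
    return np.vstack([np.eye(n), np.full(n, alpha)])

d_reg, r_reg = insphere_hausdorff(regular_simplex(3))
random_vertices = np.random.default_rng(0).normal(size=(4, 3))
d_rand, r_rand = insphere_hausdorff(random_vertices)
```

For the regular tetrahedron the distance equals $(n-1)r = 2r$ up to rounding, while the random tetrahedron satisfies the inequality of Theorem 3.13 without equality being expected.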

What we have accomplished in this chapter is to use the Klamkin and Tsintsifas generalization of the Euler triangle inequality in order to bound the Hausdorff distance between the insphere $S_i$ and the simplex P. The equality condition from theorem 3.13 is the solution we expect from the symmetry of a simplex. The reason the simplex is used instead of a general polytope is that a simplex has the property that all of its vertices touch the circumsphere. This means that $S_c$ and P share points at distance R from the circumcenter. Thus we are able to bound the Hausdorff distance between $S_i$ and P in terms of the dimension of the simplex and the inradius.

3.6 Conclusion

The Hausdorff distance between the hypersphere and an n-simplex which contains it, is bounded below by (n − 1)r where n is the dimension of the space and r is the radius of the hypersphere. This result shows us that we can analytically solve an interesting special case of problem 2.2. While this specific problem does not have applications to quantum information, it does raise the possibility that more difficult special cases of problems 2.1 and 2.2 which do have applications to quantum information, can be solved. In fact, in the next chapter, we solve such a problem.

Chapter 4

Standard Simplex and the Joint Numerical Range of POVMs

4.1 Introduction

This chapter presents some major new results of this thesis. By fixing the probability simplex, we study the behavior of the enclosed convex hull of the joint numerical range. The joint numerical range of a set of m Hermitian matrices is a subset of Rm defined as follows:

Definition 4.1. Let $M = (M_1, M_2, \ldots, M_m)$ be an m-tuple of d by d Hermitian matrices. Then the joint numerical range of M is

$$W(M) = \{(v^* M_1 v, v^* M_2 v, \ldots, v^* M_m v) \mid v \in \mathbb{C}^d, \|v\| = 1\}.$$

Definition 4.1 can be restated in terms of Dirac’s bra-ket notation:

$$W(M) = \{(\langle\Phi|M_1|\Phi\rangle, \langle\Phi|M_2|\Phi\rangle, \ldots, \langle\Phi|M_m|\Phi\rangle)\}$$

where $|\Phi\rangle$ runs through all pure states on $\mathbb{C}^d$.

In recent years, the joint numerical range has been useful in a number of different areas of quantum information. Some key examples are [6, 12, 25]. Positive operator-valued measures are a key concept in the theory of quantum measurement.

Definition 4.2. Let H be a Hilbert space. A (discrete) positive operator-valued measure (POVM) is a set of Hermitian positive semidefinite operators $\{M_i\}_{i=1}^m$ on H that sum to the identity operator, $\sum_{i=1}^m M_i = I$.

The discrete POVM $\{M_i\}_{i=1}^m$ is a mathematical representation of a specific measurement which has m possible outcomes. Given a quantum state $|\Phi\rangle$ and a discrete POVM $M = (M_1, M_2, \ldots, M_m)$ both acting on the same Hilbert space, the vector

$$(\langle\Phi|M_1|\Phi\rangle, \langle\Phi|M_2|\Phi\rangle, \ldots, \langle\Phi|M_m|\Phi\rangle)$$

is an element of W(M) and is the m-tuple whose kth element is the probability of getting outcome number k if we apply the measurement modelled by the POVM M to the state $|\Phi\rangle$. Our first observation follows directly from this. Discrete POVMs can be characterized by their joint numerical range. We first need the following terminology:
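To make the outcome-probability vector concrete, the sketch below samples points of W(M) for a small POVM. It assumes NumPy; the three-outcome qubit POVM used here (often called the trine POVM) and the function name `outcome_probabilities` are illustrative choices, not objects defined in the text.

```python
import numpy as np

def outcome_probabilities(povm, state):
    """Return (<Phi|M_1|Phi>, ..., <Phi|M_m|Phi>) for a normalised copy of state."""
    state = np.asarray(state, dtype=complex)
    state = state / np.linalg.norm(state)
    return np.array([np.vdot(state, M @ state).real for M in povm])

# Hypothetical example POVM: M_k = (2/3)|u_k><u_k| with three real unit
# vectors u_k spaced 120 degrees apart in the plane.
trine = []
for t in 2 * np.pi * np.arange(3) / 3:
    u = np.array([np.cos(t), np.sin(t)])
    trine.append((2 / 3) * np.outer(u, u))

# Sample points of the joint numerical range W(M) from random pure states.
rng = np.random.default_rng(0)
samples = np.array([
    outcome_probabilities(trine, rng.normal(size=2) + 1j * rng.normal(size=2))
    for _ in range(1000)
])
```

Every sampled row has nonnegative entries summing to one, so each sample lies in the probability simplex of $\mathbb{R}^3$.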

Definition 4.3. The probability simplex $\Delta_m$ is the set of all elements of $\mathbb{R}^m$ that have all entries nonnegative and summing to one. (I.e. $\Delta_m = \{(x_1, x_2, \ldots, x_m) : \sum_{k=1}^m x_k = 1 \text{ and } x_k \ge 0 \text{ for all } k : 1 \le k \le m\}$.)

Lemma 4.4. Let M = (M1,M2 ...,Mm) be an m-tuple of d by d Hermitian matrices. Then M is a discrete POVM if and only if the joint numerical range of M is contained in the probability simplex.

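Proof. Let M be a POVM consisting of d by d Hermitian matrices and let x be any unit vector. Then $\sum_i M_i = I$, and since each $M_i$ is positive semidefinite, $x^* M_i x \ge 0$. This means that

$$\sum_{i=1}^m x^* M_i x = x^* I x = 1,$$

so W(M) is contained in $\Delta_m$. Conversely, suppose W(M) is contained in $\Delta_m$ and let x be any unit vector; then $x^* M_i x \ge 0$. This means that $M_i$ is a positive semidefinite matrix, and that $x^* M_i x$ is the $i$th entry of an element of $\Delta_m$. Any unit vector x results in

$$x^* \left(\sum_{i=1}^m M_i\right) x = \sum_{i=1}^m x^* M_i x = 1.$$

Therefore $\sum_{i=1}^m M_i = I$ and M is a POVM.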

We now consider the following problem in quantum tomography. Let $|\Phi\rangle$ and $|\Psi\rangle$ be two distinct quantum states on a Hilbert space H. Suppose we have a large number of identical copies of one of the two states and we want to determine which of these two states we have by repeatedly applying the same measurement to each of these copies. If this measurement is modelled by the POVM $\{M_i\}_{i=1}^m$, then we expect the distribution of the measurement outcomes to be close to $v = (\langle\Phi|M_1|\Phi\rangle, \langle\Phi|M_2|\Phi\rangle, \ldots, \langle\Phi|M_m|\Phi\rangle)$ if the unknown state is $|\Phi\rangle$ and close to $w = (\langle\Psi|M_1|\Psi\rangle, \langle\Psi|M_2|\Psi\rangle, \ldots, \langle\Psi|M_m|\Psi\rangle)$ if the unknown state is $|\Psi\rangle$. Our ability to distinguish between these two states depends on the distance between the two points v and w lying in the joint numerical range W(M). In the case where v = w, both these states yield the exact same measurement probabilities and we would be unable to distinguish between the two states no matter how many such measurements we made. If on the other hand, v and w are far apart, then $|\Phi\rangle$ and $|\Psi\rangle$ would give radically different measurement outcomes and it is easier to determine which state we have with high probability even with relatively few measurements.

Remark 4.5. If M is a POVM we are using to distinguish between states, we expect to have higher success if its joint numerical range W (M) is spread out. Since the joint numerical range of a POVM is contained in the probability simplex, we can measure the spread of W (M) by looking at the Hausdorff distance between the probability simplex and the convex hull of W (M). The smaller this Hausdorff distance is, the more spread out W (M) is and the better the POVM M would be in distinguishing states.

In the next section, we consider a POVM M with m outcomes acting on a d dimensional space and give a lower bound on the Hausdorff distance between the convex hull of W(M) and the probability simplex in $\mathbb{R}^m$. We show that if $m = d^2$, the POVMs that achieve this lower bound are exactly the symmetric informationally complete POVMs (usually referred to as SIC-POVMs) which have been well studied. SIC-POVMs were first defined in [23] which is still one of the key references for SIC-POVMs. The following is now the standard definition.

Definition 4.6. Let H be a d-dimensional complex Hilbert space. Then a set of $d^2$ rank-one projections $\{P_i\}_{i=1}^{d^2}$ is called a SIC-POVM if $\mathrm{tr}(P_i P_j) = \frac{1}{d+1}$ for all $i \ne j$.

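Example 4.7. If d = 2, we can construct a SIC-POVM which has four unit vectors with the inner product being $|\langle v_i, v_j \rangle| = \frac{1}{\sqrt{3}}$ for $i \ne j$, where the four vectors are

$$\left\{(1, 0);\ \left(\tfrac{1}{\sqrt{3}}, \tfrac{\sqrt{2}}{\sqrt{3}}\right);\ \left(\tfrac{1}{\sqrt{3}}, \tfrac{\sqrt{2}}{\sqrt{3}\,\omega}\right);\ \left(\tfrac{1}{\sqrt{3}}, \tfrac{\sqrt{2}}{\sqrt{3}\,\omega^2}\right)\right\}.$$

Here $\omega = \frac{-1 + \sqrt{3}\,i}{2}$.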

We note that if $\{P_i\}_{i=1}^{d^2}$ is a SIC-POVM then $\sum_{i=1}^{d^2} P_i = dI$. In our work, we will normalize the SIC-POVM so that it is in fact a POVM, so we take our SIC-POVMs to be the set $\{\frac{1}{d} P_i\}_{i=1}^{d^2}$ where the $P_i$ are as in the above definition. It was conjectured in [23] that there is a SIC-POVM in every dimension $d \in \mathbb{N}$. While many SIC-POVMs have been found, this conjecture remains open.
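The four vectors of Example 4.7 can be verified directly against Definition 4.6 and the normalization just described. This is a small numerical check, assuming NumPy; the variable names are illustrative.

```python
import numpy as np

d = 2
omega = (-1 + np.sqrt(3) * 1j) / 2
vectors = [
    np.array([1, 0], dtype=complex),
    np.array([1, np.sqrt(2)], dtype=complex) / np.sqrt(3),
    np.array([1, np.sqrt(2) / omega]) / np.sqrt(3),
    np.array([1, np.sqrt(2) / omega**2]) / np.sqrt(3),
]

# Rank-one projections P_i = |v_i><v_i| and their pairwise overlaps tr(P_i P_j).
projectors = [np.outer(v, v.conj()) for v in vectors]
overlaps = np.array([[np.trace(P @ Q).real for Q in projectors] for P in projectors])
off_diagonal = overlaps[~np.eye(d * d, dtype=bool)]

# Normalised version (1/d) P_i, which should sum to the identity.
povm = [P / d for P in projectors]
```

All off-diagonal overlaps equal $\frac{1}{d+1} = \frac{1}{3}$ and the normalised operators sum to the identity, as the text states.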

4.2 A Hausdorff metric inequality

We begin this section by proving some of the inequalities we need to prove our main result.

Theorem 4.8. (Courant-Fischer Minimax Theorem) Suppose that $S \in \mathbb{C}^{n \times n}$ is a Hermitian matrix and let $\{\lambda_k(S)\}$ be the eigenvalues of S listed in ascending order. Then

$$\lambda_k(S) = \min_{\dim(U) = k} \ \max_{x \in U,\, x \ne 0} \frac{x^* S x}{x^* x}$$

and

$$\lambda_k(S) = \max_{\dim(V) = n-k+1} \ \min_{x \in V,\, x \ne 0} \frac{x^* S x}{x^* x}.$$

In particular, $\lambda_n(S) = \lambda_{\max}(S)$, where $\lambda_{\max}$ denotes the largest eigenvalue of S.

Lemma 4.9. Let $m \ge d \ge 2$ and let $M = \{M_k\}_{k=1}^m$ be a POVM with m elements in a d-dimensional Hilbert space. Then there exists some j with $1 \le j \le m$ such that $\lambda_{\max}(M_j) \le \frac{d}{m}$. Equality holds if and only if every element of the POVM is $\frac{d}{m}$ times a rank one projection.

Proof. Since all the $M_k$ are positive semidefinite, we have $\sum_{k=1}^m \lambda_{\max}(M_k) \le \sum_{k=1}^m \mathrm{tr}(M_k) = \mathrm{tr}(I_d) = d$. We have equality if and only if all of the $M_k$ are rank one. Choose $M_j$ to be the element of the POVM which has smallest maximal eigenvalue; then $\lambda_{\max}(M_j) \le \frac{d}{m}$. We have equality if and only if every element of the POVM is $\frac{d}{m}$ times a rank one projection.

We also need the following simple consequence of the Cauchy-Schwarz Inequality.

Lemma 4.10. Let $(x_1, x_2, \ldots, x_{m-1})$ be an (m − 1)-tuple of nonnegative numbers; then $\sum_{k=1}^{m-1} x_k^2 \ge \frac{1}{m-1} \left(\sum_{k=1}^{m-1} x_k\right)^2$, with equality if and only if all of the $x_k$ are equal.

This lemma can be used to give a distance inequality between certain points on the probability simplex. We use $\{e_i\}_{i=1}^m$ to denote the standard basis of $\mathbb{R}^m$, so $e_j$ is the m-vector with a one in its jth entry and zeros everywhere else.

Corollary 4.11. Let j be a fixed integer with $1 \le j \le m$ and $x = (x_1, x_2, \ldots, x_m)$ be a point in the probability simplex $\Delta_m$ with $x_j \le c$ for some fixed $c \in [0, 1]$. Then $\|x - e_j\| \ge \sqrt{\frac{m}{m-1}}\,(1 - c)$, with equality if and only if $x_j = c$ and $x_i = \frac{1-c}{m-1}$ for all $i \ne j$.

Proof. This follows from lemma 4.10 with the equality condition following from the equality condition in lemma 4.10.

$$\begin{aligned}
\|x - e_j\| &= \sqrt{(1 - x_j)^2 + \sum_{i \ne j} x_i^2} \\
&\ge \sqrt{(1 - x_j)^2 + \frac{1}{m-1}\Big(\sum_{i \ne j} x_i\Big)^2} \\
&= \sqrt{(1 - x_j)^2 + \frac{1}{m-1}(1 - x_j)^2} \\
&= \sqrt{\frac{m}{m-1}}\,(1 - x_j) \\
&\ge \sqrt{\frac{m}{m-1}}\,(1 - c).
\end{aligned}$$
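The bound of Corollary 4.11 and its equality case can be checked on random points of the probability simplex. This is an illustrative sketch, assuming NumPy; the values m = 4, c = 0.5 and the function names are arbitrary choices made for the example.

```python
import numpy as np

def distance_to_vertex(x, j):
    """Euclidean distance from x to the standard basis vector e_j."""
    e_j = np.zeros(len(x))
    e_j[j] = 1.0
    return np.linalg.norm(np.asarray(x, dtype=float) - e_j)

def corollary_bound(m, c):
    """Right-hand side sqrt(m / (m - 1)) * (1 - c) of Corollary 4.11."""
    return np.sqrt(m / (m - 1)) * (1 - c)

m, c, j = 4, 0.5, 0
rng = np.random.default_rng(1)
gaps = []
for _ in range(1000):
    x = rng.dirichlet(np.ones(m))        # random point of the probability simplex
    if x[j] <= c:                        # keep only points satisfying x_j <= c
        gaps.append(distance_to_vertex(x, j) - corollary_bound(m, c))

# Equality case: x_j = c and every other entry equal to (1 - c) / (m - 1).
x_equal = np.full(m, (1 - c) / (m - 1))
x_equal[j] = c
gap_equal = distance_to_vertex(x_equal, j) - corollary_bound(m, c)
```

Every sampled gap is nonnegative, and the gap vanishes at the point described by the equality condition.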

We also need the following lemma about the Hausdorff metric between a convex set and a convex subset. If T is a convex set, then ext(T) denotes the set of extreme points of T. In particular if T is a simplex then ext(T) is the set of vertices of T; so when T is the probability simplex $\Delta_m$, the set $\mathrm{ext}(T) = \{e_i\}_{i=1}^m$.

Lemma 4.12. Let $S \subseteq T \subseteq \mathbb{R}^m$ where both S and T are convex and compact; then $d_H(S, T) = \sup_{y \in \mathrm{ext}(T)} \inf_{x \in S} \|x - y\|$.

Proof. Since $S \subseteq T$, $d_H(S, T) = \inf\{\epsilon > 0 : T \subseteq S_\epsilon\}$. Since $S_\epsilon$ is convex, it will contain T whenever it contains all of the extreme points of T. Therefore $d_H(S, T) = \inf\{\epsilon > 0 : \mathrm{ext}(T) \subseteq S_\epsilon\} = \sup_{y \in \mathrm{ext}(T)} \inf_{x \in S} \|x - y\|$.

We are now ready to state and prove our main theorem of this section.

Theorem 4.13. Let $m \ge d \ge 2$ and let $M = \{M_k\}_{k=1}^m$ be a POVM with m elements in a d-dimensional Hilbert space. Then $d_H(\mathrm{conv}(W(M)), \Delta_m) \ge \frac{m-d}{\sqrt{m(m-1)}}$. When $m = d^2$, the inequality reduces to an equality if and only if M is a scalar multiple of a SIC-POVM.

Proof. Let $x = (x_1, x_2, \ldots, x_m) \in \mathrm{conv}(W(M))$. By lemma 4.9, there exists some j with $1 \le j \le m$ such that $\lambda_{\max}(M_j) \le \frac{d}{m}$. Note that $x_j \le \lambda_{\max}(M_j) \le \frac{d}{m}$. Hence by corollary 4.11,

$$\|x - e_j\| \ge \sqrt{\frac{m}{m-1}}\left(1 - \frac{d}{m}\right) = \frac{m-d}{\sqrt{m(m-1)}}.$$

If we first take the minimum over all $x \in \mathrm{conv}(W(M))$ and then take the max over all j, we get $d_H(\mathrm{conv}(W(M)), \Delta_m) \ge \frac{m-d}{\sqrt{m(m-1)}}$. This inequality will become an equality if and only if the equality conditions for both lemma 4.9 and corollary 4.11 are satisfied. The first equality condition is true if and only if $\lambda_{\max}(M_j) = \mathrm{tr}(M_j) = \frac{d}{m}$ for all j, which is equivalent to the existence of unit vectors $\{v_j\}_{j=1}^m \subset \mathbb{C}^d$ such that $M_j = \frac{d}{m} v_j v_j^*$. In this case the point $x \in \mathrm{conv}(W(M))$ which minimizes $\|x - e_j\|$ is

$$v_j^* M v_j = \frac{d}{m}\left(|\langle v_1, v_j \rangle|^2, |\langle v_2, v_j \rangle|^2, \ldots, |\langle v_m, v_j \rangle|^2\right).$$

This point will satisfy the equality condition for corollary 4.11 if and only if $|\langle v_i, v_j \rangle|^2 = \frac{m-d}{d(m-1)}$ for all $i \ne j$. In the case $m = d^2$ this value is $\frac{1}{d+1}$, so the condition is equivalent to $\{dM_k = v_k v_k^*\}$ being a SIC-POVM, that is, to M being a scalar multiple of a SIC-POVM.
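For the qubit SIC-POVM of Example 4.7, the equality case of Theorem 4.13 can be checked by evaluating the point $v_j^* M v_j$ that the proof identifies as the minimiser. This is a sketch assuming NumPy; it only confirms that this particular point sits at the bound and does not compute the Hausdorff distance for a general POVM. The function names are hypothetical.

```python
import numpy as np

def hausdorff_lower_bound(m, d):
    """Lower bound (m - d) / sqrt(m (m - 1)) from Theorem 4.13."""
    return (m - d) / np.sqrt(m * (m - 1))

def probability_point(povm, v):
    """The point (v* M_1 v, ..., v* M_m v) of the joint numerical range."""
    return np.array([np.vdot(v, M @ v).real for M in povm])

d, m = 2, 4
omega = (-1 + np.sqrt(3) * 1j) / 2
vectors = [
    np.array([1, 0], dtype=complex),
    np.array([1, np.sqrt(2)], dtype=complex) / np.sqrt(3),
    np.array([1, np.sqrt(2) / omega]) / np.sqrt(3),
    np.array([1, np.sqrt(2) / omega**2]) / np.sqrt(3),
]
povm = [(d / m) * np.outer(v, v.conj()) for v in vectors]   # M_j = (d/m) v_j v_j^*

# Distance from each vertex e_j to the point v_j^* M v_j named in the proof.
distances = [
    np.linalg.norm(probability_point(povm, v) - np.eye(m)[j])
    for j, v in enumerate(vectors)
]
bound = hausdorff_lower_bound(m, d)
```

Each of the four distances agrees with the bound $\frac{m-d}{\sqrt{m(m-1)}} = \frac{1}{\sqrt{3}}$, consistent with the equality condition in the theorem.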

We note that this result gives a variational characterization of SIC-POVMs. We can also obtain the following necessary and sufficient condition for the existence of a SIC-POVM.

Corollary 4.14. Let d be a natural number. Then there exists a SIC-POVM on $\mathbb{C}^d$ if and only if for any $\epsilon > 0$, there exists a POVM M on $\mathbb{C}^d$ with $d^2$ elements such that $d_H(\mathrm{conv}(W(M)), \Delta_{d^2}) < \frac{d-1}{\sqrt{d^2-1}} + \epsilon$.

Proof. Suppose M is a SIC-POVM on $\mathbb{C}^d$; then $d_H(\mathrm{conv}(W(M)), \Delta_{d^2}) = \frac{d^2-d}{\sqrt{d^2(d^2-1)}} = \frac{d-1}{\sqrt{d^2-1}}$. Therefore M works for all $\epsilon$. The other direction follows from standard compactness arguments. The set $(H_d)^{d^2}$ of all $d^2$-tuples of d by d Hermitian matrices is a finite-dimensional vector space and the set of all POVMs on $\mathbb{C}^d$ with $d^2$ elements forms a compact subset of $(H_d)^{d^2}$. Now suppose for all natural numbers n, there exists a POVM $M_{(n)}$ such that $d_H(\mathrm{conv}(W(M_{(n)})), \Delta_{d^2}) < \frac{d-1}{\sqrt{d^2-1}} + \frac{1}{n}$. This sequence of POVMs has a convergent subsequence; let M be the limit of the subsequence. Since the joint numerical range is continuous on $(H_d)^{d^2}$, $d_H(\mathrm{conv}(W(M)), \Delta_{d^2}) = \frac{d-1}{\sqrt{d^2-1}}$. Hence by our previous theorem M is a SIC-POVM.

The existence of a SIC-POVM in $\mathbb{C}^d$ is equivalent to the existence of $d^2$ unit vectors $\{v_i\} \subset \mathbb{C}^d$ that satisfy $|\langle v_i, v_j \rangle| = \frac{1}{\sqrt{d+1}}$ for all $i \ne j$.

This result may be useful in computer searches for SIC-POVMs as a random search is very unlikely to land exactly on a SIC-POVM. POVMs which are close to the bound in Theorem 4.13 could yield promising regions of the space for further searching.

Chapter 5

Entanglement Witnesses and Further Work

5.1 Introduction

Entanglement is a quantum property that can occur when two particles interact with each other. When particles are completely entangled, we can learn certain properties of one particle by studying the properties of the other particle. Even if the particles are separated by extreme distances, the moment an experimenter measures a certain property of one of the particles they can know certain properties of the other entangled particle. We will not discuss the physics but go straight into the mathematics behind entanglement.

5.2 Separability criterion for low dimension

Characterizing and classifying entangled states of higher dimensional bipartite systems is a difficult task. A density operator $\rho$ acting on a finite-dimensional Hilbert space H can either be separable or entangled. For two low-dimensional systems, namely $H = \mathbb{C}^2 \otimes \mathbb{C}^2$ and $H = \mathbb{C}^2 \otimes \mathbb{C}^3$, there exists an operationally simple condition for separability. This is known as the Peres-Horodecki criterion. It indicates that a state $\rho$ is separable if and only if its partial transpose is positive. To understand this result, we define the partial transpose.

Definition 5.1. Given a general state $\rho_{AB}$ that acts on $H = H_A \otimes H_B$, its partial transpose with respect to B is

$$\rho_{AB}^{T_B} = \sum_{r=1}^{nm} p_r \, \rho_r^A \otimes (\rho_r^B)^T.$$

Remark 5.2. We can extend the partial transpose to general density matrices by linearity.

Definition 5.3. [8] A state $\rho_{AB}$ is said to have a positive partial transposition (PPT) when all of the eigenvalues of $\rho_{AB}^{T_A}$ are non-negative.

Definition 5.4. [8] A state $\rho_{AB}$ is said to be non-positive under partial transposition (NPPT) when at least one of the eigenvalues of $\rho_{AB}^{T_A}$ is negative.

Remark 5.5. A ≥ 0 is being used to denote that A is positive semidefinite.

Theorem 5.6. [8] If a state is separable, then

$$\rho_{AB}^{T_A} \ge 0, \quad \text{and} \quad \rho_{AB}^{T_B} = (\rho_{AB}^{T_A})^T \ge 0.$$

Proof. Our definition of separable leads us to write

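$$\rho_{AB} = \sum_{i=1}^{K} p_i \, |e_i\rangle\langle e_i| \otimes |f_i\rangle\langle f_i|.$$

By performing the partial transposition with respect to A we have

$$\rho_{AB}^{T_A} = \sum_{i=1}^{K} p_i \, (|e_i\rangle\langle e_i|)^T \otimes (|f_i\rangle\langle f_i|).$$

Since $A^\dagger = (A^*)^T$ our result is that

$$\rho_{AB}^{T_A} = \sum_{i=1}^{K} p_i \, (|e_i^*\rangle\langle e_i^*|) \otimes (|f_i\rangle\langle f_i|) \ge 0.$$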

Every separable state has a positive partial transposition.
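The partial transpose test is straightforward to carry out numerically for two qubits. The sketch below assumes NumPy; the helper `partial_transpose_A` and the two test states (a product state and a maximally entangled Bell state) are illustrative choices that are not taken from the text.

```python
import numpy as np

def partial_transpose_A(rho, dim_a, dim_b):
    """Transpose the first tensor factor of an operator on C^dim_a (x) C^dim_b."""
    r = rho.reshape(dim_a, dim_b, dim_a, dim_b)     # indices (a, b, a', b')
    r = r.transpose(2, 1, 0, 3)                     # swap a and a'
    return r.reshape(dim_a * dim_b, dim_a * dim_b)

# Product (hence separable) state |0><0| (x) |+><+|.
zero = np.array([1, 0], dtype=complex)
plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
product = np.kron(np.outer(zero, zero.conj()), np.outer(plus, plus.conj()))

# Bell state (|00> + |11>) / sqrt(2), which is entangled.
bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
bell_state = np.outer(bell, bell.conj())

min_eig_product = np.linalg.eigvalsh(partial_transpose_A(product, 2, 2)).min()
min_eig_bell = np.linalg.eigvalsh(partial_transpose_A(bell_state, 2, 2)).min()
```

The product state keeps a nonnegative spectrum under partial transposition, as Theorem 5.6 requires, while the Bell state acquires the eigenvalue $-\frac{1}{2}$ and is therefore NPPT.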

The converse of Theorem 5.6 is only valid in the two cases $H = \mathbb{C}^2 \otimes \mathbb{C}^2$ and $H = \mathbb{C}^2 \otimes \mathbb{C}^3$. This is because in higher dimensions, there exist entangled states that have a positive partial transpose. This means that we need a new separability criterion for higher dimensions.

5.3 Entanglement witnesses

Mathematically describing entanglement becomes more complex in higher dimensions. We can approach this problem geometrically using entanglement witnesses. Throughout the remainder of this thesis the inner product we are using is the trace inner product. For any n by n matrix A, the trace is defined as the sum of diagonal entries, $\mathrm{tr}(A) = \sum_{i=1}^n a_{ii}$. For any two n by n Hermitian matrices A and B one can define the Frobenius or trace inner product as $\langle A, B \rangle = \sum_{i,j} a_{ij} b_{ji} = \mathrm{tr}(AB)$.

Theorem 5.7. [27] Let S be the set of all separable states; then a state $\rho_{ent} \in H$ is entangled if and only if there exists a Hermitian operator $A \in H$ which satisfies

1. $\langle \rho_{ent}, A \rangle < 0$

2. $\langle \sigma, A \rangle \ge 0 \ \ \forall \sigma \in S$

A Hermitian operator A that satisfies the hypotheses of theorem 5.7 is called an entanglement witness.

Remark 5.8. Theorem 5.7 can be derived from the Hahn-Banach Theorem of functional analysis or the separating hyperplane theorem.

Theorem 5.9. Let C be a convex closed set in a Hilbert space H, and let ρ ∈ H \ C. Then there exists a hyperplane that separates ρ from the set C.

Proof. If we identify C with S, then we know according to theorem 5.7 that there exists a hyperplane that separates the entangled state from the separable states. Since such a hyperplane exists, there must be an $A \in H$ such that the hyperplane is the set of all $\rho \in H$ which satisfy $\langle \rho, A \rangle = 0$.

Definition 5.10. An entanglement witness A is called optimal, denoted $A_{opt}$, if there exists a separable state $\sigma \in S$ such that

$$\langle \sigma, A_{opt} \rangle = 0.$$
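As a concrete illustration of these definitions, the swap (flip) operator on $\mathbb{C}^2 \otimes \mathbb{C}^2$ is a standard textbook example of an entanglement witness; it is not an example used in this thesis and is introduced here only as an assumption for the sketch. It satisfies $\langle \rho^A \otimes \rho^B, \mathrm{SWAP} \rangle = \mathrm{tr}(\rho^A \rho^B) \ge 0$ on product states, and by linearity on all separable states. The code assumes NumPy, and the helper names are hypothetical.

```python
import numpy as np

# SWAP operator on C^2 (x) C^2: it maps |a b> to |b a>.
swap = np.zeros((4, 4))
for a in range(2):
    for b in range(2):
        swap[2 * b + a, 2 * a + b] = 1.0

def witness_value(state, witness):
    """Trace inner product <state, witness> = tr(state @ witness)."""
    return np.trace(state @ witness).real

def random_pure_state(rng, dim=2):
    """Projection onto a random unit vector of C^dim."""
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())

# Entangled singlet state (|01> - |10>) / sqrt(2): condition 1 of Theorem 5.7.
singlet = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)
value_entangled = witness_value(np.outer(singlet, singlet.conj()), swap)

# Random product states: condition 2 of Theorem 5.7 on a sample of separable states.
rng = np.random.default_rng(2)
product_values = [
    witness_value(np.kron(random_pure_state(rng), random_pure_state(rng)), swap)
    for _ in range(500)
]

# The separable state |0><0| (x) |1><1| attains the value zero (Definition 5.10).
sigma = np.kron(np.diag([1.0, 0.0]), np.diag([0.0, 1.0]))
value_sigma = witness_value(sigma, swap)
```

The singlet gives the value $-1$, every sampled product state gives a nonnegative value, and the separable state $\sigma$ gives exactly zero, so this witness is optimal in the sense of Definition 5.10.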

5.3.1 Problem with the entanglement witnesses

An entanglement witness is a linear functional that separates certain entangled states from the set of all separable states. When the inner product of a state with the entanglement witness is negative, we can guarantee entanglement. When the inner product of a state with the entanglement witness is non-negative, we can neither confirm nor deny that the state is within the set of separable states. This is why increasing the number of entanglement witnesses will result in a more accurate representation of the set of separable states. To perfectly describe the set of separable states, we would need to use an infinite number of optimal entanglement witnesses. An infinite number of entanglement witnesses would prevent an entangled state $\rho$ from lying inside the region cut out by the entanglement witnesses while lying outside the set of separable states. However, for many reasons having an infinite number of entanglement witnesses is impractical. This restricts us to using a finite number of entanglement witnesses, meaning that in some cases the entanglement witness test will fail. Now, the challenge is to minimize the size of the region between the entanglement witnesses and the set of separable states in order to decrease the probability that the entanglement witness test fails.

5.4 Application of Hausdorff distance with the entanglement witness

Definition 5.11. An (n−1)-dimensional face of an n-dimensional polytope is called a facet.

Definition 5.12. An entanglement witness polytope is a polytope whose facets are of the form {X : hA, Xi = 0} where A is an optimal entanglement witness.

Definition 5.13. An entanglement witness simplex is an entanglement witness polytope which is also a simplex.

The entanglement witness polytopes are subsets of the set of mn by mn Hermitian matrices. The space of mn by mn Hermitian matrices is a real vector space of dimension $m^2n^2$. This means the number of facets f of an entanglement witness polytope must satisfy $f \ge m^2n^2 + 1$, with equality if and only if it is an entanglement witness simplex. All entanglement witness polytopes contain the separable states as a subset. The problem can now be phrased as: for a fixed number of facets f, find the entanglement witness polytope with f facets which has minimum Hausdorff distance to the set of separable states. If $f = m^2n^2 + 1$ then the problem becomes finding the entanglement witness simplex which has a minimum Hausdorff distance to the set of separable states.

Problem 5.14. Let V be the real vector space of mn by mn Hermitian matrices. Let C be the set of separable states of H_n ⊗ H_m and let F be the family of entanglement witness polytopes with k facets, where k is a fixed integer with k ≥ m²n² + 1. Find the set in F that has minimal Hausdorff distance to C.
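In symbols, Problem 5.14 can be written as the following minimization. This is only a sketch, under the assumption that d_H denotes the Hausdorff metric induced by a norm ‖·‖ on V (for example the Hilbert-Schmidt norm; the choice of norm is an assumption here).

```latex
\[
  \min_{P \in \mathcal{F}} d_H(P, C),
  \qquad
  d_H(P, C) = \max\Bigl\{ \sup_{X \in P} \inf_{Y \in C} \|X - Y\|,\;
                          \sup_{Y \in C} \inf_{X \in P} \|X - Y\| \Bigr\}.
\]
Since every entanglement witness polytope $P$ contains $C$, the second term vanishes and
\[
  d_H(P, C) = \sup_{X \in P} \inf_{Y \in C} \|X - Y\|.
\]
```

In this form the objective measures how far the worst-placed point of the polytope lies from the set of separable states.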

By the argument in the proof of Theorem 2.4, a solution to this problem exists, although it is not necessarily unique. This problem is important because the optimal entanglement witness polytopes will eliminate most of the entangled states from the test. Any entangled state that is not eliminated from the test will be geometrically close to the set of separable states. This means that the degree of entanglement of these states will be significantly less than that of entangled states that are geometrically further from the set of separable states. See Shimony's paper [28] for further insight. This problem is difficult because we do not have a complete characterization of optimal entanglement witnesses to generate entanglement witness polytopes. See the paper [9] for more about constructing and characterizing entanglement witnesses.

Bibliography

[1] Aharonov, Y., Popescu, S., Rohrlich, D., and Vaidman, L. Measurements, errors, and negative kinetic energy. Physical Review A 48, 6 (1993), 4084.

[2] Bolzano, B. Untersuchungen zur Grundlegung der Ästhetik, vol. 1. Athenäum-Verlag, 1972.

[3] Boyd, S., and Vandenberghe, L. Convex optimization. Cambridge university press, 2004.

[4] Carathéodory, C. Über den Variabilitätsbereich der Fourier'schen Konstanten von positiven harmonischen Funktionen. Rendiconti Del Circolo Matematico di Palermo (1884-1940) 32, 1 (1911), 193–217.

[5] Chan, J.-T. A note on the boundary of the joint numerical range. Linear and Multilinear Algebra 66, 4 (2018), 821–826.

[6] Chen, J., Dawkins, H., Ji, Z., Johnston, N., Kribs, D., Shultz, F., and Zeng, B. Uniqueness of quantum states compatible with given measurement results. Physical Review A 88, 1 (2013), 012109.

[7] Crann, J., Kribs, D. W., and Paulsen, V. I. In preparation: Functional analysis for quantum information.

[8] Das, S., Chanda, T., Lewenstein, M., Sanpera, A., De, A. S., and Sen, U. The separability versus entanglement problem. arXiv preprint arXiv:1701.02187 (2017).

[9] Doherty, A. C., Parrilo, P. A., and Spedalieri, F. M. Complete family of separability criteria. Physical Review A 69, 2 (2004), 022308.

[10] Goodman, J. E. On the largest convex polygon contained in a non-convex n-gon, or how to peel a potato. Geometriae Dedicata 11, 1 (1981), 99–106.

[11] Griffel, D. H. Applied functional analysis. Courier Corporation, 2002.

[12] Gutkin, E., and Życzkowski, K. Joint numerical ranges, quantum maps, and joint numerical shadows. Linear Algebra and its Applications 438, 5 (2013), 2394–2404.

[13] Holevo, A. S. Quantum systems, channels, information: a mathematical introduction, vol. 16. Walter de Gruyter, 2012.

[14] Jameson, G. J. O., and Jameson, G. Topology and normed spaces, vol. 322. Chapman and Hall London, 1974.

[15] Kang, Y.-S., and Seo, E.-J. A study on the teaching method of incenter and circumcenter of triangle. Journal of the Korean School Mathematics Society 12, 3 (2009), 171–188.

[16] Klamkin, M. S., and Tsintsifas, G. A. The circumradius-inradius inequality for a simplex. Mathematics Magazine 52, 1 (1979), 20–22.

[17] Leversha, G., and Smith, G. C. Euler and triangle geometry. The Mathematical Gazette 91, 522 (2007), 436–452.

[18] Nielsen, M. A., and Chuang, I. Quantum computation and quantum information, 2002.

[19] Peres, A. Quantum theory: concepts and methods, vol. 57. Springer Science & Business Media, 2006.

[20] Petty, C., and Waterman, D. An extremal theorem for n-simplexes. Monatshefte für Mathematik 59, 4 (1955), 320–322.

[21] Phelps, R. Convex sets and nearest points. Proceedings of the American Mathematical Society 8, 4 (1957), 790–797.

[22] Price, G. B. On the completeness of a certain metric space with an application to Blaschke’s selection theorem. Bulletin of the American Mathematical Society 46, 4 (1940), 278–280.

[23] Renes, J. M., Blume-Kohout, R., Scott, A. J., and Caves, C. M. Symmetric informationally complete quantum measurements. Journal of Mathematical Physics 45, 6 (2004), 2171–2180.

[24] Rockafellar, R. T., and Wets, R. J.-B. Variational analysis, vol. 317. Springer Science & Business Media, 2009.

[25] Rodman, L., Spitkovsky, I. M., Szkoła, A., and Weis, S. Continuity of the maximum-entropy inference: Convex geometry and numerical ranges approach. Journal of Mathematical Physics 57, 1 (2016), 015204.

[26] Saha, S. K. Dynamics of serial multibody systems using the decoupled natural orthogonal complement matrices. J. Appl. Mech. 66 (1999), 986–996.

[27] Schneiderbauer, L. Entanglement or separability. Master’s thesis, 2012.

[28] Shimony, A. Degree of entanglement. Annals of the New York Academy of Sciences 755, 1 (1995), 675–679.

[29] Steeb, W.-H., and Hardy, Y. Matrix Calculus and Kronecker Product: A Practical Approach to Linear and Multilinear Algebra Second Edition. World Scientific Publishing Company, 2011.

[30] Webster, R. Convexity. Oxford University Press, 1994.

[31] Weisstein, E. W. Circumsphere. https://mathworld.wolfram.com/, 2002.

[32] Weisstein, E. W. Circumradius. https://mathworld.wolfram.com/, 2003.

[33] Weisstein, E. W. Excircles. https://mathworld.wolfram.com/, 2003.

[34] Weisstein, E. W. Inradius. https://mathworld.wolfram.com/, 2003.

[35] Yiu, P. Class notes: Notes on Euclidean geometry, 1998.

[36] Yiu, P. Class notes: Introduction to the geometry of the triangle, 2001.
