CONVEX GEOMETRIC CONNECTIONS TO INFORMATION THEORY

by

Justin Jenkinson

Submitted in partial fulfillment of the requirements

For the degree of Doctor of Philosophy

Department of

CASE WESTERN RESERVE UNIVERSITY

May, 2013

CASE WESTERN RESERVE UNIVERSITY

School of Graduate Studies

We hereby approve the dissertation of Justin Jenkinson, candidate for the degree of Doctor of Philosophy.

Signed: Stanislaw Szarek

Co-Chair of the Committee

Elisabeth Werner

Co-Chair of the Committee

Elizabeth Meckes

Kenneth Kowalski

Date: April 4, 2013

*We also certify that written approval has been obtained for any proprietary material contained therein.

© Copyright by

Justin Jenkinson

2013

TABLE OF CONTENTS

List of Figures ...... v

Acknowledgments ...... vi

Abstract ...... vii

Introduction ...... 1

CHAPTER PAGE

1 Relative Entropy of Convex Bodies ...... 6

1.1 Notation ...... 7
1.2 Background ...... 10
1.2.1 Affine Invariants ...... 11
1.2.2 Associated Bodies ...... 13
1.2.3 Entropy ...... 15
1.3 Mean Width Bodies ...... 16
1.4 Relative Entropies of Cone Measures and Affine Surface Areas ...... 24
1.5 Proof of Theorem 1.4.1 ...... 30

2 of Quantum States ...... 37

2.1 Preliminaries from Convex Geometry ...... 39
2.2 Summary of Volumetric Estimates for Sets of States ...... 44
2.3 Geometric Measures of Entanglement ...... 51
2.4 Ranges of Various Entanglement Measures and the role of the Dimension ...... 56
2.5 Levy’s Lemma and its Applications to p-Schatten Norms ...... 60
2.6 Concentration for the Support Functions of PPT and S ...... 64
2.7 Geometric Banach-Mazur Distance between PPT and S ...... 66
2.8 Hausdorff Distance between PPT and S in p-Schatten Metrics ...... 67
2.9 Maximum Trace Distance from a PPT state to S ...... 70

Summary ...... 74

APPENDIX

A Constants in the Spherical Isoperimetric Inequality ...... 75

Bibliography ...... 81

LIST OF FIGURES

FIGURE PAGE

1 The Minkowski sum of a triangle and a circle ...... 2

2 The polar of the cube is the octahedron...... 3

3 The polar of an ellipse is another ellipse...... 4

1.1 The Gauss map of an ellipse, K...... 8

1.2 The support function of K...... 10

A.1 The plots of q3 and q4...... 79

ACKNOWLEDGMENTS

Many thanks to my advisors Stanislaw Szarek and Elisabeth Werner for providing so much inspiration and motivation and for leading me through this endeavor. The collaboration with Karol and Michal Horodecki was essential to many of the results in Chapter 2 and I am grateful for their participation and comments. I would also like to thank my committee members for many useful comments and suggestions, as well as the Department of Mathematics, the School of Graduate Studies, and Case Western Reserve University in general.

This research has been partially supported by grants DMS-0503642, DMS-0652722 and DMS-0801275 from the National Science Foundation.

ABSTRACT

Convex Geometric Connections to Information Theory

by

JUSTIN JENKINSON

Convex geometry is a field of mathematics that has experienced rapid growth in recent years and has proven to be an extremely useful perspective in many areas of research. Problems in many different fields can be interpreted geometrically, which often leads to powerful and surprising results.

This thesis establishes connections between convex geometry and both classical and quantum information theory. We introduce the mean width bodies and illustrate the geometric interpretation they provide for the relative entropy of cone measures of a convex body and its polar. We define relative entropy for convex bodies and consider its relation to affine isoperimetric inequalities. Other connections are made by considering quantum information theory.

The fundamental objects in quantum information theory are quantum states. The set of states is convex as are some of its important subsets. Therefore, convex geometry provides a natural approach to explore quantum states. Fairly sharp estimates are obtained regarding the geometry of quantum states using basic notions in convex geometry. In particular, the distance between the set of states with positive partial transpose and the set of separable states is explored.

Finally, the optimal constants for the spherical isoperimetric inequality are provided and generalizations of a concentration inequality are suggested.

INTRODUCTION

Convex geometry is a relatively young field of mathematics, yet it is based on simple notions of volume and distance known even to the ancient Greeks. Modern convex geometry has its roots with Hermann Brunn and Hermann Minkowski around the turn of the 20th century. The body of knowledge that grew from their work became known as the Brunn-Minkowski theory. Rolf Schneider [81] describes the Brunn-Minkowski theory as “...the result of merging two elementary notions for sets in Euclidean space: vector addition and volume.” One of the central concepts is Minkowski addition. If $K, L \subset \mathbb{R}^n$, their Minkowski sum is given by

\[ K + L = \{x + y : x \in K,\ y \in L\} \]

(see figure 1). Minkowski addition of sets is a fundamental notion in convex geometry and leads to many important results such as isoperimetric inequalities, Hadwiger’s theorem, and the Brunn-Minkowski inequality. The Brunn-Minkowski inequality states that for any nonempty compact subsets $K$ and $L$ of $\mathbb{R}^n$,

\[ |K + L|^{1/n} \ge |K|^{1/n} + |L|^{1/n}, \]

or, equivalently, for $t \in [0, 1]$,

\[ |tK + (1 - t)L| \ge |K|^{t}\, |L|^{1-t}. \]

Here and in what follows $|\cdot|$, when applied to a set, denotes the volume of the set in the appropriate dimension. The content of the Brunn-Minkowski inequality can be summarized nicely (if somewhat informally) by noticing that the statements are equivalent to saying that $|\cdot|^{1/n}$ is concave (or, equivalently, that $\log(|\cdot|)$ is concave) with respect to Minkowski addition. The Brunn-Minkowski inequality is in fact responsible for much of the modern theory [23, 81]. It was through this inequality that many important connections to other fields were made. Variants of the Brunn-Minkowski inequality show up in many fields and have proven to be extremely useful.

Figure 1: The Minkowski sum of a triangle and a circle.
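Both forms of the Brunn-Minkowski inequality can be checked by hand in the simplest case of axis-parallel boxes, where the Minkowski sum is again a box whose side lengths add coordinate by coordinate. The following Python sketch is only an illustration under that assumption; the function names (`box_volume`, `brunn_minkowski_additive`, `brunn_minkowski_multiplicative`, `random_box`) are hypothetical and do not come from the thesis.

```python
import random


def box_volume(sides):
    """Volume of an axis-parallel box with the given side lengths."""
    volume = 1.0
    for s in sides:
        volume *= s
    return volume


def brunn_minkowski_additive(a, b):
    """Return (|K+L|^(1/n), |K|^(1/n) + |L|^(1/n)) for boxes K and L.

    K and L are given by their side lengths a and b. Their Minkowski
    sum is the box with side lengths a[i] + b[i].
    """
    n = len(a)
    sum_sides = [x + y for x, y in zip(a, b)]
    lhs = box_volume(sum_sides) ** (1.0 / n)
    rhs = box_volume(a) ** (1.0 / n) + box_volume(b) ** (1.0 / n)
    return lhs, rhs


def brunn_minkowski_multiplicative(a, b, t):
    """Return (|tK+(1-t)L|, |K|^t |L|^(1-t)) for boxes K, L and t in [0, 1]."""
    mix = [t * x + (1.0 - t) * y for x, y in zip(a, b)]
    return box_volume(mix), box_volume(a) ** t * box_volume(b) ** (1.0 - t)


def random_box(n, rng):
    """Side lengths of a random n-dimensional box (illustrative range)."""
    return [rng.uniform(0.1, 5.0) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    a, b = random_box(3, rng), random_box(3, rng)
    print(brunn_minkowski_additive(a, b))
    print(brunn_minkowski_multiplicative(a, b, 0.3))
```

For homothetic boxes (for example side lengths (1, 2, 3) and (2, 4, 6)) the two sides of the additive form agree, which matches the usual equality case for homothetic convex bodies.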

The classical isoperimetric inequality compares the length of a closed curve in the plane and the area which it encloses. This idea has been generalized and variants exist in many fields of mathematics. The classic survey by Osserman [73] shows how pervasive these ideas are. Isoperimetric inequalities will be discussed in more detail in section 1.2 and Appendix A. A geometric perspective is extremely useful for many fields of study. A few that have benefitted in particular are probability, graph theory, linear programming, and information theory.

Another important notion in convex geometry is duality. For a set $K \subset \mathbb{R}^n$, we define the polar of $K$ (or the dual of $K$) as

\[ K^\circ = \{y \in \mathbb{R}^n : \langle x, y\rangle \le 1 \text{ for all } x \in K\}. \]

Notice that $K^\circ$ is always convex (a set $L$ is convex if, whenever $x, y \in L$, we have $tx + (1 - t)y \in L$ for all $t \in [0, 1]$). If $K$ is also convex and contains the origin in its interior, then $K^{\circ\circ} = K$. Additionally, the set $\mathcal{K}_0^n$ of convex bodies in $\mathbb{R}^n$ containing the origin as an interior point is closed under the polar operation, $K \mapsto K^\circ$. The polar operation is also contravariant. That is, if $K \subset L$ then $L^\circ \subset K^\circ$, and $(rK)^\circ = \frac{1}{r}K^\circ$ for all $K, L \subset \mathbb{R}^n$ and nonzero $r \in \mathbb{R}$. For example, figure 2 demonstrates the polar operation on the cube in $\mathbb{R}^3$ and figure 3 demonstrates that the polar of an ellipse is again an ellipse. An important fact is that if $X$ is a normed space with unit ball $K$, then the dual space, $X^*$, has unit ball $K^\circ$. This is the reason the term “dual” is sometimes used instead of “polar”. This is also illustrated in figure 2: the polar of the $\ell_\infty$ ball is the $\ell_1$ ball and vice versa. This duality is a powerful tool for exploring the structure of convex sets and normed spaces.
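The statement that the polar of the $\ell_\infty$ ball is the $\ell_1$ ball can be tested numerically. For a polytope $K$, a linear functional attains its maximum over $K$ at a vertex, so $y \in K^\circ$ exactly when $\langle v, y\rangle \le 1$ for every vertex $v$. The sketch below applies this to the cube $[-1,1]^3$; the helper names (`in_polar_of_polytope`, `l1_norm`, `cube_vertices`) are assumptions made for illustration.

```python
import itertools


def in_polar_of_polytope(vertices, y, tol=1e-12):
    """Test whether y lies in the polar of conv(vertices).

    A linear functional on a polytope is maximised at a vertex, so it is
    enough to check <v, y> <= 1 for each vertex v.
    """
    largest = max(sum(vi * yi for vi, yi in zip(v, y)) for v in vertices)
    return largest <= 1.0 + tol


def l1_norm(y):
    """The l_1 norm of y."""
    return sum(abs(t) for t in y)


# Vertices of the cube [-1, 1]^3, i.e. the unit ball of l_infinity in R^3.
cube_vertices = list(itertools.product((-1.0, 1.0), repeat=3))

if __name__ == "__main__":
    for y in [(1.0, 0.0, 0.0), (0.5, 0.5, 0.0), (0.5, 0.5, 0.5)]:
        print(y, in_polar_of_polytope(cube_vertices, y), l1_norm(y))
```

In three dimensions the $\ell_1$ ball is the octahedron with vertices $\pm e_i$, so membership in the polar of the cube coincides with the condition $\|y\|_1 \le 1$.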

Figure 2: The polar of the cube is the octahedron.

Figure 3: The polar of an ellipse is another ellipse.

Information theory has its roots with Claude Shannon [86] who wished to quantify information and to discover the best method of propagation. Information is propagated by a process of encoding, transmitting, and decoding. Shannon wished to find bounds on the amount of information that could be passed through circuits and to find encoding schemes that were optimal for these circuits. Being able to distinguish information from noise is crucial to encoding, decoding, and cryptography and quickly leads to questions of distinguishability of probability distributions [13]. Therefore, central notions in information theory involve measures of distinguishability of probability distributions. These notions of distinguishability (Kolmogorov distance, Bhattacharyya coefficient, Shannon distinguishability [13, 22]) give us a new interpretation of certain mathematical objects and have quantum analogs as well. Quantum information theory arose from the work of Richard Feynman who, in 1982 in an entertaining paper [21], asked, “Can physics be simulated by a universal computer?” The field shifted with the pioneering work of Bennett and Brassard [7] who considered quantum cryptography. This led the way to quantum computing and quantum algorithms, that is, using quantum interactions to propagate and manipulate information. In 1995 Peter Shor [88] showed that a quantum computer could, in theory, factor integers in polynomial time, a feat that is thought to be impossible with classical (non-quantum) computers. Shor’s groundbreaking result elevated quantum information theory from a novelty to a priority for cryptographers and security administrations. The secret ingredient that makes quantum information theory powerful and distinct from classical information theory is the notion of entanglement. The mathematical aspects of entanglement are discussed below in sections 2.2 and 2.3. An excellent survey article on the matter is [43].

The results from Chapter 1 are part of a joint work with Prof. Werner and will appear in the Transactions of the AMS [48]. The main results from sections 2.5 and 2.9 are part of a collaboration with Prof. Szarek, Karol Horodecki, and Michal Horodecki [44]. This work is still being prepared for publication. While the results of Appendix A are rather elementary, in view of their utility they will also be submitted for publication (perhaps after further development).

CHAPTER 1

RELATIVE ENTROPY OF CONVEX BODIES

Many ideas in information theory have analogs in convex geometric analysis or can be interpreted geometrically. This may be obvious for some examples, but many surprising connections have been discovered only recently. Geometric inequalities may often be used to obtain inequalities for probability densities. For instance, the entropy power inequality can be deduced from the Brunn-Minkowski inequality and vice versa. In fact, the two inequalities can be seen as consequences of generalizations of Young’s inequality (see, for example [13, 16, 59, 60, 63]). The main avenue of discovery seems to be through affine invariants and the various geometric relationships that give rise to these structures. For instance: affine surface area, which is a basic affine invariant, can be arrived at by considering the floating bodies [83]; the Cramér-Rao inequality (also known as the information inequality) can be seen as a consequence of relations between the Legendre ellipsoid and a polar $L_2$-projection body [63].

These objects are part of what is now known as the $L_p$-Brunn-Minkowski theory which plays a central role in modern convex geometry and has seen steady growth in recent years, and many results (see, e.g., [25], [27], [29], [35] - [37], [50], [51], [54] - [65], [68], [69], [71], [79], [82] - [85], [90], [91], [97] - [101], [109]) have spoken to the pervasive nature of the theory. Problems in many different areas can be seen as affine geometric problems. Probably the most famous example is the Busemann-Petty problem, which asks whether a volume inequality between symmetric bodies follows from the corresponding inequality on all central sections of the bodies. Satisfactory results were not obtained until the introduction of intersection bodies, an object of study in the $L_p$-Brunn-Minkowski theory, by Lutwak in [58]. The problem has since been solved thanks to [24, 26, 79, 107, 108].

Two important notions of $L_p$-Brunn-Minkowski theory are the $L_p$-affine surface area introduced for $p > 1$ by Lutwak in [59] and $L_p$-centroid bodies introduced by Lutwak and Zhang in [61]. Paouris and Werner [74] used these notions to show that the exponential of the relative entropy of the cone measures of a symmetric convex body and its polar equals a limit of normalized $L_p$-affine surface areas. In doing so, they introduce a new affine invariant $\Omega_K$. This quantity establishes another link between convex geometry and information theory and is the motivation for the present work.

We make yet another connection between convex geometry and information theory by considering a new class of bodies: the mean width bodies. We use these bodies to arrive at a characterization of relative entropy of cone measures (an information theoretic notion). What is especially interesting is that mean width bodies need not be symmetric or even convex. The fact that there are no such assumptions is in contrast to [74] where symmetry is needed. Another surprise is that the volume estimates involved are only first-order estimates whereas previous work required second-order estimates. This suggests that the mean width bodies are more sensitive to boundary structure than the $L_p$-centroid bodies.

1.1 Notation

Most of the notation used in this work is standard and in general we follow the conventions set by Schneider [81]. We record here some of the notational conventions used.

We work in $\mathbb{R}^n$, which is equipped with a Euclidean structure $\langle\cdot,\cdot\rangle$. We denote by $\|\cdot\|$ the corresponding Euclidean norm. $B_2^n(x, r)$ is the Euclidean ball centered at $x$ with radius $r$. We write $B_2^n = B_2^n(0, 1)$ for the Euclidean unit ball centered at 0 and $S^{n-1}$ for the unit sphere. Volume is denoted by $|\cdot|$ and is understood to be in the appropriate dimension. Throughout this chapter, we will assume that all bodies in question belong to $\mathcal{K}_0^n$, the set of convex bodies contained in $\mathbb{R}^n$ with centroid at the origin. The centroid of a (measurable) set $K \subset \mathbb{R}^n$ is defined by

\[ c_K = \frac{\int_K x\,dx}{|K|} = \mathbb{E}(X), \]

where $X$ is a random point distributed uniformly in $K$.

Figure 1.1: The Gauss map of an ellipse, K.

For a point $x \in \partial K$, the boundary of $K$, $N_K(x)$ is the outer unit normal to $K$ at $x$. The map $N_K : \partial K \to S^{n-1}$, $x \mapsto N_K(x)$, is sometimes called the Gauss map. See figure 1.1. We write $K \in C^2_+$ if $K$ has smooth ($C^2$) boundary $\partial K$ with everywhere strictly positive Gaussian curvature $\kappa_K$. More precisely, $K \in C^2_+$ if, for every $x \in \partial K$, the differential of the Gauss map at $x$ defines a positive definite operator on the tangent space at $x$ of $K$, the determinant of which defines the Gaussian curvature, $\kappa_K(x)$. Alternatively, by the implicit function theorem, we may view the boundary of $K$, at least locally, as the graph of some function $f$. Then $K \in C^2_+$ if the Hessian of $f$ is positive definite everywhere on $\partial K$. The importance of $C^2_+$ bodies is that if $K$ is $C^2_+$ then the Gauss map is invertible and we can define the curvature function, $f_K(u)$, as the reciprocal of the Gaussian curvature $\kappa_K(x)$ at the point $x \in \partial K$ that has $u$ as outer normal.

We denote by $\mu_K$ the usual surface area measure on the boundary of $K$, $\partial K$. When necessary, we will use $\omega$ for the usual surface area measure on $S^{n-1}$ induced by Lebesgue measure on $\mathbb{R}^n$ and $\sigma$ for its normalization: $\sigma(A) = \frac{\omega(A)}{\omega(S^{n-1})}$ for all Borel measurable sets $A \subset S^{n-1}$.

For $\xi$ and $x$ in $\mathbb{R}^n$, $H = H(x, \xi)$ is the hyperplane through $x$ orthogonal to $\xi$. The two closed half spaces generated by $H$ are given by

\[ H^+ = H^+(x, \xi) = \{y \in \mathbb{R}^n : \langle y, \xi\rangle \ge \langle x, \xi\rangle\}, \]

\[ H^- = H^-(x, \xi) = \{y \in \mathbb{R}^n : \langle y, \xi\rangle \le \langle x, \xi\rangle\}. \]

Let $K$ be a convex body in $\mathbb{R}^n$ and let $u \in S^{n-1}$. The support function of $K$ in the direction $u$ is given by

\[ h_K(u) = \max\{\langle x, u\rangle : x \in K\}. \tag{1.1.1} \]

The support function measures the distance from the origin to the hyperplane which is orthogonal to $u$ and intersects $K$ only on the boundary of $K$. See figure 1.2 for example.

Figure 1.2: The support function of K.

1.2 Background

This section provides background to the mathematical ideas and objects that are relevant to this chapter. These ideas include affine-invariants, bodies generated from a given body, and the notion of entropy.

Much of the general theory of convex bodies concerns the shape of a convex body and to a lesser extent the size or position of the body. Because of this, a natural question to ask is what attributes of a body are unchanged by affine transformations? The simplest affine invariant, known even to the ancient Greeks, is volume. Considering only this quantity already yields important results such as the isoperimetric inequality, which can be expressed for $K \in \mathcal{K}^n$ as

\[ \frac{|\partial K|}{|S^{n-1}|} \ge \left(\frac{|K|}{|B_2^n|}\right)^{\frac{n-1}{n}} \tag{1.2.1} \]

with equality if and only if $K$ is a Euclidean ball. The Blaschke-Santaló inequality is another important result concerning the volume of convex bodies. This is discussed in more detail in section 2.1. Volume inequalities are important; however, sharper and deeper results are often required. To this end, we must obtain affine invariants that more accurately distinguish between the shape or boundary structure of a body. These affine invariants more accurately describe the shape of a body and provide answers for many important questions.

1.2.1 Affine Invariants

The notion of affine surface area, first introduced by Blaschke [9] in $\mathbb{R}^3$, plays a fundamental role in convex geometry. Blaschke showed that for convex bodies in $\mathbb{R}^3$ with $C^\infty$ boundary (infinitely many non-vanishing derivatives), affine surface area can be computed by the formula

\[ as(K) = \int_{\partial K} \kappa_K(x)^{\frac{1}{n+1}}\, d\mu(x). \tag{1.2.2} \]

Here $\kappa_K(x)$ is the Gaussian curvature and $\mu$ is the surface measure of $K$. Expressions to compute affine surface area for arbitrary convex bodies in $\mathbb{R}^n$ were given at the same time by Leichtweiss [53], Lutwak [58] and Schütt and Werner [83]. In fact, Schütt and Werner showed in [83] that the formula (1.2.2) holds for arbitrary convex bodies in $\mathbb{R}^n$, replacing Gaussian curvature by the generalized Gaussian curvature.

The affine surface area has since been generalized to the family of $L_p$-affine surface areas defined for all $p$, the case $p = 1$ being the classical affine surface area. The case of $p > 1$ was established in 1996 by Lutwak in [59] while proving the $L_p$-affine isoperimetric inequality for this case (the $L_p$-affine isoperimetric inequality will be discussed below). Using a geometric formulation of $L_p$-affine surface area, Meyer and Werner ([68, 69]) established the case of $-n < p < 1$ along with a definition for the case $p = -n$. The remaining cases were settled by Schütt and Werner in [85] using yet another geometric interpretation. For $p \ne -n$ the $L_p$-affine surface area is given by

\[ as_p(K) = \int_{\partial K} \frac{\kappa(x)^{\frac{p}{n+p}}}{\langle x, N(x)\rangle^{\frac{n(p-1)}{n+p}}}\, d\mu(x). \tag{1.2.3} \]

For $p = 0$ this takes the form

\[ as_0(K) = \int_{\partial K} \langle x, N(x)\rangle\, d\mu(x) = n|K| \]

and for $p = \pm\infty$ (and with sufficiently smooth $K$) it is given by

\[ as_{\pm\infty}(K) = \int_{\partial K} \frac{\kappa(x)}{\langle x, N(x)\rangle^{n}}\, d\mu(x) = n|K^\circ|. \]

The $L_p$-affine surface area of the unit ball coincides with the regular surface area. That is, for all $p \ne -n$,

\[ as_p(B_2^n) = |S^{n-1}| = n|B_2^n|. \]

The $L_p$-affine surface areas are affine invariant. That is, if $\det(T) = 1$ then $as_p(T(K)) = as_p(K)$. This follows from the general fact that [46, 59, 85]

\[ as_p(T(K)) = (\det(T))^{\frac{n-p}{n+p}}\, as_p(K). \]
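As a quick consistency check of these statements (a worked example added here, not taken from the cited references), consider the dilated ball $rB_2^n$ with $r > 0$ and $p \ne -n$. Its Gaussian curvature is $r^{-(n-1)}$, $\langle x, N(x)\rangle = r$ on the boundary, and the surface area is $r^{n-1}|S^{n-1}|$, so formula (1.2.3) gives:

```latex
\begin{align*}
as_p(rB_2^n)
  &= \int_{\partial (rB_2^n)}
     \frac{\left(r^{-(n-1)}\right)^{\frac{p}{n+p}}}{r^{\frac{n(p-1)}{n+p}}}\, d\mu(x)
   = r^{-\frac{(n-1)p + n(p-1)}{n+p}} \cdot r^{\,n-1}\, |S^{n-1}| \\
  &= r^{\frac{n(n-p)}{n+p}}\, n\,|B_2^n| .
\end{align*}
```

This agrees with the transformation rule applied to $T = r\,\mathrm{Id}$, for which $\det(T) = r^n$ and hence $(\det(T))^{\frac{n-p}{n+p}} = r^{\frac{n(n-p)}{n+p}}$, and it reduces to $as_p(B_2^n) = n|B_2^n|$ when $r = 1$.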

If $P$ is a polytope then $as_p(P) = 0$ for $p > 0$ since the generalized Gauss curvature is almost everywhere 0 on a polytope.

As previously mentioned, Lutwak introduced the $L_p$-affine surface area and established the $L_p$-affine isoperimetric inequality for $p > 1$ in [59]. This version can be stated as follows.

\[ \frac{as_p(K)}{as_p(B_2^n)} \le \left(\frac{|K|}{|B_2^n|}\right)^{\frac{n-p}{n+p}}, \]

with equality if and only if $K$ is an ellipsoid. This form holds, in fact, for all $p \ge 0$ as shown by Hug [46]. If $p = 0$, equality holds trivially. In [100], Werner and Ye proved $L_p$-affine isoperimetric inequalities for all $p < 1$. If $-n < p \le 0$ then

\[ \frac{as_p(K)}{as_p(B_2^n)} \ge \left(\frac{|K|}{|B_2^n|}\right)^{\frac{n-p}{n+p}}, \]

again with equality if and only if $K$ is an ellipsoid. If $p < -n$ and $K$ is $C^2_+$ then

\[ \frac{as_p(K)}{as_p(B_2^n)} \ge c^{\frac{np}{n+p}} \left(\frac{|K|}{|B_2^n|}\right)^{\frac{n-p}{n+p}}. \]

Here $c$ is the constant from the reverse Santaló inequality stated below in (2.1.9) (see [81] or [71]). The best known value of $c$ is due to Kuperberg [52] where it was shown that $c > 1/2$ if $K$ is symmetric and $c > 1/4$ in the general case. The constants in [52] are actually slightly better but we state them this way for the sake of simplicity. Mahler conjectured [66] that there is equality in the reverse Santaló inequality when $K$ is a simplex. See the discussion following equation (2.1.9) for more details.

Another affine invariant, $\Omega_K$, mentioned above, was defined by Paouris and Werner [74]. For $K \in \mathcal{K}_0^n$,

\[ \Omega_K = \lim_{p \to \infty} \left(\frac{as_p(K)}{n|K^\circ|}\right)^{n+p}. \tag{1.2.4} \]

The quantity $\Omega_K$ is related to the relative entropy of the cone measures of the convex body $K$ and its polar $K^\circ$. See [74] for a detailed discussion of cone measures and the connection $\Omega_K$ makes to information theory.

1.2.2 Associated Bodies

There are many convex bodies that can be derived from other bodies. These bodies are often discovered through attempts to solve problems such as Shephard’s problem and the Busemann-Petty problem (see Lutwak [58] for example) or to find geometric characterizations of affine invariants. The most relevant such body for the present discussion is the convex floating body. The convex floating body gives a useful characterization of affine surface area and was defined in [83] as follows: if $0 \le \delta < \frac{|K|}{2}$, the convex floating body $K_\delta$ of $K$ is the intersection of all halfspaces $H^+$ whose defining hyperplanes $H$ cut off a set of volume at most $\delta$ from $K$:

\[ K_\delta = \bigcap_{|H^- \cap K| \le \delta} H^+. \tag{1.2.5} \]

The floating bodies $K_\delta$ give a ‘parameterization’ of $K$ in the sense that $K_0 = K$, and $K_\delta \subseteq K_\varepsilon$ when $\delta \ge \varepsilon$. The floating body operation is just one of a whole class of operations that can be used to examine the surface structures of a body. The affine surface area $as(K)$ describes the limiting behavior of the volume differences $|K| - |K_\delta|$. That is, if $K_\delta$ is the floating body of $K$, it was proved in [83] that

\[ \lim_{\delta \to 0} \frac{|K| - |K_\delta|}{\delta^{\frac{2}{n+1}}} = c_n\, as(K), \]

where $c_n = \frac{1}{2}\left(\frac{|B_2^{n-1}|}{n+1}\right)^{\frac{-2}{n+1}}$. This is a special case of the more general fact

\[ \lim_{\varepsilon \to 0} \frac{|K| - |K_\varepsilon|}{|B_2^n| - |(B_2^n)_\varepsilon|} = \frac{as(K)}{as(B_2^n)}, \]

where $K_\varepsilon$ is a ‘parameterization’ of $K$ that satisfies certain conditions. We refer to [98] for the details.

For a convex body $K$ in $\mathbb{R}^n$ of volume 1 and $1 \le p \le \infty$, the $L_p$-centroid body $Z_p(K)$ is the convex body that has support function [61]

\[ h_{Z_p(K)}(\theta) = \left(\int_K |\langle x, \theta\rangle|^p\, dx\right)^{1/p}. \tag{1.2.6} \]

The floating body and the $L_p$-centroid body are just two examples of bodies connected to a given convex body. Others include the illumination body [97], the projection body [63], the surface body [85], and the intersection body [58].

1.2.3 Entropy

An important notion in the present work is the notion of entropy. Entropy was introduced by Shannon [86] to provide a measure of ‘randomness’ of probability distributions, specifically to determine which probability distributions are best suited to propagate information. If $X$ is a random variable on a measure space $(\Omega, \mu)$ with probability density function $f$, the classical entropy, or Shannon entropy [13], is given by

\[ H(X) = -\int_\Omega f(x)\log(f(x))\, d\mu(x) = \mathbb{E}(-\log(f(X))). \]

Notice that the values that $X$ takes are irrelevant to the entropy. The entropy is really the expected information content of the probability density. Joint entropy is given by

\[ H(X, Y) = -\int_{\Omega_x \times \Omega_y} f(z)\log(f(z))\, d\mu(z) = \mathbb{E}(-\log(f(Z))), \]

where $Z = X \times Y$ and $f$ is the joint distribution.

The relative entropy or Kullback-Leibler divergence between two distributions is given by

\[ D_{KL}(p\|q) = \int_\Omega p(x)\log\frac{p(x)}{q(x)}\, d\mu(x). \tag{1.2.7} \]

The mutual information can then be expressed as

\[ I(X, Y) = D_{KL}(p(x, y)\,\|\,p(x)p(y)). \]

See [13] for more details and discussion of the above quantities. The notion of entropy has been generalized in several different directions. Most notable are the von Neumann entropy and the p-Rényi entropy (see e.g. [72]).
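For finite probability vectors (counting measure in place of $\mu$) the quantities above become finite sums, which makes them easy to experiment with. The Python sketch below is an illustration under that discrete assumption; the function names `entropy`, `kl_divergence` and `mutual_information` are hypothetical and natural logarithms are used throughout.

```python
import math


def entropy(p):
    """Shannon entropy -sum p_i log p_i of a finite probability vector."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def kl_divergence(p, q):
    """Relative entropy D_KL(p || q) = sum p_i log(p_i / q_i).

    Terms with p_i = 0 contribute 0; the result is infinite if some
    q_i = 0 while p_i > 0.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi == 0:
                return math.inf
            total += pi * math.log(pi / qi)
    return total


def mutual_information(joint):
    """I(X, Y) = D_KL(p(x, y) || p(x) p(y)) for a joint table joint[i][j]."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    flat_joint = [v for row in joint for v in row]
    flat_product = [a * b for a in px for b in py]
    return kl_divergence(flat_joint, flat_product)


if __name__ == "__main__":
    correlated = [[0.5, 0.0], [0.0, 0.5]]
    independent = [[0.25, 0.25], [0.25, 0.25]]
    print(mutual_information(correlated), mutual_information(independent))
```

Two perfectly correlated fair bits carry $\log 2$ of mutual information, while independent variables carry none, in line with the interpretation of $I(X,Y)$ as the relative entropy between the joint distribution and the product of its marginals.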

1.3 Mean Width Bodies

The mean width $W(K)$ of a convex body $K$ in $\mathbb{R}^n$ is defined as

\[ W(K) = 2\int_{S^{n-1}} h_K(u)\, d\sigma(u), \tag{1.3.1} \]

where $h_K$ is the support function of $K$ defined above by (1.1.1) (see e.g. Schneider [81]). The factor of 2 appears because the width of $K$ in the direction parallel to $u$ (i.e., the distance between the two supporting hyperplanes orthogonal to $u$) is actually $\max_{x \in K}\langle x, u\rangle - \min_{x \in K}\langle x, u\rangle = h_K(u) + h_K(-u)$, and averaging this sum over $S^{n-1}$ results in a factor of 2. Let $M$ and $K$ be convex bodies such that 0 is the centroid of $K$ and $K \subset M$. By switching to polar coordinates it is easy to see [30] that

\[ W(M) - W(K) = \frac{2}{|S^{n-1}|}\int_{K^\circ \setminus M^\circ} \|\xi\|^{-(n+1)}\, d\xi. \tag{1.3.2} \]
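Definition (1.3.1) is easy to evaluate numerically in the plane, where the support function of a polygon is a maximum over its vertices and $\sigma$ is normalized arc length on the unit circle. The sketch below compares the result with the classical planar identity that the mean width equals the perimeter divided by $\pi$; the function names and the midpoint quadrature are choices made for this illustration only.

```python
import math


def support_function(vertices, theta):
    """h_K(u) for K = conv(vertices) and u = (cos theta, sin theta)."""
    ux, uy = math.cos(theta), math.sin(theta)
    return max(vx * ux + vy * uy for vx, vy in vertices)


def mean_width(vertices, samples=20000):
    """Approximate W(K) = 2 * (average of h_K over the unit circle).

    Uses the midpoint rule on equally spaced angles.
    """
    total = 0.0
    for k in range(samples):
        theta = 2.0 * math.pi * (k + 0.5) / samples
        total += support_function(vertices, theta)
    return 2.0 * total / samples


def perimeter(vertices):
    """Perimeter of a polygon whose vertices are listed in order."""
    n = len(vertices)
    length = 0.0
    for i in range(n):
        (x0, y0), (x1, y1) = vertices[i], vertices[(i + 1) % n]
        length += math.hypot(x1 - x0, y1 - y0)
    return length


if __name__ == "__main__":
    square = [(-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0)]
    print(mean_width(square), perimeter(square) / math.pi)
```

The same routine can be used to check that the mean width does not change under translations, since the linear part $\langle c, u\rangle$ of a shifted support function averages to zero over the circle.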

Let $f : K^\circ \to \mathbb{R}$ be a positive, integrable function. We generalize (1.3.2) to

\[ W_f(M) - W_f(K) = \frac{2}{|S^{n-1}|}\int_{K^\circ \setminus M^\circ} f(\xi)\, d\xi. \tag{1.3.3} \]

Here we can define $W_f(K) = \frac{-2}{|S^{n-1}|}\int_{K^\circ} f(\xi)\, d\xi$, but many important properties arise only when we consider differences of such quantities. Also, defining $W_f(K)$ separately costs some generality because, since the difference of these quantities is given by an integration over $K^\circ \setminus M^\circ$, we only need to consider functions integrable on bounded sets separated from 0. For example, the generalization suggests that if $f = \|\cdot\|^{-(n+1)}$ then $W_f(K) = W(K)$, but the integral $\frac{2}{|S^{n-1}|}\int_{K^\circ} \|\xi\|^{-(n+1)}\, d\xi$ is divergent.

In order to define the mean width bodies we consider a specific $M$. Namely, for $x \in \mathbb{R}^n$, let $K_x = [x, K]$ be the convex hull of $x$ and $K$. For $x \in K$, $K_x = K$. Therefore, we will consider only $x \notin K$. Let $t \ge 0$. Following Glasauer and Gruber [30], we define the following convex bodies:

\[ K[t] = \{x \in \mathbb{R}^n : w(x) \le t\} \tag{1.3.4} \]

where

\[ w(x) = W(K_x) - W(K) = \frac{2}{|S^{n-1}|}\int_{K^\circ \setminus K_x^\circ} \|\xi\|^{-(n+1)}\, d\xi. \tag{1.3.5} \]

The bodies $K[t]$ in equation (1.3.4) have been used by several authors (e.g. by Böröczky and Schneider [10] and Glasauer and Gruber [30]) in connection with approximation of convex bodies by polytopes.
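The two expressions for $w(x)$ in (1.3.5) can be compared numerically in the simplest planar case, $K$ the unit disc and $x = (s, 0)$ with $s > 1$. Then $K^\circ = K$, the support function of $K_x = [x, K]$ is $\max(1, s\cos\theta)$, and $K^\circ \setminus K_x^\circ$ is the cap $\{\xi \in K^\circ : \xi_1 > 1/s\}$. The sketch below is an illustration under these assumptions; the function names and the closed form $w = \bigl(2\sqrt{s^2-1} - 2\arccos(1/s)\bigr)/\pi$ (obtained by integrating $s\cos\theta - 1$ over the cap angles) are not taken from the thesis.

```python
import math


def width_difference_support(s, samples=20000):
    """w(x) = W(K_x) - W(K) via (1.3.1) for the unit disc K and x = (s, 0)."""
    total = 0.0
    for k in range(samples):
        theta = 2.0 * math.pi * (k + 0.5) / samples
        h_K = 1.0                               # support function of the disc
        h_Kx = max(1.0, s * math.cos(theta))    # support function of conv(x, K)
        total += h_Kx - h_K
    return 2.0 * total / samples


def width_difference_cap(s, n_theta=400, n_r=400):
    """w(x) via the cap integral in (1.3.5), midpoint rule in polar coordinates."""
    theta0 = math.acos(1.0 / s)                 # half-angle of the cap
    d_theta = 2.0 * theta0 / n_theta
    integral = 0.0
    for i in range(n_theta):
        theta = -theta0 + (i + 0.5) * d_theta
        r_min = 1.0 / (s * math.cos(theta))     # the line <xi, x> = 1
        d_r = (1.0 - r_min) / n_r
        for j in range(n_r):
            r = r_min + (j + 0.5) * d_r
            # integrand |xi|^{-(n+1)} = r^{-3}, area element r dr dtheta
            integral += r ** (-3) * r * d_r * d_theta
    return 2.0 / (2.0 * math.pi) * integral     # |S^1| = 2 pi


def width_difference_exact(s):
    """Closed form of w(x) for the unit disc, derived for this example."""
    return (2.0 * math.sqrt(s * s - 1.0) - 2.0 * math.acos(1.0 / s)) / math.pi


def in_mean_width_body(s, t):
    """Membership of x = (s, 0), s > 1, in K[t] for the unit disc K."""
    return width_difference_exact(s) <= t


if __name__ == "__main__":
    s = 2.0
    print(width_difference_support(s), width_difference_cap(s), width_difference_exact(s))
```

By rotational symmetry $w(x)$ depends only on $\|x\|$ here, and the closed form is increasing in $s$, so in this example $K[t]$ is again a disc.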

We shall now provide a lemma concerning (1.3.3) for certain classes of functions: the $\alpha$-homogeneous functions. Let $\alpha \in \mathbb{R}$, $\alpha \ne 0$. Let $f : S^{n-1} \to \mathbb{R}$ be a positive function. $f$ is said to be $\alpha$-homogeneous, or homogeneous of degree $\alpha$, if for all $r \ge 0$,

\[ f(ru) = r^{\alpha} f(u). \]

Lemma 1.3.1. Let $K$ and $M$ be convex bodies in $\mathbb{R}^n$ such that 0 is the centroid of $K$ and $K \subset M$. Let $f : S^{n-1} \to \mathbb{R}$ be a positive, integrable function that is homogeneous of degree $\alpha$.

(i) Let $\alpha \ne -n$. Then

\[ W_f(M) - W_f(K) = \frac{2}{\alpha + n}\int_{S^{n-1}} f(u)\left(\frac{1}{h_K^{\alpha+n}(u)} - \frac{1}{h_M^{\alpha+n}(u)}\right) d\sigma(u). \]

(ii) Let $\alpha = -n$. Then

\[ W_f(M) - W_f(K) = 2\int_{S^{n-1}} f(u)\log\left(\frac{h_M(u)}{h_K(u)}\right) d\sigma(u). \]

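Proof. We convert to polar coordinates and use $\alpha$-homogeneity to obtain

\begin{align*}
W_f(M) - W_f(K) &= \frac{2}{|S^{n-1}|}\int_{K^\circ \setminus M^\circ} f(\xi)\, d\xi \\
&= \frac{2}{|S^{n-1}|}\int_{S^{n-1}}\int_{\frac{1}{h_M(u)}}^{\frac{1}{h_K(u)}} f(ru)\, r^{n-1}\, dr\, d\omega(u) \\
&= \frac{2}{|S^{n-1}|}\int_{S^{n-1}}\int_{\frac{1}{h_M(u)}}^{\frac{1}{h_K(u)}} f(u)\, r^{n+\alpha-1}\, dr\, d\omega(u).
\end{align*}

Integration then yields (i) and (ii).

After finding Lemma 1.3.1, we realize that the difference in width (1.3.3) of bodies may be described in terms of relative entropy (1.2.7). In light of this, we can ask for probability distributions with relative entropy that can be described by the width differences of convex bodies.

If we let $f(u) = \frac{1}{h_K^n(u)}$ (or $f(u) = \frac{1}{h_M^n(u)}$) in Lemma 1.3.1 (ii), then $f(ru) = \frac{r^{-n}}{h_K^n(u)} = r^{-n} f(u)$. Thus this $f$ is homogeneous of degree $-n$.

Let now $(X, \mu) = (S^{n-1}, \omega)$ and for convex bodies $K$ and $M$ in $\mathbb{R}^n$ put

\[ p_K = \frac{1}{n|K^\circ| h_K^n}, \qquad p_M = \frac{1}{n|M^\circ| h_M^n}. \tag{1.3.6} \]

Then $dP_K = p_K\, d\omega$ and $dP_M = p_M\, d\omega$ are probability measures on $S^{n-1}$ and Lemma 1.3.1 (ii) becomes

\begin{align*}
W_{\frac{1}{h_K^n}}(M) - W_{\frac{1}{h_K^n}}(K) &= \frac{2}{n}|K^\circ|\int_{S^{n-1}} \frac{1}{|K^\circ| h_K^n}\log\left(\frac{h_M^n}{h_K^n}\right) d\sigma \\
&= \frac{2|K^\circ|}{|S^{n-1}|}\int_{S^{n-1}} p_K\left[\log\left(\frac{p_K}{p_M}\right) + \log\left(\frac{|K^\circ|}{|M^\circ|}\right)\right] d\omega \\
&= \frac{2|K^\circ|}{|S^{n-1}|}\left[D_{KL}(P_K\|P_M) + \log\left(\frac{|K^\circ|}{|M^\circ|}\right)\right].
\end{align*}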
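For reference, the radial integration used in the proof of Lemma 1.3.1 can be written out explicitly. The following display is a worked step added for clarity; it uses only the limits $1/h_M(u) \le r \le 1/h_K(u)$ from the proof and the relation $d\omega = |S^{n-1}|\, d\sigma$ between the surface measure and its normalization.

```latex
\[
\int_{\frac{1}{h_M(u)}}^{\frac{1}{h_K(u)}} r^{\,n+\alpha-1}\, dr =
\begin{cases}
\dfrac{1}{\alpha+n}\left(\dfrac{1}{h_K^{\alpha+n}(u)} - \dfrac{1}{h_M^{\alpha+n}(u)}\right), & \alpha \neq -n, \\[2ex]
\log\dfrac{h_M(u)}{h_K(u)}, & \alpha = -n .
\end{cases}
\]
```

Multiplying by $f(u)$, integrating over $S^{n-1}$ with respect to $\omega$, and using the prefactor $\frac{2}{|S^{n-1}|}$ turns $d\omega$ into $d\sigma$ and gives the two formulas in the lemma.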

For the probability distributions described by (1.3.6) we have the following corollary.

Corollary 1.3.2. Let $K$ and $M$ be convex bodies in $\mathbb{R}^n$ such that $K \subset M$ and let $p_K$ and $p_M$ be the probability densities given in (1.3.6). Then

\[ \int_{K^\circ \setminus M^\circ} \frac{1}{h_K^n(\xi)}\, \frac{d\xi}{|K^\circ|} = D_{KL}(P_K\|P_M) + \log\left(\frac{|K^\circ|}{|M^\circ|}\right). \]

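We now want to apply the above considerations for a specific $M$. Namely, $M = K_x$. We generalize them as follows. Let $f : K^\circ \to \mathbb{R}$ be a positive, integrable function. As above, with $K_x$ instead of $M$, we put

\[ w_f(x) = W_f(K_x) - W_f(K) = \frac{2}{|S^{n-1}|}\int_{K^\circ \setminus K_x^\circ} f(\xi)\, d\xi \tag{1.3.7} \]

and generalize (1.3.4) to

\[ K_f[t] = \{x \in \mathbb{R}^n : w_f(x) \le t\}. \tag{1.3.8} \]

Thus, for instance, for $\beta \in \mathbb{R}$ and $f_\beta(\xi) = \|\xi\|^{-\beta}$ we get

\[ K_{f_\beta}[t] = \left\{x \in \mathbb{R}^n : \frac{2}{|S^{n-1}|}\int_{K^\circ \setminus K_x^\circ} \|\xi\|^{-\beta}\, d\xi \le t\right\}, \tag{1.3.9} \]

which, in the particular case $\beta = n + 1$, gives the bodies (1.3.4) above.

As $K_x = [x, K]$, $K_x^\circ = K^\circ \cap \{y \in \mathbb{R}^n : \langle y, x\rangle \le 1\}$. Thus, putting $H^+\left(\frac{x}{\|x\|^2}, x\right) = \{y \in \mathbb{R}^n : \langle y, x\rangle \le 1\}$, $K_x^\circ$ is obtained from $K^\circ$ by cutting off a cap $K^\circ \cap H^-\left(\frac{x}{\|x\|^2}, x\right)$ of $K^\circ$:

\[ K_x^\circ = K^\circ \cap H^+\left(\frac{x}{\|x\|^2}, x\right) \]

and

\[ K^\circ \setminus K_x^\circ = K^\circ \cap H^-\left(\frac{x}{\|x\|^2}, x\right). \]

Therefore

\[ K_f[t] = \left\{x \in \mathbb{R}^n : \frac{2}{|S^{n-1}|}\int_{K^\circ \cap H^-\left(\frac{x}{\|x\|^2}, x\right)} f(\xi)\, d\xi \le t\right\}. \tag{1.3.10} \]

This is the definition of the mean width bodies from [48].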

Remarks 1: Properties of $K_f[t]$

(i) It is clear that for all $f$ and for all $t \ge 0$, $K \subset K_f[t]$ and that $K_{f_\beta}[0] = K$ for all $\beta$. However, it can happen that $K$ is a proper subset of $K_f[0]$. To see this, let $K = B_\infty^n = \{(x_1, \ldots, x_n) \in \mathbb{R}^n : \max_{1 \le i \le n}|x_i| \le 1\}$. Then $K^\circ = B_1^n = \{(x_1, \ldots, x_n) \in \mathbb{R}^n : \sum_{i=1}^n |x_i| \le 1\}$. Define $f : B_1^n \to \mathbb{R}$, $(x_1, \ldots, x_n) \mapsto f((x_1, \ldots, x_n))$ by

\[ f(x) = \begin{cases} 0, & x_n \ge 0 \\ 1, & \text{otherwise.} \end{cases} \]

Then $(0, \ldots, 0, \frac{3}{2}) \in K_f[0]$ but $(0, \ldots, 0, \frac{3}{2}) \notin K$.

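(ii) $K_f[t]$ need neither be bounded nor convex. Indeed, let $K = B_\infty^2$. Define $f : B_1^2 \to \mathbb{R}$, $(x_1, x_2) \mapsto f((x_1, x_2))$ by

\[ f(x) = \begin{cases} \frac{1}{2}, & x_2 \ge 0 \\ 1, & \text{otherwise.} \end{cases} \]

If $t \ge \frac{1}{\pi}$, $K_f[t] = \mathbb{R}^2$. If $\frac{3}{4\pi} \le t < \frac{1}{\pi}$, $\{(x_1, x_2) \in \mathbb{R}^2 : x_2 \ge 0\} \subset K_f[t]$. If $\frac{1}{2\pi} \le t < \frac{3}{4\pi}$, $\{(0, x_2) \in \mathbb{R}^2 : x_2 \ge 0\} \subset K_f[t]$. Thus $K_f[t]$ is unbounded in those cases. If $t < \frac{1}{2\pi}$, then $K_f[t]$ is bounded.

Moreover, with the same $K$ and $f$: $\{(x_1, x_2) \in \mathbb{R}^2 : x_2 \ge 0\} \subset K_f[\frac{3}{4\pi}]$ and $\left(0, -\frac{1}{1 - \sqrt{3}/2}\right) \in K_f[\frac{3}{4\pi}]$. Let $x_0 = \left(\frac{1}{1 - \sqrt{3}/2}, \frac{-1}{1 - \sqrt{3}/2}\right)$. Then $w_f(x_0) = \frac{\sqrt{3}}{2\pi}\left(1 - \frac{\sqrt{3}}{16}\right) > \frac{3}{4\pi}$. Therefore, $K_f[\frac{3}{4\pi}]$ is not convex.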

(iii) Formulas (1.3.7) and (1.3.10) show that to define $K_f[t]$, we cut off sets of “weighted volume” $t$ of $K^\circ$. This is a similar procedure to that of the floating body defined in (1.2.5). If $M \in \mathcal{K}^n$, we obtain $M_\delta$ from $M$ by cutting off sets of volume $\delta$ from $M$. A reasonable concern is that if the “weighted volume” is just regular volume, that is $f(x) = 1$, then is $K_f[t] = (K^\circ)_\delta$ for some $t$ and $\delta$? For $\beta = 0$, we get in formula (1.3.10),

\begin{align*}
K_{f_0}[t] &= \left\{x \in \mathbb{R}^n : \frac{2}{|S^{n-1}|}\int_{K^\circ \cap H^-\left(\frac{x}{\|x\|^2}, \frac{x}{\|x\|}\right)} d\xi \le t\right\} \\
&= \left\{x \in \mathbb{R}^n : \left|K^\circ \cap H^-\left(\frac{x}{\|x\|^2}, \frac{x}{\|x\|}\right)\right| \le \frac{t\,\omega(S^{n-1})}{2}\right\}.
\end{align*}

However, $K_{f_0}[t]$ is not a convex floating body of $K^\circ$.

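Indeed, it is easy to see that for the Euclidean ball $B = rB_2^n$ in $\mathbb{R}^n$ with radius $r$, $B_{f_0}[t]$, for small $t$, is a Euclidean ball with radius of order

\[ r\left(1 + k_n\, r^{\frac{2n}{n+1}}\, t^{\frac{2}{n+1}}\right), \]

where $k_n = \frac{1}{2}\left(\frac{n(n+1)|B_2^n|}{2|B_2^{n-1}|}\right)^{\frac{2}{n+1}}$. $(B^\circ)_\delta$, for small $\delta$, is a ball with radius of order

\[ \frac{1}{r}\left(1 - c_n\, r^{\frac{2n}{n+1}}\, \delta^{\frac{2}{n+1}}\right), \]

where $c_n = \frac{1}{2}\left(\frac{n+1}{|B_2^{n-1}|}\right)^{\frac{2}{n+1}}$ (see e.g. [83]) and $B_\delta$, for small $\delta$, is a ball with radius of order

\[ r\left(1 - \frac{c_n}{r^{\frac{2n}{n+1}}}\, \delta^{\frac{2}{n+1}}\right) \]

(see also e.g. [83]).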

Also, $K_{f_0}[t]$ resembles the illumination body $K^\delta$ which, for $\delta \ge 0$, is defined as follows [97]:

\[ K^\delta = \{x \in \mathbb{R}^n : |[x, K] \setminus K| \le \delta\}. \]

The resemblance is in the fact that the set $[x, K] \setminus K$ looks similar to the domain of integration in (1.3.10). So if $f(x) = 1$ it is reasonable to be concerned that we have arrived at an illumination body. In fact, $K_f[t]$ is not an illumination body. Again, this can be seen by considering the Euclidean ball $rB_2^n$. $(rB_2^n)^\delta$, for small $\delta$, is a Euclidean ball with radius of order

\[ r\left(1 + \frac{d_n}{r^{\frac{2n}{n+1}}}\, \delta^{\frac{2}{n+1}}\right), \]

where $d_n = \frac{1}{2}\left(\frac{n(n+1)}{|B_2^{n-1}|}\right)^{\frac{2}{n+1}}$ [97].

We have seen that Kf [t] need not be convex. But it is always star-convex.

Lemma 1.3.3. Let $K$ be a convex body in $\mathbb{R}^n$ such that 0 is the centroid of $K$. Let $f : K^\circ \to \mathbb{R}$ be a positive, integrable function.

(i) $K_f[t]$ is star-convex, i.e. $[0, x] \subset K_f[t]$ for all $x \in K_f[t]$.

(ii) $K_f[t] = \bigcap_{s>0} K_f[t+s]$.

Proof. (i) Let $x \in K_f[t]$ and let $y \in [0, x]$. Then $K_y = [y, K] \subset [x, K] = K_x$ and consequently $K^\circ \setminus K_y^\circ \subset K^\circ \setminus K_x^\circ$. As $f \ge 0$ on $K^\circ$, we therefore get
\[
\frac{2}{|S^{n-1}|} \int_{K^\circ \setminus K_y^\circ} f(\xi)\,d\xi \le \frac{2}{|S^{n-1}|} \int_{K^\circ \setminus K_x^\circ} f(\xi)\,d\xi \le t
\]
and thus $y \in K_f[t]$.

(ii) For all $s > 0$, $K_f[t] \subset K_f[t+s]$. Therefore, we only need to show that $\bigcap_{s>0} K_f[t+s] \subset K_f[t]$. Let thus $x \in \bigcap_{s>0} K_f[t+s]$. Then for all $s > 0$, $w_f(x) \le t + s$. Letting $s \to 0$, we get $w_f(x) \le t$.

Additional conditions on f ensure convexity of Kf [t]. This is shown in the next lemma whose proof is the same as the corresponding one in [10].

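Lemma 1.3.4. Let $K$ be a convex body in $\mathbb{R}^n$ such that 0 is the centroid of $K$. Let $f : S^{n-1} \to \mathbb{R}$ be a positive, integrable function that is homogeneous of degree $\alpha$. If $\alpha \le -(n+1)$, then $K_f[t]$ is convex.

Proof. Let $x$ and $y$ be in $K_f[t]$ and let $0 < \lambda < 1$. For $s \in \mathbb{R}$, $s \ge 0$, the function $g(s) = s^\gamma$ is convex if $\gamma \ge 1$. Therefore, and as $K_{(1-\lambda)x+\lambda y} \subseteq (1-\lambda)K_x + \lambda K_y$, we get for $\alpha \le -(n+1)$
\[
\frac{h_{K_{(1-\lambda)x+\lambda y}}^{-(\alpha+n)}}{-(\alpha+n)}
\le \frac{\left((1-\lambda)h_{K_x} + \lambda h_{K_y}\right)^{-(\alpha+n)}}{-(\alpha+n)}
\le \frac{(1-\lambda)h_{K_x}^{-(\alpha+n)} + \lambda h_{K_y}^{-(\alpha+n)}}{-(\alpha+n)}.
\]
Hence for $\alpha \le -(n+1)$,
\begin{align*}
\frac{2}{-(\alpha+n)} \int_{S^{n-1}} f(u)\, h_{K_{(1-\lambda)x+\lambda y}}^{-(\alpha+n)}(u)\,d\sigma(u)
&\le \frac{2}{-(\alpha+n)} \left[ (1-\lambda) \int_{S^{n-1}} f(u)\, h_{K_x}^{-(\alpha+n)}(u)\,d\sigma(u) + \lambda \int_{S^{n-1}} f(u)\, h_{K_y}^{-(\alpha+n)}(u)\,d\sigma(u) \right] \\
&\le (1-\lambda) \left[ \frac{2}{-(\alpha+n)} \int_{S^{n-1}} f(u)\, h_K^{-(\alpha+n)}(u)\,d\sigma(u) + t \right] \\
&\quad + \lambda \left[ \frac{2}{-(\alpha+n)} \int_{S^{n-1}} f(u)\, h_K^{-(\alpha+n)}(u)\,d\sigma(u) + t \right] \\
&= \frac{2}{-(\alpha+n)} \int_{S^{n-1}} f(u)\, h_K^{-(\alpha+n)}(u)\,d\sigma(u) + t.
\end{align*}
Hence $(1-\lambda)x + \lambda y \in K_f[t]$.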

Remark. If $\alpha > -(n+1)$, then $K_f[t]$ need not be convex. An example is the cube in $\mathbb{R}^2$ and the $f$ given in Remark 1 (ii).

Now we give conditions that guarantee that Kf [t] is bounded and give a desirable boundary condition for Kf [t].

Lemma 1.3.5. Let $K$ be a convex body in $\mathbb{R}^n$ such that 0 is the centroid of $K$. Let $f : K^\circ \to \mathbb{R}$ be a strictly positive, integrable function. Then

(i) Kf [0] = K.

(ii) There exists t0 such that for all t ≤ t0, Kf [t] is bounded.

(iii) Let t ≤ t0, where t0 is as in (ii). Then we have for all x ∈ ∂Kf [t] that wf (x) = t.

Proof.

(i) We only have to show that $K_f[0] \subset K$. Let $x \in K_f[0]$. Then
\[
w_f(x) = \frac{2}{|S^{n-1}|} \int_{K^\circ \setminus K_x^\circ} f(\xi)\,d\xi = 0.
\]
As $f > 0$ on $K^\circ$, this can only happen if $m(K^\circ \setminus K_x^\circ) = 0$. As $K_x^\circ \subset K^\circ$ is closed and convex, this can only happen if $K_x^\circ = K^\circ$, or, equivalently, $K_x = K$, or $x \in K$.

(ii) This follows immediately from (i), Lemma 1.3.3 (ii) and the fact that, as $K$ is a convex body, there exists $\alpha > 0$ such that
\[
B_2^n(0, \alpha) \subset K \subset B_2^n\left(0, \frac{1}{\alpha}\right). \tag{1.3.11}
\]
As $K = K_f[0] = \bigcap_{t>0} K_f[t]$, there exists $t_0$ such that for all $t \le t_0$, $K_f[t] \subset 2K \subset B_2^n\left(0, \frac{2}{\alpha}\right)$.

(iii) Let $t \le t_0$ and let $x \in \partial K_f[t]$. Suppose $w_f(x) < t$. Let $y \in \{ax : a \ge 1\}$. Then $K_x = [x, K] \subset K_y = [y, K]$, hence $K_y^\circ \subset K_x^\circ$ and therefore
\[
\int_{K^\circ \setminus K_y^\circ} f(\xi)\,d\xi \ge \int_{K^\circ \setminus K_x^\circ} f(\xi)\,d\xi.
\]
As $f > 0$ on $K^\circ$, we can choose $y = ax$ with $a > 1$ such that $\frac{2}{|S^{n-1}|} \int_{K^\circ \setminus K_y^\circ} f(\xi)\,d\xi = t$. This implies that $x \notin \partial K_f[t]$, a contradiction.

1.4 Relative Entropies of Cone Measures and Affine Surface Areas

In this section we present new geometric interpretations of important affine invariants mentioned in the introduction, namely the $L_p$-affine surface areas. Many such geometric interpretations have been given (see e.g. [69, 84, 85, 99, 100, 101]). The remarkable fact here is that these geometric interpretations of affine invariants for convex bodies are expressed in terms of not necessarily convex bodies, a phenomenon which already occurred in the work of Werner and Ye [101] on mixed $p$-affine surface areas.

We also give new geometric interpretations for the relative entropies of cone measures of convex bodies. Geometric interpretations for those quantities were first given in [74] in terms of $L_p$-centroid bodies defined by (1.2.6). However, in the context of the $L_p$-centroid bodies, the relative entropies appeared only after performing a second-order expansion of certain expressions. Now, using the mean width bodies, already a first-order expansion makes them appear. Thus, these bodies detect “faster” more detail of the boundary of a convex body than the $L_p$-centroid bodies.

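Theorem 1.4.1. Let $K$ be a convex body in $\mathbb{R}^n$ that is in $C^2_+$ and such that 0 is the centroid of $K$. Let $f : K^\circ \to \mathbb{R}$ be a continuous function such that $f(y) \ge c$ for all $y \in K^\circ$ and some constant $c > 0$. Then
\[
\lim_{t \to 0} \frac{|K_f[t]| - |K|}{k_n\, t^{\frac{2}{n+1}}} = \int_{\partial K} \frac{\langle x, N_K(x)\rangle^2}{f(y(x))\, \kappa_K(x)^{\frac{1}{n+1}}}\, d\mu_K(x),
\]
where $k_n = \frac{1}{2}\left(\frac{n(n+1)|B_2^n|}{2|B_2^{n-1}|}\right)^{\frac{2}{n+1}}$ and $y(x) \in \partial K^\circ$ is such that $\langle y(x), x\rangle = 1$.

Remark. We put $N_K(x) = u$. Then $\langle x, N_K(x)\rangle = h_K(u)$ and $y(x) = \frac{u}{h_K(u)}$. As $d\mu_K = f_K\, d\omega$, we therefore also have
\[
\lim_{t \to 0} \frac{|K_f[t]| - |K|}{k_n\, t^{\frac{2}{n+1}}} = \int_{S^{n-1}} \frac{h_K(u)^2\, f_K(u)^{\frac{n+2}{n+1}}}{f\left(\frac{u}{h_K(u)}\right)}\, d\omega(u). \tag{1.4.1}
\]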

Theorem 1.4.1 leads to the announced new geometric interpretations of the above-mentioned quantities, which we introduce now. Recall the $L_p$-affine surface area given above in equation (1.2.3):
\[
as_p(K) = \int_{\partial K} \frac{\kappa_K(x)^{\frac{p}{n+p}}}{\langle x, N_K(x)\rangle^{\frac{n(p-1)}{n+p}}}\, d\mu_K(x)
\]
for real $p \ne -n$. Then we have

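Corollary 1.4.2. Let $K$ be a convex body in $\mathbb{R}^n$ that is in $C^2_+$ and such that 0 is the centroid of $K$.

(i) For $p \in \mathbb{R}$, $p \ne -n$, let $p_{as} : \partial K^\circ \to \mathbb{R}$ be defined by
\[
p_{as}(y) = \left(\frac{\langle x, N_K(x)\rangle}{\kappa_K(x)^{\frac{1}{n+1}}}\right)^{\frac{n+p(n+2)}{n+p}},
\]
where, for $y \in \partial K^\circ$, $x = x(y) \in \partial K$ is such that $\langle x, y\rangle = 1$. Then
\[
\lim_{t \to 0} \frac{|K_{p_{as}}[t]| - |K|}{k_n\, t^{\frac{2}{n+1}}} = \int_{\partial K} \frac{\kappa_K(x)^{\frac{p}{n+p}}}{\langle x, N_K(x)\rangle^{\frac{n(p-1)}{n+p}}}\, d\mu_K(x) = as_p(K).
\]

(ii) For $\beta \in \mathbb{R}$, let $f_\beta : K^\circ \to \mathbb{R}$ be defined by
\[
f_\beta(y) = \frac{1}{\|y\|^\beta} = \langle x, N_K(x)\rangle^\beta,
\]
where, again, for $y \in \partial K^\circ$, $x = x(y) \in \partial K$ is such that $\langle x, y\rangle = 1$. Then
\[
\lim_{t \to 0} \frac{|K_{f_\beta}[t]| - |K|}{k_n\, t^{\frac{2}{n+1}}} = \int_{\partial K} \frac{d\mu_K(x)}{\kappa_K(x)^{\frac{1}{n+1}}\, \langle x, N_K(x)\rangle^{\beta-2}}.
\]

Proof. As $\partial K$ is in $C^2_+$, the functions $p_{as}$ and $f_\beta$ satisfy the conditions of Theorem 1.4.1. The corollary then follows immediately from Theorem 1.4.1.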

Remarks.

(i) For $\beta = 0$, we get in Corollary 1.4.2 (ii) the $L_p$-affine surface area of $K$ for $p = -\frac{n}{n+2}$.

(ii) As $\kappa_K(rx) = r^{-(n-1)}\kappa_K(x)$, it makes most sense to put $f_K(ru) = f_{rK}(u) = r^{n-1} f_K(u)$ and define $n-1$ to be the degree of homogeneity of the function $f_K$. Then $p_{as}$ is homogeneous of degree $\frac{2n(n+p(n+2))}{(n+1)(n+p)}$ and $f_\beta$ is homogeneous of degree $\beta$. Thus, by Lemma 1.3.4, $K_{p_{as}}[t]$ is convex if $-n < p \le -n\,\frac{(n+1)^2+1}{(n+1)^2+n+2}$ and $K_{f_\beta}[t]$ is convex if $\beta \le -(n+1)$.

Let $K$ be a convex body in $\mathbb{R}^n$ that is $C^2_+$. Let
\[
p_K(x) = \frac{\kappa_K(x)}{\langle x, N_K(x)\rangle^n\, n|K^\circ|}, \qquad q_K(x) = \frac{\langle x, N_K(x)\rangle}{n|K|}. \tag{1.4.2}
\]
Then
\[
P_K = p_K\, \mu_K \quad \text{and} \quad Q_K = q_K\, \mu_K \tag{1.4.3}
\]
are probability measures on $\partial K$ that are absolutely continuous with respect to $\mu_K$.

Recall now that the normalized cone measure $cm_K$ on $\partial K$ is defined as follows: for every measurable set $A \subseteq \partial K$,
\[
cm_K(A) = \frac{1}{|K|}\, \left|\{ta : a \in A,\ t \in [0, 1]\}\right|. \tag{1.4.4}
\]

The next proposition is well known. See e.g. [74] for a proof. It shows that the measures $P_K$ and $Q_K$ defined in (1.4.3) are the cone measures of $K^\circ$ and $K$.

Proposition 1.4.3. Let $K$ be a convex body in $\mathbb{R}^n$ that is $C^2_+$. Let $P_K$ and $Q_K$ be the probability measures on $\partial K$ defined by (1.4.3). Then
\[
P_K = N_K^{-1} N_{K^\circ}\, cm_{K^\circ} \quad \text{and} \quad Q_K = cm_K,
\]
or, equivalently, for every measurable subset $A$ in $\partial K$,
\[
P_K(A) = cm_{K^\circ}\left(N_{K^\circ}^{-1}\left(N_K(A)\right)\right) \quad \text{and} \quad Q_K(A) = cm_K(A).
\]

In the next two corollaries we also use the following notation. For a convex body $K$ in $\mathbb{R}^n$ and $x \in \partial K$, let $r_i(x)$, $1 \le i \le n-1$, be the principal radii of curvature. We put
\[
r = \inf_{x \in \partial K}\ \min_{1 \le i \le n-1} r_i(x) \quad \text{and} \quad R = \sup_{x \in \partial K}\ \max_{1 \le i \le n-1} r_i(x). \tag{1.4.5}
\]
Note that if $K$ is a convex body in $\mathbb{R}^n$ that is in $C^2_+$, then $0 < r \le R < \infty$. Note also that $r = R$ iff $K$ is a Euclidean ball with radius $r$.

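Corollary 1.4.4. Let $K$ be a convex body in $\mathbb{R}^n$ that is in $C^2_+$ and such that 0 is the centroid of $K$. Let $r$, $R$ be as in (1.4.5).

(i) Let $ent_1 : \partial K^\circ \to \mathbb{R}$ be defined by
\[
ent_1(y) = \frac{\kappa_K(x)^{-\frac{n+2}{n+1}}\, \langle x, N_K(x)\rangle^{n+1}}{\log\left(\frac{R^{2n}\, |K|\, \kappa_K(x)}{r^{2n}\, |K^\circ|\, \langle x, N_K(x)\rangle^{n+1}}\right)},
\]
where, again, for $y \in \partial K^\circ$, $x = x(y) \in \partial K$ is such that $\langle x, y\rangle = 1$. Then
\begin{align*}
\lim_{t \to 0} \frac{|K_{ent_1}[t]| - |K|}{k_n\, t^{\frac{2}{n+1}}}
&= \int_{\partial K} \frac{\kappa_K(x)}{\langle x, N_K(x)\rangle^n} \log\left(\frac{R^{2n}\, |K|\, \kappa_K(x)}{r^{2n}\, |K^\circ|\, \langle x, N_K(x)\rangle^{n+1}}\right) d\mu_K(x) \\
&= n|K^\circ| \left[ D_{KL}(P_K \| Q_K) + 2n \log\frac{R}{r} \right] \\
&= n|K^\circ| \left[ D_{KL}\left(N_K^{-1} N_{K^\circ}\, cm_{K^\circ} \,\|\, cm_K\right) + 2n \log\frac{R}{r} \right].
\end{align*}

(ii) Let $ent_2 : \partial K^\circ \to \mathbb{R}$ be defined by
\[
ent_2(y) = \frac{\kappa_K(x)^{-\frac{1}{n+1}}}{\log\left(\frac{R^{2n}\, |K|\, \kappa_K(x)}{r^{2n}\, |K^\circ|\, \langle x, N_K(x)\rangle^{n+1}}\right)},
\]
where, again, for $y \in \partial K^\circ$, $x = x(y) \in \partial K$ is such that $\langle x, y\rangle = 1$. Then
\begin{align*}
\lim_{t \to 0} \frac{|K_{ent_2}[t]| - |K|}{k_n\, t^{\frac{2}{n+1}}}
&= -\int_{\partial K} \langle x, N_K(x)\rangle \log\left(\frac{r^{2n}\, |K^\circ|\, \langle x, N_K(x)\rangle^{n+1}}{R^{2n}\, |K|\, \kappa_K(x)}\right) d\mu_K(x) \\
&= -n|K| \left[ D_{KL}(Q_K \| P_K) - 2n \log\frac{R}{r} \right] \\
&= -n|K| \left[ D_{KL}\left(cm_K \,\|\, N_K^{-1} N_{K^\circ}\, cm_{K^\circ}\right) - 2n \log\frac{R}{r} \right].
\end{align*}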

Proof. As $\partial K$ is in $C^2_+$, $0 < r \le R < \infty$ and we have for all $x \in \partial K$ that
\[
B_2^n(x - rN_K(x), r) \subset K \subset B_2^n(x - RN_K(x), R).
\]
Suppose first that $r = R$. Then $K$ is a Euclidean ball with radius $r$ and the right-hand sides of the identities in the corollary are equal to 0. Moreover, in this case, $ent_1$ and $ent_2$ are identically equal to $\infty$. Therefore, for all $t \ge 0$, $K_{ent_1}[t] = K$ and $K_{ent_2}[t] = K$, and hence for all $t \ge 0$, $|K_{ent_1}[t]| - |K| = 0$ and $|K_{ent_2}[t]| - |K| = 0$. Therefore, the corollary holds trivially in this case.

Suppose now that $r < R$. Then, as
\[
1 \le \frac{R^{2n}\, |K|\, \kappa_K(x)}{r^{2n}\, |K^\circ|\, \langle x, N_K(x)\rangle^{n+1}} \le \left(\frac{R}{r}\right)^{4n},
\]
we get for all $x \in \partial K$ that

\[
f_{PQ}(x) \ge \left(\frac{|K^\circ|\, r^{n-1}}{2 \log\frac{R}{r}}\right)^{\frac{n-1}{2}} > 0.
\]

Thus the functions ent1 and ent2 satisfy the conditions of Theorem 1.4.1. The proof of the corollary then follows immediately from Theorem 1.4.1.

In [74], the following new affine invariant $\Omega_K$ was introduced and its relation to the relative entropies was established. Let $K$ be a convex body in $\mathbb{R}^n$ with centroid at the origin. Then
\[
\Omega_K = \lim_{p \to \infty} \left(\frac{as_p(K)}{n|K^\circ|}\right)^{n+p}.
\]
Let $p_K$ and $q_K$ be the densities defined in (1.4.2). It was proved in [74] that for a convex body $K$ in $\mathbb{R}^n$ that is $C^2_+$,
\[
D_{KL}(P_K \| Q_K) = \log\left(\frac{|K|}{|K^\circ|}\, \Omega_K^{-\frac{1}{n}}\right) \tag{1.4.6}
\]
and
\[
D_{KL}(Q_K \| P_K) = \log\left(\frac{|K^\circ|}{|K|}\, \Omega_{K^\circ}^{-\frac{1}{n}}\right). \tag{1.4.7}
\]

In [74], geometric interpretations in terms of $L_p$-centroid bodies were given in the case of symmetric convex bodies for the new affine invariants $\Omega_K$. These interpretations are in the spirit of Corollary 1.4.2: as $p \to \infty$, the quantities $\Omega_K$ and the related relative entropies appear in appropriately chosen volume differences of $K$ and its $L_p$-centroid bodies (1.2.6). However, in the context of the $L_p$-centroid bodies, a second-order expansion was needed for the volume differences in order to make these terms appear. Now, it follows from Corollary 1.4.4 (i) and (ii) and Corollary 1.4.5 that no symmetry assumptions are needed and that already a first-order expansion gives such geometric interpretations, if one uses the mean width bodies instead of the $L_p$-centroid bodies.

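Corollary 1.4.5. Let $K$ be a convex body in $\mathbb{R}^n$ that is in $C^2_+$ and such that 0 is the centroid of $K$. Let the functions $ent_1$ and $ent_2$ be as in Corollary 1.4.4. Then
\[
\lim_{t \to 0} \frac{|K_{ent_1}[t]| - |K|}{k_n\, t^{\frac{2}{n+1}}} - 2n^2\, |K^\circ| \log\frac{R}{r} = n|K^\circ| \log\left(\frac{|K|}{|K^\circ|}\, \Omega_K^{-\frac{1}{n}}\right)
\]
and
\[
\lim_{t \to 0} \frac{|K_{ent_2}[t]| - |K|}{k_n\, t^{\frac{2}{n+1}}} - 2n^2\, |K| \log\frac{R}{r} = n|K| \log\left(\frac{|K|}{|K^\circ|}\, \Omega_{K^\circ}^{\frac{1}{n}}\right).
\]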

1.5 Proof of Theorem 1.4.1

To prove Theorem 1.4.1, we need the following lemmas. The first one, Lemma 1.5.1, is well known.

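Lemma 1.5.1. Let $E_n(x_0, a)$ be an ellipsoid in $\mathbb{R}^n$ centered at $x_0$ and with axes parallel to the coordinate axes and of lengths $a_1, \ldots, a_n$. Let $0 < \Delta < a_n$. Let
\[
C(E_n, \Delta) = E_n \cap H(x_0 + (a_n - \Delta)e_n, e_n)
\]
be a cap of $E_n(x_0, a)$ of height $\Delta$. Then
\[
\frac{2^{\frac{n+1}{2}}\, |B_2^{n-1}|}{n+1} \left(1 - \frac{\Delta}{2a_n}\right)^{\frac{n-1}{2}} \left(\prod_{i=1}^{n-1} \frac{a_i}{\sqrt{a_n}}\right) \Delta^{\frac{n+1}{2}}
\le |C(E_n, \Delta)|
\le \frac{2^{\frac{n+1}{2}}\, |B_2^{n-1}|}{n+1} \left(\prod_{i=1}^{n-1} \frac{a_i}{\sqrt{a_n}}\right) \Delta^{\frac{n+1}{2}}.
\]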
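The two-sided cap estimate of Lemma 1.5.1 can be compared with an exact cap volume in a low-dimensional case. The sketch below assumes $n = 3$ and reads the lengths $a_1, a_2, a_3$ as semi-axes (an assumption, consistent with the later use of "half axes"); the function names are illustrative only. It uses the classical formula $\pi h^2 (3 - h)/3$ for a cap of height $h$ of the unit ball in $\mathbb{R}^3$, transported to the ellipsoid by the linear map with diagonal $(a_1, a_2, a_3)$.

```python
import math

def cap_volume_3d(a1, a2, a3, delta):
    """Exact volume of the cap of height delta cut from the ellipsoid with
    semi-axes (a1, a2, a3) by a plane orthogonal to the third axis."""
    h = delta / a3                       # corresponding cap height on the unit ball
    return a1 * a2 * a3 * math.pi * h * h * (3.0 - h) / 3.0

def lemma_bounds_3d(a1, a2, a3, delta):
    """Lower and upper bounds of Lemma 1.5.1 for n = 3 (here |B_2^2| = pi)."""
    n = 3
    upper = (2 ** ((n + 1) / 2) * math.pi / (n + 1)
             * (a1 / math.sqrt(a3)) * (a2 / math.sqrt(a3))
             * delta ** ((n + 1) / 2))
    lower = (1.0 - delta / (2.0 * a3)) ** ((n - 1) / 2) * upper
    return lower, upper
```

For instance, `lemma_bounds_3d(1.0, 2.0, 3.0, 0.5)` should bracket `cap_volume_3d(1.0, 2.0, 3.0, 0.5)`; in this case the exact value differs from the upper bound by the factor $1 - \Delta/(3a_3)$.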

In the next few lemmas and throughout the remainder of this section we will use the following notation.

Let $K$ be a convex body in $\mathbb{R}^n$. Let $f : K^\circ \to \mathbb{R}$ be an integrable function and, for $t \ge 0$, let $K_f[t]$ be a mean width body of $K$. For $x \in \partial K$, let
\[
x_t = \{\gamma x : \gamma \ge 0\} \cap \partial K_f[t]. \tag{1.5.1}
\]
Let $y(x) \in \partial K^\circ$ be such that $\langle y(x), x\rangle = 1$. Let $m$ be the Lebesgue measure on $\mathbb{R}^n$ and let $m_f$ be the measure (on $K^\circ$) defined by $m_f = \frac{2f}{|S^{n-1}|}\, m$, i.e. for all $A \subset K^\circ$
\[
m_f(A) = \frac{2}{|S^{n-1}|} \int_A f(\xi)\,d\xi. \tag{1.5.2}
\]

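Lemma 1.5.2. Let $K$ be a convex body in $\mathbb{R}^n$ that is in $C^2_+$ and such that 0 is the centroid of $K$. Let $f : K^\circ \to \mathbb{R}$ be an integrable function such that $f(y) \ge c$ for all $y \in K^\circ$ and some constant $c > 0$. Let $x_t$ be as in (1.5.1). Then the functions
\[
\frac{1}{t^{\frac{2}{n+1}}} \left(\frac{\|x_t\|}{\|x\|} - 1\right)
\]
are uniformly (in $t$) bounded by an integrable function.

Proof. We can assume that $t \le t_0$ where $t_0$ is given by Lemma 1.3.5. Then $K_f[t]$ is bounded and hence
\[
K_f[t] \subset B_2^n(0, a) \tag{1.5.3}
\]
for some $a > 0$. As $f \ge c$ on $K^\circ$, we get with (1.3.10)
\[
t \ge \frac{2}{|S^{n-1}|} \int_{K^\circ \cap H^-\left(\frac{x_t}{\|x_t\|^2}, \frac{x}{\|x\|}\right)} f(\xi)\,d\xi
\ge \frac{2c}{|S^{n-1}|} \left| K^\circ \cap H^-\left(\frac{x_t}{\|x_t\|^2}, \frac{x}{\|x\|}\right) \right|.
\]
As $K$ is in $C^2_+$, $K^\circ$ is in $C^2_+$. Thus, by the Blaschke rolling theorem (see [81]), there exists $r_0 > 0$ such that for all $y \in \partial K^\circ$, $B_2^n(y - r_0 N_{K^\circ}(y), r_0) \subset K^\circ$. Let now $y(x) \in \partial K^\circ$ be such that $\langle x, y(x)\rangle = 1$. Then $N_{K^\circ}(y(x)) = \frac{x}{\|x\|}$ and thus
\begin{align*}
t &\ge \frac{2c}{|S^{n-1}|} \left| B_2^n\left(y(x) - r_0 \frac{x}{\|x\|}, r_0\right) \cap H^-\left(\frac{x_t}{\|x_t\|^2}, \frac{x}{\|x\|}\right) \right| \\
&\ge \frac{2^{\frac{n+3}{2}}\, c\, r_0^{\frac{n-1}{2}}\, |B_2^{n-1}|}{(n+1)\, |S^{n-1}|} \left(\frac{1}{\|x\|} - \frac{1}{\|x_t\|}\right)^{\frac{n+1}{2}},
\end{align*}
where we have used that $\left| B_2^n\left(y(x) - r_0 \frac{x}{\|x\|}, r_0\right) \cap H^-\left(\frac{x_t}{\|x_t\|^2}, \frac{x}{\|x\|}\right) \right|$ is the volume of a cap of height $\frac{1}{\|x\|} - \frac{1}{\|x_t\|} = \frac{\|x_t - x\|}{\|x_t\|\,\|x\|}$ of the ball $B_2^n\left(y(x) - r_0 \frac{x}{\|x\|}, r_0\right)$, which we have estimated from below using Lemma 1.5.1. We assume also that $t$ is so small that $\frac{1}{\|x\|} - \frac{1}{\|x_t\|} < r_0$.

As $x$ and $x_t$ are collinear, $\frac{\|x_t\|}{\|x\|} - 1 = \frac{\|x_t - x\|}{\|x\|}$ and hence
\begin{align*}
\frac{1}{t^{\frac{2}{n+1}}} \left(\frac{\|x_t\|}{\|x\|} - 1\right) = \frac{1}{t^{\frac{2}{n+1}}}\, \frac{\|x_t - x\|}{\|x\|}
&\le \left(\frac{(n+1)\, |S^{n-1}|}{c\, |B_2^{n-1}|}\right)^{\frac{2}{n+1}} \frac{r_0^{-\frac{n-1}{n+1}}}{2^{\frac{n+3}{n+1}}}\, \|x_t\| \\
&\le \left(\frac{(n+1)\, |S^{n-1}|}{c\, |B_2^{n-1}|}\right)^{\frac{2}{n+1}} \frac{r_0^{-\frac{n-1}{n+1}}}{2^{\frac{n+3}{n+1}}}\, a. \tag{1.5.4}
\end{align*}
In the last inequality we have used (1.5.3). The expression (1.5.4) is a constant and thus integrable.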

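Lemma 1.5.3. Let $K$ be a convex body in $\mathbb{R}^n$ that is in $C^2_+$ and such that 0 is the centroid of $K$. Let $f : K^\circ \to \mathbb{R}$ be a continuous, positive function. Then for all $x \in \partial K$ one has
\[
\lim_{t \to 0} \frac{\langle x, N_K(x)\rangle}{n\, k_n\, t^{\frac{2}{n+1}}} \left[\left(\frac{\|x_t\|}{\|x\|}\right)^n - 1\right] = \frac{\langle x, N_K(x)\rangle^2}{\kappa_K(x)^{\frac{1}{n+1}}\, f(y(x))^{\frac{2}{n+1}}},
\]
where $k_n = \frac{1}{2}\left(\frac{n(n+1)|B_2^n|}{2|B_2^{n-1}|}\right)^{\frac{2}{n+1}}$ and $y(x) \in \partial K^\circ$ is such that $\langle x, y(x)\rangle = 1$.

Proof. Let $x \in \partial K$. Let $x_t$ be as in (1.5.1). As $x$ and $x_t$ are collinear and as $(1+s)^n \ge 1 + ns$ for $s \in [0, 1)$, one has for small enough $t$,
\[
\frac{\langle x, N_K(x)\rangle}{n} \left[\left(\frac{\|x_t\|}{\|x\|}\right)^n - 1\right]
= \frac{\langle x, N_K(x)\rangle}{n} \left[\left(1 + \frac{\|x_t - x\|}{\|x\|}\right)^n - 1\right] \ge \Delta(x, t),
\]
where $\Delta(x, t) = \left\langle \frac{x}{\|x\|}, N_K(x) \right\rangle \|x_t - x\| = \langle x_t - x, N_K(x)\rangle$.

Similarly, as $(1+s)^n \le 1 + ns + 2^n s^2$ for $s \in [0, 1)$, one has for $t$ small enough,
\[
\frac{\langle x, N_K(x)\rangle}{n} \left[\left(\frac{\|x_t\|}{\|x\|}\right)^n - 1\right] \le \Delta(x, t) \left[1 + \frac{2^n}{n}\, \frac{\|x_t - x\|}{\|x\|}\right]. \tag{1.5.5}
\]
Hence for $\varepsilon > 0$ there exists $t_\varepsilon \le t_0$, $t_0$ from Lemma 1.3.5, such that for all $0 < t \le t_\varepsilon$
\[
1 \le \frac{\frac{\langle x, N_K(x)\rangle}{n} \left[\left(\frac{\|x_t\|}{\|x\|}\right)^n - 1\right]}{\Delta(x, t)} \le 1 + \varepsilon.
\]
By Lemma 1.3.5 (iii) and (1.5.2), $m_f(K^\circ \setminus K_{x_t}^\circ) = t$ and thus
\[
1 \le \frac{\frac{\langle x, N_K(x)\rangle}{n} \left[\left(\frac{\|x_t\|}{\|x\|}\right)^n - 1\right]}{\Delta(x, t)} \cdot \frac{m_f\left(K^\circ \setminus K_{x_t}^\circ\right)^{\frac{2}{n+1}}}{t^{\frac{2}{n+1}}} \le 1 + \varepsilon.
\]
Let now $y = y(x) \in \partial K^\circ$ be such that $\langle x, y\rangle = 1$. Thus $y = \frac{N_K(x)}{\langle x, N_K(x)\rangle}$ and $N_{K^\circ}(y) = \frac{x}{\|x\|}$. As $f$ is continuous on $K^\circ$, there exists $\delta > 0$ such that for all $z \in B_2^n(y, \delta)$, $f(y) - \varepsilon < f(z) < f(y) + \varepsilon$.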

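We choose $t$ so small that $K^\circ \setminus K_{x_t}^\circ \subset B_2^n(y, \delta)$. Then
\[
\frac{2\,(f(y(x)) - \varepsilon)}{|S^{n-1}|} \left|K^\circ \setminus K_{x_t}^\circ\right|
\le m_f\left(K^\circ \setminus K_{x_t}^\circ\right) = \frac{2}{|S^{n-1}|} \int_{K^\circ \setminus K_{x_t}^\circ} f(\xi)\,d\xi
\le \frac{2\,(f(y(x)) + \varepsilon)}{|S^{n-1}|} \left|K^\circ \setminus K_{x_t}^\circ\right|
\]
and we get with (new) absolute constants $c_1$ and $c_2$ that
\[
1 - c_1\varepsilon \le \frac{\frac{\langle x, N_K(x)\rangle}{n} \left[\left(\frac{\|x_t\|}{\|x\|}\right)^n - 1\right]}{\Delta(x, t)} \cdot \frac{\left(\frac{2 f(y(x))}{|S^{n-1}|} \left|K^\circ \setminus K_{x_t}^\circ\right|\right)^{\frac{2}{n+1}}}{t^{\frac{2}{n+1}}} \le 1 + c_2\varepsilon. \tag{1.5.6}
\]

As $K$ and hence $K^\circ$ is in $C^2_+$, $\kappa_{K^\circ}(y) > 0$. It is well known (see [84]) that then there exists an ellipsoid $E = E(y - a_n N_{K^\circ}(y), a)$ centered at $y - a_n N_{K^\circ}(y)$ and with half axes of lengths $a_1, \ldots, a_n$ which approximates $\partial K^\circ$ in a neighborhood of $y$. For the computations that follow, we can assume without loss of generality that $N_{K^\circ}(y) = e_n$ and that the other axes of $E$ coincide with $e_1, \ldots, e_{n-1}$. Thus (see [84]), for $\varepsilon > 0$ given, there exists $\Delta_\varepsilon$ such that for all $\Delta \le \Delta_\varepsilon$
\[
E\left(y - (1-\varepsilon)a_n N_{K^\circ}(y), (1-\varepsilon)a\right) \cap H_\Delta^-
\subseteq K^\circ \cap H_\Delta^-
\subseteq E\left(y - (1+\varepsilon)a_n N_{K^\circ}(y), (1+\varepsilon)a\right) \cap H_\Delta^-, \tag{1.5.7}
\]
where $H_\Delta = H(y - \Delta e_n, e_n)$. Also (see [84]),
\[
\kappa_{K^\circ}(y) = \prod_{i=1}^{n-1} \frac{a_n}{a_i^2}. \tag{1.5.8}
\]

As $x_t \to x$ as $t \to 0$, we can choose $t$ so small that $K^\circ \setminus K_{x_t}^\circ = K^\circ \cap H^-\left(\frac{x_t}{\|x_t\|^2}, \frac{x}{\|x\|}\right)$ is contained in $H^-(y - \Delta e_n, e_n)$. Hence, by (1.5.7),
\[
\left| E\left(y - (1-\varepsilon)a_n N_{K^\circ}(y), (1-\varepsilon)a\right) \cap H^-\left(\frac{x_t}{\|x_t\|^2}, \frac{x}{\|x\|}\right) \right|
\le \left|K^\circ \setminus K_{x_t}^\circ\right|
\le \left| E\left(y - (1+\varepsilon)a_n N_{K^\circ}(y), (1+\varepsilon)a\right) \cap H^-\left(\frac{x_t}{\|x_t\|^2}, \frac{x}{\|x\|}\right) \right|.
\]
By Lemma 1.5.1, with (1.5.8), and as $\frac{1}{\|x\|} - \frac{1}{\|x_t\|} = \frac{\Delta(x, t)}{\|x_t\|\, \langle x, N_K(x)\rangle}$, we get with new absolute constants $c_1$ and $c_2$
\begin{align*}
(1 - c_1\varepsilon)\, \frac{2^{\frac{n+1}{2}}\, |B_2^{n-1}|}{(n+1)\, \kappa_{K^\circ}(y)^{\frac{1}{2}}} \left(\frac{\Delta(x, t)}{\|x_t\|\, \langle x, N_K(x)\rangle}\right)^{\frac{n+1}{2}}
&\le \left|K^\circ \setminus K_{x_t}^\circ\right| \\
&\le (1 + c_2\varepsilon)\, \frac{2^{\frac{n+1}{2}}\, |B_2^{n-1}|}{(n+1)\, \kappa_{K^\circ}(y)^{\frac{1}{2}}} \left(\frac{1}{\|x\|} - \frac{1}{\|x_t\|}\right)^{\frac{n+1}{2}} \\
&= (1 + c_2\varepsilon)\, \frac{2^{\frac{n+1}{2}}\, |B_2^{n-1}|}{(n+1)\, \kappa_{K^\circ}(y)^{\frac{1}{2}}} \left(\frac{\Delta(x, t)}{\|x_t\|\, \langle x, N_K(x)\rangle}\right)^{\frac{n+1}{2}}.
\end{align*}
Hence, again with new absolute constants $c_1$ and $c_2$, (1.5.6) becomes
\[
1 - c_1\varepsilon \le \frac{\langle x, N_K(x)\rangle}{n\, t^{\frac{2}{n+1}}} \left[\left(\frac{\|x_t\|}{\|x\|}\right)^n - 1\right] \cdot \frac{2 \left(\frac{2 f(y)\, |B_2^{n-1}|}{(n+1)\, |S^{n-1}|}\right)^{\frac{2}{n+1}}}{\kappa_{K^\circ}(y)^{\frac{1}{n+1}}\, \|x_t\|\, \langle x, N_K(x)\rangle} \le 1 + c_2\varepsilon.
\]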

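Therefore, as $\|x_t\| \to \|x\|$ as $t \to 0$,
\[
\lim_{t \to 0} \frac{\langle x, N_K(x)\rangle}{n\, t^{\frac{2}{n+1}}} \left[\left(\frac{\|x_t\|}{\|x\|}\right)^n - 1\right]
= \frac{1}{2} \left(\frac{n(n+1)|B_2^n|}{2|B_2^{n-1}|}\right)^{\frac{2}{n+1}} \frac{\kappa_{K^\circ}(y)^{\frac{1}{n+1}}\, \|x\|\, \langle x, N_K(x)\rangle}{f(y)^{\frac{2}{n+1}}}.
\]
Now we use that $\|x\| = \frac{1}{\langle y, N_{K^\circ}(y)\rangle}$ and that (see e.g. [101])
\[
\frac{\kappa_{K^\circ}(y)^{\frac{1}{n+1}}}{\langle y, N_{K^\circ}(y)\rangle} = \frac{\langle x, N_K(x)\rangle}{\kappa_K(x)^{\frac{1}{n+1}}}.
\]
We put $k_n = \frac{1}{2}\left(\frac{n(n+1)|B_2^n|}{2|B_2^{n-1}|}\right)^{\frac{2}{n+1}}$ and get that
\[
\lim_{t \to 0} \frac{\langle x, N_K(x)\rangle}{n\, t^{\frac{2}{n+1}}} \left[\left(\frac{\|x_t\|}{\|x\|}\right)^n - 1\right] = k_n\, \frac{\langle x, N_K(x)\rangle^2}{\kappa_K(x)^{\frac{1}{n+1}}\, f(y)^{\frac{2}{n+1}}}.
\]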

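Proof of Theorem 1.4.1

It is well known (see e.g. [101]) that for a convex body $K$ and a star-convex body $L$ with $0 \in \mathrm{int}(K)$ and $K \subset L$,
\[
|L| - |K| = \frac{1}{n} \int_{\partial K} \langle x, N_K(x)\rangle \left[\left(\frac{\|x'\|}{\|x\|}\right)^n - 1\right] d\mu_K(x),
\]
where $x \in \partial K$, $x' \in \partial L$ and $x = \partial K \cap [0, x']$.

Therefore,
\[
|K_f[t]| - |K| = \frac{1}{n} \int_{\partial K} \langle x, N_K(x)\rangle \left[\left(\frac{\|x_t\|}{\|x\|}\right)^n - 1\right] d\mu_K(x).
\]
We now use Lemma 1.5.2 and Lebesgue's theorem to interchange integration and limit, and then Lemma 1.5.3, and get
\begin{align*}
\lim_{t \to 0} \frac{|K_f[t]| - |K|}{t^{\frac{2}{n+1}}}
&= \lim_{t \to 0} \frac{1}{n}\, \frac{1}{t^{\frac{2}{n+1}}} \int_{\partial K} \langle x, N_K(x)\rangle \left[\left(\frac{\|x_t\|}{\|x\|}\right)^n - 1\right] d\mu_K(x) \\
&= \int_{\partial K} \lim_{t \to 0} \frac{\langle x, N_K(x)\rangle}{n\, t^{\frac{2}{n+1}}} \left[\left(\frac{\|x_t\|}{\|x\|}\right)^n - 1\right] d\mu_K(x) \\
&= k_n \int_{\partial K} \frac{\langle x, N_K(x)\rangle^2}{\kappa_K(x)^{\frac{1}{n+1}}\, f(y)^{\frac{2}{n+1}}}\, d\mu_K(x).
\end{align*}
This finishes the proof of Theorem 1.4.1.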
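The volume-difference formula used at the start of this proof can be illustrated in the plane. The following sketch assumes $n = 2$, takes $K$ to be the unit disk (so that $\langle x, N_K(x)\rangle = 1$, $\|x\| = 1$ and $d\mu_K = d\theta$) and $L$ to be an ellipse with semi-axes $a, b \ge 1$; these choices and the function name are illustrative and not taken from the text. The integral is evaluated with the midpoint rule and should reproduce $|L| - |K| = \pi(ab - 1)$.

```python
import math

def volume_difference(a, b, steps=20000):
    """(1/n) * integral over the unit circle of (||x'||^n - 1) for n = 2,
    where x' is the boundary point of the ellipse x^2/a^2 + y^2/b^2 = 1
    lying on the ray through the unit vector x."""
    total = 0.0
    dtheta = 2.0 * math.pi / steps
    for i in range(steps):
        theta = (i + 0.5) * dtheta
        # radial function of the ellipse in direction theta
        rho = a * b / math.hypot(b * math.cos(theta), a * math.sin(theta))
        total += (rho * rho - 1.0) * dtheta
    return total / 2.0
```

For example, `volume_difference(2.0, 1.5)` can be compared with `math.pi * (2.0 * 1.5 - 1.0)`.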

CHAPTER 2

GEOMETRY OF QUANTUM STATES

Quantum mechanics has a rich history with some of the most famous names in science associated with it. This work deals with aspects of the mathematical framework of quantum mechanics and does not delve into the physical interpretation of such aspects. The mathematical framework for quantum mechanics was first described by Heisenberg and called matrix mechanics. Later a different approach, wave mechanics, was found by Erwin Schrödinger using partial differential equations. These approaches are in fact equivalent, and the choice between them is often made out of convenience for the physical interpretation. Paul Dirac, on the other hand, adopted a symbolic approach to describing quantum mechanics in his highly influential book [19]. The generally accepted mathematical framework for quantum mechanics concerns observables and states [80]. These objects are in some sense dual.

For our purposes, an observable is a Hermitian operator on a Hilbert space and the eigenvalues (or spectrum) of an observable represent certain physical quantities. A measurement of this observable yields a combination of eigenvalues. That is, if $A$ is an observable and $|\lambda_i\rangle \in \mathcal{H}$ are eigenvectors of $A$ (here we are using the Dirac notation [19]) with eigenvalues $\lambda_i$, then the measurement on $|\lambda_i\rangle$ gives $\langle\lambda_i|A|\lambda_i\rangle = \lambda_i$ with certainty. If instead a measurement is performed on an arbitrary state $|\phi\rangle$, then the values $p_i = |\langle\lambda_i, \phi\rangle|^2$ determine a probability distribution for the outcomes, in such a way that the measurement $\langle\phi|A|\phi\rangle$ yields $\lambda_i$ with probability $p_i$. This is the central idea in the Copenhagen interpretation of quantum mechanics: the act of measurement returns a single (random) outcome [80]. A fundamental difference between classical and quantum information theory is that a bit of information is not in one state or the other (think ‘1’ or ‘0’) but in fact in both states simultaneously with a certain probability distribution [72]. There is a popular analogy that quantum mechanics is a non-commutative probability theory, with observables corresponding to random variables and states corresponding to probability measures. This is often a helpful analogy but is limited for several reasons (see [15]). More modern approaches use a $C^*$ formalism or other ideas such as rigged Hilbert spaces [80]. In this work we are concerned only with the set of possible states a quantum system may take. However, since every quantum operation may be related to a state via the Choi-Jamiołkowski isomorphism [12, 47], this viewpoint also sheds light on quantum dynamics.

The distinguishing feature of quantum information theory is entanglement [43]. The notion of entanglement was made famous in the thought experiment of Einstein, Podolsky, and Rosen [20], who wished to demonstrate the inconsistency of the superposition principle and the Copenhagen interpretation of quantum mechanics. A hidden variable theory of quantum mechanics emerged to resolve this so-called paradox, but outcomes of the EPR experiment were shown by John Bell [6] to be demonstrably inconsistent with any hidden variable theory. In [6] Bell proved that outcomes of a local hidden variable theory must satisfy certain inequalities, which have come to be known as Bell inequalities, and subsequent research showed that these inequalities may be violated. Further indication that a hidden variable theory is inconsistent with the axioms of quantum mechanics was given by the famous papers of Clauser, Horne, Shimony, and Holt [14], and Reinhard Werner [102]. In 1981, Alain Aspect et al. [1, 2] gave strong experimental results showing a violation of Bell's inequality. Since then, many experiments have been performed that support the axioms of quantum mechanics [80]. These results lead us to believe that entanglement is real. A natural question is: how does one quantify entanglement? This question leads to many deep and important mathematical notions and is the main source of inspiration for this chapter.

Besides entanglement, there are other fundamental concepts of quantum theory that appear to have significance for information processing. Examples of such notions include quantum discord [43, 40] and the PPT property, to which we devote considerable attention in this chapter and which is defined below in Section 2.2.

2.1 Preliminaries from Convex Geometry

This section provides background for certain concepts in convex geometry and introduces notation used throughout this chapter. The notation in this chapter is mostly the same as in the previous one, but we do make a few changes, sometimes for convenience and sometimes to conform with the common literature. One difference in the present chapter is the use of non-standard norms. Because of this we will be reasonably explicit with norms. For example, the Euclidean norm will be denoted $\|\cdot\|_2$. When comparing states we adopt a quasi-Dirac notation [19]. Vectors of norm 1 in a Hilbert space $\mathcal{H}$ we denote with the Dirac ket, such as $|\phi\rangle \in \mathcal{H}$, and the dual element with the Dirac bra, $\langle\phi| \in \mathcal{H}^*$, but the inner product we denote by $\langle\psi, \phi\rangle$ and not by the Dirac bracket $\langle\phi|\psi\rangle$. This allows us to write the pure states (the rank 1 orthogonal projections) as $|\phi\rangle\langle\phi|$ while still conforming to the usual inner product, since we use the inner product notion in several different contexts. As in the previous chapter, we let $\mathcal{K}^n$ denote the set of convex bodies contained in $\mathbb{R}^n$. We recall the definition of the polar body: for all $K \subset \mathbb{R}^n$,
\[
K^\circ = \{y \in \mathbb{R}^n : \langle x, y\rangle \le 1 \text{ for all } x \in K\}. \tag{2.1.1}
\]

Given $K \in \mathcal{K}_0^n$, we can define a quantity related to the support function (1.1.1), the gauge of $K$, also known as the Minkowski functional, via
\[
\|x\|_K = \inf\{\lambda > 0 : x \in \lambda K\} = \frac{1}{\sup\{t \ge 0 : tx \in K\}}.
\]
The gauge of a set should be thought of as a generalization of the norm: if $K$ is the unit ball with respect to some norm $\|\cdot\|$, then $\|\cdot\|_K \equiv \|\cdot\|$. If $K$ is bounded, it follows that $K^\circ \in \mathcal{K}_0^n$ and so the gauge of $K^\circ$ is defined and in fact $\|\cdot\|_{K^\circ} = h_K$, the support function of $K$ (1.1.1).

If $K \subset \mathbb{R}^n$ is bounded and $du$ is the normalized measure on $S^{n-1}$, we define, in this chapter, the mean width of $K$ as
\[
w(K) = \int_{S^{n-1}} h_K(u)\, du. \tag{2.1.2}
\]
Since $K$ is bounded, we can express this as
\[
w(K) = \int_{S^{n-1}} \|u\|_{K^\circ}\, du. \tag{2.1.3}
\]
Mean width is usually defined as in the previous chapter, by (1.3.1), as twice this quantity, that is $W(K) = 2w(K)$. However, it is often more convenient to drop that factor and to work with $w(K)$ as defined above (which is really the mean half-width), since this invariant is more closely related to the volume radius, as Urysohn's inequality illustrates below. In what follows, we will refer to $w(K)$ as the mean width of $K$ with the understanding that $w(K)$ is defined by (2.1.2) (as was done, e.g., in [3] and [93]).

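It is often fruitful to express $w(K)$ in terms of the standard Gaussian measure $\mu_n$ on $\mathbb{R}^n$, rather than the uniform measure on $S^{n-1}$. Let
\[
\gamma_n := \int_{\mathbb{R}^n} \|y\|_2\, d\mu_n(y) = \frac{\sqrt{2}\, \Gamma\left(\frac{n+1}{2}\right)}{\Gamma\left(\frac{n}{2}\right)}.
\]
Then $\gamma_n \sim \sqrt{n}$, by which we mean that $\lim_{n \to \infty} \frac{\gamma_n}{\sqrt{n}} = 1$. We use the $\sim$ convention frequently in what follows. In fact, a simple elementary argument using convexity of $\log \Gamma$ and the functional equation $\Gamma(x+1) = x\Gamma(x)$ shows that $\sqrt{n-1} < \gamma_n < \sqrt{n}$. We then have
\[
w(K) = \int_{S^{n-1}} h_K(u)\, du = \frac{1}{\gamma_n} \int_{\mathbb{R}^n} \max_{x \in K} \langle x, y\rangle\, d\mu_n(y). \tag{2.1.4}
\]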
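The two-sided bound $\sqrt{n-1} < \gamma_n < \sqrt{n}$ and the asymptotics $\gamma_n \sim \sqrt{n}$ are easy to check numerically. The sketch below evaluates $\gamma_n$ through the log-gamma function to avoid overflow for large $n$; the function name `gamma_n` is illustrative.

```python
import math

def gamma_n(n):
    """gamma_n = sqrt(2) * Gamma((n+1)/2) / Gamma(n/2), computed via log-gamma."""
    return math.sqrt(2.0) * math.exp(math.lgamma((n + 1) / 2.0) - math.lgamma(n / 2.0))

def check_bounds(max_n):
    """Return True if sqrt(n-1) < gamma_n < sqrt(n) for all 1 <= n <= max_n."""
    return all(math.sqrt(n - 1) < gamma_n(n) < math.sqrt(n) for n in range(1, max_n + 1))
```

For instance, `gamma_n(1)` equals $\sqrt{2/\pi}$, the mean absolute value of a standard Gaussian variable, and `gamma_n(n) / math.sqrt(n)` tends to 1 as `n` grows.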

The integral on the right is often called the Gaussian mean width of $K$. If $K$ is a compact subset of $\mathbb{R}^n$ (or of some other vector or affine space of real dimension $n$), we define its volume radius as
\[
\mathrm{vrad}(K) = \left(\frac{|K|}{|B_2^n|}\right)^{1/n}. \tag{2.1.5}
\]
In other words, $\mathrm{vrad}(K)$ is the radius of the Euclidean ball of the same dimension and volume as $K$. Again, if $K \in \mathcal{K}_0^n$, then the volume radius of $K$ can be calculated by passing to polar coordinates and expressed using the gauge of $K$ as follows:
\[
\mathrm{vrad}(K) = \left(\int_{S^{n-1}} \|u\|_K^{-n}\, du\right)^{1/n}. \tag{2.1.6}
\]

Important relationships between convex bodies can be expressed through these quantities. For example, Urysohn's inequality states that the volume radius can be no larger than the mean width for $K \in \mathcal{K}^n$. That is,
\[
w(K) \ge \left(\frac{|K|}{|B_2^n|}\right)^{1/n} \quad \text{or} \quad w(K) \ge \mathrm{vrad}(K). \tag{2.1.7}
\]
Another related inequality, which is used frequently in what follows, is
\[
\mathrm{vrad}(K) \ge w(K^\circ)^{-1}, \tag{2.1.8}
\]
valid for any $K$. This is, in a way, a dual version of the Urysohn inequality (2.1.7), but it is more elementary as it depends only on Hölder's inequality. In fact, we have
\begin{align*}
1 &= \int_{S^{n-1}} \|u\|_K^{\frac{n}{n+1}}\, \|u\|_K^{-\frac{n}{n+1}}\, du \\
&\le \left(\int_{S^{n-1}} \|u\|_K\, du\right)^{\frac{n}{n+1}} \left(\int_{S^{n-1}} \|u\|_K^{-n}\, du\right)^{\frac{1}{n+1}} \\
&\le w(K^\circ)^{\frac{n}{n+1}}\, \mathrm{vrad}(K)^{\frac{n}{n+1}}.
\end{align*}
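Both Urysohn's inequality (2.1.7) and its dual (2.1.8) can be observed on a concrete planar body. The following sketch assumes $n = 2$ and $K = [-a, a] \times [-b, b]$, an example chosen here for illustration, for which $h_K(u) = a|\cos\theta| + b|\sin\theta|$ and $\|u\|_K = \max(|\cos\theta|/a, |\sin\theta|/b) = h_{K^\circ}(u)$. The integrals (2.1.2), (2.1.6) and (2.1.3) applied to $K^\circ$ are approximated by the midpoint rule with the normalized measure on $S^1$.

```python
import math

def rectangle_invariants(a, b, steps=20000):
    """Return (w(K), vrad(K), w(K°)) for K = [-a, a] x [-b, b] in the plane."""
    w = vrad_sq = w_polar = 0.0
    for i in range(steps):
        theta = (i + 0.5) * 2.0 * math.pi / steps
        c, s = abs(math.cos(theta)), abs(math.sin(theta))
        support = a * c + b * s        # h_K(u)
        gauge = max(c / a, s / b)      # ||u||_K, which equals h_{K°}(u)
        w += support
        vrad_sq += gauge ** -2         # integrand of (2.1.6) for n = 2
        w_polar += gauge
    return w / steps, math.sqrt(vrad_sq / steps), w_polar / steps
```

For $a = 2$, $b = 1$ the exact values are $w(K) = 6/\pi$ and $\mathrm{vrad}(K) = \sqrt{8/\pi}$, and the computed triple should satisfy $w(K) \ge \mathrm{vrad}(K) \ge w(K^\circ)^{-1}$.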

Other classical inequalities include the Brunn-Minkowski inequality, the Santaló inequality and the Rogers-Shephard inequality; see for example [81].

The Santaló and reverse Santaló inequalities can be expressed simultaneously as follows (see e.g. [81]). Let $K \in \mathcal{K}^n$ be symmetric. Then
\[
c \le \left(\frac{|K|\, |K^\circ|}{|B_2^n|^2}\right)^{1/n} \le 1, \tag{2.1.9}
\]
where $c > 0$ is a universal constant (i.e., independent of $n$ and $K$). Notice that using (2.1.5) this can be rewritten as
\[
c \le \mathrm{vrad}(K)\, \mathrm{vrad}(K^\circ) \le 1. \tag{2.1.10}
\]
The inequality on the right holds for all convex bodies and is known as the Santaló inequality, with equality achieved if and only if $K$ is an ellipsoid. The inequality on the left is known as the reverse Santaló inequality and is due to Bourgain and Milman [11]. While the optimal value of the constant $c$ is unknown, it has been shown by Kuperberg [52] that $c = 1/2$ works. More precisely, Corollary 1.6 in [52] establishes
\[
c \ge 2 \binom{2n}{n}^{-1/n}.
\]
Note that while the assertion of Corollary 1.6 in [52] looks rather complicated, the above inequality can be easily extracted from the second formula on page 873 of [52].
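The passage from the bound $2\binom{2n}{n}^{-1/n}$ to the dimension-free constant $1/2$ rests on the elementary estimate $\binom{2n}{n} < 4^n$. A short numerical sketch makes this visible; the function name is illustrative, and logarithms are used so that large binomial coefficients cause no overflow.

```python
import math

def kuperberg_lower_bound(n):
    """Evaluate 2 * binom(2n, n)^(-1/n) through logarithms."""
    return 2.0 * math.exp(-math.log(math.comb(2 * n, n)) / n)

def always_above_half(max_n):
    """Return True if the bound exceeds 1/2 for all 1 <= n <= max_n."""
    return all(kuperberg_lower_bound(n) > 0.5 for n in range(1, max_n + 1))
```

The values start at 1 for $n = 1$ and decrease towards $1/2$ as $n$ grows, which is consistent with the statement that $c = 1/2$ works in every dimension.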

A version of the Santaló inequalities exists for bodies that are not necessarily symmetric. We have then
\[
c_1 \le \inf_z\, \mathrm{vrad}(K)\, \mathrm{vrad}((K - z)^\circ) \le 1, \tag{2.1.11}
\]
where the infimum is taken over the interior of $K$. The infimum in the above expression is attained (because $K$ is compact, $|\cdot|$ is continuous, and $|(K - z)^\circ| \to +\infty$ when $z$ approaches the boundary of $K$) and the minimizing $z$ is called the Santaló point of $K$. Again, the optimal value of the constant is unknown, but it is known that $c_1 = c/2$ works (this value comes from a symmetrization argument combined with the Rogers-Shephard inequality, as discussed below). It is also known that the constant $c_1$ in (2.1.11) is different from the constant $c$ in (2.1.9) and (2.1.10) above. Finally, as mentioned in Section 1.2.1, it is conjectured that, for given $n$, the minimum of $\mathrm{vrad}(K)\, \mathrm{vrad}(K^\circ)$ in (2.1.9) and (2.1.10) is attained when $K$ is an $n$-cube and the minimum of (2.1.11) is attained when $K$ is an $n$-simplex (this is the previously mentioned Mahler conjecture [66, 67]).

The Rogers-Shephard inequalities give us a way to compare volumes of non-symmetric bodies by considering their symmetrizations. A symmetrization that is often considered is the difference body, $K - K$, which is the Minkowski sum of $K$ and its reflection. The usual formulation of the Rogers-Shephard inequality [77] is, for $K \in \mathcal{K}^n$,
\[
|K - K| \le \binom{2n}{n} |K|. \tag{2.1.12}
\]
On the other hand, we can estimate the volume of $(K - K)^\circ$ by considering the gauge of $(K - K)^\circ$,
\[
\|\cdot\|_{(K-K)^\circ} = \|\cdot\|_{K^\circ} + \|\cdot\|_{-K^\circ},
\]
and passing to polar coordinates (cf. (2.1.6)); Jensen's inequality then gives us
\[
|(K - K)^\circ| \le 2^{-n}\, |K^\circ|.
\]
Combining these statements yields
\[
|K - K|\, |(K - K)^\circ| \le 2^{-n} \binom{2n}{n} |K|\, |K^\circ|.
\]
Next, applying (2.1.10) with $K - K$ in place of $K$ leads to
\[
c \le \frac{1}{2} \binom{2n}{n}^{1/n} \mathrm{vrad}(K)\, \mathrm{vrad}(K^\circ).
\]
In other words, we conclude that (2.1.11) holds with $c_1 = 2\binom{2n}{n}^{-1/n} c \ge c/2 \ge 1/4$.

Another possible (and useful) variant of the Rogers-Shephard inequality can be stated as follows (cf. [78]). If we let $W \subset \mathbb{R}^{n+1}$ be an $n$-dimensional convex body, let $h$ be the distance from the hyperplane $H$ spanned by $W$ to the origin in $\mathbb{R}^{n+1}$, and let $\Omega = \mathrm{conv}(W \cup -W)$, then we have
\[
2h\, |W| \le |\Omega| \le \frac{2^n}{n+1}\, 2h\, |W|.
\]

This symmetrization was used in [3] and [92] to obtain some of the estimates used in the following section.

2.2 Summary of Volumetric Estimates for Sets of States

The purpose of this section is to summarize the known estimates of various volumetric invariants for convex sets appearing in quantum information theory.

Let $\mathcal{H} = \mathbb{C}^{D_1} \otimes \mathbb{C}^{D_2} \otimes \cdots \otimes \mathbb{C}^{D_N}$ and $d := D_1 D_2 \cdots D_N = \dim \mathcal{H}$. We always assume that $D_j \ge 2$, since otherwise the corresponding factors correspond to scalar multiplication and can be discarded. We consider the set of states on $B(\mathcal{H})$, which we denote
\[
\mathcal{D} = \mathcal{D}(\mathcal{H}) = \{\rho \in B(\mathcal{H}) : \rho \ge 0,\ \mathrm{tr}\, \rho = 1\}.
\]
This set is convex and its extreme points, which are the rank 1 orthogonal projections, are called pure states. An important subset of $\mathcal{D}$ is the set of separable states, which we denote
\[
\mathcal{S} = \mathcal{S}(\mathcal{H}) = \mathrm{conv}\{\rho_1 \otimes \rho_2 \otimes \cdots \otimes \rho_N : \rho_j \in \mathcal{D}(\mathbb{C}^{D_j})\}.
\]
A state is said to be entangled if it is not separable. The (real) dimension of $\mathcal{D}$ is $d^2 - 1 =: n$. The same is true for $\mathcal{S}$ and for the other subsets that we will subsequently consider.

We are concerned mainly with the case $N = 2$. In this case, in addition to $\mathcal{S}$, an important subset is the set of states with positive partial transpose
\[
\mathcal{PPT} = \{\rho \in \mathcal{D}(\mathbb{C}^{D_1} \otimes \mathbb{C}^{D_2}) : (\mathrm{Id} \otimes T)(\rho) \ge 0\}.
\]
Here $T$ is the transpose map and $\mathrm{Id}$ is the identity map; the map $\Gamma := \mathrm{Id} \otimes T$ is called the partial transpose. If we transpose on the first system with $T \otimes \mathrm{Id}$ we obtain the same set $\mathcal{PPT}$. We have the inclusions $\mathcal{S} \subseteq \mathcal{PPT} \subseteq \mathcal{D}$ (the first inclusion is the content of the Peres-Horodecki partial transpose criterion [42], and the second one is trivial).
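The partial transpose can be made concrete on two qubits. The sketch below is a plain-Python illustration under the assumption that a matrix on $\mathbb{C}^{D_1} \otimes \mathbb{C}^{D_2}$ is stored as nested lists with the index pair $(i, k)$ flattened to $i D_2 + k$; all function names are hypothetical. Rather than computing eigenvalues, it exhibits a vector $v$ with $\langle v, \Gamma(\rho) v\rangle < 0$ for a maximally entangled state, which already shows that this state is not in $\mathcal{PPT}$ and hence not separable.

```python
import math

def partial_transpose(rho, d1, d2):
    """Apply Id (x) T: the entry at ((i,k),(j,l)) moves to ((i,l),(j,k))."""
    n = d1 * d2
    out = [[0j] * n for _ in range(n)]
    for i in range(d1):
        for k in range(d2):
            for j in range(d1):
                for l in range(d2):
                    out[i * d2 + l][j * d2 + k] = rho[i * d2 + k][j * d2 + l]
    return out

def quad_form(m, v):
    """Return <v, M v>, which is real when M is Hermitian."""
    n = len(v)
    return sum(v[a].conjugate() * m[a][b] * v[b]
               for a in range(n) for b in range(n)).real

def projector(v):
    """Rank 1 projection |v><v| for a unit vector v."""
    return [[a * b.conjugate() for b in v] for a in v]

s = 1.0 / math.sqrt(2.0)
bell = projector([s, 0.0, 0.0, s])      # (|00> + |11>)/sqrt(2)
witness = [0.0, s, -s, 0.0]             # (|01> - |10>)/sqrt(2)
bell_value = quad_form(partial_transpose(bell, 2, 2), witness)

product = projector([1.0, 0.0, 0.0, 0.0])   # |00><00|, a separable pure state
product_pt = partial_transpose(product, 2, 2)
```

Here `bell_value` is negative, so the partial transpose of the Bell state is not positive semidefinite, whereas the product state $|00\rangle\langle 00|$ is left unchanged by the partial transpose.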

The PPT property is intriguing and not fully understood. Besides its appearance in the Peres-Horodecki criterion, it has bearing on various quantum information protocols and, for example, is the object of the distillability conjecture [41, 45], one of the important open problems in quantum information theory.

While all results can be stated, with some effort, for arbitrary $D_j$'s, to lighten the notation we will concentrate on the case $D_1 = D_2 = \ldots = D_N =: D$. The first estimates that we record address the question of relative sizes of the sets $\mathcal{S}$, $\mathcal{PPT}$ and $\mathcal{D}$ in terms of the usual volume and are due to Szarek and Aubrun [3]. They showed that there exist universal constants $C, c, c_0 > 0$ such that
\[
\frac{c^N}{d^{1/2 - 1/2N}} \le \left(\frac{|\mathcal{S}|}{|\mathcal{D}|}\right)^{1/n} \le \frac{C (N \ln N)^{1/2}}{d^{1/2 - 1/2N}}
\]
and, when $N = 2$,
\[
c_0 \le \left(\frac{|\mathcal{PPT}|}{|\mathcal{D}|}\right)^{1/n} \le 1.
\]
In view of the $n$th root appearing in the estimates, it is convenient to formulate them using the volume radius (2.1.5). In this notation, the estimates stated above can be rewritten as
\[
\frac{c^N}{d^{1/2 - 1/2N}} \le \frac{\mathrm{vrad}(\mathcal{S})}{\mathrm{vrad}(\mathcal{D})} \le \frac{C (N \ln N)^{1/2}}{d^{1/2 - 1/2N}}, \tag{2.2.1}
\]
\[
c_0 \le \frac{\mathrm{vrad}(\mathcal{PPT})}{\mathrm{vrad}(\mathcal{D})} \le 1. \tag{2.2.2}
\]
Note that the expressions in the numerators of the first and last member of (2.2.1) are independent of $D$. In particular, for $N = 2$, which is the setting that we are mostly interested in, this reads
\[
\frac{c_2}{D^{1/2}} = \frac{c_2}{d^{1/4}} \le \frac{\mathrm{vrad}(\mathcal{S})}{\mathrm{vrad}(\mathcal{D})} \le \frac{C_2}{d^{1/4}} = \frac{C_2}{D^{1/2}},
\]
where $c_2 = c^2$ and $C_2 = C(2 \ln 2)^{1/2}$ are just some numerical (effectively computable) positive constants.

Many of the upper bounds on volume radius are derived from similar bounds on the mean width via Urysohn's inequality. The mean width, in turn, is often bounded from above by first relating it to the Gaussian width (cf. (2.1.4)), and then appealing to majoration results for Gaussian processes such as Dudley's inequality ([17], or see [75], Theorem 5.6). Szarek, Werner and Zyczkowski [93] use similar, although more subtle, arguments to arrive at explicit upper and lower bounds, involving reasonable constants, for the mean width and for the volume radius of the sets in question. These arguments are based on the notion of polar body (2.1.1), on the dual Urysohn inequality (2.1.8), and exploit the not-so-well-known relationship between polarity of convex bodies and that of convex cones (see Lemma 1 on page 11 of [93]).

While the results stated thus far involve comparing volumes of sets contained in the same affine space, and so are independent of any additional structure that space may be endowed with, estimating volume and, particularly, all the considerations involving duality require the presence of a specific Euclidean structure. The standard choice of such structure in spaces of matrices is given by the Hilbert-Schmidt inner product $\langle X, Y \rangle_{HS} := \operatorname{tr} XY^\dagger$ and it will be assumed in what follows. Note that in the physics literature the convention $\langle X, Y \rangle_{HS} := \operatorname{tr} X^\dagger Y$ is more common, and that in the spaces of self-adjoint matrices, which will be our most common setting, the $\dagger$ is not needed.

Further, let us emphasize that all the sets in question are contained in the affine subspace of trace 1 matrices where the maximally mixed state, $I/d$, plays the role of the origin. That is, the polar operation $^\circ$ is the polar operation in the space of trace 1 matrices with respect to $I/d$. The maximally mixed state also plays the role of the origin with regard to reflection, $K \mapsto -K$, and dilation, $K \mapsto rK$. More precisely, if $K$ is a set in the affine subspace of trace 1 matrices, then

\begin{align*}
K^\circ \ &\text{means} \ (K - I/d)^\circ + I/d, \\
-K \ &\text{means} \ -(K - I/d) + I/d, \\
rK \ &\text{means} \ r(K - I/d) + I/d.
\end{align*}

An equivalent approach is to consider K − I/d as a (full-dimensional) subset of the vector subspace of trace 0 matrices, and think of all the operations as taking place inside that subspace.

In the above convention, we have

\[ \mathcal{D}^\circ = -d\,\mathcal{D}, \qquad \mathcal{PPT}^\circ = (\mathcal{D} \cap (I \otimes T)\mathcal{D})^\circ = -d \operatorname{conv}(\mathcal{D} \cup (I \otimes T)\mathcal{D}), \tag{2.2.3} \]

and so on (see [93] for details). Also, any discussion of the inradii and outradii of these sets will be in this context, in the sense that the smallest (largest) ball which contains (is contained in) a body $K$ will lie in the space of trace 1 matrices and be centered at $I/d$. Note that since, for all the sets that we will consider, $I/d$ is the only point invariant under the isometries of the set, the smallest/largest balls containing/contained in them are necessarily centered at $I/d$.

One non-trivial estimate used above is $w(\mathcal{D}) \le 2/\sqrt{d}$. This inequality is closely related to the largest eigenvalue of a random matrix from a Gaussian Unitary Ensemble, or GUE. The GUE can be described as follows: if we let $X \in M_d^{sa}$ (the real vector space of $d \times d$ self-adjoint matrices) be chosen according to the standard Gaussian measure on $M_d^{sa}$, that is $\mu_g(X) = \frac{1}{Z} \exp(-\|X\|_2^2/2)$ where $\frac{1}{Z}$ is the normalization factor, then $X$ is a GUE matrix. The GUE is usually defined in a constructive manner by specifying the distribution of individual entries. These approaches are equivalent, but the advantage of our approach is that the relationship to the uniform distribution on the Hilbert-Schmidt sphere becomes clear. If $S$ is the Hilbert-Schmidt sphere of $d \times d$ self-adjoint trace 0 matrices, using (2.1.4) we calculate

\[ w(\mathcal{D}) = \int_S h_{\mathcal{D}}(\xi)\, d\xi = \frac{1}{\gamma_{d^2-1}}\, \mathbb{E} \max_{\sigma \in \mathcal{D}} \langle \sigma, X \rangle, \]

where $X$ is a random matrix, distributed according to the standard Gaussian measure on the space of $d \times d$ self-adjoint trace 0 matrices; let us denote this distribution by $\mathrm{GUE}^0$.

Now consider $\max_{\sigma \in \mathcal{D}} \langle \sigma, X \rangle$, where $X$ is a $\mathrm{GUE}^0$ matrix. We need only consider the extreme points of $\mathcal{D}$, which are the orthogonal projections of rank 1. So we have

\begin{align*}
\max_{\sigma \in \mathcal{D}} \langle \sigma, X \rangle &= \max_{u \in S^{d-1}} \langle |u\rangle\langle u|, X \rangle \\
&= \max_{u \in S^{d-1}} \operatorname{tr}(|u\rangle\langle u| X) \\
&= \max_{u \in S^{d-1}} \langle Xu, u \rangle,
\end{align*}

which is the largest eigenvalue of $X$. It is well known from the literature ([18, 28, 34] for instance) that the expected value of the largest eigenvalue of a GUE matrix is asymptotically $2\sqrt{d}$, and it is easy to deduce that the same is true for $\mathrm{GUE}^0$ (cf. section 2.5). Since $\gamma_{d^2-1} \sim d$ we have

\[ w(\mathcal{D}) \sim 2/\sqrt{d}. \]

The fact that $w(\mathcal{D}) \le 2/\sqrt{d}$ for all $d$ is slightly more delicate; it follows for example from the argument in [92], Appendix F (see also section 2.5).

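Following [93] one can derive the estimate

\[ \operatorname{vrad}(\mathcal{PPT}) \ge \frac{1}{4\sqrt{d}} \tag{2.2.4} \]

as follows. We have

\[ w(\mathcal{PPT}^\circ) = d\, w(\operatorname{conv}(\mathcal{D} \cup (I \otimes T)\mathcal{D})) \le 2d\, w(\mathcal{D}) \le 4\sqrt{d}. \]

This together with the dual Urysohn inequality (2.1.8),

\[ \operatorname{vrad}(\mathcal{PPT}) \ge w(\mathcal{PPT}^\circ)^{-1}, \]

yields (2.2.4). On the other hand, using the Santaló inequality (2.1.9) and the duality relations (2.2.3), specifically the fact that $\mathcal{D}^\circ = -d\,\mathcal{D}$, we obtain

\[ \operatorname{vrad}(\mathcal{D}) \le \frac{1}{\sqrt{d}}. \tag{2.2.5} \]

Together, these estimates imply

\[ \frac{1}{4} \le \frac{\operatorname{vrad}(\mathcal{PPT})}{\operatorname{vrad}(\mathcal{D})}. \]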
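For the reader's convenience, the two steps just used can be spelled out as follows. This is only a sketch: the statement of (2.1.9) lies outside this section, and we assume here that it is the Santaló inequality in the normalized form $\operatorname{vrad}(K)\operatorname{vrad}(K^\circ) \le 1$, with the polar taken with respect to $I/d$ as in the convention above.

```latex
% Assumption: (2.1.9) reads vrad(K) vrad(K^\circ) <= 1 (normalized Santalo inequality).
\begin{align*}
\operatorname{vrad}(\mathcal{D}^\circ)
  &= \operatorname{vrad}(-d\,\mathcal{D}) = d \operatorname{vrad}(\mathcal{D})
  && \text{by (2.2.3)}, \\
d \operatorname{vrad}(\mathcal{D})^2
  &= \operatorname{vrad}(\mathcal{D}) \operatorname{vrad}(\mathcal{D}^\circ) \le 1
  && \text{by the assumed form of (2.1.9)}, \\
\operatorname{vrad}(\mathcal{D}) &\le \frac{1}{\sqrt{d}}
  && \text{which is (2.2.5)}, \\
\frac{\operatorname{vrad}(\mathcal{PPT})}{\operatorname{vrad}(\mathcal{D})}
  &\ge \frac{1/(4\sqrt{d})}{1/\sqrt{d}} = \frac{1}{4}
  && \text{by (2.2.4) and (2.2.5)}.
\end{align*}
```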

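The above inequality may be further sharpened by using, first, the stronger fact that $w(\operatorname{conv}(\mathcal{D} \cup (I \otimes T)\mathcal{D})) \sim w(\mathcal{D}) \sim 2/\sqrt{d}$ (which follows from the concentration of measure, cf. section 2.5) instead of the “trivial” fact that $w(\operatorname{conv}(\mathcal{D} \cup (I \otimes T)\mathcal{D})) \le 2w(\mathcal{D})$ (which only uses the general inequality $w(\operatorname{conv}(A \cup B)) \le w(A) + w(B)$, valid when $A \cap B \ne \emptyset$, and $I \otimes T$ being an isometry). Next, instead of (2.2.5), which follows from general principles, we could use the explicit formula for $|\mathcal{D}|$ given by [110], which implies $\operatorname{vrad}(\mathcal{D}) \sim e^{-1/4}/\sqrt{d}$. Combining these improvements leads to the asymptotic bound

\[ \frac{\operatorname{vrad}(\mathcal{PPT})}{\operatorname{vrad}(\mathcal{D})} \gtrsim \frac{e^{1/4}}{2}. \tag{2.2.6} \]

Here and in what follows, $a_n \gtrsim b_n$ stands for $\liminf a_n/b_n \ge 1$, and similarly for $a_n \lesssim b_n$.

We conclude the discussion with a (partly repetitive) summary of the bounds and asymptotic relations. (Some of them were explained above, for others we follow [93], and a few more are elementary. Note that the last row of Table III in [93] has missing exponents!) Again, we restrict our attention to $N = 2$ and $D_1 = D_2 = D$ (hence $d = D^2$).

\[ \frac{1}{2\sqrt{d}} \le \operatorname{vrad}(\mathcal{D}) \le \frac{1}{\sqrt{d}}, \qquad \operatorname{vrad}(\mathcal{D}) \simeq \frac{e^{-1/4}}{\sqrt{d}} \tag{2.2.7} \]

\[ \frac{1}{4\sqrt{d}} \le \operatorname{vrad}(\mathcal{PPT}) \le \operatorname{vrad}(\mathcal{D}), \qquad \operatorname{vrad}(\mathcal{PPT}) \gtrsim \frac{1}{2\sqrt{d}} \tag{2.2.8} \]

\[ \frac{1}{6 d^{3/4}} \le \operatorname{vrad}(\mathcal{S}) \le \frac{4}{d^{3/4}} \tag{2.2.9} \]

The estimates for the mean width read as follows.

\[ \frac{1}{\sqrt{d}} \le w(\mathcal{D}) \le \frac{2}{\sqrt{d}}, \qquad w(\mathcal{D}) \sim \frac{2}{\sqrt{d}} \tag{2.2.10} \]

\[ \frac{1}{4\sqrt{d}} \le w(\mathcal{PPT}) \le w(\mathcal{D}), \qquad w(\mathcal{PPT}) \gtrsim \frac{1}{2\sqrt{d}} \tag{2.2.11} \]

\[ \frac{1}{6 d^{3/4}} \le w(\mathcal{S}) \le \frac{4}{d^{3/4}} \tag{2.2.12} \]

The inequality on the left of (2.2.7) follows from the dual Urysohn inequality (2.1.8), from the fact that $\mathcal{D}^\circ = -d\,\mathcal{D}$, and from the upper bound on $w(\mathcal{D})$ given in (2.2.10).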

Other geometric parameters that are of interest are the inradii and the outradii of sets of states. The inradius of a convex body $K$ is the radius of the largest Euclidean ball contained in $K$. Likewise the outradius is the radius of the smallest ball that contains $K$. Gurvits and Barnum [32] have shown that the inradius of $\mathcal{S}$ for the case $N = 2$ and $D_1 = D_2$ verifies

\[ \operatorname{inrad}(\mathcal{S}) = \operatorname{inrad}(\mathcal{D}) = \frac{1}{\sqrt{d(d-1)}}. \]

In fact, Gurvits and Barnum have also found estimates for the inradius of S for several cases of multipartite (N ≥ 3) systems (in [33]), but we are mainly concerned with the bipartite situation.

Notice that the ratio of volume radii of $\mathcal{S}$ and $\mathcal{D}$ is of order $d^{-1/4}$, while the volume radii of $\mathcal{PPT}$ and $\mathcal{D}$ are of the same order. The same holds for the mean width. This implies that, for $D$ large enough, the set $\mathcal{PPT}$ is much more “massive” than $\mathcal{S}$. This suggests that, starting with $D = 3$, and especially for large $D$, there exist PPT states which are, by any reasonable measure, heavily entangled and far from the set of separable states. The goal of this work is to provide sharp quantitative results in this direction and, ultimately, to supply explicit examples of such “heavily entangled” PPT states. In sections 2.7 and 2.8 we will show that some surprisingly strong estimates can be rather easily deduced from the above elementary volumetric estimates. However, we first need more notation and further preliminary results that will be introduced in the next two sections.

2.3 Geometric Measures of Entanglement

Here we record some distance measures that are useful in quantum information. The most popular of them is the (normalized) trace distance, for $\rho, \sigma \in \mathcal{D}$:

\[ \delta(\rho, \sigma) = \frac{1}{2} \operatorname{tr}(|\rho - \sigma|). \]

The reason for the $1/2$ factor is that in this normalization, when restricted to classical (or diagonal) states, the eigenvalues of $\rho$ and $\sigma$ determine probability distributions and $\delta$ coincides with the total variation distance (or Kolmogorov distance) frequently used in probability and statistics [22]. The measure that is most natural from the geometric point of view is the Hilbert-Schmidt (or Frobenius) distance:

\[ h(\rho, \sigma) = \left(\operatorname{tr}(\rho - \sigma)^2\right)^{1/2}. \]
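The remark about classical states can be checked on a small example. The sketch below is illustrative only: the helper names and the two probability vectors are our own choices, and diagonal states are represented simply by their lists of eigenvalues. It computes $\delta$ and $h$ for two diagonal states and compares $\delta$ with the total variation distance obtained by brute force over all events.

```python
import math


def trace_distance_diagonal(p, q):
    """(1/2) tr|rho - sigma| for diagonal states with diagonals p and q."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def hilbert_schmidt_distance_diagonal(p, q):
    """(tr (rho - sigma)^2)^(1/2) for diagonal states with diagonals p and q."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def total_variation(p, q):
    """max over events A of |P(A) - Q(A)|, by enumerating all subsets A."""
    n = len(p)
    best = 0.0
    for mask in range(1 << n):
        diff = sum(p[i] - q[i] for i in range(n) if (mask >> i) & 1)
        best = max(best, abs(diff))
    return best


# Two hypothetical classical (diagonal) states on a 3-level system.
p = [0.5, 0.3, 0.2]
q = [0.25, 0.25, 0.5]

delta = trace_distance_diagonal(p, q)
tv = total_variation(p, q)
h = hilbert_schmidt_distance_diagonal(p, q)
print(delta, tv, h)
```

For these two vectors the trace distance and the total variation distance agree, while the Hilbert-Schmidt distance is a different (but comparable) quantity.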

Both are related to the distance induced by the $p$-Schatten norms, defined as follows:

\[ \|\rho\|_p = (\operatorname{tr}|\rho|^p)^{1/p}. \]

Accordingly, to quantify entanglement of a state $\rho$ it is natural to consider, for $p \in [1, \infty]$,

\[ \inf_{\sigma \in \mathcal{S}} \|\rho - \sigma\|_p. \tag{2.3.1} \]

Another measure of entanglement indicates how much mixture with the maximally mixed state $I/d$ is necessary to make the state separable. It is related to the concept of the gauge of a convex set.

In our setting, with the maximally mixed state $I/d$ playing the role of the origin, the formula necessarily becomes a little more involved, for example

\[ \sup\{t \in (0, 1) : t\rho + (1 - t)I/d \in \mathcal{S}\} = \left(\inf\{\lambda > 0 : \rho \in I/d + \lambda(\mathcal{S} - I/d)\}\right)^{-1} = \|\rho\|_{\mathcal{S}}^{-1}. \]

Note that $\|\rho\|_{\mathcal{S}} \le 1$ if and only if $\rho \in \mathcal{S}$, and $\|\rho\|_{\mathcal{S}} < 1$ if and only if $\rho$ belongs to the (relative) interior of $\mathcal{S}$, so we have an entanglement measure with a natural functional-analytic interpretation.

The above quantity is closely related to the so-called geometric Banach-Mazur distance between the convex bodies $\mathcal{S}$ and $\mathcal{D}$, which equals

\[ \max_{\rho \in \mathcal{D}} \|\rho\|_{\mathcal{S}}. \]

Good bounds for the above quantity are known [4] and will be elaborated on in the next section. Geometrically, the quantity says how far $\mathcal{D}$ “sticks out” of $\mathcal{S}$. (Note that the usual concept of the geometric Banach-Mazur distance between bodies $A$ and $B$ accounts both for how much “$A$ sticks out of $B$” and for how much “$B$ sticks out of $A$,” but in the present setting we have $\max_{\sigma \in \mathcal{S}} \|\sigma\|_{\mathcal{D}} = 1$.) Similarly the quantity

\[ \max_{\rho \in \mathcal{PPT}} \|\rho\|_{\mathcal{S}} \]

tells us how much the set $\mathcal{PPT}$ is larger than $\mathcal{S}$, which is more delicate and will be among the primary objects of study in this chapter. We are also interested in exhibiting specific $\rho \in \mathcal{PPT}$ for which $\|\rho\|_{\mathcal{S}}$, or some other measure of entanglement, is large.

Many of the quantities commonly encountered in quantum information theory are based on concepts which have their roots in classical information theory. One example is fidelity [49], also known as the Bhattacharyya coefficient [22] or the transition probability [94], which is given by:

\[ F(\rho, \sigma) = \operatorname{tr}\left(\sqrt{\sqrt{\rho}\,\sigma\sqrt{\rho}}\right). \]

The fidelity is often defined as the square of this quantity, for example in [49]. The convention used here appears in [22, 94, 72]. An equivalent formulation is given by

\[ F(\rho, \sigma) = \left\|\sqrt{\rho}\sqrt{\sigma}\right\|_1. \]

The fidelity can be thought of as generalizing the amount of “overlap” of vectors. If $\rho$ and $\sigma$ are pure states, say $\rho = |\phi\rangle\langle\phi|$ and $\sigma = |\psi\rangle\langle\psi|$, then

\[ F(\rho, \sigma) = |\langle \phi, \psi \rangle|. \]

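If one considers $\rho$ to be a mixed state (on, say, $H_1$), then we can pass to a larger Hilbert space $H_1 \otimes H_2$ on which there exists a pure state $|\phi\rangle\langle\phi|$ such that $\rho = \operatorname{tr}_2 |\phi\rangle\langle\phi|$, where $\operatorname{tr}_2$ is the partial trace over $H_2$, that is $\operatorname{tr}_2 : \mathcal{D}(H_1 \otimes H_2) \to \mathcal{D}(H_1)$ given by $\operatorname{tr}_2 = \mathrm{Id} \otimes \operatorname{tr}$. This is known as a purification. It has been shown [22] that

\[ F(\rho, \sigma) = \sup\{|\langle \phi, \psi \rangle| : \rho = \operatorname{tr}_2 |\phi\rangle\langle\phi|,\ \sigma = \operatorname{tr}_2 |\psi\rangle\langle\psi|\}. \]

Fidelity is related to the Bures distance by the following [89]:

\[ d_B(\rho, \sigma) = \sqrt{2 - 2F(\rho, \sigma)}. \]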

This measure is arguably more appropriate in the present context since it is a metric.

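The fidelity and the Bures distance have a nice geometric interpretation. If $\theta$ is the angle between $|\psi\rangle$ and $|\phi\rangle$, then $F = F(|\psi\rangle\langle\psi|, |\phi\rangle\langle\phi|) = \cos\theta$, $1 - F = 2\sin^2(\theta/2)$, $\sqrt{1 - F^2} = \sin\theta$, and $d_B(|\psi\rangle\langle\psi|, |\phi\rangle\langle\phi|) = \sqrt{2 - 2F} = 2\sin(\theta/2) = \| |\psi\rangle - |\phi\rangle \|$ (the last equality requires the phases of the two vectors to be chosen so that $\langle \psi, \phi \rangle \ge 0$). Various geometrical properties of sets of quantum states endowed with the Bures distance have been explored in [89] and [106].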
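The identities for pure states are easy to verify numerically. The following is a minimal sketch, not part of the original argument: the function names and the choice of two real unit vectors in the plane at angle $\theta = 0.7$ are illustrative assumptions. Because the vectors are real with non-negative inner product, the Bures distance should coincide with the Euclidean distance between them.

```python
import math


def inner(phi, psi):
    """Inner product <phi, psi>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))


def norm(v):
    """Euclidean norm of a vector given as a list of numbers."""
    return math.sqrt(sum(abs(a) ** 2 for a in v))


def fidelity_pure(phi, psi):
    """F(|phi><phi|, |psi><psi|) = |<phi, psi>| for unit vectors."""
    return abs(inner(phi, psi))


def bures_pure(phi, psi):
    """d_B = sqrt(2 - 2F) for pure states."""
    return math.sqrt(max(0.0, 2.0 - 2.0 * fidelity_pure(phi, psi)))


theta = 0.7  # hypothetical angle between the two unit vectors
psi = [1.0, 0.0]
phi = [math.cos(theta), math.sin(theta)]

F = fidelity_pure(phi, psi)
d_B = bures_pure(phi, psi)
diff_norm = norm([a - b for a, b in zip(phi, psi)])
print(F, d_B, diff_norm)
```

Here `F` agrees with $\cos\theta$, and `d_B` agrees both with $2\sin(\theta/2)$ and with the norm of the difference of the two vectors.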

Fidelity enjoys several useful properties. It is multiplicative over tensor products, that is

\[ F(\rho_1 \otimes \rho_2, \sigma_1 \otimes \sigma_2) = F(\rho_1, \sigma_1) F(\rho_2, \sigma_2). \tag{2.3.2} \]

The square of $F$ is concave in each of its arguments. We also have the following inequalities [22]:

\[ 1 - F(\rho, \sigma) \le \delta(\rho, \sigma) \le \sqrt{1 - F(\rho, \sigma)^2}. \tag{2.3.3} \]

Quantifying entanglement begins with considering only pure states. As mentioned in connection with (2.3.1), it is natural to consider quantities such as

\[ \inf_{\sigma \in \mathcal{S}} \|\rho - \sigma\| \]

for any reasonable norm (or, in fact, for any metric). Shimony proposed in [87] a measure of entanglement for pure states given by

\[ E(\rho) = \frac{1}{2} \inf_{\sigma \in \mathcal{S}} \|\rho - \sigma\|_2^2. \]

Shimony also demonstrated that $E(\rho) = 1 - \alpha_1$, where $\alpha_1$ is the largest Schmidt coefficient of $\rho$. This approach has been generalized to mixed states in [4, 8] using various methods. Wei [96] also uses a geometric approach and defines the entanglement eigenvalue of a pure state $\rho = |\phi\rangle\langle\phi|$ by

\[ \Lambda(\rho) = \sup\{|\langle \psi, \phi \rangle| : |\psi\rangle\langle\psi| \in \mathcal{S}\}. \]

Notice that $\Lambda(\rho)$ can be interpreted as $h_{\mathcal{S}}(\phi)$, the support function of $\mathcal{S}$ evaluated at $\phi$. However, some care is needed as the scalar product is evaluated in a complex Hilbert space. Wei suggests as a measure of entanglement

\[ E_W(\rho) = 1 - \Lambda(\rho)^2. \]

This is then generalized to certain classes of mixed states. It is natural to consider possible geometric measures of entanglement given by

\[ E_g(\rho) = 1 - \sup_{\sigma \in \mathcal{S}} \operatorname{tr}(\rho\sigma), \]

\[ E_h(\rho) = 1 - \sup_{\sigma \in \mathcal{S}} F(\rho, \sigma)^2, \]

since $E(\rho) = E_W(\rho) = E_g(\rho) = E_h(\rho)$ when $\rho$ is a pure state. Other measures of entanglement are based on the notion of entropy. Entanglement of formation of a pure state $\rho = |\psi\rangle\langle\psi|$, as defined by Wootters [105], is given by the entropy of either of the subsystems. That is, if $\rho_A = \operatorname{tr}_B(|\psi\rangle\langle\psi|)$ (and $\rho_B = \operatorname{tr}_A(|\psi\rangle\langle\psi|)$) then

\[ E_F(\rho) = -\operatorname{tr}(\rho_A \log \rho_A) = -\operatorname{tr}(\rho_B \log \rho_B). \]

Entanglement of formation for an arbitrary state is then given as the average entanglement of the pure states of the decomposition, minimized over all decompositions. That is,

\[ E_F(\rho) = \min\left\{\sum_i p_i E_F(|\psi_i\rangle\langle\psi_i|) : \rho = \sum_i p_i |\psi_i\rangle\langle\psi_i|\right\}. \]

This is known as the convex roof construction and, in fact, any measure of entanglement on pure states can be extended to mixed states in this way. It is natural to consider questions analogous to the questions stated above. Namely, what are, for example,

\[ \max_{\rho \in \mathcal{PPT}} \min_{\sigma \in \mathcal{S}} d_B(\rho, \sigma), \qquad \max_{\rho \in \mathcal{PPT}} E_g(\rho), \qquad \text{and} \qquad \max_{\rho \in \mathcal{PPT}} E_F(\rho)? \]

What are “typical” values of these parameters for PPT states?

2.4 Ranges of Various Entanglement Measures and the role of the Dimension

In this section we will discuss general bounds for $\|\rho\|_{\mathcal{S}}$ or $\|\rho\|_{\mathcal{PPT}}$ and some other quantities defined in the preceding section as $\rho$ varies over $\mathcal{D}$ or its subsets. We will also investigate the way such bounds depend on the dimension of the system.

For any vector $|\phi\rangle \in H_1 \otimes H_2$, we can write $|\phi\rangle$ as

\[ |\phi\rangle = \sum_{i=1}^m \alpha_i |e_i\rangle \otimes |f_i\rangle, \]

where the $\alpha_i$, the Schmidt coefficients, are positive real numbers in non-increasing order, the $|e_i\rangle$ are orthonormal in $H_1$, and the $|f_i\rangle$ are orthonormal in $H_2$. The Schmidt decomposition is a reinterpretation of the singular value decomposition under the Choi-Jamiolkowski isomorphism (hence the $|e_i\rangle$ and $|f_i\rangle$ depend on $|\phi\rangle$). The Choi-Jamiolkowski isomorphism essentially identifies $|\phi\rangle\langle\psi|$ with $|\phi\rangle \otimes |\psi\rangle$ [12, 47]. The number $m$ is called the Schmidt rank of $|\phi\rangle$ and ranges from 1 to $\min\{\dim(H_1), \dim(H_2)\}$, with rank 1 equivalent to separability. We often write $|e_i f_i\rangle := |e_i\rangle \otimes |f_i\rangle$. Gurvits and Barnum [4] note that for a vector with a given Schmidt decomposition

\[ |\phi\rangle = \sum_{i=1}^n \alpha_i |e_i f_i\rangle \in \mathbb{C}^n \otimes \mathbb{C}^n \sim \mathbb{C}^d, \]

the eigenvalues of the partial transpose of $|\phi\rangle\langle\phi|$ consist of $\{\alpha_i^2,\ \alpha_i\alpha_j,\ -\alpha_i\alpha_j : 1 \le i \ne j \le n\}$. A consequence is that for pure states PPT implies separability. That is, if $|\phi\rangle\langle\phi| \in \mathcal{PPT}$ then $|\phi\rangle$ must have Schmidt rank 1 and hence $|\phi\rangle\langle\phi| \in \mathcal{S}$. This also leads to a straightforward check for when a mixture of such states with the maximally mixed state is PPT. For example, if $|\phi\rangle$ has only two nonzero Schmidt coefficients, both equal to $1/\sqrt{2}$,

\[ |\phi\rangle = \frac{1}{\sqrt{2}}\left(|e_1 f_1\rangle + |e_2 f_2\rangle\right), \tag{2.4.1} \]

then letting $\Phi = |\phi\rangle\langle\phi|$, we see that $\Phi^\Gamma$ has $-1/2$ as an eigenvalue. So

\begin{align*}
\Phi^\Gamma + \frac{1}{2} I &\ge 0 \tag{2.4.2} \\
\Phi^\Gamma + \frac{d}{2}\, I/d &\ge 0 \tag{2.4.3} \\
\lambda\Phi^\Gamma + (1 - \lambda) I/d &\ge 0 \tag{2.4.4} \\
\therefore\ \Phi_\lambda := \lambda\Phi + (1 - \lambda) I/d &\in \mathcal{PPT} \tag{2.4.5}
\end{align*}

where $\lambda = 2/(d+2)$. Furthermore, any value of $\lambda \in (2/(d+2), 1]$ will give $\Phi_\lambda \notin \mathcal{PPT}$; in other words $\|\Phi\|_{\mathcal{PPT}} = \frac{d+2}{2}$. Since, as is easily seen, $-1/2$ is the largest (in absolute value) negative eigenvalue of $(|\phi\rangle\langle\phi|)^\Gamma$ that is possible, the inequality $\||\phi\rangle\langle\phi|\|_{\mathcal{PPT}} \le \frac{d+2}{2}$ holds for any pure state $|\phi\rangle\langle\phi|$. (For example, if we consider the maximally entangled vector $|\phi\rangle$, i.e., with all $\alpha_i$ equal to $1/\sqrt{n}$, the negative eigenvalues will be $-1/n$ and the same argument will lead to $\||\phi\rangle\langle\phi|\|_{\mathcal{PPT}} = n + 1 = \sqrt{d} + 1$, which is smaller than $(d+2)/2 = (n^2+2)/2$ for $n > 2$; this counterintuitive fact was noticed by Gurvits and Barnum [32].) Finally, we conclude that

\[ \max_{\rho \in \mathcal{D}} \|\rho\|_{\mathcal{PPT}} = \frac{d+2}{2} \tag{2.4.6} \]

since the maximum must be attained on an extreme point of $\mathcal{D}$, i.e., on a pure state.
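The computation above can be mirrored in a few lines. The sketch below takes the eigenvalue description of the partial transpose quoted above as given and derives the PPT gauge of a pure state from it; the function names are our own, and the closed form $1 - d\,\mu_{\min}$ (with $\mu_{\min}$ the smallest eigenvalue of the partial transpose) is simply the condition $\lambda\mu_{\min} + (1-\lambda)/d \ge 0$ solved for $1/\lambda$. The case $n = 3$ is used as an illustration.

```python
import math


def partial_transpose_spectrum(alphas):
    """Eigenvalues of the partial transpose of |phi><phi| on C^n (x) C^n,
    for Schmidt coefficients `alphas`, using the formula quoted in the text:
    alpha_i^2 for each i, and +/- alpha_i * alpha_j for each pair i < j."""
    n = len(alphas)
    eigs = [a * a for a in alphas]
    for i in range(n):
        for j in range(i + 1, n):
            eigs.append(alphas[i] * alphas[j])
            eigs.append(-alphas[i] * alphas[j])
    return eigs


def ppt_gauge_pure(alphas):
    """Gauge of |phi><phi| with respect to PPT (origin at I/d).
    lambda*mu_min + (1 - lambda)/d >= 0  <=>  1/lambda >= 1 - d*mu_min."""
    d = len(alphas) ** 2
    mu_min = min(partial_transpose_spectrum(alphas))
    return 1.0 - d * mu_min


n = 3
d = n * n
two_level = [1 / math.sqrt(2), 1 / math.sqrt(2), 0.0]  # the vector (2.4.1)
max_entangled = [1 / math.sqrt(n)] * n

g_two = ppt_gauge_pure(two_level)      # expected (d + 2) / 2
g_max = ppt_gauge_pure(max_entangled)  # expected n + 1

# Smallest eigenvalue of the partial transpose of Phi_lambda at lambda = 2/(d+2).
lam = 2 / (d + 2)
mixture_min = lam * min(partial_transpose_spectrum(two_level)) + (1 - lam) / d
print(g_two, g_max, mixture_min)
```

For $n = 3$ this reproduces the observation that the maximally entangled state has a smaller PPT gauge than the state built from (2.4.1), and that the mixture $\Phi_\lambda$ sits exactly on the boundary of the PPT set at $\lambda = 2/(d+2)$.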

Obtaining sharp bounds on $\|\rho\|_{\mathcal{S}}$ is a little more delicate. The following follows from the results of Gurvits and Barnum [32]:

\[ \frac{d+2}{2} \le \max_{\rho \in \mathcal{D}} \|\rho\|_{\mathcal{S}} \le d - 1. \tag{2.4.7} \]

The first inequality follows from (2.4.6) since $\|\cdot\|_{\mathcal{S}} \ge \|\cdot\|_{\mathcal{PPT}}$. The second inequality is a consequence of the fact that the (Hilbert-Schmidt) inradius of $\mathcal{S}$ is $\frac{1}{\sqrt{d(d-1)}}$, the same as the inradius of $\mathcal{D}$; since the outradius of $\mathcal{D}$ is $\sqrt{\frac{d-1}{d}}$, the value of $\|\rho\|_{\mathcal{S}}$ for $\rho \in \mathcal{D}$ does not exceed the ratio of these two quantities, which is $d - 1$.
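The last step is a one-line computation, which we record as a sketch. Here $r$ and $R$ are our own shorthand for the inradius of $\mathcal{S}$ and the outradius of $\mathcal{D}$, and $B$ denotes the unit Hilbert-Schmidt ball of trace 0 self-adjoint matrices.

```latex
% Shorthand (ours): r = inrad(S), R = outrad(D), B = unit Hilbert-Schmidt ball.
\begin{align*}
I/d + rB \subseteq \mathcal{S}, \quad \mathcal{D} \subseteq I/d + RB
  \ &\Longrightarrow\ \mathcal{D} - I/d \subseteq \frac{R}{r}\,(\mathcal{S} - I/d)
  \ \Longrightarrow\ \max_{\rho \in \mathcal{D}} \|\rho\|_{\mathcal{S}} \le \frac{R}{r}, \\
\frac{R}{r} &= \sqrt{\frac{d-1}{d}} \cdot \sqrt{d(d-1)} = \sqrt{(d-1)^2} = d - 1.
\end{align*}
```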

To determine the maximum from (2.4.7) precisely requires being able to calculate, or at least to estimate in a precise way, $\||\phi\rangle\langle\phi|\|_{\mathcal{S}}$ for an arbitrary vector $|\phi\rangle$. While calculating $\|\cdot\|_{\mathcal{S}}$ is in general difficult [31], the task is manageable if all non-zero Schmidt coefficients of $|\phi\rangle$ are equal, as in that case $\||\phi\rangle\langle\phi|\|_{\mathcal{S}} = \||\phi\rangle\langle\phi|\|_{\mathcal{PPT}}$; in other words, mixtures of such states with $I/d$ are separable if and only if they are PPT. This is well-documented in the literature for the maximally entangled vector (see, e.g., [43]) and routinely follows in other cases. For example, let $|\phi\rangle$ be given by (2.4.1) and the mixtures $\Phi_\lambda$ by (2.4.5), and denote by $P$ and $Q$ the (orthogonal) projections from $\mathbb{C}^n$ onto the spans of respectively $\{e_1, e_2\}$ and $\{f_1, f_2\}$, each of which can be identified with $\mathbb{C}^2$. Then $|\phi\rangle$ may be thought of as a maximally entangled vector on $\mathbb{C}^2 \otimes \mathbb{C}^2$ and so its mixtures with the identity on $\mathbb{C}^2 \otimes \mathbb{C}^2$ are separable iff they are PPT (in this particular case $n = 2$ the same conclusion can be obtained from the Peres-Horodecki-Størmer-Woronowicz criteria). In the framework of the larger space $\mathbb{C}^n \otimes \mathbb{C}^n$ such mixtures can be represented as $\tilde{\Phi}_\lambda := (P \otimes Q)\Phi_\lambda(P \otimes Q)$ (cf. (2.4.5)) and it is an easy exercise to show that $\Phi_\lambda$ is separable iff $\tilde{\Phi}_\lambda$ is. On the one hand, any separable representation $\Phi_\lambda = \sum_k \sigma_k \otimes \sigma_k'$ leads to a separable representation $\tilde{\Phi}_\lambda = \sum_k P\sigma_k P \otimes Q\sigma_k' Q$. On the other hand, $\Phi_\lambda = \tilde{\Phi}_\lambda + \frac{1-\lambda}{d}(P \otimes Q' + P' \otimes Q + P' \otimes Q')$, where $P' = I - P$, $Q' = I - Q$ are projections of $\mathbb{C}^n$ that are complementary to $P, Q$; since the last three terms are obviously separable, separability of $\tilde{\Phi}_\lambda$ implies that of $\Phi_\lambda$.

The above discussion is an instance of the following general problem. Suppose $\rho$ is a state on some “small” $m \times m$ system (for example, $3 \times 3$). If $n > m$, $\rho$ may be considered as a state on a larger $n \times n$ system via the embedding $\rho \mapsto \tilde{\rho} := \rho \oplus 0$, where $0$ is the zero matrix of the appropriate size. We now ask how properties related to separability or PPT of $\rho$ relate to those of $\tilde{\rho}$.

On the qualitative level, it is easy to see that $\rho$ belongs to $\mathcal{PPT}$ (resp. $\mathcal{S}$) iff $\tilde{\rho}$ does. (This is immediate from the definition for the PPT property, and for separability it follows by considerations similar to, but simpler than, the above.) However, when we focus on quantitative measures induced by, say, $\|\cdot\|_{\mathcal{S}}$ or a $p$-Schatten norm, the situation becomes slightly more complicated.

The analysis presented above shows that the calculation of $\|\rho\|_{\mathcal{S}}$ is equivalent to determining the smallest $s \ge 0$ for which the (non-normalized) mixture $\rho + sI$ is separable, and that, on the other hand, $\rho + sI$ is separable if and only if $\tilde{\rho} + sI$ is. Thus, while generally we may have $\|\rho\|_{\mathcal{S}} \ne \|\tilde{\rho}\|_{\mathcal{S}}$ (since $I/d$ and not $I$ is a state, the dimension $d$ of the system ultimately enters the considerations), the relationship between the two is fairly straightforward.

If we replace gauges by $p$-Schatten distances in the argument, the situation is similar. For example, if $\rho \in \mathcal{D}(\mathbb{C}^m \otimes \mathbb{C}^m)$ and $\sigma$ is a separable approximant of $\rho$, then $\tilde{\sigma}$ is a separable approximant of $\tilde{\rho} \in \mathcal{D}(\mathbb{C}^n \otimes \mathbb{C}^n)$ with $\|\tilde{\rho} - \tilde{\sigma}\|_p = \|\rho - \sigma\|_p$. On the other hand, if $\tau$ is a separable approximant of $\tilde{\rho}$, then (in the notation analogous to the one used earlier) $\tau_0 := (P \otimes Q)\tau(P \otimes Q)$ is a separable approximant of $\rho$ with $\|\rho - \tau_0\|_p \le \|\tilde{\rho} - \tau\|_p$. However, it is possible that $\operatorname{tr}(\tau_0) < \operatorname{tr}(\tau)$ and so, at least for now, we can only infer that the distances of $\rho$ and $\tilde{\rho}$ to the separable cones are the same. Still, this allows us to draw some interesting conclusions. For example, since it is known [12, 42] that there are $3 \times 3$ states with the PPT property that are not separable, it follows that there is $c > 0$ such that, for all $p \in [1, \infty]$ and all $n \times n$ systems with $n \ge 3$,

\[ \max_{\rho \in \mathcal{PPT}} \inf_{\sigma \in \mathcal{S}} \|\rho - \sigma\|_p \ge c. \tag{2.4.8} \]

In other words, the Hausdorff distances (induced by the Schatten $p$-norm) between the sets $\mathcal{PPT}$ and $\mathcal{S}$ admit, for $n \ge 3$, a universal strictly positive lower bound. Finding optimal bounds (for fixed $n$ or in the asymptotic regime), which may and probably will depend on $p$, is an interesting question, but we will not pursue it here.

For $p = 1$ it is actually easy to see that the distances of $\rho$ and $\tilde{\rho}$ to $\mathcal{S}$ (in the appropriate dimension) are the same. Indeed, denote $P_0 = P \otimes Q$, $P_1 = P \otimes Q'$, $P_2 = P' \otimes Q$, $P_3 = P' \otimes Q'$. Next, for a given approximant $\tau \in \mathcal{S}(\mathbb{C}^n \otimes \mathbb{C}^n)$ of $\tilde{\rho}$, set $\tau_k = P_k \tau P_k$ and $\tau' := \sum_{k=0}^3 \tau_k$. Then $\|\tilde{\rho} - \tau'\|_1 \le \|\tilde{\rho} - \tau\|_1$ (since replacing a matrix by its block diagonal part does not increase its $p$-norm). Moreover $\|\tilde{\rho} - \tau'\|_1 = \|\rho - \tau_0\|_1 + \sum_{k=1}^3 \|\tau_k\|_1$, and so if $\tau'' = \tau_0 + sI \in \mathcal{S}(\mathbb{C}^m \otimes \mathbb{C}^m)$, then

\[ \|\rho - \tau''\|_1 \le \|\tilde{\rho} - \tau'\|_1 \le \|\tilde{\rho} - \tau\|_1 \]

(the reason for the first inequality above is that the condition $\operatorname{tr}(\tau_0 + sI) = 1$ implies $\|sI\|_1 = \operatorname{tr}(sI) = \sum_{k=1}^3 \operatorname{tr}(\tau_k) = \left\|\sum_{k=1}^3 \tau_k\right\|_1$, and it remains to apply the triangle inequality for the 1-norm). Accordingly, the distance of $\rho$ from $\mathcal{S}(\mathbb{C}^m \otimes \mathbb{C}^m)$ does not exceed the distance of $\tilde{\rho}$ from $\mathcal{S}(\mathbb{C}^n \otimes \mathbb{C}^n)$; we recall that the opposite inequality was easy and generally true. The analysis of other cases (i.e., $p > 1$) is more involved and so we will then just appeal to (2.4.8).

2.5 Levy’s Lemma and its Applications to p-Schatten Norms

As hinted earlier, some of the results we aim at are of the following form: given some natural metric (or gauge, or other functional) $\delta$ on the set of states (or on the space of matrices), what is

\[ \max\{\delta(\rho, \mathcal{S}) : \rho \in \mathcal{PPT}\}? \]

While the answer obviously depends on the metric $\delta$, we will see that, in some contexts (for example, for $p$-Schatten metrics), that dependence is often very simple and transparent. This follows from a well-known concentration result which is a consequence of Levy's isoperimetric inequality, which we will use to obtain information about the width (in most directions) of the PPT body. We state the result here (cf. [70]).

Lemma 2.5.1. If $f : S^{m-1} \to \mathbb{R}$ is an $L$-Lipschitz function, then for every $t > 0$

\[ \mathbb{P}(\{|f - M| > t\}) \le C_1 \exp(-c_1 m t^2 / L^2), \]

where $M$ is either the median or the mean of $f$, and $C_1, c_1 > 0$ are absolute constants.

Remark In [70] the Lemma is stated for $S^{m+1}$ rather than $S^{m-1}$ and with $C_1 = \sqrt{\pi/2}$ and $c_1 = 1/2$ when $M$ is the median of $f$. However, we may always replace $m + 1$ by $m - 1$ by adjusting the constants. Moreover, we will show in Appendix A that the Lemma (again, when $M$ is the median of $f$) holds in fact with better (optimal) constants $C_1 = 1$ and $c_1 = 1/2$ and without the need to change the dimension to $m + 1$.
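To get a feeling for the strength of the bound in Lemma 2.5.1, one can simply evaluate its right-hand side. The sketch below is illustrative: the function name is ours, the default constants $C_1 = 1$, $c_1 = 1/2$ are the ones claimed in the Remark for the median, and the sample values of $m$ and $t$ are arbitrary.

```python
import math


def levy_tail_bound(m, t, lipschitz, C1=1.0, c1=0.5):
    """Right-hand side C1 * exp(-c1 * m * t^2 / L^2) of Lemma 2.5.1,
    capped at 1 since it bounds a probability.
    Defaults C1 = 1, c1 = 1/2 are the constants quoted in the Remark (median case)."""
    value = C1 * math.exp(-c1 * m * t * t / (lipschitz * lipschitz))
    return min(1.0, value)


# A 1-Lipschitz function on a sphere in dimension m = 10000 (hypothetical values):
bound_small_dim = levy_tail_bound(m=100, t=0.05, lipschitz=1.0)
bound_large_dim = levy_tail_bound(m=10000, t=0.05, lipschitz=1.0)
print(bound_small_dim, bound_large_dim)
```

For fixed $t$ and $L$ the bound decays exponentially in the dimension $m$, which is what makes the lemma useful on the Hilbert-Schmidt sphere, where $m$ is of order $d^2$.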

We will be using this lemma in a slightly different setting than the one described above. Our goal here is to estimate the ratios of the trace-class norm and the operator norm to the Hilbert-Schmidt norm in a generic direction. Let $S_{HS} = \{x : \|x\|_2 = 1\}$ be the Hilbert-Schmidt sphere in the space of $d \times d$ self-adjoint matrices, endowed with the usual rotationally invariant measure. Then we have

Corollary 2.5.2.

\[ \|x\|_1 \approx \frac{8}{3\pi}\sqrt{d}, \quad \text{for most } x \in S_{HS}. \]

Corollary 2.5.3.

\[ \|x\|_\infty \approx \frac{2}{\sqrt{d}}, \quad \text{for most } x \in S_{HS}. \]

Remarks (1) These corollaries are stated rather informally. What is meant by the statement of Corollary 2.5.2 is that for all $\epsilon > 0$

\[ \lim_{d \to \infty} \mathbb{P}\left(\left|\frac{\|x\|_1}{\frac{8}{3\pi}\sqrt{d}} - 1\right| < \epsilon\right) = 1, \]

and similarly for Corollary 2.5.3.

(2) Analogous results can be obtained for any p ∈ (1, ∞).

(3) Identical results hold when we replace $S_{HS}$ by its intersection with the subspace of trace 0 matrices. This is the vector space in which all the sets in question reside and so this is the minimal space that may be considered. Using the relationship between the uniform measure on the sphere and the Gaussian measure on the entire space given by (2.1.4) we can consider GUE (or $\mathrm{GUE}^0$) matrices. Indeed, if $X_d$ is GUE, then

\[ X_d = X_d^0 + \left\langle X_d, I/\sqrt{d} \right\rangle_{HS} I/\sqrt{d} = X_d^0 + \frac{1}{d}\operatorname{tr}(X_d)\, I = X_d^0 + \gamma I/\sqrt{d}. \tag{2.5.1} \]

The first equality is the definition of $X_d^0$ as the projection of $X_d$ onto the subspace of trace 0 matrices. The important features of this decomposition are that $X_d^0$ is $\mathrm{GUE}^0$, $\gamma \sim N(0, 1)$ and they are independent. It will be apparent from the calculations that follow that the term $\gamma I/\sqrt{d}$ is insignificant. Similarly, one may consider the Hilbert-Schmidt sphere in spaces of not-necessarily self-adjoint $d \times d$ matrices, real or complex (the setting we do not use).

Proof of corollary 2.5.2 In light of Levy's lemma, our task here is to obtain an estimate for the mean of the trace-class norm on the Hilbert-Schmidt sphere. We let $X_d$ be a GUE matrix. Then $A_d = \frac{1}{\sqrt{d}} X_d$ is suitable for application of the semi-circle law [103, 104, 95, 38]. We wish to compute $\mathbb{E}\|A_d\|_1 = \mathbb{E}\operatorname{tr}|A_d|$. (Here $|A| := (A^\dagger A)^{1/2}$.) We use the weak convergence of $(A_d)$ to the semi-circular distribution. That is, for any bounded and continuous function $f$ we have:

\[ \mathbb{E}\left(\frac{1}{d}\operatorname{tr} f(A_d)\right) \to \int f(x)\, d\mu_{sc}(x), \]

where $\mu_{sc}$ is the standard semi-circular distribution, i.e., the measure with density $(2\pi)^{-1}\sqrt{4 - x^2}\,\chi_{(-2,2)}(x)$. The function $f = |\cdot|$ is not bounded, but since we have strong concentration of the eigenvalues of $A_d$ (see, e.g., [18], sections 2.2 and 2.3), we can truncate the function $f = |\cdot|$ without affecting the convergence above. So we can calculate

\[ \int |x|\, d\mu_{sc}(x) = \frac{8}{3\pi} = \lim_{d \to \infty} \frac{1}{d}\, \mathbb{E}\|A_d\|_1. \]

Consequently

\[ \mathbb{E}\|X_d\|_1 = \mathbb{E}(\operatorname{tr}|X_d|) = \sqrt{d}\, \mathbb{E}(\operatorname{tr}|A_d|) \sim \frac{8 d^{3/2}}{3\pi}. \tag{2.5.2} \]

If now $X$ is a random matrix distributed uniformly on $S_{HS}$, the relation (2.1.4) between the spherical and Gaussian means gives

\[ \mathbb{E}\|X\|_1 = \gamma_{d^2-1}^{-1}\, \mathbb{E}\|X_d\|_1 \sim \frac{8}{3\pi}\sqrt{d}. \tag{2.5.3} \]

Since the Lipschitz constant in this situation is $\sqrt{d}$ (this is just saying that $\|\cdot\|_1 \le \sqrt{d}\,\|\cdot\|_2$), Levy's lemma 2.5.1 shows that, for large enough $d$ (depending on $\epsilon$),

\[ (1 - \epsilon)\frac{8}{3\pi}\sqrt{d} \le \|x\|_1 \le (1 + \epsilon)\frac{8}{3\pi}\sqrt{d} \tag{2.5.4} \]

with probability greater than

\[ 1 - C_1 \exp\left(-c_2 d^2 \epsilon^2\right). \tag{2.5.5} \]

Remark The reason why we may only be sure that (2.5.4) holds for large enough $d$, depending on $\epsilon$, is that we compare $\|x\|_1$ to $\frac{8}{3\pi}\sqrt{d}$ and not directly to the median (or mean) of $\|x\|_1$. These two quantities are equivalent only asymptotically (2.5.3), and so to achieve $\epsilon$-accuracy we may need to consider only large $d$'s. This qualification could conceivably be avoided if we had sharp estimates on the speed of convergence implicit in (2.5.3).
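The value $8/(3\pi)$ of the first absolute moment of the semi-circular distribution, used in the proof of Corollary 2.5.2, can be confirmed by direct numerical integration. The sketch below uses a plain composite midpoint rule; the function names and the number of steps are our own choices, and the agreement is only up to the quadrature error.

```python
import math


def semicircle_density(x):
    """Density (2*pi)^(-1) * sqrt(4 - x^2) on (-2, 2), zero elsewhere."""
    return math.sqrt(max(0.0, 4.0 - x * x)) / (2.0 * math.pi)


def integrate(f, a, b, steps=200000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))


total_mass = integrate(semicircle_density, -2.0, 2.0)
first_abs_moment = integrate(lambda x: abs(x) * semicircle_density(x), -2.0, 2.0)
target = 8.0 / (3.0 * math.pi)
print(total_mass, first_abs_moment, target)
```

The computed mass is close to 1 and the computed first absolute moment is close to $8/(3\pi) \approx 0.8488$, in agreement with the constant appearing in (2.5.2) and (2.5.3).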

Proof of corollary 2.5.3 We proceed in a similar fashion as above. That is, we let $X_d \in M_d^{sa}$ be chosen according to the standard Gaussian measure on $M_d^{sa}$ and apply the semi-circle law to $A_d = \frac{1}{\sqrt{d}} X_d$. This time we wish to compute $\mathbb{E}\|A_d\|_\infty$. As already pointed out in section 2.2, it is known ([28], or see [34] or [18]) that

\[ \lim_{d \to \infty} \mathbb{E}\|A_d\|_\infty = 2 \]

and so

\[ \mathbb{E}\|X_d\|_\infty \sim 2\sqrt{d}. \tag{2.5.6} \]

For $X \in S_{HS}$ we then have

\[ \mathbb{E}\|X\|_\infty = \gamma_{d^2-1}^{-1}\, \mathbb{E}\|X_d\|_\infty \sim \frac{2}{\sqrt{d}}. \tag{2.5.7} \]

Since the Lipschitz constant for $\|\cdot\|_\infty$ is 1, Levy's lemma gives us

\[ (1 - \epsilon)\frac{2}{\sqrt{d}} \le \|x\|_\infty \le (1 + \epsilon)\frac{2}{\sqrt{d}} \]

with probability greater than

\[ 1 - C_1 \exp\left(-c_4 d\, \epsilon^2\right). \tag{2.5.8} \]

As before, this holds at least for large enough $d$ (depending on $\epsilon$).

Remark: From equations (2.5.2) and (2.5.6) the expected values of $\|X_d\|_1$ and $\|X_d\|_\infty$ are asymptotically $\frac{8 d^{3/2}}{3\pi}$ and $2\sqrt{d}$, respectively. Accordingly, the same is true for $X_d^0$; note that by (2.5.1), the 1 and $\infty$ norms of the difference $X_d - X_d^0$ are of order $\sqrt{d}$ and $1/\sqrt{d}$, respectively. A similar estimate for the Hilbert-Schmidt norm is even simpler.

2.6 Concentration for the Support Functions of PPT and S

Now we calculate the Lipschitz constants of $h_{\mathcal{PPT}}$ and $h_{\mathcal{S}}$ and apply Levy's lemma in a similar fashion as above. The Lipschitz constant of the support function of a convex body $K$ (centered at the origin) is simply the outradius of $K$. This is because each function $f_x(u) = \langle x, u \rangle$ has Lipschitz constant $\|x\|_2$, and as $h_K(u) = \max_{x \in K} \langle x, u \rangle$, the Lipschitz constant of $h_K$ is $\max_{x \in K} \|x\|_2$. That is, the Lipschitz constant of $h_K$ is the radius of the smallest ball that contains $K$, or the outradius of $K$. The bodies in question are all contained in a ball of radius $\sqrt{1 - 1/d}$ (the intersection of the trace 1 hyperplane and the Hilbert-Schmidt ball), centered at the maximally mixed state, $I/d$. The extreme points of $\mathcal{D}$ consist of pure states, and the extreme points of $\mathcal{S}$ consist of pure (separable) states; the latter are also among the extreme points of $\mathcal{PPT}$. Since all pure states have Hilbert-Schmidt norm 1, it follows that the outradius of $\mathcal{PPT}$, $\mathcal{S}$ (and $\mathcal{D}$) is $\sqrt{1 - 1/d}$. Levy's lemma then tells us that

\[ (1 - \epsilon)\, w(\mathcal{PPT}) \le h_{\mathcal{PPT}} \le (1 + \epsilon)\, w(\mathcal{PPT}) \]

with probability greater than

\[ 1 - C_1 \exp\left(-c_1 w(\mathcal{PPT})^2 \epsilon^2 d^2\right). \]

Likewise, we have

\[ (1 - \epsilon)\, w(\mathcal{S}) \le h_{\mathcal{S}} \le (1 + \epsilon)\, w(\mathcal{S}) \]

with probability greater than

\[ 1 - C_1 \exp\left(-c_1 w(\mathcal{S})^2 \epsilon^2 d^2\right). \]

Using the known estimates (2.2.11), (2.2.12) for the mean widths of the bodies in question we obtain estimates on the measure of directions in which the width of $\mathcal{PPT}$ is significantly larger than that of $\mathcal{S}$. We combine this with the probability estimates from Levy's lemma and we obtain

Proposition 2.6.1. Consider the support functions $h_{\mathcal{PPT}}$ and $h_{\mathcal{S}}$ as random variables on the probability space $(S_{HS}, du)$. Then we have

\[ (1 - \epsilon)\frac{1}{4\sqrt{d}} \le h_{\mathcal{PPT}} \le (1 + \epsilon)\frac{2}{\sqrt{d}} \tag{2.6.1a} \]

\[ (1 - \epsilon)\frac{1}{6 d^{3/4}} \le h_{\mathcal{S}} \le (1 + \epsilon)\frac{4}{d^{3/4}} \tag{2.6.1b} \]

with probabilities respectively greater than

\[ 1 - C_1 \exp\left(-c_3 \epsilon^2 d\right) \tag{2.6.2a} \]

\[ 1 - C_1 \exp\left(-c_4 \epsilon^2 d^{1/2}\right). \tag{2.6.2b} \]

Remark In the context of Proposition 2.6.1, we have no closed expression for the mean widths of the sets in question, and the lower and upper bounds that we employ are generally precise only “up to a universal constant.” This is why the left and right hand sides above differ by more than just the $1 - \epsilon$ and $1 + \epsilon$ terms; this is not ideal, but still adequate for our purposes. If we want to close the gap somewhat, we can use the asymptotic lower bound on $w(\mathcal{PPT})$ from (2.2.11) and arrive at the following statement:

Given $\epsilon > 0$, the probability that $h_{\mathcal{PPT}} \ge (1 - \epsilon)\frac{1}{2\sqrt{d}}$ tends to 1 as $d \to \infty$.

To obtain a quantitative bound on the speed of convergence in the statement above, we would need a quantitative version of the bound $w(\mathcal{PPT}) \gtrsim \frac{1}{2\sqrt{d}}$.

2.7 Geometric Banach-Mazur Distance between PPT and S

Now we look at the geometric Banach-Mazur distance of PPT states from S. Since $S \subset \mathrm{PPT}$ (with strict inclusion except for 2 × 2, 2 × 3 and 3 × 2 systems) and $S \not\subset \alpha\,\mathrm{PPT}$ if $\alpha < 1$, the geometric Banach-Mazur distance between PPT and S equals $\max_{\rho\in\mathrm{PPT}} \|\rho\|_S$. A lower bound on this quantity can be easily derived from the estimates on vrad(S) and vrad(PPT) given earlier by integrating in polar coordinates.

Consider equation (2.1.6):

\[ \mathrm{vrad}(K) = \left(\int_{S^{n-1}} \|u\|_K^{-n}\, du\right)^{1/n}. \]

Accordingly, if $\mathrm{vrad}(K) \ge R\,\mathrm{vrad}(L)$, there is a direction $u$ such that

\[ \|u\|_K^{-1} \ge R\,\|u\|_L^{-1} \iff \|u\|_L \ge R\,\|u\|_K. \]

In particular, there is an $x \in \partial K$ (hence with $\|x\|_K = 1$) such that $\|x\|_L \ge R$. In our context, since $\mathrm{vrad}(\mathrm{PPT})/\mathrm{vrad}(S) \ge d^{1/4}/16$ by (2.2.8) and (2.2.9), it follows that there is $\rho \in \mathrm{PPT}$ such that $\|\rho\|_S \ge d^{1/4}/16$. In other words, we proved

Proposition 2.7.1. The geometric Banach-Mazur distance between PPT and S is at least $d^{1/4}/16$.

Asymptotically, the estimate can be improved to $d^{1/4}/8$. All these bounds become nontrivial, i.e., > 1, only for fairly large d. Since we do know that the ratio is > 1 starting with $d = 9 = 3^2$, bounds with better constants (or just plainly sharper bounds) than those in section 2.2 would be useful. We do not know whether the order $d^{1/4}$ is optimal; it would be interesting to clarify that point.
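The existence step used in the proof of Proposition 2.7.1 can be spelled out as a short argument by contradiction. The following block is a sketch of that step only; it uses the polar-coordinates formula for vrad quoted from (2.1.6) and does not depend on how $du$ is normalized.

```latex
Suppose that $\|u\|_K^{-1} < R\,\|u\|_L^{-1}$ for every $u \in S^{n-1}$. Then
\[
\mathrm{vrad}(K)
  = \Big(\int_{S^{n-1}} \|u\|_K^{-n}\,du\Big)^{1/n}
  < \Big(\int_{S^{n-1}} R^{n}\,\|u\|_L^{-n}\,du\Big)^{1/n}
  = R\,\mathrm{vrad}(L),
\]
which contradicts $\mathrm{vrad}(K) \ge R\,\mathrm{vrad}(L)$. Hence some direction $u$
satisfies $\|u\|_K^{-1} \ge R\,\|u\|_L^{-1}$, and the point $x = u/\|u\|_K \in \partial K$
gives $\|x\|_L = \|u\|_L/\|u\|_K \ge R$.
```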

2.8 Hausdorff Distance between PPT and S in p-Schatten Metrics

Next we want to quantify the difference between the sets PPT and S in terms of the p-Schatten distance. Ideally, we would like to find

\[ \max_{\rho\in\mathrm{PPT}} \mathrm{dist}_p(\rho, S) \]

for (say) p = 1, 2, ∞, i.e., the Hausdorff distance between PPT and S induced by the corresponding Schatten metric. This seems rather hard; however, nontrivial lower bounds for these quantities, some of them surprisingly sharp, can be derived from the volumetric estimates listed earlier. The case p = 2 is particularly straightforward.

First, if D > 2, then the inclusion S ⊂ PPT is strict and so there exist directions $u \in S^{n-1}$ such that $h_{\mathrm{PPT}}(u) > h_S(u)$. Given such a direction $u$, denote $\alpha := h_{\mathrm{PPT}}(u) - h_S(u)$ and choose $\rho \in \mathrm{PPT}$ for which $\langle \rho, u\rangle = h_{\mathrm{PPT}}(u)$. Then, for any $\sigma \in S$ we have

\[ \|\rho - \sigma\|_2 \ge \langle \rho - \sigma, u\rangle = h_{\mathrm{PPT}}(u) - \langle \sigma, u\rangle \ge h_{\mathrm{PPT}}(u) - h_S(u) = \alpha. \]

Since the average of $h_{\mathrm{PPT}}(u) - h_S(u)$ equals $\alpha_0 := w(\mathrm{PPT}) - w(S)$, there is $u$ for which $h_{\mathrm{PPT}}(u) - h_S(u)$ is at least $\alpha_0$. Using the bounds (2.2.11) and (2.2.12) on $w(\mathrm{PPT})$ and $w(S)$ we deduce

Proposition 2.8.1. For large d, $\max_{\rho\in\mathrm{PPT}} \mathrm{dist}_2(\rho, S)$ is at least of the order of $d^{-1/2}$. Moreover, the distance of order $d^{-1/2}$ is witnessed in most directions $u \in S_{HS}$.

In view of the observations from section 2.4 (most notably (2.4.8)), the first assertion is not optimal. However, since the computation leads to specific bounds for any dimension d, even weak estimates of this nature may conceivably be of interest for d in some appropriate range.

When we use the same argument for $p \ne 2$ we obtain instead

\[ \|u\|_q\, \|\rho - \sigma\|_p \ge \alpha, \quad\text{or}\quad \|\rho - \sigma\|_p \ge \frac{h_{\mathrm{PPT}}(u) - h_S(u)}{\|u\|_q} \tag{2.8.1} \]

where q is the exponent conjugate to p. The idea now is that, by Proposition 2.6.1 and Corollaries 2.5.2 and 2.5.3 (or their generalizations if we want to handle arbitrary values of p), the functions $h_{\mathrm{PPT}}$, $h_S$ and $\|\cdot\|_q$ are strongly concentrated around their means and so, for most directions $u \in S_{HS}$,

\[ \frac{h_{\mathrm{PPT}}(u) - h_S(u)}{\|u\|_q} \ge (1-\epsilon)\,\frac{w(\mathrm{PPT}) - w(S)}{\mathbb{E}\|u\|_q} \tag{2.8.2} \]

(as usual, for d large enough, and how large would depend on $\epsilon > 0$). However, if we are only concerned with the existence, the argument is much simpler. Indeed, it is easy to see that there exists $u$ such that

\[ \frac{h_{\mathrm{PPT}}(u) - h_S(u)}{\|u\|_q} \ge \frac{w(\mathrm{PPT}) - w(S)}{\mathbb{E}\|u\|_q} \tag{2.8.3} \]

holds. Suppose not; then for all $u$

\[ \mathbb{E}\|u\|_q\, \big(h_{\mathrm{PPT}}(u) - h_S(u)\big) < \|u\|_q\, \big(w(\mathrm{PPT}) - w(S)\big), \]

but taking the expectation of both sides yields an equality. So there must exist $u$ such that (2.8.3) holds. Either way, we get an estimate involving the quantity

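\[ \frac{\alpha_0}{\mathbb{E}\|u\|_q} = \frac{w(\mathrm{PPT}) - w(S)}{\mathbb{E}\|u\|_q} \ge \frac{\frac{1}{4\sqrt{d}} - \frac{4}{d^{3/4}}}{\mathbb{E}\|u\|_q} = \frac{1}{4\sqrt{d}\;\mathbb{E}\|u\|_q}\left(1 - \frac{16}{d^{1/4}}\right). \]

The above inequality is valid for all d. However, because of the $16/d^{1/4}$ term, meaningful conclusions can be drawn only for large d. Consequently, we may just as well use the asymptotic lower bound on $w(\mathrm{PPT})$ from (2.2.11) and (cf. the Remark following Proposition 2.6.1) arrive at the asymptotic bound

\[ \frac{w(\mathrm{PPT}) - w(S)}{\mathbb{E}\|u\|_q} \gtrsim \frac{1}{2\sqrt{d}\;\mathbb{E}\|u\|_q}. \]

When we specify p = 1 or p = ∞ (and so q = ∞ or q = 1), we can retrieve from (2.5.3) and (2.5.7) that

\[ \mathbb{E}\|u\|_1 \sim \frac{8}{3\pi}\sqrt{d} \quad\text{and}\quad \mathbb{E}\|u\|_\infty \sim \frac{2}{\sqrt{d}}. \]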

Combining these bounds with the preceding estimate and with (2.8.1) yields the following.

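Proposition 2.8.2. Let $\epsilon > 0$. Then, for d large enough (depending on $\epsilon$),

\[ \max_{\rho\in\mathrm{PPT}} \mathrm{dist}_1(\rho, S) \ge \frac{1-\epsilon}{4}, \qquad \max_{\rho\in\mathrm{PPT}} \mathrm{dist}_\infty(\rho, S) \ge (1-\epsilon)\,\frac{3\pi}{16}\,\frac{1}{d}. \]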

Moreover, in each case this distance is witnessed in most directions $u \in S_{HS}$.

Of the two bounds in the first assertion, the first one is more interesting: for large d, the set PPT sticks out of the set S in most directions by a trace distance of at least about 1/4. This should be compared with (2.4.8) and particularly with [5], where it is shown that the maximum (non-normalized) trace distance of a PPT state to S approaches 2 as the dimension becomes large; see also Proposition 2.9.2. While we do not obtain here the precise constant 2, our argument is based only on general geometric principles and, additionally and most importantly, we show that nearly as large a distance can be witnessed in most directions. When stated in the “most directions” form, the results from the propositions in this section give the correct order. Finally, our arguments show that the information on Hausdorff distances between PPT and S can be inferred from the knowledge of their global geometric invariants such as volume or mean width, cf. [5].
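The constants in Proposition 2.8.2 come from dividing the asymptotic bound $1/(2\sqrt{d}\,\mathbb{E}\|u\|_q)$ by the asymptotic values of $\mathbb{E}\|u\|_\infty$ and $\mathbb{E}\|u\|_1$ quoted above. The sketch below simply carries out that arithmetic, treating the asymptotic equivalences as equalities (an assumption made only for illustration); the function names are hypothetical.

```python
import math

def asymptotic_bound(p, d):
    """Leading-order lower bound 1 / (2 sqrt(d) E||u||_q) for p in {1, inf}."""
    sqrt_d = math.sqrt(d)
    if p == 1:              # conjugate exponent q = infinity
        mean_q_norm = 2 / sqrt_d
    elif p == math.inf:     # conjugate exponent q = 1
        mean_q_norm = 8 * sqrt_d / (3 * math.pi)
    else:
        raise ValueError("only p = 1 and p = inf are covered in this sketch")
    return 1 / (2 * sqrt_d * mean_q_norm)

def finite_d_factor(d):
    """The factor (1 - 16 / d^{1/4}) from the estimate valid for all d."""
    return 1 - 16 / d ** 0.25

print(asymptotic_bound(1, 10 ** 6), asymptotic_bound(math.inf, 10 ** 6))
print(finite_d_factor(10 ** 6))
```

The trace-norm value does not depend on $d$, while the operator-norm value decays like $1/d$; the finite-$d$ factor is positive only once $d$ exceeds $16^4$.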

2.9 Maximum Trace Distance from a PPT state to S

In [5], Beigi and Shor prove that the maximum (normalized) trace distance between S and PPT approaches 1 as the dimension of H goes to infinity. In fact, they arrive at the following more general fact: if C is any necessary but not sufficient criterion for separability such that ρ ⊗ σ satisfies C whenever ρ and σ both satisfy C, then there exist states that satisfy C whose trace distance to S is arbitrarily close to 1.

We can arrive at the same bound by considering tensor powers of certain states called private states. Private states arise from considerations in quantum cryptography and were introduced in [40, 41] (see also [76, 39]) to elucidate the relationship between entanglement and quantum security. A state on a Hilbert space $H_A \otimes H_{A'} \otimes H_B \otimes H_{B'}$ with $n_K := \dim(H_A) = \dim(H_B)$ is said to be a private state if it is of the form

\[ \gamma = \frac{1}{n_K} \sum_{i,j=0}^{n_K-1} |e_i f_i\rangle\langle e_j f_j|_{AB} \otimes U_i\, \sigma_{A'B'}\, U_j^{\dagger} \tag{2.9.1} \]

Here $\{e_i\}$ forms a basis for $H_A$ as does $\{f_i\}$ for $H_B$, and $e_i f_j = e_i \otimes f_j \in H_A \otimes H_B$. The state $\sigma_{A'B'}$ is an arbitrary state on $H_{A'} \otimes H_{B'}$ and each $U_i$ is an arbitrary unitary transform [39]. The set of all private states is denoted by PS and contains the set of all maximally entangled states. The subsystem $H_A \otimes H_B$ is called the key part while the subsystem $H_{A'} \otimes H_{B'}$ is called the shield part. A state on the system $H_A \otimes H_{A'} \otimes H_B \otimes H_{B'}$ is said to have a key in the basis $\{e_i f_j\}$ if, when purified to an eavesdropper system E, after measurement in the basis $\{e_i f_j\}$, and after tracing out $H_{A'} \otimes H_{B'}$, it becomes a tensor product of two states

\[ \frac{1}{n_K} \sum_{i=0}^{n_K-1} |e_i f_i\rangle\langle e_i f_i|_{AB} \otimes \rho_E, \]

the “perfectly correlated” state on $H_A \otimes H_B$ and a state on E. It turns out that the state on systems $H_A \otimes H_{A'} \otimes H_B \otimes H_{B'}$ has a key according to the above definition if and only if it is a private state of the form (2.9.1) (see [40, 41, 39] for details).

We cite the following lemma from [39].

Lemma 2.9.1. For all γ ∈ PS

\[ \inf_{\sigma\in S} \|\gamma - \sigma\|_1 \ge 2 - \frac{2}{n_K} \]

where $n_K$ is the dimension (of one factor) of the key system.

Suppose now that γ1 is a private state, γ2 another state, and consider a convex combination

ρ = (1 − p)γ1 + pγ2.

It then follows from Lemma 2.9.1 that for all σ ∈ S

\[ \|\rho - \sigma\|_1 \ge 2 - \frac{2}{n_K} - 2p. \tag{2.9.2} \]

Indeed, let σ ∈ S. By the triangle inequality and Lemma 2.9.1 we have

\[ \|\rho - \sigma\|_1 + \|\rho - \gamma_1\|_1 \ge \|\sigma - \gamma_1\|_1 \ge 2 - \frac{2}{n_K} \]

and it remains to notice that $\|\rho - \gamma_1\|_1 = \|p(\gamma_2 - \gamma_1)\|_1 \le 2p$.

Now, if $n_K = 2$, $n_S := \dim(H_{A'}) = \dim(H_{B'})$, and $p = \frac{1}{\sqrt{n_S} + 1}$, it was shown in [45] that, for appropriate $\gamma_1$, $\gamma_2$, the mixture ρ belongs to PPT (with respect to the “split” $(H_A \otimes H_{A'}) \otimes (H_B \otimes H_{B'})$) and in fact is an edge state, that is, ρ ∈ ∂PPT. In other words, we have ρ ∈ PPT such that

\[ \inf_{\sigma\in S} \|\rho - \sigma\|_1 \ge 1 - \frac{2}{\sqrt{n_S} + 1}. \tag{2.9.3} \]

Next, taking tensor powers (still with $n_K = 2$) and taking advantage of the properties of fidelity, namely (2.3.3) and (2.3.2), we can prove the following proposition.

Proposition 2.9.2. For ρ defined above, for k ≥ 2, and for all σ ∈ S

\[ \left\|\rho^{\otimes k} - \sigma\right\|_1 \ge 2 - \frac{2}{2^k} - 2\sqrt{1 - \left(\frac{\sqrt{n_S}}{\sqrt{n_S} + 1}\right)^{k}}. \]

Since it is apparent that the two last terms on the right hand side can be made as small (in absolute value) as we wish by appropriate choices of k and $n_S$, the above argument provides a method for producing states that are PPT, yet nearly as far from S as possible in the trace distance.
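The trade-off between $k$ and $n_S$ is easy to explore numerically. The sketch below evaluates the right hand side of (2.9.2) and, as an assumption about how the bound of Proposition 2.9.2 reads, the expression $2 - 2/2^k - 2\sqrt{1 - (\sqrt{n_S}/(\sqrt{n_S}+1))^k}$; the function names are hypothetical.

```python
import math

def mixing_weight(n_s):
    """The weight p = 1 / (sqrt(n_S) + 1) used for the PPT mixture."""
    return 1 / (math.sqrt(n_s) + 1)

def single_copy_bound(n_k, p):
    """Right hand side of (2.9.2): 2 - 2/n_K - 2p."""
    return 2 - 2 / n_k - 2 * p

def tensor_power_bound(k, n_s):
    """Assumed form of the bound in Proposition 2.9.2 (with n_K = 2)."""
    ratio = math.sqrt(n_s) / (math.sqrt(n_s) + 1)
    return 2 - 2 / 2 ** k - 2 * math.sqrt(1 - ratio ** k)

# For n_K = 2 the single-copy bound reduces to (2.9.3): 1 - 2/(sqrt(n_S) + 1).
print(single_copy_bound(2, mixing_weight(9)))
for k, n_s in [(2, 10 ** 4), (5, 10 ** 8), (10, 10 ** 12)]:
    print(k, n_s, tensor_power_bound(k, n_s))
```

Increasing $k$ shrinks the $2/2^k$ term but enlarges the square-root term, so $n_S$ has to grow together with $k$ for the bound to approach 2.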

These calculations rely on the fact that private states are far from separable states in the sense of Lemma 2.9.1 and that there are PPT states “reasonably close” to private states. Compared to that of [5], this construction is based on rather elementary principles and gives a (more) reasonable dependence of the distance from S on the dimensions involved.

In [5], a result similar to Proposition 2.9.2 above is offered as evidence that the sets PPT and S are of different shapes (see the abstract and the last paragraph of section IV in [5]), and contrasted with the information provided by Dvoretzky’s theorem, according to which all high-dimensional convex sets have (nearly) spherical sections.

In the preceding sections of this chapter we showed that estimates such as Proposition 2.9.2 do in fact follow from concentration of measure, the same phenomenon that underlies Dvoretzky’s theorem. The explanation for this somewhat confusing picture is that the radii of spherical sections of PPT and S are very different; this is the feature that most fundamentally distinguishes these two sets.

Interestingly, private states cannot be “too close” to PPT states. The following result is proved in [44].

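Proposition 2.9.3. For $H = (\mathbb{C}^2 \otimes \mathbb{C}^{n_S}) \otimes (\mathbb{C}^2 \otimes \mathbb{C}^{n_S})$, if ρ ∈ PPT and γ ∈ PS then

\[ \|\rho - \gamma\|_1 \ge \frac{1}{2\left(2 n_S \sqrt{n_S} + 1\right)}. \]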

This shows that the choice of ρ that led to the bound (2.9.3) cannot be radically improved.

SUMMARY

We use the information theoretic notion of relative entropy to describe convex bodies and define the relative entropy for convex bodies via (1.3.2). We elaborate on the connection given by Paouris and Werner [74] concerning the affine invariant $\Omega_K$ (1.2.4) by providing a new geometric characterization of the relative entropy of cone measures. In addition, using the mean width bodies for this characterization allows us to obtain these quantities using only first-order expansions of volume differences and no symmetry assumptions, rather than the second-order expansions and symmetry assumptions required when using the $L_p$-centroid bodies as in [74]. This is evidence that the mean width bodies are somehow more sensitive to the boundary structure of a body. The mean width bodies also provide a new geometric characterization of the $L_p$-affine surface area with (1.4.2).

Convex geometry is a natural tool to explore quantum states. The set PPT is often used as an approximation of the set S. The results presented in Chapter 2 show that, in many respects, PPT is not a good approximation of S. Using known volumetric estimates on the various subsets of quantum states, we arrive at fairly sharp concentration results. Specifically, Proposition 2.8.2 shows that for large dimension the distance in the trace norm between S and PPT is at least 1/4 in most directions. We also obtain bounds on the distance between these sets using other notions of distance. In addition, we give an explicit method of producing PPT states that are far from S by considering private states.

Appendix A

CONSTANTS IN THE SPHERICAL ISOPERIMETRIC INEQUALITY

As promised in section 2.5, we will now clarify the constants in the spherical concentration inequality as well as the discrepancy with the dimension that appears in the exponent. The statement we are concerned with appears in Milman and Schechtman as a corollary to the spherical isoperimetric inequality ([70], Corollary 2.2) and has since been quoted extensively in the literature. It reads

Proposition A.0.4. If $A \subset S^{n+1}$ with $\mu(A) \ge 1/2$ then $\mu(A_\epsilon) \ge 1 - \sqrt{\pi/8}\; e^{-\epsilon^2 n/2}$.

Here µ is the normalized surface measure on $S^{n+1}$ and, for $\epsilon \in [0, \pi/2]$, $A_\epsilon = \{x : d(x, A) \le \epsilon\}$ where d is geodesic distance. Chordal distance can also be considered in place of geodesic distance. We need only consider $\epsilon \le \pi/2$ since otherwise $A_\epsilon = S^{n+1}$ and the statement is trivially true. The chordal, or extrinsic, distance is the usual distance in the ambient space and in many applications is more relevant than the geodesic distance.

The spherical concentration inequality played a huge role in the development of the theory. However, its statement is not completely satisfactory for two reasons. First, the dimension of the ambient space in this formulation is n + 2 while the dimension that appears in the exponent is just n. It would be more natural for the dimension in the exponent to match that of the ambient space since it is these terms that give the concentration results used in (2.5.1). Next, the constant of $\sqrt{\pi/8} \approx 0.626657$ does not seem optimal. Here we will prove the following version of the inequality that addresses all these concerns.

Proposition A.0.5. Let n > 2 and $\epsilon \in [0, \pi/2]$. If $A \subset S^{n-1}$ with $\mu(A) \ge 1/2$ then

\[ \mu(A_\epsilon) \ge 1 - \frac{1}{2}\, e^{-\epsilon^2 n/2}. \]

Before passing to the proof, we will comment on the case n = 2, i.e., for the 1-dimensional sphere $S^1$. As easily follows from the argument sketched below, the optimal lower bound is then very simple and reads

\[ \mu(A_\epsilon) \ge \frac{1}{2} + \frac{\epsilon}{\pi}. \]

A direct check shows then that the estimate from Proposition A.0.5 fails if $\epsilon \in (a, b)$, where a ≈ 1.05858, b ≈ 1.18588. However, it remains true outside of this interval and if we use the extrinsic chordal distance in $\mathbb{R}^2$ instead of the geodesic distance, it is true in the entire nontrivial range $[0, \sqrt{2}]$.
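The endpoints a and b can be recovered by locating the sign changes of the difference between the optimal bound for n = 2 and the bound of Proposition A.0.5. The following is a minimal sketch using plain bisection; the bracketing intervals [1.0, 1.1] and [1.1, 1.3] are assumptions chosen by inspecting the sign of the difference at those points.

```python
import math

def gap(eps):
    """Optimal n = 2 bound minus the Proposition A.0.5 bound with n = 2.

    A negative value means the A.0.5 estimate would exceed the optimal bound,
    i.e. the estimate fails at this eps.
    """
    return (0.5 + eps / math.pi) - (1 - 0.5 * math.exp(-eps ** 2))

def bisect_root(f, lo, hi, tol=1e-12):
    """Bisection for a sign change of f on [lo, hi]."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2

a = bisect_root(gap, 1.0, 1.1)
b = bisect_root(gap, 1.1, 1.3)
print(a, b)
```

The two roots delimit the interval on which the estimate fails for the circle in the geodesic distance.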

Let us now sketch the proof of Proposition A.0.5, which, except for the calculus part, follows [70]. The spherical isoperimetric inequality guarantees that, given $\mu(A)$, the value of $\mu(A_\epsilon)$ is minimized when A is a spherical cap. Accordingly, we need only consider the case when $C \subset S^{n-1}$ is a spherical cap with $\mu(C) = 1/2$. We need to show that

\[ \mu(S^{n-1} \setminus C_x) \le \frac{1}{2}\, e^{-n x^2/2}. \]

We now note that $S^{n-1} \setminus C_x$ is again a spherical cap (whose radius in the geodesic distance is π/2 − x), and so, still following [70], the left hand side can be rewritten as

\[ \mu(S^{n-1} \setminus C_x) = \frac{\int_x^{\pi/2} \cos^{n-2}\theta\, d\theta}{2 I_{n-2}} \tag{A.0.1} \]

where $I_n := \int_0^{\pi/2} \cos^n\theta\, d\theta$. This means that the assertion of Proposition A.0.5 is equivalent to the inequality

\[ q_n(x) := e^{n x^2/2}\, \frac{\int_x^{\pi/2} \cos^{n-2}\theta\, d\theta}{I_{n-2}} \le 1. \tag{A.0.2} \]

Numerical considerations suggest that, for each x ∈ [0, π/2], the sequence $(q_n(x))$ is nonincreasing. More precisely, since we will be using the recurrence formula

\[ \int \cos^n\theta\, d\theta = \frac{1}{n} \cos^{n-1}\theta \sin\theta + \frac{n-1}{n} \int \cos^{n-2}\theta\, d\theta, \tag{A.0.3} \]

it will be more convenient to prove that $q_{n+2}(x) \le q_n(x)$, which using the notation $I_n := \int_0^{\pi/2} \cos^n\theta\, d\theta$ simplifies to

\[ \int_x^{\pi/2} \cos^n\theta\, d\theta \le e^{-x^2}\, \frac{I_n}{I_{n-2}} \int_x^{\pi/2} \cos^{n-2}\theta\, d\theta. \tag{A.0.4} \]

Integrating the recurrence formula (A.0.3) from 0 to π/2 yields $I_n / I_{n-2} = (n-1)/n$, so substituting this value in (A.0.4) and further applying (A.0.3) to the left hand side of (A.0.4) allows us to rewrite it as an upper bound on $\int_x^{\pi/2} \cos^{n-2}\theta\, d\theta$, namely

\[ (n-1)\left(1 - e^{-x^2}\right) \int_x^{\pi/2} \cos^{n-2}\theta\, d\theta \le \cos^{n-1} x \sin x. \tag{A.0.5} \]
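For readers who wish to check the passage from (A.0.4) to (A.0.5), the intermediate algebra can be written out as follows. The shorthand $J_m(x)$ is introduced here only for this sketch and is not notation from the text.

```latex
Write $J_m(x) := \int_x^{\pi/2} \cos^m\theta\,d\theta$. Evaluating (A.0.3) between
$x$ and $\pi/2$ gives
\[
J_n(x) = -\frac{1}{n}\cos^{n-1}x\,\sin x + \frac{n-1}{n}\,J_{n-2}(x),
\]
since $\cos^{n-1}\theta\,\sin\theta$ vanishes at $\theta = \pi/2$ for $n \ge 2$.
With $I_n/I_{n-2} = (n-1)/n$, inequality (A.0.4) reads
\[
-\frac{1}{n}\cos^{n-1}x\,\sin x + \frac{n-1}{n}\,J_{n-2}(x)
  \le e^{-x^2}\,\frac{n-1}{n}\,J_{n-2}(x).
\]
Multiplying by $n$ and collecting the $J_{n-2}$ terms yields
\[
(n-1)\left(1 - e^{-x^2}\right) J_{n-2}(x) \le \cos^{n-1}x\,\sin x ,
\]
which is (A.0.5).
```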

We now use the following fact to upper-bound the cosine integral:

\[ \int_x^{\pi/2} \cos^{n-2}\theta\, d\theta \le \frac{1}{\sin x} \int_x^{\pi/2} \cos^{n-2}\theta \sin\theta\, d\theta = \frac{1}{n-1}\, \frac{\cos^{n-1} x}{\sin x}. \]

Applying this bound to the left hand side of (A.0.5), we see that it is now sufficient to show that

\[ 1 - e^{-x^2} \le \sin^2 x \quad\text{or}\quad \cos x \le e^{-x^2/2} \tag{A.0.6} \]

for x ∈ [0, π/2], which is a well known inequality used often in analysis. This inequality can be validated in many ways. One may consider the power series of both sides, or take the logarithm of both sides and appeal to calculus. For the sake of completeness we include a simple proof. Taking the logarithm of both sides of (A.0.6), we obtain

\[ \ln(\cos x) \le -x^2/2. \]

Since there is equality at x = 0, this inequality is true by monotonicity if it is true for the derivatives. That is, if

\[ -\tan x \le -x \]

then (A.0.6) holds. Since we again have equality when x = 0 we can repeat the same argument and reduce this further to the evidently true

\[ 1 \ge \cos^2 x. \]

We have shown that for each x ∈ [0, π/2], the sequences $(q_{2n}(x))$ and $(q_{2n+1}(x))$ are nonincreasing. Now we need only show that for x ∈ [0, π/2], $q_3(x) \le 1$ and $q_4(x) \le 1$. (The failure of $q_2(x) \le 1$ on some subinterval (a, b) ⊂ [0, π/2] was the reason why the case n = 2 has to be excluded from Proposition A.0.5 and analyzed separately.)

From (A.0.2), $q_3(x)$ and $q_4(x)$ are given by the following:

\[ q_3(x) = (1 - \sin x)\, e^{3x^2/2} \tag{A.0.7} \]

\[ q_4(x) = \frac{1}{\pi}\left(\pi - 2x - 2\cos x \sin x\right) e^{2x^2} \tag{A.0.8} \]

The inequalities $q_3(x) \le 1$ and $q_4(x) \le 1$ can now be verified numerically or graphically, see Figure A.1. Note that the only points where $q_3(x)$ or $q_4(x)$ is even close to 1 are near x = 0 (for which we have equality), but since $q_3'(0) = -1$ and $q_4'(0) = -4/\pi$, we can be sure that the inequalities hold when x is close to 0. (Note that, additionally, we know that $q_4(x) \le q_2(x)$, so that $q_4(x) \le 1$ for sure holds outside the short interval (a, b) identified earlier.) Alternatively, we can attempt to show the inequalities by analytic means using the same techniques as earlier in the proof of (A.0.6). However, the argument is more involved and at some stage some numerics are necessary. For example, taking the logarithm of both sides of $q_3(x) \le 1$ we obtain

\[ \ln(1 - \sin x) \le -\frac{3x^2}{2}. \]

We again have equality when x = 0 so we look at the derivatives and are led to

\[ g(x) := 3x - \frac{\cos x}{1 - \sin x} \le 0 \]

(that is, if the above inequality holds in [0, π/2], so does the previous one and we are done). By direct calculation, $g'(x) = \frac{2 - 3\sin x}{1 - \sin x}$ and so it is apparent that g has a unique maximum in [0, π/2) at $x_0 = \arcsin\frac{2}{3}$. It remains to verify that $g(x_0) \approx -0.046885 < 0$. This finishes the proof of Proposition A.0.5.

Figure A.1: The plots of $q_3$ and $q_4$.
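The numerical verification mentioned above can be reproduced with a few lines of code. The sketch below evaluates the closed forms (A.0.7) and (A.0.8) on a uniform grid (the grid size of 2000 is an arbitrary choice made here) and evaluates g at arcsin(2/3); a grid check is of course only a sanity check and not a proof.

```python
import math

def q3(x):
    """Closed form (A.0.7)."""
    return (1 - math.sin(x)) * math.exp(1.5 * x * x)

def q4(x):
    """Closed form (A.0.8)."""
    bracket = math.pi - 2 * x - 2 * math.cos(x) * math.sin(x)
    return bracket / math.pi * math.exp(2 * x * x)

def g(x):
    """The function g(x) = 3x - cos(x) / (1 - sin(x)) from the analytic argument."""
    return 3 * x - math.cos(x) / (1 - math.sin(x))

# Uniform grid on [0, pi/2]; both functions equal 1 at x = 0.
grid = [i * (math.pi / 2) / 2000 for i in range(2001)]
max_q3 = max(q3(x) for x in grid)
max_q4 = max(q4(x) for x in grid)
x0 = math.asin(2 / 3)
print(max_q3, max_q4, g(x0))
```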

Remark. It may be possible to generalize the above argument to arrive at estimates involving $\frac{1}{2}\, e^{-\epsilon^2 (n+k)/2}$ where k > 0. We cannot expect these estimates to hold for all n as we have already seen above that for k = 0, n must be greater than 2. However, it may be true for n ≥ n(k). The quantity we are interested in becomes

\[ q_{n,k}(x) := e^{(n+k) x^2/2}\, \frac{\int_x^{\pi/2} \cos^{n-2}\theta\, d\theta}{I_{n-2}} \le 1. \tag{A.0.9} \]

If we repeat the above argument for $q_{n,k}(x)$ we see that $q_{n+2,k}(x) \le q_{n,k}(x)$ is again equivalent to (A.0.4) and hence for each x ∈ [0, π/2] and for all k ≥ 0 the sequences $(q_{2n,k}(x))$ and $(q_{2n+1,k}(x))$ are nonincreasing. A numerical check indicates that $q_{4,1}(x) \le 1$ and $q_{6,2}(x) \le 1$, but the inequalities do not hold for $q_{3,1}$ and $q_{5,2}$. This suggests that, in the setting of Proposition A.0.5, the bound $\mu(A_\epsilon) \ge 1 - \frac{1}{2}\, e^{-\epsilon^2 (n+1)/2}$ is valid for n ≥ 4 and the bound $\mu(A_\epsilon) \ge 1 - \frac{1}{2}\, e^{-\epsilon^2 (n+2)/2}$ is valid for n ≥ 6. It would be interesting to explore the dependence k → n(k).
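A numerical check of this kind can be sketched as follows, computing both cosine integrals in (A.0.9) with a composite Simpson rule. The integration routine, the grid size, and the function names are choices made here for illustration and are not taken from the text.

```python
import math

def simpson(f, a, b, m=400):
    """Composite Simpson rule on [a, b] with an even number m of subintervals."""
    h = (b - a) / m
    total = f(a) + f(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def q(n, k, x):
    """q_{n,k}(x) from (A.0.9), with both integrals computed numerically."""
    def integrand(t):
        return math.cos(t) ** (n - 2)
    tail = simpson(integrand, x, math.pi / 2)
    full = simpson(integrand, 0.0, math.pi / 2)
    return math.exp((n + k) * x * x / 2) * tail / full

def max_q(n, k, points=400):
    """Maximum of q_{n,k} over a uniform grid on [0, pi/2]."""
    return max(q(n, k, i * (math.pi / 2) / points) for i in range(points + 1))

print(max_q(3, 1), max_q(4, 1))
```

Calling `max_q(5, 2)` and `max_q(6, 2)` is one way to probe the remaining two cases mentioned in the Remark; larger values of k can be explored in the same manner.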

BIBLIOGRAPHY

[1] A. Aspect, J. Dalibard and G. Roger, Experimental Test of Bell’s Inequalities Using Time-Varying Analyzers, Phys. Rev. Lett. 49, no.25 (1982), 1804.

[2] A. Aspect, P. Grangier and G. Roger, Experimental Realization of Einstein-Podolsky-Rosen-Bohm Gedankenexperiment: A New Violation of Bell’s Inequalities, Phys. Rev. Lett. 49, no.2 (1982), 91-94.

[3] G. Aubrun and S. Szarek, Tensor products of convex sets and the volume of separable states on N qudits, Phys. Rev. A 73 no.2, 022109 (2006).

[4] H. Barnum and N. Linden, Monotones and invariants for multi-particle quantum states, J. Phys. A: Math. Gen. 34 (2001) 6787-6805.

[5] S. Beigi and P. Shor, Approximating the Set of Separable States Using the Positive Partial Transpose Test, J. Math. Phys. 51, 042202 (2010).

[6] J. Bell, On the Einstein Podolsky Rosen Paradox, Physics 1, (1964), 195-200.

[7] C. H. Bennett and G. Brassard, Quantum Cryptography: Public Key Distribution and Coin Tossing, in Proceedings of the IEEE International Conference on Computers, Systems and Signal Processing (IEEE Computer Society Press, New York, Bangalore, India, December 1984), pp. 175-179.

[8] C. Bennett, D. DiVincenzo, J. Smolin and W. Wootters, Mixed State Entanglement and Quantum Error Correction, Phys. Rev. A 54, 3824-3851 (1996).

[9] W. Blaschke, Vorlesungen über Differentialgeometrie II: Affine Differentialgeometrie, Springer Verlag, 1923.

[10] K. Böröczky Jr., R. Schneider, The mean width of circumscribed random polytopes, Canadian Math. Bull. 53 (2010), 614-628.

[11] J. Bourgain and V. Milman, New Volume Ratio Properties for Convex Symmetric Bodies in $\mathbb{R}^n$, Inventiones Math. 88, no.2, (1987), 319-340.

[12] M. Choi, Completely Positive Maps on Complex Matrices, Lin. Alg. Appl. (1975) 285-290.

[13] T. Cover and J. Thomas, Elements of information theory, second ed., Wiley-Interscience, (John Wiley and Sons), Hoboken, NJ, (2006).

[14] J. Clauser, M. Horne, A. Shimony, and R. Holt, Proposed Experiment to Test Local Hidden-variable Theories, Phys. Rev. Lett 23, (1969), 880-884.

[15] E. Davies and J. Lewis, An Operational Approach to Quantum Probability, Commun. Math. Phys. 17, (1970), 239-260.

[16] A. Dembo, T. Cover, and J. Thomas, Information theoretic inequalities, IEEE Trans. Inform. Theory 37 (1991), 1501-1518.

[17] R. M. Dudley, The sizes of compact subsets of Hilbert space and continuity of Gaussian processes, J. Funct. Anal. 1 (1967), 290-330.

[18] K. R. Davidson and S. J. Szarek, Local operator theory, random matrices and Banach spaces, in “Handbook on the Geometry of Banach spaces” Volume 1, W.B. Johnson, J. Lindenstrauss eds., Elsevier Science 2001, pp. 317-366. Addenda and Corrigenda, in Volume 2, 2003, pp. 1819-1820.

[19] P. A. M. Dirac, The Principles of Quantum Mechanics, fourth ed., Oxford University Press, New York, (1958).

[20] A. Einstein, B. Podolsky and N. Rosen, Can Quantum-Mechanical Description of Physical Reality Be Considered Complete?, Phys. Rev. 47, (1935), 777.

[21] R. Feynman, Simulating Physics with Computers, Int. J. Theor. Phys. 21 (1982) 467.

[22] C. A. Fuchs and J. van de Graaf, Cryptographic Distinguishability Measures for Quantum-Mechanical States, IEEE Trans. Inf. Theor. 45 no.4, (2006), 1216-1227.

[23] R. J. Gardner, The Brunn-Minkowski Inequality, Bulletin of the AMS 39, 3, (2002), 355-405.

[24] R. J. Gardner, A positive answer to the Busemann-Petty problem in three dimensions, Ann. of Math. 140 no.2 (1994), 435-47.

[25] R. J. Gardner, The dual Brunn-Minkowski theory for bounded Borel sets: Dual affine quermassintegrals and inequalities, Adv. Math. 216 (2007), 358-386.

[26] R. J. Gardner, A. Koldobsky, and T. Schlumprecht, An analytical solution to the Busemann-Petty problem on sections of convex bodies, Ann. of Math. 149 no.2 (1999), 691-703.

[27] R. J. Gardner and G. Zhang, Affine inequalities and radial mean bodies. Amer. J. Math. 120 no.3 (1998), 505-528.

[28] S. Geman, A limit theorem for the norm of random matrices, Ann. Probab. 8 (1980), 252-261.

[29] E. Grinberg and G. Zhang, Convolutions, transforms, and convex bodies, Proc. London Math. Soc. 78 no.3 (1999), 77-115.

[30] S. Glasauer and P. M. Gruber, Asymptotic estimates for best and stepwise approximation of convex bodies III, Forum Math. 9 (1997), 383-404.

[31] L. Gurvits, Classical Deterministic Complexity of Edmonds’ Problem and Quantum Entanglement, in “Proceedings of the Thirty-Fifth Annual ACM Symposium on Theory of Computing” (electronic), ACM, New York, (2003), 10-19.

[32] L. Gurvits and H. Barnum, Largest separable balls around the maximally mixed bipartite quantum state, Phys. Rev. A 66, 062311 (2002).

[33] L. Gurvits and H. Barnum, Better bound on the exponent of the radius of the multipartite separable ball, Phys. Rev. A 72, 032322 (2005).

[34] U. Haagerup and S. Thorbjørnsen, Random Matrices with Complex Gaussian Entries, Expo. Math. 21, (1998), 293-337.

[35] C. Haberl, Blaschke valuations, Amer. J. of Math., 133, no.3, (2011), 717-751.

[36] C. Haberl and F. Schuster, General Lp affine isoperimetric inequalities. J. Differential Geometry 83 (2009), 1-26.

[37] C. Haberl, E. Lutwak, D. Yang and G. Zhang, The even Orlicz Minkowski problem, Adv. Math. 224 (2010), 2485-2510.

[38] F. Hiai and D. Petz, The Semicircle Law, Free Random Variables and Entropy, Mathematical Surveys and Monographs 77, AMS, Providence, 2000.

[39] K. Horodecki, PhD Thesis (2008).

[40] K. Horodecki, M. Horodecki, P. Horodecki and J. Oppenheim, Secure Key from Bound Entanglement, Phys. Rev. Lett. 94, 160502 (2005).

[41] K. Horodecki, M. Horodecki, P. Horodecki and J. Oppenheim, General Paradigm for Distilling Classical Key from Quantum States, IEEE Trans. Inf. Theory 55, (2009) 1898.

[42] M. Horodecki, P. Horodecki and R. Horodecki, Separability of Mixed States: Necessary and Sufficient Conditions, Physics Letters A 223 (1996), 1-8.

[43] R. Horodecki, P. Horodecki, M. Horodecki, K. Horodecki, Quantum entanglement, Rev. Mod. Phys. 81, 865 (2009).

[44] K. Horodecki, M. Horodecki, J. Jenkinson and S. Szarek, Bound entangled states with extremal properties, in preparation.

[45] K. Horodecki, L. Pankowski, M. Horodecki, and P. Horodecki, Low dimensional bound entanglement with one-way distillable cryptographic key, IEEE 54, 2621(2008).

[46] D. Hug, Contributions to affine surface area, Manuscripta Math. 91 (1996) 283-301.

[47] A. Jamiołkowski, Linear Transformations which Preserve Trace and Positive Semidefiniteness of Operators, Rep. Math. Phys. 3, no.275 (1972) 380.

[48] J. Jenkinson and E. Werner, Relative Entropy of Convex Bodies, Trans. AMS (in print)

[49] R. Jozsa, Fidelity for mixed quantum states, Journal of Modern Optics, 41, no. 12, (1994), 2315-2323.

[50] D. Klain, Star valuations and dual mixed volumes, Adv. Math. 121 (1996), 80-101.

[51] D. Klain, Invariant valuations on star-shaped sets, Adv. Math. 125 (1997), 95-113.

[52] G. Kuperberg, From the Mahler Conjecture to Gauss Linking Integrals, GAFA 18 (2008), 870-892.

[53] K. Leichtweiss, Zur Affinoberfläche konvexer Körper, Manuscripta Mathematica 56 (1986), 429-464.

[54] M. Ludwig, Ellipsoids and matrix valued valuations, Duke Math. J. 119 (2003), 159-188.

[55] M. Ludwig, Minkowski areas and valuations, J. Differential Geometry, 86 (2010), 133-162.

[56] M. Ludwig and M. Reitzner, A Characterization of Affine Surface Area, Adv. Math. 147 (1999), 138-172.

[57] M. Ludwig and M. Reitzner, A classification of SL(n) invariant valuations. Ann. of Math. 172 (2010), 1223-1271.

[58] E. Lutwak, Intersection bodies and dual mixed volumes, Adv. Math. 71 (1988), 232-261.

[59] E. Lutwak, The Brunn-Minkowski-Firey theory II : Affine and geominimal surface areas, Adv. Math. 118 (1996), 244-294.

[60] E. Lutwak, D. Yang and G. Zhang, A new ellipsoid associated with convex bodies, Duke Math. J. 104 (2000), 375-390.

[61] E. Lutwak and G. Zhang, Blaschke-Santaló inequalities, J. Differential Geom. 47 (1997), 1-16.

[62] E. Lutwak, D. Yang and G. Zhang, Sharp Affine Lp Sobolev inequalities, J. Differential Geometry 62 (2002), 17-38.

[63] E. Lutwak, D. Yang and G. Zhang, The Cramer–Rao inequality for star bodies, Duke Math. J. 112 (2002), 59-81.

[64] E. Lutwak, D. Yang and G. Zhang, Volume inequalities for subspaces of Lp, J. Differential Geometry 68 (2004), 159-184.

[65] E. Lutwak, D. Yang and G. Zhang, Moment-entropy inequalities, Ann. Probab. 32 (2004), 757-774.

[66] K. Mahler, Ein Minimalproblem für konvexe Polygone, Mathematica (Zutphen) B, 7, (1939), 118-127.

[67] K. Mahler, Ein Übertragungsprinzip für konvexe Körper, Časopis Pěst. Mat. Fys. 68, (1939), 93-102.

[68] M. Meyer and E. Werner, The Santaló-regions of a convex body. Transactions of the AMS 350 no.11, (1998) 4569-4591.

[69] M. Meyer and E. Werner, On the p-affine surface area. Adv. Math. 152 (2000), 288-313.

[70] V. D. Milman and G. Schechtman, Asymptotic theory of finite dimensional normed spaces. With an appendix by M. Gromov, Lecture Notes Math. 1200, Springer Verlag, Berlin-New York, 1986.

[71] F. Nazarov, F. Petrov, D. Ryabogin and A. Zvavitch, A remark on the Mahler conjecture: local minimality of the unit cube, Duke Math. J. 154, (2010), 419-430.

[72] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information, Cambridge University Press, Cambridge, (2000).

[73] R. Osserman, The isoperimetric inequality, Bulletin of the AMS, 84 (1978), 1182-1238.

[74] G. Paouris and E. Werner, Relative entropy of cone measures and Lp centroid bodies, preprint.

[75] G. Pisier, The volume of convex bodies and Banach space geometry, Cambridge Tracts in Mathematics 94, Cambridge University Press, Cambridge, 1989.

[76] J. M. Renner and G. Smith, Noisy processing and distillation of private quantum states, Phys. Rev. Lett. 98 020502 (2007)

[77] C. A. Rogers and G. C. Shephard, The difference body of a convex body, Archiv der Math., 8 (1957), 220-233.

[78] C. A. Rogers and G. C. Shephard, Convex bodies associated with a given convex body, J. Lond. Math. Soc. 33 (1958), 270-281.

[79] B. Rubin and G. Zhang, Generalizations of the Busemann-Petty problem for sections of convex bodies, J. Funct. Anal. 213 (2004), 473-501.

[80] J. J. Sakurai and J. Napolitano, Modern Quantum Mechanics, second ed., Addison-Wesley (Pearson), (2007).

[81] R. Schneider, Convex Bodies: The Brunn-Minkowski theory. Cambridge University Press, (1993).

[82] F. Schuster, Crofton measures and Minkowski valuations, Duke Math. J. 154, (2010), 1-30.

[83] C. Schütt and E. Werner, The convex floating body. Math. Scand. 66 (1990), 275-290.

[84] C. Schütt and E. Werner, Random polytopes of points chosen from the boundary of a convex body. GAFA Seminar Notes, Lecture Notes in Mathematics, Springer-Verlag 1807 (2002), 241-422.

[85] C. Schütt and E. Werner, Surface bodies and p-affine surface area. Adv. Math. 187 (2004), 98-145.

[86] C. E. Shannon, A Mathematical Theory of Communication, Bell Syst. Tech. J. 27, 379 (1948); 27, 623 (1948).

[87] A. Shimony, Degree of Entanglement, Annals of the New York Academy of Sciences, 755 (1995) 675-679.

[88] P. Shor, Polynomial-Time Algorithms for Prime Factorization and Discrete Logarithms on a Quantum Computer, Phys. Rev. A 52 (1995) 2493.

[89] H. Sommers and K. Życzkowski, Bures volume of the set of mixed quantum states, J. Phys. A: Math. Gen. 36 (2003) 10083-10100.

[90] A. Stancu, The Discrete Planar L0-Minkowski Problem. Adv. Math. 167 (2002), 160-174.

[91] A. Stancu, On the number of solutions to the discrete two-dimensional L0-Minkowski problem. Adv. Math. 180 (2003), 290-323.

[92] S. Szarek, The volume of separable states is super-doubly-exponentially small in the number of qubits, Phys. Rev. A 72, 032304 (2005).

[93] S. Szarek, E. Werner and K. Życzkowski, Geometry of sets of quantum maps: A generic positive map acting on a high-dimensional system is not completely positive, J. Math. Phys. 49, 032113 (2008).

[94] A. Uhlmann, The transition probability in the state space of a ∗-algebra, Reports on Mathematical Physics, 9 (1976) 273-279.

[95] D. Voiculescu, K. Dykema and A. Nica, Free Random Variables, CRM Monograph Series 1, AMS, Providence, 1992.

[96] T. Wei and P. Goldbart, Geometric measure of entanglement and applications to bipartite and multipartite quantum states, Phys. Rev. A 68, 042307 (2003).

[97] E. Werner, Illumination bodies and affine surface area, Studia Math. 110 (1994), 257-269.

[98] E. Werner, A general geometric construction for affine surface area, Studia Math. 132 no.3 (1999), 227-238.

[99] E. Werner, On Lp-affine surface areas, Indiana Univ. Math. J. 56 no. 5 (2007), 2305-2324.

[100] E. Werner and D. Ye, New Lp affine isoperimetric inequalities, Adv. Math. 218 no.3 (2008), 762-780.

[101] E. Werner and D. Ye, Inequalities for mixed p-affine surface area, Math. Ann. 347 (2010), 703-737.

[102] R. F. Werner, Quantum states with Einstein-Podolsky-Rosen correlations admitting a hidden-variable model, Phys. Rev. A 40, 4277 (1989).

[103] E. P. Wigner, Characteristic vectors of bordered matrices with infinite dimensions, Ann. Math. 62 (1955), 548-564.

[104] E. P. Wigner, On the distribution of the roots of certain symmetric matrices, Ann. Math. 67 (1958), 325-327.

[105] W. K. Wootters, Entanglement of Formation of an Arbitrary State of Two Qubits, Phys. Rev. Lett. 80, 2245-2248 (1998).

[106] D. Ye, On the Bures volume of separable quantum states, J. Math. Phys. 50 083502 (2009).

[107] G. Zhang, Intersection bodies and Busemann-Petty inequalities in $\mathbb{R}^4$, Ann. of Math. 140 (1994), 331-346.

[108] G. Zhang, A positive answer to the Busemann-Petty problem in four dimensions, Ann. of Math. 149 (1999), 535-543.

[109] G. Zhang, New Affine Isoperimetric Inequalities, ICCM 2007, Vol. II, 239-267.

[110] K. Życzkowski and H.-J. Sommers, Hilbert–Schmidt volume of the set of mixed quantum states, J. Phys. A 36, 10115 (2003).