PHDTHESIS

QUANTUMENTROPIES, RELATIVE ENTROPIES, AND RELATED PRESERVER PROBLEMS

DÁNIELVIROSZTEK

DEPARTMENT OF ANALYSIS BUDAPESTUNIVERSITYOFTECHNOLOGYANDECONOMICS HUNGARY

SUPERVISOR: PROF. DÉNESPETZ

2016 Contents

Index of Notation 2 Chapter 1. Introduction 3 1. C ∗-algebras, von Neumann algebras 4 2. Tensor product 9 3. Continuous functional calculus 11 4. Commutative von Neumann algebras 13 Chapter 2. Quantum variances, generalized entropies and relative entopies 15 1. Decomposable quantum variances 16 2. Generalizations of the strong subadditivity inequality 21 3. Bregman divergences and their use 34 Chapter 3. Preserver problems 51 1. Maps preserving Bregman and Jensen divergences 53 2. Jordan triple endomorphisms of the positive cone 66 3. An application: the endomorphisms of the Einstein gyrogroup 78 Chapter 4. Summary 83 Ackowledgement 84 Bibliography 85

1 2 CONTENTS

Index of Notation N the set of natural numbers Z the set of integers Q the set of rational numbers R the set of real numbers C the set of complex numbers x X x is an element of the set X ∈ A B the Cartesian product of the sets A and B × An the nth Cartesian power of the set A H a complex Hilbert space ­x, y® the inner product of the elements x and y of the Hilbert space H — we use the convention that the inner prod- uct is linear in the second variable and conjugate-linear in the first variable. B(H ) the set of bounded linear operators on H Bsa(H ) the set of self-adjoint linear operators on H B+(H ) the set of positive semidefinite linear operators on H B++(H ) the set of positive definite linear operators on H ran(.) the range of a linear operator ker(.) the kernel of a linear operator M the set of n n complex matrices n × Msa the set of n n self-adjoint complex matrices n × M+ the set of n n positive semidefinite complex matrices n × M++ the set of n n positive definite complex matrices n × [A]i,j the entry in the i-th row and j-th column of a matrix A Atr the transpose of the matrix A IA the identity element of the unital algebra A CHAPTER 1

Introduction

The classical work [44] of Andrey Nikolaevich Kolmogorov laid the foundations of probability theory in 1933. In Kolmogorov’s approach, the basic concept of probability theory is the probability space. A probability space is a triplet (X ,A ,P), where X is an arbitrary set, A P(X ) is a σ- ⊆ algebra — P(X ) denotes the power set of X — and P is a finite measure on A which is normalized, that is, P(X ) 1. This means that a probabil- = ity space is nothing else but a measure space with total measure one, so one may consider probability theory as a branch of measure theory. On the other hand, probability theory is a richer structure than measure the- ory in the sense that several measure theoretical notions gain intuitive meanings from the viewpoint of a probability theorist. Without the re- quirement of generality, let us mention some of the intuitions which are associated with the notions of measure theory. The most basic concept is that the measurable sets — that is, the elements of the σ-algebra A — are considered to be events. A measurable function f :(X ,A ) (K,B) → is called a real/complex random variable if K R or K C, respectively. R = = Therefore, the Lebesgue integral X f dP of the measurable function f is called the expected value — if it exists. As P is a finite measure, it is quite easy to guarantee the existence of the integral of a measurable function. ¡ ¯ ¯ ¢ If f is essentially bounded, that is, P {x X : ¯f (x)¯ K } 0 for some ∈ > = K 0, then f is integrable, moreover, any power of f is integrable. This > R k latter fact is remarkable as the integral X f dP is called the kth moment of the random variable f and plays an important role in probability the- ory. Let us denote by L∞ (X ,A ,P) the set of essentially bounded measur- able complex valued functions on the probability space (X ,A ,P). Let us introduce the notation

½ ¯ Z ¾ ¯ ¯ ¯2 L2 (X ,A ,P) f : X C¯ f is measurable and ¯f ¯ dP , = → ¯ X < ∞ as well. Clearly, L2 (X ,A ,P) is a Hilbert space with the inner product ­f ,g® R f gdP. Every bounded measurable function f : X C deter- = X 7→ mines a bounded linear operator on the Hilbert space L2 (X ,A ,P) in the

3 4 1. INTRODUCTION following way. Set f L∞ (X ,A ,P). Let us define the multiplication op- ∈ erator M f by M : L2 (X ,A ,P) L2 (X ,A ,P), g M (g): f g. f → 7→ f = Straightforward computations show that M f is linear, and the proof of ¡ ¢ the boundedness of M is quite easy, as well. So, M B L2 (X ,A ,P) f f ∈ for any f L∞ (X ,A ,P). Moreover, the operator norm of M f coincides ∈ ° ° ° ° with the supremum norm of f , that is, °M ° °f ° . This latter fact is f = also rather easy to prove. The map ∞ ¡ 2 ¢ (1) M : L∞ (X ,A ,P) B L (X ,A ,P) , f M → 7→ f is a canonical isometric embedding of the commutative normed algebra ¡ 2 ¢ L∞ (X ,A ,P) into the normed algebra B L (X ,A ,P) , which is far from being commutative in general. This embedding is the starting point of the noncommutative generalization of probability theory. In the follow- ing section we give a brief introduction to the theory of von Neumann algebras, which are the appropriate mathematical objects to formalize the concepts of noncommutative probability theory. It is fair to remark that all the results of this thesis concern finite dimensional von Neumann algebras.

1. C ∗-algebras, von Neumann algebras

DEFINITION 1 (Normed algebra). A unital complex algebra A en- dowed with the norm . is said to be a normed algebra, if the norm is k k submultiplicative, i. e., ab a b for any a,b A and the identity k k ≤ k kk k ∈ element is of norm one, that is, 1 1. k A k = The reader who claims that not only unital algebras are said to be normed algebras is right. However, any algebra which appears in this work is unital, so for the sake of convenience, we incorporated the re- quirement of unitality in the definition.

DEFINITION 2 (Banach algebra). A normed algebra which is a Banach space — that is, a complete normed space — is called a Banach algebra.

DEFINITION 3 (Involution). Let A be a complex algebra. A map : ∗ A A , a a∗ is called an involution if it satisfies the following proper- → 7→ ties.

is antilinear: (λa b)∗ λa∗ b∗ for any a,b A and λ C. • ∗2 + = + ∈ ∈ id, that is, (a∗)∗ a for any a A . • ∗ = = ∈ is an antihomomorphism with recpect to the product: (ab)∗ • ∗ = b∗a∗ for any a,b A . ∈ 1. C ∗-ALGEBRAS, VON NEUMANN ALGEBRAS 5

DEFINITION 4 (C ∗-algebra). A Banach algebra endowed with an invo- 2 lution : A A which satisfies a∗a a for any a A is called a ∗ → k k = k k ∈ C ∗-algebra.

The above definition of C ∗-algebras is rather abstract. However, we do not loose any generality if we consider the elements of a C ∗-algebra as bounded operators on an appropriate Hilbert space. Indeed, any C ∗- algebra is isomorphic to a closed (in the operator norm topology) unital *-subalgebra (that is, it is closed under the involution) of the operator al- gebra B(H ) for a suitable Hilbert space H . Furthermore, any commuta- tive C ∗-algebra is isomorphic to C(X ) for some compact Hausdorff space X . (The symbol C(X ) denotes the algebra of all continuous complex- valued functions defined on X endowed with the supremum norm.) Despite the above remarkable facts, the C ∗-algebra is still a bit too general notion to formalize the concepts of noncommutative probability theory. With an extra topological assumption we achieve the desired level of generality.

DEFINITION 5 (von Neumann algebra). AC ∗-algebra which is closed not just in the operator norm but also in the weak operator topology is called a von Neumann algebra.

Note that the above definition is correct as any C ∗-algebra is isomor- phic to an algebra of bounded linear operators on a Hilbert space H , hence the condition about the closedness in the weak operator topol- ogy makes sense. The weak operator topology on B(H ) is defined by © ª ¯­ ®¯ the family of seminorms p : x, y H where p (A) ¯ Ax, y ¯ (A x,y ∈ x,y = ∈ B(H )). A beautiful result of von Neumann shows that the purely topologi- cal condition of being closed in the weak operator topology can be char- acterized by a purely algebraic condition. In order to present von Neu- mann’s theorem we define the notion of the commutant.

DEFINITION 6 (Commutant). Let S be any subset of the algebra B(H ). The commutant of S is denoted by S0 and consists of bounded linear oper- ators on H which commute with each element of S. That is,

S0 {B B(H ): AB B A for any A S}. = ∈ = ∈ The bicommutant S00 of the set S is defined as the commutant of its commutant. It is clear that S00 S. The following theorem called bicom- ⊆ mutant theorem asserts that the equality S00 S holds if and only if S is a = von Neumann algebra.

THEOREM 7 (von Neumann’s bicommutant theorem). Let C be a uni- tal *-subalgebra of the operator algebra B(H ) for some Hilbert space H . 6 1. INTRODUCTION

The closure of C in the weak operator topology is equal to the bicommu- tant C 00 of C . The algebra C 00 is the von Neumann algebra generated by C . So, any von Neumnann algebra coincides with its bicommutant. There is another useful property of von Neumann algebras, namely, that they are isomorphic to the topological dual of some Banach spaces [81].

PROPOSITION 8. AC ∗-algebra M is a von Neumann algebra if and only if there exists a Banach space B such that M is isomorphic (as a Ba- nach space) to the Banach dual of B, which is the space of all continuous linear functionals on B. If there exists such a Banach space, then it is called the predual of the von Neumann algebra M and it is denoted by M . There is a canonical ∗ embedding of M into the Banach dual of M which is denoted by M ∗. This embedding is∗ the map

ˆ: M M ∗, φ φˆ, ∗ → 7→ where φˆ M ∗ is defined by the equation ∈ φˆ(A) A(φ)(A M ). = ∈ DEFINITION 9 (State). Let C be a C ∗-algebra. A continuous linear functional φ : C C is said to be a state on the C ∗-algebra C if it is nor- → malized, that is, φ(I ) 1, and positive, that is, φ(A∗ A) 0 for any A C . C = ≥ ∈ In the following, the set of all states on a C ∗-algebra C — which is clearly a convex set — will be denoted by SC . A state φ on a von Neumann algebra M is called normal state if for any countable collection {Pn}n N of mutually orthogonal projections in we have ∈ M Ã ! X X φ Pn φ(Pn). n N = n N ∈ ∈ (An element P of the von Neumann algebra M is called projection if 2 P P P ∗, and the projections P and Q are orthogonal if PQ 0.) More- = = = over, a rather interesting fact is that a state on a von Neumann algebra is normal if and only if it is in the embedded image of the predual.

EXAMPLE 10 (Normal states on the von Neumann algebra B(H )). For a given separable Hilbert space H , the algebra B(H ) of all bounded lin- ear operators is clearly a von Neumann algebra, as it is closed in B(H ) in any topology. An operator A B(H ) is said to be trace class opearator, ∈ if the sum D 1 E X ¡ ¢ 2 A∗ A e j ,e j j 1. C ∗-ALGEBRAS, VON NEUMANN ALGEBRAS 7 is finite for some (and hence for any) orthonormal basis {e } H . The j j ⊂ trace of a trace class operator is defined by X­ ® Tr A : Ae j ,e j . = j The definition is correct as the above sum is absolutely convergent by the definition of the trace class operators. So, let φ be a normal state on B(H ). Then there exists a unique self- adjoint trace class operator ρ B(H ) with Trρ 1 such that ∈ = φ(A) TrρA (A B(H )). = ∈ This means that the predual of B(H ) is the Banach space of all trace class operators on H . This latter space is usually denoted by T (H ). It is an important corollary of the bicommutant theorem that any von Neumann algebra M is determined by its projection lattice P (M ) {P 2 = ∈ M : P P P ∗} in the sense that M P (M )00 [81]. Therefore, the = = = investigation of the projection lattice P (M ) is of particular importance.

REMARK 11. The set of the projections P (M ) of a von Neumann al- gebra M is called projection lattice as it is a lattice indeed. As any C ∗- algebra is isomorphic to a subalgebra of B(H ) for a suitable Hilbert space H , we may consider the elements of P (M ) as Hilbert space oper- ators. The Loewner ordering (P Q if and only if x,Px x,Qx for any ≤ 〈 〉 ≤ 〈 〉 element x of the Hilbert space H ) is a partial ordering on P (M ). More- over, every two elements have a unique supremum (least upper bound) and a unique infimum (greatest lower bound). The infimum of the pro- jections P and Q is the projection onto the intersection of the ranges of P and Q and it is denoted by P Q. The supremum is the projection onto ∧ the closed subspace generated by the ranges of P and Q — in notation: P Q. ∨ The classification of von Neumann algebras developed by F.J. Murray and von Neumann [68] is based on the following equivalence relation on the projection lattice of a von Neumann algebra.

DEFINITION 12 (Equivalence of projections). Let M be a von Neu- mann algebra. The projections P,Q P (M ) are said to be equivalent (in ∈ notation: P Q), if there is a partial isometry U M such that UU ∗ P ∼ ∈ = and U ∗U Q. = Partial isometries can be defined even in C ∗-algebras, as follows.

DEFINITION 13 (Partial isometry). An element U of a C ∗-algebra is called partial isometry, if it satisfies the equation UU ∗U U. = 8 1. INTRODUCTION

The above defined relation on the projections is indeed an equiva- lence relations as one can convince oneself easily. Now we introduce a preorder with the use of this equivalence relation as follows. We say that P Q if there exists some P 0 P (M ) such that P 0 P and P 0 Q, that ¹ ∈ ∼ ≤ is, the range of P 0 is contained in the range of Q. (Here and throughout this work, the symbol between operators refers to the Loewner partial ≤ ordering.) A projection P is finite if Q P and Q P imply Q P. A pro- ≤ ∼ = jection which is not finite is called infinite. Let us mention an important result about the preordering . ¹ THEOREM 14 (Comparsion Theorem). Let P (M ) be the projection lattice of a von Neumann algebra M . Then, for any pair of elements P,Q P (M ) there exists a projection R P (M ) P (M )0 such that ∈ ∈ ∩ RPR RQR and (I R)Q (I R) (I R)P (I R). ¹ M − M − ¹ M − M − The comparsion theorem has an important consequence concerning algebras with trivial commutant. These algebras are called factors.

DEFINITION 15 (Factor). A von Neumann algebra M is called factor if its commutant is trivial, that is, M M 0 CI . ∩ = M The consequence is the following. If M is a factor then P (M ) is to- tally pre-ordered with respect to , that is, if P,Q P (M ) then either ¹ ∈ P Q or Q P holds. ¹ ¹ Furthermore, for any factor M , there exists a function d : P (M ) → [0, ] defined on the projection lattice with the following properties. ∞ d(P) 0 if and only if P 0. • = = d(P Q) d(P) d(Q) for any P,Q P (M ) which are orthogonal • + = + ∈ (i.e., PQ 0.) = d(P) d(Q) if and only if P Q. • ≤ ¹ d(P) is finite if and only if P is a finite projection. • d(P) d(Q) if and only if P Q • = ∼ d(P) d(Q) d(P Q) d(P Q) • + = ∧ + ∨ The function d is called dimension function. For any given factor M , the dimension function is unique up to harmless normalization, and the range of the dimension function describes the factor M . Therefore, the type of the factor is determined by the range of the dimension function defined on its projection lattice. One of the main achievements of Mur- ray and von Neumann was that they described the possible ranges of the dimension function on the projection lattice of a factor [68]. The possi- bilities are the followings. 2. TENSOR PRODUCT 9

the range of d the type of the factor M in notation {0,1,2,...,n} discrete, finite In {0,1,2,...,n,..., } discrete, infinite I ∞ ∞ [0,1] continuous, finite II1 [0, ] continuous, infinite II ∞ ∞ {0, } purely infinite III ∞ Any factor is exactly one of the above types. Moreover, any von Neu- mann algebra over a separable Hilbert space can be written as a direct integral of factors. Therefore, the result of Murray and von Neumann is a classification of von Neumann algebras.

EXAMPLE 16. The algebra B(H ) of all bounded linear operators on a separable Hilbert space H is of type In if H is n-dimensional and is of type I if H is infinite dimensional. ∞ 2. Tensor product 2.1. Tensor product of linear spaces. Let X and Y be arbitrary sets. Let F (X ) and F (Y ) denote the set of all complex valued functions on X and Y , respectively. Clearly, F (X ) and F (Y ) are linear spaces over the field C. Let M F (X ) and N F (Y ) be arbitrary linear subspaces. One ⊆ ⊆ can define the tensor product operation on M N the following way. × : M N F (X Y );(m,n) m n, ⊗ × → × 7→ ⊗ where m n is defined by ⊗ (m n)(x, y) m(x)n(y) ¡x X , y Y ¢. ⊗ = ∈ ∈ Note that as m and n are not necessarily linear functions — X and Y are not necessarily linear spaces!— the tensor product m n is not a bi- ⊗ linear function on X Y in general. However, the map (m,n) m n × 7→ ⊗ is clearly bilinear for arbitrary sets X ,Y and arbitrary function spaces M F (X ),N F (Y ). ⊆ ⊆ The tensor product space M N is defined by ⊗ ( J ) X M N m j n j : J N,m j M,n j N . ⊗ = j 1 ⊗ ∈ ∈ ∈ = Let us denote by L∗ the set of linear functionals on a linear space L. Clearly, any l L may be considered as an element of F (L∗) by ∈ ¡ ¢ l(α): α(l) l L, α L∗ . = ∈ ∈ That is, L F (L∗) Furthermore, straightforward computation shows that ⊆ any l L is a linear function on L∗. LetU and V be complex vector spaces. ∈ Then U F (U ∗) and V F (V ∗). ⊆ ⊆ 10 1. INTRODUCTION

According to the above definition, the tensor product map

¡ ¢ : U V F U ∗ V ∗ ;(u,v) u v, ⊗ × → × 7→ ⊗ is defined by ¡ ¢ ¡ ¢ (u v) α,β α(u)β(v) u U,v V,α U ∗,β V ∗ . ⊗ = ∈ ∈ ∈ ∈ In this setting the map u v : ¡α,β¢ (u v)¡α,β¢ is also bilinear, not ⊗ 7→ ⊗ just the map :(u,v) u v. ⊗ 7→ ⊗ Naturally, the tensor product space of the linear spaces U and V is

( J ) X U V u j v j : J N,u j U,v j V . ⊗ = j 1 ⊗ ∈ ∈ ∈ = 2.2. Tensor product of linear operators. In the previous subsection we defined the tensor product of linear spaces, hence now we are in the position to define the tensor product of linear operators. Let A Lin(U) and B Lin(V ) denote linear subspaces which con- ⊆ ⊆ sist of linear operators on the vector spaces U and V, respectively. The tensor product of elements of A and B are defined as : A B Lin(U V );(A,B) A B ⊗ × → ⊗ 7→ ⊗ where A B is the linear extension of the map ⊗ (u v) Au Bv (u U,v V ). ⊗ 7→ ⊗ ∈ ∈

2.3. Tensor product of C ∗-algebras. As any C ∗-algebra is a subalge- bra of B(H ) for some Hilbert space H , the elements of a C ∗-algebra may be viewed as linear operators on a linear space. Therefore, the ten- sor product A B makes sense for any A A and B B if A and B are ⊗ ∈ ∈ C ∗-algebras. For the sake of simplicity, we consider only finite dimen- sional algebras. The tensor product of the finite dimensional C ∗-algebras A and B is defined as ( J ) X A B A j B j : J N, A j A ,B j B . ⊗ = j 1 ⊗ ∈ ∈ ∈ = Note that A B is a C ∗-algebra. ⊗ 2.4. Reduced states and the partial trace. Let φ be a state (see Defi- nition 9) on the C ∗-algebra A B. Let us define the following functions: ⊗ φ (A): φ(A I )(A A ) 1 = ⊗ B ∈ and φ (B): φ(I B)(B B). 2 = A ⊗ ∈ 3. CONTINUOUS FUNCTIONAL CALCULUS 11

The functions φ1 and φ2 are clearly continuous linear functionals on A and B, respectively. On the other hand,

φ1(IA ) φ2(IB) φ(IA IB) φ(IA B) 1 = = ⊗ = ⊗ = and ¡ ¢ ¡ ¢ ¡¡ ¢ ¢ φ1 A∗ A φ A∗ A IB φ A∗ IB (A IB) = ¡ ⊗ = ¢⊗ ⊗ φ (A I )∗ (A I ) 0, = ⊗ B ⊗ B ≥ similarly, ¡ ¢ ¡ ¢ ¡¡ ¢ ¢ φ B ∗B φ I B ∗B φ I B ∗ (I B) 2 = A ⊗ = A ⊗ A ⊗ ¡ ¢ φ (I B)∗ (I B) 0, = A ⊗ A ⊗ ≥ hence both φ1 and φ2 are states. These functionals are called reduced states. Let H and K be separable Hilbert spaces and let us consider the von Neumann algebras B(H ),B(K ) and B(H K ) B(H ) B(K ). Any ⊗ = ⊗ normal state φ on B(H ) B(K ) can be represented by a density opera- ⊗ tor ρ T (H K ), see Example 10. Let the representing density opera- ∈ ⊗ tors of the reduced states φ1 and φ2 (which are also normal) be denoted by ρ1 and ρ2. The maps Tr : T (H K ) T (K ); ρ Tr ρ : ρ 1 ⊗ → 7→ 1 = 2 and Tr : T (H K ) T (K ); ρ Tr ρ : ρ 2 ⊗ → 7→ 2 = 1 are called partial traces.

3. Continuous functional calculus If A is a unital complex algebra then the spectrum of an element A ∈ A is denoted by σ(A) and defined by σ(A) {λ C : A λI is not invertible}. = ∈ − A Let A be a C ∗-algebra and let A be a normal element of A , that is, A∗ A AA∗. Let C (σ(A)) denote the C ∗-algebra of all continuous com- = plex valued functions defined on the compact set σ(A). Then there exists a unique isometric C ∗-algebra homomorphism Φ : C (σ(A)) A such A → that Φ (1) I and Φ (id (A)) A, see, e.g. [41, 4.4.5. Theorem]. The A = A A σ = map ΦA is called the continuous functional calculus corresponding to the normal element A and the notation f (A) Φ (f ) is often used — it will ≡ A be used also in this work. The most important consequence of this fact is that bounded normal operators on a Hilbert space do have a continuous functional calculus. The next example shows that if we assume that the Hilbert space H is finite dimensional, then the continuous functions of the normal elements of B(H ) can be computed quite easily. 12 1. INTRODUCTION

EXAMPLE 17. If H is finite dimensional Hilbert space then for any operator A B(H ) the spectrum σ(A) consists of finitely many ∈ point, and every element of the spectrum is an eigenvalue — that is, ¡ ¢ ker A aI H in nontrivial for any a σ(A). − B(H ) ⊂ ∈ If f : G C is a function defined on a set G C then the correspond- → ⊂ ing standard operator function is the following map:

f :{A B(H ): A∗ A AA∗ and σ(A) G} B(H ) ∈ = ⊆ → X X A aPa f (A): f (a)Pa, = a σ(A) 7→ = a σ(A) ∈ ∈ where σ(A) is the spectrum of A and Pa is the spectral projection corre- sponding to the eigenvalue a — which is nothing else but the orthogonal ¡ ¢ projection onto the subspace ker A aI H . − B(H ) ⊂ The mapping which maps the standard operator function f to the complex function f is exactly the continuous functional calculus on B(H ).

3.1. Operator monotonicity and convexity.

DEFINITION 18 (Operator monotone function). The real function f : R I R is said to be operator monotone if for any finite dimensional ⊃ → complex Hilbert space H the corresponding standard operator function

f :{A Bsa(H ): σ(A) I} B(H ) ∈ ⊆ → X X A aPa f (A): f (a)Pa, = a σ(A) 7→ = a σ(A) ∈ ∈ is monotone with respect to the Loewner partial ordering on self-adjoint operators, that is, f (A) f (B) whenever A B. ≤ ≤ DEFINITION 19 (Operator convex function). The real function f : R ⊃ I R is said to be operator convex if for any finite dimensional complex → Hilbert space H the corresponding standard operator function is convex with respect to the Loewner partial ordering on self-adjoint operators, that is, f (λA (1 λ)B) λf (A) (1 λ) f (B) + − ≤ + − for any A,B Bsa(H ) with σ(A),σ(B) I. ∈ ⊂ An exhaustive description of the results of matrix analysis related to operator monotonicity and convexity can be found in the monography of R. Bhatia [9]. 4. COMMUTATIVE VON NEUMANN ALGEBRAS 13

4. Commutative von Neumann algebras Now, we are in the position to answer the question why we call von Neumann algebra theory sometimes noncommutative probability theory? It is clear by Definition 2 that the function space L∞ (X ,A ,P) is a commutative Banach algebra for any probability space (X ,A ,P). Fur- thermore, it is easy to see that these Banach algebras are also C ∗- algebras with the complex conjugation as involution. It is folklore that 1 L∞ (X ,A ,P) is the Banach dual of the Banach space L (X ,A ,P), which is defined as follows: ½ ¯ Z ¾ ¯ ¯ ¯ L1 (X ,A ,P) f : X C¯ f is measurable and ¯f ¯dP . = → ¯ X < ∞ So, L∞ (X ,A ,P) is a commutative C ∗-algebra which is the dual of the Ba- 1 nach space L (X ,A ,P). By Proposition 8, this means that L∞ (X ,A ,P) is a commutative von Neumann algebra. The interesting fact is that the converse statement is also true. That is, every abelian von Neumann al- ¡ ¢ gebra is isomorphic to L∞ X ,S ,µ for some localizable measure space ¡ ¢ X ,S ,µ , see [81]. (A localizable measure space is the direct sum of finite measure spaces.) We can deduce that every probability space determines a commuta- tive von Neumann algebra — the algebra of the bounded random vari- ables — and every commutative von Neumann algebra determines a probability space, up to harmless normalization. That is the reason why the theory of von Neumann algbras may be considered as noncommuta- tive probability theory.

CHAPTER 2

Quantum variances, generalized entropies and relative entopies

We finished the previous chapter with the description of the corre- spondence between probability spaces and abelian von Neumann alge- bras — see Section 4 in Chapter 1. Fortunately, several interesting and useful notions of probability theory can be extended to the general von Neumann algebra setting. In this thesis, we focus on two distinguished concepts of probability theory, namely the (co)variance and the entropy. In the first section we investigate the following problem. Can we char- acterize those sets of observables for which the induced covariance map- ping is a roof ? (See Def. 21 for the definition of roof.) Note that this ques- tion does not make sense in the case of abelian von Neumann algebra for the following reason. It is known that every pure state is multiplicative on a commutative von Neumann algebra, see, e.g., [41, 4.4.1. Prop.]. There- fore, the covariance of any two observables is zero in any pure state. So, the covariance mapping is a roof if and only if it is identically zero which is clearly not the case. In the second section the strong subadditivity inequality of the en- tropy is investigated. Fairly nontrivial but rather easy computations show that the Shannon entropy is strongly subadditive. In my opinion, a much more sophisticated argument shows that its noncommutative counter- part, the is also strongly subadditive. The latter statement is a celebrated result of Lieb and Ruskai [49]. We consider a one-parameter generalization of the von Neumann entropy which is called Tsallis entropy. We show — in particular — that the Tsallis entropy is not strongly subadditive for noncommutative von Neumann algebras in spite of the facts that it is strongly subadditive in the commutative case [22, Thm 3.4] and that it is subadditive in the noncommutative case, as well [5]. In the third section we introduce the Bregman divergences which may be considered as certain generalizations of the Umegaki relative entropy. We characterize those Bregman divergences which are jointly convex, and we use this result to derive a sharp inequality for Tsallis entropy

15 16 2. QUANTUM VARIANCES, GENERALIZED ENTROPIES AND RELATIVE ENTOPIES which can be considered as a generalization of the strong subadditivity inequality of the von Neumann entropy.

1. Decomposable quantum variances

Let A be a von Neumann algebra of type In and let φ be a — neces- sarily normal — state on A . The covariance of the self-adjoint elements A,B A is defined by ∈ Cov (A,B) φ(AB) φ(A)φ(B). φ = − In particular, the variance of the observable (self-adjoint elements are of- ten called observables) A in the state φ is given by 2 Var (A) Cov (A, A) φ¡A2¢ ¡φ(A)¢ . φ = φ = − It is rather easy to check that Var (A λI ) Var (A) (A A ,λ R) φ + A = φ ∈ ∈ holds for any state φ. It is useful to introduce the covariance matrix of several observables. If A1,... Ar are self-adjoint elements of A , then their covariance matrix is defined as £Cov (A ,..., A )¤ : Cov ¡A , A ¢ (1 i, j r ). φ 1 r i,j = φ i j ≤ ≤ Observe that the above defined covariance matrix is necessarily self- adjoint as φ¡A A ¢ φ¡A A ¢. i j = j i One of the most important properties of the covariance is that it is a concave map on the set of states, that is, the mapping (2) Cov (A ,..., A ) : S Msa; φ Cov (A ,..., A ) (.) 1 r A → r 7→ φ 1 r is concave with respect to the Loewner ordering on the final space Mr . (For any A,B Msa we say that A B if B A is a positive semidefinite ∈ r ≤ − matrix.) Indeed, assume that 0 λ ,λ such that λ λ 1 and φ ,φ S . ≤ 1 2 1 + 2 = 1 2 ∈ A Then ¡ ¢ ¡ ¡ ¢ ¡ ¢¢ Covλ1φ1 λ2φ2 Ai , A j λ1Covφ1 Ai , A j λ2Covφ2 Ai , A j + − + λ λ ¡φ (A ) φ (A )¢¡φ ¡A ¢ φ ¡A ¢¢. = 1 2 1 i − 2 i 1 j − 2 j Therefore, the matrix ¡ ¢ Covλ1φ1 λ2φ2 (A1,..., Ar ) λ1Covφ1 (A1,..., Ar ) λ2Covφ2 (A1,..., Ar ) + − + is positive semidefinite, as it is a nonnegative multiple of a rank-one pro- jection. In particular, we obtain the concavity of the variance functional φ Var (A) 7→ φ 1. DECOMPOSABLE QUANTUM VARIANCES 17 for any self-adjoint A. As the von Neumann algebra A is of type In — that is, it is iso- morphic to the operator algebra B(H ) for an n-dimensional complex Hilbert space H , — every state is represented by a unique density op- erator (see Example 10). For the sake of simplicity, we will use the fol- lowing notation. If the state φ is represented by the density operator D, then we define Cov (.,.) : Cov (.,.), and so on, Var (.) : Var (.) and D = φ D = φ Cov (.,.,...,.) : Cov (.,.,...,.). D = φ Using this notation, the above derived concavity of the covariance matrix map (2) can be written as m m X X CovD (A1,..., Ar ) λk CovDk (A1,..., Ar ) if D λk Dk , ≥ k 1 = k 1 = = where λ 0 and Pm λ 1. k ≥ k 1 k = For any inequality,= it is an interesting task to investigate the case of equality. For such an investigation, a useful tool is the recently intro- duced notion of roof which is defined as follows.

DEFINITION 20 (Roof point). Let Ω be a compact convex set contained in a finite dimensional real linear space. Let G be a mapping from Ω into a partially ordered set. A point ω Ω is called roof point, if there are some ∈ extremal points π1,...,πm of Ω and nonnegative numbers p1,...,pm with Pm k 1 pk 1 such that = = m X pk πk ω k 1 = = and m X pkG (πk ) G (ω). k 1 = = DEFINITION 21 (Roof). A mapping G defined on Ω is called roof if ev- ery ω Ω is a roof point. ∈ The reader should consult [87] for further details. As A is finite dimensional, the set of the density operators is a com- pact convex subset of the real vector space of the self-adjoint elements of A . We are interested in the following question. Is the concave map- ping (2) a roof on SA ? It is well-known that the extremal points of the set of densities are exactly the rank-one projections. So we can reformulate our question. Given an arbitrary density D, can we find rank one pro- jections P ,...,P and nonnegative weights p ,...,p (with Pm p 1) 1 m 1 m k 1 k = such that = m X (3) D pk Pk = k 1 = 18 2. QUANTUM VARIANCES, GENERALIZED ENTROPIES AND RELATIVE ENTOPIES and m X CovD (A1,..., Ar ) pk CovPk (A1,..., Ar )? = k 1 = We say that (3) is an extremal convex decomposition of D. For r 1 the answer is positive, and this is the first result in this topic, = made by Petz and Tóth [77]. An extension of the former result was given by Petz and Léka in [46]. They proved that the answer is positive even in the case r 2. In our paper [78] we gave a necessary and sufficient = condition for the covariance mapping (2) being a roof in terms of the cor- responding observables. Our result applies for any finite collection of ob- servables, and it recovers all the aforementioned results easily.

LEMMA 22. If D1,...,Dm are density operators, A1,..., Ar are self- Pm adjoint elements of A and D k 1 λk Dk with some 0 λ1,...,λm, Pm = = < k 1 λk 1, then = = m X (4) CovD (A1,..., Ar ) λk CovDk (A1,..., Ar ) = k 1 = if and only if (5) TrD A TrDA for all 1 k m and 1 j r. k j = j ≤ ≤ ≤ ≤ PROOF. The covariance has the shift invariance property Cov (A ,..., A ) Cov (A λ I,..., A λ I) D 1 r = D 1 − 1 r − r for every reals λ ,...,λ . Set λ : TrDA . With this choice TrD(A 1 r j = j j − λ I) 0 holds for every j. Therefore j = [Cov (A ,..., A )] [Cov (A λ I,..., A λ I)] D 1 r i j = D 1 − 1 r − r i j TrD(A λ I)(A λ I). = i − i j − j Because of the concavity of the covariance, m X CovD (A1,..., Ar ) λk CovDk (A1,..., Ar ) − k 1 = is a positive semi-definite matrix, hence it is equal to zero if and only if the diagonal elements are zeros, that is, Ã m ! 2 X 2 ¡ ¢2 (6) TrD(A j λj I) λk TrDk (A j λj I) λk TrDk (A j λj I) 0 − − k 1 − − − = = holds for every j. It is easy to check that (6) holds if and only if TrD (A k j − λ I) 0 for every k, j and this is equivalent to (5). j = ä In the next subsection we use Lemma 22 to characterize those sets of self-adjoint elements of A for which the decomposition of the covariance with projections is possible. 1. DECOMPOSABLE QUANTUM VARIANCES 19

1.1. The main theorem. Recall that our von Neumann algebra A is (isomorphic to) the operator algebra B(H ), where H is a Hilbert space of dimension n. For an arbitrary subspace K H , we denote by QK the ⊂ orthogonal projection onto K . We define AK : QK AQK = for every element A A and ∈ B(K ): QK B(H )QK , Bsa(K ): QK Bsa(H )QK , = = K K B+(K ): Q B+(H )Q , S (K ): {X B+(K ) : Tr X 1}. = = ∈ = DEFINITION 23. Let {A ,..., A } be a set of self-adjoint elements of A 1 r = B(H ). The set {A1,..., Ar } is said to be variance-decomposable if for every D S there exists an extremal convex decomposition ∈ A m X D λk Pk = k 1 = of D such that m X CovD (A1,..., Ar ) λk CovPk (A1,..., Ar ) = k 1 = In other words, {A1,..., Ar } is variance-decomposable if and only if the mapping D Cov (A ,..., A ) is a roof. Our main result reads as 7→ D 1 r follows.

THEOREM 24. The set {A ,..., A } A is variance-decomposable if 1 r ⊂ and only if ¡ © K K K ª¢ 2 (7) dim span I , A ,..., A (dim K ) 1 r < for every subspace K H with dimK 1. ⊂ > PROOF. By Lemma 22, {A ,..., A } A is variance-decomposable if 1 r ⊂ and only if for every density operator D S there exist P ,...,P rank- ∈ A 1 m one projections such that D Conv({P ,...,P }) – where Conv(H) de- ∈ 1 m notes the convex hull of the set H – and (8) TrP A TrDA for all 1 k m and 1 j r. k j = j ≤ ≤ ≤ ≤ First, we show that the condition (7) is sufficient. It is enough to show that for every D S , rank(D) 1 there exist E ,...,E S density ∈ A > 1 m ∈ A operators such that (9) D Conv({E ,...,E }) ∈ 1 m and (10) TrE A TrDA for all k and j k j = j 20 2. QUANTUM VARIANCES, GENERALIZED ENTROPIES AND RELATIVE ENTOPIES and (11) rank(E ) rank(D). k < Let D be an arbitrary element of S with rank(D) 1 and K : A > = range(D). The set Bsa(K ) is a (dim(K ))2 dimensional Hilbert space over the field R with the positive definite inner product X ,Y Tr XY . Let us 〈 〉 = use the notation A (A ,..., A ). In the following we denote the identity = 1 r of A simply by I instead of IA . Define K sa ­ ® ­ ® L : {X B (K ): X , I 1, X , A D, A for all j}. D,A = ∈ 〈 〉 = j = j Clearly,

K sa ­ K ® D K E D K E L {X B (K ): X , I 1, X , A D, A for all j}. D,A = ∈ = j = j ¡ © K K K ª¢ 2 Because of the assumption dim span I , A1 ,..., Ar (dimK ) , K sa < LD,A is an affine subspace of B (K ) with positive dimension. It is well-known that S (K ) is a bounded convex set (for example, P 1 if P S (K ), where denotes the Hilbert-Schmidt norm). k k2 ≤ ∈ k·k2 Therefore, L K S (K ) is a bounded convex set and D,A ∩ K ¡ K ¢ L S (K ) Conv L ∂S (K ) , D,A ∩ ⊂ D,A ∩ where ∂S (K ) denotes the relative boundary of S (K ). By definition, D L K S (K ), and hence ∈ D,A ∩ m X m K X D λk Ek with some {Ek}k 1 LD,A ∂S (K ) and 0 λk, λk 1. = k 1 = ⊂ ∩ ≤ = = This is exactly the statemant we wanted to prove, because E ∂S (K ) k ∈ implies that rank(Ek ) dim(K ) rank(D), that is, (11) holds, and Ek K < = ∈ LD,A implies that (10) holds. Note that D has maximal rank in S (K ), hence it is a (relative) interior K point of S (K ). On the other hand, LD,A lies in the affine hull of S (K ) and has positive dimension. Therefore, the intersection L K S (K ) is D,A ∩ not a single point. To show that the condition (7) is necessary as well, assume that

¡ © K K K ª¢ 2 dim span I , A ,..., A (dimK ) 1 r = for some subspace K H with dimK 1. Set D S (K ), rank(D) ⊂ ¡ ©> ∈ ª¢ 2> 1. Because of the assumption dim span I K , AK ,..., AK (dimK ) , 1 r = L K is a 0 dimensional affine subspace of Bsa(K ), that is, L K {D}. D,A D,A = Therefore, we have by Lemma 22 that the decomposition of D is impos- sible. ä 2. GENERALIZATIONS OF THE STRONG SUBADDITIVITY INEQUALITY 21

The next example shows that for an arbitrary large n there exists a set of self-adjoint operators in B(H ) — where dim(H ) n — with only = three elements which is not variance-decomposable.

EXAMPLE. For every n 2 we can find self-adjoint operators A , A , A ≥ 1 2 3 ∈ B(H ) A and a density operator D SA with the following property. = ∈ Pm If P1,...,Pm are rank-one projections such that D k 1 λk Pk with some Pm = = 0 λ1,...,λm, k 1 λk 1, then < = = m X (12) CovD (A1, A2, A3) λk CovPk (A1, A2, A3). 6= k 1 = Let {e1,...,en} be an arbitrary orthonormal basis in H and let us identify the elements of B(H ) with their matrices in this basis. Let us use the Pauli matrices · 0 1 ¸ · 0 i ¸ · 1 0 ¸ σ , σ − , σ 1 = 1 0 2 = i 0 3 = 0 1 − to define A1, A2, A3 the following way · σ 0 ¸ · σ 0 ¸ · σ 0 ¸ A : 1 , A : 2 , A : 3 , 1 = 0 0 2 = 0 0 3 = 0 0 and D : Diag¡ 1 , 1 ,0,...,0¢. By Lemma 22, = 2 2 m X CovD (A1, A2, A3) λk CovPk (A1, A2, A3) = k 1 = if and only if TrP A 0 for every k and j, but in this case we have P K 0 k j = k = for K range(D), hence D can not be a convex combination of the P ’s. = k Therefore, (12) holds.

The proof of the statement of the previous example is shorter if we use the Theorem. The only thing we have to observe is that ¡ © K K K K ª¢ 2 dim span I , A , A , A (dimK ) 1 2 3 = for K range(D). = 2. Generalizations of the strong subadditivity inequality

Let A be a von Neumann algebra of type In and let us denote by H the underlying n-dimensional Hilbert space — that is, A B(H ). Let ρ = be a density operator which represents a state on A . Note that in this case ρ A and the expression f (ρ) makes sense by the continuous functional ∈ calculus for any complex function f which is continuous on the spectrum of ρ — see Section 3. 22 2. QUANTUM VARIANCES, GENERALIZED ENTROPIES AND RELATIVE ENTOPIES

The von Neumann entropy of the density operator ρ is defined by (13) S(ρ) Trρ lnρ = − see, e.g., [9, 35, 71]. Let the Hilbert space H be the tensor product of three finite dimensional Hilbert spaces, that is, H : H H H . (Note = 1⊗ 2⊗ 3 that tensor products are defined in Section 2.) Let ρ B(H ) be a den- 123 ∈ sity operator. The reduced densities are defined by partial traces — see Subsection 2.4. Let us use the following notation. (14) ρ : Tr ρ , ρ : Tr ρ , ρ : Tr ρ . 12 = 3 123 2 = 1 12 23 = 1 123 As in our case the states and the density operators are in one-to-one cor- respondence, densities will be called sometimes states, and we will refer to reduced densities sometimes by the expression reduced state. One of the most important results in quantum information theory is the strong subadditivity of the von Neumann entropy, which is the fol- lowing inequality. S(ρ ) S(ρ ) S(ρ ) S(ρ ). 123 + 2 ≤ 12 + 23 This result was made by E. Lieb and M. B. Ruskai in 1973 [49, 71]. Our aim is to generalize this inequality in various ways. The key object of our investigations is a certain generalization of the von Neumann entropy which is called

2.1. Tsallis entropy. The Tsallis entropy is a one-parameter exten- sion of the von Neumann entropy. For any real q, one can define the deformed logarithm (or q-logarithm) function ln : (0, ) R by q ∞ → q 1 Z x ( x − 1 q 2 q 1− if q 1, (15) lnq x : t − dt − 6= = 1 = ln x if q 1. = The corresponding entropy S (ρ) Trρ ln ρ q = − q is called Tsallis entropy [2, 17]. It is reasonable to restrict ourselves to the 0 q case, because limx 0 x lnq x 0 if and only if 0 q. If we < → + − = < introduce the notation f (x) x ln x we can write S (ρ) Trf (ρ). q = q q = − q 2.2. The Tsallis entropy is subadditive, but not strongly subaddi- tive. Let H1 and H2 be finite dimensional Hilbert spaces. If ρ12 is a state on a Hilbert space H H — that is, ρ B (H H ) such that 0 ρ 1⊗ 2 12 ∈ 1 ⊗ 2 ≤ 12 and Trρ 1, — then it has reduced states ρ : Tr ρ and ρ : Tr ρ 12 = 1 = 2 12 2 = 1 12 on the spaces H1 and H2, respectively. The subadditivity inequality of the Tsallis entropy is (16) S (ρ ) S (ρ ) S (ρ ), q 12 ≤ q 1 + q 2 2. GENERALIZATIONS OF THE STRONG SUBADDITIVITY INEQUALITY 23 and it has been proved for q 1 by Audenaert in 2007 [5]. > First, we have to note an easy consequence of Audenaert’s theorem. Assume that H H H H for some finite dimensional Hilbert = 1 ⊗ 2 ⊗ 3 spaces H (i 1,2,3). Let ρ B(H ) be a density operator and let the i = 123 ∈ reduced densities be labeled according to the notation introduced in (14). The subadditivity of the von Neumann entropy implies that

S (ρ ) S (ρ ) S (ρ ) q 123 ≤ q 12 + q 3 and S (ρ ) S (ρ ) S (ρ ). q 123 ≤ q 1 + q 23 It follows that S (ρ ) S (ρ ) S (ρ ) S (ρ ) q 123 + q 2 − q 12 − q 23 min{S (ρ ) S (ρ ) S (ρ ),S (ρ ) S (ρ ) S (ρ )}. ≤ q 1 + q 2 − q 12 q 2 + q 3 − q 23 The Tsallis entropy is nonnegative and takes its maximum at the com- 1 pletely mixed state — that is, in the state dimH IB(H ) — and the maximal value is ln 1 , where d is the dimension of the underlying Hilbert space. − q d Therefore, 1 1 Sq (ρ123) Sq (ρ2) Sq (ρ12) Sq (ρ23) lnq lnq , + − − ≤ − d2 − min{d1,d3} where di is the dimension of Hi . However, the strong subadditivity inequality

(17) S (ρ ) S (ρ ) S (ρ ) S (ρ ) q 123 + q 2 ≤ q 12 + q 23 does not hold in general.

PROPOSITION 25. The only strongly subadditive Tsallis entropy is the von Neumann entropy, that is, the strong subadditivity of the Tsallis en- tropy holds if and only if q 1. = PROOF. It is known that if ρ is a state on a Hilbert space H for i 1,2 i i = then ρ ρ is a state on H H and the relation 1 ⊗ 2 1 ⊗ 2 (18) S (ρ ρ ) S (ρ ) S (ρ ) (1 q)S (ρ )S (ρ ) q 1 ⊗ 2 = q 1 + q 2 + − q 1 q 2 holds. Therefore, the Tsallis enrtopy can not be subadditive for q 1. In < fact, it is neither subadditive, nor superadditive [22]. So we have shown that the Tsallis entropy is not strongly subadditive for q 1. On the other < hand, the next examples show that (17) does not hold for 1 q neither. < 24 2. QUANTUM VARIANCES, GENERALIZED ENTROPIES AND RELATIVE ENTOPIES

Set q 1 and consider the operator >  0 0 0 0 0 0 0 0     0 0 0 0 0 0 0 0     1 1   0 0 4 0 4 0 0 0     0 0 0 1 0 1 0 0  ¡ 2 2 2¢  4 4  (19) B C C C ρ  . ⊗ ⊗ 3 123 =  1 1   0 0 4 0 4 0 0 0     0 0 0 1 0 1 0 0   4 4     0 0 0 0 0 0 0 0    0 0 0 0 0 0 0 0 ρ is positive and Trρ 1. We have 123 123 =    1  0 0 0 0 4 0 0 0     " #  0 1 1 0   0 1 0 0  1 0  2 2 ,  4 , 2 . ρ12  1 1  ρ23  1  ρ2 1 =  0 0  =  0 0 0  = 0  2 2   4  2 1 0 0 0 0 0 0 0 4 Straightforward computations show that 1 µ µ1¶q µ1¶q ¶ 1 µ µ1¶q ¶ Sq (ρ123) Sq (ρ2) 1 2 1 2 2 4 + = q 1 − 2 + − 2 = q 1 − 2 − − and 1 µ µ1¶q ¶ Sq (ρ12) Sq (ρ23) Sq (ρ23) 1 4 . + = = q 1 − 4 − 1 q By the inequality of geometric and arithmetic means, we have 2 2 − 1 q · < 1 4 − , which immediately shows that S (ρ ) S (ρ ) S (ρ ) + q 123 + q 2 > q 12 + S (ρ ). q 23 ä This is not so surprising, if we consider a bit more general example. ¡ ¢ EXAMPLE. If ρ12 is an entangled pure state (i.e., rank ρ12 1 but ¡ ¢ ¡ ¢ = rank ρ ,rank ρ 1), let ρ be a state on H such that 1 rank(ρ ) 1 2 > 3 3 < 3 and let ρ : ρ ρ . Then 123 = 12 ⊗ 3 S (ρ ) S (ρ ) S (ρ ) S (ρ ) q 123 + q 2 > q 12 + q 23 holds for every 1 q. Indeed, if we use (18) we get < S (ρ ) S (ρ ) q 123 + q 2 S (ρ ) S (ρ ) (1 q)S (ρ )S (ρ ) S (ρ ) S (ρ ) S (ρ ) = q 12 + q 3 + − q 12 q 3 + q 2 = q 2 + q 3 and S (ρ ) S (ρ ) S (ρ ) S (ρ ) S (ρ ) (1 q)S (ρ )S (ρ ), q 12 + q 23 = q 23 = q 2 + q 3 + − q 2 q 3 2. GENERALIZATIONS OF THE STRONG SUBADDITIVITY INEQUALITY 25 because S (ρ ) 0. The density ρ is entangled, hence S (ρ ) 0, and q 12 = 12 q 2 > this verifies the statement.

It is an interesting fact that for classical probability distributions, the Tsallis entropy is strongly subadditive for 1 q [22], and this result has ≤ an elegant and short proof. The only thing necessary for the proof is that for any positive x, y and q, the identity

³ y ´ q 1 (20) lnq x lnq y lnq x − − = − x holds. Now we restate the proof of Furuichi.

m, n, r ROOF P . Let {p jkl }j 1,k 1,l 1 be a discrete probability distribution. Let = = = Pr Pm us introduce the notation p jk l 1 p jkl ,pkl j 1 p jkl and pk Pm m n r = = = = = j 1 p jk . If B (C C C ) ρ123 Diag({p jkl }), then by (20) we have = ⊗ ⊗ 3 = µ ¶ X ¡ ¢ ¡ ¢ X q p jk Sq (ρ123) Sq (ρ12) p jkl (lnq p jkl lnq p jk ) p jkl lnq − = − j,k,l − = j,k,l p jkl

µ ¶q µ ¶ X q p jkl p jk p jk lnq . = j,k,l p jk p jkl

Observe that xq ln ¡ 1 ¢ f (x), hence this expression can be written as q x = − q µ ¶ µ ¶q µ ¶ X q p jkl X p jk q p jkl p jk fq pk fq − j,k,l p jk = − j,k,l pk p jk

à µ ¶! X q X p jk p jkl pk fq . ≤ − k,l j pk p jk fq is convex, hence

à µ ¶! à ! µ ¶ X q X p jk p jkl X q X p jk p jkl X q pkl pk fq pk fq pk fq − k,l j pk p jk ≤ − k,l j pk p jk = − k,l pk =

µ ¶q µ ¶ µ ¶ X q pkl pk X q pk X ¡ ¢ ¡ ¢ pk lnq pkl lnq pkl (lnq pkl lnq pk ) = k,l pk pkl = k,l pkl = − k,l −

S (ρ ) S (ρ ). = q 23 − q 2

ä 26 2. QUANTUM VARIANCES, GENERALIZED ENTROPIES AND RELATIVE ENTOPIES

2.3. Relative entropy and monotonicity. Let H be a finite dimen- sional complex Hilbert space and n : dimH . If φ H then the symbols ¯ ® ­ ¯ = ∈ ¯φ and φ¯ denote certain mappings which are defined as follows. ¯ ® ¯ ® ¯φ : C H ; α ¯φ α : αφ. → 7→ = ­ ¯ ­ ¯ ­ ® φ¯ : H C; ψ φ¯ψ : φ,ψ . → 7→ = Let f be a (0, ) R function, let ρ,σ B++(H ) be positive definite ∞ → ∈ operators and A B(H ). We define the relative quasi-entropy by ∈ A ¡ ¢ D 1 ¡ ¢³ 1 ´E (21) S ρ σ : Aρ 2 , f ∆(σ/ρ) Aρ 2 , f || = where A,B Tr A∗B is the Hilbert-Schmidt inner product and ∆(σ/ρ) is 〈 〉 = the relative modular operator introduced by Araki [4]: 1 ∆(σ/ρ): B(H ) B(H ), X σX ρ− . → 7→ If A I, we simply write S ¡ρ σ¢. Quasi entropies were introduced by = f || Dénes Petz, see [76] and [75]. The following statement — which can be verified by direct compu- tations — appeared in the paper [83] and it makes the relative quasi- entropy easy to compute in some cases.

LEMMA 26. Let the spectral decomposition of the positive definite op- erators ρ and σ be given by n n X ¯ ®­ ¯ X ¯ ®­ ¯ ρ λj ¯ϕj ϕj ¯ and σ µk ¯ψk ψk ¯. = j 1 = k 1 = = Then we have µ ¶ A ¡ ¢ X µk ¯­ ¯ ¯ ®¯2 (22) S f ρ σ λj f ¯ ψk ¯ A ¯ϕj ¯ . || = j,k λj © ªn © ªn ROOF P . As ϕj j 1 and ψk k 1 are orthonormal bases in H , the set ©¯ ®­ ¯ªn = = ¯ψk ϕj ¯ j,k 1 forms an orthonormal basis of B(H ) (with respect to the Hilbert-Schmidt= inner product.) It is easy to check that with the notation ¯ ®­ ¯ v : ¯ψ ϕ ¯ we can write the relative modular operator as jk = k j X µk ¯ ®­ ¯ (23) ∆(σ/ρ) ¯v jk v jk ¯, = j,k λj ¯ ®­ ¯ where ¯v v ¯ : B(H ) B(H ) is defined by jk jk → ¯ ®­ ¯ ¯v v ¯(X ): v Tr v∗ X . jk jk = jk jk Therefore, µ ¶ ¡ ¢ X µk ¯ ®­ ¯ f ∆(σ/ρ) f ¯v jk v jk ¯. = j,k λj 2. GENERALIZATIONS OF THE STRONG SUBADDITIVITY INEQUALITY 27

Direct computation shows that D 1 ¯ ®­ ¯³ 1 ´E Aρ 2 ,¯v v ¯ Aρ 2 jk jk = à à !! 1 1 X 2 ¯ ®­ ¯ ¯ ®­ ¯ ¯ ®­ ¯ X 2 ¯ ®­ ¯ Tr λa ¯ϕa ϕa¯ A∗ ¯ψk ϕj ¯Tr ¯ϕj ψk ¯ A λb ¯ϕb ϕb¯ = a b 1 1 X 2 2 ¡¯ ®­ ¯ ¯ ®­ ¯ ¡­ ®­ ¯ ¯ ®¢¢ λa λb Tr ¯ϕa ϕa¯ A∗ ¯ψk ϕj ¯Tr ϕb ϕj ψk ¯ A ¯ϕb = a,b | 1 1 X 2 2 ­ ¯ ¯ ® ¡¯ ®­ ¯ ¯ ®­ ¯¢ λa λb δb j ψk ¯ A ¯ϕb Tr ¯ϕa ϕa¯ A∗ ¯ψk ϕj ¯ = a,b 1 1 X 2 2 ­ ¯ ¯ ®­ ¯ ¯ ® ¯­ ¯ ¯ ®¯2 λa λb δb j δa j ψk ¯ A ¯ϕb ϕa¯ A∗ ¯ψk λj ¯ ψk ¯ A ¯ϕj ¯ , = a,b = therefore S A ¡ρ σ¢ f || * µ ¶ + µ ¶ 1 X µk ¯ ®­ ¯³ 1 ´ X µk ¯­ ¯ ¯ ®¯2 Aρ 2 , f ¯v jk v jk ¯ Aρ 2 λj f ¯ ψk ¯ A ¯ϕj ¯ . = j,k λj = j,k λj

ä A short and elegant proof of the monotonicity of the relative entropy is given by Nielsen and Petz in [70]. The statement is that if A,B B(H ) ∈ ⊗ B(K ) are positive definite operators and we set A Tr A, B Tr B, 1 = 2 1 = 2 then for an operator convex function f the following inequality holds:

(24) S (A B) S (A B ). f || ≥ f 1|| 1 There is a useful extension of this monotonicity in [83]. Now we re- state Sharma’s proof, which is essentially the same as the proof of Nielsen and Petz [70].

LEMMA 27. Suppose that A,B B(H ) B(K ) are positive definite ∈ ⊗ operators and let us use the notations A Tr A, B Tr B. If f is an 1 = 2 1 = 2 operator convex function then for any T B(H ) the following inequality ∈ holds:

T IB(K ) T (25) S ⊗ (A B) S (A B ), f || ≥ f 1|| 1 where IB(K ) is the identity in B(K ).

PROOF. Let us consider the linear map

µ 1 ¶ 1 U : B(H ) B(H ) B(K ); X U (X ): XA− 2 I A 2 . → ⊗ 7→ = 1 ⊗ B(K ) 28 2. QUANTUM VARIANCES, GENERALIZED ENTROPIES AND RELATIVE ENTOPIES

We can check that U is an isometry. For X ,Y B(H ), ∈ µ 1 µ 1 ¶µ 1 ¶ 1 ¶ 2 2 U (X ),U (Y ) Tr A 2 A− X ∗ I YA− I A 2 〈 〉 = 1 ⊗ B(K ) 1 ⊗ B(K )

µ µ 1 1 ¶¶ µ 1 1 ¶ 2 2 2 2 Tr A A− X ∗YA− I Tr A A− X ∗YA− = 1 1 ⊗ B(K ) = 1 1 1

Tr X ∗Y X ,Y . = = 〈 〉 The short computation Y ,U (X ) 〈 〉 µ 1 ¶ 1 ³ 1 ´ µ 1 ¶ 2 ∗ ¡ ¢ 2 TrY ∗ XA− I A 2 Tr YA 2 X I A− I = 1 ⊗ B(K ) = ⊗ B(K ) 1 ⊗ B(K )

µ 1 ¶³ 1 ´ 2 ¡ ¢ Tr A− I YA 2 ∗ X I = 1 ⊗ B(K ) ⊗ B(K )

µ 1 µ 1 ¶¶ 2 ∗ ¡ ¢ Tr YA 2 A− I X I = 1 ⊗ B(K ) ⊗ B(K )

µ µ µ 1 ¶¶¶ ¿ µ µ 1 ¶¶ À 1 ∗ 1 Tr Tr YA 2 A− 2 I X Tr YA 2 A− 2 I , X 2 1 ⊗ B(K ) = 2 1 ⊗ B(K ) shows that the adjoint of U (which will be denoted by U ∗) is the map

µ 1 µ 1 ¶¶ Y Tr YA 2 A− 2 I . 7→ 2 1 ⊗ B(K ) One can see that U admits the beautiful relation

(26) U ∗∆(B/A)U ∆(B /A ). = 1 1 If X B(H ), then ∈ µ µ 1 ¶ 1 1 µ 1 ¶¶ 2 1 2 U ∗∆(B/A)U (X ) Tr B XA− I A 2 A− A 2 A− I = 2 1 ⊗ B(K ) 1 ⊗ B(K ) ¡ ¡ 1 ¢¢ 1 Tr B XA− I B XA− ∆(B /A )(X ). = 2 1 ⊗ B(K ) = 1 1 = 1 1 By definition of the relative entropy and by (26), the right-hand-side of (25) can be written as

¿ 1 µ 1 ¶À ST (A B ) TA 2 , f (∆(B /A )) TA 2 f 1|| 1 = 1 1 1 1

¿ 1 µ 1 ¶À 2 ¡ ¢ 2 TA , f U ∗∆(B/A)U TA . = 1 1 The operator convexity of f implies that ¡ ¢ f U ∗∆(B/A)U U ∗ f (∆(B/A))U ≤ 2. GENERALIZATIONS OF THE STRONG SUBADDITIVITY INEQUALITY 29

µ 1 ¶ 1 2 ¡ ¢ (see Chapter 5 of [9]), and U TA T I A 2 is immediate. 1 = ⊗ B(K ) Therefore,

¿ 1 µ 1 ¶À ¿ µ 1 ¶ µ µ 1 ¶¶À 2 ¡ ¢ 2 2 2 TA , f U ∗∆(B/A)U TA U TA , f (∆(B/A)) U TA 1 1 ≤ 1 1

1 1 D¡ ¢ ³¡ ¢ ´E T IB(K ) T I A 2 , f (∆(B/A)) T I A 2 S ⊗ (A B), = ⊗ B(K ) ⊗ B(K ) = f || and the proof is complete. ä 2.4. From the relative entropy to the strong subadditivity. As we have seen in Subsubection 2.2, the strong subadditivity of the Tsallis en- tropy holds if and only if q 1. Therefore, our goal is to find an inequality = (27) S (ρ ) S (ρ ) S (ρ ) S (ρ ) g (ρ ), q 123 + q 2 ≤ q 12 + q 23 + q 123 where g (ρ ) 0. Such a result may be considered as a generalization 1 123 = of the strong subadditivity inequality. We collect here some elementary facts that will be useful in the fol- lowing. (1) For any positive real numbers x, y and q the identity ³ y ´ ³ y ´ (28) lnq x lnq y lnq (q 1)lnq lnq x − = − x − − x holds. (2) If f and g are R R functions, ρ,σ Bsa(H ) and the spectral → ¯ ®­ ∈¯ ¯ ®­ ¯ decompositions are ρ P λ ¯ϕ ϕ ¯ and σ P µ ¯ψ ψ ¯, = j j j j = k k k k and domain of f and g contains the spectrum of ρ and σ, respec- tively, then

X ¯­ ®¯2 (29) Tr f (ρ)g(σ) f (λj )g(µk )¯ ϕj ψk ¯ . = j,k | (3) If A B(H ) B(K ) and B B(H ), then ∈ ⊗ ∈ (30) Tr ¡A B I ¢ (Tr A) B. 2 · ⊗ B(K ) = 2 · The strong subadditivity of the von Neumnann entropy can be de- rived from the monotonicity of the Umegaki relative entropy, which is a particular quasi-entropy [13, 70]. Therefore, it seems to be useful to re- formulate the strong subadditivity of the Tsallis entropy as an inequality of certain quasi-entropies.

THEOREM 28. Let ρ be an element of B++ (H H H ). The 123 1 ⊗ 2 ⊗ 3 strong subadditivity inequality of the Tsallis entropy (17) is equivalent to (31) SU ¡ρ ρ I ¢ SV ¡ρ ρ I ¢, lnq 123 12 3 lnq 23 2 3 − || ⊗ ≥ − || ⊗ 30 2. QUANTUM VARIANCES, GENERALIZED ENTROPIES AND RELATIVE ENTOPIES where

1 (q 1) 1 (q 1) (32) U ρ 2 − , V ρ 2 − . = 123 = 23 PROOF. By (30), we have

Tr ¡ρ ln (ρ I )¢ ρ ln (ρ ) 3 123 q 12 ⊗ 3 = 12 q 12 and Tr ¡ρ ln (ρ I )¢ ρ ln (ρ ), 3 23 q 2 ⊗ 3 = 2 q 2 hence the inequality

S (ρ ) S (ρ ) S (ρ ) S (ρ ), − q 123 + q 12 ≥ − q 12 + q 2 which is obviously equivalent to (17), can be written in the form

(33) Trρ ¡ln (ρ ) ln (ρ I )¢ Trρ ¡ln (ρ ) ln (ρ I )¢. 123 q 123 − q 12 ⊗ 3 ≥ 23 q 23 − q 2 ⊗ 3 ¯ ®­ ¯ ¯ ®­ ¯ By (29), if ρ P λ ¯ϕ ϕ ¯ and ρ I P µ ¯ψ ψ ¯, then 123 = j j j j 12 ⊗ 3 = k k k k the left hand side of (33) is X ¡ ¢¯­ ®¯2 λj lnq λj lnq µk ¯ ϕj ψk ¯ . j,k − | By (20), this expression can be written as µ µ ¶ ¶ X µk q 1 ¯­ ®¯2 − (34) λj lnq λj ¯ ϕj ψk ¯ . j,k − λj |

1 (q 1) In addition, if U ρ 2 − then = 123 ¯­ ¯ ¯ ®¯2 q 1 ¯­ ®¯2 ¯ ψ ¯U ¯ϕ ¯ λ − ¯ ψ ϕ ¯ , k j = j k | j hence by the result of Lemma 26, we can write (34) as the following rela- tive entropy µ µ ¶¶ X µk ¯­ ¯ ¯ ®¯2 U ¡ ¢ (35) λ ln ¯ ψ ¯U ¯ϕ ¯ S ρ ρ I . j q k j lnq 123 12 3 j,k − λj = − || ⊗ The observation that the right hand side of (33) can be written as a rela- tive entropy similarly to (35) completes the proof. ä Note that in the special case q 1, Theorem 28 states the equivalence = of the monotonicity of the Umegaki relative entropy and the strong sub- additivity of the von Neumann entropy. The following theorem provides an inequality which is of the form (27). 2. GENERALIZATIONS OF THE STRONG SUBADDITIVITY INEQUALITY 31

THEOREM 29. Using the notation of the previous theorem, for 0 q 2 < ≤ the inequality S (ρ ) S (ρ ) S (ρ ) S (ρ ) q 12 + q 23 − q 123 − q 2 µ 1 1 ¶ ( lnq ρ123) 2 ¡ ¢ ( lnq ρ23) 2 ¡ ¢ (q 1) S − ρ123 ρ12 I3 S − ρ23 ρ2 I3 ≥ − lnq || ⊗ − lnq || ⊗ holds. As we have claimed that the above inequality is of the form (27), it is not surprising that this statemant recovers the strong subadditivity of the von Neumann entropy if q 1. = PROOF. We noted that if X ¯ ®­ ¯ ρ123 λj ¯ϕj ϕj ¯ = j and X ¯ ®­ ¯ ρ12 I3 µk ¯ψk ψk ¯, ⊗ = k then the left hand side of (33) is X ¡ ¢¯­ ®¯2 λj lnq λj lnq µk ¯ ϕj ψk ¯ . j,k − | According to (28), it is equal to µ µ ¶ µ ¶ ¶ X µk µk ¯­ ®¯2 (36) λj lnq (q 1)lnq lnq λj ¯ ϕj ψk ¯ , j,k − λj − − λj | and by Lemma 26, (36) has the form

1 ( lnq ρ123) 2 ¡ ¢ − ¡ ¢ (37) S lnq ρ123 ρ12 I3 (q 1)S ρ123 ρ12 I3 . − || ⊗ + − lnq || ⊗ If we rewrite the right hand side of (33) similarly to (37), we get that the strong subadditivity is equivalent to

1 ( lnq ρ123) 2 ¡ ¢ − ¡ ¢ S lnq ρ123 ρ12 I3 (q 1)S ρ123 ρ12 I3 − || ⊗ + − lnq || ⊗

1 ( lnq ρ23) 2 ¡ ¢ − ¡ ¢ (38) S lnq ρ23 ρ2 I3 (q 1)S ρ23 ρ2 I3 . ≥ − || ⊗ + − lnq || ⊗ It is easy to derive from the Loewner-Heinz theorem [9, 13, 35] that ln x is operator monotone, if 0 q 2. Any operator monotone function q < ≤ is operator concave [35], hence ln x is operator convex. − q By the monotonicity property (25), for 0 q 2 we have < ≤ ¡ ¢ ¡ ¢ S lnq ρ123 ρ12 I3 S lnq ρ23 ρ2 I3 , − || ⊗ ≥ − || ⊗ 32 2. QUANTUM VARIANCES, GENERALIZED ENTROPIES AND RELATIVE ENTOPIES and by (38), this is equivalent to µ 1 ¶ ( lnq ρ123) 2 ¡ ¢ Sq (ρ123) Sq (ρ12) (q 1) S − ρ123 ρ12 I3 − + − − lnq || ⊗

µ 1 ¶ ( lnq ρ23) 2 ¡ ¢ (39) Sq (ρ23) Sq (ρ2) (q 1) S − ρ23 ρ2 I3 . ≥ − + − − lnq || ⊗ This is the statement of the theorem. ä The notation (32) will be used again. Because of the monotonicity property (25), for 0 q 2 and f (x) ln x, A ρ ,B ρ I ,T V < ≤ = − q = 123 = 12 ⊗ 3 = we have

I1 V ¡ ¢ V ¡ ¢ (40) S ⊗ ρ123 ρ12 I3 S ρ23 ρ2 I3 . lnq lnq − || ⊗ ≥ − || ⊗ This formula is quite similar to the strong subadditivity inequality (31). By (40),

U ¡ ¢ I1 V ¡ ¢ (41) S ρ123 ρ12 I3 S ⊗ ρ123 ρ12 I3 lnq lnq − || ⊗ ≥ − || ⊗ implies the strong subadditivity (31). We try to find a sufficient condition for (41).

THEOREM 30. If ρ123 and I1 ρ23 commute, and (using the usual nota- ¯ ®­ ¯ ⊗ ¯ ®­ ¯ tion ρ P λ ¯ϕ ϕ ¯ and ρ I P µ ¯ψ ψ ¯) we have λ µ 123 = j j j j 12⊗ 3 = k k k k j ≤ k whenever ­ψ ϕ ® 0, then for any 1 q 2 the strong subadditivity in- k | j 6= ≤ ≤ equality S (ρ ) S (ρ ) S (ρ ) S (ρ ) q 123 + q 2 ≤ q 12 + q 23 holds.

Note that if ρ is a classical probability distribution (that is, ρ 123 123 = Diag({p jkl })), then the conditions of Theorem 30 are clearly satisfied.

PROOF. By Lemma 26, the inequality (41) is equivalent to

¯ 1 ¯2 µ µ ¶¶ (q 1) µk X¯­ ¯ 2 − ¯ ®¯ ¯ ψk ¯ρ123 ¯ϕj ¯ lnq λj j,k ¯ ¯ − λj

¯ 1 ¯2 µ µ ¶¶ (q 1) µk X¯­ ¯ 2 − ¯ ®¯ ¯ ψk ¯ I1 ρ23 ¯ϕj ¯ lnq λj . ≥ j,k ¯ ⊗ ¯ − λj

³ ³ µk ´´ If λj µk , then lnq λj 0. On the other hand, ≤ − λj ≤

¯ 1 ¯2 ¯­ ¯ 2 (q 1) ¯ ®¯ q 1 ¯­ ®¯2 ¯ ψ ¯ρ − ¯ϕ ¯ λ − ¯ ψ ϕ ¯ . ¯ k 123 j ¯ = j k | j 2. GENERALIZATIONS OF THE STRONG SUBADDITIVITY INEQUALITY 33 © ª If ρ123 and I1 ρ23 commute, then I1 ρ23 is diagonal in the basis ϕj j J ⊗ ⊗ ¯ ®­ ¯ ∈ and ρ I ρ holds, that is, I ρ P ν ¯ϕ ϕ ¯ with some ν 123 ≤ 1 ⊗ 23 1 ⊗ 23 = j j j j j ≥ λj . (q 1) If 1 q, the map t t − is monotone on R , hence we have ≤ 7→ + ¯ 1 ¯2 ¯­ ¯ 2 (q 1) ¯ ®¯ q 1 ¯­ ®¯2 q 1 ¯­ ®¯2 ¯ ψ ¯ I ρ − ¯ϕ ¯ ν − ¯ ψ ϕ ¯ λ − ¯ ψ ϕ ¯ . ¯ k 1 ⊗ 23 j ¯ = j k | j ≥ j k | j We concluded that if the conditions of Theorem 30 are satisfied, then (41) holds, and hence the proof is complete. ä The following example shows that one can apply Theorem 30 in es- sentially non-classical cases, as well. £ 1 ¤ EXAMPLE. Set p,q ,1 such that pq 1 q and t R. Let us define ∈ 2 ≤ − ∈ V and Λ by  cost 0 0 sint  −  0 cost sint 0  V  −  =  0 sint cost 0  sint 0 0 cost and Λ Diag(pq,(1 p)q,p(1 q),(1 p)(1 q)). = − − − − V describes a family of orthonormal bases, this family can be considered as a one-parameter extension of the Bell basis. Let ρ1 B(H ) be an ar- ¡ ¢ ∈ bitrary density, and ρ B C2 C2 be defined by 23 ∈ ⊗   A11 0 0 A12 1  0 B11 B12 0  ρ23 V ΛV −  , = =  0 B21 B22 0  A21 0 0 A22 where A pq cos2 t (1 p)(1 q)sin2 t, 11 = + − − A (pq (1 p)(1 q))sint cost, 12 = − − − A (pq (1 p)(1 q))sint cost, 21 = − − − A (1 p)(1 q)cos2 t pq sin2 t 22 = − − + and B (1 p)q cos2 t p(1 q)sin2 t, 11 = − + − B ((1 p)q p(1 q))sint cost, 12 = − − − B ((1 p)q p(1 q))sint cost, 21 = − − − B p(1 q)cos2 t (1 p)q sin2 t. 22 = − + − 34 2. QUANTUM VARIANCES, GENERALIZED ENTROPIES AND RELATIVE ENTOPIES

One can easily compute that · q cos2 t (1 q)sin2 t 0 ¸ ρ Tr ρ + − . 2 = 3 23 = 0 q sin2 t (1 q)cos2 t + − Let us take the density ρ ρ ρ . The spectrum of ρ is 123 = 1 ⊗ 23 123 m [ {νj pq,νj (1 p)q,νj p(1 q),νj (1 p)(1 q)}, j 1 − − − − = where ν ’s are the eigenvalues of ρ . The spectrum of ρ (or ρ I ) is j 1 12 12 ⊗ 3 m [ 2 2 2 2 {νj (q cos t (1 q)sin t),νj (q sin t (1 q)cos t)}. j 1 + − + − = The assumption pq 1 q guarantees that the eigenvalues of ρ are ≤ − 123 smaller than the eigenvalues of ρ I , whenever the corresponding 12 ⊗ 3 eigenvectors are not orthogonal. ρ and I ρ obviously commute, 123 1 ⊗ 23 hence the conditions of Theorem 30 are satisfied by ρ123 despite the fact that ρ23 can not be diagonized in any product basis.

3. Bregman divergences and their use The strong subadditivity of the von Neumann entropy is not the only celebrated result in the area of finite dimensional functional analysis which was made — partially or in its entirety — by Eliott Lieb. For ex- ample, the famous concavity theorem which states that the map A Trexp¡H log A¢ 7→ + sa is concave on B++(H ) for any H B (H ) if H is a finite dimensional ∈ Hilbert space is also due to Lieb. In a recent paper, J. A. Tropp used the joint convexity of the Umegaki relative entropy to give a succint proof of this celebrated concavity theo- rem of Lieb [86]. This result inspires us to prove the joint convexity prop- erty for Bregman divergences which may be considered as generalized rel- ative entropies as we will see in the following. Once we are done with the proof of the convexity for a certain family of Bregman divergences, we are in the position to derive some conse- quences. It is quite surprising that we managed to generalize the strong subadditivity inequality (and not the concavity theorem) with the use of the convexity of certain Bregman divergences.

3.1. Introduction to Bregman divergences. In applications that in- volve measuring the dissimilarity between two objects (numbers, vec- tors, matrices, functions and so on) the definition of a be- comes essential. One such measure is a function, but there 3. BREGMAN DIVERGENCES AND THEIR USE 35 are many important measures which do not satisfy the properties of dis- tance. For instance, the square loss function has been used widely for regression analysis, Kullback-Leibler divergence [45] has been applied to compare two probability density functions, the Itakura-Saito divergence [38] is used as a measure of the perceptual difference between spectra, or the Mahalonobis distance [52] is to measure the dissimilarity between two random vectors of the same distribution. The Bregman divergence was introduced by Lev Bregman [11] for convex functions φ : Rd R with → gradient φ, as the φ-dependent nonnegative measure of discrepancy ∇ (42) D (p,q) φ(p) φ(q) φ(q),p q φ = − − 〈∇ − 〉 of d-dimensional vectors p,q Rd . Originally his motivation was the ∈ problem of convex programming, but it became widely researched both from theoretical and practical viewpoints. For example the remarkable fact that all the aforementioned divergences are special cases of the Breg- man divergence shows its importance [6]. In some literature it is applied under the name Bregman distance, in spite of that it is not in general a . Indeed, Dφ is definite, but does not satisfy the triangle inequality nor symmetry. In addition to the wide range of applications in informa- tion theory, and computer science, Dénes Petz suggested the extension of the concept of Bregman divergence to operators [72]. If C denotes a convex set in a Banach space and B(H ) denotes the bounded linear operators on the Hilbert space H , for an operator valued smooth function Ψ : C B(H ) the Bregman operator divergence is defined by → Ψ(y t(x y)) Ψ(y) (43) DΨ(x, y) Ψ(x) Ψ(y) lim + − − = − − t 0 t →+ for all x, y C. Since the Bregman operator divergence can be written as ∈ tΨ(x) (1 t)Ψ(y) Ψ(tx (1 t)y) DΨ(x, y) lim + − − + − , = t 0 t →+ for operator convex functions Ψ the inequality D (x, y) 0 remains true Ψ ≥ for the Loewner partial ordering between self-adjoint operators. The most important preliminary of our investigations is the work of H. Bauschke and J. Borwein. They gave a necessary and sufficient condi- tion for the joint convexity of the Bregman divergences on Rd [7]. How- ever, the question about the joint convexity of the trace of Bregman op- erator divergence has been left open.

3.2. Definition and basic properties. Let the Hilbert space H be fi- nite dimensional, as usual. Let f : (0, ) R be a convex function. Then ∞ → the induced map

ϕ : B++(H ) R, X ϕ (X ): Tr f (X ) f → 7→ f = 36 2. QUANTUM VARIANCES, GENERALIZED ENTROPIES AND RELATIVE ENTOPIES is convex, as well [13]. A differentiable convex function is bounded from below by its first-order Taylor polynomial, no matter what the base point is. Therefore, the expression ϕ (X ) ϕ (Y ) Dϕ [Y ](X Y ), f − f − f − where Dϕf [Y ] denotes the Fréchet derivative of ϕf at the point Y , is non- negative for any X ,Y B++(H ). By the linearity of the trace, for any Y ∈ ∈ B++(H ) we have Dϕf [Y ] Tr Df [Y ], where Df [Y ] denotes the Fréchet = ◦ sa derivative of the standard operator function f : B++(H ) B (H ) at Y . → Let us define the central object of this section precisely.

DEFINITION. Let f C 1((0, )) be a convex function and X ,Y ∈ ∞ ∈ B++(H ). The Bregman f -divergence of X and Y is defined by (44) H (X ,Y ) Tr¡f (X ) f (Y ) Df [Y ](X Y )¢. f = − − − Note that this definition of the Bregman f -divergence coincides with the trace of the Bregman operator divergence (43), if Ψ is the standard n operator function f and C B++(H ), H C . = = ¯ ®­ ¯ Consider the spectral decomposition A P λ ¯ϕ ϕ ¯ of the pos- = j j j j itive definite operator A and denote the corresponding matrix units by ¯ ®­ ¯ Ei j : ¯ϕi ϕj ¯. The Fréchet derivative of the standard operator function = sa f : B++(H ) B (H ), X f (X ) at the point A B++(H ) is → 7→ ∈ Z 1 X ¡ ¢ ¯ ®­ ¯ (45) Df [A] f 0 λj t(λi λj ) dt ¯Ei j Ei j ¯, = i,j 0 + − where Hermite’s formula is used for the divided difference matrix [35, Thm. 3.33]. Remark that the Fréchet derivative Df [A] is an Bsa(H ) → Bsa(H ) map, so the formula (45) holds in the sense that the left hand side of (45) is equal to the right hand side of (45) restricted to Bsa(H ). If f is differentiable at A, then the identities ¯ d ¯ Df [A](B) f (A tB)¯ = dt + ¯t 0 = and µ ¯ ¶ ¯ d ¯ d ¯ Tr f (A tB)¯ Tr f (A tB)¯ Tr f 0(A)B dt + ¯t 0 = dt + ¯t 0 = hold and — in particular —= show that = ¡ ¢ (46) H (X ,Y ) Tr f (X ) f (Y ) f 0(Y )(X Y ) . f = − − − LEMMA 31. If f C 2((0, )), the Bregman divergence admits the inte- ∈ ∞ gral representation Z 1 (47) H f (X ,Y ) (1 s)Tr(X Y )Df 0[Y s(X Y )](X Y )ds. = s 0 − − + − − = 3. BREGMAN DIVERGENCES AND THEIR USE 37

PROOF. Remark that Z 1 d Tr f (X ) Tr f (Y ) Tr f (Y t(X Y ))dt − = t 0 dt + − = Z 1 Tr f 0(Y t(X Y ))(X Y )dt = t 0 + − − and = Z t d f 0(Y t(X Y )) f 0(Y ) f 0(Y s(X Y ))ds + − − = s 0 ds + − Z t = Df 0[Y s(X Y )](X Y )ds, = s 0 + − − hence = Z 1 µµZ t ¶ ¶ H f (X ,Y ) Tr Df 0[Y s(X Y )](X Y )ds (X Y ) dt = t 0 s 0 + − − − = = Z 1 Z t Tr(X Y )Df 0[Y s(X Y )](X Y )dsdt = t 0 s 0 − + − − Z =1 = (1 s)Tr(X Y )Df 0[Y s(X Y )](X Y )ds. = s 0 − − + − − = ä 3.3. A characterization of the joint convexity. In this section we in- vestigate the Bregman f -divergence from the viewpoint of joint convex- ity, which is essential in the further applications. Since f is convex, it is clear that the Bregman divergence is convex in the first variable. For the original Bregman divergence (42) Bauschke and Borwein show [7] that Dφ is jointly convex - i. e. D (tp (1 t)p ,tq (1 t)q ) tD (p ,q ) (1 t)D (p ,q ), φ 1 + − 2 1 + − 2 ≤ φ 1 1 + − φ 2 2 where p ,p ,q ,q Rd , t [0,1] - if and only if the inverse of the Hessian 1 2 1 2 ∈ ∈ of φ is concave in Loewner sense. Particularly, if φ is an R I R convex ⊃ → function, then Dφ is jointly convex if and only if 1/φ00 is concave. From this viewpoint the next characterization is rather interesting. 2 THEOREM 32. Let f C ((0, )) be a convex function with f 00 0 on ∈ ∞ > (0, ). Then the following conditions are equivalent. ∞ (A) The map ¡ sa ¢ ¡ ¢ 1 (48) B++(H ) B B (H ) ; X Df 0[X ] − → 7→ is operator concave. (B) The Bregman f -divergence

H : B++(H ) B++(H ) [0, ); (X ,Y ) H (X ,Y ) f × → ∞ 7→ f is jointly convex. 38 2. QUANTUM VARIANCES, GENERALIZED ENTROPIES AND RELATIVE ENTOPIES

2 REMARK 33. For a convex function f C ((0, )) the property f 00 0 ¡ ∈¢ 1 ∞ > is equivalent to the existence of Df 0[X ] − for every X B++(H ). On sa ∈ the one hand, f 00 0 ensures that Df 0[X ] B (B (H )) is a positive def- > ∈ inite and hence invertible map — see formula (45). On the other hand, if sa f 00(λ) 0 for some λ 0, then Df 0[λI] 0 B (B (H )). = > = ∈ In the recent paper [16] Tropp and Chen defined the Matrix Entropy Class the following way.

DEFINITION. The Matrix Entropy Class consists of the real functions defined on [0, ) that are either affine or satisfy the following conditions. ∞ f is convex and f C([0, )) C 2((0, )). • ∈ ∞ ∩ ∞ For every finite dimensional Hilbert space H the map B++(H ) • sa ¡ ¢ 1 → B (B (H )); X Df 0[X ] − is concave with respect to the 7→ Loewner (or semidefinite) order. By this definition, the statement of Theorem 32 is essentially the fol- lowing: the set of those functions for which the corresponding Bregman divergence is jointly convex coincides with the Matrix Entropy Class de- fined by Tropp and Chen.

PROOFOF THEOREM 32: Let us prove the direction (A) (B) first. Let ⇒ Xi and Yi be positive definite operators on H (i {1,...,N}) and let αi be P ∈ P reals such that αi 0, i αi 1. Let us use the notations X i αi Xi , Y P ≥ = ¡ = ¢ 1 = α Y . By the operator concavity of the map X Df 0[X ] − , for any i i i 7→ 0 s 1 we have ≤ ≤ Tr(X Y )Df 0[Y s(X Y )](X Y ) − + − − 1 à !Ãà " #! 1!− à ! X X − X Tr αi (Xi Yi ) Df 0 αi (Yi s(Xi Yi )) αi (Xi Yi ) = i − i + − i − à !à ! 1 à ! X X ¡ ¢ 1 − X Tr αi (Xi Yi ) αi Df 0 [Yi s(Xi Yi )] − αi (Xi Yi ) . ≤ i − i + − i − We used that taking the inverse of an operator reverses the semidefinite order. If H is a Hilbert space, then the map ­ 1 ® H B++(H ) R;(x,T ) x,T − x × → 7→ is convex (see [28, Prop. 4.3], which may be obtained as a consequence of [50, Thm. 1]). If we apply this property to the Hilbert space Bsa(H ) with the Hilbert-Schmidt inner product we get that à !à ! 1 à ! X X ¡ ¢ 1 − X Tr αi (Xi Yi ) αi Df 0 [Yi s(Xi Yi )] − αi (Xi Yi ) i − i + − i − 3. BREGMAN DIVERGENCES AND THEIR USE 39

³ ´ 1 X ¡ ¢ 1 − αi Tr(Xi Yi ) Df 0 [(Yi s(Xi Yi ))] − (Xi Yi ) ≤ i − + − − X αi Tr(Xi Yi )Df 0 [(Yi s(Xi Yi ))](Xi Yi ). = i − + − − The result of Lemma 31 (eq. (47)) clearly shows that the obtained inequal- ity Tr(X Y )Df 0[Y s(X Y )](X Y ) X − + − − αi Tr(Xi Yi )Df 0 [(Yi s(Xi Yi ))](Xi Yi ) ≤ i − + − − implies the joint convexity of the Bregman divergence. The proof of (B) (A) is the following. The conditon (B) means that ⇒ sa P if A B++(H ), B B (H ) and α 0, α 1, then i ∈ i ∈ i ≥ i i = à ! X X X (49) H f αi (Ai εBi ), αi Ai αi H f (Ai εBi , Ai ), i + i ≤ i + where ε ε for some ε 0. By the integral representation (47), the right < 0 0 > hand side of (49) can be written as X X Z 1 αi H f (Ai εBi , Ai ) αi (1 s)TrεBi Df 0[Ai sεBi ](εBi )ds + = s 0 − + i i = Z 1 2 X ε (1 s) αi TrBi Df 0[Ai sεBi ](Bi )ds. = s 0 − + = i Similarly, the left hand side is à ! X X H f αi (Ai εBi ), αi Ai i + i Z 1 à ! " #à ! 2 X X X ε (1 s)Tr αi Bi Df 0 αi (Ai sεBi ) αi Bi ds. = s 0 − + = i i i 2 The assumption f C ((0, )) ensures that the map Df 0 : B++(H ) sa ∈ ∞ → B (B (H )) is continuous. Therefore, limε 0 Df 0[Ai sεBi ] Df 0[Ai ] → + = etc. After division by ε2 and taking the limit ε 0 we obtain from (49) → that à ! " #à ! X X X X Tr αi Bi Df 0 αi Ai αi Bi αi TrBi Df 0[Ai ](Bi ), i i i ≤ i that is, the map sa (50) B++(H ) B (H ) (A,B) TrBDf 0[A](B) × 3 7→ is jointly convex. This is sufficient to show the opearator concavity of the ¡ ¢ 1 map X Df 0[X ] − by the followings. Let A B++(H )(i {1,...,N}) 7→ i ∈ ∈ 40 2. QUANTUM VARIANCES, GENERALIZED ENTROPIES AND RELATIVE ENTOPIES P and α 0, α 1. Let us use the short notation T Df 0[A ]. For any i ≥ i i = i = i C Bsa(H ) we can define ∈ à ! 1 à ! 1 − − ¡ ¢ 1 X ¡ ¢ 1 1 X 1 Bi : Df 0[Ai ] − αj Df 0[A j ] − (C) Ti− αj T j− (C). = ◦ j ≡ ◦ j

Observe that by this definition P α B C. On the one hand, i i i = X X αi TrBi Df 0[Ai ](Bi ) αi TrBi Ti (Bi ) i = i

à à ! 1 à ! 1 ! − − X 1 X 1 1 X 1 αi Tr Ti− αj T j− (C) Ti Ti− αj T j− (C) = i ◦ j · ◦ ◦ j

ÃÃ ! Ã ! 1 Ã ! 1 ! − − X 1 X 1 X 1 Tr αi Ti− αj T j− (C) αj T j− (C) = i ◦ j · j

à ! 1 à ! 1 − − X 1 X ¡ ¢ 1 TrC αi Ti− (C) TrC αi Df 0[Ai ] − (C). = · i = i On the other hand, à ! " #à ! " # X X X X Tr αi Bi Df 0 αi Ai αi Bi TrCDf 0 αi Ai (C) i i i = i

By the joint convexity of (50),

" # Ã ! 1 X X ¡ ¢ 1 − TrCDf 0 αi Ai (C) TrC αi Df 0[Ai ] − (C) i ≤ i holds, and C was an arbitrary element of Bsa(H ), hence the operator inequality " # Ã ! 1 X X ¡ ¢ 1 − Df 0 αi Ai αi Df 0[Ai ] − i ≤ i holds, which is equivalent to

à " #! 1 X − X ¡ ¢ 1 Df 0 αi Ai αi Df 0[Ai ] − . i ≥ i

This is the desired concavity property. ä 3. BREGMAN DIVERGENCES AND THEIR USE 41

3.4. An extension of the Bregman divergence to singular opera- tors. In quantum information theory, the singular density operators play a central role, therefore, we would like to extend the Bregman f - divergences from B++(H ) B++(H ) to B+(H ) B+(H ). It is a natural × × idea to define the Bregman divergence of the positive semidefinite oper- ators X and Y as follows:

(51) H f (X ,Y ): lim H f (X εI,Y εI). = ε 0 + + → With the formula (46) at hand, easy computation shows that if X and Pn ¯ ®­ ¯ Y admit the spectral decompositions X j 1 λj ¯ϕj ϕj ¯ and Y Pn ¯ ®­ ¯ = = = k 1 µk ¯ψk ψk ¯, then = H (X εI,Y εI) f + + n X ¯­ ®¯2 ¡ ¢ (52) ¯ ϕj ψk ¯ f (λj ε) f (µk ε) f 0(µk ε)(λj µk ) . = j,k 1 | + − + − + − = 0 1 Assume that f C ([0, )) C ((0, )), that is, limx 0 f (x) R. The con- ∈ ∞ ∩ ∞ → ∈ vexity of f gives that f 0 is monotone increasing, hence limε 0 f 0(ε) R or → ∈ limε 0 f 0(ε) . → = −∞ Clearly, if limε 0 f 0(ε) R, then the limit of (52) is a real number. If → ∈ ker(Y ) ker(X ), then λ 0 whenever µ 0 and ­ϕ ψ ® 0, hence the ⊆ j = k = j | k 6= limit is finite in this case, as well. If ker(Y ) * ker(X ) and limε 0 f 0(ε) → = , then the limit is . −∞ +∞ So we conclude that if f is continuous at 0, then (51) is well-defined and takes values in [0, ], that is, the Bregman f -divergences can be ex- ∞ tended to B+(H ) B+(H ) by continuity. × If H ( , ) defined by a convex function f C 2((0, )) is jointly f · · ∈ ∞ convex on B++(H ) B++(H ), then (assuming in addition that f 0 × ∈ C ([(0, ))) H f ( , ) is jointly convex on B+(H ) B+(H ). Indeed, for ∞ · · P × X ,Y B+(H ),c 0, c 1 we have k k ∈ k ≥ k k = à ! à ! X X X X H f ck Xk , ck Yk lim H f ck Xk εI, ck Yk εI = ε 0 + + k k → k k à ! X X X lim H f ck (Xk εI), ck (Yk εI) lim ck H f (Xk εI,Yk εI) = ε 0 + + ≤ ε 0 + + → k k → k X X ck lim H f (Xk εI,Yk εI) ck H f (Xk ,Yk ). = ε 0 + + = k → k Therefore, we can reformulate the main condition with a bit different conditions. 42 2. QUANTUM VARIANCES, GENERALIZED ENTROPIES AND RELATIVE ENTOPIES

THEOREM 34. Let f C 0([0, )) C 2((0, )) be a convex function with ∈ ∞ ∩ ∞ f 00 0 on (0, ). Then the following conditions are equivalent. > ∞ (i) The map ¡ sa ¢ ¡ ¢ 1 B++(H ) B B (H ) ; X Df 0[X ] − → 7→ is operator concave. (ii) The Bregman f -divergence

H : B+(H ) B+(H ) [0, ]]; (X ,Y ) H (X ,Y ) f × → ∞ 7→ f is jointly convex. 3.5. A different condition and alternative proofs. In a recent pre- print Hansen and Zhang investigated the connections between the con- dition (A) in Theorem 32 and the property that f 00 is operator convex and numerically non-increasing. In an earlier version of their paper, these conditions were claimed to be equivalent [30, Thm 1.2]. Later the proof turned out to be incomplete. In the current version it is proved that if f 00 is operator convex and numerically non-increasing, then the condition (A) in Theorem 32 is satisfied [31, Thm 1.3]. Now we give a direct proof of the fact that the operator convexity (and the non-increasing property) of f 00 is sufficient to deduce the joint con- vexity of the Bregman f -divergence.

1 LEMMA 35. Set f C ((0, )) and A B++(H ). Then the Fréchet de- ∈ ∞ ∈ rivative of the standard operator function f is Z 1 Df [A] f 0 (tL A (1 t)RA)dt, = 0 + − where L A (RA) denotes the left (right) multiplication by A : L : B(H ) B(H ), X L (X ): AX , A → 7→ A = R : B(H ) B(H ), X R (X ): XA. A → 7→ A = P ¯ ®­ ¯ PROOF. Let us use the notations A j λj ¯ϕj ϕj ¯ and Ei j ¯ ®­ ¯ = = ¯ϕi ϕj ¯ again. It is easy to check that L (E ) λ E , R (E ) λ E , A i j = i i j A i j = j i j ¯ ®­ ¯ hence with P : ¯E E ¯ we have i j = i j i j X X L A λi Pi j , RA λj Pi j . = i,j = i,j Therefore X f 0 (tL A (1 t)RA) f 0(tλi (1 t)λj )Pi j , + − = i,j + − 3. BREGMAN DIVERGENCES AND THEIR USE 43 and Z 1 Z 1 X ¡ ¢ ¯ ®­ ¯ f 0 (tL A (1 t)RA)dt f 0 λj t(λi λj ) dt ¯Ei j Ei j ¯, 0 + − = i,j 0 + − and this is exactly the formula that appeared in (45). ä Lemma 31 and Lemma 35 have an immediate consequence.

COROLLARY 36. For f C 2((0, )), the Bregman divergence can be ∈ ∞ written as H f (X ,Y ) (53) Z 1 Z 1 ¡ ¢ (1 s)Tr(X Y )f 00 tLY s(X Y ) (1 t)RY s(X Y ) (X Y )dtds. = s 0 t 0 − − + − + − + − − = = Therefore we can provide a sufficient condition for the joint convexity of the Bregman divergence. 2 THEOREM 37. Let f C ((0, )) be a convex function. If f 00 is operator ∈ ∞ convex and numerically non-increasing, then the Bregman f -divergence

H : B++(H ) B++(H ) [0, ); (X ,Y ) H (X ,Y ) f × → ∞ 7→ f is jointly convex.

FIRSTPROOFOF THEOREM 37. On a Hilbert space H the map ­ ® B(H )++ H R :(A,ξ) ξ,ϕ(A)(ξ) × → 7→ is jointly convex if ϕ : (0, ) R is operator convex and numerically non- ∞ → increasing. This fact relies on the joint convexity of the map (A,ξ) ­ 1 ® 7→ ξ, A− ξ — which is stated in [28, Prop. 4.3] and may be derived from [50, Thm. 1] — and on an integral representation of the functions ϕ with the above property. This representation will be discussed in the second proof of this theorem. The maps

(X ,Y ) tLY s(X Y ) (1 t)RY s(X Y ) 7→ + − + − + − and (X ,Y ) X Y are affine, and with the Hilbert-Schmidt inner prod- 7→ − uct (53) can be written as H f (X ,Y ) (54) Z 1 Z 1 ­ ¡ ¢ ® (1 s) X Y , f 00 tLY s(X Y ) (1 t)RY s(X Y ) (X Y ) dtds, = s 0 t 0 − − + − + − + − − = = hence H f is jointly convex if f 00 is operator convex and non-increasing. ä We provide another proof of this theorem. 44 2. QUANTUM VARIANCES, GENERALIZED ENTROPIES AND RELATIVE ENTOPIES

PROPOSITION 38. Let F (A,B) denote the set of all A B functions. → The map 2 ¡ ¢ (55) H : C ((0, )) F B++(H ) B++(H ),R , f H ( , ) ∞ → × 7→ f · · is linear, and the kernel is the subspace of affine functions, that is, ker(H) {x ax b a,b R}. = 7→ + | ∈ PROOF. The linearity is obvious, and from the integral formula (54) it is easy to see that the kernel of H is equal to the kernel of the operator d2 : C 2((0, )) C((0, )). dx2 ∞ → ∞

ä Therefore, if f can be written as f Pk f , where the f ’s define = j 1 j j jointly convex Bregman divergence, then H =( , ) is jointly convex. The f · · affine part of a function can be omitted.

SECONDPROOFOF THEOREM 37. If f C 2((0, )) is convex function, ∈ ∞ then f 00 is numerically non-increasing and operator convex if and only if Z ∞ 1 (56) f 00(x) γ dµ(λ), = + 0 λ x + where γ 0 and µ is a nonnegative measure on [0, ) such that ≥ ∞ Z 1 ∞ dµ(λ) . 0 1 λ < ∞ + This fact is stated in this form in [3, Thm. 3.1]. Now we intend to use the ’only if’ direction, hence we outline the key steps of the proof of Ando and Hiai. A rather complex argument shows that a numerically non- increasing and operator convex function is operator monotone decreas- ing. At this point, Ando and Hiai provide a slightly simplified version of the original argument of Hansen [29] to verify the integral representation. ¡ 1 ¢ This is the following. If f 00 is operator monotone decreasing then f 00 x is operator monotone, hence by [9, pp. 144-145] it has the form µ ¶ Z 1 ∞ (λ 1)x (57) f 00 α βx + dν(λ) x = + + 0 λ x + where α,β 0 and ν is a nonnegative finite measure on (0, ). Let ≥ ∞ dµ˜(λ): dν( 1 ) on (0, ) and µ˜({0}) : β. Finally, dµ(λ): (λ 1)dµ˜(λ). = λ ∞ = = + By these operations, the representation (57) is transformed to (56). Integrating (56) two times with respect to x we get (58) γ Z f (x) α βx x2 ∞ ¡(λ x)¡log(λ x) log(λ 1)¢ (x 1)¢dµ(λ), = + + 2 + 0 + + − + − − 3. BREGMAN DIVERGENCES AND THEIR USE 45 where α,β R. One can see that f (x) is the sum of the affine part ∈ Z a(x) α βx ∞ ¡log(λ 1)(λ x) (x 1)¢dµ(λ), = + − 0 + + + − the quadratic part q(x) γ x2 and the “entropic" part = 2 Z e(x) ∞(λ x)log(λ x)dµ(λ). = 0 + + The quadratic part H (X ,Y ) γ Tr(X Y )2 is clearly jointly convex. q = 2 − By the result of [51], the same statement holds for the Bregman diver- gence induced by the standard entropy function ϕ (x) x logx, 0 = H (X ,Y ) Tr¡X ¡log X logY ¢ (X Y )¢. ϕ0 = − − − On the other hand, one can check that the Bregman divergence induced by the shifted entropy function ϕ (x) (x λ)log(x λ) can be expressed λ = + + as (59) H (X ,Y ) H (X λI,Y λI). ϕλ = ϕ0 + + On the whole, if f 00 is numerically decreasing and operator convex, then the Bregman divergence H can be written as H H H H , where f f = q + a + e H 0, H is obviously jointly convex and a = q Z Z ∞ ∞ (60) He (X ,Y ) Hϕλ (X ,Y )dµ(λ) Hϕ0 (X λI,Y λI)dµ(λ). = 0 = 0 + + The map (X ,Y ) (X λI,Y λI) is affine, hence (60) is jointly convex, 7→ + + and this completes the proof. ä 3.6. An application - the Tsallis entropy. The q-logarithm — or de- formed logarithm — function ln : (0, ) R was defined in (15). Recall q ∞ → that the definition was the following. q 1 Z x ( x − 1 q 2 q 1− if q 1, lnq x t − dt − 6= = 1 = ln x if q 1. = We also introduced the notation f (x) x ln (x) and claimed that the q = q Tsallis entropy of a density operator ρ is given by S (ρ) Tr f (ρ), q = − q see [2, 17] for further details. We noted that if q 0, then limx 0 fq (x) 0, > → = hence fq can be extended by continuity, thus the Tsallis entropy is well- defined for singular densities, as well. By the result of Tropp and Chen, fq belongs to the Matrix Entropy Class for 1 q 2 [16, Thm. 2.3]. There- ≤ ≤ fore, by Theorem 32, H fq ( , ) is jointly convex. (Alternatively, we may use · · r the well-known operator convexity of the function x x− (0 r 1) 7→ ≤ ≤ and refer to Theorem 37.) 46 2. QUANTUM VARIANCES, GENERALIZED ENTROPIES AND RELATIVE ENTOPIES

One can compute that for q 1 we have 6= q 1 ¡ q q 1¢ H f (A,B) TrB Tr A q Tr AB − . q = + q 1 − − The Bregman divergence is clearly unitary invariant, that is,

(61) H (U AU ∗,UBU ∗) H (A,B) f = f for all unitary operators U. (Moreover, as it will turn out in the follow- ing, the Bregman divergences are invariant under antiunitary conjuga- tions, as well.) Let X B(H ) B(K ) for some finite timensional Hilbert ∈ ⊗ spaces H and K , and let us introduce the notation m : dimH and = n : dimK . Then there are some unitaries U ,...,U 2 such that = 1 n n2 1 X 1 X1 IB(K ) 2 Uk XUk∗, ⊗ n = k 1 n = where X Tr X and I is the identity in B(K ) (see e. g. [8, 23]). 1 = 2 B(K ) Therefore, from the joint convexity it follows that the Bregman diver- gence is monotone in the following sense: µ 1 1 ¶ (62) H f X1 I ( ),Y1 I ( ) H f (X ,Y ) ⊗ n B K ⊗ n B K ≤ if f satisfies the condition (A) in Theorem 32. Let us apply (62) to fq with 1 q 2 and < ≤ 1 (63) X ρ123 B+ (H1 H2 H3), Y I1 ρ23, = ∈ ⊗ ⊗ = d1 ⊗ where H is a finite dimensional Hilbert space (i {1,2,3}), d dimH i ∈ i = i and ρ Tr ρ . Furthermore, we use the simplified notation I 23 = 1 123 i ≡ I for i {1,2,3}. The idea of this choice of X and Y comes from the B(Hi ) ∈ tutorial [70]. With this choice we get µ 1 1 1 ¶ µ 1 ¶ (64) H fq ρ12 I3, I1 ρ2 I3 H fq ρ123, I1 ρ23 . ⊗ d3 d1 ⊗ ⊗ d3 ≤ d1 ⊗ Straightforward computations show that the left hand side of (64) equals to 1 ³ 1 q q 1 q q ´ d − Trρ (d1d3) − Trρ q 1 3 12 − 2 − and the right hand side is

1 ³ q 1 q q ´ Trρ d − Trρ . q 1 123 − 1 23 − The result of this computation can be summarized as follows. 3. BREGMAN DIVERGENCES AND THEIR USE 47

THEOREM 39. If H is a finite dimensional Hilbert space for any i i ∈ {1,2,3}, d dimH , 1 q 2, then for any ρ B+ (H H H ) i = i ≤ ≤ 123 ∈ 1 ⊗ 2 ⊗ 3 the inequality 1 q q 1 q q q 1 q q (65) d − Trρ d − Trρ Trρ (d d ) − Trρ . 3 12 + 1 23 ≤ 123 + 1 3 2 holds, where notations like ρ12 denote the appropriate reduced operators. The fact that the strong subadditivity of the Tsallis entropy q q q q Trρ Trρ Trρ Trρ 12 + 23 ≤ 123 + 2 does not hold in general [79] (but holds for classical probability distri- butions [22]) makes Theorem 39 remarkable. Furthermore, one can not improve on (65), the inequality is sharp. The example which shows the sharpness of the inequality (65) was used previously to demonstrate that the Tsallis entropy is not strongly subbadditive, see (19). So, the density operator  0 0 0 0 0 0 0 0     0 0 0 0 0 0 0 0     1 1   0 0 4 0 4 0 0 0     0 0 0 1 0 1 0 0   4 4  ¡ 2 2 2¢ (66) ρ   B C C C 123 =  1 1  ∈ ⊗ ⊗  0 0 4 0 4 0 0 0     0 0 0 1 0 1 0 0   4 4     0 0 0 0 0 0 0 0    0 0 0 0 0 0 0 0 has the reduced densities    1  0 0 0 0 4 0 0 0     " #  0 1 1 0   0 1 0 0  1 0  2 2 ,  4 , 2 , ρ12  1 1  ρ23  1  ρ2 1 =  0 0  =  0 0 0  = 0  2 2   4  2 1 0 0 0 0 0 0 0 4 hence 1 q q 1 q q 1 q 1 q 1 q q 1 q q d − Trρ d − Trρ 2 − 2 − 4 − Trρ (d d ) − Trρ . 3 12 + 1 23 = + = 123 + 1 3 2 Note that (65) is equivalent to 1 q 1 q − 1 q 1 q (d1d2d3) − 1 d2 1 (d1d2d3) − Sq (ρ123) d − Sq (ρ2) − − + 2 + q 1 + q 1 − − 1 q 1 q 1 q 1 q (d2d3) − 1 (d1d2) − 1 (d2d3) − Sq (ρ23) (d1d2) − Sq (ρ12) − − , ≤ + + q 1 + q 1 − − 48 2. QUANTUM VARIANCES, GENERALIZED ENTROPIES AND RELATIVE ENTOPIES which gives the strong subadditivity S (ρ ) S (ρ ) S(ρ ) S(ρ ) q 123 + q 2 ≤ 23 + 12 of the von Neumann entropy, if we take the limit q 1. → 3.7. The relation of joint convexity and monotonicity under sto- chastic maps. For homogeneous relative entropy-type maps, the joint convexity and the monotonicity under stochastic maps is equivalent [47, remarks after Def. 2.3]. However, the Bregman divergence does not need to be homegeneous. For example, (67) H (λA,λB) λq H (A,B) fq = fq for 0 λ (and any positive q). < We show that a jointly convex Bregman divergence is not monotone in general. In order to see this surprising fact, we create an example that shows that a family of (jointly convex) Bregman divergences increases under the partial trace, which is a very important stochastic (that is, com- pletely positive trace preserving - CPTP) map [13, 47]. Easy computations show that for any A,B B+(H ) we have ∈ µ 1 1 ¶ µ A B ¶ (68) H f A I ( ),B I ( ) nH f , , ⊗ n B K ⊗ n B K = n n where I is the identity in B(K ) and dimK n. Recall that the den- B(K ) = sity operator (66) saturates the inequality (64), that is,

µ 1 1 1 ¶ µ 1 ¶ (69) H f ρ12 I, I ρ2 I H f ρ123, I ρ23 . q ⊗ 2 2 ⊗ ⊗ 2 = q 2 ⊗ On the other hand, by (67) and (68), µ ¶ µ ¶ 1 1 1 1 q 1 (70) H f ρ12 I, I ρ2 I 2 − H f ρ12, I ρ2 , q ⊗ 2 2 ⊗ ⊗ 2 = q 2 ⊗ which means that µ 1 ¶ µ 1 ¶ H f ρ12, I ρ2 H f ρ123, I ρ23 , q 2 ⊗ > q 2 ⊗ so the monotonicity under partial trace fails. This means that the joint convexity does not imply monotonicity, but the converse is true. We summarize the results in the next theorem.

THEOREM 40. Every monotone Bregman divergence is jointly con- vex. However, there are jointy convex Bregman divergences which are not monotone under stochastic maps. On the other hand, joint convexity im- plies the monotonicity under the stochastic maps of the form X (71) A ckUk AUk∗, 7→ k 3. BREGMAN DIVERGENCES AND THEIR USE 49 where the U ’s are unitaries and c 0, P c 1. k k ≥ k k = PROOF. Set X , X ,Y ,Y B+(H ) and let the block operators X ,Y 1 2 1 2 ∈ and U B(H H ) be defined by ∈ ⊕ · X 0 ¸ · Y 0 ¸ · 0 I ¸ X 1 , Y 1 , U B(H ) . = 0 X2 = 0 Y2 = IB(H ) 0 The map 1 1 E : B(H H ) B(H H ), X E (X ): X UXU ∗ ⊕ → ⊕ 7→ = 2 + 2 is clearly stochastic, and 1 · X X 0 ¸ 1 · Y Y 0 ¸ E (X ) 1 + 2 , E (Y ) 1 + 2 . = 2 0 X1 X2 = 2 0 Y1 Y2 + + The Bregman divergence of block-diagonal operators is the sum of the Bregman divergence of the blocks, hence the monotonicity condition H (E (X ),E (Y )) H (X ,Y ) f ≤ f means that µ1 1 ¶ 2H f (X1 X2), (Y1 Y2) H f (X1,Y1) H f (X2,Y2), 2 + 2 + ≤ + which is the midpoint convexity of H ( , ). The Bregman f -divergence is f · · continuous (by the assumption f C 1((0, ))), hence midpoint convex- ∈ ∞ ity implies convexity. xq q We have shown in this section that for f (x) − the correspond- q = q 1 ing Bregman divergence H ( , ) is jointly convex but− it is not monotone fq · · under stochastic maps (1 q 2). < ≤ In order to check the last statement of the theorem, suppose that H ( , ) is jointly convex. If a map E : B(H ) B(H ) has the form (71), f · · → then by the unitary invariance of the Bregman divergence,

H f (E (X ),E (Y )) Ã ! X X X ¡ ¢ H f ckUk XUk∗, ckUk YUk∗ ck H f Uk XUk∗,Uk YUk∗ = k k ≤ k X ck H f (X ,Y ) H f (X ,Y ). = k = ä

CHAPTER 3

Preserver problems

We mentioned in Subsection 3.6 in the previous chapter of this thesis that the Bregman divergences of positive definite operators are invariant under unitary conjugations — see equation (61). We also noted that uni- tary conjugations are not the only transformations of the positive definite cone which preserve the Bregman divergences. It is a very natural goal to determine all the transformations on the set of positive definite opera- tors which leave the Bregman divergences invariant. This question leads us to the topic of preserver problems. A preserver problem consists of the following ingredients. Let H be a set. Let φ : H H be a mapping. Let m be a positive integer, let K be a set → and let X : H m K be a map. We say that the transformation φ preserves → X , if either

(72) X ¡φ(A ),...,φ(A )¢ X (A ,..., A )(A ,..., A H), 1 m = 1 m 1 m ∈

or

(73) X ¡φ(A ),...,φ(A )¢ φ(X (A ,..., A )) (A ,..., A H) 1 m = 1 m 1 m ∈

holds, depending on the nature of the map X . (The equation (73) may play the role of the preserver equation only if K H.) For any given sets = H,K and mapping X , the solution of the preserver problem is the de- scription of the structure of all the transformations φ which preserve X . The following table enumerates some preserver problems.

51 52 3. PRESERVER PROBLEMS

H m K X Equation Name of the problem Rn 2 [0, ) (a,b) a b (72) isometries of ∞ 7→ k − k Rn M 1 C A DetA (72) determinant n 7→ preserving maps sa Mn 2 {0,1} (A,B) 1A B (72) order preserv- 7→ ≤ ing maps M++ 2 M++ (A,B) ABA (73) triple prod- n n 7→ uct preserving maps M+ 2 [ , ] (A,B) (72) preservers of n −∞ ∞ 7→ S f (A,B) the quantum f -divergence M++ m M++ (A ,..., A ) (73) preservers n n 1 m 7→ MG (A1,..., Am) of the multi- variable geo- metric mean

The above table makes it transparent that the topic of preserver prob- lems covers a large area of . An exhaustive description such problems — including Frobenius’ theorem on determinant preserving maps, the Mazur-Ulam theorem on isometries of real normed spaces and Wigner’s theorem on the symmetry transformations of pure states with re- spect to the transition probability — can be found of in the monography [60] written by Lajos Molnár. In this chapter we deal with several preserver problems. In the first section, we investigate the problem which has been already mentioned in the introduction of this chapter, namely the preservers of the Bregman (and Jensen) divergences on positive definite operators. The second section is devoted to an algebraic preserver problem on the cone of positive definite operators. Namely, we determine the con- tinuous Jordan triple endomorphisms of the positive definite operators acting on a two-dimensional Hilbert space. It is rather interesting that the problem was solved for any dimensions n 3 several years ago, only ≥ the two-dimensional case remained unsolved. In the third section we derive the description of the continuous endo- morphisms of the Einstein gyrogroup from the previous result on Jordan triple products. Note that Einstein’s velocity addition law was given more than a hundred years ago, but the preservers of that addition have not been determined yet. 1. MAPS PRESERVING BREGMAN AND JENSEN DIVERGENCES 53

1. Maps preserving Bregman and Jensen divergences 1.1. Introduction. In a series of papers [33, 55, 65, 54] Lajos Mol- nár and his coauthors described the structures of surjective maps of the positive definite cones in matrix algebras, or in operators algebras which can be considered generalized isometries meaning that they are transfor- mations which preserve "" with respect to given so-called gen- eralized distance measures. This latter notion stands for any function d : X X [0, ) on any set X with the mere property that for x, y X × → ∞ ∈ we have d(x, y) 0 if and only if x y. We recall that in several areas of = = mathematics not only metrics are used to measure nearness of points but also more general functions of this latter kind. In [65, 54] the considered generalized distance measures are of the form d d , where N(.) is a unitarily invariant norm on the underlying = N,g matrix algebra or operator algebra, g : (0, ) C is a continuous function ∞ → with the properties

(a1) g(y) 0 if and only if y 1; = = ¯ ¡ ¢¯ ¯ ¯ (a2) there exists a constant K 1 such that ¯g y2 ¯ K ¯g(y)¯, y 0, > ≥ > and the generalized distance measure dN,g is defined by

¡ ¡ 1/2 1/2¢¢ (74) d (X ,Y ) N g Y − XY − N,g = for all positive invertible elements X ,Y of the underlying algebra. In the mentioned papers one can see several important examples of that sort of generalized distance measures, many of them having backgrounds in the differential geometry of positive definite matrices or operators. The basic tools in describing the structure of the corresponding generalized isometries have been so-called generalized Mazur-Ulam type theorems and descriptions of certain algebraic isomorphisms (Jordan triple iso- morphisms) of the positive definite cones in question. We determine the structures of generalized isometries with respect to other important types of generalized distance measures. Namely, here we consider Bregman divergences and Jensen divergences. These types of divergences have wide ranging applications in several areas of math- ematics. For example, in the recent volume [69] on matrix information geometry 3 chapters are devoted to the study of Bregman divergences. One feature of Jensen divergences which justifies their importance is that Bregman divergences can be considered as asymptotic Jensen di- vergences (see Section 6.2 in [69]). We further mention that the famous 54 3. PRESERVER PROBLEMS

Stein’s loss and Umegaki’s relative entropy are among the most impor- tant Bregman divergences. Our basic tool to determine the correspond- ing preserver transformations is also algebraic in nature but rather dif- ferent from what we have mentioned in the previous paragraph. Namely, here we use order isomorphisms. We next define the two basic concepts what we consider now, Breg- man divergence and Jensen divergence. Both concepts are connected to convex real functions. Let f be a convex function on the interval (0, ). It ∞ is known that f is necessarily continuous and the set of points where f is differentiable has at most countable complement. It is a remarkable fact that if f is everywhere differentiable then its derivative f 0 is automatically continuous (see e.g. [82, Corollary 25.5.1]). Let H be a finite dimensional Hilbert space, as usual. For a dif- ferentiable convex function f on (0, ), the Bregman f -divergence on ∞ B++(H ) is ¡ ¢ H (X ,Y ) Tr f (X ) f (Y ) f 0(Y )(X Y ) , X ,Y B++(H ), f = − − − ∈ see e.g. formula (5) in [80]. If limx 0 f (x) and limx 0 f 0(x) exist, → + → + then f , f 0 have continuous extensions onto [0, ) and the Bregman f - ∞ divergence is well-defined and finite for any pair of positive semidefinite operators, too. For a convex function f on (0, ) and for given λ (0,1), the Jensen ∞ ∈ λ f -divergence on B++(H ) is defined by − J (X ,Y ) Tr¡λf (X ) (1 λ)f (Y ) f (λX (1 λ)Y )¢. f ,λ = + − − + − If limx 0 f (x) exists, then the Jensen λ f -divergence is also well- → + − defined and finite for any pair of positive semidefinite operators. It is well-known that H f and J f ,λ are always nonnegative and they are generalized distance measures if and only if f is strictly convex. Our main aim is to describe the "symmetries" of the positive defi- nite cone B++(H ) that preserve above type of divergences. This means that we are looking for the structure of all bijective maps φ : B++(H ) → B++(H ) for which ¡ ¢ H φ(X ),φ(Y ) H (X ,Y ), X ,Y B++(H ) f = f ∈ or ¡ ¢ J φ(X ),φ(Y ) J (X ,Y ), X ,Y B++(H ) λ,f = λ,f ∈ holds.

1.2. The results. It is useful to examine first the question that how different the present problem is from the ones that Lajos Molnár and his coauthors have considered in the papers [33, 55, 65, 54]. To see clearly 1. MAPS PRESERVING BREGMAN AND JENSEN DIVERGENCES 55 the differences we determine below which Bregman divergences, respec- tively which Jensen divergences are of the form (74). Let us begin with the case of Bregman divergences. One of the most important Bregman divergence is the one usually denoted by l which corresponds to the strictly convex function f (x) = logx, x 0 and is called Stein’s loss. Apparently, we have − > ¡ 1 ¢ 1 1 l(X ,Y ) Tr log X logY Y − (X Y ) Tr XY − logDetXY − n = − − − − = − − for all X ,Y B++(H ), where we have used the identity Tr log log Det ∈ ◦ = ◦ on B++(H ). On the other hand, the continuous function g(y) y = − log y 1, y 0 is nonnegative, satisfies the conditions (a1), (a2) (with con- − > stant K 2), and with the trace norm . we easily get = k k1 ° ¡ 1/2 1/2¢° d . 1,g (X ,Y ) °g Y − XY − ° l(X ,Y ), X ,Y B++(H ). k k = 1 = ∈ In the next proposition we prove that Stein’s loss is essentially the only Bregman divergence on B++(H ) which is a generalized distance mea- sure of the form (74). Observe that any distance measure of the form (74) is invariant under multiplication of the variables X ,Y by the same posi- tive scalar t.

PROPOSITION 41. Let f be a differentiable convex function on (0, ). ∞ Assume that the Bregman f -divergence H f on B++(H ) is homogeneous of degree 0, i.e., it satisfies

(75) H (t X ,tY ) H (X ,Y ) X ,Y B++(H ),t 0. f = f ∈ > Then we have f (x) a logx bx c, x 0 with some real scalars a,b,c = + + > and a 0. ≤ PROOF. We plug scalar multiples of the identity X xI,Y yI, x, y = = > 0 into the equality (75) and obtain

(76) f (tx) f (t y) f 0(t y)t(x y) f (x) f (y) f 0(y)(x y), t 0. − − − = − − − > Choosing t 1/y and reordering this equality we have = f 0(y) (1/(x y))(f (x) f (y) f (x/y) f (1) f 0(1)((x/y) 1))) = − − − + + − implying that f is twice continuously differentiable. Differentiating (76) 2 with respect to x twice we obtain f 00(x) f 00(tx)t , x,t 0. In particular, = 2 > it follows that f 00(t) is a constant multiple of t − , t 0. We infer that > f (t) a logt bt c, t 0 holds with some real constants a,b,c. By the = + + > convexity of f we have a 0. ≤ ä Concerning the appearance of the function x bx c in the above 7→ + proposition we note that adding an affine function to a given convex function f does not change the corresponding Bregman divergence (the 56 3. PRESERVER PROBLEMS same holds for Jensen divergence, too), hence that part simply does not count. We next examine the case of Jensen divergences. Again, let f (x) = logx, x 0 and pick λ (0,1). It is easy to see that the corresponding − > ∈ Jensen divergence is λ 1 λ J log,λ(X ,Y ) logDet(λX (1 λ)Y ) logDetX Y − , X ,Y B++(H ) − = + − − ∈ which can also be written as

J log,λ(X ,Y ) − = ¡ 1/2 1/2 ¢ ¡ 1/2 1/2¢λ logDet λY − XY − (1 λ)I logDet Y − XY − + − − ³ ³¡ 1/2 1/2 ¢¡ 1/2 1/2¢ λ´´ Tr log λY − XY − (1 λ)I Y − XY − − . = + − These divergences are scalar multiples of the Chebbi-Moakher log- determinant α-divergences, see [15, 65]. As mentioned in [65] (see pages 146-147) the continuous function g (y) log¡(λy (1 λ))/yλ¢, y 0 is λ = + − > nonnegative, satisfies (a1), (a2) (with constant K 2) and, moreover, we = have ° ¡ 1/2 1/2¢° J log,λ(X ,Y ) °g Y − XY − ° , X ,Y B++(H ). − = 1 ∈ This means that the Jensen divergences J log,λ are also of the form (74). Let us insert the remark here that in− the particular case λ 1/2 the = above Jensen divergence was considered and called S-divergence in the paper [85] of Sra. He proved the interesting fact there that the square root of this divergence is a true metric and proposed to use it as a convenient substitute for the geodesic distance d . 2,log ( . 2 standing for the Hilbert- k k k k Schmidt or Frobenius norm) originating from the natural Riemann geo- metric structure on B++(H ). Continuing the discussion, above we have seen that the divergences J log,λ are of the form (74). In what follows we show that they are essen- tially− the unique Jensen divergences with this property.

PROPOSITION 42. Let f be a convex function on (0, ) and pick a num- ∞ ber λ (0,1). Assume that the corresponding Jensen λ f divergence is ∈ − homogeneous of order 0, i.e., it satisfies

(77) J (t X ,tY ) J (X ,Y ) X ,Y B++(H ),t 0. f ,λ = f ,λ ∈ > Then we have f (x) a logx bx c, x 0 with some real scalars a,b,c = + + > and a 0. ≤ PROOF. Just as above, plugging scalar multiples of the identity X = xI,Y yI, x, y 0 into the equality (77) we have = > λf (tx) (1 λ)f (t y) f (λtx (1 λ)t y) (78) + − − + − λf (x) (1 λ)f (y) f (λx (1 λ)y), t 0. = + − − + − > 1. MAPS PRESERVING BREGMAN AND JENSEN DIVERGENCES 57

We assert that f is differentiable. In fact, choosing x 1 we have = f (t) (1/λ)(λf (1) (1 λ)f (y) f (λ (1 λ)y) = + − − + − (1 λ)f (t y) f (λt (1 λ)t y)). − − + + − The result [39, 11.3. Theorem] of Járai on the regularity of solutions of functional equations applies and tells us that f is necessarily continu- ously differentiable. Differentiating (78) with respect to t we have

0 λf 0(tx)x (1 λ)f 0(t y)y f 0(λtx (1 λ)t y)(λx (1 λ)y) = + − − + − + − implying the equality

λf 0(tx)x (1 λ)f 0(t y)y f 0(λtx (1 λ)t y)(λx (1 λ)y) + − = + − + − for all positive t,x, y. Choosing t 1 we infer = λf 0(x)x (1 λ)f 0(y)y f 0(λx (1 λ)y)(λx (1 λ)y). + − = + − + − This means that the function h(x) f 0(x)x, x 0 is affine and hence it is = > of the form h(x) bx a, x 0. It follows that f (x) a logx bx c, x 0 = + > = + + > with some real scalars a,b,c. Again, the convexity of f implies a 0. ≤ ä The above results can be considered as characterizations of the Stein’s loss and the Chebbi-Moakher log-determinant α-divergences. Namely, these are the only Bregman (resp. Jensen) divergences which are homo- geneous of degree 0. We now turn to the descriptions of the corresponding generalized isometries, i.e., transformations preserving the considered divergences. We begin with a general result relating to Bregman divergences. We first mention the easy fact that both kinds of divergences are clearly invariant under unitary and antiunitary congruence transforma- tions. These are maps of the form A U AU ∗, where U is a unitary or 7→ an antiunitary operator on H . In fact, this follows from the following. For any continuous function f on (0, ) and for any unitary or antiuni- ∞ tary operator U on H we have that f (U AU ∗) U f (A)U ∗ holds for every = positive definite A. This is the consequence of the fact that on the spec- trum of A the function f coincides with a polynomial p and hence we have f (U AU ∗) p (U AU ∗) Up(A)U ∗ U f (A)U ∗. Our theorems be- = = = low show that in many cases only the unitary and antiunitary congruence transformations preserve the Bregman divergence.

THEOREM 43. Let f be a differentiable convex function on (0, ) ∞ such that f 0 is bounded from below and unbounded from above. Let φ : B++(H ) B++(H ) be a bijective map which satisfies → ¡ ¢ H φ(A),φ(B) H (A,B), A,B B++(H ). f = f ∈ 58 3. PRESERVER PROBLEMS

Then there exists a unitary or antiunitary operator U : H H such that → φ is of the form φ(A) U AU ∗, A B++(H ). = ∈ PROOF. First we recall that since f is convex and everywhere dif- ferentiable, f 0 is continuous. By the convexity of f , its derivative f 0 is monotonically increasing. The assumption that f 0 is bounded from be- low implies that limx 0 f 0(x) exists and is finite. Hence f 0 can be con- tinuously extended onto→ + [0, ). The same holds for f , too. Indeed, R x ∞ R x f (x) f (1) f 0(t)dt, x 0 and f 0(t)dt is convergent as x tends to 0 = + 1 > 1 by the boundedness of f 0 on (0,1]. In the rest of the proof we shall need the following characterization of the usual order on B++(H ). ≤ Claim A. Let B,C B++(H ). The set ∈ © ª (79) H (B, A) H (C, A) A B++(H ) f − f | ∈ is bounded from below if and only if B C. ≤ To prove the claim we first compute H (B, A) H (C, A) f − f Tr f (B) Tr f (A) Tr f 0(A)(B A) (80) = − − − Tr f (C) Tr f (A) Tr f 0(A)(C A) − + + − Tr f (B) Tr f (C) Tr f 0(A)(C B). = − + − Let k denote a lower bound of f 0. Assume B C. Then by the inequality ≤ f 0(A) kI, A B++(H ), ≥ ∈ we have that

Tr f 0(A)(C B) TrkI(C B) k Tr(C B) − ≥ − = − holds for every A B++(H ) which shows that the set (79) is bounded ∈ from below. Conversely, if B C, then there exists a unit vector x H 6≤ ∈ such that x,Bx x,C x . Let P denote the orthogonal projection onto 〈 〉 > 〈 〉 x the one-dimensional subspace generated by x. For any t 0 we have > (81) Tr f 0 (tP (I P ))(C B) f 0(t) x,(C B)x f 0(1)Tr(I P )(C B). x + − x − = 〈 − 〉 + − x − © ª Since x,(C B)x 0 and f 0(t) t 0 is unbounded from above, hence 〈 − 〉 < | > the first term on the right hand side of (81) is unbounded from below. By (80), it follows that that the set © ª H (B, A) H (C, A) A B++(H ) f − f | ∈ is unbounded from below. This proves our claim. 1. MAPS PRESERVING BREGMAN AND JENSEN DIVERGENCES 59

Since the bijective map φ : B++(H ) B++(H ) preserves the Breg- → man f -divergence, using the above characterization of the order we ob- tain that φ is an order automorphism, i.e., for any B,C B++(H ) we have ∈ B C if and only if φ(B) φ(C). ≤ ≤ By the result [58, Theorem 1] of Lajos Molnár, φ, just as any order automorphism of B++(H ), is of the form

φ(A) T AT ∗, A B++(H ), = ∈ where T is an invertible linear or conjugate-linear operator on H . We may suppose that T is linear, since if we are done with this, the case of conjugate-linear T is not difficult to handle. We show that T is unitary. Assume on the contrary that T is not uni- tary. Then T ∗T I. Consider the polar decomposition T UP of T 6= 1 = where P pT T is positive definite and U TP − is unitary. Since the = ∗ = Bregman f -divergence is invariant under unitary congruences, hence we have ¡ ¢ ¡ ¢ H (A,B) H φ(A),φ(B) H T AT ∗,TBT ∗ f = f = f (82) ¡ ¢ H UPAPU ∗,UPBPU ∗ H (PAP,PBP) = f = f for all A,B B++(H ). Since T is not unitary, hence P I, and thus P has ∈ 6= an eigenvalue different from 1. Without serious loss of generality we may assume that P has an eigenvalue greater than 1 for the following reason. The map A PAP is a bijection of B++(H ) which preserves the Breg- 7→ 1 1 man f -divergence. Therefore, the inverse transformation A P − AP − 7→ preserves the Bregman f -divergence as well. If P does not have any 1 eigenvalue greater than 1, P − must have one. Suppose that P v λv for some λ 1 and unit vector v H . Let Q = > ∈ v be the orthogonal projection onto the one-dimensional subspace gener- ated by v. By (82), the transformation A PAP preserves the Bregman 7→ f -divergence and so does any of its powers A P n AP n. Hence 7→ H ¡λ2Q ,Q ¢ H ¡P nλ2Q P n,P nQ P n¢ f v v = f v v ¡ 2(n 1) 2n ¢ (83) H λ + Q ,λ Q = f v v holds for any n N. ∈ We now consider the symmetrized Bregman f -divergence H (A,B) f + H f (B, A) which can be written in the following convenient form

H (A,B) H (B, A) Tr f (A) Tr f (B) Tr f 0(B)(A B) f + f = − − − ¡ ¢ Tr f (B) Tr f (A) Tr f 0(A)(B A) Tr f 0(A) f 0(B) (A B). + − − − = − − By (83) we have H ¡λ2Q ,Q ¢ H ¡Q ,λ2Q ¢ f v v + f v v 60 3. PRESERVER PROBLEMS

¡ 2(n 1) 2n ¢ ¡ 2n 2(n 1) ¢ H λ + Q ,λ Q H λ Q ,λ + Q = f v v + f v v ¡ 2(n 1) 2n ¢¡ 2(n 1) 2n ¢ Tr f 0(λ + Q ) f 0(λ Q ) λ + Q λ Q = v − v v − v ¡ ¡ 2(n 1)¢ ¡ 2n¢¢ 2n ¡ 2 ¢ f 0 λ + f 0 λ λ λ 1 = − − ¡ 2(n 1) ¡ 2n¢¢ 2n for any n N. This means that f 0(λ + ) f 0 λ λ is independent ∈ − of n, that is, ¡ 2(n 1)¢ ¡ 2n¢ c f 0 λ + f 0 λ n − = ¡λ2¢ holds for some constant c. Therefore, à n ! ¡ 2(n 1)¢ X ³ ³ 2(k 1)´ ³ 2k ´´ lim f 0(x) lim f 0 λ + lim f 0(1) f 0 λ + f 0 λ x n n →∞ = →∞ = →∞ + k 0 − = n X c X∞ ¡ 2¢n f 0(1) lim f 0(1) c λ− . n ¡ 2¢k = + →∞ k 0 λ = + n 0 < ∞ = = which contradicts the assumption that f 0 is unbounded from above. The proof of the theorem is complete. ä The probably most important Bregman f -divergences on B++(H ) correspond to the following functions: x xp (p 1), x x logx x, 7→ > 7→ − x logx. Unfortunately, the general theorem above covers only the 7→ − case of the first type of functions. Indeed, the second function has deriv- ative which is neither bounded from below nor unbounded from above and the derivative of the third one is not bounded from below. Fortu- nately, the Bregman divergence related to the third function, i.e., Stein’s loss is of the form (74) and the corresponding preservers were character- ized in [65]. By [65, Theorem 2] a surjective map φ : B++(H ) B++(H ) → preserves the Stein’s loss if and only if there is an invertible linear or con- jugate linear operator T : H H such that φ is of the form → φ(A) T AT ∗, A B++(H ). = ∈ In what follows we characterize the preservers of Umegaki’s relative entropy or, in other words, von Neumann divergence which is the Breg- man divergence corresponding to the function x x logx x. It is clear 7→ − that this divergence equals ¡ ¢ Tr A(log A logB) (A B) , A,B B++(H ). − − − ∈ The result reads as follows.

THEOREM 44. Let φ : B++(H ) B++(H ) be a surjective map which → satisfies Tr¡φ(A)¡logφ(A) logφ(B)¢ ¡φ(A) φ(B)¢¢ (84) − − − Tr(A(log A logB) (A B)), A,B B++(H ). = − − − ∈ 1. MAPS PRESERVING BREGMAN AND JENSEN DIVERGENCES 61

Then there exists a unitary or antiunitary operator U : H H such that → φ is of the form φ(A) U AU ∗, A B++(H ). = ∈ PROOF. We begin with the following general observation. Assume that f is a strictly convex differentiable function on (0, ). We assert that ∞ for any B,C B++(H ), the set ∈ © ª H (A,B) H (A,C) A B++(H ) f − f | ∈ is bounded from below if and only if f 0(B) f 0(C). Indeed, we have ≤ H (A,B) H (A,C) f − f ¡ ¢ ¡ ¡ ¢¢ Tr f 0(C) f 0(B) A Tr f (C) f (B) f 0(C)C f 0(B)B = − + − − − which is easily seen to be bounded from below if and only if f 0(C) − f 0(B) 0. ≥ Therefore, for any surjective (and hence, by the strict convexity of f , bijective) map φ : B++(H ) B++(H ) which preserves the Bregman f - → divergence, we obtain that φ has the property that ¡ ¢ ¡ ¢ f 0(B) f 0(C) f 0 φ(B) f 0 φ(C) . ≤ ⇐⇒ ≤ ³ ³ 1 ´´ This means that the transformation A f 0 φ f 0− (A) is an order au- 7→ tomorphism of the set of all self-adjoint operators on H with spectrum contained in the range of f 0. ¡ ¡ A¢¢ In our particular case we have f 0 log, therefore A log φ e is = 7→ an order automorphism of the set of all self-adjoint operators on H . By [57, Theorem 2] we have an invertible linear or conjugate-linear operator T : H H and a self-adjoint linear operator X : H H such that → → ¡ ¡ A¢¢ log φ e T AT ∗ X = + or T log AT X (85) φ(A) e ∗+ , A B++(H ). = ∈ We assume that T is linear, the conjugate-linear case is similar, not diffi- cult to handle. Consider the polar decomposition T UP of T where U = is unitary and P pT T is positive definite. As already mentioned, the = ∗ unitary similarity transformation A U AU ∗ preserves Bregman diver- 7→ gences. Since

UP(log A)PU X P(log A)P U XU φ(A) e ∗+ Ue + ∗ U ∗, A B++(H ), = = ∈ it follows that in (85) we can and do assume that T is a positive definite operator. We prove that necessarily T I holds. To see this, first assume = 62 3. PRESERVER PROBLEMS that T has an eigenvalue which is greater than 1. We know that (86) ³ T (log A)T X ³ T (log A)T X T (logB)T X ´´ Tr e + (T (log A)T T (logB)T ) e + e + − − − Tr(A(log A logB) (A B)), A,B B++(H ). = − − − ∈ Fixing A,B, write tB in the place of B where t 0 is arbitrary. The above > equality gives us that (logt)T 2 T (logB)T X (87) a logt Tre + + c d logt et f , t 0 + + = + + > holds for some real constants a,b,c,d,e, f . Select a real number µ such that µI T (logB)T X . Let λ be an eigenvalue of T which is greater ≤ + than 1 and let x be a corresponding unit eigenvector. Denote by Px the rank-one projection onto the subspace generated by x. Then λ2P T 2, x ≤ so for t 1 we have (logt)λ2P µI (logt)T 2 T (logB)T X . By the ≥ x + ≤ + + monotonicity of trace functions (see [13, 2.10. Theorem]) this implies that 2 2 (logt)λ Px µI (logt)T T (logB)T X Tre + Tre + + , t 1. ≤ ≥ (logt)T 2 T (logB)T X Therefore, the function t Tre + + , t 1 can be minorized 2 7→ ≥ by a function αt λ β with some positive α and real number β. Since λ2 + > 1, considering the equality (87) and letting t tend to infinity, we easily obtain a contradiction. Therefore, the eigenvalues of T are all less than or equal to 1. However, the inverse of φ also preserves the von Neumann divergence and we have 1 T 1(log A)T 1 T 1 XT 1 φ− (A) e − − − − − , A B++(H ) = ∈ 1 It follows that the eigenvalues of T − are also not greater than 1. We con- clude that T I. = We finally prove that X 0. Let A I and B be any element of = = B++(H ) which commutes with X . We obtain from (86) that Tre X ( logB B I) Tr( logB B I). − + − = − + − By the properties of the function x logx x 1, x 0 (strictly in- 7→ − + − > creasing for x 1, and takes the value 0 at 1) it is easy to see that any ≥ positive semidefinite operator D which commutes with X can be written as D logB B I with some positive definite B which commutes with = − + − X . Consequently, we have Tre X D TrD = for any positive semidefinite operator D on H . This clearly implies that e X I, i.e., X 0. The proof of the theorem is complete. = = ä We now turn to the preservers of Jensen divergences. Our general re- sult reads as follows. 1. MAPS PRESERVING BREGMAN AND JENSEN DIVERGENCES 63

THEOREM 45. Let f be a differentiable strictly convex function on (0, ), assume limx 0 f (x) exists and finite and f 0 is unbounded from ∞ → + above. Pick λ (0,1). If φ : B++(H ) B++(H ) is a surjective map which ∈ → satisfies ¡ ¢ J φ(A),φ(B) J (A,B), A,B B++(H ), f ,λ = f ,λ ∈ then there exists a unitary or antiunitary operator U : H H such that → φ is of the form φ(A) U AU ∗, A B++(H ). = ∈ PROOF. Observe first that, by the assumptions on the function, f is monotonically increasing for large enough values of its variable. Next, we verify the following. Claim B. For any B,C B++(H ), the set ∈ © ª J (A,B) J (A,C) A B++(H ) f ,λ − f ,λ | ∈ is bounded from below if and only if B C. ≤ To prove the claim first observe that J (A,B) J (A,C) (1 λ)¡Tr f (B) Tr f (C)¢ f ,λ − f ,λ = − − + Tr f (λA (1 λ)C) Tr f (λA (1 λ)B), A B++(H ). + + − − + − ∈ Assume now that B C and that there is a sequence (A ) in B++(H ) ≤ k such that Tr f (λA (1 λ)C) Tr f (λA (1 λ)B) . k + − − k + − → −∞ Denote µ(k), i 1,...,n the eigenvalues of λA (1 λ)B and λ(k), i i = k + − i = 1,...,n the eigenvalues of λA (1 λ)C both ordered increasingly. By the k + − Weyl inequality (see e.g. [37, 4.3.3. Corollary]) it follows that µ(k) λ(k) ³ ³ ´ ³ ´´ i ≤ i for all i and k. Moreover, we have P f λ(k) f µ(k) which im- i i − i → −∞ ³ ³ (k)´ ³ (k)´´ plies that the sequence f λi f µi is not bounded from below for − ³ ³ ´ ³ ´´ some i 1,...,n and hence it has a subsequence f λ(kl ) f µ(kl ) = i − i → ³ (kl )´ . Since f is bounded from below, we deduce that f µi . This −∞ ³ ´ ³ ´ → ∞ implies that µ(kl ) and hence f λ(kl ) f µ(kl ) 0 for all but finitely i → ∞ i − i ≥ many indexes l which is a contradiction. Conversely, if B C, then there exists a unit vector x H such that 6≤ ∈ x,Bx x,C x . Set ε x,(B C)x and let P denote the orthogonal 〈 〉 > 〈 〉 = 〈 − 〉 x projection onto the one-dimensional subspace generated by x. Let m be a positive number such that mI (1 λ)C is positive definite. For any − − t 0 set > 1 At : (mI tPx (1 λ)C). = λ + − − 64 3. PRESERVER PROBLEMS

Now J (A ,B) J (A ,C) (1 λ)¡Tr f (B) Tr f (C)¢ f ,λ t − f ,λ t = − − Tr f (λAt (1 λ)C) Tr f (λAt (1 λ)B) (88) + + − − + − (1 λ)¡Tr f (B) Tr f (C)¢ = − − Tr f (mI tP ) Tr f (mI tP (1 λ)(B C)). + + x − + x + − − Let {x, y , y ,..., y } be an orthonormal basis in H . For 2 j n, denote 2 3 n ≤ ≤ by a the diagonal matrix elements of mI tP (1 λ)(B C) relative j j + x + − − to this basis, that is, a : (mI tP (1 λ)(B C)) y , y (mI (1 λ)(B C)) y , y . j j = 〈 + x + − − j j 〉 = 〈 + − − j j 〉 Note that a is independent of t for 2 j and j j ≤ (mI tP (1 λ)(B C))x,x m t (1 λ)ε. 〈 + x + − − 〉 = + + − By Peierls inequality (see [13, 2.9. Theorem]), for the convex function f we have n X f (m t (1 λ)ε) f (a j j ) Tr f (mI tPx (1 λ)(B C)). + + − + j 2 ≤ + + − − = On the other hand, it is apparent that Tr f (mI tP ) f (m t) (n 1)f (m). + x = + + − Therefore, by (88), J (A ,B) J (A ,C) f (m t) f (m t (1 λ)ε) K (B,C), f ,λ t − f ,λ t ≤ + − + + − + where n ¡ ¢ X K (B,C) (1 λ) Tr f (B) Tr f (C) (n 1)f (m) f (a j j ) = − − + − − j 2 = is independent of the parameter t. Since f 0 is unbounded from above, hence lim f (m t) f (m t (1 λ)ε) , t + − + + − = −∞ →∞ and this completes the proof of our claim. Using the characterization of the order given in Claim B, we see that the transformation φ is an order automorphism of B++(H ) and hence it is of the form φ(A) T AT ∗, A B++(H ), = ∈ where T is an invertible linear or conjugate-linear operator on H . We consider only the case where T is linear and prove that then T is neces- sarily unitary. 1. MAPS PRESERVING BREGMAN AND JENSEN DIVERGENCES 65

As already mentioned, the unitary-antiunitary congruence transfor- mations preserve the Jensen divergences. Hence, by polar decomposi- tion, we can assume that T is a positive definite operator. Any power of φ is also a divergence preserver, so for every positive integer n we have that

Trλf (T n AT n) (1 λ)f (T nBT n) f (T n(λA (1 λ)B)T n) (89) + − − + − Trλf (A) (1 λ)f (B) f (λA (1 λ)B), A,B B++(H ). = + − − + − ∈ Since f is continuously extendible onto [0, ), thus we can insert any ∞ positive semidefinite operators A,B in the equality above. Assume T has an eigenvalue, say s, which is greater than 1 and x is a corresponding unit eigenvector. As before, denote by Px the orthogonal projection onto the subspace generated by x. Plug A P and B 0 into (89). We have = x = λf ¡s2n¢ f ¡λs2n¢ c − = with some constant c. One can easily check that the function t λf (t) 7→ − f (λt) is monotonically increasing (just differentiate and use that f 0 is in- creasing). Since s2n , it follows that → ∞ λf (t) f (λt) c, t 0. − = >

We deduce λf 0(t) f 0(λt)λ, t 0 and this implies that f 0 is constant. We = > obtain that f is affine, a contradiction. It follows that T has no eigenvalue 1 which is greater than 1. Since φ− also preserves the λ f Jensen diver- 1 − gence, we have that the eigenvalues of T − are also not greater than 1. It follows that T I and the proof of the theorem is complete. = ä

Let us again consider the three most important examples x xp 7→ (p 1), x x logx x, x logx of generating functions. The first > 7→ − 7→ − two do satisfy the conditions in our theorem, hence the corresponding preservers are unitary-antiunitary congruence transformations. As for the third one, it does not satisfy the conditions (not bounded below), but the Jensen divergence in that case is of the form (74), see the dis- cussion before Proposition 42. By [65, Theorem 2], a surjective map φ : B++(H ) B++(H ) preserves the corresponding Jensen divergence → (i.e., Chebbi-Moakher log-determinant α-divergence) if and only if there is an invertible linear or conjugate linear operator T : H H such that → φ is of the form

φ(A) T AT ∗, A B++(H ). = ∈ 66 3. PRESERVER PROBLEMS

2. Jordan triple endomorphisms of the positive cone Now we turn to another preserver problem on the cone of positive op- erators. Namely, we describe the structure of the Jordan triple endomor- phisms of the positive definite cones in operator algebras. These endo- morphisms are maps which are morphisms with respect to the operation of the Jordan triple product (A,B) ABA which is a well-known opera- 7→ tion in ring theory. Our main reason for investigating these maps comes from the fact that they naturally appear in the study of surjective isome- tries and surjective maps preserving generalized distance measures be- tween positive definite cones. For details see [55, 54, 65]. In the paper [55] Lajos Molnár proved the following statement. (In the following, as usual, H stands for a finite dimensional complex Hilbert space.)

THEOREM 46. Assume dim(H ) 3. Let φ : B++(H ) B++(H ) be a ≥ → continuous map which is a Jordan triple endomorphism, i.e., φ is a con- tinuous map which satisfies

φ(ABA) φ(A)φ(B)φ(A), A,B B++(H ). = ∈ Then there exist a unitary operator U B(H ), a real number c, a set ∈ {P1,...,Pn} of mutually orthogonal rank-one projections in B(H ), and a set {c1,...,cn} of real numbers such that φ is of one of the following forms: c (a1) φ(A) (DetA) U AU ∗,A B++(H ); = c 1 ∈ (a2) φ(A) (DetA) UA− U ∗,A B++(H ); = c tr ∈ (a3) φ(A) (DetA) UA U ∗,A B++(H ); = c tr 1 ∈ (a4) φ(A) (DetA) UA − U ∗,A B++(H ); = Pn c j ∈ (a5) φ(A) j 1(DetA) P j ,A B++(H ). = = ∈ Observe that the converse statement in Theorem 46 is also true mean- ing that any transformation of any of the forms (a1)-(a5) is necessarily a continuous Jordan triple endomorphism of B++(H ). One may immediately ask why we assume the condition dim(H ) ≥ 3, what happens in the case where dim(H ) 2. The fact is that in the = proof of Theorem 46 Molnár used such tools which are applicable only if dim(H ) 3. The remaining case dim(H ) 2 was proposed as an open ≥ = problem in the papers [55] (see Remark 11) and [54] (see Remark 23). One may think that when dim(H ) 2, one can simply compute and = obtain the solution straightaway. But this is far from being true as it will turn out below. Our main result on Jordan triple endomorphisms reads as follows. 2. JORDAN TRIPLE ENDOMORPHISMS OF THE POSITIVE CONE 67

THEOREM 47. Let H be a two-dimensional Hilbert space. Let φ : B++(H ) B++(H ) be a continuous Jordan-triple endomorphism. Then → we have the following possibilities: (b1) there is a unitary operator U B(H ) and a real number c such ∈ that c φ(A) (DetA) U AU ∗, A B++(H ); = ∈ (b2) there is a unitary operator V B(H ) and a real number d such ∈ that d 1 φ(A) (DetA) VA− V ∗, A B++(H ); = ∈ (b3) there is a unitary operator W B(H ) and real numbers c ,c ∈ 1 2 such that

c1 c2 φ(A) W Diag[(DetA) ,(DetA) ]W ∗, A B++(H ). = ∈ Before presenting the proof we introduce a few notation and make some useful observations. We equip Bsa(H ) with the inner product X ,Y (1/2)Tr XY . The 〈 〉 = induced norm is denoted by . Let {e ,e } be an arbitrary orthonormal k·k 1 2 basis in H . In the following, whenever a 2 2 matrix appears as an ele- × ment of B(H ), it denotes the operator with the following (characteriz- ing) property: its matrix expansion in the above fixed orthonormal basis coincides with the given matrix. The set (90) ½ · 1 0 ¸ · 0 1 ¸ · 0 i ¸ · 1 0 ¸¾ σ I , σ , σ , σ 0 = = 0 1 x = 1 0 y = i 0 z = 0 1 − − sa sa is a convenient orthonormal basis in B (H ). Let B0 (H ) denote the traceless subspace of Bsa(H ) (the subspace of all elements in Bsa(H ) with zero trace). In the proof of our theorem we shall use the following two observa- tions. We first claim that for X Bsa(H ), the equality X 2 I holds iff ∈ 0 = X 1. Indeed, let us denote the eigenvalues of X by λ and λ, λ 0. k k = − ≥ We have 1 X 2 I λ2 1 ¡λ2 ( λ)2¢ 1 X 1 = ⇔ = ⇔ 2 + − = ⇔ k k = verifying our first calim. Next, we assert that for any 0 X Bsa(H ) we have e X 6= ∈ 0 = (cosh X ) I (sinh X )(X / X ). To see this, using (X / X )2 I, we k k + k k k k k k = compute µ ¶k X ∞ 1 X X X X X k e ek k k k X = = k 0 k! k k X = k k 68 3. PRESERVER PROBLEMS

X∞ 1 2k X∞ 1 2k 1 X X I X + . = k 0 (2k)! k k + k 0 (2k 1)! k k X = = + k k This proves our assertion. Now we turn to the proof of the main result.

PROOFOF THEOREM 47. Let H be a two-dimensional Hilbert space and let φ : B++(H ) B++(H ) be a continuous Jordan triple endomor- → phism. Then, by [55, Lemma 6] there exists a commutativity preserving linear transformation f : Bsa(H ) Bsa(H ) such that → φ(A) exp(f (log A)), A B++(H ). = ∈ In fact, similar conclusion holds for all continuous Jordan triple endo- morphisms between the positive definite cones of general C ∗-algebras as it has been shown in [54, Lemma 16]. By a commutativity preserving linear map we simply mean a transformation which sends commuting elements to commuting elements. We have two possibilities for f (I): It is either a scalar multiple of the identity or it is not. We divide the argument accordingly. Assume first that f (I) is not a scalar multiple of the identity. Then up to unitary similarity we may and do assume that f (I) is a diagonal operator with two different eigenvalues. By the commutativity preserving property of f , for every A Bsa(H ) we have that f (A) commutes with ∈ f (I) and then it follows that f (A) is diagonal, too. Therefore, we have linear functionals ϕ,ψ : Bsa(H ) R such that → · ϕ(A) 0 ¸ f (A) , A Bsa(H ) = 0 ψ(A) ∈ and hence · eϕ(log A) 0 ¸ φ(A) , A B++(H ). = 0 eψ(log A) ∈ Since φ is a Jordan triple endomorphism, we deduce easily that

ϕ(log ABA) 2ϕ(log A) ϕ(logB), A,B B++(H ) = + ∈ and similar equality holds for ψ as well. Since ϕ is a linear functional on Bsa(H ), by Riesz representation theorem we have an element T ∈ Bsa(H ) such that ϕ( ) ,T . It follows that we have · = 〈· 〉 Tr((log ABA)T ) 2Tr((log A)T ) Tr((logB)T ), A,B B++(H ). = + ∈ Following the argument given on p. 2844 in [53] from the displayed equality (2) on, one can verify that T is necessarily a scalar multiple of the identity and that means that ϕ(A) c Tr A, A Bsa(H ) holds for some = ∈ 2. JORDAN TRIPLE ENDOMORPHISMS OF THE POSITIVE CONE 69 real number c. The same observation applies for ψ, too, and then we conclude that there are real numbers c1,c2 such that we have · (DetA)c1 0 ¸ φ(A) , A B++(H ), = 0 (DetA)c2 ∈ which gives us (b3). In the remaining part of the proof we assume that f (I) is a scalar mul- tiple of the identity. sa Let us define the linear functional f0 : B (H ) R by f0( ) ­ ® → · = σ , f ( ) , that is, by f (A) (1/2)Tr f (A), A Bsa(H ). 0 · 0 = ∈ The first crucial step in the proof follows. sa Claim 1. The linear functional f0 vanishes on B0 (H ). sa The subspace B0 (H ) is generated by σx ,σy ,σz . We show that f (σ ) f (σ ) 0, the remaining equality f (σ ) 0 can be verified sim- 0 x = 0 y = 0 z = ilarly. In what follows we consider arbitrary positive real parameters s,t. Direct calculations show that for all such s,t we have

s σ tσ s σ e 2 x e y e 2 x ³ ³ s ´ ³ s ´ ´¡ ¢ cosh I sinh σx cosh(t)I sinh(t)σy = 2 + 2 + × ³ ³ s ´ ³ s ´ ´ cosh I sinh σx × 2 + 2 2 ³ s ´ ³ s ´ ³ s ´ 2 ³ s ´ cosh(t)cosh I 2cosh(t)cosh sinh σx cosh sinh(t)σy = 2 + 2 2 + 2 ³ s ´ ³ s ´ ³ s ´ ³ s ´ sinh sinh(t)cosh σx σy cosh sinh(t)sinh σy σx + 2 2 + 2 2 2 ³ s ´ 2 2 ³ s ´ cosh(t)sinh σ sinh sinh(t)σx σy σx + 2 x + 2 cosh(s)cosh(t)I cosh(t)sinh(s)σ sinh(t)σ . = + x + y Here we have used the equalities σ σ σ σ 0, σ2 I, σ σ σ σ x y + y x = x = x y x = − y and some identities of the hyperbolic functions. Since, by the multiplicativity of the determinant, we have

³ s σ tσ s σ ´ Det e 2 x e y e 2 x 1, = hence s σ tσ s σ rW e 2 x e y e 2 x e = holds for some W Bsa(H ) with W 1 and r 0 (observe that W ∈ 0 k k = ≥ depends on s,t). Since erW cosh(r )I sinh(r )W we obtain the equality = + cosh(r )I sinh(r )W cosh(s)cosh(t)I cosh(t)sinh(s)σ sinh(t)σ . + = + x + y Taking trace we first deduce that 1 (91) r cosh− (cosh(s)cosh(t)) = 70 3. PRESERVER PROBLEMS and next that sinh(r )W cosh(t)sinh(s)σ sinh(t)σ . = x + y Clearly, due to s,t 0, the possibility r 0 is ruled out and hence we infer > = that 1 ¡ ¢ W cosh(t)sinh(s)σx sinh(t)σy = sinh(r ) + cosh(t)sinh(s)σ sinh(t)σ x + y (92) p . = cosh2(s)cosh2(t) 1 − Now, on the one hand, we compute

³ ³ s σ tσ s σ ´´ Det φ e 2 x e y e 2 x

µ ³ ³ s s ´´¶ ³ ³ s s ´´ f log e 2 σx etσy e 2 σx Tr f log e 2 σx etσy e 2 σx (93) Det e e e2f0(rW ). = = = On the other hand, since φ is a Jordan triple endomorphism, the quantity (93) is equal to

³ ³ s σ ´ ¡ tσ ¢ ³ s σ ´´ ³ s f (σ ) t f (σ ) s f (σ )´ Det φ e 2 x φ e y φ e 2 x Det e 2 x e y e 2 x = (94) s s Tr f (σx ) t Tr f (σy ) Tr f (σx ) 2(s f0(σx ) t f0(σy )) e 2 e e 2 e + . = = Let us introduce the auxiliary function 1 cosh− (cosh(s)cosh(t)) N(s,t) p , 0 s,t R. = cosh2(s)cosh2(t) 1 < ∈ − By (91), (92), (93), (94) we have s f (σ ) t f (σ ) f (rW ) 0 x + 0 y = 0 N(s,t)cosh(t)sinh(s)f (σ ) N(s,t)sinh(t)f (σ ) = 0 x + 0 y for all 0 s,t R. It is not difficult to check that the two-variable func- < ∈ tions g(s,t) N(s,t)cosh(t)sinh(s) s and h(s,t) N(s,t)sinh(t) t are = − = − linearly independent. Indeed, one can see that the determinant of the matrix · g(1,1) h(1,1) ¸ g(2,2) h(2,2) is nonzero (its value is close to -0.5) which implies the desired linear in- dependence. It then follows that f (σ ) f (σ ) 0 and we obtain Claim 0 x = 0 y = 1. sa As a consequence we infer that the subspace f (B0 (H )) is orthogo- sa nal to σ0 I meaning that it consists of traceless operators, f (B0 (H )) sa = ⊂ B0 (H ). Since f (I) is a scalar multiple of the identity, we also have 2. JORDAN TRIPLE ENDOMORPHISMS OF THE POSITIVE CONE 71

sa sa f (B (H )⊥) B (H )⊥. We will use these facts in the second crucial 0 ⊂ 0 step of the proof which follows. sa Claim 2. The restriction of f to B0 (H ) is a non-negative scalar mul- tiple of an isometry. To see this, it is sufficient to show that f (σx ), f (σy ), f (σz ) are mutually orthogonal and of the same norm. Clearly, we are done if we verify this for any two elements of the collection f (σx ), f (σy ), f (σz ). We shall consider, for example, f (σx ) and f (σy ). Recalling that f (W ) is traceless, in the case where f (W ) 0, we compute 6= s s µ ³ ³ σ tσ σ ´´¶ 1 ³ s σ tσ s σ ´ 1 f log e 2 x e y e 2 x 1 r f (W ) l(s,t): Trφ e 2 x e y e 2 x Tr e Tre = 2 = 2 = 2 µ ¶ 1 ¡ ° °¢ ¡ ° °¢ f (W ) ¡ ° °¢ Tr cosh r °f (W )° I sinh r °f (W )° ° ° cosh r °f (W )° . = 2 + °f (W )° = ¡ ° °¢ If f (W ) 0, then we again have l(s,t) cosh r °f (W )° and, by (92), we = = can further compute ¡° ° 1 ¢ l(s,t) cosh °f (W )°cosh− (cosh(s)cosh(t)) = µ 1 q cosh− (cosh(s)cosh(t)) 2 2 ° °2 cosh p sinh (s)cosh (t)°f (σx )° = cosh2(s)cosh2(t) 1 × − ¶ ­ ® 2 ° °2 f (σ ), f (σ ) 2sinh(s)sinh(t)cosh(t) sinh (t)°f (σ )° + x y + y µ 1 cosh− (cosh(s)cosh(t)) cosh p = cosh2(s)cosh2(t) 1 × q − ¡ 2 2 2 ¢° °2 cosh (s)cosh (t) cosh (t) °f (σ )° × − x + ¶ ­ ® ¡ 2 ¢° °2 (95) f (σ ), f (σ ) 2sinh(s)sinh(t)cosh(t) cosh (t) 1 °f (σ )° . + x y + − y Since φ is a Jordan triple endomorphism, (95) is equal to

1 ³ ³ s σ ´ ¡ tσ ¢ ³ s σ ´´ m(s,t): Tr φ e 2 x φ e y φ e 2 x = 2

1 ³ s f (σ ) t f (σ ) s f (σ )´ 1 ³ s f (σ ) t f (σ )´ Tr e 2 x e y e 2 x Tr e x e y . = 2 = 2 ° ° Assume f (σx ), f (σy ) 0 and denote X f (σx )/°f (σx )° and Y ° ° 6= = = f (σy )/°f (σy )°. Then, since f (σx ), f (σy ) are traceless, we can continue

µ ¶ 1 ¡ ° °¢ ¡ ° °¢ f (σx ) m(s,t) Tr cosh s °f (σx )° I sinh s °f (σx )° ° ° = 2 + °f (σx )° 72 3. PRESERVER PROBLEMS Ã ! ¡ ° °¢ ¡ ° °¢ f (σy ) cosh t °f (σy )° I sinh t °f (σy )° ° ° × + °f (σy )° ¡ ° °¢ ¡ ° °¢ cosh s °f (σ )° cosh t °f (σ )° = x y + ¡ ° °¢ ¡ ° °¢ (96) X ,Y sinh s °f (σ )° sinh t °f (σ )° . +〈 〉 x y ° ° ° ° ° ° We show that °f (σx )° °f (σy )°. To this, set α : °f (σx )°, β : ° ° = = = °f (σ )°, γ : X ,Y . It is easy to check that y = 〈 〉 1 1 ¡ 2 ¢ lim cosh− cosh (t) 2 t = →∞ t and s ¡cosh4(t) cosh2(t)¢α2 2αβγsinh2(t)cosh(t) ¡cosh2(t) 1¢β2 lim − + + − t cosh4(t) 1 →∞ − α. = From these we get that for every 0 ε( 2) there exists some 0 T such < < < ε that l(t,t) cosh((2 ε)αt) holds for t T . On the other hand, it is easy ≥ − > ε to see that 1 (α β)t ¡ ¢ m(t,t) e + 1 γ o(1) . = 4 + + Therefore, the inequality m(t,t) l(t,t) cosh((2 ε)αt) = ≥ − is equivalent to

1 ¡ ¢ 1 ³ ((1 ε)α β)t ((3 ε)α β)t ´ (97) 1 γ o(1) e − − e− − + 4 + + ≥ 2 + Taking the limit t in (97) we infer that β (1 ε)α. This is true for → ∞ ≥ − ° ° any 0 ε 2, hence letting ² 0 we obtain β α, that is, °f (σy )° ° <° < → ≥ ≥ °f (σx )°. By changing the roles of σx and σy we get the desired equality ° ° ° ° °f (σ )° °f (σ )°. Having this in mind, it is clear that the function m( , ) y = x · · is symmetric in the sense that we have m(s,t) m(t,s) for all 0 s,t R, = < ∈ see (96). It follows that l( , ) is also symmetric which can happen only · · when ­f (σ ), f (σ )® 0, see (95). Therefore, we have f (σ ) f (σ ) , x y = k x k = k y k ­f (σ ), f (σ )® 0 and we are done in the case where f (σ ), f (σ ) 0. x y = x y 6= Assume now that f (σ ) 0, f (σ ) 0. By (95), (96) we have x = y 6= Ã 1 q ! cosh− (cosh(s)cosh(t)) ¡ 2 ¢° °2 cosh p cosh (t) 1 °f (σy )° cosh2(s)cosh2(t) 1 − − ¡ ° °¢ cosh t °f (σ )° . = y 2. JORDAN TRIPLE ENDOMORPHISMS OF THE POSITIVE CONE 73

It follows that v 1 ¡ 2 ¢u¡ 2 ¢° °2 cosh− cosh (t) u cosh (t) 1 °f (σy )° ° ° t − °f (σy )°. t cosh4(t) 1 = − Letting t tend to infinity, we obtain f (σ ) 0, a contradiction. y = Assume f (σ ) 0, f (σ ) 0. Again, by (95), (96) we have x 6= y = Ã 1 q ! cosh− (cosh(s)cosh(t)) ¡ 2 2 2 ¢° °2 cosh p cosh (s)cosh (t) cosh (t) °f (σx )° cosh2(s)cosh2(t) 1 − − ¡ ° °¢ cosh s °f (σ )° . = x It follows that v 1 ¡ 2 ¢u¡ 4 2 ¢° °2 cosh− cosh (t) u cosh (t) cosh (t) °f (σx )° ° ° t − °f (σx )°. t cosh4(t) 1 = − ° ° ° ° ° ° Letting t tend to infinity, we deduce 2°f (σ )° °f (σ )°, i.e., °f (σ )° x = x x = 0, a contradiction again. So it remains only the possibility f (σ ) x = f (σ ) 0 and this proves Claim 2. y = To complete the proof of our theorem, let us see what happens when the restriction of f onto Bsa(H ) is zero. We have f (I) (2c)I with some 0 = real number c. Then f (A) c(Tr A)I, A Bsa(H ) and we obtain φ(A) c = ∈ = (DetA) I, A B++(H ). This means that φ is of the form (b3). ∈ sa Now assume that the restriction of f onto B0 (H ) is a positive scalar multiple of an isometry. It follows that in the orthonormal basis (90), the transformation f has the block-matrix form · v 0 ¸ f p , = 0 M where p is a positive real number, v is a real number and M is a 3 3 × orthogonal matrix. If M SO(3), then ∈ · 1 2c 0 ¸ f p + = 0 R for some c R and R SO(3). Similarly, if M SO(3), then ∈ ∈ − ∈ · 1 2c 0 ¸ f p − + = 0 R − for some c R and R SO(3). ∈ ∈ For any R SO(3) there exists a U SU(2) such that the matrix of the ∈ ∈ transformation A U AU ∗ is 7→ · 1 0 ¸ , 0 R 74 3. PRESERVER PROBLEMS see [84, Proposition VII.5.7.]. Therefore, in the case where M SO(3) we ∈ have φ(A) exp(f (log(A))) exp(f (log A (Tr(log A)/2)I)) f ((Tr(log A)/2)I)) = = − + exp(pU(log A (Tr(log A)/2)I)U ∗)exp(p(1 2c)Tr(log A)/2) = − + pc p exp(pU(log A)U ∗)exp((pc)Tr(log A)) (DetA) UA U ∗. = = Since φ(ABA) φ(A)φ(B)φ(A), we infer (ABA)p Ap B p Ap , A,B = = ∈ B++(H ) which holds only if p 1. Consequently, we have φ(A) c = = (DetA) U AU ∗, A B++(H ). This means that φ is of the form (b1). ∈ Similarly, in the case where M SO(3) one can conclude φ(A) c 1 − ∈ = (DetA) UA− U ∗, A B++(H ), i.e, φ is of the form (b2). The proof of ∈ the theorem is complete. ä One can notice that in Theorem describing the structure of con- tinuous Jordan triple endomorphisms of B++(H ), in the case where dim(H ) 3 the transpose operation and its composition with the inverse ≥ operation also appear and one may ask why it is not so in the case where dim(H ) 2. There is no contradiction here, it is easy to see that in fact = those two possibilities do appear in Theorem 47 in a hidden way. Indeed, when dim(H ) 2, the transpose operation can be written in the form = (a2) above. Namely, for the unitary operator

· 0 1 ¸ U = 1 0 − tr 1 we have A (DetA)UA− U ∗ for all A B++(H ). = ∈ The following structural result concerning the continuous Jordan triple automorphisms of B++(H ) follows from the proof of Theorem 47.

THEOREM 48. Assume that dim(H ) 2. If φ : B++(H ) B++(H ) is = → a continuous Jordan triple automorphism, then φ is of one of the following two forms: (c1) there is a real number c 1/2 and U SU(2) such that 6= − ∈ c φ(A) (DetA) U AU ∗, A B++(H ); = ∈ (c2) there is a real number d 1/2 and V SU(2) such that 6= ∈ d 1 φ(A) (DetA) VA− V ∗, A B++(H ). = ∈ The result above has the following immediate consequence. In the case where dim(H ) 3, in [65, Theorem 1] a general result was obtained ≥ describing the possible structure of surjective maps on B++(H ) which preserve a generalized distance measure of a certain quite general kind. 2. JORDAN TRIPLE ENDOMORPHISMS OF THE POSITIVE CONE 75

It is easy to see that, following the proof of [65, Theorem 1] and apply- ing Theorem 48, the result in [65] remains valid also in the case where dim(H ) 2. = Now, we present an application of Theorem 47 for the description of so-called sequential endomorphisms of effect algebras. Effects play an important role in certain parts of quantum mechan- ics, for instance, in the quantum theory of measurement [12]. Mathe- matically, effects are represented by positive semi-definite Hilbert space operators which are bounded (in the natural order among self-adjoint ≤ operators) by the identity. The set of all Hilbert space effects are called the Hilbert space effect algebra (although it is clearly not an algebra in the classical algebraic sense). In [27] Gudder and Nagy introduced the operation called sequential product on effects which has an important ◦ physical a meaning and which is closely related the Jordan triple product. Namely, they defined A B A1/2BA1/2 ◦ = for arbitrary Hilbert space effects A,B. The corresponding endomor- phisms, i.e., maps φ on Hilbert space effects which satisfy φ(A B) φ(A) φ(B) ◦ = ◦ for all pairs A,B of effects are called sequential endomorphisms. In the literature one can find results related to sequential automorphisms or isomorphisms (bijective sequential endomorphisms). For example, Gud- der and Greechie proved in [25, Theorem 1] that, supposing the dimen- sion of the underlying Hilbert space is at least 3, the sequential automor- phisms of the Hilbert space effect algebra are exactly the transformations φ which are of the form φ : A U AU ∗, where U is either a unitary or 7→ an antiunitary operator on the underlying Hilbert space. Afterwards, in [61] Lajos Molnár substantially generalized the previous results and de- scribed the structure of sequential isomorphisms between von Neumann algebra effects (i.e., between sets of effects on Hilbert spaces belonging to given von Neumann algebras). In the paper [18] Molnár and Dolinar studied sequential endomor- phisms of effect algebras over finite dimensional Hilbert spaces of di- mension at least 3. Anybody can easily be convinced that the problem of describing non-bijective morphisms is usually much harder than that of describing bijective ones. In [18, Theorem 1] they managed to give the precise description of all continuous sequential endomorphisms as- suming the dimension is at least 3. However, the 2-dimensional case re- mained unresolved and in [18, Remark 6] they proposed it as an open problem. Now, using Theorem 47 we can present a solution of the prob- lem. 76 3. PRESERVER PROBLEMS

For any positive integer n denote by En the set of all positive semi- definite operators A which act on an n-dimensional Hilbert space and satisfy A I. ≤ THEOREM 49. Assume that dim(H ) 2 and φ : E E is a continu- = 2 → 2 ous sequential endomorphism. Then we have the following four possibili- ties: (d1) there exists a unitary U B(H ) and a non-negative real number ∈ c such that c φ(A) (DetA) U AU ∗, A E ; = ∈ 2 (d2) there exists a unitary V B(H ) such that ∈ φ(A) V (adj A)V ∗, A E ; = ∈ 2 (d3) there exists a unitary V B(H ) and a real number d 1 such ∈ > that ½ d 1 (DetA) VA− V ∗, if A E is invertible; φ(A) ∈ 2 = 0, otherwise; (d4) there exists a unitary W B(H ) and non-negative real numbers ∈ c1,c2 such that

c1 c2 φ(A) W Diag[(DetA) ,(DetA) ]W ∗, A E . = ∈ 2 Here, we mean 00 1. = PROOF. First observe that every sequential endomorphism φ : E 2 → E is automatically a Jordan triple map. Indeed, we clearly have φ(A2) 2 = φ(A)2, A E . It implies that φ(pA) pφ(A), A E and hence it follows ∈ 2 = ∈ 2 that φ is a Jordan triple map, i.e., φ satisfies φ(ABA) φ(A)φ(B)φ(A), = A,B E . Moreover, we infer that φ sends projections to projections im- ∈ 2 plying that φ(I) is a projection. If φ(I) 0, we easily have that φ is iden- = tically zero. If φ(I) P is a rank-one projection, then by φ(A) φ(IAI) = = = Pφ(A)P it follows the map A φ(A) (I P), A E is a sequential endo- 7→ + − ∈ 2 morphism of E2 which is unital, i.e., it sends I to I. Therefore, in what follows we may and do assume that our original transformation φ is a continuous unital sequential endomorphism (and hence a Jordan triple map). Consider the function λ Detφ(λI), λ [0,1]. Clearly, this is a con- 7→ ∈ tinuous multiplicative map of the interval [0,1] into itself which sends 1 to 1. Lemma 3 in [18] tells us that such a function is either everywhere equal to 1 or it is a power function corresponding to a positive exponent. This means that φ(λI) is invertible for all 0 λ 1. We claim that φ sends < ≤ invertible elements of E2 to invertible elements. To see this, first observe that φ preserves the usual order . Indeed, by [26, Theorem 5.1] we know ≤ 2. JORDAN TRIPLE ENDOMORPHISMS OF THE POSITIVE CONE 77 that for any A,B E we have A B if and only if there is a C E such ∈ 2 ≤ ∈ 2 that A B C. This clearly shows that for any A,B E with A B we have = ◦ ∈ 2 ≤ φ(A) φ(B). Now, if A E is invertible, then there is a scalar 0 λ 1 ≤ ∈ 2 < ≤ such that λI A holds which implies that φ(λI) φ(A). Since φ(λI) is ≤ ≤ invertible, it follows that φ(A) is also invertible. The sequential endomorphism φ preserves commutativity. This fol- lows from the fact that for any pair A,B of effects we have A B B A ◦ = ◦ if and only A,B as operators commute (see, e.g., Corollary 2.2 in [27]). It follows that the effects φ(λI), λ [0,1] all commute and hence they ∈ are jointly diagonizable. This means that up to unitary similarity we can write · ϕ(λ) 0 ¸ φ(λI) , λ [0,1] = 0 ψ(λ) ∈ where ϕ,ψ : [0,1] [0,1] are continuous multiplicative functions which → send 1 to 1. Therefore, by [18, Lemma 3] again, we have real numbers c,d 0 such that ≥ · λc 0 ¸ φ(λI) , λ [0,1]. = 0 λd ∈ We now distinguish two cases. Assume first that there is φ(A) which is not diagonal. Since φ(A) necessarily commute with φ(λI), λ [0,1], one ∈ can easily deduce that we necessarily have c d. It follows that φ(λI) = = λc I and hence we have φ(λA) λc φ(A) for all λ [0,1], A E . = ∈ ∈ 2 We next define Φ : B++(H ) B++(H ) by → (98) Φ(A) A c φ(A/ A ), A E . = k k k k ∈ 2 In contrast to the proof of our main result, . stands here for the operator k k norm (spectral norm) of operators; we do hope it does not cause serious confusion. It follows that for any invertible effect A E we have ∈ 2 Φ(A) A c φ(A/ A ) φ( A (A/ A )) φ(A). = k k k k = k k k k = We assert that Φ is a Jordan triple endomorphism of B++(H ). Indeed, for any A,B B++(H ) we compute ∈ ³ A ´ ³ B ´ ³ A ´ Φ(A)Φ(B)Φ(A) A 2c B φ φ φ = k k k k A B A k k k k k k ³ ABA ´ ³ ABA ABA ´ A 2c B φ A 2c B φ k k = k k k k A B A = k k k k A B A ABA k kk kk k k kk kk k k k ³ ABA ´c ³ ABA ´ ³ ABA ´ A 2c B c k k φ ABA c φ Φ(ABA), = k k k k A B A ABA = k k ABA = k kk kk k k k k k where we have used the facts that ABA /( A B A ) 1 and that k k k kk kk k ≤ (ABA)/ ABA is an effect. Clearly, Φ is continuous and hence Theo- k k rem 47 applies and we obtain that Φ is of one of the forms (b1), (b2). In 78 3. PRESERVER PROBLEMS

c the case of (b1), we have that φ(A) (DetA) U AU ∗ holds for all invertible = A E with a given unitary operator U and real number c. Since φ sends ∈ 2 effects to effects, it follows easily that c is necessarily non-negative. By continuity we deduce c φ(A) (DetA) U AU ∗, A E = ∈ 2 yielding the possibility (d1). Consider now the case where φ(A) d 1 = (DetA) VA− V ∗ holds for all invertible A E with a given unitary op- ∈ 2 erator V and real number d. Again, since φ sends effects to effects, one can easily verify that d 1. If d 1, then we have ≥ = φ(A) V (adj A)V ∗ = for all invertible A E and by continuity it follows that the same formula ∈ 2 remains valid for any A E , too. This gives us (d2). Assume d 1. Letting ∈ 2 > A be an invertible effect tending to some non-invertible one, it follows d 1 that φ(A) (DetA) VA− V ∗ tends to 0. Hence, we obtain that = ½ d 1 (DetA) VA− V ∗, if A E is invertible; φ(A) ∈ 2 = 0, otherwise and this is the possibility (d3). It remains to discuss the case where all φ(A) are diagonal, that is when we have · ϕ(A) 0 ¸ φ(A) , A E = 0 ψ(A) ∈ 2 for continuous (unital) Jordan triple maps ϕ,ψ : E [0,1]. As in (98), we 2 → can extend ϕ,ψ from the set of all invertible elements of E2 to continuous Jordan triple functionals ϕ˜,ψ˜ : B++(H ) ]0, [. Applying Theorem 47, → ∞ it follows that ϕ˜,ψ˜ are non-negative powers of the determinant function. Hence we obtain that · (DetA)c 0 ¸ φ(A) , A E = 0 (DetA)d ∈ 2 holds for some non-negative real numbers c,d. This gives (d4) and the proof of the theorem is complete. ä 3. An application: the endomorphisms of the Einstein gyrogroup 3.1. Introduction. Velocity addition was defined by Einstein in his famous paper of 1905 which founded the special theory of relativity. In fact, the whole theory is essentially based on Einstein velocity addition law, see [20]. The algebraic structure corresponding to this operation is a particular example of so-called gyrogroups the general theory of which has been developed by Ungar [88]. 3. AN APPLICATION: THE ENDOMORPHISMS OF THE EINSTEIN GYROGROUP 79

The Einstein gyrogroup of dimension three is the pair (B, ), where ⊕ B {u R3 : u 1} and is the binary operation on B given by = ∈ k k < ⊕ µ ¶ 1 1 γu (99) : B B B;(u,v) u v : u v u,v u , ⊕ × → 7→ ⊕ = 1 u,v + γ + 1 γ 〈 〉 + 〈 〉 u + u ¡ ¢ 1 where γ 1 u 2 − 2 is the so-called Lorentz factor. The operation is u = − k k ⊕ called Einstein velocity addition or relativistic sum (cf. [1, 43]). Here and throughout this section, , stands for the usual Euclidean inner product 〈· ·〉 and denotes the induced norm. k·k The main result of this research is obtained as an application of the result on the Jordan triple endomorphisms of positive definite operators acting on two-dimensional Hilbert spaces. The other ingredient of our argument is the result [43, Theorem 3.4] of Kim. The main statement reads as follows.

THEOREM 50. Let β : B B be a continuous map. We have β is an → algebraic endomorphism with respect to the operation , i.e., β satisfies ⊕ β(u v) β(u) β(v), u,v B ⊕ = ⊕ ∈ if and only if (i) either there is an orthogonal matrix O M (R) such that ∈ 3 β(v) Ov, v B; = ∈ (ii) or we have β(v) 0, v B. = ∈ Here continuity refers to the usual topology on B inherited from the Euclidean space R3. By the above result we have the interesting conclusion that the group of all (continuous) automorphisms of the Einstein gyrogroup (B, ) coin- ⊕ cides with the orthogonal group of R3.

3.2. Open Bloch ball, qubit density operators, and positive definite operators of determinant one. To prove our main result we need an im- portant observation made by Kim what we present below. We denote by P the set of all 2 2 positive definite complex matri- 2 × ces. Let D stand for the set of all 2 2 regular density matrices, i.e., the × collection of all elements of P2 with trace 1, D {A P Tr A 1}. = ∈ 2 | = From the quantum theoretical point of view, D is the set of all regular density matrices of the 2-level quantum system. One can define a binary 80 3. PRESERVER PROBLEMS operation on D as ¯ 1 1 1 : D D D;(A,B) A B : A 2 BA 2 . ¯ × → 7→ ¯ = Tr AB For certain reasons, we call the normalized sequential product. ¯ The well-known Bloch parametrization of regular density matrices is the following map:   v1 · ¸ 3 1 1 v3 v1 i v2 ρ : R B M2(C);  v2  v ρ(v): + − . ⊃ → = 7→ = 2 v1 i v2 1 v3 v3 + − The transformation ρ is clearly a bijection between B and D, and in fact, by [43, Theorem 3.4], much more is true.

THEOREM 51 (S. Kim). The Bloch parametrization ρ :(B, ) ⊕ → (D, ); v ρ(v) is an isomorphism. ¯ 7→ Let us now consider a structure which is similar to the space of 2 2 × regular density matrices equipped with the normalized sequential prod- 1 uct. Namely, Let P2 be the set of all 2 2 positive definite matrices with × 1 determinant 1. The sequential product on P2 is defined as 1 1 1 1 1 : P P P ;(A,B) A B : A 2 BA 2 . 2 × 2 → 2 7→ = We show that (D, ) and (P1, ) are isomorphic structures. ¯ 2 1 PROPOSITION 52. The map τ :(D, ) (P1, ); A τ(A): A is ¯ → 2 7→ = pDetA an isomorphism.

PROOF. To prove the injectivity assume that τ(A) τ(B) for some = A,B D. That is, 1 A 1 B, which means that A is a positive scalar ∈ pDetA = pDetB multiple of B. By Tr A TrB 1 we can deduce that pDetA pDetB and = = = therefore A B. = For any A P1 we have 1 A D. By DetA 1 it follows that ∈ 2 Tr A ∈ = µ 1 ¶ 1 1 1 1 τ A A A A. Tr A = q ¡ 1 ¢ Tr A = pDetA Tr A = Det Tr A A Tr A This shows the surjectivity of τ. Finally, we need to show that τ respects the operations , . Using ¯ the properties of the determinant, for any A,B D we compute ∈ 1 1 1 1 1 A 2 BA 2 Tr AB A 2 BA 2 τ(A B) r r ¯ = ³ 1 1 ´ Tr AB = ³ 1 1 ´ Tr AB 1 2 2 2 2 Det Tr AB A BA Det A BA 3. AN APPLICATION: THE ENDOMORPHISMS OF THE EINSTEIN GYROGROUP 81

1 1 µ A ¶ 2 B µ A ¶ 2 A B τ(A) τ(B). = pDetA pDetB pDetA = pDetA pDetB = This completes the proof. ä 1 1 Observe that the inverse of τ is given by τ− (A) A, A P . = Tr A ∈ 2 3.3. Proof of the main result. Using the result of Theorem 47, the 1 continuous sequential endomorphisms of P2 can be described as follows.

PROPOSITION 53. Let φ : P1 P1 be a continuous endomorphism with 2 → 2 respect to the operation meaning that φ satisfies φ(A B) φ(A) φ(B), A,B P1. = ∈ 2 Then φ is of one of the following forms: (1) there is a unitary matrix U M (C) such that ∈ 2 1 φ(A) U AU ∗, A P ; = ∈ 2 (2) there is a unitary matrix V M (C) such that ∈ 2 1 1 φ(A) VA− V ∗, A P ; = ∈ 2 (3) we have φ(A) I, A P1. = ∈ 2 1 1 PROOF. If φ : P2 P2 is a sequential endomorphism, then it is a Jor- → 2 dan triple endomorphism, as well. Indeed, φ(A ) φ(A A) φ(A) 2 1 = =2 φ(A) φ(A) holds for all A P2. It follows that φ(ABA) φ(A B) = 1 ∈ 1 = = φ(A2) φ(B) ¡φ(A)2¢ 2 φ(B)¡φ(A)2¢ 2 φ(A)φ(B)φ(A) for all A,B P1. = = ∈ 2 The map µ A ¶ ψ : P2 P2; A ψ(A): pDetA φ → 7→ = · pDetA is clearly a continuous Jordan triple endomorphism of P2 which extends φ (the idea of the definition of ψ comes from [65, proof of Theorem 3]). Now, the statement is an immediate consequence of the previous theo- rem. ä Using the isomorphism τ defined in Proposition 52 which is clearly a homeomorphism, too, we can pull back the structural result on the con- tinuous endomorphisms of (P1, ) to (D, ). Namely, the continuous en- 2 ¯ domorphism of (D, ) are exactly the maps of the form ¯ 1 τ− φ τ, ◦ ◦ 1 where φ is a continuous endomorphism of (P2,). The following corol- lary can be verified by straightforward computations. 82 3. PRESERVER PROBLEMS

COROLLARY 54. Let α : D D be a continuous endomorphism with → respect to the operation . Then α is of one of the following forms: ¯ (1) there is a unitary matrix U M (C) such that ∈ 2 α(A) U AU ∗, A D; = ∈ (2) there is a unitary matrix V M (C) such that ∈ 2 VA 1V − ∗ D α(A) 1 , A ; = Tr A− ∈ (3) we have α(A) I/2, A D. = ∈ Putting all information we have together, the proof of the main result is now easy.

PROOF. We have learned from the result Theorem 51 due to Kim that the Bloch parametrization ρ is an isomorphism between (B, ) and (D, ). ⊕ ¯ Clearly, ρ is a homeomorphism, too. Therefore, the continuous endo- 1 morphisms of (B, ) are exactly the maps of the form β ρ− α ρ, where ⊕ = ◦ ◦ α is a continuous endomorphism of (D, ). ¯ By Corollary 54 there are three possibilities. Assume first that we have 0 a unitary U M such that α(A) U AU ∗, A D. Denote by H (C) the ∈ 2 = ∈ 2 linear space of all traceless self-adjoint 2 2 complex matrices and equip × this space with the inner product A,B : 1 Tr AB, A,B H0(C). Define 〈 〉 = 2 ∈ 2   v1 · ¸ 3 0 v3 v1 i v2 γ : R H2(C);  v2  v γ(v): − . → = 7→ = v1 i v2 v3 v3 + − 3 0 Clearly, γ is a linear isomorphism from R onto H2(C) which preserves 0 0 0 the inner product. Define α˜ : H2(C) H2(C) by α˜(A) U AU ∗, A H2(C). 1 → = ∈ 3 Then O : γ− α˜ γ is an orthogonal linear transformation on R and = ◦ ◦ using the relation γ(v) 2ρ(v) I, v B, we easily deduce that ρ α ρ O = − ∈ ◦ ◦ = holds. 1 VA− V ∗ If α(A) 1 , A D, then the conclusion follows from the previous Tr A− = ∈ 1 (ρ(v))− case. The only thing we have to observe is that 1 ρ( v) holds Tr(ρ(v))− = − which follows from [43, Remark 3.5]. 1 Finally, if α(A) I/2, then we clearly have ρ− α ρ 0. = ◦ ◦ = The converse statement that the formulas in (i) and (ii) define contin- uous endomorphisms of the Einstein gyrogroup is just obvious. ä CHAPTER 4

Summary

In this thesis we investigated noncommutative variances, certain generalized quantum entropies and relative entropies, and we presented the solutions of some related preserver problems. In the mathematical model of quantum systems, the states are de- scribed by positive trace-class operators of trace one. These operators are called stochastic operators or — in the finite dimensional case — density matrices. These objects are non-commutative generalizations of the clas- sical probability distributions. The quantities that play important roles in the classical probability theory — variance, entropies — can be defined in this non-commutative environment, as well. In the first part of the thesis, we gave an interesting characterization of the decomposability of certain variances. One of the most important quantities in the classical probability the- ory is the Shannon entropy. The non-commutative generalization is the von Neumann entropy, which has a well-known one-parameter exten- sion called Tsallis entropy. The strong subadditivity of the von Neumann entropy is a celebrated result in quantum information theory. In the the- sis, we showed that the Tsallis entropy is not strongly subadditive and we derived further relevant inequalities concerning the Tsallis entropy and relative entropy. The Bregman divergences are similar to the relative entropies in the sense that they measure the dissimilarity between density matrices, or more generally, positive matrices. In general, these quantities are not symmetric and they do not satisfy the triangle-inequality. We gave a nec- essary and sufficient condition for the joint convexity of the Bregman di- vergences concerning the generating functions of them. Using this char- acterization, we derived a sharp inequality for the Tsallis entropy, which is a generalization of the strong subadditivity of the von Neumann en- tropy. Furthermore, we determined the transformations of the cone of pos- itive matrices which preserve the Bregman divergences and the Jensen divergences. We considered other preserver problems concerning the

83 84 4. SUMMARY cone of positive matrices, as well. The description of the transforma- tions of positive definite operators acting on a two-dimensional Hilbert space which preserve the Jordan triple product was an open problem for a while. In the thesis, we gave the solution of this problem. As an application, we determined the algebraic endomorphisms of the three- dimensional Einstein gyropgroup.

Ackowledgement I would like to express my deep gratitude to my supervisor, Professor Dénes Petz, who introduced me to the topic of matrix analysis and pro- vided me several interesting questions. Moreover, we managed to solve some of these questions together. I am also very grateful to Professor Lajos Molnár for so many reasons that I do not have enough space to enumerate all of them. Let me men- tion the most important ones. He proposed me and my classmates a very interesting open problem while giving us an excellent intensive course at the BUTE as an invited guest professor form the University of Debre- cen. Afther a while we solved with Lajos the aforementioned question and that was the starting point of our flourishing collaboration. Among others, quite the half of my thesis is based on common works with Lajos. I am grateful to József Pitrik for providing interesting problems and for the great common work experience and to Professor Miklós Horváth for his support during my PhD studies. I would like to thank the members of my family — in the order of appearance, — my mother, Katalin Erd˝osi, my father, Attila Virosztek, my brother, Tamás Virosztek, my wife, Anna Gelniczky and my son, Márton Virosztek for their continuous support. It is a pleasure to thank Anna also for her patience and tolerance which properties turned out to be really useful in the periods when I was trying to do something in mathematics. I am also grateful to Márton Vi- rosztek for his valuable comments and remarks on the thesis. During the PhD period, my research was supported by the “Lendület” Program (LP2012-46/2012) of the Hungarian Academy of Sciences, by a grant of the Hungarian Scientific Research Fund (OTKA Reg. No. K104206) and by the “For the Young Talents of the Nation” scholarship program (NTP-EFÖ-P-15-0481) of the Hungarian State. Bibliography

[1] T. Abe, Gyrometric preserving maps on Einstein gyrogroups, Möbius gyrogroups and proper velocity gyrogroups, Nonlinear Functional Analysis and Applications 19 (2014), 1-17. [2] J. Aczél and Z. Daróczy, On Measures of Information and Their Characterizations, Academic Press, San Diego, 1975. [3] T. Ando and F.Hiai, Operator log-convex functions and operator means, Mathema- tische Annalen 350(3)(2011), 611-630. [4] H. Araki, Relative entropy of state of von Neumann algebras, Publ. RIMS Kyoto Univ. 9(1976), 809 - 833. [5] K. M. R. Audenaert, Subadditivity of q-entropies for q 1, J. Math. Phys. 48(2007), > 083507. [6] A. Banerjee et al., Clustering with Bregman divergences, J. Mach. Learn. Res. 6(2005), 1705-1749. [7] H. Bauschke and J. Borwein, Joint and separate convexity of the Bregman distance, Inherently Parallel Algorithms in Feasibility and Optimization and their Applica- tions (Haifa 2000), D. Butnariu, Y. Censor, S. Reich (editors), Elsevier, pp. 23-36, 2001. [8] Á. Besenyei and D. Petz, Partial subadditivity of entropies, Linear Algebra and its Applications, 439(2013), 3297 - 3305. [9] R. Bhatia, Matrix Analysis, Springer-Verlag, New York, 1996. [10] R. Bhatia, Positive Definite Matrices, Princeton Univ. Press, Princeton, 2007. [11] L. M. Bregman, The relaxation method of finding the common points of convex sets and its application to the solution of problems in convex programming, USSR Com- putational Mathematics and Mathematical Physics 7(3)(1967), 200-217. [12] P. Busch, P.J. Lahti and P. Mittelstaedt, The Quantum Theory of Measurement, Springer-Verlag, 1991. [13] E. Carlen, Trace inequalities and quantum entropy: an introductory course, Con- temp. Math. 529 (2010), 73-140. [14] G. Cassinelli, E. De Vito, P.J. Lahti and A. Levrero, The Theory of Symmetry Actions in Quantum Mechanics, Lecture Notes in Physics 654, Springer, 2004. [15] Z. Chebbi and M. Moakher, Means of hermitian positive-definite matrices based on the log-determinant α-divergence function, Linear Algebra Appl. 436 (2012), 1872– 1889. [16] R. Y. Chen and J. A. Tropp, Subadditivity of matrix ϕ-entropy and concentration of random matrices, Electron. J. Probab. 19(2014), 1-30. [17] Z. Daróczi, General information functions, Information and Control 16(1970), 36 - 51. [18] G. Dolinar and L. Molnár, Sequential endomorphisms of finite dimensional Hilbert space effect algebras, J. Phys. A: Math. Theor. 45 (2012), 065207.

85 86 BIBLIOGRAPHY

[19] E. Effros, A matrix convexity approach to some celebrated quantum inequalities, Proc. Natl. Acad. Sci. USA, 106(2009), 1006 - 1008. [20] A. Einstein, Einstein’s Miraculous Years: Five Papers That Changed the Face of Physics, Princeton University, Princeton, NJ, 1998. [21] W. Feller, An Introduction to Probability Theory and its Applications I, Oxford, Eng- land: Wiley, 1950. [22] S. Furuichi, Information theoretical properties of Tsallis entropies, J. Math. Phys. 47, 023302 (2006) [23] S. Furuichi, K. Yanagi and K. Kuriyama, Fundamental properties of Tsallis relative entropy, J. Math.Phys. 45(2004), 4868-4877. [24] P.Gibilisco, F.Hiai and D. Petz, Quantum covariance, quantum Fisher information and the uncertainty principle, IEEE Trans. Inform. Theory 55(2009), 439-443. [25] S. Gudder and R. Greechie, Sequential products on effect algebras, Rep. Math. Phys. 49 (2002), 87–111. [26] S. Gudder and R. Greechie, Uniqueness and order in sequential effect algebras, Int. J. Theor. Phys. 44 (2005), 755–770. [27] S. Gudder and G. Nagy, Sequential quantum measurements, J. Math. Phys. 42 (2001), 5212–5222. [28] F.Hansen, Extensions of Lieb’s Concavity Theorem, J. Stat. Phys. 124(2006), 87-101. [29] F.Hansen, Trace functions as Laplace transforms, J. Math. Phys. 47, 043504 (2006). [30] F. Hansen and Z. Zhang, Characterization of matrix entropies, arXiv:1402:2118v2, 20 Mar., 2014. [31] F. Hansen and Z. Zhang, Characterization of matrix entropies, arXiv:1402:2118v3, 16 Mar., 2015. [32] G. H. Hardy, J. E. Littlewood and G. Pólya, Inequalities, Cambridge University Press, Cambridge, 1934. [33] O. Hatori and L. Molnár, Isometries of the unitary groups and Thompson isometries of the spaces of invertible positive elements in C ∗-algebras, J. Math. Anal. Appl. 409 (2014), 158–167. [34] F.Hiai and D. Petz, From quasi-entropy to various quantum information quantities, Publ. RIMS Kyoto University 48(2012), 525 - 542. [35] F. Hiai and D. Petz, Introduction to Matrix Analysis and Applications, Hindustan Book Agency and Springer Verlag, 2014. [36] A. S. Holevo, Probabilistic and Statistical Aspects of Quantum Theory, North- Holland, Amsterdam, 1982. [37] R.A. Horn and C.R. Johnson, Matrix Analysis, Cambridge University Press, 1990. [38] F. Itakura and S. Saito, Analysis synthesis telephony based on the maximum likeli- hood method, in 6th Int. Congr. Acoustics, Tokyo, Japan., pp. C-17-C-20 (1968) [39] A. Járai, Regularity Properties of Functional Equations in Several Variables, Ad- vances in Mathematics (Springer), Springer, New York, 2005. [40] A. Jencová and M. B. Ruskai, A unified treatment of convexity of relative entropy and related trace functions, with conditions for equality, Rev. Math. Phys. 22(2010), 1099 - 1121. [41] R. V. Kadison and J. R. Ringrose, Fundamentals of the Theory of Operator Algebras, Volumes I and II, Academic Press, Orlando, 1983 and 1986. [42] I. H. Kim, Operator extension of strong subadditivity of entropy, J. Math. Phys. 53(2012), 122204. BIBLIOGRAPHY 87

[43] S. Kim, Distances of qubit density matrices on Bloch sphere, J. Math. Phys. 52, 102303 (2011). [44] A. N. Kolmogorov, Grundbegriffe der Wahrscheinlichkeitsrechnung, Springer Verlag, Berlin, 1933; English translation: Foundations of the Theory of Probability, Chelsea Publishing Co., New York, 1956. [45] S. Kullback and R.A: Leibler, On information and sufficiency, Ann. Math. Statist. 22(1)(1951), 79 - 86. [46] Z. Léka and D. Petz, Some decompositions of matrix variances, Probability and Mathematical Statistics, 33(2013), 191-199. [47] A. Lesniewski, M. B. Ruskai, Monotone Riemannian metrics and relative entropy on non-commutative probability spaces, J. Math. Phys. 40(1999), 5702-5724. [48] M. Lewin and J. Sabin, A family of monotone quantum relative entropies, Lett. Math. Phys. 104(2014) 691-705. [49] E. Lieb and M. B. Ruskai, Proof of the strong subadditivity of quantum-mechanical entropy, J. Math. Phys. 14(1973), 1938-1941. [50] E. H. Lieb, M. B. Ruskai, Some operator inequalities of the Schwarz type, Adv. in Math. 12(1974), 269-273. [51] G. Linblad, Expectations and entropy inequalities, Commun. Math. Phys. 39(1974), 111-119. [52] P.C.Mahalonobis, On the generalized distance in statistics, Proceedings of National Institute of Science of India, 12(1936), 49 - 55. [53] L. Molnár, A remark to the Kochen-Specker theorem and some characterizations of the determinant on sets of Hermitian matrices, Proc. Amer. Math. Soc. 134 (2006), 2839-2848. [54] L. Molnár, General Mazur-Ulam type theorems and some applications, in Oper- ator Semigroups Meet Complex Analysis, Harmonic Analysis and Mathematical Physics, W. Arendt, R. Chill, Y. Tomilov (Eds.), Operator Theory: Advances and Ap- plications, Vol. 250, pp. 311-342, Birkhäuser, 2015. [55] L. Molnár, Jordan triple endomorphisms and isometries of spaces of positive definite matrices, Linear Multilinear Alg. 63 (2015), 12–33. [56] L. Molnár, Maps on states preserving the relative entropy J. Math. Phys. 49, 032114 (2008). [57] L. Molnár, Order-automorphisms of the set of bounded observables, J. Math. Phys. 42 (2001), 5904-5909. [58] L. Molnár, Order automorphisms on positive definite operators and a few applica- tions, Linear Algebra Appl. 434 (2011), 2158-2169. [59] L. Molnár, Preservers on Hilbert space effects, Linear Algebra Appl. 370 (2003), 287– 300. [60] L. Molnár, Selected Preserver Problems on Algebraic Structures of Linear Operators and on Function Spaces, Lecture Notes in Mathematics, Vol. 1895, p. 236, Springer, 2007. [61] L. Molnár, Sequential isomorphisms between the sets of von Neumann algebra ef- fects, Acta Sci. Math. (Szeged) 69 (2003), 755–772. [62] L. Molnár and G. Nagy, Transformations on density operators that leave the Holevo bound invariant, Int. J. Theor. Phys. 53 (2014), 3273–3278. [63] L. Molnár, G. Nagy and P. Szokol, Maps on density operators preserving quantum f -divergences, Quantum Inf. Process. 12 (2013), 2309-2323. 88 BIBLIOGRAPHY

[64] L. Molnár, J. Pitrik and D. Virosztek, Maps on positive definite matrices preserving Bregman and Jensen divergences, Linear Algebra Appl. 495 (2016), 174–189. [65] L. Molnár and P. Szokol, Transformations on positive definite matrices preserving generalized distance measures, Linear Algebra Appl. 466 (2015), 141–159. [66] L. Molnár and D. Virosztek, Continuous Jordan triple endomorphisms of P2, J. Math. Anal. Appl. 438(2) (2016), 828-839. [67] L. Molnár and D. Virosztek, On algebraic endomorphisms of the Einstein gyrogroup, J. Math. Phys. 56, 082302 (2015). [68] F. J. Murray and J. von Neumann, On rings of operators, Annals of Mathematics 37 (1936), 116-229. [69] F.Nielsen and R. Bhatia (Eds), Matrix Information Geometry, Springer, Heidelberg, 2013. [70] M. Nielsen and D. Petz, A simple proof of the strong subadditivity inequality, Quan- tum Information & Computation, 6(2005), 507 - 513. [71] M. Ohya and D. Petz, Quantum Entropy and Its Use, Springer-Verlag, Heidelberg, 1993. Second edition 2004. [72] D. Petz, Bregman divergence as relative operator entropy, Acta Math. Hungar, 116(2007), 127-131. [73] D. Petz, Covariance and Fisher information in quantum mechanics, J. Phys. A, Math. Gen. 35(2003), 79-91. [74] D. Petz, Quantum Information Theory and Quantum Statistics, Springer-Verlag, Heidelberg, 2008. [75] D. Petz, Quasi-entropies for finite quantum systems, Rep. Math. Phys. 23(1986), 57 - 65. [76] D. Petz, Quasi-entropies for states of a von Neumann algebra, Publ. RIMS. Kyoto Univ. 21 (1985), 781–800. [77] D. Petz and G. Tóth, Matrix variances with projections, Acta Sci. Math. (Szeged), 78(2012), 683–688. [78] D. Petz and D. Virosztek, A characterization theorem for matrix variances, Acta Sci. Math. (Szeged) 80 (2014), 681-687. [79] D. Petz and D. Virosztek, Some inequalities for quantum Tsallis entropy related to the strong subadditivity, Math. Inequal. Appl. 18(2)(2015), 555-568. [80] J. Pitrik and D. Virosztek, On the joint convexity of the Bregman divergence of ma- trices, Lett. Math. Phys. 105 (2015), 675-692. [81] M. Rédei and S.J. Summers, Quantum probability theory, Studies in the History and Philosophy of Modern Physics 38 (2007), 390-417. [82] R.T. Rockafellar, Convex Analysis, Princeton University Press, Princeton, 1970. [83] N. Sharma, Equality conditions for quantum quasi-entropies under monotonicity and joint-convexity, Nat. Conf. Commun. (NCC), 2014. [84] B. Simon, Representations of Finite and Compact Groups, Graduate Studies in Math., 10, American Mathematical Society, Providence, R.I., 1996. [85] S. Sra, Positive definite matrices and the S-divergence, preprint, arXiv math.FA- 1110.1773v4. Shorter version of the manuscript is submitted to Proc. Amer. Math. Soc. [86] J. A. Tropp, From joint convexity of quantum relative entropy to a concavity theorem of Lieb, Proc. Amer. Math. Soc. 140(2012), 1757-1760. [87] A. Uhlmann, Roofs and convexity, Entropy, 12(7) (2010), 1799-1832. BIBLIOGRAPHY 89

[88] A.A. Ungar, Analytic Hyperbolic Geometry and Albert Einstein’s Special Theory of Relativity, World Scientific, Singapore, 2008.