arXiv:1711.04427v2 [cs.CC] 5 Jun 2018

GROTHENDIECK CONSTANT IS NORM OF STRASSEN MATRIX MULTIPLICATION TENSOR

JINJIE ZHANG, SHMUEL FRIEDLAND, AND LEK-HENG LIM

Abstract. We show that two important quantities from two disparate areas of complexity theory, Strassen's exponent of matrix multiplication $\omega$ and Grothendieck's constant $K_G$, are intimately related. They are different measures of size for the same underlying object: the matrix multiplication tensor, i.e., the 3-tensor or bilinear operator
\[ \mu_{l,m,n} : \mathbb{F}^{l \times m} \times \mathbb{F}^{m \times n} \to \mathbb{F}^{l \times n}, \quad (A, B) \mapsto AB, \]
defined by matrix-matrix product over $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$. It is well-known that Strassen's exponent of matrix multiplication is the greatest lower bound on (the log of) a tensor rank of $\mu_{l,m,n}$. We will show that Grothendieck's constant is the least upper bound on a tensor norm of $\mu_{l,m,n}$, taken over all $l, m, n \in \mathbb{N}$. Aside from relating the two celebrated quantities, this insight allows us to rewrite Grothendieck's inequality as a norm inequality
\[ \|\mu_{l,m,n}\|_{1,2,\infty} = \max_{X,Y,M \ne 0} \frac{|\operatorname{tr}(XMY)|}{\|X\|_{1,2} \|Y\|_{2,\infty} \|M\|_{\infty,1}} \le K_G. \]
We prove that Grothendieck's inequality is unique: If we generalize the $(1,2,\infty)$-norm to arbitrary $p, q, r \in [1, \infty]$,
\[ \|\mu_{l,m,n}\|_{p,q,r} = \max_{X,Y,M \ne 0} \frac{|\operatorname{tr}(XMY)|}{\|X\|_{p,q} \|Y\|_{q,r} \|M\|_{r,p}}, \]
then $(p,q,r) = (1,2,\infty)$ is, up to cyclic permutations, the only choice for which $\|\mu_{l,m,n}\|_{p,q,r}$ is uniformly bounded by a constant independent of $l, m, n$.

2010 Mathematics Subject Classification. 15A60, 46B28, 46B85, 47A07, 65Y20, 68Q17, 68Q25.

Key words and phrases. Grothendieck's constant, Grothendieck's inequality, fast matrix multiplication, Strassen's multiplication tensor, tensor norms, tensor rank.

This article provides the details for Slide 16 in http://www.ub.edu/focm2017/slides/Lim.pdf, presented during the Smale Prize Lecture of the FoCM 2017 conference.

1. Introduction

Grothendieck's inequality was originally established to relate fundamental norms on tensor product spaces [21]. Throughout this article, we will let $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$. The Grothendieck constant $K_G^{\mathbb{F}}$ is the sharp constant such that for every $l, m, n \in \mathbb{N}$ and every matrix $M = (M_{ij}) \in \mathbb{F}^{m \times n}$,
\[ \max_{\|x_i\| = \|y_j\| = 1} \Bigl| \sum_{i=1}^{m} \sum_{j=1}^{n} M_{ij} \langle x_i, y_j \rangle \Bigr| \le K_G^{\mathbb{F}} \max_{|\varepsilon_i| = |\delta_j| = 1} \Bigl| \sum_{i=1}^{m} \sum_{j=1}^{n} M_{ij} \varepsilon_i \delta_j \Bigr| \tag{1} \]
where the maximum on the left is taken over all $x_i, y_j \in \mathbb{F}^l$ of unit 2-norm, and the maximum on the right is taken over all $\varepsilon_i, \delta_j \in \mathbb{F}$ of unit absolute value (so over $\mathbb{R}$, $\varepsilon_i = \pm 1$, $\delta_j = \pm 1$; over $\mathbb{C}$, $\varepsilon_i = e^{i\theta_i}$, $\delta_j = e^{i\phi_j}$). The value on the left side of (1) is the same for all $l \ge m + n$ and as such some authors restrict themselves to $l = m + n$.

The existence of such a constant independent of $l, m, n$ was discovered by Alexandre Grothendieck in 1953. Alternative proofs via factorization of linear operators, geometry of Banach spaces, absolutely $p$-summing operators, etc, may be found in [40, 28, 38, 41] and references therein. In particular, the formulation in (1) was due to Lindenstrauss and Pełczyński [38].

The inequality has found applications in numerous areas, including Banach space theory, $C^*$-algebra, harmonic analysis, operator theory, quantum mechanics, and most recently, computer science. In theoretical computer science, Grothendieck's inequality has notably appeared in studies of unique games conjecture [29, 30, 31, 42, 43] and SDP relaxations of NP-hard combinatorial problems [2, 3, 4, 5, 11]. In quantum information theory, Grothendieck's inequality arises unexpectedly in Bell inequalities [17, 50, 24] and in XOR games [8, 9, 7], among several other areas; Grothendieck constants of specific orders, e.g., $K_G^{\mathbb{C}}(3)$ and $K_G^{\mathbb{C}}(4)$, also have important roles to play in quantum information theory [1, 25, 15]. The inequality has even been applied to some rather surprising areas, e.g., to communication complexity [39, 44, 45] and to privacy-preserving data analysis [16].

Although the Grothendieck constant appears in numerous mathematical statements and has many equivalent interpretations in physics and computer science, its exact value remains unknown and estimating increasingly sharper bounds for $K_G^{\mathbb{F}}$ has been a major undertaking. The current best known bounds are $K_G^{\mathbb{R}} \in [1.676, 1.782]$, established in [13] (lower) and [33] (upper); and $K_G^{\mathbb{C}} \in (1.338, 1.404]$, established in [14] (lower) and [22] (upper). A major recent breakthrough [6] established that Krivine's upper bound $\pi / \bigl(2 \log(1 + \sqrt{2})\bigr) \approx 1.782$ for $K_G^{\mathbb{R}}$ is not sharp. There have also been efforts in approximating Grothendieck's constants of specific orders, e.g., see [25, 15] for recent results on $K_G^{\mathbb{C}}(3)$ and $K_G^{\mathbb{C}}(4)$.

A world apart from the aforementioned areas touched by Grothendieck's inequality is the problem of complexity of matrix inversion, or equivalently, matrix multiplication, pioneered by Volker Strassen [48, 46, 49, 47]. A systematic study of this and other related problems has blossomed into what is now often called algebraic computational complexity [10]. For the uninitiated, Strassen famously discovered in [48] that the product of a pair of $2 \times 2$ matrices may be obtained with just seven multiplications:

\[
\begin{bmatrix} a_1 & a_2 \\ a_3 & a_4 \end{bmatrix}
\begin{bmatrix} b_1 & b_2 \\ b_3 & b_4 \end{bmatrix}
=
\begin{bmatrix}
a_1 b_1 + a_2 b_3 & \beta + \gamma + (a_1 + a_2 - a_3 - a_4) b_4 \\
\alpha + \gamma + a_4 (b_2 + b_3 - b_1 - b_4) & \alpha + \beta + \gamma
\end{bmatrix},
\]
where $\alpha = (a_3 - a_1)(b_2 - b_4)$, $\beta = (a_3 + a_4)(b_2 - b_1)$, $\gamma = a_1 b_1 + (a_3 + a_4 - a_1)(b_1 + b_4 - b_2)$. Applied recursively, this gives an algorithm for forming the product of a pair of $n \times n$ matrices with just $O(n^{\log_2 7})$ multiplications, as opposed to $O(n^3)$ using the usual formula for matrix-matrix product. In addition, Strassen also showed that: (i) the number of additions may be bounded by a constant times the number of multiplications; (ii) matrix inversion may be achieved with the same complexity as matrix multiplication. In short, if there is an algorithm that forms matrix product in $O(n^\omega)$ multiplications, then it yields an $O(n^\omega)$ algorithm that would solve $n$ linear equations in $n$ unknowns, which is by far the most ubiquitous problem in all of scientific and engineering computing. The smallest possible $\omega$ became known as the exponent of matrix multiplication.

Strassen's astounding discovery captured the interests of numerical analysts and theoretical computer scientists alike and the complexity was gradually lowered over the years. Some milestones include the Coppersmith–Winograd [12] bound $O(n^{2.375477})$ that resisted progress for more than two decades until Vassilevska-Williams's improvement [51] to $O(n^{2.3728642})$; the current record, due to Le Gall [36], is $O(n^{2.3728639})$. Strassen showed [47] that the best possible $\omega$ is in fact given by

\[ \omega = \inf_{n \in \mathbb{N}} \log_n \operatorname{rank}(\mu_{n,n,n}), \]
where $\mu_{n,n,n}$ is the Strassen matrix multiplication tensor, i.e., the 3-tensor in $(\mathbb{F}^{n \times n})^* \otimes (\mathbb{F}^{n \times n})^* \otimes \mathbb{F}^{n \times n}$ associated with matrix-matrix product, i.e., the bilinear operator
\[ \mathbb{F}^{n \times n} \times \mathbb{F}^{n \times n} \to \mathbb{F}^{n \times n}, \quad (A, B) \mapsto AB. \]
Those unfamiliar with multilinear algebra [35] may regard the 3-tensor $\mu_{n,n,n}$ and the bilinear operator as the same object. If we choose a basis on $\mathbb{F}^{n \times n}$ (or three different bases, one on each copy of $\mathbb{F}^{n \times n}$), then $\mu_{n,n,n}$ may be represented as a 3-dimensional hypermatrix in $\mathbb{F}^{n^2 \times n^2 \times n^2}$. Over any $\mathbb{F}$-vector spaces $U, V, W$, one may define tensor rank [26] for 3-tensors $\tau \in U \otimes V \otimes W$ by
\[ \operatorname{rank}(\tau) = \min \Bigl\{ r : \tau = \sum_{i=1}^{r} \lambda_i \, u_i \otimes v_i \otimes w_i \Bigr\}. \]
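As a quick sanity check on the seven-multiplication identity for $2 \times 2$ matrices quoted earlier, the following Python sketch evaluates it on random integer inputs and compares the result with the usual formula. It assumes the row-by-row labelling $A = [[a_1, a_2], [a_3, a_4]]$ and $B = [[b_1, b_2], [b_3, b_4]]$; the function names are illustrative and not from the article. That seven products suffice is one way to see that $\operatorname{rank}(\mu_{2,2,2}) \le 7$.

```python
import random


def seven_mult_product(A, B):
    """Product of two 2x2 matrices using seven scalar multiplications."""
    (a1, a2), (a3, a4) = A
    (b1, b2), (b3, b4) = B
    m1 = a1 * b1                               # multiplication 1
    m2 = a2 * b3                               # multiplication 2
    alpha = (a3 - a1) * (b2 - b4)              # multiplication 3
    beta = (a3 + a4) * (b2 - b1)               # multiplication 4
    m5 = (a3 + a4 - a1) * (b1 + b4 - b2)       # multiplication 5
    m6 = (a1 + a2 - a3 - a4) * b4              # multiplication 6
    m7 = a4 * (b2 + b3 - b1 - b4)              # multiplication 7
    gamma = m1 + m5                            # additions only, m1 is reused
    return [[m1 + m2, beta + gamma + m6],
            [alpha + gamma + m7, alpha + beta + gamma]]


def naive_product(A, B):
    """Usual 2x2 product with eight scalar multiplications."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def check(trials=1000, seed=0):
    """Compare both products on random integer matrices."""
    rng = random.Random(seed)
    for _ in range(trials):
        A = [[rng.randint(-9, 9) for _ in range(2)] for _ in range(2)]
        B = [[rng.randint(-9, 9) for _ in range(2)] for _ in range(2)]
        if seven_mult_product(A, B) != naive_product(A, B):
            return False
    return True
```

Integer inputs are used so that the comparison is exact; with floating-point entries the two results would agree only up to rounding.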

In fact, Strassen showed that the tensor rank of a 3-tensor $\mu_\beta \in U^* \otimes V^* \otimes W$ associated with a bilinear operator $\beta : U \times V \to W$ gives the least number of multiplications required to compute $\beta$. The value of $\omega$ is in general dependent on the choice of $\mathbb{F}$, as tensor rank is well-known to be field dependent [37].

What exactly is $\omega$? The above discussion shows that it is the sharp lower bound for the (log of the) tensor rank of the Strassen matrix multiplication tensor:
\[ \log_n \operatorname{rank}(\mu_{n,n,n}) \ge \omega \quad \text{for all } n \in \mathbb{N}. \tag{2} \]
What exactly is $K_G^{\mathbb{F}}$? We will show that it is the sharp upper bound for the tensor $(1,2,\infty)$-norm of the Strassen matrix multiplication tensor:
\[ \|\mu_{l,m,n}\|_{1,2,\infty} \le K_G^{\mathbb{F}} \quad \text{for all } l, m, n \in \mathbb{N}. \tag{3} \]
If we desire a greater parallel to (2), we may drop $l$ and $m$ in (3): there is no loss of generality in assuming that $l = 2n$ and $m = n$, i.e., $K_G^{\mathbb{F}}$ is also the sharp upper bound so that
\[ \|\mu_{2n,n,n}\|_{1,2,\infty} \le K_G^{\mathbb{F}} \quad \text{for all } n \in \mathbb{N}. \]
In addition, the Grothendieck constant of order $l \in \mathbb{N}$, a popular notion in quantum information theory (e.g., [1, 15, 25]), is given by a simple variation, namely, the sharp upper bound $K_G^{\mathbb{F}}(l)$ in
\[ \|\mu_{l,m,n}\|_{1,2,\infty} \le K_G^{\mathbb{F}}(l) \quad \text{for all } m, n \in \mathbb{N}. \]
We will define the $(1,2,\infty)$-norm for an arbitrary 3-tensor formally in Section 4 but at this point it suffices to know its value for $\mu_{l,m,n}$, namely,
\[ \|\mu_{l,m,n}\|_{1,2,\infty} = \max_{X,Y,M \ne 0} \frac{|\operatorname{tr}(XMY)|}{\|X\|_{1,2} \|Y\|_{2,\infty} \|M\|_{\infty,1}}, \]
where $X \in \mathbb{F}^{l \times m}$, $M \in \mathbb{F}^{m \times n}$, $Y \in \mathbb{F}^{n \times l}$, and $\|M\|_{p,q} := \max_{x \ne 0} \|Mx\|_q / \|x\|_p$ denotes the matrix $(p,q)$-norm.

The inequality (3) is in fact just Grothendieck's inequality. The characterizations of $\omega$ and $K_G$ in (2) and (3) hold over both $\mathbb{R}$ and $\mathbb{C}$ although their values are field dependent. Incidentally the fact that Grothendieck's constant is essentially a tensor norm immediately explains why it is field dependent: as is the case for tensor rank, tensor norms are also field dependent [19].
An advantage of the formulation in (3) is that we obtain a natural family of $(p,q,r)$-norms on $\mu_{l,m,n}$ given by
\[ \|\mu_{l,m,n}\|_{p,q,r} := \max_{X,Y,M \ne 0} \frac{|\operatorname{tr}(XMY)|}{\|X\|_{p,q} \|Y\|_{q,r} \|M\|_{r,p}} \]
for any triple $1 \le p, q, r \le \infty$. This family of norms will serve as a platform for us to better comprehend Grothendieck's constant and Grothendieck's inequality. The study of the general $(p,q,r)$-case shows why the $(1,2,\infty)$-case is extraordinary. We will deduce a generalization of Grothendieck's inequality and show that the case $(p,q,r) = (1,2,\infty)$, i.e., Grothendieck's inequality, is the only one up to trivial cyclic permutations¹ where there is a universal upper bound, i.e., Grothendieck's constant, that holds for all $l, m, n \in \mathbb{N}$.

Theorem 1.1 (Grothendieck–Hölder inequality). Let $1 \le p, q, r \le \infty$ and $l, m, n \in \mathbb{N}$. Then
\[ \frac{1}{l^{|1/q - 1/2|} \cdot m^{|1/p - 1/2|} \cdot n^{|1/r - 1/2|}} \le \|\mu_{l,m,n}\|_{p,q,r} \le K_G^{\mathbb{F}} \cdot l^{|1/q - 1/2|} \cdot m^{1 - 1/p} \cdot n^{1/r}. \]
In particular, when $p = 1$, $q = 2$, and $r = \infty$, the upper bound gives Grothendieck's inequality (1).

Theorem 1.2 (Uniqueness of Grothendieck's inequality). Let $1 \le p, q, r \le \infty$ and $l, m, n \in \mathbb{N}$. Then $\|\mu_{l,m,n}\|_{p,q,r}$ is uniformly bounded for all $l, m, n \in \mathbb{N}$ if and only if
\[ (p, q, r) \in \{ (1, 2, \infty), (\infty, 1, 2), (2, \infty, 1) \}. \]

¹Unavoidable as $(p,q,r)$-norms are clearly invariant under cyclic permutations of $p, q, r$. See Lemma 4.1(i).

Theorem 1.1 follows from Theorems 4.2 and 4.3. Theorem 1.2 is just Theorem 5.2.
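To make the exponents in Theorem 1.1 concrete, the following Python sketch evaluates them in exact rational arithmetic. The helper names are ours and not from the article, and $\infty$ is represented by `float("inf")` purely as a convention of this sketch.

```python
from fractions import Fraction

INF = float("inf")
HALF = Fraction(1, 2)


def inv(p):
    """Return 1/p as an exact fraction, with the convention 1/inf = 0."""
    return Fraction(0) if p == INF else 1 / Fraction(p)


def upper_exponents(p, q, r):
    """Exponents (a, b, c) in the upper bound K_G * l^a * m^b * n^c."""
    return (abs(inv(q) - HALF), 1 - inv(p), inv(r))


def lower_exponents(p, q, r):
    """Exponents (a, b, c) in the lower bound 1 / (l^a * m^b * n^c)."""
    return (abs(inv(q) - HALF), abs(inv(p) - HALF), abs(inv(r) - HALF))


def upper_bound_is_dimension_free(p, q, r):
    """True when the stated upper bound does not grow with l, m, n."""
    return all(e == 0 for e in upper_exponents(p, q, r))
```

For instance, `upper_exponents(1, 2, INF)` is `(0, 0, 0)`, so the upper bound reduces to $K_G^{\mathbb{F}}$, whereas `upper_exponents(2, 2, 2)` has exponent $1/2$ on both $m$ and $n$. The stated upper bound is dimension-free only at $(1, 2, \infty)$; the other two triples in Theorem 1.2 are reached through cyclic invariance rather than through this formula directly.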

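2. Strassen matrix multiplication tensor

An important observation for us, obvious to anyone familiar with tensors [34, 35, 37] but perhaps less so to those accustomed to regarding (erroneously) a tensor as a "multiway array," is that the bilinear operator
\[ \beta : \mathbb{F}^{l \times m} \times \mathbb{F}^{m \times n} \to \mathbb{F}^{l \times n}, \quad (X, Y) \mapsto XY, \tag{4} \]
and the trilinear functional
\[ \tau : \mathbb{F}^{l \times m} \times \mathbb{F}^{m \times n} \times \mathbb{F}^{n \times l} \to \mathbb{F}, \quad (X, Y, Z) \mapsto \operatorname{tr}(XYZ), \tag{5} \]
are given by² the same 3-tensor in
\[ (\mathbb{F}^{l \times m})^* \otimes (\mathbb{F}^{m \times n})^* \otimes \mathbb{F}^{l \times n} \cong (\mathbb{F}^{l \times m})^* \otimes (\mathbb{F}^{m \times n})^* \otimes (\mathbb{F}^{n \times l})^*. \]
In other words, as 3-tensors, there is no difference between the product of two matrices and the trace of product of three matrices.

To see this, let $E_{ij} \in \mathbb{F}^{m \times n}$ denote the matrix with 1 in its $(i,j)$th entry and zeros everywhere else, so that $\{E_{ij} : i = 1, \dots, m;\ j = 1, \dots, n\}$ is the standard basis for $\mathbb{F}^{m \times n}$. Its dual basis for the dual space of linear functionals
\[ (\mathbb{F}^{m \times n})^* := \{ \varphi : \mathbb{F}^{m \times n} \to \mathbb{F} : \varphi(\alpha X + \beta Y) = \alpha \varphi(X) + \beta \varphi(Y) \} \]
is then given by $\{\varepsilon_{ij} : i = 1, \dots, m;\ j = 1, \dots, n\}$ where $\varepsilon_{ij} : \mathbb{F}^{m \times n} \to \mathbb{F}$, $X \mapsto x_{ij}$, is the linear functional that takes an $m \times n$ matrix to its $(i,j)$th entry. Now choose the standard inner product on $\mathbb{F}^{m \times n}$, i.e., $\langle X, Y \rangle = \operatorname{tr}(X^{\mathsf{T}} Y)$. Then $\varepsilon_{ij}(X) = \langle E_{ij}, X \rangle$ for all $X \in \mathbb{F}^{m \times n}$, which allows us to identify $(\mathbb{F}^{m \times n})^*$ with $\mathbb{F}^{n \times m}$ and the linear functional $\varepsilon_{ij} \in (\mathbb{F}^{m \times n})^*$ with the matrix $E_{ji} \in \mathbb{F}^{n \times m}$.

It remains to observe that the usual formula for matrix-matrix product gives
\[
\begin{aligned}
\beta(X, Y) &= \sum_{i,k=1}^{l,n} \Bigl( \sum_{j=1}^{m} x_{ij} y_{jk} \Bigr) E_{ik} = \sum_{i,k=1}^{l,n} \Bigl( \sum_{j=1}^{m} \varepsilon_{ij}(X) \varepsilon_{jk}(Y) \Bigr) E_{ik} \\
&= \sum_{i,k=1}^{l,n} \Bigl( \sum_{j=1}^{m} (\varepsilon_{ij} \otimes \varepsilon_{jk})(X, Y) \Bigr) E_{ik} = \Bigl( \sum_{i,j,k=1}^{l,m,n} \varepsilon_{ij} \otimes \varepsilon_{jk} \otimes E_{ik} \Bigr)(X, Y),
\end{aligned}
\]
and thus
\[ \beta = \sum_{i,j,k=1}^{l,m,n} \varepsilon_{ij} \otimes \varepsilon_{jk} \otimes E_{ik} \in (\mathbb{F}^{l \times m})^* \otimes (\mathbb{F}^{m \times n})^* \otimes \mathbb{F}^{l \times n}. \tag{6} \]
A similar simple calculation,
\[
\begin{aligned}
\tau(X, Y, Z) &= \sum_{i,j,k=1}^{l,m,n} x_{ij} y_{jk} z_{ki} = \sum_{i,j,k=1}^{l,m,n} \varepsilon_{ij}(X) \varepsilon_{jk}(Y) \varepsilon_{ki}(Z) \\
&= \sum_{i,j,k=1}^{l,m,n} (\varepsilon_{ij} \otimes \varepsilon_{jk} \otimes \varepsilon_{ki})(X, Y, Z) = \Bigl( \sum_{i,j,k=1}^{l,m,n} \varepsilon_{ij} \otimes \varepsilon_{jk} \otimes \varepsilon_{ki} \Bigr)(X, Y, Z),
\end{aligned}
\]
gives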

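\[ \tau = \sum_{i,j,k=1}^{l,m,n} \varepsilon_{ij} \otimes \varepsilon_{jk} \otimes \varepsilon_{ki} \in (\mathbb{F}^{l \times m})^* \otimes (\mathbb{F}^{m \times n})^* \otimes (\mathbb{F}^{n \times l})^*. \tag{7} \]
By our identification, $(\mathbb{F}^{m \times n})^* = \mathbb{F}^{n \times m}$ and $\varepsilon_{ki} = E_{ik}$. So we see from (6) and (7) that indeed $\beta = \tau$ as 3-tensors. We denote this tensor by $\mu_{l,m,n}$. This has been variously called the Strassen matrix multiplication tensor or the structure tensor for matrix-matrix product [10, 34, 37, 52].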
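The identification of $\beta$ and $\tau$ in (6) and (7) can be illustrated numerically. The Python sketch below is an illustration under our own naming, not code from the article: it stores $\mu_{l,m,n}$ by the index triples at which its hypermatrix equals 1 (with respect to the standard bases), then contracts it either with three matrices to obtain $\operatorname{tr}(XYZ)$ or with two matrices to obtain $XY$.

```python
from itertools import product


def mu_support(l, m, n):
    """Index triples ((i, j), (j, k), (k, i)) where mu_{l,m,n} has entry 1."""
    return [((i, j), (j, k), (k, i))
            for i, j, k in product(range(l), range(m), range(n))]


def tau(X, Y, Z):
    """Trilinear functional: contract mu with X (l x m), Y (m x n), Z (n x l)."""
    l, m, n = len(X), len(Y), len(Z)
    return sum(X[a][b] * Y[c][d] * Z[e][f]
               for (a, b), (c, d), (e, f) in mu_support(l, m, n))


def beta(X, Y):
    """Bilinear operator: contract mu with X and Y, leaving the third slot free."""
    l, m, n = len(X), len(Y), len(Y[0])
    out = [[0] * n for _ in range(l)]
    for (a, b), (c, d), (e, f) in mu_support(l, m, n):
        # The functional eps_{ki} is identified with the matrix unit E_{ik},
        # so the third index pair (k, i) = (e, f) addresses entry (i, k).
        out[f][e] += X[a][b] * Y[c][d]
    return out


def matmul(A, B):
    """Usual matrix product, used only as a reference."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def trace(A):
    """Trace of a square matrix."""
    return sum(A[i][i] for i in range(len(A)))
```

The same list of $lmn$ index triples drives both `tau` and `beta`; only the treatment of the third slot differs, which mirrors the statement that $\beta = \tau$ as 3-tensors.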

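²To be more precise, by the universal property of tensor products [35, Chapter XVI, §1], $\beta$ induces a linear map $\beta_* : \mathbb{F}^{l \times m} \otimes \mathbb{F}^{m \times n} \to \mathbb{F}^{l \times n}$ and $\tau$ induces a linear map $\tau_* : \mathbb{F}^{l \times m} \otimes \mathbb{F}^{m \times n} \otimes \mathbb{F}^{n \times l} \to \mathbb{F}$, i.e., $\beta_* \in (\mathbb{F}^{l \times m})^* \otimes (\mathbb{F}^{m \times n})^* \otimes \mathbb{F}^{l \times n}$ and $\tau_* \in (\mathbb{F}^{l \times m})^* \otimes (\mathbb{F}^{m \times n})^* \otimes (\mathbb{F}^{n \times l})^*$. We identify $\beta, \tau$ with the linear maps $\beta_*, \tau_*$ they induce.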

3. Grothendieck's constant and Strassen's tensor

Let $l, m, n$ be positive integers and let $M = (M_{ij}) \in \mathbb{F}^{m \times n}$. Let $x_1, \dots, x_m, y_1, \dots, y_n \in \mathbb{F}^l$ be vectors of unit 2-norm. We will regard $x_1, \dots, x_m$ as columns of a matrix $X \in \mathbb{F}^{l \times m}$ and $y_1^{\mathsf{T}}, \dots, y_n^{\mathsf{T}}$ as rows of a matrix $Y \in \mathbb{F}^{n \times l}$. Recall that for any $p \ge 1$ with Hölder conjugate $p^*$, i.e., $1/p + 1/p^* = 1$, we have

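\[ \|X\|_{1,p} := \max_{z \ne 0} \frac{\|Xz\|_p}{\|z\|_1} = \max_{i = 1, \dots, m} \|x_i\|_p, \qquad \|Y\|_{p,\infty} := \max_{z \ne 0} \frac{\|Yz\|_\infty}{\|z\|_p} = \max_{i = 1, \dots, n} \|y_i\|_{p^*}, \tag{8} \]
and
\[ \|M\|_{\infty,1} := \max_{z \ne 0} \frac{\|Mz\|_1}{\|z\|_\infty} = \max_{|\delta_j| = 1} \sum_{i=1}^{m} \Bigl| \sum_{j=1}^{n} M_{ij} \delta_j \Bigr| = \max_{|\varepsilon_i| = 1,\ |\delta_j| = 1} \Bigl| \sum_{i=1}^{m} \sum_{j=1}^{n} M_{ij} \varepsilon_i \delta_j \Bigr|, \]
which may be further simplified for $\mathbb{F} = \mathbb{R}$ as
\[ \|M\|_{\infty,1} = \max_{\varepsilon_i = \pm 1,\ \delta_j = \pm 1} \Bigl| \sum_{i=1}^{m} \sum_{j=1}^{n} M_{ij} \varepsilon_i \delta_j \Bigr| = \max_{\varepsilon \in \{\pm 1\}^m,\ \delta \in \{\pm 1\}^n} |\varepsilon^{\mathsf{T}} M \delta|. \tag{9} \]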
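For small real matrices, the norms in (8) and (9) can be evaluated directly. The Python sketch below is a brute-force illustration with names of our choosing; the enumeration over sign vectors costs $2^n$ evaluations, so it is only meant for tiny examples and not as a practical algorithm.

```python
from itertools import product


def norm_inf_1(M):
    """||M||_{inf,1} of a real matrix by enumerating delta in {+1, -1}^n."""
    m, n = len(M), len(M[0])
    best = 0
    for delta in product((1, -1), repeat=n):
        # For a fixed delta, the best eps_i is the sign of (M delta)_i,
        # so the inner maximum equals the 1-norm of M delta.
        value = sum(abs(sum(M[i][j] * delta[j] for j in range(n)))
                    for i in range(m))
        best = max(best, value)
    return best


def norm_1_p(X, p):
    """||X||_{1,p} for finite p: the largest p-norm of a column of X."""
    return max(sum(abs(x) ** p for x in col) ** (1 / p) for col in zip(*X))
```

For the $2 \times 2$ Hadamard matrix `[[1, 1], [1, -1]]`, `norm_inf_1` returns 2, which is below the bound $n^{3/2} = 2^{3/2}$ obtained from the largest singular value.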

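We refer the reader to [19] for a proof that
\[ \|\tau\|_{1,2,\infty} := \max_{X,Y,M \ne 0} \frac{|\tau(X, M, Y)|}{\|X\|_{1,2} \|Y\|_{2,\infty} \|M\|_{\infty,1}} \tag{10} \]
defines a norm for any tensor $\tau \in (\mathbb{F}^{l \times m})^* \otimes (\mathbb{F}^{m \times n})^* \otimes (\mathbb{F}^{n \times l})^*$, regarded as a trilinear functional. Since
\[ \sum_{i=1}^{m} \sum_{j=1}^{n} M_{ij} \langle x_i, y_j \rangle = \begin{cases} \operatorname{tr}(XMY) & \text{if } \mathbb{F} = \mathbb{R}, \\ \operatorname{tr}(XM\overline{Y}) & \text{if } \mathbb{F} = \mathbb{C}, \end{cases} \]
we see that Grothendieck's inequality (1) may be stated as
\[ \max_{X,Y,M \ne 0} \frac{|\operatorname{tr}(XMY)|}{\|X\|_{1,2} \|Y\|_{2,\infty} \|M\|_{\infty,1}} \le K_G^{\mathbb{F}}, \tag{11} \]
when $\mathbb{F} = \mathbb{R}$ and as
\[ \max_{X,Y,M \ne 0} \frac{|\operatorname{tr}(XM\overline{Y})|}{\|X\|_{1,2} \|Y\|_{2,\infty} \|M\|_{\infty,1}} \le K_G^{\mathbb{F}}, \]
when $\mathbb{F} = \mathbb{C}$. However, in the latter case, we may write
\[ \max_{X,Y,M \ne 0} \frac{|\operatorname{tr}(XM\overline{Y})|}{\|X\|_{1,2} \|Y\|_{2,\infty} \|M\|_{\infty,1}} = \max_{X,\overline{Y},M \ne 0} \frac{|\operatorname{tr}(XM\overline{Y})|}{\|X\|_{1,2} \|\overline{Y}\|_{2,\infty} \|M\|_{\infty,1}} = \max_{X,Y,M \ne 0} \frac{|\operatorname{tr}(XMY)|}{\|X\|_{1,2} \|Y\|_{2,\infty} \|M\|_{\infty,1}} \]
as matrix $(p,q)$-norms are invariant under complex conjugation. Hence (11) in fact gives Grothendieck's inequality for both $\mathbb{F} = \mathbb{R}$ and $\mathbb{C}$. By our discussion in Section 2 and our norm in (10), (11) is just
\[ \|\mu_{l,m,n}\|_{1,2,\infty} \le K_G^{\mathbb{F}} \]
where
\[ \mu_{l,m,n} := \sum_{i=1}^{l} \sum_{j=1}^{m} \sum_{k=1}^{n} \varepsilon_{ij} \otimes \varepsilon_{jk} \otimes \varepsilon_{ki} \in (\mathbb{F}^{l \times m})^* \otimes (\mathbb{F}^{m \times n})^* \otimes (\mathbb{F}^{n \times l})^* \tag{12} \]
is the Strassen matrix multiplication tensor for the product of $l \times m$ and $m \times n$ matrices. This allows us to define Grothendieck's constant in terms of tensor norms: For $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$,
\[ K_G^{\mathbb{F}} = \sup_{l,m,n \in \mathbb{N}} \|\mu_{l,m,n}\|_{1,2,\infty}. \]
Since $\|\mu_{l,m,n}\|_{1,2,\infty} = \|\mu_{m+n,m,n}\|_{1,2,\infty}$ for all $l \ge m + n$,
\[ K_G^{\mathbb{F}} = \sup_{m,n \in \mathbb{N}} \|\mu_{m+n,m,n}\|_{1,2,\infty} = \sup_{n \in \mathbb{N}} \|\mu_{2n,n,n}\|_{1,2,\infty}. \]
In addition, the Grothendieck constant of order $l \in \mathbb{N}$ [1, 15, 25] may be defined as
\[ K_G^{\mathbb{F}}(l) = \sup_{m,n \in \mathbb{N}} \|\mu_{l,m,n}\|_{1,2,\infty}. \]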

4. Grothendieck–Hölder inequality

The norm in (10) admits a natural generalization to arbitrary $p, q, r \in [1, \infty]$ as
\[ \|\tau\|_{p,q,r} := \max_{X,Y,M \ne 0} \frac{|\tau(X, M, Y)|}{\|X\|_{p,q} \|Y\|_{q,r} \|M\|_{r,p}} \]
defined for any $\tau \in (\mathbb{F}^{l \times m})^* \otimes (\mathbb{F}^{m \times n})^* \otimes (\mathbb{F}^{n \times l})^*$, regarded as a trilinear functional
\[ \tau : \mathbb{F}^{l \times m} \times \mathbb{F}^{m \times n} \times \mathbb{F}^{n \times l} \to \mathbb{F}. \]
In this article, we will only be interested in $\tau = \mu_{l,m,n}$, the Strassen tensor. We first state some simple observations that will be useful later.

Lemma 4.1. Let $p, q, r \in [1, \infty]$. Then the $(p,q,r)$-norm of $\mu_{l,m,n}$
(i) is invariant under cyclic permutation of $p, q, r$,
\[ \|\mu_{l,m,n}\|_{p,q,r} = \|\mu_{l,m,n}\|_{r,p,q} = \|\mu_{l,m,n}\|_{q,r,p}; \]
(ii) transforms under Hölder conjugation as
\[ \|\mu_{l,m,n}\|_{p,q,r} = \|\mu_{l,m,n}\|_{r^*,q^*,p^*}. \]
Recall that $p^*$ is the Hölder conjugate of $p$, i.e., $1/p + 1/p^* = 1$.

Proof. Since the numerator $\operatorname{tr}(XMY) = \operatorname{tr}(MYX) = \operatorname{tr}(YXM)$ and the denominator is the product $\|X\|_{p,q} \|M\|_{r,p} \|Y\|_{q,r}$, cyclic permutations of $(p,q), (r,p), (q,r)$ leave the quotient
\[ \frac{|\operatorname{tr}(XMY)|}{\|X\|_{p,q} \|M\|_{r,p} \|Y\|_{q,r}} \]
invariant. Now just observe that the cyclic permutations
\[ \bigl( (p,q), (r,p), (q,r) \bigr) \to \bigl( (q,r), (p,q), (r,p) \bigr) \to \bigl( (r,p), (q,r), (p,q) \bigr) \]
correspond to the following permutations
\[ (p, q, r) \to (q, r, p) \to (r, p, q). \]
Let $X^\dagger$ denote the conjugate transpose of $X$. Since $|\operatorname{tr}(XMY)| = |\operatorname{tr}(Y^\dagger M^\dagger X^\dagger)|$ and $\|X\|_{p,q} = \|X^\dagger\|_{q^*,p^*}$, we have
\[ \frac{|\operatorname{tr}(XMY)|}{\|X\|_{p,q} \|Y\|_{q,r} \|M\|_{r,p}} = \frac{|\operatorname{tr}(Y^\dagger M^\dagger X^\dagger)|}{\|Y^\dagger\|_{r^*,q^*} \|X^\dagger\|_{q^*,p^*} \|M^\dagger\|_{p^*,r^*}}. \]
Taking maximum over all nonzero $X, Y, M$ yields the required equality. Note that the proof works over both $\mathbb{R}$ and $\mathbb{C}$. □

A straightforward application of Hölder's inequality yields an upper bound for $\|\mu_{l,m,n}\|_{p,q,r}$.

Theorem 4.2. Let $p, q, r \in [1, \infty]$ and $l, m, n \in \mathbb{N}$. For any nonzero matrices $X \in \mathbb{F}^{l \times m}$, $Y \in \mathbb{F}^{n \times l}$ and $M \in \mathbb{F}^{m \times n}$, the following inequality is sharp:
\[ \frac{|\operatorname{tr}(XMY)|}{\|X\|_{p,q} \|Y\|_{q,r} \|M\|_{r,p}} \le \frac{|\operatorname{tr}(XMY)|}{\|X\|_{1,2} \|Y\|_{2,\infty} \|M\|_{\infty,1}} \cdot l^{|1/q - 1/2|} \cdot m^{1 - 1/p} \cdot n^{1/r}. \tag{13} \]
Furthermore, we have a generalization of Grothendieck's inequality:
\[ \|\mu_{l,m,n}\|_{p,q,r} = \max_{X,Y,M \ne 0} \frac{|\operatorname{tr}(XMY)|}{\|X\|_{p,q} \|Y\|_{q,r} \|M\|_{r,p}} \le K_G^{\mathbb{F}} \cdot l^{|1/q - 1/2|} \cdot m^{1 - 1/p} \cdot n^{1/r}. \tag{14} \]

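Proof. First let $1 \le q \le 2$. Hölder's inequality together with the fact that $\|x\|_p \le \|x\|_q$ whenever $q \le p$ give us
\[ \|X\|_{1,2} \le \|X\|_{1,q} \le \|X\|_{p,q} \quad \text{and} \quad \|Y\|_{2,\infty} \le \|Y\|_{2,r} \le l^{1/q - 1/2} \|Y\|_{q,r}. \tag{15} \]
The same argument also gives $\|M\|_{\infty,p} \le \|M\|_{\infty,1} \le m^{1 - 1/p} \|M\|_{\infty,p}$ for $1 \le p \le \infty$ and thus
\[ \|M\|_{\infty,1} \le m^{1 - 1/p} \|M\|_{\infty,p} \le n^{1/r} \cdot m^{1 - 1/p} \|M\|_{r,p}. \tag{16} \]
(13) then follows from (15) and (16). To see that it is sharp, we use the following³ $m \times n$ rank-one matrices:
\[
E_{m,n} := \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix}, \quad
C_{m,n} := \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 1 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 1 & 0 & \cdots & 0 \end{bmatrix}, \quad
R_{m,n} := \begin{bmatrix} 1 & 1 & \cdots & 1 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix}, \quad
J_{m,n} := \begin{bmatrix} 1 & 1 & \cdots & 1 \\ 1 & 1 & \cdots & 1 \\ \vdots & \vdots & & \vdots \\ 1 & 1 & \cdots & 1 \end{bmatrix}.
\]
It is easy to check that
\[ \|E_{l,m}\|_{p,q} = 1, \quad \|R_{n,l}\|_{q,r} = l^{1 - 1/q}, \quad \|J_{m,n}\|_{r,p} = m^{1/p} \cdot n^{1 - 1/r}, \]
\[ \|E_{l,m}\|_{1,2} = 1, \quad \|R_{n,l}\|_{2,\infty} = l^{1/2}, \quad \|J_{m,n}\|_{\infty,1} = mn. \]
Since (13) becomes an equality when $X = E_{l,m}$, $Y = R_{n,l}$, and $M = J_{m,n}$, it is sharp for $1 \le q \le 2$. Next let $2 < q \le \infty$. […]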

\[ K_{p,q,r} := \sup_{l,m,n \in \mathbb{N}} \|\mu_{l,m,n}\|_{p,q,r} < \infty \, ? \tag{17} \]

In Section 5, we will see that $K_{p,q,r} = \infty$ for all $(p,q,r) \notin \{ (1,2,\infty), (\infty,1,2), (2,\infty,1) \}$. Nevertheless, we stress that the absence of a uniform bound is only limited to the class of $(p,q,r)$-norms in (10). For example, we may consider the tensor spectral norm [19] of $\mu_{l,m,n}$,
\[ \|\mu_{l,m,n}\|_\sigma := \max_{X,Y,M \ne 0} \frac{|\operatorname{tr}(XMY)|}{\|X\|_F \|Y\|_F \|M\|_F}, \]
where the norm on $X, Y, M$ is the matrix Frobenius (i.e., Hilbert–Schmidt) norm. In this case,
\[ \|\mu_{l,m,n}\|_\sigma = 1 \quad \text{for all } l, m, n \in \mathbb{N}, \tag{18} \]
since, by Cauchy–Schwarz and the submultiplicativity of the Frobenius norm,
\[ |\operatorname{tr}(XMY)| \le \|X\|_F \|MY\|_F \le \|M\|_F \|X\|_F \|Y\|_F, \]
and equality is attained by choosing $M, X, Y$ with 1 in the $(1,1)$th entry and 0 everywhere else. We will use (18) to obtain lower bounds on $\|\mu_{l,m,n}\|_{p,q,r}$ below. (14) and (19) will collectively be referred to as the Grothendieck–Hölder inequality.
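The identity (18) can be probed numerically. The Python sketch below, with helper names of our own choosing, evaluates the quotient defining the spectral norm at the matrix units with a single 1 in the $(1,1)$th entry, where it equals 1, and at random real matrices, where it should not exceed 1. Random sampling only illustrates the bound; it is not a proof.

```python
import math
import random


def matmul(A, B):
    """Usual matrix product of nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def trace(A):
    """Trace of a square matrix."""
    return sum(A[i][i] for i in range(len(A)))


def frob(A):
    """Frobenius (Hilbert-Schmidt) norm."""
    return math.sqrt(sum(x * x for row in A for x in row))


def unit(rows, cols):
    """Matrix with 1 in the (1,1)th entry and 0 everywhere else."""
    E = [[0.0] * cols for _ in range(rows)]
    E[0][0] = 1.0
    return E


def sigma_ratio(X, M, Y):
    """|tr(XMY)| / (||X||_F ||M||_F ||Y||_F) for X: l x m, M: m x n, Y: n x l."""
    return abs(trace(matmul(matmul(X, M), Y))) / (frob(X) * frob(M) * frob(Y))


def max_random_ratio(l, m, n, trials=200, seed=0):
    """Largest quotient observed over random real matrices."""
    rng = random.Random(seed)

    def rand(rows, cols):
        return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

    return max(sigma_ratio(rand(l, m), rand(m, n), rand(n, l))
               for _ in range(trials))
```

Calling `sigma_ratio(unit(2, 3), unit(3, 4), unit(4, 2))` attains the value 1, while `max_random_ratio(2, 3, 4)` stays at or below 1 up to rounding.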

³These are standard in matrix theory, often used to demonstrate sharpness of various matrix inequalities.

Theorem 4.3. Let $p, q, r \in [1, \infty]$ and $l, m, n \in \mathbb{N}$. Then
\[ \frac{1}{l^{|1/q - 1/2|} \cdot m^{|1/p - 1/2|} \cdot n^{|1/r - 1/2|}} \le \|\mu_{l,m,n}\|_{p,q,r}. \tag{19} \]

Proof. For $n \in \mathbb{N}$ and $p, q \in [1, \infty]$, let
\[ c_{p,q}(n) := n^{\max\{0,\, 1/p - 1/q\}}. \]
Then for any $M \in \mathbb{F}^{m \times n}$, the following sharp inequality holds [32, Theorem 4.3],
\[ \|M\|_{p,q} \le c_{q,2}(m) \, c_{2,p}(n) \, \|M\|_F. \]
It follows that
\[ \|X\|_{p,q} \le c_{q,2}(l) \, c_{2,p}(m) \, \|X\|_F, \quad \|Y\|_{q,r} \le c_{r,2}(n) \, c_{2,q}(l) \, \|Y\|_F, \quad \|M\|_{r,p} \le c_{p,2}(m) \, c_{2,r}(n) \, \|M\|_F, \]
and for any tensor $\tau \in (\mathbb{F}^{l \times m})^* \otimes (\mathbb{F}^{m \times n})^* \otimes (\mathbb{F}^{n \times l})^*$, we have
\[ \|\tau\|_\sigma \le \|\tau\|_{p,q,r} \cdot l^{|1/q - 1/2|} \cdot m^{|1/p - 1/2|} \cdot n^{|1/r - 1/2|}. \]
Plugging in $\tau = \mu_{l,m,n}$ and using (18), we obtain (19). □

A practical reason for wanting to ascertain (17) is that if
\[ (p,q) \text{ and } (q,r) \in \{ (1,1), (2,2), (\infty,\infty), (1,q), (q,\infty) \}, \tag{20} \]
then $\|X\|_{p,q}$ and $\|Y\|_{q,r}$ can be computed in polynomial time (to arbitrary precision) and
\[ \max_{X,Y \ne 0} \frac{|\operatorname{tr}(XMY)|}{\|X\|_{p,q} \|Y\|_{q,r}} \le K_{p,q,r} \|M\|_{r,p} \]
in principle gives a polynomial-time approximation of $\|M\|_{r,p}$, which is NP-hard [23] if $(r,p)$ is not one of the special cases in (20). Unfortunately, we now know that as $K_{p,q,r} = \infty$ in all other cases, this only works when $(p,q,r) \in \{ (1,2,\infty), (\infty,1,2), (2,\infty,1) \}$, all three of which are equivalent to Grothendieck's inequality.

5. Grothendieck's inequality is unique

We show that $(p,q,r) = (1,2,\infty)$ is, up to a cyclic permutation, the only case for which (17) holds. We will first rule out a large number of cases with the following proposition.

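Proposition 5.1. Let $p, q, r \in [1, \infty]$. If there exists a finite constant $K_{p,q,r} > 0$ such that $\|\mu_{l,m,n}\|_{p,q,r} \le K_{p,q,r}$ for all $l, m, n \in \mathbb{N}$, then
\[ \min(p, q, r) = 1 \quad \text{and} \quad \max(p, q, r) = \infty. \]

Proof. Let $I_{m,n} \in \mathbb{F}^{m \times n}$ be the matrix obtained by appending zero rows or columns to the identity matrix $I_n$ or $I_m$,
\[ I_{m,n} := \begin{cases} [I_n, 0_{m-n}]^{\mathsf{T}} & \text{if } m \ge n, \\ [I_m, 0_{n-m}] & \text{if } m < n. \end{cases} \]
Then
\[ \|I_{m,n}\|_{p,q} = \begin{cases} \min\{m, n\}^{1/q - 1/p} & \text{if } p \ge q, \\ 1 & \text{if } p < q. \end{cases} \tag{21} \]
Indeed, if $m \ge n$,
\[ \|I_{m,n}\|_{p,q} = \max_{z \ne 0} \frac{\|I_{m,n} z\|_q}{\|z\|_p} = \max_{z \ne 0} \frac{\|z\|_q}{\|z\|_p} = \begin{cases} n^{1/q - 1/p} & \text{if } p \ge q, \\ 1 & \text{if } p < q. \end{cases} \]
[…]
\[ \max_{\|X\|_{1,q},\, \|Y\|_{q,\infty} \le 1} |\operatorname{tr}(XMY)| \ge |\operatorname{tr}(X_0 M Y_0)| = n^{-1/q} |\operatorname{tr}(\Delta M)| = n^{-1/q} \Bigl| \sum_{i,j=1}^{n} \delta_{ij} M_{ij} \Bigr|. \]
Since $\Delta \in \{\pm 1\}^{n \times n}$ is arbitrary, we will choose $\delta_{ij}$ so that $\delta_{ij} M_{ij}$ is nonnegative. Thus
\[ \max_{\|X\|_{1,q},\, \|Y\|_{q,\infty} \le 1} |\operatorname{tr}(XMY)| \ge n^{-1/q} \sum_{i,j=1}^{n} |M_{ij}|. \tag{22} \]
Let $H_n \in \{\pm 1\}^{n \times n}$ be a Hadamard matrix. So $H_n H_n^{\mathsf{T}} = n I_n$ and all singular values of $H_n$ are $\sqrt{n}$ [18]. Therefore, by (9),
\[ \|H_n\|_{\infty,1} = \max_{\varepsilon, \delta \in \{\pm 1\}^n} |\varepsilon^{\mathsf{T}} H_n \delta| \le \sigma_{\max}(H_n) \|\varepsilon\|_2 \|\delta\|_2 = n^{3/2}. \tag{23} \]
Let $M = n^{-3/2} H_n$. Then $\|M\|_{\infty,1} \le 1$ and by (22),
\[ \max_{\|X\|_{1,q},\, \|Y\|_{q,\infty},\, \|M\|_{\infty,1} \le 1} |\operatorname{tr}(XMY)| \ge n^{-1/q} \times n^{-3/2} \times n^2 = n^{1/2 - 1/q} \to \infty \]
as $n \to \infty$.

Suppose $1 \le q < 2$. Since the Hölder conjugates are $r^* = 1$, $2 < q^* \le \infty$, and $p^* = \infty$, by Lemma 4.1(ii), this reduces to the case we just treated.

Case II: $(1, \infty, r)$, $1 \le r \le \infty$. For $r = \infty$, we have $(1, \infty, \infty)$, which is same as the $q = \infty$ case in Case I. For $r = 1$, we have $(1, \infty, 1)$, but by Lemma 4.1(i), this is equivalent to $(1, 1, \infty)$, which is same as the $q = 1$ case in Case I. So we may assume $1$ […]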

References

[1] A. Acín, N. Gisin, and B. Toner. Grothendieck's constant and local models for noisy entangled quantum states. Phys. Rev. A (3), 73(6, part A):062105, 5, 2006.
[2] N. Alon and E. Berger. The Grothendieck constant of random and pseudo-random graphs. Discrete Optim., 5(2):323–327, 2008.
[3] N. Alon, K. Makarychev, Y. Makarychev, and A. Naor. Quadratic forms on graphs. Invent. Math., 163(3):499–522, 2006.
[4] N. Alon and A. Naor. Approximating the cut-norm via Grothendieck's inequality. SIAM J. Comput., 35(4):787–803, 2006.
[5] S. Arora, E. Berger, E. Hazan, G. Kindler, and M. Safra. On non-approximability for quadratic programs. Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science, pages 206–215, 2005.
[6] M. Braverman, K. Makarychev, Y. Makarychev, and A. Naor. The Grothendieck constant is strictly smaller than Krivine's bound. Forum Math. Pi, 1:e4, 42, 2013.
[7] J. Briët, H. Buhrman, and B. Toner. A generalized Grothendieck inequality and nonlocal correlations that require high entanglement. Comm. Math. Phys., 305(3):827–843, 2011.
[8] J. Briët, F. M. de Oliveira Filho, and F. Vallentin. The positive semidefinite Grothendieck problem with rank constraint. In Automata, languages and programming. Part I, volume 6198 of Lecture Notes in Comput. Sci., pages 31–42. Springer, Berlin, 2010.
[9] J. Briët, F. M. de Oliveira Filho, and F. Vallentin. Grothendieck inequalities for semidefinite programs with rank constraint. Theory Comput., 10:77–105, 2014.
[10] P. Bürgisser, M. Clausen, and A. Shokrollahi. Algebraic Complexity Theory, volume 315 of Grundlehren der Mathematischen Wissenschaften. Springer-Verlag, Berlin, 1997.
[11] M. Charikar and A. Wirth. Maximizing quadratic programs: extending Grothendieck's inequality. Proceedings of the 45th Annual IEEE Symposium on Foundations of Computer Science, pages 54–60, 2004.
[12] D. Coppersmith and S. Winograd. Matrix multiplication via arithmetic progressions. J. Symbolic Comput., 9(3):251–280, 1990.
[13] A. M. Davie. Lower bound for $K_G$. Unpublished note, 1984.
[14] A. M. Davie. Matrix norms related to Grothendieck's inequality. In Banach spaces (Columbia, Mo., 1984), volume 1166 of Lecture Notes in Math., pages 22–26. Springer, Berlin, 1985.
[15] P. Diviánszky, E. Bene, and T. Vértesi. Qutrit witness from the Grothendieck constant of order four. Phys. Rev., A(96), 2017.
[16] C. Dwork, A. Nikolov, and K. Talwar. Efficient algorithms for privately releasing marginals via convex relaxations. Discrete Comput. Geom., 53(3):650–673, 2015.
[17] P. C. Fishburn and J. A. Reeds. Bell inequalities, Grothendieck's constant, and root two. SIAM J. Discrete Math., 7(1):48–56, 1994.
[18] S. Friedland and M. Aliabadi. Linear algebra and matrices. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2018.
[19] S. Friedland and L.-H. Lim. Nuclear norm of higher-order tensors. Math. Comp., 87(311):1255–1281, 2018.
[20] S. Friedland, L.-H. Lim, and J. Zhang. An elementary proof of Grothendieck's inequality. arXiv:1711.10595, November 2017.
[21] A. Grothendieck. Résumé de la théorie métrique des produits tensoriels topologiques. Bol. Soc. Mat. São Paulo, 8:1–79, 1953.
[22] U. Haagerup. A new upper bound for the complex Grothendieck constant. Israel J. Math., 60(2):199–224, 1987.
[23] J. M. Hendrickx and A. Olshevsky. Matrix p-norms are NP-hard to approximate if $p \ne 1, 2, \infty$. SIAM J. Matrix Anal. Appl., 31(5):2802–2812, 2010.
[24] H. Heydari. Quantum correlation and Grothendieck's constant. J. Phys. A, 39(38):11869–11875, 2006.
[25] F. Hirsch, M. T. Quintino, T. Vértesi, M. Navascués, and N. Brunner. Better local hidden variable models for two-qubit Werner states and an upper bound on the Grothendieck constant $K_G(3)$. Quantum, 1(3), 2017.
[26] F. L. Hitchcock. The expression of a tensor or a polyadic as a sum of products. J. Math. Phys., 6(1):164–189, 1927.
[27] K. J. Horadam. Hadamard matrices and their applications. Princeton University Press, Princeton, NJ, 2007.
[28] G. J. O. Jameson. Summing and nuclear norms in Banach space theory, volume 8 of London Mathematical Society Student Texts. Cambridge University Press, Cambridge, 1987.
[29] S. Khot and A. Naor. Grothendieck-type inequalities in combinatorial optimization. Comm. Pure Appl. Math., 65(7):992–1035, 2012.
[30] S. Khot and A. Naor. Sharp kernel clustering algorithms and their associated Grothendieck inequalities. Random Structures Algorithms, 42(3):269–300, 2013.
[31] G. Kindler, A. Naor, and G. Schechtman. The UGC hardness threshold of the $L_p$ Grothendieck problem. Math. Oper. Res., 35(2):267–283, 2010.
[32] A.-L. Klaus and C.-K. Li. Isometries for the vector (p, q) norm and the induced (p, q) norm. Linear and Multilinear Algebra, 38(4):315–332, 1995.
[33] J.-L. Krivine. Constantes de Grothendieck et fonctions de type positif sur les sphères. Adv. in Math., 31(1):16–30, 1979.
[34] J. M. Landsberg. Tensors: geometry and applications, volume 128 of Graduate Studies in Mathematics. American Mathematical Society, Providence, RI, 2012.
[35] S. Lang. Algebra, volume 211 of Graduate Texts in Mathematics. Springer-Verlag, New York, third edition, 2002.
[36] F. Le Gall. Powers of tensors and fast matrix multiplication. In ISSAC 2014: Proceedings of the 39th International Symposium on Symbolic and Algebraic Computation, pages 296–303. ACM, New York, 2014.
[37] L.-H. Lim. Tensors and hypermatrices, volume 211 of Handbook of Linear Algebra. CRC Press, Boca Raton, FL, second edition, 2013.
[38] J. Lindenstrauss and A. Pełczyński. Absolutely summing operators in $L_p$-spaces and their applications. Studia Math., 29:275–326, 1968.
[39] N. Linial and A. Shraibman. Lower bounds in communication complexity based on factorization norms. Random Structures Algorithms, 34(3):368–394, 2009.
[40] G. Pisier. Factorization of linear operators and geometry of Banach spaces, volume 60 of CBMS Regional Conference Series in Mathematics. Published for the Conference Board of the Mathematical Sciences, Washington, DC; by the American Mathematical Society, Providence, RI, 1986.
[41] G. Pisier. Grothendieck's theorem, past and present. Bull. Amer. Math. Soc. (N.S.), 49(2):237–323, 2012.
[42] P. Raghavendra. Optimal algorithms and inapproximability results for every CSP? [extended abstract]. In STOC'08, pages 245–254. ACM, New York, 2008.
[43] P. Raghavendra and D. Steurer. Towards computing the Grothendieck constant. In Proceedings of the Twentieth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 525–534. SIAM, Philadelphia, PA, 2009.
[44] O. Regev. Bell violations through independent bases games. Quantum Inf. Comput., 12(1-2):9–20, 2012.
[45] O. Regev and B. Toner. Simulating quantum correlations with finite communication. SIAM J. Comput., 39(4):1562–1580, 2009/10.
[46] V. Strassen. Gaussian elimination is not optimal. Numer. Math., 13(4):354–356, 1969.
[47] V. Strassen. Vermeidung von Divisionen. J. Reine Angew. Math., 264:184–202, 1973.
[48] V. Strassen. Rank and optimal computation of generic tensors. Linear Algebra Appl., 52/53:645–685, 1983.
[49] V. Strassen. Relative bilinear complexity and matrix multiplication. J. Reine Angew. Math., 375/376:406–443, 1987.
[50] B. S. Tsirelson. Quantum generalizations of Bell's inequality. Lett. Math. Phys., 4(2):93–100, 1980.
[51] V. V. Williams. Multiplying matrices faster than Coppersmith–Winograd [extended abstract]. In STOC'12: Proceedings of the 2012 ACM Symposium on Theory of Computing, pages 887–898. ACM, New York, 2012.
[52] K. Ye and L.-H. Lim. Fast structured matrix computations: tensor rank and Cohn-Umans method. Found. Comput. Math., 18(1):45–95, 2018.

Department of Statistics, University of Chicago, Chicago, IL, 60637-1514. E-mail address: [email protected]

Department of Mathematics, Statistics and Computer Science, University of Illinois, Chicago, IL, 60607-7045. E-mail address: [email protected]

Computational and Applied Mathematics Initiative, Department of Statistics, University of Chicago, Chicago, IL 60637-1514. E-mail address, corresponding author: [email protected]