Generalized Power Symmetric Stochastic Matrices

PROCEEDINGS OF THE AMERICAN MATHEMATICAL SOCIETY
Volume 127, Number 7, Pages 1987-1994
S 0002-9939(99)05126-6
Article electronically published on March 17, 1999

GENERALIZED POWER SYMMETRIC STOCHASTIC MATRICES

R. B. BAPAT, S. K. JAIN, AND K. MANJUNATHA PRASAD

(Communicated by Jeffry N. Kahn)

Abstract. We characterize stochastic matrices $A$ which satisfy the equation $(A^p)^T = A^m$, where $p < m$ are positive integers.

1. Introduction

An $n \times n$ matrix is called stochastic if every entry is nonnegative and each row sum is one. The transpose of the matrix $A$ will be denoted by $A^T$. A matrix $A$ is doubly stochastic if both $A$ and $A^T$ are stochastic. Sinkhorn [5] characterized stochastic matrices $A$ which satisfy the condition $A^T = A^p$, where $p > 1$ is a positive integer. Such matrices were called power symmetric in [5]. In this paper we consider a generalization. Call a square matrix $A$ generalized power symmetric if $(A^p)^T = A^m$, where $p < m$ are positive integers. We give a characterization of generalized power symmetric stochastic matrices, thereby generalizing Sinkhorn's result. The proof makes nontrivial use of the machinery of generalized inverses.

In view of the fact that $(A^n)_{i,j}$ represents the probability of an event to change from the state $i$ to the state $j$ in $n$ units of time, the reader may find it interesting to interpret the condition $(A^p)^T = A^m$ on the matrix $A = (A_{i,j})$.

The paper is organized as follows. In the next section, we introduce some definitions and prove several preliminary results. The main results are proved in Section 3.

2. Preliminary results

If $A$ is an $m \times n$ matrix, then an $n \times m$ matrix $G$ is called a generalized inverse of $A$ if $AGA = A$. If $A$ is a square matrix, then $G$ is the group inverse of $A$ if $AGA = A$, $GAG = G$ and $AG = GA$. We refer to [1], [2], [3] for the background concerning generalized inverses.
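The definitions above can be checked mechanically on small examples. The following is a minimal sketch, not part of the paper: it uses exact rational arithmetic from Python's standard library, and every helper name (`mat_mul`, `mat_pow`, `is_stochastic`, `is_generalized_power_symmetric`, `is_group_inverse`) as well as the three sample matrices are assumptions chosen for illustration.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_pow(a, k):
    """Return a**k for an integer k >= 1 by repeated multiplication."""
    result = a
    for _ in range(k - 1):
        result = mat_mul(result, a)
    return result


def transpose(a):
    return [list(row) for row in zip(*a)]


def is_stochastic(a):
    """Every entry nonnegative and every row sum equal to one."""
    return all(x >= 0 for row in a for x in row) and all(sum(row) == 1 for row in a)


def is_generalized_power_symmetric(a, p, m):
    """Check (A^p)^T == A^m for positive integers p < m."""
    if not 0 < p < m:
        raise ValueError("need positive integers p < m")
    return transpose(mat_pow(a, p)) == mat_pow(a, m)


def is_group_inverse(a, g):
    """Check AGA = A, GAG = G and AG = GA."""
    ag, ga = mat_mul(a, g), mat_mul(g, a)
    return mat_mul(ag, a) == a and mat_mul(ga, g) == g and ag == ga


one, zero, third, half = Fraction(1), Fraction(0), Fraction(1, 3), Fraction(1, 2)

# Cyclic shift on three states: its transpose is its inverse, which equals its square.
shift = [[zero, one, zero], [zero, zero, one], [one, zero, zero]]

# The matrix (1/3) J is symmetric and idempotent, so every pair p < m works.
flat = [[third] * 3 for _ in range(3)]

# A stochastic matrix with an absorbing state; no pair p < m works for it,
# because A^m keeps a zero where (A^p)^T has a positive entry.
upper = [[half, half], [zero, one]]

print(is_generalized_power_symmetric(shift, 1, 2))  # (A)^T == A^2
print(is_generalized_power_symmetric(flat, 2, 5))
print(is_generalized_power_symmetric(upper, 1, 2))
print(is_group_inverse(shift, mat_pow(shift, 2)))
```

Exact fractions are used instead of floating point so that the matrix equalities can be tested with `==` without a tolerance.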
It is well known that $A$ admits group inverse if and only if $\operatorname{rank}(A) = \operatorname{rank}(A^2)$, in which case the group inverse, denoted by $A^{\#}$, is unique.

Received by the editors October 22, 1997.
1991 Mathematics Subject Classification. Primary 05B20, 15A09, 15A51.
This work was done while the second author was visiting Indian Statistical Institute in November-December 1996 and July-August 1997. He would like to express his thanks for the warm hospitality he enjoyed during his visits as always. Partially supported by Research Challenge Fund, Ohio University.
© 1999 American Mathematical Society
License or copyright restrictions may apply to redistribution; see https://www.ams.org/journal-terms-of-use

If $A$ is an $n \times n$ matrix, then the index of $A$, denoted by $\operatorname{index}(A)$, is the least positive integer $k$ such that $\operatorname{rank}(A^k) = \operatorname{rank}(A^{k+1})$. Thus $A$ has group inverse if and only if $\operatorname{index}(A) = 1$.

If $A$ is an $m \times n$ real matrix, then the $n \times m$ matrix $G$ is called the Moore-Penrose inverse of $A$ if it satisfies $AGA = A$, $GAG = G$, $(AG)^T = AG$, $(GA)^T = GA$. The Moore-Penrose inverse of $A$, which always exists and is unique, is denoted by $A^{\dagger}$.

A real matrix $A$ is said to be an EP matrix if the column spaces of $A$ and $A^T$ are identical. We refer to [1] or [3] for elementary properties of EP matrices.

Lemma 1. Let $A$ be a real $n \times n$ matrix and suppose $(A^p)^T = A^m$, where $p < m$ are positive integers. Then the following assertions are true:
(i) $\operatorname{rank}(A^p) = \operatorname{rank}(A^k)$, $k \ge p$,
(ii) $\operatorname{index}(A) \le p$,
(iii) $\operatorname{index}(A^p) = 1$,
(iv) $A^p$ is an EP matrix,
(v) $(A^p)^{\#} = (A^p)^{\dagger}$.

Proof. (i) Clearly, $\operatorname{rank}(A^p) \ge \operatorname{rank}(A^k) \ge \operatorname{rank}(A^m)$ for $p \le k \le m$. Since $(A^p)^T = A^m$, we have
$$\operatorname{rank}(A^p) = \operatorname{rank}\big((A^p)^T\big) = \operatorname{rank}(A^m)$$
and thus $\operatorname{rank}(A^p) = \operatorname{rank}(A^k)$, $p \le k \le m$. It follows that $\operatorname{rank}(A^p) = \operatorname{rank}(A^k)$, $k \ge p$.
(ii) By (i), $\operatorname{rank}(A^p) = \operatorname{rank}(A^{p+1})$ and hence $\operatorname{index}(A) \le p$.
(iii) By (i), $\operatorname{rank}(A^p) = \operatorname{rank}(A^{2p})$ and hence $\operatorname{index}(A^p) = 1$.
(iv) Since $\operatorname{rank}(A^p) = \operatorname{rank}(A^m)$, the column space of $A^p$ is the same as that of $A^m$. Also, since $(A^p)^T = A^m$, the column space of $A^m$ is the same as the row space of $A^p$, written as a set of column vectors. Therefore, the column spaces of $A^p$ and $(A^p)^T$ are identical and $A^p$ is an EP matrix.
(v) It is known (see, for example, [3], p. 129) that for an EP matrix the group inverse exists and coincides with the Moore-Penrose inverse.

Lemma 2. Let $A$ be an $n \times n$ matrix such that $(A^p)^T = A^m$, where $p < m$ are positive integers. Let $\gamma = m - p$. Then $A^p = A^p A^{i\gamma} (A^{i\gamma})^T$ for any integer $i \ge 0$.

Proof. We have
$$A^p = (A^{p+\gamma})^T = (A^p)^T (A^\gamma)^T = A^{p+\gamma} (A^\gamma)^T = A^\gamma A^p (A^\gamma)^T = A^\gamma \big(A^\gamma A^p (A^\gamma)^T\big) (A^\gamma)^T = A^{2\gamma} A^p (A^{2\gamma})^T. \tag{1}$$
Repeating the argument, we get $A^p = A^{i\gamma} A^p (A^{i\gamma})^T = A^p A^{i\gamma} (A^{i\gamma})^T$ for any $i \ge 0$ and the proof is complete.

Lemma 3. Let $A$ be a nonnegative $n \times n$ matrix and suppose $(A^p)^T = A^m$, where $p < m$ are positive integers. Then $(A^p)^{\#}$ exists and is nonnegative.

Proof. By Lemma 1 (iii), $\operatorname{index}(A^p) = 1$ and therefore $(A^p)^{\#}$ exists. Let $\gamma = m - p$. Setting $i = p$ in Lemma 2, we get
$$A^p = A^p A^{p\gamma} (A^{p\gamma})^T = A^p A^{p\gamma} A^{(p+\gamma)\gamma}$$
in view of $(A^p)^T = A^m$. Thus, since $\gamma \ge 1$,
$$A^p = A^{2p} A^{p(\gamma-1)} A^{(p+\gamma)\gamma},$$
where $A^0$ is taken to be the identity matrix. Let $B = A^{p(\gamma-1)} A^{(p+\gamma)\gamma}$. Then it follows from the previous equation that $A^p B A^p = A^p$. Also, since $B$ is a power of $A$, $A^p B = B A^p$. Thus $(A^p)^{\#} = B A^p B$ is also a power of $A$ and hence is nonnegative.

Lemma 4. Let $A$ be a stochastic $n \times n$ matrix and suppose $(A^p)^T = A^m$, where $p < m$ are positive integers. Then there exists a permutation matrix $P$ such that $PAP^T$ is a direct sum of matrices of the following two types I and II:
I. $C_{11}$, where $C_{11}^{v} = xx^T$, $x > 0$, for some positive integer $v$.
II.
$$\begin{pmatrix} 0 & C_{12} & 0 & \cdots & 0 \\ 0 & 0 & C_{23} & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & C_{d-1,d} \\ C_{d1} & 0 & 0 & \cdots & 0 \end{pmatrix}$$
where there exists a positive integer $u$ such that $(C_{12} C_{23} \cdots C_{d1})^u = x_1 x_1^T, \ldots, (C_{d1} C_{12} \cdots C_{d-1,d})^u = x_d x_d^T$ for some positive vectors $x_1, \ldots, x_d$.

Proof. Let $C = A^p (A^p)^{\#}$. Since $(A^p)^{\#} = (A^p)^{\dagger}$ by Lemma 1, $C$ is symmetric. By Lemma 3, $(A^p)^{\#}$ is nonnegative and hence $C$ is nonnegative. Also, $C$ is clearly idempotent. As observed in the proof of Lemma 3, $(A^p)^{\#}$ is a power of $A$ and hence $C$ is also a power of $A$. The nonnegative roots of symmetric, nonnegative, idempotent matrices have been characterized in [4], Theorem 2. Applying the result to $C$, we get the form of $A$ asserted in the present result. We remark that according to Theorem 2 of [4], a type III summand is also possible along with the two types mentioned above. However, a matrix of this type is necessarily nilpotent and since $A$ is stochastic, such a summand is not possible.

Remark. Using the notation of Lemma 4, it can be verified that the type II summand given there has the property that its $(du)$th power is
$$\begin{pmatrix} x_1 x_1^T & 0 & \cdots & 0 \\ 0 & x_2 x_2^T & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & x_d x_d^T \end{pmatrix}.$$
This observation will be used in the sequel.

We now introduce some notation. Denote by $J_{m \times n}$ the $m \times n$ matrix with each entry equal to one. If $n_1, \ldots, n_d$ are positive integers adding up to $n$, then $\hat{J}(n_1, \ldots, n_d)$ will denote the $n \times n$ matrix
$$\begin{pmatrix} 0 & \frac{1}{n_2} J_{n_1 \times n_2} & 0 & \cdots & 0 \\ 0 & 0 & \frac{1}{n_3} J_{n_2 \times n_3} & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & \frac{1}{n_d} J_{n_{d-1} \times n_d} \\ \frac{1}{n_1} J_{n_d \times n_1} & 0 & 0 & \cdots & 0 \end{pmatrix}$$
where the zero blocks along the diagonal are square, of order $n_1, n_2, \ldots, n_d$, respectively. We remark that if $d = 1$, then $\hat{J}(n_1) = \frac{1}{n_1} J_{n_1 \times n_1}$.

From now onwards, we make the convention that when we deal with integers $n_1, \ldots, n_d$, the subscripts of $n$ should be interpreted modulo $d$. Thus, for example, $n_{d+1} = n_1$, $n_{-2} = n_{d-2}$ and so on.

Lemma 5. Let $n_1, \ldots, n_d$ be positive integers summing to $n$ and let $p < m$ be positive integers. Let $p \equiv \mu \pmod{d}$, $m \equiv \mu' \pmod{d}$, where $0 \le \mu \le d-1$, $0 \le \mu' \le d-1$. Let $S = \hat{J}(n_1, \ldots, n_d)$. Then the following conditions are equivalent:
(i) $(S^p)^T = S^m$;
(ii) (a) $d$ divides both $p$ and $m$, or (b) $d$ divides neither $p$ nor $m$, $\mu + \mu' = d$ and $n_i = n_{i+\mu'}$, $1 \le i \le d$;
(iii) (a) $d$ divides both $p$ and $m$, or (b) $d$ divides neither $p$ nor $m$, $\mu + \mu' = d$ and $n_i = n_{i+\delta}$, $1 \le i \le d$, where $\delta = (\mu, \mu')$ is the g.c.d.
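The equivalence of (i) and (ii) in Lemma 5 can be explored numerically. The sketch below is an illustration under stated assumptions rather than the authors' method: it builds $\hat{J}(n_1, \ldots, n_d)$ with exact fractions, tests condition (i) by direct matrix powers, and compares the answer with condition (ii). The function names `j_hat`, `direct_check` and `lemma5_condition` are hypothetical, and indices are 0-based in the code, whereas the text uses 1-based subscripts.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_pow(a, k):
    """Return a**k for an integer k >= 1 by repeated multiplication."""
    result = a
    for _ in range(k - 1):
        result = mat_mul(result, a)
    return result


def transpose(a):
    return [list(row) for row in zip(*a)]


def j_hat(ns):
    """Build J-hat(n_1, ..., n_d): block (i, i+1 mod d) is (1/n_{i+1}) J, all else zero."""
    d, n = len(ns), sum(ns)
    starts = [sum(ns[:i]) for i in range(d)]
    s = [[Fraction(0)] * n for _ in range(n)]
    for i in range(d):
        j = (i + 1) % d  # subscripts are read modulo d
        for r in range(starts[i], starts[i] + ns[i]):
            for c in range(starts[j], starts[j] + ns[j]):
                s[r][c] = Fraction(1, ns[j])
    return s


def direct_check(ns, p, m):
    """Condition (i): (S^p)^T == S^m, computed from the matrix itself."""
    s = j_hat(ns)
    return transpose(mat_pow(s, p)) == mat_pow(s, m)


def lemma5_condition(ns, p, m):
    """Condition (ii), stated purely in terms of d, p, m and the block sizes."""
    d = len(ns)
    mu, mu_prime = p % d, m % d
    if mu == 0 and mu_prime == 0:
        return True   # (a): d divides both p and m
    if mu == 0 or mu_prime == 0:
        return False  # d divides exactly one of p and m
    return mu + mu_prime == d and all(
        ns[i] == ns[(i + mu_prime) % d] for i in range(d))


# Compare the two conditions on a few small block-size tuples.
for ns in [(2,), (1, 1), (1, 2), (2, 2), (1, 2, 1), (2, 2, 2), (1, 2, 1, 2)]:
    for m in range(2, 7):
        for p in range(1, m):
            assert direct_check(ns, p, m) == lemma5_condition(ns, p, m)
```

The loop at the end is only a finite sanity check over small cases; it does not replace the proof of the lemma.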