Optimization Problems with Orthogonal Matrix Constraints
NUMERICAL ALGEBRA, CONTROL AND OPTIMIZATION
doi:10.3934/naco.2018026
Volume 8, Number 4, December 2018, pp. 413–440

K. T. Arasu
Department of Mathematics and Statistics, Wright State University, 3640 Colonel Glenn Highway, Dayton, OH 45435, U.S.A.

Manil T. Mohan∗†
Department of Mathematics and Statistics, Air Force Institute of Technology, 2950 Hobson Way, Wright Patterson Air Force Base, OH 45433, USA

(Communicated by Zhong Wan)

Abstract. Optimization problems involving orthogonal matrices are formulated in this work. A lower bound for the number of stationary points of such optimization problems is found, and its connection to the number of possible partitions of natural numbers is established. We obtain local and global optima of such problems for different orders and show their connection with the Hadamard, conference and weighing matrices. Applications of the general theory to some concrete examples, including maximization of Shannon, Rényi, Tsallis and Sharma-Mittal entropies for orthogonal matrices, minimum distance orthostochastic matrices to uniform van der Waerden matrices, and Cressie-Read and K-divergence functions for orthogonal matrices, are also discussed. Global optima for all orders are found for the optimization problems involving unitary matrix constraints.

1. Introduction. Optimization problems with orthogonal or unitary matrix constraints play an important role in several applications in science, engineering and technology. Some examples include linear and nonlinear eigenvalue problems, electronic structure computations, low-rank matrix optimization, polynomial optimization, subspace tracking, combinatorial optimization, sparse principal component analysis, etc. (see [8, 1, 18, 24, 25] and references therein).
These problems are difficult because the constraints are non-convex and the orthogonality constraints may lead to several local optima; in particular, many of these problems in special forms are non-deterministic polynomial-time hard (NP-hard). There is no guarantee of obtaining the global optimizer except in a few simple cases, and hence finding a global optimum is a tedious task for numerical analysts (see [25] for more details).

2010 Mathematics Subject Classification. Primary: 15B51; Secondary: 65K05, 15B10, 15B34.
Key words and phrases. Optimization, orthogonal matrix, Hadamard matrix, conference matrix, weighing matrix, orthostochastic matrix, Shannon entropy.
∗ Corresponding author: Manil T. Mohan.
† M. T. Mohan's current address: Department of Mathematics, Indian Institute of Technology Roorkee-IIT Roorkee, Haridwar Highway, Roorkee, Uttarakhand 247 667, INDIA.

In this work, we formulate some interesting optimization problems involving orthogonal matrix constraints (of order $n$, $n \in \mathbb{N}$) and find a fascinating connection between the number of stationary points and the number of possible partitions of a natural number. We identify some global and local optima for such problems, which are inevitably related to the real Hadamard, conference and weighing matrices. For the optimization problems involving unitary matrix constraints, we find global optima for all orders. We also apply the general theory to some concrete examples, including maximization of Shannon, Rényi, Tsallis and Sharma-Mittal entropies for orthogonal matrices, minimum distance orthostochastic matrices to uniform van der Waerden matrices, and Cressie-Read and K-divergence functions for orthogonal matrices.

Let us now list some of the optimization problems with orthogonal matrix constraints available in the literature.
One of the first such problems is addressed in [13], where the optimization problem was to find the nearest orthogonal matrix to a given matrix. In information theory, the entropy of a random variable is a measure of its uncertainty. The Shannon entropy of a real orthogonal matrix is introduced in [9], and it has been shown in [9, 2] that if a real Hadamard matrix exists, then it maximizes the Shannon entropy. The work in [3] extended this idea and found (local and global) minimum distance orthostochastic matrices to uniform van der Waerden matrices (see [17, 5, 3]) of different orders. The uniform van der Waerden matrix of order $n$ (with all its entries equal to $1/n$) is bistochastic, and it is orthostochastic if and only if there exists a real Hadamard matrix of order $n$. Note that if a real Hadamard matrix of order $n$ exists, then $n = 1, 2$ or $n \equiv 0 \pmod 4$. Thus, for all other orders, finding local and global optima of the problem of minimum distance orthostochastic matrices to the uniform van der Waerden matrix is challenging. From the numerical point of view, the authors in [25] developed a feasible method for optimization problems with orthogonality constraints, and they covered a wide range of such problems.

Bistochastic and orthostochastic matrices have a variety of applications in Statistics, Mathematics and Physics, including but not limited to the theory of majorization, angular momentum, transfer problems, investigations of the Frobenius-Perron operator, and the characterization of completely positive maps acting in the space of density matrices; see, for example, [4, 15, 26].

The organization of the paper is as follows. In the next section, we give some basic definitions and properties of the orthogonal group and of Hadamard, conference and weighing matrices. In Section 3, we formulate a general optimization problem and discuss its stationary points and its local and global maxima (and minima).
A general maximization problem (minimization problem) is described in Section 4, and several properties of its stationary points and of its local and global maxima (minima) are examined. The same optimization problems with unitary matrices as constraints are also formulated in that section, and we find that matrices involving complex Hadamard matrices are the global optimizers for all orders. Applications of the general theory to some particular examples, including maximization of Shannon, Rényi, Tsallis and Sharma-Mittal entropies for orthogonal matrices, minimum distance orthostochastic matrices to uniform van der Waerden matrices, and Cressie-Read and K-divergence functions for orthogonal matrices, are given in Section 5.

2. Preliminaries. In this section, we give some preliminaries needed to establish the main results of this paper. In the sequel, $M(n,\mathbb{R}) = \mathbb{R}^{n \times n}$ denotes the space of all $n \times n$ real matrices and $M(n,\mathbb{C}) = \mathbb{C}^{n \times n}$ denotes the space of all $n \times n$ complex matrices.

Definition 2.1. For every positive integer $n$, the orthogonal group $O(n,\mathbb{R})$ is the group of $n \times n$ real orthogonal matrices $M_n$, with matrix multiplication as the group operation, satisfying
\[ M_n M_n^{\top} = M_n^{\top} M_n = I_n, \]
where $I_n$ is the $n \times n$ identity matrix and $M_n^{\top}$ is the transpose of $M_n$.

Because the determinant of an orthogonal matrix is either $+1$ or $-1$, the orthogonal group has two components. The component containing the identity $I_n$ is the special orthogonal group $SO(n,\mathbb{R})$. That is,
\[ SO(n,\mathbb{R}) := \{ M_n \in O(n,\mathbb{R}) : \det(M_n) = +1 \}, \]
and it is a normal subgroup of $O(n,\mathbb{R})$ with $O(n,\mathbb{R})/SO(n,\mathbb{R}) \cong \mathbb{Z}_2$. Note that $O(n,\mathbb{R})$ is compact and not connected, and it has two connected components. The map $\det : O(n,\mathbb{R}) \to \{-1,+1\}$ is continuous and $\{+1\}$ is open in $\{-1,+1\}$. Thus, $SO(n,\mathbb{R}) = \det^{-1}(\{+1\})$ is an open, connected subset of $O(n,\mathbb{R})$, and both $O(n,\mathbb{R})$ and $SO(n,\mathbb{R})$ are smooth submanifolds of $\mathbb{R}^{n^2}$, where $\mathbb{R}^{n^2}$ is the $n^2$-dimensional Euclidean space.
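The two components of $O(n,\mathbb{R})$ can be seen concretely for $n = 2$. The following is a minimal sketch in plain Python, not taken from the paper; the helper names `rotation`, `reflection`, `det2` and `is_orthogonal` are hypothetical and chosen only for illustration. A rotation has determinant $+1$ and lies in $SO(2,\mathbb{R})$, a reflection has determinant $-1$ and lies in the other component, and the product of two reflections returns to $SO(2,\mathbb{R})$, as the quotient $\mathbb{Z}_2$ suggests.

```python
import math

def rotation(theta):
    """2 x 2 rotation by the angle theta (determinant +1)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def reflection(theta):
    """2 x 2 reflection across the line at angle theta / 2 (determinant -1)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [s, -c]]

def matmul(a, b):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(m):
    return [[m[j][i] for j in range(2)] for i in range(2)]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def is_orthogonal(m, tol=1e-12):
    """Check M^T M = I_2 up to a floating-point tolerance."""
    g = matmul(transpose(m), m)
    return all(abs(g[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(2) for j in range(2))

R = rotation(0.7)      # element of SO(2, R)
F = reflection(0.7)    # element of the other component of O(2, R)

print(is_orthogonal(R), round(det2(R)))
print(is_orthogonal(F), round(det2(F)))
# The product of two reflections is a rotation, so its determinant is +1.
print(round(det2(matmul(F, reflection(0.2)))))
```

The tolerance `1e-12` is an arbitrary choice for double-precision arithmetic and is not part of the definition.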
For more properties of the orthogonal and special orthogonal groups, interested readers are referred to Section 7.4.2 of [14].

Remark 1. The vector space of real $n \times n$ skew-symmetric matrices is denoted by $\mathfrak{so}(n,\mathbb{R})$. The exponential map
\[ \exp : \mathfrak{so}(n,\mathbb{R}) \to SO(n,\mathbb{R}), \]
defined by
\[ \exp(A_n) = e^{A_n} := I_n + \sum_{k=1}^{\infty} \frac{A_n^k}{k!}, \quad \text{for all } A_n \in \mathfrak{so}(n,\mathbb{R}), \]
is well defined and surjective. Given a real skew-symmetric matrix $A_n$, $R_n = \exp(A_n)$ is a rotation matrix and, conversely, given a rotation matrix $R_n \in SO(n,\mathbb{R})$, there is some skew-symmetric matrix $A_n$ such that $R_n = \exp(A_n)$ (see Theorem 14.2.2, [10]).

Definition 2.2 (Real Hadamard matrix). A real Hadamard matrix $H_n$ of order $n$ is defined as an $n \times n$ square matrix with entries from $\{+1, -1\}$ such that
\[ H_n H_n^{\top} = n I_n. \]

The following result gives a necessary condition on the order of a real Hadamard matrix.

Lemma 2.3 (Theorem 4.4, [21], Lemma 1.1.4, [23]). If a real Hadamard matrix of order $n$ exists, then $n = 1, 2$ or $n \equiv 0 \pmod 4$.

Conjecture 1 (The Hadamard conjecture, [20]). If $n \equiv 0 \pmod 4$, then there is a real Hadamard matrix of order $n$.

Definition 2.4 (Real conference matrix). A real conference matrix of order $n > 1$ is an $n \times n$ matrix $C_n \in M(n,\mathbb{R})$ with diagonal entries $0$ and off-diagonal entries $\pm 1$ which satisfies
\[ C_n C_n^{\top} = (n-1) I_n. \]

The defining equation shows that any two rows of $C_n$ are orthogonal and hence $n$ must be even. The following result describes real conference matrices up to equivalence.

Lemma 2.5 (Corollary 2.2, [7]). Any real conference matrix of order $n > 2$ is equivalent, under multiplication of rows and columns by $-1$, to a symmetric or to a skew-symmetric conference matrix according as $n \equiv 2 \pmod 4$ or $n \equiv 0 \pmod 4$.

Definition 2.6 (Real weighing matrix). A real weighing matrix $W_{n,k}$ is a square matrix of order $n$ with entries $0, \pm 1$ having $k$ non-zero entries per row and column, and with the inner product of distinct rows equal to zero. Hence, $W_{n,k}$ satisfies $W_{n,k} W_{n,k}^{\top} = k I_n$. The number $k$ is called the weight of $W_{n,k}$.
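Definitions 2.2, 2.4 and 2.6 can be checked mechanically on small examples. The sketch below is an illustration in plain Python and is not part of the paper: the function names (`is_hadamard`, `is_conference`, `is_weighing`, `kron`, `chi`) are hypothetical, and the example matrices are standard constructions assumed here for illustration, namely the Sylvester (Kronecker product) construction for a Hadamard matrix of order 4, the Paley construction over the integers modulo 5 for a symmetric conference matrix of order 6, and one explicit weighing matrix of order 4 and weight 3.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def gram_is(m, c):
    """Return True when M M^T equals c times the identity (exact integers)."""
    n = len(m)
    target = [[c if i == j else 0 for j in range(n)] for i in range(n)]
    return matmul(m, transpose(m)) == target

def kron(a, b):
    """Kronecker product of two matrices stored as nested lists."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def is_hadamard(m):
    n = len(m)
    return all(x in (1, -1) for row in m for x in row) and gram_is(m, n)

def is_conference(m):
    n = len(m)
    shape_ok = all((m[i][j] == 0) if i == j else (m[i][j] in (1, -1))
                   for i in range(n) for j in range(n))
    return shape_ok and gram_is(m, n - 1)

def is_weighing(m, k):
    entries_ok = all(x in (0, 1, -1) for row in m for x in row)
    rows_ok = all(sum(x != 0 for x in row) == k for row in m)
    cols_ok = all(sum(x != 0 for x in col) == k for col in transpose(m))
    return entries_ok and rows_ok and cols_ok and gram_is(m, k)

# Sylvester construction: H_4 is the Kronecker product of H_2 with itself.
H2 = [[1, 1], [1, -1]]
H4 = kron(H2, H2)

def chi(x):
    """Quadratic character modulo 5: 0 at 0, +1 on squares, -1 otherwise."""
    r = x % 5
    return 0 if r == 0 else (1 if r in (1, 4) else -1)

# Paley construction: border the 5 x 5 character matrix with a row and column of ones.
core = [[chi(i - j) for j in range(5)] for i in range(5)]
C6 = [[0] + [1] * 5] + [[1] + row for row in core]

# An explicit weighing matrix of order 4 and weight 3.
W43 = [[0, 1, 1, 1],
       [1, 0, -1, 1],
       [1, 1, 0, -1],
       [1, -1, 1, 0]]

print(is_hadamard(H4), is_conference(C6), is_weighing(W43, 3))
```

Integer arithmetic keeps every check exact, so no tolerance is needed. The order 6 of the conference matrix is even, as the text requires, and $6 \equiv 2 \pmod 4$, so by Lemma 2.5 a symmetric representative exists, which agrees with `C6` being symmetric.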