Learning Bayesian Networks Through Birkhoff Polytope: a Relaxation Method

Total Page:16

File Type:pdf, Size:1020Kb

Learning Bayesian Networks Through Birkhoff Polytope: a Relaxation Method 1 Learning Bayesian Networks through Birkhoff Polytope: A Relaxation Method Aramayis Dallakyan, Mohsen Pourahmadi Abstract—We establish a novel framework for learning a directed acyclic graph (DAG) when data are generated from a Gaussian, linear structural equation model. It consists of two parts: (1) introduce a permutation matrix as a new parameter within a regularized Gaussian log-likelihood to represent variable ordering; and (2) given the ordering, estimate the DAG structure through sparse Cholesky factor of the inverse covariance matrix. For permutation matrix estimation, we propose a relaxation technique that avoids the NP-hard combinatorial problem of order estimation. Given an ordering, a sparse Cholesky factor is estimated using a cyclic coordinatewise descent algorithm which decouples row-wise. Our framework recovers DAGs without the need for an expensive verification of the acyclicity constraint or enumeration of possible parent sets. We establish numerical convergence of the algorithm, and consistency of the Cholesky factor estimator when the order of variables is known. Through several simulated and macro-economic datasets, we study the scope and performance of the proposed methodology. Index Terms—Bayesian Networks, sparse Cholesky factorization, Directed Acyclic Graphs, Permutation relaxation F 1 INTRODUCTION methods [10], [22]–[24] where the topological ordering is AYESIAN considered as a parameter [10]. The order-based search Networks (BNs) are a popular class of graph- O(p log p) ical models whose structure is represented by a DAG has two main advantages: the ordering space (2 ) is B O(p2) G. BNs have been used in many applications such as significantly smaller than the DAG search space (2 ), economics, finance, biology, etc [1]–[5]. In recent years the and the existence of ordering guarantees satisfaction of the following two approaches have been evolved to learn the acyclicity constraint. structure of the underlying DAG from data: Independence- The recent Annealing on Regularized Cholesky Score based (also called constraint-based) methods [6], [7] and (ARCS) algorithm in [24] is based on representing an order- score-based methods [8]–[11]. Here, structure learning refers ing by the corresponding permutation matrix P , and then to recovering DAG from observational data. given the order, encoding the weighted adjacency matrix B Independence-based methods, such as the inductive cau- into the Cholesky factor L of the inverse covariance matrix. sation (IC) [7] and PC (Peter-Clark) [6] algorithm, utilize ARCS optimizes a regularized likelihood score function to conditional independence tests to detect the existence of recover sparse DAG structure and utilizes simulated an- edges between each pair of variables. The method assumes nealing (SA) to search over the permutation matrix space. that the distribution is Markovian and faithful with respect In SA, using a pre-specified constant m and a temperature to the underlying DAG, where P is faithful to the DAG G schedule fT (i); i = 0;:::;Ng, in the ith iteration a new if all conditional independencies in P are entailed in G and permutation matrix P ∗ is proposed by flipping a fixed- Markovian if the factorization property (1) is satisfied. length m random interval in the current permutation P^, and In contrast, score-based methods measure the goodness checking whether to stay at the current P^ or move to the of fit of different graphs over data by optimizing a score proposed P ∗ with some probability. arXiv:2107.01658v1 [stat.ML] 4 Jul 2021 function with respect to the unknown (weighted) adjacency Motivated by the ARCS two-step framework, we pro- matrix B with a combinatorial constraint that the graph is pose an order-based method for learning Gaussian DAGs by DAG. Then a search procedure is used to find the best graph. optimizing a non-convex regularized likelihood score func- Commonly used search procedures include hill-climbing [8], tion with the following distinct features and advantages: [12], forward-backward search [9], dynamic, and integer programming [13]–[17]. Recently, [18], [19] proposed a fully First, we use a relaxation technique instead of the expen- continuous optimization for structure learning by introduc- sive search for a permutation matrix P in the non-convex ing a novel characterization of acyclicity constraint. space of permutation matrices. More precisely, we project Generally, the DAG search space is intractable for a large P onto the Birkhoff polytope (the convex space of doubly number of nodes p and the task of finding a DAG is NP- stochastic matrices) and then find the “closest” permutation hard [9]. Consequently, approximate methods have been matrix to the optimal doubly stochastic matrix (See Fig- proposed with additional assumptions such as bounded ure 2). Second, given P , we resort to the cyclic coordinate- maximum indegree of the node [20] or tree-like structures wise algorithm to recover the DAG structure entailed in the [21]. Alternatively, the ordering space (or the space of Cholesky factor L. We show that the optimization reduces topological ordering) has been exploited for score-based to p decoupled penalized regressions where each iteration of the cyclic coordinatewise algorithm has a closed form Department of Statistics, Texas A&M University, College Station, TX, 77843 solution. Third, we show consistency of our Cholesky factor 2 estimator for the non-convex score function when the true permutation matrix is known. To the best of our knowledge, consistency results for the sparse Cholesky factor estimator were established only for convex problems [25], [26]. The paper is organized as follows: Section 2 introduces background on Gaussian BNs and structural equation mod- els (SEMs). In Section 3, we derive and discuss the form of the score function. In Section 4, we introduce our Relaxed Regularized Cholesky Factor (RRCF) framework. The anal- yses of the simulated and real macro-economic datasets are contained in Section 5. For the real data analysis, we apply RRCF to solve the price puzzle, a classic problem in the economics literature. Section 6 provides statistical consis- tency of our estimator, and we conclude with a discussion Fig. 1: Illustration of DAG G, corresponding coefficient ma- in Section 7. trix B, permutation matrix P , and permuted strictly lower triangular matrix Bπ. 2 BAYESIAN NETWORKS t We start by introducing the following graphical concepts. If of B to a strictly lower triangular matrix Bπ = PπBPπ the graph G contains a directed edge from the node k ! j, by permuting rows and columns of B, respectively [27] G then k is a parent of its child j. We write Πj for the set (see Figure 1 for the illustrative example). Therefore, the of all parents of a node j. If there exist a directed path stringent acyclicity constraint on B transforms into the k !···! j, then k is an ancestor of its descendant j.A constraint that Bπ is a strictly lower triangular matrix, then Bayesian Network is a directed acyclic graph G whose nodes the linear SEM can be rewritten as represent random variables X1;:::;Xp. Then G encodes a set of conditional independencies and conditional probabil- PπX = BπPπX + Pπ"; (4) ity distributions for each variable. The DAG G = (V; E) is t using the fact that PπPπ = I. From (4), the inverse covari- characterized by the node set V = f1; : : : ; pg and the edge G ance matrix can be expressed as set E = f(i; j): i 2 Πj g ⊂ V × V . It is well-known that for −1 t −1 a BN, the joint distribution factorizes as: Σπ = (I − Bπ) Ωπ (I − Bπ); (5) p Ω = P ΩP t L = Y where π π π. Using (4) and (5) and defining π P (X ;:::;X ) = P (X jΠG) −1=2 1 p j j (1) Ωπ (I −Bπ), the relationship between the Cholesky factor j=1 −1 t Lπ of the inverse covariance matrix Σπ = LπLπ and the matrix Bπ is 2.1 Gaussian BN and Structural Equation Models p (Lπ)ij = −(Bπ)ij= !j; and It is known that a Gaussian BN can be equivalently repre- (6) (L ) = 0 () (B ) = 0 for every i ≥ j sented by the linear SEM [7]: π ij π ij Hence, L preserves the DAG structure of B ; i.e., non-zero X π π Xj = βjkXk + "j; j = 1; : : : ; p; (2) elements in Lπ correspond to directed edges in DAG G. G k2Πj 2 HE CORE UNCTION where "j ∼ N(0;!j ) are mutually independent and inde- 3 T S F G pendent of fXk : k 2 Πj g. Denoting B = (βjk) with zeros In this section, given data from the Gaussian BN (or SEM), along the diagonal, the vector representation of (2) is we derive the form of the score function used to recover the underlying DAG structure. A natural choice for such X = BX + "; (3) function is the log-likelihood function, which will be used t t where " := ("1;:::;"p) and X := (X1;:::;Xp) . Thus, one for the estimation of the permutation and Cholesky fac- can characterize the linear SEM X ∼ (B; Ω) by the weighted tor matrices. We assume that each row of data matrix n×p adjacency matrix B and the noise variance matrix Ω = X = (X1;:::;Xp) 2 R is an i.i.d observation from (2). 2 2 diag(!1;:::;!p). From (3), the inverse covariance matrix of Using reformulation (4), X ∼ N (0; Σ) is Σ−1 = (I − B)tΩ−1(I − B), and the edge p XP t = XP t Bt + EP t ; (7) set of the underlying DAG is equal to the support of the π π π π weighted adjacency matrix B; i.e., E = f(k; j): βjk 6= 0g, where each row of E is an i.i.d Np(0; Ω) vector. Thus, each t which defines the structure of DAG G.
Recommended publications
  • 7 LATTICE POINTS and LATTICE POLYTOPES Alexander Barvinok
    7 LATTICE POINTS AND LATTICE POLYTOPES Alexander Barvinok INTRODUCTION Lattice polytopes arise naturally in algebraic geometry, analysis, combinatorics, computer science, number theory, optimization, probability and representation the- ory. They possess a rich structure arising from the interaction of algebraic, convex, analytic, and combinatorial properties. In this chapter, we concentrate on the the- ory of lattice polytopes and only sketch their numerous applications. We briefly discuss their role in optimization and polyhedral combinatorics (Section 7.1). In Section 7.2 we discuss the decision problem, the problem of finding whether a given polytope contains a lattice point. In Section 7.3 we address the counting problem, the problem of counting all lattice points in a given polytope. The asymptotic problem (Section 7.4) explores the behavior of the number of lattice points in a varying polytope (for example, if a dilation is applied to the polytope). Finally, in Section 7.5 we discuss problems with quantifiers. These problems are natural generalizations of the decision and counting problems. Whenever appropriate we address algorithmic issues. For general references in the area of computational complexity/algorithms see [AB09]. We summarize the computational complexity status of our problems in Table 7.0.1. TABLE 7.0.1 Computational complexity of basic problems. PROBLEM NAME BOUNDED DIMENSION UNBOUNDED DIMENSION Decision problem polynomial NP-hard Counting problem polynomial #P-hard Asymptotic problem polynomial #P-hard∗ Problems with quantifiers unknown; polynomial for ∀∃ ∗∗ NP-hard ∗ in bounded codimension, reduces polynomially to volume computation ∗∗ with no quantifier alternation, polynomial time 7.1 INTEGRAL POLYTOPES IN POLYHEDRAL COMBINATORICS We describe some combinatorial and computational properties of integral polytopes.
    [Show full text]
  • On the Vertices of the D-Dimensional Birkhoff Polytope B)
    Discrete Comput Geom (2014) 51:161–170 DOI 10.1007/s00454-013-9554-5 On the Vertices of the d-Dimensional Birkhoff Polytope Nathan Linial · Zur Luria Received: 28 August 2012 / Revised: 6 October 2013 / Accepted: 9 October 2013 / Published online: 7 November 2013 © Springer Science+Business Media New York 2013 Abstract Let us denote by Ωn the Birkhoff polytope of n × n doubly stochastic ma- trices. As the Birkhoff–von Neumann theorem famously states, the vertex set of Ωn coincides with the set of all n × n permutation matrices. Here we consider a higher- (2) dimensional analog of this basic fact. Let Ωn be the polytope which consists of all tristochastic arrays of order n. These are n × n × n arrays with nonnegative entries in (2) which every line sums to 1. What can be said about Ωn ’s vertex set? It is well known that an order-n Latin square may be viewed as a tristochastic array where every line contains n − 1 zeros and a single 1 entry. Indeed, every Latin square of order n is (2) avertexofΩn , but as we show, such vertices constitute only a vanishingly small (2) subset of Ωn ’s vertex set. More concretely, we show that the number of vertices of (2) 3 −o(1) Ωn is at least (Ln) 2 , where Ln is the number of order-n Latin squares. We also briefly consider similar problems concerning the polytope of n × n × n arrays where the entries in every coordinate hyperplane sum to 1, improving a result from Kravtsov (Cybern. Syst.
    [Show full text]
  • The Geometry of the Simplex Method and Applications to the Assignment Problems
    The Geometry of the Simplex Method and Applications to the Assignment Problems By Rex Cheung SENIOR THESIS BACHELOR OF SCIENCE in MATHEMATICS in the COLLEGE OF LETTERS AND SCIENCE of the UNIVERSITY OF CALIFORNIA, DAVIS Approved: Jes´usA. De Loera March 13, 2011 i ii ABSTRACT. In this senior thesis, we study different applications of linear programming. We first look at different concepts in linear programming. Then we give a study on some properties of polytopes. Afterwards we investigate in the Hirsch Conjecture and explore the properties of Santos' counterexample as well as a perturbation of this prismatoid. We also give a short survey on some attempts in finding more counterexamples using different mathematical tools. Lastly we present an application of linear programming: the assignment problem. We attempt to solve this assignment problem using the Hungarian Algorithm and the Gale-Shapley Algorithm. Along with the algorithms we also give some properties of the Birkhoff Polytope. The results in this thesis are obtained collaboratively with Jonathan Gonzalez, Karla Lanzas, Irene Ramirez, Jing Lin, and Nancy Tafolla. Contents Chapter 1. Introduction to Linear Programming 1 1.1. Definitions and Motivations 1 1.2. Methods for Solving Linear Programs 2 1.3. Diameter and Pivot Bound 8 1.4. Polytopes 9 Chapter 2. Hirsch Conjecture and Santos' Prismatoid 11 2.1. Introduction 11 2.2. Prismatoid and Santos' Counterexample 13 2.3. Properties of Santos' Prismatoid 15 2.4. A Perturbed Version of Santos' Prismatoid 18 2.5. A Search for Counterexamples 22 Chapter 3. An Application of Linear Programming: The Assignment Problem 25 3.1.
    [Show full text]
  • The Alternating Sign Matrix Polytope
    The Alternating Sign Matrix Polytope Jessica Striker [email protected] University of Minnesota The Alternating Sign Matrix Polytope – p. 1/22 What is an alternating sign matrix? Alternating sign matrices (ASMs) are square matrices with the following properties: The Alternating Sign Matrix Polytope – p. 2/22 What is an alternating sign matrix? Alternating sign matrices (ASMs) are square matrices with the following properties: entries ∈{0, 1, −1} the sum of the entries in each row and column equals 1 nonzero entries in each row and column alternate in sign The Alternating Sign Matrix Polytope – p. 2/22 Examples of ASMs n =3 1 0 0 1 0 0 0 1 0 0 10 0 1 0 0 0 1 1 0 0 1 −1 1 0 0 1 0 1 0 0 0 1 0 10 0 1 0 0 0 1 0 0 1 0 0 1 1 0 0 0 1 0 1 0 0 0 1 0 1 0 0 The Alternating Sign Matrix Polytope – p. 3/22 Examples of ASMs n =4 0 100 0 1 00 1 −1 0 1 1 −1 10 0 010 0 1 −1 1 0 100 0 0 10 The Alternating Sign Matrix Polytope – p. 4/22 What is a polytope? A polytope can be defined in two equivalent ways ... The Alternating Sign Matrix Polytope – p. 5/22 What is a polytope? A polytope can be defined in two equivalent ways ... ...as the convex hull of a finite set of points The Alternating Sign Matrix Polytope – p. 5/22 What is a polytope? A polytope can be defined in two equivalent ways ..
    [Show full text]
  • On the Vertex Degrees of the Skeleton of the Matching Polytope of a Graph
    On the vertex degrees of the skeleton of the matching polytope of a graph Nair Abreu ∗ a, Liliana Costa y b, Carlos Henrique do Nascimento z c and Laura Patuzzi x d aEngenharia de Produ¸c~ao,COPPE, Universidade Federal do Rio de Janeiro, Brasil bCol´egioPedro II, Rio de Janeiro, Brasil cInstituto de Ci^enciasExatas, Universidade Federal Fluminense, Volta Redonda, Brasil dInstituto de Matem´atica,Universidade Federal do Rio de Janeiro, Brasil Abstract The convex hull of the set of the incidence vectors of the matchings of a graph G is the matching polytope of the graph, M(G). The graph whose vertices and edges are the vertices and edges of M(G) is the skeleton of the matching polytope of G, denoted G(M(G)). Since the number of vertices of G(M(G)) is huge, the structural properties of these graphs have been studied in particular classes. In this paper, for an arbitrary graph G, we obtain a formulae to compute the degree of a vertex of G(M(G)) and prove that the minimum degree of G(M(G)) is equal to the number of edges of G. Also, we identify the vertices of the skeleton with the minimum degree and characterize regular skeletons of the matching polytopes. arXiv:1701.06210v2 [math.CO] 1 Apr 2017 Keywords: graph, matching polytope, degree of matching. ∗Nair Abreu: [email protected]. The work of this author is partially sup- ported by CNPq, PQ-304177/2013-0 and Universal Project-442241/2014-3. yLiliana Costa: [email protected]. The work of this author is partially supported by the Portuguese Foundation for Science and Technology (FCT) by CIDMA -project UID/MAT/04106/2013.
    [Show full text]
  • The K-Assignment Polytope
    The k-assignment polytope Jonna Gill ∗ Matematiska institutionen, Linköpings universitet, 581 83 Linköping, Sweden Svante Linusson Matematiska Institutionen, Royal Institute of Technology(KTH), SE-100 44 Stockholm, Sweden Abstract In this paper we study the structure of the k-assignment polytope, whose vertices are the m × n (0,1)-matrices with exactly k 1:s and at most one 1 in each row and each column. This is a natural generalisation of the Birkhoff polytope and many of the known properties of the Birkhoff polytope are generalised. A representation of the faces by certain bipartite graphs is given. This tool is used to describe properties of the polytope, especially a complete description of the cover relation in the face poset of the polytope and an exact expression for the diameter. An ear decomposition of these bipartite graphs is constructed. Key words: Birkhoff polytope, partial matching, face poset, ear decomposition, assignment polytope 1 Introduction The Birkhoff polytope and its properties have been studied from different viewpoints, see e.g. [2,3,4,7]. The Birkhoff polytope Bn has the n × n per- mutation matrices as vertices and is known under many names, such as ‘The polytope of doubly stochastic matrices’ or ‘The assignment polytope’. A natu- ral generalisation of permutation matrices occurring both in optimisation and in theoretical combinatorics is k-assignments. A k-assignment is k entries in ∗ Corresponding author. Telephone number: +46 13282863. Fax number: +46 13100746 Email addresses: [email protected] (Jonna Gill ), [email protected] (Svante Linusson). Preprint submitted to Elsevier Science September 17, 2008 a matrix that are required to be in different rows and columns.
    [Show full text]
  • Doubly Stochastic Matrices and the Assignment Problem
    View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by NORA - Norwegian Open Research Archives DOUBLY STOCHASTIC MATRICES AND THE ASSIGNMENT PROBLEM by Maria Sivesind Mehlum Thesis for the degree of Master of Science (Master i Matematikk) Desember 2012 Faculty of Mathematics and Natural Sciences University of Oslo ii iii Acknowledgements First of all, I would like to thank my supervisor, Geir Dahl for valuable guidance and advice, for helping me see new expansions and for believing I could do this when I didn't do it myself. I would also like to thank my fellow "LAP Real" students, both the original and current, for long study hours and lunches filled with discussions, frustration and laughter. Further, I would like to thank my mum and dad for always supporting me, and for all their love and care through the years. Finally, I would like to thank everyone who has been there for me over the past 17 weeks with encouragement and motivational distractions which has made this time enjoyable. iv Contents 1 Introduction 1 2 Theoretical Background 3 2.1 Linear optimization . .3 2.2 Convexity . .5 2.3 Network - Flow . .9 3 Doubly stochastic matrices 13 3.1 The Birkhoff Polytope . 13 3.2 Examples . 14 3.3 Majorization . 16 4 The Assignment Problem 23 4.1 Introducing the problem . 23 4.2 Matchings . 24 4.3 K¨onig'sTheorem . 26 4.4 Algorithms . 27 4.4.1 The Hungarian Method . 28 4.4.2 The Shortest Augmenting Path Algorithms . 32 4.4.3 The Auction Algorithm .
    [Show full text]
  • Extended Formulations in Combinatorial Optimization
    Extended Formulations in Combinatorial Optimization Volker Kaibel∗ April 7, 2011 Abstract The concept of representing a polytope that is associated with some combinatorial optimization problem as a linear projection of a higher- dimensional polyhedron has recently received increasing attention. In this paper (written for the newsletter Optima of the Mathematical Opti- mization Society), we provide a brief introduction to this topic and sketch some of the recent developments with respect to both tools for construct- ing such extended formulations as well as lower bounds on their sizes. 1 Introduction Linear Programming based methods and polyhedral theory form the backbone of large parts of Combinatorial Optimization. The basic paradigm here is to identify the feasible solutions to a given problem with some vectors in such a way that the optimization problem becomes the problem of optimizing a linear function over the finite set X of these vectors. The optimal value of a linear function over X is equal to its optimal value over the convex hull conv(X) = P P f x2X λxx : x2X λx = 1; λ ≥ Og of X. According to the Weyl{Minkowski Theorem [35, 27], every polytope (i.e., the convex hull of a finite set of vec- tors) can be written as the set of solutions to a system of linear equations and inequalities. Thus one ends up with a linear programming problem. As for the maybe most classical example, let us consider the set M(n) of all matchings in the complete graph K = (V ;E ) on n nodes (where a matching arXiv:1104.1023v1 [math.CO] 6 Apr 2011 n n n is a subset of edges no two of which share a common end-node).
    [Show full text]
  • Faces of Birkhoff Polytopes 3
    FACES OF BIRKHOFF POLYTOPES ANDREAS PAFFENHOLZ Abstract. The Birkhoff polytope Bn is the convex hull of all (n × n) permutation matrices, i.e., matrices where precisely one entry in each row and column is one, and zeros at all other places. This is a widely studied polytope with various applications throughout mathematics. In this paper we study combinatorial types L of faces of a Birkhoff polytope. The Birkhoff dimension bd(L) of L is the smallest n such that Bn has a face with combina- torial type L. By a result of Billera and Sarangarajan, a combinatorial type L of a d-dimensional face appears in some Bk for k ≤ 2d, so bd(L) ≤ 2d. We will characterize those types with bd(L) ≥ 2d − 3, and we prove that any type with bd(L) ≥ d is either a product or a wedge over some lower dimensional face. Further, we computationally classify all d-dimensional combinatorial types for 2 ≤ d ≤ 8. 1. Introduction The Birkhoff polytope Bn is the convex hull of all (n×n) permutation matrices, i.e., matrices that have precisely one 1 in each row and column, and zeros at all other places. Equally, Bn is the set of all doubly stochastic (n × n)-matrices, i.e., non-negative matrices whose rows and columns all sum to 1, or the perfect matching polytope of the complete bipartite graph 2 2 Kn,n. The Birkhoff polytope Bn has dimension (n − 1) with n! vertices and n facets. The Birkhoff-von Neumann Theorem shows that Bn can be realized as the intersection of the positive orthant with a family of hyperplanes.
    [Show full text]
  • On Some Problems Related to 2-Level Polytopes
    On some problems related to 2-level polytopes Thèse n. 9025 2018 présenté le 22 Octobre 2018 à la Faculté des Sciences de Base chaire d’optimisation discrète programme doctoral en mathématiques École Polytechnique Fédérale de Lausanne pour l’obtention du grade de Docteur ès Sciences par Manuel Francesco Aprile acceptée sur proposition du jury: Prof Hess Bellwald Kathryn, président du jury Prof Eisenbrand Friedrich, directeur de thèse Prof Faenza Yuri, codirecteur de thèse Prof Svensson Ola, rapporteur Prof Fiorini Samuel, rapporteur Prof Weltge Stefan, rapporteur Lausanne, EPFL, 2018 "You must go from wish to wish. What you don’t wish for will always be beyond your reach." — Michael Ende, The Neverending Story. To my brother Stefano, to whom I dedicated my Bachelor thesis too. He is now six years old and he has not read it yet. Maybe one day? Acknowledgements My PhD has been an incredible journey of discovery and personal growth. There were mo- ments of frustration and doubt, when research just wouldn’t work, even depression, but there was also excitement, enjoyment, and satisfaction for my progresses. Oh, there was hard work too. A whole lot of people contributed to making this experience so amazing. First, I would like to thank my advisor Fritz Eisenbrand for his support, the trust he put in me, and for letting me do research in total freedom. He allowed me to go to quite some conferences and workshops, where I could expand my knowledge and professional network, as well as enjoying some traveling. A big “grazie” goes to Yuri Faenza, my co-advisor: in a moment when I was not sure about the direction of my research, he gave a talk about 2-level polytopes, immediately catching my interest, and, a few days after, he was introducing me to extended formulations with an improvised crash course.
    [Show full text]
  • Combinatorial Bounds on Nonnegative Rank and Extended
    COMBINATORIAL BOUNDS ON NONNEGATIVE RANK AND EXTENDED FORMULATIONS SAMUEL FIORINI, VOLKER KAIBEL, KANSTANTSIN PASHKOVICH, AND DIRK OLIVER THEIS ABSTRACT. An extended formulation of a polytope P is a system of linear inequalities and equations that describe some polyhedron which can be projected onto P . Extended formulations of small size (i.e., number of inequalities) are of interest, as they allow to model correspond- ing optimization problems as linear programs of small sizes. In this paper, we describe several aspects and new results on the main known approach to establish lower bounds on the sizes of extended formulations, which is to bound from below the number of rectangles needed to cover the support of a slack matrix of the polytope. Our main goals are to shed some light on the ques- tion how this combinatorial rectangle covering bound compares to other bounds known from the literature, and to obtain a better idea of the power as well as of the limitations of this bound. In particular, we provide geometric interpretations (and a slight sharpening) of Yannakakis’ [35] result on the relation between minimal sizes of extended formulations and the nonnegative rank of slack matrices, and we describe the fooling set bound on the nonnegative rank (due to Diet- zfelbinger et al. [7]) as the clique number of a certain graph. Among other results, we prove that both the cube as well as the Birkhoff polytope do not admit extended formulations with fewer inequalities than these polytopes have facets, and we show that every extended formulation of a d-dimensional neighborly polytope with Ω(d2) vertices has size Ω(d2).
    [Show full text]
  • Matching Polytopes, Toric Geometry, and the Non-Negative Part of The
    MATCHING POLYTOPES, TORIC GEOMETRY, AND THE TOTALLY NON-NEGATIVE GRASSMANNIAN ALEXANDER POSTNIKOV, DAVID E SPEYER, AND LAUREN WILLIAMS Abstract. In this paper we use toric geometry to investigate the topology of the totally non-negative part of the Grassmannian, denoted (Grk,n)≥0. This is a cell complex whose cells ∆G can be parameterized in terms of the combinatorics of plane-bipartite graphs G. To each cell ∆G we associate a certain polytope P (G). The polytopes P (G) are analogous to the well-known Birkhoff polytopes, and we describe their face lattices in terms of matchings and unions of matchings of G. We also demonstrate a close connection between the polytopes P (G) and matroid polytopes. We use the data of P (G) to define an associated toric variety XG. We use our technology to prove that the cell decomposition of (Grk,n)≥0 is a CW complex, and furthermore, that the Euler characteristic of the closure of each cell of (Grk,n)≥0 is 1. Contents 1. Introduction 1 2. The totally non-negative Grassmannian and plane-bipartite graphs 3 3. Toric varieties and their non-negative parts 8 4. Matching polytopes for plane-bipartite graphs 9 5. Connections with matroid polytopes and cluster algebras 12 6. (Grk,n)≥0 is a CW complex 13 7. The face lattice of P (G) 14 8. Appendix: numerology of the polytopes P (G) 17 References 18 arXiv:0706.2501v3 [math.AG] 14 Oct 2008 1. Introduction The classical theory of total positivity concerns matrices in which all minors are non-negative.
    [Show full text]