The Mathematical Work of Daniel Spielman
Michel X. Goemans and Jonathan A. Kelner

Michel X. Goemans is Leighton Family Professor of Applied Mathematics at the Massachusetts Institute of Technology. His email address is [email protected]. Jonathan Kelner is the Kokusai Denshin Denwa Assistant Professor of Applied Mathematics at MIT and a member of the MIT Computer Science and Artificial Intelligence Laboratory. His email address is [email protected].

The Notices solicited the following article describing the work of Daniel Spielman, recipient of the 2010 Nevanlinna Prize. The International Mathematical Union also issued a news release about the prize, which appeared in the December 2010 issue of the Notices.

The Rolf Nevanlinna Prize is awarded once every four years at the International Congress of Mathematicians by the International Mathematical Union for outstanding contributions in mathematical aspects of information sciences. The 2010 recipient was Daniel Spielman, who was cited for "smoothed analysis of Linear Programming, algorithms for graph-based codes, and applications of graph theory to Numerical Computing".

In this article, we summarize some of Spielman's seminal contributions in these areas. Unfortunately, because of space constraints, we can barely scratch the surface and have to leave out many of his impressive results (and their interconnections) over the last two decades.

Error-Correcting Codes

Several of Spielman's important early contributions were in the design of error-correcting codes. Error-correcting codes are fundamental in our increasingly digital lives, and they are important mathematical tools in the theory of computing. In a code of block length n and rate r < 1 over the alphabet {0, 1}, a message in {0, 1}^{rn} is encoded into a codeword in {0, 1}^n. To be able to correct errors in the transmission of a codeword, one would like the minimum distance d between two codewords to be large. A code is said to be asymptotically good if both the rate r and the relative distance d/n are bounded below by positive constants as n grows. The goal is to get not only asymptotically good codes with the best possible trade-off between the rate and the relative minimum distance but also codes for which both the encoding and the decoding can be done as fast as possible, ideally in linear time.

In his Ph.D. work, Spielman and his advisor Michael Sipser proposed the first asymptotically good codes that allow linear-time decoding (for a survey, see [6] and references therein). Their codes are low-density parity check (LDPC) codes, introduced by Gallager half a century ago, in which a sparse bipartite graph links the bits of the codeword on one side to parity checks on the other side that constrain sets of bits to sum to 0 over GF(2); these are linear codes. The Sipser and Spielman codes use a bipartite expander graph as the parity check graph. The expansion property not only implies a good bound on the minimum distance but also allows a simple decoding algorithm, in which one repeatedly flips codeword bits with more than half of their neighboring constraints violated. The Sipser and Spielman codes require quadratic time for encoding, as in any linear code. Subsequently, Spielman (see also [6]) provided a more intricate construction, still based on expanders, that yielded the first asymptotically good codes with linear-time encoding and decoding.
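To make the flipping rule concrete, here is a minimal Python sketch of such a decoder on a bipartite parity-check graph. It is our illustration of the rule described above, not code from the papers under discussion, and the representation (checks[j] lists the codeword positions constrained by parity check j) is chosen only for readability.

```python
# Minimal sketch of bit-flipping decoding on a bipartite parity-check graph
# (an illustration of the rule described above, not the authors' code).
def flip_decode(word, checks, max_rounds=1000):
    """checks[j] = positions of the codeword bits constrained by parity check j.
    Repeatedly flip a bit for which more than half of its checks are violated."""
    word = list(word)
    # neighbors[i] = indices of the parity checks that involve bit i
    neighbors = [[] for _ in range(len(word))]
    for j, positions in enumerate(checks):
        for i in positions:
            neighbors[i].append(j)
    for _ in range(max_rounds):
        # a check is violated if its bits do not sum to 0 over GF(2)
        violated = [sum(word[i] for i in positions) % 2 == 1 for positions in checks]
        for i, adj in enumerate(neighbors):
            if adj and 2 * sum(violated[j] for j in adj) > len(adj):
                word[i] ^= 1   # flip this bit and start the next round
                break
        else:
            break              # no bit qualifies; the decoder has converged
    return word
```

Each such flip strictly decreases the number of violated checks, and, roughly speaking, the expansion property guarantees that a qualifying bit exists as long as the number of errors is small enough; this is the heart of the linear-time decoding guarantee.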
In later work, Spielman and collaborators designed highly practical LDPC codes for the erasure channel model, in which some of the bits are simply lost. Their codes [4] approach the maximum capacity possible according to Shannon's theory, and they can be encoded and decoded in linear time. The decoding is based on belief propagation, which sets a missing bit whenever it can be recovered unambiguously. Despite the simplicity of these algorithms, the design and analysis of these codes are mathematically quite involved. Some of this work has had considerable practical impact.

Smoothed Analysis of Algorithms

One of Daniel Spielman's most celebrated contributions is his introduction of a new notion for analyzing the efficiency of algorithms, the concept of smoothed analysis, and its powerful and highly technical illustration on the simplex algorithm for linear programming. This was done in joint work with Shang-Hua Teng, his longtime collaborator.

The traditional way to measure the efficiency of an algorithm for a computational problem (such as inverting a matrix, factoring a number, or computing a shortest path in a graph) is to measure its worst-case running time over all possible instances of a given input size. Efficient algorithms are considered to be those whose worst-case running times grow polynomially with the size of the input. But this worst-case, pessimistic measure does not always reflect the behavior of algorithms on typical instances that arise in practice.

A compelling example of this is the case of linear programming, in which one aims to maximize a linear function subject to linear constraints: max{⟨c, x⟩ : Ax ≤ b, x ∈ R^d} with the inputs c ∈ R^d, A ∈ R^{m×d}, and b ∈ R^m. This problem has numerous industrial applications. In the late 1940s, George Dantzig proposed the simplex algorithm for linear programming, which has been cited as one of the "top ten algorithms of the 20th century" (Dongarra and Sullivan). In the nondegenerate case, the simplex algorithm can be viewed as starting from a vertex of the polyhedron P = {x ∈ R^d : Ax ≤ b} and repeatedly moving to an adjacent vertex (connected by an edge, or face of dimension 1) while improving the value ⟨c, x⟩. However, no pivot rule (dictating which adjacent vertex is chosen next) is known for which the worst-case number of operations is polynomial in the size of the input. Even worse, for almost every known pivot rule, there exist instances for which the number of steps grows exponentially. Whether there exist polynomial-time algorithms for linear programming was a long-standing open question until the discovery of the ellipsoid algorithm (by Nemirovski and Shor, and Khachian) in the late 1970s and interior-point algorithms (first by Karmarkar) in the mid-1980s. Still, the simplex algorithm is the most often used algorithm for linear programming, as it performs extremely well on typical instances that arise in practice. This almost paradoxical disparity between its exponential worst-case behavior and its fast typical behavior called for an explanation. In the 1980s, researchers considered probabilistic analyses of the simplex method under various probabilistic models, but results in one model do not necessarily carry over to other probabilistic models and do not necessarily shed any light on instances that occur in practice.

Spielman and Teng [7] instead proposed smoothed analysis, a blend of worst-case and probabilistic analyses, which marvelously explains the typical behavior of the simplex algorithm. In smoothed analysis, one measures the maximum over all instances of a certain size of the expected running time of an algorithm under small random perturbations of the input. In the setting of the simplex algorithm for linear programming, Spielman and Teng consider any linear program given by c ∈ R^d, A ∈ R^{m×d}, and b ∈ R^m and randomly perturb A and b by independently adding to each entry a Gaussian random variable of variance σ² times the maximum entry of A and b. Using an intricate technical argument, they prove that the expected number of steps of the simplex algorithm with the so-called shadow-vertex pivot rule is polynomial in the dimensions of A and 1/σ. The shadow-vertex pivot rule essentially corresponds to walking along the projection of the polyhedron onto a 2-dimensional linear subspace containing c. One step of the proof is to bound the expected number of vertices along a random 2-dimensional projection by a polynomial in m, d, and 1/σ. For this purpose, they prove that the angle at a vertex of the projection is unlikely to be too flat, and this is formalized and proved by showing that a random Gaussian perturbation of any given matrix is sufficiently well conditioned, a fundamental notion in numerical linear algebra.
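Written schematically (the notation is ours), the smoothed measure just described is

\[
T_{\mathrm{smoothed}}(n,\sigma) \;=\; \max_{|I| = n} \; \mathbb{E}_{g}\bigl[\, T(I + \sigma g) \,\bigr],
\]

where the maximum ranges over inputs I of size n, g denotes a Gaussian perturbation scaled as described above, and T is the running time. As σ → 0 this recovers worst-case analysis, while for large σ the perturbation dominates the instance and the measure approaches an average-case analysis.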
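For orientation only, the perturbation step itself is easy to state in code. The following numpy sketch perturbs the data of a linear program in the spirit described above; the function name and the exact normalization of the noise are our assumptions, not taken from [7].

```python
import numpy as np

def perturb_lp(A, b, sigma, rng=None):
    """Add independent Gaussian noise to every entry of A and b, with scale set
    by sigma times the largest entry in magnitude (a sketch of the perturbation
    model; the precise normalization used in the smoothed analysis differs in
    details)."""
    rng = np.random.default_rng() if rng is None else rng
    scale = sigma * max(np.abs(A).max(), np.abs(b).max())
    return A + rng.normal(0.0, scale, A.shape), b + rng.normal(0.0, scale, b.shape)
```

One would then run the simplex method on the perturbed program and average the number of pivot steps over the draws of the noise.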
Their work has motivated further developments, including better estimates on the condition number of randomly perturbed matrices, other probabilistic models for the perturbations, and smoothed analyses for other algorithms and problems (see the Smoothed Analysis page in [5]). Smoothed analysis also suggested the first randomized polynomial-time simplex algorithm for linear programming by Daniel Spielman and the second author [3]. This could be a step toward finding a strongly (not depending on the size of the numbers involved) polynomial-time algorithm for linear programming.

Fast Algorithms for Numerical Linear Algebra and for Graph Problems

In recent years, Spielman and his collaborators have ignited what appears to be an incipient revolution in the theory of graph algorithms. By developing and exploiting deep connections between graph theory and computational linear algebra, they have introduced a new set of tools that have the potential to transform both fields. This work has already resulted in faster algorithms for several fundamental problems in both linear algebra and graph theory.

Fast Algorithms for Laplacian Linear Systems

With any graph G, one can associate a matrix L_G known as the graph Laplacian. Much of Spielman's

sparsification. In addition to being a surprising result in graph theory, their techniques have led to a new simpler proof of an important theorem by Bourgain and Tzafriri in functional analysis and a new approach to the long-open Kadison-Singer conjecture.
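For completeness, the graph Laplacian introduced at the start of the last subsection has a simple explicit form. For an unweighted graph G = (V, E) with adjacency matrix A_G and diagonal degree matrix D_G,

\[
L_G \;=\; D_G - A_G, \qquad x^{\mathsf{T}} L_G\, x \;=\; \sum_{\{i,j\} \in E} (x_i - x_j)^2 \quad \text{for all } x \in \mathbb{R}^{V},
\]

so L_G is symmetric and positive semidefinite; this quadratic form is one concrete bridge between combinatorial properties of G and numerical linear algebra (weighted graphs are handled by weighting each term).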