Smoothed Analysis: An Attempt to Explain the Behavior of Algorithms in Practice∗
Daniel A. Spielman
Department of Computer Science, Yale University
[email protected]

Shang-Hua Teng†
Department of Computer Science, Boston University
[email protected]

∗ This material is based upon work supported by the National Science Foundation under Grants No. CCR-0325630 and CCF-0707522. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. Because of CACM's strict constraints on bibliography, we have had to cut down the citations in this article; we will post a version with a more complete bibliography on our webpage.

† Affiliation after the summer of 2009: Department of Computer Science, University of Southern California.

ABSTRACT

Many algorithms and heuristics work well on real data, despite having poor complexity under the standard worst-case measure. Smoothed analysis [36] is a step towards a theory that explains the behavior of algorithms in practice. It is based on the assumption that inputs to algorithms are subject to random perturbation and modification in their formation. A concrete example of such a smoothed analysis is a proof that the simplex algorithm for linear programming usually runs in polynomial time when its input is subject to modeling or measurement noise.

1. MODELING REAL DATA

"My experiences also strongly confirmed my previous opinion that the best theory is inspired by practice and the best practice is inspired by theory."
[Donald E. Knuth: "Theory and Practice", Theoretical Computer Science, 90 (1), 1–15, 1991.]

Algorithms are high-level descriptions of how computational tasks are performed. Engineers and experimentalists design and implement algorithms, and generally consider them a success if they work in practice. However, an algorithm that works well in one practical domain might perform poorly in another. Theorists also design and analyze algorithms, with the goal of providing provable guarantees about their performance. The traditional goal of theoretical computer science is to prove that an algorithm performs well in the worst case: if one can prove that an algorithm performs well in the worst case, then one can be confident that it will work well in every domain. However, there are many algorithms that work well in practice that do not work well in the worst case. Smoothed analysis provides a theoretical framework for explaining why some of these algorithms do work well in practice.

The performance of an algorithm is usually measured by its running time, expressed as a function of the input size of the problem it solves. The performance profiles of algorithms across the landscape of input instances can differ greatly and can be quite irregular. Some algorithms run in time linear in the input size on all instances, some take quadratic or higher-order polynomial time, while some may take an exponential amount of time on some instances.

Traditionally, the complexity of an algorithm is measured by its worst-case performance. If a single input instance triggers an exponential run time, the algorithm is called an exponential-time algorithm. A polynomial-time algorithm is one that takes polynomial time on all instances. While polynomial-time algorithms are usually viewed as being efficient, we clearly prefer those whose run time is a polynomial of low degree, especially those that run in nearly linear time.

It would be wonderful if every algorithm that ran quickly in practice were a polynomial-time algorithm. As this is not always the case, the worst-case framework is often the source of discrepancy between the theoretical evaluation of an algorithm and its practical performance.

It is commonly believed that practical inputs are usually more favorable than worst-case instances. For example, it is known that the special case of the Knapsack problem in which one must determine whether a set of n numbers can be divided into two groups of equal sum does not have a polynomial-time algorithm, unless NP is equal to P. Shortly before he passed away, Tim Russert of NBC's "Meet the Press" commented that the 2008 election could end in a tie between the Democratic and the Republican candidates. In other words, he solved a 51-item Knapsack problem¹ by hand within a reasonable amount of time, and most likely without using the pseudo-polynomial-time dynamic-programming algorithm for Knapsack!

¹ In presidential elections in the United States, each of the 50 states and the District of Columbia is allocated a number of electors. All but the states of Maine and Nebraska use a winner-take-all system, with the candidate winning the majority of votes in a state being awarded all of that state's electors. The winner of the election is the candidate who is awarded the most electors. Due to the exceptional behavior of Maine and Nebraska, the problem of whether the general election could end with a tie is not a perfect Knapsack problem.
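The question Russert answered in his head is the equal-sum (partition) special case of Knapsack described above. A minimal sketch of the pseudo-polynomial dynamic program mentioned above might look like the following; the function name and the toy inputs are ours, purely for illustration.

```python
def can_split_evenly(weights):
    """Decide whether `weights` can be split into two groups of equal sum,
    using the pseudo-polynomial dynamic program over reachable subset sums.
    Runs in time roughly O(n * sum(weights))."""
    total = sum(weights)
    if total % 2:
        return False
    target = total // 2
    reachable = {0}                       # subset sums reachable so far
    for w in weights:
        reachable |= {s + w for s in reachable if s + w <= target}
    return target in reachable

# Toy instance: 4 + 6 = 3 + 7, so an even split exists.
print(can_split_evenly([3, 4, 6, 7]))     # True
print(can_split_evenly([3, 4, 6, 8]))     # False (total sum is odd)
```

The running time is polynomial in n and in the magnitude of the numbers, not in the number of bits needed to write them down, which is why this counts as pseudo-polynomial rather than polynomial time.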
In our field, the simplex algorithm is the classic example of an algorithm that is known to perform well in practice but has poor worst-case complexity. The simplex algorithm solves a linear program, for example, of the form

    max c^T x  subject to  Ax ≤ b,    (1)

where A is an m×n matrix, b is an m-place vector, and c is an n-place vector. In the worst case, the simplex algorithm takes exponential time [25].
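For concreteness, here is one way to pose and solve a small instance of a linear program in the form (1). The instance data are made up for illustration, and we use SciPy's linprog routine as the solver; its backend need not be the textbook simplex method whose worst-case behavior is discussed above.

```python
import numpy as np
from scipy.optimize import linprog

# A small instance of (1): max c^T x subject to A x <= b.
A = np.array([[1.0, 1.0],
              [2.0, 1.0]])      # m x n constraint matrix
b = np.array([4.0, 6.0])        # m-place vector
c = np.array([3.0, 2.0])        # n-place objective vector

# linprog minimizes, so we negate c to maximize c^T x.  bounds=(None, None)
# matches form (1), which places no sign restriction on x.
result = linprog(-c, A_ub=A, b_ub=b, bounds=(None, None))
print(result.x, -result.fun)    # optimal x and optimal value c^T x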
Developing rigorous mathematical theories that explain the observed performance of practical algorithms and heuristics has become an increasingly important task in Theoretical Computer Science. However, modeling observed data and practical problem instances is a challenging task, as insightfully pointed out in the 1999 "Challenges for Theory of Computing" Report for an NSF-Sponsored Workshop on Research in Theoretical Computer Science²:

"While theoretical work on models of computation and methods for analyzing algorithms has had enormous payoff, we are not done. In many situations, simple algorithms do well. Take for example the Simplex algorithm for linear programming, or the success of simulated annealing of certain supposedly intractable problems. We don't understand why! It is apparent that worst-case analysis does not provide useful insights on the performance of algorithms and heuristics and our models of computation need to be further developed and refined. Theoreticians are investing increasingly in careful experimental work leading to identification of important new questions in algorithms area. Developing means for predicting the performance of algorithms and heuristics on real data and on real computers is a grand challenge in algorithms."

Needless to say, there are a multitude of algorithms beyond simplex and simulated annealing whose performance in practice is not well explained by worst-case analysis. We hope that theoretical explanations will be found for the success in practice of many of these algorithms, and that these theories will catalyze better algorithm design.

The instance-based complexity measure T_A[·], which records the performance of an algorithm A on every instance x of its input domain Ω, can be viewed as a function from Ω to R+. But it is unwieldy: to compare two algorithms, we require a more concise complexity measure.

An input domain Ω is usually viewed as the union of a family of subdomains {Ω_1, ..., Ω_n, ...}, where Ω_n represents all instances in Ω of size n. For example, in sorting, Ω_n is the set of all tuples of n elements; in graph algorithms, Ω_n is the set of all graphs with n vertices; and in computational geometry, we often have Ω_n ⊆ R^n. In order to succinctly express the performance of an algorithm A, for each Ω_n one defines a scalar T_A(n) that summarizes the instance-based complexity measure T_A[·] of A over Ω_n. One often further simplifies this expression by using big-O or big-Θ notation to express T_A(n) asymptotically.

2.1 Traditional Analyses

It is understandable that different approaches to summarizing the performance of an algorithm over Ω_n can lead to very different evaluations of that algorithm. In Theoretical Computer Science, the most commonly used measures are the worst-case measure and the average-case measures.

The worst-case measure is defined as

    WC_A(n) = max_{x ∈ Ω_n} T_A[x].

The average-case measures have more parameters. In each average-case measure, one first determines a distribution of inputs and then measures the expected performance of an algorithm assuming inputs are drawn from this distribution. Supposing S provides a distribution over each Ω_n, the average-case measure according to S is

    Ave^S_A(n) = E_{x ∈_S Ω_n} [ T_A[x] ],

where we use x ∈_S Ω_n to indicate that x is randomly chosen from Ω_n according to distribution S.

2.2 Critique of Traditional Analyses

Low worst-case complexity is the gold standard for an algorithm. When it is low, the worst-case complexity provides an absolute guarantee on the performance of the algorithm, no matter which input it is given. Algorithms with good worst-case performance have been developed for a great number of problems.

However, there are many problems that need to be solved in practice for which we do not know algorithms with good worst-case performance.
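To make the two summaries defined in Section 2.1 concrete, the sketch below computes both of them for one specific algorithm: insertion sort on n-element sequences, with T_A[x] taken to be the number of comparisons and with S the uniform distribution over permutations. The choice of algorithm, complexity measure, and distribution is ours, purely for illustration.

```python
import random

def insertion_sort_cost(a):
    """T_A[x]: number of element comparisons insertion sort makes on input a."""
    a = list(a)
    comparisons = 0
    for i in range(1, len(a)):
        key, j = a[i], i - 1
        while j >= 0:
            comparisons += 1
            if a[j] > key:
                a[j + 1] = a[j]     # shift larger element right
                j -= 1
            else:
                break
        a[j + 1] = key
    return comparisons

n = 200

# Worst-case measure WC_A(n): for insertion sort over permutations this is
# attained on reverse-sorted input, giving n*(n-1)/2 comparisons.
worst_case = insertion_sort_cost(range(n, 0, -1))

# Average-case measure Ave^S_A(n), with S the uniform distribution over
# permutations, estimated here by sampling (roughly n^2/4 comparisons).
samples = []
for _ in range(200):
    x = list(range(n))
    random.shuffle(x)
    samples.append(insertion_sort_cost(x))
average_case = sum(samples) / len(samples)

print(worst_case, average_case)
```

For this particular algorithm the two measures differ only by a constant factor; the algorithms discussed in the rest of the article are precisely those for which the gap between worst-case and typical behavior is far more dramatic.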