How to Rank with Few Errors

A PTAS for Weighted Feedback Arc Set on Tournaments∗

Claire Kenyon-Mathieu and Warren Schudy†
Brown University Computer Science, Providence, RI

ABSTRACT

We present a polynomial time approximation scheme (PTAS) for the minimum feedback arc set problem on tournaments. A simple weighted generalization gives a PTAS for Kemeny-Young rank aggregation.

Categories and Subject Descriptors

F.2.2 [Analysis of Algorithms and Problem Complexity]: Nonnumerical Algorithms and Problems—Computations on discrete structures

General Terms

Algorithms, Theory

Keywords

Feedback arc set, Kemeny-Young rank aggregation, Max acyclic subgraph, Polynomial-time approximation scheme, Tournament graphs

1. INTRODUCTION

Suppose you ran a chess tournament, everybody played everybody, and you wanted to use the results to rank everybody. Unless you were really lucky, the results would not be acyclic, so you could not just sort the players by who beat whom. A natural objective is to find a ranking that minimizes the number of upsets, where an upset is a pair of players where the player ranked lower on the ranking beats the player ranked higher. This is the minimum feedback arc set (FAS) problem on tournaments. Our main result is a polynomial time approximation scheme (PTAS) for this problem. A simple weighted generalization gives a PTAS for Kemeny-Young rank aggregation.

For general directed graphs, the FAS problem consists of removing the fewest number of edges so as to make the graph acyclic, and has applications such as scheduling [21] and graph layout [42, 13] (see also [34, 5, 27]). This problem has been much studied, both in the mathematical programming community [43, 23, 24, 27, 32] and in the approximation algorithms community [33, 38, 19]. The best known approximation ratio is O(log n log log n). Unfortunately the problem is NP-hard to approximate better than 1.36... [28, 15].

A tournament is the special case of directed graphs where every pair of vertices is connected by exactly one of the two possible directed edges. For tournaments, the FAS problem also has a long history, starting in the early 1960s in combinatorics [37, 44] and statistics [39]. In combinatorics and discrete probability, much early work [18, 35, 36, 26, 40, 41, 20] focused on worst-case tournaments, starting with Erdős and Moon and culminating with work by Fernandez de la Vega. In statistics and psychology, one motivation is ranking by paired comparisons: here, you wish to sort some set by some objective but you do not have access to the objective, only a way to compare a pair and see which is greater; for example, determining people's preferences for types of food. Early interesting heuristics due to Slater and Alway can be found in [39]. Unfortunately, even on tournaments, the FAS problem is still NP-hard [1, 3] (see also [11]). The best approximation algorithms achieve constant factor approximations [1, 12, 45]: 2.5 in the randomized setting [1], 3 in the deterministic setting [45, 46, 1]. Our main result is a polynomial time approximation scheme (PTAS) for this problem: thus the problem really is provably easier to approximate on tournaments than on general graphs.

Here is a weighted generalization.

Problem 1 (Weighted FAS Tournament). Input: complete directed graph with vertex set V, n ≡ |V|, and nonnegative edge weights w_ij with w_ij + w_ji ∈ [b, 1] for some fixed constant b ∈ (0, 1]. Output: A total order R(x, y) over V minimizing Σ_{{x,y} ⊂ V} w_xy R(y, x).

(The unweighted problem is the special case where w_ij = 1 if (i, j) is an edge and w_ij = 0 otherwise.)

∗ An extended version is available [31].
† Corresponding author. Email: [email protected]
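To make the objective of Problem 1 concrete, here is a small illustrative Python sketch (ours, not part of the paper; the helper name fas_cost is invented for this example). It evaluates the cost of a candidate ordering by charging w_xy for every pair that the ordering places with y before x.

    # Illustrative only: evaluate the Problem 1 objective for a candidate ordering.
    # w[x][y] is the weight of arc x -> y; we pay w[x][y] whenever y is ranked before x.
    def fas_cost(w, order):
        pos = {v: i for i, v in enumerate(order)}
        return sum(w[x][y]
                   for x in pos for y in pos
                   if x != y and pos[y] < pos[x])

    # A 3-cycle a -> b -> c -> a forces at least one upset in any ordering.
    w = {'a': {'b': 1, 'c': 0}, 'b': {'a': 0, 'c': 1}, 'c': {'a': 1, 'b': 0}}
    print(fas_cost(w, ['a', 'b', 'c']))  # prints 1: the arc c -> a is the upset

In the unweighted special case this is exactly the number of upsets of the ranking.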
In [1], the variant where b = 1 is called weighted FAS tournament with probability constraints. Indeed, sampling a population naturally leads to defining w_ij as the probability that type i is preferred to type j. We note that all known approximation algorithms [1, 12] extend to weighted tournaments for b = 1.

An important application of weighted FAS tournaments is rank aggregation. Frequently, one has access to several rankings of objects of some sort, such as search engine outputs [16, 17], and desires to aggregate the input rankings into a single output ranking that is similar to all of the input rankings: it should have minimum average distance from the input rankings, for some notion of distance. This ancient problem was already studied in the context of voting by Borda [6] and Condorcet [10] in the 18th century, and has aroused renewed interest in machine learning [9]. A natural notion of distance is the number of pairs of vertices that are in different orders: this defines the Kemeny-Young rank aggregation problem [29, 30]. This problem is NP-hard, even with only 4 voters [17]. Constant factor approximation algorithms are known: choosing a random (or best) input ranking as the output gives a 2-approximation; finding the optimal aggregate ranking using the footrule distance as the metric instead of the Kendall-Tau distance also gives a 2-approximation [16]. There is also a randomized 4/3 approximation algorithm [1, 2]. We improve on these results by designing a polynomial-time approximation scheme.

Theorem 1 (PTAS). There are randomized algorithms for minimum Feedback Arc Set on weighted tournaments and for Kemeny-Young rank aggregation that, given λ > 0, output orderings with expected cost less than (1 + λ)OPT. Let ε = λb. The running time is

  O((1/ε) n^6 + n^2 2^{Õ(1/ε)} + n 2^{2^{Õ(1/ε)}}).

The algorithms can be derandomized at the cost of increasing the running time to n^{2^{Õ(1/ε)}}.

Our algorithm combines several existing algorithmic tools. It uses (a) existing constant factor approximation algorithms. In addition, it uses (b) existing polynomial-time approximation schemes for the complementary maximization problem (a.k.a. max acyclic subgraph on tournaments), due to Arora, Frieze and Kaplan [4] and to Frieze and Kannan [22]. Moreover, our central algorithm, Algorithm 2, also uses (c) the divide-and-conquer paradigm. Of course, this has been previously used in this setting, in partitioning algorithms in general graphs [33] and in a Quicksort-type algorithm [1], but both of these constructions are top-down, whereas our algorithm is bottom-up, which gives it a very different flavor. Finally, it also relies on (d) a simple but useful technique, used already in 1961 by Slater and Alway [39]: iteratively improve the current ordering by taking a vertex out of the ordering and moving it back at a different position. (Such single vertex moves and variants have been studied at some length since that time [25, 44, 16, 17].)

It is fortunate that the FAS problem on tournaments has an approximation scheme. The closely related problem of correlation clustering on complete graphs is APX-hard [8].¹ The related problem of feedback vertex set is also hard to approximate even on tournaments, though it does have a 2.5-approximation algorithm [7].

We note that Amit Agarwal has informed us that he recently obtained similar results.

¹ This problem is identical to FAS except it deals with symmetric transitive relations instead of antisymmetric ones.

2. ALGORITHM

In this section, we present our deterministic and randomized algorithms and reduce one to the other.

Our algorithms use the sampling-based additive approximation algorithms of [4, 22] as a subroutine. For simplicity, here we use the derandomized additive error algorithm. Section 4.1 discusses how to extend our analysis to use the randomized version, yielding the runtime in Theorem 1.

Theorem 2 (Small additive error). [4, 22] There is a randomized polynomial-time approximation scheme for maximum acyclic subgraph on weighted tournaments. Given β, η > 0, the algorithm outputs, in time [22]

  O(n/β^4 + 2^{O(1/β^2)} log(1/η)),

an ordering whose cost is, with probability at least 1 − η, less than OPT + βn^2. The algorithm can be derandomized into an algorithm which we denote by AddApprox, which increases the runtime to n^{O(1/β^4)}.

We also use a simple type of local search. A total order R can be specified by an ordering π : V → {1, ..., n} such that π(x) is the position of vertex x: π(x) < π(y) iff R(x, y). A single vertex move, given an ordering π, a vertex x and a position i, consists of taking x out of π and putting it back in position i.
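As a concrete illustration of this local-search primitive (again an illustrative sketch, not code from the paper), the following Python fragment applies a single vertex move and tests whether it lowers the Problem 1 objective, reusing the fas_cost helper sketched after Problem 1.

    # Illustrative only: take x out of the ordering and reinsert it at position i.
    def single_vertex_move(order, x, i):
        rest = [v for v in order if v != x]
        return rest[:i] + [x] + rest[i:]

    # One step of a naive local search: accept the move if it strictly lowers the cost.
    def improves(w, order, x, i):
        return fas_cost(w, single_vertex_move(order, x, i)) < fas_cost(w, order)

Recomputing the full cost is only for clarity here: a single vertex move changes the relative order only of pairs involving x, so the change in cost can be computed in O(n) time.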
All of our algorithms call the additive approximation algorithm with parameter

  β = β(ε) ≡ (1/2) · 9^{−log_{3/2}(1/ε^2)/ε} · ε^3 = 2^{−O((1/ε) log(1/ε))}.

Our deterministic PTAS is given in Algorithm 1. Recall from Problem 1 that b is the lower bound on w_xy + w_yx for every pair {x, y}. For ease of thinking, the reader may think of b as being equal to 1. Also recall that we are seeking a 1 + λ = 1 + ε/b approximation.

Algorithm 1  Deterministic polynomial time approximation scheme of Theorem 1 (PTAS)
  Given: Fixed parameters ε > 0 and b ∈ (0, 1].
  Input: A weighted tournament.
  Round weights to integer multiples of ε/n^2.
  π ← constant factor approximation [1, 12].
  While some move decreases cost, do it. Types of moves:
    1. Single vertex moves. Choose vertex x and rank j; take x out of the ordering and put it back in so that its rank is j.
    2. Additive approximation. Choose integers i < j; let U = {vertices with rank in [i, j]}.
       π'_U ← AddApprox(U) with parameter β(ε).
       Replace the restriction π_U of π to U by π'_U.
  Output: π.

Our (somewhat faster) randomized PTAS is given in Algorithm 2. Curiously, if the constant factor approximation algorithm used is the sorting-by-indegree algorithm from [6, 12], the initial order π^local computed in Algorithm 2 is precisely the order returned by the heuristic algorithm created by Slater and Alway [39] in 1961!

Algorithm 2  Randomized polynomial time approximation scheme (RPTAS) of Theorem 1. Vertical lines in the left margin denote differences from Algorithm 3.
  Given: Fixed parameters ε > 0 and b ∈ (0, 1].
  Input: A weighted tournament.
  Round weights to integer multiples of ε/n^2.
  π ← constant factor approximation [1, 12].
  Apply single vertex moves to get a local optimum π^local.
  Output: π^out = Rec(1, n)

  Rec(i, j) =
    If i = j then Return i
    For any ℓ, m, let V_{ℓ,m} = {vertices x : π^local(x) ∈ [ℓ, m]}
    K ← {k : |V_{i,k}|, |V_{k+1,j}| ≥ |V_{i,j}|/3}
    Choose k uniformly at random from K
    ρ_1 ← AddApprox(V_{i,j}) with parameter β(ε)
    ρ_2 ← Concatenate Rec(i, k) and Rec(k + 1, j)
    Return ρ_1 or ρ_2, whichever has lower cost.

Algorithm 3  Virtual algorithm. Vertical lines in the left margin denote differences from Algorithm 2.
  Given: Fixed parameters ε > 0 and b ∈ (0, 1].
  Input: Weighted tournament; optimal ordering π*.
  Round weights to integer multiples of ε/n^2.
  π ← constant factor approximation [1, 12].
  Apply single vertex moves to get a local optimum π^local.
  Let α = 9^{−r}, with r uniform in [0, log_9(ε^3/β)].
  Output: Rec(1, n)

  Rec(i, j) =
    If i = j then Return i
    For any ℓ, m, let V_{ℓ,m} = {vertices x : π^local(x) ∈ [ℓ, m]}
    K ← {k : |V_{i,k}|, |V_{k+1,j}| ≥ |V_{i,j}|/3}
    Choose k uniformly at random from K
    ρ_1 ← AddApprox(V_{i,j}) with parameter β
    ρ_2 ← Concatenate Rec(i, k) and Rec(k + 1, j)
    If F_{V_{i,j}} ≥ (αε|V_{i,j}|)^2
      Return ρ_1  (V_{i,j} is called a leaf)
    else
      Return ρ_2  (V_{i,j} is called an internal node)

For the analysis, it is enough to study Algorithm 2, since the result for Algorithm 1 follows as a simple corollary:

Lemma 3. If Algorithm 2 outputs an ordering whose cost is ≤ (1 + λ)OPT with positive probability, then Algorithm 1 outputs an ordering with cost ≤ (1 + λ)OPT.

Proof. Let π denote the output of Algorithm 1. Since we started with a constant factor approximation and only performed cost-improving moves, π is still a constant factor approximation, of course. Execute Algorithm 2 starting with this ordering π. Since there are no cost-decreasing single-vertex moves, π^local = π. Since Algorithm AddApprox cannot improve any subinterval U of π, the recursive process will always choose ρ_2 rather than ρ_1: in the end, we obtain π^out = π for every execution of our Algorithm 2. Therefore it must be that π has cost ≤ (1 + λ)OPT.

3. ANALYSIS: CORRECTNESS

3.1 Rounding analysis

The following lemma shows that rounding the weights, which is done to ensure polynomial-time termination of local search, is irrelevant for the correctness analysis.

Lemma 4 (Rounding). Let C denote the cost function with the original weights, C̃ denote the cost function with the rounded weights, and ÕPT = min_ρ C̃(ρ) denote the value of the optimal solution to the problem with rounded weights. If C̃(π) ≤ (1 + λ)ÕPT, then C(π) ≤ (1 + 3λ)OPT.

Proof Sketch. The proof is split into two cases, depending on whether OPT is large or small compared to the total error introduced by rounding.

The stopping condition used in Algorithm 3 is defined in terms of the footrule distance

  F_S = Σ_{x ∈ S} |π^local(x) − π*(x)|.

In tournament graphs, the footrule distance is related to the cost of the two orderings:

Lemma 5. F_V ≤ (2/b)(C_V(π*) + C_V(π^local)).

Proof. By [14]'s Theorem 2 relating Spearman's footrule and Kendall-Tau, we have

  Σ_j |π^local(j) − π*(j)| ≤ 2 Σ_{i,j} 1(π*(i) > π*(j)) · 1(π^local(i) < π^local(j))
                           ≤ (2/b) Σ_{i,j} [1(π*(i) > π*(j)) w_ij + 1(π^local(i) < π^local(j)) w_ji]
                           = (2/b)(C(π*) + C(π^local)),

where 1(p) ≡ 1 if p is true and 0 otherwise. The second inequality follows from w_ij + w_ji ≥ b.
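Lemma 5 rests on the Diaconis-Graham inequality [14], which says that the Spearman footrule distance between two rankings is at most twice their Kendall-Tau distance. A quick numerical sanity check in the same illustrative Python style (not from the paper):

    import random

    # Spearman footrule: total positional displacement between rankings p and q
    # (each maps a vertex to its position).
    def footrule(p, q):
        return sum(abs(p[x] - q[x]) for x in p)

    # Kendall-Tau distance: number of pairs that p and q order differently.
    def kendall_tau(p, q):
        xs = list(p)
        return sum(1
                   for i in range(len(xs)) for j in range(i + 1, len(xs))
                   if (p[xs[i]] < p[xs[j]]) != (q[xs[i]] < q[xs[j]]))

    n = 8
    shuffled = list(range(n))
    random.shuffle(shuffled)
    p = {x: x for x in range(n)}             # identity ranking
    q = {x: shuffled[x] for x in range(n)}   # a random ranking
    assert footrule(p, q) <= 2 * kendall_tau(p, q)   # Diaconis-Graham [14]

Lemma 5 then converts the Kendall-Tau distance between π* and π^local into costs using w_ij + w_ji ≥ b.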

For some other pairs, there is a discrepancy, such as the following pair, which appears in the sum for T_S but not in the sum for C_S(π^local) − C_S(π*):

  [Figure omitted: the four positions appear left to right in the order π^local(x), π^local(y), π*(x), π*(y), with displacement arcs drawn from π^local(x) to π*(x) and from π^local(y) to π*(y).]

This is what we call a "crossing pair" (see Section 3.7), and we need a corrective term, which we will denote by Φ_V, taking those into account. Thus we define:

  Φ_S = C_S(π^local) − C_S(π*) − T_S.

We define δ_xy to be equal to 1 in the first crossing configuration, π*(x) ≤ π*(y) ≤ π^local(x) ≤ π^local(y); to −1 in the other three crossing configurations (such as π^local(x) ≤ π^local(y) ≤ π*(x) ≤ π*(y)); and to 0 for non-crossing pairs. Intuitively, {x, y} is a crossing pair if, when we move x continuously from position π^local(x) to position π*(x), we pass exactly one of {π^local(y), π*(y)}.

Lemma 15 (Mistake Decomposition). We have:

  Φ_S = −Σ_{x,y ∈ S} f(x, y) · δ_xy.

Proof. ∆C ≡ C_S(π^local) − C_S(π*) = Σ f(x, y), where the sum is over pairs x, y in S such that π^local(x) < π^local(y) and π*(x) > π*(y). To prove the Lemma, it suffices to show that each term f(x, y) appears with the same coefficient in ∆C as it does in T_S + Φ_S. First rewrite T_S by swapping the names x and y and using the fact that f is symmetric, to yield T_S' = Σ f(x, y), where the sum is now over pairs x, y in S such that π*(y) ≤ π^local(x) < π^local(y). Note that every pair (x, y) in the sums ∆C, T_S, T_S' and Φ_S has π^local(x) < π^local(y). We divide the pairs (x, y) into seven sets based on the truth of the inequalities π*(x) > π*(y), π*(x) ≥ π^local(y), and π*(y) ≤ π^local(x). (The eighth possibility never happens because it would imply a contradiction with π^local(x) < π^local(y).) The seven cases are shown in the following table. T indicates that the inequality in the heading is true, integers indicate the coefficient of f(x, y) in the sums, and dots indicate false or zero.

  π*(x) > π*(y)   π*(x) ≥ π^local(y)   π^local(x) ≥ π*(y)  |  ∆C  T1  T2   Φ
        ·                 ·                    ·           |   ·   ·   ·   ·
        ·                 ·                    T           |   ·   ·   1  −1
        ·                 T                    ·           |   ·   1   ·  −1
        T                 ·                    ·           |   1   ·   ·   1
        T                 ·                    T           |   1   ·   1   ·
        T                 T                    ·           |   1   1   ·   ·
        T                 T                    T           |   1   1   1  −1

The following two Lemmas, proved in later subsections, analyze the increase of the first order (T) and second order (Φ) effects from the leaves to the root.

Recall that α ∈ [α_min, 1] with α_min = 9^{−log_{3/2}(1/ε^2)/ε}. The choice of α and of the random k's determines the random choices in the execution. Let B denote the choices of k for an execution of Algorithm 3 with α = 1. Smaller α leads to earlier termination, so B also defines the random choices made for any choice of α. Thus, for a given π^local, the random choices are determined by (B, α).

We have

  T_V − E_{B,α}[Σ_{S leaf} T_S] = E_{B,α}[Σ_y Σ_{x: E(x)} −f(x, y)].

Let s(y) be the size of the leaf containing y (number of vertices in that set). We can split the sum for each y into several parts depending on the relative size of the leaf containing y and |π*(y) − π^local(y)|. The cases correspond to

  • small leaves: |π*(y) − π^local(y)| > s(y)/ε,
  • big leaves: εs(y) > |π*(y) − π^local(y)|, and
  • intermediate leaves: εs(y) ≤ |π*(y) − π^local(y)| ≤ s(y)/ε.

Small leaves are easy thanks to local optimality.

Lemma 12 (Small Leaves). For any execution (determined by B and α), we have

  Σ_{y ∈ small leaf} Σ_{x: E(x)} −f(x, y) ≤ εF_V.

Proof Sketch. Moving y cannot improve π^local.

We handle the big leaves by proving that the probability that π^local(y) and π*(y) are in different leaves (and hence able to contribute to the sum) is O(ε).

Lemma 13 (Big Leaves). For any α and for a random B, we have:

  E_B[Σ_{y ∈ big leaf} Σ_{x: E(x)} −f(x, y)] ≤ 12εF_V.

Proof Sketch. Let α be fixed. If π^local(x) is between π^local(y) and π*(y), but x is not in leaf(y), then the vertex whose rank in π^local is π*(y) is also not in leaf(y). Thus we can bound the expression on the left hand side by the expectation (over the random tree construction) of

  Σ_y |π*(y) − π^local(y)| · 1(y ∈ big leaf and (π^local)^{−1}(π*(y)) ∉ leaf(y)).

One can show that for any vertex y, the event E_1 that "y ∈ big leaf and (π^local)^{−1}(π*(y)) ∉ leaf(y)" has probability at most 9ε over the sequence B of random choices defining the decomposition. The intuition is that the random choice of k is unlikely to land between π^local(y) and π*(y).

We bound the contribution of the intermediate leaves by using the variation in α to force a variation in the leaf size, making it unlikely that a given vertex y will be in an intermediate leaf.

Lemma 14 (Intermediate Leaves). For any fixed B and for a random α, we have

  E_α[Σ_{y ∈ intermediate leaf} Σ_{x: E(x)} −f(x, y)] ≤ εF_V.

Proof Sketch. Let B = (k_S) be fixed. As in the beginning of the proof of Lemma 13, we can bound the expression on the left hand side by the expectation (over the random choice of α) of Σ_y g(y), where

  g(y) = 0                                     if s(y) = n,
  g(y) = Σ_{x: E(x)} −f(x, y)                  if s(y) = 1,
  g(y) = |π*(y) − π^local(y)| · 1(y ∈ intermediate leaf and (π^local)^{−1}(π*(y)) ∉ leaf(y))   otherwise.

One can argue that for any y, the event that "y is in an intermediate leaf and (π^local)^{−1}(π*(y)) ∉ leaf(y)" has probability O(ε) over the random choice of α: varying α forces the leaf containing y to vary in size, so y is unlikely to land in an intermediate leaf.

Our proof of Lemma 11 is by induction on the nodes of the recursion tree. To analyze crossing pairs, the number of vertices such that the displacement arc (π^local(x), π*(x)) crosses k is relevant. Let

  ψ_S^L(k) = |{x ∈ S | π*(x) ≤ k < π^local(x)}|, and
  ψ_S^R(k) = |{x ∈ S | π^local(x) ≤ k < π*(x)}|.

Lemma 16 (Core). Let S = V_{i,j}, let L be the set of x ∈ S such that π^local(x) ≤ k, and R = S \ L, where we write L = L_k and R = R_k to make the dependency on k explicit. Now, it is easy to see that the set K from which Algorithm 3 randomly chooses k satisfies |K| ≥ |V_{i,j}|/4 whenever i ≠ j. We thus have:

  E_k[Φ_S − Φ_L − Φ_R] ≤ (4/|S|) · |{(x, y, k) : (x, y) ∈ L_k × R_k, crossing pair}|.

In such a triple (x, y, k), we must have that k is either between π^local(x) and π*(x) or between π^local(y) and π*(y). Moreover, in order for {x, y} to be a crossing pair, it must be that y crosses over either π^local(x) or over π*(x), so y is counted in ψ_S^L(π^local(x)) or in ψ_S^R(π^local(x)) or in ψ_S^L(π*(x)) or in ψ_S^R(π*(x)). Taking the union of these cases yields the Lemma.

Lemma 17. ψ_S^* ≤ F_S^{1/2}.

Proof Sketch. i ↦ ψ_S(i) is a 1-Lipschitz function (it changes by at most 1 when going from i to i + 1), and its integral is bounded by F_S. It is easy to see that these observations imply that the maximum value of this function cannot be more than √F_S.

Corollary 18.

  Φ_S − E_k[Φ_L + Φ_R] ≤ (32/|S|) F_S^{3/2}.

At this point we are prepared to show that at any one level the mistakes made are bounded by 32ε times the optimum cost. To do this, note that by the stopping condition √F_S/|S| < ε, so F_S^{3/2}/|S| < εF_S. This shows that any one level of the tree does not contribute excessive mistakes.
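To spell out the arithmetic behind the last remark (our gloss, since the surrounding discussion is abridged in this excerpt): by Corollary 18, each internal node S contributes at most 32 F_S^{3/2}/|S| in expectation, and the nodes at any single level of the recursion tree are disjoint subsets of V, so

\[
\sum_{S \text{ at one level}} 32\,\frac{F_S^{3/2}}{|S|}
  \;\le\; 32\varepsilon \sum_{S \text{ at one level}} F_S
  \;\le\; 32\varepsilon\, F_V,
\]

using F_S^{3/2}/|S| = F_S · (√F_S/|S|) < εF_S from the stopping condition and the additivity of F over disjoint sets.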

The following lemma proves that the sum of mistakes over all of the levels is dominated by the nodes near the leaves of the tree.

Lemma 19 (Inductive). For every α ∈ [α_min, 1],

  Φ_V ≤ E_B[Σ_{S leaf} (96γεF_S + Φ_S)],

where γ = √5/(3 − √5).

Proof. Let α ∈ [α_min, 1] be fixed. If V is a leaf, then the statement is true. Else, given α, we first prove by induction that for any ℓ ≤ m, with U = {x | ℓ ≤ π^local(x) ≤ m} and E[· | U ∈ B] meaning expectation conditioned on U being a leaf or internal node in execution tree B:

  Φ_U ≤ E_B[ 32 Σ_{S internal descendant of U} F_S^{3/2}/|S| + Σ_{S leaf descendant of U} Φ_S | U ∈ B ].   (3)

The base case of U being a leaf is trivial. Let k be the random variable used at U to decompose U into L and R. Let X_k = Φ_U − Φ_L − Φ_R. Let B_1 be the random sequence used to construct the decomposition of the left child, and B_2 the random sequence used to construct the decomposition of the right child, so that B = (k, B_1, B_2). Note that given an interval U, Φ_U = E_B[Φ_U] is deterministic. Therefore:

  Φ_U ≤ E_B[X_k + Φ_L + Φ_R] = E_k[X_k] + E_k[Φ_L + Φ_R].

Using the induction hypothesis for fixed L, R we have:

  Φ_L + Φ_R ≤ E_{B_1,B_2}[ 32 Σ_{S internal desc. of L or R} F_S^{3/2}/|S| + Σ_{S leaf desc. of L or R} Φ_S | U, L, R ∈ B ].

Taking expectation over k and rewriting:

  E_k[Φ_L + Φ_R] ≤ E_B[ 32 Σ_{S internal strict desc. of U} F_S^{3/2}/|S| + Σ_{S leaf desc. of U} Φ_S | U ∈ B ].

Adding E_k[X_k] ≤ 32 F_U^{3/2}/|U| (from Corollary 18) to this and combining 32 F_U^{3/2}/|U| with the sum over internal nodes completes the induction, proving Equation 3.

The rest of the proof is deterministic. We prove that for every tree decomposition B, the argument of the expectation on the right hand side of Equation 3 with ℓ, m = 1, n and U = V is bounded by O(1)εF_V. Therefore the expectation must also be bounded. The proof uses the following algebraic claim, which is easily checked; recall that γ = (√5/3)/(1 − √5/3) = √5/(3 − √5). If 0 ≤ F_1 ≤ F, s > 0, and s_1 > 0 are such that s_1, (s − s_1) ≥ s/3, then:

  F^{3/2}/s < (√5/3) (F_1^{3/2}/s_1 + (F − F_1)^{3/2}/(s − s_1)).

Thus, for every internal node S with children S_1 and S_2, we have:

  F_S^{3/2}/|S| ≤ (√5/3) (F_{S_1}^{3/2}/|S_1| + F_{S_2}^{3/2}/|S_2|).

Summing over the recursion tree, we obtain:

  Σ_{S internal} F_S^{3/2}/|S| ≤ (√5/3 + (√5/3)^2 + ···) Σ_{S leaf} F_S^{3/2}/|S|,

which is bounded by γ Σ_{S leaf} F_S^{3/2}/|S|. If S is a leaf, then let P be the parent of S.² The parent is not a leaf, so by the stopping condition √F_P/|P| ≤ αε ≤ ε. Thus:

  F_S^{3/2}/|S| = F_S √F_S/|S| ≤ F_S √F_P/|S| ≤ 3 F_S √F_P/|P| ≤ 3εF_S.

Summing yields the lemma.

² If S is the root, no parent is available, but in that case the lemma is trivially true anyway.

Combining Lemma 19 with the simple fact that Σ_{S leaf} F_S = F_V proves Lemma 11.

4. RUNNING TIME

4.1 Improved running time: Randomization

The running time of Algorithms 2 and 1 can be improved somewhat by replacing the deterministic additive error algorithm with the randomized one. To use the randomized version of the AddApprox algorithm, set η = δ/n for Algorithm 2 and η = δε/n^4 for Algorithm 1. Consider the event E_0 stating that "During execution of the algorithm, every call to the randomized version of AddApprox yields a result within the stated bounds." Each call fails with probability at most η. As shown in Section 4.2, there are at most n (resp. n^4/ε) such calls, so event E_0 has probability at least 1 − δ. Modify the analysis by assuming throughout that event E_0 holds and do the analysis conditioned on E_0.

Using the randomized version of the AddApprox algorithm introduces a complication because the proof of Lemma 3 implicitly assumes determinism. This is easily remedied by reusing the random numbers used the previous time randomized AddApprox was called on the same vertices.

4.2 Analysis: Running time

Both Algorithms 1 and 2 have a greedy local search phase. The preprocessing step ensures that the edge weights are integer multiples of ε/n^2, so the cost decreases by at least ε/n^2 at each iteration. Since the cost is always between n^2 and zero, the total number of iterations is bounded by n^4/ε. The single vertex move local search phase takes time O(n^2 · n^4/ε), where the first term comes from the time to search for an improving single vertex move and the second is a bound on the number of iterations.

The runtime is dominated by the calls to the additive error algorithm. There are at most n calls for the randomized version (Algorithm 2) and at most n^2 · n^4/ε for the deterministic one (Algorithm 1), where the n^2 comes from the number of intervals that need to be checked each iteration to find an improving move. Thus the runtime is

  O(n^6/ε + n · f(n, (1/ε)^{−O(1/ε)}, δ/n^4))

randomized and

  O(n^6/ε · f(n, (1/ε)^{−O(1/ε)}, 0))

deterministic, where f(n, β, η) is the time required for the additive approximation algorithm to run on a problem of size n with error parameter β and failure probability η. The best-known additive approximation run time f(n, β, η) is n/β^4 + 2^{O(1/β^2)} log(1/η) randomized in [22], or n^{O(1/β^4)} for deterministic (η = 0).

5. CONCLUSIONS AND FUTURE WORK

The reader may wonder whether our results can extend to the dense case, with Ω(n^2) edges but no constraint that every pair of vertices must have an edge between them. This extension is not feasible because the dense case is as hard as the general case, which is APX-hard. To see this, consider an arbitrary directed graph with n vertices for which we want the feedback arc set. Union this graph with an arbitrary acyclic dense graph on n vertices. The result is a dense graph with 2n vertices that is essentially equivalent to the original problem.

Interesting open questions include:

1. Does weighted feedback arc set with the triangle inequality admit a PTAS? This would give a PTAS for partial rank aggregation [2].

2. Can the runtime be improved to be only singly exponential in 1/ε? The doubly exponential runtime comes from the intermediate leaves case in the T analysis (Lemma 14).

3. Is single vertex move local optimality sufficient to achieve a constant factor approximation? We have an example showing that such a constant factor is at least three. Also, does sorting by indegree followed by single vertex moves have a provably better approximation factor than either one by itself?

Acknowledgements

The authors wish to thank Nir Ailon, Eli Upfal and anonymous reviewers for useful suggestions.

6. REFERENCES

[1] Nir Ailon, Moses Charikar, Alantha Newman. Aggregating inconsistent information: ranking and clustering. ACM Symposium on Theory of Computing (STOC), pages 684-693, 2005.
[2] Nir Ailon. Aggregation of Partial Rankings, p-Ratings and Top-m Lists. ACM-SIAM Symposium on Discrete Algorithms (SODA), 2007, 415-424.
[3] N. Alon. Ranking tournaments. SIAM J. Discrete Math. 20 (2006), 137-142.
[4] Sanjeev Arora, Alan Frieze, and Haim Kaplan. A New Rounding Procedure for the Assignment Problem with Applications to Dense Graph Arrangement Problems. IEEE Symposium on Foundations of Computer Science (FOCS) 1996: 24-33.
[5] F. B. Baker and L. J. Hubert. Applications of combinatorial programming to data analysis: seriation using asymmetric proximity measures. British J. Math. Statist. Psych. 30 (1977), 154-164.
[6] J. Borda. Mémoire sur les élections au scrutin. Histoire de l'Académie Royale des Sciences, 1781.
[7] M. C. Cai, X. Deng, W. Zang. An approximation algorithm for feedback vertex sets in tournaments. SIAM J. Computing 30(6):1993-2007, 2001.
[8] M. Charikar, V. Guruswami, and A. Wirth. Clustering with qualitative information. In FOCS, pages 524-533, 2003.
[9] William W. Cohen, Robert E. Schapire and Yoram Singer. Learning to order things. Proceedings of the 1997 Conference on Advances in Neural Information Processing Systems (NIPS): 451-457.
[10] M. Condorcet. Essai sur l'application de l'analyse à la probabilité des décisions rendues à la pluralité des voix. 1785.
[11] V. Conitzer. Computing Slater Rankings Using Similarities Among Candidates. Proceedings of the 21st National Conference on Artificial Intelligence (AAAI-06), Boston, MA, USA, 613-619, 2006.
[12] D. Coppersmith, Lisa Fleischer, Atri Rudra. Ordering by weighted number of wins gives a good ranking for weighted tournaments. ACM-SIAM Symposium on Discrete Algorithms (SODA) 2006: 776-782.
[13] C. Demetrescu and I. Finocchi. Break the "right" cycles and get the "best" drawing. In B. E. Moret and A. V. Goldberg, editors, Proceedings of the 2nd International Workshop on Algorithmic Engineering and Experiments (ALENEX), 171-182, 2000.
[14] P. Diaconis and R. Graham. Spearman's footrule as a measure of disarray. J. of the Royal Statistical Society, 39(2), pages 262-268, 1977.
[15] I. Dinur and S. Safra. The importance of being biased. In Proceedings of the 34th Annual ACM Symposium on Theory of Computing, pages 33-42, 2002.
[16] Cynthia Dwork, Ravi Kumar, Moni Naor, and D. Sivakumar. Rank aggregation revisited. Manuscript, 2001.
[17] Cynthia Dwork, Ravi Kumar, Moni Naor, D. Sivakumar. Rank Aggregation Methods for the Web. WWW10, 2001.
[18] P. Erdős and J. W. Moon. On sets of consistent arcs in a tournament. Canadian Mathematical Bulletin 8 (1965), 269-271.
[19] G. Even, J. Naor, S. Rao, B. Schieber. Divide-and-conquer approximation algorithms via spreading metrics. Proceedings of the 36th Annual IEEE Symposium on Foundations of Computer Science (FOCS), 62-71, 1995.
[20] W. Fernandez de la Vega. On the maximal cardinality of a consistent set of arcs in a random tournament. J. Combinatorial Theory, Ser. B 35:328-332, 1983.
[21] M. M. Flood. Exact and Heuristic Algorithms for the Weighted Feedback Arc Set Problem: a Special Case of the Skew-Symmetric Quadratic Assignment Problem. Networks, 20:1-23, 1990.
[22] Alan M. Frieze, Ravi Kannan. Quick Approximation to Matrices and Applications. Combinatorica 19(2): 175-220, 1999.
[23] M. Grötschel, M. Jünger, G. Reinelt. Facets of the linear ordering polytope. Math. Programming 33, 1985, 43-60.
[24] M. Grötschel, M. Jünger, G. Reinelt. On the acyclic subgraph polytope. Math. Programming 33, 1985, 28-42.
[25] R. Hassin and S. Rubinstein. Approximations for the maximum acyclic subgraph problem. Information Processing Letters 51 (1994), 133-140.
[26] H. A. Jung. On subgraphs without cycles in tournaments. Combinatorial Theory and its Applications II, North Holland (1970), 675-677.
[27] M. Jünger. Polyhedral Combinatorics and the Acyclic Subdigraph Problem. Heldermann Verlag, Berlin, 1985.
[28] R. Karp. Reducibility among combinatorial problems. In: Miller, R. E., Thatcher, J. W. (Eds.), Complexity of Computer Computations. Plenum Press, NY, pp. 85-103, 1972.
[29] J. Kemeny. Mathematics without numbers. Daedalus, 88:571-591, 1959.
[30] J. Kemeny and J. Snell. Mathematical models in the social sciences. Blaisdell, New York, 1962. Reprinted by MIT Press, Cambridge, 1972.
[31] Claire Kenyon-Mathieu and Warren Schudy. How to rank with few errors: A PTAS for Weighted Feedback Arc Set on Tournaments. ECCC Report TR06-144, Nov 2006.
[32] B. Korte. Approximative algorithms for discrete optimization problems. Ann. Discrete Math. 4 (1979), 85-120.
[33] T. Leighton and S. Rao. An approximate max-flow min-cut theorem for uniform multicommodity flow problems with applications to approximation algorithms. In Proceedings of the 29th Annual Symposium on Foundations of Computer Science, pages 422-431. IEEE, October 1988.
[34] A. Lempel and I. Cederbaum. Minimum feedback arc and vertex sets of a directed graph. IEEE Trans. Circuit Theory, 13(4):399-403, 1966.
[35] K. B. Reid. On sets of arcs containing no cycles in tournaments. Canad. Math. Bulletin 12 (1969), 261-264.
[36] K. B. Reid and E. T. Parker. Disproof of a conjecture of Erdős and Moser on tournaments. J. Combin. Theory 9 (1970), 225-238.
[37] S. Seshu and M. B. Reed. Linear Graphs and Electrical Networks. Addison-Wesley, Reading, MA, 1961. (Attributes the minimum feedback arc set problem to Runyon.)
[38] P. D. Seymour. Packing directed circuits fractionally. Combinatorica 15:281-288, 1995.
[39] P. Slater. Inconsistencies in a schedule of paired comparisons. Biometrika 48 (1961), 303-312.
[40] J. Spencer. Optimal ranking of tournaments. Networks 1 (1971), 135-138.
[41] J. Spencer. Optimal ranking of unrankable tournaments. Period. Math. Hungar. 11-2 (1980), 131-144.
[42] K. Sugiyama, S. Tagawa, and M. Toda. Methods for visual understanding of hierarchical system structures. IEEE Trans. Syst. Man Cybern., 11(2):109-125, 1981.
[43] A. W. Tucker. On directed graphs and integer programs. Presented at the 1960 Symposium on Combinatorial Problems, Princeton University; cited in [44].
[44] D. H. Younger. Minimum feedback arc sets for a directed graph. IEEE Trans. Circuit Theory 10 (1963), 238-245.
[45] A. van Zuylen. Deterministic approximation algorithms for ranking and clusterings. Cornell ORIE Tech Report No. 1431, 2005.
[46] Anke van Zuylen, Rajneesh Hegde, Kamal Jain and David P. Williamson. Deterministic pivoting algorithms for constrained ranking and clustering problems. ACM-SIAM Symposium on Discrete Algorithms (SODA), 2007, 405-414.