Tight Bounds on the Round Complexity of the Distributed Maximum Coverage Problem∗

Sepehr Assadi† Sanjeev Khanna†

Abstract

We study the maximum k-set coverage problem in the following distributed setting. A collection of input sets S1, ..., Sm over a universe [n] is partitioned across p machines and the goal is to find k sets whose union covers the largest number of elements. The computation proceeds in rounds where in each round machines communicate information to each other. Specifically, in each round, all machines simultaneously send a message to a central coordinator who then communicates back to all machines a summary to guide the computation for the next round. At the end of the last round, the coordinator outputs the answer. The main measures of efficiency in this setting are the approximation ratio of the returned solution, the communication cost of each machine, and the number of rounds of computation.

Our main result is an asymptotically tight bound on the tradeoff between these three measures for the distributed maximum coverage problem. We first show that any r-round protocol for this problem either incurs a communication cost of k · m^Ω(1/r) or only achieves an approximation factor of k^Ω(1/r). This in particular implies that any protocol that simultaneously achieves a good approximation ratio (O(1) approximation) and a good communication cost (Õ(n) communication per machine) essentially requires a logarithmic (in k) number of rounds. We complement our lower bound result by showing that there exists an r-round protocol that achieves an (e/(e−1))-approximation (essentially best possible) with a communication cost of k · m^O(1/r), as well as an r-round protocol that achieves a k^O(1/r)-approximation with only Õ(n) communication per machine (essentially best possible). We further use our results in this distributed setting to obtain new bounds for maximum coverage in two other main models of computation for massive datasets, namely, the dynamic streaming model and the MapReduce model.

1 Introduction

A common paradigm for designing scalable algorithms for problems on massive data sets is to distribute the computation by partitioning the data across multiple machines interconnected via a communication network. The machines can then jointly compute a function on the union of their inputs by exchanging messages. A well-studied and important case of this paradigm is the coordinator model (see, e.g., [34, 54, 60]). In this model, computation proceeds in rounds, and in each round, all machines simultaneously send a message to a central coordinator who then communicates back to all machines a summary to guide the computation for the next round. At the end, the coordinator outputs the answer. The main measures of efficiency in this setting are the communication cost, i.e., the total number of bits communicated by each machine, and the round complexity, i.e., the number of rounds of computation. The distributed coordinator model (and the closely related message-passing model[1]) has been studied extensively in recent years (see, e.g., [54, 23, 59, 60, 61], and references therein). Traditionally, the focus in this model has been on optimizing the communication cost, and round complexity issues have been ignored. However, in recent years, motivated by applications to big data analysis such as MapReduce computation, there has been growing interest in obtaining round-efficient protocols for various problems in this model (see, e.g., [3, 4, 43, 41, 37, 52, 30, 13, 38, 10]).

In this paper, we study the maximum coverage problem in the coordinator model: A collection of input sets S := {S1, ..., Sm} over a universe [n] is arbitrarily partitioned across p machines, and the goal is to select k sets whose union covers the largest number of elements from the universe. Maximum coverage is a fundamental optimization problem with a wide range of applications (see, e.g., [47, 45, 58, 36] for some applications). As an illustrative example of submodular maximization, the maximum coverage problem has been studied in various recent works focusing on scalable algorithms for massive data sets, including in the coordinator model (e.g., [41, 52]), the MapReduce framework (e.g., [27, 48]), and the streaming model (e.g., [20, 51]); see Section 1.1 for a more comprehensive summary of previous results.

Previous results for maximum coverage in the distributed model can be divided into two main categories: on one hand, we have communication-efficient protocols that only need Õ(n) communication and achieve a constant factor approximation, but require a large number of rounds of Ω(p) [15, 51][2]. On the other hand, we have round-efficient protocols that achieve a constant factor approximation in O(1) rounds of communication, but incur a large communication cost of k · m^Ω(1) [48].

This state of affairs, namely, communication-efficient protocols that require a large number of rounds, or round-efficient protocols that require a large communication cost, raises the following natural question: Does there exist a truly efficient distributed protocol for maximum coverage, that is, a protocol that simultaneously achieves Õ(n) communication cost, O(1) round complexity, and gives a constant factor approximation? This is precisely the question addressed in this work.

1.1 Our Contributions. Our first result is a negative resolution of the aforementioned question. In particular, we show that:

Result 1. For any integer r ≥ 1, any r-round protocol for distributed maximum coverage either incurs k · m^Ω(1/r) communication per machine or has an approximation factor of k^Ω(1/r).

Prior to our work, the only known lower bound for distributed maximum coverage was due to McGregor and Vu [51], who showed an Ω(m) communication lower bound for any protocol that achieves a better than (e/(e−1))-approximation (regardless of the number of rounds and even if the input is randomly distributed); see also [9]. Indyk et al. [41] also showed that no composable coreset (a restricted family of single round protocols) can achieve a better than Ω(√k) approximation without communicating essentially the whole input (which is known to be tight [30]). However, no super-constant lower bounds on the approximation ratio were known for this problem for arbitrary protocols, even for one round of communication. Our result on the other hand implies that to achieve a constant factor approximation with an O(n^c) communication protocol (for a fixed constant c > 0), Ω(log k / log log k) rounds of communication are required.

In establishing Result 1, we introduce a general framework for proving communication complexity lower bounds for bounded-round protocols in the distributed coordinator model. This framework, formally introduced in Section 4, captures many of the existing multi-party communication complexity lower bounds in the literature for bounded-round protocols, including [33, 46, 13, 12] (for one-round, a.k.a. simultaneous, protocols) and [7, 8] (for multi-round protocols). We believe our framework will prove useful for establishing distributed lower bound results for other problems, and is thus interesting in its own right.

We complement Result 1 by giving protocols that show that its bounds are essentially tight.

Result 2. For any integer r ≥ 1, there exist r-round protocols that achieve:
1. an approximation factor of (almost) e/(e−1) with k · m^O(1/r) communication per machine, or
2. an approximation factor of O(r · k^(1/(r+1))) with Õ(n) communication per machine.

Results 1 and 2 together provide a near-complete understanding of the tradeoff between the approximation ratio, the communication cost, and the round complexity of protocols for the distributed maximum coverage problem for any fixed number of rounds.

The first protocol in Result 2 is quite general in that it works for maximizing any monotone submodular function subject to a cardinality constraint. Previously, it was known how to achieve a 2-approximation distributed algorithm for this problem with m^O(1/r) communication and r rounds of communication [48]. However, the previous best (e/(e−1))-approximation distributed algorithm for this problem with sublinear in m communication, due to Kumar et al. [48], requires at least Ω(log n) rounds of communication. As noted above, e/(e−1) is information-theoretically the best approximation ratio possible for any protocol that uses sublinear in m communication [51].

The second protocol in Result 2 is however tailored heavily to the maximum coverage problem. Previously, it was known that an O(√k) approximation can be achieved via Õ(n) communication per machine [30], but no better bounds were known for this problem in multiple rounds under poly(n) communication cost. It is worth noting that since an adversary may assign all sets to a single machine, a communication cost of Õ(n) is essentially the best possible bound. We now elaborate on some applications of our results.

Dynamic Streams. In the dynamic (set) streaming model, at each step, either a new set is inserted or a previously inserted set is deleted from the stream. The goal is to solve the maximum coverage problem on the sets that are present at the end of the stream. A semi-streaming algorithm is allowed to make one or a small number of passes over the stream and use only O(n · poly{log m, log n}) space to process the stream and compute the answer.

∗ A full version of this paper is available on arXiv.
† Department of Computer and Information Science, University of Pennsylvania. Supported in part by National Science Foundation grants CCF-1552909 and CCF-1617851. Email: {sassadi,sanjeev}@cis.upenn.edu.
[1] In the absence of any restriction on round complexity, these two models are equivalent; see, e.g., [54].
[2] We remark that the algorithms of [15, 51] are originally designed for the streaming setting, and in that setting are quite efficient as they only require one or a constant number of passes over the stream. However, implementing one pass of a streaming algorithm in the coordinator model directly requires p rounds of communication.

Copyright © 2018 by SIAM. Unauthorized reproduction of this article is prohibited.
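To make the dynamic set-streaming model concrete, here is a toy interface (not from the paper, and deliberately not space-efficient: it stores all live sets exactly, whereas a semi-streaming algorithm must use roughly Õ(n) space). Sets arrive and depart over time, and the answer is a greedy maximum k-cover over the sets alive at the end of the stream.

```python
# Toy dynamic set stream (illustration only): '+' inserts a set, '-' deletes
# a previously inserted set; maximum coverage is solved on the surviving sets.

def max_coverage_of_stream(stream, k):
    """stream: list of (op, set_id, frozenset) with op in {'+', '-'}."""
    alive = {}
    for op, sid, elems in stream:
        if op == '+':
            alive[sid] = elems
        else:
            del alive[sid]                  # deletion of a previously inserted set
    # greedy max k-cover on the sets present at the end of the stream
    covered, chosen = set(), []
    for _ in range(k):
        best = max(alive, key=lambda s: len(alive[s] - covered), default=None)
        if best is None or not (alive[best] - covered):
            break
        chosen.append(best)
        covered |= alive[best]
        del alive[best]
    return chosen, covered
```

The point is only the update model (insertions and deletions, answer at the end); the space-efficient linear-sketch implementation is what the results discussed here are about.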

The streaming setting for the maximum coverage problem and the closely related set cover problem has been studied extensively in recent years [58, 28, 14, 24, 35, 32, 15, 39, 25, 11, 20, 26, 51, 9, 36]. Previous work considered this problem in insertion-only streams and more recently in the sliding window model; to the best of our knowledge, no non-trivial results were known for this problem in dynamic streams. Our Results 1 and 2 imply the first upper and lower bounds for maximum coverage in dynamic streams.

Result 1, together with a recent characterization of multi-pass dynamic streaming algorithms [5], proves that any semi-streaming algorithm for maximum coverage in dynamic streams that achieves any constant approximation requires Ω(log n / log log n) passes over the stream. This is in sharp contrast with insertion-only streams, in which semi-streaming algorithms can achieve an (almost) 2-approximation in a single pass [15] or an (almost) (e/(e−1))-approximation in a constant number of passes [51] (constant factor approximations are also known in the sliding window model [26, 36]). To our knowledge, this is the first multi-pass dynamic streaming lower bound that is based on the characterization of [5]. Moreover, as maximum coverage is a special case of submodular maximization (subject to a cardinality constraint), our lower bound extends to this problem and settles an open question of [36] on the space complexity of submodular maximization in dynamic streams.

We complement this result by showing that one can implement the first algorithm in Result 2 using proper linear sketches in dynamic streams, which implies an (almost) (e/(e−1))-approximation semi-streaming algorithm for maximum coverage (and monotone submodular maximization) in O(log m) passes. As a simple application of this result, we can also obtain an O(log n)-approximation semi-streaming algorithm for the set cover problem in dynamic streams that requires O(log m · log n) passes over the stream.

MapReduce Framework. In the MapReduce model, there are p machines, each with a memory of size s such that p · s = O(N), where N is the total memory required to represent the input. MapReduce computation proceeds in synchronous rounds where in each round, each machine performs some local computation, and at the end of the round sends messages to other machines to guide the computation for the next round. The total size of messages received by each machine, however, is restricted to be O(s). Following [44], we require both p and s to be at most N^(1−Ω(1)). The main complexity measure of interest in this model is typically the number of rounds. Maximum coverage and submodular maximization have also been extensively studied in the MapReduce model [27, 22, 48, 53, 41, 52, 30, 31, 19].

Proving round complexity lower bounds in the MapReduce framework turns out to be a challenging task (see, e.g., [56] for implications of such lower bounds to long-standing open problems in complexity theory). As a result, most previous work on lower bounds concerns either communication cost (in a fixed number of rounds) or specific classes of algorithms (for round lower bounds); see, e.g., [1, 21, 55, 42] (see [56] for more details). Our results contribute to the latter line of work by characterizing the power of a large family of MapReduce algorithms for maximum coverage.

Many existing techniques for MapReduce algorithms utilize the following paradigm, which we call the sketch-and-update approach: each machine sends a summary of its input, i.e., a sketch, to a single designated machine which processes these sketches and computes a single combined sketch; the original machines then receive this combined sketch and update their sketch computation accordingly; this process is then continued on the updated sketches. Popular algorithmic techniques belonging to this framework include composable coresets (e.g., [15, 17, 18, 41]), the filtering method (e.g., [50]), linear-sketching algorithms (e.g., [3, 4, 43, 2]), and the sample-and-prune technique (e.g., [48, 40]), among many others.

We use Result 1 to prove a lower bound on the power of this approach for solving maximum coverage in the MapReduce model. We show that any MapReduce algorithm for maximum coverage in the sketch-and-update framework that uses s = m^δ memory per machine requires Ω(1/δ) rounds of computation. Moreover, both our algorithms in Result 2 belong to the sketch-and-update framework and can be implemented in the MapReduce model. In particular, the round complexity of our first algorithm for monotone submodular maximization (subject to a cardinality constraint) in Result 2 matches the best known algorithm of [31], with the benefit of using sublinear communication (the algorithm of [31], in each round, incurs a linear (in the input size) communication cost). We remark that the algorithm in [31] is however more general in that it supports a larger family of constraints beside the cardinality constraint we study in this paper.

1.2 Organization. The rest of the paper is organized as follows. We start with preliminaries and notation in Section 2. We then present a high level technical overview of our paper in Section 3. In Section 4, we present our framework for proving communication complexity lower bounds for bounded-round protocols. Section 5.1 is dedicated to the proof of Result 1. We sketch the proof of Result 2 in Section 6. Due to space limitations, many of the proofs and details are deferred to the full version of the paper. In particular, we defer the detailed discussion on the applications of our results to dynamic set streams and the MapReduce model to the full version of the paper.
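The sketch-and-update paradigm described above can be summarized as a simple round-based skeleton. The sketch and combine functions below are hypothetical placeholders; coresets, linear sketches, and sample-and-prune are instantiations of the same loop.

```python
# Skeleton of the sketch-and-update paradigm: every round, each machine
# builds a sketch of its input (possibly refined using the last combined
# sketch), a designated machine merges the sketches, and the combined
# sketch is broadcast back for the next round.

def sketch_and_update(inputs, rounds, make_sketch, combine):
    combined = None
    for _ in range(rounds):
        # all machines act simultaneously, seeing only the last combined sketch
        sketches = [make_sketch(x, combined) for x in inputs]
        combined = combine(sketches)      # the designated machine merges
    return combined

# Toy instantiation (hypothetical): the "sketch" of a machine is its best
# value so far and the designated machine keeps the maximum.
result = sketch_and_update(
    inputs=[[4, 1], [7, 2], [5, 6]],
    rounds=1,
    make_sketch=lambda xs, c: max(xs) if c is None else max(max(xs), c),
    combine=max,
)
```

The lower bound discussed above constrains exactly this communication pattern: whatever `make_sketch` and `combine` compute, small per-round sketches force many rounds.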

2 Preliminaries

Notation. For a collection of sets C = {S1, ..., St}, we define c(C) := ∪_{i∈[t]} S_i, i.e., the set of elements covered by C. For a tuple X = (X1, ..., Xt)

For two distributions µ and ν over the same probability space, the Kullback-Leibler divergence between µ and ν is defined as D(µ || ν) := E_{a∼µ}[ log( Pr_µ(a) / Pr_ν(a) ) ].

Fact 2.2. For random variables A, B, C,

I(A ; B | C) = E_{(b,c)}[ D( dist(A | B = b, C = c) || dist(A | C = c) ) ].

We denote the total variation distance between two distributions µ and ν over the same probability space Ω
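Fact 2.2 can be checked numerically on a small example. The sketch below (toy code, not from the paper) computes I(A ; B | C) for a joint distribution of three binary random variables twice: once directly from the definition of conditional mutual information, and once as the expected KL divergence, over (b, c), between dist(A | B = b, C = c) and dist(A | C = c); the two values agree.

```python
# Numeric sanity check of Fact 2.2 on a tiny joint distribution (toy code).
from collections import defaultdict
from math import log

# an arbitrary strictly positive joint distribution p(a, b, c)
p = {(0,0,0): .10, (0,0,1): .05, (0,1,0): .15, (0,1,1): .05,
     (1,0,0): .20, (1,0,1): .10, (1,1,0): .05, (1,1,1): .30}

def marg(keep):
    """Marginal distribution over the named coordinates, e.g. 'bc'."""
    m = defaultdict(float)
    for (a, b, c), pr in p.items():
        m[tuple(v for v, k in zip((a, b, c), 'abc') if k in keep)] += pr
    return m

pc, pbc, pac = marg('c'), marg('bc'), marg('ac')

# direct definition: sum of p(a,b,c) * log [ p(a|b,c) / p(a|c) ]
direct = sum(pr * log((pr / pbc[(b, c)]) / (pac[(a, c)] / pc[(c,)]))
             for (a, b, c), pr in p.items())

# Fact 2.2: E_{(b,c)} [ D( dist(A | B=b, C=c) || dist(A | C=c) ) ]
fact = 0.0
for (b, c), w in pbc.items():
    for a in (0, 1):
        mu = p[(a, b, c)] / w          # Pr[A = a | B = b, C = c]
        nu = pac[(a, c)] / pc[(c,)]    # Pr[A = a | C = c]
        fact += w * mu * log(mu / nu)
```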

round lower bound in this model is in fact another motivation for our framework. To our knowledge, the only previous lower bounds specific to bounded-round protocols in the coordinator model are those of [7, 8]; we hope that our framework facilitates proving such lower bounds in this model (understanding the power of bounded-round protocols is regarded as an interesting open question in the literature; see, e.g., [60]).

Finally, we prove the lower bound for maximum coverage using this framework by designing a family of sets which we call randomly nearly disjoint; roughly speaking, the sets in this family have the property that any suitably small random subset of one set is essentially disjoint from any other set in the family. A reader familiar with [25] may realize that this definition is similar to the edifice set-system introduced in [25]; the main difference here is that we need every random subset of each set in the family to be disjoint from other sets, as opposed to a pre-specified collection of sets as in edifices [25]. As a result, the algebraic techniques of [25] do not seem suitable for our purpose and we prove our results using different techniques. The lower bound then follows by instantiating the hard distribution in our framework with this family for maximum coverage and proving the approximation lower bound.

Upper Bounds (Result 2). We achieve the first algorithm in Result 2, namely an (e/(e−1))-approximation algorithm for maximum coverage (and submodular maximization), via an implementation of a thresholding greedy algorithm (see, e.g., [16, 25]) in the distributed setting using the sample-and-prune technique of [48] (a similar thresholding greedy algorithm was used recently in [51] for streaming maximum coverage). The main idea in the sample-and-prune technique is to sample a collection of sets from the machines in each round and send them to the coordinator, who can build a partial greedy solution on those sets; the coordinator then communicates this partial solution to each machine, and in the next round the machines only sample from the sets that can have a substantial marginal contribution to the partial greedy solution maintained by the coordinator, hence essentially pruning the input sets. Using a different greedy algorithm and a more careful choice of the threshold on the necessary marginal contribution from each set, we show that an (e/(e−1))-approximation can be obtained in a constant number of rounds and sublinear communication (as opposed to the approach of [48], which requires Ω(log n) rounds).

The second algorithm in Result 2, namely a k^O(1/r)-approximation protocol for any number of rounds r, however, is more involved and is based on a new iterative sketching method specific to the maximum coverage problem. Recall that in our previous algorithm the machines are mainly "observers" and simply provide the coordinator with a sample of their input; our second algorithm is in some sense on the other extreme. In this algorithm, each machine is responsible for computing a suitable sketch of its input, which, roughly speaking, is a collection of sets that tries to "represent" each optimal set in the input of this machine. The coordinator is also maintaining a greedy solution that is updated based on the sketches received from each machine. The elements covered by this collection are shared by the machines to guide them towards the sets that are "misrepresented" by the sketches computed so far, and the machines update their sketches for the next round accordingly. We show that either the greedy solution maintained by the coordinator is already a good approximation or the final sketches computed by the machines are now a good representative of the optimal sets and hence contain a good solution.

4 A Framework for Distributed Lower Bounds

We introduce a general framework for proving communication complexity lower bounds for bounded-round protocols in the distributed coordinator model. Consider a decision problem[3] P defined by the family of functions P_s : {0,1}^s → {0,1} for any integer s ≥ 1; we refer to s as the size of the problem and to {0,1}^s as its domain. Note that P_s can be a partial function, i.e., not necessarily defined on its whole domain. An instance I of problem P_s is simply a binary string of length s. We say that I is a Yes instance if P_s(I) = 1 and is a No instance if P_s(I) = 0. For example, P_s can denote the decision version of the maximum coverage problem over m sets and n elements with parameter k (in which case s would be a fixed function of m, n, and k depending on the representation of the input) such that there is a relatively large gap (as a function of, say, k) between the value of the optimal solution in Yes and No instances. We can also consider the problem P_s in the distributed model, whereby we distribute each instance between the players. The distributed coverage problem, for instance, can be modeled here by partitioning the sets in the instances of P_s across the players.

To prove a communication lower bound for some problem P, one typically needs to design a hard input distribution D on instances of the problem P, and then show that distinguishing between the Yes and No cases in instances sampled from D, with some sufficiently large probability, requires large communication.

[3] While we present our framework for decision problems, with some modifications, it also extends to search problems. We elaborate more on this in the full version of the paper.

Such a distribution inevitably depends on the specific problem P at hand. We would like to abstract out this dependence on the underlying problem and design a template hard distribution for any problem P using this abstraction. Then, to achieve a lower bound for a particular problem P, one only needs to focus on the problem-specific parts of this template and design them according to the problem P at hand. We emphasize that obviously we are not going to prove a communication lower bound for every possible distributed problem; rather, our framework reduces the problem of proving a communication lower bound for a problem P to designing appropriate problem-specific gadgets for P, which determine the strength of the lower bound one can ultimately prove using this framework. With this plan in mind, we now describe a high level overview of our framework.

4.1 A High Level Overview of the Framework. Consider any decision problem P; we construct a recursive family of distributions D_0, D_1, ... where D_r is a hard input distribution for r-round protocols of P_{s_r}, i.e., for instances of size s_r of the problem P, when the input is partitioned between p_r players. Each instance in D_r is a careful "combination" of many sub-instances of problem P_{s_{r−1}} over different subsets of p_{r−1} players, which are sampled (essentially) from D_{r−1}. We ensure that a small number of these sub-instances are "special" in that to solve the original instance of P_{s_r}, at least one of these instances of P_{s_{r−1}} (over p_{r−1} players) necessarily needs to be solved. We "hide" the special sub-instances in the input of the players in a way that locally, no player is able to identify them, and show that the first round of communication in any protocol with a small communication is spent only in identifying these special sub-instances. We then inductively show that as solving the special instance is hard for (r−1)-round protocols, the original instance must be hard for r-round protocols as well.

We now describe this distribution in more detail. The p_r players in the instances of distribution D_r are partitioned into g_r groups P_1, ..., P_{g_r}, each of size p_{r−1} (hence g_r = p_r/p_{r−1}). For every group i ∈ [g_r] and every player q ∈ P_i, we create w_r instances I^i_1, ..., I^i_{w_r} of the problem P_{s_{r−1}} sampled from the distribution D_{r−1}. The domain of each instance I^i_j is the same across all players in P_i and is different (i.e., disjoint) between any two j ≠ j' ∈ [w_r]; we refer to w_r as the width parameter. The next step is to pack all these instances into a single instance I^i(q) for the player q; this is one of the places where we need a problem-specific gadget, namely a packing function[4] that can pack w_r instances of problem P_{s_{r−1}} into a single instance of problem P_{s'_r} for some s'_r ≥ s_{r−1}. We postpone the formal description of the packing functions to the next section, but roughly speaking, we require each player to be able to construct the instance I^i(q) from the instances I^i_1, ..., I^i_{w_r} and vice versa. As such, even though each player is given as input a single instance I^i, we can think of each player as conceptually "playing" in w_r different instances I^i_1, ..., I^i_{w_r} of P_{s_{r−1}} instead.

In each group i ∈ [g_r], one of the instances, namely I^i_{j*} for j* ∈ [w_r], is the special instance of the group: if we combine the inputs of the players in P_i on their special instance I^i_{j*}, we obtain an instance which is sampled from the distribution D_{r−1}. On the other hand, all other instances are fooling instances: if we combine the inputs of the players in P_i on their instance I^i_j for j ≠ j*, the resulting instance is not sampled from D_{r−1}; rather, it is an instance created by picking the input of each player independently from the corresponding marginal of D_{r−1} (D_{r−1} is not a product distribution, thus these two distributions are not identical). Nevertheless, by construction, each player is oblivious to this difference and hence is unaware of which instance in the input is the special instance (since the marginal distribution of a player's input is identical under the two distributions).

Finally, we need to combine the instances I^1, ..., I^{g_r} to create the final instance I. To do this, we need another problem-specific gadget, namely a relabeling function. Roughly speaking, this function takes as input the index j*, i.e., the index of the special instances, and the instances I^1, ..., I^{g_r}, and creates the final instance I, while "prioritizing" the role of the special instances in I. By prioritizing we mean that in this step, we need to ensure that the value of P_{s_r} on I is the same as the value of P_{s_{r−1}} on the special instances. At the same time, we also need to ensure that this additional relabeling does not reveal the index of the special instance to each individual player, which requires a careful design depending on the problem at hand.

The above family of distributions is parameterized by the sequences {s_r} (size of instances), {p_r} (number of players), and {w_r} (the width parameters), plus the packing and relabeling functions. Our main result in this section is that if these sequences and functions satisfy some natural conditions (similar to what is discussed above), then any r-round protocol for the problem P_{s_r} on the distribution D_r requires Ω_r(w_r) communication.

[4] For a reader familiar with previous work in [12, 7, 8], we note that a similar notion to a packing function is captured via a collection of disjoint blocks of vertices in [7] (for finding large matchings), Ruzsa-Szemerédi graphs in [12] (for estimating maximum matching size), and a family of small-intersecting sets in [8] (for finding good allocations in combinatorial auctions). In this work, we use the notion of randomly nearly disjoint set-systems defined in Section 5.1. See the full version of the paper for more details on the connection between this framework and previous work.

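The recursive structure described in this overview can be made concrete with a structure-only sketch. Everything problem-specific is stubbed out with hypothetical stand-ins: `sample_base` plays the role of D_0, `marginal` a player's marginal distribution, the grouping is an arbitrary halving, and the packing and relabeling gadgets are omitted (instances are just nested lists). A single shared special index j* is drawn; each group embeds a copy of the special (r−1)-level instance at position j*, and the remaining slots are fooling instances drawn as an independent product of per-player marginals.

```python
# Structure-only sketch of the recursive template distribution D_r (toy
# code; the real packing/labeling gadgets and base distribution are
# problem-specific and are not modeled here).
import random

def sample_D(r, players, w, rng, sample_base, marginal):
    if r == 0:
        return sample_base(players, rng), None      # base case: no special index
    group_size = max(1, players // 2)               # hypothetical grouping p_{r-1}
    j_star = rng.randrange(w)                       # hidden special index
    special, _ = sample_D(r - 1, group_size, w, rng, sample_base, marginal)
    groups = []
    for _ in range(players // group_size):
        insts = [special if j == j_star             # embedded special copy
                 else [marginal(rng) for _ in range(group_size)]  # fooling
                 for j in range(w)]
        groups.append(insts)                        # packing/labeling abstracted away
    return groups, j_star
```

Note how each group holds an identical copy of the special instance at the same hidden index, while a player looking only at its own coordinates cannot tell the special slot apart from the fooling ones.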
We remark that while we state our communication lower bound only in terms of w_r, to obtain any interesting lower bound using this technique, one needs to ensure that the width parameter w_r is relatively large in the size of the instance s_r; this is also achieved by designing suitable packing and labeling functions (as well as a suitable representation of the problem). However, as "relatively large" depends heavily on the problem at hand, we do not add this requirement to the framework explicitly. A discussion on possible extensions of this framework, as well as its connection to previous work, appears in the full version of the paper.

4.2 The Formal Description of the Framework. We now describe our framework formally. As stated earlier, to use this framework for proving a lower bound for any specific problem P, one needs to define appropriate problem-specific gadgets. These gadgets are functions that map multiple instances of P_s to a single instance of P_{s'} for some s' ≥ s. The exact application of these gadgets will become clear shortly in the description of our hard distribution.

Definition 4.1. (Packing Function) For s' ≥ s ≥ 1 and w ≥ 1, we refer to a function σ which maps any tuple of instances (I_1, ..., I_w) of P_s to a single instance I of P_{s'} as a packing function of width w.

Definition 4.2. (Labeling Family) For s'' ≥ s' ≥ 1 and g ≥ 1, we refer to a family of functions Φ = {φ_i}, where each φ_i is a function that maps any tuple of instances (I^1, ..., I^g) of P_{s'} to a single instance I of P_{s''}, as a g-labeling family, and to each function in this family as a labeling function.

We start by designing the following recursive family of hard distributions {D_r}_{r≥0}, parameterized by sequences {p_r}_{r≥0}, {s_r}_{r≥0}, and {w_r}_{r≥0}. We require {p_r}_{r≥0} and {s_r}_{r≥0} to be increasing sequences and {w_r}_{r≥0} to be non-increasing. In two places marked in the distribution, we require one to design the aforementioned problem-specific gadgets for the distribution.

Distribution D_r: A template hard distribution for r-round protocols of P for any r ≥ 1.

Parameters: p_r: number of players, s_r: size of the instance, w_r: width parameter, σ_r: packing function, and Φ_r: labeling family.

1. Let P be the set of p_r players and define g_r := p_r/p_{r−1}; partition the players in P into g_r groups P_1, ..., P_{g_r}, each containing p_{r−1} players.

2. Design a packing function σ_r of width w_r which maps w_r instances of P_{s_{r−1}} to an instance of P_{s'_r} for some s_{r−1} ≤ s'_r ≤ s_r.

3. Pick an instance I*_r ∼ D_{r−1} over the set of players [p_{r−1}] and a domain of size s_{r−1}.

4. For each group P_i for i ∈ [g_r]:
   (a) Pick an index j* ∈ [w_r] uniformly at random and create w_r instances I^i_1, ..., I^i_{w_r} of problem P_{s_{r−1}} as follows:
       (i) Each instance I^i_j for j ∈ [w_r] is over the players P_i and domain D_j = {0,1}^{s_{r−1}}.
       (ii) For the index j* ∈ [w_r], I^i_{j*} = I*_r, by mapping (arbitrarily) [p_{r−1}] to P_i and the domain of I*_r to D_{j*}.
       (iii) For any other index j ≠ j*, I^i_j ∼ D⊗_{r−1} := ⊗_{q∈P_i} D_{r−1}(q), i.e., the product of the marginal distributions of the input to each player q ∈ P_i in D_{r−1}.
   (b) Map all the instances I^i_1, ..., I^i_{w_r} to a single instance I^i using the function σ_r.

5. Design a g_r-labeling family Φ_r which maps g_r instances of P_{s'_r} to a single instance of P_{s_r}.

6. Pick a labeling function φ from Φ_r uniformly at random and map the g_r instances I^1, ..., I^{g_r} of P_{s'_r} to the output instance I of P_{s_r} using φ.

7. The input to each player q ∈ P_i in the instance I, for any i ∈ [g_r], is the input of player q in the instance I^i, after applying the mapping φ to map I^i to I.

We remark that in the above distribution, the "variables" in each instance sampled from D_r are the instances I^i_1, ..., I^i_{w_r} for all groups i ∈ [g_r], the index j* ∈ [w_r], and both the choice of the labeling family Φ_r and the labeling function φ. On the other hand, the "constants" across all instances of D_r are the parameters p_r, s_r, and w_r, the choice of the grouping P_1, ..., P_{g_r}, and the packing function σ_r.

To complete the description of this recursive family of distributions, we need to explicitly define the distribution D_0 between p_0 players over {0,1}^{s_0}. We let D_0 := (1/2)·D_0^Yes + (1/2)·D_0^No, where D_0^Yes is a distribution over Yes instances of P_{s_0} and D_0^No is a distribution over No instances. The choice of the distributions D_0^Yes and D_0^No is again problem-specific.

We start by describing the main properties of the packing and labeling functions that are required for our lower bound. For any player q ∈ P_i, define I^i(q) := (I^i_1(q), ..., I^i_{w_r}(q)), where for any j ∈ [w_r], I^i_j(q) denotes the input of player q in the instance I^i_j. We require the packing and labeling functions to be locally computable, defined as follows.

Definition 4.3. (Locally Computable) We say that the packing function σ_r and the labeling family Φ_r are locally computable iff any player q ∈ P_i for i ∈ [g_r] can compute the mapping of I^i(q) to the final instance I locally, i.e., only using σ_r, the sampled labeling function φ ∈ Φ_r, and the input I^i(q).

We use φ_q to denote the local mapping of player q ∈ P_i for mapping I^i(q) to I; since σ_r is fixed in the distribution D_r, across different instances sampled from D_r, φ_q is only a function of φ. The input to each player q ∈ P_i is uniquely determined by I^i(q) and φ_q.

Inside each instance I sampled from D_r, there exists a unique embedded instance I*_r which is sampled from D_{r−1}. Moreover, this instance is essentially "copied" g_r times, once in each instance I^i_{j*} for each group P_i. We refer to the instance I*_r, as well as its copies I^1_{j*}, ..., I^{g_r}_{j*}, as special instances and to all other instances as fooling instances. We require the packing and labeling functions to be preserving, defined as follows.

Definition 4.4. (γ-Preserving) We say that the packing function and the labeling family are γ-preserving for a parameter γ ∈ (0,1), iff

Pr_{I∼D_r}[ P_{s_r}(I) = P_{s_{r−1}}(I*_r) ] ≥ 1 − γ.

In other words, the value of P_{s_r} on an instance I should be equal to the value of P_{s_{r−1}} on the embedded special instance I*_r of I w.p. 1 − γ.

Recall that the packing function σ_r is a deterministic function that depends only on the distribution D_r itself and not on any specific instance (and hence the underlying special instances); on the other hand, the preserving property requires the packing and labeling functions to somehow "prioritize" the special instances over the fooling instances (in determining the value of the original instance). To achieve this property, the labeling family is allowed to vary based on the specific instance sampled from the distribution D_r. However, we need to limit the dependence of the labeling family on the underlying instance, which is captured through the definition of obliviousness below.

Definition 4.5. (Oblivious)
(i) The only variable in D_r which Φ_r can depend on is j* ∈ [w_r] (it can depend arbitrarily on the constants in D_r).
(ii) For any player q ∈ P_i, the local mapping φ_q and j* are independent of each other in D_r.

Intuitively speaking, Condition (i) above implies that a function φ ∈ Φ_r can "prioritize" the special instances based on the index j*, but it cannot use any further knowledge about the special or fooling instances. For example, one may be able to use φ to distinguish special instances from other instances, i.e., determine j*, but would not be able to infer whether the special instance is a Yes instance or a No one based only on φ. Condition (ii), on the other hand, implies that for each player q, no information about the special instance is revealed by the local mapping φ_q. This means that given the function φ_q (and not φ as a whole), one is not able to determine j*.

Finally, we say that the family of distributions {D_r} is a γ-hard recursive family iff (i) it is parameterized by increasing sequences {p_r} and {s_r} and a non-increasing sequence {w_r}, and (ii) the packing and labeling functions in the family are locally computable, γ-preserving, and oblivious. We are now ready to present the main theorem of this section.

Theorem 4.1. Let R ≥ 1 be an integer and suppose {D_r}_{r=0}^{R} is a γ-hard recursive family for some γ ∈ (0,1); for any r ≤ R, any r-round protocol for P_{s_r} on D_r which errs w.p. at most 1/3 − r·γ requires Ω(w_r/r^4) total communication.

4.3 Correctness of the Framework: Proof of Theorem 4.1. We first set up some notation. For any r-round protocol π and any ℓ ∈ [r], we use Π_ℓ := (Π_{ℓ,1}, ..., Π_{ℓ,p_r}) to denote the random variable for the transcript of the messages communicated by each player in round ℓ of π. We further use Φ (resp. Φ_q) to denote the random variable for φ (resp. the local mapping φ_q), and J to denote the random variable for the index j*. Finally, for any i ∈ [g_r] and j ∈ [w_r], I^i_j denotes the random variable for the instance I^i_j.

We start by stating a simple property of oblivious mapping functions.

Proposition 4.1. For any i ∈ [g_r] and any player q ∈ P_i, conditioned on the input (I^i(q), φ_q) to player q, the index j* ∈ [w_r] is distributed uniformly at random.

Proof. By Condition (ii) of obliviousness in Definition 4.5, Φ_q ⊥ J, and hence J is independent of the event Φ_q = φ_q. Moreover,
We say that the labeling family Φr is by Condition (i) of Definition 4.5, Φq cannot depend i i oblivious iff it satisfies the following properties: on I (q) and hence I (q) ⊥ Φq = φq also. Now notice
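To make the recursive structure of the template concrete, the following toy sketch samples an instance in the shape of D_r: a special index j*, a special instance drawn once from D_{r-1} and copied into every group, and fooling instances drawn from the product of per-player marginals. The instance contents (Yes/No tokens), the trivial packing (tupling the w inputs of a player), and the omitted labeling step are illustrative stand-ins, not the problem-specific gadgets of Section 5.

```python
import random

def sample_Dr(r, w, g):
    """Toy version of the template D_r with width w and g groups per level.
    Returns (inputs, j_star), where inputs[q] is the packed input of player q,
    i.e., the tuple of q's inputs in the w sub-instances of its group."""
    if r == 0:
        # Base case D_0: a single player holding Yes or No w.p. 1/2 each.
        return ["Yes" if random.random() < 0.5 else "No"], None
    special, _ = sample_Dr(r - 1, w, g)      # I*_r ~ D_{r-1}, sampled once
    p_prev = len(special)                    # p_{r-1} players per group
    j_star = random.randrange(w)             # special index, uniform in [w]
    inputs = []
    for i in range(g):                       # every group embeds a copy of I*_r
        for q in range(p_prev):
            row = []
            for j in range(w):
                if j == j_star:
                    row.append(special[q])   # copy of the special instance
                else:
                    # fooling instance: product of the D_{r-1} marginals, so
                    # draw a fresh instance and keep only player q's input
                    fresh, _ = sample_Dr(r - 1, w, g)
                    row.append(fresh[q])
            inputs.append(tuple(row))
    return inputs, j_star

inputs, j_star = sample_Dr(2, 3, 2)
print(len(inputs), j_star)                   # p_2 = g^2 = 4 players
```

Note how the "variables" of the distribution (the sub-instances, j*, and the sampled randomness) change per sample, while the "constants" (w, g, and the grouping) are fixed across samples.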


≤ (1/w_r) · Σ_{q∈P} |Π_{1,q}| = (1/w_r) · |Π_1|.   (by Claim 4.8)

Protocol π_{r-1}: The (r−1)-round protocol obtained from π_r for the distribution D_{r-1}.

1. Let P = [p_r] and partition P into g_r equal-size groups P_1, ..., P_{g_r} as is done in D_r. Create an instance I_r of D_r as follows:

2. Using public randomness, the players in P^{r-1} sample R := (Π_1, φ, j*) ∼ (dist(π_r), D_r), i.e., from the (joint) distribution of protocol π_r over the distribution D_r.

3. The q-th player in P^{r-1} (in instance I_{r-1}) mimics the role of the q-th player in each group P_i for i ∈ [g_r] in I_r, denoted by player (i, q), as follows:

(a) Set the input for (i, q) in the special instance I^i_{j*}(q) of I_r as the original input of q in I_{r-1}, i.e., I_{r-1}(q), mapped via σ_r and φ to I_r (as is done in D_r). This is possible by the locally computable property of σ_r and φ in Definition 4.3.

(b) Sample the input for (i, q) in all the fooling instances I^i_j(q) of I_r for any j ≠ j* using private randomness from the correlated distribution D_r | (I*_r = I_{r-1}, (Π_1, Φ, J) = R). This sampling is possible by Proposition 4.2 below.

4. Run the protocol π_r from the second round onwards on I_r, assuming that in the first round the communicated message was Π_1, and output the same answer as π_r.

For any tuple (Π_1, φ, j), we define the distribution ψ(Π_1, φ, j) as the distribution of I*_r in D_r conditioned on Π_1 = Π_1, Φ = φ, and J = j. Recall that the original distribution of I*_r is D_{r-1}. In the following, we show that if the first message sent by the players is not too large, and hence does not reveal much information about I*_r by Lemma 4.6, then even after the aforementioned conditioning the distribution of I*_r does not change by much on average. Formally,

Lemma 4.9. If |Π_1| = o(w_r/r^4), then

E_{(Π_1,φ,j)}[ ‖ψ(Π_1, φ, j) − D_{r-1}‖ ] = o(1/r^2).

In Line (3b), the distribution the players are sampling from depends on Π_1, φ, j*, which are public knowledge (through sampling via public randomness), as well as on I*_r, which is not public information, as each player q only knows I*_r(q) and not all of I*_r. Moreover, while the random variables I^i_j(q) (for j ≠ j*) are originally independent across different players q (as they are sampled from the product distribution D^⊗_{r-1}), conditioning on the first message of the protocol, i.e., Π_1, correlates them, and hence a priori it is not clear whether the sampling in Line (3b) can be done without any further communication. Nevertheless, we prove that this is the case, and that to sample from the distribution in Line (3b), each player only needs to know I*_r(q) and not I*_r.

Proposition 4.2. Suppose I is the collection of all instances in the distribution D_r and I(q) is the input to player q in the instances in which q participates; then,

dist(I | I*_r = I_{r-1}, (Π_1, Φ, J) = R) = ⨉_{q∈P} dist(I(q) | I*_r(q) = I_{r-1}(q), (Π_1, Φ, J) = R).

Proof. Fix any player q ∈ P, and recall that I(−q) is the collection of the inputs to all players other than q across all instances (special and fooling). We prove that I(q) ⊥ I(−q) | (I*_r(q), Π_1, Φ, J) in D_r, which immediately implies the result. To prove this claim, by Fact 2.1-(2), it suffices to show that I(I(q); I(−q) | I*_r(q), Π_1, Φ, J) = 0. Define Π_1^{−q} as the set of all messages in Π_1 except for the message of player q, i.e., Π_{1,q}. We have,

I(I(q); I(−q) | I*_r(q), Π_1, Φ, J) ≤ I(I(q); I(−q) | I*_r(q), Π_{1,q}, Φ, J),

since I(q) ⊥ Π_1^{−q} | I(−q), I*_r(q), Π_{1,q}, Φ, J, as the input to the players in P \ {q} is uniquely determined by I(−q), Φ (by the locally computable property in Definition 4.3) and hence Π_1^{−q} is deterministic after the conditioning; this independence means that conditioning on Π_1^{−q} in the RHS above can only decrease the mutual information by Proposition 2.2. We can further bound the RHS above by,

I(I(q); I(−q) | I*_r(q), Π_{1,q}, Φ, J) ≤ I(I(q); I(−q) | I*_r(q), Φ, J),

since I(−q) ⊥ Π_{1,q} | I(q), I*_r(q), Φ, J, as the input to player q is uniquely determined by I(q), Φ (again by Definition 4.3) and hence after the conditioning, Π_{1,q} is deterministic; this implies that conditioning on Π_{1,q} in the RHS above can only decrease the mutual information by Proposition 2.2. Finally, observe that I(I(q); I(−q) | I*_r(q), Φ, J) = 0 by Fact 2.1-(2), since after conditioning on I*_r(q), the only remaining instances in I(q) are fooling instances, which are sampled from the distribution D^⊗_{r-1} that is independent across the players. This implies that I(I(q); I(−q) | I*_r(q), Π_1, Φ, J) = 0 as well, which finalizes the proof.

Having proved Proposition 4.2, it is now easy to see that π_{r-1} is indeed a valid (r−1)-round protocol for the distribution D_{r-1}: each player q can perform the sampling in Line (3b) without any communication as (I*_r(q), Π_1, Φ, J) are all known to q; this allows the players to simulate the first round of protocol π_r without any communication, and hence they need only r−1 rounds of communication to compute the answer of π_r. We can now prove that,

Claim 4.11. Assuming π_r is a δ-error protocol on D_r, π_{r-1} would be a (δ + γ + o(1/r^2))-error protocol on D_{r-1}.

Proof. [Proof Sketch] Our goal is to calculate the probability that π_{r-1} errs on an instance I_{r-1} ∼ D_{r-1}. For the sake of analysis, suppose that I_{r-1} is instead sampled from the distribution ψ for a randomly chosen tuple (Π_1, φ, j*) (defined before Lemma 4.9). Notice that by Lemma 4.9, these two distributions are quite close to each other in total variation distance, and hence if π_{r-1} has a small error on distribution ψ, it necessarily has a small error on D_{r-1} as well (by Fact 2.4).

Using Proposition 4.2, it is easy to verify that if I_{r-1} is sampled from ψ, then the instance I_r constructed by π_{r-1} is sampled from D_r and moreover I*_r = I_{r-1}. As such, since (i) π_r is a δ-error protocol for D_r, (ii) the answers to I_r and I*_r = I_{r-1} are the same w.p. 1 − γ (by the γ-preserving property in Definition 4.4), and (iii) π_{r-1} outputs the same answer as π_r, protocol π_{r-1} is a (δ + γ)-error protocol for ψ. We defer the formal proof and the detailed calculation of this probability of error to the full version of the paper.
We are now ready to finalize the proof of Lemma 4.10. Suppose π_r is a deterministic δ(r)-error protocol for D_r with communication cost ‖π_r‖ = o(w_r/r^4). By Claim 4.11, π_{r-1} would be a randomized δ(r−1)-error protocol for D_{r-1} with ‖π_{r-1}‖ ≤ ‖π_r‖ (as δ(r−1) = δ(r) + γ + o(1/r^2)). By an averaging argument, we can fix the randomness in π_{r-1} to obtain a deterministic protocol π'_{r-1} over the distribution D_{r-1} with the same error δ(r−1) and communication ‖π'_{r-1}‖ = o(w_r/r^4) = o(w_{r-1}/r^4) (as {w_r}_{r≥0} is a non-increasing sequence). But such a protocol contradicts the induction hypothesis for (r−1)-round protocols, finalizing the proof.

Proof. [Proof of Theorem 4.1] By Lemma 4.10, any deterministic δ(r)-error r-round protocol for D_r requires Ω(w_r/r^4) total communication. This immediately extends to randomized protocols by an averaging argument, i.e., the easy direction of Yao's minimax principle [62]. The statement in the theorem now follows from this since for any r ≥ 0, δ(r) = δ(r−1) − γ − o(1/r^2) = δ(0) − r·γ − Σ_{ℓ=1}^{r} o(1/ℓ^2) = 1/2 − r·γ − o(1) > 1/3 − r·γ (as δ(0) = 1/2 and Σ_{ℓ=1}^{r} 1/ℓ^2 is a convergent series and hence is bounded by an absolute constant independent of r).

5 Distributed Lower Bound for Max-Coverage

We prove our main lower bound for maximum coverage in this section, formalizing Result 1.

Theorem 5.1. For integers 1 ≤ r, c ≤ o(log k / log log k) with c ≥ 4r, any r-round protocol for the maximum coverage problem that can approximate the value of the optimal solution to within a factor better than (1/(2c·log k)) · k^{1/2r} w.p. at least 3/4 requires Ω((k/r^4) · m^{c/((c+2)·4r)}) communication per machine. The lower bound applies to instances with m sets, n = m^{1/Θ(c)} elements, and k = Θ(n^{2r/(2r+1)}).

The proof is based on an application of Theorem 4.1. In the following, let c ≥ 1 be any integer (as in Theorem 5.1) and let N ≥ 12c^2 be a sufficiently large integer which we use to define the main parameters of our problem. To invoke Theorem 4.1, we need to instantiate the recursive family of distributions {D_r}_{r=0}^{c} in Section 4 with appropriate sequences and gadgets for the maximum coverage problem. We first define the sequences (for all 0 ≤ r ≤ c):

k_r = p_r = (N^2 − N)^r,  n_r = N^{2r+1},  m_r = N^c · (N^2 − N)^r,  w_r = N^c,  g_r = (N^2 − N).

Here, m_r, n_r, and k_r respectively represent the number of sets, the number of elements, and the parameter k in the maximum coverage problem in the instances of each distribution D_r, and together they identify the size of each instance (i.e., the parameter s_r defined in Section 4 for the distribution D_r). Moreover, p_r, w_r, and g_r represent the number of players, the width parameter, and the number of groups in D_r, respectively (notice that g_r = p_r/p_{r-1} as needed in distribution D_r).

Using the sequences above, we define:

coverage(N, r): the problem of deciding whether the optimal k_r-cover of the universe [n_r] with m_r input sets covers at least k_r · N elements (Yes case), or at most k_r · 2c · log(N^{2r}) elements (No case).

Notice that there is a gap of roughly N ≈ k_r^{1/2r} (ignoring the lower-order terms) between the value of the optimal solution in the Yes and No cases of coverage(N, r). We prove a lower bound for deciding between the Yes and No instances of coverage(N, r) when the input sets are partitioned between the players, which implies an identical lower bound for algorithms that can approximate the value of the optimal solution in maximum coverage to within a factor smaller than (roughly) k^{1/2r}.

Recall that to use the framework introduced in Section 4, one needs to define two problem-specific gadgets, i.e., a packing function and a labeling family. In the following section, we design a crucial building block for our packing function.

RND Set-Systems. Our packing function is based on the following set-system.

Definition 5.1. For integers N, r, c ≥ 1, an (N, r, c)-randomly nearly disjoint (RND) set-system over a universe X of N^{2r} elements is a collection S of subsets of X satisfying the following properties:

(i) Each set A ∈ S is of size N^{2r−1}.

(ii) Fix any set B ∈ S and suppose C_B is a collection of N^{c·r} subsets of X whereby each set in C_B is chosen by picking an arbitrary set A ≠ B in S, and then picking an N-subset uniformly at random from A (we do not assume independence between the sets in C_B). Then,

Pr[ ∃ S ∈ C_B s.t. |S ∩ B| ≥ 2c · log N ] = o(1/N^3).

Intuitively, this means that any random N-subset of some set A ∈ S is essentially disjoint from any other set B ∈ S w.h.p.
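The near-disjointness behind RND set-systems can be observed empirically. The sketch below is only an illustration of the probabilistic intuition (the parameters, number of sets, and trial count are toy choices, not the construction of Lemma 5.2): it samples a small random set-system and measures how much random N-subsets of one set intersect a fixed other set.

```python
import random

random.seed(0)
N, r = 4, 2
X = range(N ** (2 * r))              # universe of N^{2r} = 256 elements
A_size = N ** (2 * r - 1)            # each set has N^{2r-1} = 64 elements

# A toy random set-system: 10 independent uniform 64-subsets of X.
S = [frozenset(random.sample(X, A_size)) for _ in range(10)]

# RND-style experiment: pick N-subsets of a set A != B uniformly at random
# and record their intersection with the fixed set B; near-disjointness
# predicts these intersections stay far below N on average.
B = S[0]
overlaps = []
for _ in range(200):
    A = random.choice(S[1:])
    T = random.sample(sorted(A), N)  # a uniform N-subset of A
    overlaps.append(len(set(T) & B))

print(max(overlaps), sum(overlaps) / len(overlaps))
```

With these toy parameters the expected overlap per trial is about N · |A ∩ B| / |A| ≈ 1, far below the subset size N, matching the intuition stated after Definition 5.1.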

We prove the existence of large RND set-systems in the full version of the paper.

Lemma 5.2. For integers 1 ≤ r ≤ c and any sufficiently large integer N ≥ c, there exists an (N, r, c)-RND set-system S of size N^c over any universe X of size N^{2r}.

5.1 Proof of Theorem 5.1. To prove Theorem 5.1 using our framework in Section 4, we parameterize the recursive family of distributions {D_r}_{r=0}^{c} for the coverage problem, i.e., coverage(N, r), with the aforementioned sequences plus the packing and labeling functions which we define below.

Packing function σ_r: Mapping instances I^i_1, ..., I^i_{w_r}, each over n_{r-1} = N^{2r−1} elements and m_{r-1} sets, for any group i ∈ [g_r], to a single instance I^i on N^{2r} elements and w_r · m_{r-1} sets.

1. Let A = {A_1, ..., A_{w_r}} be an (N, r, c)-RND set-system with w_r = N^c sets over some universe X_i of N^{2r} elements (guaranteed to exist by Lemma 5.2 since c < N). By the definition of A, for any set A_j ∈ A, |A_j| = N^{2r−1} = n_{r-1}.

2. Return the instance I^i over the universe X_i with the collection of all sets in I^i_1, ..., I^i_{w_r} after mapping the elements of I^i_j to A_j arbitrarily.

We now define the labeling family Φ_r as a function of the index j* ∈ [w_r] of the special instances.

Labeling family Φ_r: Mapping instances I^1, ..., I^{g_r} over N^{2r} elements to a single instance I on n_r = N^{2r+1} elements and m_r sets.

1. Let j* ∈ [w_r] be the index of the special instance in the distribution D_r. For each permutation π of [N^{2r+1}] we have a unique function φ(j*, π) in the family.

2. For any instance I^i for i ∈ [g_r], map the elements in X_i \ A_{j*} to π(1), ..., π(N^{2r} − N^{2r−1}) and the elements in A_{j*} to π(N^{2r} + (i−1) · N^{2r−1}), ..., π(N^{2r} + i · N^{2r−1} − 1).

3. Return the instance I over the universe [N^{2r+1}], which consists of the collection of all sets in I^1, ..., I^{g_r} after the mapping above.

Finally, we define the base case distribution D_0 of the recursive family {D_r}_{r=0}^{c}. By the definition of our sequences, this distribution is over p_0 = 1 player, n_0 = N elements, and m_0 = 1 set.

Distribution D_0: The base case of the recursive family of distributions {D_r}_{r=0}^{c}.

1. W.p. 1/2, the player has a single set of size N covering the universe (the Yes case).

2. W.p. 1/2, the player has a single set {∅}, i.e., a set that covers no elements (the No case).

To invoke Theorem 4.1, we prove that this family is a γ-hard recursive family for the parameter γ = o(r/N). The sequences clearly satisfy the required monotonicity properties. It is also straightforward to verify that σ_r and the functions φ ∈ Φ_r are locally computable (Definition 4.3): both functions specify a mapping of elements to the new instance, and hence each player can compute its final input by simply mapping its original input sets according to σ_r and φ to the new universe. In other words, the local mapping of each player q ∈ P_i only specifies which element in the instance I corresponds to which element in I^i_j(q) for j ∈ [w_r]. It thus remains to prove the preserving and obliviousness properties of the packing and labeling functions.

We start by showing that the labeling family Φ_r is oblivious. The first property of Definition 4.5 is immediate to see, as Φ_r is only a function of j* and σ_r. For the second property, consider any group P_i and instance I^i; the labeling function never maps two elements belonging to a single instance I^i to the same element in the final instance (there are, however, overlaps between the elements across different groups). Moreover, picking a uniformly at random labeling function φ from Φ_r (as is done in D_r) results in mapping the elements of I^i according to a random permutation; as such, the set of elements in instance I^i is mapped to a uniformly at random chosen subset of the elements in I, independent of the choice of j*. As the local mapping φ_q of each player q ∈ P_i is only a function of the set of elements to which the elements of I^i are mapped, φ_q is also independent of j*, proving that Φ_r is indeed oblivious.

The rest of this section is devoted to the proof of the preserving property of the packing and labeling functions defined for maximum coverage. We first make some observations about the instances created in D_r. Recall that the special instances in the distribution are I^1_{j*}, ..., I^{g_r}_{j*}. After applying the packing function, each instance I^i_{j*} is supported on the set of elements A_{j*}. After additionally applying the labeling function, A_{j*} is mapped to a unique set of elements in I (according to the underlying permutation π in φ); as a result,

Observation 5.3. The elements in the special instances I^1_{j*}, ..., I^{g_r}_{j*} are mapped to disjoint sets of elements in the final instance.

The input to each player q ∈ P_i in an instance of D_r is created by mapping the sets in the instances I^i_1, ..., I^i_{w_r} (which are all sampled from the distributions D_{r-1} or D^⊗_{r-1}) to the final instance I. As the packing and labeling functions, by construction, never map two elements belonging to the same instance I^i_j to the same element of the final instance, the size of each set in the input to player q is equal across any two distributions D_r and D_{r'} for r ≠ r', and thus is N by the definition of D_0 (we ignore the empty sets in D_0, as one can consider them as not giving any set to the player instead; these sets are only added to simplify the math). Moreover, as argued earlier, the elements are mapped to the final instance according to a random permutation. Hence,

Observation 5.4. For any group P_i and any player q ∈ P_i, the distribution of any single input set to player q in the final instance I ∼ D_r is uniform over all N-subsets of the universe. This also holds for an instance I ∼ D^⊗_r, as the marginal distribution of a player's input is identical.

We now prove the preserving property in the following two lemmas. The proof of the first lemma is quite simple and is deferred to the full version.

Lemma 5.5. For any instance I ∼ D_r: if I*_r is a Yes instance, then I is also a Yes instance.

We now prove the more involved case.

Lemma 5.6. For any instance I ∼ D_r: if I*_r is a No instance, then w.p. at least 1 − 1/N, I is also a No instance.

Proof. Let U be the universe of the elements in I and U* ⊆ U be the set of elements to which the elements of the special instances I^1_{j*}, ..., I^{g_r}_{j*} are mapped (these are all elements of U except for the first N^{2r} elements according to the permutation π in the labeling function φ). In the following, we bound the contribution of each set in the players' inputs in covering U*, and then use the fact that |U \ U*| is rather small to finalize the proof.

For any group P_i for i ∈ [g_r], let U_i be the set of all elements across the instances in which the players in P_i participate. Moreover, define U*_i := U* ∩ U_i; notice that U*_i is precisely the set of elements in the special instance I^i_{j*}. We first bound the contribution of the special instances. The proof is deferred to the full version.

Claim 5.7. If I*_r is a No instance, then for any integer ℓ ≥ 0, any collection of ℓ sets from the special instances I^1_{j*}, ..., I^{g_r}_{j*} can cover at most k_r + ℓ · (2c · log N^{2r−2}) elements in U*.

We also bound the contribution of the fooling instances using the properties of RND set-systems. We defer the proof of this claim to the full version of the paper.

Claim 5.8. With probability 1 − o(1/N), in the instance I, simultaneously for all integers ℓ ≥ 0, any collection of ℓ sets from the fooling instances {I^i_j | i ∈ [g_r], j ∈ [w_r] \ {j*}} can cover at most ℓ · r · (2c · log N) elements in U*.

In the following, we condition on the event in Claim 5.8, which happens w.p. at least 1 − 1/N. Let C = C_s ∪ C_f be any collection of k_r sets (i.e., a potential k_r-cover) in the input instance I such that C_s and C_f are chosen from the special instances and the fooling instances, respectively. Let ℓ_s = |C_s| and ℓ_f = |C_f|; we have,

|c(C)| = |c(C) ∩ U*| + |c(C) ∩ (U \ U*)|
≤ |c(C_s) ∩ U*| + |c(C_f) ∩ U*| + |U \ U*|
≤ k_r + ℓ_s · (2r − 2) · 2c · log N + ℓ_f · r · 2c · log N + N^{2r}   (by Claim 5.7 for the first term and Claim 5.8 for the second term)
≤ 4k_r + k_r · (2r − 2) · 2c · log N   (as 2k_r ≥ N^{2r})
≤ k_r · 2r · 2c · log N ≤ k_r · 2c · log N^{2r}.

This means that w.p. at least 1 − 1/N, I is also a No instance.

Lemmas 5.5 and 5.6 prove that the packing function σ_r and the labeling family Φ_r are γ-preserving for γ = 1/N.

Proof. [Proof of Theorem 5.1] The results in this section imply that the family of distributions {D_r}_{r=0}^{c} for coverage(N, r) is γ-hard for the parameter γ = 1/N, as long as r ≤ 4c ≤ 4·√(N/12). Consequently, by Theorem 4.1, any r-round protocol that can compute the value of coverage(N, r) on D_r w.p. at least 2/3 + r·γ = 2/3 + r/N < 3/4 requires Ω(w_r/r^4) = Ω(N^c/r^4) total communication. Recall that the gap between the value of the optimal solution in the Yes and No instances of coverage(N, r) is at least N/(2c·log(N^{2r})) ≥ k_r^{1/2r}/(2c·log k_r). As such, any r-round distributed algorithm that can approximate the value of the optimal solution to within a factor better than this w.p. at least 3/4 can distinguish between the Yes and No cases of this distribution, and hence requires Ω(N^{c−2r}/r^4) = Ω((k_r/r^4) · m^{c/((c+2)·4r)}) per-player communication. Finally, since N ≤ 2·k_r^{1/2r}, the condition c ≤ √(N/12) holds as long as c = o(log k_r / log log k_r), finalizing the proof.

6 Distributed Algorithms for Max-Coverage

In this section, we show that both the round-approximation tradeoff and the round-communication

tradeoff achieved by our lower bound in Theorem 5.1 are essentially tight, formalizing Result 2.

6.1 An O(r·k^{1/(r+1)})-Approximation Algorithm. Recall that Theorem 5.1 shows that obtaining a better than k^{Ω(1/r)} approximation in r rounds requires a relatively large communication of m^{Ω(1/r)}, (potentially) larger than any poly(n). In this section, we prove that this round-approximation tradeoff is essentially tight by showing that one can always obtain a k^{O(1/r)} approximation (with a slightly larger constant in the exponent) in r rounds using a limited communication, nearly linear in n.

Theorem 6.1. There exists a deterministic distributed algorithm for the maximum coverage problem that for any integer r ≥ 1 computes an O(r·k^{1/(r+1)}) approximation in r rounds and Õ(n) communication per machine.

On a high level, our algorithm follows an iterative sketching method: in each round, each machine computes a small collection C_i of its input sets S_i as a sketch and sends it to the coordinator. The coordinator maintains a collection of sets X and updates it by iterating over the received sketches and picking any set that still has a relatively large contribution to this partial solution. The coordinator then communicates the set of elements covered by X to the machines, and the machines update their inputs accordingly and repeat this process. At the end, the coordinator returns (a constant approximation to) the optimal k-cover over the collection of all received sets across the different rounds.

In the following, we assume that our algorithm is given a value opt~ such that opt ≤ opt~ ≤ 2·opt. We can remove this assumption by guessing the value of opt in powers of two (up to n), solving the problem simultaneously for all the guesses, and returning the best solution; this increases the communication cost by only an O(log n) factor.

We first introduce the algorithm for computing the sketch on each machine.

GreedySketch(U, S, τ): An algorithm for computing the sketch of each machine's input.

Input: A collection S of sets from [n], a target universe U ⊆ [n], and a threshold τ.
Output: A collection C of subsets of U.

1. Let C = ∅ initially.

2. Iterate over the sets in S in an arbitrary order, and for each set S ∈ S, if |(S ∩ U) \ c(C)| ≥ τ, then add (S ∩ U) \ c(C) to C.

3. Return C as the answer.

Notice that in Line (2) of GreedySketch, we are adding the new contribution of the set S and not the complete set itself. This way, we can bound the total representation size of the output collection C by Õ(n) (as each element of U appears in at most one set). We now present the algorithm of Theorem 6.1.

Iterative Sketching Greedy Algorithm (ISGreedy):

Input: A collection S_i of subsets of [n] for each machine i ∈ [p] and a value opt~ ∈ [opt, 2·opt].
Output: A k-cover from the sets in S := ∪_{i∈[p]} S_i.

1. Let X^0 = ∅ and U_i^0 = [n] for each i ∈ [p] initially. Define τ := opt~/(4r·k).

2. For j = 1 to r rounds:

(a) Each machine i computes C_i^j = GreedySketch(U_i^{j−1}, S_i, τ) and sends it to the coordinator.

(b) The coordinator sets X^j = X^{j−1} initially and iterates over the sets in ∪_{i∈[p]} C_i^j, in decreasing order of |c(C_i^j)| over i (and consistent with the order in GreedySketch for each particular i), and adds each set S to X^j if |S \ c(X^j)| ≥ (1/k^{1/(r+1)}) · |S|.

(c) The coordinator communicates c(X^j) to each machine i, and the machine updates its input by setting U_i^j = c(C_i^j) \ c(X^j).

3. At the end, the coordinator returns the best k-cover among all the sets in C := ∪_{i∈[p], j∈[r]} C_i^j sent by the machines over all rounds.

The round complexity of ISGreedy is trivially r. For its communication cost, notice that in each round, each machine communicates at most Õ(n) bits and the coordinator communicates Õ(n) bits back to each machine. As the number of rounds never needs to be more than O(log k), ISGreedy requires Õ(n) communication per machine. Therefore, it only remains to analyze the approximation guarantee of this algorithm. To do so, it suffices to show that,

Lemma 6.1. Define C := ∪_{i∈[p], j∈[r]} C_i^j. The optimal k-cover of C covers at least opt/(4r·k^{1/(r+1)}) elements.

We defer the proof of this key lemma to the full version of the paper. Theorem 6.1 follows from Lemma 6.1, as the coordinator can simply run any constant-factor approximation algorithm for maximum coverage on the collection C and obtain the final result.
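A single-process simulation may help make the bookkeeping of GreedySketch and the ISGreedy loop concrete. The sketch below is illustrative, not the paper's implementation: opt~ is passed in directly, and the final step uses a simple greedy k-cover over C in place of an arbitrary constant-factor maximum-coverage approximation.

```python
def greedy_sketch(U, sets, tau):
    """GreedySketch(U, S, τ): keep only new contributions of size ≥ τ."""
    C, covered = [], set()
    for S in sets:
        contrib = (S & U) - covered
        if len(contrib) >= tau:
            C.append(contrib)
            covered |= contrib
    return C

def is_greedy(machine_sets, k, r, opt_guess):
    """Single-process simulation of the r-round ISGreedy loop."""
    p = len(machine_sets)
    universe = set().union(*(S for sets in machine_sets for S in sets))
    U = [set(universe) for _ in range(p)]          # U_i^0 = [n]
    tau = opt_guess / (4 * r * k)                  # τ = opt~/(4r·k)
    thresh = 1 / k ** (1 / (r + 1))                # 1/k^{1/(r+1)}
    X_cov = set()                                  # c(X^j)
    C_all = []                                     # every sketch set received
    for _ in range(r):
        sketches = [greedy_sketch(U[i], machine_sets[i], tau) for i in range(p)]
        # coordinator: machines with larger coverage first (sketch sets are
        # disjoint, so their total size equals |c(C_i^j)|)
        for i in sorted(range(p), key=lambda t: -sum(len(S) for S in sketches[t])):
            for S in sketches[i]:
                if len(S - X_cov) >= thresh * len(S):
                    X_cov |= S                     # add S to X^j
        for i in range(p):                         # U_i^j = c(C_i^j) \ c(X^j)
            U[i] = set().union(*sketches[i]) - X_cov if sketches[i] else set()
            C_all.extend(sketches[i])
    # final step (toy): greedy k-cover over all received sets, standing in for
    # a constant-factor maximum-coverage approximation on C
    picked_cov = set()
    for _ in range(k):
        best = max(C_all, key=lambda S: len(S - picked_cov), default=set())
        picked_cov |= best
    return picked_cov

machines = [[{1, 2, 3, 4}, {5, 6}], [{1, 2, 5, 7}, {8, 9}]]
print(sorted(is_greedy(machines, k=2, r=2, opt_guess=6)))  # [1, 2, 3, 4, 5, 6]
```

In the small example, the optimal 2-cover has value 6 and the simulation recovers a cover of the same value from the collected sketches.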

Copyright © 2018 by SIAM. Unauthorized reproduction of this article is prohibited.

as the coordinator can simply run any constant factor approximation algorithm for maximum coverage on the collection C and obtain the final result.

6.2 An (e/(e−1))-Approximation Algorithm. We now prove that the round-communication tradeoff for the distributed maximum coverage problem proven in Theorem 5.1 is essentially tight. Theorem 5.1 shows that using k · m^{O(1/r)} communication in r rounds only allows for a relatively large approximation factor of k^{Ω(1/r)}. Here, we show that we can always obtain an (almost) (e/(e−1))-approximation (the optimal approximation ratio with sublinear in m communication) in r rounds using k · m^{Ω(1/r)} communication (for some larger constant in the exponent). As stated in the introduction, our algorithm in this part is quite general and works for maximizing any monotone submodular function subject to a cardinality constraint. Hence, in the following, we present our results in this more general form.

Theorem 6.2. There exists a randomized distributed algorithm for submodular maximization subject to a cardinality constraint that for any ground set V of size m, any monotone submodular function f : 2^V → R^+, any r ≥ 1, and any parameter ε ∈ (0, 1), with high probability computes an (e/(e−1) + ε)-approximation in r rounds while communicating Õ(k · m^{O(1/(ε·r))}) items from V.

Our algorithm follows the sample-and-prune technique of [48]. In each round, we sample a set of items from the machines and send them to the coordinator. The coordinator then computes a greedy solution X over the received sets and reports X back to the machines. The machines then prune any item that cannot be added to this partial greedy solution X and continue this process in the next rounds. At the end, the coordinator outputs X. By using a thresholding greedy algorithm and a more careful analysis, we show that the dependence of the number of rounds on Ω(log ∆) (where ∆ is the ratio of the maximum value of f on any singleton set to its minimum value) in [48] can be completely avoided, resulting in an algorithm with only a constant number of rounds.

We assume that the algorithm is given a value õpt such that opt ≤ õpt ≤ 2 · opt. In general, one can guess õpt in powers of two in the range ∆ to k · ∆ in parallel, solve the problem for all guesses, and return the best solution. This increases the communication cost by only a factor of Θ(log k) (and adds one extra round of communication just to communicate ∆ if it is unknown). We now present our algorithm.

Sample and Prune Greedy Algorithm (SPGreedy).

Input: A collection V_i ⊆ V of items for each machine i ∈ [p] and a value õpt ∈ [opt, 2 · opt].
Output: A collection of k items from V.

1. Define the parameters ℓ := ⌈lg_{(1+ε)}(2e)⌉ (= Θ(1/ε)) and s := ⌈r/ℓ⌉. The algorithm consists of ℓ iterations, each with s steps.
2. For j = 1 to ℓ iterations:
   (a) Let τ_j = (õpt/k) · (1/(1+ε))^{j−1} and X^{j,0} = X^{j−1,s} initially (we assume X^{0,∗} = ∅).
   (b) For t = 1 to s steps:
       (i) Define V^{j,t} = {a ∈ V | f_{X^{j,(t−1)}}(a) ≥ τ_j}.
       (ii) Each machine i ∈ [p] samples each item in V^{j,t} ∩ V_i independently with probability q_t := (4k log m)/m^{1−(t/s)} if t < s, and q_t := 1 if t = s, and sends the sampled items to the coordinator.
       (iii) The coordinator iterates over each received item a (in an arbitrary order) and adds a to X^{j,t} iff f_{X^{j,t}}(a) ≥ τ_j.
       (iv) The coordinator communicates the set X^{j,t} to the machines.
3. The coordinator returns X^{ℓ,s} in the last step (if at any earlier point of the algorithm the size of some X^{∗,∗} is already k, the coordinator terminates the algorithm and outputs this set as the answer).

SPGreedy requires ℓ = Θ(1/ε) iterations, each consisting of s = ⌈r/ℓ⌉ steps. Moreover, each step can be implemented in one round of communication. As such, the round complexity of this algorithm is simply O(r). The bounds on the communication cost of this algorithm and its approximation ratio are proven in the following two lemmas, whose proofs we defer to the full version of the paper. It is now easy to bound the communication cost of this protocol.

Lemma 6.2. SPGreedy communicates at most O(r · k · m^{1/s} · log m) items with probability at least 1 − 1/m^k.

Lemma 6.3. Suppose X is the set returned by SPGreedy; then, f(X) ≥ (1 − 1/e − ε) · opt.

Theorem 6.2 now follows immediately from Lemma 6.2 and Lemma 6.3.
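As a concrete illustration, the following is a minimal single-process simulation of SPGreedy for a coverage function. All names here are our own, the p machines and the coordinator are folded into one loop, and õpt is assumed to be given; this is a sketch of the threshold-and-sample structure above, not the paper's implementation.

```python
import math
import random

def spgreedy(machine_items, f, k, opt_guess, r, eps=0.5, seed=0):
    """Single-process sketch of SPGreedy: ell iterations of s sampling
    steps; the 'coordinator' adds an item whenever its marginal value
    under f clears the current threshold tau_j."""
    rng = random.Random(seed)
    V = [a for items in machine_items for a in items]
    m = max(len(V), 2)
    ell = max(1, math.ceil(math.log(2 * math.e, 1 + eps)))  # Theta(1/eps) iterations
    s = max(1, math.ceil(r / ell))                          # steps per iteration
    X = []
    for j in range(1, ell + 1):
        tau = (opt_guess / k) * (1.0 / (1 + eps)) ** (j - 1)
        for t in range(1, s + 1):
            # "Machines": keep items whose marginal gain still clears tau ...
            alive = [a for a in V if a not in X and f(X + [a]) - f(X) >= tau]
            # ... sampled with probability q_t (q_t = 1 in the last step).
            q = 1.0 if t == s else min(1.0, 4 * k * math.log(m) / m ** (1 - t / s))
            # "Coordinator": threshold-greedy pass over the received items.
            for a in alive:
                if rng.random() > q:
                    continue
                if len(X) == k:
                    return X
                if f(X + [a]) - f(X) >= tau:
                    X.append(a)
    return X

# Toy coverage instance: items are set indices, f is the size of the union.
sets = {0: {1, 2, 3}, 1: {3, 4}, 2: {5}, 3: {1, 2, 3, 4, 5}}
def cover(S):
    return len(set().union(*(sets[i] for i in S)))
X = spgreedy([[0, 1], [2, 3]], cover, k=2, opt_guess=5, r=4)
```

With these parameters s = 1, so q_t = 1 throughout and the run is deterministic; the monotone submodular `cover` plays the role of f.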

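The powers-of-two guessing of õpt discussed above can be sketched as follows. This is a toy illustration under our own naming: `delta` stands for ∆, and the claim is that some guess lands in [opt, 2·opt] while only Θ(log k) guesses are generated.

```python
import math

def opt_guesses(delta, k):
    """Powers-of-two guesses covering [delta, k*delta]: since
    delta <= opt <= k * delta, some guess g satisfies opt <= g <= 2*opt,
    and only Theta(log k) guesses are needed."""
    guesses = []
    g = delta
    while g <= 2 * k * delta:
        guesses.append(g)
        g *= 2
    return guesses

# Every feasible opt in [delta, k*delta] is 2-approximated by some guess.
cands = opt_guesses(3, 8)
```

Running SPGreedy in parallel for every candidate and keeping the best output is what costs the extra Θ(log k) factor in communication.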
We conclude this section by stating the following corollary of Theorem 6.2 for the maximum coverage problem, which formalizes the first part of Result 2. The proof is a direct application of Theorem 6.2 plus the known sketching methods for coverage functions in [51, 20] to further optimize the communication cost. We defer the proof to the full version.

Corollary 6.1. There exists a randomized distributed algorithm for the maximum coverage problem that for any integer r ≥ 1 and any parameter ε ∈ (0, 1), with high probability computes an (e/(e−1) + ε)-approximation in r rounds and Õ((k/ε⁴) · m^{O(1/(ε·r))} + n) communication.

Acknowledgements. The first author is grateful to Alessandro Epasto for bringing [36] to his attention, to David Woodruff for a helpful discussion on the implication of the results in [5] for proving multi-pass dynamic streaming lower bounds, and to Qin Zhang for useful comments about our framework in this paper for proving distributed lower bounds. We also thank the anonymous reviewers of SODA for many valuable comments and suggestions.

References

[1] F. N. Afrati, A. D. Sarma, S. Salihoglu, and J. D. Ullman. Upper and lower bounds on the cost of a map-reduce computation. PVLDB, 6(4):277–288, 2013.
[2] K. J. Ahn and S. Guha. Access to data and number of iterations: Dual primal algorithms for maximum matching under resource constraints. In Proceedings of the 27th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2015, Portland, OR, USA, June 13-15, 2015, pages 202–211, 2015.
[3] K. J. Ahn, S. Guha, and A. McGregor. Analyzing graph structure via linear measurements. In Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '12, pages 459–467. SIAM, 2012.
[4] K. J. Ahn, S. Guha, and A. McGregor. Graph sketches: sparsification, spanners, and subgraphs. In Proceedings of the 31st ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS 2012, Scottsdale, AZ, USA, May 20-24, 2012, pages 5–14, 2012.
[5] Y. Ai, W. Hu, Y. Li, and D. P. Woodruff. New characterizations in turnstile streams with applications. In 31st Conference on Computational Complexity, CCC 2016, May 29 to June 1, 2016, Tokyo, Japan, pages 20:1–20:22, 2016.
[6] N. Alon, A. Moitra, and B. Sudakov. Nearly complete graphs decomposable into large induced matchings and their applications. In Proceedings of the 44th Symposium on Theory of Computing Conference, STOC 2012, New York, NY, USA, May 19-22, 2012, pages 1079–1090, 2012.
[7] N. Alon, N. Nisan, R. Raz, and O. Weinstein. Welfare maximization with limited interaction. In IEEE 56th Annual Symposium on Foundations of Computer Science, FOCS 2015, Berkeley, CA, USA, 17-20 October, 2015, pages 1499–1512, 2015.
[8] S. Assadi. Combinatorial auctions do need modest interaction. In Proceedings of the 2017 ACM Conference on Economics and Computation, EC '17, Cambridge, MA, USA, June 26-30, 2017, pages 145–162, 2017.
[9] S. Assadi. Tight space-approximation tradeoff for the multi-pass streaming set cover problem. In Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2017, Chicago, IL, USA, May 14-19, 2017, pages 321–335, 2017.
[10] S. Assadi and S. Khanna. Randomized composable coresets for matching and vertex cover. In Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2017, Washington DC, USA, July 24-26, 2017, pages 3–12, 2017.
[11] S. Assadi, S. Khanna, and Y. Li. Tight bounds for single-pass streaming complexity of the set cover problem. In Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2016, Cambridge, MA, USA, June 18-21, 2016, pages 698–711, 2016.
[12] S. Assadi, S. Khanna, and Y. Li. On estimating maximum matching size in graph streams. In Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2017, Barcelona, Spain, Hotel Porta Fira, January 16-19, pages 1723–1742, 2017.
[13] S. Assadi, S. Khanna, Y. Li, and G. Yaroslavtsev. Maximum matchings in dynamic graph streams and the simultaneous communication model. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2016, Arlington, VA, USA, January 10-12, 2016, pages 1345–1364, 2016.
[14] G. Ausiello, N. Boria, A. Giannakos, G. Lucarelli, and V. T. Paschos. Online maximum k-coverage. Discrete Applied Mathematics, 160(13-14):1901–1913, 2012.
[15] A. Badanidiyuru, B. Mirzasoleiman, A. Karbasi, and A. Krause. Streaming submodular maximization: massive data summarization on the fly. In The 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '14, New York, NY, USA, August 24-27, 2014, pages 671–680, 2014.
[16] A. Badanidiyuru and J. Vondrák. Fast algorithms for maximizing submodular functions. In Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2014, Portland, Oregon, USA, January 5-7, 2014, pages 1497–1514, 2014.
[17] M. Balcan, S. Ehrlich, and Y. Liang. Distributed k-means and k-median clustering on general communication topologies. In Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013, Lake Tahoe, Nevada, United States, pages 1995–2003, 2013.
[18] M. Bateni, A. Bhaskara, S. Lattanzi, and V. S. Mirrokni. Distributed balanced clustering via mapping coresets. In Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, December 8-13 2014, Montreal, Quebec, Canada, pages 2591–2599, 2014.
[19] M. Bateni, H. Esfandiari, and V. S. Mirrokni. Distributed coverage maximization via sketching. CoRR, abs/1612.02327, 2016.
[20] M. Bateni, H. Esfandiari, and V. S. Mirrokni. Almost optimal streaming algorithms for coverage problems. In Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2017, Washington DC, USA, July 24-26, 2017, pages 13–23, 2017.
[21] P. Beame, P. Koutris, and D. Suciu. Communication steps for parallel query processing. In Proceedings of the 32nd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS 2013, New York, NY, USA, June 22-27, 2013, pages 273–284, 2013.
[22] G. E. Blelloch, R. Peng, and K. Tangwongsan. Linear-work greedy parallel approximate set cover and variants. In SPAA 2011: Proceedings of the 23rd Annual ACM Symposium on Parallelism in Algorithms and Architectures, San Jose, CA, USA, June 4-6, 2011 (Co-located with FCRC 2011), pages 23–32, 2011.
[23] M. Braverman, F. Ellen, R. Oshman, T. Pitassi, and V. Vaikuntanathan. A tight bound for set disjointness in the message-passing model. In 54th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2013, 26-29 October, 2013, Berkeley, CA, USA, pages 668–677, 2013.
[24] A. Chakrabarti and S. Kale. Submodular maximization meets streaming: Matchings, matroids, and more. In Integer Programming and Combinatorial Optimization - 17th International Conference, IPCO 2014, Bonn, Germany, June 23-25, 2014, Proceedings, pages 210–221, 2014.
[25] A. Chakrabarti and A. Wirth. Incidence geometries and the pass complexity of semi-streaming set cover. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2016, Arlington, VA, USA, January 10-12, 2016, pages 1365–1373, 2016.
[26] J. Chen, H. L. Nguyen, and Q. Zhang. Submodular maximization over sliding windows. CoRR, abs/1611.00129, 2016.
[27] F. Chierichetti, R. Kumar, and A. Tomkins. Max-cover in map-reduce. In Proceedings of the 19th International Conference on World Wide Web, WWW 2010, Raleigh, North Carolina, USA, April 26-30, 2010, pages 231–240, 2010.
[28] G. Cormode, H. J. Karloff, and A. Wirth. Set cover algorithms for very large datasets. In Proceedings of the 19th ACM Conference on Information and Knowledge Management, CIKM 2010, Toronto, Ontario, Canada, October 26-30, 2010, pages 479–488, 2010.
[29] T. M. Cover and J. A. Thomas. Elements of information theory (2. ed.). Wiley, 2006.
[30] R. da Ponte Barbosa, A. Ene, H. L. Nguyen, and J. Ward. The power of randomization: Distributed submodular maximization on massive datasets. In Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, 6-11 July 2015, pages 1236–1244, 2015.
[31] R. da Ponte Barbosa, A. Ene, H. L. Nguyen, and J. Ward. A new framework for distributed submodular maximization. In IEEE 57th Annual Symposium on Foundations of Computer Science, FOCS 2016, New Brunswick, New Jersey, USA, pages 645–654, 2016.
[32] E. D. Demaine, P. Indyk, S. Mahabadi, and A. Vakilian. On streaming and communication complexity of the set cover problem. In Distributed Computing - 28th International Symposium, DISC 2014, Austin, TX, USA, October 12-15, 2014, Proceedings, pages 484–498, 2014.
[33] S. Dobzinski, N. Nisan, and S. Oren. Economic efficiency requires interaction. In Symposium on Theory of Computing, STOC 2014, New York, NY, USA, May 31 - June 03, 2014, pages 233–242, 2014.
[34] D. Dolev and T. Feder. Determinism vs. nondeterminism in multiparty communication complexity. SIAM J. Comput., 21(5):889–895, 1992.
[35] Y. Emek and A. Rosén. Semi-streaming set cover (extended abstract). In Automata, Languages, and Programming - 41st International Colloquium, ICALP 2014, Copenhagen, Denmark, July 8-11, 2014, Proceedings, Part I, pages 453–464, 2014.
[36] A. Epasto, S. Lattanzi, S. Vassilvitskii, and M. Zadimoghaddam. Submodular optimization over sliding windows. In Proceedings of the 26th International Conference on World Wide Web, WWW 2017, Perth, Australia, April 3-7, 2017, pages 421–430, 2017.
[37] D. V. Gucht, R. Williams, D. P. Woodruff, and Q. Zhang. The communication complexity of distributed set-joins with applications to matrix multiplication. In Proceedings of the 34th ACM Symposium on Principles of Database Systems, PODS 2015, Melbourne, Victoria, Australia, May 31 - June 4, 2015, pages 199–212, 2015.
[38] S. Guha, Y. Li, and Q. Zhang. Distributed partial clustering. In Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2017, Washington DC, USA, July 24-26, 2017, pages 143–152, 2017.
[39] S. Har-Peled, P. Indyk, S. Mahabadi, and A. Vakilian. Towards tight bounds for the streaming set cover problem. In Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2016, San Francisco, CA, USA, June 26 - July 01, 2016, pages 371–383, 2016.
[40] S. Im and B. Moseley. Brief announcement: Fast and better distributed mapreduce algorithms for k-center clustering. In Proceedings of the 27th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2015, Portland, OR, USA, June 13-15, 2015, pages 65–67, 2015.
[41] P. Indyk, S. Mahabadi, M. Mahdian, and V. S. Mirrokni. Composable core-sets for diversity and coverage maximization. In Proceedings of the 33rd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS '14, Snowbird, UT, USA, June 22-27, 2014, pages 100–108, 2014.
[42] R. Jacob, T. Lieber, and N. Sitchinava. On the complexity of list ranking in the parallel external memory model. In Mathematical Foundations of Computer Science 2014 - 39th International Symposium, MFCS 2014, Budapest, Hungary, August 25-29, 2014, Proceedings, Part II, pages 384–395, 2014.
[43] M. Kapralov and D. P. Woodruff. Spanners and sparsifiers in dynamic streams. In ACM Symposium on Principles of Distributed Computing, PODC '14, Paris, France, July 15-18, 2014, pages 272–281, 2014.
[44] H. J. Karloff, S. Suri, and S. Vassilvitskii. A model of computation for mapreduce. In Proceedings of the Twenty-First Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2010, Austin, Texas, USA, January 17-19, 2010, pages 938–948, 2010.
[45] D. Kempe, J. M. Kleinberg, and É. Tardos. Maximizing the spread of influence through a social network. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 24-27, 2003, pages 137–146, 2003.
[46] C. Konrad. Maximum matching in turnstile streams. In Algorithms - ESA 2015 - 23rd Annual European Symposium, Patras, Greece, September 14-16, 2015, Proceedings, pages 840–852, 2015.
[47] A. Krause and C. Guestrin. Near-optimal observation selection using submodular functions. In Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence, July 22-26, 2007, Vancouver, British Columbia, Canada, pages 1650–1654, 2007.
[48] R. Kumar, B. Moseley, S. Vassilvitskii, and A. Vattani. Fast greedy algorithms in mapreduce and streaming. In 25th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA '13, Montreal, QC, Canada, July 23-25, 2013, pages 1–10, 2013.
[49] E. Kushilevitz and N. Nisan. Communication complexity. Cambridge University Press, 1997.
[50] S. Lattanzi, B. Moseley, S. Suri, and S. Vassilvitskii. Filtering: a method for solving graph problems in mapreduce. In SPAA 2011: Proceedings of the 23rd Annual ACM Symposium on Parallelism in Algorithms and Architectures, San Jose, CA, USA, June 4-6, 2011 (Co-located with FCRC 2011), pages 85–94, 2011.
[51] A. McGregor and H. T. Vu. Better streaming algorithms for the maximum coverage problem. In 20th International Conference on Database Theory, ICDT 2017, March 21-24, 2017, Venice, Italy, pages 22:1–22:18, 2017.
[52] V. S. Mirrokni and M. Zadimoghaddam. Randomized composable core-sets for distributed submodular maximization. In Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, STOC 2015, Portland, OR, USA, June 14-17, 2015, pages 153–162, 2015.
[53] B. Mirzasoleiman, A. Karbasi, R. Sarkar, and A. Krause. Distributed submodular maximization: Identifying representative elements in massive data. In Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013, Lake Tahoe, Nevada, United States, pages 2049–2057, 2013.
[54] J. M. Phillips, E. Verbin, and Q. Zhang. Lower bounds for number-in-hand multiparty communication complexity, made easy. In Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2012, Kyoto, Japan, January 17-19, 2012, pages 486–501, 2012.
[55] A. Pietracaprina, G. Pucci, M. Riondato, F. Silvestri, and E. Upfal. Space-round tradeoffs for mapreduce computations. In International Conference on Supercomputing, ICS '12, Italy, pages 235–244, 2012.
[56] T. Roughgarden, S. Vassilvitskii, and J. R. Wang. Shuffles and circuits: (on lower bounds for modern parallel computation). In Proceedings of the 28th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2016, Asilomar State Beach/Pacific Grove, CA, USA, July 11-13, 2016, pages 1–12, 2016.
[57] I. Z. Ruzsa and E. Szemerédi. Triple systems with no six points carrying three triangles. Combinatorics (Keszthely, 1976), Coll. Math. Soc. J. Bolyai, 18:939–945, 1978.
[58] B. Saha and L. Getoor. On maximum coverage in the streaming model & application to multi-topic blog-watch. In Proceedings of the SIAM International Conference on Data Mining, SDM 2009, Sparks, Nevada, USA, pages 697–708, 2009.
[59] D. P. Woodruff and Q. Zhang. Tight bounds for distributed functional monitoring. In Proceedings of the 44th Symposium on Theory of Computing Conference, STOC 2012, New York, NY, USA, May 19-22, 2012, pages 941–960, 2012.
[60] D. P. Woodruff and Q. Zhang. When distributed computation is communication expensive. In Distributed Computing - 27th International Symposium, DISC 2013, Jerusalem, Israel, October 14-18, 2013, Proceedings, pages 16–30, 2013.
[61] D. P. Woodruff and Q. Zhang. An optimal lower bound for distinct elements in the message passing model. In Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2014, Portland, Oregon, USA, pages 718–733, 2014.
[62] A. C. Yao. Some complexity questions related to distributive computing (preliminary report). In Proceedings of the 11th Annual ACM Symposium on Theory of Computing, April 30 - May 2, 1979, Atlanta, Georgia, USA, pages 209–213, 1979.