
2019 IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS)

Radio Network Coding Requires Logarithmic Overhead

Klim Efremenko (Ben-Gurion University, [email protected]) · Gillat Kol (Princeton University, [email protected]) · Raghuvansh R. Saxena (Princeton University, [email protected])

Abstract—We consider the celebrated radio network model for abstracting communication in wireless networks. In this model, in any round, each node in the network may broadcast a message to all its neighbors. However, a node is able to hear a message broadcast by a neighbor only if no collision occurred, meaning that it was the only neighbor that broadcast in that round. While the (noiseless) radio network model received a lot of attention over the last few decades, the effect of noise on radio networks is still not well understood. In this paper, we take a step forward and show that making radio network protocols resilient to noise may require a substantial performance overhead. Specifically, we construct a multi-hop network and a communication protocol over this network that works in T rounds when there is no noise. We prove that any scheme that simulates our protocol and is resilient to stochastic noise requires at least cT log(n) rounds, for some constant c. This stands in contrast to our previous result (STOC, 2018), showing that protocols over the single-hop (clique) network can be made noise resilient with only a constant overhead. Our result also settles a recent conjecture by Censor-Hillel, Haeupler, Hershkowitz, and Zuzic (2018). We complement the above result by giving a scheme to simulate any protocol with a fixed order of transmissions with only an O(log(n)) overhead.

Index Terms—Interactive Coding; Wireless Broadcast; Lower Bounds; Communication Complexity

I. INTRODUCTION

We consider the extensively studied radio network model for abstracting communication in wireless networks [CK85]. In this model, a set of agents, represented as nodes in a graph, want to disseminate their privately stored data throughout the network, or, more generally, perform a joint computation that involves all their data. The nodes are each equipped with a wireless device that, in any time step, can either act as a transmitter or as a receiver. We think of the neighbors of a transmitting node as being within its range of transmission, which is necessary, but not sufficient, for receiving the transmission. If a node v transmits at time t, then a neighbor u of v receives v's transmission if and only if u decides to receive at time t and v is the only neighbor of u that transmits at time t.

The (noiseless) radio network model received a lot of attention over the last few decades, and has an ever growing number of wireless and distributed applications. On the other hand, the effect of noise on radio networks is not well understood, although wireless communication errors are extremely common in practice. In our previous paper [EKS18], we took a step forward and initiated the study of interactive coding for the single-hop (clique) network, following the great work of [Gam87], [Gal88], [GKS08]. We proved that the single-hop network admits constant rate coding. That is, we showed how to convert any protocol that works over the noiseless single-hop network to a noise resilient protocol of similar length, up to a multiplicative constant.

In our current paper, we consider noise in general multi-hop networks, and show that, in contrast to the single-hop case, noise resilience may require a substantial performance overhead. Roughly, our main result proves that there exists a network G and a protocol Π over G with T rounds that works in the noiseless setting, such that every noise resilient protocol Π' that 'simulates' Π requires Ω(T log(n)) rounds. We complement our result with a scheme for converting any protocol Π with a fixed order of transmissions over any network G to a noise resilient protocol Π', while only incurring a multiplicative O(log n) overhead.

A. Modeling Collisions and Noise

To be able to give precise statements of our results, we need to define the assumed radio network model. We chose to present our results in the model recently suggested by [CHHZ18], described below. We mention, however, that our results are proved for more general

2575-8454/19/$31.00 ©2019 IEEE  DOI 10.1109/FOCS.2019.00030

models (see more about that in subsection I-B and section II).

To define the assumed (noiseless) radio network model, called here RADIO, and the assumed noisy radio network model, called here NOISY RADIO, one is asked to make two modeling decisions. The first decision is about the way the radio network deals with collisions. As explained above, when two or more neighbors of a node v transmit in the same round, v may not receive (any of) their messages. What does v receive then? The most commonly studied collision model in the literature, and the one we are using in RADIO and NOISY RADIO, is called "collision-as-silence", and it asserts that v simply does not get anything. Thus, v is unable to distinguish a collision from the background noise occurring when no neighbor transmits. The main alternatives to collision-as-silence, studied by prior works (somewhat to a lesser extent), are the collision detection model, where v is notified that a collision occurred, and the flaky collision model, where v may receive one of the transmitted messages. For an excellent survey, see [Pel07].

The second modeling decision we face is regarding the noise. We choose to model the noise in NOISY RADIO as stochastic erasures. More formally, let v and u be neighbors such that v is the only neighbor of u transmitting in round t. Then, for ε > 0, in the NOISY RADIO_ε model, in round t, u receives v's message with probability 1−ε (independently of any other event), and otherwise witnesses silence. We selected stochastic erasures as they are "weaker" than other models of noise studied in the literature (e.g., bit flips, insertion-deletion errors, adversarial errors), and thus, make our lower bound stronger.

B. Our Results

a) Lower bound: The following theorem is the main contribution of our paper. It lower bounds the overhead, in terms of rounds, required in order to make a protocol noise resilient. A proof sketch is given in section II, and the full proof is given in section III and section V. As far as we know, this is the first lower bound that also rules out adaptive simulations. Such lower bounds are tricky, as when a message is lost due to an erasure, it may not only cause the parties to change their future messages, but can also cause them to completely change their broadcast/receive patterns. Thus, the parties can dynamically allocate more rounds to the parties that were erased the most, decreasing the length of the simulation (see [GHS14] to learn more about this phenomenon).

Theorem I.1 (main). Let Γ be a set such that |Γ| > 1, n ∈ N be sufficiently large, and ε > 0 be a constant. There exists a network G with n nodes and a (deterministic and non-adaptive) protocol Π of length T(n) in the RADIO model over G with message set Γ, such that any protocol Π' that simulates Π in the NOISY RADIO_ε model over G with message set Γ requires Ω(T(n) log n) rounds.

The protocol Π we construct in order to prove Theorem I.1 is a protocol for a routing task (a.k.a. a multicommodity multicast task), called TribeTransfer. In multicommodity multicasts, we are given a set of source-sink pairs so that each source has an input (a.k.a. rumor) that must be transmitted to the corresponding sink. Such tasks are a generalization of the extensively studied broadcast task (one node is a source and all the others are sinks) and gossip task (all pairs of nodes are sink-source pairs). Our result can be interpreted as showing that even protocols for "simple" tasks, such as routing tasks, that do not involve any joint computation, can become expensive in the presence of noise. We note that it is an interesting open problem to get a near-logarithmic lower bound, like the one in Theorem I.1, for the case where the length of the protocol T can approach infinity (independently of the number of nodes n).

The noiseless protocol for TribeTransfer is also "simple" in the sense that it is: (1) deterministic; (2) the set of nodes communicating in every round is known in advance, and is independent of the nodes' inputs, the messages received in previous rounds, and the noise. Protocols admitting the second property are called non-adaptive (a.k.a. static and oblivious). Interestingly, despite the obvious advantages of adaptivity, for many useful tasks, the state of the art algorithms are non-adaptive. For such non-adaptive protocols, we prove that our Ω(log n) lower bound is in fact tight. Getting a similar upper bound for general (adaptive) protocols is an interesting problem.

b) Upper bound: For completeness, we also give a way to convert any non-adaptive noiseless protocol Π, such as the one used in the proof of our main result, to a noise resilient protocol with an O(log n) blowup to the length. We note that the claim is trivial when the length of Π is poly(n), as each message in Π can be repeated O(log n) times. This ensures that even in the presence of noise, all nodes get the messages

'correctly' with high probability. To prove that the noise can be dealt with even for longer protocols, we use the (by now standard) recursive "rewind-if-error" interactive coding mechanism (see, for example, [Sch92], [EKS18]). Our simulation protocol and its analysis are given in Appendix B.

Theorem I.2. Let Γ be a set such that |Γ| > 1, n, T ∈ N be sufficiently large, and ε ≥ 0 be constant. Let G be a graph with n vertices, and let Π be a non-adaptive protocol of length T in the RADIO model over G with message set Γ. Then, there exists a protocol Π' that simulates Π in the NOISY RADIO_ε model over G and only requires O(T log n) rounds.

We mention that while, for simplicity of exposition, the statements of our results use the RADIO and NOISY RADIO models, we actually prove stronger statements. Our lower bound is proved assuming an even weaker model for the simulating protocol, where in round t, all nodes get all messages that were successfully received by at least one node in round t (see subsection IV-C for a formal definition). In addition, our upper bound gives a scheme that is resilient against the very strong noise model where, when a collision occurs, an adversary is controlling the received messages (see [EKS18], [Hae14]).

c) Simulation: Both our lower and upper bounds consider the simulation of a protocol Π by another protocol Π'. But, what does it mean to simulate? As is usually the case in the field of interactive coding, in our upper bound result, Π' simulates Π in a very strong sense: given a transcript for Π' one can deduce the transcript for Π with the same randomness and noise. However, our lower bound rules out much weaker simulations. Namely, recall that the protocol Π is deterministic; thus, for any set of inputs, each node has a single correct output. We show that for any protocol Π' with fewer than c · T log n rounds (for some fixed constant c > 0), there exists a set containing at least a 4/5 fraction of the sink nodes, such that for every node in the set, the probability (over the selection of inputs, the randomness used by the simulation, and the noise) that it outputs an incorrect value is greater than 3/4 (see Theorem V.1 for an exact statement).[1]

[1] We note that the weak statement that, with probability 3/4, there exists at least one node whose output is incorrect, has an easy proof: consider the star topology where the center wants to broadcast its input. This statement is weaker than ours in two respects: (1) the size of the incorrect set is small; (2) the order of quantifiers is different, as we show that the same set of nodes is incorrect in many executions. Our order of quantifiers is the standard one for interactive coding. It parallels the notion of randomized algorithms, where it's ok if for every random string, the computation fails for a (different) set of inputs, but the computation for any input should not fail for many random strings.

C. Related Work

The field of coding for interactive communication was introduced in seminal papers of Schulman [Sch92], [Sch93], [Sch96]. Various aspects of two-party interactive coding (such as computational efficiency, interactive channel capacity, noise tolerance, list decoding, different channel types, etc.) were considered in recent years, with most works focusing on simulation protocols [GMS11], [BR11], [BKN14], [Bra12], [MS14], [BE14], [GMS14], [GH14], [GHK+16], [EGH16], [BGMO17], [EKS18], to cite a few.

Few strong interactive coding lower bounds are known, and these entail substantial technical difficulties [GKS08], [KR13], [BEGH16], [GK17]. Out of these, the one that is the most relevant to our work is [BEGH16], which proves that coding over the point-to-point star network requires an almost-logarithmic blowup to the number of rounds, matching the beautiful scheme of [RS94]. Observe that the point-to-point model allows for more freedom for designing a protocol, as each node can transmit different messages to each of its neighbors over a private channel, and, more importantly, one does not need to worry about collisions. Yet, the problems of coding over point-to-point networks and over radio networks are incomparable, as in the former case, both the faultless and the noise resilient protocols use point-to-point channels. Due to the fundamental differences between the models, as is evident by the fact that the star topology does admit constant rate coding in the NOISY RADIO model [EKS18], we were not able to use the techniques developed by [BEGH16] (see section II for a more detailed comparison).

While, as mentioned above, no lower bounds for adaptive simulations were previously known, several papers give strong upper bounds. The two-party setting is considered by the very interesting works [Hae14], [GHS14], [AGS16], where it is shown that adaptive coding schemes can allow for strictly better rate and

error tolerance.[2] In [EKS18], we construct a constant rate adaptive coding scheme for the single-hop network, when it was known that the best rate achievable by a non-adaptive scheme is O(1/log log n) [GKS08].

[2] We note that the work of [AGS16] considers channels with no collisions.

Noise in radio networks was considered by several prior works [Gam87], [Gal88], [FK00], [KM05], [New04], [GKS08], [EKS18], [CHHZ17], [GHM18], [CHHZ18]. The related dual graph model, where some of the links are unreliable, is studied by [KLN+10], [GHLN12], [CGK+14], [GLN13], and [KKP01], [KPP10] consider radio networks with faulty nodes.

a) Relation to [CHHZ18]: The most relevant to our work, aside from [EKS18], is the very recent work [CHHZ18] defining the model NOISY RADIO we use in this paper. They show that every non-adaptive protocol can be made noise resilient with only a blowup of poly(log Δ, log log n) in length, where Δ is the maximum degree of the graph. They also prove that, under certain assumptions (e.g., the noise resilient protocol is non-coding), an Ω(log Δ) multiplicative round overhead is required. The authors further conjecture that such a statement can be proved assumptions-free for some specific protocol and topology which is described in the paper.

Our main result essentially proves their conjecture with a different topology and protocol: Fix n. Since the degree of the network G in Theorem I.1 with n vertices is Δ = poly(n), Theorem I.1 directly proves the conjecture for Δ = poly(n). To prove the conjecture for a smaller Δ, we take vertex-disjoint copies of G with poly(Δ) vertices and consider the task of simulating Π on all copies of G in parallel.

II. PROOF SKETCH

The main contribution of this work is an Ω(log n) lower bound on the overhead of an interactive coding scheme for a particular multi-hop radio network. We sketch this result in this section.

A. The Star Topology

Consider the star graph on n+1 nodes where one 'center' node is connected to n 'leaf' nodes. Suppose that all the leaf nodes have a one bit input that they want to send to the center node. Also, assume that each node may only broadcast one bit in every round. Call this communication task Transfer_n. In the noiseless case, the task Transfer_n requires n rounds, as no two leaf nodes can broadcast together (otherwise, there is a collision). Clearly, n rounds also suffice, as the nodes can broadcast their inputs one after the other in some pre-determined order.

Consider now the noisy case, where each broadcast from the leaf nodes is erased with a constant probability. We may assume, without loss of generality, that each node is only broadcasting its input bit, as any information it has about the other inputs must have been broadcast to it by the center. We observe that even in the erasure case, the expected number of rounds that are needed for the center node to hear the input of one leaf node, assuming it is the only one communicating, is only a constant. If one assumes that when a message reaches the center, all the n+1 nodes in the network get notified and proceed to sending the next input, an easy Chernoff bound shows that a noise resilient protocol for Transfer_n with O(n) rounds exists.

How strong is the assumption that all the nodes get notified of a successful transmission? In the extreme case, when the center node does not broadcast at all, the leaf nodes have no choice but to repeat their input Ω(log n) times to allow a union bound over all n inputs. Thus, a noise resilient protocol for Transfer_n would require Ω(n log n) rounds.

However, the center node can give feedback to the leaf nodes to reduce the number of rounds in the broadcast protocol. Indeed, in our prior work [EKS18], we (implicitly) implement a sufficiently powerful feedback mechanism for the star topology, reducing the number of rounds all the way down to O(n). While our focus in [EKS18] was the single-hop (clique) topology, our simulation can be shown to work for topologies satisfying (roughly): (1) there is a 'center' node that can 'reach' all other nodes in a constant number of rounds (i.e., there is a constant round protocol for broadcasting a message from the center to all other nodes); (2) there exists a constant c such that all sets of nodes S that can broadcast successfully have |S| ≤ c. Here, we say that S broadcasts successfully if there exists a node outside S with exactly one neighbor in S. The second property implies that, effectively, at most c nodes are broadcasting in any round. The center is then able to broadcast feedback on their messages, reaching all nodes in a constant number of rounds.

The above discussion suggests that while Ω(log n) messages per node are required for the center to hear any

leaf's input with high probability[3], the expected number of messages required is only a constant. Our lower bound approach is to find a topology where the feedback from the 'center' node can be somehow controlled, allowing us to "go from the expected cost to the high probability cost".

[3] High here means 1 − n^(−c), for some constant c > 1, to allow a union bound over the leaves.

We mention, as an aside, the work of [BEGH16], showing a near-logarithmic lower bound for the star topology in the point-to-point model. Their proof is also "going from the expected cost to the high probability cost", but does so using an inherently different approach. They lower bound the overhead required for simulating a pointer chasing protocol in the presence of noise (rather than a message exchange protocol). They exploit the fact that the messages communicated by the nodes in a given round of the protocol crucially depend on all prior messages sent to them, thus all nodes must know all previous messages with high probability. Due to the differences in the models, and as we are mainly interested in protocols for routing information, where the messages communicated by the nodes are typically independent of the history, we are unable to use their ideas.

B. Controlling the Feedback

One easy way to make the feedback redundant is to constrain the leaf nodes to only broadcast in a fixed subset of the rounds, i.e., make the protocol non-adaptive. This subset is decided beforehand and cannot be changed by any feedback from the center node. In this restricted scenario, it is again easy to see an Ω(log n) lower bound (see [GKS08] for a related result for the clique network).

Since we aim to prove a lower bound for adaptive protocols, where the nodes can decide whether to receive or transmit based on the communication history and their input, we need other ways to control the feedback. To this end, consider a graph G that has k disconnected copies of the star graph, each with k leaves. Number the leaf nodes on each of the star graphs from 1 to k arbitrarily (thus, G is a graph with n = k(k+1) vertices). We want to solve Transfer_k on all of the star subgraphs in parallel, meaning that the center of each star should know the messages of all the leaves in its star. Clearly, when there is no noise, this problem can still be solved in k rounds, by having the i-th nodes in each star broadcasting in round i.

a) Lower bound under a restriction: Suppose that, for all i, all the k leaf nodes numbered i were restricted to only broadcast together. We now describe why, under this restriction, any noise resilient protocol for our problem would require Ω(k log k) rounds. Since all the k nodes with the same number i are restricted to broadcast together, the number of rounds needed for this group, say group i, to successfully broadcast is equal to the number of rounds needed for the node that was corrupted the most. This number is easily seen to be Ω(log k). Since this holds for all the k groups of nodes, it implies an Ω(k log k) lower bound.

A crucial subtlety that we omit in the analysis above is that information about the nodes' inputs may be transmitted by the order in which the groups choose to broadcast, e.g., there may be a 'fancy' protocol where the nodes in group 1 would broadcast in round 1 only if they have some particular input, etc. We show how to deal with this 'leak' of information when we talk about our actual construction in subsection II-C.

We end this subsection by observing that in the graph G, all the k = Θ(√n) nodes labeled by i can broadcast successfully at the same rounds and have their respective centers hear them, thus violating Property 2 required by [EKS18] in a strong way.

C. Our Construction

The discussion in the foregoing section suggests that there are two main issues that need to be tackled. Firstly, we need to find a way to actually implement the fairly artificial restriction described above. Secondly, we need a way to control and account for the leak of information in the system. In this subsection, inspired by the discussion above, we describe a graph G for which a weak version of the above restriction holds. Later, we describe how we account for the information leak in our analysis.

a) Our network: The graph G we construct has n = 2k² nodes divided into two sets A and B of k² nodes each. The nodes in the sets A and B are divided into k groups of k nodes each. We will use a_ij (respectively, b_ij), where i, j ∈ [k], to refer to the j-th node in the i-th group of A (resp., B). We now describe the edges in G:

1) Edges between A and B: There is an edge between node a_ij ∈ A and a node b_i'j' ∈ B if either (i ≠ i') or (i = i' and j = j'). In words, node a_ij is connected to node b_ij and also connected to all nodes not in group i of B.

2) Edges in B: There is an edge between nodes b_ij, b_ij' ∈ B for every i and j ≠ j'.

Observe that the nodes in group i of A form an independent set, while the nodes in group i of B form a clique, see Figure 1.

b) Our communication task: All the nodes in the set A have a single symbol as input. Nodes in B have no input. The communication problem we consider is where all the nodes b_ij in group i of B have to output the input symbol of all the nodes in group i of A. The nodes in A are not required to produce an output.

In the noiseless setting, this communication task admits a simple non-adaptive and deterministic 2k-round protocol: In round i ∈ [k], all the nodes in group i of A broadcast their input. After these k rounds, node b_ij knows the input of a_ij. In round k+j, for j ∈ [k], all the nodes in {b_ij | i ∈ [k]} broadcast the symbol they received. After this round, all nodes in group i of B know the input of a_ij, for every i.

c) Implementing the restriction: Since the nodes in A are not required to produce an output, we assume that the nodes in B do not broadcast during the protocol. This assumption is not without loss of generality and we justify it in subsection II-D.

Suppose that two nodes a_ij and a_i'j', where i ≠ i', are broadcasting in a given round. Then, since i ≠ i', all the nodes in the set B are connected to at least one of these two nodes. Thus, a third node in the set A cannot successfully broadcast in this round (meaning that none of its neighbors will receive the transmission). This implies that in any round, either all the nodes broadcasting are in the same group of the set A, or the number of nodes that are successfully broadcasting is not more than 2. This is actually just a weak version of the restriction described in subsection II-B where we said that all the nodes broadcasting in a given round have to be in the same group.

D. Our Analysis

Our network and communication task are designed to make the 'basic' lower bound analysis simple: It is roughly true that all the nodes of A broadcasting in any round belong to the same group. Since there are k = Θ(√n) nodes in any group, if these groups broadcast in a fixed order, then a given group will need to broadcast Ω(log n) times. Since there are k groups, the total number of rounds in any simulation will be Ω(k log n), as required for the lower bound.

This basic analysis, however, is quite far from an actual lower bound, and we need to face several challenges to be able to complete the proof. We next describe some of these challenging cases: (1) It may happen that the nodes in A convey information about their input by the order in which the groups broadcast. (2) It may also happen that a lot of information is conveyed in rounds where two nodes from different groups in A broadcast. (3) There may be a clever protocol that somehow uses broadcasts by the players in the set B. (4) Finally, it may happen that a lot of information is conveyed in rounds where a subset of a group of nodes in A broadcast, or maybe such a subset broadcasts together with a small number (< 2) of nodes outside the group.

We now show how to handle each of these cases one by one.

(1) We handle the case where information is revealed by the order in which the groups in A broadcast, using an information theoretic argument. Observe that there are k² nodes in the set A and each node has one bit as input. Thus, a simulation protocol needs to send Ω(k²) bits of information from A to B. If the simulation protocol has fewer than T rounds, then there are at most k^T possible orders in which the groups in A can broadcast. This means that if T = O(k log k), then the amount of information that can be conveyed using the order in which the groups broadcast is at most log(k^T) = T log k ≪ k², and thus negligible.

(2) Next, we account for the information conveyed during the rounds where two nodes from different groups of A broadcast, using a similar information theoretic argument. Recall that if nodes from different groups of A broadcast, then the number of such nodes is at most 2. Thus, in such a round, at most 2 bits of information can be transmitted 'directly'. Of course, as before, information may also be transmitted using the order of such broadcasts. However, there are at most |A|² = k⁴ ways of choosing these two nodes, and thus at most (k+k⁴)^T ways of ordering their broadcasts (the additive term of k comes from the rounds where all the nodes broadcasting are in the same group). As before, this can only convey O(T log k) ≪ k² bits of information.

(3) We need to argue that a protocol cannot really use broadcasts from the nodes in the set B. For this part of the argument, we use the fact that, in the above, we actually upper bounded the information conveyed to all of the set B (instead of to a given node in B).

Thus, the argument above holds even if we 'reveal' everything known to any of the nodes in B to all the nodes in the network. In this case, since nodes in B have no inputs, their broadcasts can be computed by any node in the graph, and thus are redundant. We formalize this by considering a new model called SHARED RADIO, where all the nodes receive the same (shared) transcript. This transcript contains all the received transcripts of all the nodes in the original model (as well as some additional information, described in item 4, tailor-made for our network and task). In SHARED RADIO, we require that the nodes in B do not broadcast and that the nodes in A only broadcast their input. We are able to show that a lower bound for SHARED RADIO implies a lower bound for NOISY RADIO. We then show a lower bound for SHARED RADIO using the ideas presented in this section.

(4) Finally, an extension of the information argument in the first two items does not seem to work for this case (at least not directly), as there are 2^k subsets of any given group, thus giving too many options. Before we describe how we handle this case, we would like to elaborate on why this may be a problem. At first sight, it may seem that having only a subset of nodes in a group broadcast can only give less information than having the entire group broadcast. This is, for instance, true when the noise is adversarial in the case of a collision/silence. However, we work in the NOISY RADIO model, where collisions and silences are received as erasures, and it is conceivable that a subset may give some information that the entire group did not give.

For example, consider the case when a subset S of group i in A is broadcasting, and a node v in A, but not in group i of A, is also broadcasting. Let S' be the subset of group i in B containing the nodes 'corresponding' to S. That is, b_ij ∈ S' if and only if a_ij ∈ S. By our graph construction, the nodes in S' will have two neighbors broadcasting, and will always receive the erasure symbol λ, whereas the nodes in group i of B that are not in S' will have only one neighbor broadcasting, namely v, and will receive a noisy copy of its broadcast. In this case, when a node in group i of B receives λ, it 'suggests' that the corresponding node in A tried to broadcast. Thus, even rounds where nodes receive λ can reveal meaningful information.

In this case, it turns out that if we 'reveal' the input of node v to all the nodes in the graph (as part of the additional information in SHARED RADIO), then we can assume that all the nodes in group i broadcast without loss of generality. At a high level, this is because once the input of node v is known, it doesn't need to broadcast, and the nodes in group i can broadcast without worrying about a collision with node v. Observe that the assumption that only an entire group can broadcast in a given round (instead of any subset) allows us to use the information-theoretic bounds in the first two items.

We believe that the idea of revealing some extra information to account for signaling is more general and may find applications beyond the current work.

III. LOWER BOUND: NETWORK AND PROTOCOL CONSTRUCTION

In this section, we construct the network and protocol used in the proof of Theorem I.1.

A. The Network Construction

Fix k > 0; we define a graph G over n = 2k^2 vertices. We partition the vertices into two sets, A and B, of k^2 vertices each. Let i, j be variables that take values in [k]. We index the sets A, B by the tuple (i, j) and use the notation a_ij (respectively, b_ij) to denote the (i, j)-th vertex of A (respectively, B).

The graph G has the following edges:
• a_ij ∼ b_ij for all i, j ∈ [k].
• b_ij ∼ b_ij' for all i, j, j' ∈ [k] such that j ≠ j'.
• a_ij ∼ b_i'j' for all i, i', j, j' ∈ [k] such that i ≠ i'.

We use the shorthand A^i = {a_ij | j ∈ [k]} and B^i = {b_ij | j ∈ [k]}. Observe that each of the sets A^i is an independent set, and each of the sets B^i is a clique. For a vertex v ∈ G, the neighborhood of v is denoted N(v) = {u ∈ G | u ∼ v}. We use the convention that v ∈ N(v).

The graph G, for k = 3, is drawn in Figure 1. The graph G constructed above was designed to satisfy the following key property:

Fact III.1. Let i ≠ i' ∈ [k] and j, j' ∈ [k]. Then, N(a_ij) ∪ N(a_i'j') = B.

B. The Communication Task

In this subsection, we define the hard communication task, called TribeTransfer_k, for k > 0. Consider the graph G defined in subsection III-A with n = 2k^2 nodes. Suppose that all nodes in the set A have an input.
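The construction above is concrete enough to check mechanically. The following sketch (Python; the helper names are ours, not from the paper) builds G for a given k and verifies the covering property of Fact III.1. Self-loops from the convention v ∈ N(v) are omitted, as they do not affect coverage of B.

```python
import itertools

def build_graph(k):
    """Build the lower-bound graph G on n = 2k^2 vertices.

    Vertices are ('a', i, j) and ('b', i, j) for i, j in range(k).
    Edges (per subsection III-A):
      * a_ij ~ b_ij
      * b_ij ~ b_ij' for j != j'   (each B^i is a clique)
      * a_ij ~ b_i'j' for i != i'
    """
    adj = {}
    for i, j in itertools.product(range(k), repeat=2):
        adj[('a', i, j)] = set()
        adj[('b', i, j)] = set()

    def connect(u, v):
        adj[u].add(v)
        adj[v].add(u)

    for i, j in itertools.product(range(k), repeat=2):
        connect(('a', i, j), ('b', i, j))
        for jp in range(k):
            if jp != j:
                connect(('b', i, j), ('b', i, jp))
        for ip, jp in itertools.product(range(k), repeat=2):
            if ip != i:
                connect(('a', i, j), ('b', ip, jp))
    return adj

def check_fact_iii_1(k):
    """Fact III.1: for i != i', N(a_ij) | N(a_i'j') covers all of B."""
    adj = build_graph(k)
    B = {v for v in adj if v[0] == 'b'}
    pairs = itertools.product(range(k), repeat=2)
    for (i, j), (ip, jp) in itertools.product(
            itertools.product(range(k), repeat=2), repeat=2):
        if i == ip:
            continue
        if not B <= (adj[('a', i, j)] | adj[('a', ip, jp)]):
            return False
    return True
```

Note that no edge is ever added between two a-nodes, so each A^i (indeed, all of A) is an independent set, matching the observation in the text.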

That is, node a_ij has input x_ij ∈ Γ for some set Γ. The parties are required to run a broadcast protocol such that, at the end of the protocol, for all i, j, the node b_ij outputs X^i, which is defined as the k-length string X^i = x_i1, x_i2, ..., x_ik. The nodes a_ij are not required to produce an output.

a) A noiseless protocol for TribeTransfer_k: We describe and analyze a protocol that solves TribeTransfer_k in the noiseless setting (ε = 0). The protocol proceeds in 2k rounds. In round i ∈ [k], all the nodes in A^i broadcast their input. Because the channel is noiseless, after these k rounds, node b_ij knows x_ij, for all i, j ∈ [k]. In round k + j, for j ∈ [k], all the nodes in {b_ij | i ∈ [k]} broadcast the symbol they received, i.e., node b_ij broadcasts x_ij for all i ∈ [k]. Observe that in round k + j, for every j' ≠ j, node b_ij' learns x_ij, as b_ij broadcasts it and none of the other neighbors of b_ij' (other vertices in the clique B^i or a vertex from A) are broadcasting. This ensures that each b_ij can output X^i, and the communication task is completed. Note that this protocol is non-adaptive and deterministic.

Fig. 1. Our graph construction, shown for k = 3. The nodes in the set A are on the left while the nodes in set B are on the right. We show all the edges between a_ij and b_ij, for all i, j ∈ [k], in teal. Some edges in the graph are not shown to reduce clutter. Notice how any two nodes that are not in the same A^i cover all of B (Fact III.1).

IV. RADIO MODELS

A. Notation

The support of a random variable X, denoted supp(X), is the set {x | Pr(x) > 0}. Throughout this paper, we assume that the nodes of a graph with n nodes are numbered 1 to n. We also work with an alphabet set Γ that does not include a special symbol λ. We will use Γ_+ to denote the set Γ ∪ {λ}.

B. The RADIO and NOISY RADIO Models

In this section, we formally define the NOISY RADIO_ε model. The model RADIO is defined to be NOISY RADIO_0.

a) The noise function: Let G be a graph with n nodes. For v ∈ G, define the function single^v : Γ_+^n → Γ_+ as follows: Let σ ∈ Γ_+^n be a string and let N(v) be the set of neighbors of v in G. If it is not the case that for exactly one u ∈ N(v) it holds that σ_u ≠ λ, then we define single^v(σ) = λ. Otherwise, letting u ∈ N(v) be the unique neighbor with σ_u ≠ λ, we define single^v(σ) = σ_u. Finally, we define the randomized function N-single^v_ε(σ) to be single^v(σ) with probability 1 − ε and λ with probability ε. When we refer to N-single^v_ε(σ) for all nodes v in a set, say S, we implicitly assume that the randomness in N-single^v_ε(·) for all v ∈ S is independent.

b) Protocol: Fix ε ≥ 0, n ≥ 0, let G be a graph with n nodes, and let Γ be as above. A protocol Π in NOISY RADIO_ε over G with message set Γ is defined by a length parameter T ∈ N, nT transmission functions {f_m^v}_{m∈[T],v∈G}, and n output functions {g^v}_{v∈G}. The function f_m^v has type f_m^v : X^v × Γ_+^{m−1} → Γ_+, and the function g^v has the type g^v : X^v × Γ_+^T → Y^v. Here, we use X^v to denote the input space of node v, Y^v to denote the output space of node v, and λ to denote the erasure symbol. We use the convention that a node that chooses to receive transmits the erasure symbol λ.

A protocol is executed as follows. At the beginning, all the nodes v ∈ G have an input x^v ∈ X^v and start with the empty transcript π^v_{<1}. Before round m, for m ∈ [T], the node v has received a transcript π^v_{<m} ∈ Γ_+^{m−1}.

If it is not the case that exactly one neighbor of v is broadcasting in round m, then π^v_m = λ with probability 1; otherwise, π^v_m is a noisy copy of the unique broadcast, i.e., π^v_m is given by N-single^v_ε applied to the vector of transmitted symbols. Node v appends π^v_m to the transcript π^v_{<m} to get π^v_{≤m}. After round T, node v outputs g^v(x^v, π^v_{≤T}).

C. The SHARED RADIO Model

We mention that we only define SHARED RADIO for our graph and our communication task (subsection III-A and subsection III-B).

In the model SHARED RADIO, all the nodes have the same (shared) transcript. This transcript should be seen as the concatenation of all the n transcripts received by the different nodes in NOISY RADIO. However, we also give the nodes some additional information. Firstly, if there is an A^i such that exactly one node in A^i is broadcasting, we reveal the input of this node in the transcript. Secondly, if there is an A^i such that (strictly) more than one node in A^i is broadcasting, then our model behaves as if all the nodes in A^i are broadcasting. Finally, we also give some information about which subset of nodes broadcast. In the model SHARED RADIO, the nodes in A are restricted to broadcasting only their input, while the nodes in B never broadcast.

As in subsection IV-B, a protocol is given by transmission and output functions, and we use X^v and Y^v to denote the input and output spaces; since the shared transcript of a round is now a pair in Γ_+^n × Cases, the output function g^v has the type g^v : X^v × (Γ_+^n × Cases)^T → Y^v.

Formally, let S_m ⊆ A be the set of nodes that broadcast in round m.(4) The shared transcript of round m is the pair t_m = (π_m, ψ_m) ∈ Γ_+^n × Cases, defined as follows:

• If there are at least two values of i such that |S_m ∩ A^i| > 1, or at least three values of i such that |S_m ∩ A^i| > 0, then we define ψ_m = many. We also define π_{m,v} = λ for all v.

• If there are at most two values of i for which |S_m ∩ A^i| = 1, and for all other values of i we have |S_m ∩ A^i| = 0, then we define ψ_m = few. We also define π_{m,v} = λ if v ∉ S_m. Otherwise, we set π_{m,v} = x^v.

• Finally, we consider the case when there exists i ∈ [k] such that |S_m ∩ A^i| > 1 and |S_m ∩ (A \ A^i)| ≤ 1. In this case, we define ψ_m = (i, standard). Also, for v ∈ S_m ∩ (A \ A^i), we set π_{m,v} = x^v. We also set, independently for all j ∈ [k], π_{m,a_ij} = x^{a_ij} with probability 1 − ε, and π_{m,a_ij} = λ with probability ε. All other coordinates of π_m are set to λ.

All the players receive t_m and define t_≤m as t_{<m} ∘ t_m.

(4) We use the convention that a node v chooses to broadcast (their input) in round m if and only if f^v_m evaluates to 1.

Fix k > 0 and let G be the graph from subsection III-A with n = 2k^2 nodes A ∪ B. Fix ε ≥ 0 and let Π be a protocol over NOISY RADIO_ε.
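The noiseless 2k-round protocol for TribeTransfer_k from subsection III-B can be simulated directly. In the sketch below (helper names are ours, not from the paper), a node hears a symbol in a round if and only if exactly one of its neighbors broadcasts, per the radio rule:

```python
import itertools

def run_noiseless_tribe_transfer(k, inputs):
    """Simulate the noiseless 2k-round protocol for TribeTransfer_k.

    inputs[i][j] is the input x_ij of node a_ij. Returns the output of
    each node b_ij, which should be the string X^i = x_i1 ... x_ik.
    Neighborhoods follow subsection III-A.
    """
    def neighbors_b(i, j):
        # Neighbors of b_ij: a_ij, the rest of the clique B^i, and
        # a_i'j' for every i' != i.
        nbrs = {('a', i, j)}
        nbrs |= {('b', i, jp) for jp in range(k) if jp != j}
        nbrs |= {('a', ip, jp) for ip in range(k) if ip != i
                 for jp in range(k)}
        return nbrs

    heard = {}  # heard[(i, j)] = symbol b_ij received in round i
    out = {(i, j): [None] * k
           for i, j in itertools.product(range(k), repeat=2)}

    # Rounds 1..k: all of A^i broadcast their inputs in round i.
    for rnd in range(k):
        sending = {('a', rnd, j): inputs[rnd][j] for j in range(k)}
        for i, j in itertools.product(range(k), repeat=2):
            speakers = neighbors_b(i, j) & set(sending)
            if len(speakers) == 1:  # no collision: b_ij hears a_ij
                heard[(i, j)] = sending[speakers.pop()]

    # Rounds k+1..2k: in round k+j, the nodes {b_ij | i} forward
    # what they heard; each b_ij' (j' != j) hears only its clique
    # neighbor b_ij, since no a-node is broadcasting.
    for rnd in range(k):
        sending = {('b', i, rnd): heard[(i, rnd)] for i in range(k)}
        for i, j in itertools.product(range(k), repeat=2):
            speakers = neighbors_b(i, j) & set(sending)
            if len(speakers) == 1:
                (_, si, sj) = next(iter(speakers))
                out[(i, j)][sj] = sending[('b', si, sj)]

    # b_ij already knows its own symbol x_ij from the first phase.
    for i, j in itertools.product(range(k), repeat=2):
        out[(i, j)][j] = heard[(i, j)]
    return out
```

Running this for any inputs confirms that every b_ij ends with the full string X^i after exactly 2k rounds, as claimed.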

Assume that when Π commences, each node v ∈ A has an input x^v ∈ Γ, while nodes in B have no input.

Here, 1(E) denotes the indicator function for the event E.

358 ∈ fresh and independent for all v G. happens, we define:⎧ ⎪ v ⎪N-single (σ (πT )) ,v ∈ A ∗ ⎪ few ⎪ i As ψT = , ⎪λ,v∈ B \ B v v ⎪ we have πT,v = λ if fT (x , (π

Fix a set Γ, set ε = 1/3, and denote by SHARED RADIO the model with error rate ε over G and message set Γ. The proof of Theorem I.1 follows directly from the following theorem in combination with Theorem IV.1.

Theorem V.1. Fix k ∈ N such that k > 2^100, and let Π be a protocol for TribeTransfer_k in the SHARED RADIO model such that |Π| < (k log k)/10000. There exists a set I ⊆ [k] such that |I| ≥ 4k/5 and for all i ∈ I and all j ∈ [k], we have
Pr(b_ij outputs X^i) ≤ 1/4.

A. Notation

A general protocol will be denoted using Π, and T will be reserved for |Π|. The notation X denotes the random vector of all the n inputs, and x is one realization of X. The input for node v will be denoted using x^v (or X^v). The inputs for the nodes in A^i will be denoted using X^i. The random variable T_m takes values in Γ_+^n × Cases and denotes the transcript received by the players in round m. A particular realization of T_m = (Π_m, Ψ_m) will be denoted as t_m = (π_m, ψ_m). We define T_≤m to be the concatenation of the transcripts received in the first m rounds. Analogously, we define T_{<m}, t_≤m, etc.

Observe that, for any realization ψ of Ψ,
Σ_{i∈[k]} |use^i(ψ)| ≤ T.    (14)

B. The Proof of Theorem V.1

For the rest of the text, we fix k and a protocol Π for TribeTransfer_k in SHARED RADIO. Recall that T_≤m = (Π_≤m, Ψ_≤m) is a random variable that denotes the transcript of Π in the first m rounds. Unless stated otherwise, the probabilities and expectations below are over the players' inputs, and the randomness used by the players and by the channel. We prove Theorem V.1 in 3 steps:

Theorem V.2. If |Π| = T ≤ (k log k)/10000, there is a set I ⊆ [k] such that |I| ≥ 9k/10, and for all i ∈ I, we have
Pr(ct^i(Π) ≥ k − √k) ≤ 1/10.

Theorem V.2 shows that for all i ∈ I, a significant number of nodes in A^i were 'erased' out during the execution of Π. One may conclude that nodes in B^i would not be able to output X^i. This is fallacious, as it may be the case that information about X^i was revealed

ct(π_≤m) = Σ_{i∈[k]} ct^i(π_≤m).
Sometimes, we need these functions for a transcript of a single round. In this case, we have
succ^i(π_m) = {v ∈ A^i | π_{m,v} ≠ λ}.
The functions succ(·), ct^i(·), and ct(·) for the transcript of a single round are defined similarly. Define, for all i ∈ [k], the set-valued function
use^i(ψ_≤m) = {m' ≤ m | ψ_{m'} = (i, standard)}.

We first present the proof of Theorem V.1 assuming the three results above.

Proof of Theorem V.1. Using Lemma A.10 and linearity of expectation, we can rewrite Theorem V.3 as:
Σ_{i∈[k]} I(X^i : T) ≤ γ · Σ_{i∈[k]} E[ct^i(Π)] + 20γT log k.    (15)
Define the set:
I = {i ∈ [k] | I(X^i : T) ≤ γ · E[ct^i(Π)] + 200γT log k / k}.

Consider the quantity ζ = Σ_{i∈[k]} Pr(ct^i(Π) ≥ k − √k). Define the random variable Y^i = 1(ct^i(Π) ≥ k − √k). It holds that ζ = Σ_{i∈[k]} E[Y^i] = E[Σ_{i∈[k]} Y^i]. Assuming Pr(E_3) ≤ 1/200, we have:
ζ ≤ k · Pr(E_3) + (k/200) · Pr(not E_3) ≤ k/100.
This implies the theorem, as there can be at most k/10 values of i such that Pr(ct^i(Π) ≥ k − √k) ≥ 1/10. We now prove Equation 16.

We claim that |I| ≥ 9k/10. Otherwise, using the fact that I(X^i : T) = γk − H(X^i | T) ≥ γ · E[ct^i(Π)], we have the following:
Σ_{i∈[k]} I(X^i : T) = Σ_{i∈I} I(X^i : T) + Σ_{i∈[k]\I} I(X^i : T)
≥ Σ_{i∈I} γ · E[ct^i(Π)]

200γT log k To show inequality (a), we assume E3 ∧E4 and derive + γ · E[cti(Π)] + i∈ k \I k a contradiction. Consider a realization (π, ψ) of (Π, Ψ) [ ] such that E3 ∧ E4 holds. Then, if (Π, Ψ) = (π, ψ),we > γ · E[cti(Π)] + 20γT log k, ⊆ | | k ∈ have a set I [k]: I > 400 such that for all i I, i∈[k] ⎛ ⎞ √ a contradiction to Equation 15. ⎝ ⎠ 2 cti(π)− cti(π≤m) − cti(π

8 We present our proof of Theorem V.2. cti(π≤m) − cti(π

m∈[T ] i:m∈usei(ψ) Proof of Theorem V.2. Recall (Π, Ψ) is a random variable that denotes the transcript of Π. We begin by a contradiction. To show inequality (b), we show that E =⇒ E . defining the following events that depend on (Π, Ψ): 4 5

i 2 Consider a realization (π, ψ) of (Π, Ψ) such that E4 ≡ − ≥ − 3 k E1 cti(Π≤m) cti(Π i I :cti(Π) k k. E happens. 200 5 Inequality (c) follows from a simple union bound and ≡∃ ⊆ | | k ∧∀ ∈ 1 E4 I [k]: I > i I : Ei . ∧ ≤ | 400 the fact that Pr(A B) Pr(A B) for any two events ≡∃ ∈ i ∧ i A, B. E5 i [k]:E1 E2. Finally, we show inequality (d) by showing that for We will show that any i and any ψ such that Ei happens when Ψ=ψ,we a b 2 ( ) ( ) i | ≤ 1 i ≤ ≤ have Pr(E1 ψ) k . Fix any such ψ. Then, E1 can Pr(E3) Pr(E4) Pr(E5) 200 2 ⊆ | |≥ − 3 (c) (d) (16) happen only if there is a set J [k], J k k ≤ i | i ≤ 1 Pr(E E ) . such that for all j ∈ J, there exists m ∈ usei(ψ) 1 2 200 i∈[k]  ∈ such that πm,aij = λ. Since for all j [k] and ≤ ∈  Before proving Equation 16, we show that Pr(E3) m usei(ψ),wehaveπm,aij = λ independently with 1 2 log k | i |≤ 200 indeed implies the theorem. Consider the quantity probability 3 and we have use (ψ) 25 , there exists

361 ∈  m usei(ψ) such that πm,aij = λ with probability 2logk. This is because succi(Πm) is determined by | ψ | − 1 − − 2 usei( ) ≤ − 10 1 (1 3 ) 1 k . We union bound over noise independently from the input X, and the rest of all possible values of J to get succ(Πm) has entropy at most γ +2logk, as at most 2 i 2 k−k 3 one πm,v may be different from λ for v/∈ A . i k 3 − 1 | ≤ · − 10 Pr(E1 ψ) k 1 k We next claim that Values of J H m | m T

= I(X : T

362 • Otherwise: In these cases, because at most two Let Z be the random variable that whenever T = t = i coordinates of Πm may be different from λ, (π, ψ) receives the value γk − H(X | t) − γ cti(π).By

I(X : Tm |T

In addition,
H(X^i | t) ≤ log(|Ω_t|) = γk − γ ct^i(π),
implying that Z ≥ 0. Since Z is a non-negative random variable whose expectation is at most f(k), by Markov's inequality, we have
Pr(Z ≥ 10f(k)) ≤ 1/10.
Let t be a realization of T such that Z < 10f(k). It holds that
2^{−H_∞(X^i | t)} ≤ 1 − (H(X^i | t) − 1)/(log(|Ω_t|) + 1)    (by Lemma V.8)
≤ 1 − (γk − γ ct^i(π) − 10f(k) − 1)/(γk + 1 − γ ct^i(π))    (by Equation 20 and Z < 10f(k))
≤ (10f(k) + 2)/(γk + 1 − γ ct^i(π)),
and the assertion follows.

E. Proof of Theorem V.4

We will need the following lemma connecting H(X | t) and H_∞(X | t). The argument in this lemma is similar to that in Lemma 2.6 in [BEGH16] (where it was said to be a special case of Fano's inequality). See [KR13] also.

Lemma V.8. Let (X, T) be a pair of discrete random variables. Let t be a realization of T. Define the set Ω_t = {x | Pr(x, t) ≠ 0}. If |Ω_t| > 0, it holds that
2^{−H_∞(X | t)} ≤ 1 − (H(X | t) − 1)/(log(|Ω_t|) + 1).

Proof. Assume that |Ω_t| > 1, as otherwise the claim is trivial. Let x* maximize Pr(x, t) and define p* = Pr(x* | t). We have
H(X | t) = −Σ_{x∈Ω_t} Pr(x | t) log(Pr(x | t))
= −p* log(p*) − Σ_{x∈Ω_t\{x*}} Pr(x | t) log(Pr(x | t))
= −p* log(p*) − (1 − p*) log(1 − p*) − Σ_{x∈Ω_t\{x*}} Pr(x | t) log(Pr(x | t)/(1 − p*))
= h(p*) + (1 − p*) H(X | T = t, X ≠ x*)
≤ h(p*) + (1 − p*) log(|Ω_t|)
≤ 1 + (1 − p*) log(|Ω_t|),
where h(x) = −x log x − (1 − x) log(1 − x) is defined on (0, 1). The proof is done as p* = 2^{−H_∞(X | t)} by definition.

Proof of Theorem V.4. Recall that
I(X^i : T) = H(X^i) − H(X^i | T) = γk − E_t[H(X^i | t)].

REFERENCES

[AGS16] Shweta Agrawal, Ran Gelles, and Amit Sahai. Adaptive protocols for interactive communication. In Information Theory (ISIT), 2016 IEEE International Symposium on, pages 595–599. IEEE, 2016. 3, 4
[BE14] Mark Braverman and Klim Efremenko. List and unique coding for interactive communication in the presence of adversarial noise. In Foundations of Computer Science (FOCS), pages 236–245, 2014. 3
[BEGH16] Mark Braverman, Klim Efremenko, Ran Gelles, and Bernhard Haeupler. Constant-rate coding for multiparty interactive communication is impossible. In Symposium on Theory of Computing (STOC), pages 999–1010. ACM, 2016. 3, 5, 16
[BGMO17] Mark Braverman, Ran Gelles, Jieming Mao, and Rafail Ostrovsky. Coding for interactive communication correcting insertions and deletions. IEEE Transactions on Information Theory, 63(10):6256–6270, 2017. 3
[BKN14] Zvika Brakerski, Yael Tauman Kalai, and Moni Naor. Fast interactive coding against adversarial noise. Journal of the ACM (JACM), 61(6):35, 2014. 3
[BR11] Mark Braverman and Anup Rao. Towards coding for maximum errors in interactive communication. In

Symposium on Theory of Computing (STOC), pages 159–166. ACM, 2011. 3
[Bra12] Mark Braverman. Towards deterministic tree code constructions. In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, pages 161–167. ACM, 2012. 3
[CGK+14] Keren Censor-Hillel, Seth Gilbert, Fabian Kuhn, Nancy A. Lynch, and Calvin C. Newport. Structuring unreliable radio networks. Distributed Computing, 27(1):1–19, 2014. 4
[CHHZ17] Keren Censor-Hillel, Bernhard Haeupler, D. Ellis Hershkowitz, and Goran Zuzic. Broadcasting in noisy radio networks. In Symposium on Principles of Distributed Computing (PODC), pages 33–42, 2017. 4
[CHHZ18] Keren Censor-Hillel, Bernhard Haeupler, D. Ellis Hershkowitz, and Goran Zuzic. Erasure correction for noisy radio networks. CoRR, abs/1805.04165, 2018. 1, 4
[CK85] Imrich Chlamtac and Shay Kutten. On broadcasting in radio networks–problem analysis and protocol design. IEEE Trans. Communications, 33(12):1240–1246, 1985. 1
[EGH16] Klim Efremenko, Ran Gelles, and Bernhard Haeupler. Maximal noise in interactive communication over erasure channels and channels with feedback. IEEE Transactions on Information Theory, 62(8):4575–4588, 2016. 3
[EKS18] Klim Efremenko, Gillat Kol, and Raghuvansh Saxena. Interactive coding over the noisy broadcast channel. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, pages 507–520. ACM, 2018. 1, 3, 4, 5, 19
[FK00] Uriel Feige and Joe Kilian. Finding OR in a noisy broadcast network. Inf. Process. Lett., 73(1-2):69–75, 2000. 4
[Gal88] Robert G. Gallager. Finding parity in a simple broadcast network. IEEE Transactions on Information Theory, 34(2):176–180, 1988. 1, 4
[Gam87] Abbas El Gamal. Open problems presented at the 1984 workshop on specific problems in communication and computation sponsored by Bell Communications Research. Appeared in "Open Problems in Communication and Computation", by Thomas M. Cover and B. Gopinath (editors). Springer-Verlag, 1987. 1, 4
[GH14] Ran Gelles and Bernhard Haeupler. Capacity of interactive communication over erasure channels and channels with feedback. In Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1296–1311. SIAM, 2014. 3
[GHK+16] Ran Gelles, Bernhard Haeupler, Gillat Kol, Noga Ron-Zewi, and Avi Wigderson. Towards optimal deterministic coding for interactive communication. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1922–1936. Society for Industrial and Applied Mathematics, 2016. 3
[GHLN12] Mohsen Ghaffari, Bernhard Haeupler, Nancy A. Lynch, and Calvin C. Newport. Bounds on contention management in radio networks. In International Symposium on Distributed Computing (DISC), pages 223–237, 2012. 4
[GHM18] Ofer Grossman, Bernhard Haeupler, and Sidhanth Mohanty. Algorithms for noisy broadcast with erasures. In 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018). Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2018. 4
[GHS14] Mohsen Ghaffari, Bernhard Haeupler, and Madhu Sudan. Optimal error rates for interactive coding I: Adaptivity and other settings. In Proceedings of the forty-sixth annual ACM symposium on Theory of computing, pages 794–803. ACM, 2014. 2, 3
[GK17] Ran Gelles and Yael Tauman Kalai. Constant-rate interactive coding is impossible, even in constant-degree networks. In 8th Innovations in Theoretical Computer Science Conference, ITCS 2017, January 9-11, 2017, Berkeley, CA, USA, pages 21:1–21:13, 2017. 3
[GKS08] Navin Goyal, Guy Kindler, and Michael Saks. Lower bounds for the noisy broadcast problem. SIAM Journal on Computing, 37(6):1806–1841, 2008. 1, 3, 4, 5
[GLN13] Mohsen Ghaffari, Nancy A. Lynch, and Calvin C. Newport. The cost of radio network broadcast for different models of unreliable links. In Principles of Distributed Computing (PODC), pages 345–354, 2013. 4
[GMS11] Ran Gelles, Ankur Moitra, and Amit Sahai. Efficient and explicit coding for interactive communication. In Foundations of Computer Science (FOCS), 2011 IEEE 52nd Annual Symposium on, pages 768–777. IEEE, 2011. 3
[GMS14] Ran Gelles, Ankur Moitra, and Amit Sahai. Efficient coding for interactive communication. IEEE Transactions on Information Theory, 60(3):1899–1913, 2014. 3
[Hae14] Bernhard Haeupler. Interactive channel capacity revisited. In Foundations of Computer Science (FOCS), 2014 IEEE 55th Annual Symposium on, pages 226–235. IEEE, 2014. 3
[KKP01] Evangelos Kranakis, Danny Krizanc, and Andrzej Pelc. Fault-tolerant broadcasting in radio networks. Journal of Algorithms, 39(1):47–67, 2001. 4
[KLN+10] Fabian Kuhn, Nancy A. Lynch, Calvin C. Newport, Rotem Oshman, and Andrea W. Richa. Broadcasting in unreliable radio networks. In Symposium on Principles of Distributed Computing (PODC), pages 336–345, 2010. 4
[KM05] Eyal Kushilevitz and Yishay Mansour. Computation in noisy radio networks. SIAM J. Discrete Math., 19(1):96–108, 2005. 4
[KPP10] Evangelos Kranakis, Michel Paquette, and Andrzej Pelc. Communication in random geometric radio networks with positively correlated random faults. Ad Hoc & Sensor Wireless Networks, 9(1-2):23–52, 2010. 4
[KR13] Gillat Kol and Ran Raz. Interactive channel capacity. In Proceedings of the forty-fifth annual ACM symposium on Theory of computing, pages 715–724. ACM, 2013. 3, 16
[MS14] Cristopher Moore and Leonard J. Schulman. Tree codes and a conjecture on exponential sums. In Innovations in Theoretical Computer Science (ITCS), pages 145–154. ACM, 2014. 3
[New04] Ilan Newman. Computing in fault tolerance broadcast networks. In Computational Complexity Conference (CCC), pages 113–122, 2004. 4
[Pel07] David Peleg. Time-efficient broadcasting in radio networks: A review. In Distributed Computing and Internet Technology (ICDCIT), pages 1–18, 2007. 2
[RS94] Sridhar Rajagopalan and Leonard J. Schulman. A coding theorem for distributed computation. In Symposium on the Theory of Computing (STOC), pages 790–799, 1994. 3
[Sch92] Leonard J. Schulman. Communication on noisy channels: A coding theorem for computation. In Foundations of Computer Science (FOCS), pages 724–733. IEEE, 1992. 3, 19
[Sch93] Leonard J. Schulman. Deterministic coding for interactive

communication. In Symposium on Theory of Computing (STOC), pages 747–756. ACM, 1993. 3
[Sch96] Leonard J. Schulman. Coding for interactive communication. IEEE Transactions on Information Theory, 42(6):1745–1756, 1996. 3

APPENDIX A
INFORMATION THEORY PRELIMINARIES

A. Entropy

Definition A.1 (Entropy). The (binary) entropy of a discrete random variable X is defined as
H(X) = Σ_{x∈supp(X)} Pr(x) log(1/Pr(x)) = E_{x∼X}[log(1/Pr(x))].

Definition A.2 (Conditional Entropy). The entropy of a discrete random variable X given another random variable Y is defined as
H(X | Y) = E_{y∼Y}[H(X | Y = y)].

Fact A.3. We have H(XY) = H(X) + H(Y | X) ≤ H(X) + H(Y). Equality holds if X and Y are independent.

Lemma A.10 (Super-additivity of information). Let n > 0 and X_1, X_2, ..., X_n be mutually independent random variables. Then, for any random variable Y,
Σ_{i∈[n]} I(X_i : Y) ≤ I(X_1 X_2 ⋯ X_n : Y).

Proof. We have
I(X_1 X_2 ⋯ X_n : Y) = Σ_{i∈[n]} I(X_i : Y | X_1 ⋯ X_{i−1})    (chain rule)
= Σ_{i∈[n]} H(X_i | X_1 ⋯ X_{i−1}) − H(X_i | Y X_1 ⋯ X_{i−1})
= Σ_{i∈[n]} H(X_i) − H(X_i | Y X_1 ⋯ X_{i−1})    (independence of the X_i)
≥ Σ_{i∈[n]} H(X_i) − H(X_i | Y)
= Σ_{i∈[n]} I(X_i : Y).
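As a sanity check, super-additivity can be verified numerically on small examples. The sketch below (the helper functions are ours, not part of the paper) draws random functions Y = f(X_1, ..., X_n) of independent uniform bits and compares Σ_i I(X_i : Y) with I(X_1⋯X_n : Y):

```python
import itertools
import math
import random

def mutual_information(joint):
    """I(U : V) in bits for a joint distribution {(u, v): p}."""
    pu, pv = {}, {}
    for (u, v), p in joint.items():
        pu[u] = pu.get(u, 0.0) + p
        pv[v] = pv.get(v, 0.0) + p
    return sum(p * math.log2(p / (pu[u] * pv[v]))
               for (u, v), p in joint.items() if p > 0)

def check_superadditivity(n=3, trials=200, seed=0):
    """Check Lemma A.10 on random examples: X_1..X_n are independent
    uniform bits (so the independence hypothesis holds), and Y is a
    random deterministic function of them."""
    rng = random.Random(seed)
    p_x = 1 / 2 ** n
    for _ in range(trials):
        f = {x: rng.randrange(4)
             for x in itertools.product((0, 1), repeat=n)}
        # Joint distribution of (X, Y) over the full input vector.
        joint_full = {}
        for x in f:
            key = (x, f[x])
            joint_full[key] = joint_full.get(key, 0.0) + p_x
        rhs = mutual_information(joint_full)
        # Sum of I(X_i : Y) over the coordinate marginals.
        lhs = 0.0
        for i in range(n):
            joint_i = {}
            for x in f:
                key = (x[i], f[x])
                joint_i[key] = joint_i.get(key, 0.0) + p_x
            lhs += mutual_information(joint_i)
        if lhs > rhs + 1e-9:
            return False
    return True
```

The inequality can be strict: for Y = X_1 XOR X_2, each I(X_i : Y) is 0 while I(X_1 X_2 : Y) = 1, which is exactly the slack the lemma allows.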

Fact A.4. If the random variable X takes values in the set Ω, it holds that 0 ≤ H(X) ≤ log|Ω|.

B. Min-Entropy

Definition A.5 (Min-Entropy). The min-entropy of a discrete random variable X is
H_∞(X) = min_{x : Pr(x) ≠ 0} log(1/Pr(x)).

Fact A.6. If the random variable X takes values in the set Ω, it holds that 0 ≤ H_∞(X) ≤ H(X) ≤ log|Ω|.

C. Mutual Information

Definition A.7 (Mutual Information). The mutual information between two discrete random variables X and Y is defined as
I(X : Y) = H(X) − H(X | Y) = H(Y) − H(Y | X).
We can also define conditional mutual information as
I(X : Y | Z) = H(X | Z) − H(X | YZ) = H(Y | Z) − H(Y | XZ).

Fact A.8. We have 0 ≤ I(X : Y | Z) ≤ H(X).

Fact A.9 (Chain Rule). If A, B, C, D are random variables, then
I(AB : C | D) = I(A : C | D) + I(B : C | AD).

Lemma A.11. Let X, Y, Z be random variables. We have:
|I(X : Y) − I(X : Y | Z)| ≤ H(Z).

Proof. We prove both directions separately. Firstly,
I(X : Y | Z) + H(Z) ≥ I(X : Y | Z) + I(X : Z) = I(X : YZ) = I(X : Y) + I(X : Z | Y) ≥ I(X : Y).
For the other direction, note that:
I(X : Y) + H(Z) ≥ I(X : Y) + I(X : Z | Y) = I(X : YZ) = I(X : Z) + I(X : Y | Z) ≥ I(X : Y | Z).

APPENDIX B
CODING WITH LOGARITHMIC OVERHEAD

In this section we prove Theorem I.2. That is, we describe an O(log n)-overhead scheme that makes any non-adaptive noiseless protocol noise resilient. We can restrict attention to deterministic protocols, as randomized protocols are distributions over deterministic ones.

For the rest of the text, fix a graph G. Fix a deterministic, non-adaptive protocol Π that works over

the noiseless RADIO model over G. Fix inputs x^v for every v ∈ G. Let T be the number of rounds in Π. Recall from subsection IV-B that the protocol Π is defined by transmission functions f^v_m : X^v × Γ_+^{m−1} → Γ_+ and output functions g^v : X^v × Γ_+^T → Y^v, for every round m ∈ [T] and node v ∈ G. We next show how to transform Π into a noise resilient protocol.

Observe that if T ≤ n^100, we can repeat each round sufficiently many times and claim that the probability of the noise affecting the output is negligible. Thus, we assume T > n^100 throughout.

a) Consistent transcripts: Since we assume that the protocol Π is non-adaptive, the following set is well defined, and known to all nodes in advance:
S_{u→v} = {m ∈ [T] | u is the unique node in N(v) that is broadcasting in round m}.
For the rest of the text, the notation π[S] means the string obtained from π by deleting all the coordinates not in S. We also use the notation s ∘ s' to denote the concatenation of two strings s and s'.

Consider a set of transcripts {π^v ∈ Γ_+^*}_{v∈G}, one for each node v ∈ G. For a vertex u ∈ G, define the transmitted transcript ρ^u ∈ Γ_+^* of u with respect to π^u to be the string whose t-th symbol is f^u_t(x^u, π^u_{<t}). We say that the transcripts {π^v} are T'-consistent if, for all u, v ∈ G, the string π^v_{≤T'}[S_{u→v}] is the same as the corresponding portion of ρ^u. Given a T-consistent transcript, one can recover the noiseless transcripts for every v. Thus, it suffices for our simulation protocol to generate a T-consistent transcript.

A. The Simulation Protocol Π'

At a high level, our simulation protocol Π' simulates the noiseless protocol Π round by round, repeating each round c_0 log n times. The constant c_0 will be fixed in the analysis. Periodically, our protocol Π' has check stages to verify if any errors were introduced, and re-simulate if that is the case.

In order to control the overhead of our simulation, we break the noiseless protocol Π into chunks of n^5 rounds each. A level 0 simulation in our protocol simulates one chunk in Π followed by a 'check' stage. For l ≥ 1, a level l simulation has two level l − 1 simulations followed by a check stage. Finally, the protocol Π' has 2^2 successive executions of level ℓ simulations. Here, ℓ is the unique integer such that 10T ≤ 2^ℓ n^5 < 40T. Similar recursive structures can be found, for example, in [Sch92], [EKS18].

It will be convenient for nodes to simulate virtual rounds of Π whose round number may exceed T; if a virtual round number is larger than T, simulating it means that v simply receives.

Our protocol Π' for simulating Π is described in Algorithm 3. A level l simulation, for l ≥ 1, is described in Algorithm 2. It simulates (roughly) the first n^5·2^l rounds of Π. Algorithm 1 has a level 0 simulation. It simulates the first n^5 rounds of Π. The protocol in the check stages, CHECK_k, is described in Algorithm 4. This protocol assumes that each node v has a pair of transcripts π^v_1 and π^v_2. At the end of an execution each node v outputs an integer m^v. The protocol has the property that if {π^v_1} is m_1-consistent, then all the outputs m^v are the same value m_2, and m_2 is the maximum such that {π^v_1 ∘ π^v_2} is (m_1 + m_2)-consistent. CHECK_k uses the 'rumor spreading' protocol described in the following proposition:

Proposition B.1. For every set H of nodes and bound t, there is a protocol RS_G, in which every node v ∈ H has an input m^v ∈ [t], that takes O(n^2 log t) rounds such that when the protocol ends all the nodes output min_{v∈H} m^v.

The above proposition can be proved by having the nodes broadcast their input over a fixed spanning tree.

a) Simulation length: The for loop of CHECK_k has O(n^2) iterations and each iteration requires O(k log n) rounds (we use the assumption that the transcript π^v_2 has at most 2^{k−1} n^5 rounds for all v). By Proposition B.1, the execution of RS_G uses another O(n^2 k log(n)) rounds, as it is run with inputs in [2^{k−1} n^5]. Since every message is repeated 100k log n times, CHECK_k has n^2 poly(k) polylog(n) rounds. Thus, assuming that T > n^100, the total number of rounds in

366 Π is at⎛ most Algorithm 4 The level k check round CHECKk v v  ⎝  5 Input: Transcripts π1 ,π2 for every player v. The 2 2 n · c0 log(n) v k−1 5    transcript π2 has at most 2 n rounds (for all v). v Simulation rounds ⎞ Output: A round number m for each node v. ∀ ∈ v ← v v 1: v G : π π1 π2 . v k−  ⎟ ∀ ∈ ← 1 5  5 ⎟ 2: v G : r 2 n . 2 n ⎟ v v · · 2 3: ρ is constructed from π as in section B, for all + j 5 n poly(j + 1) polylog(n)⎟ j 2 n ⎠ ∈ =0   v G. 4: for all Ordered Edge e =(u, v) ∈ G do Check rounds ∗ k n  5: Sample a hash function h :Γ → Γ100 log . = O 22 n5 log n + 6: Using binary = O (T log n) . ∈{| v| ··· k−1 5 search, find the largestm π1 , , 2 n + | v|} u \ | u| π1 such that h ρ≤m[Su→v [ π1 ]] = v \ | v| Algorithm 1 Level 0 simulation h π≤m[Su→v [ π1 ]] . Each broadcast in this Input: A transcript πv for node v. step is repeated 100k log n times and majority is Output: A string sv for node v such that |sv|≤n5. taken. v v ← v −| v| 1: ∀v ∈ G : s ← ε, the empty string. 7: r min(r ,m π1 ). 2: for i ∈ [n5] do 8: end for v RS v 3: All nodes v simulate virtual round |π | + i with 9: Execute G with inputs r . Repeat each broadcast v v in this execution 100k log n times and take majority. transcript π s . This is repeated c0 log n times Let mv be the value received by node v. and the majority symbol received is appended to v sv. 10: Output m . 4: end for v v 5: Execute CHECK1 with inputs π and s . Let the output of node v be mv. B. Correctness Analysis for Π v v 6: ∀v ∈ G : s ← s≤mv . In this section we prove that Π indeed simulates Π. We divide this section into 4 subsections, one for each Algorithm 2 Level l simulation, l ≥ 1 one of our protocols. Throughout, for a string valued v v v Input: A transcript πv for node v. variable s ,weuseσ to denote |s |. If the value of Output: A string sv for node v such that |sv|≤n52l. σv is the same for all v, then, we use σ to denote the v 1: Execute a level l − 1 simulation with inputs π to common value. 
Otherwise, define σ =0. Analogously, v v v get outputs s1. for a variable s , we define σ and σ , etc.. − v v 1 1 1 2: Execute a level l 1 simulation with inputs π s1 v For the rest of this section, all probabilities and to get outputs s2. ∀ ∈ v ← v v expectations are over the randomness used by the nodes, 3: v G : s s1 s2. v v as well as the noise in the channel. 4: Execute CHECKl+1 with inputs π and s . Let the v output of node v be m . 1) Analyzing CHECKk: v v 5: ∀v ∈ G : s ← s≤mv . Lemma B.2. Consider an execution of CHECKk with v v inputs π1 ,π2 for node v. As in the protocol, we define v v v v  { } π = π1 π2 . Suppose that π1 is m1-consistent and Algorithm 3 The Simulation Protocol Π v v let m be the largest integer such that {π } is Input: An input x for node v. Note that the rest of the 2 ≤m1+m2 v v protocols are also using x . (m1 + m2)-consistent. Let m be the output of node v. Output: A string sv for node v. It holds that v 1: ∀v ∈ G : π ← ε, the empty string. ∃ ∈ v  ≤ −30k  Pr( v G : m = m2) n . 2: for i ∈ [2 ] do v 3: Execute a level simulation with inputs π to Proof. Each broadcast in CHECKk is repeated 100k log n v get outputs s . times. Since the total number of rounds is poly(k, n), ∀ ∈ v ← v v 4: v G : π π s . we can use a union bound to claim that every broadcast 5: end for v v is received correctly by all the nodes except with 6: Output s ← π . − k probability at most n 40 . We condition on this event throughout the rest of the proof.
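The prefix-agreement step in line 6 of Algorithm 4 compares short random hashes of transcript prefixes and uses the monotonicity of prefix agreement to binary search for the largest agreeing round. A simplified local sketch follows; the hash construction and function names here are illustrative stand-ins (in the protocol the hashes are exchanged over the noisy channel and each broadcast is repeated 100k log n times):

```python
import hashlib

def prefix_hash(s, m, seed):
    """Short seeded hash of the length-m prefix of string s."""
    data = f"{seed}:{s[:m]}".encode()
    return hashlib.sha256(data).hexdigest()[:8]

def largest_agreeing_prefix(rho, pi, lo, hi, seed=1234):
    """Largest m in [lo, hi] with rho[:m] == pi[:m], located by binary
    search over hash comparisons. Prefix agreement is monotone (if the
    prefixes of length m agree, so do all shorter ones), which is what
    makes binary search sound; returns lo if even that prefix differs."""
    best = lo
    while lo <= hi:
        mid = (lo + hi) // 2
        if prefix_hash(rho, mid, seed) == prefix_hash(pi, mid, seed):
            best, lo = mid, mid + 1
        else:
            hi = mid - 1
    return best
```

As in the analysis of CHECK_k, the only failure mode is a hash collision on a differing prefix; with hashes of length Θ(k log n) over Γ, each comparison errs with probability at most n^{−Θ(k)}, which a union bound over all O(n^2) edge pairs absorbs.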

Since the hash function h has range Γ^{100k log n}, the probability that

    h(ρ_{≤m}[S_{u→v} \ [|π_1^u|]]) = h(π_{≤m}[S_{u→v} \ [|π_1^v|]])

when ρ_{≤m}[S_{u→v} \ [|π_1^u|]] ≠ π_{≤m}[S_{u→v} \ [|π_1^v|]] is at most n^{−100k}. By a union bound, the probability that this happens for some u, v during an execution of CHECK_k is at most n^{−40k}. We also condition on this event not happening throughout the rest of the proof. Both of our assumptions hold except with probability at most n^{−30k}.

Let r̂^v be the value of r^v right after the for loop. Using Proposition B.1 and the fact that {π_1^v} is m_1-consistent, we have

    ∀v ∈ G : m^v = min_{u∈G} r̂^u.

Given our conditioning, we also have

    r̂^v = min_{e=(u,v)∈G} max{m | h(ρ_{≤m}[S_{u→v} \ [|π_1^u|]]) = h(π_{≤m}[S_{u→v} \ [|π_1^v|]])} − |π_1^v|
        = min_{e=(u,v)∈G} max{m | ρ_{≤m}[S_{u→v}] = π_{≤m}[S_{u→v}]} − m_1        (as {π_1^v} is m_1-consistent)
        = m_2.

Combining the two equations above and using the fact that m_2 ≥ 0 gives the result.

Corollary B.3. For l, m ≥ 0, if the input {π^v} to a level l simulation is m-consistent, then the output {s^v} satisfies

    Pr({π^v s^v} is not (m + σ)-consistent) ≤ n^{−30(l+1)}.

Proof. A level l simulation ends with an execution of CHECK_{l+1} with {π^v} as the first input. Use Lemma B.2.

2) Analyzing a Level 0 Simulation: Recall the definitions of s^v and σ at the beginning of subsection B-B. The following lemma shows that, with high probability, an execution of a level 0 simulation simulates the next n^5 rounds of Π.

Lemma B.4. There exists a c_0 > 0 such that for all m ≥ 0, if the input {π^v} to a level 0 simulation is m-consistent, then

    E[σ] ≥ n^5 (1 − n^{−20}).

Proof. We pick c_0 large enough so that the probability (union bounded over all edges in the graph and all the n^5 rounds in a level 0 simulation) of an incorrect reception is at most n^{−50}.

Since we assume that the input {π^v} to the level 0 simulation is m-consistent, the transcripts π^v are a valid simulation of the first m rounds of Π. Thus, if there is no incorrect reception, the transcripts π^v, s^v that are input to CHECK_1 satisfy that {π^v s^v} is (m + n^5)-consistent. By Lemma B.2, we are done.

3) Analyzing a Level l Simulation: Recall the definitions of s_1^v, s_2^v, s^v, σ, σ_1, σ_2 at the beginning of subsection B-B. We have the following guarantee for a level l simulation.

Lemma B.5. If the input {π^v} to a level l simulation is m-consistent for some m ≥ 0, then

    E[σ] ≥ (E[σ_1] + E[σ_2])(1 − n^{−10l}).

Proof. Consider an execution of a level l simulation. Define the events:

    E_1 ≡ {π^v s_1^v} is (m + σ_1)-consistent.
    E_2 ≡ {π^v s_1^v s_2^v} is (m + σ_1 + σ_2)-consistent.

By Corollary B.3 applied to the first execution of the l − 1 simulation, Pr(Ē_1) ≤ n^{−30l}, as we assume that {π^v} is m-consistent. By Corollary B.3 applied to the second execution of the l − 1 simulation, Pr(Ē_2 | E_1) ≤ n^{−30l}. Therefore,

    Pr(Ē_2) = Pr(Ē_1) · Pr(Ē_2 | Ē_1) + Pr(E_1) · Pr(Ē_2 | E_1) ≤ n^{−30l} · 1 + 1 · n^{−30l} = 2 · n^{−30l}.

The input to CHECK_{l+1} is π^v, s^v. If E_2 occurs, this input satisfies that {π^v s^v} is (m + σ_1 + σ_2)-consistent. By Lemma B.2, we have

    Pr(σ ≠ σ_1 + σ_2 | E_2) ≤ n^{−30(l+1)}.

This means

    Pr(σ ≠ σ_1 + σ_2) ≤ n^{−30(l+1)} + Pr(Ē_2) ≤ n^{−10l}.

The result then follows as σ ≥ 0.

Lemma B.6. For l ≥ 0, if the input {π^v} to a level l simulation is m-consistent for some m ≥ 0, then

    E[σ] ≥ 0.9 · 2^l n^5.

Proof. We prove a stronger statement by induction on l. Namely, that

    E[σ] ≥ 2^l n^5 (1 − n^{−20} − Σ_{i=1}^{l} 2^{−i} n^{−8i}).

The base case is due to Lemma B.4. Consider an execution of a level l simulation. As above, define the event:

    E_1 ≡ {π^v s_1^v} is (m + σ_1)-consistent.

We get

    E[σ] ≥ (E[σ_1] + E[σ_2])(1 − n^{−10l})        (by Lemma B.5)

         ≥ (E[σ_1] + Pr(E_1) · E[σ_2 | E_1])(1 − n^{−10l})        (as σ_2 ≥ 0)
         ≥ 2^{l−1} n^5 (1 − n^{−20} − Σ_{i=1}^{l−1} 2^{−i} n^{−8i}) × (1 + 1 − Pr(Ē_1))(1 − n^{−10l})        (by induction hypothesis)
         = 2^l n^5 (1 − n^{−20} − Σ_{i=1}^{l−1} 2^{−i} n^{−8i}) × (1 − Pr(Ē_1)/2)(1 − n^{−10l})
         ≥ 2^l n^5 (1 − n^{−20} − Σ_{i=1}^{l−1} 2^{−i} n^{−8i})(1 − 2^{−l} n^{−8l})        (by Corollary B.3)
         ≥ 2^l n^5 (1 − n^{−20} − Σ_{i=1}^{l} 2^{−i} n^{−8i}).

4) Analyzing Π′: A level ℓ simulation gives a guarantee for the expected progress (Lemma B.6). In order to get concentration, the protocol Π′ has several executions of level ℓ simulations. For the output of Π′, we have:

Lemma B.7. Pr({s^v} is not T-consistent) ≤ exp(−2^{ℓ−4}) + n^{−6}.

Proof. For i ∈ [2^ℓ], let σ_i be the progress made by the i-th level ℓ simulation in Π′, and let π_i^v be the transcript π^v of node v at the end of iteration i. Define the events:

    E_i ≡ {π_i^v} is (Σ_{j∈[i]} σ_j)-consistent,

and

    E ≡ ∩_{i=1}^{2^ℓ} E_i.

Observe that:

    Pr(Ē) ≤ Pr(Ē_1) + Σ_{i=2}^{2^ℓ} Pr(Ē_i | E_{i−1}) ≤ Σ_{i=1}^{2^ℓ} n^{−20(ℓ+1)} ≤ n^{−6}.        (by Corollary B.3)

Condition on E for the rest of this proof. Due to E, we know that {s^v} is (Σ_{i=1}^{2^ℓ} σ_i)-consistent. It is enough to show that Σ_{i=1}^{2^ℓ} σ_i ≥ T. Also, due to E, for all the level ℓ simulations, we have by Markov's inequality and Lemma B.6

    Pr(σ < 2^{ℓ−1} n^5) < 1/2.        (22)

Furthermore, the bound in Equation 22 holds for each simulation independently. Define Y_i to be the indicator random variable that is 1 iff σ_i ≥ 2^{ℓ−1} n^5. Using a Chernoff bound, we have

    Pr(Σ_{i∈[2^ℓ]} Y_i < (1/2) · (1/2) · 2^ℓ) ≤ exp(−2^{ℓ−4}).

Since Pr(Ē) ≤ n^{−6}, we get that except with probability at most exp(−2^{ℓ−4}) + n^{−6},

    Σ_{i∈[2^ℓ]} σ_i ≥ Σ_{i∈[2^ℓ]: Y_i=1} σ_i ≥ 2^{ℓ−1} n^5 Σ_{i∈[2^ℓ]} Y_i ≥ 2^{ℓ−1} n^5 · 2^{ℓ−2} = 2^{2ℓ−3} n^5 ≥ T,

where the last inequality is by the definition of ℓ.

Proof of Theorem I.2. The output of our simulation protocol Π′ is T-consistent with high probability (Lemma B.7). This implies that Π′ simulates Π successfully with high probability.
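The doubling recursion analyzed in Lemmas B.4 to B.6 can be sketched in a few lines of Python. This is an idealized, noiseless toy of our own (the function names and the 5-round level 0 step are illustrative assumptions): a level 0 simulation contributes one block of freshly simulated rounds, and a level l simulation runs two level l − 1 simulations back to back, so in the absence of truncation it contributes 2^l such blocks.

```python
def level_simulation(l, transcript, run_level0):
    """Idealized (noiseless) skeleton of a level-l simulation.

    Level 0 extends the transcript by one block of fresh rounds; level l
    runs two level-(l-1) simulations back to back.  In the real protocol,
    CHECK_{l+1} would then truncate the result to its consistent prefix;
    with no noise, nothing is ever truncated, so the check is a no-op here.
    """
    if l == 0:
        return run_level0(transcript)
    s1 = level_simulation(l - 1, transcript, run_level0)
    s2 = level_simulation(l - 1, transcript + s1, run_level0)
    return s1 + s2  # CHECK_{l+1}: no-op in the noiseless toy

# Toy level-0 step: always simulates 5 new rounds, so a level-l simulation
# extends the transcript by 5 * 2**l rounds.
run_level0 = lambda transcript: ["round"] * 5
```

With noise, the progress σ (the number of new rounds that survive the check) can fall short of 2^l n^5; this shortfall is exactly what Lemmas B.4 to B.6 control in expectation and what Lemma B.7 turns into a high-probability bound.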