IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 10, OCTOBER 2006

A Random Linear Network Coding Approach to Multicast

Tracey Ho, Member, IEEE, Muriel Médard, Senior Member, IEEE, Ralf Koetter, Senior Member, IEEE, David R. Karger, Associate Member, IEEE, Michelle Effros, Senior Member, IEEE, Jun Shi, and Ben Leong

Abstract—We present a distributed random linear network coding approach for transmission and compression of information in general multisource multicast networks. Network nodes independently and randomly select linear mappings from inputs onto output links over some field. We show that this achieves capacity with probability exponentially approaching 1 with the code length. We also demonstrate that random linear coding performs compression when necessary in a network, generalizing error exponents for linear Slepian–Wolf coding in a natural way. Benefits of this approach are decentralized operation and robustness to network changes or link failures. We show that this approach can take advantage of redundant network capacity for improved success probability and robustness. We illustrate some potential advantages of random linear network coding over routing in two examples of practical scenarios: distributed network operation and networks with dynamically varying connections. Our derivation of these results also yields a new bound on required field size for centralized network coding on general multicast networks.

Fig. 1. An example of distributed random linear network coding. X_1 and X_2 are the source processes being multicast to the receivers, and the coefficients ξ_i are randomly chosen elements of a finite field. The label on each link represents the process being transmitted on the link.

Index Terms—Distributed compression, distributed networking, multicast, network coding, random linear coding.

Manuscript received February 26, 2004; revised June 1, 2006. This work was supported in part by the National Science Foundation under Grants CCF-0325324, CCR-0325673, and CCR-0220039, by Hewlett-Packard under Contract 008542-008, and by Caltech's Lee Center for Advanced Networking. T. Ho was with the Laboratory for Information and Decision Systems (LIDS), Massachusetts Institute of Technology (MIT), Cambridge, MA 02139 USA. She is now with the California Institute of Technology (Caltech), Pasadena, CA 91125 USA (e-mail: [email protected]). M. Médard is with the Laboratory for Information and Decision Systems, Massachusetts Institute of Technology (MIT), Cambridge, MA 02139 USA (e-mail: [email protected]). R. Koetter is with the Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA (e-mail: [email protected]). D. R. Karger is with the Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology (MIT), Cambridge, MA 02139 USA (e-mail: [email protected]). M. Effros is with the Department of Electrical Engineering, California Institute of Technology, Pasadena, CA 91125 USA (e-mail: [email protected]). J. Shi was with the University of California, Los Angeles, CA, USA. He is now with Intel Corporation, Santa Clara, CA 95054 USA (e-mail: [email protected]). B. Leong was with the Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology (MIT), Cambridge, MA 02139 USA. He is now with the National University of Singapore, Singapore 119260, Republic of Singapore (e-mail: [email protected]). Communicated by A. Ashikhmin, Associate Editor for Coding Theory. Digital Object Identifier 10.1109/TIT.2006.881746

I. INTRODUCTION

THE capacity of multicast networks with network coding was given in [1]. We present an efficient distributed randomized approach that asymptotically achieves this capacity. We consider a general multicast framework—multisource multicast, possibly with correlated sources, on general networks. This family of problems includes traditional single-source multicast for content delivery and the incast or reachback problem for sensor networks, in which several, possibly correlated, sources transmit to a single receiver. We use a randomized strategy: all nodes other than the receiver nodes perform random linear mappings from inputs onto outputs over some field. These mappings are selected independently at each node. An illustration is given in Fig. 1. The receivers need only know the overall linear combination of source processes in each of their incoming transmissions. This information can be sent with each transmission block or packet as a vector of coefficients corresponding to each of the source processes, and updated at each coding node by applying the same linear mappings to the coefficient vectors as to the information signals. The relative overhead of transmitting these coefficients decreases with increasing length of blocks over which the codes and network remain constant. For instance, if the network and network code are fixed, all that is needed is for the sources to send, once, at the start of operation, a canonical basis through the network.

Our primary results show, first, that such random linear coding achieves multicast capacity with probability exponentially approaching 1 with the length of code. Second, in the context of a distributed source coding problem, we demonstrate that random linear coding also performs compression when necessary in a network, generalizing known error exponents for linear Slepian–Wolf coding [4] in a natural way.

This approach not only recovers the capacity and achievable rates, but also offers a number of advantages. While capacity can be achieved by other deterministic or random approaches, they require, in general, network codes that are planned by or known to a central authority. Random design of network codes was first considered in [1]; our contribution is in showing how random linear network codes can be constructed and efficiently

communicated to receivers in a distributed manner. For the case of distributed operation of a network whose conditions may be varying over time, our work hints at a beguiling possibility: that a network may be operated in a decentralized manner and still achieve the information rates of the optimized solution. Our distributed network coding approach has led to and enabled subsequent developments in distributed network optimization, e.g., [20], [13]. The distributed nature of our approach also ties in well with considerations of robustness to changing network conditions. We show that our approach can take advantage of redundant network capacity for improved success probability and robustness. Moreover, issues of stability, such as those arising from propagation of routing information, are obviated by the fact that each node selects its code independently from the others.

Our results, more specifically, give a lower bound on the probability of error-free transmission for independent or linearly correlated sources, which, owing to the particular form of transfer matrix determinant polynomials, is tighter than the Schwartz–Zippel bound (e.g., [23]) for general polynomials of the same total degree. This bound, which is exponentially dependent on the code length, holds for any feasible set of multicast connections over any network topology (including networks with cycles and link delays). The result is derived using a formulation based on the Edmonds matrix of bipartite matching, which leads also to an upper bound on field size required for deterministic centralized network coding over general networks. We further give, for acyclic networks, tighter bounds based on more specific network structure, and show the effects of redundancy and link reliability on success probability. For arbitrarily correlated sources, we give error bounds for minimum entropy and maximum a posteriori probability decoding. In the special case of a Slepian–Wolf source network consisting of a link from each source to the receiver, our error exponents reduce to the corresponding results in [4] for linear Slepian–Wolf coding. The latter scenario may thus be considered a degenerate case of network coding.

We illustrate some possible applications with two examples of practical scenarios—distributed settings and networks with dynamically varying connections—in which random linear network coding shows particular promise of advantages over routing.

This paper is an initial exploration of random linear network coding, posing more questions than it answers. We do not cover aspects such as resource and energy allocation, but focus on optimally exploiting a given set of resources. Resource consumption can naturally be traded off against capacity and robustness, and across multiple communicating sessions; subsequent work on distributed resource optimization, e.g., [13], [21], has used random linear network coding as a component of the solution. There are also many issues surrounding the adaptation of protocols, which generally assume routing, to random coding approaches. We do not address these here, but rather seek to establish that the potential benefits of random linear network coding justify future consideration of protocol compatibility with or adaptation to network codes.

The basic random linear network coding approach involves no coordination among nodes. Implementations for various applications may not be completely protocol-free, but the roles and requirements for protocols may be substantially redefined in this new environment. For instance, if we allow for retrials to find successful codes, we in effect trade code length for some rudimentary coordination.

Portions of this work have appeared in [9], which introduced distributed random linear network coding; [8], which presented the Edmonds matrix formulation and a new bound on required field size for centralized network coding; [12], which generalized previous results to arbitrary networks and gave tighter bounds for acyclic networks; [11], on network coding for arbitrarily correlated sources; and [10], which considered random linear network coding for online network operation in dynamically varying environments.

A. Overview

A brief overview of related work is given in Section I-B. In Section II, we describe the network model and algebraic coding approach we use in our analyses, and introduce some notation and existing results. Section III gives some insights arising from consideration of bipartite matching and network flows. Success/error probability bounds for random linear network coding are given for independent and linearly correlated sources in Section IV and for arbitrarily correlated sources in Section V. We also give examples of practical scenarios in which randomized network coding can be advantageous compared to routing, in Section VI. We present our conclusions and some directions for further work in Section VII. Proofs and ancillary results are given in the Appendix.

B. Related Work

Ahlswede et al. [1] showed that with network coding, as symbol size approaches infinity, a source can multicast information at a rate approaching the smallest minimum cut between the source and any receiver. Li et al. [19] showed that linear coding with finite symbol size is sufficient for multicast. Koetter and Médard [17] presented an algebraic framework for network coding that extended previous results to arbitrary networks and robust networking, and proved the achievability with time-invariant solutions of the min-cut max-flow bound for networks with delay and cycles. Reference [17] also gave an algebraic characterization of the feasibility of a multicast problem and the validity of a network coding solution in terms of transfer matrices, for which we gave in [8] equivalent formulations obtained by considering bipartite matching and network flows. We used these formulations in obtaining a tighter upper bound on the required field size than the previous bound of [17], and in our analysis of distributed randomized network coding, introduced in [9]. Concurrent independent work by Sanders et al. [26] and Jaggi et al. [14] considered single-source multicast on acyclic delay-free graphs, showing a similar bound on field size by different means, and giving centralized deterministic and randomized polynomial-time algorithms for finding network coding solutions over a subgraph consisting of flow solutions to each receiver. Subsequent work by Fragouli and Soljanin [7] gave a tighter bound for the case of two sources and for some configurations with more than two sources. Lower bounds on coding field size were presented by Rasala Lehman and Lehman [18] and Feder et al. [6]. Reference [6] also gave graph-specific upper bounds based on the number of "clashes" between flows from source to terminals.
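The coefficient-vector mechanism described in the Introduction can be sketched in a few lines. The sketch below is illustrative, not the paper's implementation: the prime field F_257, the two-source setup, and all function names are assumptions made for this example.

```python
import random

P = 257  # prime field F_257; an illustrative choice, not from the paper

def combine(inputs, rng):
    """One node output: a random F_P-linear combination of the node's
    inputs.  Each input is a (payload, coeff_vec) pair; the SAME random
    weights are applied to payloads and coefficient vectors, which is
    how the global coefficient vectors stay consistent end to end."""
    w = [rng.randrange(P) for _ in inputs]
    n = len(inputs[0][0])
    k = len(inputs[0][1])
    payload = [sum(wi * p[j] for wi, (p, _) in zip(w, inputs)) % P
               for j in range(n)]
    coeffs = [sum(wi * c[j] for wi, (_, c) in zip(w, inputs)) % P
              for j in range(k)]
    return payload, coeffs

def decode2(rows):
    """Receiver side for two sources: invert the 2x2 coefficient matrix
    over F_P (assumes the received coefficient vectors are independent).
    Each row is (a, b, payload) representing a*x1 + b*x2 = payload."""
    (a, b, u), (c, d, v) = rows
    det = (a * d - b * c) % P
    inv = pow(det, P - 2, P)  # inverse via Fermat's little theorem
    x1 = [(d * ui - b * vi) * inv % P for ui, vi in zip(u, v)]
    x2 = [(a * vi - c * ui) * inv % P for ui, vi in zip(u, v)]
    return x1, x2
```

With the two source packets tagged with unit coefficient vectors, any two received combinations whose coefficient vectors are linearly independent (the overwhelmingly likely case for large P) suffice to decode, regardless of the path the combinations took.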

Dougherty et al. [5] presented results on linear solutions for binary solvable multicast networks, and on nonfinite field alphabets. The need for vector coding solutions in some nonmulticast problems was considered by Rasala Lehman and Lehman [18], Médard et al. [22], and Riis [25]. Various practical protocols for and experimental demonstrations of random linear network coding [3] and nonrandomized network coding [29], [24] have also been presented.

II. MODEL AND PRELIMINARIES

A. Basic Model

Our basic network coding model is based on [1], [17]. A network is represented as a directed graph G = (N, E), where N is the set of network nodes and E is the set of links, such that information can be sent noiselessly from node i to node j for all (i, j) in E. Each link l in E is associated with a nonnegative real number c_l representing its transmission capacity in bits per unit time.

Nodes i and j are called the origin and destination, respectively, of link (i, j). The origin and destination of a link l in E are denoted o(l) and d(l), respectively. We assume o(l) ≠ d(l). The information transmitted on a link l is obtained as a coding function of information previously received at o(l).

There are r discrete memoryless information source processes X_1, ..., X_r which are random binary sequences. We denote by R_SW the Slepian–Wolf region of the sources

    R_SW = { (R_1, ..., R_r) : Σ_{i ∈ S} R_i > H(X_S | X_{S^c}) for all S ⊆ {1, ..., r} }

where X_S = (X_i : i ∈ S). Each source process is generated at a source node and multicast to a set of receiver nodes. In this paper, we consider the (multisource) multicast case, in which every source process is demanded by the same set B of receiver nodes; the nodes at which source processes are generated are called source nodes, and the d nodes in B are called receiver nodes, or receivers. The assignment of sources to nodes, the set B, and the Slepian–Wolf region specify a set of multicast connection requirements. The connection requirements are satisfied if each receiver is able to reproduce, from its received information, the complete source information. A graph G = (N, E), a set of link capacities, and a set of multicast connection requirements specify a multicast connection problem.

We make a number of simplifying assumptions. Our analysis for the case of independent source processes assumes that each source process has an entropy rate of one bit per unit time; sources of larger rate are modeled as multiple sources at the same node. For the case of linearly correlated sources, we assume that the sources can be modeled as given linear combinations of underlying independent source processes, each with an entropy rate of one bit per unit time, as described further in Section II-B. For the case of arbitrarily correlated sources, we consider sources with integer bit rates and arbitrary joint probability distributions.

For the case of independent or linearly correlated sources, each link is assumed to have a capacity of one bit per unit time; links with larger capacities are modeled as parallel links. For the case of arbitrarily correlated sources, the link rates are assumed to be integers.

Reference [1] shows that coding enables the multicast information rate from a single source to attain the minimum of the individual receivers' max-flow bounds,¹ and shows how to convert multicast problems with multiple independent sources to single-source problems. Reference [19] shows that linear coding is sufficient to achieve the same individual max-flow rates; in fact, it suffices to do network coding using only scalar algebraic operations in a finite field F_q, for some sufficiently large q, on length-u vectors of bits that are viewed as elements of F_q, q = 2^u [17]. The case of linearly correlated sources is similar. For arbitrarily correlated sources, we consider operations in F_2 on vectors of bits. This vector coding model can, for given vector lengths, be brought into the scalar algebraic framework of [17] by conceptually expanding each source into multiple sources and each link into multiple links, such that each new source and link corresponds to one bit of the corresponding information vectors. We describe this scalar framework in Section II-B, and use it in our analysis of arbitrarily correlated sources in Section V. Note, however, that the linear decoding strategies of [17] do not apply for the case of arbitrarily correlated sources.

We consider both the case of acyclic networks where link delays are not considered, as well as the case of general networks with cycles and link delays. The former case, which we call delay-free, includes networks whose links are assumed to have zero delay, as well as networks with link delays that are operated in a burst [19], pipelined [26], or batched [3] fashion, where information is buffered or delayed at intermediate nodes so as to be combined with other incoming information from the same batch. A cyclic graph may also be converted to an expanded acyclic graph, communication on which can be emulated over a corresponding number of time steps on the original cyclic graph [1]. For the latter case, we consider general networks without buffering, and make the simplifying assumption that each link has the same delay.

We use some additional definitions in this paper. Link l is an incident outgoing link of node v if o(l) = v, and an incident incoming link of v if d(l) = v. We call an incident incoming link of a receiver node a terminal link, and denote by T_β the set of terminal links of a receiver β. A path is a subgraph of the network consisting of a sequence of links l_1, ..., l_k such that d(l_i) = o(l_{i+1}) for i = 1, ..., k − 1, and is denoted (l_1, ..., l_k). A flow solution for a receiver β is a set of links forming r link-disjoint paths, each connecting a different source to β.

¹That is, the maximum commodity flow from the source to individual receivers.
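The individual receivers' max-flow bounds just mentioned are straightforward to compute. Below is a minimal sketch, not taken from the paper (the butterfly-style topology and all names are illustrative), that computes a receiver's max flow under unit link capacities by BFS augmenting paths:

```python
from collections import deque, defaultdict

def max_flow(edges, s, t):
    """Max s-t flow with unit link capacities, by repeated BFS
    augmentation (Edmonds-Karp).  The minimum of the receivers'
    max-flow values is exactly the multicast rate that coding
    achieves in the single-source case [1]."""
    cap = defaultdict(int)
    adj = defaultdict(set)
    for u, v in edges:
        cap[(u, v)] += 1          # parallel links accumulate capacity
        adj[u].add(v)
        adj[v].add(u)             # make reverse residual edges reachable
    flow = 0
    while True:
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:   # BFS for a residual path
            u = q.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow
        v = t                          # augment one unit along the path
        while parent[v] is not None:
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1
```

On the classic butterfly network, both receivers have max flow 2, so rate 2 is feasible with coding even though routing alone cannot deliver both processes to both receivers simultaneously.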

B. Algebraic Network Coding

In the scalar algebraic coding framework of [17], the source information processes X_i, the receiver output processes Z(β, i), and the information processes Y(l) transmitted on each link l, are sequences of length-u blocks or vectors of bits, which are treated as elements of a finite field F_q, q = 2^u. The information process Y(j) transmitted on a link j is formed as a linear combination, in F_q, of link j's inputs, i.e., source processes X_i generated at o(j) and random processes Y(l) for which d(l) = o(j), if any. For the delay-free case, this is represented by the equation

    Y(j) = Σ_{i : X_i generated at o(j)} a_{i,j} X_i + Σ_{l : d(l) = o(j)} f_{l,j} Y(l).

The ith output process Z(β, i) at receiver node β is a linear combination of the information processes on its terminal links, represented as

    Z(β, i) = Σ_{l : d(l) = β} b_{β,i,l} Y(l).

For multicast on a network with link delays, memory is needed at the receiver (or source) nodes, but memoryless operation suffices at all other nodes [17]. We consider unit delay links, modeling links with longer delay as links in series. The corresponding linear coding equations are

    Y_{t+1}(j) = Σ_{i : X_i generated at o(j)} a_{i,j} X_i(t) + Σ_{l : d(l) = o(j)} f_{l,j} Y_t(l)                (1)

    Z_{t+1}(β, i) = Σ_{u=0}^{μ} Σ_{l : d(l) = β} b_{β,i,l}^{(u)} Y_{t−u}(l)

where X_i(t), Y_t(l), and Z_t(β, i) are the values of the corresponding variables at time t, and μ represents the memory required. These equations, as with the random processes in the network, can be represented algebraically in terms of a delay variable D.

The coefficients can be collected into an r × |E| matrix A = (a_{i,j}), an |E| × |E| matrix F = (f_{l,j}), and, for each receiver β, a matrix B_β = (b_{β,i,l}), whose structure is constrained by the network; in the general case with delays the entries become polynomials in D. A pair (A, F) or tuple (A, F, B_{β_1}, ..., B_{β_d}) can be called a linear network code.

We also consider a class of linearly correlated sources modeled as given linear combinations of underlying independent processes, each with an entropy and bit rate of one bit per unit time. To simplify the notation in our subsequent development, we work with these underlying independent processes in a similar manner as for the case of independent sources: the jth column of the matrix A is a linear function of given column vectors, each of which specifies the mapping from the underlying independent processes to one of the source processes generated at o(j).² A receiver that decodes these underlying independent processes is able to reconstruct the linearly correlated source processes.

For acyclic graphs, we assume an ancestral indexing of links in E, i.e., if d(l) = o(j) for any links l, j, then l has a lower index than j. Such an indexing always exists for acyclic networks. It then follows that matrix F is upper triangular with zeros on the diagonal.

Let G = (I − F)^{-1}.³ The mapping from source processes X_1, ..., X_r to output processes Z(β, 1), ..., Z(β, r) at a receiver β is given by the transfer matrix A(I − F)^{-1}B_β^T [17]; in the case with delays the corresponding matrix is rational in D. For a given multicast connection problem, if some network code (A, F, B_{β_1}, ..., B_{β_d}) in a field F_q (or F_q(D)) satisfies the condition that A(I − F)^{-1}B_β^T has full rank r for each receiver β, then every receiver can recover the source processes, and the code is a solution to the multicast connection problem in the same field. A multicast connection problem for which there exists a solution in some field F_q or F_q(D) is called feasible, and the corresponding connection requirements are said to be feasible for the network.

²We can also consider the case where the source symbols are binary by restricting network coding to occur in F_2, q = 2.

³For the acyclic delay-free case, the series (I − F)^{-1} = I + F + F² + ⋯ converges since F is nilpotent for an acyclic network. For the case with delays, (I − DF)^{-1} exists since the determinant of I − DF is nonzero in its field of definition, as seen by letting D = 0 [17].
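As a concrete instance of the transfer matrix formalism, the sketch below builds A, F, and the terminal-link selection for a small butterfly network over F_2 and checks that A(I − F)^{-1}B_β^T has full rank for both receivers. The topology, link indexing, and the fixed all-ones code are illustrative choices for this example, not taken from the paper.

```python
P = 2  # work over F_2

def mmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) % P
             for j in range(len(Y[0]))] for i in range(len(X))]

def eye(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def inv_series(F):
    """(I - F)^{-1} = I + F + F^2 + ...; the series terminates because F
    is nilpotent under an ancestral link indexing of an acyclic network."""
    n = len(F)
    acc, term = eye(n), eye(n)
    for _ in range(n):
        term = mmul(term, F)
        acc = [[(a + t) % P for a, t in zip(ra, rt)]
               for ra, rt in zip(acc, term)]
    return acc

# Butterfly links: 0:s->a 1:s->b 2:a->c 3:a->t1 4:b->c 5:b->t2 6:c->d 7:d->t1 8:d->t2
nE = 9
A = [[0] * nE for _ in range(2)]
A[0][0] = 1  # source process X_1 enters on link 0
A[1][1] = 1  # source process X_2 enters on link 1
F = [[0] * nE for _ in range(nE)]
for l, j in [(0, 2), (0, 3), (1, 4), (1, 5), (2, 6), (4, 6), (6, 7), (6, 8)]:
    F[l][j] = 1  # link l feeds link j; all coding coefficients fixed to 1
G = mmul(A, inv_series(F))  # row i = mapping of X_i onto every link

def transfer(terminals):
    # M_beta = A (I - F)^{-1} B_beta^T, with B_beta selecting terminal links
    return [[G[i][l] for l in terminals] for i in range(2)]

def full_rank2(M):
    return (M[0][0] * M[1][1] - M[0][1] * M[1][0]) % P != 0
```

Link 6 (c to d) carries the XOR of the two processes, and each receiver's 2x2 transfer matrix is nonsingular, so both receivers decode both sources at rate 2.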

In subsequent sections, where we consider choosing the code (A, F, B_{β_1}, ..., B_{β_d}) by distributed random coding, the following definitions are useful: if for a receiver β there exists some value of B_β such that A(I − F)^{-1}B_β^T has full rank r, then (A, F) is a valid network code for β; a network code is valid for a multicast connection problem if it is valid for all receivers.

The jth column of the matrix A(I − F)^{-1} specifies the mapping from the source processes to the random process on link j. We denote by (A(I − F)^{-1})_T the submatrix consisting of the columns of A(I − F)^{-1} corresponding to a set of links T. For a receiver to decode, it needs to know the mapping from the source processes to the random processes on its terminal links. The entries of this mapping are scalar elements of F_q in the acyclic delay-free case, and polynomials in the delay variable D in the case with link delays. In the latter case, the number of terms of these polynomials and the memory required at the receivers depend on the number of links involved in cycles, which act like memory registers, in the network.

We use the notational convention that matrices are named with bold uppercase letters and vectors are named with bold lowercase letters.

III. INSIGHTS FROM BIPARTITE MATCHING AND NETWORK FLOWS

As described in the previous section, for a multicast connection problem with independent or linearly correlated sources, the transfer matrix condition of [17] for the problem to be feasible (or for a particular linear network code defined by matrices (A, F, B_β) to be valid for the connection problem) is that for each receiver β, the transfer matrix A(I − F)^{-1}B_β^T has nonzero determinant. The following result shows the equivalence of this transfer matrix condition and the Edmonds matrix formulation for checking whether a bipartite graph has a perfect matching (e.g., [23]). The problem of determining whether a bipartite graph has a perfect matching is a classical reduction of the problem of checking the feasibility of an s–t flow [15].⁴ This latter problem can be viewed as a degenerate case of network coding, restricted to the binary field and without any coding; it is interesting to find that the two formulations are equivalent for the more general case of linear network coding in higher order fields.

Lemma 1:
(a) For an acyclic delay-free network, the determinant of the transfer matrix M_β = A(I − F)^{-1}B_β^T for receiver β is equal, up to sign, to the determinant of the corresponding Edmonds matrix

    [ A        0      ]
    [ I − F    B_β^T  ].

(b) For an arbitrary network with unit delay links, the transfer matrix M_β for receiver β is nonsingular if and only if the corresponding Edmonds matrix

    [ A        0      ]
    [ I − DF   B_β^T  ]

is nonsingular.

Proof: See Appendix A.

The usefulness of this result is in making apparent various characteristics of the transfer matrix determinant polynomial that are obscured in the original transfer matrix by the matrix products and inverse. For instance, the maximum exponent of a variable, the total degree of the polynomial, and its form for linearly correlated sources are easily deduced, leading to Theorems 1 and 2.

For the acyclic delay-free case, Lemma 2 below is another alternative formulation of the same transfer matrix condition, which illuminates similar properties of the transfer matrix determinant as Lemma 1. Furthermore, by considering network coding as a superposition of flow solutions, Lemma 2 allows us to tighten, in Theorem 3, the bound of Theorem 2 for random network coding on given acyclic networks in terms of the number of links in a flow solution for an individual receiver.

Lemma 2: A multicast connection problem with r sources is feasible (or a particular network code is valid for the problem) if and only if each receiver β has a set T of r terminal links for which

    det (A(I − F)^{-1})_T = Σ_{flow solutions} ± Π (path gains) ≠ 0

where (A(I − F)^{-1})_T is the submatrix of A(I − F)^{-1} consisting of the columns corresponding to the links in T, and the gain of a path (l_1, ..., l_k) originating at source i is the product a_{i,l_1} f_{l_1,l_2} ⋯ f_{l_{k−1},l_k} of coefficients along it. The sum is over all flow solutions from the sources to links in T, each such solution being a set of r link-disjoint paths, each connecting a different source to a different link in T.

Proof: See Appendix A.

⁴The problem of checking the feasibility of an s–t flow of size r on graph G = (V, E) can be reduced to a bipartite matching problem by constructing the following bipartite graph: one set of the bipartite graph has r nodes u_1, ..., u_r, and a node v_{l,1} corresponding to each link l ∈ E; the other set of the bipartite graph has r nodes w_1, ..., w_r, and a node v_{l,2} corresponding to each link l ∈ E. The bipartite graph has links joining each node u_i to each node v_{l,1} such that o(l) = s, a link joining node v_{l,1} to the corresponding node v_{l,2} for all l ∈ E, links joining node v_{l,2} to v_{j,1} for each pair (l, j) ∈ E × E such that d(l) = o(j), and links joining each node w_i to each node v_{l,2} such that d(l) = t. The s–t flow is feasible if and only if the bipartite graph has a perfect matching.

Lemma 1 leads to the following upper bound on required field size for a feasible multicast problem, which tightens the upper bound given in [17], a bound that grows with the number of processes being transmitted in the network.

Theorem 1: For a feasible multicast connection problem with independent or linearly correlated sources and d receivers, in both the acyclic delay-free case and the general case with delays, there exists a solution in finite field F_q if q > d.

Proof: See Appendix A.

Work done in [14], [26] independently of and concurrently with the initial conference publication of this result showed, by different means, the sufficiency of q ≥ d for the acyclic delay-free case. Subsequent work in [7] gave a tighter bound for the case of two sources and for some configurations with more than two sources that satisfy some regularity conditions.

IV. RANDOM LINEAR NETWORK CODING FOR INDEPENDENT OR LINEARLY CORRELATED SOURCES

In this section, we consider random linear network codes in which some or all of the network code coefficients a_{i,j}, f_{l,j} (and, for linearly correlated sources, b_{β,i,l}) are chosen independently and uniformly over F_q, where q is greater than the number of receivers d.

The next two results cover the case where some coefficients are fixed instead of being randomly chosen, as long as there exists a solution to the network connection problem with the same values for these fixed coefficients. For instance, if a node receives linearly dependent processes on two links l_1 and l_2, it can fix f_{l_2,j} = 0 for all outgoing links j. Nodes that cannot determine the appropriate code coefficients from local information choose the coefficients independently and uniformly from F_q.

Theorem 2: Consider a multicast connection problem on an arbitrary network with independent or linearly correlated sources, and a network code in which some or all network code coefficients a_{i,j}, f_{l,j}, b_{β,i,l} are chosen uniformly at random from a finite field F_q where q > d, and the remaining code coefficients, if any, are fixed. If there exists a solution to the network connection problem with the same values for the fixed code coefficients, then the probability that the random network code is valid for the problem is at least (1 − d/q)^η, where η is the number of links with associated random coefficients.

Proof: See Appendix B.

The code length u = log₂ q is the logarithm of the field size q = 2^u. It affects computational complexity and delay, since algebraic operations are performed on codewords of length u. Note that the bound, derived using Lemma 1, is tighter than the bound of 1 − dη/q obtained by direct application of the Schwartz–Zippel theorem (e.g., [23]), which only considers the total degree of the polynomial. The corresponding upper bound on the error probability is on the order of the inverse of the field size, so the error probability decreases exponentially with the number of codeword bits u.

The bound of Theorem 2 is very general, applying across all networks with the same number of receivers and the same number of links with associated random code coefficients, without considering specific network structure. However, it is intuitive that having more redundant capacity in the network, for instance, should increase the probability that a random linear code will be valid. Tighter bounds can be obtained by taking into account more specific network structure. Three such bounds, for the acyclic delay-free case, are given below. We have not proven or disproven whether they extend to networks with cycles.

The first tightens the bound of Theorem 2 for the acyclic delay-free case, by using in its derivation Lemma 2 in place of Lemma 1. It is used in Section VI to derive a bound on the probability of obtaining a valid random network code on a grid network.

Theorem 3: Consider a multicast connection problem on an acyclic network with independent or linearly correlated sources, and a network code in which some or all network code coefficients are chosen uniformly at random from a finite field F_q where q > d, and the remaining code coefficients, if any, are fixed. If there exists a solution to the network connection problem with the same values for the fixed code coefficients, then the probability that the random network code is valid for the problem is at least (1 − d/q)^{η'}, where η' is the maximum number of links with associated random coefficients in any set of links constituting a flow solution for any receiver.

Proof: See Appendix B.

The next bound is useful in cases where analysis of connection feasibility is easier than direct analysis of random linear coding.

Theorem 4: Consider a multicast connection problem with independent or linearly correlated sources on an acyclic graph G. The probability that a random linear network code in F_q is valid for the problem on G is greater than or equal to the probability that the same connection requirements are feasible on a modified graph formed by deleting each link of G with probability d/q.

Proof: See Appendix B.

The above theorem is used in obtaining the following result showing how spare network capacity and/or more reliable links allow us to use a smaller field size to surpass a particular success probability.

Theorem 5: Consider a multicast connection problem on an acyclic network with independent or linearly correlated sources of joint entropy rate r, and links which fail (are deleted from the network) with probability p. Let y be the minimum redundancy, i.e., the original connection requirements are feasible on a network obtained by deleting any y links in E. Then the probability that a random linear network code in F_q is valid for a particular receiver is lower-bounded by an explicit expression in p, q, r, y, and L, where L is the length of the longest source–receiver path in the network.

Proof: See Appendix B.
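The (1 − d/q)^η bound of Theorem 2 can be sanity-checked by simulation. The sketch below runs fully random coding on a butterfly network over a small prime field (an illustrative stand-in for F_{2^u}; the topology, field size, and all names are assumptions of this example) and compares the empirical success rate with the bound:

```python
import random

q, d = 17, 2   # illustrative prime field size and number of receivers
ETA = 9        # every butterfly link carries random coefficients here

def one_trial(rng):
    def comb(vs):
        # random F_q-linear combination of 2-dim global coefficient vectors
        w = [rng.randrange(q) for _ in vs]
        return [sum(wi * v[i] for wi, v in zip(w, vs)) % q for i in range(2)]
    e = [[1, 0], [0, 1]]              # the two source processes
    v0, v1 = comb(e), comb(e)         # s->a, s->b
    v2, v3 = comb([v0]), comb([v0])   # a->c, a->t1
    v4, v5 = comb([v1]), comb([v1])   # b->c, b->t2
    v6 = comb([v2, v4])               # c->d (the coding node)
    v7, v8 = comb([v6]), comb([v6])   # d->t1, d->t2
    det = lambda u, v: (u[0] * v[1] - u[1] * v[0]) % q
    # success = both receivers see invertible 2x2 coefficient matrices
    return det(v3, v7) != 0 and det(v5, v8) != 0

rng = random.Random(0)
trials = 3000
rate = sum(one_trial(rng) for _ in range(trials)) / trials
bound = (1 - d / q) ** ETA
```

For these parameters the bound is about 0.32, while the empirical rate on this particular network is substantially higher, as expected: the bound is uniform over all networks with the same d and η, so it is loose on any specific topology.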

Theorem 6: The error probability of the random linear network code is at most the sum of the probabilities of three types of decoding errors, each bounded by an exponentially decaying term whose error exponents generalize those obtained in [4] for linear Slepian–Wolf coding.

Fig. 2. An example illustrating vector linear coding. x1 and x2 are vectors of source bits being multicast to the receivers, and the matrices ξi are matrices of random bits. If the capacity of each link is c, matrices ξ1 and ξ2 are nr1 × nc, ξ3 and ξ4 are nr2 × nc, and ξ5 and ξ6 are nc × nc. The label on each link represents the process being transmitted on the link.

V. RANDOM LINEAR NETWORK CODING FOR ARBITRARILY CORRELATED SOURCES

So far we have been considering independent or linearly correlated sources. We next consider transmission of arbitrarily correlated sources, using random linear network coding, over networks where compression may be required.

Analogously to Slepian and Wolf [28], we consider the problem of distributed encoding and joint decoding of two sources whose output values in each unit time period are drawn independent and identically distributed (i.i.d.) from the same joint distribution Q. The difference is that in our problem, transmission occurs across an arbitrary network of intermediate nodes that can perform network coding. In the special case of a network consisting of one direct link from each source to a common receiver, this reduces to the original Slepian–Wolf problem.

We consider a vector linear network code that operates on blocks of bits. Linear coding is done in F_2 over blocks consisting of n r_i bits from each source, where r_i is the bit rate of source i. Each node transmits, on each of its incident outgoing links, n c bits for each block, formed as random linear combinations of corresponding source bits originating at that node and bits transmitted on incident incoming links, if any, as illustrated in Fig. 2. A decoder (which may be a minimum entropy or maximum Q-probability decoder) [4] at a receiver maps a block of received bits to a block of decoded values that has minimum entropy or maximum Q-probability among all possible source values consistent with the received block.

We bound the probability of decoding error at a receiver, i.e., the probability that a block of source values differs from the decoded values. Specifically, we consider the case of two sources whose output values in each unit time period are drawn i.i.d. from the same joint distribution Q. Let m1 and m2 be the minimum cut capacities between the receiver and each of the sources, respectively, and let m3 be the minimum cut capacity between the receiver and both sources. We denote by L the maximum source–receiver path length. Our approach follows that in [4], whose results we extend. As there, the type of a vector is the distribution defined by the relative frequencies of its elements, and joint types are analogously defined.

In the statement of Theorem 6, X and Y are dummy random variables with joint distribution Q; the proof is given in Appendix B. The error exponents for general networks reduce to those obtained in [4] for the Slepian–Wolf network, in which each source is connected to the receiver by a single direct link, so that the minimum cut capacities equal the corresponding link rates and L = 1.

TABLE I SUCCESS PROBABILITIES OF RANDOMIZED FLOODING SCHEME RF AND RANDOM LINEAR CODING SCHEME RC. THE TABLE GIVES BOUNDS AS WELL AS SOME ACTUAL PROBABILITY VALUES WHERE EXACT CALCULATIONS ARE TRACTABLE

VI. BENEFITS OF RANDOMIZED CODING OVER ROUTING

Network coding, as a superset of routing, has been shown to offer significant capacity gains for networks with special structure [26]. For many other networks, network coding does not give higher capacity than centralized optimal routing, but it can offer other advantages when centralized optimal routing is difficult. In this section, we consider two types of network scenarios in which distributed random linear coding can be particularly useful.

A. Distributed Settings

In networks with large numbers of nodes and/or changing topologies, it may be expensive or infeasible to reliably maintain routing state at network nodes. Distributed randomized routing schemes have been proposed [2], [27] which address this kind of issue. However, not allowing different signals to be combined can impose intrinsic penalties in efficiency compared to using network coding.

Consider as a simple example the problem of sending two processes from a source node to receiver nodes in unknown locations on a rectangular grid network, shown in Fig. 3. For simplicity, we analyze the acyclic delay-free case, which may correspond to synchronized, burst, or pipelined operation where each transmission at a node occurs upon reception of transmissions on all incident incoming links of that node.

Fig. 3. Rectangular grid network with two processes X1 and X2 originating at a source node. The links are all directed outward from the source node. The labels on the links show the source transmissions in the random flooding scheme RF, where one process is sent in both directions on one axis and the other process in both directions along the other axis.

Suppose we wish to use a distributed transmission scheme that does not involve any coordination among nodes or routing state. The network aims to maximize the probability that any node will receive two distinct processes, by flooding in a way that preserves message diversity, for instance using the following random flooding scheme RF.
• The source node sends one process in both directions on one axis and the other process in both directions along the other axis, as illustrated in Fig. 3.
• A node receiving information on one link sends the same information on its three outgoing links (these are nodes along the grid axes passing through the source node).
• A node receiving information on two links sends one of the incoming processes on one of its two outgoing links with equal probability, and the other process on the remaining link.

For comparison, we consider the same rectangular grid problem with the following simple random coding scheme RC (cf. Fig. 3).
• The source node sends one process in both directions on one axis and the other process in both directions along the other axis.
• A node receiving information on one link sends the same information on its three outgoing links.
• A node receiving information on two links sends a random linear combination of the source processes on each of its two outgoing links.^5

Proposition 1: For the randomized flooding scheme RF, the probability that a receiver located at grid position (x, y) relative to the source receives both source processes is at most the bound given in (13).
Proof: See Appendix C.

Proposition 2: For the random coding scheme RC, the probability that a receiver located at grid position (x, y) relative to the source can decode both source processes is at least the corresponding lower bound obtained via Theorem 3.
Proof: See Appendix C.

Table I gives, for various values of x and y, the values of the success probability bounds, as well as some actual probabilities for the random flooding scheme RF when x and y are small. Note that an increase in grid size requires an increase of only two in codeword length for the random coding scheme RC to obtain success probability lower bounds close to 1, which are substantially better than the upper bounds for RF.

^5 This simple scheme, unlike the randomized flooding scheme RF, leaves out the optimization that each node receiving two linearly independent processes should always send out two linearly independent processes.
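To make the RF/RC comparison concrete, the following sketch (ours, not from the paper) estimates both success probabilities for a receiver at grid position (2, 2) in the positive quadrant. For RF at (2, 2) the exact success probability can be worked out by hand: with probability 1/2, node (1,1) sends each process back toward its own axis and the receiver succeeds for sure; otherwise the receiver's two inputs are independent uniform picks and succeed with probability 1/2, giving 3/4 overall. The field size q and trial counts are illustrative choices.

```python
import random

random.seed(2)
q = 257  # field size for the coding scheme RC (our choice)

def rf_trial():
    # RF at receiver (2,2): axis nodes forward their single input; node (1,1)
    # gets X1 (from below) and X2 (from the left) and routes one input per
    # outgoing link at random; (2,1) and (1,2) behave the same way.
    right_11, up_11 = random.choice([(1, 2), (2, 1)])  # outputs of node (1,1)
    in_21 = {1, right_11}            # inputs of (2,1): X1 from (2,0) plus (1,1)'s right output
    in_12 = {2, up_11}               # inputs of (1,2): X2 from (0,2) plus (1,1)'s up output
    up_21 = random.choice(sorted(in_21))        # process (2,1) sends up
    right_12 = random.choice(sorted(in_12))     # process (1,2) sends right
    return up_21 != right_12         # receiver (2,2) needs both distinct processes

def combo(vectors):
    # Random linear combination of coefficient vectors over F_q.
    out = [0, 0]
    for v in vectors:
        c = random.randrange(q)
        out = [(o + c * x) % q for o, x in zip(out, v)]
    return out

def rc_trial():
    # RC: interior nodes send independent random combinations of their inputs.
    e1, e2 = [1, 0], [0, 1]
    right_11 = combo([e1, e2]); up_11 = combo([e1, e2])  # outputs of (1,1)
    up_21 = combo([e1, right_11])       # (2,1): inputs e1 and (1,1)'s right output
    right_12 = combo([e2, up_11])       # (1,2): inputs e2 and (1,1)'s up output
    # Receiver (2,2) decodes iff its two incoming vectors are independent.
    return (up_21[0] * right_12[1] - up_21[1] * right_12[0]) % q != 0

n = 30000
p_rf = sum(rf_trial() for _ in range(n)) / n
p_rc = sum(rc_trial() for _ in range(n)) / n
print(f"RF success {p_rf:.3f} (exact 0.75), RC success {p_rc:.3f}")
```

Even at this small receiver position the coding scheme's failure probability is on the order of 1/q, while RF's loss of diversity at node (1,1) is irreducible.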

TABLE II
A SAMPLE OF RESULTS ON GRAPHS GENERATED WITH THE FOLLOWING PARAMETERS: NUMBER OF NODES n, NUMBER OF SOURCES r, NUMBER OF RECEIVERS d, TRANSMISSION RANGE ρ, MAXIMUM IN-DEGREE AND OUT-DEGREE i. THE REMAINING COLUMNS GIVE THE RATES OF BLOCKED CONNECTIONS FOR ROUTING AND FOR CODING, RESPECTIVELY, AND THE CORRESPONDING THROUGHPUTS

B. Dynamically Varying Connections

Another scenario in which random linear network coding can be advantageous is multisource multicast with dynamically varying connections. We compare distributed randomized coding to an approximate online Steiner tree routing approach from [16] in which, for each transmitter, a tree is selected in a centralized fashion. Since the complexity of setting up each connection is a significant consideration in the dynamic scenario we consider, we use one tree per connection; more complicated online routing approaches using multiple Steiner trees may be able to achieve a smaller performance gap compared to network coding, but this is not within the scope of our paper.

Since sophisticated routing algorithms are difficult to analyze, we used a simulation-based approach. We ran trials on randomly generated graphs with the following parameters: number of nodes n, number of sources r, number of receivers d, transmission range ρ, and maximum in-degree and out-degree i. For each trial, nodes were scattered uniformly over a unit square. To create an acyclic graph, we ordered the nodes by their x-coordinate and chose the direction of each link to be from the lower numbered to the higher numbered node. Any pair of nodes within Euclidean distance ρ of each other was connected by a link, up to the maximum in-degree and out-degree of the nodes involved. The receiver nodes were chosen as the highest numbered nodes, and source nodes were chosen randomly (with replacement) from among the lower numbered half of the nodes. The parameter values for the tests were chosen such that the resulting random graphs would in general be connected and able to support some of the desired connections, while being small enough for the simulations to run efficiently.

Each trial consisted of a number of time slots. In each time slot, a source was either on, i.e., transmitting source information, or off, i.e., not transmitting source information. For the approximate Steiner tree routing algorithm, each source that was on was associated with a Steiner tree, link-disjoint from the others, connecting it to all the receivers. At the beginning of each time slot, any source that was on stayed on with a fixed probability or else turned off, and any source that was off stayed off with a fixed probability or else underwent, in turn, the following procedure.
• For the approximate Steiner tree routing algorithm, the algorithm was applied to search for a Steiner tree, link-disjoint with the Steiner trees of other sources that were currently on, connecting that source to all the receivers. If such a Steiner tree was found, the source turned on, using that Steiner tree to transmit its information to all receivers; if not, the source was blocked, i.e., stayed off.
• For network coding, up to three random linear network codes were chosen. If one of them was valid for transmitting information to all receivers from that source as well as the other sources that were currently on, the source turned on; otherwise, the source was blocked.

We used as performance metrics the frequency of blocked requests and the average throughput, which were calculated over windows of 250 time slots until these measurements reached steady state, i.e., until measurements in three consecutive windows were within a small factor of each other, so as to avoid transient initial startup behavior. Some results for various randomly generated networks are given in Table II.
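The random geometric graph construction used in these trials can be sketched as follows. The parameter values below are hypothetical placeholders (the actual test values accompany Table II), and the code is our reading of the text: nodes sorted by coordinate, with links directed from lower to higher index subject to the degree caps.

```python
import math
import random

random.seed(3)

# Hypothetical parameters: n nodes, r sources, d receivers,
# transmission range rho, maximum in-degree and out-degree imax.
n, r, d, rho, imax = 12, 2, 2, 0.5, 4

def make_graph():
    # Scatter nodes in the unit square; sorting by x-coordinate and directing
    # every link from lower- to higher-numbered node guarantees acyclicity.
    pts = sorted((random.random(), random.random()) for _ in range(n))
    in_deg, out_deg = [0] * n, [0] * n
    links = []
    for i in range(n):
        for j in range(i + 1, n):
            if (math.dist(pts[i], pts[j]) <= rho
                    and out_deg[i] < imax and in_deg[j] < imax):
                links.append((i, j))
                out_deg[i] += 1
                in_deg[j] += 1
    receivers = list(range(n - d, n))                       # highest-numbered nodes
    sources = [random.randrange(n // 2) for _ in range(r)]  # lower half, with replacement
    return links, sources, receivers

links, sources, receivers = make_graph()
print(len(links), sources, receivers)
```

Because every link points from a lower to a higher index, the graph is acyclic by construction, matching the ancestral link numbering assumed in the proofs.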
These simulations do not attempt to quantify precisely the differences in performance and overhead of random linear coding and online routing, but they are useful as a preliminary indication. With regard to throughput and blocking probability, the simulations show that random linear network coding outperforms the Steiner tree heuristic on a non-negligible set of randomly constructed graphs, indicating that when connections vary dynamically, coding can offer advantages that are not circumscribed to a few carefully chosen examples. With regard to overhead, the additional overhead of network coding comes from the linear coding operations at each node, the decoding operations at the receivers, and the coefficient vectors sent with each block or packet. Each of these types of overhead depends on the coding field size. Our theoretical bounds of the previous sections guarantee the optimality of random linear coding for large enough field sizes, but they are tight only for worst case network connection problems. The simulations illustrate the kinds of field sizes needed in practice for networks with a moderate number of nodes. To this end, we use a small field size that allows random linear coding to generally match the performance of the Steiner tree heuristic, and to surpass it in networks whose topology makes Steiner tree routing difficult. The simulations show the applicability of short network code lengths of 4-5 bits for networks of 8-12 nodes.

APPENDIX
PROOFS AND ANCILLARY RESULTS

A. Edmonds Matrix and Flow Formulations

Proof of Lemma 1:
(a) Note that

  [ I   -A(I - F)^{-1} ] [   A      0   ]   [   0     -A(I - F)^{-1}B^T ]
  [ 0          I       ] [ I - F   B^T ] = [ I - F           B^T       ]

The first matrix has determinant 1. So the determinant of the Edmonds matrix equals, up to sign, det(A(I - F)^{-1}B^T) det(I - F), and det(I - F) = 1 since F is upper-triangular with zeros along the main diagonal.

(b) As in part (a), the determinant of the corresponding Edmonds matrix equals, up to sign, the product of the transfer matrix determinant and det(I - F). Since det(I - F) is nonzero, the result follows.

VII. CONCLUSION

We have presented a distributed random linear network coding approach which asymptotically achieves capacity, as given by the max-flow min-cut bound of [1], in multisource multicast networks. We have given a general bound on the success probability of such codes for arbitrary networks, showing that error probability decreases exponentially with code length. Our analysis uses insights from network flows and bipartite matching, which also lead to a new bound on the required field size for centralized network coding. We have also given tighter bounds for acyclic networks which take into account more specific network structure, and which show how redundant network capacity and link reliability affect the probability of obtaining a valid random linear code.

Taking a source coding perspective, we have shown that distributed random linear network coding effectively compresses correlated sources within a network, providing error exponents that generalize corresponding results for linear Slepian–Wolf coding.

Finally, two examples of scenarios in which randomized network coding shows benefits over routing approaches have been presented. These examples suggest that the decentralized nature and robustness of random linear network coding can offer significant advantages in settings that hinder optimal centralized network control.

Further work includes extensions to nonuniform code distributions, possibly chosen adaptively or with some rudimentary coordination, to optimize different performance goals. Another question concerns the selective placement of random linear coding nodes. The randomized and distributed nature of the approach also leads us naturally to consider applications in network security. It would also be interesting to consider protocol issues for different communication scenarios, and to compare specific coding and routing protocols over a range of performance metrics.

Proof of Lemma 2: Recall that we assume an ancestral numbering for the links of an acyclic graph. For each receiver, the determinant of the transfer matrix is expanded over sets of link indices that are consistent with this ordering and can therefore form source–receiver paths. Let a_j and c_j denote column j of A and of A(I - F)^{-1}, respectively. It follows from the definitions of the transfer matrices that c_j can be computed recursively by (2) and (3).

Expanding the determinant of linearly in the column , such that is nonzero. Consider one of the vari- using (3), we obtain ables , , ,or , denoting it by , and consider as a polynomial in the other variables with coefficients that are polynomials in . Since these coefficients have maximum de- gree , they are not divisible by . Thus, can take some value in such that at least one of the coefficients is nonzero. Repeating this procedure for each of the other variables gives the desired result. Going from the acyclic delay-free case to the general case with delays, variables , , are replaced by , , in the Edmonds matrix, and variables be- come rational functions in given by (1) in Section II-B. Each variable appears in only one entry of We proceed recursively, expanding each determinant linearly in the Edmonds matrix, and each variable appears in only its column whose index is highest, using (3) for and one column of the Edmonds matrix in a linear expression that (2) for . At each expansion stage, the expression for forms the denominator of each nonzero entry of the column. is a linear combination of matrix determinants. Each nonzero Thus, can be expressed as a ratio of polynomials whose determinant corresponds to a matrix composed of columns numerator is linear in each variable , , , or such that . Proceeding similarly as for the acyclic delay-free case and . Its coefficient yields the result. in the linear combination is a product of terms such that , and is of the form where B. Analysis of Random Linear Network Coding for some and . By induction we have that these properties hold for all nonzero Lemma 3: Consider a random network code in determinant terms in the course of the expansion. The expan- which links have associated code coefficients ,( sion terminates when the expression is a linear combination of for the case of linearly correlated sources) and/or that are determinants of the form , at which point we have randomly chosen variables. 
The determinant polynomial of the corresponding Edmonds matrix

has maximum degree in the random variables , and is linear in each of these vari- ables. Proof: Each variable appears in only one column of the Edmonds matrix. Only the columns corre- sponding to links transmitting random combinations of input processes contain variable terms . The result follows by noting that each set The determinant can be written as the sum of products of such that corresponds to a entries, one from each row and column. Each such product network path consisting of links ; that the condition is linear in each variable term , and has degree for all implies that at most in these variables. the corresponding paths are disjoint; and that Lemma 4: Let be a nonzero polynomial in is nonzero only when links transmit linearly of degree less than or equal to , in which the largest expo- independent combinations of source processes. nent of any variable is at most . Values for are Proof of Theorem 1: By Lemma 1, the transfer matrix chosen independently and uniformly at random from . determinant for any receiver is nonzero if and only The probability that equals zero is at most for if the determinant of the corresponding Edmonds matrix is . nonzero. Thus, we consider the determinant of the latter Proof: For any variable in , let be the largest ex- matrix. Since each variable ( in the case of linearly ponent of in . Express in the form , correlated sources), or appears in exactly one column where is a polynomial of degree at most that does of the Edmonds matrix, the largest exponent of each of these not contain variable , and is a polynomial in which the variables in is , and the largest exponent of each variable largest exponent of is less than . By the Principle of De- in the product of receivers’ determinants is at ferred Decisions (e.g., [23]), the probability is un- most . affected if we set the value of last after all the other coeffi- For the acyclic delay-free case, we use an induction argument cients have been set. 
If, for some choice of the other coefficients, similar to that in [17] to show that there exists a solution in , then becomes a polynomial in of degree . 4424 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 10, OCTOBER 2006

By the Schwartz–Zippel theorem (e.g., [23]), this probability other , then . Assume without loss of generality is upper-bounded by .So that . Then there exists a feasible vector such that , , , and , which satisfies

(4)

Next we consider , choosing any variable in and letting be the largest exponent of in . We express in the form , where is a polynomial of This is again positive, contradicting the optimality of . degree at most that does not contain variables or Thus, , and or . So exactly of , and is a polynomial in which the largest exponent of the variables are equal to . Since the optimal solution is an is less than . Proceeding similarly, we assign variables and integer solution, it is also optimal for the integer program (6). define and for until we reach where The corresponding optimal is a constant and . Note that and ,so . Applying Schwartz–Zippel as before, we have for

(5) Proof of Theorem 2: There are links with associated code coefficients Combining all the inequalities recursively, we can show by in- in the case of linearly correlated sources duction that that are chosen independently and uniformly at random over . To check if the resulting network code is valid for a re- ceiver , it suffices to check that the determinant of the corre- Now consider the integer optimization problem sponding Edmonds matrix is nonzero (Lemma 1). This deter- minant, which we denote by , is a polynomial linear in each variable , with total degree at most in these Maximize variables (Lemma 3). The product for receivers is, ac- cordingly, a polynomial in of total degree at most , and in which the largest exponent of each of these vari- ables is at most . Applying Lemma 4, is nonzero with subject to probability at least . and integer (6) The bound is attained with equality for a network with in- dependent sources that consists only of link-disjoint paths, one for each source–receiver pair. In this case, there is a one-to-one whose maximum is an upper bound on . correspondence between links and variables . Each We first consider the problem obtained by relaxing the integer of these variables must be nonzero in order for the code to be condition on the variables . Let be an valid for all receivers. optimal solution. For any set of distinct integers from , let Proof of Theorem 3: By Lemma 2, a given network code is valid if, for each receiver , a linear combination of product terms of the form , where form a flow solution to , is nonzero. The We can show by induction on that for any product of the corresponding expressions for receivers has degree less than or equal to , where , and set of distinct integers in .If , then the largest exponent of any variable is at most . Applying there is some , and there exists a feasible solution Lemma 4 yields the result. 
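Lemma 4 is tight in the worst case mentioned after Theorem 2: for a network consisting only of link-disjoint source–receiver paths, the relevant polynomial is just the product of the η coefficients, which vanishes exactly when some coefficient is zero. The following Monte Carlo sketch (ours, with illustrative parameter values) checks that this product polynomial attains the d = 1 bound 1 − (1 − 1/q)^η with equality; a prime field keeps the modular arithmetic honest, since a product of nonzero residues mod a prime is nonzero.

```python
import random

random.seed(4)

q, eta = 17, 5    # prime field size and number of random coefficients (d = 1 here)
trials = 50000

# Worst-case polynomial for Lemma 4 with d = 1: the product of all eta
# variables. It evaluates to zero iff at least one variable is zero.
zero = 0
for _ in range(trials):
    prod = 1
    for _ in range(eta):
        prod = (prod * random.randrange(q)) % q
    zero += (prod == 0)

p_zero = zero / trials
bound = 1 - (1 - 1 / q) ** eta   # Lemma 4 bound with d = 1; equality here
print(f"P(product = 0) ~ {p_zero:.3f}, Lemma 4 bound {bound:.3f}")
```

For polynomials with more structure the bound is generally loose, which is why specific network topologies (Theorems 3 and 5) admit sharper estimates.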
such that , , and for , which satisfies The same proof holds for linearly correlated sources, by con- sidering variables in place of variables . Proof of Theorem 4: Recall that links in an acyclic graph are numbered ancestrally. For , suppose random This is positive, contradicting the optimality of ,so network coding is done over links , if any, and any link . is deleted with probability . Let be the graph Next suppose for some . Then there exists formed by removing deleted links from . Let be the event some such that , since if or for all that there exist code coefficients for undeleted links HO et al.: A RANDOM LINEAR NETWORK CODING APPROACH TO MULTICAST 4425 such that the resulting network code is valid for receiver over problem in which is the only receiver. Consider any single-re- . Denote by be the probability of . ceiver connection problem on a graph . Let be the To prove the theorem, we need to show that probability that the connection requirements of are feasible on the graph obtained by deleting links of with probability . Let be the graph obtained by deleting links of with probability , and the graph obtained by deleting links of with probability . By Theorem 4, the which follows from showing that probability that random network coding gives a code valid for the connection requirements of on can be lower-bounded by the probability that the connection requirements are feasible on , which is equal to . Consider a single-receiver connection problem with source processes originating at a common source node, on a for . graph consisting of link-disjoint source–receiver paths For any , consider any given subset of links of length . Let be any other single-receiver connection that are deleted, forming graph , and any problem with source processes on a -redundant graph given code coefficients for links . We compare the con- with source–receiver paths of length at most . 
Suppose links ditional probability of event (when link is deleted of each graph are deleted with probability . with probability ) and event (when random code We show by induction on that . coefficients are chosen for ). There are three cases. For , we consider a set of links in graph Case 1: event occurs if link is deleted. Then forming link-disjoint source–receiver paths sufficient to event occurs regardless of whether link is transmit all processes to the receiver. We distinguish two cases. deleted, and event occurs regardless of the values Case 1: None of the links in are deleted. In this case, the of the random code coefficients for link , since a valid connections are feasible. code exists over with the given code coefficients for links Case 2: There exists some link that is deleted. and zero coefficients for any link Then we have such that . success ( case 1) ( case 2) (success case 2) Case 2: event does not occur if link is deleted, but occurs if link is not deleted. Then there exists at least (case 2) (success case 2) one choice of code coefficients for link and any undeleted Since has at least as many links as links such that the resulting network code is valid for all receivers over . Each receiver has a set of terminal case 2 case 2 links whose coefficient vectors form a full rank set. Consider Thus, if we can show that the determinant of the associated matrix, expressed as a poly- nomial with the code coefficients for success case 2 success case 2 link as random variables. From Lemma 1, is linear in the variables . The product for the induction hypothesis receivers is a polynomial of degree at most in the variables success success . If this product is nonzero, the corresponding code is valid. By the Schwartz–Zippel theorem, this product follows. 
takes a nonzero value with probability when the vari- For , the hypothesis is true since success case 2 ables are chosen uniformly at random from a .For , in case 2 we can remove link leaving finite field of size . Thus, the conditional probability of event a -redundant graph . By the induction hypothesis, is at least . the probability of success for is less than or equal to that Case 3: event does not occur regardless of whether for . link is deleted. Then event does not occur Thus, , which is the probability that all links on at regardless of the values of the random code coefficients for link least of length- paths are not deleted. The result follows , since no valid code exists over with the given code from observing that the probability that the links on a path are coefficients for links . not deleted is . In all three cases, the conditional probability of event is less than or equal to the conditional probability of Proof of Theorem 6: We consider transmission, by random event . linear network coding, of one block of source bits, represented by vector . The transfer matrix The same proof holds for linearly correlated sources, by con- specifies the mapping from the vector of source bits to sidering variables in place of variables . the vector of processes on the set of terminal links incident Proof of Theorem 5 : Note that for any multicast con- to the receiver. nection problem, the probability that a random network code The first part of the proof parallels the analysis in [4]. is valid for a particular receiver is equal to the probability The -decoder maps a vector of received processes onto a that a random network code is valid for a modified connection vector minimizing subject 4426 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 10, OCTOBER 2006 to . For a minimum entropy decoder, where the probabilities are taken over realizations of the net- , while for a maximum -probability work transfer matrix corresponding to the random net- decoder, . 
We consider three types work code. The probabilities of errors: in the first type, the decoder has the correct value for but outputs the wrong value for ; in the second, the decoder has the correct value for but outputs the wrong value for ; in the third, the decoder outputs wrong values for both and . The error probability is upper-bounded by the sum of the probabilities of the three types of errors, . As in [4], (joint) types of sequences are considered as (joint) for nonzero can be calculated for a given net- distributions ( , etc.) of dummy variables , etc. work, or bounded in terms of and parameters of the network The set of different types of sequences in is denoted by as we will show later. .Defining the sets of types As in [4], we can apply some simple cardinality bounds

and the sets of sequences and the identity

(7) we have to obtain

Similarly

where the exponents and logs are taken with respect to base . For the minimum entropy decoder, we have

for for HO et al.: A RANDOM LINEAR NETWORK CODING APPROACH TO MULTICAST 4427 which gives

Case 2: . Then

(8)

which gives

(9)

(10)

We next show that these bounds also hold for the maximum -probability decoder, for which, from (7),

(11) A similar proof holds for . For , , and (11) gives For , we show the inequality at the top of the following page by considering two possible cases for any sat- (12) isfying (11). Case 1: . Then We show the inequality at the bottom of the page by considering two possible cases for any satisfying (12). Case 1: . Then 4428 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 10, OCTOBER 2006

Case 2: . Then by induction on that is upper-bounded by the probability of event . We let be the graphs in turn, and consider any particular source–receiver path in . We distinguish two cases. Case 1: does not occur for any of the links on the path . In this case, the event occurs with probability . by (11) Case 2: There exists some link on the path for which occurs. Thus, we have which gives (case 2) case 2 Since has at least as many links as

case 2 for case 2 for

Here the analysis diverges from that of [4], as we consider general networks instead of the simple Slepian–Wolf network. We bound the probabilities in terms of the code length and the network parameters: the minimum cut capacity between the receiver and each individual source, the minimum cut capacity between the receiver and both sources together, and the maximum source–receiver path length. Let G_1 and G_2 be subgraphs of graph G consisting of all links downstream of sources 1 and 2, respectively, where a link l is considered downstream of a source if l originates at the source node or if there is a directed path from the source to the node at which l originates; let G_3 be equal to G.

Note that in a random linear network code, any link which has at least one nonzero input transmits the zero process with probability 1/q^c, where q is the field size and c is the capacity of the link. This is the same as the probability that a pair of distinct values for the inputs of the link are mapped to the same output value on the link. For a given pair of distinct source values, let E_l be the event that the corresponding inputs to link l are distinct, but the corresponding values on l are the same. Let E(G) be the event that E_l occurs for some link l on every source–receiver path in graph G; the probability we wish to bound is then equal to the probability of event E(G).

Let G' be the graph consisting of node-disjoint paths, each consisting of links of unit capacity. We show by induction that the required bound on the probability of E(G') holds. If we can show that the bound is preserved in each of the two cases of the inductive step, the induction hypothesis follows. For the base case the hypothesis is true directly. For the inductive step, in case 2, removing the link in question leaves, in one subcase, the effective equivalent of a graph consisting of node-disjoint paths of shorter length and, in the other subcase, a graph of minimum cut at least as large as before; the result follows from applying the induction hypothesis to the resulting graphs. This gives an upper bound on the probability of E(G'), and substituting it into the error bounds (8)–(10) gives the desired result.

C. Random Flooding Versus Random Coding on a Grid

Proof of Proposition 1: To simplify notation, we assume without loss of generality that the axes are chosen such that the source is at grid position (0, 0); by symmetry we need only consider one ordering of the coordinates. Let E_{x,y} be the event that two different processes are received by a node at grid position (x, y) relative to the source. The statement of the proposition is then the bound (13) on the probability of E_{x,y}, which we prove by induction.
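Before giving the induction, the flooding process can be sanity-checked numerically. The sketch below simulates one plausible reading of the scheme; the modeling details (the source at (0, 0) sending process 1 eastward and process 2 northward, every other node forwarding on each outgoing link one of its incoming processes chosen uniformly at random, and the grid size and trial count) are our own illustrative assumptions, not taken from the paper.

```python
import random

def flood(n=8, trials=20000, seed=1):
    """Estimate P(E_{x,y}): the probability that the node at grid position
    (x, y) receives two different processes under random flooding.
    Modeling assumptions (ours): the source at (0, 0) sends process 1 on
    its eastward link and process 2 on its northward link; every other
    node forwards, independently on each outgoing link, one of its
    incoming processes chosen uniformly at random."""
    rng = random.Random(seed)
    both = [[0] * n for _ in range(n)]
    for _ in range(trials):
        east = {(0, 0): 1}    # east[(x, y)]:  process on link (x, y) -> (x+1, y)
        north = {(0, 0): 2}   # north[(x, y)]: process on link (x, y) -> (x, y+1)
        for s in range(1, 2 * n - 1):                 # sweep nodes by x + y
            for x in range(max(0, s - n + 1), min(s, n - 1) + 1):
                y = s - x
                incoming = [p for p in (east.get((x - 1, y)),
                                        north.get((x, y - 1))) if p is not None]
                if len(set(incoming)) == 2:
                    both[x][y] += 1                   # event E_{x,y} occurred
                east[(x, y)] = rng.choice(incoming)
                north[(x, y)] = rng.choice(incoming)
    return [[c / trials for c in row] for row in both]

p = flood()
# Axis nodes can only ever see one process; node (1, 1) always sees both;
# the probability of seeing both decays as |x - y| grows.
```

This matches the qualitative picture behind (13): nodes near the diagonal are likely to receive both processes, while nodes far from it are not.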

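The proof of Proposition 2 below reduces to counting the degree of a polynomial in the random code coefficients and then applying Theorem 3, a Schwartz–Zippel-type bound: a nonzero polynomial of total degree d over the field F_q vanishes at a uniformly chosen point with probability at most d/q. The exhaustive check below uses a toy polynomial and field size of our own choosing, purely for illustration.

```python
import itertools

q = 11  # illustrative prime field size (our choice)
d = 2   # total degree of the toy polynomial below

# f(a, b) = a*b - 1 over GF(q): a nonzero polynomial of total degree 2.
zeros = sum(1 for a, b in itertools.product(range(q), repeat=2)
            if (a * b - 1) % q == 0)
fraction = zeros / q**2  # empirical probability that f vanishes

# Schwartz-Zippel: P(f = 0 at a uniform random point) <= d/q.
assert fraction <= d / q
```

Here a*b = 1 (mod 11) has exactly one solution b for each nonzero a, so 10 of the 121 points are zeros, comfortably within the d/q = 2/11 bound.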
If (13) holds for some grid position, then it also holds for the next position considered in the induction.

Fig. 4. Rectangular grid network. Y_{x,y} denotes the process transmitted on the link between (x-1, y) and (x, y), and Y'_{x,y} denotes the process transmitted on the link between (x, y-1) and (x, y).

Let Y_{x,y} denote the process transmitted on the link between (x-1, y) and (x, y), and let Y'_{x,y} denote the process transmitted on the link between (x, y-1) and (x, y) (refer to Fig. 4). Observe that, with probability 1/2, a node transmits to its successor the process complementary to whatever process is being transmitted from its predecessor; a symmetric observation holds in the other grid direction. The inductive step distinguishes two cases, according to the processes carried on the two links entering the current node, with subcases as follows.

Case 1a: With probability 1/2 one of the two incoming processes is forwarded, and with probability 1/2 the other, so the claimed bound holds in Case 1a.

Case 1b: Either of the two remaining possibilities yields the bound directly, so it holds in Case 1b.

Case 2a: As in Case 1b, either possibility yields the bound directly, so it holds in Case 2a.

Case 2b: By the assumption of Case 2, the other incoming link also carries this same process, and the bound again holds.

Case 2c: Then both incoming processes are determined, so the bound holds in Case 2c.

Combining Cases 1a through 2c establishes the inductive step. For the base case, note that there are nodes at which one of the processes being transmitted toward the node in question is eliminated with probability 1/2; this gives the base case, which completes the induction.

Proof of Proposition 2: In the random coding scheme we consider, the only randomized variables are the code coefficients at nodes receiving information on two links. The number of such nodes on each source–receiver path is bounded by the path length, so the total degree of the corresponding polynomial is bounded accordingly. Applying Theorem 3 yields the result.

ACKNOWLEDGMENT

The authors would like to thank the anonymous reviewers for their detailed comments and suggestions, which helped to substantially improve the presentation of this paper.

REFERENCES

[1] R. Ahlswede, N. Cai, S.-Y. R. Li, and R. W. Yeung, "Network information flow," IEEE Trans. Inf. Theory, vol. 46, no. 4, pp. 1204–1216, Jul. 2000.
[2] K. P. Birman, M. Hayden, O. Ozkasap, Z. Xiao, M. Budiu, and Y. Minsky, "Bimodal multicast," ACM Trans. Comp. Syst., vol. 17, no. 2, pp. 41–88, 1999.
[3] P. A. Chou, Y. Wu, and K. Jain, "Practical network coding," in Proc. 41st Annu. Allerton Conf. Communication, Control, and Computing, Monticello, IL, Oct. 2003.
[4] I. Csiszár, "Linear codes for sources and source networks: Error exponents, universal coding," IEEE Trans. Inf. Theory, vol. IT-28, no. 4, pp. 585–592, Jul. 1982.
[5] R. Dougherty, C. Freiling, and K. Zeger, "Linearity and solvability in multicast networks," IEEE Trans. Inf. Theory, vol. 50, no. 10, pp. 2243–2256, Oct. 2004.
[6] M. Feder, D. Ron, and A. Tavory, "Bounds on linear codes for network multicast," Electronic Colloquium on Computational Complexity, vol. 10, no. 033, 2003.
[7] C. Fragouli and E. Soljanin, "Information flow decomposition for network coding," IEEE Trans. Inf. Theory, vol. 52, no. 3, pp. 829–848, Mar. 2006.
[8] T. Ho, D. R. Karger, M. Médard, and R. Koetter, "Network coding from a network flow perspective," in Proc. IEEE Int. Symp. Information Theory, Yokohama, Japan, Jun./Jul. 2003, p. 441.
[9] T. Ho, R. Koetter, M. Médard, D. R. Karger, and M. Effros, "The benefits of coding over routing in a randomized setting," in Proc. IEEE Int. Symp. Information Theory, Yokohama, Japan, Jun./Jul. 2003, p. 442.
[10] T. Ho, B. Leong, M. Médard, R. Koetter, Y. Chang, and M. Effros, "On the utility of network coding in dynamic environments," in Proc. Int. Workshop on Wireless Ad-Hoc Networks, Oulu, Finland, May/Jun. 2004, pp. 196–200.

[11] T. Ho, M. Médard, M. Effros, R. Koetter, and D. R. Karger, "Network coding for correlated sources," in Proc. Conf. Information Sciences and Systems, Princeton, NJ, 2004.
[12] T. Ho, M. Médard, J. Shi, M. Effros, and D. R. Karger, "On randomized network coding," in Proc. 41st Annu. Allerton Conf. Communication, Control, and Computing, Monticello, IL, Oct. 2003.
[13] T. Ho and H. Viswanathan, "Dynamic algorithms for multicast with intra-session network coding," in Proc. 43rd Annu. Allerton Conf. Communication, Control, and Computing, Monticello, IL, Sep. 2005.
[14] S. Jaggi, P. Chou, and K. Jain, "Low complexity algebraic network codes," in Proc. IEEE Int. Symp. Information Theory, Yokohama, Japan, Jun./Jul. 2003, p. 368.
[15] R. M. Karp, E. Upfal, and A. Wigderson, "Constructing a perfect matching is in random NC," Combinatorica, vol. 6, no. 1, pp. 35–48, 1986.
[16] M. S. Kodialam, T. V. Lakshman, and S. Sengupta, "Online multicast routing with bandwidth guarantees: A new approach using multicast network flow," Meas. Modeling Comput. Syst., pp. 296–306, 2000.
[17] R. Koetter and M. Médard, "An algebraic approach to network coding," IEEE/ACM Trans. Netw., vol. 11, no. 5, pp. 782–795, Oct. 2003.
[18] A. R. Lehman and E. Lehman, "Complexity classification of network information flow problems," in Proc. Symp. Discrete Algorithms, New Orleans, LA, 2004, pp. 142–150.
[19] S.-Y. R. Li, R. W. Yeung, and N. Cai, "Linear network coding," IEEE Trans. Inf. Theory, vol. 49, no. 2, pp. 371–381, Feb. 2003.
[20] D. S. Lun, N. Ratnakar, R. Koetter, M. Médard, E. Ahmed, and H. Lee, "Achieving minimum cost multicast: A decentralized approach based on network coding," in Proc. IEEE Infocom, Miami, FL, Mar. 2005, pp. 1607–1617.
[21] D. S. Lun, N. Ratnakar, M. Médard, R. Koetter, D. R. Karger, T. Ho, E. Ahmed, and F. Zhao, "Minimum-cost multicast over coded packet networks," IEEE Trans. Inf. Theory, vol. 52, no. 6, pp. 2608–2623, Jun. 2006.
[22] M. Médard, M. Effros, T. Ho, and D. R. Karger, "On coding for non-multicast networks," in Proc. 41st Annu. Allerton Conf. Communication, Control, and Computing, Monticello, IL, Oct. 2003.
[23] R. Motwani and P. Raghavan, Randomized Algorithms. Cambridge, U.K.: Cambridge Univ. Press, 1995.
[24] T. Noguchi, T. Matsuda, and M. Yamamoto, "Performance evaluation of new multicast architecture with network coding," IEICE Trans. Commun., vol. E86-B, no. 6, Jun. 2003.
[25] S. Riis, "Linear versus nonlinear Boolean functions in network flow," preprint, Nov. 2003.
[26] P. Sanders, S. Egner, and L. Tolhuizen, "Polynomial time algorithms for network information flow," in Proc. 15th ACM Symp. Parallel Algorithms and Architectures, San Diego, CA, 2003, pp. 286–294.
[27] S. D. Servetto and G. Barrenechea, "Constrained random walks on random graphs: Routing algorithms for large scale wireless sensor networks," in Proc. 1st ACM Int. Workshop on Wireless Sensor Networks and Applications, Atlanta, GA, 2002, pp. 12–21.
[28] D. Slepian and J. K. Wolf, "Noiseless coding of correlated information sources," IEEE Trans. Inf. Theory, vol. IT-19, no. 4, pp. 471–480, Jul. 1973.
[29] Y. Zhu, B. Li, and J. Guo, "Multicast with network coding in application-layer overlay networks," IEEE J. Sel. Areas Commun., vol. 22, no. 1, pp. 107–120, Jan. 2004.