Insight from Simple Questions: Three Examples (and some things that I learned from Jack)

Henry D. Pfister
Duke University

Jack Keil Wolf Lecture on Information Theory and Applications
Center for Magnetic Recording Research
University of California, San Diego
December 1st, 2017

Acknowledgments

Thanks to my past and present collaborators. In particular, this presentation is based on work with:

Jack Wolf, Roberto Padovani, Jilei Hou, Alessandro Vanelli-Coralli, John Smee, Joseph Soriaga, Stefano Tomasin, Santhosh Kumar, Rüdiger Urbanke, Marco Mondelli, Shrinivas Kudekar, Eren Şaşoğlu

Thanks to Paul Siegel and the selection committee for the invitation to speak today

Introduction

Jack Wolf liked to ask simple questions whose answers could provide insight
Often, they were playful curiosities
But, some had a more profound impact

This style of inquiry resonated with me
But it is an art...
Simple questions can be quite hard to answer
The answer doesn't always give insight
Even when it should, you might miss it

One approach I like is based on Polya's Dictum: If you can't solve a problem, then there is an easier problem inside it that you can't solve

Three Simple Questions

Can one compute the amplitude and frequency of 2 complex sinusoids using only 3 complex samples?

How can cellular systems with successive cancellation approach the equal-rate point of the multiple-access channel with equal-power users?

Can Reed-Muller codes achieve capacity?

First Question

[Figure: a signal that is the sum of two sinusoids]

Can one compute the amplitude and frequency of 2 complex sinusoids using only 3 complex samples?

Assume the samples are s0, s1, s2 where

$s_n = a_1 e^{n\omega_1 T\sqrt{-1}} + a_2 e^{n\omega_2 T\sqrt{-1}} = a_1\gamma_1^n + a_2\gamma_2^n$

$S(z) = \sum_{n=0}^{\infty} s_n z^{-n} = \frac{a_1}{1-\gamma_1 z^{-1}} + \frac{a_2}{1-\gamma_2 z^{-1}}.$

Prony's Method

In 1795, Baron Gaspard De Prony published a method that recovers the parameters of a linear combination of t exponentials from 2t uniform samples s1,..., s2t

For t = 2, Prony's method designs a 3-tap FIR filter (with Z-transform $B(z) = b_0 + b_1 z^{-1} + b_2 z^{-2}$) whose zeros cancel the poles in the Z-transform of the $s_n$ sequence:

$B(z)S(z) = \underbrace{b_0(1-\gamma_1 z^{-1})(1-\gamma_2 z^{-1})}_{B(z)}\left(\frac{a_1}{1-\gamma_1 z^{-1}} + \frac{a_2}{1-\gamma_2 z^{-1}}\right) = b_0 a_1(1-\gamma_2 z^{-1}) + b_0 a_2(1-\gamma_1 z^{-1}).$

Expand the LHS and match the $z^{-2}$ and $z^{-3}$ coefficients to get

$\begin{bmatrix} s_0 & s_1 \\ s_1 & s_2 \end{bmatrix}\begin{bmatrix} b_2 \\ b_1 \end{bmatrix} = -b_0\begin{bmatrix} s_2 \\ s_3 \end{bmatrix}$

B(z) has zeros at the poles of S(z): factor B(z) to find $\gamma_1, \gamma_2$.

Connection to Coding

In 1967, Wolf observed (in a 3/4-page correspondence) that the celebrated PGZ decoding algorithm for RS/BCH codes was mathematically identical to Prony's method

[Figure: facsimile of J. K. Wolf, "Decoding of Bose-Chaudhuri-Hocquenghem Codes and Prony's Method for Curve Fitting," IEEE Transactions on Information Theory, October 1967]

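As a sanity check, the t = 2 case of Prony's method can be run in a few lines. This is a minimal sketch with made-up noiseless data; the amplitudes and frequencies below are hypothetical, not values from the talk:

```python
import numpy as np

# Hypothetical noiseless data: s_n = a1*g1^n + a2*g2^n (general complex g_i)
a = np.array([1.0 + 0.5j, -0.3 + 1.2j])             # unknown amplitudes
g = np.array([2.0 * np.exp(3j), 0.8 * np.exp(1j)])  # unknown "frequencies"
s = np.array([np.sum(a * g**n) for n in range(4)])  # 2t = 4 samples s_0..s_3

# Match the z^-2 and z^-3 coefficients (normalize b0 = 1):
#   [s0 s1] [b2]     [s2]
#   [s1 s2] [b1] = - [s3]
M = np.array([[s[0], s[1]], [s[1], s[2]]])
b2, b1 = np.linalg.solve(M, -np.array([s[2], s[3]]))

# B(z) = 1 + b1*z^-1 + b2*z^-2 vanishes at the poles of S(z)
g_hat = np.roots([1, b1, b2])      # recovers gamma_1, gamma_2
V = np.vstack([np.ones(2), g_hat])  # [[1, 1], [g1, g2]]
a_hat = np.linalg.solve(V, s[:2])   # recovers a_1, a_2 from s_0, s_1
print(g_hat, a_hat)
```

With exact samples the linear solve returns the true filter taps, and factoring B(z) returns the two exponentials exactly (up to floating-point error).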
Getting Away with Fewer Samples

To solve for b1, b2, we want 2 linear equations

But 3 samples only give 1 equation: $s_0 b_2 + s_1 b_1 = -b_0 s_2$. Can we use $s_0, s_1, s_2$ to get another independent equation?

For $\gamma_i = e^{\omega_i T\sqrt{-1}}$, consider the time-reversed conjugate $s_{-n}^*$. The sequence $s_{-n}^*$ has the same frequency content as $s_n$:

$s_{-n}^* = \left(\sum_{i=1}^{2} a_i e^{-n\omega_i T\sqrt{-1}}\right)^{*} = \sum_{i=1}^{2} a_i^* e^{n\omega_i T\sqrt{-1}}$

Thus, we can use 3 samples to write

$\begin{bmatrix} s_0 & s_1 \\ s_2^* & s_1^* \end{bmatrix}\begin{bmatrix} b_2 \\ b_1 \end{bmatrix} = -b_0\begin{bmatrix} s_2 \\ s_0^* \end{bmatrix}$

If $a_1, a_2$ have uniform random phase, the matrix is almost surely invertible

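The 3-sample trick above can also be checked numerically. A minimal sketch, assuming unit-modulus $\gamma_i$; the specific frequencies and phases are hypothetical, not from the talk:

```python
import numpy as np

# Two unit-modulus exponentials: gamma_i = e^{j w_i T}, |a_i| = 1
w = np.array([0.7, 2.1])               # hypothetical frequencies w_i * T
a = np.exp(1j * np.array([0.4, 1.9]))  # unit amplitudes, arbitrary phases
gamma = np.exp(1j * w)
s0, s1, s2 = (np.sum(a * gamma**n) for n in range(3))

# Second equation from time-reversal conjugate symmetry (b0 = 1):
#   [s0   s1 ] [b2]     [s2 ]
#   [s2*  s1*] [b1] = - [s0*]
A = np.array([[s0, s1], [np.conj(s2), np.conj(s1)]])
b2, b1 = np.linalg.solve(A, -np.array([s2, np.conj(s0)]))

# Zeros of B(z) = 1 + b1*z^-1 + b2*z^-2 lie at gamma_1, gamma_2
w_hat = np.sort(np.angle(np.roots([1, b1, b2])))
print(w_hat)  # ~ [0.7, 2.1]
```

Only $s_0, s_1, s_2$ enter the linear system: the conjugated, time-reversed sequence supplies the second equation that Prony's method would otherwise take from $s_3$.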
Have We Cheated?

Prony's method: works for general complex frequencies like $\gamma_1 = 2e^{3\sqrt{-1}}$. Counting $\mathbb{R}$ dimensions: 8 parameters from 8 samples

New method based on time reversal and conjugacy: requires that $(\gamma_i^*)^{-1} = \gamma_i$, which implies $|\gamma_i| = 1$. Counting $\mathbb{R}$ dimensions: 6 parameters from 6 samples

Yes, we have cheated! We can find t "sinusoids" using $\lceil 3t/2\rceil$ samples, but not t "complex exponentials"

The gain comes from reducing γi from 2-dim to 1-dim

AMS Special Session on Coding Theory 2007

“Rediscovering Our Roots: Coding Theory and Reed-Solomon Codes”

Three Questions

How many equally spaced samples are required to find the amplitude and frequency of the sum of t sinusoids? $\lceil 3t/2\rceil$

How many errors can be corrected by a [17, 8, 10] shortened Reed-Solomon code over the Galois field $\mathbb{F}_{256}$? 6

How are these questions related? via Jack’s 1967 paper

Galois Field "Complex Numbers"

Let $F = \mathbb{F}_{q^2}$ and let $\alpha \in F$ be a primitive element
Choose $\beta = \alpha^{q-1}$ and consider its conjugate $\beta^q$ (w.r.t. $\mathbb{F}_q$)
A little algebra shows $\beta^q = \alpha^{q^2-q} = \alpha^{-(q-1)} = \beta^{-1}$

Since $\beta^{q+1} = \alpha^{(q-1)(q+1)} = \alpha^{q^2-1} = 1$, we see $\beta$ has order $q+1$
The element $\beta$ gives a $[q+1, k, q+2-k]$ shortened RS code over $F$
The syndrome sequence has time-reversal conjugate symmetry
The decoding trick can succeed with $\lfloor 2(n-k)/3\rfloor$ errors
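The order computation can be sanity-checked in the exponent: $\mathbb{F}_{q^2}^*$ is cyclic of order $q^2-1$, so $\beta = \alpha^{q-1}$ has order $(q^2-1)/\gcd(q^2-1,\, q-1) = q+1$. A quick check for q = 16, which gives the [17, 8, 10] example above:

```python
from math import gcd

q = 16                      # F_256 = F_{q^2} with q = 16
n_group = q**2 - 1          # order of the cyclic group F_{q^2}^*
order_beta = n_group // gcd(n_group, q - 1)
print(order_beta)           # 17 = q + 1

# [q+1, k, q+2-k] shortened RS code with k = 8 is the [17, 8, 10] code;
# the time-reversal trick can correct floor(2(n-k)/3) errors
n, k = q + 1, 8
print(n, k, q + 2 - k)      # 17 8 10
print((2 * (n - k)) // 3)   # 6
```

This matches the answer "6" to the second question of the AMS session slide.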

First Interlude

In 1998, I took Algebraic Coding Theory from Jack
His use of simple examples made the class very accessible
That's where I learned the decoding link to Prony's method

In Dec. 1999, Jack was on my prelim exam committee
He asked the simple question:

“Can we analyze typical-set decoding via weight enumerators?”

Second Question

How can cellular systems with successive cancellation approach the equal-rate point of the multiple-access channel with equal-power users?

Asked (in spirit) at Qualcomm in 2003 by Roberto Padovani
Initially, a small group (including Jack) met weekly with Roberto to discuss this
Later, the group grew and started implementing pilot interference cancellation for CSM 6800
When I left Qualcomm in 2005, PIC was working, the TIC system design was drafted, and the real work began: Jilei Hou, John Smee, Joseph Soriaga, and others built it

Background

Interference cancellation viewed with skepticism at Qualcomm
Viterbi's book (Vembu paper) argues that gains are limited
This negative view was controversial in the IT community

After Viterbi left Qualcomm, this view was revisited
First task was to understand what gains are possible: "Capacity of Multi-antenna Gaussian Channels" by Telatar

Second task was to consider practical limitations: e.g., channel estimation and MMSE cancellation

The CDMA 2000 Reverse Link

Qualcomm EV-DO Reverse Link: 1228800 chips/sec, 1 slot = 2048 chips, 1 slot = 1.67 ms
Release 0 uses a single 16-slot packet for the reverse link; users staggered in 16 "phases" for hardware load balancing
Release A uses Hybrid ARQ with four 4-slot subpackets; packets in 3 interlaced streams to allow time for ACK/NACK; users staggered across 4 "phases" for hardware load balancing

[Figure: interlace structure for the EV-DO reverse link, courtesy of Soriaga et al., GLOBECOM 2006 [SHS06]]
The Multiple Access Channel (MAC)

T T In classic SIC with exponential power shaping and uniform User 1 power: 7/3 (i) (ii) (iii) User 1 power: 4 data rates, the last user is received with User 2 power: 7/3 very low power and

3-User Gaussian MAC with Successive Cancellation

Capacity achieved by an exponentially decaying power profile
Also by equal-power staggered users (c.f., rate-splitting)

With noise level 1 and per-user power 7/3, the three subframes of each staggered user see 0, 1, and 2 uncancelled interferers after cancellation, so the per-user rate is

(1/3) [ log2(1 + (7/3)/1) + log2(1 + (7/3)/(1 + 7/3)) + log2(1 + (7/3)/(1 + 2·7/3)) ]
  = (1/3) log2( (10/3)·(17/10)·(24/17) ) = (1/3) log2(8) = 1 b/s/Hz,

so three users together achieve the AWGN sum rate capacity log2(1 + 7) = 3 b/s/Hz

[Figure 1: decoding timelines for frame synchronous SIC (user powers 4, 2, 1) vs. frame asynchronous SIC (power 7/3 per user), noise level 1]

(Figure courtesy of Hou et al., IEEE Comm. Mag. 2006 [HSPT06])

16 / 41
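The staggered SIC rate calculation can be checked numerically (a quick sketch; the per-user power 7/3 and unit noise are the values from the slide):

```python
from math import log2

P, N0 = 7 / 3, 1.0   # per-user power and noise level from the slide

# After cancelling previously decoded packets, the three subframes of a
# user's frame see interference from 0, 1, and 2 other users.
subframe_rates = [log2(1 + P / (N0 + k * P)) for k in range(3)]
per_user_rate = sum(subframe_rates) / 3

sum_rate = 3 * per_user_rate
awgn_sum_capacity = log2(1 + 3 * P / N0)   # log2(8) = 3 b/s/Hz

print(per_user_rate, sum_rate, awgn_sum_capacity)
```

The product (10/3)·(17/10)·(24/17) telescopes to 8, which is why the per-user rate is exactly 1 b/s/Hz.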

Staggered Successive Interference Cancellation

[Figure: packets 0-4 of users 1-3, staggered one slot apart across slots 0, 1, 2, ..., 9, ...]

Graphical Representation of Decoding
Uses single-user coding and decoding of packets
Cancellation occurs in receive buffer after decoding
Conceptually similar to D-BLAST and rate-splitting
Due to CRC, peeling similar to LDPC codes on the BEC
For uncoordinated users, called iterative collision resolution
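The decode-then-cancel schedule can be sketched with a toy timeline model (an illustrative sketch, not the actual receiver implementation): packets are decoded at their final slot and then cancelled from the receive buffer, so each interior packet sees 0, 1, and 2 uncancelled interferers across its three subframes — the interference profile used in the 3-user rate calculation.

```python
# Toy model of staggered SIC: 3 users send length-3 packets offset by one
# slot; a packet is decoded at its finishing slot and then cancelled
# (retroactively) from the buffered slots it occupied.
K, L, FRAMES = 3, 3, 4   # users, slots per packet, packets per user

# packet = (user, occupied slots, finishing slot)
packets = [(k, set(range(L * f + k, L * f + k + L)), L * f + k + L - 1)
           for k in range(K) for f in range(FRAMES)]

def residual_interferers(p):
    """Per-slot count of interferers not yet cancelled when p is decoded.

    A packet is already cancelled iff it finished strictly earlier."""
    user, slots, finish = p
    return [sum(1 for (u, s, fin) in packets
                if u != user and t in s and fin >= finish)
            for t in sorted(slots)]

# Away from the timeline boundary, every packet sees 0, 1, and 2
# uncancelled interferers across its three subframes.
interior = [p for p in packets
            if min(p[1]) >= L and max(p[1]) < L * FRAMES - L]
print([residual_interferers(p) for p in interior])
```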

17 / 41

Performance Comparison

IC gains approx. 70% for 4 antennas and 50% for 2 antennas

[Figure 3: average sector throughput (kbps) for the EV-DO reverse link vs. number of ATs per sector, for 2 Ant MRC, 2 Ant MRC-IC, 4 Ant MMSE, and 4 Ant MMSE-IC receivers; Figure 4: CDFs of absolute and relative AT throughput across the network]

(Figure courtesy of Soriaga et al., GLOBECOM 2006 [SHS06])

18 / 41

Historical Perspective

Performance in practice was surprisingly good

Key role of staggered users was only recognized later
Fortunately, staggering was there for hardware load balancing
Adding/removing staggering would have been impossible

Cancellation project at Qualcomm started in 2003, before the magic of spatial coupling was known

Lefty Gomez famously said, "I'd rather be lucky than good." Definitely true for innovation and backwards compatibility!

19 / 41

What is Spatial Coupling?

[Figure: a Sudoku-style grid of digits used as the running example of a coupled factor graph]

Spatially-Coupled Factor Graphs
Variable nodes have a natural global orientation
Boundaries help variables to be recovered in an ordered fashion
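The boundary effect can be reproduced with the standard scalar density evolution for a (3,6)-regular ensemble on the BEC (a minimal sketch; the chain length L, coupling width w, and erasure rate below are illustrative choices, not values from the talk):

```python
# Density evolution on the BEC for a (3,6)-regular LDPC ensemble,
# uncoupled vs. spatially coupled.

def uncoupled_de(eps, iters=10000):
    x = eps
    for _ in range(iters):
        x = eps * (1 - (1 - x) ** 5) ** 2       # dv = 3, dc = 6
    return x

def coupled_de(eps, L=30, w=3, iters=10000):
    # Positions outside 0..L-1 are known (erasure prob 0); this boundary
    # condition launches a decoding wave that moves inward.
    x = [eps] * L
    get = lambda i: x[i] if 0 <= i < L else 0.0
    for _ in range(iters):
        q = []
        for c in range(L + w - 1):              # check-node positions
            avg = sum(get(c - k) for k in range(w)) / w
            q.append(1 - (1 - avg) ** 5)
        x = [eps * (sum(q[i + j] for j in range(w)) / w) ** 2
             for i in range(L)]
    return max(x)

eps = 0.45   # above the uncoupled BP threshold (~0.4294) for (3,6)
u = uncoupled_de(eps)   # stuck at a nonzero fixed point
c = coupled_de(eps)     # wave from the boundary drives erasures to ~0
print(u, c)
```

Uncoupled decoding stalls at a nonzero fixed point, while the coupled chain is recovered in an ordered fashion from the boundaries in.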

20 / 41

Spatially-Coupled Sudoku Example

[Figure: a partially filled Sudoku grid, completed cell by cell across successive slides as each constraint forces a new entry]
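The peeling idea above can be mimicked in a few lines: repeatedly fill any cell whose row, column, and box leave exactly one candidate. The grid below is an illustrative instance (the diagonal of a known valid solution blanked out), not the puzzle from the slide:

```python
# Solve a Sudoku by pure constraint propagation ("peeling"): information
# flows cell by cell, much like erasure decoding on a factor graph.
SOLUTION = [
    "123456789", "456789123", "789123456",
    "231564897", "564897231", "897231564",
    "312645978", "645978312", "978312645",
]
grid = [list(row) for row in SOLUTION]
for i in range(9):
    grid[i][i] = "."          # erase the diagonal

def candidates(g, r, c):
    used = set(g[r]) | {g[i][c] for i in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used |= {g[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return set("123456789") - used

progress = True
while progress:               # peel until no forced cell remains
    progress = False
    for r in range(9):
        for c in range(9):
            if grid[r][c] == ".":
                cand = candidates(grid, r, c)
                if len(cand) == 1:
                    grid[r][c] = cand.pop()
                    progress = True

solved = all("".join(grid[r]) == SOLUTION[r] for r in range(9))
print(solved)   # True
```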

21 / 41

Second Interlude

Insight missed: From the staggered MAC to Spatial Coupling
There were signs that the staggered structure was important
But, I didn't appreciate spatial coupling until the KRU proof

If you want more details on this topic, google for Jack Wolf's plenary slides from ISSSTA 2006

When I was ready to leave Qualcomm, I sought Jack's advice
I was deciding between a post-doc and a few different startups

22 / 41

Third Question

Can Reed-Muller codes achieve capacity?

Discussed by Kasami, Lin, and Peterson in the late 1960s
Mentioned by Lin in an ITW 1993 talk
Conjectured for the BEC in a paper by Dumer and Farrell in 1994
Costello and Forney in 2007: "The road to channel capacity"
Asked by Arikan in 2010 after the invention of polar codes
Shown for rates → 0 and → 1 by Abbe, Shpilka, and Wigderson in 2014

23 / 41

Coding for the BEC: Problem Setup

Length-N binary linear code C

Message → Encoder → X → BEC(p) → Y

Uniform random codeword: X = (X_0, ..., X_{N-1}) ∈ C ⊆ {0,1}^N
Bernoulli-p erasures: Z = (Z_0, ..., Z_{N-1}) ∈ {0,1}^N
BEC(p) observation of X: Y_i = X_i if Z_i = 0, and Y_i = ? if Z_i = 1
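The channel model above is one line of code per coordinate (a sketch; the codeword and erasure rate are arbitrary illustrative choices):

```python
# One use of BEC(p) per coordinate: each bit is independently replaced
# by an erasure symbol "?" with probability p.
import random

random.seed(0)                 # reproducible example
p = 0.5
x = [1, 0, 0, 1, 0, 1]         # transmitted codeword bits
z = [1 if random.random() < p else 0 for _ in x]   # erasure indicators
y = ["?" if zi else xi for xi, zi in zip(x, z)]    # channel output
print(y)
```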

24 / 41

MAP Decoding on the BEC

Example: y = (1 0 ? ? ? ?) with
C = { (0 0 0 0 0 0), (0 0 1 1 1 0), (0 1 0 0 1 1), (0 1 1 1 0 1),
      (1 0 0 1 0 1), (1 0 1 0 1 1), (1 1 0 1 1 0), (1 1 1 0 0 0) }

Consistent codewords: C(y) ≜ { x ∈ C | x_i = y_i when y_i ≠ ? }
Decoder declares block erasure iff |C(y)| > 1
Bit i recoverable iff x_i = x'_i for all x, x' ∈ C(y)
If x_i unrecoverable, half the consistent codewords have x_i = 1
Since they are equiprobable, H(X_i | Y = y) is either 0 or 1
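The consistency check above can be run directly by enumeration, using the eight codewords listed on the slide:

```python
# MAP erasure decoding by enumeration for the length-6 code on the slide.
CODE = ["000000", "001110", "010011", "011101",
        "100101", "101011", "110110", "111000"]

def consistent(code, y):
    return [c for c in code if all(yi in ("?", ci) for ci, yi in zip(c, y))]

y = "10????"
cw = consistent(CODE, y)                   # codewords agreeing with y
recoverable = [i for i in range(6)
               if y[i] == "?" and len({c[i] for c in cw}) == 1]
print(len(cw), recoverable)   # 2 [5]
```

Two codewords remain consistent, so the decoder declares a block erasure; only bit 5 is pinned, and each unrecoverable bit is 1 in exactly half of the consistent codewords.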

25 / 41

Code Performance on the BEC

The MAP erasure rate of bit i, P_{b,i}(p), satisfies

  P_{b,i}(p) ≜ H(X_i | Y) = P(Y_i = ?) · H(X_i | Y, Y_i = ?) = p · h_i(p),

where h_i(p) is called the MAP EXIT function of bit i

MAP erasure: P_b(p) ≜ (1/N) Σ_{i=0}^{N-1} P_{b,i}(p) = p · h(p)
MAP EXIT: h(p) ≜ (1/N) Σ_{i=0}^{N-1} h_i(p)

Achieving Capacity C = 1 − p under Bit-MAP Decoding
Code sequence {C^(n)} with rates R_n → R
i. For any p < 1 − R, the MAP erasure rate P_b^(n)(p) → 0
ii. For any p < 1 − R, the MAP EXIT function h^(n)(p) → 0

26 / 41

EXtrinsic Information Transfer (EXIT) Curves

In 1999, ten Brink defined h(p) to explain iterative decoding

[Figure: an EXIT curve h vs. erasure probability p, with the area under the curve equal to K/N]

EXIT Area Theorem [ABK04]

  ∫_0^1 h(p) dp = R (code rate)

a conservation principle by Ashikhmin, ten Brink, and Kramer

If {C^(n)} is capacity achieving, it implies h^(n)(p) → step function
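For the length-6 code of the MAP-decoding example, the area theorem can be verified exactly: h_i(p) is the expectation of the recovery-failure indicator over erasure patterns of the other positions, and each term integrates in closed form, so the average area can be computed with rational arithmetic (a sketch using that small code, not a general tool):

```python
# Exact check of the EXIT area theorem for the length-6 code used in the
# MAP-decoding example: the average of the areas under h_i(p) equals R.
from fractions import Fraction
from itertools import product
from math import comb

CODE = ["000000", "001110", "010011", "011101",
        "100101", "101011", "110110", "111000"]
N = 6

def exit_area(i):
    # Sum over erasure patterns S of the other N-1 positions:
    # the integral of p^|S| (1-p)^(N-1-|S|) over [0,1] is 1/(N*C(N-1,|S|)).
    others = [j for j in range(N) if j != i]
    total = Fraction(0)
    for bits in product((0, 1), repeat=N - 1):
        S = {j for j, b in zip(others, bits) if b}
        # Bit i is unrecoverable iff some codeword has a 1 at i with all
        # of its other ones inside the erased set S.
        if any(c[i] == "1" and all(c[j] == "0" or j in S for j in others)
               for c in CODE):
            total += Fraction(1, N * comb(N - 1, len(S)))
    return total

areas = [exit_area(i) for i in range(N)]
print(sum(areas) / N)   # the code rate 3/6
```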

27 / 41

The MAP EXIT Curve of a Capacity-Achieving Code

[Figure: average EXIT functions h vs. erasure probability p for N = 2^3, 2^5, 2^7, 2^9, sharpening toward a step as N grows]

Area Theorem implies sharp transition iff capacity achieving

28 / 41

Capacity via Symmetry Theorem

Symmetry alone is sufficient to achieve capacity!!! [KKMPSU16]

Theorem 1: Let {C_n} be a sequence of F_q-linear codes with blocklengths N_n → ∞ and rates R_n → R ∈ (0, 1), where the permutation group of each C_n is doubly transitive. Then, {C_n} achieves capacity on the q-ary erasure channel (QEC) under symbol-MAP decoding.

Previous work to show RM codes achieve capacity
Many approaches tried based on the code / weight distribution
Only known proof works for a much wider class!

29 / 41

Code Symmetry in Pictures (1)

[Figure: the Hamming (8,4) code, with columns as codewords, shown next to its permutation group G; applying a permutation π ∈ G to the coordinates maps the set of codewords onto itself]

30 / 41

The Permutation Group of a Set of Vectors

For a set A ⊆ X^N of vectors
Permutations π ∈ S_N act on X^N via: b = π(a) ⇔ b_{π(i)} = a_i
The permutation group of A is defined to be

  G ≜ { π ∈ S_N | π(a) ∈ A for all a ∈ A }

A permutation group G is transitive if, for all i, j ∈ Z_N, there exists π ∈ G such that π(i) = j
G is doubly transitive if, for any i, j, k, l ∈ Z_N with i ≠ j and k ≠ l, there exists π ∈ G such that π(i) = k and π(j) = l

Example: A = { (0 0 0 0), (0 0 1 1), (1 1 0 0), (1 1 1 1) } gives the eight-element group G whose images of (0 1 2 3) are (0 1 2 3), (1 0 2 3), (0 1 3 2), (1 0 3 2), (2 3 0 1), (3 2 0 1), (2 3 1 0), (3 2 1 0); this G is transitive but not doubly transitive
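The example can be checked by brute force over S_4 (a small sketch using the slide's set A):

```python
# Brute-force the permutation group of a set of vectors and verify it is
# transitive but not doubly transitive.
from itertools import permutations

A = {(0, 0, 0, 0), (0, 0, 1, 1), (1, 1, 0, 0), (1, 1, 1, 1)}
N = 4

def act(pi, a):
    b = [0] * N
    for i, ai in enumerate(a):   # b = pi(a)  with  b_{pi(i)} = a_i
        b[pi[i]] = ai
    return tuple(b)

G = [pi for pi in permutations(range(N))
     if all(act(pi, a) in A for a in A)]

transitive = all(any(pi[i] == j for pi in G)
                 for i in range(N) for j in range(N))
doubly_transitive = all(
    any(pi[i] == k and pi[j] == l for pi in G)
    for i in range(N) for j in range(N) if i != j
    for k in range(N) for l in range(N) if k != l)

print(len(G), transitive, doubly_transitive)   # 8 True False
```

The group is exactly the permutations preserving the partition {{0,1}, {2,3}}, so single coordinates can be moved anywhere, but a pair inside one block can never be split across blocks.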

31 / 41

Code Symmetry in Pictures (2)

[Figure: the Hamming (8,4) code (columns are codewords) and its permutation group G again; each π ∈ G permutes the coordinates and thereby permutes the codewords among themselves]

Connection to Boolean Functions (1)

For linear codes, the recovery of X_i from Y_∼i = y_∼i
- is independent of the transmitted codeword X
- depends only on the erasure indicators z_i = 1_{?}(y_i)
- is a zero-one boolean function of z_∼i

Recovery indicator of bit i is a monotone boolean function:
  H(X_i | Y_∼i = y_∼i) ≜ f_i(z_∼i) ∈ {0, 1},
  where z_∼i ≜ (z_0, ..., z_{i−1}, z_{i+1}, ..., z_{N−1})

Bit-i EXIT function equals the expectation of f_i(Z_∼i):
  h_i(p) ≜ H(X_i | Y, Y_i = ?) = E[f_i(Z_∼i)]

33 / 41
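As a sanity check, the recovery indicator f_i and the EXIT function h_i(p) can be computed by brute force for a small code. The sketch below is illustrative (function names are mine, not from the talk): for a linear code, bit i is recoverable exactly when column i of the generator matrix lies in the GF(2) span of the unerased columns j ≠ i.

```python
from itertools import product

def gf2_span_contains(vectors, target):
    """True iff `target` (an int bitmask) lies in the GF(2) span of `vectors`."""
    pivots = {}
    for v in vectors:
        while v:
            lead = v.bit_length() - 1
            if lead in pivots:
                v ^= pivots[lead]
            else:
                pivots[lead] = v
                break
    while target:
        lead = target.bit_length() - 1
        if lead not in pivots:
            return False
        target ^= pivots[lead]
    return True

def recovery_indicator(cols, i, z):
    """f_i(z_~i) = 1 iff X_i is NOT recoverable, i.e. H(X_i | Y_~i = y_~i) = 1.

    cols[j] is column j of the generator matrix (bitmask over the k rows);
    z[j] = 1 means position j is erased.  X_i is recoverable exactly when
    column i lies in the GF(2) span of the unerased columns j != i.
    """
    known = [cols[j] for j in range(len(cols)) if j != i and z[j] == 0]
    return 0 if gf2_span_contains(known, cols[i]) else 1

def exit_function(cols, i, p):
    """h_i(p) = E[f_i(Z_~i)], by exact enumeration of erasure patterns."""
    n = len(cols)
    h = 0.0
    for pattern in product([0, 1], repeat=n):
        if pattern[i]:          # Z_~i excludes position i; fix z_i = 0
            continue
        w = sum(pattern)        # number of erasures among the other bits
        h += p**w * (1 - p)**(n - 1 - w) * recovery_indicator(cols, i, pattern)
    return h

# (3,2) single parity-check code, G = [[1,0,1],[0,1,1]]: columns as bitmasks
cols_spc = [0b01, 0b10, 0b11]
print(exit_function(cols_spc, 0, 0.5))  # SPC: h(p) = 1 - (1-p)^2, so 0.75
```

For the SPC code this reproduces h(p) = 1 − (1−p)², since bit i is unrecoverable exactly when any other bit is erased.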

Connection to Boolean Functions (2)

If the permutation group of a code is transitive, then
  h_i(p) = h(p) for all i ∈ {1, 2, ..., N}
Thus, the EXIT Area Theorem implies ∫_0^1 h_i(p) dp = R

If the permutation group is doubly transitive, then
  f_i(z_1, z_2, ..., z_{N−1}) = f_i(z_{π(1)}, z_{π(2)}, ..., z_{π(N−1)}) for all π ∈ G_i,
  where G_i is a transitive group of permutations

Thus, f_i is a symmetric monotone boolean function

34 / 41
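The area-theorem statement is easy to verify numerically for a concrete transitive code. A small sketch (my own illustration, not from the slides): for the (n, n−1) SPC code every bit has the same EXIT function h(p) = 1 − (1−p)^{n−1}, and its area should equal the rate R = (n−1)/n.

```python
def spc_exit(n, p):
    # EXIT function of the (n, n-1) SPC code: bit i is unrecoverable
    # iff at least one of the other n-1 bits is erased
    return 1 - (1 - p)**(n - 1)

def exit_area(n, steps=100000):
    # midpoint rule for the area under h(p) on [0, 1]
    return sum(spc_exit(n, (k + 0.5) / steps) for k in range(steps)) / steps

for n in (3, 5, 8):
    print(n, exit_area(n), (n - 1) / n)  # the two columns agree
```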

Symmetric Monotone Boolean Functions

[Image: first page of Ehud Friedgut and Gil Kalai, "Every monotone graph property has a sharp threshold," Proc. Amer. Math. Soc., 124(10):2993–3002, October 1996 (communicated by Jeffry N. Kahn).]

Setup from the abstract: let V_n(p) = {0,1}^n be the Hamming space with the probability measure µ_p defined by µ_p(ε_1, ε_2, ..., ε_n) = p^k (1−p)^{n−k}, where k = ε_1 + ε_2 + ... + ε_n. A monotone subset A of V_n is symmetric if there is a transitive permutation group Γ on {1, 2, ..., n} such that A is invariant under Γ.

Theorem [FK96]: For every symmetric monotone A, if µ_p(A) > ε then µ_q(A) > 1 − ε for q = p + c_1 log(1/2ε)/log n. (c_1 is an absolute constant.)

For any ε > 0, the EXIT function h(p) transitions from ε to 1 − ε as p changes by O(1/log N)

35 / 41
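The sharp-threshold phenomenon can be seen numerically on the simplest symmetric monotone function, majority (used here as an illustrative stand-in for the code events on the slide): the window of p over which µ_p(A) climbs from 0.1 to 0.9 visibly shrinks as n grows.

```python
from math import comb

def majority_prob(n, p):
    """mu_p(A) for the symmetric monotone event A = 'more than half the bits are 1'."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n // 2 + 1, n + 1))

def threshold_window(n, grid=500):
    """Width of the p-interval over which mu_p(A) rises from 0.1 to 0.9."""
    lo = next(g / grid for g in range(grid + 1) if majority_prob(n, g / grid) > 0.1)
    hi = next(g / grid for g in range(grid + 1) if majority_prob(n, g / grid) > 0.9)
    return hi - lo

for n in (11, 101, 501):
    print(n, threshold_window(n))  # the transition window shrinks as n grows
```

Majority actually has an even sharper O(1/√n) window; the Friedgut–Kalai bound of O(1/log n) is what holds for every symmetric monotone event.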

Beyond Double Transitivity

The previous result also holds with less symmetry [KCP16].

Theorem 2: Let {C_n} be a sequence of codes over F_q with transitive permutation groups G^(n) and rates R_n → R ∈ (0, 1), where the size of the smallest orbit of G_0^(n) is unbounded. Then, {C_n} achieves capacity on the q-ary erasure channel (QEC) under symbol-MAP decoding.

36 / 41

Example: Product Codes

[Figure: an M × N grid of bit indices (i, j), shown for M = 3 rows and N = 5 columns, with the orbits of G_0 indicated by color.]

Setup
- M × N product code with doubly transitive component codes
- G_0 permutations fix (0, 0) but are transitive on the other rows/columns
- Each orbit has its own color; the smallest has size ≥ min{M − 1, N − 1}

37 / 41
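The orbit structure on the index grid can be made concrete with a toy model (an assumption for illustration: the stabilizer is modeled as all independent permutations of the nonzero rows and of the nonzero columns). An index's orbit is then determined by which of its coordinates are zero.

```python
def grid_orbits(M, N):
    """Orbits of indices (i, j) under independent permutations of
    rows 1..M-1 and columns 1..N-1 (a model for G_0, which fixes (0, 0))."""
    orbits = {}
    for i in range(M):
        for j in range(N):
            key = (i == 0, j == 0)   # orbit determined by which coords are zero
            orbits.setdefault(key, set()).add((i, j))
    return orbits

orbs = grid_orbits(3, 5)
sizes = sorted(len(o) for key, o in orbs.items() if key != (True, True))
print(sizes)  # [2, 4, 8]: smallest non-trivial orbit has size min(M-1, N-1) = 2
```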

Multi-Dimensional Product Codes

[Figure: an m-dimensional n × ··· × n array of code symbols; along each axis, the first k of the n positions carry information and the remaining n − k carry parity.]

- Codeword of the m-dimensional product code is an array of n^m bits
- Axis-aligned 1-D subarrays are codewords of an (n, k) linear code
- Overall rate: R_m = (k/n)^m

If the (n, k) code is the (n, n−1) single parity-check (SPC) code
- Rate R_m = ((n−1)/n)^m satisfies lim_{n→∞} R_n = e^{−1} for m = n

38 / 41
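Both rate statements are easy to check numerically; a minimal sketch (function name is mine):

```python
from math import exp

def product_rate(k, n, m):
    """Rate of the m-dimensional product of an (n, k) component code: (k/n)^m."""
    return (k / n) ** m

# n-dimensional product of (n, n-1) SPC codes: R_n = ((n-1)/n)^n -> 1/e
for n in (5, 50, 500, 5000):
    print(n, product_rate(n - 1, n, n))
print("1/e =", exp(-1))
```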

High-Dimensional SPC Product Codes

[Plot: MAP performance for the n-D product of (n, n−1) SPC codes. Decoder erasure rate P_b (10^0 down to 10^−4) vs. erasure probability p ∈ [0.55, 0.75], for
  n = 3, R = 0.30, N = 3^3 = 27
  n = 4, R = 0.32, N = 4^4 = 256
  n = 5, R = 0.33, N = 5^5 = 3125
  n = 6, R = 0.34, N = 6^6 = 46656
together with the capacity threshold for R = 0.34.]

39 / 41

Insight Achieved!

Simple question only involved RM codes
But its solution provided much more
The interplay between symmetry and capacity has a long history

Can this be extended beyond the BEC?
Either way, I believe the answer will provide new insight

40 / 41

Summary

Three simple questions and their (partial) answers
- What insights did they provide?
- What is the right simple question for your research?

Preparing this talk
- Great opportunity to reflect on Jack Wolf's impact on my life

41 / 41

Thank You!

Orbits and Stabilizers

For a group G acting on Z_N, the orbit of i ∈ Z_N is
  O_i ≜ { j ∈ Z_N | ∃π ∈ G, π(i) = j }.
Note that for i, j ∈ Z_N, either O_i = O_j or O_i ∩ O_j = ∅.
Thus, the set of orbits {O_ℓ} partitions the set Z_N.

For a group G acting on Z_N, the stabilizer subgroup of i ∈ Z_N is
  G_i ≜ { π ∈ G | π(i) = i }.

For G_i, let the size of the smallest non-trivial orbit be
  O_min(G_i) ≜ min_{j ∈ Z_N \ {i}} |O_j(G_i)|.
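These definitions are easy to experiment with on a toy example. The sketch below (my own illustration) represents a permutation π of Z_4 as a tuple with π[j] the image of j, and checks the orbit–stabilizer relation |G| = |O_i| · |G_i| for G = S_4:

```python
from itertools import permutations

def orbit(group, i):
    """Orbit of i: all images of i under the permutations in `group`."""
    return {pi[i] for pi in group}

def stabilizer(group, i):
    """Stabilizer subgroup of i: the permutations that fix i."""
    return [pi for pi in group if pi[i] == i]

G = list(permutations(range(4)))   # the full symmetric group on Z_4
print(orbit(G, 0), len(stabilizer(G, 0)))  # {0, 1, 2, 3} and 6; 24 = 4 * 6
```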

Multi-Dimensional Product Codes (2)

For the m-D product of (n, k) codes, we index bits by v ∈ Z_n^m
- If the (n, k) code is transitive, then the product code is transitive
- We can permute code bits by permuting code dimensions
- The code is preserved because the component codes are identical

Consider the size of the smallest non-trivial orbit, O_min(G_0)
- Assume bit 0 is associated with the index (0, ..., 0)
- Then, permuting code dimensions maps bit 0 to bit 0
- Permuting code dimensions induces orbits defined by the empirical distribution of the index vector

Minimal orbits have one non-zero entry, and O_min(G_0) ≥ m

The n-D product of (n, n−1) SPCs achieves capacity as n → ∞!
Some prior work on SPC product codes: [CTB95, RG01]

References I

[BKK+92] Jean Bourgain, Jeff Kahn, Gil Kalai, Yitzhak Katznelson, Nathan Linial. The influence of variables in product spaces. Israel Journal of Mathematics, 77(1-2):55-64, 1992.

[CTB95] Giuseppe Caire, Giorgio Taricco, Gérard Battail. Weight distribution and performance of the iterated product of single-parity-check codes. Annales des Télécommunications, 50(9-10):752-761, 1995.

[FK96] Ehud Friedgut, Gil Kalai. Every monotone graph property has a sharp threshold. Proc. Amer. Math. Soc., 124(10):2993-3002, 1996.

[KCP16] Santhosh Kumar, Robert Calderbank, Henry D. Pfister. Beyond double transitivity: Capacity-achieving cyclic codes on erasure channels. Proc. IEEE Inform. Theory Workshop, pages 241-245, Sept 2016.

References II

[KKL88] J. Kahn, G. Kalai, N. Linial. The influence of variables on boolean functions. Proc. IEEE Symp. on the Found. of Comp. Sci., pages 68-80, Oct 1988.

[KKM+17] Shrinivas Kudekar, Santhosh Kumar, Marco Mondelli, Henry D. Pfister, Eren Şaşoğlu, Rüdiger Urbanke. Reed-Muller codes achieve capacity on erasure channels. IEEE Trans. Inform. Theory, 63(7):4298-4316, 2017.

[RG01] David M. Rankin, T. Aaron Gulliver. Single parity check product codes. IEEE Trans. Inform. Theory, 49(8):1354-1362, 2001.