<<

arXiv:1801.07839v4 [cs.IT] 24 Nov 2020 ainGaut eerhFlosi rga ne rn No Grant under Program Fellowship Research Graduate dation upre yNFgat C-674 n C-842 n yt by and CCF-1844628 and CCF-1657049 grants NSF by supported ∗ † ‡ eateto optrSine tnodUiest.This ’18. University. RANDOM Stanford in Science, appeared Computer paper of this Department of version conference A eatet fCmue cec n lcrclEngineerin Electrical and Science Computer of Departments omto bu h itiuino itszso admlna oe a linear codes. random rank-metric of linear sizes random list str for of are result distribution sizes the list about the formation that sense the in codes, random uniformly than nfrl admcds hsipista admlna oe r nfa in are codes linear random that implies this codes; random uniformly nadto oipoigtecntn nkonls-iebud,ora our of bounds, values list-size all known for in constant simultaneously the simple—works improving to addition In EETas T 04 ooti nesnilytgtlwrbudo 1 of bound lower tight essentially ( c an of To obtain argument an to probability. improve 2014) high also IT, we with codes, hold Trans. linear to IEEE random 2002) for IT, bound per Trans. IEEE Zuckerman, parame different cover to arguments different together patched oe r ( are codes eoal suiomyrno oe,i h es htarno line random a that sense the in codes, random 1 uniformly as decodable − mrvdls-eoaiiyo admlna iaycodes binary linear random of list-decodability Improved odmntaeteapiaiiyo hs ehius euete t them use we techniques, these of applicability the demonstrate To u prahi osrnte neitnilagmn f(Gurus of argument existential an strengthen to is approach Our hr a enagetda fwr salsigta admlinear random that establishing work of deal great a been has There H ( p ) − ,H p, ǫ s( is ( ,O p, p ) /ǫ (1 )ls-eoal ihhg rbblt,frany for probability, high with 2)-list-decodable + /ǫ )ls-eoal ihhg rbblt.I hswr,w hwta su that show we work, this In probability. high with ))-list-decodable a Li Ray oebr2,2020 26, November † n ayWootters Mary and Abstract p hl rvoswrsobtaining works previous while , G 1656518. - DGE . okwsspotdb h ainlSineFoun- Science National the by supported was work ,Safr nvriy hswr a partially was work This University. Stanford g, eNFBFgatCCF-1814629. grant NSF/BSF he e regimes. ter ‡ cl smaller. ictly ai H˚astad, and wami, Sudan d()t rv similar a prove to (b) nd gmn—hc squite is rgument—which uuwm,Narayanan, Guruswami, p rbnr oeo rate of binary ar ct a banmr in- more obtain (a) o ∈ /ǫ mlmn u up- our omplement more (0 ntels ieof size list the on oe r slist- as are codes , 1 / list-decodable )and 2) L = O > ǫ (1 /ǫ ∗ ch 0. ) 1 Introduction

An error correcting code is a subset Fn, which is ideally “spread out.” In this paper, we C ⊆ 2 focus on one notion of “spread out” known as list decodability. We say that a code is (p,L)-list- C decodable if any Hamming ball of radius pn in Fn contains at most L points of : that is, if for 2 C all x Fn, (x,pn) L, where (x,pn) is the Hamming ball of radius pn centered at x. ∈ 2 |B ∩C| ≤ B Since list decoding was introduced in the 1950’s [Eli57, Woz58], it has found applications beyond communication, for example in pseudorandomness [Vad12] and complexity theory [Sud00]. A classical result in list decoding is known as the list-decoding capacity theorem:

Theorem 1.1 (List decoding capacity theorem). Let p (0, 1/2) and ε> 0. ∈ 1. There exist binary codes of rate 1 H(p) ε that are (p, 1/ε )-list decodable. − − ⌈ ⌉ 2. Any binary code of rate 1 H(p)+ ε that is (p,L)-list decodable up to distance p must have − L 2Ω(εn). ≥ Above, H(p) = (1 p) log (1 p) p log (p) is the binary entropy function. We say that a − − 2 − − 2 family of binary codes with rate approaching 1 H(p) which are (p,L)-list-decodable for L = O(1) − achieves list-decoding capacity.1 Theorem 1.1 is remarkable because it means than even when pn is much larger than half the minimum distance of the code—the radius at which at most one codeword c lies in any ∈ C Hamming ball—it still can be the case that only a constant number of c lie in any Hamming ∈ C ball of radius pn. Because of this, there has been a great deal of work attempting to understand what codes achieve the bound in Theorem 1.1. Fn The existential part of Theorem 1.1 is proved by showing that a uniformly random subset of 2 is (p, 1/ε)-list decodable with high probability. For a long time, uniformly random codes were the only example of binary codes known to come close to this bound, and today we still do not have many other options. There are explicit constructions of capacity-achieving list-decodable codes over large alphabets (either growing with n or else large-but-constant) [DL12, GX12, GX13], but over binary alphabets we still do not have any explicit constructions; we refer the reader to the survey [Gur09] for an overview of progress in this area. Because it is a major open problem to construct explicit binary codes of rate 1 H(p) ε with − − constant (or even poly(n)) list-sizes, one natural line of work has been to study structured random Fn approaches, in particular random linear codes. A random linear code 2 is simply a random Fn C ⊂ subspace of 2 , and the list-decodability of these codes has been well-studied [ZP81, GHSZ02, GHK11, CGV13, Woo13, RW14, RW18]. There are several reasons to study the list-decodability of random linear codes. Not only is it a natural question in its own right as well as a natural stepping stone in the quest to obtain explicit binary list-decodable codes, but also the list-decodabilility of random linear codes is useful in other coding-theoretic applications. One example of this is in concatenated codes and related constructions [GI04, GR08a, HW15, HRZW17], where a random linear code is used as a short inner code. Here, the linearity is useful because (a) a linear code can be efficiently described; (b) it is sometimes desirable to obtain a linear code at the end of the day, hence all components of the construction must be linear; and (c) as in [HW15] sometimes the linearity is required for the construction to work. To this end, the line of work mentioned above has aimed to establish that random linear codes are “as list-decodable” as uniformly random codes. That is, uniformly random codes are viewed

1Sometimes the phrase “achieves list-decoding capacity” is also used when L = poly(n); since this paper focuses on the exact constant in the O(1) term however, we use it to mean that L = O(1).

1 (as is often the case in ) as the optimal construction, and we try to approximate this optimality with random linear codes, despite the additional structure.

Our contributions. In this paper, we give an improved analysis of the list-decodability of random linear binary codes. More precisely, our contributions are as follows.

• A unified analysis. As we discuss more below, previous work on the list-decodability of random linear binary codes either work only in certain (non-overlapping) parameter regimes [GHK11, Woo13], or else get substantially sub-optimal bounds on the list-size [RW18]. Our argument obtains improved list size bounds over all these results and works in all parameter regimes. Our approach is surprisingly simple: we adapt an existential argument of Guruswami, H˚astad, Sudan and Zuckerman [GHSZ02] to hold with high probability. Extending the argument in this way was asked as an open question in [GHSZ02] and had been open until now.

• Improved list-size for random linear codes. Not only does our result imply that random linear codes of rate 1 H(p) ε are (p,L)-list-decodable with list-size L = O(1/ε), in fact − − we show that L H(p)/ε + 2. In particular, the leading constant is small and—to the best ≤ of our knowledge—is the best known, even existentially, for any list-decodable code.

• Tight list-size lower bound for uniformly random codes. To complement our upper bound, we strengthen an argument of Guruswami and Narayanan [GN14] to show that a uniformly random binary code of rate 1 H(p) ε requires L (1 γ)/ε for any constant − − ≥ − γ > 0 and sufficiently small ε. In other words, the list size of 1/ε in Theorem 1.1 is tight even in the leading constant. Thus, random linear codes are, with high probability, list-decodable with smaller list sizes than completely random codes.2

• Finer-grained information about the combinatorial structure of random linear codes. We extend our argument to obtain more information about the distribution of list sizes of random linear codes. More precisely, we obtain high-probability bounds on the number of points x so that the list size at x, L (x) := (x,pn) , is at least ℓ. C |B ∩ C| • Results for rank-metric codes. Finally, we adapt our argument for random linear codes to apply to random linear rank-metric codes. As with standard (Hamming-metric) codes, recent work aimed to show that random linear rank-metric codes are nearly as list-decodable as uniformly random codes [Din15, GR17]. Our approach establishes that in fact, random linear binary rank-metric codes are more list-decodable than their uniformly random coun- terparts in certain parameter regimes, in the sense that the list sizes near capacity are strictly smaller. Along the way, we show that low-rate random linear binary rank-metric codes are list-decodable to capacity, answering a question of [GR17].

On the downside, we note that our arguments only work for binary codes and do not extend to larger alphabets; additionally, our positive results do not establish average-radius list-decodability, a stronger notion which was established in some of the previous works [CGV13, Woo13, RW18]. Subsequent work [GLM+] extends our main result on list-decoding random linear binary codes to average-radius list-decodability.

2In retrospect, this may not be surprising: for example, it is well-known that random linear codes have better distance than completely random codes. However, the fact that we are able to prove this is surprising to the authors, since it requires taking advantage of the dependence between the codewords, rather than trying to get around it.

2 1.1 Outline of paper After a brief overview of the notation in §1.2, we proceed in §2 with a survey of related work for both random linear codes and rank-metric codes, and we formally state our results in this context. In §3, we prove Theorem 2.4, which establishes our upper bound for random linear binary codes. In §4, we expand upon the ideas in the proof of Theorem 2.4 to prove Theorem 2.6, characterizing the list size distribution of random linear codes. In §5, we prove Theorem 2.11, which adapts our upper bound to random linear binary rank-metric codes. To round out the story, we must prove lower bounds on the list sizes of uniformly random codes for both standard and rank-metric codes. We state these lower bounds in Theorems 2.5 and 2.12. These proofs closely follow the approach in [GN14] and are proved in Appendices A and B, respectively.

1.2 Notation We use standard Landau notation O( ), Ω( ), Θ( ). We use the notation Ω ( ) to mean that a · · · x · multiplicative factor depending on the variable(s) x is suppressed. Throughout most of the paper, we are interested in binary codes Fn of block length n. The dimension of a code is defined C ⊆ 2 C as k = log2 , and the rate is the ratio k/n. We say that a binary code is linear if it forms a |C| Fn of 2 . We define a random linear binary code of rate R to be the span of k = Rn independently random vectors b , . . . , b Fn.3 1 k ∈ 2 For a boolean statement ϕ, let I[ϕ] be 1 if ϕ is true and 0 if ϕ is not true. For two points x,y Fn, we use ∆(x,y) = n I[x = y ] to denote the between x and y, ∈ 2 i=1 i i where, for an event , I[ ] is 1 if occurs and 0 otherwise. For x Fn, r [0,n], we define the E E P E ∈ 2 ∈ Hamming ball (x, r) of radius r centered at x to be (x, r) = y Fn : ∆(x,y) r , and the B B { ∈ 2 ≤ } volume of (x, r) to be Vol(n, r) := (0n, r) = r n . We use the well known bound that, B |B | i=0 i for any p [0, 1/2], Vol(n,pn) 2H(p)n, where H(p) = (1 p) log (1 p) p log (p) is the ∈ ≤ P − − 2 − − 2 binary entropy function. One of our main technical results is about the distribution of list sizes of points x Fn: given a code and p (0, 1/2), we define the list size of a point x Fn to be ∈ 2 C ∈ ∈ 2 L (x) := (x,pn) . C |B ∩ C| For α > 0, β R, let exp (β) := αβ and assume α = e when it is omitted. For two sets ∈ α A, B Fn, define the sumset A + B = a + b : a A, b B . When b Fn, let A + b denote ⊆ 2 { ∈ ∈ } ∈ 2 A + b . { } 2 Previous Work and Our Results

In §2.1 below, we survey related work on the list-decodability of random linear binary codes, and state our results. In §2.2, we do the same for our results on the list-decodability of random linear binary rank-metric codes.

2.1 Random Linear Codes The list-decodability of random linear binary codes has been well studied. Here we survey the results that are most relevant for this work. As this work focuses on binary codes, we focus this survey on results for binary codes, even though many of the works mentioned also apply to general

3We note that this definition is slightly different than the standard definition, which is a uniformly random subspace of dimension k. However, our definition is easier to work with. Furthermore, since uniformly random b1,...,bk have rank strictly less than k with probability at most 2−(n−k), and since each dimension k code is represented in the same number of ways, our results hold also for the more standard definition.

3 q-ary codes. We additionally remark that, in contrast to the large alphabet setting [GR08b], capacity achieving binary codes have no known explicit constructions. A modification of the proof of the list decoding capacity theorem shows that a random linear code of rate 1 H(p) ε is (p, exp(O( 1 )))-list decodable [ZP81]. However, whether or not random − − ε linear codes of this rate with list-sizes that do not depend exponentially on ε remained open for decades: this question was explicitly asked in [Eli91]. A first step was given in the work of Guruswami, H˚astad, Sudan and Zuckerman [GHSZ02], who proved via a beautiful potential-function-based-argument that there exist binary linear codes or rate 1 H(p) ε which are (p, 1/ε)-list-decodable. However, this result did not hold with high − − probability. Our approach relies heavily on the approach of [GHSZ02], and we return to their argument in §3. Over the next 15 years, a line of work [GHK11, CGV13, Woo13, RW14, RW15, RW18] has focused on the list-decodability (and related properties) of random linear codes, which should hold with high probability. The works most relevant to ours are [GHK11, Woo13], which together more or less settle the question. We state these results here for binary alphabets, although both works address larger alphabets as well. The first result, of [GHK11], establishes a result for a constant p, bounded away from 1/2. Theorem 2.1 (Theorem 2 of [GHK11]). Let p (0, 1/2). Then there exist constants C ,δ > 0 ∈ p such that for all ε > 0 and sufficiently large n, for all R 1 H(p) ε, if Fn is a random ≤ − − C ⊆ 2 linear code of rate R, then is (p,C /ε)-list decodable with probability at least 1 2−δn. C p − However, C is not small and tends to as p approaches 1/2. The following result of [Woo13] p ∞ fills in the gap when p is quite close to 1/2.

Theorem 2.2 (Theorem 2 of [Woo13]). There exist constants C1,C2 so that for all sufficiently small ε> 0 and sufficiently large n, for p = 1/2 C √ε and for all R 1 H(p) ε, if Fn is − 1 ≤ − − C ⊆ 2 a random linear code of rate R, then is (p,C /ε)-list decodable with probability at least 1 o(1). C 2 − The list-decoding capacity theorem implies that we cannot hope to take the rate R substantially larger than 1 H(p) ε and obtain a constant list size. Moreover, the list size Θ(1/ε) is optimal for − − both random linear codes and uniformly random codes [Rud11, GN14]. More precisely, Guruswami and Narayanan show the following theorem (which we have specialized to binary codes). Theorem 2.3 (Theorem 20 of [GN14]). Let ε > 0. A uniformly random binary code of rate 1 H(p) ε is (p, (1 H(p))/ε)-list decodable with probability at most exp( Ω (n)).4 − − − − p,ε We note that for general codes (not uniformly random or random linear) it is still unknown what the “correct” list size L is in terms of ε, although there are results in particular parameter regimes [Bli86, GV05] and for stronger notions of list-decodability [GN14].

2.1.1 Our results for random linear codes We show that high probability a random linear binary code of rate 1 H(p) ε is (p,L)-list- − − decodable with L H(p)/ε, while a uniformly random binary code of the same rate requires ∼ L (1 γ)/ε. More precisely, the upper bound is as follows (proved in §3). ≥ − Theorem 2.4. Let ε> 0 and p (0, 1/2). A random linear code of rate 1 H(p) ε is (p,H(p)/ε+ ∈ − − 2)-list decodable with probability 1 exp( Ω (n)). − − ε 4In fact, in [GN14], Theorem 20 is stated with a list size of (1−H(p))/2ε as a lower bound. However, the constant can be improved to 1 − H(p), because the factor of 2 is introduced to handle an additive constant term. Thus, for sufficiently large n their argument proves the stronger statement stated above.

4 Theorem 2.4 improves upon the picture given by Theorems 2.1 and 2.2 in two ways. First, the leading constant on the list size, which is H(p), improves over both the constant Cp from Theorem 2.1 (which blows up as p 1/2) and on the constant C from Theorem 2.2 (which the → 2 authors do not see how to make less than 2). Moreover, when p 1/2, Theorem 2.4 improves on → Theorem 2.2 in that it decouples p from ε: in Theorem 2.2, we must take p = 1/2 O(√ε) and − R = 1 H(p) ε, while in Theorem 2.4, p and ε may be chosen independently. Thus, Theorem 2.4 − − offers the first true “list-decoding capacity theorem for binary linear codes,” in that it precisely mirrors the quantifiers in Theorem 1.1. The list size of H(p)/ε + 2 is smaller than the list size of 1/ε given by the classical list decoding capacity theorem for uniformly random codes. Further, the following negative result shows that the list size of 1/ε given by uniformly random binary codes in the list decoding capacity theorem is tight, even in the leading constant of 1.

Theorem 2.5. For any p (0, 1/2) and ε > 0, there exists a γ = exp( Ω ( 1 )) and n N ∈ p,ε − p ε p,ε ∈ such that for all n n , a random code Fn of rate R = 1 H(p) ε is with probability ≥ p,ε C ⊆ 2 − − 1 exp( Ω (n)) not (p, 1−γp,ε 1)-list decodable. − − p,ε ε − The proof of Theorem 2.5 is obtained by tightening the second moment method proof of [GN14], and is given in Appendix A. Theorem 2.5, combined with Theorem 2.4, implies that, for all p ∈ (0, 1/2) and sufficiently small ε, random linear codes of rate 1 H(p) ε with high probability − − can be list decoded up to distance p with smaller list sizes than uniformly random codes. Perhaps surprisingly, the difference between the list size upper bound in Theorem 1.1 and the lower bound in Theorem 2.5 is bounded by 2 as ε 0, implying that the “correct” list size of a uniformly → random code is tightly concentrated between 1/ε 1 for small ε. ⌊ ⌋± We are unaware of results in the literature that give even the existence of binary codes list decodable with list size better than H(p)/ε. We remark that the Lovasz Local Lemma also gives the existence of (p,H(p)/ε) list decodable codes, matching our high probability result for random linear codes. Though we suspect this argument is known, we are not aware of a published version, so we include a proof in Appendix C for completeness. The techniques that we use to prove Theorem 2.4 can be refined to give more combinatorial information about random linear codes. It is our hope that such information will help in further derandomizing constructions of binary codes approaching list-decoding capacity. In §4, we prove the following structural result about random linear binary codes.

Theorem 2.6. Fix L 1 and γ (0, 1). For all sufficiently large n, if Fn is a random linear ≥ ∈ C ⊆ 2 code of dimension k, then with probability 1 exp( Ω(γn)), for all 1 ℓ L, − − ≤ ≤

x : L (x) ℓ ℓ 2 |{ C ≥ }| 2−n(1−H(p)) 2k 2γℓ n. 2n ≤ · ·   To interpret this result, it is helpful to think of γ close to 0 and k = n(1 H(p) ε). Then, − − with high probability over the choice of a random linear code , the fraction of centers x Fn with C ∈ 2 “list size at least ℓ” decays approximately exponentially as 2−nℓε. By an appropriately small choice of γ, Theorem 2.4 follows as a corollary of Theorem 2.6 (see Corollary 4.3), but as a warm-up to Theorem 2.6 we present a proof of Theorem 2.4 independent of Theorem 2.6 in §3.

Remark 2.7. Theorem 2.6 implies that in a random linear code of rate 1 H(p) ε, with − − high probability over the choice of code, we have Pr [L (x) 2] . 2−2nε. On the other hand, x C ≥ we know that L (x) = C B(0,pn), so if the maximum list size L is a constant, we have x C | | · Pr [L (x) 1] 1 B(0,pn) 2−nℓε(1+o(1)). Thus, “most” centers x within pn of a codeword x C ≥ P≥ L ·|C|·| |≥

5 c are not within pn of any other codeword c′ = c. This is in line with the conventional wisdom ∈C 6 from the Shannon model: with high probability, random linear codes achieve capacity on the BSC, so for a random linear code, a random center x obtained by sending a codeword c through the ∈C BSC(p) has L (x) 1 with high probability. (See also [RU10]). Thus, C ≤ Theorem 2.6 recovers this intuition for list size 1, and quantitatively extends it to list sizes larger than 1.

2.2 Rank-Metric Codes As an application of our techniques for random linear codes, we turn our attention to rank metric codes. Rank metric codes, introduced by Delsarte in [Del78], are codes Fm×n; that is, the C ⊆ q codewords are m n matrices, where typically m n. The distance between two codewords X × ≥ and Y is given by the rank of their difference: ∆ (X,Y ) := 1 rank(X Y ), where ∆ is called the R n − R rank metric. We denote the rank ball by (X,p) := Y Fm×n : ∆ (X Y ) pn , and say Bq,R ∈ q R − ≤ that a rank metric code Fm×n is (p,L)-list-decodable if (X,p) L for all X Fm×n. C ⊆ q  |Bq,R ∩C|≤ ∈ q The rate R of a rank metric code Fm×n is R := log ( )/(mn). C ⊂ q q |C| Rank metric codes generalize standard (Hamming metric) codes, which are simply diagonal rank metric codes. The study of rank metric codes has been motivated by a wide range of ap- plications, including magnetic storage [Rot91], cryptography [GPT91, Loi10, Loi17], space-time coding [LGB03, LK05], network coding [KK08, SKK08], and distributed storage [SRV12, SRKV13]. The natural “list-decoding capacity” for rank metric codes is R = (1 p)(1 (n/m)p), which − − is the analog of the Gilbert-Varshamov bound [GY08]. It was shown in [Din15, GR17] that this is achievable by a uniformly random rank metric code. Theorem 2.8 ([GR17], Proposition A.1.5). Let ε > 0 and p (0, 1) and suppose that m,n are ∈ sufficiently large compared to 1/ε. A uniformly random code Fm×n of rate R = (1 p)(1 bp) ε C ⊆ q − − − is (p, 1/ε )-list-decodable with probability at least 1 O(q−εmn), where b = n/m. ⌈ ⌉ − As with standard (Hamming-metric) codes, it is interesting to study the list-decodability of random linear rank-metric codes; we say that Fn×m is linear if it forms an F -linear space. It C ⊆ q q is shown in [Din15] that no linear code can beat the bound in Theorem 2.8. Theorem 2.9. Let b = lim n be a constant and L 2o(mn). Then for any R (0, 1) n→∞ m ≤ ∈ and p (0, 1), a (p,L)-list-decodable linear rank metric code Fm×n with rate R satisfies ∈ C ⊆ q R (1 p)(1 bp). ≤ − − There has been a great deal of work aimed at establishing (or refuting) the list-decodability of explicit rank metric codes. It is shown in [WZ13] that Gabidulin codes [Gab85]—the rank-metric analog of Reed-Solomon codes—are not list-decodable to capacity, or even much beyond half their minimum distance. However, there have been works [GX13, GWX16] designing explicit codes meeting the limitation of Theorem 2.9. Following the approach of [ZP81] for Hamming metric codes, [Din15] shows that random linear rank metric codes of rate R = (1 p)(1 bp) ε are (p, exp(O(1/ε))-list-decodable, where as above − − − b = n/m. In a recent paper of Guruswami and Resch [GR17], this result was strengthened to give a list size of O(1/ε). Theorem 2.10 ([GR17]). Let p (0, 1) and q 2. There is some constant C so that the ∈ ≥ p,q following holds. For all sufficiently large n,m with b = n/m, a random linear rank metric code Fm×n of rate R = (1 p)(1 bp) ε is (p,C /ε)-list decodable with high probability. C ⊆ q − − − p,q 5In [GR17], the result is stated with L = O(1/ε), but an inspection of the proof shows that we may take the leading constant to be 1.

6 The proof of Theorem 2.10 uses ideas from the approach of [GHK11] to prove Theorem 2.1. This is a beautiful argument, but as with the results of [GHK11], the result of [GR17] suffers from the fact that C tends to as p approaches 1. It is shown in [GR17], Proposition A.2, that when p,q ∞ p = 1 η, a uniformly random rank metric code of rate R = (η ηb+η2b)/2 is (p, 4/(η ηb+η2b))- − − − list-decodable, and that work poses the question of whether or not a random linear rank metric code can achieve this. Our results, described in the next section, show that the answer is “yes” for q = 2.

2.2.1 Our Results for Rank Metric Codes By applying the techniques in the proof of Theorem 2.4, we prove the following upper bound on the list size of random linear binary rank-metric codes. Theorem 2.11. Let p (0, 1) and choose ε > 0. There is a constant C so that the following ∈ ε holds. Let m and n be sufficiently large positive integers with n 0 and suppose m,n are sufficiently large so that b = n/m. Let Fm×n be C ⊆ q a uniformly random rank metric code of rate R = (1 p)(1 bp) ε. Then is (p, (1 p)(1 bp)/ε 2)- − − − C − − − list decodable with probability at most exp( Ω (n)). − p,ε Theorem 2.12 again uses the method of [GN14], and we prove it in Appendix B. Together, Theorems 2.11 and 2.12 show that for some values of p, random linear binary rank metric codes have a strictly smaller list size than uniformly random rank metric codes with the same param- eters. In particular, the upper bound of Theorem 2.11 is strictly smaller than the lower bound 1−b of Theorem 2.12 whenever p < 2 . For larger values of p, we remark that the list size obtained by Theorem 2.11 is still strictly smaller than the 1/ε list size given by uniformly random codes in Theorem 2.8, even though in this case we don’t have a lower bound which proves that this is tight.

3 Simplified result for random linear binary codes

In this section, we prove Theorem 2.4, which we restate here. Theorem 3.1 (Theorem 2.4, restated). Let ε> 0 and p (0, 1/2). A random linear code of rate ∈ 1 H(p) ε is (p,H(p)/ε + 2)-list decodable with probability 1 exp( Ω (n)). − − − − ε Theorem 2.4 also follows from our more refined result, Theorem 2.6. However, we present a different proof of Theorem 2.4 first, for two reasons: (1) the results in Section 5 follow the argument structure in this section, and (2) the argument in this section shows how to extend the argument in [GHSZ02] to a high probability result. Before it was known that a typical random linear code is (p, O(1/ε))-list decodable, Guruswami, H˚astad, Sudan and Zuckerman [GHSZ02] proved the existence of binary linear codes of rate 1 H(p) ε that are (p, 1/ε)-list decodable. However, their − − argument did not work with high probability, and the authors explicitly stated this as a drawback of their proof. This section shows how to make the argument in [GHSZ02] work with high probability. We start by reviewing the approach of [GHSZ02], which is the of our proof.

7 3.1 The approach of [GHSZ02] The approach of [GHSZ02] followed from a beautiful potential-function argument, which is the basis of our approach and which we describe here. Let k := Rn = (1 H(p) ε)n. We choose vectors b , . . . , b one at a time, so that the code − − 1 k := span(b , . . . , b ) remains “nice”: formally, so that a potential function S˜ remains small. Once Ci 1 i Ci we have picked all k vectors, we set = , and the fact that S˜ is small implies list-decodability. C Ck Ck Recall that for a code and x Fn, we set L (x)= (x,pn) . Define C ∈ 2 C |B ∩ C| 1 S˜ := 2εnLC(x). C 2n x∈Fn X2 It is not hard to show that for any vectors b , . . . , b Fn, 1 i ∈ 2 2 E S˜ b1, . . . , bi S˜ . (1) Fn Ci+{0,bi+1} Ci bi+1∼ 2 | ≤ h i That is, when a uniformly random vector b is added to the basis b , . . . , b , we expect the i+1 { 1 i} potential function not to grow too much. Hence, there exists a choice of vectors b1, . . . , bk so that 2 6 S˜C S˜ for i = 0, 1,...,k 1. i+1 ≤ Ci − As = 0 , we have S˜ 1 + 2−n(1−H(p)−ε). Setting = = span(b , . . . , b ), we have C0 { } C0 ≤ C Ck 1 k k k 2 S˜ S˜2 1 + 2−n(1−H(p)−ε) exp 2k−n(1−H(p)−ε) = e C ≤ C0 ≤ ≤     by our choice of k. This implies that 2εnLC(x) e 2n, and in particular, for all x Fn, we have x ≤ · ∈ 2 2εnLC(x) e 2n. Thus, for all x, L (x) 1 + o(1), as desired. ≤ · C P ≤ ε The approach of [GHSZ02] is extremely clever, but these ideas have not, to the best of our knowledge, been used in subsequent work on the list-decodability of random linear codes. One reason is that the crux of the argument, which is (1), holds in expectation, and it was not clear how to show that it holds with high probability; thus, the result remained existential, and other techniques were introduced to study typical random codes [GHK11, CGV13, Woo13, RW14, RW18].

3.2 Proof of Theorem 2.4 We improve the argument of [GHSZ02] in two ways. First, we show that in fact, (1) essentially holds with high probability over the choice of bi+1, which allows us to use the approach sketched above for random linear codes. Second, we introduce one additional trick which takes advantage of the linearity of the code in order to reduce the constant in the list size from 1 to H(p). Before diving into the details, we briefly describe the main ideas. The first improvement follows from looking at the potential function in the right way. In this ˜ ˜ ˜ paragraph, all o(1) terms are exponentially small in n. Our goal is SCk O(1). Write SCi = 1+TCi . ˜ ˜ ≤ By above, TC0 = SC0 1 = o(1). We show that with high probability, for all i k, we have ˜ − ≤ TCi = o(1). In the [GHSZ02] argument we have E S˜ S˜2 = (1+ T˜ )2 =1+2T˜ (1 + o(1)), Ci+1 ≤ Ci Ci Ci and so E T˜ = 2T˜ (1 + o(1)). One can show that, always, 2T˜ T˜ . By Markov’s inequality, Ci+1 Ci Ci ≤ Ci+1 T˜ = 2T˜ (1 + o(1)) with probability 1 o(1), for appropriately chosen o(1) terms. Union Ci+1 Ci − 6 As a technical detail, one needs to be careful that bi+1 ∈C/ i. One can guarantee bi+1 ∈C/ i by carefully examining the proof of (1), or use (1) to get a similar equation where we additionally condition bi+1 ∈C/ i.

8 ˜ bounding over the o(1) failure probabilities in the k steps, we conclude that TCi grows roughly as slowly as in the existential argument, giving the desired list decodability. The second improvement follows from the linearity of the code. In the last step of the [GHSZ02] argument, we replace the summation “ ” in 2εnLC(x) e 2n with a “ x.” We can save a bit x x ≤ · ∀ because, by linearity, the contribution 2εnLC(x) is the same for all x in a coset y + . P P C Now we go through the details. It is convenient to change the definition of the potential function very slightly: losing the tilde, define, for a code Fn, C ⊂ 2 εnLC (x) AC(x) := 2 1+ε and SC := E [AC (x)] and TC := SC 1. Fn x∼ 2 −

As noted above, it is helpful to think of TC as a very small term; we would like to show—in accordance with (1)—that TC approximately doubles each time we add a basis vector. The term ˜ 1 SC differs from the term SC above in that AC(x) has an extra factor of 1+ε in the exponent. This is an extra “slack” term that helps guarantee a high probability result under the same parameters. However, this definition does not change how the potential function behaves. In particular, we still have the following lemma:

Lemma 3.2 (Following [GHSZ02]). For all linear Fn and all b Fn, C ⊆ 2 ∈ 2 L (x) L (x)+ L (x + b) (2) C+{0,b} ≤ C C A (x) A (x) A (x + b), (3) C+{0,b} ≤ C · C with equality if and only if b / . ∈C Proof. To see (2), notice that

L (x) = (x,pn) ( ( + b)) C+{0,b} |B ∩ C∪ C | (x,pn) + (x,pn) ( + b) ≤ |B ∩ C| |B ∩ C | = (x,pn) + (x + b, pn) |B ∩ C| |B ∩ C| = LC(x)+ LC(x + b), with equality in the second line if and only if b . Inequality (3) follows as a consequence of (2), 6∈ C and this proves the lemma.

The following lemma is the key step of our proof, establishing that when TC is small, it essentially doubles with high probability each time we add a basis vector.

Lemma 3.3. If is a fixed linear code, C 1.5 0.5 Pr S 1 + 2TC + T T . Fn C+{0,b} C C b∼ 2 ≥ ≤   Proof. By Lemma 3.2, for all b,

SC+{0,b} = E AC+{0,b}(x) x

E [AC(x)AC (x + b)] ≤ x = E [ 1+ AC(x)+ AC(x + b) + (AC (x) 1)(AC (x + b) 1)] x − − − = 1+2TC + E [(AC(x) 1)(AC (x + b) 1)] . x − −

9 Fn Over the randomness of b and x, we have x and x + b are independently uniform over 2 , so

2 E E [(AC(x) 1)(AC (x + b) 1)] = E [AC (x) 1] E [AC(x + b) 1] = TC . (4) b x − − b,x − · b,x − As A (x) 1 is always nonnegative, we have, by Markov’s inequality, C − 2 1.5 1.5 TC 0.5 Pr SC+{0,b} 1 + 2TC + T Pr E [(AC (x) 1)(AC (x + b) 1)] T = T . C x C 1.5 C b ≥ ≤ b − − ≥ ≤ TC   h i Iterating Lemma 3.3 gives the following.

Lemma 3.4. Let p (0, 1/2) and ε (0, 1 H(p)). Let Fn be a random linear code of rate ∈ ∈ − C ⊂ 2 1 H(p) ε. Then, with probability 1 exp( Ω (n)), we have S < 2. − − − − ε C Proof. It suffices to prove this when n is sufficiently large in terms of ε. As in §3.1, let b , b , . . . , b 1 2 k ∈ Fn be independently and uniformly chosen, and let = span b , . . . , b . Consider the sequence 2 Ci { 1 i} −n(1−H(p)− ε ) δ0 := 2 1+ε 1.5 δi := 2δi−1 + δi−1.

2 i+1 − ε n We can verify by induction that for i n(1 H(p) ε), we have δi < 2 δ0 < 2 2 . To see this, ≤ − j+1 − notice that the base case is trivial, and if δj < 2 δ0 for j < i, we have

i−1 i−1 δ = 2δ (1 + δ0.5 ) = 2iδ (1 + δ0.5) 2iδ exp δ0.5 < 2i+1δ . i i−1 i−1 0 · j ≤ 0 ·  j  0 jY=0 Xj=0   In the first two equalities, we applied the definitions of δi and δi−1,...,δ1, respectively. In the first inequality, we used the estimate 1 + z ez, and in the second we used the inductive hypothesis 2 ≤ − ε n δj < 2 2 for j < i and that n is sufficiently large. By this induction, we conclude that, if 2 − ε n k = n(1 H(p) ε), then δ < 2 2 . − − k Let b , . . . , b Fn be randomly chosen vectors, and let = span(b , . . . , b ) with = . Call 1 k ∈ 2 Ci 1 i Ck C good if T < δ . We have Ci Ci i H(p)n εn Vol(n,pn) εn 2 T = S 1 = 2 1+ε 1 < 2 1+ε = δ , C0 {0} − − · 2n · 2n 0   so is always good. On the other hand, by the definition of δ and Lemma 3.3, if is good, C0 i Ci Pr [ not good] = Pr T δ Pr T 2T + T 1.5 T 0.5 < δ0.5. Ci+1 Ci+1 ≥ i+1 ≤ Ci+1 ≥ Ci Ci ≤ Ci i Thus, with probability at least    

2 1 δ0.5 + δ0.5 + + δ0.5 > 1 k2−ε n/4 1 2−Ωε(n) − 0 1 · · · k−1 − ≥ −

 ε2n − 2 we have TCi < δi for all i = 0,...,k. In particular, TC = TCk < δk < 2 . Thus, SC =1+ TC < 2 with probability 1 exp( Ω (n)), completing the proof of Lemma 3.4. − − ε Finally, we prove the following lemma, which implies Theorem 2.4.

Lemma 3.5. Any linear code Fn of rate R with S < 2 is (p, 1−R + 1)-list decodable. C ⊆ 2 C ε

10 Proof. Suppose for sake of contradiction that there exists x∗ Fn such that (x∗,pn) > 1−R +1. ∈ 2 |B ∩C| ε For all x Fn and c , we have ∈ 2 ∈C (x + c,pn) = (x,pn) ( c) = (x,pn) , |B ∩ C| |B ∩ C− | |B ∩ C| so (x∗ + c,pn) > (1−R) + 1 for all c . If S < 2, then we have |B ∩ C| ε ∈C C ε 2n+1 > 2nS = exp n (x,pn) C 2 · 1+ ε · |B ∩ C| x∈Fn X2   ε exp n (x∗ + c,pn) ≥ 2 · 1+ ε · |B ∩ C| Xc∈C   ε 1 R exp n − + 1 ≥ 2 · 1+ ε · ε Xc∈C    1 R + ε = exp n − |C| · 2 · 1+ ε   εR = exp n 1+ . 2 1+ ε    which is a contradiction for large enough n. Proof of Theorem 2.4. Apply Lemma 3.4 and then Lemma 3.5 for R = 1 H(p) ε. − − Remark 3.6. We do not see how to extend this proof to larger alphabets. If, for example, q = 3, Lemma 3.3 would need to say Pr[SC+{0,b,2b} > 1 + 3TC + o(TC)] < o(1). However, the current proof would fail to show this, as we could not separate the expectation in (4); that is, we cannot say

E [(AC(x) 1)(AC (x + b) 1)(AC (x + 2b) 1)] b,x − − −

= E [AC (x) 1] E [AC(x + b) 1] E [AC(x + 2b) 1] b,x − · b,x − · b,x −

4 Characterizing the list size distribution

In this section we establish Theorem 2.6. For a code , let C (≥ℓ) P := E [I[LC(x) ℓ] LC(x)] . C Fn x∼ 2 ≥ · In this way, we have P (≥ℓ) Pr[L (x) ℓ] for ℓ 1. Thus, to prove, Theorem 2.6, it suffices to C ≥ C ≥ ≥ prove the following. Theorem 4.1. Fix L 1 and γ (0, 1). For all sufficiently large n, if Fn is a random linear ≥ ∈ C ⊆ 2 code of dimension k, then with probability 1 exp( Ω(γn)), for all 1 ℓ L, − − ≤ ≤ ℓ 2 P (≥ℓ) 2−n(1−H(p)) 2k 2γℓ n. (5) C ≤ · · Proof of Theorem 4.1. The proof of Theorem 4.1 proceeds by induction on k. For the base case k = 0, we show (5) holds with probability 1. For ℓ 2, (5) holds because then P (≥ℓ) = 0. When ≥ {0} ℓ = 1, we then have P (≥1) = 2−n Vol(n,pn) 2−n(1−H(p)). {0} · ≤ Now that we have established the base case of k = 0, we proceed by induction. Lemma 4.2 provides the inductive step; similar to the approach in §3, it shows that at every step the “expected behavior” holds with high probability.

11 Lemma 4.2. Let γ > 0, and suppose that Fn is a linear code of dimension k such that for all C ⊆ 2 1 ℓ L, we have ≤ ≤

ℓ 2 P (≥ℓ) 2−n(1−H(p)) 2k 2γℓ n. (6) C ≤ · ·   Then, for a uniformly chosen b Fn, with probability at least 1 2L3 2−2γn over the choice of b, ∈ 2 − · we have, for all 1 ℓ L, ≤ ≤

ℓ 2 P (≥ℓ) 2−n(1−H(p)) 2k+1 2γℓ n. (7) C+{0,b} ≤ · ·   Proof. For all codes 0, 1 n, we have C ⊂ { } P (≥1) = 2−n i x : L (x)= i = 2−n (x, c) : x Fn, c , ∆(x, c) pn C · |{ C }| |{ ∈ 2 ∈C ≤ }| Xi≥1 = 2−n Vol(n,pn) 2−n(1−H(p)) 2k. · |C| · ≤ · so (6), and thus (7) (which is just (6) for C + 0, b ), always holds for ℓ = 1. { } Now fix a code satisfying (6), and assume that ℓ 2. We have, for any b Fn, C ≥ ∈ 2 (≥ℓ) I P = E [LC+{0,b}(x) ℓ] LC+{0,b}(x) C+{0,b} x ≥ · E [I[LC (x)+ LC(x + b) ℓ] (LC (x)+ LC(x + b))] (By Lemma 3.2) ≤ x ≥ · = E [I[LC (x)+ LC(x + b) ℓ] LC(x)] + E [I[LC (x)+ LC(x + b) ℓ] LC(x + b)] x ≥ · x ≥ · ′ = 2 E [I[LC (x)+ LC(x + b) ℓ] LC(x)] Set x = x + b in second E · x ≥ · ℓ−1  = 2 E I[LC (x) ℓ] LC(x)+ I[LC(x)= i, LC (x + b) ℓ i] LC(x) · x " ≥ · ≥ − · # Xi=0 ℓ−1 (≥ℓ) = 2 P + 2 i E [I[LC(x)= i] I[LC(x + b) ℓ i]] · C · · x · ≥ − Xi=0 =: (⋆). (8)

Taking expectations of (⋆) over a random b, we have,

ℓ−1 (≥ℓ) E [(⋆)] = 2 P + 2 i E [I[LC(x)= i] I[LC (x + b) ℓ i]] Fn C b∼ 2 · · · b,x · ≥ − Xi=0 ℓ−1 (≥ℓ) = 2 P + 2 i Pr[LC(x)= i] Pr[LC(x + b) ℓ i] · C · · b,x · b,x ≥ − Xi=0 ℓ−1 < 2 P (≥ℓ) + 2L P (≥i) P (≥ℓ−i) · C · C · C Xi=0 Fn The second equality uses that x and b are independent and uniform on 2 so x and x + b are independent. The inequality uses that i

12 ℓ i] P (≥ℓ−i). Continuing, we bound − ≤ C ℓ−1 E (⋆) 2 P (≥ℓ) < 2L P (≥i) P (≥ℓ−i) b − · C · C · C h i Xi=1 ℓ−1 i+(ℓ−i) 2 2 2L 2−n(1−H(p)) 2k 2γ(i +(ℓ−i) )n (By (6)) ≤ · · · Xi=1   ℓ 2 2L2 2−n(1−H(p)) 2k 2γ(ℓ −2)n ≤ · · ·   By definition of (⋆), we have (⋆) 2P (≥ℓ) 0 always, so we may apply Markov’s inequality to − C ≥ obtain ℓ 2 Pr (⋆) 2P (≥ℓ) 2−n(1−H(p)) 2k 2γℓ n 2L2 2−2γn. b − C ≥ · · ≤ ·     As P (≥ℓ) (⋆) by (8), we have C+{0,b} ≤ ℓ 2 Pr P (≥ℓ) 2P (≥ℓ) 2−n(1−H(p)) 2k 2γℓ n 2L2 2−2γn. b C+{0,b} − C ≥ · · ≤ ·     Thus, with probability at least 1 2L22−2γn, we have − ℓ 2 P (≥ℓ) 2P (≥ℓ) + 2−n(1−H(p)) 2k 2γℓ n C+{0,b} ≤ C · ·  ℓ 2  3 2−n(1−H(p)) 2k 2γℓ n ≤ · ·   ℓ 2 2−n(1−H(p)) 2k+1 2γℓ n. (9) ≤ · · The second inequality uses the assumption (6) and the final inequality uses ℓ 2. Union bounding ≥ over all 1 ℓ L, we have (9) holds for all 1 ℓ L with probability 1 2L32−2γn. This ≤ ≤ ≤ ≤ − completes the proof of Lemma 4.2. Returning to the proof of Theorem 4.1, call a code of dimension k good if (5) holds for all C 1 ℓ L. Lemma 4.2 states that if is good, then + 0, b fails to be good with probability at ≤ ≤ C C { } most 2L32−2γn over the choice of a uniformly random b Fn. ∈ 2 Since we have already shown that 0 is good with probability 1 at the beginning of the proof, { } it follows from the union bound that a random linear binary code = span(b , . . . , b ) fails to be C 1 k good with probability at most k 2L32−2γn, which is less than 2−γn for n sufficiently large. This · completes the proof of Theorem 4.1. To finish the section, we demonstrate how Theorem 2.6 implies Theorem 2.4. Corollary 4.3 (Theorem 2.4). Let ε> 0 and p (0, 1/2). A random linear code of rate 1 H(p) ε ∈ − − is (p,H(p)/ε + 2)-list decodable with probability 1 exp( Ω (n)). − − ε Proof. Let L = H(p)/ε + 2 and choose γ = ε , and let Fn be a random linear code of 2L2 C ⊆ 2 dimension k = n(1 H(p) ε). By Theorem 2.6, with probability at least 1 exp( γn), code − − − − C satisfies (5) for k = n(1 H(p) ε) and all 1 ℓ L. Choosing ℓ = L, we have − − ≤ ≤ L(−n(1−H(p))+k)+γL2 n −nLε+ ε n −(n−k) Pr [LC (x) L] 2 = 2 2 < 2 Fn x∼ 2 ≥ ≤ where the last equality uses the definition of L and γ. Suppose there exists x Fn such that ∈ 2 L (x) L. Then L (x + c) L for all c C, so that Pr[L (x) L] 2−(n−k). This is a C ≥ C ≥ ∈ C ≥ ≥ contradiction, so there are no x Fn such that L (x) L. ∈ 2 C ≥

13 5 List-decoding rank metric codes

Here, we prove Theorem 2.11, which is restated below. Theorem 5.1 (Theorem 2.11, restated). Let p (0, 1) and choose ε > 0. There is a constant ∈ Cε so that the following holds. Let m and n be sufficiently large positive integers with n

−mn(1−p−pb+p2b− ε ) S 1 + 4 2 1+ε . (10) {0} ≤ · Following the proof of Theorem 2.4, we prove the following lemma which mirrors Lemma 3.3. Lemma 5.4. Suppose that is fixed and satisfies T < 1, so that S < 2. Then C C C 1.5 0.5 Pr SC+{0,Y } 1 + 2TC + TC TC . Fm×n Y ∼ 2 ≥ ≤   Proof. By Lemma 5.2, for all Y ,

SC+{0,Y } = E AC+{0,Y }(X) X

E [AC(X)AC (X + Y )] ≤ X = E [ 1+ AC(X)+ AC(X + Y ) + (AC (X) 1)(AC (X + Y ) 1)] X − − − = 1+2TC + E [(AC (X) 1)(AC (X + Y ) 1)] . X − −

14 Over the randomness of Y and X, we have X and X + Y are statistically independent and uniform Fm×n over 2 , so we have

2 E E [(AC (X) 1)(AC (X + Y ) 1)] = E [AC (X) 1] E [AC (X + Y ) 1] = TC . Y X − − Y,X − · Y,X − By Markov’s inequality,

2 1.5 1.5 TC 0.5 Pr SC+{0,Y } 1 + 2TC + TC Pr E [(AC (X) 1)(AC (X + Y ) 1)] TC = TC . Y ≥ ≤ Y X − − ≥ ≤ T 1.5   C   This proves the lemma.

Continuing with the proof structure of Theorem 2.4, we prove the following lemma, which mirrors Lemma 3.4.

Lemma 5.5. Let p (0, 1), ε (0, 1 ). Let n

ε δ := 4 exp mn (1 p)(1 bp) 0 · 2 − − − − 1+ ε    1.5 δi := 2δi−1 + δi−1

Like in Lemma 3.4, we verify by induction that for i mn((1 p)(1 bp) ε), we have δi < 2 ≤ − − − i+1 − ε mn j+1 2 δ0 < 2 2 . The base case is trivial (assuming mn is sufficiently large), and if δj < 2 δ0 for j < i, we have

i−1 i−1 δ = 2δ (1 + δ0.5 ) = 2iδ (1 + δ0.5) 2iδ exp δ0.5 < 2i+1δ . i i−1 i−1 0 · j ≤ 0 ·  j  0 jY=0 Xj=0   In the first inequality, we used the estimate 1 + z ez, and in the second we used the inductive 2 ≤ − ε mn hypothesis δj < 2 2 for j < i and that mn is sufficiently large. In particular, if k = mn((1 2 − − ε mn p)(1 bp) ε), then δ < 2 2 . − − k Let Y ,...,Y Fm×n be randomly chosen matrices, and let = span(Y ,...,Y ) with = . 1 k ∈ 2 Ci 1 i Ck C By Lemma 5.4, conditioned on a fixed satisfying T < δ , then with probability at most T 0.5, i Ci i Ci 0.5 C which is at most δi , we have TCi+1 > δi+1. Furthermore, TC0 < δ0 by (10). Thus, with probability at least

2 1 δ0.5 + δ0.5 + + δ0.5 > 1 ke−ε mn/4 1 2−Ω(mn) − 0 1 · · · k−1 − ≥ −

 ε2mn − 2 we have TCi < δi for all i. In particular, this implies TC = TCk < δk < 2 . Thus, we have S =1+ T < 2 with probability 1 exp( Ω (n)), completing the proof of Lemma 5.5. C C − − ε We now can finish the proof of Theorem 2.11.

15 p+bp−bp2 Proof of Theorem 2.11. By Lemma 5.5, it suffices to prove that SC < 2 implies (p, ε +2)-list decodability. Suppose for sake of contradiction that a code satisfies SC < 2 and there exists ∗ Fm×n ∗ p+bp−bp2 C X 2 such that R(X ,p) > ε + 2. We know that for all C , we have ∈ |B ∩ C| ∈ C 2 (X∗ + C,p) = (X∗,p) ( C) = (X∗,p) , so (X∗ + C,p) > p+bp−bp + 2 |BR ∩ C| |BR ∩ C− | |BR ∩ C| |BR ∩ C| ε for all C . ∈C If SC < 2, then we have

mn+1 mn ε 2 > 2 SC = exp2 mn R(X,p) · × · 1+ ε · |B ∩ C| X∈Fm n   X2 ε exp mn (X∗ + C,p) ≥ 2 · 1+ ε · |BR ∩ C| CX∈C   ε p + bp bp2 exp mn − + 2 ≥ 2 · 1+ ε · ε CX∈C    p + bp bp2 + 2ε = exp mn − |C| · 2 · 1+ ε   ε = exp mn 1+ ((1 p)(1 bp) ε) . 2 1+ ε − − −    which is a contradiction for large enough mn.

6 Conclusion

In this work, we have given an improved analysis of the list-decodability of random linear binary codes. Our analysis works for all values of p, and also obtains improved bounds on the list size as the rate approaches list-decoding capacity. In particular, not only do our bounds improve on previous work for random linear codes, but they show that random linear codes are more list-decodable than completely random codes, in the sense that the list size is strictly smaller. Our techniques are quite simple, and strengthen an argument of [GHSZ02] to hold with high probability. In order to demonstrate the applicability of these techniques, we use them to (a) obtain more information about the distribution of list sizes of random linear codes and (b) to prove a similar result for random linear rank-metric codes, improving a recent result of [GR17]. We end with some open questions raised by our work. 1. With the exception of Theorem 2.12, our results—both our upper bounds and our lower bounds—hold only for binary alphabets. We conjecture that analogous results, and in partic- ular list decoding random linear codes with list size C/ε for C < 1, hold over larger alphabets. 2. We showed that random linear binary codes of rate 1 H(p) ε are with high probability − − (p,L) list-decodable with L H(p)/ε. The lower bounds of [Rud11, GN14] show that we ≤ must have L C/ε, but the constant C is much smaller than H(p). Thus, we still do not ≥ know what the correct leading constant is for random linear codes. 3. Finally, there are currently no known explicit constructions of capacity-achieving binary list- decodable codes for general p. It is our hope that this work—which gives more informa- tion about the structure of linear codes which achieve list-decoding capacity—could lead to progress on this front. Given that we don’t know how to efficiently check if a given code is (p,L)-list-decodable, even an efficient Las Vegas construction (as opposed to a Monte Carlo construction) would be interesting.

16 Acknowledgements

We thank anonymous reviewers for helpful comments on a previous version of this manuscript.

References

[AS92] Noga Alon and Joel Spencer. The Probabilistic Method. John Wiley, 1992.

[Bli86] Volodia M. Blinovsky. Bounds for codes in the case of list decoding of finite volume. Problems of Information Transmission, 22(1):7–19, 1986.

[CGV13] Mahdi Cheraghchi, Venkatesan Guruswami, and Ameya Velingker. Restricted isometry of Fourier matrices and list decodability of random linear codes. In Proceedings of the Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 432–442. ACM-SIAM, 2013.

[Del78] Philippe Delsarte. Bilinear forms over a finite field, with applications to coding theory. Journal of Combinatorial Theory, Series A, 25(3):226–241, 1978.

[Din15] Yang Ding. On list-decodability of random rank metric codes and subspace codes. IEEE Trans. Information Theory, 61(1):51–59, 2015.

[DL12] Zeev Dvir and Shachar Lovett. Subspace evasive sets. In Proceedings of the Forty-Fourth annual ACM Symposium on Theory of Computing (STOC), pages 351–358. ACM, 2012.

[Eli57] Peter Elias. List decoding for noisy channels. Wescon Convention Record, Part 2, pages 94–104, 1957.

[Eli91] Peter Elias. Error-correcting codes for list decoding. IEEE Trans. Information Theory, 37(1):5–12, 1991.

[Gab85] Ernst Mukhamedovich Gabidulin. Theory of codes with maximum rank distance. Prob- lemy Peredachi Informatsii, 21(1):3–16, 1985.

[GHK11] Venkatesan Guruswami, Johan H˚astad, and Swastik Kopparty. On the list-decodability of random linear codes. IEEE Trans. Information Theory, 57(2):718–725, 2011.

[GHSZ02] Venkatesan Guruswami, Johan H˚astad, Madhu Sudan, and David Zuckerman. Combi- natorial bounds for list decoding. IEEE Trans. Information Theory, 48(5):1021–1034, 2002.

[GI04] Venkatesan Guruswami and Piotr Indyk. Efficiently decodable codes meeting Gilbert- Varshamov bound for low rates. In Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms, pages 756–757. Society for Industrial and Applied Mathematics, 2004.

[GLM+] Venkatesan Guruswami, Ray Li, Jonathan Mosheiff, Nicolas Resch, Shashwat Silas, and Mary Wootters. Bounds for list-decoding and list-recovery of random linear codes. In APPROX/RANDOM 2020. To appear.

[GN14] Venkatesan Guruswami and Srivatsan Narayanan. Combinatorial limitations of average- radius list-decoding. IEEE Trans. Information Theory, 60(10):5827–5842, 2014.

17 [GPT91] Ernst M. Gabidulin, A. V. Paramonov, and O. V. Tretjakov. Ideals over a non- commutative ring and their applications in cryptology. In Proceedings of Advances in Cryptology - EUROCRYPT ’91, Workshop on the Theory and Application of of Cryptographic Techniques, pages 482–489, 1991. [GR08a] Venkatesan Guruswami and Atri Rudra. Concatenated codes can achieve list-decoding capacity. In Proceedings of the Nineteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2008, San Francisco, California, USA, January 20-22, 2008, pages 258–267, 2008. [GR08b] Venkatesan Guruswami and Atri Rudra. Explicit codes achieving list decoding ca- pacity: Error-correction with optimal redundancy. IEEE Trans. Information Theory, 54(1):135–150, 2008. [GR17] Venkatesan Guruswami and Nicolas Resch. On the list-decodability of random linear rank-metric codes. arXiv preprint arXiv:1710.11516, 2017. [Gur09] Venkatesan Guruswami. List decoding of binary codes-a brief survey of some recent results. In Proceedings of Coding and Cryptology, Second International Workshop (IWCC), pages 97–106, 2009. [GV05] Venkatesan Guruswami and Salil Vadhan. A lower bound on list size for list decoding. In Chandra Chekuri, Klaus Jansen, Jos´eD. P. Rolim, and Luca Trevisan, editors, Approx- imation, Randomization and Combinatorial Optimization. Algorithms and Techniques, pages 318–329, Berlin, Heidelberg, 2005. Springer Berlin Heidelberg. [GWX16] Venkatesan Guruswami, Carol Wang, and Chaoping Xing. Explicit list-decodable rank- metric and subspace codes via subspace designs. IEEE Trans. Information Theory, 62(5):2707–2718, 2016. [GX12] Venkatesan Guruswami and Chaoping Xing. Folded codes from function field towers and improved optimal rate list decoding. In Proceedings of the Forty-Fourth annual ACM Symposium on Theory of Computing (STOC), pages 339–350. ACM, 2012. [GX13] Venkatesan Guruswami and Chaoping Xing. List decoding Reed-Solomon, Algebraic- Geometric, and Gabidulin subcodes up to the . In Proceedings of the Forty-Fifth annual ACM Symposium on Theory of Computing (STOC), pages 843–852. ACM, 2013. [GY08] Maximilien Gadouleau and Zhiyuan Yan. On the decoder error probability of bounded rank-distance decoders for maximum rank-distance codes. IEEE Trans. Information Theory, 54(7):3202–3206, 2008. [HRZW17] Brett Hemenway, Noga Ron-Zewi, and Mary Wootters. Local list recovery of high- rate tensor codes & applications. In 58th Annual IEEE Symposium on Foundations of Computer Science, 2017. [HW15] Brett Hemenway and Mary Wootters. Linear-time list recovery of high-rate expander codes. In International Colloquium on Automata, Languages, and Programming, pages 701–712. Springer, 2015. [KK08] Ralf Koetter and Frank R Kschischang. Coding for errors and erasures in random network coding. IEEE Trans. Information Theory, 54(8):3579–3591, 2008.

18 [LGB03] P. Lusina, Ernst M. Gabidulin, and Martin Bossert. Maximum rank distance codes as space-time codes. IEEE Trans. Information Theory, 49(10):2757–2760, 2003.

[LK05] Hsiao-feng Lu and P. Vijay Kumar. A unified construction of space-time codes with optimal rate-diversity tradeoff. IEEE Trans. Information Theory, 51(5):1709–1730, 2005.

[Loi10] Pierre Loidreau. Designing a rank metric based McEliece cryptosystem. In Proceedings of the Post-Quantum Cryptography, Third International Workshop on Post-Quantum Cryptography, (PQCrypto), pages 142–152, 2010.

[Loi17] Pierre Loidreau. A new rank metric codes based encryption scheme. In Proceedings of the Post-Quantum Cryptography, 8th International Workshop on Post-Quantum Cryp- tography, (PQCrypto), pages 3–17, 2017.

[MS77] Florence Jessie MacWilliams and Neil James Alexander Sloane. The theory of error- correcting codes. Elsevier, 1977.

[Rot91] Ron M. Roth. Maximum-rank array codes and their application to crisscross error correction. IEEE Trans. Information Theory, 37(2):328–336, 1991.

[RU10] Atri Rudra and Steve Uurtamo. Two theorems on list decoding. In Maria Serna, Ronen Shaltiel, Klaus Jansen, and Jos´eRolim, editors, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, pages 696–709, Berlin, Heidelberg, 2010. Springer Berlin Heidelberg.

[Rud11] Atri Rudra. Limits to list decoding of random codes. IEEE Trans. Information Theory, 57(3):1398–1408, 2011.

[RW14] Atri Rudra and Mary Wootters. Every list-decodable code for high noise has abun- dant near-optimal rate puncturings. In Proceedings of the Forty-Sixth annual ACM Symposium on Theory of Computing (STOC), pages 764–773. ACM, 2014.

[RW15] Atri Rudra and Mary Wootters. It’ll probably work out: Improved list-decoding through random operations. In Proceedings of the 2015 Conference on Innovations in Theoretical Computer Science (ITCS), pages 287–296. ACM, 2015.

[RW18] Atri Rudra and Mary Wootters. Average-radius list-recovery of random linear codes. In Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algo- rithms (SODA). ACM-SIAM, 2018.

[SKK08] Danilo Silva, Frank R Kschischang, and Ralf Koetter. A rank-metric approach to error control in random network coding. IEEE Trans. Information Theory, 54(9):3951–3967, 2008.

[SRKV13] Natalia Silberstein, Ankit Singh Rawat, O Ozan Koyluoglu, and Sriram Vishwanath. Optimal locally repairable codes via rank-metric codes. In Proceedings of the 2013 IEEE International Symposium on Information Theory (ISIT), pages 1819–1823. IEEE, 2013.

[SRV12] Natalia Silberstein, Ankit Singh Rawat, and Sriram Vishwanath. Error resilience in distributed storage via rank-metric codes. In Proceedings of the 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pages 1150–1157. IEEE, 2012.

19 [Sud00] Madhu Sudan. List decoding: algorithms and applications. SIGACT News, 31(1):16– 27, 2000.

[Vad12] Salil P. Vadhan. Pseudorandomness. Foundations and Trends in Theoretical Computer Science, 7(1-3):1–336, 2012.

[Woo13] Mary Wootters. On the list decodability of random linear codes with large error rates. In Proceedings of the Forty-Fifth Symposium on Theory of Computing Conference (STOC), pages 853–860, 2013.

[Woz58] Jack Wozencraft. List decoding. Quarter Progress Report, 48:90–95, 1958.

[WZ13] Antonia Wachter-Zeh. Bounds on list decoding of rank-metric codes. IEEE Trans. Information Theory, 59(11):7268–7277, 2013.

[ZP81] Victor Vasilievich Zyablov and Mark Semenovich Pinsker. List concatenated decoding. Problemy Peredachi Informatsii, 17(4):29–33, 1981.

A Lower bound on the list size of random codes

In this appendix, we prove Theorem 2.5, which states that, for any γ > 0 and ε small enough, the smallest list size L such that a completely random code is with high probability (p,L)-list decodable, satisfies L [(1 γ)/ε 1, 1/ε]. In other words, the list size of 1/ε given by the classical list decoding ∈ − − capacity theorem is tight in the leading constant factor. We prove this by tightening the second moment argument of Guruswami and Narayanan [GN14]. We follow the exact same proof outline, differing only in how we bound one expression. This improvement appears as Lemma A.5. In both Appendices A and B, we use the following useful fact. Fact A.1. Let Z be a random variable that takes only nonnegative values. Then Var Z Pr[Z = 0] . ≤ E[Z]2 For this appendix, we assume that pn is an integer. We need a few bounds on Hamming balls and their intersections. We first use the following classical estimate on the volume of the Hamming ball (see, for example, [MS77][pp. 308-310]). Lemma A.2. Suppose that 1 pn n/2. Then ≤ ≤ 2H(p)n n Vol(n,pn) 2H(p)n. 8np(1 p) ≤ pn ≤ ≤ −   We also need a boundp on the volume of the intersections of two Hamming balls whose centers are distance d apart, which we denote by

We also need a bound on the volume of the intersection of two Hamming balls whose centers are distance $d$ apart, which we denote by
\[ \operatorname{Vol}(n, pn; d) := \big| \mathcal{B}(0, pn) \cap \mathcal{B}(1^d 0^{n-d}, pn) \big|. \]
Lemma A.3 below bounds $\operatorname{Vol}(n, pn; d)$ from above.

Lemma A.3. Suppose $0 < p < 1/2$, and let $\alpha_p := -1 - \frac{1}{2}\log_2\big(p(1-p)\big)$, which is positive for all $p \in (0, 1/2)$. There exists a constant $C_p > 0$ depending only on $p$ such that, for sufficiently large $n$ and all $1 \le d \le n$, we have
\[ \frac{\operatorname{Vol}(n, pn; d)}{\operatorname{Vol}(n, pn)} \le 2^{-\alpha_p d} \cdot C_p \sqrt{n}. \tag{11} \]
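For concreteness (a numerical illustration of our own, not part of the original statement): for $p = 1/4$,
\[ \alpha_{1/4} = -1 - \tfrac{1}{2}\log_2\tfrac{3}{16} = \tfrac{1}{2}\log_2\tfrac{16}{3} - 1 \approx 0.21, \]
so by (11) the intersection volume decays exponentially in the distance $d$ between the two centers.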

Proof. Let $w = 1^d 0^{n-d}$ be a string of weight $d$, so that we wish to upper bound the number of points $x$ in $D := \mathcal{B}(0, pn) \cap \mathcal{B}(w, pn)$.

First, notice that if $pn < d/2$, then $\mathcal{B}(0, pn) \cap \mathcal{B}(w, pn) = \emptyset$, so (11) holds. Thus, assume that $pn \ge d/2$. Suppose that $x$ restricted to the first $d$ coordinates has weight $i$, and that $x$ restricted to the last $n - d$ coordinates has weight $j$. Such an $x$ is in $D$ if and only if $(d - i) + j \le pn$ and $i + j \le pn$. For fixed $i, j$, the number of such $x$ is exactly $\binom{d}{i}\binom{n-d}{j}$. Thus, since $\max(i, d-i) \ge d/2$,
\begin{align*}
\operatorname{Vol}(n, pn; d) &= \sum_{i=0}^{d} \sum_{j=0}^{pn - \max(i, d-i)} \binom{d}{i} \binom{n-d}{j} \\
&\le \sum_{i=0}^{d} \binom{d}{i} \sum_{j=0}^{pn - d/2} \binom{n-d}{j} \\
&= \sum_{i=0}^{d} \binom{d}{i} \cdot \operatorname{Vol}\big(n-d,\; pn - (d/2)\big) \\
&= 2^d \cdot \operatorname{Vol}\big(n-d,\; pn - (d/2)\big) \\
&\le \exp_2\left( d + H\left( \frac{pn - \frac{d}{2}}{n - d} \right) \cdot (n - d) \right),
\end{align*}
using Lemma A.2 in the final line. Define

\[ f(d) := d + H\left( \frac{pn - \frac{d}{2}}{n - d} \right) \cdot (n - d). \]
When $d = 0$, (11) holds for sufficiently large $n$. Thus, we show that the derivative $f'(d)$ is sufficiently negative that (11) continues to hold as $d$ increases.

Claim A.4. We have $f'(d) \le -\alpha_p$.

Proof. Let $g(x) = -x \log_2(x)$, so that $g'(x) = -\log_2(x) - \log_2(e)$. We compute
\begin{align*}
f(d) &= d + (pn - d/2) \log_2\left( \frac{n-d}{pn - d/2} \right) + (n - pn - d/2) \log_2\left( \frac{n-d}{n - pn - d/2} \right) \\
&= d + g(pn - d/2) + g(n - pn - d/2) - g(n - d),
\end{align*}
and hence
\[ f'(d) = 1 + \frac{1}{2} \log_2\left( \frac{(pn - d/2)(n - pn - d/2)}{(n-d)^2} \right). \]
We also know that
\[ \frac{(pn - d/2)(n - pn - d/2)}{(n-d)^2} \le p(1-p). \tag{12} \]
Indeed, cross multiplying and expanding (12) yields
\[ p(1-p) n^2 - \frac{nd}{2} + \frac{d^2}{4} \le (n^2 - 2nd + d^2)\, p(1-p), \]
which simplifies to $p(1-p) \le \frac{n/2 - d/4}{2n - d} = \frac{1}{4}$, which is always true. Since the left-hand side of (12) is

always positive (using the assumption $p < 1/2$ and $d/2 \le pn$), we may take logarithms of both sides and conclude that
\[ f'(d) \le 1 + \frac{1}{2} \log_2\big( p(1-p) \big) = -\alpha_p. \qquad\square \]

Returning to the proof of Lemma A.3: as $f(0) = H(p)n$ and $f'(d) \le -\alpha_p$ by Claim A.4, we have $f(d) \le H(p)n - \alpha_p d$ for all real $d \ge 0$, and in particular for all integers $d \ge 0$. Taking $C_p = \sqrt{8p(1-p)}$ and applying Lemma A.2, we conclude

\[ \frac{\operatorname{Vol}(n, pn; d)}{\operatorname{Vol}(n, pn)} \;\le\; \frac{\exp_2\big( H(p)n - \alpha_p d \big)}{\exp_2\big( H(p)n \big) / \sqrt{8np(1-p)}} \;\le\; 2^{-\alpha_p d} \cdot C_p \sqrt{n}. \qquad\square \]

In Lemma A.5, we record a computation used in the proof of Theorem 2.5. The proof of Theorem 2.5 uses the second moment method to show that, with high probability, there is a positive number of "bad events": the expectation of the number of bad events (denoted $W$) is large, while the variance $\operatorname{Var}[W]$ is comparatively small. The bound in Lemma A.5 is tighter, on the individual terms in the expansion of $\operatorname{Var}[W]$, than the corresponding bound used in [GN14]; this is what yields our improved list size lower bound.
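The following script (a sanity check of our own; the parameters $n = 60$ and $p = 1/4$ are arbitrary, and since Lemma A.3 is asymptotic, small-$n$ values are only indicative) compares the exact ratio in (11) with the claimed bound:

\begin{verbatim}
# Compare the exact ratio Vol(n, pn; d) / Vol(n, pn) with the bound
# 2^(-alpha_p d) * C_p * sqrt(n) from Lemma A.3.
from math import comb, log2, sqrt

def vol(n, r):
    # Volume of the Hamming ball of radius r in {0,1}^n.
    return sum(comb(n, j) for j in range(r + 1))

def vol_intersect(n, r, d):
    # |B(0, r) n B(1^d 0^(n-d), r)|, via the double sum in the proof.
    total = 0
    for i in range(d + 1):
        jmax = r - max(i, d - i)
        for j in range(min(jmax, n - d) + 1):
            total += comb(d, i) * comb(n - d, j)
    return total

n, p = 60, 0.25
r = int(p * n)
alpha = -1 - 0.5 * log2(p * (1 - p))
C = sqrt(8 * p * (1 - p))
for d in range(1, n + 1, 10):
    ratio = vol_intersect(n, r, d) / vol(n, r)
    bound = 2 ** (-alpha * d) * C * sqrt(n)
    print(d, ratio, bound)
\end{verbatim}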

Lemma A.5. Let $0 < p < 1/2$, let $n$ be a sufficiently large integer with $pn$ integral, let $\mu = 2^{-n} \operatorname{Vol}(n, pn)$, and let $\ell, L$ be integers with $1 \le \ell \le L$. Let

$a, b, x_1, \ldots, x_\ell, x_{\ell+1}, \ldots, x_L, y_{\ell+1}, \ldots, y_L$ be chosen independently and uniformly at random from $\mathbb{F}_2^n$ (so there are $2 + 2L - \ell$ points in total). Let $\mathcal{E}$ be the event
\[ \mathcal{E} = \left( \bigwedge_{i=1}^{\ell} \big[ \Delta(a, x_i) \le pn \big] \wedge \big[ \Delta(b, x_i) \le pn \big] \right) \wedge \left( \bigwedge_{i=\ell+1}^{L} \big[ \Delta(a, x_i) \le pn \big] \wedge \big[ \Delta(b, y_i) \le pn \big] \right). \]
Then

\[ \Pr[\mathcal{E}] \;\le\; \min\left\{ \mu^{2L-\ell+1},\; 2^{-n} \cdot \mu^{2L-\ell} \cdot C_p^L \cdot n^{L/2} \cdot \big( 1 + 2^{-\alpha_p \ell} \big)^n \right\}. \]

Proof. First we show that the probability of $\mathcal{E}$ is at most $\mu^{2L-\ell+1}$; this is the argument that was presented in [GN14]. For the event $\mathcal{E}$ to occur, we need the following necessary (but not necessarily sufficient) conditions: (1) $\Delta(a, x_1) \le pn$ and $\Delta(b, x_1) \le pn$; (2) $\Delta(a, x_i) \le pn$ for $i = 2, \ldots, L$; and (3) $\Delta(b, y_i) \le pn$ for $i = \ell+1, \ldots, L$. Notice that (1), (2), and (3) are independent. Using the randomness of $a$ and $b$, (1) happens with probability $\mu^2$. Conditioned on the positions of $a$ and $b$, (2) and (3) occur with probability $\mu^{L-1}$ and $\mu^{L-\ell}$, respectively. Thus, we have $\Pr[\mathcal{E}] \le \mu^{2L-\ell+1}$.

For the other bound, we first condition on the locations of $a$ and $b$. Note that, for $0 \le d \le n$, $a$ and $b$ have Hamming distance exactly $d$ with probability $2^{-n}\binom{n}{d}$. For $i = 1, \ldots, \ell$, conditioned on the positions of $a$ and $b$ being Hamming distance $d$ apart, the probability that $\Delta(a, x_i) \le pn$ and $\Delta(b, x_i) \le pn$ is exactly $2^{-n} \operatorname{Vol}(n, pn; d)$. For $i = \ell+1, \ldots, L$, the probability that $\Delta(a, x_i) \le pn$

and the probability that $\Delta(b, y_i) \le pn$ are each exactly $\mu$. Using Lemma A.3, we can thus write
\begin{align*}
\Pr[\mathcal{E}] &= \sum_{d=0}^{n} \Pr[\Delta(a, b) = d] \cdot \left( \frac{\operatorname{Vol}(n, pn; d)}{2^n} \right)^{\ell} \cdot \mu^{2L - 2\ell} \\
&= \sum_{d=0}^{n} 2^{-n} \binom{n}{d} \cdot \left( \frac{\operatorname{Vol}(n, pn; d)}{\operatorname{Vol}(n, pn)} \right)^{\ell} \cdot \left( \frac{\operatorname{Vol}(n, pn)}{2^n} \right)^{\ell} \cdot \mu^{2L - 2\ell} \\
&= 2^{-n} \mu^{2L-\ell} \sum_{d=0}^{n} \binom{n}{d} \left( \frac{\operatorname{Vol}(n, pn; d)}{\operatorname{Vol}(n, pn)} \right)^{\ell} \\
&\le 2^{-n} \mu^{2L-\ell} \sum_{d=0}^{n} \binom{n}{d} \left( 2^{-\alpha_p d} \cdot C_p \sqrt{n} \right)^{\ell} \\
&\le 2^{-n} \cdot \mu^{2L-\ell} \cdot C_p^L \cdot n^{L/2} \cdot \big( 1 + 2^{-\alpha_p \ell} \big)^n,
\end{align*}
as desired. $\square$
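The final inequality deserves a word (this step is left implicit above): by the binomial theorem,
\[ \sum_{d=0}^{n} \binom{n}{d} \, 2^{-\alpha_p d \ell} = \big( 1 + 2^{-\alpha_p \ell} \big)^n, \]
and $C_p^\ell \, n^{\ell/2} \le C_p^L \, n^{L/2}$ once $n$ is large enough that $C_p \sqrt{n} \ge 1$.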

We now prove our theorem.

Theorem A.6 (Theorem 2.5, restated). For any $p \in (0, 1/2)$ and $\varepsilon > 0$, there exist $\gamma_{p,\varepsilon} = \exp\big( -\Omega_p\big( \frac{1}{\varepsilon} \big) \big)$ and $n_{p,\varepsilon} \in \mathbb{N}$ such that for all $n \ge n_{p,\varepsilon}$, a random code $\mathcal{C} \subseteq \mathbb{F}_2^n$ of rate $R = 1 - H(p) - \varepsilon$ is with probability $1 - \exp(-\Omega_{p,\varepsilon}(n))$ not $\big( p, \frac{1 - \gamma_{p,\varepsilon}}{\varepsilon} - 1 \big)$-list decodable.

Proof. We follow the outline of [GN14]. Let $\alpha_p$ be as in Lemma A.3. With hindsight, let

\[ \ell_{p,\varepsilon} := \frac{1 - H(p)}{2\varepsilon}, \qquad \gamma_{p,\varepsilon} := \frac{1}{2 \ln 2} \cdot 2^{-\alpha_p \ell_{p,\varepsilon}} = \exp\left( -\Omega_p\left( \frac{1}{\varepsilon} \right) \right). \]
Let $\mathcal{C} \subseteq \mathbb{F}_2^n$ be a uniformly random code of rate $R = 1 - H(p) - \varepsilon$. We think of each codeword $c \in \mathcal{C}$ as the encoding of a distinct message $x \in \mathbb{F}_2^{Rn}$. Choose $L = \lfloor (1 - \gamma_{p,\varepsilon})/\varepsilon \rfloor$. We show that, if $\varepsilon$ is sufficiently small, $\mathcal{C}$ is with high probability not $(p, L-1)$-list decodable.

Let $\mu = 2^{-n} \operatorname{Vol}(n, pn)$. For any list decoding center $a \in \mathbb{F}_2^n$ and any ordered list of $L$ distinct messages $X = (x_1, \ldots, x_L) \in (\mathbb{F}_2^{Rn})^L$, let $I(a, X)$ be the indicator random variable for the event that the encodings of $x_1, \ldots, x_L$ all fall in $\mathcal{B}(a, pn)$. Let $W = \sum_{a, X} I(a, X)$. Note that $\mathcal{C}$ is $(p, L-1)$-list decodable if and only if $W = 0$.

Just as in [GN14], we have $\mathbb{E}[I(a, X)] = \mu^L$, and the number of different pairs $(a, X)$ is at least
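To get a feel for these parameters (an illustration of our own; the values of $p$ and $\varepsilon$ below are arbitrary samples), one can tabulate $\ell_{p,\varepsilon}$, $\gamma_{p,\varepsilon}$, and $L$:

\begin{verbatim}
# Illustrative parameter values for Theorem A.6.
from math import log2, log, floor

def H(p):
    return -p * log2(p) - (1 - p) * log2(1 - p)

p, eps = 0.25, 0.001
alpha = -1 - 0.5 * log2(p * (1 - p))          # alpha_p from Lemma A.3
ell = (1 - H(p)) / (2 * eps)                  # ell_{p,eps}
gamma = 2 ** (-alpha * ell) / (2 * log(2))    # gamma_{p,eps}
L = floor((1 - gamma) / eps)
print(f"alpha_p = {alpha:.4f}, ell = {ell:.1f}, gamma = {gamma:.2e}, L = {L}")
# As eps shrinks, gamma becomes exponentially small and L approaches 1/eps.
\end{verbatim}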

\[ \frac{1}{2} \cdot 2^n \cdot 2^{RnL}. \]
Thus, we can compute $\mathbb{E}[W] \ge \frac{1}{2} \mu^L 2^n 2^{RnL}$, so

\begin{align*}
\operatorname{Var} W &= \sum_{X, Y} \sum_{a, b} \Big( \mathbb{E}_{\mathcal{C}}[I(a, X) I(b, Y)] - \mathbb{E}_{\mathcal{C}}[I(a, X)] \cdot \mathbb{E}_{\mathcal{C}}[I(b, Y)] \Big) \\
&= \sum_{X \cap Y \ne \emptyset} \sum_{a, b} \Big( \mathbb{E}_{\mathcal{C}}[I(a, X) I(b, Y)] - \mathbb{E}_{\mathcal{C}}[I(a, X)] \cdot \mathbb{E}_{\mathcal{C}}[I(b, Y)] \Big) \\
&\le \sum_{X \cap Y \ne \emptyset} \sum_{a, b} \mathbb{E}_{\mathcal{C}}[I(a, X) I(b, Y)] \\
&= \sum_{\ell=1}^{L} \sum_{|X \cap Y| = \ell} \sum_{a, b} \mathbb{E}_{\mathcal{C}}[I(a, X) I(b, Y)] \\
&= \sum_{\ell=1}^{L} \sum_{|X \cap Y| = \ell} 2^{2n} \Pr_{a, b, \mathcal{C}}\big[ I(a, X) \text{ and } I(b, Y) \big].
\end{align*}
Suppose $X$ and $Y$ are $L$-tuples such that $|X \cap Y| = \ell$, where the intersection treats $X$ and $Y$ as sets. Suppose that the elements of $X$ are $x_1, \ldots, x_L$ and the elements of $Y$ are $x_1, \ldots, x_\ell, y_{\ell+1}, \ldots, y_L$. In this notation, the event ``$I(a, X)$ and $I(b, Y)$'' is exactly the event $\mathcal{E}$ in Lemma A.5. Hence,
\begin{align*}
\operatorname{Var} W &\le \sum_{\ell=1}^{L} \sum_{|X \cap Y| = \ell} 2^{2n} \Pr_{a, b, \mathcal{C}}\big[ I(a, X) \text{ and } I(b, Y) \big] \\
&\le \sum_{\ell=1}^{L} \sum_{|X \cap Y| = \ell} 2^{2n} \min\left\{ \mu^{2L-\ell+1},\; 2^{-n} \cdot \mu^{2L-\ell} \cdot C_p^L \cdot n^{L/2} \cdot \big( 1 + 2^{-\alpha_p \ell} \big)^n \right\}.
\end{align*}
As $\alpha_p > 0$, by the choice of $\gamma_{p,\varepsilon}$, for all $\ell \ge \ell_{p,\varepsilon}$ we have
\[ \frac{1 + 2^{-\alpha_p \ell}}{2} \le 2^{-(1 - 2\gamma_{p,\varepsilon})}. \]
To see this, note that for $\ell \ge \ell_{p,\varepsilon}$ we have $1 + 2^{-\alpha_p \ell} \le 1 + 2^{-\alpha_p \ell_{p,\varepsilon}} = 1 + 2\gamma_{p,\varepsilon} \ln 2 \le e^{2\gamma_{p,\varepsilon} \ln 2} = 2^{2\gamma_{p,\varepsilon}}$. We also bound the number of pairs $(X, Y)$ such that $|X \cap Y| = \ell$ by $L^{2L} 2^{Rn(2L-\ell)}$. Indeed, there are at

most $2^{Rn(2L-\ell)}$ ways to choose $X, Y$ as sets and at most $L^{2L}$ ways to order the sets. Thus,

\begin{align*}
\frac{\operatorname{Var} W}{\mathbb{E}[W]^2} &\le \frac{4}{\mu^{2L} 2^{2n} 2^{2RnL}} \sum_{\ell=1}^{L} \sum_{|X \cap Y| = \ell} 2^{2n} \min\left\{ \mu^{2L-\ell+1},\; 2^{-n} \mu^{2L-\ell} C_p^L n^{L/2} \big( 1 + 2^{-\alpha_p \ell} \big)^n \right\} \\
&\le \frac{4}{\mu^{2L} 2^{2n} 2^{2RnL}} \sum_{\ell=1}^{\ell_{p,\varepsilon} - 1} \sum_{|X \cap Y| = \ell} 2^{2n} \mu^{2L-\ell+1} \\
&\qquad + \frac{4}{\mu^{2L} 2^{2n} 2^{2RnL}} \sum_{\ell=\ell_{p,\varepsilon}}^{L} \sum_{|X \cap Y| = \ell} 2^{2n} \cdot 2^{-n} \mu^{2L-\ell} \cdot C_p^L \cdot n^{L/2} \cdot \big( 1 + 2^{-\alpha_p \ell} \big)^n \\
&\le \frac{4}{\mu^{2L} 2^{2n} 2^{2RnL}} \sum_{\ell=1}^{\ell_{p,\varepsilon} - 1} L^{2L} 2^{Rn(2L-\ell)} \cdot 2^{2n} \mu^{2L-\ell+1} \\
&\qquad + \frac{4}{\mu^{2L} 2^{2n} 2^{2RnL}} \sum_{\ell=\ell_{p,\varepsilon}}^{L} L^{2L} 2^{Rn(2L-\ell)} \cdot 2^{2n} \cdot 2^{-n} \mu^{2L-\ell} \cdot C_p^L \cdot n^{L/2} \cdot \big( 1 + 2^{-\alpha_p \ell} \big)^n \\
&\le \sum_{\ell=1}^{\ell_{p,\varepsilon} - 1} 4 L^{2L} \big( 2^{Rn} \mu \big)^{-\ell} \cdot \mu \;+\; \sum_{\ell=\ell_{p,\varepsilon}}^{L} 4 L^{2L} \big( 2^{Rn} \mu \big)^{-\ell} \cdot C_p^L \cdot n^{L/2} \cdot 2^{-(1 - 2\gamma_{p,\varepsilon})n} \\
&\le 4 L^{2L+1} \cdot 2^{\varepsilon \ell_{p,\varepsilon} n} \cdot 2^{-(1 - H(p))n} \;+\; 4 L^{2L+1} \cdot C_p^L \cdot n^{L/2} \cdot 2^{\varepsilon n L - (1 - 2\gamma_{p,\varepsilon})n}.
\end{align*}
By the choice of $\ell_{p,\varepsilon} = \frac{1 - H(p)}{2\varepsilon}$, the first term in the last line is $\exp(-\Omega_{p,\varepsilon}(n))$, and by the choice of $L = \lfloor \frac{1 - \gamma_{p,\varepsilon}}{\varepsilon} \rfloor$, the second term in the last line is also $\exp(-\Omega_{p,\varepsilon}(n))$. Invoking Fact A.1,
\[ \Pr[W = 0] \le \frac{\operatorname{Var} W}{\mathbb{E}[W]^2} \le \exp(-\Omega_{p,\varepsilon}(n)). \]
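To make the first of these two estimates explicit (a quick check of our own, left implicit above): by the definition of $\ell_{p,\varepsilon}$,
\[ \varepsilon \ell_{p,\varepsilon} n - (1 - H(p))n = \frac{(1 - H(p))n}{2} - (1 - H(p))n = -\frac{(1 - H(p))n}{2}, \]
so the first term decays as $2^{-\Omega_p(n)}$; the factor $4 L^{2L+1}$ does not grow with $n$.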

Since $\mathcal{C}$ is $(p, L-1)$-list decodable if and only if $W = 0$, we conclude that, for $n$ sufficiently large, $\mathcal{C}$ is with probability $1 - \exp(-\Omega_{p,\varepsilon}(n))$ not $(p, L-1)$-list decodable. $\square$

B Lower bound on the list size of random rank metric codes

In this section, we prove Theorem 2.12. The second moment method calculations in this section are not as delicate as those in Appendix A: there, we wanted to pin down the list size to $(1 \pm o(1))/\varepsilon$, while here showing a list size of $\Omega(1/\varepsilon)$ suffices.

Theorem B.1 (Theorem 2.12, restated). Let $\varepsilon > 0$, suppose $m$ and $n$ are sufficiently large, and let $b = n/m$. Let $\mathcal{C} \subseteq \mathbb{F}_q^{m \times n}$ be a uniformly random rank metric code of rate $R = (1-p)(1-bp) - \varepsilon$. Then $\mathcal{C}$ is $\big( p,\; (1-p)(1-bp)/\varepsilon - 2 \big)$-list decodable with probability at most $\exp(-\Omega_{p,\varepsilon}(n))$.

The proof is a second moment argument and is nearly identical to the lower bound for random linear codes over the Hamming metric in [GN14]; since our result concerns the rank metric, we work out the details for completeness.

Proof. Let $\varepsilon > 0$ and fix $L = (1-p)(1-bp)/\varepsilon - 1$. Let $\mathcal{C} \subseteq \mathbb{F}_q^{m \times n}$ be a uniformly random rank metric code. We think of each $c \in \mathcal{C}$ as the encoding of a message $x \in \mathbb{F}_q^{Rmn}$. For any center $A \in \mathbb{F}_q^{m \times n}$ and any ordered list of $L$ distinct messages $X := (x_1, x_2, \ldots, x_L) \in (\mathbb{F}_q^{Rmn})^L$, let $I(A, X)$

be the indicator random variable for the event that the encodings of $x_1, \ldots, x_L$ are all contained in $\mathcal{B}_{q,R}(A, pn)$. Define
\[ W := \sum_{A, X} I(A, X). \]
Note that $\mathcal{C}$ is $(p, L-1)$-list decodable if and only if $W = 0$.

Let $\mu = q^{-mn} \big| \mathcal{B}_{q,R}(0, pn) \big|$. By Lemma 5.3, we have
\[ \big( \mu \, q^{Rmn} \big)^{-1} < 4 \, q^{\varepsilon mn}. \tag{13} \]
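To see why an estimate like (13) should hold (a heuristic sketch of our own; the precise statement is Lemma 5.3 in the main text): the rank-metric ball satisfies $|\mathcal{B}_{q,R}(0, pn)| = \Theta\big( q^{pn(m+n-pn)} \big)$, and, with $b = n/m$, one has the algebraic identity
\[ mn - pn(m + n - pn) = mn\,(1-p)(1-bp), \]
so that $\mu = \Theta\big( q^{-mn(1-p)(1-bp)} \big)$ and hence $\mu \, q^{Rmn} = \Theta\big( q^{-\varepsilon mn} \big)$.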

The probability that the encoding of $x_i$ falls inside $\mathcal{B}_{q,R}(A, pn)$ is exactly $\mu$, and these events are statistically independent, so we have $\mathbb{E}[I(A, X)] = \mu^L$. Assuming $k \ge L + 1$, for large enough $n$, the number of pairs $(A, X)$ is at least $\frac{1}{2} q^{RmnL} \cdot q^{mn}$, as the number of ordered $L$-lists $X$ of $L$ distinct messages is

\[ q^{Rmn} \cdot (q^{Rmn} - 1) \cdots (q^{Rmn} - L + 1) \;\ge\; q^{RmnL} \left( 1 - \frac{L}{q^{Rmn}} \right)^{L} \;\ge\; q^{RmnL} \left( 1 - \frac{L^2}{q^{Rmn}} \right) \;\ge\; \frac{1}{2} \, q^{RmnL}. \]
Thus, by linearity of expectation, we have
\[ \mathbb{E}[W] \ge \frac{1}{2} \mu^L q^{mn} q^{RmnL}. \]
Following the method in [GN14], we now bound the variance of $W$ from above. For two lists of messages $X$ and $Y$, we let $|X \cap Y|$ denote the size of their intersection when we view them as sets. If $X$ and $Y$ are disjoint, then the events $I(A, X)$ and $I(B, Y)$ are independent for any pair of centers $A, B \in \mathbb{F}_q^{m \times n}$. Thus,

\begin{align*}
\operatorname{Var} W &= \sum_{X, Y} \sum_{A, B} \Big( \mathbb{E}_{\mathcal{C}}[I(A, X) I(B, Y)] - \mathbb{E}_{\mathcal{C}}[I(A, X)] \cdot \mathbb{E}_{\mathcal{C}}[I(B, Y)] \Big) \\
&= \sum_{X \cap Y \ne \emptyset} \sum_{A, B} \Big( \mathbb{E}_{\mathcal{C}}[I(A, X) I(B, Y)] - \mathbb{E}_{\mathcal{C}}[I(A, X)] \cdot \mathbb{E}_{\mathcal{C}}[I(B, Y)] \Big) \\
&\le \sum_{X \cap Y \ne \emptyset} \sum_{A, B} \mathbb{E}_{\mathcal{C}}[I(A, X) I(B, Y)] \\
&= \sum_{\ell=1}^{L} \sum_{|X \cap Y| = \ell} \sum_{A, B} \mathbb{E}_{\mathcal{C}}[I(A, X) I(B, Y)] \\
&= \sum_{\ell=1}^{L} \sum_{|X \cap Y| = \ell} q^{2mn} \Pr_{A, B, \mathcal{C}}\big[ I(A, X) \text{ and } I(B, Y) \big],
\end{align*}
where, in the last line, $A$ and $B$ are picked uniformly at random from $\mathbb{F}_q^{m \times n}$.

Fix $0 < \ell \le L$ and a pair $(X, Y)$ of $L$-tuples of messages such that $|X \cap Y| = \ell$, and fix an arbitrary $z \in X \cap Y$, which is guaranteed to exist as $X$ and $Y$ intersect. For the event $I(A, X) = I(B, Y) = 1$ to happen, all of the following must occur:

1. $A$ and $B$ are both within rank-metric distance $pn$ of the encoding of $z$.

2. For each $x \in X \setminus \{z\}$, the encoding of $x$ falls inside $\mathcal{B}_{q,R}(A, pn)$.

3. For each $y \in Y \setminus X$, the encoding of $y$ falls inside $\mathcal{B}_{q,R}(B, pn)$.

The first event occurs with probability $\mu^2$; conditioned on the choices of $A$ and $B$, the second and third events are independent and occur with probabilities $\mu^{L-1}$ and $\mu^{L-\ell}$, respectively. Thus, the probability that all three events occur is $\mu^{2L-\ell+1}$, so the probability that $I(A, X) = I(B, Y) = 1$ is at most $\mu^{2L-\ell+1}$. Finally, noting that the number of pairs $(X, Y)$ with $|X \cap Y| = \ell$ is bounded by $L^{2L} \cdot q^{Rmn(2L-\ell)}$, we conclude
\[ \operatorname{Var} W \le \sum_{\ell=1}^{L} L^{2L} \cdot q^{Rmn(2L-\ell)} \cdot q^{2mn} \mu^{2L-\ell+1}. \]
Using Fact A.1 along with (13), we have

\[ \Pr[W = 0] \;\le\; \frac{\operatorname{Var} W}{\mathbb{E}[W]^2} \;\le\; \sum_{\ell=1}^{L} 4 L^{2L} \big( q^{Rmn} \mu \big)^{-\ell} \mu \;\le\; 4 L^{2L+1} \cdot 4^L q^{\varepsilon mn L} \cdot 4 q^{-mn(1-p)(1-bp)}. \]
This quantity is $\exp_q(-\Omega_{p,\varepsilon}(n))$ for our choice of $L$: indeed, $\varepsilon L = (1-p)(1-bp) - \varepsilon$, so the exponent of $q$ is $\varepsilon mn L - mn(1-p)(1-bp) = -\varepsilon mn$. Recalling that $\mathcal{C}$ is $(p, L-1)$-list decodable if and only if $W = 0$, we conclude that with high probability $\mathcal{C}$ is not $(p, L-1)$-list decodable, as desired. $\square$
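As an aside, the ball-volume estimate behind (13) can be checked exactly for tiny parameters (a script of our own, not from the paper; Lemma 5.3 appears in the main text, and the values of $m$, $n$, $q$ below are arbitrary):

\begin{verbatim}
# Exact rank-metric ball volumes vs. the estimate q^(r(m+n-r)).
from fractions import Fraction

def num_rank_exactly(m, n, q, s):
    # Number of m-by-n matrices over F_q of rank exactly s:
    # the Gaussian binomial [n choose s]_q times prod_{i<s} (q^m - q^i).
    gauss = Fraction(1)
    for i in range(s):
        gauss *= Fraction(q**n - q**i, q**s - q**i)
    count = int(gauss)
    for i in range(s):
        count *= q**m - q**i
    return count

def ball_volume(m, n, q, r):
    # Volume of the rank-metric ball of radius r around 0.
    return sum(num_rank_exactly(m, n, q, s) for s in range(r + 1))

m, n, q = 6, 4, 2
for r in range(min(m, n) + 1):
    print(r, ball_volume(m, n, q, r), q ** (r * (m + n - r)))
\end{verbatim}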

C Existence of rate $1 - H(p) - \varepsilon$ codes achieving list size $H(p)/\varepsilon$

In this section, we use the symmetric Lovász Local Lemma (see, e.g., [AS92]).

Lemma C.1 (Lovász Local Lemma). Let $d$ be a positive integer and $p \in (0, 1)$. Let $A_1, \ldots, A_N$ be events and let $G = (V, E)$ be a graph with vertex set $V = [N]$ such that each vertex has degree at most $d$ and, for all $i \in [N]$, the event $A_i$ is independent of the events $\{ A_j : j \in [N],\, j \ne i,\, ij \notin E \}$. Suppose also that $\Pr[A_i] \le p$ for all $i$. If $e p (d + 1) < 1$, then
\[ \Pr\left[ \bigwedge_{i=1}^{N} \overline{A_i} \right] \ge \left( 1 - \frac{1}{d+1} \right)^{N} > 0, \]
i.e., with positive probability, none of the events $A_i$ hold.

We prove the following.

Theorem C.2. Let $p \in [0, 1/2)$ and $\varepsilon > 0$. For any large enough $n$, there exist binary codes of rate $R = 1 - H(p) - \varepsilon$ that are $\big( p, \frac{H(p)}{\varepsilon} + 1 \big)$-list decodable.

Remark C.3. Theorem C.2 does not contradict Theorem 2.5, our high probability lower bound, as Theorem C.2 is an existential result rather than a high probability result.

Proof. Let $M = 2^{Rn+1}$ and let $L = \lfloor H(p)/\varepsilon \rfloor + 1 > H(p)/\varepsilon$. Take a random code $\mathcal{C}$ where $c_1, \ldots, c_M$ are chosen from $\mathbb{F}_2^n$ independently and uniformly at random. For $y \in \mathbb{F}_2^n$ and $I = \{ i_0, \ldots, i_L \} \subseteq [M]$, let $A(y, I)$ denote the event that $c_{i_j} \in \mathcal{B}(y, pn)$ for $j = 0, \ldots, L$. There are at most $2^n M^{L+1}$ such events. We have that $\mathcal{C}$ is $(p, L)$-list decodable if and only if none of the events $A(y, I)$ occur.

Define a graph $G$ on the events $A(y, I)$, where $A(y, I)$ and $A(z, J)$ are connected by an edge if $I \cap J \ne \emptyset$. Note that, as the $c_i$'s are all independent and each $A(y, I)$ depends only on the $c_i$ with $i \in I$, the event $A(y, I)$ is independent of all $A(z, J)$ satisfying $I \cap J = \emptyset$; in other words, $A(y, I)$ is independent of all of its non-neighbor events in $G$. For each $I \subseteq [M]$ with $|I| = L + 1$, there are

at most $(L + 1) \cdot M^L$ choices of $J$ such that $I \cap J \ne \emptyset$. Thus, each $A(y, I)$ has degree at most $d := 2^n \cdot (L+1) \cdot M^L$ in $G$. On the other hand, we can compute
\[ p := \Pr[A(y, I)] = \left( \frac{\operatorname{Vol}(n, pn)}{2^n} \right)^{L+1} \le 2^{-n(L+1)(1 - H(p))}. \]
We thus have

\begin{align*}
e \cdot p \cdot (d + 1) &\le 3 \cdot 2^{-n(L+1)(1 - H(p))} \cdot (L + 1) \cdot 2^n \cdot 2^{RnL + L} \\
&< 6L \cdot 2^{-n(L+1)(1 - H(p))} \cdot 2^n \cdot 2^{(1 - H(p) - \varepsilon)nL + L} \\
&= 6L \cdot 2^{nH(p) - \varepsilon n L + L} < 1,
\end{align*}
for our choice of $L$ and sufficiently large $n$: since $L > H(p)/\varepsilon$, the exponent $nH(p) - \varepsilon n L + L$ tends to $-\infty$ as $n$ grows. Thus, by Lemma C.1, we have

\begin{align*}
\Pr\left[ \bigwedge_{y, I} \overline{A(y, I)} \right] &\ge \left( 1 - \frac{1}{d+1} \right)^{2^n M^{L+1}} \\
&> \exp\left( -\frac{2^n M^{L+1}}{d} \right) \\
&= \exp\left( -\frac{M}{L+1} \right) > 0.
\end{align*}
For the second inequality, we used the estimate $1 - \frac{1}{d+1} \ge e^{-1/d}$. At this point we are essentially done, but it is possible that our random choice produces many repeated codewords (for example, all $c_i$ equal), in which case $\mathcal{C}$ would have fewer than $M$ distinct codewords and rate less than $R$. We check that this probability is sufficiently small compared to the probability of list decodability. There are $2^{nM}$ ways to choose $c_1, \ldots, c_M$, and the probability that $|\mathcal{C}| \le M/2$ is at most
\begin{align*}
\Pr\big[ |\mathcal{C}| \le M/2 \big] &\le \frac{\binom{2^n}{M/2} \cdot (M/2)^M}{2^{nM}} \le \frac{2^{nM/2} \cdot e^{M/2} (M/2)^{-M/2} \cdot (M/2)^M}{2^{nM}} \\
&= \frac{e^{M/2} (M/2)^{M/2}}{2^{nM/2}} = \left( \sqrt{\frac{eM}{2^{n+1}}} \right)^{M}
\end{align*}

\[ < e^{-M/(L+1)} < \Pr\left[ \bigwedge_{y, I} \overline{A(y, I)} \right], \]
where the second inequality in the previous display follows from Stirling's approximation, and the first inequality above holds for sufficiently large $n$ since $eM/2^{n+1} = e \cdot 2^{-(H(p)+\varepsilon)n} \to 0$.

Thus, with positive probability, both of the following hold: (1) none of the events $A(y, I)$ occur, i.e., $\mathcal{C}$ is $(p, L)$-list decodable; and (2) $|\mathcal{C}| > M/2 = 2^{Rn}$, so $\mathcal{C}$ has rate at least $R = 1 - H(p) - \varepsilon$. In particular, there exists some $\mathcal{C}$ with the desired list decodability and rate. $\square$
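The two estimates at the end of this proof can be checked numerically (a sanity check of our own; the values of $p$, $\varepsilon$, and $n$ below are arbitrary samples):

\begin{verbatim}
# Numerical check of the two estimates in the proof of Theorem C.2.
from math import log2, e

def H(p):
    return -p * log2(p) - (1 - p) * log2(1 - p)

p, eps, n = 0.25, 0.05, 2000
L = int(H(p) / eps) + 1
# log2 of the Lovasz Local Lemma quantity 6L * 2^(n*H(p) - eps*n*L + L),
# which must be negative for e*p*(d+1) < 1 to hold:
lll_log2 = log2(6 * L) + n * H(p) - eps * n * L + L
# The repeated-codeword bound needs e*M/2^(n+1) = e*2^(-(H(p)+eps)*n) << 1:
rep_log2 = log2(e) - (H(p) + eps) * n
print(f"L = {L}, LLL exponent: {lll_log2:.1f}, repetition exponent: {rep_log2:.1f}")
\end{verbatim}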
