Explicit Matrices for Sparse Approximation

Amin Khajehnejad, Arash Saber Tehrani, Alexandros G. Dimakis, Babak Hassibi

Abstract: We show that girth can be used to certify that sparse compressed sensing matrices have good sparse approximation guarantees. This allows us to present the first deterministic measurement matrix constructions that have an optimal number of measurements for $\ell_1/\ell_1$ approximation. Our techniques are coding theoretic and rely on a recent connection of compressed sensing to LP relaxations for channel decoding.

I. Introduction

Assume we observe $m$ linear measurements of an unknown vector $e \in \mathbb{R}^n$:
$$H \cdot e = s,$$
where $H$ is a real-valued matrix of size $m \times n$, called the measurement matrix. When $m < n$ this is an underdetermined system of linear equations, and one fundamental compressed sensing problem involves recovering $e$ under the assumption that it is $k$-sparse, i.e. that it has $k$ or fewer non-zero entries.

The sparse approximation problem goes beyond exactly sparse vectors and requires the recovery of a $k$-sparse vector $\hat{e}$ that is close to $e$, even if $e$ is not exactly $k$-sparse itself. Recent breakthrough results [11]–[13] showed that it is possible to construct measurement matrices with $m = O(k \log(n/k))$ rows that recover $k$-sparse signals exactly in polynomial time. This scaling is also optimal, as discussed in [2], [3]. These results rely on randomized matrix constructions and establish that the optimal number of measurements is sufficient with high probability over the choice of the matrix and/or the signal. Unfortunately, the required properties (RIP [11], nullspace [14], [19], and high expansion, with expansion quality $< 1/6$) have no known deterministic constructions and no known efficient verification procedures. There are several explicit constructions of measurement matrices (e.g. [4], [7]) which, however, require a slightly sub-optimal number of measurements ($m$ growing super-linearly as a function of $n$ for $k = p \cdot n$).

In this paper we focus on the linear sparsity regime, where $k$ is a fraction of $n$ and the optimal number of measurements is also a fraction of $n$. The explicit construction of measurement matrices with an optimal number of rows is a well-known open problem in compressed sensing theory (see e.g. [2] and references therein). A closely related issue is that of checking or certifying in polynomial time that a given candidate matrix has good recovery guarantees.

Our Contributions: Consider a sparse matrix $H \in \{0,1\}^{m \times n}$ that has $d_c$ ones per row and $d_v$ ones per column. If the bipartite graph corresponding to $H$ has $\Omega(\log n)$ girth, then for $k = p \cdot n$ and an optimal number of measurements $m = c_2 \cdot n$, we show that $H$ offers $\ell_1/\ell_1$ sparse approximation under basis pursuit decoding. Our technical requirement of girth, unlike expansion or RIP, is easy to check, and several deterministic constructions of matrices with $m = c \cdot n$ and $\Omega(\log n)$ girth exist, starting with the early construction in Gallager's thesis [16] and the progressive edge-growth Tanner graphs of [17].
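Girth is indeed straightforward to verify: one breadth-first search per variable node finds the shortest cycle of the Tanner graph in time polynomial in the matrix size. The sketch below is our own illustration of such a check, not code from the paper (the name `tanner_girth` is ours); it assumes the matrix is given as a dense 0/1 NumPy array.

```python
import numpy as np
from collections import deque

def tanner_girth(H):
    """Girth of the bipartite (Tanner) graph of a 0/1 matrix H.

    Nodes 0..n-1 are columns (variable nodes), n..n+m-1 are rows (check
    nodes). One BFS per variable node suffices, since every cycle in a
    bipartite graph passes through a variable node; taking the minimum
    detected closed-walk length over all roots yields the exact girth.
    Returns float('inf') for an acyclic graph.
    """
    m, n = H.shape
    adj = [[] for _ in range(n + m)]
    for j, i in zip(*np.nonzero(H)):
        adj[i].append(n + j)      # variable i -- check j
        adj[n + j].append(i)
    girth = float('inf')
    for src in range(n):
        dist, parent = {src: 0}, {src: -1}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    parent[w] = u
                    queue.append(w)
                elif w != parent[u]:
                    # Two BFS branches met: a closed walk through src,
                    # whose length upper-bounds (and, minimized over all
                    # roots, equals) the girth.
                    girth = min(girth, dist[u] + dist[w] + 1)
    return girth
```

Run on a parity-check matrix produced by a Gallager-style or progressive edge-growth construction [16], [17], this reports the girth the construction actually achieves.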
Our result is a weak bound, also known as a "for-every signal" guarantee [2]. This means that we have a fixed deterministic matrix and show the $\ell_1/\ell_1$ sparse approximation guarantee with high probability over the support of the signal. To the best of our knowledge, this is the first deterministic construction of matrices with an optimal number of measurements.

Our techniques are coding-theoretic and rely on recent developments that connect the channel decoding LP relaxation of Feldman et al. [20] to compressed sensing [8]. We rely on a primal-based density evolution technique initiated by Koetter and Vontobel [18] and analytically strengthened in the breakthrough paper of Arora et al. [9], which established the best known finite-length threshold results for LDPC codes under LP decoding.

To show our $\ell_1/\ell_1$ sparse approximation result, we first need to extend [8] to non-sparse signals. Specifically, the first step in our analysis (Theorem 3) is to show that $\ell_1/\ell_1$ sparse approximation corresponds to a channel decoding problem for a perturbed symmetric channel. The second component is to extend the Arora et al. argument to this perturbed symmetric channel, an analysis achieved in Theorem 2. This is performed by showing how a similar tree recursion allows us to establish robustness of the fundamental cone property (FCP), i.e. every pseudocodeword in the fundamental cone [18] has a non-negative pseudoweight even if the flipped bit likelihoods are multiplied by a factor larger than one. The FCP condition shows that the matrix $H$, taken as an LDPC code, can tolerate a constant fraction of errors on this perturbed symmetric channel under LP decoding.

We note that even though our analysis involves a rigorous density evolution argument, our decoder is always the basis pursuit linear relaxation, which is substantially different from the related work on message-passing algorithms for compressed sensing [15], [28].

(Amin Khajehnejad and Babak Hassibi are with the Department of Electrical Engineering, California Institute of Technology, Pasadena, CA, USA. A. Saber Tehrani and A. G. Dimakis are with the Department of Electrical Engineering, University of Southern California, Los Angeles, CA, USA.)

II. Background Information

In this section we provide a brief background on the major topics we will discuss. We begin by introducing the noiseless compressed sensing problem and basis pursuit. Then we describe the channel coding problem and its LP relaxation.

A. Compressed Sensing Preliminaries

The simplest noiseless compressed sensing (CS) problem, for exactly sparse signals, consists of recovering the sparsest real vector $e'$ of a given length $n$ from a set of $m$ real-valued measurements $s$ given by $H \cdot e' = s$; namely

CS-OPT:  minimize  $\|e'\|_0$
         subject to  $H \cdot e' = s$.

Since $\ell_0$ minimization is NP-hard, one can relax CS-OPT by replacing the $\ell_0$ norm with $\ell_1$, specifically

CS-LPD:  minimize  $\|e'\|_1$
         subject to  $H \cdot e' = s$.

This LP relaxation is also known as basis pursuit.
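To make the relaxation concrete, CS-LPD can be posed as a standard-form LP by introducing auxiliary variables $t$ with $-t \leq e' \leq t$ and minimizing $\mathbf{1}^T t$. The following sketch is our illustration of that reformulation, not code from the paper (`basis_pursuit` is our naming; SciPy's `linprog` is assumed to be available).

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(H, s):
    """Solve CS-LPD: min ||e'||_1 s.t. H e' = s, as a linear program.

    Variables are z = [e'; t] with the split -t <= e' <= t, so the LP is
    min 1^T t  s.t.  H e' = s,  e' - t <= 0,  -e' - t <= 0,  t >= 0.
    """
    m, n = H.shape
    c = np.concatenate([np.zeros(n), np.ones(n)])    # objective: sum of t
    A_eq = np.hstack([H, np.zeros((m, n))])          # H e' = s
    A_ub = np.block([[np.eye(n), -np.eye(n)],        #  e' - t <= 0
                     [-np.eye(n), -np.eye(n)]])      # -e' - t <= 0
    b_ub = np.zeros(2 * n)
    bounds = [(None, None)] * n + [(0, None)] * n    # e' free, t >= 0
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=s,
                  bounds=bounds, method="highs")
    return res.x[:n]

# Example: a 2-sparse vector from m = 6 random measurements.
rng = np.random.default_rng(0)
H = rng.standard_normal((6, 12))
e = np.zeros(12)
e[[3, 7]] = [1.5, -2.0]
e_hat = basis_pursuit(H, H @ e)   # coincides with e when the LP is tight
```

When the relaxation is not tight, the output is still some minimum-$\ell_1$ vector consistent with the measurements; the question of when it matches the CS-OPT solution is exactly the one addressed next.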
A central question in compressed sensing is under what conditions the solution given by CS-LPD equals (or is very close to, especially in the case of approximately sparse signals) the solution given by CS-OPT, i.e., when the LP relaxation is tight. There has been a substantial amount of work in this area; see e.g. [2], [11]–[14], [19].

One sufficient way to certify that a given measurement matrix is "good" is through the well-known restricted isometry property (RIP), which guarantees that the LP relaxation will be tight for all $k$-sparse vectors $e$, and further that the recovery will be robust to approximate sparsity [11], [12]. However, the RIP condition is not a complete characterization of "good" measurement matrices for the LP relaxation (see, e.g., [21]). In this paper we rely instead on the nullspace characterization (see, e.g., [22], [23]), which gives a necessary and sufficient condition for a matrix to be "good".

Definition 1: Let $S \subset \{1, \ldots, n\}$ and let $C_R \geq 0$. We say that $H$ has the nullspace property $\mathrm{NSP}^{\leq}_{\mathbb{R}}(S, C_R)$, and write $H \in \mathrm{NSP}^{\leq}_{\mathbb{R}}(S, C_R)$, if
$$C_R \cdot \|\nu_S\|_1 \leq \|\nu_{\bar{S}}\|_1, \quad \text{for all } \nu \in \mathcal{N}(H).$$
We say that $H$ has the strict nullspace property $\mathrm{NSP}^{<}_{\mathbb{R}}(S, C_R)$ if the inequality holds strictly for all nonzero $\nu \in \mathcal{N}(H)$.

The nullspace condition is necessary and sufficient for a measurement matrix to yield $\ell_1/\ell_1$ approximation guarantees [8], [27]: when it holds with $C_R > 1$ for every support set $S$ of size $k$, the CS-LPD solution $\hat{e}$ satisfies
$$\|e - \hat{e}\|_1 \leq 2\,\frac{C_R + 1}{C_R - 1} \cdot \min_{e' \in \Sigma^n_{\mathbb{R}}(k)} \|e - e'\|_1, \qquad (1)$$
where $\Sigma^n_{\mathbb{R}}(k) = \{w \in \mathbb{R}^n \mid |\mathrm{supp}(w)| \leq k\}$.

B. Channel Coding Preliminaries

A linear binary code $\mathcal{C}$ of length $n$ is defined by an $m \times n$ parity-check matrix $H_{CC}$, i.e., $\mathcal{C} \triangleq \{x \in \mathbb{F}_2^n \mid H_{CC} \cdot x = 0\}$. We define the set of codeword indices $\mathcal{I} \triangleq \mathcal{I}(H_{CC}) \triangleq \{1, \ldots, n\}$, the set of check indices $\mathcal{J} \triangleq \mathcal{J}(H_{CC}) \triangleq \{1, \ldots, m\}$, the set of check indices that involve the $i$-th codeword position $\mathcal{J}_i \triangleq \mathcal{J}_i(H_{CC}) \triangleq \{j \in \mathcal{J} \mid [H_{CC}]_{j,i} = 1\}$, and the set of codeword positions that are involved in the $j$-th check $\mathcal{I}_j \triangleq \mathcal{I}_j(H_{CC}) \triangleq \{i \in \mathcal{I} \mid [H_{CC}]_{j,i} = 1\}$. For a regular code, the variable degree $|\mathcal{J}_i|$ and the check degree $|\mathcal{I}_j|$ are denoted by $d_v$ and $d_c$, respectively.

If a codeword $x \in \mathcal{C}$ is transmitted through a channel and an output sequence $y$ is received, then one can potentially decode $x$ by solving for the maximum likelihood codeword in $\mathcal{C}$, namely

CC-MLD:  minimize  $\lambda^T x'$
         subject to  $x' \in \mathrm{conv}(\mathcal{C})$,

where $\lambda$ is the likelihood vector defined by $\lambda_i = \log\left(\frac{P(y_i \mid x_i = 0)}{P(y_i \mid x_i = 1)}\right)$, and $\mathrm{conv}(\mathcal{C})$ is the convex hull of all codewords of $\mathcal{C}$ in $\mathbb{R}^n$. CC-MLD solves the ML decoding problem by virtue of the fact that the objective $\lambda^T x'$ is minimized at a corner point of $\mathrm{conv}(\mathcal{C})$, which is a codeword. CC-MLD is NP-hard, and therefore an efficient description of the exact codeword polytope is very unlikely to exist.

The well-known channel decoding LP relaxation is:

CC-LPD:  minimize  $\lambda^T x'$
         subject to  $x' \in \mathcal{P}(H_{CC})$,

where $\mathcal{P} = \mathcal{P}(H_{CC})$ is known as the fundamental polytope [18], [20]. The fundamental polytope is compactly described as follows: if $h_j^T$ is the $j$-th row of $H_{CC}$, then
$$\mathcal{P} = \bigcap_{1 \leq j \leq m} \mathrm{conv}(\mathcal{C}_j), \qquad (2)$$
where $\mathcal{C}_j = \{x \in \mathbb{F}_2^n \mid h_j^T x = 0 \bmod 2\}$.
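The intersection (2) is what makes CC-LPD a tractable LP: $\mathrm{conv}(\mathcal{C}_j)$, restricted to the unit cube, is cut out by the inequalities $\sum_{i \in V} x_i - \sum_{i \in \mathcal{I}_j \setminus V} x_i \leq |V| - 1$ for every odd-sized $V \subseteq \mathcal{I}_j$ [20]. The sketch below is our illustration of CC-LPD built from these constraints (`cc_lpd` is a hypothetical name; SciPy's `linprog` is assumed). The $2^{d_c - 1}$ inequalities per check are affordable for the constant check degrees considered in this paper.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

def cc_lpd(H_cc, llr):
    """Channel-decoding LP (CC-LPD) over the fundamental polytope.

    For each check j with neighborhood I_j, adds the inequality
        sum_{i in V} x_i - sum_{i in I_j \ V} x_i <= |V| - 1
    for every odd-sized subset V of I_j, together with 0 <= x <= 1.
    """
    m, n = H_cc.shape
    A_ub, b_ub = [], []
    for j in range(m):
        nbrs = np.flatnonzero(H_cc[j])
        for size in range(1, len(nbrs) + 1, 2):        # odd |V| only
            for V in itertools.combinations(nbrs, size):
                row = np.zeros(n)
                row[list(nbrs)] = -1.0                 # -1 on I_j \ V ...
                row[list(V)] = 1.0                     # ... +1 on V
                A_ub.append(row)
                b_ub.append(size - 1)
    res = linprog(llr, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[(0, 1)] * n, method="highs")
    return res.x

# Example: (7,4) Hamming code, all-zeros codeword sent, bit 2 flipped
# on a binary symmetric channel (unit-magnitude LLRs for simplicity).
H_cc = np.array([[1, 1, 0, 1, 1, 0, 0],
                 [1, 0, 1, 1, 0, 1, 0],
                 [0, 1, 1, 1, 0, 0, 1]])
llr = np.ones(7)
llr[2] = -1.0
x_hat = cc_lpd(H_cc, llr)   # if integral, this is the ML codeword
```

By the ML-certificate property of LP decoding, any integral optimum of this LP is guaranteed to be the maximum-likelihood codeword; fractional optima are the pseudocodewords whose pseudoweights the fundamental cone analysis of [18] controls.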