
Estimating Unknown Sparsity in Compressed Sensing

Miles E. Lopes  [email protected]
UC Berkeley, Dept. Statistics, 367 Evans Hall, Berkeley, CA 94720-3860

Proceedings of the 30th International Conference on Machine Learning, Atlanta, Georgia, USA, 2013. JMLR: W&CP volume 28. Copyright 2013 by the author(s).

Abstract

In the theory of compressed sensing (CS), the sparsity $\|x\|_0$ of the unknown signal $x \in \mathbb{R}^p$ is commonly assumed to be a known parameter. However, it is typically unknown in practice. Due to the fact that many aspects of CS depend on knowing $\|x\|_0$, it is important to estimate this parameter in a data-driven way. A second practical concern is that $\|x\|_0$ is a highly unstable function of $x$. In particular, for real signals with entries not exactly equal to 0, the value $\|x\|_0 = p$ is not a useful description of the effective number of coordinates. In this paper, we propose to estimate a stable measure of sparsity $s(x) := \|x\|_1^2 / \|x\|_2^2$, which is a sharp lower bound on $\|x\|_0$. Our estimation procedure uses only a small number of linear measurements, does not rely on any sparsity assumptions, and requires very little computation. A confidence interval for $s(x)$ is provided, and its width is shown to have no dependence on the signal dimension $p$. Moreover, this result extends naturally to the matrix recovery setting, where a soft version of matrix rank can be estimated with analogous guarantees. Finally, we show that the use of randomized measurements is essential to estimating $s(x)$. This is accomplished by proving that the minimax risk for estimating $s(x)$ with deterministic measurements is large when $n \ll p$.

1. Introduction

The central problem of compressed sensing (CS) is to estimate an unknown signal $x \in \mathbb{R}^p$ from $n$ linear measurements $y = (y_1, \dots, y_n)$ given by

    $y = Ax + \epsilon$,    (1)

where $A \in \mathbb{R}^{n \times p}$ is a user-specified measurement matrix, $\epsilon \in \mathbb{R}^n$ is a random noise vector, and $n$ is much smaller than the signal dimension $p$. During the last several years, the theory of CS has drawn widespread attention to the fact that this seemingly ill-posed problem can be solved reliably when $x$ is sparse, in the sense that the parameter $\|x\|_0 := \mathrm{card}\{j : x_j \neq 0\}$ is much less than $p$. For instance, if $n$ is approximately $\|x\|_0 \log(p / \|x\|_0)$, then accurate recovery can be achieved with high probability when $A$ is drawn from a Gaussian ensemble (Donoho, 2006; Candès et al., 2006). Along these lines, the value of the parameter $\|x\|_0$ is commonly assumed to be known in the analysis of recovery algorithms, even though it is typically unknown in practice. Due to the fundamental role that sparsity plays in CS, this issue has been recognized as a significant gap between theory and practice by several authors (Ward, 2009; Eldar, 2009; Malioutov et al., 2008). Nevertheless, the literature has been relatively quiet about the problems of estimating this parameter and quantifying its uncertainty.
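As a concrete point of reference, the measurement model (1) is simple to simulate. The sketch below is our own minimal illustration, not a setup prescribed by the paper: the dimensions, noise level, and $1/\sqrt{n}$ column scaling of $A$ are arbitrary choices. It draws a $k$-sparse signal and a Gaussian measurement ensemble with $n$ near the scale $\|x\|_0 \log(p / \|x\|_0)$.

```python
import numpy as np

rng = np.random.default_rng(0)

p, k = 1000, 10                      # ambient dimension and true sparsity ||x||_0
n = int(np.ceil(k * np.log(p / k)))  # measurement count near the critical scale

# A k-sparse test signal: k nonzero coordinates, the rest exactly zero.
x = np.zeros(p)
support = rng.choice(p, size=k, replace=False)
x[support] = rng.standard_normal(k)

# Gaussian measurement ensemble and noisy linear measurements y = Ax + eps.
A = rng.standard_normal((n, p)) / np.sqrt(n)
eps = 0.01 * rng.standard_normal(n)
y = A @ x + eps

print(n, np.count_nonzero(x))  # n is far smaller than p
```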
1.1. Motivations and the role of sparsity

At a conceptual level, the problem of estimating $\|x\|_0$ is quite different from the more well-studied problems of estimating the full signal $x$ or its support set $S := \{j : x_j \neq 0\}$. The difference arises from sparsity assumptions. On one hand, a procedure for estimating $\|x\|_0$ should make very few assumptions about sparsity (if any). On the other hand, methods for estimating $x$ or $S$ often assume that a sparsity level is given, and then impose this value on the solution $\hat{x}$ or $\hat{S}$. Consequently, a simple plug-in estimate of $\|x\|_0$, such as $\|\hat{x}\|_0$ or $\mathrm{card}(\hat{S})$, may fail when the sparsity assumptions underlying $\hat{x}$ or $\hat{S}$ are invalid.

To emphasize that there are many aspects of CS that depend on knowing $\|x\|_0$, we provide several examples below. Our main point here is that a method for estimating $\|x\|_0$ is valuable because it can help to address a broad range of issues.

• Modeling assumptions. One of the core modeling assumptions invoked in applications of CS is that the signal of interest has a sparse representation. Likewise, the problem of checking whether or not this assumption is supported by data has been an active research topic, particularly in areas of face recognition and image classification (Rigamonti et al., 2011; Shi et al., 2011). In this type of situation, an estimate $\widehat{\|x\|_0}$ that does not rely on any sparsity assumptions is a natural device for validating the use of sparse representations.

• The number of measurements. If the choice of $n$ is too small compared to the "critical" number $n^*(x) := \|x\|_0 \log(p / \|x\|_0)$, then there are known information-theoretic barriers to the accurate reconstruction of $x$ (Arias-Castro et al., 2011). At the same time, if $n$ is chosen to be much larger than $n^*(x)$, then the measurement process is wasteful, as there are known algorithms that can reliably recover $x$ with approximately $n^*(x)$ measurements (Davenport et al., 2011).

To deal with the selection of $n$, a sparsity estimate $\widehat{\|x\|_0}$ may be used in two different ways, depending on whether measurements are collected sequentially, or in a single batch. In the sequential case, an estimate of $\|x\|_0$ can be computed from a set of "preliminary" measurements, and then the estimated value $\widehat{\|x\|_0}$ determines how many additional measurements should be collected to recover the full signal. Also, it is not always necessary to take additional measurements, since the preliminary set may be re-used to compute $\hat{x}$ (as discussed in Section 5). Alternatively, if all of the measurements must be taken in one batch, the value $\widehat{\|x\|_0}$ can be used to certify whether or not enough measurements were actually taken.

• The measurement matrix. Two of the most well-known design characteristics of the matrix $A$ are defined explicitly in terms of sparsity. These are the restricted isometry property of order $k$ (RIP-$k$), and the restricted null-space property of order $k$ (NSP-$k$), where $k$ is a presumed upper bound on the sparsity level of the true signal. Since many recovery guarantees are closely tied to RIP-$k$ and NSP-$k$, a growing body of work has been devoted to certifying whether or not a given matrix satisfies these properties (d'Aspremont & El Ghaoui, 2011; Juditsky & Nemirovski, 2011; Tang & Nehorai, 2011). When $k$ is treated as given, this problem is already computationally difficult. Yet, when the sparsity of $x$ is unknown, we must also remember that such a "certificate" is not meaningful unless we can check that $k$ is consistent with the true signal.

• Recovery algorithms. When recovery algorithms are implemented, the sparsity level of $x$ is often treated as a tuning parameter. For example, if $k$ is a presumed bound on $\|x\|_0$, then the Orthogonal Matching Pursuit algorithm (OMP) is typically initialized to run for $k$ iterations. A second example is the Lasso algorithm, which computes the solution $\hat{x} \in \mathrm{argmin}\{\|y - Av\|_2^2 + \lambda \|v\|_1 : v \in \mathbb{R}^p\}$, for some choice of $\lambda \geq 0$. The sparsity of $\hat{x}$ is determined by the size of $\lambda$, and in order to select the appropriate value, a family of solutions is examined over a range of $\lambda$ values. In the case of either OMP or Lasso, a sparsity estimate $\widehat{\|x\|_0}$ would reduce computation by restricting the possible choices of $\lambda$ or $k$, and it would also ensure that the chosen values conform to the true signal (see the sketch following this list).
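To make the tuning role of a sparsity estimate concrete, here is a minimal sketch of the OMP and Lasso usage described above, written against scikit-learn. It is our own illustration rather than a procedure from the paper, and the name k_hat is a hypothetical stand-in for an estimated upper bound on $\|x\|_0$.

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit, lasso_path

def omp_with_sparsity_bound(A, y, k_hat):
    """Run OMP for exactly k_hat iterations, so the recovered signal
    has at most k_hat nonzero coefficients."""
    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=k_hat)
    omp.fit(A, y)
    return omp.coef_

def lasso_path_with_sparsity_bound(A, y, k_hat):
    """Compute the Lasso regularization path, then keep only the lambda
    values whose solutions have support size consistent with k_hat."""
    alphas, coefs, _ = lasso_path(A, y)              # coefs: shape (p, n_alphas)
    support_sizes = np.count_nonzero(coefs, axis=0)  # one support size per alpha
    keep = support_sizes <= k_hat
    return alphas[keep], coefs[:, keep]
```

Restricting the path in this way reduces computation (fewer candidate values of $\lambda$ to examine) and rules out solutions whose sparsity contradicts the estimate, which is exactly the two benefits noted above.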
1.2. An alternative measure of sparsity

Despite the important theoretical role of the parameter $\|x\|_0$ in many aspects of CS, it has the practical drawback of being a highly unstable function of $x$. In particular, for real signals $x \in \mathbb{R}^p$ whose entries are not exactly equal to 0, the value $\|x\|_0 = p$ is not a useful description of the effective number of coordinates. In order to estimate sparsity in a way that accounts for the instability of $\|x\|_0$, it is desirable to replace the $\ell_0$ norm with a "soft" version. More precisely, we would like to identify a function of $x$ that can be interpreted like $\|x\|_0$, but remains stable under small perturbations of $x$. A natural quantity that serves this purpose is the numerical sparsity

    $s(x) := \|x\|_1^2 / \|x\|_2^2$,    (2)

which always satisfies $1 \leq s(x) \leq p$ for any non-zero $x$. Although the ratio $\|x\|_1^2 / \|x\|_2^2$ appears sporadically in different areas (Tang & Nehorai, 2011; Hurley & Rickard, 2009; Hoyer, 2004; Lopes et al., 2011), it does not seem to be well known as a sparsity measure in CS.

A key property of $s(x)$ is that it is a sharp lower bound on $\|x\|_0$ for all non-zero $x$,

    $s(x) \leq \|x\|_0$,    (3)

which follows from applying the Cauchy-Schwarz inequality to the relation $\|x\|_1 = \langle x, \mathrm{sgn}(x) \rangle$, giving $\|x\|_1 \leq \|x\|_2 \sqrt{\|x\|_0}$. (Equality in (3) is attained iff the non-zero coordinates of $x$ are equal in magnitude.) We also note that this inequality is invariant to scaling of $x$, since $s(x)$ and $\|x\|_0$ are individually scale invariant.
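The quantities in (2) and (3) are straightforward to compute. The following sketch, our own illustration, checks the lower bound and its equality case, and contrasts the stability of $s(x)$ with the instability of $\|x\|_0$ under a small perturbation.

```python
import numpy as np

def numerical_sparsity(x):
    """Numerical sparsity s(x) = ||x||_1^2 / ||x||_2^2 from equation (2)."""
    x = np.asarray(x, dtype=float)
    return np.linalg.norm(x, 1) ** 2 / np.linalg.norm(x, 2) ** 2

# s(x) is a sharp lower bound on ||x||_0, per equation (3):
x = np.array([1.0, -1.0, 1.0, 0.0, 0.0])
print(numerical_sparsity(x))        # 3.0 -- equality, since the nonzeros share one magnitude
print(np.count_nonzero(x))          # 3

# Unlike ||x||_0, s(x) barely moves under a tiny perturbation:
rng = np.random.default_rng(0)
x_noisy = x + 1e-6 * rng.standard_normal(x.size)
print(numerical_sparsity(x_noisy))  # still approximately 3
print(np.count_nonzero(x_noisy))    # jumps to 5 = p
```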