<<

Circulant and Toeplitz Matrices in Compressed Sensing

Holger Rauhut Hausdorff Center for Mathematics & Institute for Numerical Simulation University of Bonn

Conference on Time-Frequency Strobl, June 15, 2009

Institute for Numerical Simulation Rheinische Friedrich-Wilhelms-Universität Bonn

Holger Rauhut Hausdorff Center for Mathematics & Institute for NumericalCirculant and Simulation Toeplitz University Matrices inof CompressedBonn Sensing Overview

• Compressive Sensing (compressive sampling) • Partial Random Circulant and Toeplitz Matrices

• Recovery result for `1-minimization • Numerical experiments • Proof sketch (Non-commutative Khintchine inequality)

Holger Rauhut Circulant and Toeplitz Matrices 2 Recovery requires the solution of the underdetermined linear system

Ax = y.

Idea: Sparsity allows recovery of the correct x. Suitable matrices A allow the use of efficient algorithms.

Compressive Sensing

N Recover a large vector x ∈ C from a small number of linear n×N measurements y = Ax with A ∈ C a suitable measurement . Usual assumption x is sparse, that is, kxk0 := | supp x| ≤ k where supp x = {j, xj 6= 0}. Interesting case k < n << N.

Holger Rauhut Circulant and Toeplitz Matrices 3 Compressive Sensing

N Recover a large vector x ∈ C from a small number of linear n×N measurements y = Ax with A ∈ C a suitable measurement matrix. Usual assumption x is sparse, that is, kxk0 := | supp x| ≤ k where supp x = {j, xj 6= 0}. Interesting case k < n << N.

Recovery requires the solution of the underdetermined linear system

Ax = y.

Idea: Sparsity allows recovery of the correct x. Suitable matrices A allow the use of efficient algorithms.

Holger Rauhut Circulant and Toeplitz Matrices 3 n×N For suitable matrices A ∈ C , `0 recovers every k-sparse x provided n ≥ 2k.

Problem: `0 is NP hard in general.

Need efficient alternative algorithms! We will consider `1-minimization

`0-minimization

Natural approach

min kxk0 subject to Ax = y

Holger Rauhut Circulant and Toeplitz Matrices 4 Problem: `0 is NP hard in general.

Need efficient alternative algorithms! We will consider `1-minimization

`0-minimization

Natural approach

min kxk0 subject to Ax = y

n×N For suitable matrices A ∈ C , `0 recovers every k-sparse x provided n ≥ 2k.

Holger Rauhut Circulant and Toeplitz Matrices 4 Need efficient alternative algorithms! We will consider `1-minimization

`0-minimization

Natural approach

min kxk0 subject to Ax = y

n×N For suitable matrices A ∈ C , `0 recovers every k-sparse x provided n ≥ 2k.

Problem: `0 is NP hard in general.

Holger Rauhut Circulant and Toeplitz Matrices 4 `0-minimization

Natural approach

min kxk0 subject to Ax = y

n×N For suitable matrices A ∈ C , `0 recovers every k-sparse x provided n ≥ 2k.

Problem: `0 is NP hard in general.

Need efficient alternative algorithms! We will consider `1-minimization

Holger Rauhut Circulant and Toeplitz Matrices 4 `1-minimization

Convex relaxation of `0 problem:

N X min kxk1 = |xj | subject to Ax = y j=1

Efficient convex optimization methods apply.

Idea: The minimizers of `1 and `0 coincide under appropriate conditions on the matrix A and on the sparsity k of x.

Holger Rauhut Circulant and Toeplitz Matrices 5 Theorem (Cand´es,Romberg, Tao, ’06,’08; Foucart, Lai ’08) √ n×N 2(3− 2) IfA ∈ C is a matrix satisfying δ2k < 7 ≈ 0.4531 then `1-minimization stably recovery everyk-sparsex fromy = Ax.

Restricted Isometry Property (RIP)

The restricted isometry constant δk of a matrix A is the smallest constant δ such that

2 2 2 (1 − δ)kxk2 ≤ kAxk2 ≤ (1 + δ)kxk2 for all k-sparse x.

All column submatrices of A with at most k columns are well-conditioned.

Holger Rauhut Circulant and Toeplitz Matrices 6 Restricted Isometry Property (RIP)

The restricted isometry constant δk of a matrix A is the smallest constant δ such that

2 2 2 (1 − δ)kxk2 ≤ kAxk2 ≤ (1 + δ)kxk2 for all k-sparse x.

All column submatrices of A with at most k columns are well-conditioned.

Theorem (Cand´es,Romberg, Tao, ’06,’08; Foucart, Lai ’08) √ n×N 2(3− 2) IfA ∈ C is a matrix satisfying δ2k < 7 ≈ 0.4531 then `1-minimization stably recovery everyk-sparsex fromy = Ax.

Holger Rauhut Circulant and Toeplitz Matrices 6 Random Partial Fourier matrix: (n rows of the discrete Fourier matrix selected at random)

n ≥ Cδ−2k log4(N) log(−1).

RIP for random matrices

n×N A Bernoulli A ∈ R with independent entries √1 having values ± n with equal probability; or a Gaussian random n×N matrix A ∈ R having independent normal distributed entries with mean zero and variance 1/n satisfies δk ≤ δ with probability at least1 −  provided

n ≥ Cδ−2(k log(N/k) + log(−1)).

Holger Rauhut Circulant and Toeplitz Matrices 7 RIP for random matrices

n×N A Bernoulli random matrix A ∈ R with independent entries √1 having values ± n with equal probability; or a Gaussian random n×N matrix A ∈ R having independent normal distributed entries with mean zero and variance 1/n satisfies δk ≤ δ with probability at least1 −  provided

n ≥ Cδ−2(k log(N/k) + log(−1)).

Random Partial Fourier matrix: (n rows of the discrete Fourier matrix selected at random)

n ≥ Cδ−2k log4(N) log(−1).

Holger Rauhut Circulant and Toeplitz Matrices 7 Circulant and Toeplitz matrices

N Circulant matrix: For b = (b0, b1,..., bN−1) ∈ R let b N×N S = S ∈ R be the matrix with entries Si,j = bj−i mod N ,   b0 b1 ······ bN−1 bN−1 b0 b1 ··· bN−2 Sb =   .  . . . . .   . . . . .  b1 b2 ··· bN−1 b0

2N−1 : For c = (c−N−1, c−N−2,..., cN−1) ∈ R let c N×N T = T ∈ R be the matrix with entries Ti,j = ci−j ,   c0 c1 ······ cN−1  c−1 c0 c1 ··· cN−2 T c =   .  . . . . .   . . . . .  c−N−1 c−N−2 ··· c−1 c0

Holger Rauhut Circulant and Toeplitz Matrices 8 N 2N−1 We will always choose the vectors b ∈ R , c ∈ R at random.

b c Performance of SΩ, TΩ in compressed sensing?

Partial random circulant and Toeplitz matrices

LetΩ ⊂ {1,..., N} arbitrary of cardinality n. N RΩ: operator that restricts a vector x ∈ R to its entries in Ω.

Restrict Sb and T c to the rows indexed by Ω: b b n×N Partial circulant matrix: SΩ = RΩS ∈ R c c n×N Partial Toeplitz matrix: TΩ = RΩT ∈ R

Convolution followed by subsampling: b y = RΩS x = RΩ(x ∗ b)

Holger Rauhut Circulant and Toeplitz Matrices 9 Partial random circulant and Toeplitz matrices

LetΩ ⊂ {1,..., N} arbitrary of cardinality n. N RΩ: operator that restricts a vector x ∈ R to its entries in Ω.

Restrict Sb and T c to the rows indexed by Ω: b b n×N Partial circulant matrix: SΩ = RΩS ∈ R c c n×N Partial Toeplitz matrix: TΩ = RΩT ∈ R

Convolution followed by subsampling: b y = RΩS x = RΩ(x ∗ b)

N 2N−1 We will always choose the vectors b ∈ R , c ∈ R at random.

b c Performance of SΩ, TΩ in compressed sensing?

Holger Rauhut Circulant and Toeplitz Matrices 9 Why Circulant and Toeplitz matrices?

• Structure: Fast algorithms via the FFT • Smaller amount of randomness than for Gaussian / Bernoulli matrices • Natural appearance in applications such as system identification, radar, channel estimation in wireless communication, ...

Holger Rauhut Circulant and Toeplitz Matrices 10 Previous results

Bajwa et al.(2007/2008): Ω = {1,..., n}, the entries of b and c are either independent Bernoulli ±1 or standard normal random variables. 2 √1 b √1 c If n ≥ Cδk log(N/k) then n SΩ and n TΩ satisfy RIP, δ2k ≤ δ In N particular, `1-minimization recovers k-sparse vectors x ∈ R .

Romberg (2008): Ω is chosen at random, b = F −1bˆ, where F is the Fourier transform and the entries of bˆ are essentially chosen at random from a uniform distribution on the torus (such that Sb is a real matrix). 3 Assume n ≥ max{C1k log(N/), C2 log (N/)}. Fix a support set S and choose the signs of the non-zero coefficients of x on S independently and uniformly at random. Then `1-minimization recovers x with probability at least1 − .

Holger Rauhut Circulant and Toeplitz Matrices 11 Main Result

Theorem Let Ω ⊂ {1, 2,..., N} be an arbitrary (deterministic) set of N cardinality n. Letx ∈ R bek-sparse such that the signs of its non-zero entries are Bernoulli ±1 random variables. Choose N b ∈ R to be a random vector whose entries are ±1 Bernoulli b n random variables. Lety = SΩx ∈ R . There exists a constant C > 0 such that n ≥ Ck log3(N/) implies that with probability at least 1 −  the `1-minimizer coincides withx. c b The same statement holds withT Ω in place of SΩ where 2N−1 c ∈ R is a random vector with Bernoulli ±1 entries. Improvements with respect to previous results: Theorem removes randomness in Ω, n scales essentially linear in k c and it works also for TΩ. Holger Rauhut Circulant and Toeplitz Matrices 12 Numerical experiments

Partial Circulant and Toeplitz matrix generated by Bernoulli vector, N = 500, n = 100.

Empirical Recovery Rate Empirical Recovery Rate 1 1

0.9 0.9

0.8 0.8

0.7 0.7

0.6 0.6

0.5 0.5

0.4 0.4

0.3 0.3

0.2 0.2

0.1 0.1

0 0 0 5 10 15 20 25 30 35 40 0 5 10 15 20 25 30 35 40 Sparsity Sparsity

(a) Circulant (b) Toeplitz

Holger Rauhut Circulant and Toeplitz Matrices 13 Numerical experiments

Partial Circulant and Toeplitz matrix generated by Gaussian vector, N = 500, n = 100.

Empirical Recovery Rate Empirical Recovery Rate 1 1

0.9 0.9

0.8 0.8

0.7 0.7

0.6 0.6

0.5 0.5

0.4 0.4

0.3 0.3

0.2 0.2

0.1 0.1

0 0 0 5 10 15 20 25 30 35 40 0 5 10 15 20 25 30 35 40 Sparsity Sparsity

(c) Circulant (d) Toeplitz

Holger Rauhut Circulant and Toeplitz Matrices 14 Pre-Torture: Recovery condition for `1-minimization

Theorem (Fuchs ’03, Tropp ’04) Suppose thaty = Ax for some x with supp x = Λ. If

† |hAΛa`, sgn(xΛ)i| < 1 for all `∈ / Λ , then x is the unique solution of the `1-minimization problem. Here,A Λ is the submatrix ofA = (a1|a2| · · · |aN ) consisting of the † columns indexed by Λ andA Λ denotes its Moore-Penrose pseudo-inverse.

Holger Rauhut Circulant and Toeplitz Matrices 15 The torture begins ... Hoeffding inequality

If the entries of sgn(xΛ) are chosen independently and uniformly at random then

† † 2 P(|hAΛa`, sgn(xΛ)i| ≥ ukAΛa`k2) ≤ 2 exp(−u /2) Hence, one is lead to estimate

† ∗ −1 ∗ ∗ −1 ∗ kAΛa`k2 = k(AΛAΛ) AΛa`k2 ≤k (AΛAΛ) k2→2kAΛa`k2. Furthermore, s √ √ ∗ X 2 kAΛa`k2 = |haj , a`i| ≤ k max |haj , a`i| =: kµ. j6=` j∈Λ

∗ ∗ −1 1 If δ(Λ) := kAΛAΛ − I k2→2 ≤ δ < 1 then k(AΛAΛ) k2→2 ≤ 1−δ .

Therefore, we estimate the coherence µ and δ(Λ).

Holger Rauhut Circulant and Toeplitz Matrices 16 Proof Sketch ...

Introduce elementary shifts( Sj x)` = x`−j mod N and

 x if 1 ≤ ` − j ≤ N, (T x) = `−j j ` 0 otherwise,

Then N−1 N−1 b X c X S = bj Sj , T = cj Tj j=0 j=−N+1

√1 b √1 c Denote A = n RΩS or A = n RΩT and Dj = Sj or Dj = Tj . Then 1 X A∗ A − I =   R D∗P D R Λ Λ n j ` Λ j Ω ` Λ j6=` with j being independent Bernoulli ±1 random variables, and PΩ the projection which sets to zero all entries outside Ω.

Holger Rauhut Circulant and Toeplitz Matrices 17 Torture continues ... Non-commutative Khintchine Inequality for Rademacher chaos

Theorem r×t LetA j,k ∈ C , j, k = 1,..., N, be matrices with Aj,j = 0, j = 1,..., N. Let k , k = 1,..., N be independent Bernoulli random variables. Then for m ∈ N it holds

N N X X [ k   A k2m ]1/2m ≤ D max{k( A A∗ )1/2k , E j k j,k S2m m j,k j,k S2m j,k=1 j,k=1 N X ∗ 1/2 k( Aj,k Aj,k ) kS2m , kF kS2m }, j,k=1

N 1/2m (2m)! 1/m whereF = (Aj,k )j,k=1 and the constantD m = 4 · 2 ( 2mm! ) .

kAkSp = kσ(A)kp, σ(A): vector of singular values of A.

Holger Rauhut Circulant and Toeplitz Matrices 18 Tail estimate from moments

Lemma Suppose Z is a positive random variable satisfying

p 1/p 1/p 1/γ (EZ ) ≤ αβ p for all p0 ≤ p < ∞ and some α, β, γ > 0. Then for arbitrary κ > 0,

κ −κuγ P(Z ≥ e αu) ≤ βe for all u ≥ p0.

Holger Rauhut Circulant and Toeplitz Matrices 19 Coherence estimate

Combining the scalar valued case of the Khintchine inequality with the tail estimate yields: Lemma Let µ be the coherence of the random partial circulant matrix √1 b n×N √1 c n×N n SΩ ∈ R or Toeplitz matrix n TΩ ∈ R whereb andc are Rademacher series and Ω has cardinalityn. Then with probability at least 1 −  the coherence satisfies

log(2N2/) µ ≤ 4 √ . n

Holger Rauhut Circulant and Toeplitz Matrices 20 Torture continues ... Moment bound

Applying the non-commutative Khinthchine inequality yields

(2m)!2 km+1 kA∗ A − I k2m ≤ kA∗ A − I k2m ≤ 2 · 42m . E Λ Λ E Λ Λ S2m 2mm! nm

Stirling’s formula and H¨oldergive r 4 k ( kA∗ A − I kp)1/p ≤ (4k)1/p p. E Λ Λ e n

Holger Rauhut Circulant and Toeplitz Matrices 21 The torture ends ... Condition number estimate

Theorem N Let Ω, Λ ⊂ {1,..., N} with |Ω| = n and |Λ| = k. Letb ∈ R and 2N−1 √1 b c ∈ R be Rademacher series. Denote eitherA = n SΩ or √1 c A = n TΩ. Assume

n ≥ 16 δ−2k log2(4k/).

Then with probability at least 1 − 

∗ kAΛAΛ − I k2 ≤ δ. Combining everything gives the main result.

Holger Rauhut Circulant and Toeplitz Matrices 22 Open problem

b c Provide a good estimate of the RIP constants of SΩ and TΩ!

(Ongoing project with Joel Tropp and Justin Romberg)

Holger Rauhut Circulant and Toeplitz Matrices 23 Thanks

Thanks for your attention!

Holger Rauhut Circulant and Toeplitz Matrices 24