Introduction to Compressive Sensing

Mrityunjoy Chakraborty
Dept. of Electronics & Electrical Communication Engg.
Indian Institute of Technology, Kharagpur, INDIA

Introduction

The field of Compressive Sensing (CS):

• A powerful method for exactly recovering signals from sub-Nyquist-rate samples, provided the signal has some sparse structure.

• It is a wide field with overlaps in several distinct areas of science and technology:
  ∗ (i) MRI imaging, (ii) speech processing
  ∗ Applied mathematics: (i) applied harmonic analysis, (ii) random matrix theory, (iii) geometric functional analysis
  ∗ Statistics

Data Acquisition

• For a signal bandlimited to B Hz, the Nyquist rate demands at least 2B samples per second for perfect reconstruction.
• It becomes quite challenging for ADCs to deliver such high sampling rates in the context of modern high-bandwidth communication systems (e.g., radar).

• Can prior knowledge about the sparse structure of the signal enable perfect reconstruction from a sub-Nyquist sampling strategy?

Data Compression

• Many signals are sparse in transform domains, such as the Fourier domain.

• Can we use the sparse structure in the transform domain to achieve compression without full acquisition (i.e., without measuring all signal coordinates)?

• Specifically, instead of taking samples of the actual vector x ∈ ℝ^N, can we recover x from the linear measurements y = Φx, where x is known to be sparse in some domain, i.e., there is some (known) matrix Ψ such that x = Ψz with z sparse?

Sparsity in the Wavelet domain

The problem of compressive sensing and its solutions

The l0 "norm" optimization formulation

• A suitable optimization problem must be formulated that addresses these questions by seeking out an unknown vector which is highly sparse, i.e., with as few nonzero coordinates as possible.
• Mathematically, let the system of linear measurements be given by y = Φx, where x ∈ ℝ^N, Φ ∈ ℝ^{M×N}, M << N, and x is highly sparse.
• Then the optimization problem we seek to solve is the constrained l0 "norm" minimization

  min_{x : y = Φx} ||x||_0

• However, this problem is combinatorial, and its complexity may be exponential. So, do we quit?

A relaxed convex optimization problem

• It turns out that a slight convexification of the problem does the job, i.e., instead of minimizing the l0 "norm", minimize the l1 norm:

  min_{x : y = Φx} ||x||_1

  (a small numerical sketch of this program is given below).

• The following diagram gives an intuitive explanation of why l1 optimization finds a sparse solution while l2 optimization does not.
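The equality-constrained l1 problem above is a linear program. Below is a minimal numerical sketch (an illustrative addition, not part of the original slides) that solves it with scipy.optimize.linprog by splitting x = u − v with u, v ≥ 0; the problem sizes, the random test signal, and the 1/√M column scaling are arbitrary choices made only for the demonstration.

import numpy as np
from scipy.optimize import linprog

def basis_pursuit(Phi, y):
    """Solve min ||x||_1 s.t. Phi x = y as a linear program.
    Split x = u - v with u, v >= 0, so that ||x||_1 = 1^T (u + v)."""
    M, N = Phi.shape
    c = np.ones(2 * N)                    # objective: sum of u and v entries
    A_eq = np.hstack([Phi, -Phi])         # Phi u - Phi v = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None))
    u, v = res.x[:N], res.x[N:]
    return u - v

# Tiny demo: a 3-sparse vector recovered from M = 30 random measurements
rng = np.random.default_rng(0)
N, M, K = 100, 30, 3
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
x_true = np.zeros(N)
x_true[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
x_hat = basis_pursuit(Phi, Phi @ x_true)
print("max recovery error:", np.max(np.abs(x_hat - x_true)))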

• The solution of this problem can recover the unknown x with high probability if x is K-sparse and M satisfies M ≥ cK ln(N/K), which is << N.

The Restricted Isometry Property

• In order to recover a high-dimensional sparse vector x from a low-dimensional measurement vector y, obtained as y = Φx, the sensing matrix Φ must be "almost" orthonormal.
• This idea is captured by the Restricted Isometry Property (RIP):

• A matrix Φ∈ MN× is said to satisfy RIP of order K , if ∀ K − sparse vector x∈∃N , δ > 0 such that 22 2 (1−δ ) ||x ||22≤+ || Φx||≤ (1δ ) || x || 2

• The smallest such constant δ is denoted δ_K.

• In simple words, Φ is an approximate isometry for all K-sparse vectors.
• The RIP constant is fundamentally related to the eigenvalues of the submatrices of Φ:

  δ_K = max_{S ⊂ {1, 2, ..., N} : |S| = K} ||I − Φ_S^T Φ_S||_{2→2}

• We can prove this as follows:

• Let x be a K-sparse vector, so that Φx = Φ_S x_S, where S is the support of x.
• Then, by the variational characterization of the maximum and minimum eigenvalues of a matrix,

  λ_min(Φ_S^T Φ_S) = min_u (u^T Φ_S^T Φ_S u) / ||u||_2^2,   λ_max(Φ_S^T Φ_S) = max_u (u^T Φ_S^T Φ_S u) / ||u||_2^2

• Hence, from the definition of the RIP, any δ that satisfies the RIP of order K also satisfies

  λ_min(Φ_S^T Φ_S) ≥ 1 − δ,   λ_max(Φ_S^T Φ_S) ≤ 1 + δ   ⇒   ||I − Φ_S^T Φ_S||_{2→2} ≤ δ

• Since this is true for any set S of indices with cardinality K, we can write, for any such δ satisfying the RIP of order K,

  max_{S : |S| = K} ||I − Φ_S^T Φ_S||_{2→2} ≤ δ

• Since, by definition, δ_K is the smallest such δ, we have

  δ_K = max_{S : |S| = K} ||I − Φ_S^T Φ_S||_{2→2}

  (a brute-force numerical check of this identity is sketched after the proof below).
• Another nice property of the RIP is that, if the RIC is small, orthogonal vectors remain almost orthogonal after transformation, as stated in the following form:

  |⟨Φx, Φy⟩| ≤ δ_{|S1|+|S2|} ||x||_2 ||y||_2,

  where x, y ∈ ℝ^N have supports S1, S2 such that S1 ∩ S2 = ∅.

• To prove this, note that ⟨x, y⟩ = 0 since S1 ∩ S2 = ∅, which allows us to write

  |⟨Φx, Φy⟩| = |⟨Φx, Φy⟩ − ⟨x, y⟩|
             = |x_{S1∪S2}^T (I − Φ_{S1∪S2}^T Φ_{S1∪S2}) y_{S1∪S2}|
             ≤ ||x_{S1∪S2}||_2 ||(I − Φ_{S1∪S2}^T Φ_{S1∪S2}) y_{S1∪S2}||_2   (due to Cauchy–Schwarz)
             ≤ ||x_{S1∪S2}||_2 ||I − Φ_{S1∪S2}^T Φ_{S1∪S2}||_{2→2} ||y_{S1∪S2}||_2
             ≤ δ_{|S1|+|S2|} ||x||_2 ||y||_2
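As a small numerical illustration (an addition, not from the original slides), the sketch below computes δ_K for a modest Gaussian matrix by brute force over all supports of size K, using the spectral-norm characterization derived above; the dimensions are deliberately tiny because the search over supports is combinatorial.

import itertools
import numpy as np

def rip_constant(Phi, K):
    """Brute-force delta_K = max over supports S with |S| = K of ||I - Phi_S^T Phi_S||_{2->2}.
    Exponential in the number of supports; only for tiny illustrative cases."""
    M, N = Phi.shape
    I = np.eye(K)
    delta = 0.0
    for S in itertools.combinations(range(N), K):
        cols = Phi[:, list(S)]
        delta = max(delta, np.linalg.norm(I - cols.T @ cols, 2))  # spectral norm
    return delta

rng = np.random.default_rng(1)
M, N, K = 20, 30, 3
Phi = rng.standard_normal((M, N)) / np.sqrt(M)   # roughly unit-norm columns
print("estimated delta_3:", rip_constant(Phi, K))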

How to find a good sensing matrix?

• A unique minimizer of the l0 minimization problem is guaranteed if every 2K columns of the sensing matrix are linearly independent, equivalently δ_{2K} ∈ (0,1); but how do we find such a matrix?
• How do we design a sensing matrix such that δ_{2K} ∈ (0,1)?
• An easy answer is random matrices, i.e., matrices with elements independent and identically distributed according to some distribution.

• Fantastic examples are:
  • Gaussian sensing matrices, i.e., matrices whose elements are i.i.d. Gaussian
  • Bernoulli sensing matrices, whose elements are i.i.d., taking the values 0 and 1 with probabilities 1 − p and p
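The sketch below (an illustrative addition) draws both kinds of random sensing matrices and checks, on a few random K-sparse vectors, how well ||Φx||_2 tracks ||x||_2 for the Gaussian case; the 1/√M scaling and the specific dimensions are conventions assumed here, not prescriptions from the slides.

import numpy as np

rng = np.random.default_rng(2)
M, N, K, p = 64, 256, 5, 0.5

# Gaussian sensing matrix: i.i.d. N(0, 1/M) entries, so columns have roughly unit norm
Phi_gauss = rng.standard_normal((M, N)) / np.sqrt(M)

# Bernoulli sensing matrix as described above: entries equal 1 with probability p, else 0
Phi_bern = (rng.random((M, N)) < p).astype(float)

# Monte-Carlo stand-in for the exhaustive RIP computation: ratios should hover near 1
for _ in range(5):
    x = np.zeros(N)
    x[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
    print(np.linalg.norm(Phi_gauss @ x) / np.linalg.norm(x))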

minN || x || x∈ 1 s.t. y = Φx

• Quadratically constrained basis pursuit, or basis pursuit denoising (BPDN):

minN || x || x∈ 1

s.t. ||y −≤Φx ||2 

• Dantzig selector:

minN || x || x∈ 1 T s.t. ||Φ ( y−≤ Φx ) ||∞ τ Greedy algorithms

Greedy algorithms

Some of the most important greedy algorithms for sparse recovery are:
• Matching pursuit (MP), Orthogonal matching pursuit (OMP), Orthogonal least squares (OLS)
• Compressive sampling matching pursuit (CoSaMP), Subspace pursuit (SP)
• Iterative hard thresholding (IHT), Hard thresholding pursuit (HTP)

Matching Pursuit

• Given y ∈ ℝ^M, the goal is to iteratively find the best linear representation of y in the dictionary {φ_1, ..., φ_N}.
• In other words, find x, iteratively, such that ||y − Φx||_2 is minimum, where Φ = [φ_1 · · · φ_N].

• In matching pursuit:
  • Initialize the residual r = y.
  • Find the atom most correlated with the residual, i.e., find φ_i such that i = arg max_{1≤j≤N} |ρ_j|, where ρ_j = ⟨φ_j, r⟩.
  • Update the residual r ← r − ρ_i φ_i, and return to the identification step.
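A compact Python sketch of matching pursuit (an illustrative addition; the fixed iteration budget is an arbitrary stopping rule, and unit-norm atoms are assumed, as in the description above):

import numpy as np

def matching_pursuit(Phi, y, n_iter=50):
    """Greedy matching pursuit: repeatedly pick the atom most correlated with the
    residual and subtract its contribution. Assumes roughly unit-norm columns."""
    N = Phi.shape[1]
    x = np.zeros(N)
    r = y.copy()                          # initialize the residual
    for _ in range(n_iter):
        rho = Phi.T @ r                   # correlations with all atoms
        i = int(np.argmax(np.abs(rho)))   # most correlated atom
        x[i] += rho[i]                    # accumulate its coefficient
        r = r - rho[i] * Phi[:, i]        # update the residual
    return x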

Orthogonal matching pursuit

• Same as matching pursuit, except that the dictionary representation is known to be K-sparse.
• Orthogonal matching pursuit proceeds as below:
  • Initialize the residual r = y and the temporary support Λ = ∅.
  • Find the atom most correlated with the residual r, i.e., find φ_i such that i = arg max_{1≤j≤N} |ρ_j|, where ρ_j = ⟨φ_j, r⟩.
  • Enlarge the temporary support by augmenting this new index, i.e., Λ ← Λ ∪ {i}.
  • Find the best K-sparse representation of y with the atoms from the dictionary supported on Λ, i.e., find x such that x_Λ = Φ_Λ^† y, x_{Λ^c} = 0.
  • Update the residual r ← y − Φ_Λ x_Λ and return to the identification step.
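A minimal Python sketch of OMP following the steps above (an illustrative addition; stopping after exactly K iterations and using a least-squares solve for the pseudo-inverse step are implementation choices):

import numpy as np

def omp(Phi, y, K):
    """Orthogonal matching pursuit: greedily grow a support and re-fit the
    coefficients on that support by least squares at every iteration."""
    N = Phi.shape[1]
    support = []
    r = y.copy()                                    # initialize the residual
    for _ in range(K):
        rho = Phi.T @ r                             # correlations with all atoms
        i = int(np.argmax(np.abs(rho)))             # identification
        if i not in support:
            support.append(i)                       # augmentation
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)  # x_Lambda = pinv(Phi_Lambda) y
        r = y - Phi[:, support] @ coef              # residual update
    x = np.zeros(N)
    x[support] = coef
    return x

# Usage with the random-matrix conventions used earlier
rng = np.random.default_rng(3)
M, N, K = 40, 120, 4
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
x_true = np.zeros(N)
x_true[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
x_hat = omp(Phi, Phi @ x_true, K)
print("max recovery error:", np.max(np.abs(x_hat - x_true)))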

• Many types of conditions have been found on the sensing matrix Φ to ensure perfect recovery of the K-sparse vector x from the measurement vector y = Φx in K iterations.
• RIP-based recovery conditions:
  • Davenport and Wakin [1] found the condition δ_{K+1} < 1/(3√K).
  • Wang et al. [2] improved the condition to δ_{K+1} < 1/(√K + 1).
  • To date, the best condition is established by Chang et al. [3], which is δ_{K+1} < (√(4K+1) − 1)/(2K).
• Another type of recovery condition is given in terms of the worst-case coherence µ and the average-case coherence ν:
  • The worst-case coherence is defined as the maximum absolute cross-correlation among the columns of the sensing matrix, in other words µ := max_{i≠j} |⟨φ_i, φ_j⟩|.
  • The average-case coherence is defined as the maximum among the absolute values of the row averages (excluding the diagonal entry of each row) of the Gram matrix Φ^T Φ, in other words ν := (1/(N−1)) max_i |Σ_{j : j≠i} ⟨φ_i, φ_j⟩|.
  • Tropp [4] gives a recovery condition in terms of µ as µ < 1/(2K − 1).
  • Chi and Calderbank [5] give the conditions µ < 1/(240 log N) and ν < µ/√M, with N ≥ 128.
• Tropp and Gilbert [6] have shown that OMP can indeed recover a K-sparse vector with very high probability if an "uncorrelated" sensing matrix (that is, one whose columns have very low mutual correlation with high probability) is used: specifically, if δ ∈ (0, 0.36) and an "admissible" sensing matrix Φ of dimension M × N is chosen with M ≥ CK ln(N/δ) for some constant C, then OMP recovers the original K-sparse vector x from the measurements y = Φx with probability exceeding 1 − δ.
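The two coherence measures defined above are easy to compute directly; the sketch below (an illustrative addition, assuming the columns are normalized so that inner products are correlations) evaluates both for a random matrix.

import numpy as np

def coherences(Phi):
    """Worst-case coherence mu = max_{i != j} |<phi_i, phi_j>| and average-case
    coherence nu = (1/(N-1)) * max_i |sum_{j != i} <phi_i, phi_j>|."""
    N = Phi.shape[1]
    Phi = Phi / np.linalg.norm(Phi, axis=0)     # normalize the columns
    G = Phi.T @ Phi                             # Gram matrix of the columns
    off = G - np.diag(np.diag(G))               # zero out the diagonal
    mu = np.max(np.abs(off))
    nu = np.max(np.abs(off.sum(axis=1))) / (N - 1)
    return mu, nu

rng = np.random.default_rng(4)
print(coherences(rng.standard_normal((64, 256))))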

OMP with more than K iterations

• Recently, a variant of OMP has been studied where OMP is run for more than K iterations, K being the sparsity of the unknown vector.
• Allowing the algorithm to run for more iterations improves the recovery condition.
• Recovery condition found by Zhang: OMP can recover a K-sparse vector in 30K iterations if δ_{31K} < 1/3.
• Recovery condition found by Livshitz: OMP reconstructs K-sparse signals in αK iterations under an RIP condition of the form δ_{βK} ≲ 1/√K, for proper choices of α, β (α ~ 2·10^6).
• Sahoo and Makur [7] have shown that if OMP is allowed to run for K + αK iterations (α ∈ [0,1]), the algorithm can recover a K-sparse vector with high probability using only n ≈ K ln(N/(αK+1)) measurements, pretty close to the number of measurements required for the success of basis pursuit, that is n ≈ K ln(N/K).

Generalized orthogonal matching pursuit

• Wang et al. [8] proposed a generalized orthogonal matching pursuit algorithm (gOMP) where, at the augmentation step, instead of augmenting one index, N (≥ 1) indices are added, chosen in decreasing order of absolute correlation with the residual vector.
• Recovery conditions for this algorithm are given as:
  • δ_{NK} < √N/(√K + 3√N)  [8]
  • δ_{NK} < √N/(√K + 2√N),   δ_{NK+1} < √N/(√K + √N)  [9]

Courtesy of Wang et al. [8].

Orthogonal least squares

• OLS has the same functional structure as OMP.
• The key difference is in the identification step:
  • Recall that OMP searches for a new index by finding the largest among the absolute correlations |⟨φ_i, r⟩|.
  • OLS searches for an index such that inclusion of the corresponding column will minimize the projection error, i.e., it finds the index i such that ||P⊥_{T^k ∪ {i}} r^k||_2^2 is minimized, where i is searched over all indices in {1, 2, ..., n} \ T^k.
• There seems to be not much work on OLS in the literature.
• Soussen et al. [10] have numerically shown that OLS has uniformly higher recovery probability compared to OMP.
• Mukhopadhyay et al. [11] have tried to characterize the recovery performance of OLS in terms of recovery probability and explained why OLS has a higher recovery probability than OMP in correlated dictionaries.
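A small Python sketch of OLS (an illustrative addition): at each step it tries every candidate column, measures the residual left after projecting y onto the enlarged support, and keeps the best candidate. The brute-force inner loop is the simplest, not the fastest, way to realize the identification rule above.

import numpy as np

def ols(Phi, y, K):
    """Orthogonal least squares: pick, at each step, the index whose inclusion
    minimizes the projection error ||P_perp_{T ∪ {i}} y||_2."""
    N = Phi.shape[1]
    T = []
    for _ in range(K):
        best_i, best_res = None, np.inf
        for i in range(N):
            if i in T:
                continue
            A = Phi[:, T + [i]]
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            res = np.linalg.norm(y - A @ coef)   # projection error with i included
            if res < best_res:
                best_i, best_res = i, res
        T.append(best_i)
    x = np.zeros(N)
    coef, *_ = np.linalg.lstsq(Phi[:, T], y, rcond=None)   # final least-squares fit
    x[T] = coef
    return x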

Multiple Orthogonal least squares

• MOLS is a generalization of OLS, proposed by Wang et al. [12].
• The generalization is realized in the identification step: instead of choosing one new index, a set S of L (≥ 1) indices is chosen such that the sum of the projection errors obtained by individually appending each atom of that set, Σ_{i∈S} ||P⊥_{T^k ∪ {i}} r^k||_2^2, is minimized.
• A recovery condition has been found by them: δ_{LK} < √L/(√K + 2√L).

Courtesy of Wang et al. [8].

Compressive sampling matching pursuit (CoSaMP)

Input: measurement vector y ∈ ℝ^m, sensing matrix Φ ∈ ℝ^{m×n}, sparsity level K, initial estimate x^0, tolerance ε
Initialize: counter k = 0
While ||y − Φx^k||_2 > ε:
  Identification: h^{k+1} = supp(H_{2K}(Φ^T(y − Φx^k)))
  Augmentation: U^{k+1} = S^k ∪ h^{k+1}, where S^k = supp(x^k)
  Estimation: u^{k+1} = arg min_{u ∈ ℝ^n, supp(u) ⊂ U^{k+1}} ||y − Φu||_2
  Update: x^{k+1} = H_K(u^{k+1}); k = k + 1
End While
Output: x̂ = x^{k−1}

(A Python sketch of this iteration is given below, after the discussion of SP.)

• Needell and Tropp [13] proposed CoSaMP as a sparse signal recovery algorithm. They proved the following recovery condition:
  • Let the measurement model be given by y = Φx + e, where x is K-sparse and e is the measurement noise vector. Then, for each iteration k ≥ 0, the signal approximation x^k satisfies

    ||x − x^{k+1}||_2 ≤ 0.5 ||x − x^k||_2 + 10 ||e||_2,
    ||x − x^k||_2 ≤ 2^{−k} ||x||_2 + 20 ||e||_2.

• Foucart [14] later improved the recovery condition to

    ||x − x^k||_2 ≤ ρ^k ||x − x^0||_2 + τ ||e||_2,

  where ρ and τ depend on δ_{4K}, and δ_{4K} < √(2/(5 + √73)) ≈ 0.3847.
• Satpathi and Chakraborty [15] showed that the number of iterations for the convergence of the CoSaMP algorithm is cK, where c = log(4/ρ_{4K}^2) / log(1/ρ_{4K}), with ρ_{4K} = 2δ_{4K}^2 (1 + δ_{4K}^2) / (1 − δ_{4K}^2).

Subspace Pursuit (SP)

Input: measurement vector y ∈ ℝ^m, sensing matrix Φ ∈ ℝ^{m×n}, sparsity level K, initial estimate x^0, tolerance ε
Initialize: counter k = 0
While ||y − Φx^k||_2 > ε:
  Identification: h^{k+1} = supp(H_K(Φ^T(y − Φx^k)))
  Augmentation: U^{k+1} = S^k ∪ h^{k+1}, where S^k = supp(x^k)
  Estimation: u^{k+1} = arg min_{u ∈ ℝ^n, supp(u) ⊂ U^{k+1}} ||y − Φu||_2
  Update: S^{k+1} = supp(H_K(u^{k+1})); x^{k+1} = arg min_{u ∈ ℝ^n, supp(u) ⊂ S^{k+1}} ||y − Φu||_2; k = k + 1
End While
Output: x̂ = x^{k−1}

• Dai and Milenkovic [16] proposed SP at almost the same time as Needell and Tropp proposed CoSaMP.
• SP is quite similar to CoSaMP, with the difference that SP has to compute two orthogonal projections per iteration, while CoSaMP requires only one.
• Dai and Milenkovic showed that if x is a K-sparse unknown vector and y = Φx is the measurement vector, with the sensing matrix Φ satisfying δ_{3K} < 0.165, then SP converges to the unknown vector x in a finite number of steps.
• Dai and Milenkovic also found an upper bound on the number of iterations necessary for SP to converge:

    n_it ≤ min( (log ρ_min)/(log c_K) + 1, 1.25K/(−log c_K) ),

  where ρ_min := min_{1≤i≤n} |x_i| / ||x||_2 and c_K := 2δ_{3K}(1 + δ_{3K}) / (1 − δ_{3K})^3.
• Satpathi and Chakraborty [15] later found the number of iterations for convergence to be ≈ cK, where c = log(4/ρ_{3K}^2) / log(1/ρ_{4K}), with ρ_{mK} = 2δ_{mK}^2 (1 + δ_{mK}^2) / (1 − δ_{mK}^2) for m ∈ {3, 4}.

Courtesy Dai and Milenkovic [16].
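A compact Python rendering of the CoSaMP iteration above (an illustrative sketch; the iteration budget and tolerance are arbitrary choices, and the hard-thresholding helper simply keeps the largest-magnitude entries):

import numpy as np

def hard_threshold(v, k):
    """Keep the k largest-magnitude entries of v and zero out the rest (H_k)."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def cosamp(Phi, y, K, max_iter=30, tol=1e-8):
    n = Phi.shape[1]
    x = np.zeros(n)
    for _ in range(max_iter):
        r = y - Phi @ x
        if np.linalg.norm(r) <= tol:
            break
        proxy = Phi.T @ r                                   # signal proxy
        # identification (2K largest proxy entries) merged with the current support
        U = sorted(set(np.argsort(np.abs(proxy))[-2 * K:]) | set(np.flatnonzero(x)))
        coef, *_ = np.linalg.lstsq(Phi[:, U], y, rcond=None)   # estimation on U
        u = np.zeros(n)
        u[U] = coef
        x = hard_threshold(u, K)                            # prune back to K terms
    return x

SP differs only in that the merged support is first pruned to K indices and a second least-squares fit is then done on that pruned support.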

Iterative Hard Thresholding (IHT)

Input: measurement vector y ∈ ℝ^m, sensing matrix Φ ∈ ℝ^{m×n}, sparsity level K, initial estimate x^0, tolerance ε
Initialize: counter k = 0
While ||y − Φx^k||_2 > ε:
  x^{k+1} = H_K(x^k + Φ^T(y − Φx^k))
  k = k + 1
End While
Output: x̂ = x^{k−1}

• This algorithm is motivated by a constrained gradient-descent approach.
• The IHT algorithm solves the following problem:

  min_{x : ||x||_0 ≤ K} ||y − Φx||_2^2

• The problem is non-convex in nature, as the constraint set is non-convex.
• However, a heuristic approach is to use gradient descent as if solving the unconstrained convex problem min_x ||y − Φx||_2^2, with the update x^k + µΦ^T(y − Φx^k) at each step, and then to restrict each gradient-descent update to a K-sparse vector.
• This amounts to projecting the gradient-descent update onto the union of the (n choose K) subspaces containing K-sparse vectors.
• The resulting update becomes

  x^{k+1} = H_K(x^k + µΦ^T(y − Φx^k))

• The "heuristic" approach of deriving IHT has been formalized by Blumensath and Davies [17].
• Instead of directly solving the actual constrained optimization problem, they attempt to solve another constrained optimization problem, in which the objective function is a majorization of the actual objective function and the constraint set is unchanged.
• Specifically, they define the following functional:

  C(x, z) = ||y − Φx||_2^2 − ||Φx − Φz||_2^2 + ||x − z||_2^2

• Note that C(x, x) is the actual objective function and, under the condition ||Φ||_{2→2} < 1, C(x, x) ≤ C(x, z) ∀z.
• Thus the prescription for the minimization follows the so-called "Majorization–Minimization" (MM) approach; formally,

  x^{k+1} = arg min_{u : ||u||_0 ≤ K} C(u, x^k),   ∀ k ≥ 0.

• The updates turn out to be exactly the updates of IHT.
• Also note that C(x^{k+1}, x^{k+1}) ≤ C(x^{k+1}, x^k) ≤ C(x^k, x^k), where the first inequality follows from the majorization property and the second from the minimization property.
• A very simple convergence proof has been given by Foucart [14]:
  • Assume the measurement model y = Φx, with the unknown x having known sparsity K.
  • Then, from the k-th update of IHT (with µ = 1), since x^{k+1} is the best K-term approximation of x^k + Φ^T(y − Φx^k) and x is itself K-sparse, it follows that

    ||x^{k+1} − (x^k + Φ^T(y − Φx^k))||_2^2 ≤ ||x − (x^k + Φ^T(y − Φx^k))||_2^2
    ⇒ ||x^{k+1} − x||_2^2 ≤ 2 ⟨x^{k+1} − x, x^k + Φ^T(y − Φx^k) − x⟩
    ⇒ ||x^{k+1} − x||_2^2 ≤ 2 ⟨x^{k+1} − x, (I − Φ^TΦ)(x^k − x)⟩

  • Let the supports of x, x^k, x^{k+1} be Λ, Λ^k, Λ^{k+1} respectively, and let V^{k+1} = Λ ∪ Λ^k ∪ Λ^{k+1}, so that |V^{k+1}| ≤ 3K.
  • Then, it follows from the Cauchy–Schwarz inequality that

    ||x^{k+1} − x||_2^2 ≤ 2 ||I − Φ_{V^{k+1}}^T Φ_{V^{k+1}}||_{2→2} ||x^{k+1} − x||_2 ||x^k − x||_2
    ⇒ ||x^{k+1} − x||_2 ≤ 2δ_{3K} ||x^k − x||_2,

    which implies that x^k → x as k → ∞ if δ_{3K} < 1/2.
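A short Python sketch of the IHT update (an illustrative addition; the step size µ = 1 and the fixed iteration budget are arbitrary choices, and Φ should be scaled so that ||Φ||_{2→2} < 1 for the majorization argument above to apply):

import numpy as np

def iht(Phi, y, K, mu=1.0, n_iter=200):
    """Iterative hard thresholding: a gradient step on ||y - Phi x||_2^2 followed by
    projection onto the set of K-sparse vectors (keep the K largest magnitudes)."""
    n = Phi.shape[1]
    x = np.zeros(n)
    for _ in range(n_iter):
        g = x + mu * (Phi.T @ (y - Phi @ x))   # gradient-descent update
        idx = np.argsort(np.abs(g))[-K:]       # indices of the K largest entries
        x = np.zeros(n)
        x[idx] = g[idx]                        # hard thresholding H_K
    return x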

Hard Thresholding Pursuit (HTP)

Input: measurement vector y ∈ ℝ^m, sensing matrix Φ ∈ ℝ^{m×n}, sparsity level K, initial estimate x^0, tolerance ε
Initialize: counter k = 0
While ||y − Φx^k||_2 > ε:
  L^{k+1} = supp(H_K(x^k + Φ^T(y − Φx^k)))
  x^{k+1} = arg min_{u ∈ ℝ^n, supp(u) ⊂ L^{k+1}} ||y − Φu||_2
  k = k + 1
End While
Output: x̂ = x^{k−1}

• Foucart [18] proposed HTP, motivated by the observation that the number of iterations taken by IHT to converge can be reduced by taking an orthogonal projection of the update onto the set of K indices found at each iteration.
• Foucart found the recovery condition for HTP under perfect measurements to be δ_{3K} < 1/√3.
• Bouchot et al. [19] have found an upper bound on the number of iterations HTP takes to converge, as n ≤ cK, where c is a constant such that c ≤ 3 whenever δ_{3K} ≤ 1/√5.

Courtesy of Foucart [18].

Models of sparsity

Block sparsity

• These are sparse vectors where the non-zero coefficients occur in clusters.
• Let a vector be written as x = [x_1 ... x_d | x_{d+1} ... x_{2d} | ... | x_{N−d+1} ... x_N]^T. Let N = Ld, and let x[l] := [x_{(l−1)d+1} ... x_{ld}]^T, so that each x[l] represents a block of length d.

• The block sparsity has been defined [20] as

  ||x||_{2,0} = Σ_{l=1}^{L} I(||x[l]||_2 > 0)
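A tiny Python helper (an illustrative addition) that evaluates the ||·||_{2,0} block-sparsity measure just defined, assuming the length of x is an exact multiple of the block length d:

import numpy as np

def block_sparsity(x, d):
    """Number of length-d blocks with non-zero Euclidean norm, i.e. ||x||_{2,0}."""
    blocks = np.asarray(x, dtype=float).reshape(-1, d)   # L blocks of length d
    return int(np.sum(np.linalg.norm(blocks, axis=1) > 0))

x = np.array([0, 0, 0, 1.5, -2.0, 0, 0, 0, 0, 0, 0, 3.0])
print(block_sparsity(x, d=3))   # prints 2: two active blocks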

subsets of , i.e, G = { 1 , ,L }, such that ⊆ , 1 ≤≤iL , and  = i  ∈G • A vector x is called a G− group− sparsevector [ 21] with respect to the group structure G , if the support of x is contained in the union of at most G groups form the group structure G Union of Subspace ( UoS) model and Model sparse signals

• Another generalization of the block-sparse model, which tries to capture the effect of overlapping blocks.
• Let x ∈ ℝ^N be a K-sparse vector, but with unknown support, i.e., the support of x can be any of the (N choose K) supports of cardinality K, numbered Λ_1 through Λ_{(N choose K)}.
• For each i, 1 ≤ i ≤ (N choose K), define the set

  V_i = {u ∈ ℝ^N | u_j = 0 ∀ j ∉ Λ_i}

• It is not difficult to see that each V_i is a subspace of dimension K, but the V_i are, in general, overlapping, that is, in general V_i ∩ V_j ≠ {0}.
• Thus x ∈ Σ_K, where Σ_K = ∪_{i=1}^{(N choose K)} V_i.
• In general, let M_K = ∪_{i=1}^{m_K} V_i, where 1 ≤ m_K ≤ (N choose K); then M_K defines the K-model-sparse signal model, and the elements of M_K are called the K-model-sparse signals.

elements of K are called the K − model sparse signals. References [1] M. Davenport, M. B. Wakin et al ., “Analysis of orthogonal matching pursuit using the restricted isometry property,” IEEE Trans . Inf . Theory , vol. 56, no. 9, pp. 4395 – 4401, 2010. [2] J. Wang and B. Shim, “On the recovery limit of sparse signals using orthogonal matching pursuit,” IEEE Trans . Signal Process ., vol. 60, no. 9, pp. 4973 – 4976, 2012. [3] L.−− H. Chang and J. Y. Wu, “An improved rip− based performance guarantee for sparse signal recovery via orthogonal matching pursuit,” IEEE Trans . Inf . Theory, vol. 60, no. 9, pp. 5702 – 5715, 2014. [4] J. Troppet al ., “Greed is good : Algorithmic results for ,” IEEE Trans . Inf . Theory , vol. 50, no. 10, pp. 2231– 2242, 2004. [5] Y. Chi and R. Calderbank, “Coherence− based performance guarantees of orthogonal matching pursuit,” in Communication, Control , and Computing ( Allerton) , 2012 50th Annual Allerton Conference on . IEEE, 2012, pp. 2003 – 2009. [6] J. A. Tropp and A. C. Gilbert, “Signal recovery from random measurements via orthogonal matching pursuit,” IEEE Trans . Inf . Theory , vol. 53, no. 12, pp. 4655 – 4666, 2007. [7] S. K. Sahoo and A. Makur, “Signal recovery from random measurements via extended orthogonal matching pursuit,” IEEE Trans . Signal Process ., vol. 63, no. 10, pp. 2572 – 2581, 2015 [8] J. Wang, S. Kwon, and B. Shim, “Generalized orthogonal matching pursuit,” IEEE Trans . Signal Process ., vol. 60, no. 12, pp. 6202 – 6216, 2012. [9] S. Satpathi, R. L. Das, and M. Chakraborty, “Improving the bound on the rip constant in generalized orthogonal matching pursuit,” IEEE Signal Process . Lett ., vol. 20, no. 11, pp. 1074 –1077, 2013. [10] C. Soussen, R. Gribonval, J. Idier, and C. Herzet, “Joint k− step analysis of orthogonal matching pursuit and orthogonal least squares,” IEEE Trans . Inf . Theory , vol. 59, no. 5, pp. 3158 – 3174, 2013. [11] S. Mukhopadhyay, P. Vashishtha, and M. Chakraborty, “Signal recovery in uncorrelated and correlated dictionaries using orthogonal least squares,” arXiv preprint arXiv :1607.08712, 2016. [12] J. Wang and P. Li, “Recovery of Sparse Signals Using Multiple Orthogonal Least Squares,” arXiv preprint arXiv :1410.2505, Oct. 2014. [13] D. Needell and J. A. Tropp, “Cosamp : Iterative signal recovery from incomplete and inaccurate samples,” Appl . Comput . Harmon . Anal ., vol. 26, no. 3, pp. 301– 321, 2009. [14] S. Foucart, “Sparse recovery algorithms : sufficient conditions in terms of restricted isometry constants,” in Approximation Theory XIII : San Antonio 2010. Springer, 2012, pp. 65 – 77. [15] S. Satpathi and M. Chakraborty, “On the number of iterations for convergence of cosamp and sp algorithm,” arXiv preprint arXiv :1404.4927, 2014. [16] W. Dai and O. Milenkovic, “Subspace pursuit for compressive sensing signal reconstruction,” IEEE Trans . Inf. Theory , vol. 55, no. 5, pp. 2230 – 2249, 2009. [17] T. Blumensath and M. E. Davies, “Iterative hard thresholding for compressed sensing,” Appl . Comput . Harmon . Anal ., vol. 27, no. 3, pp. 265 – 274, 2009. [18] S. Foucart, “Hard thresholding pursuit : an algorithm for compressive sensing,”SIAM J . Numer . Anal ., vol. 49, no. 6, pp. 2543 – 2563, 2011. [19] J.− L. Bouchot, S. Foucart, and P. Hitczenko, “Hard thresholding pursuit algorithms : number of iterations,” Appl. Comput . Harmon . Anal ., 2016. [20] Y. C. Eldar and G. Kutyniok,Compressed sensing : theory and applications . Cambridge University Press, 2012. [21] L. Baldassarre, N. Bhan, V. Cevher, A. 
Kyrillidis, and S. Satpathi, “Group− sparse model selection : Hardness and relaxations,” arXiv preprint arXiv :1303.3207, 2013.