Sparsity, Randomness and Compressed Sensing
Petros Boufounos, Mitsubishi Electric Research Labs, [email protected]

Sparsity

Why Sparsity
• Natural data and signals exhibit structure
• Sparsity often captures that structure
• Very general signal model
• Computationally tractable
• Wide range of applications in signal acquisition, processing, and transmission

Signal Representations

Signal example: Images
• 2‐D function
• Idealized view: an element of some function space defined over the image plane
• In practice: discrete, i.e., a matrix of pixel averages

Signal Models

• Classical model: the signal lies in a linear vector space (e.g., bandlimited functions)
• Sparse model: signals of interest are often sparse or compressible
[Figure: the classical subspace model vs. the sparse model in (x1, x2, x3) coordinates]
Signal      Transform
Image       Wavelet
Bat sonar   Gabor / Chirp / STFT
i.e., very few large coefficients, many close to zero.

Sparse Signal Models
• Sparse signals have few non-zero coefficients
[Figure: 1-sparse and 2-sparse signal sets in (x1, x2, x3) coordinates]
• Compressible signals have few significant coefficients; the sorted coefficients decay as a power law
[Figure: the compressible signal set, an lp ball with p < 1]

Sparse Approximation

Computational Harmonic Analysis
• Representation: f = Σᵢ αᵢψᵢ, with coefficients αᵢ and a basis or frame {ψᵢ}
• Analysis: study f through the structure of its coefficients α; {ψᵢ} should extract the features of interest
• Approximation: use just a few terms; exploit the sparsity of α

Wavelet Transform Sparsity
• Many small coefficients (blue)

Sparseness ⇒ Approximation

[Figure: sorted coefficient magnitudes — few big, many small]
Linear Approximation

• K-term approximation: use the "first" K coefficients (a fixed subset)
[Figure: coefficients kept in index order]

Nonlinear Approximation

• K-term approximation: use the K largest coefficients, selected independently for each signal
• Greedy / thresholding
[Figure: sorted coefficients — keep the few big ones]
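To make the difference concrete, a small numerical sketch (illustrative assumptions: numpy, and a compressible signal modeled as power-law coefficients at random locations, so the big coefficients are not in a fixed subset):

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 1024, 50

# Illustrative compressible signal: power-law coefficients in random order.
alpha = np.sign(rng.standard_normal(N)) * np.arange(1, N + 1) ** -1.0
rng.shuffle(alpha)

# Linear K-term approximation: keep the "first" K coefficients.
lin = np.zeros(N)
lin[:K] = alpha[:K]

# Nonlinear K-term approximation: keep the K largest in magnitude.
nonlin = np.zeros(N)
idx = np.argsort(np.abs(alpha))[-K:]
nonlin[idx] = alpha[idx]

print("linear error:   ", np.linalg.norm(alpha - lin))
print("nonlinear error:", np.linalg.norm(alpha - nonlin))
# The nonlinear error is far smaller: it always catches the few big coefficients.
```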
Error Approximation Rates

• Approximation error ‖f − f_K‖ → 0 as K → ∞
• Optimize the asymptotic error decay rate
• Nonlinear approximation works better than linear

Compression is Approximation
• Lossy compression of an image creates an approximation: f ≈ Σᵢ αᵢψᵢ, with coefficients αᵢ in a basis or frame {ψᵢ}
• Quantize the retained coefficients to a total bit budget

Sparse approximation ≠ Compression
• Sparse approximation chooses the coefficients (simply threshold) but does not quantize them or worry about their locations

Location, Location, Location
• Nonlinear approximation selects the K largest coefficients to minimize the error (easy: threshold)
• A compression algorithm must encode both a set of coefficient values and their locations (harder)

Exposing Sparsity

Spikes and Sinusoids example
• Example signal model: sinusoidal with a few spikes
• DCT basis B: f = B a is sparse for the sinusoids, but the spikes are not sparse in the DCT domain

Spikes and Sinusoids Dictionary

• Dictionary of DCT basis and impulses: f = D a, with D = [DCT basis | impulses]
• Lost uniqueness!! With an overcomplete D, many coefficient vectors a represent the same f

Overcomplete Dictionaries
• Strategy: improve sparse approximation by constructing a large dictionary, f = D a
• How do we design a dictionary?

Dictionary Design
[Figure: a dictionary D assembled from wavelets; DCT, DFT; edgelets, curvelets, …; the impulse basis; oversampling frames; …]
• Can we just throw everything we know into the bucket?

Dictionary Design Considerations
• Dictionary size:
  – Computation and storage increase with size
• Fast transforms:
  – FFT, DCT, FWT, etc. dramatically decrease computation and storage
• Coherence:
  – Similarity between elements makes the solution harder

Dictionary Coherence
• Two candidate dictionaries:
[Figure: D1 (distinct elements, fine) vs. D2 (many similar elements, BAD!)]
• Intuition: D2 has too many similar elements; it is very coherent
• Coherence (similarity) between elements: ⟨dᵢ, dⱼ⟩
• Dictionary coherence: μ = max_{i≠j} |⟨dᵢ, dⱼ⟩|

Incoherent Bases

• "Mix" the signal components well:
  – Impulses and the Fourier basis
  – Anything and a random Gaussian basis
  – Anything and a random 0-1 basis
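As a quick numerical check, the coherence of the spikes-plus-DCT dictionary from the earlier example can be computed directly (a minimal sketch assuming numpy and scipy; unit-norm atoms):

```python
import numpy as np
from scipy.fft import dct

N = 64
# Dictionary: impulses (identity) next to an orthonormal DCT basis.
D = np.hstack([np.eye(N), dct(np.eye(N), norm="ortho")])

# Coherence: largest inner product between distinct (unit-norm) atoms.
G = np.abs(D.T @ D)
np.fill_diagonal(G, 0.0)
mu = G.max()
print(f"coherence = {mu:.4f}")  # for spikes+DCT this scales like 1/sqrt(N)
```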
Computing Sparse Representations

Thresholding

• Compute the full set of coefficients, a = D†f, then zero out the small ones
• Computationally efficient; good for small and very incoherent dictionaries
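A minimal sketch of the thresholding strategy, assuming numpy and a user-supplied dictionary D, signal f, and sparsity level K:

```python
import numpy as np

def threshold_approx(D, f, K):
    """Keep the K largest coefficients of the pseudoinverse solution a = D† f."""
    a = np.linalg.pinv(D) @ f           # full coefficient set
    small = np.argsort(np.abs(a))[:-K]  # indices of all but the K largest
    a[small] = 0.0                      # zero out the small coefficients
    return a
```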
Matching Pursuit

1. Measure the signal against the dictionary: ρ = Dᵀf, i.e., ρₖ = ⟨dₖ, f⟩
2. Select the largest correlation ρₖ
3. Add it to the representation: aₖ ← aₖ + ρₖ
4. Compute the residual: f ← f − ρₖdₖ
5. Iterate using the residual
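A minimal matching pursuit sketch along these lines (assumptions: numpy, unit-norm dictionary columns, and a fixed iteration count as the stopping rule):

```python
import numpy as np

def matching_pursuit(D, f, n_iter=50):
    """Greedy MP; assumes the columns of D are unit-norm atoms."""
    a = np.zeros(D.shape[1])
    r = f.copy()                    # residual
    for _ in range(n_iter):
        rho = D.T @ r               # correlate residual with dictionary
        k = np.argmax(np.abs(rho))  # select the largest correlation
        a[k] += rho[k]              # add to the representation
        r -= rho[k] * D[:, k]       # update the residual
    return a
```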
Greedy Pursuits Family

• Several variations of MP: OMP, StOMP, ROMP, CoSaMP, Tree MP, … (you can create an AndrewMP if you work on it…)
• Some have provable guarantees
• Some improve the dictionary search
• Some improve the coefficient selection

CoSaMP (Compressive Sampling MP)
1. Measure the residual against the dictionary: ρ = Dᵀr, with ρₖ = ⟨dₖ, r⟩
2. Select the locations of the 2K largest correlations and add them to the support set: Ω = supp(ρ|₂K) ∪ supp(a)
3. Invert over the support: b = D_Ω† f
4. Truncate to the K largest entries: a = b|K, T = supp(b|K)
5. Compute the residual: r ← f − Da
6. Iterate using the residual
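A minimal CoSaMP sketch following these steps (assumptions: numpy, unit-norm columns, a least-squares solve standing in for the pseudoinverse D_Ω†, and a fixed iteration count instead of a convergence test):

```python
import numpy as np

def cosamp(D, f, K, n_iter=20):
    """CoSaMP sketch: D has unit-norm columns, K is the target sparsity."""
    a = np.zeros(D.shape[1])
    r = f.copy()
    for _ in range(n_iter):
        rho = D.T @ r                                        # proxy correlations
        omega = np.argsort(np.abs(rho))[-2 * K:]             # 2K largest locations
        omega = np.union1d(omega, np.flatnonzero(a))         # merge with current support
        b, *_ = np.linalg.lstsq(D[:, omega], f, rcond=None)  # invert: b = D_Ω† f
        a[:] = 0.0
        keep = np.argsort(np.abs(b))[-K:]                    # truncate to K largest
        a[omega[keep]] = b[keep]
        r = f - D @ a                                        # new residual
    return a
```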
Sparse approximation:
• Minimize the number of non-zeros in the representation, subject to: the representation is close to the signal

min ‖a‖0 s.t. f ≈ Da

• ‖a‖0 counts non-zeros (the sparsity measure); f ≈ Da enforces data fidelity (the approximation quality)
• Combinatorial complexity. Very hard problem!
Convex relaxation:

min ‖a‖1 s.t. f ≈ Da

• Polynomial complexity; solved using linear programming
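Because the l1 objective is piecewise linear, the relaxation can indeed be posed as a linear program by splitting a = a⁺ − a⁻ with a⁺, a⁻ ≥ 0. A minimal sketch using scipy's linprog, assuming an exact constraint f = Da for simplicity:

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(D, f):
    """min ||a||_1 s.t. Da = f, as an LP over a = a_pos - a_neg."""
    n = D.shape[1]
    c = np.ones(2 * n)         # objective: sum(a_pos) + sum(a_neg) = ||a||_1
    A_eq = np.hstack([D, -D])  # D a_pos - D a_neg = f
    res = linprog(c, A_eq=A_eq, b_eq=f, bounds=(0, None))
    return res.x[:n] - res.x[n:]
```

At the optimum at most one of each pair (a⁺ᵢ, a⁻ᵢ) is nonzero, so the LP objective equals the l1 norm of the recovered a.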
Why l1 relaxation works

min ‖a‖1 s.t. f ≈ Da

[Figure: the l1 "ball" grows until it touches the constraint set f = Da; the touching point lies on a coordinate axis, i.e., at a sparse solution]

Basis Pursuits
• Have provable guarantees:
  – Find the sparsest solution for incoherent dictionaries
• Several variants in formulation: BPDN, LASSO, Dantzig selector, …
• Variations on the fidelity term and the relaxation choice
• Several fast algorithms: FPC, GPSR, SPGL, …

Compressed Sensing: Sensing, Sampling and Data Processing

Data Acquisition

• Usual acquisition methods sample signals uniformly:
  – Time: A/D converters with microphones, geophones, hydrophones
  – Space: CCD cameras, sensor arrays
• Foundation: Nyquist/Shannon sampling theory
  – Sample at twice the signal bandwidth
  – Generally a projection onto a complete basis that spans the signal space

Data Processing and Transmission
• Data processing steps:
  – Sample densely: signal x, N coefficients
  – Transform to an informative domain (Fourier, wavelet)
  – Process/compress/transmit: keep K < N coefficients, setting the small ones to zero (sparsification)

Sparsity Model

• Signals can usually be compressed in some basis
• Sparsity: a good prior for picking from a lot of candidates

Compressive Sensing Principles

• If a signal is sparse, do not waste effort sampling the empty space
• Instead, use fewer samples and allow ambiguity
• Use the sparsity model to reconstruct and uniquely resolve the ambiguity
[Figure: 1-sparse and 2-sparse signal sets in (x1, x2, x3) coordinates]

Measuring Sparse Signals

Compressive Measurements

[Figure: measurement (projection) y = Φx introduces ambiguity; reconstruction resolves it]
• Φ has rank M ≪ N
• N = signal dimensionality, M = number of measurements (dimensionality of y), K = signal sparsity
• N ≫ M ≳ K

One Simple Question

Geometry of Sparse Signal Sets

Geometry: Embedding in R^M

Illustrative Example

• Example: 1-sparse signal, N = 3, K = 1, M = 2K = 2
[Figure: projections of the 1-sparse set onto measurement pairs — e.g., (y1 = x2, y2 = x3) with x1 = 0, or y1 = x1 = x2 with y2 = x3, collapse distinct sparse signals (bad!); other choices keep them separated (good!), and a generic projection does even better]

Restricted Isometry Property

RIP as a "Stable" Embedding

Verifying RIP

Universality Property

Democracy

[Figure: Φ̃ — the measurement matrix with bad/lost/dropped measurements removed]
• Measurements are democratic [Davenport, Laska, Boufounos, Baraniuk]:
  – They are all equally important
  – We can lose some arbitrarily (i.e., an adversary can choose which ones)
• The remaining Φ̃ still satisfies the RIP (as long as we don't drop too many)

Reconstruction

Requirements for Reconstruction

• Let x1, x2 be K-sparse signals (i.e., x1 − x2 is 2K-sparse)
• The mapping y = Φx is invertible for K-sparse signals: Φ(x1 − x2) ≠ 0 if x1 ≠ x2
• The mapping is robust for K-sparse signals: ‖Φ(x1 − x2)‖2 ≈ ‖x1 − x2‖2
  – Restricted Isometry Property (RIP): Φ preserves distances when projecting K-sparse signals
• Guarantees there exists a unique K-sparse signal that explains the measurements, and that recovery is robust to noise

Reconstruction Ambiguity

• The solution should be consistent with the measurements: find x̂ s.t. y = Φx̂ or y ≈ Φx̂
• Projections imply that an infinite number of solutions are consistent!
• Classical approach: use the pseudoinverse (minimize the l2 norm)
• Compressive sensing approach: pick the sparsest
• RIP guarantee: the sparsest solution is unique and reconstructs the signal
• Reconstruction becomes a sparse approximation problem!

Putting everything together

Compressed Sensing Coming Together

[Figure: signal structure (sparsity) → measurement y = Φx via a stable embedding (random projections) → reconstruction of x̃ by non-linear reconstruction using sparse approximation (Basis Pursuit, Matching Pursuit, CoSaMP, etc.)]
• Signal model: provides prior information; allows undersampling
• Randomness: provides robustness/stability; makes proofs easier
• Non-linear reconstruction: incorporates information through computation

Beyond: Extensions, Connections, Generalizations

Sparsity Models

Block Sparsity

[Figure: y = Φx, with x a signal containing a few nonzero blocks of length L]
• Mixed l1/l2 norm, a sum of l2 norms over blocks: Σᵢ ‖x_{Bᵢ}‖2
• Basis pursuit becomes: min Σᵢ ‖x_{Bᵢ}‖2 s.t. y ≈ Φx
• Blocks are not allowed to overlap

Joint Sparsity

[Figure: Y = ΦX, with X an N × L matrix of L sparse signals in R^N and Y the M × L measurements]
• L sparse signals in R^N; sparse components per signal, with common support
• Mixed l1/l2 norm, a sum of l2 norms over rows: Σᵢ ‖x(i,·)‖2
• Basis pursuit becomes: min Σᵢ ‖x(i,·)‖2 s.t. y ≈ Φx

Randomized Embeddings

Stable Embeddings

Johnson-Lindenstrauss Lemma

Favorable JL Distributions

Connecting JL to RIP

More? The tip of the iceberg

• Today's lecture
• Compressive Sensing Repository: dsp.rice.edu/cs
• Blog on CS: nuit-blanche.blogspot.com/
• Yet to be discovered… Start working on it
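As a concrete starting point, here is a minimal end-to-end sketch of the pipeline above (assumptions: numpy only, a random Gaussian Φ, and a basic orthogonal matching pursuit standing in for the reconstruction algorithms discussed; the dimensions N, M, K are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, K = 256, 64, 8                       # N >> M >~ K

# K-sparse signal x.
x = np.zeros(N)
support = rng.choice(N, K, replace=False)
x[support] = rng.standard_normal(K)

# Random Gaussian measurement matrix (columns roughly unit-norm).
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
y = Phi @ x                                # M << N compressive measurements

# Orthogonal matching pursuit: grow the support greedily, least-squares fit.
S, r = [], y.copy()
for _ in range(K):
    S.append(int(np.argmax(np.abs(Phi.T @ r))))          # most correlated column
    coef, *_ = np.linalg.lstsq(Phi[:, S], y, rcond=None)  # fit on current support
    r = y - Phi[:, S] @ coef                              # update residual

x_hat = np.zeros(N)
x_hat[S] = coef
print("relative error:", np.linalg.norm(x - x_hat) / np.linalg.norm(x))
```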