EE269 Signal Processing for Machine Learning Lecture 17

Instructor : Mert Pilanci

Stanford University

March 13, 2019

Continuous Wavelet Transform

- Define a function $\psi(t)$
- Create scaled and shifted versions of $\psi(t)$

$$\psi_{s,\tau}(t) = \frac{1}{\sqrt{s}}\,\psi\!\left(\frac{t-\tau}{s}\right)$$

- Continuous Wavelet Transform

$$W(s,\tau) = \int_{-\infty}^{\infty} f(t)\,\psi_{s,\tau}^{*}(t)\,dt = \langle f(t), \psi_{s,\tau} \rangle$$

- Transforms a continuous function of one variable into a continuous function of two variables: translation and scale
- For a compact representation, we can choose a mother wavelet $\psi(t)$ that matches the signal shape
- Inverse Wavelet Transform (written here without the admissibility constant and the $1/s^2$ scale weighting of the full formula)

$$f(t) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} W(s,\tau)\,\psi_{s,\tau}(t)\,d\tau\,ds$$
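The definition of $W(s,\tau)$ can be checked numerically. The following is a minimal sketch, assuming a Haar-shaped mother wavelet and a midpoint Riemann sum in place of the integral; the helper names (`haar_mother`, `scaled_shifted`, `cwt_point`) and the test signal are illustrative choices, not part of the lecture.

```python
import math

def haar_mother(t):
    """Haar-shaped mother wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    if 0.0 <= t < 0.5:
        return 1.0
    if 0.5 <= t < 1.0:
        return -1.0
    return 0.0

def scaled_shifted(psi, s, tau):
    """Return psi_{s,tau}(t) = psi((t - tau) / s) / sqrt(s)."""
    return lambda t: psi((t - tau) / s) / math.sqrt(s)

def cwt_point(f, psi, s, tau, t_min, t_max, n=20000):
    """Approximate W(s, tau) = integral of f(t) * psi_{s,tau}(t) dt.

    Uses a midpoint Riemann sum on [t_min, t_max]. The wavelet is real,
    so the complex conjugate in the definition is omitted.
    """
    dt = (t_max - t_min) / n
    psi_st = scaled_shifted(psi, s, tau)
    total = 0.0
    for i in range(n):
        t = t_min + (i + 0.5) * dt
        total += f(t) * psi_st(t)
    return total * dt

# Hypothetical test signal: a Haar-shaped bump of scale 2 starting at t = 3.
signal = scaled_shifted(haar_mother, 2.0, 3.0)

w_match = cwt_point(signal, haar_mother, 2.0, 3.0, 0.0, 10.0)  # same scale and shift
w_shift = cwt_point(signal, haar_mother, 2.0, 6.0, 0.0, 10.0)  # disjoint support
w_scale = cwt_point(signal, haar_mother, 1.0, 3.5, 0.0, 10.0)  # smaller scale, partial match
```

In this toy case the coefficient is largest (about 1) when the wavelet has the same scale and shift as the signal, about 0.707 for the smaller scale, and 0 when the supports do not overlap, which is the sense in which a mother wavelet that matches the signal shape gives a compact representation.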

[Figures: Continuous Wavelet Transform (two slides)]

Continuous Haar Wavelets

- Consider the function

$$\phi(x) = \begin{cases} 1 & \text{if } 0 \le x < 1 \\ 0 & \text{otherwise} \end{cases}$$

- Consider linear combinations of the translates $\phi(x-k)$
- Define

$V_0$ = all square integrable functions of the form $g(x) = \sum_k a_k\,\phi(x-k)$
= all square integrable functions which are constant on integer intervals

$V_1$ = all square integrable functions of the form $g(x) = \sum_k a_k\,\phi(2x-k)$
= all square integrable functions which are constant on half integer intervals

$V_j$ = all square integrable functions of the form $g(x) = \sum_k a_k\,\phi(2^j x-k)$
= all square integrable functions which are constant on intervals of length $2^{-j}$
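To make the spaces $V_j$ concrete, here is a small sketch that approximates a function on $[0, 1)$ by an element of $V_j$. It assumes the best piecewise-constant approximation is used, which replaces $f$ by its average on each interval of length $2^{-j}$ (the orthogonal projection onto $V_j$); the function names and the example $f(x) = x$ are hypothetical.

```python
def project_onto_Vj(f, j, samples_per_interval=1000):
    """Coefficients a_k of g(x) = sum_k a_k * phi(2**j * x - k) on [0, 1).

    Each a_k is the average of f over [k * 2**-j, (k + 1) * 2**-j),
    approximated with a midpoint rule.
    """
    n_intervals = 2 ** j
    width = 1.0 / n_intervals
    h = width / samples_per_interval
    coeffs = []
    for k in range(n_intervals):
        total = sum(f(k * width + (i + 0.5) * h) for i in range(samples_per_interval))
        coeffs.append(total / samples_per_interval)
    return coeffs

def evaluate(coeffs, j, x):
    """Evaluate g(x) for x in [0, 1): only one translate phi(2**j * x - k) is nonzero."""
    return coeffs[int(2 ** j * x)]

f = lambda x: x                # example function, chosen for illustration
v0 = project_onto_Vj(f, 0)     # constant on [0, 1)
v1 = project_onto_Vj(f, 1)     # constant on half intervals
v2 = project_onto_Vj(f, 2)     # constant on quarter intervals
```

For $f(x) = x$ the coefficients are approximately `[0.5]`, `[0.25, 0.75]` and `[0.125, 0.375, 0.625, 0.875]`. Averaging neighbouring pairs of the $V_{j+1}$ coefficients gives the $V_j$ coefficients, which reflects the fact that every function in $V_j$ also lies in $V_{j+1}$.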

Continuous Haar Wavelets

- Nested spaces: $\cdots \subset V_{-2} \subset V_{-1} \subset V_0 \subset V_1 \subset V_2 \subset \cdots$
- There is a subspace $W_0$ such that $V_0 \oplus W_0 = V_1$, i.e. $W_0 := V_1 \ominus V_0$
- In general $W_j = V_{j+1} \ominus V_j$
- Theorem: Every square integrable function can be uniquely expressed as

$$\sum_{j=-\infty}^{\infty} w_j \quad \text{where } w_j \in W_j$$

- Define

$$\psi(x) = \begin{cases} 1 & \text{if } 0 \le x < 1/2 \\ -1 & \text{if } 1/2 \le x < 1 \\ 0 & \text{otherwise} \end{cases}$$

- $\left\{ \psi_{jk}(x) = 2^{j/2}\,\psi(2^j x - k) \right\}_{k=-\infty}^{\infty}$ forms an orthonormal basis for $W_j$
- Each function can be written as $f = \sum_j w_j$
- $f = \sum_j \sum_k a_{jk}\,\psi_{jk}(x)$ (multiresolution analysis)
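The orthonormality of the Haar family $2^{j/2}\psi(2^j x - k)$ can be checked numerically. This is a minimal sketch, assuming a midpoint Riemann sum on $[-4, 4)$ is an adequate stand-in for the inner product integral; the names `psi`, `psi_jk` and `inner` are illustrative.

```python
def psi(x):
    """Haar wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 otherwise."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0

def psi_jk(j, k):
    """Return the function x -> 2**(j/2) * psi(2**j * x - k)."""
    return lambda x: 2.0 ** (j / 2.0) * psi(2.0 ** j * x - k)

def inner(g, h, a=-4.0, b=4.0, n=2 ** 14):
    """Midpoint Riemann sum for the integral of g(x) * h(x) over [a, b)."""
    dx = (b - a) / n
    return sum(g(a + (i + 0.5) * dx) * h(a + (i + 0.5) * dx) for i in range(n)) * dx

norm_00 = inner(psi_jk(0, 0), psi_jk(0, 0))        # unit norm
norm_11 = inner(psi_jk(1, 1), psi_jk(1, 1))        # unit norm at a finer scale
same_scale = inner(psi_jk(1, 0), psi_jk(1, 1))     # different shifts, same scale
cross_scale = inner(psi_jk(0, 0), psi_jk(1, 0))    # different scales, overlapping support
coarser = inner(psi_jk(-1, 0), psi_jk(0, 0))       # coarser scale against psi itself
```

The two norms come out as 1 and the three cross terms as 0 (up to floating-point error), including the cases where the supports overlap, which is what lets the coefficients $a_{jk}$ be computed independently as inner products.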

Discrete Wavelet Transform

- Discrete shifts and scales $\psi\!\left(\frac{t-\tau}{s}\right)$
- Suppose we have a signal of length $N$

$$x = [x_1, x_2, \ldots, x_N]$$

- Consider a length $N/2$ approximation of $x$, e.g., for transmission
- Pairwise averages:

$$s_k = \frac{x_{2k-1} + x_{2k}}{2}, \quad k = 1, \ldots, N/2$$

- Example:

$$x = [6, 12, 15, 15, 14, 12, 120, 116] \rightarrow s = [9, 15, 13, 118]$$

- Suppose that we are allowed to send $N/2$ more numbers
- Differences:

$$d_k = \frac{x_{2k-1} - x_{2k}}{2}, \quad k = 1, \ldots, N/2$$

- We can recover $x$, since $x_{2k-1} = s_k + d_k$ and $x_{2k} = s_k - d_k$

$$x = [6, 12, 15, 15, 14, 12, 120, 116] \rightarrow [s \mid d] = [9, 15, 13, 118 \mid -3, 0, 1, 2]$$

- One step Haar Transformation: $x \rightarrow [s \mid d]$
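The averaging and differencing step, and its inverse, can be sketched in a few lines. The function names are illustrative, and the code uses 0-based indexing, so the pair $(x_{2k-1}, x_{2k})$ in the formulas corresponds to `x[2*k], x[2*k + 1]`.

```python
def haar_step(x):
    """One step of the (unnormalized) Haar transform.

    Returns the pairwise averages s and the pairwise half-differences d.
    """
    if len(x) % 2 != 0:
        raise ValueError("signal length must be even")
    half = len(x) // 2
    s = [(x[2 * k] + x[2 * k + 1]) / 2 for k in range(half)]
    d = [(x[2 * k] - x[2 * k + 1]) / 2 for k in range(half)]
    return s, d

def haar_step_inverse(s, d):
    """Recover x from [s | d]: each pair is (s_k + d_k, s_k - d_k)."""
    x = []
    for sk, dk in zip(s, d):
        x.extend([sk + dk, sk - dk])
    return x

x = [6, 12, 15, 15, 14, 12, 120, 116]
s, d = haar_step(x)
print(s, d)                      # [9.0, 15.0, 13.0, 118.0] [-3.0, 0.0, 1.0, 2.0]
x_rec = haar_step_inverse(s, d)  # equals the original signal
```

Sending `s` alone gives the length $N/2$ approximation; sending `d` as well makes the step exactly invertible.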

[Figure: One Step Haar Transformation]

Discrete Haar Transform Matrix

- Repeat the computation on the means
- Keep differences in each step

[Figure: Discrete Haar Transform Filter Bank]

[Figures: Discrete Haar Wavelet Transform (four slides)]

Wavelet Transform Features

- mean, median
- variance
- zero crossing rate, mean crossing rate
- entropy
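As an illustration of how such features might be computed, the sketch below repeats the Haar step on the means, keeps the differences from each level, and summarizes one sub-band with the statistics listed above. It is not the pipeline behind the reported results: the histogram-based entropy with 8 bins, the strict sign-change definition of a crossing, and all function names are assumptions made for this example.

```python
import math
from collections import Counter
from statistics import mean, median, pvariance

def haar_step(x):
    """Pairwise averages and half-differences (one Haar step)."""
    half = len(x) // 2
    s = [(x[2 * k] + x[2 * k + 1]) / 2 for k in range(half)]
    d = [(x[2 * k] - x[2 * k + 1]) / 2 for k in range(half)]
    return s, d

def haar_dwt(x, levels):
    """Repeat the step on the means, keeping the differences of each level."""
    approx, details = list(x), []
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)
    return approx, details

def crossing_rate(c, level=0.0):
    """Fraction of consecutive pairs lying strictly on opposite sides of `level`."""
    if len(c) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(c, c[1:]) if (a - level) * (b - level) < 0)
    return crossings / (len(c) - 1)

def shannon_entropy(c, bins=8):
    """Shannon entropy (nats) of a histogram of the values (assumed definition)."""
    lo, hi = min(c), max(c)
    if hi == lo:
        return 0.0
    counts = Counter(min(int((v - lo) / (hi - lo) * bins), bins - 1) for v in c)
    n = len(c)
    return -sum((m / n) * math.log(m / n) for m in counts.values())

def features(c):
    """Summary statistics of one sub-band of coefficients."""
    return {
        "mean": mean(c),
        "median": median(c),
        "variance": pvariance(c),
        "zero_crossing_rate": crossing_rate(c, 0.0),
        "mean_crossing_rate": crossing_rate(c, mean(c)),
        "entropy": shannon_entropy(c),
    }

x = [6, 12, 15, 15, 14, 12, 120, 116]
approx, details = haar_dwt(x, 3)
feats = features(details[0])   # features of the finest-scale differences
```

A feature vector for a classifier could then be formed by concatenating the features of every sub-band; here the final approximation `[38.75]` is simply the mean of the whole signal.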

slide credit: A. Taspinar

Human Activity Recognition dataset

Results: training set: 7724 signals, test set: 2575 signals

- 3-Nearest Neighbors, $\ell_2$-norm distance on $x[n]$. Accuracy: 77%
- 3-Nearest Neighbors, $\ell_2$-norm distance on $|X[k]|$. Accuracy: 85%
- 1D Convolutional Net (4 layers). Accuracy: 91%
- Wavelet Transform Features (entropy, zero crossing, simple statistics) + linear classifier. Accuracy: 95%

[Figure: Other Wavelets]

Other Wavelets

- In MATLAB, [c,l] = wavedec(x,n,wname) returns the wavelet decomposition of the signal x at level n using the wavelet wname

[Figure: Other Discrete Wavelets]

What makes a good wavelet

Application specific:

- Compact time support vs frequency support
- Smoothness

[Figures: 2D Discrete Haar Transform (two slides)]

Short-time Fourier Transform

- Window signal, e.g.

$$w[m] = \begin{cases} 0 & m < 0,\ m \ge L \\ 1 & 0 \le m \le L-1 \end{cases}$$

- Short Time Fourier Transform (STFT)

$$X[n,k] = \sum_{m=0}^{L-1} x[n+m]\,w[m]\,e^{-j(2\pi/N)km}, \quad 0 \le k \le N-1$$

- Continuous Frequency STFT

$$X[n,\omega] = \sum_{m=0}^{L-1} x[n+m]\,w[m]\,e^{-j\omega m}$$
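The discrete STFT sum can be evaluated directly. The sketch below assumes the rectangular window from the example and adds a `hop` parameter (not in the definition, which is stated for every $n$) so that only every `hop`-th frame is computed; the two-tone test signal is also a made-up example.

```python
import cmath
import math

def stft(x, L, N, hop=1):
    """Frames X[n][k] = sum_{m=0}^{L-1} x[n+m] * w[m] * exp(-j*2*pi*k*m/N).

    Uses the rectangular window w[m] = 1 for 0 <= m <= L-1. Frames start at
    n = 0, hop, 2*hop, ... (hop is an added convenience parameter).
    """
    w = [1.0] * L
    frames = []
    for n in range(0, len(x) - L + 1, hop):
        frames.append([
            sum(x[n + m] * w[m] * cmath.exp(-2j * math.pi * k * m / N) for m in range(L))
            for k in range(N)
        ])
    return frames

# Test signal: frequency bin 1 for the first 16 samples, bin 3 for the next 16.
x = [math.cos(2 * math.pi * 1 * n / 8) for n in range(16)]
x += [math.cos(2 * math.pi * 3 * n / 8) for n in range(16, 32)]

frames = stft(x, L=8, N=8, hop=8)

# Strongest non-negative frequency bin in each frame.
peaks = [max(range(8 // 2 + 1), key=lambda k: abs(frame[k])) for frame in frames]
```

Here `peaks` is `[1, 1, 3, 3]`: the transform localizes the change in frequency to the frame where it happens, which a single DFT of the whole signal would not show.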

Short-time Fourier Transform

$$X[n,\omega] = \sum_{m=0}^{L-1} x[n+m]\,w[m]\,e^{-j\omega m}$$

- Windowing the signal: the windowed signal $x[n+m]\,w[m]$ is correlated with the basis signal $e^{-j\omega m}$
- or, windowing the basis: the signal $x[n+m]$ is correlated with the windowed basis $w[m]\,e^{-j\omega m}$

Short-time Fourier Transform vs Wavelet Transform

- Wavelets have adaptive windows:
  - short windows for higher frequencies (small scale)
  - long windows for lower frequencies (large scale)

Wavelet Transform vs STFT

- Wavelet transform analyzes a signal at different frequencies with different resolutions:
  - good time resolution and relatively poor frequency resolution at high frequencies
  - good frequency resolution and relatively poor time resolution at low frequencies
- Wavelet transform is better for signals with non-periodic and fast transient features (i.e., high frequency content for short duration)

[Figures: Wavelet Transform (two slides); Wavelet Transform vs STFT; Wavelet Transform vs STFT: Locality (two slides)]

Fourier vs Wavelet Transforms

- The Fourier transform has a convolution theorem and other closed-form mathematical relationships
- No comparable closed-form relations exist for wavelet transforms
- The Fourier transform has uniform spectral resolution
- The wavelet transform has adaptive resolution

Application: Audio Fingerprinting

- Spectrogram is reduced to prominent time-frequency locations
- Constellations are robust against noise and outliers
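One way to reduce a spectrogram to prominent time-frequency locations is to keep only strict local maxima above a threshold. The sketch below assumes that definition of "prominent"; the neighborhood radius, the threshold, and the toy magnitude array are hypothetical and do not come from the lecture or from any particular fingerprinting system.

```python
def constellation(S, radius=1, min_mag=0.0):
    """Return (time, frequency) indices of prominent peaks in a magnitude array S.

    A point is kept if it exceeds min_mag and is strictly larger than every
    other value in its (2*radius + 1) x (2*radius + 1) neighborhood.
    """
    n_times, n_freqs = len(S), len(S[0])
    peaks = []
    for n in range(n_times):
        for k in range(n_freqs):
            v = S[n][k]
            if v <= min_mag:
                continue
            neighbors = [
                S[a][b]
                for a in range(max(0, n - radius), min(n_times, n + radius + 1))
                for b in range(max(0, k - radius), min(n_freqs, k + radius + 1))
                if (a, b) != (n, k)
            ]
            if all(v > u for u in neighbors):
                peaks.append((n, k))
    return peaks

# Toy spectrogram magnitudes: rows are time frames, columns are frequency bins.
S = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 5.0, 0.3, 0.1],
    [0.1, 0.4, 0.2, 0.1],
    [0.0, 0.1, 0.3, 4.0],
]

# The same array with a small deterministic perturbation standing in for noise.
S_noisy = [[v + 0.05 * ((n + k) % 2) for k, v in enumerate(row)] for n, row in enumerate(S)]

peaks = constellation(S, radius=1, min_mag=1.0)
peaks_noisy = constellation(S_noisy, radius=1, min_mag=1.0)
```

Both calls return `[(1, 1), (3, 3)]`: the small perturbation changes the magnitudes but not the peak locations, which is the kind of robustness the constellation representation relies on.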