MAP 555
Part 2 : Digital Signal Processing

R. Flamary

December 7, 2020

Full course overview
1. Fourier analysis and analog filtering
   1.1 …
   1.2 … and filtering
   1.3 Applications of analog processing
2. Digital signal processing
   2.1 Sampling and properties of discrete signals
   2.2 z-Transform and transfer function
   2.3 Fast Fourier Transform
3. Random signals
   3.1 Random signals, stochastic processes
   3.2 Correlation and spectral representation
   3.3 Filtering and linear prediction of stationary random signals
4. Signal representation and dictionary learning
   4.1 Non stationary signals and short time FT
   4.2 Common signal representations (Fourier, wavelets)
   4.3 Source separation and dictionary learning
5. Signal processing with machine learning
   5.1 Learning the representation with deep learning
   5.2 Generating realistic signals

Course overview (Part 2)
- Digital signal processing
- Sampling
  - Analog/Digital conversion
  - Reconstruction of analog signals
  - Aliasing
- Digital filtering
  - Discrete Convolution
  - Discrete Time Fourier Transform (DTFT)
  - Z-transform and transfer function
- Finite signals
  - Circular convolution
  - Discrete Fourier Transform (DFT)
  - Fast Fourier Transform and fast convolution
- Applications of DSP
  - Digital filter design
  - Sinc interpolation
  - Digital Image Processing
- Random signals
- Signal representation and dictionary learning
- Signal processing with machine learning

Digital Signal Processing
- Microprocessors widely available and cheap since the 70s.
- Analog-to-digital converters (ADC) and digital-to-analog converters (DAC).
- Standard signal processing chain: ADC → DSP → DAC.
- Digital SP is more robust/stationary.
- Analog SP is faster but sensitive to physics (temperature).
- Digital Signal Processing can be done on dedicated hardware or on general processors.

Sampling
Sampling in the Fourier domain

Principle
- Sampling is the reduction of a continuous-time signal to a discrete-time signal.
- A discrete signal sampled with period T is expressed as

\[ x_T(t) = \sum_{n=-\infty}^{\infty} x(nT)\,\delta(t - nT) \qquad (1) \]

- T is the sampling period (or interval), \(f_s = \frac{1}{T}\) is the sampling frequency.
- If x(nT) is bounded for n ∈ Z, then x_T(t) is a tempered distribution.

Fourier transform of the sampled signal
- Due to the properties of the Dirac δ, the sampled signal can be written as a product with a Dirac comb.
- The FT of x_T(t) can be expressed as a function of X(f) = F[x(t)]:

\[ x_T(t) = x(t) \sum_{n=-\infty}^{\infty} \delta(t - nT) = x(t)\,Ш_T(t) \qquad (2) \]

\[ \mathcal{F}[x_T(t)] = \mathcal{F}[x(t)\,Ш_T(t)] = \frac{1}{T}\, X(f) \star Ш_{1/T}(f) = \frac{1}{T} \sum_{n=-\infty}^{\infty} X\!\left(f - \frac{n}{T}\right) \qquad (3) \]

where Ш_T(t) is the Dirac comb of period T.
- Regular sampling leads to a periodization in the Fourier domain.

Nyquist/Shannon sampling Theorem
Nyquist/Shannon sampling Theorem: Proof

Theorem [Shannon, 1949] [Nyquist, 1928]
Let x(t) be a signal of Fourier transform X(f) with support in \([-\frac{1}{2T}, \frac{1}{2T}]\). Then the signal x(t) can be reconstructed from its samples with

\[ x(t) = \sum_{n=-\infty}^{\infty} h_T(t - nT)\, x(nT) = x_T(t) \star h_T(t) \qquad (4) \]

where

\[ h_T(t) = \mathrm{sinc}\!\left(\frac{\pi t}{T}\right) = \frac{\sin(\pi t / T)}{\pi t / T} \qquad (5) \]

For a signal of frequency support [−B, B], B is often called the Nyquist frequency (half the sampling rate necessary for reconstruction).

Proof. Let \(x_T(t) = x(t) \sum_{n=-\infty}^{\infty} \delta(t - nT)\) be the sampled signal. Its Fourier transform is

\[ X_T(f) = \frac{1}{T} \sum_{n=-\infty}^{\infty} X\!\left(f - \frac{n}{T}\right) \]

Since we know that X has support \([-\frac{1}{2T}, \frac{1}{2T}]\), for all \(f \in [-\frac{1}{2T}, \frac{1}{2T}]\) we have \(X_T(f) = \frac{1}{T} X(f)\). Now, to reconstruct the signal we can multiply in the Fourier domain by the ideal filter

\[ H(f) = \begin{cases} T & \text{if } |f| < \frac{1}{2T} \\ 0 & \text{else} \end{cases} \]

Using the bounded support of X we now have, for all f,

\[ X_T(f)\, H(f) = X(f) \]

which in the temporal domain means

\[ x(t) = x_T(t) \star h(t) = h(t) \star \sum_{n=-\infty}^{\infty} x(nT)\,\delta(t - nT) = \sum_{n=-\infty}^{\infty} x(nT)\, h(t - nT) \]

where \(h(t) = \mathrm{sinc}(\frac{\pi t}{T})\) is the inverse FT of H.

Aliasing
Example of aliasing
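The reconstruction formula (4) can be checked numerically. Below is a minimal sketch with NumPy; the helper `sinc_interp`, the 2 Hz test cosine and the sampling period are illustrative choices, not part of the course material. Note that `np.sinc` is the normalized sinc, `np.sinc(u) = sin(pi*u)/(pi*u)`:

```python
import numpy as np

def sinc_interp(samples, T, t):
    """Shannon reconstruction, Eq. (4): x(t) = sum_n x(nT) sinc(pi(t - nT)/T)."""
    n = np.arange(len(samples))
    # np.sinc is the normalized sinc: np.sinc(u) = sin(pi*u)/(pi*u)
    return np.sum(samples[None, :] * np.sinc((t[:, None] - n[None, :] * T) / T), axis=1)

T = 0.1                                       # sampling period: fs = 10 Hz, Nyquist 5 Hz
n = np.arange(64)
x_samples = np.cos(2 * np.pi * 2.0 * n * T)   # 2 Hz cosine, well below the Nyquist frequency
t = np.linspace(1.0, 3.0, 201)                # evaluate away from the window borders
x_rec = sinc_interp(x_samples, T, t)
err = np.max(np.abs(x_rec - np.cos(2 * np.pi * 2.0 * t)))
assert err < 0.1                              # only the truncation of the infinite sum remains
```

Away from the borders of the sampling window the truncated sinc series matches the original cosine closely; the residual error comes only from truncating the infinite sum in (4) to 64 terms.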

[Figure: aliasing in the time domain: the signal x(t), its samples x_T(t) and the reconstruction x_T(t) ⋆ h_T(t), without and with aliasing]

Aliasing
- The FT of x_T(t) is a weighted sum of shifted copies of X(f) = F[x(t)]:

\[ \mathcal{F}[x_T(t)] = \mathcal{F}[x(t)\,Ш_T(t)] = \frac{1}{T}\, X(f) \star Ш_{1/T}(f) = \frac{1}{T} \sum_{n=-\infty}^{\infty} X\!\left(f - \frac{n}{T}\right) \qquad (6) \]

- When the support of X(f) is not in \([-\frac{1}{2T}, \frac{1}{2T}]\), the repeated copies overlap in frequency.
- In this case the signal cannot be reconstructed and some information is lost.

Example of aliasing
- Let x(t) = cos(2πf₀t) be a signal that we want to sample. We suppose that \(\frac{f_s}{2} < f_0 < f_s\).
- We have

\[ X_T(f) = \frac{1}{2T} \sum_{k} \big(\delta(f - f_0 - k f_s) + \delta(f + f_0 - k f_s)\big) \]

- The only components of the spectrum in \([-\frac{f_s}{2}, \frac{f_s}{2}]\) are

\[ X_T(f)\,H(f) = \frac{1}{2}\big(\delta(f - f_0 + f_s) + \delta(f + f_0 - f_s)\big) \]

- Reconstructed signal: \(\hat{x}(t) = \cos(2\pi(f_s - f_0)t)\)
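The aliasing of the cosine above can be verified directly on the samples: for f_s/2 < f₀ < f_s, the samples of cos(2πf₀t) are exactly those of a cosine at frequency f_s − f₀. A small sketch (the values f_s = 10 Hz and f₀ = 9 Hz are arbitrary):

```python
import numpy as np

fs, f0 = 10.0, 9.0                               # f0 > fs/2: the sampling theorem is violated
n = np.arange(100)
x = np.cos(2 * np.pi * f0 * n / fs)              # samples of the 9 Hz cosine
alias = np.cos(2 * np.pi * (fs - f0) * n / fs)   # samples of a 1 Hz cosine
# cos(2*pi*9n/10) = cos(2*pi*n - 2*pi*n/10) = cos(2*pi*n/10): identical samples
assert np.allclose(x, alias)
```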

Aliasing in real life
Analog to Digital Conversion

Aliasing
- The sampling frequency has to be twice the maximum frequency in the signal.
- When sampling high frequency real life signals, one always needs a low-pass filter (analog) before sampling.
- Can be solved by oversampling (followed by filtering then subsampling).
- Anti-aliasing filters in graphic cards (and digital cameras).

ADC circuits
- Low pass filtering before sampling (analog).
- Several sources of noise: jitter (non perfect clock), non-linearity, …
- For images, CCD or CMOS (smartphone) sensors count photons.

Quantization
- Computers are discrete: digital signals are discrete both in time and in value.
- Quantization is the conversion from continuous values to a finite bit format.
- The number of bits has an important impact on the SNR after reconstruction.

Discrete signal (1)
Discrete signal (2)

Notations
- x(t) with t ∈ R is the analog signal.
- x_T(t) with t ∈ R is the sampled signal of period T, but still in continuous time:

\[ x_T(t) = \sum_{n=-\infty}^{\infty} x(nT)\,\delta(t - nT) \]

- x[n] with n ∈ Z is the discrete signal sampled with period T, such that x[n] = x(nT).
- Obviously one can recover x_T(t) from x[n] with

\[ x_T(t) = \sum_{n=-\infty}^{\infty} x[n]\,\delta(t - nT) \]

- In order to simplify notations we will suppose T = 1 in the following.
- In this course we suppose that x[n] is bounded.

Discrete dirac
We note δ[n] the discrete dirac, defined as

\[ \delta[n] = \begin{cases} 1 & \text{for } n = 0 \\ 0 & \text{else} \end{cases} \qquad (7) \]

[Figure: the discrete dirac δ[n]]

Discrete signal
Any discrete signal x[n] can be decomposed as a sum of translated discrete diracs:

\[ x[n] = \sum_{k=-\infty}^{\infty} x[k]\,\delta[n - k] \qquad (8) \]

The discrete diracs are an orthogonal basis of L²(Z) with scalar product and corresponding norm

\[ \langle x[n], h[n] \rangle = \sum_{k=-\infty}^{\infty} x[k]\, h^*[k], \qquad \|x[n]\|^2 = \langle x[n], x[n] \rangle = \sum_{k=-\infty}^{\infty} |x[k]|^2. \]

Discrete Convolution
Transfer function

Convolution between discrete signals
Let x[n] and h[n] be two discrete signals. The convolution between them is expressed as

\[ x[n] \star h[n] = \sum_{k=-\infty}^{\infty} x[k]\, h[n-k] \qquad (9) \]

[Figure: a filter h[n] in the time domain and its transfer function magnitude |H(e^{2iπf})|]

Digital filter properties
Let the discrete system/operator/filter L be described by its impulse response h[n].
- Causality: L is causal if h[n] = 0 for all n < 0, that is

\[ h[n] = h[n]\,\Gamma[n], \quad \text{where } \Gamma[n] = \begin{cases} 1 & \text{for } n \geq 0 \\ 0 & \text{else} \end{cases} \qquad (10) \]

- Stability: a system is stable if the output of a bounded input is bounded. A necessary and sufficient condition is that

\[ \sum_{n=-\infty}^{\infty} |h[n]| < \infty \qquad (11) \]

Transfer function of a discrete filter
- Let L be a digital filter of impulse response h[n].
- For an input \(e_f[k] = e^{i2\pi fk}\), the output of the filter is

\[ L e_f[n] = \sum_{k=-\infty}^{\infty} e^{i2\pi f(n-k)}\, h[k] = e^{i2\pi fn} \sum_{k=-\infty}^{\infty} e^{-i2\pi fk}\, h[k] \qquad (12) \]

- The \(e_f[k] = e^{i2\pi fk}\) are then the eigenvectors of the discrete convolution operator.
- The transfer function of the filter is defined as follows:

\[ H(e^{i2\pi f}) = \sum_{k=-\infty}^{\infty} e^{-i2\pi fk}\, h[k] \qquad (13) \]

  This actually corresponds to the Fourier transform of the impulse response h[n].

Discrete Time Fourier Transform (DTFT)
Fourier Transform and discrete convolution
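The transfer function (13) can be evaluated numerically: for a finite impulse response, `scipy.signal.freqz` computes H(e^{i2πf}) and matches a direct evaluation of the sum. The 3-tap smoothing filter below is an arbitrary example:

```python
import numpy as np
from scipy.signal import freqz

h = np.array([0.25, 0.5, 0.25])              # arbitrary 3-tap FIR smoothing filter
f, H = freqz(h, worN=256, fs=1.0)            # H(e^{2i pi f}) for f in [0, 0.5)
k = np.arange(len(h))
# direct evaluation of Eq. (13): H(e^{2i pi f}) = sum_k h[k] e^{-2i pi f k}
H_direct = np.array([np.sum(h * np.exp(-2j * np.pi * fi * k)) for fi in f])
assert np.allclose(H, H_direct)
assert np.isclose(H[0], 1.0)                 # DC gain = sum of the taps
```

With `fs=1.0` the returned frequencies live in [0, 1/2), matching the normalized frequency convention used here.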

Fourier transform
The Discrete Time Fourier Transform of the discrete signal x[n] is defined as

\[ X(e^{i2\pi f}) = \sum_{k=-\infty}^{\infty} e^{-i2\pi fk}\, x[k] \qquad (14) \]

- It is periodic and equivalent to the Fourier transform of x_T(t).
- For a tempered distribution (x[n] bounded), all the FT properties are preserved.

Orthonormal basis of L²([0, 1])
- The FT of a discrete signal is periodic (of period T = 1) and can be expressed as a Fourier series.
- The \(e^{i2\pi fn}\) define an orthogonal basis of L²([0, 1]) with scalar product \(\langle a(f), b(f) \rangle = \int_0^1 a(f)\, b^*(f)\, df\).
- The coefficients can be recovered using the scalar product:

\[ x[n] = \langle X(e^{i2\pi f}), e^{-i2\pi fn} \rangle = \int_0^1 X(e^{i2\pi f})\, e^{i2\pi fn}\, df \qquad (15) \]

Theorem
Let x[n] ∈ L²(Z) and h[n] ∈ L²(Z) be two discrete signals. The Fourier transform of y[n] = x[n] ⋆ h[n] is

\[ Y(e^{i2\pi f}) = X(e^{i2\pi f})\, H(e^{i2\pi f}) \qquad (16) \]

- Similarly to continuous signals, the FT of the convolution is a pointwise multiplication.
- This shows that the filter has an effect (amplification and attenuation) on the individual frequency components.
- The FT of a temporal multiplication y[n] = x[n]h[n] can also be expressed as

\[ Y(e^{i2\pi f}) = \int_0^1 X(e^{i2\pi u})\, H(e^{i2\pi (f-u)})\, du = X(e^{i2\pi f}) \ast H(e^{i2\pi f}) \]

- Conservation of energy implies that \(\sum_{n=-\infty}^{\infty} |x[n]|^2 = \int_0^1 |X(e^{i2\pi f})|^2\, df\)

[Figure: input x[n], filter h[n] and output y[n] in the discrete time domain and in the Fourier domain]

Ideal filter
Digital filter

Ideal low pass filter
- The Fourier transform of the ideal low pass filter with \(f_c < \frac{1}{2}\) is

\[ H_0(e^{i2\pi f}) = \begin{cases} 1 & \text{for } |f| < f_c \\ 0 & \text{else} \end{cases} \qquad (17) \]

- Using Eq. (15) one can recover the impulse response

\[ h[n] = \int_{-1/2}^{1/2} H_0(e^{i2\pi f})\, e^{i2\pi fn}\, df = \frac{\sin(2\pi f_c n)}{\pi n} = 2 f_c\, \mathrm{sinc}(2\pi f_c n) \qquad (18) \]

  That is a regular sampling of the continuous impulse response.
- This filter is non causal and cannot be implemented in practice (infinite sum).
- In practice one has to approximate this filter using Finite Impulse Response or Infinite Impulse Response filters.

Recurrent filter formulation
- We want to design an implementable filter (finite number of computations).
- We define the relation between the input x[n] and the output y[n] as a difference equation:

\[ \sum_{k=0}^{N} a_k\, y[n-k] = \sum_{k=0}^{M} b_k\, x[n-k] \qquad (19) \]

  where the a_k and b_k are real and a₀ ≠ 0.
- The sample y[n] can be expressed as

\[ y[n] = \frac{1}{a_0} \left( \sum_{k=0}^{M} b_k\, x[n-k] - \sum_{k=1}^{N} a_k\, y[n-k] \right) \qquad (20) \]

- Can be computed only from the past (causal) and with M + N + 1 multiply/accumulate operations.
- Called an Infinite Impulse Response (IIR) filter when N ≥ 1, because the recurrence implies that y[n] depends on all the values of x[k] for k ≤ n.
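The recurrence (20) is exactly what `scipy.signal.lfilter(b, a, x)` implements. A sketch with a one-pole IIR filter (the coefficient values are arbitrary):

```python
import numpy as np
from scipy.signal import lfilter

b = np.array([0.2])           # M = 0
a = np.array([1.0, -0.8])     # N = 1: y[n] = 0.2 x[n] + 0.8 y[n-1]
x = np.zeros(50); x[0] = 1.0  # a discrete dirac exposes the impulse response
y = lfilter(b, a, x)

# direct implementation of the recurrence, Eq. (20)
y_ref = np.zeros(50)
for n in range(50):
    y_ref[n] = 0.2 * x[n] + (0.8 * y_ref[n - 1] if n > 0 else 0.0)
assert np.allclose(y, y_ref)
# infinite impulse response: geometric decay h[n] = 0.2 * 0.8**n, never exactly zero
assert np.allclose(y, 0.2 * 0.8 ** np.arange(50))
```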


Finite Impulse Response (FIR) filter, when N = 0:

\[ y[n] = \sum_{k=0}^{M} \frac{b_k}{a_0}\, x[n-k] = x[n] \star h[n] \qquad (21) \]

The impulse response is

\[ h[n] = \begin{cases} \frac{b_n}{a_0} & \text{for } 0 \leq n \leq M \\ 0 & \text{else} \end{cases} \qquad (22) \]

Example: average filter, with impulse response

\[ h[n] = \begin{cases} \frac{1}{N_a} & \text{for } 0 \leq n \leq N_a \\ 0 & \text{else} \end{cases} \qquad (23) \]

[Figure: effect of the length N_a ∈ {2, 5, 10, 20} of an average filter on x[n] and in the Fourier domain]

Autoregressive (AR) model, when the input term vanishes:

\[ y[n] = -\sum_{k=1}^{N} \frac{a_k}{a_0}\, y[n-k] \qquad (24) \]

The output depends only on the initial (in time) conditions of the output. It is not a filter.

Transfer function
- We recall that the relation between the input x[n] and the output y[n] is defined as

\[ \sum_{k=0}^{N} a_k\, y[n-k] = \sum_{k=0}^{M} b_k\, x[n-k] \]

- Taking the Fourier transform of both terms of the equality we find

\[ \sum_{k=0}^{N} a_k\, e^{-i2\pi fk}\, Y(e^{i2\pi f}) = \sum_{k=0}^{M} b_k\, e^{-i2\pi fk}\, X(e^{i2\pi f}) \]

- The transfer function of the filter is then

\[ H(e^{i2\pi f}) = \frac{Y(e^{i2\pi f})}{X(e^{i2\pi f})} = \frac{\sum_{k=0}^{M} b_k\, e^{-i2\pi fk}}{\sum_{k=0}^{N} a_k\, e^{-i2\pi fk}} \qquad (25) \]

- The transfer function is a rational function of polynomials in \(e^{-i2\pi f}\).

Factorization of the transfer function
Frequency response for discrete signals

Zero/poles factorization
- The transfer function of a recurrent filter is

\[ H(e^{i2\pi f}) = \frac{Y(e^{i2\pi f})}{X(e^{i2\pi f})} = \frac{\sum_{k=0}^{M} b_k\, e^{-i2\pi fk}}{\sum_{k=0}^{N} a_k\, e^{-i2\pi fk}} \qquad (26) \]

- It can be factorized as

\[ H(e^{i2\pi f}) = \frac{b_0}{a_0}\, \frac{\prod_{k=1}^{M} (1 - c_k\, e^{-i2\pi f})}{\prod_{k=1}^{N} (1 - d_k\, e^{-i2\pi f})} \qquad (27) \]

- The c_k are the zeros and the d_k are the poles of the transfer function.
- Modulus and phase are easier to interpret with the factorization.
- It is easier to sketch the frequency response by treating each pole/zero independently.

Modulus and Gain
The modulus of the transfer function as a function of w = 2πf is

\[ |H(e^{iw})| = \frac{|b_0|}{|a_0|}\, \frac{\prod_{k=1}^{M} |1 - c_k\, e^{-iw}|}{\prod_{k=1}^{N} |1 - d_k\, e^{-iw}|} \]

The gain in dB can be expressed as \(G(w) = 20 \log_{10}(|H(e^{iw})|)\):

\[ G(w) = 10\log_{10}\frac{|b_0|^2}{|a_0|^2} + \sum_{k=1}^{M} 10\log_{10}|1 - c_k\, e^{-iw}|^2 - \sum_{k=1}^{N} 10\log_{10}|1 - d_k\, e^{-iw}|^2 \]

where the difference between poles and zeros is only a sign.

Phase
The phase of the transfer function can be expressed similarly:

\[ \mathrm{Arg}(H(e^{iw})) = \mathrm{Arg}\!\left(\frac{b_0}{a_0}\right) + \sum_{k=1}^{M} \mathrm{Arg}(1 - c_k\, e^{-iw}) - \sum_{k=1}^{N} \mathrm{Arg}(1 - d_k\, e^{-iw}) \]
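The factorization (27) can be obtained numerically with `scipy.signal.tf2zpk`, which returns the zeros c_k, poles d_k and gain of a rational transfer function; the one-zero/one-pole coefficients below are arbitrary:

```python
import numpy as np
from scipy.signal import tf2zpk, freqz

b = np.array([1.0, -0.5])     # numerator 1 - 0.5 z^{-1}: one zero
a = np.array([1.0, -0.9])     # denominator 1 - 0.9 z^{-1}: one pole
z, p, k = tf2zpk(b, a)        # zeros, poles and gain of H(z)
assert np.allclose(z, [0.5]) and np.allclose(p, [0.9])

w = np.linspace(0, np.pi, 64, endpoint=False)
_, H = freqz(b, a, worN=w)
# factorized form, Eq. (27): H = k (1 - c e^{-iw}) / (1 - d e^{-iw})
H_fact = k * (1 - z[0] * np.exp(-1j * w)) / (1 - p[0] * np.exp(-1j * w))
assert np.allclose(H, H_fact)
gain_db = 20 * np.log10(np.abs(H))   # gain G(w) in dB
```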

Example for one pole/zero
The Z-transform (1)

[Figure: magnitude (linear and in dB) for one zero with f₀ = 0.2 and r ∈ {0.1, 0.5, 1, 2}]

- We study a transfer function with a unique zero \(c_1 = r\, e^{i2\pi f_0}\):

\[ H(e^{2i\pi f}) = 1 - r\, e^{i2\pi f_0}\, e^{-i2\pi f} \]

- The gain in dB can be expressed as

\[ G_{dB}(f) = 10\log_{10} |1 - r\, e^{i2\pi f_0}\, e^{-i2\pi f}|^2 \]

- The magnitude reaches a minimum for f = f₀ and a maximum in \(f = f_0 + \frac{1}{2}\).
- The attenuation in f = f₀ is perfect when r = 1.
- The phase of the transfer function is

\[ \mathrm{Arg}(H(e^{i2\pi f})) = \arctan\!\left( \frac{r \sin(2\pi(f - f_0))}{1 - r \cos(2\pi(f - f_0))} \right) \]

Definition
The Z-transform is a generalization of the Fourier transform for discrete signals. It can be computed as the Laurent series

\[ H(z) = \mathcal{Z}(h[n]) = \sum_{n=-\infty}^{+\infty} h[n]\, z^{-n} \qquad (28) \]

where z ∈ C is a complex number.

Region of Convergence (ROC)
- The Z-transform is always associated to its region of convergence.
- The Laurent series is said to be convergent if

\[ \sum_{n=-\infty}^{+\infty} |h[n]|\, |z|^{-n} < +\infty \]

- The Region of Convergence is defined as

\[ \mathrm{ROC}(H) = \left\{ z : \sum_{n=-\infty}^{\infty} |h[n]\, z^{-n}| < \infty \right\} \]

- This region depends only on |z|.

The Z-transform (2)
Inverse Z-transform

Region of Convergence (2)
- There always exist ρ₁ and ρ₂ such that H(z) is convergent for ρ₁ < |z| < ρ₂ and divergent for |z| < ρ₁ or |z| > ρ₂.
- When H(z) converges for |z| = 1 we recover the Discrete Time Fourier Transform of the signal.
- For a causal filter, if H(z) converges for |z| = ρ, it converges for |z| > ρ and ρ₂ = ∞.
- If a filter is stable (\(\sum_n |h[n]| < \infty\)) then H(z) is convergent for |z| = 1; if it is causal and stable it is convergent for |z| ≥ 1.

Example
Let h[n] = Γ[n] φⁿ with φ > 0 be a causal impulse response of a filter.

\[ \sum_{n=-\infty}^{\infty} h[n]\, z^{-n} = \sum_{n=0}^{\infty} \varphi^n z^{-n} = \sum_{n=0}^{\infty} \left(\frac{\varphi}{z}\right)^n = \frac{1}{1 - \varphi z^{-1}} \]

Note that the series converges for \(|\varphi z^{-1}| < 1\), hence the region of convergence is |z| > φ.

Definition
The inverse Z-transform depends on the ROC and can be expressed as

\[ x[n] = \mathcal{Z}^{-1}\{X(z)\} = \frac{1}{2\pi j} \oint_C X(z)\, z^{n-1}\, dz \qquad (29) \]

where C is a counterclockwise closed path encircling the origin and lying entirely in ROC(X). It is usually solved using Cauchy's residue theorem.
- When |z| = 1 is in ROC(X) one can compute the inverse Fourier transform for discrete signals:

\[ x[n] = \int_0^1 X(e^{i2\pi f})\, e^{i2\pi fn}\, df \]

Example
Let \(H(z) = \frac{1}{1 - \varphi z^{-1}}\) with ROC(H) = {z : |z| > φ}. This ROC means that h[n] is causal and we recover

\[ H(z) = \frac{1}{1 - \varphi z^{-1}} = \sum_{n=0}^{+\infty} \varphi^n z^{-n} = \sum_{n=-\infty}^{+\infty} \varphi^n\, \Gamma[n]\, z^{-n} \quad \Rightarrow \quad h[n] = \varphi^n\, \Gamma[n] \]

Some properties
- Linearity: \(\mathcal{Z}[a_1 x_1[n] + a_2 x_2[n]] = a_1 X_1(z) + a_2 X_2(z)\)
- Time reversal: \(\mathcal{Z}[x[-n]] = X(z^{-1})\)
- Time delay: \(\mathcal{Z}[x[n - n_0]] = X(z)\, z^{-n_0}\)
- Differentiation: \(\mathcal{Z}[n\, x[n]] = -z \frac{dX(z)}{dz}\)
- Convolution: \(\mathcal{Z}[x[n] \star h[n]] = X(z) H(z)\), with ROC = ROC(X) ∩ ROC(H)
- Scaling in the z-domain: \(\mathcal{Z}[a^n x[n]] = X(a^{-1} z)\) (also scales the ROC)
- Accumulation: \(\mathcal{Z}\left[\sum_{k=-\infty}^{n} x[k]\right] = \frac{X(z)}{1 - z^{-1}}\)

Examples of Z-transform
- Dirac δ: \(\mathcal{Z}[\delta[n - n_0]] = z^{-n_0}\), ROC = {z | 0 < |z| < ∞}
- Unitary step function: \(\mathcal{Z}[\Gamma[n]] = \frac{1}{1 - z^{-1}}\), ROC = {z | |z| > 1}
- \(\mathcal{Z}[a^n \Gamma[n]] = \frac{1}{1 - a z^{-1}}\), ROC = {z | |z| > |a|}
- \(\mathcal{Z}[\cos(w_0 n) \Gamma[n]] = \frac{1 - z^{-1}\cos(w_0)}{1 - 2 z^{-1}\cos(w_0) + z^{-2}}\), ROC = {z | |z| > 1}

Z-transform of recurrent filters
- The Z-transform of a recurrent filter can be expressed as

\[ H(z) = \frac{\sum_{k=0}^{M} b_k z^{-k}}{\sum_{k=0}^{N} a_k z^{-k}} \]

- By a polynomial identification, when the d_k are the simple poles of H(z), one can reformulate this transform as

\[ \hat{H}(z) = \sum_{r=0}^{M-N} B_r z^{-r} + \sum_{k=1}^{N} \frac{A_k}{1 - d_k z^{-1}} \]

- The causal filter corresponding to this Z-transform then has the following impulse response:

\[ h[n] = \sum_{r=0}^{M-N} B_r\, \delta[n - r] + \sum_{k=1}^{N} A_k\, (d_k)^n\, \Gamma[n] \]

- If H(z) has multiple poles then the decomposition is done with the exponents.
- Note that a filter is causal and stable if and only if all its poles satisfy |d_k| < 1.
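The partial fraction decomposition above is available as `scipy.signal.residuez`: the residues and poles it returns give the closed-form impulse response \(h[n] = \sum_k A_k d_k^n \Gamma[n]\). The two-pole filter below is an arbitrary example:

```python
import numpy as np
from scipy.signal import residuez, lfilter

a = np.polymul([1.0, -0.5], [1.0, -0.25])   # simple poles d1 = 0.5, d2 = 0.25
b = np.array([1.0])
r, p, k_direct = residuez(b, a)             # H(z) = sum_k r_k / (1 - p_k z^{-1})

# causal impulse response from the decomposition: h[n] = sum_k r_k * p_k**n, n >= 0
n = np.arange(20)
h_pf = np.real(sum(rk * pk ** n for rk, pk in zip(r, p)))

x = np.zeros(20); x[0] = 1.0
h_ref = lfilter(b, a, x)                    # impulse response by direct recursion
assert np.allclose(h_pf, h_ref)
```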


Finite discrete signals
Circular convolution

Finite discrete signals
- Most of the theoretical results seen up to now correspond to signals x[n] with n ∈ Z.
- In practice recordings are only done for a finite amount of time, resulting in only N samples.
- We define x̃[n] a finite signal of N samples with n ∈ {0, …, N−1}.
- We use in the following the periodization x̄[n] of x̃[n]:

\[ \bar{x}[n] = \tilde{x}[n \bmod N] \]

where mod is the modulo operator.

Discrete convolution of finite signals
The convolution between x̃[n] and h̃[n], both finite signals of N samples, can be expressed as

\[ \tilde{y}[n] = \tilde{x}[n] \star \tilde{h}[n] = \sum_{p=-\infty}^{+\infty} \tilde{x}[p]\, \tilde{h}[n-p] \qquad (30) \]

- It requires values of the signals outside of the sampling window.
- One common approach consists in having x̃[n] and h̃[n] equal to 0 outside the sampling interval. Other choices can be made (see next slides).

Circular convolution
When using the periodic version of the signals, the circular convolution can be computed on a unique period of size N:

\[ \bar{x} \star \bar{h}[n] = \sum_{p=0}^{N-1} \bar{x}[p]\, \bar{h}[n-p]. \]

The circular convolution is rarely appropriate for real life images due to border effects.
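A direct sketch of the circular convolution on one period, checked against the DFT diagonalization property that will be used later for fast convolution (the length-4 signals are arbitrary):

```python
import numpy as np

def circular_conv(x, h):
    """Circular convolution on one period of size N (both inputs of length N)."""
    N = len(x)
    return np.array([sum(x[p] * h[(n - p) % N] for p in range(N)) for n in range(N)])

x = np.array([1.0, 2.0, 3.0, 4.0])
h = np.array([1.0, 1.0, 0.0, 0.0])
y = circular_conv(x, h)
# the DFT diagonalizes circular convolution: y = IDFT(DFT(x) * DFT(h))
y_fft = np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h)))
assert np.allclose(y, y_fft)
```

Note the wrap-around in the result: the contribution of x[3] folds back into y[0], which is exactly the border effect mentioned above.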

Discrete convolution as matrix multiplication
Border effects example

Vector representation and convolution matrix
- A finite signal x of N samples can be represented as a vector \(\mathbf{x} \in \mathbb{C}^N\).
- The convolution operator is linear and can be expressed as

\[ \mathbf{y} = \mathbf{x} \star \mathbf{h} = C_h \mathbf{x} \]

where C_h is a convolution matrix parametrized by the vector h.

Discrete convolution
The convolution operator when the values outside the support are 0 can be expressed as

\[ C_h = \begin{bmatrix} h[0] & 0 & \cdots & 0 \\ h[1] & h[0] & & 0 \\ \vdots & \vdots & \ddots & \vdots \\ h[N-1] & h[N-2] & \cdots & h[0] \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & h[N-1] \end{bmatrix} \]

where \(C_h \in \mathcal{M}_{\mathbb{C}}(2N-1, N)\) is a Toeplitz matrix.

Circular convolution
The circular convolution operator can be expressed as

\[ C_h = \begin{bmatrix} h[0] & h[N-1] & \cdots & h[1] \\ h[1] & h[0] & \cdots & h[2] \\ \vdots & \vdots & \ddots & \vdots \\ h[N-1] & h[N-2] & \cdots & h[0] \end{bmatrix} \]

where \(C_h \in \mathcal{M}_{\mathbb{C}}(N, N)\) is a circulant Toeplitz matrix.

Border effects example
[Figure: signals x[n] (diracs) and h[n] (shape), and their discrete vs circular convolutions y[n]]
- The convolution between diracs x̃[n] and a shape h[n] repeats the shape at the dirac positions.
- A dirac at the end of the signal cuts the shape for the discrete convolution, where the outside of the sampling window is 0.
- With the circular convolution the shape is repeated at the beginning of the signal.
- One can remove border effects by creating a virtual periodic signal with zeros (zero padding, see fast convolution).
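Both convolution matrices can be built with `scipy.linalg.toeplitz` and `scipy.linalg.circulant` and checked against `np.convolve` and the FFT (the length-4 vectors are arbitrary):

```python
import numpy as np
from scipy.linalg import toeplitz, circulant

h = np.array([1.0, 2.0, 3.0, 0.0])
x = np.array([4.0, 1.0, 0.0, 2.0])
N = len(x)

# circulant matrix with first column h: (C @ x)[n] = sum_j h[(n - j) mod N] x[j]
C = circulant(h)
y_circ = C @ x
assert np.allclose(y_circ, np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h))))

# Toeplitz matrix of shape (2N-1, N) for the zero-outside-support convolution
r = np.zeros(N); r[0] = h[0]
T = toeplitz(np.concatenate([h, np.zeros(N - 1)]), r)
assert np.allclose(T @ x, np.convolve(x, h))
```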

Discrete convolution in practice
Discrete convolution in practice (2)

[Figure: finite signals x[n] (N = 32) and h[n] (M = 8), and the convolutions x ⋆ h[n] computed with scipy.signal.convolve (modes 'full', 'valid', 'same') and scipy.ndimage.convolve (modes 'reflect', 'nearest', 'mirror', 'wrap')]

The Scipy scipy.signal.convolve function:
- Convolution between two signals of support respectively N and M samples, supposing that their values are 0 outside of the support.
- The third parameter, mode, changes the size of the output:
  - mode='full' returns a signal of support N + M − 1 (default).
  - mode='valid' returns a signal of support |N − M| + 1 with only the samples that do not rely on zero padding of the larger signal.
  - mode='same' returns a signal of the same size as the first input.
- The parameter method allows choosing between 'direct' computation and 'fft', and selects the most efficient by default.

The Scipy scipy.ndimage.convolve function:
- Always returns the same size as the first parameter by default.
- The mode parameter allows selecting the borders of a signal x = (abcd):
  - mode='reflect' : (dcba|abcd|dcba) (default)
  - mode='constant' : (kkkk|abcd|kkkk)
  - mode='nearest' : (aaaa|abcd|dddd)
  - mode='mirror' : (dcb|abcd|cba)
  - mode='wrap' : (abcd|abcd|abcd) (circular convolution)
- The parameter origin allows selecting the origin of the filter h.
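The output sizes of the three modes can be checked directly, with N = 32 and M = 8 as in the figure:

```python
import numpy as np
from scipy.signal import convolve

x = np.ones(32)                 # support N = 32
h = np.ones(8) / 8              # support M = 8
assert convolve(x, h, mode='full').shape == (32 + 8 - 1,)    # N + M - 1
assert convolve(x, h, mode='valid').shape == (32 - 8 + 1,)   # N - M + 1
assert convolve(x, h, mode='same').shape == (32,)            # size of the first input
y_dir = convolve(x, h, method='direct')
y_fft = convolve(x, h, method='fft')
assert np.allclose(y_dir, y_fft)                             # both methods agree
```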

Discrete Fourier Transform (DFT)
Finite signal and vector space

Definition
The Discrete Fourier Transform of a periodic signal x[n] can be expressed as

\[ X[k] = \sum_{p=0}^{N-1} x[p]\, e^{-i2\pi k \frac{p}{N}} \qquad (31) \]

- The DFT of a periodic discrete signal is also periodic and discrete.
- This means that both the signal and its Fourier transform can be stored in memory in a size N vector.
- The frequency axis is sampled regularly between 0 and \(\frac{N-1}{N}\).
- The Inverse Discrete Fourier Transform (IDFT) can be computed as

\[ x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k]\, e^{i2\pi k \frac{n}{N}} \qquad (32) \]

- The complexity of naively computing the DFT is O(N²).

Vector space of finite signals
- The space of finite signals is a finite dimensional space with scalar product and norm

\[ \langle x, h \rangle = \sum_{k=0}^{N-1} x[k]\, h^*[k], \qquad \|x\|^2 = \langle x, x \rangle = \sum_{k=0}^{N-1} |x[k]|^2 \]

- The family of discrete exponentials \((e_k[n])_{0 \leq k \leq N-1}\), with \(e_k[n] = e^{i2\pi k \frac{n}{N}}\), is an orthogonal basis of the space of finite discrete signals of period N.

DFT as a change of basis
- A signal x[n] can be decomposed on the basis of complex exponentials:

\[ x[n] = \sum_{k=0}^{N-1} \frac{\langle x, e_k \rangle}{\|e_k\|^2}\, e_k[n] \qquad (33) \]

- The DFT can be computed as \(X[k] = \langle x, e_k \rangle\).
- Since \(\|e_k\|^2 = N\), we recover the IDFT as

\[ x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k]\, e^{i2\pi k \frac{n}{N}} \qquad (34) \]
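A literal O(N²) implementation of (31) and (32), checked against `np.fft.fft` (the random length-16 input is arbitrary):

```python
import numpy as np

def dft(x):
    """Naive O(N^2) DFT, Eq. (31): X[k] = sum_p x[p] e^{-2i pi k p / N}."""
    N = len(x)
    p = np.arange(N)
    return np.array([np.sum(x * np.exp(-2j * np.pi * k * p / N)) for k in range(N)])

x = np.random.default_rng(0).standard_normal(16)
X = dft(x)
assert np.allclose(X, np.fft.fft(x))     # agrees with the library FFT
# IDFT, Eq. (32): x[n] = (1/N) sum_k X[k] e^{+2i pi k n / N}
k = np.arange(16)
x_rec = np.array([np.mean(X * np.exp(2j * np.pi * k * n / 16)) for n in range(16)])
assert np.allclose(x_rec, x)             # imaginary parts vanish up to rounding
```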


Properties of DFT
Examples of DFT (1)

Parseval-Plancherel identity for finite discrete signals
Since the family \((e_k[n])_{0 \leq k \leq N-1}\) is orthogonal, we can recover the Plancherel identity for discrete signals as

\[ \sum_{n=0}^{N-1} |x[n]|^2 = \frac{1}{N} \sum_{k=0}^{N-1} |X[k]|^2 \qquad (35) \]

Circular convolution
The circular convolution y[n] = x[n] ⋆ h[n] is a signal of period N and its Discrete Fourier Transform can be expressed as

\[ Y[k] = X[k]\, H[k] \qquad (36) \]

This will be used for fast convolution with the FFT.

Border effect
- The DFT supposes that the signal is periodic.
- When the signal is recorded, there is no reason for it to be periodic: x[N−1] and x[0] can be very different.
- This can introduce some high frequencies in practice.

Examples of DFT
- x[n] = δ[n] with period N: the discrete dirac leads to a DFT constant at value 1. It does not depend on N.
- x[n] = δ[n − n₀] with period N: the delayed discrete dirac gives a complex exponential, with DFT magnitude |X[k]| constant at value 1.
- x[n] = n with period N: small variations of the signal give mostly low frequencies, but the large change due to the periodicity requires high frequencies.
- x[n] = cos(2πf₀n) with period N:
  - When f₀ = 6/N = k/N (one of the sampled frequencies), the DFT contains only two components.
  - When f₀ = 6.2/N ≠ k/N, the energy spreads over the neighboring frequencies.
- Average filtering: x[n] = 1/m for n < m, else x[n] = 0. One recognizes the periodic sinc corresponding to the rectangular function.

[Figures: signals x[n] (N = 32) and their DFTs X[k] (real part, imaginary part, magnitude) for each example]

Examples of DFT (4): Stairway to Heaven
- First 10 seconds of "Stairway to Heaven" from Led Zeppelin, sampled at 44100 Hz.
- [Figure: waveform x[n], DFT magnitude |X[k]|, and a zoom of |X(e^{2iπf})| on [1, 1000] Hz]

Plotting the DFT and fftshift
- The DFT is defined on a finite number of frequencies \(\frac{f_s k}{N}\) and is periodic.
- When sampling at frequency f_s we suppose that the signal has been filtered to avoid aliasing.
- All relevant frequencies can be expressed as \(\frac{f_s k}{N}\) with k = −N/2, …, N/2 − 1.
- One often uses what is called fftshift to center the 0 frequency.
- [Figure: DFT of a cosine (N = 32) before and after fftshift]
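A sketch of fftshift on a cosine whose frequency falls exactly on a DFT bin (N = 64, f_s = 100 Hz and f₀ = 12.5 Hz are arbitrary, chosen so that f₀N/f_s is an integer and there is no leakage):

```python
import numpy as np

N, fs = 64, 100.0
n = np.arange(N)
x = np.cos(2 * np.pi * 12.5 * n / fs)        # 12.5 Hz cosine: 12.5 * N / fs = 8, an exact bin
X = np.fft.fft(x)
f = np.fft.fftfreq(N, d=1 / fs)              # frequencies fs*k/N, negative half stored at the end
Xs, fshift = np.fft.fftshift(X), np.fft.fftshift(f)   # center the 0 frequency
peaks = fshift[np.abs(Xs) > N / 4]           # the two spectral lines of the cosine
assert set(np.round(peaks, 6)) == {-12.5, 12.5}
```

After the shift the two lines of the cosine appear symmetrically at ±12.5 Hz, each with magnitude N/2.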

Fourier Transform as matrix multiplication
Fourier Transform matrix

Vector representation
- A periodic signal x can be represented as a vector \(\mathbf{x} = [x[0], x[1], x[2], \ldots, x[N-1]]^T \in \mathbb{C}^N\).
- The DFT is a linear operator that can be represented as a matrix \(F_N = [\mathbf{e}_0, \mathbf{e}_1, \ldots, \mathbf{e}_{N-1}]^T \in \mathcal{M}_{\mathbb{C}}(N, N)\) of components

\[ F_{k,p} = e^{-\frac{i2\pi (p-1)(k-1)}{N}} \]

\[ F_1 = \begin{bmatrix} 1 \end{bmatrix}, \quad F_2 = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}, \quad F_4 = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -i & -1 & i \\ 1 & -1 & 1 & -1 \\ 1 & i & -1 & -i \end{bmatrix} \]

\[ F_N = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ 1 & e^{-\frac{i2\pi}{N}} & \cdots & e^{-\frac{i2\pi(N-1)}{N}} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & e^{-\frac{i2\pi(N-1)}{N}} & \cdots & e^{-\frac{i2\pi(N-1)(N-1)}{N}} \end{bmatrix} \]

DFT and IDFT
- The DFT of x can be computed as \(\mathbf{x}^{DFT} = F \mathbf{x}\).
- The IDFT can be computed with \(\mathbf{x} = F^{-1} \mathbf{x}^{DFT} = \frac{1}{N} F^H \mathbf{x}^{DFT}\), where \(F^H\) is the conjugate transpose of F.
- Each line of the matrix is a basis function e_k[n], ordered by increasing frequency.
- The lines of the matrix contain the eigenvectors of all circular convolution matrices (hence the pointwise multiplication in the Fourier domain).

[Figure: real and imaginary parts of F₃₂]

Fast Fourier Transform (1)
Fast Fourier Transform (2)
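Building F_N explicitly and checking the matrix forms of the DFT and IDFT, plus the diagonalization of a circular convolution matrix (N = 8 and the random vectors are arbitrary):

```python
import numpy as np
from scipy.linalg import circulant

N = 8
k, p = np.meshgrid(np.arange(N), np.arange(N), indexing='ij')
F = np.exp(-2j * np.pi * k * p / N)          # F[k, p] = e^{-2i pi k p / N}

x = np.random.default_rng(1).standard_normal(N)
assert np.allclose(F @ x, np.fft.fft(x))             # DFT as a matrix product
assert np.allclose(F.conj().T @ (F @ x) / N, x)      # IDFT with (1/N) F^H

# F diagonalizes every circular convolution matrix: F C_h F^{-1} = diag(DFT(h))
h = np.random.default_rng(2).standard_normal(N)
C = circulant(h)
D = F @ C @ F.conj().T / N
assert np.allclose(D, np.diag(np.fft.fft(h)))
```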

- The DFT of a finite signal of size N can be computed with

\[ X[k] = \sum_{p=0}^{N-1} x[p]\, e^{-i2\pi k \frac{p}{N}} \qquad (37) \]

- It is of complexity O(N²) since it requires, for each element X[k], the computation of O(N) operations.
- By reorganizing the computation, one can greatly decrease the computational complexity.

History of FFT
- Traced back to Carl Friedrich Gauss in 1805, who used it to interpolate the trajectories of the asteroids Pallas and Juno.
- Rediscovered several times until its publication by [Cooley and Tukey, 1965].
- Considered one of the most important algorithms in history.

Principle of radix-2 decimation-in-time (DIT)
First one can decompose the DFT between the even and odd parts of the sum:

\[ X[k] = \sum_{p=0}^{N-1} x[p]\, e^{-i2\pi k \frac{p}{N}} = \sum_{p=0}^{N/2-1} x[2p]\, e^{-i2\pi k \frac{2p}{N}} + \sum_{p=0}^{N/2-1} x[2p+1]\, e^{-i2\pi k \frac{2p+1}{N}} \]

\[ = \sum_{p=0}^{N/2-1} x[2p]\, e^{-i2\pi k \frac{p}{N/2}} + e^{-i2\pi \frac{k}{N}} \sum_{p=0}^{N/2-1} x[2p+1]\, e^{-i2\pi k \frac{p}{N/2}} = E[k] + e^{-i2\pi \frac{k}{N}}\, O[k] \]

where E[k] is the DFT of e[n] = x[2n] (even samples of x) and O[k] is the DFT of o[n] = x[2n+1] (odd samples of x).
Note that, using the periodicity of the complex exponential, one can also recover

\[ X[k + N/2] = E[k] + e^{-i2\pi \frac{k + N/2}{N}}\, O[k] = E[k] - e^{-i2\pi \frac{k}{N}}\, O[k] \]

Fast Fourier Transform (3)
Fast Fourier Transform (4)

Cooley–Tukey FFT algorithm [Cooley and Tukey, 1965]

Algorithm: FFT(N, x[n])
  if N = 1 then
    return x[0]
  else
    E[k] = FFT(N/2, x[2n])
    O[k] = FFT(N/2, x[2n+1])
    for k = 0 to N/2 − 1 do
      X[k] = E[k] + e^{−2iπk/N} O[k]
      X[k + N/2] = E[k] − e^{−2iπk/N} O[k]
    end for
    return X[k]
  end if

- Classical divide and conquer approach.
- Recurrent algorithm, often called butterfly due to the crossing of even/odd samples.
- Can be implemented with limited memory use, only O(N).
- Has been extended to general N through factorization.

Complexity of FFT
The complexity C(N) of FFT(N, x[n]) can be obtained from the recurrence

\[ C(N) = 2 C\!\left(\frac{N}{2}\right) + 4N \]

Let us suppose that L = log₂(N) is an integer; then we have

\[ \frac{C(N)}{N} = \frac{C(N/2)}{N/2} + K \]

where K = 4. With T(L) = C(N)/N we can see that

\[ T(L) = T(L-1) + K = T(0) + \sum_{l=1}^{L} K = O(KL + T(0)) \]

Since T(0) = 0 we recover the final complexity:

\[ C(N) = O(K N \log_2(N)) \]

- The inverse FFT can be computed with \(\mathrm{IFFT}(X[k]) = \frac{1}{N} \mathrm{FFT}(X^*[k])^*\)
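The recursive algorithm above translates almost line by line into code. A sketch, not an optimized implementation (real libraries use iterative, in-place butterflies):

```python
import numpy as np

def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT (N must be a power of 2)."""
    N = len(x)
    if N == 1:
        return np.asarray(x, dtype=complex)
    E = fft_radix2(x[0::2])                  # DFT of the even samples
    O = fft_radix2(x[1::2])                  # DFT of the odd samples
    twiddle = np.exp(-2j * np.pi * np.arange(N // 2) / N)
    # butterfly: X[k] = E[k] + w^k O[k], X[k + N/2] = E[k] - w^k O[k]
    return np.concatenate([E + twiddle * O, E - twiddle * O])

x = np.random.default_rng(2).standard_normal(64)
assert np.allclose(fft_radix2(x), np.fft.fft(x))
```

Each level of recursion does O(N) work across all calls and there are log₂(N) levels, matching the O(N log₂ N) complexity derived above.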


Fast convolution
Finding the most efficient convolution

Convolution for finite signals
When two signals x[n] and h[n] have a support on [0, N−1] their convolution is

\[ y[n] = x[n] \star h[n] = \sum_{k=0}^{N-1} x[k]\, h[n-k] \]

and the support of the convolution is in [0, 2N−1]. Computing the values on the support is O(N²).

Fast convolution with zero-padding
- In order to avoid border effects we define two signals a[n] and b[n] of periodicity 2N such that

\[ a[n] = \begin{cases} x[n] & \text{for } 0 \leq n < N \\ 0 & \text{for } N \leq n < 2N \end{cases}, \qquad b[n] = \begin{cases} h[n] & \text{for } 0 \leq n < N \\ 0 & \text{for } N \leq n < 2N \end{cases} \]

- This procedure is called zero-padding and ensures that the circular convolution c[n] = a ⋆ b[n] recovers the regular convolution:

\[ c[n] = y[n] \quad \text{for } 0 \leq n < 2N \qquad (38) \]

- The complexity of fast convolution with zero-padding is O(N log₂(N)).

Convolution between two signals of different support
- In applications we often have two signals x[n] and h[n] with different lengths N and M.
- The complexity of the different convolution approaches is:
  - Direct computation: O(NM)
  - FFT computation: O(max(M, N) log₂(max(M, N)))
- Some applications, such as convolutional neural networks, have a very small M and the direct implementation is more efficient.

The choose_conv_method Scipy function
- The Scipy convolution scipy.signal.convolve automatically selects the more efficient (and precise enough) approach.
- The function choose_conv_method is in the module scipy.signal.signaltools.
- It compares the number of operations for the FFT and direct implementations.
- It can also run a quick benchmark to find which one is the fastest in practice.
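A sketch of fast convolution with zero-padding: `np.fft.fft(x, n)` zero-pads to length n, and padding to at least N + M − 1 makes the circular convolution equal to the linear one (the lengths 100 and 30 are arbitrary):

```python
import numpy as np

def fast_conv(x, h):
    """Linear convolution via FFT, with zero-padding to avoid circular wrap-around."""
    L = len(x) + len(h) - 1                  # support of the linear convolution
    n = 1 << (L - 1).bit_length()            # next power of two, for the radix-2 FFT
    X = np.fft.fft(x, n)                     # np.fft.fft zero-pads up to length n
    H = np.fft.fft(h, n)
    return np.real(np.fft.ifft(X * H))[:L]

x = np.random.default_rng(3).standard_normal(100)
h = np.random.default_rng(4).standard_normal(30)
assert np.allclose(fast_conv(x, h), np.convolve(x, h))
```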

FFT in the real world I FFT is a standard algorithm implemented in numerous numerical libraries. Digital Signal Processing I Implementations for all types of input (float, double, complex). I Processing of signals after Analog to Digital Conversion. I Implementation for all devices (X86, ARM, GPU, DSP). I Can be implemented on any general purpose computer or on dedicated I Implemented in Numpy and Scipy. devices/chips for more efficiency. I Objectives of DSP include denoising, signal reconstruction and source separation. Common implementations Applications discussed in the following I FFTPACK FORTRAN [Swarztrauber, 1982] ( public domain). I GNU scientific library [Galassi et al., 1996] (GPL). I Digital filter design. I Intel Math Kernel Library (proprietary). I Interpolation. I FFTW Fastest Fourier Transform in The West [Frigo and Johnson, 1998] (GPL). I Digital Image Processing. I pyFFTW a python wrapper for FFTW (GPL). I Convolutional neural networks. I CuFFT Runs on GPU [NVIDIA, 2013], C and Python (proprietary).

53/80 54/80

Digital filter design
Window design method

Impulse responses Frequency response 0.2 h(t) 1.00 |H(f)| h[n] |H(e2if)| 0.75 0.1 0.50

0.0 0.25 0.00 I Digital Filter design follows the same objective as Analog filter designs. 20 10 0 10 20 0.0 0.2 0.4 0.6 0.8 1.0 Principle I For finite signal already in memory ideal filtering is possible with FFT. When filter has to be applied in real time one needs to find the parameters of a I h(nT ) for M n N recursive IIR filter or FIR filter. h[n] = − ≤ ≤ (0 else I Scipy provides several digital filter design functions.

Approaches for digital filter design I The sampling can lead to aliasing, the truncation leads to ripples in frequencies. Works better for low frequencies and large M,N. I Window design method (FIR): Sample and truncate the impulse response of an I analog filter. I Can be used to approximate the ideal filter: Least Mean Square Error method (FIR): Find the coefficients of the FIR that I sin(2πfct) approximates the best in the square error sens the objective filter in frequency. h(t) = 2fc = 2fcsinc(2πfct) 2πfct I Bilinear transform (IIR): non-linear approximation of the transfer function on an analog filter. I Example above with fc = 0.1, M = 19,N = 20. 55/80 56/80 Bilinear transform Examples of bilinear transform
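A minimal sketch of the window design method for the slide's example (f_c = 0.1, M = 19, N = 20). Note that NumPy's np.sinc(u) = sin(pi u)/(pi u), so np.sinc(2 f_c t) matches the sinc(2 pi f_c t) convention used above:

```python
import numpy as np

fc = 0.1        # cutoff frequency (example from the slide)
M, N = 19, 20   # truncation bounds: -M <= n <= N
T = 1.0         # sampling period

# Sample the ideal low-pass impulse response h(t) = 2 fc sinc(2 pi fc t)
n = np.arange(-M, N + 1)
h = 2 * fc * np.sinc(2 * fc * n * T)   # np.sinc(u) = sin(pi u) / (pi u)

# Frequency response of the truncated FIR filter (zero-padded FFT)
H = np.abs(np.fft.fft(h, 512))

# Gain close to 1 at null frequency, with Gibbs ripples around the cutoff
print(H[0])
```

Plotting H against np.fft.fftfreq(512) reproduces the ripples visible in the figure above.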

Bilinear transform

Principle
- For a continuous-time transfer function that can be expressed as H(f) = H_0(2i\pi f), the bilinear transformation uses the change of variable

    2i\pi f \;\rightarrow\; \frac{2}{T}\,\frac{1 - e^{-2i\pi f}}{1 + e^{-2i\pi f}} = \frac{2}{T}\,\frac{1 - z^{-1}}{1 + z^{-1}} = \frac{2i}{T}\tan(\pi f)

  where T is a parameter that can be used to change the cutoff frequency.
- The digital filter z-transform can then be expressed as:

    H_d(z) = H_0\!\left(\frac{2}{T}\,\frac{1 - z^{-1}}{1 + z^{-1}}\right)

- Null frequency in continuous time maps to null frequency in discrete time.
- Infinite frequency in continuous time maps to frequency 1/2 in discrete time.
- The bilinear transform preserves stability properties.

Examples of bilinear transform

[Figure: frequency responses |H(f)| and |H_d(e^{2i\pi f})| for a low-pass 1st order system and a band-pass 2nd order system.]

First order system
- The transfer function of a first-order low-pass filter is of the form:

    H(f) = \frac{1}{1 + \frac{2i\pi f}{2\pi f_0}}

- The discrete transfer function can be recovered as:

    H_d(z) = \frac{1}{1 + \frac{2}{2\pi f_0 T}\,\frac{1 - z^{-1}}{1 + z^{-1}}} = \frac{1 + z^{-1}}{\left(1 + \frac{2}{2\pi f_0 T}\right) + \left(1 - \frac{2}{2\pi f_0 T}\right) z^{-1}}

- Comparison above for a first-order low-pass and a second-order Butterworth band-pass filter.
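The first-order example can be cross-checked against scipy.signal.bilinear, which applies the same change of variable (the cutoff f0 = 0.05 is an arbitrary choice for this sketch):

```python
import numpy as np
from scipy import signal

f0 = 0.05   # cutoff frequency (assumed value for the sketch)
T = 1.0     # bilinear-transform parameter (sampling period)

# Analog first-order low-pass H0(s) = 1 / (1 + s / (2 pi f0))
b_analog = [1.0]
a_analog = [1.0 / (2 * np.pi * f0), 1.0]

# SciPy's bilinear transform uses s = (2/T) (1 - z^-1) / (1 + z^-1)
b_dig, a_dig = signal.bilinear(b_analog, a_analog, fs=1.0 / T)

# Coefficients from the closed form derived on the slide
c = 2.0 / (2 * np.pi * f0 * T)
b_manual = np.array([1.0, 1.0]) / (1 + c)
a_manual = np.array([1.0, (1 - c) / (1 + c)])

print(np.allclose(b_dig, b_manual), np.allclose(a_dig, a_manual))  # True True
```

The DC gain sum(b)/sum(a) equals 1, illustrating that null frequency maps to null frequency.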

Application: motor imagery BCI

Motor imagery on ElectroEncephaloGraphy (EEG)
- Motor imagery is a paradigm of Brain Computer Interfaces (BCI).
- It uses the mu rhythms in the motor cortex (band [8-40] Hz).
- The subject concentrates on a movement (right or left hand).
- The contralateral area of the motor cortex sees a desynchronization (lower energy in the band) similar to that of a real movement [Roth et al., 1996].
- A Butterworth filter of order 5 is used to perform the filtering [Lotte and Guan, 2010].
- It can be used to control a cursor or the movement of a machine.

Application: motor control BCI

Prediction of arm movement from ElectroCorticoGraphy (ECoG) [Pistohl et al., 2008]
- ElectroCorticoGram implanted on patients suffering from pharmaco-resistant epilepsy.
- Recording of simultaneous arm movements and ECoG signals from the patients.
- A model is used to predict arm movement from the ECoG signals.
- The signals are filtered using a discrete Savitzky-Golay filter (order 3, window 0.75 s).
- Potential applications in prosthetics and robotics.
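As an illustration, the order-5 Butterworth band-pass used in the motor-imagery setting can be designed with SciPy. The sampling rate fs = 250 Hz is an assumption (a common EEG rate, not given on the slide):

```python
import numpy as np
from scipy import signal

fs = 250.0          # EEG sampling rate in Hz (assumed; not given on the slide)
band = [8.0, 40.0]  # mu/beta band used for motor imagery

# Order-5 Butterworth band-pass, as in [Lotte and Guan, 2010]
sos = signal.butter(5, band, btype="bandpass", fs=fs, output="sos")

# Toy signal: a 10 Hz (in-band) plus a 60 Hz (out-of-band) component
t = np.arange(0, 2.0, 1 / fs)
x = np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 60 * t)

# Zero-phase filtering keeps the 10 Hz component and removes the 60 Hz one
y = signal.sosfiltfilt(sos, x)
```

The second-order-sections (sos) form is numerically more robust than the (b, a) form for higher-order IIR filters.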

Limited band signal interpolation

Sinc interpolation
- Interpolation of a sampled signal with the Nyquist/Shannon sampling theorem can be very expensive since the sinus cardinalis has an infinite support.
- Computing M samples from N samples is O(MN).

Fast interpolation
- Fast interpolation from N to M samples can be done using the FFT with the following steps:
  1. X[k] <- FFT of x[n], of size N.
  2. Zero-padding in frequency:

     Xi[k] = X[k] M/N          for 0 <= k < N/2
     Xi[k] = X[k - M + N] M/N  for M - N/2 <= k < M
     Xi[k] = 0                 else

  3. xi[n'] <- IFFT of Xi[k], of size M.
- The factor M/N is needed because the FFT is not normalized.
- For a sampling period T = 1 of x[n], the sample xi[n'] corresponds to time t = n' N/M.
- For M > N the complexity of the interpolation is O(M log2(M)).

Fast interpolation example

[Figure: time domain (x, xi) and frequency domain (X with N = 32, Xi with M = 256).]

Interpolation example
- Interpolation of a rectangular function.
- N = 32, M = 256.

Python code
    X = np.fft.fft(x)                # FFT
    Xi = np.zeros(M, dtype=complex)  # Initialize complex vector at 0
    Xi[:N//2] = X[:N//2]             # Positive low frequencies
    Xi[-N//2:] = X[N//2:]            # Negative low frequencies
    xi = np.fft.ifft(Xi*M/N)         # Scaling + IFFT
    ti = np.arange(M)*N/M            # Time positions of the interpolated samples


Digital Image Processing

Digital images
- We now all have a digital camera in our pockets.
- Image capture has several limits:
  - Noise (especially at low illumination).
  - Motion blur on the camera.
  - Limited resolution (aperture and sensor).
- Images are stored in memory as a matrix of integers (or floats).

Signal processing in 2D
- All the results seen up to now can be extended to 2D.
- In images one does not need causality (the whole image is available).
- Filters in 2D are often FIR filters.
- Small filters are the building blocks of Convolutional Neural Networks (CNN).

Image convolution

[Figure: image x[m,n], kernel h, and x*h[m,n] for averaging filters of width P = 3, 5, 11, 21.]

2D convolution

    x \star h[m, n] = \sum_{k,l} x[k, l]\, h[m-k, n-l]

- In practice x[m, n] is often a large image and the filter h[m, n] a small filter (also called a kernel).
- h[m, n] is often of odd size with its support centered around (0, 0).
- Similarly to finite discrete signals, there exists a circular 2D convolution that can be computed with the FFT.
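A small sketch of the 2D convolution with an averaging kernel, checking one output sample directly against the defining sum (the image size and P = 3 are arbitrary choices):

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(0)
x = rng.standard_normal((64, 64))   # "image"
P = 3
h = np.ones((P, P)) / P**2          # small averaging kernel

# 2D convolution x*h[m,n] = sum_{k,l} x[k,l] h[m-k, n-l]
y = signal.convolve2d(x, h, mode="same", boundary="fill")

# Direct check of one output sample away from the borders
m, n0 = 10, 20
acc = 0.0
for k in range(-1, 2):
    for l in range(-1, 2):
        acc += x[m - k, n0 - l] * h[k + 1, l + 1]
print(np.isclose(y[m, n0], acc))  # True
```

For the uniform kernel this output sample is simply the mean of the 3x3 neighborhood of pixel (10, 20).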

2D Fast Fourier Transform

2D FFT
- Let x[m, n] be a finite image of size (M, N).
- The 2D DFT of the image is:

    X[u, v] = \sum_{m=0}^{M-1}\sum_{n=0}^{N-1} x[m, n]\, e^{-2i\pi(\frac{um}{M} + \frac{vn}{N})} = \sum_{m=0}^{M-1} e^{-2i\pi\frac{um}{M}} \left(\sum_{n=0}^{N-1} x[m, n]\, e^{-2i\pi\frac{vn}{N}}\right)

- The DFT is separable across the two indices/directions of the image.
- It can be performed in O(MN log2(N) + NM log2(M)).

2D basis functions

[Figure: real and imaginary parts of the basis functions e_{u,v}[m,n] for several values of (u,v).]

The basis functions

    e_{u,v}[m, n] = e^{-2i\pi(\frac{um}{M} + \frac{vn}{N})}

correspond to a Dirac in the frequency domain.
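The separability of the 2D DFT can be verified with NumPy: the 2D FFT equals 1D FFTs applied along each axis in turn (the image size is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((32, 48))   # image of size (M, N)

# Full 2D DFT
X = np.fft.fft2(x)

# Separability: 1D FFTs along the rows, then along the columns
X_sep = np.fft.fft(np.fft.fft(x, axis=1), axis=0)

print(np.allclose(X, X_sep))  # True
```

This is exactly why the 2D transform costs O(MN log2(N) + NM log2(M)) rather than O((MN)^2).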

Border effects on FFT

[Figure: images x1..x4 and the log-magnitudes of their spectra log|X1[u,v]| .. log|X4[u,v]|.]

- The DFT/FFT supposes that the image is periodic in space.
- Large differences between the values at opposite borders can lead to high frequencies that do not exist in the original image.
- This can be attenuated using "windowing" or apodization, which consists in attenuating the borders of the image.
- This spatial multiplication corresponds to a convolution (filtering) in the frequency domain (see next course).

Classical 2D filters (1)

[Figure: image x[m,n] and x*h[m,n] for averaging filters of width P = 3, 5, 11, 21, with the corresponding frequency responses |H[u,v]|.]

Average filtering

    h_{a,P}[m, n] = 1/P^2 for |m| < P/2 and |n| < P/2, 0 else;
    h_{a,3} = (1/9) [[1, 1, 1], [1, 1, 1], [1, 1, 1]]

- Classical low-pass filter (with P odd).
- The parameter P defines its width and its cutoff frequency.
- Its Fourier transform is a product of sinc functions in both directions (non-isotropic).
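The effect of apodization can be sketched with a separable Hann window; the ramp image (whose opposite borders differ strongly) and the window choice are illustrative assumptions:

```python
import numpy as np

M, N = 64, 64
# Image whose opposite borders differ a lot: a horizontal ramp
x = np.outer(np.ones(M), np.linspace(-1.0, 1.0, N))

# Separable 2D apodization window (outer product of 1D Hann windows)
w = np.outer(np.hanning(M), np.hanning(N))
xw = x * w  # attenuate the borders before taking the FFT

X = np.abs(np.fft.fft2(x))
Xw = np.abs(np.fft.fft2(xw))

# The fraction of spectral energy in high horizontal frequencies drops,
# because the periodic extension of xw no longer has a border discontinuity
hi_band = slice(N // 4, N // 2)
frac = X[:, hi_band].sum() / X.sum()
frac_w = Xw[:, hi_band].sum() / Xw.sum()
print(frac_w < frac)  # True
```

The same comparison on log|X| images reproduces the cross-shaped artifacts visible in the figure above.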

Classical 2D filters (2)

[Figure: image x[m,n], the outputs of the Prewitt and Sobel filters, and their frequency responses |Ph[u,v]|, |Pv[u,v]|, |Sh[u,v]|, |Sv[u,v]|.]

Edge detectors
- Prewitt edge detector:

    p_h = (1/3) [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]],   p_v = (1/3) [[-1, -1, -1], [0, 0, 0], [1, 1, 1]]

- Sobel edge detector:

    s_h = (1/4) [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]],   s_v = (1/4) [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

- Use the absolute value of the output to detect areas with large changes.
- Each detector is the combination of a low-pass filter in one direction and a high-pass filter in the other.

Classical 2D filters (3)

[Figure: images x1, x2, the magnitudes of their high-pass filtered versions |x1*h|, |x2*h|, and the frequency response |H[u,v]|.]

Spatial high pass

    h = [[0, -1, 0], [-1, 4, -1], [0, -1, 0]]

- Computes the difference between the center pixel and its neighbors.
- Non-isotropic filter.
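A minimal sketch of the Sobel horizontal-gradient detector on a synthetic vertical step edge (the test image is an arbitrary choice):

```python
import numpy as np
from scipy import signal

# Test image: vertical step edge at column 16
x = np.zeros((32, 32))
x[:, 16:] = 1.0

# Sobel horizontal-gradient kernel (with the 1/4 normalization of the slide)
sh = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]]) / 4.0

# Absolute value of the output highlights areas with large changes
g = np.abs(signal.convolve2d(x, sh, mode="same"))

print(g[16, 16], g[16, 5])  # large on the edge, zero in the flat regions
```

The response is 1 on the two columns around the step and 0 in the constant regions, as expected from a derivative filter.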


Classical 2D filters (4)

[Figure: image x[m,n] filtered with average and median filters of width P = 3, 5, 11, 21.]

Median filtering
- Computes the median in a window of width P.
- Robust version of the average filter.
- Non-linear (no impulse response or Fourier transform representation).
- Better able to preserve high frequencies and the details of large objects.
- There exists a whole family of more complex non-linear morphological filters.

Total Variation (TV)

[Figure: TV values for clean and noisy images, e.g. TV(x) = 5.552e+03 and TV(x + n) = 1.683e+04.]

- In images of man-made objects, we often have near-constant parts in the image.
- One way to measure the presence of those constant parts is the Total Variation, which can be expressed in its anisotropic version as:

    TV(x) = \sum_{m,n} |x[m+1, n] - x[m, n]| + |x[m, n+1] - x[m, n]|

- It can be reformulated (forgetting border effects) as

    TV(x) = \|x \star d_v\|_1 + \|x \star d_h\|_1,  with  d_h = [1, -1],  d_v = [1, -1]^T

  where d_v and d_h are finite-difference derivative filters and \|y\|_1 = \sum_{m,n} |y[m, n]|.
- The presence of noise tends to introduce high frequencies and to greatly increase the TV of an image.
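The anisotropic TV and its growth under noise can be sketched with finite differences (the piecewise-constant test image and the noise level 0.1 are arbitrary choices):

```python
import numpy as np

def total_variation(x):
    """Anisotropic total variation of a 2D image (ignoring border terms)."""
    tv_v = np.abs(np.diff(x, axis=0)).sum()   # |x[m+1, n] - x[m, n]|
    tv_h = np.abs(np.diff(x, axis=1)).sum()   # |x[m, n+1] - x[m, n]|
    return tv_v + tv_h

rng = np.random.default_rng(0)
x = np.zeros((64, 64))
x[16:48, 16:48] = 1.0                # piecewise-constant image: low TV
noise = 0.1 * rng.standard_normal(x.shape)

print(total_variation(x), total_variation(x + noise))
```

For the clean square the TV is exactly its perimeter, 4 * 32 = 128, while the noisy version is an order of magnitude larger.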

Total Variation denoising

[Figure: image x[m,n], with and without noise, denoised with \lambda_{TV} = 0.01, 0.1, 0.2, 0.5.]

- Since TV increases with noise, one may want to use it for denoising.
- Total Variation denoising of an image y can be expressed as

    \min_x \|x - y\|_F^2 + \lambda\, TV(x)

  where \|\cdot\|_F is the Frobenius norm of a matrix.
- We want the denoised image x to be close to y but also to have a small TV.
- It is a convex but non-smooth optimization problem.

Deep learning

Deep neural network [LeCun et al., 2015]

    f(x) = f_K(f_{K-1}(... f_1(x) ...))    (39)

- f is a composition of elementary functions f_k of the form:

    f_k(x) = g_k(W_k x + b_k)    (40)

- W_k is a linear operator and b_k is a bias for layer k.
- g_k is a non-linear activation function for layer k.
- The parameters {W_k, b_k}_k of the function f are learned from the data.
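Equations (39)-(40) can be sketched as a forward pass in plain NumPy; the layer widths, random weights and the ReLU choice for every g_k are arbitrary assumptions for the illustration:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def forward(x, params):
    """f(x) = f_K(f_{K-1}(... f_1(x) ...)) with f_k(x) = g_k(W_k x + b_k)."""
    for W, b in params:
        x = relu(W @ x + b)
    return x

rng = np.random.default_rng(0)
sizes = [8, 16, 4]   # layer widths (arbitrary for the sketch)
params = [(0.1 * rng.standard_normal((m, n)), np.zeros(m))
          for n, m in zip(sizes[:-1], sizes[1:])]

y = forward(rng.standard_normal(8), params)
print(y.shape)  # (4,)
```

In practice the parameters {W_k, b_k}_k are fitted by gradient descent on a training loss, not drawn at random as here.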

Convolutional Neural Network (CNN)

- Replace the linear operator by a convolution [LeCun et al., 2010].
- Reduce the image dimensionality with sub-sampling or max pooling.
- The number of parameters depends on the size of the filters, not on the size of the image.
- Recent deep CNNs use the ReLU activation [Glorot et al., 2011]: g(x) = max(0, x).

Bibliography

- Discrete-time Signal Processing [Oppenheim and Shafer, 1999].
- Signals and Systems [Haykin and Van Veen, 2007].
- Signals and Systems [Oppenheim et al., 1997].
- Signal Analysis [Papoulis, 1977].
- Fourier Analysis and its Applications [Vretblad, 2003].
- Polycopiés from Stéphane Mallat and Éric Moulines [Mallat et al., 2015].
- Théorie du signal [Jutten, 2018].
- Distributions et Transformation de Fourier [Roddier, 1985].

References

[Cooley and Tukey, 1965] Cooley, J. W. and Tukey, J. W. (1965). An algorithm for the machine calculation of complex Fourier series. Mathematics of Computation, 19(90):297-301.

[Frigo and Johnson, 1998] Frigo, M. and Johnson, S. G. (1998). FFTW: An adaptive software architecture for the FFT. In Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP'98, volume 3, pages 1381-1384. IEEE.

[Galassi et al., 1996] Galassi, M., Davies, J., Theiler, J., Gough, B., Jungman, G., Alken, P., Booth, M., and Rossi, F. (1996). GNU Scientific Library. No. Release, 2.

[Glorot et al., 2011] Glorot, X., Bordes, A., and Bengio, Y. (2011). Deep sparse rectifier neural networks. In AISTATS, volume 15, page 275.

[Haykin and Van Veen, 2007] Haykin, S. and Van Veen, B. (2007). Signals and Systems. John Wiley & Sons.

[Jutten, 2018] Jutten, C. (2018). Théorie du signal. Univ. Grenoble Alpes - Polytech' Grenoble.

[LeCun et al., 2015] LeCun, Y., Bengio, Y., and Hinton, G. (2015). Deep learning. Nature, 521(7553):436-444.

[LeCun et al., 2010] LeCun, Y., Kavukcuoglu, K., and Farabet, C. (2010). Convolutional networks and applications in vision, pages 253-256.

[Lotte and Guan, 2010] Lotte, F. and Guan, C. (2010). Regularizing common spatial patterns to improve BCI designs: unified theory and new algorithms. IEEE Transactions on Biomedical Engineering, 58(2):355-362.

[Mallat et al., 2015] Mallat, S., Moulines, E., and Roueff, F. (2015). Traitement du signal. Polycopié MAP 555, École Polytechnique.

[NVIDIA, 2013] NVIDIA, C. (2013). Fast Fourier transform library (cuFFT).

[Nyquist, 1928] Nyquist, H. (1928). Certain topics in telegraph transmission theory. Transactions of the American Institute of Electrical Engineers, 47(2):617-644.

[Oppenheim and Shafer, 1999] Oppenheim, A. V. and Shafer, R. W. (1999). Discrete-time Signal Processing. Prentice Hall Inc.

[Oppenheim et al., 1997] Oppenheim, A. V., Willsky, A. S., and Nawab, S. H. (1997). Signals and Systems. Prentice Hall Inc., Upper Saddle River, New Jersey.

[Papoulis, 1977] Papoulis, A. (1977). Signal Analysis, volume 191. McGraw-Hill, New York.

[Pistohl et al., 2008] Pistohl, T., Ball, T., Schulze-Bonhage, A., Aertsen, A., and Mehring, C. (2008). Prediction of arm movement trajectories from ECoG-recordings in humans. Journal of Neuroscience Methods, 167(1):105-114.

[Roddier, 1985] Roddier, F. (1985). Distributions et transformée de Fourier.

[Roth et al., 1996] Roth, M., Decety, J., Raybaudi, M., Massarelli, R., Delon-Martin, C., Segebarth, C., Morand, S., Gemignani, A., Décorps, M., and Jeannerod, M. (1996). Possible involvement of primary motor cortex in mentally simulated movement: a functional magnetic resonance imaging study. Neuroreport, 7(7):1280-1284.

[Shannon, 1949] Shannon, C. E. (1949). Communication in the presence of noise. Proceedings of the IRE, 37(1):10-21.

[Swarztrauber, 1982] Swarztrauber, P. N. (1982). Vectorizing the FFTs. In Parallel Computations (G. Rodrigue, ed.), New York.

[Vretblad, 2003] Vretblad, A. (2003). Fourier Analysis and its Applications, volume 223. Springer Science & Business Media.