
Eng. 100: Music Signal Processing
DSP Lecture 6
Lab 3: Spectra/Spectrogram (Continued)

• Curiosity:
  ◦ http://www.soundhound.com
  ◦ http://dx.doi.org/10.1109/MSPEC.2015.7049439
• Read Lab 3!
• HW 3 on Canvas, due Thu. Oct. 15
• DSP schedule (weeks run Mon to Fri):
  ◦ Oct 05 to Oct 09: (no lab); HW2 due; Lab 3a
  ◦ Oct 12 to Oct 16: Lab 3a; Ethics & HW3 due; Lab 3b; Review; Review
  ◦ Oct 19 to Oct 23: No lab; No class; No lab; Break; Break; Exam 1
  ◦ Oct 26 to Oct 30: Lab 3b; Lab 3 due / P2
  ◦ Nov 02 ...: Lab 3 due / P2 ...

Outline

• The spectrum of a signal (first class)
  ◦ Part 1. Why we need spectra
  ◦ Part 2. Periodic signals
  ◦ Part 3. Band-limited signals
• Methods for computing spectra (second class)
  ◦ Part 4. Sampling rate S > 2B
  ◦ Part 5. By hand by solving systems of equations
  ◦ Part 6. Using general Fourier series solution
  ◦ Part 7. Using fast Fourier transform (FFT), e.g., in Matlab
• Using a signal’s spectrum (third class)
  ◦ to determine note
  ◦ to remove unwanted noise
  ◦ to visualize content (spectrogram)
  ◦ Lab 3

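Summary of previous lecture(s)

T-periodic, band-limited signals have finite Fourier series:

• Sinusoidal form:
  $$x(t) = \underbrace{c_0}_{\text{DC}} + \underbrace{c_1 \cos\left(2\pi \tfrac{1}{T} t - \theta_1\right)}_{\text{fundamental}} + c_2 \cos\left(2\pi \tfrac{2}{T} t - \theta_2\right) + \cdots + c_K \cos\left(2\pi \tfrac{K}{T} t - \theta_K\right)$$
• Trigonometric form:
  $$x(t) = a_0 + a_1 \cos\left(2\pi \tfrac{1}{T} t\right) + a_2 \cos\left(2\pi \tfrac{2}{T} t\right) + \cdots + a_K \cos\left(2\pi \tfrac{K}{T} t\right) + b_1 \sin\left(2\pi \tfrac{1}{T} t\right) + b_2 \sin\left(2\pi \tfrac{2}{T} t\right) + \cdots + b_K \sin\left(2\pi \tfrac{K}{T} t\right)$$
• Coefficients in these two forms are related by
  $$a_0 = c_0, \quad b_0 = 0, \quad a_k = c_k \cos\theta_k, \quad b_k = c_k \sin\theta_k, \quad c_k = \sqrt{a_k^2 + b_k^2}, \quad \tan\theta_k = b_k / a_k$$
  because $a\cos(\phi) + b\sin(\phi) = \sqrt{a^2 + b^2}\,\cos(\phi - \arctan(b/a))$ and $c\cos(\phi - \theta) = (c\cos\theta)\cos\phi + (c\sin\theta)\sin\phi$.
• We can use the fast Fourier transform (FFT) method (Matlab’s fft function) to compute these coefficients very quickly from signal samples if S > 2B.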
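The relations between the two forms can be checked numerically. The following is a minimal sketch, not part of the original slides: the period T and the coefficient values are made up for illustration. It uses atan2 rather than a plain arctangent so that θk lands in the correct quadrant when ak is negative.

```matlab
% Sketch: convert trigonometric coefficients (a_k, b_k) to sinusoidal
% coefficients (c_k, theta_k) and confirm both forms give the same signal.
T = 0.01;                      % assumed period [s] (illustrative)
t = linspace(0, T, 200);       % time samples over one period
a = [3 -1];                    % a_1, a_2 (made-up values)
b = [4  2];                    % b_1, b_2 (made-up values)
c = hypot(a, b);               % c_k = sqrt(a_k^2 + b_k^2)
theta = atan2(b, a);           % theta_k, quadrant-aware version of arctan(b_k/a_k)
xtrig = zeros(size(t));        % trigonometric form (with a_0 = 0)
xsin = zeros(size(t));         % sinusoidal form (with c_0 = 0)
for k = 1:2
    xtrig = xtrig + a(k)*cos(2*pi*k*t/T) + b(k)*sin(2*pi*k*t/T);
    xsin = xsin + c(k)*cos(2*pi*k*t/T - theta(k));
end
maxdiff = max(abs(xtrig - xsin));  % difference between the two forms
disp(c)
disp(maxdiff)
```

The two signals should agree to within rounding error, which is what maxdiff measures.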

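FFT summary

For an array x of length N with signal samples $x = [\,x(0/S) \;\; x(1/S) \;\; \cdots \;\; x((N-1)/S)\,]$ taken from a signal x(t) that is periodic and band-limited with S > 2B:

disp((2/N)*real(fft(x))) gives

$[\,2a_0 \;\; a_1 \;\; a_2 \;\; \ldots \;\; a_{N/2-2} \;\; a_{N/2-1} \;\; a_{N/2} \;\; a_{N/2-1} \;\; a_{N/2-2} \;\; \ldots \;\; a_2 \;\; a_1\,]$

disp((-2/N)*imag(fft(x))) gives

$[\,0 \;\; b_1 \;\; b_2 \;\; \ldots \;\; b_{N/2-2} \;\; b_{N/2-1} \;\; 0 \;\; -b_{N/2-1} \;\; -b_{N/2-2} \;\; \ldots \;\; -b_2 \;\; -b_1\,]$

disp((2/N)*abs(fft(x))) gives (we usually use this one!)

$[\,2c_0 \;\; c_1 \;\; c_2 \;\; \ldots \;\; c_{N/2-2} \;\; c_{N/2-1} \;\; c_{N/2} \;\; c_{N/2-1} \;\; c_{N/2-2} \;\; \ldots \;\; c_2 \;\; c_1\,]$

disp(angle(fft(x))) gives

$[\,0\ (\text{or } \pi) \;\; -\theta_1 \;\; \ldots \;\; -\theta_{N/2-1} \;\; 0\ (\text{or } \pi) \;\; \theta_{N/2-1} \;\; \ldots \;\; \theta_2 \;\; \theta_1\,]$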
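These four displays can be tried on a short signal whose coefficients are known in advance. The sketch below is illustrative only: the values N = 8, S = 8 and the test signal are assumptions chosen so that every term fits an exact multiple of S/N.

```matlab
% Sketch: check the four displays on a signal with known coefficients.
N = 8;  S = 8;                 % assumed: 8 samples at 8 Hz, so T = N/S = 1 s
t = (0:N-1)/S;                 % sample times 0, 1/S, ..., (N-1)/S
% x(t) = 1 + 3 cos(2 pi t) + 4 sin(2 pi t) + 2 cos(2 pi 2 t), so B = 2 Hz < S/2
x = 1 + 3*cos(2*pi*t) + 4*sin(2*pi*t) + 2*cos(2*pi*2*t);
X = fft(x);
A  = (2/N)*real(X);            % pattern [2*a0 a1 a2 ... a2 a1]
Bk = (-2/N)*imag(X);           % pattern [0 b1 b2 ... -b2 -b1]
C  = (2/N)*abs(X);             % pattern [2*c0 c1 c2 ... c2 c1]
P  = angle(X);                 % P(2) is -theta1; entries where C is ~0 are not meaningful
disp(round(A*1e6)/1e6)         % rounding hides tiny floating-point residue
disp(round(Bk*1e6)/1e6)
disp(round(C*1e6)/1e6)
```

Here a0 = 1, a1 = 3, b1 = 4, a2 = 2, so c1 = 5 and θ1 = atan2(4, 3). Note that angle returns essentially arbitrary values wherever the magnitude is only rounding noise.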

Frequency associated with kth term (Matlab index k + 1) is $f = \frac{k}{N} S$.

FFT: Why the mirrored coefficients?

Complex-exponential form of Fourier series of any T-periodic signal ($\imath = \sqrt{-1}$):
$$x(t) = \sum_{k=-\infty}^{\infty} X_k \, e^{\imath 2\pi \frac{k}{T} t}.$$
For a T-periodic signal band-limited to B = K/T:
$$x(t) = \sum_{k=-K}^{K} X_k \, e^{\imath 2\pi \frac{k}{T} t}.$$

For an array x of length N with signal samples $x = [\,x(0/S) \;\; x(1/S) \;\; \cdots \;\; x((N-1)/S)\,]$ taken from a signal x(t) that is periodic and band-limited with S > 2B: disp((1/N)*fft(x)) gives

$[\,\underbrace{X_0 \;\; X_1 \;\; X_2 \;\; \ldots \;\; X_{N/2-2} \;\; X_{N/2-1}}_{\text{positive } k} \;\; \underbrace{X_{-N/2} \;\; X_{-N/2+1} \;\; X_{-N/2+2} \;\; \ldots \;\; X_{-2} \;\; X_{-1}}_{\text{negative } k}\,]$

because $\cos(t) = \frac{1}{2} e^{\imath t} + \frac{1}{2} e^{-\imath t}$.

We will not use the complex-exponential form in ENGN 100. (See EECS 216.)

FFT: Why $f = \frac{k}{N} S$?

Recall sinusoidal form of Fourier series of T-periodic signal with band-limit B:
$$x(t) = c_0 + \sum_{k=1}^{K} c_k \cos\left(2\pi \tfrac{k}{T} t - \theta_k\right)$$
Frequency of kth term in sum is $f = \frac{k}{T}$.

But for unknown pitches we do not know the fundamental period T!

So the formula $f = \frac{k}{T}$ is not immediately useful with the FFT.

FFT: Why $f = \frac{k}{N} S$?

Imagine repeating a segment of audio over and over:

[Figure: an audio segment repeated over and over along the t axis.]

This thought experiment yields a periodic signal with period T = N∆ = N/S, where N is the number of samples and ∆ = 1/S is the time between samples.

Substituting T = N/S into the formula $f = \frac{k}{T}$ and simplifying yields $f = \frac{k}{N} S$.

• This formula works perfectly when the unknown frequency f is an integer multiple of S/N.
• If f is not exactly an integer multiple of S/N, then we will see a “bump” in the FFT spectrum instead of a single line.
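The difference between a single line and a “bump” is easy to see numerically. This is a minimal sketch under assumed values (S = 8000 Hz, N = 2000, so S/N = 4 Hz); the 0.05 threshold used to count sizable entries is an arbitrary illustrative choice.

```matlab
% Sketch: single spectral line versus "bump" for f = (k/N) S.
S = 8000;  N = 2000;           % assumed values, so S/N = 4 Hz
t = (0:N-1)/S;
f = (0:N-1)*S/N;               % frequency [Hz] for Matlab index k+1
x1 = cos(2*pi*440*t);          % 440 Hz = 110*(S/N): exact multiple of S/N
x2 = cos(2*pi*442*t);          % 442 Hz is not a multiple of S/N
C1 = (2/N)*abs(fft(x1));
C2 = (2/N)*abs(fft(x2));
half = 1:N/2;                  % look only at the non-mirrored half
[p1, i1] = max(C1(half));      % peak value and Matlab index for x1
[p2, i2] = max(C2(half));      % peak value and Matlab index for x2
n1 = sum(C1(half) > 0.05);     % number of sizable entries for x1
n2 = sum(C2(half) > 0.05);     % number of sizable entries for x2
fprintf('x1: peak %.3f at %g Hz, %d sizable entries\n', p1, f(i1), n1);
fprintf('x2: peak %.3f at %g Hz, %d sizable entries\n', p2, f(i2), n2);
```

For x1 the whole amplitude sits in one entry; for x2 the peak is lower and the amplitude spreads over neighboring entries.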

Line spectra

• Spectrum of a T-periodic, band-limited signal
  $$x(t) = c_0 + c_1 \cos\left(2\pi \tfrac{1}{T} t - \theta_1\right) + c_2 \cos\left(2\pi \tfrac{2}{T} t - \theta_2\right) + \cdots + c_K \cos\left(2\pi \tfrac{K}{T} t - \theta_K\right)$$
• More generally, line spectra of band-limited signals look something like this:

[Figure: example line spectrum.]

Formula: $x(t) = c_0 + c_1 \cos(2\pi f_1 t) + \cdots + c_4 \cos(2\pi f_4 t)$

(We assumed $\theta_k = 0$ in this formula because phase is unspecified in the spectrum.)

• In physics, often only the frequencies of atomic spectra are shown, not the amplitudes: [wiki]

Exercise

Draw the spectrum of this signal: $x(t) = 20 \sin^2(2\pi 50 t) + 30 \cos(2\pi 200 t)$.

Hint: $\sin^2(\phi) = \frac{1}{2} - \frac{1}{2}\cos(2\phi)$ (“power reduction formula”)
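One way to check a hand-drawn answer is to sample the signal and list the nonzero lines of its FFT spectrum. The sketch below is not from the slides; the sampling rate S = 1000 Hz and length N = 1000 are assumptions chosen so that S exceeds twice the highest frequency and every component falls on an exact multiple of S/N.

```matlab
% Sketch: numerically list the spectral lines of the exercise signal.
S = 1000;  N = 1000;           % assumed: 1 second of samples at 1000 Hz
t = (0:N-1)/S;
x = 20*sin(2*pi*50*t).^2 + 30*cos(2*pi*200*t);
C = (2/N)*abs(fft(x));
C(1) = C(1)/2;                 % first entry is 2*c0, so halve it to get c0
f = (0:N-1)*S/N;               % frequency [Hz] of each entry
idx = find(C(1:N/2) > 1e-6);   % nonzero lines in the non-mirrored half
disp([f(idx); C(idx)])         % row 1: frequencies [Hz], row 2: amplitudes
```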

DSP, FFT and “Stairway to heaven”

[Figure, three panels. Top (play): x(n/S) versus n for the whole clip, n from 0 to 4×10^4. Middle (play): x(n/S) for n from 4000 to 6000. Bottom: spectrum versus Matlab index k+1, with ticks at 111, 221, 331, ..., 991.]

Fundamental frequency of first note: $f = \frac{k}{N} S = \frac{110}{2000}\, 8000 = 440$ Hz.

What is the lowest (nonzero) frequency one could find here???
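The step from a peak index to a note name can be sketched in a few lines. The values k, N, and S come from the slide above; the mapping to a note name is an added assumption (equal temperament with A4 = 440 Hz, numbered as MIDI note 69), and the variable names are hypothetical.

```matlab
% Sketch: from FFT peak index to frequency to nearest note name.
k = 110;  N = 2000;  S = 8000;   % values from the slide (Matlab index k+1 = 111)
f = k*S/N;                       % frequency [Hz] of the kth term
% Assumption: equal temperament with A4 = 440 Hz, which is MIDI note 69.
m = round(69 + 12*log2(f/440));  % nearest MIDI note number
names = {'C','C#','D','D#','E','F','F#','G','G#','A','A#','B'};
note = sprintf('%s%d', names{mod(m,12)+1}, floor(m/12)-1);
fprintf('f = %g Hz, nearest note %s\n', f, note);
```

The same three lines after the assumption comment work for any peak frequency, for example one read off a spectrogram column.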

Listening and seeing

[Figure: four spectra versus f [Hz], each with a play link and a vertical range of 0 to 1. Frequency ticks: 0, 400, 4000 (first panel); 400, 713 (second panel); 400 to 2400 in steps of 400 (third and fourth panels).]

Why is the lowest note on a tuba (or large pipe organ) C0 = 16.35 Hz??? [wiki]

Application: Removing noise from signals (Lab 3)

Removing noise from signals

• Given: signal = desired signal + noise signal, $y(t) = x(t) + \varepsilon(t)$
• Goal: reduce (“filter out”) as much noise as possible
• Assume/know: signal is band-limited to B Hz
• Idea: eliminate frequencies above B Hz (must be noise)
• Analog approach: use electrical circuit (e.g., treble control) (See EECS 215/216)
• Digital approach:
  ◦ sample signal
  ◦ compute spectrum using FFT
  ◦ set to zero portions of the spectrum that are just noise
  ◦ use inverse FFT (ifft) to synthesize improved signal
  ◦ (Many more ways described in DSP course EECS 351.)

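Removing noise digitally: Details

• Sample using S > 2B.
• Recall $f = \frac{k}{N} S$ and disp((2/N)*abs(fft(x))) gives:

  $[\,\underbrace{2c_0 \;\; c_1 \;\; \ldots \;\; c_{K-1}}_{\text{low}} \;\; \underbrace{c_K \;\; \ldots \;\; c_{N/2-1}}_{\text{high}} \;\; \underbrace{c_{N/2}}_{\text{highest}} \;\; \underbrace{c_{N/2-1} \;\; \ldots \;\; c_K}_{\text{high}} \;\; \underbrace{c_{K-1} \;\; \ldots \;\; c_1}_{\text{low}}\,]$

  (the entries after $c_{N/2}$ are the mirror of those before it)
• Find index K such that (K − 1)S/N = B. Then we want:

  $[\,\underbrace{2c_0 \;\; c_1 \;\; \ldots \;\; c_{K-1}}_{\text{low}} \;\; \underbrace{0 \;\; \ldots \;\; 0}_{\text{high}} \;\; \underbrace{0}_{\text{highest}} \;\; \underbrace{0 \;\; \ldots \;\; 0}_{\text{high}} \;\; \underbrace{c_{K-1} \;\; \ldots \;\; c_1}_{\text{low}}\,]$

• Remove frequency components above B using fft, then synthesize “improved” signal using ifft (review a(k) = 0;):

    Yf = fft(y);
    Zf = Yf;
    Zf(K+1:N+1-K) = 0;
    z = real(ifft(Zf));

• This is called low-pass filtering (via FFT).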
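The recipe above can be tried end to end on synthetic data. The following is a minimal sketch, not the Lab 3 solution: the test signal, the noise level, and the values of S, N, and B are all made up for illustration.

```matlab
% Sketch: low-pass filtering via FFT on a synthetic noisy signal.
S = 1000;  N = 1000;  B = 10;        % assumed rate [Hz], length, band limit [Hz]
t = (0:N-1)/S;
x = 2*cos(2*pi*3*t) + cos(2*pi*7*t); % made-up signal, band-limited below B
y = x + 0.5*randn(size(x));          % noisy measurement y = x + noise
K = round(B*N/S) + 1;                % Matlab index K with (K-1)*S/N = B
Yf = fft(y);
Zf = Yf;
Zf(K+1:N+1-K) = 0;                   % zero everything above B, including the mirror
z = real(ifft(Zf));                  % real() drops rounding-error imaginary parts
err_before = sqrt(mean((y - x).^2)); % RMS error of the noisy signal
err_after  = sqrt(mean((z - x).^2)); % RMS error after low-pass filtering
fprintf('RMS error: %.3f before, %.3f after\n', err_before, err_after);
```

Only the noise that falls at or below B survives, so the error after filtering should be much smaller than before but not zero.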

Example: Low-pass filtering a noisy EKG signal

• EKG = Electrocardiogram (heart) electric signal.
• Periodic (we hope!) with period about 1 second (60 beats per minute = 1 per second).
• Components at 1, 2, 3, ... Hz up to about 7 Hz.
• So: We eliminate all frequencies above 7 Hz.
• Result: high-frequency noise eliminated, but low-frequency noise remains.
• See next slide for signals and their spectra.

Example: Low-pass filtering a noisy EKG signal

[Figure: EKG signals versus t [sec] and their spectra versus f [Hz].]

Example: Removing noise from mystery signal

[Figure: noisy signal y(t) versus t [sec] for t from 0 to 1; vertical range −20 to 20.]

• Known: N = 8000 samples at S = 8000 Hz
• What is the signal x(t) buried in this noise?
• How do we recover it from y(t)?
• (cf. restoring old analog recordings, movies, etc.)

Spectrum of mystery signal

[Figure: spectrum of y versus frequency index k + 1, from 1 to 8000; vertical range 0 to 1.2.]

• stem(2/N*abs(fft(y)))
• This spectrum also looks like random noise!?
• Look at vertical scale: large peaks near 1 and 8000.
• Zoom in to see exactly where the peaks are (next slide)

Spectrum of mystery signal: Zoomed

[Figure, two panels: the spectrum of y zoomed in near index k + 1 = 1 (left) and near k + 1 = 8000 (right); vertical range 0 to 1.2.]

• Large peaks at Matlab array index 6 and 7996 = 8000 + 1 − (6 − 1)
• (If we counted from 0, then peaks at k = 5 and N − k = 8000 − 5 = 7995.)
• Sinusoid frequency (6 − 1)S/N = 5(8000)/8000 = 5 Hz
• Now filter out all other frequencies (all but 6 and 7996).

Filtering noisy mystery signal

Generating data:

    N = 8000; S = 8000;
    n = 0:N-1; t = n/S;
    x = cos(2*pi*5*t);
    y = x + 4 * randn(size(x));

Processing data:

    Yf = fft(y);
    Zf = Yf;
    Zf(1:5) = 0; Zf(7:7995) = 0; Zf(7997:8000) = 0;
    z = real(ifft(Zf));

[Figure: filtered signal z(t) versus t [sec] for t from 0 to 1; vertical range −1 to 1.]

Did our noise removal method work perfectly???
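One way to explore that question is to compare z with the known x. This sketch reuses the data-generation code from the slide; the error measures maxerr and cz are hypothetical additions for illustration.

```matlab
% Sketch: how close is the filtered signal z to the true signal x?
N = 8000;  S = 8000;  t = (0:N-1)/S;
x = cos(2*pi*5*t);
y = x + 4*randn(size(x));
Yf = fft(y);
Zf = zeros(size(Yf));
Zf([6 7996]) = Yf([6 7996]);     % same effect as zeroing every other entry
z = real(ifft(Zf));
maxerr = max(abs(z - x));        % largest pointwise error
cz = (2/N)*abs(Yf(6));           % amplitude of the recovered 5 Hz sinusoid (ideal: 1)
fprintf('max error %.3f, recovered amplitude %.3f\n', maxerr, cz);
```

The noise that happens to fall exactly at 5 Hz passes through the filter, so z is a 5 Hz sinusoid whose amplitude and phase are slightly off. The exact numbers change on every run because randn is not seeded here.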

Application: Pitch tracking (transcription) for songs
Spectrogram (Lab 3)

Music transcription: Songs?

Spectrum of “The Victors”

    load victors_tone.mat;
    S = 8192;
    N = length(x);
    stem(2/N*abs(fft(x)))

[Figure: victors spectrum versus Matlab frequency index k+1 for the whole signal (N = 78000); vertical range 0 to 0.4.]

Zoom in:

[Figure: zoomed victors spectrum versus Matlab frequency index k+1; vertical range 0 to 0.4.]

What notes? What order? What song?

Need a time-varying spectrum to transcribe songs.

Time-varying spectral analysis

• Idea: Subdivide long signal into shorter time segments.
• Hopefully notes do not change during most time segments.
• Apply fft to each signal segment.
• Examine spectra for each segment to identify note(s) played.
• If we make many short segments, then we get many spectra.

How to best display many spectra?
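The segment-by-segment idea can be sketched on a synthetic “song”. Everything in this example is an assumption for illustration: the note frequencies, the sampling rate, and the segment length, with each note lasting exactly one segment.

```matlab
% Sketch: per-segment spectra to track the pitch of a synthetic note sequence.
S = 8192;  N = 2048;                 % assumed rate [Hz] and samples per segment
notes = [440 494 523 440];           % made-up note frequencies [Hz]
t = (0:N-1)/S;
x = [];
for fn = notes
    x = [x, cos(2*pi*fn*t)];         % each note lasts exactly one segment
end
L = length(x)/N;                     % number of segments
z = reshape(x, N, L);                % one segment per column
Z = (2/N)*abs(fft(z));               % fft works column by column
[~, idx] = max(Z(1:N/2, :));         % strongest entry in each segment
fest = (idx - 1)*S/N;                % estimated frequency [Hz] per segment
disp(fest)
```

The estimates can only land on multiples of S/N (4 Hz here), so a note between two multiples is reported at a nearby one. In real recordings, notes rarely line up with segment boundaries this neatly.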

Displaying many 1D arrays: stem plots

[Figure: a series of stem plots, one per segment, each versus f [Hz].]

But a picture is worth 1000 words (or plots)...

Displaying many 1D arrays: A grayscale image

Lincoln illusion by Harmon & Julesz 1973

[Figure: coarse grayscale image with axes running from 1 to 19 and from 1 to 15, with a colorbar from 0 to 250.]

We will use grayscale images for sequences of spectra.

Spectrogram

A grayscale display of spectra of many sequential short signal segments is called a spectrogram.

-gram: from Greek root meaning something that is drawn.

• Horizontal axis is time (segment number) • Vertical axis is frequency

• Grayscale intensity (or color) for the coefficients {ck}

Useful for signals whose spectra vary with time, e.g., songs!

Spectrogram in Matlab

    load train; (y);                 % length(y) = 12880 = 560 · 23
    N = 560;                         % choice of # of samples per segment
    L = 23;                          % choice of # of segments
    z = reshape(y, N, L);            % N × L array
    imagesc(2/N*abs(fft(z))); colorbar

[Figure: spectrogram image; frequency index k+1 from 50 to 550 (vertical) versus time segment from 2 to 22 (horizontal); colorbar from 0.05 to 0.45.]

Matlab’s fft computes FFT of each column of a 2D array.

What is size of spectrogram array here???

Spectrogram of train whistle

Make it look better:

    colormap(gray); axis xy; axis([0 L+1 0 100])

[Figure (play): grayscale spectrogram of the train whistle; frequency index k+1 from 0 to 100 (vertical) versus time segment from 1 to 23 (horizontal); colorbar from 0.05 to 0.45.]

What is pitch of lowest note of chord???
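To read pitches directly from such a display, it can help to convert both axes to physical units. This is a sketch under stated assumptions: it uses a synthetic 880 Hz tone as a stand-in for the train signal, and it assumes S = 8192 Hz (the value of the Fs variable that train.mat is expected to provide). The display commands are left as comments so the computation stands on its own.

```matlab
% Sketch: spectrogram array with frequency [Hz] and time [s] axes.
S = 8192;  N = 560;  L = 23;         % S is an assumption (Fs in train.mat)
y = cos(2*pi*880*(0:N*L-1)/S);       % synthetic stand-in for the train signal
Z = (2/N)*abs(fft(reshape(y, N, L)));% N x L spectrogram array
Zhalf = Z(1:N/2+1, :);               % keep indices 1..N/2+1, drop the mirror
f = (0:N/2)*S/N;                     % frequency [Hz] of each kept row
tseg = ((1:L) - 0.5)*N/S;            % center time [s] of each segment
[~, i1] = max(Zhalf(:, 1));          % strongest row in the first segment
fprintf('peak near %.1f Hz, rows spaced %.2f Hz apart\n', f(i1), S/N);
% To display with labeled axes:
%   imagesc(tseg, f, Zhalf); axis xy; colormap(gray); colorbar
%   xlabel('t [sec]'); ylabel('f [Hz]')
```

With N = 560 the rows are about 14.6 Hz apart, so any pitch read from this spectrogram is only accurate to within that spacing.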

Spectrogram of a song

victors_tone.mat contains the signal of a 26-note song (Lab 2). Each note has 3000 samples for sampling rate S = 8192 Hz.

    load victors_tone.mat;
    N = 3000; L = 26;
    z = reshape(x, N, L);
    imagesc(2/N*abs(fft(z))); colorbar
    colormap(gray); axis xy; axis([0 L+1 0 300])

Spectrogram of “The Victors”

[Figure (play): grayscale spectrogram; frequency index k+1 from 0 to 300 (vertical) versus note number from 0 to 25 (horizontal); colorbar from 0.1 to 0.9. Shown with a piano-roll image: http://haostaff.com/store/images/pianoroll.gif]

Each column of this spectrogram is one spectrum shown in gray scale.

What kind of signal is each note???

This representation (time horizontal, frequency vertical) is a lot like music!

Frequency of highest tone? $f = \frac{k}{N} S = \frac{242}{3000}\, 8192$ Hz = 660.8 Hz (E)

Real music spectrograms

Bizarre “hidden” spectrogram from song by “Aphex Twin” [wiki]

Violin playing: [wiki]


Example: Spectrogram of chirp

Chirp signal: $x(t) = \cos(2\pi \alpha t^2)$
◦ birds, dolphins, some forms of radar, slide trombone?
◦ Instantaneous frequency increases linearly with time
◦ Instantaneous frequency = 2αt (not αt) Hz.

    t = [0:8191]/250;
    x = cos(2*pi*t.^2);
    plot(t, x)

[Figure: x(t) versus t for t from 0 to 4; vertical range −1 to 1.]

Here S = 250 Hz and α = 1, so the instantaneous frequency increases from 0 to 2 ∗ 8191/250 = 65.5 Hz.
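The factor of 2 in the instantaneous frequency comes from differentiating the phase. As a short worked derivation, using the usual definition of instantaneous frequency as the phase derivative divided by 2π (a definition not spelled out on the slide):

```latex
x(t) = \cos\bigl(\phi(t)\bigr), \qquad \phi(t) = 2\pi \alpha t^2
\quad\Longrightarrow\quad
f_{\mathrm{inst}}(t) = \frac{1}{2\pi}\,\frac{d\phi}{dt}
                     = \frac{1}{2\pi}\,\bigl(4\pi \alpha t\bigr)
                     = 2\alpha t .
```

With α = 1 and the last sample at t = 8191/250 ≈ 32.76 s, this gives about 65.5 Hz, matching the value stated above.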

Example: Spectrogram of chirp

    imagesc(abs(fft(reshape(x,256,32))))
    colormap(gray), axis xy, axis([0 33 0 256])

[Figure: spectrogram of the chirp; frequency index from 0 to 250 (vertical) versus time segment from 1 to 32 (horizontal).]

Final segment has frequency index 68.

Corresponding frequency $f = \frac{k}{N} S = \frac{68 - 1}{256}\, 250 = 65.4$ Hz.

This is roughly the average of the frequencies in the final time segment, where the instantaneous frequency sweeps from about 63.5 Hz to 65.5 Hz.

Sound of a higher-frequency chirp: play

Summary of spectra/spectrogram

• With spectrum, we can determine frequency components of one or more notes played simultaneously
• Need S > 2B
• Can remove or reduce some forms of noise / interference
  ◦ use fft to analyze spectrum
  ◦ can modify spectrum
  ◦ Example: set some values to zero
  ◦ Example: shift some values to other frequencies (e.g., auto-tune)
  ◦ use ifft to synthesize modified signal (e.g., signal with reduced noise or interference)
• Spectrogram lets us analyze sequences of notes (songs)

Aliasing: audio example

    S = 8192;
    t = [0:1/S:0.3];
    x = 0.9*[cos(2*pi*2800*t), cos(2*pi*3800*t)];   % play
    y = 0.9*[cos(2*pi*3800*t), cos(2*pi*4800*t)];   % play

[Figure: y(t) versus t for t from 0 to 6×10^−4, showing the samples together with a 4800 Hz sinusoid and a 3392 Hz sinusoid; vertical range −1 to 1.]

arccos method says 3392 Hz, not 4800 Hz for this example

Is S > 2B here???
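The 3392 Hz value is S − 4800 = 8192 − 4800. A short sketch (with hypothetical variable names) can confirm that, at this sampling rate, a 4800 Hz cosine and a 3392 Hz cosine produce the same samples, which is why no method working from the samples alone can tell them apart.

```matlab
% Sketch: at S = 8192 Hz, 4800 Hz and S - 4800 = 3392 Hz give identical samples.
S = 8192;
n = 0:floor(0.3*S);                  % sample indices covering about 0.3 s
t = n/S;
y4800 = cos(2*pi*4800*t);            % samples of a 4800 Hz cosine
falias = S - 4800;                   % candidate alias frequency [Hz]
y3392 = cos(2*pi*falias*t);          % samples of a cosine at the alias frequency
maxdiff = max(abs(y4800 - y3392));   % difference between the two sample sets
fprintf('alias at %d Hz, max sample difference %.2e\n', falias, maxdiff);
```

The difference should be at rounding-error level, since the two arguments differ by exactly 2πn at every sample.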

Audio spectrum equalizers: Analog and digital

[wiki] spotify equalizer