DSP: The Short-Time Fourier Transform (STFT)
Digital Signal Processing The Short-Time Fourier Transform (STFT)
D. Richard Brown III
D. Richard Brown III 1 / 14 DSP: The Short-Time Fourier Transform (STFT) Signals with Changing Frequency Content: Motivation
For signals that have frequency content that is changing over time, e.g., music, speech, ..., taking the DFT of the whole signal usually doesn’t provide much insight.
Example: Two second linear chirp
N = 16000; n = 0:N-1; x = cos(2*pi/80000*(n+1000).^2); % linear chirp soundsc(x,8000) % listen to sound
D. Richard Brown III 2 / 14 DSP: The Short-Time Fourier Transform (STFT) Example Continued: FFT of Whole Signal
plot(2*n/N,20*log10(abs(fft(x)))); % FFT of whole signal
60
40
20
0 FFT magnitude
−20
−40
−60 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2 normalized frequency (ω/π
D. Richard Brown III 3 / 14 DSP: The Short-Time Fourier Transform (STFT) Example Continued: FFTs of Smaller Chunks
plot(2*[0:127]/128,20*log10(abs(fft(x(1:128))))); % FFT near beginning plot(2*[0:127]/128,20*log10(abs(fft(x(8001:8128))))); % FFT near middle plot(2*[0:127]/128,20*log10(abs(fft(x(15001:15128))))); % FFT near end
35 40 40
30 30 30
25 20 20
20 10 10
15 0 0 FFT magnitude FFT magnitude FFT magnitude 10 −10 −10
5 −20 −20
0 −30 −30
−5 −40 −40 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2 normalized frequency (ω/π normalized frequency (ω/π normalized frequency (ω/π
D. Richard Brown III 4 / 14 DSP: The Short-Time Fourier Transform (STFT) Short-Time Fourier Transform
Rather than analyzing the frequency content of the whole signal, we can analyze the frequency content of smaller snapshots. The STFT is defined as ∞ X X[n, λ) = x[n + m]w[m]e−jλm m=−∞ where n ∈ Z is a time index and λ ∈ R is a normalized frequency index.
Remarks: 1. Implicit in this definition is a window function w[n] of length L. 2. The STFT results in a family of DTFTs indexed by n. 3. In practice, we don’t usually compute X[n, λ) for all n = 0, 1,... since the frequency content of the signal does not change much from sample to sample. Instead, we typically pick a “block shift” integer R ≤ L and compute the STFT for values of n = 0,R, 2R,... .
D. Richard Brown III 5 / 14 DSP: The Short-Time Fourier Transform (STFT) Short-Time Fourier Transform
x[n] R=8
n
w[n] 1 L=8
n X ,λ x[n]w[n] [0 ) DTFT
λ n
x[n+8]w[n] X[8,λ) DTFT
λ n
x[n+16]w[n] X[16,λ) DTFT
λ n
etc.
D. Richard Brown III 6 / 14 DSP: The Short-Time Fourier Transform (STFT) Short-Time Fourier Transform with the DFT/FFT
We can also use the DFT/FFT to compute the STFT as L−1 X −j2πkm/N X[n, k] = x[n + m]w[m]e . m=0
x[n] R=8
n
w[n] 1 L=8
n N=8 x[n]w[n] X[0,k] DFT k n x[n+8]w[n] X[8,k] DFT k n x[n+16]w[n] X[16,k] DFT k n D. Richard Brown III 7 / 14 DSP: The Short-Time Fourier Transform (STFT) Short-Time Fourier Transform Parameters
1. Window type
I Tradeoff between side lobe amplitude ASL and main lobe width ∆ML 2. Window length L
I Larger L gives better frequency resolution (smaller ∆ML) I Smaller L gives less temporal averaging 3. Temporal block shift samples R
I Usually R ≤ L and is related to L, e.g., R = L or R = L/2 I Larger R result in less computation I Smaller R gives “smoother” results 4. FFT length N
I Usually N ≥ L. I Larger N gives more frequency-domain samples of DTFT (better location and amplitude of peaks) I Smaller N results in less computation.
D. Richard Brown III 8 / 14 DSP: The Short-Time Fourier Transform (STFT) Matlab Spectrogram Example
Matlab function spectrogram is useful for easily computing STFTs. [s,f,t] = spectrogram(x,kaiser(512,2),256,1024,8000); % x = lin chirp image(t,f,20*log10(abs(s))); set(gca,’ydir’,’normal’); colorbar;
D. Richard Brown III 9 / 14 DSP: The Short-Time Fourier Transform (STFT) Matlab Spectrogram Example (decrease R to 1)
[s,f,t] = spectrogram(x,kaiser(512,2),511,1024,8000); % x = lin chirp image(t,f,20*log10(abs(s))); set(gca,’ydir’,’normal’); colorbar;
D. Richard Brown III 10 / 14 DSP: The Short-Time Fourier Transform (STFT) Matlab Spectrogram Example (rectangular window)
[s,f,t] = spectrogram(x,ones(512,1),511,1024,8000); % x = lin chirp image(t,f,20*log10(abs(s))); set(gca,’ydir’,’normal’); colorbar;
D. Richard Brown III 11 / 14 DSP: The Short-Time Fourier Transform (STFT) Matlab Spectrogram Example (longer window)
[s,f,t] = spectrogram(x,kaiser(1024,2),1023,1024,8000); % x = lin chirp image(t,f,20*log10(abs(s))); set(gca,’ydir’,’normal’); colorbar;
D. Richard Brown III 12 / 14 DSP: The Short-Time Fourier Transform (STFT) Speech Signals Example
ref−mic [quiet] phys−mic [quiet] TERC sensor [quiet] 1 1 1
0.8 0.8 0.8
0.6 0.6 0.6
0.4 0.4 0.4 frequency (kHz) 0.2 0.2 0.2
0 0 0 0 2 4 0 2 4 0 2 4
ref−mic [high−noise] phys−mic [high−noise] TERC sensor [high−noise] 1 1 1
0.8 0.8 0.8
0.6 0.6 0.6
0.4 0.4 0.4 frequency (kHz) 0.2 0.2 0.2
0 0 0 0 2 4 0 2 4 0 2 4 time (s) time (s) time (s)
D. Richard Brown III 13 / 14 DSP: The Short-Time Fourier Transform (STFT) “Aphex Face” Spectrogram (log scale frequency axis)
On Aphex Twin’s Windowlicker album, track 2, ∼ 5:27–5:37. D. Richard Brown III 14 / 14