
What is the Frequency Domain?

We live in the time domain. The sounds that our ears perceive exist in time, and we have no other way to perceive them. Sound cannot be frozen like a moving image. Stop a movie, and you see a fully discernible image. Stop a sound and it ceases to exist.

However, we can use computers to analyze frozen signals. We do this by transforming sound from the time domain into the frequency domain using the Fourier Transform. With the Fourier Transform, we can pause a signal to look at its momentary frequency content. We can also alter and filter thin slices of the frequency spectrum.

In the time domain, we think about sound as a series of pressure waves that hit our ear over a period of time. We also often picture a sound as a sine wave or a more complex waveform. In the frequency domain, we are able to look at a small slice of a signal and see what frequencies are present. We are able to look at the entire spectrum, from below the threshold of human hearing up to ultrasonic frequencies above it, limited only by the sample rate.

FFT:

We use an FFT, or Fast Fourier Transform, to convert a signal from the time domain (our world) to the frequency domain (another dimension). An FFT takes a very small section of sound from the time domain (512, 1024, 2048, or any other power-of-2 number of samples) and converts it into the frequency domain. The math of the Fourier Transform is beyond the scope of this class, but you can still use it as a powerful tool for sound transformation.
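To make the conversion concrete, here is a minimal sketch in Python using only the standard library. It is not the implementation any particular audio environment uses; the `fft` function, the constants, and the test tone are all assumptions chosen for illustration. It builds 512 samples of a sine wave whose frequency sits exactly on bin 12, transforms them, and finds the bin with the most energy.

```python
import cmath
import math

def fft(samples):
    """Recursive radix-2 FFT. len(samples) must be a power of 2."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])  # transform of the even-indexed samples
    odd = fft(samples[1::2])   # transform of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

SAMPLE_RATE = 44100   # assumed sample rate in Hz
FFT_SIZE = 512        # power of 2
TARGET_BIN = 12       # illustrative choice: a tone centered on bin 12

# Frequency in Hz that falls exactly on TARGET_BIN.
freq = TARGET_BIN * SAMPLE_RATE / FFT_SIZE

# Time domain: one window of a sine wave.
signal = [math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i in range(FFT_SIZE)]

# Frequency domain: keep the bins from DC up to Nyquist.
spectrum = fft(signal)
magnitudes = [abs(c) for c in spectrum[: FFT_SIZE // 2 + 1]]

peak_bin = max(range(len(magnitudes)), key=magnitudes.__getitem__)
peak_freq = peak_bin * SAMPLE_RATE / FFT_SIZE
print(peak_bin, peak_freq)
```

The tone is placed exactly on a bin so that all of its energy lands in one place. A frequency that falls between two bins would spread its energy across the neighboring bins.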

We use a buffer to store FFT information. The size of the buffer must be a power of 2, such as 512 or 1024, and it has significant implications. A buffer of 512 samples looks at a slice of audio that is 512 samples long. A buffer of 1024 samples looks at a slice of audio that is 1024 samples long.

A sound in the time domain is broken up into windows of 512, 1024, etc. samples. Each window can then be transformed into the frequency domain.
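The windowing step can be sketched as follows. This is a simplified illustration: the helper name `split_into_windows` is hypothetical, the windows do not overlap, and the last window is padded with zeros. Real FFT processing often overlaps windows and shapes each one with a window function, which this sketch leaves out.

```python
def split_into_windows(samples, size):
    """Split samples into consecutive windows of `size` samples each."""
    if size < 2 or size & (size - 1) != 0:
        raise ValueError("window size must be a power of 2")
    windows = []
    for start in range(0, len(samples), size):
        chunk = list(samples[start:start + size])
        # Zero-pad the final window if the sound does not divide evenly.
        chunk += [0.0] * (size - len(chunk))
        windows.append(chunk)
    return windows

# One second of silence at an assumed 44100 Hz sample rate.
one_second = [0.0] * 44100
windows = split_into_windows(one_second, 512)
print(len(windows), len(windows[0]))
```

One second of audio at 44100 Hz does not divide evenly into 512-sample windows, so the last window here is mostly padding.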

The FFT buffer for a chunk of audio contains the information derived from the Fourier Transform. It stores spectral data in a local buffer in the following order: DC and Nyquist are the first two samples. Then each “bin” of the FFT has two pieces of information: amplitude and phase. So, if we use a 512 sample FFT, the first two samples are DC and Nyquist. Then there are 255 “bins” of information, each with amplitude and phase. The bins are evenly spaced between DC (0 Hz) and Nyquist (22050 Hz at a 44100 Hz sample rate), so the separation between bins is the sample rate divided by the FFT size: 44100/512, or about 86.1 Hz. So we will have spectral information for 86.1 Hz wide slices of the audio spectrum.
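The layout described above can be sketched in Python. This is only an illustration of the ordering (DC, Nyquist, then amplitude and phase pairs). The names `dft_bin` and `pack_fft_buffer` are hypothetical, and a slow direct Fourier sum stands in for a real FFT to keep the example short. The bin spacing is computed as the sample rate divided by the FFT size.

```python
import cmath
import math

def dft_bin(samples, k):
    """Direct (slow) Fourier sum for a single bin k."""
    n = len(samples)
    return sum(s * cmath.exp(-2j * math.pi * k * i / n)
               for i, s in enumerate(samples))

def pack_fft_buffer(samples):
    """Return [DC, Nyquist, amp1, phase1, amp2, phase2, ...]."""
    n = len(samples)
    half = n // 2
    # DC and Nyquist are purely real for real-valued input.
    buffer = [dft_bin(samples, 0).real, dft_bin(samples, half).real]
    for k in range(1, half):
        value = dft_bin(samples, k)
        buffer.append(abs(value))          # amplitude of bin k
        buffer.append(cmath.phase(value))  # phase of bin k, in radians
    return buffer

SAMPLE_RATE = 44100  # assumed sample rate in Hz
FFT_SIZE = 512

# Test tone: a cosine that completes exactly 10 cycles in the window (bin 10).
samples = [math.cos(2 * math.pi * 10 * i / FFT_SIZE) for i in range(FFT_SIZE)]
buffer = pack_fft_buffer(samples)

bin_count = (len(buffer) - 2) // 2     # bins between DC and Nyquist
bin_width = SAMPLE_RATE / FFT_SIZE     # spacing between bins in Hz
amp_of_bin_10 = buffer[2 + 2 * (10 - 1)]
print(len(buffer), bin_count, bin_width, amp_of_bin_10)
```

With this ordering, the packed data fits exactly in a buffer the same size as the FFT: 2 samples for DC and Nyquist plus 255 pairs makes 512.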

The larger the FFT size, the more spectrally accurate the analysis. So, if we use a 1024 sample FFT buffer, the width of our audio slices is about 43.1 Hz, and we have twice as many bins. The downside is that each FFT now covers twice as much time and is done half as often, so the analysis is half as accurate as far as rhythm is concerned. So, a great rule of thumb is: if you want to be spectrally accurate, use a large FFT buffer. If you want to be rhythmically accurate, use a small FFT buffer.
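The trade-off can be worked out with a little arithmetic. The sketch below assumes a 44100 Hz sample rate, and the helper name `resolution` is made up for this example. For each FFT size it prints the width of a bin in Hz and the length of one window in milliseconds.

```python
SAMPLE_RATE = 44100  # assumed sample rate in Hz

def resolution(fft_size, sample_rate=SAMPLE_RATE):
    """Return (bin width in Hz, window length in ms) for an FFT size."""
    bin_width_hz = sample_rate / fft_size
    window_ms = 1000 * fft_size / sample_rate
    return bin_width_hz, window_ms

for size in (512, 1024, 2048, 4096):
    width, duration = resolution(size)
    print(f"{size:5d} samples: {width:6.2f} Hz per bin, {duration:6.2f} ms per window")
```

Each doubling of the FFT size halves the bin width and doubles the time each window covers. A 512 sample window lasts about 11.6 ms, while a 4096 sample window lasts about 92.9 ms, long enough to smear fast rhythmic detail.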