Acoustics and Hearing

Signal Processing, Part 3

Introduction

As computing power has become more affordable, digital signal processing has become less esoteric. It is now possible to do things on the desktop that were impractical or even impossible a decade ago. In this section we shall examine the concepts of time compression, pitch shifting, and noise reduction. We shall also examine a variety of special purpose effects and other processing tasks.

Time Compression/Expansion and Pitch Shifting

These two processes use similar techniques, so we shall examine them together. There are many applications for time compression/expansion and pitch shifting. Pitch shifting is useful to correct a note that a singer hit flat or sharp, or to make it possible to play instruments that are out of tune (and not simple to re-tune, such as a piano) along with pre-recorded material. Time compression/expansion is useful to squeeze a narration into a specific time slot (such as a commercial), or to change the tempo of a pre-recorded song.

Early attempts at pitch shifting simply involved recording a sound on a magnetic tape recorder and then changing the speed of the playback motor. For example, if the speed was doubled, the tape went by the reproduce head twice as fast, yielding a sound twice as high in pitch (one octave). This is the technique used for the famous “chipmunks” effects. Although the process is straightforward, it has a downside: the pitch shift also produces a time change. If the playback speed is doubled the pitch will be doubled, but the time will be cut in half. In order to keep time constant, the signal must be recorded at an artificially slow or fast rate. For example, to achieve the octave pitch increase, the individual must speak at half the normal speed on the original recording. Failure to do so will produce a nearly unintelligible blast of words. An alternate view of this process would be to consider the time shift as desirable, and the resulting pitch shift as the undesired side effect. For example, suppose someone was recording a 30 second commercial and the narrator gave an otherwise perfect performance of 33 seconds. If the playback speed was increased by 10%, the result would fit within the 30 second allotment. Unfortunately, there would be a 10% pitch rise as well.
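
The arithmetic of the commercial example can be checked with a short sketch. The function names are illustrative only, and expressing the pitch rise in semitones assumes equal temperament (12 semitones per octave), which the text does not state.

```python
import math

def speed_change(original_seconds, target_seconds):
    """Playback speed factor needed to fit a recording into the target time."""
    return original_seconds / target_seconds

def pitch_shift_semitones(speed_factor):
    """Pitch change caused by a plain speed change, in equal-tempered semitones."""
    return 12.0 * math.log2(speed_factor)

factor = speed_change(33.0, 30.0)        # 33 s narration, 30 s slot: 10% faster
shift = pitch_shift_semitones(factor)    # the unwanted pitch rise
octave = pitch_shift_semitones(2.0)      # doubling the speed raises pitch one octave
print(f"speed x{factor:.2f}, pitch +{shift:.2f} semitones, double speed = +{octave:.0f}")
```

Under these assumptions a 10% speed increase raises the pitch by a little more than one and a half semitones, which is easily audible on a voice.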

Considering pitch shifting for a moment, even if the recording rate is sped up or slowed down to compensate for the resulting time shift, the result will not sound quite right. In the case of human voice and many musical instruments, an increase in pitch does not result in a simple translation of all harmonics to new frequencies. Due to resonance effects, some of the new harmonics will be reduced or boosted in amplitude. In the case of time change, singing or speaking at a new pitch to avoid unwanted pitch changes after the time shift processing will result in a similar problem. Quite simply, if you record someone speaking an octave below their normal pitch and at half their normal speed, doubling the playback rate will not produce their voice as normally heard.

ET163 Audio Technology Lecture Notes: Signal Processing, Part 3

Can a sound be stretched in time without changing pitch, or shifted in pitch without changing time? Given modern digital signal processing techniques, the answer is yes. The processes are not trivial, however, and the details are beyond the scope of this course. This much we can say: there are two basic techniques. The first involves auto-correlation, or the ability to find similarities in a waveform at differing times. For example, if you can find several similar cycles of a waveform in sequence, you can clip one out to shorten the sound or add one to lengthen it, and the pitch will not change. A more complicated technique uses a Fourier Transform to break the time domain signal into its frequency components. These components can be scaled or shifted in the spectrum and then reassembled into a new time domain signal that does not suffer from a time shift. If a signal can be shifted in frequency without a time shift, then it must also be possible to shift time without a frequency shift (see footnote 1).
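
The auto-correlation idea can be illustrated with a minimal sketch. It assumes a perfectly periodic test tone and a known search range for the period; the function names, the 50-sample period, and the window size are all hypothetical choices, not part of the course material. Real material is never this regular, so practical tools add cross-fading and much more careful matching.

```python
import math

def find_period(samples, min_lag, max_lag, window):
    """Return the lag (in samples) at which the signal best matches itself."""
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Auto-correlation at this lag, measured over a fixed window.
        score = sum(samples[i] * samples[i + lag] for i in range(window))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def remove_one_cycle(samples, start, period):
    """Shorten the sound by cutting out one cycle beginning at index start."""
    return samples[:start] + samples[start + period:]

# Hypothetical test tone: 400 samples of a sine with a 50-sample period.
tone = [math.sin(2 * math.pi * n / 50) for n in range(400)]
period = find_period(tone, min_lag=20, max_lag=80, window=200)
shorter = remove_one_cycle(tone, start=100, period=period)
print(period, len(tone), len(shorter))
```

Because exactly one cycle is removed, the waveform on either side of the cut lines up and the remaining cycles keep their original length, so the sound gets shorter while its pitch stays the same.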

Noise Reduction

Noise is the nemesis of the audio engineer. The engineer does everything possible to minimize noise. If a signal already contains noise, what can be done to remove or at least reduce it? One possibility is the use of a noise gate. Noise gates are primarily used to reduce leakage. For example, a snare drum microphone will pick up annoying rattles and buzzes from the drum between notes. A noise gate can be thought of as an amplifier with an intelligent on/off switch. If signal levels are below a certain threshold (adjustable by the engineer), the gate will be off. Once the signal rises above the threshold, the gate will allow the signal through. Thus the engineer will adjust the gate’s threshold level to be above the annoying buzzes and rattles, but below the purposeful snare drum strikes.
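
A noise gate of this kind can be sketched in a few lines. The version below is a simplification under stated assumptions: it follows the signal level with a basic peak envelope (instant rise, slow decay set by a hypothetical `release` factor) and switches the output fully on or off, whereas hardware gates usually fade in and out to avoid clicks.

```python
import math

def noise_gate(samples, threshold, release=0.99):
    """Pass samples while the signal envelope is above threshold, else mute."""
    envelope = 0.0
    out = []
    for x in samples:
        # Peak envelope: jumps up immediately, decays slowly between peaks.
        envelope = max(abs(x), envelope * release)
        out.append(x if envelope >= threshold else 0.0)
    return out

# Hypothetical snare track: 100 samples of quiet buzz, then a loud strike.
buzz = [0.05 * math.sin(2 * math.pi * n / 7) for n in range(100)]
strike = [0.8 * math.sin(2 * math.pi * n / 20) for n in range(100)]
track = buzz + strike
gated = noise_gate(track, threshold=0.2)
print(max(abs(v) for v in gated[:100]), max(abs(v) for v in gated[100:]))
```

The threshold of 0.2 sits above the buzz (peak 0.05) and below the strike (peak 0.8), which mirrors how the engineer sets the gate in the snare drum example.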

Can anything be done about specific noises beyond leakage, such as 60 Hz hum or tape hiss? Well-defined narrow band noises such as hum can be removed with sharp notch filters without incurring great degradation to the original signal. Broad band noises such as tape hiss are more problematic. One technique involves the use of a sliding low pass filter. The filter’s cutoff frequency is controlled by a spectral estimation circuit. If considerable high frequency content is noted, the filter is opened, meaning all signals will pass through. If very little high frequency content is noted, it is assumed that this is the undesired noise alone. The cutoff frequency is then slid down, eliminating the noise. As there is no desired signal present, nothing of value is lost. When the desired signal is present along with the noise, no filtering will take place, but hopefully the noise will be masked by the louder high frequency content of the desired signal. It is possible to combine this idea with the Fourier Transform mentioned earlier, in effect making a series of narrow band pass filters, each sensing whether sufficient signal is present. If appreciable signal does not exist in a band, the output of that filter is turned off. This idea can be further enhanced by making the threshold of each filter band adjustable. In this way, the filtering can more accurately adapt to a noise that doesn’t have a constant spectral density. All that is needed for this process is a portion of sound that consists of just the offending noise and nothing else. The noise is then examined via the Fourier Transform, producing a sort of “noise fingerprint” from which the threshold for each band can be set.
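
The band-by-band gating idea can be sketched as follows. This is a toy illustration under several assumptions: it processes a single 64-sample frame with a slow, direct Fourier Transform, it uses three weak tones as a deterministic stand-in for hiss (real hiss is random), and the `margin` parameter and all function names are hypothetical. Practical noise reducers work on overlapping, windowed frames and use a fast transform.

```python
import cmath
import math

def dft(frame):
    """Direct (slow) Fourier Transform of one frame."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def inverse_dft(spectrum):
    """Rebuild the real time domain frame from its spectrum."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real for t in range(n)]

def noise_fingerprint(noise_frame):
    """Level of each frequency band in a frame that holds only noise."""
    return [abs(c) for c in dft(noise_frame)]

def spectral_gate(frame, fingerprint, margin=2.0):
    """Turn off every band that is not clearly above its noise level."""
    spectrum = dft(frame)
    gated = [c if abs(c) > margin * floor else 0.0
             for c, floor in zip(spectrum, fingerprint)]
    return inverse_dft(gated)

N = 64
# Stand-in noise: three weak tones in the upper bands.
noise = [0.01 * sum(math.sin(2 * math.pi * b * t / N) for b in (17, 20, 25))
         for t in range(N)]
wanted = [math.sin(2 * math.pi * 4 * t / N) for t in range(N)]
noisy = [w + v for w, v in zip(wanted, noise)]
cleaned = spectral_gate(noisy, noise_fingerprint(noise))
print(max(abs(c - w) for c, w in zip(cleaned, wanted)))
```

Each band gets its own threshold from the fingerprint, so bands where the noise is strong are gated harder than bands where it is weak.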

Finally, there is one specific noise that many consumers are very interested in removing, and that is the clicks and pops found on vinyl albums. There are several methods currently available to reduce this nuisance, some more effective than others. Simple filtering can remove some clicks, but at the expense of removing desired high frequency content as well. Simple “spike finders” have problems with very small clicks or with very dynamic material. For example, the algorithm may mistake a drum strike for a click, and accidentally remove it. More sophisticated processing is required to find the small clicks while simultaneously ignoring the large purposeful transients. In any case, once a click is found, the algorithm will replace the click data with an approximation to the underlying curve.
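
A simple spike finder of the kind described above might look like the sketch below. It assumes single-sample clicks and a fixed, hand-picked threshold, and it approximates the underlying curve with a straight line between the two neighboring samples; the function name and the test signal are hypothetical.

```python
import math

def repair_clicks(samples, threshold):
    """Replace isolated spikes with the average of their two neighbors."""
    repaired = list(samples)
    for n in range(1, len(samples) - 1):
        neighbor_mean = 0.5 * (samples[n - 1] + samples[n + 1])
        # A sample far from the line joining its neighbors is treated as a click.
        if abs(samples[n] - neighbor_mean) > threshold:
            repaired[n] = neighbor_mean
    return repaired

# Hypothetical test: a slow sine wave with one click added at sample 50.
clean = [math.sin(2 * math.pi * n / 100) for n in range(200)]
damaged = list(clean)
damaged[50] += 0.8
fixed = repair_clicks(damaged, threshold=0.5)
print(max(abs(f - c) for f, c in zip(fixed, clean)))
```

The weakness noted in the text shows up directly in the threshold: set it lower and small clicks are caught but sharp musical transients are flattened too; set it higher and the small clicks slip through.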

Footnote 1: First, shift the pitch by the same factor as the desired time change, in the direction opposite to the pitch change that the new playback rate will cause. Then change the playback rate. For example, to double the time, first shift the pitch up an octave, then halve the playback rate. Halving the playback rate will double the time and drop the pitch by an octave, but as the sound was already raised by an octave, the pitch will come back to normal.

In closing this subsection, it is worthwhile to note that the idea of adaptive filtering can be applied to feedback suppression in a PA system. If a PA goes into feedback, a persistent high level sine wave will be created by the positive feedback loop (usually heard as a whistle or high-pitched squeal, although low frequency feedback is possible). This would be unusual for desired content, so an algorithm can be created to detect the situation. Once detected, a tunable notch filter can be used to reduce the gain at the offending frequency, thus halting the squeal. A simpler approach would be to just drop the gain back a little via a controlled gain amplifier stage.
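
The notch filter stage can be sketched as a second-order (biquad) digital filter. The sketch assumes the detection step has already identified the squeal frequency (2000 Hz here, chosen arbitrarily), and it uses the widely published Audio EQ Cookbook form of the notch coefficients; the sample rate, `q` value, and test tones are hypothetical.

```python
import math

def notch_filter(samples, notch_hz, sample_rate, q=30.0):
    """Biquad notch: removes a narrow band of frequencies around notch_hz."""
    w0 = 2 * math.pi * notch_hz / sample_rate
    alpha = math.sin(w0) / (2 * q)          # larger q gives a narrower notch
    b0, b1, b2 = 1.0, -2 * math.cos(w0), 1.0
    a0, a1, a2 = 1 + alpha, -2 * math.cos(w0), 1 - alpha
    x1 = x2 = y1 = y2 = 0.0                 # previous inputs and outputs
    out = []
    for x in samples:
        y = (b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2) / a0
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

def rms(samples):
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

FS = 16000
squeal = [math.sin(2 * math.pi * 2000 * n / FS) for n in range(4000)]
voice = [math.sin(2 * math.pi * 300 * n / FS) for n in range(4000)]
# Measure the last 1600 samples, after the filter has settled.
squeal_level = rms(notch_filter(squeal, 2000, FS)[-1600:])
voice_level = rms(notch_filter(voice, 2000, FS)[-1600:])
print(squeal_level, voice_level)
```

The squeal is removed almost completely while a tone well away from the notch passes at essentially full level, which is why a narrow notch is preferred over simply dropping the overall gain when sound quality matters.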

Other Processes

There are myriad other processing functions and effects available. It must always be remembered that in the artistic endeavor of music, taste rules the day. Processes are sometimes used in odd or unintended ways to achieve a certain sonic effect. Some example processes include distortion (very popular among guitar players), frequency modulation (AKA vibrato), amplitude modulation (AKA tremolo), transfer function generation, cross modulation (these last two are useful for certain kinds of synthesis), phase shifting, and so on. AM and FM are also used as a basis for some keyboard synthesizers.
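
Two of the effects named above are simple enough to sketch directly. The versions below are minimal illustrations under stated assumptions: tremolo is modeled as a sine-shaped gain change, and distortion as plain hard clipping, which is only one of many ways to distort a signal. The function names and parameter values are hypothetical.

```python
import math

def tremolo(samples, rate_hz, depth, sample_rate):
    """Amplitude modulation: the gain swings between 1 - depth and 1."""
    out = []
    for n, x in enumerate(samples):
        lfo = math.sin(2 * math.pi * rate_hz * n / sample_rate)
        gain = 1.0 - depth * 0.5 * (1.0 + lfo)
        out.append(x * gain)
    return out

def hard_clip(samples, limit):
    """Crude distortion: flatten anything beyond +/- limit."""
    return [max(-limit, min(limit, x)) for x in samples]

# A constant input makes the gain curve of the tremolo directly visible.
wobble = tremolo([1.0] * 200, rate_hz=5, depth=0.5, sample_rate=1000)
clipped = hard_clip([0.2, 0.9, -1.5], limit=0.5)
print(min(wobble), max(wobble), clipped)
```

Vibrato would modulate the pitch rather than the gain, which requires a time-varying delay and is left out of this sketch.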
