FFT-Based Dynamic Range Compression
Proceedings of the 14th Sound and Music Computing Conference, July 5-8, Espoo, Finland

Leo McCormack and Vesa Välimäki
Acoustics Lab, Dept. of Signal Processing and Acoustics
Aalto University, FI-02150, Espoo, Finland
leo.mccormack, [email protected]

Copyright: © 2017 Leo McCormack and Vesa Välimäki et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 Unported License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

ABSTRACT

Many of the dynamic range compressor (DRC) designs that are deployed in the marketplace today are constrained to operate in the time-domain; therefore, they offer only temporally dependent control of the amplitude envelope of a signal. Designs that offer an element of frequency dependency are often restricted to performing specific tasks intended by the developer. Therefore, in order to realise a more flexible DRC implementation, this paper proposes a generalised time-frequency domain design that accommodates both temporally-dependent and frequency-dependent dynamic range control, for which an FFT-based implementation is also presented. Examples given in this paper reveal how the design can be tailored to perform a variety of tasks using simple parameter manipulation, such as frequency-dependent ducking for automatic-mixing purposes and high-resolution multi-band compression.

1. INTRODUCTION

A dynamic range compressor (DRC) is an indispensable tool found in many an audio engineer's toolbox. DRCs are utilised extensively in the fields of music production, automatic-mixing, mastering, broadcasting and hearing aids. From the music production perspective, the main reason for their popularity lies in their ability to shape the amplitude envelope of transient audio signals, so that they may combine more cohesively with other signals present in the mix, or to rectify issues that an existing mix may exhibit. DRCs are also utilised to intentionally reduce the dynamic range of audio signals, as humans tend to perceive a louder mix as being superior [1, 2], which is especially important given stringent broadcasting regulations and the competition between broadcasters.

While the basic design of time-domain DRCs has evolved slowly over the years [3–8], more progress has been made with regard to diverse and creative ways of applying such devices in the field of music production. For example, most commercial implementations of modern DRCs offer the ability to influence the behaviour of the DRC using a different audio signal, a technique commonly referred to as side-chain compression [9]. To give a practical example, side-chain compression can be used to reduce the amplitude of a bass guitar signal to coincide with a kick drum transient. This temporally dependent "ducking" in the bass guitar signal may then allow the two instruments to better complement one another. In the production of dance music, this technique is commonly taken to an extreme, with the kick drum transient conspicuously carving temporally dependent holes in the amplitude of other signals, which results in a pumping effect that pulses in time with the music [10].

Serial and parallel compression techniques have also become widely used, as they allow for even greater control of the amplitude envelope of signals by utilising multiple DRCs with different envelope detector settings [11, 12]. Recent research has also considered the digital modelling of traditional analogue DRC devices [8], as it is often desirable to replicate traditional audio processing techniques in order to capture their unique distortion characteristics [13].

However, the majority of today's DRC designs are temporally-dependent and frequency-independent (referring to common designs described in [4, 14, 15]), which makes them unsuitable for certain tasks. For example, it might be desirable for the DRC to derive its gain parameter only from a specific frequency range of the input audio, or the user may want to use different DRC parameters for different frequency bands and have them operate independently. Therefore, to accommodate these scenarios, an element of frequency dependency has been adopted by two main sub-categories of DRC design, termed here as the filtered side-chain compressor (FSCC) [9, 16] and the multi-band compressor (MBC) [6, 17].

In the case of the FSCC, the amount of gain reduction applied to an input signal is influenced by a filtered version of the input signal. One common implementation is the De-esser design, which is typically used to reduce sibilance present in vocal recordings [9]. In this instance, the gain factor is derived from the frequency range where the sibilance is most prevalent, through the use of a band-pass filter in the side-chain processing. There are also DRC designs that extend this control further and allow much more customisable filtering of the side-chain signal [18], which are suitable for applications where a single band-pass filter is insufficient. However, such designs do not feature in any formal publications. One potential drawback of the FSCC approach is that the calculated gain factor is applied to the whole frequency range of the input signal, which may not be desirable depending on the use case. Therefore, it might be beneficial if FSCC designs were able to apply the calculated gain factor to a user-specified frequency range.

For the MBC approach, the input signal is divided into separate sub-bands, and independent user controls are offered for each of them. This allows for greater control of broadband signals, which makes MBCs especially useful for mastering applications; however, typically only two or three independent sub-bands are controllable by the user. MBCs with a higher number of sub-bands are used more commonly in hearing-aid designs [19–21]. However, due to their specific intended application and their general lack of envelope detection or side-chain compression support, they are usually unsuitable for music production, mastering or automatic-mixing applications. Designs that are orientated towards these musical applications, such as the Soniformer audio plug-in [22], do not feature in any formal publications.

With regard to today's automatic-mixing approaches [23–25], many of them operate in the frequency or time-frequency domains and are applied off-line. The latest implementations in particular rely heavily on machine learning principles; they have therefore removed the audio producer entirely from the mixing process and have generally yielded inferior results when compared to their professionally mastered counterparts [25]. Improvements could be expected by utilising higher frequency-resolution MBC designs that apply frequency-dependent side-chain ducking. This would allow signals to attenuate the amplitude of other signals in a frequency-dependent manner, thus creating "space" for them in the mix, while also keeping the human element in the mixing process.

Therefore, the purpose of this paper is to present a generalised time-frequency domain DRC design, which can be utilised as a more customisable FSCC, an expanded MBC, and a potential semi-automatic-mixing method. The paper also provides details of an implementation of the proposed design, which is based on the Fast Fourier Transform (FFT). As demonstrated in the examples presented, the implementation provides promising results across a variety of audio material, using only simple parameter manipulation.

This paper is organised as follows: Section 2 provides a short introduction on how DRCs generally operate and also describes the typical parameters that are offered to the user; Section 3 gives an example of a time-domain DRC design, which is then expanded to the proposed time-frequency domain design in Section 4; Section 5 then details how the proposed design was implemented, in order to provide the examples that are shown in Section 6; and Section 7 concludes the paper.

2. BACKGROUND

Typically, DRCs operate by duplicating the input signal to obtain two identical signals, henceforth referred to as the input signal and the side-chain signal.

Figure 1: A depiction of the threshold T, ratio R and knee width W parameters. The blue line represents a soft knee, while the black dotted line represents a hard knee.

[...] computer and an envelope detector [14]. These two components allow many user parameters to be offered, which typically include:

• threshold, above which attenuation of the input signal occurs;

• ratio, referring to the input/output ratio in decibels, which influences the gain reduction based on the extent to which the signal has surpassed the threshold;

• knee width, which permits a less abrupt attenuation around the threshold by allowing the gain factor to be determined by a gradual change in ratio. This is often termed a soft knee, whereas the abrupt application of attenuation about the threshold is termed a hard knee (see Fig. 1);

• attack time, which determines how quickly the envelope detector reaches the target gain reduction, as dictated by the threshold and ratio parameters;

• release time, which refers to how quickly the envelope detector returns to unity gain when the side-chain signal ceases to exceed the threshold.

Additionally, the following parameters can also be offered to the user:

• look-ahead, the amount of time by which the input signal is delayed, allowing the side-chain processing to react more reliably to acute transients;

• make-up gain, a gain factor applied after the DRC, as it is often desirable to increase the amplitude of the output in order to compensate for the loss in sound energy;

• side-chain filter parameters, in order to dictate
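To make the roles of the threshold, ratio, knee width, attack time and release time described in this section more concrete, the following Python sketch may help. It is not taken from this paper: it assumes the widely used quadratic soft-knee static curve and a one-pole smoother with separate attack and release coefficients, and the function names (static_gain_db, smooth_gain_db) and the exponential time-constant convention are illustrative assumptions only.

```python
import math


def static_gain_db(level_db, threshold_db, ratio, knee_db):
    """Gain in dB (always <= 0) given by an assumed static compression curve.

    level_db is the side-chain level, threshold_db the threshold T,
    ratio the input/output ratio R, and knee_db the knee width W.
    """
    over = level_db - threshold_db
    if knee_db > 0 and 2 * abs(over) <= knee_db:
        # Soft knee: quadratic blend from a slope of 1 to a slope of 1/R.
        out_db = level_db + (1 / ratio - 1) * (over + knee_db / 2) ** 2 / (2 * knee_db)
    elif over > 0:
        # Above the knee: the excess over the threshold is divided by R.
        out_db = threshold_db + over / ratio
    else:
        # Below the knee: the signal is left untouched.
        out_db = level_db
    return out_db - level_db


def smooth_gain_db(target_db, fs, attack_s, release_s):
    """Smooth a sequence of target gains (dB) with attack/release one-poles.

    Assumes coefficients of the form exp(-1 / (time * fs)); other
    conventions for defining attack and release times exist.
    """
    a_att = math.exp(-1.0 / (attack_s * fs))
    a_rel = math.exp(-1.0 / (release_s * fs))
    state = 0.0  # start at unity gain (0 dB)
    out = []
    for g in target_db:
        # More attenuation requested: use attack; otherwise use release.
        coeff = a_att if g < state else a_rel
        state = coeff * state + (1 - coeff) * g
        out.append(state)
    return out
```

Under these assumptions, a threshold of -20 dB with a ratio of 4 and a hard knee (knee width 0) maps an input level of -8 dB to a gain of -9 dB, since the 12 dB excess is reduced to 3 dB. With a knee width of 10 dB, the same settings already apply a small attenuation at the threshold itself, which is the gradual onset that Fig. 1 depicts.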