Excerpts from "A Closer Look at DTS"
DTS Article http://www.msbtech.com/dtsarticle.html
HI FI NEWS & RECORD REVIEW, February 1998
By: Malcolm Hawksford

Momentum is gaining for a new high-definition digital audio standard, and technical debate continues about what features this new medium should embody. In the pages of HFN/RR we were amongst the first to consider this topic back in 1995, when several key issues were identified. Since then, we have seen the launch of DVD-video (at least in the USA and Japan), which has already demonstrated superb performance capabilities, signaling the beginning of the end of Laserdisc video. The critics, of course, have already pointed to some minor defects, but then I can usually point to substantial defects in Laserdisc. I predicted that DVD, with MPEG-2 video encoding at a bit-rate elevated to approximately 8 Mbit/s and with a DTS soundtrack at around 1.4 Mbit/s, will be a killer product with the stamp of quality needed for home theater – that is, until higher-density discs allow recording of HDTV pictures. …

So why should the audio world wish to embrace a new audio standard? The answer is simply that it offers new opportunities, where the improvement in sound quality is only part of the story. For many, the main attraction is to usher in a paradigm shift from two-channel to multi-channel recording and sound reproduction; the potential improvements this brings are profound, enabling greater realism and sense of envelopment. For natural recordings, there is the space and atmosphere of the recording venue, while for the more creative or synthetic recordings there is a world of 3-dimensional sound to explore. The audiophile may want to know how a multi-channel system can be planned for the future, but will also want to know what short-term expedient can be taken to gain an advantage today.
One answer is to explore the technology created by DTS (Digital Theater Systems) with its Coherent Acoustics system, which has capabilities far removed from the dinosaurs which (as the sound format for Jurassic Park) it was designed to create. …Normally, the bit-rate of a single channel of digital audio is 16 x 44.1 = 705.6 kilobits/second, so for 5 channels this would become 3.528 Mbit/s, which is about 2.5 times the rate normally encoded on CD. However, because music normally has front dominance and rarely fully loads either a channel or all channels simultaneously, there is sufficient information to encode the five discrete channels without much compromise compared to two-channel encoding. The critical factor is that the DTS data rate used for Red Book CD and Laserdisc is around 3.6 times that of either Dolby AC-3 or MPEG, and it is therefore better matched to high-quality music recording.

DTS decoders and software exist now, so it is possible to construct a multi-channel audio system and to experience existing multi-channel music recordings. DTS is therefore a catalyst which bridges the gap between existing products and future technology based upon a possible DVD-audio format. Future systems will be able to stream data from a range of sources. DTS is hierarchical and scalable, and in some respects can be considered a complete solution to both multi-channel and high-resolution audio. The paper presented at the 100th Convention of the AES makes interesting reading. It may not be widely known, but DTS claims the following features:

1 to 8 channels of multiplexed audio
sampling rates from 8 kHz to 192 kHz
16 to 24-bit audio equivalent PCM word lengths
compression ratios from 1:1 to 40:1
encoder output data range 32 kbit/s to 4.096 Mbit/s
lossless coding mode (variable data rate)
linear PCM decoding mode
down-mixing from n coded channels to n-1, n-2, … etc. output channels
down-mixing from 5.1 discrete channels to stereo LT, RT
embedded dynamic range control
re-equalization of all channels independently
sample accurate synchronization of audio to external video signals
embedded time stamp and user data
future-proof decoder

Important characteristics of relevance to future applications include the lossless mode of operation, suggesting DTS can, given an appropriate output bit rate, act as an efficient lossless compressor; and the support of 96 kHz/192 kHz sampling rates together with 24-bit coding should endear DTS to the purists. Also, the 4.096 Mbit/s and variable bit-rate options are of direct relevance to DVD-audio, especially with 8-channel recording. At present, available software is released at a 44.1 kHz sampling rate; however, in the near future we anticipate that DTS will make a stand as a high-resolution (96 kHz at 24-bit) multi-channel carrier, and given its present compatibility with Red Book CD, Laserdisc and DVD-video, there are going to be some interesting politics ahead.

Before describing how DTS functions technically, let me give a brief personal impression based upon auditioning DTS compact discs covering a variety of music styles; I believe Ken Kessler also shares similar convictions, and we are not alone… On most recordings, and especially the recent Eagles recording Hell Freezes Over, the results are sensational and in my view match, and occasionally transcend by a significant margin, the performance of many two-channel Red Book CDs. Listening in anticipation to the last track of the Eagles disc, one really starts to hear what multi-channel can achieve, while the overall atmosphere throughout the disc is hypnotic. Intriguingly, DTS encoded discs appear to have less ‘electronic’ character than many standard recordings, and project more integrated and naturally etched acoustic objects – more like music in fact!
It is curious to speculate why this may be, so I list three reasons why DTS (at 44.1 kHz) may offer a superior performance to Red Book 2-channel stereo:

A. Although there is lossy compression, the actual channel coding performance can have a resolution that exceeds 16-bit (and is typically 20-bit); this is beneficial especially at lower signal levels and is bounded by conservative psycho-acoustic masks.
B. Multi-channel encoding can place sound sources more accurately, giving better perspective to the soundstage. When spatially separated signals become warped within a two-channel mix, the sound can appear coloured and distorted, possibly because of spectral-spatial distortion caused by inappropriate head-related transfer functions (HRTFs) as the image is incorrectly localized.
C. The output code from a CD transport (when played through a conventional DAC, without DTS decoding) assumes the form of a continuous noise signal, as the digital signal statistics are virtually independent of the coded audio signal. Consequently, the jitter correlation between audio data and digital code is significantly weakened. I suggest this masks or scrambles the jitter inherent within the transport, and that interface distortion is similarly reduced as pulse-edge jitter in the bi-phase code of the S/PDIF signal, which carries the DTS code, is also randomized.

These phenomena are relevant to any new high-resolution format. In order to minimize the effect of system jitter, measures should be taken to minimize the correlation between audio data and digital data, in addition to measures to reduce the absolute level of jitter. I would argue that the correlation is more significant than the absolute jitter level, so it may be that DTS has inadvertently stolen a lead: an interesting hypothesis.

So how does DTS work?
Well, that is quite a complicated question, as there are many subtleties and psycho-acoustically motivated tricks embedded in the algorithm, so if you want a fuller story you must obtain the AES preprint. However, the aim of the system is to place complexity in the coder rather than the decoder; that way, consumer hardware remains a fixed architecture while the encoding engine can be refined as more sophisticated strategies emerge. The key computational processes are sub-band filtering, linear-predictive coding (or LPC; that is, adaptive differential coding), vector quantization, dynamic bit allocation and psycho-acoustic analysis. There are also a few other subtleties that include scale factors, joint frequency coding, variable length coding and encoder bit-rate iteration.

The algorithm accommodates 32 sub-bands for frequencies ranging from 0 to 24 kHz, the latter only being applicable when 96 kHz encoding is in operation. The idea of band-splitting is well established in coding circles, as this way only ‘significant’ spectral energy need be encoded accurately; the sub-bands also map efficiently to psycho-acoustic masking paradigms. The sub-band analysis is performed on discrete frames of blocks of input samples whose lengths can be selected dynamically.

To increase coding efficiency, the output of each sub-band can also be coded using LPC. This is a differential form of coding where a filter is used to try to predict the next sample value, thus reducing the error signal to be coded. The filter is trained to match the data, and the coefficients of the filter are updated when required using a process called ‘vector quantization’. Vector quantization recognizes that certain groups of filter coefficients are more probable, so these are grouped to form ‘vectors’ in a carefully generated code table, that is, a table of sets of permissible coefficients; consequently only pre-determined groups of coefficients are coded, hence the name quantization.
Thus, by transmitting only a normally small residual error signal and occasionally updating the predictive filter coefficients using vector quantization, a high coding efficiency is achieved.
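
To make the idea concrete, the following is a minimal Python sketch of predictive coding with a coefficient code table. It is an illustration only and not the DTS algorithm: the four-entry `CODEBOOK`, the second-order predictor, the block of samples and all function names are assumptions invented for this example. The encoder here simply tries every table entry and keeps the one giving the smallest residual energy, whereas a practical coder would derive the filter from the data and then look up the nearest table entry; the residual is also left unquantized, so the sketch is lossless.

```python
from typing import List, Sequence, Tuple

# Hypothetical code table: each entry is a permissible pair of predictor
# coefficients (a1, a2) used to predict x[n] from x[n-1] and x[n-2].
CODEBOOK: List[Tuple[float, float]] = [
    (0.0, 0.0),    # no prediction (plain PCM)
    (1.0, 0.0),    # repeat the previous sample
    (2.0, -1.0),   # straight-line extrapolation
    (1.5, -0.75),  # damped extrapolation
]


def predict(coeffs: Tuple[float, float], prev1: int, prev2: int) -> int:
    """Integer prediction of the next sample from the two previous ones."""
    return round(coeffs[0] * prev1 + coeffs[1] * prev2)


def residual(block: Sequence[int], coeffs: Tuple[float, float]) -> List[int]:
    """Error signal left after prediction (history starts at zero)."""
    prev1 = prev2 = 0
    out = []
    for x in block:
        out.append(x - predict(coeffs, prev1, prev2))
        prev2, prev1 = prev1, x
    return out


def encode_block(block: Sequence[int]) -> Tuple[int, List[int]]:
    """Pick the table entry with the lowest residual energy.

    Only the table index and the residual need to be transmitted.
    """
    def energy(i: int) -> int:
        return sum(e * e for e in residual(block, CODEBOOK[i]))

    best = min(range(len(CODEBOOK)), key=energy)
    return best, residual(block, CODEBOOK[best])


def decode_block(index: int, res: Sequence[int]) -> List[int]:
    """Rebuild the samples by running the same predictor as the encoder."""
    coeffs = CODEBOOK[index]
    prev1 = prev2 = 0
    out = []
    for e in res:
        x = e + predict(coeffs, prev1, prev2)
        out.append(x)
        prev2, prev1 = prev1, x
    return out


if __name__ == "__main__":
    samples = [10, 20, 30, 40, 50, 60]   # a simple ramp
    index, res = encode_block(samples)
    print(index, res)                    # prints: 2 [10, 0, 0, 0, 0, 0]
    assert decode_block(index, res) == samples
```

For the ramp used above, the straight-line predictor leaves a residual that is zero after the first sample, so only a table index and one significant value would need to be sent; this is the sense in which a small residual plus an occasional coefficient update yields high coding efficiency.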