<<

DIGITAL AUDIO AND SERVICES FOR ATV -­ THE WORK OF THE ATSC SPECIALIST GROUP Graham S. Stubbs Eidak Corporation

ABSTRACf that it is important that appropriate em­ phasis should be placed on defining the This paper describes the advice and accompanying sound channels, the features suggestions put forth by the Technology of the ancillary data and control services, Group on Distribution (T3) of the Advanced and the way in which they may be included System Committee (ATSC) re­ in the ATV transmission format. garding digital services for Advanced Televi­ sion (A TV). These recommendations were The charter of the ATSC Specialist based on the background work of the Spe­ Group on Digital Services (T3/S3) has been cialist Group (T3/S3) on Digital Services to conduct industry surveys and technical which conducted technical studies, and studies and to develop technical information surveys, and developed the suggestions and and recommendations on the following sub­ recommendations. jects. The Specialist Group commenced its studies in December, 1990. Rapid advances in multichannel com­ posite digital audio coding technology now A Sound & Ancilla:ry Data Services make it possible to plan to provide the con­ sumer (e.g. cable subscriber) with an expand­ Identification of the range of ATV ed audio experience to match wide screen sound channel requirements and high definition TV pictures. This paper how they might be satisfied with re­ summarizes the suggestions adopted by cent rapid advances in the state-of­ ATSC's T3 Group regarding audio and the art digital audio coding. Identifi­ ancillary data services, including the advice cation of the range of desirable that a standard service for A1V should in­ ancillary data services--including clude capacity for a minimum of five audio those already in use and in some channels with composite encoding. cases mandated--and estimates of the data capacity required for each. Particular emphasis was placed on the need for flexible allocation of data to audio B. Conditional Access and ancillary data services on an as-needed basis. Establishment of system attributes and features required to allow condi­ INTRODUCTION tional access to the A TV transmis­ sion in the various media in particu­ The Advanced Television Systems lar Cable Television, with emphasis Committee and other organizations, includ­ on the structure and capacity of the ing the FCC Advisory Committee on Ad­ required control data channel. vanced Television Service, EIA, and NCfA, have recognized that much of the c. Multiplex Structure effort spent on ATV development and stan­ dardization has been put on the develop­ Determination of requirements for ment of vision signal encoding schemes and flexibility in the data multiplex struc

1992 NCTA Technical Papers- 31 ture to permit re-allocation of the "images" which can truly complement the available data capacity for the vari­ high definition pictures in a restricted 6 ous services. MHZ wide HDTV digital channel. Audio techniques developed for motion picture The goal of this work has been to theaters may become applicable to the influence the characteristics of the digital consumers' living room. services (other than video) which are to be provided by the A TV emission system As a discussion vehicle, the Specialist selected for use for terrestrial broadcasting Group prepared a "strawman" proposal for in the United States. (However, it is not TV Audio and Data Services for Simulcast ATSC's intention to affect the ATV testing ATV systems. The "strawman" proposal program already in progress as part of the suggested desirable program audio attri­ FCC selection process.) Specifically, it is butes and listed possible ancillary digital suggested that the final ATV system se­ services, including some of those already in lected for terrestrial broadcasting, cable, use with NTSC such as . It and other media follow the guidance de­ was suggested that advantage should be tailed in reference 1 (see Appendix for taken of the opportunity to design a new Executive Summary of the ATSC docu­ generation of ancillary data services into the ment). system from the beginning. Starting in April 1992, the "strawman" proposal was Recognizing that different media circulated for comments and suggestions to (broadcast, cable, DBS, etc.) will use differ­ a wide range of parties with interest in ent modulation methods, possibly carry A TV development and introduction, includ­ different ancillary signals and have other ing the proponents of systems to be tested requirements unique to each medium, in the selection process. By an iterative T3 /S3 has focussed its technical studies to process, the document was refined to reflect facilitate the adoption of voluntary stan­ as much industry consensus as possible, with dards for all media based on the same particular emphasis on flexible allocation of baseband signal representation for video, data capacity to audio and ancillary data audio, data and control to maximize inter­ services. operability and to provide a simple com­ mon interface to the consumer A TV re­ The following summarizes some of ceiver. Work by T3 /S3 on Conditional the principal suggestions as to the charac­ Access is reported in References 2 and 5. teristics of the audio and digital services which should be provided by an ATV emis­ WORK ON AUDIO AND ANCILLARY sion system adopted for use in the United DATA SERVICES States, both for terrestrial broadcast and by alternate media. Certain services are iden­ Advances in audio bit rate coding tified which the final distributor of pro­ technology have occurred extremely rapidly gramming may provide in a standardized in the past two years, now making it poss­ manner. Standardization of services is ible to encode multiple audio channels necessary so that receiver manufacturers using data rates previously allocated to a may produce receivers that can receive single audio channel. It is now technically these services. A number of capabilities are possible to create multiple channel audio

32 -1992 NCTA Technical Papers identified which every receiver should be added in a compatible m&-mer by means of required to provide. the flexible allocation of data capacity. The key is to provide a way to identify new MAJOR PRINCIPLES types of allocated data services, so that new or upgraded receivers may make use of the Flexible Allocation new data types, while older receivers simply ignore them. The new data types could The number and type of audio and offer new kinds of audio services (perhaps data services which an A TV system should with more audio channels) and new kinds of deliver will vary significantly depending on information services. Some of the addition­ what services are available to accompany al data types could be intended for private the picture, and the needs of the particular or commercial reception, with the data service area for the distribution medium, capacity being sold by the final program e.g. broadcast, cable. etc. In order to pro­ distributor to provide a supplemental rev­ vide a maximum level of utility, many audio enue source. channels and data services might be need­ ed. A fixed provision for many audio Multi-Channel Audio channels and data services would require, however, that a significant portion of the The A TV system should be capable available transmission capacity be reserved of delivering multi-channel sound appropri­ for these services. This would unnecessar­ ate to the wider, higher definition picture. ily constrain the video quality, which would Consistent with demonstrated psychoacous­ suffer if its available data rate were re­ tic principles, the preferred channel assign­ stricted. Since much of the programming ment is: Left, Center, Right, Left Surround, may not require many audio channels or and Right Surround. A significant advan­ data services, a fixed allocation of transmis­ tage of three front channels (rather than sion capacity to these services would be two as in present TV stereo) is stabilization inefficient. Therefore, the A TV transmis­ of the audio dialogue image in the vicinity sion system should be capable of a flexible of the TV screen. allocation of data to audio and data servic­ [Note: This is consistent with the CCIR es on an as-needed basis, with the remain­ Task Group 10/1 Draft Recommendation ing data available to the video system. At on Multi-Channel Sound (Reference 3), and the point of emission or final distribution, the SMPTE Film Sound Sub-Committee a choice may be made based on the avail­ Report](Reference 4). An optional low able services and the needs of the intended bandwidth channel for low-frequency en­ audience as to which services are to be hancement (subwoofer channel) may also delivered via the final distribution medium. be provided. Monophonic and two-channel stereophonic transmission modes of opera­ Extensibility tion should also be provided, and will offer the most efficient method of delivery for It is not possible to enVIsion and mono or two-channel stereo programming. provide for all potential audio and data services which might be useful components Even with the use of advanced low of the A TV service in all media. It is bit-rate audio coding technologies, provision feasible to allow new digital services to be of five high quality independently coded

1992 NCTA Technical Papers- 33 discrete channels would require significant When surround information is rele­ data capacity. Current estimates of re­ vant and available, the 3/2 composite cod­ quired bit rates per individual audio chan­ ing mode is preferred. Three front chan­ nel are: nels and two surround channels are provid­ ed. High Quality 128 kbits/sec When surround information is not Medium Quality 96 kbits/sec ·available, the 3/0 mode is sufficient. Three Low Quality 64 kbits/sec front channels are provided; the 3/2 mode may also be used. However, recent developments in composite multi-channel audio coding Two-channel stereophonic audio may technology show that five-channel audio be most efficiently transmitted with the 2/0 may be delivered with a data rate only mode. The 3/2 or the 3/0 modes may also slightly larger than that required by two be used. high quality independently coded channels. It is recommended that a five-channel It is preferable that audio programs composite coding mode structure be pre­ not be simultaneously provided by a mono­ defined as part of the A TV system. phonic or two-channel stereophonic service intended for low cost receivers, in addition Suggested are three composite coding to a separate multi-channel service intended modes, which offer different numbers of for high end receivers. That would be high quality audio channels. The numerical wasteful in data capacity, and would inhibit designations below (e.g. 3/2) indicate the the use and growth of multi-channel audio. number of front channels / rear channels. It would be necessary, though, that all Some audio coding technologies incorpo­ receivers be capable of decoding a multi­ rate a low bandwidth channel ( < 200 Hz) channel service into the desired number of intended to deliver low-frequency enhance­ reproduction channels. Lower cost receivers ment information (subwoofer channel), the would not need to completely decode all use of which would be optional for the pro­ five channels, but could decode the five gram distributor, receiver, and viewer. channel service into a conventional mono­ phonic or two-channel stereophonic pro­ Five channels 3/2 300 - 400 kbits/sec gram with attendant cost savings. Three channels 3/0 256 kbits/sec Two channels 2/0 192 kbits/sec The A TV receiver would be required to produce sound for any of the pre-defined With composite coding, all of the coding modes that the broadcaster chooses audio channels which have been coded into to use. This doe not imply that every re­ the composite data stream must be repro­ ceiver should be capable of fully decoding duced together (similar to the way the a 3/2 service; only that it make sound from R,G,B colors must be reproduced together it. It would be acceptable for a receiver to to form the viewed picture), although not decode all modes into monophonic or necessarily out of independent loudspeak­ two-channel stereophonic audio. ers. The channels may be mixed together to reduce the number of loudspeakers required for reproduction.

34 -1992 NCTA Technical Papers Uniform Loudness an audio service than for the video service, as reproduced audio errors may be more The ATV audio system should pro­ annoying than reproduced video /errors. In vide the means to control loudness in a addition, some types of delivery (terrestrial, uniform manner among various programs DBS, etc.) will be used by some viewers at and delivery channels. The viewer should the threshold of the service area. There­ perceive the same subjective dialogue fore, the ATV audio system should include loudness when a program ends and a new effective concealment methods to minimize program begins, or when channels are audible disturbances caused by changed. The A TV audio data should uncorrectable errors. inform the receiver of a dialogue reference level so that the receiver can reproduce the Audio Services to the Visually and Hearing normal spoken dialogue at an acoustic level Impaired chosen by the viewer. The ATV audio system should allow Dynamic Range Control programmers to deliver special audio serv­ ices to the visually impaired (VI) and hear­ There is a conflict between the ing impaired (HI) using the flexibly alloca­ needs of many viewers for program audio ble channels. The VI service would typical­ to have a narrow dynamic range, and the ly contain a narration describing the picture desires of some viewers to reproduce the content, and would be reproduced along audio in the full dynamic range intended by with the main audio program in the receiv­ the program producer. The ATV audio er. The HI service would typically contain coding system should incorporate an inte­ only dialogue, and would be processed for grated dynamic range compression method improved intelligibility. which delivers data to the receiver repre­ senting the compression characteristic em­ SERVICE IDENTIFICATION DATA ployed. (SVIDl

Receivers may include circuitry Service Identification Data (SVID) which gives the viewer the ability to control should be incorporated into the ATV data the reproduced dynamic range. Using the stream. The SVID supplies descriptors for data representing the compression charac­ all digital services delivered by the ATV teristic employed, the receiver may partially signal. The descriptors identify the digital or completely reverse the dynamic range services and indicate their locations within compression intentionally introduced by the the overall data multiplex. Several methods program provider. may be used to deliver this information (such as packets with headers) providing they meet the requirements for this func­ tion. This issue has not been studied in Error Correction and Concealment for depth and no recommendation is made as Audio Services to the technique to be employed.

It is recognized that error correction The SVID must be very reliable, and concealment is a more critical issue for since errors could cause total loss of ATV

1992 NCTA Technical Papers- 35 service. The SVID should be recognizable Program Guides by the receiver as soon as possible after a channel change. The information carried The program guide is an optional by the SVID should be repeated frequently, service. The program guide is intended to so that a receiver can quickly recognize the inform the viewer about the programming available services after a channel change, available on the particular ATV channel and so that the redundancy can be used to being viewed. The program guide service improve reliability of the data. The format should be capable of delivering text ac­ that is adopted for the SVID must be very companied by graphics. For the benefit of flexible and allow new digital services to be the viewer who is quickly scanning channels, defined and delivered in a manner that is lines of text identifying the current and compatible with all ATV receivers. upcoming programs should be provided and the program guide data should the found ANCILLARY DATA SERVICES quickly in order to minimize the delay in displaying the current program information. Several types of data services should be pre-defined. Others may be added If an additional frequency agile tuner using the flexible allocation capability. is available, either in the ATV receiver or Some types of data services, such as condi­ at a cable headend, all available ATV chan­ tional access, may only be used by the nels may be scanned and the program guide alternative media. information for all available channels com­ bined to form a multichannel program Captioning guide.

The A TV emission system should Conditional Access provide a captioning system capable of delivering multiple versions of captions. Some applications of the ATV sys­ Different versions of captions may be used tem will require data to be allocated to to provide service in multiple languages, for conditional access control. The required the hearing impaired as well as the data capacity will depend on the size of the non-impaired. A minimum requirement on audience being served and the type of the captioning system should be that it service being offered (i.e. pay-per-view or allows three versions of captions: primary monthly subscription). There should be no language, secondary language, and primary fixed allocation of data to conditional ac­ language for slow readers. A data rate cess. Data capacity should be allocable to sufficient to support this level of service conditional access control at the option of should be allocable to the captioning ser­ the particular service provider. The T3/S3 vice when captioning information is avai­ Specialist Group made no recommendation lable. It should be possible to allocate as to the minimum data rate required to additional data to the captioning service to support a conditional access system. support delivery of additional versions of captions (or subtitles). All ATV receivers (NOTE: Meetings of the ATSC Specialist should be required to decode and display Group on Digital Services T3 /S3 were held captions. concurrently with meetings of the ATSC Specialist Group on Interoperability and

36 -1992 NCTA Technical Papers Consumer Product Interface T3 /S2. The butions to the generation and authoring of two Specialist Groups worked closely on the ATSC document. Appreciation is also Conditional Access issues. This work on expressed to the many individuals who desirable attributes and features of en­ participated in the Specialist Group's work, cryption systems is reported in a companion and who responded to the numerous re­ paper, Reference 5.) quests for advice and comment.

CONCLUSIONS References

The "strawman" approach to solicit­ 1. "Digital Audio and Ancillary Data ing industry opinion and advice regarding Services for an Advanced Television digital audio and ancillary data services for Service" February 3, 1992 (Doc. ATV resulted in responses from a broad T3/186) Advanced Television Sys­ range of parties with A TV interests. The tem Committee, 1776 K Street, NW results of the T3 /S3 studies indicate that an #300, Washington, DC 20006. acceptable set of audio and ancillary serv­ ices can only be provided by a flexible, ex­ 2. "ATV Encryption System Charac­ tensible system which allocates data ca­ teristics" May 16, 1991 (Doc. pacity to these services only on an T3/180) Advanced Television Sys­ as-needed basis. All remaining data capac­ tem Committee, 1776 K Street, NW ity should be used by the video system for #300, Washington, DC 20006. improved picture quality. 3. CCIR Task Group 10/1 Draft Rec­ Advantage should be taken of the ommendation on Multi-Channel opportunity to design into a new ATV Sound, November 7, 1992. standard the ability to distribute multi­ channel audio with qualities commensurate 4. SMPTE Film Sound Sub-Committee with wide screen high definition pictures. Report. At the same time, a flexible approach to data allocation will provide new ancillary 5. "A Progress Report on the Work of data services and should satisfy the condi­ the ATSC Specialist Group on tional access needs of Cable and other Interoperability and Consumer media. Product Interface". Bernard Lechner, NCfA Convention, May, Acknowledgements 1992.

The author wishes to acknowledge that portions of this paper are extracted from "Recommendation on Digital Audio and Ancillary Data Services for an Advanced Television Service", distributed by ATSC in February 1992. The author is grateful to Craig Todd (of Dolby Labor­ atories) and Tom Keller (Consultant to Cable Labs) for their considerable contri-

1992 NCTA Technical Papers- 37 Appendix The A TV service will offer widescreen pictures, and this feature is Executive Summacy of "Digital Audio and expected to increase consumer interest and Ancillacy Data Services for an Advanced enjoyment of this new format. The audio Television Network". ATSC. Februacy 3. corollary to the wider aspect ratio is 1992. (T3/186) multi-channel audio. Recognizing the signif­ icant limitations of two-channel stereophony The Specialist Group on Digital for sound accompanied by pictures, it is re­ Services (T3 /S3) of the Technology Group commended that the ATV service be capa­ on Distribution (T3) of the Advanced Tele­ ble of delivering five channel audio (left, vision Systems Committee (ATSC) has center, right, left surround, and right sur­ been working for two years to identify the round). This is generally consistent with digital audio and data services that should recent trends in the application of digital accompany the Advanced Television (ATV) audio in the motion picture industry, and picture. T3 /S3 has conducted technical the draft recommendation from CCIR Task studies and industry surveys, and has cir­ Group 10/1. Recent advances in culated a "strawman" proposal to stimulate multi-channel audio coding technology have industry discussion and feedback. Based on reduced the data rate required for five this work, T3 offers the guidance contained channel audio nearly to that required for in this document. On February 3, 1992 the two independent audio channels. All A TV ATSC Executive Committee approved receivers would need to decode the provid­ release of this material. It is hoped that ed service (which could vary from one to the ATV system selected for use in the five channels) into the number of loud­ United States for terrestrial broadcasting, speaker channels to be used (e.g. the five as well as ATV systems for the alternative channel audio service may be decoded into media, will follow this guidance. mono for the low-cost mono receiver).

A major finding is that it is not The ATV audio system should pro­ desirable to select a fixed set (number and vide a solution to the problem of loudness type) of digital audio and data services for uniformity among programs, channels, and inclusion into the 6 MHz ATV channel. delivery media. The average perceived This is because the data rate required for loudness of dialogue should be uniform. all potential services would negatively The coded audio data should indicate the impact the picture data rate and affect the average dialogue loudness within the dy­ picture quality. It is recommended that the namic range of the coding system. Differ­ A TV system allow data to be allocated to ent programs may have varying amounts of digital audio and data services only on an headroom above this level which is avail­ as-needed basis. A flexible system of data able for dramatic effect. allocation will require the use of only the minimum data capacity necessary for digital The A TV audio system should allow audio and data services at any time. Flexi­ audio service to be provided with a wide ble allocation also allows the addition of dynamic range. Since experience indicates new types of digital services in the future, much of the audience will prefer a restrict­ with older receivers ignoring the new data ed dynamic range, the audio coding system types. should incorporate an integrated dynamic

38 -1992 NCTA Technical Papers range compression system. Information Flexible allocation of data capacity about the compression introduced may be to audio allows audio service to be option­ incorporated into the audio data stream so ally provided to specialized audiences, that the receiver may optionally reverse the including those with other languages. Audio compression to restore the original dynamic services should be tagged to indicate lan­ range of the program. guage and type, so that the receiver may assume the burden of choosing the correct The ATV system should allow ser­ language audio service for the viewer (as­ vice to be delivered to the visually and suming the receiver has knowledge of the hearing impaired. Tne visually impaired viewers preferred language), and so that the may be served by an optionally allocated viewer may determine the types of audio narrative audio channel. The hearing im­ available. paired may be served by the captioning system, or by an optionally allocated audio * * * * * * * * * * * * * * * * channel containing only dialogue which has been processed for improved intelligibility.

1992 NCTA Technical Papers- 39