Implementing Object-Based Audio in Radio Broadcasting
Diploma thesis

Submitted for the purpose of obtaining the academic degree Dipl.-Ing. für technisch-wissenschaftliche Berufe in the master's program Digitale Medientechnologien at Fachhochschule St. Pölten, master class Audio Design

by: Baran Vlad, DM161567

Supervisor and first assessor: FH-Prof. Dipl.-Ing. Franz Zotlöterer
Second assessor: FH Lektor Dipl.-Ing. Stefan Lainer

[Wien, 09.09.2019]

Declaration of Honour (Ehrenwörtliche Erklärung)

I declare that
- I have written this thesis independently, have used no sources or aids other than those indicated, and have not made use of any other unauthorized help.
- I have not previously submitted this topic, in Austria or abroad, to an assessor for evaluation or in any form as an examination paper.

This thesis is identical to the version evaluated by the assessor.

..............................    ..............................
Place, date                       Signature

Summary (Kurzfassung)

The science of object-based audio production is concerned with a new way of conveying spatial information that moves away from channel-based systems towards an approach that processes audio independently of the device on which it is rendered. These object-based systems treat audio elements as objects that are linked to metadata describing their behavior. So far this research has been applied predominantly in cinema and VR, while the broadcasting sector was neglected until recently. With growing popularity, the regulatory bodies of the broadcast industry have begun to standardize this new technology, which offers more flexibility and accessibility.

Building on this currently growing interest in object-based audio, this thesis examines the possibility of implementing spatial audio in radio broadcasts. The author drew inspiration from the Orpheus project of the BBC Research and Development team. To understand the architecture of broadcasting, the author visited three radio stations in Austria. The signal chain of each station was analyzed and, with the information gathered, a solution for implementing object-based audio is presented. Feedback on the proposed method was given by a member of the technical directorate responsible for the technical oversight of several Austrian radio stations. A listening test was developed to test the behavior of audio objects in common consumer loudspeaker layouts. Although the rendering software was experimental, the test showed promising results regarding object position consistency and speech intelligibility. The results show that implementing object-based audio in broadcasting is an experimental approach in which some technologies still have to be developed and protocols standardized. At present, Austrian radio stations do not have the building blocks needed to assemble an object-based broadcast. While most of the solutions are software based, some hardware changes are required to make the signal chain functional. Nevertheless, the heads of the technical departments are aware of the advantages and agree that this can be a solution for the future.

Abstract

The science of object-based audio is concerned with a new way of conveying spatial information, moving away from channel-based systems towards an approach that processes audio independently of the device it is rendered on. These object-based systems treat audio elements as objects that are linked with metadata describing their behavior.
So far, research has predominantly discussed cinema and VR applications, leaving the broadcasting segment of the industry behind. With the rise in popularity, the regulating bodies of the broadcast industry have started to standardize this new technology, which offers more flexibility and accessibility. Building upon this currently increasing interest in object-based audio, this thesis addresses the possibility of implementing spatial audio in radio broadcast. The author drew inspiration from the Orpheus project developed by the BBC Research and Development team. In order to understand the architecture of radio broadcasting, the author took part in tours of three Austrian radio stations. Each radio station's signal chain was analyzed and, with the information gathered, a solution for implementing object-based audio is presented. Feedback on the proposed method was given by a member of the technical department in charge of technical oversight for several Austrian radio stations. A listening test was developed to test the behavior of audio objects in popular consumer speaker layouts. Although the renderer was experimental, the test showed promising results regarding object position consistency and voice intelligibility. The results show that implementing object-based audio in radio broadcasting is an experimental approach, with some technologies still to be developed and protocols still to be standardized. For the moment, Austrian radio stations do not possess the building blocks needed to assemble an object-based broadcast. While most of the solutions for achieving this are software based, some hardware changes are necessary to make the signal chain functional. Nevertheless, technical department heads are aware of the advantages it brings and agree that it can be a solution for the future.
1 Table of contents

Declaration of Honour (Ehrenwörtliche Erklärung)
Summary (Kurzfassung)
Abstract
1 Table of contents
2 Introduction
3 Research questions
4 Methodology
5 Technical concepts
    4.1 Radio technology
        4.1.1 Principles of a Radio Studio
    4.2 Immersive Audio
        4.2.1 Immersive Audio
        4.2.2 Ambisonics
        4.2.3 VBAP
        4.2.4 Object Based Audio
        4.2.5 Audio Definition Model
        4.2.6 MPEG-H 3D Audio
        4.2.7 Demand for Immersive Content
        4.2.8 The Orpheus Project
        4.2.9 IP Studio
5 Current Technology
    5.1 Radio station analysis in Austria
        5.1.2 Public Radio Niederösterreich
        5.1.3 Kronehit Radio
        5.1.4 OE3 Radio
6 Spatial audio broadcast
    6.1 MPEG-H in TV Broadcast
    6.2 Implementation in radio Broadcast
        6.2.1 MPEG-DASH
        6.2.2 Capture
        6.2.3 Transport
        6.2.4 Processing
        6.2.5 Distribution
        6.2.6 MPEG-H distribution
        6.2.7 Reception
        6.2.8 Web-based content
        6.2.9 DAB+ Encoding
        6.2.10 Technical feedback
7 Listening test
    7.1 Immersive Audio for Home
    7.2 Spatial audio panning algorithms
    7.3 Methodology
    7.4 Audio Object Rendering
    7.5 Results
7 Conclusions
8 References
9 Table of figures

2 Introduction

Radio is a household technology. Today it is hard to imagine a person who has never heard, seen or used a radio. The fact that this technology is so deeply rooted in our cultures is an advantage because of the accessibility it brings, but it is also a disadvantage. Since its development, radio has worked in a simple way: the user hears what the transmitter broadcasts. With this mindset, most people do not expect radio technology to evolve and regard it as somewhat 'fixed'. Yet radio has been evolving continuously since its invention. While the service has not changed much in appearance, the technology behind it has improved dramatically, from audio quality to range and bandwidth.
One of the greatest improvements in media delivery has been interactivity. VR is taking the gaming industry to the next level of interactivity and is producing technology that can be used for other purposes; for example, augmented reality is helping surgeons to plan operations better (Usman, 2018). In the modern world, interactivity looks like an important asset for a media broadcasting service, and with the technology that is available now it appears to be a more realistic goal than ever.

BBC Radio is the best-known radio broadcaster in the UK and the one with the most listeners, with BBC Radio 1 reaching an average of 11 million listeners every week (“ORPHEUS - BBC R&D,” 2019). Its research and development team has been experimenting with binaural and immersive audio since 2012, producing radio drama and classical music content in order to achieve the most realistic sound for the user. To do this, a new approach to handling audio was developed.

Traditionally, audio in radio is handled as a stream of mixed content, for example speech and music, or speech and background noise. Once a program has been made it cannot be 'unmixed', so the final result is locked and its individual parts can no longer be changed. It is also fixed to a playback format such as stereo, 5.1 or 7.1. When audio is treated as objects, each part of the production is handled separately and the parts are only mixed together at playback. The task of mixing the individual objects is divided between two technologies. The first is metadata, which is attached to each object and describes what the object contains and how it should be mixed. The second is the decoder, which can be software or hardware based and mixes the objects into a finished product. The decoder also has the advantage that it can play back the final product on any given speaker array: given the right data about the array, it can adapt any object-based content to what is available.
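The division of labor between metadata and decoder described above can be illustrated with a minimal sketch. It is not the renderer used in this thesis or in any standard: the class `AudioObject`, the `LAYOUTS` table, and the functions `panning_gains` and `render` are hypothetical names chosen for illustration, only horizontal position is considered, and a simple constant-power pan between the two neighboring loudspeakers stands in for a real panning algorithm. The speaker angles and the sign convention (positive azimuth to the left) are assumptions as well.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AudioObject:
    """One audio object: a mono signal plus the metadata describing it."""
    name: str
    samples: List[float]   # mono audio samples
    azimuth_deg: float     # metadata: 0 = front, positive = left (assumed)
    gain: float = 1.0      # metadata: linear gain


# Illustrative speaker layouts: azimuth of each loudspeaker in degrees.
# Channel order for "5.0" is L, R, C, Ls, Rs (assumed for this sketch).
LAYOUTS: Dict[str, List[float]] = {
    "stereo": [30.0, -30.0],
    "5.0": [30.0, -30.0, 0.0, 110.0, -110.0],
}


def panning_gains(azimuth_deg: float, speakers: List[float]) -> List[float]:
    """Constant-power pan between the two speakers that enclose the azimuth."""
    n = len(speakers)
    if n == 1:
        return [1.0]
    # Walk around the circle in order of increasing angle.
    order = sorted(range(n), key=lambda i: speakers[i] % 360.0)
    az = azimuth_deg % 360.0
    gains = [0.0] * n
    for k in range(n):
        i, j = order[k], order[(k + 1) % n]
        start = speakers[i] % 360.0
        span = (speakers[j] - speakers[i]) % 360.0 or 360.0
        offset = (az - start) % 360.0
        if offset <= span:
            frac = offset / span
            gains[i] = math.cos(frac * math.pi / 2)
            gains[j] = math.sin(frac * math.pi / 2)
            return gains
    raise ValueError("no enclosing speaker pair found")


def render(objects: List[AudioObject], layout: str) -> List[List[float]]:
    """Mix all objects into one sample list per loudspeaker of the layout."""
    speakers = LAYOUTS[layout]
    length = max((len(o.samples) for o in objects), default=0)
    out = [[0.0] * length for _ in speakers]
    for obj in objects:
        gains = panning_gains(obj.azimuth_deg, speakers)
        for ch, ch_gain in enumerate(gains):
            for idx, sample in enumerate(obj.samples):
                out[ch][idx] += obj.gain * ch_gain * sample
    return out


if __name__ == "__main__":
    programme = [
        AudioObject("voice", [1.0, 0.5], azimuth_deg=0.0),
        AudioObject("music", [0.2, 0.2], azimuth_deg=30.0, gain=0.5),
    ]
    for name in LAYOUTS:
        print(name, render(programme, name))
```

The same list of objects is passed to `render` for both layouts. Only the layout data changes, which is the property that lets object-based content adapt to whatever speaker array is available.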
With the media being assembled at the user end, it becomes possible to give users the power to change the media to their preferences (R. Bleidt et al., 2017). A standard called ATSC 3.0 was implemented in Seoul, Korea, which gives users a new video experience, part of which is the ability to change multiple audio parameters (atsc.org, Jay Jeon). This system is based on object-based audio, and users have a multitude of choices regarding audio, from the language of a program to changing the levels of the background noise or of other communication channels.
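As a rough illustration of this kind of personalization, the sketch below selects and re-weights objects according to listener preferences before they would be handed to a renderer. It does not reproduce the ATSC 3.0 or MPEG-H interactivity model: the classes `ObjectMeta` and `UserPrefs`, the metadata fields `kind` and `language`, and the function `personalise` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ObjectMeta:
    """Metadata of one audio object (fields are assumptions for this sketch)."""
    name: str
    kind: str                       # "dialogue" or "background"
    language: Optional[str] = None  # only meaningful for dialogue objects
    gain_db: float = 0.0            # level chosen by the producer


@dataclass
class UserPrefs:
    """Choices the listener makes on the receiving device."""
    language: str = "en"
    background_offset_db: float = 0.0


def personalise(objects: List[ObjectMeta],
                prefs: UserPrefs) -> List[Tuple[str, float]]:
    """Return (object name, linear gain) for every object that should play."""
    selected = []
    for obj in objects:
        # Keep only the dialogue object in the listener's language.
        if obj.kind == "dialogue" and obj.language != prefs.language:
            continue
        gain_db = obj.gain_db
        # Apply the listener's level change to background objects only.
        if obj.kind == "background":
            gain_db += prefs.background_offset_db
        selected.append((obj.name, 10 ** (gain_db / 20)))
    return selected


if __name__ == "__main__":
    programme = [
        ObjectMeta("commentary_en", "dialogue", language="en"),
        ObjectMeta("commentary_de", "dialogue", language="de"),
        ObjectMeta("crowd", "background"),
    ]
    prefs = UserPrefs(language="de", background_offset_db=-6.0)
    print(personalise(programme, prefs))
```

Because the mix is only assembled at playback, a change in `UserPrefs` needs no new broadcast: the same objects are simply combined differently on the receiving device.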