Virtualization of Audio-Visual Services

Software Defined Media: Virtualization of Audio-Visual Services Manabu Tsukada∗, Keiko, Ogaway, Masahiro Ikedaz, Takuro Sonez, Kenta Niwax, Shoichiro Saitox, Takashi Kasuya{, Hideki Sunaharay, and Hiroshi Esaki∗ ∗ Graduate School of Information Science and Technology, The University of Tokyo Email: [email protected], [email protected] yGraduate School of Media Design, Keio University / Email: [email protected], [email protected] zYamaha Corporation / Email: fmasahiro.ikeda, [email protected] xNTT Media Intelligence Laboratories / Email: fniwa.kenta, [email protected] {Takenaka Corporation / Email: [email protected] Abstract—Internet-native audio-visual services are witnessing We believe that innovative applications will emerge from rapid development. Among these services, object-based audio- the fusion of object-based audio and video systems, including visual services are gaining importance. In 2014, we established new interactive education systems and public viewing systems. the Software Defined Media (SDM) consortium to target new research areas and markets involving object-based digital media In 2014, we established the Software Defined Media (SDM) 1 and Internet-by-design audio-visual environments. In this paper, consortium to target new research areas and markets involving we introduce the SDM architecture that virtualizes networked object-based digital media and Internet-by-design audio-visual audio-visual services along with the development of smart build- environments. We design SDM along with the development ings and smart cities using Internet of Things (IoT) devices of smart buildings and smart cities using Internet of Things and smart building facilities. Moreover, we design the SDM architecture as a layered architecture to promote the development (IoT) devices and smart building facilities. Moreover, we of innovative applications on the basis of rapid advancements define audio-visual devices as smart building facilities for in software-defined networking (SDN). Then, we implement a better integration into the smart buildings. SDM virtualizes prototype system based on the architecture, present the system the networked audio-visual infrastructure and enables SDM at an exhibition, and provide it as an SDM API to application applications to flexibly access audio-visual services on the developers at hackathons. Various types of applications are developed using the API at these events. An evaluation of SDM basis of rapid advancements in software-defined networking API access shows that the prototype SDM platform effectively (SDN) [5]. provides 3D audio reproducibility and interactiveness for SDM This paper discusses the activities and achievements of applications. the SDM consortium. Section II reviews the related studies. Section III describes the motivations and goals of SDM. Sec- I. INTRODUCTION tion IV introduces the SDM architecture that realizes flexible audio-visual services based on object-based media. Section V With the proliferation of the Internet, users are increasingly presents an overview of the SDM prototype system. Section VI able to enjoy media generation services, such as Vocaloid™, summarizes the SDM platform presented at exhibitions and and media sharing services such as YouTube, Facebook, and provided at hackathons. Section VII describes the evaluation Nico Nico Douga. In these services, data is generated, pro- of the SDM prototype systems. Finally, Section VIII concludes cessed, transmitted, and shared in a digitally native manner on the paper and explores directions for future work. the Internet without analog-to-digital conversion. Recently, the arXiv:1702.07452v1 [cs.MM] 24 Feb 2017 object-based approach has gained importance, whereby media II. RELATED WORK data is decoupled from audio-visual data and 3D meta-data, Sound reproduction with spatial expression began with the which represents multiple objects that exist in 3D space, and development of stereo sound (two-channel) and continued transmitted to remote locations for the reproduction of 3D with the development of surround sound (multi-channel). 22.2 space. This allows flexible and possibly interactive reproduc- Multichannel Sound [6] introduced height expression for 3D tion that adapts to the receiver’s configuration, including head- sound reproduction. Super Hi-Vision [7] employs 22.2 multi- mounted display (HMD), 3D TV, and 3D audio systems. channel surround sound with the worlds first High-Efficiency Object-based audio systems for reproducing 3D sound are Video Coding (HEVC) encoder for 8K Ultra HDTV. Channel- commercially used in cinemas, home theatres, and broad- based systems reproduce sound from the information assigned casting TV systems (e.g., Dolby Atmos[1] and DTS:X [2]). to each channel, whereas object-based audio systems render Furthermore, researchers are developing free-viewpoint video sounds based on software processing while considering the systems that employ object-based systems [3], [4], whereby 3D locations of the sound objects as well as the positions the 3D models of visual objects are extracted from multiple field-installed cameras. 1http://sdm.wide.ad.jp/ of the speakers. Besides Dolby Atmos[1] and DTS:X [2], created to add the desired effects. SDM aims to create which have been deployed in theatres, an object-based audio augmented audio-visual effects by mixing real and virtual format is being standardized by the ISO/IEC Moving Picture audio-visual objects to provide an enriched experience to Experts Group (MPEG) as MPEG-H for consumer audio the audience. including broadcasting [8]. Flexibly focusing on a sound • Focusing on audience interests: SDM provides means source using signal processing with a microphone array has for users to interactively provide feedback regarding their been investigated as an audio recording technique [9], [10]. interests to the software systems for reproduction man- Moreover, researchers have investigated 3D model extraction agement, content source selection, and so on. Moreover, from multiple field-installed cameras as a video recording SDM connects spatially/temporally distributed audiences technique [3], [4]. having similar interests and enables them to interact. The audio-visual infrastructure installed in a building, in- IV. SDM ARCHITECTURE cluding speakers, microphones, displays, and cameras, is con- sidered as a building facility interconnected by an IP network. Fig. 1 shows the SDM architecture that realizes the goals Recent advancements in smart buildings include facilities such described in the previous section. as heating, ventilation, and air conditioning (HVAC) as well as lighting systems managed by intelligent controllers connected Contents SDM Station to an IP network using standard protocols such as IEEE Broadcast The Internet 1888 [11]. system OTIVATIONS AND OALS Media Contents III. M G Exchange of Public mixer audio-visual processing Management The notion of Software Defined Media (SDM) has arisen objects SDM Station from IP-based media networks. SDM is an architectural ap- Reproduction Media proach to media as a service by virtualization and abstraction Application manager transmitter Application Reflected audio- of networked media infrastructure. SDM allows application visual object API developers to manage audio-visual services through the ab- Virtual space Localiza Audio Video 3D audio straction of lower-level functionalities. This is achieved by Service tion extraction extraction rendering decoupling the software that makes decisions about how to 3D space modeling and map Projection render the audio-visual services from the underlying infras- Reflection Input Output tructure that inputs and outputs the sound. Such decoupling enables a system to configure flexible and intelligent audio- Projected audio and video Infrastruc visual services to match an entertainment program with the ture Micro Sensor Camera Speaker HMD Display rendering. phone Recording environment Reproduction environment The goals of SDM are as follows: Real space • Software-programmable environment of 3D audio- Fig. 1. SDM architecture visual services: At present, audio-visual systems are mainly hardware-based dedicated systems that fulfill As shown on the left-hand side of Fig. 1, SDM interprets the different performance requirements for applications in objects in the real space and reflects them three-dimensionally theatres, museums, classrooms, and so on. SDM aims as audio-visual objects in the virtual space. The audio-visual to change such dedicated systems into flexible systems objects are exchanged between remote locations and processed via software rendering. The software manages the audio- in various applications. The results are projected from the visual objects in the virtual 3D space and renders images virtual space to the real space for the rendering of sounds and and sounds suitable for the rendering environments. By images. The SDM station is the most basic unit of SDM for reconfiguring the software, many applications can be re- linking the virtual space and the real space. An SDM station alized with a general-purpose audio-visual infrastructure is divided into three layers, as shown on the right-hand side in the rendering environment. of Fig. 1. • Mixing of 3D audio-visual objects from multiple The infrastructure layer includes input devices that record sources: SDM aims to achieve receiver-side mixing of and interpret the targets three-dimensionally in the real space, audio-visual objects from multiple sources, including the e.g., sensors,

Virtualization of Audio-Visual Services

DVD850 DVD Video Player with Jog/Shuttle Remote

An Introduction to MPEG Transport Streams

Implementing Object-Based Audio in Radio Broadcasting

BDP9100/05 Philips Blu-Ray Disc Player

Comprehensive Video/Audio Analysis to Reduce Debug Cycle

HTS3450/77 Philips DVD Home Theater System with Video

DTS Studio Soundtm

DTS Introduces the Advanced DTS Digital Cinema Encoder™; First to Employ JPEG 2000 'Constant Quality' Encoding for D-Cinema

Inspired by Passion-Driven by Excellence DBS-30.3 BLU-RAY DISC PLAYER

EBU Evaluations of Multichannel Audio Codecs

DTS® and Coding Technologies Bring MPEG-4 Multi-Channel Audio for HDTV Into the Home at CES

40" FULL HD ANDROID TV™ the 40BI5KA Is a Full HD LED Android TV™ with Exceptional Multimedia Functionality