Two-Dimensional Audio Compression Method Using Video Coding Schemes

Article

Seonjae Kim 1, Dongsan Jun 1,*, Byung-Gyu Kim 2,*, Seungkwon Beack 3, Misuk Lee 3 and Taejin Lee 3

1 Department of Convergence IT Engineering, Kyungnam University, Changwon 51767, Korea; [email protected]
2 Department of IT Engineering, Sookmyung Women’s University, Seoul 04310, Korea
3 Electronics and Telecommunications Research Institute (ETRI), Daejeon 34129, Korea; [email protected] (S.B.); [email protected] (M.L.); [email protected] (T.L.)
* Correspondence: [email protected] (D.J.); [email protected] (B.-G.K.)

Abstract: As video compression is one of the core technologies that enable seamless media streaming within the available network bandwidth, it is crucial to employ media codecs that deliver powerful coding performance and high visual quality. Versatile Video Coding (VVC) is the latest video coding standard developed by the Joint Video Experts Team (JVET) and can compress original image or video data by a factor of hundreds; the latest audio coding standard, Unified Speech and Audio Coding (USAC), achieves a compression ratio of about 20 for audio or speech data. In this paper, we propose a pre-processing method that generates a two-dimensional (2D) audio signal as the input of a VVC encoder, and we investigate the applicability of the video coding scheme to 2D audio compression. To evaluate the coding performance, we measure both signal-to-noise ratio (SNR) and bits per sample (bps). The experimental results show that 2D audio encoding using video coding schemes is a feasible direction for further research.

Keywords: audio compression; video compression; next generation audio coding; versatile video coding (VVC); mu-law encoding; linear mapping

Citation: Kim, S.; Jun, D.; Kim, B.-G.; Beack, S.; Lee, M.; Lee, T. Two-Dimensional Audio Compression Method Using Video Coding Schemes. Electronics 2021, 10, 1094. https://doi.org/10.3390/electronics10091094

Academic Editor: Juan M. Corchado
Received: 23 March 2021
Accepted: 3 May 2021
Published: 6 May 2021

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction

Consumer demand for realistic and rich media services on low-end devices is rapidly increasing in multimedia delivery and storage applications. This emphasizes the need for audio and video codecs with powerful coding performance, that is, codecs that achieve minimum bitrates while maintaining high perceptual quality relative to the original data. As the state-of-the-art video coding standard, Versatile Video Coding (VVC) [1] was developed by the Joint Video Experts Team (JVET) of the ITU-T Video Coding Experts Group (VCEG) and ISO/IEC Moving Picture Experts Group (MPEG). VVC can achieve a bitrate reduction of 50% at similar visual quality compared with its predecessor, High Efficiency Video Coding (HEVC) [2]. In the field of audio coding technology, Unified Speech and Audio Coding (USAC) [3] was developed by the MPEG audio group as the latest audio coding standard, which integrates speech and audio coding schemes. Whereas VVC achieves significant coding performance on original image or video data, USAC compresses original audio or speech data by a factor of about 20.

In this study, we conducted various experiments to verify the possibility of compressing two-dimensional (2D) audio signals using video coding schemes. In detail, we converted a 1D audio signal into a 2D audio signal with the proposed 2D audio conversion process and used it as the input of a VVC encoder. We used both signal-to-noise ratio (SNR) and bits per sample (bps) for performance evaluations.
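The two evaluation metrics named above can be sketched as follows. This is a minimal illustration that assumes the conventional definitions: SNR as the ratio of signal energy to reconstruction-error energy in decibels, and bps as the coded size in bits divided by the number of audio samples. The exact formulas used in the experiments are not given in this part of the paper, and the function names `snr_db` and `bits_per_sample` are hypothetical.

```python
import math


def snr_db(original, decoded):
    """Signal-to-noise ratio in dB (assumed conventional definition).

    SNR = 10 * log10(sum(x^2) / sum((x - y)^2)), where x is the original
    signal and y is the decoded signal.
    """
    if len(original) != len(decoded):
        raise ValueError("signals must have the same length")
    signal_energy = sum(x * x for x in original)
    noise_energy = sum((x - y) ** 2 for x, y in zip(original, decoded))
    if noise_energy == 0:
        return math.inf  # lossless reconstruction
    if signal_energy == 0:
        return -math.inf  # silent original with a non-zero error
    return 10.0 * math.log10(signal_energy / noise_energy)


def bits_per_sample(total_bits, num_samples):
    """Average number of coded bits spent per audio sample (assumed definition)."""
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")
    return total_bits / num_samples


if __name__ == "__main__":
    # Toy data, not taken from the paper's experiments.
    x = [1.0, -1.0, 1.0, -1.0]
    y = [0.9, -0.9, 0.9, -0.9]
    print(snr_db(x, y))
    print(bits_per_sample(48000, 24000))
```

A higher SNR at the same bps, or a lower bps at the same SNR, would indicate better coding performance under these definitions.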
The remainder of this paper is organized as follows: In Section 2, we overview the video coding tools, which are newly adopted in VVC. In Section 3, we describe the proposed methods for converting a 1D audio signal into a 2D audio signal. Finally, the experimental results and conclusions are provided in Sections 4 and 5, respectively.

2. Overview of VVC

VVC can provide powerful coding performance compared with HEVC. One of the main differences between HEVC and VVC is the block structure. Both HEVC and VVC commonly specify coding tree unit (CTU) as the largest coding unit, which has a changeable size depending on the encoder configuration. In addition, a CTU can be split into four coding units (CUs) by a quad tree (QT) structure to adapt to a variety of block properties. In HEVC, a CU can be further partitioned into one, two, or four prediction units (PUs) according to the PU splitting type.
into After one, obtaining two, or fourthe residual prediction block units derived (PUs) fromaccording the PU-level to the PU intra- splitting or inter-prediction, type. After obtaining a CU thecan residual be partitioned block derived into multiple from the transformPU-level intra-units (TUs) or inter-prediction, according to a a residual CU can quad-tree be partitioned (RQT) into structure multiple similar transform to that units of CU(TUs) split. according to a residual quad-tree (RQT) structure similar to that of CU split. VVCVVC substitutes substitutes the the concepts concepts of of multiple multiple partition partition unit unit types types (CU, (CU, PU, PU, and and TU) TU) with with aa QT-based QT-based multi-type multi-type tree tree (QTMTT) (QTMTT) block block structure, structure, where where the the MTT MTT is is classified classified into into binarybinary tree tree (BT) (BT) split split and and ternary ternary tree tree (TT) (TT) split split to to support support more more flexibility flexibility for for CU CU partition partition shapes.shapes. This This means means that that a aQT QT can can be be further further split split by by the the MTT MTT structure structure after after a aCTU CTU is is first first partitionedpartitioned by by a aQT QT structure. structure. As As depicted depicted in in Figure Figure 1,1 ,VVC VVC specifies specifies four four MTT MTT split split types: types: verticalvertical binary binary split split (SPLIT_BT_VER), (SPLIT_BT_VER), horizontal horizontal binary binary split split (SPLIT_BT_HOR), (SPLIT_BT_HOR), vertical vertical ternaryternary split split (SPLIT_TT_VER), (SPLIT_TT_VER), and and horizontal horizontal ternary ternary split split (SPLIT_TT_HOR), (SPLIT_TT_HOR), as as well well as as QTQT split. split. 
In VVC, a QT or MTT node is considered a CU for prediction and transform processes without any further partitioning schemes. Note that CU, PU, and TU have the same block size in the VVC block structure. In other words, a CU in VVC can have either a square or rectangular shape, whereas a CU in HEVC always has a square shape.

Figure 1. VVC block structure (QTMTT).

Table 1 shows the coding tools newly adopted in VVC relative to HEVC. In general, intra-prediction generates a predicted block from the reconstructed neighboring pixels of the current block. As shown in Figure 2, VVC can provide 67 intra-prediction modes, where modes 0 and 1 are the planar and DC modes, respectively, and the others are angular prediction modes that represent edge direction. According to [4], the VVC test model (VTM) [5] achieves 25% higher compression performance than the HEVC test model (HM) [6] under the all-intra (AI) configuration recommended by JVET Common Test Conditions (CTC) [7]. This improvement was mainly realized by the newly adopted coding tools, such as position
