electronics
Article

An Efficient Codebook Search Algorithm for Line Spectrum Frequency (LSF) Vector Quantization in Speech Codec
Yuqun Xue 1,2, Yongsen Wang 1,2, Jianhua Jiang 1,2, Zenghui Yu 1, Yi Zhan 1, Xiaohua Fan 1 and Shushan Qiao 1,2,*
1 Smart Sensing R&D Center, Institute of Microelectronics of Chinese Academy of Sciences, Beijing 100029, China; [email protected] (Y.X.); [email protected] (Y.W.); [email protected] (J.J.); [email protected] (Z.Y.); [email protected] (Y.Z.); [email protected] (X.F.)
2 Department of Microelectronics, University of Chinese Academy of Sciences, Beijing 100049, China
* Correspondence: [email protected]
Abstract: A high-performance vector quantization (VQ) codebook search algorithm is proposed in this paper. VQ is an important data compression technique that has been widely applied to speech, image, and video compression. However, the codebook search process demands a high computational load. To solve this issue, a novel algorithm that consists of training and encoding procedures is proposed. In the training procedure, a training speech dataset is used to build the squared-error distortion look-up table for each subspace. In the encoding procedure, firstly, an input vector is quickly assigned to a search subspace. Secondly, the candidate code word group is obtained by employing the triangular inequality elimination (TIE) equation. Finally, a partial distortion elimination (PDE) technique is employed to reduce the number of multiplications. The proposed method reduces the number of searches and the computation load significantly, especially when the input vectors are uncorrelated. The experimental results show that the proposed algorithm provides a computational saving (CS) of up to 85% over the full search algorithm, up to 76% over the TIE algorithm, and up to 63% over the iterative TIE algorithm. Further, the proposed method provides CS and load reduction of up to 29–33% and 67–69%, respectively, over the BSS-ITIE algorithm.

Keywords: vector quantization (VQ); codebook search; line spectrum frequency (LSF); speech codec

Citation: Xue, Y.; Wang, Y.; Jiang, J.; Yu, Z.; Zhan, Y.; Fan, X.; Qiao, S. An Efficient Codebook Search Algorithm for Line Spectrum Frequency (LSF) Vector Quantization in Speech Codec. Electronics 2021, 10, 380. https://doi.org/10.3390/electronics10040380
Academic Editor: Manuel Rosa-Zurera
Received: 3 December 2020; Accepted: 2 February 2021; Published: 4 February 2021
Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction

Vector quantization (VQ) is a high-performance technique for data compression. Due to its simple coding and high compression ratio, it has been successfully applied to speech, image, audio, and video compression [1–6]. It is also a key part of the G.729 Recommendation. A major handicap is that a remarkable computational load is required for the VQ of the line spectrum frequency (LSF) coefficients of the speech codec [7–12]. Thus, it is necessary to reduce the computation load of VQ.

Conventionally, a full search algorithm is employed to find the code word that best matches an arbitrary input vector. However, the full search algorithm demands a large computation load. To reduce the encoding search complexity, many approaches have been proposed [13–28]. These approaches can be classified into four main types, according to their base techniques. The first type is the tree-structured VQ (TSVQ) technique [13–17], the second type is the triangular inequality elimination (TIE)-based approach [18–21], the third type is the equal-average equal-variance equal-norm nearest neighbor search (EEENNS) method [22–24] based on the statistical characteristic values of the input signal, and the last type is the binary search space-structured VQ (BSS-VQ) method [25–28].

In the TSVQ approaches [13–17], the search complexity can be reduced significantly. However, the reconstructed speech quality is poor because the selected code word is not necessarily the best match to the input vector. In contrast, the TIE-based method [18–21] can achieve an enormous computational load reduction without loss of speech quality. By
employing the high correlation between adjacent frames, a TIE method is used to reject the impossible candidate code words. However, the performance of computational load reduction is weakened when the input vectors are uncorrelated. The EEENNS technique employs the statistical characteristics of an input vector, such as the mean, the variance, and the norm, to reject the impossible code words. In order to further reduce the computational load, the BSS-VQ method [25] was proposed, in which the number of candidate code words is closely related to the distribution of the hit probability, and a sharpened distribution at specific code words yields an enormous computational load reduction. However, the probability distribution is not uniform; some subspaces are concentrated, and some are flattened. Thus, the performance is sometimes not good.

To solve the above issues, an efficient VQ codebook search algorithm is proposed.
Even though the input vectors are uncorrelated, it can still reduce the computational load significantly while maintaining the quantization accuracy. Compared with previous works on the full search algorithm [6], TIE [18], ITIE [21], and the BSS-ITIE [28], the experimental results show that the proposed algorithm can quantize the input vector with the lowest computational load.

The rest of this paper is organized as follows. Section 2 describes the encoding procedure of the LSF coefficients in G.729. Section 3 presents the theory of the proposed algorithm in detail. Section 4 shows the experimental results and performance comparison between the proposed algorithm and previous works. Finally, Section 5 concludes this work.

2. LSF Coefficients Quantization in G.729

The ITU-T G.729 [29] speech codec was selected as the platform to verify the performance of the proposed algorithm. Thus, before introducing the theory of the proposed method, the principle of the LSF quantization in G.729 is introduced here.

The LSF coefficients are obtained by the equation $\omega_i = \arccos(q_i)$, where $q_i$ is the LSF coefficient in the cosine domain, and $\omega_i$ is the computed LSF coefficient in the frequency domain. The procedure of the LSF quantizer is organized as follows: a switched 4th-order moving average (MA) prediction is used to predict the LSF coefficients of the current frame. The difference between the computed and predicted coefficients is quantized using a two-stage vector quantizer. The first stage is a 10-dimensional VQ using a codebook L1 with 128 entries (7 bits). In the second stage, the quantization error vectors of the first stage are split into two sub-vectors, which are then quantized by a split VQ associated with two codebooks, L2 and L3, each containing 32 entries (5 bits). Figure 1 illustrates the structure of the two-stage VQ for LSF coefficients.
[Figure 1: block diagram. LPC analysis of the speech frame yields the LSF coefficients; a 4th-order MA filter yields the predicted LSF; the prediction error is quantized by the two-stage VQ, selecting the code vectors l1 (codebook L1), l2 (codebook L2), and l3 (codebook L3) that minimize the (weighted) MSE distortion, together with the predictor selection p0.]

Figure 1. Structure of the two-stage vector quantization (VQ) for line spectrum frequency (LSF) coefficients.
To explain the quantization process, it is convenient to describe the decoding process first. Each quantized value is obtained from the sum of two code words, as follows:

$$\hat{l}_i = \begin{cases} L1_i(l1) + L2_i(l2), & i = 1, \ldots, 5 \\ L1_i(l1) + L3_{i-5}(l3), & i = 6, \ldots, 10 \end{cases} \qquad (1)$$
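As a concrete sketch of Equation (1), the decoder-side reconstruction can be written in Python; the array names `L1`, `L2`, `L3` and the 0-based indexing are illustrative assumptions, not the reference G.729 implementation:

```python
import numpy as np

def reconstruct_lsf(L1, L2, L3, l1, l2, l3):
    """Sketch of Equation (1): the quantized vector is the sum of the
    first-stage code word and the split second-stage code words."""
    l_hat = np.empty(10)
    l_hat[:5] = L1[l1, :5] + L2[l2]   # i = 1..5: lower half uses L2
    l_hat[5:] = L1[l1, 5:] + L3[l3]   # i = 6..10: upper half uses L3
    return l_hat
```

Only the indices l1, l2, and l3 (plus p0) are transmitted; the decoder looks up the same codebook tables to rebuild the quantized LSF vector.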
where l1, l2, and l3 are the codebook indices. To guarantee that the reconstructed filters are stable, the vector $\hat{l}_i$ is arranged such that adjacent elements have a minimum distance of $d_{min}$. This rearrangement process is done twice. First, the quantized LSF coefficients $\hat{\omega}_i^{(m)}$ for the current frame $m$ are obtained from the weighted sum of the previous quantizer outputs, $\hat{l}_i^{(m-k)}$, and the current quantizer output, $\hat{l}_i^{m}$:
$$\hat{\omega}_i^{(m)} = \left(1 - \sum_{k=1}^{4} \hat{p}_{i,k}\right) \hat{l}_i^{m} + \sum_{k=1}^{4} \hat{p}_{i,k}\, \hat{l}_i^{(m-k)}, \quad i = 1, \ldots, 10, \qquad (2)$$
where $\hat{p}_{i,k}$ are the coefficients of the switched MA predictor selected by the parameter $p0$. For each of the two MA predictors, the best approximation to the current LSF coefficients has to be found. The best approximation is defined as the one that minimizes the weighted mean-squared error (MSE):
$$E_{lsf} = \sum_{i=1}^{10} w_i\,(\omega_i - \hat{\omega}_i)^2. \qquad (3)$$
The weights emphasize the relative importance of each LSF coefficient. The weights $w_i$ are made adaptive as a function of the unquantized LSF coefficients:

$$w_1 = \begin{cases} 1.0, & \text{if } \omega_2 - 0.04\pi - 1 > 0 \\ 10(\omega_2 - 0.04\pi - 1)^2 + 1, & \text{otherwise} \end{cases} \qquad (4)$$

$$w_i = \begin{cases} 1.0, & \text{if } \omega_{i+1} - \omega_{i-1} - 1 > 0 \\ 10(\omega_{i+1} - \omega_{i-1} - 1)^2 + 1, & \text{otherwise} \end{cases} \quad \text{for } 2 \le i \le 9, \qquad (5)$$

$$w_{10} = \begin{cases} 1.0, & \text{if } -\omega_9 + 0.92\pi - 1 > 0 \\ 10(-\omega_9 + 0.92\pi - 1)^2 + 1, & \text{otherwise.} \end{cases} \qquad (6)$$
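A minimal Python sketch of Equations (4)-(6), using 0-based indexing (the function name is an assumption), including the 1.2 scaling of the fifth and sixth weights noted in the text:

```python
import numpy as np

def lsf_weights(omega):
    """Adaptive weights w_1..w_10 from the unquantized LSF vector
    omega (radians), per Equations (4)-(6); 0-based indexing."""
    w = np.ones(10)
    d = omega[1] - 0.04 * np.pi - 1.0          # Eq. (4): uses omega_2
    if d <= 0.0:
        w[0] = 10.0 * d * d + 1.0
    for i in range(1, 9):                      # Eq. (5): w_2 .. w_9
        d = omega[i + 1] - omega[i - 1] - 1.0
        if d <= 0.0:
            w[i] = 10.0 * d * d + 1.0
    d = -omega[8] + 0.92 * np.pi - 1.0         # Eq. (6): uses omega_9
    if d <= 0.0:
        w[9] = 10.0 * d * d + 1.0
    w[4] *= 1.2                                # w_5 and w_6 scaled by 1.2
    w[5] *= 1.2
    return w
```

The quadratic branch grows the weight as neighboring LSF coefficients crowd together, so closely spaced (perceptually sensitive) formant regions are quantized more accurately.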
Then, the weights $w_5$ and $w_6$ are each multiplied by 1.2. The vector to be quantized for the current frame $m$ is obtained from Equation (7):
" 4 # 4 ! (m) (m) ˆ(m−k) Ii = ωi − ∑ pˆi,k Ii / 1 − ∑ pˆi,k , i = 1, . . . , 10. (7) k = 1 k = 1
The first codebook, L1, is searched, and the entry l1 that minimizes the unweighted MSE is selected. Then the second codebook, L2, is searched by computing the weighted MSE, and the entry l2 that results in the lowest error is selected. After selecting the first-stage vector, l1, and the lower part of the second stage, l2, the higher part of the second stage is searched from the codebook L3. The vector l3 that minimizes the weighted MSE is selected. The resulting vector $\hat{l}_i$, i = 1, ..., 10, is rearranged twice using the above procedure. The procedure is done for each of the two MA predictors defined by p0, and the MA predictor that produces the lowest weighted MSE is selected. Table 1 shows the two groups of coefficients of the MA predictor. When the first group of coefficients is selected, the value of p0 is 0. Otherwise, the value of p0 is 1. Table 2 shows the bit allocation of the LSF quantizer.
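The sequential search described above can be sketched in Python as a full-search baseline (the codebook arrays and names are placeholders; the rearrangement step and the loop over the two p0 predictors are omitted for brevity):

```python
import numpy as np

def two_stage_search(target, w, L1, L2, L3):
    """Full-search sketch of the procedure above: L1 by unweighted MSE,
    then L2 and L3 by weighted MSE on the first-stage residual.

    target : 10-dim vector to quantize
    w      : weights from Equations (4)-(6)
    L1     : (128, 10) array; L2, L3 : (32, 5) arrays (placeholders)
    """
    err1 = ((target - L1) ** 2).sum(axis=1)              # unweighted MSE
    l1 = int(np.argmin(err1))
    resid = target - L1[l1]                              # first-stage error
    err2 = (w[:5] * (resid[:5] - L2) ** 2).sum(axis=1)   # weighted MSE, low half
    l2 = int(np.argmin(err2))
    err3 = (w[5:] * (resid[5:] - L3) ** 2).sum(axis=1)   # weighted MSE, high half
    l3 = int(np.argmin(err3))
    return l1, l2, l3
```

This exhaustive scan of all 128 + 32 + 32 entries is exactly the computational load that the fast search algorithm of Section 3 aims to reduce.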
Table 1. The value of the MA predictor coefficients.
p0 | k | MA Predictor Coefficients
---|---|---------------------------
0 | 1 | 0.2570 0.2780 0.2800 0.2736 0.2757 0.2764 0.2675 0.2678 0.2779 0.2647
0 | 2 | 0.2142 0.2194 0.2331 0.2230 0.2272 0.2252 0.2148 0.2123 0.2115 0.2096
0 | 3 | 0.1670 0.1523 0.1567 0.1580 0.1601 0.1569 0.1589 0.1555 0.1474 0.1571
0 | 4 | 0.1238 0.0925 0.0798 0.0923 0.0890 0.0828 0.1010 0.0988 0.0872 0.1060
1 | 1 | 0.2360 0.2405 0.2499 0.2495 0.2517 0.2591 0.2636 0.2625 0.2551 0.2310
1 | 2 | 0.1285 0.0925 0.0779 0.1060 0.1183 0.1176 0.1277 0.1268 0.1193 0.1211
1 | 3 | 0.0981 0.0589 0.0401 0.0654 0.0761 0.0728 0.0841 0.0826 0.0776 0.0891
1 | 4 | 0.0923 0.0486 0.0287 0.0498 0.0526 0.0482 0.0621 0.0636 0.0584 0.0794
Table 2. Bit allocation of the LSF quantizer.
Procedure | Order | Code Index | Bits
----------|-------|------------|-----
First stage | 10th order | l1 | 7
Second stage | 5th low order | l2 | 5
Second stage | 5th high order | l3 | 5
Selection of predictive filter | — | p0 | 1
Total | | | 18
3. Proposed Search Algorithm

In this paper, a fast codebook search algorithm for vector quantization is proposed to reduce the computation load of the LSF coefficients' quantization in the G.729 speech codec. Even though the input vectors are uncorrelated, it can still reduce the computation load significantly while maintaining the quantization accuracy.

In the BSS-VQ method [25], the number of candidate code words is strongly related to the distribution of the hit probability for each subspace, and a sharpened distribution at specific code words yields an enormous computational load reduction while maintaining the quantization accuracy. However, the probability distribution is not uniform: some subspaces are concentrated, and some are flattened. Thus, its performance is sometimes poor. The TIE algorithm [18] uses the strong correlation between adjacent values to narrow the search range. However, its performance is poor when the inputs are uncorrelated.

Considering the drawbacks of the above methods, and to further reduce the computational load, especially when the inputs are uncorrelated, a novel algorithm is proposed. After a statistical analysis, the code word corresponding to the highest hit probability for each subspace is obtained and selected as the reference code word, and a squared-error distortion look-up table for each subspace is built. Adjacent code words in the look-up table are highly correlated because the squared-error distortion changes only slightly between them. The reference code word corresponding to the highest probability is regarded as the best-matched code word. Thus, the smaller the squared-error distortion with respect to the reference code word, the more likely the candidate code word is to be the best match. Therefore, the TIE technique can be employed to reject the impossible code words. The proposed algorithm consists of a training procedure and an encoding procedure.
The structure of the proposed algorithm is illustrated in Figure 2, and the theory of the proposed algorithm is presented in detail as follows.
[Figure 2: flowchart. Training procedure: a training speech dataset is processed; the code word corresponding to the highest hit probability is obtained; the squared-error distortion look-up table for each subspace is prebuilt. Encoding procedure: given a testing speech dataset and a TQA, a fast locating technique assigns each input to a subspace; the candidate search group CSG(cr) is obtained by employing the TIE equation; the PDE technique is applied to the computation of the squared-error distortion; the index of the encoded code word is output.]

Figure 2. Structure of the proposed search algorithm.
3.1. Training Procedure

In the three-stage training procedure, the squared-error look-up table for each subspace is prebuilt. At stage 1, each dimension is dichotomized into two subspaces. Then, an input vector is assigned to a corresponding subspace according to the entries of the input vector [25]. For instance, when the input is a 10-dimensional vector, there are $2^{10} = 1024$ subspaces in G.729. Before the encoding procedure, a look-up table that contains the hit probability statistical information on the code words is prebuilt.

The dichotomy position for each dimension is defined as the mean of all the code words from the codebook, presented as:

$$mean(k) = \frac{1}{CSize} \sum_{i=1}^{CSize} c_i(k), \quad 0 \le k < Dim, \qquad (8)$$
where $c_i(k)$ is the $k$-th component of the $i$-th code word $c_i$, and $mean(k)$ is the average value of all the $k$-th components. For instance, in the first stage of the LSF coefficients quantization in G.729, $CSize = 128$ and $Dim = 10$ in the codebook L1. A parameter $v_n(k)$ is defined for vector quantization of the $n$-th input vector $x_n$, symbolized as:

$$v_n(k) = \begin{cases} 2^k, & x_n(k) \ge mean(k) \\ 0, & x_n(k) < mean(k) \end{cases}, \quad 0 \le k < Dim, \qquad (9)$$

where $x_n(k)$ is the $k$-th component of $x_n$. Then, $x_n$ is assigned to a subspace $j$ ($bss_j$), where $j$ is the sum of $v_n(k)$ over all the dimensions, presented as:

$$x_n \in bss_j \;\Big|\; j = \sum_{k=0}^{Dim-1} v_n(k). \qquad (10)$$
For instance, given an input $x_n$ = {0.44, 0.08, 0.51, 0.87, 1.26, 1.40, 2.10, 2.18, 2.30, 2.42}, $v_n(k)$ = {1, 0, 0, 8, 16, 0, 64, 128, 0, 0} and $j = 217$ are obtained by (9) and (10), respectively. Then, the input vector $x_n$ is assigned to the subspace $bss_j$ with $j = 217$. This means that an input vector can be assigned to a corresponding subspace quickly, with only a few basic operations.
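The fast locating step of Equations (9)-(10) amounts to building a bitmask. A minimal sketch follows; the `mean` vector below is hypothetical, chosen only so that the worked example above reproduces j = 217:

```python
import numpy as np

def subspace_index(x, mean):
    """Equations (9)-(10): each dimension contributes bit 2^k when
    x(k) >= mean(k); the sum of the bits is the subspace index j."""
    j = 0
    for k in range(len(x)):
        if x[k] >= mean[k]:
            j += 1 << k
    return j

# Input from the worked example; hypothetical per-dimension thresholds.
x = np.array([0.44, 0.08, 0.51, 0.87, 1.26, 1.40, 2.10, 2.18, 2.30, 2.42])
mean = np.array([0.30, 0.50, 0.90, 0.80, 1.00, 1.50, 2.00, 2.10, 2.50, 2.60])
```

With these thresholds, `subspace_index(x, mean)` returns 217 = 1 + 8 + 16 + 64 + 128: only comparisons and bit shifts are needed, matching the claim that locating costs a few basic operations.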
At stage 2, the hit probability table for each subspace is prebuilt through a training mechanism, which includes the probability of each code word being the best-matched code word in each subspace. This is defined as $P_{hit}(c_i \mid bss_j)$, where $1 \le i \le CSize$, $1 \le j \le Snum$, and $Snum = 2^{Dim}$ is the number of subspaces. Subsequently, the table is sorted in descending order by the hit probability value. For example, when $i = 1$, $P_{hit}(c_i \mid bss_j)\big|_{i=1} = \max_{c_i} P_{hit}(c_i \mid bss_j)$ represents the highest hit probability in $bss_j$ and its corresponding code word. The sum of all the hit probabilities for each subspace is 1.0. The hit probability $P_{hit}(c_i \mid bss_j)$ is computed by Equation (11), and the cumulative probability of the top $N$ code words is symbolized by Equation (12):
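The stage-2 training statistics can be sketched empirically: count, per subspace, how often each code word wins a full search over a training set, then sort each subspace's code words by descending hit probability. The function name, the full-search reference encoder, and the table layout are illustrative assumptions:

```python
import numpy as np

def train_hit_table(train_vectors, codebook, mean):
    """Sketch of stage 2: estimate P_hit(c_i | bss_j) by counting how
    often each code word is the full-search winner within each
    subspace, then sort code words by descending hit probability."""
    csize, dim = codebook.shape
    snum = 1 << dim                              # Snum = 2^Dim subspaces
    counts = np.zeros((snum, csize))
    for x in train_vectors:
        # Stage-1 fast locating (Equations (9)-(10)).
        j = sum(1 << k for k in range(dim) if x[k] >= mean[k])
        # Full search gives the "true" best-matched code word.
        best = int(np.argmin(((x - codebook) ** 2).sum(axis=1)))
        counts[j, best] += 1.0
    totals = counts.sum(axis=1, keepdims=True)
    probs = counts / np.maximum(totals, 1.0)     # visited rows sum to 1.0
    order = np.argsort(-probs, axis=1)           # descending P_hit per subspace
    return probs, order
```

Each row of `order` lists a subspace's code words from most to least likely, which is the sorted table the encoding procedure consults first.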