SHORTEN: Simple lossless and near-lossless waveform compression

Tony Robinson

Technical report CUED/F-INFENG/TR.156
Cambridge University Engineering Department,
Trumpington Street, Cambridge, CB2 1PZ, UK

December 1994

Abstract

This report describes a program that performs compression of waveform files such as audio data. A simple predictive model of the waveform is used, followed by Huffman coding of the prediction residuals. This is both fast and near optimal for many commonly occurring waveform signals. This framework is then extended to lossy coding under the conditions of maximising the segmental signal to noise ratio on a per frame basis and coding to a fixed acceptable signal to noise ratio.

1 Introduction

It is common to store digitised waveforms on computers, and the resulting files can often consume significant amounts of storage space. General compression algorithms do not perform very well on these files as they fail to take into account the structure of the data and the nature of the signal contained therein. Typically a waveform file will consist of signed 16 bit numbers and there will be significant sample to sample correlation. A compression utility for these files must be reasonably fast, be portable, accept data in the most popular formats and give significant compression. This report describes "shorten", a program for the UNIX and DOS environments which aims to meet these requirements.

A significant application of this program is to the problem of compression of speech files for distribution on CD-ROM. This report starts with a description of this domain, then discusses the two main problems associated with general waveform compression, namely predictive modelling and residual coding. This framework is then extended to lossy coding. Finally, the shorten implementation is described and an appendix details the command line options.

2 Compression for speech corpora

One important use for lossless waveform compression is to compress speech corpora for distribution on CD-ROM. State-of-the-art speech recognition systems require gigabytes of acoustic data for model estimation, which takes many CD-ROMs to store. Use of compression software reduces both the distribution cost and the number of CD-ROM changes required to read the complete data set.

The key factors in the design of compression software for speech corpora are that there must be no perceptual degradation in the speech signal and that the decompression routine must be fast and portable.

There has been much research into efficient speech coding techniques and many standards have been established. However, most of this work has been for telephony applications where dedicated hardware can be used to perform the coding and where it is important that the resulting system operates at a well defined bit rate. In such applications lossy coding is acceptable, and indeed necessary in order to guarantee that the system operates at the fixed bit rate.

Similarly, there has been much work on the design of general purpose lossless compressors for workstation use. Such systems do not guarantee any compression for an arbitrary file, but in general achieve worthwhile compression in reasonable time on general purpose computers.

Speech corpora compression needs some features of both systems. Lossless compression is an advantage as it guarantees there is no perceptual degradation in the speech signal. However, the established compression utilities do not exploit the known structure of the speech signal. Hence shorten was written to fill this gap and is now in use in the distribution of CD-ROMs containing speech databases.

The recordings used as examples in the sections that follow are from the TIMIT corpus, which is distributed as 16 bit, 16 kHz linear PCM samples. This format is in common use for continuous speech recognition research corpora. The recordings were collected using a Sennheiser HMD 414 noise-cancelling head-mounted
microphone in low noise conditions. All ten utterances from speaker fcjf0 are used, which amount to a total of [...] seconds or about [...] samples.

3 Waveform Modeling

Compression is achieved by building a predictive model of the waveform (a good introduction for speech is Jayant and Noll). An established model for a wide variety of waveforms is the autoregressive model, also known as linear predictive coding (LPC). Here the predicted waveform is a linear combination of past samples:

    \hat{s}(t) = \sum_{i=1}^{p} a_i \, s(t-i)    (1)

The coded signal, e(t), is the difference between the speech signal, s(t), and the estimate of the linear predictor, \hat{s}(t):

    e(t) = s(t) - \hat{s}(t)    (2)

However, many waveforms of interest are not stationary, that is, the best values for the coefficients of the predictor, a_i, vary from one section of the waveform to another. It is often reasonable to assume that the signal is pseudo-stationary, i.e. there exists a time span over which reasonable values for the linear predictor can be found. Thus the three main stages in the coding process are blocking, predictive modelling, and residual coding.

3.1 Blocking

The time frame over which samples are blocked depends to some extent on the nature of the signal. It is inefficient to block on too short a time scale as this incurs an overhead in the computation and transmission of the prediction parameters. It is also inefficient to use a time scale over which the signal characteristics change appreciably as this will result in a poorer model of the signal. However, in the implementation described below the linear predictor parameters typically take much less information to transmit than the residual signal, so the choice of window length is not critical. The default value in the shorten implementation is 256, which results in 16 ms frames for a signal sampled at 16 kHz.

Sample interleaved signals are handled by treating each data stream as independent. Even in cases where there is a known correlation between the streams, such as in stereo audio, the within-channel correlations are often significantly greater than the cross-channel correlations, so for lossless or near-lossless coding the exploitation of this additional correlation only results in small additional gains.

A rectangular window is used in preference to any tapering window as the aim is to model just those samples within the block, not the spectral characteristics of the segment surrounding the block. The window length is longer than the block size by the prediction order, which is typically three samples.

3.2 Linear Prediction

Shorten supports two forms of linear prediction: the standard pth order LPC analysis of equation 1, and a restricted form whereby the coefficients are selected from one of four fixed polynomial predictors.

In the case of the general LPC algorithm, the prediction coefficients, a_i, are quantised in accordance with the same Laplacian distribution that is used for the residual signal and described in the residual coding section. The expected number of bits per coefficient is [...], as this was found to be a good tradeoff between modelling accuracy and model storage. The standard Durbin's algorithm for computing the LPC coefficients from the autocorrelation coefficients is used in an incremental way. On each iteration the mean squared value of the prediction residual is calculated and this is used to compute the expected number of bits needed to code the residual signal. This is added to the number of bits needed to code the prediction coefficients, and the LPC order is selected to minimise the total. As the computation of the autocorrelation coefficients is the most expensive step in this process, the search for the optimal model order is terminated when the last two models have resulted in a higher bit rate. Whilst it is possible to construct signals that defeat this search procedure, in practice for speech signals it has been found that the occasional use of a lower prediction order results in an insignificant increase in the bit rate and has the additional side effect of requiring less compute to decode.

A restrictive form of the linear predictor has been found to be useful. In this case the prediction coefficients are those specified by fitting a polynomial of order p-1 to the last p data points, e.g. a line to the last two points:

    \hat{s}_0(t) = 0                                  (3)
    \hat{s}_1(t) = s(t-1)                             (4)
    \hat{s}_2(t) = 2 s(t-1) - s(t-2)                  (5)
    \hat{s}_3(t) = 3 s(t-1) - 3 s(t-2) + s(t-3)       (6)

Writing e_i(t) as the error signal from the ith polynomial predictor:

    e_0(t) = s(t)                     (7)
    e_1(t) = e_0(t) - e_0(t-1)        (8)
    e_2(t) = e_1(t) - e_1(t-1)        (9)
    e_3(t) = e_2(t) - e_2(t-1)        (10)

As can be seen from equations 7-10, there is an efficient recursive algorithm for computing the set of polynomial prediction residuals. Each residual term is formed by differencing the residual of the previous order. As each term involves only a few integer additions/subtractions, it is possible to compute all predictors and select the best. Moreover, as the sum of absolute values is linearly related to the standard deviation of a Laplacian-distributed residual, this may be used as the basis of predictor selection, and so the whole process is cheap to compute as it involves no multiplications.

Figure 1 shows both forms of prediction for a range of maximum predictor orders. The figure shows that first and second order prediction provides a substantial increase in compression and that higher order predictors provide relatively little improvement. The figure also shows that for this example most of the total compression can be obtained using no prediction, that is, a zeroth-order coder achieved about [...] compression and the best predictor [...]. Hence, for lossless compression it is important not to waste too much compute on the predictor and to perform the residual coding efficiently.
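
The blocking and polynomial prediction steps described above can be illustrated with a minimal Python sketch. This is not the shorten source code: the function names (`polynomial_residuals`, `select_order`, `encode_blocks`, `decode_blocks`) are hypothetical, the history before the first block is assumed to be zero, and the residuals are returned as plain integers rather than being entropy coded. The sketch only shows that the four residual signals follow from repeated differencing, that the order can be chosen from sums of absolute values, and that the original samples are recovered exactly.

```python
from typing import List, Sequence, Tuple

MAX_ORDER = 3  # orders 0..3 correspond to the four fixed predictors

# Predictor coefficients for orders 0..3; only the decoder needs them.
COEFFS = ((), (1,), (2, -1), (3, -3, 1))


def polynomial_residuals(window: Sequence[int], block_size: int) -> List[List[int]]:
    """Return the residuals e_0..e_3 for the last `block_size` samples.

    `window` is MAX_ORDER history samples followed by the block, so the
    window is longer than the block by the prediction order.
    """
    residuals = [list(window)]  # e_0(t) = s(t)
    for _ in range(MAX_ORDER):
        prev = residuals[-1]
        # e_i(t) = e_{i-1}(t) - e_{i-1}(t-1): one subtraction per sample
        residuals.append([prev[t] - prev[t - 1] for t in range(1, len(prev))])
    # Each pass shortens the list by one; keep only the block's samples.
    return [e[-block_size:] for e in residuals]


def select_order(window: Sequence[int], block_size: int) -> Tuple[int, List[int]]:
    """Pick the order whose residual has the smallest sum of absolute values."""
    candidates = polynomial_residuals(window, block_size)
    costs = [sum(abs(v) for v in e) for e in candidates]
    best = min(range(len(costs)), key=costs.__getitem__)
    return best, candidates[best]


def encode_blocks(samples: Sequence[int], block_size: int = 256) -> List[Tuple[int, List[int]]]:
    """Split one channel into blocks and return (order, residual) per block."""
    history = [0] * MAX_ORDER  # assumption: zero history before the first block
    encoded = []
    for start in range(0, len(samples), block_size):
        block = list(samples[start:start + block_size])
        encoded.append(select_order(history + block, len(block)))
        history = (history + block)[-MAX_ORDER:]
    return encoded


def decode_blocks(encoded: Sequence[Tuple[int, List[int]]]) -> List[int]:
    """Invert encode_blocks: s(t) = e(t) + predicted value from past samples."""
    samples = [0] * MAX_ORDER  # same zero-history assumption as the encoder
    for order, residual in encoded:
        for e in residual:
            prediction = sum(c * samples[-1 - k] for k, c in enumerate(COEFFS[order]))
            samples.append(e + prediction)
    return samples[MAX_ORDER:]
```

For example, `decode_blocks(encode_blocks(samples))` returns the original list for any list of integers. The encoder side uses only additions, subtractions and absolute values; the small constant multiplications appear only in this illustrative decoder.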
