Linear Predictive Coding (LPC)-Lattice Methods, Applications


Digital Speech Processing, Lecture 14

Prediction Error Signal

1. Speech Production Model:
\[ s(n) = \sum_{k=1}^{p} a_k\, s(n-k) + G\,u(n) \]
\[ H(z) = \frac{S(z)}{U(z)} = \frac{G}{1 - \sum_{k=1}^{p} a_k z^{-k}} \]
[Block diagram: $G\,u(n)$ -> $H(z)$ -> $s(n)$]

2. LPC Model:
\[ e(n) = s(n) - \tilde{s}(n) = s(n) - \sum_{k=1}^{p} \alpha_k\, s(n-k) \]
\[ A(z) = \frac{E(z)}{S(z)} = 1 - \sum_{k=1}^{p} \alpha_k z^{-k} \]
[Block diagram: $s(n)$ -> $A(z)$ -> $e(n)$]

3. LPC Error Model:
\[ \frac{1}{A(z)} = \frac{S(z)}{E(z)} = \frac{1}{1 - \sum_{k=1}^{p} \alpha_k z^{-k}} \]
\[ s(n) = e(n) + \sum_{k=1}^{p} \alpha_k\, s(n-k) \]
[Block diagram: $e(n)$ -> $1/A(z)$ -> $s(n)$]

Perfect reconstruction even if $a_k$ is not equal to $\alpha_k$.

Lattice Formulations of LP

• both covariance and autocorrelation methods use two-step solutions
  1. computation of a matrix of correlation values
  2. efficient solution of a set of linear equations
• another class of LP methods, called lattice methods, has evolved in which the two steps are combined into a recursive algorithm for determining LP parameters
• begin with the Durbin algorithm: at the $i$-th stage the set of coefficients $\{\alpha_j^{(i)},\ j = 1, 2, \ldots, i\}$ are the coefficients of the $i$-th order optimum LP
• define the system function of the $i$-th order inverse filter (prediction error filter) as
\[ A^{(i)}(z) = 1 - \sum_{k=1}^{i} \alpha_k^{(i)} z^{-k} \]
• if the input to this filter is the input segment $s_{\hat{n}}(m) = s(\hat{n}+m)\,w(m)$, with output $e_{\hat{n}}^{(i)}(m) = e^{(i)}(\hat{n}+m)$, then
\[ e^{(i)}(m) = s(m) - \sum_{k=1}^{i} \alpha_k^{(i)}\, s(m-k) \]
• where we have dropped the subscript $\hat{n}$, the absolute location of the signal
• the z-transform gives
\[ E^{(i)}(z) = A^{(i)}(z)\,S(z) = \left(1 - \sum_{k=1}^{i} \alpha_k^{(i)} z^{-k}\right) S(z) \]
• using the steps of the Durbin recursion ($\alpha_j^{(i)} = \alpha_j^{(i-1)} - k_i\,\alpha_{i-j}^{(i-1)}$, and $\alpha_i^{(i)} = k_i$) we can obtain a recurrence formula for $A^{(i)}(z)$ of the form
\[ A^{(i)}(z) = A^{(i-1)}(z) - k_i\, z^{-i} A^{(i-1)}(z^{-1}) \]
• giving for the error transform the expression
\[ E^{(i)}(z) = A^{(i-1)}(z)\,S(z) - k_i\, z^{-i} A^{(i-1)}(z^{-1})\,S(z) \]
\[ e^{(i)}(m) = e^{(i-1)}(m) - k_i\, b^{(i-1)}(m-1) \]
• where we can interpret the first term as the z-transform of the forward prediction error for an $(i-1)$st order predictor, and the second term can be similarly interpreted based on defining a backward prediction error
\[ B^{(i)}(z) = z^{-i} A^{(i)}(z^{-1})\,S(z) = z^{-i} A^{(i-1)}(z^{-1})\,S(z) - k_i\, A^{(i-1)}(z)\,S(z) = z^{-1} B^{(i-1)}(z) - k_i\, E^{(i-1)}(z) \]
• with inverse transform
\[ b^{(i)}(m) = s(m-i) - \sum_{k=1}^{i} \alpha_k^{(i)}\, s(m+k-i) = b^{(i-1)}(m-1) - k_i\, e^{(i-1)}(m) \]
• with the interpretation that we are attempting to predict $s(m-i)$ from the $i$ samples of the input that follow $s(m-i)$ => we are doing a backward prediction, and $b^{(i)}(m)$ is called the backward prediction error sequence

[Figure: the same set of samples is used to forward predict $s(m)$ and backward predict $s(m-i)$]

• the prediction error transform and sequence $E^{(i)}(z)$, $e^{(i)}(m)$ can now be expressed in terms of forward and backward errors, namely
\[ E^{(i)}(z) = E^{(i-1)}(z) - k_i\, z^{-1} B^{(i-1)}(z) \]
\[ e^{(i)}(m) = e^{(i-1)}(m) - k_i\, b^{(i-1)}(m-1) \tag{*1} \]
• similarly we can derive an expression for the backward error transform and sequence at sample $m$ of the form
\[ B^{(i)}(z) = z^{-1} B^{(i-1)}(z) - k_i\, E^{(i-1)}(z) \]
\[ b^{(i)}(m) = b^{(i-1)}(m-1) - k_i\, e^{(i-1)}(m) \tag{*2} \]
• these two equations define the forward and backward prediction errors for an $i$-th order predictor in terms of the corresponding prediction errors of an $(i-1)$th order predictor, with the reminder that a zeroth order predictor does no prediction, so
\[ e^{(0)}(m) = b^{(0)}(m) = s(m), \quad 0 \le m \le L-1 \tag{*3} \]
\[ E(z) = E^{(p)}(z) = A(z)\,S(z) \]

Assume we know $k_1$ (from external computation):
1. we compute $e^{(1)}[m]$ and $b^{(1)}[m]$ from $s[m]$, $0 \le m \le L$, using Eqs. *1 and *2
2. we next compute $e^{(2)}[m]$ and $b^{(2)}[m]$ for $0 \le m \le L+1$ using Eqs. *1 and *2
3. extend the solution (lattice) to $p$ sections, giving $e^{(p)}[m]$ and $b^{(p)}[m]$ for $0 \le m \le L+p-1$
4. the solution is $e[n] = e^{(p)}[n]$ at the output of the $p$-th lattice section

• lattice network with $p$ sections, the output of which is the forward prediction error
• digital network implementation of the prediction error filter with transfer function $A(z)$
• no direct correlations
• no alphas
• k's computed from forward and backward error signals

Lattice Filter for A(z)

\[ e^{(0)}[n] = b^{(0)}[n] = s[n], \quad 0 \le n \le L-1 \]
\[ e^{(i)}[n] = e^{(i-1)}[n] - k_i\, b^{(i-1)}[n-1], \quad i = 1, 2, \ldots, p, \quad 0 \le n \le L-1+i \]
\[ b^{(i)}[n] = -k_i\, e^{(i-1)}[n] + b^{(i-1)}[n-1], \quad i = 1, 2, \ldots, p, \quad 0 \le n \le L-1+i \]
\[ e[n] = e^{(p)}[n], \quad 0 \le n \le L-1+p \]
\[ E(z) = A(z)\,S(z) \]

All-Pole Lattice Filter for H(z)

[Figure: all-pole lattice network with input $e[n] = e^{(p)}[n]$, sections $k_p, k_{p-1}, \ldots, k_1$ with cross branches $-k_p, -k_{p-1}, \ldots, -k_1$ and unit delays $z^{-1}$, internal signals $e^{(i)}[n]$ and $b^{(i)}[n]$, and output $s[n] = e^{(0)}[n] = b^{(0)}[n]$]

\[ e^{(p)}[n] = e[n], \quad 0 \le n \le L-1+p \]
\[ e^{(i-1)}[n] = e^{(i)}[n] + k_i\, b^{(i-1)}[n-1], \quad i = p, p-1, \ldots, 1, \quad 0 \le n \le L-1+i-1 \]
\[ b^{(i)}[n] = -k_i\, e^{(i-1)}[n] + b^{(i-1)}[n-1], \quad i = p, p-1, \ldots, 1, \quad 0 \le n \le L-1+i \]
\[ s[n] = e^{(0)}[n] = b^{(0)}[n], \quad 0 \le n \le L-1 \]
\[ S(z) = \frac{E(z)}{A(z)} \]

1. since $b^{(i)}[-1] = 0$ for all $i$, we can first solve for $e^{(i-1)}[0]$ for $i = p, p-1, \ldots, 1$, using the relationship $e^{(i-1)}[0] = e^{(i)}[0]$
2. since $b^{(0)}[0] = e^{(0)}[0]$, we can then solve for $b^{(i)}[0]$ for $i = 1, 2, \ldots, p$ using the equation $b^{(i)}[0] = -k_i\, e^{(i-1)}[0]$
3. we can now begin to solve for $e^{(i-1)}[1]$ as $e^{(i-1)}[1] = e^{(i)}[1] + k_i\, b^{(i-1)}[0]$, $i = p, p-1, \ldots, 1$
4. we set $b^{(0)}[1] = e^{(0)}[1]$, and we can then solve for $b^{(i)}[1]$ using the equation $b^{(i)}[1] = -k_i\, e^{(i-1)}[1] + b^{(i-1)}[0]$, $i = 1, 2, \ldots, p$
5. we iterate for $n = 2, 3, \ldots, N-1$ and end up with $s[n] = e^{(0)}[n] = b^{(0)}[n]$

Lattice Formulations of LP

• the lattice structure comes directly out of the Durbin algorithm
• the $k_i$ parameters are obtained from the Durbin equations
• the predictor coefficients, $\alpha_k$, do not appear explicitly in the lattice structure
• can relate the $k_i$ parameters to the forward and backward errors via
\[ k_i = \frac{\displaystyle\sum_{m=0}^{L-1+i} e^{(i-1)}(m)\, b^{(i-1)}(m-1)}{\left\{ \left[ \displaystyle\sum_{m=0}^{L-1+i} \left[e^{(i-1)}(m)\right]^2 \right] \left[ \displaystyle\sum_{m=0}^{L-1+i} \left[b^{(i-1)}(m-1)\right]^2 \right] \right\}^{1/2}} \tag{*4} \]
• where $k_i$ is a normalized cross correlation between the forward and backward prediction errors, and is therefore called a partial correlation or PARCOR coefficient
• can compute predictor coefficients recursively from the PARCOR coefficients

Direct Computation of k Parameters

• assume $s[n]$ is non-zero for $0 \le n \le L-1$
• assume $k_i$ is chosen to minimize the total energy of the forward (or backward) prediction errors
• we can then minimize the forward prediction error as:
\[ E_{\text{forward}}^{(i)} = \sum_{m=0}^{L-1+i} \left[e^{(i)}(m)\right]^2 = \sum_{m=0}^{L-1+i} \left[e^{(i-1)}(m) - k_i\, b^{(i-1)}(m-1)\right]^2 \]
\[ \frac{\partial E_{\text{forward}}^{(i)}}{\partial k_i} = 0 = -2 \sum_{m=0}^{L-1+i} \left[e^{(i-1)}(m) - k_i\, b^{(i-1)}(m-1)\right] b^{(i-1)}(m-1) \]
\[ k_i^{\text{forward}} = \frac{\displaystyle\sum_{m=0}^{L-1+i} e^{(i-1)}(m)\, b^{(i-1)}(m-1)}{\displaystyle\sum_{m=0}^{L-1+i} \left[b^{(i-1)}(m-1)\right]^2} \]
• we can also choose to minimize the backward prediction error
\[ E_{\text{backward}}^{(i)} = \sum_{m=0}^{L-1+i} \left[b^{(i)}(m)\right]^2 = \sum_{m=0}^{L-1+i} \left[-k_i\, e^{(i-1)}(m) + b^{(i-1)}(m-1)\right]^2 \]
\[ \frac{\partial E_{\text{backward}}^{(i)}}{\partial k_i} = 0 = -2 \sum_{m=0}^{L-1+i} \left[-k_i\, e^{(i-1)}(m) + b^{(i-1)}(m-1)\right] e^{(i-1)}(m) \]
\[ k_i^{\text{backward}} = \frac{\displaystyle\sum_{m=0}^{L-1+i} e^{(i-1)}(m)\, b^{(i-1)}(m-1)}{\displaystyle\sum_{m=0}^{L-1+i} \left[e^{(i-1)}(m)\right]^2} \]
• if we window and sum over all time, then
\[ \sum_{m=0}^{L-1+i} \left[e^{(i-1)}(m)\right]^2 = \sum_{m=0}^{L-1+i} \left[b^{(i-1)}(m-1)\right]^2 \]
therefore
\[ k_i^{\text{PARCOR}} = \sqrt{k_i^{\text{forward}}\, k_i^{\text{backward}}} = k_i^{\text{forward}} = k_i^{\text{backward}} = \frac{\displaystyle\sum_{m=0}^{L-1+i} e^{(i-1)}(m)\, b^{(i-1)}(m-1)}{\left\{ \displaystyle\sum_{m=0}^{L-1+i} \left[e^{(i-1)}(m)\right]^2 \sum_{m=0}^{L-1+i} \left[b^{(i-1)}(m-1)\right]^2 \right\}^{1/2}} \]
• minimize the sum of forward and backward prediction errors over a fixed interval (covariance method)
\[ E_{\text{Burg}}^{(i)} = \sum_{m=0}^{L-1} \left\{ \left[e^{(i)}(m)\right]^2 + \left[b^{(i)}(m)\right]^2 \right\} \]
\[ = \sum_{m=0}^{L-1} \left[e^{(i-1)}(m) - k_i\, b^{(i-1)}(m-1)\right]^2 + \sum_{m=0}^{L-1} \left[-k_i\, e^{(i-1)}(m) + b^{(i-1)}(m-1)\right]^2 \]
\[ \frac{\partial E_{\text{Burg}}^{(i)}}{\partial k_i} = 0 = -2 \sum_{m=0}^{L-1} \left[e^{(i-1)}(m) - k_i\, b^{(i-1)}(m-1)\right] b^{(i-1)}(m-1) - 2 \sum_{m=0}^{L-1} \left[-k_i\, e^{(i-1)}(m) + b^{(i-1)}(m-1)\right] e^{(i-1)}(m) \]
\[ k_i^{\text{Burg}} = \frac{2 \displaystyle\sum_{m=0}^{L-1} e^{(i-1)}(m)\, b^{(i-1)}(m-1)}{\displaystyle\sum_{m=0}^{L-1} \left[e^{(i-1)}(m)\right]^2 + \sum_{m=0}^{L-1} \left[b^{(i-1)}(m-1)\right]^2} \]
• $-1 \le k_i^{\text{Burg}} \le 1$ always

Comparison of Autocorrelation and Burg Spectra

• significantly less smearing of formant peaks using the Burg method

[Figure: autocorrelation-method and Burg-method spectra]

Summary of Lattice Procedure

• the steps involved in determining the predictor coefficients and the k parameters for the lattice method are as follows
  1. initial condition, $e^{(0)}(m) = s(m) = b^{(0)}(m)$, from Eq. *3
  2. compute $k_1 = \alpha_1^{(1)}$ from Eq. *4
  3. determine the forward and backward prediction errors $e^{(1)}(m)$, $b^{(1)}(m)$ from Eqs. *1 and *2
  4. set $i = 2$
  5. determine $k_i = \alpha_i^{(i)}$ from Eq. *4
  6. determine $\alpha_j^{(i)}$ for $j = 1, 2, \ldots, i-1$ from the Durbin iteration
  7. determine $e^{(i)}(m)$ and $b^{(i)}(m)$ from Eqs. *1 and *2
  8. set $i = i + 1$
  9. if $i$ is less than or equal to $p$, go to step 5
  10. procedure is terminated
• predictor coefficients are obtained directly from speech samples => without calculation of the autocorrelation function
• method is guaranteed to yield stable filters ($|k_i| < 1$) without using a window

[Figure: lattice analysis network with input $s[n]$, forward signals $e^{(0)}[n], e^{(1)}[n], e^{(2)}[n], \ldots, e^{(p)}[n] = e[n]$, backward signals $b^{(0)}[n], b^{(1)}[n], b^{(2)}[n], \ldots, b^{(p)}[n]$, cross branches $-k_1, -k_2, \ldots, -k_p$ and unit delays $z^{-1}$, annotated with Eqs. *1 to *4]
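The lattice procedure summarized above can be sketched in a few lines of code. What follows is a minimal illustration rather than the lecture's own implementation: the function names `burg_lattice`, `step_up` and `lattice_synthesis` are hypothetical, the signal is a plain Python list, all errors are kept on the fixed interval $0 \le m \le L-1$ with $b^{(i)}(-1) = 0$, and $k_i$ is taken from the Burg formula instead of Eq. *4.

```python
def burg_lattice(s, p):
    """Analysis lattice: return ([k_1..k_p], e^(p)) for the samples in s.

    Assumption for this sketch: errors are only kept for 0 <= m <= L-1
    and b^(i)(-1) = 0, i.e. the fixed-interval (Burg) variant.
    """
    e = list(s)  # e^(0)(m) = s(m)   (Eq. *3)
    b = list(s)  # b^(0)(m) = s(m)   (Eq. *3)
    k = []
    for _ in range(p):
        b_del = [0.0] + b[:-1]  # b^(i-1)(m-1)
        num = 2.0 * sum(x * y for x, y in zip(e, b_del))
        den = sum(x * x for x in e) + sum(y * y for y in b_del)
        ki = num / den if den > 0.0 else 0.0  # Burg estimate of k_i
        # Eqs. *1 and *2; both right-hand sides use the order-(i-1) errors
        e, b = ([x - ki * y for x, y in zip(e, b_del)],
                [y - ki * x for x, y in zip(e, b_del)])
        k.append(ki)
    return k, e


def step_up(k):
    """Durbin step-up: PARCOR coefficients -> alpha_1..alpha_p."""
    alpha = []
    for i, ki in enumerate(k, start=1):
        prev = alpha  # alpha^(i-1), stored 0-based
        # alpha_j^(i) = alpha_j^(i-1) - k_i * alpha_{i-j}^(i-1), then alpha_i^(i) = k_i
        alpha = [prev[j] - ki * prev[i - 2 - j] for j in range(i - 1)] + [ki]
    return alpha


def lattice_synthesis(e, k):
    """All-pole lattice: rebuild s[n] from e^(p)[n] and k_1..k_p (p >= 1)."""
    p = len(k)
    b_prev = [0.0] * p  # b^(0..p-1)[n-1], zero initial state
    out = []
    for x in e:
        f = [0.0] * (p + 1)
        f[p] = x  # e^(p)[n]
        for i in range(p, 0, -1):
            f[i - 1] = f[i] + k[i - 1] * b_prev[i - 1]
        b_new = [f[0]]  # b^(0)[n] = e^(0)[n] = s[n]
        for i in range(1, p):
            b_new.append(b_prev[i - 1] - k[i - 1] * f[i - 1])
        b_prev = b_new
        out.append(f[0])
    return out
```

Calling `k, e = burg_lattice(s, p)` and then `lattice_synthesis(e, k)` should return the original samples up to rounding, which is the perfect-reconstruction property of the LPC error model. `step_up(k)` gives the direct-form coefficients $\alpha_k$, which the lattice itself never needs.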