ECW-706 Lecture 9 (Ver. 1.1)

Cepstrum Alanysis

A cepstrum is the result of taking the Fourier transform of the log (decibel) spectrum of a signal, treating that spectrum as if it were itself a signal. Operations on cepstra are labeled quefrency alanysis, liftering, or cepstral analysis. The analysis estimates the rate of change in the different spectrum bands. It was originally invented for characterizing the seismic echoes resulting from earthquakes and bomb explosions, and it has also been used to analyze radar signal returns.

Real Cepstrum

$$c[n] = \frac{1}{2\pi} \int_{-\pi}^{\pi} \log\left|X(e^{j\omega})\right| e^{j\omega n}\, d\omega$$

Complex Cepstrum

$$\hat{x}[n] = \frac{1}{2\pi} \int_{-\pi}^{\pi} \log\left[X(e^{j\omega})\right] e^{j\omega n}\, d\omega$$

The complex cepstrum is the inverse Fourier transform of the complex logarithm of the Fourier transform of the input. The accepted terminology for "cepstrum" is now the inverse Fourier transform of the log of the magnitude of the power spectrum of a signal.

Power cepstrum

The power cepstrum is the squared magnitude of the Fourier transform of the logarithm of the squared magnitude of the Fourier transform of a signal.

The cepstra are related as follows:

real cepstrum = 0.5 * (complex cepstrum + time reversal of complex cepstrum)
phase spectrum = (complex cepstrum - time reversal of complex cepstrum).^2

Applications

The cepstrum is used for voice identification and pitch detection. The independent variable of a cepstral graph is called the quefrency. The quefrency is a measure of time, though not in the sense of a signal in the time domain.

Liftering

A filter that operates on a cepstrum is called a lifter. A low-pass lifter is similar to a low-pass filter in the frequency domain. It is implemented by multiplying the cepstrum by a window in the cepstral (quefrency) domain; when the result is converted back to the time domain, it gives a smoother signal.

Convolution

A key property of the cepstral domain is that the convolution of two signals can be expressed as the addition of their cepstra. Consider a sequence

$$x[n] = x_1[n] * x_2[n]$$

$$X(z) = X_1(z) \cdot X_2(z)$$

$$\hat{X}(z) = \log[X(z)] = \log[X_1(z)] + \log[X_2(z)]$$

The z-transform is in general a complex quantity, so on the unit circle

$$\hat{X}(e^{j\omega}) = \log[X(e^{j\omega})] = \log\left|X(e^{j\omega})\right| + j \arg[X(e^{j\omega})]$$

A problem of uniqueness arises in defining the imaginary part, because the phase $\arg[X(e^{j\omega})]$ is determined only up to integer multiples of $2\pi$.
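As a rough numerical illustration of the definitions and the convolution property above, the following sketch approximates the real and complex cepstra with the DFT in NumPy. The function names, the two short test sequences, and the FFT length are assumptions chosen for this example and do not come from the lecture. Phase unwrapping with `np.unwrap` is used here as one simple way to choose a continuous phase; it is adequate for these minimum-phase test sequences but is not a general solution to the uniqueness problem.

```python
import numpy as np

def real_cepstrum(x, nfft):
    """DFT approximation of c[n]: inverse DFT of log|X|."""
    spectrum = np.fft.fft(x, nfft)
    return np.fft.ifft(np.log(np.abs(spectrum))).real

def complex_cepstrum(x, nfft):
    """DFT approximation of the complex cepstrum: inverse DFT of log|X| + j*arg(X).

    The phase is unwrapped so that it is continuous in frequency
    (an assumption that suffices for the sequences used below).
    """
    spectrum = np.fft.fft(x, nfft)
    phase = np.unwrap(np.angle(spectrum))
    return np.fft.ifft(np.log(np.abs(spectrum)) + 1j * phase).real

# Two hypothetical minimum-phase test sequences and their convolution.
x1 = np.array([1.0, 0.5, 0.25])
x2 = np.array([1.0, -0.3])
x = np.convolve(x1, x2)

nfft = 64  # long enough that X[k] = X1[k] * X2[k] holds exactly on the DFT grid

# Convolution in time becomes addition in the cepstral domain.
c_sum = real_cepstrum(x1, nfft) + real_cepstrum(x2, nfft)
c_conv = real_cepstrum(x, nfft)
max_error = np.max(np.abs(c_conv - c_sum))

# The real cepstrum is the even part of the complex cepstrum:
# 0.5 * (complex cepstrum + its (circular) time reversal).
xhat = complex_cepstrum(x, nfft)
xhat_reversed = np.roll(xhat[::-1], 1)  # xhat[-n] with circular indexing
even_part = 0.5 * (xhat + xhat_reversed)

print("max |c_conv - (c1 + c2)| =", max_error)
print("max |c - even part of xhat| =", np.max(np.abs(c_conv - even_part)))
```

Both printed differences should be at the level of floating-point rounding. The time reversal is circular because the DFT-based cepstrum stores negative quefrencies at the end of the array.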
Examples of z-transform based cepstrum alanysis

Power series expansion of $\log(1 - x)$:

$$\log(1 - x) = -\sum_{n=1}^{\infty} \frac{x^n}{n}, \quad |x| < 1$$

Example 1

$$x_1[n] = a^n u[n]$$

$$X_1(z) = \frac{1}{1 - a z^{-1}}, \quad |z| > |a|$$

$$\hat{X}_1(z) = \log[X_1(z)] = -\log(1 - a z^{-1}) = \sum_{n=1}^{\infty} \frac{a^n}{n} z^{-n}$$

$$\hat{x}_1[n] = \frac{a^n}{n} u[n - 1]$$

Example 2

$$x_2[n] = \delta[n] + b\,\delta[n + 1]$$

$$X_2(z) = 1 + b z$$

$$\hat{X}_2(z) = \log[X_2(z)] = \log(1 + b z) = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} b^n z^n, \quad |bz| < 1$$

Solving for $\hat{x}_2[n]$: the coefficient of $z^{n}$ in the series belongs to time index $-n$, so the cepstrum is nonzero only for negative time,

$$\hat{x}_2[n] = \frac{(-1)^{n+1}}{|n|} b^{|n|} u[-n - 1]$$

Cepstrum of speech

A short segment of voiced speech can be thought of as

$$s[n] = p[n] * g[n] * v[n] * r[n]$$

where

$p[n]$ is a periodic train of impulses with a period of $N_p$ samples,
$g[n]$ is the glottal wave shape,
$v[n]$ is the vocal tract impulse response,
$r[n]$ is the radiation impulse response.

Similarly, a short segment of unvoiced speech can be thought of as

$$s[n] = u[n] * v[n] * r[n]$$

where $u[n]$ is a random noise excitation. Grouping the terms,

$$s[n] = p[n] * g[n] * v[n] * r[n] = p[n] * h_v[n] \quad \text{(voiced)}$$

$$s[n] = u[n] * v[n] * r[n] = u[n] * h_u[n] \quad \text{(unvoiced)}$$

The corresponding transfer functions are

$$H_v(z) = G(z) V(z) R(z) \quad \text{(voiced)}$$

$$H_u(z) = V(z) R(z) \quad \text{(unvoiced)}$$

The vocal tract transfer function is of the form

$$V(z) = \frac{A z^{-M} \prod_{k=1}^{M_i} (1 - a_k z^{-1}) \prod_{k=1}^{M_o} (1 - b_k z^{-1})}{\prod_{k=1}^{N_i} (1 - c_k z^{-1})}$$

For voiced speech, except nasals, an adequate model includes only poles, i.e., $a_k = 0$, $b_k = 0$ for all $k$. For nasals and for unvoiced speech, it is necessary to include both poles and zeros. For stability, all the poles $c_k$ must lie inside the unit circle. Since $v[n]$ is real, all complex poles and zeros must occur in complex conjugate pairs.

The radiation effects can be roughly modeled by

$$R(z) = 1 - z^{-1}$$

and the glottal pulse transfer function has the form

$$G(z) = B \prod_{k=1}^{L_i} (1 - \alpha_k z^{-1}) \prod_{k=1}^{L_o} (1 - \beta_k z^{-1})$$

[Figure: complex cepstrum of voiced speech.] Notice the peaks at both positive and negative times equal to the pitch period. Compare this with the complex cepstrum of unvoiced speech, which does not display any sharp peak. This is due to the fact that the excitation is random.
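The two z-transform examples in this section (Example 1 and Example 2) can be checked numerically. The sketch below samples each z-transform on the unit circle, takes the complex logarithm, and inverts it with the DFT. The values of `a`, `b`, and the grid size `N` are assumptions chosen for illustration. For Example 2 the comparison is written in terms of m = -n, so the expected value at time index -m is (-1)^(m+1) b^m / m, which is the coefficient of z^m in the series.

```python
import numpy as np

N = 1024                           # DFT grid size (assumed; large enough that aliasing is negligible)
w = 2 * np.pi * np.arange(N) / N   # frequencies on the unit circle
a, b = 0.6, 0.5                    # hypothetical values with |a| < 1 and |b| < 1

# Example 1: x1[n] = a^n u[n], X1(z) = 1 / (1 - a z^-1).
X1 = 1.0 / (1.0 - a * np.exp(-1j * w))
xhat1 = np.fft.ifft(np.log(X1)).real        # complex cepstrum samples
n = np.arange(1, 20)
expected1 = a**n / n                        # a^n / n for n >= 1

# Example 2: x2[n] = delta[n] + b delta[n+1], X2(z) = 1 + b z.
X2 = 1.0 + b * np.exp(1j * w)
xhat2 = np.fft.ifft(np.log(X2)).real
m = np.arange(1, 20)
expected2 = (-1.0) ** (m + 1) * b**m / m    # value at time index n = -m
computed2 = xhat2[N - m]                    # DFT index N - m holds time index -m

print("Example 1 max error:", np.max(np.abs(xhat1[n] - expected1)))
print("Example 2 max error:", np.max(np.abs(computed2 - expected2)))
```

The principal value of the complex logarithm is sufficient here because the real parts of `1 - a e^{-jw}` and `1 + b e^{jw}` stay positive for |a| < 1 and |b| < 1, so the phase never wraps. Example 1 yields a cepstrum that is zero for n <= 0, and Example 2 yields one that is zero for n >= 0.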
An appropriate cepstrum window is applied to the cepstrum: either a low-quefrency window to extract the vocal tract log magnitude, or a high-quefrency window to extract the excitation information. A discrete Fourier transform of the windowed cepstrum then gives the low-quefrency liftered signal (or the high-quefrency liftered signal).

[Figure: the low-pass liftered and transformed log magnitude signal superimposed on the log magnitude signal.]

Although both the autocorrelation function and the cepstrum show distinct, strong peaks at the pitch period (at about 110 samples), the cepstral peak is more pronounced and less subject to variations. Hence, a pitch detector based on cepstral processing is often more robust and reliable.
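The liftering and pitch detection steps described above can be sketched on a synthetic voiced segment. Everything specific in this example is an assumption made for illustration: the sampling rate, the 80-sample pitch period, the two resonances used for an all-pole vocal tract, the Hamming analysis window, the search range, and the lifter cutoff. The glottal and radiation terms of the speech model are omitted for brevity, so this is a simplified stand-in for real speech rather than the processing used to produce the lecture's figures.

```python
import numpy as np

def all_pole_impulse_response(poles, length):
    """Impulse response of 1 / prod(1 - c_k z^-1), computed by direct recursion."""
    a = np.real(np.poly(poles))            # denominator coefficients, a[0] = 1
    h = np.zeros(length)
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0       # unit impulse input
        for k in range(1, min(n, len(a) - 1) + 1):
            acc -= a[k] * h[n - k]
        h[n] = acc
    return h

fs = 8000.0                                # sampling rate in Hz (assumed)
pitch_period = 80                          # samples, i.e. 100 Hz (assumed)

# Hypothetical vocal tract: two resonances, each a complex-conjugate pole pair
# inside the unit circle.
poles = []
for freq_hz, radius in [(500.0, 0.95), (1500.0, 0.93)]:
    c = radius * np.exp(2j * np.pi * freq_hz / fs)
    poles += [c, np.conj(c)]

# Voiced excitation p[n]: periodic impulse train, filtered by the vocal tract.
excitation = np.zeros(2000)
excitation[::pitch_period] = 1.0
h_v = all_pole_impulse_response(poles, 400)
speech = np.convolve(excitation, h_v)[:2000]

# Short-time analysis frame (steady-state part), Hamming windowed.
frame = speech[800:1200] * np.hamming(400)
nfft = 1024
log_mag = np.log(np.abs(np.fft.fft(frame, nfft)) + 1e-12)
cepstrum = np.fft.ifft(log_mag).real       # real cepstrum of the frame

# Pitch detection: largest cepstral peak in a plausible quefrency range.
lo, hi = 40, 200
estimated_period = lo + int(np.argmax(cepstrum[lo:hi]))

# Low-quefrency (low-pass) lifter: keep |n| < cutoff, then transform back
# to obtain a smoothed log magnitude spectrum (the vocal tract envelope).
cutoff = 30
lifter = np.zeros(nfft)
lifter[:cutoff] = 1.0
lifter[-(cutoff - 1):] = 1.0               # negative quefrencies -1 ... -(cutoff - 1)
envelope = np.fft.fft(cepstrum * lifter)
smooth_log_mag = envelope.real

print("estimated pitch period (samples):", estimated_period)
```

Plotting `smooth_log_mag` over `log_mag` gives a picture of the kind described above, with the liftered curve following the resonances while the harmonic ripple is removed. A high-quefrency lifter would instead use `1 - lifter` to isolate the excitation.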