Model-Driven Time-Varying Signal Analysis and Its Application to Speech Processing

Model-driven Time-varying Signal Analysis and its Application to Speech Processing by Steven Sandoval A Dissertation Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy Approved March 2016 by the Graduate Supervisory Committee: Antonia Papandreou-Suppappola, Chair Julie Liss Pavan Turaga Narayan Kovvali ARIZONA STATE UNIVERSITY May 2016 ABSTRACT This work examines two main areas in model-based time-varying signal processing with emphasis in speech processing applications. The first area concentrates on improving speech intelligibility and on increasing the proposed methodologies application for clinical practice in speech-language pathology. The second area concentrates on signal expansions matched to physical-based models but without requiring independent basis functions; the significance of this work is demonstrated with speech vowels. A fully automated Vowel Space Area (VSA) computation method is proposed that can be applied to any type of speech. It is shown that the VSA provides an efficient and reliable measure and is correlated to speech intelligibility. A clinical tool that incorporates the automated VSA was proposed for evaluation and treatment to be used by speech language pathologists. Two exploratory studies are performed using two databases by analyzing mean formant trajectories in healthy speech for a wide range of speakers, dialects, and coarticulation contexts. It is shown that phonemes crowded in formant space can often have distinct trajectories, possibly due to accurate perception. A theory for analyzing time-varying signals models with amplitude modulation and frequency modulation is developed. Examples are provided that demonstrate other possible signal model decompositions with independent basis functions and corresponding physical interpretations. The Hilbert transform (HT) and the use of the analytic form of a signal are motivated, and a proof is provided to show that a signal can still preserve desirable mathematical properties without the use of the HT. A visualization of the Hilbert spectrum is proposed to aid in the interpretation. A signal demodulation is proposed and used to develop a modified Empirical Mode Decomposition (EMD) algorithm. i TABLE OF CONTENTS Page LIST OF TABLES . viii LIST OF FIGURES . xi LIST OF ABBREVIATIONS AND ACRONYMS . .xxii CHAPTER 1 INTRODUCTION AND WORK MOTIVATION . 1 1.1 Evaluation of Speech . 1 1.2 Mean Formant Trajectories . 2 1.3 Hilbert Spectral Analysis . 3 1.4 Contributions in the Evaluation of Speech . 5 1.5 Contributions in Hilbert Spectral Analysis . 6 1.6 Report Organization . 7 2 SPEECH QUALITY AND SPEECH INTELLIGIBILITY . 9 2.1 The Evaluation of Speech Quality . 9 2.1.1 Background on Subjective Quality Measures: Mean Opinion Scores . 9 2.1.2 Objective Quality Measures. 10 2.2 The Evaluation of Speech Intelligibility . 14 2.2.1 Standard Assessments of Speech Intelligibility . 15 2.2.2 Automated Assessments of Speech Intelligibility . 16 2.3 Intelligibility Estimation via Automatic Speech Recognition Systems 16 2.3.1 Cepstral Analysis for Speech Characterization . 17 2.3.2 Cepstrum Computation . 18 2.3.3 Mel-Frequency Cepstrum Computation . 19 2.4 Intelligibility Estimation via Feature Mapping . 20 ii CHAPTER Page 2.4.1 Envelope Modulation Spectrum Feature . 21 2.4.2 Long-Term Average Spectrum Feature . 22 2.4.3 Internal ITU-T P.563 Features . 23 2.4.4 Linear Predictive Coding Features. 24 3 AUTOMATIC ASSESSMENT OF VOWEL SPACE AREA. 26 3.1 Motivation of the Proposed Algorithm for Vowel Space Area Auto- matic Assessment Algorithm . 26 3.2 Automatic Assessment Method . 29 3.2.1 Formant Extraction . 29 3.2.2 Filtering . 30 3.2.3 Clustering . 30 3.2.4 Convex Hull / Area Calculation . 30 3.2.5 Stimuli . 31 3.3 Results and Discussion . 31 3.3.1 Performance Analysis . 32 4 SPEECH ASSIST: AN AUGMENTATIVE TOOL FOR PRACTICE IN SPEECH-LANGUAGE PATHOLOGY . 35 4.1 Motivation Proposed for the Mobile Application Suite: Speech Assist 35 4.2 Speech Assist Methods . 35 4.2.1 Automated Vowel Space Area . 35 4.2.2 Automated Diadochokinetic Rate . 36 4.2.3 Automated Perceptual Ratings. 37 4.3 Conceptual Interface . 38 5 MEAN FORMANT TRAJECTORIES . 41 iii CHAPTER Page 5.1 Motivation of Proposed Method for Quantifying Mean Formant Trajectories . 41 5.2 Analysis Study Using the Hillenbrand Database . 44 5.2.1 Analysis Method to Quantify Mean Formant Trajectories . 44 5.2.2 Results and Discussion . 45 5.3 Analysis Study Using TIMIT Database . 48 5.3.1 Analysis Method to Quantity Mean Formant Trajectories . 48 5.3.2 Results and Discussion . 51 6 LATENT SIGNAL ANALYSIS AND THE ANALYTIC SIGNAL . 74 6.1 Motivation for Proposed Latent Signal Analysis . 74 6.2 Background on Instantaneous Frequency . 75 6.3 Latent Signal Analysis . 77 6.4 Hilbert Transform and Analytic Signal . 81 6.4.1 Vakman's Signal Constraints . 82 6.4.2 Analyticity of the Analytic Signal . 83 6.4.3 Gabor's Method . 84 6.5 Relaxing the Constraint of Harmonic Correspondence . 85 6.5.1 Harmonic Correspondence . 85 6.5.2 Analyticity of the Complex Extension . 86 6.5.3 Harmonic Conjugate Functions . 87 6.5.4 Amplitude Modulation{Frequency Modulation Demodulation 88 6.6 Example of a Latent Signal Analysis Problem . 88 6.6.1 Solution Assuming Harmonic Correspondence . 89 6.6.2 Solution Assuming Constant IF . 90 iv CHAPTER Page 6.6.3 Solution Assuming Constant IA . 91 6.7 Discussion. 91 7 HILBERT SPECTRAL ANALYSIS AND THE AM{FM MODEL . 92 7.1 Motivation for Hilbert Spectral Analysis . 92 7.2 The AM{FM Model . 93 7.2.1 Definitions and Assumptions . 93 7.2.2 Monocomponents and Narrowband Components. 96 7.3 Hilbert Spectral Analysis . 98 7.3.1 Two Conventional Ways to Relate a Real Observation to a Latent Signal . 100 7.3.2 Simple Harmonic Component . 101 7.3.3 Superposition of Simple Harmonic Components . 101 7.3.4 The AM Component . 103 7.3.5 Superposition of AM Components . 103 7.3.6 FM Component . 104 7.3.7 Superposition of FM Components . 105 7.3.8 Other AM{FM Models . 105 7.4 Frequency Domain View of Latent Signal Analysis . 107 7.5 Subtleties of the Hilbert Spectrum . 111 7.6 Examples of the Hilbert Spectral Analysis Problem. 113 7.6.1 Periodic Triangle Waveform Example . 114 7.6.2 Sinusoidal FM Example . 115 7.6.3 Remarks . 116 7.7 Visualization of the Hilbert Spectrum . 118 v CHAPTER Page 7.8 Discussion. 120 8 NUMERICAL HILBERT SPECTRAL ANALYSIS ASSUMING INTRIN- SIC MODE FUNCTIONS . 124 8.1 Motivation for Proposed Numerical Methods for Hilbert Spectral Analysis . ..

Model-Driven Time-Varying Signal Analysis and Its Application to Speech Processing

Prakriya Documentation Release 0.0.7

Vedic Accent and Lexicography

Power of Sanskrit

Python Module Index 9

Vyakarana Documentation Release 0.1

Absence of SUN-Domain Protein Slp1 Blocks Karyogamy and Switches

A Tool for Transliteration of Bilingual Texts Involving Sanskrit

SEL-587 Instruction Manual Is Not Available

The Cologne Sanskrit Lexicon

Vedic Accent and Lexicography

Devanagari Transliteration 1 Devanagari Transliteration

A New Speech Corpus and Modelling Insights Devaraja Adiga1∗ Rishabh Kumar1∗ Amrith Krishna2 [email protected] [email protected] [email protected]