Speech Production Modelling with Particular Reference to English

by Celia Scully

University College London
Department of Phonetics and Linguistics

A thesis submitted for the degree of Doctor of Philosophy in the University of London, 1990

ProQuest Number: 10631075. Published by ProQuest LLC (2017). Copyright of the Dissertation is held by the Author.

ABSTRACT

Many of the complexities of structure in speech signals are related to the processes of speech production. The aim of this study is to develop a better signal model in the form of a computer-implemented composite model of speech production and to apply it to some allophone sequences for British English. The stages of speech production included in the model are: articulation, aerodynamics, derivation of acoustic sources, filtering by the vocal tract acoustic tube and radiation of a sound pressure wave. The aerodynamic processes give interactions between the various acoustic sources and between the sources and filter shapes. As a result, covarying bundles of acoustic pattern features were found in the model’s outputs; these were qualitatively and, in some cases, quantitatively in agreement with corresponding patterns in natural speech. The linguistic, anatomical and acoustic frameworks of the study are set out. Speech production processes are discussed as theory and data in relation to models.
The data are drawn from natural speech production and other sensori-motor skills. The actions of speech are described kinematically. The basic physical principles and equations needed to simulate aerodynamic processes are set out. Different approaches to the acoustic processes of sources and filtering are considered. The conditions needed for the sources are described. The composite model used in this study is described in terms of the basic principles, implementation methods and assessment. The modelling of some phonetic classes relevant to English speech is described. Simulations of some minimal and non-minimal articulatory contrasts, including eight published papers, are presented. A quantitative but flexible time scheme planning framework was developed and used as an input stage for the model. The general conclusions from and limitations of the study are discussed. Future development and work are suggested.

ACKNOWLEDGEMENTS

To undertake the modelling of speech production unaided would be a futile task and I owe debts of gratitude to many colleagues. The development of the computer-implemented model from a few pages of Algol to a large number of interactive graphically oriented programs with many parameters and options was a joint effort with Mr. Ted Allwood. Ted produced good ideas for the forms of representation of the model, implemented them accurately and quickly, contributed greatly to the modelling experiments themselves and assisted with the writing of reports and papers; my debt to him is very great.

I should like to thank colleagues in the Department of Linguistics & Phonetics, University of Leeds, especially Miss Marion Shirt, Dr. Peter Roach, Mr. David Barber and Mrs. Helen Roach, who spoke into masks, made studio recordings and applied their auditory skills to synthetic sounds, often very difficult to describe. The unfailing technical help and interest of Mr. Eric Brearley, Chief Technician of the Department, has been particularly important; also that of members of the Department of Mechanical Engineering and other colleagues in the University of Leeds. I should like to thank particularly Professor Alan De Pennington and Dr. Susan Bloor who saved me from batch processing and gave much encouragement and help in preparing project applications, Dr. Malcolm Bloor who advised us on aerodynamics, especially the mixed model approach, Mr. Jim Swift the Systems Manager for the VAX, Mr. Stuart Allen and Mr. Stan Cail who designed and constructed the digital hardware clock for the model's output, Dr. Gordon Lockhart and Dr. Muhammed Zaid who helped us to design and construct the anti-aliasing filter; and from outside Leeds, Dr. Steve Terepin who advised us on the program for low-pass filtering with down-sampling and Mr. Nicholas Husband who gave us a copy of his computer program for the reflected pressure wave method of filtering.

Of the many scholars outside Leeds from whose advice I have benefitted, I should like to thank, in particular, Professor Gunnar Fant whose book provides so many answers, but who still patiently answered my questions, Professor Ken Stevens who advised me on aerodynamics, Dr. John Holmes who advised and instructed me on many aspects of speech signals and production processes, and Professor Adrian Fourcin, the Supervisor of this work, for his wise advice and profound insights into the nature of speech communication.

The Science and Engineering Research Council (SERC, formerly SRC) supported the model development and modelling, through a series of projects.

This work describes attempts to capture quantitatively and in highly simplified form the behaviour of natural speech. The researchers who have published papers, sent me copies of their data and discussed the issues at conferences and in the journals are far too many to name individually, but I am grateful to them all.
The inadequacies of the model, failures of understanding and actual errors, in this text and in the model, are entirely my responsibility.

Professor Ingo Titze (almost verbatim): "If you are trying to model speech processes you must work as hard as you possibly can; then you may perhaps achieve a fairly good qualitative match with natural speech."

Dedicated to Dr. John C. Scully who helped with his careful proof reading and found phonetics for me in the first place.

CONTENTS

ABSTRACT
ACKNOWLEDGEMENTS
CONTENTS: CHAPTERS I, II, III AND IV
LIST OF FIGURES
LIST OF TABLES
LIST OF APPENDICES: PUBLICATIONS
LIST OF UNITS AND ABBREVIATIONS
LIST OF SYMBOLS AND NOMENCLATURE IN THE MODEL WITH REPRESENTATIVE VALUES FOR PARAMETERS
SPECTROGRAM AXES
VALUES FOR PHYSICAL CONSTANTS
SYMBOLS AND KEY WORDS FOR THE PHONEMES OF AN RP ACCENT OF BRITISH ENGLISH

CHAPTER I: INTRODUCTION AND THE FRAMEWORK FOR THE STUDY
  Introduction: aims and objectives of the study; productive and receptive aspects of speech communication
  I.1 Linguistic framework
    I.1.1 Accent and style of English modelled
    I.1.2 Linguistic background of the real speakers
    I.1.3 Syllable and word structure in RP English
  I.2 Anatomical and physiological framework
  I.3 Acoustic framework
  I.4 Stages of speech production

CHAPTER II: SPEECH PRODUCTION PROCESSES
  Introduction
  II.1 Speech production as skilled sensori-motor activity
    Introduction
    II.1.1 Degrees of freedom, functions and the concept of a schema
    II.1.2 Path dynamics, costs and constraints
    II.1.3 Accuracy and variability
    II.1.4 Concepts applied to speech
    II.1.5 Feedback and feedforward
  II.2 Mechanical properties of the structures
  II.3 Stages of speech production: theory and data in relation to models
    II.3.1 Neural and muscular processes
    II.3.2 Articulation
      Introduction
      II.3.2.1 Configurations of the vocal tract for vowels and diphthongs
      II.3.2.2 Configurations of the vocal tract for consonants
      II.3.2.3 Cross-dimension to cross-section area mapping
      II.3.2.4 Shapes and general properties of movement paths
      II.3.2.5 The articulators and their movement path durations
      II.3.2.6 Timing, coordination and variability
    II.3.3 Aerodynamics
      Introduction
      II.3.3.1 Lung volumes
      II.3.3.2 Use of the system in breathing and speech
      II.3.3.3 Distribution and aerodynamic properties for the air in the respiratory tract
      II.3.3.4 Resistance
      II.3.3.5 Compliance
      II.3.3.6 Inertance
      II.3.3.7 Aerodynamic data for natural speech
    II.3.4 Acoustic sources
      Introduction
      II.3.4.1 Voice: introduction
      II.3.4.2 Theory of voicing mechanisms
      II.3.4.3 Phonation types and registers
      II.3.4.4 Glottal area, glottal volume flowrate and vocal fold contact area during voicing
      II.3.4.5 Source-filter interactions
      II.3.4.6 Flow in the closed phase
      II.3.4.7 Choice of a voice source model
      II.3.4.8 The range of voice waveshapes
      II.3.4.9 Control of sound pressure level
      II.3.4.10 Control of fundamental frequency
      II.3.4.11 Jitter and shimmer
      II.3.4.12 Turbulence noise
      II.3.4.13 Transient
    II.3.5 Filtering
    II.3.6 Radiation
  II.4 Mapping across stages: stability and sensitivity
  II.5 Specific models of speech production
    Introduction
    II.5.1 Neural and muscular models
    II.5.2 Articulatory models
    II.5.3 Aerodynamic models
    II.5.4 Acoustic models and electrical analogues
    II.5.5 Composite models and the need to include an aerodynamic stage

CHAPTER III: A COMPOSITE MODEL OF SPEECH PRODUCTION
  Introduction
  III.1 Principles
    III.1.1 Inputs
    III.1.2 Articulation
    III.1.3 Aerodynamics
      Introduction
