Building Synthetic Voices

by Alan W Black and Kevin A. Lenzo

For FestVox 2.0 Edition

Copyright © 1999-2003 by Alan W Black & Kevin A. Lenzo

Permission to use, copy, modify and distribute this document for any purpose and without fee is hereby granted in perpetuity, provided that the above copyright notice and this paragraph appear in all copies.

Table of Contents

I. Speech Synthesis
    1. Overview of Speech Synthesis
        History
        Uses of Speech Synthesis
        General Anatomy of a Synthesizer
    2. Speech Science
    3. A Practical Speech Synthesis System
        Basic Use
        Utterance structure
        Modules
        Utterance access
        Utterance building
        Extracting features from utterances

II. Building Synthetic Voices
    4. Basic Requirements
        Hardware/software requirements
        Voice in a new language
        Voice in an existing language
        Selecting a speaker
        Who owns a voice
        Recording under Unix
        Extracting pitchmarks from waveforms
    5. Limited domain synthesis
        Designing the prompts
        Customizing the synthesizer front end
        Autolabeling issues
        Unit size and type
        Using limited domain synthesizers
        Telling the time
        Making it better
    6. Text analysis
        Non-standard words analysis
        Token to word rules
        Number pronunciation
        Homograph disambiguation
        TTS modes
        Mark-up modes
    7. Lexicons
        Word pronunciations
        Lexicons and addenda
        Out of vocabulary words
        Building letter-to-sound rules by hand
        Building letter-to-sound rules automatically
        Post-lexical rules
        Building lexicons for new languages
    8. Building prosodic models
        Phrasing
        Accent/Boundary Assignment
        F0 Generation
        Duration
        Prosody Research
        Prosody Walkthrough
    9. Corpus development
    10. Waveform Synthesis
    11. Diphone databases
        Diphone introduction
        Defining a diphone list
        Recording the diphones
        Labeling the diphones
        Extracting the pitchmarks
        Building LPC parameters
        Defining a diphone voice
        Checking and correcting diphones
        Diphone check list
    12. Unit selection databases
        Cluster unit selection
        Building a Unit Selection Cluster Voice
        Diphones from general databases
    13. Labeling Speech
        Labeling with Dynamic Time Warping
        Labeling with Full Acoustic Models
        Prosodic Labeling
    14. Evaluation and Improvements
        Evaluation
        Does it work at all?
        Formal Evaluation Tests
        Debugging voices

III. Interfacing and Integration
    15. Markup
    16. Concept-to-speech
    17. Deployment

IV. Recipes

Details

  • File Type
    pdf
  • Content Languages
    English
  • File Pages
    202 pages
