IBM Text-to-Speech SSML Programming Guide December 2008 A form for readers’ comments appears at the back of this publication. If the form has been removed, address your comments to: International Business Machines Corporation Department MMOA P.O. Box 12195 Research Triangle Park, North Carolina 27709-2195 When you send information to IBM®, you grant IBM a nonexclusive right to use or distribute the information in any way it believes appropriate without incurring any obligation to you. © Copyright International Business Machines Corporation 2006. US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp. Contents
IBM Text-to-Speech SSML Programming British English SPRs ...... 15 Guide ...... 1 German SPRs...... 17 Introduction to SSML ...... 1 Canadian French SPRs...... 19 SSML phonemes ...... 2 Spanish SPRs ...... 22 SSML language support ...... 3 Introduction to pitch and its use with SSML . . . 24 SSML tags ...... 4 TTS question intonation ...... 26 Introduction to symbolic phonetic representations . 11 Notices and trademarks ...... 26 Symbolic phonetic representation forms (SPRs) . 11 Entering symbolic phonetic representations . . . 12 Index ...... 29 U.S. English SPRs ...... 13
© Copyright IBM Corp. 2006 iii iv IBM Text-to-Speech SSML Programming Guide IBM Text-to-Speech SSML Programming Guide
This guide provides information about using Speech Synthesis Markup Language (SSML) to incorporate IBM Text-to-Speech technology into applications. About using IBM Text-to-Speech and SSML
This guide describes the programming interfaces available for developers to take advantage of these features within their applications.
This guide is also available in Portable Document Format (PDF). To read it, you must have the Adobe Acrobat Reader installed on your computer. Using IBM Text-to-Speech Technology and Speech Synthesis Markup Language Who should use this guide
This guide can help if you are a software developer interested in writing applications that use IBM Text-to-Speech technology. This document describes the use of IBM Text-to-Speech technology for beginning to advanced software engineers. Guide topics
This guide presents the following main topics. v SSML This topic describes the use of SSML to control speech synthesis and text processing parameters. v Symbolic phonetic representations (SPRs) This topic describes the use of special phonetic symbols to customize pronunciations in IBM Text-to-Speech. Typographical conventions
The following typographical conventions are used throughout this document to facilitate reading and comprehension. They are outlined in the following list. Monospace font Applies to code samples, including XML, and to file and directory names. Bold Applies to function and callback names, and to data types, including structures and enumerations. Italics Applies to parameter and structure member names, sample text, and the introduction of new terms. UPPERCASE Applies to property, enumerator, mode, and state names.
Introduction to SSML The VoiceXML 2.0 specification adopted Speech Synthesis Markup Language (SSML) as the standard markup language for speech synthesis.
© Copyright IBM Corp. 2006 1 SSML provides developers of speech applications a standard way in which to control speech synthesis and text processing parameters. SSML enables developers to specify pronunciations, volume, pitch, speed, and so on. The SSML standard
This section describes SSML support in IBM Text-to-Speech System. The implementation is based on the Speech Synthesis Markup Language Version 1.0, recommended by W3C on September 7, 2004, which can be found at http://www.w3.org/TR/speech-synthesis/
The IBM Text-to-Speech System implements this specification, with the following exception: The following prosody elements are not supported: Contour and Duration. The rate element is supported but the implementation varies from the current W3C specification. SSML parsing
The SSML processor silently ignores unsupported tags. The text contained inside an unsupported
If the syntax of the input text is not legal, the SSML processor returns and logs an error. SSML language support
The level of SSML support is language dependent. The topic ″SSML language support″ provides detailed information about what is supported for each language. Related information SSML language support Support for some SSML tags is language-specific. This topic summarizes the supported languages and the SSML features supported for each language. SSML phonemes SSML supports two phonetic alphabets: the International Phonetic Alphabet (IPA) and the IBM TTS phonetic alphabet.
The optional alphabet attribute of the phoneme element specifies the phonetic alphabet. If the alphabet is not specified, the default IBM TTS phonetic alphabet is used. International Phonetic Alphabet example
The following example shows the US English phonetic pronunciation of ″tomato″ using the IPA:
The following example shows the pronunciation of ″tomato″ using the character codes for the IPA symbols. If you have encoding to support the IPA symbols, you can these symbols instead: