Vocalocity Voice Browser
VoiceXML Implementation Reference Guide
Version 2.4.1
Vocalocity VoiceXML Implementation Reference Guide, Version 2.4.1
Copyright © 2003–2005. Vocalocity, Inc. All rights reserved. An unpublished work under US Copyright Laws.
Published June 2005
This document is protected by copyright. No part of this document may be used or reproduced in any form by any means without prior written authorization of Vocalocity, Inc. (“Vocalocity”) and its licensors, if any. This document contains information that may be protected by one or more US patents, foreign patents, or pending applications. This document is subject to the terms of the Vocalocity Evaluation Agreement and/or the Vocalocity Master Software License Agreement.
THIS DOCUMENT IS PROVIDED “AS IS” AND WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
THIS DOCUMENT MAY CONTAIN TECHNICAL INACCURACIES OR TYPOGRAPHICAL ERRORS, AND VOCALOCITY MAKES NO REPRESENTATION OR WARRANTIES WITH RESPECT TO THE ACCURACY OR COMPLETENESS OF THE INFORMATION CONTAINED IN THIS DOCUMENT. CHANGES MAY BE ADDED PERIODICALLY TO THE INFORMATION CONTAINED HEREIN; THESE CHANGES WILL BE INCORPORATED IN NEW EDITIONS, VERSIONS OR RELEASES OF THIS DOCUMENT. VOCALOCITY MAY MAKE IMPROVEMENTS AND/OR CHANGES IN THE PRODUCT(S) AND/OR PROGRAM(S) DESCRIBED IN THIS DOCUMENT AT ANY TIME.
Vocalocity, the Vocalocity logo, and combinations thereof are trademarks of Vocalocity, Inc. in the United States and other countries. Other product names and brands used in this document are for identification purposes only, and are the trademarks and/or property of their respective owners. This notice does not evidence any actual or intended publication of this document.
For more information, contact us at [email protected].
Vocalocity, Inc.
730 Peachtree Street, Suite 1100
Atlanta, GA 30308 USA
+1.404.487.1200
http://www.vocalocity.com

Contents

Preface: About This Guide
  Introduction
  Intended Audience
  Version Information
  Using Documentation
  Contents of This Guide
  Related Documentation
  Conventions
  Contacting Vocalocity Technical Support

Chapter 1: Introduction
  About Voice Browsers and VoiceXML
  Voice Browsers
  Supported Specifications
  Implementation of the VoiceXML Specifications
  ASR Vendor Support of SRGS
  ASR Vendor Support of SISR

Chapter 2: VoiceXML Element Summary
  VoiceXML Summary
  SSML Summary
  SRGS Summary
  Detailed Implementation Notes

Chapter 3: Standard Types and Defaults
  Introduction
  Setting Vocalocity Browser Properties
  Setting Properties
  Setting Java System Properties
  Vocalocity Session Variables
  Vocalocity Property Defaults
  Generic Speech Recognition Properties
  Generic DTMF Recognition Properties
  Prompt and Collection Properties
  Fetching Properties
  Miscellaneous Properties
  Custom Browser Properties
  Specifying the ASR or TTS Engine to Use
  ASR Engines
  TTS Engines
  Defaults Used when No Engine Is Specified
  Audio and Initial Page Fetching
  MIME Type Mapping
  Overriding a MIME Type
  SAX Parsers
  ECMAScript
  Accessing the log4j Logger
  Accessing Web Services
  Implementation Notes
  Bargein
  Default Encoding
  DTMF-Only Applications
  Infinite Loop Detection
  Strict Content Type Processing
  Time Unit Designations
  Using a File-Based URL in Applications

Chapter 4: SpeechWorks OSR Notes
  Introduction
  Application Name Used for OSR Logging
  SpeechWorks Recognizer Properties
  Endpointer Tuning
  Licensing Modes
Preface
About This Guide
The Vocalocity Voice Browser 2.4.1 fully conforms to the VoiceXML 2.0 and 2.1 specifications, and supports the Speech Recognition Grammar Specification (SRGS) and other related open standards. This reference guide provides additional detail for how the Vocalocity Voice Browser implements the standards.
The topics discussed in this guide include:
• Voice browsers and VoiceXML
• Supported specifications
• Implementation of VoiceXML elements
• Vocalocity defaults
Introduction
The Vocalocity VoiceXML Implementation Reference Guide describes how the Vocalocity Voice Browser implements VoiceXML as described in:
• The W3C Recommendation 16 March 2004, Voice Extensible Markup Language (VoiceXML) 2.0
• The W3C Working Draft 28 July 2004, Voice Extensible Markup Language (VoiceXML) 2.1
This guide is not a programming guide. It clarifies how Vocalocity has implemented the standards in cases where the requirements were ambiguous or where Vocalocity chose a slightly different implementation.
This guide is intended to provide an explanation of VoiceXML support in the Vocalocity Voice Browser. Use this guide along with the W3C VoiceXML specifications when developing applications.
Intended Audience
This guide should be used by:
• Application developers who are creating VoiceXML applications for the Vocalocity Voice Browser
• Technical personnel who are responsible for troubleshooting deployed applications
Version Information
The information in this guide is accurate for Version 2.4.1 of the Vocalocity Voice Browser.
It discusses Vocalocity Voice Browser’s implementation of VoiceXML 2.0 and VoiceXML 2.1.
Using Documentation
This section outlines the structure of the VoiceXML Implementation Reference Guide and explains other guides in the documentation set and their intended audiences.
Contents of This Guide
This guide consists of four chapters. The following table describes each chapter.
Chapter or Appendix | Description
Preface | Introduces the structure of this guide, and explains how information is presented.
Chapter 1, Introduction | Provides some background on voice browsers, voice standards, and the Vocalocity VoiceXML Interpreter. Lists the specifications to which the VoiceXML Interpreter conforms.
Chapter 2, VoiceXML Element Summary | For each VoiceXML element, provides additional detail for how the Vocalocity Voice Browser implements the VoiceXML standard.
Chapter 3, Standard Types and Defaults | Lists the standard event types, session variables, and application variables in the Vocalocity Voice Browser VoiceXML implementation, and how to configure them.
Chapter 4, SpeechWorks OSR Notes | Contains implementation suggestions or usage notes for using SpeechWorks OSR with the Vocalocity Voice Browser.
Related Documentation
There are several different guides to help you understand, implement and run the Vocalocity Voice Browser. The documentation set consists of the following guides.
Guide | Description | Intended Audiences
Vocalocity Voice Browser Installation Guide | Contains hardware and software requirements for the Vocalocity Voice Browser, describes deployment options, and contains procedures for installing and configuring Vocalocity software, third-party software, and hardware. Note: Operations information is included in the Control Center User’s Guide. | Anyone planning an implementation or installing Vocalocity Voice Browser, Voice Browser components, and Vocalocity tools
Vocalocity App Center User’s Guide | Describes how to build and deploy Vocalocity Voice Browser solutions using Vocalocity App Center and the Vocalocity Voice Browser | Voice application developers who are creating and publishing VoiceXML applications for their own use or for their customers
Vocalocity Control Center User’s Guide | Describes how to monitor Vocalocity Voice Browser solutions | Operations personnel performing ongoing maintenance of Vocalocity Voice Browser solutions
Vocalocity Info Center User’s Guide | Describes how to gather call information and use that information to support Vocalocity Voice Browser solutions | Support personnel responsible for troubleshooting and supporting voice applications
VoiceXML Implementation Reference Guide | Describes how the Vocalocity Voice Browser implements VoiceXML 2.0 and 2.1. This guide should be used along with the W3C VoiceXML specifications when developing applications. | Application developers who are creating VoiceXML applications for the Vocalocity Voice Browser; technical personnel who are responsible for troubleshooting deployed applications
Conventions
The following table describes the typographical conventions used in this guide.
Convention | Meaning
Monospace | Indicates text that should be entered exactly as shown (including punctuation) or examples of code. Here is an example of a command line: # mkdir /somedir
Bold Type | Indicates a path or the name of a program, process, procedure, routine, script, or table, such as ASSIGN
Italic Type | Indicates a variable entry, such as
Contacting Vocalocity Technical Support
There are many ways to contact Vocalocity Customer Support.
Contact us... | At...
On the Web | http://support.vocalocity.com. The Vocalocity support website is available 24 hours. The website has integrated issue tracking functionality that allows customers to enter and track defects and enhancements to our software.
Via email | [email protected]. Your email goes directly to our technical support staff.
By telephone | +1 404.487.1200
By mail | Our corporate offices are located at: Vocalocity, Inc., 730 Peachtree Street, Suite 1100, Atlanta, GA 30308 USA
Chapter 1: Introduction
The Vocalocity Voice Browser is a packaged solution that integrates all the components necessary for a voice application system. It includes a VoiceXML Interpreter that reads and executes VoiceXML applications. This chapter provides some background on voice browsers, voice standards, and the Vocalocity VoiceXML Interpreter.
This chapter contains the following topics:
• About Voice Browsers and VoiceXML
• Implementation of the VoiceXML Specifications
• ASR Vendor Support of SRGS
About Voice Browsers and VoiceXML
Vocalocity is an active member of the W3C Voice Browser Working Group. The Working Group has defined a suite of markup languages covering dialog, speech synthesis, speech recognition, call control and other aspects of interactive voice response applications.
Vocalocity is one of the W3C Editors of VoiceXML 2.0, VoiceXML 2.1, CCXML 1.0 and SSML. Additionally, Vocalocity is a Board Member of the VoiceXML Forum and our Chief Architect, Ken Rehor, serves as the organization's Vice Chair.
For more information about these organizations:
• W3C Voice Browser Working Group: go to www.w3c.org/Voice
• VoiceXML Forum: go to www.voicexml.org
Specifications such as the Speech Synthesis Markup Language (SSML), Speech Recognition Grammar Specification (SRGS), and Call Control XML (CCXML) are core technologies for describing speech synthesis (text-to-speech), recognition grammars (automatic speech recognition), and call control constructs, respectively.
VoiceXML, or Voice eXtensible Markup Language, is a dialog markup language that leverages the other specifications for creating dialogs that feature synthesized speech, digitized audio, recognition of spoken and DTMF key (touch tone) input, recording of spoken input, telephony, and mixed initiative conversations.
VoiceXML is the HTML of the voice web, the open standard markup language for voice applications. Where HTML assumes a graphical web browser with display, keyboard, and mouse, VoiceXML assumes a voice browser with audio output (recorded messages and TTS synthesis), audio input (ASR), and keypad input (DTMF).
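As a rough illustration of the dialog model described above, the following sketch shows a minimal VoiceXML 2.0 document that plays recorded audio with a TTS fallback and then collects one spoken answer. The file names (welcome.wav, drink.grxml), the form and field names, and the prompt wording are hypothetical and are not taken from any Vocalocity sample application.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Minimal sketch of a VoiceXML 2.0 dialog; all file and field names are assumptions. -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form id="order">
    <block>
      <!-- Recorded audio output; the enclosed text is rendered by TTS
           if the audio file cannot be fetched or played. -->
      <audio src="welcome.wav">Welcome.</audio>
    </block>
    <field name="drink">
      <prompt>Would you like coffee or tea?</prompt>
      <!-- Audio input (ASR) constrained by an external SRGS grammar. -->
      <grammar src="drink.grxml" type="application/srgs+xml"/>
      <noinput>I did not hear you. <reprompt/></noinput>
      <nomatch>Sorry, I did not understand. <reprompt/></nomatch>
      <filled>
        <prompt>You chose <value expr="drink"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```

The single field plays the role that an input box plays in an HTML form: the interpreter prompts, listens, and runs the filled handler once the caller's answer matches the grammar.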
Voice Browsers
A voice browser is a collection of software that works together to integrate and manage telephony, automatic speech recognition (ASR), text-to-speech (TTS), DTMF (touchtone), third-party or custom services, media, and other resources required to run VoiceXML applications.
The Vocalocity Voice Browser is a packaged solution that integrates all the components necessary for a voice application system. The Vocalocity Voice Browser includes a VoiceXML Interpreter that enables it to execute voice applications written in VoiceXML.
The Vocalocity VoiceXML Interpreter conforms to the VoiceXML 2.0 and VoiceXML 2.1 specifications and related specifications.
Supported Specifications
The Vocalocity Voice Browser supports the following W3C specifications.
Standard | Description | Specification
VoiceXML 2.0 | Voice Extensible Markup Language. Markup language used to create dialogs (voice applications). The Vocalocity Voice Browser includes a VoiceXML interpreter that can render VoiceXML applications. | W3C Recommendation 16 March 2004, www.w3.org/TR/voicexml20/
VoiceXML 2.1 | Voice Extensible Markup Language | W3C Working Draft 28 July 2004, www.w3.org/TR/voicexml21/
SSML 1.0 | Speech Synthesis Markup Language. SSML tags are used for TTS capabilities. They are noted in the Implementation Notes in the following table. | W3C Recommendation 7 September 2004, www.w3.org/TR/speech-synthesis/
SRGS 1.0 | Speech Recognition Grammar Specification. SRGS tags are used for ASR capabilities, for example, to specify a grammar. | W3C Recommendation 16 March 2004, www.w3.org/TR/speech-grammar/
SISR 1.0 | Semantic Interpretation for Speech Recognition. The SRGS element | W3C Working Draft 8 November 2004, www.w3.org/TR/semantic-interpretation/
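To make the role of SRGS more concrete, here is a minimal sketch of a grammar in the XML form of SRGS 1.0 that accepts one of two spoken words. The rule name and the vocabulary are hypothetical, and whether a given recognizer accepts the file unchanged depends on the SRGS version that the ASR vendor supports.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch of an SRGS 1.0 XML grammar; the rule id and words are assumptions. -->
<grammar version="1.0" xmlns="http://www.w3.org/2001/06/grammar"
         xml:lang="en-US" mode="voice" root="drink">
  <!-- The root rule matches exactly one of the listed alternatives. -->
  <rule id="drink" scope="public">
    <one-of>
      <item>coffee</item>
      <item>tea</item>
    </one-of>
  </rule>
</grammar>
```

A VoiceXML document would typically reference a file like this from a grammar element inside a field, or embed the same rules inline.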
Implementation of the VoiceXML Specifications
The Vocalocity VoiceXML Interpreter conforms to all required elements in the VoiceXML 2.0 and 2.1 specifications. However, there are elements where the implementation of attributes has been left to the interpreter. This guide describes how Vocalocity has implemented the standard. It should be used along with the W3C VoiceXML specifications when developing applications.
• Support of built-in VoiceXML grammars is dependent on the ASR vendor implementation. For a list of vendors and supported SRGS versions, see ASR Vendor Support of SRGS on page 1-5.
• Support of semantic interpretation is dependent on the ASR vendor. For a list of vendors and supported SISR versions, see ASR Vendor Support of SISR on page 1-5.
• Support for SSML is dependent on the TTS vendor.
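The vendor-dependent features listed above show up in ordinary application markup. The sketch below, written against the VoiceXML 2.0 and SSML 1.0 element sets, uses a built-in grammar (the digits field type) and a few SSML elements inside a prompt. The field name and prompt wording are hypothetical, and how, or whether, each construct is honored is assumed to vary with the ASR and TTS engines in use.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: behavior of built-in grammars and SSML depends on the ASR/TTS vendor. -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form id="login">
    <!-- type="digits" requests a built-in grammar; support depends on the ASR vendor. -->
    <field name="pin" type="digits">
      <prompt>
        Please say or enter your <emphasis>four digit</emphasis> PIN.
        <!-- SSML pause and speaking-rate hints; rendering depends on the TTS vendor. -->
        <break time="500ms"/>
        <prosody rate="slow">You can use the keypad if you prefer.</prosody>
      </prompt>
      <filled>
        <prompt>You entered <value expr="pin"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```

Testing markup like this against the specific recognizer and synthesizer in a deployment is a reasonable way to confirm which of these constructs are actually supported.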
ASR Vendor Support of SRGS
The version of the SRGS supported also depends on the speech recognition vendor your implementation uses.
ASR Vendor | Supported Specification
ScanSoft SpeechWorks OSR 3.0 | SRGS 1.0, W3C Proposed Recommendation 18 December 2003, http://www.w3.org/TR/2003/PR-speech-grammar-20031218/
ScanSoft SpeechWorks OSR 2.0 | SRGS 1.0, W3C Candidate Recommendation 26 June 2002, http://www.w3.org/TR/2002/CR-speech-grammar-20020626/
LumenVox SRE 5.5 | SRGS 1.0, W3C Recommendation 16 March 2004, http://www.w3.org/TR/speech-grammar/
Nuance Speech Recognition System 8.0.0 | SRGS 1.0, W3C Working Draft 20 August 2001, www.w3.org/TR/2001/WD-speech-grammar-20010820/
ASR Vendor Support of SISR
The version of the SISR specification supported also depends on the speech recognition vendor your implementation uses.
ASR Vendor | Supported Specification
ScanSoft SpeechWorks OSR 3.0 | SISR 1.0, W3C Working Draft 8 November 2004, http://www.w3.org/TR/semantic-interpretation/
ScanSoft SpeechWorks OSR 2.0 | SISR 1.0
LumenVox SRE 5.5 | SISR 1.0
Nuance Speech Recognition System 8.0.0 | SISR 1.0
Chapter 2: VoiceXML Element Summary
This chapter explains how the Vocalocity Voice Browser implements the VoiceXML 2.0 and 2.1 standards. It identifies areas of clarification in cases where the specifications had ambiguous requirements or where the implementation has been left to the vendor.
This chapter contains the following topics:
• VoiceXML Summary
• SSML Summary
• SRGS Summary
• Detailed Implementation Notes
VoiceXML Summary
The following table is a summary of the current VoiceXML elements supported in this release of the Vocalocity Voice Browser.
Element | Purpose | Implementation Notes
 | Allows a VoiceXML application to fetch XML data from a document server without transitioning to a new VoiceXML document | New in VoiceXML 2.1. A Java system property, vocalos.vxml.data.access_control.allow, can be set to configure the default behavior if the returned XML content does not contain the access-control XML processing instruction. See Element on page 2-10.
Enter “dtmf-digit” for “text”