Rapid Portability Among Domains in an Interactive Spoken Language Translation System
Total Page:16
File Type:pdf, Size:1020Kb
Rapid Portability among Domains in an Interactive Spoken Language Translation System Mark Seligman Mike Dillinger Spoken Translation, Inc. Spoken Translation, Inc. Berkeley, CA, USA 94705 Berkeley, CA, USA 94705 mark.seligman mike.dillinger @spokentranslation.com @spokentranslation.com Abstract this difficulty and briefly present an approach to rapid inter-domain portability that we believe is promising. Spoken Language Translation systems have Three aspects of our approach will be discussed: (1) usually been produced for such specific domains as large general-purpose lexicons for automatic speech health care or military use. Ideally, such systems recognition (ASR) and machine translation (MT), would be easily portable to other domains in which made reliable and usable through interactive facilities translation is mission critical, such as emergency response or law enforcement. However, porting has in for monitoring and correcting errors; (2) easily practice proven difficult. This paper will comment on modifiable facilities for instant translation of frequent the sources of this difficulty and briefly present an phrases; and (3) quickly modifiable custom glossaries. approach to rapid inter-domain portability. Three As preliminary support for our approach, we aspects will be discussed: (1) large general-purpose apply our current SLT system, now optimized for the lexicons for automatic speech recognition and health care domain, to sample utterances from the machine translation, made reliable and usable through military, emergency service, and law enforcement interactive facilities for monitoring and correcting domains. errors; (2) easily modifiable facilities for instant With respect to the principal source of the porting translation of frequent phrases; and (3) quickly modifiable custom glossaries. As support for our problems affecting most SLT systems to date: most approach, we apply our current SLT system, now systems have relied upon statistical approaches for optimized for the health care domain, to sample both ASR and MT (Karat and Nahamoo, 2007; utterances from the military, emergency service, and Koehn, 2008); so each new domain has required law enforcement domains, with discussion of extensive and high-quality in-domain corpora for best numerous specific sentences. results, and the difficulty of obtaining them has limited these systems’ portability. The need for in- 1 Introduction domain corpora can be eliminated through the use of a quite general corpus (or collection of corpora) for Recent years have seen increasing research and statistical training; but because large corpora give rise commercial activity in the area of Spoken Language to quickly increasing perplexity and error rates, most Translation (SLT) for mission-critical applications. In SLT systems have been designed for specialized the health care area, for instance, such products as domains. Converser (Dillinger & Seligman, 2006), S-MINDS By contrast, breadth of coverage has been a (www.fluentialinc.com), and Med-SLT (Bouillon et central design goal of our SLT systems. Before any al, 2005) are coming into use. For military optimization for a specific domain, we “give our applications, products like Phraselator systems a liberal arts education” by incorporating (www.phraselator.com) and S-MINDS very broad-coverage ASR and MT technology. (We (www.fluentialinc.com) have been deployed. presently employ rule-based rather than statistical MT However, the demand for real-time translation is by components, but this choice is not essential.) For no means restricted to these areas: it is clear in example, our MT lexicons for English<>Spanish numerous other areas not yet extensively addressed – translation in the health care area contain roughly emergency services, law enforcement, and others. 350,000 words in each direction, of which only a Ideally, a system produced for one such domain small percentage are specifically health care terms. (e.g., health care) could be easily ported to other Our translation grammars (presently licensed from a domains. However, porting has in practice proven commercial source, and further developed with our difficult. This paper will comment on the sources of collaboration) are similarly designed to cover the structures of wide-ranging general texts and spoken © 2008. Licensed under the Creative Commons Attri- discourse. bution-Noncommercial-Share Alike 3.0 Unported To deal with the errors that inevitably follow as license (http://creativecommons.org/licenses/by-nc- coverage grows, we provide a set of facilities that sa/3.0/). Some rights reserved. enable users from both sides of the language barrier to 40 Coling 2008: Proceedings of the workshop on Speech Processing for Safety Critical Translation and Pervasive Applications, pages 40–47 Manchester, August 2008 interactively monitor and correct such errors. We have As an initial safeguard against translation errors, described these interactive techniques in (Dillinger we supply a back-translation, or re-translation of the and Seligman, 2004; Zong and Seligman, 2005; translation. Using this paraphrase of the initial input, Dillinger and Seligman, 2006; and Seligman and even a monolingual user can make an initial judgment Dillinger, 2006). With users thus integrated into the concerning the quality of the preliminary machine speech translation loop, automatically translated translation output. If errors are seen, the user can spoken conversations can range widely with modify specific parts of the input and retranslate. acceptable accuracy (Seligman, 2000). Users can (Other systems, e.g. IBM’s MASTOR (Gao et al, move among domains with relative freedom, even in 2006), have also employed re-translation. Our advance of lexical or other domain specialization, implementations, however, exploit proprietary because most domains are already covered to some technologies to ensure that the lexical senses used degree. After a quick summary of our approach (in during back-translation accurately reflect those used Section 2), we will demonstrate this flexibility (in in forward translation. We also allow users to modify Section 3). part or all of the input before regenerating the While our system’s facilities for monitoring and translation and back-translation.) correction of ASR and MT are vital for accuracy and In addition, if uncertainty remains about the confidence in wide-ranging conversations, they can be correctness of a given word sense, we supply a time consuming. Further, interactivity demands a proprietary set of Meaning Cues™ – synonyms, minimum degree of computer and print literacy, definitions, examples, pictures, etc. – which have which some patients may lack. To address these been drawn from various resources, collated in a issues, we have developed a facility called database (called SELECT™), and aligned with the Translation Shortcuts™, through which prepared respective lexica of the relevant MT systems. (In the translations of frequent or especially useful phrases in present English<>Spanish version of the system, this the current domain can be instantly executed by database contains some 140,000 entries, searching or browsing. The facility is described in corresponding to more than 350,000 lexical entries. (Seligman and Dillinger, 2006). After a quick The cues are automatically grouped by meaning, and description of the Translation Shortcuts facility cue groups are automatically mapped to MT lexica (Section 4), this paper will emphasize the contribution using proprietary techniques – thus in effect of the Translation Shortcuts facility to domain retrofitting an MT system with the ability to explain portability, showing how a domain-specific set of to users the meanings of its pre-existing lexical Shortcuts can be composed and integrated into the items.) With these cues as guides, the user can system very quickly (Section 5). monitor the current, proposed meaning and if Finally, while the extensive lexical resources necessary select a different, preferred meaning from already built into the system provide the most among those available. Automatic updates of significant boost to domain portability in our system, translation and back-translation then follow. (Our it will always be desirable to add specialized lexical current MT vendor has modified its rule-based items or specialized meanings of existing ones. translation engine to allow specification of a desired Section 6 will briefly present our system’s glossary sense when translating a word or expression; we import facility, through which lexical items can be provide guidelines for other vendors to do likewise. added or updated very quickly. Our concluding Comparable modifications for statistical MT engines remarks appear in Section 7. will entail the setting of temporary weightings that will bias the selection of word or phrase translations 2 Highly Interactive, Broad-coverage for the current sentence only.) Future versions of the SLT system will allow personal word-sense preferences thus specified in the current session to be optionally We now briefly summarize our group’s approach stored for reuse in future sessions, thus enabling a to highly interactive, broad-coverage SLT. Our gradual tuning of word-sense preferences to systems stress interactive monitoring and correction individual needs. (However, such persistent personal of both ASR and MT. preferences will still be applied sentence by sentence, First, users can monitor and correct the speaker- rather than by permanently modifying lexica or phrase dependent speech recognition