GRAFON: A GRAPHEME-TO-PHONEME CONVERSION SYSTEM FOR DUTCH
Walter Daolemans AI-I,AB, Vrije Universiteit Brussel Pleiulaau 2 K, 1050 Brussels, Belgium E-math walterd@ar ti. vub.uucp
The architecture of GRAFON as it is currently imple- Abstvact mented is shown in Figure l. We (k;~,.c,ibe a set of rt,o(lttle.,i that together tuake up a grapheme-to phoneme conversion system for Dutch. Modules include a syflabificatiou program, a fast morphological parser, a lexical database, a phonological knowledg*: base, transliteration rules, and phonological rnles. Knowledge and procedures were intlflenmnted object-orientedly. We contrast GRAFON to recent p.tern recognitkm and rule. compiler approaches mid tit to show that the first fails for languages with concatenative comlmtmding (like Dutch, Get,nan, and Scandinavian languages) while the MORPHOLOGICAL ANALYSIS I second lack.,; the flexibility to model different phonological ::::::::::::::::::::: ,~#het##gieren##van~#de##herfst #storm# ~ theories. It is claimed that sylhtbles (and not graphemes/phonemes or ulorphemes) should be central units in a :,:,:.:.:.: :,>:.:.: J :i!i2!?!~iiiiiiij2~ii I rtde-based phonemisatkm algorithm. Furthermore, the architec- tnre of GRA!:"ON and its nser interface make it ideally suited as ::::::::::::::::::::::~~£EX~:;~r:/: SYLLABI FICATION a rule testing tool fol phonologists. :!::: tTA~A:::::: ##het ~,~gie$ren#~van##de#~,herfst#storm# ~ ::::::::::::::::::::: I 1. INTROI)~CI'I(1lkl Speech :;ynthesis systems cousist of a linguistic and an q>~~':' ~_het ##+gieSren##+van#~de##++herfsU,+storm~,# acoustic part The linguistic part converts an orthographic representation of a text into a phonetic representation flexible E and detailed enough to serve as input to the acoustic part. TRANSLITERATION MAPPINGS The acoustic t)art is a speech synthesiser which may be based PHONOLOGICAL RULES on the production of allophones or diphones. This paper is concerned with file linguistic part of speech synthesis for at 'Xi:ro'van do 'heraf.storam Dutch (a process we will call phonamisation). The problem of phonemi~;ation has beett approached in different ways. Recently, covnectionist approaches (NETtalk: Sejnowski and Rosenberg, 1987) mid memory-based reasoning approaches Figure I. Architecture o£ GRAFON. Dark boxes indicate knox4edge sources, (MBRtalk: Slanfill and Waltz, 1986) have been proposed as white boxes processing modules, After computing morphological e~d syllable boundaries, the system retrieves word accent information and alternatives tt) the traditional symbol-manipttlation approach. applies tre.~sliteration mappings and phonological rules to the input. Within the latter (rule-based) approach, several systems have ~esultit~g t'epresentanons are s]aowr~ within the boxes bexm built for English (the rnost comprehensive of which is probably Ml]'alk; Allen, Hunnicutt and Klatt, 1987), and sys- tems for othm" European hmguages are beginning to appear. An input string of orthographic symbols is first analysed morphologically. Then, syllable bonndaries are computed, Text-to-,;peech systems tbr Dutch are still in an experi- taking into account the morphological boundaries. Morpho- mental stage, and two different designs can be distinguished. logical analysis uses a lexical database, which is also used to Some researchers adopt an 'expert system' pattern matching retrieve word stress of monomorphematic word forms. The approach /Boot, 1984/, others a 'rule compiler' approach actual transcription takes the syllable as a basic unit and /Kerkhoff, Wester and Bores, 1984; Berendsen, Langeweg proceeds in two stages: tirst, parts of spelling syllables are attd van Leer,wen, 1986/ in which the rules are mostly in an transliterated into (strings of) phoneme symbols by a number SPE-inspired format. Both approaches take the of transliteration mappings. To this representation, context- grapheme/phoneme as a central nnit. We will argue that sensitive phonological rules are applied, modifying parts of within the symbol manipulation approach, a modular architec- the syllables in the process. Any level of phonetic detail ture with rite syllable as a central tmit is to be preferred. (between a broad and a narrow transcription) can be obtained by adding or blocking rules.
The research described in this paper was supported partly by the In the remainder of this paper, we will describe the European Community under ESPRIT project OS 82. The paper is different modules playing a role in GRAFON in some detail, based on an hiternal memo (Daelemans, 1985) and on part of a go into some language-specific requirements, and discuss the dissertation (l)aelemans, 1987b). The system described here is not advantages of our architectnre to alternative designs. to be eonfilsed with the GRAPHON system developed at the Tech- aisehe Oniversitiit Wien (Pounder and Kommcnda, 1986) which is a text-to-speech syatenl for German. I am grateful to my forraer and present c(lleagtms in Nijmegen and Brtmsels for providing a sthnulating wo,'kiag environment. Erik Wybouw developed C-code for constructing an indexed-sequential version of the lexical data- base. 133 2. SYLLABIFICATION more detail in Daelemans (1988, forthcoming). Information about the position of syllable boundaries in 3. LEXICAL DATABASE spelling input strings is needed for several reasons. The most important of these is that most phonological rules in Dutch We use a word tbrm dictionary instead of a morpheme have the syllable as their domain. E.g. Dutch has a schwa- dictionary. At present, some 10,000 citation forms with their insertion rule inserting a schwa-like sound between a liquid associated inflected forms (computed algoritlmlically) are and a nasal or non-coronal consonant if both consonants listed in the lexical database. The entries were collected by belong to the same syllable. Compare melk (milk): /mel0k/ the university of Nijmegen from different sonrceso The to melSken (to milk): /melk0/ ($ indicates a syllable boun- choice for a word form lexical database was motivated by the dary). Without syllable structure this problem can only be following considerations: First, morphological analysis if; resolved in an ad hoc way. Furthermore, stress assignment reduced to dictionary looknp sometimes combined with com- rules should be described in terms of syllable structure pound and affix analysis. Complex word fonaas (i.e. freqneut /Berendsen and Don, 1987/. compounds and word tbrms with affixes) ale stored wifl~ their internal word boundaries. These boundaries can therefore be Other rules which are often described as having the retrieved instead of computed. Only the structure of complex morpheme as their domain (such as devoicing of voiced words not yet listed in the dictionary must be computed. obstruents at morpheme-final position and progressive and This makes morphological decomposition computatioually less regressive assimih~tion), shonld really be described as operat- expensive. ing on the syllable level. E.g. hetSze (/hets0/: smear cam- paign; devoicing of voiced fricative at syllable~final position) Second, the number of errors in morphological parsing and asSbest (/azbest/: asbestos; regressive assimilation). These owing to overacceptance and nonsense analyses is consider- mono-morphematic words show the effects of the phonologi- ably reduced. Traditional erroneous analyses of systems cal rules at their syllable boundaries. Furthermore, the proper vsing a morpheme-based lexicon like comput+er and target of these rules is not one phoneme, bnt the complete under+stand, or for Dutch kwart+el (quainter yard instead of coda or onset of the syllable, which may consist of more quail) and li+epen (plural past tense of lopen, to run; than one phoneme. analysed as 'epics about the Chinese measure li') are avoided this way. Finally, current and forthcoming storage and Although these examples show convincingly that syllable search technology reduce the overhead involved in using large structure is necessary, they do not prove that it is central. lexical databases considerably. However, the following observations seem to suggest the cen- trality of the syllable in Dutch phonemisation: Notice that the presence of a lexical database suggests a - The combination of syllable structure and information about simpler solution to the phonemisation problem: we could sim word stress seems enough to transform all spelling vowels ply store the transcription with each entl2¢ (This lexicon-based correctly into phonemes, including Dutch grapheme
Figure 2. Examt,le instance of the object type SYLLABLE and its associated feature values after transcription. cal rule, GRAFON keeps a list of all input strings to which REGRESSIVE ASSIMILATION the rule applies. This is advantageous when complex ~'ale Active7 True interactions must be studied. Figure 4 shows the user inter Domain? Syllable face with some output by the program. Apm~t from the Conditions changing of rules, the derivation, and the example list for (let ((coda-1 (lost (coda SYL))) each different rule, the system also offers menu-based facili- (onset-Z (r;rst(onset (next SYL))))) (and ties for manipulating various parameters used in the hyphena- (stop? onset-2) (voiced? onset-2) tion, parsing and conversion algorithms, and for compiling (obstruent ? code- l ) (nO~ (voicod? coda-l)))) and showing statistical information on the distribution of allo. phones and diphones in a corpus. Actions (meke=voiced (coda SYL)) 7.2. Dietionm2¢ Construction PROGRESSIVE ASSIMILATION Active? True Output of GRAFON was used (after manual checking) Domain? Syllable by a Dutch lexicographic firm for tile construction of the Conditions pronunciation representation of Dutch entries in a Dutch (let ((coda-I (lost(codaSYL))) French translation dictionary. The program tarried out to I~ (onset-2 (first (onset (next SYL))))) easily adaptable to the requirements by blocking rules which (~d (obstruent? coda- [ ) would lead to too much phonetic detail, and by changing the (not (voiced ? coda- 1)) (fricative? onset-2) domain of others (e.g. the. scope of assimilation rules was (voiced? onset=2))) restricted to internal word boundaries). The accuracy of' the Actions program on the 100,000 word corpus was more than 99%, (mske-voiceloss (onset (next SYL))) disregarding loan words. The phonemisation system also plays a central role in die dynamical part of the lexical Figure 3. A possible detinition of voice assimilation rules in Dutch. The LET syntax is used for local variable binding, but is not stTietly needed. database architecture we have described elsewhere /Daele- SYL is bound to the curt'cut syllable. marts, 1987a/.
In a rule compiler approach (e.g. Kerkhoff, Wester and 7.3. Spelling Error Correction Boves, 1984; Berendsen, Langeweg and van Leeuwen, 1986), A spelling error correction algorithm based on the idea rules in a particular format (most often generative phonology) that people write what they heat' if they do not know the are compiled into a program, thereby making a strict distinc- spelling of a word has teen developed by Van Berkel /Van tion between the linguistic and computational parts of the sys- Berkel and De Smedt, 1988/. A dictionary is used in which tem. None of the existing systems incorporates a full mor- the word forms have been transformed into phoneme phological analysis. The importance of morphological boun- representations with a simplified and adapted version of daries is acknowledged, but actual analysis is restricted to a GRAFON. A possible error is transformed with the same number of (overgenerating) pattern matching rules. Another algorithm and matched to the dictionary entries. Combined serious disadvantage is that the user (the linguist) is restricted with a trigram (or rather triphone) method, this system can in a compiler approach to the particular formalism the com- correct both spelling and typing errors at a reasonable speed. piler knows. I would be impossible, for instance, to incor- porate theoretical insights from autosegmental and metrical phonology in a straightforward way into existing prototypes. 8. IMPLEMENTATION AND ACCURACY In GRAFON, on the other hand, the phonological knowledge GRAFON was written in ZetaLisp and Flavors and runs base can be easily extended with new objects and relations on a Symbolics Lisp Machine. The lexical database is stored between objects, and even at the level of the fimction and on a SUN Workstation and organised indexed-sequentially. predicate interface, some theoretical modelling can be done. Accuracy measures (on randomly chosen Dutch text) are This flexibility is paid, however, by higher demands on the encouraging: in a recent test on a I000 word text, 99.26% of linguist working with the system, as he should be able to phonemes and 97.62% of transcribed word tokens generated write rules in a LISP-like applicative language. However, we by GRAFON were judged correct by an independent linguist. hope to have shown fl'om examples of rules in Figure 3 that The main source of errors by the program was the presence the complexity is not insurmountable. of foreign words in the text (mostly of English and French origin). Only a marginal number of errors was caused by 7. APPLICATIONS morphological analysis, syllabification or phonological rule application. Apart from its evident role as the linguistic part in a texture-speech system, GRAFON has also been used in other There is at present one serious restriction on the system: applications. no syntactic analysis is available and therefore, no sophisti- cated intonation patterns and sentence accent can be corn- 7.1. Linguistic Tool puted. Moreover, it is impossible to experiment with the Phi One advantage of computer models of linguistic (the phonological phrase, which may restrict sandhi processes phenomena is the framework they present for developing, in Dutch) as a domain for phonological rules. However, recently a theory has been put forward by Kager and Qnen6 testing and evaluating linguistic theories. To be used as a linguistic tool, a natural language processing system should at (1987) in which it is claimed that sentence accent, Phi bonn- least come up to the following requirements: easy daries and I (intonational phrase) boundaries can be computeA modification of rules should be possible, and traces of rule without exhaustive syntactic analysis. The information needed is restricted to the difference between function and content application should be made visible. words, the category of fimction words, and the difference In GRAFON, rules can be easily modified both at the between verbs and other content words. All this information macro level (reordering, removing and adding rules) and the is accessible in the current implementation of GRAFON micro level (reordering, removing and adding conditions and through dictionary-lookup. actions). The scope (domain) of a rule can be varied as well. Possible domains at present are the syllable, the mor- pheme, the word and the sentence. Furthermore, the applica- tion of various rules to an input string is automatically traced and this derivation can be made visible. For each phonologi-
136 Gt a fen
tlvpher~atlon 0Ol, lons f'honemlse Input, Phonemlse File [}erivation Initialise l'a, ~e," (lotions I lyphenare lrlpul, 6rafon Loop Dialects Exit I~amples Options Parse Illptl~ 5holv Rules Statistics Uandah+ 3ob #h_otm.[_og[£al Ppl_e.~ _. ~e_~ n Ill ~lforl eannantl: f}t'oftm Loop 'POg, l~ES~tqE flSSnllLflflflll [] t~ ~> lint 9ieren van tle herfsl;~tarn ?EeREsSI~OE rlSSlml.fll 1011 [] E (El =: '' [lIE nell == *+ Villi == lie == +* IIER|:SI = +* SIORH) till I Ink I]E/lOl C l liE, [7~ ;*I 'KL:Ia 'V¢,111 dO +NEFOF+sLi)I'~III Give Oxalrl~ ~1o~101: ~l OSIU[~-I O "FRt Cfll IUE 1~ I! t '1118111 fl~iSlflllfllllal ...... 1~ ~, + t P~ {'9' esslv+ R~slnElat ~on) ---'If,IF f I/it- t}[gOli]Ri;--- )[ flElll IIR[ [ fJIl [] t~ !) l 1! FINN -OEVOICIN6 :llffl[ l)[:tdfJ[Cllffl [] [~ (~ 6'ot,el Ililtongisation 1) Pl OS/VE-fO FRith riVE I "I)EI.EI IUIt ~ E t t'l~ R-+BEIEl [fIN '111 fllfll Isni loll [_] fJ It I,l~letfon) P/~Lhl,qLISh 110N iiRius FILI. IIIG 1~ 1~ ht!t ft~t 3CIIIIll- IflSEP, I IOH [] [~ IllhlU541tklR~ !1USIEI4 I#EIIIICl [llll [] (';. CIt,st+, F?oduction) [ lIT [{Pl)OI;RI. 1 e COl CI II{~ 1~ [ <~ I)~9~lafnation) IN[[~UOC~I]UUOi[IW, IHIIEI " 11[ PlI/llOllG I Slql I {Ill I El 'i 141 I" rl UfJIVEL OWII IlIONGI5tl f ItJN- l ,'9,.++? tnrygto,!uLsmm+! 2 _ ~+ E , :" Schla rnsertlo,O gOt~f[ -OlPlllllON61ShI ION 2 Do It [] fiber t [2] PRI)GRESSlVE-FISSINII ~FI[)N ill af,),/ coranatl(l: Show l#ule~ RfGI~ES5IUf .-hSSIHIt.,q 1 ][]N l)r .3for) col~r1+~n(}: Ea{~DII)Z~5 l]f)t,l~r~s NhSN.-hfSItllt ~t 10N FxaI'q) l(~ for rule ~ UJ 0 e'-'~jm),: 5 J" P b t q k 9 s ,,,111z [ vx ~,,,t,OI ,,,J wh a"e~'kl IIJ, e i acrallll e I o o II y a :),~, t ul,;0o,~ . Figure 4. Snapshot of the user interface to GRAFON. Top lelt, the system has computed an internal representation and transcription of a seuteuce fragment. A derivation is also printed. In the centre of the display, a menu listing all phonological rules is shown. By clicking on a rule, the user gets a list of input phrases to which the rule has been applied (middle left). The same list of rules is also given in the top right mmm. This time, the application of individual rules can be blocked, and the result of this can be studied. The chart bottom right shows the fi'equencydistribution of phonemes for the current session, 9. CONCLUSIONS 10. REFERENCES ]{t seems that high quality phonemisation for Dutch can Allen, J., M.S. Hnnnicutt and D. Khttt. From 7~xt to be achieved otdy by incorporating enough linguistic Speech. Cambridge, UK: C.U.P., 1987. knowledge (about syllable boundaries, internal word boun- Berendsen, E., S. [,angeweg and H. van Leeuwen. 'Compu- daries etc.). GRAFON is a first step in this direction. tational Phonology: Merged, not Mixed.' Proceedings of Although it lacks some sources of knowledge (notably about COLING-86, 1986. sentence accent and syntactic structure), a transcription of Berendsen, E. and J. Don. 'Morphology attd stress in a high quality and accuracy can already be obtained, and the rule based grapheme-to-phonente conversion system for system was successfully applied in practical tasks like rule Dutch.' Proceedings European Conference on Speech testing, dictionary constrnction and spelling error correction. Technology, Edinburgh 1987. At present, we are working on the integration of a syn- Berkel, B. Van and K. De Smedt. 'Triphone attalysis: A tactic parser into GRAFON. This would make available the Combined Method for the Correction of Orthographical phonological phrase as a domaitt, and would make the com- and Typographical Errors.' Proceedings 2nd ACt, putation of natural intonation patterns possible (vsing e.g. the Applied Conference, 1988. algoritlun developed in Van Wijk and Kempen, 1987). The Boot. M., ?Iml, tekst, computer. Katwijk: Servire, 1984. alternatiw~ approach to the comptttation of phonological phrase boundaries /Kager and Quen6, 1987/ is also being Daelemans, W. 'GRAFON: An Object-oriented System for explored. Automatic Grapheme to Phoneme Transliteration and Phonological Rule Testing.' Memo, University of Another (more trivial) extension is the addition of Nijmegen, 1985. preprocessors for the verbal expansion of abbreviations and Daelemans, W. 'A Tool for the Automatic Creation, Exten- numbers, 'rite specifications of a lexical attalyser providing sion and Updating of Lexical Knowledge Bases.' this functionality were provided in Daelemans (1987b). An Proceedings of the Third ACL European Chapter overview of the system including the modules we are Conference, 1987a. presently working on is given in Figure 5, Daelemans, W. Studies in Language J'e&nology: An Object- Oriented Computer Model of Morpho-phonological Aspects of Dutch. Doctoral Dissertation, University of Leuven, 1987b. Daelemans, W. 'Automatic Hyphenation: Linguistics w~'rsus Engineering.' In: F. Stems and F.J. Heyvaert (Eds.), Worlds behind Words, forthcoming 1988. 137 LEXICALANALYSIS ) SYNTACTIC ANALYSIS --1 ((her gieren)(van de herfst~torm)) _ MORPHOLOGICAL ANALYSIS ~#het##gieren##lvan~#de##herfst#st°rm## SYLLABIFICATION 1 ##het~*~gieSren##Iven##de~*#herfst#storrn ~*# t oRD STRESSASSIGNMENT ..] #~'het##+gie$ren##1+van##de~*#+÷herf st#+st°rm J TRANSLITERATION MAPPINGS I PHONOLOGICAL RULES at 'Xi:ro'vo, n de 'hsraf,storem I. SENTENCE ACCENT ASSIGNMENT INTONATION CONTOUR COMPUTATION at 'Xi~r a'v&n do 'h~ref,storam J Figure 5. Processing modules in an extended versionof GRAFON. Kager, R. and H. Quen6. 'Deriving prosodic sentence struc- ture without exhaustive syntactic analysis.' Proceedings European Conference on Speech Technology, Edinburgh 1987. Kerkhoff, J., J. Wester and L. Boves, 'A compiler for imple- menting the linguistic phase of a text-to-speech conver- sion system'. In: Bennis and Van Lessen Kloeke (eds), Lh~guistics in the Netherlands, p. 111-117, 1984. Lammens, J.M.G. 'A Lexicon-based Grapheme-to-phoneme Conversion System.' Proceedings European Conference on Speech Technology, Edinburgh 1987. Pounder, A. and M. Kommenda. 'Morphological Analysis for a German Text-to-speech system.' COLING '86, 1986. Sejnowski, TA. and C.R. Rosenberg. 'Parallel Networks that Learn to Pronounce English Text.' Complex Systems 1, 1987, 145-168. Stanfill, C. and D. Waltz. 'Toward Memory-based Reason- ing.' Communications of the ACM, 29 (12), 1986, 1213-1228. Wijk, C. van and G. Kempen, 'From sentence structure to intonation contour'. In: B. Muller (Ed.), Sprachsyn- these. Hidesheim: Georg Olms Verlag, 1985, p. 157- 182. 138