Proceedings, FONETIK 2009, Dept. of Linguistics, Stockholm University

How do Swedish encyclopedia users want pronunciation to be presented?

Michaël Stenberg
Centre for Languages and Literature, Lund University

Abstract

This paper about presentation of pronunciation in Swedish encyclopedias is part of a doctoral dissertation in progress. It reports on a panel survey on how users view presentation of pronunciation by transcriptions and recordings, so-called audio pronunciations. The following main issues are dealt with: What system should be used to render stress and segments? For what words should pronunciation be given (only entry headwords or other words as well)? What kind of pronunciation should be presented (standard vs. local, original language vs. swedicized)? How detailed should a phonetic transcription be? How should 'audio pronunciations' be recorded (human vs. synthetic speech, native vs. Swedish speakers, male vs. female speakers)? Results show that a clear majority preferred IPA transcriptions to 'respelled pronunciation' given in ordinary orthography. An even larger majority (90%) did not want stress to be marked in entry headwords but in separate IPA transcriptions. Only a small number of subjects would consider using audio pronunciations made up of synthetic speech.

Introduction

In spite of phonetic transcriptions having been used for more than 130 years to show pronunciation in Swedish encyclopedias, very little is known about the users' preferences and their opinion of existing methods of presenting pronunciation. I therefore decided to procure information on this. Rather than asking a random sample of more than 1,000 persons, as in customary opinion polls, I chose to consult a smaller panel of persons with a high probability of being experienced users of encyclopedias. This meant a qualitative method and more qualified questions than in a mass survey.

Method

A questionnaire made up of 24 multiple choice questions was compiled. Besides these, there were four introductory questions about age and linguistic background. In order to evaluate the questions, a pilot test was first made, with five participants: library and administrative staff, and students of linguistics, though not specializing in phonetics. This pilot test, conducted in March 2009, resulted in some of the questions being revised for the sake of clarity.

The survey proper was carried out in March–April 2009. Fifty-four subjects between 19 and 80 years of age, all of them affiliated with Lund University, were personally approached. No reward was offered for participating. Among them were librarians, administrative staff, professors, researchers and students. Their academic studies comprised Linguistics (including General Linguistics and Phonetics), Logopedics, Audiology, Semiology, Cognitive Science, English, Nordic Languages, German, French, Spanish, Italian, Polish, Russian, Latin, Arabic, Japanese, Translation Program, Comparative Literature, Film Studies, Education, Law, Social Science, Medicine, Biology and Environmental Science. A majority of the subjects had Swedish as their first language; however, the following languages were also represented: Norwegian, Dutch, German, Spanish, Portuguese, Romanian, Russian, Bulgarian and Hebrew.

The average time for filling in the 11-page questionnaire was 20 minutes. Each question had 2–5 answer options. As a rule, only one of them was to be marked, but for questions where more than one option was chosen, each subject's score was evenly distributed over the options marked. Some follow-up questions were not to be answered by all subjects. In a small number of cases, questions were mistakenly omitted. The percentage of answers for a certain option has always been based on the actual number of subjects who answered each question. For many of the questions, an opportunity for comments was provided. In a few cases, comments made by subjects have led to reinterpretation of their answers, i.e., if the choice of a given option does not coincide with a comment on it, the answer has been interpreted in accordance with the comment.

Questions and results

The initial question concerned the main motive for seeking pronunciation advice in encyclopedias. As might have been expected, a vast majority, 69%, reported that they personally wanted to know the pronunciation of items they were looking up, but, interestingly enough, for 13%, the reason was to resolve disputes about pronunciation. Others used the pronunciation advice to feel more secure in company or to prepare themselves for speaking in public.

When it came to the purpose of the advice given, almost half of the subjects (44%) wanted it to be descriptive (presenting one or more existing pronunciations). The other options were prescriptive and guiding, the latter principle being adopted by several modern encyclopedias.

For entries consisting of personal names, a striking majority, 97%, wanted pronunciation to be given not only for second (family) names, but also for first names, at least for persons who are always referred to by both names. This result is quite contrary to the prevalent tradition in Sweden, where pronunciation is provided exclusively for second names. Somewhat surprisingly, a majority of 69% wanted pronunciation (or stress) only to be given for entry headings, not for scientific terms mentioned later. Of the remaining 31%, however, 76% wanted stress to be marked in scientific terms, e.g., Calendula officinalis, mentioned either initially only or also further down in the article text.

Notation of prosodic features

The next section covered stress and tonal features. 46% considered it sufficient to mark main stress, whereas main plus secondary stress was preferred by 31%. The rest demanded even a third degree of stress to be featured. Such a system has been used in John Wells's Longman Pronunciation Dictionary, but was abandoned with its 3rd edition (2008).

70% of the subjects wanted tonal features to be displayed, and 75% of those thought it would suffice to show Swedish accent 1 and 2 and the corresponding Norwegian tonelag features.

A number of systems for marking stress exist, both within phonetic transcriptions in square brackets and outside these, in words written in normal orthography. In Table 1, examples of systems for marking stress in entry headings are given. However, subjects showed a strong tendency to dislike having stress marked in entry headings. As many as 90% favoured a separate IPA transcription instead. According to the comments made, the reason was that they did not want the image of the orthographic word to be disturbed by signs that could possibly be mistaken for diacritics.

Table 1 shows five different ways of marking stress in orthographic words that the panel had to evaluate. The corresponding IPA transcriptions of the four words would be [noˈbɛl], [ˈmaŋkəl], [ˈramˌløːsa] and [ɧaˈmɑːn].

Table 1. Examples of systems for marking main stress in orthographic words: (a) IPA system as used by Den Store Danske Encyklopædi, (b) Nationalencyklopedin & Nordisk Familjebok 2nd edn. system, (c) SAOL (Swedish Academy Wordlist), Svensk uppslagsbok & NE:s ordbok system, (d) Bra Böckers lexikon & Lexikon 2000 system, (e) Brockhaus, Meyers & Duden Aussprachewörterbuch system.¹

(a)   Noˈbel    ˈMankell    ˈRamlösa    schaˈman
(b)   Nobe´l    Ma´nkell    Ra´mlösa    schama´n
(c)   Nobel´    Man´kell    Ram´lösa    schama´n
(d)   Nobẹl     Mạnkell     Rạmlösa     schamạn
(e)   Nobẹl     Mạnkell     Rạmlösa     schama̱n

In case stress was still to be marked in entry headings, the subjects' preferences for the above systems were as follows:

(a): 51%
(b): 11%
(c): 9%
(d): 6%
(e): 20%

As the figures show, this meant strong support for IPA, whereas three of the systems widely used in Sweden were largely dismissed. System (e) is a German one, used in works with Max Mangold on the board of editors. It has the same economic advantages as (c), and is well suited for Swedish, where quantity is complementarily distributed between vowels and consonants in stressed syllables. System (d), which does not account for quantity, can be seen as a simplification of (e). It seems to have been introduced in Sweden by Bra Böckers Lexikon, a very widespread Swedish encyclopedia modelled on the Danish work Lademanns Leksikon, published from 1973 on and now superseded by Lexikon 2000. The only Swedish encyclopedia where solely IPA transcriptions in brackets are used appears to be Respons (1997–8), a minor work of c. 30,000 entries, which is an adaptation of the Finnish Studia, aimed at young people. Its pronunciation system was, however, conceived in Sweden.

It ought to be mentioned that SAOB (Svenska Akademiens ordbok), the vast dictionary of the Swedish language, which began to be published in 1898 (sic!) and is still being edited, uses a system of its own. The above examples would be represented as follows: nåbäl3, maŋ4kel, ram3lø2sa, ʃama4n. The digits 1–4 represent different degrees of stress and are placed in the same way as the stress marks in system (c) above, their position thus denoting quantity, from which the quality of the a's could, in turn, be derived. The digits also express accent 1 (in Mankell) and accent 2 (in Ramlösa). Being complex, this system has not been used in any encyclopedia.

Notation of segments

For showing the pronunciation of segments, there was a strong bias, 80%, in favour of the IPA, possibly with some modifications, whereas the remaining 20% only wanted letters of the Swedish alphabet to be used. Two questions concerned the narrowness of transcriptions. Half of the subjects wanted transcriptions to be as narrow as in a textbook of the language in question, 31% narrow enough for a word to be identified by a native speaker if pronounced in accordance with the transcription. The remaining 19% held that narrowness should be allowed to vary from language to language. Those who were of this opinion had the following motives for making a narrower transcription for a certain language: the language is widely studied in Swedish schools (e.g., English, French, German, Spanish), 47%; the language is culturally and geographically close to Sweden (e.g., Danish, Finnish), 29%; the pronunciation of the language is judged to be easy for speakers of Swedish without knowledge of the language in question (e.g., Italian, Spanish, Greek), 24%. More than one option had often been marked.

What pronunciation to present?

One section dealt with the kinds of pronunciation to present. An important dimension is swedicized vs. foreign, another one standard vs. local. Like loanwords, many foreign geographical names, e.g., Hamburg, London, Paris, Barcelona, have obtained a standard, swedicized pronunciation, whereas other ones, sometimes (but not always) less well-known, e.g., Bordeaux, Newcastle, Katowice, have not. The panel was asked how to treat the two types of names. A majority, 69%, wanted a swedicized pronunciation, if established, to be given, otherwise the original pronunciation. However, the remaining 31% would even permit the editors themselves to invent a pronunciation considered easier for speakers of Swedish in 'difficult' cases where no established swedicizations exist, like Łódź and Poznań. Three subjects commented that they wanted both the original and swedicized pronunciation to be given for Paris, Hamburg, etc.

In most of Sweden, /r/ + dentals are amalgamated into retroflex sounds, [ʂ], [ʈ], [ɖ] etc. In Finland, however, and in southern Sweden, where /r/ is always realized as [ʁ] or [ʀ], the /r/ and the dentals are pronounced separately. One question put to the panel was whether such sequences should be transcribed as retroflex sounds, as in the recently published Norstedts svenska uttalsordbok (a Swedish pronunciation dictionary), or as sequences of [r] and dentals, as in most encyclopedias. The scores were 44% and 50% respectively, with an additional 6% answering by an option of their own: the local pronunciation of a geographical name should decide. No one in the panel was from Finland, but 71% of those members with Swedish as their first language were speakers of dialects lacking retroflex sounds.

Particularly for geographical names, two different pronunciations often exist side by side: one used by the local population, and another, a so-called reading pronunciation, used by people from outside, and sometimes by the inhabitants when speaking to strangers. The latter could be described as the result of somebody who has never heard the name pronounced reading it and making a guess at its pronunciation. Often the reading pronunciation has become some sort of national standard. A Swedish example is the ancient town of Vadstena, pronounced [ˈvasˌsteːna] on site, elsewhere mostly [ˈvɑːdˌsteːna]. The reading pronunciation was preferred by 62% of the subjects, the local one by 22%. The remainder also opted for local pronunciation, provided it did not contain any phonetic features alien to speakers of standard Swedish.

For English, Spanish and Portuguese, different standards exist in Europe, the Americas and other parts of the world. The panel was asked whether words in these languages should be transcribed in one standard for each language (e.g., Received Pronunciation, Madrid Spanish and Lisbon Portuguese), in one European and one American pronunciation for each language, or whether the local standard pronunciation (e.g., Australian English) should as far as possible be provided. The scores obtained were 27%, 52% and 21% respectively. Obviously, the panel felt a need to distinguish between European and American pronunciations, which is done in Nationalencyklopedin. It could be objected that native speakers of the languages in question use their own variety, irrespective of topic. On the other hand, it may be controversial to transcribe a living person's name in a way alien to him-/herself. For example, the name Berger is pronounced [ˈbɜːdʒə] in Britain but [ˈbɜ˞ːgər] in the U.S.

Audio pronunciations

There were five questions about audio pronunciations, i.e. clickable recordings. The first one was whether such recordings should be read by native speakers in the standard variety of the language in question (as done in the digital versions of Nationalencyklopedin) or by one and the same speaker with a swedicized pronunciation. Two thirds chose the first option.

The next question dealt with speaker sex. More than 87% wanted both male and female speakers, evenly distributed, while 4% preferred female and 8% male speakers. One of the subjects opting for male speakers commented that men, or women with voices in the lower frequency range, were preferable since they were easier to perceive for many persons with a hearing loss.

Then subjects were asked if they would like to use a digital encyclopedia where pronunciation was presented by means of synthetic speech recordings. 68% were clearly against, and of the remaining 32%, some expressed reservations like 'Only if extremely natural', 'If I have to' and 'I prefer natural speech'.

In the following question, the panel was asked how it would most frequently act when seeking pronunciation information in a digital encyclopedia with both easily accessible audio pronunciations and phonetic transcriptions. No less than 71% declared that they would use both possibilities (which seems to be a wise strategy), 19% would just listen, whereas the remaining 10% would stick to the transcriptions.

This section concluded with a question about the preferred way to describe the speech sounds represented by the signs. Should it be made by means of articulation descriptions like 'voiced bilabial fricative', by example words from languages where the sound appears, as '[β] Spanish saber, jabón', or by clickable recordings? Or by a combination of these? The scores were approximately 18%, 52% and 31% respectively. Several subjects preferred combinations. In such cases, each subject's score was evenly distributed over the options marked.

Familiarity with IPA alphabet

In order to provide an idea of how familiar the panel members were with the IPA alphabet, they were finally presented with a chart of 36 frequently used IPA signs and asked to mark those they felt sure of how to pronounce. The average number of signs marked turned out to be 17. Of the 54 panel members, 6 did not mark any sign at all. The top scores were [æ]: 44, [ʃ] and [o]: both 41, [u]: 40, [ə]: 39, [a]: 37 and [ʒ]: 35. Somewhat surprisingly, [ʔ] obtained no less than 17 marks.

Discussion

Apparently, Sweden and Germany are the two countries where the need for pronunciation in encyclopedias is best satisfied. Many important works in other countries either do not supply pronunciation at all (Encyclopædia Britannica), or do so only sparingly (Grand Larousse universel and Den Store Danske Encyklopædi), instead referring their users to specialized pronunciation dictionaries. This solution is unsatisfactory because (i) such works are not readily available, (ii) they are difficult for a layman to use, (iii) you have to consult several works with different notations, and (iv) you will be unable to find the pronunciation of many words, proper names in particular.

An issue that pronunciation editors have to consider, but that was not taken up in the survey, is how formal or casual the presented pronunciation should be. It is a rather theoretical problem, complicated to explain to panel members if they are not able to listen to any recordings. Normally, citation forms are given, but it can be of importance to have set rules for how coarticulation and sandhi phenomena should be treated.

Another tricky task for pronunciation editors is to characterize the pronunciation of the phonetic signs. As one subject pointed out in a comment, descriptions like 'voiced bilabial fricative' do not tell you much unless you have been through an elementary course of phonetics. Neither do written example words serve their purpose to users without knowledge of the languages in question. It is quite evident that audio recordings of the written example words, in various languages for each sign and thus illustrating its phonetic range, would really add something.

The panel favoured original language pronunciation both in transcriptions (69% or more) and in audio recordings (67%). At least in Sweden, learners of foreign languages normally aim at a pronunciation as native-like as possible. However, this might not always be valid for encyclopedia users. When speaking your mother tongue, pronouncing single foreign words in a truly native-like way may appear snobbish or affected. Newsreaders usually do not change their base of articulation when encountering a foreign name. A general solution is hard to find. Since you do not know for what purpose users are seeking pronunciation advice, adopting a fixed level of swedicization would not be satisfactory. The Oxford BBC Guide to pronunciation has solved this problem by supplying two pronunciations: an anglicized one, given as 'respelled pronunciation', and another one, closer to the original language, transcribed in IPA.

Interim strategies

One question was an attempt to explore the strategies most frequently used by the subjects when they had run into words they did not know how to pronounce, in other words to find out what was going on in their minds before they began to seek pronunciation advice. The options and their scores were as follows:

(a) I guess at a pronunciation and then use it silently to myself: 51%
(b) I imagine the word pronounced in Swedish and then I use that pronunciation silently to myself: 16%
(c) I can't relax before I know how to pronounce the word; therefore, I avoid all conjectures and immediately try to find out how the word is pronounced: 22%
(d) I don't imagine any pronunciation at all but memorize the image of the written word and link it to the concept it represents: 11%

It can be doubted whether (d) is a plausible option for people using alphabetical script. One subject commented that it was not. Anyway, it seems that it would be more likely to be used by those brought up in the tradition of iconographic script. Researchers of the reading process might be able to judge.

The outcome is that the panel is rather reluctant to use Swedish pronunciation, even tentatively, for foreign words, like saying for example [ˈʃɑːkəˌspeːarə] for Shakespeare or [ˈkɑːmɵs] for Camus, pronunciations that are sometimes heard from Swedish children. Rather, they prefer to make guesses like [ˈgriːnwɪtʃ] for Greenwich, as is frequently done in Sweden.

Conclusion

Sweden has grand traditions in the field of presenting pronunciation in encyclopedias, but this does not mean that they should be left unchanged. It is quite evident from the panel's answers that the principle of not giving pronunciation for first names is totally outdated.

The digital revolution provides new possibilities. It allows for showing more than one pronunciation, e.g., one standard and one regional variety, since there is now space galore. Besides allowing audio recordings of entry headings, it also makes for better descriptions of the sounds represented by the various signs, by supplementing written example words in various languages with sound recordings of them.

IPA transcriptions should be favoured when producing new encyclopedias. The Internet has contributed to an increased use of the IPA, especially on Wikipedia, but since the authors of those transcriptions do not always have sufficient knowledge of phonetics, the correctness of certain transcriptions ought to be questioned. The extent to which transcriptions should be used, and how detailed they should be, must depend on the kind of reference book and on the group of users aimed at. Nevertheless, account must always be taken of the many erroneous pronunciations that exist and continue to spread, e.g., [ˈnætʃənəl] for the English word national, a result of Swedish influence.
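The scoring rule described under Method, where a subject who marks several options has a single vote split evenly among them and percentages are based only on the subjects who actually answered the question, can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used for the survey: the function name `tally`, the option labels and the sample responses are all hypothetical.

```python
from fractions import Fraction


def tally(responses, options):
    """Return the percentage score of each option for one question.

    responses: one set of marked options per subject; an empty set means
               the subject did not answer the question (assumed encoding).
    options:   all answer options offered for the question.
    """
    totals = {opt: Fraction(0) for opt in options}
    answered = 0
    for marked in responses:
        if not marked:
            continue  # unanswered: excluded from the percentage base
        answered += 1
        share = Fraction(1, len(marked))  # one vote, split evenly
        for opt in marked:
            totals[opt] += share
    if answered == 0:
        return {opt: 0.0 for opt in options}
    return {opt: float(100 * totals[opt] / answered) for opt in options}


# Hypothetical responses from five subjects to a three-option question.
sample = [{"a"}, {"a", "b"}, set(), {"b"}, {"c"}]
print(tally(sample, ["a", "b", "c"]))
```

Exact fractions are used for the running totals so that evenly split votes (halves, thirds) do not accumulate rounding error before the final conversion to percentages.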

Acknowledgements

I wish to thank all members of the panel for their kind help. Altogether, they have spent more than two working days on answering my questions, without being paid for it.

Notes

1. In Bra Böckers Lexikon and Lexikon 2000, system (d), i.e. dots under vowel signs, is used for denoting main stress also in transcriptions within brackets, where segments are rendered in IPA.
2. Also available free of charge on the Internet.

References

Bra Böckers Lexikon (1973–81 and later edns.) Höganäs: Bra Böcker.
Den Store Danske Encyklopædi (1994–2001) Copenhagen: Danmarks Nationalleksikon.
Duden, Aussprachewörterbuch, 6th edn., revised and updated (2005) Mannheim: Dudenverlag.
Elert, C.-C. (1967) Uttalsbeteckningar i svenska ordlistor, uppslags- och läroböcker. In Språkvård 2. Stockholm: Svenska språknämnden.
Garlén, C. (2003) Svenska språknämndens uttalsordbok. Stockholm: Svenska språknämnden. Norstedts ordbok.
Lexikon 2000 (1997–9) Malmö: Bra Böcker.
Nationalencyklopedin (1989–96) Höganäs: Bra Böcker.
Nordisk familjebok, 2nd edn. (1904–26) Stockholm: Nordisk familjeboks förlag.
Olausson, L. and Sangster, C. (2006) Oxford BBC Guide to pronunciation. Oxford: Oxford Univ. Press.
Respons (1997–8) Malmö: Bertmarks.
Rosenqvist, H. (2004) Markering av prosodi i svenska ordböcker och läromedel. In Ekberg, L. and Håkansson, G. (eds.) Nordand 6. Sjätte konferensen om Nordens språk som andraspråk. Lund: Lunds universitet. Institutionen för nordiska språk.
Svenska Akademiens ordbok (1898–) Lund: C.W.K. Gleerup.²
Svenska Akademiens ordlista, 13th edn. (2006) Stockholm: Norstedts akademiska förlag (distributor).²
Svensk uppslagsbok, 2nd edn. (1947–55) Malmö: Förlagshuset Norden.