Proceedings, FONETIK 2008, Department of Linguistics, University of Gothenburg

Human Recognition of Swedish Dialects

Jonas Beskow2, Gösta Bruce1, Laura Enflo2, Björn Granström2, Susanne Schötz1 (alphabetical order)
1Dept. of Linguistics & Phonetics, Centre for Languages & Literature, Lund University, 2Dept. of Speech, Music & Hearing, School of Computer Science & Communication, KTH, Sweden

Abstract
Our recent work within the research project SIMULEKT (Simulating Intonational Varieties of Swedish) involves a pilot perception test, used for detecting tendencies in human clustering of Swedish dialects. 30 Swedish listeners were asked to identify the geographical origin of 72 Swedish native speakers by clicking on a map of Sweden. Results indicate, for example, that listeners from the south of Sweden are generally better at recognizing some major Swedish dialects than listeners from the central part of Sweden.

Background
This experiment has been carried out within the research project SIMULEKT (Simulating Intonational Varieties of Swedish) (Bruce, Granström & Schötz, 2007). Our object of study is the prosodic variation characteristic of seven different regions of the Swedish-speaking area: South, Göta, Svea, with Dala as a distinct subgroup, Gotland, North, and Finland Swedish. The seven regions correspond to our present dialect classification scheme.

Speech material
One of our main sources for analysis is the Swedish speech database SpeechDat (Elenius, 1999). SpeechDat contains speech recorded over the telephone from 5000 speakers, registered by age, gender, current location and self-labeled dialect type, according to Elert's suggested Swedish dialect groups (Elert, 1994), which is a more fine-grained classification with 18 regions in Sweden.

Introduction
Prosody, vowels and some consonant allophones are likely to be important when trying to decide where a person originates from. The aim of this work is to develop a method which could be of help in finding out how well Swedish subjects can identify the geographical origin of other Swedish native speakers. By determining the dialect identification ability of Swedish listeners, a foundation could be laid for further research involving dialectal clusters of speech. In order to evaluate the importance of the factors stated above for dialect recognition, a pilot test was put together using recordings of identical utterances from 72 speakers.

In the Swedish SpeechDat database, two sentences read by all speakers were added for their prosodically interesting properties. One of them was used in this experiment: Mobiltelefonen är nittiotalets stora fluga, både bland företagare och privatpersoner. 'The mobile phone is the big hit of the nineties both among business people and private persons.'

For this test, each of Elert's 18 dialect groups in Sweden was represented by four speakers, two female and two male, with an age span as wide as possible.

Subjects
30 subjects participated in the experiment, 12 female and 18 male, with an average age of 32 and 33 years, respectively. Subjects were placed in two groups depending on where the majority of their childhood and adolescence (0-18 years) had been spent. Seven females and eleven males grew up in the central part (Svealand), whereas five female and seven male subjects were raised in the southern part (Götaland) of Sweden.

Experiment
The test was made with the scripting language Tcl/Tk and carried out in Stockholm and Lund. The experiment comprised a dialect test, a geography test and a questionnaire.

In the dialect test, the SpeechDat stimuli were played in random order over headphones and could be repeated as many times as desired before answering by clicking on a map of Sweden.

The geography test included 18 Swedish towns presented one by one in written form, which were placed on the map in the same manner as for the dialect test. These towns are the most populated in each of Elert's dialect group areas.

Lastly, a questionnaire was filled out by all subjects, so as to provide information about e.g. age, gender and dialectal background.

Results
Subjects vary very much in their ability to locate speakers. In Figure 1, results for two listeners are displayed on the Swedish map. Dark dots mark the correct dialect locations and light dots the answers provided by the subjects. Figure 2 displays the results from the geography test in the same way. The two subjects were chosen as typical representatives of Svealand and Götaland. Both were males aged 25, but with different backgrounds. Subject 1 from Svealand was born and raised in Stockholm with parents from Stockholm and had been exposed to regional accents to a small extent. Subject 2 from Götaland was born and raised in Jönköping by parents from the same area.

Figure 1. Dialect test results for subject 1 from Svealand (left) and subject 2 from Götaland (right). Dark dots for correct locations are connected by lines with light dots for answers given by subject.

Figure 2. Geography test results for subject 1 from Svealand (left) and subject 2 from Götaland (right). Dark dots for correct locations are connected by lines with light dots for answers given by subject.

Figure 3. Dialect test results for speaker no. 1 from Svealand (left) and speaker no. 2 from Norrland (right). The dark dot for the correct location is connected by lines with light dots for the answers given by all subjects.

Speakers in the test vary considerably as to how consistently they are identified. An example is displayed in Figure 3, which shows where all subjects have placed speaker no. 1, a 51-year-old female from Täby, Svealand, and speaker no. 2, a 55-year-old female from Kiruna, Norrland.

Average placement errors
The average errors in dialect placement were computed as an arbitrary unit distance on the map. Figure 4 shows this mean for six different Elert dialect areas (four speakers in each area). The subjects are divided into Svealand and Götaland listeners (see the Subjects section).
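The averaging described under "Average placement errors" can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used in the experiment: it assumes that the error of one answer is the Euclidean distance between the clicked point and the speaker's correct location in map coordinates, and that means are taken per listener group and dialect area. The function name, the record layout and all coordinate values are hypothetical.

```python
from collections import defaultdict
from math import hypot


def mean_placement_errors(responses, targets):
    """Return {(listener_group, dialect_area): mean click-to-target distance}.

    Assumption: distance is Euclidean, in the map's own (arbitrary) units.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for r in responses:
        target = targets[r["speaker"]]
        tx, ty = target["xy"]
        # Distance between the listener's click and the correct location.
        distance = hypot(r["x"] - tx, r["y"] - ty)
        key = (r["group"], target["area"])
        sums[key] += distance
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Hypothetical speakers with invented map coordinates.
targets = {
    "spk1": {"area": "Stockholm", "xy": (300.0, 400.0)},
    "spk2": {"area": "Norrland", "xy": (320.0, 50.0)},
}

# Hypothetical listener answers (one click per listener and speaker).
responses = [
    {"group": "Svealand", "speaker": "spk1", "x": 303.0, "y": 404.0},
    {"group": "Svealand", "speaker": "spk2", "x": 320.0, "y": 150.0},
    {"group": "Götaland", "speaker": "spk1", "x": 300.0, "y": 410.0},
    {"group": "Götaland", "speaker": "spk1", "x": 306.0, "y": 408.0},
    {"group": "Götaland", "speaker": "spk2", "x": 260.0, "y": 130.0},
]

errors = mean_placement_errors(responses, targets)
for (group, area), value in sorted(errors.items()):
    print(f"{group:9s} {area:10s} {value:6.1f}")
```

Because the unit is arbitrary, only comparisons between groups and areas on the same map are meaningful; the absolute values carry no geographical interpretation unless the map scale is known.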

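[Figure 4: bar chart. Vertical axis: distance from correct location, arbitrary unit (0 to 350). Bars: Svealand and Götaland listeners. Horizontal axis: Elert dialect areas 18, 14, 8, 7, 5 and 1; area labels include Norrland (far north), Upper south, Gotland, Göteborg, Skåne and Stockholm.]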

Figure 4. Götaland and Svealand listeners’ average dialect location errors for speakers from six Elert dialect areas.


Discussion and future work
Our data suggest that Svealand listeners are less able to locate dialects, except their own and the accent from Dalarna, which is geographically nearby. It is probable that human listeners are better at identifying and locating dialects originating from their own dialectal area than those coming from more distant regions. However, the Götaland listeners were also good at locating Svealand speakers, possibly due to the great exposure of these dialects in the media. The high error values for the far north part of Norrland may be explained by the longer distances between towns and different-sounding dialects in that area, but also in part by the subjects' lesser exposure to northern accents.

These are some examples of results of the dialect location test. Further analysis of the data is planned in the near future, particularly using full statistical analysis. A possible extension is to use segmentally neutralized stimuli, to focus on the prosodic features of Swedish regional varieties. We also wish to use listener clustering as a tool in deciding which factors play the most important roles for distinguishing the different Swedish dialect types, which might lead to a modified dialect taxonomy.

Acknowledgements
This work is supported by a grant from the Swedish Research Council.

References
Bruce G., Granström B. and Schötz S. (2007) Simulating Intonational Varieties of Swedish. Proceedings of ICPhS XVI, Saarbrücken, Germany.
Elenius K. (1999) Two Swedish SpeechDat databases – some experiences and results. Proceedings of Eurospeech 99, 2243-2246.
Elert C.-C. (1994) Indelning och gränser inom området för den nu talade svenskan – en aktuell dialektografi. In Kulturgränser – myt eller verklighet (Edlund, L.E. (Ed.)). Umeå, Sweden: Diabas, 215-228.
