Crowdsourced Mapping of Pronunciation Variants in European French
Yves Scherrer∗, Philippe Boula de Mareüil§, Jean-Philippe Goldman∗
∗ LATL-CUI, University of Geneva, Switzerland; § LIMSI-CNRS, Orsay, France

ABSTRACT

This study aims at renewing traditional dialectological atlases to provide a mapping of pronunciation variants by using crowdsourcing. Based on French spoken in France, Belgium and Switzerland, it focuses on mid vowels whose quality may be open or close and shibboleths such as final consonants which may be maintained or deleted, as a function of speakers’ background. Over 1000 subjects completed a questionnaire in which they were asked which one of the two possible pronunciations was closer to their most usual pronunciation. Their responses for 70 French words are displayed in the form of maps. This graphical layout enables the general public and phoneticians to readily visualise where phonological constraints such as the “loi de position” are violated.

Keywords: linguistic atlases, diatopic variation, French pronunciation

1. INTRODUCTION: MOTIVATION

The study of regional variation in pronunciation patterns is one of the oldest areas of phonetics and dialectology. In particular, large-scale linguistic surveys were carried out for many languages during the early 20th century. Most of these surveys contain extensive sections on phonetics and phonology. However, the resulting atlases only partially reflect today’s speech.

A case in point is European French. The Atlas linguistique de la France (ALF) [11] maps language varieties spoken in the late 19th century. In the second half of the 20th century, a series of follow-up projects were undertaken to document dying dialects in more detail; these projects are subsumed under the title Atlas linguistique de la France par régions (ALFR) [23] (see also [12]). Today, these dialects have mostly been replaced by regional French varieties. Unfortunately, no large-scale survey of current regional French has been undertaken yet.

Since Walter’s [26] phonological investigation, there have been proposals to fill this gap. Within the framework of the Phonologie du français contemporain (PFC) project [5, 6, 7], samples of spoken French covering a number of Francophone survey points (today 36) have been collected. Even though the data were used in various studies on regional variation [3, 4, 22], the sparse inquiry point network does not allow us to create an atlas in the form of the ALF. The TANLAF proposal [14] is more ALF-like, but its coverage is currently limited to urban areas of the northern third of France.

The goal of the present study is to examine whether online surveys, or more generally crowdsourcing [9], can be used to collect linguistic data of sufficient quality and regional diversity, with a fine-grained geographic coverage. Based on European French (i.e., the French language spoken in France, Belgium and Switzerland), this work focuses on the pronunciation of words which typically exhibit diatopic (geographic) variation. It is inspired by two surveys which have been conducted successfully in the last few years: the Harvard Dialect Survey (HDS) for American English [24], and the Atlas der deutschen Alltagssprache (AdA) for German [8]. These atlases, which involved thousands of participants, primarily address lexical variation but they include phonetic variation issues as well.

The next section introduces the sources of phonetic variation we investigated for European French. Section 3 presents the questionnaire we developed for collecting the data. Section 4 provides some results in the form of maps, and Section 5 discusses the crowdsourcing-based methodology and considers future work.

2. PHONETIC VARIATION IN EUROPEAN FRENCH

Several sources of diatopic phonetic variation at the word level have been identified: mid vowels, nasal vowels, the schwa vowel, the front/back /A/, vowel quantity, glides, final consonants… Given the merger of phonological oppositions related to the back /ɑ/ and the nasal vowel /œ̃/ in standard (Parisian) French [10, 15, 19, 18], we found it too difficult to get reliable information from the general public regarding these vowels. We also found vowel quantity, the schwa realisation/deletion and the glide diaeresis/synaeresis too speech rate-dependent. Hence, we focused on three pairs of mid vowels /e/~/ɛ/, /ø/~/œ/, and /o/~/ɔ/, but also included shibboleths which are emblematic of Belgian French, in particular the use of [w] (instead of the canonical /v/ or /ɥ/ in some words) and the realisation of final consonants.

The “loi de position”, which accounts for the tendency of mid vowels to be open in closed syllables and close in open syllables, does not apply in a straightforward manner to French [2, 25]. This constraint is often claimed to better apply to southern French, where the /ɔ/ tends to be close in words such as botté ‘booted’ and open in words such as sauf ‘except’ [3, 4]. Instead, some eastern French speakers may distinguish words such as peau~pot (/o/~/ɔ/) ‘skin’~‘pot’ and pronounce jeune ‘young’ with a close /ø/. This led us to build up a list of 70 words whose pronunciations may vary as a function of the speaker’s region. Further details are given in the following section.

3. THE QUESTIONNAIRE

A web-based questionnaire was designed using the Labguistic platform [21]. It is made up of three parts: in the first part, information about the participant is collected; the second part contains the actual phonetic questions; in the third part, the participant may provide feedback about the experiment. Completing the entire questionnaire takes about 10-15 minutes.

In the first part, the participant is asked about his/her linguistic curriculum:
• in which city the participant currently lives;
• in which department (France), province (Belgium) or canton (Switzerland) the participant spent most of his/her life;
• in which department, province or canton the participant grew up.

This information allows us to pool participants on three levels of granularity: on the city level (for the current residence only), on a fine-grained areal level (department/province/canton) and on a coarse-grained areal level (the 22 official regions for France, one region for the French-speaking part of Belgium and one region for the French-speaking part of Switzerland). Additional information is recorded about the participant, such as age and gender, and whether (s)he is a native speaker of French.

The second part of the questionnaire consists of the 70 items. For each item, displayed in French orthography, the participant hears a recording with two possible pronunciations (e.g., for the word chose ‘thing’, one with an open vowel and one with a close vowel). The participant is instructed to choose whether the first or the second pronunciation is closer to his/her own most usual pronunciation, by clicking on the button marked 1 or 2. A third button may be clicked by the participant if (s)he is unable to perceive a difference between the two pronunciations (in case of phonological deafness). The items as well as the two pronunciations per item are presented in random order. The participants may listen to the recordings as many times as they want. The samples were recorded by a phonetician from Paris, who did not utter final schwas.

According to the brief description given in the previous section, the 70 items cover the following phonetic contexts:
• /e/~/ɛ/ in open syllable (e.g. parfait ‘perfect’): 19 words, 6 of which form minimal pairs in standard French (e.g. épée~épais ‘sword’~‘thick’);
• /e/~/ɛ/ in closed syllable (e.g. père ‘father’): 4 words;
• /o/~/ɔ/ in open syllable (e.g. sot ‘foolish’): 2 words;
• /o/~/ɔ/ in closed syllable (e.g. chose ‘thing’): 28 words, 2 of which form a minimal pair in standard French;
• /o/~/œ/ in open syllable: 1 word (social ‘social’) to exemplify ‘o’-fronting [1, 3, 17, 20];
• /ø/~/œ/ in closed syllable (e.g. chanteuse ‘singer’): 10 words, 2 of which form a minimal pair in standard French (e.g. jeûne~jeune ‘fasting’~‘young’);
• initial /w/: 2 words (wagon ‘wagon’, huit ‘eight’);
• final /t/: 2 words (soit ‘either’, vingt ‘twenty’);
• final /s/: 2 words (moins ‘less’, encens ‘incense’).

In the final part of the questionnaire, the subjects are asked about the difficulty of the task: for instance, whether in some items both pronunciations sounded unfamiliar to them.

4. THE PARTICIPANTS

The survey was launched on October 1, 2014. It was advertised through mailing lists, social networks as well as personal and professional contacts. Also, university teachers in the relevant domains were contacted to pass the information on to their students. Within a few months, 1250 participants completed the survey and gave exploitable localisation information. 72% of participants are female and 42% work in language sciences. The age distribution of informants is the following: 12% under 20, 40% in their twenties, 15% in their thirties, 11% in their forties, 12% in their fifties and 10% over 60.

[Figure: map of the number of participants per region. Legend bins: 9-20, 21-30, 31-50, 51-70, 71-100, 101-246. Region labels: NPdC, BEL, Pic, Nor, IdF, Bre, Ch-A, Lor, Als, PdL, Cen, Bou, Fr-C, P-Ch, SUI, Lim, Auv, Rh-A, Aqu, M-P, PACA.]

The number of participants currently living in rural areas turned out to be very low, which had two consequences for our analyses. First, rather than focusing on the place of residence, we focused on the place where the participant spent most of his/her life. Second, we focused on the coarse-grained areal level to avoid data sparsity.
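Pooling answers on the coarse-grained areal level can be illustrated with a minimal Python sketch. It assumes a hypothetical list of (region, item, answer) records, where the region is the one in which the participant spent most of his/her life, and a hypothetical threshold `min_answers`. Neither the record layout nor the threshold is taken from the survey itself, and the share computed here is only one possible way of summarising the answers per region.

```python
from collections import defaultdict

# Hypothetical records: (region, item, answer). The answer is 1 or 2 for the
# two recordings, and 0 for "no audible difference". Illustrative data only.
responses = [
    ("IdF", "chose", 1),
    ("IdF", "chose", 1),
    ("IdF", "chose", 2),
    ("PACA", "chose", 2),
    ("PACA", "chose", 2),
    ("PACA", "chose", 0),
    ("BEL", "chose", 1),
]


def share_of_first_variant(records, item, min_answers=2):
    """Return {region: share of answer 1 among answers 1 and 2} for one item.

    Regions with fewer than `min_answers` usable answers are left out, which
    is one simple (assumed) way of guarding against data sparsity.
    """
    counts = defaultdict(lambda: [0, 0])  # region -> [count of 1, count of 2]
    for region, rec_item, answer in records:
        if rec_item == item and answer in (1, 2):
            counts[region][answer - 1] += 1
    shares = {}
    for region, (first, second) in counts.items():
        total = first + second
        if total >= min_answers:
            shares[region] = first / total
    return shares


shares = share_of_first_variant(responses, "chose")
```

With the illustrative data above, the single answer from BEL falls below the threshold and is left out, while the "no difference" answer from PACA is ignored when computing the share. Per-region shares of this kind are the sort of value that could then be coloured on a map.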