Acceptability of harmonic mismatch for neutral vowel stems in Hungarian

Daniel Szeredi

August 29, 2012

1 Introduction

This paper investigates the relationship between fine phonetic detail and high-level morphological behavior by looking at Hungarian neutral vowels. Hungarian vowel harmony requires that alternating suffixes surface with a back vowel after a stem with a back vowel, and with a front vowel after a front rounded vowel. The behavior after stems with front unrounded (neutral) vowels is not entirely predictable: although front suffixing is productive, many stems with front unrounded vowels require a back vowel in the suffix (Hayes et al. 2009). The non-low front vowels [e:, i, i:] were reported by Benus and Gafos (2007) to be articulated with a more retracted tongue body when they are in a stem which takes back suffixes (an antiharmonic stem), even in isolation:

(1) si:v ‘heart’: harmonic stem, takes front suffixes (cf. dative si:vnEk), [i:] has a fronted tongue body
(2) ñi:l ‘arrow’: antiharmonic stem, takes back suffixes (cf. dative ñi:ln6k), [i:] has a more retracted tongue body

This result leads to a hypothesis that the morphological behavior – antiharmonicity – might be explained by the fine phonetic distinction between the retracted and the peripheral vowel. The aim of this paper is to follow up the articulatory experiment of Benus and Gafos (2007) by acoustic and perceptual studies which investigate whether these small articulatory differences in vowels are able to influence affix selection. The preliminary results of these studies do not support such a connection between fine phonetic detail and affix selection.

This section will lay out the relevant properties of the Hungarian vowel system and morphology. Previous studies which lay the groundwork for the hypotheses underlying the experiments in this paper and the interpretation of the possible outcomes will be discussed in section 2. Section 3 will describe an acoustic experiment to see whether there is a difference in backness between the vowel found in a harmonic and an antiharmonic stem in the speech of Hungarian speakers, and section 4 describes a wug experiment on the categorization of stems by Hungarian speakers based on a possible phonetic difference between a retracted and a full [i(:)]. Section 5 will attempt to interpret the results of the experiments based on the hypotheses sketched in section 2, and section 6 will conclude and give a roadmap for further research on the topic.

1.1 Hungarian vowel harmony

The Hungarian vowel system contains seven short and seven long vowels. While orthography might suggest that long vowels are straightforwardly paired up with the corresponding short vowel, the quality of the members of a pair does not always match, as in <a>:<á>, which correspond to [6]∼[a:]. The vowel system of Hungarian is presented below in Table 1 (cf. Siptár and Törkenczy 2000, p. 51, although they have /O/ for /6/):

              front                  back
        [-round]   [+round]
high    i  i:      y  y:             u  u:
mid     e:         ø  ø:             o  o:
low     E                            6  a:

Table 1: The Hungarian vowel system

The basic pattern of vowel harmony in Hungarian is governed by frontness. Most suffixes have a front and a back allomorph: stems with only back vowels select the back suffix and stems with only front vowels select the front suffix, as illustrated in (3–4). This is complicated by “mixed stems”: polysyllables with both back and front vowels. In these stems the final vowel determines the identity of the suffix, except when the final vowel is a front unrounded /E, e:, i/ or /i:/, which can be transparent. Examples in (5–6) illustrate the behavior of mixed stems where the final vowel is not transparent. The suffix attached in these examples is the dative, which alternates between the back allomorph [n6k] and the front allomorph [nEk]:

(3) back stem: h6jo: ‘ship’ ∼ h6jo:n6k, *h6jo:nEk
(4) front stem: tEtø: ‘roof’ ∼ tEtø:nEk, *tEtø:n6k
(5) mixed back stem: tEr6s ‘terrace’ ∼ tEr6sn6k, *tEr6snEk
(6) mixed front stem: Sofø:r ‘driver’ ∼ Sofø:rnEk, *Sofø:rn6k

1.2 Transparency

The work of Hayes et al. (2009) gives an insight into how transparency works in Hungarian. In their analysis, they only refer to front rounded vowels as front (F), and they call all front unrounded vowels neutral (N; back vowels are abbreviated as B). They found in a web search study (also cf. Hayes and Londe 2006) and in a wug test that while neutral vowels are indeed transparent to harmony, their transparency is constrained by two principles: the Height Effect and the Count Effect.

The transparency of the front unrounded vowels means that these vowels seem to be invisible to harmony: if a stem-final transparent vowel is preceded by a back vowel, the suffix will still be back. This is illustrated in the following examples:

(7) transparent BE stem: m6tEk ‘maths’ ∼ m6tEkn6k, *m6tEknEk
(8) transparent Be: stem: ka:ve: ‘coffee’ ∼ ka:ve:n6k, *ka:ve:nEk
(9) transparent Bi stem: koÙi ‘car’ ∼ koÙin6k

The Height Effect refers to the pattern that the higher neutral vowels are more likely to be transparent than the lower ones. Thus, while almost all monomorphemic stems with a final [i(:)] preceded by a back vowel take back suffixes (the only exceptions are compounds, as in (10)), stems with a final [e:] are more likely to allow front suffixes, as in (12). Stems with a final low [E] are mostly vacillating: both suffixes are accepted, as in (13), although the preferences vary between speakers and stems.

(10) front mixed Bi stem – only compounds: r6js+film ‘cartoon’ ∼ r6js+filmnEk, *r6js+filmn6k
(11) front mixed Be: stem – mostly compounds: l6g+be:r ‘rent’ ∼ l6g+be:rnEk, *l6g+be:rn6k
(12) vacillating Be: stem: 6te:n ‘Athens’ ∼ 6te:n:6k or 6te:n:Ek
(13) vacillating BE stem: fotEl ‘sofa’ ∼ fotEln6k or fotElnEk

The Count Effect refers to the observation that two neutral vowels at the end of a stem are less transparent together than a single neutral vowel is on its own. A stem ending in an [E] preceded by another neutral vowel and a back vowel will take exclusively front suffixes (14), while a stem with back vowels and a single final [E] usually vacillates, as in (13). In the same way, while a stem with back vowels and a final [i(:)] almost always takes back suffixes (as in (9)), stems with back vowels ending in two [i(:)] high front vowels are mostly vacillating:

(14) front BNE stem: luţifEr ‘Lucifer’ ∼ luţifErnEk, *luţifErn6k
(15) vacillating Bii stem: 6spirin ‘aspirin’ ∼ 6spirin:6k or 6spirin:Ek

1.3 Antiharmonicity

Most stems which contain only front unrounded (neutral) vowels take front suffixes, as illustrated in (16–18):

(16) front i stem: si:v ‘heart’ ∼ si:vnEk, *si:vn6k
(17) front e: stem: e:v ‘year’ ∼ e:vnEk, *e:vn6k
(18) front E stem: hEj ‘place’ ∼ hEjnEk, *hEjn6k

Only front suffixing is productive for neutral vowels: new stems entering the language, such as loanwords and acronyms, are always harmonic:

(19) liNk ‘hypertext link’ ∼ liNknEk ‘hypertext link.Dat’

However, certain stems with only neutral vowels take back suffixes. These stems will be labeled antiharmonic in this paper, to highlight the exceptionality (non-productivity) of this lexicalized pattern, in contrast with the productive pattern seen above in (19). Most antiharmonic stems are monosyllabic, as in (20–22), although there are disyllabic antiharmonic nouns (23–24), and a larger number of verbal stems formed with non-alternating derivational suffixes (25).

(20) antiharmonic i: stem: ñi:l ‘arrow’ ∼ ñi:ln6k, *ñi:lnEk
(21) antiharmonic i stem: fiNg ‘fart’ ∼ fiNgn6k, *fiNgnEk
(22) antiharmonic e: stem: ţe:l ‘target’ ∼ ţe:ln6k, *ţe:lnEk
(23) antiharmonic NN stem: dEre:k ‘waist’ ∼ dEre:kn6k, *dEre:knEk
(24) vacillating NN stem: fe:rfi ‘man’ ∼ fe:rfin6k or fe:rfinEk
(25) antiharmonic verbal NN stem: Simi:t ‘smoothen’ ∼ Simi:t6n6k ‘they smoothen’, *Simi:tEnEk; constructed by affixing the causative -i:t to the adjectival stem Sim6 ‘smooth’

The Height Effect described in Hayes et al. (2009) applies to antiharmonicity as well. There are no antiharmonic stems with the low [E] vowel in the language, and only a few with [e:], like (22) and (23). Antiharmonic stems with [i] and [i:] are more frequent, and it seems that stems with the long vowel and verbal stems are more likely to be antiharmonic. A quick survey using the Szószablya web corpus (Halácsy et al. 2004) shows that of the 23 verbal [i(:)] stems, 18 (78.3%) take back and 5 (21.7%) take front suffixes, while of the 145 nominal [i(:)] stems, 19 (13.1%) are antiharmonic and 126 (86.9%) are harmonic.

This paper will focus on the link between phonetic detail, more specifically the backness of the vowel, and the categorization of monosyllabic [i(:)] stems as harmonic or antiharmonic. Longer stems and lower stem vowels are excluded, as the Height Effect and the Count Effect on antiharmonicity mean that there are very few such antiharmonic lexical items.

2 Phonetic correlates of transparency and antiharmonicity

This paper focuses on the hypothesis that there are phonetic correlates of antiharmonicity on the stem vowel. The expectation is that an [i(:)] in an antiharmonic stem is less peripheral and more retracted than the same vowel in a harmonic stem.

This is a plausible hypothesis because in an antiharmonic stem, the vowel is more often flanked by back vowels, leading to coarticulation. It has been shown that two vowels separated by consonants have an articulatory and acoustic influence on each other (Öhman 1966). For example, the F2 formant transition between [ø] and the consonant in the utterance [øgA] shows movement towards a frequency characteristic of a back vowel, whereas F2 is even rising in the utterance [øgy]. The extent of coarticulation for a given VCV sequence is language-specific, and has been proposed to be the phonetic grounding for vowel harmony (Beddor et al. 2002). In the case of Hungarian antiharmonicity, coarticulation of an antiharmonic [i(:)] with the neighboring back vowels means that the retracted tongue body of the back vowels causes the tongue body to be less fronted for the [i(:)] than in other environments or in isolation.

If the hypothesis that an [i(:)] in an antiharmonic stem is more retracted than in a harmonic stem were true, then any or all of the following would be seen in Hungarian:

(26) articulatory differences: a vowel [i(:)] in an antiharmonic stem would be articulated with a slightly more retracted tongue body than the same vowel in a harmonic stem.
(27) acoustic differences: a vowel [i(:)] in an antiharmonic stem would have a lower F2 than the same vowel in a harmonic stem.
(28) perceptual effects on the categorization of a stem: as a lower F2 would be a cue for antiharmonicity, a wug stem with an [i(:)] with lower F2 would be more likely to be categorized as antiharmonic than the same wug stem with a peripheral [i(:)].

The following section introduces the experiment of Benus and Gafos (2007), who claim to have found articulatory differences between the harmonic and antiharmonic [i(:)]. Section 2.2 summarizes the possible explanations for this finding, and section 2.3 sketches the hypotheses that will be tested in the follow-up experiments in this paper, and the relation of these hypotheses to the possible analyses.

2.1 Articulatory effects

In their articulatory experiment, Benus and Gafos (2007) have reported a correlation between stem type and the backness of [i(:)]: their study shows that neutral vowels in antiharmonic stems are articulated with a more retracted tongue body than the same vowel in a harmonic stem. For example, the tongue body during the vowel in antiharmonic stems like [vi:v] ‘he is fencing’ and [i:r] ‘he is writing’ is more retracted than in corresponding harmonic minimal pairs like [i:v] ‘bow’ and [hi:r] ‘news’. They have also found the same effect for the final vowels in suffixed transparent stems: the [i:] in [z6fi:rb6n] ‘in sapphire’ is more retracted than in [zEfi:rbEn] ‘in zephyr’. The transparent stimuli were always suffixed, so the [i:] in these words was always flanked by back or front vowels on both sides, and the articulatory effect is easy to attribute to coarticulation; results on these stimuli will therefore be ignored below.

Benus and Gafos used ultrasound measurements and electromagnetic midsagittal articulometry (EMMA, Perkell et al. 1992, Stone 1997) to examine the articulatory characteristics of the vowels. In their EMMA study, 8 receivers were placed in a mid-sagittal plane, including two receivers on the tongue body and one on the tongue dorsum. Their stimuli were harmonic and antiharmonic monosyllabic words and trisyllabic suffixed transparent words embedded in the test sentence [6st mondom hoé X, e:S EliSme:tlEm 6st hoé X me:gEţ:Er] ‘I said X, and I repeat X once again’. They collected 4 repetitions of 16 monosyllables (8 harmonic–antiharmonic pairs, 6 with [i(:)] and 2 with [e:], 128 tokens in total) from two subjects, and 4 repetitions of 6 monosyllables (24 total) from a pilot test subject. Three tongue receivers were used in the analysis: the two on the tongue body and one on the tongue dorsum (for the pilot subject one on the tongue tip, one on the tongue body and one on the tongue dorsum). Ultrasound data were only analyzed for one subject, who took part in the EMMA experiment as well.

The study found that in monosyllabic stems, one receiver out of three for one subject (using one-way ANOVA, F(1,124)=4.005, p=0.048 on the second receiver on the tongue body), and all three for the other (F(1,116)=6.94, p=0.01 for the tongue dorsum; F(1,116)=11.403, p<0.001 for the second receiver on the tongue body; F(1,116)=7.453, p=0.007 for the first receiver on the tongue body) were significantly more retracted in antiharmonic stems. For the pilot subject, the tongue dorsum receiver was more retracted for the antiharmonic stem in 9 out of 12 antiharmonic–harmonic pairs, but no statistical analysis was made to see whether this is significant. The ultrasound measurements made on one subject showed the same effect, although it did not come out as significant at the α = 0.05 level (F(1,318)=2.915, p=0.089). As the monosyllabic stimuli were presented with no suffix, the differences in vowel backness cannot be attributed to coarticulation.

The most relevant claim of Benus and Gafos’ study for the current paper is that front unrounded vowels in antiharmonic stems are more retracted than in harmonic stems, and “these differences must be part of the speakers’ knowledge of these stems” (Benus and Gafos 2007, p. 286). According to their analysis, the behavior of Hungarian antiharmonic (and transparent) vowels might indicate that there is an underlying distinction between a retracted /i̠/ and a full /i/. This distinction might be encoded at any representational level, and it does not require a phonemic analysis; the main point is that the distinction between these two categories of high front unrounded vowel is present in the lexicon.

2.2 Possible explanations of retraction

Native Hungarian speakers’ intuition does not treat this distinction between a retracted and a full [i(:)] as phonemic. The only phonological argument for the distinction is antiharmonicity. These characteristics, together with the results of the articulatory experiment (as noted by Benus and Gafos 2007), indicate that the observation of this experiment bears the characteristics of incomplete neutralization and near merger. These phenomena will be described in this section.

In incomplete neutralization, a contrast between two (otherwise distinct) phonemes disappears phonologically in a given environment, but a small subphonemic difference is retained. These differences can be used by listeners to reconstruct what the underlying phoneme in a given word originally is. The best known example is word-final devoicing in German, Polish, Dutch, and other languages with this phenomenon, where experimental subjects were able to identify the underlying phoneme above chance using these phonetic cues. For example, Port and Crawford (1989) show that German word-final devoicing is not a complete neutralization: minimal pairs like /ra:d/ ‘bicycle’ and /ra:t/ ‘advice’ are both pronounced with a devoiced final consonant, but there are small but significant differences in the phonetic realization of the underlyingly voiced and voiceless segments. They also show that listeners in a perception study perform about 20% better than chance at recovering the underlying quality of the word-final obstruent based on these small phonetic distinctions.

Warner et al. (2004) have found the same acoustic effect in Dutch, and Ernestus and Baayen (2006) show in their perception study that listeners are able to pick up on these acoustic cues and use them to reconstruct the underlying laryngeal quality of the final obstruent, so that these cues are able to influence allomorph selection, which is based on the underlying [voice] feature of the stem. They had a native speaker read two almost identical lists of nonce verbal stems (imperative forms, which are identical to unsuffixed stems): in one list all word-final postsonorant obstruents were spelled as voiceless (e.g. daup), and in the other list these obstruents were spelled as voiced (e.g. daub). As expected, the acoustics retained some phonetic features related to voicing in the latter list. The participants of the perceptual study were divided into two groups: half of them listened to the voiceless list and half of them listened to the slightly voiced list. Their task was to write down the past tense form corresponding to the imperative form they heard. Based on Dutch morphology, stems ending in underlyingly voiceless obstruents take the -te [t@] allomorph, while stems ending in underlyingly voiced obstruents take the -de [d@] allomorph of the past tense suffix. The results have shown that participants who listened to the slightly voiced list gave significantly more voiced responses (e.g. daubde instead of daupte) than those who listened to the voiceless list (10.4% versus 7.2%, F(1,120)=31.11, p < 0.001).

Dmitrieva et al. (2010) present an experiment which shows the same incomplete neutralization effect in Russian word-final devoicing as seen in the studies on German and Dutch. They also point out that exposure to a second language (English) can strengthen the distinction between the two categories.

Compared to the phenomena in German, Dutch and Russian, Hungarian does not contrast retracted and full front vowels anywhere phonologically; the hypothesis of two distinct phonemes could only be justified by the antiharmonicity/transparency effect. As the incomplete neutralization literature relies on arguments from perception, categorization experiments based on perception (28) should show that Hungarian listeners are able to use small phonetic detail in higher-level categorization.

Near merger is similar to incomplete neutralization, but the explanation comes more from a diachronic or sociolinguistic point of view. In near merger, two phonemes that were distinct in an earlier stage of the language seem to have merged in a later stage (or there is a merged dialect and a non-merged dialect); however, speakers in the later stage still consistently produce small articulatory differences without being able to perceive the distinction (Labov et al. 1991). As for Hungarian, Benus and Gafos (2007) cite Kálmán (1972) for diachronic evidence that transparency and antiharmony originated in a full contrast between /i/ and /W/ in Old Hungarian, which was eliminated by the merger of the two phonemes. In the older stage, there was no antiharmonicity:

(29) Old and Modern Hungarian /si:v/ ‘heart’ ∼ /sivEk/ ‘hearts’
(30) Old Hungarian ?/ñW:l/ ‘arrow’ ∼ ?/ñWl6k/ ‘arrows’
(31) Modern Hungarian /ñi:l/ ∼ /ñil6k/

This explanation would say that Old Hungarian was always truly harmonic (29–30), and the merger of the two high unrounded vowels led to the development of antiharmonicity in Modern Hungarian (31), but the merger is not complete: there are articulatory differences, which the speakers cannot rely on in perception.

There is a serious problem with the analysis of Hungarian antiharmonicity as near merger: the evidence that Hungarian had a back /W/ is very weak, as most of the earlier literature on Hungarian diachronic linguistics relied heavily for its conclusions on the fact that antiharmonicity exists in Modern Hungarian. Kis (2005) examines the other arguments for the existence of a back unrounded vowel in previous stages of Hungarian and points out their weaknesses:

(32) the existence of back unrounded segments in any reconstructed proto-language from which Hungarian derives is very questionable
(33) the harmonicity of loanwords from Turkic languages which have a velar /W/ does not pattern consistently with the original Turkic vowel: Turkic /i/ corresponds to Hungarian antiharmonic /i/, and Turkic /W/ frequently corresponds to Hungarian harmonic /i/

(34) the reconstruction of a Proto-Turkic /i/ or /W/ often relies on evidence from Hungarian (the harmonicity of the loanword), if the contemporary languages have both /i/ and /W/
(35) there are no structuralist arguments for positing a back unrounded high vowel at any point in the history of the language
(36) the only evidence from the phonology of Modern Hungarian for /W/ is antiharmonicity

The analysis of the small articulatory effect as near merger would not need the acoustic and categorization experiments to yield a positive result, but as the diachronic background of the explanation is very weak, this analysis would be very doubtful.

A third possible explanation for the minor articulatory difference is a theory with rich lexical representation (Benus and Gafos 2007, p. 289), where a word is stored in every suffixed form with the exact phonetic detail a speaker encountered (Bybee 2001, Pierrehumbert 2001). When eliciting a word, a speaker utters a form which is close to the center of the phonetic space defined by these stored exemplars (modeling such a framework would call for a stochastic process, but this is not relevant here). In a model like this, the exemplars are connected to each other based on which lemma and which paradigmatic cell they belong to.

Thus, in a rich lexicon model like this, an antiharmonic stem like /ñi:l/ is connected to many exemplars which are suffixed with back vowels where the backness of the suffix has a coarticulatory effect on [i:]; whereas a harmonic stem like /si:v/ is not connected to so many coarticulated exemplars. When eliciting an unsuffixed form, the word [ñi:l] is more likely to be uttered with a more retracted vowel because the coarticulated exemplars shift the center of the phonetic realization of the whole stem towards retraction.

With an exemplar model, the result of the categorization experiment does not need to be positive, but a significant acoustic difference in the F2 of the vowel is expected, as the articulatory effect should be introduced to the acoustic space by the same principles – the exemplar cloud for the vowel being shifted towards a more retracted quality.

2.3 Hypotheses regarding the results of the follow-up experiments

There are two ways to interpret a possible finding that a small phonetic difference in vowel backness can influence the suffixation class of the stem. The main question is whether i) the phonetic cues can explain the higher-level effect, and they are the driving factor underlying antiharmonicity, or ii) the fine phonetic variation seen in the data is simply a low-level effect, probably the result of the morphophonological environment (the back vowels of the suffixes in the case of antiharmonic stems), without any consequences for the rest of the grammar.

Figure 1: The phonetics first hypothesis: exceptionality in the phonetics, not in the grammar

Figure 2: The morphology first hypothesis: exceptionality in the grammar

Focusing on the phenomenon under discussion here, there are two competing hypotheses:

(37) phonetics first: the phonetic difference explains the morphological behavior of the stem (Figure 1).
(38) morphology first: the phonetic effect is only an artifact of the morphological environment of the stem (Figure 2).

The phonetics first hypothesis states that the retracted nature of the vowel is in fact part of the speakers’ knowledge, and that they not only utilize this knowledge when they articulate an antiharmonic stem, but are also able to recover the lexical categorization of a stem containing a front unrounded vowel from it. In this case antiharmonicity (and transparency) can be explained by the presence of retraction, and it can be present in the representation of the vowel. Benus and Gafos (2007) argue for this case, and the incomplete neutralization analysis seems to be the proper formalization of this hypothesis. Transparency and antiharmonicity will not be an issue for the grammar: the problematic stems will be perfectly harmonic, with an underlying retracted high front vowel in the antiharmonic and back-transparent stems, and a full [i] elsewhere. The retracted [i̠] can phonetically still be front, but it could carry an underlying [+back] feature which is used by the grammar and causes retraction on the surface.

Under this hypothesis the representation of antiharmonicity would be marked on the vowel: there would be a retracted /i̠/ phoneme which is incompletely merged with /i/. Antiharmonic stems would contain the former, harmonic stems the latter phoneme underlyingly. On the surface, an underlying /i/ would most likely appear as a peripheral vowel, but the surface form of an underlying /i̠/ would fall somewhere in the continuum between [i] and [i̠]: the surface [i]∼[i̠] contrast would often be neutralized, but underlying /i̠/ would tend to be somewhat retracted:

(39) /si:v/ ∼ /si:vnEk/ → [si:vnEk]
(40) /ñi̠:l/ ∼ /ñi̠:ln6k/ → in the continuum between [ñi̠:ln6k] and [ñi:ln6k]

Under the morphology first hypothesis, the representation of antiharmonicity has to be analyzed as an exceptional lexical category. The retraction of the vowel can be explained by exemplar effects as described in the previous section. In this case, the grammar marks antiharmonicity on the stem as a lexical class marker, which has a corresponding markedness constraint militating against front suffixes, ranked above the main constraint governing vowel harmony:

(41) /si:v/ ∼ /si:vnEk/ → [si:vnEk]

(42) /ñi:lAH/ ∼ /ñi:lAHn6k/ → [ñi:ln6k]

Both ways of representing antiharmonicity have their origin in the literature: as in (39–40), Vago (1980) analyzes antiharmonicity on the segmental level in his SPE-style generative grammar (e.g. /si:v/ ‘heart’, but /ñW:l/ ‘arrow’). Other grammars have dismissed this and use lexical marking as in (41–42), either with a floating [+back] feature or with a morphological class marker (Nádasdy and Siptár 1994, Ringen and Vago 1998, Siptár and Törkenczy 2000).

A similar question of representation frequently arises for other phenomena in other languages. The question of Slavic ‘yers’ seems to be similar: the unpredictable vowel-zero alternation in certain Slavic morphemes is traditionally analyzed on the segmental level using abstract phonemes (yers) (e.g. Lightner 1972), but Gouskova (2012) gives an analysis of the same phenomenon using exclusively morpheme-level marking of exceptionality.

3 Acoustic study

An acoustic study was done to address the question of whether the small articulatory effects discussed by Benus and Gafos are present acoustically. These articulatory effects can cause acoustic differences in F2, although the connection between the two is not linear. If there is no acoustic effect, that would call into question an analysis of the phenomenon with a lexically stored distinction between a full and a retracted [i(:)], because there would be a serious problem with how this distinction is learned.

3.1 Methodology

The data analyzed in this paper comes from an experiment for an ongoing study by Sylvia Blaho and the author, which was designed to acquire data about the acoustics of transparent vowels and vowels in antiharmonic stems and judgments about transparency in two dialects of Hungarian (Blaho and Szeredi 2012)1.

3.1.1 Participants

Twelve subjects participated in the study: 7 from Budapest, who speak the standard colloquial language, and 5 from Párkány (officially Štúrovo in Slovak), a small town of 10 851 inhabitants in Slovakia (Štatistický úrad Slovenskej republiky [Statistical Office of the Slovak Republic] 2011). The town is only 42 kilometers northwest of Budapest and the majority of its inhabitants are Hungarian speakers. The town is in the Palóc dialect region (Kiss 2001), and speakers have preserved many archaic non-standard dialectal characteristics due to the separation of Párkány from the majority of the Hungarian-speaking community since 1920, when the town was awarded to Czechoslovakia. Since 2001, when a new bridge was built between Párkány and the Hungarian town of Esztergom, exposure to the standard language has increased, and it accelerated even more after the two countries were admitted to the European Union in 2004, and after 2008, when border control between the two countries was abolished due to the Schengen Agreement.

The main difference between the vowel system of Standard Hungarian and Palóc is the presence of a short mid front unrounded vowel /e/, which has merged with /E/ in the standard (see Table 2, compared to Table 1). The presence of this vowel is expected to have consequences for the behavior of transparency, but it is unlikely that it would alter the behavior of antiharmonic stems with /i(:)/. All stimuli for antiharmonicity are unquestionably antiharmonic in both dialects.

3.1.2 Stimuli

The stimuli of the experiments included target words which tested harmonic and antiharmonic neutral vowels.2 The stimuli used in this paper are shown in Table 3.

1 Our research was supported by Marie Curie grant nr. MB08-B 82438 and the Research Council of Norway’s YGGDRASIL grant nr. 211286.
2 These results are only a part of a larger experiment, which included target words testing transparency, but as this paper does not discuss transparency, the results for these are not analyzed here.

              front                  back
        [-round]   [+round]
high    i  i:      y  y:             u  u:
mid     e* e:      ø  ø:             o  o:
low     E                            a  O:

Table 2: The Palóc vowel system used in Párkány. The additional phoneme (marked with an asterisk) is /e/.

These stimuli were based on the target words of Benus and Gafos (2007), but several smaller changes were made for the acoustic experiment, either so that the segmentation of the vowels would be easier (e.g. [bi:r] instead of [i:r], [vi:z] instead of [i:z]), or so that the target words could be paired up with each other more simply (e.g. [Ùi:p] instead of [ţi:m] to match up with [Si:p]). The tokens marked with an * are the new words used in this experiment.

harmonic                  antiharmonic
i:v   ‘spline’            vi:v  ‘fence.3sg’
hi:r  ‘news’              bi:r* ‘carry.3sg’
vi:z* ‘water’             hi:d  ‘bridge’
Ùi:p* ‘pinch.3sg’         Si:p  ‘whistle’
his   ‘believe.3sg’       ñit   ‘open.3sg’
e:j   ‘night’             he:j  ‘crust’
se:l  ‘wind’              ţe:l  ‘aim’

Table 3: Stimuli of the experiment to test antiharmonicity

3.1.3 Presentation

The experiment was conducted by seating the subjects in a quiet room and presenting them with a randomized list of all the stimuli in Table 3 three times, embedded in the frame sentence [6st mont6m X, mEgiSme:tlEm X] ‘I said X, I repeat X’. The randomization and the presentation were conducted with the web-based experiment presentation software Experigen (Becker and Levine 2010).

The recordings were made using a Shure WH 30 head-mounted condenser microphone for the Párkány speakers and one Budapest speaker, and a Sennheiser ME3-ew head-mounted condenser microphone with an Alesis io|2 preamplifier for the other Budapest speakers. The recordings were made in Praat (Boersma and Weenink 2010) and Audacity (Audacity Team 2009).


Figure 3: Segmenting the sentence [6st mont6m Ùi:p, mEgiSme:tlEm Ùi:p]

3.1.4 Data Analysis

Segmentation and measurements were made using Praat (Boersma and Weenink 2010).3 The quality of the vowel was measured at its midpoint; the measurements taken were the F1, F2 and F3 formants, F0 and intensity. Statistical analysis of the results was performed in R (R Development Core Team 2012). An example of the segmentation is shown in Figure 3.

Unfortunately, the recording conditions were not perfect and there was some noise in the recordings, which caused the formant tracker to occasionally find an F2 between F1 and the actual F2 curve. These ‘phantom’ F2 measurements were corrected by substituting the F2 value with the measured F3 value (the actual second formant) for all data where the measured F2 was below 1000 Hz for males and 1500 Hz for females. This had to be amended by repeating the protocol for two high-pitched female speakers for all F2 values below 1750 Hz. The resulting F2 distribution is highly negatively skewed for male speakers (s3 = −1.16, Z = −6.97, p < 0.001), suggesting that some erratic tokens might still be in the data. The sample for female speakers, however, has a small, although significant, positive skewness (s3 = 0.41, Z = 3.48, p < 0.001), indicating that this manipulation might have introduced a slight floor effect, but not enough to introduce a high positive skew into the data.
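
A minimal sketch of this correction and of the skewness check in R is given below; the data frame name d, the helper function names and the speaker codes are assumptions for illustration only (the column names f2, f3, gender and speaker correspond to the coding described in section 3.2), not the actual analysis script.

  fix_phantom_f2 <- function(d, high_pitched = character(0)) {
    cutoff <- ifelse(d$gender == "male", 1000, 1500)  # gender-specific cutoffs (Hz)
    cutoff[d$speaker %in% high_pitched] <- 1750       # the two high-pitched female speakers
    phantom <- d$f2 < cutoff                          # tracker picked up a spurious F2
    d$f2[phantom] <- d$f3[phantom]                    # measured F3 is the actual second formant here
    d
  }

  skew <- function(x) mean((x - mean(x))^3) / sd(x)^3     # sample skewness (s3)
  # d <- fix_phantom_f2(d, high_pitched = c("f07", "f11"))  # hypothetical speaker codes
  # tapply(d$f2, d$gender, skew)                            # skewness of corrected F2 by gender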

3 Acknowledgements to our RA, Ádám Szalontay, who conducted the segmentation.

overall:  23.388
gender:   male 5.765        female 15.022
dialect:  Budapest 18.644   Párkány 29.866
speaker:  1: 59.308    2: 19.922    3: 2.851     4: -9.523    5: -9.907     6: -1.21
          7: 32.496    8: 59.134    9: -0.612   10: 60.968   11: -18.134   12: -7.819

Table 4: Mean differences in F2 between harmonic and antiharmonic stems. Positive values denote higher F2 values for harmonic stems (the expected direction)

3.2 Results

Every token was coded with the following factors: f1, f2, f3 as measured variables, and speaker, gender, word (the item-wise grouping factor), repetition, harmony and dialect as item- or speaker-specific factors.

As each sentence contained the target word twice, the first one was coded with 1, and the second one with 2 in the repetition factor. The dialect factor determined whether the speaker was from Budapest or P´ark´any. The key independent variable of the analysis was harmony, which was set as front for the harmonic, and back for the antiharmonic stems.

In the analyses below, only f2 was used from the measured variables as the dependent variable: if antiharmonic stems were acoustically different from harmonic stems, a main effect of harmony should be significant in the statistical analysis. The possibility that vowel height is also affected was also entertained; however, even a less conservative mixed effects ANOVA with harmony as a within-subjects factor and subject as a between-subjects factor did not indicate that harmony is a significant effect for f1 (F(1,11)=0.19, p=0.67), and mixed effects regression also showed that harmony type does not affect F1 (β = 7.623 Hz, SE = 19.122, t=0.339 for the saturated model; χ2(1) = 0.158, p=0.69 with backwards stepwise term elimination).

The grand mean difference between harmonic and antiharmonic stems in the data was 23.388 Hz. Some other relevant mean values for this difference are shown in Table 4. To find out whether these differences are significant overall, or for some of the speakers, more sophisticated statistical analyses were carried out, presented in the following sections.
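
As a sketch, the values in Table 4 are simple differences of group means over the corrected data; the data frame name d is again an assumption (the front/back coding of harmony follows the description above).

  mean_diff <- function(x) with(x, mean(f2[harmony == "front"]) - mean(f2[harmony == "back"]))

  mean_diff(d)                                       # overall difference (23.388 Hz in these data)
  sapply(split(d, d$gender),  mean_diff)             # by gender
  sapply(split(d, d$dialect), mean_diff)             # by dialect
  round(sapply(split(d, d$speaker), mean_diff), 3)   # by speaker, cf. Table 4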

Figure 4: Between-speaker effect of harmony on f2 (pink = male speakers, blue = female speakers).

3.2.1 Regression analysis on the whole dataset

The very small effect size, the apparently large between-subjects variation, which is hard to attribute to gender and dialect alone, and the lack of an item-wise random effect in the comparisons above justify a more detailed investigation of the data. The amount of between-speaker variation can be seen in Figure 4. A linear mixed effects regression was fit using the lme4 package for R (Bates et al. 2011).

The fully saturated model was the following:

(43) f2 ∼ harmony * gender * dialect * repetition + (1 + harmony * repetition | speaker) + (1 + gender * dialect * repetition | word)

The model does not contain random slopes for gender and dialect by speaker, because these variables are hierarchically nested under speaker, so a value for a random slope for gender=male for a given male speaker would not be interpretable, and the between-speaker variation would be vaguely distributed between the random intercept and these nested random slopes. The same reasoning led to the exclusion of a random slope for harmony by word.
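
In lme4 syntax (Bates et al. 2011), fitting the saturated model (43) would look roughly as below; the data frame name d and the maximum likelihood fit (REML = FALSE, convenient for the likelihood comparisons used later) are assumptions, not a reproduction of the original analysis script.

  library(lme4)

  m_sat <- lmer(f2 ~ harmony * gender * dialect * repetition +
                  (1 + harmony * repetition | speaker) +
                  (1 + gender * dialect * repetition | word),
                data = d, REML = FALSE)
  summary(m_sat)   # fixed-effect estimates and t values, as discussed below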

Investigating the coefficients of this model leads to questioning the overall significance of harmony in the data: the coefficient of harmony=front is β = 23.836 Hz with a standard error of 27.062 Hz (t=0.88). There are non-significant interactions with a positive sign – meaning that they would imply a preference for a higher F2 with harmonic stems if they were significant. This is seen in the data with males (harmony=front & gender=male: β = 24.87, SE=59.28, t=0.42), speakers from Párkány (harmony=front & dialect=sk: β = 39.39, SE=48.86, t=0.81), and in the four-way interaction of Párkány male speakers in their second repetition (harmony=front & gender=male & dialect=sk & repetition=2: β = 72.413, SE=94.8, t=0.76).

Figure 5: Random slope coefficients for harmony=front by speaker in the saturated model.

There are also non-significant interactions with a negative sign: in the second repetition (harmony=front & repetition=2: β = −5.364, SE=28.866, t=0.19), with male speakers from Párkány (harmony=front & dialect=sk & gender=male: β = −86.8, SE=83.04, t=1.05), males in the second repetition (harmony=front & gender=male & repetition=2: β = −27.4, SE=69, t=0.4) and speakers from Párkány in the second repetition (harmony=front & dialect=sk & repetition=2: β = −14.12, SE=60.34, t=0.23).

The only significant effect for the saturated model judging by t values was gender=male: β = −642.184, SE=158.09, t=4.06.

The random slope coefficients assigned to harmony by speaker are shown in Figure 5. Most speakers seem to have a positive coefficient, albeit not a significant one. Speaker no. 9 has a uniquely high coefficient, which is in turn offset in the random slope of harmony:repetition, for which speaker no. 9 has a significantly low coefficient, while the other speakers’ coefficients hover around 0. This irregularity might be due to formant tracking errors which happened to cluster such that unusually low F2 values were measured for antiharmonic stems and unusually high F2 values for harmonic stems in the first repetition. This effect of noise can be seen in Figure 6.

Figure 6: The effect of some outliers: F2 measurements of speaker no. 9 by repetition. Some outlying measurements for the first repetition (on the left side) strongly affect the harmony:repetition interaction for this speaker.

Having examined the saturated model, a backward stepwise term elimination procedure was carried out, in which the highest-level interactions were eliminated if their removal did not affect the likelihood of the model significantly. This was iterated until only the main effects were

in the fixed effects structure, as no interactions came out as significant. The random effects structure was simplified using the same method, removing not only interactions but random slopes as well. A significant random slope that was not removed from the model was harmony | speaker (χ2(3) = 39.288, p< 0.001); its pattern is seen in Figure 7, with speaker no. 9 still having a significantly positive coefficient even though the interaction term had been eliminated. The other random slope term not removed from the model was dialect | word, which was only marginally significant at the α = 0.1 level (χ2(4) = 7.822, p=0.098), but conservatism dictates that this term should be kept. The resulting baseline model was as follows:

(44) f2 ∼ harmony + gender + dialect + repetition + (1 + harmony | speaker) + (1 + dialect | word)

The significance of the main effects was tested again by comparing the baseline model to a model with a given term removed, and checking whether the likelihood worsened significantly. Again, harmony was not a significant main effect (χ2(1) = 2.01, p=0.156), nor were dialect (χ2(1) = 1.245, p=0.26) and repetition (χ2(1) = 0.0591, p=0.81). The factor gender was, of course, still significant (χ2(1) = 17.44, p< 0.001). By removing the non-significant main effects of dialect and repetition from the fixed effects structure of the baseline model, the χ2 score of harmony can be raised further, to 2.385 (p=0.122).
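
Continuing the sketch above (lme4 already loaded), one step of this procedure – refitting the baseline model (44) without a given term and comparing it to the full baseline with a likelihood ratio test – could look as follows; object names are illustrative only.

  m_base <- lmer(f2 ~ harmony + gender + dialect + repetition +
                   (1 + harmony | speaker) + (1 + dialect | word),
                 data = d, REML = FALSE)
  m_no_harmony <- update(m_base, . ~ . - harmony)
  anova(m_no_harmony, m_base)   # likelihood ratio test (1 df) for the harmony main effect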

Figure 7: Random slope coefficients for harmony=front by speaker in the final baseline model.

Summarizing the results above, no acoustic difference was found corresponding to the articulatory effect described by Benus and Gafos (2007), which would lead to antiharmonic stems being more retracted. Although the direction of the effect seems consistent with the hypothesis, its size is very small, and it is not significant when

analyzing the data with linear mixed effects regression, thus assigning random variance to subject-wise and item-wise random effects. It is also visible from the study of random slopes by speaker that not all of the subjects show the same pattern: some speakers actually do not go in the expected direction.

3.2.2 Regression analysis on subsets of the dataset

The sign of the coefficient for harmony was in the expected direction, but no significance was reached. This might be because the variance is dispersed over too many random and fixed terms while the number of subjects is still rather small. This motivates a further dissection of the data by investigating reasonable subsets (by gender and dialect). Although analyzing subsets of the overall data with the same methodology might seem to be a step in the wrong direction, these analyses might yield closer insight into whether there is a pattern once potentially confounding factors are excluded.

The methodology for the analysis of subsets was the same as the analysis of the overall data: linear mixed effects regression models with backwards stepwise term elimination. The saturated model for subsets by dialect was:

(45) f2 ∼ harmony * gender * repetition + (1 + harmony * repetition | speaker) + (1 + gender * repetition | word)

For Párkány speakers (N=5), all interaction and random effect terms were eliminated except for repetition | speaker, which was a significant random effect (χ2(3) = 27.967, p< 0.001). This might be due to the erroneous measurements seen with speaker no. 9. In the resulting baseline model, the removal of harmony does not yield a significant worsening in the likelihood of the data (χ2(1) = 0.764, p=0.38). Strangely, after removing only the interaction terms, but leaving the random effect structure saturated, harmony comes closer to significance: χ2(1) = 2.525, p=0.112.

When analyzing the Budapest speakers (N=7), all interaction terms and all random slopes turned out to be non-significant. In the remaining model, gender and repetition both turned out to be significant, and harmony is almost significant even with this conservative methodology: χ2(1) = 3.811, p=0.051.

The saturated model for the subsets by gender was:

(46) f2 ∼ harmony * dialect * repetition + (1 + harmony * repetition | speaker) + (1 + dialect * repetition | word)

For female speakers (N=4), no interaction terms were significant, but repetition|speaker (χ2(3) = 15.184, p=0.002) and dialect|word (χ2(3) = 9.055, p=0.028) were significant in the random effect structure, so they were retained in the baseline model. None of the three main effects turned out to be significant on their own, although if dialect|word were removed from the model despite its significance, harmony would close in on significance (χ2(1) = 3.102, p=0.078); to do so, however, would be a serious violation of the methodology followed in this paper. For male speakers (N=8), only repetition|speaker (χ2(3) = 20.673, p< 0.001) was significant among the interactions and random slopes. Again, no main effects turned out to be significant.

3.2.3 Discussion

The results have shown again that the articulatory effect found by Benus and Gafos (2007) is not reflected in the acoustic data. Although the effect of a lower F2 in antiharmonic stems can be found in some subsets of the data, contradicting patterns can be found as well. Using the less conservative approach of working on subsets of the data, which might inflate the overall α probability for Type I errors, F2 comes close to being significantly influenced by the category of the stem for speakers from Budapest. However, when conservatively looking at the overall data, this significance disappears, as the relevant variance in the data is assigned to different random effects. This fact, and the very small size of the mean difference, suggest that the effect of the articulatory difference explored by Benus and Gafos (2007) is not there in the acoustics.

The finding that some speakers do show an effect in the expected direction could indicate, however, that the possibility of an acoustic difference being present in a subset of Hungarian speakers can still be considered. Two questions arise if there is such an effect: if these small acoustic differences are there, how are they perceived (can they be perceived at all), and can they be learned?

3.3 Just noticeable difference

There are several ways to address the question of perceptibility. One is to devise experiments on whether Hungarian speakers can perceive a minor difference in F2 for a front unrounded vowel, and on whether they are able to use such a difference in the categorization of stems. These experiments are currently being planned and run, respectively.

The other way is to find out what previous studies have found to be the minimal perceptible difference for front high vowels. The usual name for this measure is the just noticeable difference, abbreviated as JND, although some studies use other names, like Difference Limen, for similar measures.

Mermelstein (1978) reports a Difference Limen, based on an AX discrimination task using the vowel [I] (with F2 = 2100 Hz), of 142 Hz in isolation, 174 Hz between labial stops and 199 Hz between velar stops. The subject with the lowest DL in F2 could discriminate a 79 ± 15 Hz difference. In a study about learning discrimination after 2–3 days of training and testing, Kewley-Port (2001) reports a JND for F2 in the range of 15–22 Hz for vowels in isolation and 46–62 Hz for vowels in syllables, when the vowel is [I] with an F2 of 2300 Hz. She also claims, though, that without training or paying much attention, American English speakers have a discrimination threshold of ≈111 Hz (0.3 bark) for a vowel with a 2400 Hz F2.

Pisoni (1975) studies the uncertainty in the categorization between [i] (F1=270, F2=2300, F3=3019) and [I] (F1=374, F2=2070, F3=2666). The midpoint of this uncertainty is at F1=315 (δ45), F2=2180 (δ120) and F3=2836 (δ183). In another discrimination task, Aaltonen et al. (1997) investigate the ability of Finnish speakers to categorize a vowel as [y] or [i] by presenting them with a vowel whose F2 was taken from a continuum ranging from 1520 to 2966 Hz, with all other formants fixed. The boundary between the two categories was around 2000 Hz. They report a difference between two kinds of subjects: ‘good categorizers’, whose boundary width (the region between 75% /y/ responses and 75% /i/ responses) was 112 Hz (SD=39.5 Hz, range 68–182 Hz), and ‘poor categorizers’ with an uncertainty boundary width of 339.5 Hz (SD=115.5 Hz).

These results indicate that it is not likely that speakers will be able to perceive a difference below about 110 Hz, although Kewley-Port (2001) shows that training can lower this JND to around 50 Hz in words, and Aaltonen et al. (1997) show that there are significant differences in the population regarding the ability to discriminate between vowels with fine phonetic differences. The learnability of the Hungarian pattern may depend on these two observations: the presence of vowel harmony based on an F2 distinction may “train” Hungarian speakers to be more attuned to small differences in F2 (although Aaltonen et al. 1997 also worked with an F2-harmony language). It is also visible that some speakers show an effect of around 50 Hz, which might mean that a difference larger than the actual mean effect size can be encountered and learned in language acquisition.

4 Rating task

The experiment presented in this paper was planned to explore the morphophonological side of the phenomenon of fine phonetic details in Hungarian neutral vowels. The concrete aim of this study was to find out whether Hungarian speakers use small acoustic differences in the signal to categorize a stem as harmonic or antiharmonic.

Having seen that the acoustic difference is below the JND as agreed on in the literature, it is not clear how such a small difference can be learned or utilized in higher-level categorization. However, a weak tendency towards a slight retraction of [i] exists, as explored by Benus and Gafos (2007) and the acoustic study above, so it might be possible that Hungarian speakers are able to use this very small phonetic difference in the categorization of a stem as harmonic or antiharmonic. If they utilize the small effect of retraction in perception, then it can be hypothesized that a mismatch between the quality of a stem vowel and the following suffix will be perceived as odd; consequently, a mismatched stimulus will be rated as less well-formed than a non-mismatched stimulus. For example, ratings of the four conditions with a nonce verbal stem like /zip/ would show the pattern in Table 5 and in Figure 9.

stimulus     stem        suffix    expected rating
[ziptEk]     full        front     no mismatch: well-formed, e.g. 7
[zipt6k]     full        back      mismatch: worse than [ziptEk], but not unacceptable, e.g. 5
[zi̠ptEk]     retracted   front     mismatch: close to unacceptable, e.g. 3
[zi̠pt6k]     retracted   back      no mismatch: worse than [ziptEk], but not unacceptable, e.g. 5

Table 5: Four conditions of a wug verbal stem, and the expected judgments.

In this experiment, the quality of the stem and the suffix vowel will be strictly controlled for. The interaction term between these two conditions can reveal whether Hungarian listeners use the difference in F2 between the two variants to classify the stem as harmonic or antiharmonic. In the case of the suffix vowel condition, it can be expected that front suffixes will be preferred over back suffixes overall, as this is the productive pattern and no new lexical items are antiharmonic in Hungarian (cf. (19)).

It might also be expected that in the stem vowel condition, full vowels will receive higher scores than retracted vowels, as the more peripheral quality could lead to an impression of a clean vowel, whereas the retracted vowel could even sound foreign – even though both full and retracted stimuli will be manipulated to control for the exact quality of the vowel.
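
As a rough sketch of how the crucial interaction could be tested once the ratings are collected: a mixed effects model with the stem vowel and suffix vowel conditions and their interaction, compared to a model without the interaction. The data frame ratings, its column names, and the use of a Gaussian model for a 1–7 scale (an ordinal model would arguably be more appropriate) are all assumptions for illustration, not the planned analysis script.

  library(lme4)

  m_int   <- lmer(score ~ stem_vowel * suffix_vowel + (1 | subject) + (1 | item),
                  data = ratings, REML = FALSE)
  m_noint <- update(m_int, . ~ . - stem_vowel:suffix_vowel)
  anova(m_noint, m_int)   # does the stem vowel x suffix vowel interaction improve the fit?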

If the interaction term turns out not to be significant (as in Figure 8), that would mean that a retracted [i̠] in the stimulus does not facilitate the categorization of the stem as back. This result would raise the question of how and why the production of the retracted vowel is learned if it does not seem to be used in perception. This result would not support a phonetics first analysis, as the phonetic differences could not be used in determining the morphological categorization of a stem.

Figure 8: Hypothetical scores illustrating the no-interaction result

If the interaction term turns out to be significant, the direction of the effect of retraction needs to be determined. If the interaction is such that after a retracted stem vowel, listeners give better scores to back suffixes (see Figure 9), that would mean that speakers indeed use their knowledge of the different categories for [i] and [i̠]. This would support the phonetics first hypothesis: vowel retraction would indeed influence the listeners in whether they categorize a stem as harmonic or antiharmonic.

If the interaction term is significant, but in the opposite direction from what is expected, so that back suffixes are even more dispreferred after a retracted [i̠] (see Figure 10), then neither of the above can be true. This would mean that the subjects do have knowledge of the difference between [i] and [i̠], but do not use this difference in the direction expected based on Benus and Gafos (2007). This result would probably raise some red flags about the methodology of the experiment, as no clear explanation of the results would be at hand.

4.1 Methodology

To find out how Hungarian speakers utilize fine phonetic detail (vowel backness) in stem categorization, a nonce word acceptability judgment task was designed.

Figure 9: Hypothetical scores illustrating the result with interaction in the expected way.

Figure 10: Hypothetical scores illustrating the result with interaction in the unexpected way.

Figure 11: The screen shown to participants for each stimulus.

The target wug stems all contained an /i/, the actual quality of which was manipulated to create a contrast between retracted and full vowels. These wug stems were paired with front and back suffixes, and the resulting stimulus was rated by the subjects on a scale from 1 (not natural) to 7 (natural).

The screen the subjects saw for each stimulus item is shown in Figure 11. The stimulus sentence was presented with the prompt Íme a következő mondat: ‘This is the next sentence:’, with the target word in boldface so that the subjects could pay more attention to it instead of the whole carrier sentence. To ensure this even more, the target word was repeated (Erre a szóra figyelj: ‘Concentrate on this word:’). A play button was placed below, which could be pushed several times if the subjects were undecided after listening to the manipulated sentence once. After the test sentence had been played at least once, 7 buttons for rating appeared below the prompt asking Mennyire hangzott természetesen a kulcsszó? ‘How natural did the key word sound?’, with labels next to 1 (nagyon nem ‘very much not’) and 7 (teljesen ‘totally’).

4.1.1 Participants

Subjects were recruited online by posting the hyperlink of the experiment site on Facebook, and asking friends to do so as well.4 To gather more subjects, a Facebook advertisement for the experiment ran from May 19 to May 25, 2012 with a budget of $50. The advertisement was targeted at Facebook users from Hungary over the age of 18 who “liked” at least one popular Hungarian-language Facebook page, to ensure they were Hungarian speakers (the targeted pages were Budapest, Hungarian language, Lake Balaton, Túró Rudi [a popular candy], Magyarország [Hungary] and I ♥ Alvás [I love sleeping]). The ad targeted 915 380 such users, and the campaign reached 214 571 of them, showing the advertisement 3.6 times on average. In the end, 654 people clicked on the ad (with a good 0.084% click-through rate) and started the survey.

A Google Adwords advertisement also ran from May 20 to May 27, 2012 with a final budget of $47.82. The advertisement was targeted similarly, placing bids on the following keywords: magyar ‘Hungarian’, magyarul ‘in Hungarian’, Magyarország ‘Hungary’, nyelv ‘language’, magyar nyelv ‘Hungarian language’, Balaton and Túró Rudi. The advertisement was seen by 47 437 users of Google, and 216 clicked on the link (corresponding to a 0.46% click-through rate).

The total number of subjects who started the survey was 407. Of these, 142 finished the experiment, but only 72 were included in the final analysis. Section 4.1.4 will explain the motivations behind filtering subjects for the analysis. Of the 72 participants retained in the study, 41 were male (mean age 24.2), 28 were female (mean age 25.4), while 3 did not answer (19, 23 and 26 years old). The majority of the subjects lived in Budapest proper (36) or in the metropolitan area of Budapest (8), while 21 lived in other cities, and 7 lived in smaller villages. In comparison, only 25% of the Hungarian population lives in the metropolitan area of Budapest; the skewed sample reflects, however, trends in Internet usage, and it was affected by the way the advertisements were targeted. The distribution of the subjects’ age also shows these factors, as seen in Figure 12. The mean age was 24.82 years, with a standard deviation of 8.16 years (the median age is 21, the minimum is 18 and the maximum is 57 years).

4.1.2 Stimuli

The wug stems with the full [i] and a retracted [i̠] (stem vowel condition) were paired with front and back suffixes (suffix vowel condition) and embedded in natural sentences. The F2 difference between [i] and [i̠] was ∆F2 = 100 Hz, about four times the mean difference found in the acoustic study. Kewley-Port (2001) showed that training subjects on a particular phonetic difference can lower the threshold at which they are able to perceive it. As training participants would have taken more time than was available for an online study, the difference

4 Thanks to Adrienn Fodor, Brigitta Fodor, Andrea Kun and János Szeredi for helping me spread the link.

Figure 12: The distribution of age in the sample used for the analysis (total N=72).

was somewhat exaggerated instead, so that there was a better chance of detecting its effects. The subjects assigned acceptability scores to these stimuli, and these scores served as the dependent variable of the experiment.

Twenty wug verbs were used, ten with short /i/ and ten with long /i:/ as the only vowel, each with a single consonant in the onset and in the coda. The vowel length condition was introduced to control for the fact that stems with a long vowel are more likely to be antiharmonic among real Hungarian words, which might affect the behavior of the wug stems.

The stimuli were also controlled for edit distance to the closest existing antiharmonic stem: 10 differed in only one consonant from a back i-stem (neighbors), and 10 had no antiharmonic stem within an edit distance of 1 (non-neighbors). No wug verb differed from an existing stem only in vowel length, and edit distance was calculated ignoring vowel length.
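To make the neighborhood criterion concrete, the following is a minimal R sketch of how edit distance ignoring vowel length could be computed; the vector antiharmonic_stems, the transcription format and the function names are assumptions for the purpose of illustration, not the word list or script actually used for the study.

strip_length <- function(x) gsub(":", "", x, fixed = TRUE)   # edit distance ignores vowel length

is_neighbor <- function(wug, lexicon) {
  # Levenshtein distances to all existing antiharmonic stems, length marks removed
  d <- adist(strip_length(wug), strip_length(lexicon))
  any(d == 1)                      # TRUE if some antiharmonic stem is within one edit
}

# e.g. sapply(wug_stems, is_neighbor, lexicon = antiharmonic_stems)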

The target nonce verbal stems are shown in Table 6. These verbs were suffixed with the past tense plural 3rd person suffix, which is [t6k] after a back stem and [tEk] after a front stem. The target words were embedded in a natural sentence with the following pattern (with syllable counts in parentheses):

(47) DET(1) N.Pl(3) V.Past.3pl(2) DET(1) N.Adv(3)

The following example illustrates the pattern:

(48) A tigrisek fínytak a ketrecben.
[6 tigriSEk fi:ñt6k 6 kEtrEdzbEn]
the tigers fi:ñ+back.suffix the cage.iness
'The tigers were fínying in the cages.'

         neighbor                         non-neighbor
short    zip, Ùis, lid, kir, Sié          zit, fiS, mit, ţit, nié
long     éi:b, pi:t, hi:ñ, mi:n, bi:p     gi:b, mi:S, fi:ñ, zi:m, ni:t

Table 6: Target wug stems of the perception experiment

         [u]                              [y]
short    fug, lut, dum, kun, bup          kyl, dyn, tyk, gyd, Syr
long     gu:l, du:b, pu:g, lu:s, mu:j     zy:l, gy:p, jy:g, my:r, ly:d

Table 7: Filler wug stems in the perception experiment

There were also 20 filler wug verbs, with [u] and [y] as stem vowels. These fillers serve as benchmarks: after [u] the back suffix condition should be rated much higher than the front suffix condition, and after [y] the front suffix condition should be rated higher, because these vowels follow a categorical, exceptionless pattern in Hungarian. The fillers are listed in Table 7.

The test sentences were read by the author with neutral intonation. An acoustic analysis of the target vowels is given in Table 8. A difference of 41.2 Hz was found between the vowels preceding the two suffix vowel conditions, but this difference might be due to coarticulation, and it is not significant. It is also important that the long vowels were indeed longer in duration than the short vowels. In addition to the quantitative difference, there is a qualitative difference between the long and short [i] vowels: the long vowel is more peripheral, so its F2 is

                       F2 (Hz)   SD      t(39)   p
before front suffix    2170.8    135.2
before back suffix     2129.6    115.1
diff.                  41.2              1.05    >0.05
short [i]              2072      89.6
long [i:]              2231.2    105
diff.                  159.2             5.23    <0.001***
neighbor               2161.6    151
non-neighbor           2138.3    97.7
diff.                  23.2              0.59    >0.05

                       duration (ms)   SD      t(39)   p
short [i]              73.3            17.8
long [i:]              107.7           14.7
diff.                  34.4                     6.71    <0.001***

Table 8: Acoustic properties of the unmanipulated stimuli

higher than the F2 of the short [i]. This can easily be explained by vowel undershoot: there is not enough time for the short vowel to reach the most peripheral target.

The actual stimuli were created artificially from these recordings. The manipulation of the formant values was carried out in Praat (Boersma and Weenink 2010). Each utterance was resampled to 10 000 Hz, and the stem and suffix vowels were segmented from the word. The manipulation followed the source-filter model of speech: the filter was estimated by an LPC analysis looking for one peak for every 1000 Hz, the source was obtained by inverse filtering, the formant values of the filter were then changed to the target values, and finally the source was filtered again with the modified filter, resulting in the final stimuli. The target formant values of the vowels were the mean values from the acoustic measurements of the readings, and are presented in Table 9.

The resulting sound files sounded noisy and artificial, as noted by several subjects, because the formant-altering procedure and the downsampling of the input did introduce some noise. The formant values of the stem and suffix vowels, however, were controlled, which was a crucial requirement for the experiment, and the level of the introduced noise was the same for all stimulus items.

vowel    F1     F2     F3
stem vowels
i        310    2150   2683
i̠        310    2050   2683
u        359    879    2281
y        338    1696   2142
suffix vowels
E        578    1639   2358
6        602    1406   2262

Table 9: Target formants for the formant manipulation process

4.1.3 Presentation

As the subjects filled out the survey anonymously on the internet, the environment in which they listened to the stimuli could not be fully controlled. All subjects were asked, both in the advertisement and on the first screen of the experiment, to be in a quiet setting and to wear well-functioning headphones. To check that they did use headphones, the brand name and type of their equipment were queried at the very beginning of the experiment, before any stimuli were shown.

The presentation was again run online with Experigen (Becker and Levine 2010). Because of the high number of stimuli, only a randomly selected half of them was presented to each subject (see Table 10), so instead of 120 judgments, each subject made only 60. The experiment took approximately 15 minutes.
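The random selection itself was handled by Experigen; purely as an illustration of the design, the R sketch below shows what the per-subject sampling amounts to, drawing five of the ten items in each stem vowel by suffix vowel cell of Table 10 (the object and column names are assumptions).

select_for_subject <- function(stimuli) {
  # stimuli: data frame with one row per stimulus and columns item, stem, suffix
  cells  <- split(seq_len(nrow(stimuli)),
                  interaction(stimuli$stem, stimuli$suffix, drop = TRUE))
  chosen <- unlist(lapply(cells, function(i) sample(i, length(i) / 2)))
  stimuli[sort(chosen), ]          # 60 of the 120 stimuli, 5 from each cell
}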

After having seen the stimuli, a debriefing screen was shown, and demographic data, as well as questions about linguistic background, second languages, and hearing impairment were filled out. The questions on the demographic form were the following:

(49) • current place of residence: Budapest; Budapest agglomeration; other city in Hungary; village in Hungary; Slovakia; Romania; Serbia; any other foreign country
• whether the respondent has lived in a foreign country longer than 6 months
• whether the respondent has any hearing impairments
• age
• gender
• other languages than Hungarian spoken
• suggestions, remarks

stem vowel   suffix vowel   stimuli seen   all stimuli
[i]          back           5              10
[i̠]          back           5              10
[i]          front          5              10
[i̠]          front          5              10
[i:]         back           5              10
[i̠:]         back           5              10
[i:]         front          5              10
[i̠:]         front          5              10
[u(:)]       back           5              10
[u(:)]       front          5              10
[y(:)]       back           5              10
[y(:)]       front          5              10

Table 10: Number of conditions seen by a subject in the experiment

4.1.4 Filtering subjects

Because of the unsupervised nature of the experiment, it was crucial that the subjects included in the analysis had completed the survey seriously and with a proper headset. Many subjects started but did not finish the experiment, and some of the conditions reported on the debriefing screen would have been problematic for the study, so a principled way of filtering the participants was needed.

First, all answers from subjects who did not fill out the whole survey were removed from the sample. Of the 407 subjects who started the experiment, 140 finished by filling out the demographic data on the debriefing screen. Two more subjects did not fill out the debriefing screen but gave responses to all of the 120 stimuli: these subjects were excluded, as no information is available about their age, linguistic background or hearing. Two subjects who filled out the debriefing questionnaire failed to respond to some stimuli: one missed only 1 stimulus, the other missed 6. These subjects were not removed from the analysis, as the number of missed trials was minimal.

The answers to the demographic survey were then examined for possible grounds for exclusion. The 23 speakers who reported having lived abroad (including 3 who were living abroad at the time of the study) were excluded, as longer exposure to and everyday use of a second language might affect the hypothesized language-specific sensitivity of speakers to fine phonetic detail. There were 7 speakers who reported some kind of hearing impairment; the answers of these participants were also not included

Figure 13: Distribution of the mean and standard deviation of the response scores by subject

in the study.

Although the advertisement and the initial screen of the experiment clearly stated that headphones had to be used, some subjects carried on with the experiment without headphones, as reported on the screen that specifically asked for the brand of their equipment. These 11 subjects were excluded from the analysis. As 4 of them had already been filtered out by the criteria above, only 7 had to be excluded specifically for this reason.

This left 105 subjects who were not excluded from the analysis for extrinsic reasons. However, analyzing the patterns of how individual subjects responded to the stimuli made it evident that many subjects gave very similar responses to all of the trials, not differentiating much between stimuli they presumably found more acceptable and those they found less acceptable. Two subjects gave only the 'unacceptable' rating of 1 to every stimulus, and more used only a small part of the 1–7 scale. The distribution of the by-subject means and standard deviations of the response variable can be seen in Figure 13.

Excluding subjects based on visual inspection of their results alone would not be consistent, so a cut-off was implemented: subjects whose scores had a standard deviation of 1 or less on the 1–7 scale were excluded. All of the scores given by the subjects excluded for this lack of variation can be seen in Figure 14. Excluding these 33 speakers left the responses of 72 subjects in the database, and these answers were used in the analysis.
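As a rough illustration, the following R sketch shows how this variance-based cut-off could be applied to a long-format data frame d with one row per trial; the object and column names are assumptions, not the actual analysis script.

sd_by_subject <- tapply(d$response, d$subject, sd, na.rm = TRUE)   # by-subject SD of the ratings
keep <- names(sd_by_subject)[sd_by_subject > 1]                    # cut-off: SD above 1 on the 1-7 scale
d <- d[d$subject %in% keep, ]                                      # 72 subjects remain after this step (cf. text)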

Figure 14: Responses of subjects excluded because of the low variation in their scores

4.1.5 Data Analysis

The data were analyzed in R (R Development Core Team 2012) using linear mixed effects modeling. The key independent variable was the interaction between the stem vowel and the suffix vowel factors, which shows whether speakers categorize a stem as antiharmonic more often when there is a retracted vowel in the stimulus. If the interaction shows that this is the case, that would be an argument for the phonetics first hypothesis, as the fine phonetic difference would then be observed to have an effect on the phonological behavior (suffix allomorph selection) of the stem. If this interaction fails to show up, the phonetics first hypothesis becomes unlikely, which (indirectly) favors the morphology first hypothesis. A null result would not prove morphology first, however, so further research would be needed to exclude other possible explanations for such a result.

4.2 Results

Every trial in the database was coded with the following variables:

(50) variables in the experiment:
subject – an anonymous random code assigned to each subject by Experigen
item – the item-wise variable (e.g. zip, gyib, etc.)
response1 – the response to the given trial (the dependent variable)
stem – quality of the stem vowel (full, retracted, u, y – for the fillers as well)
suffix – quality of the suffix vowel (front, back)
length – the length of the vowel (short, long)
neighbor – edit distance from the closest existing antiharmonic stem in Hungarian (1, 2; and 0 for the fillers)

The hypothesis that Hungarian speakers use the small difference in F2 to categorize stems is tested by the stem : suffix interaction. Under this hypothesis, listeners should give better scores to back suffixes after a retracted stem vowel (see Figure 9), which would mean that speakers indeed use their knowledge of the different categories for [i] and [i̠].

4.3 Target words

Again, the expected large between-subject and between-item variation called for an analysis that assigns this variation to random effects, so linear mixed effects regression was employed, using the lme4 package (Bates et al. 2011).

A saturated mixed effects model with all the non-demographic variables as fixed effects, and with subject-wise and item-wise random slopes, was built first. This model was run only on the target words, so the fillers had no effect on it at all. The formula called in lme4 was the following:

(51) response ∼ stem * suffix * length * neighbor + (1 + stem * suffix * length * neighbor | subject) + (1 + stem * suffix | item)

There were no item-wise random slopes for length and neighbor, because in this experiment these factors are nested under item: the choice of a given target word determines the length of its vowel and its edit distance to the closest antiharmonic stem.
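A minimal R sketch of how the saturated model in (51) could be fitted with lme4 is given below; the data frame d and its column names are assumptions (cf. the variables in (50)) about the analysis script rather than the script itself.

library(lme4)

targets <- subset(d, stem %in% c("full", "retracted"))    # target items only, fillers excluded

m_sat <- lmer(response ~ stem * suffix * length * neighbor +
                (1 + stem * suffix * length * neighbor | subject) +
                (1 + stem * suffix | item),
              data = targets, REML = FALSE)                # ML fit, so nested models can be compared

summary(m_sat)                                             # fixed-effect estimates, SEs and t values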

The complicated saturated model needed to be built in order to check whether there are higher-level interactions influencing the data. Examining the coefficients of the model, the four-way interaction is not significant (β=0.637, SE=0.576, t=1.105), but a three-way interaction reaches significance: for suffix=back & length=long & neighbor=1 the coefficient is significant, β = −0.829, SE=0.345, t=−2.407, which means that those target

Figure 15: Interaction between stem and suffix vowel conditions in the whole sample.

words which were only 1 edit distance away from an existing antiharmonic stem were rated worse than those target words which were further away, when the stimulus had a long vowel and a back (antiharmonic) suffix. This effect goes against the predictions of an analogy-based model, but the interpretation of this interaction is not obvious, and it is not clearly related to the research question.

No other fixed effect reached a t score of 2 or higher. The coefficient for the interaction stem=retracted & suffix=back, the one relevant to the research question, is negative but not significant (β=−0.113, SE=0.316, t=−0.358). The sign of this coefficient points in the unexpected direction, even more so than in the hypothetical case seen in Figure 10. The actual means are plotted in Figure 15.

The coefficient of the main effect of the stem vowel condition has the expected sign, as stimuli with retracted stem vowels were judged to be worse than stimuli with front stem vowels (β = −0.111, SE=0.157, t=−0.704). The main effect of the suffix vowel condition is not significant, which indicates that the productivity of harmonic stems (cf. (19)) did not affect the judgments of the participants (suffix=back: β = −0.032, SE=0.185, t=−0.17).

After inspecting the saturated model, backward stepwise term elimination was initiated again. Removing the four-way interaction did not significantly worsen the likelihood of the model (χ2(1) = 1.226, p=0.27), so it was removed. Of the three-way interactions, though, suffix : length : neighbor again turned out to be significant (χ2(1) = 5.001, p=0.026). This means that only the main effect of stem can be removed from the model without significantly worsening it; no other interactions or main effects can. For further analysis, the database was therefore split in half based on the value of the neighbor factor.
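The elimination steps reported above correspond to likelihood-ratio comparisons of nested models; a sketch of one such step, dropping the four-way interaction from the fixed effects of the model fitted in the earlier sketch, could look as follows (object names are again assumptions).

m_no4 <- update(m_sat, . ~ . - stem:suffix:length:neighbor)   # same model without the four-way term
anova(m_no4, m_sat)                                           # chi-squared(1) likelihood-ratio test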

Figure 16: Random slope coefficients by speaker for the stem : suffix interaction with confidence intervals of ±2SE.

The new saturated model, run on both halves, was:

(52) response ∼ stem * suffix * length + (1 + stem * suffix * length | subject) + (1 + stem * suffix | item)

Stepwise term elimination was then carried out for both halves of the database. The three-way interaction was not significant for either half. For those stimuli which had an antiharmonic neighbor, the interaction suffix=back & length=long was borderline significant (χ2(1) = 3.49, p=0.062; the coefficient in the saturated model was β = −0.382, SE=0.167, t=−2.28). Eliminating this term leaves no further significant fixed effects. For those stimuli which did not have an antiharmonic neighbor, no interaction and no main effect came out as significant. Most importantly, the stem : suffix interaction failed to reach significance by any measure in either half of the data.

Going back to the main saturated model in (51), the random slope coefficients can be examined further. The by-speaker random slope coefficients for the relevant stem : suffix interaction are shown in Figure 16. There appear to be no severe outliers in either direction; this is confirmed by the distribution of this random effect in Figure 17.
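The by-speaker random slopes plotted in Figures 16 and 17 can be extracted from the fitted model object; the sketch below assumes the model from the earlier sketch, and the coefficient name is illustrative, since the exact label depends on the contrast coding.

re_subj <- ranef(m_sat)$subject                        # by-subject random effects
slopes  <- re_subj[["stemretracted:suffixback"]]       # assumed name of the stem : suffix slope
hist(slopes, breaks = 20,
     xlab = "by-subject random slope (stem : suffix)",
     main = "")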

After analyzing the target words of this experiment, it seems that subjects did not use the small phonetic difference between the retracted and the full vowel to categorize stems as harmonic or antiharmonic. However, due to the large variation in the data and the small number of interpretable significant terms, it could be suspected that the negative result

Figure 17: Distribution of random slope coefficients by speaker for the stem : suffix interaction.

might be the consequence of a task effect: the participants might not have understood what they were expected to do. Several comments in the debriefing and the number of subjects with little variation in their responses also indicate that they might have had problems understanding the task.

4.3.1 Fillers

If the participants had serious problems understanding or interpreting the task as it was designed, they should have encountered the same problems with the fillers as well. The quality of the recording, the question and the target sentence were the same for the fillers as for the target words. However, the rating decision should be much easier for the fillers: these stems had [u] or [y] in their single syllable, vowels which entail back and front suffixation respectively, with no exceptions in Hungarian.

A saturated model was built again, this time on the filler tokens. Only data from the 72 subjects retained after filtering were used. The neighborhood variable was not included, as it is irrelevant for this set of tokens. The model was the same as in (52), repeated below:

(53) response ∼ stem * suffix * length + (1 + stem * suffix * length | subject) + (1 + stem * suffix | item)

The coefficients in this model tell a different story than those for the target verbs: the crucial

Figure 18: The stem : suffix interaction for fillers.

stem=y & suffix=back interaction looks significant, β = −2.201, SE=0.457, t=−4.83. Stepwise backward term elimination confirms this: the three-way interaction is not significant, but stem : suffix is, χ2(1) = 26.616, p<0.001. The stem : length interaction is borderline significant as well, χ2(1) = 2.796, p=0.095. The effect of this interaction can be seen in Figure 18.

It seems that the subjects were able to distinguish the grammatical fillers from the ungrammatical ones, so the task effect can be narrowed down: it is clear that the participants understood what they were meant to do. The task was not satisfyingly sensitive, though: as these results on the fillers show, the grammatical and ungrammatical stimuli are still not rated at the extreme ends of the scale. The ungrammatical fillers have a mean score of 2.88 for both vowels, while the grammatical ones have 3.92 for [y] and 4.29 for [u], a mere 1.04-point distinction for the former and a 1.41-point distinction for the latter.

4.4 Discussion

The results of this experiment indicate that Hungarian speakers do not use a 100 Hz difference in F2 to categorize a stem containing an [i(:)] as harmonic or antiharmonic. The results for the fillers indicate that subjects did distinguish between grammatical and ungrammatical stimuli, so the lack of a difference for the target stems cannot be attributed to a misunderstanding of the task or to the stimuli not sounding natural to the participants. The small size of the difference between the grammatical and ungrammatical fillers, however, raises the question of whether the task was sensitive enough for the participants to pick up on

the minor phonetic detail needed to distinguish retracted and full [i:] vowels.

These results again do not support the hypothesis that antiharmonicity is marked on the vowels. If that were the case, one would expect the small articulatory differences to manifest themselves in some way that can be perceived and used for the categorization of nonce stems as well. As the results of the rating task do not indicate that small acoustic differences are used by Hungarian speakers in categorization, the hypothesis that a low-level distinction in fine phonetic detail between [i̠] and [i] can influence affix selection does not seem likely.

5 Interpretation of the results

Two competing hypotheses were sketched in section 2.3: either the articulatory differences between a retracted and a full high front unrounded vowel found by Benus and Gafos (2007) are part of the speakers' knowledge and influence the morphological behavior of the stem (phonetics first), or these differences are merely the result of the morphological categorization of the stem, as the frequent cooccurrence of these stems with back suffixes can lead to a retracted articulation of the vowel in isolation as well (morphology first).

The results of the acoustic experiment failed to show a significant distinction between antiharmonic and harmonic stems. Although there is a small difference in the direction expected based on the articulatory results, this difference is not significant, and 5 of the 12 speakers show a (non-significant) difference in the other direction. Therefore these differences cannot be attributed to a low-level distinction between a retracted [i̠] and a full [i]. If speakers were aware of such small differences, as the phonetics first hypothesis holds, we would expect a significant difference here, while under the morphology first hypothesis this small, insignificant difference can be attributed to the same neighborhood effects as the articulatory results. Benus and Gafos (2007) in fact predict that the acoustic difference should be insignificant, as they argue that the fine articulatory difference can be used to distinguish between harmonic and antiharmonic vowels precisely because a larger articulatory difference does not translate into an equally large acoustic difference for a more peripheral vowel (Stevens 1972). This argumentation, however, still would not support a phonetics first analysis: the phonetic differences cannot be analyzed as underlying if they are not audible and perceivable.

The results of the morphophonological categorization experiment do not support the phonetics first hypothesis either. If the phonetic differences were part of the speakers' knowledge and could influence higher-level morphological behavior, speakers would be expected to catch on to these differences and rate a mismatched stimulus (a retracted vowel in a harmonic stem, or a full vowel in an antiharmonic stem) as less acceptable than a non-mismatched one. As the results do not show this effect at all, it seems more probable that the articulatory differences found by Benus and Gafos (2007) are at most side effects of the morphological behavior, and that, in line with the morphology first hypothesis, the grammar marks

antiharmonicity on the stem, not on the vowel. The grammar handles antiharmonic stems as an exceptional lexical category, and the constraints on their behavior are handled by coindexed markedness constraints (Pater 2000; 2006, Gouskova 2012).

As the phonetics first hypothesis turned out to be improbable in both the acoustic and the categorization experiment, the incomplete neutralization and near merger analyses of Hungarian antiharmonicity are not plausible either. If there is no sign of incomplete neutralization in perception, unlike what has been found elsewhere (Port and Crawford 1989, Ernestus and Baayen 2006), then the articulatory differences alone do not suffice for such an analysis.

Under a morphology first analysis, an exemplar-based framework could explain the articulatory effects, as hinted at by Benus and Gafos (2007) as well. However, such a framework would also predict a more robust acoustic difference than the one found in the acoustic experiment: the fine articulatory difference should translate into a consistent acoustic difference. The lack of such a consistent acoustic difference might be explained by the non-linear correspondence between articulation and acoustics (Stevens 1972), and the acoustic experiment did indeed find a non-perceivable, insignificant difference. An acoustic study with a larger sample could finally settle this question: if the 25 Hz difference is found consistently with more speakers, and this difference turns out to be significant, that could support a rich lexical representation theory.
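To give a rough sense of the sample size such a follow-up acoustic study would need, a simple power calculation is sketched below; the 80 Hz standard deviation is an assumed value rather than one taken from this paper, and the paired-design simplification ignores the by-speaker structure, so the output is purely illustrative.

power.t.test(delta = 25,            # hypothesized F2 difference (Hz)
             sd = 80,               # assumed standard deviation of the paired differences
             sig.level = 0.05,
             power = 0.8,
             type = "paired")       # prints n, the number of harmonic/antiharmonic pairs needed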

There is also the possibility that Benus and Gafos' finding is accidental. Their articulatory study examined only two subjects with EMMA and three with ultrasound, where the effect was found for only one speaker. Such a small sample size can introduce accidental findings: the pilot study for the acoustic experiment in section 3 found a significant effect of over 100 Hz for two speakers, but as more data were collected, the effect became smaller and less significant.

6 Conclusion and further research

This paper discussed the possible explanations for the correlation between a small phonetic difference in vowel retraction and morphological behavior (antiharmonicity) found in the articulatory study of Benus and Gafos (2007). The question was explored further with an acoustic and a morphophonological categorization experiment. These studies did not find the effect predicted by an analysis that would explain Benus and Gafos' findings by arguing that the small phonetic detail is used by speakers for morphological categorization. The lack of a significant acoustic difference is also problematic for an exemplar-theory-based explanation, as the same effect that shows up in articulation should show up in the acoustic space as well.

The experiments in this paper could be followed up by several more detailed studies. The acoustic experiment could be expanded to include more subjects: if the 25 Hz mean

difference remains with more subjects, it could gain significance, supporting an exemplar-based analysis.

The rating task experiment could be redone using a much larger phonetic difference, which would investigate whether phonetic detail is relevant for morphological behavior at all, influencing participants' judgments. If a 500 Hz difference were used, the experiment would introduce a new [W] phoneme to the speakers, and it would be interesting to see whether they would categorize this new sound as front, as back, or simply as unacceptable. Such a study could investigate whether vowel harmony in Hungarian is based in the grammar, which requires matching phonetic features on the stem and suffix via an agreement constraint, or whether vowel harmony emerges entirely from the lexicon, with the harmonicity of each stem listed in its entry. In the latter case, the systematic behavior of back and front rounded stems could be explained by diachronic factors and the productivity of these patterns as analogical processes within the lexicon.

If this exaggerated difference does induce back suffixation for [W] stems, that could indicate that the grammar is able to generalize over a newly introduced 'wug phoneme'. Systematic front suffixation could lead to the same conclusion, with speakers categorizing [W] together with [i] or [y], marking it with a [-back] feature and letting the grammar work as it does with other vowels. However, if the results show within-speaker inconsistency or very low acceptability scores for all [W] stems, that could indicate that Hungarian speakers cannot reliably assign a new phoneme a [±back] feature for the grammar to work with, and either reject these stems altogether or categorize them based on lexical neighborhood effects.

References

Aaltonen, Olli, Osmo Eerola, Åke Hellström, Esa Uusipaikka and A. Heikki Lang (1997). Perceptual magnet effect in the light of behavioral and psychophysiological data. Journal of the Acoustical Society of America 101, pp. 1090–1105.

Audacity Team (2009). Audacity 1.2.6. Free software distributed under the terms of the GNU General Public License. The name Audacity(R) is a registered trademark of Dominic Mazzoni. http://audacity.sourceforge.net/.

Bates, Douglas, Martin Maechler and Ben Bolker (2011). lme4: Linear mixed-effects models using S4 classes. R package version 0.999375-42, http://CRAN.R-project.org/package=lme4.

Becker, Michael and Jonathan Levine (2010). Experigen: an online experiment platform. https://github.com/tlozoot/experigen.

Beddor, Patrice Speeter, James D. Harnsberger and Stephanie Lindemann (2002). Language-specific patterns of vowel-to-vowel coarticulation: acoustic structures and their perceptual correlates. Journal of Phonetics 30, pp. 591–627.

Benus, Stefan and Adamantios I. Gafos (2007). Articulatory characteristics of Hungarian 'transparent' vowels. Journal of Phonetics 35, pp. 271–300.

Blaho, Sylvia and Daniel Szeredi (2012). Antiharmonic and vacillating stems in Standard and Slovakian Hungarian. Talk at the 13th International Young Linguists’ Meeting, Olomouc, Czech Republic, May 8, 2012.

Boersma, Paul and David Weenink (2010). Praat: doing phonetics by computer. Version 5.1.23. http://www.praat.org.

Bybee, Joan (2001). Phonology and language use. Cambridge University Press, Cambridge.

Dmitrieva, Olga, Allard Jongman and Joan Sereno (2010). Phonological neutralization by native and non-native speakers: The case of Russian final devoicing. Journal of Phonetics 38, pp. 483–492.

Ernestus, Mirjam and Harald Baayen (2006). The functionality of incomplete neutralization in Dutch: The case of past-tense formation. In: Louis M. Goldstein, Douglas H. Whalen and Catherine T. Best (eds.), Papers in Laboratory Phonology 8, Mouton de Gruyter, Berlin/New York, pp. 27–49.

Gouskova, Maria (2012). Unexceptional segments. Natural Language & Linguistic Theory 30, pp. 79–133.

Halácsy, Péter, András Kornai, László Németh, András Rung, István Szakadát and Viktor Trón (2004). Creating open language resources for Hungarian. In: Proceedings of the Language Resources and Evaluation Conference (LREC04). European Language Resources Association.

Hayes, Bruce and Zsuzsa Cziráky Londe (2006). Stochastic phonological knowledge: the case of Hungarian vowel harmony. Phonology 23, pp. 59–104.

Hayes, Bruce, Kie Zuraw, Péter Siptár and Zsuzsa Londe (2009). Natural and unnatural constraints in Hungarian vowel harmony. Language 85, pp. 822–863.

Kálmán, Béla (1972). Hungarian historical phonology. In: Loránd Benkő and Samu Imre (eds.), The Hungarian Language, Mouton, The Hague, pp. 49–83.

Kewley-Port, Diane (2001). Vowel formant discrimination II: Effects of stimulus uncertainty, consonantal context, and training. Journal of the Acoustical Society of America 110, pp. 2141–2155.

Kis, Tamás (2005). A veláris i a magyarban [The velar i in Hungarian]. In: Tamás Hoffman, Tamás Kis and István Nyirkos (eds.), Magyar Nyelvjárások [Hungarian Dialects], Debreceni Egyetemi Kiadó, vol. XLIII, pp. 5–26.

Kiss, Jenő (ed.) (2001). Magyar dialektológia [Hungarian dialectology]. Osiris, Budapest.

Labov, William, Mark Caren and Corey Miller (1991). Near-mergers and the suspension of phonemic contrast. Language Variation and Change 3, pp. 33–74.

Lightner, Theodore (1972). Problems in the theory of phonology. Linguistic Research Inc., Edmonton.

Mermelstein, Paul (1978). Difference limens for formant frequencies of steady-state and consonant-bound vowels. Journal of the Acoustical Society of America 63, pp. 572–580.

Nádasdy, Ádám and Péter Siptár (1994). A magánhangzók [The vowels]. In: Ferenc Kiefer (ed.), Strukturális Magyar Nyelvtan 2. Fonológia [Structural Hungarian Grammar 2: Phonology], Akadémiai Kiadó, Budapest, pp. 42–182.

Öhman, Sven (1966). Coarticulation in VCV utterances: Spectrographic measurements. Journal of the Acoustical Society of America 39, pp. 151–168.

Pater, Joe (2000). Nonuniformity in English secondary stress: The role of ranked and lexically specific constraints. Phonology 17, pp. 237–274.

Pater, Joe (2006). The locus of exceptionality: Morpheme-specific phonology as constraint indexation. In: Leah Bateman, Michael O’Keefe, Ehren Reilly and Adam Werle (eds.), Papers in Optimality Theory III, GLSA, Amherst, MA, pp. 259–296.

Perkell, Joseph S., Marc H. Cohen, Mario A. Svirsky, Melanie L. Matthies, Iñaki Garabieta and Michael T. T. Jackson (1992). Electromagnetic midsagittal articulometer (EMMA) systems for transducing speech articulatory movements. The Journal of the Acoustical Society of America 92, p. 3078.

Pierrehumbert, Janet B. (2001). Exemplar dynamics: Word frequency, lenition and contrast. In: Joan Bybee and Paul Hopper (eds.), Frequency and the emergence of linguistic structure, John Benjamins, pp. 137–157.

Pisoni, David B. (1975). Auditory short-term memory and vowel perception. Memory & Cognition 3, pp. 7–18.

Port, Robert and Penny Crawford (1989). Incomplete neutralization and pragmatics in German. Journal of Phonetics 17, pp. 257–282.

R Development Core Team (2012). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, http://www.R-project.org/.

Ringen, Catherine O. and Robert M. Vago (1998). Hungarian vowel harmony in Optimality Theory. Phonology 15, pp. 393–416.

Siptár, Péter and Miklós Törkenczy (2000). The phonology of Hungarian. Oxford University Press, Oxford.

Stevens, Kenneth N. (1972). The quantal nature of speech: Evidence from articulatory-acoustic data. In: Edward E. David and Peter B. Denes (eds.), Human communication: A unified view, McGraw-Hill, pp. 51–66.

Stone, Maureen (1997). Laboratory techniques for investigating speech articulation. In: William J. Hardcastle and John Laver (eds.), The Handbook of Phonetic Sciences, Blackwell, Oxford, pp. 11–32.

Štatistický úrad Slovenskej Republiky [Statistical Office of the Slovak Republic] (2011). Počet obyvateľov Slovenskej Republiky k 31. 12. 2011 [Population of the Slovak Republic on December 31, 2011]. http://portal.statistics.sk/files/Sekcie/sek 600/Demografia/Obyvatelstvo/tabulky / pocet obyvatelov/2011/poc obyv 2011.pdf.

Vago, Robert M. (1980). The sound pattern of Hungarian. Georgetown University Press.

Warner, Natasha, Allard Jongman, Joan Sereno and Rachèl Kemps (2004). Incomplete neutralization and other sub-phonemic durational differences in production and perception: evidence from Dutch. Journal of Phonetics 32, pp. 251–276.
