
DOCTORAL DISSERTATION

Suprasegmental sound changes in the Scandinavian languages

Áron Tési

2017

Eötvös Loránd University of Sciences Faculty of Humanities

SUPRASEGMENTAL SOUND CHANGES IN THE SCANDINAVIAN LANGUAGES

SZUPRASZEGMENTÁLIS HANGVÁLTOZÁSOK A SKANDINÁV NYELVEKBEN

Doctoral School of Linguistics Head: Dr. Gábor Tolcsvai Nagy MHAS

Doctoral programme in Germanic Linguistics Head: Dr. Károly Manherz CSc

Members of the thesis committee Dr. Károly Manherz CSc (chairman) Dr. Roland Nagy PhD (secretary) Dr. Valéria Molnár PhD (officially appointed opponent) Dr. Ildikó Vaskó PhD (officially appointed opponent) Dr. László Komlósi CSc (member)

Further members Dr. Péter Siptár DSc Dr. Miklós Törkenczy DSc

Supervisor Dr. Péter Ács CSc

Budapest, 2017

Table of contents

List of abbreviations
Foreword
1. Theoretical considerations
1.1. Some notes on sound change
1.2. The problem of teleology
1.2.1. A philosophical overview
1.2.2. Teleology in linguistics
1.3. The problem of markedness
1.3.1. Multiple senses
1.3.2. Contradictory claims
1.3.3. Frequency and predictability
1.3.4. The sources and effects of frequency
1.3.5. Universality
1.3.6. A summary of the main tenets
1.4. Functional [...]
1.4.1. An overview
1.4.2. The functions of [...]
1.4.3. Sound change and the three pillars of the communicative function
1.4.4. The functionalist tradition
1.4.5. Critical points
1.4.5.1. Optimization
1.4.5.2. Eternal optimization
1.4.5.3. The Darwinian paradox
1.4.5.4. Mergers
1.5. Functional principles at work
1.5.1. Segmental changes
1.5.2. Suprasegmental changes
1.5.3. A non-teleological account
1.5.4. Language contact
1.5.5. The identification of external factors
2. The prosodic features of the modern languages
2.1. A note on classification
2.2. Swedish and Norwegian
2.2.1. Underlying representations
2.2.2. Quantity
2.2.3. [...]
2.2.4. The tonal distinction
2.2.4.1. The nature of the opposition
2.2.4.2. Word accents in non-focal position
2.2.4.3. No lexical tones required?
2.2.4.4. The problem of markedness
2.2.4.5. Tonal distribution in simplex forms and the role of suffixes
2.2.4.6. Compounds in [...] and Norwegian
2.2.4.7. Southern Swedish compounds
2.2.4.8. The functions of the opposition
2.2.4.9. Building blocks reconsidered
2.2.4.10. Lexical and post-lexical rules
2.3. Danish
2.3.1. Quantity
2.3.2. Stress
2.3.2.1. Degrees of stress
2.3.2.2. Default stress
2.3.3. Stød
2.3.3.1. Distribution and markedness
2.3.3.2. Change in progress
2.3.3.3. Danish compounds
2.3.3.4. The functions of the opposition
3. A diachronic review
3.1. Changes in terms of stress and quantity
3.1.1. The Germanic Stress Rule (GSR)
3.1.2. The impact of loanwords and the loss of Germanic stress
3.1.3. The stress pattern of compounds
3.1.4. The quantity shift and the problem of moras
3.1.5. Traces of language contact
3.1.6. Danish as a West-Germanic language
3.2. The history of [...] and stød
3.2.1. Tonogenesis
3.2.1.1. [...] in Proto-Nordic
3.2.1.2. Level stress in Old Scandinavian
3.2.2. The birth of stød
3.2.2.1. Stød first
3.2.2.2. Parallel developments
3.2.2.3. Tones first
3.2.3. The tonal typology
3.2.4. The role of analogy and some further changes
3.3. Current developments
4. Summary
4.1. Chapter 1
4.2. Chapter 2
4.3. Chapter 3
Appendix: Hungarian summary
References

List of abbreviations

AER    Accent 2 exception rule
AP     Accent phrase
C      consonant
Da     Danish
dat    dative
def    definite
dim    diminutive
DSR    Default stress rule
fn     footnote
FSP    French stress principle
ft     foot
gen    genitive
Gmc    Germanic
GSR    Germanic stress rule
id     identical
IE     Indo-European
imp    imperative
inf    infinitive
IP     Intonation phrase
MAR    Main Accent 2 rule
No     Norwegian
nom    nominative
NSP    Non-stød principle
OCP    Obligatory contour principle
OD     Old Danish
OE     Old English
OT     Optimality Theory
OSL    Open syllable lengthening
part   [...]
PhP    Phonological phrase
pl     plural
PrW    Prosodic word
ScMG   Scandinavian on Multilingual Ground
sg     singular
SPE    Sound pattern of English / Semi-productive ending
SR     Surface representation
Sw     Swedish
TBU    tone-bearing unit
UEN    Urban East Norwegian
UG     Universal grammar
UPE    Unproductive ending
UR     Underlying representation
V      vowel
ə-ass  schwa-assimilation
µ      mora
σ      syllable

Foreword

The present dissertation is devoted to the understanding of how phonologically relevant prosodic features behave in the continental Scandinavian languages. From a cross-linguistic perspective the problems of tone and stød are obviously the most fascinating ones and are accordingly given due emphasis in the discussions to follow. The investigation of the tonal / glottal opposition is certainly a worthwhile enterprise given that the phenomena in question are closely intertwined with virtually every major aspect of Scandinavian morpho-phonology and are thus indispensable for anyone with an interest in the relevant fields of North Germanic linguistics. This is of course equally true for language learners with an aspiration to reach near-native competence. The fact that my own passion for prosodic problems also goes back to (and is largely defined by) a desire to improve my pronunciation has led me to adopt a pragmatic mindset, which means that I tend to approach prosodic problems with the perspective of language learners in mind. Consequently, I have serious doubts about cumbersome phonological frameworks with a high level of abstraction and I often reject proposals that go against native-speaker intuition. It seems to me that in terms of phonological theories, pragmatism translates best to a functionalist approach, which is consistently argued for and utilized throughout the thesis. Functional phonology can capture many aspects of language change in a comprehensible manner and is thus an efficient tool for exploring the dynamics of language use. Although dynamic aspects are primarily associated with the diachronic domain, it is essential to realize that the validity of a synchronic analysis may be undermined if we choose not to take certain (far from all) dynamic factors into consideration.
If I may employ a chess analogy, in the (synchronic) analysis of a given position in a game of chess it is usually immaterial what previous moves have led up to the position in question. Yet in a minority of cases we cannot do without some diachronic knowledge since we are unlikely to make a correct judgment unless we know whether we are entitled to castle or capture en passant. The notion of markedness, which is heavily featured in the present thesis, is one of those concepts whose synchronic interpretation is problematic unless we are aware of its dynamic characteristics. A rigid separation of synchrony from diachrony may turn out to be one of the reasons (along with the stratified nature of language, human imperfection etc.) that have given rise to an embarrassing degree of polysemy in the field of linguistics. It is necessary to address such problems since they can blunt the scientific edge of any description. Accordingly, the first chapter of this thesis is dedicated to some theoretical questions, whose conclusions can facilitate the discussion of contemporary prosodic matters in Chapter 2, which in turn is meant to serve as a ground for certain aspects of the diachronic review in Chapter 3. Chapter 4 provides a summary of the main findings of the previous chapters.


1. Theoretical considerations

The study of functional phonetics (better known as phonology) entails the use of abstract entities, isolated on the basis of their distinctive characteristics. The resulting set of contrastive units constitutes the core of the given phonological system, a central essence that rightly attracts the attention it usually receives. The realm of suprasegmental phonology, however, exhibits the contradiction that many prosodic features are undisputedly more marginal as prosodemes than as non-contrastive linguistic tools. The placement of stress in Spanish can serve as an example, where the culminative and delimitative functions of stressed syllables are clearly less occasional than the distinctiveness exhibited by the often cited minimal triad: término, termino, terminó. The same goes for Swedish tonality, where the contrastive use of tone is so severely restricted by various phonological and morphological factors that it takes considerable effort to devise a natural utterance whose meaning hinges exclusively on the opposition between the two accents. Needless to say, the use of tones for expressive purposes is not subject to such restrictions. In spite of their marginality, the prosodemes of the Scandinavian languages have been treated exhaustively in the phonological literature. In fact, so much has been written on almost all possible aspects of Scandinavian prosody that virtually each major theoretical point has been put into question. This state of affairs, where prominent authors advocate such diverging positions, has its roots, as far as I understand, in the following three issues. First of all, the fact that prosodic phenomena are only sporadically indicated in writing not only reflects how unaware speakers are of such distinctions but also deprives us of the possibility of pursuing non-speculative diachronic research on a par with segmental analyses. This has of course serious repercussions when it comes to synchronic studies as well.
It suffices to think of segmental phonology where access to diachronic data can definitely facilitate our understanding of synchronic affairs and where our experience of bygone changes can reinforce our analysis of synchronic variation. The student of prosody has to do without such benefits. The second problem that, in my opinion, can give rise to flaws in studies on prosodemes is connected to a manner of description which to my mind makes unnecessary distinctions between segmental and suprasegmental analyses. In the following chapters I will try to demonstrate how much we can gain if we make reference to certain phonological principles well established on the segmental level, yet often overlooked in argumentations concerning prosodic problems. The last point to make pertains to the somewhat unscientific terminology that is so characteristic of linguistics as a whole. The use of clear-cut technical terms is an essential prerequisite of any scientific enterprise, yet many fundamental concepts utilized in linguistics are highly polysemous, which often leads to misunderstanding and fruitless argumentation between linguists who conceptualize the disputed phenomenon in exactly the same way, though with different labels. The primary reason for this unfortunate state of affairs is to be sought in the stratified nature of language. If the same term is used to describe different linguistic levels, a certain degree of polysemy is inevitable. For a morphologist the word unmarked may mean absent (cf. unmarked plural as in two fish), while a phonologist can interpret it as simple or frequent (cf. marked and unmarked: /z/ - /s/). This is, however, not the only source of this lack of consensus. Some linguists have a predilection for figurative language and make use of metaphors even when metaphors are clearly out of place. When a student of linguistics encounters the sentence “sound changes are blind” for the first time, she has several interpretations at her disposal.
(1) Sound changes have no exceptions. (2) The implementation of sound changes is random. (3) The direction of sound changes is random. (4) Sound changes do not consider their own consequences etc. The use of poetic language clearly increases polysemy and only adds to the existing confusion. There is no need for such formulations in a proper scientific description; it is better to call a spade a spade. For the sake of comparison it is worth considering how improbable it is that a physicist would ever want to support or illustrate a piece of argumentation with metaphors such as “gravity is stubborn”, which would be tantamount to calling sound changes blind. In the light of the above it therefore seems crucial to settle certain theoretical points before we can embark on the analysis of suprasegmental changes. However, I will first try to delimit the scope of this study by considering what aspects of sound change to treat in the ensuing analysis.

1.1. Some notes on sound change

The question of phonological change offers three main fields to investigate. The most obvious starting point is to specify those features of a given phenomenon that have undergone modification, i.e. to establish what actually has changed. However, if we are to conceptualize the notion of change in line with von Glasersfeld (1990:25), according to whom “the concept requires at least two experiences, an item that is identical in both… and a difference”, we can see that it is equally important that the portrayal of change include a description of what features have remained intact. This is indeed of interest if we want to maintain that sound changes are best conceived of as gradual processes. As far as the observed difference between two sequential stages is concerned, the ensuing investigation of prosodic phenomena is going to be restricted to those features that show contrastive traits in a word-level analysis. In the case of Swedish, Danish and Norwegian it implies that the following have to be taken into consideration: (1) the manifestation of length in the rhyme of stressed syllables that serves to distinguish between minimal pairs such as Sw. väg (road) and vägg (wall), (2) the placement of stress, which gives rise to oppositions such as Da. billigst (cheapest) and bilist (car driver) and (3) the tonal curve that is associated with stressed syllables such as in No. bønder (farmers) and bønner (prayers). This insistence on restricting the scope of the study to distinctive word-level phenomena stems not only from the convenience offered by the abundance of minimal pairs. It is also a kind of necessity imposed on us by the extreme paucity of data that we have to face in the realm of diachronic research. If we are to succeed in reconstructing certain prosodic conditions, the most feasible way is to concentrate on those aspects that can be understood in reference to segmental changes. Given that e.g.
voiceless consonants can raise the phonetic pitch of neighbouring vowels, it is easy to envisage a situation in which the originally redundant tonal curve (if preserved) becomes phonemic upon the loss of the following C. Chinese tonogenesis is usually attributed to similar factors, cf. Sagart (1999). The disintegration of the quantitative system of Classical Latin can serve as a further example for the interplay of prosody and the segmental level. Due to the loss of phonological V length during the last centuries of the Roman Empire the redundant feature of stress (that was a function of the number of moras in the last two syllables) was rendered contrastive, since a distinction based mainly on V length such as Vcvcv – vcV:cv was transformed into an accentual opposition of the type Vcvcv – vcVcv, cf. Herman (2003). The second problem concerning sound change relates to the question of how a change (once initiated) spreads from one person to another within a single community (transmission) and even between communities (diffusion). The propagation of change has of course many aspects that cannot be addressed in a diachronic context. These include first and foremost those social facets of the process that in analyses of contemporary changes rely heavily on statistical tools to establish the role of age, gender, income, social status etc. Such approaches are obviously inapplicable in a historical framework for want of data. It seems that for the purposes of the present study the most relevant contribution of sociolinguistics is the distinction made between diffusion and transmission, cf. Labov (2007). The author maintains that the main divide between the two concepts boils down to the difference between the learning abilities of adults and children respectively. The difference is also reflected by the two main models used in historical linguistics to represent sound change (family trees and waves). On the one hand, the family tree model is used to illustrate that a given language can be recognized as a later stage of another provided that these two languages are connected by an “unbroken sequence of native language acquisition by children” (Labov 2007:3). If the term transmission is exclusively reserved for this domain, then it can be said to denote sound change within a given community. Diffusion in Labov’s terminology, on the other hand, stands for the phenomenon whereby linguistic innovations are transmitted from one community to another (consequently, it lends itself to a description in terms of the wave model). It follows that diffusion is associated with adults and relates to languages or dialects in contact. If transmission and diffusion are in fact conditioned by the language learning abilities of children and adults respectively, then we are justified in expecting substantial differences between the two processes. Labov (2007:7) maintains that while transmission covers everything, diffusion is “limited to the most superficial aspects of language… [since] adults borrow observable elements of language, the same elements that can be socially evaluated” just like words and sounds.
Another difference is constituted by the fact that due to the limited linguistic abilities of adults, diffusion involves the gradual simplification of the original rule. Labov refers to the Northern Cities Shift to illustrate that the minute conditioning of rules tends to fade away with diffusion before the spread finally comes to a halt. Given that the distribution of the Scandinavian word accents is notoriously complex, tendencies towards generalized patterns can be treated in the light of the above. The third aspect of sound change to be treated here can be referred to as the actuation problem and is basically concerned with the question of why sound change occurs in the first place. Of all the divergent approaches to this problem one extreme is constituted by the categorical pessimism expressed by Bloomfield (1933:385), who states that “the causes of sound change are unknown” and implies that a quest for triggers and causes is a fruitless enterprise. According to Martinet (1955:14), whose book is devoted to examining the factors to which sound change can be attributed, this persuasion is connected to the fact that Bloomfield and many other structuralists are only concerned with a static description of language. As will be demonstrated in the ensuing sections, a functionalist framework that makes use of concepts such as ease of articulation, perceptual salience and systemic symmetry etc. can successfully account for a wide range of attested sound changes. However, if the emergence of phonemic inventories is indeed a function of perceptual salience (a requirement that contrastive linguistic units remain distinct to avoid confusion), then it follows that any account of sound change is incomplete unless it involves an exact description of functional load (i.e. how much a certain segment contributes to the success of communication) and unless it keeps track of all tokens of misunderstanding that have preceded a given change.
This latter requirement is clearly unfeasible and will hopefully remain so for a long time to come. So in spite of the fact that functional theories can, beyond any doubt, identify certain preconditions without which a given change could not have taken place, the aspiration to fully describe each and every factor that is required to trigger a change is (in line with Bloomfield’s position) obviously doomed to failure irrespective of the chosen theoretical framework. In the analyses to follow we will have to content ourselves with the above restrictions.


When one is concerned with questions such as why sounds change, it is essential to recognize the dual nature of the interrogative pronoun. Bertrand Russell makes the following distinction.

When we ask the question “why” concerning an event, we may mean either of two things. We may mean: “What purpose did this event serve?” or we may mean: “What earlier circumstance caused this event?” The answer to the former question is a teleological explanation, or an explanation by final causes; the answer to the latter question is a mechanistic explanation. … Experience has shown that the mechanistic question leads to scientific knowledge, while the teleological question does not. Russell (1945:67)

The functionalist approach to sound change that is to be presented in detail in section 1.4 makes (as we have hinted above) heavy use of concepts such as ease of articulation, perceptual salience and systemic symmetry. When sound change is described in terms of such factors it is virtually impossible to avoid goal-oriented formulations involving the notion of purpose. If Russell’s rejection of teleology is indeed justified, then the problem of goal-directedness can undermine the validity of functional analyses. In what follows I will turn to somewhat philosophical questions to find out whether the concept of teleology is compatible with science as such and with linguistics in particular.

1.2. The problem of teleology

In contemporary science the invocation of goal-oriented principles is generally looked upon with suspicion. “Blind alley” (Russell (1945:67)) and “poor scientific strategy” (Ohala (1993:263)) are just a few of the epithets attributed to the concept of teleology. And indeed, a wide range of (methodo)logical objections can be raised against its use. However, as far as I can understand, this immense criticism depends more on the ambiguity of the concept than on intrinsic problems associated with it. The long tradition of teleological explanations (spanning from Plato to our present days) has endowed us with rather diverse, even contradictory claims. As a consequence, we can neither dismiss nor adopt the concept without first establishing what we actually mean by teleology.

1.2.1. A philosophical overview

The earliest account of teleological reasoning that I am aware of is found in Plato (2010:14) where Timaeus argues that “everything that becomes or is created must of necessity be created of some cause, for without a cause nothing can be created”. He then further contemplates “the patterns [the artificer had] in view when he made the world”. Accordingly, Plato assumes that goal-oriented patterns and processes are due to the work of an intelligent designer. This type of teleology, which Leunissen (2012:1) refers to as “global, external and intentional”, is opposed to teleologies of the Aristotelian kind, which she labels “local, internal and natural”. When it comes to causation, Aristotle (Physics, Book II, part 3) distinguishes between four explanatory principles known as material, formal, efficient and final causes. On a closer inspection, however, it turns out that this high degree of polysemy (which must have characterized the Ancient Greek word for cause) can be effectively reduced with appropriate [...]. Given that according to Aristotle’s example the bronze and the mould are the material and the formal causes (respectively) of a statue, it seems more fortunate to term them prerequisites or preconditions. Final causes on the other hand have “the sense of end or that for the sake of which a thing is done, e.g. health is the cause of walking about” (Aristotle (2004:20)). It is therefore apparent that a final cause can be equated with what we call a purpose, and is accordingly the very principle with which teleological explanations are usually associated 1. Now that the semantic burden of cause has been decreased it seems natural that in modern, mechanistic science (represented by Galilei, Newton etc.) efficient causes came to be equated with causation proper, i.e. the identification of triggers. Moreover, “the spectacular success of mechanistic thinking led to the conviction that the only type of causation relevant to the scientific enterprise was the type Aristotle had isolated and described as efficient cause” (von Glasersfeld (1990:19)). The view that causal reasoning is scientific while finalistic reasoning is not reached its peak in the positivistic thinking of the 19th century. By this time a considerable body of criticism had been directed at teleological arguments, the most important of which include the following. Reliance on final causes implies the awkward concept of backward causation, which is clearly unacceptable for the natural sciences given that a (yet non-existing) future event or state can by no means determine the course of actions in the present. This is, however, not necessarily as metaphysical as it sounds. When we jog in order to be healthy the alleged future state (health) is of course nothing more than extrapolation from past experience (given our observation that jogging has helped others). Von Glasersfeld (1990:27) underlines that “the procedure of making predictions on the strength of past experience is not only the mainstay of science but also underlies almost every action we carry out in our everyday lives”.
The necessity of extrapolation is overtly reflected in the way new concepts are born. Strictly speaking there is no new knowledge, since every new idea arises from a combination of two previously known facts or ideas. Leunissen (2010:4), who distinguishes between two types of teleological causation (primary/causal and secondary/explanatory), is also of the opinion that “[Aristotle’s teleology] resists the – in itself already anachronistic – charge of backward causation”, since as she claims “Aristotle never attributes causal primacy to final causes in his explanations” (ibid). This primary/causal type of teleology resembles the notion of formal cause inasmuch as the form constitutes necessary conditions of later stages of development and can thus enforce the realization of future features, which are encoded in the form itself. With backward causation out of the way we can now turn our attention to the problem of intentionality, which is intrinsically connected to goal-directedness. While in the light of the above, utterances such as “she jogs to be healthy” seem acceptable, does the same apply to sentences such as “trees keep growing taller to get more sunshine”? Can inanimate entities have aims? It seems obvious that sunshine is rather the trigger of growth and not its final cause. We do not need a teleological formulation to account for the direction of growth either, since those parts of the tree that are exposed to more sunshine also grow faster. When we say that Aristotle’s teleology is natural it only means that it is unintentional as opposed to e.g. Plato’s world, which is steered by an intelligent designer. The restrictions entailed by the fact that intentionality cannot be ascribed to trees, flowers etc. must lead to the conclusion that efficient/mechanistic explanations prove to be vastly superior to teleological accounts when it comes to inanimate entities.
In spite of all this, it is remarkable how often human beings (even when discussing natural phenomena) are inclined to express themselves in a goal-oriented manner both in everyday conversation and even in scientific descriptions.

1 Although it can be claimed that e.g. bronze’s intrinsic telos is to be moulded into a sculpture, I find it advisable not to include such material causes in teleological explanations. Formal causes, on the other hand, can be included, provided that they establish the course of future events. See below.

In chemistry for instance, it is commonplace to say that “all elements try to attain noble gas configuration to acquire stability”. Yet on a closer inspection it is obvious that what really happens is the following. Given that valence electrons participate in the formation of chemical bonds, it follows that having valence electrons increases the likelihood that the atom in question will be absorbed into a molecule. The lack of valence electrons (i.e. noble gas configuration) can be equated with minimum energy and a virtual incapability to participate in chemical reactions. In short, it seems reasonable to claim that chemical elements keep on reacting as long as they have valence electrons (i.e. energy) and not in order to attain noble gas configuration. This line of reasoning is applicable to sound systems as well where instability can be constituted for instance by certain gaps in the phonological system, while an economical and symmetrical inventory embodies a stable system with a minimum energy for further changes. This means that an unstable inventory will be argued to keep on changing as long as it has the energy required to do so and not in order to acquire stability (see section 1.4.). If teleological accounts are indeed to be avoided in scientific descriptions, why do we resort to them so frequently? Regardless of whether or not free will exists, human beings live in the “illusion” that their lives are controlled by aims, desires and intentions. Then they simply extrapolate this goal-directed disposition to their surroundings. This craving for reasons is well reflected in a toddler’s way of constantly asking “why”. Even Aristotle (Physics, Book II, Part 3) remarks that “knowledge is the object of our inquiry, and men do not think they know a thing till they have grasped the why of it”. Even if illusions (such as pain, hunger or teleology) are the construct of the human mind, they still form an integral part of life which we cannot ignore.
This state of affairs, where obviously inaccurate teleological accounts in certain cases arguably contribute more to our understanding of the world than mechanistic explanations, may entitle us to embrace goal-directedness after all. Leunissen (2010:4) claims that this approach can even be observed in Aristotle who “uses teleological principles as heuristic tools for the discovery of causally relevant features”. Such use of the concept is undoubtedly legitimate. Kant arrives at a similar conclusion in his Critique of the power of judgment. The author, who rejects much of the Aristotelian world view, still accepts the use of teleology for heuristic purposes.

It is self-evident that [teleology] is not a principle for the determining but only for the reflecting power of judgment, that it is regulative and not constitutive, and that by its means we acquire only a guideline for considering things in nature, in relation to a determining ground that is already given, in accordance with a new, lawful order, and for extending natural science in accordance with another principle, namely that of final causes, yet without harm to the mechanism of nature. Kant (2000:250-1)

We have thus established that changes exhibited by inanimate entities (such as sounds) can be justifiably described in terms of goal-oriented arguments provided that we realize the heuristic nature of such descriptions. When it comes to humans and some other animals the use of teleology does not have to be confined to heuristics at all. Needless to say that (mechanistic) causes and (heuristically defined teleological) effects, being relational antonyms, are different sides of the same coin for the study of which, if possible, it is best to adopt a holistic approach. There is one more frequently recurring piece of criticism whose irrelevance we have already partially addressed. The presumably most prominent use of the word teleology owes its meaning to William Paley’s Natural Theology (1802), in which the author attempts to

prove God’s existence with what is now known as the teleological or intelligent design argument. Paley argues that design implies a designer. If we were to find a watch on the ground, its intricate structures would entitle us to infer that the watch must have had a maker. While the same apparently does not apply to a stone lying on the ground, the complexity of the universe or of living organisms, on the other hand, can lead us to the conclusion that it is indeed the work of a divine artificer. With the advance of modern physics and biology, Paley’s watchmaker analogy has been rendered irrelevant from a scientific point of view. However, it is so deeply anchored in popular culture that contemporary writers often make allusions to it, cf. The blind watchmaker by Richard Dawkins (1986). This religious overtone to teleology has surely contributed to the term’s bad scientific reputation. Nonetheless, the criticism seems unfounded given the distinction we have made between external (Plato, Paley) and internal (Aristotle) teleologies. External teleology is rightly rejected in a scientific enterprise. However, as we have demonstrated above, some of the term’s other manifestations can be integrated into methodologically sound explanations “without harm to the mechanism of nature”.

1.2.2. Teleology in linguistics

The stance established above that, if we ignore heuristics, mechanistic explanations are indeed superior to finalistic ones stems primarily from the fact that the former involve predictions, while the latter do not. An efficient cause constitutes a point of departure that determines the ensuing changes and whose examination can lead the scientist to discover “the laws of nature”. Such deductive reasoning, which results in exceptionless generalizations, is a requirement without which no explanation can qualify as scientific according to the Galilean view of science. In section 1.1 we arrived at the conclusion that although much of its conditioning can be identified, the ultimate causes of sound change are impossible to explore. In addition, the majority of processes concerning language are best described in terms of tendencies rather than absolute rules, which in the light of the above places linguistics outside the realm of science. However, language is not alone. In fact, all disciplines belonging to the humanities exhibit the above-mentioned shortcomings, namely that the deductive model of science seems inapplicable to them. Adamska-Sałaciak (1989:54) points out that “the investigation of human phenomena is qualitatively different from the investigation of physical reality”². The human sciences are engaged in retrospective analyses and are incapable of providing predictions that could conform to the deductive model. Consequently, it can even be suggested that “what one is dealing with in the humanities is understanding… rather than explanation” (ibid). It follows that the only type of explanation that is available to these disciplines involves teleology. As far as I understand, the fundamental difference between social and natural sciences can be analyzed along the distinction between closed and open systems.
Both the Newtonian laws of mechanics and the laws of thermodynamics (and all other laws of nature) apply to closed systems, where all relevant forces are included in the definition of the system. On the other hand, most phenomena involving humans constitute open systems to which no general laws can be applied. This seems obvious for history, where it apparently suffices to show how something happened without the oppressive burden of having to give causal explanations. Dray (1957:157) argues that “the historian need not show that what is to be explained happened necessarily… [f]or the demand for explanation is… met if what happened is merely shown to

have been possible”. This is worth keeping in mind given that the study of language change pertains to a large extent to historical linguistics.

² See Ohala (1993:267) for a different view.

Linguistic studies, which necessarily exhibit the above-mentioned shortcomings, are approved by some representatives of the discipline and rejected by others, who claim them to be unscientific. This state of affairs is certainly due to the fact that the study of language, spanning from phonetics (i.e. physics and biology) to semantics (i.e. psychology), forms a borderland between the natural sciences and the humanities. As a consequence, certain aspects of language lend themselves well to a description in terms of a high degree of formalism, which brings them close to science proper; still, the ubiquity of the human aspect does not make them completely eligible for the use of the deductive model. As to the status of linguistics, Adamska-Sałaciak (1992:28) remarks that it is indisputably much closer to the humanities than to the natural sciences. In this respect, finalistic reasoning in linguistics is not a choice but the only explanatory mode left at our disposal. As to the potential arguments that can prompt a researcher to adopt a teleological approach, it seems relevant to underline that language displays certain specific goal-directed traits that are absent in many other disciplines. In what follows I will mainly rely on the arguments presented in Adamska-Sałaciak (1992). The indisputably goal-oriented process of language acquisition entails the construction of hypotheses whose goal is to build an efficient grammar. It is not easy to see how the concomitant feature of change could lack goal-directedness as such. The same goes for the distinctly social aspect of language change.
When age, social status, gender etc. are considered as factors of change, we are actually describing the kind of social patterns with which the speakers want to identify themselves, which is a sometimes unconscious but clearly goal-driven aspiration³. Moreover, there are some structural arguments that can support the notion of goal-directedness. This point, which is illustrated by the following quotation, will be further elaborated in section 1.4.

Certain changes simply appear to “make sense” when viewed teleologically: given their effects on the language system (or, more commonly, on one of its subsystems), invoking a goal or function to explain them almost forces itself upon the researcher. Adamska-Sałaciak (1989:57-8)

Even those who, in line with Kiparsky (1995a:16), like to express their reservations about this “excessive” use of teleology, claiming that “it is human to read patterns into random events”, have to acknowledge its validity in the present context as a heuristic tool. Now it is time to turn to the question of how teleology relates to sound change in particular. If we want to reconcile the mechanistic and the finalistic aspects of sound change, it is practical to assume “a two-stage theory… according to which the phonetic variation inherent in speech… is selectively integrated into the linguistic system” (Kiparsky (1995a:3)). Boersma (1997a) takes a rather similar approach with variation and selection as key concepts. He also argues that variation is mechanistic, while selection can be described in teleological terms. There are obvious parallels between this model and Darwinian evolution through natural selection. Change occurs when the perpetual variation of sounds/genes is filtered by some condition, which can be identified with Aristotle’s formal cause. It follows from the above that we have to remedy an apparent inconsistency and reduce Kiparsky’s two-stage model to a single level. Given that variation is nothing more than

incessant random deviation from a certain pattern, we can assume that change is constituted by selection alone, the very bit to which Boersma (1997a) ascribes teleology.

³ The fact that a given goal may often be reached through various means sheds further light on why it is so difficult to make predictions about phenomena in which humans are involved.

1.3. The problem of markedness

There seems to be general agreement concerning the observation that the course of certain phonological changes is frequently reflected in the asymmetries exhibited by the affected oppositions. For instance, in a language where the feature of flatness distinguishes between front vowels like /e/ - /ø/ and /i/ - /y/ and where the rounded phonemes have a more limited distribution, it is safe to assume that /ø/ and /y/ are more likely to be eliminated by a future change than their unrounded counterparts. Such insights about asymmetries (better known as markedness relations) are frequently referred to in phonological descriptions and are even incorporated into the technical apparatus of certain theoretical frameworks such as OT or functional phonology. In spite of its prevalence, the use of markedness is a contentious issue in the phonological literature. The term is notorious for the wide range of senses associated with it, which has led many linguists to dismiss its use. Haspelmath (2005:2) believes that “the relevant phenomena do not require the notion of markedness to understand them”, while Blevins (2004:20) argues that “there is a great deal of empirical evidence against the direct incorporation of markedness into synchronic grammars”. In what follows I will review the main points of criticism relating to the subject and will try to decide whether we indeed should “avoid such a snowball-like concept” (Cser (2003:9)).

1.3.1. Multiple senses

The main criticism directed at markedness relates of course to its multiple meanings. Haspelmath (2005:2) provides a list of twelve senses associated with the term, a list which in my opinion seems deliberately inflated to support his claims. This wide range of meanings is due to the fact that the notion of markedness, which originated in the 1930s as a phonological term, has since been adopted by other fields such as semantics and pragmatics. In fact, if we restrict our investigation to those senses that can be applied in a phonological study, we find that we are left with complexity, difficulty and five others that the author subsumes under the label of abnormality. I will try to show that the list of these remaining seven senses in (1) can be effectively reduced, given that some alleged senses overlap with each other, while others do not stand up to close scrutiny. So it may turn out that markedness is not as polysemous as Haspelmath likes to indicate.

(1) Seven senses of markedness (based on Haspelmath (2005))
a. specification for a phonological distinction
b. phonetic difficulty
c. rarity in texts
d. typological implication / cross-linguistic rarity
e. restricted distribution
f. deviation from default parameter setting
g. multidimensional correlation

The diverging meanings of markedness, which are obviously used as diagnostics of the concept, do not necessarily imply contradictory claims. In fact, in many cases a sound that is phonetically difficult is also rare in texts, deviates from default values, has a restricted

distribution etc. To put it differently, it can be expected that various aspects of markedness converge. This is the very idea expressed in multidimensional correlation, which is a consequence of the fact that asymmetries tend to generate further asymmetries. In this sense it is obvious that (1g) is simply a combination of other markedness criteria and does not belong on a list of core senses. When it comes to the comparison of (1d) and (1f), we are inevitably led to the insight that the two senses can be conflated. Typological implication is a notion that relies on (phonological) universals. Similarly, in a cross-linguistic context default parameters amount to linguistic universals, an observation indicating that the two senses belong together. Haspelmath (2005:11) himself acknowledges that (1f) “is a variant of [typological implication] with additional assumptions about the source of the asymmetries”. As long as we are combining, the close relationship between (1c) and (1e) is also worth considering. It is reasonable to argue that (1e) is merely a source of (1c) and thus should not be included as a separate sense. In the same vein, it can be maintained that (cross-linguistic) rarity and rarity (in texts), i.e. (1d) and (1c), obviously belong together, reducing our list of markedness senses to the following dimensions: simplicity vs. complexity (1a), ease vs. difficulty (1b) and rarity vs. frequency (1c, d, e, f).

1.3.2. Contradictory claims

In order to dismantle the idea that markedness is an inappropriate concept to be used in scientific descriptions we have to further diminish the number of its proposed senses. This can be achieved by examining cases where a certain markedness criterion can imply both a prediction and its exact opposite. The first example to consider concerns the ease-difficulty dichotomy, which can be approached with both physiological and perceptual criteria in mind. If both aspects are included in the definition of difficulty, then we inescapably arrive at the absurd notion of relative markedness, which “refers to a type of continuum on which every element is unmarked on some level, either physiologically, perceptually, or both” (Gurevich (2001:100), which is a critical review of Guitart (1976)). To take an example, a voiced aspirated plosive as opposed to a voiceless unaspirated one is arguably more difficult to pronounce, yet it is perceptually more salient, i.e. easier to perceive. Such conflicts are usually resolved as in Rice (2007:80), whose review of non-phonological markedness diagnostics implies that a sound is unmarked if it is easier to articulate and is perceptually less salient. I am not convinced that it is justifiable to be so categorically biased in favour of speakers. Given that most of us indeed do more listening than speaking, I find it more equitable to adopt the view expressed in Gurevich (2001:100), according to which “[n]either maximal contrast nor least effort is the main guiding principle in human communication i.e. neither the speaker nor the hearer is overwhelmingly preferred”. Although difficulty often correlates with other markedness criteria (especially complexity), it certainly does not serve as a reliable diagnostic. As a consequence, we are left with two competing markedness criteria: frequency and complexity. It is sometimes indicated that the main problems with markedness (e.g.
that it is theory-neutral, extremely polysemous and involves circular, vacuous arguments) can largely be attributed to the fact that it has been adopted by various frameworks often beyond the realm of phonology. In line with this argumentation we could expect that going back to its original sense would render the term unproblematic. However, this is not the case. The term was coined by the Prague School in the 1930s and was originally intended to denote asymmetries attested in phonological oppositions. As is commonly known, we can distinguish between three types of oppositions on the basis of the relation between opposition members: privative, equipollent and gradual (Trubetzkoy (1939:74)), of which privative

oppositions can be best described in terms of asymmetries. So markedness was used to refer to privative oppositions where the presence or absence of a certain feature (a mark) constituted the only difference between the two members. Thus according to its earliest sense markedness can be equated with (structural) complexity. Accordingly, the features of voicing, flatness and nasality are considered as marks, which distinguish between /b/-/p/, /y/-/i/ and /õ/-/o/ respectively. Nonetheless, it seems equally easy to cite cases where a structurally more complex form is still often claimed to be the unmarked member of an opposition. Syllable structure can provide us with a good example. CV-syllables are commonly regarded as unmarked as opposed to V-syllables despite the fact that the former are structurally more complex. The same pattern is observed in the case of certain segmental oppositions such as /s/-/h/ or /t/-/Ɂ/. There is general consensus that /h/ and /Ɂ/ are to be referred to as the marked members of the oppositions despite the fact that they are structurally simpler than /s/ and /t/, which have an additional place of articulation. In fact, implicational hierarchies suggest that if a language exhibits /h/, /Ɂ/ or V-syllables then it is also expected to have /s/, /t/ or CV-syllables respectively. It is apparent that markedness relations in the above cases are established on the basis of implicational universals and (cross-linguistic) frequency rather than on the notion of complexity. Szigetvári (2006:444) calls this contradictory state of affairs the markedness of the unmarked and remarks that “zero complexity is discouraged in language”. It means that the use of complexity as a markedness diagnostic is legitimate only if certain minimal structures are satisfied. All this indicates that frequency is to be looked upon as the most reliable markedness criterion.

1.3.3. Frequency and predictability

The wide range of senses the term markedness can be associated with can also be illustrated with the equally long list of criteria used to determine markedness relations. The examples in (2) and (3) below are taken from Rice (2007:80) and are frequently encountered in the phonological literature.

(2) Non-phonological markedness criteria
a. less natural, less common
b. more complex, more specific
c. unexpected, not basic
d. less stable
e. appears in few grammars
f. later in acquisition
g. early loss in language deficit
h. implies unmarked feature
i. harder to articulate
j. perceptually more salient
k. smaller phonetic space

(3) Phonological markedness criteria
a. subject to neutralization
b. unlikely to be epenthetic
c. trigger of assimilation
d. remains in coalescence
e. retained in deletion

Some of the above criteria are clearly incompatible with each other. To take an example, it is contradictory that a perceptually more salient phoneme should have smaller phonetic space (2j, 2k) or that an unstable sound should be kept in coalescence and deletion (2d, 3d, 3e). Rice (2007:82) also asserts that both the emergence (e.g. neutralization and epenthesis) and the submergence (e.g. assimilatory loss and other deletions) of the unmarked can be employed as phonological diagnostics for featural

markedness. This is, however, tantamount to claiming that an unmarked phonological entity is both stable and unstable at the same time. In order to resolve this latter contradiction it might prove helpful to recall that in the previous section we identified frequency as the most reliable markedness criterion. Given that frequency correlates with experience, which in turn correlates with predictability, the concept of markedness can be fitted into a predictability-based approach. We can argue in line with Hume (2004:189-90) that “elements that are predictable within a system are more likely to undergo change … [because the] most expected, i.e. unmarked, category is thus the one with the least information content”. Bias towards frequent sounds in neutralization and epenthesis (i.e. the emergence of the unmarked) can be accounted for in a similar fashion. Given that both processes involve environments where phonological oppositions are suspended, it is not surprising that these environments are filled with segments whose information content is low. This can be seen as some further support for the claim that frequency is the basis of markedness. Although Hume (ibid) is of the opinion that “information content as a quantifiable alternative [is superior] to markedness”, it has to be added that her critique concerns an ill-defined concept with multiple senses and contradictory claims. Frequency-based markedness defies such criticism.
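
The link between frequency, predictability and information content can be made concrete with a small calculation. The following is only a minimal sketch: it assumes that information content is measured as Shannon surprisal (the negative base-2 logarithm of relative frequency), which may differ in detail from the measure Hume has in mind, and the token counts are invented purely for illustration.

```python
import math

def surprisal(counts):
    """Return the surprisal (-log2 of relative frequency) of each item, in bits."""
    total = sum(counts.values())
    return {item: -math.log2(n / total) for item, n in counts.items()}

# Hypothetical token counts for the two members of an opposition
# (illustrative figures only, not taken from any corpus).
hypothetical_counts = {"/i/": 750, "/y/": 250}

bits = surprisal(hypothetical_counts)
for phoneme, value in bits.items():
    print(f"{phoneme}: {value:.3f} bits")

# The more frequent member is the more predictable one and therefore
# carries the least information.
least_informative = min(bits, key=bits.get)
print("least information content:", least_informative)
```

Under these assumed counts the frequent member of the opposition has the lower surprisal, which is the sense in which the most expected category can be said to carry the least information.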

1.3.4. The sources and effects of frequency

When we consider the various diagnostics of markedness used in the literature (cf. Rice (2007:80)), we find that many of them relate to frequency in one way or another. Some of them are virtually synonymous with it (cf. general, common, natural), some others express concepts that contribute to it (cf. simpler, easier to articulate/perceive, result of neutralization) and still others relate to its consequences (cf. earlier in acquisition, lost in deletion, appear in more grammars, implied by a marked feature). The concept of neutralization is generally considered to be central when it comes to determining markedness relations. This is certainly due to Trubetzkoy (1939:78-79), who argues that “[i]n those positions in which a neutralizable opposition is actually neutralized, the specific marks of an opposition member lose their distinctive force”. Word-final devoicing of obstruents, attested in e.g. German and many Slavic languages, is an illustrative example. Polish chleb ‘bread’ pronounced with final [p] demonstrates that the opposition of /p/-/b/ is suspended in a voiceless plosive, which is consequently the unmarked member of the opposition. However, it would be a serious misconception to assume that all phonological entities found in suspended oppositions are unmarked. This can be highlighted with the case of complex onsets in, for example, English or Swedish. Whenever the onset in these languages is made up of three segments, it is subject to various phonotactic restrictions, according to which the consonantal opposition of the first element is always suspended in a voiceless alveolar fricative /s/. Does this observation entitle us to claim that /s/, a phoneme that appears later in language acquisition than /m/, /b/ or /t/, is the least marked of all in English and Swedish? I would like to argue that we must answer in the negative. The obvious difference between this type of neutralization and the one discussed above relates to the concept of choice.
In the case of neutralization in branching onsets the resulting fricative is imposed upon the language by phonotactic constraints originating from the sonority scale. Neutralization is thus the consequence of certain internal restrictions. In the case of word-final devoicing, however, neutralization is a language-specific choice. Given that choice (between distinctive units) is a fundamental notion in phonology, I would like to maintain that neutralizations of the former kind (i.e. where the loss of contrast is enforced upon the language by contextual or structural requirements rather than choice) do not qualify as

markedness criteria. In order to grasp the distinction between the two phenomena, we can refer to them as imposed vs. optional neutralization.

Cross-linguistic frequency and implicational hierarchies are two proposed markedness criteria that require closer inspection. Although the notion of implication is undoubtedly an efficient tool that can grasp meaningful generalizations, it is often put to improper use. Statements like “[i]f a language has syllables that lack an onset, then it also has syllables that have an onset” (Kager (1999:93)) are ubiquitous in the phonological literature. However, if we ponder over it for a while, it becomes obvious that such implicational claims make little sense. As it is very well known that all languages have CV-syllables, the presence of potential V-syllables is completely irrelevant. Kager could also have said that if a language has a long literary tradition, it also has syllables that have an onset, a statement that is equally true, yet equally frivolous. Linguists should avoid such misuse of logical premises. It is often argued that cross-linguistic frequency can provide evidence for universal markedness relations. What this implies is the following. The linguist observes a given pattern in a number of languages, which she wants to claim is universal. The more languages she can identify that conform to the expected pattern, the more certain she can be that the observed phenomenon is indeed universal. On the other hand, it is obvious that a long list of examples may serve to illustrate a point but not to prove it. This is self-evident as far as the natural sciences are concerned and can be aptly demonstrated with Euler’s famous polynomial x² – x + 41 (cf. du Sautoy (2003)), which produces prime numbers for all integer values of x from 0 to 40 but not for 41, 42, 45 etc. This means that a false generalization (e.g.
that x² – x + 41 is always a prime) may be supported by a long line of examples without the researcher realizing it. So what does constitute a proof? A good suggestion would be to look for generalizations outside the scope of the given phenomenon, generalizations that serve to condition it. In linguistics this would amount to claiming that “phonological markedness constraints should be phonetically grounded in some property of articulation or perception” (Kager (1999:11)). Another issue concerning cross-linguistic studies and implicational universals is reflected in the following problem. Phonetically well-grounded generalizations are often accompanied by strong cross-linguistic tendencies, which nevertheless more often than not exhibit a few exceptions. Maddieson (1984:13-14) reports an alleged universal implication (not respected by five languages), namely that “[n]asal consonants do not occur unless stops (including affricates) occur at (broadly speaking) the same place of articulation”. This is a generalization with a sound phonetic basis. Antiresonance created by the nasal cavity renders nasal consonants acoustically weak, which implies that the place of articulation of non-nasals is easier to determine, cf. Raphael & al. (2007:140). However, do the five exceptions undermine the validity of the implication? I would like to claim that they do not and especially for the following reason. Historical sound changes spanning several centuries teach us that completely regular and relatively frequently occurring sound shifts may give rise to unnatural inventories in intermediate stages of the change. The consonantal inventory of Proto-Germanic for instance arguably had voiced plosives but lacked voiceless ones at a certain point in its history due to the shifts known as Grimm’s law, cf. Lass (1994:20). This unnatural, intermediate stage was of course remedied by later changes.
The point is that it is highly probable that a cross-linguistic study that takes hundreds of languages into consideration will include a few that are engaging in such common shifts at the moment. If the goal of cross-linguistic studies is to reveal what is normal in language as such, our insights should not be blurred by temporary abnormalities, even if it is rather difficult in most cases to apply a dynamic approach to determine what is temporary and what is not. The conclusion of the two problems discussed above can be summarized as follows. Exceptionless patterns emerging from cross-linguistic studies should not be considered to prove a point, if they lack phonetic grounding. On the other hand, counterexamples to

phonetically based generalizations exhibited by a handful of languages do not necessarily invalidate a given claim. So it seems that phonetic grounding is clearly superior to the insights provided by cross-linguistic data when we are to determine phonological markedness. In this respect I disagree with Kager (1999:11) when he claims that “phonetic evidence from production or perception should support a cross-linguistic preference”. As far as I understand, it is the other way round.
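
The point made above with Euler’s polynomial, namely that a long run of confirming cases does not amount to a proof, can be checked directly. The sketch below is merely illustrative: the function names are my own, and a simple trial-division test is assumed to be sufficient for numbers of this size.

```python
def is_prime(n):
    """Trial-division primality test, sufficient for small integers."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def euler(x):
    """Value of the polynomial x^2 - x + 41."""
    return x * x - x + 41

# Every value for x = 0..40 confirms the (false) generalization ...
confirming = [x for x in range(0, 41) if is_prime(euler(x))]
print(len(confirming), "confirming cases out of 41")

# ... yet it breaks down as soon as we look beyond that range.
counterexamples = [x for x in range(41, 46) if not is_prime(euler(x))]
print("counterexamples among x = 41..45:", counterexamples)
```

Forty-one consecutive confirmations are followed by counterexamples at 41, 42 and 45, which is the situation a cross-linguistic survey without phonetic grounding may find itself in without realizing it.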

1.3.5. Universality

In Trubetzkoy’s (1939) use of the term, markedness referred to asymmetries observed in privative oppositions on a language-specific basis. Jakobson (1963:208), on the other hand, advocated the quest for universal linguistic laws, which can contribute to both “rapprochement between linguistic and mathematical thought” and the disappearance of isolationism from linguistics. Markedness thus became universal instead of language-specific and its scope was vastly extended so that it could be included in the description of Universal Grammar. The concept was indeed integrated into Chomsky & Halle (1968) and has continued to feature in (post)generative works ever since in the widened sense described above. The innovation of OT, according to which universal principles (expressed in terms of markedness) can be violated, led to the enhancement of abstraction (since universal patterns do not necessarily surface) and resulted in an unprecedentedly heavy reliance on universal markedness, a concept that, in my view, leaves much to be desired. Rice (2007:85) asserts that in the field of featural markedness “emergence-of-the-unmarked diagnostics do not yield the same results cross-linguistically, suggesting that there is not a single universally unmarked consonant or vowel”. In fact, she concludes that what is regarded as unmarked is a function of the inventory, which means that “a feature may pattern as marked if some contrast is present, but as unmarked in the absence of that contrast” (p. 88). This of course points to the possibility that markedness is language-specific after all. Similar assumptions can be deduced from Hume (2004:183), according to whom “virtually any place of articulation can pattern as unmarked in some language”.
She then proceeds to demonstrate through English and Portuguese examples that high token frequency and ensuing predictability can give rise to the emergence of unmarked patterns, which consequently differ from one language to another. This may have serious repercussions for those theories that rely on the concept of universal markedness and for OT in particular. According to the generative approach “UG contains (but is not limited to) that which is common across languages: markedness captures these commonalities and must therefore be part of the UG” (Gurevich (2001:106)). The supposition that markedness relations are language-specific obviously raises questions about the validity of UG. However, we can go even further and claim that ascribing a certain role to markedness in individual grammars is a dubious enterprise even on a language-specific basis.

[m]ost knowledge encompassed by markedness issues is knowledge a learner either does not require (such as knowing a process in their language is more natural than a process in some other language) or can easily determine with no prior knowledge (e.g. that intervocalically voiced stops are easier to articulate than voiceless ones). Gurevich (2001:96)

It is undeniable that markedness has both universal and language-specific traits associated with it. The universal traits obviously stem from what all humans have in common, namely the physiological basis of speech. In other words, universal patterns are imposed on the speaker solely by perceptual and articulatory constraints. It follows that we need not reserve a separate

compartment for universal features in our grammars and as a consequence we need not make a distinction between language-specific and universal patterns either (at least as far as the speaker is concerned). If markedness can indeed be derived from frequency, which in turn at times can be derived from physiological facts, then it has to be acknowledged that postulating markedness as such in the grammar may seem superfluous. On the other hand, markedness, being the convenient tool it is, can (on condition that it is properly defined) be included in grammatical descriptions without causing much harm. The fact that it is derived from other entities certainly does not disqualify it from being an integral part of the linguist’s descriptive arsenal. This uneasiness about superfluity has too firm a grip on linguistic thinking. For the sake of comparison, physicists are not in the least ashamed to use derived entities. It would be absurd for a physicist to refrain from e.g. the use of force claiming that its being a product of mass and acceleration renders it superfluous. If a derived concept can make life easier and shed light on processes that have been hitherto difficult to grasp, I cannot see why it should not be made use of, given that in a sense all derived concepts are heuristic tools. The position that universal markedness is strictly confined to patterns conditioned by physiology may lead us to consider a slightly modified view of OT in the next chapter, where the analyst is not entitled to whimsically make up any sort of constraint she needs.

1.3.6. A summary of the main tenets

In the current section I will reiterate the most important aspects of markedness as I understand it in the light of the above discussion and add some further remarks. We have come to the conclusion that of all its proposed senses (complexity, frequency and difficulty) markedness correlates most reliably with frequency. Its correspondence to complexity is restricted by certain minimal requirements, while the use of difficulty as a markedness criterion can lead to illogical, contradictory claims. This means that less marked can, for all practical purposes, be equated with more frequent. However, it is important to point out that they are not exact synonyms, as Haspelmath (2005:8) indicates when he claims that “[t]here is no reason why we should not use the words frequent and rare when we intend them”. According to the usage I would like to advocate, markedness is intrinsically connected to the notion of oppositions, while frequency is not. To put it differently, talking about marked phonemes, categories etc. only makes sense in terms of an opposition. Given that markedness is based on frequency and not on structural complexity, the notion is not restricted to privative oppositions as in the early days. This will be of importance in the next chapter, where we will see that it is far from evident whether the accentual opposition is privative or equipollent. Another corollary of our frequency-based approach is that markedness is best described as a language-specific concept. The universal traits we encounter are attributable to certain physiological factors. When we are to determine markedness relations on the basis of frequency, we must remember that markedness is intrinsically connected to (phonological!) oppositions. It follows that suspended oppositions can be taken into consideration only on condition that they are the result of phonological choice.
Consequently, frequency counts should not be affected by imposed neutralizations. Those authors who are inclined to make a case for eliminating the notion of markedness from linguistic terminology (cf. Gurevich (2001), Haspelmath (2005) and Hume (2004)) usually support their conviction by claiming that it is an ill-defined term, which is engaged in circular reasoning and produces contradictory claims and vacuous predictions; arguments which suffice to disqualify a notion as a scientific concept. Gurevich (2001:111) argues against its use suggesting that “the phenomena subsumed under the cover term markedness are individually far more valuable than the cover term itself”. In response to the above charges, I would like to propose that contradictory claims and circular reasoning are obviously a consequence of the term’s unsatisfactory definition. It seems evident to me that when a linguist encounters a term whose definition is inadequate the proper thing to do is not to back out but to amend the concept. Given the arbitrary nature of the linguistic sign, nothing prevents us from coming up with a workable definition. As far as the notion’s predictive powers are concerned, I must say that I simply do not agree with the critics. In the following chapters I will try to demonstrate that markedness (as defined above) can indeed predict (or in light of our discussion on teleology rather explain) both cross-linguistic frequency and the direction of sound change and analogy.

1.4. Functional phonology

The title of the present section undeniably constitutes a sort of tautology. Phonology, being the study of sound systems, deals by definition with an organized group of abstract entities where membership is granted by the distinctive function of a given sound. Indeed, Malmberg (1968:103) refers to the discipline as “phonétique fonctionelle”. So why would a phonologist insist on calling herself a functionalist if the notion is so deeply rooted⁴ in the definition of the concept? The answer to this question requires a short historical outline.

1.4.1. An overview

André Martinet, who is generally looked upon as the father of functional phonology, was heavily influenced by the achievements of the Prague Linguistic Circle, which is usually referred to as a structuralist rather than functionalist movement. Martinet shared most of their views concerning linguistic structure and methodology, which are clearly palpable in his own attitude to linguistics. In his early writings in the 1930’s and 1940’s, however, the functionalist label, which starts to feature prominently in the following decades especially in his Langue et fonction (1962), does not yet appear. By the 1950’s structuralism had become an umbrella term, which designated a diversity of movements, cf. Martinet (1955:11). Its most notable branch was associated with Roman Jakobson, whose influence on linguistic thinking in the post-war USA is clearly detectable (cf. Caton (1987)) and was especially significant for the emergence of generative phonology. By this time, mainstream phonology had distanced itself considerably from its Praguian roots, resulting in a highly technical and abstract theory. Once an approach is preoccupied with its own technicalities, the fundamental aspects of language and its use are easily neglected. With increased abstraction, when the analyst gets further and further away from parole, it is more and more likely that she will overlook the fact that language is essentially a means of communication, a system whose users want to understand others and more importantly want to make themselves understood. Against this background, the appearance of the superfluous label of functionality is an obvious reaction to the development described above. The more ground these ideas gained, the more opposition they generated. Givón (2013:17) expresses similar views when he writes about “[another] functionalism that emerged out of the anti-Generative rebellion of the late 1960s”.
The opposition between generative and functionalist approaches is of course not restricted to phonology. Haspelmath (2006:22) explores the differences in the way the two movements explain syntactic universals and implies that functionalism is a more holistic approach, whereas “generativists see no role for performance in explaining competence”. Bischoff & Jany (2013:1) argue that “language use and the evolutionary and adaptive processes leading to current language usage… rather than abstract representations” are the cornerstones of functionalist explanations, which indicates that explanation and description are separate concepts. To put it simply, we can conclude that functionalism is an approach with an appeal to common sense, an approach that prevents the linguist from getting lost in the perplexing maze of abstract representations by returning now and then to the fundamental question: “what is this really about?”

4 The obvious fact that phonology is inherently functional entails that many of the following arguments in favour of a functional approach must inevitably be statements of the obvious.

1.4.2. The functions of language

The fact that language is commonly defined as a system of communication clearly designates its primary function as communicative. It can also be argued that communication is essentially the externalization of thoughts, so it is more accurate to call language a tool for thinking instead. I readily acknowledge that communication only constitutes a subset of the cognitive processes associated with language; however, as far as phonology is concerned, we can certainly delimit our scope to such manifestations of language that feature sounds. In line with this we can maintain that communication is indeed the primary function of language as we understand it. Although a large number of further functions have been identified in the literature, most of them can be subsumed under a communicative label. Furthermore, many of them need not be reviewed in this context because they are not relevant for a phonological discussion. Among others, this applies to Halliday (1975), whose proposed functions (instrumental, regulatory, interactional, personal, heuristic, imaginative and representational) relate to the language use of children and belong to the field of pragmatics. Jakobson’s (1960) six functions of language also include some that are beyond the scope of phonology; still, they are worth considering here given that the account is based on an actual communication model, which of course also incorporates the phonological aspects of language. Jakobson’s model posits six factors that he considers essential for verbal communication and attaches a function to each of these. The factors are as follows (with the corresponding functions in brackets). He postulates an addresser (emotive), who conveys a message (poetic) to an addressee (conative). Moreover, successful communication requires a context (referential) and a code (metalingual) known to both persons involved, who obviously have to be in contact (phatic).
The model outlined above is clearly holistic in the sense that the factors involved ensure a “globally” successful act of communication. However, as far as phonology is concerned, the requirements are met if the addressee is able to analyze and segment the message into familiar phonological entities. It follows that the following three functions are irrelevant in a phonological analysis: (1) the referential function with its intended meaning, which can be derived from the context as a conversational implicature (cf. Grice (1975)), (2) the poetic function with its regulative and manipulative effects and (3) the phatic function, which according to Jakobson can express the purpose of starting, sustaining or ending a conversation. Accordingly, we are left with the three functions that correspond to Jakobson’s addresser, addressee and the code between the two. These three functions can be conveniently subsumed under a communicative label, given that a signal conveyed between a source and a receiver is what most people conceptualize as the core essence of communication. Given that the sole criterion for a phonologist is that the signal be suitable for segmentation and analysis, the signal need not be meaningful, which means that even counting rhymes can qualify as communicative acts. On the other hand, there are certain cases (e.g. when people swear or think aloud) where there is no addressee to interpret the signal. I would like to claim that from a phonological perspective such instances should not be categorized as communicative acts given that it does not emerge whether the signal was prone to segmentation or not⁵. As a consequence it can be claimed that on a phonological level the communicative function of language is determined by exactly three factors: the addresser, the addressee and the code.

1.4.3. Sound change and the three pillars of the communicative function

It seems reasonable to assume that the three factors identified above, which make up the phonological model of communication, are also relevant as far as the formation of sound inventories is concerned. It follows that we will have to transform these factors into phonological principles, which reflect both the addresser’s (the speaker’s) and the addressee’s (the listener’s) perspectives. How difficult the latter’s task of analyzing a signal into distinct units is depends on whether the units to be distinguished are perceptually distinct (and salient) or not. The speaker’s task, on the other hand, is facilitated if she can formulate the signal with as little energy as possible. As the participants of the conversation are connected by a code they both have to master, it is their joint concern that it be easy to learn. To sum up, the listener is interested in the minimization of perceptual confusion, the speaker aims at the minimization of articulatory effort (cf. Boersma 1998:2), while both prefer a symmetric and manageable phonological system, in which a large number of phonemes can be described with few distinctive features. This latter requirement can be referred to as the principle of economy. These three principles (summarized in (4) below) are inherently in conflict, since reduction of articulatory effort cannot lead to increased perceptual salience. Furthermore, an optimal dispersion of speech sounds may help to avoid perceptual confusion; however, optimal dispersion in an asymmetric vocal tract often prevents sound systems from having symmetric inventories.

(4) Functional principles in phonology
a. minimization of articulatory effort⁶ (e.g. assimilation)
b. minimization of perceptual confusion (e.g. secondary split)
c. maximization of economy of description (e.g. functional load, analogy)

It follows that it is impossible to devise a phonological system that meets the requirements of all three principles. What this means is that every system has unstable points, which can trigger innovations. A phonological change can thus be conceptualized as an optimization process, which eliminates patterns that violate some of the principles discussed above. This leads us to the question of how such optimizations are implemented. Given that a change may violate some principles while satisfying others, it follows that the concept of optimization requires that the net result be positive. To put it differently, “a sound change is carried through if it matches at least two of the three criteria” (Boersma (1990:2)). This has some important implications. First, the improvements need not be measured in absolute terms.

5 It can be ignored that the speaker can segment her own message.
6 Ladefoged (1990:343) expresses the view that ease of articulation is language-specific and appeals to it are therefore unscientific. However, given that I do not intend to use the concept in a cross-linguistic context, I assume that Ladefoged’s criticism does not concern the present study.

It suffices if the system arrives at a change by majority vote. Second, if a given change improves system A by transforming it into system B we can assume that the reverse of the change results in a decreased compliance with our three principles and is thus not expected to take place. It follows that every phonological change is unidirectional and “must be treated as a function of the system within which it takes place” (Jakobson (1949:185)). These three functional principles can be applied to different domains. So far we have used them to explain the formation of optimal inventories. However, given that many sound changes affect the distribution of phonemes rather than the inventory itself, the principles sometimes relate to lexical items instead. Keeping this in mind, (4a) above can pertain to e.g. assimilations, (4b) to phonologization of allophones upon the loss of a previously distinctive segment and (4c) to changes connected to functional load. This latter category includes the deletion of functionally weak sounds from a sequence but also the increase in type frequency in the case of unreasonably asymmetric oppositions (cf. the virtual lack of /b/ in early Indo-European). Despite the fact that analogy strictly speaking cannot be classified as a sound change (cf. Passy (1891:104)), it still contributes to enhancing symmetry and economy of description. As a consequence, (4c) can even be invoked to explain analogical processes. This approach is markedly different from many other theories of sound change. First of all, it is not concerned with the question whether a given change is a law or a tendency. Given that even sporadic changes and vague tendencies can improve a phonological system, the validity of the Neogrammarian position that sound changes are exceptionless laws is perfectly immaterial for the present discussion. Second, sound change is not conceived of as rule addition and is not claimed to be confined to oncoming generations.
Phonological changes automatically restructure the system in which they have occurred, as is implied by the concept of optimization, which requires the comparison of two systems. Furthermore, it can be clearly demonstrated (e.g. by analyses of Queen Elizabeth II’s Christmas broadcasts, cf. Harrington (2006)) that even the sound system of adults is subject to change. Lastly, it is worth pointing out that the first two of our functional principles (4a) and (4b) correspond to a certain extent to the speaker- vs. listener-oriented markedness and faithfulness constraints of optimality theory. As a consequence, OT seems to lend itself perfectly to analyses in terms of functional principles (cf. Boersma (1998)), however, its disregard of the principle of economy renders it problematic for the description of sound changes that involve the concept of an inventory, cf. McMahon (2007).
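
The majority-vote criterion discussed in this section can be illustrated with a minimal sketch. It assumes, purely for illustration, that the effect of a change on each of the three principles in (4) can be coded as +1 (improves compliance), -1 (worsens it) or 0 (neutral); the names Change, votes_in_favour, is_carried_through and reverse are hypothetical and do not come from Boersma (1990) or any other source cited here.

```python
from dataclasses import dataclass

# Hypothetical labels for the three functional principles in (4).
PRINCIPLES = ("articulatory_effort", "perceptual_confusion", "economy")


@dataclass
class Change:
    """A candidate sound change (illustrative coding, not an empirical claim)."""
    name: str
    # For each principle: +1 improves compliance, -1 worsens it, 0 is neutral.
    effects: dict


def votes_in_favour(change):
    """Count how many of the three principles favour the change."""
    return sum(1 for p in PRINCIPLES if change.effects.get(p, 0) > 0)


def is_carried_through(change):
    """Majority vote: at least two of the three principles favour the change."""
    return votes_in_favour(change) >= 2


def reverse(change):
    """The reverse change has the opposite effect on every principle."""
    flipped = {p: -change.effects.get(p, 0) for p in PRINCIPLES}
    return Change(name="reverse of " + change.name, effects=flipped)


# A toy change favoured by two principles and disfavoured by one.
toy = Change(
    name="toy change",
    effects={"articulatory_effort": 1, "perceptual_confusion": -1, "economy": 1},
)
print(is_carried_through(toy))           # favoured by two of three principles
print(is_carried_through(reverse(toy)))  # the reverse is favoured by only one
```

Under this coding, a change that wins the vote can never have a reverse that also wins it, which is one way of picturing the claim that every phonological change is unidirectional with respect to the system in which it takes place.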

1.4.4. The functionalist tradition

As earlier functional approaches to phonology do not always agree on the way functional principles should be defined and used, it comes as no surprise that we find a certain degree of terminological confusion in the field. In what follows, I will present three cornerstones of the functionalist tradition (Passy (1891), Martinet (1955) and Boersma (1990), (1997a, b)) to see on what points they differ from each other and from my own understanding.

Passy’s principles

Passy (1891:227) posits that the driving forces of sound change can be defined as the principle of economy and the principle of emphasis. According to the former, languages tend to get rid of superfluity, while according to the latter they tend to enhance necessary elements. He acknowledges that it is not always viable to make a clear-cut distinction between superfluous and necessary items. Given the uncertain factual base surrounding ease of articulation in the 19th century, he chooses not to include the notion among his principles and also refuses to conflate it with economy claiming that what is economical is not always easier to pronounce. Finally, he considers the origins of phonetic tendencies and distinguishes between internal (the role of emotions), external (the role of climate and habits) and ethnological (language contact) influences, most of which he expresses serious reservations about. Given that sound change is claimed to emerge from the interplay of reductive and preserving forces, Passy’s principles reflect of course to some degree the inherent conflict between OT’s markedness and faithfulness constraints. However, the fact that Passy’s account does not rely on the notion of markedness suggests that the two approaches can lead to rather diverse implications. This is the case with e.g. the notion of strengthening, where it can be shown that pursuing Passy’s approach we can capture generalizations that markedness-based theories have overlooked. Consider the following hypothetical sound changes in (5) occurring word-initially in stressed syllables.

(5) Changing onsets
a. /p/ > /pʰ/ > /pf/ > /f/
b. /ki/ > /kji/ > /kç/ > /ç/
c. /dʒ/ > /ʒ/

What these three changes have in common is that the last stage of each chain is traditionally described as markedness reduction. The previous stages (in the case of (5a, b)) require additional articulatory effort and therefore reflect an increase in markedness. These chains are thus the products of opposing tendencies. One inevitably arrives at a similar conclusion if the changes in (5) are analyzed in terms of lenition and fortition. It turns out that /p/ > /f/ and /k/ > /ç/ involve both weakening and strengthening. Yet why would such opposing processes take place in identical environments? Cser (2003:17) points out that although the concept of lenition is common currency in phonological descriptions, it is clearly ill-defined and can give rise to contradictory claims. Passy’s principle of emphasis, on the other hand, allows us to give a uniform description of the changes in (5). Recall that all of these developments take place word-initially in stressed syllables, i.e. in strong environments. It seems obviously counter to common sense to assume that strong environments trigger weakening processes. If in line with the principle of emphasis languages indeed enhance necessary elements (such as stressed syllables), then all such changes should be looked upon as strengthening processes, contrary to popular belief. The real question of course is what actually undergoes fortition. Most consonants are made up of a bundle of articulatory gestures, whose timing is not necessarily simultaneous. A fresh glance at the data in (5) reveals that what indeed gets strengthened is the consonantal feature that is closest to or is co-articulated with the following stressed V. When the two phases of the C are distinct enough to be felt as two segments, phonotactic considerations cancel the first one⁷. Lenition is thus not involved at all. The above examples are echoed in many actual historical changes, many of which are often erroneously classified as weakening processes.

7 In a similar manner, branching onsets of loanwords in Finnish retain the C closest to the stressed V (cf. (s)tuoli (chair), (g)lasi (glass) etc).

A comparison between Passy (1891) and Martinet (1955)

The term “économie” is a central concept for both Passy and Martinet, although with fundamentally different associations. Passy’s pre-structural approach revolves around the notion of the segment, while Martinet is preoccupied with a system of segments. The difference manifests itself rather clearly e.g. in the opposition between /θ/ and /ð/ in English. The distribution of the latter is severely restricted, so the opposition’s functional load too is very limited. According to Passy’s theorem /ð/ is thus a superfluous element that the language should eliminate. However, Martinet (1955:78) points out that it is not the functional load of individual oppositions that counts, but the functional load of the feature that the given opposition is integrated in. The voiced-voiceless opposition in English is so stable that the opposition between /θ/ and /ð/ is not threatened in any way. Economy of description does not entail that the number of phonemes or features should be kept as low as possible, otherwise it would be very difficult to account for differences in inventory size between languages such as the Rotokas language (11 segments) of Papua New Guinea and !Xu (141 segments), a Khoisan language of Botswana, which are claimed to have the smallest and the largest inventories in the world respectively, cf. Maddieson (1984:7). Economy refers instead to the ratio of the number of phonemes and the number of distinctive features, so a good system will maximize the number of phonemes with regard to the features employed by the language, cf. Martinet (1955:100). My own principle of economy of description is very much akin to Martinet’s approach. There is one substantial difference, however. Martinet (1955:94ff) uses the term in a much wider sense and claims that “économie” is a ubiquitous aspect of language that covers everything. He defines it as the balance between constantly changing communicative needs and the unchangeable human inertia, a combination that also can be referred to as the law of least effort. Language change, which Martinet claims is always economical, is thus explained as the result of this unstable balance. I have the following objections.
First of all, using the cryptic notion of communicative needs, which allegedly change from one period to another and which he never explains in any detail, the author tries to evade the uncomfortable truth that he cannot identify the factors underlying language change. Second, the discussion of his other core notions (function and structure) largely overlaps with many aspects of economy, which shows that the three concepts are interrelated. Still, the definition of economy does not reveal the nature of this relation. We are not told how function and structure, which are strictly internal factors, are connected to the external factor of inertia and the dubious notion of communicative needs. Martinet and Passy also differ in their use of internal and external factors . Passy’s remarks on the role of climatic conditions, emotions and habits are obviously obsolete and do not make any sense in a modern linguistic description. Martinet’s (1955:20-2) position, on the other hand, is worth considering. The author proposes that internal factors should be defined as anything that is purely linguistic, i.e. arbitrary in the Saussurean sense of the word, so a description in terms of internal factors has to abstract away from everything that is automatic, necessarily present and non-distinctive. This definition is somewhat problematic. It suggests for instance that the physiological basis of speech, which is common to all humans, is an external factor, while a given V inventory is internal, since it is obviously language-specific. Now if V inventories (internal factors) are shaped by a tendency to maximum dispersion, which necessarily reflects the shape of the oral cavity (external factors), it is virtually impossible to find any internal feature that is not influenced by external factors. It is therefore inconceivable to mark the exact boundaries of the internal domain, which renders the term rather vacuous. 
Furthermore, non-distinctive features (such as aspiration rules in English and Swedish), which are supposedly external, vary extensively across languages. Yet Martinet purports that everything that distinguishes between two languages is internal. Similarly, what label can we attach to Saussure’s parole, which is an exteriorization of the langue, when it cannot be described as being automatic or necessarily present? Consequently, I have to reject Martinet’s terminology. Instead, I would like to suggest that all aspects of a particular phonological system that reflect our three functional principles be subsumed under the label of internal factors. Any development that cannot be traced back to majority vote (i.e. at least two principles should be in favour of it) is due to external factors, the most important of which is language or dialect contact.
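
The notion of economy as the ratio of the number of phonemes to the number of distinctive features, as described above with reference to Martinet (1955:100), can be made concrete with a small sketch. The function names are hypothetical, the inventories are invented toy figures rather than data from any real language, and the second function rests on the additional assumption (not made in the text) that all features are binary.

```python
def economy_ratio(n_phonemes, n_features):
    """Phonemes per distinctive feature; a higher value means a more economical system."""
    if n_features <= 0:
        raise ValueError("a system needs at least one distinctive feature")
    return n_phonemes / n_features


def feature_utilization(n_phonemes, n_features):
    """Share of the 2**n combinations of n binary features that are actually used.

    Assumption for illustration only: every feature is binary and freely combinable.
    """
    if n_features <= 0:
        raise ValueError("a system needs at least one distinctive feature")
    return n_phonemes / 2 ** n_features


# Two invented systems with the same number of phonemes but different feature counts.
system_a = {"phonemes": 8, "features": 3}
system_b = {"phonemes": 8, "features": 5}

for label, system in (("A", system_a), ("B", system_b)):
    ratio = economy_ratio(system["phonemes"], system["features"])
    used = feature_utilization(system["phonemes"], system["features"])
    print(label, round(ratio, 2), round(used, 2))
```

On this toy measure, system A is the more economical of the two because it derives the same number of phonemes from fewer features. The measure says nothing about absolute inventory size, which is in line with the observation that economy does not require the number of phonemes to be kept low.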

Boersma’s approach

A common point in Martinet’s and Passy’s conception of language change is that they both see it as emerging from the interplay of two opposing forces. Passy conjectures the existence of two principles, while Martinet assumes two components that eventually combine in the notion of economy. However, it is evident that a model that relies only on two factors is incapable of explaining continuous change in a satisfactory manner, since it cannot account for the reasons why the equilibrium of the two forces is so unstable. It is clearly inadequate to throw the blame on changing communicative needs or the fickle line between superfluous and necessary elements. Boersma (1990:1) realized that it takes at least three distinct criteria to model an optimization loop, which is the only way to explain why sound change is eternal in the first place. He proposed that inventories are shaped by the principles in (6).

(6) Functional principles in Boersma (1990:1)
a. Articulatory effort is reduced (within words).
b. The perceptual salience within words improves.
c. Perceptual distinctions between words improve.

Whenever (6b) is satisfied, we must assume that the language in question makes use of clearly distinct phonemes, which inevitably leads to (6c), since perceptual distinctions between words depend primarily on the qualities of the phonemes that make up the words. It follows that it is superfluous to posit two converging perceptual principles. Boersma’s references indicate that his understanding of functional principles primarily echoes the views of Passy. Therefore, it is not surprising that he overlooks the importance of principles relating to systems. Boersma (1997b) reconsidered his previous approach and posited the principles in (7).

(7) Functional principles in Boersma (1997b:3)
a. minimization of articulatory effort
b. maximization of perceptual contrast
c. maximization of information flow
d. maximization of recognition
e. minimization of categorization

This revised list does not consider systemic properties either. The only principle that expresses the common concern of the speaker and the listener and thus can relate to the communication channel is (7c), which owes its very existence to a claim attributed to Passy, namely that language is meant to convey information from one person to another as quickly and clearly as possible. However, I am not convinced that the success of communication depends in any way on the speed of information flow. An optimal pace can certainly lead to increased satisfaction with the conversation, yet Boersma (ibid) is certainly on the wrong track when he introduces a principle whose main point is to “put as many bits of information in every second of speech as you can”. If a principle that is claimed to mirror the common interests of the speaker and the listener is given more and more prominence, it follows that both parties involved should be better and better off. This is clearly not the case given that after a certain point the listener is expected to have considerable difficulties processing the information content of the message as the pace of speech increases, even if she is still able to recognize and categorize the utterance on the phonological level. The optimal speed of speech as regards information processing certainly depends on a wide range of pragmatic factors such as relevance, information content etc. (7d) is essentially a novel rendering of (6c), which is arguably superfluous as claimed above. (7e) could in theory relate to systemic requirements given that minimizing the number of categories (or distinctive features), while at the same time maximizing the number of phonemes derived from those categories leads to an optimally manageable system for both the speaker and the listener. Boersma (ibid), however, sees it as a principle of speech perception claiming that “it is easier to divide a perceptual continuum into two categories than it is to divide it into five”. This is undeniably true, yet the lack of principles that could relate to the communication channel makes (7e) somewhat inappropriate.

Teleology in the functionalist tradition

Although in section 1.1 I argued that functional approaches are inseparable from goal-oriented formulations, we find that both Boersma and Martinet make considerable efforts to avoid teleological explanations. Martinet (1955:18) admits that reference to goal-directedness may be justified under certain conditions, but he concludes that the terms finality and teleology are loaded with too divergent overtones to be ever introduced into a scientific discussion. In other words, his dismissal of the concept is due to its multiple senses rather than to inherent problems associated with it. Adamska-Sałaciak (1989:64) points to the apparent contradiction between Martinet’s attempt at eliminating teleology from diachronic linguistics and his constant use of goal-oriented concepts. She argues that “Martinet’s attempt to eliminate teleological overtones from the explanation of apparently purposeful sequences of changes proves unsuccessful [since] any explanation referring to drag- or push-chains remains teleological, despite the use of causal terminology”. A theory that employs functional principles and relies on majority-vote optimization is inherently teleological, cf. Boersma (1990). The author was of course very much aware of this and tried his best to withdraw his theory of sound change from the sphere of teleology in his subsequent articles. Boersma (1997a) acknowledges that goal-oriented processes play a role in the grammar, given that in an OT framework it contains constraints that can be derived from the functional principles of minimizing perceptual confusion and maximizing articulatory ease. When it comes to change, however, he claims on page 2 that “[it] is not teleological (there is no final causation), it is functional in the sense that it is the result of local optimization in the production grammar”.
He then goes on to argue that sound change leads to an improved system, which parallels biological evolution, given that random variation is followed by meaningful selection. Boersma (1997a:11) concludes that sound change “is a mechanical process without goal orientation and teleological only at a higher level of abstraction, just like Darwin’s survival of the fittest”. Boersma (2003:33-4) goes a step further when he reviews his previous position and indicates that goal-directedness in the selection step is an undesirable aspect of the approach, so “[f]inding instead a blind underlying mechanism to account for this step would be more satisfying”. He claims that this can be achieved with the help of mutually unranked constraints in an OT framework, where certain constraints “happen to” fall from the top, while others “happen to” rise from the bottom to the top of the hierarchy. As we indicated above it seems somewhat self-deceptive to deny the validity of teleological explanations in a framework that relies so heavily on implicitly teleological notions such as functionality and optimization. As far as the Darwinian parallel is concerned I agree with Boersma (1997a) that the selection stage is goal-driven, while variation is not. However, Boersma fails to acknowledge that change is brought about by the goal-oriented process of selection and is therefore inherently teleological. Constant random variation is not a change. I also have difficulties following the logic in his claims according to which changes

are non-teleological, while changes in inventories are goal-oriented processes. Despite all this I find that Boersma (1997a) makes a very valid point demonstrating that sound change and biological evolution are governed by identical mechanisms. In light of the above, I of course deem Boersma’s departure from this partially goal-driven account unnecessary. He abandoned an approach where goal-oriented selection could be accounted for with reference to functional principles and majority-vote optimization in favour of an OT-based account where the burden of goals is replaced by constraints that “happen to” undergo various processes. This is of course nothing but a convenient evasion that does not explain anything and poses a set of new questions. Most importantly, what makes certain constraints fall to the bottom or rise to the top and what makes certain patterns more frequent than others? As long as such questions are left unanswered, OT-based non-teleological approaches to sound change remain unsatisfactory.

1.4.5. Critical points

We can conclude this section’s theoretical discussion by addressing some critical remarks that are sometimes directed at the functional approach to sound change and some of its basic concepts. The first of these relates to the notion of optimization, without which the validity of both OT and functional phonology can be undermined.

1.4.5.1. Optimization

Labov (2001:3-14) rejects the main tenet of functionalism, namely that changes are implemented to ameliorate the system and to improve communication. He elaborates on the social effects of language change and the damage it brings about. Labov’s grievances range from school papers downgraded for stylistic reasons to the huge effort needed to master the orthography of a language after various shifts and mergers have taken place. Sound change destroys morphological patterns and gives rise to a large number of exceptions, which do nothing but burden the lexicon. It is the ensuing analogy that improves the state of affairs and not systematic change, which is seen as having destructive effects. Most importantly, linguistic change is obviously an obstacle to communication, since the drifting apart of dialects to give rise to mutually unintelligible languages obviously isolates people and does not bring them closer together. Labov (2001:5) sums up his discussion saying that “it is hard to avoid the conclusion that language, as an instrument of communication, would work best if it did not change at all… we do not profit in any obvious way from the results of systematic change”. Let me demonstrate why this line of reasoning is wrong. The fundamental point Labov seems to have missed is a notion that we can refer to as the locality principle. Both language and society are stratified in many ways. What we find is that any observed phenomenon is best understood when it is first analyzed within its own domain. Its potential effects on neighbouring or further layers are not necessarily relevant. Accordingly, a phonological change may improve the phonological system, but its (negative) consequences as far as e.g. morphology is concerned cannot and should not overrule the observation that the given change was an improvement within its own domain. To put it differently, it follows from the locality principle that optimization is strictly local.
The validity of the locality principle becomes self-evident if we return to Labov’s (2001:4-5) critical remarks for a while. He notes that his own generation calls an ice box an ice box, while younger generations refer to the same thing as a refrigerator, which can lead to “small family arguments… over the proper use of words”. Speakers obviously do not interact with all members of their speech community. In fact, virtually everyone spends most time

communicating within their own age group, which presupposes invisible communication boundaries. In this case the younger generation optimizes language use locally by agreeing on a term that every member of the (age) group understands and uses. The ensuing slight communication obstacle between groups is a global issue and does not challenge the assumption that the change was initiated to improve communication. This applies all the more to dialects that are drifting away as a consequence of (systematic) change. When a certain speech community (i.e. people who actually communicate with each other) implements a change, this is a local decision that can by no means be influenced by other communities, which are out of contact with the group in question. Thus Labov’s reference to “once mutually intelligible dialects of Proto-Indo-European” cannot be taken seriously. In order to prove that the locality principle also relates to individual levels of linguistic structure and thus e.g. morphological or syntactic repercussions cannot defy phonological optimization, we now have to consider the notion of the phoneme for a while. A phoneme is often defined as the smallest distinctive unit of language, which raises the question of what entities the distinction concerns. Given that phonemes are identified with the help of commutation tests and that minimal pairs are words that differ in meaning, it is not uncommon to encounter definitions such as a phoneme is the smallest unit of speech, which may bring about a change in meaning. With regard to the locality principle such definitions are evidently deficient. The reason for this is that semantics is never involved in phonological descriptions. We can make reference to syntactic labels (just like in the case of stress assignment rules in English) but never to meaning.
As opposed to meaning, syntactic labels are local in phonological descriptions, so a proper definition has to reflect this: a phoneme is the smallest unit of speech, which may distinguish between syntactic categories . These different syntactic categories usually have different meanings, but this is completely irrelevant. The necessity to distinguish between syntax and semantics is conspicuous in e.g. Hebrew or , where root forms have a meaning but lack a syntactic label. Similarly, in languages such as Hungarian or Polish, which can transform affirmative utterances into questions solely with the help of prosodic tools, phonology can bring about a syntactic change, but does not affect meaning. I take this as evidence of the validity of the locality principle, which consequently indicates that it makes sense to speak of optimization on the phonological level, even if it entails that the lexicon is burdened by the fact that morphological or syntactic patterns have to be restructured. We are thus justified in using the concept of optimization given that we observe the locality principle.

Some limitations

We have to maintain that we cannot describe any given sound change we encounter in terms of improvements. The reasons for this are twofold. First, our model of functional principles is restricted to internal factors and does not take external causes such as language contact into consideration. In section 1.5.4, I will elaborate more fully on the notion of language contact and argue that influences initiated by such external factors in many cases cannot be described as optimization. Second, as far as historical changes are concerned, we often lack minute details that are necessary to fully understand a given change. The following example may shed some light on the problem. English /u:/ underwent shortening in a number of words including cook, book, good, hood etc. As it turns out, shortening was first restricted to vowels preceding posterior, non-coronal segments, cf. Kiparsky (1995a:5). This condition was later relaxed to include other environments as well. If one is to compare the distribution of /u:/ before the change and the present state of affairs, it is rather difficult to see what sort of improvement has been

implemented. However, if we investigate the two stages separately, the benefits soon become evident. In order to explain the first stage of the change we can invoke the principle of minimization of perceptual confusion. In coda position the labial, alveolar and velar plosives are notoriously difficult to distinguish. The strategies involved to alleviate the problem include V laryngealization (cf. Garellek (2011)), which helps the listener identify a following alveolar plosive. Similarly, the length of /u:/ (a prominently labialized V) can contribute to the identification of a following labial C on condition that /u:/ cannot occur before velar consonants. Thus the first stage of shortening served to enhance perceptual salience. The ensuing spreading of the change is an analogical process that can be ascribed to the principle of economy. Without such details it can be demanding to explain a change in terms of optimization.

1.4.5.2. Eternal optimization

If phonological change results in better and better inventories, then languages should after a certain time arrive at an optimum system where no further change is necessary. This is, however, clearly not the case as all phonological systems keep on changing forever. Still, this is not surprising. Eternal change is, in fact, what we expect given that no phonological system can meet all the opposing requirements of our three functional principles, cf. section 1.4.3. If we want to grant the proposition that inventories never cease to undergo optimization, but at the same time there is no optimal system to reach, we have to suggest that changes are circular, orbiting in eternal loops. The simplest way to model eternal optimization requires three systems with three specifications each, cf. Boersma (2003:31). An example is given in (8) below.

(8) Eternal optimization
A (2, 3, 1) > B (3, 1, 2) > C (1, 2, 3) > A (2, 3, 1) > B etc.
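The loop in (8) can be checked with a small sketch. It assumes that each system is a tuple of three numeric scores (higher being better on that specification) and that a system replaces the current one when it scores higher on at least two of the three. The names `systems`, `is_preferred`, `successors` and `trace` are hypothetical helpers introduced for illustration and are not taken from Boersma (2003).

```python
# Illustrative sketch of majority-vote optimization over the systems in (8).
# Assumption: each tuple holds three scores, higher = better on that specification.
systems = {"A": (2, 3, 1), "B": (3, 1, 2), "C": (1, 2, 3)}


def is_preferred(new, old):
    """Return True if `new` beats `old` on at least two specifications."""
    wins = sum(n > o for n, o in zip(new, old))
    return wins >= 2


def successors(name):
    """Names of all systems that are preferred to the system `name`."""
    return [other for other, scores in systems.items()
            if other != name and is_preferred(scores, systems[name])]


def trace(start, steps):
    """Follow the preferred successor (unique in this example) for `steps` steps."""
    path = [start]
    for _ in range(steps):
        path.append(successors(path[-1])[0])
    return path


if __name__ == "__main__":
    print(" > ".join(trace("A", 4)))  # A > B > C > A > B
```

With these three tuples every system has exactly one preferred successor, so the trace never reaches a resting point. This is only a toy demonstration that pairwise majority preference need not single out an optimum.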

The numbers in bold indicate that the values represented by them are higher than corresponding values of a previous system. Thus B is a better choice than A, given that it is superior to A as far as the first and the third specifications are concerned. Given that a new system is to be preferred to the current one if it is better on at least two of its specifications, it follows from (8) that eternal optimization is theoretically possible. Furthermore, inherent inequalities of the system are enough to maintain an eternally changing loop. It is not necessary to assume contact between different dialects or languages for sound change to take place. Even if the above assumptions seem reasonable, the approach devised so far has to meet the following criticism.

There is a deeper problem that makes all of these explanations less satisfactory. They all depend upon some permanent properties of the organism of the language structure; yet sound change is characteristically sporadic, accelerating at unpredictable rates and terminating at unpredictable times. Labov (2001:22)

First of all, we have to point out that a phonological change does not automatically take place whenever two of the three functional principles are in favour of it. As I claimed before, functional principles merely constitute necessary and not sufficient conditions of a phonological change. Recall that historical linguistics cannot rely on efficient causes but has to content itself with the use of final ones. On the other hand, it is important to see that the amount of variation present in a speech community is absolutely crucial. With increased

variation the ensuing selection process is much more likely to result in a phonological change. To put it differently, language or dialect contact can obviously facilitate phonological change; however, as far as I see it, it is not an absolute precondition. Given that Labov refers to sound change as sporadic, he obviously has changing sound inventories in mind. In section 1.4.3, we underlined the fact that changes in functional load and distribution can restructure phonological relations and thus play a crucial role in preparing the way for changes that affect inventories. Such restructuring is for me an integral part (or the introductory part) of sound change. However, if we think of sound change exclusively in terms of changing inventories as Labov does, we may indeed get the false impression that it is a sporadic process.

1.4.5.3. The Darwinian paradox

Labov (2001) also considers the parallels between biological and linguistic evolution. He acknowledges most similarities that Darwin observed between the two processes, but rejects the proposition that natural selection should play a role in linguistic evolution. Labov (2001:10) argues that “language does not show an evolutionary pattern in the sense of progressive adaptation to communicative needs”. This standpoint is of course consistent with his previous observations that see language change as a destructive force. Labov (2001:15) terms this problem “the Darwinian paradox” and summarizes it as follows: “The evolution of species and the evolution of language are identical in form, although the fundamental mechanism of the former is absent in the latter”. However, in response to the above claims we can revisit the arguments put forward in section 1.2.2, which, I think, render Labov’s paradox rather tenuous. We have seen in connection with the two-stage model that sound change can actually be modeled in exactly the same way as biological evolution through natural selection. Given that the actual phonetic manifestation of a single phoneme can hardly ever be reproduced in exactly the same manner, we can see that speech sounds (just like genes) are characterized by a high degree of variation. This random variation is then filtered by the environment. Those genes that can build successful machines (organisms) capable of adapting to their environments will prosper and predominate. In the same vein, those sounds that result in smooth communication will be used over and over again at the expense of less successful ones. It is necessary to underline that the difference between winning and losing sounds/genes might be so small as to seem negligible, still the vast amount of time at our disposal suffices for the invisible effects of natural selection to accumulate and bring about radical changes.
Had Labov understood how this meaningful (or, if I may, teleological) process of natural selection works, he would have referred to it as “the Darwinian parallel”.

1.4.5.4. Mergers

Given that chain shifts can be characterized by the preservation of contrast, they can be neatly explained with the help of functional principles. However, if the minimization of perceptual confusion is a significant driving force of such processes, how can we then account for mergers, which are possibly even more common than chain shifts? This is the very question on the basis of which Labov (2001:21) dismisses Martinet’s functional approach. As we saw in section 1.4.4, Martinet relies on three overlapping concepts (function, structure and economy), which cannot be clearly related to the three essential factors of communication. Within such a framework it can indeed be troublesome to account for mergers. Martinet (1955:53-4) posits that if mergers occur it may mean that the target of the

change has nowhere to escape and blames such changes on unknown factors that are more powerful than the functional factors trying to maintain the status quo. The main problem with his reasoning, apart from the mysterious language he uses, is that it suggests that the functional factors are only involved in maintaining the status quo. However, if we accept the three functional principles presented in section 1.4.3, we can account for mergers without having to make reference to powerful unknown factors. In the present model a sound change can be carried through if two out of three principles are in favour of it. There are no absolute principles that have to be adhered to all the time. In fact, it is relatively easy to think of a system where a merger (i.e. loss of contrast) is counterbalanced by ease of articulation and economy. As a consequence, loss of contrast does not present a theoretical problem. A final note is in order concerning the various ways distinct units can coalesce. Both mergers and apocopation can result in the loss of contrast and thus give rise to homophones, not all of which, needless to say, are at the risk of being confused. Words that belong to different syntactic categories and thus never get mixed up (e.g. reed and read or meet and meat etc.) can be conveniently excluded, as far as the principle of perceptual confusion is concerned. In the same vein, we can exclude for instance syncretized forms that never occur on their own such as French parle / parles / parlent (I / you / they speak), which always require a personal pronoun. If we restrict the scope of homophony in this way, an interesting pattern may emerge: some mergers hardly ever lead to real homophones, while others do result in confusion. The former group can be illustrated with the following examples.
Contemporary Polish /ʒ/ is denoted by either <rz> or <ż>, which of course reflects an actual historical development, namely that palatalized /r/ came to merge with /ʒ/ by the 18th century, cf. Klemensiewicz (1974:295). Virtually all word pairs that were affected by this merger belong to different syntactic categories (cf. może – morze (maybe – sea), wierzę – wieżę (I believe – tower [Acc])). I am aware of kolarz – kolaż (cyclist – collage); however, this latter word of French origin entered the language when the merger had already taken place. The merger of Hungarian /j/ and /ʎ/ (completed by the turn of the 18th and 19th centuries, cf. Kiss-Pusztai (2005:711)) did not lead to real homophonous pairs either. Even the best candidates have flaws such as folyt – fojt (flowed (3rd person sg) – strangles), where the two belong to different temporal categories or estélye – estéje (his/her soirée – his/her night), which are inflected words whose base forms are not homophones (cf. estély vs. este). The sources I have consulted claim that both the Polish and the Hungarian merger took centuries to gain ground, possibly in the form of lexical diffusion. If this is appropriate, then we may have found the reason why the mergers discussed above did not produce any real homophones. Lexical diffusion is a gradual process that first involves one word and then another. Under such circumstances it is rather unlikely that a meaningful opposition should be dissolved by analogical change. As a consequence, we can assume that changes that do not lend themselves to a description in terms of lexical diffusion (such as those resulting from phonotactic constraints) are more likely to result in real homophonous pairs.

1.5. Functional principles at work

The aim of the current section is to illustrate how the theoretical tools introduced above can be employed to model eternal loops or to account for actual historical developments. I will also show to what extent non-teleological accounts can and should be taken into consideration in those cases where the sound change in question defies the expectations that we owe to functional principles.


1.5.1. Segmental changes

We have claimed above (1.4.3) that a phonological change is best treated as a function of the system within which it takes place. One obvious consequence of this principle is the fact that on the level of segmental phonology, V inventories are more amenable to functional analyses than consonantal systems. The main reason for this is that a C inventory can often be characterized as a loose bundle of subsystems where changes to plosives for example do not appear to have any clear effect on liquids and vice versa. Yet in spite of this it is not obvious whether the subsystems can be analyzed on their own without reference to each other. Furthermore, it is not clear where to draw the line. Is it viable to categorize /l/ and /r/ as a separate subsystem of liquids in English? What would be the status of /s/, the only fricative in PIE? V inventories, on the other hand, usually exhibit unitary systems⁸. In what follows I turn to the vocalic system of Old English in order to see whether the attested historical changes are in line with the functional principles proposed in 1.4.3. Lass (1992:40) assumes that the inventory of short vowels in OE “that can reasonably be seen as input to ” took the form of (9) below⁹. Symbols that are aligned to the right edge of a cell represent a rounded V.

(9) Short vowels in late OE
          [+front]        [-front]
[+high]   /i/      /y/             /u/
[-high]   /e/                      /o/
[+low]    /æ/             /ɑ/

It follows from (9) that it takes four distinctive features to keep these seven vowels apart. However, in an inventory of optimal economy we can distinguish between a large number of phonemes using a small number of features. To be more precise, n features would account for 2^n phonemes. Taking the inherent redundancy of [αhigh] and [-αlow] into consideration the theoretical maximum of the OE system can be established to be 12 phonemes (rather than 16). The fact that (9) features only seven phonemes implies that there is some further (perhaps unequally distributed) redundancy in the system. Given that the functional load of a feature (cf. Martinet (1955:78)) is related to sound change, it follows that non-distinctive (or weakly distinctive) features are much more prone to change than their phonemic (or strongly integrated) counterparts and thus constitute the weakest points of a given system. If the language manages to dispose of those segments that are represented by the weakest feature then we can arrive at a simplified, more economical phonological description. An analysis of (9) in terms of functional load reveals that both [±high] and [±front] serve to distinguish between two pairs of phonemes, while [±round] and [±low] constitute the weakest points of the system with only one phonemic pair each: /i/ - /y/ and /e/ - /æ/ respectively. This makes /y/ and /æ/ the most obvious targets of structural change so any proposed optimization should progress along these lines. (10) and (11) display the effects of hypothesized mergers that serve to eliminate /æ/ and /y/ respectively.

8 The principles governing the formation of V inventories are on the whole better understood. We know for instance that “vowels tend toward a balanced and wide dispersion in the available phonetic space” (Maddieson (1984:136)), while corresponding correlations regarding consonants cannot be established, cf. ibid: 17-19.
9 OE also possessed an inventory of long vowels that was completely symmetrical to (9). In addition, it had two short and two long diphthongs: /æ(:)ɑ/ and /e(:)o/.

(10) OE vowels without [±low]
          [+front]        [-front]
[+high]   /i/      /y/             /u/
[-high]   /e/             /ɑ/      /o/
[+low]    ---             ---

(11) OE vowels without [±round]
          [+front]        [-front]
[+high]   /i/                      /u/
[-high]   /e/                      /o/
[+low]    /æ/             /ɑ/

As regards the number of features (and their functional load) the two systems are almost equally economical. [±front] in (11) stands out with three pairs of phonemes, while all other features distinguish between two. Even as far as minimization of articulatory effort (4a) is concerned, (11) turns out to be a better candidate since it lacks the mark of rounded front vowels. Finally, given that the principle of dispersion is closely related to minimization of perceptual confusion (4b), it follows that even in this respect we have to lean towards (11), which has no gaps at the lower regions. To sum up, we are entitled to expect that (if we only assume internal factors) the V inventory of (9) be transformed into that of (11). Now if we take a look at the actual course of events we can see that the unrounding of /y/ indeed took place at some stage in early Middle English. Nevertheless, it might be added that this development was restricted to the southeast and southeast Midlands, while in many other dialects “front rounded categories remained unchanged into Middle English, and in one form or another persisted into the 15th century” (Lass (1992:54)). The exact dialectal and phonetic details of the changes are of course immaterial here. The point I want to make is that we can indeed predict the direction of future changes (provided no external forces are involved); however, we are not capable of ascribing the change to exact triggers and thus foreseeing when it will occur. The unrounding of /y/ was, as a matter of fact, predated by some other changes (most notably monophthongizations and mergers) that transformed the system to a certain extent, but presumably not in a way that is relevant to the present discussion. These changes will be discussed in some detail in section 1.5.5.
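The feature counts used in this section can be reproduced mechanically. The sketch below assumes that functional load is measured as the number of phoneme pairs that differ in exactly one feature, and that the three vowel heights are encoded as [+high, -low], [-high, -low] and [-high, +low]. The feature strings are read off tables (9) to (11); the helper names (`make`, `functional_load`, `max_inventory`) are hypothetical and serve illustration only.

```python
from itertools import combinations, product


def make(features, rows):
    """Build an inventory: phoneme -> {feature: '+' or '-'}."""
    return {phoneme: dict(zip(features, values)) for phoneme, values in rows.items()}


# Feature values in the order given by the first argument (assumed encoding).
OE_9 = make(["high", "low", "front", "round"],
            {"i": "+-+-", "y": "+-++", "u": "+--+", "e": "--+-",
             "o": "---+", "æ": "-++-", "ɑ": "-+--"})
OE_10 = make(["high", "front", "round"],          # (10): no [±low], /æ/ eliminated
             {"i": "++-", "y": "+++", "u": "+-+", "e": "-+-", "ɑ": "---", "o": "--+"})
OE_11 = make(["high", "low", "front"],            # (11): no [±round], /y/ eliminated
             {"i": "+-+", "u": "+--", "e": "--+", "o": "---", "æ": "-++", "ɑ": "-+-"})


def functional_load(inventory):
    """Count, per feature, the phoneme pairs kept apart by that feature alone."""
    features = list(next(iter(inventory.values())))
    load = {feature: 0 for feature in features}
    for a, b in combinations(inventory, 2):
        differing = [f for f in features if inventory[a][f] != inventory[b][f]]
        if len(differing) == 1:
            load[differing[0]] += 1
    return load


def max_inventory():
    """Number of combinations of four binary features, excluding [+high, +low]."""
    combos = product([True, False], repeat=4)  # (high, low, front, round)
    return sum(1 for high, low, _front, _round in combos if not (high and low))


if __name__ == "__main__":
    print(max_inventory())           # 12
    print(functional_load(OE_9))     # high 2, low 1, front 2, round 1
    print(functional_load(OE_10))    # high 2, front 2, round 2
    print(functional_load(OE_11))    # high 2, low 2, front 3
```

Under these assumptions the counts agree with the figures quoted in the text: [±round] and [±low] each carry a single pair in (9), while [±front] carries three pairs in (11).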

1.5.2. Suprasegmental changes

While the previous section sought to provide evidence for the validity of functional principles, in the present one we will turn our attention to another cornerstone of the functional approach, namely to eternal optimization. In what follows I will try to demonstrate how changes in stress assignment can be modeled in an eternal loop, suggesting that even suprasegmental changes are expected to follow strict patterns. When we speak of “stress languages” then what we primarily have in mind is the culminative property of stress, according to which each accentual unit (or each word uttered in isolation) has one and only one primary stress. Any pattern that deviates from this definition will be claimed to have no (phonological) stress. Hyman (1977) reviewed more than 400 languages with regard to stress placement and found that fixed stress was a cross-linguistically preferred phenomenon. Fixed stress is defined in relation to word edges (initial, final etc) and thus has a demarcative function, i.e. it serves as a boundary signal. What we call free stress, on the other hand, is to a certain extent lexically and to a certain extent morphologically specified (as in Russian). Hyman (1977:39) keeps lexical and morphological / grammatical stress apart on the assumption that the latter is predictable, while the former is not. I find this somewhat unfounded given that the distinction is often blurred by the fact that certain morphological processes are less productive than others, which can give rise to unpredictable stress patterns. Furthermore, we often need a lexical specification first in order

to be able to apply grammatical stress. In other words, we simply have to memorize how certain individual morphemes behave. Although some aspects of this present discussion rely explicitly on Hyman (1977), I am aware of the fact that his study contains a large number of factual inaccuracies. A few examples will suffice. He classifies Romanian as having penultimate stress, while in reality Romanian exhibits a ‘three-syllable window’ (similarly to modern Greek), which means that the stressed σ always falls on one of the final three syllables of the word. Serbo-Croatian is claimed to have initial accent, which is apparently wrong¹⁰. He classifies English as lacking a dominant stress pattern, and Swedish as having initial stress, a controversial position given that we have good reasons to assume that the two languages represent more or less identical accentual types. Furthermore, the study includes certain reconstructed / extinct languages (e.g. Proto-Italic and Latin), while the author omits such well-documented ones as Italian, Macedonian and Slovak. In spite of all this, I am inclined to think that Hyman’s theoretical assumptions are mostly valid and relevant. The factual shortcomings mentioned above shed some light on a problematic aspect of cross-linguistic studies. No matter how knowledgeable the author is, the wide range of languages involved will surely go beyond her expertise. If she wants to include exotic languages with only a handful of speakers it is very likely that she will be unable to find any material against which she can check her source. And even if she is to categorize a relatively well-documented language she will stumble on difficulties. Sometimes some of the sources are simply wrong, sometimes it is not straightforward how to categorize a given pattern. Should Latin (a µ-counting language) be described as having penultimate or antepenultimate stress?
Is it reasonable to characterize a language like Spanish that exhibits a number of accentual minimal pairs (and even triads) as having fixed stress? Data squeezed into statistical charts are always somewhat distorted. Ladefoged’s (2003:2) warning is certainly appropriate: “you should never fully trust anyone else’s description of the sounds of the language you are investigating”.

Cross-linguistic patterns

Hyman (1977:40) makes an important claim proposing that “a language which does not have stress will not directly develop lexical stress… an intermediate stage of grammatical (i.e. morphological or demarcative) stress [will always be involved]”. If Hyman is right, then we may conjecture that a language that lacks the phenomenon of stress also lacks phonemic V length (given that unpredictable lexical stress could arise from a rule that requires the speaker to stress every long V). Fixed stress with regard to word boundaries facilitates perception and is thus preferred to no stress systems. Hyman (1977:44) proposes that “stress actually comes from intonation... intonation becomes grammaticalized as word-stress when the suprasegmental features of pitch, duration and intensity that would have characterized a word in isolation are encoded with the word, and thus come to function in words not in isolation”. It probably holds universally that intonational phrases (IPs) start on a high pitch, which is then easily identified as primary stress when Hyman's grammaticalization takes place. In line with this we can claim that a no stress language can theoretically only acquire initial stress. Languages that arrive at initial stress in the above described way can be argued to lack long vowels and perhaps even σ weight as such. Nevertheless, ensuing changes related to (dynamic) initial stress such as syncope can give rise to σ weight, which does not necessarily correspond to stress on the first σ. In such cases certain segments can undergo lengthening or shortening (Weight-to-Stress) or the language can move the stressed σ to meet σ weight (Stress-to-Weight). Hyman (1977:52) derives contemporary Turkish final stress positing a

10 Cf. Hammond (2005:31) according to whom “[t]he general rule regarding stress [in Serbian] is that it can fall on any syllable except the last, although there are exceptions to the rule”.

rule that moves original initial stress onto the first long V of a word. In the absence of a long V, stress falls onto the σ where the scanning has ended (i.e. word-finally). As a final step, analogical pressure generalizes stress on the last σ. This may only occur in languages with distinctive V quantity as in Common Turkic. (In later stages phonemic V length was gradually lost in most Turkic dialects and is only marginal in standard Turkish, cf. Kabak (2005:355).) For languages that lack phonemic V length the notion of secondary stress is crucial for the description of stress shift, since it allows us to model gradual instead of sudden, abrupt changes. Secondary stress itself is due to the rhythmic organization of syllables into trochaic feet. Languages where most words consist of an even number of syllables can develop a strong secondary stress on the penultimate σ. When the initial stress of disyllabic words is interpreted as a stress on the penultimate, then this establishes a new penultimate stress pattern. This is exactly what happened in Polish, where we can observe a striking temporal coincidence between the loss of initial stress and the loss of V length, cf. Klemensiewicz (1974:100-2). It is worth mentioning that both standard Czech and standard Slovak, languages that still have (traces of) V length, have preserved initial stress into our present days. Fixed penultimate stress can also undergo various changes. Final σ weakening and deletion can result in a system where stress regularly falls on the last σ. Hyman (1977:73) remarks that “this should not be seen as a stress change, per se, but as a segmental change which has brought about a restructuring of the stress system”.
If final deletions do not occur in a consistent manner, then this may lead to the rise (or increase) of lexical stress. Free lexical stress, on the other hand, is then likely to be fixed on the initial σ of the word, cf. the developments from Indo-European (IE) that occurred in Italic, Celtic, Germanic, West Slavic etc (cf. Hulst & al. (1999)). Such a change satisfies the criterion of economy since it reduces the lexical burden of the language, but it also favours perception given that fixed stress has a clear demarcative function. As regards final stress we can observe a situation (as was the case in French) where word-final stress is converted into phrase-final stress, which in certain treatments is even equated (or seen as a precursor) to a system without stress. In (12) I summarize the typological findings discussed so far.

(12) Modelling changes in stress placement

[Diagram: arrows linking the stress systems NO STRESS, INITIAL, STRESS-TO-WEIGHT, FINAL, PENULTIMATE and LEXICAL]

The model does not cover all possible stress patterns e.g. languages with stress on the antepenultimate or on the second σ. However, such systems represent quite exceptional patterns that amount to less than 5 % in Hyman's typological survey. The point is that (12) clearly indicates that attested changes in stress assignment can be juxtaposed to model eternal loops of optimization. We will see below that deviances from the teleological model should generally be treated as having arisen due to language contact. Western Ukrainian dialects can be quoted as

a preliminary example. According to Baerman (1999:125) “the westernmost Carpathian Ukrainian dialects, which protrude like a peninsula into W Slavic speech territory (Polish to the N and Slovak to the S), have fixed penultimate stress, like the surrounding W Slavic dialects”. He then goes on to add that “it is generally agreed that penultimate stress was borrowed by the Ukrainian dialects from W Slavic”, but does not provide any explanation as to why this is so obvious. Nevertheless, in the light of the above we can claim that the penultimate stress of Western Ukrainian cannot have arisen due to language internal developments. Our model under (12) makes it clear that penultimate stress presupposes an earlier system of initial stress. Whenever free lexical stress is immediately transformed into stress on the penultimate (with the omission of the initial phase), we can be certain that this change is due to foreign influence.
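The reasoning applied to Western Ukrainian can be sketched as a reachability check over the changes that the model treats as internally motivated. The following Python fragment is an illustration only: the edge list is one reading of the prose around (12), not a reproduction of the diagram, Stress-to-Weight is folded into the step from initial to final stress, and the names `INTERNAL_CHANGES`, `is_internal` and `internal_path` are hypothetical.

```python
from collections import deque

# Assumed encoding of the internally motivated changes described around (12).
# Each key is a stress system; each value lists the systems it may develop into.
INTERNAL_CHANGES = {
    "no stress": ["initial"],               # intonation grammaticalized as stress
    "initial": ["final", "penultimate"],    # final via Stress-to-Weight (Turkish)
    "penultimate": ["final", "lexical"],    # final deletions, consistent or not
    "lexical": ["initial"],                 # Italic, Celtic, Germanic, West Slavic
    "final": ["no stress"],                 # phrase-final stress (French)
}


def is_internal(source, target):
    """True if the model licenses a direct change from source to target."""
    return target in INTERNAL_CHANGES.get(source, [])


def internal_path(source, target):
    """Shortest chain of licensed changes from source to target, or None."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in INTERNAL_CHANGES.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None
```

Under these assumptions, `is_internal("lexical", "penultimate")` is false, while `internal_path("lexical", "penultimate")` passes through an initial-stress stage. A direct shift from lexical to penultimate stress is therefore the kind of change that the argument above attributes to contact.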

1.5.3. A non-teleological account The functionalist approach presented so far assumes that sound change is a phonetically gradual process that involves teleological optimization of the phonological system. The mechanism of change is said to mirror Darwin’s theory of evolution by natural selection inasmuch as inherent random variation is followed by meaningful selection. Variation is of course perceived by the listener, who (upon having been transformed into a speaker) can carry through optimizing sound changes by the means of selection. Consequently, the speaker is seen as playing a key role in the process. As hinted at above, this position is far from being generally accepted. In fact, many linguists advocate the polar opposite of the functionalist view. One of the most prominent representatives of the non-teleological approach is John Ohala, who has exposed his theory of sound change in a series of related articles. Let us briefly review his main ideas. The assumption that “some of the synchronic variation in speech is similar to sound change” (Ohala (1993:239)) implies that many questions of historical linguistics can be approached with the uniformitarian principle in mind (i.e. the conviction that natural laws and processes are not subject to change). Given that there is an “incredible amount of variation… in pronunciation not only between speakers but also in the speech of a single speaker” (Ohala (1989:176)), it is not surprising that many utterances are misheard by the listener, a state of affairs that may give rise to potential phonological changes. Nevertheless, it is noted that sound changes are not at all as frequent as these distorted speech signals would indicate. The author accounts for this by postulating the existence of corrective rules, with the help of which the listener can “undo” the distortion11 . 
In other words, potential distortions can be factored out since the listener “knows that a slightly affricated release to a stop before a high vowel or glide is to be expected and that is not part of the speaker’s intention” (Ohala (1989:185)) and this ensures the unchanged continuity of underlying forms. However, in some instances the listener fails to apply the relevant corrective rule and takes the heard utterance at face value (equates it with UR). In these cases (which Ohala terms hypo-correction) a mini sound change takes place, which if transferred to other speakers may evolve into a proper sound change. The regressive spread of nasality can serve as an example. A sequence of a V followed by a nasal often surfaces as a strongly nasalized V where the C is almost absorbed by the previous sound as is shown in (13). It can be difficult to tell whether the nasalized V is present underlyingly or is just the product of assimilation. Hypo-correction can thus create new phonemes.

11 This entails that subjective constancy applies in the domain of speech perception as well.

(13) Hypo-correction of nasality (based on Ohala (1993:248))
/VN/ > [ṽN] > [ṽ] → /ṽ/

The inadequate application of corrective rules may of course also take the form of exaggerated use, which is the case if the listener identifies a distortion that does not exist. As an example one may refer to the Italian reflex ( cinque ) of Latin quinque (five), where dissimilation (or hyper-correction) is due to the fact that “the same feature [plays] a distinctive role in two sites in a word [and] the listener [interprets] its presence on one site as a predictable – and therefore detachable – spillover from the other site” (Ohala (1989:189)). According to Ohala’s approach the cognitive efforts involved in speech (perception) aim at preserving and not at changing the pronunciation norm. Eventual changes thus occur when the listener makes mistakes interpreting the intended string. Sound change is an accident with no telos involved, although Ohala (1993:263) acknowledges that “teleology [might be involved] in other aspects of language change, especially its spread”. If sound change was indeed triggered by misunderstandings then it would be reasonable to propose (contrary to our previous assumptions) that it is phonetically abrupt given that practically identical auditory signals can be generated by remarkably different articulatory gestures (e.g. [kw] and [p]).
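The three outcomes that Ohala's account distinguishes (faithful recovery, hypo-correction and hyper-correction) can be made explicit in a toy model of the listener. This is a minimal sketch under strong simplifying assumptions: forms are plain strings (`"V~"` stands for a nasalized V), a corrective rule is a simple lookup from a surface form to the underlying form it is undone to, and the function names are hypothetical rather than part of Ohala's proposal.

```python
def interpret(surface, corrective_rules):
    """Return the underlying form the listener posits and whether a rule applied."""
    if surface in corrective_rules:
        return corrective_rules[surface], True
    return surface, False


def classify(intended, surface, corrective_rules):
    """Label the listener's interpretation relative to the speaker's intended form."""
    posited, corrected = interpret(surface, corrective_rules)
    if posited == intended:
        return "faithful"
    # A mismatch after applying a rule means a distortion was "undone" that did
    # not exist; a mismatch without any rule means the distortion was taken at
    # face value.
    return "hyper-correction" if corrected else "hypo-correction"


# Listener who knows that a nasalized V may stand for a V + nasal sequence.
undo_nasalization = {"V~": "VN"}
```

With these definitions, `classify("VN", "V~", undo_nasalization)` gives a faithful reading, `classify("VN", "V~", {})` illustrates the hypo-correction in (13), and `classify("V~", "V~", undo_nasalization)` illustrates a listener who removes nasality that the speaker actually intended.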

Shortcomings The functionalist and the Ohalian accounts of sound change are essentially complete opposites. So on what basis can we decide which one to choose? The distinction between the two models can basically be attributed to the divide between phonetics and phonology. It is obvious that most of the criticism that can be directed at the respective theories will focus on those aspects of the linguistic reality that the given model ignores. Thus functionalists can rightly be reproached for not paying attention to phonetic variation, which is arguably a prerequisite for any phonological change. In a similar vein, the phonetic approach is preoccupied with analyses of the physical reality at the expense of linguistic structure. If we fail to treat phonological changes as functions of the system in which they occur, it becomes increasingly difficult to account for trends that seem to endure over long periods of time. Earlier forms of a language are clearly not available for speakers to process, cf. the Synchronic Base Hypothesis in Hutton (1996). Nevertheless, a change that spans over a handful of generations definitely requires some degree of collaboration if it is to proceed in the same direction for centuries. Labov (2007:51) proposes that it has to be taken into consideration that there are many variants in use within a community. He argues that “if children align the variants heard in the community with the vector of age : that is, they grasp the relationship: the younger the speaker, the more advanced the change” then we can dissolve this seeming contradiction. However, it does not seem to be a valid assertion since it implies that systematic shifts are somehow related to life expectancy within a given group 12 , which is definitely not the case. Closely related languages constitute another stumbling block for the phonetic approach. 
When two dialects are no longer in contact with each other, they are expected to display individual innovations, although the same feature may develop separately in both by coincidence, cf. Saussure (1997:222). However, the number of coincidences between related languages is astonishingly high, which compelled Ohala to adopt the following position.

I will focus only on sound changes that occur in similar form in languages distant from each other in time, geography, family membership and typology. This tends to insure that they will be sound changes determined by language universal factors (phonetic or cognitive) and not by language- or culture-specific factors. Ohala (1992:2)

12 The longer we live, the more diachronic information there is to be accessed.

It would be interesting to know what he means by language-specific factors and why they have to be dismissed. It is also difficult to see why he thinks that time and geography can be looked upon as relevant factors when it comes to sound change. The above references to such fuzzy concepts indicate that phonetic approaches are incapable of dealing with the problem of coincidences. Functionalists, on the other hand, can invoke the role of the actual phonological system to find an explanation. Closely related languages (like Swedish and Norwegian) have more or less identical inventories with a very similar phonology. Now given that the course of optimization is defined by the characteristics of the given phonological system we cannot help expecting a certain amount of shared innovations even if the two dialects or languages in question have already drifted apart. Let us take the example of i-mutation in Germanic. The fact that this change was carried out independently in all (with the exception of Gothic) should not be described as coincidental and it does not depend on ties of kinship either. Any language that displays some necessary conditions 13 is likely to undergo a similar change sooner or later. Quite similarly, the raising of tense vowels in English, Swedish and German should be attributed to phonological similarities and not to genetics.

Reconciling the two approaches Despite their differences the two models make identical predictions in a number of cases, which is not surprising given that both acknowledge how important the physiological restrictions of the vocal tract are in shaping phoneme inventories (Ohala (1983), Boersma (1998)). The important thing to acknowledge is that phonetic and phonological explanations do by all means complement each other and should not be treated as competing accounts. Recall that Ohala assumes that sound change (a misunderstanding) takes place when the listener is unable to correctly decode the intended pronunciation of the speaker. Yet why would speakers of the same community (with a common grammar and common underlying forms) have problems discerning each other’s utterances 14 ? The supposition that people have problems identifying the intended message is clearly better applicable to languages or dialects in contact. On the other hand, the functionalist approach, which predominantly takes internal factors into consideration, should be seen as a theory describing developments within a single speech community. Consequently, we can assume that those phonological innovations that cannot be characterized in terms of optimization can be identified as having arisen as the result of language contact provided we can identify a donor language with an appropriate phonology and that we can establish the historical context of the given change in a satisfactory manner. Whenever a phonological change is directly followed by its exact reverse we can by no means assume that both changes serve to ameliorate the system. Such instances (when a language returns to a previous state) are clearly due to foreign influence. In some cases (as hinted at in 1.4.5.1) the historical perspective and the concomitant feature of paucity of data blur optimization as such. 
Such a case can present itself when the change in question is itself the combination of several stages (all meaningful in their own right). The final outcome itself quite often does not provide enough clues for us to ascertain what really has been going on. It is also essential to remind ourselves that the two models in many cases come up with identical predictions. This implies that a phonological change that fulfils the requirements of optimization may in fact have been initiated by foreign factors as well pointing in the same

direction. This of course compels us to allow for Bloomfield’s pessimism (cf. section 1.1) and to admit that the ultimate causes of sound change are indeed often unknown.

13 These probably include a symmetric V system with fixed, non-final stress.
14 All the more so since the speech signal usually abounds in redundant elements.

1.5.4. Language contact Although the exact mechanism of linguistic transfer from one language to another lies beyond the scope of our concern, I still find it advisable to introduce certain fundamental theoretical concepts that can contribute to a deeper understanding of the problem. In what follows I will rely heavily on Weinreich (1953:1-28). Contact between two languages implies that they are used alternately by a number of bilingual individuals. The use of the term bilingualism does not presume that the bilingual individual is equally proficient in both languages. The minimum degree of proficiency is established as a certain linguistic level that allows the speaker to maintain a conversation in a foreign language. Interference phenomena occur when the language use of bilinguals displays patterns that deviate from the norm in a way that is attributable to their familiarity with other languages. Such instances are not restricted to the addition of new elements, but they also involve the elimination of some earlier items and the reorganization of the remaining features into a new system. The mechanism of interference associated with contact between mutually intelligible dialects and contact between distinct languages is essentially the same, which means that language contact can be used as an umbrella term to cover both. A comparative analysis of two languages in contact provides us with a list of potential forms of interference, yet not all conceivable cases of interference actually manifest themselves. The amount of interference depends greatly on a large array of non-structural factors such as the size of the bilingual group, attitudes towards each language, the speakers’ relative proficiency etc. Consequently, there is much to be gained if we adopt an interdisciplinary approach to explore the field. When we want to compile a list of potential forms of interference, our task is to identify different expression and content units of the two languages in question. 
Nevertheless, it has to be borne in mind that such identification can be carried out at different levels. The incidental identity of [t] in Hungarian hat (six) and Swedish hat (hatred) should not lead us to think that Hungarian /t/ can be equated to Swedish /t/. Similar warnings are to be issued for the student of syntax and semantics as well. As far as phonic interference is concerned, a distinction has to be made between primary and secondary systems . The former is usually the speaker’s mother-tongue, however, it has to be pointed out that the primary language is not always the native one. Weinreich (1953:14ff) discusses an actual case of language contact in Switzerland comparing Romansh and Schwyzertütsch and establishes what sort of interference is to be expected when the respective languages are taken as primary or secondary systems. On the basis of these analyses, he arrives at the conclusion that four basic types of interference can be distinguished. These are summarized in (14) below.

(14) Four basic types of interference (Weinreich (1953:18f))
a. Under-differentiation of phonemes (when two sounds of the secondary system not distinguished in the primary system are confused)
b. Over-differentiation of phonemes (when phonemic distinctions from the primary system are imposed on the sounds of the secondary system)
c. Reinterpretation of distinctions (when phonemes of the secondary system are distinguished by features that are only relevant in the primary system)
d. Phone substitution (applies to phonemes that are identically defined in two languages but whose normal pronunciation differs)


Weinreich (1953:21) maintains that (14b) is immaterial to the listener and (14d) is “least detrimental to intelligibility”, while the remaining two types of interference distort the message and thus can lead to misunderstanding. Besides those listed in (14) other types of interference can also occur, most notably the possibility of hypercorrectness, which Weinreich (1953:19) labels as “too complicated to be identified with a single one of the four basic types”. The potential sites of interference are not all equally difficult to avoid. In a case of language contact between English and Swedish, where the former is the secondary system, we can see that it exhibits a number of phonemes that are unknown to the primary system (i.e. to Swedish: /z/, /tʃ/, /dʒ/ etc.). It can be argued that those sounds that represent “empty cases” in the primary system are somewhat easier to master than the rest. Consequently, a Swedish native speaker is expected to have less trouble pronouncing /z/ (a voiced, alveolar fricative), given that she is familiar with all of its specifications. The faithful rendering of /dʒ/ (a voiced, palatal affricate), on the other hand, is expected to be more difficult given that affricates are an unknown category in standard Swedish. It must be easier to master a new combination of known categories than to introduce a completely new category into the system. With these preliminaries in mind, let us now return to the discussion of OE (1.5.1).

1.5.5. The identification of external factors As we mentioned in section 1.5.1, the unrounding of OE /y/ was preceded by certain other changes, which were claimed to have had no substantial effect on the proposed optimization expressed as (9) > (11). The first change to treat here is the appearance of short and long /ø/. This change does not seem to comply with our previous assumptions about how the system in (9) ought to be optimized. Moreover, in a historical perspective the development looks even more suspicious. The first rounded, front vowels in English (/y(:)/ and /ø(:)/) are posited to have developed as the result of i-mutation. Lass (1994:63) assumes that these new vowels were “categorically established and phonemicized only around 800 or so”. /ø(:)/ turned out to be rather short-lived and generally merged with /e(:)/ by the end of the ninth century (ibid: 66). Its reappearance can be dated to around 1000 (Lass (1992:42)). According to the functionalist approach, language as such is not expected to return to an earlier state by reintroducing a phoneme that has just been disposed of. In other words, we have good reasons to assume that the reintroduction of /ø/ can be blamed on external factors. Let us examine whether contact with Old Danish can be held responsible for the unmotivated appearance of this marked segment. Between the 9 th and the 11 th centuries OE came into uncomfortably close contact with the language of the Vikings, which we are going to term Old Danish for the sake of simplicity. Loyn (1977:64) argues that the latter part of the Viking age (following 954) did not exhibit “the characteristics of a migration”, which means that the influx of Danish immigrants can be hypothesized to have reached its peak at the time after King Alfred’s death (899). The testimony of place-names indicates (cf. Geipel (1971:113)) that the most densely populated Danish areas that came to be known as the Danelaw were the East Midlands and the region of Yorkshire. 
All these assumptions and the common claim that “standard Modern English is in large part a descendant of Mercian speech” (Pyles (1993:102)) entitle us to pick 10 th century Mercian as a starting point 15 . Needless to say, no dialectal partition can be established as far as the Danes are concerned.

15 Literary OE (West Saxon) and Kentish both had somewhat different vocalic systems.

When comparing the vocalic systems of the two languages we rely on Lass (1992:40) and Skautrup (1944:124). The OE inventory (9) is repeated here for the sake of convenience (15), while the Danish system is presented in (16).

(15) OE
            [+front]     [-front]
[+high]     /i/ /y/      /u/
[-high]     /e/          /o/
[+low]      /æ/          /ɑ/

(16) OD
            [+front]     [-front]
[+high]     /i/ /y/      /u/
[-high]     /e/ /ø/      /o/
[+low]      /æ/          /ɑ/ /ɒ/

Both systems displayed long vowels as well, whose inventories were completely symmetrical to those in (15) and (16). However, as far as their diphthongs were concerned the two languages were completely different. OE /e(:)o/ and /æ(:) ɑ/ were opposed to OD /iu/, /io/ and /ia/ 16 , which means that the former were subject to height harmony and distinctions in terms of phonemic length, while the latter were not. In addition, OD had only rising diphthongs, while OE had only falling ones,17 cf. Hogg (1992:86). It follows that the diphthongs were the only elements of the OE vocalic system that the Danes were unable to reproduce.

(17) Monophthongization of OE diphthongs and subsequent unrounding
a. /e(:)o/ > /ø(:)/ > /e(:)/
b. /æ(:)ɑ/ > /æ(:)/

In (17) above, I have summarized how the OE diphthongs came to merge with the non-high front vowels of (15). Lass (1992:42f) maintains that the process “involved two different types of assimilation between morae… [in (17b)] the second mora assimilated completely to the first”, while assimilation in (17a) was bidirectional (i.e. the outcome was defined by the rounding of the second µ and the frontness of the first). Yet, why would an otherwise completely uniform pattern involve two different processes? Given that (17b) can also be derived with the mechanism proposed for (17a), it would be much more elegant to suggest that the two reflexes were brought about by a single process. The resulting rounded vowels /ø(:)/ were, as a next step (sometime during the course of the 12th century, cf. ibid:45), merged with /e(:)/, which leaves us with the question of why the language resorted to an intermediate stage instead of adopting Lass’s explanation for (17b), which would generate the more economical /e(:)o/ > /e(:)/. I would like to suggest that what we have here is a “mini sound change” of the Ohala type based on misunderstandings, whose mechanism is hypothesized to have been as follows. A falling diphthong (like OE /eo/) is a sound whose first element is more prominent than the second. It follows that the second element is unlikely to fully reach its target, which in this case entails that the second element [o] is expected to be pronounced with an advanced tongue root. The Danes took this surface form at face value (hypo-correction) and matched it onto their own phoneme inventory. Given that “all aspects of non-native phonological

structure, including segments, suprasegments, and syllable phonotactics, are systematically distorted during speech perception” (Peperkamp (2005:346)) the original falling diphthong was identified by its second element and thus came to be equated with /ø(:)/. This instance of hypo-correction is due to one of the four basic types of interference, namely reinterpretation of distinctions (14c). Recall that it entailed that a phoneme of the secondary system (i.e. OE) is distinguished by a feature that is marginal there but relevant in the primary system (i.e. OD). In our case this feature is [±round], which serves to distinguish between three pairs of phonemes in OD, thus being on par with [±high], the strongest feature of the system, cf. (16). To sum up, the Danes at this stage systematically reproduced OE /eo/ as /ø/. These two sounds can be claimed to have had an identical definition (vocalic, non-high, front and round), which makes them susceptible to phonic substitution according to (14d). If we reverse the polarity of interference making OE the primary system, we can even claim that /ø/ represents an empty case in the system, which makes it relatively easy to master given that all of its specifications were familiar to speakers of OE, cf. 1.5.4.
Another point in favour of this account is the relative chronology of the re-disappearance of /ø/. The sound is claimed to have unrounded by the 12th century, which is the approximate date given by Ekwall (1930:55) and Townend (2002:204) for the death of the ON/OD language in England. Once ON was extinct in England, /ø/ received no reinforcement that could have protected it against systemic pressure, so it finally merged with /e/ 18. When a sound enters a language by means of hypo-correction and fails to integrate into the phonological system, it is likely to be disposed of as soon as the external factors responsible for its appearance show signs of abating. Thus the temporary reinforcement of [round] and the concomitant reintroduction of /ø/ into OE phonology can be identified as a manifestation of Scandinavian influence.

16 The reflexes of Common Germanic /au/ (> /ou/ > /o:/) and /ai/ (> /ei/ > /e:/) were monophthongized only in Eastern Scandinavian. Haugen (1982:37) claims that the change had taken place in Denmark by the beginning of the 10th century. Hence the majority of Scandinavians in the Danelaw are assumed to have lacked such falling diphthongs.
17 The modern English reflexes of ceosan (choose), leosan (lose) etc. are often explained with “a presumed shift of syllabicity to the /o/ mora, which then becomes long /o:/ and participates in the later history of OE /o:/” (Lass (1992:43)). It would be tempting to attribute such sporadic occurrences of rising diphthongs to Danish influence.
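The claim that [±round] distinguishes three pairs of phonemes in OD, and is thus on par with [±high], can be checked mechanically against the inventories in (15) and (16). The sketch below is an illustration under stated assumptions: the values for [front], [high] and [low] are read off the tables, the values for [round] are not shown there and are supplied from the conventional values of the symbols, and a pair counts as distinguished by a feature only if it differs in that feature alone. The names `FEATURES`, `OD`, `OE` and `pairs_distinguished_by` are hypothetical.

```python
from itertools import combinations

FEATURES = ("front", "high", "low", "round")

# Short vowels of OD as in (16); 1 = plus, 0 = minus for each feature above.
# The [round] column is an assumption, since the tables do not state it.
OD = {
    "i": (1, 1, 0, 0), "y": (1, 1, 0, 1), "u": (0, 1, 0, 1),
    "e": (1, 0, 0, 0), "ø": (1, 0, 0, 1), "o": (0, 0, 0, 1),
    "æ": (1, 0, 1, 0), "ɑ": (0, 0, 1, 0), "ɒ": (0, 0, 1, 1),
}

# The OE inventory in (15) is the OD one without the two extra phonemes.
OE = {p: f for p, f in OD.items() if p not in ("ø", "ɒ")}


def pairs_distinguished_by(inventory, feature):
    """Pairs of phonemes that differ in the given feature and in nothing else."""
    idx = FEATURES.index(feature)
    result = []
    for (p, fp), (q, fq) in combinations(inventory.items(), 2):
        differing = [i for i in range(len(FEATURES)) if fp[i] != fq[i]]
        if differing == [idx]:
            result.append((p, q))
    return result
```

On these assumptions, `pairs_distinguished_by(OD, "round")` and `pairs_distinguished_by(OD, "high")` each return three pairs, whereas `pairs_distinguished_by(OE, "round")` returns only the pair /i/ and /y/, which is consistent with describing [±round] as marginal in OE.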

The merger of /ɑ/ and /æ/ Another change that predated the unrounding of /y/ was the merger of /ɑ/ and /æ/. Although (10) above entertained the possibility of such an optimization, it also assumed that it would entail the elimination of [±low]. However, the two sounds did not merge in /ɑ/ as expected. Lass (1992:44f) establishes the common outcome as /a/. This change is much more difficult to analyse since the phonetic details are rather uncertain. The merger could occur in two ways, illustrated in (18) below. We can either assume retraction of /æ/ to /ɑ/ with subsequent fronting of the merged result or the simultaneous lowering of /æ/ and fronting of /ɑ/ until they converge in /a/. Lass (ibid) opts for the latter alternative (18b) on grounds that the former is less economical. I have the same preference, although for different reasons.

(18) Two accounts of the merger of /ɑ/ and /æ/
a. /æ/ > /ɑ/ followed by /ɑ/ > /a/
b. /ɑ/ > /a/, /æ/ > /a/

I am not convinced that a sharp distinction can be made between the phonetic values of [a] and [æ] in a system with only three heights. Moreover, a strict insistence on the front values [i], [e] and [a] also seems to violate the principle of dispersion. This said, (18a) can be considered a see-saw pattern with an unnecessary number of steps. However, the real argument against it is to be sought in the integration and elimination of binary features. Recall that the merger in (18) is assumed to have taken place simultaneously with the reintroduction of /ø/ into OE phonology. The proposed merger of /æ/ > / ɑ/ would not only have eliminated [±low], it would also have promoted [ ±round] to be one of the most salient features of the

system, cf. (19) below. Had it been the case, it is safe to say that neither /ɑ/ > /a/ nor the unrounding of front vowels could have taken place as the result of internal development. As a matter of fact, I am not aware of any external factor to which these actual processes may be attributed. The arguments in favour of (18b) are as follows. Given that the distribution of /ɑ/ was restricted to pre-nasal environments (which had blocked Anglo-Frisian brightening) or to those instances that are generally referred to as restoration of /ɑ/ (a sort of umlaut triggered by a following back V in various inflected forms) and given that i-mutation further increased the incidence of /æ/, it can be purported that the token frequency and thus the functional load of /ɑ/ was significantly lower than that of /æ/ 19. If this is correct and if we are indeed justified in expecting functionally feebler entities to be more prone to change, the merger in (18) must have started with the fronting of /ɑ/.

18 The temporary reappearance of /ø/ did not interfere with the proposed optimization of (9) in any way, given that the inventory of short vowels eventually returned to its “starting point”.

(19) The first stage of (18a)
            [+front]     [-front]
[+high]     /i/ /y/      /u/
[-high]     /e/ /ø/      /ɑ/ /o/
[+low]      ------

Even if we choose to adopt (18b) as the historically accurate outline, we still have to explain what might have led to this unexpected merger of low vowels. Upon comparing the lower regions of (15) and (16), we can see that the OD system has an extra phoneme (/ ɒ/, resulting from u-mutation) that the OE inventory (/æ/, / ɑ/) lacks. Although the two languages had mostly identical phonemes, they did not necessarily overlap in distribution. OE /æ/ could, for instance, be mapped onto all three low vowels of OD, while OE /ɑ/ correlated with the two non-front, low vowels in (16). OD /æ/ never corresponded to OE / ɑ/. Some examples of the correspondences are given in (20) below.

(20) Correspondences of low vowels
a. OE /æ/
   i. fæstan (fasten) – OD /æ/
   ii. dæg (day), bearn 20 (child) – OD /ɑ/
   iii. bearn (children) – OD /ɒ/
b. OE /ɑ/
   i. mann 21 (man) – OD /ɑ/
   ii. dagum (day (dative plural)) – OD /ɒ/
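The correspondences in (20) can be inverted to show what a speaker of OD would encounter for each of their own low vowels. The fragment below is a minimal sketch: the list `CORRESPONDENCES` simply transcribes (20), with the two senses of bearn labelled separately for illustration, and the function name `oe_renderings` is hypothetical.

```python
from collections import defaultdict

# (OE vowel, OD vowel, example) triples transcribed from (20).
CORRESPONDENCES = [
    ("æ", "æ", "fæstan"),
    ("æ", "ɑ", "dæg"),
    ("æ", "ɑ", "bearn (sg.)"),
    ("æ", "ɒ", "bearn (pl.)"),
    ("ɑ", "ɑ", "mann"),
    ("ɑ", "ɒ", "dagum"),
]


def oe_renderings(correspondences):
    """Map each OD low vowel to the set of OE vowels that correspond to it."""
    table = defaultdict(set)
    for oe, od, _example in correspondences:
        table[od].add(oe)
    return dict(table)
```

In the resulting table OE /æ/ appears among the renderings of all three OD low vowels, while OE /ɑ/ appears for only two of them. The sketch says nothing about token frequency, which has to be argued for separately.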

Given that (20b) was of restricted distribution, when speakers of OD were exposed to OE they could not help noticing that most of their low vowels were rendered as /æ/ in the secondary

system. This state of affairs led to the hyper-correct generalization that /æ/ is the only low V in OE. This erroneous pronunciation led to increased paradigm uniformity (cf. (20a.ii) and (20b.ii)) so it is no surprise that it eventually found its way into OE phonology manifesting itself as the merger in (18b). In conclusion, we can see that the two unexpected changes that predated and contradicted our proposed optimization of (9) can both be attributed to external forces, which in the present discussion took the shape of language contact between speakers of OE and OD. The validity of the assumptions was enhanced by a well-documented historical background. All this suggests that it might turn out to be a fruitful undertaking to reanalyze certain established facts of historical linguistics, perhaps to find that the role of language contact is in need of revision.

19 The functional load of the opposition (/ɑ/ - /æ/) is of marginal importance in OE given that the two sounds used to be in complementary distribution until the back vowels conditioning retraction were lost. However, [front], being the strongly integrated feature it is, should in theory prevent such mergers. Recall that it is not the functional load of individual oppositions that counts, but the functional load of the feature that the given opposition is integrated in.
20 Lass (1992:43f) assumes that the merger of low vowels followed (or partly overlapped with) the monophthongizations discussed above. Accordingly, bearn is posited to feature a monophthong.
21 I am aware that pre-nasal /ɑ/ was sometimes indicated in OE manuscripts with <o>, cf. monn, which implies that the sound underwent rounding (and perhaps raising). Yet Lass (1994:42) suggests that this need not have been the case.


2. The prosodic features of the modern languages

The present chapter is meant to facilitate the historical survey in chapter 3 by providing a synchronic analysis of the phonemic features of Scandinavian prosody (quantity, stress and tone/stød). Large parts of the diachronic enterprise would be (in want of sufficient data) doomed to failure without a well-founded, synchronic understanding of these matters. The substantial prosodic variation experienced in Scandinavia can allow for comparative analyses in a historical spirit. Nevertheless, for considerations of space, the discussion to follow will be mostly limited to the standard languages of the three Scandinavian countries as described in Kristoffersen (2000), Basbøll (2005) and Riad (2013) with some occasional dialectal asides. Understandably, the discussion of tonal typologies in chapter 3 will be a marked exception with a detailed treatment of regional varieties.

2.1. A note on classification

The Germanic languages of present-day Scandinavia show signs of mutual intelligibility and thus can be claimed to form a dialect continuum. There are, however, apparent gaps between a large number of dialects, which often make communication laborious, though seldom impossible. Haugen refers to this limited form of linguistic contact in the following way.

Danes, Norwegians and Swedes expect to be understood by fellow Scandinavians when they use their own languages. At times, however, they are disappointed in their expectations; and the region as a whole offers many examples of what we may call semicommunication , the trickle of messages through a rather high level of code noise. Haugen (1966:216)

Several attempts have been made to quantify the relative success of inter-Scandinavian communication. Haugen (1966:225f) found that “the Danes are in the most difficult Nordic communication situation” and established the well-known fact that “Norwegian occupied the favoured position of being most easily understood by both Swedes and Danes”. This is hardly surprising given that Norwegian is often described as a mixture of Danish vocabulary and Swedish pronunciation. The exact figures in Haugen’s survey are somewhat biased since it was based on self-evaluation and university-trained informants were strongly overrepresented. Ohlsson (1978:30f) reports the results of a later study commissioned by the Nordic Council in 1973. The indicators of intelligibility were significantly higher than in Haugen (1966), which illuminates the importance of how the question is phrased and how many alternatives the informant can choose between. An important aspect of the survey was that it claimed that intelligibility between Swedish and Danish was non-reciprocal, a finding that was echoed in Maurud (1976) as well. The reasons why speakers of dialect A may be better at understanding dialect B (than vice versa) can be blamed on political, cultural, attitudinal and linguistic factors, cf. Olmstead (1954). As far as the Scandinavian languages are concerned, it can be maintained that “the phonological representations of the Danish domain are to a certain extent deeper” than the corresponding representations of Norwegian and Swedish (Ács (1990:197)). This entails that more allophonic rules are involved and that the allophones of a given phoneme may display extremely different phonetic forms (Ács (1996:98f)). Ács (2012:18) even suggests that Danish surface forms are so sharply opposed to the other two languages that it now forms a separate branch of the . 
To put it differently, we can say that the historical subdivision of the Scandinavian languages into an Eastern and a Western branch has for some time been overridden by reductive changes leading to a partition along the Northern-Southern dimension. This was in a way reflected in Delsing & Lundin Åkesson (2005), who found that intelligibility between the three Scandinavian languages is on the decline mainly in those dimensions where Danish is involved 22. In light of the above, the considerable similarity between Norwegian and Swedish entitles us to treat them jointly in the ensuing sections in order to avoid unnecessary repetition. Guided by the nature of my own expertise, I will put considerably more emphasis on Swedish, but I will aim at a comparative description and include Norwegian in those cases when some differences require us to do so. Danish will be surveyed separately in section 2.3.

2.2. Swedish and Norwegian

None of the prosodic features to be discussed below can be treated fully (or even partially) without touching upon the remaining two features as well. This mutual interdependence, which is a well-known characteristic of both Sw&No, was among others pointed out in Haugen (1967:188f), who demonstrated that all three features were constrained to stressed syllables in a way that tone is a function of stress, while stress is a function of quantity. The following subsections will thus complement each other in substantial ways and will necessarily also be somewhat overlapping.

2.2.1. Underlying representations

The bimoraic condition imposed on stressed syllables implies that stress and length go hand in hand in the two languages 23. A stressed σ can be implemented by means of a long V or a short V followed by a moraic (geminate) C. This of course raises the question whether we should be thinking in terms of stress-to-weight or weight-to-stress. Which of these features should be taken as a starting point and be included in the underlying representation 24? Kristoffersen (2000:145) takes the stance that “both vowel lengthening and gemination are dependent on stress and… should accordingly be absent from underlying representations”. Stress, on the other hand, is not stored in the lexicon either, since Kristoffersen argues that the assignment of primary stress can be accounted for if we assume that closed syllables attract stress in a framework where the relevant foot is computed on moras (instead of syllables). This entails that syllabification 25 precedes stress assignment, which in turn precedes segmental lengthening. This stress assignment model works with an accuracy of about 80% and thus has to resort to extraprosodicity and lexical marking 26. The final step is to assign length to an appropriate segment of the stressed σ. Given that there are no general patterns that would enable us to predict the distribution of V and C length respectively, it is assumed that quantity “can be seen as a property that must be assigned to each lexical item as a kind of class-membership index” (ibid: 157). At this point Kristoffersen seems to contradict himself. He assumes that length is absent underlyingly, yet at the same time lexically specified. I fail to understand how a lexically specified feature can be absent underlyingly. In light of the present discussion it seems necessary to assume some sort of segmental length in underlying forms, which can serve as a starting point for the derivation of stress patterns. Stress assignment cannot precede the assignment of quantity. Finally, it might be added that irrespective of the validity of the phonological considerations argued for above, the proposal that stress is in some way more fundamental than quantity feels intuitively wrong. The main reason for this may be sought in the way the individual features of quantity, stress and tone contribute to the success of communication. Nothing happens if a foreigner fails to master the use of tone. In the worst case scenario he might cause some amusement by talking about the holy duck instead of the Holy Spirit (Sw. den heliga ¹anden / ²anden); however, intelligibility is not at risk at all. A slight misunderstanding may occur if he assigns stress to the wrong σ and says that we use new technicians instead of techniques (Sw. vi använder nya tek niker / tekni ker ). Such situations are easily sorted out with the help of contextual and other clues. Nevertheless, if he is unable to distinguish between long and short segments, this may disable a conversation given that he might end up saying things like du måste bada naken (you have to bathe naked) instead of du måste badda nacken (you have to dab/moisten the back of your neck). It is clear that the success of communication hinges primarily on quantitative distinctions, which allows us to assume that the feature of length must be present in underlying representations.

22 It has to be added that the authors attribute the poor performance of informants (compared to Maurud’s earlier survey) in large part to non-linguistic factors, such as geopolitical changes; nevertheless, structural aspects certainly play an important role too.
23 Some dialects (including Älvdalen in Sweden and Nord-Gudbrandsdalen in Norway), which have escaped the effects of the Scandinavian quantity shift of the 14th century, still distinguish between light and heavy root syllables, cf. Riad (1992:171).
24 It has also been proposed that stress can be derived from tonal movements, cf. Endresen (1977).
25 Kristoffersen (2000:153) claims that the underlying forms of canasta (id) and balanse (balance) both have a heavy penult, which suggests that he ignores the onset maximization principle at this stage of the derivation.
26 Given that there is no structured way of finding out which lexical items are extraprosodic and which are not, from the language learner’s point of view, lexical marking and extraprosodicity have the same implication. It seems misleading to keep them apart.

2.2.2. Quantity

Having established the necessity of including quantity in underlying forms we now have to address the problem of complementarity. Given that V length can be derived from C length and vice versa, it would be superfluous to indicate both in the lexicon. So which one should we choose? As there is no consensus on the matter in the phonological literature, I will review the most important arguments, pro and con.

Perception

When we are to devise appropriate underlying forms, our foremost concern is to arrive at a solution that results in an optimally economical description and corresponds to the facts of perception. As far as the latter condition is concerned, distinctive vowel quantity stands out as an obvious option. A long V is on average 50% longer than a short V, while the corresponding figure for consonants is only 25%, cf. Elert (1964). This entails that vocalic differences are easier to perceive and are thus more fundamental for distinguishing between V:C and VC: syllables. Riad (2013:164) points out that this argument “isolates the quantity distinction from the context in which it occurs”, i.e. complementarity, and suggests that “the ratio of vowel to consonant could be what is relevant for perception” (ibid). The ratio can be calculated as in (21), and the result suggests that it is indeed much more relevant than Elert’s figures for short and long segments taken separately. Nevertheless, this observation does not disprove the assumption that V quantity is more basic, since the more substantial durational difference between V: and V still contributes more to the V/C ratio than the difference between C: and C.


(21) The durational ratio of V:C and VC: rhymes

$$\frac{1.5V}{C} = 1.875 \cdot \frac{V}{1.25C}$$
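The arithmetic behind (21) can be checked with a short sketch. It assumes, following the figures cited from Elert (1964) above, that a long V is 1.5 times and a long C 1.25 times as long as its short counterpart; the names used are illustrative only and do not come from the literature.

```python
# Durations relative to the corresponding short segment
# (assumed from the 50% and 25% figures cited above).
LONG_V = 1.5    # long vowel = 1.5 x short vowel
LONG_C = 1.25   # long consonant = 1.25 x short consonant
SHORT = 1.0


def vc_ratio(vowel: float, consonant: float) -> float:
    """Vowel-to-consonant duration ratio of a rhyme."""
    return vowel / consonant


ratio_v_long = vc_ratio(LONG_V, SHORT)   # V:C rhyme
ratio_c_long = vc_ratio(SHORT, LONG_C)   # VC: rhyme

# Factor by which the V/C ratio of a V:C rhyme exceeds that of a VC: rhyme.
contrast = ratio_v_long / ratio_c_long
print(round(contrast, 3))
```

Of the two factors that make up the contrast, the vocalic one (1.5) is larger than the consonantal one (1.25), which is the sense in which V duration contributes more to the ratio.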

Evidence from perceptual studies also seems to support the relevance of V quantity. Hadding-Koch & Abramson (1964) investigated 27 whether a V:C σ, whose nucleus is sufficiently shortened, is perceived by native speakers of Swedish as a VC: σ. They reduced the vowels of väg (road) and stöta (thrust) in several steps and found that shortened vocalic duration on its own (i.e. without complementary lengthening of the following C) was enough for the listeners involved to classify the stimulus as VC:. This, however, did not apply to the long V of ful (ugly), whose duration was eventually reduced beyond the short V of full (id), yet it was mostly “correctly” identified 28. This means that most (if not all) pairs of short and long vowels (notably [ɑ:] – [a] and [ʉ̟:] – [ɵ]) apparently differ in both timbre and duration, which suggests that they are not merely allophones of the same phoneme. This is echoed in Linell (1978:128) according to whom “[t]hese differences of vowel quality are so great that they cannot be explained as mechanical consequences of the differences in duration”. Furthermore, “[l]ong and short vowels are not acquired by children in a pairwise manner” (ibid), which can lead us to assume that long and short vowels are psychologically distinct. Corresponding spectral differences are not observed for long and short consonants, which can be interpreted as a strong argument in favour of distinctive vowel quantity. The observation that stressed VC-syllables are typically identified as VC: can be further supported by inter-linguistic interference phenomena. Linell (1978:130) claims that when a Swede is asked to reproduce a Finnish utterance containing a stressed VC σ, he “attends to the quantity of the vowel, retains it and introduces instead the allophonic consonant length”. I myself have conducted a similar perceptual experiment, in which native speakers of Swedish were exposed to Polish and Hungarian utterances containing stressed VC-syllables. The results of the study allow us to draw identical conclusions, i.e. what counts is usually the shortness of the V and not that of the C (with the reservation that deviating V quality appears to be a stronger perceptual cue 29 than temporal relations). Note that inter-linguistic interference can arguably mirror various derivational processes. When a Swede is confronted with Hungarian tó [to:] (lake), he will probably repeat it as tå [tʰo:] (toe) and not as då [do:] (then). This leads the analyst to assume that aspiration is redundant, while voice is a distinctive feature in Swedish phonology. The Hungarian utterance [to:] can for all practical purposes be equated to the Swedish underlying form, which serves as a starting point in the derivation of [tʰo:]. In a similar vein, Polish, Finnish etc. stimuli should be viewed as underlying forms. Now if CVC is reproduced as CVC:, then we cannot help acknowledging that C gemination is a redundant, allophonic process.

27 Both the reader of the stimuli and the listeners were speakers of Southern Swedish, i.e. a dialect where long vowels are subject to diphthongization.
28 Thorén (2003) reports the results of an experiment, in which temporal values for consonants and vowels were manipulated simultaneously. He found that although perception was aided by adjusted consonantal duration, the role of spectral differences could not be completely overruled by complementary lengthening.
29 For instance, Hungarian hat [hɒt] (six) was typically identified with Swedish hat [hɑ:t] (hatred) and not with hatt [hat:] (hat).

Economy

If we now turn our attention to the question of economy, we will find that the arguments concerning the phonological status of the quantitative distinction are complex and contradictory (at least compared to the clear-cut conclusions we have arrived at having discussed perceptual issues). The main reason for this is to be sought in the fact that the term economy can be used with reference to different linguistic levels and a wide range of morphophonological processes. The question of inventory size seems to be the most straightforward issue to address. Standard Swedish has 9 vowels and 18 consonants, of which /h/ and /ç/ are never lengthened. Moreover, the short vowels corresponding to <e> and <ä> have been merged in [ɛ]. This means that assuming underlying C length would result in 43 phonemes, while a theory embracing distinctive V length would have to reckon with an inventory size 30 of 35. All other things being equal, positing underlying C quantity would constitute a violation of Occam’s razor. All other things are of course not equal, but it is still worth reminding ourselves that in the eventuality of a tie we are expected to lean towards distinctive V length.
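The two inventory sizes quoted above can be retraced as follows. This is only a bookkeeping sketch: the counts are taken from the paragraph above, while the way they are combined (one extra phoneme per segment that can be long, minus the merged short vowel) is my reading of how 43 and 35 are arrived at, and the constant names are hypothetical.

```python
# Figures quoted in the text for Standard Swedish.
VOWEL_QUALITIES = 9
CONSONANTS = 18
NEVER_LONG_CONSONANTS = 2   # /h/ and /ç/
MERGED_SHORT_VOWELS = 1     # two short vowels fall together in [ɛ]


def inventory_with_c_length() -> int:
    """Vowels are counted once; every consonant that can be long counts twice."""
    long_consonants = CONSONANTS - NEVER_LONG_CONSONANTS
    return VOWEL_QUALITIES + CONSONANTS + long_consonants


def inventory_with_v_length() -> int:
    """Consonants are counted once; long and short vowels are counted separately."""
    short_vowels = VOWEL_QUALITIES - MERGED_SHORT_VOWELS
    return CONSONANTS + VOWEL_QUALITIES + short_vowels


print(inventory_with_c_length(), inventory_with_v_length())
```

Under these assumptions the difference of eight phonemes follows from there being sixteen consonants that can be long but only eight distinct short vowels.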

Shortening and lengthening rules

An economical description does not resort to otherwise unmotivated shortening and lengthening rules during the course of the derivation. To take an example, let us consider the word matte, which is the shortened form of matemati:k (mathematics). It is reasonable to assume that matte is the result of back clipping and as such has the first two syllables of matemati:k as its underlying form. It is apparent that the surface form preserves the length of the V and mechanically adjusts the length of the C. Linell (1978:129) includes a list of hypocoristic formations, which all share the same mechanism, and adds that “[t]his is precisely what we should expect, if vowel length, but not consonant length, is phonologically relevant”. A theory adopting distinctive C length would have to posit a seemingly unmotivated gemination rule in order to derive the correct surface form. However, the force of this argument is considerably blunted by the realization that certain shortened forms surface with an unexpected long V, while in other cases an originally long V undergoes shortening for no apparent reason. Examples are given in (22) below. In many instances, the hypocoristic word cannot have been derived from the given base form with the help of phonological rules, as /kf/ > /r:/, /n:/ > /l:/ and /Ø/ > /p:/ in the examples in (22) would defy everything we know about how phonological change proceeds. Linell (ibid) attributes (22b) to pronunciation but he does not address the data in (22c). Given that the formation of hypocorisms cannot always be described in terms of a derivation, we have no reason to expect them to exhibit faithfulness to the base form, cf. Riad (2013:165). In fact, it can be argued that the formation of hypocoristic words is akin to certain aspects of child language and always involves gemination 31. Moreover, such C lengthening is claimed to go back (at least) to Proto-Germanic times, cf. Tschirch (1983:82ff), Martinet (1937a). All this indicates that diminutives and back clippings cannot be relied on when we want to establish the phonological status of the quantitative distinction.

30 Similarly to Kristoffersen (2000), Eliasson & La Pelle (1973) is an analysis where quantity is seen as a consequence of stress. The authors refer to Elert (1964) and assert that distinctive V quantity would result in an “unexpectedly” high number of V phonemes compared to other languages. In other words, they are of the opinion that assuming underlying, segmental quantity is dubious from a cross-linguistic perspective. Maddieson (1984:7) makes it clear that “the typical size of an inventory lies between 20 and 37 segments”. Furthermore, “[t]he probability of length being part of the vowel system increases with the number of vowel quality contrasts” (ibid: 129). Around 84% of the languages involved in Maddieson’s survey have fewer V qualities than Swedish has (ibid: 127), which suggests that there is nothing abnormal cross-linguistically about positing distinctive V quantity for Swedish.
31 Note that avundsjuk in (22a) does not conform to this expectation.

(22) Hypocoristic formations 32
a. preserved vocalic length
   i. a:vundsjuk > a:vis (envious), bibliote:k > bibbla (library), deprime:rad > deppig (depressed), fotografi: > fot:o (photograph), Lennart > Lelle
b. V lengthening
   i. fokuse:ra > fo:ka (to focus), fotografi: > fo:to (photograph), elektricite:t > e:l (electricity), kvalifice:ring > kva:l (qualification)
c. V shortening
   i. Ro:bert > Robban, O:lof > Olle, Kristi:na > Titti, Jo:han > Joppe, sma:kfull > smarrig (tasty)

Arguments for distinctive C quantity

As far as proper derivations are concerned, Riad (2013:165ff) argues that “we get a simpler and more consistent grammar with fewer ad hoc assumptions if we assume distinctive consonant quantity”. His reasoning is based on the premise that long consonants are moraic and that consonantal length is either underlying, e.g. vinna (to win), or the result of assimilation, e.g. vit+t (white, neuter), or is assigned by position, e.g. mjölk (milk). Irrespective of their source, moraic consonants behave in a uniform manner: they are always preceded by a short V. Assuming distinctive C quantity also entails that the behaviour (i.e. the occasional lengthening) of vowels is predictable. When stress is assigned to a σ with a non-branching rhyme, its nucleus undergoes open syllable lengthening (OSL). This can apply to words ending in a stressed V, e.g. meny (menu), to words with final stress ending in an extrametrical C, e.g. gardin (curtain), and to words with non-final stress where the following intervocalic C (cluster) is non-moraic, e.g. rita (to draw) or sakna (to miss). In the latter case the cluster is expected to display a rising sonority profile and has to constitute a legitimate onset 33. So in what ways is this approach superior to its competitor? One of the strongest arguments concerns cases “when stress occurs on different syllables within the same paradigm, and vowel length moves around accordingly” (Riad (2013:169f)). Consider the examples in (23).

(23) Moving stress patterns
a. politi:k (politics), poli:tiker (politician), politise:ra (politicize)
b. dra:ma, dramati:k (drama), drama:tisk (dramatic), dramatise:ra (dramatize)
c. fo:n (phone), fone:m (phoneme), fonema:tisk (phonemic), fonemati:k (phonemics), fonematise:ra (phonemicize) 34

V lengthening in open syllables seems to be an automatic (thus predictable and redundant) consequence of stress assignment. C length, to which V length is apparently adjusted, is stable throughout the patterns. It is tacitly implied that a theory that assumes distinctive V quantity would falsely predict geminated consonants in politiker and dramatisk and therefore has to resort to unmotivated shortening and lengthening rules in order to account for the actual data. In Riad’s framework it is expected that OSL should give rise to long vowels. Whenever OSL operates, V length is clearly redundant. This must lead defenders of distinctive V quantity to acknowledge that it is superfluous to postulate underlying length in words such as tre (three).
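The way OSL makes V length fall out of stress placement in (23) can be illustrated with a minimal sketch. It is not Riad's formalization: syllable boundaries, the position of stress and the presence of a moraic coda are supplied by hand, a word-final C is simply treated as extrametrical, and the function name apply_osl is hypothetical.

```python
VOWEL_LETTERS = set("aeiouyåäö")


def apply_osl(syllables, stressed, moraic_coda=False):
    """Mark the nucleus of the stressed syllable as long (':') unless the
    syllable is closed by a moraic consonant, in which case nothing changes."""
    result = list(syllables)
    if not moraic_coda:
        syllable = result[stressed]
        # The nucleus is taken to be the first vowel letter of the syllable.
        nucleus = next(i for i, ch in enumerate(syllable) if ch in VOWEL_LETTERS)
        result[stressed] = syllable[: nucleus + 1] + ":" + syllable[nucleus + 1:]
    return ".".join(result)


# The paradigm in (23b): the same stem with stress on different syllables.
print(apply_osl(["dra", "ma"], 0))           # dra:.ma
print(apply_osl(["dra", "ma", "tik"], 2))    # dra.ma.ti:k
print(apply_osl(["dra", "ma", "tisk"], 1))   # dra.ma:.tisk
# A stressed syllable closed by a moraic C (vinna) keeps its short V.
print(apply_osl(["vin", "na"], 0, moraic_coda=True))   # vin.na
```

No vowel length needs to be stored for the stem in this sketch; the long V simply follows the stress, which is the point of the argument above.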

32 Many of the examples appearing in (22) are borrowed from Riad (2013:ch7) and Linell (1978:129).
33 Riad (ibid) does not mention this latter criterion, still all of his examples conform to it. /vl/ in tä:.vla (to compete) is contentious since no Swedish or Norwegian word starts with this cluster. However, /vr/ is a legitimate onset and native speakers of Sw&No do not tend to have difficulties pronouncing Russian proper names such as Vladimir or Vladivostok, so this might be considered to be a lexical gap.
34 These examples are taken from Eliasson (1978:119).

A similar pattern seems to emerge when we investigate assimilated geminates. These can arise by attaching the neuter suffix –t to an adjective ending in a dental stop or by forming the past tense of weak verbs, whose imperative ends in –t. As indicated in (24a), both cases involve the shortening of the root V. Again, underlying C quantity seems to explain the facts more accurately as the length of the V is adjusted to that of the following C.

(24) V shortening
a. Assimilated geminates
   i. vi:t (white) + -t (neuter) 35 > vitt, bre:d (broad) + -t (neuter) > brett
   ii. by:t (change) + -de (past) > bytte, le:d (lead) + -de (past) > ledde
b. Monosyllable + dental suffix
   i. fri (free) + -t (neuter) > fritt, ny (new) + -t (neuter) > nytt
   ii. tro (believe) + -t (past part.) > trott, sy (sew) + -t (past part.) > sytt
   iii. fly (flee) + -de (past) > flydde, nå (reach) + -de (past) > nådde
   iv. ko (cow) + -sa (dim.) > kossa, Bo (given name) + -se (dim.) > Bosse

We can encounter similar shortening phenomena when we investigate monosyllables that lack a coda. According to Riad’s (2013:175) analysis, the suffixes in (24bi) and in (24bii) are non-moraic and are not extrametrical either, which means that OSL is not applicable in the present case. Instead, the grammar evaluates the two candidates ([fri:t] and [frit:]) and finds that the latter is more optimal given that it “[contains] a branching rhyme both in terms of moras… and in terms of syllable structure”. Riad (ibid) concludes that “[t]his outline of an analysis makes a phonological connection across adjectives… verbs… and hypocoristics” as indicated in (24b). Syllabification can also provide support for distinctive C quantity. If we compare mata (feed) and matta (carpet), we see that the stressed σ is open in the former and closed in the latter. This is, indeed, reflected in the underlying forms if we presume that consonants are distinctive (cf. /mɑ.tɑ/ - /mɑt.µɑ/). This insight is not captured if we adopt distinctive V quantity (cf. /mɑ:.tɑ/ - /mɑ.tɑ/). This latter convention leads us to discover a discrepancy between the underlying representation and the surface form. In what follows I will try to find out whether the arguments exposed above can be countered by the defenders of distinctive V quantity.

35 Interestingly, a few adjectives (many ending in a dental stop) lack a neuter form. These include lat (lazy), kåt (horny), gravid (pregnant), paranoid (id), skraj (scared) etc.

Arguments for distinctive V quantity

One of the most significant points to make is that the postulation of phonemic V length does not necessarily imply that each and every long V features in the underlying representation. This is exactly what Riad (2013:166) assumes for consonants when he states that weight can be induced by position. To take an example, underlying length in monosyllables without codas would be totally superfluous since V lengthening is the only process at hand that can ensure a bimoraic rhyme. So let us suppose that [fri:] (free) is stored in the lexicon as /fri/ and [tʰru:] (believe) as /tru/. As a next step, let us consider what happens if we attach C-initial suffixes to such monosyllabic stems: /frit/ and /trut/ are not eligible for extra vocalic weight by position. The underlying shortness of the V will trigger the gemination of the following C. This approach can account for the data in (24b) in a uniform and straightforward way. There is no need to make dubious claims about cross-linguistically unmarked CV:-syllables being less optimal than CVC:-syllables. There is no need to involve (or to revoke) extrametricality and the predictions of the theory are all in line with (24b). Note that Riad explained (24bi and ii) claiming that the coda blocked OSL by virtue of not being extrametrical. However, the same line of reasoning cannot apply to (24biii and iv), which in his model should undergo OSL. Further problems are raised by the present tense forms of monosyllabic verbs, e.g. tror (believes), ser (sees), and by the definite forms of monosyllabic nouns, e.g. ön (the island), ån (the stream). All of these words feature a long V despite the fact that both approaches predict the opposite. We may call attention to the fact that these two inflected categories (i.e. present and definite) display virtually no exceptions (as opposed to various irregular past tense and plural forms, which obviously burden the lexicon) and as such are extremely faithful to their base forms. This faithfulness then manifests itself in identical 36 V quantity in tro and tror. However, this line of reasoning is defied by the long V in plural forms such as bi+n (bees) and sto+n (mares). A fresh look at the data indicates that a phonetic account is superior to an explanation in terms of faithfulness and regularity. As it turns out, length in the inflected forms of monosyllables that lack a coda (on their own) is dependent on the sonority of the first segment of the suffix: obstruents are geminated (24b), while sonorants trigger the lengthening of the root V. Note that the definite suffix of neuter nouns features a V as in bi+et (the bee) as opposed to by+n (the village), cf. (33a), for the sake of paradigm uniformity. Riad attributes the long V of gardin (curtain) and ful (ugly) to OSL, arguing that the word-final C is extrametrical. However, I would like to maintain that once we refer to extra-prosodicity, we find ourselves on thin ice. Altering the data we are investigating should be our last resort.
Note that Kristoffersen (2000:155) was compelled to revoke the very same assumption about extra-prosodic consonants that is embraced by Riad, because he found that the stress rules he had devised did not permit invisible consonants. Riad (2013:278) derives [1nɑ:g ɛl] (nail) from /n ɑgl/ with the help of epenthesis, as is often suggested in the literature. However, if the length of the surface V is attributed to OSL, then we have to hypothesize that both /g/ and /l/ are extrametrical, which may seem somewhat far-fetched. Riad (2013:171) mentions that “[t]here are a few monomorphemic forms with a lengthened vowel before… clusters” such as aln (ell), moln (cloud), värld (world), but he does not explain how they are derived. Distinctive V quantity does not face such problems. Note that a theory assuming distinctive C length can run into problems even if it renounces the idea of epenthesis. In this case me: del (means) is stored in the lexicon as /medel/ and it is assumed that V length on the surface can be attributed to OSL. However, these premises would make it difficult to account for the long V of förm e: dla (to mediate), which cannot be derived in the same manner, since it is dubious whether /dl/ can be syllabified into the next σ. One can always resort to paradigm uniformity and faithfulness to the base form, however, it is not certain that native speakers automatically associate förmedla with medel. It seems more efficient to assume distinctive V quantity. Riad (2013:171f) makes a very important point when he says that “[t]he synchronic addition of morphemes will in general not have an effect on vowel length”. Of the three exceptions he mentions the first two were presented in (24a and b) and the third relates to historically derived alternations where the addition of a morpheme involves V shortening. Examples include vid (wide) – vidga (to widen), våt (wet) – vätska (liquid), fet (fat) – fetma (obesity) etc. 
This pattern seems to suggest that in those cases when a formative boundary (+) gets obscured (and finally vanishes) due to the waning productivity of a given suffix, some urgent rearrangement is called for. In a disyllabic morpheme with initial stress any C following a long V is syllabified into the second σ. In the case of the above mentioned derivations this is not viable given that the intervocalic C clusters are not legitimate onsets in the language. So if we want to resolve these illicit patterns we have to choose between V shortening on the one hand and C-deletion on the other. It is obvious that the latter option constitutes a more serious violation of faithfulness constraints and is, as a consequence, less optimal. The stressed σ of the winning candidate will therefore be made up of a short V followed by a moraic C. I would like to argue that such instances of V shortening should not be interpreted as evidence for distinctive C quantity, even if the stability of the C as compared to the V may suggest so. These processes do not reflect the way Swedish phonology chooses to derive surface forms since they cannot be equated to a derivation in terms of UR > SR. What these processes do instead is that they restructure illicit sequences and by doing so they alter actual lexical representations (UR1 > UR2). This can be considered as an exceptional emergency measure (where what counts is minimizing losses) and should not be equated to or confused with everyday derivations.

36 Sjö+n [ʂøn:] (the lake) is a notable exception.

A similar line of reasoning can be applied to explain the shortenings in (24a). When the neuter suffix –t is added to an adjective of the form (C)V:t or (C)V:d, the ensuing fusion of the final C and the suffix does not only lead to a geminate, it also leads to structural problems. The first thing to notice is that gemination entails the loss of a formative boundary (it is impossible to tell where exactly the suffix begins). So similarly to the non-productive suffixes discussed above, vitt (white - neuter) is best analyzed as a single morph. Here again, we have to deal with an illicit sequence (/vi:t:/), since a moraic C can never be preceded by a long V. The emergency measure we have to implement has to evaluate the two possible outcomes (/vit:/ and /vi:t/) in terms of optimality. We know that both segments are long and moraic in the rhyme of the illicit form. /i/ (unlike /i:/) is short and moraic (one difference) while /t/ (unlike /t:/) is short and non-moraic (two differences). This means that the winning candidate should rather contain short /i/ than /t/.
Consequently, /vit:/ displays a higher degree of faithfulness towards the illicit input form.

Let us now turn to the problem of moving stress patterns. The data in (23) suggest that whenever the location of primary stress changes in a word, some sort of affixation is involved. According to Riad (2009:ch6) we can distinguish between four different types of morphemes on the basis of their contribution to the surfacing stress pattern. Morphemes are either prosodically unspecified, pre-tonic, tonic or post-tonic. The first two categories can be ignored in the present discussion, given that an unspecified suffix is unable to alter the location of stress, while pre-tonic affixes (i.e. prefixes such as be- and för-) cannot be combined with morphemes that surface with non-initial stress on their own. Therefore they do not have an effect on stress assignment either.³⁷ Tonic affixes attract primary stress to themselves. When a tonic prefix is attached to a word (cf. o+markerad (unmarked), ur+usel (very bad)), the primary stress of the second constituent is demoted and surfaces as secondary stress. The resulting structure is a compound, which we can refer to as a maximal prosodic word following Riad’s (2013:ch5) terminology. Such prefixes, however, do not alter quantity relations, since a σ characterized by secondary stress is also subject to the bimoraic condition. So here we will restrict our discussion to minimal prosodic words, which are defined by culminativity. With the addition of a tonic suffix the stress of the original word disappears, but not all previous traces of stress are necessarily erased. Under certain conditions (e.g. that the stress is not moved to a neighboring σ), an originally long V surfaces as phonetically half-long when unstressed (cf. ibid: 202). However, “[w]hen stress is shifted away from a syllable containing a long consonant no trace of length is left behind” (ibid: 176).
This also seems to support the assumption that V quantity is more basic and not only a mechanical consequence of stress.
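The comparison of the two repairs of /vi:t:/ discussed earlier in this section can be made explicit with a small count. The following is only a sketch under my own assumptions: the class `Segment`, the two Boolean properties and the function names are illustrative devices, and the count is a bare tally of differences rather than a full constraint ranking.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Segment:
    symbol: str
    long: bool
    moraic: bool

def differences(candidate: Segment, source: Segment) -> int:
    """Count the quantity properties in which two segments differ."""
    return int(candidate.long != source.long) + int(candidate.moraic != source.moraic)

def unfaithfulness(candidate, source) -> int:
    """Sum the differences over the rhyme, segment by segment."""
    return sum(differences(c, s) for c, s in zip(candidate, source))

# Rhyme of the illicit form /vi:t:/: both segments are long and moraic.
ILLICIT = (Segment("i", True, True), Segment("t", True, True))

CANDIDATES = {
    # V shortened: /i/ is short but still moraic (one difference)
    "vit:": (Segment("i", False, True), Segment("t", True, True)),
    # C shortened: /t/ is short and non-moraic (two differences)
    "vi:t": (Segment("i", True, True), Segment("t", False, False)),
}

scores = {form: unfaithfulness(rhyme, ILLICIT) for form, rhyme in CANDIDATES.items()}
winner = min(scores, key=scores.get)
print(scores, winner)
```

On this tally the candidate with the short V departs from the illicit input in one property only, which is the sense in which it is the more faithful repair.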

37 This is not entirely true. When pre-tonic morphemes are combined with formal compounds such as be+arbeta (to cultivate), för+orsaka (to cause) etc, the main stress usually moves to the prefix (cf. Riad (2009:56)), by which the discrepancy between the compound’s morphological and prosodic structure is dissolved.

Given that stress is defined by length, which is in turn restricted to stressed syllables, it is quite easy to account for segmental shortenings upon stress shift as in fo:n > fone:m. It is enough to posit a rule that deletes any³⁸ segmental length before the newly added tonic suffix. The real problem is presented by lengthening processes that take place in syllables preceding post-tonic morphemes. The long vowels in poli:tiker (politician), ma:giker (magician), matema:tiker (mathematician) etc. defy our prediction that a /'CV.CV/-sequence should undergo intervocalic gemination. In the case of algebra:iker (algebraist) we can argue that the long V is due to OSL as a result of weight by position, but this should not apply to the previous examples.

We have noted that there is a discrepancy in terms of syllabification between the proposed UR and the SR of words like matta (carpet). The stressed σ is open in the former but closed in the latter. What this indicates to me is that gemination must be one of the earliest phenomena in the derivation, apparently preceding syllabification. If we assume that gemination is non-cyclic or, more precisely, pre-cyclic, it all makes sense. As a first step, every C that follows a short, stressed V undergoes gemination. This happens separately in each morpheme. This is then followed by various other processes including syllabification, suffixation, compounding etc. When the actual phonetic representation is ready to surface, the post-lexical rule of OSL will adjust monomoraic stressed syllables to the bimoraic condition. This can occur in monosyllables, where it would be superfluous to indicate underlying quantity, e.g. tre (three), or in words that have undergone stress shift. This also reveals why the question of Swedish quantity is so confusing.
Both gemination (a case for phonemic V quantity) and OSL (an argument for phonemic C quantity) are detectable in Swedish phonology; however, the former acts on the UR, while the latter acts on derived forms.
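The proposed ordering (pre-cyclic gemination, then suffixation, then post-lexical OSL) can be illustrated with a toy derivation. This is a minimal sketch under assumptions of my own: forms are plain strings, an apostrophe before a vowel marks stress, a colon marks length, and syllabification is ignored, so that any stressed short V not followed by a long C counts as monomoraic. The function names are hypothetical and carry no theoretical weight.

```python
VOWELS = set("aeiouyåäö")

def geminate(morpheme: str) -> str:
    """Pre-cyclic step: a C directly after a short stressed V becomes long."""
    mark = morpheme.find("'")
    if mark == -1:                      # prosodically unspecified morpheme
        return morpheme
    c = mark + 2                        # position right after the stressed V
    if (c < len(morpheme) and morpheme[c] not in VOWELS
            and morpheme[c] != ":" and morpheme[c + 1:c + 2] != ":"):
        return morpheme[:c + 1] + ":" + morpheme[c + 1:]
    return morpheme

def attach_post_tonic(stem: str, suffix: str) -> str:
    """A post-tonic suffix places stress on the last V of the preceding stem."""
    last_v = max(i for i, ch in enumerate(stem) if ch in VOWELS)
    return stem[:last_v] + "'" + stem[last_v:] + suffix

def osl(word: str) -> str:
    """Post-lexical step: lengthen a stressed V that is still monomoraic."""
    mark = word.find("'")
    if mark == -1:
        return word
    rest = word[mark + 2:]              # everything after the stressed V
    v_long = rest.startswith(":")
    c_long = len(rest) > 1 and rest[0] not in VOWELS and rest[1] == ":"
    if v_long or c_long:
        return word
    return word[:mark + 2] + ":" + rest

matta = osl(geminate("m'ata"))                                  # gemination applies
tre = osl(geminate("tr'e"))                                     # only OSL applies
politiker = osl(attach_post_tonic(geminate("polit"), "iker"))   # stress shift, then OSL
print(matta, tre, politiker)
```

Because the bound stem polit- carries no stress when gemination applies, its /t/ escapes gemination, and the stressed V is lengthened only at the end of the derivation.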

The location of primary stress

I would like to conclude this section by discussing another aspect of the economy argument. I have argued in 2.2.1 that quantity is more fundamental than stress. In light of the above discussion it seems reasonable to infer that stress-to-length applies pre-cyclically while length-to-stress during the final course of the derivation. In other words, in the majority of cases the location of primary stress follows from the segmental make-up of the UR. The two approaches can be compared with the help of the data in (25) below. I have not included words with three syllables or longer since I assume that the pattern that emerges here is relevant to longer lexical items too.

(25) Stress-to-weight in disyllables

Length on the surface  Example               UR with C length   UR with V length
CV:CV                  teve (television)     CVCV               CV:CV
CVCµV                  kaffe (coffee)        CVCµV              CVCV ***
CVCV:                  café (id)             CVCV               CVCV:
CVCµCV                 hindi (id)            CVCµCV             CVCCV ***
CV:CCV                 vakna (to wake up)    CVCCV              CV:CCV
CVCCV:                 hindu (id)            CVCCV ???          CVCCV:
CV:CVC                 månad (month)         CVCVC              CV:CVC
CVCµVC                 gammal (old)          CVCµVC             CVCVC
CVCV:C                 banan (banana)        CVCVC !!!          CVCV:C
CVCVCµ                 hotell (hotel)        CVCVCµ             CVCVC !!!
CVCµCVC                biskop (bishop)       CVCµCVC            CVCCVC
CVCCVCµ                bojkott³⁹ (boycott)   CVCCVCµ            CVCCVC !!!
CVCCVCµ                madrass (mattress)    CVCCVCµ            CVCCVC !!!
CVCCV:C                bandit (id)           CVCCVC !!!         CVCCV:C
CV:CCVC                basket (basketball)   CVCCVC             CV:CCVC
VCCV:C                 asket (ascetic)       VCCVC !!!          VCCV:C
CVCVCµC                cement (id)           CVCVCµC            CVCVCC !!!
CVCCVCµC               konserv (tin)         CVCCVCµC           CVCCVCC !!!
VCCVCµC                asfalt (asphalt)      VCµCVCC            VCCVCC

38 Many tonic suffixes can combine with bound morphemes (e.g. polit-ik), which are consequently unspecified for stress.

Letters in bold in the second column indicate a stressed σ. The gray-shaded cells contain representations that are identical with the SR in terms of quantity. Cells marked with asterisks illustrate that the given approach can successfully deduce the location of stress and thus the SR as well. To take an example, CVCV can only surface as CVCµV provided that V quantity is underlying. This is because the second σ cannot be made bimoraic. Recall that OSL was restricted to monosyllables and derived forms. Underlying V length is assumed in words like café, since it serves to indicate the location of stress and is thus not redundant. Following Kristoffersen (2000:154) we can postulate that in words without underlyingly specified segmental length stress placement is quantity-sensitive in the sense that stress is attracted to closed syllables. A word can of course have more than one closed σ, in which case stress is located on the rightmost σ with the most branching coda. This is, however, more of a tendency than a rule. It works for the cells with exclamation marks but it also makes a number of false predictions, which have to be lexically marked. For the C approach these include word types represented by månad and basket, while the V approach is unable to cope with gammal, biskop and asfalt. The words hindu (marked with question marks) and hindi can be derived with the C approach on condition that the actual moraicity of a given C is always underlying. However, if we assume weight (i.e. gemination) by position, then the stress pattern of hindu cannot be predicted. The remaining cells of the last two columns (i.e. cells without colours, asterisks and question marks) contain those syllabic types for which the given approach either makes wrong predictions or no predictions at all, so the location of stress in these items has to be lexically specified.
Given that words with a syllabic structure like teve and café are immensely numerous, the C approach would have to deal with a large number of lexically specified words. As far as the V approach is concerned, it has to be added that a growing number of words with original final stress are assigned stress on the first σ (see 2.2.3), which increases both the number of CVCµVC-words and the lexical burden of the V approach. All in all, a thorough comparison suggests that distinctive V quantity can account for a somewhat wider range of stress patterns.
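The quantity-sensitive tendency adopted above from Kristoffersen (stress on the rightmost σ with the most branching coda) can be stated as a short procedure. The sketch below rests on my own simplifications: words are given as already syllabified CV skeletons without any underlying length, the syllable divisions are assumed rather than derived, and the function names are illustrative.

```python
def coda_size(syllable: str) -> int:
    """Number of C slots after the last V of a CV-skeleton syllable."""
    return len(syllable) - syllable.rfind("V") - 1

def default_stress(syllables):
    """Index of the rightmost syllable with the largest coda, or None
    when no syllable is closed and the tendency makes no prediction."""
    sizes = [coda_size(s) for s in syllables]
    heaviest = max(sizes)
    if heaviest == 0:
        return None
    return max(i for i, size in enumerate(sizes) if size == heaviest)

# UR skeletons without segmental length (the '!!!' cells of (25))
banan = default_stress(["CV", "CVC"])        # final stress, as attested
bandit = default_stress(["CVC", "CVC"])      # tie between codas: rightmost wins
konserv = default_stress(["CVC", "CVCC"])    # most branching coda wins
teve = default_stress(["CV", "CV"])          # no closed syllable: no prediction

# A false prediction: månad has the same skeleton as banan but initial stress
manad_predicted, manad_attested = default_stress(["CV", "CVC"]), 0
print(banan, bandit, konserv, teve, manad_predicted != manad_attested)
```

The last comparison shows why word types such as månad have to be lexically marked under the C approach: the procedure cannot tell them apart from banan.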

39 This word can also be encountered with initial stress.

2.2.3. Stress

In the description of languages where stress is largely (but not completely) predictable due to the interaction of phonological, morphological and lexical factors, linguists have long been preoccupied with accounting for attested stress patterns, cf. Chomsky & Halle (1968:ch3), Fretheim (1969), Standwell (1972), Kristoffersen (2000:ch6), Rice (2006). The main impetus behind this enthusiastic quest for rules is to be sought in the precept according to which “[r]egular variations… are not matters for the lexicon, which should contain only idiosyncratic properties of items, properties not predictable by rules” (Chomsky & Halle (1968:12)). In other words, the lexicon should be as slender as it possibly can be, with no information on e.g. stress placement for those words whose accentuation is claimed to be predictable. Consequently, the analyst aims at constructing a grammar where the number of lexically marked exceptions is kept as low as possible and where the devised scheme reveals how the speaker derives the surface form from the UR.

In spite of its theoretical appeal the above position is highly problematic. My first objection concerns the conviction that whenever the analyst manages to establish a derivation, this must necessarily reflect some cognitive process in the speaker’s mind. This is arguably not the case. As far as I understand, speakers need not derive the stress pattern of mono-morphemic words with the help of rules. The very idea of a stress rule presupposes a process of language acquisition whereby the child is first exposed to a large number of linguistic stimuli, part of which he memorizes and utilizes in constructing a grammar of his own. If the generative approach were right in claiming that the lexicon is completely void of redundancy as such, then we would have to postulate that the process of language acquisition involves a certain stage during which all lexical information on the basis of which a given rule has been construed is erased in order to decrease the lexical burden of the language. This is quite improbable. Languages (or their users) do not seem to be overly concerned about lexical specifications, as the tones of Chinese, the grammatical genders of German or the lexical stress of Russian suggest. But even if they were, they would surely find far more efficient ways of reducing the size of the lexicon than by banishing stress from the UR. Let us investigate the word konserv (tin) in terms of a traditional feature analysis following Linell & al. (1971:56, 97).
In (26) below we find that there are at least 26 things to remember about the word’s seven segments. If we were to add [+stress] to (26f), this figure would increase to 27. It is plain to see that the lexical burden remains in the same order of magnitude even if we specify the stressed σ. If the size of the lexicon really mattered (which it does not) languages would resort to widespread segmental deletions and would impose serious restrictions on the size of potential lexical items.

(26) A feature analysis of konserv
a. /k/ - [−voice], [−coronal], [−anterior], [−continuant]
b. /o/ - [+round], [−high], [+back]
c. /n/ - [+nasal], [+coronal], [+anterior]
d. /s/ - [−voice], [+coronal], [+anterior], [+continuant]
e. /e/ - [−round], [−low], [−high]
f. /r/ - [+voice], [+coronal], [−anterior], [+continuant], [−lateral]
g. /v/ - [+voice], [−coronal], [+anterior], [+continuant]
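The figures cited for (26) can be checked with a simple tally. The snippet below merely restates the feature specifications of (26) as data; the variable names are my own, and the minus sign is written with an ASCII hyphen.

```python
# Feature specifications of konserv, transcribed from (26)
KONSERV = [
    ("k", ["-voice", "-coronal", "-anterior", "-continuant"]),
    ("o", ["+round", "-high", "+back"]),
    ("n", ["+nasal", "+coronal", "+anterior"]),
    ("s", ["-voice", "+coronal", "+anterior", "+continuant"]),
    ("e", ["-round", "-low", "-high"]),
    ("r", ["+voice", "+coronal", "-anterior", "+continuant", "-lateral"]),
    ("v", ["+voice", "-coronal", "+anterior", "+continuant"]),
]

burden = sum(len(features) for _, features in KONSERV)
burden_with_stress = burden + 1   # one extra specification: [+stress]
print(burden, burden_with_stress)
```

The single additional specification raises the tally from 26 to 27, an increase of under four per cent.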

This suggests that human beings have much greater cognitive capacities than most linguists would like to acknowledge. It is obvious that a lot of linguistic information is multiply stored in various areas of the brain. People who have suffered brain injuries would not be able to relearn or maintain their language in the remarkable way they do if it were otherwise. The claim that the lexicon presumably contains a large number of redundant elements can be further supported by the fact that high-frequency phrases take much less processing time than rare structures, which really have to be derived. Redundancy is, in fact, so pervasive at all linguistic levels that it would be surprising if it did not apply to the lexicon as well in some way or another.

This is, of course, not to deny that the rules and derivations established by the analyst are linguistically real to the speaker; quite the contrary. The rules are essential in productive alternations, which allow the speaker to improvise and come up with new lexical items and hitherto unheard phrases and sentences. This aspect of creativity, which is so central in generative grammars, is not present in the stress pattern of mono-morphemic words. They are for the most part not derived by the speaker. Their pronunciation is simply recalled from memory.

It follows from the above reasoning that in order to be phonologically relevant, main stress assignment either has to involve productive alternations or has to relate to some ongoing processes in a given language. In the , predictable stress shifts can be brought about by compounding and affixation. Examples of the latter were given in (23), which included both tonic (-ik, -era) and post-tonic affixes (-isk, -iker), while pre-tonic affixes (be-, för-) and their occasional influence on stress patterns were mentioned in fn. 37. In what follows, I will therefore focus on stress shifts exhibited by compounds.

The stress pattern of compounds

Most Swedish compounds are maximal prosodic words in the sense that they are characterized by two prosodic peaks, usually referred to as primary and secondary stress. The prominence patterns of the individual words entering a compound are usually preserved. The main stress of the compound is assigned to the stressed σ of the first constituent, while the main stress of the last constituent commonly appears as the secondary stress of the compound. This means that if a compound is made up of more than two constituents, secondary stress is moved to the last of these regardless of the internal structure of the compound word. Examples are given in (27) below. This pattern applies even in those cases where the last constituent is a formal compound, cf. (27e, f).

(27) Secondary stress in multiple compounds (Standard Sw – Finland Sw)
a. folk +dans (folk dance) + lag (team) > folkdans lag – folkdans lag
b. pojke (boy) + lands +lag (national team) > pojklands lag – pojk land slag
c. fader +skap (paternity) + mål (lawsuit) > faderskaps mål – faderskaps mål
d. värde (value) + föremål (object) > värdeföre mål – värde fö remål
e. hus +håll (household) + ar.bete (work) > hushållsar be te – hushålls ar bete
f. skida (ski) + para.dis (paradise) > skidpara dis – skid pa radis
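The two placements of secondary stress contrasted in (27) can be stated schematically. The sketch below works at the level of constituents rather than syllables and rests on my own assumptions: each lexeme is given as a list of its (formal) constituents, the labels "standard" and "finland" are ad hoc, and the function name is hypothetical.

```python
def stress_sites(anterior, posterior, variety):
    """Return the constituents hosting primary and secondary stress."""
    primary = anterior[0]             # first constituent of the compound
    if variety == "standard":
        secondary = posterior[-1]     # last constituent, whatever the structure
    elif variety == "finland":
        secondary = posterior[0]      # first constituent of the posterior lexeme
    else:
        raise ValueError(f"unknown variety: {variety}")
    return primary, secondary

# (27b) pojke + lands+lag and (27d) värde + före+mål
pojklandslag = {v: stress_sites(["pojk"], ["lands", "lag"], v)
                for v in ("standard", "finland")}
vardeforemal = {v: stress_sites(["värde"], ["före", "mål"], v)
                for v in ("standard", "finland")}
# (27a) folk+dans + lag: a simplex posterior lexeme gives the same site in both
folkdanslag = {v: stress_sites(["folk", "dans"], ["lag"], v)
               for v in ("standard", "finland")}
print(pojklandslag, vardeforemal, folkdanslag)
```

The two varieties diverge only when the posterior lexeme is itself complex, which matches the distribution of identical and distinct forms in (27).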

Bruce (2007:118) points out that non-tonal Germanic dialects (i.e. Swedish spoken in Finland, Danish and German) seem to share a different rule, which sees to it that secondary stress be associated with the first constituent of the posterior lexeme. This practice assists the identification of the compound’s semantic content, which is not the case in Standard Swedish, cf. (27b and d). All this conveys the impression that in the dialects where postponed secondary stress occurs, stress serves to mark the domain to which the connectivity of the tonal contours holds. In non-tonal dialects it would be prosodically unmotivated for the σ bearing secondary stress not to reflect the constituent structure of the compound.

Certain lexicalized compounds such as riksdag (Parliament), blåbär (blueberry) and måndag (Monday) have lost their secondary stress and are consequently prosodic minimal words.⁴⁰ When there is a very strong semantic fusion between two constituents, this cohesion can result in the demotivation of the compound (cf. Liberman (1982:15)), which in turn has prosodic consequences such as stress demotion and accent shift, see 2.2.4.

In certain distinct categories we can observe that the expected primary stress is deleted and the compound surfaces with final stress instead. Such constructions include numerals (femtiofyra (54)), points of the compass (nordväst (northwest)), given names (Per-Erik, Karl-Bertil), geographical names (Örebro, Göteborg) etc., cf. Bruce (1998:37). Interestingly, in some Northern Swedish varieties the distribution of final stress in compounds is not restricted to such well-defined categories. According to Bruce (1982) the use of final stress is most common in Luleå, where the data obtained from 64 informants showed that final compound stress was in fierce competition with standard accentuation. In nearly 60% of the cases, male⁴¹ informants produced compounds with final stress as long as the stress of the anterior constituent did not fall on the last σ. Stress clash, on the other hand, as in the nonsense compounds banankust (banana coast) and blodprins (blood prince), inhibited final stress in all but one token.

It is noteworthy that while stress clash apparently blocks final stress in word-level phonology, at the phrasal level it seems to have the opposite effect. Riad (2013:138f) discusses the Swedish rhythm rule, which “reduces stress in a word that typically immediately precedes the head of the phrase”. Examples include kapten Nilsson and Karl-Olof⁴², which thus surface with a kind of final stress. Riad (ibid) points out that “reduction does not only apply in case of stress clash” as exemplified by the proper name Eric Ericsson, whose first element exhibits both loss of stress and phonological shortening.

A rather similar accentual pattern emerges in the realm of phrasal verbs. When a verb and a following particle combine in Standard Swedish to give rise to an item with a new meaning (e.g. tycka (think) + om (about) > tycka om (to like)), the primary stress of the verb is demoted and the particle takes over as tone-bearing unit. The anacrusis preceding the particle indicates that the two words belong to the same intonation-group. The rhythmic pattern encountered here (x…X) is typically found in phrases (e.g. en stor familj (a big family), ett stort ting (a big thing)), i.e. in constructions where the meaning is easily derivable from the two constituents. In compounds, however, where the resulting meaning is not always straightforward (e.g. storfamilj (extended family), Storting(et) (the Norwegian Parliament)), we usually find a reversed stress pattern as discussed above. Now given that the semantics of phrasal verbs is also rather obscure, we could expect them to display the rhythmic pattern of compounds instead. This is, in fact, what we find in a number of Scandinavian dialects including Standard Norwegian. Phrasal verbs like komme in (to come in) and gå ut (to go out) behave like compounds both rhythmically and tonally. Abrahamsen (2003:197f) reports a curious fact about the Sunnmøre dialect in Norway, where initial stress is allowed if and only if the verb is monosyllabic. In all other cases, the phrasal verb in question behaves as in Standard Swedish. This basically means that final stress is blocked by stress clash just as in northern Swedish compounds. In the light of these patterns, it seems likely that reduced initial stress in kapten Nilsson and Karl-Olof should not be attributed exclusively to stress clash.

40 Secondary stress is often reintroduced upon suffixation as in månµda:gar (Mondays), midµda:gar (dinners).

41 Women were much more inclined to adopt the socially more accepted standard pronunciation.
42 Bruce (1998:37) classifies names like this as compounds and not as phrases.

Default stress

Having discussed some productive alternations of stress patterns, in the remainder of this section I will address another problem, which renders the study of stress assignment phonologically relevant. Is it plausible to assume that there is a default stress pattern in Swedish? From a historical perspective, it might be inferred that all native words of Gmc origin are stressed on the initial σ of the root, while words that do not conform to this pattern are arguably foreign. This is the factual basis of the belief that Swedish words in general have initial stress. This conviction is even reflected in the earliest generative accounts of word stress. We can think of Fretheim (1969), who proposes a partition of the lexicon along the binary feature [±native], which implies that the stress of [+native] words can be derived with the help of a simple rule, while that of [-native] words is lexically specified. An obvious argument against this hypothesis is that most [-native] words “have no taste of foreignness to the native speaker” (Rice (2006:1193)). Given that [-native] words are usually stressed on the final or penultimate (sometimes on the antepenultimate) σ and that the [+native] part of the vocabulary is restricted to a maximum of three syllables, all Swedish lexical items can be accommodated in a three-syllable window starting at the right edge of the word. This is reflected in Riad’s (2013:200) general stress rule, which “stress[es] the rightmost available syllable in the prosodic word”. What this essentially means is that the heavy influx of Romance loanwords is claimed to have transformed a left-edge aligned language into a right-edge aligned one. It follows from this assertion that an “etymologizing view of the lexicon is not going to be able to express [phonological and morphological] patterns in any principled way” (ibid: 229).

Bruce (1998:33f) also dismisses the “prejudice” of default initial stress. He points out that initial stress in Swedish is not as typical as it is often made out to be. He provides a comparison between English and Swedish based on circa 30 parallel words and finds that stress in English regularly comes earlier than in the corresponding Swedish word. This observation also applies to recent loanwords such as backup, stand-by and time-out. The opposite trend is found in a negligible number of words including juli (July), Japan (id) and schampo (shampoo).

Even if the differences appear to be conspicuous, I would like to argue that such a static comparison of stress patterns does not necessarily lead to valid conclusions. For instance, a comparison of Finnish and Swedish onsets would certainly indicate that branching structures are exceptional in the former, but well-established in the latter. However, such observations are not very informative, since we are not told to what extent the two languages resort to repair strategies synchronically in order to exclude illicit formations.
Such changes are clear indicators of what structures the language seeks to avoid, which is not always reflected in static comparisons. Nonetheless, stress is undoubtedly located earlier in English than in Swedish in numerous parallel words. It has also been observed that Swedish has a somewhat stronger tendency for late stress than Danish and Norwegian, cf. Uneson (2012). Is this a strong argument against default initial stress? Not necessarily. The first thing to notice is that Bruce’s material conveys the false impression that Swedish tends to move the main stress closer to the right edge of the word. As a matter of fact, virtually all of his Swedish examples exhibit a German stress pattern. This is certainly no coincidence given that most of these lexical items reached Scandinavia through German-speaking areas. Therefore most of these observed differences between English and Swedish cannot be seen as indicators of change.

Yet how can we account for final stress in English loans such as make-up? First of all, we may refer to the fact that make-up (similarly to the other examples given above) has the structure of a phrasal verb, which most Swedes certainly have no difficulty noticing. If the structure of the loanword evokes the prosodic characteristics of phrasal verbs, then final stress is what we expect. Note that make-up is pronounced with initial stress in Norwegian, similarly to phrasal verbs. However, it has to be acknowledged that four of Bruce’s examples (syntax, huligan, nylon, tomahawk) exhibit initial stress in German, final in Swedish. As far as tomahawk is concerned, we can observe that its last σ is pronounced with an unstressed long V in German. Now given that duration is the most consistent stress correlate in Swedish (cf. Fant & Kruckenberg (1994:126)), it is not surprising that the long V was mistaken for stress. Stress shifts can in certain cases also be ascribed to the attraction exerted by lexical items with identical endings.
Thus huligan is likely to have been remodelled on the basis of Amerikan, anglikan, colombian, charlatan, sultan etc. It is certainly not coincidental that hooligan surfaces with initial stress in Norwegian and Danish, where the corresponding –an words are either initially stressed, e.g. sjarlatan, sultan, or have a different ending, e.g. Amerikaner, anglikaner etc. Although we can postulate the same mechanism for nylon (cf. aceton, antimon, axon, amason, saxofon etc.), it is much more difficult to explain why it is pronounced with initial stress in both Danish and Norwegian. The stress patterns of individual words are indeed so varied (cf. sheriff, with initial stress in N but final stress in D and S; kolibri, with initial stress in N and S but final stress in D; salat, with initial stress in S but final stress in D and N) that it is futile to seek an explanation for every single deviation.

A partition of the lexicon

The discussion so far has provided some tentative explanations, but it has clearly not ruled out the possibility that Swedish may, after all, be a right-aligned language. Let us for a moment grant the proposition that loanwords indeed have a tendency to surface with final stress, even if they were pronounced differently in the source language. Interestingly, we also find that certain words with an originally foreign stress pattern often shift primary prominence to their first σ. This can be observed in standard Swedish, where words such as rädisa (radish), presenning (tarpaulin), persilja (parsley) exhibit an alternating stress pattern, cf. Riad (2013:145). This process is even more common in Norwegian (especially in the Trøndelag dialect), where e.g. betong, protest and banan often surface with initial stress, cf. Kristoffersen (2000:165).

These two opposing tendencies (to assign late stress to certain loanwords and initial stress to others) in fact make sense if we return to the previously criticized partition of the lexicon along the binary feature [±native]. If newly acquired loanwords indeed have a tendency to enter the language with final stress, this may be interpreted as a way of marking foreign words by prosodic means. The assignment of final stress is, as we have seen, often hindered by various structural or psychological factors such as analogical pressure exerted by other lexical items. Then, when some time has elapsed and the word no longer feels foreign, it may undergo prosodic assimilation as well, unless this is blocked by some of the above-mentioned factors. Assimilation can take place at different levels. As we will see in the next section, such stress adjustments are as a rule accompanied by tonal changes as well. This proposal indicates that when [±native] is used to describe certain lexical items, its use does not necessarily reflect the actual etymology of a given word. Moving main stress to the first σ thus manifests that the word has undergone assimilation and is now labelled [+native]. I am not aware of any counter-examples where a word with original initial stress has been assigned final stress after having spent some decades or so in the language. In short, we can claim that initial stress is still felt as default or unmarked in Swedish, and this apparently does not have to be reflected by the most economically formulated main stress rules. What counts is the direction of the changes we have observed. This is essentially an analogical process that serves to dispose of prosodic marks. If we view this phenomenon in terms of analogy, it also becomes clear why these assimilations are so slow, sporadic and unpredictable.

2.2.4. The tonal distinction

Sw&No⁴³ are predominantly intonation languages but they also make a limited use of lexical tone, cf. Cruttenden (1986:10f). This makes them quite unique among the languages of Europe, where, apart from Lithuanian, Serbo-Croatian and the Limburgian dialects of Dutch (cf. Haugen (1967:185), Boersma (2013)), tone is exclusively used for intonational purposes. However, Sw&No polysyllables with non-final stress can be associated with one of two distinct tonal contours. The hundreds of words distinguished this way are usually referred to as tonal minimal pairs and can be exemplified with Sw. ¹regel (rule) and ²regel (bolt). The two different tonal patterns indicated here by superscripts are generally described as tonelag in the Norwegian and ordaccent in the . This practice has led to a slight terminological inconsistency in English publications, where the same concept is alternately referred to as word tones (Haugen (1967)) and word accents (Bruce (1977)).

Interestingly, the tonal opposition is absent in certain peripheral areas of the Swedish-Norwegian dialect continuum. These include an area surrounding (but not including) the Norwegian city of , the of Finland and many northern varieties in Troms, Finnmark, Lappland etc, where the lack of the tonal distinction can probably be attributed to substratal influence from Finno-Samic languages, cf. Kristoffersen (2000:234).

43 The tonal opposition is extant even in some to the south of the stød isoglosses, cf. Ejskjaer (2003:28).

2.2.4.1. The nature of the opposition

Although the actual phonetic realization of the accents varies considerably from dialect to dialect, their distribution (in simplex words) is remarkably stable. As Malmberg (1959:197) points out, in extreme cases this state of affairs can entail that a word pronounced with Accent 1 in one dialect is phonetically identical to another word pronounced with Accent 2 in another dialect. In view of this observation, it seems essential to establish what the distinction between the two accents consists in and whether this contrast can be utilized to describe each and every tonal dialect. There are two competing accounts to consider, as far as the nature of the opposition is concerned. Those who maintain that the opposition is equipollent argue that the main phonetic difference between the accents lies in F0 peak timing. Consider the following description of East Norwegian.

Wherever we have an accent 1, its stress falls near the low point of the curve; in accent 2, the stress comes earlier, and usually includes the preceding high point, while the low point follows the main stress.
Haugen & Joos (1952:190)

Bruce (1977:49) also found “an earlier timing for accent I than for accent II in connection with the stressed syllable” in Swedish. The difference in terms of the timing hypothesis can be expressed as (H)L*H for Accent 1 and as H*LH for Accent 2, where the asterisk marks association with the stressed σ. The alternative hypothesis claims the opposition to be privative assuming that Accent 1 can be equated with stress, while Accent 2 can be identified with the combination of a tonal feature followed by stress i.e. Accent 1, cf. Haugen (1967:189). Given that stress in focal position signals prominence and sometimes even a prosodic boundary, Riad (2003:92) postulates that the Accents are made up of the constituents in (28) below. Note that we get the same tonal contour if we attach a boundary tone to Bruce’s tonal transcription above. The data in (28) also reflect the traditional conviction that Accent 2 is the marked member of the opposition.

(28) The building blocks of a privative opposition in Standard Swedish

            Lexical tone   Prominence tone   Boundary tone
Accent 1    -              LH                L
Accent 2    H              LH                L
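The composition in (28) can be illustrated with a minimal sketch. It simply concatenates the three building blocks of each accent; the names `BLOCKS` and `contour` are illustrative assumptions and not part of Riad's formalism.

```python
# Building blocks of (28); the empty string stands for the absent
# lexical tone of Accent 1. All names here are illustrative assumptions.
BLOCKS = {
    1: {"lexical": "", "prominence": "LH", "boundary": "L"},
    2: {"lexical": "H", "prominence": "LH", "boundary": "L"},
}


def contour(accent: int) -> str:
    """Concatenate lexical, prominence and boundary tone into one contour."""
    blocks = BLOCKS[accent]
    return blocks["lexical"] + blocks["prominence"] + blocks["boundary"]


if __name__ == "__main__":
    for accent in (1, 2):
        print(f"Accent {accent}: {contour(accent)}")
```

Under this reading the contour of Accent 2 is the contour of Accent 1 preceded by a lexical H, which is what makes the opposition privative.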

As to which account is more reliable, Kristoffersen (2006a:158) points out that “the two hypotheses are [often] based on data from different varieties, [so] it is conceivable that both are correct.” He then goes on to argue that two of the three East he investigates are best described with the help of the timing hypothesis, a claim that he supports with instrumental measurements. Nevertheless, from a (functional) phonological perspective the privativity approach seems more appropriate, since it attempts to segment the tonal contour and to assign a function to the individual constituents. It is apparently easier to slice the F0 of such dialects where Accent 2 is realized with a double-peaked contour. The dialects of Malmö and Bergen, for instance, whose Accent 2 is realized as L*HL (cf. Riad (2003:99)) are thus perhaps more liable 44 to an analysis in terms of the timing hypothesis. The lack of consensus concerning the description of the tonal opposition should perhaps be interpreted as evidence for diversity: the opposition is possibly privative in some dialects, while it may be equipollent in others. Yet this possibility is rarely entertained, since it would deprive us of a unified description of the tonal dialects of Scandinavia such that typological studies would be rendered clumsy and cumbersome. An approach that renounces such unified descriptions must also be able to account for the assumed historical transition leading from a privative to an equipollent opposition (or vice versa). Kristoffersen (2006b:108) attempts to reconcile the two opposing views in an interesting proposal. He assumes in line with the timing hypothesis that “the input melody L*H% is the same for both accents” in East Norwegian. However, the surface difference is accounted for by an initial, epenthetic H in Accent 2, which means that the output of the tonal grammar can be described in terms of privativity. In what follows I will tentatively adopt the privative approach of (28) for standard Swedish. The arguments against the timing hypothesis are twofold. First of all, as we have mentioned above it is arguably less phonological than its competitor.
Secondly, it is not appealing to assume identical melodies for the two accents in a framework that makes use of only two tones, H and L. Given the validity of the Obligatory Contour Principle (cf. Goldsmith (1976:63)) we do nothing but state the obvious if we claim that two different tonal contours are in fact identical with a difference in timing.

2.2.4.2. Word accents in non-focal position

One of the most important implications of our choice between the two hypotheses concerns the opposition in relation to stress. Recall that (28) equates Accent 2 with a lexical tone followed by stress (i.e. Accent 1). This approach suggests that in unstressed position the tonal contrast (HLHL – LHL) is reduced to H – Ø, but it is not lost. Needless to say, the timing hypothesis indicates the exact opposite. While it is obvious that tone is associated with the head of the prosodic word, it is not equally straightforward what the effects of stress demotion are. Bruce (1977:9) remarks that it is “not entirely clear nor is there complete agreement as to what level of stress is necessary for the distinction to be maintained”. The answer to this question depends primarily on how many degrees of stress we postulate and on the nature of the prosodic hierarchy. As far as word level phonology is concerned, Kristoffersen (2000:141) assumes four levels of stress such that a σ can be assigned primary stress, strong secondary stress, weak secondary stress or no stress at all. Strong secondary stress results from stress demotion in compounds, while weak secondary stress arises when unstressed syllables on the left edge form “a stress foot in the shape of a moraic trochee”. Kristoffersen motivates the distinction between the two secondary stresses by pointing out that unstressed /e/ in initial syllables fails

to reduce to schwa under certain circumstances. Moreover, he adds that there is a “slight lengthening of the following onset consonant in cases where the initial syllable is open” (ibid: 163). This lengthening, however, seems optional and is “difficult to hear”. Given that the formation of weak secondary stress is blocked by peninitial stress or a preceding stressed σ, it is ultimately due to post-lexical rhythmic adjustments and is thus not relevant in word level descriptions. The rejection of this category implies that there are “no secondary stresses inside minimal prosodic words” (Riad (2013:138)). Although Haugen’s (1982:21) historical survey also assumes four degrees 45 of stress, it is clear that this approach serves phonetic rather than phonological purposes. In Haugen (1967:188) he only distinguishes between three different kinds of syllables, which can be accounted for with reference to the binary features [stress] and [quantity] as indicated in (29) below. This can lead us to the claim that it is sufficient to postulate three levels of stress in word level descriptions.

44 It might be a sheer coincidence, yet it is interesting to note that most linguists appear to be inclined to argue in favour of the very hypothesis that lends itself best to describing the dialect in which they are based.

(29) Three degrees of stress (based on Haugen (1967:188))

            Primary   Secondary   Unstressed
Stress      +         -           -
Quantity    +         +           -
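The feature matrix in (29) amounts to a small lookup from two binary features to a degree of stress. The following sketch spells this out; the function name `stress_degree` is an assumption introduced for illustration, and the combination [+stress, -quantity], which (29) does not list, is rejected rather than guessed at.

```python
def stress_degree(stress: bool, quantity: bool) -> str:
    """Map the binary features [stress] and [quantity] of (29) to a degree of stress."""
    if stress and quantity:
        return "primary"
    if not stress and quantity:
        return "secondary"
    if not stress and not quantity:
        return "unstressed"
    # [+stress, -quantity] is not among the three syllable types in (29).
    raise ValueError("combination [+stress, -quantity] is not listed in (29)")


if __name__ == "__main__":
    print(stress_degree(True, True))    # primary
    print(stress_degree(False, True))   # secondary
    print(stress_degree(False, False))  # unstressed
```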

The phonological constituents that make up a purportedly universal prosodic hierarchy are listed in Selkirk (1996) as follows: utterance >> intonational phrase >> phonological phrase >> prosodic word >> foot >> syllable. Various further sublevels can be assumed on a language-specific basis, such as the distinction between minimal and maximal prosodic words in Swedish, cf. Riad (2013:ch5). Myrberg (2010:33) describes Stockholm Swedish with a hierarchy very similar to Selkirk’s, though with a somewhat different terminology: intonation phrase (IP) >> phonological phrase 46 (PhP) >> accent phrase (AP) >> prosodic word (PrW) etc. Riad (2013:267f) subsumes the AP and the maximal PrW under one label, a practice that we will adopt in the ensuing discussion. The number of constituents above the PrW (i.e. the domain of the tonal distinction) is of potential importance, since it should reflect what degree of stress a PrW in non-focal position is expected to have. Cruttenden (1986:21ff) argues that four degrees of stress are needed to account for varying degrees of prominence in intonation-groups in English. He assumes that primary and secondary stress involve pitch prominence, while tertiary stress is produced principally by length/loudness and is unaccented, similarly to unstressed syllables. Four degrees of stress is arguably a lot in a cross-linguistic perspective, yet stress-timed languages such as English, whose rhythm has an isochrony based on stresses, tend to operate with more distinctions of stress than σ-timed languages such as French (cf. ibid). Given that “there are clear indications of tendencies towards isochrony in Swedish” (Strangert (1978:107)), we can assume four degrees of stress for Swedish as well, on a par with English.

(30) Prominence levels in the prosodic hierarchy

- Hur ofta hade dina barn öroninflammation när de gick i skolan? (How often did your kids have inflammations of the ear when they were going to school?)
- Under de första åren var det ganska ofta. (In the first years it happened rather often.)

45 Tertiary stress is assigned to derivational suffixes such as –ing and –are.
46 Cruttenden (1986:ch3) refers to Myrberg’s phonological phrase as an intonation-group.

Primary     IP    Under de första åren var det ganska ofta.
Secondary   PhP   Under de första åren var det ganska ofta
Tertiary    PrW   Under de första åren var det ganska ofta
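The logic behind the partition in (30) is that a word heading several constituents surfaces with the stress level of the highest one. A minimal sketch of this rule follows. The mapping of constituents to stress levels is taken from (30), whereas the function name and the headship sets given for första and ofta are assumptions made for the sake of illustration.

```python
from typing import Iterable

# Constituents ordered from highest to lowest, with the stress level
# assigned to their heads in (30).
HIERARCHY = [("IP", "primary"), ("PhP", "secondary"), ("PrW", "tertiary")]


def surfacing_stress(heads_of: Iterable[str]) -> str:
    """Return the stress level contributed by the highest constituent headed."""
    headed = set(heads_of)
    for constituent, level in HIERARCHY:
        if constituent in headed:
            return level
    return "unstressed"


if __name__ == "__main__":
    # Assumed headship: första heads a PrW and a PhP; ofta additionally heads the IP.
    print(surfacing_stress({"PrW", "PhP"}))        # secondary
    print(surfacing_stress({"PrW", "PhP", "IP"}))  # primary
    print(surfacing_stress({"PrW"}))               # tertiary
```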

The example and the partition in (30) are taken from Myrberg (2010:33) with some slight modifications. The head of each constituent is indicated in bold or in italics. Now it is presumed in the table above that the head of an IP is assigned primary (in bold), the head of a PhP is assigned secondary (in bold and italics), while the head of a PrW is assigned tertiary (in italics) stress. Needless to say, the highest level constituent wins out, which means that in the example above första surfaces with secondary and ofta with primary stress. If we follow Cruttenden (1986:21) in claiming that tertiary stress is typically not characterized by pitch, then we are led to conclude that the tonal distinction is not maintained at the level of the PrW and “is realized only in strongly stressed position”, cf. Elstad (1980:393). This conclusion, however, is rejected by most writers on the subject. Bye (2004:4) claims that “lexical tone is restricted to the head of the Accent Phrase”, which we have come to equate with the PrW. Riad (2013:268) is also of the opinion that “[e]ach maximal prosodic word is defined by carrying a word accent, which may be supplanted by a focus accent [i.e. primary or secondary stress in (30)] if so required by information structure”. The distinction between word accent and focal accent goes back to Bruce (1977:16) according to whom the tonal opposition “may be maintained, even if the word in question has no sentence accent”. The view is echoed by Hansson (2003:21) too, who relying on Bruce’s research declares that “the word accent distinction is maintained in non-focal position”. How can we then resolve this glaring contradiction between mainstream research and our conclusions in connection with (30)? The first thing to notice is the subtle difference in wording between Hansson and Bruce: “the distinction is / may be maintained in non-focal position”. The word accent distinction is arguably absent in unstressed syllables. 
The problem we are faced with is whether it is always maintained under tertiary stress. I would like to indicate that the question must be answered in the negative. While it is apparently possible to maintain the distinction under tertiary stress, a speaker who restricts the occurrence of the opposition to focal positions (i.e. to the heads of IPs and PhPs) is not judged by native speakers as having a foreign accent or breaking the rules of tonality. This dual aspect of linguistic reality (namely that certain features are possible but not necessary) may call to mind the concept of minimal vs. maximal systems proposed by Bertil Malmberg.

The minimal system is the common denominator for all speakers and is consequently the necessary minimum for a speaker to be understood. The maximal system comprises all existing distinctive possibilities, even the weakest and the least utilized. Malmberg (1962:142)

If a speaker maintains a phonemic distinction under unfavourable phonetic conditions (e.g. in unstressed syllables), then it arguably belongs to the minimal system of the language. This line of reasoning could theoretically be extended to argue that the accents (due to their low stability and dubious functionality) should be excluded from minimal descriptions. However, as Riad (1998a:79) points out, “the pitch accent opposition is phonologically very real”, which is probably due to the fact that it is intricately intertwined with nearly all major aspects of Sw&No phonology with stable distribution across dialects in large chunks of the vocabulary. The solution I would like to propose is that a minimal description of standard Swedish phonology includes the tonal opposition in focal positions but not under tertiary stress in the head of a PrW. This is the necessary minimum for a language learner to acquire. Any further complication belongs to the maximal system of the language and is irrelevant as far as the core nature of the opposition is concerned. The discussions to follow in later sections will take this minimal system as a starting point. The position we have adopted is neatly summarized in the following quotation.

Since the presence of one stressed syllable is a prerequisite for accent distinction, a word with inherent Accent 1 or 2 will lose its characteristic pitch when it loses its stress in the sentence. Oftedal (1952:158)

2.2.4.3. No lexical tones required?

It has already been mentioned that the tonal distinction is absent in a handful of Norwegian and Swedish dialects and those varieties where tonal minimal pairs do occur restrict its use in minimal systems. Moreover, the meaning of a real-life utterance seldom (if ever) hinges exclusively on whether the accents are used correctly or not. While these details insinuate that the functional load of the opposition is negligible, a large number of tonal minimal pairs have been collected in both languages, which clearly signals the opposite. Haugen (1967:185) suggests that there are probably more minimal pairs for tone than for quantity and stress, still “the tones are eminently dispensable”. Elert (1972) found more than 350 tonal minimal pairs in Swedish, while the corresponding figure for Norwegian is above 2400 (Kloster-Jensen (1958)), which makes the opposition more than marginal. This huge difference between the two languages is mainly due to the neutralizing effect of Norwegian V reduction and partly to the definite neuter of nouns, in which word-final /t/ is not pronounced (cf. Sw. 1ruset (the intoxication) – 2rusa (to rush), 1läser (read) – 2läsare (reader) with No. 1/2 ruse and 1/2 leser). The contradiction between these long lists of minimal pairs and the opposition’s disputable functionality can be easily resolved if we recognize that most items on these lists differ from their alleged pair with respect to morphological structure, syntactic category etc. and should be treated at best as near-minimal pairs. This is reflected in Elert’s and Haugen’s endeavour to devise a set of rules, which can account for the distribution of the accents. Haugen (1967:186) asserts that “in most contexts the use of tone is either predictable or irrelevant except as an indicator of in-group membership on part of the speaker”, which suggests that “once a toneme, always a toneme” might not be an applicable principle in Sw&No.
In spite of all this, Elert (1972:151f) declares that “any description of tonality as generated by a few simple rules will necessarily have to account for a large number of exceptions”, which is why the opposition is regarded as phonemic in the first place. Morén-Duolljá (2013:246) goes a step further and argues that “North Germanic tone distributions are determined solely by rule” and proposes to discard the notion of lexical tones altogether. By investigating the distribution of tone only in underived Swedish nouns the author excludes a large number of near-minimal pairs such as 1/2 buren (the cage – carried (past part.)) and 1/2 gripen (the griffin – seized (past part.)). Word pairs like 1and+en (the duck) and 2ande+n (spirit) can be excluded with respect to the structural differences they display. As far as the remaining few “proper” minimal pairs such as 1/2 axel (shoulder/axle), 1/2 kappa (Greek letter/coat), 1/2 regel (rule/bolt) etc. are concerned, the author suggests that they are “either not exceptional at all or are exceptional in some way that does not involve underlying tones” (ibid: 199). Tone assignment is then claimed to be a function of the prosodic structure, which makes it completely regular. Lexical tones are thus not required in the UR. The exact mechanism is explained as follows. Assuming a prosodic hierarchy, where IP >> AP >> PrW >> ft >> σ >> µ, the occurrence of Accent 2 is bound to the occurrence of a recursive foot (a structural uneven trochee), which we associate with the post-stress σ of a given word. In those cases when the post-stress σ is associated with the AP instead, the resulting contour is Accent 1. It is presumed that the formation of recursive feet is blocked by lexical stress, i.e. when stress falls on a σ other than the penultimate and by an epenthetic V in the post-stress σ. This explains the tonal contour of 1kamera (camera), ba 1naner (bananas), 1regel (rule) and correctly predicts 2motor (engine), 2damer (ladies), 2regel (bolt) etc. At the same time, a number of factual inaccuracies, false predictions and a disturbing degree of arbitrariness make Morén-Duolljá’s approach less than attractive. First of all, the generalization that “[t]risyllables with penultimate stress and a light first syllable always have accent 1” (ibid: 228) simply does not hold as indicated by ve 2randa, pro 2fessor, Sa 2hara etc. If these words are subject to the same mechanism as pi 1ano, then the theory is unable to account for their melody. A further problem is presented by the assumption that the plural suffix –er with an underlying V always triggers stress shift. This is defied by zu 1cchinier, 1hobbyer, sa 1farier etc. where the V of the suffix is by no means epenthetic. Stress shift associated with –er is restricted to a well-defined subset of nouns ending in –or (cf. traktor, motor, professor). It would be unfortunate to generalize an exceptional pattern at the expense of a wide range of other nouns. It is not true either that antepenultimate stress automatically results in Accent 1 as exemplified by 2helvete (hell), 2människa (person) or 2hammare (hammer) 47. The author derives Accent 1 in hobby by attaching the final σ to the AP, which he admits to be exceptional (cf. ibid: 244). The argumentation is clearly circular: Accent 1 is conditioned by the attachment of the final σ to the AP, while the attachment of the final σ to the AP is conditioned by Accent 1. The approach exhibits similar methodological shortcomings in connection with epenthesis as well.
The author suggests that what is usually subsumed under the label 3rd declension (cf. sko+r (shoes), slav+(e)r (Slavs), månad+(e)r (months), katt+er (cats), idé+er (ideas)) is actually two declensions: the first class is combined with underlying /r/, while the other with /er/. As the distribution of the two suffixes is not subject to phonotactic considerations, epenthesis is called for when /r/ is attached to a word ending in a C. Reliance on epenthesis in slaver as opposed to katter is as arbitrary and circular as the explanation of Accent 1 in hobby above. Eliasson (1972:181) points out that “[a]part from the argument from tonal accent, it is completely arbitrary and unpredictable” when to postulate an epenthetic V. On the other hand, it seems justified to assume epenthesis for words that display an alternating pattern, cf. 1fågel (bird), 2fåglar (birds). This is, however, not the case in slaver. In summary, it is quite misleading to claim that the tonal melodies are completely regular and predictable (thus absent from the lexicon) when the factors that condition their distribution are irregular and arbitrary. I would like to argue that we are not entitled to relocate the lexical burden of the accents to some fuzzy and disputed theoretical concepts such as the AP or epenthesis. Such hocus-pocus does not make the distribution of the accents more regular. Whatever theoretical tools we resort to, the language learner still has to memorize the accentual pattern of certain words.
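The false predictions discussed in this section can be made explicit with a small sketch. It encodes a deliberately simplified reading of the rule as summarized above (Accent 1 when stress is not penultimate, when the post-stress V is epenthetic, or in trisyllables with penultimate stress and a light first syllable; Accent 2 otherwise) and compares its output with the accents cited in the text. The class `Word`, the function `predicted_accent` and the feature coding of the individual words are assumptions introduced for illustration; this is not Morén-Duolljá's own formalism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    form: str
    syllables: int
    stressed: int              # position of the stressed syllable, counted from the left
    attested: int              # accent cited in the text (1 or 2)
    epenthetic: bool = False   # epenthetic V in the post-stress syllable (analysis-dependent)
    light_first: bool = False  # light first syllable (relevant for trisyllables)


def predicted_accent(w: Word) -> int:
    """Simplified reading of the rule: Accent 2 requires a recursive foot."""
    penultimate = w.syllables - 1
    if w.stressed != penultimate:
        return 1  # lexical stress blocks the recursive foot
    if w.epenthetic:
        return 1  # epenthetic post-stress V blocks the recursive foot
    if w.syllables == 3 and w.light_first:
        return 1  # the generalization quoted from (ibid: 228)
    return 2


LEXICON = [
    Word("kamera", 3, 1, attested=1),
    Word("regel (rule)", 2, 1, attested=1, epenthetic=True),
    Word("regel (bolt)", 2, 1, attested=2),
    Word("motor", 2, 1, attested=2),
    Word("piano", 3, 2, attested=1, light_first=True),
    Word("veranda", 3, 2, attested=2, light_first=True),
    Word("helvete", 3, 1, attested=2),
    Word("hammare", 3, 1, attested=2),
]

mismatches = [w.form for w in LEXICON if predicted_accent(w) != w.attested]

if __name__ == "__main__":
    print(mismatches)
```

On this coding the rule handles kamera, motor, piano and both readings of regel, but it wrongly assigns Accent 1 to veranda, helvete and hammare. Note that the correct result for regel (rule) depends entirely on the epenthesis flag, which is exactly the kind of analysis-dependent stipulation criticized above.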

47 The endings –a and –are are sometimes analyzed as suffixes. It is not clear, however, whether this approach is applicable to kamer-a and whether it is reasonable to postulate bound morphemes, which only occur in combination with one single suffix, cf. hamm+are.

2.2.4.4. The problem of markedness

There seems to be general agreement that the two members of the tonal opposition exhibit certain disparities, which lend themselves to a description in terms of markedness relations. However, due to the multiple senses associated with the concept it is far from being settled which accent should be taken as the marked member of the opposition. Traditional accounts have for a long time argued that Accent 2 is marked in relation to Accent 1 (cf. Haugen & Joos (1952:194), Haugen (1967:189), Elert (1972:151), Liberman (1982:18), Riad (2000:261) etc). This position has recently been questioned by Lahiri & al. (2005), Kristoffersen (2006a, b), Wetterlin (2007) etc, who argue that distributional (and perhaps even representational) generalizations can be captured in a much more straightforward fashion if we assume marked status for Accent 1. Let us first inspect the arguments in favour of the traditional account. In mainstream accounts, complexity has been the main culprit for taking Accent 2 to be the marked member of the opposition. A privative approach such as the one in (28) indicates that Accent 2 is phonetically more complex, since it is made up of a tonal gesture (a mark) followed by the tonal contour of Accent 1. However, it has already been hinted at that the privativity hypothesis presumably cannot account for a number of dialects such as and North-Gudbrandsdal in Norway, cf. Kristoffersen (2006a). Furthermore, the tonal make-up of Accent 2 is single-peaked in large areas of Western Norway, Southern Sweden, Dalarna etc (cf. the map in Riad (2003:97)), which makes it dubious whether it is always more complex than Accent 1. Yet even if we were to assume the privative approach for all tonal dialects of Scandinavia, complexity still could not be looked upon as a reliable markedness criterion. In section 1.3 we argued that the use of complexity as a markedness diagnostic is legitimate if and only if certain minimal structures are satisfied.
In the present context this may be interpreted as a requirement that all moras in stressed mono- and disyllabic domains be linked to a tone, which can ensure a simple one-to-one configuration. Even if disyllabic Accent 2 is phonetically more complex than Accent 1, this should not lead us to the conclusion that Accent 2 is more marked as suggested by Szigetvári’s (2006:444) markedness of the unmarked. Given that we have argued frequency to be the most reliable markedness criterion, we should look at the distribution of the two accents. Haugen (1967:189) points out that “Accent II is of a more restricted distribution, since it does not occur with monosyllables… [and] is virtually excluded from a number of word types which are of rather obvious German or Romance origin”. While this previous remark is descriptively accurate in itself, it would be highly erroneous to derive markedness relations on this basis, as Haugen does. First of all, Sw&No seem to exhibit a general ban on tonal crowding, which means that it is impossible to map the three tones of Accent 2 onto the two moras of a monosyllabic 48 word. Having established in section 1.3 that frequency counts should not be affected by such imposed neutralizations, we completely agree with Liberman (1982:19) according to whom “the behavior of monosyllables is no clue to markedness”. We can thus tentatively declare that monosyllables are not specified for Accent 1 , their tone is assigned by structure instead. Similarly, the occurrence of Accent 2 in words with final stress is also ruled out on structural grounds given that no post-stress σ is available for the boundary tone to attach to. In this case, however, we cannot speak of imposed neutralization because the structural possibilities of the language would allow for the stress to fall on other syllables. Final stress is a choice. 
Consequently, such cases of optional neutralization must be taken into consideration when it comes to frequency, which means that words with final stress are specified for Accent 1. Still, the opposition does not extend to this domain either. Interestingly, if we restrict our investigation to those contexts that are capable of hosting the opposition (i.e. words with at least one post-stress σ), it turns out that Accent 2 is more frequent (i.e. less marked) in the Swedish-Norwegian lexicon than its structurally simpler counterpart. The distributional evidence for this will be presented in section 2.2.4.5.
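The restriction to contexts capable of hosting the opposition can be sketched as a simple filter applied before counting. The sample below is a hand-picked toy list of words cited in this chapter, coded for number of syllables, position of stress and accent; it is an assumption made purely to illustrate the procedure, and its proportions say nothing about the actual lexicon. The names `SAMPLE`, `hosts_opposition` and `accent_counts` are likewise illustrative.

```python
from collections import Counter

# (form, number of syllables, stressed syllable counted from the left, accent)
# Toy sample of words cited in the text; NOT a representative frequency count.
SAMPLE = [
    ("flicka", 2, 1, 2),
    ("väska", 2, 1, 2),
    ("matta", 2, 1, 2),
    ("pojke", 2, 1, 2),
    ("motor", 2, 1, 2),
    ("kamera", 3, 1, 1),
    ("humor", 2, 1, 1),
    ("maskin", 2, 2, 1),  # final stress: no post-stress syllable
    ("gardin", 2, 2, 1),  # final stress: no post-stress syllable
]


def hosts_opposition(syllables: int, stressed: int) -> bool:
    """A word can host the opposition only if a post-stress syllable exists."""
    return stressed < syllables


def accent_counts(sample) -> Counter:
    """Count accents over the words that are capable of hosting the opposition."""
    return Counter(
        accent for _, syllables, stressed, accent in sample
        if hosts_opposition(syllables, stressed)
    )


if __name__ == "__main__":
    counts = accent_counts(SAMPLE)
    print(dict(counts))
```

The point of the filter is that words with final stress drop out of the comparison before the two accents are weighed against each other.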

48 In certain so-called circumflex dialects (e.g. in Oppdal) Accent 2 can be realized in monosyllabic words as well. However, Kristoffersen (2007:110) suggests that “the monosyllables carrying the circumflex accent are trimoraic”, so the ban on tonal crowding is probably not violated in circumflex dialects either.

The fact that loanwords are typically assigned Accent 1 is often adduced as further evidence for the marked status of Accent 2, since as Kristoffersen (2006b:106) puts it “loanword adaptation supposedly reveals unmarked patterns in a language”. Kristoffersen acknowledges that “this may be true”, but proposes that if different layers of the vocabulary are subject to different sub- then “accent 1 in loanwords can be interpreted as a sign of their foreignness” after all. This is, however, an unnecessary complication. We should instead dismantle the false assumption that foreign words are fully integrated when they enter a language. Loanword adaptation is not necessarily a single-phase 49 process. Initially, when its primary aim is to transform an illegal foreign form into a legal native one, it can be described as a set of “phonetically minimal transformations that apply during speech perception” (Peperkamp (2005:341-2)). As Wetterlin & al. (2007:356) point out “loans via could take accent 1 because this was the accent which was closest to the prosodic pattern of the donor language”. In the realm of segmental phonology it would be unthinkable for anyone to claim that the Turkish word basit (simple) adopted from Arabic or the Polish rendering [bi'telçi] of (The) Beatles would suggest that /b/ is the unmarked member of the /p/ - /b/ opposition in Turkish and Polish, which both exhibit final devoicing. Everyone is fully aware that in the first place loanword adaptation reflects phonetic similarity and not the markedness relations of certain oppositions in the borrowing language. It is remarkable that such insights do not always apply to prosodic problems. So the fact that most loanwords enter the language with Accent 1 does not reveal anything about markedness relations. The first adaptation phase makes sure that all aspects of a given loanword can be mapped onto some feature of the borrowing language. 
The borrowed item is now void of illicit forms and is ready to be used. It is well-known that high frequency of use preserves morphological irregularities (cf. suppletive forms), while it erodes phonological marks. Therefore if we could detect accent shift in a frequently used loanword, this would have to be interpreted as a phase two adaptation, whereby the nativized loanword is assigned native, default, unmarked tone. As a matter of fact, such examples are not hard to come by. In Sweden the Japanese word manga (cartoon) is typically pronounced with Accent 1 by the older generations, while younger Swedes, who have grown up watching Japanese manga series tend to use a nativized pronunciation with Accent 2 (Tomas Riad p.c.). Halvorsen (1983:352) mentions English purser , which was borrowed into Norwegian with Accent 1 before the 1940’s and is nowadays commonly pronounced with tone 2. We can find further examples if we recall the phenomenon of alternating stress patterns. The relevant words discussed in section 2.2.3 not only shift primary prominence to their initial σ (thereby acquiring what we have argued to be a default stress pattern), but they also exhibit a tonal shift to Accent 2. As will be evident from section 3.1.2, this nativizing process is also of importance in a historical perspective, since it can be invoked to explain unexpected prosodic patterns in a large number of lexical items. Interestingly, with the exception of demoted compounds (see 2.2.4.6) the contrary (whereby an Accent 2 word changes its tonal contour to Accent 1) is virtually unattested. Recently the question of word accents has been approached from a psycho- and neurolinguistic perspective as well. Roll & al. (2011:1712) compared “brain responses related to the processing of noun stems combined with Accent 1 and Accent 2 suffixes… [and found that] a stem associated with the low Accent 1 tone does not activate a plural suffix”. 
They also report the results of a response time experiment, in which the participants needed more time to correctly identify the grammatical tense of verbs with Accent 2 than corresponding verbs with Accent 1. The authors conclude that this state of affairs is due to the fact that “syllables with Accent 2 activate far more word forms than syllables with Accent 1”.

49 This can be exemplified by Hungarian, which typically adopts words like hardware, Arnold etc with [a]. This pronunciation, which conforms to the rules of , is then often further nativized to [ɒ].

Accent 2 is thus associated with an increased processing load, which makes it neuro-cognitively marked. It must be clear, however, that the fact that it is more difficult to anticipate word forms with Accent 2 has nothing to do with phonological markedness. As we pointed out in section 1.3, difficulty is an unreliable markedness diagnostic, which can lead to contradictory claims. This observed psycholinguistic difference between the two accents boils down to the fact that Accent 2 is far more common than Accent 1 and is consequently characterized by much lower information content. If we were to apply the claims about markedness in Roll & al. (2011) to segmental phonology, then we would have to insist that /j/ is a more marked onset in Swedish than /spj/ because the latter takes less processing time than the former (cf. [ja'hɑ:r ɛt'j…] with [ja'hɑ:r ɛt'spj…] (I have a…)). Such counter-intuitive claims would be unthinkable in the segmental domain.

2.2.4.5. Tonal distribution in simplex forms and the role of suffixes

So far it has been established that tonal distribution in polysyllabic words is the only reliable property of the accents that we can link to markedness. In what follows I will attempt to demonstrate that the scope of Accent 1 is more limited and its use should consequently be interpreted as an indicator of marginal, exceptional, special and foreign patterns. As far as uninflected forms are concerned, Haugen (1967:199) asserts that “[m]orphologically unmarked polysyllabic words may be either tonal or non-tonal; the latter indicates them as loanwords” (rule 8). This statement does not by itself provide evidence for the marked status of non-tonal i.e. Accent 1 words, given that it is not inconceivable that loanwords outnumber the native/nativized portion of a language’s vocabulary. Nevertheless, it clearly indicates that Accent 1 can be used to mark certain lexical items as foreign. It goes without saying that the analyst’s perception of the role of suffixes in accent assignment depends primarily on what he actually considers to be a suffix. The more generalizations we want to capture, the deeper we have to delve into morphological abstractions. Given the observation that simplex words with an identical ending (e.g. ma 1sk-in (machine), ga 1rd-in (curtain), vita 1m-in (id), tur 1b-in (turbine) or 2dat-or (computer), 2trakt-or (tractor), pro 2fess-or (id), re 2vis-or (auditor) or 2flick-a (girl), 2väsk-a (bag), go 2rill-a (id), 2matt-a (carpet) etc) often share a large number of morphological and prosodic characteristics, it seems natural to postulate that such endings are actually suffixes. This approach, which is adopted by Riad (2013), entails that the tonal information resides in the suffixes, which as a consequence can be used to predict the surfacing tonal configuration. As far as I can see, we face at least two major problems if we adopt such a theory.
First of all, we have to assume an implausibly high number of bound roots, which can only be combined with a single suffix. If we cannot delimit the occurrence of bound roots in some way or another, then nothing prevents us from analyzing words like 2månad (month) as 2må- nad on the basis of 2bygg-nad (building). This is quite an extreme position, which ignores the semantic restriction usually imposed on cranberry morphemes, namely that the morpheme that is appended to it has to be a free morpheme. My second objection concerns the belief that accent assignment is a function of suffixes. If it really were the case, then we would expect e.g. all words ending in post-tonic -or to surface with Accent 2. The exceptions to this generalization ( 1humor (id), 1terror (id), 1Viktor (given name) etc) then have to be analyzed as mono-morphemic, which clearly indicates the arbitrary nature of the matter. As it turns out, morphological segmentation is determined by prosodic facts, although it should be the other way round. In what follows I will now present Riad’s (2013) approach and then propose an alternative interpretation of the relevant data.

Riad’s approach (Accent 2 is marked)

The above-mentioned morphological issues have a direct bearing on the way frequency counts can be carried out to determine the markedness relations of the tonal opposition. In Swedish and in Norwegian a very large number of disyllabic words end in –a/–e or in schwa respectively, in which case the expected tonal outcome is Accent 2 (cf. Kristoffersen (2000:255f)). This implies that Accent 1 in words not conforming to this generalization should be looked upon as marked and exceptional. Assuming that words like 2flicka (girl) are mono-morphemic is crucial for maintaining the claim that Accent 2 outnumbers Accent 1 in uninflected / unsuffixed words and is thus the unmarked member of the opposition. As we have seen above, Riad (2013:236) gets around this assumption by arguing that the final V in words like 2flicka (girl) and 2pojke (boy) is a post-tonic inflectional suffix, which induces Accent 2. In fact, if we exclude such words from the discussion as Riad does, we will find that the majority of uninflected forms are assigned Accent 1. Among other things, this observation leads Riad (2013:244) to draw the conclusion that Accent 1 is unmarked and that “the tonal information indeed resides in the suffixes”.

Thus Riad’s (2013:ch11) approach posits that lexical Accent 2 is either lexically marked or induced by syllabic suffixes, while the assignment of post-lexical Accent 2 results from the formation of maximal prosodic words (i.e. compounding and certain derivations). It is further assumed that suffixes, which fall into two classes (weak and strong), have to obey the locality principle (i.e. they have to be post-tonic) in order to induce Accent 2. Moreover, it is claimed that a lexically represented tone is not always expressed, given that anacrusis inhibits the realization of lexical tone unless the given word features a strong suffix.
The examples in (31) below, some of which are borrowed from Riad (ibid), serve to illustrate how the author puts these theoretical assumptions to use to account for attested tonal patterns.

(31) Tonal distribution in the light of suffixes

Weak suffixes
a. 1katt (cat) + 2er (plural) > 2katter
b. gar 1din (curtain) + 2er (plural) > gar 1diner
c. 2dat-or (computer) + 2er (plural) > da 1torer
d. 1hund (dog) + 2ar (plural) > 2hundar
e. mi 1nister (id) + 2ar (plural) > mi 1nistrar
f. 1komp-is (friend) + 2ar (plural) > 1kompisar
g. 2månad (month) + 2er (plural) > 2månader
h. 1sann (true) + 2ing (noun) > 2sanning
i. bog 1ser(a) (tow) + 2ing (noun) > bog 1sering
j. 1mat (food) + 2a (infinitive) > 2mata
k. för 1akt (contempt) + 2a (infinitive) > för 1akta
l. 1glad (happy) + 2are (comparative) > 2gladare
m. speci 1ell (special) + 2are (comparative) > speci 1ellare

Strong suffixes
n. kamer + 2a (camera) + 2or (plural) > 1kamera, 1kameror
o. gorill + 2a (gorilla) + 2or (plural) > go 2rilla, go 2rillor
p. bog 1ser(a) (tow) + 2are (agent) > bog 2serare
q. ge 1lé (jelly) + 2ig (adjective) > ge 2léig
r. ner 1vös (nervous) + 2ing (agent) > ner 2vösing
s. ka 1ssör (cashier) + 2ska (female) > ka 2ssörska

While the violation of the locality principle results in Accent 1 with both weak and strong suffixes (31f, n), the presence of an unstressed initial σ inhibits the assignment of Accent 2 only in the weak class (31b, c, e, i, k, m). Accordingly, the remaining examples (31a, d, g, h, j, l, o, p, q, r, s) seem to support the proposal that tonal information resides in the suffixes. A suffix is considered to be strong whenever its effects on accent assignment can override those of anacrusis. This makes the segmentally identical suffixes of (31i and r) and (31k and o) weak and strong respectively. To sum up, the factors of accent assignment are characterized by hierarchical strength: locality >> strong suffixes >> anacrusis >> weak suffixes.

It must be added that most suffixes that Riad (2013:243) labels as strong (-2areN, -2aN, -2ska and -2ig) exhibit counterexamples: be 1sökare (visitor), 1aria (id), 1/2 astma (id), 1ungerska (Hungarian woman), för 1siktig (careful) etc. Given that aria is disyllabic, the surfacing accent cannot be blamed on the locality principle, which can be invoked to explain ungerska. However, if we do take the epenthetic V of ungerska into consideration and assume a disyllabic root, then it becomes difficult to account for 2klosterlig (monastery+Adj) and 2borgerlig (bourgeois), which on this ground should surface with Accent 1.

Another problem is presented by (31c), which in Riad’s model is the combination of a bound root and two Accent 2 inducing suffixes, which quite unexpectedly yield Accent 1. Riad (2013:237ff) allows for the possibility that lexically represented Accent 2 is not always expressed. It is, however, not an attractive solution. Why do we have to assign tone to both suffixes and subsequently take it away in order to arrive at a supposedly default representation? Moreover, why should the exceptional pattern of stress shift in the plural of (31c) be associated with what Riad calls default Accent 1?
As for the interplay of the factors that influence the tonal outcome we have seen that locality is more important than strong suffixes and that anacrusis is more important than weak suffixes. As Riad (2013:238) puts it “information closer to the left edge is often more visible and influential on the resultant accent”, which means that the properties of the root at the left edge often override those of the suffix. This could indicate that tonal information after all resides primarily in the root and not in the suffixes as Riad makes out.
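
The hierarchy locality >> strong suffixes >> anacrusis >> weak suffixes can be read as a decision procedure. The Python sketch below is only an illustration of that reading and not Riad’s own formalization: the function name, its parameters, the class labels "weak" and "strong", and the coding of each example for locality and anacrusis are assumptions made here. Lexically marked stems such as 2månad (31g) and the stress shift of (31c) are deliberately left out.

```python
from typing import Optional


def riad_accent(suffix: Optional[str], post_tonic: bool, anacrusis: bool) -> int:
    """Return 1 or 2 for a suffixed form under the hierarchy
    locality >> strong suffixes >> anacrusis >> weak suffixes.

    suffix     -- "weak", "strong" or None (hypothetical labels)
    post_tonic -- True if the suffix directly follows the stressed syllable
    anacrusis  -- True if an unstressed syllable precedes the stress
    """
    if suffix is None or not post_tonic:
        return 1  # no suffix, or locality violated: Accent 1
    if suffix == "strong":
        return 2  # strong suffixes override anacrusis
    if anacrusis:
        return 1  # anacrusis overrides weak suffixes
    return 2      # weak suffix on a stem without anacrusis


# (form, suffix class, post-tonic?, anacrusis?, attested accent), after (31).
# The Boolean codings are this sketch's reading of the examples.
EXAMPLES = [
    ("katter",    "weak",   True,  False, 2),  # (31a)
    ("gardiner",  "weak",   True,  True,  1),  # (31b)
    ("kompisar",  "weak",   False, False, 1),  # (31f)
    ("föräkta".replace("ä", "a"), "weak", True, True, 1),  # (31k) förakta
    ("kameror",   "strong", False, False, 1),  # (31n)
    ("gorilla",   "strong", True,  True,  2),  # (31o)
    ("bogserare", "strong", True,  True,  2),  # (31p)
]

for form, suffix, post_tonic, anacrusis, attested in EXAMPLES:
    predicted = riad_accent(suffix, post_tonic, anacrusis)
    print(f"{form}: predicted Accent {predicted}, attested Accent {attested}")
```

The order of the `if` tests mirrors the ranking: a factor is consulted only when every higher-ranked factor has failed to decide the outcome.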

An alternative approach (Accent 2 is unmarked)

Let us now investigate whether the alternative theory, which views Accent 1 as the marked member of the opposition, is more powerful when it comes to accounting for tonal distribution. The ability of suffixes to induce Accent 2 in (31) above was restricted⁵⁰ to monosyllabic stems and derived forms exhibiting what Riad calls strong suffixes. Given that monosyllabic words are not specified for Accent 1 (cf. 2.2.4.4), it is not appropriate to talk about accent shift in (31a, d, h, j, l). What happens is that any polysyllabic word not specified for Accent 1 surfaces with default Accent 2, which is assigned by a post-lexical rule. Such a mechanism is assumed by Lahiri & al. (2005:68), who point out that “Accent 1 can be specified on prefixes, suffixes and stems [and] its presence will ALWAYS block the postlexical rule”. An obvious advantage of this approach is that it does not postulate lexically specified tones that are unexpressed on the surface. Given that the role of suffixes is basically restricted to providing an extra σ to monosyllabic words, which otherwise would find themselves outside the scope of the opposition, cf. (31a, d, h, j, l), we are not compelled to distinguish between a strong and a weak class. It is not necessary to assume the interplay between anacrusis and suffixes and the locality principle either, since the addition of a suffix does not typically alter tonal specifications, cf. (31b, e, f, g, i, k, m, n, o). The only suffixes that do induce Accent 1 can be related to unusual patterns and extremely low productivity. The former can be exemplified by the stress shift in (31c), while the latter by the inflected forms in (32) below. It is interesting to note that whenever an affix has both an Accent 1-inducing and a neutral allomorph, it is always the latter that is productive. Most monosyllables whose plural is formed by the addition of –er are assigned Accent 2, yet those plurals in which the stem V is mutated surface with Accent 1 as indicated in (32ai). Similar patterns can be observed in comparatives (32b), while allomorphy in verbs in (32c) can also provide evidence for the marked status and exceptional character of Accent 1 words.

50 Recall that we assume words like go 2rilla (id) and 2flicka (girl) to be mono-morphemic roots.

(32) Accent 1 as an attribute of non-productive affixation
a. Plural (umlaut vs. –Vr)
   i. bok (book) > 1böcker, natt (night) > 1nätter, land (country) > 1länder
   ii. park (id) > 2parker, 1fågel (bird) > 2fåglar, länd > 2länder (loins)
b. Comparative (–re vs. –are)
   i. stor (big) > 1större, hög (high) > 1högre, 2liten (small) > 1mindre
   ii. röd (red) > 2rödare, vacker (beautiful) > 2vackrare, cool (id) > 2coolare
c. Present tense (–er vs. –ar)
   i. 1läser (read), 1dricker (drink), 1håller (hold)
   ii. 2googlar, 2twittrar, 2skannar (scan)

(33) Definite articles attached to Swedish nouns
a. singular
   i. neuter: hus (house) + (e)t > 1huset, 2papper (paper) + (e)t > 2pappret, bi (bee) + (e)t > 1biet
   ii. non-neuter: stol (chair) + (e)n > 1stolen, 2fader (father) + (e)n > 2fadern, by (village) + (e)n > byn
b. plural
   i. neuter: hus + (e)n > 1husen, 2papper + (e)n > 2pappren, bin + (a) > 1bina
   ii. non-neuter: 2stolar + n(a) > 2stolarna, 1fäder + n(a) > 1fäderna, 2byar + n(a) > 2byarna

Epenthesis

The superscripts denoting the tonal contour have been consciously omitted in monosyllables in accordance with the claim that they are not specified for Accent 1. However, if we want to maintain the supposition that lexically specified instances of Accent 1 are never overridden by suffixes, then we also have to analyze fågel, vacker, läser etc as monosyllables, whose second V is due to epenthesis. Although we dismissed an explanation involving epenthesis in 1slaver (Slavs) in 2.2.4.3, in the present case this solution is supported by vowel-zero alternations and accent shifts in inflected forms. It follows that the tonal outcome of 1slaver (Slavs), 1letter (Latvians), 1tjecker (Czechs), 1saker (things) etc should be accounted for by specifying Accent 1 on the stem rather than assuming an inserted V, for which we have scarce evidence. What this means is that certain monosyllables can be marked for Accent 1; however, this specification depends on the behaviour of the inflected forms and not on the surfacing tone of the monosyllable itself.

Although the definite article attached to a monosyllabic word often provides it with an additional σ, it never affects the tonal outcome, as indicated in (33) above. It seems, however, obvious that Accent 1 in huset, stolen, biet, bina etc cannot always be blamed on underlying mono-syllabicity. First of all, the insertion of an epenthetic V should be motivated by phonotactic requirements on the surface, which is apparently not the case in the examples given above. Second, epenthesis cannot be responsible for the insertion of both /e/ and /a/ because it “normally inserts an unspecified vowel slot whose features are supplied by fill-in rules [which makes it] the least marked vowel of the language” (Yip (1987:464)). Furthermore, if the article in (33ai) consisted of a single underlying C, then Norwegian neuter nouns would display a contradictory pattern. The fact that the final C is not pronounced would leave us with a single epenthetic V on the surface. The behaviour of fadern and byn as opposed to pappret, pappren and biet indicates that we should postulate epenthesis in (33aii) and an underlying V in all other cases, cf. Lahiri & al. (2005:75). Note, however, that our choice between underlying vowels and V insertion is completely independent of tonal considerations.

Does this imply that underlying disyllables such as huset and biet should be lexically marked for Accent 1? Given that definite forms exhibit virtually no exceptions, lexical specification of Accent 1 in huset would be hard to reconcile with our understanding of lexical Accent 1 being the indicator of irregular and marginal patterns. The problem can be resolved if we analyze the definite endings as clitics and not as suffixes. Such an approach is supported both by their regularity and by historical considerations. As a clitic attaches to a word outside the lexical domain, its tone can be attributed to the same post-lexical Accent 1 rule that is responsible for the surface melody of monosyllabic words (see 2.2.4.10).
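
The mechanism argued for in this subsection and the preceding one (lexical Accent 1 blocks the default, unspecified polysyllables receive post-lexical Accent 2, and epenthetic vowels and definite clitics do not count) can be summarized in a few lines of Python. This is a minimal sketch under stated assumptions rather than a full model: the function name, the two parameters and the syllable counts assigned to each form are illustrative choices made here, and the strong-suffix derivations of (31p, q, r, s) are not covered.

```python
def surface_accent(lexical_syllables: int, accent1_specified: bool = False) -> int:
    """Accent under the 'Accent 2 is unmarked' analysis.

    lexical_syllables -- underlying syllables inside the lexical domain;
                         epenthetic vowels and definite clitics are not counted
    accent1_specified -- True if a stem or affix carries lexical Accent 1
    """
    if accent1_specified:
        return 1  # a lexical mark blocks the post-lexical default
    if lexical_syllables >= 2:
        return 2  # default for unspecified polysyllables
    return 1      # monosyllabic domains fall outside the opposition


# Syllable counts and marks are this sketch's reading of (31)-(33).
FORMS = {
    "katter":   surface_accent(2),                          # katt + er
    "gardiner": surface_accent(3, accent1_specified=True),  # marked stem
    "böcker":   surface_accent(2, accent1_specified=True),  # umlaut plural
    "fågel":    surface_accent(1),                          # second V epenthetic
    "fåglar":   surface_accent(2),                          # stem + ar
    "huset":    surface_accent(1),                          # hus + clitic
    "pappret":  surface_accent(2),                          # papper + clitic
}

for form, accent in FORMS.items():
    print(form, accent)
```

Note that the sketch needs no distinction between weak and strong suffixes: the only lexical information it consults is the presence of an Accent 1 mark and the size of the lexical domain.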

Strong suffixes revisited

It remains to be added that Accent 2 induced by Riad’s strong suffixes in (31p, q, r, s) evidently falsifies Lahiri’s claim that lexically specified Accent 1 ALWAYS blocks⁵¹ the post-lexical rule. There are, however, a couple of interesting points to make concerning the relevant data. The first thing to notice is (as we have already pointed out) that all strong suffixes are derivational. Second, words of the type (31p, q, r, s) appear to display considerable tonal variation. I have found 23 words ending in -erare in Hedelin’s (1997) Swedish pronunciation dictionary. One of these was tagged with Accent 1 (funderare), four others with Accent 2 (officerare, orienterare, sekreterare, specialare), while the remaining 18 vacillate⁵² between the two accents.

It seems plausible that linguistic variation within a single speech community should be interpreted as a sign of change in progress. Moreover, if we recall that asymmetries (i.e. markedness relations) exhibited by an opposition reflect the course of phonological and analogical changes, it all makes sense. The assumption that Accent 1 is marked also entails that Accent 2 is attributed a certain level of assimilatory force. Tonal assimilation involves the gradual elimination of Accent 1, while the distribution of the default melody widens its scope. It is not unusual for lexical rules to have exceptions, as exemplified by 2söner, 2mödrar, cf. (32ai). Marked entities are expected to recede. Whether a given word undergoes tonal assimilation or not depends (apart from frequency of use) on a complex interplay of phonological, morphological, semantic and psychological factors. As far as phonology is concerned, it is quite telling that words ending in –erare with obligatory Accent 2 all have a disyllabic anacrusis, while fun 1derare does not.
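
As a quick consistency check on the dictionary counts just cited, the three groups can be tallied and expressed as shares. The snippet is only a convenience calculation: the group labels are hypothetical names chosen here, and the percentages are derived from the reported counts rather than taken from Hedelin (1997).

```python
# Counts reported in the text for words in -erare in Hedelin (1997).
counts = {"Accent 1": 1, "Accent 2": 4, "variable": 18}

total = sum(counts.values())  # should match the 23 words mentioned
shares = {label: n / total for label, n in counts.items()}

for label, n in counts.items():
    print(f"{label}: {n}/{total} = {shares[label]:.1%}")
```

On these figures the vacillating group is by far the largest, which is what motivates treating the pattern as a change in progress.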
A large number of words derived with the help of strong suffixes have arguably drifted away from the semantic content of the base form to such an extent that we are probably not justified in labelling them as derivations in a synchronic analysis. Thus profe 2ssorska (wife of professor) is not simply the female version of pro 2fessor, just as ka 2ssörska (cashier) is not a female ka 1ssör (treasurer), cf. manlig kassörska (male cashier). Given that kassörska has distanced itself from kassör, the original mark of final stress is not relevant for tonal assignment. In a synchronic analysis kassörska is not a derivation. Structurally, it belongs to the same category as go 2rilla and ve 2randa. If this is correct, then it can be maintained after all that most strong suffixes do not induce Accent 2, given that such suffixes are not suffixes any more: they are integrated into a mono-morphemic word.

51 This claim also seems to be challenged by an interesting Norwegian pattern involving the definite form of certain superlatives, cf. 1heder (honour), 2hederlig (honest), 1hederligst, 2hederligste (most honest). The first three forms can be accounted for if the base form is analyzed to be monosyllabic and –st as marked for Accent 1. However, the definite suffix –e is not expected to induce Accent 2. Wetterlin (2007:112) gets around this problem by postulating two different superlative suffixes: a lexically specified indefinite (-st) and an unspecified definite suffix (-ste).

52 Barberare, bogserare, formerare, hanterare, inventerare, investerare, juvelerare, klarerare, passagerare, placerare, planterare, programmerare, redigerare, registrerare, reglerare, systemerare, tapetserare, taxerare.

2.2.4.6. Compounds in Standard Swedish and Norwegian

As we saw in 2.2.3, Swedish- typically assigns secondary stress to the last constituent of a compound. In most Swedish dialects, the tonal result of such constructions is post-lexical Accent 2, whose lexical tone (H) is associated with the σ bearing primary stress, while its prominence tone (LH) is aligned with the σ bearing secondary stress. However, as a result of semantic fusion, lexicalized compounds such as 1torsdag (Thursday) undergo both stress demotion and accent shift. The former is expected when a word cannot be analyzed as a compound any more, but what motivates the latter? Recall that Accent 1 is claimed to indicate rare patterns. It is undoubtedly rare that a whole compound is squeezed into a single minimal prosodic word. In this sense, the surfacing melody accompanying such lexicalized words is in line with our understanding of the core properties of Accent 1.

The tonal opposition of compounds is also suspended in favour of Accent 2 in most Norwegian dialects to the north of Trondheim (cf. the map in Riad (2013:190)), while those dialects that are located to the south of the isogloss typically exhibit tonal contrast in maximal prosodic words. In this latter group the surfacing accent of the compound is determined by the lexical specification⁵³ of the first constituent. This seems straightforward as far as underlyingly polysyllabic first constituents are concerned. Although some compounds with a monosyllabic first constituent surface with Accent 1 and others with Accent 2, “the individual lexemes are very consistent in their correspondence to a particular compound accent”, cf. 1festdag (holiday), 1festhumør (festive atmosphere), 1feststemning (festive mood) with 2nattog (night train), 2nattmøbel (chamber pot), 2nattarbeid (night work) in Wetterlin (2007:145).
We have seen so far that monosyllables as a class are not specified for Accent 1, yet the marked accent can appear in certain inflected forms and compounds with a monosyllabic base. A closer look at the data in (34) suggests that from a tonal point of view the marks in these two word groups are virtually⁵⁴ in complementary distribution. Accordingly, it cannot be maintained that the stem itself is marked for tone, since under such an assumption we could not provide a proper explanation for the behaviour of compounds in (34a) and that of plurals in (34b). Nevertheless, we can postulate that floating marks can be attached to the representation of monosyllables, which will associate when the proper context (either a given suffix or an extra constituent) is provided.

(34) The behaviour of Norwegian monosyllabic stems
a. Accent 1 in inflections
   i. 1tenner (teeth) – 2tannlege (dentist)
   ii. 1netter (nights) – 2nattklubb (night club)
   iii. 1lengre (longer) – 2langbord (long table)
   iv. 1større (bigger) – 2storgråte (cry buckets)
b. Accent 1 in compounds
   i. 2fester (feasts) – 1festdag (holiday)
   ii. 2filmer (films) – 1filmrull (roll of film)
   iii. 2ferske (fresh - pl) – 1ferskvann (fresh water)
   iv. 2sydde (sewed) – 1systue (sewing room)

53 Cf. Wetterlin (2007:144-6) for a summary of the relevant data.

54 The only exceptions I am aware of are 1render (edges), 1randbemerkning (edge remark) and 1stenger (rods), 1stangfiske (rod-fishing), where the monosyllable itself seems to bear an actual tonal mark. Although both 1bønder (farmers) and 1bondeaktig (farmer-like) are lexically marked, this mark has to be floating given that the base form 2bonde (farmer) surfaces with Accent 2.

Concerning the number of monosyllables that build Accent 1 compounds, we can refer to a quantitative study undertaken by Withgott & Halvorsen (1988). They established that out of the 5162 UEN compounds found by their “machine readable dictionary” 25.4% had Accent 1 and the rest Accent 2. Although the authors are of the opinion that “it is highly undesirable to treat one-fourth of the words in a language as exceptional” (ibid: 289), the ratio (3:1) quite tellingly suggests that from a distributional point of view Accent 1 is still justly considered to be the marked pattern. Although the tonal behaviour of monosyllabic stems is characterized by rather low predictability, there are certain tendencies and regularities that allow the language learner to engage in “educated” guesses. It is assumed that the lexical items with Accent 1 “behave in a systematic⁵⁵ fashion that the grammar should account for” (ibid).

Kristoffersen (1992) reveals that there exists an interesting correlation between accent assignment and sonority. The higher the sonority of the rhyme, the more likely we are to get Accent 2 in the compound. In accordance with the cross-linguistic observation that codas as such are weak positions where obstruents are expected to undergo lenition, we can claim that universally unmarked codas are characterized by a high degree of sonority. Kristoffersen’s observation is thus completely compatible with our present claim about markedness relations, since it seems to be the case that a given unmarked structure promotes the occurrence of another. We can arrive at a similar conclusion by investigating the tonal behaviour of linking morphemes. Compounding in Sw&No is often accompanied by the insertion of –s– or –e– between two constituents. In UEN the presence of the former results in Accent 1, while compounds with –e– (which is historically the remnant of the plural genitive) can only surface with Accent 2. In light of the above, the marked status of –s– can also be attributed to its low sonority value⁵⁶.
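
The 3:1 ratio quoted above from Withgott & Halvorsen (1988) follows directly from the two reported figures. The short calculation below makes the arithmetic explicit. The absolute counts it prints are derived here by rounding and are not figures reported by the authors, and the variable names are illustrative.

```python
total_compounds = 5162  # UEN compounds in Withgott & Halvorsen (1988)
share_accent1 = 0.254   # reported proportion of Accent 1 compounds

# Derived values (rounded); not counts given in the source.
accent1 = round(total_compounds * share_accent1)
accent2 = total_compounds - accent1
ratio = accent2 / accent1

print(f"Accent 1: about {accent1}")
print(f"Accent 2: about {accent2}")
print(f"Accent 2 to Accent 1: about {ratio:.2f}:1")
```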

2.2.4.7. Southern Swedish compounds

We have seen that tonal distribution in compounds is either determined by prosodic issues as in Standard Swedish or by lexical considerations as in East Norwegian. Apart from these two extremes, there are also some varieties (e.g. the Scanian dialects of southern Sweden) where accent assignment is governed by a systematic interplay between prosodic and lexical factors. Bruce’s (1974) experiment, in which informants from Malmö, Kristianstad and Halmstad were asked to pronounce nonsense compounds, revealed that a clear continuum can be set up along the line of lexical vs. prosodic treatment, given that geographical proximity to the connective dialects of Götaland and Svealand correlated with an increased dominance of prosodic factors. Moreover, one can establish certain implicational relations on the basis of the actual stress patterns exhibited by Bruce’s neologisms. The data in (35) below indicate that if a given pattern is assigned Accent 2, then all further words below it share the same melody.

55 To take an example, we can think of monosyllabic words denoting (cf. 1tysk-, 1spansk-, 1fransk-, 1gresk- etc), which build compounds in a uniform manner.

56 The addition of /s/ often brings along a radical decrease of sonority in the stressed syllable: hav+s [haf:s] (sea - gen), behöv+s [be'høf:s] (need - passive) etc.

(35) Accent assignment in Southern Swedish (based on Bruce (2007:121))

                      Malmö   Kristianstad   Halmstad   Stockholm   Rhythmic pattern
a  taxi-gris          1       1              1          2           Vv-V
b  skogs-hals         1       1              2          2           V---V
c  banan-kust         1       1 / 2          2          2           vV-V
d  lax-choklad        1       2              2          2           V-vV
e  burk-handduk       1       2              2          2           V-(V)V
f  cykel-plank⁵⁷      1 / 2   2              2          2           V(v)-V
g  blod-prins         2       2              2          2           V-V
h  skrattmås-sylt     2       2              2          2           V(V)-V
i  sommar-träsk       2       2              2          2           Vv-V

In the rightmost column of (35) V denotes a stressed and v an unstressed σ. When the former is in brackets, it signals demoted stress, while (v) marks a non-vocalic σ, whose schwa may be dissolved to create a syllabic . It is interesting to note that stress clash, which inhibits final stress and thereby Accent 1 in Northern Swedish (cf. 2.2.3), results in Accent 2 in (35g) but not necessarily in (35c). Furthermore, the linking morpheme that is associated with Accent 1 in Norwegian compounds has the same effect in Malmö and Kristianstad, cf. (35b). It follows from Riad (2003:101, 125) that those dialects where the prominence tone associates in compounds (like in Stockholm and Narvik) have post-lexical Accent 2 in words with two stressed syllables. When lexical factors also have a say in accent assignment (like in , Bergen and Malmö), we find an unassociated prominence tone, which is oriented to the left. It is probably not a coincidence that whenever lexical specification plays a role in the accent assignment of compounds, their prominence tone behaves exactly in the same way as in simplex words, whose tonal distribution is regulated by lexical marking.
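
The implicational reading of (35) can be checked mechanically. The following Python sketch encodes the table and tests two properties: down each column, Accent 2 is never followed by Accent 1, and across each row the share of Accent 2 never decreases from Malmö to Stockholm. Treating the variable cells (1 / 2) as an intermediate step between Accent 1 and Accent 2 is an assumption of this sketch, as are all names used in it.

```python
ORDER = {"1": 0, "1/2": 1, "2": 2}  # Accent 1 < variable < Accent 2

DIALECTS = ["Malmö", "Kristianstad", "Halmstad", "Stockholm"]

# Rows (35a-i); one cell per dialect in the order of DIALECTS.
DATA = {
    "taxi-gris":      ["1", "1", "1", "2"],
    "skogs-hals":     ["1", "1", "2", "2"],
    "banan-kust":     ["1", "1/2", "2", "2"],
    "lax-choklad":    ["1", "2", "2", "2"],
    "burk-handduk":   ["1", "2", "2", "2"],
    "cykel-plank":    ["1/2", "2", "2", "2"],
    "blod-prins":     ["2", "2", "2", "2"],
    "skrattmås-sylt": ["2", "2", "2", "2"],
    "sommar-träsk":   ["2", "2", "2", "2"],
}


def non_decreasing(cells):
    """True if the cells never step back towards Accent 1."""
    ranks = [ORDER[c] for c in cells]
    return all(a <= b for a, b in zip(ranks, ranks[1:]))


# Down each column: once Accent 2 appears, all rows below it have Accent 2.
columns_ok = {
    dialect: non_decreasing([row[i] for row in DATA.values()])
    for i, dialect in enumerate(DIALECTS)
}
# Across each row: columns further to the right never show less Accent 2.
rows_ok = {word: non_decreasing(row) for word, row in DATA.items()}

print(columns_ok)
print(rows_ok)
```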

An OT analysis

The data in (35) can also be accounted for within an optimality-theoretic framework, where the interplay of lexical and prosodic factors can be expressed with the help of the markedness constraint *ACC1 and the faithfulness constraint IDENT-IO(tone). While the former is responsible for the elimination of tonal marks, the latter makes sure that input marks are preserved in the output. It follows from (35) above that *ACC1 >> IDENT-IO(tone) in Standard Swedish, while the opposite ranking holds for Malmö. The tonal distribution of Kristianstad and Halmstad reveals that IDENT-IO(tone) is actually a cover term, which comprises at least four sub-constraints displayed in (36) below. It seems that the ranking of these four faithfulness constraints is uniform over the whole area under investigation, which means that (36a) >> (36b) >> (36c) >> (36d). If this is correct, then the dialects in (35) differ solely in the relative ranking of *ACC1. This is illustrated by a proposed underlying constraint structure in (37) below. If we grant that a compound whose posterior constituent is another compound exhibits posterior anacrusis (i.e. demoted stress and unstressed position are subsumed under one label), then the constraint rankings in (37) can correctly account for the data in (35).

57 Tonal vacillation in Malmö may be attributed to the fact that some speakers store words like 1tiger (id) and 1cykel (bicycle) as lexically specified disyllables, which follows from the mark attested in the plural forms (cf. 1tigrar, 1cyklar).

(36) IDENT-IO(tone) preserves the tonal specification of
a. marked free morphemes: ID-IO(tone-mf)
b. marked bound morphemes: ID-IO(tone-mb)
c. anterior anacrusis: ID-IO(tone-aa)
d. posterior anacrusis: ID-IO(tone-pa)

(37) The relative strength of *ACC1 in four Swedish dialects

Malmö             Kristianstad      Halmstad          Stockholm
ID-IO(tone-mf)    ID-IO(tone-mf)    ID-IO(tone-mf)    *ACC1
ID-IO(tone-mb)    ID-IO(tone-mb)    *ACC1             ID-IO(tone-mf)
ID-IO(tone-aa)    ID-IO(tone-aa)    ID-IO(tone-mb)    ID-IO(tone-mb)
ID-IO(tone-pa)    *ACC1             ID-IO(tone-aa)    ID-IO(tone-aa)
*ACC1             ID-IO(tone-pa)    ID-IO(tone-pa)    ID-IO(tone-pa)
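
To make the evaluation behind (37) concrete, the rankings can be run against the test compounds of (35). The Python sketch below is a simplified illustration and not a full OT tableau: it compares only two candidates (Accent 1 and Accent 2), and the assignment of each compound to the faithfulness constraint that protects its mark is an assumption of this sketch, inferred from the discussion of linking –s–, anacrusis and demoted stress. All identifiers are hypothetical.

```python
ACC1 = "*ACC1"
MF, MB, AA, PA = ("ID-IO(tone-mf)", "ID-IO(tone-mb)",
                  "ID-IO(tone-aa)", "ID-IO(tone-pa)")

# Rankings read off (37), highest-ranked constraint first.
RANKINGS = {
    "Malmö":        [MF, MB, AA, PA, ACC1],
    "Kristianstad": [MF, MB, AA, ACC1, PA],
    "Halmstad":     [MF, ACC1, MB, AA, PA],
    "Stockholm":    [ACC1, MF, MB, AA, PA],
}

# ASSUMPTION: the constraint protecting the Accent 1 mark of each
# compound in (35); None means the input carries no mark.
PROTECTED_BY = {
    "taxi-gris": MF, "skogs-hals": MB, "banan-kust": AA,
    "lax-choklad": PA, "burk-handduk": PA, "cykel-plank": None,
    "blod-prins": None, "skrattmås-sylt": None, "sommar-träsk": None,
}


def violations(candidate, protector):
    """Constraints violated by the Accent 1 or the Accent 2 candidate."""
    if candidate == 1:
        return {ACC1}  # any Accent 1 output violates *ACC1
    return {protector} if protector else set()  # Accent 2 deletes a mark


def winner(ranking, protector):
    """Return the candidate favoured by the highest deciding constraint."""
    for constraint in ranking:
        bad1 = constraint in violations(1, protector)
        bad2 = constraint in violations(2, protector)
        if bad1 != bad2:
            return 2 if bad1 else 1
    raise ValueError("ranking must contain *ACC1")


PREDICTED = {
    dialect: {word: winner(ranking, protector)
              for word, protector in PROTECTED_BY.items()}
    for dialect, ranking in RANKINGS.items()
}

for dialect, row in PREDICTED.items():
    print(dialect, row)
```

A strict ranking of this kind yields a categorical outcome for every cell, so the variable cells of (35) (banan-kust in Kristianstad, cykel-plank in Malmö) are matched by only one of their two attested accents.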

The stair-like pattern of (35) reflects the strict hierarchy of (37), which allows us to assert that if a dialect is in change, it will follow a particular scenario, in which *ACC1 struggles its way up to the top. In fact, such a change is actually in progress. Ström (1998) repeated Bruce’s (1974) experiment and found that these (partially) lexical dialects were gradually shifting towards the Central Swedish system in a regular manner (i.e. keeping to the stair-like pattern). This transition is too systematic to be attributed to such non-phonological factors as the prestige of the or its influence exerted on other dialects through TV, radio etc. Riad (2003:114) also points out that “the change follows a common path” and that “the general direction of change here is from lexical to increasingly prosodic assignment of accent”. Riad (2003) analyses this process as part of a more comprehensive, circulatory pattern. We can conclude the present discussion by acknowledging some striking similarities between the behaviour of simplex words and compounds. Both classes are in the process of disposing of the lexically specified mark of Accent 1: simplex words rely on analogy, while compounds engage in more structured developments. Furthermore, monosyllables in isolation and as the first constituents of compounds are also subject to a similar mechanism. An appropriate extra σ can turn both groups into words bearing Accent 2. For simplex words the restriction holding for such an extra σ is that it should not be epenthetic, while for compounds that it be stressed, cf. (35g). In light of the above, it seems even more appealing to claim that Accent 1 is the marked member of the opposition .

2.2.4.8. The functions of the opposition

Our primary motivation for investigating the Scandinavian word accents in the first place is their phonemic or distinctive status. Nevertheless, the functional load of the accents is so low (partly because large sections of Sw&No vocabulary are not even eligible for the tonal distinction) that from a phonological point of view the opposition seems almost entirely superfluous. Recall that according to Passy’s (1891:227) principle of economy languages tend to get rid of unnecessary, redundant material. However, the very fact that most mainland Scandinavian dialects have preserved the opposition up to our present days indicates that the distinction’s existence can be ascribed to some other functions that it serves to fulfill.

Elert (1972:152) suggests that the principal distinctive function of the opposition is not “the distinction of words but rather of otherwise homonymous morphemic elements such as the various suffixes, -1/2 a [neuter nouns def. pl. / inf. or adj. pl.], -1/2 en [nouns def. / past part.], -1/2 er [pres. tense / nouns pl.]”. Given that the different tonal manifestations of the suffixes always denote different syntactic categories, they are usually attached to different stems, so these morphemes are most often no more than near minimal pairs. The proposition that the role of the accents is somehow related to an “attempt” to reduce homophony immediately loses its appeal if we recall that semantics is never involved in phonological descriptions (1.4.5.1). We would expect a language to employ syntactic rather than phonological means when it comes to avoiding homophony, as is the case in French. It is well known from various perception tests that features that are claimed to be redundant in relation to a given opposition often contribute a lot to the proper identification of speech sounds, cf. e.g. Cooper & al. (1952). In this respect, the accents certainly play a role in facilitating perception (cf. Roll & al. (2011)), but this role is not distinctive in the narrow sense of the word.

It is generally acknowledged in the literature that Accent 2 also has a connective function, but there appears to be somewhat less consensus on what actually is being connected to what. Elert (1972) assumes a formative or word boundary following the stressed σ of words with Accent 2, which is tantamount to suggesting that Accent 2 is essentially a juncture. In other words, Accent 2 is claimed to “[signal] a syntactic connection between morphemes or sequences of morphemes” (ibid: 152). However, as we argued in fn 47, certain objections can be raised against analyses that consider words like go 2rilla (id) to be poly-morphemic. So in the present approach we simply assume that Accent 2 signals that the stressed σ and the following σ belong to the same domain. Accent 1, on the other hand, does not necessarily indicate detachment or mono-syllabicity, cf. pi 1ano (id), 1aria (id) etc.

Throughout this chapter we have argued that Accent 1 should be seen as an indicator of special, foreign and irregular patterns. We have seen that many linguists reject the partition of the lexicon along the binary feature [±native], claiming that there is nothing foreign about words like fa 1milj (family) or pi 1ano (id). This is, however, not a consistent argument. Recall that in the case of connectivity we assumed that Accent 2 indicates [+connective], while Accent 1 does not indicate anything at all. We can argue in a similar vein that Accent 2 signals [+native] and that no value is attached to Accent 1. Such an approach makes it clear why loans tend to enter the language with Accent 1 and why the elimination of certain segmental and prosodic marks goes hand in hand with tonal adaptation: unmarked tone presupposes further unmarked, default, native patterns.

The proposal that the [±native] distinction constitutes a psychological reality for Swedes and Norwegians can be supported with the distribution of the tonal accents in geographical names. Our prediction is that foreign places should be uttered with Accent 1 and domestic ones with Accent 2, cf. Larsson (2003:32), who reports that school children are more likely to pronounce familiar place names with Accent 2 in the dialect of Trögd, Sweden. There is, of course, a large number of exceptions that defy this expectation; still, the following comparison between Sw and No in (38) is worth considering.

(38) Tonal distribution in geographical names
a. in Swedish (based on Hedelin (1997) and http://www.ltz.se/slakt-o-vanner/fira-o-uppmarksamma/ortnamn-de-skandinaviska-ordaccenterna-tradition-och-nysprak)
   i. 1Norge, 1Oslo, 1Narvik, 1/2Trondheim, 1Steinkjer, 1Namsos, 1Svolvӕr
   ii. 2Avesta, 1/2Umeå, 1/2Luleå, 1/2Kalmar, 2Hässelby, 1/2Halmstad, 1/2Vilhelmina
b. in Norwegian (based on Berulfsen (1969) and http://sprak.nrk.no/ordbok/)
   i. 2Norge, 2Oslo, 2Narvik, 2Trondheim, 2Steinkjer, 2Namsos, 2Svolvӕr
   ii. 2A1vesta, 1Umeå, 1Luleå, 1Kalmar, 1Hässelby, 1Halmstad, 1Vilhelmina

It follows clearly from (38aii) that the pronunciation of certain geographical names is subject to (regional) variation, to which the list in (38) cannot do full justice. The names of e.g. Kalmar, Halmstad and Vilhelmina are often pronounced differently by a standard speaker than an inhabitant of the respective city. Nevertheless, Trondheim shows in a straightforward manner what I mean by tonal assimilation. According to Hedelin (1997), the name of the city is normally pronounced with Accent 1 in Swedish. However, he indicates that there is an alternative, segmentally somewhat different, Swedecized pronunciation with Accent 2. The Norwegian renderings of 1Umeå and 1Luleå in (38bii) are particularly telling given that the tonal outcome of Norwegian compounds is as a rule determined by the first constituent. Such a pronunciation obviously goes against the productive patterns according to which disyllables ending in –e are assigned Accent 2. The concept of nativeness can account for the surfacing melody of such compounds 58 . Sw 1Vänern and 1Vättern (pronounced in Nw as 2Vänern and 1/2 Vättern ) are the only counterexamples to the generalization in (38) that I am aware of. Finally, it remains to be added that certain non-Scandinavian place names are pronounced with default Accent 2 in Swedish and with marked Accent 1 in Norwegian (cf. , Kina, Mecka, Peking, Israel, Jerusalem, Budapest etc). Note that these examples can be either analyzed as (formal) compounds or constitute a disyllable with an ending that is usually associated with Accent 2 in Swedish. Accordingly, these renderings with Accent 2 can be attributed to analogical pressure.

2.2.4.9. Building blocks reconsidered By now it must be clear that our understanding of the tonal opposition is in certain respects incompatible with Riad’s (2003) privative analysis of the accents presented in (28) above. Let us first address a terminological problem . While the prominence and boundary tones of Accent 2 are tagged with labels that reflect their function (the former signals prominence, the latter the boundary of an IP), the term lexical tone leaves much to be desired. First of all, a given tone can be characterized by a lexical specification, however, this is just a property and certainly not a function. Secondly, the term directly contradicts our conviction that Accent 2 is default and post-lexical, but it is also problematic for traditional analyses. Riad (1998a:96fn) points out that such a designation is “a little idiosyncratic in view of the fact that accent 2 is not a lexical specification of compounds”. My next objection to the model in (28) concerns the claim that Accent 2 equals a lexical tone followed by stress as claimed by Haugen (1967:189)). We saw in 2.2.4.2 that such an approach entails that the tonal contrast is reduced to H – Ø and is thus maintained in non-focal position. Recall that we have argued that as far as the minimal system is concerned this is not the case. Given that I do not intend to relinquish the advantages of the privative approach, I will, in what follows, attempt to devise an alternative model to that of (28). The first point to note is that the phonological interpretation of tonal contours has at least two precarious aspects, which makes it somewhat more arbitrary than is the case in the segmental domain. First of all, the intended or underlying F0 is often obscured by pitch perturbations caused by the different effects of voiced and voiceless consonants. Ladefoged (2003:87) emphasizes that such micro-prosody is “usually not part of the phonology of the language, and should be disregarded in most circumstances”. 
58 Berulfsen (1969) indicates that the final V of Luleå is lengthened, which reflects secondary stress and suggests that the tonal outcome cannot be blamed on stress demotion and the fossilization of the compound.

Another problem is presented by the fact that different pitch analysis programs may draw rather different F0 curves for the same input (ibid: 83ff). Thus as a rule of thumb, it seems reasonable to keep the phonological description as restrictive as possible by ignoring tonal movements that are not integral parts of the given pattern under investigation. As a matter of fact, in certain words the realization of Accent 1 in Standard Swedish lacks the initial L of the commonly accepted LH-L contour. In light of the above, we are entitled to reanalyze the accent and equate it with H-L. Such a move might seem arbitrary but it is not unprecedented. Riad (1998a:81) and Bye (2004:19) represent the Accent 2-contour of Malmö as L-HL-H, while Riad (2003:99) interprets the same melody as L-H-L. In short, we argue that Accent 1, which is frequently equated with stress (Haugen (1967:189)), should be described as H-L in a phonological analysis. It is commonly asserted that Accent 2 is “difficult for non-natives to learn and by them often misheard as delayed stress”, cf. Haugen & Joos (1952:179). The first part of the claim stems from experience and the second follows from the assumption that the difference between the two accents resides solely in tonality (hence tonal accents). Nevertheless, I am convinced that in the case of Standard Swedish it is not entirely legitimate to throw the blame on the “imperfect” perception of foreigners: the second peak of Accent 2 does bear a certain degree of stress 59. It is clear, however, that this sensation of (secondary) stress on the second peak of Accent 2 in simplex words is purely phonetic and plays no phonological role. The arguments in favour of this claim are as follows. It is well-known that the three correlates of stress (quantity, intensity and F0) are used differently in various languages. 
Swedes rely most heavily on quantity, while speakers of English on changes in F0. However, it is apparent that the less important cues cannot be ignored given that even different groups within a single speech community may signal stress with different means, cf. Gósy (2004:198). Accordingly, we have to examine whether the unstressed σ of disyllabic Accent 2 words is distinguished with more prominence than the corresponding unstressed σ of initially-stressed, disyllabic Accent 1 words. As far as F0 is concerned it is obvious that the additional peak of Accent 2 words signals some degree of prominence. If Accent 1, which as we have mentioned before equals stress, is indeed represented as HL, then it is hard to escape the conclusion that the HLHL contour of Accent 2 words contains two phonetic stresses. The tonal contour is not so conspicuous in those cases when the second µ of the stressed σ is linked to a voiceless segment. I carried out measurements of near minimal pairs (Tési (2014:181ff)) such as 1tapper (brave) – 2tappar (drop), 1läcker (leak) – 2läcka (to leak), 1fiffel (cheating) – 2fifflar (cheat) etc. The results showed that the second σ of an Accent 2 word was on average 2.8 dB louder than the first, while in Accent 1 words the first σ was 0.7 dB louder than the second. Moreover, the average difference in F0 between the two syllables of Accent 1 words was a perfect fifth, while the corresponding value for Accent 2 words was a minor third. This suggests that in initially stressed disyllabic words the prominence cues of the second σ may contribute more to the correct identification of the actual accent than the rising or falling contour of the first σ. It has been hinted at above that we should replace the misleading label lexical tone with a more appropriate term that reflects its functional properties. Considering the fact that connectivity is the least controversial function of Accent 2, it is legitimate to introduce it to our new terminology. 
Given that it is the second constituent of Accent 2 that connects the post-stress σ to the stressed σ, we must reconsider the order of the constituents in (28). This alternative account of the Standard Swedish privative opposition is presented in (39) below. It follows clearly from this new segmentation why the distinction is suspended in non-focal position in a minimal system. Given that the connective tone connects to the preceding prominence tone, it is unable to fulfill its function in non-focal position, which means that it cannot be realized.

59 This has been recognized for a long time as can be seen in Kock (1878:41) who equates Accent 2 with a sequence of gravis + levis.

(39) The revised opposition of Standard Swedish (from Tési (2014:184))
            Prominence tone   Connective tone   Boundary tone
Accent 1    H                 -                 L
Accent 2    H                 LH                L
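
The interval and level differences reported for the near minimal pairs earlier in this section can be related to frequency and amplitude ratios. The following is a minimal sketch, assuming equal-tempered intervals (7 semitones for a perfect fifth, 3 for a minor third) and the 20·log10 amplitude convention for dB; neither assumption is stated with the measurements themselves, and all function names are purely illustrative.

```python
import math

# Assumed equal-tempered sizes of the two intervals mentioned in the text.
PERFECT_FIFTH = 7   # semitones
MINOR_THIRD = 3     # semitones


def semitones(f_high: float, f_low: float) -> float:
    """Distance between two F0 values in equal-tempered semitones."""
    return 12 * math.log2(f_high / f_low)


def ratio_from_semitones(n: float) -> float:
    """Frequency ratio corresponding to an interval of n semitones."""
    return 2 ** (n / 12)


def amplitude_ratio_from_db(db: float) -> float:
    """Amplitude ratio for a level difference in dB (20*log10 convention)."""
    return 10 ** (db / 20)


fifth_ratio = ratio_from_semitones(PERFECT_FIFTH)
third_ratio = ratio_from_semitones(MINOR_THIRD)
accent2_second_syllable = amplitude_ratio_from_db(2.8)  # second σ louder
accent1_first_syllable = amplitude_ratio_from_db(0.7)   # first σ louder
```

Under these assumptions a perfect fifth corresponds to an F0 ratio of roughly 1.5 and a minor third to roughly 1.19, so the F0 excursion between the two syllables is considerably larger in Accent 1 words than in Accent 2 words.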

2.2.4.10. Lexical and post-lexical rules In what follows, I will conclude the discussion of the tonal opposition by investigating whether our present approach to accent assignment and the alternative proposals put forward above can be transferred to a rule-based description of Swedish tonality. Elert (1972) is an obvious point of departure as far as SPE-inspired, generative rules are concerned. Elert posits a Main Accent 2 Rule (MAR), an Accent 2 Exception Rule (AER) and a number of special environments and exceptions, where lexical marking for Accent 2 is deemed necessary. He finds that AER, which is made up of eight sub-rules, should precede MAR. The reason why the lexical marking of certain words is inevitable is that Elert’s MAR assumes a boundary 60 after the stressed σ and is thus unable to derive the correct tone of 2sallad (id), 2gummi (rubber), 2bambu (bamboo) etc. Given that MAR requires a boundary after the stressed σ, its scope is reduced and it does not make false predictions about the large group of loanwords that surface with Accent 1. Elert’s approach entails the contradiction that Accent 2 is both seen as default (MAR) and lexically marked. Moreover, as MAR is responsible for Accent 2 in both simplex words and in compounds, AER inhibits Accent 2 even in those cases where the anterior constituent of the compound satisfies the structural description of AER as in be 2sökstid (visiting hours). Elert claims Accent 1 to be the unmarked (i.e. default) melody but at the same time postulates exception rules, which can hinder “marked” Accent 2 from surfacing. Clearly, whatever markedness relations we assume, lexical marking should be restricted to the marked pattern and it does not matter whether some of the exceptions belong to well-defined, rule-like categories. The fact that Elert orders AER before MAR indicates that Accent 1 is closer to the lexicon than Accent 2. In line with this, Lahiri & al. 
(2005:68) propose that only Accent 1 is lexically marked, while all instances of Accent 2 are due to post-lexical accent assignment: “[e]very polysyllabic word, consisting of at least one disyllabic trochee, if not lexically specified for accent 1 is assigned accent 2”. Given that monosyllables are not (automatically) specified for Accent 1, a post-lexical Accent 1 rule is also called for. The different behaviour of Accent 2 in compounds and in simplex words suggests that Lahiri’s post-lexical rule should be split into two. This means that Swedish tonality is defined by lexically specified Accent 1 and the four rules in (40).

(40) The rules of Swedish tonality
a. Lexical Accent 1: assign H-L to those words that undergo marked morphological or phonological processes in the course of their derivation.
b. Post-lexical Accent 2: delete any previous tone in words with two stresses and assign H-LH-L.
c. Post-lexical Accent 2: assign H-LH-L to toneless words with at least one stressed disyllabic trochee.
d. Post-lexical Accent 1: assign H-L to any stressed word lacking tones.

60 So similarly to Riad (2013), Elert treats the endings –a and –e as suffixes or stem formatives.
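
The ordering of the four rules in (40) can be made concrete with a minimal sketch. It assumes a deliberately crude word representation: the `Word` fields, and in particular the `lexically_marked` flag standing in for the condition of (40a), are hypothetical simplifications, and the feature values given to the example words are illustrative assumptions rather than analyses.

```python
from dataclasses import dataclass
from typing import Optional

ACCENT_1 = "H-L"
ACCENT_2 = "H-LH-L"


@dataclass
class Word:
    form: str
    stresses: int             # number of stresses in the word
    disyllabic_trochee: bool  # contains a stressed disyllabic trochee
    lexically_marked: bool    # hypothetical stand-in for the condition in (40a)


def assign_tone(word: Word) -> Optional[str]:
    """Apply (40a-d) in order and return the surfacing contour, if any."""
    tone: Optional[str] = None
    # (40a) Lexical Accent 1
    if word.lexically_marked:
        tone = ACCENT_1
    # (40b) Post-lexical Accent 2: overwrites any previous tone
    if word.stresses == 2:
        tone = ACCENT_2
    # (40c) Post-lexical Accent 2: toneless words only
    if tone is None and word.disyllabic_trochee:
        tone = ACCENT_2
    # (40d) Post-lexical Accent 1: any remaining stressed word
    if tone is None and word.stresses >= 1:
        tone = ACCENT_1
    return tone


# Illustrative feature assignments (assumptions, not claims about the lexicon).
piano = Word("piano", stresses=1, disyllabic_trochee=True, lexically_marked=True)
besokstid = Word("besökstid", stresses=2, disyllabic_trochee=True, lexically_marked=True)
sallad = Word("sallad", stresses=1, disyllabic_trochee=True, lexically_marked=False)
hus = Word("hus", stresses=1, disyllabic_trochee=False, lexically_marked=False)
```

In this sketch the compound receives H-LH-L by (40b) even though its first constituent is marked for Accent 1, while the lexically marked simplex word is shielded from (40c) because it is no longer toneless when that rule applies.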

As I pointed out in Tési (2012), many claims of classical Lexical Phonology (LP) are in accordance with the facts of accent assignment as described above. For instance, the dual (partly default, partly lexical) status of Accent 1 does not involve complications since LP “does not prohibit a rule from applying both 61 lexically and post-lexically… [but] lexical applications of the rule will exhibit different properties than post-lexical applications” (Pulleyblank (1986:6)). In the case of Accent 1 this difference is observed in foot structure (i.e. lexical Accent 1 pertains to polysyllables), while Elert’s lexical (e.g. 2månad (month)) and post-lexical (e.g. 2byggnad (building)) Accent 2 are only distinguished by the problematic 62 notion of morpheme boundary.

(41) Some principles of LP (based on Mohanan (1982:116))
a. Lexical processes cannot apply across words.
b. Post-lexical operations are blind to the internal structure of words.
c. Post-lexical operations are exceptionless.

Given that lexical and post-lexical are polar opposites, the statements in (41) above can be negated if lexical is exchanged for post-lexical or vice versa. (41a) is clearly true of (40a), as Accent 1 does not extend beyond the domain of the word. If the definite article of 1huset (the house) is indeed a clitic, then (40d) is arguably a post-lexical rule that can apply across word boundaries. The same pertains to (40b) and (40c) as exemplified by standard Sw. compounds and the behaviour of Norwegian phrasal verbs respectively, cf. 2slå ned (knock down). The validity of (41b) is apparent from the association of the connective tone in (40b) and the fact that the placement of secondary stress in compounds does not indicate the hierarchical structure of the word. As far as (41c) is concerned, words like 2söner (sons), 2mödrar (mothers) provide evidence for the proposition that lexical rules (40a) can have exceptions. We have seen that the post-lexical rules of (40b) and (40d) apply across the board and are indeed exceptionless. Strictly speaking, (40c) has no exceptions either as it never generates false results. On the other hand, we must admit that all the words that are specified for Accent 1 in the lexicon are essentially exceptions to the Accent 2 rule. Lexical marking is a back door to ensuring the post-lexical status of (40c). In addition to the principles in (41), lexical and post-lexical rules differ in many other aspects cf. Rubach (2008:470). For the present discussion one of the most relevant differences concerns the question of structure preservation . Lexical rules never make use of (and never create) features that are not present in the UR of a given language. Post-lexical rules (e.g. aspiration and glottalisation) can introduce such surface specifications. Kiparsky (1985:93) argues that “any rule which introduces marked 63 specifications of lexically non-distinctive features must be post-lexical”. 
Given that both Accent 2 rules are post-lexical and that certain words are specified for Accent 1 in the lexicon, H and L are the only tones present in the UR of Swedish. The connective tone (LH), which is a phonetically marked specification, is accordingly introduced by the two Accent 2 rules, which therefore must be post-lexical.

61 Mohanan (1982:114) also provides examples of rules with both lexical and post-lexical domains.
62 Cf. fn. 47.
63 This use of the word refers to phonetic complexity.

2.3. Danish
In the following overview of the three phonemic features of Danish prosody I shall attempt to analyze the relevant data in a way such that the emerging findings can be interpreted with regard to the conclusions of section 2.2. As my primary concern is to point out (dis)similarities between Danish on the one hand and Sw&No on the other, the ensuing discussions obviously make absolutely no claim to be exhaustive. Furthermore, the interdependent nature of the three features (the occurrence of stød, similarly to the accentual opposition, presupposes both length and stress) compels me to echo the caveat put forward in 2.2: the following subsections will complement each other in substantial ways and will necessarily also be somewhat overlapping.

2.3.1. Quantity
The complementary distribution of vocalic and consonantal length in Sw&No makes it evident that length is the property of a suprasegmental domain (the rhyme of the stressed σ) and is not (merely) associated with individual segments. In Danish, however, the bimoraic requirement of Weight-to-Stress is not a structural principle, which means that stressed syllables can be monomoraic as opposed to Swedish, Norwegian, Italian, Icelandic etc, cf. Basbøll (2005:275). Given that long consonants (apart from in certain marginal phenomena to be treated below) are unknown to modern Danish, the phonological distinction between light and heavy syllables can only be dependent on V length, cf. the minimal pairs hvile [i:] (to rest) and ville [i] (to want). This suggests that Danish quantity should be considered as a segmental feature after all. Grønnum (2005:266) provides some arguments in favour of the view that length is a prosodic feature. She compares the behaviour of quantity to that of segments and finds that the former occurs at most once per σ, can be lost under certain circumstances and its placement within the σ is predictable. She adds that segments such as /s/ are not characterized by such properties. Nonetheless, this is an arguably unfair comparison, which may lead us to categorize /h/ as a prosodic feature, given that it occurs at most once in a σ (hence predictable) and is lost in certain structures. Length is obviously bound to particular segments and is, as Basbøll (2005:44) puts it, “the least prosodic of the three prosodies”. Moreover, in certain cases (especially in the lower regions), “the distinction in vowel quantity is combined with a clear distinction in vowel quality” (ibid: 272), which seems to support the supposition that long and short vowels are in fact distinct phonemes, cf. the argument in 2.2.2. In spite of all this, V quantity is so closely intertwined with the two other prosodic phenomena in question that we are entitled to treat it together with ‘proper’ prosodic features 64.

Shortening
A long V can be analyzed as bimoraic, it is always assigned a certain degree of stress (primary or secondary) and is thus a possible locus for stød. These properties indicate that V – V: alternations are often 65 accompanied by changes in terms of stress and/or stød. This can be illustrated by V shortening in the neuter form of adjectives where the loss of the second µ co-occurs with the loss of stød, cf. [ly:Ɂs] – [lysd̥] (bright), [fʁi:Ɂ] – [fʁid̥] (free). We can also encounter V shortening in those cases where the addition of derivational suffixes results in moving stress patterns, cf. (23). In idioɁt > idiotiɁ (idiot-ism) the loss of stress leads to the shortening of [o:Ɂ], which loses its stød as a consequence. In the examples above, the loss of stød is expected and automatic, given that the shortening deprived the actual words of the necessary phonological structure known as stød-basis. Interestingly, some instances of stød-loss do not follow from such structural considerations. In [sø:Ɂ] + [manɁ] > ['sømanɁ] (sea + man = sailor) the ambisyllabic C could take over as stød-bearing unit, but it does not. In fact, stød is often lost on the anterior constituent of compounds irrespective of V shortening, “when it comes to Danish words that are well established as first parts of compounds” (Basbøll (1999:360)). Examples of such partially lexicalized compounds include soɁl + skinɁ > 'sol,skinɁ (sun-shine), viɁn + glas > 'vin,glas (wine-glass) etc. These instances of stød-loss are reminiscent of the loss of the connective tone in lexicalized compounds in Sw&No. As we will see below, lack of stød (similarly to Accent 1) is the marked member of the opposition and can be used to signal lexicalized, unusual and foreign patterns.

64 The unwillingness to postulate long V phonemes may stem from the fact that Danish exhibits an unusually high number of full vowels (at least 12, cf. Basbøll (2005:ch2.2)). The inventory is obviously more manageable without long V phonemes, yet recall fn. 30.
65 Not always. V shortening before sonorants keeps the stød-basis intact, so in such cases the glottal feature is preserved but it is realized on the postvocalic C instead. Examples include stor (big), hvid (white) etc., cf. Martinet (1937b:264) and Basbøll (2003:20).

Lengthening
A process known as schwa-assimilation (ə-ass) is the primary source of segmental lengthening in contemporary Danish. The phenomenon entails that schwa becomes segmentally identical to the most sonorous adjacent segment, while keeping its syllabicity, cf. Basbøll (2005:293). Accordingly, ə-ass can act both regressively 66 and progressively depending on the segmental make-up of the word. The exact details of this very complicated process need not concern us here. We should instead restrict our attention to establishing what kinds of lengthened segments ə-ass can give rise to.

(42) Some examples of schwa-assimilation 67
a. V:əC > V:µC – tien ['tsi:ən] > ['tsi:in] (keeping silent)
b. V:əC > V(:)Cµ – givet ['g̊i:əð] > ['g̊i:ð̩] > ['g̊iðð̩] 68 (given)
c. VəC > VµC – sofaen ['so:faən] > ['so:ˌfӕ:Ɂn] (the sofa)
d. V:Cə > V:Cµ – time ['tsi:mə] > ['tsi:m̩] (hour)
e. V:Cə > V:µC – mase 69 ['mӕ:sə] > [mӕ:ӕs] (mash)
f. VCə > VCµ – helle ['hɛlə] > ['hɛl̩] (refuge island)
g. VCə > VC – masse ['masə] > [mas] (mass)
h. Və > Vµ – hyppige ['hyb̥iə] > ['hyˌb̥i:Ɂ] (frequent - pl/def)

Vµ and C µ in (42) indicate the phonemes with which the µ of / ə/ is associated upon its segmental deletion, so the lack of the µ-symbol does not entail that a given segment is non- moraic. This is obvious as far as vowels are concerned: a short V is monomoraic while a long V is bimoraic. The effects of ə-ass add one µ to the original length of a V, so in (42a, e) they manifest themselves in an overlong (i.e. trimoraic) V. Similarly to Sw&No, a Danish long V cannot occur in unstressed position. When a short V is lengthened as in (42c, h), the resulting long, bimoraic V acquires secondary stress and consequently becomes eligible for, and surfaces with, stød.
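
The mora arithmetic described for the vocalic cases can be illustrated with a minimal sketch. The `Vowel` class and the function name are hypothetical, and the sketch covers only what is stated above: ə-ass adds one µ to the target V, and a formerly short, unstressed V that becomes bimoraic acquires secondary stress and stød. Other configurations are left unchanged because the text makes no claim about them.

```python
from dataclasses import dataclass

LENGTH_LABEL = {1: "short", 2: "long", 3: "overlong"}


@dataclass
class Vowel:
    moras: int          # 1 = short, 2 = long
    stress: str         # "primary", "secondary" or "none"
    stod: bool = False


def absorb_schwa_mora(v: Vowel) -> Vowel:
    """Return the vowel after it has taken over the mora of a deleted schwa."""
    moras = v.moras + 1
    stress, stod = v.stress, v.stod
    if v.moras == 1 and v.stress == "none":
        # A long V cannot be unstressed, so the lengthened V is promoted
        # to secondary stress and surfaces with stød, as in (42c, h).
        stress, stod = "secondary", True
    return Vowel(moras, stress, stod)


# (42a) tien: a long stressed V becomes overlong.
tien = absorb_schwa_mora(Vowel(moras=2, stress="primary"))
# (42h) hyppige: a short unstressed V becomes long, stressed and stød-bearing.
hyppige = absorb_schwa_mora(Vowel(moras=1, stress="none"))
```

The two calls mirror the contrast drawn above: in the first case only the length of the V changes, whereas in the second the added µ triggers stress enhancement and stød-addition.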

66 Directionality is one of the main differences between the respective versions of ə-ass in Standard Danish and Standard British English, which otherwise can be construed as essentially the same phenomenon, cf. Ács & Törkenczy (1986).
67 The examples in (42) are borrowed from Basbøll (2005:ch11).
68 The stressed V shortens before the resulting syllabic glide.
69 Note that in (42e) /ə/ assimilates to a non-adjacent segment. Basbøll (2005:317) argues that this cannot occur in (42g) since in masse the stressed V is specified for shortness.

The remaining examples (in which / ə/ assimilates to a C) reveal further interesting patterns. The directionality of ə-ass in (42b) indicates that from a clearly phonetic point of view the alveolar , /ð/ is more sonorous than /i/, a high front V (ibid: 297). A comparison between the English loanword team ['t si:m] and (42d) illustrates that the syllabicity vs. non-syllabicity of a sonorant C is the only surface distinction in certain minimal pairs. ə-ass is merely favoured in (42d, f), as opposed to (42a, b, c), where it is obligatory (ibid: 293). Note that if the phenomenon widens its scope, the lexicalized results will transform URs in interesting ways. Finally, we should point out the different behaviour of various consonants. The data suggest that ə can assimilate to sonorants (42d, f) but not to obstruents (42e, g). This divergent behaviour of obstruents is in line with the proposition that only sonorants can be moraic in Danish (ibid: 270). Given that a long C is inherently moraic, /s/ in (42e, g) cannot undergo compensatory lengthening. However, there is no agreement in the literature. Ács (1996:36) claims that once ə-ass has taken place, passe (to fit) can be transcribed as [pas:] (i.e. with a moraic C 70 ), cf. also Kusmenko (2005:139).

Moraic consonants The analysis of Danish consonants in terms of moras is indeed somewhat problematic. Moras (being units of σ weight) are a convenient tool to express the difference between light (monomoraic), heavy (bimoraic) and superheavy (trimoraic) syllables. It is important to keep in mind that (traditionally) σ weight merely reflects the overall segmental quantity of a σ and is not used as an expression for prominence relations, even if the principle of Stress-to-Weight often blurs the distinction. It follows that a long C is always moraic, but the reverse is not necessarily true. Basbøll (2005:269) cites Latin lectus (bed) to support his approach in which a moraic C need not be long. We have, however, no way of knowing whether such postvocalic consonants were indeed short in Classical Latin. Note that this is the very environment for which Riad (2013:166) assumes weight (i.e. length) by position in Swedish. As σ weight is determined by the segmental make-up of the rhyme, a moraic C must either be final or ambisyllabic. (Recall that an ambisyllabic or final C is not necessarily moraic.) An intervocalic C before / ə/ patterns in parallel with final consonants 71 , which links it to the preceding V, while the onset principle links it to the following / ə/. The short intervocalic C of disse (these) is therefore best analyzed as ambisyllabic (ibid: ch9.2), and there is general consensus concerning its non-moraic status. It follows from the discussion so far that a language that lacks C length should not be analyzed as having moraic consonants . As far as I understand, the only reason for neglecting this principle in Danish is to be sought in the distribution of stød. Given that stød either occurs on a long V or on a sonorant C following a short V, it is obviously a property of the rhyme. 
Basbøll (2005:271) proposes that an analysis in terms of suprasegmental features (moras) can lead to a unified description, in which “the notion of stød-basis can be dispensed with”. Stød is essentially “a signal of the second mora of a syllable”. As we will see in 2.3.3, extra-prosodicity is a central concept in Basbøll’s approach. The fact that a C bearing stød is not systematically longer 72 than a corresponding stød-less C indicates that the moraic analysis is rather arbitrary. This leads Basbøll (ibid) to admit that the moraic analysis is not real psychologically, but it offers “a convenient linguistic description”. Given that consonantal length in Danish does not pattern as “expected”, Basbøll (2005:305) assumes that “the mora is a unit of quantity for vowels, but not for consonants”. Despite these shortcomings of the moraic analysis, I will in the following discussions sometimes employ the convenient term “bimoraic” σ, when I talk about a σ with stød-basis.

70 It follows from (42) that a long or moraic C created by ə-ass is always syllabic. Long non-syllabic consonants, on the other hand, can be found across grammatical boundaries like in sollys (sunlight).
71 Cf. the allophones in gade ['g̊ӕ:ðə] (street), kage ['khӕ:jə] (cake), mad [mað] (food), dag [d̥ӕ:Ɂ] (day).
72 In those cases where stød is realized as a (and not just ) the C bearing stød is even shorter than its stød-less pair, cf. the citation forms of hun (she) and hund (dog).

2.3.2. Stress The stress patterns of the three languages under investigation exhibit a higher degree of similarity than any other aspect of their prosodic systems. As a matter of fact, a majority of those instances when Danish stress placement deviates from Sw&No can be related to some further prosodic differences such as the divergent manifestations and functions of tone and stød or some differences in quantity. To take an example, while Danish monomoraic syllables can be assigned any degree of phonological stress, they are always unstressed in Sw&No. Bimoraic syllables behave in a uniform fashion in the three languages since they can never appear in unstressed position. In other words, Stress-to-Weight in Danish (which assigns at least secondary stress to a bimoraic domain) clearly dominates Weight-to-Stress as can be exemplified by words like ['janu, ɑ:Ɂ] (January) or ['gi,t sɑ:Ɂ] (guitar). These two words are pronounced with final stress in Sw&No, where both constraints are undominated.

2.3.2.1. Degrees of stress We argued in 2.2.4.2 that it was sufficient to postulate three degrees of stress in word level descriptions for Sw&No. Basbøll (2005:ch12) takes the same position arguing that it is not appropriate to distinguish between weak and strong secondary stress in a phonological sense. Although he provides a uniform definition for secondary stress calling it “the degree of stress below primary which is strong enough to have stød” (ibid: 333), he acknowledges that it is a heterogeneous category given that it can result either from stress demotion as in compounds or from the enhancement of unstressed syllables as in (42c, h). This essentially amounts to saying that secondary stress is a derived concept, which features in surface-phonological descriptions, while lexical stress as such is a binary category (ibid: 386). Such a position seems to be valid for Sw&No as well. The view that only one degree of stress is indicated in the UR of all three languages entails that the occurrence of secondary stress (as opposed to that of primary) is completely regular and derivable by rules. These rules , however, differ substantially between Danish on the one hand and Sw&No on the other. We have seen that stress enhancement (tertiary > secondary) is brought about by ə-ass (and is accompanied by stød-addition) in Danish, while it is due to the suffixation of lexicalized compounds (followed by a tonal shift to Accent 2) in Sw&No, cf. fn. 40. The demoted stresses of (multiple) compounds also behave differently in tonal and non-tonal dialects/languages as we anticipated in (27). Connectivity in Sw&No marks the domain of word accents by assigning secondary stress to the last component of the posterior constituent. In Danish, however, where the domain of stød is the stressed σ, we can expect the assignment of secondary stress to reflect the grammatical structure of compounds. 
This is indeed the case, however, the number of secondary stresses is not restricted to one as in Sw&No, cf. (43b, c, e). In a regular compound the most prominent σ of the anterior constituent is distinguished with primary stress and all other syllables (of the anterior constituent) are unstressed irrespective of how complicated its grammatical structure is, cf. (43a, b and e, f). Although it must be added that “heavy parts of compounds… under certain conditions are only reduced to secondary stress” (Basbøll (2005:334)), cf. (43g).


This means that secondary stress is normally the property of the posterior constituent, whose structure is expressed prosodically given that each of its major constituents bears secondary stress. In Basbøll’s (1999:355ff) terminology such major constituents are connected by marked compound boundaries , while all other boundaries are (prosodically) unmarked. Consequently, we can define lexicalization as a process whereby a marked boundary turns into an unmarked boundary, cf. (43c, d and e, f). It follows from the distribution of secondary stresses that when a lexicalized compound is affected by stress reduction it is always the stress pattern of the posterior constituent that changes. This is illustrated with a few examples borrowed from Basbøll (2005:ch12, 16) in (43) below, where a hyphen represents an unmarked, while a plus sign a marked boundary. The numbers 1, 2, 3 denote different degrees of stress. See Martinet (1937b:256ff) for further examples.

(43) Stress patterns in Danish compounds
a. ud-salgs + dame (female shop assistant during sales): 1-3+23
b. under + salɁgs-cheɁf (deputy sales manager): 13+2-2
c. silke + tør-klæ(Ɂ)de (silk scarf): 13+2-23
d. silke-tør-klæ(Ɂ)de (silk scarf): 13-3-23
e. for-bunds + dom-stoɁl (federal court): 1-3+2-2
f. for-bunds-dom-stoɁl (federal court): 1-3-3-2
g. under-viɁsnings+minister (minister of education): 13-23+323

It is quite illuminating that unstressed salg in (43a) lacks stød, while the same morpheme with secondary stress preserves it in (43b). The “fact that stød is lost as a consequence of stress reduction” (Basbøll (2005:280)) can be further illustrated with various examples of unit accentuation, which entails that all stresses preceding primary stress are deleted within a domain. For instance, when primary stress is attracted to the adjectival derivational suffix –agtig, the stem loses the stød it would feature in isolation, cf. barɁn+agtig (childish). The same pattern can be observed in certain lexicalized compounds, cf. rødɁ (red) vs. rød'bede (beetroot). Similarly, when the main stress of a VP is carried by the object, then the preceding verb (the head of the VP) usually loses its stress and therefore its stød as well. Basbøll (2005:523) provides the illustrative examples hun spiser 'ost (she eats cheese) and hun 'spiɁser en 'ost (she eats a cheese), where the stressed verb in the second utterance surfaces with stød, while the same unstressed verb in the first example lacks it. As we have seen, Swedish phrasal verbs exhibit corresponding cases of unit accentuation with a tonal shift to Accent 1. Finally, it may be concluded that our analysis in 2.2.4.2 seems to be corroborated by the notion of unit accentuation and the relevant Danish data. Recall that we have argued that the tonal opposition is not (necessarily) maintained in non-focal position. The verb in hun spiser ost can be described as having (non-focal) tertiary stress (using Cruttenden’s four levels), and accordingly appears without stød. The prosodic features of tone and stød are intrinsically intertwined with the notion of linguistic prominence and cannot normally be realized under tertiary stress, which in a binary framework corresponds to an unstressed position.
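The dependence of stød on the degree of stress in (43) can be restated as a small check. The following is only a minimal sketch under simplifying assumptions: the function name `surfaces_with_stod` and the integer coding of stress degrees (1 = primary, 2 = secondary, 3 = unstressed) are introduced here for illustration, and the check ignores stød-basis as well as the optional stød given in parentheses in (43c, d).

```python
def surfaces_with_stod(lexical_stod: bool, stress_degree: int) -> bool:
    """Return True if a syllable can keep its stød at the given stress degree.

    Assumed coding (hypothetical, for illustration only):
    1 = primary, 2 = secondary, 3 = unstressed.
    Stød is taken to survive only under primary or secondary stress.
    """
    if stress_degree not in (1, 2, 3):
        raise ValueError("stress degree must be 1, 2 or 3")
    return lexical_stod and stress_degree <= 2


# salg carries degree 3 in (43a) ud-salgs+dame and degree 2 in (43b) under+salgs-chef.
examples = {"(43a) salg": 3, "(43b) salg": 2}
for label, degree in examples.items():
    print(label, surfaces_with_stod(True, degree))
```

On this coding the morpheme salg loses its stød in (43a) and keeps it in (43b), which is the contrast discussed above.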

2.3.2.2. Default stress

Given that Danish and its Scandinavian neighbours have been exposed historically to very similar impacts from other languages, the location of Danish primary stress calls for a partition of the lexicon along the binary feature [±native] similarly to what we proposed in


2.2.3 for Sw&No. Recall that this is not an etymological approach given that many foreign words are classified as belonging to the [+native] section of the vocabulary. Basbøll (2005:396ff) suggests that stress placement in simplex lexemes is taken care of by the Default Stress Rule (DSR) and by the French Stress Principle (FSP). DSR assigns stress to the first σ with a branching nucleus or (in words without long vowels) to the last σ with a branching rhyme. As expected, FSP stresses the last full⁷³ V of a given lexeme. This means, however, that the two rules overlap to a certain extent since the stress patterns of words like [kʰa'lif] (caliph), [ɑm'pʰulɁ] (ampoule), [sa'tˢɛŋ] (satin) etc. can be derived with both. It would be interesting to know on what grounds Basbøll decides whether such lexemes follow FSP or DSR. Moreover, neither rule can account for words with non-final stress lacking both codas and long vowels (such as gummi (rubber)). In these cases, and in a considerable number of words for which both rules make false predictions, we have to resort to lexically specified stress on the (ante)penultimate σ. I am in total agreement with Basbøll (2005:395) when he claims that there is “nothing wrong in considering stress placement as being lexical”; however, I would like to reject the problematic assumption that the “French” subsection of the lexicon acquires its stress by a rule and not by lexical marking. It obviously has to be specified whether a certain word follows FSP or DSR. As far as the language learner is concerned, it is immaterial whether we speak of such specifications or of lexical marking “proper”: both require memorization. The reader has certainly observed that the word default is used in a different sense in Basbøll’s DSR and in the present thesis. What I call default stress is a stress pattern defined by word edges, which is strong enough to exert analogical pressure on lexical items with a different stress placement.
We saw in 2.2.3 that the primary reason for regarding initial stress as default in Sw&No was the observation that it is the target of certain assimilatory and analogical processes and not the historical fact that it is a characteristic feature of native items. Given that unmarked patterns of various phonological levels presuppose each other, the movement of stress to the initial σ is typically accompanied by accentual adjustments. Using the terminology of OT this can be expressed as a requirement that once we violate faithfulness constraints we should gain as much as possible in terms of markedness. This means that stress patterns are not expected to undergo change without further prosodic benefits. The question we now have to ask ourselves is whether Danish displays some corresponding tendencies, which would allow us to describe it as having default stress. If we manage to isolate some words whose stress pattern is altered so that it can surface with stød, then we can answer in the affirmative. Nevertheless, I am not aware of such examples. This state of affairs (if it does not stem from my ignorance) can be explained as follows. While the distribution of word accents is strongly intertwined with different stress patterns, it is typically⁷⁴ not affected by the realization of complementary length in the stressed σ. In Danish, however, the distribution of stød is closely tied to segmental length and sonority (i.e. stød-basis), while the location of primary stress is of less importance given that there is no position within the word (in relation to word edges) where stød is ruled out on structural grounds. It can usually be encountered in a bimoraic stressed σ if it is final or antepenultimate in a lexeme, cf. the Danish pronunciation of Latin insulae ['enɁsuˌlɛ:Ɂ] (island, sg. gen/dat or pl. nom); however, it can also occur (through lexical marking) in penultimate syllables as in højɁde (height), lænɁgde (length), orɁdre (order) etc.
The latter is the very position for which Basbøll’s (2003:13) Non-Stød Principle predicts lack of stød in minimal words (i.e. words without (semi-)productive endings). It means that stød in penultimate position is rare, but by no means impossible.
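The overlap and the gaps of the two stress rules discussed at the beginning of this section can be made concrete with a small sketch. It paraphrases DSR and FSP as they are summarized above and should not be read as Basbøll's own formalization. The `Syllable` record, the function names `dsr` and `fsp`, and the syllabifications of the two test words are assumptions made for illustration; restricting DSR to full-vowel syllables is likewise an assumption.

```python
from typing import List, NamedTuple, Optional


class Syllable(NamedTuple):
    text: str
    long_vowel: bool = False  # branching nucleus
    coda: bool = False        # short vowel plus coda = branching rhyme
    schwa: bool = False       # True if the vowel is not a full vowel


def dsr(word: List[Syllable]) -> Optional[int]:
    """Default Stress Rule as paraphrased in the text.

    Stress the first syllable with a long vowel; in words without long
    vowels, stress the last syllable with a coda. Returns None when the
    rule has nothing to say. Schwa syllables are skipped (an assumption).
    """
    for i, syl in enumerate(word):
        if syl.long_vowel and not syl.schwa:
            return i
    for i in range(len(word) - 1, -1, -1):
        if word[i].coda and not word[i].schwa:
            return i
    return None


def fsp(word: List[Syllable]) -> Optional[int]:
    """French Stress Principle: stress the last syllable with a full vowel."""
    for i in range(len(word) - 1, -1, -1):
        if not word[i].schwa:
            return i
    return None


# Hypothetical syllabifications, for illustration only.
kalif = [Syllable("ka"), Syllable("lif", coda=True)]
gummi = [Syllable("gu"), Syllable("mi")]

for name, word in (("kalif", kalif), ("gummi", gummi)):
    print(name, "DSR:", dsr(word), "FSP:", fsp(word))
```

Under these assumptions both rules select the final syllable of kalif, so nothing in the form itself tells us which rule applies, while for gummi DSR makes no prediction and FSP wrongly selects the final syllable. Both outcomes are consistent with the claim that such items need lexically specified stress.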

73 This restriction is included to make sure that “a final schwa syllable cannot fall under this rule” (ibid: 396).
74 With the possible exception of increased sonority (longer vocalic content) favouring Accent 2 in Norwegian compounds with a monosyllabic first constituent, cf. 2.2.4.6.

Basbøll (2005:398) considers the penultimate stress of Danish oregano (id), a word which is pronounced with antepenultimate stress in both Italian (the donor language) and German (the language through which the word might have reached the Danes). He suggests that the surfacing stress pattern can be accounted for by the fact that the target position typically lacks stød, which he calls “a strong Danish or perhaps anti-foreign feature” (ibid). Consequently, the loanword can enter the language without [+native] features due to its altered stress pattern. However, we are probably not entitled to attribute the altered stress pattern of oregano to a ‘proper’ stress shift. This word is pronounced with penultimate stress in many European languages (including Dutch, Swedish and Russian), which is no doubt due to the general misconception that Italian always has penultimate stress. So this stress pattern has probably nothing to do with the lack of stød. We can plausibly infer from the lack of stress shifts during the course of loanword adaptation that contemporary Danish lacks default stress as such.

2.3.3. Stød

Similarly to the tonal accents of Sw&No, the occurrence of stød is subject to considerable dialectal variation. Some areas like the island of Bornholm lack the (laryngeal) opposition altogether (Grønnum (1988:26)), while some other dialects like East Slesvig or the island of Rømø feature a corresponding tonal opposition instead, cf. Ejskjær (2003:28). Even in the remaining varieties, where stød is an integral part of , we find various types of stød with different distributions and phonetic manifestations. The dialects of West Jutland and North Funen for example exhibit a strong stød-like feature called v-stød (preglottalized stops) in syllables without stød-basis. When v-stød co-occurs with standard stød, the latter often has a more restricted distribution and is absent in words like hals (throat), hjælp (help), kant (side) etc. in the dialects of Jylland and Fyn, cf. Perridon (2006:42). Stød without stød-basis (short-vowel stød) can also be encountered on the island of Zealand. This type of stød also co-occurs with standard stød, participates in productive alternations but is restricted to polysyllables, cf. Iosad (2015). However, for considerations of space, we will completely ignore the question of dialectal complexity in the analyses to follow.

2.3.3.1. Distribution and markedness

We have seen that the multiple senses of markedness can lead to contradictory claims about the tonal opposition. An approach rooted in phonetic complexity views Accent 2 as marked, while a frequency-based interpretation that takes the dynamics of language change into consideration arrives at an opposing view. Interestingly, we can find a similar contradiction when it comes to stød. Those who work in an autosegmental framework tend to focus on (autosegmental) representation and are therefore liable to adopt a phonetic mindset. Riad (2000:267) depicts stød as the allophonic reflex of an HL sequence realized on one σ, while non-stød is construed as the same tonal contour distributed over two syllables. He argues that “one-to-many is a more marked configuration than one-to-one” (ibid) so stød must be the marked member of the opposition since a σ bearing stød is linked to two tones not one. Even those who do not analyze stød as a tonal phenomenon (e.g. Basbøll (2005:85)) must admit that stød represents some additional complexity, which corresponding non-stød segments lack. Nevertheless, we have claimed that such arguments are irrelevant as far as markedness relations are concerned given that complexity is not a reliable diagnostic of phonological markedness. We should look at frequency instead. As it turns out, the distribution of stød

makes it clear that it is easier to “[formulate principles] for the absence of stød in bimoraic syllables… than for its presence” (ibid: 267), which renders non-stød the marked member of the opposition. In what follows I will provide evidence for this assumption by briefly outlining the stød pattern of modern Danish. It has long been known that the occurrence of stød is bound to certain phonological templates, which go well beyond the bimoraic requirement of stød-basis. These characteristics can be used to establish separate word-types (which in turn can be divided into sub-groups), some of which favour the glottal gesture, while others seem to block it. As far as uninflected, underived simplex lexemes are concerned, Danish grammars traditionally distinguish between α-, β- and γ-words such that the stressed σ of an α-word is word-final, while that of a β-word is followed by a σ that contains /ə/. The stressed σ of a γ-word is followed by at least one σ with a full V. Heger (1980:95) points out that such a partition of the vocabulary deprives us of powerful generalizations by keeping certain groups of words with identical stød-pattern sharply apart. This is no doubt a relevant shortcoming when it comes to formulating generative rules for the appearance of stød. However, adopting this traditional approach still enables us to establish the main distributional characteristics and as such the relevant markedness relations of the opposition. The arrangement of (44) below relies on Heger (1980) and indicates whether a given word-type is predominantly associated with the presence or the absence of stød.

(44) Traditional word-types⁷⁵ and the distribution of stød
a. α-words
   i. VC# stød: hjemɁ (home), halɁ (hall)
   ii. VCC# stød: prinɁs (prince), salɁt (id)
   iii. V:C₀# stød: blåɁ (blue), guɁl (yellow)
b. β-words
   i. σə non-stød: tale (to speak), handle (to trade)
   ii. (σ)₁'σə VERB stød: udtaɁle (to pronounce)⁷⁶, behanɁdle (to treat)
   iii. (σ)₁'σə NOUN non-stød: udtale (pronunciation), vanilje (vanilla)
   iv. σəl, σər stød: faɁbel (fable), vinɁter (winter)
   v. σən non-stød: gluten (id), hvilken (which)
c. γ-words
   i. (σ)₀'σσ non-stød: bolig (residence), desperado (id)
   ii. (σ)₀'σσσ stød: delirɁium (id), kalorɁie (calorie)
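The traditional three-way partition that underlies (44) can be sketched as a simple classifier. This is only an illustration of the definitions given above: the function name `word_type`, the coding of each syllable as "full" or "schwa", and the stress positions supplied for the sample words are assumptions. In particular, a word whose stressed σ is followed by both a full-vowel σ and a schwa σ is counted as a γ-word here, which matches the placement of kalorie in (44cii) but goes beyond what the definitions state explicitly.

```python
from typing import Sequence


def word_type(vowels: Sequence[str], stressed: int) -> str:
    """Classify a simplex word as 'alpha', 'beta' or 'gamma'.

    vowels:   one entry per syllable, either "full" or "schwa"
    stressed: index of the syllable bearing primary stress
    """
    if not 0 <= stressed < len(vowels):
        raise ValueError("stressed index out of range")
    following = vowels[stressed + 1:]
    if not following:
        return "alpha"   # the stressed syllable is word-final
    if any(v == "full" for v in following):
        return "gamma"   # at least one following full vowel
    return "beta"        # only schwa syllables follow


# Hypothetical syllable codings for words taken from (44).
samples = {
    "hjem": (["full"], 0),
    "tale": (["full", "schwa"], 0),
    "behandle": (["full", "full", "schwa"], 1),
    "bolig": (["full", "full"], 0),
    "kalorie": (["full", "full", "full", "schwa"], 1),
}
for word, (vowels, stressed) in samples.items():
    print(word, word_type(vowels, stressed))
```

The classifier only reproduces the partition itself. Whether a given type favours stød still has to be read off (44), which is the limitation Heger points out.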

A close examination of the data⁷⁷ in (44) reveals why Heger (1980) is critical of the traditional approach: as non-stød is restricted to penultimate stress it is clearly gratuitous to differentiate between (44a, b and c). Yet I must hasten to add that apart from β-words displaying anacrusis (44bii, iii) most word-types exhibit exceptions. Some of these are merely sporadic such as the lack of stød in words like Nils (given name), team (id), Lellinge (place name), engel (angel), broder (brother) or the unexpected glottal gesture in orɁdre (order), våɁben (weapon) and kogɁnak (cognac). Such cases are not so numerous and can be reasonably handled with lexical marking. Exceptions to (44ai) e.g. ven (friend), tal (number) etc. are much more frequent but are still manageable: only one word out of seven has a deviating (i.e. stødless) pronunciation, cf. Grønnum (2005:225).

75 Words not fulfilling the requirements of stød-basis are not included in (44).
76 I am aware that udtale is stressed as a compound both as a verb and as a noun, so strictly speaking this is not an instance of anacrusis. The similarity to behanɁdle is confined to the fact that the stød-bearing σ is not initial in the word.
77 Many examples of this section are borrowed from Heger (1980) and Basbøll (2003).

Now in the light of the above it is indeed hard to escape the conclusion that the absence of stød on bimoraic syllables is a restricted, marked pattern. Heger (1980:85f), in fact, comments on a number of stødless items (such as interview (id), bonvivant (id) etc) claiming that they are loanwords that have “retained the absence of stød”. This is an indication that loanword adaptation in Danish also involves prosodic means, a practice which is quite reminiscent of the analogical spread of unmarked Accent 2 in Sw&No. The point is that the direction of analogical pressure (e.g. the spread of stød to unassimilated loans) is a reliable signal of markedness relations. Examples for such analogical changes will be provided in 2.3.3.2. Given that most word-types of (44) exhibit a number of exceptions, it seems more accurate to invoke principles than to postulate stød-rules in the traditional sense. The main principle at work here can be referred to as the Non-stød principle (NSP), cf. Basbøll (2003:12ff), and is meant to account for the lack of stød on penultimate bimoraic syllables. Recall, however, that the scope of (44) above is restricted to uninflected, simplex forms. It is a legitimate expectation that syllabic suffixes should interact with the NSP in some way or another, given that e.g. a final bimoraic σ (a preferred stød-locus) can be transformed into a penultimate one and as such be eligible to the NSP. As a matter of fact, the stød-pattern of many mono- and disyllabic base forms is altered upon suffixation, while certain other lexical items show no sign of alternation, as is demonstrated in (45) below.

(45) The effects of suffixation
a. loss of stød
   i. hunɁd (dog) + e (plural) > hunde
   ii. dumɁ (stupid) + e (plural) > dumme
   iii. prinɁs (prince) + er (plural) > prinser
   iv. talɁ (speak – imp.) + e (inf.) > tale
b. new stød
   i. ven (friend) + en (sg. def.) > venɁnen
   ii. Clinton (surname) + er (plural) > ClintonɁner
   iii. tale (to speak) + r (present) > talɁer
c. no change
   i. ven (friend) + er (plural) > venner
   ii. hænde (to happen) + r (present) > hænder
   iii. talɁ (speak – imp.) + r (present) > talɁer

Although the examples in (45a) meet our expectations according to which penultimate syllables should surface without stød, the data in (45b) are rather difficult to interpret in terms of the NSP. Basbøll (2003) proposes that the solution to the problem lies in the realization that suffixes should be split into two separate classes along the dimension of productivity: fully productive endings (FPE) are invisible as far as the NSP is concerned (cf. 45b), while others (unproductive and semi-productive endings; UPE, SPE) generally conform to it. Accordingly, he distinguishes between the categories put forward in (46) below and claims that the general form of the NSP has the Basic word as its domain (ibid: 16).

(46) Word-types in Basbøll (2003)
a. min-word = min-stem + UPE
b. Basic word = min-word + SPE
c. max-word = Basic word + FPE
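The nesting in (46) and its interaction with the NSP can be sketched as follows. The sketch is a deliberately crude illustration and not Basbøll's formalization: the names `Ending` and `nsp_blocks_stod` are hypothetical, the stem is assumed to be a monosyllable with stød-basis, and the productivity label of each ending is simply supplied by hand.

```python
from typing import List, NamedTuple


class Ending(NamedTuple):
    form: str
    cls: str        # "UPE", "SPE" or "FPE", supplied by the analyst
    syllabic: bool  # True if the ending adds a syllable


def basic_word_endings(endings: List[Ending]) -> List[Ending]:
    """Keep only the endings inside the Basic word, i.e. drop all FPEs."""
    return [e for e in endings if e.cls in ("UPE", "SPE")]


def nsp_blocks_stod(endings: List[Ending]) -> bool:
    """True if the stressed stem syllable is non-final within the Basic word.

    Assumption: the stem is a monosyllable with stød-basis, so any syllabic
    UPE or SPE pushes its stressed syllable out of final position.
    """
    return any(e.syllabic for e in basic_word_endings(endings))


# The class labels below are inputs of the sketch, not results.
print("hund + e :", nsp_blocks_stod([Ending("e", "UPE", True)]))
print("hal + er :", nsp_blocks_stod([Ending("er", "FPE", True)]))
print("sum + er :", nsp_blocks_stod([Ending("er", "UPE", True)]))
```

The design makes one property of the model visible: for a given stem the outcome depends entirely on the label attached to the ending, so the same plural suffix –er yields stød when it is tagged as an FPE and non-stød when it is tagged as a UPE.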


In Basbøll’s (2003) approach lexical marking is reserved for those non-alternating items whose stød pattern defies the predictions of the NSP. This is an appealing claim that resonates well with the position we took in connection with epenthesis (cf. 2.2.4.3) and the tone of monosyllabic words (cf. 2.2.4.4) in Sw&No: a feature is part of the UR if and only if it shows no sign of alternation (i.e. it is present in all relevant forms of the word). As a consequence, the lack of stød in (45bi) cannot be blamed on lexical marking and Basbøll (2003:17ff) therefore resorts to extrametricality instead. It is assumed that “the extra-prosodic consonant must be final in the max-word… and since obstruents are never moraic in Danish, only sonorant consonants can be lexically specified for extra-prosodicity” (ibid). This explains why the generalization in (44aii) has far fewer exceptions than the one in (44ai): words not obeying the latter end in an extra-prosodic C and, unlike the exceptions to (44aii), are not specified for [-stød]. Basbøll (2003:29) is aware that both cases require memorization from the language learner and classifies exceptions to (44ai) and (44aii) as different instances of lexemes undergoing what he calls Lexical Non-stød (LNS). Nevertheless, it is noteworthy that the alternations that compel us to adopt extra-prosodicity are all limited to the addition of the sg. def. article attached to α-words of the type (44ai). The word-final nasal of (45bii) is not lexically specified for extra-metricality, since it follows from the stress conditions of the word that an unstressed final σ cannot end in a moraic C. So the reader may wonder whether it is worth putting forward such semi-arbitrary claims about the alleged prosodic invisibility of certain consonants instead of owning up to the fact that the observed alternation (45bi) is a morphological rather than a phonological regularity. We would run into additional problems if it were otherwise.
The lack of stød in (45ci) would be difficult to explain, since a strictly phonological approach predicts that the nasal is no longer extra-prosodic upon suffixation, which results in a bimoraic stressed σ. All this should lead to the appearance of stød in much the same way as can be observed in (45bi). The domain over which extra-prosodicity is assumed to hold is another arbitrary and therefore problematic aspect of the phenomenon. Basbøll (2003:24) compares imperatives (stødɁ (thrust)) with nouns (stød (thrust)) of the same stem and argues that “extra-prosodicity cannot apply to verbs since verbal lexemes (i.e. the lexeme entry form) are in the infinitive, and extra-prosodicity, which only concerns consonants, is irrelevant for the infinitive since it always ends in a vowel”. Now this is a rather flimsy explanation featuring two dubious claims. Why is the author so certain that the infinitive is more basic than the imperative? A comparison between (45biii) and (45ciii) aptly illustrates that the analyst is much better off if he chooses the imperative as a point of departure. Furthermore, I have difficulty seeing why extra-prosodicity must be restricted to the lexeme entry form. This is clearly arbitrary given that other researchers freely apply the concept to derived forms, cf. Hayes (1982:241), who regards adjectival suffixes in English as extrametrical.

The problematic aspects of graded productivity

The idea of graded productivity is an attractive proposal inasmuch as it acknowledges the highly lexicalized nature of the opposition. It also helps to explain certain disparities between tonal patterns on the one hand and stød assignment on the other. Although stød generally corresponds to Sw&No Accent 1, the model rightly predicts that suppletive and irregular disyllabic⁷⁸ forms such as ældre (older), værre (worse), større (bigger), længere (longer) etc. will surface without the glottal gesture, despite the fact that they display Accent 1 in most tonal dialects. However, the picture is more varied when it comes to irregular plurals. Terms of kinship are borne out as predicted, cf. mødre (mothers), brødre (brothers), fædre (fathers),

78 Bimoraic monosyllables are usually realized with stød as expected, cf. ælɁdst (oldest), lænɁgst (longest) etc.

yet other mutated disyllabic nouns exhibit stød, which is not compatible with our expectations based on the NSP, cf. bøgɁer (books), fødɁder (feet), tænɁder (teeth), hænɁder (hands) etc. Note that although Basbøll (2003:14) claims that –er (plural) is an FPE, this clearly cannot be the case when the suffix is realized jointly with i-mutation. The graded productivity of endings does not correspond as smoothly to the distribution of stød as Basbøll makes out. He puts forward some dubious claims and contradicts himself on a number of occasions. A few examples will suffice. It is assumed that “the option SPE is not available for simplex nouns” (ibid: 14) because “an SPE can be distinguished from a UPE only if the base to which the ending is added is not coextensive with the basic word… [and verbs are] the only word-class in Danish where the morphological base (the min-stem) does not always occur as an independent word” (ibid: 11). This is a formal fallacy given that it does not follow from the above that SPEs are ruled out for nouns. They may well exist but they are indistinguishable from UPEs, and that is a difference. Yet even if we allow for Basbøll’s claim and suppose that an SPE cannot be attached to a noun, we find inconsistencies. The author states that, as far as the plural of nouns is concerned, “ə is a UPE… and must be so since the FPE is ər” (ibid: 14). He then proceeds to claim that words of the type (45aiii) lose their stød before a lexicalized syllabic ending (ibid: 26). Such words indeed regularly drop their stød in the plural, which is always formed by the addition of –er. As matters now stand the plural suffix –er is argued to be both an FPE and a UPE at the same time. This is a conscious decision on the part of the author since he explicitly states that “stød is always retained in the productive plural form” (ibid: 35f), cf.
halɁ, halɁler (halls), while it is lost in non-productive but regular⁷⁹ forms, cf. sumɁ, summer (sums). What this essentially entails is that stød is lost because the following ending is a UPE and that the following ending is a UPE because stød is lost. The argumentation is obviously circular and tenuous. A further point to make concerns the polysemous suffix –e, cf. (45a), which has the same effect (loss of stød) irrespective of the grammatical labels attached to it. Accordingly, it should be analyzed as a non-FPE (i.e. an SPE or a UPE) in all cases; however, such an approach is hardly compatible with its actual productivity. The suffix as an infinitive marker or an adjectival plural is fully productive, while its use is clearly restricted when attached to nouns. This difference in productivity is neither mentioned by the author nor reflected in the distribution of stød. Given these points, it seems reasonable to suggest that a synchronic treatment of stød should acknowledge that most traces of regularity in the distribution of the opposition are tied to morphology rather than the phonological structure of the language. As a corollary of this position we should accept the highly lexicalized nature of the opposition and should refrain from formulating rules or principles that have little bearing on the competence of present-day native speakers. This stance is to a certain degree echoed by Gress-Wright (2008:191) who also maintains that “we ought to avoid a strictly phonological approach” given that “stød is not phonologically predictable” (ibid: 199). The author argues that such phenomena can be best expressed in terms of Lexical Phonology, the very framework we adopted to account for the equally highly lexicalized tonal opposition in section 2.2.4.10.

79 It would be interesting to know how non-productive, regular suffixes are to be conceptualized given that the author rules out SPEs for nouns.

2.3.3.2. Change in progress

In the previous section we saw that stød can be claimed to be the unmarked member of the opposition on purely distributional grounds. However, it must be evident by now that the concept of markedness does not entail exactly the same array of features for the tonal and glottal opposition. To take an example, the combined effects of the NSP and of the bimoraic requirement blur the distinction we observed for tonal dialects, namely that rare, special and non-productive patterns are to be associated with the marked member of the opposition. While lack of stød is marked and rare, we can say the same for stød on a penultimate σ. Although mutated comparatives surface without stød, mutated plurals usually have it, as we saw in 2.3.3.1. In tonal dialects these two special categories are prosodically united, as both are pronounced with the marked melody, yet Danish prosody keeps them apart. So what other indications do we have that may corroborate our earlier analysis of markedness relations in Danish prosody? In what follows I will investigate whether the dynamics of distributional change allow us to come up with some further evidence (as they do in Sw&No) to support the claim that the more frequent item (i.e. stød in this case) is indeed the unmarked member of the opposition. If we find analogical processes which show stød widening its scope, then we will be justified in claiming that the receding pattern of non-stød is indeed a marked configuration. Grønnum&Basbøll (2009) report a number of such ongoing changes, which all point in the same direction. The first group of words (ibid: 29) includes spark (kick – imp.), sort (id), sport (id) etc. Nowadays more and more people pronounce such words with stød, while it is absent in an old-fashioned, distinct norm. As to their phonological makeup, such items belong to (44aii) and are therefore expected to surface with stød.
Historically, however, /r/ used to be devoiced when followed by a voiceless C, which means that such words did not fulfill the necessary structural requirements to acquire stød. Now that /r/ has been vocalized, bimoraicity is satisfied and the words in question have started to acquire stød, a change which Basbøll (2003:24) interprets as “dropping of a lexical marker, a well-known kind of regularization”, analogy by lexical diffusion. There are some further examples of newly acquired stød (mentioned in Grønnum&Basbøll (2009:30)), which display systematic traits. When a monosyllabic ending is attached to normally unstressed derivational suffixes such as –(l)ig, the effects of ə-ass make the suffix bimoraic, which consequently is realized with secondary stress and surfaces with stød, cf. (42c, h). The authors note that some nouns of the type (44biii) are assigned stød upon suffixation⁸⁰, which they claim to be surprising given that under normal circumstances neither the definite article nor the productive plural suffix should alter the stød-pattern of this word-type. However, this behaviour is arguably not as extraordinary as it first seems. Once we take the effects of ə-ass, the vocalization of post-vocalic /r/ etc. into consideration, it is easy to see that the plural of a formal compound such as formue (fortune) is prosodically and structurally indistinguishable from that of Clinton, cf. (45bii), which features an obligatory and productively formed stød-addition, cf. Basbøll (2003:6): the unmarked pattern widens its scope by analogy. Changes in the segmental specification of a word may give rise to certain dilemmas. Given that post-vocalic /r/ has been vocalized in Danish, the examples above are generally pronounced with a phonetically long V. Do we have any indication (apart from the testimony of spelling) that such words indeed should be labelled as belonging to (44aii)? It may be more accurate to classify them together with gul, cf. (44aiii) instead.
Why should we postulate underlying /VrC/ along with a rule that transforms it into /V:C/, when the long V is readily available as the UR? In the present case, it eventually has no bearing on the matter since both

80 Examples include 'forˌmuɁer (fortunes), 'omˌråɁder (territories), 'anˌklaɁgelse (accusations).


(44aii) and (44aiii) exert a strong analogical pressure on lexical items that do not conform to the patterns they represent. Yet consider the following example. The stød patterns of vanilje and magnolie, cf. (44b, c) are commonly explained with a reference to the number of syllables that follow the stressed σ. This is of course false reasoning, as the relevant syllables are both penultimate. The number of syllables we posit (and the spelling we choose to reflect it) depends on the presence or absence of stød, while ideally the stød pattern should instead be the function of the number of syllables. The point is that the phonological representations of Danish are so deep, and the UR and the SR are separated by so many rules, that it is not always straightforward which UR to pick. The deeper we plunge into the domain of abstract forms, the more regularities we can explain, which consequently makes the analyst reluctant to acknowledge that certain changes have been implemented in the UR itself, if this blurs an otherwise neat and comprehensible pattern. Once the process of ə-ass widens its scope such that words like sabotage (id) (now purportedly belonging to (44biii)) end in a C and thereby join (44aiii), we will simultaneously lose one of the best clues to the distribution of stød. Note that the same development has already taken place as far as pige (girl) is concerned. The word is now in normal speech generally pronounced as an α-word lacking stød, yet it is not customary to label it as a lexical exception to (44aiii). Although about 85 % of verbs lack stød in the present tense, it is getting more and more common that such verbs are assigned the glottal gesture, especially if they feature a long V. This is a remarkable development given that paradigm/morphological uniformity should entail that the majority pattern gains ground at the expense of the minority group exhibiting stød.
We are therefore faced with the problem that a feature can be marked in a given subset of a language’s vocabulary, while it can constitute the unmarked pattern when we consider the vocabulary as a whole. It is not always obvious whether markedness-based analogy proceeds along local or global considerations. For reasons to be made clear in 3.2.4 I am inclined to think that analogical processes are predominantly local. The fact that most analogical changes make sense when approached with a global interpretation of markedness is primarily due to the fact that global markedness is made up of a number of local markedness relations. The global pattern is, as we have seen, that the number of words with stød keeps growing even in certain classes where non-stød used to be, or is, the dominant pattern. An interesting aspect of present tense verbs gaining stød is that the glottal opposition between nouns and verbs (cf. udtale in (44b)) is further enhanced, which means that the formerly phonological opposition of stød:non-stød is attaining a more and more morphological character. In addition to the systematic changes considered so far, the language also displays a handful of sporadic stød-assignments, which can be generally interpreted within the frames of loanword adaptation. Basbøll (2005:443ff) indicates that from a prosodic point of view Danish vocabulary lends itself to a bipartition into a domain “where the native stød system obtains” and another part “where Lexical Non-stød applies”. The author makes it clear that this is neither a psychologically real nor an etymologizing division of the lexicon. Note that Basbøll’s position is fully compatible with the prosodic partition of the Sw&No lexicon along the binary feature [±native] presented in 2.2.3. Roughly speaking, the [+native] section is made up of native items and old loans from Latin, Greek and German, while the more recent [-native] section contains further foreign words and names.
In light of the discussion so far the conclusion is inescapable that most loanwords enter the language without stød and are after a while assigned the glottal gesture provided that certain requirements are fulfilled, cf. 2.2.4.4 for Sw&No. This can be illustrated with a number of recent loans from English, which have all been integrated into the language and therefore obey the rules of the native stød system: astronauɁt, stuɁdio, motelɁ etc. Certain other words vacillate between the two norms, e.g. hi(Ɂ)ke, tran(Ɂ)sfer, vam(Ɂ)p (vampire), which is one of the best indicators of change in progress.


In conclusion, the examples in the present section all point in the same direction and demonstrate that stød is the unmarked member of the opposition.

2.3.3.3. Danish compounds

The distribution of stød in compounds is to a large extent predictable from the stress pattern that results from the fusion of two or more constituents. In 2.3.2.1 we saw through a handful of examples how the binary distinction of lexical stress was transformed into three different levels on the surface. If a given constituent is demoted to tertiary stress, it is no longer eligible to host the opposition, while a σ with secondary stress generally retains its stød, cf. salg in (43a) and (43b). Let us first investigate what typically happens to the individual constituents of a compound. In those instances when a first constituent X features stød we can be certain that it is also pronounced as XɁ in isolation. This means that stød in first constituents is “always an original, not an added-on feature” (Basbøll (1999:360)). On the other hand, such original støds are frequently lost, as is the case in well-established, lexicalized compounds, especially those where X is monosyllabic, cf. (47a). Whenever a compound is made up of three constituents, its inner structure can either be expressed as (X+Y)+Z, cf. (43a, g) or as X+(Y+Z), cf. (43c). In the first case, Y can surface with secondary stress (and thus retain its stød) if it is characterized by “the iconic principle of weight” (43g), cf. Basbøll 81 (1999:359f) who mentions that “in recent generations… there has been a tendency towards loss of stød” in the middle of (X+Y)+Z compounds. In the latter case, when Y is preceded by a marked boundary, the requirement for secondary stress (and thus stød) to be maintained is that the marked boundary is preserved, i.e. the compound does not undergo lexicalization, cf. (43d). It follows that the loss of stød is more commonly encountered in frequently used compounds where Y is light. In addition, increased speech tempo also favours stød-drop given that it can inhibit the bimoraic realization of Y’s stressed σ, cf. (ibid).
As opposed to (47a, b) where stød is lost under certain circumstances, (47c) below reveals that the final constituent Z can host lexically unmotivated tokens of stød. The relevant data indicate that Z typically contains verbs. Basbøll (2005:499) argues against the traditional conviction that stød-addition is a function of word class and proposes that the phenomenon depends on the structure of the compound. (47c) is indeed not restricted to verbs, as can be seen from the compounded nouns listed in Grønnum&Basbøll (2009:30): folkeskoɁlen (the primary school), sygehuɁse (hospitals), spisestuɁen (the dining room) etc. The authors remark that such patterns are new and unexpected given that each final constituent surfaces without stød in isolation. Adjectives such as u'muɁlig (impossible), u'skylɁdig (innocent), whose prefixless antonyms always lack stød, may serve as further examples 82 to prove the point that (47c) is not restricted to verbs (any more).

(47) Changes 83 in stød pattern upon compounding: X+(Y+)Z
a. loss of stød in X: stenbroɁ (stone bridge), brosteɁn (cobbles)
b. loss of stød in Y: landmandslivɁ (farmer’s life), udsalgsdame (43a)
c. newly gained stød in Z: dagdrivɁer (loafer), trekanɁtet (triangular), (44bii)

81 The author employs a cognitive approach to the heavy vs. light distinction.
82 It must be added that such words (the prefix being unstressed) are highly lexicalized and therefore exhibit unit accentuation, which differs from the stress pattern of typical compounds, e.g. the ones in (47c).
83 The examples in (47) are borrowed from Basbøll (2005) and Riad (2000). Note that the individual lexemes (bro, sten, ud, salg, land, mand, liv, dag, tre) are all pronounced with stød in isolation.

With all this in mind, we can now return to the question of markedness. How does the systematic loss of stød in (47a, b) resonate with our earlier claims that stød is unmarked and is therefore not expected to recede? Recall that markedness is not necessarily a unitary concept, which means that non-stød is possibly unmarked in certain positions, say in anterior constituents. Given that the domain of stød (as opposed to the tonal opposition) is the stressed σ and not the PrW, a lexeme with multiple stresses can host multiple occurrences of stød. This implies that the presence of stød in compounds is a gradual feature. The fact that a binary specification can be described more economically than a gradual one suggests that the loss of stød in anterior constituents is functionally motivated. We run into problems if we claim that a prosodic feature that holds at the level of bimoraic syllables is also distinctive at a higher level, where multiple instances of bimoraic syllables can occur. In a similar vein, stress is not distinctive in compounds either. Inasmuch as there is a contrast, it can be traced back to the smallest domain where the given opposition is contrastive. Furthermore, we have seen that the plural form in (45bii) is a fully productive pattern in contemporary Danish. In ClintonɁner we have (if we abide by Basbøll’s analysis) two bimoraic syllables (the first nasal cannot be extraprosodic, yet it has no stød) such that primary prominence is followed by secondary stress. What we essentially have here is a prosodic compound, whose stød pattern is being generalized through the changes presented in (47) above. The functional reason why stød is preferred in posterior rather than anterior constituents is easy to discern. Recall that high speech tempo does not favour the occurrence of stød. The farther away we find ourselves from the rapid anacrusis in the beginning of an IP, the lower the speech tempo.
Physiologically speaking, stød in final constituents is easier to realize and to perceive. On a psychological level it is obvious that productive stød-addition is closely associated with certain morphological regularities, which are in turn tied to suffixes, i.e. to the end of the word. If compounds and inflected/derived simplex forms, cf. (47) and (45bii), are sometimes indeed characterized by an identical prosodic structure, then the question may arise whether it is necessary to differentiate between such categories. The examples provided by Basbøll (1999:353) suggest that derivatives with a heavy suffix, formal and real compounds are indistinguishable as far as prosody is concerned, cf. 'klog,skaɁb (cleverness), 'klæde,skaɁb (wardrobe), 'para,diɁs (paradise) and 'spare,griɁs (piggy bank). In a later work (2005:475) the author points out that compounds and derivatives cannot be equated after all given that lexically specified morpheme-final extraprosodicity holds in the former but not in the latter. This can be illustrated with the extraprosodic liquid of me'tal (id), which is still invisible (i.e. it lacks stød) in a compound, cf. me'tal,agtig (like metal), but not so when derived or inflected, cf. me'talɁlet (the metal), me'talɁlisk (metallic). The lack of stød in the first constituent of words like 'ven,skaɁb (friendship) is not a counterexample. Recall that even kloɁg (clever) surfaces without stød in 'klog,skaɁb (cleverness). Nevertheless, the boundaries between compounds and derivatives are not always straightforward, especially if a given morpheme is involved in grammaticalization.

2.3.3.4. The functions of the opposition

Throughout chapter 2 we have observed a wide range of parallels between the tonal and the glottal opposition, which can no doubt be attributed to their common ancestry. Both phenomena are marginal as far as their functional load is concerned, which is aptly reflected in the fact that both Sw&No on the one hand and Danish on the other have a number of peripheral dialects that lack the opposition altogether. This implies that many of the present section’s conclusions will be analogous to those of 2.2.4.8.


As to why the opposition is still preserved despite its phonological marginality, Basbøll (2003:37f) remarks that “Danish stød can fulfill an important communicational function… [given] that it is a potential key to morphological structure”. This means that stød can inter alia be used to signal a grammatical boundary followed by an FPE. Nevertheless, it does not seem to be a valid assertion given the occasional disparities between stød assignment and graded productivity, as we saw above. Furthermore, the juncture approach is exposed to considerable problems when it comes to some β- and γ-words of the type (44biv) and (44cii). Recall that we raised similar concerns in connection with the claims of Elert (1972:154), according to whom Accent 2 is essentially a juncture. This is, however, not to deny that stød can to a certain extent facilitate perception and is, as Basbøll (2003:37) puts it, “an aid to the addressee”. This perceptual contribution is, nonetheless, much more limited than in tonal dialects. This is mainly because stød is a property of the second µ of the rhyme and its locus as such basically coincides with the morphological boundary it is argued to signal. On the other hand, the melody of a tonal accent is launched simultaneously with the stressed V, which means that the listener has appreciably more reaction time and can indeed anticipate the suffixes that the observed melody activates. Basbøll (ibid) argues that “[t]he sociolinguistic function of stød, as a marker of linguistic identity, should not be taken to be of any less importance”. This appears to be a reasonable claim given that the main isoglosses of dialect typology in Scandinavia are typically defined by prosodic variables.
Nevertheless, this alleged sociolinguistic function can by no means explain why a functionally dubious feature has been retained through the course of the centuries, since it views the absence of an internal development (the elimination of stød) in relation to other dialects. To put it simply, speakers of a given speech community cannot be expected to optimize their phonological system in such a way that the resulting outcome should preserve the sociolinguistically significant dialectal differences of the region that applied at the onset of the change. Finally, we can conclude this chapter by considering the effects of glottal assimilation. Given that stød has a tendency to widen its scope by lexical diffusion, loanwords entering the language may acquire stød after a while under the right circumstances. Such instances of loanword adaptation imply that the lexical specification of a given item is changed 84 from [-native] to [+native]. Thus adaptation involves the strengthening of unmarked features, an increase in paradigm uniformity and a reduced lexical burden. Consequently, I would like to argue that, as far as [±native] is concerned, we can apply exactly the same approach to stød that we applied to the tonal accents in 2.2.4.8, which of course includes the stipulation that the lack of stød does not signal [-native].

84 [+native] being the default case does not require a lexical specification.

3. A diachronic review

The main aim of the present chapter is to outline the prosodic history of the Scandinavian languages by investigating a number of well-known suprasegmental changes ranging from the earliest reconstructions to certain ongoing processes. Given that the following analyses are meant to incorporate the conclusions of the previous two chapters, the changes to be treated below are mainly constrained to controversial phenomena or to such developments whose understanding is essential for subsequent claims. Consequently, the ensuing (to a certain extent cherry-picked) compilation of prosodic changes makes no claim of presenting the diachronic prosody of the Scandinavian languages in a comprehensive manner.

3.1. Changes in terms of stress and quantity

On the premise that tone is synchronically a function of stress, which in turn is a function of quantity, cf. 2.2, we will start the discussion with the latter two concepts. The problem of tone and stød will be addressed in section 3.2. Although the present chapter is essentially devoted to the history of Sw, No and Da, occasional references will be made to other Gmc dialects as well, especially to Gothic and OE, when the comparative nature of the matter compels us to do so.

3.1.1. The Germanic Stress Rule (GSR)

The development of fixed stress was one of the most important phonological innovations that (along with Grimm’s and Verner’s laws, vocalic mergers, rhotacism, gemination etc) served to define the Germanic subgroup of the IE family. Although the GSR itself is fairly well understood, there are some competing theories that seek to account for the actual transition from the IE accentual system and the chronology of the change. Let us first outline the stress pattern of the earliest attested Gmc languages in order to attain an approximation to the output of the accent shift. Lass (1994:91ff) postulates an OE stress rule, which assigns primary stress to the first σ “bounded on the left by a major category label”, i.e. noun, verb, adjective or . It follows that monomorphemic content words are invariably stressed on the first σ, while a prefixed word displays initial stress if and only if it can be analyzed as a compound. Although unstressed prefixes are generally associated with verbs, Lass is keen to point out that the main divide is not drawn between verbs on the one hand and nouns/adjectives on the other, since a verb derived from a prefixed noun retains the stress pattern of the latter and vice versa. This is reminiscent of the retention of stød or Accent 1 in nouns derived from prefixed verbs, e.g. behandling (treatment). Lass draws parallels to the separable prefixes of modern German, where these stressed verbal prefixes are usually , i.e. they belong to a major syntactic category. If we compare OE with Gothic, we can see that the two systems are not quite identical. The main difference boils down to the assumption that Gothic possibly lacked unstressed (verbal) prefixes, which means that its stress pattern was arguably phonologically assigned and did not depend on morphological considerations as was the case in OE.
Bennett (1970:465f) concludes from the testimony of certain vocalic changes that even the preterite of reduplicating (i.e. prefixed) verbs, such as Go. haíhald (held) was pronounced with initial stress. He also assumes that the negative prefix un- bore the same primary word stress as other word-initial syllables. He then proceeds to claim that the unstressed prefixes of later Gmc were still proclitics of verb phrases in Gothic. The arguments for this are twofold. First, the “prefix” and its verb are readily separable and can be expanded with various intervening

words as can be acknowledged from Wulfila’s Bible, cf. (ibid: 468). The second piece of evidence concerns the reflexes of the Gmc C shifts. Word-initial IE /k/ came down as Gmc /x/ irrespective of the place of the original IE accent, given that the conditions for voicing according to Verner’s law were not satisfied at the left edge of a word. Nevertheless, the voiced C of Go. ga- (the Gmc equivalent of Latin con-) seems to defy our expectations. Yet such startling instances of voicing are not surprising after all if we remind ourselves that the prefix in question is a clitic. The illuminating example of the Gmc definite article (the, das, det) suggests that such weakly stressed clitics can undergo voicing even in initial position. Consequently, it seems reasonable to assume that literary Gothic witnessed the onset of a grammaticalization process that eventually led to the unstressed prefixes of OE. This early Gmc stress pattern evolved from Proto-IE, whose relevant prosody can be reconstructed as follows, cf. Ringe (2006:21f). The “free” accent of IE could fall on any σ, still the inflected forms of individual thematic nominals and verbs had uniform stress placement throughout the paradigm. Clitics never bore an accent. The nom and acc of nouns and the singular of verbs were typically stressed / accented near the left edge, while other forms near the right edge of the word. In other words, what we could call a base form (more often than not) had initial prominence. The author also asserts that “words with no underlying accent were assigned accent on the leftmost syllable by default” (ibid: 22). To sum it up, the IE system was governed by a mixture of lexical and morphological factors, however, when these factors did not have a say in accent assignment, the surfacing default initial accent could be attributed to the language’s phonology.
What this amounts to is the suggestion that as far as phonology is concerned, IE was initially stressed, even if phonological considerations were clearly dominated by lexical and morphological ones. This view is reflected in Halle (1997:298), whose analysis of IE stress and accent predicts that “once lexical accent is removed, the core rules assign initial stress to all words”. Such a claim is in line both with our own predictions in (12) and the attested facts that Gmc, Celtic, Italic and West Slavic all developed initial stress when they abandoned the IE system. When it comes to the transition IE stress > Proto-GSR, we can either assume that IE accent was fixed on the Gmc root σ in one step or that it first landed on the initial σ of all words, with some later change making certain prefixes invisible and attaching stress to the root σ. Lahiri & al (1999:336f) do not take sides in the matter, yet they point out that the former approach is faced with the challenge of accounting for stressed prefixes while the latter for unstressed ones. However, in light of the above, the choice is not difficult to make. The accentual disparities between OE and Gothic discussed above establish the direction of a change whose first step must have been the assignment of primary stress to word-initial syllables. As we have seen, the fusion of a clitic and a following word was still incomplete in the 4th century AD, i.e. the time of Wulfila’s . This means that Scherer (1878:81ff) is certainly on the wrong track when he assumes that the difference between IE and Proto-Gmc accent is that the former was movable, while the latter was fixed on the root σ. The description he provides matches a later linguistic stage but not that of Gothic or Proto-Gmc. Yet even if we did not have access to the early Gothic texts, even if we had to do without the predictions of Halle (1997) and the stress loop in (12), we should still be skeptical towards the root σ-hypothesis.
Our understanding of phonological phenomena in general indicates that whenever sound patterns are to a certain extent morphologically conditioned this conditioning is the result of at least two phonological changes whereby the latter destroys the conditioning of the former (stress shift and fusion in the present context). As we will see in 3.2 below, this expectation is borne out by the accentual opposition as well. The literature is also divided on the degree of the phonological yield we can attribute to the accentual systems of IE and Proto-Gmc. While Kuryłowicz (1968:192) declares that the

distinctive prominence of IE was transformed into delimitative stress, Bennett (1972:100) is of the opinion that “[t]he IE movable pitch accent was generally nondistinctive… whereas the Gmc. fixed stress provided a basis for contrast”. As we have seen, however, this basis of contrast is not provided by the fixed stress system itself, but rather by ensuing grammaticalization, which gave rise to unstressed prefixes.

Chronological problems

A further controversial aspect of the GSR is its relative chronology. Traditionally, it is assumed that the transition of IE voiceless stops to corresponding Gmc fricatives must have pre-dated the accent shift (cf. Lass (1994:22)), because otherwise pretonic, intervocalic voicing as formulated 85 by Verner could not have taken place and (to take an example) the intervocalic consonants of OE broþor (brother) and fæder (father) should be identical, cf. Latin frater and pater. Nevertheless, this mainstream view, as we have indicated above, is not uncontested. The first question to raise is the following. Given that pre-accentual IE /k/ produced Gmc /ɣ/ in clitics (see ga- above) as a result of Verner’s law, it is not obvious why the same C should escape voicing in the onset of a word-initial, pre-accentual σ, cf. hund(red) and IE *kmtóm. Bennett (1970:464) suggests that the accent shift preceded all stages of the C shift. The reason why Verner’s pre- and post-tonic consonants still produced different reflexes is to be sought in the supposition that pre-Gmc /p, t, k, s/ had two sets of allophones, fortis and lenis (the former in pre-tonic, the latter in post-tonic position). This allophony was transformed into a phonological opposition upon the completion of the accent shift, and both sets underwent spirantization producing voiceless and voiced fricatives respectively. This line of reasoning entails that we have to assume four contrasting explosives for each relevant place of articulation, e.g. /p/ fortis, /p/ lenis, /b/ and /bʰ/. It is rather difficult to establish whether such an inventory is typologically possible, all the more so as Bennett does not reveal what actual phonetic content he associates with the first two.
Ladefoged & Maddieson (1996:95ff) make it clear that the terms fortis and lenis have been used with so many diverse meanings that they should never feature in a proper scientific description in want of an adequate phonetic definition. From a systemic point of view it is tempting to equate /p/ fortis with an aspirated voiceless stop, however, phonetic strength and an accompanying lengthening is a much more usual correlate of the fortis-lenis distinction. The fact that Gothic is only marginally affected by Verner’s Law is the basis for further arguments in favour of the proposal that the accent shift pre-dated the spirantization of voiceless stops. An earlier fixation of Gothic stress may explain why Verner’s Law failed to produce voiced fricatives in the affected words (Bennett (1972:464)), however, this is not an attractive solution since it means that we have to assume dialectal disparities for a change that is claimed to have pre-dated the C shifts that eventually came to define Proto-Gmc. The deviant behaviour of Gothic is traditionally explained by invoking analogical levelling. Although a fair amount of criticism can be directed at the analogy-hypothesis, Suzuki (1994) argues in favour of it, claiming that what is usually described as analogy was in fact a rule-based generalization. He suggests in agreement with Kuryłowicz (1968:24) that final devoicing was the immediate trigger that led to the loss of the effects of Verner’s Law in Gothic. Final devoicing affected the other Gmc dialects so late that the original opposition between voiced and voiceless fricatives had been restructured in several ways (cf. OE /ð/ > /d/), so the (morphological) effects of Verner’s Law were not lost in the same way as in Gothic. All in all, it seems (on the assumption that Suzuki’s argumentation is correct) that the traditional chronology of the changes (i.e. voicing precedes the accent shift) is defensible.
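The traditional chronology discussed above (spirantization, then voicing conditioned by the IE accent, then the accent shift) can be made concrete with a toy sketch. Everything in it is an illustrative assumption and not part of the cited analyses: the function name, the schematic syllabified inputs, the restriction to single onset consonants, and the formulation of the voicing condition as "not word-initial and not immediately preceded by the accented σ".

```python
# Toy sketch of the traditional ordering: spirantization and Verner voicing
# apply while the IE accent is still in place; only afterwards is stress
# fixed on the initial syllable. All forms are schematic, not reconstructions.

SPIRANTIZE = {"p": "f", "t": "þ", "k": "x"}   # voiceless stop > fricative
VOICE = {"f": "β", "þ": "ð", "x": "ɣ"}        # fricative > voiced fricative


def shift_onsets(syllables, accent):
    """Return the syllables after spirantization and Verner voicing.

    syllables: list of strings, each beginning with a one-letter onset
    accent:    0-based index of the syllable carrying the IE accent
    """
    result = []
    for i, syllable in enumerate(syllables):
        onset, rest = syllable[0], syllable[1:]
        if onset in SPIRANTIZE:
            onset = SPIRANTIZE[onset]
            # Assumed condition: voicing is blocked word-initially and
            # directly after the accented syllable.
            if i > 0 and accent != i - 1:
                onset = VOICE[onset]
        result.append(onset + rest)
    return result


def germanic_stress(syllables):
    """After the accent shift, primary stress sits on the first syllable."""
    return 0


# Schematic inputs: accent on the second syllable vs. on the first.
print(shift_onsets(["pa", "ter"], accent=1))
print(shift_onsets(["bra", "ter"], accent=0))
print(shift_onsets(["km", "tom"], accent=1))
```

If the two steps were applied in the opposite order, every word would enter the voicing step with accent 0, and the contrast between the medial consonants of the first two inputs could not arise. This is the point of the traditional argument, and also the reason why Bennett's alternative needs a pre-existing fortis/lenis allophony.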

85 Although Verner himself talks of intervocalic voicing, Bennett (1972) and Suzuki (1994) among others assume that the law also applied word-finally.

Let us conclude this section by briefly mentioning a third theory on the consonantal shifts from IE to Gmc. Noske (2009) makes a case for the assumption that Verner’s Law and the Gmc spirantization can be analyzed from a synchronic perspective as being part of a single process, which implies that the relevant segmental changes must have preceded the accent shift. He formulates 86 the two segmental rules in SPE-style and finds that they are in an elsewhere relationship. However, the concept of transformational rules and rule ordering inevitably reflects some sort of sequentiality, which in the present case gives precedence to intervocalic voicing. One may recall Liberman’s (1982:33) remark according to which “history always stands at the elbow of generative phonologists”.

3.1.2. The impact of loanwords and the loss of Germanic stress

The word-initial stress of early Gmc was insensitive to quantity, which means that neither light-stressed nor heavy-unstressed syllables were ruled out on structural grounds. At various later stages the principles of both weight-to-stress and stress-to-weight came to play a prominent role in the individual Gmc languages. This was primarily due to a complex of quantitative changes (to be discussed in the ensuing sections), some of which paved 87 the way for the abandonment of exclusive root-initial prominence in most Gmc dialects. Lahiri & al (1999:378) demonstrate that all Gmc languages (with the exception of Icelandic and Faroese, which are claimed to still parse prosodic words from left to right) have developed a right-to-left orientation by adopting the “Romance stress rule”, which confines them to the three-σ window known from Greek, Romanian, Spanish, Italian etc. In what follows, I will investigate how this transition took place and expose some of the views in Lahiri & al (1999) to scrutiny. There is a general consensus that the loss of native Gmc stress is attributable to the heavy influx of Romance loans, whose right-edged stress patterns altered the direction of parsing in the recipient languages. It is worth pointing out that such developments may only occur in languages where stress is (at least to a certain extent) assigned by morphological and lexical factors. Speakers of languages with purely phonological stress (such as Hungarian, Finnish, Polish, French etc) are unable to retain the stress patterns of loanwords that do not conform to the native system. Thus the unstressed prefixes of early Gmc are absolutely crucial in this respect. Nevertheless, a language with morpho-lexical stress may still resort to repair strategies to eliminate unorthodox patterns displayed by certain loanwords. Such a tendency can be readily observed in English throughout the Middle Ages.
Although both final and non-initial penultimate stresses were known to Old and Middle English (cf. to'dæg (today), be'foran (before) in Arnason (1996:19)), the prominence of French loans was after a certain time commonly moved to the first σ of the word. As long as the pretonic syllables were semantically transparent morphemes, non-initial stress was restricted to such categories. However, we may entertain the suggestion that in later centuries, semantic fusion led to the loss of some morphological boundaries and this process removed the restriction imposed on pretonic syllables. The prosodic adaptation of loans can be exemplified with the words in (48) below. There is an interesting disparity between English and other Gmc languages, as far as the treatment of certain Romance loans is concerned, given that the German, Dutch, Danish and

Norwegian equivalents of the examples in (48a) are uniformly pronounced with final stress. This, of course, implies that root-initial stress was abandoned on the continent considerably earlier than in England. Fikkert & al (2006:146) are of the opinion that the OE system persisted as late as in the beginning of the 15th century and argue against the view that Chaucer’s language was already affected by the change.

86 Noske’s (2009) formulation adopts the so-called Glottalic Theory, which reconstructs the IE obstruent inventory positing three series of stops: voiceless, voiced and glottalic. The latter two are traditionally represented as a voiced aspirate and a voiced series respectively.
87 As we will see, the promotion of stress-to-weight, inter alia, is a potential requirement for a transition from edge-aligned to lexical stress.

(48) Romance words 88 with initial stress in English
a. paper, baron, channel, satin, coral, Latin, salad, actual, moral, August
b. aspect, colleague, import NOUN, ridicule, pretext, reservoir, commodore

If right-edged stress and the three-σ requirement indeed go hand in hand, then perhaps the distribution of graded adjectives (among other things) may provide a clue for ascertaining the debut of Latinate stress in English. As is commonly known, synchronically the choice between inflected and periphrastic gradation depends primarily on the length of the adjective (cf. nicer and more beautiful). In OE times comparison was predominantly expressed with inflected forms, however, in the Middle English period periphrastic (and even double) gradation became so common that the speakers’ preference for say easier, more easy or more easier was essentially a matter of style and not a grammatical issue, cf. Görlach (1991:83f). Yet more importantly, trisyllabic (or longer) adjectives could still take syllabic inflectional endings, as exemplified by the use of maidenliest in Shakespeare’s King Lear (Act 1, scene 2). The modern length-sensitive usage had been established by the late 17th century. Given that according to Fikkert & al (2006:146) right-edged Latinate stress was adopted roughly in the latter half of the 17th century, we cannot help discovering some degree of causality. Once Latinate stress with its three-σ rule was an obligatory pattern in English, the choice between the two forms of gradation was no longer optional, since the inflected form of a trisyllabic word with initial stress would have violated the newly acquired constraint 89, hence *beautifullest. I consider this coincidence 90 as strong evidence in favour of the chronology put forward by the authors.
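The constraint invoked in this argument can be stated as a simple check on syllable counts. The following is a minimal sketch under stated assumptions: the function name is hypothetical, and the syllable counts and stress positions of the example words are my own illustrative syllabifications (initial stress assumed throughout).

```python
def within_three_syllable_window(syllable_count, stressed_index):
    """True if primary stress falls on one of the last three syllables.

    stressed_index is counted from the left edge, starting at 0.
    """
    if not 0 <= stressed_index < syllable_count:
        raise ValueError("stressed_index must point at an existing syllable")
    return syllable_count - stressed_index <= 3


# Assumed syllabifications: (number of syllables, index of stressed syllable)
examples = {
    "beautiful": (3, 0),      # beau-ti-ful
    "beautifullest": (4, 0),  # beau-ti-ful-lest
    "maidenliest": (4, 0),    # mai-den-li-est
}

for word, (count, stress) in examples.items():
    print(word, within_three_syllable_window(count, stress))
```

On these assumptions the base adjective satisfies the window while its inflected superlative does not, which is the configuration the periphrastic form avoids. A form such as maidenliest fails the same check, in keeping with the claim that the window was not yet obligatory when it was in use.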

The three-σ window in various languages

Nevertheless, the actual status of the three-σ rule is somewhat dubious as we can easily come across a handful of derived and inflected words whose stress patterns cast doubt on its overall validity. The initial stress of words like accuracy, legionary, legislature, radiator, spiritualism, decorated etc suggests that the three-σ rule is more of a static tendency than a dynamic rule. As a matter of fact, different languages subsumed under the three-σ label implement the rule in rather various ways. Modern Greek displays a fairly strict approach to the three-σ rule, which can bring about productive patterns of stress shift in inflected forms, cf. πρόβλημα (problem), προβλήματα (problems). In addition, the rule also applies to clitics, cf. the genitive construction αυτοκίνητό μου (my car) as opposed to αυτοκίνητο (car). The next level is represented by Italian, where both stems and inflected forms conform to the expectations of the rule. Given that a large number of conjugational suffixes are tonic, stress shifts are extremely common in verbal paradigms, cf. 'dormo (I sleep), dormi'amo (we sleep). The only deviating pattern I am aware of is the 3rd person present

indicative of a few verbs such as: 'augurano (they wish), 'esplicano (they explain), while the infinitives are stressed as augu'rare, espli'care. Clitics, however, do not affect stress placement in Italian, and this can lead to some apparent surface violation, leaving the stressed σ far away from the right edge of the prosodic word as in te'lefona-glie-lo (phone him about it), cf. D’Imperio & Rosenthall (1999:21). Nouns, on the other hand, thoroughly respect the three-σ rule (partly because they cannot be combined with syllabic inflections). An even weaker interpretation of the rule can be found in Romanian, where it applies to uninflected and derived stems (most derivative suffixes being tonic), however, neither inflections nor clitics seem to affect the assignment of primary stress, cf. Franzén & Horne (1997:81). Stress shifts are extremely rare in inflected forms and are possibly not due to the three-σ rule. This is clear in plurals like 'soră, su'rori (sister), which are mostly confined to the oldest sections of the vocabulary. The stress shift in 'radio > radi'ouri (radio(s)) is probably dialectal. Note that the plural suffix –uri is pronounced as a single σ, which means that the initial stress of forms like grepfruturi (grapefruits) does not defy the three-σ rule. The scope of the rule does not extend to compounds either as can be illustrated by the initial stress of şaptesprezecelea (17th), fluieră-vânt (vagabond) etc.

88 The examples in (48) are taken from Lahiri & al (1999:375) and Ekwall (1975:7).
89 It remains to be added that the present-day usage is not entirely based on phonological considerations. As Görlach (1991:84) points out “[t]he complicated rules holding for bisyllabics in PrE clearly show these to have been formulated by eighteenth-century grammarians”.
90 The shift from Gmc to Romance stress was not a uniform process and spread through sociolects. “Whereas the frequent words (or all the words in the speech of the less educated) had initial stress… rarer words occurring only in the sociolect of the educated class often preserved the original stress” (Görlach (1991:64)).

(49) A tentative typology of the three-σ rule
The rule is valid for
a. Greek: stems, inflections, derivatives, clitics
b. Italian: stems, derivatives BUT not for clitics and inflections
c. English: stems BUT not for inflections and derivatives
d. Romanian: stems, derivatives BUT not for inflections and compounds
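The window itself can be stated as a simple check on the position of the stressed σ. The following Python sketch is merely illustrative: the function name, the 0-based indexing from the left edge and the syllable counts assigned to the examples are assumptions made for the sake of the illustration and are not taken from the analyses cited above.

```python
def stress_in_window(n_syllables, stressed_index, window=3):
    """Return True if the stressed syllable is among the last `window` syllables.

    `stressed_index` is counted from the left edge, starting at 0.
    """
    return n_syllables - stressed_index <= window


# (form, number of syllables, index of the stressed syllable);
# the syllabifications are assumptions made for this illustration.
examples = [
    ("πρόβλημα", 3, 0),
    ("προβλήματα", 4, 1),
    ("accuracy", 4, 0),
    ("viktigare", 4, 0),
    ("te'lefona-glie-lo", 6, 1),
]

for form, n_syllables, stressed_index in examples:
    verdict = "inside" if stress_in_window(n_syllables, stressed_index) else "outside"
    print(f"{form}: stress {verdict} the three-syllable window")
```

On these assumptions the Greek forms stay inside the window (the plural only because stress shifts one σ to the right), whereas accuracy, viktigare and the Italian clitic group fall outside it.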

The three-σ window in the North Gmc languages

Now how is all this applicable to the Scandinavian languages? Given that the reduction of unstressed syllables has been more far-reaching in Da&No than in Swedish, the latter has longer suffixes and is thus better suited to a comparison with the languages in (49). Needless to say, unaffixed simplex stems are stressed according to the rule (otherwise Swedish would not even be considered in this respect). Derivational suffixes in Swedish are either tonic or post-tonic and the latter have a maximal length of two syllables, which means that derivatives also fulfil the trisyllabic requirement. Now when it comes to inflections we can see that the initial stress in words like viktigare (more important), ananasen (the pineapple), positiva (positive + pl) is followed by three unstressed syllables. If we now consider the compound rule put forward in 2.2.3, it becomes obvious that Swedish is typologically closest to (49d). What this essentially means is that Swedish follows the weakest possible interpretation of the rule and exhibits no dynamic shifts that may prove that the pattern is not merely the result of an accidental, static generalization. Furthermore, a comparison between Swedish and Romanian makes it clear that it would not be accurate to categorize them as belonging to the same typological class. First of all, a significant portion of Swedish vocabulary is distinguished by initial stress, which is, on the other hand, rather exceptional for Romanian. Second, compounding (which is by its very definition a left-edged phenomenon) is fully productive in the (continental) Gmc languages, but not in Italian or Romanian, which rather rely on prepositional phrases, cf. Sw. tandborste, G. Zahnbürste, E. toothbrush with Ro. perie de dinți, Fr. brosse à dents, It. spazzolino da denti.
It seems to me that a language with fully productive compounding (where primary stress is always associated with the first constituent) can hardly be analyzed as right-aligned. A further aspect that keeps Swedish and Romanian apart is constituted by their respective strategies employed in loanword adaptation. There are some di- or trisyllabic Hungarian loanwords in Romanian whose original initial stress is moved towards the right edge of the word for no apparent reason 91 as in 'beteg > be'teag (sick), 'betegség > bete'şug (disease), 'város > o'raş (town), 'tagad(ás) > tă'gadă (denial), 'szidalom > su'dalmă (curse). The final stress of Turkish loanwords is naturally always preserved ('fildeş (ivory) being a notable exception). Swedish and (particularly the Trøndelag dialect of) Norwegian (as anticipated in 2.2.3) display different tendencies from Romanian. A handful of loanwords can be identified whose original final or penultimate stress has been shifted to the left edge of the word. The most common Swedish examples include 2tallrik (plate), 2papper (paper), 2kaffe (coffee), 2sallad (salad), 2insekt (insect), 2byrå (commode), 2paradis (paradise), 2kakao (cacao) etc. Recall that such instances of prosodic adaptation are not confined to stress shifts, as such nativized items uniformly surface with the default melody of Accent 2. References to prosodic assimilation make it clear why it is insufficient to consider the phonological form of a given word when it comes to accent assignment. Accent 2 in 2tallrik and 2papper as opposed to 1fänrik (lieutenant) and 1tapper (brave) is beyond our grasp unless we are aware of the phonological history of such words. The implications of the present discussion (combined with our earlier claims about default stress in Sw&No, cf. 2.2.3) lead to the inevitable conclusion that the categorization in Lahiri & al (1999:378) is not in full accordance with the general tendencies and the relevant facts of the languages. We have no clear-cut evidence to indicate that Sw&No are in every aspect right-to-left languages. It is nevertheless clear that they have abandoned (or at least considerably altered) the stress pattern we assumed for OE; however, the process has not (yet) terminated in the adoption of a fully Latinate system (as is claimed for English). Foreign stress patterns can be accommodated in the recipient language in various ways.
Árnason (1996:12) assumes that modern Gmc languages have developed “a new stress system, a sort of compromise between the foreign and the native system such that both native and foreign words follow the same (complex) system”. The author does not go into detail so it is not clear on what grounds we can draw the line between such a complex system (proposed for the modern languages) and a parallel arrangement consisting of a native and a foreign system (assumed for Middle English). As a consequence, we are in all likelihood entitled to suggest that stress assignment in Sw&No is governed by a dual system where words following foreign patterns are lexically marked. As far as the direction of parsing is concerned, Fikkert & al (2006:146) suggest that “Latin words alone could not provoke the native speakers to change directionality” in Middle English. They propose that the elimination of word-final, unstressed syllables was a crucial factor inasmuch as they reduced native items to a single foot, which in turn could no longer serve as a clue for directionality. In addition, they point out that the adoption of pre-tonic and tonic Latinate suffixes is a reliable diagnostic for altered directionality, since “[i]n such forms, stress is computed from the right side” (ibid). Word-final reduction did not assume such proportions in German and Swedish as in English, which means that not all stems were reduced to a single foot. Consequently, the system as a whole (and the direction of parsing in particular) could not be remodeled on the stress pattern of foreign words. Their integration involved the development of a parallel system instead. This can be seen as another argument against a uniformly right-to-left parsing in continental Gmc. Let us conclude the discussion of directionality with a few observations concerning Icelandic and Faroese . Lahiri & al (1999:378) claim both languages to be left-aligned with no trace of extrametricality. 
The only typological difference from the early Gmc system is constituted by the fact that feet are organized into syllabic trochees (by virtue of the bimoraic condition) and not into resolved moraic trochees. The treatment of loans, however, raises interesting questions in the two languages. Although most foreign words (irrespective of their original stress pattern) are pronounced with initial prominence in Icelandic (examples include 'Aristóˌteles, 'Rachmaniˌnoff, 'aristóˌkrat), very recent borrowings (such as Se'curitas, Ge'valia, Svo'boda 92) can have non-initial stress, cf. Árnason (1996:6ff). In contrast, it is claimed that “loanwords in Faroese have the same stress pattern as in Danish, which is the main source of the borrowings” (ibid: 10). Accordingly, this section of the vocabulary respects the three-σ rule, yet such words sometimes violate the rule if they take a native inflectional ending, e.g. 'positiv-ur (positive), 'fysikar-i (physicist). Árnason (1996:15) points out that the window can be kept if we assume that the endings are extrametrical. Given that the three-σ rule does not apply to inflectional endings in mainland Scandinavian either, it would be an unnecessary complication to suggest that the examples above exhibit right alignment 93. As matters now stand, the only difference between Faroese and mainland Scandinavian stress assignment boils down to the fact that native forms in the former are occasionally longer than three syllables (given that syllables in unstressed position have not been exposed to such dramatic reduction). So unless we want to squeeze Sw&No&Da vocabulary into the three-σ frame, we can readily acknowledge that mainland and insular Scandinavian have typologically identical systems where the different proportions of various stress patterns should be attributed to historical factors and not to the synchronic properties of their prosody. We can thus assume that all Scandinavian languages 94 have a dual stress system where words following foreign patterns are lexically marked.

91 One may entertain the thought that closed syllables attract stress (as in bete'şug, o'raş, su'dalmă above); however, counterexamples are easy to find, cf. 'gulaş (goulash), 'guler (collar), 'biber (beaver) etc.

3.1.3. The stress pattern of compounds

While in most Gmc languages the stress assignment of simplex words has apparently distanced itself from the GSR, the prominence pattern of compounds is still predominantly left-aligned, cf. 2.2.3. Lass (1994:89f) assumes that the Compound Stress Rule of early Gmc was essentially a “reiteration of the GSR at a higher level”, which means that the posterior constituent of compounds adopted secondary stress in a similar fashion to what is assumed for non-tonal contemporary languages. The exact nature of secondary stress is of course impossible to infer; still, Bennett (1972:104f) argues that compounds and derivatives such as Go. 'erd-,cunni (earth-tribes) and 'mihil,nessi (greatness) had the same basic stress pattern. If we consider the stressed derivational suffixes of the modern languages it becomes apparent that derivatives pattern more closely with simplex words than with compounds. So at what point did the stress assignment of simplex words and compounds diverge? The adoption of Romance stress in non-compounds lends itself as a reasonable guess. However, we cannot repudiate the possibility that the stress pattern of compounds was also to a certain extent altered by the huge influx of right-aligned simplex words. As a matter of fact, oxytonic compounds used to be much more widespread in the Scandinavian languages than the few sporadic examples in 2.2.3 may lead us to conclude. The diachronic study conducted by Fischer-Jørgensen (2001) on the basis of grammars and metrical treatises between 1600 and 1900 makes it clear that the more we go back in time (at least within the surveyed period), the more instances of final stress we encounter. The extent of final stress in compounds is, however, open to debate, cf. (ibid: 17).

92 Note that the Czech original is stressed on the first σ.
93 Interestingly, the author asserts that native words featuring an unstressed prefix or unit accentuation are examples of right alignment in both Icelandic and Faroese. However, a quick reference to early Gmc makes it clear that it is absurd to classify all instances of non-initial stress in this way.
94 With the possible exception of Danish, which lacks default initial stress, cf. 2.3.2.2.

Stress clash Bye (2004:22) reckons that “final compound accent is a very old feature that… turns out to have a ubiquitous, yet sporadic, relic-type geographical distribution within Scandinavia as a whole”. Such an extensive use of final stress is rightly expected to be tied to certain morpho- phonological regularities. Fischer-Jørgensen provides a summary of Axel Kock’s theory of accentuation according to which “all compounds with a monosyllabic first member, and particularly verbs, often had stress on the second [constituent] and… this was also true of some words with a polysyllabic first member” (2001:498). If this is a correct approximation to the scope of the phenomenon, then it lends itself to assume that the transition from initial to final stress in compounds must have been to a certain extent rhythmically conditioned. If we derive the stress pattern of compounds from the reiteration of the GSR, then we find that any item with a monosyllabic first constituent is expected to exhibit the phenomenon of stress clash, which Riad (1992:131f) considers as the main trigger of reduction during the Syncope Period. He calls attention to the fact that “the retraction of main-stress to the stem-initial syllable [in Proto-Gmc]… goes against the principle of quantity sensitivity… [and leads thus to] a conflict deeply embedded in the stress system” (ibid). In addition, he argues that many prosodic changes of early Gmc can be seen as attempts to harmonize the stress system by reconciling and unifying main-stress assignment (i.e. stem-initial stress) and algorithmic stress assignment (i.e. the use of moraic trochees). Accordingly, stress shifts leading to final stress in compounds are essentially clash resolution strategies. Fischer-Jørgensen (2001:513) takes a similar position when she mentions a “tendency to avoid the sequence primary stress + secondary stress + weak stress, and to replace it with primary stress on the middle member”. 
However, the notion of stress clash raises at least two concerns. First of all, if we assume (adopting Riad’s stance) that the “deeply embedded” conflict in the system is represented by words where a heavy σ follows a light initial σ, we might expect that oxytonic compounds with light first constituents (at some point in history) should be overrepresented. This is, however, either a discovery yet to be made or a possible argument against treating stress clash as the source of final compounds. One may, of course, invoke the role of compounds borrowed from Low German and Latin (many of them oxytonic) and the role of analogy to explain this disparity. Nevertheless, we may also assume that the resolution of stress clash, which affected both simplex words and compounds, involved some distinct stages. The first one witnessed the demotion of secondary stress (and ensuing V shortening) in words like *ge. βoo (gift). The next stage can be claimed to have brought about the reduction of secondary stress in words with a heavy stem-initial σ such as *saw.loo (soul) 95 . Similarly to the example discussed in 1.4.5.1, the various stages of the process implemented different improvements in the system. Given that an initial light stressed σ is not much more prominent than an ensuing heavy σ with secondary stress, the first step enhanced the differences between the two σ types (a gain in terms of perception), while the second step restricted heavy positions in simplex words to syllables with primary stress (an increase in the economy of description, cf. (4)). For the process to widen its scope the next step would have been to further generalize this pattern to compounds with adjacent stresses. However, in this case the loss of secondary stress leads to a simplex structure (a lexicalized 96 compound) and is thus not adequate to reflect the semantic content of the word. So the language was compelled to adopt a different resolution strategy, namely that of stress shift.

95 The examples are taken from Riad (1992:131f).
96 Note that there is a tendency in Danish to assign final stress to opaque compounds, cf. Fischer-Jørgensen (2001:513) and section 2.3.2.1 above.

Chronology As to when this occurred we can infer that the rise of final stress in compounds must have post-dated the loss of unstressed prefixes (i.e. the Syncope Period) since otherwise monosyllabic first constituents would have undergone considerable reduction, which is not the case. On the other hand, the change must have pre-dated the 13 th century, i.e. the adoption of the first Low German loans with unstressed prefixes ( be-, för-) for the following reason. Upon the loss of unstressed Nordic prefixes, stem-initial stress was transformed into word- initial stress given that the morphological conditioning had been lost. Now as we have argued before, speakers of languages with purely phonological stress are unable to retain the stress pattern of loanwords whose structure does not conform to the native system. Given that these Low German loans were borrowed with retained anacrusis, we must assume that such sound patterns had already been established in the language. It is a convenient guess to attribute this state of affairs to the existence of oxytonic compounds in the relevant period. Although final stress in compounds has largely retreated by now, in some dialects it seems to be in fierce competition with standard accentuation. As we saw in 2.2.3, this is the case in some Northern Swedish varieties. Recall that certain rhythmic patterns seem to inhibit final stress, which led Bruce (1982:128) to coin the phrase slutledsbetoningskandidater (i.e. words eligible for final stress). It turns out that final stress is ruled out in Northern Swedish if the word in question displays two adjacent stresses. Now this is somewhat unsettling given that we have just assumed the very same pattern to have triggered stress shifts after the Syncope Period. Whether a sense of stress clash is developed (and whether resolution strategies are called for) varies from language to language and from time to time. 
The development of final stress in the pre-Hanseatic period was the consummation of a long process, which was to resolve the inherent problems of the Gmc stress system. The obligatorily bimoraic stressed syllables of the variety described by Bruce (1982) do not call for such repair strategies. The phonological systems of the respective stages are thus fundamentally different, which is the reason why they also treat stress clash differently, in line with the functionalist precept that every change must be treated as a function of the system 97 within which it takes place, cf. 1.4.3. Two adjacent stresses do not automatically call for repair strategies.

Level stress

Stress shift in compounds is, nevertheless, not the only possible source of final stress. Recall that the disharmonies of the Gmc stress system were partially solved by the resolution of stress clash, which led to the elimination of secondary stress in certain positions. This process, however, left stressed light syllables intact. This latter problem was addressed by a related process called balance, which Riad (1992:193) describes as an important step in imposing the bimoraic condition on stressed syllables. The basic pattern boils down to the observation that unstressed vowels following stem-initial light syllables were better preserved than corresponding vowels following heavy syllables, cf. Haugen (1982:40). According to Riad’s (1992:189) interpretation, “main stress in such words is in fact realized on two syllables” in a unipositional stress foot. This entails that both balance and the ensuing quantity shift satisfy the bimoraic condition; however, the latter relates to the head of the stress foot, while the former relates to the stress foot itself. One of the manifestations of balance (along with V strengthening and V levelling) is known as level stress, cf. Riad (1992:ch4.2). When prominence is equally distributed between the two syllables of the stress foot, the second σ is likely to be perceived as the more prominent one, cf. Tési (2014). Thus it is not surprising that certain dialects with level stress are reported to have developed an optional pattern of final prominence. Given that such instances of oxytonation are never obligatory, Riad (1992:200f) denies that it “derived from a phonological stress shift” and suggests that “we should not represent oxytonation by assigning a weak position to the root syllable”. So even if the sensation of final stress found in certain level stress dialects is not phonologically real, it is a curious coincidence that level stress and final stress in compounds can be conjectured to have arisen at roughly the same time.

97 The case of pre-nasal vowels is an illuminating example. In those systems where nasalized vowels have a phonological status we expect VN to undergo lowering (cf. Proto-Scandinavian: *fim > fem, French dent), while in other systems we expect pre-nasal raising as in IE */eNC/ > Gmc /iNC/, cf. Lass (1994:24).

Productivity There are two circumstances to indicate that final stress in compounds may have been productive for a longer period of time. Kock (1878:73ff) compares the stressed suffixes of prin'sessa (princess) and lära'rinna (female teacher) and finds that the former has an identical stress pattern in the source language (i.e. German), while the latter does not, cf. Prin'zess(in), 'Lehrerin . He assumes that the suffix –inna had secondary stress by virtue of being long and was thus assigned final stress just as if it had been a proper oxytonic compound. He proposes that the same mechanism can be held responsible for the stress patterns of adjectives such as e'gentlig (real) and vä'sentlig (essential) cf. G. 'eigentlich , 'wesentlich with initial stress. If Kock’s analysis is correct, then final stress in compounds was still productive after the completion of the Scandinavian quantity shift. Nevertheless, it should be added that the stress patterns of Dutch and Low German (i.e. the immediate sources of loanwords in the Scandinavian languages) on the one hand and High German on the other exhibit certain disparities given that anacrustic stress was much more widespread in the former two languages. Fischer-Jørgensen (2001:512f) speaks of a large area in the Frisian region and the Northwestern parts of where “there is a tendency to stress the second part of compounds, particularly in place names, but also in a number of other words”. Consequently, many instances of final stress are the direct results of borrowings and should not be attributed to a phonological rule aiming to avoid stress clash.

Distribution As far as the distribution of final stress is concerned, it is worth pointing out that certain categories seem to be strongly overrepresented. One such category is constituted by adjectives with the suffix –ig , all of which happen to be clerical words. One may entertain the possibility that “stress on the second member is a characteristic feature of the pronunciation of clergymen” (ibid: 514) and of the educated classes, which granted prestige to oxytonic pronunciation. The asymmetry between nominals and verbs is another peculiarity to be accounted for. While in the former group final stress was rare and often bound to certain suffixes, “compound verbs were regularly used with the second member in accented position” (ibid: 516). Fischer-Jørgensen reckons that “almost all compound verbs were either loanwords from Low German or loan translations from Latin… found mostly in the written language… [which] may have led to an uncertainty as to their correct pronunciation” (ibid). It seems probable that words that are rarely used in everyday speech were associated with the foreign feature of final stress. Whatever the reason for the special treatment of verbs, the fact that there was enough time for a morphological generalization to develop indicates that final stress in compounds was a productive pattern even more than half a millennium after its first presumptive appearance in the language.


3.1.4. The quantity shift and the problem of moras

The complementary distribution of length in stressed syllables has its immediate origins in a series of changes that replaced segmental quantity with prosodic length in most Scandinavian dialects. This process (commonly referred to as the “great quantity shift”) started with the shortening of overlong (V:CC) syllables, which was followed by the elimination of light syllables in stressed position (either through V lengthening or gemination) from the 13th century onward, cf. Haugen (1982:25). Interestingly, complementary length in Danish dialects (the assumed cradle of the change) was probably never fully developed given that geminate consonants were to all intents and purposes lost in the course of the 14th century, cf. 3.1.6.

CVC monosyllables in terms of moras The main results of the development, which are reflected among others by modern Sw&No, cf. 2.2.2, are fairly clear and can be summarized as follows. First, the bimoraic condition is tied to a single σ (i.e. level stress is disallowed). Second, only vowels and consonantal geminates can contribute to syllabic weight (i.e. all other segments are non-moraic). Despite this clarity, the notion of syllabic weight raises certain inherent problems. Traditionally, a σ is claimed to be light if it contains a short V followed by at most one short C. All other syllables are either heavy (featuring one long segment) or superheavy (featuring two). Riad (1992:236ff) argues that the traditional view of σ weight is in urgent need of revision, since a segmental definition is clearly inadequate for a prosodic entity such as the σ. His main objection concerns the traditional conviction that a CVC-sequence is light in monosyllables but heavy in polysyllabic words. He proposes a moraic 98 definition of σ weight, assumes that CVC monosyllables are heavy and claims that “weight should [not] be interpreted differently in monosyllables than in polysyllables” (ibid: 240). The main argument against light monosyllables stems from the frequently repeated claim that “a Germanic content word must be minimally bimoraic” (ibid: 239). It is generally assumed that whenever a VC-sequence is prosodically equivalent to a long V, then the C is to be considered moraic, cf. Trubetzkoy (1939:174). However, it must be pointed out that none of Trubetzkoy’s examples feature moraic consonants in word-final position. In colloquial Arabic the consonantal µ is followed by an extra-metrical C, while it is always located in the penultimate σ in Classical Latin. Given that final syllables are unstressed in Middle Indic, no moraic C can be found at the right edge of the word. 
It is thus probably not legitimate to extend the correspondence of VC-sequences and long vowels to word-final domains. This is not to say that a word-final C cannot acquire weight or length by position if it stands e.g. phrase-finally and belongs to a stressed σ. A Gmc content word is undeniably characterized by a branching rhyme, however, a branching rhyme should not be automatically equated with bimoraicity, not even in stressed positions. The postvocalic C of monosyllabic words obviously makes the σ somewhat longer (and thus heavier), however, this should not be treated as an underlying moraic specification, but at best as weight by position. An obvious difference between word-internal and word-final VC-sequences is that the former cannot be affected by resyllabification, while the latter can. Once a coda is resyllabified into the onset of the following σ, its presence becomes immaterial as far as the prosodic characteristics of the σ are concerned. Now I would like to claim that underlying moras are never lost during the process of resyllabification, or to put it simply: a moraic C becomes ambisyllabic when resyllabified and thus retains its weight. Consider the following example. Under Riad’s analysis both the short V and the following C are attached to a µ in Common Scandinavian skip (ship). The singular dative form adds a monomoraic suffix –i.

98 Given that each mora is associated with a V or a C on the segmental tier, the newly proposed moraic approach and the traditional segmental one are not so fundamentally different, after all.

Interestingly, the resulting form (skipi ) is universally acknowledged to be bimoraic. Two plus one is definitely not equal to two, so the grammar has to include some kind of delinking rule to account for the bimoraic surface form. It would be undeniably simpler to assign weight by position when necessary than to specify an underlying µ that has to be removed in the course of the derivation. It is somewhat mysterious how the prosodic representation of words like skip would be affected by the quantity shift if they really were bimoraic. If gemination transforms ski µpµ into skiµpµp, then it is difficult to say what exactly has changed in terms of prosody. Also, if the main task of the quantity shift is to eliminate light syllables, then why does it alter a supposedly well-formed bimoraic stem in the first place? Riad (ibid: 311) is aware of this problem and explains it invoking the concept of final C extrametricality, which implies that “a word-final consonant is really treated as the (potential) onset to a following syllable in another word”. This potential onset function is further claimed to be satisfied by non-moraic consonants and moraic long consonants but not by moraic short consonants as is the case in skip above. Riad also presumes that at this stage of development Swedish monosyllables were treated as final syllables. The fact that these other final syllables could feature geminate consonants, cf. himill (heaven), in combination with the potential onset function of extrametrical consonants is said to have triggered lengthening in bimoraic CVC-words. Riad (ibid: 313) acknowledges that this is not a “principled reason” for the σ lengthening. In fact, his explanation relies on a number of ad hoc assumptions like the one that resyllabification can lead closed monosyllables to participate in OSL. As far as I know, neither Middle English nor German can provide evidence for such assertions. 
Furthermore, the fact that monosyllables are in a certain period more similar to initial than final syllables or vice versa does not necessitate that they should systematically assimilate to them. On the whole, the explanation is not convincing and leaves our major concerns unanswered. Riad emphasizes that his approach can give a unified account of VC-sequences irrespective of their position in the word given that they are assumed to be heavy in both mono- and polysyllables. Although this unified approach to V µCµ# and V µCµ.CV µ (C)# lends elegance to the description, it also leads to a disjunction between V µCµ# and V µVµC# since we are entitled to expect word-final consonants to behave in a uniform manner (still the latter cannot be represented as moraic). It is, however, unclear to me why Riad believes that the connection he has established (i.e. segmental and moraic identity) is more significant than the one he has abandoned (i.e. positional considerations). In my opinion, the latter is undoubtedly superior. Consider the phonetic realization of /t/ in American English: beat, bit, bitter . It is evident that positional correspondence ( beat - bit ) leads to identical patterns, while segmental accordance ( bit - bitter ) does not. I take this as evidence that the postvocalic short C of monosyllables is non-moraic irrespective of the length of the preceding V .
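The arithmetic behind the objection concerning skip and skipi can be made explicit. The Python sketch below is only an illustration: the representation of a word as a list of (segment, moras) pairs and the names riad_skip and alt_skip are hypothetical conveniences, and the second representation encodes the alternative assumption defended here, namely that the postvocalic short C of a monosyllable carries no underlying µ.

```python
def count_moras(segments):
    """Sum the underlying moras of a sequence of (segment, moras) pairs."""
    return sum(moras for _, moras in segments)


# Riad's analysis: both the short V and the following C of skip bear a mora.
riad_skip = [("s", 0), ("k", 0), ("i", 1), ("p", 1)]
# Alternative assumed in the text: the word-final short C is non-moraic.
alt_skip = [("s", 0), ("k", 0), ("i", 1), ("p", 0)]
# The monomoraic dative singular suffix -i.
dative_i = [("i", 1)]

SURFACE_SKIPI = 2  # skipi is acknowledged to be bimoraic on the surface

for label, stem in (("Riad", riad_skip), ("alternative", alt_skip)):
    underlying = count_moras(stem + dative_i)
    surplus = underlying - SURFACE_SKIPI
    print(f"{label}: {underlying} underlying moras, {surplus} to be delinked")
```

Under the first representation one µ has to be removed in the course of the derivation in order to reach the bimoraic surface form, whereas under the second no delinking is required.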

V lengthening before C clusters

As mentioned above, the first stage of the quantity shift involved the elimination of overlong syllables. We find that the now illicit structure of VµVµCµ is systematically resolved with the help of V shortening, while the geminate is left intact. This is what we expect given that the change restructured the actual lexical representations, recall our discussion of (24ai) in 2.2.2. Although V shortening before a long (i.e. moraic) C is a regular development, it is not always easy to discern what actually counts as long. Noreen (1904:85) reports V shortening before a geminate or a tautosyllabic C cluster (not due to syncope) such as in mínn (mine), Thórkil (given name) etc. Somewhat surprisingly, he also makes mention of V lengthening before certain C clusters, typically before a sonorant followed by an explosive (ibid: 122). Such cases of seemingly unmotivated lengthening can also be encountered in Middle English, cf. the modern reflexes of child and children. According to Riad (1992:302) V lengthening is not caused by a “higher-order principle” and is merely spontaneous. However, if our earlier suggestion is tenable (namely that moraicity was constrained to vowels and geminate consonants in the relevant period) then we may propose that the very source of the irregularity is that a C.C or a CC#-cluster is not necessarily moraic, in which case we can expect V lengthening. Given that the weight of a geminate can be considerably obscured by a following C, this can understandably lead to some uncertainty as to the C’s underlying specification, which in turn can structure the attested irregularities. Note that these patterns are analogous to what Riad (1992:243) calls false overlength.

Syncope in terms of moras It should be clear by now that the moraic status of consonants is to a large degree arbitrary, while the moraicity of vowels is not. This state of affairs can be further 99 illustrated with some moraic changes leading up to the quantity shift. The reductions of the Syncope Period, which only affected light syllables, provide us with reliable information concerning σ weight in Proto-Nordic. Accordingly, V loss in (50a) but not in *griipan (to grasp) indicates that a sonorant C can contribute to the weight of an unstressed σ, while an obstruent cannot, (ibid: 106). Given that *gastiz underwent reduction in the first Syncope Period, which mostly affected words with a heavy root σ, we can deduce that any kind of (immediately) postvocalic C is moraic in main stress syllables. This is the hypothetical input to a series of changes that led to generalized extrametricality in Sw&No after the completion of the quantity shift. Now let us look at some claims made about the intermittent stages of the process.

(50) The Syncope Period (based on Lahiri & al (1999:357))
Early syncope (µ-deletion)
a. *'gas.tiz > gestz (guest), *'wul.faz > wulfz (wolf)
b. *'da.gaz > *dagz > dagr (day), *wiraz > *wirz > verr (man)
c. *'ka.ti.laz > *ka.tilz > 'ke.till (kettle), *'ka.ti.looz > 'kat.lar (kettles)
Late syncope (V-deletion)
d. sitiz > sitr (sits), sunu > sun (son)

Lahiri & al (1999:358) observe that “syncope in the first period deleted moras, rather than just vowels… [while] later syncope was actually a rule of vowel deletion, not mora deletion”. The first part of the statement can be supported with (50a), in which trimoraic words are indeed reduced to a single heavy (i.e. bimoraic) σ. The rest of the examples, however, are not all in agreement with what we may expect. The first problem to notice is that syncope in (50b) targets bimoraic words, which runs counter to Riad’s (1992:141) claim that “in a minimal bimoraic word, mora deletion cannot take place”. On the other hand, µ loss in the stress foot¹⁰⁰ of (50cii) is not borne out as expected (given that both ka.ti. and kat. are bimoraic), and (50ci) cannot be reconciled with Lahiri’s claims unless we argue that a sonorant C could no longer contribute to the weight of a σ not bearing main stress. Such an argument, however, would not be compatible with Riad’s analysis of nasal loss¹⁰¹ in *valjan (to choose), which he interprets as µ deletion (ibid: 140). Note that the lost µ is regained when syncope is followed by gemination as in (50bii) and (50ci). Gemination in the latter is problematic again, since as Riad (1992:249) puts it “we should not expect syllables outside of the main-stress position to expand from light to heavy”.

99 Interestingly, a word-final /s/ outside of a stressed σ is reported to be non-moraic in Gothic, cf. Riad (1992:44), but moraic in OE, which is closely related both typologically and temporally, cf. Lahiri & al (1999:344).
100 V shortening in the final σ of (50cii) is due to stress clash resolution and has nothing to do with syncope.
101 This change must have occurred after syncope given that word-final /n/ inhibited V loss after heavy syllables, cf. *griipan above.

Moreover, if moras are indeed preserved in the second Syncope Period, then how can we explain the moraic transition from alleged suµnµ (50dii) to soµµn? V lengthening, as we have seen above, would be very difficult to account for. Finally, I have difficulties following Riad’s claims about monosyllables being treated as final or initial (ibid: 248, 312). Starting from the purported moraicity of consonants, he assumes that monosyllables (which are evidently both final and initial) were treated as final up to the end of the Syncope Period, after which they behaved like initial syllables. However, by the time level stress appeared in early Old Swedish, monosyllables had been reanalyzed as final again. Also, I fail to understand why the final-initial dichotomy is relevant in the matter and why it should switch back and forth from time to time. Riad himself indicates that the main distinction can be drawn along the notion of stress: any segment can be moraic in the rhyme of a main-stress σ, while in other syllables moraicity is restricted to sonorants (ibid: 246). Accordingly, a monosyllable cannot behave like a final σ given that the former is stressed, while the latter is not. All these contradictions indicate that some of the claims we have reviewed are based on false premises. As we have suggested above, one of these is the belief that a Gmc content word is always heavy, even if it is of the shape CVC as Go.
dag (day – sg. acc.). Recall that we have argued this to be a misinterpretation given that a branching rhyme is not automatically bimoraic. As a matter of fact, content words typically involve some branching structure cross-linguistically, yet such generalizations should not necessarily be equated with prosodic requirements. As far as I understand, this is a combinatorial question . Content words are open class words by their very nature and often appear in stressed position. Now if we consider an average language with 5 vowels and 20 consonants, we can see that the total number of possible combinations that do not involve branching rhymes or feet (leaving phonotactic considerations aside) is 100. It is thus not surprising that this limited set of short combinations is primarily associated with the equally limited set of usually unstressed (i.e. short) function words. Another reason why moras seem to escape our grasp has to do with the limitations of diachronic research. The moraic status of consonants is often uncertain even in a synchronic context (recall Basbøll’s moraic analysis in 2.3.1). It cannot get much better when we are largely confined to educated guesses about bygone changes in want of data. To take an example, the observation that a sonorant C inhibits syncope in words like *griipan (to grasp) but an obstruent does not, cf. (50a, b) can be given multiple interpretations. We can seek the solution in terms of σ weight as Riad does but we can also distance ourselves from a moraic analysis and focus on phonotactics instead. Given that Proto-Nordic syllabic nuclei could only be filled with vowels, syncope before a sonorant C would have led to illicit structures. Note that obstruents are not subject to similar restrictions given that they can freely form a coda together with the previously intervocalic C (cluster) without violating the sonority principle. 
It may thus turn out that the observed patterns of syncopic reduction reveal more about phonotactic constraints than about moraic conditions.
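The combinatorial point made above about content words can be checked with a small sketch. It assumes, purely for illustration, that a "short combination" is a CV sequence of one onset consonant and one short vowel, and that the inventory has 5 vowels and 20 consonants. The labels (V1…, C1…) are placeholders and do not stand for the segments of any actual language.

```python
from itertools import product

# Hypothetical inventory: placeholder labels, not segments of a real language.
vowels = [f"V{i}" for i in range(1, 6)]        # 5 short vowels
consonants = [f"C{i}" for i in range(1, 21)]   # 20 consonants

# Non-branching shapes: one onset consonant plus one short vowel (CV).
cv_shapes = list(product(consonants, vowels))

# Branching rhymes for comparison: CVC (closed) and CVV (long nucleus or diphthong).
cvc_shapes = list(product(consonants, vowels, consonants))
cvv_shapes = list(product(consonants, vowels, vowels))

print(len(cv_shapes))   # 100
print(len(cvc_shapes))  # 2000
print(len(cvv_shapes))  # 500
```

Under these assumptions, the non-branching set is an order of magnitude smaller than the closed-syllable set, which is consistent with the suggestion that an open class of words will tend to occupy the larger, branching shapes.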

3.1.5. Traces of language contact

The lengthening of a light stressed σ in a bisyllabic word ('CV.CV) can proceed in two ways: the language either employs OSL or intervocalic gemination. However, the latter option does not merely lengthen the stressed σ, but also provides it with a coda. Such a structural change seems unmotivated as far as the aims of the change are concerned, which means that OSL is expected to be the cross-linguistically preferred option. Kusmenko (2005:130f) also asserts that in this respect V lengthening is a natural process, while C lengthening is rare and exceptional. This assumption can be further supported with reference to our functional principles in (4). Languages that have a quantity opposition can usually lengthen all their short vowels, while this is not true of each and every C. Voiced fricatives are, for instance, notoriously difficult to sustain over a longer duration, which means that V lengthening is clearly preferred as far as articulatory effort is concerned. We can reach similar conclusions when we turn to the economy of description. An ambisyllabic C represents a descriptive complication since it is split between two distinct higher level constituents (cf. the syllable integrity principle in Riad (1992:100)) and thus requires a different treatment from final geminates. On the other hand, long vowels can be represented in a uniform fashion. The fact that our functional principles (along with other phonological considerations) point to V lengthening as a natural strategy indicates that gemination is likely to be triggered by external forces, cf. 1.5.3. As a matter of fact, those Scandinavian dialects that did implement the quantity shift display certain differences when it comes to lengthening. Generally speaking, southern and western dialects employed V lengthening, while northern and central regions made use of both strategies, cf. (ibid: 270). Kusmenko (2005:131) underlines that “[t]he geographical distribution of consonant lengthening heightens the possibility of a Sámi influence”. He identifies the phenomenon of Finno-Samic C gradation as the key source of interference. Given that short-syllabic Sámi words have an alternating form with a geminate, certain Scandinavian loanwords of the shape CV.CV were pronounced with a lengthened C. Guided by our discussion of basic interference types in 1.5.4, the present contact situation can be sketched as follows. As a first step we assume Sámi to be the primary system. Intervocalic consonants in the secondary system (i.e.
Scandinavian) are subject to over-differentiation, i.e. phonemic distinctions arising from C gradation are imposed on the secondary system, cf. (14b). Gemination occurs when speakers of Scandinavian (now the primary system) reinterpret the distinctions caused by C gradation in terms of their own system of quantitative oppositions, cf. (14c). Dialectal geography, unexpected development and the fact that the process can be mapped onto Weinreich’s basic interference types provide ample evidence that such instances of gemination should indeed be attributed to the effects of language contact.

3.1.6. Danish as a West-Germanic language The Scandinavian quantity shift proceeded in three distinct stages. The elimination of overlong syllables was followed by the lengthening of CVC monosyllables, upon which the process was concluded with lengthening in CVCV bisyllables, cf. Hesselman’s laws in Riad (1992:271). The fully implemented quantity shift resulted in a transition from segmental to prosodic length and is thus characterized by quantitative complementarity in the obligatorily bimoraic stressed syllables. Nevertheless, the shift towards prosodic quantity (especially its second stage) seems to have been less systematic in Danish than in No&Sw. Haugen (1982:51) remarks that it is not clear whether CVC-words escaped lengthening as such or whether they were lengthened and subsequently shortened. Riad (1992:329) seems to advocate the latter possibility when he says that “[b]y the quantity shift in Danish, vowel quantity only is chosen as distinctive”. Complementarity is an obvious prerequisite for such choices to be made. Recall that in a complementary system we do not need both V and C quantity in the UR, but it is not always easy to decide which option to embrace, cf. 2.2.2. Riad assumes that those Gmc dialects that have preserved the achievements of the quantity shift (i.e. standard Sw, No, Icelandic and Faroese) have underlying C quantity, while in others (such as Danish, English, German and Dutch) V quantity is distinctive instead, cf. Lahiri & al (1999:363). The main argument in favour of such an interpretation has its roots in the observation that all members of the latter group have undergone degemination and lack

(morpheme-internal) long consonants. The disappearance of geminate consonants would indeed be rather difficult to account for if C quantity were distinctive. If degemination in Danish is indeed the consequence of C quantity being redundant in a complementary system, this means that monosyllables are argued to have returned to their original form following a cursory lengthening. In a functionalist framework we are rather reluctant to postulate seesaw changes (such as A > B, B > A) given that we view phonological changes as unidirectional, cf. 1.4.3. Furthermore, we cannot adopt the proposition that distinctive V quantity automatically leads to degemination, since this would contradict our earlier claims about the UR of Sw&No, cf. 2.2.2. So what arguments are there to support the alternative proposal that Danish did not fully implement the quantity shift and thus has never had prosodic length akin to Sw&No? In what follows I will try to show that underlying V quantity is not necessarily a relevant factor as far as degemination is concerned.

The triggers of degemination

It is a convenient starting point to consider those (West) Gmc languages that have also lost distinctive C quantity. The first thing to notice is that degemination in both Danish and West Gmc took place immediately (i.e. within a few generations) after OSL, as indicated in (51) below. The relative chronology of the changes can also be deduced from the fact that early degemination in words of the shape CVC:V would have provided an input to OSL. Given that OE sittan (sit), Dutch fallan (fall) etc. come down to us with a short stressed V (cf. fallen, vallen), we can conclude that the two rules were in a counterfeeding order. Accordingly, when Goossens (1974:42, 73) dates both processes to the same period, cf. (51c), there can be no doubt about their relative chronology. Similarly, OSL in High German (1400-1600) must have taken place in the first part of the period, given that long consonants are assumed to have shortened by the 16th century, cf. Tschirsch (1989:167) and Nádasdy (1989:202).

(51) The chronology of OSL and C shortening
a. Danish¹⁰²: 1250 vs. 1300
b. English¹⁰³: 13th century vs. 13th-14th centuries
c. Dutch: both between 1200 and 1500
d. High German: both between 1400 and 1600
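The counterfeeding order argued for above can be illustrated with a toy derivation. This is only a sketch under stated assumptions: words are plain strings in which ":" marks length, "sit:an" is a schematic rendering of OE sittan (shape CVC:V), "nama" is a hypothetical CV.CV input, and the functions osl, degeminate and derive are illustrative names rather than part of any established formalism.

```python
import re

VOWELS = "aeiou"
CONSONANTS = "bdfghklmnprstvw"

def osl(word):
    """Toy OSL: lengthen a short root vowel followed by exactly one short consonant plus a vowel."""
    pattern = rf"^([{CONSONANTS}]*)([{VOWELS}])(?=[{CONSONANTS}][{VOWELS}])"
    return re.sub(pattern, r"\1\2:", word, count=1)

def degeminate(word):
    """Toy degemination: shorten every long consonant (C: > C)."""
    return re.sub(rf"([{CONSONANTS}]):", r"\1", word)

def derive(word, rules):
    """Apply the rules one after another, in the order given."""
    for rule in rules:
        word = rule(word)
    return word

# Order assumed in the text: OSL first, degemination afterwards.
attested = derive("sit:an", [osl, degeminate])
# Reverse order: degemination would feed OSL.
counterfactual = derive("sit:an", [degeminate, osl])
# A light root is lengthened under the assumed order.
light_root = derive("nama", [osl, degeminate])

print(attested)        # sitan
print(counterfactual)  # si:tan
print(light_root)      # na:ma
```

Only the first ordering keeps the stressed vowel of the geminate word short, which is the outcome the text reports for the modern reflexes. The reverse ordering would lengthen it.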

In addition to this chronological uniformity, the West Gmc languages and Danish converged on certain prosodic traits as well. For instance, they had all restricted the occurrence of long vowels to stressed syllables when they implemented OSL. This causal relationship (in the sense of formal causes, cf. 1.2.1) can be firmly established with reference to High German, where long vowels were preserved much longer (well into the middle period) in unstressed closed syllables than in English or Danish, cf. Lahiri & al (1999:347). In accordance with this, OSL entered the scene roughly two centuries later in High German than in the other languages of (51). Another shared characteristic of the degeminating languages is that they all allowed (a restricted set of) light syllables under main stress at the time of OSL in bisyllables. Danish had arguably escaped the lengthening of certain CVC-words. The same was true of German, where the long nucleus of present-day Tag (day) is due to analogical pressure aiming at paradigm uniformity and cannot be attributed to a systematic change (cf. the plural form Tage, whose long V (resulting from OSL) served as a basis for the analogy). Light stressed syllables were even more frequent in Middle English, where the effects of trisyllabic shortening (cf. vanity - vain) greatly increased the number of short stressed syllables. CVC-monosyllables were not systematically lengthened in English either.

102 Cf. Haugen (1982:25) and Skautrup (1944:254).
103 Cf. Lass (1992:47, 59). Both processes occurred somewhat earlier in the north of England, cf. Lahiri & al (1999:350).

These correspondences can be interpreted as follows. Both West Gmc and Danish underwent certain quantitative changes in a transition from segmental to prosodic length. The transition was of course incomplete by virtue of the few remaining light stressed syllables in CVC-words (and in trisyllables in English). The stress system of the languages in question was thus a curious mixture. It resembled a genuine prosodic system in all but one aspect, yet it displayed a three-way (i.e. almost fully-fledged) segmental opposition given that CV:C, CVC: and CVC were all legitimate and distinctive. Such a dual system is an unstable constellation (a language should have either segmental or prosodic length), which calls for the reduction of the segmental oppositions. Note that it would be much more cumbersome to retreat on the prosodic level and reintroduce heavy unstressed syllables. The superiority of degemination over V shortening (in the present context) is evident given that the latter would be the mirror image of the latest change in the system, which would violate the principle of unidirectionality, cf. 1.4.3.

Syllable cuts Degemination leaves us with the following σ types under main stress: VC, V: and V:C (of which only the first two can feature in disyllables). The question is how the loss of C quantity can improve the challenges of the dual system if it simultaneously diminishes segmental length and undermines our prosodic generalizations by increasing the number of light stressed syllables as in CVC:V > CV.CV. If we want to maintain that degemination was indeed an improvement, then we must assume that the resulting sequence was, for some reason, syllabified as CVC.V so that the stressed σ could remain heavy. As a matter of fact, Da ville (to want), G Wasser (water) and E butcher are usually analyzed as having a closed stressed σ, which is consequently heavy, cf. Kusmenko (2005:135), according to whom “the postvocalic consonant clearly belongs to the coda in Danish and the [so] the root syllable remains bimoraic”. It would thus seem that every stressed σ is heavy in West Gmc and Danish; but such an assumption is contested by Árnason (1996:2) among others, who says that “Danish has light stressed syllables and in that respect is similar to German and English”. This very idea is expressed by Lahiri & al (1999:367) who say that “rhyme- branching does not necessarily mean heavy; rather, closed syllables are heavy while open syllables are light”. This state of affairs is clearly incomprehensible in terms of moras. Such analyses rely on the theory of σ cuts whose basic idea is that a stressed σ is either smoothly or abruptly cut (i.e. with either a slow or a sharp drop of the energy contour). The former represents the unmarked case and is typically found in languages that lack a σ cut opposition, cf. Vennemann (2000:252). The fact that σ cuts are defined by energy contours in relation to a σ nucleus indicates that σ cuts are an expression of prosodic length and are thus incompatible with segmental quantity. 
Accordingly, the difference between German Ha:se (hare) and hasse (I hate) is an opposition of σ cuts (surfacing as a tense-lax distinction) and not of quantity. Trubetzkoy (1939:178, 199) remarks that “shortness is nothing but the expression of the interruption of vowel articulation by the following consonant… [so i]f the vowel with close contact [i.e. abrupt cut] appears to be shorter than the vowel with open contact [smooth cut], this is merely a phonetic side phenomenon”. All this indicates that segmental quantity can be ousted¹⁰⁴ by a system of σ cuts, which is what happened in Danish and West Gmc. Vennemann (2000:262) is of the opinion that the development of σ cut prosodies occurred before the major quantity changes in English, German and Icelandic (and, although he does not mention this, probably in Dutch and Danish too). Furthermore, he assumes that what we call OSL is essentially a transition from abrupt to smooth cuts with the concomitant feature of tensing, cf. (ibid: 268).

104 Note that under this interpretation all modern Gmc languages have abandoned segmental quantity and adopted a prosodic system, which manifests itself in σ cuts either with or without complementarity.

Let me outline why I think that this is an improbable assumption. My main objection concerns the problem of actuation. Vennemann simply announces that certain transitions (e.g. segmental > prosodic, abrupt > smooth) took place at a given period of time; however, he provides no explanation whatsoever as to what could have triggered the changes in the first place. The only causal account he embraces relates to the sonority of postvocalic consonants (such that high sonority can transform an abrupt cut into a smooth one (ibid: 264)), but this is relevant only for a tiny subset of the transitions involved. Accordingly, the purported transition from abrupt to smooth cuts in words like Old Norse vita (to know) > Sw ve:ta is left unexplained, which clearly undermines Vennemann’s proposal. We have seen that Riad (1992) is an extremely detailed, well-founded treatment of the loss of segmental stress in the Gmc languages. He makes it clear that the transition into a prosodic system was a gradual, rather protracted process, whose full implementation was prepared and accompanied by a great number of quantitative changes. It is thus utterly unlikely that the adoption of σ cuts predated OSL, as OSL was arguably the last decisive change in favour of a prosodic interpretation of quantity. So it is more plausible to assume that the appearance of abrupt cuts (i.e. the abandonment of segmental quantity) proceeded hand in hand with degemination, as we have suggested above.

On the scope of moraic analyses We can conclude this section by considering the challenges that the loss of C length poses for moraic analyses. Trubetzkoy (1939:178) says that µ-counting and σ-counting languages differ in principled ways. While short (as opposed to long) nuclei are always unmarked in the former, they are marked in the latter group provided that the language employs a σ cut opposition. So it is assumed to be a very “peculiar combination” for a µ-counting language to display abrupt cuts. The Hopi language of the Uto-Aztecan family, where short stressed vowels can only occur before a C (just like in West Gmc), is claimed to be one of the rare exceptions. However, based on the scarce evidence found in Trubetzkoy (1939:179), Hopi’s status as a µ-counting language seems somewhat ambiguous. The author acknowledges that the stress rule that assigns stress to the second µ of the word is not a synchronic regularity. Furthermore, two of the three distinctive degrees of quantity are taken care of by Trubetzkoy’s close and open contact. The point is that a language that makes use of a σ cut opposition in such a way that a CVCV-sequence is syllabified as CVC.V should not be analyzed in terms of moras in the same vein as languages of segmental quantity or of prosodic complementarity. Recall that smoothly cut open syllables are described as light (purportedly monomoraic), abruptly cut syllables are always heavy (purportedly bimoraic). Now the problems of such an overapplied moraic approach become soon apparent if we consider how the transition into a σ cut opposition can be conceptualized. (52) below outlines a typological change of the type Sw > Da.

(52) C shortening and σ cuts (based on Kusmenko (2005:133))
a. falla (fall): CVµCµ.CVµ > CVµCµ.Vµ
b. tala (speak): CVµVµ.CVµ > CVµ:.CVµ

In his analysis of (52a) Kusmenko assumes that degemination resulted in the disappearance of an extrametrical element (i.e. the onset of the second σ), but the moraic structure itself was not altered, as can be deduced from the heaviness of the root-initial σ. However, if Kusmenko is right about (52a), we have no option but to fully embrace the theory of σ cuts and claim that (52bii) is light (thus monomoraic) by virtue of being open and smoothly cut. This means that if we extend our moraic analysis to the domain of σ cuts, we must conclude that consonantal degemination leaves the moraic status of consonants unaltered, while it strips away a µ from the representation of long vowels. Kusmenko (2005:139) also addresses the problem of ə-ass, which he calls C lengthening by apocope. He assumes that compensatory lengthening cannot reduce the number of moras, so he represents the apocopated form of (52a) as CVµCµCµ. This basically amounts to the highly unlikely scenario that (52a) is still trimoraic after two instances of reduction. Those who compare Sw falla with Da falde (fall) will immediately recognize some apparent quantitative differences, which make it impossible to impose the same moraic structure on the Sw and the Da word. Such highly counter-intuitive conclusions render Kusmenko’s analysis completely untenable and indicate that, unless we delimit their scope in some way or another, moraic analyses as such face certain inherent problems. A reasonable first step (in light of the above discussion) would be not to employ moras in the description of languages with a σ cut opposition, because the term has rather different (often mutually incompatible) implications when used in the context of σ cut languages and segmental or prosodically complementary length respectively. This way the multiple senses associated with moras can be efficiently reduced. Furthermore, we saw in 3.1.4 that the moraic status of (word-final) consonants probably also needs revision since its present use is often disturbingly arbitrary. Last but not least, it must be pointed out that moras are ultimately metrical units and poetry does not necessarily reflect spoken language in a reliable way.
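The moraic bookkeeping behind the discussion of (52) can be made explicit with a short sketch. It assumes that the templates are written as strings with "." as the syllable boundary and a mora sign after each moraic segment. The helper moras_per_syllable is a hypothetical name introduced only for this illustration.

```python
# Both the micro sign and the Greek letter mu are accepted as mora marks.
MORA_SIGNS = ("\u00b5", "\u03bc")

def moras_per_syllable(form):
    """Count the mora marks in each syllable of a dot-separated template."""
    return [sum(syl.count(m) for m in MORA_SIGNS) for syl in form.split(".")]

# Templates of (52a, b) and the apocopated form attributed to Kusmenko.
falla_before, falla_after = "CV\u00b5C\u00b5.CV\u00b5", "CV\u00b5C\u00b5.V\u00b5"
tala_before, tala_after = "CV\u00b5V\u00b5.CV\u00b5", "CV\u00b5:.CV\u00b5"
apocopated = "CV\u00b5C\u00b5C\u00b5"

print(moras_per_syllable(falla_before), moras_per_syllable(falla_after))  # [2, 1] [2, 1]
print(moras_per_syllable(tala_before), moras_per_syllable(tala_after))    # [2, 1] [1, 1]
print(moras_per_syllable(apocopated))                                     # [3]
```

On this count the geminate word keeps a bimoraic root σ through degemination, the long-vowel word loses a µ, and the apocopated form ends up as a single trimoraic σ. These are the three outcomes that the text finds counter-intuitive.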

3.2. The history of tone and stød

While the previous discussion of stress and quantity relied heavily on comparative evidence from other Gmc languages, the problem of tone and stød confines us entirely to the testimony of Da&No&Sw, as these phenomena have no direct counterparts in other related languages. Neither Faroese nor Icelandic shows traces of a corresponding tonal or glottal opposition, and it is not even clear whether the opposition was a common Scandinavian inheritance (and thus lost at an early stage in insular Scandinavian) or whether it was a continental innovation that post-dated the colonization of the Atlantic. A choice between the two options is usually made on the basis of what phonetic factors a given writer attributes to the birth of the opposition. As we will see below, the theories put forward so far advocate highly contradictory positions in all possible respects. Given that we lack adequate answers to even these basic questions of the relatively recent past of the North Gmc languages, we will disregard far-fetched hypotheses that see a link between Scandinavian accentuation and e.g. the circumflex of Proto-IE, cf. Kock¹⁰⁵ (1901:110f). The first written accounts of the opposition date back only a couple of centuries, so they are of limited use. In want of early records and comparative evidence from other languages, the linguist has to resort to the distributional peculiarities of the oppositions, to dialectal geography and to her understanding of phonological change in order to be able to reconstruct certain aspects of the prosodic history of North Gmc. Therefore a proposal for the birth of the opposition can be considered reliable if it 1) is phonetically tenable, 2) is in accordance with historical sources, 3) can be used as a starting point to account for contemporary dialectal variation, and 4) can explain the distribution of the opposition without excessive¹⁰⁶ reference to analogical pressure. In what follows I will evaluate a number of earlier suggestions in light of these four criteria.

105 See Kuryłowicz (1936) for arguments against Kock’s proposals.

3.2.1. Tonogenesis Although the Scandinavian oppositions display both phonological and morphological regularities, even the best-formulated distributional rules have a large number of exceptions. Any binary opposition defined by both morphophonological and lexical factors is likely to have arisen from allophonic variation, which having lost its conditioning first acquires an exclusively morphophonological character of high predictability. The distribution of the opposition is then transformed such that the lexical component grows more and more dominant at the expense of phonological predictability, while at the same time new morphological regularities may develop. A similar line of development can be assumed for the history of the Sw/No opposition. This section is devoted to the concept of tonogenesis , a term which I use for the phonologization of two positional tone variants, i.e. the birth of an opposition. Although the first written sources that make explicit references to the oppositions date back to the 18 th century, cf. Wetterlin (2007:17), the phenomena in question are no doubt much older than that. The general consensus is that the distinction arose in the 11 th or 12 th century as a result of 1) epenthesis in words like segl > 1segel (sail) and 2) the emergence of the definite suffix of nouns like 1and +en (the duck), cf. Riad (1998b:65). Both processes gave rise to disyllables specified for Accent 1, thus creating the basis of contrast with other polysyllables normally bearing Accent 2 (e.g. 2spegel (mirror), 2ande +n (the spirit)). The observation that (former) monosyllables are assigned Accent 1, while longer words (with the exception of loanwords, lexicalized compounds etc) surface with Accent 2 led the earliest writers on the subject to various theories that in some way or another relate to the number of syllables involved. Oftedal (1952) reviews and compares the predictions of the two best- known accounts, which he calls hypotheses A and B. 
According to Hypothesis A, the phonetic basis of the opposition goes back to the early Old Scandinavian Period (i.e. to the 9th century following the completion of syncope) such that “all words that were monosyllabic in that period, but later developed into dissyllables, received Accent I… [while a]ll other polysyllables received Accent II” (ibid:158). Hypothesis B can be attributed to Axel Kock and assumes that the accent distinction has its roots in the reductions of the syncope period. If a word lost a σ by syncope, it received Accent 1, otherwise (i.e. if it retained all its syllables) it received Accent 2. Oftedal (1952:172) finds that Hypothesis A is more consistent with the facts of tonal distribution as it explains fewer cases by analogy than its competitor. It is obvious that the two hypotheses above fail to provide a satisfactory phonetic explanation given that the sheer number of (lost) syllables is unlikely to affect the pitch contour in any principled way, cf. Riad (1998b:73). Given that pitch is one of the three main correlates of stress, it is natural to ascribe the tonal difference of mono- and polysyllables to different stress patterns. In the unmarked case (i.e. presumably also in the Syncope and the Old Scandinavian periods) a stressed σ is represented (among other things) by a redundant F0 peak. Given that in early Gmc times each word was assigned root-initial primary stress (if uttered in isolation), it is reasonable to attribute the tonal curves of the opposition to the presence or absence of a non-initial σ bearing some degree of prominence. As we saw in 3.1.3, the history of the Nordic languages offers two possible sources: secondary stress in Proto-Nordic and level stress in Old Scandinavian.

106 This last requirement reflects a cornerstone of scientific descriptions known as Occam’s razor. While a simple explanation is always preferred to a more complicated one in a synchronic description, historical developments do not always follow the most economical path. Yet despite the fact that Occam’s razor does not apply equally well to historical explanations, Oftedal (1952:164) remarks that “many a treatise on problems of linguistic history would have gained both in consistency and persuasion if the authors had been more keenly observant of the many cases where the principle is applicable”. (my italics)

3.2.1.1. Secondary stress in Proto-Nordic

The main tenet of the Proto-Nordic hypothesis as formulated by Riad (1998b:68f) is that Gmc foot structure was defined by moraic trochees such that a non-initial heavy σ could support its own stress foot and therefore surfaced with secondary stress. The reductions of the Syncope Period both increased the number of secondary stresses (by virtue of creating closed syllables upon apocope) and gave rise to stress clash in words like *katilaz > ketill (kettle): (x.). > (x)(x) or *doomijan > døøman (to judge): (x).(x) > (x)(x), cf. (ibid); the notation is explained in fn. 107. The problem of adjacent stresses is commonly resolved by destressing, but in this case secondary stress was not completely lost, given that the originally redundant F0 contour was preserved. Riad (1998b:74), in fact, asserts that the separation of pitch from the other stress correlates (i.e. quantity and intensity) can be attributed to stress clash. In other words, stress clash is argued to have triggered the loss of both length and intensity. Yet if we consider the various segmental reductions that were brought about by stress demotion, we can also argue that at first only intensity was lost, given that quantity disappeared only as a result of destressing. If a polysyllabic word was confined to a single stress foot in the period after syncope but before destressing, then it came down as a monosyllable in early Old Scandinavian as a result of further reductions. Accent 2, on the other hand, was a reflex of non-initial stress feet and was found in polysyllables. If the distinction indeed arose due to epenthesis and cliticization, this means that the tonal curve was still redundant after stress demotion, as it was first encoded in σ weight (²xx – ¹x.) and then in the number of syllables (²x. – ¹x). However, this could only have been the case on the assumption that the apocope of (53c) preceded the reductions of (53a, b). Otherwise the rhythmical structure x. could have been associated with both melodies, in which case tonal information was a lexical property of individual words even before epenthesis entered the scene.

(53) Some common Scandinavian reductions (taken from Riad (1998b:69))
a. døøman > døøma (to judge): xx > x.
b. sattee > satte (put, preterite): xx > x.
c. wordu > ord (word): x. > x
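The rhythmic patterns in (53) can be restated in a small sketch. It is merely illustrative and not part of Riad's analysis: it assumes a simplified string encoding in which x marks a stressed σ and a dot an unstressed one, and it treats any two adjacent x's as a stress clash. The function and variable names are hypothetical.

```python
def has_stress_clash(pattern: str) -> bool:
    """Return True if two stressed syllables ('x') are adjacent.

    Assumption: 'x' = stressed syllable, '.' = unstressed syllable,
    as in the rhythmic patterns quoted in (53).
    """
    return "xx" in pattern


# Patterns copied from (53): (before reduction, after reduction).
REDUCTIONS_53 = {
    "døøman > døøma": ("xx", "x."),   # (53a)
    "sattee > satte": ("xx", "x."),   # (53b)
    "wordu > ord": ("x.", "x"),       # (53c)
}


def resolves_clash(before: str, after: str) -> bool:
    """True if the reduction removes a clash that was present before."""
    return has_stress_clash(before) and not has_stress_clash(after)


# Reductions that can be read as clash repairs under this encoding.
clash_repairs = [
    label
    for label, (before, after) in REDUCTIONS_53.items()
    if resolves_clash(before, after)
]
```

Under these assumptions only (53a, b) count as clash repairs, whereas (53c) starts out without a clash. This is the asymmetry that the argument about the relative chronology of the reductions relies on.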

Now if the inherent conflicts of the Gmc stress system (light stressed, heavy unstressed syllables, stress clash etc) are indeed responsible for the early reductions and at a later stage for OSL, then it is highly unlikely that words like (53c), which are compatible with the expectations of a system based on prosodic quantity, were earlier affected by reductions than such words whose prosodic structure calls for repair strategies (53a, b). It is not necessarily a valid counterargument that (53c) must have preceded (53b), because otherwise the final short V of (53b) would also have been targeted by apocope. The second peak of Accent 2 could have prevented such deletions. Accordingly, if the premises of the Proto-Nordic hypothesis are correct, then tonal information was probably lexical for a short time after (53a, b) and

107 Parentheses are used to indicate stress feet, while a dot denotes a monomoraic unstressed σ. Word-initially x stands for a σ with primary stress (either light or heavy), otherwise it refers to a bimoraic σ bearing secondary stress. Recall (3.1.4) that sonorants in word-final position are generally claimed to contribute to σ weight as opposed to obstruents, hence the last σ of *doomijan has secondary stress but the final σ of *katilaz does not.

before (53c), which means that epenthesis and cliticization only reintroduced and did not “create” the distinction. If the phonetic basis of the opposition is indeed due to stress clash resolution, then we can conclude that double-peaked dialects (i.e. where Accent 2 is realized as H-LH-L as in Stockholm) represent the original state of affairs. Central Swedish is thus a possible input to the diachrony of Scandinavian accent typology, since as Riad (1998b:87) puts it “nothing much has happened prosodically between PN [Proto-Nordic] accent 2 and Stockholm Swedish accent 2”. This is apparently a slight exaggeration. Just to take an example, the stress pattern of multiple compounds (cf. (27) in 2.2.3) probably did not mark the edges of a Proto-Nordic word but reflected its semantic content in much the same way as can be observed in contemporary non-tonal Gmc dialects. The tones of Accent 2 also had a different association pattern in trisyllabic simplex words than they have today, cf. Kuryłowicz (1952:466). There is at least one further difference between the two language stages that may cast a shadow on the validity of the Proto-Nordic approach. Irrespective of the actual phonetic manifestation of the tonal curve, Accent 2 is generally described as trimoraic (even in circumflex dialects, cf. fn. 48), which is especially clear in privative approaches that postulate three separate building blocks, cf. (28).
The ban on tonal crowding restricts the opposition to polysyllables given that the two moras of a monosyllable would not be able to host the whole melody. This raises the following question. How is it possible to account for 2CV:CV-words in modern Sw&No in those cases where the long V is due to OSL, cf. (51)? Words like Sw 2tala (speak), 2gata (street), 2hane (cock) were all bimoraic at the time when the opposition is claimed to have arisen (i.e. between 1000 and 1200), which means that Accent 2 should be ruled out on structural grounds. How likely is it that a single µ could host two tones in the relevant period? The proposal that Accent 2 originates from two stresses may lead us to assume that only the two H tones have phonological relevance, which means that the L tones (or at least the first one) can be hypothesized to be epenthetically inserted in order to meet the requirements of the OCP, cf. Riad (1998b:78). Thus bimoraic Accent 2 in bisyllables is only possible if the melody is made up of two falling tones: HL-HL , which was then reanalyzed as H-LH-L upon the completion of OSL. Therefore it is not convincing to claim, as Riad (ibid: 72) does, that suffixes (e.g. stor +a (big + pl)) are specified for a lexical H tone, which “docks on the primary stress syllable”. First of all, as we have seen it is a mistaken approach to assume identical tonal building blocks for early Old Scandinavian and the modern languages. Secondly, why would the additional tone of a suffix end up on the first σ of the root? This is a rather counter- intuitive claim that Riad (1998b:80) adopts without further scrutiny. The revised opposition of (39) does not face similar problems since in (39) the connective tone of Accent 2 is claimed to follow the prominence tone of the first µ. 
Given that the boundary tone is assigned post- lexically, the H tone of the suffix (enhanced with an epenthetic L) simply appears after the prominence tone without arbitrary tonal movements and ordering problems. The assumption that the peaks of the tonal curves are the legacy of demoted stress raises the question of what gave rise to the separation of the three usual stress correlates. Why was stress demotion accompanied by the preservation of the pitch contour in Scandinavia but not in the West Gmc languages (which also underwent similar reductions)? According to Riad (1998b:71fn) the degree of reduction might have played a role, since rigorously implemented syncope could have caused “clashes with more prominent tonal effects”, yet he concedes that this is difficult to prove. Furthermore, it is difficult to conceive of a natural (i.e. internal) development which results in some stress correlates being lost while others being preserved. Such a situation can only come about when the speakers of a speech community are confused concerning the distribution of secondary stress. It is reasonable to ascribe such confusions to external factors (i.e. to language contact) rather than to prominent tonal effects.


Such an alternative explanation is provided by Kiparsky (2013:30:55), who entertains the possibility that preserved pitch can be attributed to contact with speakers of Finnic languages. The facts that 1) the former distribution of the two-stress system coincides with the linguistic border between Gmc and Finnic speakers during the Viking age, and that 2) speakers of Sami knew how to pronounce accented short syllables, serve as a solid basis for assuming some sort of substratum effect at work. Accordingly, we can speculate that in a language contact situation, where Sami is the primary system, the rules for assigning secondary stress were transferred to the secondary system (Sw/No), which means that the preservation of pitch was an instance of over-differentiation, cf. (14b).

3.2.1.2. Level stress in Old Scandinavian Elstad (1980) proposes an alternative account of tonogenesis, in which he attributes the phonetic basis of the opposition to the balance phenomenon of level stress. His reasoning goes as follows. In most Gmc languages (and presumably even in late Proto-Nordic) an IP is (was) usually characterized by a falling terminal contour in ordinary, definite statements. A word in isolation forms an IP of its own and therefore exhibits a fall from the stressed σ to the right edge of the word. Elstad (1980:391) assumes that “[i]n the 11 th century the two syllables of a Scandinavian disyllable carried fairly equal stress” and therefore the tonal fall was delayed in disyllables. Phrasal stress was then transferred to the word itself, which gave rise to a tonal difference based on the number of syllables. The author adopts the standard assumption according to which the contrast was created by epenthesis and cliticization. Riad (1998b:91) challenges Elstad and points out that the author “assume[s], rather than argue[s], the development from phrase prosody to word prosody”, and thus fails to give a reason for why such a secondary split took place in the first place. Such ad hoc claims indeed deserve criticism, but as far as I can see in this specific case Elstad’s lack of explanation does not invalidate the proposal. What is surprising for me is that he refers to phrase prosody at all. If stress was indeed equally distributed in disyllables, then this was a word-level property and accordingly there is not much to be explained about a transition from phrase to word prosody. It seems to me that we have more reason to reject Elstad’s claims about the birth of the tonal opposition, which he attributes to cliticization and epenthesis (in line with mainstream research). 
If the contours of the opposition are indeed due to the difference between mono- and polysyllabic stress patterns, then it is not tone but stress that is made phonemic by the aforementioned processes. Strictly speaking, the opposition does not become tonal until some restructuring separates pitch from the other stress correlates. The alleged stress pattern of disyllables is another moot point to be considered. Equally distributed stress (known as level stress) was not a function of the number of syllables but rather depended on the weight of the initial σ. If this was light, then the bimoraic condition was satisfied by extending the stress foot to both syllables. In words with a heavy initial σ the stress foot did not include the final σ, which was thus unstressed and underwent more enhanced reductions in Old Scandinavian than corresponding syllables under level stress. This indicates that if we follow Elstad’s reasoning, then we have to equate Accent 2 with level stress . Given that it was the quantitative changes of OSL that broke down the system of level stress, the transformation of the original stress-based distinction into a tonal opposition should also be attributed to OSL. Accordingly, tonogenesis (in the strict sense of the word) entered the scene in the 13 th and 14 th centuries, i.e. roughly two centuries later than assumed by Elstad himself. Interestingly, this is the very scenario (put forward by Kuryłowicz (1952)) that Elstad (1980:388) dismisses in his introduction. As we have seen his main reason for doing so is that he fails to distinguish between stress and tone. Riad (1992:196) also argues in favour of the

assumption that level stress corresponds to Accent 2, but he sees it only as a possible phonetic realization of Accent 2 and not the only one. Kuryłowicz (1952:466), on the other hand, regards level stress as the exclusive source of Accent 2 such that he attributes the Accent 2 of words that have never had level stress (i.e. those that had a heavy initial σ even before OSL) to analogy. Furthermore, he argues that the accentual opposition is unknown to those dialects that have preserved stressed light syllables (ibid: 476). This is, however, probably not a correct generalization given that Riad (1992:196) mentions some light-stem dialects, which display Accent 2 in both light and heavy stems. Elstad’s hypothesis also has some implications for the shape of the earliest tonal contours. If Accent 1 is claimed to be due to a falling terminal contour in monosyllables, then it must be equated with HL. Although a stressed σ is associated with a H tone, level stress in disyllables cannot be two-peaked since in this case we only have one single stress distributed equally over the two syllables. This means that the peak of a single terminal contour is shifted to the right in a way that Accent 2 surfaces as HHL > LHL, a contour that can nowadays be found in e.g. Southern Sweden and Western Norway, cf. Bye (2004:6). We have thus seen that under the Proto-Nordic hypothesis the opposition is claimed to have arisen as H-L : HL-HL (pointing out Stockholm as a tonally conservative dialect), while the level stress approach (cf. fn. 108) leads us to assume that the tonal curves of Malmö (HL : LHL) represent the original state of affairs. In section 3.2.3 we will investigate how these two possible starting points allow us to reconstruct the diachrony of the Scandinavian accent typology.

3.2.2. The birth of stød The distributional similarities between the tonal and glottal oppositions of Scandinavia make it clear that the two phenomena stem from a common source. It is generally acknowledged that the stød system evolved at some point from a tonal predecessor and is thus a late innovation. It is also widely believed that stød used to be the allophonic realization of an HL contour and was phonologized as a result of certain segmental changes, possibly reductions. The details of the reconstructions (proposed by various writers) exhibit substantial variation, and even some points of this basic outline have been put into question.

3.2.2.1. Stød first Liberman (1982) is one of the most unorthodox treatments of Scandinavian accentuation both as regards his claims and his argumentation. I am not aware of a single author who has adopted the diachronic assertions he makes in subsequent research and I do not intend to follow them either. In spite of this I have a twofold reason for considering his position. First, Liberman exhibits a stunningly deep understanding of Scandinavian dialectology, which makes his extremely thorough study an important point of reference even if we have to refute many of his conclusions and assumptions. Second, an analysis of his mistaken premises and (methodo)logical shortcomings may help us avoid some similar pitfalls. Liberman’s whole approach is based on the belief that stød is primarily a µ-counting device while the accents of Sw&No are responsible for counting syllables. The reader is repeatedly reminded that the main function of the accents / stød is to count syllables / moras.

108 Bye (2004) attributes the rise of the opposition to peak-delay, which he says is “frequently observed in tone languages” (ibid: 12), and does not mention level stress. Although peak-delay is not restricted to tone languages, as far as I understand, it cannot create an opposition on its own, i.e. without some initial conditioning in terms of stress. It can at best transform an accentual opposition into a tonal one. Kiparsky (2013:1:50) lists several sources a tonal contrast can have; ad hoc peak-delay is not one of them.

Given that Proto-Gmc was a µ-counting language, he assumes that “stød is a Common Germanic phenomenon” (ibid: 308), which came to signal monosyllabicity through apocope and was then at a later stage transformed into a tonal opposition. To support his claims the author makes reference to a remarkably wide range of dialects, which he sees as “many consecutive rings of a chronological ladder” (ibid: 306). My first objection concerns the logical leap that leads Liberman to the conclusion that stød is a Proto-Gmc inheritance. The fact that stød is a µ-counting device obviously does not entail that all µ-counting languages have stød. So in want of comparative evidence (no West (let alone East) Gmc language is known to (have) exhibit(ed) suprasegmental glottal features) it seems more than far-fetched to trace the phenomenon of stød back to common Gmc times without providing a (sound) phonetic basis for its genesis. Thus the author devotes more than a third of his lengthy monograph to the “Origin of Scandinavian Accentuation” and still successfully evades answering the question he poses to himself: the origin of stød itself is not addressed. As a matter of fact, Liberman is of the opinion that articulatory explanations are not needed in historical linguistics, since the linguist “can at best show how a specific function brought forth an appropriate realization” (ibid: 193). Accordingly, he claims that “[t]he only type of verifiable reconstruction is functional, i.e., a reconstruction that attempts to trace the history of functional units rather than of purely acoustic entitites” (ibid: 306). Such an approach to reconstruction (i.e. one that ignores phonetic details) is clearly untenable especially if we recall what we said about historical accounts in section 1.2.2: the demand for explanation is met if what happened is shown to have been possible. 
Postulating different language stages without demonstrating that the transition between them was phonetically plausible fails to meet the requirements of an explanation, simply because such descriptions lack the apparatus needed to reject the charge of being overtly speculative. Ringgaard (1983:344) also calls attention to the fact that Liberman’s notion of stød lacks phonetic content and is thus “an empty entity which he manipulates and moves about at will”. I also have difficulties decoding what Liberman means by “purely acoustic entitites” as opposed to “functional units”. In a phonological context the latter either denotes a phoneme or a prosodeme, which means that “purely acoustic entities” can only refer to redundant phonetic details. Now given that nondistinctive traits are virtually never considered in a historical context, Liberman does nothing but state the obvious when he emphasizes that reconstructions ought to be confined to “functional units”. However, he probably interprets the two concepts differently than we do. This follows inter alia from the fact that he reckons µ-counting to be the primary function of stød. First of all, it is worth reminding ourselves that languages are not especially good at arithmetics, despite the traditional terminology. It would be rather uncommon for a natural language to be involved in processes that require us to count to more than two 109 . Thus the idea of counting syllables or moras is somewhat misleading and it seems more adequate to say that stød or Accent 2 can signal bimoraicity or polysyllabicity, but they are certainly not engaged in counting. But even if we granted the supposition that stød signalled the second µ of a stressed σ (or Accent 2 polysyllabicity), it would still be more of a property than a function. The very fact that we can speak of an accentual opposition in Scandinavian implies that stød and Accent 2 can no longer signal the number of moras or syllables in a reliable way. 
If this was their original function, they lost it the moment the contrast was created.

109 While (in accordance with the binary nature of synapses) many generalizations involve the number two, natural processes never involve larger numbers. Thus we can say things like: 1) make sure that the second constituent of the sentence is a verb, 2) stress the second σ from the right if it is heavy, stress the σ preceding it if it is light, 3) a content word should be made up of at least two moras etc. On the other hand, the 3rd constituent of a sentence or the 4th σ of a word is never bound by natural constraints.

Apart from the arguments presented so far there are a few more reasons to dismiss the idea that the tonal accents developed out of a glottal opposition. Ringgaard (1983:344) refers to the results of areal linguistics and argues that Liberman’s claims are neither coherent with the chronology of apocope nor with the geographical distribution of apocopating dialects. The general direction of the change is probably also indicated by the fact that the area of stød in Denmark is increasing at the expense of the tonal accents. My last point concerns the respective domains of the oppositions. While the tonal accents are distinctive at the level of the word, the occurrence of stød is confined to stressed syllables. If the phonologization of stød and its subsequent transformation into a tonal contrast were indeed initiated by certain changes of the Syncope Period and later apocope, this means that the transition took place in a period when almost everything including words, feet, secondary stresses and unstressed syllables underwent reduction. How likely is it then that at the same time the domain of the opposition was extended from syllables to words?

3.2.2.2. Parallel developments Given that stød cannot be the precursor of tonal accents, we can either reverse Liberman’s scenario (i.e. claim that stød is a late innovation) or assume that the two phenomena are parallel but separate developments that share a common phonetic source. Let us now review this latter option by considering two recent proposals. Panieri (2010) places the origin of stød to the Viking Age (i.e. 800-1000 AD) by establishing a connection between stød and terminal devoicing . Given that both phenomena are associated with reduced sonority, the author argues that devoicing was accompanied by a glottal gesture, which thus served to mark the boundary between a voiced nucleus and a voiceless coda (ibid: 8). This allophonic feature was then phonologized when stød was preserved (but devoicing was not) upon epenthesis and cliticization and thus the one- to-one correspondence between devoicing and stød was broken. Panieri assumes that devoicing affected exclusively such voiced consonants that followed the second µ of the rhyme. Accordingly, words like ny (new), hus (house), folk 110 (id) etc must have developed stød due to analogy at a later stage when the occurrence of stød had been divorced from devoicing and only required a bimoraic rhyme (i.e. stød-basis). It is also mentioned that in the dialect of Aarhus stød is accompanied by an HL-contour, while stødless disyllables exhibit a delayed tonal peak in the second σ, cf. (ibid: 16). Furthermore, the author argues that both stød and Accent 1 go back to the same HL-contour and that neither should be derived from the other. The idea is that devoicing and peak-delay entered the scene at roughly the same period, yet they basically affected different geographical areas with some degree of overlapping as testified by the dialects of Aarhus and the island of Fyn where the tonal accents coexist with stød, cf. Ejskj ӕr (1990:72). 
Although the eventuality that the two phenomena in question arose as the result of parallel development is certainly an exciting thought experiment, on a closer inspection it turns out that Panieri’s theory rests on a rather weak factual basis. From a cross-linguistic perspective, word-final devoicing is an extremely common process (attested in Dutch, German, Polish, Russian, Turkish etc) yet I am not aware of a single language where a decrease in sonority manifests itself as a glottal gesture. The loss of a voice distinction (and the subsequent loss of the postvocalic C) can at best lead to a tonal opposition (given that the F0 of the V terminates on a somewhat higher pitch if it is followed by a voiceless C) but as far as I understand, it is not expected to give rise to stød-like phenomena. Furthermore, terminal

110 Recall that words of this shape still lack stød in the dialects of Jylland and Fyn, cf. Perridon (2006:42).

devoicing usually only affects obstruents and is not dependent on the moraic structure of the word as Panieri (2010:6) assumes for Runic Danish. Despite the fact that neither Haugen (1982) nor Skautrup (1944) mentions terminal devoicing (but cf. fn. 111), it is reasonable to assume that the scope of the voice distinction was somewhat restricted. This can be inferred from the fact that the Younger Futhark did not distinguish between voiced and voiceless obstruents, cf. Skautrup (1944:125). In addition, the final sonorants of those words that were later targeted by epenthesis must have been devoiced to a certain extent in order to minimize their violation of the sonority principle. But even with these considerations in mind it is hard to escape the conclusion that Panieri’s conception of devoicing is supported by nothing but theory-internal arguments. If we first acknowledge that stød presupposes a bimoraic base and then equate it with devoicing, then we can see why the author only assumes devoicing outside this bimoraic base. Nevertheless, the counter-intuitive nature of this conclusion suggests that one of the premises (presumably the latter one) must be at fault. Another objection to be raised concerns Panieri’s excessive reliance on analogy. It seems quite unlikely that CV:C#-words with a voiced coda outnumbered the combination of CV:#-words and CV:C#-words with a voiceless coda to such an extent that the two latter stødless groups were remodelled on the basis of the first one. It is also exaggerated to claim that words are automatically assigned stød “the moment” they acquire stød-basis (ibid: 10), even if the spread of stød is a clear tendency, cf. 2.3.3.2, since this would have resulted in the generalization of stød and the suspension of the opposition.
It follows from Panieri’s argumentation that he probably considers the assumed function of stød (i.e. to mark the border between a voiced nucleus and a voiceless coda, cf. (ibid: 19)) as an important factor when it comes to analogical developments. However, his border-hypothesis based on moras is certainly weakened by the fact that he claims words like líf (live) to be trimoraic (ibid: 6) and that he assumes original stød for words like finger (id), in which the stød-bearing µ belongs to the coda. It seems obvious that Panieri basically relies on Liberman’s µ-counting claims, which is one of the reasons why I am reluctant to accept the function he assigns to stød.

Rischel (2008:194) also argues for “parallel though hardly synchronous” developments in an admittedly speculative 112 proposal, which seeks to give a unified theory of i-umlaut, syncope and stød. According to the general idea “stød is the Danish reflex of [an] HL contour concentrated on one (stressed) syllable” (ibid: 215) such that “if a monosyllable had a sharply falling pitch distributed over a long sonorous stretch its pitch drop developed into a phonation type with irregular vocal fold vibrations towards the end, namely the stød” (ibid: 224). The pitch drop itself is attributed to the rhythmic patterns generated by Rischel’s umlaut and syncope model. Changes to the µ structure and the metric template of Common Nordic separated Danish from Sw&No, which is the reason why the latter two languages did not develop stød. The proposal’s speculative nature is not the only reason why I fail to embrace Rischel’s approach. His argumentation is difficult to follow and he seems to contradict himself on a number of points. A few examples will suffice. He both speaks of parallel developments and assumes that “the tonal contrast developed into a contrast between tone and stød” (ibid: 207). It does not really make sense to postulate “a sharply falling pitch distributed over a long sonorous stretch” (ibid: 224), since the longer the sonorous stretch, the less precipitous the fall. It is not clear either why he suggests that monomoraic words concentrated

111 According to Sandøy (2005:1854) word-final stops were devoiced in Proto-Nordic but this rule disappeared during the Syncope Period.

112 The author does not indicate the sources on which his historical assertions are based so it is sometimes difficult to distinguish between his own speculation and widespread conjectures borrowed from others.

the HL pattern on one µ in Sw&No but truncated the low tone in Danish (ibid: 206), when at the same time he wants to connect stød to a steep fall in pitch. Recall (3.1.4) that the existence of monomoraic content words is far from being universally acknowledged. Furthermore, the assumed transitions are not supported with phonetic evidence. In short, the theory is obscure, speculative, incomplete and lacks internal coherence, which prevents us from adopting the view that the birth of stød and tonogenesis were parallel developments.

3.2.2.3. Tones first In light of the above we are left with the single option of regarding stød to be an innovation with tonal precursors. Our reconstruction depends of course to a large part on what input we choose, i.e. whether we are inclined to trace back the origins of the tonal opposition to Old Scandinavian or to Proto-Nordic, cf. 3.2.1. In spite of the fact that we have two approaches to start with, I am not aware of a single theory based on the former alternative. As a matter of fact, it seems that the level stress hypothesis has to face some inherent problems 113 when it comes to accounting for the origins of stød. If 1) tonogenesis is due to the quantity shift of the Middle Ages and 2) stød evolved from a tonal system, then how is all this compatible with the fact that the quantity shift was initiated in Denmark but was never fully implemented in Danish dialects? Besides, this approach would leave quite a small time span 114 for the development of stød, as we would have to find an appropriate (segmental) change in the late 14 th or 15 th century that could account for the transition tonal > glottal opposition. Riad (2000) expands his earlier theory (1998b) to account for the origins of stød and thus makes a case for the Proto-Nordic hypothesis . His approach is based on the idea that contemporary Danish stød can be analyzed as an allophonic realization of an HL contour. He claims that “the tonal configuration of the Danish prosodic word is HL” and stød occurs if and only if the two tones are realized in a single (stressed) σ. In disyllables, where this is not the case, stød is ruled out on structural grounds. Under this approach we can equate stød with a boundary tone that resides in (or is sufficiently near) a stressed σ. 
Interestingly, the Swedish dialect 115 of Eskilstuna (located about 60 miles to the west of Stockholm) displays both a stød-like feature known as Eskilstuna-curl and a “tendency for the boundary tone to be realized on the last stressed syllable” even if it is non-final in the word (ibid: 275). Riad makes the logical conclusion that the Eskilstuna-curl must represent an intermediate stage of the transition 116 from the prosodically conservative Central Swedish variety to the stød of Standard Danish. The loss of the lexical tone distinction, which is supposed to be the terminal stage of the transition, is argued to be the direct consequence of a reanalysis that equates each tonal peak with stress. In this way a new secondary stress emerges, which is then neutralized under clash but without the preservation of the tonal contour (HLHL > HL). Riad supports his claims about the loss of distinctive tone with a reference to the Eastern Mälardalen dialect area, which features generalized Accent 2. Although I believe that Riad (2000) is essentially on the right track as far as the phonetic transition is concerned, I have to reject some of his other claims. My main objection concerns his conviction that the glottal contrast is basically allophonic in nature and is thus almost completely predictable. As we saw in 2.3.3, this is not the case. The author completely ignores the lexical component and makes oversimplified and sometimes even false statements

113 This suggests that the dispute between the two approaches (cf. 3.2.1) should be settled in favour of the Proto-Nordic hypothesis.

114 Stød probably already existed in Danish by 1510, cf. Riad (2000:293).

115 Riad refers to this region as Western Mälardalen.

116 The Eskilstuna-system and the standard Danish one are only typologically related and must be separate innovation areas, cf. Riad (2000:263).

about the distribution of stød, such as that “[it] cannot be realized on a sonorant consonant if that consonant is itself final in the word” (ibid: 265) or that “a monosyllabic first element [of a compound] regularly loses its stød” (my italics, ibid: 269), cf. 2.3.3. Treating stød as an allophonic feature leads Riad to the doubtful assumption that the dialect of Eastern Mälardalen and Standard Danish are typologically identical (since they both have non-distinctive tone and allophonic stød). It is self-evident that stød used to be allophonic up until that period of time when the first stødless words with stød-basis appeared, i.e. when stød was phonologized. Obviously, the same goes for the tonal opposition. Recall that the chronology of tonogenesis is usually established based on the distributional peculiarities of the accents, especially the fact that there are epenthetic or definite disyllables with Accent 1. This means that it is both counter-factual and counter-productive to consider the stød pattern of the modern language as allophonic, since we thereby would have to totally ignore its distribution, which can reveal much about its origins and help us to date its development (as in the case of tone). As far as the allophonic period is concerned, it seems to be a completely sound approach to assume (as Riad (ibid: 266) does) that stød emerges when the boundary tone (L) is realized in the same σ as the prominence tone (H). This definition involves two well-known predictions.
First, the traditional notion of stød-basis requires two sonorant moras, with which the tones can be associated. In the absence of a second sonorant µ, the boundary tone is not realized 117, which (as Riad (ibid: 277) points out) does not mean that a tone is missing. As a consequence, all words have the same tonal configuration (HL) regardless of whether they have stød-basis or not. Second, under this definition, stød cannot be realized on a disyllabic word. All we now have to do is examine those cases that contradict the predictions of this allophonic definition. As it turns out, we can find both words with stød-basis that lack stød (54a) and disyllables that have stød, although they should not (54c).

(54) Some Danish words with stød-basis
a. monosyllables without stød: glad (id), gud (god), lov (law), søn (son), (bog)stav (letter), sted (place), tal (number), ven (friend)
b. disyllables (both inflected and uninflected) without stød: bærer (carries), døde (died), kone (woman), maler (painter), tanke (thought), huse (houses)
c. inflected disyllables with stød: taler (speaks), finder (finds), huset (the house), pilen (the arrow), guden (the god), sønnen (the son)
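
The two predictions of the allophonic definition can be checked mechanically against a handful of the words in (54). The following is only a minimal sketch: the class `Word`, its field names and the helper functions are hypothetical and introduced purely for illustration, and every word is taken to have stød-basis simply because (54) lists it under that heading.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    form: str
    syllables: int
    stoed_basis: bool  # assumed True for every word quoted from (54)
    has_stoed: bool    # attested pattern as reported in (54)


def predicted_stoed(word: Word) -> bool:
    """Allophonic definition: stød requires stød-basis and a monosyllable."""
    return word.stoed_basis and word.syllables == 1


# A small sample from (54); the selection is arbitrary.
WORDS = [
    Word("glad", 1, True, False),   # (54a)
    Word("ven", 1, True, False),    # (54a)
    Word("kone", 2, True, False),   # (54b)
    Word("maler", 2, True, False),  # (54b)
    Word("taler", 2, True, True),   # (54c)
    Word("huset", 2, True, True),   # (54c)
]


def mismatches(words):
    """Return the forms whose attested stød differs from the prediction."""
    return [w.form for w in words if predicted_stoed(w) != w.has_stoed]


if __name__ == "__main__":
    print(mismatches(WORDS))
```

Under these assumptions the sample from (54b) agrees with the prediction, while the samples from (54a) and (54c) are reported as mismatches, which is the pattern the discussion sets out to explain.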

The most straightforward explanation for the deviant behaviour of (54a) is that these words lacked stød-basis at the time when stød appeared as an allophonic feature. This is in line with my earlier assumption that a word-final short C cannot be underlyingly specified as moraic. While (54b) conforms to our predictions, the inflected disyllables in (54c) exhibit some problematic cases. First of all, verbs ending in –er do not behave in a uniform manner, as shown by taɁler as opposed to bærer. Secondly, the suffixed definite article is transparent in huɁset given that the word maintains its stød (cf. huɁs). However, it is not transparent in sønɁnen, since it helps to introduce stød in a word that lacked it on its own (cf. søn). The problem posed by present tense verbs like taɁler, finɁder etc. can be solved if we realize that such verbs must have had a different moraic structure at the time when stød appeared. Indeed, modern Danish /tæ:Ɂlʌ/ goes back to Common Nordic /talr/, i.e. a form for

which Riad’s definition predicts stød. We may thus tentatively assume that stød predates both epenthesis and OSL. In a similar vein, we can account for the retention of stød in hus+et and pil+en by assuming that the preconditions for the appearance of stød reflect a period before the postposed article was cliticized. This line of reasoning, however, is not applicable to words like gud+en and søn+en, whose syllabic structure provides no explanation for the attested stød pattern, given that their indefinite forms should lack stød by virtue of being monomoraic, and their suffixed forms should lack stød by virtue of being disyllabic. This means that the stød pattern of definite forms remains problematic irrespective of the chronology we postulate. So instead of binding the appearance of stød to the number of syllables at a given language stage, we should consider the tonal precursors of the glottal gesture. Recall that we claim stød to have emerged from an existing tonal opposition, in which the tonal contour was distinctive in disyllabic words. As far as the distributional similarities are concerned, we can acknowledge that (if we ignore stød in syllables bearing secondary stress) words with stød constitute a subset of words with Accent 1. Disyllables with Accent 1 generally have stød in Danish, which means that the realization of stød is not connected to an HL contour within a single σ.

117 Riad (2000:267) asserts that the TBU for word accents (including stød) is the PrW and not the µ. His main reason for doing so is the realization that “two tones on a bimoraic syllable should be an entirely natural thing”, which contradicts his earlier assumption that stød is a marked configuration (i.e. he considers stød to be a manifestation of tonal crowding). Note that Accent 1 (which Riad claims to be unmarked) has the same prosodic structure as the one he assumes for stød (i.e. two tones, two moras on one σ).
Riad refers to the number of syllables presumably in order to make sure that disyllables are not assigned stød. This is, however, quite unnecessary since (if stød indeed emerged from a tonal opposition then) most Danish disyllables still had Accent 2, i.e. HL-HL in this early stage. By assuming HL for Danish disyllables without stød Riad confuses the synchronic and the diachronic aspects of the matter. Accordingly, we should claim that stød is the direct reflex of an earlier HL contour irrespective of the number of syllables involved. As to why an HL contour distributed over two moras should be realized as a glottal gesture, we can adopt Riad’s proposal according to which stød is due to an early alignment of the boundary tone, which amounts to a steep fall in pitch. It lends itself to attribute an unusually steep fall to various kinds of deletions. Kiparsky (1995b:5) in an article on Livonian stød suggests that apocope in a sequence of segments like sugu with a tonal configuration HL “causes the onset of the second syllable to become a coda, which... gets Low tone by default [and] the resulting HL combination is stød”. Riad (2000:293) calls this “an example of tonal stability”. However, this can hardly be applied to Danish, since many words with stød, just like hus (house), were monosyllables already before the phonologization of Scandinavian tone. Also the Danish merger of unstressed /a/, /i/ and /u/ to schwa and the ensuing apocope, which was most prominent in the dialects of Jylland, affected a large number of disyllables that lack stød in modern Danish. So, contrary to what might be inferred from Kiparsky’s analysis of Livonian, we find that apocope is associated with non-stød, while stød is found in those words that have been unaffected by apocope. Apocope is of course an extreme form of weakening / shortening. 
Given that V weakening can also lead to a more precipitous HL contour, we can presume that fully-fledged apocope is not an absolute prerequisite for the emergence of stød. If we want to maintain that stød is due to the V weakening of the 12th–14th centuries, we have to account for the fact that stød appears in the “unaffected” words. Furthermore, the loss of word-final weak vowels is an extremely common process, while the phenomenon of stød is not. To take an example, very similar reductions took place in Middle English as well; however, no corresponding tonal gesture evolved. I would like to propose that the key to understanding this puzzle is the insight that the tonal opposition was already a psychological reality in the Scandinavian languages at the time when the (near) loss of unstressed vowels (almost) altered the moraic structure of disyllables. Segmental phonology teaches us that if a change affects one member of an opposition, then all members will undergo the change in a similar manner. It suffices to think of the

palatalization of velar consonants (/k/ > /c/, /g/ > /j/) or the shifts affecting natural classes like /p/ > /f/, /t/ > /θ/ and /k/ > /x/. There are many other examples where members of an opposition move in tandem. If the tonal opposition of the 11th or 12th century was indeed psychologically real, then we have every reason to suppose that the shortening (which advanced vowel weakening can be equated to) affected words both with Accent 1 and with Accent 2. It is clear that words with Accent 2 can shorten through apocope, yet how can we postulate the shortening of words with Accent 1? We can expect some sonorant content to be lost. This is of course not viable if a word consists of a light syllable. What is needed is two sonorant moras, i.e. the structural configuration known as stød-basis. A sharp drop in frequency (i.e. stød) within the second µ can accomplish some degree of shortening, which I therefore consider to be a development that parallels word-final V weakening in disyllables. Now it remains to be seen whether modern Danish provides evidence for our conviction that stød is essentially a manifestation of shortening. Instrumental measurements carried out by Fischer-Jørgensen (1989:16) indicate that the duration of long vowels with stød is on average 81% of the duration of stødless long vowels. On the other hand, Grønnum & Basbøll (2001:238f) find only very small differences 118 and a considerable quantitative overlap between V: and V:Ɂ, which leads them to the conclusion that there is “no justification for a claim that stød vowels generally are shorter than long vowels”. When it comes to consonants, Fischer-Jørgensen (1989:17f) reports that a sonorant C after a short V is more than 30% longer if it bears stød than if it does not. Grønnum & Basbøll (2001:239f) do not verify this claim either.
They find that a stød C tends to be slightly longer in word-final position utterance medially, while in utterance-final position a C with stød is “generally considerably” shorter than a corresponding C without stød. It seems reasonable to argue that the “intended” pronunciation is at a much higher risk of becoming obscure due to rhythmic considerations utterance-medially than in utterance-final position. Therefore, I am inclined to interpret the latter as relevant and thus assume that stød is a potential trigger of C shortening. The same seems to apply to vowels, for which both sources report that a stød V is somewhat shorter, yet it is not clear whether this difference is statistically significant or not. Although the evidence is far from being overwhelming, it nevertheless makes it clear that stød can be rightly 119 associated with shortening in certain contexts. In light of the arguments above the following hypothesis can be formulated. The already established tonal opposition underwent considerable changes presumably some time between 120 1100 and 1300. Both Accent 1 and Accent 2 were affected by shortening of the last µ. In the case of Accent 2 this was expressed through weakening or loss of the word-final V. In the case of Accent 1 the change led to the appearance of allophonic stød, which became phonemic upon the loss of the tonal contrast. The melodic content of the accents was probably also restructured due to these segmental weakenings. Recall that the F0-curve of Accent 2 can only be segmented as HL-HL in the period preceding OSL. Advanced V weakening (and in certain cases optional apocope) led to a situation where the remaining segmental content could not support the second peak of Accent 2. Due to the subsequent tonal reduction (in which Accent 2 was thus simplified to a single peak) the tonal distinction disappeared as both accents came down as HL. By virtue of this tonal coalescence the allophonic conditioning of

stød went away, which gave rise to a phonologized glottal opposition. Thus we can hypothesize that V weakening was enough for the disintegration of the system, which means that tonal restructuring could take place even without fully-fledged apocope. If we are right in claiming that the glottal opposition had been established by the latter half of the 13th century, then this may also shed some light on the peculiarities of the Scandinavian quantity shift. Recall that Danish failed to fully implement the transition into a complementary system of the Sw&No type. This may be blamed on the fact that the originally tonal opposition, which was a word-level phenomenon, had been shifted to the level of the σ. A σ-based opposition obviously does not allow for the wholesale neutralization of various σ-types, since this would run counter to our functional principle in (4b).

118 Yet in four instances out of 14 the stød V was significantly shorter than the corresponding long V.
119 It has to be added that the measurement of modern Danish consonants cannot do justice to the historical question whether the appearance of stød was an instance of shortening or not. In order to make a fair comparison we should analyze a stød C in relation to a moraic stødless C, which in modern standard Danish are restricted to the results of ə-ass.
120 V weakening is already attested in Runic inscriptions from about 1100. Manuscripts from 1300 represent unstressed vowels throughout with a single symbol, cf. Skautrup (1944:225f).

3.2.3. The tonal typology

While the distribution of Accents 1 and 2 (in simplex words) is relatively stable throughout the dialectal continuum, modern Scandinavian tonality (as we have mentioned before) is characterized by extensive phonetic variation. Given that the first documented dialectal splits cannot be traced back further in time than to the end of the Viking Age (and even these are segmental in nature), we have no other option but to postulate that this vivid contemporary variation goes back to a single source. This leaves us with the challenge of reconstructing the diachronic development of the tonal system into the attested typology of modern-day dialects. As we saw in 3.2.1, this thought experiment has two possible starting points. Bye (2004) adopts the target-delay hypothesis and argues accordingly that the original opposition was realized as HL vs. LHL. Riad (2003), on the other hand, assumes that the opposition goes back to Proto-Nordic, which points out the double-peaked dialects of Central Sweden as conservative varieties. Under this theory the tonal input is HL vs. HL-HL, which is reinterpreted upon the completion of the quantity shift as H-L vs. H-LH-L. In what follows I will compare and evaluate how the two proposals derive the tonal typology of Scandinavia. The fact that we have to deal with at least two possible inputs is not the only factor that makes the whole enterprise hopelessly speculative. Even the segmentation of the output (i.e. the tonal curves of the modern dialects) is open 121 to debate. Hognestad (2012:95) points out that the F0 of Stockholm and Stavanger (HLHL) can be segmented both as H-LH-L and as HL-H-L and shows that the two segmentations make different assumptions, for instance about a transition to L-H-L: the former has to resort to a leftward shift, while the latter can be interpreted as an example of simplification, which is much easier to motivate than a leftward shift. This leads us to the problem of plausibility.
A proper reconstruction has to make tenable claims about the phonetic transition from one language stage to another. The best way to do this (it seems) is to restrict ourselves to functionally motivated changes (i.e. internal development) and not to make ad hoc guesses about external factors. Yet in view of the fact that certain Scandinavian dialects (have) had vivid contact with speakers of other languages (not to speak of dialectal intermingling), an approach that, for want of sufficient data, must turn a blind eye to the role of language contact can hardly be historically accurate. It is not entirely clear either what criteria should serve as a basis for the typology. On condition that complex tones (like LH) are restricted to medial position (i.e. to what is traditionally referred to as prominence tone, cf. (28)), the shape of Accent 2 is predictable from that of Accent 1 due to OCP. This is, however, not generally assumed, so it is better to base our typology on the tonal curve of Accent 2 (even if the two accents are expected to change in tandem). Although Accent 2 has (in some dialects) a different association of tones in compounds and in simplex words, both categories exhibit the same sequence of tones. We

can take more factors into consideration if we focus on compounds. Yet whatever option we choose, our predictions for compounds should hold for simplex words as well and vice versa.

121 Recall our note on the dialect of Malmö in 2.2.4.9.

Riad (2003)

Given that Bye (2004) is a response to Riad (2003), we will start with an outline of the diachronic development Riad puts forward. The typology is based on the eight dialects in (55) below, which are represented with the tools of autosegmental phonology. Primarily stressed TBUs are marked with an asterisk, boundary tones with a square bracket, spreading with an arrow and interpolation with an interrupted line. Underlined tones are associated with stressed syllables, others are floating.

(55) Accent 2 in compounds (based on Riad (2003:101))
a. Stockholm        H* ← LH L]
b. Dala, Narvik     L* ← H L]
c. East Färnebo     L* - - - H L]
d. East Göta        H* ← LH L]
e. West Göta        H* ← L H]
f. Oslo             H* L - - - H]
g. Malmö, Bergen    L* H - - - L]
h. Stavanger        H* LH - - - L]

(56) Dialectal transitions (based on Riad (2003:103ff))
a. (55a) + leftward shift > (55b) + delinking + lost connectivity > (55g)
b. (55a) + delinking > (55d) + rightward shift > (55e) + lost connectivity > (55f)
c. (55a) + lost connectivity > (55h)
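
The chains in (56) can be read as a small directed graph with (55a) as its root. The sketch below encodes only the steps listed in (56). The names `TRANSITIONS` and `derivation` are hypothetical, and the encoding is an illustration of how the derivations compose, not a claim about Riad's formalism.

```python
from collections import deque

# Each entry maps a source dialect to (changes, target) pairs, as listed in (56).
TRANSITIONS = {
    "55a": [
        (("leftward shift",), "55b"),
        (("delinking",), "55d"),
        (("lost connectivity",), "55h"),
    ],
    "55b": [(("delinking", "lost connectivity"), "55g")],
    "55d": [(("rightward shift",), "55e")],
    "55e": [(("lost connectivity",), "55f")],
}


def derivation(target, source="55a"):
    """Return the ordered changes leading from source to target, or None."""
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        dialect, changes = queue.popleft()
        if dialect == target:
            return changes
        for step, nxt in TRANSITIONS.get(dialect, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, changes + list(step)))
    return None  # the dialect is not derived by any chain in (56)


if __name__ == "__main__":
    for dialect in ("55g", "55f", "55h", "55c"):
        print(dialect, derivation(dialect))
```

Because every chain starts from (55a), the graph is a tree, so each derived dialect has exactly one derivation. A dialect that does not occur in (56) as quoted here, such as (55c), simply returns `None`.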

Given that the Proto-Nordic theory singles out (55a) as the most conservative variety Riad derives the other dialects assuming the changes in (56) above. Connectivity (i.e. the prosodic assignment of Accent 2 in compounds) and an associated connective 122 tone exhibit a close correlation since the latter always entails the former. Note that the reverse statement does not hold for (55d, e), which are connective dialects despite having an unassociated connective tone. It follows from the above that loss of connectivity in (56) above also involves delinking. Let us now review the main points of criticism directed at Riad’s model. Bye (2004:14) presents some dialect geographic evidence, which is meant to disprove the premise that (55a) is an archaic variety. He points out that (55g) has a “peripheral and discontinuous geographical distribution”, which is to be interpreted as a relic feature. Riad (2003:127) apparently anticipated such counterarguments and addressed them in advance claiming that “the spread of dialect types allows for conflicting interpretations” since both (55a) and (55g) have isolated satellites, which can be regarded as archaic, relic areas. It has to be added that although Malmö is indeed peripheral in the present-day tonal continuum, this has not always been so given that Danish is claimed to have tonal precursors. Also the fact that Skåne and formed an integral part of Denmark up until the treaty of Roskilde in 1658 makes it dubious that a region affected by forced “Swedicization” represents the Old Scandinavian language stage. We have no way of telling whether the territory in question has ever had a glottal opposition, although it seems quite probable since a relatively densely populated area, which is near (or is itself) a cultural and political centre, is also expected to be an innovation centre. 
This very argument, however, also seems to challenge Riad’s (1998b:87) assumption that “nothing much has happened prosodically between PN accent 2 and Stockholm Swedish accent 2”. It is rather unlikely for a political

centre like Stockholm to escape innovations for more than a millennium or even to be linguistically conservative. The assumed transitions in (56) also constitute a vulnerable point. Bye (2004:47) argues that peak-delay (or rightward shifts as in (56b)) is a well-documented and natural phenomenon, while leftward shifts are not. Given that Riad only describes the transitions without exploring the underlying motivations, it is difficult to adopt the assumption that a single tonal starting point can involve shifts in all possible directions. Note that it follows from Riad’s theory of privativity, cf. (28), that rightward shifts are universal in all tonal dialects since he claims that in Accent 2 the lexical tone pushes the prominence tone of Accent 1 to the right. Rightward (as opposed to leftward) shifts are also attested historically. It suffices to think of the stress pattern of multiple compounds in tonal dialects, cf. (27), or if we want to keep stress and tone apart, we can invoke the tonal shift in standard Swedish trisyllables, in which (according to Kuryłowicz (1952:466)) the second peak of Accent 2 shifted to the right (i.e. to the last σ) possibly after the year 1500. Riad (2003:108) indicates that in certain dialects the two realizations (ˊxxˋx and ˊxxxˋ) are in , still it is obvious that the latter is more recent.

122 This correlation can be regarded as another piece of evidence in favour of the revised segmentation and terminology of (39).

(57) The typology of Sw&No tonal accents
   Type   Accent 1   Accent 2   Cf. (55)
a. 1A     HL         LHL        g
b. 1B     HL         LHL        b, c
c. 2A     LHL        HLHL       a, h
d. 2B     LH         HLH        e, f

Bye (2004)

Bye treats simplex words separately from compounds and analyzes the former using the framework of Gårding’s (1977) classical typology, according to which tonal dialects can be categorized 123 based on peak timing (early = A vs. late = B) and on the number of peaks (1 vs. 2) in Accent 2. The typology is summarized in (57) above with corresponding data for compounds taken from (55). Bye assumes target delay to be responsible for both tonogenesis and for the diachronic development leading up to the present typology. The transitions are claimed to have occurred as follows: 1A > 1B > 2A > 2B. Bye (2004:19) also considers the possibility of a shift from 2B to 1A, which from a phonetic point of view is equally plausible as the other three shifts. Notably, this would imply that a dialect of type 1A is either extremely conservative or ultra-advanced diachronically. It is conspicuous that a simple tonal representation cannot do justice to the timing difference between 1A and 1B. However, if we also take the testimony of compounds into consideration, it becomes apparent that the loop of 1A > 1B > 2A > 2B > 1A etc. is not restricted to tone shifts. The transition from 1A to 1B involves the association of a previously floating connective tone, while the transition from 2A to 2B is preceded by delinking. Note that from a functional point of view (cf. 1.4.3) it seems very attractive to approach the problem of tonal transitions with the concept of target-delay. The reasons for this are twofold. First, it allows us to treat phonological change as unidirectional, which means that we can avoid the problem of directionality we faced in (56). Second, the fact that the assumed changes are circular (i.e. they form a loop) entitles us to maintain the functional tenet that changes initiated by internal factors are to be looked upon as instances of optimization, which can go on eternally, cf. 1.4.5.2.

123 See Riad (2003:97) for a map displaying the distribution of various accent types.

As far as compounds are concerned, Bye (2004:21) distinguishes between four types: initial (55g), final, connective (55b) and double (55a). The fact that final, which can be exemplified with northern Swedish as described by Bruce (1982), does not have an equivalent in (55) indicates that Bye works with somewhat different concepts than Riad does. While Riad obviously investigates the behaviour of Accent 2 in compounds, the scope of Bye’s study is not so easy to tell, since in addition to considering Accent 2 in (55g, b, a) he also includes final stress, a location in which the accentual opposition is suspended in Accent 1. In light of this methodological error, it is rather difficult to interpret Bye’s proposed transitions, which include initial > final > double and initial > connective. Another problem with Bye’s approach is that he breaks the bond between Accent 2 in compounds and in simplex words. This is apparent given that the changes affecting the latter form a loop, while the transitions of compounds do not. This means that if we adopt Bye’s claims then we are unable to account for the fact that Accent 2 in the two groups always displays the same tones irrespective of which dialect we consider. I also fail to understand why Bye singles out 1A dialects as being conservative. The fact that target-delay is a well-documented 124 process, which is arguably responsible for many ongoing and recent changes in Scandinavia, does not necessarily mean that it played a role in Scandinavian tonogenesis, cf. fn 108. It seems perfectly conceivable to me that the tonal distinction has its roots in the stress patterns of Proto-Nordic, while the ensuing dialectal disintegration is due to the mechanisms of target-delay.

A functional approach

Given that target-delay is a widely attested natural phenomenon in tonal languages, it must have an underlying functional motivation. In most non-tonal languages an isolated simplex word has a falling contour (HL), in which H is a focus tone and L is a boundary tone. Neither tone is complex. The former is associated with the (often initial) σ bearing primary stress. This unmarked case is in agreement with the simple physiological observation that the speaker has more air at her disposal at the beginning of a breath group (or an IP), which is thus louder, higher in pitch and more prominent. Translated to the functional principles in (4), an H initial tone can have a delimitative function and thus helps to minimize perceptual confusion (4b). An L boundary tone results from the minimization of articulatory effort (4a), while the avoidance of complex tones maximizes the economy of description (4c). In a tonal opposition of the Scandinavian type certain words feature an extra tone, which leads to a situation where it is impossible to comply with all of the aforementioned functional criteria at the same time. Recall that our constraints require 1) an H initial and 2) an L final tone, while 3) none of the tones should be complex. Nevertheless, it is self-evident that the first and the last tones cannot have opposite polarity if the middle tone is simplex. This is the reason why peak-delay in tonal languages like Sw&No leads to an eternal loop of optimization. In line with the findings of the discussion so far, we can hypothesize that the evolution of the Scandinavian accent typology proceeded along the optimization loop of (58) below. It seems reasonable to propose that an associated complex tone is immune 125 to shifts, which means that the transition from Stockholm to Oslo has to be preceded by delinking. Normally, the delinking of the connective tone entails the loss of connectivity. As we have seen, the

dialects of East and West Göta (55d, e) do not conform to this expectation, which means that they do not fit into an optimization loop of internal developments. Accordingly, we have to postulate that their “irregular” melody should be attributed to some external influence such as contact with East Norwegian and South Swedish.

124 Hognestad (2012:127) gives an illustrative example of the tonal transitions that can be observed in , Norway.
125 Given that a σ bearing primary stress cannot have a floating tone, we can now see that the notion of peak-delay is not compatible with bitonal initial tones. Accordingly, HL-H-L is not a possible segmentation for present-day Stockholm. A further consequence of this discovery is that peak-delay was not an option in Scandinavia until OSL allowed for the reinterpretation of the original HL-HL contour into H-LH-L.

(58) The evolution of the Scandinavian accent typology
Stockholm (H*LH L) → delinking → Stavanger (H*LH L) → peak delay → Oslo (H*L H)
↓ peak delay
Malmö (L*H L) → association → Dala, Narvik (L*H L) → peak delay → Stockholm (H*LH L)
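
The claim that a three-tone melody cannot satisfy all three functional requirements at once can be verified by brute-force enumeration. The sketch below rests on two assumptions that are mine rather than the text's: only the middle tone may be complex, and the OCP is read as forbidding adjacent tones that meet in identical targets. All function names are hypothetical.

```python
from itertools import product

SIMPLEX = ("H", "L")
COMPLEX = ("HL", "LH")


def obeys_ocp(tones):
    """Assumed OCP reading: the end of one tone differs from the start of the next."""
    return all(a[-1] != b[0] for a, b in zip(tones, tones[1:]))


def violations(tones):
    """List which of the three functional requirements a contour violates."""
    found = []
    if tones[0] != "H":
        found.append("initial tone is not H")
    if tones[-1] != "L":
        found.append("final tone is not L")
    if any(len(t) > 1 for t in tones):
        found.append("contains a complex tone")
    return found


def three_tone_contours():
    """Yield every OCP-compliant contour with simplex edges and a free middle tone."""
    for contour in product(SIMPLEX, SIMPLEX + COMPLEX, SIMPLEX):
        if obeys_ocp(contour):
            yield contour


PERFECT = [c for c in three_tone_contours() if not violations(c)]

if __name__ == "__main__":
    for contour in three_tone_contours():
        print("-".join(contour), violations(contour))
    print("contours without violations:", PERFECT)
```

Under these assumptions no contour is free of violations. The three contours with exactly one violation are H-LH-L, H-L-H and L-H-L, i.e. the Accent 2 melodies that appear in (58), which is consistent with the idea of a loop in which each step trades one violation for another.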

Riad (1998a:78) assumes that “Scandinavian dialects lend themselves to uniform description” since “[t]here is considerable homogeneity across dialects with regard to tonal grammar”. Accordingly, he proposes to use the privative model presented in (28) “in order to keep the typology and its predictions in check” (2003:93). Now given that the phonetic realization of the accents is subject to immense variation and that the “specific functions of [the] tones may vary between dialects” (ibid: 92), this alleged uniformity is, as far as I understand, restricted to privativity. Phonetic variation is clearly due to peak delay, yet where did the functional differences originate? The reinterpretation of the tones of a newly arisen contour in terms of functionality can obviously give rise to new functions. If Accent 2 equals a lexical tone followed by Accent 1 (cf. (28)) in all dialects and if the members of the opposition indeed change in tandem, then it is difficult to see why the tonal contour should be reinterpreted in the first place. Recall, however, that in (39) we challenged the mainstream approach (28) and proposed that Accent 2 in central Swedish is made up of a sequence of prominence, connective and boundary tones, while Accent 1 equals prominence + boundary. A look at (58) makes it clear that this approach is not compatible with those dialects where the middle tone of Accent 2 is simplex, since it makes wrong predictions for Accent 1 (on the assumption that the values for the prominence and boundary tones are identical in the two accents). Nevertheless, if (39) is a correct description of (55a), then we can no longer claim that the tonal grammar of Scandinavia lends itself to a uniform analysis. Bye (2004:10) also points out that “[s]hifts in the alignment and the phonological association of tones bring about reassignments to function ”. However, it seems to me that if Accents 1 and 2 always change simultaneously then reinterpretation cannot take place. 
Accordingly, we can argue that the shifts affecting the opposition must, at least at some point in the loop, affect Accent 2 somewhat earlier than Accent 1 or vice versa. It is reasonable to assume that the former scenario can occur in those dialects where Accent 1 and the final part of Accent 2 are identical even after a rightward shift in the latter. This is possible only when the middle tone becomes complex, i.e. in the transition from Dala to Stockholm, (55b) > (55a). While the opposition HL vs. HLHL is open to multiple interpretations or segmentations, the other contours are not. It has to be added that the data included in (58) above is restricted to a handful of well-known dialects and thus does not include each and every intermediate stage of the transitions. This can be illustrated with Dala/Narvik > Stockholm, in which the newly arisen H tone cannot have ousted its low predecessor without the intermediate stage: HL*HL. As a matter of fact, it seems that new tones can appear only at the left edge of the word and that peak-delay requires complex tones. If this is correct, then the transition Oslo > Malmö is rather problematic, since it must have involved H*LH > LH*LH > L*HLH. Given that the output violates more functional constraints than the input, this stage seems to undermine the validity of (58).


Recall, however, that we are trying to reconstruct the history of the opposition and not just the changes that have affected Accent 2. If the functional requirements outlined above can lead to peak-delay in words with Accent 2, then the same must hold for Accent 1. Note that Oslo is the only point of the loop where Accent 1 (L*H) does not involve a falling contour and thus defies our expectations that the initial tone should have a H peak and that the boundary tone should be L. Given the claim that the members of a phonological opposition move in tandem (though not always simultaneously), we can account for the unexpected change Oslo > Malmö by assuming that the transition involved two distinct stages (H*LH > LH*LH > L*HL), the first of which was triggered by Accent 1, the second by Accent 2.

We can conclude this section by addressing a possible shortcoming of the Proto-Nordic theory that we have already mentioned: how likely is it that a cultural and political centre like Stockholm represents a conservative language stage? It is of course not likely at all. Given the loop of (58) it seems that Stockholm may be one of the prosodically most advanced dialects of Scandinavia. Yet in order to maintain this claim we have to ascertain how rapidly the changes outlined in (58) can proceed. Is half a millennium (roughly 126 from 1300 to 1800) a time span long enough for these five changes to take place? In order to answer this question we need a documented tonal shift, on the basis of which we can do the arithmetic. Hognestad (2012:48) compares two studies (the first from 1927 and the second from 1970) on the tonal contours of Stavanger and finds that Accent 2 is represented in the same way in both, but Accent 1 seems to have undergone peak delay given that the H tone was associated with the first µ of the stressed σ in 1927 but with the second µ in 1970: H-L > (L)H-L. So it seems that a single step (or rather half 127 a step, since it only affected Accent 1) can be accomplished in less than half a century, which implies that the time span at our disposal was theoretically sufficient for Stockholm to complete the loop. Although a cultural and political centre like Stockholm is expected to exhibit certain innovations in the course of a millennium, note that the Swedish capital only emerged as a significant political power in the early 17th century. We have good reasons to believe that soon afterwards the city of Stockholm lost the accent distinction, which must have been reintroduced by the 18th century. Kock (1878:108) proposes that there is a connection between the second peak of Accent 2 (which he calls levis) and the preservation of unstressed word-final vowels.
If this observation is correct and if the distinction was in fact lost in Stockholm for a time, then it is possible that the dialect of Stockholm has escaped the loop of (58) altogether.
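
The arithmetic behind this estimate can be made explicit. The sketch below gives only a rough upper bound under assumptions that are ours and not Hognestad's: each of the five changes in (58) is taken to consist of two half-steps (one per accent), the half-steps are assumed not to overlap, and each is assumed to take no longer than the interval between the two Stavanger studies. All names in the code are hypothetical.

```python
# Rough upper bound on the time needed to complete the loop in (58).
# Assumptions (illustrative, not taken from the sources cited):
#   - each change consists of two half-steps, one per accent;
#   - half-steps never overlap (worst case);
#   - a half-step takes at most as long as the Stavanger shift (1927-1970).

STAVANGER_FIRST_STUDY = 1927
STAVANGER_SECOND_STUDY = 1970
CHANGES_IN_LOOP = 5
HALF_STEPS_PER_CHANGE = 2
WINDOW_START, WINDOW_END = 1300, 1800


def worst_case_years(changes, half_steps_per_change, years_per_half_step):
    """Total duration if every half-step runs strictly after the previous one."""
    return changes * half_steps_per_change * years_per_half_step


years_per_half_step = STAVANGER_SECOND_STUDY - STAVANGER_FIRST_STUDY
required = worst_case_years(CHANGES_IN_LOOP, HALF_STEPS_PER_CHANGE, years_per_half_step)
available = WINDOW_END - WINDOW_START

print(f"required: {required} years, available: {available} years")
print("loop fits in the window:", required <= available)
```

Under these assumptions the loop requires 430 years, which fits within the roughly 500 years available; any overlap between the half-steps of the two accents would shorten the estimate further.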

3.2.4. The role of analogy and some further changes

Whatever model we adopt to account for the accentual opposition, we will find that the predictions made by the theory do not always correspond to the present-day distribution of the accents. We have seen that Panieri (2010), Kuryłowicz (1952) and Oftedal’s (1952) hypotheses A and B all invoke analogical pressure to explain the behaviour of deviant words. Yet, like markedness, the concept of analogy can be employed with a wide range of connotations, which means that we should examine what is actually meant by the term, all the more so since this can facilitate the evaluation of the various proposals mentioned above.

126 The dates reflect the assumption that, as argued in fn. 125, peak-delay cannot have occurred earlier than the 14th century (i.e. before OSL) and the fact that we have reliable descriptions of a 2-peaked contour from the 19th century.
127 Thus we have found further support for our recently revised position that although the members of the tonal opposition arguably move in tandem, the changes are not necessarily simultaneous.

On how to define analogy

We have already made some sporadic statements in connection with analogy. We have assumed that it should not be classified as a sound change (1.4.3), that it is unlikely to dissolve a phonological opposition (1.4.5.4) and that it is slow, sporadic and unpredictable¹²⁸ (2.2.3). It follows that in my interpretation analogy involves abrupt changes in the lexical specification of individual words. This is, however, not the only possible approach to the problem. Anttila (1977:16ff) points out that analogy is essentially “a relation of similarity”, which can be classified inter alia according to the number of terms involved. When we have a relation expressed as A:B, then the two terms must share an inherent characteristic (material analogy). When we have three or four terms (A:B=B:C or A:B=C:D), then similarity is expressed as a proportion¹²⁹ (formal analogy) and is not (necessarily) located in the corresponding terms involved. It follows that a change can be described as analogical if a bond between two or more entities (based on some already existing similarity) is further enhanced at a certain level. Such an enhancement is expected to take place in an arbitrary domain such that the original relation of similarity remains unaffected by the change. In linguistics this arbitrary domain is the level of sounds. Accordingly, similarity at word level is either semantic or morphological, which is then reinforced with phonetic means. It goes without saying that there is no point enhancing phonetic similarity between individual words at the level of sounds. Just to take a few examples, Modern English father is the result of analogy based on semantic similarity, cf. OE broþor (brother) and fæder (father), while Sw simmar – simmade (swim – swam) results from morphological analogy based on the fact that the preterite dental suffix is productive and unmarked in relation to ablaut.
This latter type of analogy can lead to morphologically productive patterns as in English, where the productive plural suffix -s was originally restricted to strong masculine nouns in OE. Anttila (1977:19ff) argues that linguistic change is linked to language learning, which is a process of imitation and can be described in terms of abduction, deduction and induction. Thus Anttila uses the assumption that language learning is a fully analogical process to support his position that “all change is analogical”, including the regularity of sound change. As Anttila (1989:88) puts it, “the regularity of sound change is also analogical: when a sound x changes under conditions y in a word A, it also changes in word B under the same conditions”. Although it seems to be a logical conclusion, it is obviously quite problematic if we take analogy to be a cover term for all possible changes within linguistics. The first objection that comes to mind is that the term analogical change is reduced to a meaningless tautology once we adopt the suggestion that analogy and change are the same thing. A term that can be used to describe everything is not a useful scientific tool. My second problem concerns Anttila’s assertions that “[a]ll learning is always change” and that “all change is analogical” (ibid, my italics). This is certainly not the case, especially if we want to maintain that analogy is a relation of similarity. Although imitation usually leads to an output that is similar to the original (i.e. output 1), it would be a formal fallacy to equate similarity and analogy. Similarity (as we have seen) is the basis of analogy (and not analogy itself) given that its reinforcement manifests itself as an analogical change. A process that terminates in some sort of similarity is thus not analogical unless the terms involved display an initial bond in the form of an already existing similarity.
Lastly, it is misleading to talk about “the regularity of sound change” as such and thus subsume all types of phonological change under one label. In Chapter 1 we distinguished between teleological (1.4.3) and non-teleological (1.5.3) processes and found that the former involve gradual transitions, which optimize the system, while the latter involve abrupt changes, which arise from misunderstandings. Recall that Ohala’s hypo-correction (13) starts out as a covert abduction and can later surface through overt deductions, cf. Anttila (1977:19). This means that (13) satisfies the criteria of an analogical change given that it is abrupt and involves an altered lexical specification. A gradual teleological change does not modify lexical properties (or does so only over a longer period of time), but alters the phonological specifications of the language. We would lose the distinction between these two categories if we were to adopt Anttila’s claim about all (sound) change being analogical.

128 While the direction of the change is easy to foresee (provided that we manage to establish the relevant markedness relations), its chronology and the exact scope of the words that will be affected are not.
129 Interestingly, proportion is the Latin translation of Greek analogy.

Analogy and the level stress hypothesis

With these considerations in mind we can now turn to the discrepancy between the present-day distribution of the accents, the testimony of our earliest sources and the predictions of the relevant hypotheses. We can start with Kuryłowicz (1952), who as we saw in 3.2.1.2 suggests that Accent 2 originates in disyllabic words with a light initial σ, which is tantamount to equating Accent 2 with level stress. The distinction between heavy and light initial syllables is neutralized upon OSL and is transformed into an accentual opposition. Accordingly, disyllables with an originally heavy initial σ should come down to us with Accent 1, yet this is rarely the case. Kuryłowicz (1952:467-8) assumes that such words were remodelled with Accent 2 on the basis of polysyllables affected by OSL. He provides a number of examples (both inflected and uninflected) illustrating the analogical transition, a few of which are presented in (59) below. Note that (59) only includes uninflected simplex words.

(59) Analogical spread of Accent 2 proposed by Kuryłowicz (1952:467)
    a. 1oxe > 2oxe (ox) based on 2hane (cock)
    b. 1flicka > 2flicka (girl) based on 2gata (street)
    c. 1moder > 2moder (mother), 1broder > 2broder (brother), 1dotter > 2dotter (daughter) based on 2fader (father)
    d. 1afton > 2afton (evening), 1morgon > 2morgon (morning) based on 2nyckel (key)¹³⁰

Let us see whether these assumed changes are in agreement with our previous claims. Recall that we view analogy as being a reinforcement of some already existing (either semantic or morphological) similarity at the phonetic level. While there is an obvious semantic bond between the words in (59c), the other examples are somewhat problematic. One may claim that ox and cock (59a) both refer to animals and are thus semantically similar, yet this similarity is clearly much weaker than the one in (59c)¹³¹. Even if Anttila (1977:18) is right in believing that “there is no point in defining degrees of similarity”, the concept of similarity itself requires a definition, since otherwise references to analogy will always remain vague excuses, the analyst’s last resort. However, it is probably impossible to arrive at a satisfactory definition given the observer’s “logical freedom from criticism in perceptual judgement” (ibid). In other words, similarity is a subjective notion, which cannot be formalized. Consequently, we should refrain from establishing a non-existing semantic bond between the respective words in (59a, b). This leaves us with the option of morphological similarity, which (if applicable) implies that the word-final vowels of (59a, b) should be analyzed as suffixes. Recall that we argued against such a synchronic interpretation in 2.2.4.5. It is well-known that the vowels in question are the continuation of IE stem-building suffixes; however, it is rather unlikely that they were felt as suffixes back in the 14th and 15th centuries. We must find a way of accounting for the analogical transitions in (59a, b) even if we conclude (which we do) that they are not actual historical developments. Recall that we assumed Accent 2 to be the result of analogical pressure in words like Sw. 2Meck-a, 2Kin-a, 2Chil-e, 2Pek-ing, cf. 2.2.4.8. Given that we are reluctant to analyze such endings as suffixes, it is necessary to revise an earlier statement made in this subsection, namely that there is no point enhancing phonetic similarity between individual words at the level of sounds. A more accurate rendering would sound like this: it is pointless to enhance similarity at the very level where we observed it. Now given that endings belong to segmental phonology and tones to prosody, it is not contradictory after all to suggest that phonological analogy can lead to tonal changes (even if its actual mechanism raises a number of unanswered questions). A look at (59d) makes it clear that the terms involved display neither semantic, nor morphological, nor phonological similarity. A common syntactic label (all being nouns) is clearly not a sufficient base for an analogical change. Note that if (59d) were a historical reality then we would have a hard time explaining why analogy has not affected those bisyllabic words that still have Accent 1 today and thus serve as a basis for the opposition. The fact that in most dialects analogy has not (yet) generalized Accent 2 indicates that we have to reject the theory put forward by Kuryłowicz (1952) along with the level stress approach itself.

130 This word also goes back to a form with a light initial σ (lykel) and is cognate with English lock.
131 Despite the obvious similarity observed, (59c) is unlikely to reflect an actual historical development for the simple reason that analogy is expected to extend the scope of unmarked forms (recall that markedness is a frequency-based concept, cf. 1.3.3). This entails that the change has to act in the reverse direction, i.e. fader should switch to Accent 1.

Generalized Accent 2

The fact that analogical pressure brings about the elimination of lexical marks suggests that unmarked features can generalize in the long run. This leads to the suspension of contrast as the distribution of (in our case) tone and stød becomes predictable from structural features such as stød-basis or the number of syllables within a word. Indeed, this is what we find in the Eastern Mälardalen dialect area where “the tonal contour of accent 2 is generalized in all polysyllables that have a post-tonic syllable” (Riad (2000:288)). However, this development can hardly be attributed to a lengthy analogical process. Recall that I use the term analogy for processes that proceed item-by-item (based on some sort of similarity) and not for rule-based generalizations that apply across-the-board. The generalization of Accent 2 is evidently a simplification, which removes the lexical burden of the accents altogether. Thus the language disposes of both lexical marking and various generalizations that Elert (1972) calls Accent 2 Exception Rule, cf. (40a) in 2.2.4.10. It is well-known that foreign learners of Swedish who are advanced enough to pay attention to the melody of the language often acquire a taste for Accent 2 and tend to use it in virtually every focused word that has a post-tonic σ. This suggests that generalized Accent 2 is likely to arise as a result of language contact, since it lends itself to a description in terms of Labov’s diffusion (cf. 1.1), given that it entails that the minute conditioning of the distributional rule tends to fade away. Riad (2000:291), on the other hand, argues that “the language contact hypothesis does not seem viable” since all known cases where the accent distinction was lost due to foreign influence involve the neutralization of the opposition in Accent 1, which he assumes to be unmarked. Let me briefly outline an argument in favour of the language contact hypothesis.
Given the geographical proximity, we can assume that generalized Accent 2 goes back to a privative opposition of the Stockholm type, cf. (39). Once the distribution of the two accents becomes entirely predictable, it is no longer necessary for our segmentation of the tonal contour to express privativity. As a consequence, we can get rid of the connective tone and reanalyze Accent 2 as being made up of a prominence tone (HLH) followed by a boundary tone (L). Bloch (2003:50) argues for the same segmentation, albeit on somewhat different premises. Riad (2000:291) points out that “simplification is more or less expected” in a tripartite focus tone and the fact that it has not yet occurred indicates that the phenomenon of generalized Accent 2 is a relatively recent development. Bloch (2003:45ff) reports that the second peak of Accent 2 is indeed frequently missing (or rather weakly articulated) in certain positions. The author therefore represents words with a post-tonic σ as HL(HL) as opposed to HL for words that lack a post-tonic σ. It is difficult to escape the conclusion that the gradual simplification of the two-peaked melody of Accent 2 (H-LH-L > HLH-L > HL-L > H-L) will eventually render the melodies of words with and without a post-tonic σ identical. To put it differently, the simplifications affecting generalized Accent 2 are expected to lead to generalized Accent 1, which is what we find in those dialects where the loss of the tonal opposition is commonly attributed to language contact. If this is a natural development, then generalized Accent 1 around Bergen, in Finland etc. is the result of simplification from a previous system of generalized Accent 2. Hence the difference between Eastern Mälardalen and most Swedish dialects of Finland is not which accent is generalized but rather how advanced the process is. Are there any historical facts that could support such a supposition? Nyström (2003:14) proposes that generalized Accent 2 arose as an innovation in Stockholm due to Low German influence. Then the phenomenon spread from the capital to the surrounding area and somewhat later it vanished from Stockholm, i.e. the tonal opposition was reintroduced. The accentual pattern of the capital has exerted considerable influence on the surrounding dialects such that the opposition has by now almost completely ousted the phenomenon of generalized Accent 2.
Riad (2000:291) mentions that a similar “course of events is assumed for the city of Bergen in western Norway… where the Hanseatic league had an office”, with the difference that the area surrounding Bergen has generalized Accent 1. It goes without saying that we cannot assume the same trigger (i.e. Low German influence in the late Middle Ages) for the developments in Bergen and Stockholm, since this should lead to identical patterns. The relevant contact situation in Stockholm must have arisen a couple of centuries later. A look at the demographic changes in Stockholm reveals that the most precipitous rise in the number of inhabitants occurred in the 17th century, when Sweden became the leading power of Scandinavia. The population increased fourfold (from 10000 to 40000) between the 1620s and 1660, cf. Åberg (1978:227). Given the numerous foreign dominions occupied by Sweden, a large part of the people pouring into the capital presumably spoke a language other than Swedish or a Scandinavian dialect which lacked the accent distinction. Accordingly, we can conjecture that the contact situation of the 17th century was the trigger that gave rise to generalized Accent 2 in Eastern Mälardalen.

The directionality of analogical change

It follows from the definition of analogy that the direction of an analogical change depends on the prevailing markedness relations. We have argued throughout the thesis that Accent 2 should be looked upon as the unmarked member of the tonal opposition and that the same goes for stød in the glottal opposition. Given the strong lexical correlation between Accent 1 and stød in simplex words, the birth of stød was evidently a markedness reversal. The early tonal opposition pertained to the domain of disyllabic words, where Accent 1 (HL) was clearly a minority pattern. When the tonal opposition was transformed into a contrast of the absence or presence of stød (i.e. after degemination and the loss of the second peak of Accent 2), stød (HL) could be realized on a single stressed σ, which means that the domain of the opposition was reduced from word-level to the level of stressed syllables, where the absence of stød is rarer than its presence.

In 2.3.3.2 we hinted at the problem that markedness is not necessarily a unitary concept and that a given feature can be assigned different labels in different segments of a language’s vocabulary. This means that the analogical spread of unmarked tonality is not as simple as I indicated in some earlier sections. A comparison between some 19th-century descriptions and the present-day distribution of the accents reveals that the last two centuries have seen transitions both from Accent 1 to Accent 2 and vice versa (although the former seems to be more common). A few notable examples of the shift Accent 2 > Accent 1 are included in (60) below based predominantly on Kock (1878:59ff). The list also includes words whose pronunciation used to vacillate between the two accents.

(60) Accent 2 > Accent 1
    a. Stefan, Valdemar (given names)¹³²
    b. värdera (evaluate), egentlig (real), offentlig (public)
    c. större (bigger), mindre (smaller)

The Accent 2 of (60c) possibly reflects the original state of affairs as it is also found in a number of Sw&No dialects, cf. Oftedal (1952:165ff). The transition seems easy to account for once we assume that i-mutation in the plural of nouns (32ai) was accompanied by Accent 1, while in the comparative of adjectives it resulted in Accent 2. The former, being a more numerous group, exerted analogical pressure on the latter. As a next step, Accent 1 was associated with irregular inflection as such. Värdera in (60b) was possibly remodelled on the basis of verbs with unstressed prefixes (be-, för-), but it is rather unclear why adjectives with an anacrusis (such as offentlig in (60b)) came to pattern with verbs and not with nouns. The immediate trigger for (60a) is also difficult to identify. As far as I know, the transitions in (60a, b) do not have Norwegian parallels. It is probable that certain instances of tonal change (Accent 2 > Accent 1) in standard Swedish do not follow from analogical considerations and therefore have nothing to do with markedness. Recall that we have just proposed that Eastern Mälardalen (including Stockholm) acquired generalized Accent 2 in the latter half of the 17th century, upon which the accent distinction was reintroduced in the capital. It is thus reasonable to claim that the spread of Accent 1 in certain standard Swedish words is nothing but the disposal of the remnants of generalized Accent 2. It is of course often impossible to tell such cases apart from the analogical spread of Accent 1, which implies that some changes that are reminiscent of the ones in (60) should be taken with a grain of salt.

3.3. Current developments

The linguistic landscape of Scandinavia has been considerably altered in recent decades as a result of increased international mobility. Although immigration to the region is not a new phenomenon, the latter half of the 20th century saw certain novel trends. The social and economic security of the post-war era turned Scandinavia into a region of net immigration in sharp contrast to the exodus that had characterized the turn of the century. Given that immigration to Scandinavia has in modern times extended to both migrant workers and asylum seekers (with the regulated possibility of family reunification) the number of newcomers has assumed unprecedented¹³³ proportions. The linguistic background of the immigrants is another component that distinguishes present-day migration from that of earlier centuries. Historically, most¹³⁴ migrants were speakers of Dutch and German, i.e. other closely related Gmc languages. Contemporary migrants, on the other hand, come mainly from poverty- or conflict-stricken areas from all over the world, which means that the most common immigrant languages include Arabic, Turkish, Persian, , Somali, Serbo-Croatian etc. Despite considerable efforts, the various governments of the Scandinavian countries have failed to devise an adequate solution to the challenges of integration. As a result, a large number of people with a foreign background (and often an imperfect command of Da, No or Sw) have been entrapped in segregated suburban areas, where both school dropout and unemployment are disproportionately high. Needless to say, this state of affairs provides fertile soil for the emergence of new linguistic varieties. A number of research projects¹³⁵ have indeed been recently conducted in all three Scandinavian countries in order to explore the exact nature of such newly arisen vernaculars that have been labelled in the mass media with somewhat derogatory designations such as blattesvenska, , etc. Here we will adopt the more neutral term¹³⁶ ScMG (Scandinavian on Multilingual Ground), coined on the basis of Bodén’s (2005a) description of Swedish on Multilingual Ground.

132 Taken from Bloch (2003:58).
133 The number of immigrants has in certain years exceeded one percent of the total population.

Basic features

Although ScMG is generally acknowledged to be foreign-sounding, it cannot be equated with a deficient foreign accent resulting from second language acquisition. Its speakers are usually capable of code-switching (to a certain extent) according to the speech act they are involved in. ScMG is widely spoken by bilingual adolescents born and raised in Scandinavia, who are generally more competent in Da, No or Sw than in their language of “origin”. In addition, ScMG is also spoken by monolingual adolescents without an immigrant background, who are obviously not second language learners. All this makes it clear that ScMG is a variety in its own right and not “a manifestation of lack of competence”, cf. Svendsen & Røyneland (2008:64), Bodén (2005b, 2007:21). ScMG is of course by no means homogeneous, as there is considerable inter- and intra-dialectal variation. Listeners in perception experiments often have difficulties deciding whether a given stimulus belongs to a speaker of ScMG or not, cf. Bodén (2007). The problems of classification can only partly be ascribed to the listener’s own linguistic background, as there is probably a continuum of varieties ranging from the standard local dialects to ScMG proper. Furthermore, urban youth language is without question always defined by the dialect on which it is based. Nevertheless, there are certain cross-Scandinavian similarities that allow us to use ScMG as an umbrella term for the different speech varieties in question. The fact that the characteristic features of ScMG typically overlap with foreign accent follows from their role as linguistic markers of identity. It can be hypothesized that certain deviating patterns in the speech of first-generation immigrants are conventionalized as an expression of social identity in the usage of second-generation immigrants, cf. Bijvoet & Fraurud (2006:5).
On the syntactic level this commonly involves the violation of the V2 constraint in sentences that start with an adverb or a subordinate clause. In morphology we can observe the excessive use of common gender at the expense of neuter forms, cf. Quist (2008:47), Svendsen & Røyneland (2008:74), Bijvoet & Fraurud (2006:5). This is in accordance with our expectation that L2 learners tend to favour unmarked structures, some of which may get conventionalized by later generations. All this suggests that we can anticipate unmarked features to gain ground on all levels of the linguistic description, yet not every marked feature is expected to recede. We can take the example of rounded, high, non-back vowels in Swedish, which (being cross-linguistically rare) are often replaced with other sounds by first-generation immigrants; however, similar deviations are not reported in the urban youth language of Rosengård, Malmö. In fact, some marked entities such as voiced fricatives, affricates etc. (which are generally absent in the standard languages) can find their way into ScMG on condition that they are used in the relevant immigrant languages, cf. Bodén (2007:27ff).

134 Finnish, whose speakers immigrated in large numbers into Swedish-speaking territories especially in the 17th century and onward, is a non-IE language. Nevertheless, the Finnish people had lived under Swedish rule for approximately six centuries and thus shared a number of cultural and historical bonds with the Swedes.
135 http://svenska.gu.se/forskning/isa/forskningsverksamhet/Avslutade%20projekt/Sprak-och-sprakbruk http://www.hf.uio.no/iln/english/research/projects/upus/ (both retr. 2016-11-12)
136 Quist (2008), Svendsen & Røyneland (2008) etc. refer to these urban youth varieties as multiethnolects.

Stress and quantity

Given that certain suprasegmental features of the Scandinavian languages can be expressed in terms of markedness and are notoriously difficult for foreigners to reproduce, the prosodic level offers a number of candidates to be conventionalized. As far as stress patterns are concerned, we have to distinguish between lexical prominence and the phonetic realization of (syntactic) stress. To my knowledge, the former does not deviate from the standard patterns in any systematic way, which is not surprising, given that the immigrant languages have divergent stress systems and thus lack a common dominant edge-aligned pattern. They are, however, widely reported to differ from the stress-timed nature of the Gmc languages. Multiethnolects are frequently described as having a staccato-like rhythm and approaching σ-timing (cf. Bodén (2005a)), the reasons for which are as follows. If a language displays tendencies towards foot isochrony, it has to resort to various assimilations and reductions in order to be able to squeeze a given number of unstressed syllables into the interval defined by two adjacent stresses. Accordingly, the speech rhythm of a stress-timed language cannot be mastered without a thorough understanding of the phonological rules involved. In the case of language contact this requirement is rarely met, which means that most L2 learners (at least in an early stage) are likely to lean towards σ-timing irrespective of their linguistic background. Now given that varieties such as ScMG emerge from language contact, a shift towards σ-timing seems to be a natural phenomenon. The impression of a staccato-like rhythm does not exclusively depend on the absence of reductions and assimilations. Defective realizations of phonological length can have similar effects. Bodén (2005b:9) summarizes the rhythmical aspects of rinkebysvenska (Stockholm) as presented in Kotsinas (1990).
It turns out that in addition to the infrequent use of reductions and assimilations, speakers of ScMG in Rinkeby tend to produce shortened long vowels and prolonged short vowels in stressed syllables. Such a deviation from standard complementary length is one of the main prosodic reasons why rinkebysvenska sounds foreign. However, it must be added that in certain Swedish dialects a lower degree of complementarity is not necessarily perceived as a foreign feature. Bodén (2007:28) points out that this is the case in Skåne, where both ScMG based in Rosengård and the regional standard exhibit a rather weak interpretation of complementary length. All in all, it seems that the length distinctions of the (regional) standards are maintained but with clear tendencies towards σ-timing.

Tone and stød

Most dialects that lack the tonal / glottal contrast are located at the periphery of the Scandinavian continuum, which suggests that language contact can lead to the suspension of the opposition. It would thus be surprising if ScMG did not deviate from the standard distribution and realization of tone and stød. Quist (2000:8) indeed reports that in the speech of adolescents in Nørrebro, the expected stød of standard Danish is absent in words like sammen (together), tusind (thousand) and grim (ugly) but is retained in some other words such as mand (man). She suggests that the preserved stød of the latter item is somehow related to the word’s high frequency. Given that the other three examples are also relatively common, an explanation in terms of frequency (while being intuitively appealing) is not necessarily an appropriate one. I am not aware of any systematic study of the distribution of stød in Danish multiethnolects, yet it seems safe to assume that the typically foreign feature of stødlessness has not been conventionalized in the newly arisen variety spoken in Nørrebro. The urban youth language of Oslo displays some analogous tendencies, cf. Svendsen & Røyneland (2008:73). The authors mention that multiethnolectal speakers often replace Accent 2 with Accent 1 and thus make no distinction between words like 1bønder (peasants) and 2bønner (beans). The tonal opposition, however, is not totally lost given that the authors’ informant pronounces certain minimal pairs such as 1tanken (the tank) and 2tanken (the thought) in accordance with the tone-system of standard Norwegian. Similarly to the absence of stød discussed above, the domain of Accent 2 seems to be severely restricted in this Norwegian multiethnolect. The authors also mention a tendency among adolescents (without an immigrant background) in the Oslo area to replace Accent 2 with Accent 1 in e.g. present tense weak verbs. While this process can give rise to new minimal pairs (cf. 2spiller (player) as opposed to 2spiller > 1spiller (plays)) it never leads to the merger of already existing ones. Thus the spread of Accent 1 among ethnic Norwegians seems to be subject to morphological and lexical restrictions and is therefore an example of transmission.
In language contact the conditioning of such processes tends to fade away (diffusion) so it is not surprising that multiethnolectal speakers do not follow such restrictions and show mergers between minimal pairs. We have noted so far that the use of stød and Accent 2 is less frequent in ScMG than in the regional standards. On the face of it, such developments are unexpected given that in Chapter 2 we claimed both stød and Accent 2 to be the unmarked members of the respective oppositions, which means that they should widen their scope and not recede. Recall, however, that these markedness relations were based on phonological and not on phonetic criteria (cf. the discussion of frequency vs. complexity in 1.3) and were language-specific. We have also claimed, e.g. in connection with loanword adaptation, that the internal development of a single phonological system favours phonologically unmarked features. In the case of language contact, most speakers usually do not have so deep an understanding of the secondary system as to be able to take phonological criteria into consideration. It follows that such external developments favour phonetically unmarked features, which can explain why non-stød and Accent 1 are so widely used in ScMG. We can conclude this chapter by considering the use of word accents in the suburbs of Sweden’s biggest cities. As in the multiethnolects of Denmark and Norway, speakers of ScMG in Sweden appear to “maintain the word accent distinction… [which] means that the most obvious way of melodically signaling a non-Swedish background is left unused” (Bodén (2004:479)). Nevertheless, Bodén also points out that ScMG exhibits some deviating patterns, which she interprets as regional features in ScMG. The Accent 2 of Malmö (L-H-L) is sometimes pronounced in Rosengård such that the F0 curve “lacks a L turning point preceding the H tone” (ibid: 478).
So, according to the revised segmentation we put forward in (39) in 2.2.4.9, the prominence tone may be absent in Rosengård-Swedish. In Rinkeby, on the other hand, the second peak of Accent 2 words (H-LH-L) is not always assigned, which means that in Stockholm it is the connective tone that is lost, cf. Bodén (2011:42), and not the prominence tone. In Göteborg, however, “the high tone at the end of prosodic phrases [i.e. the boundary tone] is often missing” (ibid), which reduces Accent 2 from H-L-H to H-L.

At first glance it seems that the dialects in question employ completely different regional features, given that they omit different constituents of Accent 2. A closer inspection, however, reveals that this is not the case. Table (61) below illustrates that the surfacing melodies of Malmö, Stockholm and Göteborg appear to converge in H-L once we ignore the building blocks the respective dialects have been reported to lack. Now if we recall that we have argued that Accent 1 can be represented as H-L (at least in Malmö and Stockholm), it is hard to escape the conclusion that the opposition is on the way to being suspended in the varieties surveyed. Bodén approaches the problem of word accents with the help of the timing hypothesis (cf. 2.2.4.1) and does not call attention to the tendency in (61). If we are not mistaken in the interpretation of the relevant data, then the allegedly regional features of ScMG (cf. Bodén (2011:40)) may turn out to be a supraregional correspondence.

(61) The realization of Accent 2 in ScMG

            Prominence tone   Connective tone   Boundary tone
Malmö       (L)               H                 L
Stockholm   H                 (LH)              L
Göteborg    H                 L                 (H)
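The convergence illustrated in (61) can also be checked mechanically. The following is a minimal sketch in Python and not part of the original analysis: the dictionary merely transcribes table (61), and the names ACCENT_2, surfacing_melody and MELODIES are hypothetical labels introduced for illustration. Parenthesized tones stand for the building blocks that the respective varieties have been reported to lack.

```python
# Table (61), transcribed as (prominence tone, connective tone, boundary tone).
# Parentheses mark the tone reported to be missing in the ScMG variety.
ACCENT_2 = {
    "Malmö": ("(L)", "H", "L"),
    "Stockholm": ("H", "(LH)", "L"),
    "Göteborg": ("H", "L", "(H)"),
}


def surfacing_melody(tones):
    """Join the realized tones, skipping every parenthesized (absent) tone."""
    realized = [t for t in tones if not (t.startswith("(") and t.endswith(")"))]
    return "-".join(realized)


# Assumed helper mapping: city -> melody that actually surfaces.
MELODIES = {city: surfacing_melody(tones) for city, tones in ACCENT_2.items()}

for city, melody in MELODIES.items():
    print(f"{city}: {melody}")
```

Under these assumptions all three varieties surface with H-L, i.e. with the melody that we have argued to represent Accent 1 (at least in Malmö and Stockholm).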

Thus we have observed a general trend in the Da&No&Sw varieties of ScMG. The tonal / glottal opposition is not absent, as it usually is in the “broken” pronunciation of L2 learners. Still, some distributional and realizational simplifications lead us to assume that the opposition is considerably weakened in ScMG when compared to the regional standards.

4. Summary

4.1. Chapter 1

In the present thesis I examine the changes of those prosodic features that are distinctive at the level of the word in standard Danish, Swedish and Norwegian. Chapter 1 is devoted to setting up the relevant theoretical framework, since understanding how phonological change proceeds is a crucial prerequisite for diachronic reconstruction. The first thing to notice is that a large number of sound changes appear to make sense as they either 1) result in a more economical inventory, 2) decrease articulatory effort or 3) minimize perceptual confusion. We can thus assume that such changes optimize the system within which they take place and can therefore be described as goal-oriented processes. These observations about sound change form the basis of an approach known as Functional phonology. Given that phonological change is rooted in language use, I argue that it reflects a typical communicative situation, which requires an addresser, an addressee and a code between the two. Optimal communication requires a code that is easy to articulate, easy to perceive and easy to learn. These three factors, which are to a certain degree inherently in conflict, can be translated into the three functional principles in 1.4.3. If a sound change is the product of these internal factors, then it satisfies at least two out of our three principles. If a sound change cannot be described in terms of optimization, then it is due to language contact / external factors (1.5.3). As far as I can see, the criticism directed at the functional approach can be met with convincing arguments. It is often maintained that reliance on teleology (goal-orientedness) is unscientific. In 1.2.2 I claim that the mechanistic / deductive model of science is not applicable to the human sciences (which include linguistics) as the investigation of human phenomena is qualitatively different from the investigation of physical reality.
This means that the only possible explanation left at our disposal involves teleology. I assume that sound change (similarly to Darwinian evolution) is made up of incessant random variation and meaningful selection (1.4.5.3). We can use goal-orientedness (as a heuristic tool) to describe the latter (1.2.1). The functional principle that reflects the perspective of the speaker (ease of articulation) is usually expressed with reference to markedness relations. It has been pointed out that the notion of markedness is characterized by an embarrassing degree of polysemy and is as such an unscientific, snowball-like concept, which leads to contradictory claims. Although this is a valid point, the solution to this problem is not to banish the term for good, but to come up with a workable definition. I argue that a frequency-based approach to markedness is superior to a definition based on complexity (1.3.3). This practice assigns a completely new interpretation to the markedness relations of privative oppositions. The last critical point concerns our belief that phonological change improves the system. How is it possible that inventories keep on changing forever instead of arriving at an optimal system after a time? It turns out that there is no optimal system. The discrepancy between an optimal (i.e. symmetrical) inventory and the shape of our speech organs, together with the irreconcilable perspectives of the listener and the speaker, makes it clear that there is no ideal system. The fact that it is always possible to find a better one implies that phonological changes driven by internal factors always form a loop, which allows for eternal optimization (1.4.5.2). The problem of mergers also seems incompatible with the concept of improvement, since two merging sounds often result in homophones and thus increase rather than decrease perceptual confusion. Most mergers, however, lead only to pseudo-homophones, which are never confused (1.4.5.4). The final section of chapter 1 demonstrates how our functional principles can be applied to some vocalic changes that occurred in Old English (1.5.1) and how these very same principles can pinpoint foreign influence in historical linguistics once we have established the basic types of phonic interference (1.5.4). I identify a vocalic development in Old English, which, I claim, is due to Scandinavian influence (1.5.5). It is also shown how changes in stress placement can form a loop of eternal optimization (1.5.2).
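The decision procedure summarized in this section can be stated as a small sketch. The Python fragment below is an illustration only and rests on explicit assumptions: the class SoundChange, its three boolean fields and the functions principles_satisfied and classify are hypothetical names, and the judgment whether a given change satisfies a principle is supplied by the analyst rather than computed. The sketch simply encodes the claim that an internally motivated change satisfies at least two of the three functional principles; treating everything below that threshold as external is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SoundChange:
    """A sound change with the analyst's judgments on the three principles."""
    label: str
    eases_articulation: bool    # perspective of the speaker
    eases_perception: bool      # perspective of the listener
    economizes_inventory: bool  # the shared code (ease of learning)


def principles_satisfied(change: SoundChange) -> int:
    """Count how many of the three functional principles are satisfied."""
    return sum(
        (change.eases_articulation,
         change.eases_perception,
         change.economizes_inventory)
    )


def classify(change: SoundChange) -> str:
    """Assumed rule: two or more principles -> internal, otherwise external."""
    return "internal" if principles_satisfied(change) >= 2 else "external"


# Hypothetical inputs, not data from the thesis.
change_a = SoundChange("hypothetical change A", True, False, True)
change_b = SoundChange("hypothetical change B", False, True, False)

print(change_a.label, "->", classify(change_a))
print(change_b.label, "->", classify(change_b))
```

Note that the sketch does not quantify the degree of optimization; it only counts which principles a change is judged to satisfy.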

4.2. Chapter 2

The synchronic analysis of Chapter 2 addresses the problem of the three distinctive word-level prosodies in the Scandinavian languages: length, stress and tone/stød. The considerable overlap that exists between the phonology of Swedish and Norwegian (2.1) allows us to present these two languages together. Both exhibit complementary quantity in stressed syllables, which accordingly have either V:(C) or VC: rhymes. This raises the question whether the UR includes consonantal or vocalic length. I argue for underlying vocalic quantity, which I support with phonetic, derivational and distributional evidence (2.2.2). The stress pattern (by which I mean the location of primary stress) of modern Sw&No is to a large extent unpredictable. I argue that it is pointless to set up derivational rules for the stress pattern of mono-morphemic words. I challenge the view that Sw&No are right-aligned languages with a three-syllable window and propose that these languages still have default initial stress. The fact that Sw&No stress (almost) always falls on one of the three last syllables of a word is a static generalization, a distributional accident, which cannot weigh as much as the directionality of analogical processes (2.2.3). The tonal contrast in Sw&No can be analyzed either as equipollent or as privative. I adopt the latter approach, arguing that it is more appropriate for the purposes of a phonological study (2.2.4.1). I distinguish between maximal and minimal systems and assume that the opposition is not preserved in non-focal position in minimal systems (2.2.4.2). Although the tonal distinction is only marginally distinctive, I reject a recent proposal according to which the distribution of the accents is completely predictable (2.2.4.3). I approach the problem of tonal markedness employing the definition from 1.3 and argue that Accent 1 is the marked member of the opposition (2.2.4.4).
I provide further support for the newly established markedness relations by looking at the tonal behaviour of suffixed simplex words. I reject the claim that Accent 2 is induced by suffixes and argue that it is necessary neither to postulate a locality principle nor to distinguish between weak and strong suffixes in order to account for the distribution of the accents, if we assert that Accent 2 is default (2.2.4.5). I also review compounding in standard Sw&No to find that the relevant data are compatible with my earlier claims about tonal markedness (2.2.4.6). Subsequently, I turn to southern Swedish compounds, which also indicate that Accent 1 should be analyzed as a lexically specified mark. The systematic spread of Accent 2 also supports the assertion that it is unmarked (2.2.4.7). As to the functions of the tonal contrast, I assume that Accent 2 signals both [+connective] and [+native], while no corresponding value is attached to Accent 1. Tonal distribution in geographical names can motivate our use of [+native], which is not meant to reflect a word’s etymology (2.2.4.8). Given that some of my earlier claims are not in agreement with the traditional privative segmentation of Accent 2, I propose to revise the opposition of standard Swedish, claiming that Accent 2 = Prominence tone + Connective tone + Boundary tone (2.2.4.9). I conclude the discussion of Scandinavian tonality by setting up some rules within the framework of Lexical Phonology (2.2.4.10). The rules clearly indicate that Accent 1 is indeed the marked member of the opposition.

Given that a stressed Danish syllable is not subject to the bimoraic requirement, Danish quantity is more of a segmental than a prosodic feature. Long consonants are only found across grammatical boundaries or as a result of schwa-assimilation. I argue that a moraic analysis is not adequate for short consonants and therefore reject the claim that the mora is a unit of quantity for vowels but not for consonants (2.3.1). I assume that lexical stress is a binary category, while a surface-phonological description includes three levels in all three languages. The fact that a syllable loses its stød if it appears under tertiary stress as a result of stress demotion echoes our claims in 2.2.4.2 that the tonal contrast is not realized in non-focal position (2.3.2.1). I also propose that contemporary Danish lacks default stress in contrast to Sw&No (2.3.2.2). The way we defined phonological markedness (1.3) singles out stød as the unmarked member of the glottal opposition. The distribution of stød in inflected forms indicates that most regularities are tied to morphology and that a strictly phonological approach involves unnecessary complications and overly abstract underlying forms (2.3.3.1). The distributional argument in favour of the unmarked status of stød can be supported with some ongoing changes, as a large number of words have acquired stød in recent years (2.3.3.2). While stød is often lost on the anterior constituents of compounds, posterior constituents frequently host lexically unmotivated tokens of stød. I argue that this phenomenon has both physiological and morphological reasons (2.3.3.3). When it comes to the functions of the opposition, I assume that the unmarked pattern (stød) is associated with [+native] similarly to Accent 2 in the tonal dialects (2.3.3.4).

4.3. Chapter 3

The diachronic review of Chapter 3 starts with some controversial aspects of the Proto-Germanic stress shift. I argue that it originally resulted in word-initial (rather than root-initial) stress and that it is unlikely to have preceded Grimm’s law as such (3.1.1). Given that the modern Scandinavian languages exhibit fully productive compounding but lack dynamic stress shifts, I reject the idea that they have right-aligned stress and that they obey the three-syllable rule of modern Greek, Italian etc. A comparison between those languages in which the rule is said to apply reveals that they do not implement the three-syllable rule in a uniform fashion. If we want to insist that e.g. Swedish follows the rule, then we can conclude that it follows its weakest possible interpretation, given that it only applies to stems and derivatives, which is probably a distributional accident and not an underlying principle (3.1.2). Although final stress in compounds is a sporadic feature in the modern languages, it used to be much more extensive. I propose that the phenomenon emerged in connection with level stress, i.e. before the Scandinavian quantity shift, and that it was productive for centuries to follow (3.1.3). In 3.1.4 I raise concerns about the widely held belief that a Germanic content word is always heavy and argue that CVC-monosyllables are monomoraic. I point out that moraic analyses of consonants are arbitrary and often rely on false premises, which can lead to contradictions. Although open syllable lengthening is expected to feature only V lengthening, it also involves gemination in certain dialects, which is argued to be due to Sámi influence. This supposition can be mapped onto the basic interference types we outlined in 1.5.4 and is supported both by the facts of dialect geography and by the sound pattern of the Finnic languages, which display both C-gradation and geminates (3.1.5).
Danish has a syllable cut opposition similarly to English, German and Dutch, so I argue that from a prosodic point of view it is a West-Germanic language. I demonstrate that degemination is the result of an incomplete transition into a system of complementary quantity. Furthermore, I argue that (a transition into) a σ cut opposition is incomprehensible in terms of moras (3.1.6). Given that F0 is a usual stress correlate, I assume that a phonetically tenable account of tonogenesis has to trace the opposition back to various stress patterns. Accordingly, the origins of the tonal contrast either go back to secondary stress in Proto-Nordic or level stress in Old Scandinavian. The former approach singles out two-peaked dialects (like that of Stockholm) as conservative varieties. Nevertheless, it is not viable to assume identical tonal building blocks for the early opposition and modern two-peaked dialects. Although the tonal opposition is commonly attributed to epenthesis and cliticization in the 11th and 12th centuries, I suggest that the reductions of the Syncope Period could also have given rise to a cursory tonal contrast. It is argued that the retention of the tonal curve after stress demotion is due to Sámi influence (3.2.1). According to the level stress approach, in which the opposition is assumed to have originated as HL:LHL, open syllable lengthening in the 13th and 14th centuries can be said to have been the immediate trigger of tonogenesis. Given that this hypothesis attributes the present-day distribution of Accent 2 to analogy even in cases that are incompatible with our understanding of analogical processes (3.2.4), it seems clearly inferior to the Proto-Nordic approach. This stance is corroborated by the assumption that the glottal opposition goes back to a tonal predecessor and emerged due to V weakening in the 13th century (3.2.2). The extensive phonetic variation of the contemporary tonal typology is argued to be due to the mechanism of peak delay (i.e. rightward shifts). The tonal shifts are functionally motivated and form a loop of eternal optimization (3.2.3).
Although the Stockholm variety is supposed to represent the original state of affairs, the city of Stockholm itself is claimed to have escaped the loop, since the tonal contrast was possibly suspended for a century or so given that Accent 2 was generalized in the area due to language contact. I also propose that some dialects that have general Accent 1 today must have passed through an earlier stage with generalized Accent 2 (3.2.4). The thesis is concluded with a glance at some current developments in multiethnic urban environments (3.3). The newly arisen multiethnolects / sociolects display similar tendencies in the three languages. They all exhibit a staccato-like rhythm and a considerably weakened (yet not completely lost) tonal/glottal opposition. While earlier contact with speakers of other Germanic dialects contributed to the spread of phonologically unmarked features (3.2.4), the present-day situation, when Scandinavians have come into contact with an unprecedented number of people speaking linguistically distant non-Germanic languages, has led to the advance of phonetically unmarked features such as non-stød and Accent 1. Some supraregional similarities observed in Swedish multiethnolects suggest that the tonal opposition is on the way to being neutralised in Accent 1.

Appendix: Hungarian summary

Hungarian-language summary

In the present thesis I examine the changes of the phonologically relevant prosodic features in the standard varieties of the three central Scandinavian languages. In Swedish and Norwegian, quantity, stress and tonality are distinctive at the level of the word, whereas in Danish the distinctive features are quantity, stress and the glottal feature known in the literature as stød. The members of the Swedish/Norwegian tonal opposition are known as Accent 1 and Accent 2. The distribution of Accent 1 and stød (as well as that of Accent 2 and non-stød) shows correspondences of such an extent that they make it clear that the two phenomena share a common origin. The literature on the phonological problems in question is very extensive, and it is not uncommon for renowned scholars to arrive at sharply contradictory conclusions when examining one and the same problem. The main source of the contradictions can often be traced back to divergent uses of terminology and premises. Accordingly, the first chapter of the thesis outlines the theoretical framework necessary for understanding the phenomena and sound changes under investigation. In the second chapter I carry out the synchronic analysis of the individual prosodic features, the aim of which is to prepare and facilitate the reconstruction of the historical changes analysed in the third chapter. The fourth chapter summarizes the most important results of the thesis.

Chapter 1

The question of sound change raises a number of problems, one of the hardest of which to answer is what sets a given phonological innovation in motion. Although the exact identification of the triggering causes is demonstrably an impossible task, two clearly distinct patterns emerge when we approach the matter through the notion of teleology. Changes are either brought about by misunderstandings between speakers, and are as such products of chance, or they serve certain underlying goals. If a sound change has a goal or a purpose, it can only be that communication becomes more efficient through it. This can be achieved if the three pillars of the communicative model are optimized, that is, if speech requires less articulatory effort, if phonemes, words etc. are kept well apart (which simplifies perception) and if the phonology of the language in question can be described economically. The first two factors take the perspectives of the speaker and the listener into account, while the last one concerns the common code that connects the two. Functional phonology is the name of the phonological model that attributes the changes of phonological systems to these three factors (functional principles) and interprets them as optimization. A given change can only take place if it satisfies at least two of the three functional requirements. It follows that the degree of optimization need not be quantified. Several scholars regard reference to teleology (or, more precisely, to the Aristotelian final cause), and thus the functional approach as well, as unscientific. It seems, however, that the deductive model of science, which recognizes only the Aristotelian efficient cause as a legitimate explanation, is not applicable to the human sciences, since these differ from the investigations of the natural sciences in several respects. Teleology is thus an unavoidable part of the explanation of sound change. In my view, sound change and Darwinian evolution are built on one and the same two-level mechanism: random variation is followed by teleological selection. The continuous change of phonological systems can only be interpreted as optimization if we prove that there is no optimal system and that the changes are therefore circular. This follows unambiguously from the mutually contradictory requirements of the functional principles. Since optimization often works against marked features, we have to clarify what we mean by markedness. I believe that a satisfactory phonological definition must take frequency rather than complexity as its basic criterion. I regard changes that can be interpreted as optimization as internal developments, whereas I attribute non-teleological innovations (arising from misunderstandings) to external linguistic influence. At the end of the first chapter I analyse a few historical changes to show how the theoretical approach outlined above can be applied in practice.

Chapter 2

Norwegian and Swedish phonology overlap to such an extent that, in order to avoid unnecessary repetition, it is justified to discuss the prosodic features of the two languages in a joint section. The stressed syllable is obligatorily long in both languages, so one of the constituents of the rhyme (the nucleus or the coda) has to branch. Such complementarity entails that we posit a single quantity (either consonantal or vocalic) in the underlying representation, from which the surface redundancy can be derived. Although the literature is divided on the issue, I take the position that vocalic quantity should be taken as basic. Apart from a few systematic exceptions, simplex words of Germanic etymology (which have by now been shortened to two or three syllables at most) are stressed on the first syllable, whereas the stress of loanwords may fall on any of the last three syllables. Given that there is no semantic cue that could separate the two groups, stress assignment in modern Swedish and Norwegian is largely lexical. A static generalization thus makes it possible to compute the location of stress from the end of the word, as in Modern Greek, but such right-alignment is contradicted by the tendency for the stress of loanwords to shift occasionally to the beginning of the word in the course of phonological adaptation. The stress pattern of productive compounds also suggests that word-initial stress is still the default in the two languages and as such serves as the basis of analogical changes. The literature describes the tonal opposition sometimes as equipollent and sometimes as privative. In the former case the difference between the two accents lies in the different timing of the melodic patterns (i.e. in the different association relations), whereas the latter approach identifies Accent 1 with intonational tones (a prominence tone and a boundary tone) and describes Accent 2 as Accent 1 preceded by a lexical tone. The privative interpretation segments and interprets the tones and is thus an adequate starting point for a phonological analysis. If Accent 1 is indeed nothing but sentence intonation, which can also be found in the contour of Accent 2, then according to the theory the opposition should be alive in unstressed position as well. However, several arguments indicate that the realization of the tones is tied to a certain degree of stress. This suggests that the traditional privative segmentation is not adequate. It seems more appropriate to interpret Accent 2 as a sequence of a prominence tone, a connective tone and a boundary tone. On this view, the connective tone cannot fulfil its function in unstressed position (there is nothing for it to connect to) and is therefore not realized, as a result of which the opposition is suspended in the absence of stress. It follows from the privative nature of the opposition that traditional analyses regard Accent 2 as the marked member of the contrast, since it is phonetically more complex. Since three tones cannot fit on the two moras of monosyllabic words, the opposition is neutralized in Accent 1 here (and in words with final stress). This used to be treated as a further argument for the traditional markedness relations. Starting from the definition of markedness established in the first chapter, however, the traditional view has to be revised, since in those words in which the distinction can be realized the frequency of Accent 2 is considerably higher. Furthermore, the distribution of the accents can also be accounted for more simply if we take Accent 2 to be the default. Numerous arguments can be adduced to show that Accent 1 marks irregular, special, foreign patterns, which are the targets of analogical changes. All this suggests that the opposition is not only suited to expressing connectivity but also carries a [native] specification. Danish has no underlying long consonants and is therefore not characterized by the complementarity found in Swedish and Norwegian. As a consequence, vocalic quantity is a segmental rather than a prosodic feature; it nevertheless has to be discussed in order to understand stress and stød. Surface long consonants occur at morpheme boundaries and as a result of schwa-assimilation. This process can also yield long vowels, which, since they cannot stand in unstressed position, receive secondary stress and stød. Danish stress is lexical, as in Swedish and Norwegian. In the course of prosodic adaptation, however, stress does not shift to the first syllable of loanwords, so I assume that Danish no longer has a default stress pattern. Secondary stress, as we have seen, is a derived category, which can be determined from the structure of compounds and from the course of schwa-assimilation. Under certain circumstances lexicalized compounds tend to lose a secondary stress, which goes hand in hand with the loss of stød. The occurrence of stød is tied to the condition known as stød-basis, according to which stød can only be realized in a syllable with (secondary) stress that contains a long vowel or a short vowel followed by a sonorant consonant. As a consequence, the literature sometimes describes sonorant consonants as moraic; this is, however, an arbitrary approach, since nothing apart from the distribution of stød (e.g. surface quantity) supports it. Stød is phonetically marked, yet unmarked according to the definition of the first chapter. Its distribution is mostly determined by morphological factors, as a result of which strictly phonological approaches lead to unnecessary complications. The unmarked status of stød is supported by the fact that it appears in more and more new words and positions through analogical changes. An example is the second constituent of (formal) compounds, which is often realized with stød even in cases where morphology would lead us to predict non-stød. The spread of stød does not extend to first constituents; on the contrary, owing to physiological factors and the morphological nature of the phenomenon, even words that are pronounced with stød in isolation are typically realized without stød in this position. The spread of the glottal feature suggests that, similarly to the Swedish-Norwegian accents, stød also plays a role in the prosodic assimilation of loanwords.

Chapter 3

Since the tonal / glottal opposition is dependent on quantity and stress, the diachronic part begins with an examination of the latter two features. I establish that word-initial stress arose first in Common Germanic and that stress on the root syllable is the result of later grammaticalization. I review some opposing views on the chronology of the change and find that the traditional view is correct, according to which the stress shift took place after the first step of the Germanic consonant shift. In the modern Germanic languages the early Germanic stress pattern has been replaced either by a Latin-type (right-aligned) system or by a dual system, in which the lexicon is split in two along the feature [native]. I survey the typology of stress systems computed from the end of the word and provide further arguments for my earlier position that it is not expedient to analyse Swedish as such a Latin-type language. In several Scandinavian dialects the primary stress of compounds falls on the second constituent, as a result of which the tonality of compounds is neutralized in Accent 1. I point out that this phenomenon was considerably more widespread in the Middle Ages than it is today and that it may be connected with the stress pattern known as level stress. The Scandinavian languages underwent numerous segmental deletions in the early Middle Ages (the Syncope Period) and a comprehensive quantitative restructuring in the late Middle Ages. Although these changes are customarily analysed in terms of moras, I believe it is a mistaken practice to extend the moraic analysis to all changes of this kind. I show that such approaches rely on arbitrary assumptions, which may later lead to numerous contradictions. This is well illustrated by the degemination that took place in the West Germanic languages and in Danish, which brought about changes that are inexplicable in terms of moras. I put forward several arguments that the Danish glottal opposition developed from an earlier tonal contrast. I also point out that tonogenesis can only be explained satisfactorily if we tie the tones to a prosodic feature that is known to influence the tonal contour. Accordingly, I attribute the emergence of the tonal opposition to different stress patterns and reject those classic works that tie the contrast exclusively to syllable count. I find that the most reliable correlate of Accent 2 is Common Scandinavian secondary stress, which was lost in such a way that the redundant melodic pattern associated with it was preserved under Sámi influence. I assume that it was the strong weakening of unstressed syllables in Danish that led to the transformation of Scandinavian tonality into a glottal opposition. Today the tonal dialects realize the accents in very different ways. I argue that this diversity has to be explained by taking the Stockholm melodic pattern as the starting point. The other dialects can be derived by letting the tonal contour drift towards the end of the word (peak delay). In line with functional expectations the changes are circular, and the few dialects whose melody the model cannot generate arose as a result of external linguistic influence. In several parts of the Scandinavian language area the opposition has been neutralized in Accent 1. Around Stockholm, however, Accent 2 was generalized, and it can be assumed that for a time the opposition was not alive in the Swedish capital either. I point out that generalized Accent 2 turns into generalized Accent 1 as a result of internal development, so that in several dialects which today only have Accent 1, all words with a post-tonic syllable were probably once pronounced with Accent 2. The last section of the thesis examines how the prosody of the Scandinavian languages has changed in the usage of the urban multiethnolects that have emerged as a result of the mass immigration of the past 40-50 years. The three languages under examination show a number of supraregional correspondences. They have in common that the earlier length relations are beginning to be replaced by a staccato-like rhythm and that the tonal / glottal opposition, though not yet completely lost, has undergone considerable simplification.

References

Åberg , Alf 1978 Vår Svenska Historia . Natur och Kultur, Stockholm.

Abrahamsen , Jardar Eggesbø 2003 Ein vestnorsk intonasjonsfonologi . Dr.art.-avhandling. NTNU, Trondheim.

Ács , Péter 1990 A note on interscandinavian communication . In: Papers in Scandinavian Studies 4. Budapest. 196-203. 1996 Az interskandináv kommunikáció fonológiai aspektusa. Budapest. ELTE, Germanisztikai Intézet. 2012 Az interskandináv kommunikációra visszatérve: miért nehéz a dán nyelv? In: Bárdosi Vilmos (ed.): Tanulmányok, Nyelvtudományi Doktori Iskola. Asteriskos 1. Budapest. 5-21.

Ács , Péter & Törkenczy Miklós 1986 Directionality and post-tonic schwa-deletion in Standard Danish and Standard British English . In: Skandinavisztikai Füzetek 2, 11-25.

Adamska-Sałaciak , Arleta 1989 On explaining language change teleologically . In: Studia Anglica Posnaniensia 22, 53-74. 1992 Wyjaśnianie w językoznawstwie historycznym . In: Biuletyn Polskiego Towarzystwa Językoznawczego 47-48, 25-40.

Anttila , Raimo 1977 Analogy . Trends in Linguistics. State-of-the-Art Reports. The Hague: Mouton. 1989 Historical and comparative linguistics . Current issues in linguistic theory 6. John Benjamins Publishing Company. Amsterdam.

Aristotle 2004 Physics . Digireads.com.

Árnason , Kristján 1996 How to Meet the European Standard: Word Stress in Faroese and Icelandic . In: Nordlyd 24, 1-22.

Baerman , Matthew 1999 The evolution of fixed stress in Slavic . University of California, Berkeley.

Basbøll , Hans 1999 Prosodic issues in Danish compounding: A cognitive view . In: Mey (ed.): E pluribus una [Festschrift Anna Wierzbicka] RASK 9/10, 349-68. 2003 Prosody, productivity and word structure: the stød pattern of Modern Danish . In: Nordic Journal of Linguistics 26/1, 5-44. 2005 The phonology of Danish . Oxford University Press.


Bennett , William H. 1970 The stress patterns of Gothic . In: Publications of the Modern Language Association of America 85, 463-472. 1972 Prosodic features in Proto-Germanic . In: van Coetsem & Kufner (eds.): Towards a grammar of Proto-Germanic, Tübingen: Niemeyer, 99-116.

Berulfsen , Bjarne 1969 Norsk uttaleordbok . Aschehoug, Oslo.

Bischoff , Shannon & Carmen Jany (eds.) 2013 Functional approaches to language . Walter de Gruyter.

Bijvoet , Ellen & Kari Fraurud 2006 Svenska med något utländskt . Språkvård 2006/3, 4-10.

Blevins , Juliette 2004 Evolutionary phonology . Cambridge University Press.

Bloch , Josefin 2003 Generaliserad accent 2 ur ett fonologiskt perspektiv . In: Riad (ed.): Meddelanden från Institutionen för nordiska språk vid Stockholms Universitet: MINS 54, 37-61.

Bloomfield , Leonard 1933 Language . New York: Holt, Rinehart & Winston.

Bodén (Hansson) , Petra 2004 A new variety of Swedish? In: Proceedings of the 10th Australian International Conference on Speech Science & Technology, 475-480. 2005a The sound of ’Swedish on Multilingual Ground’ . Proceedings, FONETIK, Department of Linguistics, Göteborg University. 2005b Comparing foreign accent and “Rosengård Swedish”: some hypotheses and initial observations . Lund University, Department of Linguistics, Working Papers 51, 5-15. 2007 “Rosengårdssvensk” fonetik och fonologi . In: Ekberg (ed): Språket hos ungdomar i en flerspråkig miljö i Malmö, Nordlund 27, 1-47. 2011 Adolescents’ pronunciation in multilingual Malmö, Gothenburg and Stockholm . In: Källström (ed): Young Urban Swedish. University of Gothenburg, 35-48.

Boersma , Paul 1990 Modelling the distribution of consonant inventories. Paper presented at the Congress of Linguistics and Phonetics, Prague. http://www.fon.hum.uva.nl/paul/papers/Praag_1990.pdf (retr. 2015-05-27) 1997a Sound change in functional phonology . Manuscript, University of Amsterdam. http://www.fon.hum.uva.nl/paul/papers/soundChange.pdf (retr. 2015-05-27) 1997b The elements of functional phonology . Manuscript, University of Amsterdam. http://www.fon.hum.uva.nl/paul/papers/elements.pdf (retr. 2015-05-27) 1998 Functional Phonology: Formalizing the interactions between articulatory and perceptual drives . Doctoral dissertation, University of Amsterdam. http://www.fon.hum.uva.nl/paul/papers/funphon.pdf (retr. 2015-05-27) 2003 The odds of eternal optimization in Optimality Theory . In Holt, D. Eric (ed.) Optimality theory and language change. Dordrecht: Kluwer, 31-65. 2013 The history of the Franconian tone contrast. http://www.fon.hum.uva.nl/paul/papers/FranconianToneHistory68.pdf (retr. 2015-09-28)

Bruce , Gösta 1974 Tonaccentregler för sammansatta ord i några sydsvenska stadsmål . In: Platzack (ed.): Svenskans beskrivning 8, 62-75. 1977 Swedish word accents in sentence perspective . Travaux de l’institut de linguistique de Lund 12. CWK Gleerup. 1982 Reglerna för slutbetoning i sammansatta ord i nordsvenskan . In: Elert & Fries (eds.): Nordsvenska. Språkdrag i övre Norrlands tätorter. Universitetet i Umeå. 123-148. 1998 Allmän och svensk prosodi . Praktisk Lingvistik 16. Lunds Universitet. 2007 Accentuering i svenska sammansatta ord . In: Arboe (ed.): Nordisk dialektologi og sociolingvistik. Aarhus Universitet. 116-125.

Bye , Patrik 2004 Evolutionary typology and Scandinavian pitch accent. University of Tromsø. Kluwer Academic Publishers.

Caton , Steven C 1987 Contributions of Roman Jakobson . Annual review of Anthropology 16, 223-60.

Chomsky , Noam & Morris Halle 1968 The sound pattern of English . New York: Harper & Row.

Cooper , Franklin & Pierre Delattre & Alvin Liberman & John Borst & Louis Gerstman 1952 Some experiments on the perception of synthetic speech sounds . In: The Journal of the Acoustical Society of America 24/6, 597-606.

Cruttenden , Alan 1986 Intonation . Cambridge University Press.

Cser , András 2003 The typology and modelling of obstruent lenition and fortition processes . Budapest: Akadémiai Kiadó.

Dawkins , Richard 1986 The blind watchmaker . New York: W. W. Norton & Company.

Delsing , Lars-Olof & Katarina Lundin Åkesson 2005 Håller språket ihop Norden? En forskningsrapport om ungdomars förståelse av danska, svenska och norska . TemaNord 2005:573. Nordiska ministerrådet. Köpenhamn.


D’Imperio , Mariapaola & Sam Rosenthall 1999 Phonetics and phonology of main stress in Italian . In: Phonology 16, 1-28.

Dray , William 1957 Laws and explanation in history . Oxford University Press.

Ejskjaer , Inger 1990 Stød and pitch accents in the Danish dialects . In: Acta Linguistica Hafniensia 22, 49-75. 2003 Glottal stop (stød, parasitic plosive) and (distinctive) tonal accents in the Danish dialects . In: de Vaan (ed.): Germanic tone accents. Proceedings of the first international workshop on Franconian tone accents, Leiden 13-14 June 2003. Franz Steiner Verlag. 25-34.

Ekwall , Eilert 1930 How long did the Scandinavian language survive in England? In: Selected Papers, 1963, Copenhagen. 54-67. 1975 The History of Modern English Sounds and Morphology. Blackwell.

Elert , Claes-Christian 1964 Phonologic studies of quantity in Swedish based on material from Stockholm speakers . Uppsala, Almqvist & Wiksell. 1972 Tonality in Swedish: Rules and a list of minimal pairs . In: Firchow & al. (eds.): Studies for Einar Haugen. The Hague and Paris. Mouton, 151-173.

Eliasson , Stig 1972 Unstable vowels in Swedish: Syncope, epenthesis or both? In: Firchow & al. (eds.): Studies for Einar Haugen. The Hague and Paris. Mouton, 174-188. 1978 Swedish quantity revisited . In: Gårding, Bruce, Bannert (eds.): Nordic prosody I. Travaux de l’institut de linguistique de Lund 13, 111-122.

Eliasson , Stig & Nancy La Pelle 1973 Generativa regler för svenskans kvantitet . Arkiv för nordisk filologi 88, 133-148.

Elstad , Kåre 1980 Some remarks on Scandinavian tonogenesis . In: Jahr & Lorentz (eds.), 1983: Prosodi / Prosody, (Studies in Norwegian linguistics 2). Oslo: Novus. 388-398.

Fant , Gunnar & Anita Kruckenberg 1994 Notes on stress and word accent in Swedish . In: STL-QPSR 2-3, 125-144.

Fikkert , Paula & Elan B. Dresher & Aditi Lahiri 2006 Prosodic Preferences: From Old English to Early Modern English . In: Kemenade & Los (eds.): The Handbook of the History of English. Blackwell, 125-150.

Fischer-Jørgensen , Eli 1989 Phonetic analysis of the stød in Standard Danish . In: Phonetica 46, 1-59. 2001 Tryk i ældre dansk. Sammensætninger og Afledninger . Historisk-filosofiske Meddelelser 84, Det Kongelige Danske Videnskabernes Selskab, København.

Franzén , Vivan & Merle Horne 1997 Word stress in Romanian . In: Lund University Working Papers 46, 75-91.

Fretheim , Thorstein 1969 Norwegian stress and quantity reconsidered . In: Jahr & Lorentz (eds.), 1983: Prosodi / Prosody, (Studies in Norwegian linguistics 2). Oslo: Novus. 315-334.

Garellek , Marc 2011 The benefits of vowel laryngealization on the perception of coda stops in English . UCLA Working Papers in Phonetics 109, 31-39.

Gårding , Eva 1977 The Scandinavian Word Accents . Travaux de l’institut de linguistique de Lund 11. CWK Gleerup.

Geipel , John 1971 The Viking legacy: The Scandinavian influence on the English and Gaelic languages . Newton Abbot: David & Charles.

Givón , Thomas 2013 On the intellectual roots of functionalism in linguistics . In: Bischoff & Jany (eds.): Functional approaches to language . Walter de Gruyter, 9-29.

von Glasersfeld , Ernst 1990 Teleology and the concepts of causation . In: Philosophica, 46 (2), 17-43.

Goldsmith , John 1976 Autosegmental phonology . Dissertation. New York. Garland.

Goossens , Jan 1974 Historische Phonologie des Niederländischen . Max Niemeyer, Tübingen.

Gósy , Mária 2004 Fonetika, a beszéd tudománya . Osiris Kiadó, Budapest.

Görlach , Manfred 1991 Introduction to Early Modern English . Cambridge University Press.

Gress-Wright , Jonathan 2008 A simpler view of Danish stød . University of Pennsylvania, Working Papers on Linguistics, Volume 14/1, 191-200.

Grice , Paul 1975 Logic and conversation . In: Cole & Morgan (eds.): Syntax and Semantics, Vol. 3, Speech Acts. New York: Academic Press, 41-58.


Grønnum (Thorsen) , Nina 1988 Intonation on Bornholm – Between Danish and Swedish. ARIPUC 22, 26-138. 2005 Fonetik og fonologi – Almen og dansk . Akademisk Forlag, København, 3rd edn.

Grønnum , Nina & Hans Basbøll 2001 Consonant Length, Stød and Morae in Standard Danish. In: Phonetica 58, 230-253. 2009 Nye stød i dansk . In: Farø, Holsting, Larsen, Mogensen & Vinther (eds.): Sprogvidenskab i glimt: 70 tekster om sprog og teori i praksis. Syddansk Universitetsforlag, 26-32.

Guitart , Jorge 1976 Markedness and a Cuban dialect of Spanish . Washington DC: Georgetown University Press.

Gurevich , Naomi 2001 A critique of markedness-based theories in phonology . In: Studies in the linguistic sciences 31/2, 89-114.

Hadding-Koch , Kerstin & Arthur Abramson 1964 Duration versus spectrum in Swedish vowels. Some perceptual experiments . In: Studia Linguistica 18, 94-107.

Halle , Morris 1997 On stress and accent in Indo-European . In: Language, vol. 73/2, 275-313.

Halliday , Michael Alexander Kirkwood 1975 Learning how to mean: explorations in the development of language . London: Edward Arnold.

Halvorsen , Per-Kristian 1983 Tone in Norwegian polysyllables . In: Jahr & Lorentz (eds.): Prosodi / Prosody, (Studies in Norwegian linguistics 2). Oslo: Novus. 351-361.

Hammond , Lila 2005 Serbian, an essential grammar . Routledge.

Hansson , Petra 2003 Prosodic phrasing in spontaneous Swedish . Travaux de l’institut de linguistique de Lund 43.

Harrington , Jonathan 2006 An acoustic analysis of ’happy-tensing’ in the Queen’s Christmas broadcasts. In: Journal of Phonetics 34, 439-57.

Haspelmath , Martin 2005 Against markedness (and what to replace it with) . http://email.eva.mpg.de/~haspelmt/Againstmarkedness.pdf (retr. 2015-05-27) 2006 Parametric versus functional explanations of syntactic universals . http://www.eva.mpg.de/fileadmin/content_files/staff/haspelmt/pdf/Parametric.pdf (retr. 2015-05-27)

Haugen , Einar 1966 Semicommunication: The language gap in Scandinavia . In: A.S. Dil (ed): The Ecology of Language: Essays by Einar Haugen, Stanford, California, 1972. 215-236. 1967 On the rules of Norwegian tonality . In: Language 43, 185-202. 1982 Scandinavian language structures. A comparative historical survey . Tübingen. Max Niemeyer Verlag.

Haugen , Einar & Martin Joos 1952 Tone and intonation in East Norwegian . In: Jahr & Lorentz (eds.), 1983: Prosodi / Prosody, (Studies in Norwegian linguistics 2). Oslo: Novus. 179-201.

Hayes , Bruce 1982 Extrametricality and English stress . In: Linguistic Inquiry 13/2, 227-276.

Hedelin , Per 1997 Norstedts svenska uttalslexikon .

Heger , Steffen 1980 Stødregler for dansk . Danske Studier 1980, 78-99.

Herman , József 2003 Vulgáris latin . Budapest: Tinta Könyvkiadó.

Hogg , M. Richard 1992 Phonology and morphology . In: Hogg (ed.): The Cambridge history of the English language, volume 1. 67-167.

Hognestad , Jan Kristian 2012 Tonelagsvariasjon i norsk. Synkrone og diakrone aspekter med særlig fokus på vestnorsk . Doktoravhandling. Universitetet i Agder, Fakultet for humaniora og pedagogikk.

Hume , Elizabeth 2004 Deconstructing markedness: a predictability-based approach . In: Berkeley Linguistics Society 30, 182-198.

Hutton , John 1996 Optimality Theory and historical language change . Paper presented at the 4th Phonology Workshop, Manchester, England.

Hyman , Larry 1977 On the nature of linguistic stress . In: Hyman (ed): Studies in stress and accent. Southern California Occasional Papers in Linguistics 4. Los Angeles University of California. 37-81.


Iosad , Pavel 2015 Prosodic structure and suprasegmental features: Short-vowel stød in Danish . http://www.anghyflawn.net/pdf/stod.pdf (retr. 2015-11-01)

Jakobson , Roman 1949 Principles of Historical Phonology . In: L. R. Waugh & M. Monville-Burston (eds.): On language – Roman Jakobson, 184-201. Harvard University Press. 1960 The speech event and the functions of language . In: L. R. Waugh & M. Monville-Burston (eds.): On language – Roman Jakobson, 69-79. Harvard University Press. 1963 Implications of language universals for linguistics. In: L. R. Waugh & M. Monville-Burston (eds.): On language – Roman Jakobson, 152-63. Harvard University Press.

Kabak , Barış 2005 Acquiring phonology is not acquiring inventories but contrasts: The loss of Turkic and Korean primary long vowels . In: Linguistic Typology 8/3, 351-368.

Kager , René 1999 Optimality theory . Cambridge University Press.

Kant , Immanuel 2000 Critique of the power of judgment . Cambridge University Press.

Kiparsky , Paul 1985 Some consequences of lexical phonology . In: Phonology Yearbook 1, 85-138. 1995a The phonological basis of sound change . http://web.stanford.edu/~kiparsky/Papers/workshop.pdf (retr. 2015-05-27) 1995b Livonian stød . https://web.stanford.edu/~kiparsky/Papers/livonian.pdf (retr. 2016-05-22) 2013 From Germanic stress to Scandinavian pitch accent . A talk given at MIT (2013.09.20) in tribute to Morris Halle on his 90th birthday. https://www.youtube.com/watch?v=sAxB8xG12U8 (retr. 2016-05-15)

Kiss , Jenő & Pusztai Ferenc (eds.) 2005 Magyar nyelvtörténet . Budapest: Osiris Kiadó.

Klemensiewicz , Zenon 1974 Historia języka polskiego . Warszawa: Państwowe Wydawnictwo Naukowe.

Kloster-Jensen , Martin 1958 Bokmålets tonelagspar . Bergen. Grieg.

Kock , Axel 1878 Språkhistoriska Undersökningar om Svensk Akcent . Lund, Gleerup. 1901 Die Alt- und Neuschwedische Accentuierung unter Berücksichtigung der andern nordischen Sprachen . Karl J. Trübner, Strassburg.


Kotsinas , Ulla-Britt 1990 Svensk, invandrarsvensk eller invandrare? Om bedömning av ”främmande” drag i ”ungdomsspråk” . Andra symposiet om svenska som andraspråk i Göteborg 1989, 244-274.

Kristoffersen , Gjert 1992 Tonelag i sammensatte ord i østnorsk . In: Norsk Lingvistisk Tidsskrift 10, 39-65. 2000 The phonology of Norwegian . Oxford University Press. 2006a Tonal melodies and tonal alignment in East Norwegian . In: Bruce, Horne (eds.): Nordic Prosody IX. Frankfurt am Main. Peter Lang. 157-166. 2006b Markedness in Urban East Norwegian tonal accent . In: Nordic Journal of Linguistics 29, 95-135. 2007 Dialect variation in East Norwegian tone . In: Riad & Gussenhoven (eds.): Tones and tunes, vol. I: Studies in word and sentence prosody. Berlin. Mouton de Gruyter, 91-111.

Kuryłowicz , Jerzy 1936 L’origine de l’accentuation scandinave . In: Bulletin international de l’Académie Polonaise des Sciences et des Lettres 7-10, 133-152. 1952 L’accentuation des langues indoeuropéennes . Polska Akademia Umiejętności, Prace Komisji Językowej 37. Kraków. 1968 Indogermanische Grammatik II . Heidelberg, Winter.

Kusmenko , Jurij K. 2005 The history of quantity in the Scandinavian languages . In: Tijdschrift voor Skandinavistiek 26/2, 127-144.

Labov , William 2001 Principles of linguistic change: Social factors . Blackwell. 2007 Transmission and diffusion . http://www.ling.upenn.edu/~wlabov/Papers/TD.pdf (retr. 2015-06-01)

Ladefoged , Peter 1990 Some reflections on the IPA . In: Journal of Phonetics 18, 335-346. 2003 Phonetic data analysis. An introduction to fieldwork and instrumental techniques . Blackwell.

Ladefoged , Peter & Ian Maddieson 1996 The Sounds of the World’s Languages . Blackwell.

Lahiri , Aditi & Tomas Riad & Haike Jacobs 1999 Diachronic prosody . In: Hulst (ed.): Word prosodic systems in the languages of Europe. Mouton de Gruyter, 335-422.

Lahiri , Aditi & Allison Wetterlin & Elisabet Jönsson-Steiner 2005 Lexical specification of tone in North Germanic . In: Nordic Journal of Linguistics 28/1, 61-96.


Larsson , Nils 2003 Några förändringar i dialekten i Trögds härad under 1900-talet . In: Riad (ed.): Meddelanden från Institutionen för nordiska språk vid Stockholms Universitet: MINS 54, 17-35.

Lass , Roger 1992 Phonology and morphology . In: Blake (ed.): The Cambridge history of the English language, volume 2. 23-154. 1994 Old English: a historical linguistic companion . Cambridge University Press.

Leunissen , Mariska 2010 Explanation and teleology in Aristotle’s science of nature . Cambridge University Press. 2012 Teleology . In: K. von Stuckrad & R. Segal (eds.): Vocabulary for the study of religion, 1-4. Leiden: Brill.

Liberman , Anatoly 1982 Germanic Accentology, vol I: The Scandinavian languages . Minneapolis: University of Minnesota Press.

Linell , Per 1978 Vowel length and consonant length in Swedish word level phonology . In: Gårding, Bruce, Bannert (eds.): Nordic prosody I. Travaux de l’institut de linguistique de Lund 13, 123-136.

Linell , Per & Bengt Svensson & Sven Öhman 1971 Ljudstruktur: Inledning till fonologin och särdragsteorin . Gleerups.

Loyn , Henry 1977 The Vikings in Britain . Blackwell.

Maddieson , Ian 1984 Patterns of sounds . Cambridge University Press.

Malmberg , Bertil 1959 Bemerkungen zum schwedischen Wortakzent . In: Malmberg: Phonétique générale et romane. La Haye. Mouton. 1971, 192-207. 1962 Minimal systems, potential distinctions and primitive structures . In: Malmberg: Phonétique générale et romane. La Haye. Mouton. 1971, 141-146. 1968 La phonétique . Paris: Presses Universitaires de France.

Martinet , André 1937a La gémination consonantique d'origine expressive dans les langues germaniques . Thèse principale de doctorat d'Etat. Copenhague, Munksgàrd. 1937b La phonologie du mot en danois . Bulletin de la Société Linguistique de Paris 38, 169-266. 1955 Economie des changements phonétiques . Bern: A. Francke. 1962 A functional view of language . Oxford: Clarendon.


Maurud , Øivind 1976 Nabospråksforståelse i Skandinavia . Nordisk utredningsserie 13, Nordiska Råd.

McMahon , April 2007 Who’s afraid of the vowel shift rule? In: Language Sciences 29, 341-359.

Mohanan , Karuvannur Puthanveettil 1982 Lexical phonology . Doctoral dissertation. MIT.

Morén-Duolljá , Bruce 2013 The prosody of Swedish underived nouns: No lexical tones required . In: Nordlyd 40/1, 196-248.

Myrberg , Sara 2010 The intonational phonology of Stockholm Swedish . Stockholm studies in Scandinavian philology 53. Doctoral dissertation.

Nádasdy , Ádám 1989 Consonant length in recent borrowings into Hungarian . In: Acta Linguistica Hungarica 39, 195-213.

Noreen , Adolf 1904 Altschwedische Grammatik mit Einschluss des Altgutnischen . Max Niemeyer, Halle.

Noske , Roland 2009 Verner’s law, phonetic substance and form of historical phonological description . In: Proceedings of JEL’2009, 6èmes Journées d’Etudes Linguistiques, Nantes, 33-42.

Nyström , Staffan 2003 Grav accent i östra Svealands folkmål . In: Riad (ed.): Meddelanden från Institutionen för nordiska språk vid Stockholms Universitet: MINS 54, 7-15.

Oftedal , Magne 1952 On the origin of the Scandinavian tone distinction . In: Jahr & Lorentz (eds.), 1983: Prosodi / Prosody, (Studies in Norwegian linguistics 2). Oslo: Novus. 154-177.

Ohala , John 1983 The origin of sound patterns in vocal tract constraints . In: P. F. MacNeilage (ed.), The production of speech. New York: Springer-Verlag. 189-216. 1989 Sound change is drawn from a pool of synchronic variation . In: L. E. Breivik & E. H. Jahr (eds.), Language Change: Contributions to the study of its causes. 173-198. Mouton de Gruyter, Berlin. 1992 What’s cognitive and what’s not, in sound change . In Günter Kellermann & Michael D. Morrissey (eds.), Diachrony within synchrony: Language history and cognition. 309-355. Peter Lang Verlag. Frankfurt. 1993 The phonetics of sound change . In: C. Jones (ed.), Historical Linguistics: Problems and Perspectives, 237-278. London: Longman.

Ohlsson , Stig Örjan 1978 Nordisk språkhistoria och nordisk språkvård . In: Sprog i Norden. 17-40.

Olmstead , D L 1954 Achumawi-Atsugewi non-reciprocal intelligibility . In: International Journal of American Linguistics 20, 181-184.

Paley , William 1802 Natural theology . London: J. Faulder.

Panieri , Luca 2010 En mulig fonetisk forklaring på stødets opståen . In: Danske Studier 105, 5-30.

Passy , Paul 1891 Etude sur les changements phonétiques et leurs caractères généraux . Paris: Librairie Firmin – Didot.

Peperkamp , Sharon 2005 A psycholinguistic theory of loanword adaptations . In: Proceedings of the 30th annual meeting of the Berkeley Linguistics Society. 341-352.

Perridon , Harry 2006 On the origin of the Vestjysk stød . In: Amsterdamer Beiträge zur älteren Germanistik 62, 41-50.

Plato 2010 Timaeus and Critias . Digireads.com.

Pulleyblank , Douglas 1986 Tone in lexical phonology . Dordrecht: Reidel.

Pyles , Thomas 1993 The origins and development of the English language. Harcourt Brace College Publishers.

Quist , Pia 2000 Nydansk på Nørrebro . In: Mål & Mæle 23/3, 5-10. 2008 Sociolinguistic approaches to multiethnolect: Language variety and stylistic practice . In: International Journal of Bilingualism 12/1&2, 43-61.

Raphael , Lawrence J, Gloria J Borden , Katherine S. Harris 2007 Speech science primer: Physiology, Acoustics and perception of speech . 5th edition. Lippincott Williams & Wilkins.

Riad , Tomas 1992 Structures in Germanic prosody. A diachronic study with special reference to the Nordic languages . Doctoral dissertation, Stockholm University. 1998a Towards a Scandinavian accent typology . In: Kehrein & Wiese (eds.): Phonology and morphology of the Germanic languages. Tübingen, Niemeyer. 77-109. 1998b The origin of Scandinavian tone accents . In: Diachronica 15/1, 63-98. 2000 The origin of Danish stød . In: Lahiri (ed.): Analogy, leveling and markedness. Principles of change in phonology and morphology. Berlin and New York. Mouton de Gruyter, 261-300. 2003 Diachrony of the Scandinavian accent typology . In: Fikkert & Jacobs (eds.): Development in prosodic systems. Studies in generative grammar 58. Mouton de Gruyter. 91-144. 2009 Prosodi i svenskans morfologi . Manuskript. Department of Scandinavian languages, Stockholm University. 2013 The phonology of Swedish . Oxford University Press.

Rice , Curt 2006 Norwegian stress and quantity: The implication of loanwords . In: Lingua 116, 1171-1194.

Rice , Keren 2007 Markedness in phonology . In: de Lacy (ed.): The Cambridge handbook of phonology. Cambridge University Press, 79-98.

Ringe , Don 2006 From Proto-Indo-European to Proto-Germanic. A linguistic history of English. Volume 1 . Oxford University Press.

Ringgaard , Kristian 1983 Review of Liberman (1982) . In: Phonetica 40, 342-344.

Rischel , Jørgen 2008 A unified theory of Nordic i-umlaut, syncope and stød . In: North-Western Language Evolution 54-55, 191-235.

Roll , Mikael & Pelle Söderström & Merle Horne 2011 The marked status of Accent 2 in Central Swedish . In: Proceedings from ICPhS. Lund University, Department of Linguistics and Phonetics, 1710-1713.

Rubach , Jerzy 2008 An overview of lexical phonology . In: Language and Linguistics Compass 2/3, 456-477.

Russell , Bertrand 1945 A history of Western philosophy and its connection with political and social circumstances from the earliest times to the present day . New York: Simon & Schuster.

Sagart , Laurent 1999 The origin of Chinese tones . In: Proceedings of the Symposium Cross-Linguistic Studies of Tonal Phenomena, Tonogenesis, Typology and Related Topics. Institute for the Study of Languages and Cultures of Asia and Africa, Tokyo University of Foreign Studies, 91-104.

Sandøy , Helge 2005 The typological development of the Nordic languages I: Phonology . In: Bandle (ed): The Nordic languages – An International Handbook of the History of the Nordic Languages. DeGruyter, 1852-1871.

de Saussure , Ferdinand 1997/16 Bevezetés az általános nyelvészetbe . Corvina, Gyula.

du Sautoy , Marcus 2003 The music of the primes: searching to solve the greatest mystery in Mathematics . HarperCollins.

Scherer , Wilhelm 1878 Zur Geschichte der deutschen Sprache . Weidmannsche Buchhandlung. Berlin.

Selkirk , Elisabeth 1996 The prosodic structure of function words. http://ifa.amu.edu.pl/~grzegorz/egg2009papers/intro/Selkirk_Function_Words.pdf (retr. 2015-09-29)

Skautrup , Peter 1944 Det danske sprogs historie . København. Gyldendals.

Standwell , G. J. B. 1972 Towards a description of stress and tone in Norwegian words . In: Jahr & Lorentz (eds.), 1983: Prosodi / Prosody, (Studies in Norwegian linguistics 2). Oslo: Novus. 335-350.

Strangert , Eva 1978 Temporal aspects of rhythm in Swedish . In: Gårding, Bruce, Bannert (eds.): Nordic prosody I. Travaux de l’institut de linguistique de Lund 13, 103-108.

Ström , Anna 1998 Tonaccent i sydsvenska sammansättningar . ms Stockholm University.

Suzuki , Seiichi 1994 Final Devoicing and Elimination of the Effects of Verner’s Law in Gothic . In: Indogermanische Forschungen 99, 217-251.

Svendsen , Bente Ailin & Unn Røyneland 2008 Multiethnolectal facts and functions in Oslo, Norway . In: International Journal of Bilingualism, 12/1-2, 63-83.

Szigetvári , Péter 2006 The markedness of the unmarked . In: Acta Linguistica Hungarica, 53, 433-447.


Tési , Áron 2012 A svéd tonalitás lexikális és posztlexikális szabályai . In: Masát-Mádl (eds.): Skandinavisztikai füzetek 9. Budapester Beiträge zur Germanistik 63, 223-233. 2014 A svéd szóhangsúlyok szerkezete a fonetikai realizáció tükrében . In: Első Század Online 13/2, 175-185.

Thorén , Bosse 2003 Can V/C-ratio alone be sufficient for discrimination of V:C/VC: in Swedish? A perception test with manipulated durations . http://bossethoren.se/Kvotexp1.pdf (retr. 2015-09-18)

Townend , Matthew 2002 Language and history in Viking Age England: Linguistic relations between speakers of Old Norse and Old English . Brepols Publishers.

Trubetzkoy , Nicolai 1939/69 Principles of phonology. University of California Press. http://monoskop.org/images/7/73/Trubetzkoy_NS_Principles_of_Phonology.pdf (retr. 2015-05-27)

Tschirch , Fritz 1983 Geschichte der deutschen Sprache. Die Entfaltung der deutschen Sprachgestalt in der Vor- und Frühzeit . Dritte durchgesehene Auflage. Erich Schmidt Verlag GMBH, Berlin. 1989 Geschichte der deutschen Sprache. Entwicklung und Wandlungen der deutschen Sprachgestalt vom Hochmittelalter bis zur Gegenwart . Dritte ergänzte und überarbeitete Auflage. Erich Schmidt Verlag GMBH, Berlin.

Uneson , Marcus 2012 The Swedes are late as usual: On stress placement in parallel words in Danish, Norwegian, Swedish . University of Gothenburg. http://www.ling.gu.se/konferenser/fonetik2012/artiklar/Uneson_fon2012.pdf (retr. 2015-09-25)

Vennemann , Theo 2000 From quantity to syllable cuts: On so-called lengthening in the Germanic languages . In: Rivista di Linguistica, 12/1, 251-282.

Weinreich , Uriel 1953 Languages in contact: findings and problems (with a preface by Andre Martinet) . Linguistic Circle, New York.

Wetterlin , Allison 2007 The lexical specification of Norwegian tonal word accents . Doctoral dissertation. Universität Konstanz.

Wetterlin , Allison & Elisabet Jönsson-Steiner & Aditi Lahiri 2007 Tones and loans in the history of Scandinavian . In: Riad & Gussenhoven (eds.): Tones and tunes, vol. I: Studies in word and sentence prosody. Berlin. Mouton de Gruyter, 353-375.


Withgott , Meg & Per-Kristian Halvorsen 1988 Phonetic and phonological considerations bearing on the representations of East Norwegian accent . In: van der Hulst & Smith (eds.): Autosegmental Studies on Pitch Accent. Dordrecht: Foris Publications, 279-294.

Yip , Moira 1987 English vowel epenthesis . In: Natural language and linguistic theory 5, 463-484.
