MUM AND ME: A COMPARATIVE PHONOLOGICAL STUDY OF MATERNAL NURSERY TERMS AND FIRST PERSON OBJECT DESIGNATORS

by

Christopher David Ridley

A thesis submitted in partial fulfilment for the requirements for the degree of MA by research at the University of Central Lancashire

December 2018

STUDENT DECLARATION FORM

Type of Award MA by Research

School Humanities and Social Sciences

1. Concurrent registration for two or more academic awards

I declare that while registered as a candidate for the research degree, I have not been a registered candidate or enrolled student for another award of the University or other academic or professional institution

2. Material submitted for another award

I declare that no material contained in the thesis has been used in any other submission for an academic award and is solely my own work

3. Collaboration

Where a candidate’s research programme is part of a collaborative project, the thesis must indicate in addition clearly the candidate’s individual contribution and the extent of the collaboration. Please state below:

_

4. Use of a Proof-reader

No proof-reading service was used in the compilation of this thesis.

Signature of Candidate

Print name: Christopher David Ridley

ABSTRACT

It has been observed that there is a prevalence of nasal sounds for the maternal kin term across many languages. This was evidenced by Murdock (1959). There have been some attempts to explain this phenomenon. These range from speculation that nasal sounds may be the easiest for babies to produce when suckling to theories that all languages derive from a common root.

In classic Saussurian linguistics it is deemed that the relationship between the signifier and the signified is arbitrary (Bally & Sechehaye, 1916). However, research by Blasi et al (2016) indicates that some phonemes correlate with certain basic vocabulary. This study examines whether a particular manner of articulation may be invoked by a basic meaning. It looks at the prevalence of nasal sounds across unrelated languages used for the maternal kin term and examines whether there is a statistical correlation between these and the use of nasal terms for first person object designators. This study demonstrates that such a relationship exists. This relationship is statistically significant. The implication is that there is a deep meaning that can be ascribed to the nasal phenomenon that allowed for it to move from denoting the care giver to the first person. This may, to some extent, explain how languages evolved. The literature on evolution discusses the process of evolution from primate sounds, through phonemes and syllables, to words and grammar (Corballis, 2009). It may be that the evolution of phoneme sounds from one meaning to another was part of this process.

Nine languages were chosen for examination. For various practical reasons only seven were usable as statistical items. The analysis consisted of probability calculations to ascertain the likelihood of certain sounds occurring. Languages that had a nasally dominant “mum” term were more likely than would be expected by random occurrence to also have a nasally dominant “me” term.

CONTENTS

1 Introduction

2 Aims, Scope and Terminology

3 Analysis of Relevant Literature

4 Methodology

4.1 Problems with Data Collection

5 The Data

5.1 English

5.2 Hungarian

5.3 Turkish

5.4 Chinese

5.5 Japanese

5.6 Swahili

5.7 Arabic

5.8 Telugu

5.9 Tagalog

6 Data Analysis

7 Summary and Discussion

References

Appendices 1 – 5 (Each appendix has its own pagination)

ACKNOWLEDGEMENTS

I would like to thank my wife, Ildi, and my two dogs, Bertie and Suba, for their support throughout this enterprise.

I would also like to thank my supervisors, Dr Lee and Bürkle, for their help and advice.

LIST OF TABLES

Table 4.1 (Languages Selected)

Table 6.1 (Distribution of Consonants for “Mum” and “Me”)

Table 6.2 (Frequency of nasal consonants compared to frequency of non-nasal consonants)

Table 6.3 (Binary Distribution of Nasality)

Table 6.4 (Probability of Nasal Distribution)

Table 6.5 (Probability of Distributions Across Languages)

Table 6.6 (Ratio of Nasal to Total Number of Consonants)

1 INTRODUCTION

This thesis looks at a particular issue of comparative phonology. It explores the prevalence of nasal consonants in the maternal nursery term, first evidenced by Murdock (1959), and whether this can be statistically associated with nasal consonants in first person designators. Classic Saussurian linguistics maintains that there is an arbitrary relationship between the signifier and the signified (Bally & Sechehaye, 1916). However, this position does not allow for the prevalence of nasal sounds that Murdock found. In addition, Bancel et al (2015) also found a preponderance of maternal nursery terms using nasal phonemes.

In terms of the evolution of language Corballis (2009) put forward the notion that language could have evolved over a long period of time through stages from primate sounds. These originated in our hominid ancestors but evolution of the larynx allows homo sapiens sapiens greater control over the speech organs. In Corballis’s view there must have been an evolutionary process that allowed primate sounds to go through several stages to become modern languages. The move from the expression of primal emotions such as fear being evoked by a scream to particular phonemes forming syllables was a key phase of language development. A stage of this was particular phonemes being selected to be used in syllables which conveyed a meaning.

Work by Blasi et al (2016), Cuskley & Kirby (2013), Köhler (1929, 1947) and Ramachandran & Hubbard (2001) indicates that certain concepts may be indicated by certain sounds and these symbolic representations extend across languages. This raises the notion of cross modality in that a sound may evoke a deep meaning below the level of the word.

The research question is whether there may be a connection between the sound an infant makes when suckling and the sound the infant may then make to invoke the mother. This is because the infant associates the mother figure with food. By extension this concept may move from food to mother to “me”. The infant may then associate the same sound with themselves. If this is the case then we might expect that, for any given language, there should be a statistical correlation between the sound used in the maternal nursery kin term and the sounds used to refer back to oneself. Therefore, the testable hypothesis is that there is a positive correlation between the use of a nasal phoneme in a language in the nursery maternal kin term and in the first person object designator.

This study aims to see if a part of the process can be explained by particular phonemes, in some instances, carrying a deep meaning. We will look to see if there is a statistical correlation between languages that are nasally dominant for the maternal nursery term and phonemes used for the first person object designator. The study finds that languages that are nasally dominant in the maternal nursery term are also nasally dominant for the first person object designator. This relationship is statistically significant. The implication is that nasality of a phoneme may carry notions of care giving which may then extend to the notion of self.

2 AIMS, SCOPE AND TERMINOLOGY

This study aims to establish if there are statistically significant correlations between the use of nasal phonemes for maternal kin terms and for first person object designators across language family boundaries. In other words, is there a higher frequency of nasal phonemes used in the above terms than would be expected by chance?

It is first necessary to define some of the terms used. The maternal kin term is the informal mode of address used by young children for their mother. This is the vocative term such as the British English “mum” /mʊm/1. It is important to differentiate between the vocative term and the referential term. An example of referential use is British English “mother”. A person may use this to talk about their maternal kin but is unlikely to use it to address them directly. Majstrík (2010) analysed the British National Corpus and found many uses of the term “mother” but only a small proportion, compared to, say, “mum”, were used for addressing. There is no statistical evidence but it may be inferred from a perusal of Majstrík’s study that many of these were not modern examples. This distinction is important on a theoretical basis as “mum” and “mother” are different terms used in different contexts.

There is some description of the vocative in the literature which will now be discussed. The vocative can be described as a case of grammar. However, it is of particular usage in that it can exist outside the grammatical structure of the main clause. For example, in the sentence “Mum, can you pass me the sauce, please.”, the main clause is “can you pass me the sauce”, which is complete in English, consisting of a subject, verb phrase and two objects. “Mum” is a calling term and is referred to by “you” in the main clause but it is not a requirement to make the sentence grammatical. Contrast this with the nominative which only makes sense in the place of the sentential subject of a clause and is intrinsic to the grammar of the clause (Piper in Glušac & Čolič, 2017). The vocative may also be defined by its linguistic function; the vocative is used for direct naming and calling as opposed to the nominative which is used for narration or description (Glušac & Čolič, 2017). In addition, the vocative may be phonologically distinct from the nominative as it may occur with particular intonation forms (Babić in Glušac & Čolič, 2017).

1 // denote phonemes. A list is given in Appendix 1.

Further description is given by Moro (2003) who distinguishes a vocative phrase from the vocative case. The phrase is defined as a noun phrase that does not belong in the thematic grid of the predicate and is used to attract someone’s attention in the broad sense, such as in the example in the preceding paragraph. Perhaps the most useful definition is given by Leech (1999). He identifies formal, functional and semantic / pragmatic definitions of the vocative. Formally it is a nominal element. Functionally it behaves like a peripheral adverbial in that it is only loosely attached to the clause structure. Semantically / pragmatically it refers to the addressee of the utterance. For the purposes of this study a strict definition is not required. In the interviews the participants were asked “What did you call your “mum” when you were little?”. This is the term we are trying to compare.

We now attempt to define the term “first person object designator”. The first person object designator is the term used by a person to refer back to themselves. Note, this is not the same as the term used to refer to themselves as the subject of an utterance. In English “me” /mi/ is the first person object. Similar nasality is found in other words which refer back to the first person such as “my” /maɪ/. In other languages there may not be a direct translation of English “me” but there will be ways in which the language designates first person as an object or can indicate possession. In this study we have looked at first person object pronouns, first person possessive adjectives and first person possessive pronouns or morphemes which perform these functions.

The significance of this for the field of linguistics is that linguistics generally holds that the relationship between the signified and the signifier is arbitrary. This stems from the Saussurian tradition (Bally & Sechehaye, 1916). The sounds that make up a word bear no relation to what they signify. There is nothing in the sounds that make up the English word “table” /teɪbl,/ (the subscript comma here indicates a syllabic consonant) that relates to the physical object. It is entirely accidental that this collection of sounds signifies this object.

It is clear that for most words in a language this is the case and that a learner of this language has to learn each word independently with regard to its sounds and relate them to a particular meaning. However, this does not answer the question as to why these sounds are used for this particular meaning. Words develop and adapt throughout the history of any particular language and this development accounts for most modern day uses and sounds of words.

We may postulate that when humans first evolved they did not have language. We do not know when language first occurred in humans; Bancel & de l’Etang (2013) state that it may have begun to develop as long ago as 160,000 years before the present day. Whenever this development may have occurred, it would appear unlikely that people started speaking languages that were fully formed and complex.

Languages are based on sounds, except for sign languages. Whilst the individual phonemes in words may in general not carry meanings, individual sounds may do. A cry may indicate fear or surprise. At some point, humans took the step of using individual phonemes and putting them with others to create a meaning. Could it be that the choice of these phonemes carried some fundamental meaning? In other words, we could ask, is there sound meaning below the level of word meaning?

It would make sense to look at basic words which presumably carried basic meaning to see if some of these primeval meanings remain. The basic meanings looked at in this study are the maternal kin term; the way an infant would refer to their primary care giver; and the means by which a person can designate the first person as an object.

This study begins with a discussion of the relevant literature which places this research in its academic context. This will also include some statistical tests to establish the robustness of data referred to in the literature. The research methodology of this study is then described. This is followed by results and a discussion thereof.

3 ANALYSIS OF RELEVANT LITERATURE

In 1957 George Murdock published his “World Ethnographic Sample” (Murdock, 1957). This work divided the human population of the world into different ethnographic groups. Sample data were gathered from each group to create an ethnographic picture of the world. This included data regarding cultivation, agriculture, community organisation and many other aspects of ethnography. For each aspect each ethnic group was categorised according to the features it exhibited. One aspect of socio-linguistics was included. This was kinship terminology. In the course of this study Murdock recorded the terms used to refer to family members in 474 societies from around the globe.

The notion of kinship terms has a literature of its own. To set the background to this study we shall briefly consider some of the issues relevant to what follows. Wallace and Atkins (1960) consider some of these. Firstly, a matter which will occur several times in this study is that of the problem of translating from one language to another. With regard to kinship terms it is not always the case that one language will have a direct translation between terms. The set of relationships defined in one language by a term will not necessarily have a one-to-one mapping to a term in another language which will yield the exact same set. For example, the Hungarian term “bátya” does not have a translation into English. It means older brother but, as Wallace and Atkins point out, terms such as “older brother” are descriptive statements, not kinship terms. As such, they cannot be direct translations of a kinship term. This raises the notion that the terms “bátya” and “older brother” might be denotatively the same but they are connotatively different. They do not arouse the same psychological response in the user. Therefore, in this case we have a term in English, “brother”, which encompasses a broader set of members than the term “bátya” in Hungarian. Whilst this sibling terminology might not have direct relevance to our study the problem does occur with regard to the “mum” term. Below, we analyse Bancel’s data. In these data Bancel uses a system similar to Murdock’s for denoting kin terms. However, it is apparent that in some languages the same term is used to denote mother and mother’s sister. For our purposes we still consider this as denoting “mum”, as it does. As Wallace and Atkins point out, an element of ethnocentrism is unavoidable. In discussing maternal kin terms what this study is looking at is how the northern English English term “mum” is realised in different languages. It is beyond the scope of this study to clarify if the term in one language is identical to another. All we can state is that from this evidence this term in one language relates to a set of elements that overlaps to a great extent with the set implied by a term in another language. What we have, in effect, here is a problem of paradigms. In this case the different paradigms are languages. We cannot define a term in one paradigm by using the terminology of another. We have come up against a philosophical extension of Gödel’s incompleteness theorem. A system cannot demonstrate its own consistency. We cannot define a term in one language by using terms in another.

Furthermore, it may be necessary to set a scientific definition of what we mean by the “mum” term. Linguistically we have defined it as the nursery maternal vocative. However, we need to set a transcendental definition, as well, for the basis of this study. Similar to the work of Wallace and Atkins above, Greenberg (1990) reviews much of the work on kinship linguistics. He identifies a variety of factors which determine how a culture distinguishes relationships. These include the categories of generation, lineal relationship, collateral relationship, age difference in one generation, sex of relative, sex of connecting relative, sex of speaker, consanguineal relationship, affinal relationship and condition of connecting relative. He also notes that the more distant from ego the relationship the fewer the distinguishing features. For example, English uses the sex of a relative to distinguish siblings but not cousins. He also notes that a language that has distinctions for a relationship closer to ego will not have more distinctions along the same category for a relationship further from ego. For the purposes of this study the maternal kin term is for the relationship; one generation above ego, female, lineal.

If correlations are to be drawn regarding terms used by different social groups then we need to examine how Murdock selected the societies he looked at. This is important in eradicating instances of bias. For example, we may note correlations between languages but this tells us nothing regarding the intrinsic development of language if we know that these languages are related. The equivalent of the English word “mum” in French is “maman” /mæmɒɧ/ (the underlining here indicates nasalisation of the vowel). However, this tells us nothing about statistical relationships between languages as English and French are both of the same Indo-European language family. In other words, they share the same language root: proto-Indo-European (PIE). All this tells us is that the languages are related historically. To establish whether there are fundamental meanings between sounds shared across language families we need to look at unrelated languages. Whilst Murdock did not use language families as a major basis for organising his sampling of societies it would seem that they were drawn from a widespread sample. In his 1957 work he discusses the sampling problem and dismisses random sampling of all known cultures. There are several reasons given for this, including the problem of finding reliable information for some cultures and the danger of bias in getting more data from only a few related cultures. Murdock divided the world into six ethnographic regions which were themselves further sub-divided into ten areas giving a total of sixty areas in all. He tried to maintain recognised cultural boundaries between areas. Between five and fifteen cultures were then selected from each area based on criteria related to population, economy and linguistics, in order to have as widespread and representative a sample as possible of different types of society. Transplanted European societies were mostly avoided. The number of societies and their geographical spread, coupled with the exclusion of most transplanted European societies, means that the linguistic data are most probably drawn from a wide enough range of languages to minimise problems of bias in terms of related languages. Certainly, many of the languages would have been related but many of them would not be.

Jakobson (1962) refers to the work of Murdock. Jakobson refers to the interlanguage of motherese which carers and infants develop together. He notes that some terms from this interlanguage enter the standard lexicon. Examples of this are parental nursery kin terms. In British English these would be “mum” and “dad”. Jakobson asserts that the nasal sounds are the only sounds a baby can make when feeding. The baby eventually associates the sound with food. As the mother is the main carer in most circumstances the baby eventually equates the mother with food and correspondingly moves on to denoting the mother with a nasal term. Of course, it can be assumed that babies are encouraged by carers to apply a sound similar to the maternal kin term to the mother as part of stimulating language development.

Jakobson asserts that nasal sounds are the only ones that can be made whilst the baby feeds. It may also be conjectured that ease of articulation accounts for the preponderance of nasal maternal kin terms. The argument is as follows. Maternal kin terms will be among the first words produced by a baby or a language in its development. This is due to the requirement for babies to interact with their carers and for human groups to interact with each other. The first sounds likely to be produced will be the easiest ones to produce and these will form the first words. We have already observed that nasal sounds may be easiest for the baby to make when feeding and there is evidence for this from Grégoire (1937) and Leopold (1939). Are there any more objective measures of ease of articulation?

Locke (1972) carried out research into ease of articulation. Of 20 English consonants nasal phonemes did come out quite highly in one measure of ease of articulation which was correctness of articulation by three year olds. However, the plosive /t/ was ranked easiest; /n/ was ranked next jointly with /d/; and /m/ was ranked in the next position jointly with /b/. If ease of articulation accounts for why nasal phonemes would be used more frequently for maternal kin terms than other phonemes then would plosives not also occur as, if not more, frequently given that three year olds find some of them as easy to pronounce as nasal sounds? The argument for ease of articulation accounting for the high frequency of nasal sounds weakens further with Locke’s other data. In a test of motor ease /n/ occurred in fourth place and /m/ behind eight other consonants. In a rating of motor ease of articulation /n/ occurred in fourth place again and /m/ behind eight other consonants again. Another concept is ease of perception; it may be conjectured that the first sounds to attain meaning and become incorporated into the first words were those that were easy to perceive. Locke carried out a test of correct perception of consonants by three year olds. /n/ was ranked behind six other consonants and /m/ behind ten others. Whilst these results indicate that nasals are in the top half of consonants in terms of ease of production and ease of perception they are no higher than many other types of consonants. We may infer, then, that ease of articulation or perception cannot totally account for the high frequency of nasals in maternal terms. It should also be pointed out here that Locke’s study was solely with English speakers. We do not know if similar data would be reproduced in other languages. It might also be worth considering Jakobson’s (1962) conjecture regarding the apparently high frequency of plosive phonemes for the paternal kin term. This is that the first deictic use of language by a child is the paternal term, whereas the “mama” term signals a need for fulfilment. The use of plosives for one and nasals for the other follows from a simple communicative requirement: as there are only two terms, they need to be made as distinct as possible. The study of paternal terms is beyond the scope of this research and Jakobson’s argument does not explain why plosives are used for one term and nasals for the other if ease of articulation cannot be substantiated. It should be noted that Boyes-Braem (cited in McIntire, 1977) found, with regard to the acquisition of American Sign Language, that certain gestures were acquired earlier than others. This appears to be put down to the ease of movement of certain muscles for a growing child. McLeod and Crowe (2018), in a cross linguistic study of child acquisition of consonants, found that plosives and nasals were acquired earlier than fricatives or affricates. However, there appears to be little in the literature regarding spoken language phonology to substantiate an ease of articulation explanation for the acquisition of phonemes.

Further to the notion of ease of articulation is ease of recognition. In this case recognition implies the ability of the hearer to discriminate between two sounds. Might it be that nasal phonemes are used for the maternal kin term as they are easier to recognise than other phonemes? If so, they would be more likely to occur in basic terms in a language, as these will be the most salient sounds to the care giver, who will respond to them, and this will encourage the child to make the sounds more often in the care giver’s presence. There appears to be little research in the literature into ease of perception of particular sounds. However, some research by Martin & Peperkamp (unpublished) indicates that place and manner of articulation are more important than voicing in terms of word recognition. It should be pointed out here that this research was conducted in relation to French nouns.

We have no idea if it could be replicated for other word classes or for other languages. This may explain to some extent why nasals are used for one parental term and plosives for the other parental term, as they are different manners of articulation and this would allow for greater recognition of the difference between two basic terms in a language. However, it would not explain why there is a bias for nasal terms for the maternal term and plosives for the paternal term in the first place, only why there are different manners of articulation for each. There could be fricatives for the maternal term and plosives for the paternal.

Slonimska & Roberts (2017) found that interrogative words, interpreted as “wh” words in English, have similar sounds within a language to facilitate pragmatic inference. However, they were found to be less similar across languages. 226 languages were analysed across 66 families. Again, ease of intelligibility is accounted for phonetically between words in a language but this does not offer an explanation as to why those sounds exist to convey that meaning, only that within a language they will be similar.

Another explanation as to why nasal sounds are so prevalent in maternal kin terms is that all languages may be related. What do we mean by “related” in this context? Languages are related if they can be shown to descend from the same language. So, for example, English and German are related languages as they are derived from the same Germanic root. To take the argument further, English and Russian are related even though one is a Germanic language and the other Slavic. They are both part of the wider Indo-European language family (Dryer & Haspelmath, 2018). One feature of related languages is that they will share certain commonalities. For example, they may have similar pronunciations for words with similar meanings; eg /teɪbl,/ and /tɑblə/ are the English and French words for “table”. Therefore, if all languages were descended from a common ancestor this may account for similar pronunciations for basic words such as maternal kin terms.

This appears to be the argument of Bancel & de l’Etang (2010 & 2013) and Ruhlen (1994). The received wisdom of linguistics is that language change is hard to trace back more than about 6000 years. However, it is understood that human language has existed for many tens of thousands of years. Therefore, it is impossible to ascertain what proto languages may have existed before. Bancel & de l’Etang and Ruhlen argue that etymologies can be traced back much further and maintain that this indicates that all languages are descended from one original mother tongue. The alternative theory is that language developed in humans at different places at different times. The theory that all languages are descended from a common tongue is controversial in linguistics and Bancel & de l’Etang and Ruhlen’s theories have been widely criticised. In addition, Ruhlen’s data contains Hungarian and Finnish words which are incorrect2.

This casts doubt on his other data, as well. Furthermore, whilst questioning the statistical analyses of their critics, Bancel & de l’Etang do not provide any statistical analysis of their own work. Therefore, it is not clear whether the correlations they claim may have occurred by chance. Again, even if all modern languages are descended from a common ancestor and this explains sound correlations across basic vocabulary terms, this does not explain why those sounds were chosen in the first place. One may argue they are arbitrary but equally the question may be posited as to why the same sound may be dominant for two meanings. Perhaps the sound carries a meaning of its own and the two words have this deeper, sub-word level, meaning.

Another theory is that certain sounds may occur in basic vocabulary because they are imputed to have certain fundamental meaning. In classical linguistics, meaning has been examined at the word or morpheme level. In general no meaning is ascribed to the sounds per se and the explanation for different sounds for the same word in different languages is that they are arbitrary. However, we may posit the notion that before human languages developed human sounds existed. The sounds conveyed some fundamental notions such as fear, surprise, affection, etc. in much the way that facial expressions might. It could be that the first phonemes drew upon these sounds to express certain basic vocabulary items. We use sounds to express the basic emotions referred to before and we do not consider them to be part of language. However, the notion that sounds do not ever carry meaning in and of themselves can be countered. For example, if we look at the English word “walk” /wɔlk/ its past tense is “walked” /wɔlkt/. The past tense is realised by adding the phoneme /t/, a feature shared by Hungarian.

2 Ruhlen’s incorrect Hungarian words include “hārma”, “lūd”, “sem”, “vize”, “köve”, “lom” and “mon-d”. No explanation is given for the hyphen. The English glosses for these words are given as “three”, “bird”, “eye”, “water”, “stone”, “snow” and “say”. The correct Hungarian words are “három”, “madár”, “szem”, “víz”, “kő”, “hó” and “mond” respectively (Magay & Országh, 1994).

The work of Blasi et al. (2016) found that certain sounds can be associated with certain basic vocabulary items. Approximately two thirds of the world’s languages were covered and were sampled from all over the world. It is not clear, though, exactly what “covered” means in this context. It could mean analysed languages or just potential samples. One hundred basic vocabulary items were chosen. Blasi et al. found 74 positive and negative sound-meaning associations. The work has a robust statistical basis and Blasi et al. contend that it is improbable that these relationships are due to a common ancestor for a number of indirect reasons, e.g. there is no consistency as to the position of the sound in a word. To give a flavour of what they found, high front vowels were positively correlated with the concept of smallness.

We now consider how a sound may represent a concept. We may consider the notion of cross modality. Cuskley & Kirby (2013) discussed research which demonstrated correlations between certain concepts such as magnitude, visual angularity and taste, and certain sounds. For example, Köhler (1929, 1947) demonstrated that the word “maluma” was more likely to be associated with a rounded shape and “takete” was more likely to be associated with an angular shape. Making some assumptions from orthography we might speculate that plosives are associated with angularity and nasals with roundness. In 2001 Ramachandran and Hubbard found a correlation between the sounds “bouba” and “kiki” and rounded and angular shapes respectively. Therefore, in this case the issue might not be plosive versus nasal but might be voiced versus unvoiced; or it may be front versus back vowels. In any case, the point remains that certain sounds, or maybe combinations of sounds, seem to correlate with certain concepts.

Even if we can ascertain that certain sounds are associated with certain meanings, this does not explain why they are. It could be that close vowels leave only a small space for air to pass through the mouth and this indicates smallness (Blasi et al, 2016). However, anything beyond this is somewhat speculative. Neither does this help us understand how the evolution of language may have moved from sounds for emotively basic notions to using these sounds for more sophisticated communication.

It may be that the same sound was used in one word and the fundamental concept of the sound was taken over to be used in another word. Let us return to the nasals of the maternal kin term. The infant might use this sound because, as Jakobson pointed out, it is the easiest to produce when suckling. There is an association with food. A connection may then be developed between the mother and food. In addition, the infant may not realise to begin with that the mother and infant are not the same person. Indeed, they were not to begin with. Could it be that the sound used to invoke the mother might also be used by the infant to refer back to themselves? How would we test such a theory? If there is such a correlation we may hypothesise that languages that had a particular sound for the maternal kin term would be more likely to use the same sound to refer back to oneself.

Therefore, our testable hypothesis is that there is a positive correlation between a language using a nasal phoneme in the nursery maternal term and a nasal term in the first person object designator.

4 METHODOLOGY

The methodology consisted of the following steps.

1. The Murdock and the Bancel, de l’Etang, Ruhlen data were analysed to identify whether the preponderance of nasal phonemes in maternal nursery kin terms may have occurred by chance. This was done through the use of a chi squared test.
2. Languages were selected to be studied. The criteria for selection were that they were to be from different language families; they were to be as geographically distinct as possible; and there was a good likelihood of finding reliable data on them.
3. Data regarding the phonemes for maternal nursery kin terms and first person object designators in each language were collected through interview, survey and reference works. Interviews were conducted with language professionals where possible. Native speakers of the language were interviewed if possible. Further data were collected through Survey Monkey (Appendix 5). The interviews asked participants what term they used to address their mother as children. They were then asked to translate “Give me the book”, “This is my book”, and “This is mine” and were then asked to identify which elements of the target language gave the same notion as “me”, “my” and “mine” in the above.
4. Languages not deemed to be nasally dominant for the maternal nursery kin term were rejected from the study. This study hypothesizes that there might be a link between the nasal sound and its use for the mother figure, food and the first person object designator. Therefore, only languages with nasally dominant maternal nursery kin terms could be used.
5. For the remaining languages their first person object designators were classified as being nasally dominant or non-nasally dominant.
6. The probability of the result of (5) was then calculated for each language by comparing the likelihood of nasal dominance as opposed to the dominance of another type of consonant. For example, in English nasals comprise 1/8 of all possible consonants.
7. The probability of the distribution of languages being nasally dominant or non-nasally dominant for the first person object was then calculated. This was by way of a combination calculation of the possible combinations of the number of nasally dominant and non-nasally dominant languages within the given set of languages. This gave an indication as to the statistical significance of the results.
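Steps 6 and 7 can be illustrated with a short sketch. It is only one possible reading of the "combination calculation" and is not taken from the thesis: it assumes each language is treated as an independent trial with its own chance probability of a nasally dominant form (such as the 1/8 quoted for English). The function name and every example probability other than 1/8 are hypothetical.

```python
from fractions import Fraction


def prob_at_least(chance_probs, k):
    """Probability that at least k of the independent events occur.

    chance_probs holds, for each language, the assumed probability that
    its first person object designator is nasally dominant by chance.
    """
    # dist[j] = probability that exactly j languages are nasally dominant so far
    dist = [Fraction(1)]
    for p in chance_probs:
        p = Fraction(p)
        nxt = [Fraction(0)] * (len(dist) + 1)
        for j, mass in enumerate(dist):
            nxt[j] += mass * (1 - p)   # this language is not nasally dominant
            nxt[j + 1] += mass * p     # this language is nasally dominant
        dist = nxt
    return sum(dist[k:])


# Hypothetical illustration: three languages, the first using the 1/8 figure
# quoted for English; the other two probabilities are invented for the example.
example_probs = [Fraction(1, 8), Fraction(1, 5), Fraction(1, 6)]
p_two_or_more = prob_at_least(example_probs, 2)
```

A small result for the observed number of nasally dominant languages would suggest that the distribution is unlikely to have arisen by chance, under the independence assumption stated above.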

In order to conduct the study we needed to select languages to consider. The languages, in so far as possible, needed to be unrelated. This is because related languages might be expected to produce similar sounds for similar concepts. In stating that French and English use nasal phonemes for nursery maternal terms and first person objects we may simply be stating a feature of the fact that they are related. This cannot be said to demonstrate that there is something fundamental to the phonemes used. In addition, the languages should be selected from a relatively diverse geographical area. This should reduce the likelihood of borrowing of terms from each other. However, given the prevalence of inter-cultural communication in the internet age it is difficult to rule this out for any languages. Furthermore, there needs to be reliable data available to analyse on any particular language. Therefore, the languages need to be spoken by relatively large populations.

Taking the above constraints into account and bearing in mind the statistical limitations of the study, nine languages were chosen in total. These were English, Hungarian, Chinese, Japanese, Swahili, Arabic, Telugu, Turkish and Tagalog. Using the World Atlas of Language Structures (WALS) these were classified in the following language families.


(Languages Selected)

Language                                                         Family
English                                                          Indo-European
Hungarian                                                        Uralic
Turkish                                                          Altaic
Chinese (classified by WALS as a genus rather than a language)   Sino-Tibetan
Japanese                                                         Japanese (some classifications, eg Ethnologue, place Korean in this family)
Swahili                                                          Niger-Congo
Arabic                                                           Afro-Asiatic
Telugu                                                           Dravidian
Tagalog                                                          Austronesian

Table 4.1

In terms of data collection the following techniques were used. Data were collected from published dictionaries, grammars and other linguistic works associated with the target languages. These were correlated with data drawn from corpora of language use. Further data were drawn from interviews with native speakers. In the first instance native speakers who were language professionals were approached. This was because language professionals were deemed more likely to understand the meta-linguistic terms used in the interviews such as “first person subject designator” and it was also thought that they may be able to shed light on how the target language formed the equivalent concepts to other languages. In some cases, lay people were interviewed to provide a wider data base. Further data were collected through the use of Survey Monkey. In both the interviews and the survey participants were asked to consider the following utterances in English: “Give me the book”, “my book” and “This is mine”. In each case they were asked to indicate the pronunciation of the element that was equivalent to the “me”, the “my” and the “mine” in the English utterances respectively. In the case of interviews, recordings were made and the interviewer was able to ask the participants to repeat terms and was able to ascertain if the participant understood their instructions. The interviewer was also able to ask questions such as what dialect the participant spoke. This type of questioning was not possible in the Survey Monkey. Full data collected through the survey are given in Appendix 5.

The first stage was to ascertain if the target language used a nasal phoneme in the nursery maternal term. If not, the language could not be used. The next stage was to identify if a nasal term was used in the language for the first person to refer back to themselves. The use of first person subject terms was not considered relevant. Once those data were collected they were analysed to ascertain if there was a statistically significant correlation between the phonemes in the maternal kin term and the first person object usage.

4.1 Problems with Data Collection

A problem identified before the data collection was started was the fact that words in unrelated languages do not necessarily have a direct one to one translation between them. It may be that language A uses the same word in two contexts whereas language B uses one word in one context and another word in the other. In addition, unrelated languages can use grammar in very different ways. For example, the first person object pronoun in English is “me”. However, there is no equivalent one to one translation into Hungarian. In English we say, “Give me the book”. The equivalent Hungarian utterance is as follows (the glosses are derived from the Leipzig Glossing Rules, Max Planck Institute for Evolutionary Anthropology, 2015).

Add          nek-em            a             könyvet
Give-2-IMP   to me-1-OBJ-DAT   the-ART-DEF   book-ACC
“Give me the book.”

“Add nekem a könyvet.” The “nek” expresses the idea of transfer and the “em” part refers back to the speaker. However, another ending to “nek” could refer to a different person, eg

Add          neki                  a             könyvet
Give-2-IMP   to him/her-3-OBJ-DAT  the-ART-DEF   book-ACC
“Give him/her the book.”

“Add neki a könyvet” would mean “Give him/her the book”. In addition, not only object pronouns can refer back to the speaker. In English we might use possessive adjectives such as “my”. Another language may use a variety of forms to do this. The question is, can we find any consistency in pronunciation in this regard?

The problem of unpicking the grammar of a language to find equivalences is not the only issue in terms of data collection. In addition, reference sources may give the orthography of an equivalent term but what we need to find is the pronunciation. Orthography may give an indication of this but cannot necessarily be relied upon. In addition, some of the subject languages do not use a Latin script. It is a limitation of this study that this researcher is unable to read Chinese, Japanese or Arabic script so in this regard the work is made more problematic. Some reference works give an indication of phonology but this is often not referenced to the International Phonetic Alphabet (IPA). Furthermore, there may be no description of how particular phonemic symbols may be pronounced. It is up to the researcher to infer this. Some reference sources have an audio element and in these instances the problem is circumvented.

Another point to bear in mind is that, whilst some of these languages, eg Hungarian, are relatively homogeneous, some are not. Chinese has a number of distinct dialects as does Arabic. Each dialect may have different phonology for the terms under study. Furthermore, these languages have a “high” version which has status in society and is used in education, politics, the law etc, and it is this version which is generally found in published reference works. However, it is not the version that an infant may be expected to use to their mother. Some interviewees asked a number of times if I wanted a standard translation from the English into their language and needed to be reminded that the study was concerned with the vernacular dialect they used with their mother as a child. In all, care must be taken in selecting suitable equivalents across languages.

Despite the languages chosen having a large number of speakers (at least 10 million and in some cases hundreds of millions), there were large discrepancies in the availability of reliable data. For some languages there was a wealth of data available and for some relatively little. In addition, the researcher has used his own knowledge of the subject languages, which is as follows. English is his mother tongue. Hungarian is his second language. He has knowledge of a little Turkish. He has personal acquaintances from whom he can draw knowledge of these languages. His notion that there may be some sound correlations is based on his familiarity with these languages, and this is why they were chosen as subjects of study. This may bias the results a little but this has been taken into account. A wider study may even out such biases. Consideration was given to selecting German, French and other Indo-European languages but it is unlikely that this would have affected the results greatly. Similarly, the selection of Finnish rather than Hungarian may have affected the results but there is no reason to believe it would. It may be that the selected languages are unrepresentative of other languages in their family but there is no reason to believe this.

Further to the intrinsic problems mentioned above there were issues with finding interviewees. Again, for some languages these were available but none were found for some others. In some cases information was obtained from language professionals who were not native speakers of the subject language. In others lay people were interviewed or surveyed. It may be that in some cases they could have given instances that did not correlate closely enough with the English terms. For example, a referential term rather than vocative might have been given for the maternal kin term. In other cases there may have been mistranslations of the first person term. In the discussion of the data collection for each language below it will be explained how these problems were addressed.

Therefore, there is no neat table we can draw up of one to one correlations between English sounds and their counterparts in other languages on which we can then perform statistical analyses. This study will discuss each language in turn and explain the conclusions reached regarding which terms in each subject language are valid.

5 THE DATA

5.0 Murdock and Bancel, de l’Etang, Ruhlen Data.

In 1959 Murdock published the linguistic data from the “World Ethnographic Sample” in “Cross-Language Parallels in Parental Kin Terms”. In this Murdock took the terms used by small children in his sample societies to address their parents. He referred to these as “mama” and “papa” terms. These terms were then classified according to the “consonant class”. The first syllable of the term was used to categorise the word. Only in the cases where another syllable was considered the root would a syllable other than the first be used. All terms thought to be borrowings from European languages were excluded. This yielded thirteen consonant classes, including one of no consonant, see Appendix 2. Of these thirteen consonant classes three were nasals. If phoneme occurrences in particular words are entirely arbitrary then we can assume that any particular phoneme would be as likely to occur as any other. Certain languages may have a higher frequency of certain sounds and this problem is examined later. To extend this argument to the Murdock data we could state that, as there are thirteen consonant classes, we would expect 3/13 (c 23%) of the maternal terms to use nasals. In fact, 298 of the 531 (c 56%) maternal terms were nasals.
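The expected-versus-observed comparison above can be checked with a chi squared goodness-of-fit calculation. The following is a minimal sketch and not necessarily the computation used in this study: it assumes the data are collapsed into two categories (nasal and non-nasal), one degree of freedom, and the conventional 5% critical value of 3.841. The function and variable names are illustrative only.

```python
def chi_squared_two_way(observed_hits, total, expected_share):
    """Chi squared statistic for a hit / non-hit split against an expected share."""
    expected_hits = total * expected_share
    expected_misses = total - expected_hits
    observed_misses = total - observed_hits
    return ((observed_hits - expected_hits) ** 2 / expected_hits
            + (observed_misses - expected_misses) ** 2 / expected_misses)


# Murdock figures quoted in the text: 298 nasal terms out of 531,
# against an expected share of 3/13 if consonant classes were equally likely.
murdock_chi2 = chi_squared_two_way(298, 531, 3 / 13)

# Assumption: 3.841 is the 5% critical value for one degree of freedom.
CRITICAL_5_PERCENT_1_DF = 3.841
murdock_significant = murdock_chi2 > CRITICAL_5_PERCENT_1_DF
```

Under these assumptions the statistic comes to roughly 327, far above the critical value, which is consistent with the preponderance of nasals being unlikely to have occurred by chance.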

In addition to the Murdock data there is also the research discussed in Bancel & de l’Etang (2010, 2013). In these articles Bancel & de l’Etang refer to the work of Merritt Ruhlen. It should be stated here that in the article “Back to Proto-Sapiens (Part 2)” (Bancel et al, 2015) there is no explicit reference to Ruhlen but it can be inferred that the data are Ruhlen’s from the comments in “Back to Proto-Sapiens (Part 1)” (de l’Etang, 2015). The data consist of the kin terms “papa”, “kaka”, “nana” and “mama”. Unlike in the Murdock data, no explicit description is given of the pronunciation. However, it may be inferred from the text of the article that “mama” and “nana” have nasal consonants and “papa” and “kaka” have plosive consonants. The data are drawn from 1,184 languages. There is little indication given of the sampling methodology but a wide range of languages from different language families and from a wide geographical spread across the globe are used. A total of 1632 instances of “mama” or “nana” were recorded as being used for kin terms. In total, these were spread across 16 kin terms, see Appendix 4. If the “mama” and “nana” terms were spread evenly across all 16 kin terms it would be expected that 1/16 of them designated the mother. Note that in some cases the term designated both the mother and the mother’s sister. For the purposes of this study we consider the instances where the term designates the mother, and where it designates the mother and mother’s sister, as being one kin term. In all, of the 1632 instances of “mama” or “nana”, 706 designated the mother.
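As a worked illustration of the arithmetic implied above, and assuming (this is not stated in the source data) that the comparison is collapsed to two categories, mother versus any other kin term, the expected counts and a chi squared statistic with one degree of freedom would be:

```latex
E_{\text{mother}} = \frac{1632}{16} = 102, \qquad
E_{\text{other}} = 1632 - 102 = 1530

\chi^2 = \frac{(706 - 102)^2}{102} + \frac{(926 - 1530)^2}{1530}
       \approx 3576.6 + 238.4 \approx 3815
```

This is far above 3.841, the conventional 5% critical value for one degree of freedom, so on this reading the concentration of “mama” and “nana” terms on the mother is very unlikely to be a chance distribution.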


5.1 English

English is a member of the Indo-European language family. It is of the genus Germanic (WALS). English is spoken in many parts of the world and by many people as a second or foreign language. 371 959 910 speak it as a first language and 611 563 010 as a second (Ethnologue). The variety selected as a subject for this study is that of northern England. This subsumes a number of dialects given in Ethnologue. However, through personal experience the author knows the pronunciations given here are reflective of much of the language spoken throughout the British Isles. The data are drawn from the author’s own pronunciations (he is a native speaker of the target dialect) and his observations of those around him.

The vocative maternal kin term is /mʊm/. This is the observed standard but variations exist such as /mʊmɪ/. For the purposes of this study /mʊm/ is used and this is the benchmark against which other vocative maternal kin terms in the study will be matched. A nasal phoneme is used and, therefore, we can use English for our analysis. Indeed, nasals are the only consonants which occur. The use of “mummy” in northern England English is evidenced in Theakston, Lieven, Pine and Rowland (2001) in the CHILDES database. This is drawn from infant speech which is relevant for this study in terms of language development.

For the first person object pronoun English gives us /mi/ as in, “Give me the book”. For the first person possessive determiner we have /maɪ/ as in, “This is my book”. In both instances in connected speech the target terms may be realised as /mɪ/. The first person possessive pronoun is /maɪn/. In all these instances the only consonants are nasals, as for the maternal kin term. Indeed, in all but one instance the consonants are bilabial nasals. For an academic reference for the use of “m” in first person singular in English see Nichols and Peterson (2013). Theakston, Lieven, Pine and Rowland (2001) also give evidence of “me” and “my”. Incidentally /m/ is the only consonant for the first person present tense form of “be” in English. The use of “me” and “my” is evidenced in children’s literature (Woollard & Murphy, 2014). Survey Monkey was not used for English as it was considered that there would be too many responses to process. Interviewees were not sought as the researcher is a native speaker of the language himself and is a language professional. In this case we see that nasal consonants, and none other, are used for the vocative maternal term and the first person object terms relating back to the speaker.

5.2 Hungarian

Hungarian is of the Uralic language family. It is of the genus Ugric (WALS). It has approximately 12.5 million speakers (Ethnologue) of whom about 10 million live in Hungary. Some distinct dialects are spoken outside of Hungary. The variety for the subject of this study is Hungarian as spoken in Hungary. Again, this subsumes a number of dialects given in Ethnologue but the researcher’s own personal experience gives him to understand that the terms given here are standard throughout the country. The examples given are drawn from the researcher’s own experiences in speaking with Hungarians. Hungarian is his second language.

The vocative maternal term is /mɒmɒ/. A nasal phoneme is the only consonant present. Therefore, we can use Hungarian for our analysis. Other evidence for this is Bodor (2004) available on CHILDES. This is recorded in Hungarian orthography as “mama” but the phonemic realisation of this is as given above. “a” in Hungarian is pronounced as /ɒ/.

The first person object pronoun is /eɧgem/. This clearly contains nasal consonants but pronouns are rarely used in Hungarian compared to English. They are only used when the person is not clear from the context or the other words used; Hungarian inflection often indicating person with regard to verbs and nouns. More commonly an affix is used to indicate person. “Give me the book” is realised as /ɒdd nƐkƐm ɒ kəɧvƐt/. The element referring to the first person object in this case is /Ɛm/. The vowel quality may vary to respect the vowel harmony rule of Hungarian grammar but the consonant remains /m/. “This is my book” is rendered as /ɒz én kəɧvƐm/.

Az             (no copula in Hungarian)   én             könyvem
This-ART-DEF   is                         my-1POSS-ADJ   book-OBJ-1POSS
“This is my book.”

The element referring to the first person is again /m/ which is a suffix connected to the word for “book”. /én/ is an emphasiser which is again first person only in its reference. Hungarian does have a translation for “mine” which is /Ɛnjém/. In all these instances we see that the parts of speech which refer back to the first person use nasal consonants. The only phoneme which is not nasal is /j/. This is usually classified as a consonant but is often termed a semi-vowel as there is no complete closure of the air passage as with proper consonants. Evidence of the use of /m/ as first person genitive ending is given in Bodor. There is also a transcription of a child using “enge” which is not a word in Hungarian. It is not clear whether the “m” is just missing off the end of this word or it is a mispronunciation, such as below for “akarom”. The use of “engem” is given in Kaszás & Elek (2017). The translation of “enyém” as “mine” is given in Magay & Országh (1994).

Whilst the following data is not directly correlatable with data from other languages as it is not a close translation of the terms used above it provides further evidence of /m/ being used in a first person objective sense. The Bodor data provides an example of a two year old infant saying “akom”. From the author’s first hand knowledge of infant speech this is probably an attempt at “akarom” and is probably being pronounced as /ɒkom/. The /m/ ending is a first person reference and is used when the verb takes a direct object, including, as in this case, when the object is implied. The fact that this is infant speech is useful in that we can correlate it with nursery speech used in the maternal kin term.

There were no responses for Hungarian via Survey Monkey. Interviewees were not sought as the researcher has direct knowledge of the language through familial contacts and speaks it as a second language.

This evidence has been gathered by the researcher through his own experiences but is also verified in Hall (1944) and Nichols & Peterson (2013).

5.3 Turkish

Turkish is of the Altaic language family. It is of the genus Turkic (WALS). It has approximately 70 000 000 native speakers of whom most live in Turkey. There are a number of distinct dialects (Ethnologue).

The vocative maternal nursery term is “anne”. Evidence for this is given in Slobin & Bever (1982) and can be located in the WALS database. The pronunciation is /ʌnnƐ/. In this case the only consonant is alveolar, not labial, but it is still nasal. Incidentally, the standard term for “mother” in Hungarian is /ɒnjɒ/. Hungarian and Turkish were once thought to be related but this is no longer the case.

Turkish has an /m/ ending to refer back to the speaker, eg “benim” (Slobin & Bever, 1982) evidenced in WALS, “ben” being first person subject singular. Similarly, “I am English” is realised as /bƐn ɪɧɪlɪzɪm/, the /m/ ending referring back to the speaker. To indicate “to me”, “bana” is used (Slobin & Bever, 1982) as a dative. This is pronounced /bʌnʌ/. Turkish orthography is quite phonemic which is helpful to the researcher. Here, we have a non-nasal consonant in initial position but the other consonant is nasal. We may speculate whether the /b/ in initial position in such words evolved from a bilabial nasal but that is beyond the scope of this study. Ruhlen (1994) similarly states that “én” in Hungarian derived from “mén” but gives no explanation of how or why this may have happened, nor any reference for it. The UCLA (University of California, Los Angeles) Phonetics Lab Archive gives “benim” as a translation of “mine”. This is accompanied by a recording of a native speaker of western dialect Turkish and the pronunciation is rendered /bƐnɪm/. UCLA give further evidence of the /m/ ending with first person verbs such as “ederim” /ƐdƐrɪm/ translated as “I do” or “I will” (The UCLA Phonetics Lab Archive, 2007).

The following reference works supply references as given. Gates (2002) gives “benim” as a translation of “my”. Bayram & Jones (2006) gives “anne” as a translation of “mum”, and “benim”, “beni” and “bana” as translations of “me”. Bab.la gives “bence”, “beni” and “bana” as translations of “me”. Nichols and Peterson (2013) note the use of “m” in first person singular in Turkish as does Lewis (1967).

There were no responses regarding Turkish via Survey Monkey. One participant was interviewed.

5.4 Chinese

WALS classifies Chinese as a genus of the Sino-Tibetan family. The WALS database offers ten families in the genus of Chinese. Ethnologue classifies Chinese as a macrolanguage and offers a number of families contained within it. There are approximately 1 300 000 000 speakers of this macrolanguage. This points to one of the problems which was encountered in practice during data collection. The variety of Chinese used by most Chinese people to communicate in formal situations and with those from other parts of China is Mandarin. However, many speakers use their own variety of Chinese at home. As the subject of this study is infant speech, it is this home variety which is relevant rather than Mandarin. However, some reference to Mandarin may help to contextualise what was found out via interviews and questionnaires.

The Oxford University Press “Pocket Chinese Dictionary” (1999) translates English “mum” as “māma”. No indication is given explicitly as to pronunciation. However, as this is a bilingual dictionary aimed at helping English people speak Mandarin then we may assume some kind of English pronunciation using bilabial nasals for the consonants. In the same publication “me” is translated as “wŏ”. Subsequent verification through interview would suggest that this is an aspirated semi-vowel followed by /aʊ/. Huang (2010) translates “me”, “myself” and “oneself” as “běnrén”. Subsequent research indicates a pronunciation of /bƐnrǝn/, not hugely dissimilar to /bƐnɪm/ in Turkish.

The following data were ascertained by interview. The equivalent of northern England English “mum” in Chengdu dialect, described as “close to Mandarin”, is given as /mæ/, /mæm/ and /mmæ/. In Cantonese the equivalent is /mæmæ/ with a rising and falling intonation and in Mandarin it is /mæmæ/ with a flat intonation. In any case, the vocative maternal nursery term appears to have only nasal consonants.

In looking for first person object references the translation of “Give me the book” was sought. The element of the utterance referring back to the speaker, ie the equivalent of “me” in the English utterance, was given as approximating to /wɑ/ or /ʊwɑ/ in Mandarin and /ɧo/ in Cantonese. A translation of “my book” gave the equivalent of the “my” in English as approximating to /wɒd/ or /woǝdǝ/ in Mandarin and /nogƐ/ in Cantonese. For the utterance “This is mine” the equivalent of the “mine” was as follows: /wɒd/ or /woǝdǝ/ in Mandarin and /nogƐ/ in Cantonese. The reason for the differences in the terms in Mandarin may be due to different accents or how the interviewee interpreted what they were being asked for. The /woǝdǝ/ pronunciation for “my” and “mine” equivalents was corroborated by data collected through Survey Monkey.

Nichols and Peterson (2013) state that Mandarin has no first person /m/. The WALS gives no indication in this regard concerning Cantonese and the Chengdu dialect is absent from the WALS database.

The UCLA corpus has a sound recording of a bilabial nasal in Mandarin and the example word given is “mā” which is translated into English as “mamma”.

5.5 Japanese

The WALS classifies Japanese as a language, a genus and a family. Ethnologue terms the family Korean – Japanese – Okinawan and the sub-family as Japanese – Okinawan. Approximately 120 000 000 people speak Japanese (Gordon, 2005).

From interview the data obtained was as follows. The vocative maternal nursery term is /mæmæ/. However, in discussion with a language professional it was pointed out that there was a more traditional term, /okææsǝ/; and an even older term, /hæhæuweɪ/. There was some speculation that /mæmæ/ may be a western import. /okææsǝ/ was given by another interviewee as the term they used as a child to address their mother. Therefore, there may be some uncertainty as to whether Japanese fulfils our requirement of having a nasal phoneme in the vocative maternal kin term. The term “okaasan” was also elicited by Survey Monkey.

In the translation of “Give me the book” two utterances were offered. One was described as more common and in this case the first person object is implied. The implication is given by /tʃͻdaɪ/. For speakers of a relatively analytic language, eg English, this notion of implication may be difficult to comprehend. However, let us consider the Hungarian phrase “Szeretsz?”. This translates into English as “Do you love me?”. We see that English requires four words to the Hungarian one. How is this achieved? The questioning element is provided by intonation. The verb “love” has a base form “szeret” and the “sz” ending makes it second person personal singular. There is no explicit first person object. The “me” is implied. Hungarian could add a first person object but this is rarely done, and only if there were some doubt as to the object of the verb. To make the first person object explicit in this case Japanese uses /wætæʃɪnɪ/. The /nɪ/ ending is the dative case. In the translation of “my book” the first person element is given as /wætæʃɪno/. The /no/ ending indicates the genitive case.

In the translation of “This is mine” the first person element is given as /wætæʃɪnomono/. The /mono/ ending indicates a demonstrative.

Published references give the following indications. The Collins Pocket English Japanese Dictionary gives “mum” as “mámὰ”. Pronunciation may be inferred from the orthography. “me” is rendered as “watákushi wo / ni”, with “wo” for direct and “ni” for indirect objects.

The all Romanized English-Japanese Dictionary (1974) gives “me” translated as “watashi ni/ o”. There is an indication that the initial sound is /ʊ/.

Martin (2008) gives “me” as “watakushi” and “mum” as “mama”.

Survey Monkey data corroborated the first person terms above.

On the CHILDES corpus there is a transcription by Miyata (2012b) of her child’s early utterances and these include “mamma” and “Mama”. Again, pronunciation must be inferred and there is no indication as to whether these are words. The child was 9 months and 11 days old at the time. “mama” is also given by Okayama (1970). Ota (2003) on the CHILDES corpus has a recording of a child of age 1 to 2 years saying /mææmæ/. However, it is not explicitly stated what this word is. On the same corpus Yokoyama & Miyata (2017) similarly give “mamma”.

WALS states that Japanese has no “m” in first person singular (Hinds, 1986).

There were three responses for Japanese through Survey Monkey.

There were two interviewee participants.

5.6 Swahili

WALS categorises Swahili as of the genus Bantoid within the family Niger-Congo. Ethnologue categorises Swahili as an Atlantic Congo language within the family Niger-Congo. africanlanguages.com estimates the number of speakers as being between 50 and 100 million.

Safari (2012) translates English “mother” as “mama”. Safari indicates the pronunciation of the consonant, a bilabial nasal. However, there is no indication of usage and it is not clear if “mama” is for vocative or referential use. africanlanguages.com translates English “mum” as “mama”; and “me” as “mimi” respectively.

WALS indicates that Swahili has an “m” in first person singular (Ashton, 1947).

UCLA have a recording of /mɑmæ/ of which the English translation is given as “mama”. A translation of “mine” is transcribed as t͡ʃɑ́ŋɡu. It is not indicated whether this is IPA but it would seem that the third phoneme is a velar nasal. There is also a sound recording of /tʃæɧgu/ which is glossed as “my” in English. This word is in isolation but in an example of connected speech it is realised as /zæɧgu/. At this stage we can only speculate as to why this may be the case. Some words can change pronunciation in connected speech from how they occur in isolation. For example, in English “will” is rendered as /wɪl/ in isolation but just as /l/ in “I’ll”; “my” is /maɪ/ when stressed but can be /mɪ/ in connected speech. In another UCLA recording it is rendered as /ænjægu/. This was a recording of a conversation whereas the others were discrete utterances. Difference in dialect may also account for these discrepancies.

There were no Survey Monkey responses for Swahili.

There were no interview participants for Swahili.

5.7 Arabic

Ethnologue categorises Arabic as a Semitic language of the Afro Asiatic family.

WALS gives 21 Arabic dialects of the genus Semitic and the family Afro Asiatic. However, for only one dialect, Egyptian Arabic, is information given regarding “m” in first person singular where it states there is none.

By interview the following data were gathered. In the Moroccan dialect the equivalent of northern England English “mum” is /mæmæ/. For the translation of “Give me the book” the equivalent of “me” was realised as /nɪ/. The consonant is a nasal. However, the term book finishes with a vowel and it may be that the consonant is inserted to preserve sound structure in the language. Other languages insert consonants to do this. For example, Hungarian has an instrumental case which is realised by adding /l/ to a word preceded by a vowel. However, if the root word itself ends with a vowel /v/ is inserted before the penultimate vowel. This preserves the general CVCV… structure of Hungarian (Consonant Vowel ….). Nevertheless, the nasal is there. It might be of interest to ascertain if the object were not first person would a nasal be interjected, assuming this is a sound structure interjection.

With regard to the translation of “This is my book”, the “my” morpheme was rendered as the suffix /ɪ/ on the object. In standard Arabic the suffix was /i/.

With regard to “This is mine”, the “mine” was rendered with the word /djælɪ/, whereas in standard Arabic it was /li/.

With regard to the Damascene dialect of Levantine Arabic, northern England English “mum” is rendered as /emmɪ/. In “Give me the book”, “me” is realised as the suffix /jɒ/.

In “This is my book”, “my” is realised as the suffix /jɒ/. In “This is mine”, “mine” is realised as /jɒ/.

From published reference works the following were collected. Quitregard (1994) translates English “mother” as “umm”. The text states this is a transliteration but no explicit indication is given as to the phonemic values of the script.

Steingous (1882) translates English “mummy” as “múmija-t”. This may be more vocative as opposed to Quitregard’s which might be referential. bab.la translates northern England English “mum” as /mƐmƐ/ and “me” as /ǝm/. The website provides sound recordings and these are the author’s transcriptions of them.

The CHILDES provides examples of an infant speaking Kuwaiti Arabic. This is a sound recording and the author’s transcription is /mææmæ mǝmǝ/. The translation on CHILDES into English of this utterance is “mum”. Other phonemic realisations given the translation “mum”, “mama” or “mummy” are /mææmǝ/, /mæmæ/, /nænæ/, /mʊmæ/, /jʊmæ/, /jƐmmæ/ and variations thereof. How are we to explain so many differences in one dialect? It may be that there are different accents within the dialect. This may explain some differences but in some cases the same infant produces different pronunciations. It may be that just as in English we have slightly differing maternal kin terms, “mum” and “mummy”, so in Arabic the same occurs. All these words mean “mum”. Another explanation pertains to the age of the participant infants. Many of them are only two years old and it may be that they are still practising forming the words and so discrepancies appear. It may also be the case that the utterances they make are assumed to be words. In some cases the “mum” word is part of a longer utterance and it may be clear that the infant is addressing their mother but in some cases the infant is only making one-word statements. It may be that the transcriber misinterpreted a sequence of sounds as a word when it is not.

A search on the CHILDES for “me” and “my” produced a /jƐ/ ending for “give me”. This would correlate with the /jɒ/ suffix found in interview. The range of vowels may be due to different dialects but it may also be that Arabic has a greater tolerance in this regard than many other languages. However, the same infant produced the ending /neɪ/ in another part of the same recording. How are the two consonant sounds to be explained, assuming we classify /j/ as a consonant? In terms of production for /j/ the tongue approaches, but does not touch, the alveolar ridge. Therefore, it could be that, as the two places of articulation are the same, the subject language community perceives the two sounds as interchangeable, in much the same way that Japanese speakers consider [l] and [r] ([] indicating phones) to be allophonic. If this seems strange for English speakers, the fact that they consider the /u/ in “moon” and “boot” to be the same sound would be absurd for a Hungarian speaker to whom they are distinct phonemes, one of them being closer than the other.

For “me” the CHILDES produces /ænæ/ for the word in isolation, which is also translated as “am” in some cases.

For “This is mine” CHILDES produces /mɪ/ for the “mine” element.

Survey Monkey elicited “mama” for “mum”; “ee” and “I” for “my”; “ilaya”, “li” and “le” for “me”; and “ili”, “li” and “le” for “mine”. Participants were asked to give an indication of pronunciation using a Latin alphabet so it could be that the pronunciation of some of these morphemes is the same.

The UCLA corpus yielded no data related to vocative maternal nursery terms or first person object designators.

It can be seen from this summary of the data that Arabic poses particular issues in this research due to the diversity of forms of the language that exist.

There were three Arabic responses through Survey Monkey.

There were two interview participants for Arabic.

5.8 Telugu

Telugu is a language of the South Central Dravidian genus of the family Dravidian. This is also the classification given by Ethnologue. 70 000 000 speakers are given in Gordon (2005). 5 000 000 speak it as a second language. 19 distinct dialects are listed.

Google translate gives the translation of “mum” as “am’ma”; and the translation of “me” as “nāku”. However, no indication is given as to the phonemic status of these symbols. An orthographic inference may be that the “m” and “n” indicate nasal sounds but this is not certain.

The WALS database gives no indication of the use or non-use of nasals with regard to pronouns for Telugu.

The CHILDES database contains no corpora of Telugu.

The UCLA corpus for Telugu contained no entries related to vocative maternal nursery terms or first person object designators.

No speakers of Telugu were found with whom interviews could be conducted. There were no responses to Survey Monkey for Telugu.

There were no interview participants for Telugu.

5.9 Tagalog

Gordon (2005) gives the number of speakers of Tagalog as 15 million. It is classified as an Austronesian language. Eight distinct dialects are listed.

WALS also classifies Tagalog as a member of the Austronesian family of the genus Greater Central Philippine.

The following data was derived from published references. Perdon (2013) translated “mother” as “ina”. Hawkins (2011) translates English “mother” as “nanay”. These are orthographical renderings but we may infer the nasal phoneme. However, “mother” is not a vocative nursery term. Barrios (2014) translates “my” as “ko”. It may be inferred from the orthography that the phoneme is a plosive. This is supported by other reference works dealing with the orthography and pronunciation of Tagalog eg Ramos & Cena (1990).

The following websites yielded further data. tagalogtranslate gives the phonemic realisation of English “mum” as /man/. Tagalog-dictionary translates “mum” as “mama”, “mamaya”, “man” and “mana”. It also translates English “me” as “ma”, “maya”, “may” and “mayo”.

Professor Kie Zuraw of UCLA (in private correspondence) gave translations of English “me” and “my” as “ko”, and a translation of English “mine” as “aking”. She also gave translations of English “mum” as “nanay”, “nay”, “inay”, “mama” and “mommy”. However, Professor Kie Zuraw is not a native speaker of Tagalog.

Ramos & Cena (1990) give the first person singular pronoun subject as “ako”. However, we are more interested in the object form. They also give two “non-subject” pronouns for the first person singular, “ko” and “akin”.

The WALS indicates that Tagalog has no “m” in the first person singular (Schachter and Otanes, 1972).

There are no Tagalog entries in the CHILDES corpora.

The UCLA database gives /ɪna/ for “mother”. It also gives /əqoɪ/ for “me”. There were no responses for Tagalog through Survey Monkey.

There were no interview participants for Tagalog.

6 DATA ANALYSIS

This chapter begins with an analysis of the Murdock data as set out in the Methodology chapter. In response to the data Murdock states that, “… the hypothesis under test would appear decisively validated.” (Murdock, 1959, p 4). However, it is not clear what the hypothesis under test is. Even so, for the purposes of this study our observation is that nasal phonemes are more represented in maternal nursery terms than can be explained by random distribution.

This can be tested by a chi squared goodness of fit test. Strictly speaking, for a chi squared test to be valid the following criteria must be met: the sample must be a simple random one; the variable under study must be categorical; and the number of observations in each level of the variable must be 5 or greater. Murdock, as briefly alluded to above, rejected random sampling as this may have resulted in biases in his choice of cultures. However, with regard to languages from these cultures the sample fulfils the criterion as the reduction of bias in the selection of cultures means there is a broad base of languages which will contain a large proportion of unrelated languages. In other words, any consonant class would be as likely as any other to occur in the maternal kin term if the distribution were random. Consonant classes are categorical. They have non-numerical values. The observations for each level are well in excess of 5. Therefore, all criteria for the use of a chi squared test are fulfilled.

The null hypothesis is that the observations result purely from chance. The alternative hypothesis is that the observations are influenced in some manner which means the results are not random; they do not result purely from chance.
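The mechanics of such a goodness of fit test can be sketched briefly. The following is a minimal illustration only, assuming just two categories (nasal and non-nasal) so that there is one degree of freedom. The counts are placeholders invented for the example and are not Murdock's figures, and the function names are hypothetical. The calculations actually used in this study are those in the appendices.

```python
import math

def chi_squared_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def p_value_one_degree_of_freedom(statistic):
    """Upper-tail probability of a chi squared value with 1 degree of freedom.

    With one degree of freedom the statistic is the square of a standard
    normal variable, so the tail probability is erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(statistic / 2.0))

# Placeholder counts for illustration only (NOT Murdock's data).
observed = [60, 40]       # hypothetical: nasal, non-nasal
expected = [50.0, 50.0]   # assumption: an even split under the null hypothesis

statistic = chi_squared_statistic(observed, expected)
p_value = p_value_one_degree_of_freedom(statistic)
reject_null = p_value < 0.05  # the one in 20 threshold

print(f"chi squared = {statistic:.2f}, p = {p_value:.4f}, reject null: {reject_null}")
```

With more than two categories the statistic is computed in the same way, but the tail probability requires the chi squared distribution for the appropriate degrees of freedom, which a statistical package would normally supply.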

The chi squared calculations by hand, http://vassarstats.net/csfit.html and R (CRAN) are in Appendix 3. The results demonstrate that the probability of the distribution of consonant classes with regard to nasality in the maternal nursery kin term being by chance is less than one in 100,000. Usual scientific inquiry in the social sciences would take a probability of less than one in 20 as being good grounds to assume that the results were not by chance until further evidence came to light. For the purposes of this study it will be assumed that this result is not by chance and the null hypothesis is rejected. This implies the following question: what causes maternal kin terms to include nasals as their main type of consonant class?

With regard to the Bancel, de l’Etang, Ruhlen data, once again, the probability of this distribution occurring by chance can be calculated. The suitability of a chi squared test is assessed. From the data given we do not know what Ruhlen’s sampling technique was but we will assume that if there were no non-random factors then “mama” or “papa” would be as likely to apply to any kin term as any other. If the relationship between the signifiers “mama” and “papa” to the kin terms is arbitrary we would expect an even distribution. The kin terms are qualitative labels and are non-numerical. Therefore, they are categorical. The number of terms for each level is in excess of five. Therefore, a chi squared test can be used.

The null hypothesis is that the observations result purely from chance. The alternative hypothesis is that the observations are influenced in some manner which means the results are not random; they do not result purely from chance.

The chi squared test calculations by hand, SPSS and R are given in appendix 4. These indicate that the probability of this many instances of “mama” and “nana” designating the mother occurring by chance is less than 1 in 100,000. The null hypothesis is rejected and for the purposes of this study it is assumed that the distribution indicates that non-random factor(s) influence the result. Again, this raises the question, what causes “mama” and “nana” terms to be much more likely to occur as maternal kin terms than other kin terms?

Reviewing the analysis so far: the Murdock data indicate that nasal consonant classes are much more prevalent for maternal kin terms than chance allows for. The Ruhlen data indicate that kin terms with nasal sounds are much more likely to designate the mother than other family members. The question now remains as to why this should be so.

The aim of this research is to establish if there is a statistical correlation between use of nasal consonants in vocative maternal nursery terms and first person designators across languages from different families. Of the nine languages studied we can collect all the maternal kin terms and all the first person designators and establish the ratio of nasal consonants to non-nasal consonants with respect to each term within each particular language. We thereby establish the table below (Frequency of nasal consonants compared to frequency of non-nasal consonants). Both orthographic and phonemic data are recorded. Orthographic entries are indicated with “” and phonemic with //. Orthographic entries which appear to be designating the form of the previous vowel, eg “y” in “nanay”, are not counted. Phonemic entries which appear to indicate an affricate, eg /tʃ/, are classed as one phoneme. V indicates that there may be more than one vowel which could be the place holder.

A count of all the consonants produces a total of 192. These were divided between the “me” term and the “mum” terms, 112 and 79 respectively. Of the 112 “me” consonants, 46 were nasals. Of the 79 “mum” consonants, 68 were nasals.

(Distribution of Consonants for “Mum” and “Me”)

Number of consonants = 192

| Term | Total consonants | Nasal | Non-nasal |
|---|---|---|---|
| “me” | 112 | 46 | 68 |
| “mum” | 79 | 68 | 11 |

Table 6.1

At first glance this seems to be a very high proportion of nasal consonants. However, this does not mean that the “mum” and “me” terms are necessarily related in any way. It could just be coincidence that they have similar sounds. Furthermore, we do not know if these phoneme frequencies are reflective of frequencies in the languages spoken or are peculiar to these linguistic items.

To begin our analysis we shall count the possible number of consonants in all our languages under consideration. From Phoible Online (a database of the phonological characteristics of languages) we have a total of 71 possible consonants which could be used for our two linguistic terms. Of these 71 IPA consonants, 3 are nasals which occur for the “mum” and “me” terms. Therefore, assuming an even frequency of consonant occurrence across the languages we may expect a probability of 3/71 for any consonant slot to be occupied by a nasal. Dividing each of our observed totals of consonants by 71 and multiplying by 3 gives our expected frequency of nasals for each term, ceteris paribus.

For “mama”;

79(3/71) ≈3.3

For “me”;

112(3/71) ≈4.7

Our observed frequencies of 68 and 46 are considerably in excess of those expected.
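The arithmetic above can be reproduced with a short sketch. The figures are those given in the text (71 possible consonants, 3 nasals, and the observed totals of Table 6.1). The variable names are illustrative only, and the even frequency of consonants is the simplifying assumption already stated above.

```python
from fractions import Fraction

POSSIBLE_CONSONANTS = 71  # total drawn from Phoible Online, as stated above
NASALS = 3                # nasals occurring for the "mum" and "me" terms

# Assumption from the text: every consonant is equally likely in any slot.
p_nasal = Fraction(NASALS, POSSIBLE_CONSONANTS)

# (observed nasals, total consonants) for each term, from Table 6.1
observed = {"mum": (68, 79), "me": (46, 112)}

expected = {term: float(total * p_nasal) for term, (_, total) in observed.items()}

for term, (nasal_count, total) in observed.items():
    print(f"{term}: expected {expected[term]:.1f} nasals, observed {nasal_count} of {total}")
```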

It is not the purpose of this study to arrive at a precise calculation of the probability of the occurrence of particular phonemes in particular linguistic terms. However, it is relevant to examine the probability of such outcomes being produced at random. The above model can be criticised on the following grounds: Not all the languages share the same set of consonants and some have more than others. In other words, the set of possible consonants that could be used in any given language is smaller than 71. It may also be the case that the three nasal phonemes do not occur in every language either. This means that the expected frequencies may be much lower than those above. In addition, we have uneven numbers of entries for each term in each language. For example, in Arabic we have 13 entries for “mum” but only 2 for English. It would not be surprising for the terms within a language to share common features. Therefore, the data may be biased. Furthermore, the distribution of phonemes in any particular language is unlikely to be even. In other words, a language may have a higher frequency of nasal consonants compared to plosives, even if many plosives do exist in that language. Therefore, nasals would be more likely to occur in any selected term.

Frequency of nasal consonants compared to frequency of non-nasal consonants

| Language | Recorded vocative maternal kin terms | Frequency of nasal consonants / total frequency of consonants | Recorded first person object designators | Frequency of nasal consonants / total frequency of consonants |
|---|---|---|---|---|
| English | /mʊm/, /mʊmɪ/ | 4/4 | /mi/, /maɪ/, /mɪ/, /maɪn/ | 5/5 |
| Hungarian | /mɒmɒ/ | 2/2 | /Vm/, /en/, “enge”, /ƐɧgƐm/, /Ɛnjem/ | 7/10 |
| Turkish | /ʌnnƐ/ | 2/2 | /Vm/, /bʌnʌ/, “beni”, “bence”, /bƐnɪm/ | 6/11 |
| Chinese | “māma”, /mæ/, /mæm/, /mmæ/, /mæmæ/, “mā” | 10/10 | “wo”, /wɑ/, /ʊwɑ/, /ɧo/, “běnrén”, /wɒd/, /woǝdǝ/, /nogƐ/ | 4/14 |
| Japanese | /mæmæ/, /okææsǝ/, /hæhæuweɪ/, “okaasan”, “mámὰ”, “mɑmɑ” | 12/19 | /tʃͻdaɪ/, /wataʃɪnɪ/, /wataʃɪno/, /wataʃɪnomono/, “wɑtὰkushi wo,no”, “wɑtɑshini,o”, “wɑtɑkushi” | 7/30 |
| Arabic | /mæmæ/, /emmɪ/, “umm”, “múmija-t”, /mƐmƐ/, /mææmæ/, /mǝmǝ/, /mææmǝ/, /nænæ/, /mʊmæ/, /jʊmæ/, /jƐmmæ/, “mama” | 25/29 | “ee”, “I”, “ilaya”, “li”, “le”, “ili”, /nɪ/, /ɪ/, /i/, /djælɪ/, /li/, /-jɒ/, /ǝm/, /jƐ/, /ænæ/, /mɪ/ | 4/14 |
| Swahili | “mama”, /mɑmæ/ | 4/4 | “mimi”, /tʃɑɧgu/, /tʃæɧgu/, /zæɧgu/, /ænjægu/ | 6/14 |
| Telugu | “am’ma” | 2/2 | “nāku” | 1/2 |
| Tagalog | “ina”, “nanay”, /man/, “mama”, “mamya”, “man”, “nay”, “mommy” | 14/15 | “ko”, “ma”, “maya”, “may”, “mayo”, “aking”, “akin”, /ǝqoɪ/ | 6/12 |

Table 6.2

To help deal with some of these problems each language will be analysed individually to try and approximate the probability of the observed frequency of nasals in our two terms. Beginning with English, we see that for “mum” we have two entries. The only consonant that occurs is /m/. There are three possible nasals in English out of a total of 24 consonants which gives us a probability of 1/8 of a nasal being used for any particular term. Similarly, for the “me” term we only have nasals, again a probability of 1/8. If we take the classical view of linguistics and assume that there is no relationship between the signified and the signifier, so that any phoneme in the language can designate any term, then we can assume that the occurrence of particular consonants in “mum” is independent of the occurrence of particular phonemes in “me”. Therefore, the two events are independent and we can multiply their probabilities to ascertain the probability of both events occurring.

(1/8)(1/8) = 1/64
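As a brief check of this product, the following sketch uses exact fractions. It relies on the independence assumption stated above, and the variable names are illustrative only.

```python
from fractions import Fraction

ENGLISH_CONSONANTS = 24
ENGLISH_NASALS = 3

# Probability of a nasal for one term, assuming all consonants are equally likely.
p_nasal = Fraction(ENGLISH_NASALS, ENGLISH_CONSONANTS)

# Under the independence assumption the joint probability is the product.
p_both = p_nasal * p_nasal

print(p_nasal, p_both)
```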

On its own this figure does not reveal very much. There are hundreds of thousands of words in the language and very many pairs of them will use the same consonants. We could equally argue that there would be a similar probability of “mum” and “no” sharing nasals. It is true that words such as “mum” and “mine” in the data have more than one consonant slot and these are both nasals but we could also argue that so do “moan”, “minimum”, etc. We could find many instances of such combinations. Indeed, it would be impossible given the volume of words to not find any.

However, the fact remains that in our data for the two terms only nasal consonants were found. It may be more profitable to try and turn our data into a meaningful statistic by trying to binarise it. If we classify the occurrence of nasal terms as a success, this can be deemed to be a “1”. The absence of nasal terms is then indicated by “0”. So, for English we have (1,1). Both the “mama” and the “me” term are characterised by nasals.

Moving on to Hungarian we, again, only have nasal consonants for the “mama” term. Hungarian has 26 consonants. It may be worth noting that there may be some discrepancies between sources regarding the number of consonants in a language as we noted with Arabic. These differences may be due to different dialects. It is also possible that allophones may be recorded as different sounds, eg [p] in English “spill” compared to [pʰ] in “pill”. In addition, some sounds are used in imported words. For the purposes of this study sounds only used in imported words are excluded as they would not occur in the vocabulary items under examination. Allophones are generally classed as one phoneme if it is reasoned the native speaker community would interpret them as the same sound, eg [p] and [pʰ] in English. Some discretion has been applied by the author in some cases to make a judgement on what to class as a discrete phoneme but this has only occurred in a small number of instances. Of the 26 possible Hungarian consonants 3 are nasal. Therefore, there is a probability of 3/26 of a nasal occurring in any particular consonantal slot. For the “me” term we have 10 events, 7 of them being nasals. How should we categorise this data? It appears to be neither 1 nor 0 in our binary system. We can see that the observed frequency of 7/10 is much higher than the expected frequency of 3/26. Therefore, we can modify our binary system to state that

1 = nasal dominant

0 = not nasal dominant

Obviously, this does not reflect differences between much higher and only a little higher but it still gives us a system that might be workable. In this case Hungarian achieves a 1 with respect to the “me” term and we have a record of (1,1) for Hungarian. Even if we strip out the entry “enge” which is probably an infant’s approximation of /ƐɧgƐm/ we have an observed frequency of 6/9 which would not change our result.

With respect to Turkish, we have only nasals present for the “mum” term and record a 1 here in our binary system. For the “me” term we have a frequency of 6/11 for nasals. If we strip out “beni”, assuming from its orthography that it is the same word as /bƐnɪm/, then the ratio becomes ½. In order to find phonological data we use PHOIBLE Online as our source. Phoible gives 23 possible consonants of which 2 are nasals for an expected frequency of 2/23. The observed frequency of ½ exceeds 2/23 and we record a 1 for Turkish “me” giving a Turkish binary entry of (1,1). As all the “me” terms in any particular language are related in meaning we may assume that this might account for similarities in pronunciation. Therefore, the use of probability analysis is not strictly mathematical as the events are not independent; hence, the notion of a meaning being characterised by a predominant consonant is used. However, this is contextualised by reference to the number of possible consonants in the language.
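The comparisons of observed and expected nasal frequencies made for Hungarian and Turkish can be set out as a small sketch. The counts are those quoted in the text and in Table 6.2. The function name is hypothetical, and the comparison is offered only as context: the binary judgement in this study also rests on the qualitative considerations discussed for each language.

```python
from fractions import Fraction

def compare_to_expected(nasal_events, total_events, nasal_phonemes, consonant_phonemes):
    """Return (observed, expected, observed > expected) as exact fractions.

    Assumption: every consonant of the language is equally likely in a slot,
    so the expected proportion is nasal_phonemes / consonant_phonemes.
    """
    observed = Fraction(nasal_events, total_events)
    expected = Fraction(nasal_phonemes, consonant_phonemes)
    return observed, expected, observed > expected

# (nasal events, total events, nasal phonemes, consonant phonemes)
cases = {
    "Hungarian 'me'": (7, 10, 3, 26),
    "Hungarian 'me' without 'enge'": (6, 9, 3, 26),
    "Turkish 'me'": (6, 11, 2, 23),
}

comparisons = {label: compare_to_expected(*counts) for label, counts in cases.items()}

for label, (observed, expected, higher) in comparisons.items():
    print(f"{label}: observed {float(observed):.2f}, "
          f"expected {float(expected):.2f}, higher: {higher}")
```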

With respect to Chinese, there are only nasals present for the “mum” term so a 1 is recorded. For the “me” term we have 5 semi-vowels, 4 nasals, 1 approximant and 4 plosives. This mix may be explained by the fact that at least three dialects were used in the data collection. Without more detailed data on the distribution of these consonants throughout Chinese the results are inconclusive and no manner of articulation is dominant so a 0 is recorded. Phoible records 13 Mandarin, 11 Cantonese and 13 Chengzhou consonants, each with three nasals.

With respect to Japanese, 12/19 recorded consonants were nasals. However, looking at the data we may conclude that /mæmæ/, “mámὰ” and “mɑmɑ” are, in fact, the same word. So only using one reduces our count to 8/15. Similarly, we may conclude that /okææsə/ and “okaasan” are the same word. So, if we take /okææsə/ over “okaasan” (chosen as it is a direct recording from an interview), /hæhæuweɪ/ and /mæmæ/ as terms for “mum” then we get 2 nasals, 1 plosive, 3 fricatives and one semi-vowel. As there is not one dominant manner of articulation, let us say not one manner forms over half of events, then for the purposes of this study Japanese cannot be used as it is a requirement that the “mum” term is nasal.

With respect to Arabic, we have 13 recorded versions of the “mum” term with a total of 25 nasal consonants out of 29 events. Many of these appear to be the same word with differences which might be attributed to dialect, a wide range of tolerance with regard to vowels or infant approximations of words. In order to help us in our analysis of phonemes the database phoible is used. This gives the phonemes used in many world languages and dialects thereof. The issue of tolerance is highlighted by the fact that for one of the dialects used in this study, Moroccan, phoible records only 3 vowels. If only a few vowels are recognisable in the target speech community it may be that their range of tolerance is very high and what are perceived by the recorder as different vowels are, for the speaker, the same vowel sound. Therefore, we may modify our Arabic entries by discounting the vowel and record our entries as follows.

/mVmV/, /VmmV/, “Vmm”, “mVmVjV-t”, /nVnV/, /jVmV/

This gives a total of 11 nasal events out of a possible total of 14. The other manners of articulation are two semi-vowels and one plosive. The nasal consonants are dominant for the “mum” term so Arabic can be used in our study. To contextualise, phoible lists 19 or 21 possible consonants in Arabic depending on dialect, of which 2 are nasals. Using the Moroccan dialect with 21 consonants this gives a probability of 2/21 of a nasal event.

For the “me” term in Arabic 16 items are recorded. Of these some only have vowels. As the “me” component seems to be an affix in many cases it may be that “ee”, “I”, “ilaya”, “li”, “le”, “ili”, /ɪ/, /i/, /li/ are examples of the same event. For example, the consonant could be interjected to fit the affix on to a word that finishes in a vowel which is unnecessary if it finishes in a consonant. Similarly, /-jɒ/ and /jƐ/ could be interpreted as the same event. This then reduces the list of entries to

“VlVyV”, /nɪ/, /djælɪ/, /əm/, /ænæ/, /mɪ/

This gives us 4 approximants, 4 nasals and 1 plosive. Given that there are only two possible approximants in Arabic it is hard to state, in the absence of wider information regarding the frequency of particular consonants in the language, whether nasals are dominant as opposed to approximants for the “me” term. As the results are inconclusive, we record a 0 for the Arabic “me” term in our binary system.

With respect to Swahili, only /m/ is recorded as a consonant in “mama” entries and the language can, therefore, be used in our study. For the “me” term there are three entries with the same ending /-ɧgu/. According to www2.ku.edu/~kiswahili this ending corresponds to first person singular possessive. The preceding phonemes are determined by the nature of the noun being referred to, not the person. Therefore, our entries can be reduced to “mimi”, /-ɧgu/, /ænjægu/. This gives a total of four nasal, two plosive and one semi-vowel for the consonant placings. This, together with the fact that Swahili is listed as having an “m” for the first person pronoun in WALS, allows us to state that, for our purposes, Swahili is nasal dominant and, therefore, a 1 in our binary scheme.

With respect to Telugu there is one entry for the “mum” term; “am’ma”. This only contains nasals. There is one entry for the “me” term; “nāku”. This contains one nasal and one plosive. However, there are very few entries for either term drawn from only one source. Given the paucity of data available Telugu will not be used for further analysis in this study.

With respect to Tagalog, there are 8 entries for the “mum” term. Assuming that /man/ and “man” are the same word, this reduces to 7 entries. There are 13 consonants of which 12 are nasals. Phoible lists 16 Tagalog consonants of which 3 are nasals. We conclude that the “mum” term is nasal dominant and Tagalog can be used for the purposes of this study. For the “me” term there are 8 entries. A number of these have a nasal in initial position but all of these come from the same source and cannot be corroborated in other sources. Therefore, there is reason to believe that this is a data error and these entries will be discounted. This leaves “ko”, “aking”, “akin”, /əqoi/. This gives a total of 6 consonants of which 4 are plosives and 2 are nasals. This evidence, coupled with the fact that WALS lists Tagalog as having no “m” for first person, returns Tagalog “me” as being not nasal dominant = 0.
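The manner of articulation tallies given in the preceding paragraphs can be checked against a simple dominance rule. The sketch below takes the “over half of events” criterion mentioned for Japanese and applies it to every tally quoted in the text. Treating that criterion as a general rule is an assumption made for illustration, and the function name is hypothetical. The study's own judgements also draw on WALS and on the reliability of individual sources.

```python
def dominant_manner(manner_counts):
    """Return the manner forming over half of all consonant events, else None."""
    total = sum(manner_counts.values())
    for manner, count in manner_counts.items():
        if 2 * count > total:  # strictly more than half
            return manner
    return None

# Tallies as quoted in the text for the reduced sets of entries.
tallies = {
    "Chinese 'me'": {"semi-vowel": 5, "nasal": 4, "approximant": 1, "plosive": 4},
    "Japanese 'mum'": {"nasal": 2, "plosive": 1, "fricative": 3, "semi-vowel": 1},
    "Arabic 'mum'": {"nasal": 11, "semi-vowel": 2, "plosive": 1},
    "Arabic 'me'": {"approximant": 4, "nasal": 4, "plosive": 1},
    "Swahili 'me'": {"nasal": 4, "plosive": 2, "semi-vowel": 1},
    "Tagalog 'me'": {"plosive": 4, "nasal": 2},
}

results = {label: dominant_manner(counts) for label, counts in tallies.items()}

for label, manner in results.items():
    print(f"{label}: {manner if manner else 'no dominant manner'}")
```

Under this reading only the Arabic “mum” and Swahili “me” tallies come out as nasal dominant, which agrees with the entries recorded for these terms in the text.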

To summarise the data so far the seven remaining languages are listed along with a 1 or 0.

1 = nasal dominant “me” term

0 = not nasal dominant “me” term

(Binary Distribution of Nasality)

| Language | “me” term |
|---|---|
| English | 1 |
| Hungarian | 1 |
| Turkish | 1 |
| Chinese | 0 |
| Arabic | 0 |
| Swahili | 1 |
| Tagalog | 0 |

Table 6.3

These data must now be interpreted. It could be said that for all the languages found that were nasally dominant with respect to the “mum” term, 4/7 of them were found to have nasally dominant “me” terms. It may be that some would consider this evidence enough to suggest that there is a connection between the two terms for languages with nasally dominant “mum” terms.

However, the data can be analysed further. We can summarise the probability of recording a 1 or a 0 in our table above. This can be done by looking at the possible consonants for each language and calculating the ratio of nasals to other consonants. This will give the probability of a term being nasal dominant as opposed to another manner of articulation being dominant. Of course, we do not know the frequency of sounds in any particular language but for the purposes of this study we assume they are even. This accounts for the 1 languages. For the 0 languages we simply find the ratio of non-nasal consonants to all consonants in the relevant language to find the probability of the “me” term not being nasal dominant. From the data available the following table can be constructed.

(Probability of Nasal Distribution)

Language     Result   Number of possible consonants   Number of nasals   Probability of result
English      1        24                              3                  1/8
Hungarian    1        17                              3                  3/17
Turkish      1        23                              2                  2/23
Chinese      0        13                              3                  10/13
Arabic       0        21                              2                  19/21
Swahili      1        19                              4                  4/19
Tagalog      0        16                              3                  13/16

Table 6.4
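
The probabilities in the final column follow mechanically from the inventory counts. As a minimal sketch, assuming the consonant and nasal counts given in Table 6.4 and the stated assumption that every consonant is equally frequent, the column can be reproduced as below. The names INVENTORIES and probability_of_result are illustrative and are not part of the original study.

```python
from fractions import Fraction

# (language, observed result, possible consonants, nasals), as in Table 6.4.
INVENTORIES = [
    ("English", 1, 24, 3),
    ("Hungarian", 1, 17, 3),
    ("Turkish", 1, 23, 2),
    ("Chinese", 0, 13, 3),
    ("Arabic", 0, 21, 2),
    ("Swahili", 1, 19, 4),
    ("Tagalog", 0, 16, 3),
]


def probability_of_result(result, consonants, nasals):
    """Probability of the observed result, assuming all consonants are equally frequent."""
    p_nasal = Fraction(nasals, consonants)
    # A result of 1 means nasal dominant; a result of 0 means not nasal dominant.
    return p_nasal if result == 1 else 1 - p_nasal


for language, result, consonants, nasals in INVENTORIES:
    print(language, result, probability_of_result(result, consonants, nasals))
```

Exact fractions are used so that the output can be compared directly with the fractions in the table.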

Here we have the probability, in the final column, of the result for each row. What would be interesting would be to find the probability of the overall result occurring. We have to be careful how we define the overall result in this context. It is not the purpose of this study to find the probability of this exact combination of results for each language. It would be no surprise if some languages had nasal dominant “me” forms and some did not, and it is not relevant which ones do. What is important is the probability of any combination of four of the seven languages being nasally dominant in the “me” form. Most available statistical packages assume equal probabilities for each event but here, as the final column above indicates, we have different probabilities for each event. Therefore, the calculation cannot be done in a single step as it might be in the case, for example, of calculating the probability of throwing 4 sixes in seven rolls of a die. However, we can apply the same principle in calculating the probability of each combination occurring. To identify the possible combinations of four successful events (in this case defined as a result of 1) out of a total of seven events, it would be helpful to know how many possible combinations there are.

Fortunately, a formula for this is readily available, eg Hogg and Tanis (2010).

nCr = n!/(r!(n-r)!)

where n = total number of places available, in this case seven, and r = total number of places held, in this case four.

Substituting the values into our formula gives a total of 35 possible combinations of four place holders, in this case denoted by 1, in seven places.
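
The substitution can be written out as a short worked calculation. The binomial notation used here is the standard one and is assumed to correspond to the combination formula cited from Hogg and Tanis (2010).

```latex
\[
{}^{7}C_{4} \;=\; \binom{7}{4} \;=\; \frac{7!}{4!\,(7-4)!} \;=\; \frac{5040}{24 \times 6} \;=\; 35
\]
```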

The table below gives all 35 possible combinations of 4 1s and 3 0s. In addition, the probability of the event outcome is given in brackets. In the final column the probability of that row’s combination is given. This is achieved by multiplying each of the seven individual probabilities together. This is legitimate as the events are independent in this case (Hogg and Tanis, 2010). At the top of each column is a letter corresponding to the language for that column: E = English, H = Hungarian, T = Turkish, C = Chinese, A = Arabic, S = Swahili, Ta = Tagalog.

(Probability of Distributions Across Languages)

E        H          T          C          A          S          Ta         Total Probability
1(1/8)   1(3/17)    1(2/23)    1(3/13)    0(19/21)   0(15/19)   0(13/16)   45/175168
1(1/8)   1(3/17)    1(2/23)    0(10/13)   1(2/21)    0(15/19)   0(13/16)   75/832048
1(1/8)   1(3/17)    1(2/23)    0(10/13)   0(19/21)   1(4/19)    0(13/16)   5/21896
1(1/8)   1(3/17)    1(2/23)    0(10/13)   0(19/21)   0(15/19)   1(3/16)    225/1138592
1(1/8)   1(3/17)    0(21/23)   1(3/13)    1(2/21)    0(15/19)   0(13/16)   135/475456
1(1/8)   1(3/17)    0(21/23)   1(3/13)    0(19/21)   1(4/19)    0(13/16)   9/12512
1(1/8)   1(3/17)    0(21/23)   1(3/13)    0(19/21)   0(15/19)   1(3/16)    405/650624
1(1/8)   0(14/17)   1(2/23)    1(3/13)    1(2/21)    0(15/19)   0(13/16)   15/118864
1(1/8)   0(14/17)   1(2/23)    1(3/13)    0(19/21)   1(4/19)    0(13/16)   1/3128
1(1/8)   0(14/17)   1(2/23)    1(3/13)    0(19/21)   0(15/19)   1(3/16)    45/162656
0(7/8)   1(3/17)    1(2/23)    1(3/13)    1(2/21)    0(15/19)   0(13/16)   45/237728
0(7/8)   1(3/17)    1(2/23)    1(3/13)    0(19/21)   1(4/19)    0(13/16)   3/6256
0(7/8)   1(3/17)    1(2/23)    1(3/13)    0(19/21)   0(15/19)   1(3/16)    135/325312
1(1/8)   1(3/17)    0(21/23)   0(10/13)   1(2/21)    1(4/19)    0(13/16)   15/59432
1(1/8)   1(3/17)    0(21/23)   0(10/13)   0(19/21)   1(4/19)    1(3/16)    45/81328
1(1/8)   0(14/17)   0(21/23)   1(3/13)    1(2/21)    1(4/19)    0(13/16)   21/59432
1(1/8)   0(14/17)   0(21/23)   1(3/13)    0(19/21)   1(4/19)    1(3/16)    63/81328
0(7/8)   0(14/17)   1(2/23)    1(3/13)    1(2/21)    1(4/19)    0(13/16)   7/29716
0(7/8)   0(14/17)   1(2/23)    1(3/13)    0(19/21)   1(4/19)    1(3/16)    21/40664
0(7/8)   0(14/17)   0(21/23)   1(3/13)    1(2/21)    1(4/19)    1(3/16)    441/772616
1(1/8)   0(14/17)   0(21/23)   0(10/13)   1(2/21)    1(4/19)    1(3/16)    105/386308
0(7/8)   0(14/17)   1(2/23)    0(10/13)   1(2/21)    1(4/19)    1(3/16)    35/193154
1(1/8)   0(14/17)   1(2/23)    0(10/13)   0(19/21)   1(4/19)    1(3/16)    5/20332
0(7/8)   1(3/17)    1(2/23)    0(10/13)   1(2/21)    0(15/19)   1(3/16)    225/1545232
1(1/8)   0(14/17)   1(2/23)    0(10/13)   1(2/21)    0(15/19)   1(3/16)    75/772616
1(1/8)   0(14/17)   1(2/23)    0(10/13)   1(2/21)    1(4/19)    0(13/16)   5/44574
0(7/8)   1(3/17)    0(21/23)   0(10/13)   1(2/21)    1(4/19)    1(3/16)    315/772616
0(7/8)   1(3/17)    0(21/23)   1(3/13)    1(2/21)    1(4/19)    0(13/16)   63/118864
0(7/8)   1(3/17)    0(21/23)   1(3/13)    1(2/21)    0(15/19)   1(3/16)    2835/6180928
0(7/8)   0(14/17)   1(2/23)    1(3/13)    1(2/21)    0(15/19)   1(3/16)    315/1545232
0(7/8)   1(3/17)    0(21/23)   1(3/13)    0(19/21)   1(4/19)    1(3/16)    189/162656
1(1/8)   1(3/17)    0(21/23)   0(10/13)   1(2/21)    0(15/19)   1(3/16)    675/3090464
1(1/8)   0(14/17)   0(21/23)   1(3/13)    1(2/21)    0(15/19)   1(3/16)    945/3090464
0(7/8)   1(3/17)    1(2/23)    0(10/13)   1(2/21)    1(4/19)    0(13/16)   5/29716
0(7/8)   1(3/17)    1(2/23)    0(10/13)   0(19/21)   1(4/19)    1(3/16)    15/40664

Table 6.5

We now have the probability of each possible combination of 4 1s in seven places. To calculate the probability of getting any combination of 4 1s in seven places we have to sum the numbers in the final column. This is legitimate as the combinations are mutually exclusive events (Hogg & Tanis, 2010). This gives a figure of 3203159/259598976 which has a decimal approximation of 0.0123389. This gives a probability of p < 0.013 or p ≈ 1/80. Given that much scientific research would set p < 1/20 as being significant (Dahiru, 2008; Greenberg, 1990) we may state that the result has statistical significance. However, bear in mind one of our assumptions, which is that nasal consonants are as likely to occur in each language as any other form of consonant. In addition, we have used only languages in which the maternal nursery term is nasal dominant. It would be interesting to know what the probability was for languages which do not have nasal dominant maternal kin terms to have nasal dominant “me” terms. Only in this way could we be confident that there may be a relationship between nasal dominance in the “mum” and the “me” term. It may be that nasal dominance in the “me” term occurs in languages irrespective of nasal dominance in the “mum” term. However, what we have shown is that nasal dominance in the “me” term occurs at a statistically significant level in unrelated languages.
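
The summation described above can be cross-checked mechanically. The sketch below, offered under the same assumption of evenly distributed consonants, enumerates every way of choosing four of the seven languages, multiplies the seven individual probabilities for each pattern and adds the results. The names P_NASAL and probability_of_exactly are illustrative only, and the nasal probabilities are taken from the inventory counts in Table 6.4.

```python
from fractions import Fraction
from itertools import combinations

# Probability that the "me" term is nasal dominant in each language:
# nasals / possible consonants (Table 6.4, uniform-frequency assumption).
P_NASAL = {
    "English": Fraction(3, 24),
    "Hungarian": Fraction(3, 17),
    "Turkish": Fraction(2, 23),
    "Chinese": Fraction(3, 13),
    "Arabic": Fraction(2, 21),
    "Swahili": Fraction(4, 19),
    "Tagalog": Fraction(3, 16),
}


def probability_of_exactly(k, p_nasal):
    """Sum, over every choice of k languages, the probability that exactly
    those k are nasal dominant and all the remaining languages are not."""
    languages = list(p_nasal)
    total = Fraction(0)
    for chosen in combinations(languages, k):
        row = Fraction(1)
        for language in languages:
            p = p_nasal[language]
            # Multiply by p for a 1 in this column, by (1 - p) for a 0.
            row *= p if language in chosen else 1 - p
        total += row
    return total


p_four = probability_of_exactly(4, P_NASAL)
print(p_four, float(p_four))
```

Exact fractions are used rather than floating point so that the sum can be compared directly with the fraction quoted in the text.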

Returning to our assumption regarding the nasal consonants being as likely to occur in each language as any other form of consonant, it is beyond the scope of this project to reliably test this. However, it is possible to look at some data in this regard. This is fraught with danger and some caveats should be stated at the outset. Finding suitable corpora from which to analyse frequency distributions of phonemes is quite problematic. The corpora should be of spoken language, not written. It is difficult to be sure that any particular corpus is of naturally occurring language; indeed, most seem not to be. This does not mean that non-natural language definitely has a different frequency distribution of phonemes but this cannot be ruled out. In addition, any example of language will be people talking about a particular topic. This may mean that certain words occur with a higher frequency and this may bias the results. In addition, the context of the text may be a formal setting where a certain register is used and this may bias the results. Furthermore, for the purposes of our study it may be that we should only be using corpora of language used with infants as we are looking at issues of fundamental language development. We could also argue that, as we are looking at fundamental meanings, we should just look at basic vocabulary items as Blasi et al did in their study of sound symbolism. Such a process as outlined in this paragraph would be a research project in its own right.

However, it would seem to be reasonable not to ignore the issue of frequency distributions altogether in case it could be argued they might change the results significantly. For this project the UCLA phonemic database was used. These are recordings of language use and include entries for all seven languages here analysed. Among the caveats regarding these corpora it should be stated that the reasons for the recordings of the language in the corpora are not given. Therefore, we do not know if this might have biased the results. Looking at the data there are not always examples of connected speech for each language. Some corpora seemed to be aimed at collecting data related to particular sounds and the words selected seem to reflect that. Therefore, we do not know the frequency of these words in general language and the data may be biased in this manner. It may also be the case that words pronounced in isolation may be pronounced slightly differently in connected speech leading to a systematic loss of certain sounds. In addition, some instances of connected speech in the corpora do not sound like natural speech as there are variations of grammatical forms being exemplified with repeated words. For example, the name “John” appears in the English corpus in various grammatical contexts. This may lead to a statistical bias. Furthermore, some of the connected speech sounds like it is being read aloud even though it is written as spoken language. With regard to transcription, some transcripts are Romanisations rather than IPA requiring a little interpretation as to the exact phonemes used.

However, corpora were looked at from the UCLA database and selected. If continuous language was available, the first corpus of continuous language listed for each language was chosen. If not, the first corpus in the list for the language was selected. This was done to try to reduce researcher bias and introduce an element of randomisation into the process. The number of nasal consonants and non-nasal consonants was ascertained. This was rounded down to the nearest 5, except in one case where there were fewer than 5 events and the number was rounded up to 5. The following ratios of nasal to total number of consonants were found.

(Ratio of Nasal to Total Number of Consonants)

English   Hungarian   Turkish   Chinese   Arabic   Swahili   Tagalog
9/44      2/19        8/59      12/53     1/8      98/339    170/517

Table 6.6

The first thing we notice is that the frequency of nasals is generally higher than the ratios in Table 6.4 would lead us to expect. The only way to check for certain as to how this might affect our probability figure of approximately 1/80 would be to run these figures through the calculations of Table 6.5. However, this would be ill advised given the caveats above. We have no way of verifying whether these are suitable corpora to use. Indeed, it has not been established what might be the criteria for suitable corpora and we may get a result that is quite meaningless. It should also be borne in mind that these corpora are quite small, the largest only producing approximately 500 consonants in all, and the difference in magnitude between some corpora is quite large, the smallest producing only 38 consonants. These data are inadequate for use in calculations of probability in this context.

However, we may be able to draw some inferences from the statistics to contextualise our result of p ≈ 1/80. The result of observed nasals in the corpus for English was 9/44 consonants which is approximately double the ratio of nasals to non-nasal consonants in the set of all possible English consonants. Continuing respectively for Hungarian we have 2/19 compared to 3/17, results of comparable magnitude. For Turkish we have 8/59 compared to 2/23, the ratio of nasals to non-nasals in the corpus being approximately double that found for the whole set of possible Turkish consonants. For Chinese we have 41/53 non-nasals to nasals for consonants in the corpus. For the whole set of Chinese consonants we have 10/13. These ratios are of similar magnitude. For Arabic the corpus gives a ratio of 7/8 for non-nasal to nasal consonants. The set of all possible Arabic consonants gives a ratio of 19/21. These ratios are of a similar magnitude. For Swahili the corpus gives a ratio of nasals to non-nasals of 98/339. The set of all possible Swahili consonants gives a ratio of 4/19. The ratio in the corpus is greater by a factor of approximately 3/2. For Tagalog the corpus gives us a ratio of non-nasals to nasals of 347/517. The set of all possible Tagalog consonants gives a respective ratio of 13/16. Therefore, the corpus ratio is higher by a factor of approximately 1/5.
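
The comparisons in this paragraph can be laid side by side. The sketch below is illustrative only: it takes the inventory ratios from Table 6.4 and the rounded corpus ratios from Table 6.6, expresses both as the share of nasals among all consonants for every language (the paragraph above quotes the non-nasal share for the 0 languages), and divides one by the other. The names RATIOS and corpus_to_inventory_factor are assumptions of this sketch, and the caveats already stated about the corpora apply equally to its output.

```python
from fractions import Fraction

# language: (nasal share of the consonant inventory, nasal share in the corpus sample)
RATIOS = {
    "English": (Fraction(3, 24), Fraction(9, 44)),
    "Hungarian": (Fraction(3, 17), Fraction(2, 19)),
    "Turkish": (Fraction(2, 23), Fraction(8, 59)),
    "Chinese": (Fraction(3, 13), Fraction(12, 53)),
    "Arabic": (Fraction(2, 21), Fraction(1, 8)),
    "Swahili": (Fraction(4, 19), Fraction(98, 339)),
    "Tagalog": (Fraction(3, 16), Fraction(170, 517)),
}


def corpus_to_inventory_factor(inventory_share, corpus_share):
    """How many times more (or less) frequent nasals are in the corpus
    than the even-distribution assumption would predict."""
    return corpus_share / inventory_share


for language, (inventory_share, corpus_share) in RATIOS.items():
    factor = corpus_to_inventory_factor(inventory_share, corpus_share)
    print(
        f"{language:<10} inventory {float(inventory_share):.3f}  "
        f"corpus {float(corpus_share):.3f}  factor {float(factor):.2f}"
    )
```

A factor above 1 indicates that nasals are more frequent in the corpus sample than in the inventory; a factor below 1 indicates the reverse.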

Given the caveats stated above and the fact that the bias factors are approximations it would be meaningless to try sophisticated calculations of the bias. However, we may observe that in two of the four nasal dominant (1) languages the frequency of occurrence of nasals is significantly higher in the corpus. If this is reflected generally in the language then we could expect nasals to be more likely in the “me” term. However, ascertaining this is beyond the scope of this study. In addition, for the non-nasal dominant (0) languages, two of them have corpora in which nasals are of a higher frequency than the figures used to calculate our probability figure. One could argue that one would expect a higher probability of nasals occurring in the “me” term than the figure used in our calculation. It is impossible in this study to quantify this bias but we can make some tentative attempts at taking it broadly into account. We may observe that of our seven languages three have no bias, assuming our corpora are reliable evidence of such, two others have a bias of ½, one has a bias of 1/3 and one of 1/5. Taking these biases into account we may state that our overall probability figure needs to be adjusted by approximately ¼, which would raise the probability to approximately 1/60. This would still be comfortably within the range of significance.

7 SUMMARY AND DISCUSSION

The evolution of language remains a mystery. However, language exists and we can speculate that it must have evolved. Homo sapiens sapiens evolved and at some point in their existence must have developed the need to communicate. Initially, this may have been through facial expressions and body language, such as pointing (Corballis, 2009). Eventually, sophisticated languages evolved. Therefore, we can speculate that a process occurred in which human communication moved from simple expressions, eg of fear, to sophisticated grammatical formations. It is impossible to say how long this process may have taken. The development of pidgins into full creoles may suggest that this process was relatively swift. Nevertheless, we do not know how this happened. It may be the case that the choice of sounds for the original words was entirely arbitrary. Alternatively, it may be that in some cases at least the sounds represented real world phenomena and these determined their use in particular words. The work of Blasi et al suggests that this might be the case, and we may be able to take the argument one step forward. Blasi et al found that certain basic vocabulary shared particular sounds across languages. For example, close front vowels were associated with littleness and more open back vowels with largeness or roundness. Can it be said, however, that the use of one sound for a particular concept can result in its use for another related concept? In this case we are looking at the use of nasals in the “mum” term and asking whether this fundamental concept explains its use in the “me” term. This study has not answered this question but what it has done is to provide some data to try and help us understand the matter under discussion.

To summarise our progress so far, we have verified the speculation of Murdock about the prevalence of nasals for the maternal nursery kin term across languages. This was performed through statistical analysis and we can state, with a high degree of certainty, that languages have not arbitrarily assigned nasal terms in this case. However, this depends upon the assumption that languages have developed independently from each other. Much of the history of linguistics has been spent on showing which languages are related to each other. Therefore, one criticism of our approach is that languages are related and, therefore, we should not be surprised to find similarities between them. However, the state of current linguistic knowledge is that, whilst many languages are related and can be grouped into language families, there is little evidence to suggest that the families themselves are related to each other. Languages in one family may have come from a proto-language, for example English and French may have originated from proto-Indo-European; however, they are not related to languages in other families. So, for example, an Indo-European language such as Russian is not related to a Finno-Ugric language such as Hungarian. This view has been challenged and some linguists contend that all languages might be related. There has been some work by Greenberg in proposing tentative links between languages, for example that there may be a Euro-Asiatic super language family subsuming established language families. Ruhlen, Bancel & de l’Etang and others have gone further in asserting that there is evidence from kin terms and personal pronouns to suggest that all languages derive from a common root. However, these views have been fiercely challenged and this study is based on the assumption that the established language families of the world, as can be found in reference tomes such as Ethnologue, are unrelated.
In any case, even if all languages were derived from a common root we would still have the problem of understanding how human language developed in the first place. Discounting notions of divine or alien intervention and in the tradition of evolution we assume that some process of development took place. This then brings us back to the question of why certain sounds are used and, at least in basic vocabulary items, does the use of the same sound stem from some fundamental primeval meaning? This meaning is not definitional in the sense of words. So, for example, we cannot define the meaning of /m/ in the same way we define the meaning of the word “table”. A table consists of a flat plane supported above the ground. It is used for putting things on. It is a piece of furniture. If sounds do have a meaning it is below the word level and suggests something more intrinsic and perhaps connotative.

We have found evidence to suggest that, at least in some cases, the distribution of sounds is not arbitrary. Maternal nursery terms throughout the world are dominated by nasals. This cannot realistically be accepted as a chance distribution. If we do not accept that this is the result of all languages being related then we must look elsewhere for an explanation. It may be that nasals in this case have no fundamental meaning but are simply the extension of an infant’s mumbling during breast feeding. Whilst feeding the easiest sound to make, as the air passage through the mouth is mostly blocked, is a nasal one. The infant may be expressing content and satisfaction. They may also associate this pleasant sensation with the presence of the mother. It may also be the case that the infant is not aware in early life that they are a different entity from the mother. Originally they were one and they remain in a close relationship. Perhaps the nasal sound is associated with the satisfying of need and by association with the mother and then by extension to the infant themselves. The infant may be affirming the closeness of their relationship to their mother. The other being present in their life, the father, is identified as other and a plosive sound denotes them. If this were the case we would expect to find a statistical correlation between the use of nasal sounds for the nursery maternal term and for infants to refer back to themselves.

Other explanations may be that “mum” and “me” are basic terms that any infant (Jakobson, 1962), or language, would develop early on. They may be terms that were used primevally before language fully developed with sophisticated syntax. As these words were developed first their consonants were simply the easiest ones to produce. Other words developed later required different types of sounds to distinguish them. Intuitively /m/ seems easy to produce. The lips and tongue can be in a relaxed position and there is not the same force required for production as in a plosive. A brief look at our corpora would also suggest that in many languages nasal sounds occur more frequently than other consonants, and more frequently than randomness would allow for. They might be more frequent because they are easier to produce. A word of caution, however, may be relevant in this regard as each language looked at only had two or three nasal consonants but many more plosives and fricatives. We have not looked at the distribution of nasals compared to plosives or fricatives. If a third of all consonants that occur are nasals and there are only three nasals in the language the incidence may appear higher compared to any other particular consonant but it may be that a third of the consonants were plosive. If we are just looking at manner of production then the distribution would be even as the particular plosive used would not be of interest. In addition, research on ease of production does not seem to suggest that nasal sounds are particularly easier for infants to produce than others. Indeed, plosives seem to be learnt earlier.

For the purposes of this study nine languages were chosen from different language families. Two of these languages were dropped from the study. A requirement for their inclusion was that we could determine a nasally dominant “mum” term. For one language there was some doubt as to whether this could be stated with confidence. Another language was dropped for the lack of reliable data. This left seven languages with nasally dominant “mum” terms to be analysed further. This resulted in four of the seven appearing to have nasally dominant “me” terms. Statistical analysis demonstrated that it was unlikely that such a result would occur randomly. Therefore, we have found that for these unrelated languages there is a statistically significant positive correlation with nasal sounds for first person object designators. As a requirement for the study was that the languages should be nasally dominant for the maternal nursery term we have no way of stating whether this would be reflective of all languages or only those with nasally dominant “mum” terms. It may be a possibility for further research to compare “mum” terms and “me” terms for languages without nasally dominant maternal nursery terms. These results could then be compared with those of this study to see if there are any significant differences.

It may also be interesting to see if languages which do not have a nasally dominant maternal nursery term still share a correlation between the type of consonant used for the “mum” term and the “me” term. This point is of importance as it raises a question of what are we looking for? Are we saying that nasal forms have a fundamental meaning of their own, perhaps even transcendental to our species, which means they occur more frequently for “mum” terms and for “me” terms? Alternatively, are we saying that the use of a type of consonant for the “mum” term is arbitrary and this precedes the development of the “me” term but by association and extension the same term is used for both? In other words, the meaning of the /m/ is derived from its first arbitrary use. It does not have a meaning that precedes this use.

If we are saying that nasal sounds such as /m/ have a meaning that precedes language then the question arises as to why this may be the case. Work by Cuskley & Kirby (2013) and Köhler (1929, 1949) suggests that cross modality may play a role. This may also be inferred from the work of Blasi. It may be that littleness is a concept across human societies and existed prior to the development of language. It is a basic concept and it may be that efforts to express this basic concept formed the first stage of systematic sound communication which led to human languages. By systematic we mean the consistent use of particular sounds in a consistent order to express a shared concept as opposed to the sound of a scream to articulate fear. This concept appears to correlate with close front vowels such as /i/ or /ɪ/ across languages. Why might this be the case? The blunt answer is we do not know but at this stage it might be possible, if we accept the argument of cross modality for which there now appears to be at least some evidence, to hazard some conjectures. In producing a close vowel the air channel is greatly reduced compared to an open vowel. This small space may represent the concept of littleness as opposed to the relatively larger space of open vowels such as /ɑ/. It may also be that the higher frequency of sound associated with close vowels is reminiscent of the sound of small things such as small pebbles hitting bigger rocks or the sound of an insect such as a mosquito. Conversely, open sounds may be more conducive to lower frequencies associated with the movement of larger objects or creatures. Köhler found a correlation of visual representation with sounds. It may be that nasals are reminiscent of the sounds that blob type things make as they move. This may be because they can be made over a relatively long period of time. It might be that this period of time represents the movement of the eye across the curve of an object rather than the more apparent immediacy that an apex generates. Hence the use of plosives such as /t/ and /k/ for angularity, as the plosive is a sound which only exists for a moment and cannot be extended for several seconds as a nasal can.

If we conjecture, then, that sounds may have had meaning that preceded fully developed language we may then look at certain sounds for certain basic concepts. In this case we are looking at nasals for the maternal nursery term. Why might this association have arisen? We are positing that the nasal term was associated with “mum” before sophisticated language developed and this was one of the basic phonological building blocks of that language. It may be that /m/ is the sound we associate with eating. In English and Hungarian the anticipation of something pleasant to eat is expressed by “mmmmm”. Hungarian children say /njɒm njɒm/. French children say “ham ham”. The prevalence of nasals is remarkable. If this is an association with food it may be that this is associated with the main food giver, the mother, and a similar sound is used for her. This may be extended to the notion of “my mum” and “my food” and generally “me” and “things that belong to me”. In this way, the nasal sound which originally denotes “mum” and/or food is extended to notions about “me”. It may also be the case that initially an infant may not realise that the mother is a distinct person from them and may consider themselves one and the same person which, indeed, they were initially. With a nasal the air resonates in the nasal cavity and is not projected. In a sense there may be an association between the keeping of the air in the body and the notion of relating to myself. It may be that in this way the first interlanguage developed before moving on to a proto-language stage with more sophisticated vocabulary items and grammatical structures. Originally, the sounds themselves may have carried meaning in and of themselves and later words as we understand them developed using established sets of sounds for a language. It may be, as Ruhlen and Bancel & de l’Etang argue, that there was one original proto-language and this is how it developed.
However, it may also be the case that these concepts exist across all human societies and, for the reasons outlined above, that is why these similarities in sounds occur: the concepts induce such sounds in humans. If the first sound associated with “mum” was arbitrary and then by extension became associated with “me” then we would expect to find an even distribution of sounds to designate “mum” but this is not what we find. Therefore, we might speculate that the sound symbolism precedes its use in language, unless of course Ruhlen and Bancel & de l’Etang are correct and all languages derive from an original proto-language. Of course, nasals may also refer to other concepts.

Blasi et al (2014) put forward the notion that language developed out of sound meaning. Language could not have been learnt by infants when there was no language around. However, a basic sound symbolism could have provided the basis for a rudimentary vocabulary which developed into sophisticated language. In addition, Blasi et al go on to conjecture that there are echoes of these sounds in modern language and that these are not just vestiges of our primeval past but, by their very sound symbolic nature, aid infants in language acquisition.

We may ask the question as to why it is important to discuss how language developed. As speculated above the development of sophisticated languages may have taken place over only a few generations. However, it seems that linguists have no real idea of how this took place. Homo sapiens sapiens evolved and at first had no language. At some point languages existed but we have no knowledge of how this revolution took place. We do not really have any idea of when it took place. According to McDougal, White (Bancel & de l’Etang, 2013) Homo sapiens sapiens first appeared in Africa approximately 200 000 years ago. At some point after that human language developed. It may well be that human language developed with the human evolution of the use of tools, cooking, making of ornaments, art work, etc, which required sophisticated co-operation and the exchange of abstract concepts. Archaeology gives a wide range for the early development of such behaviour. Bancel gives a date somewhere between 50 000 and 100 000 years ago.

Rose (2006) gives a description of how language may have evolved. He notes the universality of the syllable and posits that this may have allowed for pre-linguistic vocalisations to move to an intermediate stage on the way to becoming words. Apparently, there are four proto-syllables that human infants have a preference to produce: /ma/, /da/, /ga/ and /pat/. The italics are those of Rose and he gives no indication of how these symbols are to be interpreted but we may suppose from the context that we have a bilabial nasal, an alveolar plosive and a velar plosive for the first three consonants. The first two correspond strikingly with “mum” and “dad” terms.

Other evidence that might support the theory that language evolved through a process of attributing sounds to meanings lies in the work of Gentilucci & Corballis (2006). They propose a theory holding that spoken language evolved out of sign language.

Proto-sign language carrying iconicity was used along with spoken language and there was some link between the gesture and the sound. Bear in mind that the sounds are produced by the movements of human organs. Gentilucci & Corballis provide evidence from experiments that show that hand gestures associated with the picking up of a small object are smaller than for a large object and these are accompanied by commensurate lip kinematics and voice spectra of syllables produced at the same time. They also speculate that speech may have originated from repetitive ingestive movements of the mouth, somewhat as this work links the nasal sounds to eating, ego and the nursery maternal term.

Corballis (2009) followed the above work up and expounded on the mirror system in which primates’ brains fire neurons in response to certain gestures. This leads on to the idea that speech sounds are perceived in terms of how they are produced rather than how they sound. This would explain how many different sounds can be perceived as the same phoneme; speech is a function of the mirror system. Corballis goes on to elaborate the notion that modern language may have taken a long time to develop, contrary to speculation earlier in this work, and the evolution of grammar may have been a result of mental time travel.

The use of gesture in spoken languages still exists. Sweetser (2007) demonstrates that gesture is an integral part of spoken language and that it is derived from iconicity. If spoken language evolved from sign language or in parallel with it (Corballis, 2009), it may come as no surprise that the speech organs themselves incorporated motions of iconicity into their movements for the first syllabic sounds. The importance of gesture in language evolution is further reinforced by Arbib (2003), who points out that gestures accompany speech, that blind people use them as well as sighted people, that sign languages are full human languages, and that some indigenous Australian and North American cultures use sign language.

The ability to talk would have allowed our ancestors to co-operate more efficiently at manipulating their surroundings to their own convenience, allowing the spread of our species around the globe across seas and inhospitable areas. This on its own may be considered a good reason for the study of language evolution. There are various theories as to how language evolved. A number of researchers, e.g. Dunbar (1998), consider that it may have grown out of primate grooming. This allowed us to maintain relationships in larger groups and allowed for more sophisticated societies to develop. However, there is another point to consider in this regard. Without language it is impossible for us to cognise. We would be aware of things but would not be able to consider them without words and we certainly could not do this through interaction with another person. Language is our interface with reality. It is how we make sense of what we do. It is our existence beyond basic physical and emotional sensations. Understanding how it evolved may help us understand ourselves.

This study has demonstrated a statistically significant correlation of nasal phonemes with first person object designators for a small selection of languages which had nasally dominant maternal nursery terms. However, before we assume too much, the following limitations of this study should be borne in mind. Only seven languages were analysed. They were drawn from different language families to avoid organic relationships between them affecting our statistics. However, seven is a very small sample from the approximately 7 000 languages in the world (Ethnologue).

Whilst probabilistic calculations were made which support the significance of the result, these calculations themselves have caveats attached. It is possible that the results were produced by chance. It would be advisable for another study to be carried out with other unrelated languages to see if the results of this study are reproducible and, therefore, reliable. The use of statistics in linguistics is helpful but not a panacea for problems of reliability or scientific significance. Bancel & de l’Etang (2013) discuss statistical criticism of their own work and use the example that the probability of getting an ace when throwing an ordinary die is one in six. In fact, the probability of getting an ace when throwing an ordinary die is zero. However, interpreting what they meant to say from the context (perhaps there was a problem with the article’s translation into English from the original French), the point is that the probability of getting any one particular number from one to six, say six, is one in six. However, as this study has shown, we are rarely dealing with situations analogous to the rolling of dice or the drawing of cards from a randomly shuffled deck. When a fair die is rolled there is an equal chance of getting any particular number, but this is not the case for many of the phenomena we have looked at. As can be seen from the calculations in this study, an accurate statement of a probability, such as the likelihood of a particular language using a particular sound for a particular concept, is fraught with difficulties. We do not know which criteria would be most reasonable to use. We may use the proportion of target sounds in the set of sounds in the language. We may use the frequency of sounds in a corpus of the language. However, choosing a corpus is in itself a difficult matter. Should it be of a particular age group? Should it be of language of a particular register?
We also know that languages change over time and it may be that some words have existed in a particular language for much longer than others, a point substantiated by Bancel & de l’Etang. Should we then only compare words which have existed for the same length of time, given that the distribution and frequency of sounds in a language will change over time? For example, Old Hungarian had diphthongs and modern Hungarian does not. Some of these factors would be much easier than others to include in our probability calculations. Further issues to be considered include the problem of languages borrowing from other unrelated languages, which may result in bias in the calculations.

The inclusion of more languages would hopefully help smooth out the problems of bias in small samples. However, in this study we did not randomly select nine languages to begin with. Again, this is different from the rolling of a die, as there may be an inherent bias in our selection of languages. One criterion we used for selecting languages was the availability of data. It was assumed that large languages with many speakers would have more available data. However, the history of the world over the last few centuries has resulted in the Indo-European language family becoming dominant in many areas of the globe far from where the language family originally developed. In the Americas in particular, the indigenous languages have been replaced through colonialism. This has also occurred in the Antipodes and parts of Asia. In other areas where the indigenous languages have remained, colonial Indo-European languages have higher status and are used in such areas as business, politics, law and education. The result is that there is a wealth of data on much-studied Indo-European languages but relatively little, bar a few exceptions, on other language families. In some cases there is relatively little data available even for languages with tens of millions of speakers. The lack of data makes mistakes more likely to occur. Problems in finding equivalents between a word in one language and a word in another are more likely if the researcher can find only one or a few examples of its use. This is further compounded if native speakers are not available for interview to triangulate the data. Native speakers of languages are not uniformly spread across the world and it may be difficult for a particular researcher to make contact with them. Even in the present study, where languages were chosen for their perceived availability of data, there were problems in this area.
Therefore, if the number of language families were to be increased for the purposes of study, there may be an exponential increase in the problems of reliable data collection. An alternative might be not to increase the number of language families from which we select languages but to increase the number of languages from the existing language families. This study selected one from each language family but we could decide to select two: say English and French from Indo-European, Hungarian and Finnish from Finno-Ugric, Japanese and Korean, and Arabic and Hebrew from Semitic. However, in some cases we may have the same problem in finding reliable data for the second language, and how would we account for the languages being related and, as discussed in the methodology of this study, therefore more likely to produce similar results by virtue of a shared ancestor language? It may therefore be that our results are biased simply because they were a chance result of the few languages that were available.

If a wider study is to be conducted, the problem of statistical analysis will have to be addressed. In the data analysis section of this study this problem was discussed at length. It will have to be established what ratio one would expect for the sounds to occur. As we have seen, this is not straightforward. In this study only seven languages were statistically analysed. If this number is to be extended, the calculations will increase in number factorially, which makes errors more likely. Increasing the number of languages makes the calculation of possible combinations considerably more extensive. We have an algorithm for finding how many possible combinations there are of x number of 1s and y number of 0s in any string of 1s and 0s. In this case there were 35, but this would increase vastly by just adding a few languages to the study. Furthermore, ascertaining what those combinations are was very time consuming in this study as they were found by hand. If more languages were used this would be impractical and an algorithm will have to be found which systematically gives all the possible combinations. These could then be put into a spreadsheet which will calculate the probability ratios for us.
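The counting and listing steps described above can be sketched in code. The following is a minimal Python sketch and not the procedure used in this study, where the combinations were found by hand. It assumes, for illustration only, that the 35 combinations correspond to strings of seven digits containing three 1s and four 0s (the number of ways of choosing 3 positions out of 7 is 35); the function names are hypothetical.

```python
from itertools import combinations
from math import comb


def count_arrangements(ones: int, zeros: int) -> int:
    """Number of distinct strings containing `ones` 1s and `zeros` 0s."""
    return comb(ones + zeros, ones)


def list_arrangements(ones: int, zeros: int) -> list[str]:
    """Systematically list every string with `ones` 1s and `zeros` 0s."""
    length = ones + zeros
    strings = []
    # Choose which positions hold a 1; every other position holds a 0.
    for positions in combinations(range(length), ones):
        chosen = set(positions)
        strings.append("".join("1" if i in chosen else "0" for i in range(length)))
    return strings


# Assumed split for illustration: three 1s and four 0s across seven languages.
arrangements = list_arrangements(3, 4)
print(count_arrangements(3, 4))  # 35
print(arrangements[0], arrangements[-1])  # 1110000 0000111
```

The resulting list could be written out to a spreadsheet for the probability ratios. The growth is rapid: with twenty languages split evenly, `comb(20, 10)` already gives 184756 strings.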

The problem of choosing languages to study is not just one of practicality; there is also the recurring issue that our probability analyses are not those of die rolling, where we have a fixed number of possible outcomes all of which are equally likely. We have observed that much statistical software assumes an equal likelihood of all possible outcomes, but the fact that we are dealing with unequal probabilities makes the calculations much more difficult. An extension of the number of languages under study would have to involve a careful method of selection. In order to avoid cross influences in the data, in this study we chose unrelated languages from different families. Given the problems involved, it is difficult to see whether reliable data could be drawn from other language families. There are many language families available but, considering the problems outlined above, it may be that much research would be done for relatively little return in terms of useable data. An alternative would be to choose a second language from each of the language families already used. For example, for Indo-European we could use English and French; for Finno-Ugric, Hungarian and Finnish; etc. However, we may still find some families problematic. For example, finding another Ural-Altaic language with readily available data to go with Turkish may be difficult. Nevertheless, one advantage of such a study would be that half the data are already known. However, the main problem would be that which we have tried to avoid in this study: we would expect similarities of form in related languages. This would not invalidate such a study but the problem would have to be addressed within it. In addition, it might raise issues in the probability calculations.

This study began with the observation of Murdock that there seemed to be a preponderance of nasal phonemes for the maternal nursery term across languages over the whole world. We have tested this to establish whether it is statistically significant, which it is. We have speculated as to the causes of this, considering, among others, Jakobson’s idea that nasal sounds are those an infant can most easily make when breast feeding. However, Murdock also observed the prevalence of plosive sounds for the paternal nursery term. Looking at this was beyond the scope of this study but the phenomenon is of relevance. If the use of nasals for the “mum” term can be explained through infants’ feeding, then what explains the use of plosives for the “dad” term? Jakobson speculates that the father is the next significant other that the infant relates to, and that plosives are more distinct from nasals than other sounds and so are used to mark this difference. However, we may ask what other sound might be used; we are rather left with fricatives. The hypothesis has yet to be tested. We might also speculate that if there is a fundamental, perhaps cross modal, meaning to the nasal in the “mum” term then perhaps this may also account for plosives in the “dad” term.

We may refine this further and observe that a glance at Murdock’s data would suggest that front plosives in the labial and alveolar regions are used for the “dad” term rather than back plosives such as /g/. If we are to contend that the basic lexical units of a pre-grammatical proto-language developed through sound symbolism, then we may expect a fundamental cross language meaning for the sounds associated with the basic familial concept of the nursery paternal term. In this way early language may have developed. A further study, therefore, may be to use the model of this study, perhaps with modifications as observed above, to analyse the prevalence of front plosives for the “dad” term. This could begin with a statistical analysis of significance for front plosives in the Murdock data and move on from there. We may speculate a development of language from the nasal terms which perhaps associated “mum” with food and by extension “me”, as air resonates in the nasal cavity and this may have represented the keeping of food in the mouth of the infant. The expulsion of air with a front plosive may have represented otherness in the projection of food, perhaps metaphorically, to another, representing a notion of “you” or “yours”. If “me” associates with “mum” then “you” and “dad” may be other. This may allow us to look for front plosives in the impersonal or familiar second person singular of languages to test whether this correlates with the use of such phonemes for the paternal nursery term.

References

The references are listed alphabetically by main author’s surname. In some cases the author is an institution and in this case the first main word in the institution’s name is used. In some cases the format of the citation is dictated by the database from where it is derived; hence there are some minor inconsistencies in formatting.

Arbib, M. (2003) The Evolving Mirror System in “Language Evolution” Christiansen, M. & Kirby, S., Oxford University Press, Oxford.

Ashton, E. O. (1947) “Swahili Grammar”. London: Longmans, Green & Co. (2nd edition).

Bally, C. & Sechehaye, A. (1916) “Course in General Linguistics: Ferdinand de Saussure”, McGraw-Hill Book Company, New York

Bancel, P. J. & de l’Etang, A. M. (2010) Where do Personal Pronouns Come From? In “Journal of Language Relationship” 3

Bancel, P. J. & de l’Etang, A. M. (2013) Brave New Words in “New Perspectives on the Origins of Language” Lefebvre, C., Comrie, B. & Cohen, H. (eds), John Benjamins Publishing Company, Amsterdam

Bancel, P. J., de l’Etang, A. M. & Bengtson, J. D. (2015) Back to Proto-Sapiens (Part 2) in “Kinship, Language and Pre-history” Jones, D & Milicic, B. (eds), The University of Utah Press, Salt Lake City

Barrios, J. (2014) “Tagalog for Beginners”, Tuttle

Bayram, A. & Jones, K. (2006) “Large Portable Dictionary”, Milet Publishing

Blasi, D., Christiansen, M., Wichmann, S., Hammarström, H. & Stadler, P. (2014) “Sound Symbolism and the Origins of Language”, 391-392. 10.1142/9789814603638_0059.

Blasi, D. E., Wichmann, S., Hammarström, H., Stadler, P. & Christiansen, M. H. (2016) Sound-meaning Association Biases Evidenced across Thousands of Languages, Cutler (ed) at “Proceedings of the National Academy of Sciences of the United States of America”

Bodor, P., (2004) “On emotions: A developmental social constructionist account” L’Harmattan, Budapest

Corballis, M. (2009) The Evolution of Language in “Annals of the New York Academy of Sciences” Volume 1156, Issue 1, NYAS Publications, New York

Cuskley, C. & Kirby, S. (2013) Synesthesia, Cross-modality and Language Evolution in “The Oxford Handbook of Synesthesia”, Simner, J. & Hubbard, E. (eds), Oxford University Press, Oxford

Dahiru, T. (2008) P-Value, a True Test of Statistical Significance? In “Annals of Ibadan Postgraduate Medicine” 6(1)

Dryer, M. S. & Haspelmath, M. (eds) (2018) “The World Atlas of Language Structures Online”, The Max Planck Institute of Evolutionary Anthropology, Leipzig (available online at http://wals.info Accessed on 26/8/18)

de l’Etang, A. M., Bancel, P., & Ruhlen, M. (2015) Back to Proto-Sapiens (Part 1) in “Kinship, Language and Pre-history” Jones, D. & Milicic, B. (eds), The University of Utah Press, Salt Lake City

Dunbar, R., (1998) “Grooming, Gossip and the Evolution of Language”, Harvard University Press

Ethnologue: Languages of the World, Twenty-first edition. Dallas, Texas: SIL International. Online version: http://www.ethnologue.com

Gates, C. (2002) “Turkish-English / English Turkish Dictionary and Phrasebook”, Hippocrene Books Inc., New York

Gentilucci, M. & Corballis, M. (2006) From Manual Gesture to Speech in “Neuroscience & Biobehavioral Reviews” 30

Glušac, M. & Čolič, A. (2017) Linguistic functions of the vocative as a morphological, syntactic and pragmatic-semantic category in “Jezikoslovlje” 18.3

Gordon, R. G. Jr. (ed.), (2005) “Ethnologue: Languages of the World”, Fifteenth edition. Dallas, SIL International. Online version: http://www.ethnologue.com/15.

Greenberg, J. (1990) Universals of Kinship Terminology in “Selected Writings of Joseph Greenberg”

Grégoire, A. (1937) “L’apprentissage du Langage: Les Deux Premières Années”, Gembloux, Liège

Hall, R. A. (1944) Hungarian Grammar in “Language Monographs, 21”, Baltimore: Linguistic Society of America.

Hawkins, M. (2011) “Tagalog Verb Dictionary”, Northern Illinois University Press

Hinds, J. (1986) Japanese in “Croom Helm Descriptive Grammars, 4” London: Croom Helm, Routledge.

Hogg, R. & Tanis, E. (2010) “Probability & Statistical Inference”, Pearson

Huang, Q. (2010) “McGraw Hill’s Chinese Dictionary and Guide to 20000 Essential Words”, McGraw Hill Education

Jakobson, R. (1962) Why Mama and Papa? In “Roman Jakobson: Selected Writings, 1. Phonology”, Mouton, The Hague

Kaszás, G. & Elek, L. (2017) “Amit A Sünökről Feltétlenül Tudni Kell”, Animus Kiadó

Köhler, W. (1929) “Gestalt Psychology” Liveright, New York

Köhler, W. (1949) “Gestalt Psychology” (2 ed) Liveright, New York

Leech, G. (1999) The Distribution and Function of Vocatives in American and British English Conversation in “Out of Corpora”, Hasselgård, H. & Oksefjel, S. (eds), Rodopi, Amsterdam

Leopold, W.F. (1939) “Speech Development of a Bilingual Child: A Linguist’s Record, Vol 1”, North Western University Press, Evanston

Lewis, G. L. (1967) Turkish Grammar. Oxford: Clarendon Press.

Locke, J. L. (1972) Ease of Articulation in “Journal of Speech and Hearing” 15

Magay, T. & Országh, L. (1994) “Magyar-Angol Kéziszótár” 6 edition, Akadémiai Kiadó, Budapest

Majstrik, B. (2010) “Kinship Terms in the British National Corpus”, Bachelor thesis, Filozofická fakulta Univerzity Palackého

Martin, A. & Peperkamp, S. (unpublished) Assessing the Distinctiveness of Phonological Features in Word Recognition: Prelexical and Lexical Influences

Martin, S. (2008) “Tuttle Concise Japanese Dictionary”, Tuttle Publishing

McLeod, S. & Crowe, K. (2018) Children’s Consonant Acquisition in 27 Languages: A Cross Linguistic Review in “The American Journal of Speech Language Pathology” Nov 21; 27 (4)

McIntire, M. (1977) The Acquisition of American Sign Language Hand Configurations in “Sign Language Studies” 16(1)

Miyata, S. (2012b). CHILDES nihongoban: Nihongoyoo CHILDES manyuaru 2012. [Japanese CHILDES: The 2012 CHILDES manual for Japanese]. http://www2.aasa.ac.jp/people/smiyata/CHILDESmanual/chapter01.html

Moro, A. (2003) Notes of Vocative Case in “Amsterdam Studies in the Theory and History of Linguistic Science”, J. Benjamins Pub. Co., Amsterdam

Murdock, G. P. (1957) World Ethnographic Sample in “American Anthropologist” 59

Murdock, G. P. (1959) Cross-Language Parallels in Parental Kin Terms in “Anthropological Linguistics” Vol 1, No 9, Indiana University

Nichols, J. & Peterson, D. A. 2013. M-T Pronouns. In: Dryer, Matthew S. & Haspelmath, Martin (eds.) “The World Atlas of Language Structures Online” Leipzig: Max Planck Institute for Evolutionary Anthropology. (Available online at http://wals.info/chapter/136, Accessed on 2018-08-27.)

Okayama, Y. (1970). “Haha to ko no taiwa no shuurooku” [A collection of mother-child conversations]. Osaka: Yasuda Seimei Shakaijigyoodan.

Ota, M. (2003). “The development of prosodic structure in early words: Continuity, divergence and change”, Amsterdam/Philadelphia: John Benjamins.

Oxford University Press “Pocket Oxford Chinese Dictionary” (1999) Oxford University Press, Oxford

Perdon, R. (2013) “Pocket Tagalog Dictionary”, Periplus

Planck, M. Institute for Evolutionary Anthropology (2015) “Leipzig Glossing Rules”, Leipzig

Quitregard, D. (1994) “Arabic Key Words”, Cambridge

Ramachandran, V. S. & Hubbard, E. M. (2001) Synaesthesia – A Window into Perception, Thought and Language in “Journal of Consciousness Studies” 8(12)

Ramos & Cena (1990) “Modern Tagalog”, University of Hawaii Press, Honolulu

Rose, D. (2006) A Systemic Functional Approach to Language Evolution in “Cambridge Archaeological Journal” 16:1

Ruhlen, M. (1994) “On the Origin of Languages: Studies in Linguistic Taxonomy”, Stanford University Press, Stanford

Safari, J. (2012) “Swahili Made Easy”

Schachter, P. and Otanes, F. T. (1972) “Tagalog Reference Grammar”, Berkeley: University of California Press.

Slobin, D. I., & Bever, T. G. (1982). Children use canonical sentence schemas: A crosslinguistic study of word order and inflections in “Cognition” 12(3), 229-265. https://doi.org/10.1016/0010-0277(82)90033-6

Slonimska, A. & Roberts, S. G. (2017) A Case for Systematic Sound Symbolism in Pragmatics: Universals in wh-words (their italics) in “Journal of Pragmatics” 116

Steingous (1882) “Arabic English Dictionary”

Sweetser, E. (2007) Looking at Space to Study Mental Spaces in “Methods in Cognitive Linguistics” Gonzalez-Marquez, M. (ed), John Benjamins, Amsterdam

Theakston, A. L., Lieven, E. V. M., Pine, J. M., & Rowland, C. F. (2001). The role of performance limitations in the acquisition of verb-argument structure: an alternative account in “Journal of Child Language” 28, 127-152.

Tuttle (1974) “The all Romanized English-Japanese Dictionary”, Tuttle, Boston

2007. The UCLA Phonetics Lab Archive. Los Angeles, CA: UCLA Department of Linguistics. http://archive.phonetics.ucla.edu/.

Wallace, A. & Atkins, J. (1960) The Meaning of Kinship Terms in “American Anthropologist” 62

Woollard, E. & Murphy, A. (2014) “Woozy the Wizard: A Spell to Get Well”, Faber & Faber Ltd, London

The World Atlas of Language Structures Online. Leipzig: Max Planck Institute for Evolutionary Anthropology. (Available online at http://wals.info, Accessed on 2018-11-21.)

Yokoyama, M. & Miyata, S. (2017). Yokoyama Corpus. Pittsburgh, PA: TalkBank. Miyata, S. (2012). CHILDES nihongoban: Nihongoyoo CHILDES manyuaru 2012. [Japanese CHILDES: The 2012 CHILDES manual for Japanese]. http://www2.aasa.ac.jp/people/smiyata/CHILDESmanual/chapter01.html


Appendix 1: Phonemic Symbols

Symbols referring to the pronunciation of a word are given between the marks //. The table below indicates their usage.

Vowels

/i/    monophthong, close, front, slitting of lips, tongue tense
/ɪ/    monophthong, close, front, tongue relaxed
/ʊ/    monophthong, close, back, lips pursed
/u/    monophthong, close, back, lips rounded
/eɪ/   diphthong, mid-front to close front glide
/o/    monophthong, back, middle, cardinal
/e/    monophthong, front, middle to close, slitting of lips
/ͻ/    monophthong, back, middle, pursed lips
/Ɛ/    monophthong, central, front, lips widened
/ə/    monophthong, central, tongue relaxed
/aɪ/   diphthong, open front to close front glide
/æ/    monophthong, open, front, lips open wide
/a/    monophthong, central, open
/ʌ/    monophthong, middle, central
/ɑ/    monophthong, open, back
/ɒ/    monophthong, open back, rounded lips

Consonants

/b/    consonant, bilabial, voiced, plosive
/t/    consonant, alveolar, unvoiced, plosive
/d/    consonant, alveolar, voiced, plosive
/tʃ/   consonant, palatal alveolar affricate
/g/    consonant, velar, plosive, voiced
/k/    consonant, velar, plosive, unvoiced
/q/    consonant, uvular, unvoiced, plosive
/v/    consonant, labio-dental, fricative, voiced
/z/    consonant, alveolar, fricative, voiced
/ʃ/    consonant, palatal alveolar, fricative, unvoiced
/m/    consonant, nasal, bilabial, voiced
/n/    consonant, alveolar, nasal, voiced
/ɧ/    consonant, velar, nasal, voiced
/h/    consonant, open, lips wide apart, unvoiced
/l/    consonant, alveolar, voiced, lateral
/r/    semi-vowel, alveolar, voiced
/w/    semi-vowel, back, lips rounded, voiced
/j/    semi-vowel, palatal-alveolar, voiced

Appendix 2: Murdock Data

Consonant class          Low vowels     High front vowels   High back vowels
                         Mo     Fa      Mo     Fa           Mo     Fa
Non-nasal labials        15     152     11     19           4      24
Non-nasal dentals        25     105     36     28           4      15
Bilabial nasals          101    33      34     8            20     9
Dental nasals            69     16      45     9            14     6
Velars                   10     16      12     6            13     8
Midpalatal semivowels    19     10      8      1            14     7
Midpalatal occlusive     2      7       11     12           0      4
Sibilant fricatives      3      4       9      9            1      1
Liquids                  8      4       3      5            0      2
Aspirates                7      7       2      1            3      1
Velar nasals             9      1       1      0            4      0
Miscellaneous            2      2       6      5            4      3
No consonant             1      0       1      1            0      0
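The observed totals used in Appendix 3 (297 nasal, 234 non-nasal, 531 in all) appear to be the sums of the mother-term (“Mo”) columns of this table. The minimal Python sketch below tallies them under that assumption; the grouping of bilabial, dental and velar nasals as the “nasal” classes is an inference from the figures rather than something stated in the table, and the variable names are illustrative.

```python
# Mother-term ("Mo") counts from the Murdock table, given as
# (low vowels, high front vowels, high back vowels) for each consonant class.
MOTHER_COUNTS = {
    "Non-nasal labials": (15, 11, 4),
    "Non-nasal dentals": (25, 36, 4),
    "Bilabial nasals": (101, 34, 20),
    "Dental nasals": (69, 45, 14),
    "Velars": (10, 12, 13),
    "Midpalatal semivowels": (19, 8, 14),
    "Midpalatal occlusive": (2, 11, 0),
    "Sibilant fricatives": (3, 9, 1),
    "Liquids": (8, 3, 0),
    "Aspirates": (7, 2, 3),
    "Velar nasals": (9, 1, 4),
    "Miscellaneous": (2, 6, 4),
    "No consonant": (1, 1, 0),
}

# Assumption: these three classes are the ones counted as nasal.
NASAL_CLASSES = {"Bilabial nasals", "Dental nasals", "Velar nasals"}

nasal = sum(sum(c) for name, c in MOTHER_COUNTS.items() if name in NASAL_CLASSES)
total = sum(sum(c) for c in MOTHER_COUNTS.values())
non_nasal = total - nasal
print(nasal, non_nasal, total)  # 297 234 531
```

On this reading, three of the thirteen consonant classes are nasal, which is consistent with the expected nasal share of 3/13 (1593/13 out of 531) used in Appendix 3.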


Appendix 3: Chi Squared Test of Murdock Data Hand Calculations

                       Nasal               Non-nasal            Total
Observed events (O)    297                 234                  531
Expected events (E)    1593/13             5310/13              531
O-E                    2268/13             -2268/13             0
(|O-E|)²               30436.8284          30436.8284
(|O-E|)²/E             248.38592 (5dp)     74.5157757 (7dp)

→ χ² ≈ 322.901696 (6dp), with 1 degree of freedom

→ p < 0.00001 (calculated from www.socscistatistics.com/pvalues/chidistribution.aspx)

Calculations from http://vassarstats.net/csfit.html


R calculations

> chisq.test(x=c(297,234),p=c(0.230769,0.769231))

Chi-squared test for given probabilities

data: c(297, 234)
X-squared = 322.9024, df = 1, p-value < 2.2e-16

Appendix 4: χ² Test of Bancel Data

Hand calculations

                       “nana” / “mama”       “nana” / “mama”             Total
                       designating mother    designating other kinship
Observed events (O)    706                   926                         1632
Expected events (E)    1632/16 = 102         1632-102 = 1530             1632
O-E                    604                   -604                        0
(O-E)²                 364816                364816
(|O-E|)²/E             3576.6275 (4dp)       238.4418 (4dp)

→ χ² = 3815.07 (2dp), with 1 df

→ p < 0.00001 (www.socscistatistics.com/pvalues/chidistribution)

SPSS calculations


R calculations

> chisq.test(x=c(706,926),p=c(0.0625,0.9375))

Chi-squared test for given probabilities

data: c(706, 926)
X-squared = 3815.069, df = 1, p-value < 2.2e-16
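The hand calculations in Appendices 3 and 4 can be cross-checked with a short script. The following is a minimal Python sketch of a chi-squared goodness-of-fit test with given expected probabilities; it is an independent re-computation rather than part of the original analysis, the function names are illustrative, and the p-value shortcut applies only to the 1 degree of freedom case used here.

```python
from math import erfc, sqrt


def chi_squared_gof(observed, probabilities):
    """Goodness-of-fit chi-squared statistic for given expected probabilities."""
    total = sum(observed)
    expected = [total * p for p in probabilities]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_one_df(chi2):
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom."""
    # With df = 1 the statistic is a squared standard normal variable, so
    # P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return erfc(sqrt(chi2 / 2))


# Murdock data (Appendix 3): expected nasal share 3/13.
murdock = chi_squared_gof([297, 234], [3 / 13, 10 / 13])
# Bancel data (Appendix 4): expected share 1/16 for the mother designation.
bancel = chi_squared_gof([706, 926], [1 / 16, 15 / 16])

print(round(murdock, 4), round(bancel, 3))
print(p_value_one_df(murdock) < 0.00001, p_value_one_df(bancel) < 0.00001)
```

The statistics come out at approximately 322.9017 and 3815.069, in line with the hand calculations. The slightly higher value in the R output for the Murdock data (322.9024) appears to result from the probabilities being rounded to six decimal places before being passed to chisq.test.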


Appendix 5: Survey Monkey

The survey was sent to University of Central Lancashire students on pre-sessional and in-sessional programmes. The following questions were asked.

Q1 Which of the following languages did you speak to your mother when you were little? If none, please, do not complete the survey.

Arabic
Tagalog
Swahili
Chinese
Japanese
Turkish
Telugu
Hungarian

Q2 What did you call your mother when you were little? Please, write using English script to approximate the pronunciation.

Q3 In the sentence, "This is my book." How would the "my" part be pronounced in your language? Use English script to approximate the pronunciation.

Q4 In "Give the book to me." How would the "me" part be pronounced in your language? Use English script to approximate the pronunciation.

Q5 In "This is mine." How would the "mine" part be pronounced in your language? Use English script to approximate the pronunciation.

10 responses were received; 4 for Chinese and 3 each for Arabic and Japanese. Only 5 respondents answered more than the first question. The raw data are as follows.


Q1: Arabic
Q2: Mama
Q3: -ee
Q4: ilaya
Q5: ili

Q1: Chinese (Mandarin)
Q2: mama
Q3: wo de
Q4: wo
Q5: wo de

Q1: Japanese
Q2: Okaasan
Q3: watashino
Q4: watashini
Q5: watashinomono

Q1: Arabic
Q2: mama
Q3: We put “i” in an end noun to describe it own such as “my”. Book = ketab, my book = ketabi
Q4: me = li or le
Q5: mine = li or le. We use me and mine at same pronounced but different meaning.

Q1: Japanese
Q2: Okaasan
Q3: watashino
Q4: watashini
Q5: (no response)

Q1: Chinese (Mandarin)
Q2: mama
Q3: wo de
Q4: wo
Q5: wo de