
This book comprises 30 selected papers that were presented at the 5th World Conference of Pluricentric Languages and their Non-Dominant Varieties (WCPCL) held at the University of Mainz (Germany). The conference was organized by the Working Group on Non-Dominant Varieties of Pluricentric Languages (WGNDV). The authors come from 15 countries and deal with 14 pluricentric languages and 31 (non-dominant) varieties around the world. The number of known PLCLs has again been extended. There are now 43 PLCLs in all. Apart from a large number of papers on Spanish, French and Portuguese, “new” and little researched PLCLs are also presented in the contributions: Albanian, Hungarian, Malay, Persian, Somali and Romanian.

Rudolf Muhr is a sociolinguist, retired Professor of Linguistics, Founder and Head of the Austrian German Research Centre at the University of Graz and Initiator and Coordinator of the Working Group on Non-Dominant Varieties of Pluricentric Languages (WGNDV).

Benjamin Meisnitzer is Full Professor of Spanish and Portuguese Linguistics at the University of Leipzig. He was Assistant Professor of Romance Linguistics at Johannes-Gutenberg University/Mainz and Research Assistant at Ludwig-Maximilians University/Munich. He is a member of the Steering Committee of the WGNDV, Vice President of the German Association of Catalanists and Vice President of the German Association of Lusitanists.

ISBN 978-3-631-75623-2 www.peterlang.com

Rudolf Muhr / Benjamin Meisnitzer (eds.): Pluricentric Languages and Non-Dominant Varieties Worldwide: New Pluricentric Languages − Old Problems. Österreichisches Deutsch, Sprache der Gegenwart 20, series edited by Rudolf Muhr.

Morgan NILSSON¹ (University of Gothenburg, Sweden)

Somali as a pluricentric language: corpus based evidence from schoolbooks

Abstract

Somali is spoken by more than 20 million people in five states in the Horn of Africa. This paper gives a short survey of the development from the first written texts in the late 19th century, through three decades of centralised standardisation in 1960‒1989, up to the present-day divergence into three somewhat differing parallel standards in Somalia, Somaliland, and the Somali Region of Ethiopia, the three states where Somali is used in education and administration. At the Swedish Language Bank, electronic language corpora of Somali are presently being compiled. These text corpora are designed to allow a systematic investigation of the relevant historical phases of Somali, as well as the present-day regional varieties, and contain a substantial amount of educational and administrative texts. Regional differences can be observed at all levels of the language, such as phonology, orthography, morphology, syntax, and lexicon. In this paper, some examples of such differences are presented based on data from schoolbooks in the Somali corpus at the Swedish Language Bank.

Basic facts about Somali

Somali is an Afro-Asiatic language, today probably spoken by some 22‒23 million people within five states in the Horn of Africa, as well as in the diaspora. The area in the Horn is approximately the size of France and Germany together. There are about 8 million speakers in Somalia² (without Somaliland), 4 million in Somaliland³ (which declared its independence in 1991, but has not been recognised by any other country), 6.5 million in Ethiopia⁴, 2.8 million in Kenya⁵, 0.5 million in Djibouti⁶, and more than 1 million elsewhere around the globe⁷. Somali is therefore approximately the 70th largest language in the world, and the 9th in Africa after Arabic, Swahili, Hausa, Oromo, Yoruba, Igbo, Fula, and Amharic.

1 In: Rudolf Muhr / Benjamin Meisnitzer (eds.) (2018): Pluricentric Languages and non-dominant Varieties worldwide: Nation, space and language. Wien et al., Peter Lang Verlag. p. xx-xx.
2 It is difficult to find reliable data that is up-to-date. Linguistic sources (e.g. Ethnologue 2015b, Berchem 2012: 17) still tend to report fewer than 10 million inhabitants in Somalia as a whole (Somaliland included), whereas the United Nations (UN 2017) now estimates the population at 14.7 million. The linguistic minorities are negligible, with the exception of Maay (Lamberti 1988: 17ff., Appleyard & Orwin 2008: 285, Tosco 2012: 266f.). This language is sometimes considered a dialect of Somali, but linguistically it is predominantly considered an independent language spoken by close to 2 million (Ethnologue 2015a).

Map 1. The Somali speaking area in the Horn of Africa. Source: Wikimedia.

The part of Somalia that excludes Somaliland was colonised by Italy, Somaliland was colonised by Britain, and Djibouti by France, and hence Mogadishu, Hargeisa and Djibouti City are traditional administrative centres.

3 The Government of Somaliland (SomalilandGov 2010: 10) estimated its population in 2009 at 3.85 million. The country is linguistically homogeneous (Appleyard & Orwin 2008: 285).
4 The Central Statistical Agency of Ethiopia stated in its last census (CSA 2007: 98f.) that 6.2% of the population were Somali speakers. The total population has now reached 105 million (CIA 2017).
5 According to the Kenya National Bureau of Statistics, in 2009, 6.2% of the population were ethnic Somalis (Oparanya 2010: 38). The total population has now reached 45.8 million (KNBS 2017: 18).
6 According to the Central Intelligence Agency (CIA 2017a), 60% of Djibouti’s 0.86 million inhabitants are Somalis.
7 This is just a broad estimate, as such data is very difficult to collect and verify. A list on Wikipedia (2018) gives slightly more than 1 million, but this is probably an underestimate of the number of Somali speakers in the diaspora. As an example, the figure for Sweden in Wikipedia (2018) is 63,850. This corresponds to the figure found at Statistics Sweden for persons born in Somalia. There are, however, also 32,000 persons living in Sweden whose parent(s) were born in Somalia (SCB 2017), and presumably many more Somali speakers with yet another geographical background, e.g. Ethiopia, Kenya or Djibouti.
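As a quick arithmetic check, the speaker estimates for Ethiopia, Kenya and Djibouti can be reproduced from the population figures and percentages quoted in footnotes 4‒6. The sketch below is only illustrative: the variable names are ad hoc, the Kenyan percentage refers to ethnic Somalis rather than speakers, and the figures for Somalia, Somaliland and the diaspora are taken directly from the running text rather than derived.

```python
# Population (millions) and Somali share, as quoted in footnotes 4-6.
population_and_share = {
    "Ethiopia": (105.0, 0.062),
    "Kenya": (45.8, 0.062),
    "Djibouti": (0.86, 0.60),
}

# Estimates stated directly in the running text (millions).
stated_estimates = {"Somalia": 8.0, "Somaliland": 4.0, "Diaspora": 1.0}

# Multiply population by share and round to one decimal, as in the text.
derived_estimates = {
    country: round(population * share, 1)
    for country, (population, share) in population_and_share.items()
}

total_speakers = round(
    sum(derived_estimates.values()) + sum(stated_estimates.values()), 1
)

for country, millions in derived_estimates.items():
    print(f"{country}: about {millions} million")
print(f"Total: about {total_speakers} million")
```

The derived values match the 6.5, 2.8 and 0.5 million given above, and the total falls within the stated range of 22‒23 million speakers.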


Generally speaking, there is surprisingly little regional variation within the Somali speaking area. The most important reason for this is probably the high mobility of the major part of the population, who traditionally live as pastoral nomads. There is, however, a somewhat differing group of varieties around and north of the capital, Mogadishu, referred to as Banadir, and even more distant varieties along the coast south of Mogadishu, as well as around the two big rivers west and south-west of Mogadishu (Lamberti 1984). The most notable such variety is referred to as Maay, which is often considered an independent language, possibly with close to 2 million speakers (Ethnologue 2015a).

Map 2. The distribution of Somali dialects. Source: Wikimedia, © by Kzl55, based on Lamberti (1984).

The remaining area, i.e. the most wide-spread dialect group on Map 2, exhibits far less variation. It is traditionally referred to as Northern Somali. Northern Somali and the Banadir varieties, the second largest group on Map 2, are readily mutually intelligible, with the exception of certain lexical items (Hared 1992: 17f., Appleyard & Orwin 2008).

Standardisation process

Somali has been written at least since 1880, using Arabic, Latin and a number of unique Somali scripts (Tosco 2015). Beginning in 1941, Somali was also used in broadcasting from Mogadishu and Hargeisa (Puglielli 2001, Haybe n. d. a). After the unification of the Italian protectorate, in the east and south, with British Somaliland, into the independent Somali Republic in 1960, language policy and planning played an important role for three decades. Initially, Italian, English, and Arabic became the official languages, and a Language Commission was appointed to resolve the question of a Somali script. A Latin-based script was proposed by the commission, but the government hesitated, possibly in fear of the reaction from religious leaders (Cali n. d., Qutbi et al. 1961). Between 1965 and 1967, commissioner Shire Jama Ahmed published a book and six issues of a journal with what was to become today’s official Somali script (Andrzejewski 1974: 201), and in 1969 a Russian-Somali-Russian dictionary was published in Moscow with the same orthography (Stepanjenko & Osman 1969).

Later that year there was a military coup and the Somali Republic became the Somali Democratic Republic. The new government promised to make Somali the official language as soon as possible. A new Language Commission was appointed in order to produce schoolbooks, a dictionary, and a reference grammar (Hared 1992: 33f.). In 1971 the grammar was published, and in January 1973 Somali was introduced as the sole language of administration. Later that year it was introduced as the language of instruction for the youngest pupils. Many new schoolbooks were published, and a huge literacy campaign was launched among the adult population (Haybe n. d. b, Hared 1992: 34ff.). In 1976, the first monolingual dictionary was published, and from the end of the 70’s all primary and secondary school instruction was in Somali. Schoolbooks for all subjects were produced in Somali, together with terminological wordlists (Andrzejewski 1980). The work continued until civil war broke out at the end of the 80’s. Until then, the use of written Somali in Kenya, Ethiopia, and Djibouti was marginal, and the public use of spoken Somali was discouraged.

In the 70’s and 80’s, language policy in the Somali Democratic Republic was strongly centralised and controlled by the state, and therefore also rather successful (Hared 1992: 40f.). Today, however, language policy is not a priority, and three political centres, i.e. Mogadishu (Somalia), Hargeisa (Somaliland), and Jigjiga (the Somali Region of Ethiopia), influence the language indirectly through their use of Somali in education, administration, and politics.


Documentation of the standard

The codification of Somali is very weak. The standard is implicitly documented in schoolbooks produced by the Language Commission and the Ministry of Education (Andrzejewski 1978: 42), but the only explicit codification is actually a short reference grammar published by the commission (Guddiga 1971, 1973). The existing five monolingual dictionaries, published in Mogadishu (Keenadiid 1976), Djibouti (Carab 2004), Nairobi (Cali-Guul-Warsame 2008), (Puglielli & Mansuur 2012), and Djibouti (Aadan 2013), have a varying, weaker status, as they have not been officially sanctioned by the authorities in any of the states where Somali is used as an official language. The same is true for a number of respected reference grammars, such as Axmed (1973), Saeed (1993, 1999), Raabbi (1994, 2014), Mansur & Puglielli (1999).

Present situation in the Somali speaking areas

Somalia, without Somaliland, has about 10 million inhabitants (UN 2017, SomalilandGov 2010: 10), with 8‒10 million Somali speakers, depending on the status ascribed to the Maay variety (Ethnologue 2015a). Somalia is a federal republic, and considers Somaliland to be part of it. According to the constitution, the official language is Somali (both Common Somali and Maay), and Arabic is the second language. Public schools mainly use Somali in all 12 grades, and standardised schoolbooks are produced in Mogadishu. However, universities and private schools often use Arabic or English as their main teaching language. There are very few textbooks in Somali for higher education. The media almost exclusively use Somali, and most newspapers and books aimed at the general public are published in Somali. In politics, most texts are written in English, but in spoken communication Somali dominates.

Somaliland has some 4 million inhabitants (SomalilandGov 2010: 10), practically all of whom speak Somali (Appleyard & Orwin 2008: 285). It declared its independence from Somalia in 1991, but it has still not been formally recognised. According to the constitution, the official language is Somali, with Arabic as the second language; other languages may be used when necessary. Public schools mainly use Somali in all 12 grades, and standardised schoolbooks are produced in Hargeisa. Universities and private schools often use English or Arabic. There are very few textbooks in Somali for higher education. The media almost exclusively use Somali, and most newspapers and books aimed at the public are in Somali. In politics, many texts are written in Somali, although English still dominates in some political domains; in oral communication, Somali is generally used.

Ethiopia has about 105 million inhabitants, of whom some 6.5 million are Somali speakers (CSA 2007, CIA 2017). As of 1996, it is a federal republic divided into 9 ethnically-based regions. Amharic is the official language of the country, but regional languages may be used for internal affairs within the nine separate regions. In the Somali region, Somali is the language of instruction for grades 1‒8, and Amharic is taught as a subject. In grades 9‒12 both Somali and English are used, whilst higher education is essentially in English throughout the whole country (Bijiga 2015: 142ff.). Standard schoolbooks in Somali are produced in Jigjiga, but there are hardly any Somali books for higher education. However, since 2014 there has been a department of Somali language and literature at Jigjiga University, with a full BA program in Somali (120 full time and 700 part time students), and since 2015 there has been a Somali section at the department of Social sciences at Dire Dawa University (68 full time and 214 part time students). These two BA programs are unique within the Horn of Africa. Only one Ethiopian TV channel and a few radio stations broadcast in Somali. No newspapers and very few books are published in Somali. In politics, Amharic is used at the national level, whereas Somali may be used at the regional level.

Djibouti has over 0.8 million inhabitants, of whom about 0.5 million are Somali speakers. It became independent from France in 1977. Most of the remaining part of the population speak the related Cushitic language, Afar. The country’s two official languages are French and Arabic, but French dominates. In the constitution, the two indigenous languages, Somali and Afar, are called “national languages”. They are practically only used in oral communication in non-official situations. The educational system uses French, but there are also some Arabic schools and university courses taught in Arabic. Somali and Afar are not used in education, and are not even studied as a subject, hence there are no textbooks in Somali (Mahamoud 2011, 2016). There is, however, an Institute for the national languages at the Research and Study Centre of Djibouti, where five researchers work on Somali language and literature. The languages in the media are mainly French and Arabic. There are also some Somali transmissions on TV and radio. A few publications appear in Somali. In politics, French is the dominating language, but Somali is sometimes used in oral communication.

Kenya has 46 million inhabitants, of whom some 2.8 million are Somali speakers (KNBS 2017, Oparanya 2010). English and Swahili are the official languages. English dominates in education, but Swahili is also used in the lower grades. Somali schools or classes are also sometimes organised for the lower grades, but purely on private initiative. Schoolbooks from Somalia may then be used in these classes, as they are often printed in Kenya. English and Swahili dominate in the media, but there are also a few Somali transmissions on TV, and a small number of Somali radio stations. There are, however, very few printed publications in Somali. The languages used in politics are English and Swahili.

The emergence of regional standards

In the 70’s and 80’s one dominating standard was consciously developed in the Somali Democratic Republic (including today’s Somaliland). The Banadir variety of Mogadishu was considered ‘dialectal’ from the perspective of the majority, and therefore the variety spoken by the majority (often referred to as Northern Somali) was taken as the base, and the standard was formed as a certain compromise (Hared 1992: 18ff.). It was largely formed by the Somali Language Commission, through the media and the educational system (Andrzejewski 1980, Caney 1984). The country was a socialist republic with state controlled publishing houses, radio, and television, which made the standardisation easier and more efficient than it would have been under more democratic conditions (Hared 1992: 40f.). This process was, however, interrupted at the end of the 80’s, when the civil war began. Somaliland declared its independence in 1991 and went its own way.

In the 21st century a certain linguistic divergence between Somaliland and Somalia can be noticed. It becomes tangible in their new series of schoolbooks published in two almost identical parallel editions. The competing linguistic centres are primarily Mogadishu and Hargeisa, but Djibouti and Jigjiga also belong to a larger northern region together with Hargeisa, and a new step towards even greater diversity was taken around 2010, when an independent series of Somali schoolbooks was produced in Jigjiga. Their contents differ completely from the books already published in Hargeisa and Mogadishu.

It is difficult to say which of the two varieties of Somali, based in Mogadishu and Hargeisa, is the more dominant one. The northern variety has, however, been somewhat more productive in publishing, possibly due to the peace in Somaliland. On the other hand, the southern variety is supported by the largest Somali city and its traditional capital. The variety in Ethiopia is easier to classify as non-dominant. Language-conscious speakers seem to be looking for a certain linguistic guidance from Somalia and Somaliland.

Previous studies of Somali regional standards

In the existing literature, the common standard formed during the 70’s and 80’s is most often contrasted with non-standard usage. Very little work has been done on the regional variation within the standard language, based on investigation of larger amounts of text from the different regions. However, the Somali dialects have been thoroughly investigated by, e.g., Lamberti (1986, 1988), Abdullahi (2010), and Ismail (2011). Changes over time in the standard of the 70’s and 80’s were investigated by Hared (1992), and more general issues of variation in the modern standard are discussed in, e.g., Banti (2011, 2012), Banti & Ismail (2015), Fayruus (2015), and Mansuur (2015).

Somali corpora

Today, three Somali corpora are available online. The first, Somali Korp⁸, was launched in October 2015 at the Swedish Language Bank, University of Gothenburg. This corpus, which currently stands at 5 million tokens, is accessible through the same interface and search engine as the Swedish National Corpus, allowing an unlimited number of hits and providing statistics in Excel format. The data is not lemmatised or tagged, but it is divided into 37 individually selectable sub-corpora, containing different types of texts originating from different regions within the Somali speaking area.

The second corpus, Somali Corpus⁹, at Redsea Foundation in Hargeisa, was launched in June 2016. This corpus, containing 3 million tokens, is accessible through a corpus-specific search engine and interface. The data is lemmatised and tagged for parts of speech.

The third corpus, Somali WaC 2016 Corpus¹⁰, was launched in February 2017. It was developed by the Norhed project in a collaboration between several universities, including those of Addis Ababa and Brno. This corpus, containing 80 million tokens, consists of texts which have been automatically collected from the internet.
Data illustrating the Somali situation

As a first step, a manual comparison was made of two versions of the same 3rd grade science textbook (Saynis, Fasalka 3aad. Hargeisa 2001 & Mogadishu 2011). In phonology and orthography, the only difference observed between the two editions was the use of post-vocalic ⟨dh⟩¹¹ in the north (Hargeisa, Somaliland), corresponding to an ⟨r⟩ in the south (Mogadishu, Somalia), in a very large number of words, e.g. jidhᴺ / jirˢ ‘body’¹². No other differences were observed in phonology or orthography.

In grammar, one instance of the use of a 1st person plural exclusive possessive suffix (dhammaantayoᴺ ‘all of us’) was found in the northern edition. This was rendered in the southern edition of the book with the corresponding inclusive possessive ending (dhammanteenˢ ‘all of us’), implying that the distinction in question was not maintained in this variety. There were also two instances of discrepancies when marking the subject with the ending –u, but these differences went in both directions: cadowgiisaᴺ (base form) vs. cadowgiisuˢ (subject form) ‘its enemy’ and carraduᴺ (subject form) vs. carro-dhoobadaˢ (base form) ‘clay’, hence they are probably just writing mistakes. Another interesting difference was the treatment of the word ‘leaves’ as a collective singular noun in the northern edition, but as a plural noun in the southern: caleen badanᴺ vs. caleemo badanˢ ‘many leaves’.

In syntax, only one difference was encountered, namely a noun + noun construction in the northern edition of the textbook, as opposed to a noun + adjective construction in the southern: saliidda khudraddaᴺ vs. saliid khudradeedˢ ‘vegetable oil’.

The most numerous differences were, instead, encountered in the lexicon. In about 25 cases different lexemes were applied in the two editions, many repeated several times. Some examples are isticmaalᴺ (Ar.¹³) vs. adeegso; adeegsiˢ ‘use; usage’, ka bacdiᴺ (Ar.) vs. ka dibˢ ‘afterwards, then’, wasakhᴺ (Ar.) vs. uskagˢ ‘dirt’, khatarᴺ (Ar.) vs. halisˢ ‘danger, risk’, beedᴺ (Ar.) vs. ukunˢ ‘egg’, akhtarᴺ (Ar.) vs. cagaarˢ ‘green’, labanᴺ (Ar.) vs. bulukeetiˢ (It.) ‘brick’, tamaandhoᴺ (En.) vs. yaanyoˢ (Sw./Ar.) ‘tomato’, tiinᴺ (En.) vs. daasadˢ ‘tin, can’, bilaydhᴺ (En.) ‘plate’ vs. baafˢ ‘trough’, maydhᴺ vs. dhaqˢ ‘wash’, gadhfeedhᴺ vs. shanloˢ ‘comb’, dabadeedᴺ vs. kadibnaˢ ‘and then’, dabeetoᴺ vs. kadibnaˢ ‘and then’, maddiibadᴺ ‘bowl’ vs. weelˢ ‘container’, ogaatayᴺ ‘grasped’ vs. aragtayˢ ‘saw’, silbanayaᴺ vs. siisiibanayaˢ ‘slippery’, haragᴺ vs. maqaarˢ ‘skin, leather’, taangiᴺ (En.) vs. beerkadˢ ‘tank’, qaydhiinᴺ vs. ceeriinˢ ‘raw’, daqiiqᴺ (Ar.) ‘flour’ vs. budoˢ ‘powder’, laxuuxᴺ vs. canjeeroˢ (Am.) (a type of bread), ceel-joogᴺ vs. qooleyˢ (a type of pigeon), hadhᴺ vs. hoosˢ ‘shade’.

8 Accessible at .
9 Accessible at .
10 Accessible at .
11 The digraph ⟨dh⟩ represents the voiced post-alveolar retroflex plosive [ɖ], which is widely used in the north but in post-vocalic position typically corresponds to an /r/ in the south (Saeed 1993: 14).
12 A superscript N (ᴺ) marks the northern edition from Hargeisa, Somaliland, and a superscript S (ˢ) the southern edition from Mogadishu, Somalia.
13 Borrowing from Am. = Amharic; Ar. = Arabic; En. = English; It. = Italian; Sw. = Swahili.
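The comparison of the two textbook editions described above was carried out manually. Purely as an illustration, a rough automated first pass over two parallel texts could look like the sketch below, which uses Python's standard `difflib` module to list the word spans that differ. The function name `differing_spans` is hypothetical, and the two token lists are a toy example assembled from word pairs quoted above, not an actual sentence from either textbook.

```python
from difflib import SequenceMatcher


def differing_spans(north_tokens, south_tokens):
    """Return (north_span, south_span) pairs where the two token lists differ."""
    matcher = SequenceMatcher(a=north_tokens, b=south_tokens, autojunk=False)
    return [
        (north_tokens[i1:i2], south_tokens[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"  # keep replaced, inserted and deleted spans only
    ]


# Toy input built from attested word pairs; "iyo" means 'and'.
north = ["beed", "iyo", "tamaandho"]
south = ["ukun", "iyo", "yaanyo"]

for north_span, south_span in differing_spans(north, south):
    print(north_span, "vs.", south_span)
```

Such a pass only flags candidate differences; deciding whether a given pair is orthographic, grammatical or lexical still requires the kind of manual classification reported above.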

As the number of differences between the two editions of the Science textbook seemed somewhat low, a page from the 2nd grade textbook in Somali language (Af-Soomaaliga, Fasalka 2aad. Hargeisa 2011 & Mogadishu 2001) was also compared in the two editions. Here the number of differences was much larger. A possible explanation might be that the authors of the science books were not as sensitive to linguistic variation as the authors of the textbooks in Somali language.

Again, in orthography and phonology, there was a difference between northern post-vocalic ⟨dh⟩ and southern ⟨r⟩. In morphology, women’s names had base forms in –a in the northern books, whereas they had base forms in –o in the southern, e.g. Sahraᴺ vs. Sahroˢ. In the northern edition, such names also had a subject form ending in –i, whereas no special subject form was applied in the southern edition, e.g. Sahriᴺ vs. Sahroˢ ‘Sahra (subject)’. Three differences that may be interpreted as lexical were also observed on this single page: aabboᴺ vs. aabbeˢ ‘father’, geysaaᴺ vs. u waddaaˢ ‘leads, takes them to’, weydiimahanᴺ vs. su’aalahanˢ (Ar.) ‘the questions’.

Corpus data

As a subsequent step, corpus data was compared for some of the more well-known differences between the regional varieties. The corpus data analysed was the sub-corpora of Somali Korp, which contained schoolbooks as well as a few texts published in the 1960’s, as presented in Table 1.

Table 1. Amount of schoolbook data in Somali Korp

Text type                  State                        Tokens¹⁴
1960’s texts               Somali Republic                14,150
1970’s schoolbooks         Somali Democratic Republic    119,110
2000’s schoolbooks, South  Somalia                       185,720
2000’s schoolbooks, North  Somaliland                     68,670
2000’s schoolbooks, West   Ethiopia                       85,820

14 The amount of text in these sub-corpora is still quite restricted, but the goal is to include at least 1 million tokens in each of them, except the 1960’s texts, where that amount of text does not exist. The relatively small number of schoolbooks from the 1980’s and 1990’s will also be added.

In Somali orthography, no precise rules have been formulated for the spelling of the front diphthong; both ⟨ay⟩ and ⟨ey⟩ are applied. In the 60’s and 70’s, the proportion of the more frequent of the two spellings was 78‒79%. In modern schoolbooks from Somalia, its use has risen slightly to 81%, whereas in books from Somaliland, its use has decreased slightly to 77%. In the nearby Somali region of Ethiopia, however, the frequency of this spelling is significantly higher at 90%.

Another instability in Somali orthography concerns the choice between ⟨nb⟩ and ⟨mb⟩ for phonetic [mb]. In the texts from the 60’s ⟨nb⟩ was written in 80% of all instances of [mb], but in the books from the 70’s, it is instead the phonetic spelling ⟨mb⟩ that is applied in 76% of all instances. In the modern schoolbooks this trend has grown even stronger, and ⟨mb⟩ is used at a rate of 83‒84% in books from both Somalia and Somaliland. In the Ethiopian books, however, the use of ⟨nb⟩ is slightly more frequent (54%) than the phonetic spelling.

The most frequent variation can, however, be noticed in the use of ⟨dh⟩ or ⟨r⟩ in post-vocalic position. In order to gain a preliminary picture, all instances of words with ⟨dh⟩ were counted and then the number of words with initial ⟨dh⟩ was subtracted from the first figure, giving the approximate number of words with a non-initial ⟨dh⟩. In the schoolbooks from the 70’s, only 0.6% of the words contain a non-initial ⟨dh⟩. The rate is somewhat lower in modern books from Somalia with 0.4%, whereas it is significantly higher in the books from Somaliland at 1.6%, and even higher in the books from Ethiopia, with a non-initial ⟨dh⟩ in as much as 1.8% of all words.

Most dictionaries claim that the word for ‘bread’ has the form rooti and that it is masculine in gender, hence the definite form should be rootiga. This is, however, not confirmed by the schoolbooks, where in the 70’s the feminine definite form rootida is almost as frequent as the masculine one. Furthermore, in modern schoolbooks from Somaliland, this noun is only used in the form roodhi, and it always takes the feminine definite form roodhida. In the schoolbooks from Ethiopia this word hardly occurs, as another synonym is used instead.

Some verbs with a stem ending in –i exhibit variation in some forms, e.g.
the infinitive ‘to read’ is always akhriyi in the books from Somaliland and Ethiopia, as well as in the books from the 70’s. In the books from Somalia, however, another form, akhrin, is four times more frequent than akhriyi, which also occurs. In some finite forms, there is also a variation between akhrisa- and akhrida-. Both forms are equally frequent in the modern books from Somaliland and in the books from the 70’s, whereas the forms with a –d‒ are somewhat more frequent in the modern books from Somalia and Ethiopia.

Somali exhibits a vast number of possible contractions similar to the English I’m for I am. One type of contraction was investigated in the schoolbooks, namely the combination of the focus marker waxa(a) and the pronoun uu ‘he’. In the schoolbooks from the 70’s, the contractions waxuu, wuxuu, wuxu were used at a rate of 83%, with the form wuxuu largely dominating. In the modern books from both Somalia and Somaliland, the rate of contractions had fallen to just 2%, and the separate spelling waxa uu or waxaa uu dominates, with the shorter variant being the more common. In the books from Ethiopia, however, contractions are used at a rate of 87%, an even higher frequency than in the books from the 70’s.

Traditionally, Somali numerals consisting of tens and ones have been expressed in the standard language with the ones preceding the tens, e.g. shan iyo labaatan, literally meaning ‘five and twenty’. This is the only order used in modern books from Somaliland, and, with marginal exceptions, also in the texts from the 60’s and 70’s. In modern books from Somalia, however, numbers written in the reverse order, i.e. labaatan iyo shan, literally ‘twenty and five’, are three times more common than numbers with the traditional order, and in books from Ethiopia this reverse order is four times more frequent.
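The subtraction used earlier in this section to estimate the share of words with a non-initial ⟨dh⟩ can be sketched as follows. This is a minimal illustration, not the procedure actually run on Somali Korp: the function name is hypothetical, the token list is a toy sample made up of words cited in this paper, and a non-empty list is assumed.

```python
def non_initial_dh_rate(tokens):
    """Approximate share of tokens that contain a non-initial dh digraph.

    Mirrors the counting described in the text: tokens containing "dh"
    minus tokens beginning with "dh", divided by the total token count.
    """
    lowered = [token.lower() for token in tokens]
    with_dh = sum("dh" in token for token in lowered)
    initial_dh = sum(token.startswith("dh") for token in lowered)
    return (with_dh - initial_dh) / len(lowered)


# Toy sample of words cited in the paper; not a corpus extract.
toy_tokens = ["jidh", "dhaq", "maydh", "weel", "hadh",
              "dhammaantayo", "ukun", "beed", "roodhi", "kalluun"]

rate = non_initial_dh_rate(toy_tokens)
print(f"{rate:.1%}")
```

The result is only approximate, as the text notes: a word that has both an initial and a non-initial ⟨dh⟩ is removed by the subtraction even though it does contain a non-initial ⟨dh⟩.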
Somali also exhibits an interesting variation in the word order found in subordinate clauses between the negation aan and subject pronouns such as aad ‘you’, uu ‘he’ and ay ‘she; they’. In all books both orders can be observed, but in the modern books from Somaliland, instances where the negation precedes the pronoun, e.g. aanu, aanay, are over 20 times more frequent than the opposite order. In books from Ethiopia such forms are only twice as common as the opposite order, and in modern books from Somalia, the forms where the subject pronoun precedes the negation, e.g. uusan, aysan, are five times more common than the opposite order. In the books from the 70’s, forms beginning with the negation are three times more common than forms beginning with the pronoun.

For many basic words, there are synonyms which are generally perceived as being typically northern or typically southern. For ‘banana’, ‘monkey’, ‘Sahra’, ‘water melon’, ‘kettle’, ‘red’, and ‘fish’, the words muus, daayeer, Sahra, xabxab, kildhi, casaan, and kalluun are generally perceived as being typically northern, whereas the words moos, daanyeer, Sahro, qare, jalamad, guduud, and mallaay are generally perceived as being typically southern. Judging from the usage in the investigated schoolbooks, the picture is, however, more complex than that.

For the word ‘monkey’, only the form daayeer is used in the modern books from Somaliland and Ethiopia, as well as in the books from the 70’s, whereas only the form daanyeer is used in the modern books from Somalia. For the proper noun ‘Sahra’, only the form Sahra occurs in the modern books from Somaliland and Ethiopia, as well as in the books from the 70’s, whereas in the modern books from Somalia, both Sahra and Sahro occur, with Sahro being almost twice as frequent as Sahra.
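Statements such as “twice as frequent” or “only the form daayeer is used” rest on comparing the counts of competing forms within one sub-corpus. A minimal sketch of such a comparison is given below; the helper `variant_shares` is hypothetical, and the token sample is invented for illustration and does not reproduce any corpus figures.

```python
from collections import Counter


def variant_shares(tokens, variants):
    """Share of each competing variant among all occurrences of the set."""
    counts = Counter(token for token in tokens if token in variants)
    total = sum(counts.values())
    if total == 0:
        # None of the variants occurs in this sample.
        return {variant: 0.0 for variant in variants}
    return {variant: counts[variant] / total for variant in variants}


# Hypothetical token sample, invented for illustration only.
toy_sample = ["daayeer", "muus", "daanyeer", "daanyeer", "moos", "daanyeer"]

shares = variant_shares(toy_sample, ["daayeer", "daanyeer"])
for variant, share in shares.items():
    print(f"{variant}: {share:.0%}")
```

Running the same function once per regional sub-corpus would give directly comparable shares for each pair or set of variants.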


For the word ‘banana’ only the form muus is used in the modern schoolbooks from Ethiopia and Somaliland as well as in the books from the 60’s. In the 70’s, however, the form moos also occurs, though less frequently. In the modern books from Somalia, both forms are used, moos being the more frequent one. For the word ‘red’ only the form casaan occurs in modern schoolbooks from Somaliland and Ethiopia, whereas both casaan and guduud occur in the books from Somalia to an almost equal extent. Both words also occur in the books from the 60’s and 70’s. For ‘water melon’, only the word qare is used in modern schoolbooks from Somalia and only the word xabxab is used in the books from Somaliland, whereas both words occur in the Ethiopian books as well as in the books from the 70’s. In somewhat different ways, these five lexical items more or less confirm the stereotypical image of typical northern and southern words or word forms.

For some other lexical items, however, the situation is more diverse. For the word ‘kettle’, only the form kildhi occurs in modern books from Somaliland and Ethiopia, whereas kildhi, kirli, and jalamad all occur in the books from Somalia, with kirli being the most frequent. In the books from the 70’s only kirli occurs, hence the three standards seem to strive towards a common lexical item with two phonological shapes, differing only with respect to the typical alternation between ⟨dh⟩ and ⟨r⟩, accompanied by a metathesis.

Sometimes it is Ethiopia that stands out in the use of a lexical item. For the notion of ‘fruit and vegetables’, the word khudrad is almost always used in the modern books from Somalia and Somaliland, as well as in the books from the 70’s, even though there are also occasional instances of the form khudaar. In the modern Ethiopian books, khudaar is about 50 times as frequent as khudrad.

Sometimes there is also a consensus about one and the same word. For ‘fish’, only the word kalluun is used in the modern textbooks from all three states. It is only in books from the 70’s that the word mallaay occurs sporadically, but with kalluun being more than 10 times as frequent.

Summary

The aim of this paper has been to give a short overview of the different regions where Somali is spoken, as well as to demonstrate some of the differences between the emerging regional standards of Somali, as they appear in the schoolbooks published by the administrations in the three states where written Somali is systematically used in education, i.e. Somalia, Somaliland and Ethiopia.

The data presented here are only preliminary, as substantially larger corpora are needed in order to say something more definite about the regional standards promoted by the different states. Such findings would also need to be compared with corpora of other text types from the same regions, in order to explore the extent to which the standards imposed by the schoolbooks correspond to the actual usage of Somali in each region. However, there clearly seem to be two equally strong, dominant, written varieties of Somali in Somalia and Somaliland, respectively, both firmly based on the common heritage from the 70’s and 80’s, but today somewhat diverging. Besides those two, there is a younger, non-dominant, written variety in Ethiopia, which has a much shorter tradition of writing. There are also two mainly spoken non-dominant varieties of Somali in Djibouti and Kenya, respectively.

References

Aadan, A. X. (ed.) (2013): Qaamuus Afsoomaali. Djibouti.
Abdullahi, M. D. (2010): Le Somali, dialectes et histoire. Saarbrücken.
Agostini / Puglielli / Siyaad (eds.) (1985): Dizionario somalo-italiano. Roma.
Andrzejewski, B. W. (1974): The introduction of a national orthography for Somali. African Language Studies 15, pp. 199–203.
Andrzejewski, B. (1978): The development of a national orthography in Somalia and the modernization of the Somali language. Horn of Africa 1(3), pp. 39–45.
Andrzejewski, B. (1980): The use of Somali in mathematics and science. Afrika und Übersee 63, pp. 103–117.
Appleyard, D. / Orwin, M. (2008): The Horn of Africa: Eritrea, Djibouti, and Somalia. In: Simpson, A. (ed.), Language and national identity, pp. 267–290.
Axmed, Sh. J. (1973): Naxwaha Af Soomaaliga. Muqdishow.
Banti, G. (2011): Internally-headed relative clauses in literary Somali? In: Frascarelli, M. (ed.), A country called Somalia. Torino, pp. 32–47.
Banti, G. (2012): Grammatical variation in written Somali. Presentation at the conference FestSom40 (Djibouti, 17–22 Dec).
Banti / Ismail (2015): Some issues in Somali orthography. In: Ismaaciil et al. (eds.), Afmaal, Djibouti, pp. 36–48.
Berchem, J. (2012): Grammatik des Somali. 2. Aufl. Norderstedt.
Bijiga, T. D. (2015): The development of Oromo writing system. PhD thesis. Kent.
Cali “Idaajaa” (no date): Shire Jama Ahmed. A pioneer of the development of Somalia’s national orthography. [Accessed 22 Jan. 2018]
Cali-Guul-Warsame, K. (2008): Qaamuuska af Soomaaliga. Nairobi.

Caney, J. (1984): The role of the newspaper and the school text book in the modernisation of Somali vocabulary. In: Labahn (ed.), Proceedings of the Second International Congress of Somali Studies, Vol. I, Hamburg, pp. 373–378.
Carab, S. X. (2004): Qaamuus. Ereykoobe. Jabbuuti.
CIA (2017a): The world fact book. Africa: Djibouti. [Accessed 16 Jan. 2018]
CIA (2017b): The world fact book. Africa: Ethiopia. [Accessed 16 Jan. 2018]
CSA (2007): The 2007 population and housing census of Ethiopia. National statistical. [Accessed 16 Jan. 2018]
Ethnologue (2015a): Maay. [Accessed 18 Jan. 2018]
Ethnologue (2015b): Somali. [Accessed 16 Jan. 2018]
Fayruus, M. C. (2015): Isrogrogidda falka Af-soomaaliga: Aragti ku aadan midaynta qoraalkiisa. In: Ismaaciil et al. (eds.), Afmaal, Djibouti, pp. 49–60.
Guddiga Afka Soomaalida (1971): Aasaaska naxwaha Af Soomaaliga. Muqdisho.
Guddiga Af Soomaaliga (1973): Aasaaska naxwaha Af Soomaaliga. 2nd ed. Muqdisho.
Hared, M. F. (1992): Modernization and standardization in Somali press writing. PhD thesis. Los Angeles.
Haybe, A. (no date a): The role of broadcasting in the development of Somali language. [Accessed 22 Jan. 2018]
Haybe, A. (no date b): The mass literacy campaign (1973–5). [Accessed 22 Jan. 2018]
Ismail, A. M. (2011): Dialectologie du somali: problématiques et perspectives. PhD thesis. Paris.
Keenadiid, Y. C. (1976): Qaamuuska af-Soomaaliga. Muqdisho.
KNBS (2017): Statistical abstract 2017. [Accessed 18 Jan. 2018]
Lamberti, M. (1984): The linguistic situation in the Somali Democratic Republic. In: Labahn (ed.), Proceedings of the Second International Congress of Somali Studies, Vol. I, Hamburg, pp. 155–200.
Lamberti, M. (1986): Die Somali-Dialekte. Hamburg.

Lamberti, M. (1988): Die Nordsomali-Dialekte. Heidelberg.
Mohamoud, M. A. (2011): Description de la situation linguistique en République de Djibouti. MA thesis, Aix-Marseille.
Mohamoud, M. A. (2016): Description de la situation plurilingue de Djibouti. Science et Environnement 31, pp. 1–18.
Mansur, A. O. / Puglielli, A. (1999): Barashada naxwaha Af-Soomaaliga: A Somali School Grammar. London.
Mansuur, C. C. (2015): Midaynta iyo horumarinta Af-soomaaliga. In: Ismaaciil et al. (eds.), Afmaal, Djibouti, pp. 17–35.
Oparanya, W. A. (2010): 2009 population & housing census results. [Accessed 18 Jan. 2018]
Puglielli, A. (2001): Language and identity. In: Lilius (ed.), Variations on the theme of Somaliness, Turku, pp. 203–211.
Puglielli, A. / Mansuur, C. C. (2012): Qaamuuska Af-Soomaaliga. Roma.
Qutbi et al. (1961): Linguistic report 1961. The report of the Somali Language Committee. Mogadishu.
Raabbi, M. X. (1994): Naxwaha sifayneed ee Afsoomaaliga: Ereyeynta. Lafoole.
Raabbi, M. X. (2014): Buugga weedhaynta. Hargeysa.
Saeed, J. I. (1993): Somali reference grammar. Kensington, MD.
Saeed, J. I. (1999): Somali. Amsterdam.
SCB (2017): Befolkning efter födelseland och ursprungsland 31 december 2016. [Accessed 16 Jan. 2018]
SomalilandGov (2010): Somaliland in figures. [Accessed 16 Jan. 2018]
Stepanjenko, D. I. / Osman, M. H. (1969): Abwan urursan af Soomaali iyo Rusha, Rush iyo af Soomaaliya. Mosko.
Tosco, M. (2012): The unity and diversity of Somali dialectal variants. In: Ogechi / Ngala Odour / Iribemwangi (eds.), The harmonization and standardization of Kenyan languages: Orthography and other aspects, Cape Town, pp. 263–280.
Tosco, M. (2015): Short note on Somali previous scripts. In: Ismaaciil / Mansuur / Sharci (eds.), Afmaal: Proceedings of the Conference on the 40th Anniversary of Somali Orthography (Djibouti 2012), pp. 189–217.
UN (2017): World Population Prospects 2017. [Accessed 16 Jan. 2018]
Wikipedia (2018): Somali diaspora. [Accessed 16 Jan. 2018]