Implemented Stemming Algorithms for Six Ethiopian Languages


International Journal of Advanced Science and Technology, Vol. 29, No. 7, (2020), pp. 2532-2536

Wubetu Barud Demilie
Department of Information Technology, Wachemo University, Hossana, Ethiopia, P.O. Box 667
[email protected] or [email protected]

Abstract

Text stemming is the process of removing suffixes, infixes and sometimes prefixes from words to arrive at the base word. It is one of the pipeline stages of most Natural Language Processing Applications (NLPAs) and is commonly used in natural language processing (NLP) and in text mining. The main problem in developing a stemmer is to identify and remove every kind of affix, since the six languages I have selected for analysis have different characteristics, sentence structures and grammatical rules. This paper analyzes the approaches that researchers have implemented for the selected languages. I discuss the types of stemming approaches, give an overview of the available and most widely used stemmers for the selected languages, briefly compare these stemmers and their evaluation results, and analyze the available stemmers on the languages experimentally. Based on this analysis and experiment, I conclude with and recommend the final results of the stemmers for the languages.

Keywords: Affix, Analysis, Approach, Ethiopian Language, Natural Language Processing Application, Stemming Approaches.

I. INTRODUCTION

In information retrieval, stems help to advance natural language processing. For example, in information retrieval systems, stemming makes it possible to index documents by topic and to expand a query to obtain more accurate and precise results. All information retrieval applications aim to improve recall and precision.
Stemming is a recall-enhancing method that can help even the simplest Boolean retrieval systems. It is a preprocessing step in text mining applications as well as a very common requirement of natural language processing functions. In fact, it is very important in most information retrieval applications. Different types of stemming approaches, which differ in performance and accuracy, have been implemented for the languages covered in this study. All stemming algorithms are language dependent: the morphological forms of the six selected Ethiopian languages differ considerably, including in their structure.

“In English language, a word might change into inflectional or derivational forms”. For example, ‘walking’, ‘walked’ and ‘walks’ are inflectional forms that can be mapped to the root word ‘walk’. On the other hand, the words ‘doing’ and ‘does’ have the root ‘do’; however, the past participle of ‘do’ is the irregular form ‘done’.

The selected languages are agglutinative languages with rich morphological structures. Words in these languages are formed by adding suffixes, infixes and prefixes to a root word. Some words are composed by appending a combination of two or three affixes to the root word. These rich structures and unique composition rules are what the stemmers available for the study have to cater for. Hence, this paper analyzes the implemented stemming approaches and identifies the best, recommended approach for the selected morphologically rich languages. In this paper, I discuss different stemming approaches for the Amharic, Afan Oromo, Tigrinya, Wolaita, Kambaata and Awngi languages.

ISSN: 2005-4238 IJAST. Copyright ⓒ 2020 SERSC

Generally, this paper describes the different types of stemming approaches, which behave differently on corpora of different sizes, and presents a comparative analysis of the approaches in terms of stem production, efficiency and effectiveness in information retrieval systems.

II. STEMMING APPROACHES

According to [1] there are two stemming approaches. The first is simple, context-free stemming, whose main objective is to identify affixes and remove them. The second is lemmatization. In lemmatization, the developer has to have a good knowledge of the language and its grammatical rules. It also requires a dictionary lookup; therefore, it is more complex than stemming. However, lemmatization is expected to give more accurate and precise results. For example, the word ‘better’ has the lemma ‘good’. Words of this type cannot be handled by basic stemming approaches unless they use a dictionary lookup table.

Different types of stemming approaches are available for different languages, and they differ in performance and accuracy. Four (4) stemming approaches are discussed in this paper, namely rule based, successor variety, hybrid and longest match.

II.I. Rule Based Approach

The rule based approach has been implemented by different researchers and is composed of two parts: a rule-based light stemmer and a pattern-based infix remover. The rule-based light stemmer removes prefixes and suffixes from the word according to specific rules. The pattern-based infix remover removes infixes from the word according to specific patterns. This combination is referred to here as the rule based approach [3][4][5][6][7][8].

II.II. Successor Variety Approach

According to [9] successor variety is one of the stemming approaches used in natural language processing applications, especially in information retrieval systems.
In this approach, the successor variety of a string is the number of different characters that follow the string in words in a corpus (the body of text). The successor variety of substrings of a term will decrease as more characters are added until a segment boundary is reached. A successor variety stemmer does not require the preparation of suffix lists and removal rules, and hence can be adapted to a changing text collection.

According to [10] the successor variety approach uses the frequencies of letter sequences in a body of text as the basis of stemming. In less formal terms, the successor variety of a string is the number of different characters that follow it in words in some body of text. Consider, for example, a body of text consisting of the words back, beach, body, backward and boy. To determine the successor varieties for ‘battle’, the following process would be used. The first letter of ‘battle’ is ‘b’. In the text body, ‘b’ is followed by three characters: ‘a’, ‘e’ and ‘o’. Thus, the successor variety of ‘b’ is three. The next successor variety for ‘battle’ would be one, since only ‘c’ follows ‘ba’ in the text. When this process is carried out using a large body of text, the successor variety of substrings of a term will decrease as more characters are added until a segment boundary is reached. At this point, the successor variety will sharply increase. This information is used to identify stems.

II.III. A Hybrid Approach

According to [11] and [12] hybrid approaches use two or more of the approaches in combination. A simple example is a suffix tree approach which first consults a lookup table using a brute force approach. However, instead of trying to store the entire set of relations between words in a given language, the lookup table is kept small and is only used to store a small number of ‘frequent exceptions’ like ‘ran => run’.

II.IV. Longest Match Approach

According to [13] the longest match approach removes the longest suffix possible.
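A minimal sketch of longest-match suffix stripping, in Python, may make the idea concrete. The suffix list, the function name `longest_match_stem` and the `min_stem_len` guard are assumptions made for illustration only; they are not taken from [13] or from any of the stemmers surveyed here, which rely on language-specific affix lists and additional context-sensitive rules.

```python
# Hypothetical English suffix list, for illustration only; a real stemmer
# would use a language-specific list compiled by a linguist.
SUFFIXES = ["ness", "ful", "fulness", "ing", "ed", "s"]


def longest_match_stem(word, suffixes=SUFFIXES, min_stem_len=2):
    """Strip the longest listed suffix that leaves at least min_stem_len characters."""
    # Try candidates from longest to shortest so the longest match wins.
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word  # no listed suffix applies


print(longest_match_stem("fruitfulness"))  # fruit
print(longest_match_stem("walked"))        # walk
```

Sorting the candidates by length is what distinguishes this from an iterative stemmer, which would strip ‘ness’ first and then ‘ful’ in a second pass.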
For example, if the word ‘fruitfulness’ is considered, the suffixes in the word are ‘ness’, ‘ful’ and ‘fulness’. Therefore, the approach removes ‘fulness’ from the word. The problem with the longest match approach, compared to other methods, is that it requires generating all possible combinations of affixes, with the processing time and storage space this entails, and that it has to handle the changes affixes undergo during concatenation.

III. ANALYSIS OF STEMMING APPROACHES FOR THE SELECTED LANGUAGES

I have discussed the four most popular stemming approaches that researchers have used for the selected Ethiopian languages, and I have analyzed each of the approaches used for the six selected languages. Each stemmer was purposely developed for a specific language; however, some observations can be made. The following table summarizes the analysis of the approaches for the selected Ethiopian languages.

Table 1: Summary of implemented algorithms for six Ethiopian languages

| No | Language | Primary Researcher | Conflation Technique | Sensitive in Context? | Error Rate | Accuracy |
|----|----------|--------------------|----------------------|-----------------------|------------|----------|
| 1 | Amharic | Nega Alemayehu | Rule Based (Iterative) | Yes | 4.01% | 95.90% |
| 2 | Amharic | Atelach Alemu and Lars Asker | Affix removal & Dictionary Based | No | 25% | 75% |
| 3 | Amharic | Genet Mezemir | Successor Variety Approach (peak and plateau method) | Yes | 28.2% | 71.8% |
| 4 | Afan Oromo | Mekonnen Wakshum | Rule Based (Longest Match) | Yes | 7.48% | 92.52% |
| 5 | Afan Oromo | Debela Tesfaye | Rule Based (Iterative) | Yes | 4.27% | 94.84% |
| 6 | Kambaata | Jonathan Samuel | Rule Based (Longest Match) | Yes | 3.13% | 96.87% |
| 7 | Ge’ez | Abebe Belay and Yibeltal Chanie | Rule Based | Yes | 17.58% | 82.42% |
| 8 | Wolaita | Lemma Lessa | Longest Match | Yes | 4.01% | 95.9% |
| 9 | Afan Oromo | Debela Tesfaye | A hybrid Approach | Yes | 4.27% | 95.73% |
| 10 | Kambaata | Jonathan Samuel & Solomon Teferra | Rule Based | Yes | 4.01% | 95.9% |
| 11 | Silt’e | Muzeyn Kedir | Longest Match | Yes | 14.28% | 85.71% |
| 12 | Tigrinya | Omer Osman and Yoshiki Mikami | Hybrid Approach | Yes | 10.7% | 89.3% |
| 13 | Awngi | Tsegaye misikir | Longest Match | Yes | 8.59% | 91.41% |
| 14 | Wolaita | Girma Yohannis Bade and Hussien Seid | Longest Match | Yes | 8.16% | 91.84% |
| 15 | | | | | | |
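The successor variety counts worked through in Section II.II can be reproduced with a short Python sketch. The function name `successor_varieties` is an assumption made for illustration and does not come from [9], [10] or any stemmer listed in Table 1; the corpus and the test word are the ones used in the text. The sketch only computes the counts and does not implement the peak and plateau segmentation step.

```python
def successor_varieties(word, corpus):
    """Return (prefix, successor variety) pairs for every prefix of word."""
    result = []
    for i in range(1, len(word) + 1):
        prefix = word[:i]
        # Distinct characters that directly follow this prefix in corpus words.
        followers = {w[i] for w in corpus if w.startswith(prefix) and len(w) > i}
        result.append((prefix, len(followers)))
    return result


corpus = ["back", "beach", "body", "backward", "boy"]
for prefix, variety in successor_varieties("battle", corpus):
    print(prefix, variety)
# b 3
# ba 1
# bat 0   (and 0 for every longer prefix)
```

With such a small corpus the counts only fall; on a large body of text, a sharp rise in the count after a prefix is what signals a likely segment boundary.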