Dissertation
Recommended publications
Remarks on the Nasal Classes in Mungbam and Naki
WS1 Remarks on the nasal classes in Mungbam and Naki
Mungbam and Naki are two non-Grassfields Bantoid languages spoken along the northwest frontier of the Grassfields area to the north of the Ring languages. Until recently, they were poorly described, but new data reveals them to show significant nasal noun class patterns, some of which do not appear to have been previously noted for Bantoid. The key patterns are:
1. Like many other languages of their region (see Good et al. 2011), they make productive use of a mysterious diminutive plural prefix with a form like mu-, with associated concords in m, here referred to as Class 18a (see Hyman 1980:187).
2. The five dialects of Mungbam show a level of variation in their nasal classes that one might normally expect of distinct languages.
a. Two dialects show no evidence for nasals in Class 6. Two other dialects, Munken and Ngun, show a Class 6 prefix on nouns of form a- but nasal concords. In Munken Class 6, this nasal is n, clearly distinct from an m associated with 6a; in Ngun, both 6 and 6a are associated with m concords. The Abar dialect shows a different pattern, with Class 6 nasal concords in m and nasal prefixes on some Class 6 nouns.
b. The Abar, Biya, and Ngun dialects show a Class 18a prefix with form mN-, rather than the more regionally common mu-. This reduction is presumably connected to perseveratory nasalization attested throughout the languages of the region with a diachronic pathway along the lines of mu- > mũ- > mN- perhaps providing a partial example for the development of Bantu Class 9/10.
A Corpus-Driven Account of the Noun Classes and Genders in Northern Sotho
Southern African Linguistics and Applied Language Studies 2016, 34(2): 169–185. Copyright © NISC (Pty) Ltd. ISSN 1607-3614, EISSN 1727-9461. http://dx.doi.org/10.2989/16073614.2016.1206478
A corpus-driven account of the noun classes and genders in Northern Sotho
Elsabé Taljard (1, corresponding author) and Gilles-Maurice de Schryver (1, 2)
(1) Department of African Languages, University of Pretoria, Pretoria, South Africa
(2) BantUGent – UGent Centre for Bantu Studies, Department of Languages and Cultures, Ghent University, Belgium
Abstract: This article offers a distributional corpus analysis of the Northern Sotho noun and gender system. The aim is twofold: first, to assess whether the existing descriptions of the noun class system in Northern Sotho are corroborated by information provided by the analysis of a large electronic corpus for this language, with specific reference to singular-plural pairings, and second, to present a number of novel visualisation aids to characterise a noun class system (in a radar diagram) and a noun gender system (using a two-directional weighted representation) for Northern Sotho in particular, and for any Bantu language in general. The findings include the discovery of two new genders in Northern Sotho (i.e. class pairs 1/6 and 3/10), and also indicate that the Northern Sotho noun class system, and by extension any one for Bantu, should be seen as dynamic.
Introduction
The present undertaking adds to a relatively small, but growing number of corpus-based grammatical descriptions of Bantu language features, several of which regard South African languages.
The Standardisation of African Languages Michel Lafon, Vic Webb
The Standardisation of African Languages
Michel Lafon, Vic Webb
To cite this version: Michel Lafon, Vic Webb. The Standardisation of African Languages. Michel Lafon; Vic Webb. IFAS, pp.141, 2008, Nouveaux Cahiers de l’Ifas, Aurelia Wa Kabwe Segatti. halshs-00449090
HAL Id: halshs-00449090, https://halshs.archives-ouvertes.fr/halshs-00449090. Submitted on 20 Jan 2010.
HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
The Standardisation of African Languages: Language political realities. CentRePoL and IFAS. Proceedings of a CentRePoL workshop held at University of Pretoria on March 29, 2007, supported by the French Institute for Southern Africa. Michel Lafon (LLACAN-CNRS) & Vic Webb (CentRePoL), Compilers/Editors.
CentRePoL wishes to express its appreciation to the following: Dr. Aurelia Wa Kabwe-Segatti, Research Director, IFAS, Johannesburg, for her professional and material support; PanSALB, for their support over the past two years for CentRePoL’s standardisation project; The University of Pretoria, for the use of their facilities.
Les Nouveaux Cahiers de l’IFAS/ IFAS Working Paper Series is a series of occasional working papers, dedicated to disseminating research in the social and human sciences on Southern Africa.
LCSH Section W
W., D. (Fictitious character) USE D. W. (Fictitious character)
W.12 (Military aircraft) USE Hansa Brandenburg W.12 (Military aircraft)
W.13 (Seaplane) USE Hansa Brandenburg W.13 (Seaplane)
W.29 (Military aircraft) USE Hansa Brandenburg W.29 (Military aircraft)
W.A. Blount Building (Pensacola, Fla.) UF Blount Building (Pensacola, Fla.) BT Office buildings—Florida
W Award USE Prix W
W.B. Umstead State Park (N.C.) USE William B. Umstead State Park (N.C.)
W bosons [QC793.5.B62-QC793.5.B629] UF W particles BT Bosons
W. Burling Cocks Memorial Race Course at Radnor Hunt (Malvern, Pa.) UF Cocks Memorial Race Course at Radnor Hunt (Malvern, Pa.) BT Racetracks (Horse racing)—Pennsylvania
W-cars USE General Motors W-cars
W.
William Kerr Scott Lake (N.C.); William Kerr Scott Reservoir (N.C.) BT Reservoirs—North Carolina
W particles USE W bosons
W-platform cars USE General Motors W-cars
W. R. Holway Reservoir (Okla.) UF Chimney Rock Reservoir (Okla.); Holway Reservoir (Okla.) BT Lakes—Oklahoma; Reservoirs—Oklahoma
W. R. Motherwell Farmstead National Historic Park (Sask.) USE Motherwell Homestead National Historic Site (Sask.)
W. R. Motherwell Stone House (Sask.) UF Motherwell House (Sask.); Motherwell Stone House (Sask.) BT Dwellings—Saskatchewan
W.S. Payne Medical Arts Building (Pensacola, Fla.) UF Medical Arts Building (Pensacola, Fla.); Payne Medical Arts Building (Pensacola, Fla.) BT Office buildings—Florida
W star algebras USE C*-algebras
Waaddah Island (Wash.) BT Islands—Washington (State)
Waaddah Island (Wash.) USE Waadah Island (Wash.)
Waag family USE Waaga family
Waag River (Slovakia) USE Váh River (Slovakia)
Waaga family (Not Subd Geog) UF Vaaga family; Waag family; Waage family
Waage family USE Waaga family
Waahi, Lake (N.Z.) UF Lake Rotongaru (N.Z.); Lake Waahi (N.Z.); Lake Wahi (N.Z.); Rotongaru, Lake (N.Z.); Wahi, Lake (N.Z.) BT Lakes—New Zealand
Waʻahila Ridge (Hawaii) BT Mountains—Hawaii
Waaihoek (KwaZulu-Natal, South Africa) USE Waay Hoek (KwaZulu-Natal, South Africa : Farm)
The Boarnsterhim Corpus: a Bilingual Frisian-Dutch Panel and Trend Study
The Boarnsterhim Corpus: A Bilingual Frisian-Dutch Panel and Trend Study
Marjoleine Sloos, Eduard Drenth, Wilbert Heeringa. Fryske Akademy, Royal Netherlands Academy of Sciences (KNAW), Doelestrjitte 8, 8911 DX Ljouwert, The Netherlands. {msloos, edrenth, wheeringa}@fryske-akademy.nl
Abstract: The Boarnsterhim Corpus consists of 250 hours of speech in both West Frisian and Dutch by the same sample of bilingual speakers. The corpus contains original recordings from 1982-1984 and a replication study recorded 35 years later. The data collection spans speech of four generations, and combines panel and trend data. This paper describes the Boarnsterhim Corpus halfway through the project, which started in 2016, and describes the way it was collected, the annotations, potential use, and the envisaged tools and end-user web application.
Keywords: West Frisian, Dutch, sociolinguistics, language variation and change, bilingualism, phonetics, phonology
1. Background
West Frisian is mostly spoken in the province of Fryslân in the north of the Netherlands. All its speakers are bilingual with Dutch, which is the dominant language. West Frisian is mainly used in informal settings (van Bezooijen 2009:302) but also the most formal, viz. in the provincial parliament. In semi-formal interactions (e.g. shopping, in church), Dutch is usually the preferred language.
The remainder of this paper describes the data collection and methodology in section 2. Section 3 describes the embedding in a larger infrastructure and section 4 provides background information about the tool that is used to retrieve lexical frequency. Section 5 discusses previous and ongoing research, and further research opportunities that this database may make possible. Finally, section 6 concludes.
Modeling a Historical Variety of a Low-Resource Language: Language Contact Effects in the Verbal Cluster of Early-Modern Frisian
Modeling a historical variety of a low-resource language: Language contact effects in the verbal cluster of Early-Modern Frisian
Jelke Bloem (ILLC, University of Amsterdam), Arjen Versloot (ACLC, University of Amsterdam), Fred Weerman (ACLC, University of Amsterdam)
Abstract: Certain phenomena of interest to linguists mainly occur in low-resource languages, such as contact-induced language change. We show that it is possible to study contact-induced language change computationally in a historical variety of a low-resource language, Early-Modern Frisian, by creating a model using features that were established to be relevant in a closely related language, modern Dutch. This allows us to test two hypotheses on two types of language contact that may have taken place between Frisian and Dutch during this time. Our model shows that Frisian verb cluster word orders are associated with different context features than Dutch verb orders, supporting the ‘learned borrowing’ hypothesis.
which are by definition used less, and are less likely to have computational resources available. For example, in cases of language contact where there is a majority language and a lesser used language, contact-induced language change is more likely to occur in the lesser used language (Weinreich, 1979). Furthermore, certain phenomena are better studied in historical varieties of languages. Taking the example of language change, it is more interesting to study a specific language change once it has already been completed, such that one can study the change itself in historical texts as well as the subsequent outcome of the change. For these reasons, contact-induced language change is difficult to study computationally, and
Modernising Tbe Lexicon of Tbe West Frisian Language
Modernising tbe Lexicon of tbe West Frisian Language Hindrik Sijens (LjouwertlLeeuwarden) A new toy in the world of modem electronic communications has arrived in Fryslän, namely Internet, and along with it e-mail. When asked what these new media might be ca lied in Frisian, the answers internet and e-mail are hardly surprising. We can see that both Fryslän and the Frisian language are also tak ing part in the digitalisation of the world. New concepts, new words in tech nology retain their original Anglo-American names in most European lan guages. This paper will deal with the process of developing and renewing the Frisian lexicon. But first I am going to give a brief introduction to Fryslän itself. Fryslän is one ofthe twelve provinces ofthe kingdom ofthe Netherlands and is situated in the northem part ofthe country. Fryslän is a flat province, much of its territory being below sea level. It is protected trom the sea by dikes. The capitalof Fryslän is Ljouwert, or Leeuwarden, as it is known in Dutch. In the past, Fryslän was an agricultural pro vin ce with more cattle living there than people. Nowadays farming still plays an important role in Fryslän's economy, but both the service industries and industry itself have also become important. Fryslän has a population of about 600,000 and approximately 55% of them consider Frisian to be their first language. While about 74% of the population claim that they can speak Frisian, 94% say that they can understand it when it is spoken, 65% claim that they can read Frisian, but only 17% say that they can write it (Gorter & Jonkman 1995:9-11). -
Adapting Monolingual Models: Data Can Be Scarce When Language Similarity Is High (arXiv:2105.02855v2 [cs.CL], 22 May 2021)
Adapting Monolingual Models: Data can be Scarce when Language Similarity is High
Wietse de Vries∗, Martijn Bartelds∗, Malvina Nissim and Martijn Wieling. University of Groningen, The Netherlands. fwietse.de.vries, m.bartelds, m.nissim, [email protected]
Abstract: For many (minority) languages, the resources needed to train large models are not available. We investigate the performance of zero-shot transfer learning with as little data as possible, and the influence of language similarity in this process. We retrain the lexical layers of four BERT-based models using data from two low-resource target language varieties, while the Transformer layers are independently fine-tuned on a POS-tagging task in the model’s source language. By combining the new lexical layers and fine-tuned Transformer layers, we achieve high task performance for both target languages. With high language similarity, 10MB of data appears sufficient to achieve substantial monolingual transfer per-
languages that are not included in mBERT pre-training usually show poor performance (Nozza et al., 2020; Wu and Dredze, 2020). An alternative to multilingual transfer learning is the adaptation of existing monolingual models to other languages. Zoph et al. (2016) introduce a method for transferring a pre-trained machine translation model to lower-resource languages by only fine-tuning the lexical layer. This method has also been applied to BERT (Artetxe et al., 2020) and GPT-2 (de Vries and Nissim, 2020). Artetxe et al. (2020) also show that BERT models with retrained lexical layers perform well in downstream tasks, but comparatively high performance has only been demonstrated for languages for which at least 400MB of data is available.
[.35 **Natural Language Processing Class Here Computational Linguistics See Manual at 006.35 Vs
Dewey Decimal Classification
006
.35 **Natural language processing
Class here computational linguistics
See Manual at 006.35 vs. 410.285
*Use notation 019 from Table 1 as modified at 004.019
400 Language
400 *‡Language
Class here interdisciplinary works on language and literature
For literature, see 800; for rhetoric, see 808. For the language of a specific discipline or subject, see the discipline or subject, plus notation 014 from Table 1, e.g., language of science 501.4
(Option A: To give local emphasis or a shorter number to a specific language, class in 410, where full instructions appear
(Option B: To give local emphasis or a shorter number to a specific language, place before 420 through use of a letter or other symbol. Full instructions appear under 420–490)
SUMMARY
401–409 Standard subdivisions and bilingualism
410 Linguistics
420 English and Old English (Anglo-Saxon)
430 German and related languages
440 French and related Romance languages
450 Italian, Dalmatian, Romanian, Rhaetian, Sardinian, Corsican
460 Spanish, Portuguese, Galician
470 Latin and related Italic languages
480 Classical Greek and related Hellenic languages
490 Other languages
401 *‡Philosophy and theory
See Manual at 401 vs. 121.68, 149.94, 410.1
.3 *‡International languages
Class here universal languages; general
A Balanced and Representative Corpus: the Effects of Strict Corpus-Based Dictionary Compilation in Sesotho Sa Leboa* V.M
http://lexikos.journals.ac.za
A Balanced and Representative Corpus: The Effects of Strict Corpus-based Dictionary Compilation in Sesotho sa Leboa*
V.M. Mojela, Sesotho sa Leboa National Lexicography Unit, University of Limpopo, Turfloop Campus, Polokwane, South Africa
Abstract: Theoretically the Northern Sotho language is made up of almost 30 dialects while practically it is not so, because the standard language was formed from very few of its dialects. As a result, even today the language has no corpus which is balanced or representative owing to the fact that almost all of the available corpora are compiled from the written standard language and the written dialects. The majority of the Northern Sotho dialects do not have written orthographies, and the few dialects which had written orthographies prior to standardization came to monopolize the standard language and the Northern Sotho corpora. Therefore, the compilation of a corpus-based dictionary in Northern Sotho is tantamount to a continuation of producing unbalanced and unrepresentative dictionaries, which continue to sideline and to marginalize the majority of the communities and the linguistic varieties which could potentially enrich both the Northern Sotho standard language and the Northern Sotho corpora. The main objective with this research is to analyze, to expose and to suggest ways of correcting these irregularities so that the marginalized Northern Sotho dialects can be accommodated in the standard language. This will obviously increase the size of the Northern Sotho standard language and the corpus by more than 50%.
Keywords: CORPUS, BALANCED CORPUS, REPRESENTATIVE CORPUS, STANDARDIZATION, DIALECT, ORTHOGRAPHY, MARGINALIZED DIALECTS, PRESTIGE DIALECTS, MISSIONARY ACTIVITIES
Opsomming: 'n Gebalanseerde en verteenwoordigende korpus: Die gevolge van streng korpusgebaseerde woordeboeksamestelling in Sesotho sa Leboa.
Tsotsitaal Special Issue.Indb
Southern African Linguistics and Applied Language Studies. Publisher: Routledge. Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK. Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/rall20
Language and youth identity in a multilingual setting: A multimodal repertoire approach
Anthea Bristowe, Marcelyn Oostendorp & Christine Anthonissen (Stellenbosch University, South Africa). Published online: 23 Dec 2014.
To cite this article: Anthea Bristowe, Marcelyn Oostendorp & Christine Anthonissen (2014) Language and youth identity in a multilingual setting: A multimodal repertoire approach, Southern African Linguistics and Applied Language Studies, 32:2, 229-245, DOI: 10.2989/16073614.2014.992644
To link to this article: http://dx.doi.org/10.2989/16073614.2014.992644
Taylor & Francis makes every effort to ensure the accuracy of all the information (the “Content”) contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content.
Adjective Stacking and Classification in Northern Sotho: a Southern Bantu Language of South Africa
Adjective Stacking in Northern Sotho
Item Type: Meetings and Proceedings
Authors: Flanagan, Paul
Citation: Flanagan, P.J. (2013). Adjective stacking and classification in Northern Sotho, a Bantu language of South Africa. In Olmos-Lopez, B-P., Huang, J., & Almeida, J. (Eds). Papers from the LAEL Conference, 8 pp4-40. Lancaster, United Kingdom: Lancaster University
Publisher: University of Lancaster
Journal: Papers from the LAEL Conference 2013
Item License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Link to Item: http://hdl.handle.net/10034/610465
Papers from the Lancaster University Postgraduate Conference in Linguistics & Language Teaching 2012
Adjective Stacking and Classification in Northern Sotho: A Southern Bantu Language of South Africa
Abstract: In this paper, I investigate the nature of complex nominal modification in Northern Sotho, a Southern Bantu language and an official language of South Africa. Adjectives in Northern Sotho have traditionally been recognised as a subclass of nouns, based on morphological similarities between nouns and adjectives. Based on recent work which proposes that all languages have a distinct word class ‘adjective’, I argue that adjectives in Northern Sotho constitute an independent grammatical category. I base this suggestion on the common morpho-syntactic behaviour of members of this class and present an in-depth analysis of the ordering of elements in Northern Sotho poly-adjectival nominal phrases. There has been some limited discussion of the theory that there are universal structures in adjective order across different languages, although sequencing in languages with postnominal adjectives is desperately under-researched. Using a combination of corpus data and original fieldwork, I provide support for the suggestion that there are patterns in the syntax of complex modification strings which operate on a universal level, above that of individual languages.