Malayalam Morphosyntax: Inflectional Features and Their Acquisition


Submitted in partial fulfillment of the requirements of the degree of Doctor of Philosophy

by
Gayathri G
Roll No. 134083003

Supervisor: Prof. Vaijayanthi M Sarma

Department of Humanities & Social Sciences
INDIAN INSTITUTE OF TECHNOLOGY BOMBAY
2019

For Priya …

Thesis Approval

This thesis entitled “Malayalam Morphosyntax: Inflectional Features and their Acquisition” by Gayathri G is approved for the degree of Doctor of Philosophy.

Examiners
Supervisor
Chairman

Date: 14-11-2019
Place: Mumbai

Declaration

I declare that this written submission represents my ideas in my own words and where others’ ideas or words have been included, I have adequately cited and referenced the original sources. I also declare that I have adhered to all principles of academic honesty and integrity and have not misrepresented or fabricated or falsified any idea/data/fact/source in my submission. I understand that any violation of the above will be cause for disciplinary action by the Institute and can also evoke penal action from the sources which have thus not been properly cited or from whom proper permission has not been taken when needed.

Gayathri G
Roll No. 134083003
Date: 14-11-2019

Abstract

Malayalam, which belongs to the South-Dravidian language family, is an agglutinative language with rich inflectional morphology. The aim of the thesis is to analyse the grammar and acquisition of Malayalam verbal inflections (tense, aspect and mood) and nominal inflections (case, number, and gender). Within the larger discussion of inflectional morphology and its acquisition, particular attention is paid to two complex morphological processes, a) the past tense formation of verbs and b) case assignment of subjects and objects.
In particular, the thesis will show the following: a) that the past tense marker selection is determined by different grammatical principles in underived and derived stems; specifically, phonotactics in the former and the lexical feature of transitivity in the latter; b) that the dative nominals of a class of predicates (variously labelled experiencer or dative subject or psych predicates) are in fact subjects using an array of empirical tests involving binding, control, accusative marking, and predicate alternation; and c) that inflections for number and object case rest on lexical features of the noun (stem) and the allomorphy is governed by these featural requirements. In looking at the developing grammar in the two subjects, the thesis will show that Malayalam inflectional grammar has quite direct consequences for the acquisition of inflectional morphology. Specifically, acquisition proceeds unobstructed when the mode of selection is phonological and offers more challenges when the mode of selection is morphological, i.e., when the selection depends on the learning of the lexical or grammatical features of the noun and verb stems.

Thus, using the interplay between acquisition and the grammatical description, we establish that in addition to the established factors that guide acquisition, mode of selection of an inflection plays a key role in determining the relative ease/difficulty in the acquisition of inflectional morphology. This follows quite neatly from the fact that children are phonologically competent even before much language is produced and that this module-competence could facilitate the acquisition of morphology. The thesis will argue that this is indeed the case.

Contents

Abstract
Contents
List of Figures
List of Tables
List of Abbreviations

1 Introduction
  1.1 Children and Language
  1.2 Inflections in Universal Grammar
  1.3 Dravidian Languages
    1.3.1 Grammatical Features of Dravidian Languages
  1.4 Malayalam and its Linguistic Features
  1.5 Organisation of the Thesis
  1.6 Contributions of the Thesis

2 Contextualising Inflectional Development
  2.1 Acquiring Inflections
  2.2 Major Theoretical Models on Inflectional Acquisition
    2.2.1 Phonological Theories on Inflectional Development
    2.2.2 Morphological Models on Inflectional Acquisition: Rule vs Analogy or Dual-Route vs. Single-Route Models
    2.2.3 Syntactic Theories on Inflectional Development
  2.3 Affix-Learning vs Parameter Setting
    2.3.1 Factors that Influence the Acquisition of Inflectional Morphemes
    2.3.2 Parameter Setting and the Acquisition of Inflections
  2.4 Different Approaches to the Development of Inflections

3 Subjects and Method
  3.1 Longitudinal Data
    3.1.1 Error Analysis
    3.1.2 Morphological Development
    3.1.3 Productivity
  3.2 Cross-Sectional Data

4 Characteristics of Inflectional Selection in Malayalam
  4.1 Introduction
  4.2 Malayalam Inflections and Agglutination
  4.3 Phonotactic Constraints on Inflectional Affixes
  4.4 Phonological vs Morphological Selection
  4.5 Acquisition and Mode of Affix Selection

5 The Grammar and Acquisition of Malayalam Verbal Inflections
  5.1 Verbal Inflections in Malayalam
  5.2 Acquisition of Verbal Inflections
  5.3 Acquisition of Malayalam TAM inflections
    5.3.1 Tense
    5.3.2 Aspect
    5.3.3 Mood
    5.3.4 Errors in TAM marking
  5.4 Verbal Inflections: Summary

6 Past Tense Morphology and Acquisition
  6.1 Introduction
  6.2 Past Tense of Underived Verbs
  6.3 Past Tense of Derived Verbs
  6.4 Intermediate Summary
  6.5 Interface Patterns in Affixed Verbs and Phonological Levelling
  6.6 Acquisition of Past Tense
    6.6.1 Production of the past tense affix -i
    6.6.2 Production of the Past Tense suffixes -t̪u and -n̪t̪u
    6.6.3 Errors and Error Analysis
  6.7 Conclusion

7 The Grammar and Acquisition of Malayalam Nominal Inflections
  7.1 Malayalam Nominal Inflections
  7.2 Cross-Linguistic Patterns in the Acquisition of Nominal Inflections
  7.3 Acquisition of Malayalam Nominal Inflections
    7.3.1 Case
    7.3.2 Gender
    7.3.3 Number
  7.4 Nominal Inflection: Errors in the Data
  7.5 Summary

8 Dative Subject Marking in Malayalam
  8.1 Introduction
  8.2 Structure of Dative Subject Constructions
  8.3 Acquisition of Dative Subjects
    8.3.1 Dative Marking in the Data
  8.4 Errors in the Production of Datives
  8.5 Conclusion

9 Conclusions and Further Research

A Linguistic Profiles of the Subjects’ Parents
B Examples of Discarded Items from the Transcripts
C Inflectional Paradigms in the Transcripts
D Past Tense Forms in the Transcripts
E Dative Subject Predicates in the Transcripts
F Dative-Assigning Modal Predicates in the Transcripts

References
Acknowledgements

List of Figures

1.1 Family tree of Dravidian languages. (Krishnamurti, 2003, p. 21)
2.1 Major inflectional acquisition models.
2.2 Different approaches to the development of inflections.
3.1 Sample picture of the ELAN file with data organised at different levels.
3.2 MLUs of subjects.
3.3 A’s vocabulary count.
3.4 H’s vocabulary count.
3.5 A’s inflectional production.
3.6 H’s inflectional production.
5.1 TAM inflections - Present -un̪n̪u.
5.2 TAM inflections - Future -um.
5.3 TAM inflections - Perfective -iʈʈɨ.
5.4 TAM inflections - Imperfective -uka + aaɳɨ.
5.5 TAM inflections - Imperative -oo.
5.6 TAM inflections - Imperative -ee.
5.7 TAM inflections - Imperative -aan.
5.8 Production of bare imperatives.
5.9 TAM inflections - Optative -aʈʈe.
6.1 Malayalam verb stems.
6.2 TAM inflections - Past -i.
6.3 TAM inflections - Past -t̪u.