Annotation Manual for the NPCMJ


Stephen Wright Horn, Iku Nagasaki, Alastair Butler, and Kei Yoshimoto

Contents

1 Introduction
2 Tags
  2.1 General principles
  2.2 Part-of-speech tags
  2.3 Syntactic tags
  2.4 Tag extensions to specify clause linkage
  2.5 Other tags
3 General parsing principles
  3.1 Overview
  3.2 The schema of constituents
  3.3 Terminal nodes:
  3.4 Flat phrase-structure
  3.5 Endocentric structure and exceptions thereto
4 Segmentation and part-of-speech annotation
5 Basic clause structure
6 Null elements
  6.1 Null elements without indexing
    6.1.1 Null expletive
    6.1.2 Zero pronoun with generic impersonal reference
    6.1.3 Other zero pronouns
    6.1.4 Traces in relative clauses
  6.2 Null elements that are always indexed
  6.3 The position of null elements
7 Annotation of grammatical roles
  7.1 Core grammatical roles
    7.1.1 Explicitly marked arguments
    7.1.2 Implicitly marked arguments
    7.1.3 Omitted arguments
  7.2 Peripheral grammatical roles
    7.2.1 Explicitly marked adjuncts
    7.2.2 Implicitly marked adjuncts
    7.2.3 Adjunct traces
8 Complementizer phrases (CPs)
  8.1 Projection for sentence final particle (CP-FINAL)
  8.2 Questions (CP-QUE)
  8.3 Exclamative utterances (CP-EXL)
  8.4 Imperative clauses (CP-IMP)
9 Adnominal clauses
  9.1 Adnominal clauses with traces (IP-REL)
  9.2 Adnominal clauses without traces (IP-EMB)
  9.3 Adnominally used complementizer clauses (CP-THT)
  9.4 Internally headed relative clauses
  9.5 Resumptive pronouns
10 Nominalized clauses (IP-NML)
11 Control environments
  11.1 Control into small clauses (IP-SMC)
  11.2 Control into adverbial clauses (IP-ADV)
  11.3 Control into content complements of nouns (IP-EMB)
  11.4 Preventing control with null elements
12 Clause coordination
  12.1 Distinguishing between subordination and coordination
  12.2 Expressing coordination while maintaining flat clausal structure
  12.3 Multiple clausal conjuncts
  12.4 Summary
13 Non-clausal coordination (CONJP)
  13.1 Coordinated NPs
  13.2 Coordinated PPs
  13.3 Coordinated ADVPs
14 Quantification
  14.1 Quantifiers (Q) and numeral-classifier phrases (NUMCLP)
  14.2 Prenominal expressions with Q
  14.3 Prenominal expressions with NUMCLP
  14.4 Appositive expressions with Q
  14.5 Appositive expressions with NUMCLP
  14.6 Floating expressions with Q
  14.7 Floating expressions with NUMCLP
  14.8 Referring expression with Q
  14.9 Referring expressions with NUMCLP
  14.10 Host-less adverbial expressions with Q
  14.11 Host-less adverbial expressions with NUMCLP
  14.12 Quantifying expressions with W
    14.12.1 W expressions with も
    14.12.2 W expressions with か
15 Particles/Postpositions (P)
  15.1 Particles for core grammatical roles (P-ROLE): が, を, に, と, の, etc.
    15.1.1 Role particle が
    15.1.2 Role particle を
    15.1.3 Role particle に
    15.1.4 Role particle と
    15.1.5 Role particle の
    15.1.6 Role-particles marking logical subjects (LGS)
    15.1.7 Role-particles marking secondary objects (OB2)
    15.1.8 Other particles marking subjects (SBJ) and primary objects (OB1)
  15.2 Particles for peripheral roles (P-ROLE): の, に, へ, で, から, まで, と, etc.
  15.3 Complex particles
  15.4 Non-clausal connective particles (P-CONN)
  15.5 Clausal conjunctive particles (P-CONN)
    15.5.1 Conjunctive particles subordinating conditional IPs
    15.5.2 Conjunctive particles subordinating non-conditional IPs
    15.5.3 Conjunctive particles coordinating independent IPs
  15.6 Complementizer particles: と, という, etc.
    15.6.1 Content complements of predicates
    15.6.2 Content complements of nouns
  15.7 Sentence-final particles: か, ね and よ, etc.
  15.8 Interjectional particles (P-INTJ)
  15.9 Toritate (focus) particles: か, しか, は, ばかり, も, etc.
  15.10 Particles as clausal constituents
    15.10.1 Connective particles as final elements under IP-ADV
    15.10.2 Connective/toritate/role particles within complex predicates
    15.10.3 Verbal nouns marked with を
  15.11 Particle burying
  15.12 Particle stacking
  15.13 Particle omission
16 Predicates
  16.1 Verbs (VB)
  16.2 Predicate extensions
  16.3 Light verbs (VB0)
  16.4 Secondary verbs (VB2)
  16.5 Special cases for verbs する and なる
  16.6 い-adjectives (ADJI)
  16.7 Special cases (ADJI)
  16.8 Past tense (AXD)
  16.9 Auxiliary verbs (AX)
  16.10 Special cases for the auxiliary た/だ
  16.11 Modal elements (MD)
  16.12 Copular expressions
    16.12.1 Copulas and copula drop
    16.12.2 Shapes of copulas
  16.13 な-adjectives (ADJN)
  16.14 Special cases and exceptions to な-adjectives
  16.15 Nominal predicates (NP-PRD)
  16.16 PPs as copular complements (PP-PRD)
  16.17 Formal noun plus copula
  16.18 のだ construction
  16.19 Criteria for analyzing の as a copula
    16.19.1 Coordination test for adnominal の
17 Noun phrases (NP)
  17.1 Heads of noun phrases
  17.2 Possessive NPs (NP-POS)
  17.3 Noun modifiers
    17.3.1 Determiners (D) and WH-Determiners (WD)
    17.3.2 Prenominals (PNL)
  17.4 Vocative NPs (NP-VOC)
  17.5 Topic NPs (NP-TPC)
18 Parenthetical layers (PRN)
19 Intermediate nominal layers (NML)
20 Prenominal phrases (PNLP)
21 Adverb phrases (ADVP)
22 Punctuation
23 Metadata (META)
24 Colloquial forms
  24.1 Interjection phrases (INTJP)
  24.2 False starts (FS)
  24.3 Ellipsis
  24.4 Afterthoughts
  24.5 Contractions
  24.6 Formal noun こと denoting experience or precedent
25 Constructions
  25.1 Double subject sentences
  25.2 N-bar deletion
  25.3 Right node raising construction
  25.4 Verbless depictive (absolute) clauses
  25.5 Multiple sentences in a quotation
  25.6 Focused pseudocleft constructions ..