Applying Universal Dependency to the Arapaho Language

Irina Wagner¹, Andrew Cowell¹, Jena D. Hwang²
¹University of Colorado Boulder, Department of Linguistics; ²IHMC
{irina.wagner, james.cowell}@colorado.edu, [email protected]

Proceedings of LAW X – The 10th Linguistic Annotation Workshop, pages 171–179, Berlin, Germany, August 11, 2016. © 2016 Association for Computational Linguistics

Abstract

This paper discusses the use of Universal Dependency for annotating Arapaho (Algonquian), a Native North American language. While some relations of the universal dependency scheme correspond perfectly to those in Arapaho, language-specific annotations of verbal arguments expose the problems of assuming certain syntactic categories across languages. By critiquing the influence of the grammatical structures of major European and Asian languages on the UD framework, this paper develops guidelines for annotating a poly-synthetic agglutinating language and sets a path toward a more comprehensive cross-linguistic approach to the syntactic annotation of language data.

1 Introduction

The recent initiatives to create a cross-linguistic scheme of annotation rely on Universal Dependency (UD) as a system for describing the syntactic connection between words (Nivre, 2015; de Marneffe et al., 2014). While research shows that this annotation type is effective not only for monolingual parsers but also cross-linguistically across multiple platforms, the universality of this approach rests on the assumption that languages share the syntactic structures of major, often European, languages (McDonald et al., 2013). Without doubt, those are also the languages that receive predominant attention in the computational sphere, the languages whose technological presence requires thorough analysis and annotation. However, if the goal of natural language processing is truly to develop a universal cross-linguistic strategy for annotating and analyzing linguistic data, it is important to attend to lesser described languages that may present strikingly different syntactic structures and dependencies.

When we applied the UD rules to annotate data from the Arapaho (Algonquian) language, several language-specific features fell outside of the charted labels. Since the language does not have a fixed word order and allows discontinuous constituency, dependencies on the previous word were avoided and re-analyzed. The most problematic dependency distinction in this language is the variation in relations between a verb and its arguments. This paper examines the correlation between the dependency relations in the UD scheme and their practical application to the Arapaho data. Using the UD framework, we create guidelines for annotating this data. For reasons of space, this paper focuses primarily on the argument structures defined by the UD and their correspondences to Arapaho syntactic patterns. An additional discussion of non-verbal roots and topicality problematizes some of the common assumptions made when pragmatic features are discounted in the analysis of syntactic dependencies.

In the following pages, we first provide a short note on the Arapaho language and the annotation procedures (2); discuss issues of mapping the UD labels for subjects, objects, and noun modifiers onto the Arapaho dependencies (3); define the mechanism for analyzing non-verbal roots (4); and suggest further ways of developing these annotation guidelines (5).

2 Arapaho data and annotations

Arapaho is an Algonquian poly-synthetic agglutinating language spoken by fewer than 200 people on the Wind River Indian Reservation in Wyoming. Because the language is in critical condition, there have been attempts at documenting and preserving it. A large transcribed and annotated spoken corpus has been created, and parts of it are now available in the Endangered Languages Archive.¹ A total of around eighty thousand lines, transcribed, translated, and grammatically analyzed, is available for further processing. The current attempts at establishing a dependency scheme for this language initiate a new type of analysis of this data that allows machine processing.

¹ http://elar.soas.ac.uk/deposit/0194

2.1 Some features of the Arapaho language

The current paper relies largely on the description of Arapaho grammar by Cowell and Moss (2008). There are several intriguing features of the grammar, but the ones most relevant to this study are its complex verbal morphology, split semantic and syntactic transitivity, and the system of obviation.

2.1.1 Verbal complexity

As in many other poly-synthetic languages, Arapaho verbs are highly complex and mark multiple grammatical and semantic features. In example (1), a single verb incorporates not only the usual tense, aspect, mode, person, and number features, but also the manner of action and an incorporated object.

(1) he’ih’ii-xoo-xook-bix-ohoe-koohuut-oo-no’
    “Their hands would go right through them and appear on the other side.”

A single verb can be a full clause conveying a full thought. Verbal prefixes code grammatical as well as many semantic features, which inhibits the dependency analysis, since this framework only considers the relations between individual words.

2.1.2 Transitivity

The category of verbal transitivity is both syntactic and semantic (Cowell and Moss, 2008). To understand how many arguments are allowed in a verb’s frame, one must examine both the morphological and the semantic structure of the verb. So, while semantically the verb to’oo3ei “to hit things” may appear transitive, grammatically it is intransitive, requiring only one argument, the subject, as in too’oo3einoo “I am hitting (unspecified) things.” The transitivity of a verb is expressed in its inflection, which must agree in person and number with its arguments. Truly transitive verbs carry inflections agreeing with both of their arguments:

(2) Nih-to’ow-oo-t nuhu’ hinen-ino
    PST-hit-3/4-3S this man-OBV.PL
    “He hit these men”

Even though only one of the two arguments appears in the sentence, the verb nihto’owoot “s/he hit him/her” is marked to agree with both the semantic agent and the undergoer of the verb. This semantic distinction between the arguments is not observed in intransitive and semi-transitive verbs. Because such verbs show morphological agreement with only one nominal,² other nominals are considered to be outside of the argument structure of the verb even if they specify the semantic patient or theme.

(3) nih’ii-koo-ko’uyei-3i’ biino
    PST.IMPF-REDUP-pick things-3PL chokecherries
    “They were picking chokecherries.”

In example (3), the noun biino “chokecherries” is not reflected in the verbal morphology, but it corresponds to the semantics of the verb by specifying the object of picking. Being outside of the argument structure of this verb, the noun is syntactically better understood as a verbal adjunct specifying the manner of action, while semantically it is still the patient. Labeling the relationship between such arguments and verbs as the dobj of the universal dependencies is therefore wrong, because that label does not consider verbal morphology, whereas the label nmod would not account for the semantic role.

² We use the terms “nominal” and “nominal expression” to refer to nouns, noun phrases, and nominalized verbs that function as noun phrases.

2.1.3 Obviation

Unlike many languages, Arapaho does not rely on word order or case markers to disambiguate between overt nominals; rather, it uses a system of obviation that incorporates a distinction based on animacy along with a combination of verbal morphosyntax and pragmatics to mark particular grammatical roles. This system clearly distinguishes between two third person referents by marking one of them (the less salient one in the discourse) as obviative and leaving the other referent unmarked (proximate). In Algonquian languages, obviation is argued to be a pragmatic feature that structures discourse beyond the single clause (Goddard, 1984). Verbal morphology also shows agreement with these categories: the transitive verb inflection clearly marks which argument is acting on the other. So, instead of the usual three persons, Arapaho has four, with the fourth person being the obviative argument. In example (4), the obviative argument is the noun hiinoon “his mother,” which corresponds to the verbal subjunctive inflection -eihok “4th person acting on 3rd singular.”
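The person and direction glosses used in the examples, such as 3/4 in example (2) or 4/3S for the inflection -eihok, can be read mechanically: the first member acts on the second, and person 4 is the obviative. The following is a minimal sketch of that reading. The function names and the dictionary layout are assumptions made for illustration and are not part of the annotation scheme or of any existing tool.

```python
import re

# One side of a direction gloss: a person code plus an optional number.
# Person codes follow the glosses in this paper: 0 inanimate, 1-3,
# 4 obviative ("fourth person"), 12 first person plural inclusive.
_SIDE = re.compile(r"^(12|[0-4])(S|PL)?$")


def parse_side(text):
    """Parse one member of a gloss, e.g. '3S' or '4' (hypothetical helper)."""
    match = _SIDE.match(text)
    if match is None:
        raise ValueError(f"unrecognised person/number code: {text!r}")
    person, number = match.groups()
    return {"person": person, "number": number, "obviative": person == "4"}


def parse_direction(gloss):
    """Split an 'actor/undergoer' gloss; the first member acts on the second."""
    actor, undergoer = gloss.split("/")
    return {"actor": parse_side(actor), "undergoer": parse_side(undergoer)}


hit = parse_direction("3/4")     # gloss from example (2)
said = parse_direction("4/3S")   # gloss of the inflection -eihok
print(hit["undergoer"]["obviative"])   # True
print(said["actor"]["obviative"])      # True
```

The sketch only decodes the gloss notation; it says nothing about which overt nominal in a sentence fills each role, which is the question the dependency labels address.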

(4) Hohou, hee3eihok hiinoon 3eeyokooxuu.
    thank you say to s.o.-4/3S.SUBJ his/her mother Tipi-pole Child
    “Thank you,” his mother said to Under-the-Tipi-Pole Child.

As this example shows, obviation does not correspond to the semantic or the syntactic role of an argument. Nor does it depend on the transitivity of the verb. Rather, obviative status lines up with the semantic role of an obviative coded in the verbal morphology. Based on these features of transitivity and obviation, the current paper suggests employing semantic labels in marking the syntactic relations.

2.2 Annotation procedures

We are not aware of previous attempts at dependency annotation for other Algonquian languages; however, dependency grammar has been one of the theoretical approaches to Algonquian syntax. The guidelines discussed below were created based on the annotation of a small set of Arapaho narratives. In the first phase of the project, the dependency relations were outlined based on annotations of a sample of several traditional narratives, totaling about two thousand lines.³ The annotators, one fluent non-native speaker and three graduate students in Linguistics well familiar with Arapaho language structure, were given a protocol established without consideration of the UD framework and based purely on Algonquian syntax patterns. Several problems encountered in using these syntax patterns helped to clarify and specify the dependency relations, leading to the creation of a new set of labels.

In the second phase of the project, these new labels were further standardized based on the principles of the Stanford Dependencies (de Marneffe and Manning, 2008). Using this new set of relations, the annotations of the previous phase were converted, and a total of 3616 lines of elicited personal and traditional oral narratives as well as 593 lines of conversational data were newly annotated. The disfluency of the conversational data exposed major issues with this annotation scheme, which prompted us to turn to the UD-based system. The guidelines presented here have been used to re-mark the annotations of the data used in the second phase. No special software has been used to perform the annotations thus far, and all of the annotations are stored in a spreadsheet format.

Because the language is critically endangered, the resources available for this type of work are extremely limited. Importantly, it is not just that there are fewer recorded texts and conversations; there are also fewer trained individuals able to perform any type of language annotation. So, during this particular project, most of the annotations were done by the first two authors of the paper, with Andrew Cowell being the language expert due to his experience and acquired proficiency in the language. Over the course of six months, the authors met regularly to discuss the annotations, resolve issues as they arose, and introduce and update labels. As a result, all of the current annotations are singly annotated. The next part of the project includes more manual annotation using the guidelines proposed here, as well as double annotation of at least a portion of the data to establish inter-annotator agreement. Having already annotated a few thousand lines of narratives, the focus of the following work will be on conversational data, followed by machine learning.

³ What we call “lines” here refers to a ToolBox line, which represents a single clause or a complete thought.

3 Mapping the UD scheme

Out of the forty dependencies proposed by the UD, thirty have a one-to-one correspondence with Arapaho dependencies. An additional seventeen specifications and relations have been added to describe language-specific instances. The final scheme of Arapaho’s nominal argument dependencies is presented in Table 1. Some of the UD dependencies were not used in the Arapaho scheme because they simply do not exist in this language. For example, the language does not have a category of adjective; therefore, the dependency amod has not been used. Instead, descriptive verbs are analyzed as relative clauses, acl. Example (5) demonstrates the relative clause dependency, where a verb modifies the noun in the same manner that an adjective would.

(5) acl
    niibe’ei’i siikoocei’ikuu3oo nihnohkokoo’ohuni’i.
    VII NI VII
    nii-be’ei-’i siikoocei’ikuu3oo nih-nohk-okoo’ohuni-’i.
    IMPF-red-0PL rubber item PST-INSTR-sealed with stiff object-0PL
    “They would be sealed with the red rubber gasket.”

In addition, there are no relative pronouns in the language, so the dependency relation marker is also obsolete in the current scheme. Similarly, there is no category of number or numeral; instead, a number can be expressed by a verb or a particle, in which case it is analyzed just like other particles, with the dependency advmod to the word that it modifies. In sum, the omitted UD relations are the ones that are either expressed by some other dependency or non-existent in the Arapaho language.

Several UD dependencies line up perfectly with the Arapaho scheme. Relations such as noun modifiers, adverbial modifiers, adverbial clauses, determiners, appositives, relative clauses, case markers, and a few more have a direct correspondence. For example, an adverbial clause in Arapaho is very similar to, if not the same as, the adverbial clauses described for other languages in the UD. Arapaho adverbial clauses, as seen in the example below, lack a distinct introducing word; instead, the head of an adverbial clause exhibits particular morphological markers indicating its dependency. In example (6), this distinction is made by the subjunctive mode, indicating that the verb bih’iyoohok “when it is dark” is a dependent of the main verb of the sentence.

(6) advcl
    Bih’iyoohok ce’no’useeni’.
    VII VAI
    Bih’iyoo-hok ce’-no’usee-ni’.
    dark-SUBJUNCT back-arrive-1PL
    “When it’s dark, we’ll come back.”

In general, dependencies between function words and content words mirror the same dependencies in the UD framework, and most of these dependency labels are used.

The most complicated dependency relations tend to be those between content words, especially the relations between the verb and its arguments. From the UD scheme, only one of these relations matches the Arapaho scheme with some modifications: nsubj and csubj correspond to subjects of intransitive verbs and transitive inanimate verbs. Similarly, subjects of passive verbs correspond to the nsubjpass and csubjpass dependencies. Additional provisions are made in the Arapaho scheme to account for obviation status. In the following section, we discuss all of the provisions and additions made to the argument dependencies.

3.1 Subjects

While there is some correspondence between the UD’s nsubj and subjects in Arapaho, it is nonetheless problematic to analyze subjects based purely on syntax, since there are no syntactic features that would index the particular verbal arguments. Because nominals can take any position in the sentence, and because they are not marked by a case corresponding to their syntactic role, the only certain way of finding a subject is through person and number agreement on the verb. The proximate and obviative distinction also does not clarify the syntactic role of the nominal: with transitive verbs, the proximate form can be either agent or undergoer, and thus roughly correspond to either subject or object in English. In other words, the notion of subject is not really important in the Arapaho language, especially with transitive verbs, and a relationship based on obviation would mark the dependencies more clearly. In response to this, the current dependency scheme adopts the UD dependencies nsubj and csubj with the additional marker :obv to index the obviative arguments of intransitive verbs expressed in the verbal morphology. The proximate counterparts are not marked. In example (7), the obviative noun agreeing with the verb is such a subject.

(7) nsubj:obv
    no’useeni3 nuhu’ koo’ohwuu.
    VAI DET NA
    no’usee-ni3 nuhu’ koo’ohw-uu.
    arrive-4S this coyote-obv.
    “This coyote came.”

Similarly, the nsubj and csubj dependencies are also used for animate arguments of transitive inanimate verbs (VTI) and inanimate arguments of intransitive inanimate verbs (VII). However, transitive verbs exhibit a double marker indicating both the proximate and obviative participants, as well as the direction of action (agential relationship) between the two. The proximate participant can be either agent or patient, as can the obviative participant. So, an additional label employing the semantic distinction, nagent (nominal agent), is introduced.

(8) nagent:obv
    hiniisonoon heenei’itowuuneit.
    NA VTA
    hi-niisonoon heen-ei’itowuun-eit.
    3S-father.obv REDUP-tell s.o.-4/3S
    “His father tells him.”
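The subject and agent labelling rules described so far can be summarised in a short sketch. The function name, its parameters, and the use of verb-class strings as input are illustrative assumptions; the annotations themselves were made by hand in a spreadsheet. Only the nominal labels of active verbs are covered here, and the clausal labels csubj and cagent would follow the same pattern.

```python
# Hypothetical helper: choose the label for a nominal indexed on an active verb.
# Verb classes whose indexed argument is treated as a subject (nsubj).
SUBJECT_VERB_CLASSES = {"VII", "VAI", "VTI"}


def subject_label(verb_class, obviative):
    """Return nsubj or nagent, with :obv added for obviative nominals."""
    if verb_class in SUBJECT_VERB_CLASSES:
        base = "nsubj"
    elif verb_class == "VTA":
        # Transitive animate verbs mark the direction of action,
        # so the agent gets the semantic label nagent.
        base = "nagent"
    else:
        raise ValueError(f"unknown verb class: {verb_class}")
    # Proximate arguments are left unmarked.
    return f"{base}:obv" if obviative else base


print(subject_label("VAI", obviative=True))   # nsubj:obv, as in example (7)
print(subject_label("VTA", obviative=True))   # nagent:obv, as in example (8)
```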

UD          Arapaho Dependencies    Notes
nsubj       nsubj(:obv)             Nominal subjects of VII, VAI, and VTI verbs.
csubj       csubj(:obv)             Clausal subjects of VII, VAI, and VTI verbs.
nsubjpass   nsubjpass(:obv)
csubjpass   csubjpass(:obv)
(none)      nagent                  Proximate agent of VTA expressed by a noun.
            nagent:obv              Obviative agent of VTA expressed by a noun.
            nagent:oblique(:obv)    Oblique agents of passive verbs
            cagent                  Proximate agent of VTA expressed by a clause.
            cagent:obv              Obviative agent of VTA expressed by a clause.
            cagent:oblique(:obv)    Oblique agents of passive verbs
dobj        dobj                    Inanimate nominals as objects of VTI
            dobj:under              Animate proximate nominals, undergoers of VTA
            dobj:under:obv          Animate obviative undergoers of VTA
iobj        iobj                    Secondary objects of VTA not expressed in the verb
ccomp       ccomp                   Additional specification of dependency (e.g., dobj, dobj:under, iobj, nmod) is required
xcomp
nmod        nmod                    Adjuncts of verbs
            nmod:impobj             Implied objects of VAI.O, VAI.T, and incorporated verbs
            nmod:objad              Objects of adverbial particles and some verbal prefixes
            nmod:instr              Objects of instrumental particles and instrumental verbal prefixes

Table 1: Mapping of the UD argument labels and Arapaho nominal argument labels.
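For processing, the mapping in Table 1 can be held as a simple lookup from each Arapaho label to the UD label it refines. The sketch below is one possible encoding, not the format used in the project: the names BASE_TO_UD and to_ud are hypothetical, the optional :obv marker is stripped before lookup without checking whether the table allows it on that label, and xcomp is omitted because the sketch only covers rows that list an explicit Arapaho label.

```python
from typing import Optional

# Arapaho label (without :obv) -> UD label it refines.
# None marks labels added for Arapaho that have no UD counterpart in Table 1.
BASE_TO_UD = {
    "nsubj": "nsubj",
    "csubj": "csubj",
    "nsubjpass": "nsubjpass",
    "csubjpass": "csubjpass",
    "nagent": None,
    "nagent:oblique": None,
    "cagent": None,
    "cagent:oblique": None,
    "dobj": "dobj",
    "dobj:under": "dobj",
    "iobj": "iobj",
    "ccomp": "ccomp",
    "nmod": "nmod",
    "nmod:impobj": "nmod",
    "nmod:objad": "nmod",
    "nmod:instr": "nmod",
}


def to_ud(label: str) -> Optional[str]:
    """Back off an Arapaho label to its UD label, or None if it has none."""
    # Drop a trailing obviation marker; it has no UD equivalent.
    base = label[: -len(":obv")] if label.endswith(":obv") else label
    if base not in BASE_TO_UD:
        raise KeyError(f"label not in Table 1: {label}")
    return BASE_TO_UD[base]


print(to_ud("dobj:under:obv"))  # dobj
print(to_ud("nagent:obv"))      # None
```

Backing off in this direction discards the obviation and agent information, which is exactly the information the scheme argues is needed for Arapaho, so such a conversion is lossy by design.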

The following example further demonstrates the mismatch between subject and agent in Arapaho. Here, the verb is in the passive voice, and the “subject” of the verb is “my grandfathers.” However, this “subject” is obviative, and it is the oblique agent (“my father”) that is proximate.

(9) nagent:oblique nsubjpass:obv
    Neisonoo nihcihwonbiineihini3i nebesiiwoho’.
    NA VAI.PASS NA
    ne-isonoo nih-cih-won-biin-eihi-ni3i ne-besiiwoho’
    1S-father PST-to here-ALLAT-give-PASS-4PL 1S-grandfathers.obv
    “My grandfathers were given (sth) by my father”

Since the verb is passivized and thus intransitive, only one argument is reflected in its morphology: the obviative subject nebesiiwoho’ “my grandfathers.” The label nagent is kept, with an additional marker :oblique to indicate that the argument neisonoo “my father” is not expressed in the verbal morphology. Importantly, such oblique agents are different from noun modifiers, which are discussed later in the paper, because they specify the actor of the verb rather than its manner.

The subject relationship is not clearly defined in the Arapaho language. Instead, it is possible to talk about nominal expressions that are indexed by verbal morphology either as sole arguments of (syntactically) intransitive verbs or as agential arguments (proximate or obviative) of transitive animate verbs. We propose to account for this distinction as well as the distinction in obviation, which is clearly marked in nominal and verbal morphology.

3.2 Objects

The prototypical objects of transitive verbs do not easily fit the dobj relation in Arapaho. This is primarily because Arapaho verbs commonly undergo complex secondary derivation to produce verb stems that allow one to promote an animate argument to a core argument, marked inflectionally on the verb regardless of its semantic role. Thus, benefactives, recipients, goals, and even themes are typically the “object” marked inflectionally on the verb. Conversely, other arguments that would be classic “direct” objects in English are demoted and not marked inflectionally on the verb. Moreover, because the promoted animate argument is marked inflectionally, it can easily be dropped from overt mention in the sentence, while unmarked items are much more likely to be mentioned explicitly.

Thus, when the manual for universal dependencies notes that dobj is the most patient-like argument of a verb, this is in direct tension with the tendencies of Arapaho transitive verb dependencies.

Additionally, when the manual notes that “if there is just one object, it should be labeled dobj, regardless of the morphological case or semantic role that it bears” (UniversalDependencies.org, 2014), this raises additional problems, since the actual “object” marked on the Arapaho verb is highly likely not to appear in the sentence. The only exception, and the only full correspondence to the UD’s definition of the direct object, is the inanimate object of an inanimate transitive verb (VTI):

(10) dobj
     niico’ontonounowoo nuhu’ niinen.
     VTI DET NI
     nii-co’on-tonoun-owoo nuhu’ niinen.
     IMPF-always-use-1S this piece of fat
     “I always use this fat.”

Because the verb is transitive inanimate, it requires two arguments, only one of which (the animate agent) is marked inflectionally. The second argument can only be expressed by an inanimate noun and can either precede or follow the verb. So the overt nominal in the example above represents a prototypical direct object for transitive inanimate verbs. Meanwhile, transitive animate verbs can have up to three arguments (e.g., ditransitive verbs), with the two animate arguments being expressed inflectionally on the verb. So technically, ditransitive constructions may have only one overt nominal that does not correspond to either of the person markers in the verbal inflection. According to the UD definition cited above, such a nominal should be considered a direct object. In the following example, the true “object” of the Arapaho verb is “you,” (since it is in imperative form) while “your eyes” is not marked on the verb and is thus, from the perspective of Arapaho grammar, an oblique form.

(11) dobj
     Cihneeneeciihi hesiiseii.
     VTA NI
     Cih-nee-neeciih-i he-siiseii
     EMPH.IMPER-REDUP-lend-1S.IMPER 2S-eyes
     “Lend me your eyes.”

There is no direct agreement between the secondary object hesiiseii “your eyes” and the verb. Ideally, this should be represented by the iobj relation, which emphasizes the indirect syntactic relation between the verb and the nominal.

Furthermore, objects of a transitive animate verb (VTA) and of a transitive inanimate verb (VTI) are different from the point of view of the grammar⁴ and their respective designation.⁵

Hence, a further specification of dobj is necessary for transitive verbs. To stay consistent with the labels proposed for the nagent and cagent relations, the additional labels employed are :under and :obv.

(12) nsubj:agent dobj:under:obv
     Neisonoo nihcihwonbiinoot nebesiiwoho’.
     NA VTA NA
     ne-isonoo nih-cih-won-biin-oot ne-besiiwoho’
     1S-father PST-to here-ALLAT-give-3S/4 1S-grandfathers.obv
     “My father came to give [me] to my grandfather.”

In example (12), the object clearly marked on the verb is the fourth person, or the obviative. Specifying this dependency relation disambiguates between the nominals and enables the correct translation of the sentence.

So, in the current scheme, the distinction between different types of objects is further clarified. The iobj label is reserved for the secondary objects of ditransitive verbs, which show no verbal agreement. Meanwhile, dobj is used to mark the dependency relation between a transitive inanimate verb and its object, which is also not specified in the verbal morphology. The label dobj:under, with the additional specification of obviation, indicates the dependency relation between transitive animate verbs and the undergoers specified in the verbal morphology.

⁴ VTI objects are not reflected in verbal morphology.
⁵ Only animate nominal expressions (NA) can be the objects of VTA.

3.3 Noun Modifiers

The dependency relation of noun modifier corresponds rather well to the noun modifiers in Arapaho. It is primarily used to disambiguate between the direct or indirect objects of transitive verbs and the implied or incorporated objects, or otherwise adjuncts.

Having argued that some overt nominals of transitive animate verbs play the role of a secondary, or indirect, object, we now also argue that such a label can be inappropriate in the same context. Using the UD rules for distinguishing the dependency in the example below would lead to analyzing the nominal koxouhtiit “handgame” as a direct object of the main verb. But as one can see from the translation, it would also lead to a wrong analysis. Similarly, the indirect object analysis would also be incorrect. Indeed, annotating this noun as an oblique or an adjunct, nmod, is the only way of ensuring the correct analysis and translation.

(13) nmod
     Ceebe’eiheinoo koxouhtiit
     VTA NI
     ceebe’eih-einoo koxouhtiit.
     IC.beat-3S/1S handgame
     “He beats me in handgame.”

When adjuncts are used with semi-transitive verbs, the nmod relation is further suffixed with :objim to note that the noun modifier further specifies the under-specified objects of semi-transitive verbs. Essentially, while these nominals are analyzed and marked as noun modifiers, for a successful translation they need to be marked as direct objects, which we have argued against in the previous section. In order to avoid both incorrect translation and incorrect analysis, the label nmod:objim is used. In the following example, the noun bei’ci3ei’i “money” is semantically the object of the semi-transitive verb. However, as we argue, marking it as a direct or indirect object would violate the principles of Arapaho syntax.

(14) nmod:objim
     neeyeih’oonotooneenou’u bei’ci3ei’i.
     VAI.O NI
     neeyeih-’oonotoonee-nou’u bei’ci3ei’i.
     IC.try-REDUP-borrow things-12.ITER money
     “Whenever we try to borrow money.”

In addition, some of these implied or incorporated objects with overt nominal expressions can be modified by an adverbial particle similar to a preposition in English.

(15) nmod:objad case
     nih’iinou’oo3i’ neci’ hi3oobei’i’
     VAI NA PART
     nih-’iinou’oo-3i’ nec-i’ hi3oobei’-i’
     PST-float around-3PL water-LOC under sth-LOC
     “They were floating around under the water”

In the example above, the particle hi3oobei’i’ “under” is a dependent of the adjunct neci’ “water-LOC.” This relation is reflected in the locative case marker on the noun, showing a direct dependency with the particle. The Arapaho dependency scheme additionally distinguishes the instrumental case, since there are special case markers defined by an adverbial or an adverbial prefix. So where the prefixes hi’-, nohk-, and nii3- are present, or where the corresponding adverbial particles appear, the nominal adjunct is considered to be instrumental (nmod:instr). In example (5), the relation between the head of the relative clause, siikoocei’ikuu3oo “rubber item,” and the main verb is nmod:instr.

Finally, an additional dependency, poss (possessor modifier), is used for possessive constructions with an overt possessor. As in Finnish (Tsarfaty, 2013; Haverinen et al., 2014), it is possible in Arapaho to distinguish between the subject and the object of a possession. However, unlike in other languages, no special genitive construction exists to mark this type of relation. Instead, the possessor and the possessed appear side by side. The possessed in such constructions has a third (or fourth) person possessive marker. So in the phrase nii’ehihi’ hi-siiseii “little bird his-eyes,” the possessor is “little bird,” since the possessive prefix hi- “his” directly references this third person. The dependency relation marked here is that of a possessor nominal modifying another nominal.

The examples above demonstrate that not all of the arguments that may semantically appear similar to the dependencies established in the UD are the same in Arapaho. While under-specification of the semantic relationships can be beneficial in establishing some commonalities cross-linguistically, it can also result in the misrepresentation of some of the relations and in lowered efficiency in machine learning (Lipenkova and Souček, 2014). The major under-specification for the Arapaho language is the omission of the proximate-obviative distinction: while we realize that it could potentially be problematic for cross-linguistic applicability, omitting this distinction disregards one of the main features of Algonquian syntax and renders automatic translation of English transitive verbs into Arapaho effectively impossible.

4 Non-verbal roots

Adopting the relation of a root as the independent word in a clause or sentence allows us to avoid issues arising from securing the root node with verbs. So, as in the UD scheme, our annotations do not attach the root node to a particular part of speech, even though roots are usually represented by verbs. The main reason for doing this is to avoid the potential analysis of what is not there (Nivre, 2015; Hajicova et al., 2015; Osborne and Liang, 2015). In our annotations, the root often represents a pragmatically independent word, as for example in predicative-type constructions (Cowell and Moss, 2008). Such constructions are used to topicalize one of the verbal arguments or the manner of action (i.e., verbal particles), similar to existential constructions in other languages. However, instead of marking the predicate as the root of the sentence, as is done in the Russian TreeBank (de Marneffe et al., 2014), the topicalized nominal or particle is the root in Arapaho. The relation between the root and the predicate is backreference:

(16) root backref
     Ni’ook he’ne’nih’iisih’it.
     NAME VAI.PASS
     Ni’ook he’ne’-nih-’iisih’i-t
     Puffy Eyes that-PST-how named-3S
     “Puffy Eyes, that is how he is named.”

In example (16), the argument of the verb he’ne’nih’iisih’it “that is how he is named” is not realized overtly, and the verbal prefix ne’- “that is” refers back to the topical argument, making the verb actually a dependent of it. Were we to analyze distinct morphological elements, this prefix would act as a copula between the two. Overall, the reasoning for treating such topicalized elements as roots (even though they may sometimes take a position other than the clause-initial one) comes from the combination of pragmatics and morphology: nearly all of the verbal clauses with the prefixes ne’- “that” and nee’ees- “that is how” are backreference dependents of such roots.

5 Conclusion

In this paper we demonstrate the use of the Universal Dependency scheme with a language typologically different from the ones often included in machine-learning technologies. In using the UD framework, we noticed several unmentioned issues stemming from its reliance on word order. For example, in the current annotation scheme, we reanalyzed the relationship of parataxis to account for verbs of citation, so that the dependency would be traced from such a verb to the root of the whole clause. Similarly, the discourse marker dependencies were modified to include and analyze interjections. Unfortunately, it is outside of the scope of the paper to discuss these issues, but we hope that expanding this project to the annotation of conversational data and applying the annotated data to machine learning methods will reveal additional insights into the analysis of discontinuous constituency.

In critiquing the UD, we nonetheless want to stress the eloquence of such an approach. Unlike phrase structure annotations, UD allows us to account for the inconsistent phrase structures and dislocated tokens so often encountered in the Arapaho language. At the same time, however, we argue that to adequately account for the many linguistic nuances in annotating a morphologically and syntactically complex language like Arapaho, it is often necessary to include the semantic and pragmatic levels of analysis.

Acknowledgments

This research is funded by the National Endowment for the Humanities grant, project number 1551671 “Arapaho Lexical Database and Dictionary.” We are especially thankful to the Northern Arapaho tribe for allowing us to conduct the work with their language.

Appendix: Abbreviations

0PL        inanimate plural
12         first person plural inclusive
3S/4 or 1PL/2PL   the first number indicates the person and number acting on the following person and number (he to him; we to you.pl)
ALLAT      allative
DET        determiner
DETACH     detached adverbial prefix
DIM        diminutive
EMPH       emphatic
FUT        future tense
IC         phonological initial change
IMPER      imperative
IMPF       imperfect
ITER       iterative
LOC        locative
NA         animate noun
NAME       name
NARRPAST   narrative past
NI         inanimate noun
PART       particle
PASS       passive voice
PL         plural
PST        past tense
REDUP      reduplication
REL        relative prefix
S          singular
SELFBEN    self-benefactive

References

Andrew Cowell and Alonzo Moss. 2008. The Arapaho Language. Westview Press, Boulder.

Marie-Catherine de Marneffe and Christopher D. Manning. 2008. Stanford typed dependencies manual. Revised: April 2015:1–22.

Marie-Catherine de Marneffe, Timothy Dozat, Natalia Silveira, Katri Haverinen, Filip Ginter, Joakim Nivre, and Christopher D. Manning. 2014. Universal Stanford Dependencies: A cross-linguistic typology. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC’14), pages 4585–4592.

Ives Goddard. 1984. The obviative in Fox narrative discourse. In William Cowan, editor, Papers of the Fifteenth Algonquian Conference, pages 273–286. Carleton University, Ottawa.

Eva Hajicova, Marie Mikulova, and Jarmila Panevova. 2015. Reconstructions of Deletions in a Dependency-based Description of Czech: Selected Issues. In Proceedings of the Third International Conference on Dependency Linguistics (Depling 2015), pages 131–140, Uppsala, Sweden.

Katri Haverinen, Jenna Nyblom, Timo Viljanen, Veronika Laippala, Samuel Kohonen, Anna Missilä, Stina Ojala, Tapio Salakoski, and Filip Ginter. 2014. Building the essential resources for Finnish: the Turku Dependency Treebank. Language Resources and Evaluation, pages 1–39.

Janna Lipenkova and Milan Souček. 2014. Converting Russian Dependency Treebank to Stanford Typed Dependencies Representation. In Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, pages 143–147, Gothenburg, Sweden. Association for Computational Linguistics.

Ryan McDonald, Joakim Nivre, Yvonne Quirmbach-Brundage, Yoav Goldberg, Dipanjan Das, Kuzman Ganchev, Keith Hall, Slav Petrov, Hao Zhang, Oscar Täckström, Claudia Bedini, Núria Bertomeu Castelló, and Jungmee Lee. 2013. Universal Dependency Annotation for Multilingual Parsing. In ACL 2013 - 51st Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, pages 92–97, Sofia, Bulgaria.

Joakim Nivre. 2015. Towards a universal grammar for natural language processing. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 9041:3–16.

Timothy Osborne and Junying Liang. 2015. A Survey of Ellipsis in Chinese. In Proceedings of the Third International Conference on Dependency Linguistics (Depling 2015), pages 271–280, Uppsala, Sweden.

Reut Tsarfaty. 2013.
A Unified Morpho-Syntactic VAI animate intransitive verb Scheme of Stanford Dependencies. Proceedings of VAI.O animate intransitive verb with an implied ob- the 51st Annual Meeting of the Association for Com- ject putational Linguistics (Volume 2: Short Papers), VAI.PASS animate intranstive passive verb pages 578–584. VAI.T animate intransitive verb with a specific im- plied object UniversalDependencies.org. 2014. Univer- VII inanimate intransitive verb sal dependency relations (single document). VTA transitive verb with animate object http://universaldependencies.org/ VTI transitive verb with inanimate object u/dep/all.html.
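
As a supplementary illustration of the root and backreference analysis in example (16), the following minimal Python sketch encodes the two tokens of that sentence and prints them as ten-column CoNLL-U-style rows. It is only a sketch under stated assumptions: the class name Token, the helpers roots and to_conllu, and the choice to place the interlinear tags (NAME, VAI.PASS) in the XPOS column are hypothetical and are not taken from the annotation tooling used for the corpus. The label backref is the relation shown in example (16); it is not one of the standard UD relations.

```python
from dataclasses import dataclass


@dataclass
class Token:
    idx: int     # 1-based position of the word in the sentence
    form: str    # surface form as given in example (16)
    gloss: str   # interlinear tag from the example (assumed to fill XPOS)
    head: int    # index of the head token; 0 marks the sentence root
    deprel: str  # relation to the head


# Example (16): the topicalized name is the root, and the verb is
# attached to it with the language-specific backref relation.
sentence = [
    Token(1, "Ni’ook", "NAME", 0, "root"),
    Token(2, "he’ne’nih’iisih’it", "VAI.PASS", 1, "backref"),
]


def roots(tokens):
    """Return every token whose head is 0 (a well-formed tree has one)."""
    return [t for t in tokens if t.head == 0]


def to_conllu(tokens):
    """Render tokens as tab-separated rows in CoNLL-U column order:
    ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC.
    Columns not covered by this sketch are filled with '_'."""
    rows = []
    for t in tokens:
        rows.append("\t".join([
            str(t.idx), t.form, "_", "_", t.gloss, "_",
            str(t.head), t.deprel, "_", "_",
        ]))
    return "\n".join(rows)


print(to_conllu(sentence))
```

Keeping the head as an index rather than a linear offset means the same encoding applies when the topicalized element is not clause-initial, since nothing in the representation depends on word order.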
