<<

Modeling the of Plains Conor Snoek1, Dorothy Thunder1, Kaidi Loo˜ 1, Antti Arppe1, Jordan Lachler1 Sjur Moshagen2, Trond Trosterud2 1 University of , 2 University of Tromsø, Norway [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]

Abstract While we are working to develop a complete finite-state model of Plains Cree morphology, we This paper presents aspects of a com- focus on nominal morphology in this paper. putational model of the morphology of Plains Cree based on the technology of In the first section we briefly describe Plains finite state transducers (FST). The paper Cree nominal morphology and give some back- focuses in particular on the modeling of ground on the . This is followed by de- nominal morphology. Plains Cree is a tails on the model and its implementation. Fi- whose nominal nally, we discuss the particular situation of de- morphology relies on prefixes, suffixes veloping tools for a language that lacks a formal, and circumfixes. The model of Plains agreed-upon standard and the challenges that this Cree morphology is capable of handling presents. We conclude with some comments on these complex affixation patterns and the benefits of this technology to language revital- the morphophonological alternations ization efforts. that they engender. Plains Cree is an endangered Algonquian language spo- 2 Background ken in numerous communities across 2.1 Plains Cree Canada. The language has no agreed upon standard , and exhibits Plains Cree or nehiyawˆ ewinˆ is an Algonquian widespread variation. We describe prob- language spoken across the Prairie Provinces in lems encountered and solutions found, what today is Canada. It forms part of the while contextualizing the endeavor in the Cree-Montagnais-Naskapi continuum that description, documentation and revitaliza- stretches from Labrador to . Es- tion of First Nations in Canada. timates as to the number of speakers of Plains Cree vary a lot and the exact number is not known, from a high of just over 83,000 (Statistics Canada 1 Introduction 2011, for Cree without differentiating for Cree di- The Department of at the University of alects) to as low as 160 ( 2013). Wol- Alberta has a long tradition of working with First fart (1973) estimated there to be about 20,000 na- Nations communities in Alberta and beyond. Re- tive speakers, but some recent figures are more cently a collaboration has begun with Giellatekno, conservative. a research institute at the University of Tromsø, Regardless of the exact number of speakers, which has specialized in creating language tech- there is general that the language is un- nologies, particularly for the indigenous Saami der threat of extinction. In many, if not most, com- languages of Scandinavia, but also for other lan- munities where Cree is spoken, children are learn- guages that have received less attention from the ing English as a first language, and encounter Cree computational linguistic mainstream. This collab- only in the language classroom. However, vigor- oration is currently focusing on developing com- ous revitalization efforts are underway and Cree is putational tools for promoting and supporting lit- regarded as one of the Canadian First Nations lan- eracy, language learning and language teaching. guages with the best chances to prosper (Cook and Plains Cree is a morphologically complex lan- Flynn, 2008). guage, especially with regard to and . As a polysynthetic language (Wolvengrey, 2011, 35), Plains Cree exhibits substantial mor- (2) phological complexity. Nouns come in two gen- omaskisinisiwawaˆ der classes: animate and inanimate. Each of these o-maskisin-is-iwaw-aˆ classes is associated with distinct morphological 3PL.POSS-shoe-DIM-3PL.POSS-PL.IN patterns. Both animate and inanimate nouns carry ‘their little shoes’ inflectional morphology expressing the grammati- cal categories of number and locativity. The num- The particular form of the , how- ber suffixes for animate and inanimate nouns are ever, varies considerably. For example, the different, the being marked by -ak in ani- most common form of the suffix is -is. The mates and -a in inanimates. Locativity is marked suffix triggers morphophonemic changes in the by a suffix taking the form -ihk (with a number of stem. For example, the ‘t’ in oskatˆ askw-ˆ ‘carrot’ allomorphs). The locative suffix cannot co-occur changes to ‘’ (the alveolar [ts]) when the with suffixes marking number or obviation, but diminutive suffix is present resulting in the form does occur in conjunction with affixes. oskacˆ askosˆ . Since the form oskatˆ askw-ˆ is a -w Obviation is a marked on final form a further phonological change occurs, animate nouns that indicates relative position on namely the initial in the suffix changes the animacy hierarchy, when there are two third from i > o. person participants in the same clause. Obviation is expressed through the suffix -a, which forms a To sum up, Plains Cree nominal morphology mutually exclusive paradigmatic structure with the allows the following productive pattern types: locative and number prefixes. The possessor of a noun in Plains Cree is ex- (3) pressed through affixes attached to the noun stem. stem+NUM These affixes mark person and number of the stem+OBV possessor by means of a paradigmatic inflectional stem+LOC pattern that includes both prefixes and suffixes. stem+DIM+NUM Since matching prefixes and suffixes need to stem+DIM+OBV co-occur with the noun when it is possessed, it stem+DIM+LOC is possible to treat such prefix-suffix pairings as POSS+stem+POSS+NUM circumfixes expressing a single person-number POSS+stem+DIM+POSS+NUM meaning. The noun maskisin in (1) below1 is POSS+stem+DIM+POSS+OBV marked for third person plural possessors as well POSS+stem+POSS+LOC as being plural itself. The inanimate gender class POSS+stem+DIM+POSS+LOC is recognizable in the plural suffix -a, which would be -ak in the case of an animate noun. Plains Cree can be written both with the Ro- man alphabet and with a Syllabary. Theoretically (1) there is a one-to-one match between the two. omaskisiniwawaˆ However, a number of factors complicate this o-maskisin-iwaw-aˆ relationship. Differing punctuation conventions, 3PL.POSS-shoe-3PL.POSS-PL.IN such as capitalization, and the treatment of ‘their shoes’ loanwords make conversion from one writing system to another anything but a trivial matter. Nouns also occur with derivational morphol- Orthography presents a general problem for the ogy in the form of diminutive and augmentative development of computer-based tools, because suffixes. The diminutive suffix is productive and unlike nationally standardized languages, ortho- forms taking the diminutive suffix can occur with graphic conventions can vary considerably from all the inflectional morphology described above. community to community, even from one user to another. Certain authors have argued for the adoption of orthographic standards for Plains 1The following abbreviations are used POSS = possessive Cree (Okimasisˆ and Wolvengrey, 2008), but there prefix/suffix; LOC = locative suffix; OBV = suffix; DIM = diminutive suffix; NUM = number marking suffix; IN simply is no centralized institution to enforce = inanimate; PL = plural. orthographic or other standardization. This means that the wealth of varying forms and dialectal ating a proliferation of dialectical tags, it is easier diversity of the language are apparent in each in- to reproduce the architecture of the model and use dividual community. This situation poses specific it to create a new model for the related language. challenges to the project of developing language This allows the preservation of formal structures tools that are more seldom encountered when that follow essentially the same pattern, such as making spell-checkers and language learning possessive inflection for example, while replacing tools for more standardized languages. the actual surface forms with those of the target language. Similar situations have been encountered in work on the Saami languages of Scandinavia 2.2 Previous computational modeling of (Johnson, 2013). Following their work, we in- clude dialectical variants in the model, but mark them with specific tags. This permits a tool such as Previous work on Algonquian languages that has a spell-checker to be configured to accept and out- taken a computational approach is not extensive. put a subset of the total possible forms in the mor- Hewson (1993) compiled a dictionary of Proto- phological model. An example here is the distribu- Algonquian terms generated through an algorithm. tion of the locative suffix described in more detail His data were drawn from fieldwork carried out by in section 4. There is a disparity between com- Leonard Bloomfield. Kondrak (2002) applied al- munities regarding the acceptability of the occur- gorithms for cognate identification to Algonquian rence of the suffix with certain nouns. The suffix data with considerable success. ?) worked on a can be marked with a tag in the FST-model. This sizable corpus of Cree data and developed tools tag can then be used to block the acceptance or for data management and analysis in PL/I. Junker generation of this particular form. The key notion (2008) have written on the difficulties of creat- here is that language learning and teaching tools ing search engine tools for and describe are built on the basis of the general FST model. challenges similar to the ones we have encoun- For Plains Cree there is one inclusive model, en- tered with regard to dialectal variation and the compassing as much dialectical variation as pos- absence of agreed on standard and sible. From this, individual tools are created, e.g. other widespread conventions. spell-checkers, that selects an appropriate subset In general, computational approaches to Algon- of the dialectically marked forms. A community quian, and other Indigenous North American lan- can therefore have their own spell-checker, spe- guages, have been hampered by the fact that in cific to their own preferences. It is also possi- many cases large bodies of data to develop and test ble to allow for “spelling relaxations” (Johnson, methods on are just not available. Even for Plains 2013, 67) at the level of user input, meaning that Cree, which is relatively widely spoken, and rela- variant forms will be recognized, but constraining tively well documented, the available descriptions the output to a selection of forms deemed appro- are still lacking in many places. As a result, field- priate for a given community. Hence, the spell- work must be undertaken in order to establish pat- checker used in one particular community could terns that can be modeled in the formalism neces- accept certain noun-locative combinations. At the sary for the finite state transducer (FST) to work, same time, other tools, such as paradigm learn- a point that will be expanded on below. ing applications, could block this particular noun- locative combination from being generated: cer- 3 Modeling Plains Cree morphology tain forms are understood, but not taught by the The finite state transducer technology that forms model. In general, the variation is not difficult to the backbone of our morphological model, and deal with in terms of the model itself, rather it rep- consequently of all the language applications we resents a difficulty in the availability of accurate are currently developing, is based historically on descriptions, since their specifics must be known work on computational modeling of natural lan- and understood to be successfully included in the guages known as two-level morphology (TWOL) model. by Koskenniemi (1983). His ideas were further This method could, in principle, be used to ex- developed by Beesley and Karttunen (2003). tend the Plains Cree FST-model to closely related Their framework offers two basic formalisms with Algonquian languages. However, rather than cre- which to encode linguistic data, lexc and twolc. The Compiler, or lexc, is “a high-level since they pass through the same continuation declarative language and associated compiler” lexicon, leads to a further continuation lexicon (Beesley and Karttunen, 2003, 203) used for named OBVIATIVE. This rather small lexicon encoding stem forms and basic concatenative adds a final -a suffix and the tag +Obv indicating morphology. The source files are structured in that the form is inflected for the grammatical terms of a sequence of continuation lexica. Begin- category of obviation. Since no number suffixes ning with an inventory of stems the continuation can occur in this form the path does not add a +Sg lexica form states along a path, adding surface or +Pl tag to the underlying form. morphological forms and underlying analytic structure at each stage. A colon (:) separates (5) underlying and surface forms. Example (4) apiscacihkos+N+AN+Obv demonstrates paths through just three continua- apiscacihkosa tion lexica for the animate nouns apiscacihkos ‘antelope’ ‘antelope’ and apisimososˆ ‘deer’. By convention, the names of continuation lexica are given in These circumfixes were modeled using Flag upper case. Stems and affixes represent actual Diacritics, which are an “extension of the finite forms, and are thus given in lower case. The state implementation, providing feature-setting ‘+’ sign indicates a morphological tag. and feature-unification operations” (Beesley and Karttunen, 2003, 339). Flag diacritics make it (4) possible for the transducer to remember earlier LEXICON ANSTEMLIST states. The transducer may travel all paths through apiscacihkos ANDECL ; the prefixes via thousands of stems to all the apisimososˆ ANDECL ; suffixes, but the flag diacritics ensure that only LEXICON ANDECL strings with prefixes and suffixes belonging to < +N:0 +AN:0 +Sg:0 @U.noun.abs@ # > ; the same person-number value are generated. In < +N:0 +AN:0 @U.noun.abs@ OBVIATIVE > ; our solution for nouns, the continuation lexica LEXICON OBVIATIVE allow all combinations of possession suffixes < +Obv:a # > ; and prefixes, but the flag diacritics serve to filter out all undesired combinations. For example, in Both forms are directed to the continuation the noun omaskisiniwawaˆ from (1) above, the lexicon here named ANDECL which provides third person prefix o- and the suffix marking both some morphological tagging in the form of +N to person and number -iwawˆ are annotated in the mark the word as a noun and +AN to denote the lexc file with identical flag diacritics, so that they gender class ‘animate’. Each of the two nouns has will always occur together. the possibility of passing through the continuation lexicon ANDECL as an ‘absolutive’ noun – as Plains Cree has some very regular and pre- indicated by the tag @U.noun.abs@ (a flag dictable morphophonological alternations that diacritic, as will be explained below). The colons can be modeled successfully in the finite state in the code indicate a distinction between upper transducer framework. The formalism used here and lower levels of the transducer. The upper form is not lexc as in the listing of stems and the to the left of the colon is a string containing the concatenative morphology, but an additional for- the lemma as well as a number of tags that contain malism called the two-level compiler or twolc that information about grammatical properties. For is well suited to this task. The twolc formalism the word form apiscacihkos, the analysis once it was developed by Lauri Karttunen, Todd Yampol, has passed through the ANDECL continuation Kenneth R. Beesley and Ronald M. Kaplan based lexicon is apiscacihkos+N+AN+Sg. on ideas set forth in Koskenniemi (1983).

The surface forms apiscacihkos and apisimososˆ (6) are well-formed strings of Plains Cree, following acawewikamikosisˆ the Standard Roman Orthography. Hence, the atawewikamikw-isisˆ path can terminate here as indicated by the hash store-DIM mark. The other path, also open to both forms ‘little store’ However, atimw- also undergoes changes in the In example (6) above, the form atawewikamikwˆ stem vowel when the noun is marked for a pos- ‘store’ is modified by the derivational suffix -isis sessor so that a > i and i > eˆ. In the first person, marking the diminutive form. This derivation is the possessive prefix takes the form ni- leading to highly productive in Plains Cree. The underlying a sequence of two arising from the prefix form of the suffix is -isis but in conjunction with final -i- and stem initial -i-, which is not permit- a stem-final -w, the initial vowel of the suffix ted in Plains Cree. This situation is handled by a changes to -o. This morphophonemic alternation general rule deleting the first vowel in preference can be written in twolc much like a phonological for the latter. However, a set of twolc rules would rule: be required to change the stem vowels – a set that would be specific to this particular word only. The (7) full set of two level rules are accessible online2. i:o <=> w: %>:0 s: +Dim ; Since the addition of further rules poses the risk of rule conflicts in an increasingly complex twolc The sign %> is used to mark a suffix bound- code, the stem vowel changes are handled in lexc ary, which, along with the +Dim tag, ensures that instead. There are currently over 40 continuation it is the first vowel of the suffix that undergoes lexica in the model of nominal morphology alone. substitution. Thus the context is given by the occurrence of a -w before the suffix boundary, (8) i.e. stem finally. An additional complication here LEXICON IRREGULARANIMATESTEMS is that the presence of the diminutive suffix in a atim IRREGULARINFLECTION-1 ; form again triggers a phonological change in the atim:temˆ IRREGULARINFLECTION-2 ; stem by which all t’s change to c’s (phonetically [ts]). In twolc the rules dictating morphophono- The continuation lexicon contains two ver- logical alternations apply in parallel, avoiding sions of the form atim with two different paths possible problems caused by sequential rule leading to further inflectional suffixes. In the interactions. The noun completes the path through second instance of atim, writing the base form to the continuation lexica and is passed to twolc as the left of the colon and the suppletive stem to atawewikamikwisisˆ . There it undergoes two the right ensures both that the form -temˆ surfaces morphophonological changes giving the correct correctly. In the analysis the base form atim can surface form acawewikamikosisˆ . still be recovered. The forms are sent to differing continuation lexica, since only the suppletive Twolc is a powerful mechanism for dealing with forms occurs within the paradigm of possessive regular alternations. Reliance on twolc can reduce prefixes. The word meaning ‘my little dog’ is the number of continuation lexica and hence com- given as an example in (10) below. plexity of the morphology modeling carried out in lexc. The downside of using large numbers of (9) twolc rules is the increasing complexity of rule nicemisisˆ interactions. We have found that decisions about ni-atimw-isis which strategy to pursue in the modeling of a par- 1SG.POSS-dog-DIM ticular morphological pattern must frequently be ‘my little dog’ made on a case by case basis. For example, in modeling the interesting case of the form atimw- The suppletive form also does not carry an ‘dog’ several strategies needed to be employed. underlying -w and hence no longer triggers the The form triggers a vowel change i > o in con- vowel change in the diminutive suffix. With this junction with the diminutive suffix -isis resulting solution we can handle the regular and more in -osis, a change falling under a rule described straightforward morphophonological alternations in (8) above. A further change here is that the t in twolc, while avoiding undue complexity by in atimw- ‘dog’ changes to c when the diminutive modeling the suppletive forms in lexc. suffix is present resulting in the surface form aci- 2https://victorio.uit.no/langtech/ mosis. Both these forms can be handled by twolc trunk/langs/crk/src/phonology/crk-phon. rules such as the one exemplified in (8) above. twolc Finally, we have adopted a system of using spe- similar situations) is being carried out by what cial tags to denote dialectal variants that are not is at best a handful of people. While Cree lan- equally acceptable in different communities. The guage specialists form a professional body of re- seemingly of variation found in Plains searchers with a proud tradition, they are faced Cree can be related to several reasons described with the enormous task of documenting a language in more detail in the next section. The variation spoken in many small communities spread over a is dealt with in the morphological model with a huge geographical area. In addition, many of those tagging strategy that marks dialectal forms. This specialists are also involved with language revi- tagging allows for the systems based on the mor- talization and language teaching, with the result phological model to behave in accordance with that less time can be devoted to language descrip- the wishes of the user or community of users. tion, scholarship and the pursuit of larger projects In the setting of a particular teaching institution, such as the development of corpora. While such for instance, only a certain subset of the vari- projects are under development in many areas, the ants encoded in the morphological model might be demands placed on individual researchers and ac- deemed acceptable. Our model permits this com- tivists has resulted in an overall scarcity of re- munity to adjust the applications they are employ- sources. While compared to other Indigenous ing, e.g. a spell-checker, so that their community- languages spoken in Canada, Plains Cree is rel- specific forms are accepted as correct. atively well documented, many of the resources The stems are accessible online3, and may be that would be desirable assets for the development analysed and generated at the webpage for Plains of a finite state model are not available. As a re- Cree grammar tools4. sult, we have carried out fieldwork to further make explicit the full inflectional paradigm of nouns in 4 The necessity for fieldwork in modeling Plains Cree. Plains Cree There is considerable variation among speak- We began working on the morphological model ers and specialists regarding the acceptability of of Plains Cree by examining published sources, certain inflectional possibilities. For example, in such as Plains Cree: A grammatical study (Wol- the case of one animate noun atim ‘dog’ it seems fart, 1973) and Cree: Language of the Plains formally reasonable to allow its combination with (Okimasis,ˆ 2004). Okimasis’ˆ work is clearly the locative suffix -ohk rendering atimohk. This structured and contains a wealth of information. combination of stem and affix was considered im- Nevertheless, the level of explicitness required to possible or at least implausible by some of our capture the nature of a language in enough de- native speaker consultants. However, the form tail for applications such as, for example, spell- itself does occur, albeit in the guise of a place checkers is beyond the scope of her work. This name for a lake island in northern is to say that in formalizing Okimasis’ˆ descrip- named atim ‘dog’. Therefore the form atimohk ‘on tion we needed to generalize grammatical patterns the dog’ with locative suffix attached can occur that were not always explicitly spelled out in her in this very specific and geographically bounded 5 work in every detail. It should be apparent here context . The way of coping with this is to lexi- that a number of factors come in to play here that calize atimohk as locative of the island Atim, and make working on Plains Cree quite a different un- to keep the noun atim outside the set of nouns get- dertaking from working on a European language ting regular locatives. with a long history of research in the Western aca- Further inquiry into this matter revealed that demic tradition. While official national European some speakers see the locative suffix as potentially languages such as German, Finnish or Estonian occurring quite widely, while others are more re- can look on scholarly work dating back some cen- strictive (Arok Wolvengrey – p.c.). Here again turies, and are supported by work from a com- there is a problem of scale: individual speakers of munity of specialists numbering hundreds of peo- any language have only a partial experience of the ple, work on Plains Cree (and other languages in possible extent of the language. In the modeling of the morphology for the purposes of such tech- 3https://victorio.uit.no/langtech/ trunk/langs/crk/src/morphology/stems/ nologies as spell-checkers, for example, the expe- 4http://giellatekno.uit.no/cgi/index. crk.eng.html 5Thanks to Jan Van Eijk for pointing this out. rience of any potential speaker must be taken into as modules within general software applications account. While the information that this particu- such as a word-processor, an electronic dictionary lar form is rare or semantically not well-formed or a intelligent computer-aided language learning is valuable, retaining the form is important, if the (CALL) application. Each of these tools can assist model is to cover the range of potential usage pat- fluent speakers, as well as new learners, in their terns of all Plains Cree speakers. Ideally, if the use of Plains Cree as a written language. written use of the language is supported by the The spellchecking functionality within a word- tools that can be developed based on our morpho- processor will be a valuable tool for the small-but- logical model, that would lead to a gradually in- growing number of Plains profes- creasing electronic corpus of texts, providing fre- sionals are engaged in the development of quency information on both the stems and mor- teaching and literary resources for the language. phological forms. It will allow for greater accuracy and consistency We have developed a workflow in which we in spelling, as well as faster production of materi- construct the maximal paradigms that are theoreti- als. Because dialectal variation is being encoded cally possible and then submit them to intense na- directly into the FST model, the spellchecker can tive speaker scrutiny. Only once native speakers be configured so that writers from all communities and specialists have approved the forms do they and can use this tool, without worry that become part of the actual model. The paradigms the technology is covertly imposing particular or- are chosen so as to provide the coverage of the thographic standards which the communities have entire span of morphologically possible forms as not all agreed upon. well as all morphophonemic alternations. As such The morphological analysis functionality built they present a maximal testbed for the patterns en- from the FST model and integrated within e.g. a coded in the formalism. Each paradigm consists web-based electronic dictionary will allow readers of about sixty inflected forms. to highlight Plains Cree text in a document or web- Overall, a careful balance must be struck be- page to perform a lookup of in any inflected tween directly explicit speaker/specialist input and form, and not only with the citation (base) form. theoretically possible forms. We aim to achieve This will enable readers to more easily read Plains this balance by taking a threefold approach: First, Cree documents with unfamiliar words without by careful consultation with speakers and special- needing to stop to repeatedly consult paper dic- ists; second, by building a corpus6 which can serve tionaries and grammars. While this does not obvi- as a testing ground for the morphological analyzer ate the need for printed resources in learning and and as a source of data, and third by working teaching of the language, such added functional- closely with communities willing to test the model ity can greatly increase the pace at which texts and provide feedback. are read through by language learners. This is not 5 Applications in language teaching and inconsequential as it can slow down considerably revitalization the onset of weariness brought on by needing to interrupt the reading process to consult reference The development of an explicit model of the mor- materials, and hence maintain the motivation for phology of Plains Cree as outlined above is of language learning. benefit not just to researchers but also those in- The paradigm generation functionality within volved in teaching and revitalizing the language e.g. an electronic dictionary allows users to se- within their home communities. Using the gen- lect a word and receive the full, or alternatively a eral technological infrastructure developed by the smaller but representative, inflected paradigm of researchers at Giellatekno, we are able to take the that word. This will be of direct benefit to in- FST model of Plains Cree morphology and use it structors developing materials to teach the com- to create in one go a variety of language tools in- plex morphology of the Plains Cree, as well as cluding a spellchecker, a morphological analyzer their students. and a paradigm generator, which can be integrated We are working in collaboration with Plains 6As noted above, a tool like a spell-checker promotes lit- Cree communities in the development and piloting eracy and hence contributes naturally to the increase in tex- tual materials. Until that begins to happen, however, we are of these tools, to ensure their accuracy and their collecting texts through recording and transcription. usefulness for teachers, developers, learners and other community members. The full range of uses with Cree-speaking communities in Alberta to ex- that these tools will be put to will only become ap- pand on existing dictionaries and develop collec- parent over time, but we expect that they will have tions of recordings. The development of this mor- a positive impact for community language main- phological model has led us to carry out fieldwork tenance by supporting the continued development on Plains Cree and to actively engage with Cree- Plains Cree literacy. speaking communities. We have worked hard to bridge the unfortunate gap that sometimes forms 6 Conclusion between the linguistic work being carried within academia and the needs of communities that are We have found the technology of Finite State active in language documentation and revitaliza- Transducers so useful in developing language ap- tion. We look forward to further fruitful coopera- plications for Plains Cree because it permits us tion between activists, educators and researchers. to integrate native speaker competence and spe- cialized linguistic understanding of grammatical Acknowledgments structures into the model directly. Building a computational model of Plains Cree At present the analyzer contains 72 nominal lex- morphology is a task that relies on the knowl- emes, carefully chosen to cover all morphologi- edge, time and goodwill of many people. We cal and morphophonological aspects of the Plains thank the University of Alberta’s Killam Re- Cree nominal system. Once the morphological search Fund Cornerstones Grant for supporting modeling of this core set of nouns has been final- this project. We would like to acknowledge in par- ized, scaling up the lexicon will be a trivial task, ticular the crucial advice, attention and effort of as all lexicographic resources classify their stem Jean Okimasisˆ and Arok Wolvengrey, and thank in the same way as is done in the morphological them for the resources they have contributed. We transducer. wish also to thank Jeff Muehlbauer for his time We have described our method of working with and materials, as well as the attendees of the first native speaker specialists and how their insights Prairies Workshop on Language and Linguistics are reflected in the design of the model. This in- for their insights and expertise. Further, it is im- teraction also allows enough possibilities for in- portant to acknowledge the helpfulness of Earle teractions with language teachers, learners and ac- Waugh who at the very start of our project made tivists so that we make our work truly useful to the his dictionary available to us, and who has been effort of preserving and revitalizing the precious very supportive. Arden Ogg has worked tirelessly cultural heritage that is Plains Cree. We are aware to build connections among researchers working of the limits of tools that relate primarily to the on Cree, which has greatly promoted and facili- written forms for languages that have rich oral his- tated our work. Ahmad Jawad and Intellimedia, tories and cultures, but feel that writing and read- Inc. who have for some time provided the tech- ing Plains Cree will play an ever growing role in nological platform to make available a number the future of this language. of Plains Cree dictionaries through a web-based This work makes practical contributions to lin- interface, have given us invaluable assistance in guistic research on Plains Cree. On the one hand, terms of resources and introductions. We would creating the model required the formalization of also especially like to thank the staff at Miyo many aspects of Plains Cree morphology which Wahkohtowin Education for their wonderful en- had not previously been spelled out in full detail, thusiasm, and for welcoming us into their commu- i.e. it makes explicit what is known, or not known, nity. Last but by no means least, we are indebted to about Plains Cree morphology, and thus allows us innumerable Elders and native speakers of Plains to extend the description of Plains Cree morphol- Cree whose contributions have made possible all ogy accordingly. On the other, the morphological the dictionaries and text collections we are fortu- analyses can aid in future linguistic discovery es- nate to have today. pecially when used in conjunction with corpora. In the future, we will continue to expand the morphological model both in its grammatical cov- References erage and in the size of the lexical resources which Kenneth R. Beesley and Lauri Karttunen. 2003. Fi- go into it. In regard to the latter, we are working nite State Morphology. CSLI Publications, Stanford (CA). Eung-Do Cook and Darin Flynn. 2008. Aborigi- nal . In: O’Grady, William and John Archibald (eds.) Contemporary Linguistic Analysis. Pearson, Toronto (ON). John Hewson. 1993. A computer-generated dictionary of Proto-Algonquian, Canadian Museum of Civ- ilization and Canadian Ethnology Service, Ottawa (ON). Ryan Johnson, Lene Antonsen and Trond Trosterud. 2013. Using Finite State Transducers for Making Efficient Reading Comprehension Dictionaries. In Stephan Oepen & Kristin Hagen & Janne Bondi Jo- hannessen (eds.), Proceedings of the 19th Nordic Conference of Computational Linguistics (NODAL- IDA 2013), 378-411. Linkoping¨ Electronic Confer- ence Proceedings No. 85. Marie-Odile Junker and Terry Stewart. 2008. Build- ing Search Engines for Algonquian Languages. In Karl S. Hele & Regna Darnell (eds.), Papers of the 39th Algonquian Conference, 59-71. University of Western Ontario Press, London (ON).

Grzegorz Kondrak. 2002. Algorithms for Language Reconstruction, Department of Computer Science, University of Toronto. Kimmo Koskenniemi. 1983. Two-level Morphology: A General Computational Model for Word-Form Recognition and Production, Publication No. 11. Department of General Linguistics, University of Helsinki.

Jean Okimasis.ˆ 2004. Cree: Language of the Plains, Volume 13 of University of Regina publications. University of Regina Press, Regina (SK). Jean Okimasisˆ and Arok Wolvengrey. 2008. How to Spell it in Cree. miywasinˆ ink, Regina (SK). H. Christoph Wolfart. 1973. Plains Cree: A grammati- cal study, Transactions of the American Philosoph- ical Society No. 5. H. Christoph Wolfart and Francis Pardo 1973. Computer-assisted linguistic analysis, University of Anthropology Papers No. 6. Department of Anthropology, University of Manitoba. Arok E. Wolvengrey. 2011. Semantic and pragmatic functions in Plains Cree syntax, LOT, Utrecht (NL).