The Nawat Corpus & Lexicon Database
NAWACOLEX Version 2.1
TUTORIAL
© Alan King 2014

TABLE OF CONTENTS

INTRODUCTION

THE NAWACOLEX DATABASE
  What is NAWACOLEX?
  What is NAWACOLEX for?
  How does NAWACOLEX work?
  What is in NAWACOLEX 2.1?
  What is not in NAWACOLEX 2.1?
  A read-only database

INSTALLATION INSTRUCTIONS
  Install the Toolbox software
  Set up the NAWACOLEX 2.1 database
  Opening NAWACOLEX for the first time
  Installation troubleshooting

USING NAWACOLEX
  Introducing the NAWACOLEX screen
  Basic how-to’s
  How do I view a book?
  How can I browse a document?
  How can I display a lexicon?
  How can I see which lexicons have a word?
  What is the Wordlist?
  What can I do with it?
  “Jumps” and parallel movements
  Wordlist to lexicon
  Lexicon to lexicon
  Text to lexicon
  Non-citation forms
  Variants and sub-entries
  Other jumps
  Some Toolbox things...
  The windows
  Movable columns and fields
  Hidden fields
  Browse view
  Side browse
  The program bar and status bar
  Printing and exporting
  Margins
  Open windows and the “Window” menu

RESEARCH TOOLS
  NAWACOLEX’s research tools
  Concordances
  What is a concordance?
  Search parameters
  Choosing your text corpus
  Concordancing lexicons: Spanish to Nawat
  Concordancing lexicons: advanced searches
  Sorting
  Sorting of lexicons
  Sorting of concordances
  Filters

THE LEXICONS
  Introduction
  Structure and content
  Adaptation and transcription
  Verb inflections
  Arauz
  BibLex
  Campbell
  Hernandez
  LBN
  Ramirez
  Schultze
  Todd

THE TEXTS
  Introduction
  Structure and content
  Transcripts
  The corpus
  Masin
  Arauz
  Campbell
  Other texts

APPENDIX
  Toolbox: Shortcut keys
  Toolbox: Advanced users

BIBLIOGRAPHY

INTRODUCTION

This tutorial teaches how to use the Nawat Corpus & Lexicon or NAWACOLEX database (v 2.1, January 2014), which contains the most important source texts in Nawat, transcribed and edited, and also essential content of the most significant Nawat lexical resources (i.e. dictionaries, vocabularies or glossaries).
As a practical tool, NAWACOLEX systematizes and integrates these texts and lexical resources to make it easier to access the primary materials through an integrated user interface. NAWACOLEX 2.1 can be downloaded with all these features and used on a computer to read, search, explore and study the linguistic material and the content.

This is the first version of NAWACOLEX to be shared with advanced students and fellow researchers. The prior version of this material (NAWACOLEX [version 1], ~2008) and its predecessors (NAWATLEX and Nawat Corpus 1.0, both ~2004) were all intended for personal research and as prototypes, and were not distributed. The upgrade to version 2.1, carried out in 2013, involved reorganising the materials, adding some new transcriptions, designing a more polished and user-friendly interface, and writing this tutorial prior to sharing the package with interested colleagues.

Details of how to obtain NAWACOLEX 2.1, as well as discussions and announcements of future upgrades, may be found at the Seminario Lingüístico Náhuat (SLN) on Facebook. Information can also be requested personally from me at [email protected].

The following tutorial will tell you how to get started, explain how the database is organised and how it works, give basic operating instructions, and list the contents of the current text corpus and the lexicons incorporated.

I take this opportunity to thank those who have helped with parts of the considerable work of transcribing the materials included in NAWACOLEX 2.1, and those who have provided such materials. In particular, I wish to acknowledge the help received from the Ne Bibliaj Tik Nawat project (in the person of Jan Morrow), which helped finance the last few months of work needed to develop the present upgrade; without it, the public appearance of NAWACOLEX would certainly have been delayed.

This result should not be judged as a finished general publication.
It would be closer to the mark to think of it as my personal notes, packaged in a form that, although not as polished as a fully public release, will give fellow Nawat scholars access to these materials. I hope it will facilitate their use and study and encourage better acquaintance with the richness of the Nawat language, which is still unknown to, or ignored by, far too many people. The study of the language is essential for its successful recovery, which is the hope that inspires this effort and to which it is humbly dedicated.

Alan King
December, 2013

THE NAWACOLEX DATABASE

What is NAWACOLEX?

The NAWACOLEX (Nawat Corpus & Lexicon) database is a database structure that holds a collection of Nawat texts and lexicons. These texts all ultimately come from native Nawat speakers; some have been published previously and others are unpublished. Also included are some transcriptions of recorded interviews with native speakers from the IRIN documentation project. The older texts and lexicons have been re-spelled in the common, modern orthography, both for users’ convenience and to facilitate the efficient functioning of the database as a tool for linguistic study and research.

The software employed is an application called Toolbox, developed and freely distributed by SIL (the Summer Institute of Linguistics). This application, together with the specific configuration developed for NAWACOLEX, offers design features that facilitate a number of useful operations, which will be described in detail below.

What is NAWACOLEX for?

With the NAWACOLEX database you can do many things, such as the following, speedily and easily:

- read Nawat texts, locating them rapidly from a single index and displaying them in a single text window.
- display Nawat lexicons at a single click.
- search for any word in the whole text corpus or a part of it, and create concordances.
- see a wordlist, i.e. an alphabetical list of all the words that occur in the text corpus, with information on the frequency of each word and a list of places where the word occurs, with the ability to view any of the locations listed.
- quickly find any word in the corpus wordlist or in the lexicon of one’s choice, provided it is listed in that lexicon.
- quickly look up in a lexicon any word that occurs in a text.

There are also things that NAWACOLEX doesn’t do. For instance:

- NAWACOLEX does not show the original formatting of texts or other elements contained in publications, such as illustrations, notes etc.; for all such things the original texts must be consulted.
- The texts in NAWACOLEX have all been transcribed into modern spelling. This is usually an advantage for searches etc., but it means that you usually cannot see the original spelling in NAWACOLEX and must go to the original for that.
- When you are reading a Nawat text, NAWACOLEX usually cannot provide you with its translation (however, you can look up words you don’t understand in its lexicons).

How does NAWACOLEX work?

NAWACOLEX is built with an application called Field Linguist's Toolbox (http://www-01.sil.org/computing/toolbox/), usually referred to as Toolbox for short. Toolbox has been developed by the Summer Institute of Linguistics (SIL) and is distributed free of charge. It is a flexible tool well adapted to a number of text- and language-related areas, including corpus development and lexical work. Toolbox is considered a database programme, but unlike standard databases it is specifically adapted to the needs of people working with two basic aspects of language work: text and vocabulary. Thus it is ideal for the present purpose.

Toolbox treats a file containing a text as a “database” (though in this tutorial we will often call it a “document” or a “file” for clarity).
Each such text file is divided into parts called “records” (which we may also call pages, sections, chapters etc.), and each record is further divided into lines (basically, sentences) and/or fields. Where the texts of the corpus are concerned, every file, every section/page and every line is assigned an identifying code. NAWACOLEX uses these codes to identify sentences and texts internally. Thanks to them, NAWACOLEX can quickly find and display any sentence in the corpus, and it can tell us which sentences (in which texts, and where) contain any given word.

A lexicon also takes the form of a database. This kind is divided into records that we usually call “entries” (like dictionary entries) for individual Nawat words, and each entry contains fields where different kinds of information have been inserted, such as the Nawat headword, the gloss (i.e. translation), and other information about the word. Toolbox can take an item in one lexicon and look for similar items in another lexicon (i.e. perform automatic cross-references). It can also take a word in a text and find its entry in a lexicon (i.e. look up words), but only if the word is in its citation form.

Moreover, Toolbox can generate other kinds of database from existing ones. First, from one or more text files it can create something which we call a wordlist; this is a file where all the words in those texts are listed in alphabetical order, together with an indication of how many times they occur in the texts and even a list of the places where they occur. Secondly, Toolbox can create another kind of list called a concordance, which tells us about the occurrences of particular words or phrases.

What is in NAWACOLEX 2.1?

For the in-depth study of a language such as Nawat, it is important to have at one’s disposal texts which provide reliable samples of the language as it is, or has been, in use.
Since these are not very many or very voluminous, this amounts to trying to collect together just about everything that can be found, with emphasis on the best sources and the largest ones in order to possess as much good data as possible.
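
For readers who find it easier to think in code, the ideas described under “How does NAWACOLEX work?” (sentences carrying identifying codes, a wordlist with frequencies and locations, a concordance, and lexicon look-up by citation form) can be illustrated with a minimal Python sketch. This is not Toolbox code and it does not reflect NAWACOLEX’s actual file format: the identifier scheme, the sample sentences (English placeholders, not corpus data), the toy lexicon and all function names are assumptions made purely for the example.

```python
import re
from collections import defaultdict

# Hypothetical data: (identifying code, sentence) pairs. The code layout
# "text.section.line" is an assumption, not NAWACOLEX's real scheme, and
# the sentences are English placeholders rather than Nawat corpus data.
CORPUS = [
    ("T1.01.001", "the dog sees the cat"),
    ("T1.01.002", "the cat sleeps"),
    ("T2.03.001", "a dog barks"),
]

# Hypothetical lexicon: headword (citation form) -> gloss placeholder.
LEXICON = {"dog": "gloss for 'dog'", "cat": "gloss for 'cat'"}


def tokenize(sentence):
    """Split a sentence into lower-case word tokens."""
    return re.findall(r"\w+", sentence.lower())


def build_wordlist(corpus):
    """Return an alphabetical wordlist: word -> frequency and locations."""
    index = defaultdict(list)
    for sentence_id, sentence in corpus:
        for word in tokenize(sentence):
            index[word].append(sentence_id)  # one location per occurrence
    return {
        word: {"count": len(locations), "locations": locations}
        for word, locations in sorted(index.items())
    }


def concordance(corpus, target):
    """Return every (identifying code, sentence) that contains the word."""
    target = target.lower()
    return [(sid, s) for sid, s in corpus if target in tokenize(s)]


def lookup(lexicon, word):
    """Exact-match look-up: succeeds only for a listed citation form."""
    return lexicon.get(word.lower())


if __name__ == "__main__":
    wordlist = build_wordlist(CORPUS)
    for word, info in wordlist.items():
        print(word, info["count"], info["locations"])
    for sentence_id, sentence in concordance(CORPUS, "cat"):
        print(sentence_id, sentence)
    print(lookup(LEXICON, "dog"), lookup(LEXICON, "dogs"))
```

In this sketch the look-up of “dogs” finds nothing because only the headword “dog” is listed, which mirrors the restriction that a word in a text can be looked up directly only when it appears in its citation form.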