Interdisciplinary Approaches to Language Documentation

edited by Susan D. Penfield

Language Documentation & Conservation Special Publication No. 21


Published as a Special Publication of Language Documentation & Conservation

Language Documentation & Conservation Department of Linguistics, UHM Moore Hall 569 1890 East-West Road Honolulu, Hawaiʻi 96822 USA

http://nflrc.hawaii.edu/ldc

University of Hawaiʻi Press 2480 Kolowalu Street Honolulu, Hawaiʻi 96822 USA

© All texts and images are copyright to the respective authors, 2020
All chapters are licensed under Creative Commons Licenses

Cover design by Laura Viola Maccarone of Rizbee Design Studio

Library of Congress Cataloging in Publication data
ISBN-13: 978-0-9856211-9-3
http://nflrc.hawaii.edu/ldc/
http://hdl.handle.net/10125/24947

Contents

Contributors

1. Introduction: Interdisciplinary Approaches to Language Documentation
Susan D. Penfield

2. Domain-Driven Documentation: The Case of Landscape
Niclas Burenhult

3. Child Language Documentation: A Pilot Project in Papua New Guinea
Birgit Hellwig

4. Interdisciplinarity in Areal Documentation: Experiences from Lower Fungom, Cameroon
Jeff Good

5. Endangered Language Documentation: The Challenges of Interdisciplinary Research in Ethnobiology
Jonathan Amith


Contributors

Jonathan D. Amith has been an independent scholar since 2000, when he began work on San Agustín Oapan and Ameyaltepec Nahuatl (Balsas River valley, central Guerrero). He later proceeded to document Yoloxóchitl Mixtec (Pacific Coast of Guerrero) and Sierra Nororiental de Puebla Nahuatl. More recently he has focused on lexicography and comparative ethnobiology, extending his ethnobiological work to Totonac-speaking communities in northern Puebla. Amith has published in linguistics, history, and anthropology. He has curated an exhibit of Indigenous protest art, edited an anthology of Nahuatl-language texts (book and 6-CD set), produced and codirected an award-winning Nahuatl-language documentary, and has an ongoing project of short documentaries in Mixtec. Amith has also collaborated with SRI International on automated speech recognition of Yoloxóchitl Mixtec. Presently he is developing metadata standards and a content management system, based on open-source Symbiota software, to facilitate comparative and community-based ethnobiological research.

Niclas Burenhult is Associate Professor and Senior Reader in Linguistics at Lund University, Sweden. His research concerns the relationship between language, culture and the environment, with a particular focus on landscape. He is a leading expert on the Aslian (Austroasiatic) languages of the Malay Peninsula and has directed major documentation programs in that setting. He is a co-ordinator of the Repository and Workspace for Austroasiatic Intangible Heritage (RWAAI), a digital resource dedicated to endangered languages of Southeast Asia. He is the author of numerous works on the relationship between language and the environment, as well as on various aspects of the Aslian languages.

Jeff Good is Professor of Linguistics at the University at Buffalo, State University of New York. His research interests include comparative Niger-Congo linguistics, language documentation, and morphosyntactic typology. He is presently engaged in team-based documentary work focused on the languages and multilingual ecology of the Lower Fungom region of Cameroon. This involves close collaboration with faculty and students at a number of Cameroonian universities and with specialists in anthropology, computer science, and geography. His research has also considered how digital tools and methods can best support the documentation of endangered languages.

Birgit Hellwig is based at the Department of Linguistics, University of Cologne, where she combines language documentation with psycholinguistics, focusing on the adaptation of longitudinal and experimental approaches to investigate language acquisition and socialization in diverse socio‐cultural settings. She is currently working with the Qaqet in Papua New Guinea, and she continues to be interested in the documentation and description of the adult language, researching Goemai (a Chadic language of Nigeria), Katla (a Niger‐Congo language of Sudan) and Tabaq (a Nilo‐Saharan language of Sudan).

Susan Penfield received a Ph.D. in Linguistic Anthropology from the University of Arizona in 1980, where she was later an instructor in the Second Language Acquisition and Teaching Ph.D. Program (SLAT) and for the American Indian Language Development Institute (AILDI). From 2008 to 2011, Penfield directed the Documenting Endangered Languages Program at the National Science Foundation (NSF). She was awarded a Smithsonian Fellowship for Native American Programs in 2012. She is currently teaching for the University of Montana Linguistics Program and for the University of Arizona Certificate Program in TESL. Dr. Penfield specializes in language documentation, language reclamation, community-based language/linguistic training and interdisciplinary applications in all of these contexts.

Language Documentation & Conservation Special Publication No. 21 (October 2020)
Interdisciplinary Approaches to Language Documentation ed. by Susan D. Penfield, pp. 1-8
http://nflrc.hawaii.edu/ldc/
http://hdl.handle.net/10125/24940

Introduction: Interdisciplinary Approaches to Language Documentation

Susan D. Penfield University of Arizona / University of Montana

This special edition of the Language Documentation & Conservation Journal results from a speakers’ series held at the Third International Conference on Language Documentation and Conservation in 2013. The speakers’ series was designed to bring attention to the complexities and value of interdisciplinary research in language documentation. Four speakers shared their ongoing research and have since revised their talks into the papers presented herein. Each paper offers a slightly different approach to designing and implementing interdisciplinary research in the context of language documentation and each involves a different combination of disciplines. The authors present some aspect of what constitutes and defines interdisciplinary research, an approach that has been a frequent topic of academic rhetoric but remains on the fringes of mainstream research agendas. Before discussing the papers directly, some discussion of what constitutes ‘interdisciplinary research’ is in order.

1. What is interdisciplinary research?

One workable definition says,

Interdisciplinary research is any study or group of studies undertaken by scholars from two or more distinct scientific disciplines. The research is based upon a conceptual model that links or integrates theoretical frameworks from those disciplines, uses study design and methodology that is not limited to any one field, and requires the use of perspectives and skills of the involved disciplines throughout multiple phases of the research process. (Aboelela et al. 2007: 341)

This proposed definition was mainly designed to help decision makers in funding agencies, as well as researchers, identify an interdisciplinary approach (Aboelela et al. 2007: 341).

Another definition, used by the National Science Foundation, is as follows:

Interdisciplinary research is a mode of research by teams or individuals that integrates information, data, techniques, tools, perspectives, concepts, and/or theories from two or more disciplines or bodies of specialized knowledge to advance fundamental understanding or to solve problems whose solutions are beyond the scope of a single discipline or area of research practice.1

1 This definition originated in the 2004 Report of the National Academy of Sciences, National Academy of Engineering, and Institute of Medicine titled Facilitating Interdisciplinary Research. See the NSF page at https://www.nsf.gov/od/oia/additional_resources/interdisciplinary_research/definition.jsp

Creative Commons Attribution Non-Commercial No Derivatives Licence

One problem entailed in defining the term ‘interdisciplinary’ is to distinguish it from similar terms such as ‘multidisciplinary’ or ‘cross-disciplinary’. In the general literature, the terms ‘interdisciplinary,’ ‘multidisciplinary,’ and ‘cross-disciplinary’ are often used interchangeably, although Amith draws distinctions between them, noting that ‘cross-disciplinary’ is the most neutral term.

Amith also suggests that “the key word in the [NSF] definition is ‘integrates’, as it is precisely this integration that is taken to distinguish interdisciplinary from multidisciplinary studies.” This notion of integration is, indeed, pervasive throughout the literature on interdisciplinary research. Note that

Interdisciplinary research asks how these disciplinary understandings can be merged, expanded, and transcended. Interdisciplinary research will continue to require concepts and methods developed through disciplinary research, but it will integrate (emphasis added) that knowledge to create new connections between disciplines and new explanations of complex phenomena. At its best, interdisciplinary research creates knowledge that no single discipline can create on its own. (Derrick et al. 2011: 3)

The most defining aspects of true interdisciplinary research are that 1) the varying disciplines are joined early in the planning process and ‘integrated’ in terms of theoretical and practical input to the targeted research; 2) interdisciplinary projects have the potential to form whole new disciplines; and 3) such projects must, on some level, meet the research design goals of each discipline involved.

2. What background in language documentation invites interdisciplinary research?

We can balance these definitions against the frequently quoted ideal scope of language documentation, which states that “conventional language documentation strives to create comprehensive records of the linguistic practices of speech communities” (Himmelmann 1998: 166; cf. Gippert et al. 2006). In so doing, we can see clearly that there is a real requirement to investigate more than just linguistic phenomena in language documentation. The emphasis on ‘practices’ begs researchers to consider the broader scope of things, other than the language itself, that might impact or generate those linguistic practices.

When language documentation was first being developed as a unique field of study, most of the attention was given to the gathering of language-specific data. There was also a good deal of attention paid to the role of technology, the role of linguistic representation, and the linguistic scope and processing of gathered data. As the ‘legs’ of this new discipline were established, researchers began to explore what else could be accomplished in the context of language documentation. There was a realization that language documentation could be layered with other disciplines to produce a richer, broader outcome for both researchers and members of the speaking communities involved. The field’s early recommendations that ‘linguistic practices’ be documented opened the door for many other considerations. And, while there has since been some significant work in language documentation that utilized an interdisciplinary approach (see Holton 2012 and Thieberger 2012), there has been little in-depth discussion about how researchers decide on and develop interdisciplinary research.


3. What are the parameters of interdisciplinary research?

The above definitions provide some explanation of what interdisciplinary research is. Let’s also be clear about what interdisciplinary research is not. It is not the mere addition of researchers from various disciplines or with different academic and professional credentials to a specific project. The addition of other researchers, perhaps considered as ‘token’ participants, used to make the project ‘look’ like it has more range and depth, will not qualify. The parameters of true interdisciplinary work begin with the conception of a project; it is not sufficient to add or include another discipline after the project has been created simply to show an attempt to broaden and be more inclusive in the research. There is one exception to this general rule: when the data from a language documentation project begin to yield results that might require the perspective of another discipline.

The parameters of an interdisciplinary project are set in the analysis of the conceptual framework, the study design and execution, the type and method of data analysis, conclusions, and future applications. As such, the development of the study design and direction helps to define research team competencies and drive team development. Once the academic world embraced the full scope of projects that were cast as ‘interdisciplinary’, research of this type began to move from being seen as a random, unsystematic occurrence to an essential, teachable research approach. This is a critical change in our conception of what it means to be interdisciplinary. Unfortunately, actual attention to these parameters is often missing, and the approach is rarely taught.

4. Are there distinct advantages and disadvantages of interdisciplinary research?

Yes, there are both advantages and disadvantages. Among the advantages are:

a. More sharing of responsibility by members of the research team. To this end, from the beginning, the central research question should drive the creation of an interdisciplinary research team. As Burenhult (this volume) asks, “unless the collaborating disciplines are allowed to define aspects of the documentary agenda, how will linguists be able to make the most of the collaboration?”

b. The opportunity to bring a different set of research questions to the same project data. In some cases, adopting an interdisciplinary approach can ultimately be seen as leading to a more informed view of the ideal linguistic foci for a documentary project, thereby having a significant long-term impact on documentary activities (Good, this volume).

c. The ability to potentially attract funding from a broader range of sources within a single agency or even across agencies; such research offers more “bang for the buck.” Funding agencies realize more research outcomes for their money in such broader-based research.

d. The fact that funding agencies specifically encourage interdisciplinary work. Consider the DoBeS project (http://dobes.mpi.nl/projects/) and the National Science Foundation (https://www.nsf.gov/od/oia/additional_resources/interdisciplinary_research/), for example.

e. The requirement for “out of the box” project designs which create fuzzy boundaries (that’s good); such designs even create new disciplines. Good (this volume) reports that in one case, “an interdisciplinary focus … identified a significant new domain of documentary investigation that [had] been largely neglected.”

There are also some potential disadvantages or difficult considerations. Among them is that academic institutions often claim to support interdisciplinary work but are, in fact, structured precisely in ways that make it difficult. Our academic culture is largely based on strong disciplinary boundaries, reinforced by professional societies, institutional hierarchies, and publication sources and requirements. Rhoten (2004: 6) writes,

The fact is, universities have tended to approach interdisciplinarity as a trend rather than a real transition and to thus undertake their interdisciplinary efforts in a piecemeal, incoherent, catch-as-catch-can fashion rather than approaching them as comprehensive, root-and-branch reforms. As a result, the ample monies devoted to the cause of interdisciplinarity, and the ample energies of scientists directed toward its goals, have accomplished far less than they could, or should, have.

This extends to how researchers often see themselves in relation to their discipline and how willing they are to push their own limits. Good (this volume) notes that a scholar must develop a “collaborative personality.” Agreements across academic departments or programs can be complicated. There are often tensions around where to publish the work: academic journals are most often focused tightly on a single discipline and may not welcome work that is truly interdisciplinary. For example,

Effective interdisciplinary research often requires collaborators to gain fairly deep knowledge about how practitioners of other disciplines collect and theorize on their data, and may further result in academic outputs that are neither fish nor fowl, as it were, in terms of disciplinary evaluation. Is a culturally informed collection of place names … an instance of linguistics, anthropology, or geography? Questions like this do not merely provide interesting intellectual puzzles. They can have real-world consequences given the fact that disciplines do not merely exist to provide a convenient way to categorize different methods of inquiry but are also embedded within the institutional structures which support scholarship. (Good, this volume)

There may also be theoretical or methodological challenges leading to problems in both the conception and the implementation of the research in question. Language documentation fieldwork carries with it an established methodology for data collection and ethical rules of engagement with community partners. The definition of ‘fieldwork’ itself might differ from that in other disciplines, and fieldwork methodologies can certainly differ and become a source of conflict. Issues also tend to arise around data management and ownership. For these reasons specifically, a designated interdisciplinary research team needs to anticipate and address as many of these issues in advance as possible.

A starting point for anyone considering an interdisciplinary approach to research is to ask: What are the boundaries of my discipline? Think about research in related disciplines or sub-disciplines and ask: is their research designed and/or evaluated by the same criteria? The advantages of interdisciplinary research are many, but must always be balanced carefully against any real or perceived disadvantages.

5. What are the goals for this issue?

The Speakers’ Series was conceived to bring attention to the increased value of interdisciplinary approaches to language documentation and to create some discussion about how this fairly new development was defined or created. By 2013, language documentation was well established as a field of study, and those who were beginning to bridge it with other disciplines were varying their approaches. At the time of the Speakers’ Series, there was a growing body of research which was, in one way or another, fitting the description of ‘interdisciplinary’ research (e.g., Holton 2012; Thieberger 2012). Researchers were beginning to realize that strict language documentation projects, while always valuable for their linguistic content alone, proved good opportunities to gather other types of data. There was a recognition that, if one was going to record language anyway, why not also make the content of language genres and linguistic practices relevant in some other way?

The researchers who contributed to this issue were asked to write about their experiences with interdisciplinary research. Basically, they were asked to reflect on that experience and share with the language documentation audience how they thought about and approached a project set to be interdisciplinary. They were asked to consider how they approached, conceived of, implemented, and negotiated full-scale research while engaging disciplines other than linguistics in the context of language documentation projects. They were, therefore, asked to write about the experience, about how the interaction with another discipline came about and how it moved forward. They were also asked to discuss what the strengths and the weaknesses of engaging in interdisciplinary research were. In order to allow them plenty of room for reflection on what they thought was most important, they were not otherwise restricted in either the content or the length of their contribution.

6. Overview of the included articles

Concepts of landscape and language blend together, as Niclas Burenhult explores in his paper, “Domain-driven documentation: The case of landscape.” In this article, the author relates experiences from his fieldwork among the Jahai, an Austroasiatic-speaking group of rainforest foragers in the Malay Peninsula. He argues that “landscape forms a constant scene for our actions, thoughts, and beliefs.” Burenhult also notes that landscape provides a natural conversational space between researcher and community members.

Burenhult defines a ‘domain’ as “an experiential sphere of universal relevance which is highly likely to be a target of human representational strategies,” and writes that domains “can be categorically identified in every language and be straightforwardly compared across them.” Domains therefore represent arenas which facilitate interdisciplinary communication and collaboration. Accordingly, they may also be our best chance of exploiting external expertise to enrich documentation programs and regenerate and expand documentation as a field (Burenhult, this volume). Burenhult suggests that, by using a domain-driven approach, the documentary record goes beyond a comprehensive record of linguistic practices and assures more reusability of the documentary resources by other disciplines. He does, however, raise a key question, repeated by many who work in the interdisciplinary arena: how do researchers reconcile the goals of their documentation projects with both the theoretical and practical goals of the other disciplines involved?

Birgit Hellwig presents interesting perspectives on integrating child language acquisition research with language documentation. Her paper, “Child language documentation: A pilot project in Papua New Guinea,” begins by recognizing the lack of this type of research within the realm of language documentation research. Hellwig establishes some of the overlapping areas of interest which contribute to child language documentation, explaining that it is anchored both in anthropology (i.e., language socialization research) and psycholinguistics (i.e., language acquisition research), and that it draws on three distinct data types: anthropological, experimental, and longitudinal. Hellwig notes that of the three types, documenters are most familiar with anthropological data, and that “there is a long-standing tradition of including anthropological components within language documentation projects….” However, Hellwig contrasts this with the lack of expertise in the field with “the two data types most relevant for language acquisition research: experimental and longitudinal data.” Her paper explores and explains the interaction of various disciplines in relation to her field site in Papua New Guinea as she elaborates on the different methodologies at play when two fields (documentation and psycholinguistics) intersect.
She points out the need for collaboration between researchers in language documentation, language acquisition, and language socialization. She elaborates on the problems with methods developed to fit Western contexts and notes that Western-based research approaches may not apply in language documentation for endangered languages. One problem she encountered was that child language studies require experimentation and longitudinal data, which are not part of classic language documentation. This raises a question that persists through much interdisciplinary research: what do we do about competing methods? For example, Hellwig considers whether we should treat child language documentation as part of psycholinguistics, or whether it is more reasonable, effective, and workable in the field to integrate a more anthropological approach. Language documenters often find themselves wedded to the latter because following a set Western agenda for research fails to account for community practices and participation. Again, true interdisciplinary work would require an ‘out of the box’ research design which would result in collaborative or complementary agendas, and a very flexible approach to fieldwork methods.

In his paper, “Interdisciplinarity in areal documentation: Experiences from Lower Fungom, Cameroon,” Jeff Good discusses his work with anthropologist Pierpaolo Di Carlo, and with several other linguists, during a study of linguistic diversity involving seven languages spoken in thirteen villages where linguistic diversity has been maintained over time. This research spans linguistics, ethnography, geography, and archaeology; its central research question explored the factors that have allowed Lower Fungom to develop and maintain its extreme linguistic diversity. Good explains how “the standard documentary linguistic toolkit has been augmented by an interdisciplinary approach to studying the region.” Good describes the way his approach involved “prioritizing the documentation of multilingual practices in Lower Fungom” which “emerged directly from an insight of [his research team’s] interdisciplinary research regarding local language ideologies.” Reflecting on the complexities of interdisciplinary research, Good notes that in the “academic world that is built upon disciplinary foundations, working across disciplines requires a high level of expense with comparatively little ‘overt’ payoff.” Good underscores the purpose, intent, and value of all the papers collected here, noting that “the rise of the documentary paradigm as an approach to the study of endangered languages has been, at least rhetorically, associated with an emphasis on the value of interdisciplinary collaboration as a means to come to a fuller understanding of the diverse linguistic practices of communities throughout the world.”

Jonathan Amith’s paper, “Endangered language documentation: The challenges of interdisciplinary research in ethnobiology,” discusses a project that blended ethnobiology with language documentation in the development of a floristic and faunal inventory of the natural environment in the Nahuatl-speaking region of the Balsas River Valley in central Guerrero. Through digitally recorded and transcribed texts, this project has investigated the cognitive aspects of nomenclature and categorization and explored the utilitarian and cultural aspects of the local flora and fauna. In this paper, the interaction of disciplines (biology, linguistics, and anthropology) is examined at the level of data-gathering and interpretation, providing a detailed examination of disciplinary integration. Amith approaches the interaction of disciplines from two perspectives: 1) understanding the extent of the integration among disciplines needed to solve a problem and 2) considering the degree to which participants from different disciplines are acting in service to other disciplines or pursuing their own research agenda. Amith expands on the notion of working ‘in service’ to other colleagues, saying, “through community-based collaboration, ethnobiological projects are able to extensively collect flora and fauna, often in areas that are poorly explored. Herbaria and museum collections may thus be built up at a relatively low cost and new geographical references and species are often discovered.” (See Amith fn. 5 for detailed results of Amith’s collaborative work with biologists and the resulting collections.) His paper provides a thorough discussion of elicitation practices and approaches used by other disciplines, suggesting that such documentation provides for both qualitative and quantitative analysis of how community members identify the flora and fauna of their natural environment.

Summary Comments: Interdisciplinary research is promoted by funding agencies, some institutions, and many researchers but, as the papers here demonstrate, is not always easily understood or implemented. High-quality interdisciplinary research begins in the planning stage, proceeds with lots of collaboration and cooperation, engages a range of methods and practices, and results in outcomes that benefit all involved. It is not an easy task, but the engagement of different disciplines in language documentation pushes the boundaries of research and provides a stronger documentary record not limited to language-specific data.


References

Aboelela, Sally W., Elaine Larson, Suzanne Bakken, Olveen Carrasquillo, Allan Formicola, Sherry A. Glied, Janet Haas & Kristine M. Gebbie. 2007. Defining interdisciplinary research: Conclusions from a critical review of the literature. Health Services Research 42(1 Pt. 1). 329–346.

Amith, Jonathan. This volume. Endangered language documentation: The challenges of interdisciplinary research in ethnobiology.

Burenhult, Niclas. This volume. Domain-driven documentation: The case of landscape.

Derrick, Edward G., Holly J. Falk-Krzesinski & Melanie R. Roberts (eds.). 2011. Facilitating interdisciplinary research and education: A practical guide. Produced by Research Corporation for Science Advancement for the American Association for the Advancement of Science (AAAS). 1–43.

Franchetto, Bruna. 2006. Ethnography in language documentation. In Jost Gippert, Nikolaus P. Himmelmann & Ulrike Mosel (eds.), Essentials of language documentation, 183–211. Berlin and New York: Mouton de Gruyter.

Gippert, Jost, Nikolaus P. Himmelmann & Ulrike Mosel (eds.). 2006. Essentials of language documentation. Berlin and New York: Mouton de Gruyter.

Good, Jeff. This volume. Interdisciplinarity in areal documentation: Experiences from Lower Fungom, Cameroon.

Hellwig, Birgit. This volume. Child language documentation: A pilot project in Papua New Guinea.

Himmelmann, Nikolaus P. 1998. Documentary and descriptive linguistics. Linguistics 36. 161–195.

Holton, Gary. 2012. Language archives: They’re not just for linguists any more. In Frank Seifart, Geoffrey Haig, Nikolaus P. Himmelmann, Dagmar Jung, Anna Margetts & Paul Trilsbeek (eds.), Potentials of language documentation: Methods, analyses, and utilization (LD&C Special Publication no. 3), 111–117. Honolulu: University of Hawai’i Press.

Holton, Gary. 2018. Interdisciplinary language documentation. In Kenneth Rehg & Lyle Campbell (eds.), Oxford handbook of endangered languages. Oxford University Press.

Penfield and Reinhardt. 2012. Partnership for Indigenous Knowledge and Digital Literacy. National Science Foundation Grant #1321663.

Rhoten, Dianne. 2005. National Academy of Sciences. Facilitating Interdisciplinary Research. Washington, DC: The National Academies Press.

Thieberger, Nicholas (ed.). 2012. The Oxford handbook of linguistic fieldwork. Oxford: Oxford University Press.

Language Documentation & Conservation Special Publication No. 21 (October 2020)
Interdisciplinary Approaches to Language Documentation ed. by Susan D. Penfield, pp. 9-23
http://nflrc.hawaii.edu/ldc/
http://hdl.handle.net/10125/24943

Domain-driven Documentation: The Case of Landscape

Niclas Burenhult Lund University / Max Planck Institute for Psycholinguistics

Abstract

It is becoming increasingly evident that the field of language documentation and the documentary multimedia resources it produces rely on expanding their relevance and usability to disciplines beyond linguistics in order to increase their chances of being sustainable in the long term. This paper argues that more attention should be paid to the needs and interests of such disciplines in language documentation schemes. One way of doing so is to set out from fundamental domains of human experience in designing documentation programs, domains which are of immediate concern to disciplines such as geography, biology, history, anthropology, and so on. Particular focus is placed on the domain of landscape, explored in two documentation programs coordinated by the author. In addition to providing clear interdisciplinary arenas of inquiry, such domain-driven approaches also offer excellent opportunities for efficient collection and construction of the comprehensive records of linguistic practices stipulated by current documentation initiatives.

CC Creative Commons Attribution Non-Commercial No Derivatives Licence

1. Background

Conventional language documentation strives to create comprehensive records of the linguistic practices of speech communities (Himmelmann 1998: 166; cf. Gippert et al. 2006). In practice, this involves collection and analysis of as wide an array of language genres as possible. Indeed, this has been a general and fundamental principle since language documentation was first promoted as a discipline in the late 1990s and continues to shape the priorities of funding programs and thus the agendas of projects. Implicitly, and sometimes explicitly, any narrowing down of the documentary objective has been viewed as problematic for the overarching goal of comprehensive documentation. The reason is obvious: documentation agendas driven by specific theories or subareas of linguistics will inevitably place emphasis on specific linguistic levels, categories, or structures, or on specific genre types which are convenient for getting to any such levels, categories, or structures. Clearly this would be at the expense of comprehensive documentation, for which the priority should be to capture all aspects of language with extreme urgency, as languages are rapidly dying out.

But conventional, comprehensive language documentation has a paradoxical downside: it, too, is narrow. The broad collection of language genres may cater well to a wide range of linguistic interests and approaches, but without further consideration given to the content of a documentary corpus, such a resource will be of limited use to the non-linguist. In fact, it runs the risk of being dreadfully restricted. A biologist, geographer, or historian interested in indigenous knowledge and belief systems will be lucky to find anything of analytical use in a corpus collected only according to the ideal of a comprehensive record of linguistic practices.

This limitation is particularly problematic because it seriously restricts the reusability of these documentary resources and thereby hampers their future development. It is no secret that interest in secondary exploitation of existing language documentation corpora has so far been lukewarm, at best. Designing them to become interdisciplinary meeting places and workspaces, rather than just language repositories, is one way to ensure sustained interest into the future. However, this would seem to require some reassessment of how documentary projects and resources are conceived. In particular, greater attention would need to be paid to the interests and requirements of other disciplines, with potential effects on the fundamental principles of documentation.2

In this short paper I will discuss the potentials of language documentation driven by domains which are relevant to disciplines other than linguistics. I will reflect on the interdisciplinary aspects and experiences of two major schemes of documentation which I have been fortunate enough to be involved in, both of which can be characterized as ‘domain-driven’.3 Pursuing domains is not unusual in language documentation. Indeed, funding programs have supported a number of documentation projects which have used this approach with great success.4 Still, the approach has so far not been clearly formulated or theorized, nor promoted, as a viable method for making language documentation at large more efficient, more scientifically interesting, and wider in its outreach while it continues to evolve.

2 Holton (2012) provides a more upbeat assessment of the usability of existing archives for disciplines beyond linguistics, citing examples of recent interest among non-linguists in ethnobotanical, historical, and musicological aspects of legacy language collections in the Alaska Native Language Archive. What these examples clearly show is that language documentation has the ability to attract an interdisciplinary audience and that further efforts to involve other disciplines at an early stage in the documentation process are likely to increase the relevance and usability of the resulting resources.
3 This paper builds on a presentation given by the author at the 3rd International Conference on Language Documentation and Conservation, Honolulu, February 28 – March 3, 2013. I am grateful to Susan Penfield and the National Science Foundation for inviting me to this event. The research programs reported in this paper received generous support from the Max Planck Society, the Volkswagen Foundation’s DOBES program, and the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007-2013) / ERC Grant agreement n° 263512. I also gratefully acknowledge the support of the Lund University Humanities Lab. Special thanks to Clair Hill, Nicole Kruspe, Konrad Rybka, and two reviewers of LD&C for their comments on an earlier version of this paper.
4 Aung Si’s project Documentation of the language and biological knowledge of the Solega (ELDP, http://elar.soas.ac.uk/deposit/0150), Maia Ponsonnet’s A culturally informed corpus of Dalabon (ELDP, http://elar.soas.ac.uk/deposit/0071), and Hans-Juergen Sasse et al.’s Iwaidja project (DOBES, http://dobes.mpi.nl/projects/iwaidja/) are some of the many examples of highly successful, domain-driven documentation. Programs targeting ethnobiology and music are particularly well represented.

2. Domain-driven documentation

Here, a domain is to be understood as an experiential sphere of universal relevance which is highly likely to be a target of human representational strategies. Such domains do not presuppose language-specific or cross-linguistic applicability in the sense that they can be categorially identified in every language and be straightforwardly compared across them. Indeed, they need not (and frequently do not) surface as well-defined and basic semantic domains in individual languages. But they encompass phenomena which are so fundamental and universally relevant that every human, and every human community, will be likely to have some means of representing them in language and thought, and will find the subject matter to be of central importance to their lives. This makes them particularly interesting not only because they provide convenient dimensions for in-depth language exploration (see further below), but also because they have specific potential for cross-linguistic comparison of representational strategies.

Much work in cognitive and linguistic anthropology relies on such domains. For example, a long tradition of research on folk categories has focused on domains like plants and animals (Berlin 1992; Atran & Medin 2008), color (Berlin & Kay 1969; Kay et al. 2010), and anatomy (Brown 1976; Majid et al. 2006). More recent work in the language and cognitive sciences (especially out of the Language and Cognition Group at the Max Planck Institute for Psycholinguistics) has extended the cross-cultural comparison to include domains such as space (Levinson 2003; Levinson & Wilkins 2006), landscape (Burenhult 2008a; Mark et al. 2011a), and the senses (Levinson & Majid 2014), as well as event-based domains like material destruction (Majid & Bowerman 2007), caused motion (Narasimhan & Kopecka 2012), and reciprocity (Evans et al. 2011).

Modern language documentation initiatives are of course aware of the significance of such basic domains as highly interesting and relevant targets of documentation, especially since they are often closely associated with endangered indigenous knowledge systems of various kinds (cf. Evans 2009). For example, the contributions in the Oxford Handbook of Linguistic Fieldwork (Thieberger 2012) cover a range of areas suitable for documentary attention, from astronomy and music to food and mathematics, and underline their potential for interdisciplinary collaboration. But the language documentation community faces two major challenges in this regard.
First, how does one reconcile a thematically wide-ranging approach with the prescribed comprehensive documentation of linguistic practices and genres without losing depth? Clearly, any project aiming to cover as many genres as possible in as many domains as possible runs the risk of becoming superficial, to linguists and non-linguists alike.

Second, how does one reconcile the documentary goal with the intellectual ambitions of the collaborating disciplines? Unless there is something to gain, why would the non-linguist invest time, energy, and brainpower in language documentation? And unless the collaborating disciplines are allowed to define aspects of the documentary agenda, how will linguists be able to make the most of the collaboration? So far, interdisciplinary collaborators have mainly served as auxiliaries, and their participation has hardly produced significant scientific game changers in their home disciplines.

I will argue here that documentation agendas which set out from a specific domain are apt to deal with both of these challenges, and below I list some major advantages of, and justifications for, this approach to language documentation.

First, domain-driven documentation is conceptually convenient for the researcher and, in particular, for his or her language consultants. As mentioned, domains (as defined here) are typically cognitively and linguistically accessible and relevant to speakers, and they are coherent and interesting topics of conversation (whereas, say, serial verb constructions or phrase-level prosody are less likely to be of community-wide interest). They therefore tend to form natural ontologies or dimensions for scientific inquiry in the field, into which one can quickly go in depth, and around which different analyses and data types can be organized at leisure. Importantly, domains are excellent for maximizing community involvement in the research process, and they even allow consultants to take the intellectual and procedural lead. They are also more likely to attract future community interest in the resulting documentary resource(s). Needless to say, these characteristics make for an efficient research program and the possibility of collecting large amounts of data in a short time. So even if the domain as such is not one’s primary research interest, it may still well serve as one’s best gateway to the relevant data.

Second, a seemingly narrow domain is not necessarily at odds with the principle of ‘comprehensive record of linguistic practices’; quite the opposite. Chosen and delimited with care, a domain-driven agenda can be perfectly capable of (and even ideal for) accommodating a wide variety of communicative events and data types, from elicitation and interviews, exclamations and conversations, to stories, myth, ritual, poetry, and song. Again, a domain-driven agenda may in fact be the most efficient way of getting to many of these genres, and reaching the goal of a comprehensive record of linguistic practices. Similarly, domains harbor structural and semantic phenomena of relevance to the linguistic system as a whole, and offer lines of inquiry which are culturally entrenched and likely to provide exciting new perspectives on the language and culture. They have the potential to offer those scientific ‘scoops’ of which language documentation is in such need.

Third, an explored domain can form a suitable underpinning for exploration of other domains. Domains are of course not conceptually monolithic and will have connections to other aspects of community experience, leading naturally into other culturally relevant phenomena. For example, a documentary agenda with a focus on plants and animals can be regenerated to include, e.g., subsistence techniques, food and eating, perception, ritual and cosmology, place and landscape, and so on. Thus, as a starting point for driving, regenerating, and diversifying a documentary agenda in culturally embedded ways, domains have particular potential.

Fourth, domain topics are often immediately relevant to other disciplines. This is because scientific branches and fields typically operate along similar fundamental dimensions of human experience. Life forms and biotopes are dealt with by biologists and ecologists, landscape by geographers, human motion by biomechanists, food by nutritionists, and so on. Domains therefore represent arenas which facilitate interdisciplinary communication and collaboration. Accordingly, they may also be our best chance of exploiting external expertise to enrich documentation programs and regenerate and expand documentation as a field.

Finally, domains are eminently documentable in their own right, often urgently so. The indigenous ontologies and knowledge systems which tend to associate with domains often have a more critical level of endangerment than the languages themselves. For example, in processes of acculturation, assimilation, modernization, displacement, and environmental degradation, traditional knowledge of the environment is usually the first to go. For any discipline with an interest in recording human diversity, this aspect of disappearing intangible heritage is just as important as the linguistic systems.

This reasoning may seem intuitive and unremarkable; as noted, much modern documentary work is already carried out more or less according to such principles. But the fact is that the domain-driven approach has so far not been clearly formulated or theorized, nor promoted, as a viable method for conducting efficient, high-yield language documentation of relevance far beyond linguistics as such. The following sections outline our experiences of documentation in the context of the domain of landscape.


3. Landscape: A fundamental domain?

The geophysical environment (i.e. ‘landscape’) shows all the signs of being a domain of fundamental concern to human existence and experience, and is thus very promising as a frame for language documentation. Every member of our species inhabits a landscape, and landscapes have a profound influence on our lives: how we subsist, move around, find our way, and make our home. Landscape forms a constant scene for our actions, thoughts, and beliefs. It provides us with large and immovable entities and surfaces, with spatial and temporal constancy and large-scale three-dimensional complexity, thereby forming a very distinct conceptual domain with its own spatial properties. Landscape is the spatial backdrop for cognitive development in children. It was also the spatial backdrop throughout the cognitive evolution of our species. We have special brain systems for remembering individual places (Burgess et al. 1999) and for distinguishing landmarks from other kinds of places (Janzen & van Turennout 2004). So, with the possible exception of the human body, it is difficult to imagine a domain more fundamental to human cognition.

But, despite its universal character and relevance, landscape has the additional interesting quality of being highly variable. Thus, landscapes vary as to their profile, geology, hydrology, and vegetation. Their features can be hilly or flat, concave or convex, solid or soft, arid or soaked, open or forested. Deserts, rainforests, arctic icescapes, oceans, mountain meadows, steppes, savannahs, and the constructed landscapes (or ‘cityscapes’) in which more than half of humanity now lives, are just some examples of landscape diversity. Humans are unique as a species in having colonized all of these geographical and ecological niches. Human cognition and language have therefore had to confront and represent an astounding diversity of spatial backdrops. The brain systems which handle our memory for places and identification of landmarks must be flexible and general enough to map onto features which can be highly specific to a particular environment. That a fundamental domain like landscape is so variable implies that it is a rewarding area for the study of variation in human representational systems. It is special in that it places a demand on human language and cognition not only for universal attention, but also for maximal plasticity in representation.

These and other characteristics make landscape a candy store for linguistics. For example, language is seriously put to the test when it comes to the delimitation and lexical labeling of geographical entities, since such entities seldom have clear boundaries, with interesting repercussions for conceptualization in general. Furthermore, landscape is an interesting domain for the identification of structured sets of lexicon, semantic fields, and relations, with possible grammatical reflexes and consequences. Landscape is also a fundamental scene for proper naming (place names) and offers opportunities to study how such naming is ontologically related (or not) to common nouns in the form of terms for geographic features. In addition, motion verbs, locative verbs, topological relations, spatial frames of reference, deixis, metaphor, loanwords, and a number of other semantic and grammatical categories and phenomena, are advantageously explored in a landscape context.

Importantly, however, our current understanding of landscape categorization suggests there is little evidence for a universal, linguistically definable and recurring semantic domain that we can call landscape (Burenhult & Levinson 2008). That is, not every language shows language-internal evidence in the form of lexical ‘unique beginners’ or relations, or structural patterns, which form systems that neatly correspond to the realm of physical geography. Instead, languages vary tremendously as to their approach to the domain. Some do indeed treat it as lexically fundamental, and have systems of basic terminology organized under an overt generic category of ‘land’ or ‘landscape’. For others, however, linguistic representation of landscape is secondary in the sense that other, landscape-external linguistic dimensions form the primary means of mapping the domain, e.g., metaphor drawn from other domains (Burenhult 2008b), or patterns of word formation characteristic of the language system as a whole (O’Meara & Bohnemeyer 2008). The one landscape-related linguistic category which so far appears to show the greatest degree of universality and stability across languages is that of place names; these are potentially a candidate category for a cross-linguistically valid definition of the domain (Burenhult, in progress a). But even place names vary enormously in their structural, semantic, and referential properties.

Again, however, all languages are forced to deal with landscape in some way or other, and in that sense the domain is certainly fundamental. And the variation in how languages cope with landscape makes it a particularly interesting observatory for exploring linguistic representation.

4. ‘Landscape-driven’ documentation

This section describes aspects of landscape-related documentation on the basis of work carried out in two research programs: the language documentation project Tongues of the Semang,5 supported by the DOBES program 2005–2011, and Language, Cognition and Landscape (LACOLA),6 funded by the European Research Council 2011–2016, both coordinated by the author. The experiences conveyed are predominantly from my own fieldwork among the Jahai, an Austroasiatic-speaking group of rainforest foragers in the Malay Peninsula.

4.1 Conceptual and procedural rewards

In our projects, landscape has provided a convenient baseline for the collection and organization of documentary data. First, the inherent spatial properties of landscape offer an immediate and highly intuitive ontology for organization of data of any type. Every data point (such as the location of a video recording or an elicitation session) can be given Global Positioning System (GPS) coordinates, situating all parts of a data set in relation to each other in large-scale space.7 Whether or not a particular type of data pertains directly to landscape, it is still always spatially identifiable and accessible for spatial analysis, e.g., in Geographic Information Systems (GIS) of different kinds. So, even if data collection is not primarily targeting landscape as such, a spatially tagged data set will be of potential use to anyone interested in landscape or spatiality at large. Spatial coordinates are therefore an invaluable form of metadata (see, e.g., Berez 2013; cf. Gawne & Ring 2016). The idea that landscape is an ever-present backdrop to human experience certainly applies to the experience of organizing data as well.
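As a rough illustration of how spatial coordinates can serve as metadata, the following Python sketch tags hypothetical data points with latitude and longitude and retrieves everything collected near a given place. The record structure, field names, labels, and coordinates are all invented for illustration and are not taken from the projects described here; the great-circle (haversine) distance is one common approximation, and a real workflow would more likely rely on a GIS package.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


@dataclass(frozen=True)
class DataPoint:
    """One documentary item tagged with the place where it was collected."""
    label: str       # hypothetical session identifier
    data_type: str   # e.g. "video", "elicitation", "song"
    lat: float       # decimal degrees (WGS84 assumed)
    lon: float       # decimal degrees (WGS84 assumed)


def distance_km(a: DataPoint, b: DataPoint) -> float:
    """Great-circle (haversine) distance between two data points."""
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = (sin(dlat / 2) ** 2
         + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def within_radius(points, centre, radius_km):
    """Return every data point collected within radius_km of centre."""
    return [p for p in points if distance_km(p, centre) <= radius_km]


# Invented example records: the data types differ, but all share coordinates.
sessions = [
    DataPoint("session_001", "video", 5.500, 101.500),
    DataPoint("session_002", "elicitation", 5.505, 101.502),
    DataPoint("session_003", "song", 5.900, 101.900),
]

# Everything recorded within 2 km of the first session, whatever its genre.
nearby = within_radius(sessions, sessions[0], 2.0)
print([p.label for p in nearby])
```

The point of the sketch is only that heterogeneous materials become mutually locatable once each carries coordinates, regardless of whether their content concerns landscape.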

5 http://dobes.mpi.nl/projects/semang/
6 http://projekt.ht.lu.se/lacola
7 One exception is legacy data whose exact place of collection has not been recorded and whose geo coordinates are therefore unknown.


Second, landscape has been a reasonably uncomplicated domain to work with in the sense that its representational categories usually refer to phenomena which have magnitude and permanence and are perceptually easy to access. In one respect, this makes it a far less problematic domain to explore referentially than plants and animals, for example, whose correct identification is notoriously challenging and time-consuming for the obvious reason that real referents are usually not readily available (although semantic boundaries are typically less clear in landscape than among life forms). However, documenting landscape has required a higher degree of mobility on the part of the researcher and consultants, compared to some other domains.

Third, in my experience, landscape and places are evident and constantly relevant topics of conversation between the researcher and community members, forming a natural dimension for documentary inquiry. Consultants have quickly grasped the topic and format of questions and, provided they have the knowledge, have typically populated the categories spontaneously by coming up with new exemplars (especially landforms, place names, and motion verbs). This has enabled a very efficient pursuit of categorical and knowledge systems, and the mutual conceptual structure it has afforded to the researcher and consultant has formed a basis for other aspects of linguistic inquiry and documentation.

4.2 Linguistic practices: Data types, genres

In the Jahai context, the landscape topic has lent itself very well to the collection of a number of different data types involving a range of communicative events at different levels of formality, structuredness, and spontaneity. First of all, landscape is a constantly relevant scene of reference in day-to-day conversation, even without prompting. Human activities, experiences, and impressions are ubiquitously discussed against a backdrop of named places, locations, landforms, and, especially, watercourses. Thus, any conversational data will be relevant for studies of landscape representation, and, conversely, landscape and place reference will be central to any analysis of conversation.

Similarly, structured narratives of various kinds are firmly anchored in landscape. Hunting and travel stories, as well as life histories, are always tied to different locations and movement between them. Creation myths are inextricably tied to every named place, since each place name is eponymous with the creation being whose body transformed into the area or landform in question. Each named place is also associated with a song.

Furthermore, landforms and named places serve as informative locations for on-site interviews and elicitation sessions pertaining to various aspects of expertise and language, from lexical and spatial distinctions to historical, ecological, and cosmological knowledge. This means that landforms and named places have shown great potential to form the underlying structure for collecting systematic data sets of many types and genres. For (and within) each location (and there are hundreds), the linguist can document conversation, stories, a creation myth, a song, associated knowledge and memories, lexical categories and systems, and so on. If our Jahai experience is anything to go by, domain focus need not be at the expense of documentation of a wide variety of linguistic practices. On the contrary, setting out from a baseline of place and landscape has proved to be an efficient method for quickly diversifying data collection and creating a large and systematic data set in a short period.

4.3 A basis for exploring other domains

As it is such a conceptually fundamental domain, landscape forms the backdrop for a range of other domains worthy of in-depth documentation, and can provide natural inroads into them. The research programs reported here explore two such connected domains. One is the subsistence targets and techniques of Jahai foraging (the major focus of the Tongues of the Semang project). The Jahai basic classification of resources involves three biotope categories which are essentially landscape categories: rivers, forest canopy, and forest floor. All targets of Jahai foraging, be they plants or animals, belong to one of these categories. Subsistence techniques and their associated lexicon are similarly based on this tripartite system (Levinson & Burenhult 2009; Burenhult, in progress b). Consequently, in this case, landscape categories are categorially interlocked with both the ethnobiological classification strategy and the techniques used for extracting resources.8

The second connected domain is human locomotion, also primarily explored in the Jahai context. Jahai has a set of motion verbs which encode the landscape features on which the movement takes place, and whether the movement is lengthwise or crosswise (e.g., along vs. across watercourse, along vs. across hillside, etc.; see Burenhult 2008b). These verbal categories map directly onto the landforms labeled by nouns and are in fact highly informative about Jahai landscape categorization and understanding. While this subset of motion verbs is firmly anchored in landscape, others with similar structural and semantic properties are not, e.g., a set of semantically specific tree-climbing verbs (Burenhult 2013a). The landscape motion verbs here provide a stepping stone into a whole new domain of human activity which only partly overlaps with landscape.

8 A related example of cross-domain connections is explored by Ewelina Wnuk in her PhD thesis on the relationship between the language of perception and the domain of life forms among the Maniq of southern Thailand. This work builds on her involvement as research assistant in the Tongues of the Semang project (Wnuk & Majid 2014; Wnuk 2016).

5. Interdisciplinary perspectives

A number of disciplines have a long tradition of interest in landscape, such as anthropology, archaeology, environmental psychology, philosophy, and cognitive geography (see, e.g., Bender 1993; Hirsch & O’Hanlon 1995; Tilley 1994; Bell et al. 1996; Mark et al. 1999; and Smith & Mark 2001). The nascent linguistic attention to the domain promises to introduce a variety of new questions and perspectives of inquiry for such disciplines. In particular, there is a growing interest among geographers in the cross-linguistic variation in geographical ontology and conceptualization, and the challenges that this variation poses to issues of geographical scientific classification and existing Geographic Information Systems (Mark et al. 2007; Derungs et al. 2013; Wartmann & Purves 2014). This should strike a chord with core concerns of the language documentation community: GIS are typically developed according to a Western understanding of landscape, forcing an imposed universal ontology onto indigenous names and categories. The ontological mismatches are bound to result in inter-cultural misunderstandings, and imposed systems are certain to put indigenous practices in peril. Documenting ontological variation and developing linguistically and culturally attuned applications of GIS will facilitate inter-cultural communication about spatial representations, and can strengthen indigenous practices and efforts to safeguard them. Here, geographers have taken important steps towards an interdisciplinary agenda: Mark & Turk (2003) and Mark et al. (2007) propose a new ethnoscience of landforms, ‘Ethnophysiography’, and Turk et al. (2012) provide a geographical elicitation guide for field linguists.

The LACOLA project investigated and documented the relationship between language, thought, and landscape in several diverse and endangered language settings (http://www.lu.se/lacola). One of its aims was to identify and explore interesting points of connection between geography and field-based linguistics. This was enabled by close collaboration with David Mark’s and Andrew Turk’s Ethnophysiography team and involved a range of methodological and theoretical considerations, as well as joint fieldwork on Navajo (Athapaskan, southwestern United States) and Manyjillyjarra (Pama-Nyungan, Western Desert, Australia). Major points of theoretical discussion included the challenges of defining a landscape domain (cf. Mark et al. 2011b), the basicness of categories across languages as reflected in basic vs. complex terms for landforms, and the ontological relationship between generic landform terms and proper names in the form of toponyms. While these issues only directly pertained to parts of the project’s program of documentation, the exchange resulted in a highly refined descriptive agenda for the individual languages studied by the project (see, e.g., Huber 2013, in press; Rybka 2014, 2016). Importantly, the results not only enhanced the linguistic description and documentation but also spoke directly to the intellectual concerns of the collaborating discipline.
The interdisciplinary discussion within the project also concerned methodology, and especially the pros and cons of Geographic Information Systems as tools for linguistic data collection, analysis, and documentation of indigenous categories. The limitations of Vector GIS (with its closed set of three geometrical categories: points, lines, and polygons) were a major concern, not least considering the potential risk of misrepresentation of such categories. Project case studies piloted various aspects of such GIS application, targeting different linguistic categories for data collection in the field with handheld GPS computers (e.g., place names, landforms, and motion in large-scale space; Burenhult et al., in progress). The challenges notwithstanding, GIS is a promising tool not only for providing a meta-structure for organizing documentary materials (see §4.1), but also for collecting, understanding, and visualizing indigenous linguistic categories and knowledge systems. Such indigenously informed GIS representations also have potential as analytical environments for investigating a range of linguistic practices pertaining to place and space (Burenhult 2013b).

Furthermore, LACOLA’s domain-driven research program created opportunities for collaboration with still other disciplines with a vested interest in landscape, namely landscape architecture and environmental psychology. Drawing on the project’s access to diverse cultural settings and associated linguistic expertise, a subproject coordinated by landscape architects Caroline Hägerhäll and Åsa Ode Sang investigated human landscape preference from a cross-cultural perspective. Previous preference studies suffered from a sample bias towards Western populations (cf. Henrich et al. 2010), and the project offered a first opportunity to test landscape preference across a culturally and ecologically diverse set of populations (Hägerhäll et al., in progress).


5. Conclusions

The main advantage of domain-driven documentation is its ability to structure the documentary endeavor along culturally embedded yet universally pertinent dimensions of direct relevance and access to both researchers and communities. I have argued here that domain-driven documentation is not only an acceptable way of doing language documentation, but in fact also a particularly efficient and rewarding method for meeting our documentary aims. It is culturally informed; it satisfies the requirements of comprehensive language documentation; it has the potential to produce significant scientific discoveries; and it can be of immediate interest to other disciplines. This last capacity is momentous, because such interest significantly increases the reusability of documentary resources and provides crucial incentives for sustaining and further developing them in the future.


References

Atran, Scott & Douglas Medin. 2008. The native mind and the cultural construction of nature. Cambridge: MIT Press.
Bell, Paul A., Thomas C. Greene, Jeffrey D. Fisher & Andrew S. Baum. 1996. Environmental psychology. Fort Worth: Harcourt Brace.
Bender, Barbara (ed.). 1993. Landscape: Politics and perspectives. Providence: Berg.
Berez, Andrea L. 2013. Simple GIS in documentation and description: Google Earth for the visualization and analysis of spatially-themed language use. Presentation at the Linguistic Society of America Annual Meeting, Boston, January 3–6, 2013.
Berlin, Brent. 1992. Ethnobiological classification: Principles of categorization of plants and animals in traditional societies. Princeton: Princeton University Press.
Berlin, Brent & Paul Kay. 1969. Basic color terms: Their universality and evolution. Berkeley: University of California Press.
Brown, Cecil H. 1976. General principles of human anatomical partonomy and speculations on the growth of partonomic nomenclature. American Ethnologist 3. 400–424.
Burenhult, Niclas (ed.). 2008a. Language and landscape: Geographical ontology in cross-linguistic perspective (Special Issue). Language Sciences 30(2–3).
Burenhult, Niclas. 2008b. Streams of words: Hydrological lexicon in Jahai. Language Sciences 30(2–3). 182–199.
Burenhult, Niclas. 2013a. Categories of arboreal locomotion in Jahai (Austroasiatic, Malay Peninsula). Unpublished report, 4 pp.
Burenhult, Niclas. 2013b. Domain-driven documentation: The case of landscape. Presentation at the 3rd International Conference on Language Documentation and Conservation, Honolulu, February 28 – March 3, 2013.
Burenhult, Niclas (ed.). (in progress a). Place names: The cross-linguistic perspective.
Burenhult, Niclas. (in progress b). Biosystematics revisited: The Jahai foraging semplate. Ms.
Burenhult, Niclas & Stephen C. Levinson. 2008. Language and landscape: A cross-linguistic perspective. Language Sciences 30. 135–150.
Burenhult, Niclas & Asifa Majid. 2011. Olfaction in Aslian ideology and language. Senses & Society 6(1). 19–29.
Burenhult, Niclas, Ross S. Purves & Love Eriksen. (in progress). The spatial properties of forager motion categories: Evidence from Jahai (Malay Peninsula). Ms.
Burgess, Neil, Kathryn J. Jeffery & John O’Keefe. 1999. The hippocampal and parietal foundations of spatial cognition. Oxford: Oxford University Press.
Derungs, Curdin, Flurina Wartmann, Ross S. Purves & David M. Mark. 2013. The meanings of the generic parts of toponyms: Use and limitations of gazetteers in studies of landscape terms. In Thora Tenbrink, John Stell, Anthony Galton & Zena Wood (eds.), Spatial Information Theory, 11th International Conference, COSIT 2013 [Conference on Spatial Information Theory], Scarborough, UK, September 2–6, 2013, Proceedings (Springer Lecture Notes in Computer Science vol. 8116). 261–278. Cham, Switzerland: Springer International Publishing.
Evans, Nicholas. 2009. Dying words: Endangered languages and what they have to tell us. Malden, MA: Wiley-Blackwell.


Evans, Nicholas, Alice Gaby, Stephen C. Levinson & Asifa Majid (eds.). 2011. Reciprocals and semantic typology. Amsterdam: Benjamins.
Gawne, Lauren & Hiram Ring. 2016. Mapmaking for language documentation and description. Language Documentation & Conservation 10. 188–242.
Gippert, Jost, Nikolaus P. Himmelmann & Ulrike Mosel (eds.). 2006. Essentials of language documentation. Berlin: Mouton de Gruyter.
Hägerhäll, Caroline, Åsa Sang, Felix Ahlner, Juliette Huber, Konrad Rybka & Niclas Burenhult. (in progress). Cross-cultural variation in landscape preference. Ms.
Henrich, Joseph, Steven J. Heine & Ara Norenzayan. 2010. The weirdest people in the world? Behavioral and Brain Sciences 33(2–3).
Hirsch, Eric & Michael O’Hanlon (eds.). 1995. The anthropology of landscape: Perspectives on place and space. Oxford: Clarendon Press.
Himmelmann, Nikolaus P. 1998. Documentary and descriptive linguistics. Linguistics 36. 161–195.
Holton, Gary. 2012. Language archives: They’re not just for linguists any more. In Frank Seifart, Geoffrey Haig, Nikolaus P. Himmelmann, Dagmar Jung, Anna Margetts & Paul Trilsbeek (eds.), Potentials of language documentation: Methods, analyses, and utilization (LD&C Special Publication no. 3), 111–117. Honolulu: University of Hawai’i Press.
Huber, Juliette. 2013. Landscape in East Timor Papuan. Language Sciences 41. 175–196.
Huber, Juliette. 2017 (in press). Natural locations and the distinction between ‘what’ and ‘where’ concepts: Evidence from differential locative marking in Makalero. Linguistics.
Janzen, Gabriele & Miranda van Turennout. 2004. Selective neural representation of objects relevant for navigation. Nature Neuroscience 7. 673–677.
Kay, Paul, Brent Berlin, Luisa Maffi, William R. Merrifield & Richard Cook. 2010. The world color survey. Stanford: CSLI Publications.
Levinson, Stephen C. 2003. Space in language and cognition. Cambridge: Cambridge University Press.
Levinson, Stephen C. & Niclas Burenhult. 2009. Semplates: A new concept in lexical semantics? Language 85. 153–174.
Levinson, Stephen C. & Asifa Majid. 2014. Differential ineffability and the senses. Mind & Language 29. 407–427.
Levinson, Stephen C. & David Wilkins (eds.). 2006. Grammars of space. Cambridge: Cambridge University Press.
Majid, Asifa & Melissa Bowerman (eds.). 2007. Cutting and breaking events: A crosslinguistic perspective. Cognitive Linguistics 18(2). 133–152.
Majid, Asifa & Niclas Burenhult. 2014. Odors are expressible in language, as long as you speak the right language. Cognition 130(2). 266–270.
Majid, Asifa, Nick Enfield & Miriam van Staden (eds.). 2006. Parts of the body: Cross-linguistic categorisation. Language Sciences 28(2–3).
Mark, David M., Barry Smith & Barbara Tversky. 1999. Ontology and geographic objects: An empirical study of cognitive categorization. In Christian Freksa & David M. Mark (eds.), Spatial Information Theory: Cognitive and Computational Foundations of Geographic Information Science, COSIT 1999 [Conference on Spatial Information Theory], Stade, Germany, August 25–29, 1999, Proceedings (Springer Lecture Notes in Computer Science vol. 1661). 283–298. Berlin: Springer-Verlag.
Mark, David M. & Andrew G. Turk. 2003. Landscape categories in Yindjibarndi: Ontology, environment, and language. In Werner Kuhn, Michael F. Worboys & Sabine Timpf (eds.), Spatial Information Theory: Foundations of Geographic Information Science, International Conference, COSIT 2003 [Conference on Spatial Information Theory], Ittingen, Switzerland, September 24–28, 2003, Proceedings (Springer Lecture Notes in Computer Science vol. 2825). 31–49. Berlin: Springer-Verlag.
Mark, David M., Andrew G. Turk & David Stea. 2007. Progress on Yindjibarndi ethnophysiography. In Stephan Winter, Matt Duckham, Lars Kulik & Ben Kuipers (eds.), Spatial Information Theory, 8th International Conference, COSIT 2007 [Conference on Spatial Information Theory], Melbourne, Australia, September 19–23, 2007, Proceedings (Springer Lecture Notes in Computer Science vol. 4736). 1–19. Berlin: Springer.
Mark, David M., Andrew G. Turk, Niclas Burenhult & David Stea (eds.). 2011a. Landscape in language: Transdisciplinary perspectives. Amsterdam: Benjamins.
Mark, David M., Andrew G. Turk, Niclas Burenhult & David Stea. 2011b. Landscape in language: An introduction. In David M. Mark, Andrew G. Turk, Niclas Burenhult & David Stea (eds.), Landscape in language: Transdisciplinary perspectives, 1–24. Amsterdam: Benjamins.
Narasimhan, Bhuvana & Anetta Kopecka (eds.). 2012. Events of “putting” and “taking”: A crosslinguistic perspective. Amsterdam: Benjamins.
O’Meara, Carolyn & Jürgen Bohnemeyer. 2008. Complex landscape terms in Seri. Language Sciences 30(2–3). 316–339.
Rybka, Konrad. 2014. How are nouns categorized as denoting “what” and “where”? Language Sciences 45. 28–43.
Rybka, Konrad. 2016. The linguistic encoding of landscape in Lokono. LOT 417. Amsterdam: University of Amsterdam PhD thesis.
Smith, Barry & David M. Mark. 2001. Geographical categories: An ontological investigation. International Journal of Geographical Information Science 15(7). 591–612.
Thieberger, Nicholas (ed.). 2012. The Oxford handbook of linguistic fieldwork. Oxford: Oxford University Press.
Tilley, Christopher. 1994. A phenomenology of landscape: Places, paths and monuments. Oxford: Berg.
Turk, Andrew G., David M. Mark, David Stea & Carolyn O’Meara. 2012. Geography – Understanding how to identify landforms, other landscape features and their uses. In Nicholas Thieberger (ed.), The Oxford handbook of linguistic fieldwork, 368–391. Oxford: Oxford University Press.
Wartmann, Flurina & Ross S. Purves. 2014. Why landscape terms matter for mapping: A comparison of ethnogeographic categories and scientific classification. In Kathleen Stewart, Edzer Pebesma, Gerhard Navratil, Paolo Fogliaroni & Matt Duckham (eds.), Extended Abstract Proceedings of the GIScience 2014, Eighth International Conference on Geographic Information Science, Vienna, September 23–26, 2014, 192–194. Vienna: Vienna University of Technology.
Wnuk, Ewelina. 2016. Semantic specificity of perception verbs in Maniq. Nijmegen: Max Planck Institute for Psycholinguistics PhD thesis.
Wnuk, Ewelina & Asifa Majid. 2014. Revisiting the limits of language: The odor lexicon of Maniq. Cognition 131(1). 125–138.

Language Documentation & Conservation Special Publication No. 21 (October 2020)
Interdisciplinary Approaches to Language Documentation, ed. by Susan D. Penfield, pp. 23-42
http://nflrc.hawaii.edu/ldc/
http://hdl.handle.net/10125/24942

3

Child Language Documentation: A Pilot Project in Papua New Guinea

Birgit Hellwig
Department of Linguistics, University of Cologne

Child language documentation: A pilot project in Papua New Guinea.9

Abstract

The central aim of language documentation is to comprehensively document the characteristic speech practices of a community. Such practices necessarily also include child language and child-directed speech, and yet there are only very few documentation projects that focus on language from and with children. This paper argues for studying first language acquisition and socialization within a language documentation context, focusing on the types of data needed for such a study and drawing on the insights from a pilot project among the Qaqet of Papua New Guinea. The aim of this pilot project was to investigate the feasibility of a comprehensive child language documentation project, and this paper discusses the central challenges to such an endeavour and shows how they were addressed in the project.

1. Introduction

Language documentation is famously characterized as aiming to create a “comprehensive record of the linguistic practices characteristic of a given speech community” (Himmelmann 1998: 166). Over the past two decades, there have been considerable advances in addressing the challenges of identifying and recording characteristic practices, of exploring ways of making the recordings accessible for multiple purposes and audiences, and of establishing a place for language documentation within descriptive and theoretical linguistics. An important aspect of the overall program is the notion of comprehensiveness (see Foley 2003; Franchetto 2006; Himmelmann 1998, 2006; Lehmann 2001; Lüpke 2010; Seifart 2008). In an ideal world, projects would aim for a comprehensive coverage of linguistic practices – an aim that is necessarily limited by practical constraints of time, personnel, funding, and opportunities. A cursory survey of projects conducted under the umbrella of the major funding agencies of DoBeS (Dokumentation bedrohter Sprachen), ELDP (Endangered Languages Documentation Programme), and DEL (Documenting Endangered Languages) testifies to an impressive variety: some projects aim for comprehensiveness, while others explicitly limit this undertaking in various

9 Many thanks to Susan Penfield and the National Science Foundation (Grant 1246525) for organizing the speaker’s series at the 3rd International Conference on Language Documentation and Conservation (ICLDC 2013) where the ideas for this paper were first presented. I warmly thank the audience as well as two anonymous reviewers for their interesting and challenging questions.

Creative Commons Attribution Non-Commercial No Derivatives Licence

ways and focus on specific practices; there is also considerable variety in the chosen approaches. But the documented practices have one aspect in common: they are almost always practices of the adult language. Documentation projects that make reference to child language are conspicuously absent: out of roughly 500 projects funded by the above three agencies, only 5 are explicitly concerned with documenting child language, and another handful mention child language amongst the practices to be documented. With one exception (Stoll & Lieven n.d.), these projects are all pilot projects or smaller projects that explore specific aspects of child language or language transmission.

The exclusion of child language from language documentation cannot be theoretically motivated:10 characteristic linguistic practices necessarily include child language, child-directed speech, and, more generally, the processes by which children are being socialized in a given community. Assuming that there are no theoretical reasons for excluding child language, the question arises as to why there are so few instances of child language documentation. One obvious reason for its absence is, of course, that some languages are no longer acquired by children. But this cannot be the whole story: there are large numbers of minority languages that are being acquired and for which processes of language transmission could be documented. It is more likely that its absence is due to various practical and methodological reasons, and this paper sets out to explore some of them by means of a case study.

The paper is structured as follows: §2 sets the scene by identifying challenges to child language documentation, focusing on the issue of data types and arguing for why we should nevertheless make an effort to address the challenges (see also Kelly & Nordlinger 2013 for a comparable discussion).
It is acknowledged that this section only touches on these issues and that each of them could – and should – be expanded into a full paper in its own right. §3 takes up some of the challenges of §2, and describes in detail how they were addressed in a pilot project on child language documentation in Papua New Guinea. §4 then summarizes and concludes this paper.

2. Challenges to child language documentation: Data types

Child language documentation is anchored both in anthropology (i.e., language socialization research) and psycholinguistics (i.e., language acquisition research), drawing on three distinct data types: anthropological, experimental, and longitudinal data. Documenters are most familiar with the first data type, and there is a long-standing tradition of including anthropological components within language documentation projects (e.g., Franchetto 2006). But we are less familiar with the other two data types – both of which play a central role in language acquisition research. Experimental methods (see especially Blom & Unsworth 2010; Eisenbeiß 2010; Kidd 2006) allow researchers to systematically explore the extent and limits of a child’s production and comprehension, and to understand the range of variation amongst a larger number of children. For a successful experiment, appropriate contexts and test items need to be identified and created, i.e., this presupposes a good understanding of the adult or target language. For well-documented languages, researchers can draw on numerous psycholinguistic resources: on lexical databases that

10 There are parallel debates about the place of children in anthropological research (Hirschfeld 2002). I thank an anonymous referee for pointing out this debate to me.

give information on formal and semantic properties of words and their frequencies, or on standardized guidelines for assessing the communicative development of children. There may also be an understanding of the temporal organization of language development, i.e., of which types of non-target-like utterances tend to appear at which ages. And this knowledge, in turn, makes it possible to identify appropriate age ranges for a given experiment. Such resources and knowledge are not available for under-documented languages, and we thus need to develop them in parallel to studying child language. Arguably the most important such resource is a longitudinal corpus where the same children are followed over a longer period and are recorded at regular intervals (Demuth 1996; Tomasello & Stahl 2004). This data type, again, comes with its own challenges: ensuring that recordings take place at regular intervals over at least one year, that the recordings capture natural settings and are representative of the children’s typical day-to-day activities, and that the masses of resulting data are transcribed and further processed (see Ochs 1979 for an excellent discussion of issues in transcribing child language). While language documentation has gained considerable methodological expertise in collecting and analyzing different data types, we have very little expertise in the two data types most relevant for language acquisition research: experimental and longitudinal data.

Taking a slightly different perspective, we can approach the issue of data types from the perspective of community involvement. Ideally, the community plays an active role in all language documentation research, contributing actively to the shaping of the project goals and research methods.
There are very good reasons for this approach (Amery 1995; Czaykowska-Higgins 2009; Dobrin 2005; Grenoble & Whaley 2006; Grinevald 2003; Hinton & Hale 2001; Mosel 2006; Rice 2011; Wilkins 1992, 2000), but it can work against the inclusion of child language. Understandably, the speech community tends to be concerned about the disappearance of prestigious genres and specialized knowledge, rather than with the documentation of mundane everyday language, e.g., child language. In the face of scarce resources, it is thus not immediately obvious how to motivate a focus on child language. Similarly, the community tends to be interested in natural settings, reflecting their own invaluable communicative practices – something that cannot easily be reconciled with experimental setups that presuppose outside agency. These views tend to be mirrored by documenters who are often more comfortable with anthropological approaches to language socialization (Gaskins 1999; Kulick 1992; Ochs 1982; Schieffelin & Ochs 1986) and who remain cautious as to the applicability of experiments in field situations (e.g., the need to run experiments with a large number of speakers – something that is not always possible in the context of endangered languages). While there is no question that anthropological methods are indispensable to investigating socio-cultural factors and their impact on children’s language development, they cannot entirely replace experimental methods. Within language documentation, there are already provisions for including staged data: to supplement natural data (e.g., in order to document events that cannot easily be documented in a natural setting) or to better interpret natural data (e.g., to generate negative evidence and test the limits of productivity). Our normal types of staged data tend to be fairly unconstrained: speakers are presented with a stimulus, and they can talk naturally within the context created by the stimulus (Lüpke 2010).
Experiments, by contrast, are typically more constrained, and they put the participants in the awkward role of passively reacting to a stimulus. The situation is further complicated by the nature of experimental tasks: in order to provide

meaningful results, they have to be challenging, i.e., they have to be just difficult enough for children to produce non-target-like utterances. In the worst case scenario, they create a feeling in the community that children are being set up to produce errors. An experimental approach is thus not easy to reconcile with the ideals of investing agency in the speakers of endangered languages and of fostering community engagement. More generally, this issue ties in with the overall question of fieldwork ethics: much effort has been directed towards issues of ethics and responsible fieldwork (Dwyer 2006; Musgrave & Thieberger 2006; Rice 2006), but we have little experience when it comes to working with children and to appraising the ethical consequences of experimental methods.

Supposing we have overcome the challenges of data collection, we are then faced with interpretative challenges. An important consideration in this respect is our limited knowledge of the target language. We focus on under-documented languages, and we usually do not have a good (let alone comprehensive) understanding of the adult language. Moreover, it is not self-evident that the adult language is actually the target language: we often work with endangered languages, i.e., with languages that are spoken in multilingual settings and whose speakers are potentially shifting to other languages. Both these circumstances pose challenges for interpreting child data: to be able to identify non-target-like utterances, and to distinguish between developmental effects (where a child has not yet acquired the target) and the effects of language shift (where a child is shifting towards a different target) (see McConvell & Meakins 2005; Meakins & Wigglesworth 2013; and O’Shannessy 2008, 2012 on the emergence of mixed languages in such contexts).
In the light of such challenges, it is not surprising that we know very little about the transmission and acquisition of small and endangered languages. In the CHILDES database (MacWhinney 2000), the vast majority of corpora are of Indo-European languages (58% of monolingual corpora; 83% of bilingual corpora). Conversely, there are estimates that language acquisition data exist for only 1-2% of the world’s languages (Lieven & Stoll 2009: 144).

There is a need to remedy this situation. From the perspective of language acquisition, cross-linguistic corpora are urgently needed to test claims about universals of child language and child-directed speech. The current sample is biased towards Indo-European languages, i.e., towards languages that are related, that are typologically similar, and that are acquired in similar contexts (of monolingualism or family bilingualism). From this perspective (see Bates & MacWhinney 1989; Bavin 1995; Bowerman 2010; Eisenbeiß 2005; Slobin 1982, 1985–1997; Stoll 2009), there is thus considerable interest in expanding the current sample. This interest is reflected in a more general concern that the exclusion of the majority of the world’s languages, peoples, and societies from consideration is untenable (Evans & Levinson 2009; Henrich et al. 2010). From the perspective of language documentation, child language and child-directed speech undoubtedly constitute characteristic linguistic practices (see §1) and thus need to be part of any comprehensive documentation. And finally from the perspective of language endangerment, we can hope to gain new insights into the causes of language endangerment and death. It is often argued (Fishman 1991; Grenoble & Whaley 2006) that a central cause is the interrupted transmission of language to the next generation, but we have little information on the processes of how languages are transmitted (or not transmitted) in such communities.
An excellent study in this respect is Kulick (1992) who studied language socialization among the multilingual Gapun in Papua New Guinea, identifying reasons for their beginning

shift to the lingua franca Tok Pisin. From all three perspectives, there is thus an interest and a need to make the considerable effort to address the practical challenges posed by the documentation of child language.

3. Towards a child language documentation project

This section reports in detail on a pilot project on documenting child language among the Qaqet in Papua New Guinea. The intention of this pilot project was to investigate the feasibility of, and prepare for, a comprehensive child language documentation project. Given the methodological challenges outlined in §2, we considered it infeasible to start with either experimental or longitudinal studies. Instead we took the opportunity to discuss and engage community involvement, gain first anthropological insights into aspects of language socialization and typical children’s activities (in preparation for future detailed research into language socialization), explore the collection of staged data (in preparation for future experiments), and try out possible recording scenarios (with a view to identifying natural contexts in preparation for future longitudinal studies).11

This project is set in the Qaqet communities of Raunsepna and Lamarain, located in the remote interior mountains of the Gazelle Peninsula (East New Britain Province, Papua New Guinea). Qaqet [ISO 639-3: byx] belongs to the Baining languages, a group of six related non-Austronesian languages, and it is spoken by an estimated 10,000 speakers. We have access to older descriptions of Qaqet (Bley 1914; Parker & Parker 1974, 1977; Rascher 1900, 1904) as well as to anthropological and historical sources (Dickhardt 2008, 2009, 2012; Fajans 1983, 1985, 1993, 1997, 1998; Hesse & Aerts 1996; Hesse 2007; Hiery 2007; Laufer 1946/1949, 1959; G. Pool 2015; J. Pool 1984; Rohatynskyj 2000, 2001). In more recent times, a substantial body of descriptive work has been published on the related Baining language Mali [gcc] (see especially Stebbins 2011), but there is only a little information available on the other Baining languages. One of them, Makolkol [zmh], is probably already extinct.
Most Baining communities are multilingual, and people additionally speak the national lingua franca Tok Pisin, and often also a neighboring language (especially the Oceanic language Kuanua). In remote areas, the Baining languages are still strong, but in more accessible areas, Tok Pisin and other languages are becoming dominant and are starting to replace the Baining languages.

In 2011, I started researching the adult Qaqet language in Raunsepna. In the course of this research, the idea of documenting child language arose. This idea was triggered partly by my own developing research interests, and partly by the community’s interest in education and literacy. We developed a pilot project together in order to test the feasibility and scope of such a project, and we were joined by two researchers with complementary interests and expertise: Evan Kidd (language acquisition) and Alexandra Marley (an MA student in sociolinguistics). This section outlines some of the key

11 This pilot project would never have been possible without the generosity, hospitality, and enthusiasm of the people of Raunsepna and Lamarain, both adults and children. My thanks go to the entire community, and especially to John Landi, Tony Alin, Bruno Lalem, and their families for all their help throughout the project. The support of ELDP (Endangered Languages Documentation Programme, Grant SG110) for funding this pilot project and of the ARC (Australian Research Council, Grant FT0991412) for funding initial research on the adult language is gratefully acknowledged, and so is the endorsement of the National Research Institute of Papua New Guinea. This pilot project has since resulted in a comprehensive child language documentation project, and I thank the Volkswagen Foundation for their generous support of, and trust in, this project.

activities from the pilot phase: it describes views from the community (§3.1) and it outlines two case studies on narrative development (§3.2) and children’s activities (§3.3).

3.1 Views from the community

There have been many discussions (both informal and recorded) within the community and between the community and the research team about the motivations for setting up a child language documentation project. The two most prominent views are illustrated in the quotes below. First, the view that young children do not talk very much (in 1). And second, the view that children do not talk correctly (in 2), especially if they are born into interethnic marriages (in 3).

(1) Ee, kuasiq ai, luqa ip, [x] nani nyit, prama siitka amaigulka saa qutka. […] Nyitaqen praum ip taquarl, ip katika dip kadrlem iaum, taqurla.12
    ‘Yes, there is no way that you can go with a long story to a baby. […] You will say a short (word) just like (the baby), because he will only know the short (word), that’s all.’
    [I12ABLAJLATASocio2 581.635 591.890]

(2) Deiv iani ngerenarli angama rluimini ngamraqen taqurla, de maika dip kuasiq ini ngereseserl vriana rluimini
    ‘And when someone hears a little child talk like this (incorrectly), they will not always correct their little child.’
    [I12AANACLADNSocio2 1220.160 1224.640]

(3) Kuasiq i rataqa tdadem. [...] Amanepna. [...] Bequrini itika quasiq ikias ka uranaik.
    ‘They do not follow (Qaqet) straight. [...] (Because) they are mixed. [...] I assume it’s because we are not amongst ourselves.’
    [I12ABLAJLATASocio3 92.012 150.699]

Such views were often followed by veiled skepticism: since children do not speak very much, and since they do not speak correctly, it would be better to focus attention on the adult language. Such skepticism, however, was countered by the equally prominent view that efforts needed to be made to strengthen the children’s Qaqet skills. This included, on the one hand, the speaking skills of children of interethnic marriages, as there is a strong sense that such children do not speak Qaqet well.13 And it included, on the other hand, the literacy skills of all children. The national language policy at that time (Devette-Chee 2012; Litteral 2004; Siegel 1996) envisaged that children should be taught in either a vernacular language or in Tok Pisin in the first two years of elementary school (Prep and Year 1), and should then transition to English in the last year (Year 2). Following that, English should become the language of instruction throughout primary school (Year 3 to Year 8). In Raunsepna, too, teachers attempt to teach literacy in the vernacular language, but they are faced with numerous seemingly unsurmountable difficulties. The Qaqet are a marginalized group, and the school is under-resourced in every respect, including a lack

12 All examples are presented in the Qaqet orthography and followed by an English free translation; they do not contain any interlinear glossing. The translation is followed by a bracket containing an identifier for the example (file name and time index), linking to the Qaqet corpus archived with the Endangered Languages Archive (ELAR) (Hellwig 2010–2013).
13 This sense was partly confirmed by the sociolinguistic part of our pilot project (see Marley 2013 for details). While Qaqet is still strong in this remote community, there are certain demographic factors that trigger shifts away from Qaqet and towards the lingua franca Tok Pisin, including growing up in an interethnic marriage. Such marriages are becoming more common, thus supporting the community’s view that Qaqet is losing ground under such circumstances.

of Qaqet literacy materials. The general level of literacy is low, and children from this community do not perform well on the national exams. The community attributes these facts to the early school years: the challenges of teaching literacy in the absence of any Qaqet materials (let alone age-appropriate materials) as well as the absence of any materials that would help students transition from Qaqet to English. At the moment, it is entirely up to the teachers to create their own materials for teaching literacy and transitioning to English.

Against this background, the community expressed an interest in studying child language with the explicit goal of developing age-appropriate materials to be used in the elementary school (to teach Qaqet literacy and to transition to Tok Pisin and English literacy). We therefore set up the project in close co-operation with the teachers, especially with Betty Dangas and Chris Mitparlingi, and with additional support from Patrick Lemingel and Paul Liosi. The scope of the project was shaped in discussions with these teachers, necessitating numerous modifications of our earlier views on what is possible and desirable. At the same time, the teachers acted as intermediaries to the community, and discussed the project with the children and parents. In the end, we were able to jointly identify a number of acceptable recording contexts, and the teachers took care of the logistics: organizing the recordings, acting as interlocutors for the children, and helping with the transcriptions and translations. We also updated the community on the progress of the project in the form of regular announcements after Sunday church. And parents and relatives were invited to view the recordings of their children at any time. Many of them took this opportunity, and these meetings often gave new impulses for further recordings of both children and adults.
In the end, we had 70 participants in the project (20 adults and 50 children), testifying to the enormous interest and involvement of the entire community.

3.2 Case study 1: Narrative development

In one study, we made use of the so-called Frog Story, i.e., the illustrated wordless picture book Frog, where are you? (Mayer 1969) that depicts the adventures of a boy, his dog, and their frog. We had 41 children (between 5 and 13 years of age) look at the book, and retell the story to an adult interlocutor, either in Qaqet (25 children) or in Tok Pisin (16 children). This part of our study was partly intended to explore possible pitfalls when collecting staged data with children (in preparation for future experimental studies) and was partly linked to the community goal of developing age-appropriate literacy materials. The idea here was to create a stock of picture books of local and new stories, to ask children to tell these stories (thereby enabling us to investigate their narrative development), and to then create a written version of these stories by writing down the expressions used by the children in their oral narratives. We trialed this approach with the Frog Story, and each participating child in the end received their own version of this story.

The Frog Story has been used successfully to investigate narrative development in many different languages, and the methods for data collection are well defined (Berman & Slobin 1994; Strömqvist & Verhoeven 2004). In particular, the adult interlocutor is asked to take as non-active a role as possible (only offering non-committal back-channelings that do not interfere with the child’s narrative), and the child is not supposed to know the story beforehand. In our pilot study, we ran into numerous challenges trying to follow this methodology. This section recounts the two most prominent such challenges: the children’s narratives reflected 1) normal classroom behavior and 2) normal story-telling

practices within the community. We assume that these challenges are not unique to our fieldwork setting, but rather characteristic of conducting such studies under fieldwork conditions.

The first challenge is directly linked to our decision to work together with the local elementary school, pursuing a collaborative research model (e.g., Czaykowska-Higgins 2009; Rice 2011; see also references in §2). As summarized in §3.1, this decision worked exceedingly well on one level. It ensured the involvement of the community as well as the usefulness of the project to the community, and it facilitated the logistics of data collection. But on a different level, it worked less well, as it created a classroom atmosphere. Both teachers and children worried that we had a hidden agenda of evaluating their performances. This worry showed most obviously in numerous instances where children hesitated, and their teachers then whispered the ‘right’ words for them to repeat, so that they would tell the story ‘correctly’. In the end, it was fairly straightforward to deal with this particular issue (through further discussions with the teachers about the purposes of the study). But the classroom atmosphere also showed in another issue: a tendency to co-construct the story. As outlined below, this issue was more complicated to deal with, since it reflected normal classroom practice.

At the beginning of this study, many children were shy and clearly uncomfortable. Added to their very understandable apprehension at performing for strangers, it is not usual in this community for children to be singled out and to answer questions individually. Instead, it is more common for them to repeat a teacher’s words or to answer questions in a chorus. The teachers were thus concerned that our normal research procedure would only increase their shyness and discomfort.
We addressed this problem by asking the children to narrate the story in pairs. Example (4) illustrates a typical story recorded with this procedure: the two children (both 9 years old) often talk at almost (but not quite) the same time, with almost (but not quite) the same words.14 That is, one child or other would take the initiative, and the other would chime in, either repeating what was said or continuing on from what was said.

(4) ZGM: iva de..                  and then..
    ZGD: iane[brlany               the two [sleep
    ZGM: [ianebrlanya              [the two sleep now
    ZGM: dequasiq ian[tliana..     and the two don’t [see their..
    ZGD: [iantliana [quldit..      [the two (don’t) see their [fr..
    ZGM: [ianaqulditki             [their frog
    ZGD: be [qiamit                and [it has left
    ZGM: [be qiamit                [and it has left
    [R12ZGDZGMFrog 19.310 25.535]

Alternatively, the children negotiated a ‘spokesperson’ (whose identity often shifted in the course of the narrative). This spokesperson would not narrate the story independently, but check with their friend, or the friend would offer help or corrections. These

14 In this excerpt, overlaps between adjacent lines are marked by “[” and boldface or underlining.

exchanges almost always took place secretly and in whispers, and were not always picked up by the microphone. Example (5) illustrates such typical interactions. The children in this example are both 10 years old.

(5) ZAL: kemnyim bit                   he looks up
    ZAL: [[whispers]] da?              [[whispers]] right?
    ZJD: [[shakes head]]               [[yes]]
    [R12ZALZJDFrog 113.645 116.805]

    ZAL: dap ma.. aadangga mara        and the.. his dog here
    ZJD: [[whispers]] John             [[whispers]] John
    ZAL: ia, John                      sorry, John (not the dog)
    [R12ZALZJDFrog 188.550 192.015]

    ZJD: Johna, qena aadangga..        John now, together with his dog..
    ZAL: [[whispers]] ianequuk nas     [[whispers]] the two swim
    ZJD: ianequuk nas                  the two swim
    [R12ZALZJDFrog 243.145 248.010]

The excerpts in (4) and (5) both exemplify children’s ways of co-constructing a story: the normal way in this community for children to react to a teacher’s question or instruction. From this perspective, our study offers excellent data. But our procedure does not conform to the recommended methods, thereby reducing the cross-linguistic comparability of our results on children’s narrative development. It is valid to ask whether (or, maybe more to the point, to what extent) established methodologies should always be followed, even if they do not seem pertinent in a given socio-cultural context (see §4). But regardless of the answer, we have to be aware that varying the methods of data collection will make it difficult to compare the results to those obtained with other methods in other languages.

Over the course of the study, children and teachers became increasingly familiar with our research, and after a few weeks, children were happy to narrate the story on their own. We tried to further facilitate this shift by recording children in their home settings with family members as interlocutors. This significantly complicated the logistics (and we only managed to record 3 children in the home setting, compared to 38 children in the school setting), but reduced the feeling of being tested in a classroom setting. Example (6) is a typical recording from such a home setting, illustrating the ease of interaction between the 6-year-old child and her mother. There are still some remaining methodological issues in that adult interlocutors sometimes took too active a role (e.g., the mother’s question nemgi iari? ‘who stays there?’ in line 4), but this cannot be avoided altogether. For example, it would be inappropriate for the mother in line 2 to not answer her child’s question. And generally, the adult interlocutors were comfortable with issuing only vague prompts such as (be) nana? ‘(then) what?’ (in lines 6 and 8).

(6) ZMS:    ademga, i qua maqasupka?                  a hole, so is it of the rat?
    mother: kuasik, auaiki                            no, (of) a bird
    ZMS:    auaiki                                    a bird
    mother: nemgi iari?                               who stays there?
    ZMS:    aququanngi                                an owl
    mother: be nana?                                  then what?
    ZMS:    kiaseserl sa..                            it (owl) startles..
    mother: nana?                                     what?
    ZMS:    kiaseserl, sama qaqeraqa, de qaat meseng  it (owl) really startles, the boy, and he falls down
    mother: [[laughs]]                                [[laughs]]
    [R12ZMSFrog 140.480 159.315]
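Each example closes with a bracketed identifier that combines a file name and a time index. As a minimal sketch (not part of the original study), such identifiers could be split into their parts as shown below. The sketch assumes that the two numbers are the start and end offsets of the excerpt in seconds; the names `ExampleRef` and `parse_example_ref` are hypothetical and serve only for illustration.

```python
import re
from typing import NamedTuple


class ExampleRef(NamedTuple):
    """One bracketed example identifier: file name plus time index."""
    file_name: str
    start: float  # assumption: offset in seconds where the excerpt begins
    end: float    # assumption: offset in seconds where the excerpt ends

    @property
    def duration(self) -> float:
        # Rounded to milliseconds, the precision used in the identifiers.
        return round(self.end - self.start, 3)


# Expected shape: "[<file name> <number> <number>]"
_REF = re.compile(r"\[(\w+)\s+(\d+(?:\.\d+)?)\s+(\d+(?:\.\d+)?)\]")


def parse_example_ref(text: str) -> ExampleRef:
    """Parse an identifier such as '[R12ZMSFrog 140.480 159.315]'."""
    match = _REF.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not an example identifier: {text!r}")
    name, start, end = match.groups()
    return ExampleRef(name, float(start), float(end))


ref = parse_example_ref("[R12ZMSFrog 140.480 159.315]")
print(ref.file_name, ref.start, ref.end, ref.duration)
```

Under the same assumption, the parsed offsets could be used to locate an excerpt within the archived recording; how the file names map onto archive entries is not specified here.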

The second major challenge is linked to normal story-telling practices within the community. We were able to find out about such practices because our pilot study included anthropological and sociolinguistic components. As illustrated by the quote in (7), stories are told by adults to children, and children hear and repeat these stories to their friends, thus allowing the stories to spread. Something similar happened to the Frog Story. It was picked up and repeated, and became a topic of conversation in the community (as illustrated by the excerpt from a conversation in (8)).

(7) Ide resiit nanget de dama renngi aris. [...] Deip maget, de sa nyitlirang ngatit. De sa irang ngeresiit, naluqa amasiitka, imedu iani, de ngerenarliqa. [...] Tika amasiitka qatit. Sa qatira, mrama.. ama.. amaburlem nara. Be radrlem luqa amasiitka.
‘They (the parents) used to tell them (stories) at night. [...] Then later you will right away see the little ones (children) go around. And they will right away tell this (same) story (to their friends), the (story) that they have just heard. [...] And so the story spreads. It now spreads to many (people). And they now know the story, too.’
[I12AANACLADNSocio3 1135.065 1180.206]

(8) Mani ngumr ama.. amarluis […] uresiit savrama qalminngi.
‘Yesterday I took the children […] and we told stories about a frog.’
[I12AANACLADNSocio3 1291.512 1295.322]

We thus cannot be sure that the children are constructing the story by themselves. In fact, there are indications that they are repeating a story that they have heard before. The best such indication comes from the ending of the Frog Story, where the boy and the dog find their frog among a whole group of frogs. Children initially ended their story by saying something along the lines of ‘they pick up their frog and they go’. But after a certain date, they typically added the following: they pointed to an empty space in the line-up of frogs, and said ‘and the space here, they took their frog from here’. On this particular day, we had recorded a grandmother telling the story to her two grandchildren. Example (9) repeats the relevant part of this recording: the grandmother ends the story, and her 10-year-old grandson points out the empty space in the line-up of frogs. The grandmother

later told me that she had not noticed this empty space, and I confirmed that I had not noticed it either. The story of this boy seeing something that the adults had not seen must have spread through the community, because after that day, many children made sure to highlight this empty space whenever they narrated the story.

(9) grandmother: deqerl ianluqia                 and the two have found it now
    YCS:         nyilara ngilkaira, namen lura!  see its (empty) space, among those ones!

We have yet to find a satisfactory way of dealing with this issue. We suspect that it will partly resolve itself through time and familiarization. As people become more familiar with this type of research, the stories will become less exciting and newsworthy. And we intend to circumvent the issue by varying the interlocutor (so that children will tell stories to different people) and by building up a pool of different stories (so that children cannot predict which stories we will ask them to narrate). This includes a number of little-known local stories: several people have volunteered such stories, and we are currently adapting them and creating picture book versions of them.

3.3 Case study 2: Children’s activities

A second study focused on natural events: we intended to record a number of younger children (between 2 and 3 years of age) in natural settings in order to develop an initial idea of typical children’s activities, and of their typical language input and output. The ultimate goal was to explore suitable recording contexts for a future longitudinal study.

The first challenge was to determine the ages of the children. In this community, age is not important, and people usually do not know their or their children’s ages. When asked, people report on two salient transition points. The first such point is the inclusion in the national vaccination program (i.e., distinguishing children under the age of 5 who are still receiving vaccinations). And the second point is the enrollment in elementary school: children are supposed to have enrolled by the age of 7, but because of the under-resourcing of the school, they are usually much older (e.g., it is not unusual for a child to be 9 or 10 years old in their first year of school). These two transition points single out groups that cover considerable age ranges (i.e., children under 5, and children of 7+), and there are no other widely recognized transitions that would further narrow down their actual ages. Estimates turned out to be very unreliable, and even those who knew the community well tended to judge children much too young. There are, of course, ways of establishing a child’s age: through personal discussions (i.e., establishing salient events that happened around the child’s birth, or establishing their age relative to that of other children) and through tracking down official records (i.e., clinics record the day of birth on the child’s health card, and churches record the day of baptism in the church register). But neither approach is straightforward, and both require considerable time and effort.
We pursued all of the above options, and, in the end, we found three children in the relevant age range (two children of age 2;11 and one of age 2;5) whose families were happy to participate in this study.

Following that, the main challenge was to find a balance between the naturalness and the quality of the recordings. We discussed typical children’s activities with the

community and the families, and it turned out that children are exceptionally mobile. Young children would usually spend their day in the care of older siblings and cousins, and accompany them in their activities – which take them into the bush and gardens to collect firewood or food, to run errands, or to play games in the bush and the rivers. The possibilities for recording are further constrained by the climate and environment, as there are torrential rains most afternoons, and most places are covered in mud and water throughout the day. All these aspects (older siblings as care-givers, mobility of children, environmental challenges) are not unusual in many parts of the world, and they make it difficult to document children in natural settings. Our approach was to equip the focal children with backpacks with cheap audio recorders and voice-activated microphones. On the plus side, this approach allowed us to capture natural interactions. On the downside, the quality of the recordings suffered, and we especially lacked contextual information (i.e., it was not always clear what the other children were saying or doing). For the future, we intend to experiment with appointing older children as research assistants, and equipping them with their own recording devices (and with the instruction to try and record group activities whenever possible). But no matter what solution we will come up with in the end, it is clear that this kind of data needs to be supplemented with higher quality naturalistic data that is recorded in less challenging settings.15 One of the promising settings turned out to be the evening meal, which is the normal time for family, for relaxation and chatting, but which also takes place in near darkness, thus interfering with the video quality. As it turned out, it was almost impossible to interpret young children’s utterances without access to the visual image.
Another promising setting was a staged setting: arranging for a number of adults to be engaged in a sedentary activity (e.g., weaving a rope or a netbag), with two or more children present. It turned out that during this setting, the children do not stray too far away from the adults, and they help out and interact with the adults and the other children. By contrast, we had much less success with a staged setting that works well in Western contexts: staged play activities, where a mother or other adult engages a child in talk while playing with toys. In the Qaqet community, this kind of activity is not normal. Adults usually do not engage children in such activities, and they are not used to drawing children out and encouraging them to speak. This is of course not unique to Qaqet culture (as shown by the ethnographic literature on language socialization, e.g., Duranti et al. 2011; Ochs 1982; Schieffelin & Ochs 1986). Example (10) is an extract from our first such recording, illustrating the helplessness and increasing despair of the mother in trying to get her child (aged 2;5) to talk and perform for us under these circumstances.

15 We have since embarked on a longitudinal study of 2- to 4-year-old children, and have had great success with two set-ups that we had not trialed in the pilot project reported here. First, we equipped parents with easy-to-use cameras (the Zoom Q4) and discussed with them the concept of ‘natural settings’. Following that, they themselves decided when to turn on the camera. This procedure resulted in a large number of recordings of children in typical everyday situations. The metadata, albeit not the actual data, can be accessed through the Language Archive Cologne (Hellwig et al. 2014–2019). And second, we have cross-checked the representativeness of the longitudinal study by means of a) participant observation and b) wiring children with audio recorders and recording them for an entire day. As a result, we know that the longitudinal study is largely representative of the children’s daily activities; and that the main gap is, indeed, a scarcity of recordings of children alone in the bush.


(10) mother: nyitaqen                             you talk
     mother: nyi, nyitaqen                        you, you talk
     mother: so, uantaqen                         now, you two talk
     YMM:    oki? [[pointing to microphone]]      don’t touch?
     mother: ai, tika nyitlak                     hey, you just play
     YMM:    [[cries; reaches for microphone]]
     mother: de qurliqia                          leave it now
     mother: de nyitaqen                          you talk
     YMM:    [[cries; plays with stones]]
     mother: nyitaqen                             you talk
     [C12YMMZJIPlay1 7.860 26.777]

The families adapted to this unfamiliar activity essentially by ignoring the game scenario. Sooner or later, natural interactions would find their way into the staged play activities (as illustrated in example (11), which happened sometime after example (10) above during the same recording session). We also discussed possible questions and strategies for adults to draw out their children, getting them to talk, and thereby reducing the pressure and anxiety of the mothers. It turned out that the question that worked best by far was the ‘where’ question (as illustrated in (12), with a child aged 2;11).

(11) YMM:    amana [[points at tea]]                this one
     mother: as kerl uilas                          it is still hot
     YMM:    amana::                                this o::ne
     mother: auilas                                 it is hot
     mother: as nguadang [[pretends to be burned]]  I’m burned
     YMM:    [[moves over to grab tea]]
     […]                                            […]
     mother: a’ai, kua nguis?                       okay, shall I blow (and cool it)?
     mother: nyisrlup                               now you drink
     [C12YMMZJIPlay1 559.262 571.479]

(12) mother: ah? papa qua?                          huh? where is daddy?
     ZGT:    papa lu qua?                           daddy is where?
     mother: nguluqa qua?                           where is he?
     ZGT:    qua?                                   where?
     mother: kuaridi i qamit kua?                   where did he go?
     ZGT:    mit agalip                             (he) went (for) groundnuts
     mother: ah?                                    huh?
     ZGT:    agalip                                 groundnuts
     mother: kamit kua?                             where did he go?
     ZGT:    galip                                  groundnuts
     mother: da?                                    really?
     ZGT:    [[shakes head]]                        [[yes]]
     [C12VARPlay 416.843 430.533]

Despite the unfamiliarity of the setting, the staged play activities proved fairly successful in the end in generating a good amount of child data and child-directed speech. But without complementary natural data, we can only guess how far these data reflect the typical language learning process and the typical language input and output. Different fieldsites will pose different challenges, but it is likely that one of the major challenges will always be to find a range of suitable recording contexts that give insights into typical children’s activities and that, at the same time, lend themselves to high quality recordings of child language and child-directed language.

4. Conclusions

Our project among the Qaqet Baining of Raunsepna set out to explore the possibilities of conducting a comprehensive child language documentation project in this community. This section now takes a step back and draws some general lessons from our experiences.

As is the case for all language documentation projects, child language documentation can only succeed in close cooperation with the community. A large part of the pilot project was spent in discussions with the community about the motivations and purposes of such a study, possible practical outcomes, and issues of methodology and logistics. In conjunction with these discussions, we experimented with various methods and ways of recording children. We consider this time well spent. The discussions shaped and modified our original ideas, thus ensuring the feasibility of the project; they also created an atmosphere of enthusiasm and trust in the community. The trialing of methods enabled us to form a realistic view on what is and is not possible. And, just as importantly, it familiarized everyone – the outside researchers, the community as a whole, the care-givers, and children – with the research procedures. The sessions became increasingly normal, and adults and children no longer saw them as intrusions into their lives. And with their growing knowledge of the research procedures, the community members became increasingly confident to participate in the overall project design, introducing modifications and new goals, and pointing out fruitless endeavors. This phase lasted for about three months, and it would be fair to characterize it as a phase of trial and error.

It was argued in §2 that child language documentation is of interest from different perspectives, including the perspectives of language acquisition, language documentation, and language endangerment. Our pilot project generated data that spoke to all three perspectives.
The corpus includes large numbers of child utterances, both target- and non-target-like, that allow for hypotheses on the acquisition trajectory of various phenomena and that point to phenomena whose acquisition seems protracted. The sociolinguistic and anthropological data give insights into the linguistic landscape and ideologies and practices of socialization that, from a practical perspective, help us identify suitable recording contexts. At the same time, the many instances of child-directed speech in the

corpus allow for initial investigations into the linguistic and anthropological properties of this register. And finally, the instances of code-switching give insights into the uses of Qaqet and Tok Pisin.

However, I have deliberately phrased these results in a careful way: the purpose of the pilot project was to investigate the feasibility of a larger child language documentation project, i.e., its focus was always on methods. From the perspective of language acquisition research, what is needed now is a longitudinal study where the same children are recorded at regular intervals over a longer period. Such a corpus will give insights into the temporal organization of children’s language development. On its basis, it will then be possible to investigate individual phenomena more rigorously in the form of experimental studies: it will allow us to formulate specific testable hypotheses, to identify appropriate contexts and ages for experiments, and to interpret the results of those experiments. We have since embarked on such a longitudinal study, and it would be fair to say that it would not have been possible without this pilot stage. We would therefore always recommend factoring in a pilot phase and allowing ample time for discussions and trialing of methods.

While the pilot project made it possible to determine the feasibility and scope of a comprehensive project, it also raised a more fundamental question. In the course of the study, we encountered numerous problems with the methods developed in a Western context. It is true that we overcame most of these difficulties in the end. However, it is not entirely clear whether it is always necessary or even desirable to go through this effort. This issue arose especially in the case study on narrative development (see §3.2).
Given that the children’s behavior reflects normal social and cultural practices (such as the practices of co-constructing stories, and of repeating known stories), the question arises as to how to deal with such practices. On the one hand, there is a need to document characteristic speech practices – in their own right, and as instances of alternative ways of socializing children into narrating stories. And, on the other hand, there is a need to follow prescribed methodologies that arose in Western contexts – methodologies that were developed for good reasons, and that make the collected data comparable cross-linguistically. Or, put differently, should we think of child language documentation as part of the psycholinguistic discipline of language acquisition (and the challenge then would be to adapt the existing methods to diverse linguistic, cultural, and geographic settings so that they yield results outside of Western contexts and laboratories)? Or is it possible to articulate a different (complementary or overlapping) agenda, and to integrate the more anthropological and community-oriented outlook of language documentation into the overall enterprise? Our study is far too limited to even attempt an answer to this question. But it highlights the need to bring data and experiences from non-Western contexts to bear on the methodological and theoretical debate. And it highlights a second factor: the need for collaboration between researchers from language documentation, language acquisition, and language socialization. In the course of our pilot project, it was an invaluable asset to be able to discuss the methodological and theoretical challenges within the research team and with outside colleagues. It enabled us to gain an understanding of the different perspectives and approaches, and to form an idea of how to collect data that are of use to all three fields.


References

Amery, Rob. 1995. It’s ours to keep and call our own: Reclamation of the Nunga languages in the Adelaide region, South Australia. International Journal of the Sociology of Language 113. 63–82.
Bates, Elizabeth & Brian MacWhinney. 1989. The crosslinguistic study of sentence processing. Cambridge: Cambridge University Press.
Bavin, Edith L. 1995. Language acquisition in crosslinguistic perspective. Annual Review of Anthropology 24. 373–396.
Berman, Ruth A. & Dan I. Slobin. 1994. Relating events in narrative. Mahwah, NJ: Erlbaum.
Bley, Bernhard. 1914. Sagen der Baininger auf Neupommern, Südsee. Anthropos 9. 196–220, 418–448.
Blom, Elma & Sharon Unsworth (eds.). 2010. Experimental methods in language acquisition research. Amsterdam: John Benjamins.
Bowerman, Melissa. 2010. Linguistic typology and first language acquisition. In Jae Jung Song (ed.), The Oxford handbook of linguistic typology, 591–617. Oxford: Oxford University Press.
Czaykowska-Higgins, Ewa. 2009. Research models, community engagement, and linguistic fieldwork: Reflections on working within Canadian indigenous communities. Language Documentation & Conservation 3(1). 15–50.
Demuth, Katherine. 1996. Collecting spontaneous production data. In Dana McDaniel, Cecile McKee & Helen Smith Cairns (eds.), Methods for assessing children’s syntax, 3–22. Cambridge: MIT Press.
Devette-Chee, Kilala. 2012. The impact of Tok Pisin and local vernaculars on children’s learning in Papua New Guinea. Language and Linguistics in Melanesia 30(2). 47–59.
Dickhardt, Michael. 2008. Die Feuertänze der Baining. In Ines de Castro, Katja Lembke & Ulrich Menter (eds.), Paradiese der Südsee. Mythos und Wirklichkeit, 98–101. Hildesheim and Mainz: Roemer- und Pelizaeus-Museum; Philipp von Zabern.
Dickhardt, Michael. 2009. Schau nur, und also wirst du dich wandeln! Eine Studie zur Kulturanthropologie der Moralität unter den Qaqet-Baining von Raunsepna, Neubritannien, Gazellehalbinsel, Papua-Neuguinea. Habilitationsschrift, Göttingen: Georg-August-Universität.
Dickhardt, Michael. 2012. Die mit den Geistern tanzen. Maskentänze, Identität und Moral unter den Qaqet-Baining (Gazellehalbinsel, Neubritannien, Papua-Neuguinea). Mitteilungen der Berliner Gesellschaft für Anthropologie, Ethnologie und Urgeschichte 33. 15–32.
Dobrin, Lise M. 2005. When our values conflict with theirs: Linguists and community empowerment in Melanesia. In Peter K. Austin (ed.), Language documentation and description, vol. 3, 42–52. London: SOAS.
Duranti, Alessandro, Elinor Ochs & Bambi Schieffelin (eds.). 2011. The handbook of language socialization. Malden, MA: Wiley-Blackwell.
Dwyer, Arienne M. 2006. Ethics and practicalities of cooperative fieldwork and analysis. In Jost Gippert, Nikolaus P. Himmelmann & Ulrike Mosel (eds.), Essentials of language documentation, 31–66. Berlin and New York: Mouton de Gruyter.
Eisenbeiß, Sonja. 2005. Documenting child language. In Peter K. Austin (ed.), Language documentation and description, vol. 3, 106–140. London: SOAS.
Eisenbeiß, Sonja. 2010. Production methods. In Elma Blom & Sharon Unsworth (eds.), Experimental methods in language acquisition research, 11–34. Amsterdam: John Benjamins.
Evans, Nicholas & Stephen C. Levinson. 2009. The myth of language universals. Behavioral and Brain Sciences 32. 429–448.
Fajans, Jane. 1983. Shame, social action, and the person among the Baining. Ethos 11(3). 166–180.
Fajans, Jane. 1985. The person in social context: The social character of Baining ‘psychology’. In Geoffrey M. White & John Kirkpatrick (eds.), Person, self, and experience. Exploring Pacific ethnopsychologies, 367–397. Berkeley: University of California Press.
Fajans, Jane. 1993. The alimentary structures of kinship: Food and exchange among the Baining of Papua New Guinea. In Jane Fajans (ed.), Exchanging products: Producing exchange, 59–75. Sydney: Sydney University Press.
Fajans, Jane. 1997. They make themselves: Work and play among the Baining of Papua New Guinea. Chicago: The University of Chicago Press.
Fajans, Jane. 1998. Transforming nature, making culture: Why the Baining are not environmentalists. Social Analysis 42(3). 12–27.
Fishman, Joshua A. 1991. Reversing language shift: Theoretical and empirical foundations of assistance to threatened languages. Clevedon: Multilingual Matters.
Foley, William A. 2003. Genre, register and language documentation in literate and preliterate communities. In Peter K. Austin (ed.), Language documentation and description, vol. 1, 85–98. London: SOAS.
Franchetto, Bruna. 2006. Ethnography in language documentation. In Jost Gippert, Nikolaus P. Himmelmann & Ulrike Mosel (eds.), Essentials of language documentation, 183–211. Berlin and New York: Mouton de Gruyter.
Gaskins, Suzanne. 1999. Children’s daily lives in a Mayan village: A case study of culturally constructed roles and activities. In Artin Göncü (ed.), Children’s engagement in the world: Sociocultural perspectives, 25–61. Cambridge: Cambridge University Press.
Grenoble, Leonore A. & Lindsay J. Whaley. 2006. Saving languages: An introduction to language revitalization. New York: Cambridge University Press.
Grinevald, Colette. 2003. Speakers and documentation of endangered languages. In Peter K. Austin (ed.), Language documentation and description, vol. 1, 52–72. London: SOAS.
Hellwig, Birgit. 2010–2013. Qaqet corpus. London: Endangered Languages Archive. https://elar.soas.ac.uk/Collection/MPI188145.
Hellwig, Birgit, Carmen Dawuda, Henrike Frye & Steffen Reetz. 2014–2019. Qaqet Child Corpus. Cologne: Language Archive Cologne. http://hdl.handle.net/11341/00-0000-0000-0000-202A-0@view.
Henrich, Joseph, Steven J. Heine & Ara Norenzayan. 2010. The weirdest people in the world? Behavioral and Brain Sciences 33. 61–83.
Hesse, Karl. 2007. A Jos! Die Welt, in der die Chachet-Baininger leben. Sagen, Glaube und Tänze von der Gazelle-Halbinsel Papua-Neuguineas. Wiesbaden: Harrassowitz Verlag.

Interdisciplinary Approaches to Language Documentation 40

Hesse, Karl & Theo Aerts. 1996. Baining life and lore. Port Moresby: University of Papua New Guinea Press. Hiery, Hermann Joseph. 2007. Die Baininger. Einige historische Anmerkungen zur Einführung. In Karl Hesse, A Jos! Die Welt, in der die Chachet-Baininger leben. Sagen, Glaube und Tänze von der Gazelle-Halbinsel Papua-Neuguineas, VII–XXIX. Wiesbaden: Harrassowitz Verlag. Himmelmann, Nikolaus P. 1998. Documentary and descriptive linguistics. Linguistics 36. 161–195. Himmelmann, Nikolaus P. 2006. Language documentation: What is it and what is it good for? In Jost Gippert, Nikolaus P. Himmelmann & Ulrike Mosel (eds.), Essentials of language documentation, 1–30. Berlin and New York: Mouton de Gruyter. Hinton, Leanne & Kenneth Hale (eds.). 2001. The green book of language revitalization in practice. San Diego: Academic Press. Hirschfeld, Lawrence. 2002. Why don’t anthropologists like children? American Anthropologist 104(2). 611–627. Kelly, Barbara F. & Rachel Nordlinger. 2013. Fieldwork and first language acquisition. In Lauren Gawne & Jill Vaughn (eds.), Selected Papers from the 44th Conference of the Australian Linguistic Society, 2013, 178–192. Melbourne: University of Melbourne. Kidd, Evan. 2006. Language acquisition research methods. In Encyclopaedia of Language and Linguistics, 311–315. Oxford: Elsevier. Kulick, Don. 1992. Language shift and cultural reproduction: Socialization, self, and syn- cretism in a Papua New Guinean Village. Cambridge: Cambridge University Press. Laufer, Carl. 1946/1949. Rigenmucha, das Höchste Wesen der Baining (Neubritannien). Anthropos 41/46. 497–560. Laufer, Carl. 1959. Jugendinitiation und Sakraltänze der Baining. Anthropos 54. 905–938. Lehmann, Christian. 2001. Language documentation: A program. In Walter Bisang (ed.), Aspects of typology and universals, 84–97. Berlin: Akademie Verlag. Litteral, Robert. 2004. Vernacular education in Papua New Guinea. 
Paper commissioned for the EFA Global Monitoring Report 2005, The Quality Imperative. Lieven, Elena & Sabine Stoll. 2009. Language. In Marc H. Bornstein (ed.), The handbook of cultural developmental science, 143–160. New York and London: Psychology Press. Lüpke, Friederike. 2010. Research methods in language documentation. In Peter K. Austin (ed.), Language documentation and description, vol. 7, 55–104. London: SOAS. MacWhinney, Brian. 2000. The CHILDES Project: Tools for analyzing talk. Mahwah, NJ: Erlbaum. Marley, Alexandra. 2013. Language use amongst the Qaqet Baining: A sociolinguistic study of language choices in an ethnolinguistic minority in Papua New Guinea. Melbourne: unpublished La Trobe University MA thesis. Mayer, Mercer. 1969. Frog, where are you? New York: Dial Books for Young Readers. McConvell, Patrick & Felicity Meakins. 2005. Gurindji Kriol: A mixed language emerges from code-switching. Australian Journal of Linguistics 25(1). 9–30. Meakins, Felicity & Gillian Wigglesworth. 2013. How much input is enough? Correlating comprehension and child language input in an endangered language. Journal of Multilingual and Multicultural Development 34(2). 171–188. Mosel, Ulrike. 2006. Fieldwork and community language work. In Jost Gippert, Nikolaus

Interdisciplinary Approaches to Language Documentation 41

P. Himmelmann & Ulrike Mosel (eds.), Essentials of language documentation, 67–85. Berlin and New York: Mouton de Gruyter. Musgrave, Simon & Nicholas Thieberger. 2006. Ethical challenges in documentary linguistics. In Keith Allan (ed.), Selected Papers from the 2005 Conference of the Australian Linguistic Society. http://www.als.asn.au. Ochs, Elinor. 1979. Transcription as theory. In Elinor Ochs & Bambi B. Schieffelin (eds.), Developmental pragmatics, 43–72. New York: Academic Press. Ochs, Elinor. 1982. Talking to children in Western Samoa. Language in Society 11. 77–104. O’Shannessy, Carmel. 2008. Children’s production of their heritage language and a new mixed language. In Jane Simpson & Gillian Wigglesworth (eds.), Children’s language and multilingualism, 261–282. London and New York: Continuum International Press. O’Shannessy, Carmel. 2012. The role of code-switched input to children in the origin of a new mixed language. Linguistics 50(2). 305–340. Parker, Jim & Diana Parker. 1974. A tentative phonology of Baining (Kakat dialect). In Richard Loving (ed.), Phonologies of four Papua New Guinea languages, 5–43. Ukarumpa: Summer Institute of Linguistics. Parker, Jim & Diana Parker. 1977. Baining grammar essentials. Ukarumpa: Summer Institute of Linguistics. Unpublished manuscript. Pool, Gail. 2015. Lost among the Baining: Adventure, marriage and other fieldwork. Columbia: University of Missouri Press. Pool, Jeremy. 1984. Objet insaisissable ou anthropologie sans objet? Field Research among the Northern Baining, New Britain (1969–1970). Journal de la Société des Océanistes 40(79). 219–233. Rascher, Matthäus. 1900. Versuch zu einer Grammatik des Bainingischen. Unpublished manuscript. Rascher, Matthäus. 1904. Grundregeln der Bainingsprache. Mitteilungen des Seminars für Orientalische Sprachen zu Berlin 7(1). 31–85. Rice, Keren. 2006. Ethical issues in linguistic fieldwork: An overview. Journal of Academic Ethics 4(1–4). 123–155. Rice, Keren. 2011. 
Documentary linguistics and community relations. Language Documentation & Conservation 5. 187–207. Rohatynskyj, Marta A. 2000. The enigmatic Baining: The breaking of an ethnographer’s heart. In Sjoerd R. Jaarsma & Marta A. Rohatynskyj (eds.), Ethnographic artifacts: Challenges to a reflexive anthropology, 174–194. Honolulu: University of Hawai’i Press. Rohatynskyj, Marta A. 2001. On knowing the Baining and other minor ethnic groups of East New Britain. Social Analysis 45(2). 23–40. Schieffelin, Bambi & Eleanor Ochs (eds.) 1986. Language socialization across cultures. Cambridge: Cambridge University Press. Seifart, Frank. 2008. On the representativeness of language documentations. In Peter K. Austin (ed.), Language documentation and description, vol. 5, 60–76. London: SOAS. Siegel, Jeff. 1996. Vernacular education in the South Pacific. Canberra: Australian Agency for International Development.

Interdisciplinary Approaches to Language Documentation 42

Slobin, Dan I. 1982. Universal and particular in the acquisition of language. In Lila Gleitman & Eric Wanner (eds.), Language acquisition: The state of the art, 128–170. Cambridge: Cambridge University Press. Slobin, Dan I. 1985–1997. The crosslinguistic study of language acquisition, vol. 1–4. Mahwah, NJ: Erlbaum. Stoll, Sabine. 2009. Crosslinguistic approaches to language acquisition. In Edith L. Bavin (ed.), The handbook of child language, 89–104. Cambridge: Cambridge University Press. Stoll, Sabine & Elena Lieven. n.d. Child language (within the Chintang and Puma docu- mentation project). Nijmegen: The Language Archive. https://hdl.handle.net/1839/00- 0000-0000-0005-6F49-B@view. Stebbins, Tonya N. 2011. Mali (Baining) grammar. A language of the East New Britain Province, Papua New Guinea. Canberra: Pacific Linguistics. Strömqvist, Sven & Ludo Verhoeven (eds.). 2004. Relating events in narrative, vol. 2. Mahwah, NJ: Erlbaum. Tomasello, Michael & Daniel Stahl. 2004. Sampling children’s spontaneous speech: How much is enough? Journal of Child Language 31(1). 101–121. Wilkins, David. 1992. Linguistic research under aboriginal control: A personal account of fieldwork in central Australia.Australian Journal of Linguistics 12(1). 171–200. Wilkins, David. 2000. Even with the best of intentions: Some pitfalls in the fight for linguistic and cultural survival. In Francisco Queixales and Odile Renault-Lescure (eds.), As linguas amazônicas hoje / Les langues d’Amazonie aujourd’hui. Sao Paolo and Paris: Instituto Ambiental and IRD.

Language Documentation & Conservation Special Publication No. 21 (October 2020)
Interdisciplinary Approaches to Language Documentation, ed. by Susan D. Penfield, pp. 43–71
http://nflrc.hawaii.edu/ldc/
http://hdl.handle.net/10125/24941

4

Interdisciplinarity in Areal Documentation: Experiences from Lower Fungom, Cameroon

Jeff Good University at Buffalo

Abstract

The Lower Fungom region of Northwest Cameroon is noteworthy for its exceptional linguistic diversity. Seven languages are spoken in its thirteen recognized villages within an area about the size of the city of Amsterdam. This situation raises the question: What factors have allowed Lower Fungom to develop and maintain its extreme linguistic diversity? This paper considers the ways in which the standard documentary linguistic toolkit has been augmented by an interdisciplinary approach to studying the region, allowing for the creation of a documentary record which both covers the synchronic features of the target languages and offers sufficient ethnographic and historical context to allow us to model the region’s language dynamics. In addition to outlining key results of this interdisciplinary research, the paper provides concrete recommendations for linguists interested in engaging in similar kinds of work.

Keywords: language documentation, interdisciplinarity, Cameroon, linguistic diversity

Creative Commons Attribution Non-Commercial No Derivatives Licence

1. Challenging interdisciplinarity in language documentation

The rise of the documentary paradigm as an approach to the study of endangered languages has been, at least rhetorically, associated with an emphasis on the value of interdisciplinary collaboration as a means to come to a fuller understanding of the diverse linguistic practices of communities throughout the world.[16] Especially salient symbols of this can be found in edited collections, such as Gippert et al. (2006) and Thieberger (2012), which devote considerable space to the collection of language data relevant to fields like anthropology, botany, or geography, often to the point of backgrounding the traditional descriptive linguistic toolkit designed to elucidate structural patterns in grammars (see also Evans (2008: 342–343)).[17]

Despite this, endangered language linguistics does not seem to have been overrun by projects with an interdisciplinary focus. Most work in this area remains more or less purely “linguistic” in nature (though often with an eye toward at least documenting culturally significant forms of language). There are at least two reasons for this. (See also Seidel (2016) for relevant considerations.)

First, in an academic world that is built upon disciplinary foundations, working across disciplines requires a high level of expense with comparatively little “overt” payoff, at least as measured in the usual terms such as publishing in high-profile outlets. Effective interdisciplinary research often requires collaborators to gain fairly deep knowledge about how practitioners of other disciplines collect and theorize on their data and may further result in academic outputs that are neither fish nor fowl, as it were, in terms of disciplinary evaluation. Is a culturally-informed collection of place names (see §3.5) an instance of linguistics, anthropology, or geography? Questions like this do not merely provide interesting intellectual puzzles. They can have real-world consequences given the fact that disciplines do not merely exist to provide a convenient way to categorize different methods of inquiry but are also embedded within the institutional structures that support scholarship. This is most saliently seen in the form of academic departments, which largely define and delimit the training and job opportunities for those with an interest in research careers (see Moran (2010: 3–13) for relevant, historically-oriented discussion).

The second reason for a lack of interdisciplinary projects is simply that time and resources for collaborative work are inevitably quite limited, and documentary linguists, in particular, are often pulled more strongly towards collaborative efforts with speaker communities, typically aimed at language maintenance activities, than towards collaboration with specialists of other disciplines. (See Good (2012a) for discussion relevant to the present context.) This further limits the possibilities for sustained interdisciplinary work, even in cases where partnerships might be relatively easy to develop and sustain.

[16] I would like to acknowledge audience members at the Third International Conference on Language Documentation and Conservation for their comments on the original presented version of this paper, as well as Friederike Lüpke, Frank Seidel, and an anonymous reviewer for their input on an earlier draft manuscript. Various collaborators have also contributed to this work, most of whom are cited below, and Pierpaolo Di Carlo deserves special mention in this regard. Support for the development of this paper was provided by U.S. National Science Foundation Award BCS-1246525. The broader research on which this paper is based has been supported by funding from the Max Planck Institute for Evolutionary Anthropology Department of Linguistics, the U.S. National Endowment for the Humanities (under NEH fellowship #500006 and NEH grant RZ-50817-07), the U.S. National Science Foundation (under NSF Grant BCS-0853981), the Endangered Languages Documentation Programme, and the University at Buffalo College of Arts and Sciences and Humanities Institute.

[17] The DoBeS project, in particular, stands out in this context for its emphasis on funding research activities drawing on the insights of different disciplines (see http://dobes.mpi.nl).

The purpose of the present paper is to highlight the results of interdisciplinary research on patterns of exceptional linguistic diversity in a region of the Cameroonian Grassfields known as Lower Fungom. The primary collaboration has been relatively limited, consisting of a pairing of the present author, identifying as a linguist, with a single collaborator, Pierpaolo Di Carlo, identifying as an anthropologist with a strong interest in the role of language in cultural reproduction. However, this multidisciplinary “team” has been able to do work that draws on data and insights associated with a number of other disciplines.[18] In particular, data that is linguistic, anthropological, geographic, and historical in nature has been brought to bear on the question: Why is Lower Fungom so linguistically diverse?

In discussing the interdisciplinary approach we have taken to this question, my goal is both to highlight what is possible when linguists look at their research with an interdisciplinary frame of mind and to provide practical recommendations regarding how such work can be done effectively, even with relatively limited resources. The key lessons in this regard are: (i) the utility of structuring interdisciplinary research around a particular research question that is sensible to scholars with different backgrounds and (ii) the importance of the specific skillsets and “temperaments” that different collaborators bring with them to a research project. It will also be seen that adopting an interdisciplinary approach can ultimately lead to a more informed view of the ideal linguistic foci for a documentary project, thereby having a significant long-term impact on documentary activities.

In §2, general background information is given on Lower Fungom to provide context for later discussion. A description of the five types of interdisciplinary data that have been explored in current research on the languages of the region is given in §3, with the relevance of the data to understanding the histories of Lower Fungom communities interspersed throughout that section. General lessons that emerge from this specific case study are then offered in §4.

2. Lower Fungom: Linguistic background

The Lower Fungom region is one of the most linguistically diverse parts of the Cameroonian Grassfields, itself an area whose linguistic diversity has been noted for some time (Stallcup 1980: 44). Located in the Grassfields’ northwestern periphery, the core inhabited area stretches roughly ten kilometers both north to south and east to west.[19] It is a rural region with poor infrastructure, and the economy is dominated by subsistence agriculture (Good et al. 2011: 106–107). In Figure 1, a map of Lower Fungom and surrounding areas is given. Lower Fungom itself is found in the center of the map and is bounded roughly by the Yemne river to the west and the Kimbi river to the north and east, with its southern border running in a rough line that passes between the market settlement of Yemgeh and the village of Mekaf and between the villages of Ajumbu and Fungom.[20] Yemgeh and Ajumbu are both part of Lower Fungom, while Mekaf and Fungom are not.

[18] Friederike Lüpke and Frank Seidel have pointed out that it can be helpful to distinguish between multidisciplinarity and interdisciplinarity. The former involves scholars from multiple disciplines working on the same topic without necessarily integrating their results, while the latter emphasizes an attempt to bring the insights of multiple disciplines together in a coherent way (see Möhlig (2010: 4–5)).

[19] See Good (2013) for discussion of Lower Fungom’s areal context and some of the original motivations underlying the work described here.

[20] The body of water we refer to as the “Yemne river” has no standard Western name. See Good et al. (2011: fn. 2) for discussion of the name used here.

Figure 1. Lower Fungom and surrounding area

Table 1 lists the linguistic affiliations of each Lower Fungom village along with rough population estimates. Dashed lines indicate villages whose varieties are sufficiently distinctive from closely related varieties that they are probably best associated with their own “language” if only linguistic criteria are considered.

Table 1. Lower Fungom languages and associated villages

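subgroup      language       village    population
Yemne-Kimbi   Mungbam [mij]  Abar       650–850
Yemne-Kimbi   Mungbam [mij]  Munken     around 600
Yemne-Kimbi   Mungbam [mij]  Ngun       150–200
Yemne-Kimbi   Mungbam [mij]  Biya       50–100
Yemne-Kimbi   Mungbam [mij]  Missong    around 400
Yemne-Kimbi   Ji [boe]       Mundabli   350–450
Yemne-Kimbi   Ji [boe]       Mufu       80–150
Yemne-Kimbi   Ji [boe]       Buu        100–200
Yemne-Kimbi   Fang [fak]     Fang       4,000–6,000
Yemne-Kimbi   Koshin [kid]   Koshin     3,000–3,500
Yemne-Kimbi   Ajumbu [muc]   Ajumbu     200–300
Beboid        Naki [mff]     Mashi      300–400
Central Ring  Kung [kfl]     Kung       600–800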

The languages of the referential Yemne-Kimbi subgroup (see Table 1) are restricted to Lower Fungom. Of the other two languages, Naki is spoken in the Lower Fungom village Mashi and also in settlements outside of Lower Fungom, three of which, Mekaf, Small Mekaf, and Mashi Overside, appear on the map in Figure 1. Kung is only associated with the village of Kung but has been classified with the Central Ring group of languages found to the south, which includes Mmen [bfm]. A dialect of Mmen is spoken in Fungom, a village to the south of the Lower Fungom village of Ajumbu. (For largely accidental historical reasons, Fungom lent its name to the wider region.)

While Lower Fungom’s languages can all be reasonably classified as Bantoid, the five Yemne-Kimbi languages do not have any established close relatives outside of the region, nor can they be straightforwardly shown to be closely related to each other.[21] The linguistic picture is paralleled by an ethnographic one that shows considerable diversity in social organization across the region’s villages as well. More detailed discussion of the region is provided in Good et al. (2011); Di Carlo (2011); Good (2013); Di Carlo & Good (2014).

[21] See Watters (1989) for further background on the Bantoid subgroup of Niger-Congo. Roughly speaking, it can be characterized as comprising the (Narrow) Bantu languages and their closest relatives.

Current research on the languages of Lower Fungom involves the usual sorts of activities associated with language documentation and description, and has yielded traditional outputs such as the descriptive grammar of Mungbam by Lovegren (2013) as well as more targeted studies such as the phonological and morphological investigation of Buu by Ngako Yonga (2013). At the same time, the work is driven by a broader theoretical question associated with much work in contemporary linguistic typology, namely What’s where why? (Bickel 2007: 239). In this case, the question has been more narrowly formulated to the Lower Fungom situation as, Why is the region so linguistically diverse? Consideration of the issues raised by such a question seems especially timely given that Lower Fungom serves as a clear counterexample to global trends of language endangerment and loss. Understanding what has allowed it to develop and maintain such a level of diversity, therefore, may provide useful lessons for language maintenance activities in parts of the world where diversity is under more immediate threat.

It is clear that a question focused on the causes of a local pattern of linguistic diversity cannot be answered with linguistic data alone. Typical documentary and descriptive linguistic activities provide the what of language diversity, but understanding the why requires something more. Our approach has been to examine ethnographic, geographic, historical, and other kinds of information side-by-side with the results of more purely linguistic work to look for cases where the different strands of data converge on a coherent explanation for a given aspect of the region’s diversity. The next section of this paper exemplifies the range of data that has been employed to this point.

A similar approach to what is found here in the context of research on African languages is described in Möhlig et al. (2010b). The overlap is not coincidental but reflects, among other things, the lack of old written materials on Sub-Saharan Africa, which means that investigations into the history of the continent have often required interdisciplinary approaches (Möhlig et al. 2010b: 80).

3. Interdisciplinary data and Lower Fungom

3.1 Linguistic, ethnographic, archaeological, geographic, and historical data

As mentioned above, in seeking to understand why Lower Fungom is so linguistically diverse, the project team has attempted to combine data that is traditionally associated with different disciplines. Our hope is that these strands of evidence will converge in a way that allows for a unified explanation for at least some aspects of the region’s diversity. On the whole, we believe that we have been successful in using interdisciplinary data to develop historical models which provide context for understanding this diversity. We are in the process of further exploring how regional language ideologies and patterns of multilingualism play a role in its maintenance. I do not fully explore these more recent developments here, since the focus is on the value of interdisciplinarity for language documentation rather than specific research results, but allude to them at various points in this section and in §4 for purposes of illustration.

One of the Lower Fungom villages in particular, Missong, will play a prominent role in the discussion since it represents an exemplary instance where different types of data come together to provide a coherent account of its linguistic exceptionality. It is, therefore, especially useful for illustrating the sorts of results one can achieve by combining data from different disciplines (see Möhlig et al. (2010b) for other examples of this in an African context).

3.2 Comparative linguistic data

Because the project is built around understanding patterns of linguistic diversity, the collection of linguistic data is central to its activities. Significantly, however, there is a strong comparative element to the data collection. While most projects in language documentation operate in a mode where they are oriented primarily towards the “ancestral code” of a community (Woodbury 2011: 177) (see also Childs et al. (2014)), the work on Lower Fungom is more concerned with examining how the different lexicogrammatical codes of the region relate to each other (and to other codes spoken in the wider area as well). Gathering comparative data involves the usual methods associated with single-code documentation, with an additional emphasis in many cases on targeted elicitation in domains of grammar that are believed to be particularly valuable in understanding historical relationships among the languages in question.

To pick a concrete example, all of the languages of Lower Fungom show noun class systems that can be informally described as “reduced” variants of the well-known Bantu-type noun class system (though, in making this characterization, I do not mean to imply the relationship involves straightforward historical “decay”; see Good (2012b)). Examining their noun class systems in parallel provides a number of useful isoglosses, as can be seen in the noun class systems given in Good et al. (2011). Beginning with noun class data, I will focus in this section on the grammatical and lexical features of Mungbam, which consists of five closely related varieties, bringing in data from other languages where relevant for purposes of illustration. The discussion of Mungbam draws heavily on Lovegren (2013), which provides a polylectal grammar of the language drawing on data from all five varieties. Of all of the project outputs to date, this work most strongly exemplifies its comparative linguistic orientation.

In Table 2 and Table 3, the noun class systems of the Missong and Munken varieties of Mungbam are presented respectively. For each noun class, two forms are given, one associated with the class prefix found on nouns themselves and the other indicating the form of an agreement marker on elements which show agreement with the noun (e.g., determiners). Capital letters are used to reference different kinds of assimilating consonants. Grave and acute accents associated with agreement markers should be interpreted as indicating that a given class or concord is coded with a lower or higher tone (respectively) than segmentally homophonous counterparts, with the precise tonal realization dependent on the stem.[22] In line with standard practice, the numbering of the noun classes follows conventions associated with Bantu noun classes (see, e.g., Maho (1999); Katamba (2003)), though the numbers should not be read as strict reconstructions. Paired noun classes are indicative of the most common singular/plural pairings in the varieties. These tables abstract away from various complications. Fuller description can be found in Lovegren (2013: 105–145), which also provides justification for the various class numbering conventions.

[22] For example, a word from a given paradigm showing concord associated with a grave tone in Table 2 might have a low tone, while a word showing concord associated with an acute tone might have a mid or a mid-falling tone. The absolute tones will differ across paradigms, but the lower/higher pattern will still be present.

Table 2. The noun class system of the Missong variety of Mungbam

singular                         plural
class  prefix      concord       class  prefix      concord
1      ù-/Ø-       w`-           2      ba-         bu´-
3      ú-          w´-           4      i-          j´-
5L     ì-          j`-           6      a-          w´-
5H     i-          j´-           13     ki-…(-Cə)   kj´-
7      ki-         k´-           8      bi-         bj´-
9      ì-          j`-           10     i-          j´-
19     fi-         f´-           18a    mu-         mu´-
6a     aN-         mu´-          14     bu-         bu`-

Table 3. The noun class system of the Munken variety of Mungbam

singular                         plural
class  prefix      concord       class  prefix      concord
1      ù-/Ø-       w`-           2      bə-         b´-
3      ú-          w´-           4      i-          j´-
5L     ì-          j`-           6      a-          n´-
5H     i-          j´-           13     ki-…(-lə)   kj´-
12     a-          k´-           8      bi-         bj´-
9      ì-          j`-           10     i-          j´-
19     ɕi-         ɕ´-           18a    mu-         mw´-
6a     N-          m´-           14     bu-         bw`-

A cursory comparison of the noun class systems in Table 2 and Table 3 reveals them to be quite similar. There are some clear differences, e.g., the singular associated with plural Class 8 in Munken has a quite different form for its nominal prefix (a-) than that of Missong (ki-), to the extent that it is not even clear whether they should be associated with a single etymological class (Lovegren 2013: 132–137), hence their different numerical assignment of Class 7 vs. Class 12 in the tables. However, for the most part, the differences can be seen as representing relatively minor phonological divergences. This basic pattern holds across all five Mungbam varieties, presenting strong linguistic evidence for their close relationship, which is important given that local language ideologies treat the dialects of each of the five villages as constituting distinct “languages” (Lovegren 2013: 2).

We can contrast Mungbam noun class systems with the system of Ajumbu, a one-village language of Lower Fungom, as given in Table 4. While some similarities can be found among the Ajumbu and Mungbam noun class systems (e.g., in Classes 8 and 6a), the Ajumbu system is clearly much more distinct from either Mungbam system than Missong and Munken are from each other. Thus, it is relatively straightforward to use comparative grammatical data to establish that Ajumbu and Mungbam should be grouped as separate languages. This data cannot resolve whether or not Missong and Munken should be considered separate “languages”, but, from a research perspective, this is less important than simply knowing that Missong and Munken are much more closely related to each other than either is to Ajumbu.

Table 4. The noun class system of Ajumbu

singular                            plural
class   prefix        concord       class   prefix        concord
1       Ø-            w`-           2       a-            b´-
5       Ø-            y´-           6       a-            y´-
5       Ø-            k´-           7(a)    kə- (-lə)     k´-
7       kə-           k´-           8       bə-           b´-
9       -             y`-           10      ´-            y´-
19      fə-           f´-           18a     m-            m´-
6a      m-            m´-

The comparative data presented to this point is relatively straightforward in interpretation: It delineates a major linguistic boundary among the speech varieties of three Lower Fungom villages. Similar data can be presented to establish the distinctiveness of all of the region’s “languages”, where, here, a set of varieties is treated as a “language” if they share an ISO 639-3 code as given in Table 1 (Good et al. 2011).23 However, our comparative investigation has also yielded significant results when we look at the grammars and lexicons of varieties of the same “language”. For instance, the five varieties comprising Mungbam are clearly similar enough to each other that, by linguistic criteria alone, they can be considered members of a single dialect cluster. However, the Missong variety is especially distinctive in its grammatical and lexical features. Considering the lexicons of the five Mungbam varieties, for example, the Missong variety is the most divergent from the other four. This can be seen in Table 5, where lexical similarity percentages are given for items in a 200-term wordlist (data provided by Jesse Lovegren, personal communication). While the four other Mungbam varieties tend to exhibit lexical similarity at a level of around ninety percent, for Missong, the percentage of shared items never reaches even eighty percent.

23 See Cysouw & Good (2013: 336–339) for some of the problems associated with adopting such a simplistic approach to the language/dialect division.


Table 5. Lexicostatistical similarities across the five Mungbam varieties

          Abar   Biya   Ngun   Munken   Missong
Abar      -
Biya      90     -
Ngun      89     93     -
Munken    88     92     90     -
Missong   77     74     73     77       -
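The contrast drawn from Table 5 can be made concrete with a little arithmetic. The following is a minimal sketch, not part of the original analysis: it copies the pairwise percentages from Table 5 and averages, for each variety, its similarity to the other four. The averaging step and the names `SIMILARITY`, `mean_similarity`, and `MEANS` are assumptions introduced here purely for illustration.

```python
# Pairwise lexical similarity percentages copied from Table 5.
SIMILARITY = {
    ("Abar", "Biya"): 90,
    ("Abar", "Ngun"): 89,
    ("Biya", "Ngun"): 93,
    ("Abar", "Munken"): 88,
    ("Biya", "Munken"): 92,
    ("Ngun", "Munken"): 90,
    ("Abar", "Missong"): 77,
    ("Biya", "Missong"): 74,
    ("Ngun", "Missong"): 73,
    ("Munken", "Missong"): 77,
}

VARIETIES = ["Abar", "Biya", "Ngun", "Munken", "Missong"]


def mean_similarity(variety):
    """Average similarity of one variety to each of the other four.

    This summary statistic is an illustrative assumption, not a measure
    used in the chapter itself.
    """
    scores = [pct for pair, pct in SIMILARITY.items() if variety in pair]
    return sum(scores) / len(scores)


MEANS = {variety: mean_similarity(variety) for variety in VARIETIES}

if __name__ == "__main__":
    # Print varieties from least to most similar to the rest of the cluster.
    for variety, mean in sorted(MEANS.items(), key=lambda item: item[1]):
        print(f"{variety:8s} {mean:.2f}")
```

On these figures, Missong averages 75.25 percent, while the other four varieties all average between 86 and 87.25 percent, which is simply the pattern described in the text restated as a single number per variety.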

More striking are the divergent patterns found in Missong with respect to a process of verbal ablaut that affects a subclass of verbs (not fully overlapping) in all Mungbam varieties as part of the coding of aspectual distinctions. (This pattern is roughly comparable to that associated with the tense marking found in so-called strong verbs in .) Examples are given in Table 6 (see Good et al. (2011: 123) for earlier discussion of these forms and Lovegren (2013: 189–193) for a broader overview of this phenomenon in Mungbam). The two classes of stem forms are labelled as Perfective and Imperfective to reflect their approximate functions. In Munken, the general pattern is that, if a verb does undergo ablaut, the Perfective will show a and the Imperfective a , and this is the more typical pattern across the varieties. In Missong, however, a reverse relationship holds. Moreover, Missong, uniquely among the Mungbam varieties, shows Perfective/Imperfective forms where the Imperfective is coded with a not seen in the Perfective. These differences are not consistent with mere phonetic drift but indicative of a more abrupt process of change of some kind.

Table 6. Verbal ablaut in Missong and Munken

Missong         Munken
PFV     IPFV    PFV     IPFV    gloss
ji      je      ja      ja      ‘steal’
gbə     gbe     gbi     gbo     ‘fall’
ma      mɔ      mɛ      mɔ      ‘soak’
to      ti      ti      to      ‘come’
noa     nɛŋ     ŋan     ŋan     ‘slice’
wa      waŋ     wan     wan     ‘keep’
nyoa    nyaŋ    nyɛ     nya     ‘stay’

A final indicator of Missong’s divergence from the other four varieties is found in its pronominal system. As described in Lovegren (2013: 152), the personal pronouns of the Mungbam varieties are largely cognate with one clear exception. In the second person singular, Missong shows a pronoun with a form bì, while the rest of Mungbam shows the following forms in the same paradigmatic position: wɛ̀ (Abar), ɔ̀ (Biya), wɔ̀ (Munken), and wɔ̀ (Ngun). The latter four can be easily seen as related to each other, while the Missong form cannot be.

We have seen, then, that the comparative linguistic data provides us with the what of the What’s where why? question that has informed much of the work on Lower Fungom. In this case, the what comprises the fact that one sees clearly distinct languages (e.g., Ajumbu vs. Mungbam), and, even within “languages”, there are still interesting patterns of divergence among varieties. At the same time, the linguistic data alone cannot tell us why these particular languages and varieties happen to be the way they are. This is where data from other disciplines needs to be considered, and, in the following section, I will begin to do this by summarizing some significant results derived from ethnographic investigation. A recurrent topic in the discussion below will be the linguistic divergence of Missong, which will serve as a useful thread linking together the data derived from a variety of sources.

3.3 Ethnographic data After comparative linguistic data, the next most extensive class of data that has been collected on the communities of Lower Fungom can be broadly classified as ethnographic. This ranges from types of ethnographic information that could be collected for almost any community, such as oral histories and information on patterns of intermarriage, to information more specific to Grassfields societies, like the composition of village secret societies or lists of chiefs who are reported as having led a particular village. Collection of data on secret societies was greatly facilitated by advance consultation of existing ethnographic descriptions, for instance Chilver & Kaberry (1968). Fortunately, much ethnographic work of this sort (especially comparatively older work) is written in a style that is fairly accessible. Detailed discussion of the comparative ethnography of Lower Fungom can be found in Di Carlo (2011). Here, I will first highlight what I believe to be useful general lessons for the linguist when making use of data of this kind and then discuss more specifically how ethnographic data has helped us better understand some linguistic features of Lower Fungom, again focusing on Missong.

It would, of course, be impossible for most linguists to collect ethnographic data with the same level of expertise as a cultural anthropologist. However, for a linguist with some familiarity with salient cultural characteristics of the communities whose languages they are studying, certain kinds of data can be collected with relative ease. For instance, I have found oral histories of Lower Fungom villages to be useful to collect merely as an example of a text that is easy to elicit from speakers, even without necessarily being specifically interested in their thematic content. In this case, the difficulty for the documentary linguist is not in gathering oral histories but, rather, in understanding how to interpret them once one goes beyond the confines of grammatical analysis and direct translation. It is not advisable to treat oral histories as “literal” histories in the academic sense.24 Moreover, one must be aware that the oral history being collected is necessarily a history as told by one specific person who may shape it to reflect their own interests.

24 Vansina (1985) is an important work on this topic.


Consider, for example, the fragments of oral texts collected from Nji Ndinkwa Manessah Tah of Koshin in (1) and Che Martin of Ajumbu in (2).25

(1) Koshin

a. SO bə̀ ká gwá fə̀ bə̀ ká tīká bānyɛ́ bə̄ bɔ̀ Sáwì.
   so 3p.PVB CONT separate exit 3p.PVB CONT leave 2.brother 2.their Sawi
   “They then separated and left their brothers from Sawi.”

b. Bə̀ ká nê ká nî kə̀ bà wə́ mə̀ SOTEE ká
   3p.PVB CONT leave CONT walk PRT.PROG 5.bank 5.DET LOC so.long CONT
   dí jḭ̄ ɛ̰̄ fə́ bə̄ mɔ̀ fɔ́ wɛ̄n.
   come reach place be there there now
   “They then went along the banks until they came and reached where they are today.”

(2) Ajumbu

   Sə̀ tə́ ānɛ̄ bə́ nə̄ sə̀ kɔ́ŋ kə̄ ānɛ̄ ātsálə́
   1p.PVB COP 2.people 2.DET SUBD 1p.PVB love PRT 2.people 2.all
   “We are the people who love everyone.”

The fragment in (1) portrays Koshin as being founded by a group coming from outside of Lower Fungom since Sawi is a place found to the east of the region. However, it would clearly be simplistic to take this to mean that the village, as constituted today, is composed only of people who descended from those taking part in this initial migration. Rather, we can usefully compare this Koshin history to the dominant narrative of American history, which begins with the stories of settlements made by English colonists even though the vast majority of Americans do not descend from early colonial groups. What we have here for Koshin is a history associated with the community as a whole, as understood by one person who identifies as a member of the village. An ethnographically focused project might have chosen to collect oral histories using a lingua franca from many Koshin individuals in order to have a richer data set on how individuals from the village portray its history. Nevertheless, even if it is of more limited ethnographic value, this kind of text is doubly useful in that it can be used not only to illustrate important grammatical phenomena but also to help with the analysis of Koshin culture.

The Ajumbu example in (2), by contrast, is part of an oral history of that village which portrays the Ajumbu as having been in Lower Fungom before speakers of other languages. The statement is communicating that the Ajumbu have welcomed various newcomers without causing them problems. Basic knowledge of human nature suggests that, if one were to gather the histories of each Ajumbu family, there may be more particular grievances held by some members of the community with respect to other groups that have moved in. However, the portrayal of Ajumbu as generally peaceful is still ethnographically interesting as a reflection of how one member of the village has chosen to represent its historical status to an outsider who showed interest in this question.

25 The text fragments in (1) and (2) are drawn from recordings made by myself in collaboration with the speakers as well as Tah Christopher of Koshin and Zang Martina of Ajumbu. Transcription conventions largely follow Tadadjeu & Sadembouo (1984) (see Good et al. (2011: 13) for further details). While the most crucial aspects of the meaning of the text fragments for present purposes are believed to be secure, the glosses may not fully reflect all grammatical distinctions, especially those coded primarily via tone, and the tone transcriptions themselves reflect the surfacing tone patterns rather than a tonemic representation. Glossing abbreviations across all examples in this paper are as follows: 2, 5, 8: noun class; 1p: first person plural pronoun; 3p: third person plural pronoun; CONT: continuous; COP: copula; DET: determiner; IPFV: imperfective; LOC: locative; PROG: progressive; PRT: tense-aspect particle; PVB: preverbal pronoun; SUBD: subordinator.

A somewhat different example of how one can approach the interpretation of collected ethnographic data can be found in chief lists. These are orally transmitted lists of the chiefs who have ruled over a particular village (or other relevant unit). They may be recounted as part of an oral history, but are a distinctive cultural object with their own interpretive requirements. A fragment of a chief list from Ajumbu is given in (3). This was collected as part of the same oral history from which the example in (2) was drawn. It was not specifically asked for but was given as part of the history of the village, further revealing the ethnographic significance of this type of text. This sentence is one of a number of the same structure given together which, in sum, provided an entire chief list.

(3) Ajumbu

   Nə̄ gyɔ̀ŋ kpə́ nə̄ sɔ̀ sî nshya᷆ŋ mìny nə̄ ŋkyàŋ kə̀.
   SUBD Gyong died SUBD Soh Sih Nshyang take SUBD 7.chair 7.DET
   “As Gyong died, so Soh Sih Nshyang took the chair (of the chief).”

In literate societies, genealogies and succession lists will tend to be fixed in written form and, therefore, not readily alterable. In oral societies, by contrast, they can be actively changed to reflect new social realities, as in the example of the Gonja in Northern Ghana, as described by Goody & Watt (1963: 310). Their mythical founder shifted from having seven sons to five at some point in a span of sixty years, following a change in political organization of the group from comprising seven divisions to five.

In the case of chief lists, this means that we should not take them to be literal retellings of the names of those who held the position of chief, but, rather, should see them as expressions of how a community’s leadership establishes its claim to authority. In this regard, we see an interesting divergence in the chief lists of the Lower Fungom communities. As noted in Di Carlo (2011: 74), one straightforward way to compare chief lists across villages is to examine their “structural time depth” (Vansina 1985: 118). The typical chief list in the area consists of six to eight names. In Missong (the same variety that was seen to be grammatically and lexically divergent from other varieties of Mungbam in §3.2), the chief list is only four names in length (Di Carlo 2011: 84). This is consistent with aspects of its oral traditions that treat it as having been founded in recent times by immigrant groups. Does this mean that Missong is, in fact, a recent formation? This has been the ultimate conclusion, but we cannot use ethnographic data such as this as our sole justification. All we can know from this is that Missong is presented as having less historical depth than most of the other Lower Fungom villages. However, if we encounter other evidence of Missong “recentness”, then it will become possible to suggest with more reliability that Missong actually is a comparatively new village.

In this case, there is other ethnographic evidence that corroborates the idea that Missong is a recently formed village. In particular, there is a general lack of “cohesiveness” in the village’s structure (Di Carlo 2011: 84). Its quarters (a unit of sociopolitical organization just below the level of the village) are more politically autonomous than what is found in other villages of the region, for instance having more distinctive ritual sites within them than what is seen elsewhere, where ritual sites are more likely to be centralized in the quarter of the chief.26 This lack of cohesiveness is clearly consonant with a history that treats it as a recent federation of groups.

Additionally, local cultural attitudes towards the Missong linguistic variety itself are pertinent to the question of its relative age. Contrary to popular associations of indigenous codes with “ancestral” varieties, one finds in Lower Fungom an intriguing characterization of the origins of the Missong variety of Mungbam as having been “stolen” from one of the other Mungbam villages. This history of theft is not simply an accusation that the other villages make about the Missong, since the Missong adopt this story as well.27

What we see, in this case, is a consistent ethnographic picture of Missong as a “recent” village, further understood to have formed from groups once speaking a different language or languages. This history is perhaps most effectively presented in the words of an individual who is from Missong, Buo Makpa Amos, as recorded by Pierpaolo Di Carlo. This is another case where an oral history, here of an extended kin group rather than a village and collected in a lingua franca instead of a local language, provides valuable ethnographic information. The place names referred to in the quotation below can be found in the map in Figure 1 (where the village of Subum is part of the general Bum area referred to below). Note, in particular, the characterization of Missong not as any kind of ancient unity but, rather, as a federation of independent groups and the explicit linking of a common language to an expression of a common political identity.

As my father told me, we were from Fang side, even in Bum side there were many of us. When you people are cooperating you speak one language. If you speak one language, you cooperate. As a group of relatives moves, the brothers may decide to split, each choosing a different place to stay. This is what happened to us. We left the early place in Fang side as a whole and arrived in Abar. From here we scattered. Now, we Bambiam from Missong have relatives in Abar, in Buu, in Ngun. Each family attached itself to a village and therefore had to speak the general language used there. For example, we Bambiam attached ourselves to Bikwom and hence had to adopt their language; Bikwom people are attached to Bidjumbi and Biandzəm to form the village of Missong, and this is why they all had to use the same language, that is, Missong. This is why all the descendants of the family that moved from Fang side now speak different languages.

26 To understand this discussion, it is helpful to be clear on what is meant by the term village in this context. While villages will typically be associated with a certain degree of spatial clustering of settlement, as expected based on how the term is used in colloquial English, here, the conceptualization of the village as a political union of different kin groups is more significant than its actual physical manifestation. This alliance of kin groups is expressed most saliently through the use of a common “language”, making it a social unit of clear significance for documentary linguistic work.

27 In the local context, this characterization of the Missong “thieves” does not appear to be particularly offensive, perhaps due to comparatively weak language–culture links when set against the norm in, for instance, Western societies. See Di Carlo & Good (2014) for detailed discussion.


As just discussed, we cannot uncritically jump from ethnographic data such as this to concrete historical reconstruction. However, the evidence seems sufficient at this point to at least seriously consider that the Missong ideology of recentness is, in fact, connected to an actual recentness of formation, at least when set against other villages in the area. This naturally leads to a further suspicion: That Missong’s linguistic distinctiveness (see §3.2) may result from its being a historically shallow formation from groups which formerly spoke diverse languages, only recently shifting to a Mungbam variety, and presumably incorporating the influences of other languages into it as part of the means of expressing a new kin group federation. In other words, a “mixed” Missong history may have led to a “mixed” linguistic variety.28 The challenge that remains is to see the extent to which we can corroborate this story with data from other sources.

As mentioned above, it is clearly not the case that a linguist would be able to collect ethnographic data with the same level of skill as a trained anthropologist. Moreover, they would not be able to develop appropriate models for interpreting such data on their own. Even if they had some of the necessary training, they may simply lack the time to do any detailed ethnographic analysis. Nevertheless, if models have already been developed, gathering some useful data need not be particularly difficult. The Missong case just discussed involved, for example, collecting oral histories, independently useful as analyzed texts; chief lists, which, if significant to the local societies, can be easily collected using a language of wider communication; and statements of local language attitudes, which can also be easily collected in a language of wider communication. Of course, the specific kinds of information of this type that one can collect will depend on cultural context: chief lists will not be found across the world (though genealogies, more generally, can be found anywhere). However, to the extent that one can find a reasonably broad ethnographic literature on societies in the same general cultural area as those whose language is being investigated, it seems likely that other easily “collectible” sorts of ethnographic points of interest can be found.29 In other words, useful ethnographic data may be in reach even in the absence of a dedicated ethnographer.

3.4 Archaeological data The data the project has collected (exclusively by Pierpaolo Di Carlo) in the archaeological domain is less extensive than in other areas, but it has, nevertheless, helped paint a more complete picture of the forces that have shaped linguistic distributions in Lower Fungom. Of course, extensive archaeological surveying is clearly outside the scope of a linguistic project. At least in our context, shallow exploration has been relatively straightforward because we are targeting historical time depths that are not too distant from living memory and collecting the location of settlement sites where traces of former occupation are visible on the surface (e.g., in the form of foundations that remain from decayed houses).

The archaeological data we have access to for the settlements once associated with the village of Mufu is especially revealing. The present-day village is located at the top of an inaccessible hill, following a pattern found for a number of villages in Lower Fungom. This settlement pattern is associated with a need for defense from neighboring groups, since it is otherwise remarkably inconvenient: Farming areas are frequently quite distant from settlements and require arduous hiking up and down narrow paths (along which water and other necessities also need to be transported). Across Lower Fungom, there is no reason to believe this settlement pattern is particularly “ancient”. It appears to represent a response to a period of instability in the broader region in the nineteenth century (see Di Carlo & Good (2014)). Previous to this, settlements would have been located closer to farming areas (and, in fact, the trend in recent decades of relative peace has been for people to move to more accessible settlements, such as the low-altitude Yemgeh market area seen in Figure 1). In the case of Mufu, evidence for the shift of the village population to its present location can be found not only in oral histories (see §3.3) but also in archaeological sites found near the village. In particular, one finds remains of five small settlements in less easily defended locations near the village, corroborating accounts that treat Mufu as having resulted from a movement of formerly more distributed kin groups into a compact, more defensible physical space. As with all of Lower Fungom’s villages, Mufu is divided into distinctive quarters (both residential and social units), some of which presumably were historically associated with these small settlements. The proximate cause for the change in settlement patterns in the Mufu case appears to be the arrival of Naki-speakers associated with the present-day village Mashi (Di Carlo 2011: 90).

28 I mean the term mixed here quite informally. A better choice might be “layered” following Seidel’s (2010) use of the term to characterize the historical development of the Bantu language Yeyi.

29 The project’s investigations into the ethnographic characteristics of Lower Fungom societies have, in fact, involved collection of data which requires deeper knowledge and care in the use of ethnographic methods. Di Carlo (2011) presents a fuller picture, and this is hinted at in the discussion of village “cohesiveness” above. I highlight these “easy” cases here to emphasize that even a lone linguist could engage in some kinds of interdisciplinary data collection without taking them too far afield from their core documentary efforts.
Returning to the case of the village of Missong, discussed in more detail in preceding sections, our only archaeological evidence is negative: It is not a village readily associated with older habitation sites. This is, of course, perfectly consistent with the ethnographic characterization of Missong as a “recent” village (see §3.3), though, at the same time, in this case we must be quite conscious of the limited scope of the archaeological investigations to this point, meaning that absence of evidence can only be weakly construed as evidence of absence.

As with the ethnographic data points mentioned above, it is important to stress here that the collection of the sort of archaeological information just discussed for Mufu does not require specialized training. The inhabitants of Lower Fungom are quite aware of these archaeological sites (indeed, they are not all that different from so-called “ghost towns” found in the western United States) and can easily lead an outsider to them. Beyond (sometimes arduous) hiking, a GPS device is also required so that the locations of the sites can later be mapped alongside the collection of oral accounts of their significance. Such data collection can be somewhat time intensive, but, in the scope of a long field trip, may be quite manageable, and GPS devices have become quite standard pieces of field equipment.

3.5 Geographic data By “geographic data”, one can mean various things. In the simplest case, this could be data on locations of sites of significance to a community being investigated, and in the age of cheap GPS devices, it is quite straightforward to collect highly accurate information in this regard which can, among other things, allow for the creation of maps of the areas being investigated. Indeed, the map in Figure 1 was created using data gathered from a GPS device in the context of this project and is a significant improvement over previously available maps. While such use of geographic data is enormously valuable as a means for visualizing linguistic distributions, in the context of a project such as the one described here, where the motivating question is What’s where why?, it is best understood as a “geography-weak” approach (Di Carlo & Pizziolo 2012: 152) to questions of the relationship between language and space since the contribution from geography as a discipline is relatively limited.

By contrast, we can also consider a “geography-strong” approach such as the one adopted in the analysis of Lower Fungom population distributions in Di Carlo & Pizziolo (2012), which I rely on in the discussion here. Such an approach expands on the usual model of examining language distributions primarily as abstract distributions over a two-dimensional space (usually merely as “dots”, but also, in some cases, as polygons) by: (i) using modern geographic methods, in particular Geographic Information Systems (GIS), to examine correspondences between linguistic features and other geographic features (e.g., hills, rivers, etc.) (see Di Carlo & Pizziolo (2012: 183) for summary discussion and review of the use of GIS in linguistic work) and (ii) mapping the geolinguistic knowledge of communities in addition to the locations of languages and language features.

As with much of the data discussed above, there has been nothing specifically novel about our use of geographic information in and of itself. The innovation involves using a geography-strong approach to lend further insight into the What’s where why? problem for Lower Fungom. In particular, the compactness of the region allows us to explore language–geography interaction at a much more fine-grained level than is possible for global-scale surveys, such as Nichols (1992). (This is not to criticize such work since, as discussed by Good (2013), work of this kind actually serves as much of the inspiration for our investigation into the micro-area of Lower Fungom.)

To pick one example of the value in taking a geography-strong approach to language documentation, we can consider Di Carlo & Pizziolo’s (2012) presentation of information on sites of memory, adapting a notion developed by the historian Pierre Nora (see, e.g., Nora (1989)).30 This involves collecting data on locations that form part of the collective memory of villages based on their ethnohistorical traditions and, to the extent possible, attaching approximate relative chronologies to these sites using available evidence (including archival records; see §3.6). These locations can then be mapped for purposes of presentation and analysis. While sites of memory may correspond to locations where there is a physical record of the presence of a group of people associated with a given village, this need not necessarily be the case since references to these places are collected to see how people understand their relationship to a given location whether or not they settled there.

30 Di Carlo & Pizziolo (2012) translate Nora’s original term, lieux de mémoire, as memory-places, following Flores (1998: 429). I use sites of memory here because it appears to be a more standardly employed translation (see, e.g., Nora (1989: 15)).

In Figure 2 and Figure 3, sites of memory are plotted for the various Lower Fungom villages (as well as one additional site, Nsom, a settlement formerly reported to be present in the region as discussed in Di Carlo (2011: 92–93)). These maps are drawn from the online supplementary materials published as part of Di Carlo & Pizziolo (2012). Sites of memory that can be associated with a period around 1860 are found in Figure 2, and sites that can be associated with a period around 1900 are found in Figure 3. The association of a site of memory with a particular village may have been made by the members of the village itself or by members of an outside village. The survey of sites of memory was preliminary in various respects. However, the relative differences in site density and locations should be reasonably reliable across the maps as an index of this aspect of local historical memory.
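The two properties of these maps that matter for the argument, how many sites of memory a village has and how tightly they cluster, can be computed directly once site locations have been recorded. The following is a minimal sketch under stated assumptions, not the method of Di Carlo & Pizziolo (2012): the coordinates are invented placeholders on a local kilometre grid, the village labels are hypothetical, and mean distance from the centroid is just one simple way to quantify concentration.

```python
import math

# HYPOTHETICAL site locations as (km east, km north) on a local grid.
# These are placeholders for illustration, not data from the survey.
SITES = {
    "village_a": [(0.0, 0.0), (0.0, 2.0), (2.0, 0.0), (2.0, 2.0)],
    "village_b": [(0.0, 0.0), (6.0, 8.0)],
}


def dispersion(points):
    """Mean distance (km) of the points from their centroid.

    Smaller values mean the sites are more spatially concentrated.
    """
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return sum(math.hypot(x - cx, y - cy) for x, y in points) / len(points)


def summarize(sites):
    """Map each village to (number of sites, dispersion in km)."""
    return {name: (len(pts), dispersion(pts)) for name, pts in sites.items()}


if __name__ == "__main__":
    for name, (count, spread) in summarize(SITES).items():
        print(f"{name}: {count} sites, mean distance from centroid {spread:.2f} km")
```

With real GPS readings in latitude and longitude, the coordinates would first need to be projected onto a planar grid (or the distance function replaced by a great-circle formula); the planar shortcut here is reasonable only because the region is compact.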

Figure 2. Sites of memory circa 1860


Figure 3. Sites of memory circa 1900

For each village, the maps in Figure 2 and Figure 3 reveal both the relative quantity of sites of memory and their spatial concentration. For instance, towards the bot- tom of the 1860 map in Figure 2, one can contrast the numerous, relatively concentrated sites associated with the village of Fang (in orange), against the less numerous and quite scattered sites associated with Ajumbu (in light green). Around 1900, as seen in Figure 3, the configuration of sites for Fang is relatively similar to what was found in 1860. The Ajumbu sites, however, still remain comparatively few but have become more concen- trated around the present-day location of the village of Ajumbu (see Figure 1). These patterns straightforwardly correspond to ways in which Fang and Ajumbu differ from each other, in both contemporary and historical terms. On the one hand, as

Interdisciplinary Approaches to Language Documentation 62 seen in Figure 1, there is a remarkable disparity in the populations of the two villages. Fang is, by far, the most populous village in Lower Fungom, perhaps even more than twenty times the size of Ajumbu. That there are more sites of memory associated with Fang than Ajumbu is, therefore, hardly surprising. To understand the difference in the concentration of the sites of memory, where those of Fang strongly cluster around the present-day location of the village of Fang for both 1860 and 1900, while those of Ajumbu only show strong clustering in the 1900 map, it is necessary to appeal to history. As discussed in §3.4 (see also Di Carlo & Good (2014)), the nineteenth century was a period of political instability in Lower Fungom. The most direct cause for this was the movement of refugee groups from outside of Lower Fungom into the region. (These movements were themselves caused by a larger-scale pattern of regional instability.) A community associated with present-day Fang would have been one of these refugee groups, moving into Lower Fungom via a compact migra- tion event (see also §3.6 for further comments on Fang’s history). Those associated with present-day Ajumbu, on the other hand, appear to have been present in Lower Fungom since before this period of instability and originally had a more dispersed settlement pat- tern, much like what was once found for Mufu, as discussed in §3.4. The concentration of settlement for Ajumbu groups would only have come over the course of the nineteenth century as political instability affected them to the point where, like the Mufu (and oth- ers), they would have decided to move to their present location on a relatively inaccessible hilltop. 
In this case, the proximate cause for this shift of settlement pattern would have most likely been the arrival of the Kung into Lower Fungom, who compete for space with the Ajumbu relatively directly.31

31 The two outlier sites associated with Ajumbu in Figure 3 are more precisely not associated with Ajumbu, per se, but, rather, a settlement most frequently referred to as Lung in the scant literature that mentions it (Good et al. 2011: 133). Investigation by the author with rememberers of Lung clearly shows it to have been a variety very closely related to Ajumbu, though it is now more or less lost, its speakers having scattered among other groups due to the political instability just mentioned.

Returning to the case of the village of Missong, the maps in Figure 2 and Figure 3 reveal a village without a particularly high number of sites of memory in either period, as well as a village where these sites are quite spatially concentrated. Missong is a comparatively small village in terms of population. So, we should not expect the multitude of sites seen for higher-population villages such as Fang. Even accounting for this, the sites of memory in the 1860 map are still quite small in number. Missong’s population is roughly in the same order of magnitude as that of Ajumbu, Biya, Buu, Mufu, and Ngun, for example, which show many more sites of memory in that period. This, along with the concentration of sites in the present-day location of the village rather than in more outlying areas, can be understood as a further index of its “recentness”. Of course, historical memory of this sort is subject to revision. Therefore, we should not take this as “proof” that Missong is somehow “younger” than, say, Buu. However, this is another piece of evidence for this historical interpretation. Ethnography, archaeology, and geography are all converging on the same reconstruction of the village’s history.

The final type of data to consider here is historical data in the usual Western sense of history: written records produced by first-hand observers, in this case those working as part of the British colonial enterprise. These will be discussed in §3.6.

Before moving on, it is worth emphasizing a point made above in the context of other kinds of data. While linguists, of course, cannot be expert geographers, this does not prevent them from collecting certain kinds of geographic information. In the present case, for instance, the collection of locations of sites of memory requires a GPS device and asking people about sites in their area and their association with specific villages—activities which do not presuppose advanced training. Interpreting the significance of the locations of these sites does require some additional knowledge, but by emphasizing the comparative differences in site quantity and concentration among villages, as was done here, rather than, say, the significance of a specific site, the potential for over-interpretation of data which may not be fully reliable in its details is mitigated.
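The comparison of site quantity and spatial concentration per village can be made concrete with a short computation. What follows is a minimal sketch in Python, not the workflow actually used in the project: the record layout, the village labels, the coordinates, and the function names `haversine_km` and `summarize` are all assumptions introduced for illustration, and concentration is approximated here as the mean distance of a village's sites from their centroid.

```python
import math
from collections import defaultdict

# Hypothetical records: (village label, latitude, longitude) in decimal degrees.
# These coordinates are invented for illustration and are NOT project data.
SITES = [
    ("village_a", 6.500, 10.100),
    ("village_a", 6.501, 10.101),
    ("village_a", 6.499, 10.099),
    ("village_b", 6.550, 10.150),
    ("village_b", 6.600, 10.220),
]

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def summarize(sites):
    """Return {village: (site count, mean distance in km from the centroid)}."""
    by_village = defaultdict(list)
    for village, lat, lon in sites:
        by_village[village].append((lat, lon))

    summary = {}
    for village, points in by_village.items():
        # Arithmetic mean of coordinates: an adequate centroid over a small area.
        c_lat = sum(p[0] for p in points) / len(points)
        c_lon = sum(p[1] for p in points) / len(points)
        spread = sum(
            haversine_km(c_lat, c_lon, lat, lon) for lat, lon in points
        ) / len(points)
        summary[village] = (len(points), spread)
    return summary


if __name__ == "__main__":
    for village, (count, spread) in sorted(summarize(SITES).items()):
        print(f"{village}: {count} sites, "
              f"mean distance from centroid {spread:.2f} km")
```

Comparing the two figures across villages, rather than reading anything into a single point, mirrors the cautious interpretive strategy described above: a village with few sites and a small mean distance would pattern like Missong, while one with many sites and a small mean distance would pattern like Fang.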

3.6 Archival data To the best of my knowledge, language survey and documentation projects rarely consult archival records as a means to provide historical context to the communities that they are studying. In the case of Lower Fungom, for instance, the first published survey of its languages (Hombert 1980) implicitly treats the region as largely “ahistorical”. Specifically, the surveyed varieties are reconstructed in a way which suggests that their grammatical diversity is due to ancient patterns of diversification after an initial settlement event, with little in the way of intervening “history” considered side-by-side with the linguistic data. I do not mean to be especially critical of individual instances of such work, as they seem to represent the norm in the field of linguistics, and, for Africa especially, merely continue a popular tradition viewing it (without justification beyond a kind of colonial romanticism) as a “continent mired in timeless immobility” (Kopytoff 1987: 7). However, in the Cameroonian case, at least, there is a significant amount of documentation in the form of survey reports from the colonial period that can push our window of Grassfields history back decades or more before the first proper linguistic surveys took place.

The present project, again in work primarily undertaken by Pierpaolo Di Carlo, has examined, in particular, the National Archives in Buea, Cameroon (see Maderspacher (2009) for discussion of Cameroon’s national archives). During the colonial period, Lower Fungom was, just as today, not an area of special political or economic importance. Therefore, one will not find especially detailed colonial reports regarding it. Nevertheless, one can still discover within the archives additional data of relevance for understanding Lower Fungom linguistic history.
In some cases, colonial sources cannot be said to offer new “facts” but they can, at least, establish that aspects of contemporary oral tradition were also in place decades ago. For instance, the oral history of the village of Fang claims that its inhabitants originated much further south than their present location, in the town of Bafang in the West Region of Cameroon. They then are said to have migrated to their present northern location after stopping for a period in Befang, a village not especially far from Fang, close to the town of Wum on the road between Wum and Bamenda (see Figure 1 for the location of Bamenda). Colonial records cannot verify this specific migration story. However, as discussed by Di Carlo (2011: 79–80), during a visit to Fang by a colonial officer who was assisted by Befang carriers, the Fang asserted a strong friendship with the Befang people. We, therefore, know that the claim of a connection between the Fang and the Befang is at least nearly a century old. This cannot “prove” the Fang did actually spend time in Befang, but, as part of a larger set of evidence, can help us put together a more detailed story for this village’s history. Similarly, the Koshin have a clear tradition of having migrated, as a village, from points to the southeast of Lower Fungom. As discussed in Di Carlo (2011: 81), a map from the 1930s specifically includes a site labeled “Old Koshin” in an area consistent with the migratory path told in Koshin oral history. Of course, we do not know the exact process through which this map was created. So, again, we cannot see this as a definitive statement of Koshin’s earlier location. However, it minimally verifies a decades-long persistence of this aspect of Koshin oral history, and, if the site of Old Koshin could be specifically surveyed, perhaps will one day yield quite strong evidence for this history.

Returning to Missong, comparable to the archaeological data (see §3.4), where there were no old sites associated with the village, the most salient aspect of Missong’s presence in the colonial record is its comparatively minor role. Colonial documents rarely mention the village and, when they do, they represent it as a break-off from the village of Munken (Di Carlo 2011: 84). This is not strong positive evidence for Missong’s recentness, but it is perfectly consistent with this idea. And, it further verifies that the perception of Missong as a historically shallow village goes back at least to the first part of the twentieth century.

In some ways, the archival records are comparatively poor sources of evidence: They simply lack the level of detail that we would like to have on the region from decades ago. Of course, the needs of colonial administration were quite distinct from the needs of a linguistic research project of this kind. So, this is to be expected. In other ways, however, they are very valuable since, by virtue of being written, they are not subject to the distortions intrinsic to oral histories.
(This is not to say they are not subject to other distortions, of course, the most prominent being the colonial lens through which they were collected.) Moreover, their written content is readily accessible to a linguist, not being embedded in theoretical concerns specific to some academic discipline.

3.7 Why is what where? We have now considered five different kinds of data: linguistic, ethnographic, archaeological, geographic, and historical/archival. In the context of work on Lower Fungom, this interdisciplinary approach has been intended to help us answer the What’s where why? problem for the languages of the region, and we can begin to sketch out parts of an answer to this question. Why are there so many languages in such a small space? We can see part of the answer lies in the fact that the region has served, in the last 150 years or so, as a refugium, making immigration a source of linguistic diversity. Why are communities typically associated with compact settlements, often on hilltops? This is not an ancient pattern of habitation but, rather, a comparatively recent one triggered by political instability. And, to return to Missong, why is it, on the one hand, clearly a Mungbam variety, while, on the other hand, being the most distinctive variety of the language? As discussed in more detail in Di Carlo & Good (2014), the most likely explanation is not the one that might seem most obvious at first to the synchronically-oriented documentary linguist: That it is the result of “drift”, as one might conclude without an interdisciplinary perspective. After all, our default stance as linguists is to view linguistic divergence as the result of the dispersal of formerly united speaker communities.


The interdisciplinary data, by contrast, quite consistently points to Missong as being a recent formation, despite the extent of its linguistic divergence. The most likely account for this would seem to be that the village represents a relatively new confederation of linguistically distinct groups who devised a common language, using a Mungbam base, in order to signal the construction of a new, Missong political identity. Into this Mungbam base, elements of the other founder languages were likely added in order to create not merely a common language, but also a locally distinctive one. Missong unity would, therefore, be signalled by a common village way of speaking that was not found outside of Missong. The what of the Missong language is, therefore, associated with the where of the village of Missong precisely because Missong is a new village located in a part of the world where the most salient external emblem of sociopolitical independence is the possession of a distinctive “language”.

Of course, this is just the beginning of an answer to the What’s where why? question, and many points remain open for Missong specifically and Lower Fungom more generally. For instance, the lexical and grammatical divergence of Missong from the other varieties of Mungbam alongside the evidence of its recent formation points to some sort of intense language contact as the source for much of its linguistic differentiation, as just indicated. However, we have not located a specific language from which these grammatical differences could have been imported. Even if such a source language is no longer spoken, we might still be able to find a close relative to it, which would help us add crucial details to the Missong story. In a broader Lower Fungom perspective, the open questions multiply: Can we find linguistic evidence to verify Fang and Koshin oral traditions of their having recently immigrated from outside the region (see §3.6)?
How do we reconcile the fact that some Mungbam-speaking villages have traditions of outside origin and others claim to be longstanding inhabitants of Lower Fungom (see Di Carlo & Good (2014))? What are the sources of the apparent linguistic distinctiveness of the Buu variety of the Ji cluster of languages against the closely related varieties of Mundabli and Mufu?

Nevertheless, even at this stage, by setting the linguistic data alongside data traditionally associated with other disciplines, a much more nuanced and informative picture of Lower Fungom diversity has begun to emerge, and our understanding of what makes it linguistically “special” has become deeper. In the next section, I will conclude the paper by trying to generalize on these results and also briefly discuss how this interdisciplinary approach is leading to a more informed ongoing documentary agenda for the region.

4. Extending interdisciplinarity

4.1 A research question driving interdisciplinarity For many, the term “interdisciplinarity” may conjure up a vision of a multidisciplinary team of researchers. However, while the work discussed here involved a medium-sized documentary team (see, e.g., the author list of Good et al. (2011) to get a sense of its composition), the team itself is composed mostly of linguists, with only one member (Pierpaolo Di Carlo) primarily identifying as a practitioner of a different discipline (anthropology). In other words, the project has not been especially “multidisciplinary” (see fn. 3). This has not prevented the research from making use of interdisciplinary data, however. As can be seen, for a part of the world that has seen very little research of any kind, there is quite a bit of such data that can be collected by a non-expert. Even a “lone-wolf” linguist (Crippen & Robinson 2013) could engage in some of the interdisciplinary data collection described without a major alteration of their fieldwork plans, assuming they were able to develop some familiarity with relevant kinds of data to be collected beforehand.

It is also important to bear in mind that one reason why this kind of interdisciplinary analysis has yielded useful results is that it has not been guided by some vague sense that “interdisciplinarity” will automatically result in better documentation, or that pairing a linguist with an anthropologist will somehow create a better grammar and a better ethnography without a dedicated set of research plans to that end. Rather, the work began with a particular research question, Why is Lower Fungom so linguistically diverse? This question expanded data collection outside of the usual bounds of language documentation, on the one hand, while, on the other, narrowing down precisely what kinds of interdisciplinary data would be collected. A full ethnographic account was not necessary, for instance. Rather, the initial focus was on ethnographic features of relevance for reconstructing historical origins and local language ideologies. No attempt was made to do an archaeological “dig”, and only sites which could be identified with basic surface inspection were considered, in order to provide a window into settlement patterns in the last century or so. Geographic investigation did not extend—to pick an example—to the local of landscape elements, privileging instead geographic data of relevance to historical memory.

The focused manner of this data collection allowed for an integration of sources which would have been more difficult had a more general ethnographic, archaeological, or geographic survey been conducted. The downside of this is a degree of thematic narrowness.
If the primary concern is understanding the linguistic situation of Lower Fungom, such narrowness is not a problem in and of itself, since it is tailored towards issues of interest to linguists. At the same time, the obvious drawback is that the data collected may be of less value to scholars outside of linguistics than more general studies would have been. This simply reflects inherent tensions in doing “interdisciplinary” work of any kind. Nevertheless, even “narrowly” collected interdisciplinary data will inevitably open up further interdisciplinary research questions that can help move forward the goal of integrating data from different disciplines to address further unanswered questions.

4.2 Collaborative personalities The nature of most scholarly communities reinforces disciplinarity in numerous ways: via publication cultures, university hierarchies, professional societies, etc. It is not surprising, therefore, that interdisciplinarity will typically require special effort. Moreover, given the numerous institutional structures that build on disciplinary boundaries, it is not surprising that many scholars are not naturally interdisciplinary in their orientation. What this means, in practice, is that successful interdisciplinary research is not merely a function of gathering the right data or having an interdisciplinary question. It also requires scholars who have what could be called, for lack of a better term, an “interdisciplinary temperament” and who enhance such a temperament by becoming informed of disciplines allied to theirs (see also Möhlig (2010: 2)).32

32 In discussing “interdisciplinary temperaments”, I mean, in no way, to disparage scholars who are more strictly “disciplinary” in their temperaments. I take it as a given that disciplines exist primarily due to their effectiveness in achieving important scholarly ends.


In my own case, I was heavily influenced by work like Nichols (1992), which revealed the significance of historical and spatial factors in understanding patterns of linguistic diversity. From this arose my interests in exploring linguistic diversity using interdisciplinary data. While Nichols’ work examined these issues on a global scale, as discussed in Good (2013), the same basic approach could be applied (suitably modified) to much smaller regions, and the concentration of languages in Lower Fungom made it a useful area to consider in this way. To prepare for such research, I was able to make use of traditional ethnographic work on the Grassfields region of Cameroon, such as what is found in Chilver & Kaberry (1968), to gain a better sense of the region’s overall culture and history. This then prepared me to bring in useful insights from more theoretically organized work such as Warnier (1985), Kopytoff (1987), and Zeitlyn & Connell (2003), among others. Historical investigations of Sub-Saharan Africa, such as Vansina (1990), also helped expand my understanding of issues outside of linguistics. These things, in turn, allowed me to more productively interact with Pierpaolo Di Carlo, whose earlier work had focused on the intersection of language and the maintenance of culture (Di Carlo 2010), a central concern to any understanding of What’s where why? in a linguistic context.

Of course, acquiring such interdisciplinary knowledge and honing one’s interdisciplinary temperament takes effort—though, in my own case, I found this to be significantly less effort than acquiring the complex set of skills required to do grammatical analysis of the languages of Lower Fungom in the first place. Perhaps, it was roughly on the order of mastering the technical aspects of language documentation (e.g., archival formats, metadata creation, annotation, etc.)—not insignificant, but not insurmountable either.

4.3 Interdisciplinary extensions I have focused in this paper on the results of interdisciplinary data collection in terms of how they help us understand the linguistic situation of different parts of Lower Fungom. This has resulted in a retrospective orientation. That is, the focus is on what we have already learned. But, we could also take a prospective orientation. How can work of this kind actually alter our research agendas in positive ways, especially in the realm of language documentation?

In our case, an interdisciplinary focus has identified a significant new domain of documentary investigation that has been largely neglected to this point (though the picture has changed in recent years): The documentation of multilingual practices in highly multilingual rural environments in Africa (see, e.g., Esene Agwara (2013); Di Carlo (2015)). As discussed in Di Carlo & Good (2014) (see also Lüpke & Storch (2013) for broader contextualization), there has been a tendency in the language documentation literature to focus on the nostalgic documentation of “ancestral codes” (Woodbury 2011: 177) rather than to try to capture, as closely as possible, the range of lexicogrammatical codes employed by a given individual or group of individuals in a multilingual context. The nostalgic orientation in documentation may be quite sensible in regions like North America or Australia where a dominant colonial language is encroaching on indigenous languages. However, in Sub-Saharan Africa, where multilingualism is the norm and the dynamic deployment of different languages plays a significant and active role in identity construction, it obscures the social reality of language use (Childs et al. 2014: 168–169).

The idea of prioritizing the documentation of multilingual practices in Lower Fungom emerged directly from an insight of our interdisciplinary research regarding local language ideologies. Ethnographic investigations closely considered the relationship between language and identity, and it became apparent that language in Lower Fungom, rather than being viewed as a marker of immutable ethnicity, as is typical of language ideologies built on the “Herderian equation” of language, culture, and nation (see, e.g., Hymes (1968; 1972); Foley (2005)), was locally construed as a marker of more ephemeral sociopolitical configurations. This is seen fairly clearly in the quotation from Buo Makpa Amos given in §3.3 where the use of the Missong variety is linked to “cooperation” among diverse kin groups rather than expressing “Missonghood”, or some related notion. This understanding implied that multilingualism in Lower Fungom is not merely a means to a communicative end but is a significant cultural feature in its own right, allowing people to maintain multiple sociopolitical identities to help gain access to the resources of different groups. As such, any documentation of language use among Lower Fungom communities—and, indeed, the entire Grassfields region where Lower Fungom is found (Warnier 1980)—would be incomplete as a reflection of the local linguistic culture if multilingual practices were not adequately documented.

Interdisciplinarity, therefore, turned out to not merely be a tool to answer questions that were being considered independently but also allowed for the identification of an important new way to enhance the documentary work that is more responsive to Lower Fungom’s sociolinguistic context. It certainly required extra effort, but the payoff has fortunately been quite extensive as well. Moreover, there are additional interdisciplinary horizons to be explored in considering the What’s where why? question for Lower Fungom.
The influence of anthropological ideas on the research should be clear, but the methods of sociolinguistics can clearly also deepen our understanding of the day-to-day linguistic worlds in which the inhabitants of Lower Fungom operate (see, e.g., Esene Agwara (2013) and Di Carlo (2015)). Moreover, by more closely incorporating Cameroonian scholars into the research, we will be able to bring in their insights as well, especially in the domain of applied linguistics, an area where they have more experience than Western language documenters (Childs et al. 2014: 181).


References

Bickel, Balthasar. 2007. Typology in the 21st century: Major current developments. Linguistic Typology 11. 239–251.
Childs, G. Tucker, Jeff Good & Alice Mitchell. 2014. Beyond the ancestral code: Towards a model for sociolinguistic language documentation. Language Documentation & Conservation 8. 168–191. http://hdl.handle.net/10125/24601.
Chilver, Elizabeth M. & Phyllis M. Kaberry. 1968. Traditional Bamenda: The precolonial history and ethnography of the Bamenda Grassfields. Buea: Ministry of Primary Education and Social Welfare.
Crippen, James A. & Laura C. Robinson. 2013. In defense of the lone wolf: Collaboration in language documentation. Language Documentation & Conservation 7. 123–135.
Cysouw, Michael & Jeff Good. 2013. Languoid, doculect and glossonym: Formalizing the notion ‘language’. Language Documentation & Conservation 7. 331–359.
Di Carlo, Pierpaolo. 2010. Take care of the poets! Verbal art performances as key factors in the preservation of Kalasha language and culture. Anthropological Linguistics 52. 141–159.
Di Carlo, Pierpaolo. 2011. Lower Fungom linguistic diversity and its historical development: Proposals from a multidisciplinary perspective. Africana Linguistica 17. 53–100.
Di Carlo, Pierpaolo. 2015. Multilingualism, solidarity, and magic: New perspectives on traditional language ideologies in the Cameroonian Grassfields. In Simone Casini, Carla Bruno, Francesca Gallina & Raymond Siebetcheu (eds.), Plurilinguismo/sintassi: Atti del XLVI congresso internazionale di studi della società di linguistica italiana (SLI)—Siena, 27–29 settembre 2012, Rome: Bulzoni.
Di Carlo, Pierpaolo & Jeff Good. 2014. What are we trying to preserve? Diversity, change, and ideology at the edge of the Cameroonian Grassfields. In Peter K. Austin & Julia Sallabank (eds.), Endangered languages: Beliefs and ideologies in language documentation and revitalization, 229–262. Oxford: OUP.
Di Carlo, Pierpaolo & Giovanna Pizziolo. 2012. Spatial reasoning and GIS in linguistic prehistory: Two case studies from Lower Fungom (Northwest Cameroon). Language Dynamics and Change 2. 150–183.
Esene Agwara, Angiachi Demetris. 2013. Multilingualism in Lower Fungom: Analyses from an ethnographically-oriented sociolinguistic survey. Buea, Cameroon: University of Buea MA thesis.
Evans, Nicholas. 2008. Review of Essentials of language documentation ed. by Jost Gippert, Nikolaus Himmelmann, and Ulrike Mosel. Language Documentation & Conservation 2. 340–350.
Flores, Richard R. 1998. Memory-place, meaning, and the Alamo. American Literary History 10. 428–445.
Foley, William A. 2005. Personhood and linguistic identity, purism and variation. In Peter K. Austin (ed.), Language documentation and description, volume 3, 157–180. London: Hans Rausing Endangered Languages Project.
Gippert, Jost, Nikolaus Himmelmann & Ulrike Mosel (eds.). 2006. Essentials of language documentation. Berlin: Mouton de Gruyter.
Good, Jeff. 2012a. “Community” collaboration in Africa: Experiences from Northwest Cameroon. In Peter K. Austin & Stuart McGill (eds.), Language documentation and description, volume 11, 28–58. London: SOAS.
Good, Jeff. 2012b. How to become a “Kwa” noun. Morphology 22. 293–335.


Good, Jeff. 2013. A (micro-)accretion zone in a remnant zone? Lower Fungom in areal-historical perspective. In Balthasar Bickel, Lenore A. Grenoble, David A. Peterson & Alan Timberlake (eds.), Language typology and historical contingency: In honor of Johanna Nichols, 265–282. Amsterdam: Benjamins.
Good, Jeff, Jesse Lovegren, Jean Patrick Mve, Nganguep Carine Tchiemouo, Rebecca Voll & Pierpaolo Di Carlo. 2011. The languages of the Lower Fungom region of Cameroon: Grammatical overview. Africana Linguistica 17. 101–164.
Goody, Jack & Ian Watt. 1963. The consequences of literacy. Comparative Studies in Society and History 5. 304–345.
Hombert, Jean-Marie. 1980. Noun classes of the Beboid languages. In Larry M. Hyman (ed.), Noun classes in the Grassfields Bantu borderland, 83–98. Los Angeles: University of Southern California Department of Linguistics.
Hymes, Dell H. 1968. Linguistic problems in defining the concept of “tribe”. In June Helm (ed.), Essays on the problem of tribe: Proceedings of the 1967 Annual Spring Meeting of the American Ethnological Society, 65–90. Seattle: University of Washington Press.
Hymes, Dell H. 1972. Linguistic aspects of comparative political research. In Robert T. Holt & John E. Turner (eds.), The methodology of comparative research, 295–341. New York: Free Press.
Katamba, Francis X. 2003. Bantu nominal morphology. In Derek Nurse & Gérard Philippson (eds.), The Bantu languages, 103–120. London: Routledge.
Kopytoff, Igor. 1987. The internal African frontier: The making of African political culture. In Igor Kopytoff (ed.), The African frontier: The reproduction of traditional African societies, 3–84. Bloomington: Indiana University Press.
Lovegren, Jesse. 2013. Mungbam grammar. Buffalo, NY: University at Buffalo Ph.D. dissertation.
Lüpke, Friederike & Anne Storch. 2013. Repertoires and choices in African languages. Berlin: De Gruyter Mouton.
Maderspacher, Alois. 2009. The National Archives of Cameroon in Yaoundé and Buea. History in Africa 36. 453–460.
Maho, Jouni. 1999. A comparative study of Bantu noun classes. Göteborg: Acta Universitatis Gothoburgensis.
Möhlig, Wilhelm J. G. 2010. Towards interdisciplinarity: An introduction based on the experiences of the collaborative research centre ACACIA. In Möhlig, Wilhelm J. G., Olaf Bubenzer & Gunter Menz (eds.). 2010a. Towards interdisciplinarity: Experiences of the long-term ACACIA project, 1–20. Köln: Rüdiger Köppe.
Möhlig, Wilhelm J. G., Olaf Bubenzer & Gunter Menz (eds.). 2010a. Towards interdisciplinarity: Experiences of the long-term ACACIA project. Köln: Rüdiger Köppe.
Möhlig, Wilhelm J. G., Frank Seidel & Marc Seifert. 2010b. In Möhlig, Wilhelm J. G., Olaf Bubenzer & Gunter Menz (eds.). 2010a. Towards interdisciplinarity: Experiences of the long-term ACACIA project, 79–104. Köln: Rüdiger Köppe.
Moran, Joe. 2010. Interdisciplinarity (second edition). London and New York: Routledge.
Ngako Yonga, Monique Doriane. 2013. Ébauche phonologique et morphologique de la langue bu. Yaoundé: University of Yaoundé MA thesis.
Nichols, Johanna. 1992. Linguistic diversity in space and time. Chicago: University of Chicago Press.
Nora, Pierre. 1989. Between memory and history: Les lieux de mémoire. Representations 26 (special issue: Memory and counter-memory). 7–24.
Seidel, Frank. 2010. Layered language genesis in the “catch basin” of the Linyati and Okavango swamps: The case of Yeyi. Sprache und Geschichte in Afrika 20. 231–264.


Seidel, Frank. 2016. Documentary linguistics: A language philology of the 21st century. In Peter K. Austin (ed.), Language documentation and description, volume 13, 23–63. London: SOAS.
Stallcup, Kenneth. 1980. La géographie linguistique des Grassfields. In Larry M. Hyman & Jan Voorhoeve (eds.), L’expansion bantoue: Actes du colloque international du CNRS, Viviers (France) 4–16 avril 1977. Volume I: Les classes nominaux dans le bantou des Grassfields, 43–57. Paris: SELAF.
Tadadjeu, Maurice & Etienne Sadembouo (eds.). 1984. General alphabet of Cameroon languages (bilingual edition) PROPELCA Series No. 1. Yaoundé: University of Yaoundé.
Thieberger, Nicholas (ed.). 2012. The Oxford handbook of linguistic fieldwork. Oxford: OUP.
Vansina, Jan. 1985. Oral tradition as history. Madison, WI: University of Wisconsin Press.
Vansina, Jan. 1990. Paths in the rainforests: Toward a history of political tradition in equatorial Africa. Madison, WI: University of Wisconsin Press.
Warnier, Jean-Pierre. 1980. Des précurseurs de l’école Berlitz: Le multilingualisme dans les Grassfields du Cameroun au 19ème siècle. In Luc Bouquiaux (ed.), L’expansion bantoue: Actes du colloque international du CNRS, Viviers (France) 4–16 avril 1977. Volume III, 827–844. Paris: SELAF.
Warnier, Jean-Pierre. 1985. Echanges, développement et hiérarchies dans le Bamenda pré-colonial (Cameroun). Wiesbaden: F. Steiner.
Watters, John R. 1989. Bantoid overview. In John Bendor-Samuel (ed.), The Niger-Congo languages: A classification and description of Africa’s largest language family, 401–420. Lanham, MD: University Press of America.
Woodbury, Anthony C. 2011. Language documentation. In Peter K. Austin & Julia Sallabank (eds.), The Cambridge handbook of endangered languages, 159–186. Cambridge: CUP.
Zeitlyn, David & Bruce Connell. 2003. Ethnogenesis and fractal history on an African frontier: Mambila–Njerep–Mandulu. Journal of African History 44. 117–138.

Language Documentation & Conservation Special Publication No. 21 (October 2020)
Interdisciplinary Approaches to Language Documentation, ed. by Susan D. Penfield, pp. 72-112
http://nflrc.hawaii.edu/ldc/
http://hdl.handle.net/10125/24944

5

Endangered Language Documentation: The Challenges of Interdisciplinary Research in Ethnobiology

Jonathan Amith

Abstract
In 2004, three national institutes jointly published Facilitating interdisciplinary research, a report that set standards for evaluating the interdisciplinarity of cross-disciplinary collaborations. Although endangered language documentation (ELD) projects often assemble multidisciplinary teams, the 2004 criteria, today followed by the NSF, create such a high bar for interdisciplinarity that it is probably better to evaluate the cross-disciplinary impact of ELD projects through a different criterion: that of service vs. science. According to this perspective, the cross-disciplinary goal of ELD projects should be to decrease reliance on outside provisioning of services while increasing their contribution to the research goals of external disciplines. This article first suggests that ELD projects should actively promote and evaluate the use of project results across disciplines, beginning with greater attention to the archiving process and issues of discoverability and transparency of data. It then explores the potential for the cross-disciplinary impact of ELD ethnobiological research, which has often simply asked taxonomists to identify collected material to species, a service that only marginally benefits biological research agendas. To promote scientific collaboration across disciplines, ELD ethnobiological projects are best designed if they contribute methodologically, substantially, and theoretically to biological research. This article concludes with a description of such an effort.

1. Introduction
In 2004 the National Academy of Sciences, National Academy of Engineering, and Institute of Medicine published a report entitled Facilitating interdisciplinary research, a report that a decade later continues to have a profound effect on National Science Foundation efforts to promote collaboration among disciplines. Indeed, in its “Introduction to interdisciplinary research,” the NSF reproduces verbatim the 2004 definition of Interdisciplinary Research (IDR):33

Interdisciplinary research is a mode of research by teams or individuals that integrates information, data, techniques, tools, perspectives, concepts, and/or theories from two or more disciplines or bodies of specialized knowledge to advance fundamental understanding or to solve problems whose solutions are beyond the scope of a single discipline or area of research practice (emphasis added).

33 The relevant NSF website is at http://www.nsf.gov/od/iia/additional_resources/interdisciplinary_research/index.jsp, accessed Oct. 10, 2014.

Creative Commons Attribution Non-Commercial No Derivatives Licence

The key word in the preceding definition is “integrates,” as it is precisely this integration that is taken to distinguish interdisciplinary from multidisciplinary studies. Figure 1, also from the 2004 report, schematizes the difference between these two terms by graphically distinguishing the optimal directionality of “post-project” research: interdisciplinary research has the potential to forge new disciplinary ventures. Paradoxically then, in the best of circumstances this positive result effectively redistricts academic disciplines and moves the goalposts for future challenges to disciplinary boundaries.

Figure 1.

The report also lauds the benefits of a problem-solving approach to research and the incentive system that this creates: “IDR works best when it responds to a problem or process that exceeds the reach of any single discipline or investigator” (2004: 53). It works less well in environments such as those that predominate in most academic institutions (though not necessarily university-affiliated laboratories, which often operate in collaboration with industrial partners), where the goals are concerned less with collective problem solving and more with an individual’s academic standing within a department, college, or university.

The National Science Foundation, which sponsored the set of presentations that form the basis of the papers included in this special issue, implicitly recognizes the challenges of promoting interdisciplinary collaboration within the constructs of academic criteria by crafting initiatives such as CREATIV (Creative Research Awards for Transformative Interdisciplinary Ventures) and its successor, INSPIRE (Integrated NSF Support Promoting Interdisciplinary Research and Education), in which program officers are empowered to consider and grant awards outside of the normal panel review process. The stated reason for this approach is to avoid the perceived inherent conservatism of panels, averse to high-risk ventures and oriented to academic, discipline-based evaluations of NSF proposals. Rather, these two programs seek to encourage bold, interdisciplinary proposals that, in the politic words of the INSPIRE program solicitation, “some may consider to be at a disadvantage in a standard NSF review process.”

In addition to the criterion of integration, however, multidisciplinary and interdisciplinary projects can also be distinguished by the relative degree to which project participants from different disciplines either provide technical services to a colleague or are motivated to participate by the project’s relevance to their own scientific research agendas. This tension between the service and scientific roles of project participants from distinct disciplines is not uncommon. For example, ethnobiological projects (a research topic that is here explored in detail) rely on the taxonomic expertise of systematists who often gain little more than a voucher specimen of a common species in the “gift for determination” exchange through which herbaria and entomological collections incrementally build up their inventories and biologists can obtain specimens delivered from areas they have not visited. In these situations the taxonomist is at the hub of an exchange of services: he or she provides an identification service to the ethnobiologist while serving as a conduit for specimens that enhance the collection of the host institution and may provide the specialist with vouchers from deficiently covered regions. Description of a new species or even a register that represents a significant range extension to a known species may of course constitute publishable data.34 But even in the case of species new to science, collected inadvertently by a social scientist in the field, the contribution to the career of the gift-receiving taxonomist is minimal.
Indeed, the taxonomist will often prefer to wait to describe the new species until he or she is developing a monographic treatment of the genus to which the new species belongs, a treatment that may take years, or never be completed. The discovery and description of a new species is seldom as significant an event in the career of a systematist as it is in the romanticized public perception of scientific discovery in natural history. Thus the primary foundation of ethnobiological projects, particularly in light of the definitions presented in Figure 1, is more multidisciplinary than interdisciplinary.

A discrepancy between the degree to which a project pursues research goals central to one discipline while relying on the services of another not only affects the degree to which interdisciplinary integration is problematic and deficient but it also affects the degree to which commitment to project goals is equal among all participants and, in regard to funding, the degree to which multiple disciplinary programs will dedicate resources to the joint project. Viewed from a perspective that my own research in ethnobiology has led me to consider: it is easy to imagine that an anthropologist or linguist documenting an endangered language will ask taxonomists to provide determinations to the species level of local biotaxa. It is less easy to imagine a biologist approaching an endangered language (EL) researcher with a request to provide a service of botanical field collections to enhance the systematic study of a given taxon or clade, or even the coverage of a given region. Striking a more equal balance in an EL ethnobiological documentation project between the benefits to the documentarian and to the biologist is not a trivial task. Efforts to achieve such a goal are described in the following sections, culminating in the extensive collaborative project described in §4.3.

34 For example, a field expedition I organized to document Indigenous (Yoloxóchitl Mixtec) knowledge of Hymenoptera (basically ants, bees, velvet ants, and wasps) resulted in the collection of a species in an area well beyond its previously known range and a short publication reporting this event (González et al. 2014).

Before proceeding, however, it is important to note that endangered language documentation may also be at the “service” end of the spectrum in cross-disciplinary projects.35 One service, to biologists, has already been mentioned. Through community-based collaboration, ethnobiological projects are able to extensively collect flora and fauna, often in areas that are poorly explored. Herbaria and museum collections may thus be built up at a relatively low cost and new geographical references and species are often discovered.36 Traditional ecological knowledge, such as observations on habitat and behavior and local uses of flora and fauna, constitutes an important addition to human knowledge of the biosphere.

The most common “service” of documentation, however, is one that relates back to Himmelmann’s (1998) early distinction between documentary and descriptive linguistics: the creation of a representative corpus of “the linguistic practices characteristic for a given speech community.” “In this view,” Himmelmann notes, “language documentation may be characterized as radically expanded text collection” and is “uncompromisingly data-driven” (1998:165). One basic question to be asked, then, is the following: If documentation is to be considered a data acquisition and provisioning activity, how easily can the data provided in documentation efforts be mined for descriptive and theoretical work in linguistics and related disciplines?
For example, although natural speech corpora created by documentarians could provide a basis for a corpus-based approach to research, it is not clear how often either the developer of an EL corpus or linguists who pursue corpus-based studies have used EL corpora for statistically-based lexicosemantic or morphosyntactic research, such as attempting to discover frequency patterns (e.g., collocations, morphological co-occurrences) within the textual transcriptions.

The potential for archived EL corpora to inform theoretical work in linguistics is an issue that has been addressed in at least one project. An NSF grant to Douglas Whalen, in which I was invited to participate, explored whether archived EL documentation could yield an adequate phonetic description of the language in question. Among the questions asked was whether a forced alignment system developed for a major language could be bootstrapped for use on an endangered language, providing a close enough approximation to segmental boundaries that would, in turn, facilitate phonetic analysis. A second question was whether the phonetics of natural (not elicited) speech (the type of speech most extensively collected in documentation projects) could yield material that would correspond to an accurate phonetic description of the targeted language. Traditionally, such descriptions have relied on elicited, careful speech. The answers to these questions are provided in a series of articles (Christian DiCanio was the senior author in all; see references). My experience in this and other projects that attempt to use EL corpora of “the linguistic practices characteristic for a given speech community” has made it clear that creating a fit between the EL data and natural language data-processing tools for descriptive or theoretical linguistic study is not a trivial matter. Deficiencies in field data collection are not easily remedied and often require labor-intensive rectification or enhancement of the original deposit. Recognizing the need to better ensure that archived EL materials be adequate for studies similar to his, Whalen suggested the need to establish a “definition of phonetic norms not only for endangered language documentation, but for documentation of any language.”37 The lack of such norms, covering initial documentation and final accession and archiving, can only adversely affect the ease of using primary documentation materials for future work in descriptive and theoretical linguistics.

35 Cross-disciplinary, or across disciplines, is used as a neutral term, implying neither multidisciplinarity nor interdisciplinarity as described in the paragraphs above.
36 Amith’s ethnobotanical project in the Sierra Nororiental de Puebla has yielded over 125 new state plant registers and his collaboration on Cerambycidae with Steve Lingafelter (Amith & Lingafelter 2016) has documented 13 new state registers in Guerrero and Puebla. Ongoing work with Kevin Williams on has discovered 11 species new to science.

This introduction has briefly presented two perspectives on cross-disciplinary efforts. The first focuses on the integration of disciplines and the potential for lasting impact on future collaborative activities as a distinguishing feature of interdisciplinarity. Implicitly, the research questions addressed in such ventures must be of significance not only to the individual participants but also to the disciplines that they represent. When the NSF establishes interdisciplinary funding initiatives, as it is increasingly doing, the officers of the different programs, despite increased discretionary powers, are in effect responsible for ensuring that the proposal presented is relevant, particularly theoretically relevant, to the discipline and program goals that they represent.

Second, lack of cross-disciplinary relevance can also be interpreted in relation to the science/service polarity mentioned above.
When a collaborator provides a service to a cross-disciplinary project that is not particularly well integrated with that contributor’s scientific research agenda but rather relies on his or her discipline-specific skills, it is unlikely to stimulate resource commitment beyond the principal discipline that drives the project. In addition, when a principal justification for EL documentation is the utility of the material for contemporary or future descriptive and theoretical work, it is fair to ask how best to gather, process, and archive the primary documentation materials so that they can fulfill this goal. This question needs to be posed, and the challenges of using the results of EL documentation projects for work in theoretical and descriptive linguistics, particularly in projects that utilize natural language processing of corpus material, need to be addressed.

Nevertheless, despite a need to discuss the efficiency of EL documentation as a data/service provider for theoretically-based linguistic research, this essay will be limited to an exploration of cross-disciplinary ethnobiological research, research that represents what is basically the inverse situation in which EL documentation requires, at a minimum, significant service provisioning by biologists (although, as noted above, the ethnobiologist also provides a service in the voucher sent as “gift for determination” and the traditional ecological knowledge that may accompany the collection and label). Yet for a sustainable long-term effort, the heavy service needs of the documentarian should be complemented by either integrating the social science research agenda with that of biologists or offering greater services, such as more extensive collections, to these same biologists. The present essay explores several paths to a more integrated and mutually beneficial relation between language documentation and biology: a focus on biosemantics and classification accompanied both by extensive biological fieldwork in particular families of organisms, both flora and fauna, of interest to collaborating biologists (§4.2), and by molecular analysis of local inventories of biotaxa that will simplify and accelerate identification of vouchers to species and will create a permanent material resource of use across many disciplines (§4.3).

37 Abstract of proposal 0966411, “From Endangered Language Documentation to Phonetic Documentation” at http://www.nsf.gov/awardsearch/showAward?AWD_ID=0966411&HistoricalAwards=false, accessed Oct. 22, 2014.

The following section (§2) explores the complexity of Indigenous nomenclature and classification of local flora and fauna. This complexity reflects a wide range of factors, three of which are explored in §2: (a) intricate patterns of correspondence between Indigenous and Western classificatory schemes; (b) intracommunity and speaker-specific variation, particularly in the criteria of category membership; and (c) complicated lexicosemantic relations among Indigenous terms for biotaxa.

This complexity, I suggest, presents unique challenges to cognitively focused ethnobiological research, research primarily concerned with the naming and classification of the natural environment. The argument presented below is that these challenges can best be met by a mutually beneficial collaboration between linguists or anthropologists working on language documentation and biologists interested in a thorough documentation of local flora and fauna. Language documentation, particularly the compilation and analysis of a corpus of conversations about local flora and fauna (as well as related topics in material culture, food, and medicine) offers a seldom exploited tool for understanding how native speakers communicate about the natural environment, often despite significant differences in nomenclature and classification of biotaxa among conversation participants.
Extensive collection of flora and fauna is both necessary to fully understand Indigenous nomenclature and classification and of potential interest to taxonomists and systematists. §3 examines the potential role of language documentation, along with an emerging strategy to engage native communities in documenting their own language and culture, in ethnobiological research. Finally, §4 explores the potential for mutually beneficial collaboration between those engaged in language documentation projects and a wide range of biologists, including molecular biologists.

2. Classification
Ethnobiological projects most often target a single community (although the local settlement pattern may be dispersed), a strategy that reflects both logistical and research considerations. Self-contained multisited projects are rare (but cf. Turner 1973 and, particularly, Martin 1996, for outstanding comparative studies) and most comparative analyses of traditional ecological knowledge rely on accessing the results of several single-sited efforts developed by various researchers over a long period of time. Even when limited to a single community, however, documentation projects that develop the Boasian core of resources (corpus, lexicon, and grammar) are particularly challenged by biosemantics: the nomenclature and classification of local flora and fauna. At a very basic level, difficulties emerge in determining the lexicosemantics of words and combinations of words that reference the domain of natural history. For example, even when the native term has a single prototypical referent in a localized ecosystem, the meaning of this term is often easily expanded (and by different speakers in different patterns) to include other referents, as occurs when a speaker encounters related species from ecosystems different from those of his or her home environment. Moreover, difficulties in translating or glossing Indigenous terms to a Western language are compounded if one considers that biosemantic terminology often references a category or classificatory scheme and not a single referent. Viewed as a classificatory process, ethnographic research on biosemantics quickly shifts from determining the referent of a given term to understanding a divergent set of native opinions about the internal structure and often indeterminate boundary limits of categorical divisions of the natural environment. Time and time again speakers may differ not so much (or not only) in their identification of what might be considered the prototypical referent of a given term but rather in their varied cognitive organization of the environment: categorical strategies for structuring, limiting, and expanding the set of denotata covered by a given term.

Each of the following three subsections addresses factors that contribute to the complexity of describing (in effect, translating) native terms for biotaxa: (§2.1) mismatches between Indigenous and Western taxonomic categories; (§2.2) speaker-specific variation; and (§2.3) complex lexicosemantic relations among terms ostensibly within a single classificatory group.

2.1 The relationship of Indigenous and Western classificatory schemes
The relationship of a given native term to a subset of the natural environment may vary greatly. One of the simplest relations is a single native lemma, well-known throughout the community, that references a cohesive unit, be it (1) a unique terminal taxon or (2) a high-level node in what, viewed from a Western scientific perspective, is a monophyletic taxonomic structure. Examples of the latter that I have encountered include words for ‘earwigs’ (the order Dermaptera) and the parasitic plant known as ‘dodders’ (genus Cuscuta, now in the Convolvulaceae family). The Balsas Nahuatl term mēmēya (from the verb mēya ‘to issue or flow forth [a liquid]’) refers to the latex that characterizes one of four subgenera now included in the genus Euphorbia (family Euphorbiaceae) but previously grouped into a separate genus, Chamaesyce (commonly called ‘sandmats’).38 In cases in which the node referenced by an Indigenous term is designated by a lexical item in Western nomenclature, the definition of the Indigenous term is almost a matter of simple equivalence: sakapahli = genus Cuscuta spp., or mēmēya = subgenus Chamaesyce. Strictly speaking, the native term may be translated as the name for a node in Western taxonomy only if, like this scientific name, it covers all locally present lower-level descendants from the designated node and does not extend to taxa below a different taxonomic node. That is, it must be locally monophyletic as viewed from a Western perspective. In all three preceding cases (earwigs, dodders, and sandmats) this appears to be the case. The morphology of the referents is relatively unique, salient, and easily delimited, although in these examples each Indigenous term corresponds to a different level within Western scientific taxonomy (order, genus, and subgenus).39 Such direct correspondences between taxonomy and nomenclature in two cultural systems are, however, rare. More common are imperfect equations. Some of the most frequent categorical relations between cultures are described below. All of these represent problems of simple L1 = L2 translation.

38 For the phylogeny of Chamaesyce see Yang et al. (2012).
39 For a summary representation of the nature of the correspondence between a native term and Western taxonomic categories and general classes (e.g., by sex, developmental stage) see Ellen (1993:69), Table 1. Nevertheless, while the native terminal category may intersect Western taxonomy at a range of nodes, this does not mean that the relation is monophyletic (includes all regionally extant descendants of that node). To the extent that an Indigenous term does not reference a Western monophyletic category, a simple translation of an Indigenous term to a Western taxonomic node is inaccurate. In such cases the nature of classificatory overlap is a topic for empirical research on category structure and boundaries.

One common classificatory pattern is the ‘extension’40 of a term denoting a fairly cohesive set of organisms to others that share morphological, behavioral, or even functional similarities with the prototypical group. An example of the apparent extension of a category that is generally well recognized and firmly bounded occurred with velvet ants (family Mutillidae) in the Balsas valley (Nahuatl). Over 150 collections of Mutillidae have been consistently named with a single term, ītskwin tiōpixki (lit. ‘the priest’s dog’). Most collections were of several species in the genus Pseudomethoca, all of which were small and greyish black. The prototypical referent, however, seems to be the more strikingly marked large black-and-orange mutillids in the genus Dasymutilla. Once, however, a very knowledgeable Nahuatl-speaking consultant found a spider wasp (Psorthaspis formosa, family Pompilidae) that she categorized as ītskwin tiōpixki along with the dozens of Mutillidae that we had already collected together. At the time she commented on the fact that she had never seen an ītskwin tiōpixki with wings, a comment that suggests that a common (though not necessary and sufficient) characteristic of ītskwin tiōpixki is winglessness. This comment makes sense given that the prototypical referents of ītskwin tiōpixki are velvet ants, a family of Hymenoptera in which females are diurnal, terrestrial, and wingless, while males are winged and nocturnal (and thus seldom seen or noticed). The Pompilidae collected was a non-prototypical ītskwin tiōpixki in having wings, yet similar enough in body form and marking to merit inclusion in this category.
Its designation as an ītskwin tiōpixki, however, was probably ad hoc, a spot decision based on general morphological features despite the noted unusualness of the wings. No Nahuatl-speaking consultant ever mentioned the possibility of a winged ītskwin tiōpixki and most Pompilidae seen were large and metallic black or blue. They were generally considered a type of wasp.41

Another example involves an Indigenous category that is polyphyletic from a Western perspective. I have often found that the nomenclature and classification of both mantids (family Mantidae) and stick insects (order Phasmatodea) is as straightforward as the examples given in the previous paragraph: an Indigenous term corresponds to a named node in Western taxonomy and all descendants readily found in the local ecosystem.42 In the Mixtec community of Yoloxóchitl, however, one term (ko1li4li4) covers insects from both the Mantidae family and the Phasmatodea order.43 But twice during collection trips an assassin bug (family Reduviidae) of the subfamily Emesinae (‘thread-legged bugs’) was encountered. Both times all consultants present designated them as ko1li4li4, a term otherwise exclusively reserved for the jointly categorized mantids and stick insects.
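
The criterion of local monophyly stated earlier in this subsection (a native term translates to a Western node only if it covers all locally present descendants of that node and nothing outside it) can be made concrete with a short sketch. The Python fragment below is only an illustration under stated assumptions: the parent table is a deliberately reduced slice of insect taxonomy, the species-level labels and the local inventory are hypothetical placeholders rather than project data, and the function names are invented for the example.

```python
# Toy child -> parent table of Western taxa. The higher ranks are real, but
# the species-level labels are hypothetical placeholders, not project data.
PARENT = {
    "Dermaptera": "Insecta",
    "Mantodea": "Insecta",
    "Phasmatodea": "Insecta",
    "Hemiptera": "Insecta",
    "Mantidae": "Mantodea",
    "Reduviidae": "Hemiptera",
    "Emesinae": "Reduviidae",
    "earwig_sp_1": "Dermaptera",
    "earwig_sp_2": "Dermaptera",
    "mantid_sp": "Mantidae",
    "stick_insect_sp": "Phasmatodea",
    "thread_legged_bug_sp": "Emesinae",
    "other_assassin_bug_sp": "Reduviidae",
}

# Hypothetical local inventory: species assumed to occur around the community.
LOCAL_SPECIES = {
    "earwig_sp_1", "earwig_sp_2", "mantid_sp", "stick_insect_sp",
    "thread_legged_bug_sp", "other_assassin_bug_sp",
}


def lineage(taxon):
    """Return the taxon followed by each of its ancestors, lowest first."""
    chain = []
    while taxon is not None:
        chain.append(taxon)
        taxon = PARENT.get(taxon)
    return chain


def local_descendants(node):
    """Locally present species that fall under the given node."""
    return {sp for sp in LOCAL_SPECIES if node in lineage(sp)}


def lowest_common_node(referents):
    """Lowest node in the toy table that covers every referent of a term."""
    referents = list(referents)
    if not referents:
        raise ValueError("a term needs at least one referent")
    common = lineage(referents[0])
    for sp in referents[1:]:
        other = set(lineage(sp))
        common = [node for node in common if node in other]
    return common[0]


def is_locally_monophyletic(referents, node):
    """True if the term covers all local descendants of node and nothing else."""
    return set(referents) == local_descendants(node)


# A term for earwigs, and the Yoloxóchitl Mixtec term ko1li4li4 as described
# in the text (mantids, stick insects, and thread-legged bugs).
earwig_term = {"earwig_sp_1", "earwig_sp_2"}
ko1li4li4 = {"mantid_sp", "stick_insect_sp", "thread_legged_bug_sp"}

earwig_ok = is_locally_monophyletic(earwig_term, "Dermaptera")
ko_node = lowest_common_node(ko1li4li4)
ko_ok = is_locally_monophyletic(ko1li4li4, ko_node)
```

Under these assumptions the earwig term passes the test at the node Dermaptera, whereas the only node in the toy table that covers every referent of ko1li4li4 is Insecta, which the term does not exhaust. A simple translation to a single Western node is therefore available in the first case and not in the second.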

40 Extended ranges of terms are noted in many of the classic works by Berlin, Hunn, Breedlove, Laughlin, and others.
41 In Yoloxóchitl, however, several male Mutillidae were caught and consultants invariably identified them by the Mixtec name for velvet ants: ndi3ka’3a3 (lit., ‘panther’, Panthera onca hernandesii) followed by nda3yu4 (‘soup’) or ndu3chi4 (‘beans’), as, depending on their markings, mutillids are said to be omens of what one will find to eat upon returning home: brightly colored mutillids foretell soup, darkly colored mutillids foretell beans. The association of mutillids with omens seems to be a common Mesoamerican belief (e.g., it also occurs in Totonac [David Beck, personal communication], Triqui, and Mazatec).
42 In Yoloxóchitl Mixtec one term covers insects from the Mantidae family and the Phasmatodea order. Given that all mantises collected to date have been in the family Mantidae, it is not certain whether other insects within the order Mantodea would also be classified by the same name, though this probably would be the case. At any rate, viewed together the Mantodea and Phasmatodea orders are not monophyletic; they do not represent all descendants from a common higher-level node. In Balsas Nahuatl the two orders are distinctly named.
43 Some speakers pronounce this as ko1ko1li4li4, or ko14li4li4; tones in Yoloxóchitl Mixtec range from low (1) to high (4), with additional rising, falling, and complex tones.


In the preceding two cases there seems to be no evidence of mimicry.44 The inclusion in a group corresponding to a recognized taxonomic node of species from an unrelated group of insects (in the two examples, Pompilidae and Emesinae) is based on superficial morphological similarity not related to protective mechanisms, such as Batesian and, less commonly, Müllerian mimicry. As expected, however, the inclusion of mim- icsas extensions to unrelated groups is not uncommon in native classificatory systems. Thus the Yoloxóchitl Mixtec name yo3ko2 lu3tu3 (lit., ‘wasp narrow’) is prototypically Parachartergus apicalis, a swarming, aggressive wasp with white-tipped wings.45Ac- cording to several consultants, the Mixtec name refers to the narrowing of the eyes of a person who has been attacked by these insects, though it is perhaps more likely that the name refers to the tapered form of their nests.46 Yet Mischocyttarus deceptus, also with a white-tipped wing, is also invariably labeled a yo3ko2 lu3tu3. As P. apicalis is aggressive and M. deceptus is not, it is clear from the behavioral description that the former is the prototypical referent, as this species manifests both the morphological and behavioral characteristics that speakers associate with wasps identified by the term yo3ko2 lu3tu3.47 A similar case is that of another aggressive wasp, yo3ko2 ndia’14na3(lit., ‘mask wasp’), which has been used by native Yoloxóchitl Mixtec speakers to designate three species: septentrionalis, Montezumia mexicana, and Montezumia azteca. The name refers to the nest of the first, S.septentrionalis, a “mask-like” flattish structure pasted onto tree trunks. The two Montezumia are superficially similar toS. septentrionalis though they are non-aggressive and make nests that do not resemble masks. In these two cases (involving Parachartergus apicalis and Synoeca septen- trionalls) the expected behavior or meaning of the name indicates which of the collected species is the prototype. 
In another instance, still pending further study, a

44 The effect of mimicry on the relationship of Indigenous and Western scientific categories is further discussed in Amith & Lingafelter (2016), where it is noted that Batesian mimicry of certain Cerambycidae creates a situation in which the Indigenous term references a category that from a Western scientific perspective is paraphyletic. 45 On the aggressivity of P. apicalis, see O’Donnell &Hunt (2013), one of two swarm-founding wasps discussed, including another of the Agelaia genus. Mimicry is well described in another article (O’Donnell &Joyce (1999: 502)):

Swarm-founding Neotropical (tribe ) often have large, aggressively defended colonies (hundreds to ten thousands of adults), and are often mimicked by other eusocial wasps. In contrast, the largely Neotropical genus Mischocyttarusis characterized by species with independently founded small colonies (several dozen adults) of relative- ly non-aggressive wasps. Many Mischocyttarusspecies apparently mimic other vespid wasps (Richards 1978). At least some Mischocyttarusspecies wasps are capable of stinging humans. However, Mischocyttaruswasps are often reluctant to sting humans even in nest defense, and individuals of most species we have observed remain immobile or even flee their nest when disturbed (S.O’D[onnell]. and F. J. J[oyce]., pers. obs.). Because they possess stings but are relatively docile, Mischocyttarus species may mimic other Vespidae through a combination of Batesian and Müllerian processes.

The docile nature of Mischocyttarus is captured by the name for M. rufidens and, less often, M. mexicanus in Oapan (Nahuatl): īxtēmpā́ya, literally ‘fuzzy eyed’ in reference to the fact that one can position one’s hand close to their nest without suffering attack, allegedly due to their poor sight. In Yoloxóchitl one consultant also described a yo3ko2 ma’3a4, literally ‘wasp docile’ (the word ma’3a4 is also used to mean ‘cuckold’) characterized by the fact that “one can take its nest away without being stung.” This wasp has not been collected but the described behavior suggests a Mischocyttarus. Finally, James Carpenter (personal communication) pointed to the original description of Polybia decepta (syn. M. deceptus) and noted: “There are several species of Mischocyttarus that mimic species in the Parachartergus apicalis species group, including deceptus and socialis; the other species is imitator. Fox did not give an etymology in his description of Polybia decepta [note: Synonym of M. deceptus] but did remark on the similarity in coloration to Parachartergus apicalis.” Indeed, in the original article (Fox 1895:269–70), in describing the then new species P. decepta, William J. Fox notes: “Its similarity in color to Chartergus apicalis [Note: now a synonym of Parachartergus apicalis] is really remarkable.”
46 For the nest of P. apicalis, see Dejean (1998).
47 Rosch & Mervis (1975:575) define prototypicality as follows: “members of a category come to be viewed as prototypical of the category as a whole in proportion to the extent to which they bear a family resemblance to (have attributes which overlap those of) other members of the category.”

ti1mi3i4 (derived from ti1- ‘prefix for animals’ and mi3i4 ‘solitary’) was frequently characterized as a highly aggressive solitary bee. Yet the two species collected with this name (Eufriesea mexicana and Eulaema polychroma, both Euglossini [orchid] bees) are not particularly aggressive. This suggests that another bee still to be collected might be the prototype, combining morphological and behavioral characteristics that match the consultant descriptions.

As suggested by the ti1mi3i4 case, mismatches between morphology and behavior often indicate definitional complexity and problematic field identification by consultants. Such mismatches should motivate additional fieldwork. In the Nahuatl-speaking community of Oapan, a revealing mismatch occurred with an insect called yēlōtlapṓhwikātsīn, literally ‘the one that incenses green ears of maize’. For some time I had collected several yēlōtlapṓhwikātsīn, all various nondescript small Diptera, which didn’t make much sense in terms of expected behavior. The yēlōtlapṓhwikātsīn was described as a small insect that during the month of September is found hovering around green ears of maize (yēlōtl), as if incensing the plant. (Indeed, at the same time of year, campesinos go to their fields to leave an offering, incensing the field as they do.) One time, however, I collected a hoverfly (Syrphidae) also said to be a yēlōtlapṓhwikātsīn. It was Toxomerus politus, a monophagous insect that feeds only on the pollen of Zea mays and thus is often found hovering around immature maize as it develops. This was clearly the prototypical referent of the Nahuatl term. The previously collected small Diptera were the result of confusions, perhaps induced by a pestering ethnobiologist asking questions in the wrong context.
The preceding discussion was presented not only to demonstrate the complexity of determining Indigenous biosemantics but to problematize the taxonomic treatment of native terms for local flora and fauna. It may be, of course, that the complexities of category inclusion, structure, and delimitation are of relatively minor interest for a given study. Translations to a single term are simplifications, and caveat expressions to cover both polyphyletic and paraphyletic exceptions to one-to-one correspondence between Indigenous and Western scientific taxa beg certain questions. Likewise, intensional (necessary and sufficient conditions) definitions are misrepresentative while extensional (list of category members) definitions are interminable and uninformative. Greater promise is presented by ostensive definitions, a scattering of illustrative examples, though these would best be accompanied by varied (but not “necessary and sufficient”) criteria by which category membership is evaluated (such as the long antennae and tree girdling behavior of Cerambycidae) and an encyclopedic discussion of the relevant denotata.48 In all these cases, however, an extensive understanding of local flora and fauna is necessary, an understanding that can only be achieved through the collaborative support of many biologists to which must be added the sensitivity of the ethnobiologist to the details of lexicosemantic description.

48 An excellent example of such a discussion is in the Kalam-English dictionary (not consulted for this article), an entry from which is given in Pawley (2001:237–8). Cf. also Ellen (1993:154–7) for the Nuaulu term kauke. Si (2011:178) mentions the value of ostensive definitions for natural history terms, including biotaxa.


2.2 Speaker-specific and intracommunity variation in nomenclature and classification

For many reasons, knowledge of the nomenclature, classification, and use of local flora and fauna is unequally distributed within native communities. Obviously there is a learning curve to traditional ecological knowledge: as with all types of information, children learn as they grow. Natural history knowledge may also be unequally distributed by sex, by the activities in which different social groups engage (many of which are tied to gender), and by speaker-specific aptitudes for natural history. Variations along these criteria are common and are occasionally reflected in lexicographic treatments, such as when specific head words are marked by register or as specialized terminology.

Another source of speaker-specific variation relates to the mechanisms by which consultants apply biosemantic nomenclature beyond a prototypical referent. Such extensions often lead to divergent classificatory schemes within a community. A clear example from my fieldwork involves two spiny Solanum plants classified differently by three consultants from San Miguel Tzinacapan (altitude 865 meters). The first Solanum is S. rudepannum Dunal, a medicinal plant called itskwinpahwits (lit., ‘dog-medicine thorn(y)’) that is common and extremely well known in mid- and low-altitude villages (under 1000 meters). One day, however, three consultants and I went to a highland village where we came across, at 1485 meters, Solanum chrysotrichum Schltdl., a species superficially similar to S. rudepannum. Each consultant responded differently when asked the name of this second Solanum, which is found mostly at altitudes greater than 1000 meters and which is thus absent from the lands of their home village. One stated that he didn’t recognize the plant; another gave its name as itskwinpahwits, while recognizing that it was morphologically distinct from the itskwinpahwits found near Tzinacapan.
A third consultant named it itskwinpahwits itahtāy (‘itskwinpahwits its look-alike’).49 Although probably none of the speakers had previously seen S. chrysotrichum, they recognized it as a plant close but not identical to S. rudepannum, which they all knew well. The first consultant had a general tendency to limit referents to the prototype (he demonstrated the same naming strategy for tālāmat, prototypically the medicinal Desmodium caripense Kunth; he was reluctant to call other short, low-lying Desmodium, which most speakers do consider tālāmat, by this same name). The third consultant generally tried to extend categories as widely as possible, a classificatory strategy that was frequently commented on by other, more conservative consultants, some of whom nicknamed him ‘itahtāy’ (‘the look-alike’). Finally, the classificatory scheme of the second consultant seems to indicate a distinct categorical structure: for him the nature of the itskwinpahwits name/category was potentially polytypic. This flexibility in categorization was distinct from the categorical structure of the other consultants who excluded the highland plant from the itskwinpahwits category either definitively (first case) or lexically, establishing a boundary by using the term itahtāy.

Functional considerations of flora and fauna may also vary not only by speaker but also by context of discussion or elicitation. Nomenclature is affected by a continual shift in many endangered language communities from local sources of material culture and medicine to commercial substitutes. As this occurs, many of the utilitarian benefits of plants become remembrances of things past, although functionally descriptive names may

49 The utilization of a term effectively meaning ‘look-alike’ is not uncommon in Mesoamerican biosemantic terminology, e.g., Yoloxóchitl Mixtec has the word ta1ni1 and Balsas Nahuatl itlahtlāk (and cognate forms).

survive frozen in nomenclature even though the utility is now absent from daily activity. Thus in Yoloxóchitl only one elderly consultant knew the Mixtec name for Bletia coccinea Llave et Lex.: i3ta2 nda1ka1, which he translated as ‘flower glue’, a reference to a use of this plant that dates to prehispanic times. The three other speakers present not only did not know this name, but they had never heard the word nda1ka1, an archaic term for ‘glue’, which is now referenced by a Spanish loan.

Speakers may also deny a certain functionality in an interview situation (e.g., the desirability of a given wood for fenceposts), a denial that is belied by actual use. The discrepancy between discourse and actual use manifests the problematic results that normative elicitation statements may create. Thus, when asked, Oapan Nahuatl speakers (Balsas Valley) did not name chikomolin (or chikimolin; Leucaena matudae (S. Zarate) C. Hughes) as particularly good for fence or house posts. Nevertheless, a non-quantitative assessment suggests that this species is indeed commonly so used. In this case, the disjunction between normative descriptions of utility and actual use reflects the geographical distribution and relative scarcity of preferred species (e.g., Comocladia spp.): Leucaena matudae is abundant, particularly in the lower altitudes near the river valley village.

The preceding examples briefly illustrate some patterns of local variation in nomenclature and classification: speaker-specific interpretations of category membership, normative statements of utility, and the implications of antiquated uses frozen in nomenclature and, at times, the imagination. Such variations are facilitated by the nature of floristic and faunal categories in Indigenous communities: referents are classified together even though they do not share one or more necessary and sufficient conditions for category membership.
Rather it is a series of overlapping attributes, family resemblances in Wittgenstein’s original formulation, that establish links among the referents of a category term. And it is precisely this feature of categorization that has motivated the approach taken in my research: the most interesting lexicosemantic and cognitive problems are found in those categories of the local floristic and faunal environment that have the most potential named members. Moreover, the clearest way to understand the characteristics at the foundation of the family resemblances is natural discourse, particularly recorded and transcribed conversations that can be electronically searched and analyzed (a basic goal of language documentation), accompanied by an exhaustive inventory of species present (a basic goal of floristic studies).
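The family-resemblance view can be illustrated with a small computational sketch. The Python fragment below scores each referent of a category term by summing, over its attributes, the number of category members that share each attribute, loosely following the measure of Rosch & Mervis (1975) cited above. The attribute sets are simplified assumptions taken from the discussion of yo3ko2 lu3tu3, and the function name family_resemblance is hypothetical; in an actual study the attributes would be drawn from transcribed natural discourse rather than assigned by the analyst.

```python
from collections import Counter


def family_resemblance(members):
    """Score each member by its attribute overlap with the whole category.

    `members` maps a referent name to a set of attributes. Each attribute
    is weighted by how many members carry it, and a member's score is the
    sum of the weights of its own attributes.
    """
    weights = Counter(attr for attrs in members.values() for attr in attrs)
    return {name: sum(weights[a] for a in attrs) for name, attrs in members.items()}


# Illustrative assumption: two attributes speakers associate with the term,
# coded by hand from the prose description (not from a real corpus).
yoko_lutu = {
    "Parachartergus apicalis": {"white-tipped wings", "aggressive"},
    "Mischocyttarus deceptus": {"white-tipped wings"},
}

scores = family_resemblance(yoko_lutu)
prototype = max(scores, key=scores.get)
print(scores)
print(prototype)
```

With a realistic attribute inventory the scores would be graded rather than binary, which is the property that makes the measure suited to categories lacking necessary and sufficient conditions.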

2.3 The lexicosemantics of biotaxa

A final consideration in the lexicographic treatment of biosemantic categorization is the challenge of polysemy and the hierarchical nature of meaning, specifically in regard to hypernymy and hyponymy, within the semantic domain of flora and fauna. The consideration of these types of semantic relations is particularly important given that Paul Taylor, one of the few ethnobiologists who has stressed a linguistic approach to biosemantics, has suggested (1990:46–47) that co-hyponymy is one strategy for discovering covert categories:

The method of co-hyponymy consists essentially of identifying a set of terms that can be shown to directly contrast in at least one of their senses, but which have no superordinate term to label the entire set. Having posited a FLORAL FORM domain by this method, we still have not resolved the problem of the boundaries of the domain, although it must minimally include the full range of the three subordinate terms [Note: in this case ‘tree’, ‘vine’ and ‘herbaceous weed’] on whose basis the FLORAL FORM class was posited. To more directly establish the boundary of the FLORAL FORM domain, we may turn to the method of “definitional implication.”
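The first step of the method quoted above can be sketched in a few lines of Python. The fragment takes sets of directly contrasting terms and flags those for which no single labeled superordinate is recorded. The data structures, the function name unlabeled_contrast_sets, and the English glosses standing in for Taylor's terms are all assumptions made for illustration; the second contrast set uses ordinary English tree names as a labeled control.

```python
def unlabeled_contrast_sets(contrast_sets, hypernym_of):
    """Return the contrast sets that lack one shared, labeled hypernym.

    `contrast_sets` is a list of tuples of terms that directly contrast.
    `hypernym_of` maps a term to its labeled superordinate, if any.
    A set is a candidate covert category when its members do not all
    point to the same recorded superordinate term.
    """
    candidates = []
    for terms in contrast_sets:
        labels = {hypernym_of.get(term) for term in terms}
        has_shared_label = len(labels) == 1 and None not in labels
        if not has_shared_label:
            candidates.append(terms)
    return candidates


# Hypothetical lexicon: only the English control set has a labeled hypernym.
hypernym_of = {"maple": "tree", "oak": "tree", "pine": "tree"}
contrast_sets = [
    ("tree", "vine", "herbaceous weed"),  # glosses from the bracketed note
    ("maple", "oak", "pine"),
]

candidates = unlabeled_contrast_sets(contrast_sets, hypernym_of)
print(candidates)
```

The sketch only identifies candidates. Whether a flagged set corresponds to a culturally salient covert category, and where its boundaries lie, remains an empirical question.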

In practice, however, the discovery of co-hyponymy is often problematic. One example illustrates this point. Balsas Valley Nahuatl has a series of terms that refer to distinct species of Formicidae. A list of prototypical referents of the basic terms is given in Table 1.50 Many names are extremely well known, even to children.51 The hierarchical organization of the Formicidae domain is, however, neither obvious nor consistent across speakers.

Table 1. Formicidae in Oapan Nahuatl

| Oapan Nahuatl term | Meaning | Prototypical referent |
| --- | --- | --- |
| āskatl | āskatl ‘?small ant’ | Solenopsis xyloni |
| chīchīltik | chīchīltik ‘red’; ‘red small ant’ | Solenopsis geminata |
| yo:n kwextikeh | ‘the very small ones’ | |
| yo:n we:imeh | ‘the large ones’52 | |
| kwitlayā́k molōnki | kwitla- ‘excrement’; (i)yāk ‘fetid’; molōnki ‘smelly’; ‘foul-smelling ant that smells bad’ | Forelius damiani |
| kwitlayā́k xmolōnki | x- negation; ‘foul-smelling ant that doesn’t smell bad’ | Paratrechina longicornis (and probably Forelius pruinosus) |
| tsontetl | (cf. kowtsontetl: kow ‘tree’ + tsontetl ‘? immobile’; kowtsontetl ‘stump (of a tree)’) | Atta mexicana |
| yo:n kwitlayoh | ‘the one with visible fungus dumps’ (literally, ‘the excrementy one’) | |
| yo:n xkwitayoh | ‘the one without visible fungus dumps’ | |
| māwēweyak | mā- ‘arm’ or ‘hand’; wēweyak ‘long (pl)’; ‘long-armed’ | Aphaenogaster ensifera |
| panochēroh | < panocha (Spanish) ‘hardened brown sugar cake’ + Spanish agentive (-ero) | Camponotus atriceps (winged female) |
| yewaltsīkatl | yewal ‘night’; tsīkatl ‘ant’ | Camponotus atriceps (worker) |
| kōlōtsīkatl | kōlō- ‘scorpion’; tsīkatl ‘ant’; ‘scorpion ant’ (probably for its painful sting) | Pseudomyrmex major (and perhaps Pseudomyrmex gracilis) |
| kowtsīkatl | kow- ‘tree’ or ‘wood’; tsīkatl ‘ant’; ‘tree ant’ | Camponotus rubroniger |
| tēkwāntsīkatl | tēkwān(i) ‘one that bites’; tsīkatl ‘ant’; ‘biting ant’ | Pogonomyrmex barbatus |

50 A more detailed discussion of other possible references of the Nahuatl terms is beyond the scope of this article.
51 There are two primary schools in Oapan and the uniforms children wear to each are distinct: red-checkered shirts for one and black-checkered shirts for the other. From the first grade, children learn the nicknames for students from each school: tēkwāntsīkatl (‘red harvester ant’) for the red-shirted students and kwitlayā́k for the black-shirted students.
52 Probably the size refers to different castes within a species, not different species.

In English, relations of hypernymy and hyponymy among series of biosemantic terms are often widely shared among speakers. Thus maple, oak, pine are co-hyponyms of the hypernym tree; dog, cat, seal are co-hyponyms of the hypernym mammal; fire ant, carpenter ant, red harvester ant, crazy ant are co-hyponyms of the hypernym ant. In many languages (and even in English to some degree), however, these relations of hyponymy/hypernymy are more nuanced.

For example, in many Indigenous languages an ostensible hypernym is often a generic co-hyponym (residual term) of more specific terms, a relationship that is discussed in more detail in §3. Thus tsīkatl (loosely translatable as ‘ant’) is not inarguably the hypernym of the Oapan Nahuatl terms listed in Table 1. A more accurate translation might be as a residual category of unnamed and relatively uncommon ants, though even this translation must be contextualized.53 Speakers seem to find a question such as “Is a kwítlayāk a type of tsīkatl?” somewhat perplexing. The speakers that were asked such questions, however, seemed most inclined to categorize as tsīkatl ants with compound names that include the lemma tsīkatl (e.g., kowtsīkatl) and less inclined to include those ants (e.g., māwēweyak, tsontetl) whose compound names do not include this lemma. No speaker, however, was willing to consider an āskatl (Solenopsis spp.) a type of tsīkatl. Thus there is a potential contrast set, āskatl ~ tsīkatl, with the former limited to Solenopsis spp. and the latter term representing a category, with a complex internal structure and unclear boundaries, of progressively more inclusive relations in different discourse contexts. An ethnoentomologist could certainly create a situation in which a native speaker would group Solenopsis spp. and Camponotus spp.
together (e.g., triad card-sorting) but the key criteria and justification for asserting the cultural relevance and cognitive saliency of such an association should be taken from a less artificial environment.

53 Residual categories are mentioned frequently in the literature (Hays 1979: 257; Hunn 1976:511 ff., 1977: 281 ff. and passim., 1982; Taylor 1990:64–65). Hunn (1976) cites Fowler & Leland (1967) who also look at residual categories (though they do not use this term) in Northern Paiute ethnobotanical classification. A further discussion of residual categories is in §3 below.


It is also possible that in learning the nomenclature and classification of Formicidae, children begin by using the residual term of knowledgeable adults (tsīkatl) and gradually carve out portions of the domain in learning the terms listed in the first column of Table 1, though my impression is that the term āskatl is learned at a very young age. Unfortunately I am not aware of any research in native speaker communities on the development of nomenclature and classification of flora and fauna in early childhood, though different biosemantic domains are undoubtedly learned in different orders.54 An American child would probably become familiar with ‘bird’ (equivalent to a class term, Aves, or subclass term Neornithes) before the lower level term ‘robin’; with ‘ant’ (equivalent to a family term, Formicidae) before ‘leaf-cutter ant’, but with ‘dog’ (equivalent to a subspecies term Canis lupus familiaris) before any of the higher taxa. That is, the learning of classificatory relations can begin from the higher or lower taxa and the order of learning might well affect the nature of categorical structures as well as reflecting the degree of human involvement and “cultural saliency.”

The second linguistic tool for determining cognitive categories is what Taylor (1990:47) calls “definitional implication, through which terms applying to animals or plants are used to derive covert classes such as the implied class of subjects of verbs such as ‘tweet’ or ‘chirp’ or those animals that possess a ‘beak’.” Taylor suggests that covert classes can be posited when a term relating to a plant or animal cannot be defined without reference to this covert category and “alternative definitions cannot suffice to define the term in question. It is insufficient to argue that, because terms like ‘leaf’ or ‘wing’ apply only to plants or animals, they presume the existence of a PLANT or ANIMAL class” (1990: 49). Taylor was not the first to utilize a linguistic approach to category definition.
Berlin in his early work noted the value of numeral classifiers in setting the limits of a semantic domain. He was referring to what Grinevald (2000) has called mensural classifiers (such as ‘herd’ in English). Berlin & Kimball (1964) give examples of such classifiers and at one point present a list of nouns that occur with either of two such terms that communicate ‘aggregations of globular-shaped objects’ (b’uhs indicating horizontal extension, ‘spread out’, and t’ol indicating vertical extension, ‘piled up’). Objects that can be so arranged include corn kernels, coffee beans, peanuts, chili peppers, stones, pieces of corn dough, and eggs, among other items. Again, the list is not properly considered taxonomic (although the class may have an internal structure of core and periphery) but is rather an ad hoc collection of referents that do not correspond to a class that exists apart from the linguistic criteria that delimit it.

Definitional implication, however, is hard to operationalize. The first difficulty is conceptualizing the formal structure of a definition that references a covert (unnamed) category. For example, the online OED gives the first definition of ‘beak’ as: “The horny termination of the jaws of a bird, consisting of two pointed mandibles adapted for piercing and for taking firm hold: a bird’s bill.” If we pretend that English were to lack a lexical item (i.e., ‘bird’) covering the avifauna, it is not easy to imagine a definition that would go beyond an ostensive definition employing a set of illustrative examples: “The horny termination of the jaws of robins, sparrows, wrens, hawks, penguins, and similar animals,

54 In Balsas Nahuatl the first term children learn is the “baby-speak” wīwih (from wiyōni, intrans. v., ‘to wiggle’) loosely translatable as ‘bug’ or ‘creepy crawler’.

consisting of two pointed mandibles adapted for piercing and for taking firm hold.” Even if an extensional definition is used, the list of members of the category of “beak-possessing” animals would include tortoises and turtles as well as birds.

An important caveat of definitional implication and subject/possessor association with a given term is that the semantics of most terms results from the context of utterance, both associated words and social situation, a dependency of meaning precisely expressed by Firth’s famous admonition: “You shall know a word by the company it keeps” (1957:11; cf. also Fillmore’s perspective in his frame semantics). Thus ‘bird’ in collocation with ‘chirp’ has a different set of referents as compared to ‘bird’ in collocation with ‘feathers’. The latter covers all avifauna while the former covers only a subset, a subset that is not lexically expressed in any commonly used term except, perhaps, songbird. Thus an inductive approach to categorization from a corpus of non-generic feathered subjects of ‘sing’ would probably yield something close to ‘songbird’ (suborder Passeri) although the boundaries and internal structure of this category (feathered agents of the activity ‘sing’) would undoubtedly vary across speakers. Such a category definition could be approximated with an extensive corpus theoretically leading to, if not an inductive definition, then at least an ostensive one.

Given this contextual limitation, Taylor’s example should thus be regarded with caution: “bird” is “the implied class of subjects of verbs like tweet or chirp (compare hoot and its implied subject owl); it is also likely to be found in a definition of beak, (to) perch, or feather” (1990: 47). With the exception of ‘feather’, the possessors or subjects of ‘tweet’, ‘chirp’, ‘hoot’, ‘(to) perch’, and ‘beak’ do not constitute a class coterminous with ‘bird’ or, in the case of ‘hoot’, ‘owl’.
‘Hoot’ is commonly associated with ‘owl’ but not all owls hoot. Thus the voice of a crested owl (Lophostrix cristata stricklandi) is described as: “A deep, throaty, slightly frog-like, emphatic growl, ohrrrr or gurrrr, at close range a rapid, stuttering introduction may be audible g’g’g’g’grrrr, repeated every 5–10 s” (Howell & Webb 1995:359). Not all birds (e.g., buzzards, eagles) ‘tweet’ or ‘chirp’. Indeed, the Oxford English Dictionary (online) defines the intransitive verb ‘chirp’ as follows: “To utter the short sharp thin sound proper to some small birds and certain insects” (emphasis added). The phrase “some small birds” clearly indicates that chirping is an activity limited to a subset of birds.55 The “certain insects” would seem to be limited to those in the Orthoptera order and Cicadidae family. Indeed three terms (including two mentioned by Taylor), ‘feather’, ‘perch’, and ‘sing’, intersect the natural world at three different nodes: Aves (class); Passeriformes (order); Passeri (suborder), respectively.56

It should be clear that in most of the above cases the definiendum does indeed point to a covert category but usually not to a category that is in any way coterminous with well-known biotaxa or named objects and categories of daily import. In some cases (‘feather’ [avifauna], ‘body hair’ or ‘milk’ [mammals]), a single term might indeed be a necessary and sufficient condition for category delimitation. But in most cases the categories established by the means presented in the previous paragraphs (e.g., the potential subjects of ‘chirp’, the possessors of ‘beaks’, the objects that can be arranged in a manner

55 In a similar manner the set of referents of Spanish pájaro is not equivalent to that of ‘bird’, a frequent translation, but rather a set that is somewhat fuzzy but that may be closer to ‘songbird’.
56 For a similar type of analysis in regards to verbs, see Levin (1993: 2) who notes that “native speakers can make extremely subtle judgments concerning the occurrence of verbs with a range of possible combinations of arguments and adjuncts in various syntactic expressions.”

indicated by b’uhs, the owls that ‘hoot’) reflect cognitive categories that, at least from a biotaxonomic perspective, are, if not arbitrary, then at least not those commonly discussed in the ethnobiological literature.
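The argument of the preceding paragraphs can be made concrete with a short Python sketch that derives the class implied by each term from a list of attested pairings and then checks whether that class is coterminous with a named category. The observation pairs below are invented for illustration (they are not drawn from any corpus), and the names implied_classes and birds are hypothetical; the choice of referents simply mirrors the examples discussed in the text (turtles with beaks, insects that chirp, raptors that do not).

```python
from collections import defaultdict


def implied_classes(observations):
    """Group referents by the term they were observed with.

    `observations` is an iterable of (referent, term) pairs, where the
    term is a verb the referent is subject of or a part it possesses.
    """
    classes = defaultdict(set)
    for referent, term in observations:
        classes[term].add(referent)
    return dict(classes)


# Hypothetical toy observations; a real study would extract these
# from a transcribed and searchable corpus of natural discourse.
observations = [
    ("robin", "feather"), ("sparrow", "feather"),
    ("eagle", "feather"), ("crested owl", "feather"),
    ("robin", "chirp"), ("sparrow", "chirp"), ("cricket", "chirp"),
    ("robin", "beak"), ("sparrow", "beak"), ("eagle", "beak"),
    ("crested owl", "beak"), ("turtle", "beak"),
]

# The named category against which each implied class is compared.
birds = {"robin", "sparrow", "eagle", "crested owl"}

classes = implied_classes(observations)
coterminous = {term: members == birds for term, members in classes.items()}
print(coterminous)
```

In this toy data only ‘feather’ picks out exactly the named category, while ‘chirp’ and ‘beak’ each yield a class that cuts across it, which is the pattern the discussion above describes.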

2.4 Summary

This section has focused on the complexity of ethnobiological classification from a wide range of perspectives: (1) the ways in which Indigenous nomenclature and classification intersect with Western taxonomic hierarchies and the problems this creates for a precise lexicosemantic treatment of native terms; (2) the family resemblance nature of Indigenous biosemantic categories and the possibility this affords for significant intra- and intercommunity variation of the boundaries and internal structure of native categories; and (3) the complexity of lexicosemantic relations among biotaxa nomenclature and classification including the manner in which purely linguistic criteria can create cognitive categories that are not coincident with the discontinuities in nature that are often cited as the basis for the boundaries mentioned in ethnobiological studies of nomenclature and classification. The argument presented below is that the preceding cognitive issues can best be addressed through close collaboration between documentary linguists concerned with how speakers talk about the natural environment (§3) and biologists intent on describing its taxonomic complexity (§4).

3. The contributions of endangered language documentation and corpus linguistics to the lexicosemantic treatment of the nomenclature and classification of biotaxa57

Linguistics, and particularly documentary linguistics, has significant potential to facilitate understanding of biosemantics: the meaning and use of terminology relating to flora and fauna. In addition to the identification of basic nomenclature and variation, discovery methods for taxonomic hierarchies and covert categories, the latter a term that Berlin has used to indicate cognitive categories that are not labeled lexically, have been continually debated in the ethnobiological literature. Ellen, who has explored the problems of classification in most detail and with the greatest insight, notes the problems of formal elicitation: “While controlled elicitation has the considerable advantage of generating large amounts of data quickly for the purpose of quantitative and comparative analysis, it is often the case that the ethnographer is demanding tasks which might otherwise never be performed” (1993: 25).58

57 Language documentation efforts to record native natural historians’ narratives on the environment (including flora and fauna, geography and landscape, material culture, hunting and fishing) have multiple benefits. They produce material of great use to communities and document an endangered realm of knowledge often incompletely, if at all, documented in endangered language research projects. This point is most elegantly expressed by Si (2011, 2016). While fully in accord with his perspective, and the necessity of encyclopedic entries for natural history referents, the pages below offer an additional value to documenting natural history: the creation of large sets of material that can be analyzed following the methodology of corpus linguistics.
58 Ellen (1993:25) cites in this regard Hays (1976) and Healey (1978–9). Brown (1974) and Taylor (1984, 1990) make the same point and note that formal elicitation tasks are activities quite divorced from everyday communicative activity. Taylor (1990:44) notes that besides the use of a numerical classifier in Tzeltal for plants, the other techniques mentioned by Berlin, Breedlove, & Raven (1968) for determining a “unique beginner” are all “based on tests for perceived similarities among organisms.” Hays’s (1976) methodology for determining covert categories is also not convincing. He suggests that distinct plants that different consultants label with a single name constitute a covert category. There is no evidence, however, that such a grouping indicates anything more than perceived similarity and probable confusion, not a culturally salient grouping.


Elicitation removes important contextual elements from discourse about the natural environment in regards to naming, classification, and use. Simple questions such as “Is X a type of Y?”, “How many types of X are there?”, or “What is X used for?”, often used in elicitation sequences, are themselves contextualized and situated within a series of expectations that the interviewer and consultant bring to the exchange.59 Elicitation involves a researcher-consultant dynamic that affects responses. One example was given above, the adequacy of chikomolin (Leucaena matudae) for posts and construction. A normative response (and elicitation frameworks tend to favor normative responses) is “no”; but this “no” is belied by actual practice. Card sorting and triad tests are equally problematic. As Brown (1974:327) notes, they “often present informants with culturally irrelevant options coercing them to sort items together which they rarely, if ever, group together on an ordinary day to day basis” (emphasis in original). That is, the stimulus, in most cases a visual image or dried or otherwise preserved voucher specimens, implies an organizing principle, in these cases morphology, that channels responses. Recorded bird calls, a quite efficient tool for identification of native referents, would undoubtedly lead to classification patterns distinct from those prompted by visual imagery, and might be less prone to generate hierarchical patterns.60

Given the caveats associated with formal elicitation through structured questioning and decision-making tasks, other methods need to be developed to discover nomenclature variation and category membership, boundaries, and internal structure. This is particularly true for covert higher level categories such as life-forms and unique beginners.
Taylor (1990:46) mentions “natural conversation” in his account of polysemy of certain Tobelo terms and, indeed, clearly all ethnobiologists, particularly those who have taken a more literary, conversational, and encyclopedic approach to Indigenous natural history (e.g., the works of Nabhan, Rea, Breedlove and Laughlin, Bulmer, Majnep, and Pawley) rely on overheard natural exchanges among consultants as well as unstructured conversations and, in the best of cases, extensive recorded discourse on relevant topics.61 Language documentation, which creates the resources for searches through large corpora of transcribed texts, offers a unique opportunity for both qualitative and quantitative analysis of how members of a given community designate flora and fauna and converse about their natural environment. Several examples from both unrecorded natural discourse and corpus material illustrate the importance of conversational data for the study of ethnobiological nomenclature and classification.

Careful attention to unelicited natural conversation can provide important clues to the structure of native categories. In the previous pages brief mention was made of Oapan Nahuatl nomenclature and classification for a group of insects designated ‘Formicidae’ in Western taxonomy. One of the most salient ants throughout the Mesoamerican cultural area is the leaf-cutter Atta mexicana. In Yoloxóchitl Mixtec distinct terms are used for the (edible) winged queen, the winged male, and wingless female soldiers. In the Sierra Nororiental de Puebla tsīkatl is a monotypic term reserved exclusively for this species. And in the Balsas region, speakers distinguish between colonies of A. mexicana (tsontetl)

59 This same point is made by Ellen (1993:225): “Informants, unprompted, rarely in the course of their ordinary lives will uses expressions such as ‘is x a kind of y?’, or ‘how many kinds of y are there?’.” 60 Hunn (1992) suggests using bird calls as a stimulus for identification and in lieu of physical specimens as vouchers. 61 A pioneering effort in this regard is the work of Majnep & Bulmer (1990). The Taller de Tradición Oral de CEPEC and Pierre Beaucage (1988) produced exemplary material, though the corpus was not subjected to direct analysis.

with large, visible surface fungal dumps (the contents of which are used to fertilize plants, such as cilantro, cultivated as condiments) and those without. In the Balsas and Sierra Nororiental de Puebla regions it is, therefore, rather unusual to ask whether a given ant is a “type of tsīkatl”. Whether the answer is affirmative or not, the categorization of tsontetl (Balsas) or other non-leaf cutters (Puebla) as a type of tsīkatl is not a common perspective. Nevertheless, one day I was entering my house (in San Agustín Oapan, Balsas valley) with an elderly woman who had consistently helped me in the study of . At the entrance was a large A. mexicana nest. The woman looked at me and joked, “Nō tihpia motsīkaw” (lit., ‘You also have your ants’). Though it was obvious that we both knew that this was a leaf-cutter mound, she not only used a generic term, tsīkatl, instead of the specific tsontetl, but she used a possessive construction.

This brief exchange demonstrates the inclusion of tsontetl within the tsīkatl category, at least by this speaker in a given context: a joking reference to the house-site infestation of leaf-cutter ants. Even though red harvester ants (Pogonomyrmex barbatus; locally tēkwāntsīkatl) are selected to be appeased through an offering of maize kernels in exchange for a person’s soul,62 it is the A. mexicana that is considered particularly prone to take revenge on those who disturb its nest, often very visible and marked by large surface mounds. Thus, unlike other ants, particularly fire and harvester ants, colonies of tsontetl are tolerated and not exterminated with pesticides.
This stability in location,63 a combination of the effect of large underground nests and traditional beliefs in the danger of provoking the ire of these ants, undoubtedly also contributes to the possibility of “possessing” these ants (or “having the insect lodge in one’s residence”), a relation generally not found with insects.64

62 The selection of P. barbatus for these offerings is undoubtedly related to their seed-gathering activities: they rapidly find the offered maize kernels and take them into their nests. Harvester ants also are used ritually, though in a different manner, in the southwestern United States (Groark 1996, 2001). In obtaining the Western scientific species identification for the ritually used ants, Groark (2001:141) comments on the value of voucher specimens: “That such an identification can be confirmed more than a century after the species’ last known use is eloquent testimony to the importance of voucher specimens in anthropological research, as well as to the importance of the collections that preserve such materials.” The argument presented in this article is that best practice would preserve both the voucher specimen and native discourse on the taxa.
63 The term tsontetl is a compound noun from tson ‘hair’ + tetl ‘stone’ or ‘rock’. The compound has no other meaning, but in regard to an implication of immobility note that kohtsontetl (with the added element koh ‘tree’ or ‘wood’) means ‘tree stump’.
64 One notable exception is Polybia occidentalis nigratella du Buysson, swarm-founding wasps that often build their nests under house eaves and are jokingly referred to as the house owner’s ‘little chicks’.


Figure 2. Classification of Formicidae in San Agustín Oapan Nahuatl

As is the case with the off-hand comment Nō tihpia motsīkaw, natural conversations produce some of the most revealing insights into Indigenous categorization. Several have already been noted: (1) the comment that Psorthaspis formosa (family Pompilidae) was an unusual winged ītskwin tiōpixki indicated that winglessness was a central characteristic of this category; (2) the discussion of the status of Solanum chrysotrichum revealed speaker-specific judgments of classificatory criteria and boundaries. In Oapan, Guerrero, the statement of one consultant in regard to a stick insect, Nō chapolin, nō nokwa (‘It is also a chapolin, it is also edible’), demonstrated clearly that edibility was one criterion for class inclusion into one (perhaps the most inclusive) of several senses of chapolin (see Appendix for an encyclopedic discussion of the chapolin category). Indeed, the criteria by which category membership is judged can best be obtained by careful attention to dialogues, discussions, and disagreements among native speakers, particularly in regard to the categorization of peripheral items.

Insight into Indigenous nomenclature and classification of biotaxa can also be gleaned from semantic and statistical analysis of digitally recorded discussions among native natural historians, particularly on topics such as: local flora and fauna; material culture including house construction, fencing, and animal traps; food preparation; and medicine. Large (> 1 million words) topically relevant corpora that focus on the semantic

domains just mentioned can be explored to address many issues in the lexicosemantics of biotaxa. The following discussion, based on such an analysis, addresses two such issues: residual categories and patterns of nomenclature and classification.

As noted above (n. 22) residual categories are apparent hypernyms that have an additional sense as a co-hyponym of more specific terminology at a lower level. Such categories were first, I believe, extensively discussed by Hunn (1977), who suggested that certain terms may be both the hypernym of specifically named taxa and a co-hyponym

(residual category) of these same taxa. For example, pehpen2 (residual) is a co-hyponym of named butterfly taxa (species1, species2, species3): it covers and categorizes, in Hunn’s analysis, butterflies not specifically named. Pehpen1, as a hypernym, covers the named species (1, 2, 3, ...) as well as those of the unnamed residual category, pehpen2. Taylor (1990:64–65) offers a nuanced critique of always considering “residue” (i.e., unnamed taxa at a given taxonomic level) a residual category, although he does accept that such categories exist (e.g., o iuru as a residual term for the wingless forms of all ants except the specifically named weaver ants).65 Several researchers have suggested that residual categories are often marked by a term meaning ‘just’ (Hunn 1976, 1977; Berlin 1992:114). Corpora large enough to permit study of the use of a native term meaning ‘just’ with different named biotaxa, therefore, can suggest the relative degree to which different taxa may mark a residual category, although “just” may modify biosemantic terms for other reasons. The table below presents the four most common words that follow the Sierra Nororiental de Puebla Nahuat terms xiwit, kowit, komekat, sakat, and mōsōt.

Table 2. Association of sah with various biotaxa terms66

Term                                              Following word               Occurrences   Percent

Xiwit                                             sah (‘just’)                 150           13.0%
applied to certain herbaceous plants              wān (‘and’)                  122           10.6%
(but not grasses, sakat)                          tein (relativizer)           50            4.3%
1152 total occurrences67                          mochīwa (‘become’, ‘grow’)   19            1.7%

Kowit                                             wān (‘and’)                  171           10.8%
‘tree’ or ‘wood’                                  mochīwa (‘become’, ‘grow’)   80            5.1%
1577 total occurrences                            tein (relativizer)           80            5.1%
                                                  sah (‘just’)                 74            4.7%

Komekat                                           wān (‘and’)                  55            12.4%
‘vine’ or ‘liana’                                 tein (relativizer)           27            6.1%
442 total occurrences                             sah (‘just’)                 21            4.8%
                                                  mochīwa (‘become’, ‘grow’)   13            3.0%

Sakat68                                           wān (‘and’)                  9             9.2%
‘grass’ and ‘sedge’                               tein (relativizer)           8             8.2%
98 total occurrences                              sah (‘just’)                 4             4.1%
                                                  mochīwa (‘become’, ‘grow’)   1             0.0%

Mōsōt                                             wān (‘and’)                  33            20.1%
Bidens alba (specific name) or a                  tein (relativizer)           6             3.7%
category term including B. alba and B. reptans    mochīwa (‘become’, ‘grow’)   3             1.8%
164 total occurrences69                           sah (‘just’)                 0             0.0%

65 Berlin, despite early acceptance of residual categories, later adopted the more critical perspective of Taylor (Berlin 1994:114–16). 66 The entire corpus comprises approximately 1.5 million words; topics are skewed towards flora and fauna, material culture, and trapping and fishing. 67 This figure does not include the 233 times that xiwit (a homophone) is clearly used in the sense of ‘year’.
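Counts of the kind reported in Table 2 can be approximated with a short script. The following is a minimal sketch in Python, not the procedure used for the corpus reported here: it assumes the transcriptions have already been tokenized into a flat list of word forms and punctuation marks, and it assumes that occurrences followed by an end-of-phrase marker are left out of the total for every term (the table notes state this only for mōsōt). The function name `following_word_profile` and the toy token list are hypothetical. Homophones such as xiwit ‘year’ would still need to be excluded by hand or by annotation.

```python
from collections import Counter

# End-of-phrase markers; a target term followed by one of these is skipped.
# Applying this rule to every term is an assumption made for the sketch.
PHRASE_END = {".", ",", ";", "?", "!"}


def following_word_profile(tokens, target, top_n=4):
    """Return (total, rows) for the words that immediately follow `target`.

    `total` is the number of counted occurrences of `target`; each row is
    (following word, count, percent of `total` rounded to one decimal).
    """
    followers = Counter()
    total = 0
    for current, nxt in zip(tokens, tokens[1:]):
        if current != target or nxt in PHRASE_END:
            continue
        total += 1
        followers[nxt] += 1
    rows = [
        (word, count, round(100 * count / total, 1))
        for word, count in followers.most_common(top_n)
    ]
    return total, rows


# Toy token list invented for illustration only; it is not corpus data.
toy_tokens = [
    "xiwit", "sah", ".", "xiwit", "wān", "kowit", "xiwit", "sah",
    "kowit", "wān", "xiwit", ".", "xiwit", "tein",
]

total, rows = following_word_profile(toy_tokens, "xiwit")
for word, count, percent in rows:
    print(f"{word}\t{count}\t{percent}%")
```

Comparing the share of sah across terms, rather than its raw count, is what allows terms of very different frequency (xiwit versus sakat, for instance) to be set side by side.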

The preceding table demonstrates that sah occurs almost three times as frequently after xiwit as it does alongside other “life forms.” Analysis of unrecorded natural conversation and recorded dialogues suggests that xiwit is indeed a residual category while the other three life-form terms (kowit, komekat, sakat) are less so. There seems to be an expectation that xiwit is not further named. Thus a question about the name of a herbaceous plant could elicit the response “Āmo kipia ītōkāy, xiwit sah” (‘It doesn’t have a name, it is just a(n unnamed) herbaceous plant’) whereas a question about the name of a tree could elicit the response “Āmo nikmati” (‘I don’t know’) or “Āmo kipia ītōkāy” (‘It doesn’t have a name’) but rarely if ever ?“Āmo kipia ītōkāy, kowit sah” (‘It doesn’t have a name, it is just a tree’). Kowit also signifies ‘wood’ and many of the occurrences of kowit sah refer to the fact that a material item is made of ‘just wood’ (i.e., not some stronger materials). Other occurrences of kowit sah refer to the fact that some trees grow just from wood (a branch) stuck in the ground and not only from seed.

Xiwit may also be used to contrast cultivated from uncultivated (wild) herbaceous plants and thus the collocation xiwit sah may also mean that the plant in question is not planted or tolerated. In such cases the xiwit might well be named, but the context of utterance makes clear that xiwit is being used to contrast the referent from other cultivated or useful plants. That is, xiwit is part of three sometimes but not necessarily overlapping contrast sets: named ~ unnamed, cultivated ~ wild, and, often, useful ~ not useful (though some xiwit may be used, predominantly as fodder).70

Corpus analysis may also shed light on nomenclature patterns. Table 3 reveals the occurrences of all lemmas that include the unanalyzable stem mōsōt, a very common term for two basic plants: Bidens alba (also the very similar B. odorata) and B. reptans.
In an unmarked sense, mōsōt1 refers to the higher level node; in a marked sense, mōsōt2, it refers to Bidens alba (and B. odorata). To clarify which taxonomic level is meant, speakers may signal the marked use through a compound mīlahmōsōt (‘cornfield mōsōt’) based on the fact that this plant is frequently found in highly disturbed areas including abandoned or fallowed fields as well as along clear roadside sites. Kwamōsōt

68 Actually only a subset of sedges (family Cyperaceae), as about a half-dozen sedges are given specific names.
69 Only those occurrences not followed by an end-of-phrase marker (period, comma).
70 An interesting meaning of xiw- is found in the incorporated-noun verbal compound xiwnekwisti ‘to smell like a weed’ used by a protagonist in Silvestre Pantaleón (Olivares & Amith 2011) in reference to Chenopodium nuttalliae that had been fertilized by a chemical and not a natural, organic fertilizer. The plants that had been fertilized by commercial products had a “weedy” taste, not the taste of the plants fertilized with fungus from an Atta mexicana nest or from guano.

is a much larger plant, a climbing vine, also found along roadsides though frequently climbing on other plants. The two mōsōt are quite different and the common feature, the double barbed awn of the achene, has never been mentioned by any consultant as a defining characteristic of the mōsōt category.71 This suggests that the classification of the two Bidens together is not based on the feature that botanists would use to establish the generic category but rather simply learned as a nomenclatural unit.

Table 3. Mōsōt in the Sierra Nororiental Corpus

Occurrences   Lexeme
123           mōsōt (once as mōsōtsīn)
110           kwomōsōt, kwamōsōt, komōsōt, kowmōsōt, kwomōsōt
57            mīlahmōsōt
15            kwomekamōsōt, kwamekamōsōt, komekamōsōt (an alternative name for kwomōsōt)
7             tālmōsōt
6             istāk mōsōt
5             kowtah mōsōt
4             yēkmōsōt (once as yēkmōsōtsīn)
1             xōkihyākmōsōt
1             mōsōtakōt
1             māweweyak mōsōt (speaker gives this as another name for mīlahmōsōt)
1             īknīw mōsōt (lit. ‘mōsōt’s brother’, a descriptive relationship applied to other Asteraceae, particularly Melampodium divaricatum)
333           Total occurrences
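A tally such as Table 3 depends on grouping orthographic variants under a single lexeme before counting. The sketch below, in Python, illustrates one way this could be done; it is an illustration under stated assumptions, not the method used to compile the table. The variant spellings are taken from Table 3, but the mapping `VARIANT_TO_LEMMA`, the choice of citation forms, and the toy token list are hypothetical. The sketch works on single tokens only, so two-word lexemes such as istāk mōsōt would be counted under bare mōsōt unless they are joined beforehand.

```python
import unicodedata
from collections import Counter


def norm(form):
    """Lower-case a word form and normalize its Unicode composition (NFC),
    so that vowels with macrons compare equal however they were typed."""
    return unicodedata.normalize("NFC", form).lower()


STEM = norm("mōsōt")

# Hypothetical variant-to-lemma table. The spellings appear in Table 3;
# the citation form chosen for each group is an assumption.
VARIANT_TO_LEMMA = {
    norm(variant): norm(lemma)
    for variant, lemma in {
        "kwomōsōt": "kwamōsōt",
        "komōsōt": "kwamōsōt",
        "kowmōsōt": "kwamōsōt",
        "mōsōtsīn": "mōsōt",
        "yēkmōsōtsīn": "yēkmōsōt",
    }.items()
}


def count_stem_lemmas(tokens):
    """Count every token containing the stem, grouped under its lemma."""
    counts = Counter()
    for token in tokens:
        form = norm(token)
        if STEM not in form:
            continue
        counts[VARIANT_TO_LEMMA.get(form, form)] += 1
    return counts


# Toy token list invented for illustration only; it is not corpus data.
toy_tokens = ["Mōsōt", "kwomōsōt", "kwamōsōt", "xiwit", "mōsōtsīn", "mīlahmōsōt"]

lemma_counts = count_stem_lemmas(toy_tokens)
for lemma, count in lemma_counts.most_common():
    print(f"{count}\t{lemma}")
```

Keeping the variant table explicit, rather than collapsing spellings by rule, leaves each grouping decision open to review by speakers and by other analysts.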

Table 3 reveals patterns in the term mōsōt, including yēkmōsōt and yēkmōsōtsīn (lit. ‘true mōsōt’) for B. alba/B. odorata, indicating that this species is the prototypical referent of mōsōt. Particularly interesting is the emergence of the term tālmōsōt in 2014 for Erigeron karvinskianus, a daisy-like Asteraceae that frequently grows in rocky crevices and walls. The plant was known to several consultants, who also were aware of, and had commented on, its favored habitat. However, it was only in 2014 that one highly knowledgeable consultant designated the plant a tālmōsōt (lit., ‘earth mōsōt’), classifying it as a mōsōt along with the very common and well-known mīlahmōsōt (‘milpa [cornfield] mōsōt’, Bidens alba) and kwamōsōt (‘tree or woody mōsōt’, B. reptans). By that time this consultant, and others, had understood that the project involved the “classification of discontinuities in nature,” to borrow the subtitle of an influential book by Eugene Hunn, and had begun to recategorize plants, creating groups naturally, independent of formal elicitation. As in other cases (e.g., the extension of the ndi3xi4tu3 / kohtekine “woodcutter” category previously discussed), the extension of the mōsōt category to Erigeron

71 Interestingly, the loan into Spanish, mozote, is applied (extended) to Triumfetta lappula L. (Malvaceae, formerly Tiliaceae), another plant with prickly haired fruits that likewise stick to one’s clothing and body (Chízmar F. 2009:319), an extension that suggests that “burr-like” fruits are a key element of the category definition for mōsōt, at least historically. Note that Bidens derives from Latin: bis ‘twice’ and dens (‘a tooth’) (Hyam & Pankhurst 1995).

karvinskianus, though an innovation, is not uninteresting. It demonstrates both the impact of project participation on the cognitive categories of native natural historians and the morphological characteristics that may orient patterns of extension over time. Starting after one consultant identified E. karvinskianus as tālmōsōt, he and another consultant present at the time have consistently included this term as one “type of” mōsōt.72

The examples in tables 2 and 3, the first on the utilization of sah (‘only’) in modifying terms for biotaxa (particularly life-forms) and the second on the relative occurrence of different terms including the word-stem mōsōt, suggest the value of corpus-based research on the nomenclature and classification of biotaxa. One final point should be added, again relevant to a discursive and linguistic basis for cognitive research in ethnobiology. In discussing Brent Berlin’s theory of taxonomic hierarchy, Eugene Hunn (1977:44) gives six examples of “life-form” categories: ‘bird’, ‘mammal’, and ‘fish’ in the animal domain and ‘tree’, ‘vine’, and ‘grass’ in the plant domain. Taxonomically all terms occupy nodes at the same level of their respective taxonomic trees (e.g., bird:robin :: tree:maple), just above intermediate or folk generic forms in Berlin’s (e.g., 1992, esp. Chap. 4) analytical scheme.

The analogical “equality” of life-forms at the taxonomic level disappears, however, when one examines patterns of usage in communicative events. That is, use of terminology at any level, from terminal taxon to life-form, is part of a process of “situationally adapted” (Ellen 1993:3) information exchange that is greatly influenced by sets of complex issues.
Thus, if we take Hunn’s three animal life-forms (bird, mammal, fish) and count usage in two large corpora of American English (see Table 4), we see that ‘mammal’ is rarely used.73 The reason for this is not, I think, related to the relative rarity of mammals in daily life but rather to the general lack of informational relevance of this term for most situations of discourse. For example, if I were to tell a friend about a dead duck, rattlesnake, or raccoon that I had passed on the side of a highway, I would be most likely to comment in the following ways:

1. I saw a dead duck/bird/animal on the side of the road.
2. I saw a dead rattlesnake/snake/reptile/animal on the side of the road.
3. I saw a dead raccoon/?mammal/animal on the side of the road.

A strictly formal approach would suggest that duck:bird:animal :: rattlesnake:snake:animal :: raccoon:mammal:animal. The gap in felicitous expressions (?“I saw a dead mammal on the side of the road”) is not explained by any formal taxonomic analysis. It reflects, instead, the low relevance of the category in quotidian exchanges. Two basic questions in beginning a cognitively oriented ethnobiological study should be, then, the context in which nomenclature and classification are to be studied and the different registers in which terms at the same taxonomic level are used. The argument presented

72 Breedlove & Laughlin (1993, vol. 1: 2) note another level of impact on consultant responses: group dynamics:

The impact of group dynamics upon the choice of plant names was terrible to behold. Of paramount importance was the social criteria of the collector within the team, such as his age or bonds of friendship. Social position and temperament often determined who would align with whom in assigning which names. Efforts to prevent collectors from influencing their colleagues’ decisions frequently were unsuccessful.

73 Given the absence of syntactic parsing, ‘fish’ would show up regardless of its use as a noun or verb.

in this essay is that linguistic and conversational data is particularly interesting in this respect and that corpus-building as part of a language documentation project is especially valuable in creating this resource.

Table 4. Occurrence of Hunn’s six cited life-forms in two corpora of American English

              Corpus of Contemporary American English        Corpus of Historical American English
              (450 million words)                            (400 million words)
Term          Singular + Plural    Total     Percentage      Singular + Plural    Total     Percentage
bird          20,986 + 21,532      42,518    43%             23,736 + 23,043      46,779    58%
mammal        1,102 + 2,938        4,040     4%              396 + 1,657          2,053     3%
fish          50,179 + 1,805       51,984    53%             28,972 + 2,736       31,708    39%
tree          36,205 + 40,563      76,768    75%             42,092 + 49,534      91,626    72%
vine          2,221 + 2,805        5,026     5%              3,798 + 4,627        8,425     7%
grass         18,363 + 2,232       20,595    20%             24,599 + 1,869       26,468    21%
platypus      68 + 0               68                        36 + 19              55
rattlesnake   590 + 363            953                       925 + 424            1,349
marigold      208 + 299            507                       186 + 235            421
tulip         802 + 814            1,616                     585 + 659            1,244
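The percentages in Table 4 express each term’s share of the total for its own domain (animal or plant) within one corpus. The following minimal sketch in Python recomputes the animal-domain shares for the Corpus of Contemporary American English from the singular and plural counts given in the table, along with the ratio of plural to singular forms. The names `domain_shares` and `plural_to_singular` are hypothetical, and the sketch assumes, as the table does, untagged string counts (so verbal uses of ‘fish’ are included).

```python
# Singular and plural counts for the three animal life-form terms, copied
# from the Corpus of Contemporary American English columns of Table 4.
COCA_ANIMAL = {
    "bird": (20986, 21532),
    "mammal": (1102, 2938),
    "fish": (50179, 1805),
}


def domain_shares(counts):
    """Return each term's share (rounded percent) of the domain total."""
    totals = {term: sg + pl for term, (sg, pl) in counts.items()}
    grand_total = sum(totals.values())
    return {term: round(100 * n / grand_total) for term, n in totals.items()}


def plural_to_singular(counts):
    """Return the ratio of plural to singular occurrences for each term."""
    return {term: round(pl / sg, 2) for term, (sg, pl) in counts.items()}


shares = domain_shares(COCA_ANIMAL)
ratios = plural_to_singular(COCA_ANIMAL)
for term in COCA_ANIMAL:
    print(f"{term}\t{shares[term]}%\tplural/singular = {ratios[term]}")
```

The plural-to-singular ratio gives a simple numerical handle on the distributional difference between ‘mammal’ and the other two animal terms.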

In regard to Table 4, certainly polysemy (‘smoke some grass’ for marijuana) and multiple part-of-speech functions (‘fish [verb] for compliments’) might account for some variation in frequency, although a corpus with part-of-speech tagging and more rigorous syntactic analysis might obviate some of these problems. But the rarity of ‘mammal’ and the fact that the plural is much more commonly used than the singular (a distributional fact not shared by the other terms) suggest that other factors play a role. One is the saliency of subordinate categories: most mammals, at least mammals that are frequently the topic of conversation, are named and easily recognized. Speakers probably would tend to use the lower-level term when appropriate. This does not explain, however, my native speaker intuition that in phrase 3 above I would be more likely to use ‘animal’ than ‘mammal’ even though the former is inclusive of snakes and birds. This suggests that ‘mammal’ is a term of restricted register.

Considering discourse pragmatics, prototype theory of categorical relations, and situational caveats to lexical relations of (co-)hyponymy and hypernymy, it is clear that taxonomic relations represent but one expression of classificatory criteria among many cross-cutting patterns of contextualized lexicosemantic relations. A taxonomic hierarchy represents the expression of one context: elicitation formalized by either triadic selection or set questions of the type mentioned earlier: “Is X a type of Y?”. Functional grouping

represents another categorization pattern in which prototypical status may reflect a combination of normative (the best X for Y function) and geographic (the best X for Y function given locational constraints) criteria. Another type of pattern is brought out by linguistic analysis of natural speech, three types of which have been mentioned above: (1) discourse revelations of hypernymy, hyponymy, and co-hyponymy relations; (2) co-occurrence patterns (e.g., terms that share the feature of potential subjects of a given verb, possessors of a given noun); and (3) statistical analysis of medium-sized corpora, particularly those topically oriented to natural history. All this material invariably reveals patterns distinct from those obtained in formal elicitation settings.

The final section addresses the issue of interdisciplinary collaboration with biologists in projects that address the empirical data and research objectives of all participants. From the discussion in the preceding sections it is clear that a cognitive focus on ethnobiology (exploring the nomenclature and classification of local flora and fauna) is enhanced by large data sets of natural conversation and comprehensive inventories of local taxa, both those named and unnamed, as well as those known and unknown to native natural historians. Certain taxonomic groups are more likely than others to pose both methodological and theoretical questions for cognitively oriented ethnobiological research, inevitably requiring more resources, from fieldwork through determinations to analysis.
Finally, the advantages of multisited research to address issues of cultural history and diachronic shifts in the biosemantic lexicon can only be realized by a decided shift to a new form of interdisciplinary collaboration, one that involves the use of DNA barcoding and molecular analysis to shift the burden of determinations to species away from taxonomists and toward technicians.

4. The role of biology in ethnobiological research

4.1 Introduction

At a very basic level, primary ethnobiological data comprises: (a) a native speaker account, (b) the context or field situation in which communication between researcher and native natural historian takes place, (c) a physical specimen including all relevant collection data, and (d) a determination to scientific species by a taxonomist. Each collection event constitutes a data point (single in terms of the botanical specimen and determination, often multiple in terms of ethnographic information) that, when multiplied, provides an increasingly detailed sketch of native interpretation of the local floristic and faunal environment. Multiple data points, of course, increase the definition of the cognitive sketch of the environment (e.g., unequal distribution of knowledge in a community, peripheral or central status of a reference in a classificatory scheme), a significant goal of linguistically based ethnobiological research. Yet proliferation of collections/data points has a cost, particularly in fieldwork resources, in administration and processing of vouchers, and in imposition on collaborating taxonomists. Moreover, an accurate and extensive portrait of ethnobiological knowledge is complicated by the fact that both local natural historical knowledge and Western taxonomic expertise are endangered.74 And while

74 From the Western scientific perspective, there is a “dwindling pool of taxonomists” (Hebert et al. 2003:313) able and willing to determine voucher specimens to species, particularly specimens having no direct relation to their own research agenda. Indeed, NSF had recognized the problem of diminishing taxonomic expertise by creating the PEET (Partnerships for Enhancing Expertise in Taxonomy) program (now discontinued), which sought “to enhance taxonomic research and help prepare future generations of experts.” The dearth of taxonomists has not, however, abated and from a practical perspective documenting the nomenclature, classification, and use of local flora (particularly in diverse neotropical environments) is fraught with difficulties because of this.

native experts familiar with local flora can invariably identify sterile (without flower or fruit) specimens, Western taxonomists are reluctant to receive sterile material for determination. Over the past half-dozen years of my research, however, several projects have emerged that have greatly increased the level of mutually beneficial multidisciplinary collaboration by addressing issues and goals of significance to the different collaborating disciplines and by more fully engaging native speaker collaborators in botanical research.

The first shift toward a more collaborative, multidisciplinary, and interdisciplinary effort targets the most complex and extensive classificatory sets for extensive ethnobiological and biological documentation. Essentially this involves intensification of research in particular taxonomic groups, a triage approach to research that focuses on those taxa (i.e., families and genera) that are highly diverse in the local environment and particularly problematic for native naming and classification. The second shift involves using DNA barcoding to facilitate ethnobotanical research while creating a lasting resource (preserved voucher specimens and DNA extractions) that will advance not only phylogenetic and systematic biological research in the coming generations, but also biodiversity studies and ethnobiological research. Each of these is discussed below as “taxon-targeted” and “inventory-based” research, respectively.

4.2 Targeted taxon-based research

Each culture that I have studied has certain groups of flora and fauna that, from a cognitive perspective, are less interesting than others. For example, Cuscuta spp. (dodder), Loranthaceae (mistletoe), Dermaptera (earwigs), and Myrmeleontidae larvae (doodlebugs) are rather stable and unchallenging in terms of native systems of categorization. Speakers may distinguish among the parasitic Cuscuta and Loranthaceae by host plant, color, or size but these seem to be ad hoc distinctions rather than lexicalized subgroupings of set categories. Indeed, the above-mentioned groups manifest little internal structure (i.e., differentiation between prototype and peripheral referents) and relatively unproblematic boundaries.

Other groups, such as orchids in the Sierra Nororiental de Puebla, are inevitably recognized by native speakers. In this case the “native” category is apparently the result of Western influence (including trafficking of these plants) and thus the Indigenous and scientific categories are essentially co-terminous: the scientific family Orchidaceae.75 From the native perspective, the internal structure of this category is rather uninteresting. Only approximately 10 percent of the over 100 orchid species found in the municipality of Cuetzalan are locally named in Nahuat and in most cases there is a simple one-to-one correspondence between the Indigenous name and Western scientific species or, in a few cases, genus. In the case of orchids, extensive speaker-led collection of the taxon reveals only minor complications to intercultural category correspondences but the facility with which native consultants recognize this Western taxon can be advantageous to botanists working on orchids. Although most species are unnamed in Nahuat, speakers can easily recognize an orchid and through collections can contribute to the research agenda of a collaborating

75 Orchids are heavily, and illegally, trafficked in the Sierra and this may well contribute to the highly salient boundaries to this category, which is named only by the Spanish loan orquídea. Interestingly ferns and fern allies are not so easily categorized.

expert with little cost to a project that focuses on “positive” Indigenous knowledge: those taxa named, classified, and used by local residents.

An exclusive focus on such positive knowledge, however, may diminish the capacity for research into questions of category extensions and boundaries and may complicate any effort to determine the unequal distribution of nomenclature, classification, and use across different Western taxonomic groups. Those species that are not on the native speaker cognitive map are also likely to be more interesting from a purely botanical and floristic perspective. This introduces a methodology that I suggest be called “negative cognitive mapping,” ethnobotanical research that is focused not on plants that are locally named, classified, and used but rather on those that are not known to local experts. The plants so selected are not simply those for which no native name exists but, more significantly, those that have gone unnoticed.76

Table 5. Taxonomic complexity from an Indigenous nomenclatural and classificatory perspective

Plants:
  Piper spp. (but not Peperomia spp., the other major Piperaceae genus; in the Sierra Nororiental de Puebla)
  Asteraceae
  Burseraceae (Bursera spp., in the Balsas River valley of central Guerrero)
  Commelinaceae (in the Sierra Nororiental de Puebla)
  Lauraceae (in the Sierra Nororiental de Puebla)
  Leguminosae
  Melastomataceae (in the Sierra Nororiental de Puebla)
  Solanaceae (particularly Cestrum, Physalis, and Solanum)
  Rubiaceae
  Senna spp. (in the Pacific Coast of Guerrero)

Insects:
  Apidae
  Cerambycidae
  Formicidae
  Mutillidae (along the Pacific Coast and Balsas Valley, Guerrero)
  Orthoptera (in the Balsas Valley, central Guerrero)
  Vespidae

In the three areas I have studied, the preceding taxa have proven to be particularly challenging in regard to native language nomenclature and classification. Certain groups (Bursera spp., Piper spp., Melastomataceae, Formicidae, Orthoptera, Vespidae) enjoy high levels of consistent recognition. Lauraceae manifests what appears to be extensive overdetermination, with intraspecific distinctions made on the basis of fruit color and formation.

76 An expert native natural historian will have wide knowledge of plants, including many for which he or she has no name. Breedlove & Laughlin (1993, vol. 1:6) regretfully note that they “kept no tally of the number of plants that [their] consultants were unable to identify.” I think this should, however, be a part of ethnobotanical projects when resources permit, particularly since knowledge gaps may be significant to understanding traditional ecological knowledge.


Asteraceae are extremely common (in the Sierra Nororiental de Puebla they comprise just over 200 species, approximately 10 percent of the angiosperm flora) and morphologically diverse, offering excellent grounds for studying native nomenclature and classification in robust sets of stimuli. Piper spp. and Solanaceae both manifest interesting patterns of nomenclature correspondences: some Indigenous terms are monotypic while others cover sets of somewhat varied taxa. Among insects Formicidae, Mutillidae, and Orthoptera are consistently recognized, and knowledge of Vespidae is often extremely detailed.77

Taxon-based research has to date produced rewarding cross-disciplinary collaboration by establishing synergistic research relationships with biologists specializing in generic and family taxa that are interesting from a cognitive perspective and abundant enough regionally to reward closer interdisciplinary collaboration, often including joint field ventures.78 Statistics from the Sierra Nororiental de Puebla reveal that twelve families and one group (the latter in reference to ferns and allies) account for 60 percent of the regional flora.79 Though not a full floristic study, targeting specific taxa does require the collection of material that is not only not locally named but often not even known to the most knowledgeable native natural historian. As mentioned above, it is more than likely that these unnamed or unknown taxa would be rarer than their named and known counterparts and thus of potentially more interest to botanists.80 Targeted taxon-based research does not preclude complete coverage of the nomenclature, classification, and use of local flora and fauna but rather forges greater collaboration among social scientists, biologists, and native experts in taxa of greatest local diversity and native recognition.
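As a quick check on the 60 percent figure, the shares reported in footnote 79 can simply be added up. The short Python sketch below does only that; the variable names are illustrative and are not part of the project's own tooling, and the percentages are copied from the footnote.

```python
# Shares of the regional flora (in percent), copied from footnote 79.
flora_share = {
    "Ferns and allies": 12.8,
    "Asteraceae": 10.9,
    "Leguminosae": 7.5,
    "Poaceae": 6.2,
    "Solanaceae": 3.4,
    "Rubiaceae": 2.9,
    "Labiatae": 2.9,
    "Orchidaceae": 2.4,
    "Euphorbiaceae": 2.3,
    "Malvaceae sensu lato": 2.3,
    "Fagaceae": 2.1,
    "Piperaceae": 2.1,
    "Melastomataceae": 1.9,
}

# Ferns and allies are the one "group"; every other entry is a family.
family_count = sum(1 for taxon in flora_share if taxon != "Ferns and allies")

# Round to one decimal place to avoid floating-point noise in the sum.
total_share = round(sum(flora_share.values()), 1)

print(f"{family_count} families plus ferns and allies: "
      f"{total_share}% of the regional flora")
```

The thirteen shares sum to 59.7 percent, consistent with the rounded figure of 60 percent given in the text.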
Joint fieldwork among native natural historians, ethnobiologists, and taxonomists is particularly productive, as each group offers expertise in distinct areas. Yet it is important to note that such collaboration is a learning experience for all participants and that native speaker collaborators are as eager to acquire new knowledge as are the Western natural and social scientists. Thus, for example, at one point a group of native speakers gathered many specimens of mātalin for five vouchers. I was careful to separate out Commelina erecta from C. diffusa and show the two speakers who were helping in pressing the material how to distinguish the two species. Before the collection was complete, all five consultants had internalized the classificatory distinction and lexicalized it in an emergent nomenclatural distinction: mātalin wehwei (‘large mātalin’ for C. erecta) and mātalin tsikitsīn (‘small mātalin’ for C. diffusa).

Thus the highly integrated research teams necessary for both targeted taxon-based research and inventory-based research (§4.3) must, of course, be completely open to full synergy among participants. Much as it would be unimaginable to expect native speakers who collaborate on grammars or lexicons not to acquire the relevant analytical and theoretical tools, so too is it naïve to expect collaborators in (ethno)biological

77 For Cerambycidae, see Amith & Lingafelter (2016).

78 To date this triage focus on cognitive ethnobiology has resulted in over a dozen expert taxonomists (botanists and entomologists) either having joined in fieldwork or made commitments to do so in the near future.

79 The statistics are as follows: ferns and allies 12.8%, Asteraceae 10.9%, Leguminosae 7.5%, Poaceae 6.2%, Solanaceae 3.4%, Rubiaceae 2.9%, Labiatae 2.9%, Orchidaceae 2.4%, Euphorbiaceae 2.3%, Malvaceae sensu lato 2.3%, Fagaceae 2.1%, Piperaceae 2.1%, and Melastomataceae 1.9%.

80 This is, however, not always the case. After several years of having a reference to a plant called yōlpoliwkāxiwit, it was finally collected. It turned out to be Peperomia mexicana (Miq.) Miq. (Piperaceae), a species that had last been collected in the state of Puebla in 1945 (Guido Mathieu, personal communication). Other Peperomia not named by Indigenous consultants are much more common. Thus named plants may indeed be rare.

research not to learn some of the perspectives of Western science in establishing classificatory boundaries for flora and fauna. This will, of course, influence their own knowledge base (as collaboration with native communities influences Western understanding of natural history) but I have not found it difficult to work with native collaborators in differentiating sources of their expanding field of expertise. Indeed, this growth of knowledge is a fundamental part of developing native natural historians who can interpret their environment, particularly local flora and fauna, from multiple perspectives.

4.3 Inventory-based research and DNA barcoding

Comparative study of the nomenclature and classification of biological species (flora and fauna) has been an important tool for studying the cultural history of language groups. In addition to a general concern with reconstruction and proto-forms, two more specific research topics have emerged.81 The first, exemplified by scholars such as Catherine Fowler (1972a, 1972b, 1983), Paul Friedrich (1970), Frank Siebert (1967), and K. W. Whistler (1977), has reconstructed the lexicosemantics of proto-language terms for biotaxa and taken the reconstructed meanings as reflecting ancestral homeland ecosystems. In the depth of his study of Proto-Indo-European, Friedrich broke new ground in rigor by linking protosemantic reconstruction of flora to prehistoric ecosystems. Siebert, in turn, was the first to apply this methodology to American languages. Fowler continued this effort to use lexical evidence to document ecological clues to homelands but at the same time noted the limitations of this approach “given the quality and quantity of data presently available” (Fowler 1983:224; emphasis added).

A second direction of research regarding the lexicosemantics of biological nomenclature relates to contact phenomena and what has been called linguistic stratigraphy: “the systematic investigation of the layering of grammatical and lexical material in a language or dialect which reflects its historical development and past contacts between its speakers and bearers of other linguistic and cultural traditions” (Andersen 2003b:1). Within this area of research a small subset of studies has either focused exclusively on biological nomenclature (Bowern 2007; Bowern & McConvell 2011; Bostoen 2007; Meroz 2013) or relied heavily on terms from this semantic domain (Dakin 2003).
The strategy of these studies differs, as Yoram Meroz notes, from that of most comparative lexical surveys, which, in order to elucidate genetic relationships among languages, prefer a set of basic words most resistant to change (see Haspelmath & Tadmor (2009), particularly Chap. 1–3). The nomenclature of flora and fauna, however, is probably more sensitive than basic vocabulary to change through contact and thus is a particularly propitious semantic domain in which to study migration and contact.82

The loans that are relevant to such stratigraphic studies, in turn, may be either of form (i.e., loan words) or meaning (calques or loan translations). The former is common, for example, among Nahuat speakers of the Sierra Nororiental de Puebla who

81 Undoubtedly the most complete attempt for deep historical reconstruction of terms for flora and fauna is the Oceanic Lexicon Project (https://sites.google.com/site/theoceaniclexiconproject/) in which volumes three and four, of a proposed seven-volume effort, deal with plants and animals, respectively, and volume one with material culture (see Ross, Pawley, & Osmond 1998–2011).

82 Balée & Moore (1991) and Berlin et al. (1969, 1973) have looked at the factors that shape the rate of retention and loss in closely related languages among different types of ethnobiological nomenclature.

have borrowed many, some even quite basic, terms from Totonac (xopepe ‘cockroach’, āltsimit ‘wasp’, and chokoy ‘puss caterpillar’ [Megalopygidae]), the borrowing probably the result of Nahuat migration into the area. The second type of loan is the calque, a loan in which meaning is translated from one language to another but the term itself is not borrowed. Calques have been used to support the definition of Mesoamerica as a cultural area, though these loan translations are only one of several features, many morphosyntactic, that are regionally shared (Campbell, Kaufman, & Smith-Stark 1986; Smith-Stark 1982, 1994; about 18 percent of the calques these authors reference denote flora or fauna, such as ‘mother of the leaf-cutter ant’ for ‘coral snake’). In my own ethnobiological research I have discovered a few more calques (e.g., camel spiders and whip scorpions are both called “shame [animal]” among the Aztecs, modern Nahuatl speakers from central Guerrero, and the Coastal Mixtecs) and examples of cultural beliefs associated with animals and plants (e.g., Mutillidae are considered omens by speakers of several Totonac, Mixtec, Triqui, and Mazatec languages). The viability of using loan words, calques, uses, and shared cultural beliefs of biological nomenclature for studying historical contact is, however, also hindered by the same poor quality and quantity of data noted by Fowler. The project described in this section represents an effort to provide such necessary data.

Despite the importance of comparative research on the nomenclature and classification, as well as use, of local flora and fauna, unified efforts to document this knowledge across communities face problems that only begin with the complexity and fluidity of local knowledge.83 A comparative project that targets a collection of 1,000 specimens from each of half a dozen communities embedded at different points in a regional ecosystem would produce 6,000 plants.
If botanical best practice is followed and only fertile material collected, the task of assembling the inventory becomes even more daunting. Employing multiple teams of field botanists is one way to address this issue, but this often occurs at the expense of ethnographic detail and understanding. Moreover, with such a large collection, taxonomists would be overwhelmed by material, much of which would be common and of routine identification.

Inventory-based research and DNA barcoding, the focus of a project underway in the Sierra Nororiental de Puebla, attempt to solve the difficulties of multi-sited research by facilitating the determination to species of vegetative material (e.g., leaf tissue) through the use of genetic markers.84 The botanical aspects of this project comprise four steps: (1) collection of flowering specimens that represent the floristic inventory of the Sierra Nororiental de Puebla; (2) identification to species of these voucher specimens; (3) DNA sequencing of up to four regions of the genome of each species to create a DNA barcode reference library; and (4) use of this reference library to facilitate the identification of a small sample of vegetative plant material that will be used to document the nomenclature, classification, and use of local flora in Sierra Nororiental Indigenous villages.

This methodology occasions two major shifts, one at the field end and one at the herbarium end of the project. At the level of field collection, speakers can collect a single sterile voucher specimen, along with vegetative material dried in silica gel, linked to ethnobotanical knowledge (a name, a classification, a use). This simplifies and accelerates

83 Cf. Laughlin’s comments on the variability of taxonomic knowledge in Zinacantan (Breedlove & Laughlin 1993, vol. 1:8).

84 This is an NSF, Documenting Endangered Languages award (BCS 1401178), and an NEH, Preservation and Access, award (PD-50031) entitled “A Biological Approach to Documenting Traditional Ecological Knowledge in Synchronic and Diachronic Perspectives.”

collection, reducing the need for flowering or fruiting vouchers as well as the number of duplicate vouchers needed. This will empower communities to document their own ethnobiological knowledge, diminishing the need for Western-trained botanists in the field and, at the herbarium end, in a majority of cases transforming identification into a laboratory process and freeing taxonomists from routine identifications of common species. In certain cases even a four-region barcode (matK, rbcL, ITS, trnH-psbA) may not distinguish among congeneric species, leaving ambiguity between two and perhaps half a dozen specimens.85 In such cases, in which more discriminatory power is needed, supplementary methods will be developed to disambiguate the congeneric species: (1) the use of disambiguating morphological features present in sterile specimens, such as leaf form, venation, or pubescence; and (2) a clade-specific DNA locus capable of distinguishing among these congeneric species. It is likely that with these additions close to 100% accuracy in sterile species identification can be approached.

The use of DNA barcodes not only simplifies collection and shifts identification to a technical process in a laboratory, it also creates a permanent barcode reference library that can be built up by future projects. The present project in the Sierra Nororiental de Puebla is a significant start. It should produce a DNA barcode reference library covering approximately 10 percent of Mexican angiosperms and about 25 percent of Mexican ferns and allies; for the state of Puebla the figures are closer to 38 percent of angiosperms and 60 percent of ferns and allies.

This project goes well beyond the common interests and methodologies of ethnobotanical research.
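The lookup against a barcode reference library, together with the morphological fallback for congeneric species that the barcode cannot separate, can be sketched in a few lines of Python. This is a minimal illustration under strong assumptions, not the project's pipeline: the species labels, sequences, and leaf characters are invented placeholders, the function names are hypothetical, and exact string equality stands in for the sequence-similarity searches that real barcoding relies on. Only the four locus names come from the text.

```python
# The four barcode regions named in the text.
LOCI = ("matK", "rbcL", "ITS", "trnH-psbA")

# Toy reference library: invented placeholder sequences, NOT real data.
REFERENCE = {
    "Piper sp. A": {"matK": "ACGT", "rbcL": "GGCA", "ITS": "TTAG", "trnH-psbA": "CCAT"},
    "Piper sp. B": {"matK": "ACGT", "rbcL": "GGCA", "ITS": "TTAG", "trnH-psbA": "CCAT"},
    "Commelina sp. A": {"matK": "ACGA", "rbcL": "GGTA", "ITS": "TCAG", "trnH-psbA": "CGAT"},
}

# Hypothetical characters observable on sterile material (first fallback).
LEAF_TRAITS = {
    "Piper sp. A": {"pubescent": True},
    "Piper sp. B": {"pubescent": False},
    "Commelina sp. A": {"pubescent": False},
}


def barcode_candidates(query):
    """Species whose reference sequence equals the query at every sequenced locus."""
    return sorted(
        species
        for species, barcode in REFERENCE.items()
        if all(barcode.get(locus) == seq for locus, seq in query.items())
    )


def identify(query, traits=None):
    """Narrow barcode candidates with leaf characters when the barcode is ambiguous."""
    candidates = barcode_candidates(query)
    if len(candidates) > 1 and traits:
        candidates = [
            species
            for species in candidates
            if all(LEAF_TRAITS[species].get(k) == v for k, v in traits.items())
        ]
    return candidates


# A sterile leaf sample sequenced at two of the four loci.
sample = {"matK": "ACGT", "rbcL": "GGCA"}
print(identify(sample))                              # barcode alone
print(identify(sample, traits={"pubescent": True}))  # barcode plus leaf character
```

In this toy library the two Piper placeholders share identical barcodes, so the barcode alone returns both, and the added leaf character reduces the result to one. The second supplementary method mentioned in the text, a clade-specific locus, would amount in this sketch to adding a further key to the reference and query dictionaries.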
The project begins with a floristic inventory of a relatively large region, an inventory that probably comprises between four and five times the number of species known to Indigenous natural historians from any given community and perhaps three times the flora collectively known to Indigenous experts throughout the region. Significant effort is expended to continue to collect material even when it is clear (as it is with orchids, Peperomia spp., and Piper spp.) that all species of a given family or genus that are known to or used by Indigenous people in the region have been collected. The inventory approach of this project has generated the interest and support of the state and national herbaria as well as that of CONABIO (Comisión Nacional para el Conocimiento y Uso de la Biodiversidad), a federal agency charged with documenting and protecting Mexico’s biodiversity. Moreover, during the project’s first year, seven renowned taxonomists from Mexico and the United States have visited the sierra to collaborate on the project and several have indicated an interest in repeated visits to develop parallel projects integrating botanical and ethnobotanical research. Another eighty have supported the project with identifications in the families of their expertise, both of new vouchers collected in this project and of misidentified herbarium specimens previously collected in the region.

The degree to which this project has been able to generate strong support in the botanical community reflects, to a great extent, the manner in which it creates a floristic inventory and enlists the collaboration of expert taxonomists to identify the fertile vouchers, the foundation of the DNA reference library, to species.
Indeed one expert on DNA barcoding mentioned that the greatest value of the project rests more with the creation of a regional flora that is accurately identified by specialized taxonomists and accompanied by permanently preserved DNA extract that could be resequenced at a later date when

85 There is little doubt that in all, or virtually all, cases, the DNA barcode should be able to get the specimen to genus.

technology is more advanced and economical (Peter Hollingsworth, personal communication). Another botanist, Martin Ricker, noted that approximately 81 percent of the 11,219 collections from the Inventario Nacional Forestal y de Suelos (2013) are sterile. Although some 70 percent of the collections have been identified to species, the rest remain unidentified to this level (Ricker et al. 2015). One solution he is attempting is to develop models of leaf morphology that can be applied to the leaves of the collection. But the creation of a DNA barcode reference library, such as that envisioned in the present project, offers an additional methodology for identification to species. Certainly a combination of sequencing and leaf morphology, along with a knowledge of the biodiversity of regional ecosystems, will greatly enhance identification of the large number of sterile specimens.

Within the endangered language community this project has also motivated interest. David Beck, an expert on Totonac, was included in the original grant to document ethnobotanical nomenclature, classification, and use in the Upper Necaxa Totonac village he studies. Recently he has suggested that three or four students replicate the process: intensive collection of vegetative material of plants named, classified, or used in other Totonac communities in the Sierra Norte de Puebla. Should this expansion take place, the project would be able to develop comparative lists of nomenclature and classification of flora in over half a dozen Indigenous villages of the Sierra Nororiental and Norte in Puebla.

A project that builds up a regional DNA barcode reference library has proven, therefore, to be highly attractive to botanists and ethnobotanists, including linguists and anthropologists.
From a botanical perspective it will develop an extensive floristic inventory comprising well over 2,000 expertly determined species, each with an associated four-locus barcode and extracted DNA preserved for future study. From an ethnobotanical perspective it will develop regional sets of plant nomenclature and classification that will facilitate the study of cultural history, particularly migration and contact in a fairly wide region of Indigenous settlements.

As a final observation, it is still debatable whether the type of collaboration envisioned in the DNA-barcode project is truly interdisciplinary. In the sense advocated by the National Science Foundation, wherein interdisciplinary research should have the potential to forge new disciplinary ventures, it is probably lacking. The theoretical focus is mostly within the domain of anthropology, linguistics, and cultural history. But the core systematic questions that are increasingly addressed by molecular analysis, exemplified in such efforts as the Angiosperm Phylogeny Website (http://www.mobot.org/MOBOT/research/APweb/), are only peripherally advanced by the collection of plant tissue associated with expertly determined collections from often poorly studied regions. Full integration and the potential for a new disciplinary venture are not fully realized.

Nevertheless, the type of project mentioned above has demonstrated that a floristic and ethnobotanical project can stimulate extensive collaboration from Indigenous communities to regional and national herbaria while melding the expertise of native natural historians, anthropologists and linguists, and taxonomists and systematists into a cohesive collaborative enterprise that can have significant and long-lasting impact at many levels of activity and expertise. Forging such varied collaboration, as deficient as it may be from a demanding definition of interdisciplinarity, is still not trivial.
An ethnobiologist or linguist focusing on biosemantics will need to be sensitive to the distinct needs of communities on the one hand and herbaria and museums on the other. This may involve collections well beyond

those needed for a simplified, at times simplistic, lexical entry. Field guides and exhibits for local communities should be prioritized and excellent voucher specimens with proper field data in electronic format should be made. At times the social scientist or humanist will need to go out of his or her way to acquire materials that a biologist might request, thus stimulating a desire among these natural scientists to collaborate fully with the project. For herbaria and museums, not only must the collected material be well prepared and expertly documented, but a sufficient number of duplicates should be collected so as to allow deposits in regional and national venues, as well as gifts to the specialists.

5. Summary

This essay began by presenting two perspectives on interdisciplinary research. The first focused on the degree of integration among disciplines to solve a research problem that requires resources beyond those of any single field. The second explored the relative degree to which participants from distinct disciplines were not simply acting as service providers to others, but were participating in an effort that met their own research agenda. Interdisciplinarity, from these perspectives, involves a high degree of integration and an equitable balance of scientific impact on the collaborating disciplines. I suggested that ethnobiological research, particularly that which focuses on cognitive issues of nomenclature and classification, tends to be poorly integrated with biological research. The challenge, then, is how to achieve greater integration and disciplinary importance between the potentially major stakeholders in ethnobiological research: (1) anthropologists or linguists, (2) biologists, particularly taxonomists, (3) native communities, and (4) herbaria and museums.

The second section explored the complexity of Indigenous patterns of nomenclature and classification in an attempt to demonstrate that the most interesting patterns were unevenly distributed across the biological spectrum and most efficiently analyzed (1) through an extensive dataset of natural speech data and (2) through a focused inventory of relevant taxa in the local environment. The first item is the domain of documentary linguistics and in §3 the argument was presented that documentary linguistics and the building of a large corpus of data topicalized on natural history and related themes (e.g., material culture, agriculture, and horticulture) offer significant insight into cognitive and communicative issues related to natural historical knowledge in Indigenous communities.
Language documentation is particularly useful as it focuses on the compilation of recorded, transcribed, and translated verbal communication among native speakers. The second item, a focused inventory, is best addressed through the dedicated participation of taxonomists who specialize in those taxa that are most challenging from a nomenclatural and classificatory perspective.

An initial effort to build collaborative ventures involved taxon-based research and shifting ethnobiological research to target specific families and genera. Such a shift meant assigning resources to extensive collection in the targeted groups but the result is research that addresses the interests of both social scientists and biologists. A second effort involves a more general floristic approach: creating an inventory of regional flora and developing a DNA barcode reference library from this collection. While the resource itself does not address theoretical concerns, it is highly useful across disciplines, from phylogenetic studies (given that the extracted DNA linked

to expertly determined voucher specimens will be preserved for future sequencing and study), through ecology and to ethnobiological research, at a time in which both native and Western taxonomic expertise is increasingly endangered. Moreover, the resource is “cumulative” in the sense that the DNA barcode reference library can be progressively enhanced and thus can increasingly facilitate discovery and analysis in botanical and ethnobotanical research.

Acknowledgements

Research that underlies the perspectives presented in this essay was carried out with support from the National Science Foundation, Documenting Endangered Languages Program, the National Endowment for the Humanities, Preservation and Access, and the Endangered Language Programme at the School of Oriental and African Studies, London. The following awards contributed to the development of the data and collaborations that were key to understanding the cross-disciplinary work situations discussed in this paper: NSF-Documenting Endangered Languages, Award #0756536, “Nahuatl Language Documentation Project: Sierra Norte de Puebla”; Hans Rausing Endangered Languages Project, School of Oriental and African Studies, Award MDP0201, “Corpus and lexicon development: Endangered genres of discourse and domains of cultural knowledge in Tu’un ísaví (Mixtec) of Yoloxóchitl, Guerrero”; NSF-DEL, Award #0966462, “Corpus and lexicon development: Endangered genres of discourse and domains of cultural knowledge in Tu’un ísaví (Mixtec) of Yoloxóchitl, Guerrero”; Endangered Language Documentation Programme, SOAS, Award MDP0272, “Documentation of Nahuat Knowledge of Natural History, Material Culture, and Ecology in the Municipality of Cuetzalan, Puebla”; NSF-DEL and NEH Preservation and Access, Awards #BCS-1401178 and #PD-50031-14, respectively, “A Biological Approach to Documenting Traditional Ecological Knowledge in Synchronic and Diachronic Perspectives”; CONABIO (Comisión Nacional para el Conocimiento y Uso de la Biodiversidad), Mexico, co-PI, Gerardo A. Salazar, Instituto de Biología, Universidad Nacional Autónoma de México, Award ME010, “Floristics, Biodiversity, and Traditional Ecological Knowledge in the Sierra Nororiental of Puebla, Mexico”; and NEH Office of Digital Humanities, Awards HD-228866-15 and HAA-258602-18, “Comparative Ethnobiology in Mesoamerica: A Digital Portal for Collaborative Research and Public Dissemination.”

At the beginning of the reference section the biologists who provided identifications to species of collections cited in this paper are listed. Well over a hundred more botanists and entomologists provided identifications of other collections within the areas of their expertise. The invaluable help of all is greatly appreciated. Particularly important for specimen management and distribution has been the support and collaboration of the following individuals at collaborating institutions: Lawrence Dorr and Erika Gardner at the Department of Botany, Smithsonian Institution; Gerardo A. Salazar and David Gernandt at the Instituto de Biología, Universidad Nacional Autónoma de México; and Allen Coombes at the Jardín Botánico Universitario, Universidad Nacional Autónoma de México. Martin Ricker and Jesús Romero Nápoles generously assisted in processing the necessary collection permits. Pedro Acevedo, Frank Almeda, Allen Coombes, and Douglas Stevens, in addition to providing identifications in their own areas of taxonomic expertise, were always willing to provide initial determinations based on photos, enabling the efficient distribution of plant specimens to the proper expert taxonomists. Finally, my appreciation to Susan Penfield for inviting me to participate in the original workshop and for overseeing the editorial process of this special issue.


References

Acknowledgments

Mutillidae: Kevin Williams
Meloidae: John Pinto
Vespidae: James Carpenter, Matt Buffington
Apidae: Charles Michener, Victor González, Ismael Hinojosa, Terry Griswold, Ricardo Ayala, Robert Minckley, Sam Droege, Sandra Rehan
Cerambycidae: Steven Lingafelter
Formicidae: William Mackay, John Longino, Phil Ward, Rafael Achury, Andy Suarez, Ted Schultz, Sean Brady, Jeffrey Sossa-Calvo, Robert Johnson
Pompilidae: James Pitts
Mantidae: Gavin Svenson
Phasmatodea: Enrique Mariño
Reduviidae: Thomas Henry
Syrphidae: Christian Thomason
Passalidae: Christian Beza-Beza
Lantana: Roger Sanders
Asclepiadaceae: Douglas Stevens
Sphecidae and Crabronidae: Wojciech Pulawski and Michael Ohl

Amith, Jonathan D., and Steven W. Lingafelter. 2016. Ethnoentomological and distributional notes on Cerambycidae (Coleoptera) of Guerrero and Puebla, Mexico, with notes on other insects. Submitted to [Pending].
Andersen, Henning. 2003. “Introduction,” in Henning Andersen, ed., Language Contacts in Prehistory: Studies in Stratigraphy. Philadelphia: John Benjamins, pp. 1–10.
Andersen, Henning, ed. 2003. Language Contacts in Prehistory: Studies in Stratigraphy (Papers from the Workshop on Linguistic Stratigraphy and Prehistory at the Fifteenth International Conference on Historical Linguistics, Melbourne, 17 August 2001). Current Issues in Linguistic Theory 239. Philadelphia, Penn.: John Benjamins.
Balée, William, and Denny Moore. 1991. “Similarity and variation in plant names in five Tupi-Guarani languages (Eastern Amazonia).” Biological Sciences 35(4):209–62.
Berlin, Brent. 1992. Ethnobiological classification: Principles and categorization of plants and animals in traditional societies. Princeton, N.J.: Princeton University Press.
Berlin, Brent, and A. Kimball Romney. 1964. Descriptive semantics of Tzeltal numeral classifiers. American Anthropologist 66:79–98.
Berlin, Brent, Dennis E. Breedlove, and Peter H. Raven. 1968. Covert categories and folk taxonomies. American Anthropologist 70:290–99.


Berlin, Brent, Dennis E. Breedlove, Robert M. Laughlin, and Peter H. Raven. 1973. “Cultural significance and lexical retention in Tzeltal-Tzotzil ethnobotany.” In Munro S. Edmonson, ed., Meaning in Mayan Languages. The Hague: Mouton, pp. 143–64.
————. 1969. “Lexical retention and cultural significance in Tzeltal-Tzotzil comparative ethnobotany.” Paper presented at the 68th Annual Meeting of the American Anthropological Association, 20–23 November, 1969.
Bianchi, F. A. 1962. Notes on the biology of Cissites auriculata (Champion) (Coleoptera: Meloidae). Proceedings of the Hawaiian Entomological Society 18:111–19.
Bierzychudek, P. 1981. Asclepias, Lantana and Epidendrum: A floral mimicry complex. Biotropica 13:54–58.
Bostoen, Koen. 2007. Bantu plant names as indicators of linguistic stratigraphy in the western province of Zambia. In Doris Payne and Jaime Peña, eds., Selected Proceedings of the 37th Annual Conference on African Linguistics, pp. 16–29.
Bowern, Claire. 2007. “On Eels, Dolphins, and Echidnas: Nyulnyulan Prehistory through the Reconstruction of Flora and Fauna.” In Alan Nussbaum, ed., Verba Docenti: Studies in Historical and Indo-European Linguistics, Presented to Jay H. Jasanoff by Students, Colleagues, and Friends. Ann Arbor: Beech Stave Press, pp. 39–53.
————. (work in progress with Patrick McConvell). 2011. “Reconstruction in ethnobiological systems of Australian languages.” PowerPoint presentation. Australian National University.
Brown, Cecil. 1974. Unique beginners and covert categories in folk biological taxonomies. American Anthropologist 76:325–26.
Campbell, Lyle, Terrence Kaufman, and Thomas Smith-Stark. 1986. “Meso-America as a linguistic area.” Language 62:530–70.
Chízmar Fernández, Carla, et al. 2009. Plantas comestibles de Centroamérica. Santo Domingo de Heredia, Costa Rica: Instituto Nacional de Biodiversidad.
Dakin, Karen. 2003. “Uto-Aztecan in the linguistic stratigraphy of Mesoamerican prehistory,” in Henning Andersen, ed., Language Contacts in Prehistory: Studies in Stratigraphy. Philadelphia: John Benjamins, pp. 259–88.
DiCanio, Christian, C. Zhang, J. D. Amith, R. Castillo García, and D. H. Whalen. Forthcoming. Phonetic structure in Yoloxóchitl Mixtec consonants. International Journal of American Linguistics.
————, Hosung Nam, Jonathan D. Amith, Rey Castillo García, and D. H. Whalen. 2015. Vowel variability in elicited versus running speech: Evidence from Mixtec. Journal of Phonetics: Special Issue on the Impact of Stylistic Diversity on Phonetic and Phonological Evidence and Modeling 48:45–59.
————, Jonathan D. Amith, and Rey Castillo García. 2014. The phonetics of moraic alignment in Yoloxóchitl Mixtec. Proceedings of the Fourth International Symposium on Tonal Aspects of Languages, Nijmegen, The Netherlands, 13–16 May.
————, H. Nam, D. H. Whalen, H. T. Bunnell, J. D. Amith, and R. Castillo García. 2013. Using automatic alignment to analyze endangered language data: Testing the viability of untrained alignment. Journal of the Acoustical Society of America 134:2235–46.
————, H. Nam, D. H. Whalen, H. T. Bunnell, J. D. Amith, and R. Castillo García. 2012. Assessing agreement level between forced alignment models with data from endangered language documentation corpora. Proceedings of INTERSPEECH, 2012. Portland, Oregon.
Ellen, Roy. 1993. The cultural relations of classification: An analysis of Nuaulu animal categories from central Seram. Cambridge, England: Cambridge University Press.
Firth, John R. 1957. A synopsis of linguistic theory 1930–1955. In Studies in linguistic analysis, 1–32. Oxford: Blackwell.
Fowler, Catherine S. 1972a. “Comparative Numic ethnobiology.” Ph.D. dissertation, University of Pittsburgh.
————. 1972b. “Some ecological clues to Proto-Numic homelands.” In D. D. Fowler, ed., Great Basin Cultural Ecology: A Symposium. Reno, Nev.: Desert Research Institute Publications in the Social Sciences, no. 8, pp. 105–21.
————. 1983. Lexical clues to Uto-Aztecan prehistory. International Journal of American Linguistics 49:224–57.
————, and Joy Leland. 1967. Some Northern Paiute native categories. Ethnology 6:381–404.
Fox, William J. 1895. Third report on some Mexican Hymenoptera, principally from Lower California. Proceedings of the California Academy of Sciences. 2a series 5:260–72.
Friedrich, Paul. 1970. Proto-Indo-European Trees: The Arboreal System of a Prehistoric People. Chicago: University of Chicago Press.
González, Victor, Timothy Stein, Jonathan D. Amith, and Ricardo Ayala. 2014. New record and the nest of the nocturnal sweat bee Megalopta tetewana (Hymenoptera: Halictidae). Pan-Pacific Entomologist 90(1):40–43.
Grinevald, Colette. 2000. A morphosyntactic typology of classifiers. In Gunter Senft, ed., Systems of nominal classification. Cambridge: Cambridge University Press, pp. 50–92.
Groark, Kevin P. 1996. Ritual and therapeutic use of ‘hallucinogenic’ harvester ants (Pogonomyrmex) in native south-central California. Journal of Ethnobiology 16:1–29.
————. 2001. Taxonomic identity of hallucinogenic “harvester ant” (Pogonomyrmex californicus) confirmed. Journal of Ethnobiology 21:133–44.
Haspelmath, Martin, and Uri Tadmor, eds. 2009. Loanwords in the World’s Languages: A Comparative Handbook. The Hague: De Gruyter Mouton (particularly chap. 1: Martin Haspelmath and Uri Tadmor, “The loanword typology project and the World Loanword Database” (pp. 1–34); Martin Haspelmath, “Lexical borrowing: Concepts and issues” (pp. 35–54); Uri Tadmor, “Loanwords in the world’s languages: Findings and results” (pp. 55–75)).
Hays, Terence E. 1976. An empirical method for the identification of covert categories in ethnobiology. American Ethnologist 3:489–507.
————. 1979. Plant classification and nomenclature in Ndumba, Papua New Guinea Highlands. Ethnology 18:253–70.
Healey, C. 1978–79. Taxonomic rigidity in folk biological classification: Some examples from the Maring of New Guinea. Ethnomedizin 5(3–4):361–84.
Hebert, Paul D. N., Alina Cywinska, Shelley L. Ball, and Jeremy R. deWaard. 2003. Biological identifications through DNA barcodes. Proceedings of the Royal Society of London, series B 270:313–21.
Himmelmann, Nikolaus P. 1998. Documentary and descriptive linguistics. Linguistics 36:161–95.

Howell, Steve N. G., and Sophie Webb. 1995. A guide to the birds of Mexico and Northern Central America. New York: Oxford University Press.
Hunn, Eugene S. 1976. Toward a perceptual model of folk biological classification. American Ethnologist 3:508–24.
————. 1977. Tzeltal folk zoology: The classification of discontinuities in nature. New York: Academic Press.
————. 1982. The utilitarian factor in folk biological classification. American Anthropologist 84:
————. 1992. The use of sound recordings as voucher specimens and stimulus materials in ethnozoological research. Journal of Ethnobiology 12:187–98.
Hyam, Roger, and R. Pankhurst. 1995. Plants and their names: A concise dictionary. New York: Oxford University Press.
Levin, Beth. 1993. English verb classes and alternations: A preliminary analysis. Chicago, Ill.: University of Chicago Press.
Majnep, Ian Saem, and Ralph Bulmer. 1990–. Aps basd skip kmn ak pak ñbelgpal = Kalam hunting traditions. Auckland, New Zealand: University of Auckland, Department of Anthropology.
Martin, Gary John. 1996. Comparative ethnobotany of the Chinantec and Mixe of the Sierra Norte, Oaxaca, Mexico. Ph.D. dissertation, Department of Anthropology, University of California, Berkeley.
Meroz, Yoram. 2013. “Large-scale vocabulary surveys as a tool for linguistic stratigraphy: A California case study.” Paper presented at the 39th Berkeley Linguistic Society Meetings, Feb. 16–17, 2013.
O’Donnell, Sean, and J. H. Hunt. 2013. Group hunting by workers of two Neotropical swarm-founding paper wasps, Parachartergus apicalis and Agelaia sp. Insectes Sociaux 60:369–72.
————, and Frank J. Joyce. 1999. Dual mimicry in the dimorphic eusocial wasp Mischocyttarus mastigophorus Richards (Hymenoptera: Vespidae). Biological Journal of the Linnean Society 66:501–14.
Olivares, Roberto, and Jonathan D. Amith. 2011. Silvestre Pantaleón. Documentary in Nahuatl. 65 minutes.
Pawley, Andrew. 2001. Some problems of describing linguistic and ecological knowledge. In Luisa Maffi, ed., On Biocultural Diversity: Linking Language, Knowledge, and the Environment. Washington, DC: Smithsonian Institution Press, pp. 228–47.
Richards, Owain Westmacott. 1978. The Social Wasps of the Americas. London: British Museum, Natural History.
Ricker, Martin, et al. 2015. Resultados: Determinación taxonómica de los ejemplares de herbario del remuestreo del inventario nacional forestal y de suelos 2009–2013 (año 2013). Ms.
Rosch, Eleanor, and Carolyn B. Mervis. 1975. Family resemblances: Studies in the internal structure of categories. Cognitive Psychology 7:573–605.
Ross, Malcolm, Andrew Pawley, and Meredith Osmond, eds. The Lexicon of Proto Oceanic, vol. 1: Material culture; vol. 2: The physical environment; vol. 3: Plants; vol. 4: Animals. Canberra, Australia: Australian National University.
Si, Aung. 2011. Biology in language documentation. Language Documentation and Conservation 5:169–86.

————. 2016. The traditional ecological knowledge of the Solega: A linguistic perspective. New York: Springer.
Siebert, Frank T. 1967. “The original home of the proto-Algonquian people.” Contributions to Anthropology: Linguistics I. National Museum of Canada Bulletin 214:13–47.
Taller de Tradición Oral de CEPEC and Pierre Beaucage. 1988. Maseualxiujpajmej, Kuesalan, Puebla / Plantas medicinales indígenas, Cuetzalan, Puebla. Puebla, state of Puebla: DIF (Desarrollo Integral de la Familia).
Taylor, Paul Michael. 1984. ‘Covert categories’ reconsidered: Identifying unlabeled classes in Tobelo folk biological classification. Journal of Ethnobiology 4:105–22.
————. 1990. The folk biology of the Tobelo people: A study in folk classification. Smithsonian Contributions to Anthropology, no. 34. Washington, D.C.: Smithsonian Institution Press.
Turner, Nancy. 1973. Plant taxonomic systems and ethnobotany of three contemporary groups of the Pacific Northwest (Haida, Bella Coola, Lillooet). Ph.D. thesis. Dept. of Botany, University of British Columbia.
Whistler, Kenneth W. 1977. “Wintun prehistory: An interpretation based on linguistic reconstruction of plant and animal nomenclature.” Proceedings of the Berkeley Linguistic Society 3:157–74.
Yang, Ya, et al. 2012. Molecular phylogenetics and classification of Euphorbia subgenus Chamaesyce (Euphorbiaceae). Taxon 61:764–89.
