DISCUSSION NOTES Comparative Concepts and Descriptive Categories in Crosslinguistic Studies

MARTIN HASPELMATH
Max Planck Institute for Evolutionary Anthropology

In this discussion note, I argue that we need to distinguish carefully between descriptive categories, that is, categories of particular languages, and comparative concepts, which are used for crosslinguistic comparison and are specifically created by typologists for the purposes of comparison. Descriptive formal categories cannot be equated across languages because the criteria for category assignment are different from language to language. This old structuralist insight (called CATEGORIAL PARTICULARISM) has recently been emphasized again by several linguists, but the idea that linguists need to identify ‘crosslinguistic categories’ before they can compare languages is still widespread, especially (but not only) in generative linguistics. Instead, what we have to do (and normally do in practice) is to create comparative concepts that allow us to identify comparable phenomena across languages and to formulate crosslinguistic generalizations. Comparative concepts have to be universally applicable, so they can only be based on other universally applicable concepts: conceptual-semantic concepts, general formal concepts, and other comparative concepts. Comparative concepts are not always purely semantically based concepts, but outside of phonology they usually contain a semantic component. The fact that typologists compare languages in terms of a separate set of concepts that is not taxonomically superordinate to descriptive linguistic categories means that typology and language-particular analysis are more independent of each other than is often thought.*

Keywords: language typology, comparative concept, descriptive category, crosslinguistic category

1. INTRODUCTION: HOW TO COMPARE LANGUAGES.
The purpose of this discussion note is to argue that crosslinguistic comparison should be based on comparative concepts created by the typologist, rather than on crosslinguistic categories that are instantiated in different languages, and to show how comparative concepts differ from language-specific descriptive categories. Although in practice typologists generally work with such special comparative concepts, this distinction between comparative concepts and descriptive categories has not been articulated clearly before. More commonly, linguists tend to assume that there is a substantial set of universally available CROSSLINGUISTIC CATEGORIES (such as adjective, passive voice, accusative case, future tense, second person, subject, affix, clitic, phrase, WH-movement) from which languages may make a selection (Newmeyer 2007), and which are used both for description/analysis and for comparison. Typological research would then simply consist in identifying adjectives, passives, and so on in each language that has the category, and examining the ways in which the properties of the categories vary across languages. In this approach to crosslinguistic comparison, which we can call CATEGORIAL UNIVERSALISM, it is one of the main tasks of comparative linguists (i.e. typologists) to determine what these crosslinguistic categories are. For linguists working on individual languages, this means that language-particular categories can be expected to be drawn from a relatively small set, that they can be equated with categories of other languages, and that they can be identified with (or instantiate) the crosslinguistic categories. Thus, comparative linguistics is an important necessary prerequisite for analyses of particular languages.

* Earlier versions of this paper were presented at the Max Planck Institute for Evolutionary Anthropology, at the University of Leipzig (during the Leipzig Spring School on Linguistic Diversity), at the Cognitive-Functional Linguistics Conference, University of Tartu (May 2008), and at the Société de Linguistique de Paris (December 2008). I am grateful to the audiences for useful feedback, and to Bob Ladd, Matthew Dryer, Östen Dahl, Edith Moravcsik, Frederick Newmeyer, Geoff Pullum, Gilbert Lazard, Christian Lehmann, Bernhard Wälchli, Andrew Spencer, David Gil, Esa Itkonen, and Susanne Michaelis for interesting discussions of the points of this paper. A number of referees for Language helped me improve the paper; I am especially grateful to Nick Evans and Greg Carlson.

LANGUAGE, VOLUME 86, NUMBER 3 (2010)

Categorial universalism has been uniformly adopted in generative typology since its beginnings, and it appears to be implicitly assumed by many other linguists as well (e.g. Payne 1997, Corbett 2000, Van Valin 2005, Dixon 2010). It is sometimes even assumed that particular categories are universal in the sense not only of being universally available, but also of being universally instantiated (e.g. nouns and verbs in Baker 2003, adjectives in Dixon 2004).

The present discussion note starts out from an alternative view of the tasks of comparative linguistics. Following recent work on the foundations of grammatical typology by various authors (Dryer 1997, Croft 2001, Lazard 2006, Haspelmath 2007, Cristofaro 2009), I assume that grammatical categories are not crosslinguistic entities (either universally available or universally instantiated). Each language has its own categories, and to describe a language, a linguist must create a set of DESCRIPTIVE CATEGORIES for it, and speakers must create mental categories during language acquisition.
These categories are often similar across languages, but the similarities and differences between languages cannot be captured by equating categories across languages. It was one of the major insights of structuralist linguistics of the twentieth century (especially the first half) that languages are best described in their own terms (e.g. Boas 1911), rather than in terms of a set of preestablished categories that are assumed to be universal, although in fact they are merely taken from an influential grammatical tradition (e.g. Latin grammar, or English grammar, or generative grammar, or ‘basic linguistic theory’). This alternative, nonaprioristic approach to categories can be called CATEGORIAL PARTICULARISM. In this approach, language-particular analyses can be carried out independently of comparative linguistics.1

Categorial particularism appears to make crosslinguistic comparison more challenging, but I argue that there exists a coherent and viable methodology for typological research that is compatible with it, which has in fact been employed by most researchers in the Greenbergian tradition (e.g. Greenberg 1963, Mallinson & Blake 1981, Comrie 1989, Dryer 1992, Croft 2003, Haspelmath et al. 2005, Song 2011). This is the use of COMPARATIVE CONCEPTS, that is, concepts specifically designed for the purpose of comparison that are independent of descriptive categories. However, linguists have often been unclear about the way that the apparent paradox of comparability of incommensurable systems can be resolved. It is my goal here to explicate this approach and defend it against challenges from a categorial-universalist perspective such as Newmeyer 2007.

I first give an overview of the crucial notion of comparative concept and characterize descriptive categories, showing that they must be different in different languages, and then argue against the use of crosslinguistic categories.
The heart of the paper is §5, where I give concrete examples of well-known grammatical comparative concepts and show that the corresponding language-particular categories are crucially different. Following this, I address the terminological issues arising from this distinction and review earlier approaches to grammatical comparison, showing that very few earlier authors have made this important distinction explicit, even though in practice many linguists distinguish the two notions implicitly. I then ask how comparative concepts are chosen, concluding that no general answer can be given because multiple perspectives of comparison can be adopted simultaneously without contradiction. Finally, §9 emphasizes that comparative concepts are not simply generalizations over linguistic categories, and that typology cannot be based on the comparison of categories in the sense of structurally coherent units of languages.

1 I use the terms ‘comparative linguistics’ and ‘typology’ interchangeably in this discussion note. ‘Typology’ is often associated with specifically nongenerative approaches, so I generally prefer the broader (and more transparent) but longer term ‘comparative linguistics’.

2. TYPOLOGISTS USE COMPARATIVE CONCEPTS.
Typologists have often observed that crosslinguistic comparison of morphosyntactic patterns cannot be based on formal patterns (because these are too diverse), but has to be based on universal conceptual-semantic concepts (e.g. Stassen 1985:14, 2011, Croft 1990:11–12, 1995:88, 2003:13–14, Heger 1990/91, Givón 2001:20–23, Song 2001:10–12, Haspelmath 2007). As Newmeyer (2007:136) rightly emphasizes, however, ‘typological generalizations need to make reference to the specific form in which these universal concepts are realized as well’ (see also Rijkhoff 2009, 2010).
Typologists make generalizations about phenomena such as case affixes, gender, adpositions, passive constructions, and relative clauses, and none of these can be defined in purely conceptual-semantic terms. Thus, I claim that what crosslinguistic grammatical research is based on in general