Modeling Color Terminology Across Thousands of Languages
Arya D. McCarthy, Winston Wu, Aaron Mueller, Bill Watson, and David Yarowsky
Department of Computer Science
Johns Hopkins University, Baltimore, MD USA
{arya,wswu,amueller,billwatson,yarowsky}@jhu.edu

Abstract

There is an extensive history of scholarship into what constitutes a “basic” color term, as well as a broadly attested acquisition sequence of basic color terms across many languages, as articulated in the seminal work of Berlin and Kay (1969). This paper employs a set of diverse measures on massively cross-linguistic data to operationalize and critique the Berlin and Kay color term hypotheses. Collectively, the 14 empirically-grounded computational linguistic metrics we design, as well as their aggregation, correlate strongly with both the Berlin and Kay basic/secondary color term partition (γ = 0.96) and their hypothesized universal acquisition sequence. The measures and results provide further empirical evidence from computational linguistics in support of their claims, as well as additional nuance: they suggest treating the partition as a spectrum instead of a dichotomy.

Language     Color Word   Literal Gloss
Welsh        brown        brown
Italian      marrone      chestnut
Persian      قهوه ای      coffee + of
Cantonese    咖啡色       coffee + color

Table 1: Examples of terms representing brown, arising from four processes: borrowing (Welsh; from English), null affixing (Italian), derivational affixing (Persian), and compounding (Cantonese).

1 Introduction

How many colors are in the rainbow? An infinite number, but each language divides up perceptual space into a finite number of categories by giving names to colors. The seminal work on color categories, by Berlin and Kay (1969, hereafter B&K), characterizes a universal evolutionary sequence for languages’ core colors (their basic color terms) and their corresponding categories, at each stage refining the partition of color space.

A handful of criteria define basic color terms, including abstractness, monomorphemicity, and not being subsumed by a broader basic term. (See §2 for the complete list.) These criteria are accused of biasing analyses of color systems, especially in non-Western societies (Wierzbicka, 2006). To mitigate this bias, a pan-lingual approach to analyzing color systems may reveal general (“universal”) trends more reliably than smaller datasets. While data are hard to find in the long tail of languages, we still aim to consider more than ever before: 2491 languages and dialects.¹ We leverage natural language processing tools to operationalize long-standing literature on language universals.

We provide a three-pronged investigation of the classic criteria for basic color terms, examining the degree to which color words are abstract (§5), monomorphemic/monolexemic (§6), and salient (§7).

Our operationalization of these (B&K) criteria shows that individual features do not reflect the basic/non-basic divide. Nor is this divide binary, as B&K suggest: We show that abstractness, monomorphemicity, and even salience do not cleanly divide colors.

Nonetheless, by treating basicness as a spectrum and aggregating these features (like human-judged concreteness, frequency of compounding, and word length) into basicness scores (§8), we can largely distinguish between basic and non-basic colors (validating our measures), and our scores recreate the historical sequence of color acquisition in language. The sequence is in no way directly encoded in the criteria for basic color terms; as such, recreating it is a separate and novel empirical discovery.

¹ To this end, we present a large cross-lingual, type-level database of translations of basic and secondary color terms across 2491 languages (§3).
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, pages 2241–2250, Hong Kong, China, November 3–7, 2019. © 2019 Association for Computational Linguistics

2 Color Terminology

Not all languages have the same number of color words; for instance, a single Korean color word (pureu-n) applies to both grass and sky, an unusual concept for native English speakers. Similarly, Russian distinguishes between two families of what English speakers call “blue”: the lighter goluboy and the darker siniy. Reaction time experiments show the cognitive importance of these categories (Gilbert et al., 2006; Winawer et al., 2007), and the existence of a named category both aids (Brown and Lenneberg, 1954) and guides (Bae et al., 2015; Cibelli et al., 2016) color judgment and memory.

Color terms may be concrete (i.e., derived from a real-world referent like “blood” or “sky”) or abstract. Diachronic processes can weaken the link between a concrete term and its referent, until a new cohort of speakers believes the term to be abstract. Indeed, this process explains the development of English color words (Casson, 1994). In addition to metonymy with named things, the words may be borrowed, compounded, or inherited from an ancestor language.

While industrialized societies’ languages possess a wealth of color words (Hardin, 2014), only a handful are considered basic color terms; the remainder are secondary. A basic color term (BCT) must satisfy four obligatory criteria (B&K):

1. It must be monolexemic (and monomorphemic). “Light blue” and “blue-green” each contain two lexemes and do not qualify.
2. It may not possess any color hypernyms (superordinate color terms). (E.g., “lavender” has the hypernym “purple”.)
3. It may not be limited in application to a narrow class of objects. “Blond(e)” may only be applied to a handful of referents like hair, wood, and beer, for example.
4. It must be psychologically salient. This implies that the color term has a stable range of reference across speakers and has an entry in the lexemic inventory of most (if not all) native speakers’ respective idiolects.

Additional criteria are introduced in cases of doubt (Kay and McDaniel, 1978), though these are subjectively applied (Crawford, 1982). Among these: (5) a BCT is not the name of an object that characteristically has a particular color; in other words, the color must be abstract, and not grounded in some concrete object (which rules out colors like gold and salmon). Additionally, (6) recent foreign loan words “are suspect”, and (7) if the lexemic status of the word is difficult to judge, then multimorphemic words are also “suspect”.

In addition to this definition, B&K surveyed speakers of 20 languages in the San Francisco Bay Area, plus a sweeping examination of the literature, to find a sequence to the emergence of color words in language. Cultures with two color words universally used them to distinguish light and warm colors from dark and cool ones; the third color was universally red, and the sequence continued until matching the set of eleven colors represented by English basic color terms. We present their partial ordering in Figure 1, though later authors have proposed alterations (Heider, 1972; Kay, 1975).

[white, black] < [red] < [green, yellow] < [blue] < [brown] < [purple, pink, orange, grey]

Figure 1: The diachronic sequence of color acquisition (Berlin and Kay, 1969).

We are not the first to assess the notion of a basic color term. Crawford (1982) gives a point-by-point rebuttal on pragmatic grounds: the criteria are hard for a field worker to assess, and many introduce subjectivity that will bias data collection. Lucy (1997) argues that the definition provides more of a post-hoc screening tool for when the “denotational net” of elicitation has captured too many terms, as opposed to a morphosyntactically informed approach (e.g., Conklin, 1955). Finally, Wierzbicka (2006) argues that other societies may not share the Western conception of hue-based color terms, making the application of the concept inappropriate. In addition to these postulatory objections, a vast literature of similarity judgments, reaction times, and other human measures debates the question from a cognitive perspective (Heider, 1972; Jameson, 2005; Roberson et al., 2005, 2008; Goldstein et al., 2009; Loreto et al., 2012; Persaud and Hemmer, 2014, inter alia).

By contrast, we examine the conditions empirically, broadly and automatically on a massively multilingual scale (versus manually and theoretically). Our evidence for assessing B&K’s criteria of abstractness, monomorphemicity, and salience comes from a multilingual dragnet of color terms.

3 Data

We investigate the three aspects of our theory assessment (abstractness, monomorphemicity, and salience) through multilingual dictionaries. We additionally leverage English corpora to explore abstractness and salience. We use these to construct a dataset of color senses and translations, with scores along numerous axes. As a final resource to investigate salience, we use a global elicitation of color terms from pre-industrialized societies.

In English, the basic color terms are red, orange, yellow, green, blue, purple, brown, pink, black, white, and grey. These align to the eleven basic color categories identified by Berlin and Kay (1969). In addition to these eleven, we consider a list of 92 second-tier color terms identified by Casson (1994). These were elicited from 30 speakers over several days to ensure salience, then filtered by a dictionary to keep only conventional (rather

… to qualitative observations, our experiments evaluate these qualities through several metrics, illuminating flaws in the definition of “basic color term”. When averaging the 14 features together, the implied total ordering is suggestive of the original B&K sequence.

Goodman and Kruskal’s gamma. We measure correlation between basicness and our features with Goodman and Kruskal’s gamma (Goodman and Kruskal, 1954, 1959, 1963, 1972), which is well suited for comparing binary variables to ordinal ones. It is a pair-counting measure which ignores tied values. We compute it by maximum likelihood estimation, giving an expression:

\[ \gamma = \frac{N_s - N_d}{N_s + N_d}, \tag{1} \]

where $N_s$ is the number of color pairs for which basicness and a feature agree in their ranking; $N_d$ is the number of pairs ranked in opposite orders.
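As an illustration of the pair-counting definition in Equation (1), the following minimal Python sketch counts agreeing and disagreeing pairs directly and skips ties. The function name `goodman_kruskal_gamma` and the toy basicness labels and feature values are hypothetical, chosen only to make the arithmetic easy to follow; they are not drawn from the paper's data, and this is not the authors' implementation.

```python
from itertools import combinations
import math


def goodman_kruskal_gamma(x, y):
    """Return (N_s - N_d) / (N_s + N_d) over all item pairs, ignoring ties.

    x, y: equal-length sequences of numbers (e.g. a binary basicness label
    and an ordinal or real-valued feature score for each color term).
    Returns NaN when every pair is tied on x or on y.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n_same = 0  # pairs ranked in the same order by x and y
    n_diff = 0  # pairs ranked in opposite orders
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            n_same += 1
        elif product < 0:
            n_diff += 1
        # product == 0 means a tie on x or y; such pairs are skipped
    total = n_same + n_diff
    if total == 0:
        return math.nan
    return (n_same - n_diff) / total


# Hypothetical toy data (NOT from the paper): 1 = basic, 0 = secondary,
# paired with an invented feature score for six color terms.
basicness = [1, 1, 1, 0, 0, 0]
feature = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]

gamma_toy = goodman_kruskal_gamma(basicness, feature)
# 9 pairs differ in basicness: 8 agree with the feature ranking, 1 disagrees.
print(f"{gamma_toy:.3f}")  # (8 - 1) / (8 + 1) = 0.778
```

Because basicness is binary, pairs of two basic or two secondary terms are tied and contribute nothing. Only pairs of one basic and one secondary term enter the count, which is why the measure suits a binary-versus-ordinal comparison.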