Dialectal Variation in Spanish Diminutives: a Performance Model
Studies in Hispanic and Lusophone Linguistics 2017; 10(1): 39–66

David Eddington*

Dialectal variation in Spanish diminutives: A performance model

DOI 10.1515/shll-2017-0002

*Corresponding author: David Eddington, Brigham Young University, Provo, UT 84602, USA, E-mail: [email protected]

Abstract: While the diminutive form of most Spanish words is invariant, a great deal of variation is found in bisyllabic words that either contain /je, we/ in the stem (e.g., viejo ‘old’ > viejito/viejecito, pueblo ‘town’ > pueblecito/pueblito), or that end in /e/ (e.g., dulce ‘sweet’ > dulcecito/dulcito), or that end in /jo, ja/ (e.g., rubio ‘blonde’ > rubiecito/rubito). Data from the Corpus del Español indicate that in many cases both diminutive forms exist within a single country. This kind of variation has been accounted for in a number of competence-based studies. However, many of these studies, along with the entities and mechanisms they employ, are not designed to explain actual language processing. The purpose of the present study, on the other hand, is to present a performance model of diminutive formation that accounts for the observed variation. The model assumes that highly frequent diminutives have been lexicalized, and as a result, their production is a matter of lexical retrieval. In contrast, low frequency words are diminutivized based on analogy to the diminutive forms of words stored in the mental lexicon. A data set of existing diminutives in each country was extracted from the Corpus del Español. Using these data sets, a series of computational simulations was performed in order to predict the diminutive allomorphs. The model proved to be highly successful in correctly predicting the diminutives in each country.

Keywords: Spanish, diminutives, analogical modeling, performance model

1 Introduction

Perhaps the most studied aspect of Spanish diminutives is the set of allomorphs of -ito/a (i.e., the short forms -ito/a, and the long forms -cito/a, -ecito/a). One line of research is descriptive and documents how the allomorphs are distributed, some with an emphasis on different varieties of Spanish (e.g., Bradley and Smith 2011; Callebaut 2011; Castillo Valenzuela and Ortiz Ciscomani 2013; Fontanella 1962; Gaarder 1966; Horcajada 1988; Jaeggli 1980; Miranda 1999; Norrmann-Vigil 2012; Rojas 1977). Another focus has been on describing diminutive formation within different theoretical frameworks: lexical phonology (Castro 1998), exemplar theory (Eddington 2002), optimality theory (Bradley and Smith 2011; Colina 2003; Elordieta and Carreiras 1996; Miranda 1999; Smith 2011; Stephenson 2004), maximum entropy modeling (Norrmann-Vigil 2012) and others (Ambadiang 1996, 1997; Bermúdez-Otero 2007; Crowhurst 1992; Prieto 1992).

Diminutivization in Spanish is a very robust process that can potentially apply to all nouns and adjectives, and occasionally other words such as gerunds and adverbs. In the majority of Spanish words that undergo diminutivization, there is no cross-dialectal variation. For instance, the diminutive of mesa ‘table’ is mesita and the diminutive of zapato ‘shoe’ is zapatito throughout the Spanish-speaking world. However, there are certain small areas in the lexicon where diminutives vary a great deal from country to country, and possibly from speaker to speaker. The present paper will focus on three of these, the first being bisyllabic words with /je/ and /we/ in the stem (DIPH, e.g., viejo ‘old’ > viejito/viejecito, pueblo ‘town’ > pueblecito/pueblito). A great deal of variation is also found in bisyllabic words ending in /e/ (FINAL -E, e.g., dulce ‘sweet’ > dulcecito/dulcito, diente ‘tooth’ > dientecito/dientito).
Finally, bisyllabic words ending in -io/a yield different diminutives in different regions (FINAL -IO/A, e.g., indio ‘Indian’ > indiecito/indito, rubio ‘blonde’ > rubiecito/rubito). The central purpose of this paper is to provide a performance model that accounts for how Spanish speakers may determine the correct diminutive allomorph of a given base word, and more importantly, how dialectal variation may be explained.

The paper begins by framing itself as a performance rather than a competence account of the issue. The following section describes the wide variation in diminutive forms found both between and within Spanish-speaking countries. How such variation is handled in competence approaches is briefly reviewed, and analogy is suggested as a model that may explain linguistic performance. However, it is argued that the best model of diminutive variation is one in which both derivation by analogy and retrieval of lexicalized forms play a part.

2 Competence and performance perspectives

Many linguists have followed Chomsky’s lead by emphasizing the study of linguistic competence: the knowledge that a completely fluent, ideal speaker/hearer would have in a completely homogenous linguistic community (Chomsky 1965: 3). Performance, on the other hand, is the processing of language in real time by actual language speakers. Chomsky is quick to clarify the relationship between the formal mechanisms of an analysis of competence and linguistic performance:

Although we may describe the grammar G as a system of processes and rules that apply in a certain order to relate sound and meaning, we are not entitled to take this as a description of the successive acts of a performance model.
(Chomsky 1968: 117)

Kager reflects this same sentiment when discussing optimality theory:

Explaining the actual processing of linguistic knowledge by the human mind is not the goal of the formal theory of grammar … a grammatical model should not be equated with its computational implementation. (Kager 1999: 26)

Performance approaches, in contrast, attempt to understand the way actual speakers learn and process linguistic information in real time. Since performance deals with actual behavior, it should be studied empirically. That is, it must deal with entities that are observable in the speech signal or via the results of psycholinguistic experiments. Entities that are unobservable in the real world and mechanisms that are purely theory-internal are not useful, and therefore not permitted in an empirical study. The principal reason for this is that such entities are not subject to potential falsification (Popper 1968). Finally, an empirical approach requires a hypothesis that makes predictions about linguistic behavior. This contrasts with competence approaches that essentially describe linguistic data, but are not designed to address linguistic behavior. The distinction between competence and performance approaches as they relate to the question of diminutive formation and dialectal variation will be discussed later on in the paper.

3 The corpus data

The purpose of the present paper is to consider variation in diminutive formation across different varieties of Spanish. Most extant studies of the Spanish diminutive do not consider dialect variation. Those that do emphasize how diminutive formation occurs in a particular country, but do not make cross-country comparisons. One exception is Prieto (1992), who gathered intuitions from one or two speakers from seven countries. Another is Callebaut (2011), who extracted diminutives from 14 countries using the CREA corpus¹ and other online sources.
However, in many cases, even this 70-million-word corpus yielded only a handful of instances of diminutives of a particular word in a particular country.

¹ http://corpus.rae.es/creanet.html

The possibility of examining diminutive forms from a variety of countries has been aided by the recent release of the newly updated Corpus del Español.² This corpus contains 2 billion words of Spanish from 21 different countries.³ Roughly 60% of the data come from blogs, so the corpus covers informal registers quite well. Given its size, it should allow the process of diminutive formation to be more rigorously explored across the Spanish-speaking world.

Of course, corpora do have their limitations and drawbacks. Typographical errors in the source documents, as well as errors introduced in the compilation and tagging process, are always an issue for corpora. The country of origin in this corpus was determined by Google’s algorithm,⁴ which is not foolproof and may assign a document to the wrong country. In like manner, the fact that a blog was written in one country does not exclude the possibility that its author may actually be from another. An example of this type of error is seen in the word gurises ‘boys,’ which is used exclusively in Uruguay and to a lesser extent in Argentina, yet the corpus shows scattered tokens of this word in many other countries, which either were produced by Uruguayans abroad or come from miscategorized documents. Another potential problem is that the corpus is divided along national boundaries. Although the data are taken from individual countries, one country may house several dialects that differ in how they form diminutives. Additionally, if the corpus happens to incorporate a document in which one particular author uses a large number of diminutives, that author may skew the results for that country.
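The last concern, that a single prolific document can dominate a country's counts, can be made concrete with a small sketch. The Python below is illustrative only and is not the procedure used in this study: the records, the country codes, and the function names tally_by_country and top_document_share are all hypothetical, and the hits are invented rather than drawn from the Corpus del Español. It tallies diminutive forms per country and reports what share of a form's tokens comes from its single most productive document.

```python
from collections import Counter, defaultdict

# Hypothetical corpus hits: (country, document_id, diminutive form).
# These records are invented for illustration; they are not corpus data.
HITS = [
    ("MX", "doc1", "viejito"),
    ("MX", "doc1", "viejito"),
    ("MX", "doc1", "viejito"),
    ("MX", "doc2", "viejecito"),
    ("ES", "doc3", "viejecito"),
    ("ES", "doc4", "viejecito"),
    ("ES", "doc5", "viejito"),
]


def tally_by_country(hits):
    """Count tokens of each diminutive form within each country."""
    counts = defaultdict(Counter)
    for country, _doc, form in hits:
        counts[country][form] += 1
    return counts


def top_document_share(hits, country, form):
    """Fraction of a form's tokens in a country that come from the single
    most prolific document; values near 1.0 suggest one source dominates."""
    per_doc = Counter(doc for c, doc, f in hits if c == country and f == form)
    total = sum(per_doc.values())
    if total == 0:
        return 0.0  # the form is unattested in this country
    return max(per_doc.values()) / total


counts = tally_by_country(HITS)
print(dict(counts["MX"]))
print(top_document_share(HITS, "MX", "viejito"))
```

In this toy data every Mexican token of viejito comes from one document, so a raw count would overstate how widespread the form is. Any threshold for treating such a share as problematic would be a further assumption.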
In spite of these issues, the new Corpus del Español is currently the best source for looking at diminutive variation by country,