
THE USE OF THIRD PERSON ACCUSATIVE PRONOUNS IN SPOKEN BRAZILIAN PORTUGUESE: AN ANALYSIS OF DIFFERENT TV GENRES

by

Flávia Stocco Garcia

A Thesis submitted to the Faculty of Graduate Studies of The University of Manitoba in partial fulfillment of the requirements of the degree of

MASTER OF ARTS

Department of Linguistics University of Manitoba Winnipeg

Copyright © 2015 by Flávia Stocco Garcia

ABSTRACT

This thesis presents an analysis of third person accusative pronouns in Brazilian Portuguese. To analyze the variation between standard pronouns (those prescribed by normative grammar) and the non-standard pronouns found in oral language, I gathered data from three kinds of TV show (news, non-scripted and soap opera) in order to determine which form of pronoun is more common and whether any linguistic and/or sociolinguistic factors influence their usage. Based on the data collected, I demonstrate that non-standard forms are favored in general and that the rules prescribed by normative grammar for standard forms are followed only in specific contexts.

Among all the variables considered for the analysis, the significant ones were the kind of show, the context of the utterance, the socio-economic status of the speaker and in the . Considering my results, I discuss to what extent the distribution of third person pronouns on TV reflects their use by Brazilians, and a brief discussion of other issues related to my findings concludes this work.

ACKNOWLEDGEMENTS

First and foremost, I would like to express my gratitude to my advisor, Verónica Loureiro-Rodríguez, for all her help during the completion of this work. Her time, patience and insights were very much appreciated. I am also extremely grateful to Kevin Russell for the help and feedback he provided me during this project. This project would have been impossible without their continuous support and guidance.

My sincere and deepest appreciation goes to my parents, who are so far away but whose love, support and encouragement have always been so present and important in this journey. And my special thanks go to my husband, my partner and best friend, who was always there to support me through the difficult and stressful times. I love you all very much!

TABLE OF CONTENTS

Abstract
Acknowledgements
Table of contents
List of tables
List of figures
Abbreviations
1. Introduction
2. Context: Standardization in Brazilian Portuguese
  2.1 What is a standardized language?
    2.1.1 Education and Language Prescriptivism
  2.2 So, what happens in Brazil?
    2.2.1 Dialects and Dialectology in Brazil
    2.2.2 Standard Brazilian Portuguese
  2.3 Brazilian vs. European Portuguese
  2.4 Previous studies on Portuguese pronouns
    2.4.1 Studies on third person pronouns
  2.5 Summary
3. Third person accusative pronouns in Brazilian Portuguese
  3.1 Personal pronouns in BP
  3.2 What are clitics?
  3.3 Clitic placement in BP
    3.3.1 Proclitics
    3.3.2 Enclitics
    3.3.3 Endoclitics
    3.3.4 Clitic placement in a verbal sequence
    3.3.5 Optional usage of clitics
  3.4 The nominative ele ‘he’ as direct object in Brazilian Portuguese
    3.4.1 Prescriptive usage
    3.4.2 Descriptive usage
  3.5 Summary
4. Methodology
  4.1 Corpus
  4.2 Data Collection
  4.3 Method
    4.3.1 Standards vs. non-standards: whole data
    4.3.2 Standards vs. non-standards: enclitics only
    4.3.3 Proclitics vs. enclitics: standards only
  4.4 Summary
5. Descriptive Results
  5.1 Standard pronouns found in the data
    5.1.1 Proclitics
      5.1.1.1 Mandatory uses
      5.1.1.2 Optional uses
    5.1.2 Enclitics
      5.1.2.1 With verbs in the infinitive
      5.1.2.2 With verbs ending in a vocalic sound
    5.1.3 Endoclitics
  5.2 Non-Standard pronouns found in the data
  5.3 Standard vs. non-standard pronouns according to the kind of show
    5.3.1 News shows
    5.3.2 Non-scripted shows
    5.3.3 Soap operas
  5.4 Summary
6. Statistical Results
  6.1 Kind of pronoun: standard vs. non-standard
    6.1.1 Whole data
    6.1.2 Enclitics
  6.2 Clitic placement: proclitic vs. enclitic
  6.3 Summary
7. Discussion
8. Conclusion
References
Appendix

LIST OF TABLES

Table 1. Personal Pronouns in BP
Table 2. Third Person Personal Pronouns in BP
Table 3. Third Person Personal Pronouns in BP
Table 4. TV shows list and number of hours in each category
Table 5. Variables extracted from data
Table 6. Proclitics
Table 7. Enclitics
Table 8. Pronoun distribution according to the kind of show
Table 9. Accusative pronouns in the news shows
Table 10. Accusative pronouns in the non-scripted shows
Table 11. Accusative pronouns used by the hosts in each NS show
Table 12. Accusative pronouns in the soap-operas

LIST OF FIGURES

Figure 1. Regions and northern and southern dialect division
Figure 2. Known dialects of Brazilian Portuguese
Figure 3. Variable importance for random forest 1 (standards vs. non-standards with the whole dataset)
Figure 4. Inference tree 1 (standards vs. non-standards with the whole dataset)
Figure 5. Variable importance for random forest 2 (standards vs. non-standards with enclitic structures)
Figure 6. Inference tree 2 (standards vs. non-standards with enclitic structures)
Figure 7. The vicious circle of linguistic discrimination
Figure 8. Sign with showing a non-standard form of first person accusative pronoun

ABBREVIATIONS

1      first person
2      second person
3      third person
ACC    accusative
COND   conditional
DAT    dative
F      feminine
FUT    future
GER    gerund
IMP    imperative
IMPF   imperfect
M      masculine
NEG    negative
NOM    nominative
PART   participle
PAST   past
PL     plural
PRES   present
REFL   reflexive
SG     singular
SUBJ   subjunctive

CHAPTER 1

INTRODUCTION

The present study is an analysis of third person accusative pronouns in spoken Brazilian Portuguese (BP). Given the considerable variation in the rules and usage of these pronouns, my purpose is to analyze the variation between the standard forms (prescribed by normative grammar) and the non-standard forms found in oral language.

Oral Brazilian Portuguese often differs from the normative variety taught at schools and prescribed by normative grammar (Rocha Lima, 1998; Cunha & Cintra, 2007; Bechara, 2009). An illustrative example is the use of the third person accusative pronouns, which are often employed in a way that normative grammar deems unacceptable. In this research, I attempt to show that non-standard pronouns are much more common than standard ones in oral speech and that the rules prescribed by normative grammar are followed only in specific contexts.

Normative grammar identifies the clitics o(s) (masc.) and a(s) (fem.) as the third person accusative pronouns. However, Brazilian Portuguese speakers show a preference for the nominative forms ele(s) and ela(s) instead. According to normative rules, example (1) is grammatical, while example (2) is not. One would find example (1) only in very formal contexts, such as political speeches or university lectures, and in formal written language. Example (2) reflects the structure most Brazilian people would actually use in oral and even informal written contexts.

(1) A mulher beijou -o.
    the woman kiss.3.SG.PAST -him.ACC
    ‘The woman kissed him.’

(2) *A mulher beijou ele.
    the woman kiss.3.SG.PAST he.ACC
    ‘The woman kissed him.’

The normatively prescribed placement and form of third person accusative pronouns differ from real use as well. Normative grammar states that these pronouns can be enclitic, proclitic or endoclitic depending on factors such as tense, verb ending and lexical attraction. For instance, if the verb has a nasal ending and the pronoun comes after the verb, -as must change to -nas (example 3), although Brazilian speakers use the form shown in (4).

(3) Peguem -nas, por favor.
    take.3.PL.IMP -them.F.ACC please
    ‘Take them, please.’

(4) Peguem elas, por favor.
    take.3.PL.IMP they.F.ACC please
    ‘Take them, please.’
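The prescribed form in (3) follows a spelling rule that can be stated mechanically. The following is a minimal Python sketch of the nasal-ending case only; the helper name `attach_enclitic`, the list of nasal endings and the error handling are illustrative assumptions, not part of this study's method. Normative grammar prescribes a different adjustment after verb forms ending in -r, -s or -z, which the sketch deliberately refuses to handle rather than produce a wrong form.

```python
# Illustrative sketch only: a hypothetical helper, not the author's procedure.
# It models just the nasal-ending rule mentioned in the text.

NASAL_ENDINGS = ("m", "ão", "õe")           # assumed list of nasal verb endings
THIRD_PERSON_CLITICS = ("o", "a", "os", "as")


def attach_enclitic(verb: str, clitic: str) -> str:
    """Attach a third person accusative clitic after the verb (enclisis)."""
    if clitic not in THIRD_PERSON_CLITICS:
        raise ValueError(f"not a third person accusative clitic: {clitic!r}")
    lowered = verb.lower()
    if lowered.endswith(("r", "s", "z")):
        # These endings trigger another normative adjustment, not modelled here.
        raise ValueError("verbs ending in -r, -s or -z are outside this sketch")
    if lowered.endswith(NASAL_ENDINGS):
        # Nasal ending: -o/-a/-os/-as become -no/-na/-nos/-nas.
        return f"{verb}-n{clitic}"
    # Otherwise the clitic is attached unchanged.
    return f"{verb}-{clitic}"


print(attach_enclitic("Peguem", "as"))  # Peguem-nas, as in example (3)
print(attach_enclitic("beijou", "o"))   # beijou-o, as in example (1)
```

The non-standard variant in (4) needs no such rule, since the nominative form simply follows the verb as a separate word.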

The corpus used for this study comes from TV programs broadcast by Rede Globo, the largest Brazilian network. I focused on three kinds of programs: news shows (which are scripted), non-scripted shows (interviews and variety shows) and soap operas (Brazilian ). The whole corpus consists of 1252 tokens found in more than 226 hours (approximately 75 hours from each of the three kinds of show). I collected sentences containing at least one third person object pronoun from the TV shows described above and considered a number of linguistic and sociolinguistic factors (such as verb tense and the speaker’s gender) from each sentence to check whether any of them would influence the choice of pronoun (standard or non-standard) or the placement of the clitic when standard forms were used (before or after the verb, for example).
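As a rough illustration of how such coded tokens could be tabulated, the Python sketch below represents each token as a record and counts pronoun kinds per kind of show. The `Token` fields and the `distribution` helper are hypothetical names. The four toy records reuse examples (1) to (4) with invented show labels, so they are not corpus data and the counts they yield are not results of this study.

```python
# Illustrative sketch only: field names and toy records are assumptions.
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Token:
    sentence: str
    show_kind: str            # "news", "non-scripted" or "soap opera"
    pronoun_kind: str         # "standard" or "non-standard"
    placement: Optional[str]  # "proclitic", "enclitic", "endoclitic"; None if non-standard


def distribution(tokens: Iterable[Token]) -> Counter:
    """Count tokens for each (kind of show, kind of pronoun) pair."""
    return Counter((t.show_kind, t.pronoun_kind) for t in tokens)


# Toy records built from examples (1)-(4); the show labels are invented.
toy_tokens = [
    Token("A mulher beijou-o.", "news", "standard", "enclitic"),
    Token("A mulher beijou ele.", "soap opera", "non-standard", None),
    Token("Peguem-nas, por favor.", "news", "standard", "enclitic"),
    Token("Peguem elas, por favor.", "non-scripted", "non-standard", None),
]

counts = distribution(toy_tokens)
for (show, kind), n in sorted(counts.items()):
    print(f"{show:13s} {kind:13s} {n}")
```

Keeping one record per token means further variables (verb tense, speaker’s gender, and so on) can be added as fields without changing the counting step.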

With my analysis, I will try to answer the following questions:

1) What is the distribution of standard and non-standard accusative pronouns on the TV shows analyzed?

2) Are there any linguistic or sociolinguistic factors that influence the choice of pronoun or the placement of clitics?

3) To what extent does the distribution of third person pronouns on TV reflect their use by Brazilians?

This research is divided as follows:

In Chapter 2, I will set up a context regarding the language used in Brazil. I will first discuss concepts such as language standardization, prescriptivism and descriptivism, and then I will turn to the language situation in Brazil, which has Portuguese as a national language but also has many dialects spoken in its territory. In this chapter, I will also show some basic differences between Brazilian and European Portuguese. Finally, this chapter will include a review of previous studies on pronouns in (Brazilian) Portuguese and, specifically, on third person pronouns, which are the focus of my research.

In Chapter 3, I will outline a grammatical description of the usage and placement of third person accusative pronouns in Portuguese. The rules described here follow the 2009 edition of Moderna Gramática Portuguesa ‘Modern Portuguese Grammar’ by Evanildo Bechara¹. In this chapter I will focus on the distribution of the third person pronouns, as this is the main topic of my study. Moreover, I will present the types of clitics (proclitics, enclitics and endoclitics) and the syntactic, morphological or phonetic rules that determine their distribution. This chapter also addresses the use of the nominative pronoun as direct object, and I will present a brief description of its prescriptive and descriptive use in Brazilian Portuguese.

Chapter 4 is dedicated to the methodological aspects of my research. I will explain how I built the corpus and how the data collection took place, and I will indicate which linguistic and sociolinguistic variables were considered in my investigation. Later in this chapter I will explain the method I used for my statistical analysis, namely random forests and inference trees (Janda et al., 2012; Tagliamonte & Baayen, 2012).

In Chapter 5 I am going to explore and describe the data quantitatively, focusing on the distribution of accusative pronouns according to the three kinds of shows. I will also analyze the distribution of standard forms according to their placement and compare them to the rules described in Chapter 3 and illustrate with examples. In Chapter 6 I will describe the statistical results of my analysis.

¹ This is the 37th edition of the Portuguese grammar written by Evanildo Bechara, who is an emeritus professor at Rio de Janeiro State University (UERJ) and Fluminense Federal University (UFF), and a member of the Brazilian Academy of Letters and of the Brazilian Academy of Philology. He is also the Brazilian representative for the New Orthographic Agreement, an international accord among the countries that have Portuguese as an official language, whose purpose is to create a unified orthography for Portuguese.

Chapter 7 presents the discussion of my findings. This chapter is based on my research questions, and I will use my data and results to attempt to answer them. I will address some topics regarding the usage of non-standard forms in Brazilian Portuguese and some issues that might be related to this occurrence.

The final chapter of this thesis is a brief conclusion of this work.

CHAPTER 2

CONTEXT: STANDARDIZATION IN BRAZILIAN PORTUGUESE

My aim in this chapter is to set out the context of standardization in Brazilian Portuguese. I introduce concepts that are important to the study of language, such as standardization, prescriptivism and descriptivism, and linguistic awareness. I then discuss the language situation in Brazil, dialects and dialectological studies, and standard Brazilian Portuguese. I also provide a short section on some differences between Brazilian and European Portuguese. Further sections in this chapter refer to previous studies on Portuguese pronouns, especially on third person pronouns, which are the topic of my research.

2.1 What is a standardized language?

The term standardized language refers to a language which has one variety that has undergone standardization (Trudgill, 1999, p. 117). A clear definition of standardization is given by Deumert (2004, p. 2), who says “standardization can be conceptualized as a movement towards linguistic uniformity through a competition-selection process: certain variants of linguistic habits are selected as part of the standard norm and are generalized to new linguistic and communicative contexts”. In other words, it is a process by which a dialect is imposed as the standard variety, the one that describes the regular, or “standard”, uses of a specific language and that members of the speech community will perceive as the most prestigious.

The process of standardization is one which language communities undergo over a significant period of time and that can affect different levels (phonological, syntactic, lexical) of a language (Milroy and Milroy, 1998, as cited in Johnston, 2003). A standard language is the variety regarded as the most ‘correct’ in the sense that it shows no variation at any of those levels. From a broader perspective, it promotes social unification and a common language identity. Non-standard varieties, on the other hand, may differ in pronunciation, grammar and lexicon from the standard form of a language. These varieties are “characterized by a multiplicity of highly context-specific, particularistic norms which emerged in response to the local needs of the loosely networked social groups which make up the speech community” (Lodge, as cited in Deumert 2004, p. 3). Therefore, individuals who are not part of the group that speaks the standard variety, for social, ethnic or geographical reasons, will use a language characterized by variation at many different levels.

Dialects, language varieties that differ from each other, can be associated with distinct social groups (sociolects), with a particular geographical area (regional dialects) and even with a certain ethnic or cultural group (ethnolects). Essentially, a dialect shows systematic linguistic distinctions from some other variety of the same language and is spoken by a socially identifiable subgroup of some larger speech community (O’Grady & Archibald, 2000, p. 495).

Standard varieties are socially prestigious and associated with status, power and the “educated speakers” of a community; they follow grammatical and prescriptive norms. They are often the local dialects spoken in the regions that hold cultural, political and economic influence over the population. Many languages have a ‘standard form or variety’. For example, Standard English is one variety among many of the English language and, according to Trudgill (1999), it may be the most important variety of English in all sorts of ways: it is the variety normally used in writing, especially in printing; it is the variety associated with the education system in all the English-speaking countries of the world, and is therefore the variety generally spoken by those who are often referred to as “educated people”; and it is the variety taught to non-native learners. Standard English speakers can be found in all countries where English is a native language, and they speak this variety with different accents (phonetic differences in pronunciation) depending on where they are from (the United States, England, Ireland, Australia, etc.).

Every standard language, by virtue of continuous change and conscious elaboration, contains a minimum level of variation (Joseph, 1987, p. 127). For every speaker (of standard varieties or not), language varies from utterance to utterance with regard to sound, intonation, meaning, word and sentence structure. At a conscious or unconscious level, a speaker makes choices about how a language is produced and perceived, and it is in the production and perception of speech sounds as systematic entities functioning in relationship to each other that there is perhaps the greatest potential for variation in language (Lippi-Green, 2012, p. 23).

Every speaker has an innate ability, or a linguistic awareness, that is connected to the use of linguistic variants (linguistic options, alternatives) and the evaluation that comes along with them. As soon as language becomes a variable commodity, the variations are subject to value judgments and assignments of prestige just like every other attribute, aptitude, and possession within human consciousness (Joseph, 1983, p. 31).

2.1.1 Education and Language Prescriptivism

This idea of linguistic awareness is directly associated with our ability to perceive differences. The fact that linguistic variants exist and are apparent to speakers supports the existence of a conscious awareness of language, at which standardization begins. As a consequence of this awareness, preferences and judgments regarding language will also take place. As Joseph (1983, p. 16) states:

“For any number of possible reasons, wherever variants are in competition, one will always be preferred to the other, creating hierarchies which it is the task of language education to inculcate. The canonical form of such education is ‘Say x, not y.’” (Joseph, 1983, p. 16)

The ‘say x, not y’ statement falls within the prescriptive approach to language. Prescriptivism involves the study of different aspects of language following normative practices, i.e., it follows the grammatical norms. Linguistic prescriptivism involves judgments of what is considered good and correct, and it aims to establish a standard language. It is conservative and defended by purist linguists and grammarians.

Language education falls into the prescriptive category. It is through education that standard languages are taught and a community of speakers is maintained. By limiting the access to non-standard varieties, educational institutions can control who has access to acquiring the ‘pure’ language and determine who is part of the ‘educated’ society, which involves the sociopolitical aspect of language. Regarding this matter, Huebner (1999, p. 9) claims that the political function of language concerns power: language reflects, reinforces and acts upon relationships of political and economic dominance and inequity. Therefore, keeping a standard language that is hard to attain is in the interest of the most powerful and prestigious socioeconomic classes. The cultural institutions of writing and education have served toward this end for a far greater portion of history than they have served the opposite, democratic purpose of bringing culture to the masses (Joseph, 1987, p. 45).

Besides this association with prestigious and powerful speakers, the preference for a standardized language has practical aspects to be considered as well. For example, in the language used in writing, the structure of sentences and the spelling of words tend to follow the dialect of the upper-class speakers. Johnston (2003, p. 433) states that “literacy permits the efficient diffusion of this standard through correspondence, official papers, religious documents, and, most important, the educational system”. Literate people learn how to write in one way, the standard or ‘proper’ way, and they do not learn that spontaneously in social interactions; the norms they learn are acquired in school, through orthographic rules, grammar books and dictionaries.

However, despite the existence of all these norms and rules, the study of language does not follow only prescriptive methods. An alternative approach to prescriptivism, descriptivism is concerned with describing how language is really used, and with observing and explaining the linguistic patterns of speakers that do not follow formal usage rules (prescriptions). From a descriptive point of view, there is no place for value judgments towards language; there is no ‘wrong usage’, but rather ‘different usage’. Descriptivists study the many variations (at all levels) within a language, the unconscious rules that do not follow the standard norms.

The prescriptive vs. descriptive dichotomy can be related to the standard vs. non-standard, formal vs. informal, written vs. spoken, and literary vs. popular contrasts found in language, that is, to the contrast between the usages that follow the norms and are taught at schools and the usages found in spontaneous and natural speech.

2.2 So, what happens in Brazil?

2.2.1 Dialects and Dialectology in Brazil

Language is a very strong element of Brazil’s national unity. Portuguese, the national language of Brazil², is spoken by almost 200 million people (Instituto Brasileiro de Geografia e Estatística, IBGE, 2013 data). Given that Brazil has an area of 8,515,767 km² (IBGE data), which makes it the fifth largest country in the world, the fact that the language spoken is uniform can be surprising. However, such a large territory has led to the existence of many different regional dialects that, despite showing great phonological and lexical variation, are mutually intelligible.

According to Antenor Nascentes (1922)³, there are two main groups of dialects in Brazil: northern Portuguese, which includes the north and northeast regions plus the Amazon, and central-southern Portuguese, which includes the central and southern regions (Figure 1). The map below shows the division of regions and the division between the two groups of dialects.

² Portuguese is the national language of Brazil and Portugal. It has been made the official language of Angola, Mozambique, the Cape Verde Islands, Guinea Bissau, and the São Tomé and Príncipe Islands. In Asia, only Macao has officially kept Portuguese, which is also spoken in the eastern part of the island of Timor. Large communities of emigrants keep Portuguese alive in North America and in various countries in Europe. (Mateus & d’Andrade, 2000, p. 2)

³ Antenor Nascentes was a Brazilian philologist, etymologist, dialectologist and lexicographer who contributed a great deal to the study of the Portuguese language in Brazil. In O Linguajar Carioca ‘The Carioca Dialect’, published in 1922, he proposes this dialectal division, which has been followed by many other dialectologists since.

Figure 1. Regions and northern and southern dialect division (wikipedia.org)

It is possible to find many other dialects within each of these two groups. Even though there is no official list of all the dialects spoken in Brazil⁴, the image below illustrates the distribution of the best-known dialects.

⁴ There are a number of publications, including regional linguistic atlases, but nothing official about the whole country yet. There is an ongoing project, Projeto Atlas Linguístico do Brasil, that aims at creating a broader atlas of the Portuguese language in Brazil.

Figure 2. Known dialects of Brazilian Portuguese (wikipedia.org)

Numerous linguists have contributed to the dialectological literature in Brazil, such as Rodolfo Garcia with his Dicionário de Brasileirismos ‘Brazilian Dictionary’ (1915), Pereira da Costa with Vocabulário Pernambucano ‘Vocabulary from Pernambuco’ (1937), Mário Marroquim with A Língua do Nordeste ‘The Northeastern Language’ (1934), and Amadeu Amaral with Dialeto Caipira ‘The Hick’s Dialect’ (1920). Some of these studies also provide a diachronic view of Brazilian , like Clóvis Monteiro’s A linguagem dos Cantadores ‘The Language of the Singers’ (1933).

One noteworthy diachronic study is the NURC Project⁵ (Projeto NURC - Projeto da Norma Urbana Oral Culta), which studied the speech of people holding a university degree in five Brazilian metropolises from the late 1960s to the 2000s. Among other things, this research project aimed to describe and study the Brazilian Portuguese spoken by educated people in every aspect, including phonetics, phonology, syntax, morphosyntax, lexicon and stylistics. The collection contains material that has already been used in countless studies and publications, such as the series Gramática do Português Falado ‘The Grammar of Spoken Portuguese’ (2003), which contains eight published volumes.

⁵ Please refer to http://www.letras.ufrj.br/nurc-rj/ for more information on this project.

2.2.2 Standard Brazilian Portuguese

In every society, among all the varieties of a language, there is one that is considered to be the standard. In the case of Brazilian Portuguese, the standard varieties are the ones spoken in Rio de Janeiro (carioca) and São Paulo (paulistano). There is a cultural prestige associated with these dialects, since these two cities are the most important urban centers of the country. Due to their economic and cultural influence over the rest of the country, their dialects are perceived as the “standard” Portuguese. The influence of the media has played a big role in this, since these are the varieties found in the most important newspapers, radio and TV shows. The media is, perhaps, the biggest factor in spreading these two dialects as the most common ones: the largest TV networks are based in these two cities and their programs are broadcast nationwide (many soap operas are transmitted to other countries as well), so all the other regions are exposed to these two varieties on a daily basis.

Television is a powerful means of communication in Brazil, exerting influence on the politics, economy and culture of the country (Porcello, 2009; Rizzotto, 2012). Therefore, the language used on TV (the varieties from Rio de Janeiro and São Paulo being the most used) holds a very significant influence over the language in general; it might affect speakers of other varieties throughout the country. There is evidence (see Carvalho, 2004) that even Uruguayan Portuguese (a rural variety spoken by bilingual communities along the Uruguayan and Brazilian border) has undergone changes due to exposure to Brazilian television, which supplies a linguistic model for smaller groups. Carvalho’s study shows the incorporation of new phonological variants that are characteristic of the urban varieties (like the palatalization of the dental stops in ‘di’ and ‘ti’), which shifts this dialect away from its rural origin, and shows the incorporation of “linguistic features that are stereotypically Brazilian, as the result of a desire to emulate speakers of larger urban monolingual communities in central Brazil, whose dialect is shown daily on television” (Carvalho, 2004, p. 128). The language used on Brazilian television is an important topic for the present study and will be taken up again in my discussion chapter.

Regarding the written language, standard written Portuguese has helped to maintain the unity of the language over the entire national territory and to ensure that all the regional varieties remain mutually intelligible. The formal written language is used in almost all printed media and written communication, and is the standard Portuguese taught at school. This written language has been based on the standards of Portugal, since both varieties of Portuguese share the same normative grammar. Written Brazilian Portuguese differs significantly from the spoken language, with only an educated or conservative portion of the population following the prescriptive norms dictated by normative grammar. The Portuguese priest Fernão de Oliveira was the first grammarian to compile a grammar book of the Portuguese language, Grammatica da lingoagem portuguesa (1536), and was followed by other Portuguese writers like João de Barros with his Gramática da Língua Portuguesa (1540). The first Brazilian grammar book, Compêndio da Grammatica Portugueza, was published in 1829 by the priest Antonio da Costa Duarte. In more recent years (with editions ranging from 1984 to 2013), names such as Celso Cunha, Rocha Lima and Evanildo Bechara are well known for having contributed to the development of Portuguese normative grammar in Brazil.

2.3 Brazilian vs. European Portuguese

Brazilian and European, the two most important varieties of Portuguese⁶, can differ in terms of grammar usage, vocabulary and spelling (a little less after the New Orthographic Agreement, which I will mention below); however, phonology is the most salient difference between the two varieties. The most noticeable differences are found in the unstressed vowel system, which makes the prosody of each variety very distinctive. For example, European Portuguese unstressed vowels are shorter in duration and frequently reduced or even deleted in words, whereas in Brazilian Portuguese there is a lighter reduction of unstressed vowels and no deletion.

Other differences include the vocalization of post-vocalic l in Brazil in contrast to a clear l in Europe (e.g., mel ‘honey’ is pronounced [mɛw] in Brazil and [mɛl] in Portugal); the position of in sentences (Portugal: Lá não vou; Brazil: Não vou lá ‘I’m not going there’); and some vocabulary differences (e.g., ‘Canadian’ is Canadense in Brazil and Canadiano in Portugal) (Mattoso Câmara Jr., 1965).

⁶ See footnote 1 for varieties of Portuguese in Brazil.

In 1990, the Community of Portuguese Language Countries (CPLP), which included representatives (linguists, academics, writers) from all countries with Portuguese as the official language, reached an agreement (the New Portuguese Language Orthographic Agreement) on the reform of Portuguese orthography to unify the two varieties. In effect in Portugal since 2008 and in Brazil since 2009, the agreement reduced many of the spelling differences between the two varieties. Brazil had a change of 0.5% in its vocabulary and the other countries changed 1.6% of theirs (Ministério da Educação ‘Ministry of Education’ data). Some of the main changes were: the insertion of the letters ‘k’, ‘w’ and ‘y’, which were not part of the Portuguese alphabet; the simplification of the rules involving the hyphen; the elimination of the acute accent in words in which the penultimate syllable contains the tonic ‘ei’ or ‘oi’, such as ideia ‘idea’ (previously idéia) and joia ‘jewelry’ (previously jóia); and the complete elimination of the umlaut in words containing ‘qu’ or ‘gu’ where the ‘u’ was pronounced, like linguiça ‘sausage’ (previously lingüiça) or sequência ‘sequence’ (previously seqüência).
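The last change listed above is simple enough to express as a one-step respelling. The Python sketch below is a hypothetical helper (`remove_trema` is an invented name), offered only as an illustration: it assumes ordinary Portuguese vocabulary as input and applies a blanket replacement, so it would wrongly alter foreign proper names such as Müller. The accent change affecting ‘ei’ and ‘oi’ depends on stress position and is not modelled.

```python
# Illustrative sketch only: a hypothetical helper, not an official conversion tool.
import unicodedata


def remove_trema(word: str) -> str:
    """Respell a word without the umlaut (trema) on 'u'."""
    # Normalize first so that 'u' + combining diaeresis becomes a single 'ü'.
    composed = unicodedata.normalize("NFC", word)
    return composed.replace("ü", "u").replace("Ü", "U")


print(remove_trema("lingüiça"))   # linguiça
print(remove_trema("seqüência"))  # sequência
```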

2.4 Previous studies on Portuguese pronouns

One aspect of the differences between Brazilian and European Portuguese that receives a lot of attention is the distribution of personal pronouns. Portuguese personal pronouns display a high degree of and their use displays a high degree of variation, which makes this word class notably attractive to investigate.

Investigating the distribution of unstressed, or atonic, pronouns (also known as clitics) can be challenging. This topic has been the focus of investigation of many well-known Brazilian and Portuguese philologists and grammarians (Rocha Lima, 1980; Cunha & Cintra, 1985; Bechara, 1999), as well as of more recent studies, as we will see below. Attempts at explaining the usage of Portuguese clitics may focus on the syntax, on the syntax-phonology interface or, as in the present study, on the sociolinguistic aspect of their use.

From a syntactic point of view, Barrie (2000) looks at clitic placement and verb movement in European Portuguese, and tries to identify the grammatical factors that could determine the position of the clitics. Galves (2003) also focuses on the European variety and proposes a historical approach to clitic placement in Portuguese with a focus on the relationship between syntax and phonology. Also following a syntactic approach, Moura (2012) presents a comparative description of four of the most used grammar books regarding the usage of accusative pronouns (pronouns filling the function of objects in sentences).

Pereira (1981) studied the high variation related to the placement of unstressed pronouns in Brazilian Portuguese and also considered extralinguistic factors, such as sex and age, in determining the level of variation. Her corpus consisted of newspaper articles, old manuscripts, and interviews with male and female participants of different ages and socio-economic status. In the written corpus she found many instances of proclitics and enclitics; endoclitics, however, were very rare. Pereira suggested that the biggest factor determining pronoun placement in most genres was lexical attraction. As for the spoken corpus, she found that the rules involving the usage of pronouns are different from those of the written language. She concluded that proclisis was more common, that there was no endoclisis, and that enclisis would only occur with a few pronouns. As for the most important factors influencing this variation, she pointed to the syllable structure of the pronoun as the most significant linguistic element (i.e. pronouns with a CV structure tended to be more common and proclitic) and to two sociolinguistic factors: age, with older people using more clitics than younger people, and sex, with men using more clitics than women. This means that younger people and women were more likely to delete or substitute clitic pronouns and, thus, were more likely to contribute to linguistic change.

2.4.1 Studies on third person pronouns

As will be illustrated in the next chapter, third person pronouns behave differently from those of the first and second person, and thus have received more research attention.

Some researchers have focused on syntactic descriptions and comparisons between two or more varieties of Portuguese in an effort to contribute to the investigation and documentation of the similarities and differences found among varieties. Freire (2005) thoroughly compared third person object pronouns in written Brazilian and European Portuguese, and Vieira (2003) presented a comparative analysis of the placement of clitics in the varieties from Brazil, Portugal and Mozambique.

Moreover, as in the present study, considerable attention has been devoted to the sociolinguistic aspect of the distribution of Portuguese unstressed accusative third person pronouns (Omena, 1978; Bernardes, 1981; Saraiva, 2008; Bachman, 2011; Pagotto & Duarte, 2005; Ferreira & De Alkmim, 2011), as their use involves a high degree of variation. These studies investigated and compared their use in oral and/or written language based on factors such as age, social class, level of education, or region.

An early example is Omena (1978), who investigated the different forms of third person object pronouns in the speech of uneducated adults from Rio de Janeiro. Considering the existence of a variation that includes nominative forms − ele(s), ela(s) − and accusative forms − o(s), a(s) − being used as objects, she was interested in observing the occurrence of these pronouns in spontaneous speech. Besides the frequency of the occurrences, she was interested in finding patterns that could condition the use of these pronouns and in determining whether their usage was merely random. In her data (built from interviews with 4 participants and a total of 28 hours of recording), she found mostly examples of the nominative forms being used as objects (not following grammatical rules) and examples with the deletion of the pronoun. She concluded that uneducated speakers were not aware of the grammatical rules involving the accusative pronouns and used nominative forms instead or simply deleted them. She determined that the linguistic factors with the greatest influence on her results were the animacy of the antecedent (animate antecedents, especially with a +human feature, favor nominative forms) and the syntactic function of the antecedent (if the antecedent was a subject it would favor nominative forms, if it was a , it was likely to be deleted). She also analyzed archaic documents and found that they contained similar examples, which she took as evidence that this kind of variation has always existed in spoken Portuguese.

In another study, Bernardes (1981) investigated the same phenomenon in texts written by high school students and found that there is a lot of variation in written language as well as in spoken language. The socio-economic group of the students proved to be a significant factor in the kind of pronoun used, meaning that only the subjects belonging to a higher socio-economic group would make use of the pronouns as taught by normative grammar (or “standard forms”, as I present later in Chapter 4).

Bachman (2011) developed an analysis of third person accusative pronouns in a corpus of Brazilian TV evening news programs. Her findings revealed that the use of pronouns (both clitics and tonic) was extremely low and that lexical phrases and passive constructions were used to establish anaphoric reference instead. She argued that the format of this kind of show (scripted) could explain this pattern, which “underlines their role as a provider of quality news, as well as the audiovisual quality of its content, which requires adherence to accepted standards for spoken language” (Bachman, 2011, p. 2).

In a similar vein to Bachman’s paper, my study turns attention to describing the variation across different TV genres. I focus on the language used in TV programs to investigate the sociolinguistic reality of Brazilian Portuguese; however, I investigate the language used not only in news programs, but also in non-scripted shows and in soap operas. The purpose of this study is to analyze the variation between the use of the standard (prescribed by normative grammar) and non-standard third person accusative pronouns in spoken Brazilian Portuguese. My goal is to find out which pronouns are more common in each of the TV shows analyzed and to determine whether the kind of language used in them (scripted vs. non-scripted) plays any part in their occurrence. I am also interested in finding patterns and in finding out whether there are any linguistic and/or extralinguistic factors that could explain their frequency and usage.

2.5 Summary

In this chapter, I presented certain concepts to set up the context of standardization in Brazilian Portuguese. Then, I turned attention to the language situation in Brazil and talked briefly about the country’s dialects and dialectological studies. I gave a few examples to show some differences between Brazilian and European Portuguese and finished this chapter with some previous literature on Portuguese pronouns.

In the next chapter, I will describe the third person accusative pronouns in BP. I will illustrate the rules prescribed by normative grammar on their usage and placement and will also provide a descriptive approach, which involves mostly non-standard forms. My findings will be compared against these rules and descriptions later on in this project.

CHAPTER 3

THIRD PERSON ACCUSATIVE PRONOUNS IN BRAZILIAN PORTUGUESE

In this part of my study, I will present the description involving pronouns in BP. There are many kinds of pronouns7, but I am going to deal with the personal pronouns; more specifically, the oblique pronouns. I will show the distribution of all the personal pronouns but will focus on the third person, as this is the main topic of my study. I will introduce the definition of ‘clitics’ and explain their kinds and placements in the Portuguese language in relation to one verb or to a verbal sequence. In section 3.4, I present the description of the nominative pronoun ele ‘he’ serving accusative function in BP, both prescriptively and descriptively.

This section is strictly descriptive; I focus on the rules prescribed by normative grammar regarding the usage of third person accusative pronouns and on the description of the occurrence of nominative pronouns functioning as direct objects in BP. This chapter does not include any comments or discussion; these will come in later sections.

3.1 Personal pronouns in BP

Personal pronouns are associated with a person (first - the one who speaks; second - the one who is being spoken to; or third - the one who is being talked about), number (singular and plural) and gender (masculine and feminine); and they are also classified

7 In Portuguese, pronouns are classified as personal, possessive, demonstrative, indefinite, interrogative or relative.

according to their syntactic function or case.

The nominative case (NOM) is the grammatical case that marks the subject of a verb and the oblique case (OBL) is the grammatical case that marks the complements (objects) of the verb. In Brazilian Portuguese, a transitive verb may have two kinds of objects, direct or indirect; therefore, there is an oblique case associated with each of them: the accusative case (ACC) marks direct objects and the dative case (DAT) marks indirect objects.

In BP, there are two kinds of oblique pronouns: tonics and atonics. Tonic pronouns are stressed, used as indirect objects (dative case) and are always linked to the verb by a preposition (usually a ‘to’, para ‘for’, de ‘of’, or com ‘with’).

(5) Ela deu o livro para ele.
    she give.3.SG.PAST the book to he.ACC
    ‘She gave him the book.’

Atonic pronouns are unstressed and are used as direct objects (example 6), with the exception of the form lhe, which is used as an indirect object8 (example 7). Unstressed oblique pronouns do not require a preceding preposition: they are directly connected to the verb.

8 According to the normative tradition, the pronoun lhe is used for the third person and as indirect object only, and it can only be used with verbs requiring the prepositions a or para ‘to’. However, it is very common to find the pronoun lhe being used as the direct object of a verb and referring to the second person, e.g.:

(1) Eu lhe amo.

I 3.SG.DAT love.1SG.PRES

‘I love you.’

These pronominal forms can also be referred to as ‘clitic pronouns’ or just ‘clitics’.

(6) Ela o deu para você.
    she him.ACC give.3.SG.PAST to you.ACC
    ‘She gave it to you.’

(7) Ela deu -lhe o livro.
    she give.3.SG.PAST 3.SG.DAT the book
    ‘She gave the book to him.’

The table below shows the distribution of the personal pronouns in Brazilian Portuguese, with the equivalent forms for the cases described above. As the focus of this study is the third person pronouns, these are highlighted in the table. Also, the examples below will refer to the third person only.

Personal Pronouns

                                 Oblique
Number    Person  Nominative     Unstressed / atonic (w/o prep)   Stressed / tonic (w/ prep)
Singular  1st     eu             me                               mim
          2nd     tu             te                               ti
          3rd     ele, ela       lhe, o, a, se                    ele, ela, si
Plural    1st     nós            nos                              nós
          2nd     vós            vos                              vós
          3rd     eles, elas     lhes, os, as, se                 eles, elas, si

Table 1. Personal Pronouns in BP (Adapted from Bechara, 2009, p.164)


As Table 1 shows, the third person stands out because of its multiple variants. While there is only one form filling out the oblique case associated with the first and second person, there is a different form for each function in the third person: o/a for the accusative (used as direct object); lhe for the dative (used as indirect object); and se for the reflexive. The examples below show this distribution.

(8) Accusative:
    A mulher não o beijou.
    the woman NEG him.ACC kiss.3.SG.PAST
    ‘The woman didn’t kiss him.’

(9) Dative:
    Ninguém lhe emprestou dinheiro.
    nobody 3.SG.DAT lend.3.SG.PAST money
    ‘Nobody lent him/her money.’

(10) Reflexive:
     Ela se feriu com a faca.
     she 3.REFL hurt.3.SG.PAST with the knife
     ‘She hurt herself with the knife.’

3.2 What are clitics?

The word ‘clitic’ derives from Greek klísis ‘leaning’ (Bechara, 2009, p. 414). Clitics may belong to diverse grammatical classes, and these classes vary from language to language. Well-known examples involving clitics are the English forms I’m, you’re, he’s, don’t, and I’ve; the French forms j’ai, je t’aime, d’une, and d’eux; and the Italian forms l’uomo, m’ama, t’amo, and l’acqua (Bagno, 2011, p. 740).

A clitic is a bound morpheme, that is, it is not free but ‘dependent’ on adjacent words. Barrie (2000, p.7) defines clitics as small phonosyntactic units that carry two defining properties: their requirement to associate with a host in the syntax, and their inability to receive vocal or phonological emphasis. In other words, clitics cannot stand on their own, nor can they receive stress. Another salient property of clitics is that they often have freedom of movement (Haspelmath, 2002, p.152), that is, they can occur in different positions in the sentence, which is a common phenomenon in Portuguese.

According to the normative grammar Moderna Gramática Portuguesa (Evanildo Bechara, 2009), clitics are classified as proclitics, enclitics, or endoclitics, depending on their position in relation to their hosts:

1) A proclitic appears before the verb:

(11) Não o vi ontem.
     NEG him.ACC see.1.SG.PAST yesterday
     ‘I didn’t see him yesterday.’

2) An enclitic appears after the verb:

(12) Vejo -o feliz.
     see.1.SG.PRES -him.ACC happy
     ‘I see him happy.’

3) An endoclitic appears inserted in the verb (between the verb stem and its affixes):

(13) Vê -lo -ei amanhã.
     see -him.ACC -FUT.1.SG tomorrow
     ‘I will see him tomorrow.’

Enclisis is considered to be the unmarked position9 in Portuguese. However, the placement of clitics depends on certain conditions10, as we will see below.

3.3 Clitic placement in BP

3.3.1 Proclitics

Cândido de Figueiredo’s hypothesis of “lexical attraction” (as cited in Pereira, 1981, p.13) suggests that certain words or word strings attract the clitic pronoun to the left of the verb. Adverbs (sempre ‘always’, aqui ‘here’, etc.), negation words (such as não ‘no’, nada or nenhum ‘nothing’, ninguém ‘nobody’, etc.), some conjunctions/complementizers (se or caso ‘if’, porque ‘because’, que ‘that’, embora or mas ‘but’, etc.), relative, demonstrative and indefinite pronouns (que ‘that’, alguém ‘someone’, isso ‘this’), and interrogative and exclamative sentences are the components that position clitics before the verb in Portuguese. When the clitic pronoun is placed before the verb, the forms used are the standard ones as seen in Table 1: o (masculine singular), a (feminine singular), os (masculine plural), as (feminine plural). Proclisis seems to be the most common form in BP11.

9 The default order for sentences involving transitive verbs in Portuguese is SVO (subject-verb-object), so the structure that places the clitic after the verb (enclisis) is considered the most general, or unmarked.

10 We will see that enclisis undergoes phonological variation.

11 European Portuguese and Galician are the only two languages that have enclisis as their most common structure; all the other languages in the family favor proclisis (Bagno, 2001, p.741).


(14) Nós sempre o vemos lá.12
     we always him.ACC see.1.PL.PRES there
     ‘We always see him there.’

(15) Ninguém a viu.13
     nobody her.ACC see.3.SG.PAST
     ‘Nobody saw her.’

(16) Ele disse que as levou para casa.
     he say.3.SG.PAST that them.F.ACC take.3.SG.PAST to home
     ‘He said that he took them home.’

(17) Quando os veremos?
     when them.M.ACC see.1.PL.FUT
     ‘When will we see them?’

12 However, if there is a pause between the verb and the adverb, the clitic can occur either before or after the verb:

(2) Aqui, coloquei-a sobre a mesa.

here put.1.SG.PAST-it.F.ACC on the table

‘Here, I put it on the table.’

13 The clitic pronoun can occur before the negation word if it is not sentence-initial.

(3) (…) descia eu para Nápoles a busca de sol que o não havia nas terras do Norte
    go.down.1.SG.IMPF I to Naples in search of sun that it.ACC NEG exist.3.SG.IMPF on.the lands of.the North
    ‘(…) I went down to Naples searching the sun that didn’t exist in the Northern lands.’ (Bechara, 2009, p. 589)


(18) Deus o abençoe!
     God him.ACC bless.3.SG.PRES.SUBJ
     ‘God bless him!’

Normative grammar states that clitics can never occur at the beginning of a sentence. Therefore, according to its rules, the following examples are not correct:

(19) *O levou para casa.
     him.ACC take.3.SG.PAST to home
     ‘(Someone) took him home.’

(20) *As beijou com amor.
     them.F.ACC kiss.3.SG.PAST with love
     ‘(Someone) kissed them with love.’

Speakers of Brazilian Portuguese would find these sentences acceptable in oral/informal speech, even though this structure is not very common with third person pronouns14. In the cases above, according to prescriptive rules, the pronoun has to be enclitic (Levou-o para casa./ Beijou-as com amor.) as described in the next section.

3.3.2 Enclitics

When the clitic is placed after the verb, it is said to be ‘enclitic’, and it is preceded by a hyphen. In this case, there are no words that attract the pronoun, but the form of the clitic changes depending on the ending of the verb.

14 First and second person pronouns are commonly used at the beginning of sentences, even though it is not considered grammatically correct (me dá isso ‘give me that’ or te vi lá ‘I saw you there’).

As a general rule, if the verb ends in a vowel or in an oral diphthong (vowel + oral semi-vowel or -), the forms o, a, os, as are attached to the verb.

(21) Maria buscou -as para você.
     Maria take.3.SG.PAST -them.F.ACC for you
     ‘Maria got them for you.’

If the verb ends in -r, -s or -z, the last consonant must be dropped and the forms become lo, la, los, las.

(22) *Nós vamos atender -os logo. ➜ Nós vamos atendê-los15 logo.
     we will see.INF -them.M.ACC soon
     ‘We will see them soon.’

(23) *João faz -os todos os dias. ➜ João fá-los todos os dias.
     John make.3.SG.PRES -them.M.ACC everyday
     ‘John makes them every day.’

(24) *Eu quis -a aqui. ➜ Eu qui-la aqui.
     I want.1.SG.PAST -her.ACC here
     ‘I wanted her here.’

If the verb ends in -ns, this segment will be replaced by an -m:

(25) tens-lo ➜ tem-lo
     have.2.SG.PRES + clitic

15 This new form must be re-accentuated according to the accentuation rules of Portuguese (see Bechara, 2009, “Accentuation Rules”, p. 105).

The next rule involves a very common feature in Portuguese: nasalization. If the verb ends in -m, or in a nasal diphthong (-ão, -õe), an n is added to the clitic.

(26) *Peguem -os, por favor. ➜ Peguem-nos, por favor.
     take.3.PL.IMP -them.M.ACC please
     ‘Take them, please.’

(27) *Põe -as sobre a mesa. ➜ Põe-nas sobre a mesa.
     put.2.SG.IMP -them.F.ACC on the table
     ‘Put them on the table.’
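As an illustration only, the enclitic rules in this section can be sketched as a small Python function. The name attach_enclitic and the ACCENTED mapping are hypothetical and are not taken from any grammar or library. The sketch assumes that verbs ending in -r or -z are stressed on the final syllable, so that the exposed vowel a, e or o receives an accent (see footnote 15); verbs ending in -s are left unaccented, because their stress position cannot be read off the spelling alone.

```python
import unicodedata

# Assumption: vowel exposed after dropping a final -r or -z is stressed,
# so a/e/o take an accent (atender -> atendê-, faz -> fá-).
ACCENTED = {"a": "á", "e": "ê", "o": "ô"}
CLITICS = {"o", "a", "os", "as"}


def attach_enclitic(verb: str, clitic: str) -> str:
    """Attach a third person accusative clitic after the verb (hypothetical helper)."""
    if clitic not in CLITICS:
        raise ValueError(f"not a third person accusative clitic: {clitic!r}")
    verb = unicodedata.normalize("NFC", verb)  # keep ã/õ as single characters
    lower = verb.lower()

    # -ns is replaced by -m, and the clitic takes the l- form (tens + o -> tem-lo).
    if lower.endswith("ns"):
        return f"{verb[:-2]}m-l{clitic}"

    # -r, -s, -z: drop the consonant; the clitic becomes lo/la/los/las.
    if lower.endswith(("r", "s", "z")):
        stem = verb[:-1]
        if lower[-1] in "rz" and stem and stem[-1] in ACCENTED:
            stem = stem[:-1] + ACCENTED[stem[-1]]
        return f"{stem}-l{clitic}"

    # Nasal endings (-m, -ão, -õe): the clitic becomes no/na/nos/nas.
    if lower.endswith(("m", "ão", "õe")):
        return f"{verb}-n{clitic}"

    # Vowel or oral diphthong: the plain forms o/a/os/as are attached.
    return f"{verb}-{clitic}"


if __name__ == "__main__":
    for verb, clitic in [("buscou", "as"), ("atender", "os"), ("faz", "os"),
                         ("quis", "a"), ("tens", "o"), ("peguem", "os"), ("põe", "as")]:
        print(verb, "+", clitic, "=", attach_enclitic(verb, clitic))
```

The -ns check comes before the general -s check because a verb such as tens would otherwise fall under the consonant-dropping rule. The sketch covers only the orthographic rules stated above and says nothing about which forms speakers actually use.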

Enclitics are commonly found in sentences starting with verbs, in infinitives (comer + os = comê-los ‘eat them’), gerunds (comendo + os = comendo-os ‘eating them’) and affirmative imperatives16 (coma + os = coma-os ‘eat them!’). Verbs in the future or in the conditional forms should never use enclitic or proclitic pronouns, but endoclitics, as described below.

3.3.3 Endoclitics

According to normative grammar, Portuguese exhibits endoclitics in future (28) and conditional forms (29) (as long as there are no attracting words that would lead to the use of proclisis). The clitic needs to be placed between the verb stem and the morphemes of person, number and tense. It is, however, rare to find this structure in spoken - and even in

16 In the case of negative imperatives, proclisis is the dominant form, since the negative word não ‘no’ will always attract the pronoun: Coma-os ‘eat them’ vs. Não os coma ‘don’t eat them’.

written - language. Speakers would use the enclitic or proclitic forms (non-prescriptive) instead.

(28) * Eu cantarei -a muito bem. / * Eu a cantarei muito bem.
     I sing.1.SG.FUT -it.F.ACC very well
     ‘I will sing it very well.’
     ➜ Cantá-la-ei muito bem.

(29) * Nós traríamo -las, se pudéssemos. / * Nós as traríamos se pudéssemos.
     we bring.1.PL.COND -them.F.ACC if can.1.PL.PAST.SUBJ
     ‘We would bring them, if we could.’
     ➜ Trá-las-íamos, se pudéssemos.
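The formation of the prescribed endoclitic forms can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name endoclitic_form is hypothetical, the caller is assumed to supply the future/conditional stem (cantar- for cantar, but the irregular trar- for trazer) and the person ending separately, and the accent placed on the exposed vowel follows the same re-accentuation pattern seen in the enclitic examples.

```python
# Assumption: the vowel exposed after dropping the stem-final -r is accented.
ACCENTED = {"a": "á", "e": "ê", "o": "ô"}
CLITICS = {"o", "a", "os", "as"}


def endoclitic_form(future_stem: str, clitic: str, ending: str) -> str:
    """Insert a clitic between the future/conditional stem and its ending.

    future_stem: stem ending in -r (e.g. 'cantar', 'ver', irregular 'trar')
    clitic:      one of o, a, os, as
    ending:      person/number/tense ending (e.g. 'ei', 'íamos')
    """
    if clitic not in CLITICS:
        raise ValueError(f"not a third person accusative clitic: {clitic!r}")
    if not future_stem.endswith("r"):
        raise ValueError("future/conditional stems are expected to end in -r")

    base = future_stem[:-1]  # the -r is dropped before lo/la/los/las
    if base and base[-1] in ACCENTED:
        base = base[:-1] + ACCENTED[base[-1]]
    return f"{base}-l{clitic}-{ending}"


if __name__ == "__main__":
    print(endoclitic_form("cantar", "a", "ei"))     # cantá-la-ei
    print(endoclitic_form("trar", "as", "íamos"))   # trá-las-íamos
```

Passing the stem explicitly keeps the sketch honest about irregular verbs: it does not attempt to derive trar- from trazer.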

3.3.4 Clitic placement in a verbal sequence

A verbal sequence is formed by two verbs together representing just one verbal action. This structure is composed of an auxiliary verb, which is fully conjugated, and a second, non-finite verb (infinitive, gerund or participle): eu quero fazer ‘I want to do’, ele está viajando ‘he is traveling’, nós tínhamos feito ‘we had done’, etc.

If this is the situation (and if there is no lexical attraction), two cases must be considered for the placement of the clitic pronouns. If the structure is AUX + infinitive or gerund, the pronoun can be proclitic to the auxiliary verb, as in (30) and (31); enclitic to the auxiliary verb, as in (32) and (33); or enclitic to the main verb, as in (34) and (35):

(30) Eu o quero ajudar.
     I him.ACC want.1.SG.PRES help.INF
     ‘I want to help him’

(31) Eu o estou ajudando.
     I him.ACC be.1.SG.PRES help.GER
     ‘I am helping him’

(32) Eu quero -o ajudar.
     I want.1.SG.PRES -him.ACC help.INF
     ‘I want to help him’

(33) Eu estou -o ajudando.
     1.SG be.1.SG.PRES -3.SG.M.ACC help.GER
     ‘I am helping him’

(34) Eu quero ajudá -lo.
     1.SG want.1.SG.PRES help.INF -3.SG.M.ACC
     ‘I want to help him’

(35) Eu estou ajudando-o.
     1.SG be.1.SG.PRES help.GER -3.SG.M.ACC
     ‘I am helping him’

However, if the structure is AUX + participle, the clitic can be proclitic to the auxiliary verb, as seen in (36), or enclitic to the auxiliary verb, as in (37), but it cannot be enclitic to the main verb.

(36) Eu o tenho ajudado.
     I him.ACC have.1.SG.PRES help.PART
     ‘I have been helping him’

(37) Eu tenho -o ajudado.
     I have.1.SG.PRES -him.ACC help.PART
     ‘I have been helping him’

(38) * Eu tenho ajudado-o.

If lexical attraction occurs, in sequences of AUX + infinitive or gerund the clitic will be proclitic to the auxiliary or enclitic to the main verb (before or after the verbal sequence). If it is AUX + participle, the pronoun has to come before the verbal sequence.
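The prescriptive options for verbal sequences described in this section amount to a small decision table, which can be sketched in Python as below. The function name allowed_positions and the position labels are hypothetical and serve only to restate the rules above; they make no claim about actual usage.

```python
from typing import List

PROCLITIC_AUX = "proclitic to AUX"
ENCLITIC_AUX = "enclitic to AUX"
ENCLITIC_MAIN = "enclitic to main verb"


def allowed_positions(nonfinite: str, attraction: bool = False) -> List[str]:
    """Return the clitic positions normative grammar allows in AUX + non-finite verb.

    nonfinite:  'infinitive', 'gerund' or 'participle'
    attraction: True if an attracting word (negation, adverb, etc.) is present
    """
    if nonfinite not in {"infinitive", "gerund", "participle"}:
        raise ValueError(f"unknown non-finite form: {nonfinite!r}")

    if nonfinite == "participle":
        # The clitic can never be enclitic to a participle.
        if attraction:
            return [PROCLITIC_AUX]
        return [PROCLITIC_AUX, ENCLITIC_AUX]

    # AUX + infinitive or gerund
    if attraction:
        return [PROCLITIC_AUX, ENCLITIC_MAIN]
    return [PROCLITIC_AUX, ENCLITIC_AUX, ENCLITIC_MAIN]


if __name__ == "__main__":
    for form in ("infinitive", "gerund", "participle"):
        for attraction in (False, True):
            print(form, attraction, allowed_positions(form, attraction))
```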

3.3.5 Optional usage of clitics

There are a few situations in which the placement of the clitic is considered optional and it can come either before or after the verb (proclitic or enclitic). These are described below.

When there is an explicit subject before the verb, such as:

(39) Maria o pegou na casa da mãe dele.
     Maria him.ACC pick.up.3.PAST in the house of.the mother his
     ‘Maria picked him up at his mother’s house.’

(Or Maria pegou-o na casa da mãe dele, since the subject of the sentence, Maria, is explicit.)

The other case is when the verb follows a coordinative conjunction, such as e ‘and’, as seen in the example below:

(40) Ela chegou em casa e a encontrou dentro do quarto.
     she arrive.3.PAST at home and her.ACC find.3.PAST in of.the room
     ‘She arrived at home and found her inside the bedroom.’

The other possibility would be Ela chegou em casa e encontrou-a dentro do quarto.
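Taken together, the normative rules for a single verb in sections 3.3.1 to 3.3.5 can be summarized as an ordered set of checks. The Python sketch below is only one possible reading of that ordering (attraction first, then future/conditional, then the optional contexts, with enclisis as the default); the class ClauseContext, its field names and the function normative_placements are hypothetical.

```python
from dataclasses import dataclass
from typing import Set


@dataclass
class ClauseContext:
    """Hypothetical description of the clause containing the clitic."""
    attractor: bool = False                      # negation, adverb, conjunction, etc.
    future_or_conditional: bool = False          # verb in future or conditional
    explicit_subject: bool = False               # explicit subject before the verb
    after_coordinating_conjunction: bool = False  # verb follows e 'and', etc.


def normative_placements(ctx: ClauseContext) -> Set[str]:
    """Return the placements normative grammar accepts for a single verb."""
    if ctx.attractor:
        return {"proclisis"}    # lexical attraction pulls the clitic leftwards
    if ctx.future_or_conditional:
        return {"endoclisis"}   # neither proclisis nor enclisis is prescribed
    if ctx.explicit_subject or ctx.after_coordinating_conjunction:
        return {"proclisis", "enclisis"}  # optional placement
    return {"enclisis"}         # default; clitics cannot open a sentence


if __name__ == "__main__":
    print(normative_placements(ClauseContext(attractor=True)))         # cf. (15)
    print(normative_placements(ClauseContext(explicit_subject=True)))  # cf. (39)
```

The priority given to future and conditional forms over an explicit subject follows example (28), where Eu a cantarei is marked as not prescribed even though the subject is explicit.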

3.4 The nominative ele ‘he’ as direct object in Brazilian Portuguese

3.4.1 Prescriptive usage

According to normative grammar, the nominative form ele ‘he’ can only function as direct object when preceded by todo ‘all’ or só ‘just’, or if the pronoun carries emphatic accentuation, such as in:

(41) Conheço bem todos eles.
     know.1.PRES well all they.ACC
     ‘I know all of them very well.’

(42) Vi ELE.
     see.1.PAST he.ACC
     ‘I saw HIM.’

3.4.2 Descriptive usage

The use of ele ‘he’ (and its plural and feminine variants) as an accusative pronoun is one of the most common characteristics of Brazilian Portuguese (eu vejo ele ‘I see he’ instead of eu vejo-o/eu o vejo ‘I see him’). Constructions like eu vi ela ‘I saw she’ originated in archaic Portuguese during the XIII and XIV centuries (Bagno, 2001b, p.103). This use is exclusive to Brazilian Portuguese and is limited to the third person17.

When investigating this topic, Mattoso Câmara (2004, p.118) claimed that it is not a matter of applying the nominative pronoun as accusative, but that the pronoun is simply an invariable form from a syntactic point of view, just like nouns and demonstratives. In his opinion, it is a ‘Brazilian innovation’ in terms of structure that separates the third person pronoun from the case system of personal pronouns. He argued that Brazilian Portuguese has specific conditions that favored the morphological evolution involved in this phenomenon.

Prescriptive grammar, however, deems this construction wrong. The school system explicitly discourages its use, and children are often told that it indicates a low social background or educational level. But it is undeniably a common feature in the speech of Brazilians from diverse socio-economic backgrounds, and only speakers who want to be perceived as highly educated individuals avoid this construction. Even then, it is not completely eliminated from their speech (Câmara, 2004, p. 96).

As mentioned earlier in the grammatical description in this chapter, the third

17 Lately, there have been cases of other person pronouns involved in the same phenomenon (see Chapter 8 for examples). However, it is not as generalized and natural to the language as it is with the third person pronouns.

person pronouns in BP act differently from the first and second ones. One of Câmara’s arguments is that the 3rd person pronoun can substitute a noun while the 1st and 2nd person pronouns cannot, since they are directly connected to the people in the discourse. Moreover, the third person pronoun, unlike the 1st and 2nd person pronouns, can carry gender and has homonymic plural (ele ‘he’, ela ‘she’, eles ‘they masc.’, elas ‘they fem.’), similar to nouns (gato ‘male cat’, gata ‘female cat’, gatos ‘male cats’, gatas ‘female cats’) and demonstratives (e.g. este ‘this masc.’, esta ‘this fem.’, estes ‘these masc.’, estas ‘these fem.’).

Mattoso Câmara also gives a phonetic explanation and talks about apheresis, the loss of a sound at the beginning of a word. When there is a clitic pronoun o before a verb, it is treated as an atonic and the sound disappears (eu o vi ➜ /eu uvi/), so there is a need to find a replacement for that object pronoun. This is different in European Portuguese because accessory words are always incorporated in pronunciation into the content ones, as if they were affixes of a morphologically complex word. Other authors, like Pereira (1981), also explain this occurrence in BP by saying that the clitics o(s)/a(s) are phonetically weak and that a syllable consisting of only a vowel is hard to attach to other words. This could also explain why, among the standard pronouns, the most common forms found in my data were the enclitics –lo(s)/la(s).18

Bagno (2001) also adds a historical explanation for this phenomenon. He says that in Classical Latin there were no subject or object third person pronouns, only first (subject ego, object me and plural nos) and second person (subject tu, object te and

18 Even though the forms -no(s), -na(s) were not found in my data, -lo(s), -la(s) were still the most common.

plural vos). However, as Vulgar Latin (along with the Romance languages) emerged, there was a need to find third person equivalents to complete the personal pronoun paradigm. So, the demonstratives ille (masculine), illa (feminine), and illud19 (neuter, which eventually disappeared) started being used (the similarity of the old forms ille and illa to the modern ones ele and ela is noticeable). These new personal pronouns still carried some of their demonstrative function and could operate in many syntactic positions, as illustrated in Table 2:

SUBJECT              DIRECT OBJECT         INDIRECT OBJECT
Este serve.          Quero este.           Dei o livro para este.
‘This one works.’    ‘I want this one.’    ‘I gave the book to this one.’
Ele serve.           Quero ele.            Dei o livro para ele.
‘He works.’          ‘I want him.’         ‘I gave the book to him.’

(Bagno, 2001b, p.105)

Table 2. Syntactic positions of demonstratives

As for how common this phenomenon is, Bagno shows that it could already be

19 Later, after some phonetic and morphological changes, o, a, os, as (the third person atonic pronouns and also definite articles) were created. In the case of the definite articles, which did not exist in

Latin as well, they were demonstratives that became weak and lost their deictic function, and started functioning as simple .

noticed centuries ago. For example, in the XIV century, in Fernão Lopes’s20 work:

(43) Os cardeaes, outrossim, privaram elle d’algum direito, (…)
     The cardeal stop.3.PL.PAST him.NOM of.any right (…)
     ‘The… , …, stopped him of any rights.’ (Bagno, 2001b, p.103)

It can even be found in more recent literary work, written by names such as Clarice Lispector and Luís Fernando Veríssimo, as shown in the examples below.

(44) “(…) Esse reliance me deu ela de corpo inteiro.”
     (…) this … me give.3.SG.PAST her.NOM of body whole
     ‘This … gave her to me as a whole.’ (Lispector, 1977, as cited in Bagno, 2001b, p.105)

(45) “Na praia era ela que puxava ele pra dançar.”
     on.the beach be.3SG.IMPF.PAST she that pull.3SG.IMPF.PAST he.ACC to dance
     ‘She was the one that used to pull him to dance.’ (Veríssimo, 2000, as cited in Bagno, 2001b, p.106)

After Fernão Lopes, grammar books of the Portuguese language started being compiled and rules started being implemented (as mentioned in section 2.2), and the use of the nominative ele as an accusative pronoun came to be considered ungrammatical in European Portuguese. However, it continued to be used in oral contexts and was thus brought to Brazil. The replacement of accusative forms by nominative forms has been an issue in

20 Portuguese writer, one of the fathers of the European historiography, whose writing was based on oral discourse.

the Portuguese language ever since.

3.5 Summary

In section 3.1, I presented the personal pronouns in BP and the grammatical cases associated with them. I focused on the third person pronouns; therefore, to recapitulate, I repeat Table 1 from section 3.1 below, featuring the third person pronouns only.

Third Person Personal Pronouns

                          Oblique
Number    Nominative      Unstressed / atonic (w/o prep)   Stressed / tonic (w/ prep)
Singular  ele, ela        lhe, o, a, se                    ele, ela, si
Plural    eles, elas      lhes, os, as, se                 eles, elas, si

Table 3. Third Person Personal Pronouns in BP

Accordingly, third person pronouns in the nominative case (used as subjects) are ele(s)/ela(s) and in the oblique case (used as objects) are lhe(s), o(s), a(s), se (unstressed) and ele(s), ela(s) and si (stressed).

Section 3.2 addressed the definition and kinds of clitics in BP. In summary, clitics are unstressed morphemes that need to be attached to another word to form meaning. In BP, there are three kinds of clitics: proclitics (which occur before the verb), enclitics (after the verb) and endoclitics (inserted in the verb, between the verb stem and its affixes).

In section 3.3, I explained in more detail each of the three kinds of clitics and specified the conditions that may favor their use in a sentence. Proclitics are determined by “lexical attraction”, meaning that some words or word strings attract the clitic to the left of the verb (in brief, adverbs, negation words, conjunctions, interrogative and exclamative sentences). If no lexical attraction occurs, enclitics can be used instead. Enclitics appear to the right of the verb, and the ending of the verb determines which form is used: o(s), a(s), lo(s), la(s), no(s) or na(s). Endoclitics are positioned between the verb stem and other morphemes (of person, number and tense); they are used in future and conditional forms and are rare. This section also presented the placement of clitics in a verbal sequence (two verbs together: AUX + main verb); depending on the structure of the verbs, the clitics can be proclitic or enclitic to the auxiliary verb or enclitic to the main verb.

Moreover, I demonstrated the two situations in which speakers may opt to use either proclitics or enclitics, that is, if there is an explicit subject to the verb or if the verb is followed by a coordinative conjunction.
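The placement rules summarized above can be sketched as a small decision function. This is only a simplified reading of those rules for simple (non-compound) verbs, written in Python for illustration; the class name, field names and the order in which the rules are checked are assumptions, not part of the original analysis.

```python
from dataclasses import dataclass


@dataclass
class VerbContext:
    """Hypothetical description of a simple verb and its surroundings."""
    attractor_before: bool = False         # adverb, negation word, relative pronoun, etc.
    question_or_exclamation: bool = False  # interrogative or exclamative sentence
    sentence_initial: bool = False         # the verb starts the sentence
    form: str = "finite"                   # "finite", "infinitive", "gerund", "future", "conditional"
    explicit_subject: bool = False
    after_coordinative_conj: bool = False


def normative_positions(ctx: VerbContext) -> set:
    """Return the clitic positions allowed under this simplified reading of the rules."""
    # Lexical attraction (and interrogative/exclamative sentences) forces proclisis.
    if ctx.attractor_before or ctx.question_or_exclamation:
        return {"proclitic"}
    # Future and conditional forms without an attracting word call for endoclisis.
    if ctx.form in ("future", "conditional"):
        return {"endoclitic"}
    # Sentence-initial verbs, infinitives and gerunds take enclisis.
    if ctx.sentence_initial or ctx.form in ("infinitive", "gerund"):
        return {"enclitic"}
    # With an explicit subject or after a coordinative conjunction, both are possible.
    if ctx.explicit_subject or ctx.after_coordinative_conj:
        return {"proclitic", "enclitic"}
    # Otherwise enclisis is the default.
    return {"enclitic"}


print(normative_positions(VerbContext(attractor_before=True)))
print(normative_positions(VerbContext(explicit_subject=True)))
```

The function returns a set rather than a single label because two of the contexts leave the choice open to the speaker.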

Section 3.4 focused on the nominative forms being used in the accusative case. I briefly described the cases accepted by normative grammar, which are (i) when the pronoun is preceded by todos ‘all’ or só ‘just’ or (ii) when there is emphasis on the pronoun. I also referred to how this phenomenon (accusative ‘he’) takes place in Brazil, and attempted to explain its origins.

With the descriptions of the third person unstressed pronouns established, I can now move on to the description of my study, whose results will be compared and discussed according to the distributions depicted in this chapter.

CHAPTER 4

METHODOLOGY

In this chapter, I describe what my corpus consisted of and how I collected my data. I indicate which variables were considered for my statistical analysis and I describe the method I used in my investigation.

4.1 Corpus

My corpus consists of 1252 tokens found in 1197 sentences containing at least one third person object pronoun. The sentences were found in scripted and non-scripted TV programs transmitted by Rede Globo, the largest Brazilian TV network. Specifically, I looked at newscasts, talk shows and soap operas. I chose these three types of program because the language and register may vary greatly among them. News shows are scripted, meaning that the oral texts were written beforehand for broadcasting, whereas in non-scripted shows oral language is only partially planned and rehearsed (e.g., the interview questions or the host’s speech at the opening of the show), resulting in natural interactions and spontaneous conversations. The language used in soap operas, in turn, follows a script that tries to represent actual speech, that is, ‘real’ spoken language.

Three different shows were chosen under each category to build the corpus. For the news shows, the three main news programs transmitted nationwide were used: Jornal Hoje (“Today’s News”), Jornal da Globo (“Globo News”), and Jornal Nacional (“National News”), the most important one, shown during primetime. These newscasts are very similar in structure, with one or two anchors and several outside reporters responsible for news coverage. They cover material such as sports, weather, politics, and national and international current events. It is important to note that, for the news programs, only the language directed to the audience by the anchors and reporters was considered under this category. Spontaneous exchanges among them or with a third party could not reliably be treated as scripted.

The non-scripted category consisted of three variety shows. Programa do Jô (“Jô’s Show”) is a late night talk show hosted by the comedian, author and musician Jô Soares; it follows the format of American late night shows, with interviews and musical performances. Encontro com Fátima Bernardes (“Meeting with Fátima Bernardes”) is a live talk show hosted by former Jornal Nacional anchorwoman Fátima Bernardes, who receives famous guests to talk about their lives and careers and to discuss current or controversial topics. The last show in this category is Mais Você (“Plus You”), an early morning talk/cookery show hosted by Ana Maria Braga that focuses on the arts and culinary topics; she also receives celebrity guests and discusses current news and events.

For the soap operas, three different ones were used: Avenida Brasil (“Brazil Avenue”), Salve Jorge (“Hail George”) and Guerra dos Sexos (“Battle of the Sexes”). Brazilian telenovelas (soap operas) are very popular and are probably the most watched television genre in the country. On Rede Globo, there are usually four different soap operas being broadcast daily, three of them during prime time. They are very realistic and broach controversial topics; the writers aim to transmit sociocultural messages, such as the importance of education, the effects of drug abuse and the fight for human rights, by incorporating them into the storylines. Different genres such as drama, comedy and romance can be found in telenovelas.

TV Globo usually uses a prestigious variety that follows the dialects of Rio de Janeiro and São Paulo, the two most prestigious cities in the country. Brazilian soap operas usually take place in one of these two cities; in my data, two soap operas took place in Rio (Avenida Brasil and Salve Jorge) and one took place in São Paulo (Guerra dos Sexos). The location is important here because of the plot of the stories and the ideas that are usually associated with these two places. In general terms, São Paulo, a huge metropolitan city, is associated with economics, politics, money and business, while Rio de Janeiro, a vacation spot, is better known for its natural settings, beaches and landmarks, often linked to a popular culture involving soccer and carnival. Thus, depending on the theme, a story will fit one of the two places better.

Avenida Brasil’s main story reflects the reality of the north of the city of Rio de Janeiro, an area crossed by Avenida Brasil, the main highway that leads from the outskirts of the city across the suburbs into central Rio. It focuses on the middle class that lives in the suburbs of Rio de Janeiro and on their lifestyle, so the protagonists are not the typically rich, aristocratic characters portrayed in many other soap operas. Some higher class characters are portrayed as well; they live in the southern zone of the city, where the beaches and richer neighborhoods are. It is easy to determine a character’s SES based on where they live, what they do, how they dress and how they speak.

Salve Jorge’s main plot centers on a humble girl who lives in Morro do Alemão, one of the poor communities (slums) of Rio. People in these communities live a simple, happy life, but are always trying hard to find better opportunities. In this scenario, it is also easy to tell which socio-economic class a character belongs to based on the same elements described above.

The main story of Guerra dos Sexos revolves around the competition between men and women in the 21st century and the main characters are members of a very traditional and wealthy family. In this case, the main story shows a very high society formed by very rich and educated people.

As I mentioned earlier, the data were gathered from Brazilian TV programs broadcast nationwide by Rede Globo. All episodes aired between April 2012 and August 2013, and a total of more than 226 hours was analyzed.

The episodes varied in duration. In the scripted programs, episodes ranged between 13 and 45 minutes (average 27.5 minutes). In the non-scripted category, the full shows were between 40 and 75 minutes; however, the interviews of Programa do Jô were available separately from the whole show and were between 11 and 44 minutes long (overall average of 49 minutes in this category). The soap opera episodes were between 32 and 72 minutes in duration (average 47 minutes).

Table 4 below lists the TV shows, the number of hours collected for each program and the total number of hours in each category.

Kind of Show    TV Show                          Time        Total time

News            Jornal Hoje                      23:51:46    74:58:06
                Jornal Nacional                  26:17:31
                Jornal da Globo                  24:56:02

Non-scripted    Programa do Jô                   25:21:23    76:07:50
                Encontro com Fátima Bernardes    25:30:18
                Mais Você                        25:16:09

Soap-opera      Avenida Brasil                   25:52:21    75:05:19
                Salve Jorge                      25:44:23
                Guerra dos Sexos                 23:21:25

Table 4. TV shows list and number of hours in each category
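As a quick arithmetic check, the three category totals reported in Table 4 can be added up to confirm the figure of more than 226 hours mentioned earlier. The Python sketch below is illustrative only; the helper names are hypothetical and the input values are the "Total time" column as printed.

```python
def to_seconds(hms: str) -> int:
    """Convert an 'H:MM:SS' duration into seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def to_hms(total_seconds: int) -> str:
    """Convert seconds back into an 'H:MM:SS' string."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


# "Total time" column of Table 4, as printed.
CATEGORY_TOTALS = {
    "News": "74:58:06",
    "Non-scripted": "76:07:50",
    "Soap-opera": "75:05:19",
}

grand_total = sum(to_seconds(value) for value in CATEGORY_TOTALS.values())
print(to_hms(grand_total))       # 226:11:15
print(to_hms(grand_total // 3))  # 75:23:45, the mean per category
```

The mean of roughly 75 hours per category matches the average mentioned in the data collection section.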

4.2 Data collection

I was able to access the programs through Rede Globo TV’s website (globotv.globo.com). Full episodes could be watched with a member login and streamed from any computer.

For my raw data collection, I wrote down every sentence in which a third person accusative pronoun occurred and categorized it as either “standard”, if it followed the rules prescribed by normative grammar, or “non-standard”, if it did not. Only episodes with at least one example in either category were considered.

Each video segment containing the sentence was extracted using Camtasia 2, a video recording and editing software. All the videos are stored on my personal computer and on an HD device.

After collecting an average of 75 hours in each category, I created an Excel table containing all the extracted sentences. Each sentence was coded with the following information: the category in which it was found (i.e. ‘N’ for news (scripted), ‘V’ for variety (non-scripted) and ‘SP’ for soap opera), the speaker who uttered the sentence (e.g. JS, FB, WB, etc.) and the following linguistic and sociolinguistic variables:

Variable                        Values

Compound verb                   True/False
Verb structure                  Aux + infinitive, Aux + gerund, Aux + participle
Auxiliary verb
Main verb
Conjugation of the verb         1st/2nd/3rd
Final segment of the verb       Ex: -r, -am, -e, -ou
Stress on the verb              Final/Non-final
Tense of the verb
Mood of the verb
Affirmative sentence            True/False
Negative sentence               True/False
Interrogative sentence          True/False
Main clause                     True/False
Animate antecedent              True/False
Human antecedent                True/False
Pronoun                         o/a/os/as (and its variants), ele/ela/eles/elas
Number of the pronoun           Singular/Plural
Gender of the pronoun           Feminine/Masculine
Kind of pronoun                 Standard/Non-standard
Placement of the pronoun        Ex: enclitic, proclitic
Gender of the speaker           Male/Female
SES of the speaker              Low/Middle/High
Context of the utterance        Formal/Informal

Table 5. Variables extracted from data
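To make the coding scheme concrete, one row of such a spreadsheet could be represented as a record like the one below. This is a hypothetical Python sketch: the field names are invented, and the example values (including the speaker code, SES and context) are purely illustrative and are not taken from the actual data file.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CodedToken:
    """One coded pronoun token; field names are hypothetical."""
    category: str                  # "N" (news), "V" (variety) or "SP" (soap opera)
    speaker: str                   # speaker initials
    main_verb: str
    compound_verb: bool
    verb_structure: Optional[str]  # e.g. "aux + infinitive"; None for simple verbs
    auxiliary_verb: Optional[str]
    conjugation: int               # 1, 2 or 3
    final_segment: str             # e.g. "-r", "-am", "-e", "-ou"
    final_stress: bool
    tense: str
    mood: str
    affirmative: bool
    negative: bool
    interrogative: bool
    main_clause: bool
    animate_antecedent: bool
    human_antecedent: bool
    pronoun: str                   # o/a/os/as (and variants) or ele/ela/eles/elas
    pronoun_number: str            # "singular" or "plural"
    pronoun_gender: str            # "feminine" or "masculine"
    kind: str                      # "standard" or "non-standard"
    placement: str                 # e.g. "proclitic", "enclitic"
    speaker_gender: str            # "male" or "female"
    speaker_ses: str               # "low", "middle" or "high"
    context: str                   # "formal" or "informal"

    def __post_init__(self):
        # Basic sanity checks on the closed-class fields.
        if self.category not in {"N", "V", "SP"}:
            raise ValueError(f"unknown category: {self.category}")
        if self.kind not in {"standard", "non-standard"}:
            raise ValueError(f"unknown kind: {self.kind}")


# Illustrative coding of a sentence like "Eu não o conhecia." (values are hypothetical).
example = CodedToken(
    category="V", speaker="XX", main_verb="conhecer", compound_verb=False,
    verb_structure=None, auxiliary_verb=None, conjugation=2, final_segment="-a",
    final_stress=False, tense="imperfect", mood="indicative", affirmative=False,
    negative=True, interrogative=False, main_clause=True, animate_antecedent=True,
    human_antecedent=True, pronoun="o", pronoun_number="singular",
    pronoun_gender="masculine", kind="standard", placement="proclitic",
    speaker_gender="female", speaker_ses="middle", context="informal",
)
print(example.kind, example.placement)
```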

Most of these elements were used as predictors during my statistical analysis. The next section will explain the method I used and the elements that were considered when predicting the kind of pronoun and its placement within the sentence.

4.3 Method

“Rival forms”21 can have a variety of factors that influence their occurrence, which is why we need statistical models to discover which elements are significant and to what degree. For statistical analyses of rival forms in language, logistic regression is the most popular method, so a “mixed effects model” was the first method considered for my analysis. Mixed effects models were created in R (a software environment for statistical data analysis) with the “lmer” function of the “lme4” package included in the languageR package.

The first models were built to predict the kind of pronoun (standard or non-standard); all the models included “speaker” and/or “verb” as random effects, and all the other variables (shown in Table 5) were tested as fixed effects to see if they were significant. However, I encountered problems with this method due to the complexity of my data. As is typical for logistic regression models built on data with perfect separation, the models failed to converge. There was also a high number of unbalanced cells, which is a common characteristic of sociolinguistic data; in this case, many speakers contributed only one token to the data (45% of speakers) and the same was true of “verb” (51% were single observations). Therefore, simple mixed-effects models did not work or led to unreliable results.

21 Rival forms exist when a language has two (or more) forms that express a similar meaning in similar environments, giving the speaker a choice of options. The choice made between rival forms is often influenced by a range of factors such as the syntactic, morphological, and phonological environment (Janda et al., 2012).

Recently, new tools have been introduced for the statistical analysis and interpretation of data. Here, I use an alternative technique that can help overcome the limitations of logistic models: random forests & trees, available in the “party” package in R, which are much more straightforward and yield results that are more intuitive to interpret. Janda et al. (2012) explain this method by saying that “the tree & forest model uses recursive partitioning to yield a classification tree that provides an optimal partitioning of the data, giving the best sorting of observations separating the response outcomes. It can literally be understood as an optimal algorithm for predicting an outcome given the predictor values”. Random forests & trees can work with all the predictors at the same time and consider interactions automatically, without the need to specify them. Furthermore, they eliminate all elements that are not important to the results: they work through the data and, by trial and error, establish whether a variable is a useful predictor or not (Tagliamonte & Baayen, 2012, p.159).

This method provides an output that is easier to interpret. The trees “give us a description of what is going on in the data, in a way that is visually much more tractable and intuitive than the tables of figures we receive as output in the regression model” (Janda et al., 2012, p. 20). The tree decides which variables are relevant (and how they interact) and simply ignores the ones that are not. Unlike the trees, random forests do not provide useful information about how the predictors work together; instead, they provide information about the relevance of each predictor. The classification accuracy of the forests is assessed with the index of concordance (C), found in the “rms” package. Forests also protect against overfitting the data, which occurs when a very complex model includes a large number of predictors relative to the number of observations.
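For readers unfamiliar with the index of concordance, the sketch below shows how C can be computed for a binary outcome: it is the proportion of (positive, negative) pairs in which the positive case receives the higher predicted probability, with ties counted as one half. The analysis itself obtained C in R; this plain Python version, its function name and the toy numbers are assumptions made for illustration only.

```python
def concordance_index(outcomes, predicted_probs):
    """Index of concordance (C) for a binary outcome coded 1/0.

    outcomes: observed classes (1 = e.g. standard, 0 = e.g. non-standard)
    predicted_probs: predicted probability of class 1 for each observation
    """
    concordant = 0.0
    pairs = 0
    for y_pos, p_pos in zip(outcomes, predicted_probs):
        if y_pos != 1:
            continue
        for y_neg, p_neg in zip(outcomes, predicted_probs):
            if y_neg != 0:
                continue
            pairs += 1
            if p_pos > p_neg:
                concordant += 1.0   # pair ranked in the right order
            elif p_pos == p_neg:
                concordant += 0.5   # tie counts as half
    if pairs == 0:
        raise ValueError("need at least one observation of each class")
    return concordant / pairs


# Toy values, invented for illustration.
toy_outcomes = [1, 1, 0, 0, 1]
toy_probs = [0.9, 0.6, 0.4, 0.7, 0.8]
print(concordance_index(toy_outcomes, toy_probs))
```

In the toy data, five of the six (positive, negative) pairs are ordered correctly, so C is 5/6; a value of 0.5 would correspond to chance-level ranking and 1.0 to perfect ranking.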

The tree & forest method was used to analyze my data in three parts: 1) to predict the kind of pronoun (standard or non-standard) in my whole data; 2) to predict the kind of pronoun (standard or non-standard) in a subset of the whole data which included only the structure verb + pronoun (since this is the only possible structure for non-standard pronouns); and 3) to predict the placement of the clitic (proclitic or enclitic) in another subset of the whole data that contained only standard examples. I will describe them separately in the next sections.

The random forests were obtained with the “cforest” function, and the “varimp” command was used to assess the relative importance of the predictors. Later, classification accuracy was assessed with “treeresponse” and an inference tree was created with the “ctree” function.22

4.3.1 Standards vs. non-standards: whole data

To predict the kind of pronoun (standard or non-standard), the first random forest considered the whole data. The following linguistic and sociolinguistic variables were used:

22 The actual commands used in R can be seen in the Appendix.

• Regarding the kind of show: news, non-scripted, soap opera

• Regarding the verb: the composition (compound or not), conjugation (1, 2, 3), final segment (infinitive or not), tense, mood, and stress of the verb (final or non-final), the actual verb and the auxiliary verb (if present)

• Regarding the clause: kind of clause (main clause or not, affirmative, negative or interrogative)

• Regarding the pronoun: animate referent, human referent, gender, number and placement (enclitic or proclitic)

• Regarding the speaker: gender and SES (low, middle, high)

• Regarding the context: formal or informal

4.3.2 Standards vs. non-standards: enclitics only

For the second forest, I considered the same predictors as for the previous forest, with the exception of the actual verb and the placement of the pronoun. The data was a subset of the whole data that included only one structure (verb + pronoun). Because non-standard pronouns can only be placed after the verb (enclitic position), I could only compare them to standard examples with the same structure, so I only considered enclitic pronouns.

4.3.3 Proclitics vs. enclitics: standards only

To predict the placement of the pronoun (enclitic or proclitic), a third random forest was created. All the functions seen above were applied to assess relative importance and classification accuracy and to produce the inference tree. However, the data used was another subset of the whole data, which included only standard examples. Since, in this case, I was interested in linguistic factors that could influence the position of the pronoun, I excluded the sociolinguistic variables and used only the ones related to the verb, the sentence and the pronoun itself.

4.4 Summary

This chapter described my corpus, the data collection and the method used for statistical analysis. I presented random forests & trees, the model used for my investigation, and described how this method was applied to find out which factors were important in predicting the kind of pronoun used and the placement of clitics in relation to the verb. In the next chapter, I present my descriptive results, where I will show the distribution of the pronouns found in the TV shows analyzed. I will also provide examples and highlight specific ones that deserve attention. Later on, in Chapter 6, I will move on to my statistical results and analysis.

CHAPTER 5

DESCRIPTIVE RESULTS

Now that I have identified the significant predictors of the kind and the placement of the pronouns, I will present some detailed results concerning the distribution of the tokens found in my data. Firstly, I will focus on the distribution of clitic pronouns and their usage based on the grammatical rules described in Chapter 3; later, I will look into the use of standard and non-standard forms in each kind of show, as a basis for my discussion in the next chapter. I will also illustrate my results with tables and examples.

5.1 Standard pronouns found in the data

As explained in chapter 3, the choice of clitic pronoun form and placement is determined by grammatical rules, which I will summarize here in order to discuss the examples found in my data.

In Portuguese, clitics can be proclitic, enclitic or endoclitic. Proclisis takes place when there is lexical attraction, meaning that some words attract the clitic pronoun to the left of the verb. These words include adverbs, negation words, pronouns, and some conjunctions/complementizers. Proclitics are also used in interrogative and exclamative sentences, and they should never come at the beginning of a sentence. If none of these rules apply, enclitics can be used instead. For enclisis, the rules concern the ending of the verb, which determines whether the clitic keeps its basic form (o, a, os, as) or changes. Verbs ending in -r, -s or -z take the forms -lo(s) or -la(s), and verbs ending in a nasal sound take the forms -no(s) or -na(s). Enclisis is the form used with infinitives and gerunds, and it must be used if a sentence starts with a verb (proclitics can never be used in this position). Endoclitics are used with verbs in the future and conditional forms (when no rule for proclisis applies), and the pronoun is placed between the verb stem and the morphemes of person, number and tense.
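The way the verb ending selects the shape of an enclitic can be illustrated with a short function. This is a minimal Python sketch under simplifying assumptions: the function name is hypothetical, accents are only adjusted for infinitives in -ar and -er (localizar + os = localizá-los), and other accent changes (for instance in forms ending in -s or -z, or in infinitives such as compor) are not handled.

```python
# Nasal endings that trigger -no(s)/-na(s); a simplified, assumed list.
NASAL_ENDINGS = ("m", "ão", "õe")
# Stem vowels that receive an accent when the final -r of an infinitive is dropped.
ACCENTED_VOWEL = {"a": "á", "e": "ê"}


def attach_enclitic(verb: str, clitic: str) -> str:
    """Attach o/a/os/as to a verb form, choosing the -lo/-no variants by the verb ending."""
    if clitic not in {"o", "a", "os", "as"}:
        raise ValueError(f"not a third person accusative clitic: {clitic}")
    if verb.endswith(("r", "s", "z")):
        # The final consonant is dropped and the clitic takes the form -lo(s)/-la(s).
        stem = verb[:-1]
        if verb.endswith("r") and stem and stem[-1] in ACCENTED_VOWEL:
            stem = stem[:-1] + ACCENTED_VOWEL[stem[-1]]
        return f"{stem}-l{clitic}"
    if verb.endswith(NASAL_ENDINGS):
        # After a nasal ending the clitic takes the form -no(s)/-na(s).
        return f"{verb}-n{clitic}"
    # Otherwise (vocalic ending) the clitic keeps its basic form.
    return f"{verb}-{clitic}"


print(attach_enclitic("localizar", "os"))  # localizá-los
print(attach_enclitic("chamam", "o"))      # chamam-no
print(attach_enclitic("anexou", "a"))      # anexou-a
```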

A total of 495 standard tokens were found (39.5% of the whole data), 187 of which were proclitics and 308 enclitics. In the following subsections I will talk about each kind of standard form separately.
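The proportions just reported follow directly from the token counts. The short Python sketch below, with hypothetical variable names, recomputes them from the figures given in the text (1252 tokens in total, 187 proclitics and 308 enclitics).

```python
TOTAL_TOKENS = 1252
STANDARD_TOKENS = {"proclitic": 187, "enclitic": 308}


def share(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


standard_total = sum(STANDARD_TOKENS.values())
non_standard_total = TOTAL_TOKENS - standard_total

print(standard_total, share(standard_total, TOTAL_TOKENS))          # 495 39.5
print(non_standard_total, share(non_standard_total, TOTAL_TOKENS))  # 757 60.5
for placement, count in STANDARD_TOKENS.items():
    # Share of each placement within the standard tokens.
    print(placement, share(count, standard_total))
```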

5.1.1 Proclitics

Among the 187 proclitics found, most (63%) were mandatory. Their presence was governed by lexical attraction: they were preceded by relative pronouns, conjunctions, adverbs or negation words, or occurred in exclamative or interrogative sentences. Some proclitics (36%) were classified as optional, as they could have been used enclitically as well; only two ungrammatical (according to normative grammar) uses of proclitics were found. The table below is illustrative.

Mandatory proclitics                      Optional proclitics

After relative pronoun       77 (41%)     With explicit subject       43 (23%)
After conjunction/adverb     20 (11%)     After coordinative conj.    25 (13%)
After negation word          17 (9%)
Int/Excl sentences            3 (2%)
Total                       117 (63%)     Total                       68 (36%)

Wrong                         2 (1%)
Total Overall               187 (100%)

Table 6. Proclitics

5.1.1.1 Mandatory uses

We can now look at each of the cases described in the table and illustrate them with examples. Of the 187 proclitics, 117 (63%) were mandatory. Among them, 77 (41%) were attracted by the relative pronoun que ‘that’, as in example (46); 20 (11%) were attracted by complementizers or adverbs (example 47); 17 (9%) were attracted by a negation word, as in (48); and only 3 (2%) were found in exclamative or interrogative sentences (examples 49 and 50).

(46) No avião que o levou de volta para Roma,
on.the plane that him.ACC take.3.SG.PAST of back to Rome
o papa concedeu uma segunda entrevista.
the pope give.3.SG.PAST one second interview
‘On the plane that took him back to Rome, the Pope gave a second interview.’
(Jornal Nacional, N, M)23

(47) Esse rapaz entrega uma corrente e o traficante
this young.boy give.3.SG.PRES one chain and the drug.dealer
já a coloca no pescoço.
already her.ACC put.3.SG.PRES in.the neck
‘This boy hands over a chain and the dealer quickly puts it around his neck.’
(Jornal Hoje, N, F)

(48) Eu não o conhecia.
I NEG him.ACC know.1.SG.IMPF.PAST
‘I didn’t know him.’
(Programa do Jô, NS, F)

(49) Você o ama?
you him.ACC love.2.SG.PRES
‘Do you love him?’
(Guerra dos Sexos, SO, M)

(50) A sua tia, que Deus a tenha!
the your aunt that God her.ACC have.3.PRES.SUBJ
‘Your aunt, may God be with her!’
(Guerra dos Sexos, SO, F)

23 On the example references, I include the name of the show, and I also use ‘N’, ‘NS’, or ‘SO’ to represent the categories (news, non-scripted, and soap-opera) and ‘M’ or ‘F’ to represent the speaker’s gender (‘male’ or ‘female’).

5.1.1.2 Optional uses

A total of 68 proclitics were optional, and two cases were associated with them: 43 (23%) came after an explicit subject, as shown in (51), and 25 (13%) came after a coordinative conjunction, as illustrated in (52).

(51) … a minha mulher, a Emília, eu a conheci no interior de Minas.
the my woman the Emilia I her.ACC meet.1.SG.PAST in.the interior of Minas
‘…my wife, Emília, I met her in the countryside of Minas.’
(Encontro com Fátima Bernardes, NS, M)

(It would also be acceptable to say … a minha mulher, a Emília, eu conheci-a no interior de Minas.)

(52) Os Argentinos são loucos por cães e os tratam com todo luxo.
the Argentinians be.3.PL.PRES crazy about dogs and them.M.ACC treat.3.PL.PRES with all luxury
‘The Argentinians are crazy about dogs and treat them with a lot of luxury.’
(Jornal Hoje, N, F)

(It would also be acceptable to say Os Argentinos são loucos por cães e tratam-nos com todo luxo.)

Moreover, two cases of proclitics being used ungrammatically were found. The first one is described below (example 53), and the other is illustrated a little further on, in the section on endoclitics (example 65).

(53) *Essa é a Nossa Senhora que você a visitará…
this be.3.SG.PRES the Our Lady that you her.ACC visit.3.SG.FUT
*‘This is Our Lady24 that you’ll visit her.’
(Mais Você, NS, F)

This sentence is grammatically unacceptable because the clitic is redundant. The relative que is already the object of the verb visitará, referring to the antecedent a Nossa Senhora, so the object pronoun a is ungrammatical, or at least unnecessary, in the sentence: the repetition of the direct object makes the sentence ungrammatical. This structure is also unacceptable in informal speech. It is important to note, however, that this utterance was found in a non-scripted variety show, where speech is spontaneous and speakers may at times produce unacceptable sentences.

5.1.2 Enclitics

As for the enclitics, with a total of 308 tokens, 304 (98.8%) followed verbs in the infinitive (verbs ending in -r), and only 4 (1.2%) followed a vocalic sound. Let’s look at these two cases below and specify which ones were considered mandatory or optional.

24 Brazilian Catholic figure.

After infinitive
  Mandatory (simple verbs)               172 (55.9%)
  Optional (compound verbs)              132 (42.9%)
  Total                                  304 (98.8%)

After vowel, mandatory
  Sentence starting with verb              1 (0.3%)
  After gerund                             1 (0.3%)

After vowel, optional
  After coordinative conj.                 1 (0.3%)
  Structure of the verb (aux + part)       1 (0.3%)
  Total                                    4 (1.2%)

Total Overall                            308 (100%)

Table 7. Enclitics

5.1.2.1 With verbs in the infinitive

A total of 172 enclitics were classified as mandatory. These were used with simple verbs in the infinitive, where the pronoun has to be enclitic to the verb (localizar + os = localizá-los ‘find them’), as seen in example (54).

(54) O senhor disse que é muito fácil localizá-los.
the sir say.3.SG.PAST that be.3.SG.PRES very easy locate.INF-them.M.ACC
‘You said it’s very easy to find them.’
(Encontro com Fátima Bernardes, NS, F)

On the other hand, 132 cases of enclitics were considered optional. These clitics were used in a sequence of verbs. The verbal sequences found in the data were AUX + infinitive (110 cases) and AUX + participle (5 cases), and a further 17 sequences were used after an attracting word (as in the rules for proclitics). I classify them as optional because, as explained in the grammatical description, there is more than one possible position for clitics in this case.

If the main verb is an infinitive, as shown below, the pronoun can be proclitic to the auxiliary verb, enclitic to the auxiliary verb, or enclitic to the main verb. In 100% of the cases, the clitic was placed after the verb in the infinitive.

(55) Eu preciso localizá-lo.
I need.1.SG.PRES locate.INF-him.ACC
‘I need to find him.’
(Guerra dos Sexos, SO, F)

If the main verb is a participle (example 56), the pronoun can be placed proclitic or enclitic to the auxiliary verb. All the examples found consist of the verb ter ‘have’ in the infinitive + PART, and all of them have the pronoun enclitic to the auxiliary verb.

(56) Leopoldo foi acusado de tê-la matado.
Leopoldo be.3.SG.PAST accuse.PART of have.INF-her.ACC kill.PART
‘Leopoldo was accused of killing her.’
(Programa do Jô, NS, F)

Moreover, 17 cases of verbal sequences after an attracting word (as seen in the rules for proclitics) were found. In this case, the pronoun can be placed either before the auxiliary or after the main verb (before or after the verbal sequence). All of the verbal sequences were composed of AUX + infinitive, and the clitic was always placed after the main verb (in the infinitive).

(57) O encontro chocante da mãe viciada em craque com a filha que tenta salvá-la.
the meeting shocking of.the mother addict in crack with the daughter that try.3.SG.PRES save.INF-her.ACC
‘The shocking meeting of the crack addict mother with the daughter that tries to save her.’
(Jornal Hoje, N, F)

5.1.2.2 With verbs ending in a vocalic sound

Only four cases were found, and the examples below are illustrative.

Examples (58) and (59) show the mandatory uses, with the clitic pronoun after a verb in the gerund and in a sentence starting with a verb, meaning that the pronoun’s only possible position is enclitic to the verb.

(58) …e atacou sistematicamente povos indígenas
… and attack.3.SG.PAST systematically peoples indigenous
acusando-os de apoiar os guerrilheiros.
accuse.GER-them.M.ACC of support.INF the guerrillas
‘… and attacked systematically the indigenous peoples accusing them of supporting the guerrillas.’
(Jornal da Globo, N, M)

(59) Deixou-a sem comer e a espancou na barriga.
leave.3.SG.PAST-her.ACC without eat.INF and her.ACC hit.3.SG.PAST in.the stomach
‘(He) left her starving and hit her in the stomach.’
(Jornal da Globo, N, M)

As for the optional enclitics, one of them comes after the coordinative conjunction e ‘and’ (example 60) and the other is used in a verbal sequence with the structure AUX + Participle (example 61). In the latter case there was another possible position for the pronoun (proclitic to the auxiliary verb), but the position enclitic to the auxiliary was favoured.

(60) No dia 27, o agente imprimiu a denúncia
on.the day 27, the agent print.3.SG.PAST the complaint
e anexou-a ao seu relatório.
and attach.3.SG.PAST-her.ACC to his report
‘On the 27th, the agent printed the complaint and attached it to his report.’
(Jornal Hoje, N, F)

(It would be acceptable to say No dia 27, o agente imprimiu a denúncia e a anexou ao seu relatório.)

(61) Eu tenho-a encontrado em várias situações da minha vida.
I have.1.SG.PRES-her.ACC meet.PART in many situations of.the my life
‘I have been meeting her on many different occasions throughout my life.’
(Mais Você, NS, F)

(Another possibility would be Eu a tenho encontrado em várias situações da minha vida.)

No examples of enclitics in the forms -no(s), -na(s) were found. With verbs with a nasal ending, 34 proclitic pronouns were mandatory (for example, sentence (62) below, where the clitic is attracted by the adverb também ‘also’) and 14 were optional, with the speaker preferring proclisis, as seen in example (63), where no word attracts the clitic but the proclitic was still chosen.

(62) Os historiadores também o chamam de gênio.
the historians also him.ACC call.3.PL.PRES of genius
‘Historians also call him a genius.’
(Programa do Jô, NS, F)

(63) Seus colegas o chamam de ‘hit maker’.
his friends him.ACC call.3.PL.PRES of hit maker
‘His friends call him ‘hit maker’.’
(Jornal da Globo, N, M)

5.1.3 Endoclitics

There were no examples of endoclitics in my data. There were only 4 sentences with simple verbs in the conditional form, and 3 of them fell into the rules of proclisis due to the presence of an attracting word, as shown in example (64) below.

(64) Mas logo a sua voz belíssima encontraria o estilo
but soon the her voice very.beautiful find.3.SG.COND the style
que a faria uma das cantoras mais populares do seu tempo.
that her.ACC make.3.SG.COND one of.the singers more popular of her time
‘But soon, her very beautiful voice would find the style that would make her one of the most popular singers of her time.’
(Jornal da Globo, N, M)

In the whole data, there was only one example that required endoclisis; even so, proclisis was used, since endoclisis is rare. This example reflects popular language use and is completely acceptable among native speakers, although it does not follow the rules for the placement of clitics.

(65) *Se eu pudesse, eu mesma o contrataria, mas…
if I can.1.SG.PAST.SUBJ I myself him.ACC hire.1.SG.COND, but…
‘If I could, I would hire him myself, but…’
(Guerra dos Sexos, SO, F)

According to normative grammar, the correct choice here would be an endoclitic, because the verb is in the conditional form and no word to the left of the verb attracts the clitic. The sentence should be Se eu pudesse, eu mesma contratá-lo-ia, mas… . But, as we will discuss later, endoclitics are barely used in Brazilian Portuguese anymore (spoken or written) and other structures are favoured instead.

Besides the conditional, endoclitics should also be used with future forms. However, no simple future form was found in the data: compound forms (AUX + infinitive) are preferred over simple future forms. So, instead of the simple forms eu farei ‘I will do’ or ele fará ‘he will do’, compound forms such as eu vou fazer and ele vai fazer are used. Because the main verb in these compounds is in the infinitive, all future forms found in the data were used with enclitics (examples (66) and (67)).

(66) Um defensor público vai representá-lo no júri. one defender public will represent.INF-him.ACC in.the jury ‘A public defender will represent him in court.’ (Jornal Nacional, N, M)

(67) A gente vai conhecê-los agora. the we will meet.INF-them.M.ACC now ‘We will meet them right now.’ (Encontro com Fátima Bernardes, NS, F)

5.2 Non-Standard pronouns found in the data

Now that I have presented the numbers and examples for the standard pronouns, I turn to the non-standard ones: ele, ela, eles and elas ‘he’, ‘she’, ‘they (m)’ and ‘they (f)’. There were 757 non-standard uses of accusative pronouns in the whole data set, which represents 60.5% of all accusative pronouns. The examples below are illustrative:

(68) Chamei ele no palco, ele veio. call.1.SG.PAST he.ACC in.the stage, he come.SG.M.PAST ‘I invited him to come on stage, he came.’ (Encontro com Fátima Bernardes, NS, M)

(69) Vocês conhecem ela? you know.3.PL.PRES she.ACC ‘Do you know her?’ (Salve Jorge, SO, F)

The use of the non-standard forms found in my data will be discussed in the next chapter, where I attempt to explain to what extent this distribution reflects how Brazilians use third person accusative pronouns.

5.3 Standard vs. non-standard pronouns according to the kind of show

As explained before, there were three kinds of show (news, non-scripted, and soap opera), with three shows in each. Let us first look at Table 8, which shows where the tokens were found, and then analyze each type of show separately.

| Kind of show | Standard | Non-standard | Total |
|---|---|---|---|
| News total | 296 (99.3%)²⁵ | 2 (0.7%) | 298 |
| Non-scripted total | 141 (25.9%) | 404 (74.1%) | 545 |
| Soap opera total | 58 (14.2%) | 351 (85.8%) | 409 |
| Total overall | 495 (39.5%) | 757 (60.5%) | 1252 |

Table 8. Pronoun distribution according to the kind of show

25 Percentages are computed within each row, not over the overall total of each table.
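The row-wise percentages can be recomputed directly from the token counts. The following is a minimal sketch in Python, not part of the original analysis; the function name `row_percentages` and the dictionary layout are illustrative assumptions, while the counts are those reported in the table.

```python
# Token counts per kind of show, as reported in the table: (standard, non-standard).
counts = {
    "News": (296, 2),
    "Non-scripted": (141, 404),
    "Soap opera": (58, 351),
}


def row_percentages(standard, non_standard):
    """Return (standard %, non-standard %) within one row, rounded to one decimal."""
    total = standard + non_standard
    return (round(100 * standard / total, 1), round(100 * non_standard / total, 1))


# Percentages are taken within each row, as the table footnote specifies.
for show, (std, non_std) in counts.items():
    std_pct, non_std_pct = row_percentages(std, non_std)
    print(f"{show}: {std_pct}% standard, {non_std_pct}% non-standard (n = {std + non_std})")

# Overall totals are the column sums.
overall_std = sum(std for std, _ in counts.values())
overall_non = sum(non_std for _, non_std in counts.values())
print("Overall:", row_percentages(overall_std, overall_non))
```

Computing each percentage from its own row total, rather than from the overall total, is what makes the three genres directly comparable despite their different token counts.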


5.3.1 News shows

As expected, the news category displayed a large number of standard forms of the third person accusative pronouns. With the exception of 2 examples (0.7%), all 296 remaining utterances (99.3%) contained standard accusative pronouns. Table 9 illustrates these numbers:

| Show | Standard | Non-Standard | Total |
|---|---|---|---|
| Jornal Nacional | 104 (100%) | 0 | 104 |
| Jornal da Globo | 107 (100%) | 0 | 107 |
| Jornal Hoje | 85 (97.7%) | 2 (2.3%) | 87 |
| News total overall | 296 (99.3%) | 2 (0.7%) | 298 |

Table 9. Accusative pronouns in the news shows

In Chapter 2, I presented the results of Bachman’s 2011 study. She investigated third person accusative pronouns in Brazilian news programs and concluded that the use of pronouns (both clitic and tonic) was extremely low and that lexical noun phrases and passive constructions were used instead to establish anaphoric reference. She also did not find a single direct-object tonic pronoun in her whole data set. In contrast to her study, however, I came across two utterances that contained tonic pronouns used as direct objects, as shown in (70) and (71).

(70) Por uma fresta, teriam conseguido ver uma mão e through one crack have.COND can.PART see.INF one hand and

escutaram ela dizer: eu ainda estou aqui, meu nome é Reshna. hear.PAST she.ACC say.INF I still be.1.PRES here, my name be.3.PRES Reshna ‘Through a crack, (they) could see a hand and heard it say: I’m still here, my name is Reshna.’ (Jornal Hoje, N, M)

(71) As fotos também mostram os detentos no pátio e jogando the photos also show.3.PRES the inmates in.the court and play.GER

videogame (…) essa mostra eles usando drogas. videogame (…) this show.PRES they.ACC use.GER drugs

‘The pictures also show the inmates in the court and playing videogames (…) this one shows them using drugs.’ (Jornal Hoje, N, M)

The structures shown in these examples contain “accusative subjects”²⁶. In example (70), the pronoun ela ‘she’ is the subject of the verb dizer ‘say’, while the whole expression ela dizer ‘she say’ is the object of the preceding verb escutaram ‘they heard’. In example (71), eles ‘they’ is the subject of the verb usando ‘using’ and eles usando drogas ‘they using drugs’ is the object of the verb mostrar ‘show’. According to prescriptive grammar, nominative pronouns (ele ‘he’, ela ‘she’) cannot be used as accusative subjects; only accusative pronouns such as o ‘him’ and a ‘her’ (and their variants) can. Therefore, the structure escutaram ela dizer ‘they heard she say’ seen in the first example above should be replaced by escutaram-na dizer ‘they heard her say’ and, in the second example, essa mostra eles usando drogas ‘this one shows they using drugs’ should be essa mostra-os usando drogas ‘this one shows them using drugs’.

26 An “accusative subject” is the subject of a verb in the infinitive or in the gerund in a clause that acts as the direct object of another verb. Only causative or sensory verbs can precede an accusative subject. In Portuguese, the most common are fazer ‘make’, mandar ‘order’, deixar ‘let’, ver ‘see’, ouvir ‘hear’, sentir ‘feel’, and their synonyms.

As described in Chapter 3, personal pronouns are classified according to their syntactic function or grammatical case. Subjects are associated with the nominative case, and the accusative case marks direct objects, so sentences with “accusative subjects” can seem confusing, and not everyone applies the rules for their use correctly. It is important to note that, even though these examples would be classified as ungrammatical by normative grammar for the reasons explained above, they are still very common.

Based on these numbers (2 non-standard forms and 296 standard forms), we can conclude that news shows do follow the rules of prescriptive grammar for the use of accusative pronouns and that non-standard forms are rare.

5.3.2 Non-scripted shows

Table 10 reflects the numbers of standard and non-standard forms found in the three shows in this category:

| Show | Standard | Non-Standard | Total |
|---|---|---|---|
| Encontro com Fátima Bernardes | 54 (31.4%) | 118 (68.6%) | 172 |
| Mais Você | 45 (28.1%) | 115 (71.9%) | 160 |
| Programa do Jô | 42 (19.7%) | 171 (80.3%) | 213 |
| Non-scripted total overall | 141 (25.9%) | 404 (74.1%) | 545 |

Table 10. Accusative pronouns in the non-scripted shows

We can see that, even though standard forms were found in all three shows (25.9%), non-standard forms were still favored (74.1%).

The shows’ hosts provided 34.7% (n = 189) of the examples: 71 standard forms (50.35% of all standard tokens in this category) and 118 non-standard forms (29.21% of all non-standard tokens), as shown in Table 11.

| Show | Standard | Non-Standard | Total # |
|---|---|---|---|
| Encontro com Fátima Bernardes | 32 (80%) | 8 (20%) | 40 |
| Mais Você | 31 (36%) | 55 (64%) | 86 |
| Programa do Jô | 8 (12.7%) | 55 (87.3%) | 63 |
| Total tokens from hosts | 71 (37.6%) | 118 (62.4%) | 189 |
| Non-scripted total tokens | 141 | 404 | 545 |

Table 11. Accusative pronouns used by the hosts in each NS show

On the first show, the host contributed 40 utterances, 32 of which (80%) were standard. Nevertheless, even though there were only a few examples, she still made use of non-standard forms, such as:

(72) Você mima ela demais. you spoil.2.PRES she.ACC too.much ‘You spoil her too much.’ (Encontro com Fátima Bernardes, NS, F)

On the other hand, the second show’s host, Ana Maria Braga, used more non-standard pronouns than standard ones. She contributed 86 utterances, of which 31 were standard (36%) and 55 were non-standard (64%). The examples below show her using the same verb and tense in the same context, with a standard pronoun in (73) and with a non-standard pronoun in (74).

(73) Pra eu fazê-lo ficar fofinho, for I make.INF-him.ACC be.INF soft

eu tenho que pôr creme de leite aqui. I have.1.PRES that put.INF cream of milk here ‘To make it (the cake) soft, I need to put some heavy cream here.’ (Mais Você, NS, F)

(74) Agora eu tenho que fazer ele ficar fofinho. now I have.1.PRES that make.INF he.ACC be.INF soft ‘Now, I need to make it (the cake) soft.’ (Mais Você, NS, F)

The third show has the largest difference between standard and non-standard forms, both overall (42 standard and 171 non-standard) and in the host’s utterances. The host, Jô Soares, used 8 (12.7%) standard forms and 55 (87.3%) non-standard forms. The examples below show him using the verb levar ‘take’ with both kinds of pronoun:

(75) E ela queria levá-los pra Nova Iorque (…) and she want.IMP.PAST take.INF-they.M.ACC to New York ‘And she wanted to take them to New York (…)’ (Programa do Jô, NS, M)

(76) Eu levei ele pra fazer teatro comigo. I take.PAST he.ACC to do theater with.me ‘I took him to take acting lessons with me.’ (Programa do Jô, NS, M)

Interestingly, there were instances where the speaker used a standard and a non-standard pronoun in the same sentence to refer to the same antecedent.

(77) Apesar de também admirá-lo e me identificar com o lado though of also admire.INF-him.ACC and me identify with the side

romântico, eu pensei na minha mãe quando eu escolhi ele. romantic I think.PAST in.the my mother when I chose he.ACC

‘Despite admiring him and identifying myself with his romantic side, I thought of my mom when I chose him.’ (Encontro com Fátima Bernardes, NS, F)

(78) Você vai pegar essa parte do osso aqui, você vai tirá-la da you will take this part of.the bone here you will take-her.ACC of.the

carne, mas vai deixar ela grudadinha. meat but will leave she.ACC stuck

‘You’ll get this part of the bone, you’ll remove it from the meat, but you will leave it stuck together.’ (Mais Você, NS, F)

5.3.3 Soap operas

Table 12 shows the distribution of accusative pronouns in the three soap operas analyzed.

| Show | Standard | Non-Standard | Total # |
|---|---|---|---|
| Avenida Brasil | 2 (1.4%) | 142 (98.6%) | 144 |
| Salve Jorge | 7 (5.8%) | 114 (94.2%) | 121 |
| Guerra dos Sexos | 49 (34.0%) | 95 (66.0%) | 144 |
| Soap opera total | 58 (14.2%) | 351 (85.8%) | 409 (100%) |

Table 12. Accusative pronouns in the soap operas

Even though some standard forms were found in all the shows (14.2%), non-standard forms were still favored (85.8%), especially in Avenida Brasil and Salve Jorge, the two soap operas that take place in Rio de Janeiro. Guerra dos Sexos, set in São Paulo, shows the highest percentage of standard forms and, correspondingly, the lowest percentage of non-standard forms. The importance of the location where the soap operas take place will be discussed in the next chapter.

5.4 Summary

This chapter analyzed the distribution of standard and non-standard pronouns found in my data. Section 5.1 described the standard forms (proclitics and enclitics) and analyzed their usage according to the normative rules described in Chapter 3. Section 5.2 briefly introduced the non-standard forms, and section 5.3 compared the distribution of both kinds of pronoun according to the kind of show.

In the next chapter, I will present my statistical results and analyze the variable importance graphs and the inference trees obtained for each model.

CHAPTER 6

STATISTICAL RESULTS

In this chapter, I describe the results obtained from the random forests and trees model used for my statistical analysis. Section 6.1 shows the results for the prediction of the kind of pronoun (standard or non-standard) in two parts, one using the whole data set and one using only enclitic structures (verb + pronoun); section 6.2 describes the results for clitic placement (proclitic vs. enclitic). Graphs of variable importance and inference trees are also included to illustrate my results.

6.1 Kind of pronoun: standard vs. non-standard

6.1.1 Whole data

Figure 3 shows the conditional permutation variable importance for the first random forest with all the predictors used (C = .99). The index of concordance (C) measures how well the entire forest predicts my data. For a model to be considered a good classifier, the value of C needs to be at least 0.8 (Janda et al., 2012, p. 7).
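For a binary response, the index of concordance can be read as the proportion of (standard, non-standard) token pairs in which the model assigns the higher predicted probability to the token that really is standard, with ties counted as half. The sketch below illustrates that definition in Python; it is not the code used for the analysis, and the function name and the four toy values are assumptions made purely for illustration.

```python
def concordance_index(y_true, scores):
    """Index of concordance (C) for a binary outcome.

    y_true: 1 for the target class (e.g. standard pronoun), 0 otherwise.
    scores: predicted probability of the target class for each token.
    """
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present to compute C")

    concordant = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                concordant += 1.0   # pair ranked correctly
            elif p == n:
                concordant += 0.5   # tie counts as half
    return concordant / (len(positives) * len(negatives))


# Toy values (hypothetical, not thesis data): three of four pairs are ranked correctly.
toy_c = concordance_index([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
print(toy_c)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why values such as .99 indicate a model that separates the two kinds of pronoun almost perfectly.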


Figure 3. Variable importance for random forest 1 (Standards vs. non-standards with the whole dataset)

What this figure shows is that placement is by far the most important variable, being the best predictor of the use of standard pronouns (all proclitics are standard). Placement is followed by kind of show and context; these two variables are so important because news programs have a formal context and standard forms are highly favored in this type of show. Some predictive power is detectable for the other variables, mostly for infinitive²⁷, mood, tense, number, SES, main clause, verb, compound structure and stress.

To elucidate how the variables used by the random forest interact, let’s consider the partitioning tree for this data, which highlights the interaction of the predictors in the whole dataset. The trees have nodes and these form sub-trees based on attributes not used in higher nodes. As Tagliamonte & Baayen (2012, p.159) explain, the algorithm works through all predictors, splitting (partitioning) the data into subsets where justified, and then recursively considers each of the subsets, until further splitting is not justified. In this way, the algorithm partitions the input space into subsets that are increasingly homogeneous with respect to the levels of the response variable. The result of this recursive binary splitting of the data is a conditional inference tree.
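The recursive binary splitting described above can be sketched in a few lines of Python. This is only an illustration of the partitioning idea: conditional inference trees choose splits with permutation-based significance tests, whereas this sketch swaps in the simpler Gini impurity criterion, and the six rows are invented toy data, not tokens from the thesis. All function and variable names are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0 means perfectly homogeneous)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_binary_split(rows, predictors, response):
    """Return (predictor, level, gain) for the best 'level vs. rest' split, or None."""
    parent = gini([r[response] for r in rows])
    best = None
    for p in predictors:
        for level in sorted({r[p] for r in rows}):
            left = [r[response] for r in rows if r[p] == level]
            right = [r[response] for r in rows if r[p] != level]
            if not left or not right:
                continue
            weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            gain = parent - weighted
            if best is None or gain > best[2]:
                best = (p, level, gain)
    return best


def grow(rows, predictors, response, min_gain=0.01):
    """Split recursively until no split improves homogeneity by at least min_gain."""
    split = best_binary_split(rows, predictors, response)
    if split is None or split[2] < min_gain:
        # Leaf: predict the majority class of this subset.
        return Counter(r[response] for r in rows).most_common(1)[0][0]
    p, level, _ = split
    yes = [r for r in rows if r[p] == level]
    no = [r for r in rows if r[p] != level]
    return {"split": (p, level),
            "yes": grow(yes, predictors, response, min_gain),
            "no": grow(no, predictors, response, min_gain)}


# Invented toy rows for illustration only.
toy_rows = [
    {"show": "news", "infinitive": True, "pronoun": "standard"},
    {"show": "news", "infinitive": False, "pronoun": "standard"},
    {"show": "soap", "infinitive": True, "pronoun": "standard"},
    {"show": "soap", "infinitive": False, "pronoun": "non-standard"},
    {"show": "talk", "infinitive": False, "pronoun": "non-standard"},
    {"show": "talk", "infinitive": True, "pronoun": "non-standard"},
]
tree = grow(toy_rows, ["show", "infinitive"], "pronoun")
print(tree)
```

Each recursive call sees only the subset that reached it, so the subsets become increasingly homogeneous with respect to the response, which is the property the paragraph above describes.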

It is important to say that individual trees are extremely sensitive to small quirks in the data. If just a few lines of the spreadsheet are changed, the best tree that the ctree() function creates could look completely different. Forests are hundreds of trees, each built over a different, randomly chosen subset of your data. If a variable creates a split near the top of most of the trees in the forest, we can be a lot more confident that that variable really is important and the fact that it creates a split near the top of the best tree isn't just because of a small quirk in the data. Figure 4 summarizes graphically the results of the best inference tree for this dataset.

27 A quick note on this variable: my first attempts at forests used final segment as a variable, but it became clear that the resulting split between /r/ vs. everything else was really a split between infinitives and non-infinitives, so this is the opposition that I used in all the trees and forests.


Figure 4. Inference tree 1 (Standards vs. non-standards with the whole dataset)

In the tree above, we can see that the single most important factor influencing pronoun choice is the type of the TV show. Speakers in news shows use the standard form almost 100% of the time. In non-scripted shows and soap operas, we can observe there are other factors that influence whether the speaker uses a non-standard form or not, like the placement and number of the pronoun, verbs in the infinitive and the SES of the speaker.

Regarding placement, what really matters are the enclitic pronouns (since all proclitics are standard). These are affected by whether the verb is in the infinitive (non-infinitive verbs take non-standard pronouns 100% of the time), and we can see a difference according to the SES of the speakers, with higher-SES speakers using more standard forms and low- and middle-SES speakers strongly favouring non-standard forms.

The node labelled “verb set” refers to the actual verbs, which I named “set 1” and “set 2” for purely cosmetic convenience. The two verb sets were chosen by the tree-splitting algorithm itself, and which verbs end up in which set is one of the things most likely to change if there are small modifications in the input data. The high ranking of “verb set” in the variable importance does not tell us that these particular sets 1 and 2 are important, just that most of the trees in the forest have a fairly high split involving some two sets of verbs. It is quite probable that many of the specific assignments of verbs to set 1 or 2 are simply the result of overfitting to my data.

One thing we can note when comparing the variable importance graph with the best tree’s splits is that the rankings of the variables differ and the two do not agree on the most important one. However, since forests protect against overfitting to the data (which is probably what the single tree is doing), the results shown in the variable importance graph (Figure 3) are the more reliable ones.


6.1.2 Enclitics

The second forest used a subset of the data that included only enclitic structures (verb + pronoun) with standard and non-standard forms, for example peguei-a ‘got her’ and fazê-lo ‘make him’ for standard, or peguei ela and fazer ele for non-standard. The following results were obtained.

Figure 5. Variable importance for random forest 2 (Standards vs. non-standards with enclitic structures)


As the graph shows, the most important predictor is the kind of show, followed by verbs in the infinitive, context and speaker’s SES. All the other variables play some part, especially tense, mood, compound structure, stress, and auxiliary verb. The index of concordance (C) obtained for this model was 0.98.

Figure 6 represents the best tree built with the same predictors as the forest above to show how the important predictors are related in this dataset.

Figure 6. Inference tree 2 (Standards vs. non-standards with enclitic structures)

In this tree, we can observe that the most important factor influencing pronoun choice is again the type of TV show, with news shows dominated by standard forms. In non-scripted shows and soap operas, other variables influence whether the speaker uses a standard form or not. Non-infinitive verbs have only a few instances of standard forms, whereas for infinitives the speaker’s SES and number affect the choice of pronoun, contributing to greater use of standard forms. Unlike with the first data set, the forest and the tree for this data set agreed on the most significant predictors.

6.2 Clitic placement: proclitic vs. enclitic

The only predictor that proved meaningful for the placement of the pronoun was the infinitive. Verbs ending in “r” (infinitives) favour enclitic pronouns 98.7% of the time, and all other final segments found in the data favour proclitic pronouns (97.8%). The graph and tree for this model do not show any other important interaction and are therefore not shown.

6.3 Summary

This chapter presented my statistical results and illustrated them with graphs and trees obtained from the random forests and trees model. My analysis was split into two parts, one to predict the kind of pronoun (standard or non-standard) and one to predict the clitic placement (proclitic or enclitic). We can conclude that the kind of show, context, speaker’s SES and verbs in the infinitive were the most important factors in predicting the kind of pronoun; as for clitic placement, the only important predictor was verbs in the infinitive. In the next chapter, I will relate my findings to my research questions and discuss them.

CHAPTER 7

DISCUSSION

This study was undertaken to examine the distribution of the third person accusative pronouns in three kinds of TV shows (news, non-scripted and soap operas). I was hoping to analyze the variation between normative/prescriptive (scripted) and descriptive (non-scripted) language in Brazilian Portuguese.

In this chapter, I present my findings following the research questions posed in Chapter 1 and discuss them.

1) What is the distribution of standard and non-standard accusative pronouns on the TV shows analyzed?

Table 8 (in section 5.3) summarizes the results found in my data and shows that, of the 1252 examples collected overall, 495 were standard and 757 were non-standard.

Of the three television genres analyzed, only the news shows, which are scripted and use formal language, favored standard pronouns (99.3% were standard). In non-scripted shows, which display spontaneous and natural conversations, and in soap operas, whose scripts should be as close as possible to the language actually used by native speakers, the large majority of the pronouns found were non-standard (74.1% and 85.8% respectively). These results show a strong imbalance among the three types of show considered in my research.

Before further discussion, one term needs to be defined that might help us understand language variation and thus the distribution of the pronouns in my data: “audience design”, a framework first outlined by Bell (1984), which proposes that speakers adjust their linguistic style towards their audience.

This model has been shown to be especially powerful in mass communication such as television: since the language used in TV programs is tailored to a particular audience, the style of each show is directed at its target viewers. Every TV show has a deliberately audience-oriented communication style that defines its setting and how it is presented and broadcast. As Scannel (1991, as cited in Robertson, 2014) states:

“The communicator must affiliate to the situation of their audience, and align their communicative behavior with those circumstances. The burden of responsibility is thus on the broadcasters to understand the conditions of reception and to express that understanding in language intended to be recognized as oriented to those conditions.”

In my results, the high use of standard pronouns in the news seems to contrast sharply with actual language use when compared with the other kinds of show. We might ask why this happens even though this type of show is intended for the entire Brazilian population.

News shows exhibit a degree of formality and have a specific format that needs to follow language standards, which is why standard clitic pronouns form the large majority of tokens found in the news data (99.3%). With the role of providing quality content, journalistic language is carefully planned and organized by highly educated people (reporters, journalists, editors, etc.) who are knowledgeable about the grammatical rules of the language. The spoken language in this kind of program is based on a written text and is therefore more formal and precise.

Hofmann (2007) explains this well when she states:


“Locating the evening news programmes on a sociolinguistic scale, they are representative of educated language use, with contributions from university-educated news presenters and journalists making up the majority of the language material. While the news is read out from the teleprompter, one has to be careful with classifying it as oral speech, since all texts in the evening news are scripted and thus involve varying degrees of planning. Even the inclusion of interview materials representing more spontaneous speech is done in a highly planned way.” (2007, as cited in Bachman, 2011, p.9)

These shows must use language that follows the general tendencies of the spoken language while showing a grammatical sophistication more common in written discourse, in order to deliver a serious but natural reporting style. Therefore, informal or non-standard language is expected to be avoided in high-level news programs, and reporters’ and journalists’ language is expected to follow a formal variety, even when the content is designed for the general population.

On the other hand, non-scripted shows displayed more variation in the kind of pronouns used. Because they do not follow a script, it was expected that the language used in this kind of TV show (talk shows and interviews) would reflect how people speak in real life. In other words, it was expected that grammatical rules would be followed less strictly or not at all.

Non-standard pronouns account for the large majority of the data in this kind of show (74.1%), but despite their high occurrence there was still considerable use of standard forms (25.9%). Their presence could be explained by different factors, such as audience design, which was described above.

This noticeable variation between standard and non-standard forms could also be explained by the format of this TV genre. Non-scripted shows include a variety of situations and interactions involving the hosts, their guests, the audience, and the spectators at home. The hosts and their guests can polish their language depending on the level of formality of the conversation and on whom they are talking to. For example, in Mais Você, most of the standard examples found were uttered by the host, Ana Maria Braga: of the 45 standard pronouns found, she provided 31 (69%). However, with 55 examples that included non-standard pronouns (64% of her total utterances), it is clear that she favored this form. On the other hand, the host of Programa do Jô, a show made up mostly of face-to-face interviews between the host and his guests and whose language is even more natural and spontaneous, used non-standard forms in 87.3% of his examples. Since this show features interviews with many different people of varied background, age, geographical region, profession, education, social level, etc., we should also consider another concept that could help us understand the variation in language. “Accommodation Theory”, developed by Giles (1971), suggests that people can vary their communicative styles and strategies and ‘accommodate’ (adapt) their speech according to whom they are talking to, moving their style closer to their addressee’s. This ‘convergence’ of speech decreases the possible differences between the speakers, making them more comfortable with each other and their conversation more natural. This may be why the host Jô Soares, a highly educated and knowledgeable person, also uses many non-standard forms on this show: it is the kind of language his guests would prefer and use. Furthermore, one interview differed drastically from all the others: the guest was a female judge, and it was the only interview in which both speakers used only standard forms, so it is very likely that he was accommodating his language to hers and that their conversation followed a more formal style. Unfortunately, examining exactly in which kind of interaction (to a guest or to the camera) the pronouns were found is beyond the scope of this work and is left for future research.

One aspect of the non-scripted shows I would like to discuss concerns examples (77) and (78), presented in Chapter 5. For easy reference, I repeat them below:

(77) Apesar de também admirá-lo e me identificar com o lado though of also admire.INF-him.ACC and me identify with the side

romântico, eu pensei na minha mãe quando eu escolhi ele. romantic I think.PAST in.the my mother when I chose he.ACC

‘Despite admiring him and identifying myself with his romantic side, I thought of my mom when I chose him.’ (Encontro com Fátima Bernardes, NS, F)

(78) Você vai pegar essa parte do osso aqui, você vai tirá-la da you will take this part of.the bone here you will take-her.ACC of.the

carne, mas vai deixar ela grudadinha. meat but will leave she.ACC stuck

‘You’ll get this part of the bone, you’ll remove it from the meat, but you will leave it stuck together.’ (Mais Você, NS, F)

It is known that internal (linguistic) and external (extralinguistic) factors can explain the presence of different linguistic variants, and this study set out to find which ones can predict the kind of pronoun in BP. What is interesting about these two examples in particular is that they show the same speaker using one standard and one non-standard pronoun in the same sentence, in the same context, to refer to the same antecedent. In both examples, the first pronoun is standard and the non-standard one comes towards the end of the sentence. The distance between the pronoun and its referent could also be related to the kind of pronoun chosen by the speaker, but unfortunately this was not a variable analyzed in this study.

Moreover, since no sociolinguistic factors can explain this variation (same speaker, same context), and bearing in mind my results, which indicated that verbs in the infinitive favor clitics, sentence (77) can be accounted for: the verb admirar ‘admire’, in the infinitive, is followed by the clitic -lo, while the verb escolher ‘choose’, which is in the past tense and ends in a vowel, is followed by the non-standard form ele. However, example (78), which has the same structure (will + infinitive) in both cases, displays one standard pronoun (-la) and one non-standard pronoun (ela). These examples show that the clitic system in BP is becoming more diverse and may undergo changes in the future.

As for soap operas, they do have a script, but the script has to reflect how people use language in real life; the story lines need to be engaging and, most importantly, believable.

Soap operas, a very common and successful TV genre in Brazil, are watched by millions of viewers and, according to La Ferrara et al. (2008), this success can be traced to three factors: 1) the locations, context and issues they address are familiar to the Brazilian population, so viewers can relate to the story; 2) they are accessible to viewers through colloquial language and elements of popular culture from everyday life; and 3) the high production quality of the soap operas. In other words, soap operas are remarkably realistic.

Their authors adapt the script so that viewers will believe it and identify it with their own reality, recognizing themselves in the fiction. Therefore, the lines are planned according to the theme, the context and the characters, and the dialogues need to convince the viewer that they are real. Soap operas portray the language, customs and cultures of specific places and kinds of people, so the speech used in the lines is very important in defining who the characters are and where they are from.

In Brazilian soap operas, this is exactly what happens. As presented in Chapter 2, TV Globo in Brazil usually uses a prestigious variety that follows the dialects of Rio de Janeiro and São Paulo, and we can notice differences in the language used in each soap opera based on the plot and on where it takes place.

The two soap operas set in Rio de Janeiro (Salve Jorge and Avenida Brasil) showed a much higher use of non-standard forms, with 256 non-standard examples and only 9 standard ones altogether. On the other hand, in Guerra dos Sexos, the soap opera set in São Paulo, 49 standard forms and 95 non-standard forms were found. Based on these results, we can infer that the language of each plot was carefully planned to portray the characters according to their roles, and I believe that where the characters are from also influences the kind of pronoun used.

As described previously (in Chapter 4), regarding the location of the stories, Rio de Janeiro is a more informal city, so the language used in the two soap operas taking place there is also more informal. The plots involve many people of lower SES, which implies low education levels and less formality in their speech; however, even the upscale characters in these two soap operas (those who live in the good neighborhoods and who are doctors, lawyers or entrepreneurs) still do not use standard forms as would be expected given their socio-economic status and level of education. On the other hand, São Paulo, a business-driven city associated with money and power, was the setting of a plot centered on a rich and aristocratic population; and as discussed before (in Chapter 2), a fine use of language and the command of a standard variety are directly associated with knowledge, education and power. Therefore, I believe a greater effort was made in the script to use language to characterize this upper class from São Paulo than the one from Rio de Janeiro.

Based on the results shown in Chapters 5 and 6, we can claim that scripted shows follow the norms proposed by normative grammar and that non-scripted shows and soap operas use a more spontaneous and natural language, with fewer occurrences of standard forms.

2) Are there any linguistic or sociolinguistic factors that influence the choice of pronoun or the placement of clitics?

As shown in Chapter 6, the kind of show, the context of the utterance, the socio-economic status of the speaker and verbs in the infinitive were the significant variables for predicting the type of pronoun.

My statistical analysis showed that only a few of the variables analyzed play a significant role in the choice of pronoun (standard or non-standard) and the placement of clitics (proclitic or enclitic).

The kind of show was, by far, the most significant predictor. As for the other variables considered (as detailed in Table 4), the only linguistic variable that proved significant was the verb in the infinitive, especially for the placement of clitics (infinitives favor enclitics). This might be related to the phonetic explanation given by Pereira (1981), as shown in Chapter 3. To recapitulate, she also found that accusative pronouns with a CV structure were more common than the others, and she explains this occurrence in BP by saying that the clitics o(s)/a(s) are phonetically weak and that a syllable consisting of only a vowel is hard to attach to other syllables, so the syllable structure required after infinitives (when the clitics -o(s), -a(s) must become -lo(s), -la(s)) is still in use.

However, this does not explain why no example of -no(s), -na(s) showed up in my data; the rule for enclitics after nasal endings (as explained in Chapter 2, when the verb has a nasal ending, a pronoun placed after it takes the form -no(s), -na(s)) was never applied. In the 14 cases where these pronouns could have been used, the speaker favored proclitics instead. This may indicate that enclitics in the form -no(s), -na(s) are no longer in use in BP.

In her study, Pereira also determined that the only significant sociolinguistic variables were the age and the gender of the speaker. In my study, age was not a predictor, and gender did not prove to be significant. This seemed surprising, since gender can be a very strong sociolinguistic variable: many sociolinguists claim that although women use more standard forms, they also use higher frequencies of innovative vernacular forms, so they are the ones who usually initiate and disseminate linguistic change (Trudgill, 1972; Labov, 1990; Grégoire, 2006; etc.).

Besides Pereira’s study, other papers reviewed, such as Pagotto & Duarte (2005) and Ferreira & de Alkmim (2011), have also found gender to be a significant predictor. As for the sociolinguistic variables that I considered in my research, we can conclude that the context proved to be significant for both questions (choice of pronoun and position of clitic) and that SES also influenced the second analysis (placement of clitics).

The two sociolinguistic variables, context and SES, were also expected to play a role in the use of pronouns. The context, that is, the circumstances that form the setting of the event/interaction, can be a strong determinant of how people use language. The way people talk is influenced by the social context in which they are talking: it matters who someone is talking to and where they are talking, and the same message can be delivered in different ways depending on the participants and the kind of interaction (Holmes, 2013, p. 1).

In this setting, the level of formality plays a very important role. Informal contexts include a casual style of speech that allows speakers to use slang and discourse markers, while formal contexts require the use of more polished language. Therefore, formal contexts will show more standard forms being used, whereas informal contexts will favor the use of non-standard forms.

Furthermore, the other sociolinguistic variable, the socio-economic level of the speaker, can also be associated with the usage of language or, as in our case, with the choice of pronoun, since this factor can be related to how much a speaker has been exposed to grammatical rules and how much formality they may encounter during their interactions.

My results show that people of a higher SES use more standard forms than others. My findings support Bernardes’ (1981) early study, which claimed that the variation between standard and non-standard forms is determined by the socio-economic group of the speaker, as well as Omena’s (1978), which concluded that uneducated speakers were not aware of the grammatical rules involving accusative pronouns in BP.

3) To what extent does the distribution of the 3rd-person pronouns on TV reflect their use by Brazilians?

This study aimed at investigating language practices in different genres on Brazilian television; therefore, how do my findings reflect the language used by native speakers of Portuguese in Brazil? What assumptions and implications can be drawn from the current results? To recapitulate my final results, 199 pronouns were standard (20.86%) and 755 were non-standard (79.14%). Even though the great majority of the examples found in the shows were non-standard, there was still considerable use of standard forms. Why do we still find standard pronouns being used?

This suggests that certain rules are still followed and that some of them are incorporated into people’s everyday spoken language. In this case, the rule involving enclitics with verbs in the infinitive clearly seems to be in use among Brazilian speakers.

Unfortunately, there is no definite answer as to why this rule is still in use while many others seem to be disappearing (the use of endoclitics and of enclitics following nasal verb endings, for example); we can only speculate about the reasons why this happened in my data or about what could be happening to the distribution of pronouns in (Brazilian) Portuguese.

As for my data, as discussed under question 2, I believe that the language used on TV, which is designed for a particular audience, and the accommodation of language (depending on the kind of TV show and who is speaking/listening) might be important factors to consider here. It would be very interesting to perform the same study on spontaneous conversations in diverse contexts (and maybe in different varieties of Portuguese) to see whether the results would differ much; I suspect that face-to-face interactions containing spontaneous conversations would present even more non-standard pronouns (and non-standard language use in general).

From a wider perspective, regarding the distribution of personal pronouns in Brazilian Portuguese, I believe there could be a change in progress. On one hand, there is the matter of the rules governing the usage of clitics (standard pronouns). Considering how many rules are stated by normative grammar, which ones are actually followed, and which are important enough to be taught and memorized?

On the other hand, there is the use of subject pronouns as objects (non-standard), which is very common and yet is still seen as ‘wrong’ and ‘ugly’ by many.

Unfortunately, there is still much judgment and prejudice in Brazil towards the use of non-standard forms or varieties. Some scholars have already debated the language issue in Brazil. For example, Massini-Cagliari (2004) says that linguistic prejudice against the speech of the popular classes is so widespread that judgments are made even about the mental capacity of an individual (pp. 17-18). Linguistic discrimination and other beliefs about language can be found in attitude studies, which reveal how participants react to distinct varieties/dialects of a specific language.

Different dialectology studies have shown that speakers in Brazil hold attitudes towards different varieties of Portuguese. For example, Guedelha (2011) discussed the linguistic beliefs and attitudes of speakers in relation to the language used by others. His informants showed a preference for the variety of a capital city (seen as more correct) and associated ideas of inferiority with the varieties spoken in smaller cities. In another study, Bugel (2009) concluded that native speakers also held attitudes about Portuguese varieties. They perceived differences in pronunciation and vocabulary across regional areas and preferred one variety over the others to be spoken in Brazil and to be taught in schools to Brazilian children or to foreigners studying in Brazil. Another example is Moralis (2003), who had her informants “play a game of similarities and differences” among some regional dialects. The participants associated words/ideas like ‘similar’ vs. ‘different’, ‘nice’ vs. ‘not so nice’, ‘pleasant’ vs. ‘unpleasant’ with their own regional dialect as well as with their perception of other varieties of Portuguese in Brazil. Her results show that the informants tended to prefer their own dialect and that they stereotype other people depending on the variety they speak.

Concerning the issue involving linguistic attitudes in Brazil, Faraco (2002) states that the problems about language in Brazil involve an important political problem that deeply affects several social situations and the educational system. In the end, there is ignorance and prejudice, present in the everyday life of the people and even in educational strategies (Faraco, 2002, as cited in Massini-Cagliari, 2004).

Bagno (1999) introduces the idea of a vicious circle of linguistic discrimination, which is produced by normative grammar, traditional pedagogic methods and textbooks. This circle affects how speakers perceive language and makes them think that Portuguese is too difficult and that they are not “good speakers”. Another myth presented by Bagno is the belief that Brazilian Portuguese is characterized by unity, and this idea is probably the most dangerous and serious of all the myths that compose the mosaic of linguistic prejudice in Brazil. This myth is harmful to education because, by not recognizing the true diversity of the Portuguese language spoken in Brazil, the education system tries to impose linguistic patterns as if there were only one variety spoken by everyone in the country, regardless of age, geographic origin, socioeconomic situation and educational level (Bagno, 2002, as cited in Massini-Cagliari, 2004).

Figure 7. The vicious circle of linguistic discrimination

(Taken from Bagno, 1999, p.73)

If we set aside the results obtained from the scripted shows, we are left with the unplanned, spontaneous and natural kind of language present in the non-scripted shows, and with soap operas, whose scripts depict real language use and must be convincing to spectators. Taking my results as an example of the language used by native speakers in Brazil, one possibility is that acceptance of the phenomenon analyzed (the preference for non-standard pronouns over standard ones) is also a change that will come in the future.

Every speech community should accept that languages change (in their grammatical norms, vocabulary, phonology, etc.) and that languages vary. All living languages change, and all change is preceded by variation. Every standard language, by virtue of continuous change and conscious elaboration, contains a minimum level of variation (Joseph, 1987, p. 127). For every speaker (of standard varieties or not), language varies from utterance to utterance with regard to sound, intonation, meaning, and word and sentence structure. At a conscious or unconscious level, a speaker makes choices about how language is produced and perceived, and it is in the production and perception of speech sounds as systematic entities functioning in relationship to each other that there is perhaps the greatest potential for variation in language and, following from that, variation leading to change (Lippi-Green, 2012, p. 23).

Especially in Brazil’s situation, people should be aware that a language spoken by so many people, in such a large territory, is not going to be homogeneous. It is a highly diverse language that varies from place to place and from speaker to speaker. This is especially so if we consider the differences between social classes, which determine who gets a proper education.

Unfortunately, the great majority of Brazilians belong to a low SES and do not receive a proper education. According to the Brazilian Institute of Geography and Statistics (IBGE), in 2013 the pure illiteracy rate was 8.3% and the functional illiteracy rate was 17.8% (people over 15 years old). So, most people do not have access to the “standard” Portuguese that is taught at schools, and this naturally creates a huge gap between speakers of the standard variety and speakers of non-standard varieties, which are, in fact, the most widely spoken ones.

In order for this circle to be broken, the teaching of Portuguese needs to change. Many people think that the standard language must be the main focus in school, but unfortunately it is still reserved for a minority in Brazil, so perhaps educators need to work to put an end to linguistic prejudice by studying linguistic variation and, thus, the Portuguese that is actually used by everyone, not just by a few.

What my study showed is that the rules prescribed by normative grammar are not well implemented or, at least, are not much in use in spoken Portuguese. In this case, we analyzed third person accusative pronouns, but, as a native speaker of Brazilian Portuguese, I can assure the reader that this is not the only such case. The traditional study of Portuguese in Brazilian schools includes learning and memorizing countless rules involving the syntax (and the morphology, phonetics, etc.) of a language that most native speakers do not use, or at least not native speakers of Brazilian Portuguese. It seems that Brazilian schools stick to linguistic norms that are followed completely by native speakers of European Portuguese. It is about time to consider that these are two very different varieties of Portuguese that are growing more distant as time passes. The Portuguese taught in Brazilian schools should be the Portuguese used in Brazil, in everyday life, by everyone, and not the one spoken by a small minority (either across the Atlantic or within a very small part of the population in Brazil). We should learn “Brazilian Portuguese”, with an awareness of the very heterogeneous society that combines and affects many aspects, including language.

Bagno defines this difference, saying that teaching Portuguese under traditional didactic practice means imposing an endless set of syntactic prescriptions that are considered “correct”, imposing a series of artificial pronunciations that do not correspond to any actual linguistic variety, and demanding knowledge of incoherent terminology with incomplete and contradictory definitions. Studying Brazilian Portuguese, by contrast, means admitting that normative grammar was an important contribution but that we need to go beyond it, update it and create new ideas. Studying Brazilian Portuguese means being able to build our own knowledge and to recognize the difference between what “is” and what a few think “should be”. More importantly, it means bringing to life a language that is spoken in Brazil (a country 92 times bigger than Portugal) and noticing that languages change, that they are alive and in constant transformation (Bagno, 2010, pp. 9-10).

All the problems and misconceptions mentioned above will survive if educators do not understand this difference and if defenders of normative grammar refuse to accept language change. It is necessary to identify the true language used by educated people (spoken and written) and make it accessible to everyone. Without this, normative grammar will continue to predominate over everything else and linguistic discrimination will never cease to exist.

Based on my research, I support Bagno (2001) when he claims that one thing we can affirm about Brazilian Portuguese is that the third person accusative pronouns are, if not “dead”, about to die. He says that only people who went to school know they exist and that even they might not use them, since their usage can sound pedantic and inappropriate in certain situations. He affirms that they never show up in the speech of children or illiterate people. He calls these pronouns “linguistic fossils” that occupy an important place in first language learning (with so many rules involving their placement and usage) when, in fact, people should be spending more time and effort learning grammatical rules that are really in use.

Regarding nominative personal pronouns in complement function, this occurrence was already in evidence back in 1955. The Brazilian scholar Silveira Bueno stated that the prohibition imposed by normative grammar in language teaching finds numerous exceptions in archaic Portuguese and is absolutely transgressed in the common, living language of society. Such use has always been part of language habits in Brazil, and only the effort of educational institutions and the ongoing patrol of normative grammar can reduce its occurrence, especially in written language. Therefore, it seems that this use is widespread in Portuguese and is a natural characteristic of the language, and such spontaneity cannot be suppressed even by formal instruction (Silveira Bueno, 1955, as cited in Bagno, 2001).

It was an issue then and it still is. My view is that the concept of variation and change should be the focus of language teaching and learning in Brazilian schools. It is important that, in school, students understand that language varies as much as society does. I support the idea that, if millions of people do not follow a rule, it is because that rule needs to change and adapt to a new reality. I also believe that change is a natural and necessary process every language faces and that the normative grammar of (Brazilian) Portuguese should consider making changes in order to be more useful and more real to its speakers.

CHAPTER 8 CONCLUSION

I have examined data from three different kinds of TV shows in order to analyze the variation between standard and non-standard forms of third person accusative pronouns in spoken Brazilian Portuguese.

The findings of my study demonstrated that only news shows (scripted) favored the use of standard forms; non-scripted shows and soap operas showed variation between the two forms of pronoun with a higher preference for non-standard forms, which reflects the common use of language by native speakers. As another part of my analysis, I tried to determine whether there were any linguistic/sociolinguistic variables that could affect the choice of pronoun (standard or non-standard) or the placement of clitics (when standard pronouns were used). The kind of show was, by far, the most significant predictor. As for the other variables considered, we concluded that the context of the utterance, the socio-economic status of the speaker and verbs in the infinitive were also significant.

It seems that some grammatical rules pertaining to the application of standard pronouns are still strongly used by speakers (verbs in the infinitive) while others seem to be disappearing (use of endoclitics, use of the enclitic forms -no(s), -na(s)). This suggests that (Brazilian) Portuguese could be facing some changes regarding the norms involving pronouns. Moreover, the analysis of these results provides evidence that the usage of non-standard forms (subject form for the syntactic object) is a very common phenomenon in Brazil and, thus, also a phenomenon of possible linguistic change.

We should reflect on the rules prescribed by normative grammar and the real use of language. It seems unnecessary to impose so many impractical rules or to try to stop changes that are natural and inherent to every language.

This study aimed at analyzing the variation of third person pronouns only, but I would like to leave for future research the fact that other personal pronouns also show variation and might undergo change in the future. As the famous lyrics by the popular Brazilian singer Zeca Pagodinho say: deixa a vida me levar, vida leva eu ‘let the life take me, life take I’, or the notorious sign on TV Globo when there is a soccer match: Galvão, filma nóis! ‘Galvão, film we!’

Figure 8. Sign with sentence showing a non-standard form of first person plural accusative pronoun

REFERENCES

Bell, A. (1984). Language Style as Audience Design. Language in Society, 13 (2), 145-204.

Bachmann, I. (2011). Norm and Variation on Brazilian TV Evening News Programmes: The Case of Third-person Direct-object Anaphoric Reference. Bulletin of Hispanic Studies, 88 (1), 1-20. doi:10.3828/bhs.2010.44

Bagno, M. (1999). Preconceito linguístico: o que é, como se faz. São Paulo: Edições Loyola.

Bagno, M. (2001a). Norma linguística. São Paulo: Loyola.

Bagno, M. (2001b). Português ou Brasileiro? Um convite à pesquisa. São Paulo: Parábola Editorial.

Bagno, M. (2011). Gramática pedagógica do Português Brasileiro. São Paulo: Parábola Editorial.

Barrie, M. (2000). Clitic placement and verb movement in European Portuguese. (M.A. thesis). University of Manitoba.

Bechara, E. (2009). Moderna gramática portuguesa (37th ed). Rio de Janeiro: Nova Fronteira.

Bernardes, M. M. S. (1981). Objeto direto pronominal: Um estudo sociolinguístico. (M.A. thesis). Available from PUC-Rio Dissertations and Theses database.

Bugel, T. (2009). Explicit attitudes in Brazil towards varieties of Portuguese. Studies in Hispanic and Lusophone Linguistics, 2 (2), 278–304.

Carvalho, A. M. (2004). I speak like the guys on TV: Palatalization and the urbanization of Uruguayan Portuguese. Language Variation and Change, 16, 127–151. doi:10.1017/S0954394504162030

Cunha, C. & Cintra, L. (2007). Nova gramática do Português contemporâneo (4th ed.). Rio de Janeiro: Lexikon.

Deumert, A. (2004). Language standardization and language change: The dynamics of Cape Dutch. Amsterdam: John Benjamins.

Ferreira, W. M. A. C. & de Alkmim, M. G. R. (2011, October). A Colocação do Pronome Clítico na Fala do Dialeto . Paper presented at I Congresso Nacional de Estudos Linguísticos, Vitória - ES, Brazil.


Freire, G. C. (2005). A realização do acusativo e do dativo anafórico de terceira pessoa na escrita brasileira e lusitana. (Ph.D Dissertation). Available from UFRJ Dissertations and Theses database.

Galves, C. (2003). Clitic-placement in the and the syntax-phonology interface. State University of Campinas.

Grégoire, S. (2006). Gender and Language Change: The Case of Early Modern Women. Retrieved from http://homes.chass.utoronto.ca/~cpercy/courses/6362-gregoire.htm

Guedelha, C. A. M. (2011). Crenças e atitudes linguísticas: um estudo dialetológico. Retrieved from http://www.ufjf.br/revistagatilho/files/2011/10/guedelha.pdf.

Haspelmath, M. (2002). Understanding morphology. London: Arnold.

Huebner, T. (1999). Sociopolitical perspectives on language policy, politics and praxis. In: T. Huebner & K. A. Davis (Eds), Sociopolitical perspectives on language policy and planning in the USA. Amsterdam: John Benjamins.

Janda, L. A., Nesset, T., Dickey, S., Endresen, A., Makarova, A. & Baayen, R. H. (2013). Making choices in Slavic: Pros and cons of statistical methods for rival forms. Russian Linguistics, 37 (3), 253-291.

Johnston, T. A. (2003). Language Standardization and Signed Language Dictionaries. Sign Language Studies, 3 (4), 431-468. doi:10.1353/sls.2003.0012

Joseph, J. E. (1987). Eloquence and power: the rise of language standards and standard languages. London: Frances Pinter Publishers.

Labov, W. (1990). The intersection of sex and social class in the course of linguistic change. Language Variation and Change 2, 205-254.

La Ferrara, E., Chong, A. & Duryea, S. (2012). Soap operas and fertility: Evidence from Brazil. American Economic Journal: Applied Economics, 4 (4), 1-31.

Lippi-Green, R. (2012). English with an accent: Language, ideology and discrimination in the United States (2nd ed.). London & New York: Routledge.

Massini-Cagliari, G. (2004). Language Policy in Brazil: Monolingualism and linguistic prejudice. Language policy (3), 3-23.

Mateus, M. H. & d’Andrade, E. (2000). The phonology of Portuguese. Oxford: Oxford University Press.

Mattoso Câmara, J. (2004). Ele como um acusativo no português do Brasil. In: C. E. F. Uchôa (Ed.), Dispersos de J. Mattoso Câmara Jr. (pp. 47-53). Rio de Janeiro: Editora Lucerna.

Moralis, E. G. (2002). Dialetos em contato: um estudo sobre atitudes linguísticas. (Ph.D Dissertation). Available from Unicamp Dissertations and Theses database.

Moura, E. S. V. (2012). Normas de colocação dos pronomes clíticos em Português. Retrieved from http://www.gelne.org.br/Site/arquivostrab/1041artigo%20gelne%202012%20corrigido.pdf.

O’Grady, W. & Archibald, J. (Eds.). (2000). Contemporary linguistic analysis: An introduction (4th ed.). Toronto: Addison Wesley Publishing Company.

Omena, N. P. (1978). Pronome pessoal de terceira pessoa: suas formas variantes em função acusativa. (M.A. thesis). Available from PUC-Rio Dissertations and Theses database.

Pagotto, E. G. & Duarte, M. E. L. (2005). Gênero e norma: avós e netos, classes e clíticos no final do século XIX. In: C. R. S. Lopes (Ed.), A Norma Brasileira em construção: Fatos lingüísticos em cartas pessoais do século 19 (1st ed.). Rio de Janeiro: FAPERJ/UFRJ.

Pereira, M. G. D. (1981). A Variação na colocação dos pronomes átonos do Português do Brasil. (M.A. thesis). Available from PUC-Rio Dissertations and Theses database.

Porcello, F. A. C. (2009). Telejornalismo e Poder: A moeda política que regula as relações de troca no Brasil. Revista de Estudos de Comunicação (6), 335-348.

Rizzotto, C. C. (2012). Constituição histórica do poder na mídia no Brasil: o surgimento do quarto poder. Revista de Estudos de Comunicação, 13 (31), 111-120.

Robertson, M. A. (2014). How the Language of Television News Broadcasting is Shaped by Audience Design. Retrieved from http://www.scribd.com/doc/197571845/How-the-Langauge-of-Televesion-News-Broadcasting-is-Shaped-by-Audience-Design-Robertson#scribd

Rocha Lima, C. H. (1998). Gramática normativa da língua portuguesa (35th ed.). Rio de Janeiro: José Olympio.

Saraiva, L. M. S. (2008). A colocação dos pronomes átonos na escrita culta do domínio jornalístico e nos inquéritos do projeto NURC: uma análise contrastiva. (Ph.D Dissertation). Available from UFMG Dissertations and Theses database.

Tagliamonte, S. & Baayen, R. H. (2012). Models, forests, and trees of York English: Was/were variation as a case study for statistical practice. Language Variation and Change, 24, 135–178. doi:10.1017/S0954394512000129

Trudgill, P. (1972). Sex, Covert Prestige and Linguistic Change in the Urban British English of Norwich. Language in Society, 1 (2), 179-195.

Trudgill, P. (1999). Standard English: what it isn’t. In T. Bex & R. J. Watts (Eds.), Standard English: the widening debate. London: Routledge.

Vieira, S. R. (2003). Colocação pronominal nas variedades européia, brasileira e moçambicana: Para definição da natureza do clítico em português. In S. F. Brandão & M. A. C. Mota (Eds), Análise contrastiva de variedades do Português. Rio de Janeiro: In-Fólio.

APPENDIX

> library(party)

> library(rms)

> library(lattice)  # provides dotplot()

For random forest 1:

> forest1 = cforest(kind.of.pronoun ~ category + compound.structure + aux.verb + verb + conjugation + final.segment + stress + tense + mood + affirmative + negative + interrogative + main.clause + animate + human + gender + number + placement + gender.of.speaker + SES + context, data = wholedata)

> forest1.varimp = varimp(forest1, conditional = FALSE)

> dotplot(sort(forest1.varimp))

> trp1 = treeresponse(forest1)

> wholedata$predicted1 = sapply(trp1, FUN=function(v) return(v[2]))

> wholedata$kind.bin1 = (wholedata$kind.of.pronoun == "standard") + 0

> somers2(wholedata$predicted1, wholedata$kind.bin1)

> tree1 = ctree(kind.of.pronoun ~ category + compound.structure + aux.set + verb.set + conjugation + final.segment + stress + tense.set + mood + affirmative + negative + interrogative + main.clause + animate + human + gender + number + placement + gender.of.speaker + SES + context, data = wholedata)

> plot(tree1)

For random forest 2:

> forest2 = cforest(kind.of.pronoun ~ category + compound.structure + aux.verb + conjugation + final.segment + stress + tense + mood + affirmative + negative + interrogative + main.clause + animate + human + gender + number + gender.of.speaker + SES + context, data = enclitic)

> forest2.varimp = varimp(forest2, conditional = FALSE)

> dotplot(sort(forest2.varimp))

> trp2 = treeresponse(forest2)


> enclitic$predicted2 = sapply(trp2, FUN=function(v) return(v[2]))

> enclitic$kind.bin2 = (enclitic$kind.of.pronoun == "standard") + 0

> somers2(enclitic$predicted2, enclitic$kind.bin2)

> tree2 = ctree(kind.of.pronoun ~ category + compound.structure + aux.verb + conjugation + final.segment + stress + tense + mood + affirmative + negative + interrogative + main.clause + animate + human + gender + number + gender.of.speaker + SES + context, data = enclitic)

> plot(tree2)

For random forest 3:

> flavia.forest3= cforest (enclitic ~ compound.structure + aux.verb + verb + conjugation + final.segment + stress + tense + mood + affirmative + negative + interrogative + main.clause + animate + human + number + gender, data=standard)

> flavia.forest3.varimp = varimp(flavia.forest3, conditional = FALSE)

> dotplot(sort(flavia.forest3.varimp))

> flavia.trp3 = treeresponse(flavia.forest3)

> standard$predicted3 = sapply(flavia.trp3, FUN=function(v) return(v[1]))

> standard$kind.bin3 = (standard$placement == "enclitic") + 0

> flavia.tree3 = ctree(enclitic ~ compound.structure + aux.verb + verb + conjugation + final.segment + stress + tense + mood + affirmative + negative + interrogative + main.clause + animate + human + number + gender, data=standard)

> plot(flavia.tree3)
