UNIVERSIDADE ESTADUAL DE CAMPINAS
INSTITUTO DE ESTUDOS DA LINGUAGEM

LETÍCIA SCHIAVON KOLBERG

THE ROLE OF PROSODIC BOUNDARY INFORMATION FOR THE COMPREHENSION OF STRIPPING SENTENCES BY FRENCH AND BRAZILIAN PORTUGUESE-LEARNING CHILDREN

O PAPEL DA INFORMAÇÃO DE FRONTEIRAS PROSÓDICAS PARA A COMPREENSÃO DE SENTENÇAS STRIPPING POR CRIANÇAS ADQUIRINDO O FRANCÊS E O PORTUGUÊS BRASILEIRO

CAMPINAS 2020


Thesis presented to the Institute of Language Studies of the University of Campinas in fulfillment of the requirements for the degree of Doctor in Linguistics.

Tese de Doutorado apresentada ao Instituto de Estudos da Linguagem da Universidade Estadual de Campinas para obtenção do título de Doutora em Linguística

Thesis advisor: Prof. Dr. Maria Bernadete Marques Abaurre
Co-advisor: Prof. Dr. Anne Christophe

This work corresponds to the final version of the thesis defended by the student Letícia Schiavon Kolberg and supervised by Prof. Dr. Maria Bernadete Marques Abaurre.


Cataloging record
Universidade Estadual de Campinas
Biblioteca do Instituto de Estudos da Linguagem
Leandro dos Santos Nascimento - CRB 8/8343

Kolberg, Letícia Schiavon, 1990-
K831r The role of prosodic boundary information for the comprehension of stripping sentences by French- and Brazilian Portuguese-learning children / Letícia Schiavon Kolberg. – Campinas, SP : [s.n.], 2020.

Advisor: Maria Bernadete Marques Abaurre.
Co-advisor: Anne Christophe.
Thesis (doctorate) – Universidade Estadual de Campinas, Instituto de Estudos da Linguagem.

1. Psycholinguistics. 2. Language acquisition. 3. Grammar, comparative and general - Ellipsis. 4. Portuguese language - Prosody. I. Abaurre, Maria Bernadete Marques. II. Christophe, Anne. III. Universidade Estadual de Campinas. Instituto de Estudos da Linguagem. IV. Title.

Information for the Digital Library

Title in another language: O papel da informação de fronteiras prosódicas para a compreensão de sentenças stripping por crianças adquirindo o francês e o português brasileiro
Keywords in English:
Psycholinguistics
Language acquisition
Grammar, Comparative and general - Ellipsis
Portuguese language - Versification
Area of concentration: Linguistics
Degree: Doctor in Linguistics
Examination committee:
Maria Bernadete Marques Abaurre
Ruth Elisabeth Vasconcellos Lopes
Pablo Picasso Feliciano de Faria
Alex Valentin de Carvalho
Maria Cristina Lobo Name
Defense date: 22-06-2020
Graduate program: Linguistics

Student identification and academic information
- Author's ORCID: https://orcid.org/0000-0002-3655-764X
- Author's Currículo Lattes: http://lattes.cnpq.br/9671232342885280

EXAMINATION COMMITTEE:

Maria Bernadete Marques Abaurre

Ruth Elisabeth Vasconcellos Lopes

Pablo Picasso Feliciano de Faria

Alex Valentin de Carvalho

Maria Cristina Lobo Name

IEL/UNICAMP 2020

The defense minutes, signed by the members of the Examination Committee, are on file in SIGA/Sistema de Fluxo de Dissertação/Tese and at the Secretaria de Pós-Graduação of IEL.

ACKNOWLEDGEMENTS

The present thesis is a summary of the work I did during my PhD years. I call it a "summary" because there is much more to tell. All the life and academic experience I had and, most importantly, all the new and strengthened partnerships I made during the last five years could not be contained within these pages. I have several people to thank for it.

I had the honor of being advised by two brilliant professors. My advisor, prof. Maria Bernadete Marques Abaurre, helped me with all steps of the thesis, trusted my workflow, and was always available when I needed advice. My co-advisor, prof. Anne Christophe, guided me through all the experimental work from the beginning, and made me feel at home in France and at LSCP during my sandwich period. Thanks to her, I met amazing people, made awesome friends, participated in interesting and exciting events, and gained a lot of confidence as an academic.

I also owe much to friends, colleagues, professors, and employees from the three institutions that welcomed me on my academic path: UFPR, UNICAMP, and ENS. It was at UFPR that I learned to think as a scientist and got interested in language acquisition. I was guided by prof. Teresa Cristina Wachowicz, an amazing person and scientist, and one of my dearest friends. It was also at UFPR that I met my friends Alex, Denise, Ednei, Kayron, Luana, Thayse, Valdilena, and several others who have always given me emotional and academic support and, most indispensably, a lot of fun.

At UNICAMP, I give special thanks to prof. Ruth Lopes, for her amazing job as my master's advisor, and prof. Thiago Motta Sampaio, who helped me numerous times during my PhD. I also thank the friends I made at UNICAMP for the support, the teamwork, and for giving me several hours of their time to help me with my experiments: Aline, Antônio, Carla, Fernanda, Francisco, Giovanna, Gisele, Harley, Josie, Lara, Rosana, Ruan, Thuany, Williane, and many others.

In France, I thank Alex, Mireille, Naomi and Rachel for their great help with my experiments, and for agreeing to be my co-authors on the thesis' papers. Alex was the one who gave me the original idea for the thesis, and patiently created (and explained to me) all the scripts for the French experiments. I also thank Alice, Ava, Baptiste, Camilla, Camille, Cathal, Cécile, Georgia, Ghislaine, Hualin, Melissa, Monica, and many others, for the great partnership, for the help with my experiments, and for the fun moments in and out of the lab.

I thank the children, parents, and adults who participated in the experiments presented here, both in France and in Brazil; the two amazing daycares from Campinas that welcomed me into their institutions and gave me all the support that I needed; and Anne-Caroline, Clémence, and Isabelle, for all their hard work and great support at the babylab and adult lab at LSCP. Without their participation, this thesis would not have been possible.

I thank the jury members, prof. Alex de Carvalho, prof. Ruth Lopes, prof. Cristina Name, prof. Thiago Motta Sampaio, and prof. Pablo Faria, for their contribution to this thesis.

The present work was carried out with the support of Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) – Financing Code 001 by means of a national scholarship (PROEX - Programa de Excelência Acadêmica; process number 88882.329577/2010-01) and a sandwich scholarship (PDSE - Programa de Doutorado Sanduíche no Exterior; process number 88881.188983/2018-01).

I saved the best for last. I thank my family for always supporting me, no matter how crazy my ideas may sound; my mother, for offering me emotional, financial, and psychological support 24/7; and of course, Fabio, my best friend, life partner, and personal lab assistant, for sharing all the adventures life has provided us with for the last seven years. Thank you for being my safe place.

ABSTRACT

What kind of information can children use to attribute syntactic structure to the sentences they hear? Phrasal prosody has been proposed as a crucial type of cue, since it correlates with syntactic structure and infants are sensitive to it from their first months of life (HIRSH-PASEK ET AL., 1987). However, there are still few studies on young children's ability to use prosodic information to constrain syntactic parsing, some providing positive evidence (e.g. DE CARVALHO ET AL., 2016; DAUTRICHE ET AL., 2014) and others failing to observe this ability in children up to 6 years old (e.g. SNEDEKER & TRUESWELL, 2001; CHOI & MAZUKA, 2003).

In the present thesis, we investigate young children's ability to parse a type of sentence that has not yet been studied for this purpose, namely stripping sentences (a type of Tense Phrase ellipsis) with the adverb aussi ("too") such as Le tigre mange ! Le dinosaure aussi ! ("The tiger eats! The dinosaur too!"). If stripped of the prosodic cues that indicate an intonational and syntactic phrase boundary between the verb and the second noun, these sentences cannot be distinguished from simple transitive sentences such as "The tiger eats the dinosaur too". We conducted three experiments with preferential looking and picture selection tasks. Experiment 1 investigated whether French- and Brazilian Portuguese (BP)-learning children aged 3 to 4 years and 28 months can tell apart stripping from simple transitive sentences when presented with sentences containing familiar verbs in an ambiguous context (two videos playing side by side, one showing a tiger eating a dinosaur, and another showing the tiger and the dinosaur eating a duck, for the example above). The results show that French- and BP-learning 3-4-year-olds interpret stripping sentences differently from simple transitive sentences by relying on their prosodic differences. Experiment 2 investigated French-learning 30-42-month-olds' ability to tell apart these two kinds of sentence with unfamiliar verbs in the absence of (simultaneous) visual context. Children were presented with dialogues containing stripping or simple transitive sentences with a novel verb, and afterwards they saw two possible interpretations for this verb: a video showing a novel causal action (i.e., a girl swinging another girl's leg), and a video showing a novel intransitive action (i.e., a girl spinning her arm). The results show that children failed to exploit prosodic information to correctly interpret stripping sentences. Experiment 3 investigated whether children understand the identity condition behind stripping sentences, that is, whether, when they hear a sentence such as "The tiger eats! The dinosaur too!", children know that the dinosaur is also eating, rather than doing something else. We presented French-learning 3-4-year-olds with stripping sentences along with two videos, one where both characters perform the named action, and another where the first-mentioned character performs the named action but the second-mentioned character performs a novel action. Our hypothesis was that children understand the identity condition behind stripping sentences, and so would look longer towards the video where both characters perform the named action. The preliminary results show only a non-significant trend in the expected direction; we discuss possible methodological and theoretical reasons for this.

Keywords: Psycholinguistics; Prosodic bootstrapping; Syntactic comprehension; Ellipsis; Preferential Looking.

RESUMO

Que tipo de informação as crianças podem usar para atribuir estrutura sintática às sentenças que ouvem? A prosódia frasal tem sido proposta como um tipo de pista crucial, já que ela se correlaciona à estrutura sintática, e, além disso, crianças são sensíveis à informação prosódica desde os primeiros meses de vida (HIRSH-PASEK ET AL., 1987). No entanto, ainda há poucos estudos investigando a habilidade de crianças em usar a informação prosódica para restringir a análise sintática, sendo que alguns fornecem evidência positiva (DE CARVALHO ET AL., 2016; DAUTRICHE ET AL., 2014) e outros não conseguem observar essa habilidade em crianças de até seis anos (SNEDEKER & TRUESWELL, 2001; CHOI & MAZUKA, 2003).

Esta tese tem como objetivo investigar a habilidade de crianças pequenas em analisar um tipo de sentença ainda não estudada para esse propósito: sentenças stripping (um tipo de elipse de Sintagma Flexional) com o advérbio aussi ("também") tal como Le tigre mange ! Le dinosaure aussi ! ("O tigre come! O dinossauro também!"). Se as pistas prosódicas que indicam uma fronteira entoacional e sintática entre o verbo e o segundo nome forem removidas, essas sentenças não podem ser distinguidas de sentenças transitivas como "O tigre come o dinossauro também". Nós realizamos três experimentos de Olhar Preferencial. No Experimento 1, investigamos se crianças francesas e brasileiras de 3 a 4 anos e 28 meses conseguem distinguir sentenças stripping de transitivas simples contendo verbos conhecidos em um contexto ambíguo (e.g. dois vídeos exibidos lado a lado, um mostrando um tigre comendo um dinossauro, e outro mostrando o dinossauro e o tigre comendo um pato). Os resultados mostram que as crianças francesas e brasileiras de 3 a 4 anos interpretam as sentenças stripping diferente das sentenças transitivas simples baseando-se em suas diferenças prosódicas. O Experimento 2 testou a habilidade de crianças francesas de 30 a 42 meses em distinguir essas sentenças com verbos desconhecidos na ausência de contexto visual simultâneo. As crianças ouviam diálogos contendo sentenças stripping ou transitivas com um verbo inventado, e depois viam duas possíveis interpretações para o verbo: um vídeo mostrando uma ação causal (i.e. uma menina balançando a perna de outra menina), e um vídeo mostrando uma ação intransitiva (i.e. uma menina rodando seu próprio braço). Os resultados mostram que as crianças não conseguiram explorar a informação prosódica para interpretar corretamente as sentenças stripping. O Experimento 3 investigou se crianças francesas de 3 a 4 anos entendem a condição de identidade por trás das sentenças stripping, isto é, se quando ouvem uma sentença como "O tigre tá comendo! O dinossauro também!", elas sabem que o dinossauro também está comendo, e não realizando outra ação. Apresentamos as sentenças stripping às crianças juntamente com dois vídeos, um mostrando os dois personagens realizando a ação nomeada, e outro em que o primeiro personagem realiza a ação nomeada, mas o segundo realiza uma ação nova. Nossa hipótese era que as crianças entendem a condição de identidade por trás das sentenças stripping, e então olhariam por mais tempo para o vídeo em que ambos os personagens realizam a mesma ação. Os resultados preliminares mostram apenas uma tendência não-significativa na direção esperada; discutimos possíveis razões metodológicas e teóricas para isso.

Palavras-chave: Psicolinguística; Bootstrapping prosódico; Compreensão sintática; Elipse; Olhar Preferencial.

LIST OF FIGURES

Figure 1 - The prosodic hierarchy according to Nespor & Vogel
Figure 2 - Prosodic bootstrapping model from Christophe et al.
Figure 3 - Model of sentence derivation from Chomsky (1995)
Figure 4 - Example of test videos for the verb "hit" for Experiment 1
Figure 5 - Soundwave and pitch for one stripping sentence (Le tigre tape ! Le canard aussi !, top) and one simple transitive sentence (Le tigre tape le canard aussi !, bottom)
Figure 6 - Structure of a test trial for Experiment 1
Figure 7 - Proportion of looks towards the two-agent action through the whole test trial (12 seconds) in the stripping (green line) and transitive (orange line) condition for the control group of Experiment 1
Figure 8 - Proportion of looks towards the two-agent action through the whole test trial (12 seconds) in the stripping (green line) and transitive (orange line) condition for French-learning 3-4-year-olds (Experiment 1)
Figure 9 - Average proportion of looking time towards the two-agent action in the stripping (green box) and transitive (orange box) condition for French-learning 3-4-year-olds in Experiment 1
Figure 10 - Average proportion of looking time towards the two-agent action for the stripping (green bars) and transitive (orange bars) condition per item for French-learning 3-4-year-olds (Experiment 1)
Figure 11 - Left: Average proportion of pointing towards the two-agent action for French-learning 3-4-year-olds (Experiment 1). Right: Average proportion of pointing towards the two-agent videos for stripping (green bars) and transitive (orange bars) condition per item
Figure 12 - Average proportion of looking time towards the two-agent action in the stripping (green box) and transitive (orange box) condition for 28-month-olds (Experiment 1)
Figure 13 - Proportion of looks towards the two-agent action through the whole test trial (12 seconds) in the stripping (green line) and transitive (orange line) condition for French-learning 28-month-olds
Figure 14 - Average proportion of looking time towards the two-agent action for the stripping (green bars) and transitive (orange bars) condition per item for French-learning 28-month-olds (Experiment 1)
Figure 15 - Soundwave and pitch for a stripping sentence (O tigre tá comendo! O dinossauro também!, top) and a simple transitive sentence (O tigre tá comendo o dinossauro também!, bottom)
Figure 16 - Proportion of looks towards the two-agent action through the whole test trial (12 seconds) in the stripping (green line) and transitive (orange line) condition for BP-learning children
Figure 17 - Average proportion of looking time towards the two-agent action in the stripping (green box) and transitive (orange box) condition for BP-learning children
Figure 18 - Average proportion of looking time towards the two-agent action video for stripping (green bars) and transitive (orange bars) condition per item for the BP-learning children
Figure 19 - Left: Average proportion of pointing towards the two-agent action for BP-learning children. Right: Average proportion of pointing towards the two-agent action for stripping (green bars) and transitive (orange bars) conditions per item
Figure 20 - Example of dialogue of Experiment 2
Figure 21 - Videos of the test phase of Experiment 2
Figure 22 - Soundwave and pitch for one stripping sentence (Le bébé a dasé ! La maman aussi !, top), and one simple transitive sentence (Le bébé a dasé la maman aussi !, bottom) for Experiment 2
Figure 23 - Structure of a test trial for Experiment 2
Figure 24 - Proportion of looks towards the causal action through the whole test trial in the stripping (green line) and transitive (orange line) condition for Experiment 2
Figure 25 - Average proportion of looking time towards the causal action in the stripping (green box) and transitive (orange box) condition for Experiment 2
Figure 26 - Average proportion of pointing towards the causal action for the stripping (green bar) and transitive (orange bar) condition for Experiment 2
Figure 27 - Example of test videos for the verb "hit" from Experiment 3
Figure 28 - Structure of a test trial of Experiment 3
Figure 29 - Proportion of looks towards the same-action video through the whole test trial (14 seconds) in the aussi (orange line) and cali (green line) condition of Experiment 3
Figure 30 - Average proportion of looking time towards the same-action video in the aussi (orange box) and cali (green box) conditions (Experiment 3)
Figure 31 - Average proportion of looking time towards the same-action videos for the cali (green bars) and aussi (orange bars) condition per item
Figure 32 - Average proportion of looking time towards the two-agent action in the stripping (green box) and transitive (orange box) condition for French adults
Figure 33 - Average proportion of pointing towards the two-agent action per condition per verb for the French adults
Figure 34 - "Hit" action of Experiment 1
Figure 35 - "Eat" action of Experiment 1
Figure 36 - "Carry" action of Experiment 1
Figure 37 - "Push" action of Experiment 1
Figure 38 - Second dialogue of Experiment 2
Figure 39 - "Eat" action of Experiment 3
Figure 40 - "Carry" action of Experiment 3
Figure 41 - "Push" action of Experiment 3
Figure 42 - "Hit" action of Experiment 3
Figure 43 - Videos of the fifth test phase of Experiment 3

LIST OF TABLES

Table 1 - Raw values of duration for the acoustic analysis of French test sentences (Experiment 1)
Table 2 - Raw values of F0 for the acoustic analysis of French test sentences (Experiment 1)
Table 3 - Average proportion of looking time towards the two-agent action per item for French-learning 3-4-year-olds in Experiment 1
Table 4 - Average proportion of pointing towards the two-agent action for stripping and transitive conditions per item. French-learning 3-4-year-olds, Experiment 1
Table 5 - Average proportion of looking time towards the two-agent action per item for French-learning 28-month-olds
Table 6 - Raw duration values for the acoustic analysis of BP test sentences
Table 7 - Raw F0 values for the acoustic analysis of BP test sentences
Table 8 - Average proportion of looking time towards the two-agent action for the stripping and transitive condition per item for BP-learning children
Table 9 - Average proportion of pointing towards the two-agent action for the stripping and transitive condition per item for BP-learning children
Table 10 - Raw duration values for the acoustic analysis of test sentences of Experiment 2
Table 11 - Raw F0 values for the acoustic analysis of test sentences of Experiment 2
Table 12 - Average proportion of looking time towards the same-action video for the cali and aussi conditions per item for Experiment 3
Table 13 - Video design of Experiment 1
Table 14 - Randomization of test videos for the Brazilian Experiment
Table 15 - Complete design of Experiment 2
Table 16 - Complete design of Experiment 3

LIST OF ACRONYMS AND ABBREVIATIONS

ANOVA - Analysis of Variance
AP - Adjective Phrase / Accentual Phrase
BP - Brazilian Portuguese
C - Clitic Group
Capes - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
CEP - Comitê de Ética em Pesquisa
CERES - Conseil d'évaluation éthique pour les recherches en santé
CP - Complementizer Phrase
CS - Computational System
DP - Determiner Phrase
EP - European Portuguese
F0 - Fundamental Frequency
FocP - Focus Phrase
FP - Functional Projection
I - Intonational Phrase
iP - Intermediate Phrase
LF - Logical Form
LSCP - Laboratoire de Sciences Cognitives et Psycholinguistique
M - Mean
NO - Null Object
NP - Noun Phrase
OSF - Open Science Framework
PF - Phonological Form
SD - Standard Deviation
SE - Standard Error
Spec - Specifier
TP - Tense Phrase / Transitional Probabilities
U - Phonological Utterance
UNICAMP - Universidade Estadual de Campinas
VP - Verb Phrase
VPE - VP ellipsis
Σ - Foot
σ - Syllable
ΣP - Sigma Phrase
Φ - Phonological Phrase
ω - Phonological Word

CONTENTS

1 INTRODUCTION

2 THEORETICAL FRAMEWORK
2.1 The role of prosody in syntactic acquisition and processing
2.1.1 Prosodic Phonology
2.1.1.1 The phonological phrase
2.1.1.2 The intonational phrase
2.1.1.3 Prosodic constituents and disambiguation
2.1.2 Prosodic bootstrapping theory
2.2 Syntactic representation of stripping sentences
2.2.1 Basic assumptions
2.2.1.1 Ellipsis theory
2.2.2 Analysis of stripping sentences
2.3 Acquisition of ellipsis
2.4 French vs. BP: relevant prosodic and syntactic differences

3 EXPERIMENT 1: STRIPPING VS. SIMPLE TRANSITIVE SENTENCES WITH KNOWN VERBS
3.1 Test sentences
3.2 French-learning children
3.2.1 Method
3.2.1.1 Participants
3.2.1.2 Materials
3.2.1.2.1 Acoustic analyses
3.2.1.3 Procedure
3.2.1.4 Data analysis
3.2.1.5 Experimental hypothesis and predictions
3.2.2 Results
3.2.2.1 Control group
3.2.2.2 Eye-tracking data: 3-4-year-olds
3.2.2.3 Pointing data
3.2.2.4 Eye-tracking data: 28-month-olds
3.2.2.5 Age group comparison
3.2.3 Discussion
3.3 BP-learning children
3.3.1 Method
3.3.1.1 Participants
3.3.1.2 Materials
3.3.1.2.1 Acoustic analyses
3.3.1.3 Procedure
3.3.1.4 Data analysis
3.3.1.5 Experimental hypothesis and predictions
3.3.2 Results
3.3.2.1 Eye-tracking data
3.3.2.2 Pointing data
3.3.3 Discussion
3.4 General discussion: BP vs. French results

4 EXPERIMENT 2: STRIPPING VS. TRANSITIVE SENTENCES WITH A NOVEL VERB
4.1 Method
4.1.1 Participants
4.1.2 Materials
4.1.2.1 Acoustic analysis
4.1.3 Procedure
4.1.4 Data analysis
4.1.5 Experimental hypothesis and predictions
4.2 Results
4.2.1 Eye-tracking data
4.2.2 Pointing data
4.3 Discussion

5 EXPERIMENT 3: COMPREHENSION OF STRIPPING SENTENCES
5.1 Method
5.1.1 Participants
5.1.2 Materials
5.1.3 Procedure
5.1.4 Data analysis
5.1.5 Experimental hypothesis and predictions
5.2 Results
5.3 Discussion

6 SUMMARY AND GENERAL DISCUSSION

REFERENCES

APPENDIX A – Results from adult control group – Experiment 1

APPENDIX B – Detailed stimuli and design of Experiment 1

APPENDIX C – Consent forms and parental questionnaires for the French experiments

APPENDIX D – Consent forms for the Brazilian Portuguese experiment

APPENDIX E – Approval documents for the research in Brazil

APPENDIX F – Complete design of Experiment 2

APPENDIX G – Detailed stimuli and design of Experiment 3


1 INTRODUCTION

How an infant acquires a natural language is one of the most fascinating phenomena of human nature. From an early age, infants are instinctively aware of the special status that the sounds of language have with respect to other sounds, such as music, crying or yawning. Impressively, by the time they are born, they have already figured out the rhythm of their mother's language and are surprised when someone speaks in a language with a different rhythmic pattern (MEHLER ET AL., 1988; RAMUS, 2002). Within a few months, they start noticing other patterns that emerge from language sounds and show more interest in speech that follows the right patterns than in speech that violates them (JUSCZYK ET AL., 1992; HIRSH-PASEK ET AL., 1987). It does not take much longer until they can extract words from the continuous flow of speech (JUSCZYK, 1999), and in little time they start to put together the structure of simple sentences. Before they can tie their own shoes, infants already know a great deal about the sentence structure of their native language, just by listening to the sounds other humans make around them (CHRISTOPHE ET AL., 2008).

What is it that infants find in the speech stream that helps them to grasp the structure of a human language, and to bootstrap their learning of words and sentences? The prosodic regularities found in the speech stream, such as rhythm and duration, seem to be an important source of information, to which children are sensitive from an early age. This hypothesis is supported by several studies. Ramus (2002), for instance, shows that French infants between 2 and 5 days of age distinguish Dutch, a language with stress-timed rhythm, from Japanese, a language with mora¹-timed rhythm, on the basis of their different rhythmic cues. This was shown by presenting the infants with synthesized speech in which other possible cues (such as intonation, differences in fundamental frequency between speakers, and phoneme variability) were masked. Infants habituated to Japanese synthesized sentences show surprise when they start listening to Dutch synthesized sentences, and vice versa. The effect of surprise is even greater when the intonational cues are also present, showing that newborns are also sensitive to the differences in intonational patterns between the two languages. Jusczyk et al. (1992) and Hirsh-Pasek et al. (1987) also show that, by 7 months of age, infants can identify prosodic phrase boundaries, and pay more attention to recorded stories in which pauses are inserted correctly at those boundaries than to stories in which the pauses fall within the phrases. Infants have also been shown to understand, at around 13 months of age, that words cannot straddle prosodic boundaries, and to use this information to recover known words from the speech stream; for instance, when familiarized with a word such as "paper", they pay more attention to an utterance that contains the word, such as "The college with the biggest paper forms is best", than to an utterance where the syllables of the word straddle a prosodic boundary, such as "The butler with the highest pay performs the best" (CHRISTOPHE ET AL., 2003).

¹ A mora is a rhythmic unit that distinguishes long syllables (which usually contain two moras) from short syllables (which usually contain one mora).

These studies show that children are indeed aware of the prosodic regularities of their native language and can use them for speech segmentation. But do they rely on this information for syntactic acquisition? That is a harder question. It is hard to think of an experiment that would show actual syntactic processing in young infants, since they do not yet know many lexical items, which makes it difficult to assess their sentence comprehension. What we can do is investigate how older infants and children, who already know many words and master the basic syntactic structures of their native language, rely on prosodic information for syntactic processing.

One way of doing this is to select sentence pairs that are linearly ambiguous², i.e., sentences that present the same string of words but have different syntactic structures, which are reflected in different prosodic structures. Presenting such sentences to children along with semantically ambiguous contexts (e.g., visual stimuli conveying both possible interpretations of the sentences) allows us to test whether they interpret the structures correctly. This strategy has motivated some studies with children learning French (e.g.
DAUTRICHE ET AL., 2014; DE CARVALHO ET AL., 2014, 2017), English (SNEDEKER & TRUESWELL, 2001; SNEDEKER & YUAN, 2008) and Korean (CHOI & MAZUKA, 2003).

² In this thesis, we will use the terms "ambiguity" and "disambiguation" rather loosely, since they are the terms usually adopted to describe experiments such as the ones mentioned here. However, it is worth mentioning that most of the structures presented in these studies are only ambiguous if one does not consider prosodic structure, and since children are only exposed to one of the structures at a time, their task is not really to disambiguate between two possible structures, but only to correctly interpret the structure they are exposed to.

Dautriche et al. (2014), for instance, presented 28-month-old French-learning children with left-dislocated sentences such as [il a mangé], [le canard] ("[he ate], [the duck]"), with the brackets representing prosodic phrasing, in which the verb manger is moved to a focal position to the left of its subject (i.e., le canard), resulting in the positioning of the pronoun il as the external argument of the verb. The resulting sentence, interpreted as "the duck eats", shows prosodic focus marking of the moved verb and an intonational phrase boundary between the verb and the stranded subject. Without these prosodic cues, this sentence is ambiguous with the simple sentence [il mange le canard], which is interpreted as "he eats the duck". The authors presented children with one of these two types of sentence, along with two simultaneous videos: one with the left-dislocated interpretation of the sentences (e.g., a duck puppet eating bread, for the example above) and another one with the no-dislocation interpretation (e.g., a tiger puppet eating the duck, for the example above). Children exposed to the left-dislocated sentences looked longer towards the video where the mentioned character is the agent than children exposed to the sentences without dislocation did, showing that they relied on prosodic boundary information to interpret the left-dislocated sentences differently from the transitive sentences.

De Carvalho et al. (2017) also presented French-learning 20-month-olds with globally ambiguous sentences in French such as Tu vois le bébé [suʁi], where the homophone [suʁi] could be interpreted as the verb "smile" or the noun "mouse" depending on the prosodic phrasing of the sentences: when there is a phonological phrase boundary between le bébé and the homophone, the homophone has a verb interpretation ("[Tu vois]? [Le bébé] [sourit]!" - "[Do you see]? [The baby] [smiles]!"); in contrast, when there is no phrase boundary between these words, and all the words in the sentence are pronounced in a single prosodic unit, it has a noun interpretation ("[Tu vois le bébé souris]?" - "[Do you see the baby mouse]?").
When presented with two images depicting the two possible interpretations for the string of words (e.g., a baby smiling (verb interpretation) and a little mouse (noun interpretation)) side-by-side on a TV screen, infants who listened to the sentences with noun prosody looked longer towards the noun interpretation image, while infants who listened to the sentences with verb prosody tended to look longer towards the verb interpretation image. The results of these experiments show that by 20 months of age children can already use prosodic boundary information to detect syntactic boundaries and correctly interpret the structure of sentences.

However, some studies have failed to observe an impact of prosody on syntactic disambiguation even in older children (e.g. a second experiment from Dautriche et al. (2014) in French; Snedeker & Trueswell (2001) in English; and Choi & Mazuka (2003) in Korean). In their second experiment, Dautriche et al. (2014) had children listen to dialogues with either transitive or left-dislocated sentences containing the novel verb daser, such as [elle a dasé la maman] ("she dased the mommy"), for the transitive prosody, or [elle a dasé], [la maman] ("she dased, the mommy"), for the left-dislocation prosody. After the dialogues, children saw two possible interpretations for the novel verb: a video depicting a causative action


(e.g., a girl swinging another girl's leg back and forth) and another video depicting an intransitive action (e.g., a girl making circles with her own arm); these two videos were displayed side-by-side on a screen. When asked to "look at the one who dases" (regarde celle qui dase !), the group of children exposed to transitive sentences and the group exposed to left-dislocated sentences both looked longer towards the video depicting the causative action; in contrast, another group of children, who were exposed to simple intransitive sentences such as la maman a dasé, looked significantly less towards this video. This suggests that the French-learning 28-month-olds failed to use the test sentences' prosodic boundary cues to figure out their transitivity, and instead resorted to a DP-counting strategy, which led them to interpret the left-dislocated sentences as transitive sentences (since both types of sentence present two DPs). This was tentatively explained as an effect of the near absence of semantic cues in the experiment, as children were exposed to an unknown verb without a simultaneous visual context, which made the task more challenging for them. As they were confronted with a harder task, children might have resorted to a simpler strategy (i.e. DP-counting) to figure out the meaning of the novel verb.

In a follow-up study, de Carvalho (2017) investigated whether children could change their strategy to figure out the novel verb meaning if the DP-counting strategy were no longer reliable, i.e., if children were presented with the novel verb in one-DP and two-DP sentences at the same time. In the experiment, half of the children listened to dialogues where the novel verb appeared in transitive sentences such as [elle a dasé la maman]! and in sentences with one DP such as [il a dasé]!; and the other half listened to dialogues in which the novel verb appeared in left-dislocated sentences such as [elle a dasé] [la maman]! and in sentences with one DP.
The results showed that children who listened to the transitive + one-DP dialogues³ looked longer towards the causative action than children who listened to the left-dislocated + one-DP dialogues, who performed similarly to children who listened to the one-DP-only dialogues in Dautriche et al. (2014).

A study by Snedeker & Trueswell (2001) also failed to show the use of prosodic boundary cues for sentence processing by 5-year-old English-learning children. In this study, the children and their mothers were invited to play a game; they stayed on opposite sides of an opaque screen, and an experimenter would lay some toys in front of the child, while another experimenter performed an action with a similar set of toys in front of the mother. The mother was

3 Note here that children in the transitive + one-DP condition must have interpreted the one-DP sentences as transitive sentences with a null object. This is controversial, since French is said to be a language that does not allow null object constructions. However, as further discussed in section 2.4, French does seem to present some null object constructions, although those are clearly marked in the language (e.g. CUMMINS & ROBERGE, 2004; GRÜTER, 2009).

then instructed to tell the child to perform the same action with the toys, by using a sentence written on a card that was given to her. Children were exposed to some unambiguous sentences, such as "tap the frog who is carrying a flower", and to some ambiguous sentences, such as "tap the frog with the flower", which had more than one possible interpretation given the available set of toys (e.g., a frog, a flower, and a frog carrying a small flower). Although the mothers' prosodic phrasing correctly cued the right interpretation of sentences in the ambiguous contexts⁴ (e.g. [tap] [the frog with the flower], for the "modifier interpretation", in which the child should tap the frog who was carrying a flower, or [tap the frog] [with the flower], for the "instrument interpretation", in which the child should use the flower to tap the frog), children's interpretation of these sentences was at chance (i.e., roughly half of children's actions corresponded to the sentences heard), whereas for the unambiguous sentences children almost always performed the correct actions.

In a follow-up study, Snedeker & Yuan (2008) controlled for lexical and perseveration biases in the task, by choosing test sentences that were judged "neutral" (i.e. not favoring either of the two possible interpretations) by naïve listeners, and by creating two blocks of trials, each presenting only one kind of sentence prosody (whereas in Snedeker & Trueswell the ambiguous and unambiguous sentences were completely randomized). They also decided to start all test sentences with "you can…" (e.g. "you can tap the frog with the flower"), as they realized that the sentences of the first study (e.g. "tap the frog with the flower") did not sound natural when using the modifier prosody. Their results showed that children succeeded in interpreting the sentences as expected, but only when presented with a single kind of sentence prosody.
Finally, Choi & Mazuka (2003) also failed to show Korean-learning children's use of prosodic boundary cues for syntactic disambiguation. The authors presented 5-6-year-olds with sentences such as [kiɾin] [kwaja məgəyo] ([giraffe] [cookies eat] - [(A) giraffe] [eats cookies]), which, without the prosodic boundary information, is not distinguishable from the sentence Ø [kiɾin kwaja məgəyo] ([giraffe cookies eat] - (somebody) [eats giraffe-shaped cookies]), with the brackets representing the phonological phrase boundaries. Children were presented with one of the sentences above and were asked to choose, between two images (i.e., an image of a giraffe eating cookies vs. another image of a boy eating giraffe-shaped cookies), the one that better represented the sentence heard. The results show that Korean children were not able to use the prosodic boundary information to correctly interpret the sentences, and presented a

4 A similar experiment was conducted with adults, and these succeeded in disambiguating the sentences and performing the correct actions (SNEDEKER & TRUESWELL, 2003).

similar pointing pattern for both types of sentence prosody, with roughly half of the children pointing towards the image of the giraffe eating cookies.

How can we explain the differences in performance between the studies presented above? One of the most salient differences between them is the type of sentence tested. The French experiments used sentences in which the default prosodic structure directly reflected syntactic structure. For instance, for Dautriche et al., the movement of the VP in the left-dislocated sentences was signaled by an intonational phrase boundary between the verb and the second DP (e.g. [il a mangé], [le canard]), which could not be present if this DP was the internal argument of the verb, as in the transitive sentences (e.g. [il a mangé le canard]). For de Carvalho et al.'s (2017) sentences, when the prosodic boundary falls before the target word (e.g. [Tu vois]? [Le bébé] [sourit]!), it can only be interpreted as a verb; when the prosodic boundary falls after it (e.g. [Tu vois le bébé souris]?), the word is interpreted as a noun.

However, Choi & Mazuka's sentences could be more easily disambiguated through the placing of optional case markers. Snedeker & Trueswell's sentences had the same basic prosodic phrasing for both interpretations (e.g. [tap] [the frog] [with the flower], with three prosodic units), and mature speakers who are aware of the ambiguity can intentionally disambiguate them by exaggerating the relevant prosodic breaks. In a natural context, the speaker could also choose to rephrase the sentence for disambiguation (as in "tap the frog using the flower"). So children's difficulty in disambiguating Snedeker & Trueswell's and Choi & Mazuka's sentences might arise because the prosodic breaks are not part of the normal prosodic structure of the sentences, and/or because the sentences could be disambiguated by other, more effective means.
In order to investigate whether English-speaking children could perform better with a different type of ambiguity, de Carvalho et al. (2016) presented English-learning preschoolers (4-5-year-olds) with sentences containing noun-verb homophones such as "the baby flies hide in the shadows" versus "the baby flies his kite", where the word "flies" is interpreted as a noun in the first sentence, and as a verb in the second. The words following this homophone were masked, so that the only way children could differentiate between these sentences was through their prosodic phrasing, e.g., [the baby flies], for the first type of sentence, and [the baby] [flies…], for the second type. In an oral completion task, children successfully exploited the sentence's prosodic structure to assign the appropriate syntactic category to the target word. Children who listened to the sentences with noun prosody interpreted the ambiguous word (e.g. "flies") as a noun, while children who listened to the sentences with verb prosody interpreted the ambiguous word as a verb. This result shows that English-learning children, like French-learning children in de Carvalho et al. (2017), can rely on the prosodic phrasing of sentences to disambiguate between the two possible meanings of a noun-verb homophone.

Another important aspect that distinguishes the studies reviewed above is their methodology. The Preferential Looking method seems to be better for analyzing children's interpretation of sentences, as it can capture their online comprehension through their visual attention and does not require an active response. However, de Carvalho et al. (2016) showed that oral completion tasks can also capture children's interpretation very well, if conducted with the correct age range. Furthermore, as we have seen above, the choice of lexical items, as well as the experimental design (whether within or between subjects, for instance), can have an impact on children's performance and even bias their answers. In Snedeker & Trueswell (2001), the choice of lexical items that could bias the interpretation of the test sentences was said to have a greater influence on children's performance than prosodic phrasing. The choice of a within-subjects design also led children to perseverate in one or another answer for the ambiguous sentences. These issues were solved in Snedeker & Yuan's (2008) follow-up study, which returned a different result, with children succeeding in using the prosodic boundary information to correctly interpret the test sentences when exposed to only one type of prosodic structure. Finally, in Choi & Mazuka's study, which also used a within-subjects design, we can point out that "giraffe cookie" could be interpreted in different ways (e.g., a giraffe-shaped cookie or a cookie meant for giraffes), which could also have influenced children's performance on the disambiguation of the sentences.

In sum, a variety of factors could influence children's performance on sentence parsing through prosodic boundary information, such as the type of structure being tested and its relation to prosodic structure, and the chosen methodology, including the choice of lexical items and visual stimuli. One also cannot rule out the possibility that differences in prosodic and syntactic structure across languages have an impact on children's performance, a question that, except for de Carvalho et al. (2016), has not yet been addressed. A good way of contributing to the investigation of children's use of prosodic boundary information for syntactic parsing is to study other types of structures, to see how children's performance varies according to the type of structure tested. The present thesis starts with two main goals: 1) test a new syntactic structure in French; and 2) test the same structure in Brazilian Portuguese (BP), which is a language that has not been studied before for this type of task with

young children⁵, and see if BP-learning children can also use prosodic boundary cues to correctly interpret syntactic structure.

The chosen structure, illustrated in (1a), stems from an ellipsis process found in both French and BP, named stripping (ROSS, 1969). If we remove the cues for its intonational and phonological phrase boundaries, represented below by the brackets, there is an ambiguity between this sentence and the simple transitive sentence in (1b):

1) a) [Le tigre mange] [Le dinosaure aussi] / [O tigre tá comendo] [O dinossauro também]
      [The tiger eats] [The dinosaur too]
   b) [Le tigre] [mange le dinosaure aussi] / [O tigre] [tá comendo o dinossauro também]
      [The tiger] [eats the dinosaur too]

The sentence (1a) is interpreted as a coordinated sentence where the first and the second-mentioned participants are subjects of equivalent "eat" actions, while (1b) is interpreted as a transitive⁶ action where the first-mentioned participant is the subject and the second-mentioned participant is the patient of the action. Notice that the linear ambiguity is only present when the coordination is not overt, i.e., when there is no realized coordination operator in the sentence (unlike in "The tiger eats and the duck too"). Although the differences between the structures in (1), when adequately cued by prosodic structure, are pretty clear for a mature speaker, this might not be the case for young children; as we have seen, sometimes children fail to use the prosodic boundary information for syntactic processing (CHOI & MAZUKA, 2003; SNEDEKER & TRUESWELL, 2001; DAUTRICHE ET AL., 2014).

In order to investigate young children's ability to tell apart the sentences above through prosodic boundary information, we created two experiments, following the methodologies adopted in Dautriche et al. (2014) and de Carvalho (2017). For Experiment 1, we presented BP and French-learning 3-4-year-olds and French-learning 28-month-olds with sentences such as in (1a) or (1b), along with two videos that depicted the interpretations of each type of sentence;

5 Although there are a few studies on the processing of prosodic cues by young children and adults (e.g. COSTA, 2015; SILVA, 2014; SILVA & NAME, 2014; SILVA, 2009), as far as we know, there is still no work which deals with the use of prosodic cues for syntactic disambiguation by young children learning BP. But see Souza (2016) for an experiment with adults.

6 Although I will refer to the sentences with the structure in (1b) as transitive sentences, it is important to state that, for three of the four verbs used in the present experiment (i.e., "push", "hit" and "carry"), the stripping sentences are also interpreted as transitive, as they involve the coordination of two transitive sentences. However, the verb "eat", when inside a sentence with no object, is more easily interpreted as unergative in both French and BP. See sections 2.4 and 3.1 for more details on the structure of the test sentences and their implications for the analysis of the results.

for the verb manger, for instance, one video showed the tiger eating the dinosaur (which is congruent with the simple transitive sentence), and the other video showed the tiger and the dinosaur eating a duck (which is congruent with the stripping sentence). We then measured children's looking pattern, as well as their pointing responses to the request "show me which video the lady was talking about" (montre-moi de quelle vidéo la dame parle/ me mostra de que vídeo a moça tava falando). Children watched a total of four test trials, with four different actions: eat, push, carry and hit⁷. Our working hypothesis was that French and BP-learning children who heard stripping sentences would interpret the test sentences differently from children who heard simple transitive sentences, suggesting that they pay attention to the prosodic boundary cues that differentiate between the two types of sentence. However, if children do not show any looking or pointing difference between conditions, this would suggest that it is harder for them to interpret stripping sentences through their prosodic boundary information, when compared to Dautriche et al.'s (2014) left-dislocated sentences.

There are a few reasons why children might perform differently in our experiment than in Dautriche et al.'s. First, in the stripping sentences there is a non-pronounced element that needs to be recovered by the listener. This might pose a greater challenge for young children than the left-dislocated sentences. However, previous studies (e.g. SANTOS, 2006, 2009; LOPES, 2009; LOPES & SANTOS, 2014; POSTMAN ET AL., 1997) show that children as young as 17 months of age already produce and understand ellipsis sentences. So, a priori, there is no reason to believe that they would have more trouble analyzing this type of sentence than analyzing sentences with leftward dislocation.

Another possible reason for a difference in children's performance between the two experiments comes from input frequency.
Given the fact that children learn a lot about their native language with just a little input (e.g. GUASTI, 2009), it is hard to tell how much they need to hear a structure in order to acquire it, and where (if anywhere) to draw the boundary between enough and too little exposure. Dautriche et al. conducted a search for DP dislocation in the speech corpora of two French children from 1 to 4 years of age⁸ and found that these structures composed 5% of all multiple-word utterances that were said to or around children⁹. We searched for ellipsis sentences with aussi in the same corpora¹⁰ and found that these compose

7 The chosen verbs were the same used by Dautriche et al. (2014); for more details, see sections 2.4 and 3.1.

8 Tim and Marie from the Lyon corpus in CHILDES (https://phonbank.talkbank.org/). Total number of sentences: 33,491. The authors excluded sentences without verbs from the analysis.

9 Perhaps because these are topicalized sentences, and topicalization is very frequent in spoken French.

10 Total number of sentences: 46,063. This number is higher than the one found in Dautriche et al. because we included the sentences without a pronounced verb in the analysis.

only 0.14% of all multiple-word utterances that were said to or around children, whereas sentences with aussi in general composed 1.74% of these utterances. Although ellipses with aussi cannot account for all ellipsis sentences which can occur in French, these numbers suggest that ellipses are not as common in French-learning children's input as sentences with DP dislocation.

A methodological reason why children might perform differently in the present experiment is that our test videos had more participants than the videos created in Dautriche et al., since the stripping actions involve two agents and a patient, and the transitive actions involve one agent and two patients¹¹. So, one possible outcome for children in our study is that they do not use the prosodic boundary information and resort to the argument-counting strategy to figure out the structure of the sentences, either because it is harder for them to parse stripping sentences (due to their lower frequency), or because the extra participants in the test videos confuse them. If this happens, we expect children in the stripping condition (i.e., who listen to the stripping sentences) to present the same looking pattern as children in the simple transitive condition.

Finally, we might also expect a difference in performance between the two language groups, since each language has different syntactic and prosodic characteristics that could influence children's performance. We will talk more about these differences in section 2.4.

Experiment 2 aimed to investigate if French-learning children can use the syntactic context of stripping sentences to learn a novel verb. We presented 30-42-month-olds with dialogues containing the novel verb daser either in stripping constructions, such as [La maman a dasé]! [Le bébé aussi]! ("The mommy dased! The baby too!"), or in simple transitive constructions, such as [La maman a dasé le bébé aussi]! ("The mommy dased the baby too!"). Afterwards, we presented them with two simultaneous videos, one depicting a two-participant causal action, and another one depicting a one-participant intransitive action, while asking them to "look at the one who dases". Here, our working hypothesis was that children who listened to the novel verb in simple transitive sentences would infer that it describes a causative action, and children who listened to the novel verb in stripping sentences would infer that it describes an intransitive action. As a result, children in the simple transitive condition should look longer towards the causal action than children in the stripping condition, if they parse the sentences correctly and make the inference that a novel verb used in a transitive construction is more likely to refer to

11 See section 3.2.1.2.

a causative action. If this happens, then we will conclude that French-learning children interpret stripping and transitive sentences differently by paying attention to their prosodic phrasing differences, even in the absence of a simultaneous semantic context (since they are exposed to sentences with an unknown verb and do not see any context to help with the interpretation of the sentences until after they are presented).

Another question that came to us when analyzing the results of the first experiment is what children really understand when they listen to stripping sentences. As we will see in the next sections, stripping involves an ellipsis process that we take to be characterized as the deletion of part of the syntactic structure before its utterance, in the Phonological Form (PF)¹² (e.g. LASNIK & FUNAKOSHI, 2018; ROSS, 1969). The deleted elements are recovered by the mature speaker through the knowledge of an identity condition with preceding elements in the linguistic context, which, in the case of the sentences we chose, is signaled by the presence of the adverb "too" (aussi/também). In order to investigate if children understand the identity conditions behind stripping sentences, in a third experiment we presented them with sentences such as (1a) accompanied by two simultaneous videos: one video where the two mentioned participants were performing the named action, and another one where the first participant was performing the named action, and the second one was performing a different action (which consisted of opening and closing both arms and legs at the same time, repeatedly). As a control group, another group of children listened to sentences where aussi was replaced by the novel word cali, as in "Le tigre tape ! Le canard cali !".
If children understand that the stripping sentences convey a Tense Phrase ellipsis in which the deleted element shares an identity relation with the previous linguistic context, the ones in the test condition should look longer towards the video where both participants perform the named action than the ones who were assigned to the control condition. If children do not understand the identity condition behind stripping sentences, we predict two possible outcomes: 1) they would prefer to look towards the video where both participants perform different actions, either because they think aussi is a novel verb¹³ or because this video has more movement and therefore is more interesting; or 2) they would be confused and not show any preference for either video¹⁴.

12 See section 2.3 for a more detailed explanation of our view on the ellipsis phenomenon.

13 This might seem like a long shot, since 3-4-year-olds probably already know that aussi is an adverb. However, this word does resemble a verb in French (e.g., oscille (/ɔsij/, to oscillate) is pronounced similarly to aussi (/osi/)). Furthermore, the suffix -i is a productive verb inflection for third person in French. For instance, partit (to leave), sortit (to go out), parie (to bet), crie (to scream), confie (to trust) are all verbs that are pronounced with a final -i. So if children do not understand the role of aussi in the sentences, and they realize that the second DP is an agent, they might hypothesize that aussi can also describe a novel verb, since it is in a linear position that could be occupied by the verb in the sentence.

14 For more details, see section 5.1.5.


This thesis is organized as follows. In the next chapter, I present the theoretical framework assumed for the experiments. First, I present some studies on the role of prosody in syntactic acquisition and processing. These studies guide our methodology for the first two experiments. Then, I present the syntactic structure of the sentences we chose to investigate, starting with some basic assumptions about the representation adopted and a brief review of ellipsis theory within generative syntax. Although the experiments proposed here do not aim to confirm or disprove the presented syntactic theory, it is important to state the conception we adopt for these sentences. Finally, I briefly review studies on the acquisition of ellipsis structures and conclude the chapter with a discussion of some important aspects of French and BP syntactic and prosodic structures, which could predict differences in the performance of children learning each language. In chapter 3, I describe the methodology and results of Experiment 1 in French and BP. In chapter 4, I describe Experiment 2, and in chapter 5, I describe Experiment 3. Finally, the sixth chapter contains the discussions and conclusions, as well as some indications for future work.


2 THEORETICAL FRAMEWORK

2.1 THE ROLE OF PROSODY IN SYNTACTIC ACQUISITION AND PROCESSING

In this section, I present the assumptions and hypotheses about the relationship between prosodic structure and syntactic acquisition and processing that we adopt here. We start by reviewing some chapters of Nespor & Vogel's seminal work Prosodic Phonology (1986; 2007), which describes an elegant and empirically based theory of prosodic hierarchy, extremely relevant for studies discussing prosody and its interfaces with other linguistic and non-linguistic phenomena, including the ones described in this thesis. Afterwards, I present the prosodic bootstrapping theory, also briefly described in Nespor & Vogel, which states that young children can use prosodic information for assessment of syntactic structure, speech segmentation and lexical acquisition. Finally, the last subsection focuses on syntactic disambiguation through prosodic boundary cues, which is the subject of Experiments 1 and 2.

2.1.1 Prosodic Phonology

When we are exposed to a new language, one of the first things we access from the speech flow is the structured phonological patterns that emerge from it. This includes syllables, stress, duration, intonation and pauses. Nespor & Vogel's (1986; 2007) work aims to describe these patterns, showing how they can be translated into discrete components that are hierarchically organized. One of the main assumptions they make is that prosodic structure is governed by a set of rules that are particular to prosodic phonology, and as such, are somewhat independent of other components of grammar. However, although they state that prosodic structure does not directly depend on non-phonological factors, they also state the implausibility of a totally autonomous phonological component, and show how it interacts with morphology, syntax, semantics and non-linguistic factors such as speech style.

The authors describe each of the components of the prosodic hierarchy, starting from the smallest ones, the syllable and the foot, up to the phonological utterance. They state that each component obeys its own set of rules, and that the higher a component is in the prosodic hierarchy, the more general are the non-phonological notions its rules refer to. For instance, the mapping rules that apply to the phonological word refer to specific notions such as the position of affixes,

while the mapping rules of phonological phrases make reference to general syntactic notions such as the head of a phrase or the direction of embedding in a particular language.

Figure 1 - The prosodic hierarchy according to Nespor & Vogel (from Domahs et al., 2015, p. 138)

This model, although still relevant and widely adopted to this date, has drawn criticism, and some changes to its specific rules and levels have been suggested by several authors (e.g. LADD, 2008; SELKIRK, 1996; PEPERKAMP, 1997; BOOIJ, 1996). One important suggestion is the elimination of the C level, for various reasons, such as its unclear position in the hierarchy (e.g. PEPERKAMP, 1997; SELKIRK, 1996), asymmetries in the prosodic relations of enclitics and proclitics with their hosts (e.g. BOOIJ, 1996, in Dutch; PEPERKAMP, 1997, in Italian; and VIGÁRIO, 2003, in European Portuguese (EP)), and so on¹⁵. However, some studies claim that there must be a level between ω and ϕ, which is supported by empirical evidence from several languages (e.g. VIGÁRIO, 2003, in EP; HUALDE, 2007, in Castilian; KENSTOWICZ, 1994, in English; BOOIJ, 1995, in Dutch; and KABAK & VOGEL, 2001, in Turkish). In view of this, Vigário (2007) proposes that the C level must be maintained in the hierarchy, but that it should be relabeled as Prosodic Word Group, due to the fact that it does not seem to group clitics together, but rather prosodic words.

15 For more details, see Vigário (2007).


Since most discussions around Nespor & Vogel's work do not disprove their view of a hierarchic, non-recursive, independent prosodic structure¹⁶, but rather suggest modifications to it, we will adopt the hierarchy shown in Figure 1, while acknowledging that it might not be the one which best describes the current state of the theory. This is because we are mainly interested in the three highest levels: the phonological phrase (ϕ), the intonational phrase (I) and the phonological utterance (U), which are the most relevant levels for syntactic parsing. Even though prosodic structure does not always correspond to syntactic structure, the fact that some phonological rules at these levels refer to syntactic constituents means that prosody may provide cues for speech processing and disambiguation. We will focus our review on the description of the phonological phrase and the intonational phrase, which are the most relevant components for sentence disambiguation.

2.1.1.1 The phonological phrase

The phonological phrase is the constituent just above the clitic group (C), and as such, it groups together one or more Cs. Its internal rules refer to syntactic heads, being general enough to be considered universal to all languages, except for its restructuring rules, which, as we will see, may be present, absent or even optional in different languages. The basic rules of phonological phrase formation are described below:

Phonological Phrase Formation

I. ϕ domain
The domain of ϕ consists of a C which contains a lexical head (X) and all Cs on its nonrecursive side up to the C that contains another head outside the maximal projection of X.

II. ϕ construction
Join into an n-ary branching ϕ all Cs included in a string delimited by the definition of the domain of ϕ.

III. ϕ relative prominence
In languages whose syntactic trees are right branching, the rightmost node of ϕ is labeled s; in languages whose syntactic trees are left branching, the leftmost node of ϕ is labeled s. All sister nodes of s are labeled w.

(NESPOR & VOGEL, 2007, p. 168)

16 But see, for instance, Selkirk (1996) for a reinterpretation of this strict hierarchical hypothesis as a set of violable constraints, in the light of Optimality Theory (PRINCE & SMOLENSKY, 1993).


The rules stated above are largely based on syntactic structure, without, however, referring to any specific syntactic constituent. The first rule states that phonological phrases are delimited by lexical heads (i.e., V, N or A) and the direction of recursion of a given language. Right branching languages (i.e., languages whose recursion goes to the right) have ϕ domains that expand to the left (the non-recursive side) of the lexical head, and vice versa. The second rule refers to the basic tree structure of the prosodic theory assumed by the authors, which states that prosodic trees are n-ary (and not binary, like most syntactic trees). The third and last rule describes the relative prominence of Cs inside a ϕ, which is again defined by the direction of recursion in a given language: right branching languages assign relative prominence to the rightmost node, and left branching languages assign it to the leftmost node. For a sentence such as (2), we would have the following ϕ formation:

2) [NP [N Cindy]] [VP has [V bought] [DP a [NP [N car]]]].

[[Cindy]Cs]ϕ [[has]Cw [bought]Cs]ϕ [[a]Cw [car]Cs]ϕ

Since English is a right branching language, the domain of ϕ expands to the left. The first domain created is the one that has the N "car" as its head, which expands until just before the next lexical head in the sentence (the V "bought"), and so forth. The w (weak) and s (strong) indexes indicate the relative prominence of each clitic group inside ϕs, and, as we can see, the rightmost Cs are marked with s, whereas the others are marked with w.

Apart from these general rules of ϕ formation, this domain also presents restructuring rules that change the scope of basic phonological phrases. One of those rules allows a non-branching ϕ2 to join a ϕ1, as long as ϕ2 is the first complement of the head contained in ϕ1. The rule takes the form (3a), and an example is found in (3b), in Italian:

3) a) […Cw Cs]ϕ1 [C]ϕ2 → [... Cw Cw Cs]ϕ

b) [[se]Cw [prenderá]Cs]ϕ [[qualcosa]C]ϕ [[prenderá]C]ϕ [[tordi]C]ϕ
→ [[se]Cw [prenderá]Cw [qualcosa]Cs]ϕ [[prenderá]Cw [tordi]Cs]ϕ
If he catches anything, he will catch thrushes.
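The basic ϕ formation rules and the restructuring rule in (3a) can be illustrated with a minimal sketch. It rests on simplifying assumptions that are not part of Nespor & Vogel's proposal: every word is treated as its own clitic group, the language is right branching, lexical heads are given as a hand-made set, and the "first complement" relation is supplied by hand rather than read off a syntactic tree. The function names are hypothetical.

```python
def form_phis(words, lexical_heads):
    """Group words (each treated as one C) into phonological phrases.

    Assumption: right-branching language, so each phi collects the material
    to the left of a lexical head and is closed by that head.
    """
    phis, current = [], []
    for word in words:
        current.append(word)
        if word in lexical_heads:   # a lexical head closes the phrase
            phis.append(current)
            current = []
    if current:                     # leftover non-heads join the last phrase
        if phis:
            phis[-1].extend(current)
        else:
            phis.append(current)
    return phis


def label_prominence(phi):
    """Right-branching prominence: rightmost C is s, all its sisters are w."""
    return [(c, "s" if i == len(phi) - 1 else "w") for i, c in enumerate(phi)]


def restructure(phis, first_complement):
    """Join a non-branching phi to the phi before it, as in rule (3a).

    first_complement: indices of phis that are the first complement of the
    head contained in the immediately preceding phi (annotated by hand).
    """
    result = []
    for i, phi in enumerate(phis):
        if result and len(phi) == 1 and i in first_complement:
            result[-1] = result[-1] + phi
        else:
            result.append(list(phi))
    return result


# Example (2): heads are the N "Cindy", the V "bought" and the N "car".
english = form_phis(["Cindy", "has", "bought", "a", "car"],
                    {"Cindy", "bought", "car"})
print([label_prominence(p) for p in english])

# Example (3b): "qualcosa" and "tordi" are first complements of "prenderá".
italian = [["se", "prenderá"], ["qualcosa"], ["prenderá"], ["tordi"]]
print(restructure(italian, first_complement={1, 3}))
```

The sketch applies restructuring whenever its structural condition is met; it does not model the optional character of the rule or the nonlinguistic factors that influence it.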


This rule, however, is not applicable to any two ϕs in a complement-head relation with each other. While some languages do not allow for such an operation at all, in the ones that do allow it there are nonlinguistic factors that can interfere in its application. For instance, fast speech tends to join more ϕs than slow speech, and larger ϕ domains are less likely to be joined with another ϕ, probably for physiological reasons (i.e. there is a limit to the number of words a person can say without a prosodic break).

In sum, the phonological phrase groups Cs together and has its limits defined by the general syntactic constraints of phrasal head, syntactic phrase and direction of embedding. However, one cannot say that it mimics syntactic structure, since there are no rules that delimit the specific type of constituents that may precede or follow the head of a phrase, but only general structural relations. Furthermore, there are also nonlinguistic factors that influence the size of phonological phrases in restructuring rules.

2.1.1.2 The intonational phrase

The intonational phrase (I) groups together one or more ϕs, and, like the latter, it is formed by rules that refer to general syntactic information. These rules are based on the notions that I is the domain of intonational contours, and that the ends of intonational phrases coincide with the places where there can be pauses in a sentence. The first rule describes constructions that obligatorily form an I, such as parenthetical expressions (as in (4a) below), interjections (as in (4b)) or tag questions (as in (4c)):

4) a) Lions [as you know]I are dangerous

b) Good heavens [There's a bear in the backyard]I

c) That's Theodore's cat [isn't it?]I (NESPOR & VOGEL, 2007, p. 188)

The categories above have in common the fact that they are all external to the adjacent root sentences. This means that strings that are not structurally attached to the sentence tree at the level of surface structure constitute a domain of I. The second domain of I refers to the boundaries of root sentences. See, for instance, (5) below:


5) a) [Billy thought his father was a merchant]I [and his father was a secret agent]I

b) [Billy thought his father was a merchant and his mother was a secret agent]I (DOWNING 1970, apud NESPOR & VOGEL, 2007, p. 189)

In (5a) above, "and" is coordinating two root sentences (the one which states what Billy thought and the one which states the truth about his father), and so we have two Is. In (5b), the conjunction coordinates two clauses that are part of the same root sentence (they both state what Billy thought) and so form a single I.

Intonational Phrase Formation

I. I domain
An I domain may consist of
a. all the ϕs in a string that is not structurally attached to the sentence tree at the level of s-structure, or
b. any remaining sequence of adjacent ϕs in a root sentence.

II. I construction
Join into an n-ary branching I all ϕs included in a string delimited by the definition of the domain of I.

(NESPOR & VOGEL, 2007, p. 189)

Although the I formation rules refer to syntactic structure, there is no complete correspondence between syntactic and I structure, as the strings that obligatorily form an I can sometimes "split" sentences into Is that are not isomorphic with any syntactic constituent:

6) [They have]I [as you know]I [been living together for years]I (NESPOR & VOGEL, 2007, p. 190)

Furthermore, there are many I restructuring rules, and these are determined mostly by nonlinguistic factors such as length, speech style, rate and contrastive prominence. When confronted with long root sentences with no intervening obligatory I, a restructuring rule may apply to break down I into smaller phrases:

7) a) [Anne's former PhD student spent all her summers doing research]I

b) [Anne's former PhD student]I [spent all her summers doing research]I

c) [Anne's former PhD student]I [spent all her summers]I [doing research]I


What determines the necessary length for the application of this restructuring rule is, however, unclear to the authors; they simply state that it may be due to physiological reasons related to breath capacity, or to the optimal chunks for linguistic processing.

As for relative prominence, unlike in all the other constituents of the prosodic hierarchy, its attribution within I does not obey a fixed rule, but rather seems to depend mostly on semantic factors such as focus or given vs. new information. The sentence below, for instance, could be represented in three different ways, depending on which information the speaker wishes to focalize:

8) a) [[My brother]ϕs [adopted]ϕw [a cat]ϕw]I

b) [[My brother]ϕw [adopted]ϕs [a cat]ϕw]I

c) [[My brother]ϕw [adopted]ϕw [a cat]ϕs]I

In sum, the intonational phrase is the constituent that groups one or more phonological phrases. Its general rules, which are said to be universal to all languages, state root sentences as the domain for I formation, as well as elements that are not structurally attached to the main sentence's tree at s-structure. There seems to be a great degree of flexibility to the I domain, since its restructuring rules may divide Is in several different ways for several different reasons. But these rules are also constrained; for instance, the Strict Layer Hypothesis, which states that the dominant prosodic category must contain only elements of the category immediately below, only allows restructuring of I boundaries to occur at the juncture between two ϕs (and never within a ϕ). Restructuring also tends to only happen at the end of noun phrases; a sentence such as (7), for instance, could only undergo the restructurings seen above, and an intonational phrase boundary could never be placed after the verb, as in (9):

9) a) *[Anne's former PhD student spent]I [all her summers]I [doing research]I

b) *[Anne's former PhD student]I [spent]I [all her summers]I [doing research]I

This means that intonational phrase boundaries that are placed after the verb always signal the end of a root sentence, as in (10a) below, or a constituent movement, as in (10b):

10) a) [The tiger is eating]I [The duck too]I = The tiger and the duck are eating

b) [Il mange]I [Le canard]I = The duck is eating


[He eats]I [The duck]I (DAUTRICHE ET AL., 2014, p. 08)

The observation above suggests that, although there is no complete correspondence between syntactic and I structure, the restrictions on I formation can be important cues for syntactic processing and disambiguation; for the sentences in (10), the intonational phrase boundaries after the verbs are a clear cue to their syntactic relations with the other parts of the utterance. We talk more about the role of prosodic phrasing for sentence processing and language acquisition in the following sections.
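The two constraints on I restructuring discussed in this section can be sketched as a simple check on candidate phrasings. This is only an illustration under stated assumptions: the division of (7) into ϕs and the category label attached to each ϕ are hand annotations made for this example, and the function name is hypothetical. Because candidate boundaries are expressed as positions between ϕs, the requirement that an I boundary never falls inside a ϕ is built into the representation; the check then rejects boundaries that do not follow a noun phrase.

```python
def valid_i_phrasing(phis, breaks):
    """Check a candidate division of a root sentence into intonational phrases.

    phis:   list of (text, category) pairs, one per phonological phrase.
    breaks: set of indices i, meaning "an I boundary follows phis[i]".
            The boundary after the last phi is always there and is not listed.
    """
    for i in breaks:
        if not 0 <= i < len(phis) - 1:
            return False    # not a juncture between two phis
        if phis[i][1] != "NP":
            return False    # e.g. a boundary placed right after the verb
    return True


# Hand-annotated phis for example (7); the categories are assumptions.
sentence = [
    ("Anne's former PhD student", "NP"),
    ("spent", "V"),
    ("all her summers", "NP"),
    ("doing research", "VP"),
]

print(valid_i_phrasing(sentence, set()))       # phrasing (7a)
print(valid_i_phrasing(sentence, {0}))         # phrasing (7b)
print(valid_i_phrasing(sentence, {0, 2}))      # phrasing (7c)
print(valid_i_phrasing(sentence, {1, 2}))      # phrasing (9a)
print(valid_i_phrasing(sentence, {0, 1, 2}))   # phrasing (9b)
```

The check only encodes necessary conditions: it accepts (7a) to (7c) and rejects (9a) and (9b), but it says nothing about length, rate or style, which also affect whether restructuring actually applies.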

2.1.1.3 Prosodic constituents and disambiguation

Nespor & Vogel dedicate one chapter of their book to the discussion of sentence disambiguation through prosodic information. The authors state that it is prosodic structure, and not syntactic structure (as assumed by FODOR & BEVER, 1965; GARRETT, BEVER & FODOR, 1966), which is responsible for the perception and organization of speech by the listener. They state that in cases of ambiguous sentences, only sentences whose prosodic phrasing cues their different interpretations will be correctly disambiguated in the absence of semantic or pragmatic context. To explore these hypotheses, Nespor & Vogel created an experiment with various types of syntactic ambiguity and asked adult listeners to interpret them in out-of-the-blue contexts.

Before describing the experiment, the authors review the concept of ambiguity. At the syntactic level, they state that sentences can be ambiguous at the level of syntactic constituency (as exemplified in (11) for Italian) and/or of syntactic labels (as exemplified in (12) for French).

11) a) [[Quando Giorgio]ϕ]I [[chiama]ϕ [suo fratello]ϕ]I [[è sempre nervoso]ϕ]I
When Giorgio calls his brother, he is always nervous – Giorgio is nervous

b) [[Quando Giorgio]ϕ]I [[chiama]ϕ]I [[suo fratello]ϕ]I [[è sempre nervoso]ϕ]I
When Giorgio calls, his brother is always nervous – His brother is nervous

12) a) [[La petite]ϕ]I [[sourit]ϕ]I
The little girl smiles17

b) [[La petite souris]ϕ]I
The little mouse

17 Examples from de Carvalho et al. (2017).


In (11), the sentences present different syntactic constituents; while in (11a) the VP è sempre nervoso takes a null pro as a subject, in (11b) it takes suo fratello as the subject. In (12), the words sourit/souris are both pronounced as [suʁi] but mean "to smile" (sourit) in the first sentence, and "mouse" (souris) in the second. Notice that this example has one more ambiguous word (la petite, which may be either a noun or an adjective) and presents both syntactic label and syntactic constituency ambiguity (the first utterance contains a DP and a VP, and the second utterance contains an AP and a DP).

At the prosodic level, ambiguous sentences can have different phonological and/or intonational phrasing, like the sentences above, or even no prosodic differences between them. In the proposed experiment, the authors created various sets of ambiguous sentences, using all possible combinations of syntactic ambiguity (constituency and/or label ambiguity) and prosodic cues (different Is and/or different ϕs, or neither) for Italian, and asked Italian listeners to choose their correct interpretations. The results show that ambiguous sentences that have identical prosodic structures are not disambiguated, even when their syntactic structures are different.

Through this experiment and the discussions made throughout the book, Nespor & Vogel put forward the hypothesis that listeners rely on phonological units to interpret the meaning of sentences, by making use of the same rules used for the creation of these units, but in a top-down manner, starting from the highest prosodic constituents. This view is corroborated by various studies (e.g. FRAZIER, CARLSON & CLIFTON 2006; LEHISTE, 1973; CUTLER ET AL., 1997; DUFFY & PISONI, 1992). The work of Duffy & Pisoni (1992), for instance, shows that listeners retrieve the meaning of words and sentences more rapidly and accurately when they are uttered with natural prosody than when they are created with synthesized speech, and the less effective the synthesizer is in copying natural language prosody, the harder it is for listeners to interpret the sentences.

If adults rely on prosodic structure for sentence processing, infants might also rely on it for language acquisition. In the next section, we will show how studies on prosodic bootstrapping, which have been largely influenced by Nespor & Vogel's work, investigate this hypothesis.
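The logic of the design described above can be laid out as a small sketch. It assumes, for illustration only, that the two factors were fully crossed (the original study may not have used every cell), and it encodes the hypothesis under test rather than any reported data: a sentence pair is predicted to be disambiguated only if its two readings differ in prosodic structure. All names are hypothetical.

```python
from itertools import product

# Factor levels as described in the text; the labels are illustrative.
SYNTACTIC_AMBIGUITY = ["constituency", "label", "constituency and label"]
PROSODIC_DIFFERENCE = ["I", "phi", "I and phi", "none"]


def predicted_disambiguated(prosodic_difference):
    """Prediction of the hypothesis: only prosodic differences disambiguate."""
    return prosodic_difference != "none"


conditions = [
    {"syntax": s, "prosody": p, "disambiguated": predicted_disambiguated(p)}
    for s, p in product(SYNTACTIC_AMBIGUITY, PROSODIC_DIFFERENCE)
]

for condition in conditions:
    print(condition)
```

Under this prediction the type of syntactic ambiguity plays no role: every row with identical prosody is expected to remain ambiguous, whatever its syntactic difference.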


2.1.2 Prosodic bootstrapping18 theory

Nespor and Vogel (2007) suggest that the suprasegmental layers of speech contain several cues for speech access. As prosodic phrasing rules may refer to syntactic constituents, adult listeners seem to benefit from this information for speech processing. Furthermore, there is a significant body of literature claiming that children may also rely on prosodic information for speech segmentation and language acquisition. This is the hypothesis behind prosodic bootstrapping theory. According to this theory, there is enough information in prosodic structure for children to acquire the first syntactic parameters of their mother tongue (e.g. STEEDMAN, 1996; GOUT & CHRISTOPHE, 2006) and even postulate a rudimentary syntactic structure for it, based on prosodic boundary cues, along with other cues such as the position of known function words (which are unstressed and recur at prosodic boundaries) and content words19 (e.g. CHRISTOPHE ET AL., 1997, 2008; KELLY, 1996; MORGAN, SHI & ALLOPENNA, 1996). Once children have a reliable hypothesis about the canonical structure of the language they are acquiring, this knowledge can bootstrap lexical and syntactic acquisition.

Bootstrapping theories investigate the role of the input, i.e., the linguistic experience, for the acquisition of natural languages. Although they assume that children are innately endowed with some knowledge about the functioning of natural languages (or, more specifically, with the knowledge of rules that narrow down their search space for linguistic parameters and other language-specific features (NEVINS, 2004)), the input is essential for language acquisition. The investigation of the types of linguistic cues that children rely on for this process in their first years of life helps linguists and psycholinguists to hypothesize about the path they take to uncover the specificities of their native languages (e.g. PINKER, 1989; MORGAN & DEMUTH, 1996).

The following model, created by Christophe et al. (2008), illustrates the prosodic bootstrapping of lexical and syntactic acquisition. According to this model, children rely on prosodic

18 The bootstrap metaphor comes from the expression "pull oneself over a fence by one's bootstraps", which seems to have been created in the United States in the 19th century. This expression denotes self-sustainable processes, and it is used in language acquisition with the intention of classifying the input as a sort of lever that allows children to "jump over the fence" of their native languages "by their own bootstraps", that is, through their own innate capacity for the acquisition of natural languages.

19 This view is also compatible with syntactic bootstrapping theories (e.g. MAZUKA, 1996; WAGNER, 2006; BERNAL ET AL., 2007), which state that children first learn about the syntactic structure of languages, and that this knowledge helps them get to semantic knowledge. However, the lines between those two hypotheses seem to be blurred, as it is often hard to determine whether children access certain linguistic knowledge through syntax, prosody, or even semantics (PINKER, 1989). Since it is not the goal of the present work to discuss bootstrapping theory, we will focus here on how children use prosodic and phonological information for language acquisition, without committing to a hypothesis about which cue is more important for the beginning of this process.

phrase boundaries for the postulation of rudimentary syntactic constituents. At the same time, they access function words and content words. The latter can be accessed through different word recognition strategies, such as previous knowledge, the observation of typical word formats (for instance, English words have mostly a strong-weak pattern, e.g. doctor, party) and phonotactic cues, which are the rules about how phonemes are distributed between or within words. Finally, the knowledge of some recurrent function words helps children narrow down the meaning of unknown words and label the syntactic constituents formed by prosodic boundary recognition.

Figure 2 - Prosodic bootstrapping model from Christophe et al. (2008, p. 62)
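The last step of the model, in which known function words are used to label prosodically delimited constituents, can be caricatured in a few lines. This is a toy sketch and not part of Christophe et al.'s proposal: the function-word lexicon, the category labels and the example utterance are all invented for illustration, and the input is assumed to be already segmented into prosodic phrases.

```python
# Hypothetical toy lexicon: function words a child might already know,
# each mapped to the kind of constituent it typically introduces.
FUNCTION_WORDS = {"the": "NP", "a": "NP", "is": "VP", "has": "VP"}


def label_constituents(prosodic_phrases):
    """Label each prosodic phrase by its first word, if it is a known
    function word; phrases with no such cue are labeled "?"."""
    labelled = []
    for phrase in prosodic_phrases:
        first_word = phrase[0].lower()
        labelled.append((FUNCTION_WORDS.get(first_word, "?"), phrase))
    return labelled


# Invented utterance, already segmented at prosodic phrase boundaries.
utterance = [["the", "little", "boy"], ["is", "running", "fast"]]
for label, phrase in label_constituents(utterance):
    print(label, " ".join(phrase))
```

Even with unknown content words, a learner with these two cues could posit a noun-like constituent followed by a verb-like one, which is the kind of rudimentary structure the model describes.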

This model is supported by several studies on children's sensitivity to prosodic cues and to correlations between syntactic and prosodic structure. As seen in the beginning of this chapter, some studies show that newborns can already distinguish languages based on their rhythmic patterns (e.g. RAMUS, 2002; MEHLER ET AL., 1988). In Ramus (2002), French newborns were exposed to Japanese, which has a rhythm described as mora-timed, and Dutch, which has a stress-timed rhythm. The infants (from 2 to 5 days old) were presented with sentences in these two languages while sucking on a pacifier connected to a pressure transducer, which measures changes in air pressure in the pacifier. The study comprises four experiments, in which the sentences presented went through different filters and artificial treatments, or through no treatment at all. Infants first listened to sentences in one of the languages until their sucking rate decreased; afterwards, the experimenter started playing sentences in the other language, for a test group, or in the same language, for a control group. The results show that the newborns increased their sucking rate when the experimenter switched the audio stimuli to a different language even when the only audible difference between languages was the rhythm, showing that they can differentiate languages through their different rhythmic patterns.

Slightly older infants also prove sensitive to prosodic phrase boundaries. Jusczyk et al. (1992) and Hirsh-Pasek et al. (1987) show that English-learning 7-month-olds pay more attention to recorded stories where artificial pauses are inserted correctly at prosodic phrase boundaries as opposed to within the phrases, in a position that violates prosodic phrasing rules. In a head-turning task, the authors presented children with the recorded stories, alternating between stories with correct and incorrect placement of pauses, and measured children's attention to each story through the time they spent looking towards the source of the auditory stimulus (which was accompanied by a neutral visual attention-getter, such as a blinking light). The results show that children paid less attention (i.e., looked away from the source of the auditory stimulus earlier) when presented with sentences with incorrect placement of pauses.

Shukla et al. (2011) also show that 6-month-olds already know that segments of the same word cannot be separated by a phrase boundary. The authors presented children with an image of a novel object accompanied by a novel bisyllabic word (AB) inside nonce utterances composed of different syllables in the form yABxz, where AB were the invariable syllables that formed the novel word, and y, x and z were random syllables that changed between utterances.
Afterwards, children heard new nonce utterances where A and B appeared either inside the same intonational phrase, for one group of children, or were separated by a phrase boundary, for another, while seeing an image of the familiarized novel object along with other distractor objects. When children were exposed to AB inside the same intonational phrase in the familiarization and test phases, they looked longer towards the familiarized object than to other objects in the test phase, which suggests that they learned to map the novel word AB to the familiarized object; however, when familiarized with AB in the same intonational phrase and then presented with utterances where the syllables A and B were separated by an intonational phrase boundary, children did not look longer towards the familiarized object, suggesting that they did not recognize the novel word they had learned before.

Similar studies by Gout, Christophe & Morgan (2004) and Millotte (2005), among others, also show that infants use prosodic phrase boundaries for the access of real words within sentences. In Gout, Christophe & Morgan, English-learning children from 10 to 13 months old were familiarized with either bisyllabic words such as "paper" or monosyllabic words such as "pay", and then listened to sentences in which the bisyllabic word appeared, as in [The college] [with the biggest paper forms] [is best], and to sentences in which both syllables of the bisyllabic word were present but straddled a phrase boundary, so that only the monosyllabic word appeared, as in [The butler] [with the highest pay] [performs the most]. The results show that children familiarized with the bisyllabic words paid more attention (i.e., looked longer towards the source of the auditory stimulus) to the sentences in which these words appeared than to the ones in which the syllables of the words straddled a phrase boundary. Moreover, children exposed to the monosyllabic words did not pay more attention to sentences in which these syllables were presented inside a bisyllabic word than to sentences in which they appeared right before a phrase boundary. This shows that children identified the familiarized words inside sentences, and that the ones who heard bisyllabic words did not identify the words when their syllables straddled phrase boundaries.

Studies such as the ones above show that children 1) recognize the cues that indicate prosodic phrase boundaries; and 2) use this information for speech segmentation, through the knowledge that words cannot straddle phrase boundaries20.

Other studies show that infants already understand the correlation between syntactic and prosodic structure and use this information for sentence processing. For instance, work by de Carvalho, He, Lidz, & Christophe (2019) shows that French-learning 18-month-olds interpret a novel word such as bamoule or doripe as either a noun or a verb, depending on the prosodic phrasing of the sentences they appear in. Children were presented with a video of a penguin cartwheeling, while listening to a novel word in sentences such as [regarde]! [la petite] [bamoule]! ("[look]! [the little (one)] [bamoules]!"), where the prosodic phrasing, indicated by the brackets, favors the interpretation of the novel word as a verb. Afterwards, they were presented with a video of a penguin spinning, while listening to another novel word in sentences such as [regarde la petite doripe]! ([look (at) the little doripe]!), where the prosodic phrasing favors the interpretation of the novel word as a noun21. Finally, in the test phase, they were exposed to a switch between sentences and videos; half of the children saw a noun-switch condition (i.e., they heard the noun-prosody sentences while watching the penguin cartwheeling), while the other half saw

20 Furthermore, the results from Shukla et al. also show that infants can learn novel words through observation of Transitional Probabilities (TP) between syllables or segments; the more often two syllables cooccur within a phrase, the more likely it is that children postulate these syllables as a novel word candidate. This statistical learning process seems to be important for the beginning of language acquisition; see Christophe et al. (2008) and Morgan & Demuth (1996).

21 Both novel words were equally likely to be interpreted as a noun or a verb (with a second person present inflection) and were randomly assigned to either a noun or a verb prosody.

a verb-switch condition (i.e., they heard the verb-prosody sentences while watching the penguin spinning). Since children had learned to associate one of the novel words with an action, and the other one with an object, children in the noun-switch condition did not show surprise (i.e., did not look longer towards the visual stimulus) during the test phase, since the object associated with the novel noun stayed the same; however, children in the verb-switch condition looked significantly longer towards the video during the test phase, since they were seeing a different action associated with the novel verb.

Children can also correctly interpret a real homophone through the information of prosodic boundary cues. De Carvalho, Dautriche, & Christophe (2016) presented French-learning 28-month-olds with words such as ferme, which can mean either the noun "farm" or the verb "to close" conjugated in the present, inside sentences such as [la petite] [ferme le coffre à jouets] ("the little one closes the toy box") or [la petite ferme] [est très jolie] ("the little farm is very beautiful"). Everything after the homophone was deleted and replaced by noise, so children would only hear la petite ferme. Along with the sentences, children saw two images displayed side-by-side on a monitor screen: an image of a girl closing a toy box (which is congruent with the verb interpretation of ferme), and another image of a farm (which is congruent with the noun interpretation of ferme). Since all audible words were pronounced identically between sentences, children could only rely on prosodic boundary information to disambiguate them. Children exposed to sentences in which the homophone was a noun looked significantly longer towards the image congruent with the noun interpretation than children exposed to sentences in which the homophone was a verb, showing that they correctly interpreted the homophones by relying on prosodic boundary information.
Furthermore, as seen in the first part of this introduction, Dautriche et al. (2014) show that French-learning 28-month-olds can correctly interpret left-dislocated sentences such as [il mange] [le canard] ([he eats] [the duck]) differently from sentences with no dislocation such as [il mange le canard]. When exposed to one of the two types of sentence while watching two simultaneous videos, one depicting the left-dislocated interpretation (i.e., a puppet duck eating bread) and another one depicting the no-movement interpretation (i.e., a puppet tiger eating the duck), children who listened to left-dislocated sentences looked longer towards the video corresponding to the left-dislocated interpretation than children exposed to sentences with no dislocation, showing that they correctly interpret left-dislocated sentences by relying on prosodic boundary cues.

These studies show that French-learning children as young as 18 months old can infer the correct syntactic relations between words by paying attention to their prosodic boundary cues, which means that they are aware of the correlation between prosodic and sentence structure and can use this knowledge for sentence processing and lexical acquisition. However, as seen in the first part of this introduction, experiments with different languages (e.g. English (SNEDEKER & TRUESWELL, 2001; SNEDEKER & YUAN, 2008); Korean (CHOI & MAZUKA, 2003)) and different methodologies (e.g. DAUTRICHE ET AL., 2014) have shown mixed results, with some failing to show children's ability to use prosodic boundary information for sentence parsing. It is hard to conclude from these studies whether the discrepancy between results is entirely due to methodological differences, or whether the chosen syntactic structures, as well as differences between the syntactic and prosodic structure of the different languages studied, also had an impact on children's performance.

The goal of the present work is to contribute to this discussion, by testing French-learning children's ability to interpret a different type of sentence, namely stripping sentences such as [Le tigre tape]! [Le canard aussi]!, which, without their prosodic boundary cues, cannot be distinguished from simple transitive sentences such as [Le tigre] [tape le canard aussi]!. We also tested BP-learning children with the same structure and methodology, aiming to investigate whether they can also rely on prosodic boundary information for sentence parsing. In the next section, I present the syntactic representation we adopt for the description of the stripping sentences.

2.2 SYNTACTIC REPRESENTATION OF STRIPPING SENTENCES

2.2.1 Basic assumptions

In this section I will present the syntactic representation of stripping sentences as described by works inside generative theory (CHOMSKY, 1965; 1993; 1995). But first, I will present some basic assumptions that are crucial for understanding these proposals.

By adopting generative theory, we assume that humans are innately endowed with certain abilities that enable them to acquire and develop natural languages. These abilities are divided between knowledge that is more specific to our capacity to generate linguistic mental objects in a recursive manner, known as the Computational System (CS), and knowledge that involves general cognitive abilities that we, to some extent, share with other species, known as the Articulatory-Perceptual and Conceptual-Intentional Systems, which are responsible for the phonologic and semantic representation of linguistic expressions, respectively.

According to the Minimalist Program (MP) (CHOMSKY, 1995), a sentence in a particular language is constituted by a pair (λ, π), which are, respectively, its logical (semantic) interpretation (Logical Form, LF) and its phonological interpretation (Phonological Form, PF). These are interpreted by the Conceptual-Intentional and Articulatory-Perceptual systems. Sentences are derived in a successive manner, starting with the selection of lexical items from the Lexicon (which contains all lexical and functional elements of a given language) by an interface that we will call Numeration. Selected lexical items undergo syntactic operations at CS (specifically, the operations Merge and Move). At CS, sentences are generated in a bottom-up manner, starting with the Merging of hierarchically lower terminal nodes up to functional categories such as TP (Tense Phrase) and CP (Complementizer Phrase). When these categories are merged, the operation Move takes place, raising elements for feature checking purposes22. To illustrate this, I will adopt here the following simplified X-bar structure (represented as a tree for better illustration):

13) [CP [SpecCP] [C' [C] [TP [SpecTP] [T' [T] [vP [SpecvP] [v' [v] [VP [SpecVP] [V' [V] [DP]]]]]]]]]

This structure represents the basic model that will be assumed here. Since the goal of this work is neither to discuss nor to introduce syntactic theory, I will not go into further discussion of other important features it may have. However, as we will see, for the derivation of stripping sentences, we will need to add at least one more functional category between CP and TP.
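As a reading aid for the labeled bracket notation used in the examples of this chapter, the sketch below parses a bracketed string into a nested tree and rejects unbalanced bracketings. It is only an illustration: the function names parse_brackets and terminals are hypothetical and not part of any analysis cited here, and the sketch assumes that every constituent is written as "[Label ...]" with whitespace-separated words.

```python
def parse_brackets(text):
    """Parse labeled brackets, e.g. "[VP [V read] [DP Hamlet]]", into
    nested (label, children) tuples; terminal words stay plain strings.
    Illustrative helper only (name and format are assumptions)."""
    tokens = text.replace("[", " [ ").replace("]", " ] ").split()
    pos = 0

    def node():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != "[":
            raise ValueError("expected '['")
        pos += 1
        if pos >= len(tokens) or tokens[pos] in ("[", "]"):
            raise ValueError("missing category label")
        label = tokens[pos]
        pos += 1
        children = []
        while pos < len(tokens) and tokens[pos] != "]":
            if tokens[pos] == "[":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        if pos >= len(tokens):
            raise ValueError("unbalanced brackets: missing ']'")
        pos += 1  # consume the closing bracket
        return (label, children)

    tree = node()
    if pos != len(tokens):
        raise ValueError("unbalanced brackets: trailing material")
    return tree


def terminals(tree):
    """Return the terminal words of a parsed tree, from left to right."""
    _label, children = tree
    words = []
    for child in children:
        if isinstance(child, str):
            words.append(child)
        else:
            words.extend(terminals(child))
    return words


print(terminals(parse_brackets("[VP [V read] [DP Hamlet]]")))
# prints ['read', 'Hamlet']
```

A bracketing with a missing closing bracket, such as "[VP [V read] [DP Hamlet]", raises a ValueError instead of returning a partial tree.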

22 See the Last Resort rule described by Chomsky (1995).


Assuming the derivation described above, in a sentence such as "John read Hamlet", "John" and "Hamlet" are merged as heads of Noun Phrases (NPs) inside Determiner Phrases (DPs), and the verb "read" is merged as the head of Verb Phrase (VP). Once TP is merged with this structure, "John" moves towards it for feature checking23.

14) a) [CP [TP [SpecTP John_j] [T' [T] [vP [SpecvP t_j] [v' [v] [VP [V read] [DP Hamlet]]]]]]]

Moved elements leave behind copies in all the places they have been before, and those are deleted for PF. We indicate deleted copies in the scheme above by t (which is commonly used in older versions of the theory to refer to traces (CHOMSKY, 1965)) with indices to indicate which element was copied. For instance, "John", which takes the index j, is merged as the head of NP at SpecvP and then moves to SpecTP. At a certain point of the derivation at CS, the structures are sent to PF for phonological interpretation, in a process called spell-out, and then the computation continues in a covert manner until it is sent to LF for semantic interpretation. The scheme below, adapted from Lopes (1999), summarizes the derivation process as described by Chomsky (1995)24:
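The derivation just described (Merge from the bottom up, Move as copy plus remerge, and non-pronunciation of lower copies at PF) can be sketched in a few lines of Python. This is a toy illustration under simplifying assumptions, not an implementation of the Minimalist Program: the class Node and the functions merge, move, spell_out and all_copies are hypothetical names, feature checking is not modeled, and verb movement is ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)
    word: str = ""        # filled only for terminal nodes
    silent: bool = False  # True for lower copies, which PF does not pronounce


def merge(label, left, right):
    """Merge: combine two syntactic objects under a new labeled node."""
    return Node(label, [left, right])


def move(item, target, label):
    """Move, sketched as copy + remerge: a copy of `item` is merged at the
    edge of `target`, and the original occurrence becomes a silent copy."""
    higher_copy = Node(item.label, item.children, item.word)
    item.silent = True
    return Node(label, [higher_copy, target])


def spell_out(node):
    """PF side: linearize terminals, skipping silent (deleted) copies."""
    if node.silent:
        return []
    if node.word:
        return [node.word]
    return [w for child in node.children for w in spell_out(child)]


def all_copies(node):
    """Syntactic side: every occurrence, including unpronounced copies."""
    if node.word:
        return [node.word]
    return [w for child in node.children for w in all_copies(child)]


# Bottom-up derivation of "John read Hamlet", following (14).
john = Node("DP", word="John")
vp = merge("VP", Node("V", word="read"), Node("DP", word="Hamlet"))
v_bar = merge("v'", Node("v"), vp)
v_p = merge("vP", john, v_bar)        # "John" is first merged in SpecvP
t_bar = merge("T'", Node("T"), v_p)
tp = move(john, t_bar, "TP")          # "John" raises to SpecTP

print(" ".join(spell_out(tp)))    # prints: John read Hamlet
print(" ".join(all_copies(tp)))   # prints: John John read Hamlet
```

The second line of output makes the lower copy visible: it is present in the structure but is skipped when the structure is linearized for PF.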

23 There is an important difference here between English and the languages that are studied here (French and BP). While in English, the verb is said not to move to T in declarative affirmative sentences, it does move in French and BP (POLLOCK, 1989).

24 See Chomsky (2000) for a different account.


Figure 3 - Model of sentence derivation from Chomsky (1995). Figure adapted from Lopes (1999, p. 90).

According to the MP, the only source of internal variation in this model comes from the lexicon. The different features selected by each lexicon account for the structural differences between languages, since they demand different operations (among the limited number of operations CS can perform). The restrictions applied to natural languages come from Economy Principles, which state that costly operations such as Move only take place when they are strictly required by features (i.e., there is no optional movement), and Bare Output Conditions, which are imposed by the PF and LF interfaces. A successful derivation satisfies all Bare Output Conditions (i.e., it is fully interpreted at the interfaces) while keeping the number of costly operations to a minimum.

2.2.1.1 Ellipsis theory

In this subsection we present the ellipsis theory assumed in this work. The arguments proposed here come mainly from Lasnik & Funakoshi (2018) and van Craenenbroeck & Temmerman (2018). Ellipsis can be described as the omission or absence of elements in a sentence, whose meaning can be recovered from previous linguistic (or, sometimes, pragmatic) context. The following sentences constitute different types of ellipsis:


15) a) Cindy played the guitar, and John did __ too.
b) Cindy played something yesterday - I wonder what __?
c) Cindy played the guitar, and John __ too.

In (15a) we have a case of VP ellipsis (VPE), where the elided constituent is the VP [VP play the guitar]. In (15b) and (15c) we have different cases of TP ellipsis: the former is a case of sluicing25, i.e., a TP ellipsis that leaves a wh-element as its remnant, while the latter is a case of stripping, a type of TP ellipsis that usually leaves a DP and a sentential adverb or negation as its remnant.

The elided constituents, represented by the underlined spaces in the sentences above, can be accounted for by two different hypotheses: the "absence" or non-structural hypothesis, which states that there is neither structure nor an empty category behind the ellipsis site; and the "omission" or structural hypothesis, which states that the elided elements have a full internal structure at some levels (LASNIK & FUNAKOSHI, 2018). The former has recently been developed within Culicover & Jackendoff's (2005) theory, which follows the WYSIWYG (What You See Is What You Get) hypothesis, according to which there is no more structure in a sentence beyond what is actually uttered. The latter is further divided between accounts of ellipsis as a process of deletion at PF and as reconstruction (copy) at LF. In the PF deletion account, the elided constituent is present in the sentence until spell-out and is then deleted at PF. Since PF and LF are two independent levels, PF deletion does not hinder the interpretation of the sentences at LF. In the LF reconstruction account, the ellipsis site is occupied by a null pronominal-like element until after spell-out, and this element is replaced at LF by syntactic structure copied from an antecedent, making the sentence interpretable at this level.

In the present work, we assume the PF deletion account, for the reasons discussed in Lasnik & Funakoshi (2018) and Ross (1969). As an argument in favor of the structural account, these authors observe that the remnant constituents of ellipsis are subject to certain restrictions or operations that can only apply if the elided elements have structure. For instance, in (15b) above, the elided constituent must be a TP, since the verb wonder only takes interrogative clauses as its complement (the sentence "I wonder guitar", for instance, is ungrammatical). If the elided constituent did not have structure, wonder in (15b) would only have a wh-element as its complement, and so the sentence should be ungrammatical.

25 First described by Ross (1969). See also Lasnik (2001), Merchant (2001), and Culicover & Jackendoff (2005), among others.


Furthermore, the PF deletion account is preferred over the LF copying account because the former better explains some phenomena such as the fact that P-stranding (i.e., the requirement that the preposition stays in situ) in sluicing, illustrated in (16) below, follows the same rules applied to regular P-stranding in wh-movement (MERCHANT, 2001), which, according to Lasnik & Funakoshi, "straightforwardly follows from the deletion (plus movement) analysis while it does not from the LF-copying (plus base generation) analysis." (p. 53).

16) a) who are you going to do away [PP with t_who]?
b) * [PP with whom] are you going to do away t_PP?
c) Bill's planning on doing away with one of his in-laws, but I don't know [which].
d) *Bill's planning on doing away with one of his in-laws, but I don't know [with which]
(ROSS, 1969, p. 265)

This view agrees with most analyses of ellipsis as a type of surface anaphor, first proposed by Hankamer & Sag (1976). According to van Craenenbroeck & Temmerman (2018), the elided constituent is a silent anaphor whose meaning is recovered from the material of the antecedent. In this view, sentences such as (15) above involve the representations in (17) below, where we use Δ to denote the anaphor:

17) a) Cindy played the guitar, and John did Δ too. Δ = play the guitar
b) Cindy played something yesterday - I wonder what Δ? Δ = Cindy played
c) Cindy played the guitar, and John Δ too. Δ = played the guitar

Recoverability is yet another important issue involving ellipsis theory. According to Chomsky (1964), ellipses follow two conditions of recoverability. First, the deleted element must be a "designated representative of a category". Second, deletion only occurs if the deleted element shares an identity relation with another element. The nature of this "identity relation" has been widely debated. How identical to its antecedent must the elided constituent be in order to be recoverable? Is this identity more syntactic, or is it more of a semantic or pragmatic relation?


Some studies (e.g. CHOMSKY, 1964; LASNIK, 1995) argue that the deleted element must have a structure identical to an antecedent at some stage of the derivation. A strong argument in favor of this view comes from the fact that sluicing does not tolerate active-passive mismatches such as (18) below26. Since active and passive sentences are supposed to be semantically identical, the explanation for the ungrammaticality of these sentences must come from the violation of syntactic identity restrictions.

18) a) *Joe was murdered, but we don't know who murdered Joe
b) *Someone murdered Joe, but we don't know who by/by whom Joe was murdered
(LASNIK & FUNAKOSHI, 2018, p. 63-64)

However, the grammaticality of sentences such as (19a) below poses an argument against a strict syntactic identity condition, as noted by Merchant (2001). If this condition applies, the elided constituent of (19a) must have the structure in (19b). This structure, however, violates Principle C (also called Condition C) of binding theory, which states that a referential expression (such as "Alex" in the example below) cannot have an antecedent that c-commands it (i.e., it must not be bound).

19) a) They arrested Alex_i, though he_i thought they wouldn't.
b) *He_i thought they wouldn't [arrest Alex_i]
(MERCHANT, 2001, p. 24)

If the semantic identity condition applies, or if we assume a less strict syntactic identity condition, however, we can assume the structure in (20) for the elided constituent, which does not violate Condition C.

20) They [VP arrested Alex_i], though he_i thought they wouldn't [VP arrest him_i].
(LASNIK & FUNAKOSHI, 2018, p. 66)

However, the grammaticality of the sentence above is also explained by the phenomenon of vehicle change, proposed by Fiengo & May (1994), which allows copies of referential expressions to be evaluated as pronouns with respect to interpretive principles. According to this phenomenon, the deleted DP (e.g. "Alex" in (20)) is reduced to a pronoun that bears the same index the DP would have, avoiding violation of Condition C27.

In view of arguments such as these (and many others presented in Lasnik & Funakoshi (2018)), it is difficult to choose between a syntactic and a semantic identity hypothesis. We will simply state here, as the authors do, that a semantic identity is necessary, but not sufficient, and a non-strict syntactic identity condition must also be met. This would explain the examples that show the need for some syntactic identity while already fulfilling the semantic identity criteria, such as (18) above for the passive-active mismatches.

One last point to clarify about the ellipsis theory we adopt concerns licensing. Even if the conditions described above are met, deletion, like any other syntactic operation, only occurs under certain conditions. For ellipsis, this is translated into licensing. According to Lasnik & Funakoshi, "in order to be licensed, the ellipsis site must be in a local relation with a head with specific morphosyntactic properties" (p. 69). Martins (1994) notes that, at least in Romance languages, ellipsis is licensed by functional heads with strong features, i.e., features that trigger overt movement. We adopt this view as the one that best describes the stripping phenomenon, since it is the one most widely assumed by the studies reviewed in the next section.

26 Merchant (2008) observes that VPE does accept passive-active mismatches:

i) a) This problem was to have been looked into, but obviously nobody did.
b) The janitor should remove the trash whenever it is apparent that it needs to be

However, he explains it by stating that the functional head responsible for active and passive voice is included in TP but is outside VP, which means that in cases of VPE the antecedent and the ellipsis site are still syntactically identical, while in sluicing they are not.

2.2.2 Analysis of stripping sentences

Stripping, or bare argument ellipsis, was first analyzed by Ross (1969) and later by Hankamer & Sag (1976) and several others (FIENGO & MAY, 1997; MERCHANT, 2001; JOHNSON, 2009). It is usually described as a type of TP ellipsis that deletes everything in a clause under identity with a preceding clause, leaving behind a single element, usually accompanied by a negation or a clausal adverb such as "too". The following sentences illustrate this phenomenon in English, BP, and French, respectively:

21) a) John read Hamlet, and Mary too.
b) O João leu Hamlet, e a Maria também.
c) Jean a lu Hamlet, et Marie aussi.

27 For more details, see Fiengo & May (1994). See also Safir (1999) for an extension of this phenomenon to A-bar chains, and Hunter & Yoshida (2016) for a critical review of these proposals.


Stripping seems to be found only in conjunctions or disjunctions (see (22a) below in contrast with (21)). The conjunct that contains the elided element can appear alone in fragment answers, but only if it has a linguistic antecedent in the discourse (see (22b) and (22c) in contrast with (22d)):

22) a) *John read Hamlet because Mary too.
b) *Mary too. (uttered out of the blue)
c) (John is reading Hamlet when Adam sees him) Adam: *(And) Mary too28.
d) Anna: John read Hamlet in high school. Adam: Mary too.

The remnant of stripping sentences can be a subject DP, as in the sentences above, but also an object DP, a PP and even an AP, as shown by the examples in (23), respectively:

23) a) John read Hamlet and Fahrenheit 451 too.
b) Mary likes to work with Cindy but not with Peter.
c) John said that Peter is handsome and smart too.

Ross (1967) has an account for these types of sentence that does not involve ellipsis, but only rightward movement of an element out of a conjunction, as in the example below:

24) John__ read Hamlet [and Mary] too.

28 However, at least in BP, fragment stripped answers may be acceptable with pragmatic antecedents:

i) Context: Twitter is all the rage in Brazil. Maria, who loves social networks, is surfing on Twitter the whole day. By her side, Joana, who hates social networks, is also surfing on Twitter. John sees this and is immediately shocked:

John (to Joana): (Mas) você também? ("(But) you too?")

These cases of fragment stripped answers with a pragmatic antecedent may mean that stripping has two different analyses: one that sees it as a surface anaphor, for the cases where a pragmatic antecedent is not possible, and another one that sees it as a deep anaphor. Hankamer & Sag (1976) state that only surface anaphors are true cases of ellipsis, whereas deep anaphors are said to be null pro-forms with no internal structure.

However, the meanings of stripping sentences and of sentences with conjoined nominals are not exactly the same; van Craenenbroeck & Temmerman (2018) demonstrate that stripping sentences are better understood as two singular conjoined clauses rather than one plural clause, since they cannot support adverbs with plural interpretations, such as "together"29:

25) a) John and Mary read Hamlet together.
b) * John read Hamlet together and Mary too.

Merchant (2001), Depiante (2000), Kim (1999), among others, better describe the derivation of stripping sentences as the movement of the remnant element to a functional projection that selects the TP, motivated by a strong feature, followed by PF deletion of the TP. The scheme in (26) below, adapted from Depiante, illustrates this process:

26) John read Hamlet and [FP Mary_i [TP t_i read Hamlet] too]
(annotations in the scheme: movement of the remnant DP to a functional projection; TP deletion at PF)

An argument in favor of this leftward movement is the fact that stripping sentences cannot occur in island domains (compare (27a) with (27b), a case of VP ellipsis):

27) a) *She will visit her friends if Ana, too.
b) She will visit her friends if Ana will, too.
(CYRINO & MATOS, 2002, p. 02)

29 This is not to say, however, that stripping sentences do not support the adverb "together" under the right conditions. See, for instance, (ii), which involves two plural conjuncts:

ii) John and Mary wrote their essay together, and Sue and Cindy too.

Island effects are a diagnostic for movement, and their presence in stripping sentences implicates movement out of the ellipsis site. This also shows that stripping and VP ellipsis involve different constituents; stripping sentences violate island domains because the DP must move out of the island to escape deletion, but in VP ellipsis the DP does not need to do so, since the scope of deletion is VP and not TP.

The functional element that licenses stripping is usually said to be a Focus Phrase (FocP) (e.g. WURMBRAND, 2017; WINKLER, 2005; DEPIANTE, 2000) or a Sigma Phrase (ΣP) (e.g. MARTINS, 1994; HOLMBERG, 2001) with a strong feature that needs to be satisfied by the movement of a DP to its specifier position. The FocP proposal seems to follow from the fact that the remnants of stripping are prosodically focus-marked (WINKLER, 2005), and that many languages mark focus by leftward movement of the focused element (RIZZI, 1997).

Considering these proposals, we will adopt the following structure for stripping sentences, from van Craenenbroeck & Temmerman (2018, p. 594):

28) John read Hamlet and Mary too

This is the representation we will assume for both languages studied here, i.e., French and BP. We assume, then, that stripping sentences such as the ones above are formed through TP coordination, and that the elided TP is dominated by a functional element (FocP or ΣP) that licenses the ellipsis and motivates the movement of the remnant.
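The analysis adopted here (a second TP identical to the antecedent, movement of the remnant to the specifier of FocP or ΣP, and deletion of that TP at PF) can be sketched as follows. This is a toy illustration over flat word lists, not a parser: the function derive_stripping and its parameters are hypothetical names, the correlate of the remnant is supplied by hand, and identity is reduced to word-for-word copying of the antecedent clause.

```python
def derive_stripping(antecedent, correlate, remnant,
                     particle="too", conjunction="and"):
    """Return (pf_string, lf_string) for a stripping sentence.

    antecedent: list of words of the first conjunct
    correlate:  the word in the antecedent that the remnant contrasts with
    remnant:    the constituent that survives ellipsis
    All names are illustrative assumptions, not part of the cited analyses.
    """
    if correlate not in antecedent:
        raise ValueError("the correlate must occur in the antecedent clause")

    # Second TP: identical to the antecedent, with the remnant in the
    # correlate's position (its base position before movement).
    second_tp = [remnant if word == correlate else word for word in antecedent]

    # (word, pronounced?) pairs for the second conjunct: the remnant is
    # pronounced in the specifier of the functional projection, while
    # everything inside the TP is deleted at PF.
    second_conjunct = (
        [(remnant, True)]
        + [(word, False) for word in second_tp]
        + [(particle, True)]
    )

    pf = antecedent + [conjunction] + [w for w, spoken in second_conjunct if spoken]
    lf = antecedent + [conjunction] + second_tp + [particle]
    return " ".join(pf), " ".join(lf)


pf, lf = derive_stripping(["John", "read", "Hamlet"], "John", "Mary")
print(pf)  # prints: John read Hamlet and Mary too
print(lf)  # prints: John read Hamlet and Mary read Hamlet too
```

Calling the same function with "Hamlet" as the correlate and "Fahrenheit 451" as the remnant yields the object-remnant pattern in (23a).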

2.3 ACQUISITION OF ELLIPSIS

According to Lopes & Santos (2014), to acquire ellipsis a child needs to acquire both the licensing and the identity conditions that are necessary for this process. We have seen in the previous sections that the licensing of ellipsis is given by a strong feature in a functional head. For the identity conditions, we assume that the deleted elements that depend on a linguistic antecedent must share some syntactic identity with their antecedent. This identity must involve the entire deleted element; for instance, the sentence "John read Hamlet and Mary (did) too" implies that Mary read Hamlet, and not Pride and Prejudice. This means that the elided constituents in VP ellipsis or stripping must share identity with the entire VP or TP of their linguistic antecedent.

So in order for a child to understand ellipsis, she needs to 1) identify the non-pronounced elements in the sentence that are licensed by an ellipsis feature (and are not just cases of null elements, as for instance, null objects); 2) identify the remnants of the elided sentence; and 3) reconstruct the elided constituent through the observation of its identity relation with an antecedent.

Several studies show that young children produce and comprehend ellipsis from an early age (e.g. SANTOS, 2006, 2009; LOPES, 2009; LOPES & SANTOS, 2014; POSTMAN ET AL., 1997 apud LOPES & SANTOS 2014; FOLEY ET AL., 2003; GUO ET AL., 1996). Postman et al. (1997), for instance, conducted an imitation task with children from 31 to 47 months old, in which the experimenter would say a coordinated sentence without ellipsis and the child would be asked to repeat it. In some of these repetitions, children spontaneously shortened the experimenter's sentences by producing an ellipsis, as in example (29) below, uttered by a child of 34 months of age.

29) Experimenter: Bert wipes his nose and Mickey wipes his nose too.
Child: Bert wipes his nose and Mickey does too.
(POSTMAN ET AL. 1997, apud LOPES AND SANTOS, 2014, p. 185)

Santos (2006, 2009) and Kato (2012, apud Martins, 2016) go even further to say that European Portuguese (EP) and BP-learning children already produce ellipses in verbal answers to yes-no questions such as (30) below by 17 months of age:

30) Mãe: o cavalo vai papar?
Mother: "Is the horse going to eat?"
Criança: vai.
Child: "goes" (yes).
(SANTOS, 2009, p. 2)

Foley et al. (2003) also conducted an experiment with English-learning children from 3 to 8 years old, in which they had to act out sentences using a set of toys. The authors found that by 3 years of age, children can correctly interpret simple VPE sentences such as "Fozzi bear jumps up and down and Big Bird does too", and even sentences involving pronouns such as "Oscar touches his apple and Big Bird does too", which are ambiguous between a strict and a sloppy reading30. When asked to perform the first sentence, children would make a Fozzi bear and a Big Bird doll jump up and down; when asked to perform the second sentence, children would either make an Oscar doll touch his own apple and Big Bird touch Oscar's apple (strict reading), or, more frequently, Oscar touch his own apple and Big Bird touch his own apple (sloppy reading).

The work of Lindenbergh, Van Hout & Hollebrandse (2015) also shows that Dutch-learning children from 4 to 6 years old understand sluicing sentences such as (31). In a picture selection task, children were presented with four pictures and had to choose the picture that best represented the sentence they heard. For a sentence such as (31), for instance, children saw a picture of a woman drawing a flower; a picture of a person painting a flower, but whose body and face were concealed by a blue rectangle, so that only her arms were visible; another picture of a person concealed by a rectangle, but this time drawing a house; and finally another picture of a concealed person, but this time holding a flower.

31) Iemand tekent een bloem, maar ik zie niet wie
Someone is drawing a flower, but I can't see who.

The results show that children pointed towards the target picture (i.e., the one in which a concealed person is painting a flower) more than 90% of the time. Furthermore, the authors also conducted an elicited production task, in which children were given some cards like the ones described above, and the experimenter, who could not see the cards, asked questions such as "is someone drawing a house?". Most of the children's answers included fragment sluicing structures, such as "yes, but I can't see who".

As shown by the examples above, there is a growing body of literature on the acquisition of ellipsis, which shows that children as young as 17 months old already produce and understand elliptical sentences. When interpreting ellipses in act-out or picture selection tasks, children show knowledge of the identity conditions linked to these structures: in Lindenbergh, Van Hout & Hollebrandse (2015), they point towards the picture where there is a hidden person drawing a flower (and not drawing a house or holding a flower); in Foley et al. (2003), they act out the named action with both the first and second characters mentioned in the sentences. However, there are not many studies on structures other than VP and NP ellipsis, and most of these studies are on English, although there are some on Brazilian and European Portuguese (e.g. SANTOS, 2006, 2009; LOPES, 2009; LOPES & SANTOS, 2014), Dutch (e.g. LINDENBERGH, VAN HOUT & HOLLEBRANDSE, 2015), Japanese (e.g. OHTAKI, 2014; GUO ET AL., 1996) and French (e.g. SLEEMAN & HULK, 2011).

The experiments described in this thesis aim to contribute to the expansion of this literature by investigating French and Brazilian children's comprehension of stripping sentences. This structure has not been extensively studied yet, apart from the analyses of verbal answers to yes-no questions in Santos (2006, 2009) and Kato (2012, apud Martins, 2016), which are said to be instances of stripping. Building on these studies, we hypothesize that French- and BP-learning 3- to 4-year-olds already understand stripping sentences, and so the children who listened to stripping sentences in Experiments 1 and 3 should look longer towards the video where the character associated with the second-mentioned DP is performing the same action as the character associated with the first DP than children who listened to simple transitive sentences in Experiment 1, or to sentences in which aussi (too) is replaced by a novel word in Experiment 3.

30 When the referent of the pronoun that undergoes VP or TP ellipsis is not the same as that of its linguistic antecedent, the sentence has a sloppy reading; when it is, the sentence has a strict reading.

2.4 FRENCH VS. BP: RELEVANT PROSODIC AND SYNTACTIC DIFFERENCES

As Experiment 1 will be conducted with both French- and BP-learning children, it is important to highlight some prosodic and syntactic differences between these languages. Although several factors could predict differences in the performance of children learning each language, such as methodological and environmental contexts, it is important to show which linguistic factors could also have an influence on this process.

Beginning with syntax, since we are studying a type of ellipsis, it is important to observe how this phenomenon, as well as other types of null constituents, behaves in each language. BP is a language that allows for several types of ellipsis, such as stripping, gapping, sluicing and VPE, as well as null subjects and objects. In French, however, there are (theoretically) no null subjects, null objects are restricted and clearly marked (i.e., they are judged by native speakers as strange, although not impossible (see below)) and there is no VPE. See (32a), an example of VPE in BP, in contrast with (32b), which is the same example in French:

32) a) Perguntamos se eles já tinham comido, e eles tinham.
b) * On avait demandé s'ils avaient déjà mangé, et ils avaient.
(We) asked if they had already eaten, and they had Ø.
(CYRINO & MATOS, 2007, p. 200)

Cyrino & Matos (2007) claim that there is no VPE in French and other Romance languages such as Spanish because, in these languages, the auxiliary verbs have lost their Tense value, and so auxiliary + main verb constructions lack aspectual value. The authors propose the following structure for these types of constructions in BP and French, respectively:

33) a) [CP C [TP T [AspP-vP Asp- vP ... [vP ]]]] - BP
b) [CP C [TP T [AspP Asp ... [vP ]]]] - French
(CYRINO & MATOS, 2007, p. 203)

The VPE formation rule states that VPE "is licensed under immediate c-command of the lexically realized functional head with V-features that combines ("merges") with the elliptical verbal predicate" (p. 203, my translation). In (33b), the vP does not merge with Asp, and so Asp does not immediately c-command the elliptical verbal predicate, which blocks VPE; in (33a), Asp does immediately c-command the verbal predicate, as it combines with vP, and so VPE is allowed.

Another difference between French and BP is that French is said to be a non-pro-drop language, since it does not allow for null subjects (for instance, in (32b) above, one could not say *Avait demandé), whereas BP does. The pro-drop parameter (RIZZI, 1986) classifies languages according to the possibility of occurrence of null subjects, which are syntactically represented as pro (hence the name pro-drop). Italian is a typical example of a pro-drop language, as it frequently drops subjects. BP, however, is a language that falls between the two sides of this parameter: while it allows for subject drop in some contexts, as in (32a) above, it forbids it in others (for instance, in O João disse que Ø comprou uma casa ("John said that (he) bought a house") it is very hard for the null subject to refer to a third party, whereas this reading is acceptable in Italian). We will not go into further detail about this issue here, but we will assume, following Kato (2000), that BP can be better described as a partial pro-drop language.

French and BP also differ in the use of null objects. While BP allows for referential null objects in several contexts, their occurrence is very marked in French. This presents an issue for Experiments 1 and 3 because our test sentences contain one verb that alternates between a transitive and an unergative reading in both languages (i.e., manger/comer ("eat")) and three verbs that are strictly transitive (i.e., pousser/empurrar; taper/cutucar; and porter/carregar ("push", "hit/poke", "carry")). All verbs appear with a realized object in the transitive condition, but without one in the stripping condition. Transitive verbs not followed by a direct object can be interpreted as null object (NO) constructions, where the object is structurally represented as a phonetically null element (RIZZI, 1986). NOs are allowed in several languages, such as Chinese (e.g. HUANG, 1984), EP (e.g. RAPOSO, 1986) and BP (e.g. FERREIRA, 2000; BIANCHI & FIGUEIREDO SILVA, 1994; CYRINO, 1997).

The nature of NO constructions or the processes that generate them seem to vary between languages and even between different constructions within a single language. For BP, Bianchi & Figueiredo Silva (1994) propose that, while animate NOs are [-anaphoric], [-pronominal] empty categories bound by a null operator that moves to Spec,CP in the root clause, inanimate NOs are [-anaphoric], [+pronominal] empty categories, namely pro (p. 189). In fact, the empirical evidence presented by the authors and by Ferreira (2000) shows that there are several important differences between animate and inanimate NOs in BP that justify their different syntactic treatment. For instance, animate NOs cannot be co-indexed with any c-commanding argument (i.e., they are subject to Condition C; see (34) below), whereas inanimate NOs can do so.

34) a) *O José_i impediu a esposa [de matar e_i]
José prevented the wife from kill Ø (him)
b) Esse tipo de garrafa_i impede as crianças [de abrirem e_i sozinhas]
This kind of bottle prevents the children from opening Ø (it) by themselves
(BIANCHI & FIGUEIREDO SILVA, 1994, p. 187)

Cyrino & Lopes (2004) and Lopes (2009) have a different proposal. According to the authors, NOs in BP are a case of nominal ellipsis "locally licensed under a c-command relation by the lexically filled head of Aspect (Asp)." (LOPES, 2009, p. 105). One piece of evidence that NOs are not pro in BP is that they are not constrained by the same rules as lexical pronouns; for instance, while NOs allow for a strict or a sloppy reading, lexical pronouns only allow for a strict reading. In (35a) below, one might say that Pedro turns off his own sound system (sloppy reading) or that he turns off João's sound system (strict reading), whereas in (35b), the only possible interpretation is that Pedro turns off João's sound system.

35) a) De noite, João liga seu aparelho de som, mas Pedro desliga.
At night, João turns on his sound system, but Pedro turns Ø off
b) De noite, João liga seu aparelho de som, mas Pedro desliga ele.
At night, João turns on his sound system, but Pedro turns it off

Lopes and Cyrino & Lopes also state that [+specific] NOs tend to occur with [-animate] referents, whereas [+animate] referents occur mostly in [-specific] contexts, such as in Policial sempre insulta preso antes de torturar ("Policemen always insult prisoners before torturing Ø"). However, this does not mean that [+animate] referents are completely unacceptable with [+specific] contexts, only that they are more marked. For instance, the sentence in (36) is said to be acceptable with a [+animate] referent of the null object31:

36) Eu dei banho no bebê, enxuguei e troquei.
I bathed the baby, dried Ø (him) and changed Ø (him).

Regardless of which theory best describes NOs in BP, it is clear that [+animate] referents behave differently from [-animate] ones, and that [+animate], [+specific] constructions are more marked in this language.

While null object constructions are allowed in Brazilian Portuguese, some authors state that this is not the case for French (e.g. HUANG, 1984; RAPOSO, 1986). However, others (e.g. CUMMINS & ROBERGE, 2004; GRÜTER, 2009) show that, although clearly marked32, referential NOs are present in written and spoken contemporary French. See, for instance, the examples in (37):

37) a) Je crois que t'aimes bien, toi, ce genre de truc. J'ai trouvé hier.
I think that you like this sort of thing. I found Ø (it) yesterday.
(spoken interaction; LAMBRECHT & LEMOINE, 1996, p. 297)
b) (in a video store) Si on prenait Tigre et Dragon ? Qui a vu ?
How about Crouching Tiger, Hidden Dragon? Who has seen Ø?
(CUMMINS & ROBERGE, 2005, p. 53)
c) …Et la tête qu'il fait le jour où on rapporte au logis un store décoré d'une photo de Marilyn. S'il déteste vraiment, on le case dans la salle de bain.
…And the look on his face the day we bring home a blind decorated with a photo of Marilyn [Monroe]. If he really hates Ø, we stick it in the bathroom.
(Cosmopolitan, August 1996, p. 118; reported in CUMMINS & ROBERGE, 2005, p. 52)
d) A: Je vais avoir trente ans.
I will have thirty years
B: J'ai déjà eu, moi.
I have already had Ø, me
(spoken interaction; LAMBRECHT & LEMOINE, 1996, p. 299)

31 Thanks to Ruth Lopes for pointing this out.
32 We are not referring here to generic or deictic NOs such as (i) below, which seem to occur somewhat freely in most languages (e.g. KATO, 1993; CUMMINS & ROBERGE, 2004, 2005).
i) Les écrivains attirent Ø sexuellement.
Writers attract Ø sexually.
(M. Duras, reported in Lambrecht & Lemoine, 1996, p. 286)

Cummins & Roberge (2004) and Grüter (2009) state that French referential NOs stem from a process called clitic-drop, that is, they are not actually NOs, but rather null clitics. Grüter proposes that, along with the three phonetically realized clitics in French (i.e., le, la, les), there is a fourth option, which is Ø. This is corroborated by the fact that almost all non-pronounced objects from the sentences above can be replaced by a clitic (e.g. Je l'ai trouvé hier; Qui l'a vu?)33. Furthermore, the occurrence of null clitics is favored in certain conditions, such as third person reference, nonhuman reference and reference to a proposition or process. Grüter explains this by saying that null clitics do not have a Case feature, which allows them to be phonetically null. According to the author, the absence of a Case feature is "(…) attributed to a violation of the Case-expression requirement, which is permissible if and only if pro is marked [predictable], that is, when its referent is highly salient in the discourse context". Regardless of whether the proposal above is the one that best describes French sentences, the examples show that French does allow for transitive structures with no object, even though they are more marked and have a different underlying structure than in BP.
As we can see from the observations above, both French and BP allow some kinds of ellipsis and null elements, but in BP these are much more frequent than in French. This suggests that French children must be exposed to fewer cases of ellipsis and phonetically null elements in the input than Brazilian children34. We could predict then that BP-learning children would perform better than French-learning children in the experiments proposed in this thesis, as they should be more used to elliptical sentences. Furthermore, French-learning children are most likely not exposed to many cases of null clitics in their input, as these constructions are marked in the language.
As most of our stripping test sentences from Experiments 1 and 3 involve null object constructions (as we will further explain in section 3.1), we would also expect French children to have more difficulty interpreting them than BP-learning children.

33 There are, however, some cases where the absent object cannot be replaced by a clitic, as in (37d), where the sentence *Je les ai déjà eus, moi is ungrammatical. This is explained by Lambrecht & Lemoine (2005) as a special use of the null object motivated by the lack of a better alternative in the language, since the use of a personal pronoun "would lend to the complement a specific referential status incompatible with the indefinite quantified antecedent NP [trente ans]." (p. 40).
34 A corpus search is needed to corroborate this claim.


To present the prosodic aspects of French and BP which might influence children's performance on the proposed experiments, we turn to Frota & Prieto's (2015) detailed compilation of analyses of nine Romance languages, including some varieties of French (from Lannion, Lille, Paris, Orléans, Toulouse, Lacaune and Marseille, in France; Genève and Fribourg, in Switzerland; and Louvain-la-Neuve and Brussels, in Belgium) and Brazilian Portuguese (from Salvador (BA); São Paulo (SP); Belo Horizonte (MG); and Porto Alegre (RS)). All the data described were obtained using similar methodologies and analyzed following the Autosegmental Metrical model of intonational phonology. According to the studies described, French presents some prominence and phrasing characteristics that it does not share with other Romance languages, including BP. For instance, French does not present lexical stress. Instead, it presents obligatory phrase-final prominence in the last metrical syllable of the word. Furthermore, French presents a smaller number of pitch accents and boundary tones compared to BP, and BP uses pitch accents for topic and focus marking, whereas French does not. The domain for the distribution of pitch events is also different between these languages: while in French this domain is I35, in BP, it is the prosodic word. This means that BP has potentially more pitch events than French. French has fewer pitch accents and boundary tones than BP, and these exist mainly to indicate phrase boundaries. In BP, by contrast, intonational phrases contain more pitch events, and these also serve as indicators of other phenomena not (necessarily) involving prosodic boundaries. This might mean that these cues are less ambiguous in French than in BP. So we could predict that French children would have an easier time identifying the relevant prosodic boundaries for the comprehension of the structure of the test sentences than children learning BP.
Another important difference in prosodic structure is the one found in our own test sentences; we conducted acoustic analyses of these sentences in both French and BP (further described in chapters 2 and 3), and found some differences in the prosodic cues that indicate the relevant prosodic boundaries. Our test stripping sentences for BP have a pitch fall of 18 Hz between the verb and the second DP, while the French stripping sentences have a rising pitch contour of 46 Hz in this position, and the BP transitive sentences present a pause of 115 ms on average between the first DP and the verb, whereas the French transitive sentences do not present a consistent pause in this position. These seem to be rather minor differences, so it is hard to tell if (or how) they would have any impact on children's performance in the experiment.

35 Frota & Prieto define this domain as AP, or Accentual Phrase, which is a prosodic constituent placed between the prosodic word and iP, or Intermediate Phrase. AP and iP are postulated to account for some prosodic phenomena that are observed by the Autosegmental Metrical model in some languages such as French but are not present in BP or other Romance languages.

In the next chapter, I describe the procedure and results of Experiment 1, with French and BP-learning children. I also further explain the structure of the test sentences and make some predictions about how these structures may affect children's performance in both languages.


EXPERIMENT 1: STRIPPING VS. SIMPLE TRANSITIVE SENTENCES WITH KNOWN VERBS

In section 2.1 above we presented some evidence in favor of the hypothesis that children and adults use prosodic boundary information for sentence parsing. The theory of prosodic phonology from Nespor & Vogel (1986) states that some phonological and intonational phrase formation rules make reference to syntactic structure, which makes it possible for adult listeners to rely on prosodic phrasing cues to access sentence structure. Experiments with children and infants show that children may also benefit from prosodic information to figure out the structure of sentences, for lexical and syntactic disambiguation (e.g. GOUT, CHRISTOPHE & MORGAN, 2004; DE CARVALHO ET AL., 2019; DAUTRICHE ET AL., 2014). There are, however, still few experiments that corroborate this claim, and these are restricted to a handful of languages and structures. In an attempt to contribute to the growth of evidence in favor of this hypothesis, this first experiment was designed to investigate whether French- and BP-learning children could use the prosodic boundary between the verb and the second DP in stripping sentences to differentiate them from simple transitive sentences with aussi/também. Using the same methodology as Dautriche et al.'s (2014) experiment with left-dislocated sentences with known verbs, we aimed to investigate children's performance with a syntactic structure that has not been studied before. We presented them with two simultaneous videos, one depicting an interpretation of the simple transitive sentence (one-agent video - a tiger hitting a duck and a bunny with a stick alternately, for the example in (38a) below), and another one depicting an interpretation of the stripping sentence (two-agent video - a tiger and a duck hitting a bunny with a stick, for the example in (38b)), while playing one of the two types of sentence. We then measured children's looking time to each video.
For the older group of children, we also asked them to point towards the video that "the lady told them to look at" (montre-moi quelle vidéo la dame t'a dit de regarder) or that "the lady was talking about" (mostra pra mim de qual vídeo a moça tava falando), as we explain in the sections below.

38) a) Le tigre tape le canard aussi ! – The tiger hits the duck too! b) Le tigre tape ! Le canard aussi ! – The tiger hits! The duck too!

This chapter is organized as follows: first, I present a brief description of the test sentences, and their possible implications for the experiment. Then I refer to the methodology, the results and a brief discussion of the experiment conducted with French-learning children. Afterwards, I do the same for the experiment conducted with BP-learning children, and I conclude the chapter with a final discussion, in which I attempt to compare the results found for each language.

3.1 TEST SENTENCES

This experiment contains four test trials, each presenting a different known verb: manger/comer ("eat"), taper/cutucar ("hit"/"poke"), pousser/empurrar ("push") and porter/carregar ("carry"). Half of the children tested listened to these verbs inside transitive sentences such as (38a), and the other half listened to them inside stripping sentences such as (38b) above. These verbs were the same ones used in Dautriche et al.'s (2014) experiment. Their experiment reported good results, and we wanted to keep the present experiment as similar as possible to it. For the same reason, we used the Portuguese translation of these verbs for the experiment with BP-learning children, except for taper, which was replaced by cutucar, since the more direct translation bater is a verb that requires a preposition in BP. Dautriche et al. chose the four verbs above because they are very frequent in French children's input (according to a study conducted by the authors), and so should be easily understood by them. However, there are some differences in these verbs' underlying structures and in how freely they can appear without a phonetically realized object. This is an issue for our experiment because our test sentences, like Dautriche et al.'s left-dislocated sentences, present the chosen verbs both with a realized object (in the transitive sentences) and without one (in the stripping sentences). The main difference between the verbs chosen for the experiment is that, while manger/comer, like "eat" in English, can alternate between a transitive reading when accompanied by a direct object (e.g. I eat apples) and an unergative reading when in a sentence without a direct object (e.g. I eat very little), the other three verbs are canonically transitive.
This means that there are differences in the underlying structure of the stripping sentences with manger/comer and with the other verbs; while the former might be represented as intransitive, the latter must be interpreted as transitive constructions with null objects or null clitics. Different structures might lead to different performances, as children might interpret some structures more easily than others. As seen in section 2.4, while BP is a language that allows for null objects in a wide variety of contexts (e.g. CYRINO & LOPES, 2004; LOPES, 2009), French null clitic constructions are clearly marked and present an underlying structure different from the ones with null objects in BP (e.g. CUMMINS & ROBERGE, 2004; GRÜTER, 2009). This could predict a better performance of BP-learning children with the strictly transitive verbs, as they should be more used to null object sentences than French-learning children. Another relevant aspect of BP referential null object constructions is that they are more or less restricted to [-animate] referents. Studies on French do not mention the same restriction, but they do mention a preference for [-human] referents of null clitics. In our test sentences, we use puppets as the referents of null objects for three verbs (pousser/empurrar (a monkey pushes a dinosaur inside a wagon), taper/cutucar (a tiger pokes a bunny) and manger/comer (a tiger eats a duck)) and an inanimate object for one verb (porter/carregar (a duck carries a present)). These referents are evident from the visual context (the videos that show the two possible interpretations of the test sentences) and from the preceding linguistic context: introduction videos are presented before each test trial, showing the first-mentioned character performing the named action on another character or object, while the action and participants are named once (e.g. Regarde le tigre ! Et le canard ! Il le tape ! / Olha o tigre! E o coelho! Ele tá cutucando ele! ("Look at the tiger! And the bunny! He is hitting/poking him!")). It is possible that children in our experiment would behave differently when exposed to sentences with an animate referent than to sentences with an inanimate referent, performing better with the latter.

Finally, even though we showed the possibility of null object (or null clitic) constructions such as the ones used in our experiment in both BP and French in section 2.4, it is still possible that at least some of the sentences we created are not accepted by native speakers. Perhaps, for instance, all stripping sentences with [+animate] referents are unacceptable. For French, it is still possible that all our stripping sentences with strictly transitive verbs could be unacceptable. However, we have reasons to believe that this is not the case. Dautriche et al. (2014) showed that their left-dislocated constructions, which involved the same transitive verbs with null clitics, are accepted by adults: not only were they created by a native French speaker and judged as acceptable by peers, but 10 French-speaking adults correctly interpreted these sentences. Our stripping sentences were also judged as acceptable by French and Brazilian peers and were correctly interpreted by 38 native French speakers (see section 3.2.1.2) and 29 native BP speakers (see section 3.3.1.2). Also, 25 French speakers did our Preferential Looking task as a control group, and did not seem to be surprised by the fact that there were transitive verbs being used in sentences without a direct object and with an animate referent (although we did not explicitly ask them to judge the acceptability of the sentences). Furthermore, we also conducted a reading aloud experiment containing sentences such as the ones used in the experiment, and participants did not seem to be surprised by the test sentences. None of the arguments stated above, however, show that the test sentences created are completely accepted by speakers. Mature speakers interpret "bad" sentences all the time, since they naturally occur in the input (e.g., in children's or foreign speakers' utterances, or in performance errors), and their strangeness is often ignored for the sake of communication. However, even if the chosen sentences are not completely accepted by speakers, this does not mean that the present experiment is not valid; its purpose is not to assess children's full comprehension of the test sentences, but rather their ability to use prosodic boundary information to tell apart stripping from simple transitive sentences. That being said, it is still possible that the strangeness of some test sentences could have an impact on children's performance on the task. To address this possibility, as an exploratory analysis, we also analyze children's performance by verb. This, however, will not be our main analysis, since other factors not connected to the underlying structures of the test sentences could also influence children's performance, such as higher salience of one test video over its pair. Moreover, the analysis per item has less statistical power, as the data are spread across four more variables.

3.2 FRENCH-LEARNING CHILDREN

3.2.1 Method

All French experiments were approved by the CERES (Conseil d’évaluation éthique pour les recherches en santé) as part of a previously approved research project submitted by Anne Christophe (IRB number: 20140100001072). The method, analysis, and criteria for exclusion of participants in this experiment were preregistered in the OSF (Open Science Framework) database. The formal preregistration, as well as the materials, collected data, and data analysis, are freely available to readers through the following link: https://osf.io/spdgm/?view_only=26e20e2dd13c46ab80892b4fdc37cb70


3.2.1.1 Participants

For the Preferential Looking task (see section 3.2.1.3), fifty-one 3-to-4-year-olds and forty-eight 28-month-olds were included in the final analysis. In the 3-to-4-year-old group, twenty-six participants were assigned to the transitive condition (Mage = 40.2 months, range 36.3 to 46.0, 14 girls) and twenty-five to the stripping condition (Mage = 40.1 months, range 36.2 to 48.4, 10 girls)36. In the 28-month-old group, twenty-four participants were assigned to the transitive condition (Mage = 27.8, range 27 to 28.8, 12 girls) and twenty-four to the stripping condition (Mage = 27.8, range 27.1 to 28.7, 10 girls). All children were monolingual native French speakers and were exposed to other languages less than 30% of the time at their homes. Their parents signed an informed consent form and filled in a vocabulary survey with the nouns, verbs and the adverb used in the test sentences37. 92% of the parents of the older group and 72% of the parents of the younger group reported their children knew all the words from the test sentences. 3% of parents from both age groups reported their children did not know aussi ("too"), and the nouns and verbs were reported to be known by 96% of children or over, except for dinosaure (i.e., dinosaur), which was reported to not be known by 24% of children of the younger group38. An additional thirty-one children (twenty-one 3-to-4-year-olds and ten 28-month-olds) were tested but not included in the analysis. Participants were excluded because of insufficient eye data (5 children from the younger group and 9 children from the older group)39; fussiness (2 younger and 2 older children)40; refusal to participate (2 children from the older group); parental intervention (3 children from the younger group)41; and experimenter error (8 children from the older group)42. The children who were not included in the analysis because of missing eye-tracking data or experimenter error did not have their pointing data discarded, as the missing eye-tracking data were not judged to be caused by children's lack of attention to the videos, and the experimenter errors did not influence their offline pointing responses43.
For the pointing task, conducted only with the older group, thirty-seven children were included in the final analysis, since not all children pointed. From this group, nineteen participants were assigned to the transitive condition (Mage = 40.7 months, range 36.1 to 48.5, 11 girls), and eighteen to the stripping condition (Mage = 41.4 months, range 36.2 to 48.4, 10 girls). An additional eight children pointed, but had their data discarded due to data loss44.

36 We chose a between-subjects design to avoid the perseveration biases noticed in previous works using within-subject designs, such as Snedeker & Trueswell (2001), Halbert et al. (1995) and Vogel & Raimy (2002).
37 See Appendix C.
38 For the older group, 1.6% of the parents reported that their child did not know the word dinosaure; another 1.6% reported that their child did not know the word tigre (i.e., tiger); and 1.6% reported that their child did not know the verb porter (i.e., carry). For the younger group, 3% of the parents reported that their child did not know the word tigre; 24% reported that their child did not know the word dinosaure; and 7% reported that their child did not know the word singe (i.e., monkey).
39 All trials with less than 75% of eye data (i.e., in which children looked away and/or in which the eye-tracker couldn't track the eye for 25% or more of the time of the trial) were excluded from the statistics, and children with more than two excluded test trials had all their looking data discarded.
40 The experimenter judged if children were too fussy by watching the recorded videos of their test sessions, and observing their level of concentration during the test trials; if they seemed to not pay attention to the test sentences, by talking, crying or looking away during most of the test trials, or to show too many signs of boredom or discomfort, the data was discarded. The recorded videos did not show the screen where the stimuli were presented, which prevented possible biases in discarding data from children that looked longer towards the incongruent video.
41 Even though we instructed them not to do so, some parents pointed at the screen or talked to the children during the test trials.
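The eye-data exclusion rule stated in footnote 39 can be illustrated with a short sketch. This is only an illustration in Python, not the analysis script used in the study: the function names and the example proportions are hypothetical, and treating a trial with exactly 75% of tracked data as usable is an assumption, since the footnote does not settle that boundary case.

```python
from typing import List

# Thresholds taken from footnote 39; the names are hypothetical.
MIN_TRACKED = 0.75        # a trial needs at least 75% of usable eye data
MAX_EXCLUDED_TRIALS = 2   # more than two excluded test trials -> child discarded


def usable_trials(tracked: List[float]) -> List[bool]:
    """Flag each test trial as usable (True) or excluded (False).

    `tracked` holds, per trial, the proportion of the trial during which
    the eye-tracker recorded the child's gaze.
    Assumption: a trial with exactly 75% of tracked data is kept.
    """
    return [proportion >= MIN_TRACKED for proportion in tracked]


def keep_child(tracked: List[float]) -> bool:
    """Keep a child's looking data unless more than two trials are excluded."""
    n_excluded = sum(1 for ok in usable_trials(tracked) if not ok)
    return n_excluded <= MAX_EXCLUDED_TRIALS


if __name__ == "__main__":
    # Hypothetical child with four test trials, two of them below threshold.
    child = [0.90, 0.50, 0.60, 0.80]
    print(usable_trials(child), keep_child(child))
```

With four test trials per child, this rule discards a child's looking data only when three or all four trials fall below the threshold.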

We also tested 25 native French-speaking adults as a control group.

3.2.1.2 Materials

Four pairs of test videos were created using animal puppets. As described above, the four actions were pousser ("push"), porter ("carry"), manger ("eat") and taper ("hit"). Each test trial started with an introduction, where the first character in the test sentences appeared performing the action on a third character, accompanied by a sentence such as Regarde ! C'est le tigre ! Et le lapin ! Il le tape ! ("Look! It's the tiger! And the rabbit! He is hitting him!") for the "hit" action. These videos were intended to familiarize children with the verbs used in the test sentences and also offered a necessary context for the transitive test sentences (one usually does not say "the tiger hits the duck too" without a preceding context where the tiger was hitting somebody else). Regarding the test videos, for each video pair, one corresponded to the simple transitive interpretation and the other one to the stripping interpretation of the test sentences (see fig. 4 for an example). For manger, pousser and taper, both test videos contained three animals, but in the simple transitive videos only one animal was the agent and the other two were patients, while in the stripping videos two animals were agents. For the porter action, the transitive video contained two animals and a gift box (one animal was the agent, and the other animal and the gift box were patients (i.e., the things being carried)). The stripping video contained the same animals but two gift boxes (the patients of the individual "carry" actions)45. The quantity and quality of movements, as well as the characters' positioning, were controlled for each video pair so that both videos were as similar as possible, as they should be equally attractive to children. Since the videos were more complex than the ones in Dautriche et al. (2014), which contained at most two characters and only one agent, we also increased the test trial times from 9s to 12s, and the preview and contrast trial times from 7s to 9s, compared to their experiment, so children would have more time to inspect the videos.

42 There were two types of errors: one was when the experimenter accidentally pointed at one side of the screen during a test trial, in order to turn children's attention towards the screen. The other one occurred because the experimenter wrote the wrong onset times on the experiment script and had to discard the data for the first 7 children, who were tested using this flawed script.
43 Except for the experimenter error caused by the experimenter pointing at the screen during a test trial.
44 The video recordings of the children were lost, so the experimenter could not double-check their pointing answers.

Figure 4 - Example of test videos for the verb "hit" for Experiment 1.

The sentences used in the experiment were recorded by Anne Christophe in a sound-attenuated booth, using a recorder and a condenser microphone, and were included in the videos using Filmora video editor46. The test sentences were judged by 38 French-speaking adults via an online form, made with Google Forms47. Participants were asked to listen to sentences through headphones, and choose the best interpretation for them from a 5-point scale, 1 being the simple transitive interpretation of the sentences (e.g. le tigre tape le canard), 5 being the stripping interpretation (e.g. le tigre tape et le canard tape) and 3 being completely ambiguous.

45 See Appendix B for a better illustration of each stimulus.
46 https://filmora.wondershare.com/
47 Adults were contacted through a mailing list, and received an e-mail inviting them to participate in a "sentence-judging task" (activité de jugement de phrases). The form started with the following introduction in French: "Welcome! This survey aims to investigate how French-speaking adults interpret ambiguous sentences. For each phrase, there is no right or wrong answer, we simply wish to know your interpretation of the sentences as a native French speaker.". On top of each 5-point scale, the following instruction was also given: "Indicate in the scale below the best interpretation for the sentence above. If you think the sentence is ambiguous, use the intermediate points (2 to 4)". Along with the test sentences, we also showed participants Dautriche et al.'s left-dislocated sentences (also recorded by Anne Christophe), which served as distractor items.


The sentences were interpreted correctly 89% of the time48 (i.e., the stripping test sentences were interpreted as coordinated sentences with two equivalent actions, and the transitive sentences were interpreted as simple sentences in which the first DP referred to the agent and the second DP referred to the patient).

3.2.1.2.1 Acoustic analyses

To assess the prosodic differences between stripping and simple transitive sentences, we conducted acoustic measurements around the intonational phrase boundaries of the sentences recorded for the experiment. We analyzed three acoustic cues which usually signal phrase boundaries (NESPOR & VOGEL, 2007): presence and length of pauses; lengthening of the stressed vowel of the word that precedes the boundaries; pitch (F0) contour at the end of the boundaries. For the last two measures, we compared the two types of sentence through t-tests using the z-score49 value of duration and pitch. In order to make the two sentence types maximally different prosodically, so that the transitive sentences could not be interpreted as stripping sentences, we chose to mark a phonological phrase boundary between the first DP and the verb in the transitive sentences (as in [le tigre] [tape le canard aussi]). This restructuring strategy is allowed because the speaker was instructed to use child-directed speech, which has a slower speech rate compared to adult speech. So, the measures described above were applied to analyze both the boundary between the verb and the second DP in the stripping sentences and the one between the first DP and the verb in the transitive sentences. In addition, since aussi is deaccented in the transitive sentences but not in the stripping sentences, we conducted the same analyses between the second DP and aussi, to see if there was a significant difference between conditions, indicating a phonological phrase boundary in the transitive, but not in the stripping sentences. We found a significant difference in the duration of the stressed vowel of the first DP between conditions, with transitive sentences presenting a longer duration (M = 4.97; SD = 1.31; versus M = .24; SD = .78 for the stripping sentences; t(14) = -8.760; p < 0.001), but no consistent pause between this word and the verb. Between the verb and the second DP, we found a consistent pause of .593s on average for the stripping sentences, and, as expected, no pause for the transitive sentences. We also found a difference in verb lengthening (average duration z-score of the verb's stressed vowel minus duration z-score of previous vowels) between conditions (M = -3.33; SD = 1.97 for the transitive condition, versus M = 1.95; SD = 2.52 for the stripping condition; t(14) = 4.66; p < .001). We found no significant pitch contour differences (average F0 z-score of the verb's stressed vowel minus F0 z-score of previous vowels) between the transitive (M = .68; SD = 1.12) and the stripping sentences (M = .73; SD = .65; t(14) = .10; p = .9). For the possible phonological phrase boundary between the second DP and aussi in the transitive sentences, we found a significant difference in vowel lengthening between conditions (M = 2.98; SD = .91 for the transitive sentences, versus M = -.98; SD = .85 for the stripping sentences; t(14) = -8.98; p < 0.001), and a significant difference in pitch contour (average F0 z-score of the DP's stressed vowel minus F0 z-score of preceding vowels) (M = 1.39; SD = .86 for the transitive condition, versus M = -.15, SD = .54 for the stripping condition; t(14) = -4.27; p = .001). The raw values of duration can be found in Table 1, and the raw values of F0 can be found in Table 2.

48 There were 1.86% of wrong or ambiguous close to wrong judgements, all for transitive sentences; 0.9% of completely ambiguous judgements (0.55% for stripping sentences and 0.33% for transitive sentences); and 7.67% of correct but slightly ambiguous judgements (1.97% for stripping sentences and 5.7% for transitive sentences).
49 Calculated through Lobanov's (1971, apud BARBOSA, 2012) normalization procedure. This procedure calculates normalized values through the raw values of F0 or duration minus the mean value for the entire sentence, divided by the standard deviation.
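The normalization in footnote 49 and the lengthening measure used above (z-score of the stressed vowel minus the mean z-score of the preceding vowels) can be sketched as follows. This is a minimal illustration in Python, not the script used for the analyses: the function names and the example durations are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev
from typing import List


def lobanov_z(values: List[float]) -> List[float]:
    """Normalize raw values (duration or F0) within one sentence.

    Each value has the sentence mean subtracted and is divided by the
    sentence standard deviation, as described in footnote 49.
    Assumption: the sample standard deviation (n - 1) is used.
    """
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


def boundary_measure(values: List[float], stressed_index: int) -> float:
    """Z-score of the stressed vowel minus the mean z-score of preceding vowels.

    `values` lists one raw measurement per vowel, in sentence order;
    `stressed_index` points at the stressed vowel before the boundary
    and must be at least 1 so that some vowel precedes it.
    """
    z = lobanov_z(values)
    return z[stressed_index] - mean(z[:stressed_index])


if __name__ == "__main__":
    # Hypothetical vowel durations (in seconds) for one sentence.
    durations = [0.10, 0.12, 0.20, 0.09, 0.10, 0.11, 0.13]
    # Index 2 is taken to be the verb's stressed vowel in this toy example.
    print(boundary_measure(durations, stressed_index=2))
```

A positive value indicates that the stressed vowel is longer (or higher in F0) than the vowels that precede it, relative to the variation within the sentence.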

Table 1 - Raw values of duration for the acoustic analysis of French test sentences (Experiment 1)

Stripping sentences: e.g. [le tigre tape] [le canard aussi]. Transitive sentences: e.g. [le tigre] [tape le canard aussi].

                                                       Stripping sentences        Transitive sentences
Measure                                                Duration (s)   SD (s)      Duration (s)   SD (s)
1st DP's stressed vowel (e.g. t[i]gre)                 0.121          0.04        0.292          0.06
Verb's stressed vowel (e.g. t[a]pe)                    0.201          0.09        0.095          0.03
Verb's preceding vowels (mean)                         0.108          0.03        0.211          0.09
Verb's stressed vowel - Verb's preceding vowels        0.094          0.11        -0.117         0.07
2nd DP's stressed vowel (e.g. can[a]rd)                0.096          0.01        0.25           0.03
2nd DP's preceding vowels (mean)                       0.131          0.03        0.146          0.03
2nd DP's stressed vowel - 2nd DP's preceding vowels    -0.035         0.04        0.104          0.03


Table 2 - Raw values of F0 for the acoustic analysis of French test sentences (Experiment 1)

Stripping sentences: e.g. [le tigre tape] [le canard aussi]. Transitive sentences: e.g. [le tigre] [tape le canard aussi].

                                                       Stripping sentences        Transitive sentences
Measure                                                F0 (Hz)        SD (Hz)     F0 (Hz)        SD (Hz)
Verb's stressed vowel (e.g. t[a]pe)                    364            16          347            46
Verb's preceding vowels (mean)                         315            46          316            53
Verb's stressed vowel - Verb's preceding vowels        49             45          31             52
2nd DP's stressed vowel (e.g. can[a]rd)                308            30          389            28
2nd DP's preceding vowels (mean)                       309            17          324            37
2nd DP's stressed vowel - 2nd DP's preceding vowels    -1             37          64             36

Figure 5 - Soundwave and pitch for one stripping sentence (Le tigre tape ! Le canard aussi !, top) and one simple transitive sentence (Le tigre tape le canard aussi !, bottom).


3.2.1.3 Procedure (footnote 50)

Parents or caregivers were contacted through a mailing list and invited to come to the Babylab of LSCP (Laboratoire de Sciences Cognitives et Psycholinguistique) on a set date. Once they were at the laboratory, the experimenter asked them to sign a consent form and explained the procedure, while the child played with some toys. Caregivers were told they would sit in front of a TV screen inside a booth, with the child on their lap, and should remain as still as possible, not point at the screen, and talk to the child as little as possible, so as not to influence the child's preference for one or the other side of the screen.

Once the child was in the experimental booth, the experimenter put a sticker with a high-contrast pattern on the child's forehead, which served as a reference for the eye position for the eye-tracker. The experimenter then adjusted the eye-tracker camera while the child watched a cartoon. After that, the caregiver was given headphones with masking music, so that they could not hear the test sentences and feel tempted to influence the child's interpretation. Then a five-point calibration of the eye-tracker began, followed by the experiment, run in Matlab (Mathworks, Natick, MA). For the older children, the experimenter stayed inside the experimental booth with the parent and the child, sitting in a lower chair behind and slightly to the child's right, also wearing headphones, so that the pointing responses could be requested; for the younger ones, the experimenter stayed outside the booth and watched the child through a camera.

The videos were presented on a 27-inch television, and the auditory stimuli were played through two speakers positioned on either side of the screen. The left-right disposition of the videos was pseudo-randomized so that children did not see more than two consecutive target videos on the same side. The order of presentation of the actions was also randomized, to avoid possible order biases in children's responses. The pointing responses were recorded by the experimenter through a keyboard, and the eye data were collected via an Eyelink 1000 eye-tracker.

The experiment started with an introduction of the animal characters, which appeared individually, waving and dancing on the screen for five seconds, and were named once during this period. This was followed by one training trial, with the actions "jump" (a bunny jumps up and down by itself) and "play" (the bunny and a monkey play a hand game together). Each training and test trial started with a preview phase, where each action appeared individually on the screen for nine seconds, followed by a contrast phase, where both actions appeared at the same time, also for nine seconds, with neutral attention-getters such as Regarde ! Tu vois ça ? ("Look here! Do you see that?"). At the end of the contrast phase, the videos were replaced by a colorful circle in the middle of the screen, and the first test sentence was played, in the future tense (e.g. Attention: le tigre va manger ! Le dinosaure aussi !/ le tigre va manger le dinosaure aussi ! "Attention: the tiger will eat! The dinosaur too!/ the tiger will eat the dinosaur too!"). Then the test videos appeared on the screen for twelve seconds, while the test sentence was played two more times, but in the present tense (e.g. Le tigre mange ! Le dinosaure aussi !/ Le tigre mange le dinosaure aussi !). Then the videos froze on the screen, and the experimenter asked older children to point at "the video the lady was talking about". When children refused to point at one of the videos, the experimenter encouraged them by saying she had not heard the lady (since she was wearing headphones) and by pointing at both videos herself, while asking "was it this one or this one?". If children still did not want to point, the experimenter proceeded with the experiment anyway, and pointing was recorded as missing. Pointing responses were always praised, regardless of whether they were correct or not.

After the training trial, four test trials were displayed following the same structure as the training trial, but this time beginning with the verb introduction videos (see Figure 6). All test and contrast trials were preceded by a fixation trial, where a colorful circle appeared right in the middle of the screen, which served the purpose of directing children's gaze to the center of the screen. The experiment only proceeded after children fixated on the circle. This was implemented so that children would not fixate on one side of the screen for the entire contrast and test trials. Also, all test trials ended with the image of a laughing baby and the sound of laughter, which served the purpose of reengaging children's attention to the videos and making them look at the center of the screen. Overall, the experiment took about five minutes, but could be longer depending on how much time children took to point during the pointing trials (footnote 51) or to look at the circle during the fixation trials.

Once the experiment was over, caregivers were asked to fill in the specific vocabulary questionnaire and a CDI (footnote 52). Then the experimenter explained the experiment's goals to them. Children took home a diploma of "honorary member of the Babylab" as a gift for their participation.

50 For a detailed view of the experiment structure, see Table 13 in Appendix B.

51 If children refused to point, the experimenter tried to convince them in the first two trials, and this could take up to 30 seconds. After the first two trials, the experimenter still asked for the pointing responses once per trial but did not insist.

52 We presented 28-month-olds' parents with a short French version of the MacArthur Communicative Development Inventories form for assessing young children's communicative skills (https://mb-cdi.stanford.edu/). The complete forms used are in Appendix C.
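The side constraint described in the procedure above (no more than two consecutive target videos on the same side) can be illustrated with a short sketch. The experiment itself was run in Matlab; the Python code below is only an illustration, and the function names `pseudo_random_sides` and `longest_run` are hypothetical, not taken from the original scripts.

```python
import random

def pseudo_random_sides(n_trials, max_run=2, rng=None):
    """Return a list of 'L'/'R' target sides with no streak longer than max_run."""
    if rng is None:
        rng = random.Random()
    sides = []
    for _ in range(n_trials):
        options = ["L", "R"]
        # If the last max_run trials all used the same side, exclude that side.
        if len(sides) >= max_run and len(set(sides[-max_run:])) == 1:
            options.remove(sides[-1])
        sides.append(rng.choice(options))
    return sides

def longest_run(sides):
    """Length of the longest streak of identical consecutive entries."""
    best = current = 0
    previous = None
    for side in sides:
        current = current + 1 if side == previous else 1
        best = max(best, current)
        previous = side
    return best

# Example: target sides for four test trials (assumed trial count for illustration).
sides = pseudo_random_sides(4, rng=random.Random(0))
print(sides, longest_run(sides))
```

Building the sequence trial by trial keeps the constraint easy to verify, at the cost of not sampling all valid sequences with exactly equal probability.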


| Duration | Phase | Audio |
|---|---|---|
| 10s | Verb introduction phase | "Oh, regarde ! C'est le tigre ! Et le lapin ! Il le tape !" |
| 3s | (no label in the figure) | |
| 9s | Preview phase | "Oh, regarde ! Tu vois ça ?" |
| 1s | (no label in the figure) | |
| 9s | Preview phase | "Oh, regarde ! Tu vois ça ?" |
| 1s | (no label in the figure) | |
| 9s | Contrast phase | "Et là, regarde ! Tu as vu ?" |
| 6s | Informative audio prompt | "Attention: Le tigre va taper ! Le canard aussi !/ Le tigre va taper le canard aussi !" |
| 12s | Test phase | "Oh, regarde ! Le tigre tape ! Le canard aussi !/ Le tigre tape le canard aussi !" (2x) |

Figure 6 - Structure of a test trial for Experiment 1. The clock icon followed by the number illustrates how many seconds each trial lasted and was not present in the actual videos.


3.2.1.4 Data analysis

Eye-tracking data were analyzed with a cluster-based permutation analysis (Maris & Oostenveld, 2007) run on the proportion of looks to the two-agent action over time, complemented by a t-test with the averaged overall looking time towards the two-agent action as the dependent variable. The cluster-based permutation analysis allows us to detect any time windows during the test phase in which a significant effect of condition is observed, indicating that children listening to the stripping sentences looked longer at the two-agent action than children listening to the simple transitive sentences. This analysis takes children's average proportion of looks towards the two-agent video (in our case, looks towards the two-agent video divided by looks towards the two-agent video plus looks towards the one-agent video) every few milliseconds, and then runs t-tests to find the time points at which there is a difference between conditions. Since children listened to the test sentences once before the beginning of the test trials, they could show a preference for one video from the beginning of the trials; therefore, we searched for time windows across the whole test trial (0 to 12000ms). Adjacent time points with a t-value greater than 1.5 (for the comparison between groups) were grouped together into a cluster, and the probability of observing a cluster of the same size by chance was estimated by running the same analysis 1000 times on simulated data, in which groups were randomly assigned to participants. The Eyelink eye-tracker registers one looking sample every 2ms; we first down-sampled the data by averaging to one sample every 20ms. For this analysis, we used R (R Core Team, 2017) and the package eyetrackingR (Dink & Ferguson, 2016). Our second analysis examined the difference in looking times averaged over the whole duration of the trial, with a t-test testing for the effect of condition.
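To make the logic of the cluster-based permutation analysis concrete, the following is a minimal sketch in plain Python. It is not the eyetrackingR implementation used for the thesis. It assumes one time course of proportions per participant, uses a one-sided Welch t statistic per time bin, takes the sum of t-values as the cluster statistic, and evaluates only the largest cluster; all function names are hypothetical. Each group needs at least two participants.

```python
import math
import random
from statistics import mean, variance

def downsample(samples, factor=10):
    """Average consecutive groups of `factor` samples (e.g. 2 ms -> 20 ms bins)."""
    return [mean(samples[i:i + factor])
            for i in range(0, len(samples) - factor + 1, factor)]

def welch_t(a, b):
    """Welch t statistic for two independent samples (0.0 if there is no variance)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0

def cluster_masses(group_a, group_b, threshold=1.5):
    """Sum t-values over each run of adjacent time bins where t > threshold."""
    masses, current = [], 0.0
    for k in range(len(group_a[0])):
        t = welch_t([p[k] for p in group_a], [p[k] for p in group_b])
        if t > threshold:
            current += t
        elif current > 0:
            masses.append(current)
            current = 0.0
    if current > 0:
        masses.append(current)
    return masses

def cluster_permutation_test(group_a, group_b, threshold=1.5, n_perm=1000, seed=0):
    """Return (largest observed cluster mass, permutation p-value)."""
    rng = random.Random(seed)
    observed = max(cluster_masses(group_a, group_b, threshold), default=0.0)
    pooled = group_a + group_b
    exceed = 0
    for _ in range(n_perm):
        shuffled = pooled[:]
        rng.shuffle(shuffled)  # randomly reassign participants to conditions
        perm_a, perm_b = shuffled[:len(group_a)], shuffled[len(group_a):]
        if max(cluster_masses(perm_a, perm_b, threshold), default=0.0) >= observed:
            exceed += 1
    return observed, exceed / n_perm
```

Here `group_a` and `group_b` would hold, for each participant in the stripping and transitive conditions respectively, the down-sampled proportion of looks to the two-agent video in each time bin. Shuffling whole participants between groups keeps each child's time course intact, which is what allows the test to control for the many correlated time points.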

For the pointing responses, we ran a mixed-effects regression with the pointing response (towards the stripping vs. the transitive video) as the dependent variable, condition as a fixed effect, and participants as a random effect with a random slope for condition (footnote 53).

53 We used the maximal random effect structure that allowed the model to converge.


3.2.1.5 Experimental hypothesis and predictions

Our experimental hypothesis was that 28-month-old and 3-to-4-year-old French-learning children can use the prosodic boundary cues in stripping sentences to interpret them differently from simple transitive sentences with aussi. If, however, they do not understand the prosodic boundary information in stripping sentences, they should interpret them as simple transitive sentences, and look longer/point more at the one-agent action video (i.e., the video that best describes the simple transitive sentences).

Following the predictions and results of previous experiments (DE CARVALHO ET AL., 2015; DE CARVALHO, 2017; DAUTRICHE ET AL., 2014; HAVRON ET AL., 2019), if we only found a significant effect of condition in a specific time cluster (for instance, from 100ms to 1000ms), but not in the average total looking time, we would interpret this finding as showing that children in the stripping condition interpret the sentences they hear differently from children in the transitive condition, but that this difference is only visible in a specific time window. If we found a significant effect of condition in the comparison of the average total looking time, but no significant time clusters, we would still reach the same conclusion, but we would assume children's preference is not restricted to one specific time cluster.

For the same reasons stated above, we expected children in the stripping condition to point more towards the two-agent videos than children in the simple transitive condition, but we did not expect them to point only towards the stripping videos, as their answers could be influenced by other factors, such as a preference for one video, little inhibitory control (i.e., the ability to suppress other actions that are not relevant for the task), little confidence in their answers, low attention span, and so on.
If children showed a significant difference between conditions in the pointing task, but not in the looking task, we could conclude that children do interpret the sentences they hear correctly, but we would need to conduct exploratory analyses to investigate why they did not show this preference in the looking data.

Finally, if children did not show any difference between conditions, or if their preference was inverted (i.e., children in the simple transitive condition looked longer/pointed more towards the two-agent videos than children in the stripping condition), we would conclude that children in the proposed experiment did not interpret the stripping test sentences as expected. If we ended up with one of these outcomes, we could also conduct exploratory analyses to figure out whether other factors unrelated to comprehension of the sentences could have influenced the results, such as individual or collective biases towards one or another video, higher complexity of the videos, low effect size and so on.

In any of the possible outcomes, exploratory analyses could be conducted to further investigate the results found. We will clearly indicate all exploratory analyses conducted, so the reader can tell them apart from the predicted (preregistered) analyses we chose when designing the experiment.

3.2.2 Results

3.2.2.1 Control group

Adults looked significantly longer towards the two-agent actions when exposed to the stripping sentences (M = .88, SD = .15) than when exposed to the transitive sentences (M = .27, SD = .18) (t(21.15) = 8.81, p < .001, Cohen's d = 3.59; see footnote 54). We also found a significant time cluster between 300ms and 12000ms (p < .001) where participants in the stripping condition looked longer at the two-agent actions than participants in the transitive condition. They also pointed at the one-agent action when exposed to transitive sentences and at the two-agent action when exposed to the stripping sentences in almost every trial, with only four errors (four test trials from four different participants: two for manger sentences (one transitive and one stripping), one for pousser transitive and one for taper stripping). The figures related to the average looking and pointing results can be found in Appendix A. The time-course of looking times, and the results of the cluster-based analysis, are shown in Figure 7 below:

54 Values of reference for effect size magnitude (COHEN, 1992): d < 0.2 = negligible; d < 0.5 = small; d < 0.8 = medium; otherwise = large.


Figure 7 - Proportion of looks towards the two-agent action through the whole test trial (12 seconds) in the stripping (green line) and transitive (orange line) condition for the control group of Experiment 1. The significant time cluster is indicated by the gray box with its respective p-value. The average time of appearance of each word in the test sentences is indicated in the gray boxes below the graph (the upper boxes show the stripping sentences, and the boxes below show the transitive sentences).

In the figure above, the x axis shows the time of the test trial in milliseconds, and the y axis shows the proportion of looks towards the two-agent video (0 means no looks to the two-agent video, and 1 means only looks to the two-agent video). The green line shows the proportion of looking time to the two-agent video for the stripping condition, and the orange line shows the same for the simple transitive condition. The horizontal black line marks a proportion of .50 (i.e., participants looking towards the target during half of the trial), and the black vertical line marks the beginning of the test trial. As we can see, the proportion of looks for the simple transitive condition is below .38 and the one for the stripping condition is above .75 for most of the trial. What this figure tells us is that adults in both conditions looked towards the two videos during the whole trial, as no time windows show a proportion of 0 (no looks to the two-agent video) or 1 (100% of looks to the two-agent video), but those in the stripping condition looked longer towards the two-agent videos, whereas those in the transitive condition looked longer towards the one-agent videos.


3.2.2.2 Eye-tracking data: 3-4-year-olds

For the older children, the cluster-based analysis found two clusters with a significant difference in the proportion of looks between the two conditions. The first cluster started slightly before the onset of the test sentence and spanned 1920ms to 3820ms (p = .04). This can be explained by the fact that children had already heard the test sentence once, right before the beginning of the test trial. The second cluster started at 3980ms and lasted until 8080ms (p < .001). This latter time window coincides with the onset of the target verb during the test phase. Figure 8 shows the time course of the proportion of looking time during the test trial. The significant time clusters are delimited by the gray boxes with their respective p-values.

Figure 8 - Proportion of looks towards the two-agent action through the whole test trial (12 seconds) in the stripping (green line) and transitive (orange line) condition for French-learning 3-4-year-olds (Experiment 1). The significant time clusters are indicated by the gray boxes with their respective p-values. The average time of appearance of each word in the test sentences is indicated in the gray boxes below the graph (the upper boxes show the stripping sentences, and the boxes below show the transitive sentences).

As we can see in the figure above, the proportion of looks in the simple transitive condition is slightly below .50 for almost the whole trial, and the proportion of looks in the stripping condition is above .50 for the entire trial, with the exception of the first 50ms. What this tells us is that children looked towards the two videos during the entire test trial, but those in the stripping condition show a slight preference for the two-agent action, while those in the transitive condition show a slight preference for the one-agent action.


With regard to the overall looking behavior, the results show that children who heard stripping sentences looked significantly longer towards the two-agent videos (M = .62, SD = .08) than children who heard the transitive sentences (M = .47, SD = .13; t(40.5) = 4.92; p < .001; Cohen's d = 1.36).
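The fractional degrees of freedom reported above are what a Welch (unequal-variance) t-test produces, so the sketch below assumes that test; the thesis does not name the variant explicitly. It computes Welch's t, its degrees of freedom, and Cohen's d with a pooled standard deviation, and labels the effect size using the thresholds listed in footnote 54. The per-child proportions are hypothetical values for illustration only, not data from the experiment.

```python
import math
from statistics import mean, variance

def welch_t_test(a, b):
    """Return (t, df) for Welch's unequal-variance t-test."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

def cohens_d(a, b):
    """Cohen's d using the pooled standard deviation (an assumed variant)."""
    pooled_var = (((len(a) - 1) * variance(a) + (len(b) - 1) * variance(b))
                  / (len(a) + len(b) - 2))
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)

def effect_size_label(d):
    """Label |d| with the thresholds given in footnote 54."""
    d = abs(d)
    if d < 0.2:
        return "negligible"
    if d < 0.5:
        return "small"
    if d < 0.8:
        return "medium"
    return "large"

# Hypothetical per-child proportions of looks to the two-agent video.
stripping = [0.62, 0.70, 0.55, 0.66, 0.58]
transitive = [0.47, 0.40, 0.52, 0.45, 0.51]
t, df = welch_t_test(stripping, transitive)
d = cohens_d(stripping, transitive)
print(round(t, 2), round(df, 1), round(d, 2), effect_size_label(d))
```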

Figure 9 - Average proportion of looking time towards the two-agent action in the stripping (green box) and transitive (orange box) condition for French-learning 3-4-year-olds in Experiment 1. Each purple dot represents the average looking time for one child. The horizontal gray line indicates the proportion of looks if participants looked towards the target during half of the trial (.50). The white dashed lines show the mean for each condition (.62 for the stripping condition and .47 for the transitive condition).

Exploratory analyses

In order to see if children's individual preferences towards one or another video could have influenced the results, we also compared the results with the proportion of looking time during the contrast phase, where children saw both test videos before listening to the test sentences. We ran a mixed-effects model with condition and phase (contrast vs. test) as fixed effects, a random intercept for participants, and a random slope for condition by item (footnote 55). We found no effect of condition (β = -.01, SE = .03, t(21.36) = -.29, p = .77) or phase (β = .05, SE = .03, t(241.86) = 1.70, p = .09), but a significant interaction between condition and phase (β = -.11, SE = .04, t(241.86) = -2.97, p = .003). Children in the stripping group looked longer towards the two-agent action after listening to the test sentences (overall proportion of looking time of .53 for the contrast phase, versus .58 for the test phase), whereas children in the transitive condition decreased their proportion of looking time towards this action (overall proportion of looking time of .52 for the contrast phase, versus .45 for the test phase).

55 We used the maximal random effect structure that allowed the model to converge.

Since the test sentences do not have the same underlying structure, as shown in section 3.1, we also conducted an analysis per item, predicting that if different verb structures had any influence on children's performance, children would perform better with the manger sentences, since this verb may take an unergative structure, and so the stripping sentences would be more acceptable than the ones with the other three verbs, which require a null object (or null clitic) interpretation. We performed an ANOVA with average proportion of looking time towards the two-agent action as the dependent variable and condition and item as independent variables, and found an effect of condition (F(1, 142) = 17.74; p < .001) and item (F(3, 142) = 9.22; p < .001), but no interaction between these variables (F(3, 139) = .17; p = .92). A post-hoc Tukey test revealed that, for the verb pousser ("push"), the overall proportion of looking time towards the two-agent action was significantly greater than for the other three verbs (p < .005 for the three comparisons with pousser, and p > .4 for all other comparisons).
This might be due to the fact that the two-agent action video for pousser presented more movement than the one-agent action video: in the former, each mentioned character was on one side of the screen, and they pushed the third character towards each other alternately, whereas in the latter the agent pushed the second and third characters in only one direction, from one side of the screen to the other. The larger amount of movement might have made the two-agent action more attractive to children, so that they ended up looking longer towards this video regardless of which condition they were in. Figure 10 below shows the overall proportion of looking time towards the two-agent action for the stripping (green bars) and transitive (orange bars) condition per verb. As we can see, children in the stripping condition showed a larger overall proportion of looking time towards the two-agent video than children in the transitive condition for all test items (see Table 3 below). The increased attractiveness of the two-agent action video for pousser did not seem to hinder children's performance in the task. Not only did the results for this item follow the expected pattern, but there is also a numerically greater difference between conditions in this item than in the other three items.


Table 3 - Average proportion of looking time towards the two-agent action per item for French-learning 3-4-year-olds in Experiment 1.

| Verb | Stripping condition | Transitive condition |
|---|---|---|
| Manger | .54 | .44 |
| Porter | .56 | .44 |
| Pousser | .72 | .56 |
| Taper | .49 | .38 |

Figure 10 - Average proportion of looking time towards the two-agent action for the stripping (green bars) and transitive (orange bars) condition per item for French-learning 3-4-year-olds (Experiment 1).

3.2.2.3 Pointing data

For the pointing analysis (Figure 11), the results show no significant difference between the transitive and stripping conditions in the proportion of pointing towards the two-agent action (β = -.60, SE = .50, z = -1.20, p = .23), although the results follow the expected direction, with children in the stripping condition pointing more towards the two-agent action than children in the transitive condition (55% of the time, versus 43% for the transitive condition). This pattern is maintained when we look at each verb separately, except for taper, which showed more pointing to the two-agent video for the transitive condition (51% of the time for the stripping condition, versus 61% for the transitive condition).

Table 4 - Average proportion of pointing towards the two-agent action for the stripping and transitive conditions per item. French-learning 3-4-year-olds, Experiment 1.

| Verb | Stripping condition | Transitive condition |
|---|---|---|
| Manger | .59 | .42 |
| Porter | .55 | .26 |
| Pousser | .55 | .42 |
| Taper | .51 | .61 |

Figure 11 - Left: Average proportion of pointing towards the two-agent action for French-learning 3-4-year-olds (Experiment 1). The white dashed horizontal lines show the mean for each condition (.54 for the stripping condition (green box) and .42 for the transitive condition (orange box)). The purple dots represent the proportion of pointing for each participant. The horizontal gray bar shows a .50 proportion (i.e., when children pointed towards the two-agent videos during half of the trials). Right: Average proportion of pointing towards the two-agent videos for stripping (green bars) and transitive (orange bars) condition per item.


3.2.2.4 Eye-tracking data: 28-month-olds

For the 28-month-olds, no significant difference in the proportion of looks between conditions was found in the cluster-based analysis, and only a marginally significant difference was found in the t-test on averaged overall looking times (t(46) = 1.83; p = .074; Cohen's d = .528). We observed, however, that the toddlers' looking behavior seems to go in the expected direction, with children in the stripping condition (M = .64, SD = .10) looking longer towards the two-agent action than children in the transitive condition (M = .59, SD = .10). Figure 13 shows the cluster-based analysis; Figure 12 shows the overall looking results.

Figure 12 - Average proportion of looking time towards the two-agent action in the stripping (green box) and transitive (orange box) condition for 28-month-olds (Experiment 1). Each purple dot represents the average looking time for one participant. The horizontal gray line indicates the proportion of looks if participants looked towards the target during half of the trials (.50). The white dashed lines show the mean for each condition (.64 for the stripping condition and .59 for the transitive condition).


Figure 13 - Proportion of looks towards the two-agent action through the whole test trial (12 seconds) in the stripping (green line) and transitive (orange line) condition for French-learning 28-month-olds.

In Figure 13 above we can see that the proportion of looks towards the two-agent action is above .50 during almost the entire trial for both conditions. This suggests that children in both conditions show a slight preference towards the two-agent videos. However, the proportion of looking time for children in the transitive condition is mostly below the one for children in the stripping condition, showing a tendency towards the same pattern of results observed for the 3-4-year-olds.

Exploratory analyses

In order to see if children's individual preferences towards one or another video could have influenced the results, as for the 3-4-year-olds, we ran a mixed-effects model with condition and phase (contrast vs. test) as fixed effects, a random intercept for participants, and a random slope for condition by item. We found no effect of condition (β = .03, SE = .03, t(39.26) = 1.00, p = .32) but an effect of phase (β = .10, SE = .03, t(282) = 3.39, p < .001) and an interaction between condition and phase (β = -.09, SE = .04, t(282) = -2.03, p = .04). Children in the stripping condition increased their overall looking time towards the two-agent action from contrast to test phase (proportion of .50 for the contrast, versus .60 for the test phase), whereas children in the transitive condition did not change their looking time between phases (.53 for the contrast, versus .54 in the test phase).


Regarding the analysis between items, as for the 3-4-year-olds, we performed an ANOVA with average proportion of looking time to the two-agent action as the dependent variable and condition and item as independent variables. We found a marginal effect of condition (F(1, 138) = 3.20; p = .08) and an effect of item (F(3, 138) = 4.57; p = .004), but no interaction between these variables (F(3, 135) = .21; p = .89), which suggests that there is no significant difference in the performance of children in each condition between test items. A post-hoc Tukey test revealed that, as for the 3-4-year-olds, the verb pousser presented a greater overall proportion of looking time towards the two-agent action, but only when compared to porter (p = .006) and taper (p = .05; p > .1 for all other comparisons). This is consistent with our suggestion that the two-agent action in the pousser videos is more salient than the one-agent action. Figure 14 shows the overall proportion of looking time towards the two-agent action for the stripping (green bars) and transitive (orange bars) condition per test item. As we can see, children's proportion of looking time is greater for the stripping than for the transitive condition in all test items, but for the verbs taper and porter this difference is numerically smaller (see Table 5).

Table 5 - Average proportion of looking time towards the two-agent action per item for French-learning 28-month-olds.

| Verb | Stripping condition | Transitive condition |
|---|---|---|
| Manger | .65 | .56 |
| Porter | .51 | .47 |
| Pousser | .69 | .61 |
| Taper | .54 | .52 |


Figure 14 - Average proportion of looking time towards the two-agent action for the stripping (green bars) and transitive (orange bars) condition per item for French-learning 28-month-olds (Experiment 1).

3.2.2.5 Age group comparison

In order to test the difference between the two age groups statistically, we performed an additional analysis comparing the overall proportion of looking time towards the two-agent action averaged across the whole trial (12s), with condition (Stripping vs. Transitive) and age group (28-month-olds vs. 3-to-4-year-olds) as between-participant factors, a random intercept for participant, and a random slope for condition by item (footnote 56). We found a significant effect of condition (β = -.061, SE = .016, t(30) = -3.677; p < .001) and age (β = .054, SE = .022, t(283) = 2.416; p = .016) but no interaction between condition and age group (β = .034, SE = .022, t(283) = 1.518; p = .13). These results show that children's proportion of looks during test was influenced by condition (stripping vs. transitive), with children in the stripping condition looking longer towards the two-agent video (M = .59, SD = .20) than children in the transitive condition (M = .49, SD = .21). They also show that, overall, the younger children looked for significantly less time towards the two-agent video (M = .51, SD = .20) than the older children (M = .57, SD = .21). However, we found no evidence that the effect of condition differed between the two age groups.

56 Maximal effect structure that allowed the model to converge. We used sum coding with the stripping condition in the older age group as the base level.

3.2.3 Discussion

In Experiment 1, we investigated French-learning children's ability to differentiate strip- ping sentences such as Le tigre tape ! Le canard aussi ! from simple transitive sentences such as Le tigre tape le canard aussi ! by relying mainly on their different prosodic structures, which indicate different syntactic relations between constituents. In order to do so, we ran a Preferen- tial Looking and Picture Selection experiment where we presented children with one of the two types of sentence, while displaying two simultaneous videos, one which corresponded to the simple transitive sentence (one-agent video), and another one which corresponded to the strip- ping sentence (two-agent video). For the 3-4-year-olds, the significant difference in looking time between conditions shows that children in the stripping condition interpreted the sentences differently than children in the simple transitive condition. For the 28-month-olds, we only found a marginally signifi- cant difference between conditions, with a tendency towards the expected direction. However, the comparison between age groups showed no interaction between condition and age, which means that both 3-4-year-olds and 28-month-olds performed in a similar manner. Furthermore, our first exploratory analysis with 28-month-olds revealed a significant interaction between condition and phase, showing that children exposed to stripping sentences looked significantly longer towards the two-agent action after listening to the test sentences (i.e., during test phase) than before listening to it (i.e., during contrast phase). These results suggest that the ability to tell apart transitive and stripping sentences is already present in 28-month-olds, although the older children performed better in the task. Despite the results found in our exploratory analyses, our results with 28-month-olds differ from the ones found in Dautriche et. al. 
(2014) with children of the same age, which show a stronger difference between conditions. One way of explaining this discrepancy is by suggesting that children have more trouble parsing stripping sentences than parsing left-dislocated sentences. This could be due to the apparently lower frequency of ellipsis constructions compared with dislocated structures. As seen in the Introduction, our search in the same child-directed speech corpus analyzed by Dautriche et al. showed that ellipsis sentences with aussi make up only .14% of all multiple-word utterances said to or around children, whereas sentences with DP dislocation make up 5% of utterances, which suggests that DP dislocation is much more frequent than ellipsis with aussi in French.

Even if ellipsis were as frequent as DP dislocation in children's input, children may have had more experience parsing sentences with DP dislocation through their prosodic boundary information than parsing stripping sentences using this strategy. This is because ellipsis sentences do not always have an informative prosodic boundary that needs to be taken into account for sentence parsing. As seen in sections 2.2 and 2.3, ellipses often appear in fragment answers; see, for instance, the examples in (22d) and (30), repeated here as (39a) and (39b), respectively. More importantly, conjunctions or disjunctions (which are the common place of occurrence of stripping sentences) can also be signaled by overt coordination markers (e.g. "the tiger eats and the duck too"), which might be more reliable than prosody for sentence parsing. On the other hand, at least in French, sentences with DP dislocation have characteristic prosodic boundary cues that are always present (DE CAT, 2007), and the prosodic boundary between the dislocated DP and the main sentence cannot be signaled by a function word or a morpheme. Similarly, the ambiguous sentences with noun-verb homophones in de Carvalho, Dautriche, & Christophe (2016) (e.g. la petite ferme, which can mean either "the little farm" or "the little girl closes", depending on prosodic phrasing), which were also more easily interpreted by a younger age group, are consistently disambiguated through prosodic phrase boundaries, since subject DPs and VPs are frequently separated by a phonological phrase boundary even in unambiguous sentences. This might explain why children have more trouble parsing our stripping test sentences through prosodic information than parsing sentences from previous studies, since our sentences might not be as frequently disambiguated through prosody in children's input.

39) a) Anna: John read Hamlet in high school.
       Adam: Mary too.
    b) Mãe: o cavalo vai papar?
       Mother: "Is the horse going to eat?"
       Criança: vai.
       Child: "goes" (yes)

Another possible reason for the discrepancy between our results and Dautriche et al.'s may be the greater complexity of our test videos, which presented more characters and more movement than the ones in the former experiment. Parents' answers to the vocabulary questionnaire show that both 28-month-olds and 3-4-year-olds understood almost all the verbs and nouns used in the experiment, which suggests that lack of knowledge of the test words is unlikely to be the reason for the younger group's performance.

Our looking results also suggest that the children tested already understand stripping constructions, which involve an ellipsis with an identity relationship with the preceding sentence, signaled by the adverb aussi. This is in line with previous studies that show that children as young as 17 months old understand and produce ellipsis (e.g. SANTOS, 2006, 2009; KATO, 2012; POSTMAN ET AL., 1997). However, even if children did not understand stripping sentences, they could still interpret these sentences as containing two agents, simply because Le tigre and Le canard are both at the beginning of intonational phrases; if children know the typical correspondences between I phrasing and root sentences57, and that French has the canonical word order SVO, they could use this information to figure out that the second DP in the stripping sentences is in a subject position. Consequently, when confronted with the two test videos, they would choose the only video where both the tiger and the duck are agents, regardless of what actions they are performing (i.e., regardless of knowing that they should be performing the same action, due to the identity relation conveyed by the ellipsis).

One interesting observation coming from this experiment concerns the overall proportion of looking time per item. Since manger stripping sentences should be more easily interpreted by French children, as manger is a verb that allows for an unergative reading (so that the sentences do not need to be interpreted as having a null clitic or null object), we expected that children would perform better with these sentences, showing a greater difference in proportion of looking time between stripping and transitive conditions than with the other verbs. However, the analysis per item does not show any difference between manger and the other test sentences. The only test item that showed a significantly different looking pattern for both age groups is pousser. However, this difference is not related to a better performance (i.e., a greater difference between transitive and stripping conditions), but rather to a general preference towards the two-agent action. Our best interpretation for this finding is that the pousser videos were not as well balanced as the other test videos, as explained in the Results sections, and the two-agent action was more salient than the one-agent action. However, this issue did not stop children from performing as expected with this item, since participants in the stripping condition looked longer towards the two-agent action than participants in the transitive condition in both age groups. These results show us that the differences between the underlying structures of the test items did not have a significant effect on children's performance, which suggests that children did not have more trouble interpreting the verbs that do not allow for an unergative reading than interpreting the verb manger.

57 As explained in section 2.1.1.2, the domain of I is root sentences, and the I restructuring rules never apply after a verb (i.e., there is never an I boundary after a verb that does not coincide with the end of a root sentence).

While the looking data show that children use prosodic boundary information to interpret stripping sentences differently from simple transitive sentences, we cannot reach the same conclusion from the pointing results. One could justify this finding by claiming that the children tested were too young for pointing tasks. However, studies show that children start performing declarative pointing (i.e., pointing with the intention to draw the attention of the interlocutor to something for expressive or informative purposes) by the end of their first year of life, even in experimental contexts (COCHET & VAUCLAIR, 2010; LISZKOWSKI, CARPENTER & TOMASELLO, 2007). Furthermore, other studies show good performance on pointing tasks by children as young as two years old (e.g. BERNAL ET AL., 2007; ARUNACHALAM ET AL., 2013). Perhaps the complexity of the task influenced the results; while previous pointing experiments were word-mapping tasks in which children were asked to look for referents of novel verbs or nouns, our task required children to map an entire sentence structure to a scene. The main difference is that, in those experiments, the experimenter asked children to point explicitly towards the referent of the novel word (i.e., show me the blick/the one who is blicking), while in the present experiment we asked children to point at the video "the lady was talking about". This task seems to be less straightforward and to increase memory demand, since children need to recover a rather complex sentence to answer properly. So one way of interpreting the discrepancy between our pointing and our looking results is by suggesting that children were able to unconsciously parse the sentences accurately during the test phase, but were not confident in their pointing answers, perhaps due to the need to retrieve a rather complex sentence in order to answer the experimenter's question correctly58. This could also explain why some children refused to point; if they lacked confidence in their answers, they might have been embarrassed to tell them to the experimenter.

In sum, our results show that from 2;4 to 3 years of age, young French-learning children can use prosodic boundary information to identify major sentence boundaries in stripping sentences and tell them apart from simple transitive sentences. These findings add new evidence to the hypothesis that young children readily exploit prosody for syntactic parsing. Furthermore, the fact that younger children had more difficulty interpreting the sentences, while children of the same age performed better in other experiments when exposed to different structures, might mean that children's ability to disambiguate syntactic structures through prosody depends on the type of structure being studied, perhaps due to the frequency of the structures, or to how often they are disambiguated through prosodic boundary information in children's input.

58 It is also important to note that the pointing analysis had less statistical power than the looking analysis, as it used a binary measure and included fewer participants.

3.3 BP-LEARNING CHILDREN

3.3.1 Method

This experiment was approved by the Ethics Committee of the State University of Campinas before it was conducted (CAAE: 84251818.5.0000.8142)59. The experimental videos, as well as the raw data and analysis scripts, are freely available to the reader through the following OSF link: https://osf.io/98cmu/?view_only=8b2aab61846247c7a9f24ebef11bec8c

3.3.1.1 Participants

For the Preferential Looking task, fifty children from 36 to 50 months old (twenty-five children in the transitive condition (Mage = 41.3 months, range 36 to 49.8, 12 girls), and twenty-five in the stripping condition (Mage = 41.8 months, range 34.6 to 50.2, 16 girls)) were included in the final analysis. Children were brought to an empty classroom in two daycare facilities in the city center of Campinas, SP, and listened to the sentences through headphones. All children were monolingual native BP speakers. We asked the children's teachers only about their native language, and not about their exposure to other languages (which was probably very low, since it is not common for children in this area to be raised in bilingual homes). Their parents signed an informed consent form, presented to them by the children's teachers along with a short letter of invitation, but did not fill out any vocabulary surveys (as this was impossible due to the experimenter not directly interacting with them).

59 See Appendix E.


An additional nineteen children were tested but not included in the analysis because of insufficient eye data (13 children)60; fussiness (4 children); refusal to participate (2 children); and being a non-native BP speaker (1 child). The children who were not included in the analysis because of missing eye-tracking data did not have their pointing data discarded, as the missing data were not judged to be caused by children's lack of attention to the videos. For the Picture Selection task, forty-three children were included in the final analysis. From this group, twenty-one children were in the transitive condition (Mage = 42.2 months, range 36 to 53.1, 9 girls), and twenty-two were in the stripping condition (Mage = 42.3 months, range 34.6 to 50.2, 13 girls). An additional seventeen children pointed but were excluded from the analysis, due to side bias (when children pointed to only one side during the entire experiment, including the training trial) (13 children); being a non-native BP speaker (1 child); and fussiness (3 children).
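The exclusion criterion for insufficient eye data (trials with less than 25% of eye data are dropped, and children with more than two dropped trials have all their looking data discarded) can be illustrated with a short Python sketch. This is not the script used in the study; the function name, the constants, and the input format (one proportion of tracked samples per trial) are assumptions made for illustration, and the threshold is assumed to refer to the proportion of tracked samples in a trial.

```python
MIN_VALID = 0.25   # assumed: minimum proportion of tracked samples per trial
MAX_EXCLUDED = 2   # more than this many dropped trials excludes the child


def usable_trials(valid_proportions):
    """Return the indices of the trials kept for one child.

    valid_proportions: list with, for each trial, the proportion of
    samples in which the eye-tracker recorded a gaze position.
    An empty list means that all looking data of the child are discarded.
    """
    kept = [i for i, p in enumerate(valid_proportions) if p >= MIN_VALID]
    excluded = len(valid_proportions) - len(kept)
    return kept if excluded <= MAX_EXCLUDED else []
```

For a hypothetical child with tracked proportions of .90, .10, .50 and .20 in the four test trials, two trials are dropped and the remaining two are kept; a child with three dropped trials would be excluded altogether.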

3.3.1.2 Materials

The experiment videos were the same as those used for Experiment 1 with French children, except for the taper ("hit") action, which was replaced by cutucar ("poke") (see section 3.1). For this new action, we created a very similar video with the same characters and materials, but instead of hitting the patient character, the agents poked it with a stick. To make the BP sentences more natural, instead of using the simple present tense, which in this language is more common for describing habits than for describing ongoing events, we used the present continuous tense (i.e., tá61 -ndo (is -ing); see (40) below).

40) a) O tigre tá cutucando! O pato também! – stripping sentence
    b) O tigre tá cutucando o pato também! – transitive sentence

The sentences were recorded by Rosana Rogeri, a fellow linguist and native BP speaker from the metropolitan region of Campinas, at the Laboratory of Acquisition and Syntactic Processing (LAPROS) at UNICAMP, using a recorder and a condenser microphone. Twenty-nine BP-speaking adults were asked to interpret the sentences via an online form, which had the same design as the French forms. The sentences were interpreted correctly 94% of the time62 (i.e., the stripping sentences were interpreted as coordinated sentences with two equivalent actions, and the transitive sentences were interpreted as simple sentences in which the first DP was the agent and the second DP was the patient of the action).

The videos were presented through a commercial video player on a 21.5-inch television, and the auditory stimuli were played through headphones. To avoid possible order biases in children's responses63, we created four complete videos per condition, alternating the order of appearance of test trials in each video so that all four test trials appeared first in one video, second in another one, and so on64. The left-right position of the videos was also randomized so that children did not see more than two consecutive target videos on the same side. Children were assigned a particular version of the video by order of testing: the first and second child saw the first and second versions of the stripping videos; the third and fourth saw the first and second versions of the simple transitive videos; the fifth and sixth saw the third and fourth versions of the stripping videos, and so on. The pointing responses were written down by the experimenter (and later checked by viewing the children's videos without sound) and the eye data were collected through a Gazepoint eye-tracker65.

60 As for the French experiments, all trials with less than 25% of eye data (i.e., in which children looked away and/or in which the eye-tracker couldn't track the eye for 25% or more of the time of the trial) were excluded from the statistics, and children with more than two excluded trials had all their looking data discarded.
61 This is a colloquial form of the verb estar (to be) that is more commonly used by speakers of the dialect of Campinas, as well as by speakers of many other Brazilian dialects.
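The assignment of video versions by order of testing can be made explicit with a small Python sketch. The mapping for the first six children follows the description above; the continuation (seventh and eighth child seeing the third and fourth versions of the simple transitive videos, with the cycle restarting at the ninth child) is an assumption about what "and so on" means, and the function name is hypothetical.

```python
def assign_video(child_index):
    """Map a 1-based order of testing to (condition, video version).

    Children 1-2: stripping versions 1 and 2; children 3-4: transitive
    versions 1 and 2; children 5-6: stripping versions 3 and 4.
    ASSUMPTION: children 7-8 get transitive versions 3 and 4, and the
    same cycle of eight restarts with child 9.
    """
    if child_index < 1:
        raise ValueError("child_index is 1-based")
    pos = (child_index - 1) % 8   # position within a cycle of eight children
    pair = pos // 2               # 0..3: which pair of children in the cycle
    condition = "stripping" if pair % 2 == 0 else "transitive"
    version = (pair // 2) * 2 + (pos % 2) + 1
    return condition, version
```

Under these assumptions, each cycle of eight children covers all four versions of both conditions exactly once.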

3.3.1.2.1 Acoustic analyses

As for the experiment with French-learning children, to assess the prosodic differences between conditions, we conducted acoustic measurements around the positions of intonational phrase boundaries in the recorded test sentences. We analyzed the presence and length of pauses; lengthening of the stressed vowel of the word preceding the boundaries (i.e., for the verb, in stripping sentences, and for the first and second DPs, in simple transitive sentences); and pitch (F0) contour at the end of phrase boundaries. For the last two measurements, we compared these two types of sentence through t-tests using the z-score value of duration and pitch.

For the boundary between the first DP and the verb, the DP's stressed vowel was significantly longer for the simple transitive condition (M = 3.58; SD = 1.92, versus M = -.03; SD = .54 for the stripping condition; t(14) = -5.11; p < .001). There was also a consistent pause between the first DP and the verb for the simple transitive condition (M = .115s; SD = .75), but not for the stripping condition. The boundary between the verb and the second DP in the stripping sentences was also marked with a pause (M = .420s; SD = .53). We also found greater lengthening of the verb's stressed vowel (z-score duration of the verb's stressed vowel minus average z-score duration of preceding vowels) for the stripping condition, although the simple transitive condition also presented vowel lengthening in this position (M = 3.05; SD = 1.63, for the simple transitive condition, and M = 7.97; SD = 2.18, for the stripping condition; t(14) = 5.09; p < .001). However, we found no difference in F0 curve (F0 z-score of the verb's stressed vowel minus F0 z-score of preceding vowels) between conditions (M = -1.26; SD = .53 for the simple transitive sentences, versus M = -.68; SD = 1.46 for the stripping sentences; t(14) = 1.06; p = .3), with both conditions presenting a similar falling pitch contour after the verb. The raw duration values can be found in Table 6, and the raw F0 values can be found in Table 7.

For the last DP of the sentence, we found no difference in vowel lengthening between conditions (M = 2.84; SD = 3.73 for the transitive, and M = 4.89; SD = 4.91 for the stripping sentences; W = 44; p = .2). For F0, we found a significant difference between conditions (t(14) = -4.47; p < .001), but the simple transitive sentences presented a rising pitch contour, whereas the stripping sentences presented a falling pitch contour in this position (M = 1.24; SD = .79, for the transitive condition, and M = -1.99; SD = 1.89, for the stripping condition)66. These results do not follow the same pattern found for the French test sentences, as they do not indicate a prosodic phrase boundary between the second noun and também ("too") in the transitive sentences.

62 Stripping sentences were judged correctly 98% of the time, and transitive sentences, 97% of the time. There were 1.11% wrong or ambiguous close to wrong judgments (.83% for transitive and .28% for stripping sentences); .28% of completely ambiguous judgment for a transitive sentence; and 4.72% of correct but slightly ambiguous judgments (1.66% for stripping sentences and 1.94% for transitive sentences).
63 E.g. children could get bored during the last trials, and so perform better with the first test items, or take some time to understand the task, and so perform better with the last test items.
64 The complete description of each video created is in Table 14 in Appendix B.
65 https://www.gazept.com/
66 Individual t-tests confirm that there is a significant difference in F0 values between the verb and the preceding vowels for both types of sentence (t(7) = 4.28; p = .004 for simple transitive sentences, and t(7) = -3.14; p = .016 for stripping sentences).
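The lengthening measure used in the acoustic analysis (z-score duration of a stressed vowel minus the average z-score duration of the preceding vowels) can be sketched in Python as follows. This is only an illustration: the text does not state over which set of vowels the z-scores were computed, so the sketch assumes that they are computed over all vowels of the same sentence, using the population standard deviation. The function names and the example durations are hypothetical.

```python
from statistics import mean, pstdev


def zscores(values):
    """Z-score each value relative to the whole list (population SD)."""
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        raise ValueError("all values are identical; z-scores are undefined")
    return [(v - m) / s for v in values]


def lengthening(vowel_durations, stressed_index):
    """Z-score of the stressed vowel minus the mean z-score of the
    vowels that precede it in the same sentence.

    ASSUMPTION: z-scores are computed over all vowels of the sentence.
    """
    if stressed_index < 1:
        raise ValueError("the stressed vowel needs at least one preceding vowel")
    z = zscores(vowel_durations)
    return z[stressed_index] - mean(z[:stressed_index])


# Hypothetical vowel durations (in seconds); the last vowel is the stressed one.
example = [0.08, 0.09, 0.08, 0.23]
print(lengthening(example, stressed_index=3))
```

A positive value indicates that the stressed vowel is longer than the preceding vowels in that sentence; the same computation applies to F0 values in place of durations.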


Table 6 - Raw duration values for the acoustic analysis of BP test sentences.

Stripping sentences: e.g. [o tigre tá cutucando] [o dinossauro também]
Transitive sentences: e.g. [o tigre] [tá cutucando o dinossauro também]

                                                      Stripping sentences     Transitive sentences
                                                      Duration (s)  SD (s)    Duration (s)  SD (s)
1st DP's stressed vowel (e.g. t[i]gre)                0.11          0.034     0.201         0.029
Verb's stressed vowel (e.g. cutuc[a]ndo)              0.23          0.033     0.172         0.024
Verb's preceding vowels (mean)                        0.082         0.012     0.104         0.008
Verb's stressed vowel - verb's preceding vowels       0.148         0.031     0.068         0.026
2nd DP's stressed vowel (e.g. dinoss[a]uro)           0.194         0.053     0.172         0.028
2nd DP's preceding vowels (mean)                      0.098         0.009     0.109         0.005
2nd DP's stressed vowel - 2nd DP's preceding vowels   0.096         0.051     0.063         0.032

Table 7 - Raw F0 values for the acoustic analysis of BP test sentences.

Stripping sentences: e.g. [o tigre tá cutucando] [o dinossauro também]
Transitive sentences: e.g. [o tigre] [tá cutucando o dinossauro também]

                                                      Stripping sentences     Transitive sentences
                                                      F0 (Hz)       SD (Hz)   F0 (Hz)       SD (Hz)
Verb's stressed vowel (e.g. cutuc[a]ndo)              307           64        234           28
Verb's preceding vowels (mean)                        325           25        291           35
Verb's stressed vowel - verb's preceding vowels       -18           64        -57           22
2nd DP's stressed vowel (e.g. dinoss[a]uro)           240           65        347           48
2nd DP's preceding vowels (mean)                      326           18        288           32
2nd DP's stressed vowel - 2nd DP's preceding vowels   -87           78        59            39

Figure 15 - Soundwave and pitch for a stripping sentence (O tigre tá comendo! O dinossauro também!, top) and a simple transitive sentence (O tigre tá comendo o dinossauro também!, bottom).

3.3.1.3 Procedure

Parents or caretakers were contacted by the teachers in two selected daycare facilities and were asked to authorize children's participation by signing consent forms. The experimenter then went to the daycare facilities and invited each authorized child to come to the experiment room (an empty classroom) individually to watch the videos. Once in the room, children were instructed to sit in front of the TV monitor and put on the headphones, with the help of the experimenter. Most children accepted putting the headphones on quite easily, as they were told it was the only way they could listen to the videos. They were also told they would sometimes see two videos at the same time, but the narrator ("the lady who talks") would talk about only one of them, and they would be asked to point to the video she was talking about. After that, the experimenter adjusted the eye-tracker position while the child watched a cartoon video, then sat in a chair to their right, facing the child (and not the monitor). Afterwards, the five-point calibration began, followed by the experiment. The experimenter avoided looking at the monitor screen during the test trial and paid attention to her own laptop screen (which children could not see), in order not to influence children's looks. However, even when she looked at the screen (to turn children's attention to it or check if the videos were running correctly), children could not easily tell if she was looking towards a specific video, since she was sitting facing the child and not the monitor.

The experiment followed the same design as the one presented to French-learning children. At the end of each test trial, the experimenter would pause the videos manually and ask children to point at the video "the lady was talking about" (de qual vídeo a moça tava falando?). Children were always congratulated on their pointing responses, regardless of whether they were correct. At the end of the experiment, the experimenter thanked the child and escorted her back to her classroom.

3.3.1.4 Data analysis

The eye-tracker software generated a table and a video with the x- and y-axis position of eye gaze every 16.4ms on average for each child. The experimenter determined the start of each trial by watching the videos and labeled the entire trial from -2000ms to 12500ms (i.e., 2s before the beginning of the videos and .5s after the end). Once each table was ready, it was processed by an R script that selected only the relevant data (excluding non-labeled times) and determined the child's left-right eye position from the x- and y-axis values and, from that, the looks to the stripping vs. transitive videos. Afterwards, data for all children were organized in a single file and analyzed following the same procedure applied for the French experiments. For the time course analysis, as the eye-tracker did not give precisely one data point every 16.4ms, we averaged the data in 170ms time bins (roughly 10 times the average sampling interval of the eye-tracker, exactly the same pattern used for the French data). Although this meant that the analysis had fewer data points than the French experiment, it also meant that we had fewer missing data, which would be detrimental to the statistical analysis.
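The binning step can be illustrated with a minimal Python sketch. It is not the R script used in the study: the function name, the input format (pairs of time in ms and horizontal gaze position, with None for lost gaze) and the use of the screen midpoint to separate left from right looks are assumptions made for illustration.

```python
from collections import defaultdict

BIN_MS = 170  # bin width reported in the text


def bin_looks(samples, midpoint, bin_ms=BIN_MS):
    """Average irregularly spaced gaze samples into fixed time bins.

    samples:  iterable of (time_ms, x), where x is the horizontal gaze
              position, or None when the eye-tracker lost the eye.
    midpoint: x value separating the left video from the right video
              (hypothetical; depends on the coordinate system used).
    Returns {bin_start_ms: proportion of valid samples on the right side}.
    """
    counts = defaultdict(lambda: [0, 0])  # bin start -> [right looks, valid samples]
    for time_ms, x in samples:
        if x is None:
            continue  # missing samples do not count towards the proportion
        bin_start = int(time_ms // bin_ms) * bin_ms
        counts[bin_start][1] += 1
        if x >= midpoint:
            counts[bin_start][0] += 1
    return {b: right / valid for b, (right, valid) in sorted(counts.items())}
```

Because each bin is a proportion over the valid samples it contains, the number of samples per bin does not need to be constant, which accommodates the irregular sampling interval of the eye-tracker.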


3.3.1.5 Experimental hypothesis and predictions

Our hypothesis was that, like French-learning children, BP-learning children can also use the prosodic boundary cues in stripping sentences to interpret them differently from the simple transitive sentences, and so they should look longer/point more towards the two-agent action when listening to stripping sentences than when listening to simple transitive sentences. If, however, they fail to use prosodic boundary information to interpret the stripping sentences, they should interpret them as simple transitive sentences and look longer/point more towards the one-agent action. The predictions for the results are the same as for the French experiment.

3.3.2 Results

3.3.2.1 Eye-tracking data

The results show two significant time clusters, one at the end of the first test sentence (from 5270ms to 7310ms, p = .05) and another one at the end of the test trial (from 10540ms to 12000ms, p = .03), in which children in the stripping condition looked longer towards the stripping video than children in the simple transitive condition. The t-test on overall looking times also shows a significant difference between the stripping (M = .59, SD = .11) and transitive conditions (M = .50, SD = .11, t(48.81) = 2.98; p = .004; Cohen's d = .83). Figure 16 shows the time course of the proportion of looking time during the test trial, and Figure 17 shows the average proportion per condition.
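As a rough check on the size of this difference, the effect size and the t statistic can be approximated from the summary statistics alone. The Python sketch below assumes twenty-five children per condition (as reported for the Preferential Looking task) and uses the rounded means and standard deviations, so its output only approximates the values reported above, which were computed from the raw data; the function names are hypothetical.

```python
from math import sqrt


def cohens_d(m1, sd1, m2, sd2):
    """Cohen's d for two groups of equal size, using the pooled SD."""
    pooled_sd = sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return (m1 - m2) / pooled_sd


def welch_t(m1, sd1, n1, m2, sd2, n2):
    """Welch's t statistic computed from group means, SDs and sizes."""
    return (m1 - m2) / sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)


# Rounded summary statistics: stripping vs. transitive condition.
d = cohens_d(0.59, 0.11, 0.50, 0.11)
t = welch_t(0.59, 0.11, 25, 0.50, 0.11, 25)
print(round(d, 2), round(t, 2))
```

The small gap between these approximations and the reported values is consistent with rounding of the means and standard deviations.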


Figure 16 - Proportion of looks towards the two-agent action through the whole test trial (12 seconds) in the stripping (green line) and transitive (orange line) condition for BP-learning children. The gray rectangles show the significant time clusters and their p-values. The average time of appearance of each word in the test sentences is indicated in the gray boxes below the graph (the upper boxes show stripping sentences, and the boxes below show transitive sentences).

Figure 17 - Average proportion of looking time towards the two-agent action in the stripping (green box) and transitive (orange box) condition for BP-learning children. The purple dots show the average looking time for each participant. The white dashed lines show the mean for each condition (.59 for stripping sentences, and .50 for transitive sentences).


As we can see in Figure 16, the proportion of looks from children in the simple transitive condition (orange line) is below the one from children in the stripping condition (green line) during most of the trial. However, both groups seem to have a slight preference for the stripping videos, although their proportion of looks is roughly around .50 for most of the trial. This is reflected in the t-test and in the boxplot of Figure 17, where we see that the average looking time for children in the simple transitive condition (represented by the white dashed line in the boxplot) is almost at .50, and the one for children in the stripping condition is slightly above that.

Exploratory analyses

Since the second time cluster extended until the end of the test trial, we also performed the analysis considering the time until 500ms after the end of the trial, to see how far this cluster would go. At the end of each test trial, the videos froze on the screen for the pointing trial, so children could still see the two videos for more than 5000ms (since the experimenter also paused the videos to ask children to point)67. We found that the time cluster continued until 70ms after the end of the test trial. The analysis of this slightly larger time cluster returned a slightly smaller p-value than the one found for the cluster until 12000ms (p = .02).

As was the case with the French experiments, we also performed an ANOVA to see if there was any difference in children's performance between test items. We found an effect of condition (F(1,153) = 5.64, p = .02) and a strong effect of item (F(3, 153) = 7.02, p < .001), but no interaction between condition and item (F(1,150) = .58, p = .63). A post-hoc Tukey test showed that, like the French children, BP-learning children looked longer towards the two-agent action video for the empurrar (pousser) action (p < .005 for the comparison with cutucar (the action created to replace taper) and carregar (i.e., porter/carry), and p > .1 for all other comparisons). This helps corroborate the prediction that children's preference towards the two-agent action in this test item is linked more to the visual stimuli than to the test sentences, since the BP sentences yielded a similar effect.

Figure 18 shows the overall proportion of looking time towards the two-agent action video for the stripping (green bars) and transitive (orange bars) conditions per item. As we can see, children in the stripping condition looked longer towards the two-agent action than children in the transitive condition in all test items, although this difference is smaller for empurrar (see Table 8).

67 We cannot consider all 5000ms for analysis because children spent a good amount of this time looking at the experimenter as she interacted with them. However, it is safe to say that the experimenter takes at least 500ms to pause the video and start interacting with the child.

Table 8 - Average proportion of looking time towards the two-agent action for the stripping and transitive condi- tion per item for BP-learning children.

            Condition
Verb        Stripping   Transitive
Comer       .52         .46
Carregar    .46         .36
Empurrar    .57         .55
Cutucar     .49         .40

Figure 18 - Average proportion of looking time towards the two-agent action video for stripping (green bars) and transitive (orange bars) condition per item for the BP-learning children.


3.3.2.2 Pointing data

The pointing analysis shows no significant difference in pointing between stripping and transitive conditions (β = -.51, SE = .46, z = -1.12, p = .26). The results, however, follow the expected direction, with children in the stripping condition pointing more towards the two-agent video (65% of the time, versus 53% for the transitive condition). When we look at each verb separately, two of them (comer and carregar) show more pointing towards the two-agent video for the stripping condition, while cutucar shows the opposite pattern, with more pointing towards the two-agent video for the transitive condition. Finally, empurrar has almost the same amount of pointing towards the two-agent video for each condition (see Table 9).

Table 9 - Average proportion of pointing towards the two-agent action for the stripping and transitive condition per item for BP-learning children.

            Condition
Verb        Stripping   Transitive
Comer       .71         .50
Carregar    .75         .39
Empurrar    .50         .47
Cutucar     .64         .75


Figure 19 - Left: Average proportion of pointing towards the two-agent action for BP-learning children. The white dashed horizontal lines show the mean for each condition (.65 for the stripping condition (green box) and .53 for the simple transitive condition (orange box)). The purple dots represent the proportion of pointing for each participant. The horizontal gray bar shows .50 proportion (i.e., when children pointed towards two-agent videos in half of the trials). Right: Average proportion of pointing towards the two-agent action for stripping (green bars) and transitive (orange bars) conditions per item.

As we can see in the boxplot to the left of Figure 19, where purple dots indicate the proportion of pointing for each child, some children pointed towards the two-agent action during half of the trials (4 out of 22 children in the stripping condition, and 7 out of 21 children in the transitive condition). Since we counterbalanced the trials so that the target video would always appear on the side opposite to the one on which it appeared in the preceding trial, this means that these children always pointed towards the same side. However, as we had already removed the children who pointed towards the same side during all trials including the training trial, we cannot really say that these children had a side bias, since they pointed to a different side during the training trial.

3.3.3 Discussion

This experiment was designed to investigate whether BP-learning children, like French-learning children, can interpret stripping sentences differently from transitive sentences by relying on their different prosodic boundary cues. Our hypothesis was that BP-learning children would be able to use the prosodic boundary information to determine that the second DP of the stripping sentences refers to an agent of a new sentence, and not the patient of the preceding VP. Therefore, children who listened to stripping sentences should look longer and point more towards the two-agent action (second-DP agent interpretation) than children who listened to simple transitive sentences.

The results show an overall difference between conditions, and two significant time clusters in which children in the stripping condition looked longer at the two-agent video than children in the simple transitive condition. This shows that BP-learning children in the stripping condition interpreted the test sentences differently from children in the simple transitive condition. The analysis between items shows that, similarly to the French experiment, children's performance did not significantly differ from one item to another, except for empurrar, which presented a greater overall proportion of looking time towards the two-agent action for the two conditions. This helps corroborate the prediction that the two-agent video was more salient than the one-agent video for this test item, since French-learning children presented a similar looking pattern for this item.

For the pointing analysis, the results show no significant difference between the stripping and simple transitive conditions. As for the French results, a possible explanation for this would be that the complexity of the task had a negative influence on children's performance, since children needed to recover the heard sentences to point correctly, and might not have felt confident enough to do so. For the analysis of pointing per item, we saw that children only present more pointing towards the two-agent action in the stripping than in the transitive condition for two test items: comer and carregar. One might explain this by noting that comer, like manger, can be interpreted as unergative in sentences with no object, and so children might interpret these sentences more easily than null object sentences. As for carregar, this was the only verb for which we chose a non-animate object as the referent of the null object, and, as noted in section 2.4, animate referents rarely occur as null objects in [+specific] contexts in BP (CYRINO & LOPES, 2004). However, we did not test this difference in performance between test items statistically, as this analysis would have very little statistical power due to the small number of data points.

In sum, our results show that BP-learning children from 3 to 4 years old can use prosodic boundary information to identify major sentence boundaries in stripping sentences and tell them apart from simple transitive sentences. This finding contributes to the literature on prosodic bootstrapping for syntactic acquisition by adding another language in which children can correctly interpret sentence structure by relying on prosodic boundary cues.

3.4 GENERAL DISCUSSION: BP VS. FRENCH RESULTS

The results of the experiments above show that BP- and French-learning children can interpret stripping sentences differently from simple transitive sentences by relying on their different prosodic boundaries. Both language groups presented significant time clusters and an overall significant difference between conditions in the proportion of looking time towards the two-agent video, although neither group presented a significant difference in the pointing responses.

In general, French-learning children from 3 to 4 years old seem to have performed slightly better than the BP-learning children: the time course analysis showed a greater difference between conditions during the whole trial and larger time clusters where the difference between conditions was statistically significant (p = .04 for the first cluster, and p < .001 for the second, for the French experiment; versus p = .05 for the first cluster, and p = .03 for the second, for the Brazilian experiment). The significant time clusters for the French-learning children are also at the beginning of the test trials, whereas the time clusters for BP-learning children are at the end of the test sentences.

This difference between results is small and could most likely be accounted for in terms of differences in the equipment used for data collection. In France, we used an Eyelink 1000 eye-tracker, which is one of the most widely used eye-trackers in child research because of its high precision and easy calibration. In Brazil, we used a Gazepoint eye-tracker, which is less precise and less suited for child users. Moreover, in France, eye-tracker data were automatically selected and coded through a MATLAB script, whereas in Brazil the data were selected manually from a spreadsheet generated by the eye-tracker's software. So the differences found between the Brazilian and the French data might be solely due to the use of a less precise data collection procedure in Brazil than in France.
However, one could propose other possible reasons for this small discrepancy between results. The first one is methodological. While French children were tested in a sound-attenuated cabin inside a laboratory, Brazilian children were tested in daycare centers. These two environments differ radically in several respects. In daycare centers, children are surrounded by stimuli of all sorts: the sound of dozens of other children, toys, music, constant chatting, and so on. In the laboratory, children only encounter a few other children, (usually) on their best behavior, there are rarely any loud noises, and we offer them a limited set of toys to play with while they wait for their turn to perform the experiment. During the test, children in the laboratory have very few distractions: they cannot hear external noises, the wires, computer and other devices inside the testing cabin are dark in color or masked by a black cloth, and the lights are dimmed. Children in the daycare centers, by contrast, were tested inside unoccupied classrooms, which were very bright and contained toys, furniture, school supplies, etc., and although they listened to the test sentences through headphones, they could still hear some external noises, such as loud music and screaming coming from other classrooms. This difference in environment could also explain the difference in results concerning the timing of the time clusters in the cluster-based analysis: with more distractions, BP-learning children may have had more difficulty paying attention to the test sentences, and may have needed to listen to the sentences for a longer period to understand them.

One could also try to explain the differences between languages by the differences in their prosodic structure. As already seen, the prosodic boundary cues of both languages are quite similar for this type of sentence, apart from the direction of the typical pitch contours (BP stripping sentences presented a falling pitch contour at the boundary between the verb and the second DP, while the French stripping sentences presented a small rising pitch contour in this position). But we have also seen that French has some unique prosodic characteristics not shared with most other Romance languages, including BP, such as the lack of lexical stress and a smaller number of pitch accents and boundary tones.
Furthermore, French does not use prosody for topic or focus marking, whereas BP does. This could mean that prosodic cues in French, being fewer in number and having a more limited set of uses, are less ambiguous than the ones in BP, so that French-learning children need less exposure to sentences in order to parse them correctly through their prosodic boundary information.

Differences in the syntactic processes allowed in each language could also account for differences in children's performance. As we have seen, French does not allow for VPE or null subjects, and referential null objects (or null clitics) are rare and clearly marked (e.g. CUMMINS & ROBERGE, 2004; GRÜTER, 2009), whereas BP allows for VPE and null subjects and objects in various contexts. Our first hypothesis, however, was that this would make the comprehension of our stripping test sentences easier for BP-learning children, since this language allows for a wider range of ellipsis and null elements.

One quite interesting observation is that the analysis of the looking results per item was similar between languages, which is surprising given that BP allows for referential null objects more freely than French. One way of explaining this is by assuming that children did not need to understand the entire stripping sentences to interpret them differently from the simple transitive sentences, as already discussed above. Another way is to assume that the null object sentences were equally acceptable in both languages, since for most of them the null object had an animate referent, which is also marked in BP. These two explanations are not incompatible; if the second one is true, then it is also true that the low acceptability of the test sentences did not stop children from performing as expected in the experiments. Another possible explanation would be that the syntactic advantage of BP was cancelled out by the prosodic advantage of French, which is why children in both languages performed so similarly.

There are also similarities between the BP- and French-learning children's performance when we look at the pointing results per item. For both language groups, the items with the greatest difference between conditions were manger/comer and porter/carregar, and there was an inverted pattern for taper/cutucar, with children in the transitive condition pointing more towards the two-agent action than children in the stripping condition. Although we do not have a good explanation for the taper/cutucar results, it is interesting to observe that this test item also presented similar issues in Dautriche et al.'s (2014) experiment.
However, for Dautriche et al., this was attributed to a higher salience of one of the test videos over the other (in one video, a monkey hit a cup, whereas in the other, a bunny hit the monkey while it cringed and massaged its own head). This does not seem to explain our pattern of results, since both test videos for this item were as similar as possible (they were filmed in a similar manner to the manger/comer videos, which did not present the same pointing pattern) and the only item for which we found a difference in the overall proportion of looking time was pousser. Regarding the better performance on manger/comer and porter/carregar, one could attribute it to the fact that manger/comer has an unergative reading when occurring in sentences without an object, and that porter/carregar had an inanimate object as the null object's referent, and null objects with inanimate referents are preferred in both French and BP. However, as we cannot attest the statistical significance of this difference between test items, our best guess is that there is no real difference in children's performance across items, although more data would be needed to confirm this.

The results found here, along with previous studies on prosodic bootstrapping, corroborate Nespor & Vogel's (1986) claim that prosodic boundary cues from the uppermost levels of the prosodic hierarchy can be used for speech access. As some prosodic phrase formation rules refer to syntactic structure, the correlations between syntactic and prosodic structure seem to be reliable enough to help both adults and children with sentence parsing. Our acoustic analyses of the test sentences revealed consistent pauses at the intonational phrase boundary between the verb and the second DP in stripping sentences, and significant differences in stressed vowel lengthening of the words immediately preceding the prosodic phrase boundaries of transitive and stripping sentences. Although we could not attest differences in pitch contour between sentences, the duration and pause cues, which are described by Nespor & Vogel as typical cues for phonological and intonational phrase boundaries, proved strong enough to signal the different syntactic structures to children, as they were able to use them to correctly guide their interpretation of the sentences.

In sum, our looking results for Experiment 1 show that both French- and BP-learning children can interpret stripping sentences differently from simple transitive sentences by relying on different prosodic boundaries. This adds new evidence in favor of prosodic bootstrapping theory, especially concerning studies on the role of prosodic information in syntactic parsing, by showing a different type of structure that French children can interpret by relying on prosodic boundaries, and also by showing this ability in children learning a language not studied before for this type of task (i.e. BP).

This finding is especially interesting given that the prosodic boundary cues that differentiate our stripping from our transitive test sentences do not occur with all stripping sentences in children's input. This is because stripping may also occur in fragment answers or inside conjunctions or disjunctions with overt coordination markers (e.g. "and"; "but"). Since most previous studies used sentences where prosodic phrasing information signals sentence structure in a more systematic fashion, our study was one of the first to show children's ability to use prosodic boundary information to correctly interpret a sentence in which prosody might not be the best or the most recurrent cue for parsing. This suggests that children can exploit prosodic information not only by observing the most recurrent prosodic cues that co-occur with a specific syntactic structure, but also by making wider generalizations about the correlations between syntactic and prosodic structure.

These results also contribute to investigations on the acquisition of ellipsis sentences. As seen in section 2.3, children seem to understand and produce ellipsis from a very early age.
There are not many studies that show children's comprehension of stripping sentences, but since there are several studies that show their understanding of other types of ellipsis, we expected children to understand our stripping sentences as well. Our results corroborate this prediction, as participants exposed to stripping sentences looked longer towards the video where the second-mentioned DP was performing the same action as the first-mentioned DP, which suggests that they understand the identity condition behind ellipsis sentences: the elided constituent must be identical to some linguistic (or in some cases pragmatic) antecedent. In other words, children appear to understand that the second character mentioned in the sentences must be performing the same action as the first character.

There is, however, another way of interpreting our results. One might say that children chose the two-agent videos as the best interpretation for the stripping sentences not because the two characters were performing the same action in these videos, but simply because they were the only videos where both characters were agents. Perhaps the only information children needed to perform as expected in the experiment was that the stripping sentences described the two mentioned characters as agents. This knowledge might have been enough to lead them to look longer towards the two-agent action videos, regardless of whether they knew that the second agent should be performing the same action as the first one. As this was presented to us as an alternative interpretation of our results, we decided to design and run another experiment aiming to investigate children's understanding of the identity condition behind our stripping test sentences. This experiment, as well as its motivations and preliminary results, is described in chapter 5.

One further question we would like to address in this thesis is whether children can still correctly tell apart stripping from simple transitive sentences when presented with sentences containing an unknown verb inside dialogues and without a simultaneous visual context (e.g. two videos or images related to the sentences). De Carvalho (2017) showed that 28-month-olds hypothesize different meanings for the novel verb daser depending on the type of structure it is presented in: left-dislocated sentences such as il dase, le garçon ("he dases, the boy"), or transitive sentences such as il dase le garçon ("he dases the boy"). The group exposed to the simple transitive sentences hypothesized a transitive meaning for daser, whereas the group exposed to left-dislocated sentences hypothesized an intransitive meaning for it. These results show that French-learning children can rely on prosodic boundary information for sentence parsing even when they do not previously know the argument structure of the verb. Can children do that for stripping sentences? In the next chapter, we describe an experiment created to address this question.


EXPERIMENT 2: STRIPPING VS. TRANSITIVE SENTENCES WITH A NOVEL VERB

In chapter 3 above we showed that French- and BP-learning children interpret stripping sentences differently from transitive sentences by relying on their different prosodic boundaries when exposed to sentences containing known verbs. However, that experiment does not say much about children's ability to access prosodic boundary information in stripping sentences for the acquisition of a verb's argument structure. Can children still rely on prosodic structure for parsing stripping sentences containing an unknown verb? Although Experiment 1 shows that children pay attention to prosodic boundary information when parsing stripping sentences, it is still possible that when exposed to a more complex experiment, which adds another difficult task besides sentence disambiguation, they fail to do so. This second experiment was designed to investigate French-learning children's ability to correctly interpret stripping sentences through prosodic boundary information when presented with sentences containing a novel verb and in a less informative context (i.e., hearing the test sentences prior to seeing their possible referents).

The methodology used was created by Yuan & Fisher (2009) to test the hypothesis that children can make predictions about a novel word's meaning even with minimal semantic information, through the observation of morphosyntactic cues. In their experiment, English-learning 2-year-olds listened to dialogues where a novel verb appeared either in transitive sentences such as "Jane blicked the baby" or in intransitive sentences such as "Jane blicked". Afterwards, they saw two videos displaying two novel actions side by side: a novel causal action with two participants (a girl swinging another girl's leg back and forth) and a novel intransitive action with one participant (a girl making circles with her own arm). The audio stimulus accompanying the videos asked children to "find blicking". Their results show that children who listened to the novel verb in transitive sentences looked longer towards the causal action than children who listened to it in intransitive sentences, suggesting that they correctly hypothesize that the novel verb must refer to a two-participant action by observing the number of complements accompanying it in the sentences, and might use this hypothesis to map the verb onto a novel action later on.


Yuan & Fisher's results suggest that children can use the linguistic context in which a verb appears to make inferences about its possible meaning [68]. However, children's performance in the task might not necessarily be linked to their knowledge of syntactic structure; they may have just stored in memory the number of participants accompanying the novel verb in the dialogues, and then looked for the action that matched this number. Since the intransitive action video only showed one girl making circles with her arm, the only two-participant action available was the novel causal action. In order to investigate whether children can also make inferences about a novel verb's meaning without relying on the number of participants involved in the action, Arunachalam & Waxman (2010) presented two-year-olds with a novel verb in transitive dialogues such as "the lady is going to moop the baby" or in intransitive dialogues also containing two participants, such as "the lady and the baby are going to moop". The test phase displayed a novel causal action (e.g. a man spinning a woman around in a chair) and a novel intransitive action also containing two participants (e.g. the man and woman making circles with their own arms). The results show that children who listened to the novel verb in transitive sentences pointed significantly more towards the transitive action than children who listened to it in intransitive sentences [69]. This suggests that children can use syntactic information to make predictions about a novel verb's meaning, and do not just count the number of participants accompanying the verb to find its referent.

Since Arunachalam & Waxman's results show that two-year-olds pay attention to the syntactic context in order to make hypotheses about a novel verb, Dautriche et al. (2014) and de Carvalho (2017) used the same methodology to investigate children's interpretation of sentences with left-dislocated structures when exposed to a novel verb.
They presented French-learning 28-month-olds with videos where two native French-speaking women spoke to each other using the novel verb daser in left-dislocated sentences (e.g. il a dasé, le bébé ("he dased, the baby")) for one group, in simple transitive sentences (e.g. il a dasé le bébé ("he dased the baby")) for another group, and in simple intransitive sentences (e.g. il a dasé) for yet another group of children. At the test phase, children saw the same videos used by Yuan & Fisher (2009): a novel causal action (a girl swinging another girl's leg) and a novel one-participant intransitive action (a girl making circles with her own arm), while being asked to "find the one who dases".

[68] As a control study, the authors conducted a similar experiment with another group of two-year-olds, but where the test videos were not accompanied by the target novel word (i.e., without the audio stimulus that asked children to "find blicking"). Children from both conditions were at chance, showing that the ones in the test group were really mapping the novel verb to the novel actions and not just choosing the videos because they had listened to two-participant or one-participant sentences before.
[69] Children in the intransitive conditions of all experiments described in this chapter were at chance, i.e., did not show a significant preference for either the intransitive or the transitive action. I will review the justifications given by the preceding literature when I state the predictions for the present experiment.

Although Dautriche et al.'s experiment did not show any difference in looking time towards the causal action between the left-dislocated and transitive conditions, a change in the structure of the dialogues in de Carvalho (2017) led children in the left-dislocated condition to interpret the novel verb differently from children in the simple transitive condition. Since the failure of children in Dautriche et al. to tell apart these two types of sentence was attributed to the use of a simpler noun-counting strategy to infer the novel verb's transitivity, de Carvalho added one-DP sentences [70] (e.g. la maman dase) to the dialogues from Dautriche et al. for both the transitive and left-dislocated groups. The rationale was that this would make children realize that the number of nouns in the sentences was not a reliable cue for the verb's transitivity, and that they should pay attention to the syntactic structure in order to figure out the verb's possible meaning. Indeed, children in de Carvalho's study interpreted the novel verb in left-dislocated sentences like the children who listened to it in simple intransitive sentences, and differently from children in the simple transitive condition.

De Carvalho's (2017) experiment shows that children can rely on prosodic boundary cues for sentence parsing even in a verb-learning situation without simultaneous visual context. This suggests that prosodic cues can be a reliable source of information for lexical and syntactic acquisition, as children may rely on them even when they do not previously know the argument structure of the main verb of the sentence. The present experiment aimed to extend these results, investigating whether children can rely on prosodic boundary information in the task described when exposed to a different type of structure, namely stripping versus transitive sentences.
We showed 30- to 42-month-old [71] French-learning children dialogues containing either transitive + aussi sentences or stripping sentences similar to the ones in Experiment 1 (i.e., without a realized coordination operator) with the novel verb daser (see example (41) below), and then showed the same videos created by Yuan & Fisher (2009) in the test trial, accompanied by an audio stimulus that asked children to "look at the one who dases" (Regarde celle qui dase !). We expected children who listened to the novel verb in transitive sentences to look significantly longer towards the causal action than children who listened to it in stripping sentences. Children in this last group should be more likely to interpret the novel verb as an intransitive action, since the stripping test sentences do not present an object.

[70] Notice that here we are not talking about null object sentences, but about sentences that do not have an object, since the novel verb still has no transitive or intransitive interpretation associated with it. However, the children who were exposed to these sentences along with transitive sentences must have interpreted them as null object (null clitic) sentences, since they behaved similarly to children from Dautriche et al.'s experiment, who only listened to transitive sentences.
[71] Since our sentences were more complex than the ones in the previous experiments using the same methodology, we chose to test slightly older children. The chosen age range is larger than for the previous experiments due to time constraints; if we had chosen a smaller age range, it would have taken longer to reach the preregistered number of participants.

41) a) La maman a dasé le bébé aussi ! – Transitive: "the mommy dased the baby too!"
    b) La maman a dasé ! Le bébé aussi ! – Stripping: "the mommy dased! The baby too!"

4.1 METHOD

The method, analysis, and criteria for exclusion of participants were preregistered in the OSF (Open Science Framework) database. The formal preregistration, materials, collected data and analysis scripts are freely available to readers through the following link: https://osf.io/f2hmd/?view_only=3da6a9a37cbc4283925f28cf8b042c11

4.1.1 Participants

For the Preferential Looking task, fifty-seven children aged 30 to 42 months (twenty-eight in the stripping condition (Mage = 35.6 months, range 30.5 to 41.2, 18 girls), and twenty-nine in the transitive condition (Mage = 34.8 months, range 30.2 to 40.9, 12 girls)) were included in the final analysis. All children were monolingual native French speakers and were exposed to other languages less than 30% of the time at home.

An additional nineteen children were tested but their data were not included in the analysis, due to: insufficient eye data (15 children) [72]; experimenter error (1 child) [73]; refusal to finish the experiment (1 child); parental intervention (1 child); and talking during the test trial (1 child). The children whose data were discarded because of insufficient eye data did not have their pointing data discarded.

[72] As for Experiment 1, all trials with less than 75% of eye data (i.e. in which children looked away and/or the eye-tracker could not track the eye for 25% or more of the time in the trial) were excluded from the statistics. The high number of exclusions compared with the first experiment stems from the fact that this experiment has only one test trial, while the first one had four (and so we had a better chance of getting at least some good trials per child).
[73] The eye-tracker lost calibration and the experimenter had to enter the cabin during the test trial, distracting the child.
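The exclusion criterion for insufficient eye data can be illustrated with a minimal sketch. It is not the preregistered analysis script (which is available through the OSF link above); the data layout (one list of booleans per trial, `True` when gaze was tracked on screen) and all function names are hypothetical. It also assumes that a trial with exactly 75% of valid samples is kept, a boundary case the criterion leaves open.

```python
from typing import Dict, List

# Threshold from the exclusion criterion: trials with less than 75% of eye data are dropped.
MIN_VALID_PROPORTION = 0.75


def valid_proportion(samples: List[bool]) -> float:
    """Share of samples in which the eye-tracker recorded gaze (hypothetical layout)."""
    if not samples:
        return 0.0
    return sum(samples) / len(samples)


def keep_trial(samples: List[bool], threshold: float = MIN_VALID_PROPORTION) -> bool:
    """Keep a trial when its valid-data proportion reaches the threshold (assumed inclusive)."""
    return valid_proportion(samples) >= threshold


def included_children(trials_by_child: Dict[str, List[List[bool]]]) -> List[str]:
    """Children who contribute at least one retained trial."""
    return [
        child
        for child, trials in trials_by_child.items()
        if any(keep_trial(trial) for trial in trials)
    ]


example = {
    "child_a": [[True] * 80 + [False] * 20],  # 80% valid: retained
    "child_b": [[True] * 60 + [False] * 40],  # 60% valid: excluded
}
print(included_children(example))
```

With a single test trial per child, as in this experiment, one bad trial removes the child from the looking analysis altogether, whereas with four trials a child is removed only when all four fail.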


For the pointing task, thirty-five children were included in the final analysis. This number is smaller than the one for the Preferential Looking data because many children refused to point. From this group, fourteen children were in the stripping condition (Mage = 35.2 months, range 30.6 to 41.2, 9 girls) and twenty-one were in the transitive condition (Mage = 34.9 months, range 30.2 to 40.9, 12 girls). An additional two children pointed but had their data discarded because of experimenter error (1 child) and parental intervention (1 child).

4.1.2 Materials

The stimuli for this experiment consisted of two dialogues per condition, in which two native French speakers uttered sentences containing the novel verb daser. We also used the two novel action videos from Yuan & Fisher (2009) for the test trial. Each dialogue contained two target sentences (i.e., simple transitive + aussi sentences (for the simple transitive condition) or stripping sentences (for the stripping condition)) and two one-DP sentences. The speaker with the most training on prosody, Anne Christophe, uttered the target sentences, while the other speaker, Cécilia Jubin, uttered the one-DP sentences. Since we needed to create an acceptable pragmatic context for the simple transitive sentences (as observed for Experiment 1, one cannot say "the mommy dased the baby too" without a pragmatic or linguistic antecedent where the action happened before with different participants), the first uttered sentence in the dialogues was a one-DP sentence, which was followed by the first target sentence. Figure 20 shows an example of dialogue.


Character A: Hé, tu sais ce qu'il a fait, le bébé ? Il a dasé !
Character B: Oui, le bébé a dasé (!) la maman aussi !
Character A: C'est vrai ! Et le garçon ?
Character B: Le garçon a dasé (!) la fille aussi !
Character A: Vraiment ? Il a dasé !

Figure 20 - Example of dialogue. The exclamation points in parentheses indicate the prosodic boundary that is present in the stripping but absent in the simple transitive dialogues.

For the test trial, one of the videos showed a causal action (i.e., a girl swinging another girl's leg back and forth (see left image in Figure 21)) and the other one showed an intransitive action (i.e., a girl making circles with her own arm (see right image in Figure 21)).

Figure 21 - Videos of the test phase of Experiment 2.

The dialogues were recorded in a sound-attenuated booth, using a camera, an audio recorder and a condenser microphone. The test sentences were judged by twenty-one French-speaking adults via an online form made with Google Forms; participants were asked to listen to the sentences through headphones and to choose the best interpretation for them on a 5-point scale, 1 being the transitive interpretation of the sentences ("The baby has dased the mommy", for instance), 5 being the stripping interpretation ("The baby has dased and the mommy has dased"), and 3 being completely ambiguous. The sentences were interpreted correctly 96% of the time (i.e., the stripping sentences were interpreted as having two agents, and the transitive sentences were interpreted as having one agent and a patient) [74].

4.1.2.1 Acoustic analysis

As for the first experiment, we conducted acoustic measurements around the positions of the intonational phrase boundaries in both types of sentence. We analyzed the presence and length of pauses; lengthening of the stressed vowel of the word that precedes the boundary (i.e., for the verb, in the stripping sentences, and for the first DP, in the simple transitive sentences); and pitch (F0) contour at the end of the boundary. For the last two measures, we compared the two types of sentence through ANOVAs using the z-score value of duration and pitch.

For the first boundary, between the first DP and the verb, we found no difference in the first DP's stressed vowel length (M = .62; SD = .91 for the simple transitive condition, versus M = .11; SD = .79 for the stripping condition; t(6) [75] = -.84; p = .4) and no pause between this DP and the verb. For the second boundary, between the verb and the second DP, we found a significant difference in the lengthening of the verb's stressed vowel between conditions, indicating that the stripping sentences had more vowel lengthening (M = 2.78; SD = 1.78), whereas the simple transitive sentences presented no lengthening (M = -1.01; SD = .22; t(6) = 4.22; p < .01). The stripping sentences also presented a consistent pause between the verb and the second DP, of .523 s on average. As for the first experiment, we did not find any difference in pitch contour between conditions (M = 1.30; SD = .22 for simple transitive sentences, and M = .80; SD = .77 for stripping sentences; t(6) = -1.24; p = .3). The raw values of duration can be found in Table 10, and the raw values of F0 can be found in Table 11.

[74] There were 3% of wrong judgements and 1% of correct but slightly ambiguous judgements, all for transitive sentences.
[75] The degrees of freedom for these analyses are different from the ones for the acoustic analyses of Experiment 1 because in this experiment there are only four sentences per condition, whereas in Experiment 1 we had eight sentences per condition.

We also found a significant difference in the second DP's stressed vowel lengthening, indicating that this vowel was lengthened in simple transitive sentences, but not in stripping sentences (M = 3.41; SD = 1.61 for the transitive condition, versus M = -.53; SD = .62 for the stripping condition; t(6) = -4.57; p < .01). As for the sentences in Experiment 1, this indicates a prosodic phrase boundary between the second DP and aussi for the transitive sentences, but not for the stripping sentences. However, we did not find a significant difference in pitch contour (M = 1.80; SD = .51 for the simple transitive condition, versus M = .56; SD = .49 for the stripping condition; t(6) = -1.57; p = .2).

Table 10 - Raw duration values for the acoustic analysis of test sentences of Experiment 2

Stripping sentences: e.g. [le bébé a dasé] [la maman aussi]. Transitive sentences: e.g. [le bébé] [a dasé la maman aussi].

                                                       Stripping sentences        Transitive sentences
Measure                                                Duration (s)    SD (s)     Duration (s)    SD (s)
1st DP's stressed vowel (e.g. béb[é])                  0.12            0.03       0.14            0.02
Verb's stressed vowel (e.g. das[é])                    0.249           0.08       0.086           0.01
Verb's preceding vowels (mean)                         0.131           0.01       0.124           0.01
Verb's stressed vowel - Verb's preceding vowels        0.118           0.08       -0.038          0.01
2nd DP's stressed vowel (e.g. mam[a]n)                 0.119           0.03       0.237           0.02
2nd DP's preceding vowels (mean)                       0.14            0.01       0.112           0.01
2nd DP's stressed vowel - 2nd DP's preceding vowels    -0.021          0.02       0.125           0.02

Table 11 - Raw F0 values for the acoustic analysis of test sentences of Experiment 2.

Stripping sentences: e.g. [le bébé a dasé] [la maman aussi]. Transitive sentences: e.g. [le bébé] [a dasé la maman aussi].

                                                       Stripping sentences        Transitive sentences
Measure                                                F0 (Hz)         SD (Hz)    F0 (Hz)         SD (Hz)
Verb's stressed vowel (e.g. das[é])                    325             12         318             22
Verb's preceding vowels (mean)                         284             25         245             8
Verb's stressed vowel - Verb's preceding vowels        42              36         73              20
2nd DP's stressed vowel (e.g. mam[a]n)                 307             78         364             17
2nd DP's preceding vowels (mean)                       276             9          267             10
2nd DP's stressed vowel - 2nd DP's preceding vowels    31              76         97              13

Figure 22 - Soundwave and pitch for one stripping sentence (Le bébé a dasé ! La maman aussi !, top), and one simple transitive sentence (Le bébé a dasé la maman aussi !, bottom) for Experiment 2.

4.1.3 Procedure76

The testing procedure is similar to that of Experiment 1. The experiment started with two training trials, where children saw two pairs of familiar actions (i.e., sleep vs. push and carry vs. walk) and were asked to look at one of the actions (e.g. regarde celui qui porte ! ("look at the one who carries!")). These trials were displayed to show children they would see two different actions on each side of the screen, but only one of the actions was going to be named. The named action was an intransitive action in one of the training trials and a transitive action in the other, so children would not think that they would only be asked to look for the transitive or intransitive action during the experiment.

After the training trials, two dialogues were displayed, followed by the test trial. At the end of the test trial, the two videos froze on the screen and the experimenter entered the cabin to ask children to point. She would kneel close to the children, to their right, and ask them to "show the one who dases" (tu peux me montrer celle qui dase ?). In order not to influence children's responses, the experimenter would look at the child, and not at the videos; if children hesitated to point, she would insist by saying she had not heard because she had been outside the cabin, and afterwards that their parent had not heard because he/she was wearing headphones. If that did not work, then she would point at both videos herself, while asking "is it this one or this one?". Parents were not required to keep the headphones on for this part, and some parents spontaneously removed them. Sometimes they would also encourage children's pointing, by asking them to "answer the lady", and, less often, by asking them to show "the one who dases". Although this was not requested, it was also not forbidden, as it did not seem to influence the results (other than perhaps increasing the chances that children would point), since the parents did not know the answer and so could not indicate the correct video to the child.

The left-right disposition of the videos was pseudo-randomized, so children did not see more than two consecutive target videos on the same side. The order of presentation of training trials was also randomized, to avoid possible order biases in children's responses, but "walk" was always paired with "carry" and "sleep" was always paired with "push". The pointing responses were recorded by the experimenter through a keyboard, and the eye data were collected via an Eyelink 1000 eye-tracker. The experiment took about 3 minutes but could take longer depending on how long children took to give a pointing response and to look at the fixation circle during the fixation trials.

After the experiment, children took home a diploma of "honorary member of the Babylab" as a gift for their participation.

76 For a detailed view of the experiment structure, see Table 15 in Appendix F.

[Figure 23: timeline of a test trial, in order of presentation]
5 s: Preview phase, "Oh, regarde! Tu vois ça?"
1 s
5 s: Preview phase, "Oh, regarde! Tu vois ça?"
1 s
5 s: Contrast phase, "Oh, et là, regarde! Tu as vu ?"
3 s: Informative audio prompt, "Attention: Regarde celle qui dase!"
8 s: Test phase, "Tu la vois, qui dase? Regarde celle qui dase!"

Figure 23 - Structure of a test trial for Experiment 2. The clock icon followed by a number illustrates for how many seconds each video/gray screen appeared; it was not present in the actual videos.


4.1.4 Data analysis

The eye-tracking data were analyzed exactly as for Experiment 1. We performed a cluster-based permutation analysis, complemented by a t-test with the averaged overall looking time towards the causal action as the dependent variable.

For the pointing responses, since we ended up with unbalanced samples (i.e., we had many more children in the transitive than in the stripping condition), we could not perform a mixed effects regression. Instead, we ran a bootstrap analysis on the average proportion of pointing per child. In this type of analysis, each participant’s condition label is shuffled a fixed number of times (for our experiment, we ran 10 000 permutations), and we then determine whether the observed difference between the means of the two conditions is larger than what would be found by chance. We also performed one-sample t-tests, to see if the average proportion of pointing per condition differed significantly from chance. For this analysis, we considered μ = .5 (i.e., chance is 50% of pointing towards the causal action) and we divided the standard α = .05 by two to correct for the analysis of the conditions separately.
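The label-shuffling procedure described above can be illustrated with a minimal sketch. This is not the analysis script used for the thesis: the function name `permutation_p_value`, the two-sided test, the fixed random seed, and the reconstruction of the per-child responses (15 of 21 and 7 of 14 children pointing to the causal action, inferred from the reported 71% and 50%) are all assumptions made for illustration.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference between two group means.

    Condition labels are shuffled n_permutations times; the p value is the
    share of shuffles whose absolute mean difference is at least as large
    as the observed one (with the usual +1 correction).
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    pooled = list(group_a) + list(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign children to conditions at random
        mean_a = sum(pooled[:n_a]) / n_a
        mean_b = sum(pooled[n_a:]) / (len(pooled) - n_a)
        if abs(mean_a - mean_b) >= observed - 1e-12:
            extreme += 1
    return (extreme + 1) / (n_permutations + 1)


# Assumed per-child responses: 1 = pointed to the causal action, 0 = other video.
transitive = [1] * 15 + [0] * 6   # 21 children, about 71% causal
stripping = [1] * 7 + [0] * 7     # 14 children, 50% causal

p_value = permutation_p_value(transitive, stripping)

# Alpha level for the two one-sample tests, corrected as described in the text.
alpha_corrected = 0.05 / 2
```

Because the test direction and the random seed are choices made for this sketch, the p value it returns need not equal the one reported in the results.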

4.1.5 Experimental hypothesis and predictions

Our hypothesis was that children can rely on prosodic boundary information to interpret stripping sentences differently from transitive sentences even when confronted with a novel verb in a semantically impoverished context. We also predicted that they can use this information to make hypotheses about the meaning of the novel verb, as seen in previous experiments (e.g. ARUNACHALAM & WAXMAN, 2010; DE CARVALHO, 2017). We expected that children exposed to the novel verb in stripping sentences would perform differently from children exposed to it in transitive sentences, looking/pointing significantly less towards the causal action. This would show that children who hear the novel verb in transitive contexts are more likely to interpret it as a causal action than children who hear it in stripping sentences.

Previous studies using this methodology have shown that children who hear a novel verb in intransitive sentences do not look longer towards the one-agent action in the test phase (e.g. YUAN & FISHER, 2009; ARUNACHALAM & WAXMAN, 2010; NAIGLES & KAKO, 1993), but are rather at chance. Naigles & Kako state two possible explanations for this. First, they point out that a one-participant action can also be contained in an action with more than one participant. The transitive video, for instance, also contains intransitive actions such as "smile" or "bend". If children think that there is more than one possible referent for the intransitive novel verb in the test videos, then they might look towards both two-participant and one-participant videos for a similar amount of time. On the other hand, transitive actions could also be contained in one-participant actions; the intransitive video also contains actions such as "rotate the arm". Another explanation would be that children are biased towards the intransitive action, and so any effect of the test sentence is masked by the effect of their bias. Regardless of which explanation is correct, we also do not expect children in the stripping condition to look longer towards the one-agent action, but only to prefer the causal action significantly less than children in the transitive condition.

As for Experiment 1, if children show the predicted looking pattern while not showing the predicted pointing pattern or vice-versa, this would still suggest that they can tell apart stripping from transitive sentences, but we would need to investigate the reasons why they failed to show this ability in all of the measured behaviors. If we do not find any difference in looking or pointing behavior between conditions, this would suggest that children did not interpret the stripping sentences differently from transitive sentences. In turn, this would suggest that the greater complexity of this task (compared to the task in Experiment 1) hinders French-learning children's ability to use prosodic boundary information to parse these sentences, and also that French-learning children have more trouble interpreting stripping sentences than the left-dislocated sentences in de Carvalho (2017).

4.2 RESULTS

4.2.1 Eye-tracking data

The cluster-based analysis did not find any significant time clusters where we could observe a difference between groups. Furthermore, the t-test shows no significant difference in the average proportion of looking time towards the causal action between the stripping (M = .59; SD = .19) and the transitive condition (M = .54; SD = .16; t(51.37) = .921; p = .36; Cohen's d = .25).

As we can see in Figure 24 below, children from both conditions present an almost identical looking pattern during the entire test trial. The proportion of looking time is slightly above .5 during most of the trial, except for a time window that coincides with the beginning of the first test sentence (tu la vois qui dase ?). Figure 25 shows the average looking time towards the causal action for the stripping (green boxplot) and the transitive (orange boxplot) condition. We can see a great variability in the proportion of looking time between subjects (each purple dot in the figure represents the data of one subject), with similar ranges for both conditions.

Figure 24 - Proportion of looks towards the causal action through the whole test trial (8 seconds) in the stripping (green line) and transitive (orange line) condition for Experiment 2. The average time of appearance of the test sentences is indicated in the gray boxes below the graph.

Figure 25 - Average proportion of looking time towards the causal action in the stripping (green box) and transitive (orange box) condition for Experiment 2. Each purple dot represents the average looking time for one child. The horizontal gray line indicates the proportion of looks if participants looked towards the target during half of the trial (.50). The white dashed lines show the mean for each condition (.58 for the stripping condition and .54 for the transitive condition).

4.2.2 Pointing data

Half of the children in the stripping condition and 71% of children in the transitive condition pointed towards the causal action. However, the bootstrapped analysis showed no significant difference between conditions (p = .17). We also performed a one-sample t-test to see if the transitive group differed from chance (which we assumed to be 50%). The result, t(20) = 2.121, p = .046, does not reach significance at the corrected alpha level and is therefore only marginally significant77.

77 α = .025. There was no point in analyzing the stripping condition, since the proportion of pointing was exactly 50%.
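The one-sample t statistic for the transitive group can be recomputed by hand as a check. The sketch below assumes that each of the 21 children in the transitive condition contributed a single binary pointing response and that 15 of them pointed to the causal action (15/21 ≈ 71%); these per-child values are a reconstruction, and the helper `one_sample_t` is a hypothetical name introduced here.

```python
import math


def one_sample_t(values, mu):
    """Return the one-sample t statistic and its degrees of freedom."""
    n = len(values)
    mean = sum(values) / n
    # Unbiased sample variance (denominator n - 1).
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    standard_error = math.sqrt(variance / n)
    return (mean - mu) / standard_error, n - 1


# Assumed responses: 1 = pointed to the causal action, 0 = pointed to the other video.
transitive_pointing = [1] * 15 + [0] * 6

t_stat, df = one_sample_t(transitive_pointing, mu=0.5)
print(round(t_stat, 3), df)  # 2.121 20
```

Under these assumptions the statistic agrees with the reported t(20) = 2.121. The p value is not computed here because the Python standard library has no t distribution; a routine such as `scipy.stats.ttest_1samp` could be used for that.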


Figure 26 - Average proportion of pointing towards the causal action for the stripping (green bar) and transitive (orange bar) condition for Experiment 2. The gray horizontal line marks 50% of pointing.

4.3 DISCUSSION

The main goal of this experiment was to investigate French-learning children's ability to rely on prosodic boundary cues to correctly interpret stripping sentences in a semantically impoverished context, where they do not previously know the argument structure of the verb, and also in the absence of simultaneous visual context. Since Experiment 1 showed that, from 28 months to three years of age, children can correctly interpret stripping sentences differently from simple transitive sentences based on their prosodic boundary cues when there are more semantic cues available (i.e. verb meaning and visual context), we hypothesized that 30-42-month-olds would also be able to make correct predictions about a novel verb in stripping sentences, and would interpret it differently from children who listened to the same verb in transitive sentences.

Our results, however, could not corroborate this hypothesis. For the looking data, children from both conditions presented an almost identical looking pattern. This might suggest that children in the stripping condition failed to use the prosodic boundary cues between the verb and the second DP to correctly interpret the heard sentences, and instead relied on the number of DPs accompanying the sentences to interpret them as transitive sentences, as they did for the novel verb experiment in Dautriche et al. (2014). Another possibility is that they were not able to exploit the test sentences to infer something about the novel verb’s meaning. In order to choose between these two alternative explanations (i.e., that children interpreted the stripping sentences as transitive sentences, or that they did not understand the task at all) it would be necessary to conduct other control experiments, such as showing another group of children intransitive-only dialogues (e.g. "The baby dases") and comparing their results with the ones for children who listened to transitive and stripping dialogues. If we found a significant difference between the stripping condition and the new intransitive-only condition, this would suggest that children in the stripping condition do not interpret the novel verb as intransitive, and so might be resorting to the DP-counting strategy hypothesized in Dautriche et al. (2014). If, however, there were no significant difference between the three conditions (stripping, transitive and intransitive-only), we could hypothesize that children did not understand the task, which would be surprising, given the fact that this methodology has already been successfully used in studies with even younger children.

Another question that remains is why children in the present experiment did not behave like the ones in de Carvalho (2017), who succeeded in interpreting the novel verb differently when listening to it in left-dislocated sentences than in transitive sentences. One explanation would be that children seem to have a harder time interpreting stripping sentences than left-dislocated sentences, since our 28-month-olds in Experiment 1 did not perform as well in the task as Dautriche et al.'s participants of the same age. 
Although this discrepancy between results might have been caused by methodological differences between the experiments, as discussed in sections 3.2.3 and 3.4, it is also possible that there is a difference between French-learning children's ability to interpret stripping and left-dislocated sentences through prosodic boundary information. If this is true, then perhaps the greater complexity of the present experiment, which involves the interpretation of an unknown verb, along with the possibly more difficult task of interpreting stripping sentences, made children choose to stick to the DP-counting strategy even in the presence of conflicting evidence (i.e. the one-DP sentences in the dialogues, which also contained the novel verb).

If the apparent greater difficulty of parsing stripping sentences had an effect on children's performance in the experiment, a possible follow-up study would be to run it with older children and see if they perform better than the younger ones. However, it is possible that older children, who have already acquired a good number of lexical items, might behave similarly to adults in this task. Some attempts to run pilot studies with this methodology with adults78 have not returned good results, as mature speakers seem not to understand the task and to choose the referent of the novel verb randomly. This might be because adults are no longer used to learning the meaning of novel verbs solely by paying attention to the syntactic structure they appear in. This is not to say that they are unable to do so, but they seem to have more trouble doing it in an experimental context where they are exposed to only a few short sentences containing the verb. So if this experiment were to be conducted with older children and they also failed to perform as expected, this might mean either that they perform as the younger children, or that they already perform like adults in the task.

Another interesting follow-up study would be to design a new experiment adding other types of cues which might make the task easier for 30-42-month-olds. Besides the experiment with left-dislocated vs. transitive sentences inside dialogues, De Carvalho et al. (2017) also conducted another experiment where children were presented with transitive or left-dislocated sentences while simultaneously watching two videos side-by-side, one showing a novel causal action, and another one showing a novel one-participant action. Half of the children listened to sentences such as Regarde ! Elle dase la fille ! ("Look! She dases the girl!"), and the other half listened to sentences such as Regarde ! Elle dase, la fille ! ("Look! She dases, the girl!"), in four trials with four different novel verbs and eight novel actions (one novel causal action and one novel intransitive action per trial). Their goal was to investigate whether children would change their DP-counting strategy to figure out the meaning of the novel verb if they had another type of cue available, i.e., the semantic context provided by the test videos. 
Their results showed that children who listened to left-dislocated sentences looked significantly less towards the causal actions than children who listened to the transitive sentences, showing that they relied on prosodic boundary information to correctly tell apart left-dislocated from transitive sentences. This methodology could also be applied to our stripping sentences and could potentially yield better results than the ones found for the present experiment, as it does not involve parsing several different sentences to hypothesize about the meaning of an unknown verb, but only pairing the sentence heard to the immediate semantic context.

For the pointing data, although the results go in the expected direction (children who listened to the novel verb in transitive sentences pointing more towards the causal action), we could not find a significant difference in the proportion of pointing between conditions, and the transitive condition only marginally differed from chance (p = .046, for α = .025). However, these data were hard to analyze, since very few children wanted to point, and the two conditions were unbalanced (we had 14 children in the stripping and 21 children in the transitive condition). The fact that more children in the transitive condition were willing to point might also mean that they were more confident in their answers than children in the stripping condition, which suggests that the transitive sentences were easier to interpret. Although we cannot conclude from these results that children in the transitive condition interpret the novel verb differently from children in the stripping condition, the fact that the results go in the expected direction might suggest that children perform better in the pointing than in the looking task for this experiment, perhaps because they take longer to decide which test video better corresponds to the novel verb. To test this hypothesis, it might be interesting to replicate this experiment including more test trials, so we can see if children improve their understanding of the test sentences with longer exposure to them.

78 De Carvalho, personal communication.

In sum, the results of the present experiment failed to show children's ability to correctly interpret stripping sentences differently from transitive sentences with aussi in dialogues with sentences containing an unknown verb. This suggests that although young French-learning children correctly interpret stripping sentences through prosodic boundary cues when listening to sentences with known verbs, they cannot do so in a verb-learning situation in a semantically impoverished context, at least not in the age range tested.

The final question we will address here is what children understand when they listen to stripping sentences. Experiment 1 has shown that they can correctly interpret stripping sentences differently from transitive sentences, but it cannot answer, by itself, if they understand the ellipsis in these sentences. To address this, we designed an experiment aimed at investigating children's understanding of the identity relations underlying stripping sentences. We describe this experiment in the next chapter.


EXPERIMENT 3: COMPREHENSION OF STRIPPING SENTENCES

This last experiment is aimed at a different question from the ones described before: do French-learning children understand the ellipsis behind the stripping sentences from Experiment 1? As explained in section 2.2, we assume that stripping sentences like (42a) are derived via deletion of the TP of the second conjunct (i.e. the second bracketed phrase) at PF, which is licensed by a functional category that also triggers the movement of a remnant DP to its specifier position. For this sentence to be correctly interpreted, children need to know that the deleted TP shares an identity condition with a linguistic antecedent, which, in the case of sentences such as (42a), is the TP of the first conjunct.

42) a) [Le canard tape]! [Le singe Δ aussi]! – Δ = [TP tape] The duck hits! The monkey too!

b) Le canard tape ! Le singe cali !

Although there are several studies on young children's comprehension and production of ellipsis (e.g. SANTOS, 2006, 2009; LOPES, 2009; LOPES & SANTOS, 2014; POSTMAN ET AL., 1997; FOLEY ET AL., 2003; GUO ET AL., 1996), there have not been, to our knowledge, many on the acquisition of stripping sentences, besides Santos' (2006, 2009) and Kato's (2012, apud Martins, 2016) studies on the production of stripping in verbal answers to yes-no questions. Although there is no a priori reason to believe that French-learning 3-4-year-olds are not able to understand stripping sentences, since children of similar age or younger were shown to understand other types of ellipsis, some peers have questioned whether children in Experiment 1 in fact understood our stripping test sentences or were simply interpreting them as two-agent sentences, without necessarily knowing which action the second agent should be performing. In order to address this question, we presented children with sentences such as (42a), while they saw two videos playing side-by-side: in one video, the two mentioned characters were performing the same action, whereas in the other video the first character performed the named action, and the second character performed a novel action. We believe that if children understand the identity relations underlying stripping constructions, they should look longer towards the video where both characters perform the same action; if they do not, they should either look longer towards the non-congruent action (because it is novel and therefore more interesting), or be at chance.


As a control, we presented another group of children with the same test videos, but instead of stripping sentences, they listened to sentences where the adverb aussi was replaced by the novel word cali (see 42b). Since cali has a phonological form which is compatible with a verb, and is in a position that is typically associated with a VP, we expected children to infer that it might be a novel verb referring to the novel action, and therefore to look longer towards the video where the second-mentioned character performed the novel action.

5.1 METHOD

The method, analysis, and criteria for exclusion of participants were preregistered in the OSF (Open Science Framework) database. The formal preregistration, materials, collected data and data analysis scripts are freely available to readers through the following link: https://osf.io/r3k6t/?view_only=fc25edf71a3947ed82b154ad0c8792d3.

5.1.1 Participants

Forty-one 36-to-50-month-old children (twenty in the control (cali) condition (Mage = 44 months, range 36.8 to 49.5, 7 girls) and twenty-one in the stripping (aussi) condition (Mage = 42.9, range 36.3 to 49.6, 10 girls)) were included in the analysis79. Children were tested in a sound-attenuated booth at the Babylab of LSCP. All children were monolingual native French speakers and were exposed to other languages less than 30% of the time at their homes. An additional set of nineteen children were tested but not included in the analysis, due to insufficient eye data (11 children); fussiness80 (7 children); and data loss (1 child).

79 Our preregistered design committed to analyzing the data of forty-eight children, twenty-four in each condition. However, external factors (i.e. social isolation measures taken due to the covid-19 pandemic) forced us to interrupt testing before reaching this number of participants. The remaining participants will be tested as soon as possible, and their data will be included in the analysis for future journal publications.

80 As the experiment was long, some children became fussy by the end and did not finish it.

5.1.2 Materials

Four pairs of videos were created, using the same puppet animals from Experiment 1 (except for the dinosaur, which was left out for being the animal least known by children according to parental reports in Experiment 1). While one of the videos displayed two puppet animals performing a known action alternately, the other displayed one of the animals performing the known action and the other animal performing a novel action (i.e. opening and closing both arms and legs repeatedly). As for Experiment 1, the known actions were pousser, porter, manger and taper. In order to simplify the videos, instead of using a third character as the patient of the known actions, we used inanimate objects: a spiked blue ball for taper; a white toy car for pousser; a gift box for porter; and a cake for manger81. This also guaranteed that the null clitic referents were all inanimate objects. The novel action was the same in all test videos.

Before the test videos, we showed children two introduction videos. One video aimed to familiarize children with the known action; children saw the character mentioned first in the test sentences performing the action on its own, and this action was named once (e.g. Regarde ! C'est le canard ! Il tape ! - "Look! It's the duck! He is hitting!"). The other video displayed the novel action performed by the second-mentioned character with a neutral audio (e.g. Regarde ! C'est le singe ! Il fait quoi ? - "Look! It's the monkey! What is he doing?") and was created to avoid making the novel action too novel and therefore too salient.

Figure 27 - Example of test videos for the verb "hit" from Experiment 3. Left: same-action video (the duck and the monkey hit a ball alternately). Right: different-action video (the duck hits the ball and the monkey performs a different action).

Figure 27 exemplifies the videos for the "hit" action; the image to the left illustrates the same-action video (i.e. the action described by the stripping sentence), while the image to the right illustrates the different-action video, where the first-mentioned character performs the named action and the second-mentioned character performs a novel action.

The test sentences were recorded by Anne Christophe in a sound-attenuated booth using a recorder and a condenser microphone. They were included in the videos using Filmora video editor. We recorded stripping sentences similar to the ones in Experiment 1 (imitating the same prosody, but with different characters, since not all of the characters could perform the novel action (e.g. the duck had wings, not arms) and we removed the dinosaur). In control sentences, we replaced the adverb aussi by the novel word cali, while keeping the same prosody as the stripping sentences. We chose to use the same novel word in all trials, to mimic the test sentences, which all ended with aussi. The word created had a phonology that was similar to aussi (two syllables, ending in 'i') so as not to favor a verb reading more than the adverb itself. Since -i is a possible verbal inflexion for third person in French, it could be interpreted as a verb. Furthermore, the fact that it was also a disyllabic word made it possible to replace the adverb without altering the prosody of the sentences, so the only difference between the test and the control sentences was the final word.

Since children in the control group saw the novel word paired with the novel action four times, we also decided to investigate if they would learn to map it onto the novel action. At the end of the experiment, children saw an extra trial, where one of the test videos displayed a character performing the novel action, and the other one displayed another character performing a different novel action: the character turns sideways while holding a plastic green lettuce. The characters chosen did not appear performing the novel action in the previous test trials, so children could not associate the novel word with a character. The audio stimulus for this trial asked children to look at the one who calis (regarde celui qui cali !). Children in the test condition also saw this trial and served as a control group, as they had not heard the novel word before, and so should be at chance82. This trial, however, will not be analyzed here, as it is an exploratory trial (the data analysis is underpowered, since we only have one test trial and children have a good chance of already being bored with the experiment by the time it starts) that does not contribute to our goals.

81 See Appendix B for more details.

82 Although they might infer the right action by mutual exclusion, since they only saw one of the novel actions before.

5.1.3 Procedure83

The procedure was the same as for Experiment 1, with the exception of the pointing trial, which was omitted, since it did not return significant results in Experiment 1 and we realized that the presence of the experimenter inside the cabin distracted children, causing more data loss. The experiment started with the introduction of the animal characters followed by the training trial with "push" vs. "play", exactly as in Experiment 1. This was followed by four test trials. Finally, the fifth trial tested if children learned to map the novel word to the novel action. Overall, the experiment took about seven minutes, but, as was the case for Experiments 1 and 2, it could be longer depending on how much time children took to look at the fixation circle during the intervals between the contrast and test trials.

83 For a detailed view of the experiment structure, see Table 16 in Appendix G.


[Trial timeline as recoverable from the figure labels. The duration shown by each clock icon precedes the phase name; a 2s gray screen follows each verb introduction and a 1s gray screen follows each preview.]
6s - Verb introduction phase: "Oh, regarde ! C'est le singe ! Il fait quoi ?"
6s - Verb introduction phase: "Oh, regarde ! C'est le canard ! Il tape !"
7s - Preview phase: "Oh, regarde ! Tu vois ça ?"
7s - Preview phase: "Oh, regarde ! Tu vois ça ?"
7s - Contrast phase: "Et là, regarde ! Tu as vu ?"
6s - Informative audio prompt: "Attention: Le canard tape ! Le singe aussi/cali !"
14s - Test phase: "Oh, regarde ! Le canard tape ! Le singe aussi/cali !" (2x)

Figure 28 - Structure of a test trial of Experiment 3. The clock icon followed by a number indicates how many seconds each video or gray screen was displayed; the icons were not present in the actual videos.

5.1.4 Data analysis

We conducted a cluster-based permutation analysis to look for time window(s) in which a significant effect of condition could be observed, indicating that children look more towards the same-action video when listening to the aussi (test) sentences than when listening to the cali (control) sentences. Such an effect may appear towards the offset of the sentence, since for this type of sentence the crucial information (the final word, aussi or cali, which determines what action the second-mentioned character is performing) comes at the end. However, since children hear the test sentence once before the test trial, and see the test videos once before hearing the test sentence (during the contrast trial), it is possible that they show a preference for the video corresponding to the heard sentence from the beginning of the trial. Therefore, we applied the cluster-based analysis to the entire duration of the trial, as we did in Experiments 1 and 2.

To analyze the averaged overall looking times, we ran a mixed effects model. We considered the maximal model allowing the structure to converge, starting with the proportion of looking time towards the same-action video as the dependent variable, condition (cali vs. aussi sentences) and phase (contrast vs. test) as independent variables, subject as a random slope and condition as a random effect with a random slope for item. We chose to preregister this analysis instead of the t-test because it is more informative, as it takes into account other factors that can influence the results, such as children's proportion of looks during the contrast phase (i.e., the phase where they saw the two test videos side-by-side before listening to the test sentences).
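
The logic of a cluster-based permutation analysis can be illustrated with a minimal sketch. It is not the analysis script used for this experiment: the Welch t statistic, the threshold of 2.0, the one-sided test (aussi > cali), the number of permutations and the simulated data are all illustrative assumptions, and every function name is hypothetical. Each child is represented by a list of looking proportions, one per time bin; adjacent bins whose t value exceeds the threshold form a cluster, and the largest observed cluster mass is compared with the masses obtained when children are randomly reassigned to conditions.

```python
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch t statistic for two independent samples (0.0 if no variance)."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def cluster_masses(group_a, group_b, threshold):
    """Sum t over each run of adjacent time bins where t exceeds threshold."""
    masses, current, in_cluster = [], 0.0, False
    # zip(*group) yields, for each time bin, the values of all children.
    for bin_a, bin_b in zip(zip(*group_a), zip(*group_b)):
        t = welch_t(bin_a, bin_b)
        if t > threshold:
            current, in_cluster = current + t, True
        elif in_cluster:
            masses.append(current)
            current, in_cluster = 0.0, False
    if in_cluster:
        masses.append(current)
    return masses


def cluster_permutation_test(group_a, group_b, threshold=2.0, n_perm=200, seed=0):
    """Return (largest observed cluster mass, permutation p-value)."""
    rng = random.Random(seed)
    observed = max(cluster_masses(group_a, group_b, threshold), default=0.0)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign children to conditions at random
        masses = cluster_masses(pooled[:n_a], pooled[n_a:], threshold)
        if max(masses, default=0.0) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


# Simulated data (assumption): 10 children per condition, 20 time bins.
# Only bins 8-12 carry a condition difference.
_rng = random.Random(1)


def simulated_child(boost):
    return [0.5 + (boost if 8 <= i <= 12 else 0.0) + _rng.uniform(-0.05, 0.05)
            for i in range(20)]


aussi = [simulated_child(0.3) for _ in range(10)]
cali = [simulated_child(0.0) for _ in range(10)]
observed_mass, p_value = cluster_permutation_test(aussi, cali)
print(observed_mass, p_value)
```

Because the permutation distribution is built from the largest cluster in each shuffled data set, the test can be applied to the whole trial without choosing a time window in advance, which is the property exploited above.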

5.1.5 Experimental hypothesis and predictions

Our working hypothesis is that 3- to 4-year-old French-learning children already understand stripping sentences, as stripping is a frequent construction in French and previous experiments show that children as young as 17 months old already produce and understand ellipsis sentences. Therefore, we expect that children who listen to the stripping sentences will look longer towards the same-action video than children who listen to the control sentences, who should prefer the video displaying the novel action (as we expect them to interpret cali as a novel verb).

However, if they do not understand the identity condition behind stripping sentences, children exposed to stripping sentences should either present the same looking pattern as children exposed to cali sentences, looking longer towards the different-action video (if they decide to interpret aussi as naming the novel action, or simply because they find this video more interesting), or look towards both videos for a similar amount of time. In either case, we would expect their looking behavior not to differ significantly from the behavior of children in the cali condition.


5.2 RESULTS

The cluster-based analysis did not find any significant time clusters in which we could observe a difference between children in the cali and aussi conditions. As we can see in Figure 29 below, the proportion of looks during the whole test trial seems to follow a similar pattern for both conditions, with a slight difference between 4000 ms and 6000 ms, where children in the aussi condition looked longer towards the same-action video than children in the cali condition. However, this difference was not significant (p = .16 for the time cluster between 4580 ms and 5680 ms, and p > .50 for all other time clusters found).

We also fitted a linear mixed effects model with proportion of looking time towards the same-action video as the dependent variable, condition (control vs. stripping) and phase (contrast vs. test) as independent variables, subject as random slope and condition as random effect⁸⁴. We found no effect of condition (β = .004, SE = .035, t(104.7) = .112, p = .911) or phase (β = .037, SE = .036, t(175.7) = 1.039, p = .300), and no interaction between condition and phase (β = .026, SE = .049, t(175.7) = .531, p = .596). Figure 30 shows the average proportion of looking time towards the same-action video per condition. As we can see, there is little difference between conditions, with an average proportion of looking time of .56 for the aussi condition and .52 for the cali condition. We can also see that the average proportion of looking time per subject (represented by the purple dots in the figure) is more widely spread in the cali condition, whereas it is more concentrated in the aussi condition, with only one subject below .45 and one subject above .70.

84 This was the maximal model that allowed the structure to converge.


Figure 29 - Proportion of looks towards the same-action video through the whole test trial (14 seconds) in the aussi (orange line) and cali (green line) conditions of Experiment 3. The average time of appearance for each word in the test sentences is indicated in the gray boxes below the graph (the upper boxes show the aussi sentences, and the lower boxes show the cali sentences).

Figure 30 - Average proportion of looking time towards the same-action video in the aussi (orange box) and cali (green box) conditions (Experiment 3). The horizontal gray line indicates the proportion of looks if participants looked towards the target during half of the trial (.50). The purple dots show the average looking time for each participant. The white dashed lines show the mean for each condition.


Exploratory analyses

Since, as seen in sections 2.4 and 3.1, the verbs chosen for the test sentences have different underlying structures, we also performed an analysis per item, to see whether there is a significant difference in children's performance between verbs. We performed a two-way ANOVA with average proportion of looking time towards the same-action video as the dependent variable and condition and item as independent variables, and found no interaction between condition and item (F(3, 125) = .303; p = .823) and no effect of condition (F(1, 128) = 1.203; p = .275), but a significant effect of item (F(3, 128) = 6.344; p < .001). A post-hoc Tukey test revealed that, for the verb porter (carry), the overall proportion of looking time towards the same-action video was significantly smaller than for the other three verbs⁸⁵. This may have been due to the fact that the same-action video for porter presented almost no movement (the two puppets carried a present each, while gently swinging from side to side), which made the different-action video, with its greater movement, much more salient. Perhaps because of this, porter was also the item that presented the largest difference between conditions (since the baseline looking towards the same-action video was so low, it left more room for children to show an increase of interest towards the different-action video in the cali condition).

Table 12 - Average proportion of looking time towards the same-action video for the cali and aussi conditions per item for Experiment 3.

            Condition
Verb        aussi    cali
Manger      .50      .49
Porter      .42      .33
Pousser     .56      .54
Taper       .57      .54
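
The per-item differences between conditions can be checked with a short calculation over the values of Table 12. The sketch below is only an illustration (the variable names are hypothetical, and the rounding to two decimals mirrors the precision of the table); it computes the aussi minus cali difference for each verb and identifies the verb with the largest difference.

```python
# Mean proportion of looking time towards the same-action video (Table 12).
table_12 = {
    "manger": {"aussi": 0.50, "cali": 0.49},
    "porter": {"aussi": 0.42, "cali": 0.33},
    "pousser": {"aussi": 0.56, "cali": 0.54},
    "taper": {"aussi": 0.57, "cali": 0.54},
}

# Difference between conditions per verb, rounded to the table's precision.
differences = {
    verb: round(props["aussi"] - props["cali"], 2)
    for verb, props in table_12.items()
}

# Verb with the largest aussi - cali difference.
largest = max(differences, key=differences.get)

print(differences)  # {'manger': 0.01, 'porter': 0.09, 'pousser': 0.02, 'taper': 0.03}
print(largest)      # porter
```

The difference for porter (.09) is three times the next largest (.03 for taper), although, as reported above, the interaction between condition and item was not significant.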

85 p = .04 for the comparison with manger; p = .004 for the comparison with pousser; p < .001 for the comparison with taper.


Figure 31 - Average proportion of looking time towards the same-action videos for the cali (green bars) and aussi (orange bars) conditions per item.

5.3 DISCUSSION

The results above show us that children exposed to stripping sentences did not look longer towards the same-action video than children exposed to control (cali) sentences. This suggests that they did not interpret the heard sentences differently from children in the control condition. However, it is worth noting that there is a tendency in the expected direction: the time-course analysis showed a time window in which children in the aussi condition looked slightly more towards the same-action video than children in the cali condition. Apart from the possibility that French-learning toddlers do not understand stripping sentences, which would contradict previous literature showing that children understand ellipsis from an early age, there are some other possible explanations for this pattern of results.

First, it is important to remember that we still need to test 7 more children in order to reach the pre-registered n for this experiment, and that the final results might differ slightly from the ones described here, perhaps with larger or smaller p-values for the analyses.


Second, notice that children who listened to the control sentences also did not behave as expected, since they did not prefer to look towards the video depicting the novel action. This could explain why there is no significant difference between conditions: if the children who listened to sentences with cali behaved at chance, we would expect a smaller difference between conditions, even if the children who listened to stripping sentences showed a preference for the same-action videos.

Why did children fail to interpret the novel word as naming the novel action? There are several possible explanations for this. One could think that they took some time to associate them; if this were true, then children in the cali condition should have looked longer towards the novel action in the last trials, compared to the first ones. However, there is not much difference in the proportion of looking time towards the same-action video between trials⁸⁶. One could also hypothesize that the novel action created was too similar to a known action (perhaps "hug" or "dance"), and so children had trouble associating a novel word with an action that was already named. Another possible explanation is that the prosody of the cali sentences, which was based on that of the stripping sentences, made it hard for children to understand that the sentences contained a novel verb. In a coordinated sentence with two distinct actions, such as "The tiger eats. The duck sleeps!", we would expect the second verb to be in contrast with the first one, and so to receive contrastive stress. However, in our control sentences, the novel verb was deaccented, just like the adverb in the stripping sentences. Furthermore, although -i is a possible verbal inflection in French, most verbs end either in a consonant or in -e in this language. So children may have failed to identify the novel word as a verb because of the prosody of the sentences and the use of a non-canonical verbal inflection.
Another possible explanation for children's performance in the experiment might be that children actually interpreted the cali sentences as stripping sentences. Although it is very unlikely that children mistook cali for aussi, perhaps they analyzed the sentences as stripping through prosodic boundary information before encountering the novel word and had trouble changing this first interpretation afterwards. It is very common for young children to stick to their first interpretation of sentences and not reanalyze them even in the presence of strong cues in favor of another interpretation; this behavior is described in the literature as the kindergarten-path effect (e.g. TRUESWELL ET AL., 1999; DE CARVALHO ET AL., 2017). If this were true, then it would provide a very interesting contribution to prosodic bootstrapping studies, by

86 For the cali condition: .55 for the first trial; .41 for the second trial; .50 for the third trial; and .43 for the fourth trial. For the aussi condition: .53 for the first trial; .50 for the second trial; .50 for the third trial; and .52 for the fourth trial.

showing that children can analyze the syntactic structure of stripping sentences through prosodic boundary information even before hearing the remnant word that cues the ellipsis (e.g., aussi).

However, one interesting observation suggests that at least some children treat cali and aussi differently: some children in the cali condition questioned their parents about the meaning of this word during the test trials, but no children in the aussi condition did. During the experiment, children talked to their parents (even though the parents could not hear, as they were wearing headphones), and some even talked to the screen, probably because they thought that there was someone handling the puppets behind it. We can access their comments and non-verbal behavior (such as pointing) by looking at the videos recorded during the experiment. Through these videos, we can observe that some children from the cali condition do seem to learn to map cali to the novel action. Three children responded to one of the introductions of the novel action, where the audio stimulus asked what the puppet was doing, with il cali ("he calis"); three other children pointed towards the cali action during the fifth trial, which presented the known novel action paired with a different novel action; and one child said c'est le tigre ("it's the tiger", which was the animal performing the known novel action) during this trial.

However, some other children were confused about the referent of the novel word. One child interpreted it as a novel verb but did not know which action it referred to, and asked her parent c'est qui qui cali? ("Who is caliing?"); another one seemed confused by the presence of an unknown word, as she asked her parent pourquoi il dit toujours cali? ("why is he always saying cali?"); and yet another one thought that Cali was the name of a character, and asked her parent c'est qui cali? ("Who is Cali?"). Another child, when seeing the introduction of the novel action for the third time, answered the narrator, who asked "what is he doing?", by saying il fait rien ("he is doing nothing").

The only comment on aussi was from a 37-month-old who was very fussy during the whole experiment (and did not complete it), and who asked "aussi quoi?" ("aussi what?") after the first test trial. Some children in the aussi group also commented on cali during the fifth trial: two children asked c'est quoi, cali? ("what is cali?"); one child pointed to the wrong video; and another child got it right and said c'est lui, le tigre... c'est le tigre qui cali ("it's him, the tiger… it's the tiger who calis")⁸⁷.

These anecdotal observations show us that at least some children treat cali and aussi differently, and some even learn to map cali to the novel action. This suggests that children

87 We assume that this child might have deduced the correct action by mutual exclusivity, since she had already seen the novel action corresponding to cali in the four preceding trials.

already know the adverb aussi and are not willing to attribute a new meaning to it, but can identify cali as a novel word and want to figure out its meaning. These observations may also help to explain the pattern of results found in the analysis of the overall looking time, where the data of participants in the cali condition are more widely spread than the data of participants in the aussi condition: some children in the cali condition seem to have mapped cali to the novel action, and so looked longer towards the different-action video, whereas others were at chance and did not show a preference for either video.

A further explanation for the results found would be that the sentences were not accepted as grammatical by children. As discussed in sections 2.4 and 3.1, French is a language in which referential null objects are highly marked. Since for three of the four chosen verbs (pousser, porter and taper) our stripping test sentences contained referential null objects, it is possible that children got confused by these sentences and failed to interpret them as expected. However, if this were true, we would expect the analysis per item to show differences in children's performance between these three verbs and manger, which allows for an unergative (i.e. intransitive) reading. As we have seen above, manger did not behave markedly differently from the other three verbs, and in particular, it did not show a trend towards the difference between conditions we were hoping to observe. One way of testing whether children's performance was hindered by the use of null object sentences would be to run a follow-up study using only intransitive actions and see whether children interpret the aussi sentences as expected.

In sum, the present experiment could not show children's comprehension of stripping sentences, as children exposed to these sentences did not differ significantly in their looking behavior from children exposed to the control sentences. Since the test sentences and visual stimuli for the stripping condition were very similar in Experiments 1 and 3, the difficulties children may have had in interpreting them in the present experiment may extend to Experiment 1. Therefore, the present experiment cannot rule out the possibility that children did not need to fully understand the test sentences to perform as expected in Experiment 1.

One could be tempted to conclude from these results that French-learning 3- to 4-year-olds do not understand stripping sentences, contradicting previous studies showing young children's understanding of several types of ellipsis. However, the discussion above offers several other possible explanations for the results found. First, we have not yet reached the pre-registered number of children per condition, and since the results show a time window where children in the aussi condition looked more at the same-action video than children in the cali condition, it is still possible that this time window increases in size and significance with the full n. Also, the preliminary results show that children in the control condition did not behave as expected, as they did not look longer towards the video displaying the novel action. This might have reduced the contrast with the stripping condition, which would explain the lack of a significant difference between conditions. The fact that we used the stripping prosody in the cali sentences also raises the possibility that children interpreted the cali sentences as stripping sentences, which would likewise explain the lack of a significant difference between conditions. Therefore, although we could not show children's comprehension of stripping sentences, this is more likely due to methodological issues than to children's inability to correctly interpret them.


SUMMARY AND GENERAL DISCUSSION

Prosodic bootstrapping theories propose that children rely on prosodic cues of speech (such as pauses, intonation and syllable lengthening) for the beginning of speech segmentation and language acquisition (e.g. CHRISTOPHE ET AL., 2008; MORGAN & DEMUTH, 1996). Several studies corroborate this hypothesis, by showing that infants are sensitive to prosodic regularities in the speech stream from their first days of life (RAMUS, 2002), and that even before knowing many words, they can detect violations of typical word and prosodic phrase structure (e.g. JUSCZYK ET AL., 1992, 199; HIRSH-PASEK ET AL., 1987). Other studies have also shown that infants can rely on prosodic boundary cues for word and sentence segmentation (e.g. GOUT, CHRISTOPHE & MORGAN, 2004; MILLOTTE, 2005; SHUKLA ET AL., 2011).

One important hypothesis in the prosodic bootstrapping literature is that children can use the prosodic information in prosodic phrase boundaries to delimit syntactic constituents, which would help them bootstrap syntactic acquisition (e.g. CHRISTOPHE ET AL., 2008). Since mature speakers rely on prosody for speech comprehension, and benefit from prosodic boundary information for the disambiguation of structurally ambiguous sentences (e.g. NESPOR & VOGEL, 2007), it is possible that young children also rely on these cues for syntactic parsing. One way of investigating this is to conduct experiments on syntactic disambiguation, to see whether children can tell apart structurally ambiguous sentences through their different prosodic phrasings. However, there are still few studies that show children's ability to disambiguate between syntactic structures through prosodic boundary cues (e.g. DE CARVALHO ET AL., 2013, 2014, 2015; DAUTRICHE ET AL., 2014; SNEDEKER & YUAN, 2008), and these are restricted to French and English and to a handful of structures. Furthermore, some studies have failed to find an effect of prosody in syntactic processing even in older children (5- to 6-year-olds; SNEDEKER & TRUESWELL, 2001; CHOI & MAZUKA, 2003).

The present work adds knowledge to this important subject by investigating young children's ability to parse stripping sentences, which, to our knowledge, have not been studied before in this type of task, although they appear in many languages. By studying a different syntactic structure, we aimed to increase the evidence in favor of the hypothesis above. Furthermore, we also intended to extend it to Brazilian Portuguese (BP), a language that has not been studied before with respect to young children's syntactic processing through prosodic boundary cues.


Our first experiment investigated French- and BP-learning children's ability to interpret stripping sentences differently from simple transitive sentences. They were presented with either stripping or transitive sentences containing known transitive verbs while watching two videos side-by-side, one depicting an interpretation of the stripping sentence (i.e. two-agent video) and another one depicting an interpretation of the simple transitive sentence (i.e. one-agent video). The results show that French- and BP-learning children from 3 to 4 years old interpret stripping and simple transitive sentences differently based on their different prosodic boundary cues: children who listened to stripping sentences looked significantly longer towards the two-agent videos than children who listened to simple transitive sentences. A similar but statistically weaker result was also found for 28-month-old French-learning children, suggesting that this ability is also present in younger children.

Our second experiment aimed to extend these results by investigating whether children could still rely on prosodic boundary information to tell apart stripping from transitive sentences in a verb-learning situation without the aid of semantic (visual) context. Children were presented with dialogues in which two women appeared talking on the screen, using the novel verb daser inside transitive or stripping sentences (e.g. Le bébé a dasé (!) la maman aussi !). Afterwards, they saw two videos playing side-by-side, one depicting a novel intransitive action and another one depicting a novel causal action, while the auditory stimulus asked them to "find the girl who dases" (Regarde celle qui dase !). This experiment was built on the assumption that children can use the syntactic context in which a novel word appears to hypothesize about its possible meaning, and can build on or adjust this hypothesis in view of new (semantic or linguistic) evidence, as shown by previous works (e.g. DE CARVALHO, 2017; ARUNACHALAM & WAXMAN, 2010). Our results, however, show that children in the stripping condition behaved similarly to children in the transitive condition, indicating that they either did not interpret the sentences from the dialogues differently or were not able to exploit them to infer something about the novel verb's meaning. This suggests that children who listened to stripping sentences failed to use the prosodic boundary cues to differentiate them from simple transitive sentences when confronted with a novel verb in a semantically impoverished context.

The third experiment aimed to investigate French-learning children's comprehension of stripping sentences. In Experiment 1, one could say that the only information children needed to know about the stripping sentences was that the intonational phrase boundary between the verb and the second DP indicated that this DP was the agent of a second sentence, and not the patient of the first sentence. This should be enough to lead them to look for the video where the two mentioned characters were agents. Since there was only one type of video where both characters were agents, and in these videos they always performed the same action, there is no way of knowing whether children really understood that the second agent should be performing the same action as the first one.

In order to investigate whether children really understood the stripping sentences used in Experiment 1, instead of just choosing the most plausible interpretation for the sentences heard, we created a new experiment where another group of children listened to stripping sentences while seeing two videos in which both mentioned characters were agents. In one video the two characters performed the named action; in the other, the first-mentioned character performed the named action and the second-mentioned character performed a novel action. As a control condition, we also presented another group of children with sentences in which we replaced aussi with the novel word cali (e.g. Le canard tape ! Le singe cali !). We expected that children in the control condition would look longer towards the different-action videos, as they should think that cali names a novel action. We also predicted that, if children understood stripping sentences, they should look longer towards the same-action videos, and so their looking pattern should differ significantly from that of children exposed to the control sentences.

The preliminary results for this experiment show no significant difference between conditions, although there is a tendency in the expected direction. Such a result may be interpreted as showing that children in both conditions interpreted their test sentences in a similar manner, and that children in the aussi condition did not interpret the test sentences as expected. Since the stripping sentences in this experiment were the same as those used in Experiment 1, we cannot rule out the possibility that children in both experiments were confused by stripping sentences, and that the good performance in Experiment 1 was not due to children's complete understanding of the sentences. However, it is unlikely that French-speaking children do not understand ellipsis, since stripping sentences do occur in the language, and children seem to understand and produce ellipsis from an early age (see section 2.3). It may thus be that methodological factors involving the choice of sentences and the comparison with the control condition explain the pattern of results found. An interesting hypothesis is that children in the control condition also interpreted their test sentences as stripping, which is why their results did not differ from those of children exposed to stripping sentences. Although more analyses are needed to corroborate this hypothesis, it may suggest that children can identify a sentence as stripping through its prosodic information even before hearing the word that cues the ellipsis, which would show that prosodic information is indeed a strong cue for stripping sentences.


The results above contribute to the corroboration of the hypothesis that young children can use prosodic information to parse syntactic structures. Following previous studies on prosodic bootstrapping, we claim that children in the age range studied know that prosodic phrasing correlates with syntactic phrasing to some extent and can use this knowledge to parse sentences. Regarding the prosodic and syntactic discussions in chapter 2, our results also show that by 28 months of age children know that a verb and an object DP cannot be separated by an intonational phrase boundary in their native languages: when children exposed to stripping sentences detect the boundary between the verb and the second DP in the sentences, they realize that this DP is not the object of the first sentence, but rather the subject of a new sentence, and so are able to interpret them differently from simple transitive sentences. This is in line with the prosodic bootstrapping model of Christophe et al. (2008), which proposes that children can construct a rudimentary syntactic structure of their mother tongue through the observation of prosodic information, suggesting that they learn to associate syntactic and prosodic structure from an early age.

However, as shown in Dautriche et al.'s (2014) experiment with left-dislocated sentences with a novel verb, children can also rely on cues other than prosody for sentence parsing, and their preference for one or another type of cue may vary depending on the task they are exposed to. This might explain the results found in Experiment 2: when exposed to stripping sentences along with an unknown verb and an impoverished semantic context, children may prefer to rely on a DP-counting strategy to hypothesize about the meaning of the novel verb. This is not to say that they cannot pay attention to prosodic information in verb-learning tasks, as shown by de Carvalho (2017), but that they may choose to put more weight on another type of cue in this task.

The experiments conducted in this thesis provide empirical evidence for the hypothesis that children readily exploit phrasal prosody for syntactic parsing, while also showing that chil- dren may perform differently when exposed to different types of structure. However, they can- not on their own fully answer the question of which type of linguistic information hinders or favors children's ability to disambiguate sentences through prosody. Some questions remain to be investigated: What kind of syntactic phenomena renders sentences more complex for disam- biguation tasks in each language? Does frequency of occurrence in children's linguistic envi- ronment influence their performance? Which prosodic cues are important for boundary recog- nition, and does this vary between age ranges and across languages? Is this ability universal,

153 i.e., can children from any language benefit from prosodic boundary information for sentence parsing? About the first question, although previous results (e.g. DAUTRICHE ET AL., 2014; DE CARVALHO ET AL., 2015; SNEDEKER & YUAN, 2008; SNEDEKER & TRUE- SWELL, 2001; CHOI & MAZUKA, 2003) show different performance on the use of prosodic cues for the parsing of different structures by children, it is hard to tell how much of children's performance in these experiments is due to the different prosodic or syntactic structures, or to the different methodologies used and/or languages tested. A way of systematically investigating the impact of different syntactic structures would be through studies using similar methodolo- gies and the same population but varying across types of structure. However, not all languages present the same types of structural ambiguities, due to different word orders, the use of mor- phological markers, different word-level stress, etc. De Carvalho et al. (2014) comment on an attempt to create a structural disambiguation experiment with noun-verb homophones in Japa- nese, which was cancelled due to word order difficulties (i.e., the nouns and verbs do not occupy the same linear position in the sentences) and to lexical pitch accent differences (nouns are usually accented in the first mora, while verbs are not accented). The second question, about the role of frequency in the input in children's performance, is even harder to investigate. Given the fact that children learn a lot about their native language with just a little input (e.g. GUASTI, 2009), it is hard to tell how much children need to listen to a structure in order to acquire it, and where (if anywhere) to trace the boundary between enough and too little exposure. For the French left-dislocation experiments (e.g. DE CAR- VALHO, 2017; DAUTRICHE ET AL., 2014), a small corpus search revealed that about 5% of multiple-word utterances in the corpus presented left or right dislocation. 
We searched for ellipsis sentences with aussi in the same corpus and found that these made up 0.14% of multiple-word utterances, which is a much smaller proportion, even considering that ellipses with aussi do not account for all possible ellipses in the language. For the experiments that failed to show children's ability to use prosodic information for sentence parsing, there has been no systematic assessment of the frequency of occurrence of the investigated structures. There may therefore be a correlation between input frequency and performance that we are not yet in a position to address, but that could be investigated once enough data have been collected on different sentence structures (again using similar methodologies on similar populations) with various levels of frequency in the input.

The next question to be addressed is which types of prosodic cues are important for boundary recognition, and how this varies across languages. Nespor and Vogel's (2007) experiment with sentence disambiguation by Italian-speaking adults tries to tell apart the effects of phonological and intonational phrase boundaries on sentence disambiguation, but it does not tackle the issue of how different prosodic cues may affect this process. Furthermore, studies with young children purposefully emphasize prosodic boundaries by creating as many cues as possible, such as long pauses, vowel lengthening and pitch curves. There is, however, some experimental evidence showing that young infants (around 6 months of age) rely first on universal phrase markers such as pauses, and only later, around 8 months of age, on more subtle cues such as pitch and lengthening in the absence of pauses (JOHNSON & SEIDL, 2008; VAN OMMEN ET AL., 2017), which suggests that universal phrase markers are more salient and easier to detect. However, studies by Jusczyk et al. (1992) and Hirsh-Pasek et al. (1987) show that, by 7 months of age, children already know the correlation between intonation and pauses and pay equal attention to both types of cues, since they react with surprise when exposed to sentences in which these cues are disconnected from each other. These studies show that infants can detect pauses, intonational cues and length cues from an early age, but they tell us little about which type of cue is more salient or more easily detected by children.
In order to investigate whether children's performance on sentence disambiguation tasks changes depending on the prosodic cues available or on the relation between prosodic phrasing and different syntactic structures, one could conduct an experiment controlling the type of cues present (for instance, maintaining only the pause in one condition, and only the pitch curve in another) and the type of prosody-syntax relation (for instance, using structurally ambiguous sentences that are more or less easily disambiguated by prosodic phrasing), and see whether children's performance varies depending on the type of structure they are presented with. This, however, would be a very ambitious experiment, requiring several conditions and a very large number of participants.

Finally, to investigate whether children can rely on prosodic boundary cues for sentence parsing in all languages, we need to study several different languages with different prosody and different correlations between syntax and prosody. This is necessary because, although all languages present prosodic phrasing, prosodic cues may be used differently across them (JUN, 2005, 2014). Experiment 1 helped to achieve this goal by testing stripping sentence comprehension in Brazilian Portuguese, a language for which this question had not been studied before. Although French and BP present some similarities, they also present differences in prosodic structure, as already described in section 2.4. For instance, since BP also uses prosodic cues in prosodic words for topic and focus marking, while French uses them mostly in phrasal prosody, these cues might be less ambiguous in French than in BP. However, one might argue that French and BP present more similarities than differences in prosodic structure, and that our results therefore do not really contribute to the investigation of the universality hypothesis. We thus need to study other types of languages, which differ more radically in their use of prosodic cues between or within prosodic phrases.

Another important question that remains unanswered is whether infants really use the prosodic information of speech for syntactic acquisition, and whether they need to learn the correlations between syntax and prosody to do so. Our work, along with previous studies, shows that children know a great deal about their native language's prosodic structure, and can use this knowledge for many different aspects of sentence parsing: for sentence disambiguation (e.g. DAUTRICHE ET AL., 2014); for word and sentence segmentation (e.g. JUSCZYK, 1992, 1999; HIRSH-PASEK ET AL., 1987; COSTA, 2015); for deciding whether an utterance is declarative or interrogative (e.g. ZHOU, CRAIN & ZHAN, 2012); for the grammatical categorization of novel words (e.g. MASSICOTTE-LAFORGE & SHI, 2015; DE CARVALHO ET AL., 2015) and known homophones (e.g. DE CARVALHO ET AL., 2013, 2014); and so forth. However, none of these studies shows whether or how children learn the mapping between prosody and syntax, and whether this is necessary for syntactic acquisition.

De Carvalho (2017) reviews a few hypotheses regarding the first question. The first one states that children are innately endowed with the ability to pay attention to regularities in the speech stream, and naturally exploit prosodic cues to start speech segmentation. After acquiring some content and function words through this process, they can go on to explore the grammatical constituents inside prosodic phrases. Another, non-exclusive alternative would be that children learn the correspondences between syntax and prosody through non-ambiguous sentences in the input. Gutman et al. (2015) investigated a corpus of French child-directed speech from four different children and found an average of 1.4 prosodic phrases (intonational or phonological phrases) per utterance. Although a bigger corpus search, with different languages, would be necessary to confirm this finding, it suggests that children have enough information about prosodic phrasing in their input, and that they could possibly use this information to learn the prosody-syntax mapping. However, it is important to recall that there is no absolute correspondence between syntax and prosody; prosodic phrasing also depends on non-linguistic factors such as speech rate, utterance size and speech style, and prosodic cues are used for other types of operations in some languages, such as topic and focus marking.

The question of whether children need prosody for syntactic acquisition is very hard to answer through empirical evidence. Theoretically, however, this hypothesis remains very plausible, for several reasons. First, prosodic information is among the most salient types of information in the speech stream. So, if children were not able to use this information for the initial division of syntactic constituents (as proposed in the model of Christophe et al., 2008; see section 2.1.2), how would they do it? One alternative would be through morpho-syntactic cues, such as function words. Function words, such as auxiliary verbs and determiners, are usually unstressed elements that appear frequently at phrase edges and are easily detected by children from an early age (e.g. HÖHLE & WEISSENBORN, 2003; KEDAR, CASASOLA & LUST, 2006). So, from the moment children acquire some function words, they could use their positioning to determine syntactic boundaries. Function words are even better candidates than prosody for this process, as they define these boundaries in a more consistent and reliable way. However, to acquire function words, children first need to segment them from the speech stream, a process that depends on the prosodic cues of prosodic phrase boundaries.

A second reason why prosody must be important for syntactic acquisition is that it aids linguistic access in both children and adults. Nespor & Vogel's (2007) experiment on structural disambiguation by adults shows that they cannot disambiguate between linearly ambiguous syntactic structures (in out-of-the-blue contexts) if there are no differences in prosodic phrasing between them. Furthermore, Duffy & Pisoni (1992) show that listeners retrieve the meaning of words and sentences more rapidly and accurately when they are uttered with natural prosody than when they are produced with synthesized speech, and that the less effective the synthesizer is in copying natural language prosody, the harder it is for listeners to interpret the sentences. De Carvalho et al. (2017) created an experiment using sentences with homophones, such as la petite ferme sa boîte à jouets ("the little (girl) closes her toy box") or la petite ferme sera pour les enfants ("the little farm will be for the children"), but with conflicting prosodic cues and lexical information. Sentences were constructed by splicing the beginning of one sentence with the end of the other, such that, for instance, the prosodic structure indicated that ferme was a noun, but the sentence ended as if it were a verb, and vice-versa. The aim was to see how children would interpret the target homophone ferme, i.e., which cue would be more important: prosody or lexical information. Their results show that children from 4 to 6 years of age stuck to the interpretation given by the prosodic cues and did not reanalyze it after listening to the lexical information that pointed towards the opposite interpretation of the homophone. This shows that children rely on prosodic information for the grammatical categorization of an ambiguous word and have trouble reanalyzing their first interpretation of the word even in the presence of other more reliable cues88, suggesting that prosody is an important cue for syntactic analysis in children.

The present work, along with the previous studies cited, shows us that the prosodic cues of speech are a very important aspect of language that children and adults exploit for language acquisition and parsing. For children, they might represent the first window they can easily peek through to discover many important aspects of their native languages. There is still a lot to investigate on the role of these cues for language acquisition, parsing and segmentation; there is also a great deal to learn about how children come to acquire syntactic structures, especially structures that, to be fully understood, require the interpretation not only of phonologically realized elements, but also of silence. Hopefully, in the present work, we have contributed to the investigation of these aspects of language acquisition and parsing, by studying a type of structure that differs from the ones studied before for syntactic parsing through prosodic cues. We also hope to have contributed to the growth of knowledge on Brazilian Portuguese, by investigating an aspect of the acquisition of this language that has not been studied before, which adds to other pioneering studies on how BP-learning children and BP-speaking adults use prosody for sentence parsing89.

88 This behavior, called the kindergarten-path effect, is observed in other infant studies (e.g. TRUESWELL ET AL., 1999).
89 e.g. COSTA, 2015; SILVA, 2014; SILVA, 2009; SOUZA, 2016, among others.

REFERENCES

ARUNACHALAM, S.; ESCOVAR, E.; HANSEN, M.; WAXMAN, S. Out of sight, but not out of mind: 21-month-olds use syntactic information to learn verbs even in the absence of a corresponding event. Language and Cognitive Processes, v. 28, n. 4, p. 417–425, 2013.

ARUNACHALAM, S.; SYRETT, K. Specifying event reference in verb learning. Paper presented at the 38th Boston University Conference on Language Development (BUCLD), Boston, MA, 2014. pdf doi:10.7282/T3VQ34PV.

ARUNACHALAM, S.; WAXMAN, S. R. Meaning from syntax: Evidence from 2-year-olds. Cognition, v. 114, n. 3, p. 442–446, 2010.

BABINEAU, M.; DE CARVALHO, A.; TRUESWELL, J.; CHRISTOPHE, A. Familiar words can serve as a semantic seed for syntactic bootstrapping. Under review.

BARBOSA, P. A. Conhecendo melhor a prosódia: aspectos teóricos e metodológicos daquilo que molda nossa enunciação. Revista de Estudos da Linguagem, v. 20, n. 1, p. 11–27, June 2012.

BERNAL, S.; LIDZ, J.; MILLOTTE, S.; CHRISTOPHE, A. Syntax Constrains the Acquisition of Verb Meaning. Language Learning and Development, v. 3, n. 4, p. 325–341, Aug. 2007.

BIANCHI, V.; SILVA, M. C. F. On some properties of agreement-object in Italian and Brazilian Portuguese. Issues and Theory in Romance Linguistics. Selected papers from the Linguistic Symposium on Romance Languages. p. 181-197, 1994.

BOOIJ, Geert E. et al. Cliticization as prosodic integration: The case of Dutch. The linguistic review, v. 13, p. 219-242, 1996.

CHOI, Y.; MAZUKA, R. Young Children’s Use of Prosody in Sentence Parsing. Journal of Psycholinguistic Research, v. 32, n. 2, p. 197-217, Mar. 2003. pdf doi: 0090-6905/03/0300-0197/0

CHOMSKY, N. Current Issues in Linguistic Theory. The Hague: Mouton, 1964.

CHOMSKY, N. Aspects of the theory of syntax. Cambridge, MA: MIT Press, 1965.

CHOMSKY, N. A minimalist program for linguistic theory. MIT occasional papers in linguistics, n. 1. Cambridge, MA: MIT Working Papers in Linguistics, 1993.

CHOMSKY, N. The Minimalist Program. Cambridge, MA: MIT Press, 1995.

CHOMSKY, N. Minimalist inquiries: The framework. In: MARTIN, R.; MICHAELS, D.; URIAGEREKA, J. (Eds.). Step by step: Essays on minimalist syntax in honor of Howard Lasnik. Cambridge, MA: MIT Press, 2000. p. 89-155.

CHRISTOPHE, A.; GOUT, A.; PEPERKAMP, S.; MORGAN, J. Discovering words in the continuous speech stream: the role of prosody. Journal of Phonetics, v. 31, n. 3–4, p. 585– 598, July 2003.

CHRISTOPHE, A.; GUASTI, T.; NESPOR, M; DUPOUX, E.; VAN OOYEN, B. Reflections on phonological bootstrapping: its role for lexical and syntactic acquisition. Language and cognitive processes, London, UK., v. 12, n. 5-6, p.585-612, 1997.

CHRISTOPHE, A.; MILLOTTE, S.; BERNAL, S.; LIDZ, J. Bootstrapping lexical and syntactic acquisition. Language and speech, Thousand Oaks, CA., v. 51, n. 1-2, p. 61-75, 2008.

COCHET, H.; VAUCLAIR, J. Pointing gestures produced by toddlers from 15 to 30 months: Different functions, hand shapes and laterality patterns. Infant Behavior and Development, v. 33, n. 4, p. 431–441, Dec. 2010.

COHEN, J. A power primer. Psychological Bulletin. v. 112, n. 1, p. 155-159, July 1992.

COSTA, G. F. Percepção do pareamento entre prosódia e sintaxe por falantes do português brasileiro. 2015. Dissertation (masters in Linguistics) – Graduate program in Linguistics, Universidade Federal de Juiz de Fora, Juiz de Fora.

CULICOVER, P.; JACKENDOFF, R. Simpler syntax. Oxford University Press on Demand, 2005.

CUMMINS, S.; ROBERGE, Y. Null Objects in French and English. In: AUGER, J.; CLEMENTS, J. C.; VANCE, B. (Eds.). Current Issues in Linguistic Theory. Amsterdam: John Benjamins Publishing Company, v. 258, 2004. p. 121-138.

CUTLER, A.; DAHAN, D.; VAN DONSELAAR, W. Prosody in the Comprehension of Spoken Language: A Literature Review. Language and Speech, v. 40, n. 2, p. 141-201, 1997. pdf doi: 10.1177/002383099704000203

CYRINO, S. M. L. O objeto nulo no português do Brasil - um estudo sintático-diacrônico. Londrina: Editora UEL, 1997.

CYRINO, S. M. L.; MATOS, G. VP ellipsis in European and Brazilian Portuguese: a comparative analysis. Journal of Portuguese Linguistics, v. 1, n. 2, p. 177-195, Dec. 2002.

CYRINO, S. M. L.; MATOS, G. Elipse do VP e variação paramétrica. Cadernos de Estudos Linguísticos, v. 49, n. 2, p. 195-206, 2007.

DAUTRICHE, I. Weaving an ambiguous lexicon. 2014. PhD Thesis – École doctorale Frontières de l'innovation en recherche et éducation, Sorbonne Paris Cité, Paris. https://www.theses.fr/2015USPCB112

DAUTRICHE, I.; CRISTIA, A.; BRUSINI, P.; YUAN, S.; FISHER, C.; CHRISTOPHE, A. Toddlers Default to Canonical Surface-to-Meaning Mapping When Learning Verbs. Child Development, v. 85, n. 3, p. 1168-1180, May 2014.

DE CARVALHO, A. Les enfants exploitent-ils la prosodie des phrases pour calculer leur structure syntaxique ? 2013. Dissertation (masters) – Masters program in Cognitive Sciences, École Normale Supérieure, Paris.

DE CARVALHO, A. Impact du contexte syntaxique dans l’interprétation d’un nouveau verbe. 2014. Dissertation (masters) – Masters program in Cognitive Sciences, École Normale Supérieure, Paris.

DE CARVALHO, A. The role of phrasal prosody and function words in the acquisition of word meanings. 2017. Thesis (PhD) – Department of Cognitive Studies, École Normale Supérieure, Paris.

DE CARVALHO, A.; DAUTRICHE, I.; CHRISTOPHE, A. Three-year-olds use prosody online to constrain syntactic analysis. Paper presented at the 38th Boston University Conference on Language Development (BUCLD), Boston, MA, 2014. pdf doi: 10.13140/2.1.1402.4006

DE CARVALHO, A.; DAUTRICHE, I.; CHRISTOPHE, A. Preschoolers use phrasal prosody online to constrain syntactic analysis. Developmental science, v. 19, n. 2, p. 235-250, 2016.

DE CARVALHO, A.; DAUTRICHE, I.; LIN, I.; CHRISTOPHE, A. Phrasal prosody constrains syntactic analysis in toddlers. Cognition, v. 163, p. 67-79, June 2017.

DE CARVALHO, A.; HE, A. X.; LIDZ, J.; CHRISTOPHE, A. Prosody and function words cue the acquisition of word meanings in 18-month-old infants. Psychological science, v. 30, n. 3, p. 319-332, 2019.

DE CARVALHO, A.; LIDZ, J.; TIEU, L.; BLEAM, T.; CHRISTOPHE, A. English-speaking preschoolers can use phrasal prosody for syntactic parsing. The Journal of the Acoustical Society of America, v. 139, n. 6, p. EL216–EL222, June 2016.

DE CAT, C. French dislocation: syntax, interpretation, acquisition. Oxford, England; New York: Oxford University Press Inc, 2007.

DEPIANTE, A. M. The syntax of deep and surface anaphora: A study of null complement. 2000. Thesis (PhD). University of Connecticut, Connecticut.

DINK, J.; FERGUSON, B. eyetrackingR. R package version 0.1.4, 2016. http://www.eyetracking-R.com.

DOMAHS, U.; WIESE, R.; KNAUS, J. Word prosody in focus and non-focus position: An ERP-study on the interplay of prosodic domains. In: VOGEL, R.; VAN DE VIJVER, R. (Ed.). Rhythm in Cognition and Grammar - A Germanic Perspective. Trends in Linguistics, de Gruyter, Berlin, 2015. p. 137–164.

DOWNING, B. T. Syntactic structure and phonological phrasing in English. 1970. Thesis (PhD) – University of Texas at Austin, Austin.

DUFFY, S. A.; PISONI, D. B. Comprehension of Synthetic Speech Produced by Rule: A Review and Theoretical Interpretation. Language and Speech, v. 35, n. 4, p. 351–389, 1992. pdf doi: 10.1177/002383099203500401

FENSON, L.; DALE, P.; REZNICK, J.; THAL, D.; BATES, E.; HARTUNG, J.; PETHICK, S.; REILLY, J. S. The MacArthur Communicative Development Inventories: User’s guide and technical manual. San Diego: Singular Publishing Group, 1993.

FERREIRA, M. B. Argumentos Nulos em Português Brasileiro. 2000. Dissertation (masters) – Institute of Language Studies, State University of Campinas, Campinas.

FIENGO, R.; MAY, R. Indices and identity. MIT press, 1994.

FIENGO, R.; MAY, R. The Semantic Significance of Syntactic Identity. In: BENNIS; PICA; ROORYCK (Eds.). Atomism and Binding. Dordrecht: Foris, 1997.

FODOR, J. A.; BEVER, T. G. The psychological reality of linguistic segments. Journal of Verbal Learning & Verbal Behavior, v. 4, n. 5, p. 414-420, 1965. pdf doi: 10.1016/S0022-5371(65)80081-0

FOLEY, C.; NÚÑEZ DEL PRADO, Z.; BARBIER, I.; LUST, B. Knowledge of Variable Binding in VP–Ellipsis: Language Acquisition Research and Theory Converge. Syntax, v. 6, n. 1, p. 52–83, 2003.

FRAZIER, L.; CARLSON, K.; CLIFTON, J. C. Prosodic phrasing is central to language comprehension. Trends in Cognitive Sciences, v. 10, n. 6, p. 244–249, June 2006.

FRAZIER, L.; FODOR, J. D. (Eds.). Explicit and implicit prosody in sentence processing: studies in honor of Janet Dean Fodor. Cham: Springer, 2015.

FROTA, S.; PRIETO, P. Intonation in Romance: Systemic similarities and differences. OUP Oxford, 2015. Pdf doi:10.1093/acprof:oso/9780199685332.003.0011

GARRETT, M.; BEVER, T. G.; FODOR, J. The active use of grammar in speech perception. Perception & Psychophysics, v. 1, p. 30-32, 1966.

GERKEN, L.; JUSCZYK, P. W.; MANDEL, D. R. When prosody fails to cue syntactic structure: 9-month-olds’ sensitivity to phonological versus syntactic phrases. Cognition, v. 51, n. 3, p. 237–265, Mar. 1994.

GERTNER, Y.; FISHER, C. Predicted errors in children’s early sentence comprehension. Cognition, v. 124, n. 1, p. 85–94, July 2012.

GOUT, A.; CHRISTOPHE, A.; MORGAN, J. L. Phonological phrase boundaries constrain lexical access II. Infant data. Journal of Memory and Language, v. 51, n. 4, p. 548–567, Nov. 2004.

GOUT, A.; CHRISTOPHE, A. O papel do bootstrapping prosódico na aquisição da sintaxe e do léxico. Tradução de Renata Bottino. In: CORRÊA, L. M. S. (Org.). Aquisição da linguagem e problemas do desenvolvimento linguístico. Rio de Janeiro: PUC Rio, 2006. p.103-127.

GRÜTER, T. A Unified Account of Object Clitics and Referential Null Objects in French. Syntax, v. 12, n. 3, p. 215–241, Sept. 2009.

GUASTI, M. T. Universal Grammar Approaches to Language Acquisition. In: FOSTER-COHEN, S. (Ed.). Language Acquisition. London: Palgrave Macmillan UK, 2009. p. 87–108.

GUO, F.; FOLEY, C.; CHIEN, Y. C.; CHIANG, C. P.; LUST, B. Operator-variable binding in the initial state: A cross-linguistic study of VP ellipsis structures in Chinese and English. Cahiers de linguistique - Asie orientale, v. 25, n. 1, p. 3-34, 1996.

GUTMAN, A.; DAUTRICHE, I.; CRABBÉ, B.; CHRISTOPHE, A. Bootstrapping the Syntactic Bootstrapper: Probabilistic Labeling of Prosodic Phrases. Language Acquisition, v. 22, n. 3, p. 285–309, July 2015.

HALBERT, A.; CRAIN, S.; SHANKWEILER, D.; WOODAMS, E. Children's Interpretive Use of Emphatic Stress. presented at the 8th Annual CUNY Conference on Human Sentence Processing. Tucson, AZ, 1995.

HANKAMER, J.; SAG, I. Deep and surface anaphora. Linguistic inquiry, v. 7, n. 3, p. 391- 428, 1976.

HAVRON, N.; DE CARVALHO, A.; FIÉVET, A. C.; CHRISTOPHE, A. Three- to Four-Year- Old Children Rapidly Adapt Their Predictions and Use Them to Learn Novel Word Meanings. Child Development, v. 90, n. 1, p. 82–90, 2019.

HIRSH-PASEK, K.; KEMLER NELSON, D. G.; JUSCZYK, P. W.; CASSIDY, K. W.; DRUSS, B.; KENNEDY, L. Clauses are perceptual units for young infants. Cognition, Amsterdam, v. 26, p. 269-286, 1987.

HÖHLE, B.; WEISSENBORN, J. German‐learning infants’ ability to detect unstressed closed‐ class elements in continuous speech. Developmental Science, v. 6, n. 2, p. 122-127, 2003.

HOLMBERG, A. The syntax of yes and no in Finnish. Studia Linguistica, v. 55, p. 141-175, 2001.

HUALDE, J. I. Stress removal and stress addition in Spanish. Journal of Portuguese Linguistics, v. 6, n. 1, p. 59-89, 2007.

HUANG, C.-T. J. On the distribution and reference of empty pronouns. Linguistic Inquiry, v. 15, n. 4, p. 531-574, 1984.

HUNTER, T.; YOSHIDA, M. A Restriction on Vehicle Change and Its Interaction with Movement. Linguistic Inquiry, v. 47, n. 3, p. 561–571, July 2016.

JOHNSON, E. K.; SEIDL, A. Clause segmentation by 6-month-old infants: A crosslinguistic perspective. Infancy, v. 13, n. 5, p. 440–455, 2008. pdf doi: 10.1080/15250000802329321

JOHNSON, K. Gapping Is Not (VP-) Ellipsis. Linguistic Inquiry, v. 40, n. 2, p. 289-328, 2009.

JUN, S.-A. Prosody in sentence processing: Korean vs. English. UCLA Working Papers in Phonetics, v. 104, p. 26-45, 2005.

JUN, S.-A. Prosodic typology: by prominence type, word prosody, and macro-rhythm. In: JUN, S. -A. (Ed.). Prosodic Typology II. Oxford University Press, 2014. p. 520–539.

JUSCZYK, P. W. Developing phonological categories from the speech signal. In: FERGUSON, C.; MENN, L.; STOEL-GAMMON, C. (Eds.). Phonological development: Models, research, implications. Timonium, MD. 1992. p. 17-64.

JUSCZYK, P.W. The Discovery of Spoken Language. Massachusetts: MIT Press, 1997.

JUSCZYK, P. W. How infants begin to extract words from speech. Trends in cognitive sciences, Cambridge, MA., v. 2, n. 9, p. 323-328, 1999.

JUSCZYK, P.W; HOUSTON, D. M.; NEWSOME, M. The beginnings of word segmentation in English-learning infants. Cognitive Psychology, Amsterdam, v. 39, p. 159–207, 1999.

KABAK, B.; VOGEL, I. The phonological word and stress assignment in Turkish. Phonology, v. 18, n. 3, p. 315-360, 2001.

KATO, M. A. The Distribution of Null and Pronominal Objects in Brazilian Portuguese. In.: ASHBY W., MITHUM M., PERISSINOTO G., RAPOSO E. (Eds.). Linguistic perspectives on Romance languages: Selected Papers from the XXI Linguistic Symposium on Romance languages. Amsterdam: John Benjamins, 1993. p. 225-235.

KATO, M. A. Strong pronouns and weak pronouns in the history of Brazilian Portuguese grammar. Proceedings of the Colloquium on Structure, Acquisition, and Change of Grammars. Hamburg: University of Hamburg, 2000. v. II. p. 26-37.

KATO, M. A. Polar positive answers in Brazilian Portuguese. 43rd Linguistic Symposium on Romance Languages (LSRL 43), CUNY Graduate Center, New York, 2012. apud MARTINS, A. M. VP and TP ellipsis: Sentential polarity and information structure. In: FISCHER, S.; GABRIEL, C. (Eds.). Manual of Grammatical Interfaces in Romance. Berlin, Boston: De Gruyter, 2016.

KEDAR, Y.; CASASOLA, M.; LUST, B. Getting there faster: 18‐and 24‐month‐old infants' use of function words to determine reference. Child Development, v. 77, n. 2, p. 325-338, 2006.

KELLY, M. H. The role of phonology in grammatical category assignments. In: MORGAN, J.; DEMUTH, K. (Eds.). Signal to syntax: Bootstrapping from speech to grammar in early acquisition. Mahwah, NJ.: Lawrence Erlbaum Associates.1996. p. 249-262.

KENSTOWICZ, M. J. Phonology in generative grammar. Cambridge, MA: Blackwell, 1994.

KIM, S. Sloppy/strict identity, empty objects, and NP ellipsis. Journal of East Asian Linguistics, v. 8, n. 4, p. 255-284, 1999.

LADD, R. Intonational phonology. Cambridge University Press, 2008.

LAMBRECHT, K.; LEMOINE, K. Vers une grammaire des compléments zéro en français parlé. In: CHUQUET J., FRID, M. (Eds.) Absence de marques et représentation de l'absence. Rennes: Presses Universitaires de Rennes, 1996. p. 279-309.

LAMBRECHT, K.; LEMOINE, K. Definite null objects in (spoken) French: A construction-grammar account. In: FRIED, M., BOAS, H. C. (eds.) Grammatical constructions: Back to the roots. Amsterdam: John Benjamins, 2005. p. 13–55.

LASNIK, H. A note on pseudogapping. MIT working papers in linguistics, v. 27, p. 143-163, 1995.

LASNIK, H. When can you save a structure by destroying it? PROCEEDINGS-NELS, v.31, n. 2, p. 301-320, 2001.

LASNIK, H.; FUNAKOSHI, K. Ellipsis in Transformational Grammar. In: LASNIK, H.; FUNAKOSHI, K. (Eds.). The Oxford Handbook of Ellipsis. Oxford University Press, 2018. p. 45–74.

LEHISTE, I. Phonetic disambiguation of syntactic ambiguity. The Journal of the Acoustical Society of America, v. 53, n. 1, p. 380-380, 1973.

LINDENBERGH, C.; VAN HOUT, A.; HOLLEBRANDSE, B. Extending ellipsis research: The acquisition of sluicing in Dutch. 32nd Boston University Conference on Language Development, online proceedings supplement, p. 1-22, 2015.

LISZKOWSKI, U.; CARPENTER, M.; TOMASELLO, M. Reference and attitude in infant pointing. Journal of Child Language, v. 34, n. 1, p. 1-20. 2007.

LOBANOV, B. M. Classification of Russian vowels spoken by different speakers. The Journal of the Acoustical Society of America, v. 49, n. 2B, p. 606-608, 1971.

LOPES, R. E. V. Uma proposta minimalista para o processo de aquisição da linguagem: relações locais. 1999. Thesis (PhD) – Institute of Language Studies, State University of Campinas, Brazil.

LOPES, R. E. V. Aspect and the acquisition of null objects in Brazilian Portuguese. In: PIRES, A.; ROTHMAN, J. (Org.). Minimalist inquiries into child and adult language acquisition. Berlin/NY: Mouton de Gruyter, 2009. p. 105-128.

LOPES, R. E. V.; SANTOS, A. L. VP-Ellipsis Comprehension in European and Brazilian Portuguese. New Directions in the Acquisition of Romance Languages. Selected Proceedings of Romance Turn V, p. 181-201, 2014.

MANDEL, D. R.; JUSCZYK, P. W.; KEMLER NELSON, D. G. Does sentential prosody help infants organize and remember speech information? Cognition, v. 53, n. 2, p. 155–180, Nov. 1994.

MARIS, E.; OOSTENVELD, R. Nonparametric statistical testing of EEG-and MEG-data. Journal of neuroscience methods, v. 164, n. 1, p. 177-190, 2007.

MARTINS, A. M. Enclisis, VP-deletion and the nature of Sigma. Probus, v. 6, n. 2-3, p. 173-205, 1994.

MASSICOTTE-LAFORGE, S.; SHI, R. The role of prosody in infants' early syntactic analysis and grammatical categorization. The Journal of the Acoustical Society of America, v. 138, n. 4, p. EL441-EL446, 2015.

MATLAB and Statistics Toolbox Release 2012b, The MathWorks, Inc., Natick, Massachusetts, United States.

MAZUKA, R. Can a grammatical parameter be set before the first word? Prosodic contributions to early setting of a grammatical parameter. In: MORGAN, J.; DEMUTH, K. (eds.) Signal to Syntax: Bootstrapping from speech to grammar in early acquisition. Mahwah, NJ.: Lawrence Erlbaum Associates, 1996. p. 313-330.

MEHLER, J.; JUSCZYK, P.; LAMBERTZ, G.; HALSTED, N.; BERTONCINI, J.; AMIEL-TISON, C. A precursor of language acquisition in young infants. Cognition, Amsterdam, v. 29, p. 143-178, 1988.

MERCHANT, J. The syntax of silence: Sluicing, islands, and the theory of ellipsis. Oxford University Press: Oxford, 2001.

MERCHANT, J. Fragments and ellipsis. Linguistics and Philosophy, v. 27, n. 6, p. 661–738, Jan. 2004.

MERCHANT, J. Variable island repair under ellipsis. Topics in ellipsis, v. 1174, p. 132-153, 2008.

MILLOTTE, S. Le rôle de la prosodie dans le traitement syntaxique adulte et l'acquisition de la syntaxe. 2005. Thesis (PhD). École des Hautes Etudes en Sciences Sociales, Paris.

MILLOTTE, S.; CHRISTOPHE, A. À la découverte des mots : le rôle de la prosodie dans l’acquisition du lexique et de la syntaxe. Enfance, v. 2009, n. 03, p. 283-292, Sept. 2009.

MORGAN, L. M.; DEMUTH, K. Signal to syntax: an overview. In: MORGAN, J.; DEMUTH, K. (Eds.) Signal to syntax: Bootstrapping from speech to grammar in early acquisition. Mahwah, NJ.: Lawrence Erlbaum Associates, 1996. p. 1-22.

MORGAN, L. M.; SHI, R.; ALLOPENA, P. Perceptual bases of rudimentary grammatical categories: towards a broader conceptualization of bootstrapping. In: MORGAN, J.; DEMUTH, K. (eds.) Signal to syntax: Bootstrapping from speech to grammar in early acquisition. Mahwah, NJ.: Lawrence Erlbaum Associates, 1996. p. 263-283.

NAIGLES, L. G.; KAKO, E. T. First Contact in Verb Acquisition: Defining a Role for Syntax. Child Development, v. 64, n. 6, p. 1665-1687, Dec. 1993.

NESPOR, M.; VOGEL, I. Prosodic phonology: with a new foreword. Walter de Gruyter, 2007.

NEVINS, A. What UG can and can't do to help the Reduplication Learner. In: CSIRMAS, A; GUALMINI, A; NEVINS, A. (Eds.). Plato's Problem: Problems in Language Acquisition. MIT Working Papers in Linguistics, Cambridge, v. 48, 2004. p. 113-126.

OHTAKI, K. Ellipsis of Arguments: Its Acquisition and Theoretical Implications. 2014. Thesis (PhD) – University of Connecticut. https://opencommons.uconn.edu/dissertations/619

PEPERKAMP, S. A. Prosodic words. HIL dissertations 34. The Hague: Holland Academic Graphics, 1997.

PIERREHUMBERT, J. B. The phonology and phonetics of English intonation. 1980. Thesis (PhD) – Massachusetts Institute of Technology.

PINKER, S. Learnability and Cognition: The Acquisition of Argument Structure. Cambridge, MA.: The MIT Press, 1989.

POLLOCK, J.-Y. Verb movement, Universal Grammar, and the structure of IP. Linguistic Inquiry, v. 20, p. 365-424, 1989.

POSTMAN, W.; FOLEY, C.; SANTELMANN, L.; LUST, B. Evidence for Strong Continuity: New experimental results from children’s acquisition of VP-ellipsis and bound variable struc- tures. MIT Working Papers in Linguistics, v. 31, p. 327-44, 1997. apud LOPES. R. E. V., SANTOS, A. L. VP-Ellipsis Comprehension in European and Brazilian Portuguese. New

166

Directions in the Acquisition of Romance Languages. Selected Proceedings of Romance Turn V, p. 181-201, 2014.

PRICE, P. J.; OSTENDORF, M.; SHATTUCK‐HUFNAGEL, S.; FONG, C. The use of pros- ody in syntactic disambiguation. The Journal of the Acoustical Society of America, v. 90, n. 6, p. 2956-2970, 1991.

RAMUS, F. Language discrimination by newborns: Teasing apart phonotactic, rhythmic, and intonational cues. Annual Review of Language Acquisition, v. 2, p. 85–115, 2002.

RAPOSO, E. On the null object in European Portuguese. In: JAEGGLI, O., CORVALÁN, C- S. (eds.) Studies in Romance Linguistics, Dordrecht: Foris, 1986. p. 373-390.

RIZZI, L. The fine structure of the left periphery. In: HAEGEMAN, L. (Ed.). Elements of syntax. Dordrecht: Kluwer, 1997. p.281-337.

ROSS, J. R. Guess Who? Papers from the 5th Regional Meeting of Chicago Linguistic So- ciety, v. 5, p. 252-86, 1969.

SAFIR, K. Vehicle Change and Reconstruction in Ā-Chains. Linguistic Inquiry, v. 30, n. 4, p. 587–620, Oct. 1999.

SANTOS, A. L. Minimal Answers. Ellipsis, syntax and discourse in the acquisition of Eu- ropean Portuguese. 2006. Thesis (PhD) – Universidade de Lisboa, Lisboa.

SANTOS, A. L. Early VP ellipsis: production and comprehension evidence. In: J. Rothman & A. Pires (Eds.) Minimalist inquiries into child and adult language acquisition. Berlin: Mou- ton De Gruyter, 2009. p. 155-175.

SELKIRK, E. The Prosodic Structure of Function Words. In: MORGAN, J.; DEMUTH, K. (Eds.). Signal to syntax: Bootstrapping from speech to grammar in early acquisition. Mahwah, NJ.: Lawrence Erlbaum Associates, 1996. p. 187-215.

SHUKLA, M.; WHITE, K. S.; ASLIN, R. N. Prosody guides the rapid mapping of auditory word forms onto visual objects in 6-mo-old infants. Proceedings of the National Academy of Sciences, v. 108, n. 15, p. 6038–6043, Ap. 2011.

SILVA, C. G. de C. O papel das fronteiras de sintagma fonológico na restrição do proces- samento sintático e na delimitação das categorias lexicais. 2009. Dissertation (masters) – Faculdade de Letras, Federal University of Juiz de Fora, Juiz de Fora.

SILVA, C. G. de C. A interface prosódia-sintaxe na produção e no processamento das es- truturas de Tópico e de SVO. 2015. Thesis (PhD) – Faculdade de Letras, Federal University of Juiz de Fora, Juiz de Fora.

SILVA, I. O. A sensibilidade de bebês brasileiros a fronteiras de sintagma entoacional: a prosódia nas fases iniciais da aquisição da linguagem. 2014. Dissertation (masters) – Facul- dade de Letras, Federal University of Juiz de Fora, Juiz de Fora.

SILVA, I. O; NAME, C. A sensibilidade de bebês brasileiros a pistas prosódicas de fronteiras de sintagma entoacional na fala dirigida à criança. Letrônica, v. 7, n. 1, p. 4-25, 2014.

167

SLEEMAN, P.; HULK, A. L1 acquisition of noun ellipsis in French and in Dutch. Romance Languages and Linguistic Theory: Selected papers from 'Going Romance', Utrecht 2011, v. 5, p. 249-266, 2013.

SNEDEKER, J.; TRUESWELL, J. Unheeded cues: Prosody and syntactic ambiguity in mother- child communication. 26th Boston University Conference on Language Development (BU- CLD), 2001.

SNEDEKER, J.; TRUESWELL, J. Using prosody to avoid ambiguity: Effects of speaker awareness and referential context. Journal of Memory and Language, v. 48, n. 1, p. 103–130, Jan. 2003.

SNEDEKER, J.; YUAN, S. Effects of prosodic and lexical constraints on parsing in young children (and adults). Journal of Memory and Language, v. 58, n. 2, p. 574–608, Feb. 2008.

SODERSTROM, M. The prosodic bootstrapping of phrases: Evidence from prelinguistic in- fants. Journal of Memory and Language, v. 49, n. 2, p. 249–267, Aug. 2003.

SOUZA, M. M. de. Pistas prosódicas na desambiguação de sentenças coordenadas no PB. 2016. Dissertation (masters) – Faculdade de Letras, Federal University of Juiz de Fora, Juiz de Fora.

STEEDMAN, M. Phrasal Intonation and the acquisition of Syntax. In: MORGAN, J.; DEMUTH, K. (eds.) Signal to syntax: Bootstrapping from speech to grammar in early acquisition. Mahwah, NJ.: Lawrence Erlbaum Associates, 1996. p. 331-342.

TRUCKENBRODT, H.; SANDALO, F.; ABAURRE, B. Elements of Brazilian Portuguese in- tonation. Journal of Portuguese Linguistics, v. 8, n. 1, p. 75-114, June 2009.

TRUESWELL, J. C.; SEKERINA, I.; HILL, N. M.; LOGRIP, M. L. The kindergarten-path effect: Studying on-line sentence processing in young children. Cognition, v. 73, p. 89–134, 1999.

VAN CRAENENBROECK, J.; TEMMERMAN, T. (Eds.). The Oxford Handbook of Ellip- sis. Oxford University Press, 2018.

VAN OMMEN, S.; BOLL-AVETISYAN, N.; LARRAZA, S.; WELLMANN, C.; BIJELJAC- BABIC, R.; HÖHLE, B.; NAZZI, T. Cross-linguistic evidence of language-specific processing of prosodic boundary cue. Poster presented at the 42nd Boston University Conference on Lan- guage Development (BUCLD), 2017

VENDITTI, J. J.; JUN, S.-A.; BECKMAN, M. E. Prosodic cues to syntactic and other linguistic structures in Japanese, Korean, and English In: MORGAN, J.; DEMUTH, K. (eds.) Signal to syn- tax: Bootstrapping from speech to grammar in early acquisition. Mahwah, NJ.: Lawrence Erlbaum Associates, 1996. p. 287-311.

VIGÁRIO, M. Prosody and sentence disambiguation in European Portuguese. Catalan Jour- nal of Linguistics, v. 2, p. 249-278, Dec. 2003.

VIGÁRIO, M. O lugar do Grupo Clítico e da Palavra Prosódica Composta na hierarquia pro- sódica: uma nova proposta. In: LOBO, M., COUTINHO, M. A. (Eds.) XXII Encontro da

168

Associação Portuguesa de Linguística: Textos Selecionados. Lisboa: APL, 2007. pp. 673- 688.

VOGEL, I.; RAIMY, E. The acquisition of compound vs. phrasal stress: the role of prosodic constituents. Journal of Child Language, v. 29, n. 2, p. 225–250, May 2002.

WAGNER, L. Aspectual bootstrapping in language acquisition: telicity and transitivity. Lan- guage learning and development, Boston, MA., v. 2, n. 1, p. 51-76, 2006.

WATSON, D.; GIBSON, E. Intonational phrasing and constituency in language production and comprehension. Studia linguistica, v. 59, n. 2‐3, p. 279-300, 2005.

WINKLER, S. Ellipsis and focus in generative grammar. Berlin; New York: Mouton de Gruyter, 2005.

WURMBRAND, S. Stripping and topless complements. Linguistic Inquiry, p. 341-366, 2017.

YUAN, S.; FISHER, C. “Really? She Blicked the Baby?”: Two-Year-Olds Learn Combinato- rial Facts About Verbs by Listening. Psychological Science, v. 20, n. 5, p. 619–626, May 2009.

ZHOU, P.; CRAIN, S.; ZHAN, L. Sometimes children are as good as adults: The pragmatic use of prosody in children’s on-line sentence processing. Journal of Memory and Language, v. 67, n. 1, p. 149-164, 2012.

169

APPENDIX A – RESULTS FROM ADULT CONTROL GROUP – EXPERIMENT 1

Looking results

Figure 32 - Average proportion of looking time towards the two-agent action in the stripping (green box) and transitive (orange box) conditions for French adults. The dashed white lines show the mean for each condition.

Pointing results

Figure 33 - Average proportion of pointing towards the two-agent action per condition per verb for the French adults.


APPENDIX B – DETAILED STIMULI AND DESIGN OF EXPERIMENT 1

Figure 34 - "Hit" action of Experiment 1. The one-agent video (left) showed the tiger hitting the duck and the bunny alternately; the tiger hit the duck three times, then turned towards the bunny and hit it three times as well, repeating this until the end of the trial. The two-agent video (right) showed the tiger and the duck hitting the bunny at the same time; they hit three times, then paused for a second and hit another three times, repeating this pattern until the end of the trial.

Figure 35 - "Eat" action of Experiment 1. The one-agent video (left) showed the tiger poking the dinosaur with a fork; he would poke the dinosaur five times, then stop and nod at the camera, then repeat this pattern until the end of the trial. The two-agent video (right) showed the tiger and the dinosaur poking the duck with a fork at the same time; they would poke it five times, then nod at the camera.

Figure 36 - "Carry" action of Experiment 1. The one-agent video (left) showed the duck holding the dinosaur on top of one wing and a present on top of the other, while slowly swinging from side to side. The two-agent video (right) showed the duck and the dinosaur holding a present box each while slowly swinging from side to side.

Figure 37 - "Push" action of Experiment 1. The one-agent video (left) showed the monkey pushing the dinosaur and the bunny on a trolley from the left to the right of the screen. The two-agent video (right) showed the monkey and the bunny pushing the dinosaur on the trolley to each other alternately.


Table 13 - Video design of Experiment 1. There are four test phases in total (one for each of the actions above), all of which follow the same design as the one exemplified below. The order of the test phases and the side on which each video appears are randomized.

| Phase | Trial | Video stimuli (left / right) | Audio stimuli | Time |
|---|---|---|---|---|
| | Character introduction | Characters appear on the screen for 5s each | Characters were named once in sentences such as "Regarde ! C'est le tigre !" | 30s |
| | Interval | Laughing baby | Laughing baby | 5s |
| | Fixation circle | Fixation circle | no audio | User controlled (at least 1s) |
| Training | Preview trial | Bunny jumping / – | "oh, regarde ! tu as vu ?" | 5s |
| Training | Interval | blank screen | no audio | 1s |
| Training | Preview trial | – / Bunny and monkey playing | "oh, regarde ! tu as vu ?" | 5s |
| Training | Fixation circle | Fixation circle | no audio | User controlled (at least 1s) |
| Training | Contrast trial | Bunny jumping / Bunny and monkey playing | "oh, regarde ! tu vois ça ?" | 5s |
| Training | Audio prompt | Fixation circle | "Attention: le lapin va sauter !" | 3s |
| Training | Test trial | Bunny jumping / Bunny and monkey playing | "Oh, regarde ! Le lapin saute ! Tu vois ? Le lapin saute !" | 12s |
| Training | Pointing trial | Bunny jumping (paused) / Bunny and monkey playing (paused) | no audio | User controlled |
| | Interval | Laughing baby | Laughing baby | 5s |
| | Fixation circle | Fixation circle | no audio | User controlled (at least 1s) |
| Test | Intro test phase | Tiger eating duck with a fork | "Regarde ! c'est le tigre ! Et le canard ! Oh, il le mange !" | 10s |
| Test | Interval | blank screen | no audio | 3s |
| Test | Preview trial | – / Tiger and dinosaur eating duck | "oh, regarde ! Tu vois ça ?" | 9s |
| Test | Interval | blank screen | no audio | 1s |
| Test | Preview trial | Tiger eating dinosaur / – | "oh, regarde ! Tu vois ça ?" | 9s |
| Test | Fixation circle | Fixation circle | no audio | User controlled (at least 1s) |
| Test | Contrast trial | Tiger eating dinosaur / Tiger and dinosaur eating duck | "oh, et là ! Regarde ! Tu vois ça ?" | 9s |
| Test | Audio prompt | Fixation circle | "Attention: Le tigre va manger (.) le dinosaure aussi !" | User controlled (at least 5s) |
| Test | Test trial | Tiger eating dinosaur / Tiger and dinosaur eating duck | "oh, regarde ! Le tigre mange (.) le dinosaure aussi ! Tu vois ? Le tigre mange (.) le dinosaure aussi !" | 12s |
| Test | Pointing trial | Tiger eating dinosaur (paused) / Tiger and dinosaur eating duck (paused) | no audio | User controlled |
| | Interval | Laughing baby | Laughing baby | 5s |
| | Fixation circle | Fixation circle | no audio | User controlled (at least 1s) |

Table 14 - Randomization of test videos for the Brazilian Experiment. We prepared four complete videos per condition. The "Side target action" column shows the side of the screen on which the two-agent action (or the "jump" action for the training trial) appeared, and the "Side 1st action" column shows which side appeared first in the preview.

| Video | Item | Stripping: side target action | Stripping: side 1st action | Transitive: side target action | Transitive: side 1st action |
|---|---|---|---|---|---|
| 1 | Jump (training) | L | R | L | R |
| 1 | Push | L | L | R | L |
| 1 | Poke | R | L | L | L |
| 1 | Carry | L | R | R | R |
| 1 | Eat | R | L | L | L |
| 2 | Jump (training) | R | L | R | L |
| 2 | Carry | R | R | L | R |
| 2 | Eat | L | L | R | L |
| 2 | Push | R | R | L | R |
| 2 | Poke | L | R | R | R |
| 3 | Jump (training) | L | R | L | R |
| 3 | Eat | L | L | R | L |
| 3 | Carry | R | L | L | L |
| 3 | Poke | L | R | R | R |
| 3 | Push | R | L | L | L |
| 4 | Jump (training) | R | L | R | L |
| 4 | Poke | R | R | L | R |
| 4 | Push | L | L | R | L |
| 4 | Eat | R | L | L | L |
| 4 | Carry | L | R | R | R |
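The entries of Table 14 appear to follow a simple pattern: within each video, the transitive condition keeps the training trial and the preview order of the stripping condition and swaps the side of the target action for the four test items. The Python sketch below checks this reading against the Video 1 rows. It is only an illustration of the pattern as read off the table, not the procedure used to build the videos; the names `STRIPPING_VIDEO_1`, `mirror`, `transitive_from_stripping` and `target_side_counts` are hypothetical.

```python
# Rows are (item, side of target action, side shown first in the preview),
# copied from the Stripping Condition entries of Video 1 in Table 14.
STRIPPING_VIDEO_1 = [
    ("Jump (training)", "L", "R"),
    ("Push", "L", "L"),
    ("Poke", "R", "L"),
    ("Carry", "L", "R"),
    ("Eat", "R", "L"),
]


def mirror(side):
    """Return the opposite screen side ("L" <-> "R")."""
    return {"L": "R", "R": "L"}[side]


def transitive_from_stripping(rows):
    """Assumed pattern: the training trial is unchanged; for test items the
    target side is swapped and the preview order is kept."""
    derived = []
    for item, target, first in rows:
        if "training" in item:
            derived.append((item, target, first))
        else:
            derived.append((item, mirror(target), first))
    return derived


def target_side_counts(rows):
    """Count how often the target action appears on each side (test items only)."""
    counts = {"L": 0, "R": 0}
    for item, target, _first in rows:
        if "training" not in item:
            counts[target] += 1
    return counts
```

Applied to the Video 1 rows, `transitive_from_stripping` reproduces the Transitive Condition entries of the table, and `target_side_counts` shows the target action appearing twice on each side across the four test items.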


APPENDIX C – CONSENT FORMS AND PARENTAL QUESTIONNAIRES FOR THE FRENCH EXPERIMENTS

Consent Forms

INFORMATION AND CONSENT TO PARTICIPATE

(A co-signed copy must be given to the person taking part)

We, the undersigned: ...... …………………...... ………………………………………... declare that we agree that the child ………………………….. for whom we are legally responsible take part in the research entitled "Acquisition précoce du lexique et de la syntaxe", organized by Anne Christophe under the conditions specified below, for which the CERES issued a favorable opinion on 4 March 2014.

To inform my decision, I have received and fully understood the following information:

Your child is still very young, and yet he or she has already set out on the fantastic discovery of the surrounding world. During the first years of life, the brain develops faster than during any other period of life. Day by day, you witness your child's progress in gaze, muscle tone, smiling, and so on. Other kinds of progress are more hidden, yet impressive. Did you know, for example, that from the first days of life your baby is able to imitate you, or that from birth he or she can recognize whether a sentence comes from the native language or from a foreign language? These unexpected abilities were discovered thanks to research carried out by scientists to explore the early capacities of infants and to find out how they develop in contact with the environment. Many laboratories, in France and abroad, take part in this program, whose future applications, particularly medical ones, will be important. Many parents, accompanied by their child, have already taken part in our work in this way.

Our team has been studying the early capacities of infants for more than thirty years. Our laboratory is attached to the Centre National de la Recherche Scientifique (CNRS) and the École Normale Supérieure. We are authorized by the Procureur de la République (public prosecutor) to consult the civil registry records, and this is how we obtained your address.

To make progress in this project, we need your collaboration and that of your child. Your participation consists of coming once, accompanied by your child, to the premises of the Laboratoire de Sciences Cognitives et Psycholinguistique, at the BabyLab of the École Normale Supérieure or at the Port-Royal maternity hospital. You will be able to see your child at all times, and we will answer all your questions about this study. Your participation is entirely voluntary, in no way mandatory, and you may stop the test at any time without having to give a reason.

Depending on the study in which you take part, your child will be seated in a suitable seat or on your lap. We will have him or her listen to words or sentences, or watch visual scenes, and we will measure his or her interest through one of his or her spontaneous responses (looking time or finger pointing, depending on your baby's age). The test itself lasts between 10 and 20 minutes; however, if your baby were to start crying, the test would be stopped immediately and not resumed. In total you will spend 30 to 45 minutes with us, including the time needed to explain the details of the procedure at the beginning and to answer your questions about our research in general at the end. If you wish, we will send you the general conclusions reached by this study.

We have been informed that:
- We are free to accept or refuse, and to stop the participation of the child in our care at any time, without having to give a reason.
- The data concerning the child in our care will remain strictly confidential. We authorize access to them only by persons who collaborate with Anne Christophe.
- We may at any time request information from Anne Christophe or from one of the co-investigators.
- The publication of the results will not include any identifying individual result.
- Our consent does not release the organizers of the research from any of their responsibilities.
- We retain all our rights guaranteed by law.

Done at ....……………...... , on .....……………

Signature of the parent(s), preceded by the words: read and approved


Questionnaires

Specific vocabulary questionnaire: all children


Adapted CDI: 28-month-olds

Child's surname/first name:
Date of birth:
Sex:
Main type of childcare:
Number and age (months) of siblings:
Languages regularly heard by the child:
Person filling in the questionnaire (father, mother, both):
Date on which you filled in this questionnaire:

Tick the boxes for the words the child currently produces spontaneously (not in imitation). If the child's pronunciation differs from that of adults, tick the word anyway.

aïe, allô, assiette, attention, au revoir, a/avoir peur, ballon, bâteau, beau/belle, bébé, bêe bêe, biberon/bibi, bois/boire, bon/bonne, bonbon, bonjour, bottes, bouche, bras, ça, cache/cacher, cadeau, caillou, canard, casse/casser, chaise, chat, chaud/chaude, chaussure/soulier, cheval, cheveux, chien/toutou, chocolat, chut, cochon, coin-coin, compote, couche/lange, coucou, cuillère, dame, dehors, eau, école/crèche, écris/écrire, éléphant, encore, fais/faire un bisou, ferme/fermer, fleur, froid/froide, fromage, ici, là, lait, lapin, lit, livre, lumière, lune, main, maison, maman, mange/manger, merci, meuh, miaou, moi, monsieur, moto, musique, nez, nom de l'enfant, oreille, où, ouaf-ouaf, pain, pantalon, papa, pars/partir/parti, pas, pâtes, pleure/pleurer, pluie, poisson, pomme, porte, pot, poubelle, pyjama, quoi, sale, s'il te plaît, télé, tombe/tomber, verre, voiture/auto, vroum, yaourt/yogourt, yeux

Has the child already started combining words, e.g. "gâteau encore" or "papa pati"? Not yet / Sometimes / Often

If you answered yes to the previous question, list the three longest sentences the child currently produces spontaneously:


APPENDIX D – CONSENT FORMS FOR THE BRAZILIAN PORTUGUESE EXPERIMENT

FREE AND INFORMED CONSENT FORM

Research title: Desambiguação sintática através de pistas prosódicas por crianças em fase de aquisição (Syntactic disambiguation through prosodic cues by children acquiring language).
Principal researcher: Letícia Schiavon Kolberg (doctoral student in Linguistics, IEL/UNICAMP);
Advisor: Prof. Dr. Maria Bernadete Marques Abaurre (IEL/UNICAMP);
Research site: Creche Bento Quirino (Rua Cônego Cipião, 802, CEP: 13010-010, Centro, Campinas-SP);
CAAE number: 84251818.5.0000.8142

Your child, or the child under your responsibility, is being invited to take part as a volunteer in the group of participants of a study that investigates the use of prosodic information for sentence comprehension by children acquiring language. This document, called the Free and Informed Consent Form, is intended to ensure your rights and those of your child as a participant, and is prepared in two copies, one to be kept by you and the other by the researcher. Please read it carefully and calmly, taking the opportunity to clarify any doubts. If you have questions before or even after signing it, you can clarify them with the researcher. If you prefer, you may take this Form home and consult your family or other people before deciding whether to authorize participation. There will be no penalty or loss of any kind if you do not agree to your child's participation. Even after signing this document, you may withdraw your authorization at any time.

Rationale and objectives: The aim of this research is to investigate whether children aged 36 to 48 months can understand the difference between simple transitive sentences, such as “A mamãe embalou o bebê também” ("Mommy rocked the baby too"), and coordinated sentences, such as “A mamãe embalou. O bebê também” ("Mommy rocked. The baby too"), solely from the prosodic information that differentiates the sentences (such as that between “embalou” and “o bebê” in the second sentence, and its absence in the first). In doing so, we intend to contribute to knowledge about how native language acquisition takes place, and about whether, in the age range studied, children learning Brazilian Portuguese can already use prosodic cues to determine syntactic structure and, consequently, the possible meaning of still-unknown verbs.


Procedures: By taking part in the study, your child is being invited to join the experimental group of the research. The researcher will come to the school your child attends during normal class hours and invite him or her to watch some videos in a room inside the school. If your child agrees, the researcher will show him or her four short videos, each made up of two simultaneous videos with animal puppets performing simple actions. Along with the videos, the children will hear some sentences, and the researcher will ask them to point to the video with the correct interpretation of the sentences heard. In all, the children will see four actions: "cutucar" (poke), "comer" (eat), "carregar" (carry) and "empurrar" (push). The activity lasts about 5 minutes and will be recorded by a camera and an eye-tracking device, so that the researcher can use, in addition to the children's active responses, information on the direction of gaze towards each video to determine comprehension of the sentences heard. The images collected will not be disclosed to anyone outside the research team under any circumstances; the only information to be used for the research will be the amount of time the children spend looking at the transitive video and at the intransitive video when hearing one of the sentences described.

Discomforts and risks: Your child should not take part in this study if he or she has any cognitive and/or speech-language problem. No materials that could pose any risk to the child will be used. The only foreseen risk is possible discomfort or shyness on the part of the participant regarding the activity. If the child shows any kind of discomfort or unwillingness, the activity will be ended and he or she will be taken back to his or her teacher.

Benefits: There are no direct benefits to research participants. The benefits come solely from the contribution to studies on the acquisition of Brazilian Portuguese, especially with regard to the disambiguation of sentences through prosody.

Inclusion criteria: To take part in this research, your child must be within the age range of 36 to 48 months (3 to 4 years).

Exclusion criteria: Your child should not take part in this research if he or she has any cognitive and/or speech-language problem.

Follow-up and assistance: After your consent, your child or the child under your responsibility will be invited to take part in the test described. There will be no assessment of any kind and no follow-up after the end of the research. No situations requiring intervention are expected to be detected.

Confidentiality and privacy: You are guaranteed that your identity and that of your child will be kept confidential and that no information will be given to people who are not part of the research team. When the results of this study are disseminated, your names will not be mentioned. The recordings obtained will be used only to observe the time the children spend looking at the videos when hearing one of the sentences described. The recordings collected will be saved on a USB flash drive and stored by the principal researcher in a safe place for 5 years after the conclusion of the research. After this period, the flash drive will be formatted.

Reimbursement and compensation: Since the children will be invited to carry out the activity during normal class hours, there will be no reimbursement to participants. By taking part in this study, you and your child are guaranteed the right to compensation for any damages resulting from the research.

Contact: If you have questions about the study, you may contact the researcher Letícia Schiavon Kolberg, at the Department of Linguistics of the Instituto de Estudos da Linguagem, UNICAMP, located at Rua Sérgio Buarque de Holanda, nº 571, CEP: 13083-859, Campinas, SP, Brazil, Telephone: (41) 99743-1827. e-mail: [email protected]. In case of complaints or grievances about your participation and about ethical issues of the study, you may contact the secretariat of the Research Ethics Committee (CEP) of UNICAMP from 08:30 to 13:30 and from 13:00 to 17:00 at Rua Tessália Vieira de Camargo, 126; CEP 13083-887 Campinas – SP; telephone (19) 3521-8936; fax (19) 3521-7187; e-mail: [email protected].

The Research Ethics Committee (CEP). The role of the CEP is to evaluate and monitor the ethical aspects of all research involving human beings. The National Research Ethics Commission (CONEP) aims to develop regulations on the protection of human beings involved in research. It plays a coordinating role in the network of Research Ethics Committees (CEPs) of the institutions, in addition to serving as an advisory body in the area of research ethics.

Free and informed consent: Having received explanations about the nature of the research, its objectives, methods, expected benefits, potential risks and any inconvenience it may cause, I agree to the participation of my child or the child under my responsibility, and I declare that I am receiving an original copy of this document signed by the researcher and by me, with all pages initialed by both of us.

I am aware of and authorize the recording of images of the child under my responsibility solely for the purposes indicated above: ( ) Yes ( ) No
( ) I wish to receive the research results by e-mail
Child's name: ______
Name of father/mother or guardian: ______
Telephone contact: ______
e-mail (optional): ______
Date: ____/_____/______. (Signature of the person responsible)

Researcher's responsibility: I affirm that I have complied with the requirements of Resolution 466/2012 CNS/MS and its complements in preparing the protocol and in obtaining this Free and Informed Consent Form. I also affirm that I have explained this document and provided a copy of it to the participant. I state that the study was approved by the CEP to which the project was submitted and by CONEP, where applicable. I undertake to use the material and data obtained in this research exclusively for the purposes set out in this document or in accordance with the consent given by the participant.
______ Date: ____/_____/______. (Researcher's signature)


APPENDIX E – APPROVAL DOCUMENTS FOR THE RESEARCH IN BRAZIL


APPENDIX F – COMPLETE DESIGN OF EXPERIMENT 2

Table 15 - Complete design of Experiment 2. The order of the training phases and the side on which each video appears are randomized.

| Phase | Trial | Video stimuli (left / right) | Audio stimuli | Time |
|---|---|---|---|---|
| Training 1 | Preview trial | Woman sleeping / – | "oh, regarde ! tu as vu ?" | 5s |
| Training 1 | Interval | blank screen | no audio | 1s |
| Training 1 | Preview trial | – / Woman pushing another woman | "oh, regarde ! tu as vu ?" | 5s |
| Training 1 | Fixation circle | Fixation circle | no audio | User controlled (at least 1s) |
| Training 1 | Contrast trial | Woman sleeping / Woman pushing another woman | "oh, regarde ! tu vois ça ?" | 8s |
| Training 1 | Audio prompt | Fixation circle | "Attention: regarde celle qui dort/pousse !" | 3s |
| Training 1 | Test trial | Woman sleeping / Woman pushing another woman | "Tu la vois, celle qui dort/pousse ? Regarde celle qui dort/pousse !" | 8s |
| | Interval | Colorful toy picture | Laughing baby | 5s |
| | Fixation circle | Fixation circle | no audio | User controlled (at least 1s) |
| Training 2 | Preview trial | Man carrying a woman / – | "oh, regarde ! tu as vu ?" | 5s |
| Training 2 | Interval | blank screen | no audio | 1s |
| Training 2 | Preview trial | – / Man walking | "oh, regarde ! tu as vu ?" | 5s |
| Training 2 | Fixation circle | Fixation circle | no audio | User controlled (at least 1s) |
| Training 2 | Contrast trial | Man carrying another man / Man walking | "oh, regarde ! tu vois ça ?" | 8s |
| Training 2 | Audio prompt | Fixation circle | "Attention: regarde celui qui marche/porte !" | 3s |
| Training 2 | Test trial | Man carrying another man / Man walking | "Tu le vois, qui marche/porte ? Regarde celui qui marche/porte !" | 8s |
| | Interval | Colorful toy picture | Laughing baby | 5s |
| | Fixation circle | Fixation circle | no audio | User controlled (at least 1s) |
| Dialogue | Dialogue 1 | Two women talking | see fig. 19 | 23s |
| Dialogue | Interval | blank screen | no audio | 3s |
| Dialogue | Fixation circle | Fixation circle | no audio | User controlled (at least 1s) |
| Dialogue | Dialogue 2 | Two women talking | see fig. x | 23s |
| Dialogue | Fixation circle | Fixation circle | no audio | User controlled (at least 3s) |
| Test | Preview trial | – / Woman swinging another woman's leg back and forth | "oh, regarde ! Tu vois ça ?" | 5s |
| Test | Interval | blank screen | no audio | 1s |
| Test | Preview trial | Woman spinning her arm in circles / – | "oh, regarde ! Tu vois ça ?" | 5s |
| Test | Fixation circle | Fixation circle | no audio | User controlled (at least 1s) |
| Test | Contrast trial | Woman spinning her arm in circles / Woman swinging another woman's leg back and forth | "oh, et là ! Regarde ! Tu vois ça ?" | 8s |
| Test | Audio prompt | Fixation circle | "Attention: regarde celle qui dase!" | User controlled (at least 5s) |
| Test | Test trial | Woman spinning her arm in circles / Woman swinging another woman's leg back and forth | "Tu la vois, qui dase ? Regarde celle qui dase !" | 8s |
| Test | Pointing trial | Woman spinning her arm in circles (paused) / Woman swinging another woman's leg back and forth (paused) | no audio | User controlled |
| | Interval | Colorful toy picture | Laughing baby | 5s |
| | Fixation circle | Fixation circle | no audio | User controlled |


Character A: Hé, tu sais ce qu'il a fait, le papa ? Il a dasé !
Character B: Oui, le papa a dasé (!) La mamie aussi !
Character A: C'est vrai ! Et la fille ?
Character B: La fille a dasé (!) La maman aussi !
Character A: Vraiment? Elle a dasé!

Figure 38 - Second dialogue of Experiment 2.


APPENDIX G – DETAILED STIMULI AND DESIGN OF EXPERIMENT 3

Figure 39 - "Eat" action of Experiment 3. The different-action video (left) showed the tiger eating a cake, while the monkey opened and closed its arms continuously. The tiger poked the cake with the fork four times, nodded at the camera and repeated until the end of the trial. The same-action video (right) showed both the monkey and the tiger eating the cake. They took turns poking the cake and nodding at the camera, so they never performed the action at the same time.

Figure 40 - "Carry" action of Experiment 3. The different-action video (left) showed the duck holding a present and slowly swinging from left to right while the bunny opened and closed its arms and legs continuously. The same-action video (right) showed both the bunny and the duck holding a present and slowly swinging from left to right.

Figure 41 - "Push" action of Experiment 3. The different-action video (left) showed the monkey pushing a car from one side of the screen to the other while the bunny opened and closed its arms and legs continuously. The same-action video (right) showed the bunny and the monkey pushing the car to each other continuously.

Figure 42 - "Hit" action of Experiment 3. The different-action video (left) showed the duck hitting the ball with the stick while the monkey opened and closed its arms and legs continuously. The duck hit the ball three times, paused for one second and repeated until the end of the trial. The same-action video (right) showed both the duck and the monkey hitting the ball. They hit it alternately, so they never performed the action at the same time.

Figure 43 - Videos of the fifth test phase of Experiment 3. The new action video (left) shows a duck turning from side to side while holding a green plastic lettuce. The familiar action video (right) shows the tiger performing the novel action seen on the previous trials. The characters of this trial were chosen on the basis that they have not been seen performing the familiar novel action before.


Table 16 - Complete design of Experiment 3. There are five test trials in total (one for each of the actions above), which follow the same design as the one exemplified below. The order of the first four test trials, the side on which each video appears and the order of appearance of each introduction video are randomized.

| Trial | Video stimuli (left / right) | Audio stimuli | Time |
|---|---|---|---|
| Characters' introduction | Characters appear waving and dancing for 5s each | Characters would be named once in sentences such as "Regarde ! C'est le tigre !" | 30s |
| Interval | Laughing baby | Laughing baby | 5s |
| Fixation circle | Fixation circle | no audio | User-controlled (at least 1s) |
| Preview trial | Bunny jumping | "oh, regarde ! tu as vu ?" | 5s |
| Interval | blank screen | no audio | 1s |
| Preview trial | Monkey and bunny playing | "oh, regarde ! tu as vu ?" | 5s |
| Fixation circle | Fixation circle | no audio | User-controlled (at least 1s) |
| Contrast trial | Left: Bunny jumping; Right: Monkey and bunny playing | "oh, regarde ! tu vois ça ?" | 5s |
| Audio prompt | Fixation circle | "Attention: le lapin va sauter !" | 3s |
| Test trial | Left: Bunny jumping; Right: Monkey and bunny playing | "Oh, regarde ! Le lapin saute ! Tu vois ? Le lapin saute !" | 12s |
| Interval | Laughing baby | Laughing baby | 5s |
| Fixation circle | Fixation circle | no audio | User-controlled (at least 1s) |
| Intro novel action | Monkey performing the novel action by itself | "Regarde ! c'est le singe ! Il fait quoi ?" | 6s |
| Interval | blank screen | no audio | 2s |
| Intro known action | Tiger eating cake by itself | "Regarde ! c'est le tigre ! Il mange..." | 6s |
| Interval | blank screen | no audio | 3s |
| Preview trial | Tiger and monkey eating | "oh, regarde ! Tu vois ça ?" | 7s |
| Interval | blank screen | no audio | 1s |
| Preview trial | Tiger eating and monkey performing novel action | "oh, regarde ! Tu vois ça ?" | 7s |
| Fixation circle | Fixation circle | no audio | User-controlled (at least 1s) |
| Contrast trial | Left: Tiger eating and monkey performing novel action; Right: Tiger and monkey eating | "oh, et là ! Regarde ! Tu vois ça ?" | 7s |
| Audio prompt | Fixation circle | "Regarde: Le tigre mange. Le singe aussi / cali !" | User-controlled (at least 6s) |
| Test trial | Left: Tiger eating and monkey performing novel action; Right: Tiger and monkey eating | "oh, regarde ! Le tigre mange. Le singe aussi / cali ! Tu vois ? Le tigre mange. Le singe aussi / cali !" | 14s |
| Interval | Laughing baby | Laughing baby | 5s |
| Fixation circle | Fixation circle | no audio | User-controlled (at least 1s) |
| Preview trial | Duck swinging toy from side to side | "oh, regarde ! tu as vu ?" | 5s |
| Interval | blank screen | no audio | 1s |
| Preview trial | Tiger opening and closing arms and legs | "oh, regarde ! tu as vu ?" | 5s |
| Fixation circle | Fixation circle | no audio | User-controlled (at least 1s) |
| Contrast trial | Left: Duck swinging toy from side to side; Right: Tiger opening and closing arms and legs | "oh, regarde ! tu vois ça ?" | 5s |
| Audio prompt | Fixation circle | "Attention: regarde celui qui cali !" | 3s |
| Test trial | Left: Duck swinging toy from side to side; Right: Tiger opening and closing arms and legs | "Oh, regarde celui qui cali ! Tu le vois, celui qui cali ? Regarde celui qui cali !" | 12s |
| Interval | Laughing baby | Laughing baby | 5s |
| Fixation circle | Fixation circle | no audio | User-controlled |

Row groups labelled in the original table: Training phase (the bunny-jumping block), 1st test trial (the block from the action introductions to the "eat" test trial) and 5th test trial (the final block with "celui qui cali").
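The randomization stated in the caption of Table 16 (order of the first four test trials, side of each video, order of the introduction videos) can be illustrated with a minimal sketch. This is not the script used to run the experiment. The function name `build_session`, the label lists and the use of a seed are assumptions made for illustration only; in particular, the character list contains just the four characters named in the figure captions and may not match the full introduction sequence.

```python
import random

# Assumed labels, taken from the figure captions of this appendix.
FIRST_FOUR_ACTIONS = ["eat", "carry", "push", "hit"]
CHARACTERS = ["tiger", "monkey", "duck", "bunny"]  # assumption: may be incomplete


def build_session(seed=None):
    """Return one randomized session plan (hypothetical helper)."""
    rng = random.Random(seed)

    # Order in which the introduction videos appear.
    intro_order = rng.sample(CHARACTERS, k=len(CHARACTERS))

    # Order of the first four test trials.
    test_order = rng.sample(FIRST_FOUR_ACTIONS, k=len(FIRST_FOUR_ACTIONS))

    trials = []
    for action in test_order:
        # Side on which each of the two videos appears.
        videos = ["different-action", "same-action"]
        rng.shuffle(videos)
        trials.append({"action": action, "left": videos[0], "right": videos[1]})

    # The fifth test trial is kept in last position; only its sides vary.
    videos = ["new-action", "familiar-action"]
    rng.shuffle(videos)
    trials.append({"action": "fifth", "left": videos[0], "right": videos[1]})

    return {"intro_order": intro_order, "trials": trials}


plan = build_session(seed=1)
```

Passing the same seed reproduces the same plan, which makes it possible to record the seed per participant and rebuild the exact trial order afterwards.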