From the Richness of the Signal to the Poverty of the Stimulus: Mechanisms of Early Language Acquisition
Total Page:16
File Type:pdf, Size:1020Kb
From the Richness of the Signal to the Poverty of the Stimulus: Mechanisms of Early Language Acquisition Judit Gervain 2007 Trieste Contents 1 Introduction 7 1.1 The poverty of stimulus argument and the learnability of lan- guage . 12 1.1.1 The induction problem . 12 1.1.2 The Principles and Parameters (P&P) theory of lan- guage and language acquisition . 15 1.2 The richness of the signal . 21 1.2.1 Bootstrapping . 22 1.2.2 Statistical learning . 26 1.3 Perceptual constraints on learning . 31 1.4 The aim of the thesis . 32 2 What is in the initial state? 35 2.1 Experiment 1: Adjacent repetitions (ABB) . 37 2.1.1 Material . 39 2.1.2 Subjects . 41 2.1.3 Procedure . 42 2.1.4 Data Analysis and Statistics . 44 2.1.5 Results . 45 2.1.6 Discussion . 49 2.2 Experiment 2: Non-adjacent repetitions (ABA) . 51 2.2.1 Material . 51 2.2.2 Subjects . 51 2.2.3 Procedure . 52 i CONTENTS 2.2.4 Data Analysis and Statistics . 52 2.2.5 Results . 52 2.2.6 Discussion . 54 2.3 Experiment 3: ‘Non-adjacent adjacent’ repetitions (A A) . 56 2.3.1 Material . 56 2.3.2 Subjects . 56 2.3.3 Procedure . 56 2.3.4 Data analysis and Statistics . 56 2.3.5 Results . 57 2.3.6 Discussion . 60 2.4 General Discussion: The role of perceptual primitives and gen- eralizations in the initial state . 65 3 What is in the input? 71 3.1 The status of the input in language acquisition . 73 3.1.1 Language statistics: from information theory and struc- tural linguistics to experimental psychology . 73 3.1.2 The main typological properties of Japanese, Hungar- ian and Italian . 79 3.1.3 Developmental questions: what can statistical infor- mation cue? . 83 3.2 Experiment 4: Segmenting Hungarian and Italian corpora . 85 3.2.1 Corpora . 87 3.2.2 Statistical measures . 90 3.2.3 Segmentation algorithms . 91 3.2.4 Evaluation criteria . 93 3.2.5 Results . 95 3.2.6 Discussion . 139 3.3 Experiment 5: The role of frequency in cuing functors and content words . 141 3.3.1 Corpora . 142 3.3.2 Statistical measures . 144 3.3.3 Results . 144 ii CONTENTS 3.3.4 Discussion . 146 3.4 Experiment 6: The role of frequency cuing word order . 146 3.4.1 Corpora . 147 3.4.2 Statistical measures . 148 3.4.3 Results . 149 3.4.4 Discussion . 151 3.5 General Discussion . 153 4 What is in learning? 155 4.1 The distinction between function words and content words . 157 4.1.1 Defining the distinction . 157 4.1.2 Functors and content words in infants’ linguistic knowl- edge . 160 4.1.3 Functors’ role in the learnability of language . 162 4.2 Experiment 7: Adults show a sensitivity to the frequency dis- tributions and word order patterns of their native language . 171 4.2.1 Material . 171 4.2.2 Languages . 173 4.2.3 Subjects . 174 4.2.4 Procedure . 175 4.2.5 Results . 176 4.2.6 Discussion . 178 4.3 Experiment 8: Infants show a sensitivity to the frequency dis- tributions and word order patterns of their native language . 179 4.3.1 Material . 180 4.3.2 Subjects . 181 4.3.3 Procedure . 182 4.3.4 Results . 184 4.3.5 Discussion . 184 4.4 General Discussion: Bootstrapping word order from early frequency- based representations . 186 iii CONTENTS 5 General Discussion and Open Questions 191 5.1 Three basic mechanisms of language acquisition: perceptual primitives, statistics and rules . 191 5.1.1 Perceptual primitives in the initial state . 191 5.1.2 Statistical learning . 194 5.1.3 Learning the rules of natural language . 201 5.2 An integrative model of language acquisition . 202 5.3 Summary of main findings . 206 6 Conclusion 209 A Appendix to Chapter 3 213 A.1 Zipf’s law . 213 A.2 Hungarian and Italian TPs distribute according to Zipf’s law . 215 A.2.1 Sparsity . 215 A.2.2 Percentiles defining the set of arbitrary thresholds . 219 References 235 iv List of Figures 2.1 The absorption spectrum of oxyHb and deoxyHb . 37 2.2 The banana-shaped path of near-infrared light through the human head . 38 2.3 The design of Experiments 1–3 . 41 2.4 The placement of the optical probes on newborns’ heads used in Experiments 1–3 . 42 2.5 The grand average results of Experiment 1 . 46 2.6 The time course of responses in Experiment 1 . 49 2.7 The grand average results of Experiment 2 . 52 2.8 The time course of responses in Experiment 2 . 54 2.9 The grand average results of Experiment 3 . 57 2.10 The grand average results of the anterior ROIs in Experiment 3 visualized as bar plots . 59 2.11 The average results of areas of interest in Experiment 3 visu- alized as bar plots . 61 2.12 The time course of responses in Experiment 3 . 62 3.1 The histogram of FWTP values at utterance boundaries in the Hungarian corpus . 97 3.2 The histogram of FWTP values at utterance boundaries in the Italian corpus . 98 3.3 The histogram of BWTP values at utterance boundaries in the Hungarian corpus . 100 3.4 The histogram of BWTP values at utterance boundaries in the Italian corpus . 101 v LIST OF FIGURES 3.5 The histogram of MI values at utterance boundaries in the Hungarian corpus . 103 3.6 The histogram of MI values at utterance boundaries in the Italian corpus . 104 3.7 Histograms showing the distributions of FWTPs in the Hun- garian corpus . 107 3.8 Histograms showing the distributions of FWTPs in the small Italian corpus . 108 3.9 Histograms showing the distributions of FWTPs in the full Italian corpus . 109 3.10 Histograms showing the distributions of FWTPs in the Hun- garian, small Italian and full size Italian corpora at FWTPs ≥ 0.15...............................110 3.11 Histograms showing the distributions of BWTPs in the Hun- garian corpus . 111 3.12 Histograms showing the distributions of BWTPs in the small Italian corpus . 112 3.13 Histograms showing the distributions of BWTPs in the full Italian corpus . 113 3.14 Histograms showing the distributions of BWTPs in the Hun- garian, small Italian and full size Italian corpora at BWTPs ≥ 0.15...............................114 3.15 Histograms showing the distributions of MI values in the Hun- garian corpus . 116 3.16 Histograms showing the distributions of MI values in the small Italian corpus . 117 3.17 Histograms showing the distributions of MI values in the full Italian corpus . 118 3.18 Histograms showing the distributions of MI values in the Hun- garian, small Italian and full size Italian corpora . 119 3.19 The accuracy of the segmentations obtained by using arbitrary thresholds for each statistical measure, evaluated against the WB and MB criteria in the Hungarian corpus . 122 vi LIST OF FIGURES 3.20 The completeness of the segmentations obtained by using arbi- trary thresholds for each statistical measure, evaluated against the WB and MB criteria in the Hungarian corpus . 123 3.21 The accuracy and completeness of the segmentation obtained by FWTPs+BWTPs evaluated separately against WB and WIMB criteria . 125 3.22 The accuracy of the segmentations obtained by using arbitrary thresholds for each statistical measure in the small Italian corpus128 3.23 The completeness of the segmentations obtained by using ar- bitrary thresholds for each statistical measure in the small Italian corpus . 129 3.24 The accuracy of the segmentations obtained by using arbitrary thresholds for each statistical measure in the two corpora . 131 3.25 The completeness of the segmentations obtained by using ar- bitrary thresholds for each statistical measure in the two corpora132 3.26 The accuracy and completeness of the segmentation obtained by using a relative threshold for the Hungarian corpus . 135 3.27 The accuracy and completeness of the segmentation obtained by using a relative threshold for the small Italian corpus . 137 3.28 The accuracy and completeness scores obtained by using a relative threshold for the two corpora . 138 3.29 Histogram of word frequencies in the Japanese and Italian infant-directed corpora in Experiment 5 . 145 3.30 The percentages of frequent-initial and frequent-final phrases at utterance boundaries at the four different relative frequency thresholds . 150 4.1 Preference scores for functor-final word order in Experiment 7 177 4.2 The experimental setup of the headturn preference paradigm used in Experiment 8 . 183 4.3 The average looking time values of the Japanese and Italian infants in Experiment 8 . ..