Computational Phylogenetic Reconstruction of Pama-Nyungan Verb Conjugation Classes
Abstract

Parker Lorber Brody
2020

The Pama-Nyungan language family comprises some 300 Indigenous languages, spanning the majority of the Australian continent. The varied verb conjugation class systems of the modern Pama-Nyungan languages have been the object of continued interest among researchers seeking to understand how these systems may have changed over time and to reconstruct the verb conjugation class system of the common ancestor of Pama-Nyungan. This dissertation offers a new approach to this task, namely the application of Bayesian phylogenetic reconstruction models, which are employed both in testing existing hypotheses and in proposing new trajectories for the change over time of the organization of the verbal lexicon into inflection classes. Data from 111 Pama-Nyungan languages was collected based on features of the verb conjugation class systems, including the number of distinct inflectional patterns and how conjugation class membership is determined. Results favor reconstructing a restricted set of conjugation classes in the prehistory of Pama-Nyungan. Moreover, I show evidence that the evolution of different parts of the conjugation class system is highly correlated. The dissertation concludes with an excursus into the utility of closed-class morphological data in resolving areas of uncertainty in the continuing stochastic reconstruction of the internal structure of Pama-Nyungan.

A Dissertation Presented to the Faculty of the Graduate School of Yale University in Candidacy for the Degree of Doctor of Philosophy
by Parker Lorber Brody
Dissertation Director: Dr. Claire Bowern
December 2020

Copyright © 2020 by Parker Lorber Brody
All rights reserved.

Contents

List of Figures
List of Tables
Acknowledgements

1 Introduction
  1.1 Preliminaries
    1.1.1 Conjugation classes
    1.1.2 Verb conjugation classes in Pama-Nyungan
    1.1.3 Overview of the chapter
  1.2 The documentary tradition in Australia
    1.2.1 Three periods of documentation
    1.2.2 Merlan's (1979) diachronic account
    1.2.3 McGregor's (2002) typological generalizations
    1.2.4 Dixon's (1980, 2002) typology and reconstruction
  1.3 Conjugation classes in morphological theory
    1.3.1 Formal considerations
    1.3.2 Canonicity and distinctiveness
    1.3.3 Morphomics and the principle of independence
    1.3.4 Transitivity and valence
  1.4 Overview of the thesis
    1.4.1 Key research questions
    1.4.2 Chapter summary
2 Phylogenetic methods for linguistic research
  2.1 Core concepts of computational phylogenetics
    2.1.1 Distance-based methods
    2.1.2 Maximum parsimony
    2.1.3 Likelihood methods
    2.1.4 Bayesian methods
  2.2 Phylogenetic methods beyond biology
    2.2.1 Cultural phylogenetics
    2.2.2 Linguistic phylogenetics
  2.3 Anatomy of an ancestral state reconstruction analysis
    2.3.1 Feature identification and coding
    2.3.2 Phylogenetic signal
    2.3.3 Algorithmic approaches to ASR: MCMC and Bayesian inference
3 Morphological typology
  3.1 The language sample
  3.2 Typology of individual language subgroups
    3.2.1 Subgroups without conjugation classes
    3.2.2 Wati
    3.2.3 Ngumpin-Yapa
    3.2.4 Marrngu
    3.2.5 Kartu
    3.2.6 Ngayarta
    3.2.7 Paman
    3.2.8 Maric
    3.2.9 Warluwaric
    3.2.10 Wiradhuric
    3.2.11 Dyirbalic
    3.2.12 Mayi
    3.2.13 Yolngu
    3.2.14 Tangkic
    3.2.15 Waka-Kabic
    3.2.16 Gumbaynggiric
  3.3 Grammatical characters and coding
4 Ancestral state reconstruction
  4.1 Phylogenetic signal
  4.2 Ancestral state reconstruction: Model setup and comparison
    4.2.1 Model setup
    4.2.2 Convergence diagnostics
    4.2.3 Visualizing models and estimated rates
    4.2.4 Model comparison with Bayes Factor
  4.3 Core reconstruction results I: Presence of verb conjugation classes
    4.3.1 Four candidate models of Character 1: Presence of conjugation classes
    4.3.2 Results
    4.3.3 Model Comparison
  4.4 Core reconstruction results II: Number of verb conjugation classes
    4.4.1 Six candidate models of Character 2: Number of conjugation classes
    4.4.2 Results
    4.4.3 Model Comparison
    4.4.4 Reversible jump MCMC
  4.5 Core reconstruction results III: Conjugation class membership features
    4.5.1 Five candidate models of Character 3: Conjugation class membership features
    4.5.2 Results
    4.5.3 Model Comparison
  4.6 Interim discussion
5 Correlated evolution of traits
  5.1 Models of correlated evolution
    5.1.1 Meade & Pagel (2016) Independent and Dependent models
    5.1.2 phytools fitPagel and AIC weight comparison
  5.2 Correlated evolution results I: BayesTraits discrete character models
  5.3 Correlated evolution results II: fitPagel and Akaike weights
    5.3.1 Discussion
6 Effects of tree topology on reconstruction
  6.1 Identifying topologies
  6.2 Comparing topologies
    6.2.1 Phylogenetic signal
    6.2.2 Measures of Homoplasy
    6.2.3 Visualizing effects of varying tree topology
  6.3 Discussion
7 Summation and discussion
Appendix

List of Figures

2.1 The anatomy of a simple phylogenetic tree
2.2 UPGMA tree for phylogeny in Table 2.1
2.3 Three unrooted trees representing phylogeny in Table 2.1
2.4 Trait variation and D statistic for four distinct distributions of a binary trait
3.1 Comparison of Pama-Nyungan consensus tree and pruned language sample
3.2 Geographical distribution of the language sample
3.3 Geographical distribution of Character 1
3.4 Phylogenetic distribution of Character 1
3.5 Combined geographical and phylogenetic distribution of Character 1
3.6 Geographical distribution of Character 2
3.7 Phylogenetic distribution of Character 2
3.8 Geographical distribution of Character 3
3.9 Phylogenetic distribution of Character 3
4.1 Density plot for sum of changes for observed, Brownian, and random distributions of Character 1
4.2 Density plot for sum of changes for observed, Brownian, and random distributions of Character 3a
4.3 Density plot for sum of changes for observed, Brownian, and random distributions of Character 3b
4.4 Density plot of K for observed data and randomization test
4.5 Trace plots as a diagnostic of model validity
4.6 Anatomy of a directed arrow plot
4.7 Permissible transitions for four models of Character 1
4.8 Estimated transition rates for four models of Character 1
4.9 Density of reconstruction probabilities for Character 1
4.10 Permissible transitions for six models of Character 2
4.11 Estimated transition rates for six models of Character 2
4.12 Density of reconstruction probabilities for Character 2
4.13 Estimated number of parameters in Reversible jump MCMC model of Character 2
4.14 Estimated number of deleted rates in Reversible jump MCMC model of Character 2
4.15 Deletion percentage for individual rates in Reversible jump MCMC model of Character 2
4.16 Permissible transitions for five models of Character 3
4.17 Estimated transition rates for five models of Character 3
4.18 Density of reconstruction probabilities for Character 3
5.1 Heatmap of LogBF values for six sets of Independent vs. Dependent model comparisons
5.2 Estimated transition rates for Dependent model; 2 conjugation classes and transitivity-based membership
5.3 Estimated transition rates for Dependent model; 3 conjugation classes and transitivity-based membership
5.4 Estimated transition rates for Dependent model; 4 conjugation classes and transitivity-based membership
5.5 Estimated transition rates for Dependent model; 4 conjugation classes and phonology-based membership
5.6 Heatmap of Akaike weights for six sets of Independent vs. Dependent model comparisons
6.1 Histogram of tree sampling frequency; Presence of conjugation classes character, baseline unrestricted model
6.2 Overlay of ∼4,000 possible Pama-Nyungan tree topologies
6.3 Maximum clade credibility tree among ∼4,000 inferred Pama-Nyungan trees
6.4 Overlay of Karnic subgroup (plus Paakantyi) across ∼4,000 inferred Pama-Nyungan trees