http://researchspace.auckland.ac.nz

University of Auckland Research Repository, ResearchSpace

Copyright Statement

The digital copy of this thesis is protected by the Copyright Act 1994 (New Zealand).

This thesis may be consulted by you, provided you comply with the provisions of the Act and the following conditions of use:

• Any use you make of these documents or images must be for research or private study purposes only, and you may not make them available to any other person.
• Authors control the copyright of their thesis. You will recognise the author's right to be identified as the author of this thesis, and due acknowledgement will be made to the author where appropriate.
• You will obtain the author's permission before publishing any material from their thesis.

General copyright and disclaimer

In addition to the above conditions, authors give their consent for the digital copy of their work to be used subject to the conditions specified on the Library Thesis Consent Form and Deposit Licence.

Phylogeography of the commensal Rattus exulans with implications for its use as bioproxy for human migrations

Melanie Hingston

A thesis submitted in fulfilment of the requirements for the degree of Doctor of Philosophy in Biological Sciences, the University of Auckland, 2015

ABSTRACT

The Pacific rat, Rattus exulans, or Kiore as it is called by Maori, is a small human commensal associated with the human settlement of the Pacific. Archaeological evidence from Island connects the distribution of the species with the presumed ancestors of all , the so-called Lapita. Excavations have uncovered artefacts unprecedented in the , comprising a style of distinctively dentate-stamped pottery, fish hooks and remains among others. The earliest animal remains are from pigs and R. exulans, which was clearly introduced into this region by the people as part of their cultural complex. On these grounds R. exulans has previously been used as a proxy to infer human migration pathways throughout the Pacific. Human archaeological material is scarce, and the much higher sampling resolution available for the commensal allows more specific inferences. While the origin of the Lapita is still widely disputed among disciplines from archaeology to linguistics and genetics, attention is focused on the , as an essential stepping stone for the distribution of these human migrants into the Pacific.

The genetics of R. exulans might contribute to an assessment of the established theories of the Lapita origin. However, to make inferences regarding human migration based on the species, more knowledge is needed about its population structure and distribution history. In this thesis I establish the current population structure of R. exulans by extensive sampling of mitochondrial D-loop sequences across its distributional range with a strong focus on the Bismarck Archipelago. The population is deeply divided into three major geographic haplogroups with more recent regional substructures. With the help of a chronology-based reconstruction of ancestral regional distribution I infer the geographic origin of these observed haplogroups and evaluate the chronology in the light of palaeoclimatic events to distinguish natural dispersal events from those that are human mediated. The marked differentiation of the Philippines and of Remote Oceania is almost certainly connected to a Pleistocene interglacial. And while further differentiation within the Philippines can also be associated with palaeoclimatic events, evidence for dispersal of the species to the east into Near Oceania and the Pacific supports a human-mediated distribution.

These results of a dual introduction of lineages into Near and Remote Oceania support a minimum of two separate migration waves of human settlers. One was possibly Lapita-associated, spreading the Near Oceanic lineage, and a second one passed only tangentially through the Bismarcks but distributed the Remote Oceanic lineage throughout the Pacific.

Two competing theories for the origin of R. exulans have been proposed based on morphological studies on previously distinguished subtypes of the species. Here I use the population genetic data and the results of the ancestral area reconstruction to independently infer an origin for the species. Genetic evidence is incompatible with an origin on the Southeast Asian mainland, while it supports the theory of a Lesser Sunda origin.


In memory of Martin, my best friend, husband and partner in crime.


ACKNOWLEDGEMENTS

Life tried to teach me patience, because this journey was led along quite a few detours. But I am here now and I am thankful to everyone who has accompanied me in one way or another. You have made a difference and allowed me to finish what I started in a different life so long ago.

I would like to begin with sincerely thanking my supervisors, Howard Ross and Lisa Matisoo-Smith. You have not only stood by me with professional guidance but also supported me personally throughout this rollercoaster; no one could be prepared for the unexpected. For their moral support and assistance in the Lab I would like to give my special thanks to my friend Judith Robins and the late Vernon Tintinger. Judith, we can have a really long break now!

The crazy bunch, Danushka, Alana, Jess, Steven, Bonbon, and Louis, made me feel at home in the lab round 01. Finally I am catching up with you. My lab buddies round 02, Peter, Louis and Kevin, were invaluable help in all Perl or R questions, even better than google. Also, without our continuous chocolate supply, I could never have made it. Thanks also to Vicky and Patricia, particularly for reminding me to take breaks.

Thanks to my friends Lorna, Gabby, Katie and Ken, who shared much of the PhD experience and more; Emma & Leah, who are the best concert company; Steffen & Jess, who have grown to be Kiwi family, and additionally answered yet another statistics question from this humble Biologist; Ben, whose Nerd-Nites and Irregulars have provided me with much needed distraction and lots of people to discuss life, the universe and everything with. You all rock! Well, apart from everyone who left New Zealand before I could finish.

I am grateful for funding by the Marsden Fund Council administered by the Royal Society of New Zealand, the Allan Wilson Centre for Molecular Ecology and Evolution and the University of Auckland. Additional thanks to the University of Auckland for having such a supportive and flexible system, that allowed me to finish this work despite various suspension times, and personal thanks to Sue Skelly, who made the paperwork ‘go away’ in moments of need.

For permission to sample museum skins I would like to thank Mike Carleton, Linda Gordon and the National Museum of Natural History at the Smithsonian Institution, DC, Chris Conroy and the Museum of Vertebrate Zoology Berkeley, CA, Larry Heaney and the Field Museum of Natural History, IL, Ken Aplin and the South Australian Museum, as well as the National Research and Development Centre for Archaeology in along with the for the provision of the Liang Bua Cave sample. For kindly providing me with additional field samples, my thanks go to my fellow researchers Marie Pagès, INRA, France and Russell Palmer, . Additionally I would like to thank Kelly Ananga & family, Herman Mandui, Matt Leavesley, Irene, Agnes & Segunda, who supported me in my quest to catch during my stay in Kavieng and on Emirau.

My biggest thanks however goes to my families. Besides the obvious, you have supported me with countless skype calls and parcels, so I would not feel lonely and despair. You gave me the strength to go on. Still, most of all I am thankful to Martin, who has believed in me like nobody else. I wish our journey together could have included this and many more years of traveling. Thanks for all the fish! Now and forever.

In closing, I thank Aotearoa for all the beautiful rainbows and all people whom I have crossed paths with and did not mention explicitly. I remember.


CONTENTS

Abstract ...... I

Acknowledgements ...... V

List of Figures ...... XI

List of Tables ...... XV

Abbreviations ...... XVII

Chapter 01 General introduction ...... 1

1.1 Rattus exulans, the Pacific rat or Kiore ...... 4

1.1.1 Appearance ...... 5

1.1.2 Population dynamics ...... 6

1.1.3 Nutrition ...... 7

1.1.4 Habitat ...... 8

1.1.5 Behaviour ...... 9

1.1.6 Swimming and dispersal ...... 10

1.1.7 Rats, commensals and myths ...... 11

1.1.8 Current distribution ...... 13

1.2 Regional features and peculiarities ...... 16

1.2.1 Geographic history of Sunda and Sahul ...... 16

1.2.2 Human boundaries…or classifications ...... 19

1.3 Human migrations ...... 21

1.3.1 Human settlement of Sahul and the Lapita cultural complex ...... 21

1.3.2 Hypotheses on the geographic origin and subsequent migration pathways of the Lapita cultural complex...... 24

1.4 Estimation of migration rates by phylogenetic analysis ...... 40


1.4.1 Inference of population dynamics from population structure ...... 40

1.4.2 Mathematical approaches ...... 43

1.4.3 Phylogeny ...... 45

1.4.4 Optimality criteria ...... 47

1.4.5 Bayesian phylogenetics ...... 48

1.4.6 Estimation of migration rates ...... 50

Chapter 02 Mitochondrial DNA and data generation ...... 57

2.1 mtDNA as marker ...... 57

2.1 Obtaining Material ...... 59

2.2 Data Generation ...... 61

2.2.1 Protocol for fresh tissue ...... 61

2.2.2 Ancient DNA ...... 62

2.2.3 Bone preparation ...... 63

2.2.4 aDNA extraction and amplification ...... 63

2.2.5 DNA extraction from liquid preserved tissue ...... 65

2.2.6 Summary ...... 66

Chapter 03 Genetic diversity, Population structure, and Phylogeography of Rattus exulans ...... 69

3.1 Abstract ...... 69

3.2 Introduction ...... 70

3.3 Methods ...... 71

3.3.1 Sample collection ...... 71

3.3.2 Analyses ...... 73

3.4 Results ...... 76

3.4.1 Population structure ...... 76

3.4.2 Population dynamics ...... 85

3.4.3 Network analysis ...... 90


3.4.4 Phylogenetic reconstruction ...... 99

3.5 Discussion ...... 101

3.5.1 Genetic diversity ...... 101

3.5.2 Population structure ...... 103

3.5.3 Near and Remote Oceania ...... 108

Chapter 04 Origin and dispersal of Rattus exulans ...... 113

4.1 Abstract ...... 113

4.2 Introduction ...... 115

4.2.1 Where is the species commensal, where wild or feral? ...... 115

4.2.2 Hypothesis about Origin ...... 115

4.2.3 The chance of ways of dispersal ...... 116

4.2.4 Inference of ancestral ranges ...... 121

4.3 Methods ...... 124

4.3.1 Estimation of the R. exulans phylochronology ...... 124

4.3.2 Reconstruction of ancestral geographic ranges ...... 125

4.4 Results ...... 129

4.4.1 Timing of divergences within R. exulans ...... 129

4.4.2 Ancestral geographic regions for the major clades ...... 137

4.5 Discussion ...... 142

4.5.1 Age of divergence for R. exulans ...... 143

4.5.2 Deep spatial history and regional clades ...... 145

4.5.3 Origin of R. exulans ...... 151

4.6 Considerations and future directions ...... 157

Chapter 05 Knowledge of old: how well does the former taxonomic classification mirror the population structure obtained from a mitochondrial DNA marker? ..... 159

5.1 Abstract ...... 159

5.2 Introduction ...... 160

5.3 Methods ...... 163

5.4 Results ...... 168

5.5 Discussion ...... 171

5.6 Future direction ...... 176

Chapter 06 Stowaway or ethnotramp, R. exulans hitching a ride ...... 177

6.1 Abstract ...... 177

6.2 Introduction ...... 178

6.3 R. exulans, a proposal for space and time ...... 179

Is there a consensus for the dispersal of the Remote Oceanic clade(s)? .... 184

6.4 Concordance with other commensals ...... 186

6.5 Migration pathway exploration ...... 189

Can long range migration be modelled between diverged lineages ...... 194

6.6 Rats, implications for inferences of human migration pathways ...... 197

Chapter 07 Final conclusions ...... 203

Appendices ...... 207

A 1 WWII military maps for PNG and ...... 207

A 2 Laboratory protocols ...... 210

Extraction protocols for museum tissue and bone material ...... 210

Formalin Protocol ...... 212

A 3 R. exulans sample data ...... 215

References ...... 233

LIST OF FIGURES

Figure 1: Mus exulans (Peale), U. S. Exploring Expedition 1838-1842...... 4

Figure 2: Field identification card for R. exulans...... 6

Figure 3: Interior of a house on Radack, , by Kotzebue; rats feeding on fruit...... 9

Figure 4: Comparison of rat pelts by C. Mahoney (via DOC website)...... 11

Figure 5: Distribution range for R. exulans (beige) on bathymetric profile map...... 14

Figure 6: Names of places referred to throughout this study...... 15

Figure 7: Impact of sea level changes on the island connectivity within the Malayan Archipelago ...... 17

Figure 8: Combined topographic and bathymetric map, revealing the underlying Sunda and Sahul shelves with Wallacea as transitional biogeographic region ...... 18

Figure 9: Overview of archaeological sites containing dentate-stamped pottery...... 23

Figure 10: Sampling locations included in all analyses ...... 60

Figure 11: Example for documentation of rat bones ...... 63

Figure 12: Rat femur with measurement points...... 63

Figure 13: Primer alignment against a sample sequence...... 67

Figure 14: Map of the study area from mainland Southeast Asia (SEA) to Remote Oceania...... 72

Figure 15: Pooled sampling locations and major biogeographic lines...... 74

Figure 16: Comparison of the geographic distribution of haplogroups and the five AMOVA regions...... 79

Figure 17: Distribution of the 80 pooled populations of R. exulans with the approximate centre among the locations...... 81

Figure 18: Population differentiation heat maps for each pair of the five AMOVA populations...... 82

Figure 19: Inter-haplotypic distance matrices for pairwise differences among all haplotypes present within each of the five AMOVA regions...... 83

Figure 20: Mismatch distributions comparing the number of pairwise differences between pairs of haplotypes expected under demographic expansion...... 87

Figure 21: Mismatch distributions comparing the number of pairwise differences between pairs of haplotypes expected under spatial expansion...... 88

Figure 22: Mismatch distributions comparing the number of pairwise differences between pairs of haplotypes expected under demographic and spatial expansion for the Moluccas and the Bismarck Archipelago, all Micronesian islands of RO north of the Bismarcks and all of Remote Oceania without MN...... 89

Figure 23: Median joining network for 95 R. exulans haplotypes based on 340 samples (including 30 samples and 5 haplotypes from Thomson et al. 2014)...... 92

Figure 24: Median joining network based on a 141 bp alignment...... 94

Figure 25: Integer neighbour joining network for 95 R. exulans haplotypes based on 340 samples...... 96

Figure 26: Detailed median joining haplotype networks colour-coded by source-island for the Near Oceanic-region...... 97

Figure 27: Median joining network for all haplotypes found in the PHBS-region...... 98

Figure 28: Median joining network for all haplotypes found in and the Sunda and Lesser ...... 98

Figure 29: Bayesian consensus tree of 95 Rattus exulans mtDNA-HVR1 haplotypes. .. 100

Figure 30: Outline of the land bridges within the ...... 105

Figure 31: Surface currents in the Australasian during the south monsoon in February...... 118

Figure 32: Surface currents in the Australasian Mediterranean Sea during the north monsoon in August...... 118

Figure 33: Sea level changes over the last 2 Ma with magnification of the last 180 ka. .. 119

Figure 34: Paleo river systems within the Sunda shelf, at 120 m sea level...... 120

Figure 35: Subdivision of the geographic distribution of Rattus exulans into 15 discrete geographic areas for the BayArea analysis in RASP...... 128

Figure 36: Lineages through time plot for R. exulans...... 129

Figure 37: Posterior density for the time to most recent common ancestor (BEAST2) for R. exulans, estimated via fossil calibration of the Mus - Rattus split...... 130

Figure 38: Posterior density for the time to most recent common ancestor (BEAST2) for R. exulans, estimated via fixed clock rates of 0.151 and 0.2...... 131

Figure 39: Phylochronograms for R. exulans...... 133

Figure 40: Bayesian ancestral area reconstruction (RASP, BayArea)...... 139

Figure 41: Stratigraphic distribution of murid at the Liang Bua Cave on , Indonesia...... 156

Figure 42: Dispersal pathways of R. exulans (concolor group) collated after Tate (1935)...... 161

Figure 43: Geographic distribution of the R. r. exulans-series subspecies after Schwarz and Schwarz (1967)...... 162

Figure 44: Descriptive colours for differentiation of former classified subspecies after Schwarz and Schwarz (1967)...... 164

Figure 45: Two colour morph examples for R. exulans specimens from Loei Province, Thailand...... 164

Figure 46: Skull morphology of the hypothesized wild type and ancestor of the concolor / exulans-series...... 166

Figure 47: Unrooted majority rule consensus tree (80%), with weighting emphasis on the skull features and the main colour traits...... 168

Figure 48: Unrooted majority rule consensus tree (80%) with all characters equally weighted...... 169

Figure 49: Split network for the four unrooted most parsimonious trees from the morphological analysis of the R. r. exulans series subspecies...... 169

Figure 50: Contrasting juxtaposition of the BEAST-derived chronogram, based on mtDNA, and the consensus tree from the most parsimonious trees in PAUP*...... 170

Figure 51: Proposed population history for R. exulans...... 181

Figure 52: Dispersal routes of R. exulans after Tate (1935) and Schwarz and Schwarz (1967)...... 184

Figure 53: Population boundaries for LAMARC migration analyses...... 191

Figure 54: Example (S10) for the wide ranges of the initial parameter estimates and the associated migration rates in LAMARC...... 195

Figure 55: Hypothetical map for the monophyletic origin and dispersal of human races by Haeckel (1889)...... 199

LIST OF TABLES

Table 1: Primer pairs for modern and ancient mtDNA D-loop amplification...... 65

Table 2: Genetic diversity indices for regional groups of R. exulans...... 78

Table 3: Frequencies of the eleven haplotypes present in more than one region...... 80

Table 4: AMOVA pairwise ΦST for the subdivided five defined regions and absolute number of migrants (M=Nm) exchanged between populations...... 80

Table 5: AMOVA pairwise ΦST and absolute number of migrants (M=Nm) exchanged between populations for the NO-region subdivided into Island, the Bismarcks, and the Moluccas; the RO-region subdivided into Micronesia (MN) and the remainder of RO (ROMN) plus the combined Bismarcks and Moluccas (BIS-MOL)...... 84

Table 6: Genetic diversity indices of R. exulans for the subdivision of NO and RO; Neutrality test results for Tajima’s D, Fu’s FS and Ramos-Onsins and Rozas’ R2 statistic and time since expansion...... 85

Table 7: Comparison of haplotype diversity and nucleotide diversity of the mitochondrial D-loop among the four major commensal species...... 102

Table 8: Approximate central coordinates for the ancestral area reconstruction in RASP...... 128

Table 9: Most probable estimates and their 95% highest posterior density for the tMRCAs after BEAST2...... 132

Table 10: Marginal posterior probabilities for the ancestry per region...... 138

Table 11: Mean measurements of morphological characters for R. hawaiiensis and R. raveni...... 163

Table 12: Morphological characters for selected subspecies within the R. rattus exulans series, collated from Schwarz and Schwarz (1967) and Jentink (1890)...... 167

Table 13: AICM scores among the candidate models...... 193

ABBREVIATIONS

µL      Microliter
BAII    Bismarck Archipelago indigenous inhabitants
BP      before present
bp      base pair
CI      confidence interval
D-loop  displacement loop
DNA     Deoxyribonucleic acid
EA
ESS     Estimated Sampling Size
ETB     Entangled bank
ETP     Express train to
GIS     Geographic information system
GT      Generation time
HPD     highest posterior density (95%)
ISEA    Island Southeast Asia
ka      thousand years ago
km      Kilometer
kyr     thousand years
LGM     last glacial maximum
Ma      Million years ago
MCMC    Markov chain Monte Carlo
MHG     macro-haplogroup
mL      Milliliter
mPP     marginal posterior probability
MRCA    most recent common ancestor
mtDNA   Mitochondrial DNA
Myr     Million years
NHN     standard elevation zero
NRY     Non recombining Y chromosome
PP      posterior probability
RSL     Relative sea level
SBB     Slow boat to the Bismarcks
SEA     Southeast Asia
SNP     Single nucleotide polymorphism
tMRCA   time to most recent common ancestor
U       Unit
VCTI    Voyaging corridor triple I
yrs     Years

CHAPTER 01 GENERAL INTRODUCTION

Several murid-rodent species are among the most resilient and adaptable species besides humans. As omnivores, they share many physiological features with humans, and as such, Mus musculus and Rattus norvegicus have been invaluable as model organisms in medical studies. Rattus rattus became rather infamous as a host for flea vectors transmitting the Plague among other diseases. Nevertheless, the close association between humans and rodent commensals has inextricably intertwined their life histories, allowing us to utilize these species as bioproxies, also with respect to human migrations.

One species, Rattus exulans, the Pacific rat or Kiore, is associated with the last frontier of human migration, the Polynesian expansion. All inhabitable islands across the vast were settled during this last range expansion by a group of sea-faring humans. The origin of the Polynesian people can be traced back to a cultural complex spanning islands within Near Oceania; however, the origin of these so-called Lapita people remains disputed among disciplines from archaeology via linguistics to human genetics.

Within Oceania R. exulans proved to be a reliable tool to establish human migration pathways and helped to resolve the sequence of human settlements. These previous studies also indicated that mitochondrial DNA of R. exulans might hold a signal to support one theory over another concerning the appearance of Lapita in the Bismarck Archipelago within Near Oceania. Matisoo-Smith and Robins (2004) found an apparent substructure among R. exulans within Oceania but could not identify any of the Remote Oceanic haplotypes within Near Oceania. Hence the aim of this study is to resolve the full population structure of R. exulans and to evaluate the potential to use R. exulans as proxy for human migration pathways into Near Oceania, which is widely assumed to be linked to the dispersal of Austronesian language speaking peoples.


To evaluate the informational value of R. exulans as bioproxy beyond Remote Oceania, this thesis addresses the following questions:

I. Is the current population of R. exulans geographically structured within its distributional range and if so, how?
II. What was the historical distribution of the species and can it be explained by natural dispersal?
III. Can the origin of the species be determined?
IV. Does the distribution and dispersal pattern link the clades found in Near and Remote Oceania to each other or (a) particular geographic source(s)?
V. Can any of the theories for an origin of Lapita people be supported or rejected based on the findings?

To address these questions this thesis is structured as follows:

CHAPTER 01 provides an overview of the study species R. exulans and the special geographical characteristics within its distribution range. It further addresses historical and current models proposed for the origin and dispersal pathways of the ancestors of the current human populations in Remote Oceania and gives a brief overview about mathematical approaches to infer migrations among populations.

CHAPTER 02 introduces the genetic marker, the acquisition and coverage of sample specimens, as well as all wet laboratory procedures that generated the data underlying all subsequent analyses.

CHAPTER 03 establishes the genetic relationships between putative populations in the fragmented island landscape mainly by analyses of molecular variance and network constructions. The present biogeographical population structure is inferred and diversity indices and neutrality tests used to identify population dynamic processes involved in shaping the observed structure.

CHAPTER 04 infers the chronology in the biogeographic distribution of the species under consideration of palaeoclimatic changes in the Pleistocene. Two competing theories regarding the origin of R. exulans as a species are evaluated.

CHAPTER 05 contrasts the historical classification of subspecies based on morphological characters with the genetic clade structure and evaluates the information content that the available morphological characters can add in resolving the true tree structure.


CHAPTER 06 summarises the conclusions gained from previous chapters and discusses the significance of these results in the light of human migration models for the settlement of the Pacific.


1.1 Rattus exulans, the Pacific rat or Kiore

Rattus exulans was first described by Titian R. Peale (1848) as Mus exulans on Tahiti and as Mus vitiensis on . Other synonyms for the species include the descriptions as Rattus concolor, ephippium or maorium amongst many others (Ellerman, 1941; Schwarz & Schwarz, 1967; Musser & Carleton, 1993).

“M. caudâ caput corpusque longitudine superante… The tail longer than the head and body; form light and graceful; hair fine, long and silky in texture; colour sepia-brown above, nearly white beneath; feet large, white; thumb short, with a flattened nail; tail pale flesh-colour, with very small brown scales, covered slightly by fine silky hairs; eyes of a moderate size, black; ears large, rounded, and covered with silky hairs; incisor teeth small and of a yellowish-white colour.” Peale (1848)

Figure 1: Mus exulans (Peale), U. S. Exploring Expedition 1838-1842, Plate 4 (Cassin, 1858).

The above description was the initial record by Peale for Mus vitiensis and the associated illustration for Mus exulans can be seen in Figure 1. Descriptions from different islands and phenotypic adaptations to different habitats led to as many as 49 separate descriptions. There were subspecies specified as only living in or around houses, others as living solely in the bush or in the mountains (Waite, 1897; Williams, 1973). Phenotypic variations, especially between lowlands and higher elevation (>1500 m) have been described for Flores, (Mertens, 1936) and other islands, including the Philippines (Heaney 2011, pers. comm.). On the distribution of two subspecies of R. exulans overlapped; Rattus concolor otteni was described to occur only in harbour cities, as opposed to R. concolor ephippium that was assumed to be restricted to the mountains (Kopstein, 1931). Further differences in average size and body to tail ratio were described extensively in Schwarz and Schwarz’s monograph (1967).

The supposed ‘wild type’, only occurring on Flores, was described as Mus wichmanni (Jentink, 1890), later Rattus rattus wichmanni. According to Schwarz and Schwarz (1967) all commensal R. r. exulans types, commonly named Lesser Malay house-rat, derived from this wild species. The main distinction of the wichmanni type from the other subspecies was based on the phenotypic difference of belly colour; a purely white belly, i.e. continuously white belly hairs down to their roots, versus white belly hairs with dark roots to various extents as found in all other exulans types (viz. concolor, ephippium, vitiensis, raveni, negrinus, todayensis) (Schwarz & Schwarz, 1967). Based on this observation the origin of the exulans group was noted to be Flores (Schwarz, 1960). This contradicted a previously postulated origin for the concolor group by Tate (1935, p. 166) in Tenasserim, Burma (Myanmar) bordering on Siam (Thailand) north of the Isthmus of Kra.

1.1.1 Appearance

Mature R. exulans measure 100 to 150 mm, with a tail of about equal length (Mertens, 1936; Ellerman, 1941). Average body weights in the wild have been recorded as 38 g ♀ and 42 g ♂ around Kuala Lumpur, Malaysia (Harrison, 1951, 1955) and 62 g ♀ and 75 g ♂ on Kure Atoll, Hawaii (Wirtz, 1972). Most studies on the ecology of R. exulans have been carried out on different islands, and body size and weight frequently correlate with island size and population density. In captivity both sexes tend to be significantly heavier, reaching average weights of 107 g ♀ and 145 g ♂ (Egoscue, 1970).

Apart from their overall appearance, size, and body to tail ratio, two features help to distinguish R. exulans from other Rattus species. The first is a dark bar along the outer side of the otherwise pale hindfoot (Marples, 1955) and the second is the mammary formula. The number of mammae has been reduced to 8 (2 pectoral and 2 inguinal pairs) as compared to 10 in R. rattus and 12 in R. norvegicus (Waite, 1897; Tate, 1936; Schwarz, 1960). Specifically, the middle abdominal pair is suppressed (Schwarz & Schwarz, 1967); this allows for distinction from rats with the same overall number of mammae but a different pattern, like Rattus hoffmanni (1:3).


Figure 2: Field identification card for R. exulans for prospective collectors to distinguish among the commensal species.

1.1.2 Population dynamics

With 4.3 litters per year and an average litter size of 4.0 (viable offspring) the total number of offspring per adult female was determined as 17.2 per annum (Tamarin & Malecha, 1972). Females can conceive as early as 49 days old with an average of 137 days for the first litter and their reproductive span is roughly 12 months, but can be up to double that time in few individuals (Egoscue, 1970). The gestation period is around 23 days (Williams, 1973). In captivity, this polyestrous species continuously produces offspring (Egoscue, 1970), whereas in the wild, although breeding can occur continuously throughout the year (Jackson, 1962), usually it has seasonal peaks or is seasonally restricted (Tamarin & Malecha, 1971; Wirtz, 1972). Larger and heavier rats, up to a point, tend to have a higher breeding success (Jackson, 1962). Populations associated with human housing also show a more continuous breeding behaviour than populations living in adjacent gardens or grassland (Dwyer, 1978). This indicates a high adaptability of breeding behaviour subject to food availability and quality.
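The annual offspring figure quoted above is simply the product of litter frequency and mean litter size. The following minimal sketch reproduces that arithmetic using the values reported by Tamarin and Malecha (1972); the function and constant names are illustrative assumptions and do not come from the cited study.

```python
# Illustrative check of the annual offspring figure cited in the text.
# Values are those reported in the paragraph above; names are hypothetical.
LITTERS_PER_YEAR = 4.3   # litters per adult female per year
MEAN_LITTER_SIZE = 4.0   # viable offspring per litter


def annual_offspring(litters_per_year: float, litter_size: float) -> float:
    """Expected viable offspring per adult female per year."""
    return litters_per_year * litter_size


offspring = annual_offspring(LITTERS_PER_YEAR, MEAN_LITTER_SIZE)
print(f"{offspring:.1f} viable offspring per adult female per year")
```

This is a simple expectation; it does not account for seasonal restriction of breeding or for mortality, both of which the text notes vary between populations.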


Depending on mortality and reproduction rate, the total population density can vary within a year between 20 and 75 rats per acre (Wirtz, 1972); in other places it can be as low as 12 rats per acre (Tamarin & Malecha, 1971). The estimation of population density in the field is prone to experimental error and, like seasonal variation, highly variable between populations and locations. An overview of densities recorded between 1946 and 1987 can be found in Moller and Craig (1987).
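For readers more used to metric units, the densities above can be converted from rats per acre to rats per hectare. The sketch below is an added convenience and not part of the cited studies; it assumes the international acre (0.40468564224 ha), and the helper name is hypothetical.

```python
# Convert the population densities quoted in the text from rats per acre
# to rats per hectare. The conversion is an addition for illustration only.
ACRE_IN_HECTARES = 0.40468564224  # international acre, exact by definition


def per_acre_to_per_hectare(density_per_acre: float) -> float:
    """Convert a density given per acre into a density per hectare."""
    return density_per_acre / ACRE_IN_HECTARES


# Densities as reported in the paragraph above (rats per acre).
reported = {
    "Wirtz (1972), lower bound": 20,
    "Wirtz (1972), upper bound": 75,
    "Tamarin & Malecha (1971)": 12,
}

for source, density in reported.items():
    converted = per_acre_to_per_hectare(density)
    print(f"{source}: {density} rats/acre = {converted:.1f} rats/ha")
```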

1.1.3 Nutrition

The diet of R. exulans is versatile and restricted neither to a specific plant resource nor to a particular source of protein. In his description of the species, Peale (1848) noted that their food consisted mostly of pandanus, and no evidence was found that local crabs and molluscs were part of their food intake. Buller (1870) attributed the lack of the odour known to be emitted by other rats to a diet of fruit and berries, and Waite (1897), summarizing several authors’ publications, described the prevalent diet as frugivorous. Only a study from New Zealand (Rutland, 1889) found evidence of a carnivorous component in the diet, although this subsided once sufficient plant food was available.

Since those earlier years several stomach content analyses have been carried out, mostly within the scope of identifying crop pests. Although overall omnivorous, R. exulans has been confirmed to subsist on a predominantly vegetarian diet (Strecker & Jackson, 1962) including, but not exclusively consisting of , sugarcane, bananas, pawpaw, breadfruit, nuts, seeds and shoots. Following up results from (Harrison, 1951) and Hawaii (Kami, 1966), Fall et al. (1971) compared the feeding patterns of R. exulans and R. rattus on Eniwetok Atoll, showing a strong preference for softer plant tissue in the former. In all studies mentioned, R. exulans’ diet consisted of 90-99% plant material with nuts and seeds as preferred sources of protein. This also holds true for the Highlands of New Guinea (Dwyer, 1978) and thus cannot be attributed to a lack of resources. Biochemical analysis of stomach contents and fat deposits from rats in Tokelau Islands identified the dietary fat as close to exclusively consisting of oil (Mosby et al., 1974).

Other studies, mostly from New Zealand, indicate a higher percentage and variety of animal material in their diet, predominantly invertebrate, specifically arthropods (Stead, 1937; Wirtz, 1972; Hicks et al., 1975; Campbell et al., 1984; Sugihara, 1997). A small-scale study also identified a substantial amount (27%) of vertebrate remains (Bettesworth, 1972), whereas a larger study, despite small amounts of vertebrate material found in a few individuals, showed that about 80% of stomach contents did not contain any vertebrate remains (Newman & McFadden, 1989), suggesting that consumption might be opportunistic.

From a trophological perspective, R. exulans overall prefers a nut and fruit diet, yet is highly adaptable and falls back on animal protein if other protein sources are scarce. This is a particularly advantageous trait for adaptation to new environments. R. exulans has not been observed to store food, even when excess food was available (Egoscue, 1970).

1.1.4 Habitat

R. exulans can be found in heterogeneous habitats ranging from forest, preferably dry scrub and grassland to plantations, gardens and houses, but remains absent from wet areas like fields (Kopstein, 1931; Harrison, 1951; Musser, 1977; Bramley, 2014). In general it is a wild animal with the ability to quickly adjust to human habitations if the opportunity arises; nevertheless, in houses it remains a frequent visitor rather than becoming a permanent inhabitant (Kopstein, 1931). On Sulawesi (Celebes) its occurrence was recorded mostly in human-dominated environments in and around villages, and only sparsely in secondary forest and slightly altered primary forest (Musser, 1977). In general, the presence of native rats tends to restrict R. exulans to human habitations (Musser & Newcomb, 1983). This increase of abundance along a gradient of degradation and human impact is also known for the invasive R. rattus (Lehtonen et al., 2001). Nevertheless, in an extensive study on rodents in the Malay region Harrison (1957) observed a preference for grassland and scrub in R. exulans. This might be due to a higher stability of microclimate parameters in an overall fluctuating macroclimate (Strecker & Jackson, 1962). Other studies concur with the observation that the rats are seldom found in true rainforest (Marples, 1955), and Williams (1973) summarised that the general preference, in addition to good ground cover, seems to be for well-drained soil. For the Pacific he also noted that R. exulans is only found within human habitations in the absence of other commensal rats or mice. At any time they can easily adapt to changing environments and revert to a feral life (Storer, 1962; Dwyer, 1978).

Usually R. exulans builds nests underground and possibly sometimes up in trees (Marples, 1955); these nests are inconspicuous and therefore hard to find. The rats are excellent climbers, but when pursued lack endurance, which makes them easy to capture (Peale, 1848). Their home ranges are restricted to individual terrains in which males cover longer distances than females (e.g. 317 m ♂ vs. 99 m ♀) (Nass, 1977; Moller & Craig, 1987). Their territories roughly cover between 800 m² (♀) and 1740 m² (♂) (Wirtz, 1972), and the lack of clear runways suggests that the use of the range is rather exploratory (Strecker & Jackson, 1962).

1.1.5 Behaviour

Mistakenly described as nocturnal, R. exulans leads a cathemeral life style (Dwyer, 1978). Waite (1897) noticed some rats in an illustration (Figure 3) depicting the inside of a house in the Marshall Islands during von Kotzebue’s voyage between 1815 and 1818. Those illustrated rats did not shy away from humans and were present in the houses and gardens day and night (von Kotzebue, 1821). This concurs with the ease with which these rats can be caught (Peale, 1848).

Figure 3: Interior of a house on Radack, Marshall Islands, by Kotzebue; rats feeding on fruit.

The social structure of R. exulans differs from other commensal rats. Females dominate in breeding pairs and rats of the same sex tolerate each other to an extent that adults living together do not display any obvious hierarchies (Egoscue, 1970). Nevertheless, the dominant reproductive females behave aggressively towards older post-reproductive females whereas young pre-reproductive females seem to stand outside of social rank (Davis, 1979). Weaned young on occasion share a common nest with their mother and even co-breed with her and their sisters (Egoscue, 1970; Davis, 1979).

When carrying out aggressions, R. exulans lacks a submissive posture and fights continue until one sparring partner flees, which is then generally chased by the winner (Davis, 1979). Males have a clear hierarchical structure and, as with other mammals, lower-ranking individuals show an increase in adrenal weight, indicating a physical stress response (Davis, 1979). Adrenal hypertrophy may also increase with population density (Wirtz, 1972). The observed discrepancies between the two studies (Egoscue, 1970; Davis, 1979) regarding aggression point to an influence of population density on behaviour.

1.1.6 Swimming and dispersal

Legends are open to interpretation. When Elsdon Best (1942) collected Maori Lore the saying Te aio i kauria e te kiore was translated to ‘the calm waters swum by the rat’ and was interpreted as relating to a rat’s swimming powers. Possibly Maori tribes had a sarcastic humour before the British arrived, because even in calm waters this small rat fares rather badly.

However, swimming is one of three possible ways of dispersal over water barriers, so it is a crucial trait when thinking about island colonisation. Unlike Norway rats, which are avid swimmers capable of directional swimming, diving and even underwater hunting (Cottam, 1948), R. exulans shows no sense of orientation and tends to swim in circles, get lost and drown even in calm waters (Jackson & Strecker, 1962). As a consequence, swimming has repeatedly been deemed unlikely as a means of dispersal over wider expanses of water (Jackson & Strecker, 1962; Spennemann & Rapp, 1989). The associated possibility of drifting on wood or similar items is not any more likely in the context of the settlement of Remote Oceania, because exposure to sun while drifting on the ocean as well as a lack of water and food would inhibit survival over longer distances (Spennemann, 1997). The fact that within island groups some uninhabited islands are still free of R. exulans (Atkinson & Atkinson, 2000) supports the assumption that its dispersal in Oceania was entirely human mediated (Miller & Ewing, 1924).


1.1.7 Rats, commensals and myths

Cum mensa means sharing a table. Human commensal species developed when a settlement provided food and shelter in large enough quantities to sustain a population over a long period of time. The commensal forms of R. exulans have been described to lose their light bellies for a darker appearance when living indoors (Schwarz & Schwarz, 1967).

R. exulans is the smallest of the three major human commensal rat species after R. norvegicus and R. rattus (Tate, 1935). The species seems to be a rather opportunistic commensal, quite capable of adopting a wild lifestyle depending on the habitat and their competition (Harrison, 1951). According to Williams (1973) the species would develop a commensal life style as long as no larger rats were previously filling that niche. On introduction of larger commensal rat species, R. exulans was displaced (Atkinson, 1973), although observations in the Solomons reported coexistence with R. rattus and R. praetor, attributing this to the low level of specialisation in R. exulans and its advantageous small size (Johnson, 1946). There are no known interactions between R. rattus and R. exulans (Tamarin & Malecha, 1972) and as true separate species there is no interbreeding with the later European introductions.

Figure 4: Comparison of rat pelts by C. Mahoney (via DOC website)

Although R. exulans lives in close association with humans, its impact as a pest is mostly restricted to crop damage. Unlike its sister species, R. exulans is not known to spoil food supplies like flour bags (Rutland, 1889). Few obvious signs of disease could be detected in the species (Egoscue, 1970) and fleas and parasites have been reported to be missing from several Pacific populations (Marples, 1955; Wirtz, 1972), most likely due to multiple bottleneck events during the colonisations from island to island. However, in populations on New Zealand islands, isolated from other commensal rats, Roberts (1991b) documented a composition of old as well as new endo- and ectoparasites, which therefore must have travelled across the Pacific with them. Further, R. exulans is a carrier of Leptospira spp. (Perez et al., 2011), as well as Hantavirus (Nitatpattana et al., 2002) and Toxoplasma gondii (Jittapalapong et al., 2011), although with a low prevalence. From a human perspective, this might have made the species a more tolerable commensal than its larger relatives.

Sailing rats, Maori legend and lore

R. exulans was transported into the Pacific on the outrigger canoes of the first Polynesian settlers and reached every inhabited island with them. Evidence supports that this introduction of R. exulans was intentional. Its distribution over Polynesia is rather even, showing no gradient to indicate diminution in the spread from West to East (Tate, 1935). In New Zealand, the arrival of R. exulans is preserved in Maori legends and even associated with a specific vessel, the Aotea (Best, 1942). Further support to the intent is given by the accounts that R. exulans was frequently relied on as a food item (Dieffenbach, 1843; von Hochstetter, 1867; Waite, 1897). Lore from the described in detail how to catch rats and prepare them on skewers (Gill, 1880). Some Maori tribes were preserving R. exulans like they preserved birds, in their own fat (Buller, 1870; Best, 1942). This huahua kiore is said to have been a delicacy not unlike similarly preserved foods found in other remote human occupations suffering from seasonal scarcity, e.g. the kiviaq, traditionally made by Greenland Inuit, equally serves as fat and protein source. In the Tongan Island rats constituted food for lower people while rat hunting served as sport for the chiefs (Martin & Mariner, 1817).

The origin of R. exulans is also rooted in Maori-myth, wherein it is described as a pest. In stories about the Earth-Mother, Pani has not only given birth to the Kumara, but also to a daughter:

“… known as Hine-mataiti, who represented, or was the progenitor of, the kiore or rat. Said an old man of Awa to me: "The descendants of Hine-mataiti are a numerous folk in the world, ever do they assail the offspring of Pani that is the kumara, and hence man ceaselessly attacks the Kiore folk, destroying myriads.” (Best, 1942)

Through this close association of rat and men in the Pacific, R. exulans has been used as proxy for human migrations and vice versa (Roberts, 1991a; Matisoo-Smith, 1994). This rat is the only commensal that reached all the islands with archaeological evidence of Polynesian occupation, and is thus a suitable bioproxy for tracing the expansion of the Polynesians into Remote Oceania (Atkinson, 1985; Matisoo-Smith, 1996). Whether R. exulans will prove suitable for tracing the origins of Lapita, the assumed Polynesian ancestors, will be tested in this study.


In summary, R. exulans is highly adaptable in its food resources, habitat and even reproduction, making it a perfect commensal to expand into various niches aided by human migration.

1.1.8 Current distribution

The current distribution of this eurytopic species ranges from the Southeast Asian mainland in the West, via the Indo-Malayan Archipelago, New Guinea, along the Bismarck Archipelago, throughout Remote Oceania as far East as Rapa Nui (Easter Island), South as Stewart Island (NZ), and North as the Kure Atoll of the Hawaiian Islands (Taylor et al., 1982). This approximately covers a region with a North to South axis of 8500 km and a West to East axis of 18,000 km, the majority of which is ocean. However, R. exulans does not occur on the Australian mainland (Tate, 1936), although it is present on the remote Adele Island, part of the NW Territory of continental Australia; from there specimens were registered at the British Museum as early as 1891 (Tate, 1951). Until recently it was also absent from Taiwan (Motokawa et al., 2001). Further, no records exist of R. exulans reaching Madagascar, despite Malagasy being part of the Austronesian language family and the island’s settlement by humans from Island Southeast Asia (Burney et al., 2004; Blench, 2007).

In more recent times, the European colonial regime in the Pacific as well as subsequent military activity throughout Near Oceania and the Northwest Pacific might have impacted the distribution pattern of R. exulans. Particularly during World War II, long term military camps were established on many islands in Near Oceania (Johnson, 1946) and their continuous movements connected many locations along the Northern Coast of New Guinea (See Appendix A 1 for Figure A 1 and Figure A 2).


Figure 5: Distribution range for R. exulans (beige), on bathymetric profile map visualising the Sunda and Sahul shelf boundaries. The Thorne-Green line indicates the distinction between Near and Remote Oceania, with the estimated time of first human occupation given. The Huxley, Wallace, Weber, and Lydekker lines demarcate known biogeographic boundaries. Map: World terrain base map, Esri, USGS, NOAA.


Figure 6: Names of places referred to throughout this study. Distributional range of R. exulans indicated by beige overlay. Map: World terrain base map, Esri, USGS, NOAA.


1.2 Regional features and peculiarities

1.2.1 Geographic history of Sunda and Sahul

The geography of the region spanning from via Indonesia and the Philippines to New Guinea and Australia has repeatedly been altered due to oscillating sea levels throughout Earth's history and exceedingly so during the last 2 million years. Two shallow continental shelves, Sunda and Sahul, were gradually exposed or flooded depending on the extent of continental ice sheet build-up. At the peak of the last Pleistocene glacial period, between 16,000 and 30,000 BP, the sea level dropped to 135 m below today’s standard elevation zero. This exposed the Sahul Shelf, the dry landmass of Sahul consisting of Australia, Tasmania, New Guinea, the Aru Islands, and other islands just east, west and north of New Guinea. At the last glacial maximum (LGM; 18,000 BP) the Sahul continent covered approximately ten million square kilometres (Spriggs, 1997). Equally to the west, the Sunda Shelf was exposed, fusing the Greater Sunda Islands Sumatra, Java and Borneo with the Southeast Asian mainland via an added 1.85 million km² of landmass (Figure 7). The area in-between the two shelves, Wallacea and the Philippines, gained landmass, but remained unconnected due to deep trenches of water. This changing geography with its periods of connectivity and separation had a great impact on the dispersal of species and therefore on their evolutionary history.

Several biogeographic boundaries have been defined, limiting species distributions from both regions to different extents (Figure 8). During eight years of travel Alfred R. Wallace (1869) studied the composition of animal species across the Southeast Asian mainland and the islands to the east. After collecting about 350 land-bird species on Java and Borneo, common on both islands, he found that only ten of those had made it to Celebes (Sulawesi), only a few miles further east. Wallace was also the first to describe the large faunal differences after crossing the Lombok Strait from Bali towards the more eastern Lombok and concluded that the geographical region of the Malayan Archipelago as a whole was in fact “divisible into two portions nearly equal in extent which differ widely in their natural products” (1869, p. 10).


Figure 7: Impact of sea level changes on the island connectivity within the Malayan Archipelago, for 30m, 60m, and the maximum of 135m below today’s standard elevation zero. Current day geography outlined in yellow. Images retrieved from http://sahultime.monash.edu.au.


The boundary between these areas, thereafter known as Wallace’s Line (Huxley, 1868), separates the zoogeographical regions of Asia and Australasia, i.e. Borneo from Sulawesi and Bali from Lombok. At this boundary a striking impoverishment in numbers of all land animal species can be observed from west to east (Mayr, 1944) and similar observations have been made for the marine environment (Barber et al., 2000). Today Wallace’s Line rather defines the western limit of Wallacea, a transitional biogeographic zone with a high degree of species endemism that combines flora and fauna from both Australia and Asia (Moss & Wilson, 1998); essentially, Wallacea sits in a gap between the Sahul Shelf and the Sunda Shelf.

Figure 8: Combined topographic and bathymetric map, revealing the underlying Sunda and Sahul shelves with Wallacea as transitional biogeographic region in-between, delimited by the Wallace and Huxley lines to the west and Weber’s or Lydekker’s line to the east. Further east: Thorne-Green line separating Near and Remote Oceania, with the approximate date ranges for human settlement BP. Map: World terrain base map, Esri, USGS, NOAA.

At sea levels around 60 m below today’s zero, the Arafura Shelf between Australia and New Guinea was exposed and formed a land bridge stretching west as far as the Aru Islands. As a result these lands share many marsupial mammals, land birds and freshwater fish. Lydekker's line, which runs along the western edge of the Sahul Shelf, is used as the boundary for this strictly Australian fauna. However, Weber’s line, which is defined by a 50:50 ratio of Asian to Australian mammals and molluscs, is often used as a two-sided delimiter.


Geologically, the margin of the Pacific plate divides the Pacific islands from the Pacific basin. Island Southeast Asia, New Guinea, the , and the islands east to Tonga and South to New Zealand lie within the ‘Ring of Fire’ on the circum-Pacific rim, a volcanically active area around the Pacific basin caused by continental shelf movements involving the Pacific plate (Wegener, 1912; Allen, 1965). In this context the Andesite line was defined in the early 20th century by P. Marshall (see Cotton, 1958), which parallels the subduction zone along the rim, indicating the presence of highly explosive andesite volcanoes. Apart from the obvious environmental hazards, the specific volcanic activity also holds implications for soil nutrients (Rolett & Diamond, 2004). Tephra (ash) layers resulting from volcanic eruptions along the ring can further be used as temporal horizons in archaeology and the origin of eruptions can be traced by the unique chemistry of the deposited tephra.

1.2.2 Human boundaries…or classifications

In 1832 the islands (= νῆσος (nēsos)) of Oceania were denoted as Polynesia, Micronesia and Melanesia by the French explorer Dumont d’Urville and these terms have been commonly used for more than 150 years. Polynesia he restricted to peoples “who observe the tapu, speak the same language, and belong to the first division of the copper-skinned or swarthy race”; Micronesia, accommodating a second group of copper-skinned race, he termed because it only consists of very small islands, whereas Melanesia, including Australia, he defined “as … home of the black race” (Dumont d’Urville, 1832/2003, p165). In this work, originally published under the title ‘Sur les îles du Grand Océan’, he also included a fourth, western region, Malaysia, as encompassing “all the islands commonly known as the ” and noted that “there is a strong evidence to suggest that these islands were the original homeland of the intrepid navigators who settled the first two divisions (Polynesia and Micronesia) of Oceania”.

A less arbitrary division into more appropriate analytical and politically correct units was made by Pawley and Green (1973) with the introduction of the dimensions of Near and Remote Oceania. Green (1991a) defined Modern Near Oceania for the time-period after 6000 BP as the islands from New Guinea out through the Bismarck Archipelago and down through the Solomon Islands chain, in contrast to Ancient Near Oceania until 6000 BP, which included Island Southeast Asia and Australia. Remote Oceania comprises all the islands separated at present from those of Near Oceania by water gaps greater than 350 km; the major boundary between the two regions is the stretch between San Christobal at the end of the Solomon chain proper and the Santa Cruz group to the east (Pawley & Green, 1973). This boundary between Near and Remote Oceania marks a major cut-off point in the natural distribution of fauna and flora, beyond which only bats and commensal mammals are found and “it is also likely to have been a major one in the distribution to Palaeolithic man” (Pawley & Green, 1973, p5). This boundary has been referred to as the ‘Thorne-Green’ Line (Roberts, 1991a).


1.3 Human migrations

Over the last decades tremendous advances have been made uncovering the history of humans as species, Homo sapiens. Their migration pathways out of and their dispersal across are becoming well established and modern genetics keeps unravelling ancient relationships with other Homo spp. such as Neanderthals (Vernot & Akey, 2014) and (Huerta-Sanchez et al., 2014). Since 2005 the Genographic Project has contributed large amounts of genetic data and involved citizens globally, resulting in a finer scale of human migratory history (Behar et al., 2007).

Since H. sapiens first left Africa an estimated 100,000 to 65,000 years before present (BP) (Cann et al., 1987; Macaulay et al., 2005; Armitage et al., 2011), modern humans have migrated aided by their increasing abilities to cross natural boundaries in a diverse and evolving environment. Less than 1000 years BP they established populations in even the remotest areas. In this last historical movement, people spread throughout Oceania into the many islands of the Pacific Ocean, today termed Remote Oceania, the last area known to be colonised. Early traces of human movements are often masked due to social and genetic admixture, caused by trade, war and multiple settlement processes, amongst other reasons. Due to the rather recent occurrence of the last migrations into Oceania and the hypothesised uni-directionality of these voyages, this last advance provides an ideal model for studying the dynamics and consequences of human settlement (Goodenough, 1957).

1.3.1 Human settlement of Sahul and the Lapita cultural complex

Prehistoric human colonization in the Pacific is hypothesised to have happened in at least two well researched phases. Initially, during the Pleistocene hunter-gatherer populations expanded from Island Southeast Asia through New Guinea from where they reached the Bismarck Archipelago. The Papuan-speaking descendants of these people are found dispersed throughout New Guinea and parts of Island Melanesia (Bellwood, 1997). The initial entry into the supercontinent Sahul occurred around 45,000 BP (± 1,000 years, Gillespie, 2002; O'Connell & Allen, 2004), while the Bismarck Archipelago, which was never attached to the Sahul mainland, was colonized only marginally later (Leavesley et al., 2002; Leavesley & Chappell, 2004; O'Connell & Allen, 2012) and the Solomon Islands further east by 29,000 BP (Hurles et al., 2003).


At least one other stream of peoples is thought to have arrived in Near Oceania, approximately 3300 BP. This colonization wave is associated with Austronesian language speakers and hypothesised to be part of a diaspora of Neolithic farming peoples out of south China and Taiwan around 6000 BP (Bellwood, 1997). However, this remains widely debated and I will introduce several competing hypotheses in CHAPTER 1.3.2.

However, the peoples that left their traces in Near Oceania and subsequently Remote Oceania (Figure 9) at the time were described as the Lapita cultural complex, a complex defined by a certain material culture that appeared in the archaeological horizons. The Lapita cultural complex has been named after excavation site 13 in where in 1952 Edward W. Gifford first discovered similarities of distinctly dentate-stamped ceramics excavated on-site to other ceramic pieces recovered in Tonga and Fiji. Lapita has hence been defined characteristically by this decorated pottery style, which was first described by Meyer (Meyer, 1909a, b). However, the assumed Lapita cultural complex also comprised “a range of new non-ceramic features, including permanent villages, a range of horticultural crops, domesticated animals (pigs, dogs, chickens and rats), fishhooks for inshore and open ocean fishing, fishing nets, sea-going canoes, stone adzes, anvils and shell bracelets” (Hurles et al., 2003, p. 531). The extent of this material complex has shifted over the years and the existence of a pre-packaged complex at the time of arrival in Near Oceania is debated (Donohue & Denham, 2010; Specht et al., 2013).


Figure 9: Regional overview of archaeological sites containing dentate-stamped pottery. The total number of sites within the grey-shaded area has been estimated at 184 sites with a decrease in frequency along a gradient from west to east (Anderson et al., 2001).

Nevertheless, the people associated with this cultural complex are said to be ancestral to all Remote Oceanic cultures (Bellwood, 1978a; Kirch, 2002; Green, 2003) and the descendants of Lapita peoples later founded a large east Polynesian ‘‘homeland region’’ in the Societies and Cook Islands, from whence people spread to the remaining uninhabited islands of the Polynesian Triangle (Matisoo-Smith et al., 1998). However, collating evidence from several disciplines, a theory of a non-singular origin of Polynesians has emerged. This theory proposes a dual influx of peoples into Remote Oceania, the first associated with Lapita followed by a second bringing new migrants from Island Southeast Asia and forming a melting pot at the fringes of the Lapita distribution that generated the in West Polynesian (Addison & Matisoo-Smith, 2010).


1.3.2 Hypotheses on the geographic origin and subsequent migration pathways of the Lapita cultural complex.

A standard and often cited view of human migration throughout Island Southeast Asia and later into the Pacific is that Neolithic farmers expanded due to increased population sizes in their homelands (Bellwood, 1978c), while other scenarios do not require pressure from population density (Specht et al., 2013). To different extents there is an agreement that those Neolithic farming peoples brought along their horticulture, domesticated animals and their very different lifestyle into an area previously inhabited by hunter-gatherers, and subsequently into previously uninhabited regions. As with all radiations of farming peoples, they also would have brought animal-derived pathogens that are potentially harmful to non-farming peoples, thus providing the farmers with an advantage (Black, 1975; Bellwood, 1978c; Motulsky, 1989; Diamond, 1997; Weiss, 2001). The island of New Guinea already had horticulture when the Austronesian expansion took place (Denham et al., 2003) and therefore probably represented the only area in the world where two agricultural peoples collided (Cox, 2003). A recent hypothesis proposes that the Austronesians were not farming people at all, but fisher-foragers (Blench, 2012), which would align well with the seafaring capabilities of the Lapita settlers.

Although highly unlikely, a general discussion has been ongoing about whether any human migrations undertaken were merely accidents rather than intentional voyages. However, Kirch (2002) noted that whatever route the people took when crossing through Wallacea, a minimum of ten water crossings including one of 100 km was required to reach Near Oceania and thus argued that these voyages did not happen by accident but must have been undertaken intentionally, possibly with the aim to find new resources and/or settle new lands. A study about founding population sizes in Polynesia further declared that the founder size needed (>50 women) is too large to be explained by storm-driven ‘cast aways’ (Murray-McIntosh et al., 1998). It also has to be mentioned that when the colonial powers in Polynesia prohibited canoes the German officials complained as late as 1910 about the continued abundance of ‘unregistered’ outriggers performing open ocean travels despite the prohibition (Hympendahl, 1997), indicative of a long-term highly interconnected society.

Further, long-term long-distance exchanges within Melanesia and beyond were inferred by animal translocations and evidence of obsidian trade (Tykot & Chia, 1997; Grayson, 2001; Matisoo-Smith, 2007; Summerhayes, 2009). Cuscus (Phalanger orientalis) may have been introduced to New Ireland as early as 23,500 to 20,000 BP (Heinsohn, 2010, and references therein) and it is the only marsupial found in the Solomons today (Flannery et al., 1995). Its range further includes Timor, where it has been found in deposits dated to 6000 BP (Spriggs, 1998) and many islands of the Moluccas. Other cuscus species have made it as far west as Sulawesi (Hooijer, 1974; Spriggs, 1998). Fossils of a wallaby species found in northern Halmahera, which were recovered from a similar time layer, were assumed to represent the extant Dorcopsis muelleri and to have been introduced from Vogelkop, New Guinea (Flannery et al., 1995; Heinsohn, 2010). These are only a few examples of proposed early animal translocations due to human agency. Likewise, support for the existence of long-distance marine trade networks as early as 6000 BP was obtained by obsidian artifacts from multiple sources in Melanesia that have been found in western Borneo (Sabah, Malaysia; > 3500 km to the west) (Spriggs et al., 2011). Seafaring in the Taiwan Strait only began by 6000 BP (Rolett et al., 2002).

Against this backdrop of well interconnected human populations fell the Austronesian expansion with which the Lapita phenomenon is associated. Due to the somewhat sudden appearance of Lapita associated artefacts in the archaeological horizons in Melanesia it has been a long-standing question where the ancestors of these peoples originated and what their pathways were that brought them to Near Oceania. In the long run this included questioning what level of interaction took place with indigenous peoples along the way. There is still not much agreement about the geographic origin of the Lapita peoples who subsequently settled the entire Pacific, or about the interdependent route for the migration of those seafarers. The main dispute is about where the direct Lapita ancestors began their migration, whether Taiwan (Diamond, 1988), the Island Southeast Asian Triangle (Oppenheimer, 2003) or Island Melanesia itself (Allen, 1984). Hence, several models have been developed that differ in their assumptions of the origin of the Pre-Lapita peoples, as well as their pathways and tempo of dispersal. Many of these models lean on evidence obtained within single scientific fields and thus their results are not necessarily in agreement with those of scholars in different fields.

The ‘classic’ models discussed in the literature are (1) the Express Train to Polynesia model (ETP), which assumes an origin of Polynesians in Taiwan or China and a rapid movement from there through Near Oceania and out into Remote Oceania with little or no cultural interaction on the way; (2) the Bismarck Archipelago Indigenous Inhabitants model (BAII), which assumes the origin of the Lapita cultural complex to be directly in Island Melanesia; (3) the Slow Boat to the Bismarcks model (SBB), which also assumes an origin in Taiwan, but with a slower passage through Island Southeast Asia and Near Oceania out into the Pacific and a certain degree of cultural admixture on the way; (4) the Voyaging Corridor Triple I model (VCTI), which also leans towards a Taiwanese origin but argues for extensive interaction on the way through Near Oceania and suggests different origins for different components of the cultural complex before entering Remote Oceania; and (5) the Entangled Bank model (EBM), which predicts that origins and pathways are so entangled that they cannot be traced back in a straight line or described by a single model. The increase in molecular evidence has led to several rejections and adjustments among the above models.

Express Train to Polynesia (ETP)

“the is the record of a number of highly mobile groups of seaborne colonists and explorers, who expanded very rapidly throughout Melanesia in the mid-late second millennium BP” Bellwood

More than thirty years ago, Bellwood (1978b) described what would later be called the Express Train hypothesis. His description was based on archaeological evidence from pottery remains, which showed similarity to pottery from southern China and Taiwan and were interpreted as deriving from a similar culture, and on linguistic evidence, which tied Lapita to the Austronesian expansion and added the accelerated time frame. Even though Bellwood no longer supports this model (Bellwood, 2001), in his first book on Pacific archaeology, ‘Man’s conquest of the Pacific’ (Bellwood, 1978b, p. 255), he claimed that the “…by far best hypothesis is to derive both the Lapita pottery and its makers directly from eastern Indonesia or the Philippines…” and “…one gets the impression that the Lapita potters are themselves intruders into a Melanesia which had been settled by other Austronesians for at least a thousand years”. But his predominant view then was expressed by the phrase that “the Lapita potters had little impact in a long term sense in Melanesia” (Bellwood, 1978b, p. 275). Overall, Bellwood (1978b) proposed to accept an immediate origin for Lapita in the Philippines or north-eastern Indonesia and even specified the timeframe as lying between 4000 and 3300 BP.

Allen (1984) discussed the dynamics of the spread into Island Melanesia as a dichotomy between the linguistically based theory supporting a rapid expansion out of Southeast Asia and what he regarded as the opposing view, based on archaeological evidence. Along with his description, he introduced the term ‘fast train to Polynesia’. In contrast, Diamond (1988), referring to Kirch’s excavations on the Mussau islands, used the occurrence of ‘full-blown Lapita artefacts’ as support for what he then called the ‘express-train’ model, because the artefacts at the site, dated back to 1600 BC, showed no trace of gradual local origin. According to Kirch (1997), the ETP theory represents the more orthodox anthropological perspective, and he mentioned that it has been around in one version or another at least since E. W. Gifford’s excavation in 1952.

In general, the ‘out of Taiwan’ or ‘express train to Polynesia’ model stands for an arrival of the culture from the west and a rapid spread eastwards without measurable pause (Diamond, 1988). Thus it describes the initial spread of Polynesian ancestors (Austronesians) without much local interaction, first into Near and then into Remote Oceania (Hurles et al., 2003). Hurles et al. (2003, p. 534) gave a concise description defining the ETP model in two parts; first the ‘out of Taiwan’ part: “(1) arose north-westward of Near Oceania, meaning probably in Taiwan, but ultimately from China, (2) the associated culture … arose in the same region northwest of Near Oceania and (3) the Austronesian people differed genetically from the first indigenous, non-Austronesian speaking ”; second the ‘express train’ arguments: “(4) a relatively rapid southward movement of the Austronesian people (with their languages) through the islands of southeast Asia and subsequently eastward into Remote Oceania, (5) no significant breaks between leaving Taiwan and reaching western Polynesia and (6) only limited genetic mixing between Austronesians and indigenous Melanesians during this initial expansion, and no large-scale replacement of indigenous Melanesians.”

Given the linguistic association of Polynesian languages with the Austronesian language family, the question of Polynesian ancestry has usually been linked to the question of the origin of this language family (Hale, 1846; Oppenheimer & Richards, 2001b). The Austronesian language family is the largest language family in the world (Blust, 2009/2013). It contains the majority of languages spoken in Polynesia, Micronesia and Island Melanesia, as well as those spoken in Island Southeast Asia, Malaysia and Madagascar (Brandstetter, 1893; Blust, 1984; Diamond, 2000). In recent years, linguistic evidence for an origin of the Austronesian language family in Taiwan has hardened and with it the inference of an origin of the Austronesian speakers in the same region (Gray & Jordan, 2000). The expansion of the Austronesian language family has been described as a pulse-pause scenario, which hypothesises two settlement pauses, one before the settlement of the Philippines and another after the settlement of Western Polynesia; the associated age of the Austronesian language family was estimated at 5200 BP and the timings of the two pauses at 4500 to 3800 BP and 2800 BP respectively (Gray et al., 2009). This is concordant with the idea of an ‘express train’ that saw the extremely rapid Austronesian expansion from Taiwan reaching the edges of Western Polynesia within 2100 years (Gray & Jordan, 2000). Gray & Jordan (2000) tested the ETP against the competing ‘entangled bank model’ (EBM) by applying a quantitative phylogenetic approach to linguistic evolution to a data set of 77 Austronesian languages. Under the ETP a strong tree-like signal would be expected, whereas the entangled bank model would be supported by a reticulate relationship with no phylogenetic signal (Hurles et al., 2003). The derived language tree fitted the ETP exceptionally well and, under the assumption that the language dispersal represented demic migration rather than cultural transmission, they rejected the specific features of the competing EBM as a result.

Some researchers consider the metaphor of an express train quite inappropriate because it implies that people simply passed through without stopping (Kirch, 1997). Green (2003) even accused some molecular biologists of doing a great disservice to the field by continuing to test and support the ETP model. Others still see converging evidence from archaeology and molecular anthropology as supporting a rapid and relatively encapsulated dispersal of the Austronesian speakers throughout the Pacific (Lum & L., 1998; Gray & Jordan, 2000). Hurles et al. (2003) rejected the ETP model alongside the EBM, because neither is complete nor accounts for later effects, such as continued migration.

Bismarck Archipelago Indigenous Inhabitants (BAII)

A fully archaeologically derived model was that of the Bismarck Archipelago Indigenous Inhabitants. It suggested the origin of the Lapita cultural complex, with all its associated achievements, within Melanesia. Allen (1984) argued that a widespread social cohesion in this region could easily account for the homeland which ‘prompted and sponsored’ the further expansion eastward. Based on archaeological evidence from the Bismarcks he concluded that “a sufficient time period elapsed to allow for a local cohesive social and economic structure to have developed”, pointing out that such a culture could develop technologies internally but could also receive and adopt others from the outside and subsequently bring them together (Allen, 1984).

Contacts outside of Melanesia would have facilitated the flow of materials, technologies and people in both directions, as indicated by bronze and obsidian finds (Allen, 1991). Obsidian occurs on many islands and, since it does not float, it must have been transported by humans (Spennemann, 1996). Given this early long-distance distribution of obsidian as well as the early settlement of the Admiralties, Oceanic watercraft are likely to have been developed locally. Nevertheless, there has been discussion about the quality of boats and seafaring capabilities (Anderson, 2001b), especially regarding the presence and quality of sails allowing for more than downwind sailing (Irwin, 2006). However, in his overview of current models Green (2003) noted that Allen has shifted his position and no longer supports the BAII.

Voyaging Corridor Triple I (Intrusion, Integration and Innovation; VCTI)

Under Green’s triple-I model, the genesis of the Lapita cultural complex required the ‘intrusion’ of a new culture, followed by ‘integration’ between the cultures, and local ‘innovation’ in technology (Green, 1991b; Green, 2003; Hurles et al., 2003). A sheltered voyaging corridor between Southeast Asia and Near Oceania allowed a gradual development of seagoing skills (Irwin, 1992). This did not necessarily conflict with Allen’s (1984) argument that the settlement of the Admiralties (Manus Is.) around 20,000 BP suggests that efficient water transport had developed in the Bismarck region several thousand years prior to Lapita. However, some hypothesised that in the course of their migration into Remote Oceania the voyagers sailed out against the wind to ensure a secure return downwind (Irwin, 1993).

Later in his work, Bellwood (2001), who formerly introduced the ETP model, argued that the Austronesians represent a population node, albeit not a hermetically sealed one, with a linguistic homeland in Taiwan and a history of dispersal through Island Southeast Asia and the Pacific during the past 5,000 years. This history had involved both demographic range expansion and, particularly in Melanesia, intensive contact-induced change, not only linguistic but also genetic (Bellwood, 2001).

Entangled Bank Model (EBM)

Terrell (1988) introduced the metaphor of the ‘entangled bank’, borrowed from Darwin, arguing that Pacific prehistory is more like ‘a playing field’ on which different games can be played by different people at different times. He questioned the often assumed isolation between islands and archipelagos (Terrell, 1988; Terrell et al., 1997), pointing out that discoveries and inventions, if thought worthy, would have been traded back and forth or even stolen (Terrell, 1988). He strongly disagreed with Goodenough’s (1957) view of the Pacific islands as a test field where human cultural drift could be observed in isolated populations; Terrell and co-authors called isolation a myth and argued for a new theory based on archaeology with a more dynamic view on “locational geography, Americanist archaeology and zoogeography”, social anthropology, experimental voyaging (see e.g. the Lapita Voyage project in 2008/2009), and human genetics. The mobility of ancient peoples along a voyaging corridor as well as the fact that trade among communities is older than pottery (see earlier arguments) supported this idea. However, when new migrants eventually arrived in the Pacific during Neolithic times, racial admixture occurred in Melanesia but not in Polynesia, Australia and parts of New Guinea, where people were isolated enough from Southeast Asia to remain true to their original endowment (Terrell & Welsh, 1997).

Overall, Terrell argued against a simplistic view and suggested, with his entangled bank model, that the level of complication and entanglement is such that no single tree can be derived for the migration pathways. For the authors, languages are not suitable to trace origins and they called tree diagrams of historical linguistics a ‘convenient analytical fiction’. Terrell summarised repeatedly that the ‘story’ about the farming Taiwanese migrating with their language into Polynesia is not ‘history’. He urged other researchers to downsize their hypotheses to make them more testable, because he saw Polynesia as just another world of human mutuality and entanglement (Terrell, 2004). Surprisingly, the same evidence is often interpreted in opposing ways by different researchers (Cox, 2005).

Slow boat to the Bismarcks (SBB)…or from Asia

Based on the study of human Polynesian Y-chromosomes, Kayser et al. (2000) proposed another model for Polynesian origins; it was named the ‘slow-boat to the Bismarcks’ model to contrast with the ‘fast train’. Their study traced the majority of Y-chromosome haplotypes to Melanesia and thus opposed a rapid movement through ISEA and Melanesia. However, the authors did concur with an ultimate origin of the Polynesian ancestor in Asia/Taiwan, but with extensive interaction and mixture with Melanesians (Kayser et al., 2000). The model is not substantially different from Green’s VCTI model.

In their own interpretation of the SBB as ‘slow boat from Asia’, Oppenheimer and Richards (2001a) suggested that the Polynesians instead originated in eastern Indonesia, somewhere between the Wallace line and the island of New Guinea. The presence of only the proto-Polynesian motif in the Taiwan region led them to estimate the time of the first occurrence of the full Polynesian motif in Wallacea, which they dated to be earlier than the archaeological pottery remains (Richards et al., 1998; Oppenheimer & Richards, 2001a).

Genetic evidence

Evidence from the field of human genetics for the Pacific region is growing steadily. As mentioned above, early studies of mitochondrial DNA (mtDNA) and the non-recombining Y-chromosome (NRY) for Oceanic populations revealed an apparent discrepancy between the two sex-specific markers: while the maternal lineage showed close affinity with Asian mtDNA, the paternal lineages did not show the same extent of Asian influence, but consisted mostly of Melanesian variants (Melton et al., 1995; Sykes et al., 1995; Hagelberg et al., 1999; Kayser et al., 2000).

These observations have since been confirmed in many fine-scale studies on Oceanic and Indonesian populations and have also gained support through the addition of autosomal markers (Hurles et al., 2002; Merriwether et al., 2005; Trejaut et al., 2005; Kayser et al., 2006; Pierson et al., 2006; Scheinfeldt et al., 2006; Friedlaender et al., 2007; Friedlaender et al., 2008; Kayser et al., 2008b; Kayser et al., 2008a; Karafet et al., 2010; Soares et al., 2011; Tumonggor et al., 2013; Shipley et al., 2015). In the following I attempt to give a brief overview of the two best-studied markers and the lineages associated with Lapita and the Austronesian expansion.

The maternal lineage

Initial genetic analyses in search of an origin of the Remote Oceanic human population were carried out on the maternally inherited mitochondrial DNA (mtDNA) (Lum et al., 1994; Sykes et al., 1995; Richards et al., 1998; Lum & Cann, 2000). All human mtDNA haplogroups outside Africa descend from the macro-haplogroups (MHG) M, N and R, carried by the populations that initially left East Africa around 50-70 ka (Shriver & Kittles, 2004; Soares et al., 2009; Tumonggor et al., 2013). The lineages found in Island Southeast Asia and both Oceanic regions today differ substantially. Melanesia has two predominant haplogroups: (1) haplogroup P, which is an old descendant of MHG-R and represents an ancient connection between Australia and New Guinea, and (2) haplogroup Q, which is a descendant of MHG-M, most common throughout Island Near Oceania but also found in Remote Oceania (Friedlaender et al., 2007). In Near Oceania further deep branches of MHG-M were detected (e.g. M27, M28, and M29) and their age and high local diversity indicate that they might have developed within Northern Island Melanesia (Pierson et al., 2006; Friedlaender et al., 2007). Both haplogroups, P and Q, can also be found at lower frequencies in the Moluccas to the west (Tumonggor et al., 2013), indicating genetic admixture within the western contact zone of the Papuan and Malayan races as phenotypically observed by Hale (1846).

Other than this gradient in Wallacea, Island Southeast Asia is dominated by a diversity of different lineages of the M, N, and R MHGs, with an observable haplotype frequency divide between western and eastern Indonesia (Tumonggor et al., 2013). The ages of these lineages vary and possibly reflect different stages of colonisation. Different levels of admixture between temporally separate migration waves from the African origin have also been suggested (Rasmussen et al., 2011). Many of the younger lineages (10-40 ka) are part of haplogroup B (descendant of MHG-R), particularly B4 and B5, and are frequently found throughout Asia but with increasing diversity in Indonesia, the Philippines and Taiwan (Sykes et al., 1995; Hagelberg, 1996; Hagelberg et al., 1999; Tumonggor et al., 2013). Haplogroup B is defined by the lack of a 9 bp tandem repeat present in most human mtDNA (Cann & Wilson, 1983). Because this deletion may have arisen only once in and has been found throughout most Austronesian-speaking groups but is virtually absent among Papuan speakers, it was used as an early marker for an Austronesian influence on populations (Hertzberg et al., 1989; Merriwether et al., 1999).

However, the lineage of most interest for tracing Polynesian origins today is lineage B4a1a1a, part of haplogroup B and often referred to as the ‘Polynesian motif’. B4a1a1a consists of base transitions at positions 16189, 16217, 16261 (signature of B4a), 14022 (signature of B4a1a1), plus 16247, all relative to the human reference sequence (Anderson et al., 1981). The Polynesian motif is mostly restricted to Austronesian-speaking populations, among which it can be found within eastern Island Melanesia and Madagascar (~20%; Soodyall et al., 1995), and, more importantly, it is present in the majority (>90%) of all Polynesians (Soares et al., 2011).
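The nested definition of the Polynesian motif can be illustrated with a short sketch. It is a simplified illustration only, assuming that a haplotype is given as the set of positions at which a sample differs from the reference sequence; the function name `classify_b4_lineage` and the example haplotypes are hypothetical. Real haplogroup assignment also checks the specific base at each position and further defining sites, neither of which is modelled here. The intermediate clade B4a1a is omitted because no separate signature is listed for it above.

```python
# Signature positions as listed in the text (relative to the reference sequence).
B4A_SITES = frozenset({16189, 16217, 16261})       # signature of B4a
B4A1A1_SITES = B4A_SITES | {14022}                 # adds the B4a1a1 signature
POLYNESIAN_MOTIF_SITES = B4A1A1_SITES | {16247}    # full motif, B4a1a1a


def classify_b4_lineage(variant_positions):
    """Return the most derived lineage whose signature sites are all present.

    `variant_positions` is an iterable of positions at which the sample
    differs from the reference. This is an illustrative simplification:
    only positions are compared, not the bases observed at them.
    """
    observed = set(variant_positions)
    if POLYNESIAN_MOTIF_SITES <= observed:
        return "B4a1a1a (Polynesian motif)"
    if B4A1A1_SITES <= observed:
        return "B4a1a1"
    if B4A_SITES <= observed:
        return "B4a"
    return "unassigned"


# Hypothetical example haplotypes (position 73 is an arbitrary extra variant).
full_motif = [73, 14022, 16189, 16217, 16247, 16261]
precursor = [73, 14022, 16189, 16217, 16261]

print(classify_b4_lineage(full_motif))
print(classify_b4_lineage(precursor))
```

Testing the most derived signature first reflects the nesting of the clades: every sample carrying the full motif also carries the B4a and B4a1a1 signatures, so the order of the checks determines which label is reported.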

The ancestral B4 subclades B4a and B4a1 are distributed throughout the Philippines, Taiwan and also China (Oppenheimer & Richards, 2001a, b), while the daughter clade B4a1a (tMRCA 13.2 ± 3.8 ka) has not been found on the mainland (Trejaut et al., 2005). This initially led to the interpretation that the lineage B4a1a1a must have been derived from those locations within the time frame of the Austronesian expansion. However, many results in earlier studies were biased by sampling choice and often did not include samples from alternative regions. More recent and comprehensive studies have found a high prevalence (24%) of lineage B4a1a in the Moluccas compared to 45% within only a single tribe from the east coast of Taiwan (9% in all of Taiwan) and 15% across the Philippines; along with the corresponding diversity indices this suggested that the presence of the lineage in Taiwan was best explained by a dispersal event (Soares et al., 2011). A similar pathway of an ‘out of ISEA’ dispersal into Taiwan around 6 ka had previously been inferred for haplogroup E, a subgroup of MHG-M (Soares et al., 2008). Additionally, the direct precursor lineage to the Polynesian motif (B4a1a1, tMRCA 9.3 ± 2.5 ka) was not observed among Taiwanese (Trejaut et al., 2005). Recent estimates led to the conclusion that the Polynesian motif originated in the Bismarck Archipelago more than 6.8 ka, which places the Asian-derived maternal Polynesian lineage in Near Oceania before the arrival of Austronesian speakers about 3,000 years later (Soares et al., 2011).

The paternal lineage

Preliminary note: The resolution achieved by typing NRY increases with every fine-scale study that adds single nucleotide polymorphisms (SNPs). The definition of new types via downstream mutations causes former groups to become paragroups, and references to haplogroups within publications can be misleading. This makes tracing through groups and types more challenging, and the resulting fluctuations and changes in nomenclature are confusing at best. I have attempted to refer to all groups with the latest ISOGG 2012 nomenclature.

In contrast to the maternal lineage, the paternal lineage indicates a strong Melanesian influence on the populations of Island Melanesia and, although to a slightly lesser extent, Polynesia. Of the C, DE and F founder groups, which capture all non-African NRY variation (like M, N and R for mtDNA), only the C and F lineages can be observed in the Pacific; the F lineage gives rise to the K-haplogroup, which in turn is ancestral to the M, O, and S haplogroups, of which only the first two are present in Remote Oceania (Underhill & Kivisild, 2007).

The ancestral C-lineage (RPS4Y* = C) is present throughout Southeast Asia and Island Southeast Asia, while the progressive daughter lineages C-M38 (C2) and C-M208 (C2a) have their highest frequencies in eastern Indonesia and Oceania respectively, but are absent from western Indonesia and SEA, and C2a also from eastern Indonesia (collated in: Scheinfeldt et al., 2006; Karafet et al., 2010). An increase in frequency of the C2a lineage can be observed within Polynesia from west to east (Mirabal et al., 2012). Another SNP downstream of C-M38 (C-P33) only occurs within the Polynesian Triangle and has briefly been suggested as a Polynesian Y-chromosome signature (Cox et al., 2007).

Within the F-lineage, the K-derived M-haplogroup (M-P256) is most frequent in its centre in New Guinea but is also present within the Moluccas and Lesser Sunda Islands to the west, and within Oceania as far as Fiji and Tonga to the east (Capelli et al., 2001; Kayser et al., 2001; Karafet et al., 2008). The distribution of this Melanesian haplogroup indicates a strong Melanesian influence in western Polynesia. Some lineages are confined to the New Guinea highlands, but others can be found throughout Island Melanesia and particularly M-P34 is also found in the Moluccas and the Lesser Sunda Islands (Karafet et al., 2010).

The K-derived S-haplogroup (S-M230) polymorphism M254 can be found in eastern Indonesia and Melanesia, but is not present in the Pacific.

However, the Y-chromosome lineage of most interest with respect to an Austronesian-language dispersal is haplogroup O, also K-derived, which is a common Asian haplogroup and one of the major lineages in Indonesia (55%; Karafet et al., 2010). Within Indonesia three main clades are distinguished, O-M119, O-P31 and O-M122. Although the M174 mutation, characterising the group O-M119, is present within Oceania, the typed sub-haplogroups to O-M119, O-P203 and O-M110 seem more restricted in their distribution; O-P203 to SEA and island-SEA (ISEA), including the majority of Taiwanese Aboriginals, plus individual occurrences in Samoa and O-M110 to ISEA, also including Taiwanese Aboriginals, plus individual occurrences in (Karafet et al., 2010). Group O-P31 is virtually absent from eastern Indonesia as well as Oceania.

The O3-lineage (O3-M122) is present throughout Polynesia, but its distribution in Island Melanesia is patchy, more frequent in eastern Melanesia (Fiji), and importantly it is almost exclusively found within Austronesian-speaking groups (Scheinfeldt et al., 2006). It is essentially absent from the highlands of New Guinea (Kayser et al., 2001). Intensified screening revealed the highest diversity within haplogroup O3-M122 in southern East Asia (Shi et al., 2005) and yet this lineage was found in western Indonesia at less than 20% and even less in eastern Indonesia (Karafet et al., 2010). The resolution of the sub-haplogroup O3a2-P201 downstream from O3a-M324 has recently been improved (O3a2c*-P164, O3a2a-M159, O3a2b-M7 and O3a2c1-M134) and a very high prevalence of P164 attested for Tonga (Kayser et al., 2008b; Mirabal et al., 2012). Mirabal et al. (2012) further suggested the first link between Tonga and Samoa and the Ami aborigines of Taiwan, based on the high prevalence of the P164 lineage in all three populations (S 19%, T 54%, A 37%), with the Ami as a possible source population for the settlement of these islands.

Comparison of mtDNA and NRY ancestry

In comparison, Island Melanesia, particularly the Bismarck Archipelago, and Polynesia have a high frequency of Asian mtDNA coupled with a majority of Melanesian NRY haplotypes. Overall, a higher frequency of Asian-derived mtDNA than of Asian-derived NRY has been observed in every study within Oceania, e.g. Admiralty Islands: 18.4% NRY and 60.7% mtDNA, Solomon Islands: 15.8% NRY and 76% mtDNA, and Polynesia: 28% NRY and 94% mtDNA (Kayser et al., 2006; Kayser et al., 2008b; Kayser et al., 2008a; Delfin et al., 2012), demonstrating the gradient towards Remote Oceania, particularly for mtDNA. The pattern is also confirmed by autosomal studies, in which 79% of the gene pool of Polynesians was identified as East Asian and 21% as Melanesian (Kayser et al., 2008a).
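The sex-biased pattern in these figures can be made explicit with a few lines of arithmetic. The sketch below uses only the three pairs of percentages quoted above; the dictionary layout and the function name `asian_ancestry_gap` are illustrative assumptions, and the difference in percentage points is merely one simple way to express the discrepancy, not a statistic taken from the cited studies.

```python
# Asian-derived ancestry (%) per region, as quoted in the text.
ASIAN_ANCESTRY = {
    "Admiralty Islands": {"NRY": 18.4, "mtDNA": 60.7},
    "Solomon Islands": {"NRY": 15.8, "mtDNA": 76.0},
    "Polynesia": {"NRY": 28.0, "mtDNA": 94.0},
}


def asian_ancestry_gap(region):
    """Excess of Asian-derived mtDNA over Asian-derived NRY, in percentage points."""
    values = ASIAN_ANCESTRY[region]
    return round(values["mtDNA"] - values["NRY"], 1)


for region in ASIAN_ANCESTRY:
    print(f"{region}: {asian_ancestry_gap(region)} percentage points")
```

For these three regions the gap is positive throughout and widens from the Admiralty Islands (42.3) through the Solomon Islands (60.2) to Polynesia (66.0), which is the gradient towards Remote Oceania described above.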

The commensal model

A different approach to finding traces of prehistoric human movement into Near Oceania and the Pacific is the phylogenetic analysis of commensal species (Matisoo-Smith, 1994; Matisoo-Smith et al., 2009). Due to prehistoric trade and navigational practices there is the potential for intergroup movement or even multiple colonisations to blur the initial migration patterns (Irwin, 1992). Because commensal animals such as pigs, chickens and R. exulans were carried by humans, whether voluntarily or involuntarily, they must have followed human colonisation tracks (Tate, 1935) and can thus be used as a proxy for human movements (Matisoo-Smith, 1994). Studies of mtDNA variation in R. exulans have already revealed migration patterns in Remote Oceania, especially multiple colonisation pathways from a broad central region in the Southern Cook and Society Islands (Matisoo-Smith et al., 1998). Further investigation, including a more western part of the distributional range of the commensal, split the genetic diversity into three major haplogroups with clear geographic distinction (Matisoo-Smith & Robins, 2004). Based on their results, Matisoo-Smith and Robins (2004) rejected three of the models described above, the ETP, the BAII and the SBB, and argued for a voyaging corridor as the only explanation of the distributional pattern. A recent study of genetic data from 755 pigs, including 23 ancient samples, supports the existence of two human-mediated introductions of this commensal and suggests that different components of the Neolithic cultural complex might have had different origins and trajectories before they came together in the Lapita horizon (Larson et al., 2007).

Madagascar, the other end of the Austronesian realm

Madagascar is the fourth largest island on earth. It lies approximately 350 km east of the African continent and has been separated from any other landmass for 165 Myr (Lowry II et al., 1997). The shortest distance to Western Polynesia is approximately 14,000 km over open ocean and yet the regions share the same root in their Austronesian (Malayo-Polynesian) languages (Randriamasimanana, 1999). What is even more intriguing, they also share genetic roots; an equal proportion between African and Indonesian ancestry could be found in Malagasy men and women for mtDNA and NRY markers (Hurles et al., 2005). The Malagasy mtDNA carries the Polynesian motif but the predominant Asian-derived NRY lineages in Remote Oceania, O3 and C, could not be found among Malagasy men, thus discounting the possibility of a direct migration between these areas (Soodyall et al., 1995; Hurles et al., 2005). Recently, autosomal SNPs revealed a stronger genetic influence from African Bantu (60%) and a lesser Austronesian influence (30%), which can be associated with the Java-Kalimantan-Sulawesi region (Pierron et al., 2014). Other studies have found co-occurring NRY lineages (O1 and O2) shared between Madagascar and the Solomons, indicating a relationship between both regions and Western Indonesia, where that particular O2 lineage is most common (Trejaut et al., 2014). These results are concordant with linguistic evidence which connects Malagasy to the specific vocabulary of Java and of the Barito region in Southern Kalimantan, Borneo (Adelaar, 1995, 2006). In their extensive genetic study sampling 2740 individuals from Indonesian islands, Tumonggor et al. (2013) regrettably did not include any individuals from Borneo, which might have contributed to the search for the Malagasy origin, and perhaps also that of the Polynesians.

Summary

Many scientific fields contribute to the ongoing quest for the Polynesian origin. Archaeologists opened the question with excavations of ceramic shards, adzes and other tools, trade items like obsidian, horticultural crops and of course bone remains of humans and their commensals. Social anthropologists compared cultural similarities and differences according to society structure, while linguists made early connections to the Malay languages, tied nets and built trees with all languages spoken in the area. Finally, geneticists analyse phylogenetic relationships of humans, adding time depth through modern and ancient samples, and do the same for commensal animals and horticultural crops. Yet, despite accumulating evidence, a consensus among the opposing scholars still seems to be out of reach.

Archaeological evidence is consistent with an Asian and to some extent with a Melanesian origin. Linguistic evidence strictly supports a Taiwanese ancestry and human genetic evidence portrays a complex history of social interactions involving a majority of Austronesian women and Melanesian men. The sex-based discrepancy between the two genetic polymorphisms used as markers for the Austronesian expansion, mtDNA lineage B4a and Y-chromosome haplogroup O3, led to the suggestion of a matrilocal residence and matrilineal descent within the Proto-Oceanic population as best explanation, rather than chance results of founder effects (Hage & Marck, 2003). Alternatively, Kayser et al. (2006) suggested that haplogroups of Melanesian origin may have appeared earlier in Western Polynesia (Fiji) than the Asian ones.

Others used the genetic data to demonstrate that the Austronesian language influx in Near Oceania did not lead to a large-scale genetic replacement and thereby dispelled the myth that the language dispersal was entirely demic (Donohue & Denham, 2011). This is concordant with Addison and Matisoo-Smith (2010), who introduced a further variant of the VCTI model, a Western Polynesia TI model, suggesting that the initial spread of Lapita, already a genetic admixture, reached the far ends of Near Oceania and crossed the Remote Oceanic boundary, but remained there until a separate Austronesian migration wave eventually arrived in West Polynesia, leading to the increased presence of Asian DNA in the population. They argued that the local fusion of these two cultures built the foundation for the settlement of the Pacific instead of, as so far widely accepted, a single migration wave. However, what none of these models addresses is the why. Why would an Asian population decide to sail out into the vast Pacific, bypassing many inhabited islands until they reach the fringes of the currently settled area, only to remain there for the next 1000 years? And how many men and women would have been necessary to successfully intermix with the present locals to achieve this strong Asian signal?

An intriguing similarity between Remote Oceania and the far away Madagascar is the congruent inference that a majority of women must have arrived on the shores as founders for the subsequent populations. This raises the biggest question: why? And what role did diseases like Malaria play? There was evidence from , a Polynesian Outlier, that local people were very aware not to visit certain close islands, because they were convinced the ‘air’ would kill them; this was observed subsequent to visiting the island and losing various members of the European crew to a fever (Hale, 1846, p. 46).

The picture that evolves is that the people termed Lapita might have appeared in Melanesia with a rich material culture that has been observed by archaeological findings, but that this was not the culture with which a specific people set out to settle new areas, but rather a fusion of cultural traits acquired during several generations.

Surely the Lapita migration through Near Oceania and into Remote Oceania does not seem to have been a single or a simple event. It occurred over a 500-year time span, and it undoubtedly represented many exploration and settlement episodes by groups with quite different motivations (Burley, 2001). A key element in interpreting the role of Lapita in the Bismarck Archipelago thus is its chronology, when did it occur and when did changes occur in the material complex within different parts of the archipelago (Summerhayes, 2001). A dual-phase process of Lapita migration mobility has been introduced, where a stable, sedentary phase is followed by an unstable phase of high mobility (Anderson, 2001a). Linguistic evidence has brought up new ideas as well, although the highest diversity of Austronesian languages in Taiwan led to the common assumption that the origin must lie there.

If all evidence is put together, it becomes obvious that there is no simple answer, that the history is somewhat entangled and must be a bit of everything. We look back at a cultural complex that migrated with the wind and if conditions were favourable at landfall they stayed. Females seem to have been more prone to survive encounters with indigenous populations, or they might have been ‘imported’ in the first place. Whichever way, the processes leading to the archaeological signature of the Lapita complex and the settlement of the Pacific are best put in the terms of Intrusion, Integration and Innovation as phrased by Green (1991b). The voyaging corridor triple I model is essentially congruent with the later introduced slow boat model. It remains the most flexible approach to explain the complex history of the Near and Remote Oceanic settlement and its terms can be adjusted to incorporate new evidence, as has been done repeatedly.

“As the examination of the customs and idioms of the Polynesian tribes leaves no room to doubt that they form, in fact, but a single nation, and as the similarity of their dialects warrants the supposition that no great length of time has elapsed since their dispersion, we are naturally led to inquire whether it may not be possible, by the comparison of their idioms and traditions, and by other indications, to determine, with at least some degree of probability, the original point from which their separation took place, and the manner in which it was effected. By this point is not meant the primitive seat of their race in the Malaisian Archipelago, though we may hereafter venture a conjecture with regard to this, but merely the island or group in the Pacific which was the first inhabited, and which bore to the rest the relation of the mother-country to its colonies.” (Hale, 1846, p. 117)

With accumulating evidence this “degree of probability” has increased over the last 150 years, and yet, despite all research the question remains unanswered.

1.4 Estimation of migration rates by phylogenetic analysis

1.4.1 Inference of population dynamics from population structure

Mǐgrātǐō, move, travel, relocation…individuals move in daily patterns, they travel seasonally, even over long distances and some relocate for good, i.e. they migrate between populations. Those migration events can be trivial, but they can also cause the relocation of alleles formerly unprecedented in the new population. Such a change in a gene pool can have various consequences, depending on whether the allele was abundant in the source population or rare, even private, and whether it connects the gene pools of the two populations or increases the differences between them. These issues are important if we want to know the degree of kinship within populations and among species, their phylogeny. Many naturalists have tried to find the ties between species. Animals have been measured and weighed; their bones have been analysed and compared with those of similar animals. Even behavioural traits have been observed and weighted to establish similarities and differences. When natural selection was described by Darwin and Wallace (1858) as “a power accumulating slight variations possibly profitable to some part of the species economy combined with a better chance of survival”, the building blocks of evolutionary thinking were laid. The local distributions of variations in different environments form the key to either directional, stabilizing or disruptive selection, thus the origin of species.

In 1866 Gregor Mendel recorded the theories of heredity in a monograph (Mendel, 1866) today still known as Mendel’s Laws. In contrast to Darwin’s explanatory approach, Mendel did not believe in pangenesis, the diffusion of hereditary ‘gemmules’ throughout the body (Darwin, 1968, pp 448-449). Over half a century later several scientists tried to unite Darwinism and Mendelian genetics. R.A. Fisher, J.B.S. Haldane and S.G. Wright provided the mathematical foundation for this new population genetics whereas J.S. Huxley, T.G. Dobzhansky, E.W. Mayr, G.G. Simpson, G.L. Stebbins and B. Rensch contributed the Biological and Paleontological aspects in the ‘Modern Evolutionary Synthesis’ (Grene & Depew, 2004; Hey et al., 2005).

Fisher and Wright pioneered methods for computing the distribution of gene frequencies among populations as a result of the interaction of natural selection, mutation, migration and genetic drift. These distributions of neutral alleles or haplotypes can be compared between populations and subpopulations to infer migration events.

Before thinking about migration, overall population structures need to be considered. These structures strongly depend on different social systems. A group of individuals sharing a habitat and resources within will experience population growth or decline depending on the ratio of their specific reproduction and mortality rates until it reaches equilibrium, single individuals emigrate or individuals from another population immigrate. In essence, natality, mortality and movement rates must depend on population density (Krebs, 2001). Pulliam (1988) summarized the model of source and sink populations, considering that large fractions of individuals may regularly occur in "sink" habitats, where within-habitat reproduction is insufficient to balance local mortality but those populations may still persist because they are being maintained by continued immigration from more-productive "source" areas nearby.

change in population size = (births - deaths) + (immigrants - emigrants)

Chance has a much greater impact and selection is less effective in small populations than in large populations (Frankham et al., 2002). If not for regulation, source populations would grow to infinity and sink populations would shrink to extinction. This regulation involves a net export of animals via dispersal, such that in a source population emigration exceeds immigration, whereas sink populations only continue to exist if they attract immigrants from nearby source populations (Krebs, 2001). So apart from reproduction and mortality, migration is essential to sustain sound local populations.
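
The balance equation and the source-sink argument above can be illustrated with a minimal sketch. It is a deliberately crude discrete-time two-patch system, not a model taken from Pulliam (1988) or Krebs (2001); the function names and all numerical values (growth factors of 1.2 and 0.8, a carrying capacity of 1000 and an export fraction of 0.1) are hypothetical and chosen only for illustration.

```python
def step(source, sink, r_source=1.2, r_sink=0.8, capacity=1000.0, export=0.1):
    """Advance a two-patch source-sink system by one generation.

    r_source > 1: births exceed deaths in the source habitat.
    r_sink   < 1: deaths exceed births in the sink habitat.
    export: fraction of the source population emigrating to the sink.
    All parameter values are illustrative assumptions.
    """
    grown_source = min(source * r_source, capacity)  # crude density regulation
    grown_sink = sink * r_sink                       # local decline in the sink
    migrants = grown_source * export                 # emigrants leave the source
    return grown_source - migrants, grown_sink + migrants


def simulate(generations, export):
    """Iterate the system from hypothetical starting sizes of 500 and 100."""
    source, sink = 500.0, 100.0
    for _ in range(generations):
        source, sink = step(source, sink, export=export)
    return source, sink


with_migration = simulate(200, export=0.1)
without_migration = simulate(200, export=0.0)
print("with immigration:    source=%.1f sink=%.1f" % with_migration)
print("without immigration: source=%.1f sink=%.1f" % without_migration)
```

Under these assumptions the sink persists only while it receives immigrants; once the export fraction is set to zero it declines towards extinction.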

In general, genetic variation is necessary to provide populations with options in the face of environmental alteration (Felsenstein, 1976). In a spatially distributed population there may be a tendency for the more related members to cluster together (Kingman, 1982a). When genetic data is used to infer migration events, only the amount of gene flow between populations can be measured. This will most certainly underestimate the true amount of migrants. Also, in the ideal geneticists’ world, each individual contributes gametes equally to a pool for the next generation. Because this assumption is violated, the effective population size Ne is used instead (Krebs, 2001).

Today, most phylogenies are inferred from genetic markers and as models get more complex the inferences have a chance of getting closer to the real scenario. However, this comes at an increased computational cost and wrong models will give wrong answers; the more parameters are added, the more mistakes in the underlying assumptions can be made.

Many mathematical approaches have been made to answer a set of questions like: What is the relationship between populations? Do we have one population or more, and if we have more, which one was the first? Do individuals immigrate and/or emigrate? In the following pages I will give an overview of these approaches in chronological order, with a main focus on the detection of migration events.

General Framework

To assess different possibilities for the estimation of migration rates, some background is needed in modelling migration in different geographic structures and the underlying mathematical approaches.

Geographic structure of migration models

Patterns of migration can be reconstructed from gene frequency data (Felsenstein, 1982). To calculate migration events between populations a model of the geographic structure of those populations is needed. The commonly used geographic models are the one island model, the island model, the stepping stone model and a model based on a general migration matrix. The following overview of these models is primarily based on the description by Felsenstein (1976, 1987-2007).

The one-island model, first introduced by Haldane (1930) is defined by one island and one large continent, where migrations in both directions occur, but only the immigrants on the island have an impact on the gene pool and gene frequencies.

$$p^{(t)} = (1 - m)\,p^{(t-1)} + m\,p_c$$

Equation 1

Thereafter the island model, in its first raw form by Levene (1953), consists of n islands which all exchange migrants at the same rate. The distance between all islands is the same, thus there is no geography. The fraction of genes coming from outside each island each generation is m and the fraction of genes arriving from each other island is m/(n-1).

$$p_i^{(t)} = (1 - m)\,p_i^{(t-1)} + \frac{m}{n-1}\sum_{j \neq i} p_j^{(t-1)}$$

Equation 2

where population i has its gene frequency $p_i^{(t)}$ in generation t (Felsenstein, 1976, 1987-2007).
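
The two recursions for the one-island model and the island model can be iterated directly. The following is a minimal sketch; the function names, the migration rate of 0.05 and the starting frequencies are hypothetical values chosen for illustration only.

```python
def one_island_step(p, p_c, m):
    """One-island model: island frequency after one generation of immigration.

    p   : gene frequency on the island in the previous generation
    p_c : gene frequency on the continent (assumed constant)
    m   : fraction of island genes replaced by immigrants per generation
    """
    return (1.0 - m) * p + m * p_c


def island_model_step(freqs, m):
    """Island model: each of n islands receives m/(n-1) from every other island."""
    n = len(freqs)
    total = sum(freqs)
    return [(1.0 - m) * p_i + m / (n - 1) * (total - p_i) for p_i in freqs]


# Hypothetical example: an island that starts without the allele.
p = 0.0
for _ in range(100):
    p = one_island_step(p, p_c=0.6, m=0.05)
print("island frequency after 100 generations:", round(p, 4))

# Hypothetical example: three islands with different starting frequencies.
freqs = [0.1, 0.5, 0.9]
for _ in range(200):
    freqs = island_model_step(freqs, m=0.05)
print("island model frequencies:", [round(f, 4) for f in freqs])
```

In both cases the frequencies converge: towards the continental frequency in the one-island model, and towards the mean over all islands in the island model, while the sum over islands stays constant.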

In the stepping-stone model (Kimura, 1953; Kimura & Weiss, 1964) migration depends upon the distance between populations that are arrayed in a regular pattern, either one-dimensional as a string or two-dimensional as a lattice, because according to Wright (1940) most of the immigrants are likely to come from neighbouring groups. So for the linear model a fraction m/2 of the genes immigrates from each of the two neighbouring populations; in the two-dimensional case that would be m/4 from each of four neighbours. In this model the number of migrants depends on distance, and migrants from further away can arrive only along a chain of stepping stones.
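
A minimal sketch of the linear stepping-stone model follows. Closing the chain of populations into a ring is an assumption made here purely to avoid special rules for the two end populations; the function name and the value m = 0.1 are likewise hypothetical.

```python
def stepping_stone_step(freqs, m):
    """One generation of the linear stepping-stone model on a ring.

    Each population keeps a fraction (1 - m) of its genes and receives
    m/2 from each of its two neighbours. The ring layout is an assumption
    of this sketch, not part of the original model description.
    """
    n = len(freqs)
    return [
        (1.0 - m) * freqs[i] + (m / 2.0) * (freqs[i - 1] + freqs[(i + 1) % n])
        for i in range(n)
    ]


# Hypothetical example: an allele that is initially fixed in one population only.
freqs = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
for generation in range(1, 4):
    freqs = stepping_stone_step(freqs, m=0.1)
    print(generation, [round(f, 4) for f in freqs])
```

The allele reaches the direct neighbours after one generation and populations two steps away only after two generations, which is the distance dependence that distinguishes this model from the island model.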

The general migration matrix model (Bodmer & Cavalli-Sforza, 1968) has the most general possible pattern. It assumes n populations and simply states a ‘backward migration matrix’ where after migration a fraction $m_{ij}$ of the genes in population i just arrived from population j, thus measuring the number of immigrant genes as fraction of the genes in the population receiving immigrants. In reverse a ‘forward migration matrix’ describes the fraction of individuals in one population ending up in the other. Regardless of the approach, the number of individuals migrating must be the same

$$N_i^{*}\,m_{ij} = N_j\,m'_{ji}$$

Equation 3

and subsequent mathematical descriptions are based on this observation.
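
The equality of migrant numbers under the two descriptions can be checked numerically. The sketch below converts a forward migration matrix into a backward one. The function name, the population sizes and the matrix entries are hypothetical, and reading the starred size as the population size after migration is an assumption of this sketch.

```python
def backward_from_forward(sizes, forward):
    """Convert a forward migration matrix into a backward migration matrix.

    sizes[j]      : size of population j before migration
    forward[j][i] : fraction of population j that ends up in population i
    Returns (sizes after migration, backward matrix), where backward[i][j]
    is the fraction of population i that has just arrived from population j.
    """
    n = len(sizes)
    new_sizes = [sum(sizes[j] * forward[j][i] for j in range(n)) for i in range(n)]
    backward = [
        [sizes[j] * forward[j][i] / new_sizes[i] for j in range(n)]
        for i in range(n)
    ]
    return new_sizes, backward


# Hypothetical example with a large and a small population.
sizes = [1000.0, 200.0]
forward = [[0.95, 0.05],
           [0.10, 0.90]]
new_sizes, backward = backward_from_forward(sizes, forward)
print("sizes after migration:", new_sizes)
print("backward matrix:", [[round(x, 4) for x in row] for row in backward])
```

Although only 10% of the small population moves, it makes up a small share of the large population afterwards, while the 5% leaving the large population form a considerable share of the small one; the number of migrating individuals is nevertheless identical under both descriptions.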

“Migration tends to smooth out geographic differences in gene frequencies. The rate at which this occurs is given by the rate of migration.” Felsenstein (1987-2007)

Felsenstein further discussed effects of recurrent migration on gene frequencies and also effects of the antagonistic forces of migration and selection in different models. And new models are constantly being developed, e.g. a special model where migration maintained a gradient of gene frequencies (Feldman & Christiansen, 1974).

1.4.2 Mathematical approaches

Traditionally, analyses of geographically structured populations were based on gene frequencies. Modern estimators are now often based on the coalescent theory. These two approaches differ substantially in their orientation and I will first give an overview of the approaches based on gene frequency.

Many population genetic analyses use F-statistics, which rely on a basic model in theoretical population genetics, the Wright-Fisher population model. This model assumes a fixed population size and non-overlapping generations. New individuals are drawn from a large gene pool; each individual is replaced every generation. Each individual has one parent, but a parent can have more than one offspring. Mutations can arise and disappear again. Only genetic drift governs the fate of mutations. This is also referred to as an ‘idealized population’.
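
Genetic drift under these assumptions can be simulated in a few lines. This is a minimal sketch of a haploid Wright-Fisher population without mutation, selection or migration; the function name, the population size and the random seed are hypothetical choices for illustration.

```python
import random


def wright_fisher(pop_size, p0, generations, seed=1):
    """Simulate allele frequency change by drift alone.

    pop_size    : fixed number of gene copies in every generation
    p0          : starting frequency of the focal allele
    generations : number of non-overlapping generations to simulate
    Each gene copy of the next generation picks one parent at random.
    """
    rng = random.Random(seed)
    count = round(p0 * pop_size)
    trajectory = [count / pop_size]
    for _ in range(generations):
        p = count / pop_size
        # Every offspring copy carries the allele with probability p.
        count = sum(1 for _ in range(pop_size) if rng.random() < p)
        trajectory.append(count / pop_size)
    return trajectory


small = wright_fisher(pop_size=20, p0=0.5, generations=50)
large = wright_fisher(pop_size=2000, p0=0.5, generations=50)
print("small population:", small[-1])
print("large population:", large[-1])
```

Repeating the run with different seeds shows larger fluctuations in the small population, and once an allele is lost or fixed the frequency no longer changes because no mutation is modelled.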

When Wright (1922) introduced the inbreeding coefficient (or fixation index F), the probability that two alleles at a locus in an individual are identical by descent, he did not mention migration. However, he subsequently realised that the frequency of a gene in a given population may be modified by migration as well as by mutation (Wright, 1931). Wright developed a structural idea of populations consisting of demes, small breeding populations, at the time emphasising the effect of gene flow in achieving local peaks in evolutionary fitness (Wright, 1932). This realisation that populations are not panmictic was important for the development of modern population genetics. By 1950 Wright suggested a calculation of the inbreeding coefficient of an individual relative to the local population and then that of the local population relative to a more comprehensive population and further up (Wright, 1950). The complete F-statistics, published a year later (Wright, 1951), distinguishes three levels of population structure, based on three levels of variation. It measures the total inbreeding in a population (due to both inbreeding within subpopulations, and differentiation among sub-populations) (FIT), partitioned into that due to inbreeding within sub-populations (FIS) and that due to differentiation among sub-populations (FST). The familiar equation for the fixation index F (inbreeding coefficient),

$$F = \frac{1}{4Nm + 1}$$

Equation 4

shows the dependency on the inverse of Nm where N is the effective population size (times 4 for diploid and 2 for haploid) and m the migration rate.
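
The relationship between F and Nm can be evaluated in both directions. The following sketch assumes the equilibrium relationship given above; the function names and the `ploidy_factor` argument (4 for diploid, 2 for haploid markers) are hypothetical conveniences.

```python
def fst_from_nm(nm, ploidy_factor=4):
    """Expected fixation index for a given number of migrants per generation."""
    return 1.0 / (ploidy_factor * nm + 1.0)


def nm_from_fst(fst, ploidy_factor=4):
    """Invert the relationship to obtain Nm from an observed fixation index."""
    if not 0.0 < fst <= 1.0:
        raise ValueError("fst must lie in the interval (0, 1]")
    return (1.0 / fst - 1.0) / ploidy_factor


for nm in (0.1, 0.5, 1.0, 5.0, 10.0):
    print("Nm = %5.1f  ->  F = %.3f" % (nm, fst_from_nm(nm)))

print("diploid Nm for F = 0.2:", nm_from_fst(0.2))
print("haploid Nm for F = 0.2:", nm_from_fst(0.2, ploidy_factor=2))
```

Because F falls steeply for small Nm and flattens for large Nm, small errors in a low F value translate into large differences in the inferred number of migrants.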

F-statistics provide summary statistics about the isolation of subpopulations and their variability. They are applicable to any population if there are only two alleles at a locus (Nei, 1973). Weir and Cockerham (1984) formally corrected the originally formulated F-statistics for the effects of sample size, number of subpopulations sampled, and equality of sample sizes across sub-populations, but they also assumed that all populations have descended from a common ancestral population. Also, different coefficients with modified parameters are used for different data types, taking into account their varying mutation rates (Hudson et al., 1992; Slatkin, 1995).

The difficulty of directly measuring gene flow has led to the common use of indirect measures extrapolated from genetic frequency data. Therefore FST and variants thereof are used to solve for Nm, the number of migrants successfully entering a population per generation (Whitlock & McCauley, 1999). The minimum assumptions made when transforming Wright’s formula for migration are that the mutation rate is zero or negligible and that the number of subpopulations is very large. So even though the parameters (FST, FIT, and FIS) offer an excellent and convenient means of summarizing population structure (Neigel, 2002; Lowe et al., 2004), the translation into an estimate of Nm is often inappropriate for real-world situations (Whitlock & McCauley, 1999).

1.4.3 Phylogeny

“The theory of evolution as the means of explaining observed similarities among organisms invites the construction of trees of descent purporting to show evolutionary relationships” Cavalli-Sforza and Edwards (1967)

Inferring phylogenetic relationships from molecular data is an estimation procedure. Because the underlying data is incomplete any evolutionary scenarios could be chosen leading to any phylogeny that could have produced the data, therefore to make the ‘best estimate’ an appropriate model is needed.

The simplest assumption underlying phylogenetic inferences is that the number of differences between two sequences will increase relative to the time since they diverged from their most recent common ancestor (MRCA). After that the assumptions vary between the different approaches. There are two traditional categories of approaches, purely algorithmic methods and optimality criteria methods. Part of the first category is the distance based Neighbour joining (NJ) algorithm (Saitou & Nei, 1987), where the algorithm defines the tree. It is fast and performs well when the divergence between species is low. However, a conversion of sequence information into a distance matrix is necessary and a distance matrix strongly reduces the phylogenetic information of the sequences to one value per sequence pair. Therefore some information will be lost and the inferred evolutionary distance might be inaccurate. NJ-trees are often used as the starting point in a search for the best phylogeny.

Most methods of phylogenetic inference depend on the underlying models of sequence evolution. These models make assumptions about rates in the process of nucleotide substitution and assign relative probabilities to various parameters, most importantly the transition versus transversion ratio and the base frequencies. The rates can be expressed in a 4 x 4 rate matrix for the different nucleotides. All models are much simpler than the underlying true processes, but general models are relatively robust to violation of their simplified assumptions, while model complexity can lead to computational intractability (Holder & Lewis, 2003). Choosing the right model for a specific data set can be challenging, but model testing software assists by ranking models with different strategies (Posada & Crandall, 1998, 2001; Posada, 2008; Darriba et al., 2012).

The first model developed was that of Jukes & Cantor (1969) (JC). It assumes that all single base changes are equally probable and the frequencies of all four bases (A, C, G, T) in DNA are the same. It corrects for reverted and superimposed substitutions (‘multiple hits’) at the same time. Kimura’s two parameter model (K2P) (Kimura, 1980) corresponds to the Jukes-Cantor model but adds a second parameter, the distinction between transitional and transversional substitution rates, taking into account that mutations from a purine to a pyrimidine base or vice versa occur far less often than mutations within the two groups. The models from Hasegawa, Kishino, and Yano (1985) (HKY) and Felsenstein (1984) (F84) additionally allow for inequality of base frequencies. On this basis the general time reversible model (GTR) (Rodriguez et al., 1990) assumes a symmetric substitution matrix, where A changes into T at the same rate as T changes into A, etc. Further parameters such as rate heterogeneity and invariant sites can be added to each of these base models. A detailed overview of these and further models and their matrices is available in Swofford et al. (1996).
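
The correction for multiple hits under the two simplest models can be sketched as follows, using the standard closed-form distance estimators for JC and K2P. The function names and the two short example sequences are hypothetical; sites containing gaps or ambiguity codes are simply skipped, which is an assumption of this sketch.

```python
import math

PURINES = {"A", "G"}


def count_differences(seq1, seq2):
    """Return (compared sites, transitions, transversions) for two aligned sequences."""
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        sites += 1
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            transitions += 1    # purine <-> purine or pyrimidine <-> pyrimidine
        else:
            transversions += 1  # purine <-> pyrimidine
    return sites, transitions, transversions


def jc69_distance(seq1, seq2):
    """Jukes-Cantor distance: d = -3/4 ln(1 - 4p/3)."""
    sites, ts, tv = count_differences(seq1, seq2)
    p = (ts + tv) / sites
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance: d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q)."""
    sites, ts, tv = count_differences(seq1, seq2)
    p_ts, q_tv = ts / sites, tv / sites
    return -0.5 * math.log((1.0 - 2.0 * p_ts - q_tv) * math.sqrt(1.0 - 2.0 * q_tv))


seq_a = "ACGTACGTAC"
seq_b = "ACGTACGTAT"  # one transition (C/T) in ten sites
print("observed proportion:", 0.1)
print("JC  distance:", round(jc69_distance(seq_a, seq_b), 4))
print("K2P distance:", round(k2p_distance(seq_a, seq_b), 4))
```

Both corrected distances exceed the observed proportion of differing sites, reflecting the substitutions that are assumed to be hidden by multiple hits.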

For data containing invariant sites, or violating the assumptions of stationarity, reversibility and homogeneity, models have been developed, that estimate the nucleotide composition of invariant sites and variable sites separately (Jayaswal et al., 2007).

1.4.4 Optimality criteria

In contrast to pure tree defining algorithms stand the tree searches under optimality criteria. The best known criteria are minimum evolution (ME), maximum parsimony (MP), and maximum likelihood (ML). For ME the tree with the smallest sum of the lengths of all branches is seen as the best estimate. It is distance based and hence has the same pitfalls as NJ, but even though it is computationally more intensive it does not perform better (Takahashi & Nei, 2000). MP and ML are both based on discrete character data but they differ in their underlying evolutionary suppositions. All phylogenetic methods make either explicit or implicit assumptions about the process of DNA substitution (Felsenstein, 1988). The parsimony method makes implicit assumptions:

“…, but believing its results does require one to believe that plausible evolutionary scenarios that could cause it to fail have not taken place.” Swofford et al. (1996)

Maximum parsimony attempts to find the evolutionary tree which requires the fewest number of mutations to explain the observed data (Felsenstein, 1978). It is determined by the minimum number of ancestral sequences and does not allow more constructions with the same number of mutations. It does not account for the fact that the number of changes is not equal on all branches in the tree. Because it also does not allow for convergence along long branches as an explanation for similarity it is susceptible to ‘long branch attraction’ as discussed by Felsenstein (1978). If the number of mutations conveying useful information is high and convergence is rare, parsimony performs well (Swofford et al., 1996; Holder & Lewis, 2003).
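
Counting the minimum number of changes on a given tree can be sketched with Fitch's algorithm, which is named here as one common way to score a fixed topology and is not taken from the sources cited above. Trees are written as nested tuples of taxon names; the taxa, sequences and both topologies are hypothetical.

```python
def fitch(tree, states):
    """Return (possible ancestral states, minimum changes) for one site.

    tree   : a taxon name (leaf) or a tuple of two subtrees
    states : dictionary mapping taxon name to the observed base at this site
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, states)
    right_set, right_changes = fitch(right, states)
    common = left_set & right_set
    if common:
        # The children can share a state, so no additional change is needed.
        return common, left_changes + right_changes
    # The children disagree: one change is required on this part of the tree.
    return left_set | right_set, left_changes + right_changes + 1


def parsimony_score(tree, alignment):
    """Sum the minimum number of changes over all sites of the alignment."""
    length = len(next(iter(alignment.values())))
    return sum(
        fitch(tree, {name: seq[site] for name, seq in alignment.items()})[1]
        for site in range(length)
    )


alignment = {"a": "AAG", "b": "AAG", "c": "CAT", "d": "CAT"}
tree_1 = (("a", "b"), ("c", "d"))
tree_2 = (("a", "c"), ("b", "d"))
print("tree 1:", parsimony_score(tree_1, alignment))  # 2 changes
print("tree 2:", parsimony_score(tree_2, alignment))  # 4 changes
```

Under the parsimony criterion the first topology would be preferred because it explains the same data with fewer changes.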

Unlike the parsimony approach maximum likelihood corrects for multiple mutational events at the same time. The approach was originally proposed by Cavalli-Sforza & Edwards (1967); Felsenstein worked on maximum likelihood estimations of gene frequencies as quantitative measurements (Felsenstein, 1973). The likelihood function equals the probability of the data given the hypothesis and assumptions, i.e. $P(\text{data} \mid \text{hypothesis})$.

ML explicitly makes assumptions by applying a model of evolution that describes the relative probability of various events. The model is converted into a statement of the probability (e.g. one end of the branch has an A and the other has a G), incorporating the possibility of unseen events (back mutation) or complex pathways and the values for the model parameters are estimated during the evaluation of trees. They are taken to be appropriate when they maximise the likelihood. This joint estimation of the parameters to find the highest point estimate must search a multidimensional tree space of parameters. Thus the calculation of ML is much slower than parsimony, but in return it fully captures what the data tell us about the phylogeny under a given model.
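
The principle can be illustrated in its simplest form, with a single parameter instead of a multidimensional tree space: the distance between two sequences under the Jukes-Cantor model. The sketch below maximises the log-likelihood numerically by ternary search and compares the result with the analytic Jukes-Cantor distance. The function names, the search interval and the site counts are hypothetical.

```python
import math


def jc69_log_likelihood(d, n_same, n_diff):
    """Log-likelihood of distance d given counts of identical and differing sites."""
    decay = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 * (0.25 + 0.75 * decay)  # probability of one identical site pattern
    p_diff = 0.25 * (0.25 - 0.25 * decay)  # probability of one differing site pattern
    return n_same * math.log(p_same) + n_diff * math.log(p_diff)


def ml_distance(n_same, n_diff, lo=1e-6, hi=3.0, iterations=200):
    """Find the distance that maximises the likelihood by ternary search."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if jc69_log_likelihood(m1, n_same, n_diff) < jc69_log_likelihood(m2, n_same, n_diff):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2.0


def analytic_jc69(n_same, n_diff):
    """Closed-form Jukes-Cantor distance for comparison."""
    p = n_diff / (n_same + n_diff)
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


print("numerical ML estimate:", round(ml_distance(90, 10), 5))
print("analytic JC distance: ", round(analytic_jc69(90, 10), 5))
```

For a full tree the same idea applies to all branch lengths and substitution parameters jointly, which is what makes the search so much more expensive than in this one-dimensional case.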

All point estimates of phylogenies can be evaluated by bootstrapping (Felsenstein, 1985), where the original data set is randomly resampled with replacement to produce pseudo-replicate data sets that in turn are used to build other trees. All sites are assumed to have evolved independently of each other and the presence of groupings is evaluated as the percentage of sample trees on which they are recovered, as summarized by a majority rule consensus tree. Bootstrapping predicts whether the same result would be seen if more data were collected; it does not prove that a result is true. Thus high bootstrap proportions are necessary but not sufficient for having high confidence in a group (Holder & Lewis, 2003).
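
The resampling step can be sketched as follows. Building a tree for every pseudo-replicate is replaced here by a toy criterion (two sequences are closer to each other than either is to the third), which is only a stand-in and an assumption of this sketch; all names, sequences and the number of replicates are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement, keeping the original length."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in columns) for name, seq in alignment.items()}


def bootstrap_support(alignment, supports_group, replicates=1000, seed=1):
    """Percentage of pseudo-replicates in which a grouping is recovered."""
    rng = random.Random(seed)
    hits = sum(
        1 for _ in range(replicates)
        if supports_group(bootstrap_alignment(alignment, rng))
    )
    return 100.0 * hits / replicates


def hamming(x, y):
    """Number of differing positions between two aligned sequences."""
    return sum(1 for a, b in zip(x, y) if a != b)


def groups_a_with_b(aln):
    """Toy stand-in for tree building: are a and b each other's closest relatives?"""
    d_ab = hamming(aln["a"], aln["b"])
    return d_ab < hamming(aln["a"], aln["c"]) and d_ab < hamming(aln["b"], aln["c"])


alignment = {"a": "AAGGCCTTAA", "b": "AAGGCCTTAC", "c": "ATGACCGTCA"}
print("bootstrap support for (a, b): %.1f%%" % bootstrap_support(alignment, groups_a_with_b))
```

In a real analysis the criterion function would be replaced by a complete tree search on each pseudo-replicate followed by a check for the clade of interest.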

For the ML approach, the probabilities of two different hypotheses under different models of evolution can be compared by performing a likelihood ratio test (LRT). Several LRTs can be performed hierarchically to select the simplest model that best explains the data.
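
A single likelihood ratio test between two nested models can be sketched as below. The sketch assumes that the models differ by exactly one free parameter (as JC and K2P do) and that the test statistic is compared with a chi-square distribution with one degree of freedom, which is the usual large-sample approximation. The function name and the log-likelihood values are hypothetical.

```python
import math


def likelihood_ratio_test(lnl_simple, lnl_complex):
    """LRT for nested models that differ by one free parameter (df = 1).

    Returns the test statistic 2 * (lnL_complex - lnL_simple) and the
    p-value from the chi-square distribution with one degree of freedom.
    """
    statistic = max(2.0 * (lnl_complex - lnl_simple), 0.0)
    # Survival function of chi-square with df = 1: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical log-likelihoods of the same data under a simple and a richer model.
statistic, p_value = likelihood_ratio_test(lnl_simple=-1000.0, lnl_complex=-996.5)
print("LRT statistic: %.2f, p-value: %.4f" % (statistic, p_value))
```

If the p-value falls below the chosen significance level, the simpler model is rejected in favour of the more parameter-rich one; in a hierarchical scheme the comparison is then repeated with the next pair of models.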

1.4.5 Bayesian phylogenetics

Bayesian approaches to phylogeny have increasingly gained popularity since the last millennium. Similar to ML they also assume models of evolution (Huelsenbeck et al., 2001). The Bayesian approach is based on Bayes’ theorem:

$$P(\text{hypothesis} \mid \text{data}) = \frac{P(\text{hypothesis}) \times P(\text{data} \mid \text{hypothesis})}{P(\text{data})}$$

Equation 5

The method is related to the maximum likelihood approach by the definition of the posterior probability which is proportional to the likelihood multiplied by the prior probability. So if flat priors are used or they are uninformative, most differences in the posterior probability are attributable to the likelihood. The analysis considers all possible parameter values and the so termed ‘nuisance parameters’, those that are not of primary interest, are integrated out through a marginal estimation which measures the volume under the posterior probability surface (not the height as the joint estimates for likelihood) (Lewis & Swofford, 2001). The primary analysis produces a tree and a measure of uncertainty for the clades on the tree. This is much faster and allows for more complex models of evolution to be implemented.
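
Bayes' theorem can be applied directly when only a handful of hypotheses is considered. The sketch below computes posterior probabilities for three competing hypotheses, for example three tree topologies, from their priors and log-likelihoods. The function name and all numerical values are hypothetical.

```python
import math


def posterior(priors, log_likelihoods):
    """Posterior probabilities of discrete hypotheses via Bayes' theorem.

    The largest log-likelihood is subtracted before exponentiating to
    avoid numerical underflow; this does not change the result because
    the constant cancels in the normalisation by P(data).
    """
    peak = max(log_likelihoods)
    weights = [p * math.exp(lnl - peak) for p, lnl in zip(priors, log_likelihoods)]
    evidence = sum(weights)
    return [w / evidence for w in weights]


log_likelihoods = [-1000.0, -1001.0, -1003.0]
flat = posterior([1 / 3, 1 / 3, 1 / 3], log_likelihoods)
informative = posterior([0.2, 0.6, 0.2], log_likelihoods)
print("flat prior:       ", [round(x, 3) for x in flat])
print("informative prior:", [round(x, 3) for x in informative])
```

With the flat prior the ranking of the hypotheses follows the likelihood alone, whereas the informative prior shifts probability towards the second hypothesis.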

A large amount of data and few parameters deliver a reliable result under many models, but if the amount of data decreases relative to the number of parameters marginalising becomes increasingly helpful. When the ratio of data points to parameters is low, even ML estimates of parameters can be unreliable (Holder & Lewis, 2003).

An important aspect that has to be considered when working with Bayesian analysis, is the importance of priors. They can be used as either uninformative (uniform) or informative priors depending on the available information on the parameter. Every part of the posterior probability distribution affects the results, so the prior distribution has to be chosen carefully. It is important to check the sensitivity of the model to the priors, although in complicated hierarchical models it is generally unfeasible to systematically examine the effect of different priors on the many parameters in the model (Beaumont & Rannala, 2004).

Bayesian approaches rely on Markov chain Monte Carlo (MCMC) methods (Yang & Rannala, 1997). The method of Monte Carlo integration over multidimensional configuration space is more efficient than conventional numerical methods (Metropolis et al., 1953; Hastings, 1970). MCMC works similarly to a tree search algorithm but has stricter rules. It is a simulated random walk through parameter space. An initial tree is estimated and, based on randomly chosen moves that need to satisfy several conditions, a new tree is proposed. The Metropolis-Hastings algorithm is used to determine the acceptance probabilities. If an ‘uphill’ step is proposed, that is if the posterior probability of the new tree is higher, the move is accepted. If a ‘downhill’ step is proposed, the ratio of the posterior probabilities of the new tree and the current tree is taken and compared with a random number between 0 and 1; if the number is lower than the ratio the move is accepted, if not the chain stays at the current tree, which is recorded again as the next sample. This procedure makes large downhill steps unlikely but not impossible and allows the chain to truly explore the tree space. Successive MCMC trees are highly correlated and thus millions of cycles are needed to achieve an acceptable effective sample size (ESS) of independent trees.
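
The acceptance rule can be sketched for a single continuous parameter instead of a tree. The example samples the posterior distribution of the Jukes-Cantor distance between two sequences; the exponential prior, the site counts, the step size and the function names are all hypothetical choices. Because the proposal is symmetric, the Hastings correction equals one and the rule reduces to the Metropolis ratio.

```python
import math
import random


def log_posterior(d, n_same=90, n_diff=10, prior_rate=10.0):
    """Unnormalised log posterior of distance d (JC likelihood, exponential prior)."""
    if d <= 0.0:
        return float("-inf")  # distances must be positive
    p_same = 0.25 + 0.75 * math.exp(-4.0 * d / 3.0)
    p_diff = -0.25 * math.expm1(-4.0 * d / 3.0)  # equals 0.25 - 0.25 * exp(-4d/3)
    log_likelihood = n_same * math.log(p_same) + n_diff * math.log(p_diff)
    return log_likelihood - prior_rate * d


def metropolis(steps=20000, step_size=0.05, seed=1):
    """Random walk through parameter space with the Metropolis acceptance rule."""
    rng = random.Random(seed)
    current = 0.1
    current_lp = log_posterior(current)
    samples = []
    for _ in range(steps):
        proposal = current + rng.uniform(-step_size, step_size)
        proposal_lp = log_posterior(proposal)
        # Uphill moves are always accepted; downhill moves are accepted
        # with probability equal to the ratio of posterior densities.
        if proposal_lp >= current_lp or rng.random() < math.exp(proposal_lp - current_lp):
            current, current_lp = proposal, proposal_lp
        samples.append(current)  # a rejected move records the current state again
    return samples


samples = metropolis()
kept = samples[2000:]  # discard an initial burn-in period
print("posterior mean distance:", round(sum(kept) / len(kept), 4))
```

Consecutive samples are strongly correlated, which is why the number of recorded states is much larger than the effective sample size.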

With Bayesian statistics no definite hypotheses to test are needed; the output of an analysis is the posterior probability of any solution. Bayesian statistics is useful when the number of alternative hypotheses is large (Holder & Lewis, 2003). This makes it extremely valuable for the estimation of migration events.

1.4.6 Estimation of migration rates

In 1982 Felsenstein asked “How can we infer geography and history from gene frequencies?” and claimed that the techniques at the time were insufficient to satisfy the theory. Since then the advent of more efficient computational methods has allowed the implementation of complex models in phylogenetic estimation for the inference of population parameters.

For the estimation of migration rates from phylogenetic analyses different estimators are used based on different measures that can be taken from sequence data. There are estimators based on allele frequencies and Wright’s F-statistic, like AMOVA (Analysis of Molecular Variance) (Excoffier et al., 1992). This is a hierarchical approach, based on a matrix of pairwise (Euclidean) distances, in which differences between haplotypes are attributed to the various hierarchical levels and used analogously to FST (ΦST) (González et al., 1998). The theoretical number of migrating individuals (mtDNA = female) can be estimated to explain the observed variability, assuming the mutation rate is negligible in comparison (Schneider et al., 2000). It is calculated by means of the hypothetical equilibrium between mutation and genetic drift. The relationship between FST (ΦST) and migration is defined by

$$F_{ST} \approx \frac{1}{2M + 1} \quad \text{or} \quad M \approx \frac{1 - F_{ST}}{2F_{ST}}$$

Equation 6

where M is defined by effective population size Ne and migrants per generation (after Wright, 1951, but see also Slatkin & Voelm, 1991, and Hudson et al., 1992). Genetic drift is assumed as basis for the differentiation and because mutation and inbreeding are not considered (Wright, 1978; Weir & Cockerham, 1984; Slatkin & Voelm, 1991) this is only a rough estimate. For values of M < 1 populations are considered as isolated (Wright, 1978). AMOVA provides a general framework for the analysis of population structure including assumptions on the evolutionary processes in the distance matrix (Michalakis & Excoffier, 1996).

Other estimators are based on allele frequencies and maximum likelihood. Tufto et al. (1996) developed a maximum likelihood method to simultaneously estimate the parameters of any migration pattern from gene frequency in stochastic equilibrium. Their aim was to include effects of differences in population size, geographic distance and other factors that can influence migration rates. The approach is strongly dependent on a large sample size and turned out to be less useful when between populations differentiation increased. Nevertheless with their approach they united the estimation of migration patterns and evolutionary trees into a common framework which Felsenstein (1982) had reported to be lacking. Another approach based on ML and allele frequencies was made by Rannala & Hartigan (1996). They considered a pseudo maximum likelihood estimator (PMLE) in addition to a maximum likelihood estimator, where the nuisance parameters are replaced by consistent estimates. It can be applied to discrete generation times but also in a continuous-generation island model, in the latter the estimator is the ratio between the immigration rate and the birth rate. The PMLE was considered to perform well and because it is simpler to calculate well suited for analysing larger data sets.

In contrast to the ‘looking forward’ strategies considered above, estimators based on genealogies of the sampled individuals, the coalescent (Kingman, 1982b), are ‘looking backward’.

“The n-coalescent is a continuous-time Markov chain on a finite set of states, which describes the family relationship among a sample of n members drawn from a large haploid population” Kingman (1982b)

The basic idea underlying the coalescent is that, “in absence of selection, sampled lineages can be viewed as randomly picking their parents as we go back in time” (Rosenberg & Nordborg, 2002). A genealogy is generated that eventually coalesces into a single lineage, the most recent common ancestor (MRCA) of the sample (Griffiths & Tavaré, 1997). The coalescent distribution only depends on the number of individuals sampled and the parameter θ, which is defined by 4Neµ (four times the product of the diploid effective population size and the neutral mutation rate).

The chance that any two lineages coalesce is $1/(2N_e)$ per generation, while the chance that a pair among k lineages coalesces is $k(k-1)/(4N_e)$. The expansion of the coalescence model to any migration model is possible, and the coalescent can naturally incorporate the effects of recombination, whereas the phylogenetic approach cannot (Rosenberg & Nordborg, 2002). In general there are coalescent likelihood and Bayesian methods. The variance of the estimates depends on the number of unlinked loci having independent coalescent trees and on the variability in the data: the more segregating sites or polymorphic loci are present, the better the estimates of the migration rates (Beerli, 1998).
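The per-generation coalescence probabilities translate directly into expected waiting times. The sketch below assumes a single panmictic population of constant size, a pairwise coalescence probability of 1/(2Ne), and the usual continuous-time approximation with exponential waiting times; the function names and the values of n and Ne are illustrative assumptions.

```python
import random


def expected_tmrca(n: int, ne: float) -> float:
    """Expected number of generations until n lineages reach their MRCA.

    With k lineages the coalescence probability per generation is
    k(k-1)/(4*Ne), so the expected waiting time is 4*Ne/(k(k-1)).
    """
    return sum(4.0 * ne / (k * (k - 1)) for k in range(2, n + 1))


def simulate_tmrca(n: int, ne: float, rng: random.Random) -> float:
    """Draw one time to the MRCA by summing exponential waiting times."""
    t = 0.0
    for k in range(n, 1, -1):
        rate = k * (k - 1) / (4.0 * ne)  # coalescence rate while k lineages remain
        t += rng.expovariate(rate)
    return t


rng = random.Random(1)  # fixed seed so the sketch is reproducible
n, ne = 10, 1000        # hypothetical sample size and effective population size
sims = [simulate_tmrca(n, ne, rng) for _ in range(20000)]
print("expected:", expected_tmrca(n, ne))
print("simulated mean:", sum(sims) / len(sims))
```

The sum has the closed form 4Ne(1 − 1/n), which shows why adding more samples from the same population changes the expected depth of the genealogy only slightly.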

The development of coalescent theory has allowed less restrictive models to be used for estimating gene flow. Many variations and mixtures of the above-described approaches exist today. They accommodate recent range expansion, non-symmetrical migration and other complex scenarios that can be found in biology (Beerli & Felsenstein, 1999, 2001).

Computational approaches

In 1995 Kuhner, Yamato and Felsenstein (Kuhner et al., 1995) introduced an ML approach using the coalescent to estimate the parameter θ = 4Neµ. They used a Metropolis-Hastings Markov chain Monte Carlo procedure to sample genealogies in proportion to the product of their likelihood. They found that a combination of several short initial chains, which establish a good value for θ0 and a good starting genealogy, followed by a few long chains provides the best final results. They compared their estimate with other approaches and described the potential for future expansions as well as the possibility of using the collection of genealogies to test other hypotheses, similar to the bootstrap procedure. The approach is implemented in the LAMARC package (Kuhner, 2006).

Bahlo & Griffith (2000) developed the program GENETREE. From the probability distribution of simulated gene trees, with a coalescent process in a subdivided population as the underlying model, they make inferences regarding population parameters. They work with three subpopulations under the infinitely-many-sites model (considering infinitely long DNA sequences), where migration rates into and out of the populations are not necessarily assumed to be constant, so that source and sink models can be simulated. Overall the authors try to obtain as much geographical information from the configuration of sequence variation as possible, such as the probability distribution of where the MRCA occurred, for the total population as well as for the subpopulations, and where each mutation occurred. They discuss maximum likelihood estimates of migration and mutation rates, detection of population growth and several temporal questions for subdivided populations.

Slatkin & Maddison (1989) introduced a parsimony-based method for the estimation of Nm, where N is the size of each local population. The geographic location from which each sample is drawn is treated as a multistate character on a given phylogeny. When many haplotypes are shared between different locations this method cannot provide accurate estimates.

In 1999 Beerli and Felsenstein introduced an estimation of migration rates and effective population sizes in two populations using a maximum likelihood and coalescent theory-based approach which integrates over all possible genealogies. Their approach is similar to that of Bahlo and Griffith (2000) but also supports mutation models for different types of data, including a finite-sites model for nucleotide sequences. As with many other methods, the power increases when more unlinked loci are added. Their software, MIGRATE (Beerli & Felsenstein, 1998), starts with a genealogy that has a minimal number of migration events on it and might not yet readily explore the whole migration genealogy. In 2001 the authors described an extension to this model that allows analysing more than two subpopulations, specifying arbitrary migration scenarios (where some routes are not allowed) and testing the hierarchy of different migration scenarios. The estimated migration patterns are tested against simulated sequence data in an n-island model and a source-sink population scenario to assess success (Beerli & Felsenstein, 2001). The advantage of their program MIGRATE over GENETREE basically lies in the number of available mutation models. Overall both programs will make FST based estimators obsolete. With the integration of a Bayesian framework Beerli (2006) made ML and Bayesian inference comparable and thus offers users a choice for data sets that might be too sparse to be analysed under an ML framework.

Nielsen & Wakeley (2001) developed a procedure to distinguish equilibrium migration from isolation with a joint estimate of divergence time and migration rate between two populations. They assumed the populations to be descended from a panmictic ancestral population, with or without subsequent migration. Using both an ML and a Bayesian approach calculated via MCMC, they showed that it is possible to obtain reliable joint estimates. Additionally, by reanalysing a data set previously analysed on the basis of the variance of pairwise differences, they showed that it is possible to distinguish between a model of high gene flow and long divergence times and a model of short divergence times and low migration rate. Further, they made predictions about unequal gene flow between these populations and assumptions about the difference in effective population size.


Drummond & Rodrigo (2000) started working with time-stamped, serially sampled sequence data from fast-evolving pathogens. Their approach is a serial-sample version of the unweighted pair grouping method with arithmetic mean (sUPGMA); it is thus based on a matrix of pairwise differences and is most reliable when the molecular clock assumptions (Zuckerkandl & Pauling, 1962) hold. This enables them to estimate population migration and mutation parameters simultaneously and separately. It is implemented in the program PEBBLE. In 2002 Drummond and others presented a Bayesian statistical inference approach to jointly estimate mutation rate and population size that incorporates the uncertainty in the genealogy of temporally spaced sequences, such as those from rapidly evolving pathogens or from ancient subfossils and fossils. Information about the unknown true ancestral coalescent tree, the population size, and the overall mutation rate was recovered from such data of (haploid) populations (Drummond et al., 2002). Rodrigo and others (2003) extended these analyses of evolutionary rates from serially sampled sequences to allow the estimation of a joint rate of substitution, ω, from several evolving populations from which serial samples are drawn.

Wilson & Rannala (2003) presented a new Bayesian method that estimates rates of recent immigration among populations using individual multilocus genotypes (allozyme, microsatellite, RFLP or SNP). Only the proportion of immigrants in a population can be estimated, although the method also yields the posterior probability distributions of individual immigrant ancestries, population frequencies, and other parameters. Their method offers accurate estimates when genetic differentiation among populations is large and migration rates are low; moreover, an adequate number of loci is needed. It considers only identity by descent (IBD) of alleles within populations, and the authors propose that an extension introducing additional F-statistics to describe the probabilities of IBD of alleles sampled from different populations (in special cases) might be useful, as it could ‘combine’ sample information across populations with low levels of differentiation. They also propose to allow immigration rates to vary over time.

Ewing and others (2004) extended previous work (Drummond et al., 2002) to fit an island model. They focused on measurably evolving populations (MEPs; Drummond et al., 2003), namely HIV, where blood and semen were used as viral demes. Within a Bayesian framework using MCMC it was possible to simultaneously recover mutation rate, population size, migration rates, and genealogical information from these temporally and spatially sampled sequence data. The authors suggest that after further modification this approach might also be useful to simulate vicariant biogeographic events.

Recently, Ewing & Rodrigo (2006) developed a coalescent-based Bayesian MCMC procedure that allows estimation of migration and mutation rates, effective population sizes, and the timing of changes in these parameters when the number of demes changes over time. They simulated four models. The first has two demes, present in the same two time intervals; the mutation rate is constant, while the population size and the migration rates vary. In the second model the mutation rate is additionally allowed to vary between the time intervals. The third model has changing numbers of demes, population size and migration rate, but a constant mutation rate. The time from the present to the root is divided into two intervals, recent and ancient. In contrast to Nielsen and Wakeley (2001), lineages that are traced backward in time simply merge into the ancient lineage as the time interval changes. The fourth model advances to three time intervals. With these new approaches more complex scenarios of migration can be modelled, e.g. where demes appear during colonization events or where migration might be interrupted for a period of time due to vicariance during dispersal.

Summary

Since the beginning of the last century much thought has gone into how to analyse population structure. Most efforts have aimed to assess the viability of fragmented populations under threat of extinction, but lately more general applications to simple within-species histories have also come into focus.

Since the development of the dideoxy chain-termination method for sequencing and the discovery of a thermostable polymerase that allowed the polymerase chain reaction for amplifying DNA fragments to be automated (Kleppe et al., 1971; Sanger et al., 1977; Mullis et al., 1986), molecular genetic data have rapidly become more abundant, not to mention the data explosion since the advent of next generation sequencing. Whereas the first approaches merely compared the frequencies of genes in populations and subpopulations, today we can make inferences for every single nucleotide substitution. Under several models of evolution this can be computationally intensive, and faster approaches are developed almost constantly. With these possibilities ever more parameters can be included, and the models are becoming more complex with the intention of better simulating the true processes. Of late even temporal aspects are included, which allows population structure and associated parameters to be analysed in the past, not just the present.


CHAPTER 02 MITOCHONDRIAL DNA AND DATA GENERATION

2.1 mtDNA as marker

“Genetic markers are simply heritable characters with multiple states at each character.” Sunnucks (2000)

To fully understand why mitochondrial DNA (mtDNA) is used so frequently as a marker in evolutionary studies it is essential to understand the biology behind it. Mitochondria are the cytoplasmic organelles responsible for the major part of cellular respiration. According to the endosymbiont theory they are derived from non-photosynthetic prokaryotes, most closely related to α-proteobacteria (Esser et al., 2004), that were living in early protoeukaryotes. Although the partners initially formed a symbiosis of two living cells, in the course of evolution this symbiosis merged into a single inseparable organism (Madigan et al., 2001). This was first suggested by Altmann (1890) and subsequently discussed critically for almost a century (Portier, 1918; Cowdry & Olitsky, 1922; Wallin, 1923; Meyer, 1973). Today, despite the acceptance of the overall mechanism, the details are still at the centre of research (Martin et al., 2001; Poole & Penny, 2007; Gribaldo et al., 2010).

Due to this history and their essential role in generating energy, mitochondria are found within the cytoplasm of almost all extant eukaryotic cells, where they are present in multiple copies. Each mitochondrion has retained its own reduced genome of about 14-20 kb in animals; this genome in turn is present in 10³ to 10⁴ copies per cell, depending on tissue type. The clonal organelles normally divide only once during cell proliferation, but during maturation of the oocyte mitochondria multiply numerous times within the cell (Knippers, 1997; St. John et al., 2010). This process leads to an mtDNA ‘bottleneck’ during oogenesis (Shoubridge, 2000). The female gamete’s cytoplasm then supplies the vast majority of mitochondria for the zygote, leading to a predominantly maternal inheritance (Hutchison et al., 1974; Avise et al., 1987). Due to this uniparental transmission, recombination does not naturally occur (Hayashi et al., 1985). Uniparental inheritance also leads to an accumulation of detrimental mutations (Muller, 1964), a process later termed Muller’s Ratchet by Felsenstein (1974). According to Bergstrom and Pritchard (1998), the germline bottleneck slows the progression of the ratchet over time. Heteroplasmy is seldom found (see Barr et al. (2005) for an overview), but a low level of paternal leakage has been discussed repeatedly (Gyllensten et al., 1991; Bromham et al., 2003). Nevertheless, there are multiple mechanisms preventing the transmission of paternal mtDNA in mammals (Rantanen & Larsson, 2000), and for mice it has been shown that paternal mitochondria are eliminated within hours of fertilization (Kaneda et al., 1995; Cummins, 2000).

The mitochondrial genome is haploid and consists of a single circular, covalently closed, double-stranded molecule (Castro et al., 1998). The two complementary strands have been differentiated into a guanine-rich heavy (H) strand and a cytosine-rich light (L) strand, based on their density (Clayton, 2000). The majority of genes are located on the H-strand. The coding regions are strongly conserved and consist of two ribosomal RNA (rRNA) genes (12S and 16S rRNA), 14 transfer RNA (tRNA) genes and 12 protein-coding genes; the remaining 8 tRNAs and 1 mRNA are encoded on the L-strand (Anderson et al., 1981; Clayton, 2000). In addition to the contiguous coding region there is the non-coding control region (CR) or displacement loop (D-loop). It consists of two hypervariable regions (HVR I & HVR II) flanking a central conserved block (Walberg & Clayton, 1981; Vigilant et al., 1991; Matson & Baker, 2001).

Although mitochondria have retained a good portion of their own genome, they are only semi-autonomous. Nuclear and mitochondrial genomes are functionally interdependent (Schatz & Mason, 1974) and mitochondrial replication is ultimately controlled by the cell (Capps et al., 2003). Because mitochondria have their own genome and are important in cellular respiration, it has been assumed that mtDNA only undergoes neutral or deleterious mutations, which would only lead to purifying selection, while positive selection was assumed to be very rare. However, potential sources of selective sweeps that reduce intraspecific variation in mtDNA exist, in particular selection at host level, ‘selfish’ mitochondrial mutations, and genetic hitch-hiking with other maternally inherited elements (Avise, 1991; Ballard & Kreitman, 1995; Galtier et al., 2009). Although these exceptions are assumed to be rare, particularly for mammals, potential consequences for evolutionary studies have to be considered (White et al., 2008).


Overall, mtDNA has a higher rate of sequence evolution than nuclear genes (Brown et al., 1979; Pesole et al., 1999), and within the mtDNA the hypervariable regions of the D-loop have a tenfold higher mutation rate than the protein-coding regions (Vigilant et al., 1991; Pesole et al., 1999; Richards & Macaulay, 2001).

Unlike nuclear DNA (nDNA), mtDNA is present in a high number of copies per cell. This makes it especially useful for studies where sample material is limited, as is often the case in ancient DNA studies (O'Rourke et al., 2000).

The effective population size for this marker is only a quarter of that of nuclear genes, due to its haploidy and uniparental inheritance. Because of these special characteristics mtDNA, and especially the highly variable HVRI of the D-loop, is frequently used to resolve relationships between closely related species or among populations within a species (Avise, 1986; Moritz et al., 1987; Harrison, 1989). For evolutionary studies it is important to choose a genetic marker that is appropriate to the scale of subdivision under investigation (Bradman et al., 2011), and the very high rate of evolution of HVRI makes it a good tool for evolutionary studies at the population level (Brown et al., 1979).
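The factor of one quarter can be made explicit by counting transmitted gene copies. The following is a sketch under simplifying assumptions that are not stated in the text: N breeding individuals with an equal sex ratio, strictly maternal inheritance, and no heteroplasmy. The symbols N and N_f (number of breeding females) are introduced here for illustration.

```latex
\begin{align}
  \text{autosomal nuclear gene copies} &= 2N \\
  \text{transmitted mtDNA copies} &= N_f = \tfrac{N}{2} \\
  \frac{N_e^{\mathrm{mt}}}{N_e^{\mathrm{nuc}}} &\approx \frac{N/2}{2N} = \frac{1}{4}
\end{align}
```

One factor of two comes from haploidy and the other from transmission through females only; a skewed sex ratio or unequal variance in reproductive success between the sexes would change the ratio.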

2.1 Obtaining Material

Tissue and bone samples used in this study were obtained from various sources. Given the wide distributional range of the species, sample collection in the field was only possible for a limited area. The most important geographic area to tackle the questions of human migration associated with Lapita pathways before their migration into the Pacific is the Bismarck Archipelago, North of mainland . For this study the outer islands of the Archipelago were sampled. Rats were snap-trapped in the field between 2006 and 2008 on Manus of the Admiralty Group, Emirau and Tench of the St Matthias Group, Tatau and Simberi of the Tabar Group, as well as on New Ireland, Lavongai and Lihir.

In 2008 the second excavation for Lapita artefacts at the Tamuarawai site on Emirau Island, PNG, presented an additional opportunity to look for rat bones within different time layers. The aim was to add time depth to the study. The excavation yielded many precious artefacts, especially various potsherds that could be identified as Lapita. Subsequent dating of charcoal from the site gave a calibrated age of around 3200 BP, which corresponds to the Early Lapita period (Summerhayes et al., 2010). Unfortunately, almost no mammalian bones were found, and not a single rat bone.


Ancient bone samples from other archaeological excavations that could be obtained for this study were from Panakiwuk on New Ireland, PNG (Ken Aplin) and the Liang Bua Cave on Flores, Indonesia, the latter excavated under Mike Morwood, University of Wollongong, in collaboration with the National Research and Development Centre for Archaeology (Arkenas) in Indonesia. Additionally, Natalie Vasey provided small unidentified rat bones excavated at the Andrahomana Cave, Madagascar, which were analysed to verify the absence of R. exulans on Madagascar, despite the spread of the Austronesian language family.

Figure 10: Sampling locations included in all further analyses (indicated by diamonds) throughout the study area from mainland Southeast Asia (SEA) to Remote Oceania. The seabed relief reveals the extent of the two shelves, Sunda (connecting SEA with Borneo, Sumatra and Java) and Sahul (connecting Australia and New Guinea). The distributional range of R. exulans is marked by beige coloration. Map sources: Esri, USGS and NOAA.

Museum IDs were retrieved via MaNIS (http://manisnet.org) and GBIF (www.gbif.org) and sample tissue was obtained from several museum collections. The majority of tissue samples were personally taken as small mid-ventral line clippings at the Smithsonian Institution National Museum of Natural History (NMNH). Other samples were obtained from the Chicago Field Museum (FMNH), the South Australian Museum (SAM) and the Museum of Vertebrate Zoology Berkeley (MVZ). Additional fresh tissue samples were provided by fellow researchers, particularly from Southeast Asia by Marie Pagès and from Adele Island by Russell Palmer.


Some specimens were part of previous studies on the Pacific (Matisoo-Smith et al., 1998; Matisoo-Smith, 2004). However, their (shorter) DNA sequences were elongated to match the new dataset. A few of those elongated samples have been published in Robins et al. (2007).

2.2 Data Generation

2.2.1 Protocol for fresh tissue

Total genomic DNA was isolated from ethanol-preserved tissue samples of tail, ear or liver, ca 25 mg each. Extractions were carried out using Roche’s High Pure PCR Template Preparation Kit (Protocol Vb; Roche Applied Science, Mannheim, Germany) or QIAamp® DNA Mini Kit (Tissue protocol, Qiagen, Hilden, Germany), according to the manufacturer’s instructions.

Polymerase chain reaction (PCR) amplifications of a 583 bp fragment of the hypervariable region one (HVR-I) of the mitochondrial displacement loop (D-loop) were performed in 30 µL reaction volumes containing 1 µL template DNA, 2 mM MgCl2, 0.5 µM of each primer, R15358F (EGL4L: 5’-CCA CCA TCA ACA CCC AAA G-3’) and R15940R (RJ3r: 5’-CAT GCC TTG ACG GCT ATG TTG-3’), 0.15 mM dNTPs, 0.5 U AmpliTaq® in the appropriate 1x PCR buffer (10 mM Tris-HCl, pH 8.3, 50 mM KCl; Applied Biosystems, Waltham, MA, USA) and ddH2O. Numbering of the primers is based on the mitochondrial genome of Rattus norvegicus (Gadaleta et al., 1989). The thermocycler profile consisted of 35 cycles of denaturation at 94°C for 30 s, annealing at 60°C for 30 s and elongation at 72°C for 1 min; it was initialised by a 2 min denaturation step at 94°C and followed by a terminal extension step at 72°C. Negative contamination controls were included in every PCR setup.

PCR products were visualised in a standard 1% agarose gel stained with ethidium bromide. All successful PCR products were purified with the QIAquick® PCR Purification Kit (Qiagen, Hilden, Germany), later with Zymo DNA Clean and Concentrator™, and quantified using the Invitrogen™ Low DNA Mass Ladder, later the Invitrogen™ Qubit® Fluorometer. Aliquots of 2 ng sample DNA per 100 bp target sequence were then set up with 1 µL primer for a dye-termination sequencing reaction after Sanger et al. (1977). The sequencing reaction was carried out by the Allan Wilson Centre Genome Service using the BigDye® Terminator Kit (Version 3.1, Applied Biosystems, Waltham, MA, USA) and analysed by capillary electrophoresis on an ABI 3730 (Applied Biosystems, Waltham, MA, USA). The resulting chromatograms were aligned in Geneious® (Pro 5.6.3, Biomatters), manually checked and edited, and used for haplotype assignment and identification. Occasionally the PCR produced double bands. Provided that the negative control was clear, those bands were cut out under UV light and then extracted using the PureLink® Quick Gel Extraction Kit (Version B) according to the manufacturer’s instructions. All further steps were as before.

2.2.2 Ancient DNA

With the death of an organism, degradation of the DNA polynucleotide backbone begins. Enzymes break the phosphodiester bonds and thereby shorten the DNA strands. Post mortem stability of DNA additionally depends on tissue type, temperature, pH, and other environmental factors, but generally the amount of degradation is correlated with time (Bär et al., 1988). Only recently has the half-life of mtDNA in bone been estimated at 521 years (Allentoft et al., 2012). MtDNA is the marker of choice for ancient DNA (aDNA) studies because of its much higher copy number within the cell. Nevertheless, depending on age and preservation conditions, aDNA exhibits varying degrees of degradation and as a result can be less informative. To overcome this problem, shorter amplicons are targeted with overlapping fragments that can be concatenated into longer sequences.
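To give a feeling for what a half-life of this order implies, the sketch below applies simple first-order (exponential) decay. This is an illustrative assumption only: as noted above, real decay depends on temperature, pH and other conditions, so a single half-life cannot be transferred directly between sites. The function name and the example ages are hypothetical.

```python
HALF_LIFE_YEARS = 521.0  # estimate cited above (Allentoft et al., 2012)


def fraction_remaining(age_years: float, half_life: float = HALF_LIFE_YEARS) -> float:
    """Fraction of template molecules still intact after age_years,
    assuming first-order decay with the given half-life."""
    if age_years < 0:
        raise ValueError("age must be non-negative")
    return 0.5 ** (age_years / half_life)


# Hypothetical sample ages in years, chosen only for illustration.
for age in (100, 521, 1000, 3000):
    print(f"{age:>5} years: {fraction_remaining(age):.4f} of templates intact")
```

Under this model every additional half-life halves the number of amplifiable templates, which is one reason short overlapping amplicons are preferred for old material.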

Contamination poses a further challenge because the sample material contains little DNA and only short fragments. Any contaminant DNA would be of better quality for PCR amplification. This is particularly true when working with human remains. Therefore utmost care needs to be taken throughout the entire process.

Tissue from historic museum specimens and bone from archaeological excavations were processed under sterile conditions in three physically separate laboratories. The most contamination-prone step, the extraction, was performed in a dedicated ancient DNA facility at the University of Auckland (Department of Anthropology). Standard aDNA laboratory conduct mandates clothing that has not come into contact with other DNA facilities. Further, a full-cover lab coat, hair net, and overboots are required to protect against contamination from personnel. All reagents that tolerate such a treatment were sterilized via UV radiation. Surfaces were washed with sodium hypochlorite or, where this was not applicable, with DNA-EX™ (Genaxis Biotechnology GmbH, Spechbach, Germany) and ethanol. Subsequent fragment amplifications were run on a dedicated thermocycler in a separate room and all post-PCR steps were carried out in the separate standard laboratory.


2.2.3 Bone preparation

The bones available were either mandibles or femurs. Before the bones were subjected to the destructive procedures they were photographed and measured with a sliding calliper.

Figure 11: Example for documentation of rat bones

Accessible bone surface was sanded with fine-grained sandpaper to remove remaining soil and superficial contaminants that might inhibit subsequent procedures. If present, the incisor from mandibles was chosen for further processing; otherwise the bone itself was used. Femurs were processed entirely. All parts intended for the digest were weighed and then ground up carefully in a chilled sterile mortar and pestle. The resulting bone powder weighed between 0.02 and 0.2 g.

Figure 12: Rat femur with measurement points.

Dried skin tissue taken from museum samples was cut into smaller pieces directly in a tube containing the digestion buffer.

2.2.4 aDNA extraction and amplification

DNA isolation was performed with a silica-guanidinium thiocyanate method introduced by Boom et al. (1990), extended and improved several times (Höss & Pääbo, 1993; Matisoo-Smith et al., 1997; Rohland & Hofreiter, 2007a, b) and simplified by J. Robins and myself after testing the effectiveness of different washing agents; for complete protocols see Appendix A 2 and Robins et al. (2014). A control without sample was included in every extraction to control for contamination during the procedure.

Amplifications were carried out with several primer pairs (Matisoo-Smith, 1996; Matisoo-Smith et al., 1997; Matisoo-Smith et al., 1998) (see Table 1, Figure 13) yielding overlapping fragments of different lengths, ranging from 120 bp to 240 bp. Every PCR reaction volume of 30 µL contained 5 µL template DNA, 1 U AmpliTaq Gold® in the appropriate 1x PCR buffer (Applied Biosystems, Waltham, MA, USA), 1 mg/mL BSA, 0.5 µM of each primer and 150 µM of each dNTP; the magnesium concentration was adjusted to 2.5 mM or 3 mM MgCl2 depending on the primer pair used, and the final reaction volume was attained with ddH2O.
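The pipetting volumes for such a reaction follow from the dilution relation C1·V1 = C2·V2. The sketch below uses the final concentrations given above together with stock concentrations that are purely hypothetical placeholders (the text does not report them); the 2.5 mM MgCl2 variant is shown, and the dictionary layout and function name are likewise illustrative.

```python
REACTION_UL = 30.0  # final reaction volume from the text
TEMPLATE_UL = 5.0   # template DNA volume from the text

# name: (stock concentration, final concentration), both in the unit named in the key.
# Final values follow the text; ALL stock values are hypothetical assumptions.
REAGENTS = {
    "PCR buffer (x)": (10.0, 1.0),
    "MgCl2 (mM)": (25.0, 2.5),
    "BSA (mg/mL)": (20.0, 1.0),
    "forward primer (uM)": (10.0, 0.5),
    "reverse primer (uM)": (10.0, 0.5),
    "dNTPs (uM each)": (10000.0, 150.0),
    "polymerase (U/uL)": (5.0, 1.0 / REACTION_UL),  # 1 U per reaction
}


def master_mix(reagents, reaction_ul, template_ul):
    """Return the volume in uL of each component for one reaction (C1*V1 = C2*V2)."""
    volumes = {
        name: final * reaction_ul / stock
        for name, (stock, final) in reagents.items()
    }
    water = reaction_ul - template_ul - sum(volumes.values())
    if water < 0:
        raise ValueError("stock solutions too dilute for this reaction volume")
    volumes["template DNA"] = template_ul
    volumes["ddH2O"] = water  # top up to the final volume
    return volumes


for name, vol in master_mix(REAGENTS, REACTION_UL, TEMPLATE_UL).items():
    print(f"{name:<22}{vol:6.2f} uL")
```

Multiplying each volume by the number of reactions, plus a small excess, gives the volumes for a shared master mix.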

The thermocycler profile was optimised by combining two different annealing temperatures. It had 10 cycles of denaturation at 94°C, annealing at 54°C and elongation at 72°C, each for 20 s, followed by 35 cycles of denaturation at 94°C, annealing at 50°C and elongation at 72°C, also for 20 s each. An initial denaturation step was set to 5 min at 94°C and a terminal extension step to 5 min at 72°C, followed by a holding temperature of 15°C. The corresponding negative extraction control and a negative PCR control to check for contamination were included in every PCR setup; if deemed necessary, a positive control was included to check for successful amplification.
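The two-stage profile can be written down as a small data structure, which makes the total number of cycles and the programmed hold time easy to check. This is a sketch that reads the second stage as 35 cycles; the stage labels and the layout are illustrative, and ramping time between temperatures is ignored.

```python
# Each stage: (label, number of repeats, [(step, temperature in C, seconds), ...])
PROFILE = [
    ("initial denaturation", 1, [("denature", 94, 300)]),
    ("stage 1", 10, [("denature", 94, 20), ("anneal", 54, 20), ("extend", 72, 20)]),
    ("stage 2", 35, [("denature", 94, 20), ("anneal", 50, 20), ("extend", 72, 20)]),
    ("terminal extension", 1, [("extend", 72, 300)]),
]


def total_hold_seconds(profile) -> int:
    """Sum of all programmed hold times, excluding ramping and the final 15 C hold."""
    return sum(
        repeats * sum(seconds for _, _, seconds in steps)
        for _, repeats, steps in profile
    )


def total_cycles(profile) -> int:
    """Number of amplification cycles, i.e. repeats of stages that include annealing."""
    return sum(
        repeats
        for _, repeats, steps in profile
        if any(step == "anneal" for step, _, _ in steps)
    )


print(total_cycles(PROFILE), "cycles,", total_hold_seconds(PROFILE) / 60, "min hold time")
```

The high cycle number reflects the low template concentration expected in ancient samples; actual run times are longer because of ramping.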


Table 1: Primer pairs for modern and ancient mtDNA D-loop fragment amplification for R. exulans. Forward primers were paired with select reverse primers to give overlapping amplicons ranging from 120 bp to 240 bp to accommodate different template quality. Numbering is based on the mitochondrial genome of Rattus norvegicus (Gadaleta et al., 1989) and does not correspond 1:1 to R. exulans.

| Primer forward | Primer reverse | Sequence 5’ -> 3’ | Tm (°C) | Amplicon length (bp) |
|---|---|---|---|---|
| R15358F (EGL4-L) | | CCACCATCAACACCCAAAG | 53 | |
| | R15485R (EGL27-H) | TACATGCTTATATGCTTGGG | 51 | 131 |
| | R15523R (EGL7H) | TGATAACACAGGTATGTCC | 49 | 240 |
| | R15940R (RJ3r) | CATGCCTTGACGGCTATGTTG | 59 | 583 |
| R15442F (EGL4.2-L) | | AGGACATTAAAACATTTATGTA | 49 | |
| | R15485R (EGL27-H) | (as above) | (as above) | 178 |
| R15542F (EGL38-L) | | CATGAATATTCTCACATAC | 45 | |
| | R15693R | GTTGTTGATTTCACGGAGG | 51 | 154 |
| R15621f | | CCTTTCTCTTCCATATGACT | 51 | |
| | R15773R | CCAGATGCCTGGTAAAGTTTC | 57 | 155 |
| | R15840R | CCATCGAGATGTCTTATTTA | 59 | 222 |
| R15722F | | CGGGCCCATACAACTTGG | 53 | |
| | R15840R | (as above) | (as above) | 120 |
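Table 1 does not state how the Tm values were obtained. As a rough cross-check, the sketch below applies the Wallace rule, Tm ≈ 2(A+T) + 4(G+C), a common rule of thumb for short oligonucleotides. For most primers in Table 1 the tabulated value lies 5°C below this estimate, but that correspondence is only an observation made here, not a documented method. The function name and the selection of primers are illustrative.

```python
def wallace_tm(seq: str) -> int:
    """Rule-of-thumb melting temperature in C: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper().replace(" ", "")
    if set(seq) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G and T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# A few primers from Table 1 with their tabulated Tm values.
PRIMERS = {
    "R15358F": ("CCACCATCAACACCCAAAG", 53),
    "R15485R": ("TACATGCTTATATGCTTGGG", 51),
    "R15940R": ("CATGCCTTGACGGCTATGTTG", 59),
    "R15722F": ("CGGGCCCATACAACTTGG", 53),
}

for name, (seq, table_tm) in PRIMERS.items():
    print(f"{name}: Wallace estimate {wallace_tm(seq)} C, Table 1 value {table_tm} C")
```

More accurate nearest-neighbour models also account for salt and primer concentration, so these values should be read as orientation only.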

PCR products were visualised in an ethidium bromide stained 2% NuSieve and agarose (1:1) gel. Purifications were performed via Sephacryl gel filtration columns (GE illustra™ MicroSpin S-400 HR Columns) or via silica-membrane-based purification (QIAquick® PCR Purification Kit, Qiagen, and DNA Clean & Concentrator™-5, Zymo Research). Subsequent steps were identical to standard lab procedures.

2.2.5 DNA extraction from liquid preserved tissue

An important location for the phylogeography of R. exulans is the Lesser Sunda Islands. Morphological studies from the mid-1900s have placed them at the origin of the species’ distribution (Schwarz & Schwarz, 1967). The Lesser Sunda Islands are a remote island group in Wallacea in eastern Indonesia. Political difficulties in Indonesia make the use of sample material from archival collections indispensable. Regrettably, material from relevant locations was only available from fluid-preserved whole animals. We obtained ten samples from the Smithsonian collection, three originating from Flores and seven from Timor. These samples had been preserved in an unknown fixative, possibly formalin, which had been exchanged for ethanol.

Over the last few years considerable success has been achieved in extracting short DNA fragments from formalin-fixed tissue (Schander & Kenneth, 2003), whereas mercury-containing preparation fluids, like Zenker’s, have prevented any successful DNA retrieval to date. Formalin is the aqueous solution of the gas formaldehyde. Formaldehyde is a very small molecule (CH2O) and can penetrate tissue very rapidly. In water it forms methylene hydrate, which in turn can form polymers. Formalin is a very powerful preservative because the aldehyde group reacts with the amine groups in proteins, forming HO-CH2-OH crosslinks between different available nitrogen atoms and thus incapacitating proteins (Fraenkel-Conrat & Mecham, 1949). Particularly in the humid heat of equatorial areas this form of preparation was beneficial.

Formalin has also been used extensively in immunohistochemistry; medical interest in recovering antigen activity from paraffin-embedded slides has therefore led to a variety of recovery protocols. Schander and Kenneth (2003) provided a good overview of methods that have since been applied to fixed animals. However, the success rate still depends on contributing factors during fixation.

With eleven samples there was only a limited number of protocols that could be tried. After studying the available literature, the first half of the samples was processed following the protocol of Shedlock et al. (1997), who had successfully extracted DNA from fish, amphibians and reptiles. Because this first attempt was unsuccessful, a second protocol was applied. In a technical memorandum published by NOAA, Robertson et al. (2007) described a method that successfully retrieved DNA from formalin-fixed cetacean tissue. This protocol builds on Shi et al. (2002) and was made available to me by K. Robertson (Appendix A 2). However, it too did not yield any DNA, suggesting that the unknown preservative was not formalin.

2.2.6 Summary

A total of 314 out of 392 specimens yielded sufficient DNA to amplify mtDNA fragments and were identified as R. exulans. The resulting chromatograms were checked for ambiguous sites and for consistency between forward and reverse sequenced amplicons. All reliably validated short amplicons from museum specimens with no ambiguities were concatenated per specimen, and the resulting 402 bp HVRI sequences were aligned with the longer sequences obtained from modern tissue samples. Sequences and alignment were used for further analyses.


MTDNA AND DATA GENERATION CHAPTER02

Figure 13: Primer alignment against a sample sequence demonstrating fragment lengths and overlay for the D-loop HVRI region. Image: Geneious 8.0.


CHAPTER 03 GENETIC DIVERSITY, POPULATION STRUCTURE, AND PHYLOGEOGRAPHY OF RATTUS EXULANS

3.1 Abstract

In this chapter I establish the population structure of the commensal rat species R. exulans across its distributional range. Mitochondrial HVRI analyses revealed deep regional structures encompassing two major and various minor clades. The Philippines are genetically most distinct and show strong ties to both Borneo and Sulawesi. Together the haplotypes originating from the Philippine Islands and the northern regions of Borneo and Sulawesi form a crown group along a long branch, indicating a lack of gene flow caused by geographic isolation after initial settlement. Another clearly distinct haplogroup was formed by the specimens originating in Remote Oceania. The tree structure suggested that this group in its entirety has also been isolated from the rest of R. exulans. Signs of a recent range expansion are concordant with an initial settlement followed by a substantial spatial expansion, where the resulting genetic drift was possibly increased by serial founder events. No direct ancestral connection could be found for the Remote Oceanic clade. However, the Near Oceanic islands Manus and Tench in the Bismarck Archipelago harbour Remote Oceanic haplotypes and could be a potential geographic source for the Remote Oceanic group. Among the smaller regional clades of R. exulans, the Southeast Asian population had the lowest diversity despite its vast geographic range; this does not lend support to a hypothesised ancestry within this region. In contrast, the high variability, the absence of a clade structure, and the interconnectivity among the specimens from the Southern Malay Archipelago favour this region as the ancestral area.


3.2 Introduction

The Pacific rat, R. exulans, is the third most widely distributed commensal rat species after R. rattus and R. norvegicus (Tate, 1935). In Oceania its dispersal is associated with human migration, particularly the spread of the Lapita peoples and their descendants (Matisoo-Smith & Robins, 2004). Today the species is distributed from Southeast Asia, throughout the Malay Archipelago, New Guinea and adjacent islands (Near Oceania), all over the Pacific (Remote Oceania) as far as Rapa Nui (Easter Island) in the east, Hawai’i in the north, and New Zealand in the south (Schwarz, 1960; Atkinson, 1985). However, it has remained absent from the Australian continent and Madagascar, even though the latter was reached by Austronesian language-speaking peoples, who are thought to be associated with the Lapita dispersal. Further, a small Australian off-shore island, Adèle Island, was recorded to have a population of R. exulans just over a century ago (Walker, 1892).

At the geographic centre of the distributional range of R. exulans lies the Malay Archipelago. This island landscape consists of roughly 25,000 islands (CIA World Factbook) with the Philippines and Indonesia as major political entities. The region is impacted by the clash and convergence of several tectonic plates and has undergone major changes throughout the Cenozoic (Hall, 2002). More importantly for this study, it has been affected by palaeoclimatic sea level oscillations throughout the Pleistocene (Webb III & Bartlein, 1992), leading to the repeated temporary exposure of the Sunda and Sahul shelves, and the formation of various land bridges between islands in alternation with the more fragmented landscape we currently observe. Generally, these processes repeatedly confronted resident flora and fauna with possibilities for vast range expansions, followed by extinction and speciation events due to habitat loss and isolation.

For R. exulans, a species incapable of swimming, these events of increased connectivity offered possibilities of autonomous dispersal from its area of origin, in contrast to the alternative mode of human-mediated dispersal as reported for the Pacific region. If this rat species is to add information to the puzzle of where the ancestors of the Lapita came from, we first need to know whether and how its population is structured, so that modes of dispersal can be inferred. The information held in mitochondrial DNA, a neutral genetic marker, can offer insights into this population history.


POPULATION STRUCTURE OF RATTUS EXULANS CHAPTER 03

The aims of this chapter are (1) to identify the genetic variability of R. exulans, (2) to establish whether the population is genetically structured, and (3) to determine the geographic distribution of any intraspecific genetic variation.

3.3 Methods

3.3.1 Sample collection

A total of 310 rat samples, comprising 208 fresh tissue samples and 102 preserved museum samples, were analysed after preliminary identification via DNA Surveillance (Ross et al., 2003; Robins et al., 2007). Overall, 132 locations were sampled. Each main island and most locations in Southeast Asia were each represented by a minimum of six samples (Figure 14). Fresh tissue samples were collected in Papua New Guinea by snap-trapping near human habitations. Fellow researchers provided fresh tissue from other locations. Sequences for Remote Oceania from previous studies were elongated by sequencing the missing fragments to match the more informative sequence length. Additionally, museum IDs were retrieved via MaNIS (http://manisnet.org) and GBIF (http://gbif.org) and tissue samples were obtained from several Museum collections (for more specifics, see Chapter 2.1.9). An additional 30 specimens from Thomson et al. (2014), including 5 extra haplotypes, were added to all analyses. See APPENDIX A 2 for sample details.

Figure 14: Map of the study area from mainland Southeast Asia (SEA) to Remote Oceania, including all 132 sampling locations. Diamonds indicate the place names referred to. The distributional range for R. exulans is marked by beige coloration. The relief of the seabed reveals the extent of the two shelves, Sunda (connecting SEA with Borneo, Sumatra and Java) and Sahul (connecting Australia and New Guinea). Map sources: Esri, USGS and NOAA.

3.3.2 Analyses

Several analyses of molecular variance (AMOVA) were calculated in ARLEQUIN 3.5.1.2 (Excoffier et al., 1992; Excoffier et al., 2005; Excoffier & Lischer, 2010) to identify genetically homogeneous regions within the otherwise continuous island landscape of Indonesia and the Philippines, as well as of Near and Remote Oceania. AMOVA hierarchically analyses the degree of subdivision among different sampling regions using ΦST, an analogue of Wright’s fixation index FST (Wright, 1951). The method considers haplotype frequencies and the number of nucleotide differences between each pair of haplotypes. ΦST estimates the total amount of variance due to differentiation among subpopulations and ΦCT the variance due to differentiation among regions.
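As an illustration of how ΦST is obtained from haplotype frequencies and pairwise nucleotide differences, the following is a minimal Python sketch of a single-level AMOVA (individuals within populations) in the manner of Excoffier et al. (1992). It is not the ARLEQUIN implementation: the function name `phi_st` and the toy distance matrices are assumptions made for this example, and the regional level (ΦCT) and the permutation test are omitted.

```python
from itertools import combinations


def phi_st(dist, labels):
    """Single-level AMOVA Phi_ST from pairwise distances between individuals.

    dist   -- square matrix (list of lists) of squared distances, e.g. the
              number of nucleotide differences between two sequences
    labels -- population label of each individual, in matrix order
    """
    n_total = len(labels)
    groups = {}
    for index, label in enumerate(labels):
        groups.setdefault(label, []).append(index)
    n_pops = len(groups)

    def pair_sum(indices):
        # Sum of distances over all unordered pairs of the given individuals.
        return sum(dist[i][j] for i, j in combinations(indices, 2))

    # Sums of squared deviations (SSD): total, within and among populations.
    ssd_total = pair_sum(range(n_total)) / n_total
    ssd_within = sum(pair_sum(idx) / len(idx) for idx in groups.values())
    ssd_among = ssd_total - ssd_within

    # Mean squared deviations and the average-sample-size coefficient n0.
    msd_among = ssd_among / (n_pops - 1)
    msd_within = ssd_within / (n_total - n_pops)
    sum_sq_sizes = sum(len(idx) ** 2 for idx in groups.values())
    n0 = (n_total - sum_sq_sizes / n_total) / (n_pops - 1)

    # Variance components and their ratio.
    var_within = msd_within
    var_among = (msd_among - msd_within) / n0
    return var_among / (var_among + var_within)


# Hypothetical toy data: two populations of two individuals each.
pops = ["A", "A", "B", "B"]
fixed = [[0, 0, 1, 1], [0, 0, 1, 1], [1, 1, 0, 0], [1, 1, 0, 0]]
mixed = [[0, 1, 1, 1], [1, 0, 1, 1], [1, 1, 0, 1], [1, 1, 1, 0]]
print(phi_st(fixed, pops))  # all variation lies among populations
print(phi_st(mixed, pops))  # no variation lies among populations
```

In the first toy matrix all differences lie between the two populations, so ΦST is 1; in the second every pair is equally distant, so ΦST is 0.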

Across all sampling sites, sample locations were pooled into 80 populations when they occurred on the same island separated by less than 100 km over flat land or in the same administrative region without obvious unsurpassable barriers (Figure 15). Subsequently, geographically meaningful combinations of these populations were tested as contiguous regions, aiming for a maximization of the proportion of total genetic variance due to differences among regions as opposed to within regions. The approach emulates the automated Spatial Analysis of Molecular Variance (SAMOVA) as described by Dupanloup et al. (2002), except that the grouping process is based on prior knowledge of the area.

However, SAMOVA was also consulted because of its ability to identify genetic barriers between groups, and to ensure unbiased groupings. A majority rule consensus was calculated across all runs for k = 2 to 20 populations.

The initial analysis was started with a set of four geographically defined groups separating the Sunda and Sahul shelves from the Wallacea region and all Oceanic islands (Figure 15). For the Near Oceanic region, Weber’s Line, defined by a 50:50 ratio of Asian to Australian mammals and molluscs, was used as the western boundary and the Thorne-Green line, separating Near and Remote Oceania by a distance-based inter-visibility border, as the eastern boundary (Simpson, 1977; Roberts, 1991a; Moss & Wilson, 1998). Subsequently, 33 different geographic groupings were tested and evaluated. The most homogeneous groupings were then aligned with the majority rule estimate from SAMOVA and adjusted with sensible geographic delimitations. Additional finer scale analyses were carried out on the wider Near and Remote Oceanic populations, because these regions are of particular interest in reference to human migration pathways. In this context, the Island of New Guinea was treated as one population and the islands west of New Guinea up to Weber’s Line (Moluccas) and the islands north of Papua New Guinea (Bismarcks) as two separate populations. Similarly, the Remote Oceanic region was divided into Micronesia and the remainder of the oceanic islands, after no differentiation was found in test runs separating Inter-Oceania (Solomons, Vanuatu, and New Caledonia) as well.

Significances of ΦST and ΦCT were assessed by 10,000 permutations against the null hypothesis of no differentiation between the populations. The underlying distance matrix was computed with Tamura and Nei’s model of evolution, assuming a gamma distribution with a shape parameter of 0.39 (human HVRI; Excoffier & Yang, 1999), the best fitted model for the dataset implemented in ARLEQUIN. According to Meirmans and Hedrick (2011) the AMOVA results do not need to be standardized because the relationships between haplotypes are taken into account.

Figure 15: Pooled sampling locations and major biogeographic lines, delimiting zoogeographic regions between the western Oriental and eastern Australian flora and fauna.

Standard diversity indices were calculated and deviations from the neutral molecular evolution hypothesis (Kimura, 1983) were tested to detect signals of population expansion, both in ARLEQUIN 3.5 (Excoffier & Lischer, 2010) and DNASP 5.10.1 (Rozas et al., 2003; Librado & Rozas, 2009). Haplotype diversity (h) is the probability that two randomly chosen individuals have different haplotypes, and nucleotide diversity (π) is the average pairwise nucleotide difference between individuals within samples (Nei, 1987). Tajima’s D (Tajima, 1989) considers the number of segregating sites and the average number of pairwise nucleotide differences. A founder effect (or directional selection) leads to a large number of single polymorphic sites at low frequency (D < 0) whereas recent admixture of formerly separated populations (or balancing selection) leads to many different but abundant haplotypes (D > 0). Fu’s (1997) FS is an explicit test of population growth and detects excesses of low-frequency alleles in a growing population when compared with the expected number in a stationary one (Fedorov & Stenseth, 2001).
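The indices described above can be sketched in a few lines of Python. This is a minimal illustration using the standard textbook definitions (Nei, 1987; Tajima, 1989), not the ARLEQUIN or DNASP code; the function names and the toy alignment are assumptions made for this example, gaps are not treated specially, and Tajima's D is undefined when no site is segregating.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def haplotype_diversity(seqs):
    """h = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs) / len(pairs)


def nucleotide_diversity(seqs):
    """pi: mean pairwise differences per site."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


def segregating_sites(seqs):
    """Number of alignment columns with more than one state."""
    return sum(len(set(column)) > 1 for column in zip(*seqs))


def tajimas_d(seqs):
    """Tajima's D from the number of segregating sites and pairwise differences."""
    n = len(seqs)
    s = segregating_sites(seqs)
    k = mean_pairwise_differences(seqs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical toy alignment: three identical sequences plus three singletons,
# the kind of excess of rare variants expected after a founder event.
singletons = ["AAAA", "AAAA", "AAAA", "TAAA", "ATAA", "AATA"]
print(haplotype_diversity(singletons))
print(nucleotide_diversity(singletons))
print(tajimas_d(singletons))  # negative
```

For the toy alignment the three variable sites are all singletons, so the mean pairwise difference falls below its expectation from the number of segregating sites and D is negative.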

Fu’s FS (Fu, 1997) and the R2 test by Ramos-Onsins and Rozas (2002) were applied, considering their evaluation of different test statistics for population growth. While both tests detect population growth, they differ in their accuracy depending on sample size: “the behaviour of the R2 test is superior for small sample sizes, whereas FS is better for large sample sizes” (Ramos-Onsins & Rozas, 2002). Coalescent simulations using segregating sites were run to test the statistical significance of these results (10,000 replicates).

Where applicable, mismatch distribution analyses were performed, comparing the observed number of differences between pairs of haplotypes with a simulated distribution (Rogers & Harpending, 1992) and assuming an infinite island model to test for spatial expansion (Excoffier, 2004). The resulting growth-decline parameter τ was used to estimate the time since expansion as

$$T = \frac{\tau \times \text{generation time}}{2\mu \times \text{sequence length}}$$

where τ is the growth-decline parameter derived from the mismatch distribution and µ the mutation rate per base per million years. Initially µ was estimated at 0.151 Myr-1 for mitochondrial HVR-I (Tollenaere et al., 2010) but subsequently adjusted to 0.2 Myr-1, correcting for updated divergence times between R. tanezumi and R. rattus (0.34 Ma versus the previously published 0.45 Ma; Robins et al., 2008; Robins et al., 2010). Two generation times for R. exulans were inferred from the average observations for reproduction (Egoscue, 1970; Tamarin & Malecha, 1972; Williams, 1973, see CHAPTER 1.1.2): generation time one (GT1) of 0.4 a, based on the female’s age at the first litter, which I argue is more appropriate for a highly prolific rodent species, and GT2 = 0.72 a, based on a female’s average age over all offspring, an approach used for other mammals.
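As a worked example of this calculation, the short Python sketch below evaluates T for the spatial expansion of the Remote Oceanic group (τ = 9.8, Table 2). The function name is hypothetical, and the sequence length of 402 bp is an assumption taken from the summary of Chapter 2; with µ = 0.2 Myr-1 it reproduces the T1 and T2 values reported in Table 2.

```python
def time_since_expansion(tau, generation_time, mu_per_site_per_myr, seq_length):
    """T = (tau * generation time) / (2 * mu * sequence length), in years."""
    mu_per_site_per_year = mu_per_site_per_myr / 1e6
    return tau * generation_time / (2 * mu_per_site_per_year * seq_length)


# Remote Oceania, spatial expansion: tau = 9.8 (Table 2).
# Assumed inputs: mu = 0.2 per site per Myr, sequence length = 402 bp.
for label, generation_time in (("GT1", 0.4), ("GT2", 0.72)):
    years = time_since_expansion(9.8, generation_time, 0.2, 402)
    print(label, round(years))
# GT1 24378
# GT2 43881
```

Because τ enters linearly, the ratio of the two estimates is simply the ratio of the two generation times (0.72 / 0.4 = 1.8).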

Phylogenetic analyses were carried out by constructing a median joining (MJ) haplotype network in POPART (Leigh & Bryant, 2012) for the full dataset and for subsets of samples from the wider Near Oceanic and Philippine regions. A second full network was constructed with the integer neighbour-joining (INJ) method (French et al., 2013) to account for different strengths between the construction methods (Joly et al., 2007).

Further, a Bayesian Inference tree was calculated in MRBAYES (Huelsenbeck & Ronquist, 2001), via four Markov chain Monte Carlo (MCMC) chains over 10,000,000 generations each, sampling every 10,000th tree and with an applied burn-in of 10%. The best-fit model of evolution for the observed data was selected as GTR+I+Γ5, via jMODELTEST 2.1.1 (Guindon & Gascuel, 2003; Darriba et al., 2012). Results were evaluated in TRACER 1.5 (Rambaut & Drummond, 2009) and the effective sample size (ESS) for all parameters was checked to be well over 200 to ensure adequate mixing of the chains. To allow inference of the order of divergences within the tree, an out-group was included. Initially the closely related sister taxa R. tanezumi and R. rattus, the more distant R. norvegicus, and Mus musculus were chosen. However, the inclusion of the direct sister species caused inconsistent topologies due to uncertainty in their ancestral relationship to R. exulans; hence, after test runs, only the distant Mus musculus was retained.

3.4 Results

DNA was successfully extracted from 310 R. exulans samples and amplicons between 120 and 583 bp were retrieved. The chromatograms were very clean and showed neither ambiguous sites nor any indication of heteroplasmy. All sequences were assembled into one alignment via the ClustalW algorithm, implemented in Geneious (R5-R8, Geneious; Larkin et al., 2007), covering 404 bp of the HVR I of the mtDNA D-loop. A single indel position was present with variants of either three or four adenine bases. The resulting gap was treated as a fifth character state in all analyses with the exception of the network analyses. A total of 64 variable sites comprised 38 parsimony informative sites and 25 singletons, giving 90 haplotypes. The average base composition for the dataset was 30.5% T, 26.3% C, 30.6% A and 12.6% G with a transition:transversion ratio of 3:1 (AG 16; CT 35; AC 7; CG 1; AT 7; TG 2). Within the complete working set of 340 samples and 95 haplotypes the overall haplotype and nucleotide diversity was high at 0.94 and 0.018 respectively (Table 2).

3.4.1 Population structure

A high level of geographic structuring was revealed in R. exulans. The manual inference of population structure provided the best results under the supposition of four to six populations. In comparison, SAMOVA only performed well for k = 2 and k = 3, while any k > 3 resulted in inconsistency and increasing assignment of ‘population clusters’ consisting of only one sampling location. Under k = 2 locations from Mindanao, Sulawesi, Borneo and Batam Island of the Riau Archipelago were clustered together against the remainder, and for k = 3 a second cluster with sampling locations from Remote Oceania was constructed. However, SAMOVA was only consulted to ensure that all genetically obvious clusters were captured and to control for unbiased groupings.

The most conclusive manual AMOVA setup separated five geographic regions within reasonable boundaries: (1) mainland Southeast Asia (SEA); (2) the Near Oceania region (NO) centred on New Guinea with Weber’s Line, defined by a 50:50 ratio of Asian to Australian mammals and molluscs, as western boundary and with the Thorne-Green line, separating Near and Remote Oceania by a distance-based inter-visibility border, as eastern boundary (Simpson, 1977; Roberts, 1991; Moss & Wilson, 1998), thus including the Moluccas and all Papua New Guinean islands, plus the western part of the Solomon Islands; (3) the Remote Oceanic region (RO) with the entirety of Oceanic islands, including Micronesia, stretching north, east and south from the Thorne-Green line; (4) the Philippines with Borneo and Sulawesi (PHBS); and (5) the Southern Malayan Archipelago (SMA), encompassing Java and the Lesser Sunda Islands (outer circles in Figure 16; Figure 17). All other arrangements resulted in lower variation among regions and higher variation within. Where applicable, the RO-region was colour-coded to show more geographic detail by separating it into Micronesia, Eastern Melanesia (Vanuatu, New Caledonia, eastern Solomons), and Polynesia. These areas have not been treated as regions or groups.

Table 2: Genetic diversity indices for regional groups of R. exulans: numbers of samples (N), numbers of haplotypes (Nh), haplotype diversity (h) and nucleotide diversity (π) ± their standard deviations (SD); neutrality test results for Tajima’s D, Fu’s FS and Ramos-Onsins and Rozas’ R2 statistic (DnaSP); time since expansion (T1, T2 in years) based on GT1 (0.4 a) and GT2 (0.72 a), estimated via τ = growth-decline parameter (D = demographic, S = spatial, goodness of fit tested with the sum of square deviations (SSD)) and mutation rate µ = 0.2 Ma-1 (adjusted from Tollenaere et al 2010). 10,000 simulations per parameter, significance levels: *(P ≤ 0.05), **(P ≤ 0.01) and ***(P ≤ 0.001), n.s. (not significant).

Region  N    Nh  h ± SD       π ± SD          Tajima’s D  Fu’s FS    R2          Mismatch analysis by model (τ, SSD, T1, T2 where given)
Total   340  95  0.94 ± 0.01  0.0184 ± 0.001  -0.94 n.s.  -70.54***  0.057 n.s.  n.a.
SEA     61   10  0.77 ± 0.04  0.0085 ± 0.005  0.16 n.s.   0.25 n.s.  0.111 n.s.  n.a.
PHBS    64   28  0.93 ± 0.02  0.0164 ± 0.009  -0.19 n.s.  -8.85**    0.095 n.s.  n.a.
SMA     29   15  0.87 ± 0.05  0.0092 ± 0.005  -1.09 n.s.  -5.79**    0.056**     D: 0.05; S: 4.40, 0.02, 10,945, 19,701
NO      119  31  0.83 ± 0.03  0.0090 ± 0.005  -1.37 n.s.  -15.87***  0.048 n.s.  D: 0.18; S: 0.62, 0.02, 1493, 2687
RO      66   26  0.79 ± 0.05  0.0108 ± 0.006  -1.48*      -12.01***  0.052*      D: 0.19, 0.02, 481, 866; S: 9.8, 0.01, 24,378, 43,881

For this five-region setup, the overall ΦST level was 0.73 and AMOVA attributed 52.2% (ΦCT) of the total variation to among-region differences, 26.8% to within populations and 20.9% to among populations within regions. In comparison, the initial setup of four populations (Sunda-shelf, Sahul-shelf, Wallacea and Oceania) resulted in a ΦST level of 0.74 and a ΦCT level of 0.39. All pairwise ΦST attested to a very high level of significant (P < 0.001) differentiation between the five groups. The lowest differentiation (ΦST = 0.26) was found between SMA and SEA; the highest level was measured between RO and PHBS (ΦST = 0.63; Table 4, Figure 18). The haplotype and nucleotide diversities were highest in PHBS (0.93 and 0.0164) and lowest in SEA (0.77 and 0.0085; Table 2). Eleven haplotypes were shared between some regions but none was ubiquitous (Table 3). A graphic overview of the haplotype composition of each region is presented in Figure 19.
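The reported percentages and Φ-statistics are linked by the standard AMOVA definitions, which the following Python sketch makes explicit. The function name is hypothetical, and ΦSC (differentiation among populations within regions) is not reported in the text; it is derived here from the three variance percentages purely for illustration.

```python
def phi_statistics(among_regions, among_pops_within_regions, within_pops):
    """Phi_CT, Phi_SC and Phi_ST from the three AMOVA variance components."""
    total = among_regions + among_pops_within_regions + within_pops
    phi_ct = among_regions / total
    phi_sc = among_pops_within_regions / (among_pops_within_regions + within_pops)
    phi_st = (among_regions + among_pops_within_regions) / total
    return phi_ct, phi_sc, phi_st


# Percentages of variation reported for the five-region setup.
phi_ct, phi_sc, phi_st = phi_statistics(52.2, 20.9, 26.8)
print(round(phi_ct, 2), round(phi_sc, 2), round(phi_st, 2))
# 0.52 0.44 0.73
```

The three statistics satisfy (1 - ΦST) = (1 - ΦSC)(1 - ΦCT), so the reported ΦST of 0.73 follows directly from the variance partition.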


Figure 16: Comparison of the geographic distribution of haplogroups and the five AMOVA regions. The outer circle colour indicates the affiliation of the sample location with the AMOVA regions, the inside colouring represents the haplogroups and proportional abundance, the node size represents sample size at a location. Image adapted from biogeographic map constructed in GenGIS (Parks et al., 2013), base maps from Esri and NOAA.


Table 3: Frequencies of the eleven haplotypes that were present in more than one region. Rx001 Rx005 Rx006 Rx007 Rx010 Rx011 Rx017 Rx022 Rx039 Rx043 Rx056 SEA 26 11 4 2 PHBS 8 1 1 12 1 SMA 1 1 2 1 1 NO 6 45 2 6 3 1 RO 30 1 1 3 2 2 2

Table 4: Below diagonal: AMOVA pairwise ΦST levels for the subdivided five defined regions. Above diagonal: absolute number of migrants (M=Nm) exchanged between populations, estimated via FST (for discussion see Chapter 06). Significance level for all ΦST P < 0.001. Minimum and maximum values in bold.

        SEA    PHBS   SMA    NO     RO
SEA     -      0.44   1.43   0.58   0.31
PHBS    0.53   -      0.71   0.36   0.30
SMA     0.26   0.41   -      1.06   0.41
NO      0.46   0.58   0.32   -      0.41
RO      0.62   0.63   0.55   0.55   -
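The conversion from FST to the number of migrants is not spelled out in the text. The sketch below assumes the haploid island-model relation M = (1 - FST) / (2 FST), which reproduces the above-diagonal entries of Table 4 from the rounded ΦST values below the diagonal, up to rounding. The function name is hypothetical.

```python
def migrants_per_generation(fst):
    """M = Nm under the island model for haploid data (assumed relation)."""
    if not 0 < fst < 1:
        raise ValueError("FST must lie strictly between 0 and 1")
    return (1 - fst) / (2 * fst)


# Pairwise values from Table 4 (below diagonal), converted to M.
print(round(migrants_per_generation(0.53), 2))  # SEA-PHBS
print(round(migrants_per_generation(0.55), 2))  # NO-RO
# 0.44
# 0.41
```

Because the published ΦST values are rounded to two decimals, some recomputed M values differ from the table in the last digit.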

Figure 17: Distribution of the 80 pooled populations of R. exulans with the approximate centre among the locations indicated by diamonds. Colours refer to regional groupings defined by AMOVA. The range for R. exulans is marked by dark beige coloration. Map sources: Esri, USGS and NOAA.

Figure 18: Population differentiation heat maps for each pair of the five AMOVA populations. Top: AMOVA pairwise ΦST (distance method Tamura and Nei, gamma 0.39, 10,100 permutations), darker colour indicates higher differentiation. Significance level for all ΦST P < 0.001. Bottom: lower triangle: genetic distance after Nei; diagonal: average number of pairwise differences within populations; upper triangle: average number of pairwise differences between populations.

Figure 19: Inter-haplotypic distance matrices for pairwise differences among all haplotypes present within each of the five AMOVA regions. Light blue stands for fewer differences between sequences, dark blue for more differences; individual scales differ. For number of samples see Table 2.


Because of their importance for questions concerning human migration pathways, the population structures in Near Oceania and Remote Oceania were analysed on a finer scale. NO was divided into three areas, individually separating the Bismarck Archipelago and the Moluccas from Island New Guinea. ΦST levels of 0.16 each showed strong differentiation for both island areas from the larger New Guinea, but no differentiation from each other (ΦST=0.02; Table 5). The RO-region was divided into Micronesia (MN) and the remainder of RO (ROMN), and contrasted with the unified group of the Bismarcks and Moluccas (BIS-MOL). All three regions showed strong differentiation from each other, with a slightly higher differentiation between ROMN and BIS-MOL as compared to RO and NO.

The highest haplotype and nucleotide diversity within the NO-region was observed in the Bismarcks (Table 6); however, the nucleotide diversity in the Moluccas was almost as high, even though the sample size in the Bismarcks was almost four times as large. In the RO comparison, ROMN had the lowest diversity for both parameters while MN had the highest (Table 6).

Table 5: Left: the NO-region subdivided into New Guinea Island, the Bismarcks, and the Moluccas; right: the RO-region subdivided into Micronesia (MN) and the remainder of RO (ROMN) plus the combined Bismarcks and Moluccas (BIS-MOL). Within each: below diagonal: AMOVA pairwise ΦST; above diagonal: absolute number of migrants (M=Nm) exchanged between populations, estimated via FST (for discussion see Chapter 06). Significance level for all ΦST P < 0.001.

Left (NO-region):
            Moluccas   New Guinea  Bismarcks
Moluccas    -          2.65        32.99
New Guinea  0.16***    -           2.72
Bismarcks   0.02 n.s.  0.16***     -

Right (RO-region and BIS-MOL):
          BIS-MOL  MN       ROMN
BIS-MOL   -        1.36     0.39
MN        26.9***  -        2.18
ROMN      56.1***  18.6***  -

Table 6: Genetic diversity indices of R. exulans for the subdivision of NO (upper) and RO (lower). Numbers of samples (N), numbers of haplotypes (Nh), haplotype diversity (h) and nucleotide diversity (π) ± their standard deviations (SD); neutrality test results for Tajima’s D, Fu’s FS and Ramos-Onsins and Rozas’ R2 statistic; time since expansion (T1, T2 in years) based on GT1 (0.4 a) and GT2 (0.72 a), estimated via τ = growth-decline parameter (D = demographic, S = spatial, goodness of fit tested with the sum of square deviations (SSD)) and mutation rate µ = 0.2 Ma-1 (adjusted from Tollenaere et al 2010). 10,000 simulations per parameter, significance levels: *(P ≤ 0.05), **(P ≤ 0.01) and ***(P ≤ 0.001), n.s. (not significant).

Region      N   Nh  h ± SD         π ± SD          Tajima’s D   Fu’s FS      R2         Mismatch analysis by model (τ, SSD, T1, T2 where given)
Moluccas    16  7   0.775 ± 0.088  0.0100 ± 0.001  -0.665 n.s.  0.297 n.s.   0.12 n.s.  n.a.
New Guinea  40  11  0.691 ± 0.079  0.0028 ± 0.002  -1.708*      -6.703***    0.05*      D: 1.131, 0.00, 2,813, 5,064; S: 1.171, 0.00, 2,913, 5,243
Bismarcks   63  17  0.859 ± 0.027  0.0107 ± 0.006  -0.338 n.s.  -2.780 n.s.  0.09 n.s.  n.a.

BIS-MOL     79  23  0.866 ± 0.026  0.0111 ± 0.006  -0.741 n.s.  -6.484 n.s.  0.09 n.s.  n.a.
MN          19  10  0.901 ± 0.045  0.0186 ± 0.010  -0.024 n.s.  -0.043 n.s.  0.13 n.s.  n.a.
ROMN        47  19  0.718 ± 0.074  0.0065 ± 0.004  -1.891*      -11.011***   0.04**     D: 0.459, 0.14; S: 0.458, 0.01, 1,139, 2,051

3.4.2 Population dynamics

For both Oceanic populations, NO and RO, and the isolated island of New Guinea, the combined results of the neutrality tests by Tajima, Fu, and Ramos-Onsins and Rozas carried a strong signal for rapid population expansions (Table 2, Table 6). A weaker signal was expressed for SMA. Under this premise mismatch distributions were simulated, testing for demographic or spatial expansion. For RO the observed data matched both simulated unimodal distributions for expansion considerably well (Figure 20, Figure 21). While NO did not fulfil the requirements to calculate time since expansion, the NO sub-region New Guinea did, and the island matched both simulated distributions nearly perfectly. Similarly, the RO sub-region, ROMN, matched the spatial expansion curve very closely, despite a small second peak at the right end of the distribution (Figure 22). In the remaining populations, including, to a lesser extent, SMA, multimodality was observed, indicating some state of demographic equilibrium (Rogers & Harpending, 1992).
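The observed side of a mismatch distribution is simply the frequency distribution of pairwise differences among all sequences. The Python sketch below illustrates this step only; the function name and toy sequences are assumptions made for this example, and the simulated expectations and confidence intervals calculated in ARLEQUIN are not reproduced.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(seqs):
    """Relative frequency of each number of pairwise differences (0, 1, 2, ...)."""
    counts = Counter(
        sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)
    )
    total = sum(counts.values())
    return [counts.get(k, 0) / total for k in range(max(counts) + 1)]


# Hypothetical toy alignment of four sequences (six pairwise comparisons).
toy = ["AA", "AA", "AT", "TT"]
print(mismatch_distribution(toy))
```

A single smooth peak in such a distribution is the pattern associated with expansion, whereas several peaks correspond to the multimodality described above.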


Based on the genetic diversity indices and the patterns of the mismatch distribution, the estimated growth-decline parameter τ was used to estimate the time since population expansion for the applicable populations (Table 2, Table 6). Based on the two generation times for R. exulans, the time since expansion of the entire RO haplogroup was estimated at 24,378 years for the shorter generation time and at 43,881 years for the longer generation time, while for ROMN the estimates were substantially more recent at 1,139 years and 2,051 years, respectively. For New Guinea the time since expansion estimates were nearly identical between both expansion models, with 2,813 years (GT1) and 5,064 years (GT2) for a demographic expansion and 2,913 years (GT1) and 5,243 years (GT2) for a spatial expansion.


Figure 20: Mismatch distributions comparing the number of pairwise differences between pairs of haplotypes expected under DEMOGRAPHIC expansion (grey line) and as observed (black line), indicating the 90% (green), 95% (red), and 99% (blue) confidence intervals for each region, calculated in ARLEQUIN. The regions are defined as PHBS (Philippines, Borneo and Sulawesi), SMA (Southern Malayan Archipelago, encompassing Java and the Lesser Sunda Islands), SEA (Southeast Asian mainland), RO (Remote Oceania, stretching north, east and south from the Thorne-Green line), and NO (Near Oceania, centred on New Guinea Island, including the Moluccas to the west, all Papua New Guinean islands and the western part of the Solomon Islands).

Figure 21: Mismatch distributions comparing the number of pairwise differences between pairs of haplotypes expected under SPATIAL expansion (grey line) and as observed (black line), indicating the 90% (green), 95% (red), and 99 % (blue) confidence intervals for each region, calculated in ARLEQUIN. The regions are defined as PHBS (Philippines, Borneo and Sulawesi), SMA (Southern Malayan Archipelago, encompassing Java, and the Lesser Sunda Islands), SEA (Southeast Asian mainland), RO (Remote Oceania, stretching north, east and south from the Thorne-Green-Line) NO (Near Oceania, centred on New Guinea Island, including the Moluccas to the west, all Papua New Guinean islands and the western part of the Solomon Islands), and the Island of New Guinea by itself.


Figure 22: Mismatch distributions comparing the number of pairwise differences between pairs of haplotypes expected under DEMOGRAPHIC (left) and SPATIAL (right) expansion (grey line) and as observed (black line), indicating the 90% (green), 95% (red), and 99% (blue) confidence intervals for each region, calculated in ARLEQUIN. The regions are defined as BIS-MOL (Moluccas and the Bismarck Archipelago), MN (all Micronesian islands of RO; north of the Bismarcks) and ROMN (all of Remote Oceania without MN).


3.4.3 Network analysis

The median joining haplotype network of the identified 95 haplotypes revealed two clearly distinct haplogroups (Figure 23): the first stretched across sampling sites of the Northern Malay Archipelago, uniting the Philippine Islands and the northern regions of Borneo and Sulawesi (PHBS), and the second stretched across all of Remote Oceania (RO), including some sample locations on the fringes of Near Oceania. Less distinct haplogroups were depicted for a subset of mainland Southeast Asia (SEA) and for Near Oceania (NO). Central in the network were samples mostly originating in SMA and SEA, with some regional structure for SEA and eastern Java (SMA). To connect to and from the peripheral groups of PHBS and RO, the calculated pathways led via at least one of two samples from Timor and Flores, VTH18 (plus subsequently VTH25) and Rx083, which themselves were separated by four point mutations within SMA. Two star formations indicating population expansion can be observed (von Haeseler et al., 1996): one centred on the most abundant haplotype Rx010, representing the genetic centre for NO, and another centred on the predominant RO-haplotype Rx001. Very noticeable are samples originating from Near Oceanic locations that genetically belong to the RO-haplogroup. The following results refer to both geographic origin and network-derived haplogroups and haplotypes; divergences between the two are illustrated in Figure 16.

Within the NO-haplogroup the majority of samples originated within NO (87%). However, three NO-haplotypes were not exclusive to NO: Rx010 at the centre of the NO-haplogroup was also found in the PHBS, SMA and RO regions (Negros, Camiguin, Mindanao, Sulawesi, Yap, and Flores); Rx011 in NO, SEA and PHBS (New Guinea, Cambodia, Palawan); and Rx017 on Negros, PHBS. Four NO-haplotypes originated elsewhere (Rx021, Mindanao; Rx032, Cook Islands; Rx088, Eastern Solomons; Rx092, Cambodia). Within the RO-haplogroup only 76% of samples originated within the RO-region, whereas the remaining 24% were sampled in NO.

Two islands within the Bismarcks of the NO-region only had RO-haplotypes: Manus (Rx001, Rx002; N=5) and Tench (Rx001, Rx039, Rx040; N=8). Further, distinct RO-haplotypes were found with more frequent NO-haplotypes on Halmahera in the Moluccas (Rx071, Rx078; 2 out of N=6) and Lavongai in the Bismarcks (Rx003, Rx004; 2 out of N=10). All other sampled islands of the Bismarcks contained only NO-haplotypes; among them Emirau, an island between Manus and Tench, which was sampled intensively (Rx038, Rx041; N=17; see Appendix S2). The most frequent RO-haplotype from Tench (NO-region; Rx039) was also found in New Caledonia and Vanuatu; daughter haplotypes (Rx035 & Rx085) were found on Vanuatu and Samoa and more distantly, on Kapingamarangi in the Federated States of Micronesia (FSM; Rx068 & Rx067).

POPULATION STRUCTURE OF RATTUS EXULANS CHAPTER 03


Figure 23: Median joining network for 95 R. exulans haplotypes based on 340 samples (including 5 haplotypes and 30 samples from Thomson et al. 2014). The sizes of the circles reflect the relative abundance of the haplotypes and the colours indicate the associated geographic origin of the samples (see legend). The nodes are separated by one step. The dotted lines delimit haplogroups and regional structures.


Of the PHBS-haplogroup only two samples from Guam (Rx022) did not originate within the PHBS-region (4%). Rx022 is the predominant haplotype (Luzon, Borneo, Sulawesi, Guam), followed by Rx023 (Sulawesi and Negros) and Rx031 (Negros, Borneo, Palawan and Batam). The latter two are not distinguished by the network algorithm, due to an indel.

No single comprehensive SEA-haplogroup was defined. A distinct group of five directly related haplotypes was centred on Rx005 which was found at all sampling locations within SEA apart from the north-eastern sampling locations Bangladesh and India. In a second SEA-grouping, Rx009 was found at four locations spanning from Cambodia, via Laos, to Thailand along a northern route. The daughter haplotype Rx008 occurred in sampling sites in Laos, Bangladesh, and India. The only other haplotype directly related to these two was Rx089 from the southern end of Palawan (PHBS). The remaining samples from the SEA-region did not group with a SEA-haplogroup. Rx007 (Bangladesh, India and Burma) was a daughter haplotype to the main SMA-haplotype and haplotypes Rx092 and Rx011, found in Cambodia, were interspersed in the NO-haplogroup.

The SMA samples form the interconnecting centre of the network, but a distinct haplogroup could not be identified; the group encompassed samples from a range of islands and, apart from Rx083 and VTH18, each haplotype was only found once. However, some haplotypes from eastern Java grouped together and haplotypes clustered around the most frequent SMA-haplotype Rx083 from Flores and Adèle Island. On Flores this haplotype was derived from a semi-ancient (250 - 500 a) and a modern sample and it is the most central of all haplotypes. All haplogroups were connected via the SMA-region, particularly via VTH18 and Rx083, or hypothetical haplotypes between them.

Two haplotypes corresponding to dated samples (1241 and 2011 a BP) from Thomson et al. (2014) also group in this SMA centre. Figure 24 depicts a network based on a very short alignment (141 bp) of my 95 haplotypes with these two samples. Due to the truncated sequence length, H28 cannot be distinguished from either Rx089 or Rx008, whereas H29 presents a distinct haplotype, even at the short length. Apart from the clearly distinguished RO group, this network does not offer a good resolution of the remaining data.
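The loss of resolution at short alignment lengths can be illustrated with a minimal sketch. The sequences and names below are hypothetical toy data, not the thesis alignment, and the function `collapse_by_window` is introduced here only for illustration: it groups haplotypes that become identical once the alignment is cut down to a shorter window.

```python
from collections import defaultdict


def collapse_by_window(haplotypes, start, end):
    """Group haplotype names whose sequences are identical within [start, end)."""
    groups = defaultdict(list)
    for name, seq in haplotypes.items():
        groups[seq[start:end]].append(name)
    # Sort for a stable, comparable result.
    return sorted(sorted(names) for names in groups.values())


# Hypothetical aligned toy sequences (assumption: equal length, no gaps).
toy = {
    "hapA": "ACGTACGTAC",
    "hapB": "ACGTACGTAT",  # differs from hapA only at the last site
    "hapC": "ACTTACGTAC",  # differs from hapA at the third site
}

full = collapse_by_window(toy, 0, 10)   # full-length alignment
short = collapse_by_window(toy, 0, 6)   # truncated alignment

print(full)
print(short)
```

In this toy case the full alignment keeps three haplotypes apart, while the truncated window merges hapA and hapB because their only difference lies outside it. This is the same kind of effect as nodes carrying multiple names in a short-alignment network.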


Figure 24: Median joining network based on a 141 bp alignment, including two dated haplotypes, H28 (2011 a BP) and H29 (1241 a BP), from Thomson et al. (2014), highlighted in grey. Nodes with multiple names indicate the loss of resolution compared to Figure 23, due to the length of the alignment.


The alternative network constructed with the integer neighbour-joining (INJ) method (French et al., 2013) also clearly separated the PHBS and RO haplogroups from the rest of the network (Figure 25). However, overall it resulted in a much less resolved network, calculating more pathways connecting to the PHBS-group, more alternative pathways within the connecting part of the RO-group and many interconnections between the SMA and NO regions.

Further, a connection between the SEA (Rx005) and NO (Rx010) groups was offered with only four hypothetical point mutations between them, as well as a connection between NO and RO bypassing the SMA samples at the centre; this pathway links samples from the Bismarck Archipelago on both sides (RO: Rx002, Rx039; NO: Rx041, 6-7 steps). Apart from those links, the Timor and Flores samples Rx083, VTH18 and VTH19 form hubs between the regions and can only be bypassed via hypothetical haplotypes between them.

The closest PHBS haplotype to the centre was Rx076 from Batam (7 steps from VTH19), which was closely followed by Rx065, only found on Palawan and Rx022, shared between Borneo, Sulawesi, Luzon and Guam in the Marianas. With one step added, five connections are calculated to samples from SMA (VTH18, VTH20, VTH25, and Rx083) and SEA (Rx009). The predominant SEA-haplotype Rx005 is equidistant to VTH18 (Timor) and Rx083 (Flores). The two haplotypes from Kapingamarangi split off along the connection between NO and RO, offering an ancestry independent of any RO-haplotype, but the closest connection was to Rx039 (Tench, Vanuatu, and New Caledonia).


Figure 25: Integer neighbour-joining network for 95 R. exulans haplotypes based on 340 samples. The size of the circles reflects the relative abundance of the haplotypes and the colours indicate the associated geographic origin of the samples (see legend). The nodes are separated by one step. The dotted lines delimit haplogroups.


The network for all samples from the Near Oceanic region (Island New Guinea, Moluccas and Bismarcks) demonstrates the connectivity or rather distances between the Remote and Near Oceanic haplotypes within the region more closely (Figure 26). The main RO-haplotype Rx001 was only found on Manus and Tench of the Bismarcks, where three further RO-haplotypes, Rx002, Rx039 and Rx040, were also present, but no NO haplotypes were found. In contrast, the adjacent island of Lavongai (New Hanover) also carried NO haplotypes (including the NO predominant Rx010) in addition to the two RO-haplotypes (Rx003 and Rx004). The majority of samples from the Moluccas revealed NO-haplotypes, with the exception of Halmahera and Kei Besar; Halmahera was sampled with four haplotypes among six samples (for locations and approximate distances see Figure 16).

Figure 26: Detailed median joining haplotype networks colour-coded by source-island for the Near Oceanic region (Island New Guinea, Moluccas and Bismarcks); pie size increases with frequency. The dotted line indicates the separation between the NO and RO haplogroups.

Similarly, the network for all samples from the Philippine Islands, Borneo and Sulawesi depicts the boundaries between the PHBS and the SMA and NO haplogroups more clearly. Samples from Palawan, Sulawesi, and Borneo span across all three groups, samples from Negros and Mindanao across PHBS and NO, whereas samples from Luzon and Batam were only found within the PHBS haplogroup and the samples from Camiguin only within the NO haplogroup. The haplotypes Rx023 and Rx031, which were not distinguished by the network algorithm due to an indel, only overlap on Negros; Rx023 was further found on Sulawesi and Rx031 on Borneo, Palawan, and Batam.


Figure 27: Median joining network for all haplotypes found in the PHBS-region (Philippines, Borneo and Sulawesi), colour coded by source-island, pie size relative to haplotype frequency. The dotted lines demarcate the separation between the PHBS, SMA, and NO haplogroups.

Figure 28: Median joining network for all haplotypes found in Southeast Asia and the Sunda and Lesser Sunda Islands, colour coded by source-island / region; pie size relative to haplotype frequency. The dotted lines indicate the (truncated) haplogroups and clades according to the full network and Bayesian inference.


The Sunda-network captures the entanglement among haplotypes in this region, but the geographic distribution of haplotypes on the Southeast Asian mainland becomes clearer. However, it has to be pointed out that some relationships among haplotypes are masked because samples present only in other regions were omitted. This underlines the importance of consistent geographic sampling and the consequences of incomplete geographic coverage for inferences. Regrettably this also stresses the potential importance of missing areas in this study.

3.4.4 Phylogenetic reconstruction

The Bayesian inference of the phylogeny (Figure 29) identified two well-supported crown clades (posterior probability (PP) 0.89 and 0.98), one with limited support (PP = 0.75) and two small regional clades with moderate support (PP = 0.81), as well as several subclades with moderate to good support (PP ≥ 0.8).

The PHBS haplogroup formed a pure and structured clade with subclades for Negros (PP = 0.8) and Luzon (PP = 0.95). The RO-haplogroup, including samples from the Bismarcks and Halmahera, formed a less structured clade with two haplotypes from Kapingamarangi (FSM; Rx067, Rx068, PP = 0.85) as subclade. The larger SEA-haplogroup clade (N = 47), albeit only with limited support, included samples from Thailand, Cambodia, Laos and Burma and contained a well-supported subclade with samples restricted to Cambodia and Thailand; the minor clade (SEA2; N = 10; PP = 0.81) contained samples from Thailand, Laos, Bangladesh, Cambodia and India.

While there was no substructure for any members of the NO-haplogroup, the SMA-haplogroup contained a single well-defined clade of eastern Javanese samples (PP = 0.81). Although a Near Oceanic clade was presented, this had no true support (PP = 0.61) and thus had to be regarded as being on the same level as the basal SMA samples. This unstructured group at the base also contained samples from Borneo, a single haplotype from Palawan and another found in Myanmar, Bangladesh, as well as in the very distant Marianas.


Figure 29: Bayesian consensus tree (1000 trees; posterior probabilities shown at the nodes, pp cutoff = 0.5) of 95 R. exulans mtDNA-HVR1 haplotypes. The outgroup Mus musculus was removed for better legibility (double line). Colours of inner and outer circles represent geographic origin and haplogroup respectively. Samples from Borneo and Sulawesi are indicated by a triangular shape. Clades with posterior probability support ≥ 0.75 are shown with location name.


3.5 Discussion

As the third most widely distributed commensal rodent, R. exulans exhibits surprisingly little genetic connectivity among major areas of its distribution. This is most certainly due to the unique landscape making up the geographic range that is inhabited by the species, in particular the vast open water expanses among the islands in Oceania. Due to the human-mediated dispersal into these far reaches of Oceania the geographic range is highly inflated considering the landmass actually involved. The geographic isolation does however have huge potential for genetic drift to drive differentiation.

3.5.1 Genetic diversity

High genetic diversity is to be expected for a successful commensal species that inhabits a wide geographic range and thrives in many different habitats. The genetic variation within all sampled R. exulans was high, and in particular the large number of nucleotide polymorphisms underlines the strong differentiation that was observed among geographic regions. In comparison with the three other major commensal rodent species (Table 7), the haplotype diversity is in the same range as for Mus musculus musculus within China (0.96; Jing et al., 2014) and M. m. domesticus in the British Isles (0.96; Searle et al., 2009) as well as an NCBI compilation of R. rattus (0.97; Brown et al., 1986; Hingston et al., 2005; Robins et al., 2007; Russell et al., 2010; Tollenaere et al., 2010; Russell et al., 2011; Brouat et al., 2014; Colangelo et al., 2014), whereas Song et al. (2014) reported a considerably lower diversity for R. norvegicus (0.76). However, the lower diversity for R. norvegicus could be due to insufficient sampling; only three of eleven locations were sampled with more than two samples and particularly the ancestral regions were not sufficiently included in the study.

The observed nucleotide diversity for R. exulans is noticeably higher than for any of the other commensal rodents. Even the combined diversity between M. m. musculus and M. m. castaneus (13.6) is still lower than that found in R. exulans overall (18.4) and within the PHBS region alone (16.4). The high diversity within the PHBS region is partially influenced by the inclusion of NO haplotypes that were also present in some locations included in this region, but even if those haplotypes were excluded, the diversity still ranged around 13.6, as high as the combination of the two M. musculus subspecies. This is an indicator for the deep divergence between the PHBS haplogroup and the rest of the species. The RO group also showed a notably higher nucleotide diversity (10.8) but equally included samples from a different haplogroup; in this case, choosing only the absolute RO-haplotypes retained a diversity of 5.8, a good average among the other commensals.

Despite the lowest sampling size, due to limited availability, the SMA region had the second highest haplotype and the third highest nucleotide diversity. It is expected that further sampling in this area would reveal more cryptic haplotypes, therefore extended sampling in the area is strongly suggested for future analyses. In contrast, sampling of the Southeast Asian mainland thoroughly covered the entire region but the area still contained the lowest diversity for both measures. This indicates a restricted ancestral population size, either caused by a founder event or a population bottleneck.

Table 7: Comparison of haplotype diversity (h) and nucleotide diversity (π) of the mitochondrial D-loop among the four major commensal rodent species. Overall genetic diversity data for D-loop of R. rattus was only available for Madagascar (Hingston et al., 2005; Tollenaere et al., 2010), therefore published sequences from a wider geographic range were retrieved from NCBI, aligned and truncated to a consensus of 366 bp. The calculated indices are based on this alignment of 111 sequences, representing 70 haplotypes at this length.

Species          N     h      π      Source
R. norvegicus    239   0.76   7.91   Song et al. (2014)
R. rattus        111   0.97   13.5   retrieved via GenBank, accessed 27.12.2014: NCBI (Brown et al., 1986); Hingston et al. (2005); Robins et al. (2007); Russell et al. (2010); Tollenaere et al. (2010); Russell et al. (2011); Brouat et al. (2014); Colangelo et al. (2014)
M. musculus      535   0.96   13.6   Jing et al. (2014)
  ~castaneus     181   0.82   8.5
  ~musculus      354   0.96   5.4
  ~domesticus    95    0.96   6.9    Searle et al. (2009)
R. exulans       340   0.94   18.4   this study
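For readers unfamiliar with the two indices in Table 7, the following minimal sketch shows how they are commonly computed from an alignment. It assumes the usual unbiased estimator of haplotype diversity, h = n/(n-1) · (1 - Σ p_i²), and nucleotide diversity as the mean number of pairwise differences, optionally divided by sequence length. The toy sequences are hypothetical, and the scaling of the π values in Table 7 is not stated in this section, so the numbers below are not directly comparable to the table.

```python
from itertools import combinations


def haplotype_diversity(seqs):
    """Unbiased haplotype diversity: n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    counts = {}
    for s in seqs:
        counts[s] = counts.get(s, 0) + 1
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total / len(pairs)


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site (assumes equal-length, gap-free sequences)."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


# Hypothetical toy sample: four sequences, three distinct haplotypes.
toy = ["AAAA", "AAAA", "AAAT", "AATT"]

h = haplotype_diversity(toy)
k = mean_pairwise_differences(toy)
pi = nucleotide_diversity(toy)
print(round(h, 4), round(k, 4), round(pi, 4))
```

The sketch makes one point from the discussion concrete: h depends only on how evenly samples are spread over haplotypes, whereas π also depends on how different those haplotypes are, so a region can rank high on one measure and lower on the other.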


3.5.2 Population structure

The physical landscape at the centre of the distribution of R. exulans consists of thousands of islands varying from small islets to the large island of Borneo with an area over 140,000 square km. Across the archipelago, outlines and dimensions of all landmass changed periodically during the Pleistocene (Caputo, 2007; Coller, 2009) and increasingly so within the last 0.8 Myr (van den Bergh et al., 2001), when palaeoclimatic oscillations repeatedly created temporary continuity in an otherwise fragmented landscape, providing novel opportunities for dispersal.

For my approach to determine population structure, this posed a challenge, because many of these islands stayed separated while others grouped with adjacent islands allowing the formation of genetically continuous populations. Despite close geographic proximity some connections were less likely than others, due to various reasons I will describe in more detail in the next chapter. Importantly, structuring software is not yet capable of taking these delimiters into account; therefore I chose a semi-manual approach to evaluate different plausible scenarios. Five geographic regions were distinguished by this method and these were also reflected in the results of the subsequent network analyses and in the Bayesian phylogeny. However, the regions have different levels of confidence as genetic and geographic entities, based on the unequal utilization of the available data. The AMOVA approach takes the differences between haplotypes into account, but also considers haplotype frequencies, and as such offers a well-rounded approach in determining genetic interrelation within the defined structure of the current population. On the other hand, the network analyses allow derivations of the meta-population structure without the need for a priori information on groups or geographic distribution (Rozenfeld et al., 2008). Regrettably, the applied network algorithms removed insertion-deletion (indel) data and therefore reduced the information held in the original data, which can have important consequences, particularly for intraspecific networks (Joly et al., 2007). Furthermore, despite the depiction of frequencies on the nodes (Figures 9-13), these are not considered for determining the relationship among haplotypes. The Bayesian tree inference is equally restricted to differences between haplotypes but here the advantage lay in the added credibility that sections of the tree structure gained through posterior probability support. 
All three methods complement one another and thus contribute to a more realistic inference of the true population structure.


The independent analytical approaches provided supplementary evidence for a deep structure within the total population of R. exulans. The two most apparent regional lineages exist in the wider Philippine area, stretching south across the islands of Borneo and Sulawesi, and across all of the Remote Oceanic Islands. With the highest differentiation (ΦST = 0.63) between these two regions, both lineages must have separated a long time ago.
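To make the ΦST statistic more tangible, here is a minimal sketch of a one-level AMOVA-style variance decomposition. It rests on several assumptions that are not taken from the thesis: the distance between two sequences is the plain number of differing sites, only one grouping level is used, and no permutation test is performed. The function names and toy populations are hypothetical, and the software and distance model actually used for the reported value may differ.

```python
from itertools import combinations


def n_diff(a, b):
    """Number of differing sites between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def ssd(seqs):
    """Sum of squared deviations: pairwise distances summed over pairs, divided by group size."""
    return sum(n_diff(a, b) for a, b in combinations(seqs, 2)) / len(seqs)


def phi_st(populations):
    """One-level Phi_ST from within- and among-population variance components.

    Undefined (division by zero) if all sequences are identical.
    """
    pooled = [s for pop in populations for s in pop]
    n_total, n_pops = len(pooled), len(populations)
    ssd_within = sum(ssd(pop) for pop in populations)
    ssd_among = ssd(pooled) - ssd_within
    msd_within = ssd_within / (n_total - n_pops)
    msd_among = ssd_among / (n_pops - 1)
    # Average effective sample size per population.
    n0 = (n_total - sum(len(p) ** 2 for p in populations) / n_total) / (n_pops - 1)
    var_among = (msd_among - msd_within) / n0
    var_within = msd_within
    return var_among / (var_among + var_within)


# Hypothetical toy populations.
fixed = [["AAAA", "AAAA", "AAAA"], ["TTTT", "TTTT", "TTTT"]]  # no shared variation
mixed = [["AAAA", "AAAT"], ["AATT", "ATTT"]]                  # partly overlapping

print(phi_st(fixed))
print(phi_st(mixed))
```

In this toy setting two populations with fixed differences and no internal variation give a value of 1, while populations whose haplotypes differ by only a few steps give an intermediate value. A value such as 0.63 therefore means that most of the observed sequence variation lies between the two regions rather than within them.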

The PHBS region is the most genetically structured area of the five proposed regions. With good agreement between the PHBS close-up network and the overall tree, two regional structures become evident for the Philippines: (1) Luzon is the only island that has only haplotypes from within the PHBS group and (2) Luzon and Negros are the only islands that have well supported subclades. Of the three haplotypes on Luzon, the predominant Rx022 is shared with Borneo and Sulawesi but all daughter haplotypes are private. Negros also shared haplotypes with Borneo and Sulawesi; however, gene flow between Negros and Luzon could not be documented. To allow for this level of genetic drift, either the islands or areas thereon, or both, must have been isolated for a considerable stretch of time before contact with surrounding populations recurred. This is concordant with the theory of slow colonisation across salt water barriers and long periods of accumulation as proposed for the Philippines by Heaney (1986), where areas with boundaries around the 120 m bathymetric line are known to support a high level of endemism.

Sea levels around and below this boundary would have minimized the stretches of water among the Philippine islands and created novel possibilities for immigrations from . Several colonizing passageways were possible; the shortest and shallowest was from Borneo and Palawan via a crossing of the Cuyo East Passage into Panay and Negros, which then formed a single island. From there, dispersal could have commenced northwards towards Luzon or west- and southwards towards Mindanao. However, a pathway was also possible via the deep Mindoro Strait followed by a second crossing through the Verde Island passage towards Luzon, although a subsequent southward dispersal was unlikely given the documented impact of the San Bernardino Channel on natural dispersal (Heaney, 1986). An alternative pathway led along the ridge between Borneo and Mindanao. Immigrations by more than one pathway predict the occurrence of more than one haplogroup, which matched the observed pattern within Negros and Mindanao, where NO-haplotypes were also introduced. Both Mindanao and Northern Sulawesi appear as transition zones between the PHBS and NO haplogroups.


Figure 30: Outline of the land bridges between the islands within the Malay Archipelago. Solid red coloration indicates the 100 m bathymetric line, emphasizing the pathways between Palawan and Luzon/Negros, the Sangihe ridge between Mindanao and Northern Sulawesi and the ridge between Borneo and Mindanao. Image: altered from source http://topex.ucsd.edu/marine_topo/ after Smith and Sandwell (1997)

In this context it is noteworthy that Borneo and Sulawesi each presented a distinct North-South division. This type of North-South partitioning, implying separate introduction pathways, has been reported for other species on Sulawesi, most notably Sus celebensis (Larson et al., 2005). Wallacea (Sulawesi and Lesser Sunda Islands) always stayed isolated from Sundaland and has been known to be a centre of murine evolution (Watts & Baverstock, 1994). Nevertheless, at low sea levels, two pathways were possible. The Sangihe ridge (Figure 30) connected the southern Philippines with the northern peninsula of Sulawesi and is assumed to have allowed Elephas planifrons to cross, subsequently leading to the development of the Sulawesi pygmy form Elephas celebensis. In addition, land connections of southern Sulawesi with Flores and Timor, termed “Stegoland”, have repeatedly been suggested (Hooijer, 1974; Watts & Baverstock, 1994). This demonstrates the possibility of an additional influx from the south and offers an explanation for the latitudinal separation documented here for R. exulans on Sulawesi.


The similar separation in Borneo so far lacks a sufficient explanation that goes beyond speculation. The tentative AMOVA grouping with the rest of the Sunda Islands clearly indicated a closer genetic affiliation with the PHBS haplogroup. Thus, the overlapping distribution pattern of PHBS and SMA haplotypes seems suggestive of a previously established population on the island with additional backflow from the Philippine region and a more recent influx from the South.

For the SMA region, the high diversity and particularly the high differentiation among haplotypes from Flores and Timor, two adjacent islands of the Lesser Sunda chain, were conspicuous. A random introduction onto either of the two islands would have led to reduced genetic diversity due to drift, as observed, for example, for a group of Javanese samples. In contrast, the spread of the observed haplotypes spanning across the centre of the network suggests the existence of multiple cryptic haplotypes. Further, their cross-point characteristics, in addition to their position at the base of the constructed tree, suggest a long established, well interconnected, possibly large population. This lends support to the theory of the Lesser Sunda Islands being the ancestral region for the species (Schwarz & Schwarz, 1967; see Chapter 04).

However, the geographic origin of specimens in this basal group did not only include samples from the Sunda chain but strikingly many samples from the island of Borneo as well as single ones from Sulawesi and Palawan. This emphasizes a close connection between these regions and indicates their importance for dispersal pathways leading to the observed haplogroups. To resolve the composition of the island population and the magnitude of southern influx more clearly, extended sampling is needed for Borneo, particularly including the western Malaysian part of the island.

Although the Southeast Asian mainland encompassed roughly the same number of specimens as PHBS and RO, the area only harboured a few albeit distant haplotypes as illustrated in the overview of inter-haplotype distances. Between the network and the Bayesian analysis, concordant support was found for two independent subclades on the Southeast Asian mainland, with evidence for two separate sources, due to the highly different haplotypes defining these structures. One clade is particularly close to the SMA network and a descendant of a haplotype so far only found on Palawan, indicating a possible dispersal route. Due to this interspersed subclade the distinction between SMA and SEA is not as obvious as between other regions, but becomes more evident when the different approaches are combined. However, the multiple cryptic connections to the samples from Timor and Flores place the SEA region at the fringe of the central SMA distribution and indicate much more recent divergences of its two clades.


3.5.3 Near and Remote Oceania

“Were I to reduce the number of these groups further, I would first isolate as exulans the small-sized animals with large bulla and short meatus, making the large Tuamotu rat and the Raraka, Samoan, and Fiji rats variants of it. The Malekula rats of New Hebrides I would consider more distant on account of their peculiarly small bullae and long meatus.” (Tate, 1935, p159)

Overall the haplotypes retrieved from New Guinea and surrounding islands were closely related to the SMA group. However, the combination of the star-like formation observed in the networks with the marginally supported clade from the tree inference and the attested differentiation between the geographic regions supports a regional sub-structure for Near Oceania. Within the Near Oceanic region the lack of population differentiation between the Moluccas and the Bismarcks, and the high levels of differentiation of both areas from New Guinea indicated a clear connection between the small islands, but not with the larger New Guinea. Additionally, the network structure supports a primary colonisation of the smaller islands, with a later expansion onto the New Guinea mainland.

Furthermore, the New Guinea mainland deviated from the assumptions of neutrality, and a rather recent range expansion onto the New Guinean mainland was supported under both a demographic and spatial expansion model. The timing for this event was estimated at 2813 BP or 5064 BP, depending on the underlying generation time. These estimates are concordant with the archaeological findings of R. exulans in parts of the region, as reported for New Ireland, where bones appeared in archaeological layers between 3000 and 7000 BP (White, 2004). During the excavation at the early Lapita site Tamuarawai on the remote island Emirau no rat bones were discovered, despite the dated age of the archaeological horizon of roughly 3000 BP (Summerhayes et al., 2010); this suggests either that no bone material survived in the low lying reef or that these islands received R. exulans much later. The haplotype of the current population on this island belongs to the Near Oceanic group and could have been introduced more recently, e.g. during the Austral-British recruitment of New Guinean work force (“”) from the mid-19th century onwards, the German colonial reign starting in the late 19th century or during the World War II occupation in 1944, which saw systematic Allied movement along the coastline of the Bismarck Archipelago and an American airbase stationed on the island (see Appendix A 1).

However, it is conspicuous that adjacent islands within the Bismarck Archipelago had specimens belonging to either the NO or the RO group, and that only few islands harboured both groups. Manus and Tench in the Bismarck Archipelago only had rats of the RO haplogroup (Figure 26); Tench is a sister island to Emirau, only a few boat hours away, but it is considerably smaller and has not had as much contact with western visitors as Emirau. However, cultural connections between the St. Mathias group of the Bismarcks and Kosrae, the easternmost island of the Caroline Islands, were observed by Parkinson (1907, pp324-325), based on his findings of a particular style of loom shared between these locations and also the . These island groups further share a distinctive kite fishing technique (Intoh, 1999). The comparatively large variety of haplotypes found on Tench, which is a very small island, suggests a longer period of time for the rats on the island, or multiple introductions from other source populations. Manus on the other hand is a much larger island in comparison and Lou Island of Manus was a major source of the obsidian artefacts found in Remote Oceania (Ambrose, 1978), and as such is assumed to have been a trading hub.

Much of the Remote Oceanic group was similarly represented by a star-like formation; the corresponding clade had high posterior probability support and was highly differentiated from all other regions. The central haplotype for this formation, Rx001, was the only haplotype found in all areas of Remote Oceania, while haplotypes from specimens obtained in Near Oceania and islands in closer radius superficially formed the base of the RO group. Within the group, a substructure was supported for the Micronesian samples from Kapingamarangi and a strong differentiation was attested between all Micronesian samples and the remainder of Remote Oceania. The Micronesian subclade was peculiar; Kapingamarangi is a very low lying atoll, assumed not to have been settled by humans earlier than 1500 BP (Leach & Ward, 1981). Given the basal separation from the RO-group and the good support for the subclade, this would suggest that this group must have had a reservoir elsewhere. The network analyses suggest genetic ties to Rx039 found on Tench, Vanuatu and New Caledonia. Although Tench would be geographically closer, the shared haplotype between Guam and New Caledonia suggests some degree of connectivity between the two areas. However, the documented Philippine haplotype (Rx022) in Guam was possibly a recent introduction from the 1940s, facilitated by movements of military between the Philippines and Guam during World War II and thereafter. The unexpected combinations of haplotypes in the Marianas also suggest modern introductions, and might artificially contribute to the high differentiation of the Micronesian group.


Remote Oceania was the only other region to fail the neutrality tests, and mismatch distributions allowed for an estimation of expansion time for the reduced ROMN under the spatial expansion model. The estimated time of 1139 BP (2051 BP for GT2) for the expansion of this eastern Remote Oceanic R. exulans population is concordant with the archaeologically derived time estimates for human population expansion into East Polynesia (Kirch, 2002; Addison & Matisoo-Smith, 2010). The date remains only slightly older than radiocarbon estimates derived from short-lived materials (Wilmshurst et al., 2011). These results strongly support the human-mediated distribution of R. exulans into Remote Oceania as proposed by Tate (1935) and further hypothesized by anthropological scholars (Allen, 1989; Roberts, 1991a; Matisoo-Smith, 1996), which subsequently led to its application as bioproxy for human migrations (Matisoo-Smith, 1996; Matisoo-Smith et al., 1998; Matisoo-Smith & Robins, 2004).
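As a rough illustration of how a mismatch distribution leads to a time estimate, the sketch below tabulates pairwise differences and converts an expansion parameter τ into years with the commonly used relation τ = 2ut. Everything numerical here is a hypothetical assumption for illustration only: the toy sequences, the mutation rate, the number of sites and the generation times are not the values used in this study, and the fitting of τ itself (normally done by dedicated software) is not shown.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(seqs):
    """Relative frequency of each pairwise-difference count among all sequence pairs."""
    diffs = [sum(a != b for a, b in zip(x, y)) for x, y in combinations(seqs, 2)]
    counts = Counter(diffs)
    return {k: counts[k] / len(diffs) for k in sorted(counts)}


def expansion_time_years(tau, rate_per_site_per_generation, n_sites, generation_time_years):
    """Convert tau to years via t = tau / (2u), with u the per-sequence rate per generation."""
    u = rate_per_site_per_generation * n_sites
    generations = tau / (2.0 * u)
    return generations * generation_time_years


# Hypothetical star-like sample: one central haplotype and three one-step derivatives.
toy = ["AAAA", "AAAT", "AATA", "ATAA"]
dist = mismatch_distribution(toy)
print(dist)

# Hypothetical parameters; the same tau under two assumed generation times.
t_short = expansion_time_years(2.0, 1e-6, 100, 0.5)
t_long = expansion_time_years(2.0, 1e-6, 100, 0.9)
print(t_short, t_long)
```

Under this relation, and assuming the mutation rate is expressed per generation, the estimate scales linearly with the assumed generation time. That would be consistent with the two estimates quoted above differing by a constant factor of about 1.8 (2051/1139).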

Remarkably, the entire Remote Oceanic region, that is, including Micronesia, also deviated from the neutral assumption, and although the mismatch distribution was not strictly unimodal, the overall distribution matched sufficiently well to allow a time estimation. The beginning of the expansion was estimated to have occurred between 32,289 and 58,120 years BP, and the tree structure equally suggested a long time since the divergence of the Remote Oceanic clade. If these results can be further supported by additional analyses, they could indicate an expansion into RO with a series of founder events starting much earlier than so far assumed. The relationship among haplotypes and their geographic distribution are also concordant with two waves of dispersal, one within closer proximity to the NO-region and another significantly expanding the first radius.

In regard to the source of the Remote Oceanic clade, there is no evidence of any genetic ancestry within any of the other haplogroups. The occurrence of Remote Oceanic haplotypes on neighbouring islands within the Bismarck Archipelago that otherwise revealed only Near Oceanic haplotypes thus far provides the only evidence for an ancestral geographic connection between NO and RO. More importantly, Manus and Tench within the Bismarck Archipelago show connections to Halmahera to the west, in shared haplotypes, and cultural connections to Remote Oceania in the east through the distribution of the kite fishing technique (Moodie et al., 1969; Intoh, 1999). They therefore present themselves as the best candidates for the ultimate start of the range expansion into Remote Oceania.



What is exceptional for the dispersal into Remote Oceania, is the colonization of previously uninhabited islands and niches, facilitated by the lack of competing species, natural enemies and possibly diseases. This will have allowed unhindered initial population growth until the carrying capacity of each island was reached. As MacArthur and Wilson (1963) stated, the immigration rate is not only dependent on the rate with which islands are reached, which in turn is dependent on the island size and the distance from a source population, but importantly it is also dependent on the number of resident species. While under ‘normal’ circumstances only a fraction of immigrants would have been successful, it is reasonable to assume that most arriving individuals will have survived and started or contributed to a population. It has to be considered that this different dispersal success influencing the survival rates and thus population size might have effectively shortened generation times and ultimately may have led to a skewed temporal scale for the accumulation of genetic changes.


CHAPTER 04 ORIGIN AND DISPERSAL OF RATTUS EXULANS

4.1 Abstract

In the present chapter I establish the chronology of divergences among clades of R. exulans in order to infer the biogeographic history leading to the population structure revealed in CHAPTER 03. A reconstruction of ancestral areas per major clade and selected internal nodes revealed much of this geographic history and its interrelations with climatic events. Particularly the early divergence and the population structure in the wider Philippine region correlated with temporary opportunities for dispersal and vicariance events when forced isolations could have caused genetic differentiation through genetic drift. My results indicate an early expansion event with subsequent isolation leading to two population reservoirs and the divergence of two deep lineages. One reservoir must have existed in the Borneo, Palawan, Negros area, from where the diverged stock then dispersed throughout the Philippines during a second expansion event, possibly during the last glacial maximum; the currently observed clades could then have diverged after the islands became isolated again around 10 ka.

A second reservoir must have existed for the Remote Oceanic clade, which branched off from the ancestral stock long before any Neolithic human migration took place. In fact the tMRCA of the clade is much closer to the initial human settlement time than to the Austronesian expansion, which is thought to have aided dispersal of the species into the Remote Oceanic realm. Although the tMRCA is not necessarily identical with the time of dispersal, by 25 ka multiple lineages had further diverged. Hence, a reservoir must have existed for the RO-lineage prior to the expansion into Remote Oceania. However, the geographic location of this reservoir remains elusive; potential candidates are the Moluccas and the offshore islands of New Guinea.


Pertaining to the geographic origin of R. exulans as a species, two opposing theories exist in the literature. One proposes an origin in mainland Southeast Asia, just north of the Isthmus of Kra (Tate, 1935) and the other on Flores in the Lesser Sunda Islands (Schwarz & Schwarz, 1967). Based on the phylo-chronology and inferred ancestral ranges I reject the hypothesis of an origin on the Southeast Asian mainland as proposed by Tate (1935), and support the wider Lesser Sunda Islands area as centre of origin for R. exulans.


4.2 Introduction

Over a majority of regions within its distribution R. exulans lives closely associated with humans and their modified environments. The species has been studied repeatedly in the wild and also under controlled laboratory conditions, thus physiological and ecological parameters are well known. Still, the origin of R. exulans remains debated and pathways of dispersal unresolved.

4.2.1 Where is the species commensal, where wild or feral?

R. exulans is not an obligate commensal, but it often lives in houses and in association with human dwellings, particularly if native rats co-occur in the region (Musser & Newcomb, 1983) but notably only in the absence of larger commensal rats, particularly R. norvegicus. However, it has been known to take on a feral life on the arrival of competing rat species, e.g. in New Zealand. Alternatively, due to its small size it can fill niches that are otherwise taken up by house mice (Johnson, 1946). On Sulawesi and Mindanao R. exulans are primarily bush rats, further north in the Philippines it inhabits both bush and villages, while in Southeast Asia it has mostly been reported to live as a true commensal. Throughout the Greater Sunda Islands R. exulans is historically known as commensal in highly populated coastal areas, but as feral in mountain areas. In the Pacific R. exulans is predominantly found in huts and gardens, but has also been reported to take on a feral lifestyle. As a general pattern, R. exulans appears to live more ferally in mountainous ranges. As an invasive species in most of its range it profits from a human-induced disturbance gradient and is unlikely to outcompete well-adapted native species in pristine forests. Despite phenotypic adaptations, the bimodal lifestyle is probably correlated to human population density combined with human impact on the environment, rather than just the habitat itself.

The assumption of a deliberate transportation of R. exulans by humans into the remote areas of Oceania prompts the question where this deliberate distribution might have begun. To address this question, more information is needed about the history of the population structure.

4.2.2 Hypotheses about origin

Possible dispersal pathways have been proposed by Tate (1935) based on the morphological relationship of subspecies of the concolor/exulans group (see CHAPTER 05 for more detail) where amongst others he proposed the distribution and origin of the Remote Oceanic variant of R. exulans.

“I would thus derive exulans and hawaiiensis from the Philippine Borneo Java region in the form of many successive waves probably arising from stocks already differentiated from each other. I am not inclined to believe that any of them has passed through New Guinea.” Tate (1935, p. 167)

He further concluded an origin of the species in Tenasserim, Burma, a region just north of the Isthmus of Kra on the Malay Peninsula, bordering on Thailand. However, this conclusion was based on the apparent convergence of the concolor group with cremoriventer and fulvescens, both of which were since removed from the genus Rattus and reclassified within the genus Niviventer (Musser, 1981b). A competing hypothesis was formed by Schwarz and Schwarz (1967) who proposed R. r. wichmanni as the wildtype and thus ancestor of the exulans-series. Schwarz and Schwarz’s (1967) justification was based on the morphological peculiarity of untainted uni-coloured white ventral hair that was not observed in specimens from anywhere else; R. r. wichmanni only occurs on Flores of the Lesser Sunda Islands. However, according to Musser (1981a) the morphology among R. exulans of all the Lesser Sunda Islands and Sulawesi including its satellite islands did not significantly differ. The hypothesis of an origin on Flores has recently been supported by Thomson et al. (2014) who conducted a phylogeographic survey on R. exulans and compared allozyme diversities with that of its sister species R. hainaldi, an assumed native to Flores.

4.2.3 The chance of ways of dispersal

No matter which of the hypotheses of origin might be supported by the ensuing analyses, the Malay Archipelago would have had to be crossed to arrive at the current distribution of R. exulans. Unlike the vast distances over open water preventing natural dispersal throughout the Pacific Ocean, floating logs, severe weather events, as well as palaeoclimatic changes offered opportunities for natural dispersal over the shorter water gaps of the Australasian Mediterranean Sea. Atypically for a mediterranean sea, the water currents among the islands of the Malay Archipelago are strongly influenced by wind, and thus the directionality of the surface currents changes twice annually during the monsoons (Figure 31 and Figure 32), increasing possible dispersal routes. The periodic reversal of the oceanic surface circulation has a particularly strong impact on the currents along the Java, Flores and Banda Seas (Wyrtki, 1961). However, other currents are consistent throughout the year, for example the North Equatorial Current, which continues north along the eastern Philippine coastline, and the Halmahera Eddy, which partially feeds into the Sulawesi Sea. The Eddy itself creates a steady semi-circular current along Southern Mindanao to Northern Borneo and back out to the Pacific via Northern Sulawesi.

Depending on directionality, the straits in between the northern chain of the Lesser Sunda Islands (Bali, Lombok, Sumbawa, Komodo, and Flores) had unequal influence on the dispersal of mammals. The Bali and Lombok straits posed the strongest barrier for eastward dispersal of Indo-Malayan species, whereas westward dispersal was more impacted by the Alas Strait between Lombok and Sumbawa (Mertens, 1936). However, ca. 87% of species found in the northern chain are also found on Java and many as far west as India. The fauna on Timor is again decidedly different from the northern chain, with higher influence of species arriving from the east. Rensch (1936) fittingly termed the Lesser Sunda Islands including the southern Moluccas as “Zwischengebiet”, a transitional or intermediate zone where the influences of the Indo-Malayan and the Australo-Papuan fauna form a gradient.

Additional to the seasonal monsoons, the El Niño–Southern Oscillation (ENSO) and its counterpart La Niña provided frequent extreme weather events to influence dispersal. Evidence for the occurrence of ENSO in its current appearance has been found in the eastern Pacific for at least the last 10 kyr (Carré et al., 2014), with a weakened period between 5.8 and 3.2 ka (Sandweiss et al., 2001).


Figure 31: Surface currents in the Australasian Mediterranean Sea during the south monsoon in February (ME Mindanao Eddy); collated after Tomczak and Godfrey (2001).

Figure 32: Surface currents in the Australasian Mediterranean Sea during the north monsoon in August (ME Mindanao Eddy, HE Halmahera Eddy); collated after Tomczak and Godfrey (2001).


On a wider temporal scale, the island landscape of the Malay Archipelago, at the centre of the distributional range of R. exulans, has been affected considerably by eustatic sea level oscillations throughout the Pleistocene (Webb III & Bartlein, 1992). Falling sea levels exposed the Sunda shelf, connecting the Malay Peninsula with the islands of Sumatra, Java and subsequently Borneo. At lowest sea levels Vietnam and Borneo were connected via the now-submerged floor of the South China Sea (CHAPTER 01, Figure 7). Similarly, the exposed Sahul shelf merged Australia and New Guinea into one major landmass (after SahulTime, Coller, 2009, via http://sahultime.monash.edu.au; based on Lambeck & Chappell, 2001; Tian et al., 2002; Caputo, 2007).

Figure 33: Sea level changes over the last 2 Ma with magnification of the last 180 ka. Current sea level is denoted as 0 m. For corresponding geographic profiles, see CHAPTER 01, Figure 7.

All major land bridges for both shelves formed at sea levels around 60 m below today’s standard elevation zero (hereafter indicated by prefix -), many even at -40 m. Roughly a third of the past million years (Myr) had sea levels at -60 m and below (Figure 33), with episodes spanning 6 to 58 kyr. The most recent period of lowered sea levels, from 48 to 12 ka, led to an unprecedented recession of -135 m. The substantial connectivity, reached at sea levels around -60 m, allowed increasing dispersal of land fauna within Sundaland (Borneo, Sumatra, Java and Bali), and levels of connectivity between the Southeast Asian mainland and Sundaland were pronounced when sea levels reached recessions of over -100 m around 0.9 Ma, unprecedented within the last 10 Myr (Tian et al., 2002; Coller, 2009). Several extended glacial periods entailing maximal sea level recessions (approximately eight periods stretching 13 to 32 kyr; Caputo, 2007) caused extensive land bridges to form and opened up new territory for dispersal. In the Philippines, areas with boundaries around the 120 m bathymetric line are known to support a high level of endemism (Heaney, 1986), indicating a lack of admixture. Sea levels below this boundary would have minimized the stretches of water among the Philippine islands and created possibilities for waves of immigration from Sundaland. Maximum connectivity was only reached a few times during the mid to late Pleistocene, with the last three occurrences from 359 to 340 ka with a minimum sea level at -124 m, from 141 to 136 ka at -107 m, and during the last glacial maximum (LGM) from 30 to 16 ka at -135 m. During these times of land connectivity, long-range dispersal via the newly exposed areas would only be hampered by emerging river systems. The largest river system, the North Sunda River (Figure 34), impacted dispersal between West Borneo and the Southeast Asian mainland; further east the Siam River, draining the Gulf of Thailand, kept a barrier at the Malay Peninsula, while the Malacca Straits system hampered dispersal between Sumatra and Malaysia, and the East Sunda River between south Borneo, Java and the Lesser Sunda Islands (Voris, 2000; Sathiamurthy & Voris, 2006; Solihuddin, 2014). These effects were most pronounced at lowest overall sea levels.

Figure 34: Paleo river systems within the Sunda shelf, at -120 m sea level (after Voris 2001). MS: Malacca straits, NS: North Sunda, ES: East Sunda and SR: Siam River. Arrows indicate the continuance towards the shelf edges.

Intermittently, the interglacial periods caused sea levels to rise, several times even above standard elevation zero, as last documented between 121 and 120 ka. Sea levels above -60/-50 m induced the Sundaland land connections to break up, which remained disconnected almost continuously between 130 and 70 ka. A highly fragmented island landscape was reached at sea levels around -30 m when the geographic outline of the Malayan islands already closely resembled the current physical coastlines (Figure 33; CHAPTER 01, Figure 7). For the Isthmus of Kra on the Malay Peninsula a transgression was suggested during deglaciation after the LGM, caused by a meltwater pulse between 14.6 and 14.3 ka with an extremely accelerated sea level rise (5.33 m per 100 years, Hanebuth et al., 2000). This separation of the Thai Malay Peninsula had possibly occurred before, during the Miocene and Pleistocene (Woodruff, 2003).
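The magnitude implied by the meltwater pulse can be checked with simple arithmetic. The following is a minimal sketch, assuming the quoted rate of 5.33 m per 100 years held constant over the whole interval from 14.6 to 14.3 ka (a simplification; the function name is hypothetical):

```python
def total_rise_m(start_ka, end_ka, rate_m_per_century):
    """Total sea level rise over an interval, assuming a constant rate."""
    duration_years = (start_ka - end_ka) * 1000.0
    return duration_years / 100.0 * rate_m_per_century


# Interval and rate as quoted in the text (after Hanebuth et al., 2000).
rise = total_rise_m(14.6, 14.3, 5.33)
print(f"{rise:.2f} m over 300 years")
```

Under this assumption the pulse corresponds to a rise of roughly 16 m within about three centuries.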

In addition to the climatic environmental changes, the consequential reduction of available landmass put pressure on individuals and entire populations by increased competition for resources and habitat. Further, the formation of islands isolated populations thereby increasing the rate of subdivision. From an evolutionary perspective, this cyclic vicariance repeatedly provided resident flora and fauna with possibilities for vast range expansions, followed by habitat loss and isolation. This frequently led to extinction or subdivision and ultimately to speciation, which is why the Pleistocene period has often been dubbed as a ‘species pump’ (Steppan et al., 2003). A prolific and adaptable rat species would most certainly have benefited from such a variable environment.

4.2.4 Inference of ancestral ranges

Despite this very complex palaeo-history of the Malay Archipelago it can be expected that a species distributed throughout this range will have co-evolved and retained evidence of these changes among individuals in recent populations. Avise et al. (1987) advocated the use of gene trees to infer spatial continuity or separation in species’ histories. With phylogenetic approaches in biogeography we can trace the demographic processes that led to the observable distribution, mainly dispersal and vicariance events, and further use them to infer ancestral geographic areas for clades within the phylogeny.

Early approaches in biogeography used parsimony methods to infer the geographic distribution of ancestral nodes from area cladograms (Brooks et al., 1981). Later, event-based parsimony approaches such as dispersal-vicariance analysis (DIVA) were developed, explaining the evolution of clades by dispersal and vicariance events that were chosen via a cost matrix, where each event type had a fixed cost assigned in advance (Ronquist, 1997). However, these parsimony approaches could not effectively account for stochastic processes and did not take uncertainties in the phylogeny and dating into account, causing a general underestimation of the number of events (Ronquist & Sanmartín, 2011). Statistical approaches in phylogeography overcome these problems by applying a continuous-time Markov model to estimate the probability of events and rates of processes (Knowles, 2009; Ree & Sanmartín, 2009). Such parametric methods within biogeography are comparatively recent approaches and have become particularly common since molecular clock estimates of clade ages became more reliable (Ree & Sanmartín, 2009).
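To illustrate what a continuous-time Markov model provides, the sketch below computes transition probabilities for the simplest possible case: a single area that a lineage either occupies (1) or does not occupy (0), with a constant gain rate and loss rate. This two-state model has a closed-form solution. The function name and the rate values are hypothetical and are not taken from any of the cited methods.

```python
import math


def two_state_transition_probs(gain_rate, loss_rate, t):
    """Transition matrix P(t) for a two-state continuous-time Markov chain.

    State 0 = area unoccupied, state 1 = area occupied.
    Returns [[P00, P01], [P10, P11]] after elapsed time t.
    """
    total = gain_rate + loss_rate
    decay = math.exp(-total * t)
    p01 = gain_rate / total * (1.0 - decay)  # unoccupied -> occupied
    p10 = loss_rate / total * (1.0 - decay)  # occupied -> unoccupied
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]


# Hypothetical rates per kyr, evaluated along branches of different length.
for branch_length in (0.0, 5.0, 50.0):
    P = two_state_transition_probs(gain_rate=0.02, loss_rate=0.01, t=branch_length)
    print(branch_length, [round(p, 3) for row in P for p in row])
```

For long branches the probabilities approach the stationary frequencies gain/(gain + loss) and loss/(gain + loss), irrespective of the starting state.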

The dispersal, extinction, and cladogenesis (DEC) model (Ree et al., 2005; Ree & Smith, 2008) is such a parametric approach. Spatial transitions occur among a set of predefined discrete geographic areas, while the geographic range is treated as heritable character at the nodes of a phylogeny (Ree & Sanmartín, 2009). Different dispersal and extinction possibilities through time can be modelled by varying the conditions or applying constraints to parameter values in set time periods (Ree & Smith, 2008). A similar approach has been made with Bayesian island biogeography (Sanmartín et al., 2008) but there the geographic areas are restricted to isolated (island) areas not assuming widespread taxa across multiple islands. The transitions between islands (dispersal) are modelled analogous to nucleotide evolution models and the probabilities are determined by parameters for the dispersal rate and the carrying capacity (Ree & Sanmartín, 2009). Further, extinction events are not considered and as yet temporal dynamics have not been implemented.

Generally biogeographic approaches can be distinguished by their inference method and by their biogeographic model (for a full review see Ronquist & Sanmartín, 2011). Different discrete models are used to approximate the most likely evolutionary scenarios, e.g. the island model and the reticulate model. While the island models are mostly applicable to dispersal between islands or island-like environments that have restricted possibilities of exchange, the reticulate model is more complex, allows for expansion and contraction of ranges to happen and can model geo-dispersal, id est dispersal associated with range expansion across a previous dispersal barrier (Ronquist & Sanmartín, 2011). Diffusion models that treat space as continuous variable have also been introduced (Lemmon & Lemmon, 2008; Lemey et al., 2010) and find their application in fast evolving systems such as the epidemiology of viruses (Lemey et al., 2009).

Despite many advances and much potential among the discrete models, a major limitation remains in their computational manageability when dealing with more than a few areas. The calculation of likelihoods for the observed geographic ranges at the tips of the phylogeny is computationally intensive because it relies on matrix exponentiation to capture all possible biogeographic histories along the branches; the integration over possible ancestral ranges at the interior nodes adds further to the computational load (Landis et al., 2013). In an attempt to model the area specifics of the Malesian region (Malay region plus Melanesia) DEC has been modified to accommodate more areas, but under the restriction that not all areas can be occupied concurrently (Webb & Ree, 2012). The Malay region is also at the centre of the distribution of my study species; the difficulties that an ever-changing island landscape brings along, with many lineages spanning multiple islands, therefore make the inference of the historical biogeography of R. exulans with widely used current methods extremely difficult and might lead to misinterpretation. A new Bayesian approach using data augmentation (Tanner & Wong, 1987) offers a solution to the constraint in numbers of areas and further introduces a simple distance-dependent dispersal model (Landis et al., 2013).
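The scaling problem can be made concrete with a short calculation. If a geographic range is any combination of presence and absence across n discrete areas, there are 2^n possible ranges, and a rate matrix over these ranges has 2^n x 2^n entries. The sketch below is a simple illustration of this growth; the function name is hypothetical and the count includes the empty range.

```python
def range_state_space(n_areas, include_empty=True):
    """Number of possible ranges and rate-matrix entries for n discrete areas."""
    n_states = 2 ** n_areas
    if not include_empty:
        n_states -= 1
    return n_states, n_states ** 2


for n in (3, 5, 10, 15):
    states, entries = range_state_space(n)
    print(f"{n:>2} areas: {states:>6} ranges, {entries:>13} matrix entries")
```

With fifteen areas, as used later in this chapter, the rate matrix would have more than a billion entries, which is why exponentiating it for every branch becomes impractical.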

This method was developed in the apparent tradition of using approaches analogous to evolution in genetic systems (Robinson et al., 2003) and relies on Markov chain Monte Carlo (MCMC) instead of matrix exponentiation. The underlying biogeographic model differs from Ree et al. (2005) insofar as it assumes that ancestral ranges are inherited identically; under a null model every area has an equal rate of colonisation or extinction, while under a second model the colonisation rates are distance-dependent (Landis et al., 2013). Full biogeographic models for the given phylogeny are then compared via Bayes factors. This method has been empirically tested on a model dataset of a Malesian Rhododendron species (Landis et al., 2013) and although floral distribution patterns across the known biogeographic boundaries are different to faunal ones, this approach appears most promising for the inference of the biogeographic history of R. exulans.

The aims of this chapter are (a) to identify the chronology among the observed haplogroups, (b) to infer the geographic ancestry of the regional clades with special consideration of palaeo-climatic events, and (c) to trace the geographic ancestry of the species, testing for support to promote or dismiss either of the two competing hypotheses for an origin in Burma or Flores.


4.3 Methods

To infer biogeographic parameters two complementary analyses were carried out: Bayesian inference in BEAST2 was used to estimate tree topologies with intraspecific divergence times for R. exulans. Subsequently, a Bayesian ancestral area reconstruction (AAR) with dispersal-vicariance analysis was applied to the BEAST2 derived chronograms with the BayArea method implemented in RASP to investigate biogeographical relationships.

4.3.1 Estimation of the R. exulans phylochronology

Two separate phylochronologies were estimated with BEAST 2.1.3 (Bouckaert et al., 2014): (a) applying the fossil calibrated time for the most recent common ancestor (MRCA) of the genera Mus and Rattus and (b) applying a fixed substitution rate, omitting the outgroup. This allowed a test of the impact of the outgroup on the population structure and of the fixed mutation rate on relative divergence times. Fossil calibration was applied as a normally distributed prior centred at the estimated time of 9.55 Ma [CI 8.8-10.3] after Steppan et al. (2003; 2004). The substitution rate was initially fixed to 0.151 substitutions per site per million years (Tollenaere et al., 2010). However, Tollenaere et al.’s estimate was based on the divergence time between R. rattus and R. tanezumi after Robins et al. (2008), who used a slightly older Mus-Rattus calibration of 11.65 Ma [CI 10.94-12.27]. Due to updated divergence time estimates for R. rattus and R. tanezumi (Robins et al., 2010) the mutation rate was subsequently adjusted to 0.2 substitutions per site per million years.
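A normal calibration prior needs a mean and a standard deviation, whereas the calibration above is quoted as a central estimate with an interval. The sketch below shows one common way to convert between the two. It assumes that the bracketed interval is a symmetric 95% interval, which the text does not state explicitly, and the function name is hypothetical; the standard deviation actually entered in the analysis is not reported here.

```python
from statistics import NormalDist


def normal_prior_from_interval(lower, upper, coverage=0.95):
    """Mean and standard deviation of a normal prior matching a symmetric interval."""
    mean = (lower + upper) / 2.0
    z = NormalDist().inv_cdf(0.5 + coverage / 2.0)  # about 1.96 for 95%
    sigma = (upper - lower) / (2.0 * z)
    return mean, sigma


# Mus-Rattus calibration interval in Ma, as quoted in the text.
mean, sigma = normal_prior_from_interval(8.8, 10.3)
print(f"mean = {mean:.2f} Ma, sigma = {sigma:.3f} Ma")

# Sanity check: the resulting prior should place ~95% of its mass in the interval.
prior = NormalDist(mean, sigma)
print(round(prior.cdf(10.3) - prior.cdf(8.8), 3))
```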

Three additional Rattus spp. outgroups were intended as controls for the time estimates, the split between R. rattus and R. tanezumi (0.34 Ma [CI 0.17-0.45]) and the split between R. norvegicus and the previous three (2.3 Ma [CI 1.4-3.3]) (Robins et al., 2008; Robins et al., 2010). Run parameters and priors were refined incrementally, to push the estimates into the approximate time frame of splitting events. However, when Rattus outgroup species were used, each represented by a single specimen, estimates of the time to MRCA (tMRCA) were obtained that were known to be impossible. Although multiple calibration points generally improve the estimates of divergence times, in Bayesian phylogenetic analyses topologies consistent with the temporal order given by the calibration (which is not necessarily correct) will have higher prior probabilities, therefore a single reliable calibration point can be more appropriate (Ho & Phillips, 2009; Duchêne et al., 2014). Accordingly, the calibration was assigned directly to the split between Mus and a R. exulans representative sequence (Rx001), while all R. exulans haplotypes were forced to be monophyletic. To ensure that the choice of haplotype did not bias the analysis, a control was run with the calibration between Mus and the most common haplotype Rx010, which belonged to a different haplogroup.

The GTR+I+Γ5-model was applied after testing for the most appropriate model for the presented data in jMODELTEST (Darriba et al., 2012). A Birth Death Tree Model was applied with a strict clock. A strict clock is the most useful for analysis on the population level (Brown & Yang, 2011). The prior for the speciation rate (birthRate) was set as a diffuse Gamma distribution with the shape parameter set to 0.001 and the scale parameter to 1000. All parameters were estimated with a fixed substitution rate of 1.0. For (a) the clock rate was estimated (hereafter referred to as MR), given a uniform prior with an unrestricted upper bound. For (b) the clock rate was initially fixed to 0.151 (hereafter referred to as FC15) and subsequently to 0.2 substitutions per site per Myr (hereafter referred to as FC20).
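Why a Gamma distribution with shape 0.001 and scale 1000 counts as diffuse can be seen from its first two moments. The sketch below uses the standard shape/scale formulas (mean = shape x scale, variance = shape x scale^2); the function name is hypothetical, and it is assumed that the values quoted above follow the shape/scale parameterisation as stated.

```python
import math


def gamma_moments(shape, scale):
    """Mean, variance and standard deviation of a Gamma(shape, scale) distribution."""
    mean = shape * scale
    variance = shape * scale ** 2
    return mean, variance, math.sqrt(variance)


mean, variance, sd = gamma_moments(shape=0.001, scale=1000.0)
print(f"mean = {mean:.3f}, variance = {variance:.1f}, sd = {sd:.2f}")
```

The standard deviation is more than thirty times the mean, so the prior places very little constraint on the birth rate.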

A single Markov chain Monte Carlo (MCMC) chain was run for each approach with 500,000,000 steps, sampling trees every 50,000 generations. A burn-in of 10% was applied.

The results were evaluated in TRACER 1.6 (Rambaut & Drummond, 2009) and the effective sample size (ESS) for all parameters was controlled to be well over 200 to ensure adequate mixing of the chains. All analyses were repeated three times to ensure the same convergence range.
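The run settings above determine how many trees remain for summarising. The following sketch works through that bookkeeping; the function name is hypothetical, and it ignores the initial state that some samplers log in addition, so the actual file may contain one more tree.

```python
def retained_samples(chain_length, sample_every, burnin_percent):
    """Number of logged, discarded and retained samples for one MCMC run."""
    n_logged = chain_length // sample_every
    n_burnin = n_logged * burnin_percent // 100
    return n_logged, n_burnin, n_logged - n_burnin


# Settings as described in the text: 500,000,000 steps, one tree every 50,000, 10% burn-in.
logged, discarded, kept = retained_samples(500_000_000, 50_000, 10)
print(logged, discarded, kept)
```

With these settings each run contributes 9,000 post-burn-in trees.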

Condensed trees were calculated with TREEANNOTATOR, applying a 10 per cent burn-in and a posterior probability limit of 0.5. Maximum clade credibility was chosen to find the tree with the highest product of the posterior probability of all its nodes, and mean node heights to set the ages of each node on the tree to the mean height across the entire sample of trees for that clade (Drummond & Bouckaert, 2014/2015). No negative branch lengths due to uncertainty were observed on the trees.

A lineage through time (LTT) analysis was added to provide a temporal overview of intraspecific divergences; the lineages were summarised over all trees of FC20 to represent only the intraspecific time scale.
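The principle behind a lineage through time plot can be sketched for a single tree: in a fully resolved ultrametric tree, the number of lineages present at a given time before present equals one plus the number of internal nodes older than that time. The node ages below are hypothetical values chosen for illustration only and are not results of this study; summarising over a posterior sample of trees would repeat this count per tree.

```python
def lineages_through_time(node_ages, times):
    """Count lineages at each time point (before present) for one ultrametric tree.

    node_ages: ages of all internal nodes, including the root.
    times: time points before present at which to count lineages.
    """
    return [1 + sum(1 for age in node_ages if age > t) for t in times]


# Hypothetical internal node ages in ka (five nodes, hence six tips).
node_ages_ka = [60, 45, 33, 20, 10]
time_points_ka = [70, 50, 30, 5]
print(lineages_through_time(node_ages_ka, time_points_ka))
```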

4.3.2 Reconstruction of ancestral geographic ranges

Much information is held in spatially sampled genetic data, but a major limitation of historic biogeographic analyses is that most computational approaches cannot handle a larger number of populations. This results in the combination of geographic areas despite the availability of higher resolution data collection. One of the currently most used models is the dispersal-extinction-cladogenesis model (DEC, Ree & Smith, 2008); it was suggested that a DEC model based Bayesian approach, stochastically sampling complete geographic histories that can be integrated over alternative topologies, would be able to improve the inference of plausible biogeographic hypotheses (Clark et al., 2008). BayArea, a new approach by Landis et al. (2013), is pioneering in that direction. The MCMC method integrates over biogeographic histories corresponding to a dispersal-extinction model and estimates the joint posterior probability of the parameters given a phylogenetic tree, thereby overcoming the previous computational restrictions. The underlying dispersal model is distance-dependent, where the probability of dispersal between two areas is inversely related to their geographic distance, meaning that “the rate of gaining a particular area (0->1) depends on the relative proximity of available areas to the currently chosen lineage” (Landis et al., 2013). On the basis of the gain and loss rates (1->0) of any area given a particular geographic range, the likelihood of the sampled biogeography is computed.
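The idea of distance-dependent dispersal can be illustrated with a simplified calculation. The sketch below computes great-circle distances between area centres (coordinates taken from Table 8) and turns them into relative weights that fall off with distance. It is an illustration of the principle only, not the BayArea implementation: the inverse power form, the exponent `beta`, and all function names are assumptions made for this example.

```python
import math

# Approximate central coordinates (latitude, longitude) from Table 8.
AREAS = {
    "Bismarcks": (-3.2, 151.4),
    "New Guinea": (-4.1, 140.9),
    "Moluccas": (3.0, 129.2),
    "Remote Oceania": (-7.4, 176.3),
}


def haversine_km(a, b, radius_km=6371.0):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(h))


def gain_weights(occupied, candidates, beta=1.0):
    """Relative weights for gaining each candidate area, decreasing with distance.

    beta is a hypothetical distance exponent; beta = 0 gives equal weights.
    """
    raw = {
        c: sum(haversine_km(AREAS[o], AREAS[c]) ** -beta for o in occupied)
        for c in candidates
    }
    total = sum(raw.values())
    return {c: w / total for c, w in raw.items()}


weights = gain_weights(["Bismarcks"], ["New Guinea", "Moluccas", "Remote Oceania"])
for area, w in weights.items():
    print(f"{area:<15} {w:.2f}")
```

For a lineage currently restricted to the Bismarcks, the nearest candidate area (New Guinea) receives the largest weight under this simplified scheme.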

The distance-dispersal dependence is the only constraint, thus the model does not require predefinition of possible dispersal pathways or time frames, allowing an unbiased assessment of the spatial history. Until extended models become available that can integrate the diversity of temporal changes in landscape connectivity this is the most promising approach. This BayArea Bayesian ancestral area reconstruction was conducted within the software package RASP 3.0.2 (Yu et al., 2014).

To infer the ancestral regions, the entire range was parsed into fifteen discrete geographic areas (A to O) and the approximate centre for each of these areas used as coordinates (Figure 35, Table 8). Mainland Southeast Asia, the large Sunda Islands Java and Borneo, Sulawesi, and New Guinea were represented by one location each. The Philippines were represented by four separate islands: Luzon, Negros, Palawan, and Mindanao. Remote Oceania was split into three regions: Micronesia, Inter-Oceania (consisting of the Solomon Islands, Vanuatu, and New Caledonia), and the remainder of the remote oceanic islands to the east. The Lesser Sunda Islands (i.e. Flores and Timor), the Moluccas, as well as the Bismarck Archipelago were represented by one location each. Every individual haplotype was then assigned to as many of these areas as it occurred in.
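The assignment of haplotypes to areas amounts to a presence/absence coding over the fifteen area letters. The sketch below shows one way such a coding could be produced; the haplotype labels and their occurrences are hypothetical placeholders, and the exact input format expected by RASP is not reproduced here.

```python
AREA_CODES = "ABCDEFGHIJKLMNO"  # fifteen areas, A to O, as in Table 8


def encode_range(area_letters):
    """Return a 15-character presence/absence string for a set of area letters."""
    present = set(area_letters)
    unknown = present - set(AREA_CODES)
    if unknown:
        raise ValueError(f"unknown area codes: {sorted(unknown)}")
    return "".join("1" if code in present else "0" for code in AREA_CODES)


# Hypothetical occurrences: one widespread and one single-area haplotype.
occurrences = {"hap_example_1": "AC", "hap_example_2": "E"}
for haplotype, letters in occurrences.items():
    print(haplotype, letters, encode_range(letters))
```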

The BEAST2 derived chronogram (MR) served as input tree to provide time calibrated nodes. Within RASP Mus was removed as outgroup. The BayArea analysis was run with three replicates, each with a chain length of 40,000,000 and a sample frequency of 8,000.

After visual inspection of the MCMC in TRACER, a burn-in of 4,000,000 steps was removed. ESS values were checked and found to be very high for all parameters of all runs. The marginal posterior probability (mPP) for ancestral areas is reported for all basal nodes and nodes with posterior probability over 0.75.
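The run settings above translate into sample counts as follows. This is a small bookkeeping sketch; it ignores the initial state logged at step 0, and pooling the three replicates is an assumption made here for illustration, not something the text states.

```python
CHAIN_LENGTH = 40_000_000   # MCMC steps per replicate
SAMPLE_EVERY = 8_000        # sampling frequency in steps
BURN_IN_STEPS = 4_000_000   # steps discarded as burn-in
REPLICATES = 3

samples_per_run = CHAIN_LENGTH // SAMPLE_EVERY          # logged states per replicate
burn_in_samples = BURN_IN_STEPS // SAMPLE_EVERY         # logged states discarded
retained_per_run = samples_per_run - burn_in_samples    # states kept per replicate
retained_total = retained_per_run * REPLICATES          # if replicates were pooled (assumption)
burn_in_fraction = BURN_IN_STEPS / CHAIN_LENGTH

print(samples_per_run, burn_in_samples, retained_per_run, retained_total, burn_in_fraction)
```

The burn-in therefore corresponds to the first 10% of each chain.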


Table 8: Approximate central coordinates for the ancestral area reconstruction in RASP.

Abb.  Area                   Latitude  Longitude
A     Bismarcks                  -3.2      151.4
B     Borneo                      0.3      113.7
C     Inter Oceania             -16.2      167.5
D     Java                       -6.9      107.8
E     Lesser Sunda Islands       -8.6      122.6
F     Luzon                      15.5      120.9
G     Mindanao                    8.0      124.0
H     Micronesia                  7.5      147.0
I     Moluccas                    3.0      129.2
J     Negros                      9.9      123.0
K     Palawan                     9.9      118.8
L     New Guinea                 -4.1      140.9
M     Remote Oceania             -7.4      176.3
N     Southeast Asia             15.5      103.3
O     Sulawesi                   -2.2      120.3

Figure 35: Subdivision of the geographic distribution of Rattus exulans into 15 discrete geographic areas from A to O (see Table 8) for the BayArea analysis in RASP. The coordinates used as representative for each region are marked with bullet points. Underlying colours represent extent of areas or island groups, the delimiters for the Philippines are the individually sampled islands.


4.4 Results

4.4.1 Timing of divergences within R. exulans

The timing of intraspecific divergences was successfully estimated for the two Mus-fossil calibrated and both fixed clock approaches. The effective sample sizes (ESS) obtained by the final runs of all independent analyses were high for all parameters, indicating good mixing of the MCMCs and a sufficient number of independent samples.

The overall demographic development since speciation is demonstrated through the LTT analysis. It showed an episodic increase in lineages starting 60 ka, with progressively shorter time intervals between divergences up to approximately 33 ka. From then on divergences culminated in a continuous increase until the inflection point was reached around 10 ka (Figure 36). A plateau in divergences was reached around 5 ka. The high increase in divergences coincided with the onset of the LGM.

Figure 36: Lineages through time plot for R. exulans, depicting early periods of lineage divergences followed by a steady increase in lineages until divergence deceleration set in around 10-5 ka. Time 0 is 2010.
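The principle behind a lineages-through-time curve can be sketched in a few lines. This assumes a fully bifurcating tree in which every internal node adds exactly one lineage; the function name is hypothetical, and the five node ages are only the MR estimates quoted in this chapter for nodes I, II, V, VI and VII, so the result is a deliberately incomplete illustration and not a reproduction of Figure 36.

```python
def lineages_through_time(node_ages_ka):
    """Return (age, lineage count) pairs, oldest node first.

    Assumes a bifurcating tree: the root produces two lineages and each
    younger internal node adds one more.
    """
    ordered = sorted(node_ages_ka, reverse=True)
    return [(age, index + 2) for index, age in enumerate(ordered)]


# Illustrative subset of node ages (ka) quoted in the text for the MR tree.
example_ages = [29.2, 114.0, 52.6, 92.4, 44.1]
ltt = lineages_through_time(example_ages)
for age, count in ltt:
    print(f"{age:6.1f} ka: {count} lineages")
```

Plotting the count on a logarithmic axis against age gives the familiar LTT curve, where a steeper slope indicates faster accumulation of lineages.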

Overall age estimates among fossil and fixed clock calibrations

The posterior density distributions for the tMRCA of the two Mus calibrated analyses, using different haplotypes for calibration, converged on nearly identical posterior probability peak values with medians of 125.4 ka [95% HPD: 17.8-313.7 ka] and 117.2 ka [95% HPD: 10.9-301.4 ka] respectively (Figure 37). All other parameters between these Mus calibrated runs were in equally good agreement. This indicates that the choice of haplotype for calibration did not have a significant impact on the results. Only the tree calibrated between Mus and the R. exulans haplotype Rx001 will be discussed from here on.

Figure 37: Posterior density for the time to most recent common ancestor (BEAST2) for R. exulans, estimated via fossil calibration of the Mus - Rattus split at 9.55 Ma after Steppan et al. (2004); summary statistics for the calibrations with the most common haplotypes Rx001 and Rx010 are shown; image based on TRACER.
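For readers unfamiliar with the 95% HPD intervals reported throughout, the following sketch shows one common way to obtain such an interval from posterior samples: take the narrowest window that contains 95% of the sorted draws. The function name and the simulated right-skewed draws are illustrative assumptions; this is not the routine used by TRACER.

```python
import math
import random


def hpd_interval(samples, mass=0.95):
    """Return the narrowest interval containing `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    k = max(1, math.ceil(mass * n))  # number of draws the interval must contain
    # Slide a window of k consecutive sorted draws and keep the narrowest one.
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]


# Simulated right-skewed posterior draws (hypothetical, for illustration only).
random.seed(0)
draws = [random.lognormvariate(0.0, 1.0) for _ in range(5000)]
low, high = hpd_interval(draws)
print(round(low, 3), round(high, 3))
```

For a right-skewed posterior such as the tMRCA densities in Figure 37, the HPD interval sits closer to the mode than an equal-tailed interval would, which is why the reported lower bounds are comparatively close to zero.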

The tree height of the fixed clock-rate analysis (FC15) indicated a slightly older median age of 150.3 ka [95% HPD: 93.2-224.5 ka], but most comparable results were still in considerable agreement (Figure 38). For FC20 the median age was estimated at 113.7 ka [95% HPD: 69.3-168 ka], with mean, median and mode (equal to the most probable estimate: MPE) almost identical. The impact of the increased mutation rate due to the updated estimate for the R. rattus - R. tanezumi split is most noticeable on the overall age estimate for R. exulans and in the wider range of the 95% HPDs.
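The relation between the two fixed-clock estimates can be checked with a back-of-the-envelope calculation. Under a strict clock applied to the same data, node ages scale roughly inversely with the assumed rate; this is a simplifying assumption (tree priors and topology also influence the posterior), and the helper name is hypothetical.

```python
def rescale_age(age_ka, old_rate, new_rate):
    """Rescale an age estimate assuming age is inversely proportional to clock rate."""
    return age_ka * old_rate / new_rate


fc15_median_ka = 150.3  # reported FC15 median
expected_fc20_ka = rescale_age(fc15_median_ka, old_rate=0.151, new_rate=0.2)
print(round(expected_fc20_ka, 1))  # 113.5
```

The rescaled value of about 113.5 ka is close to the reported FC20 median of 113.7 ka, as expected if the rate is the main difference between the two analyses.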

Between the two calibration methods the credibility intervals (CI) for all parameter estimates were much narrower under the fixed clock models and particularly under the higher mutation rate of FC20. The distributions inferred via fossil calibration were also slightly more right skewed indicating an influence of the upper limit of the confidence intervals from the initial fossil calibration of Mus.


Figure 38: Posterior density for the time to most recent common ancestor (BEAST2) for R. exulans, estimated via fixed clock rates of 0.151 (FC15) and 0.2 (FC20); summary statistics for both calibrations are shown; image based on TRACER.

A general comparison of the Mus - Rx001 calibrated tree topology (MR) with those derived from both fixed clock rate analyses (FC15 & FC20) showed good agreement in the overall structure of the trees (Figure 39). The chronology of age estimates for the major crown clades was congruent except for an inversion between clades VI and VII on FC20. This was caused by a much younger clade-age estimate for NO (VI); because the clade had only marginal support (PP = 0.79) this was possibly due to uncertainties in the tree topology. All age estimates for the basal clades and those with PP > 0.75 support are collated in Table 9.

The age estimates for the basal clades (II-VII with PP >0.75) on the MR-topology were intermediate to the FC20 (lower bound) and FC15 (upper bound) estimates, with the exception of the PHBS-clade, which had a much younger (roughly 25%) estimate on MR. Among the remaining younger clades no such continuity was observed apart from SEA2 (XIII). Although this indicates the much higher uncertainty in the topologies at this level, all 95% HPD ranges of FC20 were contained within the 95% HPD of the MR estimates and all MPE obtained in MR were within the 95% HPD of the narrower FC20 estimates (Table 9); for SEA1 (XI) this is not traceable in the table, because of internal structural difference between the supported clades (see Rx091 on Figure 39).

Table 9: Most probable estimates (MPE) and their 95% highest posterior density (HPD, lower and upper bound) for the tMRCAs (ka) on nodes I – XVI (denoted on Figures 7 and 8) for two differentially estimated BEAST2 chronograms: MR, calibrated by the Mus - Rattus split at 9.55 Ma [CI 8.8 - 10.3] after Steppan et al. (2004), and FC, estimated via a fixed substitution rate of 0.151 Myrs-1 (FC15) after Tollenaere et al. (2010) and the adjusted mutation rate of 0.2 Myrs-1 (FC20); with posterior probability (PP), numbers of dispersal (d), vicariance (v) and extinction (e) events, and BayArea: their two most probable ancestral areas with marginal posterior probabilities (mPP). For a key to node locations see Table 8.

Columns: Node; Age estimation (ka), MR (MPE, 95% HPD lower, 95% HPD upper, PP, d | v | e); BayArea (area, mPP); Age estimation (ka), FC20 (MPE, 95% HPD lower, 95% HPD upper, PP); Age estimation (ka), FC15 (MPE, 95% HPD lower, 95% HPD upper).

Figure 39: Phylochronograms for R. exulans. A: tMRCAs estimated via fossil calibrated split between Mus and Rattus, outgroup removed; B: tMRCAs estimated via fixed clock rate of 0.2 substitutions per site per million years, adjusted from Tollenaere et al. (2010) and Robins et al. (2010). Posterior probabilities > 0.5 shown (italics), corresponding nodes annotated with estimated age. Colour coded by region and areas within (see legend): solid lines indicate occurrence in single area/region, dotted lines in more than one area within a region (coloured by majority), gradients indicate occurrence in two areas, black lines in more than two regions. Tips were not aligned. For 95% HPD see Table 9.


Comparison of topology and clade age estimates between MR and FC20

The most probable estimates of the tMRCA for R. exulans (node I) under the MR and FC20 approaches matched strikingly well at 114 ka [95% HPD: 17-304 ka; PP: 1] and 114.7 ka [95% HPD: 69-168 ka; PP: 1] respectively. The species’ age estimate under the slower mutation rate (FC15) was roughly 20% older at 139 ka [95% HPD: 97-225 ka; PP: 1]. Due to a lack of major structural differences between FC15 and FC20 and the congruence of the age estimate between MR and FC20, in the following I will only compare MR and FC20. Their two tree topologies support the same number of nodes and equally distinguish several regional clades (Table 9, Figure 39); however, they also exhibit significant differences.

The Philippine clade

The first clade diverging from the rest of the species at the tMRCA of R. exulans was the wider Philippine group; however, the crown group itself (VII) had an estimated tMRCA of 29.2 ka [MR; 95% HPD: 5.1-97.1 ka; PP: 1] and 42.5 ka [FC20; 95% HPD: 23.4-69.5 ka; PP: 1]. These time estimates were furthest apart among the main clades but the MPE for MR was well within the 95% HPD of the FC estimates. Regardless of the different age estimates, the long time span between the initial divergence and the crown clade indicates a long divergence time with differentiation through genetic drift.

As indicated by the colouring within the PHBS-clade (VII) in Figure 39 the subclades on the MR-tree with the younger age estimates were more compatible with the underlying geographic association than on the FC20-tree. The chronology of supported nodes within the PHBS-clade (VII) was identical between the two trees. However, the time estimates for the Negros and Luzon clades (XV and XVI) were older on FC20, contributing to the overall older tMRCA of clade VII.

The Remote Oceanic clade

The next clade diverging was RO at 92.4 ka [MR; 95% HPD: 10.8-198.8 ka; PP: 0.89] and 75.3 ka [FC20; 95% HPD: 52.1-120.3 ka; PP: 0.82] respectively. With a tMRCA of the extant clade (V) at 52.6 ka [MR; 95% HPD: 5.9-92.6 ka; PP: 1] and 46.5 ka [FC20; 95% HPD: 24.0-65.5 ka; PP: 1] this indicates a long divergence time with differentiation through genetic drift just as for the PH-clade. Within the RO-clade, a small subgroup of two deeply diverged haplotypes from the Polynesian outlier Kapingamarangi in Micronesia (VIII) formed the only supported substructure. The tMRCA estimates for this clade of 19.9 ka [MR; 95% HPD: 0.1-32.1 ka; PP: 0.98] and 23.4 ka [FC20; 95% HPD: 1.1-30.9 ka; PP: 0.97] were reasonably close. The two trees differed in the relative placement of this clade towards a structurally undefined cluster of samples originating in the Bismarcks and Remote Oceania as far as Samoa. On the MR-tree, the subclade (VIII) forms an (unsupported) clade with this cluster, whereas on the FC20-tree the subclade (VIII) is basal to it. The close association of the Micronesian clade with this particular group of samples in a structurally undefined cluster is consistent across all observed tree topologies within this study.

Within SEA-SMA-NO continuum

The remainder of the haplotypes, mainly originating in SEA, SMA, NO and Borneo, do not form equally well supported clades. A tMRCA for the stem node expanding over these regions (III) was not available, because of the very low support for the node on both trees. The low support was caused by structural uncertainties, mainly due to the placement of the SEA1 clade (XI) and haplotype Rx057 from Borneo, which both are placed either within or outside of the wider SMA-group.

The Near Oceanic clade

Nevertheless, within this structurally uncertain context, a NO-clade (VI) reached sufficient support as a third large geographic grouping with a tMRCA of 44.1 ka [MR; 95% HPD: 4.4-79.4 ka; PP: 0.83] and 26.3 ka [FC20; 95% HPD: 24.0-59.3 ka; PP: 0.79]. The clade contains the same haplotypes across both trees and has no supported substructures. The large difference in clade ages is most likely attributable to the relative placement of the SEA1 group (XI).

Clades from SEA

Three well supported nodes capture the majority of all SEA haplotypes (XI, XIII, and XVI). The placement of the larger SEA-clade (XI) not only has a major influence on the structure among haplotypes under node III, but the clade itself does not contain the same haplotypes between the trees. In the MR-tree topology Rx091 is placed basal to node X whereas on FC20 it is placed under node XI; the age estimates can therefore not be compared. On both trees node XI encompassed node XVI as a subclade; however, the age estimates here [MR: 0.3 ka; 95% HPD: 0.003-13.9 ka; PP: 0.89 and FC20: 10.5 ka; 95% HPD: 0.2-14.5 ka; PP: 0.87] reflect the uncertainties in the placement of the stem node (XI) and can therefore not be considered. The third SEA-clade (XIII) was independent from the other two and placed among the more closely related SMA-haplotypes on both trees. The age estimates were both very young [MR: 4.4 ka; 95% HPD: 0.03-15.7 ka; PP: 0.85 and FC20: 1.9 ka; 95% HPD: 0.2-15.8 ka; PP: 0.79].

Java, Timor and Borneo

Within this uncertain overall group two more clades found very strong support on both trees: a group of Javanese samples (XII) and one consisting of Timorese and Bornean samples (X). However, while the chronology was identical, the time estimates did not agree (see Table 9); for node X this might again be influenced by the placement of Rx091 (SEA).

In summary, both estimation procedures resulted in good agreement concerning the main structural entities, but due to the placement of a group of SEA samples differed noticeably in their topologies under node III. Accordingly, the time estimates were only partially congruent. The wide confidence intervals on the fossil calibration resulted in a correspondingly wider HPD on the older clades of the MR-tree; for the younger clades, however, the HPDs of the two approaches were almost equally wide, highlighting the uncertainty of these estimates.


4.4.2 Ancestral geographic regions for the major clades

A biogeographic reconstruction of ancestral ranges was fitted to the summary trees of two individual BEAST chronologies, MR and FC20. All replicates converged to a similar likelihood with a high ESS, and for all but one parameter (the estimated number of area gains across the FC20-tree) the ESS was also above the recommended minimum of 200. Overall, ESS values were higher for all parameters on the MR-tree based reconstructions, with about double the effective samples under the same run settings.

Despite the differences in run quality, the reconstructions of ancestral areas were largely congruent between both trees (Table 10), generally agreeing on the two most probable ancestral regions and, with minor exceptions, also agreeing on the first most probable ancestral region. Given that the time estimates from a direct fossil calibration are more resistant to inherited errors, that the MR-based analyses had the lowest log likelihoods with consistently much higher ESSs, and that the subclades for the Philippine region on the MR-tree were more compatible with the underlying geography, I primarily consider the MR-derived AAR for the remaining results.

The results from the MR-based AAR are collated in Figure 40, where the R. exulans phylochronogram builds the scaffold for the AAR results. The time scale goes forward in time (kyr) from the tMRCA to the extant samples at the tips and was aligned with the sea-level contour for this time period (after Lambeck & Chappell, 2001; Coller, 2009). The most important sea-levels determining increased interconnectivity (-100 m), the turning point for the split up of the Sunda Shelf (-60 m), and increased isolation between islands as we know them today (-30 m) are explicitly marked. The letter-annotation at the tips indicates the regions (A to O) in which that particular haplotype occurs.

All basal nodes and nodes with PP ≥ 0.75 were annotated with pie charts showing the relative mPPs (scaled to sum to 100%) for the ancestral region at that node. The nodes indicating the crown clades of the four major geographic regions (IV-VII, plus III) were of most interest and therefore further annotated with the actual mPPs of the two or three (depending on the probabilities between the first two) most probable ancestral regions for that node. Dispersal and vicariance events associated with the corresponding branches are indicated by a perimeter around the pie chart.
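Because the marginal posterior probabilities are estimated per area, they need not sum to 100% at a node, which is why the pie charts use rescaled values. A minimal sketch of that rescaling follows. The function name is hypothetical, and the input holds only the three rounded values quoted in the text for node V (Remote Oceania, Inter-Oceania, Bismarcks); the remaining twelve areas are omitted, so the resulting shares are illustrative and not those drawn in Figure 40.

```python
def pie_shares(mpp_percent):
    """Rescale per-area marginal posterior probabilities so they sum to 100%."""
    total = sum(mpp_percent.values())
    return {area: 100.0 * value / total for area, value in mpp_percent.items()}


# Rounded mPPs (%) for node V as quoted in the text; other areas left out.
node_v = {"M": 60.0, "C": 45.0, "A": 44.0}
shares = pie_shares(node_v)
for area, share in shares.items():
    print(area, round(share, 1))
```

The rescaling preserves the ranking of areas but not the absolute values, so the annotated mPPs rather than the pie segments should be used when comparing support between nodes.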

Table 10: Marginal posterior probabilities (%) for the ancestry per region (A to O) for all basal and supported nodes (I - XVI) of the MR (grey) and FC20 (white) BEAST2 chronograms via an ancestral area reconstruction with BayArea.

Rows: areas A to O; columns: nodes I to XVI, each with an MR and an FC20 value.

Figure 40: Bayesian ancestral area reconstruction (RASP, BayArea) based on the phylochronogram (BEAST2) with date estimates via the fossil calibrated Mus – Rattus split; outgroup was removed, PP ≥ 0.75 are shown at nodes. Legend depicts colour codes for the areas, perimeters of nodes indicate dispersal, vicariance or extinction events. All supported internal nodes and nodes corresponding to the main clades of interest are annotated with the actual marginal posterior probabilities (mPPs) of the most likely ancestral areas. Bottom: Sea-level contour (Lambeck & Chappell, 2001; Coller, 2009) for inspection of event correlation.


The base structure

For the well supported most ancestral nodes, I and II (PP: 1 & 0.89), a geographic origin could not be inferred; however, the lowest mPPs were inferred for an ancestry in the Philippine islands (I & II) and for Southeast Asia and the Moluccas (II; see Table 10). Both associated branches are explained via dispersal events (I: 4 and II: 1) and one vicariance event each. Node III, comprising the three regions NO, SEA, and SMA, reached mPPs of over 33% for Java, the Lesser Sunda Islands and New Guinea; apart from the Bismarcks, Borneo and Sulawesi, all other mPPs were below 20%.

The same three regions were most probable on node IV, the greater SEA-SMA region (PP < 0.5), but with 49% for the Lesser Sunda Islands, the mPPs still remained below 50%. Despite this being the SEA-SMA clade, an ancestry in SEA had very low probability (18.7%). Because of the ambivalent placement of the SEA clade (XI) and Rx057, with respect to the FC20-tree, possibly contributing to the uncertainty on nodes III and IV it is important to state that an ancestry in SEA was even lower on the FC20-AAR (III: 8.8 and IV: 8.6%; see Table 10).

The Remote Oceanic clade

The other three nodes of most interest are the well supported major geographic regions RO (V), NO (VI), and the wider Philippine area (VII). The Remote Oceanic clade (V) reached an mPP of 60% for an ancestry in Remote Oceania, followed equally by the Bismarcks and Inter-Oceania (44% and 45%); an ancestry in the Moluccas or any of the Philippine islands is not supported (all mPPs < 5%). This is surprising because an ancestry in Remote Oceania is unlikely, but it is possibly another indicator for the existence of an old reservoir for this lineage. The associated branch is explained by a simple scenario of one dispersal and one vicariance event. The Micronesian subclade traces its origin to within Micronesia (93.5%), with the Bismarcks (21.7%) as the only alternative above 20% mPP.

The NO-clade

The NO-clade (VI) reached the highest single mPP among the three supported major clades, with 81% for New Guinea Island followed by 50% for the Bismarcks. Apart from Micronesia, the Lesser Sunda Islands and the Moluccas, all other mPPs were below 10%. The NO branch is explained via one explicit dispersal event. Ancestry of the marginally supported node IX (PP = 0.75), grouping two haplotypes from Halmahera (NW of New Guinea) and Goodenough Island (SE of New Guinea), was inferred to have originated from New Guinea (92.9%) and alternatively from the Moluccas (49%), via a dispersal-vicariance scenario.

The wider Philippine clade

The wider Philippine clade had its most probable ancestry within Negros (69% mPP), Borneo or Palawan (47% mPP each). As for Remote Oceania, the branch is associated with a dispersal-vicariance scenario. The two subclades for the islands Luzon and Negros were each inferred to have a local ancestry, supported by 100% mPP.

The SEA and SMA clades

The regional ancestry for node XI (SEA1) differs drastically between the MR and FC estimates: MR infers an ancestry of the clade within the Lesser Sunda Islands (98.7% mPP; SEA 49.5%) while FC infers an ancestry within SEA (85.8%; Lesser Sunda: 25.9%); the reciprocal ancestry reached the second highest probabilities. The inclusion of Rx091 and the (resulting) positioning of this SEA clade outside of the NO-SMA group cause this difference. Clade XI is explained through a single dispersal. The daughter clade (XVI) was inferred to have originated within SEA (100%).

Of the remaining two groups, the Javanese clade (XII) was inferred to have originated locally (100%) and the clade combining Timorese and Bornean samples (X) within the Lesser Sunda Islands (99.7%).

Result summary

Overall it is important to bear in mind that all estimates are highly dependent on the resolution of the underlying topology. However, given the data, the main pattern in the topology is robust, as can be seen from the congruence between different inference procedures (MR, FC). The dating is completely dependent on prior estimates, be it the fossil calibration or the derived mutation rates. Therefore the resulting chronological pattern is by far more reliable than the absolute timing observed. For this exact data set, the presented divergence dates are the maximally supported timings under this particular Mus-Rattus-split prior. If a different calibration were used, e.g. the older 12 Ma calibration, the tMRCA estimates would be correspondingly older. However, the described patterns of dispersal and vicariance are fairly robust under slightly different topologies and reveal much of the biogeographic dynamics that led to the current distribution of R. exulans.


4.5 Discussion

The genus Rattus originated in the late Pliocene and speciation of most of the 66 extant Rattus species occurred within the Pleistocene (Rowe et al., 2011). The time of divergence of R. exulans from the rest of Rattus has previously been estimated as part of several interspecific studies. In all these studies, R. exulans was positioned closely to R. tanezumi, R. rattus and R. norvegicus, but the absolute positions among these species vary and the inherent uncertainty can be observed in widely overlapping credibility intervals (Robins et al., 2008; Pagès et al., 2010; Robins et al., 2010; Rowe et al., 2011; Fabre et al., 2013). In test analyses for this study, these uncertainties caused substantially different tree topologies within R. exulans, depending on the choice of outgroup between the two close relatives R. rattus and R. norvegicus. This stresses the importance of evaluating an appropriate outgroup.

As a result Mus was chosen as the single fossil calibration point, to circumvent the topological wobble. The tree structure and hence the intraspecific chronology could be inferred with much higher certainty. This has been shown by the comparison of the overall tree structures between the fossil-calibrated tree and one inferred without an outgroup, by fixing the substitution rate (Figure 39). The chronology of divergences among the clades having high posterior support was largely concordant between these two approaches. However, uncertainty in node ages increased inversely with the clade age.

When using a single calibration point ancestral to the genus, the absolute estimates of divergence times (tMRCA) are entirely dependent on the calibration time and the width of the confidence intervals applied. The calibration dates applied in the present study were carefully considered. Paleontologically derived dates for the origin of were first published with estimates between 8 and 14 Ma by Jacobs and Pilbeam (1980). These dates were updated to an age between 8 and 12 Ma after new fossil finds (Jaeger et al., 1986). The appearance of the predecessor Progonomys (11.8 Ma) and the first records of identifiable Mus (5.7 Ma) (Jacobs & Downs, 1994) subsequently led to the application of a 12 Ma (± 2 Ma) divergence date between Mus and Rattus. The younger dates applied here, derived by Steppan et al. (2004), are based on the fossil calibration with the application of this 12 Ma divergence, not to the Mus - Rattus split as previously, but instead to the split between all modern murine rodents and Batomys. Their justification relies on the character-derived placements of Antemus and Progonomys and finds full support by Jacobs and Flynn (2005). This placement of Antemus and Progonomys has further been supported by Fabre et al. (2013), who tested the impact of different fossil constraints on node ages within Rattini and further cross validated molecular confidence intervals with those from palaeontological fossil dating.

4.5.1 Age of divergence for R. exulans

The two independent estimates of the tMRCA for R. exulans presented here, 114 and 114.7 ka [95% HPDs: 16-304 ka / 69-168 ka], are concordant between methods. These divergence times are also largely consistent with previous estimates (Table 9, Figure 39). Robins et al. (2008) inferred the age for the R. exulans clade at 120 ka [95% HPD: 30-530 ka], based on full mitochondrial genomes of three specimens within a multispecies study on divergence times of selected rats. The three specimens were good representatives of three geographic clades in my study: one specimen was from New Zealand with the most common RO haplotype Rx001, one from Papua New Guinea (equivalent to NO, Rx011), and one from Thailand with the most common SEA haplotype Rx005. Robins et al. specifically stated that a tree topology with (RxNZ (RxThai, RxPNG)) was optimal in achieving high likelihood scores. This placement of the Remote Oceanic sample as sister of the clade, sharing a direct root, agrees with the relationship revealed in the present study; here only the divergence from the Philippine samples predated those three groups (Figure 39).

In contrast to Robins et al. (2010) and the here presented dates, Aplin et al. (2011) estimated a much older tMRCA for R. exulans of 288 ka [95% HPD: 143-455 ka]. This was due to an older calibration point between Mus and Rattus and a wider confidence interval [10.4-14 Ma], for which they argued that it would better reflect evolutionary uncertainties within the Rattini group. As discussed above, the divergence time estimates are fully dependent on the fossil-calibrations, therefore the choice of an appropriate calibration point is crucial.

Deep fossil calibration is per se not the most appropriate calibration for studies on a population level, due to rate variation with a higher mutation rate on a shorter time scale (< 2 Myrs) (Ho et al., 2005). Alternative calibrations such as dated geological events might be applicable in geographic areas where these events only occur in low frequency (Ho et al., 2005; Herman & Searle, 2011), but would be inappropriate in the study area of the Malay region due to the near constant changes, particularly on the time scale of interest.


Therefore I argue that the younger and narrower fossil calibration of the Mus – Rattus split, which has good support in the community (Steppan et al., 2004; Fabre et al., 2013), is the most appropriate calibration for the presented study. R. exulans is a species with a considerably short generation time and dispersal throughout most of its range involves serial colonisation events, therefore a strictly clock-like behaviour of the mtDNA can be assumed. Nevertheless, the results need to be seen in this context, where the tree topology is of greatest interest.

The comparison of the fossil calibrated chronogram MR and the initially inferred chronogram FC15, using the mutation rate of 15.1% Myr-1 after Tollenaere et al. (2010), revealed a difference in tMRCA estimates of about 25 ka. Although this was not a significant difference given the credibility intervals, it prompted the question of why it arose. As stated above, it has been argued that mutation rates calculated on the species scale are slower and hence not applicable for analyses on a population level (Penny, 2005). However, because the tMRCA for R. exulans estimated from a multispecies study was 120 ka (Robins et al., 2008), close to the 114 ka obtained here, the intraspecific scale did not seem to have a strong impact on the estimate. This suggested a possible underestimation of the mutation rate. On revision of the source publication (Tollenaere et al., 2010) it became evident that the dating of the R. tanezumi and R. rattus split used for their estimate had since been adjusted to have occurred later, around 0.34 Ma instead of the previous estimate of 0.45 Ma. This update translates into an increase of the Rattus mutation rate from 15.1% to 20% Myr-1. This updated mutation rate was thereupon tested and subsequently included (Table 9, Figure 7 and 8).
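The step from 15.1% to 20% Myr-1 can be retraced with a short calculation. The sketch below is only illustrative: it assumes that the published rate was obtained as observed divergence divided by the age of the calibrating split, so that with the divergence held fixed the rate scales inversely with the split age. The function name rescale_rate is hypothetical and is not taken from Tollenaere et al. (2010).

```python
def rescale_rate(old_rate_pct_per_myr, old_split_ma, new_split_ma):
    """Rescale a mutation rate after the calibrating split date is revised.

    Assumption (not from the source publication): rate = divergence / time,
    with the observed divergence unchanged, so the rate is inversely
    proportional to the age of the split.
    """
    if old_split_ma <= 0 or new_split_ma <= 0:
        raise ValueError("split ages must be positive")
    return old_rate_pct_per_myr * old_split_ma / new_split_ma


# Values quoted in the text: 15.1% per Myr, split moved from 0.45 Ma to 0.34 Ma.
updated = rescale_rate(15.1, 0.45, 0.34)
print(f"{updated:.1f}% per Myr")  # prints 20.0% per Myr
```

Under this assumption the revised split date alone accounts for the updated rate of roughly 20% Myr-1 used for the FC20 chronogram.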

However, this estimated mutation rate for the D-loop of Rattus of 0.2 substitutions per site per Ma is still rather low in comparison with the proposed mutation rate for human mitochondrial D-loop variation of 0.32 substitutions per site per Ma (Sigurðardóttir et al., 2000). Because an older divergence time estimate would require a much slower mutation rate these younger dates are presumably more realistic. Therefore the high congruence of the time estimates derived by this dichotomous approach lends further support to the choice of the calibration date. Highly relevant for the estimation of intraspecific clade ages, the application of this revised fossil calibration date not only results in slightly younger estimates, but more importantly in narrower credibility intervals.

ORIGIN AND DISPERSAL OF RATTUS EXULANS CHAPTER 04

4.5.2 Deep spatial history and regional clades

On comparison of the MR and FC20 chronograms, their agreement on the deep divergence of two regional clades became strikingly clear; the first from the wider Philippine region (PHBS), the second from (mostly Remote) Oceania. Divergence of the ancestral gene pools of these two lineages (PHBS and RO) and the remainder of the regions started a long time ago and hence the ancestral populations must have been geographically separated. The long branches leading to the RO and PHBS clades can be explained by genetic drift subsequent to an initial successful settlement followed by isolation.

The time frame of the estimated tMRCA of all lineages coincided with rising sea levels above -30 m that created a fragmented landscape throughout the penultimate interglacial between 115 and 130 ka (Figure 40). This fragmentation followed a 33 kyr period of increased interconnectivity due to sea levels below -60 m, including a 5 kyr period of very high interconnectivity due to sea levels below -100 m (CHAPTER 01, Figure 7). During this time the ancestral group of all R. exulans possibly expanded its range, and maintained populations in at least two locations in addition to the original one during the following period of isolation. This would have led to substantial differentiation among these three lineages, which can be observed in my data.

Invasion onto the Philippines

The LGM, following the penultimate period of a highly fragmented landscape (max between 91 and 78 ka), opened another substantial window for dispersal, due to the lowest sea levels recorded within the last 10 Myr. This would have been particularly important for dispersal among the Philippine Islands, which generally present a highly differentiated fauna with high rates of endemism caused by prolonged isolation across the 120 m bathymetric line (Heaney, 1986). However, the observed chronology and the relationship of the Philippine haplotypes to the other R. exulans lineages are not consistent with a recent dispersal from the original population. The deep divergence of the clade would have required a dispersal from an already differentiated reservoir, which presumably must have existed from a previous range expansion as suggested above.

Despite the deep divergence from the other lineages, the tMRCA of the PHBS crown-clade was only estimated at 29.2 ka. Although there is some level of uncertainty in this time estimate, due to the different topologies for this clade between the two estimation procedures MR and FC20, the timing on both trees coincides with the lowest sea levels since the previous glacial period (Figure 40). However, the MR estimate of 29.2 ka further correlates with the lowest sea levels during the LGM, i.e. below -130 m, which caused the highest level of interconnectivity beyond the 120 m bathymetric line and the maximum reduction of water gaps between the Philippine islands (CHAPTER 03, Figure 30). The two regional subclades on Luzon and Negros further mirror the subsequent lack of gene flow, presumably due to founder events with small population sizes following new geographic isolation on both islands.

The reconstructed geographic history of the PHBS group (Figure 40, VI) places its origin within the Philippine region, more precisely on Negros, Borneo or Palawan. A reservoir of the species in this area would be concordant with the distribution of a presumed ancestral haplotype (Rx031) across Borneo, Palawan, Batam and Negros, while the putative daughter haplotype (Rx023) was only observed on Negros and Sulawesi. The span of a further haplotype (Rx022) across Borneo, Sulawesi and Luzon, with the daughter haplotypes only present on Luzon, also supports an ancestry as inferred.

The only fossils of R. exulans that so far have been found on Negros and Luzon coincide with pottery dated to approximately 4000 a (Lawrence Heaney, pers. comm.). Further, Reis and Garong (2001) and Piper et al. (2011) indicate that there is no palaeozoological evidence so far of R. exulans on Palawan; however, it is noteworthy that the excavation at a single cave site mostly recovered larger mammal bones and the smallest bones recovered were R. tiomanicus and R. tanezumi, both of which are considerably larger than R. exulans. This lack of fossil evidence makes Borneo a more likely candidate for the regional source of a reservoir for the PHBS lineage.

For species to disperse between Sundaland and the Philippine islands, it was crucial for a land connection to exist between Borneo and Palawan. The existence of such a land-bridge has been debated (Heaney, 1986; Reis & Garong, 2001). While the faunal land communities suggest a land connection during the LGM (Cranbrook, 2000, in Piper et al. 2011), the deep divergence of related species between both islands can better be explained by a long time of separation (Heaney, 1986). Unlike earlier studies (Sathiamurthy & Voris, 2006), a recent GIS-based analysis proposed that sea levels as low as -135 m were needed to expose a narrow land connection between Borneo and Palawan (Robles et al., 2014). Although Robles et al. (2014) further proposed that this occurred only at 440 and 630 ka, other data suggest equally low sea levels for several thousand years during the LGM (Raymo et al., 1997; Lambeck & Chappell, 2001; Tian et al., 2002). The latter is supported by the timing and distribution of R. exulans (Figure 40).

Whence originated the Oceanic clade

While the PHBS-clade is the only clade showing such a strong correlation with palaeoclimatic events, the other deeply diverged clade, RO, likewise must have had a reservoir, before further divergence separated the true RO-clade from the clade with specimens mainly occurring in the Bismarcks and Micronesia. Nevertheless, the subsequent history of this RO-clade cannot be correlated to any palaeoclimatic events. The Remote Oceanic region was barely impacted by climatic oscillations, apart from minor fluctuations in the land-areas of atolls and islands. However, these would have been isolated in the vast Pacific Ocean regardless of a few square meters more or less.

The Remote Oceanic clade (V) was inferred to have originated within Remote Oceania, with sufficient probabilities for an origin in Inter-Oceania, the Bismarcks or Micronesia (Figure 40; Table 10). The overall tree structure and the tMRCA estimates clearly indicated a deep divergence of the clade. Due to the vastness of the Pacific Ocean and the lack of swimming capabilities of the species, it is certain that R. exulans’ dispersal throughout Oceania was associated with human movement. Therefore an origin of the clade within Remote Oceania can be refuted, because this area has only been settled around 1000 years ago (Wilmshurst et al., 2011). The high probability of an RO-origin of the clade can possibly be classified as an artefact due to penalties in the ancestral area reconstruction for larger geographic distances. Actual landmass and oceanic barriers were not taken into account because the inference is purely based on coordinates and distances. In general, dispersal and vicariance approaches as well as island models fail to detect lineage diversification due to “rare long distance dispersal events establishing a small founder population across a wide barrier”, but instead assume dispersal events proportional to branch lengths (Ree & Sanmartín, 2009). However, such rare long-distance dispersal is the most likely scenario for the dispersal of R. exulans throughout Remote Oceania.

With regard to the origin of the RO-clade this leaves Inter-Oceania, the Bismarcks and Micronesia as inferred possible origins. Settlements within western Micronesia were established as early as 3500 years ago in the Marianas, 3000 years ago in Palau and 2000 years ago in Yap (Carson, 2013); settlers for the latter are thought to have come from Island Melanesia (Bismarcks), in contrast to the hypothesised Island Southeast Asian origin of the first settlers of Palau and the Marianas. The detection of only Near Oceanic R. exulans haplotypes on Yap supports this theory of a settlement from Island Melanesia. However, this means an origin of the entire clade in this region is also highly unlikely. In contrast, the larger islands of the Bismarcks were settled much earlier; for example, settlements for New Ireland and New Britain have been dated to roughly 35,000 years BP (Allen, 1996), with other data suggesting that the first colonists might have arrived as early as 39,590 years BP (Leavesley et al., 2002; Leavesley & Chappell, 2004). Regrettably, these are the islands in the Bismarcks that are not associated with the Remote Oceanic clade, but with the younger Near Oceanic clade. The scenario is similar for an origin in Inter-Oceania: the settlement history of Inter-Oceania is diverse; while the Solomons had been settled by 29 ka (Friedlaender et al., 2007), Vanuatu was only settled during the Lapita associated Austronesian expansion. However, the only R. exulans remains found within the Solomons were dated to around 3000 BP (Flannery et al., 1988). Furthermore, recent samples from Bougainville and the Reef Santa Cruz islands were identified as belonging to the Near Oceanic haplogroup; only the far offshore Takuu island, about 200 km northeast of Bougainville Island, revealed a Remote Oceanic haplotype.

This background makes it clear that none of the proposed regions is a suitable candidate to have harboured a population of R. exulans prior to its expansion into Remote Oceania. If this haplogroup had been present in any of these regions before the Lapita expansion despite no traces found so far, the subsequent dispersal of the Near Oceanic haplogroup would further have had to replace an established group.

Even the subsequently proposed ancestral areas, New Guinea Island (mPP = 14.7%) and the Moluccas (mPP = 4.6%), cannot shed light on a geographic ancestry. There is no discernible genetic relationship between the New Guinean samples and those from Remote Oceania and although the Moluccan samples have a closer affinity, based on such a low mPP it is not possible to make any inferences regarding an origin of the RO-clade.

The combination of this opposing evidence indicates that the inference of the ancestral region for RO has failed. This could be due to a negative impact of the distance-penalising aspect of the ancestral area reconstruction, or possibly due to the vastly different geographic scale of the Remote Oceanic region. Further, the inability to detect lineage diversification might also have contributed to these results. The human mediated dispersal into Remote Oceania will further have obscured the signal, possibly causing a series of founder events in quick succession, thereby skewing the timing between events. Unhindered population growth due to the lack of competition in a pristine environment might also have contributed. However, the results could also indicate that the true geographic origin was not sampled in this study.

The only certain inference about the ancestry of the Remote Oceanic R. exulans clade that can be made from the presented chronology is that this clade did not originate in the Philippines. The two regions are the most genetically divided and started to diverge long before any human movement between them. The hypothesised reservoir for the Remote Oceanic clade remains elusive.

SEA-SMA-NO, or the entangled regions

Southeast Asia

Among the remaining regions the geographic ancestry of the SEA clades with respect to that of the SMA cluster is of most interest concerning the overall geographic origin of R. exulans. The placement of the larger SEA-clade (XI: Table 9, Table 10) either within or outside the SMA cluster, owing to uncertainties between the tree topologies, makes the age estimates of this clade uninterpretable. However, the inference of ancestral regions was not incongruent between the chronograms. The probability for an ancestry of the clade within the Lesser Sunda Islands was, as expected, larger on the tree topology where the clade was within the SMA cluster (MR: mPP = 98.7%); however, even under the tree topology that placed the entire clade outside of the SMA cluster the probability for an ancestry within the Lesser Sunda Islands was still reasonable (FC20: mPP = 49.5%, Table 10). Furthermore, no other regions besides these two reached mPPs worth mentioning. The ancestry of the younger subclade (XVI) under node XI can be inferred to lie within SEA; the same was the case for the smaller and less differentiated SEA-clade (XIII) directly nested within the SMA cluster (Figure 40).

Age estimates for the SEA-SMA node (IV) were also not reliable, but the ancestral region for this node was unanimously (MR and FC20) inferred to be in the Lesser Sunda Islands or Java (compare to SMA in CHAPTER 03). Hence all evidence suggests that two different SEA lineages ultimately originating within the Lesser Sunda Islands entered onto the Southeast Asian mainland and established the observed populations. The mode of this dispersal further suggests a natural isolation process through the recent separation of the SEA mainland from Sundaland. A further possibility considered, that of separation due to a transgression at the Isthmus of Kra on the Malay Peninsula earlier during deglaciation between 14.6 and 14.3 ka, cannot be supported by the timing of the MR chronology.

Near Oceania

The divergence of the NO-clade (VI) was dated to 44.1 ka, although the node only had marginal support (PP = 0.83). The clade is also considerably diverged and the origin was estimated to be within the current distributional range, indicating a founder event. The tMRCA roughly coincides with the presence of the first modern humans and their settlement of the Sahul continent, and poses the question of whether the early settlers too might have facilitated the distribution of this species. The dispersal inferred from the ancestral reconstruction supports the evidence obtained in CHAPTER 03 that the population on New Guinea itself probably experienced a considerable range expansion. However, although this possibly suggests a primary settlement of the islands, there is not enough evidence to support this.

Java and the Lesser Sunda Islands

The Javanese-clade (XII), which was inferred to have developed from local stock, would have formed through natural dispersal along the island chain during times of increased connectivity. However, the clade merging specimens from Timor and southern Borneo (X; Kalimantan) indicates an unusually close relationship across a wide geographic range, which the species would have had to occupy as a continuous population during the LGM. The divergence date for this clade (X) coincided with the rising sea levels reaching the -60 m threshold at the end of the LGM, causing Sundaland to separate. However, despite the high connectivity within Sundaland during the LGM, several dispersal barriers remained. A direct dispersal between Timor and Borneo would have been complicated by the vast East Sunda River System (Figure 34). The support for this clade therefore suggests that this did not fully hinder dispersal. This makes an earlier expansion of the original range at less extreme sea levels than during the LGM more likely and supports the hypothesised earlier dispersal with the resulting reservoirs on Borneo or adjacent islands.

4.5.3 Origin of R. exulans

R. exulans is a relatively young species among the 66 in the genus Rattus (Robins et al., 2010; Rowe et al., 2011). The Tenasserim region in Burma, bordering on Thailand (Tate, 1935), and the Lesser Sunda island Flores (Schwarz & Schwarz, 1967) have been suggested as the origin for R. exulans, both based on morphological character analyses of intraspecific variation. An origin in mainland Southeast Asia north of the Isthmus of Kra was further proposed as a hypothesis to be tested by Musser and Newcomb (1983), who reasoned that the morphological variation found in the Shelf area is less than that known in native rats on the Sunda shelf.

However, across the entire geographic range of R. exulans clear morphological differentiation of skull shapes could only be found for the Pacific Islands populations, whereas only minor gradual differences were observed from SEA along the Malay Peninsula towards Island Southeast Asia, which was not differentiated from New Guinea (Motokawa et al., 2004). This agrees with the deep divergence observed in the molecular data for the Remote Oceanic population (Figure 39) and also with the little (but molecularly traceable) diversification between the SMA and NO region, but does not at all reflect the deep divergence of the Philippine clade and also does not account for the much higher molecular variability observed within Island Southeast Asia (CHAPTER 03, Table 02: SMA and PHBS). Further measurement comparisons presented via principal component analyses also did not reveal significant geographic clustering (Motokawa et al., 2004). Hence, morphometric variability by itself might not be a sufficient indicator for intraspecific structure.

A recent molecular survey by Thomson et al. (2014) supports the hypothesis of Schwarz and Schwarz (1967) that the Island of Flores appears to be the ‘homeland’ of R. exulans. Their conclusions were based on nested evidence that first inferred an origin within Island Southeast Asia from the reportedly highest variability that was observed in two mtDNA markers and the subsequent inference of an origin on Flores based on the reportedly highest variability among allozymes. These findings are relevant to this thesis and their survey produced valuable new sequences from previously unsampled locations in the Lesser Sunda Islands. I have included five of their sequences in my systematic approach to resolving the biogeography throughout the entire distributional range. However, although I concur with their broader conclusion of an origin of R. exulans within Island Southeast Asia, I am reluctant to present their findings as supporting evidence for that conclusion.

In their study to define the origin of R. exulans, Thomson et al. (2014) compared the genetic diversities of cytochrome B (cyt B) and mitochondrial D-loop sequences among samples from Island Southeast Asia (ISEA), mainland Southeast Asia (MSEA) and Remote Oceania (RO). They used three different mostly non-overlapping data sets: one cyt B and two D-loop sets of two different lengths each (381 & 360, 544 & 217, and 107 & 92 bp respectively). It is unclear why the data sets needed to be shortened so substantially because the number of samples remained identical. However, the longer (but possibly not consistent?) sets were used to calculate diversity indices and to test for selective neutrality while the shorter sets were used for network constructions.

The networks estimated from the short alignments (92 bp and 217 bp) by Thomson et al. (2014) were only able to distinguish between two regions, the Pacific and the entire Southeast Asia cluster. At this resolution, and outside the context of the adjacent regions, these are mostly uninformative to support any of their arguments. On the contrary, in their network the MSEA samples are widely distributed. As could be seen in the regional networks of CHAPTER 03, omitting samples of adjacent regions (usually because they are outside the area of interest) only produces simplified results that are possibly skewed and could lead to faulty interpretation. They are useful to depict the local distribution of haplotypes but should not be used to make inferences about the complete population structure.

Although Thomson et al. (2014) specifically intended to compare the regions of the species’ potential origins, SEA and ISEA, they based their study on a skewed ratio between ISEA and SEA samples, with more than double and triple the number of ISEA samples for the mitochondrial D-loop data but no samples from the proposed SEA-origin. It is common, and also evident in this thesis, that sampling gaps arise for various reasons; in the case of the thesis it was the fluid preservation of specimens of interest that caused two important gaps, of which the smaller one could be bridged not least thanks to the freshly published sequences by Thomson et al. (2014). However, the influence of sampling bias on Thomson et al.'s results could be observed in the comparison of their summary statistics among the regions. The disparity in sample sizes may have contributed to differences in haplotype diversity (h) (longer D-loop: SEA h = 0.77, N = 16; ISEA h = 0.82, N = 36; and shorter D-loop: SEA h = 0.67, N = 19; ISEA h = 0.87, N = 66). The values from the longer set are largely concordant with the haplotype diversity obtained in CHAPTER 03 (SEA: h = 0.77 and SMA: h = 0.87), although the definitions of the ISEA and SMA regions were not identical. The corresponding nucleotide diversities (π) did not seem to match, though; in fact π was much higher for SEA compared to ISEA in their study and particularly high in Flores. Regrettably, the values were not comparable to this study, because π was rather uncommonly given in per cent without further annotation. In an attempt to reproduce the indices from their published sequences, only ten sequences had the stated length of 544 bp while the remaining two had less than half the length (247 bp). The diversity indices for these ten Flores samples were calculated in DnaSP, resulting in a higher h of 0.95 and a π of 0.007. The latter was lower than the overall π for SMA in my study, but roughly comparable given that the samples were only from one island. In summary, however, the published results could not be reproduced.
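To make the two indices compared above concrete, the following minimal sketch computes haplotype diversity (h) and nucleotide diversity (π) for a set of aligned sequences. It is not the DnaSP implementation: it assumes gap-free sequences of equal length, treats every distinct sequence as one haplotype, and uses Nei's sample-size corrected form of h. The toy sequences and function names are hypothetical and serve only as an illustration.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Haplotype diversity: h = n/(n-1) * (1 - sum(p_i^2))."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(seqs)  # each distinct sequence is one haplotype
    sum_p2 = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_p2)


def nucleotide_diversity(seqs):
    """Nucleotide diversity: mean pairwise differences per site."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned and of equal, non-zero length")
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total_diffs / (n_pairs * length)


# Hypothetical toy alignment (not real R. exulans data).
toy = ["ACGT", "ACGT", "ACGA", "TCGA"]
h = haplotype_diversity(toy)
pi = nucleotide_diversity(toy)
print(f"h = {h:.3f}, pi = {pi:.4f}, pi in per cent = {100 * pi:.2f}")
```

Because h depends on haplotype frequencies within each sample, unequal sample sizes and unequal sequence lengths, as noted above, can make values from different data sets difficult to compare directly.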

However, these published diversity indices formed the basis of Thomson et al.'s (2014) strongest argument for an origin in ISEA and thus built the baseline argument to support their subsequent inference of an origin in Flores by comparing regional allozyme diversity. Although discussed differently, the highest allozyme diversity was found on Bali, followed by an equally high diversity on Flores and Sawu, a small island south of Flores (CHAPTER 01, Figure 06).

A further supporting argument to build their case for an ISEA over a MSEA origin was the “strong signal” for range expansion in MSEA. However, the signal for range expansion in MSEA existed only in the cyt B data set and was very weak (-2 < Tajima’s D < 0, p < 0.05, while Fu’s Fs was not significant). With the same criteria the test on D-loop attributed range expansion not only to Remote Oceania and Timor, but in their shorter data set also to the entire ISEA region (but not MSEA). Tajima’s D on its own is not commonly used as an indicator for range expansion, while other tests such as Fu’s Fs perform better (Ramos-Onsins & Rozas, 2002). Apart from Remote Oceania (in their 107 bp data set), Fu’s Fs was not observed to be significantly negative, indicating that there was only weak support for range expansion in MSEA. There was no signal for range expansion for the SEA group in my data set (CHAPTER 03, Table 02).
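For orientation, the statistic discussed above can be sketched as follows. This is a minimal implementation of Tajima's D under stated assumptions: aligned, gap-free sequences of equal length, with any site showing more than one state counted as a single segregating site. It does not compute significance levels or Fu's Fs, and it is not the procedure used by Thomson et al. (2014) or in this thesis; the toy alignments are hypothetical.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D from aligned sequences (needs n >= 4 and S > 0)."""
    n = len(seqs)
    if n < 4:
        raise ValueError("variance term is zero for n < 4")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned and of equal length")

    # S: number of segregating (polymorphic) sites.
    s = sum(1 for i in range(length) if len({q[i] for q in seqs}) > 1)
    if s == 0:
        raise ValueError("no segregating sites")

    # k_hat: mean number of pairwise differences.
    diffs = sum(
        sum(a != b for a, b in zip(x, y)) for x, y in combinations(seqs, 2)
    )
    k_hat = diffs / (n * (n - 1) / 2)

    # Constants of the standard Tajima's D formulation.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    return (k_hat - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical toy data: an excess of singletons gives a negative D,
# as expected after a recent expansion.
star_like = ["AAAA", "TAAA", "ATAA", "AATA", "AAAT"]
# Two balanced, divergent groups give a positive D.
balanced = ["AA", "AA", "TT", "TT"]
d_star = tajimas_d(star_like)
d_balanced = tajimas_d(balanced)
print(f"star-like: {d_star:.2f}, balanced: {d_balanced:.2f}")
```

A negative D alone is compatible with expansion but also with other processes such as purifying selection, which is one reason why it is usually read together with further tests.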

On a final note, the opportunistic wording that Flores had the highest haplotype diversity (their table 3) “…among the populations represented by more than 5 individuals” was, although true for the 92 bp data set, severely misleading, not least because the Philippines and Indonesia both had a higher diversity among fewer samples; the diversity on Halmahera was essentially the same as on Flores in this set.

In the analyses presented here, I provide evidence to support the claim by Thomson et al. (2014) of an origin within the Southern Malay Archipelago and the Lesser Sunda Islands and above all evidence against an origin in Southeast Asia, therefore an origin in Flores is reasonably likely. However, given the issues raised above I cannot conclude that the study by Thomson et al. (2014) provides equally strong support for this hypothesis.

In the chronology and the ancestral area reconstruction presented here, the nodes coalescing the SEA and SMA lineages (Figure 40, Table 10: III and IV) have a very low probability for an origin in SEA. In contrast, the probability for an origin of these clades in the Lesser Sunda Islands (mPP = 48.5%), which include Flores, was substantial, even more so if Java as the next alternative (mPP = 39.8%) was taken into account. The ratio between an SMA and an SEA origin was even more pronounced in favour of an origin in SMA under the slightly different topology of FC20 (Figure 39). This led to the conclusion that two different SEA lineages ultimately originating in the Lesser Sunda Islands entered onto the Southeast Asian mainland, leading to the genetically differentiated populations as observed. No support could be found for an origin of the species in the Southeast Asian mainland. Therefore, by the principle of exclusion, I reject the hypothesis of an origin of the species in Burma / Thailand as proposed by Tate (1935). The evidence gathered here is consistent with an origin in Flores in the Lesser Sunda Islands as proposed by Schwarz and Schwarz (1967), but is not conclusive in that respect.

Evidence contrasting with an origin of R. exulans in the Lesser Sunda Islands comes from dated faunal assemblages recovered through archaeological excavations. The earliest evidence of R. exulans and other commensal remains was found in association with Neolithic assemblages from Uai Bobo Cave on Timor, dated no earlier than 4100 BP (Glover, 1986, in Spriggs, 1989). Pottery, a general indicator for the Neolithic, was found earlier (5500 BP) at the Jerimalai shelter on Timor, although the dating has not been verified yet (O'Connor, 2007). However, other technology changed continuously from the Pleistocene through to the Holocene, which raised the important archaeological question of the degree to which stone artefacts can reflect changes in human cognitive abilities (O'Connor, 2007). A series of excavations from Liang Bua Cave on Flores produced similar evidence. Pottery indicating Neolithic human occupation was found at 170 cm depth (Morwood et al., 2009); these layers were dated to 4.18 ka cal. BP (Roberts et al., 2009). However, R. exulans remains only appeared around 500 years later in the stratigraphy and no evidence was found in previous layers dated up to 20,000 BP (Roberts et al., 2009; Locatelli, 2011). In these older layers a morphologically very similar species was present, R. hainaldi, a slightly larger rodent than R. exulans first described only 20 years ago (Kitchener et al., 1991).

Locatelli (2011) argued that R. exulans must have been introduced due to the sudden appearance in the stratigraphy (Figure 41, A) and suggested that this introduction contributed substantially to the decline of R. hainaldi. Although this could be a valid argument, the stratigraphy also showed that the abundance of all murine rodent species fluctuated considerably throughout the 20 kyr in the cave record. Both Papagomys spp., very large rodents, also appeared suddenly, although much earlier, and correlated with a subsequent decline of R. hainaldi (Figure 41, A). In absolute numbers R. hainaldi only declined below the numbers of R. exulans specimens within the cave deposits between spit 7 and 9 (Locatelli, 2011), which interpolated would date to 1100 to 1600 BP, roughly 2000 years (equal to > 5000 generations) after the emergence of R. exulans in the stratigraphy. It could equally be argued that this correlated with the appearance of further Rattus species or the increased evidence for Komodomys rintjanus (Figure 41), which appears to have profited from the presence of Neolithic humans, or served in their diet. This underlines how fragile argumentation based on point observations is. Archaeological surveys provide valuable evidence of the presence of species, but are generally not reliable indicators for the absence of species. Often surveys portray an inaccurate account of ancient wild fauna, and deductions based on a small number of osteological findings can be misleading (Ashby, 2004). The Lene Hara Cave on Timor produced evidence of human occupation around 35,000 BP, and yet we know that humans must have been present at least 20,000 years earlier on their migration to settle Australia on the Sahul continent (O'Connor et al., 2002). It is generally accepted that this reflects an incomplete archaeological picture rather than the absence of humans (O'Connor, 2007; Reepmeyer et al., 2011).

Figure 41: Stratigraphic distribution of murid rodents at the Liang Bua Cave on Flores, Indonesia. A: relative abundance of murine rodents per excavation spit. B: Absolute number of specimens of R. exulans and R. hainaldi within the spits. Radiocarbon dates refer to excavation spits, depth of spits 10 cm. Images and data collated from Morwood et al. (2009); Roberts et al. (2009); Locatelli (2011); Locatelli et al. (2012).

R. exulans is very adaptable, but its preferred habitat is that of low grass open spaces; it is not a forest species like R. hainaldi (Kitchener et al., 1991), therefore direct competition is rather unlikely. A detrimental influence through higher adaptability in a human modified environment can be argued; this, however, does not imply that the species must have been introduced. Locatelli’s (2011) argument that R. exulans could not have dispersed as far as Flores without human aid because there were no land bridges would be equally applicable to the distribution of the Rattus species hainaldi or any other murine rodent species for that matter. The diverse murine fauna on Flores and Timor is best explained by time shifted colonisation events in the history of the islands, and the presence of two small Rattus species co-inhabiting one island in different habitats could also support that dispersal was possible without human aid. Certainly an isolated island fauna promotes extreme size divergences, such as those observed among Papagomys, Komodomys, and Rattus species, in adaptation to different niches.

Although Musser (1981a) hypothesized R. exulans to be an introduced species on Flores, he also suggested it could nevertheless be native to the island. According to Thomson et al. (2014) there is unpublished evidence for the presence of R. exulans in the basal layers of Liang Bua Cave on Flores. However, until more precise published accounts are available this cannot be evaluated. The high diversity and cryptic haplotypes (CHAPTER 03, Table 02, Figure 23) lend support to the SMA region (Lesser Sunda Islands and Java) as the wider region of origin. The geo-dispersal events proposed here, early in the history of the species, might make an inference of the exact island of origin difficult. Nonetheless, the current molecular evidence strongly supports a centre of origin for R. exulans in the Lesser Sunda Islands and would be concordant with an origin on Flores. An origin of the species in mainland Southeast Asia could be refuted.

4.6 Considerations and future directions

Museum specimens made up the majority of samples within the Malayan Archipelago. Although collection managers are generally willing to supply material, they are understandably reluctant to grant permission for destructive sampling of a large number of animal tissues. For this study the Smithsonian Institution was very cooperative; however, more samples would have been necessary for some locations. The study particularly lacks specimens from Sumatra and mainland Malaysia. Some of these samples had been obtained, but were preserved in an unknown liquid, possibly containing mercury, which causes technical difficulties. As a result these samples did not yield any DNA. Other specimens were not available to me due to their voucher status, or due to known detrimental preservation. Further, some dried tissues also did not yield DNA, altogether resulting in some geographic areas not being sampled.

Nevertheless, my results are conclusive in many respects. To fully capture the intraspecific relationship among specimens from the Southeast Asian mainland and the Sunda Islands, however, it would be necessary to obtain samples from the closest geographic links between these two regions. These would be locations along the coast of Vietnam as well as from the Malaysian part of Borneo, to investigate any possible lateral links across the South China Sea during exposure of the seabed. To investigate the extent of connectivity along the large Sunda Island chain, samples from Sumatra would be necessary; combined with samples from the lower Malay Peninsula, these could reveal any possible continuity between the SEA and SMA groups, which could not be observed in this study.

To pinpoint an origin of the species within the Lesser Sunda Islands, a variety of samples from every island within this island chain would be necessary, plus possibly some from the adjacent Moluccas. In order to detect gene flow among these islands the application of nuclear markers would be beneficial. However, as mentioned before, the proposed geo-dispersal events could have masked any clear signals.

With regard to Remote Oceania and R. exulans, human-mediated dispersal has influenced the observable signal. The lack of a dispersal gradient for the species in Oceania was noted early (Tate, 1935), and its presence on every island within the Pacific (Atkinson, 1985) clearly indicates this aid. However, because the islands were previously uninhabited, populations could increase to each island’s carrying capacity, presumably without having to compete for resources and possibly without detrimental effects from diseases. It is unclear how much an altered reproductive success that late in the species’ history can account for the difficulties in locating an origin for this group. The expansion possibly fits well within the overall experience of periodic population increases in the Malay region; however, Remote Oceania certainly lacked the subsequent population crashes.

With regard to the analysis of ancestral areas, it has to be said that the geographic variability within the Malay Archipelago is difficult to account for. A model would be required that can take into account not only that barriers exist only temporarily, but also that vast land areas exist only periodically; it would need to be able to include ‘ghost’ areas. Further, the distinction between existence and non-existence of landmass and barrier would have to be modelled by a gradient. A reconstruction taking all these factors into account would be highly complex. In such a framework, due to its wide distribution in the region and the apparent natural dispersal, R. exulans could serve as a model organism for possible dispersal pathways of other small non-volant mammals.


CHAPTER 05 KNOWLEDGE OF OLD: HOW WELL DOES THE FORMER TAXONOMIC CLASSIFICATION MIRROR THE POPULATION STRUCTURE OBTAINED FROM A MITOCHONDRIAL DNA MARKER?

5.1 Abstract

A comparison of the morphology-derived and the molecular-derived topologies for R. exulans reveals a general agreement but also emphasises structural uncertainties. The recorded morphological subtleties were significant enough to distinguish sub-specific differences on a similar level to that observed in the molecular data. Nevertheless, a full superposition of both topologies was not possible, which was particularly obvious in the positioning of R. r. wichmanni. A potential reason for this discrepancy could be bias when different collectors record their subjective interpretation of phenotypic traits, rather than relying on exact morphometric data. However, the morphological topology has given support to some areas of our molecular analysis and raised further questions in other areas. Careful collection of morphological data associated with tissue samples would most definitely add to the resolution of the phylogeny derived from our molecular data.


5.2 Introduction

Classic Linnaean taxonomy is based on the differentiation between morphological characters and an estimation of the relationship between species thereon. R. exulans is eurytopic and phenotypically a very diverse species. Variations in skull and tooth size, fur colour and texture, and ratio of body to tail length have led to nearly 50 discrete descriptions. The more enduring species and subspecies within the former Rattus rattus concolor group (Tate, 1935), also known as the Rattus rattus exulans series (Schwarz & Schwarz, 1967), were consolidated to R. exulans over the course of the last century (Ellerman, 1941; Musser & Carleton, 1993).

Tate (1935) attempted to solve the “phylogeny of the exulans-type rats in relation to their geographic distribution” based on recorded morphological characters. He extensively described various Pacific specimens and in the process evaluated the several characteristics that served to distinguish between assumed racial groups. He dismissed differences due to sex as negligible and instead emphasized the importance of age when comparing specimens. The results of his calculations of ratios between length and width of the auditory bullae and ratios including meatus measurements ‘obscured’ observed differences, but allowed a ‘moderately satisfactory interpretation’. Structural differences in the molars were classified as not sufficient to aid in resolving the taxonomy of the group. Tate further compared the vitiensis, huegli and exulans type-specimens with Malay members of the concolor group, pointing out that preserved types are challenging to compare with fresher material due to changes in colour and skin size, which particularly affect the tail to body and head ratio.

Based on his combined observations Tate created an elaborate map (Figure 42) suggesting two dispersal pathways for the Pacific exulans type rats; the first route via the New Guinea mainland out to the Solomon Islands, and a second route through Micronesia into the remaining Oceanic Islands with an unspecific origin within the Philippine-Borneo-Java region, bypassing New Guinea altogether. Similarly he drew a map for the distribution of the concolor group with a postulated origin in Southeast Burma (Tenasserim)/Thailand (Tate, 1935, p.166). From there he suggested a forked dispersal, a) through the Sunda Island Chain and b) via Borneo along three passageways twice to the Philippines and once to Sulawesi.


TAXONOMIC COMPARISON OF MORPHOLOGICAL AND GENETIC DATA CHAPTER05

Figure 42: Dispersal pathways of R. exulans (concolor group) collated after Tate (1935), suggesting (A) an origin of the Pacific exulans-group from the Philippine-Borneo-Java region and (B) an origin of the species in Tenasserim (Burma)/Thailand.

In contrast to Tate’s view, Schwarz and Schwarz (1967, p.147) postulated Rattus rattus wichmanni from Flores in the Lesser Sunda Islands as the wild type and ancestor of the exulans-series (Figure 43). From there they described dispersal westwards, first into Borneo, Java, and Sumatra (R. r. ephippium) and then on into the Southeast Asian mainland (R. r. concolor). Like Tate, they identified the source for the Philippine Islands’ populations (R. r. negrinus, except Mindanao) as Borneo, with a second line through Sulawesi (R. r. raveni) into Mindanao (R. r. todayensis). Consistent with Tate, they described the eastward dispersal route from Flores to have been along the northern coast of New Guinea, into the New Hebrides and the Bismarck Archipelago, then via the Solomon Islands, out to New Caledonia, Fiji, Tonga and Samoa (all R. r. vitiensis). The origin of the oceanic R. r. exulans (Remote Oceania) was assumed to be Sulawesi via a pathway between Mindanao and Halmahera out to Palau and the rest of the Pacific. This Sulawesi origin is also consistent with an earlier observation by Miller and Ewing (1924), who postulated an origin for the Pacific exulans, R. hawaiiensis (for Miller and Ewing, ‘R.’ is synonymous to ‘R. r.’ and corresponds to the racial forms), in the Malay region and found R. raveni to be the most similar, as demonstrated by a comparison of various measurements (Table 11).

Figure 43: Geographic distribution of the R. r. exulans-series subspecies after Schwarz and Schwarz (1967).


A direct qualitative comparison of the population genetic data with these previous morphological classifications was not possible due to a lack of information on most specimens. However, adding morphological characters is generally beneficial to resolving a tree topology with better confidence than molecular data alone (Donoghue & Sanderson, 1992). The most persistent subspecies within the former R. r. concolor group / R. r. exulans-series (Tate, 1935; Schwarz & Schwarz, 1967), viz. concolor, ephippium, negrinus, todayensis, raveni, wichmanni, vitiensis, and exulans, were at least partially reflected in the results of the molecular study (CHAPTER 03 & 04). The goal of this chapter is to evaluate the information held in the basic morphological characters used to distinguish between the former subspecies classifications of R. exulans, in contrast to the resolution achieved with the mitochondrial DNA marker, and to determine whether they can add further confidence to our obtained topology. Does the morphological classification mirror the genetic tree?

Table 11: Mean measurements (mm) of morphological characters for R. hawaiiensis and R. raveni, the latter considered an ancestral species to exulans by Miller and Ewing (1924).

Character                                         R. hawaiiensis   R. raveni
Head and body                                     125.7            119.2
Tail                                              133.4            136.6
Hind foot                                         26.2             25.6
Max length of skull                               32.0             32.0
Condylobasal length                               29.7             29.5
Zygomatic breadth                                 15.1             14.6
Interorbital breadth                              5.0              4.9
Nasal                                             12.0             11.6
Diastema                                          8.5              8.4
Breadth of rostrum over roots of incisors         5.7              5.4
Breadth of braincase                              13.4             13.2
Depth of braincase                                9.2              9.2
Mandible                                          18.1             17.1
Maxillary toothrow                                5.2              5.2
Mandibular toothrow                               5.0              4.9

5.3 Methods

For the analysis of morphological data, available characteristics were extracted from the text descriptions of Schwarz and Schwarz (1967). Among others, their monograph collated specimens of the most persistent subspecies they attributed to the exulans-series as commensal forms of R. r. wichmanni. These subspecies are the aforementioned R. r. ephippium, R. r. concolor, R. r. negrinus, R. r. raveni, R. r. todayensis, R. r. vitiensis, and R. r. exulans. The dataset by Schwarz and Schwarz seemed to be particularly reliable, because they included only descriptions and measurements of representative adult specimens in their comparison. Additionally they reviewed several of the holotypes thereby presenting the most complete morphological description across these subspecies.


The recorded characters collated by Schwarz and Schwarz were coat texture, dorsal and ventral fur colour, colour of the flanks, hand, feet, phalanges, and claws, presence of demarcation lines, colour of the tail and presence and detail of hair thereon. The colours used for descriptions are given in Figure 44, illustrating the subtleties for differentiation; they are further demonstrated for two sample colour morphs in Figure 45. Skull size and structure were documented to various degrees of detail. The most consistent descriptions included the form of the parietal bone, presence and degree of frontoparietal crest and the formation of a postorbital angle (Figure 46). Unfortunately, Schwarz and Schwarz (1967) cited a description of R. r. wichmanni by Mertens (1936) that was not accurate, but seemingly related to his account of R. r. ephippium, therefore the holotype description from Jentink (1890) was used for the wild type instead. The remaining holotype descriptions were consulted to clarify inconsistencies between the text and table description in Schwarz and Schwarz.

Figure 44: Descriptive colours for differentiation of former classified subspecies after Schwarz and Schwarz (1967).

Figure 45: Two colour morph examples for R. exulans specimens from Loei Province, Thailand, depicting the distinctive underbelly colour and different tawny fur shades with suffused (top) and overlaid (bottom) black hair (source: www.CeroPath.org).


All characters were extracted from the recorded descriptions and those used in the analysis are collated in Table 12, attempting a near complete character analysis, as recommended for morphological systematics (Wiens, 2001).

Several subspecies were documented as exhibiting ecological variations, primarily correlating with habitat elevation. This phenotypic adaptation repeatedly presented itself as a longer coat and thicker underfur. However, these adaptations cannot be treated as full characteristics, hence only lowland descriptions were included and thickness of undercoat was discarded as a character. For R. r. negrinus two descriptions were given, one ‘somewhat similar’ to ephippium and the other to concolor, differing mainly in size. These were named negrinusA and negrinusB within the analysis.

For the character matrix only size independent characters were chosen, to avoid any possible bias. As such, skull size, which generally correlates with body size, was not included as a separate character, but was included in overall body size. Teeth and molars were not included as characters, because they are not homogenous within geographic regions and measurements overlap to a high degree between them. Differences between sexes were confirmed to be negligible (Schwarz and Schwarz 1967) and thus not taken into account.

A character matrix was created, translating the phenotypical descriptions to numerical values. To account for different levels of biological relevance, selective weighting was applied. Here, the more conservative features, like the distinctive skull shapes, were weighted higher than fur colouration and the remaining phenotypical traits. The brightness of fur shades is often dependent on the habitat, and colour recordings can be biased by individual perception and are further prone to interpretation; therefore a weighting was chosen in which the cumulative value for fur colour equalled about half the weight of all skull features. The least weight was attributed to features that were not independent, highly variable, or incompletely recorded. These included the colouration of body parts affected by the primary colour, the presence of a demarcation line, which is dependent on the combination of dorsal and ventral colour, and morphometric class features like the highly variable overall body size and the tail to head and body ratio. Incomplete records were present for claw colour and foot demarcation; claw colour was recorded as white for two subspecies, but not described at all for others. Most likely this was because it was not unusually striking and hence not noteworthy, but the record remains uncertain. Ordering of phenotypical traits was considered and tested. It had no impact on the overall structure of the tree, but increased support for the topologies observed without ordering. However, to avoid the introduction of false assumptions, ordering was dismissed.
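The weighting rule described above can be checked against the weight column of Table 12 with a few lines of arithmetic. The sketch below is illustrative only: the weights are copied from Table 12, but the assignment of characters II to V to "skull features" and VII to X to "fur colour" is an assumption made here for the check, not a grouping stated explicitly in the text.

```python
# Character weights as listed in the weight column of Table 12.
WEIGHTS = {
    "I": 1, "II": 4, "III": 4, "IV": 4, "V": 4, "VI": 1,
    "VII": 2, "VIII": 2, "IX": 2, "X": 2, "XI": 2, "XII": 1,
    "XIII": 1, "XIV": 2, "XV": 1, "XVI": 1, "XVII": 1,
}

# ASSUMPTION: grouping chosen for this illustration only.
SKULL = ["II", "III", "IV", "V"]          # parietal inflation, parietal raised, postorbital angle, crest
FUR_COLOUR = ["VII", "VIII", "IX", "X"]   # dorsal colour and pattern, ventral colour, underfur

skull_weight = sum(WEIGHTS[c] for c in SKULL)
fur_weight = sum(WEIGHTS[c] for c in FUR_COLOUR)
total_weight = sum(WEIGHTS.values())

print(f"skull: {skull_weight}, fur colour: {fur_weight}, "
      f"ratio: {fur_weight / skull_weight:.2f}, total: {total_weight}")
```

Under this grouping the cumulative fur colour weight is half of the cumulative skull weight, in line with the stated design of the weighting scheme.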

Figure 46: Skull morphology of the hypothesized wild type and ancestor of the concolor / exulans-series; Holotype R. r. wichmanni from Flores (Jentink 1890), and skull images from the R. r. ephippium specimen USNM 145562 from Borneo (Schwarz and Schwarz 1967), present in overlapping geographic regions.

The full matrix consisted of 17 discrete characters and the most parsimonious trees were inferred in PAUP* (Swofford, 2003). Taxa were added in random order and the trees arbitrarily rooted by the first species on the list (wichmanni). A bootstrap method with heuristic search was applied with 1000 replicates at a consensus level of 80%. For the set of most parsimonious trees, a split network was constructed in SPLITSTREE 4 (Huson et al., 2004; Huson & Bryant, 2006) to visualise congruence and uncertainty among the tree topologies. All obtained trees were unrooted.
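To illustrate what a weighted parsimony score measures, the following sketch counts the minimum number of state changes per unordered character on a fixed tree (Fitch's algorithm) and sums them using the character weights. It is not the heuristic tree search performed in PAUP*; it only scores given trees. The four characters (III, VI, VII, XVII) and their weights are taken from Table 12, whereas both tree topologies are hypothetical and were chosen here purely for demonstration.

```python
TAXA = ["wichmanni", "concolor", "ephippium", "negrinusA", "negrinusB",
        "raveni", "todayensis", "exulans", "vitiensis"]

# character: (weight, states in TAXA order); "?" marks a missing record (Table 12)
CHARACTERS = {
    "III":  (4, ["?", 1, 1, 1, 1, 1, 1, 2, 2]),   # parietal raised
    "VI":   (1, [0, 1, 1, 0, 0, 1, 0, 1, 0]),     # tail longer than head+body
    "VII":  (2, [0, 1, 1, 1, 2, 0, 0, 1, 0]),     # dorsal colour
    "XVII": (1, [1, 0, 0, 0, 0, 0, 0, 0, 1]),     # claws
}

def fitch(tree, states):
    """Return (possible ancestral states, number of changes) for one character."""
    if isinstance(tree, str):                      # leaf
        s = states[tree]
        return (None if s == "?" else {s}), 0     # None = missing, matches anything
    (ls, lc), (rs, rc) = fitch(tree[0], states), fitch(tree[1], states)
    if ls is None:
        return rs, lc + rc
    if rs is None:
        return ls, lc + rc
    common = ls & rs
    if common:
        return common, lc + rc                     # no change needed at this node
    return ls | rs, lc + rc + 1                    # one additional change

def weighted_length(tree):
    """Sum of weight x minimum number of changes over all characters."""
    return sum(w * fitch(tree, dict(zip(TAXA, s)))[1] for w, s in CHARACTERS.values())

# HYPOTHETICAL topologies (rooted arbitrarily; the score of unordered
# characters does not depend on the root position).
TREE_A = ((("wichmanni", "vitiensis"), "exulans"),
          ((("concolor", "negrinusB"), ("ephippium", "negrinusA")),
           ("raveni", "todayensis")))
TREE_B = ((("wichmanni", "raveni"), "exulans"),
          ((("concolor", "negrinusB"), ("ephippium", "negrinusA")),
           ("vitiensis", "todayensis")))

print(weighted_length(TREE_A), weighted_length(TREE_B))  # prints "15 20"
```

For this subset of characters the tree pairing wichmanni with vitiensis requires fewer weighted changes than the alternative, which is the kind of comparison a parsimony search repeats over many candidate topologies.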


Table 12: Morphological characters for selected subspecies within the R. rattus exulans series, collated from Schwarz and Schwarz (1967) and Jentink (1890). Encoding for the matrix for characters I to XVII is shown before the description and character weighting is stated in the weight column.

After Schwarz and char weight wichmanni concolor ephippium negrinusA negrinusB raveni todayensis exulans vitiensis Schwarz 1967

Size I 1 1 largest 2 smaller 3 norm 1 larger 1 larger 2 smallest 1 larger 1 larger 3 norm
mm up to 155 105-120 120-135 up to 140 up to 140 120-125 120-135 145-150 125-135
parietal 2 strongly, more 2 strongly, more Skull II 4 0 moderate 1 strongly 1 strongly 1 strongly 1 strongly 1 greatly 0 moderate inflation forward forward
parietal raised III 4 ? 1 strongly 1 strongly 1 strongly 1 strongly 1 strongly 1 strongly 2 yes 2 yes
postorbital 0 slightly 1 distinct, well- 1 distinct, well- 1 distinct, 1 distinct, well- 1 distinct, well- 1 distinct, well- IV 4 2 distinct 2 distinct angle marked marked marked well-marked marked marked marked
frontoparietal 1 strong, 1 strong, V 4 1 strong 0 rel. weak 0 rel. weak 0 rel. weak 0 rel. weak 2 weaker 2 weaker crest thickened thin
longer than Tail VI 1 0 no 1 yes 1 yes 0 no 0 no 1 yes 0 no 1 yes 0 no head+body
Colour dorsal VII 2 0 cinnamon 1 tawny 1 tawny 1 tawny 2 dark tawny 0 cinnamon 0 cinnamon 1 tawny 0 cinnamon
1 suffused/ 1 suffused/ 1 suffused/ 1 suffused/ VIII 2 2 overlaid 0 suffused 0 suffused 2 superposed 0 suffused overlaid overlaid overlaid overlaid
0 pure 5 buff 3 lemon buff/ 6 slaty with buff Colour ventral IX 2 4 buffy 4 buffy 4 buffy 4 buffy 2 whitish white ochraceous whitish washed ochraceous
1 white, underfur X 2 0 white 1 slaty 1 slaty 1 slaty 1 slaty 2 tawny 2 tawny 1 slaty with slaty bases
demarcation 1 indistinct/ 1 indistinct/ XI 2 3 distinct 2 indistinct 2 indistinct 2 indistinct 2 indistinct 2 indistinct 3 distinct line absent absent
0 pale- 0 pale flanks XII 1 1 pale tawny 1 pale tawny 1 pale tawny 1 pale tawny 0 pale cinnamon 0 pale cinnamon 2 ochraceous cinnamon cinnamon
1 black 1 black XIII 1 0 1 black suffused 1 black suffused 2 less black 2 less black ? ? suffused suffused
feet XIV 2 0 white 1 sepia 2 pale drab 2 pale drab 1 sepia 2 pale drab 2 pale drab 3 buffy 0 white
1 sepia 1 sepia 1 dark median 1 dark median 1 longitudinal feet marked XV 1 0 0 0 metacarpal metacarpal 0 mark mark marks marks marks
phalanges XVI 1 0 white 0 white 1 1 0 white 1 1 ? 0 white
claws XVII 1 1 white 0 0 0 0 0 0 0 1 white

5.4 Results

Four equally parsimonious trees were found in PAUP* based on 16 parsimony-informative characters. Differential weighting, particularly increasing the impact of skull characters, resulted in good support for four nodes (Figure 47). R. r. wichmanni and R. r. vitiensis, as well as R. r. exulans, consistently separated from the remaining series with good bootstrap support. The two forms of R. r. negrinus did not cluster together; negrinusB reliably grouped with R. r. concolor and negrinusA ostensibly with R. r. ephippium. The latter two were not fixed in their positions and caused uncertainty in the final topology. R. r. raveni reliably grouped with R. r. todayensis.

The only structural difference with no weighting applied (Figure 48) was the loss of support for the exulans split, thus ephippium, negrinusA and exulans fell ‘basal’ to the concolor & negrinusB and raveni & todayensis pairs. Additionally, overall support was slightly lower. This showed the strong signal that all combined characters have and gave credibility to the character matrix and the ensuing trees.

Figure 47: Unrooted majority rule consensus tree (80%), with weighting emphasis on the skull features and the main colour traits.


POPULATION GENETICS CHAPTER05

Figure 48: Unrooted majority rule consensus tree (80%) with all characters equally weighted.

Figure 49: Split network for the four unrooted most parsimonious trees from the morphological analysis of the R. r. exulans series subspecies, colour coded to match the haplogroups in Figure 8.

Within the split network (Figure 49), the strong agreement among the four most parsimonious trees underlined the overall topology among the nine morphological races. The majority of uncertainty was observed in the exact positioning within the paired leaves of ephippium & negrinusA, including changes in branch lengths. In one out of four trees the relative position of the ephippium & negrinusA pair was exchanged with that of the concolor & negrinusB pair. Therefore, in all but one tree the ephippium pair fell basal to the concolor pair in relation to the remaining groups.
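The agreement and conflict that a split network displays can be made concrete by counting how often each bipartition (split) of the taxa occurs across a set of trees. The sketch below is a minimal illustration in plain Python and does not reproduce the SPLITSTREE 4 algorithm; the six-taxon trees are hypothetical and do not correspond to the four most parsimonious trees of this analysis.

```python
from collections import Counter

def clades(tree, out):
    """Collect the leaf set below every internal node of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    members = frozenset().union(*(clades(child, out) for child in tree))
    out.add(members)
    return members

def splits(tree):
    """Non-trivial splits, each stored as the side NOT containing a reference taxon."""
    found = set()
    taxa = clades(tree, found)
    ref = min(taxa)                                # fixed reference taxon
    result = set()
    for clade in found:
        side = taxa - clade if ref in clade else clade
        if 1 < len(side) < len(taxa) - 1:          # both sides need at least two taxa
            result.add(side)
    return result

def compatible(a, b, taxa):
    """Two splits fit on one tree if at least one of the four intersections is empty."""
    return any(not part for part in (a & b, a - b, b - a, taxa - (a | b)))

# HYPOTHETICAL trees for illustration only.
T1 = (("concolor", "negrinusB"), ("ephippium", "negrinusA"), ("raveni", "todayensis"))
T2 = ((("concolor", "negrinusB"), "ephippium"), "negrinusA", ("raveni", "todayensis"))
trees = [T1, T1, T1, T2]

counts = Counter(s for t in trees for s in splits(t))
for split, n in counts.most_common():
    print(f"{n}/{len(trees)}  {sorted(split)}")
```

Splits found in every tree appear as simple edges in a split network, whereas pairs of incompatible splits, such as the one separating ephippium and negrinusA in the first tree versus the one grouping negrinusA with raveni and todayensis in the second, are drawn as boxes.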

In order to facilitate a comparison of this topology with the mtDNA-derived topology, the BEAST-chronogram from CHAPTER 04 was modified to emphasize the five haplogroups (Figure 9, A). A contrasting juxtaposition with the consensus tree of the most parsimonious trees from the morphological analysis depicts the agreements and differences between the two topologies (Figure 50). Each haplogroup can be matched by colour with its subspecies counterpart from the corresponding region. The topologies fully agreed in the paired positioning of concolor and ephippium, and the placement of the Philippine subspecies raveni and todayensis in relation to the oceanic exulans differed only in their chronology. However, the positions of vitiensis and wichmanni are cryptic and cannot be reflected in the mtDNA-tree. Further, negrinusA and B could not be included in any group because of their discordant positioning.

Figure 50: Contrasting juxtaposition of (A) the BEAST-derived chronogram, based on mtDNA (for details see CHAPTER 04) and (B) the consensus tree from the most parsimonious trees in PAUP* on the right. Colour coding applied to identify the mtDNA-haplogroups and the corresponding morphological classifications. R. r. negrinus cannot be assigned unambiguously, therefore only R. r. raveni and R. r. todayensis were selected to reflect the PHBS-haplogroup.


5.5 Discussion

In the present chapter the morphological characters used to distinguish between various subspecies of R. exulans were used to infer a topology in order to evaluate their potential contribution to resolving the true phylogeny of the species. R. exulans can thrive in and adapt to almost any available habitat over a wide geographic range, not least due to its opportunistic food preferences. Conspicuous phenotypic and morphometric variations have therefore led to multiple descriptions of the species in the past. The last nine subspecies have been revised and conflated to R. exulans as the nominal species only within the last fifty years. In the morphological analysis of these nine phenotypic races, they clustered into four to five clades. This is consistent with the results from the molecular analyses in CHAPTER 03 & 04, where four to five haplogroups and clades were distinguished, and, with the exception of one group, produced a largely satisfactory congruence between the morphological and mitochondrial trees. In particular, the geographic distribution of representatives of the morphological and mitochondrial clades matched considerably well. Overall, the results encourage the inclusion of this knowledge of old into further interpretations and the evaluation of dispersal hypotheses in the context of human migration pathways. However, the use of old data comes with caveats and its reliability depends on many factors, of which only a few can be controlled. In the following I discuss the congruences and differences in more detail.

The two forms of R. r. negrinus, as described by Schwarz and Schwarz (1967), were assigned to two clades and had no singular genetic counterpart. Schwarz and Schwarz’s description of negrinus was predominantly based on comparisons to both concolor and ephippium, hence their respective groupings with these races were not greatly surprising. A geographic overlap of the concolor and ephippium types would be an explanation for the occurrence of the two morphs, but the genetic data do not support this possibility. Nevertheless, it was peculiar that this one subspecies was described by reflecting two others. In the first description the species was classified as Mus ephippium negrinus (Thomas, 1898), i.e. a subspecies of ephippium, emphasising the strong similarity with ephippium, apart from size and a more greyish colour. However, Thomas (1898) also suggested that ephippium and concolor might be graded into one species. Further, Tate (1935) had stated that the number of described (sub)species in the Philippines was far higher than elsewhere, particularly compared to the mainland and Sumatra and Java, where forms of concolor and ephippium dominated. He attributed this increased variety to the lack of the competing species of the cremoriventer and fulvescens groups, which were absent in the Philippine Islands, but essentially this apparent richness was probably owed to many local phenotypic adaptations, particularly of mountain forms. Therefore the cause of Schwarz and Schwarz’s dichotomous description for the classification of one subtype could have been the lack of a unifying description of negrinus as a result of condensing the abundance of descriptions, or the lack of any unifying features of R. exulans types across the Philippine Islands to start with.

Overall, the strong support for the R. r. concolor & R. r. negrinusB pair, in combination with the uncertainty of the positioning of negrinusA towards ephippium and the positioning of both towards the concolor pair, indicates that the description(s) for the northern Philippine island race(s) were insufficient. It was assumed that those islands had received their rats from Borneo because, as stated above, initially negrinus was only slightly differentiated from ephippium (only by size and a little more greyish colour). If at all, this was indicated by the paired positioning of ephippium and negrinusA. However, the defining features for the dichotomous description essentially consisted of external phenotypical traits, with little to no morphometric data to underpin these more interpretation-prone traits. On the other hand it is possible that the well supported Philippine subclades (see CHAPTER 04) might have shown slight phenotypical adaptations that weakened the distinguishing features for the group as a whole. However, without corresponding molecular and morphological data this cannot be evaluated. Based on the data available, this leads me to question the value of Schwarz and Schwarz’s description of R. r. negrinus as one sub-species; if morphological characters are to be considered to supplement further molecular analysis, these issues need to be resolved.

Mindanao and Northern Sulawesi were recorded to have their own subspecies, R. r. todayensis and R. r. raveni, but the morphological data consistently clustered these local morphotypes together. Neither group could be identified individually in our molecular analysis, but together their placement suggests a correspondence to a mitochondrial haplogroup that is associated with all of the Philippine Islands. This haplogroup was the most structured one and encompassed specimens from the Philippine islands Palawan, Luzon, Negros, and Mindanao, as well as the Northern regions of Borneo and Sulawesi (PHBS-haplogroup, with small subclades unique to Luzon and Negros). Although the inclusion of negrinus was not reflected in the morphological analysis, presumably due to the lack of a sufficient description and thus uncertain positioning for the subspecies, a gradual grouping gathers support through the overall comparison by Schwarz and Schwarz (1967, p. 149), where they closely connected negrinus to todayensis due to shorter tails on both subtypes compared to raveni.

Although the two sub-species did not find support in the molecular analysis, it is noteworthy that a distinct North-South division was observed in Sulawesi that separated the Southern Malay Archipelago haplotypes from the PHBS-haplogroup. The geographic home-regions of raveni and todayensis, Sulawesi and Mindanao, formed a transition zone between the PHBS and the Near Oceania (NO) haplogroups. However, although this divide could be the cause for two descriptions, it would not explain why these sub-species would then cluster together. Alternatively, the differentiation between the two sub-species in this area could be associated with the hypothesis discussed above that the Philippines simply lacked a true phenotypical race. This would be supported by the age of the clade and the high molecular differentiation within this group. Combined with the strong molecular distinction from the rest of the species, the sub-species descriptions might have been at a different organisational level. Overall the raveni-todayensis clade (Figure 50) can be interpreted as at least partially mirroring the molecular PHBS-group.

In the East, R. r. exulans was clearly represented by the Remote Oceanic haplogroup. Two slightly different exulans phenotypes were described by Tate (1935), one as being particularly small in Palau, the Caroline and Marshall Islands and another much larger type present further East, with both forms present in Hawai’i. This divergence in morphotypes was possibly reflected by the occurrence of a Micronesian subclade to the larger RO-clade on the mtDNA-tree, based on samples from the Caroline Islands.

For the wider New Guinea area R. r. vitiensis, a considerably larger variant than ephippium, was known; this race was clearly represented by the NO-haplogroup. Morphologically, vitiensis appeared most closely related to R. r. wichmanni. The features uniting the two races were the rather moderate parietal inflation as well as the white colour of the feet. A tentative omission of foot colour as a character did not change the positioning and the skull description was free of doubt. Two possibilities can account for these observations: either the Flores specimen from the Liang Bua cave was not an example of R. r. wichmanni, or the ties between the Lesser Sunda Islands and New Guinea Island are stronger than anticipated.

Although this positioning of R. r. wichmanni was initially surprising, it is not inconsistent with the molecular data. The Southern Malay Archipelago (SMA) and NO haplogroups, which correspond to the geographic distribution of the two sub-species, were not separable without any doubt by molecular analyses. The SMA group further indicated many cryptic haplotypes. The proposed wild type of exulans (R. r. wichmanni) was assumed to be represented by the samples from Flores and the corresponding haplotype, found in the modern sample as well as in a semi-ancient sample from the Liang Bua Cave (circa 250 ‑ 500 yrs BP, Locatelli, 2011), is only two point mutations different from the central NO-haplotype. Therefore the observed morphological cluster of R. r. vitiensis and R. r. wichmanni is to some degree supported by the molecular data.

However, the Flores / wichmanni haplotype is otherwise associated with the SMA-group and forms the hub within the Median Joining-network (see CHAPTER 03). The SMA-group corresponds to the geographical distribution of the ephippium sub-species, meaning that the distribution of wichmanni is enclosed in the distribution of ephippium (Figure 43). It was therefore expected that ephippium and wichmanni would cluster closer together than wichmanni and vitiensis. The close association between wichmanni and vitiensis could reflect a close (and young) relationship due to the occurrence of a recent range expansion into New Guinea, as inferred from the molecular data, and the geographic proximity.

The identification of the Flores specimen as wichmanni is supported by the description of the first Adele Island specimens from 1891, which specifically mentions the lack of grey bases to the white belly hair (according to Tate, 1951, p. 323). This supports the affiliation with wichmanni, given that the mtDNA haplotype is the same between the Liang Bua cave sample and all Adele Island samples. Thus current evidence suggests stronger ties between the Lesser Sunda Islands and the New Guinean form. To fully evaluate this would require absolute certainty about the morphological identity of the specimens collected in the field for both a wichmanni (Flores) and a vitiensis representative.

To the West of the Lesser Sunda Islands, R. r. ephippium additionally occupied the remainder of present-day Indonesia. Within this region Java was denoted to lie within the distribution of two subspecies, ephippium and concolor. This overlap could not be found in our molecular data. However, the occurrence of a concolor sub-form on Java was based on Kopstein (1931), who observed that the island seemed to have another particular form of R. exulans, which he described as Rattus concolor otteni, subsequently included in R. concolor concolor. The form only occurred in harbour areas along the Northern coast of Java and was not found inland, where ephippium was frequently observed. The morphological description from Kopstein (1931) was similar to that of concolor, and as a subspecies of R. concolor (depending on the literature-source the R. concolor-group is equivalent to the R. r. exulans-series) he termed the form “Hafen-concolor” (Harbour-concolor, as opposed to “Berg concolor”, Mountain-concolor, which was his colloquial term for R. c. ephippium). Despite thorough investigations by the author and his associates, this subspecies could not be found within a continuous area but had a patchy distribution in apparently random harbour cities, which led to their assumption that it might have been imported via the harbours on several occasions. Based on the considerably smaller phenotypic appearance and its uniform colour it was subsequently attributed to the true concolor subspecies. In comparison with my molecular data, this otteni-lineage could be represented by the Javanese subclade, which consists entirely of eastern coastal samples from Java. However, without further analyses this remains hypothetical.

The true R. r. concolor (concolor = uniform colour) covered the range from Northern Java (but see discussion above) across most of mainland Southeast Asia. This race was quickly distinguishable by its uniform colour, the recorded increase in tail length in Burma and Thailand, and a decrease in body size eastwards. Further, it was described as living entirely commensally. In the morphological analysis concolor paired with the B subtype of negrinus, suggesting a connection between the Northern Philippines and the Southeast Asian mainland (SEA). However, as discussed above, the description of the negrinus race is insufficient and cannot be accepted without further information.

Concolor by itself is well distinguished from all other groups; only its position towards ephippium is slightly ambiguous. Under the assumption that the SMA haplogroup is represented by ephippium, this can be observed in the molecular data as well. There the SEA samples are attributed to two distinct groups, one unique to SEA and well separated from other haplogroups, and a second cluster, also near unique to SEA but more closely related to Palawan samples and select individuals from the Lesser Sunda Islands. However, as a whole SEA is polyphyletic and cannot be separated from the SMA group based on geography. This bipartite molecular representation of the SEA-haplogroups is probably due to different source populations and, although the negrinus subtypes cannot be validated at this point, it would be interesting to follow up on the possibility of an association of the Borneo-Palawan clade with representatives of negrinus B.

Overall the morphological and molecular topologies seem to be more congruent than divergent. Further, they complement each other in areas of uncertainty, which leads to a more holistic interpretation of the available data and helps to phrase the right questions. To solve the conundrum of why ephippium and wichmanni do not cluster together, a better sampling resolution across the SMA region might contribute.

In reference to the phenotypical descriptions used in this chapter, I would like to close with an observation from Tomich and Kami (1966) who concluded for the coat colour of the sister species R. rattus that particularly the dingy white and lemon washed bellies are accounted for by staining and soiling and several other colour morphs from field caught rats disappear after a short while in captivity.

5.6 Future direction

To settle the discrepancies between the morphological descriptions and genetic data a more thorough sampling would be necessary, acquiring full morphometric and unbiased phenotypic data throughout the separate regions of the Malay Archipelago, particularly covering the bridging areas. Individual records for sampled specimens with classification into subgroups are essential in addition to the molecular data to confirm or reject a link between the recorded phenotypes and the observed haplogroups. For the specimens on the Southeast Asian mainland this could be achieved by launching a collaboration with CERoPath (Community Ecology of Rodents and their Pathogens in Southeast Asia), who have extensively sampled the mainland and measured all of their sampled specimens. Molecular data for a portion of their specimens is already included in this thesis.

As can be seen by the lack of divergence within a recent morphometric study (Motokawa et al., 2004), the combination of molecular and morphological studies might aid in discerning more informative morphological characters for a taxonomy of a species from less informative ones.

CHAPTER 06 STOWAWAY OR ETHNOTRAMP, R. EXULANS HITCHING A RIDE

6.1 Abstract

In this chapter I summarise the findings from the previous chapters to develop a proposal for the history of R. exulans. A clearer picture emerges regarding what parts of the population structure can be explained by natural dispersal and geo-dispersal events and where the distribution of the species is most likely influenced by human translocation.

A comparison with the three other commensal or domesticated animal species relevant in the context of the Lapita cultural complex supports a unified theory of the region where the commensal fauna joined with the sea-faring settlers. The Remote Oceanic clade of R. exulans clearly did not originate in the Philippines or any other Northern Malay Island. The origin and subsequent distribution of all four species supports the inference of a pathway that involves the oceanic islands of Wallacea and if the migrants were not locals, it strongly suggests a provenance further to the West.

The most prominent theory that has accompanied the Lapita cultural complex, the out of Taiwan theory associated with the Express train model, does not find any support from the faunal contributions to this complex. Such lack of support from other components has stimulated discussion among disciplines, not just as to how the expansion of the Austronesian language family came to be, but also as to why the language was so dominant and how all archaeological, linguistic and genetic evidence can be fitted into one framework.

6.2 Introduction

People have manipulated their environment to better suit their needs for many thousands of years before Neolithic farming cultures changed the face of the earth. In Oceania animal domestication, or rather association that involved deliberate translocation of species of interest, has the earliest records worldwide; the Common Cuscus, Phalanger orientalis, was introduced into New Ireland of the Bismarck Archipelago as early as 19,000 BP, although there is some uncertainty in the archaeological dating (Grayson, 2001; Heinsohn, 2003). The same species has an assumed natural range distribution within the Moluccas, but was introduced to Timor in the Lesser Sunda Islands by 4500 BP (Groves, 1984; Heinsohn, 2003; Heinsohn, 2010). However, of particular interest in regard to human movement into the Pacific are the species translocated from the Malay region into the Near Oceanic Islands (Bismarcks), the species mainly associated with the Lapita cultural complex: dog, chicken, pig and rat. The dog, Canis lupus familiaris, appears to be a later introduction (2000 BP) than the other three and therefore probably does not reflect the initial human migration, but it might be associated with the hypothesised second wave of migrants (Matisoo-Smith, 2007; Addison & Matisoo-Smith, 2010). Although chicken, Gallus gallus, is present in older sites (> 3000 BP) associated with Lapita, it is only sporadically found within prehistoric faunal assemblages in Near Oceania (2%), about tenfold as often in Remote Oceania (22%) and less often in Micronesia (10%) (Storey, 2008; Storey et al., 2008). With the exception of New Caledonia, remains of pig, Sus scrofa, are found in most early Lapita associated assemblages in Near and Remote Oceania, but become far more abundant in later layers (Matisoo-Smith, 2007). This makes it a potential proxy for human migration and ultimately the origin of the Lapita settlers. A feral subspecies of the wild boar is common throughout the Malay Peninsula, Sumatra, Java, Bali and the smaller islands to the east, to which it is supposed to have been transported from its origin on Java or Sumatra (Groves, 1984). Additionally, the previously inferred presence of pre-Lapita pig in the New Guinea highlands (10,000 BP) and the New Guinea coastline (6000 BP) has been refuted based on direct dating of pig remains from various sites on the New Guinea mainland; accordingly no pigs could be dated to before 3000 BP (O'Connor et al., 2011b).

However, although the pig is the best candidate among these three species, like the dog and the chicken it too can interbreed with more recent introductions through modern trade. This is why R. exulans stands out in its potential to assist in inference of the origin of the Lapita

HUMAN MIGRATION PATHWAYS CHAPTER06

ancestors. The species is not closely related to the commonly dispersed European commensal rats R. rattus and R. norvegicus and thus cannot interbreed with these later introductions. As mentioned earlier, the spread of R. exulans was deliberately facilitated throughout Remote Oceania by Pacific peoples (Tate, 1935, p. 147), and there its initial distribution closely matches that of the Lapita Cultural Complex (Matisoo-Smith & Robins, 2004). Apart from the manufacture of an unprecedented style of red-slipped dentate-stamped pottery, among other traits, part of the definition of this cultural complex was a maritime lifestyle with the capability of sophisticated navigation (Bellwood, 1978c). During their expansion into the Pacific these Lapita peoples and their descendants carried R. exulans as a food item (Waite, 1897), presumably as a protein source for the long journeys and on the impoverished islands. The rat was introduced to every island settled by Lapita and Polynesian peoples, but remains absent from uninhabited Pacific atolls (Atkinson, 1985; Atkinson & Atkinson, 2000). R. exulans is incapable of swimming directionally or of surviving for a prolonged time in the water (Jackson & Strecker, 1962); hence its human-mediated range expansion (Tate, 1935, p. 147; Spennemann, 1997) has made it an ideal bioproxy for tracing human migration in the Remote Pacific region (Matisoo-Smith et al., 1998; Matisoo-Smith et al., 1999; Matisoo-Smith & Robins, 2004).

Where these rats first came aboard has not yet been determined. Molecular studies found a link between the Remote Oceanic lineage of R. exulans and specimens from Halmahera in the Moluccas, but with a large gap spanning Near Oceania (Matisoo-Smith & Robins, 2004). Further, no links could yet be found to the proposed Taiwanese homeland in support of the Austronesian expansion. However, this Taiwanese origin of the ancestors of the Lapita peoples remains widely debated among the disciplines, as recently reviewed by Bellwood et al. (2011). Over the last few years, substantial progress has been made in tracing the origins of the described commensal species that were transported into the Pacific. This thesis contributes the population genetic history of R. exulans and with it a further puzzle piece towards a better resolution of this elusive ancestry.

6.3 R. exulans, a proposal for space and time

R. exulans shares a similar history with many other species in Wallacea. Pre-exulans rats dispersed from Southeast Asia (Fabre et al., 2013) during a time with prolonged sea-level low stands, possibly between 200 and 130 ka, and became isolated during the subsequent phase of high sea-levels. This prolonged isolation from their ancestors caused sufficient divergence through the founder event and genetic drift before merged landmasses would allow gene flow. Hence they emerged as a new species. In Figure 51 I summarise the results from previous chapters to propose a history of the population since speciation.

Today’s widespread population of R. exulans is genetically highly structured with three clearly separated geographic clades (CHAPTER 03, Figures 23 and 29). These clades are proposed remnants of a geo-dispersal event during a glacial period of the late Pleistocene, followed by separation of two population fractions within the extended range through geographic isolation caused by the sea level high stands during the subsequent interglacial (Figure 51, steps 1 - 3). These groups diverged deeply through genetic drift during this time of separation. When the next opportunity for dispersal arose through further paleoclimatic events the dispersal did not emanate from one ancestral population, but from all three areas where the species had successfully retained a viable population (Figure 51, step 4).

The population from the centre of origin radiated in parallel to the newly emerging land and randomly across narrow water gaps (Figure 51, yellow lineage). On Borneo this presumably led to the current presence of two lineages, a northern one from the first dispersal event and a southern one from the most recent event. However, during the second range expansion throughout the exposed Java Strait the arriving deme would have encountered an already established population on Borneo, making a secondary colonization more difficult. This would have caused a slowdown of dispersal in the northern direction and possibly led to an increased lateral redirection from this previous pathway.

Westward dispersal across the South China Sea bed onto the Southeast Asian mainland not only involved the successful distribution over land but also the crossing of two large river systems present during the glacial periods (CHAPTER 04, Figure 34). This made a distribution leading to the Malay Peninsula and dispersal along the southern Gulf of Thailand onto the mainland (Figure 51, SEA 1 and CHAPTER 04, Figure 40, XI) more likely than an arrival on the Vietnam coastline (Figure 51, SEA 3 and CHAPTER 04, Figure 40, XIII); given the young age of the latter group and the close relationship with Bornean haplotypes their translocation might have been mediated by human movement.

Figure 51: Proposed population history for R. exulans. Yellow indicates the original population from where the species extended its range. Each level represents a different time frame associated with an event (1 - 7); the proposed time frames for these events are annotated on the right. The locations are fixed on the grid. To signal divergence through genetic drift populations were assigned new colours. The letters N and S stand for north and south and indicate two different lineages within one geographic area (Borneo). Question marks indicate unknown locations, lines are connecting the ancestral and the two deep lineages through time. Abbreviations as before and Luz: Luzon, Neg: Negros; Sul: Sulawesi; Mol: Moluccas, Bor: Borneo, Jav: Java.

Eastward dispersal would still rely on crossing several water gaps and must therefore be attributed to chance. The close relationship with the NO-haplogroup suggests a later event of dispersal and although a natural stranding was possible, this lineage might entirely be explained by human mediated dispersal and subsequent range expansion, which would be consistent with the time estimates for range expansion (CHAPTER 03, Table 2 and Table 6).

The dispersal from the Philippine reservoir, most likely on Borneo or Palawan, proceeded northwards and eastwards, founding populations on Luzon and Negros (Figure 51, steps 5 to 7). It further reached Sulawesi, leading to a North-South split of haplogroups similar to that on Borneo.

The location for the reservoir that led to the Remote Oceanic lineage is unknown; however, it must have been somewhere between the species’ origin and the Bismarck Archipelago, leaving the Moluccas at the forefront. To discover this location a survey of archaeological evidence in the area and further sampling of extant specimens within a range of potential islands would be necessary. One example and possible candidate would be the Sula islands, which formed an extended island with the Kepulauan Banggai to the west, but in the absence of further evidence other islands in the Moluccas are just as likely candidates. From this reservoir further dispersal can be hypothesised to explain the slight differentiation within the Remote Oceanic lineage, which seems to be better explained by ancestral variation than by differentiation along the dispersal route. Considering the estimated tMRCA for the lineage (CHAPTER 04, Table 40 and Figure 39), translocations aided by the first modern humans in the region are a possibility.

The observed relationships among R. exulans clades appear to fit well with the general pattern of species distribution among islands of Sundaland, as has recently been compared among several non-migratory vertebrate species (Leonard et al., 2015). Leonard et al. (2015) summarised a closer relationship between Sumatran and Malay Peninsula species than of either to Borneo; they also reported the presence of deeply diverged lineages of many species on Borneo.

Although most islands in Wallacea are truly oceanic, i.e. remained largely isolated even at the lowest sea levels, many islands in the Moluccas had substantially more landmass during these periods. All archipelagos within Wallacea, with the controversial exception of the Lesser Sunda Islands, have produced native species of Rattus (Musser, 1981a), indicating that this area has frequently received rats that then diverged through isolation. The Sula group has the native R. elaphinus and evidence exists for two further undescribed rat species on its largest island; Ceram has R. felicius, and Morotai R. morotaiensis (Musser, 1981a). As is common for Halmahera, the latter entered the Moluccas via a colonisation pathway from the Sahulian region, as opposed to the more commonly observed colonisation pathway among the Indo-Pacific Rattini from the Sunda Shelf and the Southeast Asian mainland (Fabre et al., 2013). Halmahera forms an exception in this respect as it has received species via both pathways; the newly described Halmaheramys was inferred to have entered from the west (Fabre et al., 2013).

With respect to R. exulans, Halmahera in the Northern Moluccas as well as Kei Besar in the Southern Moluccas have haplotypes from the deeply diverged RO-lineage as well as from a recent dispersal from NO, but no specific SMA haplotypes (CHAPTER 03, Figure 16). This indicates the repeated accessibility of these islands for this non-volant small mammal but also suggests a directionality from the New Guinea region, rather than the Lesser Sunda Islands. It does not provide any evidence towards an origin of the RO-lineage, and clearly does not favour either of the islands as reservoir because the observed haplotype pattern is more consistent with repeated introductions.

Within the RO-lineage, the close association of a putative subgroup with a structurally undefined cluster of samples originating in the Bismarcks and Remote Oceania as far as Samoa but not beyond is conspicuous across all tree topologies observed in this study. It could be suggestive of a pre-RO reservoir within the Bismarcks. However, although the two other rat species in the Bismarcks, R. mordax and R. praetor, were found in earlier archaeological horizons (15,000 and 8,000 BP respectively), there is no secure archaeological evidence for the presence of R. exulans before 3000 BP (Allen et al., 1989). Hence this cluster is possibly the representative of the earlier Lapita-associated distribution of R. exulans, as opposed to a later dispersal from the shared RO-ancestral region into the Remote areas of Oceania, which has been suggested to be a separate event for humans by Addison and Matisoo-Smith (2010).

Is there a consensus for the dispersal of the Remote Oceanic clade(s)?

“I would thus derive exulans and hawaiiensis from the Philippine-Borneo-Java region in the form of many successive waves probably arising from stocks already differentiated from each other. I am not inclined to believe that any one of them has passed through New Guinea.” (Tate, 1935, p. 167)

Tate (1935) captured the major distinction between the Near and Remote Oceanic lineages quite accurately and also inferred the deep divergence among these demes. However, the dispersal pathways after Schwarz and Schwarz (1967) improved on Tate’s theories by adding a dispersal route along the Bismarck Archipelago (Figure 52). This route is now supported by molecular evidence and presumably reflected by the cluster spanning from the Bismarck Archipelago eastwards to Samoa, including some Micronesian Islands. Further this pathway can be seen as concordant with the hypothesised earlier introduction into the New Hebrides and Santa Cruz Islands by Tate. Therefore the molecular data is able to merge these previous accounts based on morphological differences into one theory.

Figure 52: Dispersal routes of R. exulans after Tate (1935; solid) and Schwarz and Schwarz (1967; dashed). Image from Roberts (1991).

The reconstruction of dispersal routes by Roberts (1991a), although largely concordant with the molecular data, cannot be supported. The use of radiocarbon dates indicating initial human settlement may be an acceptable approach to infer the presence of R. exulans within Remote Oceania, but within Near Oceania and possibly within Micronesia the associated implications can be misleading. To be able to use the phylogeography of this commensal species for inferences of human migration pathways, we cannot use the human settlement data to define the pathway of the species.

Matisoo-Smith and Robins (2004) hypothesised the introduction of two lineages into the Bismarck Archipelago and suggested an eastern route of dispersal from the Philippines into Wallacea for their haplogroup II, here presented as NO haplogroup. The first hypothesis is fully supported by the molecular data in this thesis. Evidence of the Remote Oceanic lineage was found on Lavongai, Manus and Tench in the Bismarcks, on the latter two exclusively, while the co-occurrence of NO and RO lineages as previously documented for Halmahera was observed on Lavongai, a less isolated island. All other islands sampled from the Bismarcks only received the Near Oceanic haplotype. The second suggestion of an eastern dispersal route from the Philippines cannot be supported based on the now established phylogeography. The NO haplogroup (II) is centred on New Guinea and its surrounding islands and has undergone a recent range expansion. The main haplotype of this range expansion, as well as three others can be found in the Philippine Islands. While the haplotype composition would be concordant with the Philippines being part of the range expansion or with a subsequent dispersal into the Philippines from the New Guinea region, it would not be concordant with the Philippine group as source for the NO lineage.

Overall, the results in this thesis allow clearer insight into the dispersal pathways into Remote Oceania, but the origin of the Remote Oceanic lineage of R. exulans remains elusive. The mtDNA data supports the basic ideas of Tate (1935) and Schwarz and Schwarz (1967) and clearly distinguishes between the Near and Remote Oceanic clades. The hypotheses of Matisoo-Smith and Robins (2004) regarding two lineages within the Bismarcks could be confirmed. However, the connections between the Bismarck Archipelago and Micronesia remain unclear and require further sampling to gain a better resolution. Ideally this would include archaeological samples. Regrettably it is generally difficult to obtain DNA from material from the tropics that is older than 2000 years (Robins et al., 2001), as was experienced in this study with samples from Panakiwuk on New Ireland, where DNA retrieval was unsuccessful. Hence future studies will have to rely on sampling of extant specimens, covering an extensive range in order to be able to correct for historic introductions of lineages.

6.4 Concordance with other commensals

The results obtained in this thesis show that none of the animal commensals associated with the Lapita cultural complex can trace the origin of their Pacific lineages to Taiwan, or even the Philippines.

Dogs (Canis lupus familiaris) were domesticated in a single geographic region south of the Yangtze from a large number of wolves (Canis lupus) sometime between 16,300 and 5400 BP, supposedly in association with the development of rice farming in southeastern Asia; therefore the domestic dog has a homogeneous gene pool of only ten major mtDNA haplogroups (Pang et al., 2009; Ding et al., 2012). The first molecular study on the origin of Australian dingoes identified only one main mtDNA haplotype of group A (A29), indicating a single introduction of dogs into Australia (5000 BP), but with no association towards the Oceanic islands (Savolainen et al., 2004). Samples from pre-European Polynesia revealed two different lineages (also group A, Arc1 and Arc2); while one resembled various widespread types (Arc1; Arc = archaeological sample with shorter amplicon as opposed to the A types), the other linked the Remote Oceanic lineage to Indonesia (Arc2 = A75) (Savolainen et al., 2004). A recent systematic comparison of these three haplotypes to dog specimens in Southern East Asia and Island Southeast Asia revealed the presence of all three types in South China, mainland Southeast Asia and Indonesia, but found no evidence of these types in either Taiwan or the Philippines (Oskarsson et al., 2011). Therefore the proposed dispersal pathway began in South China and from there passed through mainland Southeast Asia and Indonesia, possibly in parallel to the spread of Neolithic culture (Oskarsson et al., 2011). This route, however, is not concordant with an association of the dog dispersal with the assumed Austronesian language dispersal out of Taiwan.

Chickens (Gallus gallus domesticus) were domesticated primarily from red jungle fowl (Gallus gallus), presumably in several locations in South and Southeast Asia as inferred from the geographic distribution of nine highly diverged mtDNA haplogroups (A to I), with H consisting of only jungle fowl, C only of domestic chicken and the seven remaining comprising both (Liu et al., 2006). This differentiation spanning the wild and the domesticated types indicates either separate domestication of already differentiated red jungle fowl or substantial gene flow between domesticated chickens and local wild types (Miao et al., 2013). However, the predominant lineage found in the Pacific is haplogroup D (77%), present in South and Southeast Asia but absent from Taiwan, therefore suggesting a similar dispersal route for the chicken as for the dog (Chang et al., 2012; Miao et al., 2013).

Phylogeographic analysis of wild boar (Sus scrofa) inferred the origin of the species within Island Southeast Asia and from there its dispersal across Eurasia (Larson et al., 2005). For a time multiple origins for the domestication of pigs were suggested, particularly one within Europe. However, the prevalence of a Northern European mtDNA lineage among the domestic pig stock has later been linked to introgression from this local wild boar lineage (Larson & Burger, 2013). The general consensus now acknowledges two independent domestications of pigs, one in eastern Anatolia about 8000 BP and another around 7000 BP in central China (Larson & Burger, 2013; Evin et al., 2015). A Pacific clade of feral or loosely domesticated pigs (D6) was identified, comprising New Guinean pigs as well as specimens from Vanuatu and Hawaii to the east and Halmahera to the west; like the rats, this clade initially lacked a link to any other geographic region (Larson et al., 2005). However, subsequent studies suggested a link to Vietnam (Lum et al., 2006) and demonstrated the Pacific clade (PC) as nested within the Southeast Asian S. scrofa clade, clearly indicating its origin on the mainland (Larson et al., 2007). A dispersal pathway from the domesticated pig stock in China was suggested based on the presence of specimens clustering in the PC, along the Sunda chain (Sumatra, Java, Bali, Flores, , and Timor) via the Moluccas (Halmahera and Seram) to New Guinea. As with dogs and chickens, the Pacific lineage is absent from Taiwan. Haplotypes found in Taiwan are among those most common in East Asia and are shared with the Philippines and parts of western Micronesia (Marianas and possibly Palau). Archaeological consensus places the pig in the Moluccas no earlier than 3500 BP (Bellwood et al., 2005); therefore the dispersal of the PC via the Moluccas into New Guinea is strongly associated with Neolithic migration and the emergence of the Lapita cultural complex (Larson et al., 2007). So far not considered in the above studies are recent findings of admixture from S. verrucosus into S. scrofa that might have confounded the signal for human-mediated translocations (Frantz et al., 2014).


The cumulative evidence from these commensal animals strongly suggests that (1) the Neolithic settlers came directly from the Wallacea region, (2) the migrants stayed in this region long enough to pick up all of their commensal animals, or (3) the presumably strong trading ties between the Wallacea region and the Bismarck archipelago gave rise to the observed cultural complex in the Bismarcks.


HUMAN MIGRATION PATHWAYS CHAPTER06

6.5 Migration pathway exploration

As part of this thesis it was intended to model different migration scenarios simulating likely human migration pathways. To compare the dispersal of R. exulans with these human migration pathway scenarios, the biogeographic history of R. exulans was intended to serve as the starting point for further model-based tests to infer population sizes, growth and migrations between adjacent geographic regions under different scenarios.

The population structure of R. exulans, revealed in CHAPTER 03, indicated strong divergences between geographic regions. The pairwise ΦST results allowed the estimation of the absolute number of migrants per generation exchanged between two populations as $M \approx \frac{1 - F_{ST}}{2F_{ST}}$, assuming that the mutation rate per generation was negligible compared to the migration rate. Estimates of M greater than one, indicating that migration would be stronger than genetic drift, were found only for migrations between SMA and SEA and between SMA and NO (CHAPTER 03, Table 4). Therefore a coalescent-based inference of the population parameters in LAMARC (Kuhner, 2006) seemed to be justified.
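As a hedged illustration of the ΦST-based estimate described above, the following Python sketch converts a pairwise fixation index into the approximate number of migrants per generation. The function name and the example values are hypothetical; they are not the values reported in CHAPTER 03, Table 4.

```python
def migrants_per_generation(fst: float) -> float:
    """Approximate M from a pairwise F_ST (or Phi_ST): M = (1 - F_ST) / (2 * F_ST).

    Assumes the mutation rate is negligible compared to the migration rate.
    """
    if not 0.0 < fst <= 1.0:
        raise ValueError("F_ST must lie in the interval (0, 1]")
    return (1.0 - fst) / (2.0 * fst)


# Hypothetical pairwise values, for illustration only.
for fst in (0.10, 0.25, 0.50, 0.90):
    m = migrants_per_generation(fst)
    verdict = "migration outweighs drift" if m > 1.0 else "drift outweighs migration"
    print(f"F_ST = {fst:.2f}  ->  M = {m:.3f}  ({verdict})")
```

Under this formula M exceeds one only when the fixation index is below one third, which is why strongly differentiated population pairs yield M below one.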

The use of LAMARC is associated with a few assumptions. For the estimation of the coalescent parameter theta these are: (1) the marker must be considered neutral and not linked to any segment undergoing directional or balancing selection, (2) the population is much larger than the sample, and (3) each population has no subdivision that could inhibit gene flow. For the estimation of migration rates, these are: (1) the proposed populations have existed for a long time, and (2) the current migration structure has been stable for a long time. Violation of these assumptions would result in false evidence for migrations between the affected populations. Assumptions concerning the growth parameter are: (1) that the growth or decline has been the same for a long time and (2) that the immigration rate does not depend on population size; ‘a long time’ was not defined. With evidence from the prior analyses and to the best of my knowledge, my data satisfied all necessary assumptions.

Hence extensive exploratory analyses were conducted in LAMARC before choosing six different population scenarios. Because migration rates were estimated between populations, the number of estimated parameters increased steeply with the number of populations; therefore the number of populations had to be limited.
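To make the growth in parameter numbers concrete, the sketch below counts the parameters of a hypothetical fully connected model: one theta and one growth rate per population, plus one migration rate per ordered pair of populations. This is an assumed upper bound for illustration, not the parameterisation used in the thesis; the candidate models listed in Table 13 have fewer parameters than this bound.

```python
def max_parameters(n_populations: int) -> int:
    """Upper bound on estimated parameters for a fully connected model (assumption)."""
    thetas = n_populations                               # one theta per population
    growth_rates = n_populations                         # one growth rate per population
    migration_rates = n_populations * (n_populations - 1)  # one rate per ordered pair
    return thetas + growth_rates + migration_rates


for n in range(2, 8):
    print(f"{n} populations -> at most {max_parameters(n)} parameters")
```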


Several demographic models were explored and six plus two sets of biogeographic candidate models were chosen to be fully tested in LAMARC. Plausible geographic areas were designated as ‘populations’ and the complexity of the models was increased by applying different known biogeographic boundaries and progressively isolating large islands and island groups, within the centre of the geographic distribution, overall increasing the number of separate areas (Figure 53).

4 populations

The simplest population models, S8 and S12, consisted of four regions. In the first model (S8), the distributional range was delimited by three biogeographic lines: the Huxley line separating the Sunda shelf from Wallacea, the Lydekker line separating Wallacea from the Papua New Guinea region of the Sahul shelf, and the Thorne-Green line separating Near and Remote Oceania. In the alternative model (S12) the Huxley line was omitted and instead the Southeast Asian mainland was separated from the entire Island Southeast Asia region. For S12 two one-way dispersal variants were tested, one with an origin in SEA and the other with an origin in ISEA.

5 populations

The two five-population models differed more substantially. The first (S9) was an adaptation of S8, with the addition of the Wallace line to separate the Philippines as population five.

The second model (S10) was based on the AMOVA results from CHAPTER 03. Here the five regions consisted of mainland Southeast Asia as one population, Java and the Lesser Sunda Islands as a second, the Philippines, Borneo and Sulawesi as a third, and the New Guinea region with the Weber line as delimiter as a fourth. The Thorne-Green line remained as the eastern boundary to Remote Oceania as the fifth population.

6 populations

Likewise, two six-area models were tested: S11 and S13. S11 had one further area added to S10, which encompassed the northern and western islands around New Guinea, i.e. the Bismarck Archipelago and the Moluccas between the Weber and Lydekker lines. This addition was based on the AMOVA results for the New Guinea region, which found a strong differentiation between New Guinea Island and the two indistinguishable island areas. S13 combined the PHBS region with all large Sunda Islands and separated Wallacea without Sulawesi; New Guinea and the Bismarcks were combined, but Micronesia was separated from RO.

Figure 53: Population boundaries for LAMARC migration analyses. Four populations: A (S12) and B (S8), five populations: C (S10) and D (S9), six populations: E (S11) and F (S13). For further description see text above. Coloration of the Oceanic islands is not true to their actual size, but an artefact of increasing their visibility. The Micronesian samples in F are depicted in light blue.

Population parameters estimated

The software LAMARC was used to infer the population parameter estimates for theta, migration rates, and growth for each model. The Bayesian approach was chosen because it reduces computational time from months to weeks and in LAMARC it also offers better searches of tree space than the likelihood option.

The independent demographic parameters were estimated as $M_1 = \frac{m}{\mu}$ and $\theta = 2N_f\mu$ (haploid), where $m$ is the per-generation migration rate (or the chance for a lineage to immigrate per generation), $\mu$ the mutation rate per site per generation, and $N_f$ the effective (breeding) number of females in the population. Further, the exponential growth rate is defined within $\theta(t) = \theta_{\text{present time}} \, e^{-gt}$, where $t$ is a time before present (in units of mutations) and $g$ the exponential growth rate; therefore positive values for $g$ indicate population growth and negative values population decline. Estimates for $g$ have an asymmetric magnitude: while a $g$ of 10 indicates slow growth, a $g$ of -10 indicates significant shrinkage. Because of the simultaneous estimation of growth rates, the estimated theta represented the present-day theta.
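The growth model can be illustrated with a minimal Python sketch that evaluates theta at a time t before present from the present-day theta and the growth rate g. The function name and the numerical values are hypothetical and chosen only to show the direction of the effect.

```python
import math


def theta_at(theta_present: float, g: float, t: float) -> float:
    """Theta at time t before present (t in mutational units): theta_present * exp(-g * t)."""
    return theta_present * math.exp(-g * t)


theta_now = 0.01   # hypothetical present-day theta
t_past = 0.05      # hypothetical time before present, in mutational units

for g in (10.0, 0.0, -10.0):
    past = theta_at(theta_now, g, t_past)
    print(f"g = {g:+6.1f}: theta at t = {t_past} was {past:.5f} (present {theta_now})")
```

A positive g gives a smaller theta in the past (the population has grown), whereas a negative g gives a larger theta in the past (the population has shrunk).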

To obtain an estimate of the total migration rates (gene flow parameter for the population) as $M_2 = 2N_f m$, the resulting Bayesian estimates for $\theta$ and $M_1$ were multiplied. This scaled the migration from a generational event to a population event with impact in relation to the receiving population. However, according to the manual, LAMARC will fail to keep track of migration events if estimates of 2Nm become larger than about 5, and these estimates will be unreliable. With the estimates for M via ΦST from CHAPTER 03 this did not seem to be a necessary concern.
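Written out, the scaling is a one-line identity that uses only the quantities defined in this section; the mutation rate cancels when the two estimates are multiplied:

```latex
M_2 \;=\; \theta \cdot M_1 \;=\; \left(2 N_f \mu\right) \cdot \frac{m}{\mu} \;=\; 2 N_f m
```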

Chain settings

Four identical analyses were run in parallel for each model, on the one hand to test for similar convergence of the parameters, and on the other because of computational restrictions. However, true parallelisation is not a feature in LAMARC; the software merely offers the possibility to create out-files logging every sample, which can then be manually combined and profiled together. Each run consisted of a short initial chain of 500 steps, sampled at an interval of 50 (after 1000 burn-in steps), to assure all parameters were estimated as expected, and a final chain with 500,000 steps, sampled at an interval of 50, after 20,000 steps discarded as burn-in. Multiple searches were performed with three initial heating temperatures of 1, 2, and 4 and a swap interval of 10. Adaptive heating had provided optimal swap rates between 10 and 40 in the test runs, hence it was allowed. All priors in LAMARC are flat; however, a choice between linear and logarithmic sample density and the setting of boundaries are required. After experimenting with the parameter bounds in various test runs, the default priors were defined as follows: theta, logarithmic [0.00001, 0.5]; migration, linear [1 × 10^-10, 10,000 (max.)]; growth, linear [-1000, 1000]. The model of molecular evolution recommended by jModeltest, GTR+I+Γ5, was applied.


Parallelization

The trace files for the single runs were inspected in TRACER for an adequate ESS and overall quality. The outsumfile of each run (25M iterations) was then prepared for concatenation in ULTRAEDIT, which is capable of handling such large text / xml files. Concatenation was handled via a short Perl script and the resulting file (100M iterations) entered for re-profiling in LAMARC to calculate the overall parameter estimates per model across all runs.
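The iteration counts quoted above follow from the chain settings by simple arithmetic. The sketch below assumes that the 500,000 final-chain steps are recorded samples taken every 50 iterations and that burn-in is not counted; both are interpretations of the settings, and the constant names are illustrative.

```python
SAMPLES_PER_RUN = 500_000   # final-chain samples per run (assumed reading of "steps")
SAMPLING_INTERVAL = 50      # iterations between recorded samples
N_RUNS = 4                  # identical runs per model

iterations_per_run = SAMPLES_PER_RUN * SAMPLING_INTERVAL
total_iterations = iterations_per_run * N_RUNS

print(f"Iterations per run:      {iterations_per_run:,}")
print(f"Iterations, all 4 runs:  {total_iterations:,}")
```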

To obtain a frequency distribution for M2, the point estimates for M1 and θ were extracted from the combined LAMARC outsumfiles, and the product of each point estimate for migration rate and the corresponding theta of the receiving population was computed in R 3.1.2.
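The per-sample product can be sketched as follows. The original computation was carried out in R; this Python version is only an illustration, it assumes the paired estimates have already been extracted into two equally long lists, and the function names and numbers are hypothetical. It also reports the share of scaled values above the threshold of about 5 mentioned for 2Nm.

```python
from statistics import median


def scaled_migration(theta_recipient, m1):
    """Multiply each M1 estimate by the matching theta of the receiving population."""
    if len(theta_recipient) != len(m1):
        raise ValueError("theta and M1 samples must be paired one-to-one")
    return [theta * rate for theta, rate in zip(theta_recipient, m1)]


def fraction_above(values, threshold=5.0):
    """Share of scaled migration estimates exceeding the reliability threshold."""
    return sum(value > threshold for value in values) / len(values)


# Hypothetical paired point estimates, for illustration only.
theta_samples = [0.010, 0.020, 0.015, 0.030]
m1_samples = [100.0, 400.0, 200.0, 50.0]

m2_samples = scaled_migration(theta_samples, m1_samples)
print("M2 samples:", [round(value, 3) for value in m2_samples])
print("median M2:", round(median(m2_samples), 3))
print("fraction above 5:", fraction_above(m2_samples))
```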

Model comparison

The demographic models were compared by a posterior simulation-based analogue of Akaike’s information criterion (AIC, Akaike, 1973) through Markov Chain Monte Carlo (AICM, Raftery et al., 2007). AICM has been shown to perform better in evaluating parameter-rich models than the harmonic mean estimator (Baele et al., 2012). The AICM scores were computed in TRACER. When comparing the scores, a lower AICM indicates a better fit of the model, and a cut-off of 10 AICM units can be argued to indicate a strong preference in favour of one model over another (Baele, 2012). The AICM was calculated for each run per model and the average AICM and standard deviation across the individual results reported. The results are shown in Table 13 below.

Table 13: AICM scores among the candidate models, averaged between the four independent runs. Number of populations (N), number of parameters, AICM score and standard error between the AICM scores of the different runs.

Model      N   Parameters   AICM       S.E.
S12_ISEA   4   14           3735.438   18.874
S08        4   18           3740.586   16.926
S12        4   16           3741.955   27.622
S12_SEA    4   14           3754.426   13.558
S13        6   28           3778.899   38.372
S14        7   36           3830.074   21.820
S09        5   26           3854.238   38.343
S11        6   34           3874.116   53.454
S10        5   24           3969.907   80.623
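The comparison in Table 13 can be retraced with a short Python sketch. The `aicm` function uses the commonly cited form of the estimator, twice the sample variance minus twice the mean of the posterior log-likelihood samples; this form is stated here as an assumption and was not checked against the TRACER implementation. The second function ranks the averaged scores from Table 13 by their distance to the best model and applies the 10-unit cut-off.

```python
from statistics import mean, variance


def aicm(log_likelihoods):
    """AICM from posterior log-likelihood samples: 2 * s^2 - 2 * mean (assumed form)."""
    return 2.0 * variance(log_likelihoods) - 2.0 * mean(log_likelihoods)


def delta_aicm(scores):
    """Difference of each model's AICM to the lowest (best) score."""
    best = min(scores.values())
    return {model: score - best for model, score in scores.items()}


# Averaged AICM scores as reported in Table 13.
TABLE_13 = {
    "S12_ISEA": 3735.438, "S08": 3740.586, "S12": 3741.955,
    "S12_SEA": 3754.426, "S13": 3778.899, "S14": 3830.074,
    "S09": 3854.238, "S11": 3874.116, "S10": 3969.907,
}

deltas = delta_aicm(TABLE_13)
for model, delta in sorted(deltas.items(), key=lambda item: item[1]):
    note = "within 10 units of the best model" if delta < 10.0 else "strongly disfavoured"
    print(f"{model:<9} delta AICM = {delta:8.3f}  ({note})")
```

By this cut-off, the averaged scores alone do not separate S12_ISEA, S08 and S12, while all remaining models lie more than 10 units above the best score.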


Can long range migration be modelled between diverged lineages?

Prior to receiving the full results of all runs, the general assumptions underlying LAMARC were not violated to the best of my knowledge. However, despite the pooling of populations, the clear population structure as established by network analysis and AMOVA, as well as migration rates for M via ΦST estimated within a plausible range (CHAPTER 03, Table 4), Nm estimates through the coalescent approach within LAMARC reached values mostly (much) larger than the software threshold for ‘going crazy’ (Kuhner). Hence the migration estimates obtained through LAMARC are not reliable. This can be demonstrated, for example, by the very high migration rates between NO and RO in both directions. It is stated in the manual that this behaviour in LAMARC is triggered by too many migration events; however, that can hardly be the cause here. Further, some distributions for the migration rate M = m/µ were platykurtic, meaning they were flattened to a certain degree. This form of distribution indicates issues either with the convergence during the MCMC run or with the lack of information held in the data for that particular parameter.
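Flatness of a sampled distribution can be quantified by its excess kurtosis, which is negative for platykurtic distributions. The sketch below is a generic illustration using the moment-based definition; it is not the procedure used to assess the LAMARC output, and the sample data are artificial.

```python
def excess_kurtosis(samples):
    """Moment-based excess kurtosis: m4 / m2^2 - 3 (negative values are platykurtic)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    centre = sum(samples) / n
    m2 = sum((x - centre) ** 2 for x in samples) / n
    m4 = sum((x - centre) ** 4 for x in samples) / n
    if m2 == 0.0:
        raise ValueError("samples have zero variance")
    return m4 / (m2 ** 2) - 3.0


# Artificial examples: an evenly spread (flat) sample and a sharply peaked one.
flat_sample = list(range(101))
peaked_sample = [0.0] * 50 + [-10.0, 10.0]

print("flat sample:  ", round(excess_kurtosis(flat_sample), 3))
print("peaked sample:", round(excess_kurtosis(peaked_sample), 3))
```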

In retrospect it becomes clear that this type of analysis could not have succeeded. Although the population sizes were allowed to change, the scenarios could not capture the true model of dispersal as described under section 6.3, which became apparent through the study of the phylogeny after estimation of the species chronology and the ancestral area reconstruction. The scenario I proposed above cannot be modelled under the migration model in LAMARC and, because it requires non-dichotomous dispersal events with subsequent isolation, the divergence model in LAMARC was not an alternative either. Due to the very long run times, further analyses had to be abandoned.

The obtained results had unreliably wide estimates and there are several possible reasons why. The first is the population dispersal history proposed above. If the ancestral population underwent a range expansion over the proposed area, with subsequent isolation before a new expansion from the now three populations, this is impossible to model in LAMARC. However, software like IMa2 (Hey, 2010) might be able to come closer to a solution here. Further, using the current population size to scale the migration rate has the wrong impact on the rate between NO and RO, because at the time of initial settlement the population sizes there were zero. Another possible problem in estimating reliable population parameters is the different modes of distribution of the species.


Figure 54: Example (S10) for the wide ranges of the initial parameter estimates and the associated migration rates in LAMARC.


Generally it is already challenging to model the differences between normal dispersal and geo-dispersal; however, for R. exulans the accelerated human-mediated dispersal has to be added, which caused an unnaturally rapid series of founder events and a naturally unmatched range expansion. Slatkin and Excoffier (2012) described the following for normal colonisation events during range expansion:

“The succession of colonization events during range expansion creates a spatial sequence of allele frequencies that is analogous to the time sequence of allele frequencies in a single population. This is true in an idealized model of range expansion and is approximately true in a more realistic model that allows for some delay before the next colonization event and for weak gene flow among established populations.”

Based on this description, it has to be considered that the human-mediated range expansion into RO might make it impossible to model migrations and dispersal without correcting for the accelerated time frame. This impact on the time frame also has to be considered with regard to the age of the Remote Oceanic clade in CHAPTER 04.

Further, this type of model-based analysis across such a wide geographic region might simply not be feasible. As a future approach it might be preferable to test for migration on a more regional scale, e.g. within the Philippine Islands or within the Bismarck Archipelago. On this geographic scale dispersal histories have successfully been inferred for the Canary Islands (Sanmartín et al., 2008) and for Madagascar (Brouat et al., 2014). Such analyses are, however, not possible with the current data set, because the sampling resolution in these areas is not sufficient for such an approach. Suitable simulation software must further be capable of using (a) time slices, (b) ‘empty’ or ‘ghost’ populations, and (c) changing migration scenarios to account for the multifaceted population history of this commensal rat species.

Regardless, although it was not possible to directly compare migration scenarios, the distribution and population history of R. exulans still allow inferences with respect to the hypothesised human migration scenarios.


6.6 Rats, implications for inferences of human migration pathways

The commensal R. exulans allowed a unique approach for inference of human migration within the Pacific (Matisoo-Smith et al., 1998; Matisoo-Smith & Robins, 2004). Unlike the other introduced animal species, pigs, dogs and chickens, the rat appears to have been introduced during the initial voyages and reached every island settled by humans. This mode of introduction allowed the inference of direct connections between the islands. Therefore R. exulans has repeatedly been applied as a proxy for human migration in an oceanic context. The aim of this study is to distinguish the population structure and evaluate whether the species can serve as a proxy for human movement beyond the scope of Remote Oceania, in particular, whether the species’ history can contribute to the search for an origin of the ancestors of the Lapita peoples.

Among the disciplines involved in this quest, new developments throughout the last decade led to the realisation that the old models are insufficient to account for all available evidence. Therefore much discussion has arisen about how to address the need for a new framework (Bellwood et al., 2011; Sheppard, 2011; Spriggs, 2012; Specht et al., 2013). In the earlier depiction of human races and their migration across all continents the was clearly inferred from an origin in mainland Southeast Asia (Figure 55). However, linguistic evidence unequivocally points towards Taiwan as the ultimate origin of the Austronesian languages (Blust, 2009/2013) and the suggested pulse-pause scenario for its distribution (Gray et al., 2009; Greenhill et al., 2010) is not per se challenged. To date archaeological findings are much more limited and do not allow the same level of detail for inference (Spriggs, 2012). The characteristic Lapita ceramic heritage is most strongly related to techniques used in the Philippines and has been tied to a distribution pathway from there via the Marianas into Melanesia and the Remote Pacific (Carson et al., 2013). Such a pathway is not able to explain any of the commensal species found in early Lapita sites.

New direct dating methods have revealed false assumptions regarding the presence of artefacts of faunal remains within older stratigraphic layers (O'Connor et al., 2011b). The mere definition of the so-called ‘Neolithic package’ in the region is being challenged by various authors and the consensus is growing that the Neolithic within ISEA is rather a Neolithic and not confined to one source and one direction of spread (Sheppard, 2011; Spriggs, 2012). Additionally the isochronic spread of language and farming is being challenged (Szabó & O'Connor, 2004; Donohue & Denham, 2010).

Figure 55: Hypothetical map for the monophyletic origin and dispersal of human races by Haeckel (1889).

Human genetic evidence has identified a complex ancestral structure among the inhabitants of Island Melanesia, with strong male Melanesian influence and likewise predominant female Asian influence in the same region (see CHAPTER 01). Ongoing genetic research proposes that further ancient DNA studies on human bone material will reveal more variability among lineages in the Pacific than currently captured, hence further adding to the complexity (Matisoo-Smith, 2015). To account for the genetic evidence, Addison and Matisoo-Smith (2010) introduced a variant of Green’s VCTI model (Green, 1991b), a Western Polynesia TI model, suggesting that the initial spread of Lapita, already a genetic admixture, reached the far ends of Near Oceania and that a later separate Austronesian migration wave eventually arrived in West Polynesia. They argued that the local fusion of these two cultures built the foundation for the settlement of the Pacific instead of, as so far believed, a single migration wave and sole development within the area. Others suggested ‘leapfrogging’ as an alternative to waves of advance to account for archaeological breaks along the previously proposed pathways (Sheppard, 2011).

Considering data from the various disciplines it becomes quite clear that there truly must have been various migrations, neither uniform nor coming with a complex of culture, but open and innovative. People are adaptable and true innovation will seep through cultures without the need for people to migrate. This, however, is not true for languages, nor is it for animal or plant commensals. These are not mere ideas; both need to be transported, one physically and the other engrained in the culture. Spriggs (2012) suggested three Neolithic influences within ISEA and I strongly concur with this idea. Particularly the Wallacea region seemed to be a melting pot where different elements of the later observed ‘Lapita complex’ first came together: the language dispersing from Taiwan in the north, probably accompanied by certain pottery techniques; the animal commensals arriving on a path directly from Southeast Asia in the west, probably accompanied by cultural images like the famed Lapita face; and cultivated plants from New Guinea in the east.

Like all the puzzle pieces, R. exulans has proved invaluable in its contributions to address the question of a Lapita origin. The lack of connection of the Near or Remote Oceanic lineages to the very strongly defined PHBS haplogroup refutes a direct pathway from the North out into Near Oceania. Even though it is not possible to locate a geographic origin of the Remote Oceanic clade, the observed distribution requires substantial interaction in the Wallacea region.

The time frame the various evidence is taken from is rather wide. Given the start of the spread of the languages around 5200 BP and the dating, for example, of the presence of commensal pigs in Halmahera around 3500 BP, it is essential to ask ‘how long does one culture exist’ or since when are we who we are. Were we the same culture 500 years ago? We certainly spoke the same language, or at least almost. Is it valid to infer that the Lapita ultimately came from Taiwan although they have been interbreeding with local people ‘along the way’ for 2000 years? Would it not be more adequate to speak of influence rather than origins? Given the human genetic evidence, we know that Lapita were not Taiwanese, nor were they Asian, nor were they Melanesian.

To explain the differences in the female and male lineages a matrilineal society was proposed (Hage & Marck, 2003), where offspring belong to the mother’s family as opposed to the father’s family within patrilineal societies. This proposal entails that women of Austronesian descent purportedly established a matrilocal society within the Bismarck Archipelago, completely changing the strong patriarchal societies present there. This seems irrational; how would a group of women be able to establish an alien, women-dominated society within a strongly patriarchal realm? A scenario where Melanesian traders bring back Austronesian wives, who possess a variety of new and advantageous cultural techniques, or a scenario where Austronesian groups of settlers land in Melanesia and only the women are allowed to live, possibly with their younger sons, and subsequently get integrated in the small communities, seem much more plausible, although they probably do not add up to the numbers of people needed. Possibly more likely yet, a group of people living somewhere in Wallacea, having emerged out of the fusion of Austronesian women and Melanesian men who were both present in this region, already possessed what later would be called Lapita culture. This group would have spoken Austronesian languages, because mothers teach their children language. Papuans are known to speak many different languages on average, so they were probably very able to adapt to this. Admixture between races does not always find approval in society, therefore cultural isolation could have contributed to forming a group of people that were more tolerant, possibly more open to innovation. All this is highly hypothetical, but the main point is that human behaviour decided this historical development and human behaviour is difficult to predict, or infer for that matter. Was there intent? What is the definition of intent?
Is it the intent to sail out and see if you can manage to settle on that island that you saw, or heard of during trade, tell your mates if you do? Or is it intent to gather together a larger group and systematically search for hospitable places? However, this is not my discussion to hold.

Despite all evidence there is a tendency to underestimate prehistoric peoples. Repeatedly their seafaring abilities have been called into question, but evidence for pelagic fishing for tuna found on Timor allowed the inference of very sophisticated marine technology, not in Lapita times but dating back 42,000 years, thereby attesting to the knowledge and technology for open ocean fishing in the Lesser Sunda Islands (O'Connor et al., 2011a). Thus the lack of marine technology can hardly have been a hindrance given that the archipelago environment would always have favoured people possessing these skills.

Human genetic evidence indicates more than one human migration. Differences in the proportions of Melanesian Y-chromosome and mtDNA in Island Melanesia as compared to the Pacific populations led to the suggestion of continued admixture with the Melanesian indigenous people, and no further influx of Asian influence in either Melanesia or RO (Kayser et al., 2008). However, a bottleneck event during the settlement of Western Polynesia would also be sufficient to explain this ratio.

The phylogeny of R. exulans supports the proposal by Addison and Matisoo-Smith (2010) for two separate Austronesian migrations, explaining breaks in the languages as observed for the Solomons and the increase of Asian-derived genes in Polynesia. The first migration would be closely associated with the Neolithic spread and introduced rats of the slightly differentiated NO-lineage throughout the Bismarcks. This is strongly supported by the estimated time frame for this lineage. Subsequently this lineage would also be introduced to the New Guinea mainland. The second migration would have introduced the Remote Oceanic lineage of rats. The path of this group would tangentially pass the Bismarcks, but primarily remain further offshore and initially be constricted in its distribution east to Samoa. From West Polynesia the last large human range expansion is concordant with the dating of the range expansion of this rat lineage. Regrettably the origin of this haplogroup could not be determined. In order to further support this hypothesis a few successfully typed ancient rat remains from old Lapita layers in the Bismarck Archipelago would be sufficient. With regard to the origin, as suggested before, extensive sampling within Wallacea might be able to reveal the missing links.


CHAPTER 07 FINAL CONCLUSIONS

Analysis of the biogeography of R. exulans across its vast distributional range has allowed me to address a few specific questions in this thesis. In the following I will answer the questions that were presented in the introduction.

I. Is the current population of R. exulans geographically structured within its distributional range and if so, how?

When Matisoo-Smith and Robins (2004) expanded their sampling of the bioproxy R. exulans beyond the Pacific in their search for a connection to Near Oceania, they found a distinct structure among samples from different geographic sources. In CHAPTER 03 I presented the structuring of the R. exulans population within its entire range and can confirm a deep geographic structure among specimens. Three major geographic areas can be distinguished, of which the wider Philippines region and Remote Oceania are highly differentiated, while the third area indicates more recent gene flow within. Within this third region distinct lineages for Near Oceania (Matisoo-Smith and Robins, 2004, group II) and sub-samples of Southeast Asia form the eastern and western ends of this realm. However, although my coverage with 132 sampling locations allowed a good resolution, there seem to be many cryptic haplotypes within the ancestral region. Their identification would significantly increase the credibility of the subclades in the ancestral region and clarify several geographic connections or dispersal barriers. The distinction made by Matisoo-Smith and Robins (2004) between haplogroup III A and III B is still not supported, but a structural cluster is consistent across all topologies and networks.


II. What was the historical distribution of the species and can it be explained by natural dispersal?

In Remote Oceania the dispersal of R. exulans is known to be linked to human migration activity. As a human commensal its distribution within the rest of its range could also have been aided by humans. In CHAPTER 04 I inferred ancestral geographic ranges and used the phylo-chronology to infer correlations with palaeoclimatic events that would allow for natural dispersal. The two deep clades clearly seem to be a result of a range expansion from the ancestral area of the species, possibly before the first arrival of modern humans. I proposed that an expansion during the penultimate ice age led to two reservoirs for the species in which the lineages diverged through gene flow during the last interglacial period. The reservoir location for the Philippine lineage is likely to have been in the Borneo, Palawan region, while the geographic location for the Remote Oceanic lineage remains elusive, but can probably be narrowed down to the Sulawesi, Moluccas region. Further it is possible that the more distinct Southeast Asian lineage was a result of meltwater-pulse-induced vicariance events during the last deglaciation period, but this remains speculative until further samples from the southern part of the Malay Peninsula can be added to the data pool. Particularly the recent distribution into New Guinea appears to be unrelated to climatic events and is most likely the result of a human-mediated introduction, possibly associated with the Neolithic Lapita.

III. Can the origin of the species be determined?

Since this work began, others have taken a renewed interest in the origin of R. exulans. Despite the dominant opinion of an origin on the Southeast Asian mainland, this group inferred Flores of the Lesser Sunda Islands as the place of origin, albeit based on a restricted dataset and with questionable credibility. The work presented here offers a higher resolution due to the large geographical range covered and provides well-supported evidence. Of the two proposed origins in the literature, a Southeast Asian ancestry is supported by neither the population structure from CHAPTER 03 nor the inferred chronology and ancestral area reconstruction from CHAPTER 04. In contrast, an ancestry of the species within the Southern Malay Archipelago is supported by all genetic indices and the inferences based on the tree topology and ancestral reconstruction are most supportive of an origin within the Lesser Sunda Islands. These results are compatible with the conclusion by Thomson et al. (2014) but not concordant. Until further evidence can be produced, either by genetic studies on nuclear SNPs or archaeological data that predates the current first records around 3000 BP, this question cannot conclusively be answered.

IV. Does the distribution and dispersal pattern link the clades found in Near and Remote Oceania to each other or (a) particular geographic source(s)?

Based on all analyses conducted, the Near and Remote Oceanic haplogroups are clearly not directly related, and in particular the Near Oceanic haplogroup is not ancestral to the Remote Oceanic group. However, the origin of the Near Oceanic group can be traced to a comparatively recent dispersal event from the ancestral area, and its further distribution was presumably human-mediated. The initial differentiation most likely occurred within the Moluccas, which is also suggested as the ultimate origin of the Remote Oceanic clade. Both groups can therefore presumably be traced back in direct pathways to Wallacea, from where they were translocated on different occasions from different stocks. Although the age of the lineage might be overestimated due to the serial founder events throughout Oceania combined with increased reproductive success caused by the lack of competition and disease, fewer enemies and abundant food resources, the divergence of the lineage cannot be associated with Neolithic dispersal.


V. Can any of the theories for an origin of Lapita people be supported or rejected based on the findings?

Clearly, as with the other commensals, R. exulans in Remote Oceania cannot be derived from any Philippine stock. Therefore a simplistic Out of Taiwan or even Express Train Model can be refuted with evidence from all faunal commensals. A purely indigenous Melanesian ancestry is also not supported, because two lineages were introduced into the Bismarcks, both of them presumably through human mediation. These introductions were previously hypothesised by Matisoo-Smith and Robins (2004) and could be verified here. While the Near Oceanic lineage was probably a recent introduction to the area, possibly associated with the Lapita dispersal, the Remote Oceanic lineage could be associated with a secondary wave of Austronesian settlers, as suggested under the WPTI model by Addison and Matisoo-Smith (2010). Overall, the results for R. exulans are only compatible with more complex models and require intensive interactions within the Wallacea region.


APPENDICES

A 1 WWII military maps for PNG and Micronesia

Figure A 1: Examples of allied movements in June 1944 around New Guinea and between the Philippines and the Marianas.

Figure A 2: Examples of allied movements in September 1944 around New Guinea and between the Philippines and the Marianas.

A 2 Laboratory protocols

Extraction protocols for museum tissue and bone material

(adjusted after Boom et al., 1990; Höss & Pääbo, 1993; see Robins et al., 2014)

Reagents
SET buffer: 100 mM Tris-HCl pH 8, 100 mM NaCl, 1 mM EDTA pH 8
Pre-extraction buffer: 0.5 M EDTA pH 8, 1.6% Triton X-100
5 M GuSCN: 5 M GuSCN and 25 mM NaCl
Silica suspension (after Boom et al., 1990)

Digestion & extraction

Museum tissue
Digest a small amount of tissue overnight under rotation at 37°C in:
200 µL SET buffer
60 µL Proteinase K (20 mg/mL)
20 µL DTT (1 M)
20 µL SDS (20%) or Triton X-100
Spin down any remaining solids. Transfer supernatant to a new tube. Add:
600 µL of 5 M GuSCN extraction buffer
100 µL of silica suspension
Incubate for 3 hours under rotation at 37°C.
Spin for 1 min at 10,000 rpm, discard supernatant.

Bone material
Digest ground bone overnight under rotation at 55°C in:
975 µL Pre-extraction buffer
25 µL of 20 mg/mL Proteinase K (final concentration 0.5 mg/mL)
If bone material is not fully digested, prolong incubation for up to 3 hours at 55°C.
Spin 1 min at 10,000 rpm. Transfer supernatant to a new tube. Add:
1 mL of 5 M GuSCN buffer
100 µL of silica suspension
Incubate for 3 hours under rotation at 24 to 37°C.
Spin for 1 min at 10,000 rpm, discard supernatant.

Museum tissue and bone material (common steps)
Wash pellet by resuspending the silica with 1 mL of 5 M GuSCN extraction buffer, spin for 1 min at 10,000 rpm and discard the supernatant. Then wash twice by resuspending the silica with 1 mL of cold 70% ethanol, spin for 30 sec at 10,000 rpm and discard the supernatant. After the second EtOH wash, spin down and remove remaining EtOH with a pipette; dry pellet at room temperature (or 37°C) for approximately 15 min. Resuspend in 150 µL TE buffer, incubate 10 min at room temperature, spin for 2 min at 13,000 rpm. Transfer supernatant to a sterile microtube and store for use in PCR.
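The final Proteinase K concentration quoted for the bone digest follows from simple dilution arithmetic: stock concentration times stock volume, divided by the total volume. The short Python sketch below reproduces that check. The function name is hypothetical, the volumes are assumed to be additive, and the museum-tissue mix is assumed to consist of the SET buffer, Proteinase K, DTT and SDS volumes listed in the protocol; the resulting tissue value is an illustration, not a figure stated in the protocol.

```python
def final_concentration(stock_conc, stock_vol, other_vols):
    """Concentration after dilution, assuming additive volumes.

    stock_conc -- stock concentration (mg/mL)
    stock_vol  -- volume of stock added (µL)
    other_vols -- volumes of all other components in the mix (µL)
    """
    total_vol = stock_vol + sum(other_vols)
    return stock_conc * stock_vol / total_vol


# Bone digest: 25 µL Proteinase K (20 mg/mL) in 975 µL pre-extraction buffer.
bone_final = final_concentration(20.0, 25.0, [975.0])

# Museum tissue digest (assumed composition): 60 µL Proteinase K (20 mg/mL)
# with 200 µL SET buffer, 20 µL DTT and 20 µL SDS.
tissue_final = final_concentration(20.0, 60.0, [200.0, 20.0, 20.0])

print(f"bone digest:   {bone_final:.2f} mg/mL")    # 0.50 mg/mL, as stated in the protocol
print(f"tissue digest: {tissue_final:.2f} mg/mL")  # 4.00 mg/mL under the assumptions above
```

The same function can be reused for any other reagent in the mix by swapping in its stock concentration and volume.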


Formalin Protocol

DNA Extraction from formalin fixed tissue, by John Hyde (Rev 12/10/03), received from Kelly Robertson, NOAA (abridged original)

This protocol is based upon the following paper: Shi, S.-R., Cote, R.J., Wu, L., Liu, C., Datar, R., Shi, Y., Liu, D., Lim, H. & Taylor, C.R. (2002) DNA Extraction from Archival Formalin-fixed, Paraffin-embedded Tissue Sections Based on the Antigen Retrieval Principle: Heating Under the Influence of pH. Journal of Histochemistry & Cytochemistry, 50, 1005-1011

A simple method based on high temperature and pressure treatment of the sample prior to DNA extraction. This method was successfully used to amplify (200-300bp) and sequence DNA from museum specimens up to 44 years old.

1. Soak preserved tissue in an excess of 100% ethanol to help remove residual formalin.
2. Remove ~100 mg of tissue from ethanol and chop into small pieces if necessary.
3. Place tissue in a boil-proof 1.5 mL tube.
4. Allow ethanol to evaporate and then add 180 µL of Antigen Retrieval buffer (28.6 mM citric acid, 28.6 mM KH2PO4, 28.6 mM H3BO3, pH=11).
5. Close tube and secure lid with locking collar.
6. Autoclave at 250°F (121°C) for 20 minutes on the slow pressure venting setting.
7. Cool samples to room temperature.
8. Add 1.5 µL 3M sodium acetate pH=5.2.
9. Add 20 µL proteinase K to sample, vortex briefly to mix and incubate at 55°C overnight with occasional mixing.
10. Centrifuge briefly and add 0.5 µL 3M sodium acetate pH=5.2, 200 µL buffer AL, and 1 µL of carrier RNA (1 µg/µL) and vortex sample for 15 seconds.
11. Centrifuge briefly and add 200 µL 100% ethanol and vortex for 15 seconds.
12. Centrifuge briefly and incubate at RT for 5 minutes.

Samples are ready to be processed following standard procedures after this treatment.

Hyde advised using 1-3 µL of extract for each PCR reaction. The extraction protocol should then be sufficient to provide DNA for amplification of products of ~200-300 bp. Larger products may be possible, but for consistent amplification success he recommended designing primers to amplify products of ~250 bp in size. He also recommended using species-specific primers where possible, with Tm values between 55 and 60°C, as well as a hotstart DNA polymerase. This advice is concordant with the standard procedures that were applied for all museum specimens.


A 1 R. exulans sample data

Table A 1: R. exulans sample data: haplotype, country and island of sample origin, laboratory ID, assigned pooled region, haplogroup and regional population, museum ID and accession number where applicable.

haplotype | country | island | lab_ID | region | haplogroup | population | museum‐label | accession
Rx001 | Cook Islands | Aitutaki | RCIAiu002 | RCI02 | RO | RO | | EF186300.1
Rx001 | Cook Islands | Aitutaki | RCIAiu003 | RCI02 | RO | RO | | EF186301.1
Rx001 | Cook Islands | Pukapuka | RCIPuk002 | RCI01 | RO | RO | |
Rx001 | Cook Islands | Pukapuka | RCIPuk003 | RCI01 | RO | RO | |
Rx001 | Cook Islands | Pukapuka | RCIPuk004 | RCI01 | RO | RO | |
Rx001 | Cook Islands | Pukapuka | RCIPuk006 | RCI01 | RO | RO | |
Rx001 | Cook Islands | Pukapuka | RCIPuk007 | RCI01 | RO | RO | |
Rx001 | | Huahine | RFPSIH001 | RFP01 | RO | RO | | EF186305.1
Rx001 | French Polynesia | UaHuka | RFPMq001 | RFP03 | RO | RO | | EF186307.1
Rx001 | French Polynesia | UaHuka | RFPMq002 | RFP03 | RO | RO | | EF186308.1
Rx001 | Hawaii | Hawaii | RHaw003 | RH01 | RO | RO | | EF186303.1
Rx001 | New Caledonia | New Caledonia | RNCal001 | RNC01 | RO | RO | |
Rx001 | New Caledonia | New Caledonia | RNCal003 | RNC01 | RO | RO | |
Rx001 | New Caledonia | New Caledonia | RNCal004 | RNC01 | RO | RO | |
Rx001 | New Caledonia | New Caledonia | RNCal006 | RNC01 | RO | RO | |
Rx001 | New Caledonia | New Caledonia | RNCal008 | RNC01 | RO | RO | |
Rx001 | New Caledonia | New Caledonia | RNCal009 | RNC01 | RO | RO | |

Rx001 | New Caledonia | New Caledonia | RNCal010 | RNC01 | RO | RO | |
Rx001 | New Zealand | Great Barrier Island | RNZGBI040 | RNZ01 | RO | RO | | EF186309.1
Rx001 | New Zealand | Kapiti Island | RNZKap041 | RNZ01 | RO | RO | | EF186310.1
Rx001 | New Zealand | Moturoa Island | RNZMot001 | RNZ01 | RO | RO | |
Rx001 | New Zealand | Moturoa Island | RNZMot002 | RNZ01 | RO | RO | |
Rx001 | New Zealand | Moturoa Island | RNZMot004 | RNZ01 | RO | RO | |
Rx001 | New Zealand | Moturoa Island | RNZMot005 | RNZ01 | RO | RO | |
Rx001 | PNG | Manus | RPNGMa001 | INP01 | NO | RO | |
Rx001 | PNG | Manus | RPNGMa004 | INP01 | NO | RO | |
Rx001 | PNG | Tench Island | RPNGTe001 | INP02 | NO | RO | |
Rx001 | PNG | Tench Island | RPNGTe005 | INP02 | NO | RO | |
Rx001 | PNG | Tench Island | RPNGTe006 | INP02 | NO | RO | |
Rx001 | PNG | Tench Island | RPNGTe007 | INP02 | NO | RO | |
Rx001 | Solomon Islands | Takuu Island | RSolNW001 | INSI02 | RO | RO | |
Rx001 | USA | Guam | RGuam006 | MM02 | RO | RO | |
Rx001 | New Caledonia | Ouvea | RNCal002 | RNC01 | RO | RO | M‐88682 |
Rx001 | USA | Guam | RGuam001 | MM02 | RO | RO | USNM 277472 |
Rx001 | USA | Guam | RGuam002 | MM02 | RO | RO | USNM 278368 |
Rx001 | USA | Guam | RGuam004 | MM02 | RO | RO | USNM 278439 |
Rx002 | PNG | Manus | RPNGMa005 | INP01 | NO | RO | |
Rx002 | PNG | Manus | RPNGMa006 | INP01 | NO | RO | |

Rx002 | PNG | Manus | RPNGMa007 | INP01 | NO | RO | |
Rx003 | PNG | Lavongai | RPNGNH006 | INP04 | NO | RO | |
Rx004 | PNG | Lavongai | RPNGNH008 | INP04 | NO | RO | |
Rx005 | Burma | | RMya003 | SM01 | SEA | SEA | |
Rx005 | Burma | | RMya005 | SM01 | SEA | SEA | |
Rx005 | Burma | | RMya006 | SM01 | SEA | SEA | |
Rx005 | Cambodia | | RCam011 | SC02 | SEA | SEA | |
Rx005 | Cambodia | | RCam013 | SC02 | SEA | SEA | |
Rx005 | Cambodia | | RCam016 | SC01 | SEA | SEA | |
Rx005 | Cambodia | | RCam018 | SC01 | SEA | SEA | |
Rx005 | Lao PDR | | RLao001 | SL01 | SEA | SEA | |
Rx005 | Lao PDR | | RLao005 | SL01 | SEA | SEA | |
Rx005 | Thailand | | RTha010 | ST05 | SEA | SEA | |
Rx005 | Thailand | | RTha014 | ST05 | SEA | SEA | |
Rx005 | Thailand | | RTha015 | ST05 | SEA | SEA | |
Rx005 | Thailand | | RTha016 | ST02 | SEA | SEA | |
Rx005 | Thailand | | RTha017 | ST02 | SEA | SEA | |
Rx005 | Thailand | | RTha018 | ST02 | SEA | SEA | |
Rx005 | Thailand | | RTha019 | ST02 | SEA | SEA | |
Rx005 | Thailand | | RTha020 | ST02 | SEA | SEA | |
Rx005 | Thailand | | RTha021 | ST02 | SEA | SEA | |

Rx005 | Thailand | | RTha023 | ST01 | SEA | SEA | |
Rx005 | Thailand | | RTha027 | ST01 | SEA | SEA | |
Rx005 | Lao PDR | | ABTC115059 | SL02 | SEA | SEA | ABTC115059 | KJ155758
Rx005 | Thailand | | ABTC119382 | ST02 | SEA | SEA | ABTC119382 | KJ155758
Rx005 | Thailand | | ABTC119383 | ST02 | SEA | SEA | ABTC119383 | KJ155758
Rx005 | Indonesia | Timor | ABTC121691 | ISF02 | SMA | SEA | ABTC121691 | KJ155759
Rx005 | Thailand | | RTha003 | ST03 | SEA | SEA | ABTC8480 | EF186319.1
Rx005 | Thailand | | RTha001 | ST03 | SEA | SEA | ABTC8553 | EF186317.1
Rx005 | Thailand | | RTha002 | ST03 | SEA | SEA | ABTC8559 | EF186318.1
Rx006 | Cambodia | | RCam014 | SC02 | SEA | SEA | |
Rx006 | Lao PDR | | RLao004 | SL01 | SEA | SEA | |
Rx006 | Lao PDR | | RLao006 | SL01 | SEA | SEA | |
Rx006 | Thailand | | RTha004 | ST04 | SEA | SEA | |
Rx006 | Thailand | | RTha005 | ST04 | SEA | SEA | |
Rx006 | Thailand | | RTha006 | ST04 | SEA | SEA | |
Rx006 | Thailand | | RTha009 | ST04 | SEA | SEA | |
Rx006 | Thailand | | RTha011 | ST05 | SEA | SEA | |
Rx006 | Thailand | | RTha012 | ST05 | SEA | SEA | |
Rx006 | Thailand | | RTha013 | ST05 | SEA | SEA | |
Rx006 | USA | Aguijan | ABTC112976 | MM01 | RO | SEA | ABTC112976 | KJ155757
Rx006 | Cambodia | | ABTC115126 | SC03 | SEA | SEA | ABTC115126 | KJ155757

Rx006 | Indonesia | Flores | ABTC121681 | ISF03 | SMA | SEA | ABTC121681 | KJ155757
Rx007 | Bangladesh | | RBan002 | SB01 | SEA | SMA | |
Rx007 | Burma | | RMya001 | SM01 | SEA | SMA | |
Rx007 | Burma | | RMya002 | SM01 | SEA | SMA | |
Rx007 | Burma | | RMya004 | SM01 | SEA | SMA | |
Rx007 | USA | Rota | ABTC112984 | MM03 | RO | SMA | ABTC112984 | KJ155771
Rx008 | Bangladesh | | RBan001 | SB01 | SEA | SEA | |
Rx008 | India | | RInd001 | SI01 | SEA | SEA | |
Rx008 | Lao PDR | | ABTC115051 | SL02 | SEA | SEA | ABTC115051 | KJ155768
Rx008 | Lao PDR | | ABTC116184 | SL02 | SEA | SEA | ABTC116184 | KJ155768
Rx009 | Cambodia | | RCam002 | SC03 | SEA | SEA | |
Rx009 | Lao PDR | | RLao002 | SL01 | SEA | SEA | |
Rx009 | Lao PDR | | RLao003 | SL01 | SEA | SEA | |
Rx009 | Thailand | | RTha022 | ST01 | SEA | SEA | |
Rx009 | Thailand | | RTha024 | ST01 | SEA | SEA | |
Rx009 | Thailand | | RTha025 | ST01 | SEA | SEA | |
Rx010 | Philippines | Negros Island | RPhNeg004 | PN01 | PHBS | NO | |
Rx010 | Philippines | Negros Island | RPhNeg006 | PN01 | PHBS | NO | |
Rx010 | Philippines | Negros Island | RPhNeg012 | PN01 | PHBS | NO | |
Rx010 | Philippines | Negros Island | RPhNeg013 | PN01 | PHBS | NO | |
Rx010 | PNG | Lavongai | RPNGNH001 | INP04 | NO | NO | |

Rx010 | PNG | Lavongai | RPNGNH002 | INP04 | NO | NO | |
Rx010 | PNG | Lavongai | RPNGNH003 | INP04 | NO | NO | |
Rx010 | PNG | Lavongai | RPNGNH004 | INP04 | NO | NO | |
Rx010 | PNG | Lavongai | RPNGNH005 | INP04 | NO | NO | |
Rx010 | PNG | Lavongai | RPNGNH007 | INP04 | NO | NO | |
Rx010 | PNG | Lavongai | RPNGNH009 | INP04 | NO | NO | |
Rx010 | PNG | New Guinea | RPNG_KA686 | NP07 | NO | NO | |
Rx010 | PNG | New Guinea | RPNGBub001 | NP07 | NO | NO | |
Rx010 | PNG | New Guinea | RPNGBub002 | NP07 | NO | NO | |
Rx010 | PNG | New Guinea | RPNGBub003 | NP07 | NO | NO | |
Rx010 | PNG | New Guinea | RPNGBun002 | NP05 | NO | NO | |
Rx010 | PNG | New Guinea | RPNGWig002 | NP06 | NO | NO | |
Rx010 | PNG | Tabar Island | RPNGTa001 | INP07 | NO | NO | |
Rx010 | PNG | Tabar Island | RPNGTa002 | INP07 | NO | NO | |
Rx010 | PNG | Tabar Island | RPNGTa003 | INP07 | NO | NO | |
Rx010 | PNG | Tabar Island | RPNGTa004 | INP07 | NO | NO | |
Rx010 | PNG | Tabar Island | RPNGTa011 | INP07 | NO | NO | |
Rx010 | PNG | Tabar Island | RPNGTa018 | INP07 | NO | NO | |
Rx010 | PNG | Tabar Island | RPNGTa024 | INP07 | NO | NO | |
Rx010 | PNG | Tabar Island | RPNGTa025 | INP07 | NO | NO | |
Rx010 | PNG | Tabar Island | RPNGTa028 | INP07 | NO | NO | |

Rx010 | PNG | | RPNGMuy001 | INP12 | NO | NO | |
Rx010 | PNG | Woodlark Island | RPNGMuy002 | INP12 | NO | NO | |
Rx010 | PNG | Woodlark Island | RPNGMuy009 | INP12 | NO | NO | |
Rx010 | Indonesia | Kei Besar | ABTC110264 | ISM05 | NO | NO | ABTC110264 | KJ155767
Rx010 | Indonesia | Flores | RInFLo005 | ISF01 | SMA | NO | ABTC121689 | KJ155767
Rx010 | Indonesia | Flores | ABTC121690 | ISF01 | SMA | NO | ABTC121690 | KJ155767
Rx010 | PNG | New Guinea | RPNGWig001 | NP06 | NO | NO | ABTC44046 |
Rx010 | PNG | New Guinea | RPNGTol001 | NP06 | NO | NO | ABTC44083 |
Rx010 | PNG | New Guinea | RPNGNag001 | NP05 | NO | NO | ABTC48895 | EF186313.1
Rx010 | PNG | New Guinea | RPNGDuv001 | NP07 | NO | NO | ABTC48987 |
Rx010 | PNG | New Guinea | RPNGBun001 | NP05 | NO | NO | ABTC49255 |
Rx010 | Philippines | Mindanao | RPhMin003 | PM03 | PHBS | NO | FMNH 147951 |
Rx010 | Philippines | Camiguin Island | RPhCam002 | PM02 | PHBS | NO | FMNH 154819 |
Rx010 | Philippines | Camiguin Island | RPhCam003 | PM02 | PHBS | NO | FMNH 154820 |
Rx010 | Indonesia | Halmahera | RInHal001 | ISM02 | NO | NO | M‐101295 |
Rx010 | Indonesia | Halmahera | RInHal003 | ISM02 | NO | NO | M‐101304 |
Rx010 | PNG | Fergusson Island | RPNGFer001 | INP11 | NO | NO | M‐159737 |
Rx010 | PNG | Fergusson Island | RPNGFer002 | INP11 | NO | NO | M‐159740 |
Rx010 | Indonesia | New Guinea | RInIJay007 | NIP03 | NO | NO | M‐222380 |
Rx010 | Indonesia | Halmahera | RInHal004 | ISM02 | NO | NO | M‐267677 |
Rx010 | FSM | Yap | RFSMYap001 | MFSM02 | RO | NO | MVZ 109106 |

Rx010 | FSM | Yap | RFSMYap002 | MFSM02 | RO | NO | MVZ 109107 |
Rx010 | FSM | Yap | RFSMYap003 | MFSM02 | RO | NO | MVZ 109109 |
Rx010 | Indonesia | Sulawesi | RInSul002 | ISS03 | PHBS | NO | USNM 199913 |
Rx010 | Indonesia | New Guinea | RInIJay002 | NIP03 | NO | NO | USNM 277258 |
Rx010 | Indonesia | New Guinea | RInIJay003 | NIP03 | NO | NO | USNM 277306 |
Rx010 | Indonesia | New Guinea | RInIJay001 | NIP02 | NO | NO | USNM 277308 |
Rx010 | Indonesia | Morotai | RInMol004 | ISM03 | NO | NO | USNM 277316 |
Rx010 | Indonesia | Morotai | RInMol003 | ISM03 | NO | NO | USNM 277318 |
Rx010 | Indonesia | Morotai | RInMol005 | ISM03 | NO | NO | USNM 277319 |
Rx010 | Indonesia | New Guinea | RInIJay004 | NIP01 | NO | NO | USNM 277465 |
Rx010 | Indonesia | New Guinea | RInIJay006 | NIP01 | NO | NO | USNM 283861 |
Rx011 | Indonesia | Flores | ABTC121688 | ISF01 | SMA | NO | ABTC121688 | KJ155764
Rx011 | PNG | New Guinea | RPNGYur001 | NP03 | NO | NO | ABTC43078 | EF186312.1
Rx011 | PNG | New Guinea | RPNGFat001 | NP06 | NO | NO | ABTC44216 |
Rx011 | Cambodia | | RCam006 | SC03 | SEA | NO | FMNH 168850 |
Rx011 | Cambodia | | RCam008 | SC03 | SEA | NO | FMNH 168853 |
Rx011 | Philippines | Palawan | RPhPal001 | PP01 | PHBS | NO | FMNH 168962 |
Rx012 | PNG | Lihir Island | RPNGLi001 | INP05 | NO | NO | |
Rx012 | PNG | Lihir Island | RPNGLi002 | INP05 | NO | NO | |
Rx012 | PNG | Lihir Island | RPNGLi003 | INP05 | NO | NO | |
Rx013 | PNG | New Guinea | RPNG_KA469 | NP04 | NO | NO | |

Rx013 | PNG | New Ireland | RPNGNI001 | INP06 | NO | NO | |
Rx013 | PNG | Woodlark Island | RPNGMuy004 | INP12 | NO | NO | |
Rx013 | PNG | Woodlark Island | RPNGMuy010 | INP12 | NO | NO | M‐159686 | AY604214
Rx014 | PNG | New Ireland | RPNGNI002 | INP06 | NO | NO | |
Rx014 | PNG | New Ireland | RPNGNI004 | INP06 | NO | NO | |
Rx015 | PNG | New Ireland | RPNGNI003 | INP06 | NO | NO | |
Rx016 | PNG | New Guinea | RPNGNok001 | NP07 | NO | NO | ABTC48945 |
Rx017 | Philippines | Negros Island | RPhNeg010 | PN01 | PHBS | NO | |
Rx017 | PNG | New Guinea | RPNGWau001 | NP07 | NO | NO | ABTC46019 |
Rx017 | PNG | New Guinea | RPNGSia001 | NP05 | NO | NO | ABTC48894 |
Rx017 | Indonesia | Ambon Island | RInMol009 | ISM01 | NO | NO | USNM 521887 |
Rx017 | Indonesia | Ambon Island | RInMol002 | ISM01 | NO | NO | USNM 521888 |
Rx017 | Indonesia | Ambon Island | RInMol008 | ISM01 | NO | NO | USNM 521889 |
Rx017 | Indonesia | Ambon Island | RInMol001 | ISM01 | NO | NO | USNM 521890 |
Rx018 | PNG | Tabar Island | RPNGTa026 | INP07 | NO | NO | |
Rx019 | PNG | Tabar Island | RPNGTa027 | INP07 | NO | NO | |
Rx020 | Philippines | Mindanao | RPhMin002 | PM01 | PHBS | PHBS | USNM 144950 |
Rx021 | Philippines | Mindanao | RPhMin001 | PM03 | PHBS | NO | FMNH 166509 |
Rx022 | Philippines | Luzon | RPhLuz002 | PL01 | PHBS | PHBS | |
Rx022 | Philippines | Luzon | RPhLuz009 | PL01 | PHBS | PHBS | |
Rx022 | Philippines | Luzon | RPhLuz010 | PL01 | PHBS | PHBS | |

Rx022 | Philippines | Luzon | RPhLuz012 | PL01 | PHBS | PHBS | |
Rx022 | Philippines | Luzon | RPhLuz013 | PL01 | PHBS | PHBS | |
Rx022 | Philippines | Luzon | RPhLuz001 | PL01 | PHBS | PHBS | FMNH 183652 |
Rx022 | Philippines | Luzon | RPhLuz007 | PL01 | PHBS | PHBS | FMNH 183445 |
Rx022 | Philippines | Luzon | RPhLuz003 | PL01 | PHBS | PHBS | FMNH 183446 |
Rx022 | Philippines | Luzon | RPhLuz006 | PL01 | PHBS | PHBS | FMNH 183651 |
Rx022 | Indonesia | Borneo | RInBor008 | ISBo01 | PHBS | PHBS | M‐103838 | AY604205
Rx022 | Indonesia | Sulawesi | RInSul016 | ISS03 | PHBS | PHBS | USNM 199914 |
Rx022 | Indonesia | Sulawesi | RInSul003 | ISS03 | PHBS | PHBS | USNM 199915 |
Rx022 | USA | Guam | RGuam005 | MM02 | RO | PHBS | USNM 278369 |
Rx022 | USA | Guam | RGuam003 | MM02 | RO | PHBS | USNM 278372 |
Rx023 | Philippines | Negros Island | RPhNeg003 | PN01 | PHBS | PHBS | |
Rx023 | Philippines | Negros Island | RPhNeg015 | PN01 | PHBS | PHBS | |
Rx023 | Indonesia | Sulawesi | RInSul004 | ISS02 | PHBS | PHBS | USNM 199983 |
Rx023 | Indonesia | Sulawesi | RInSul005 | ISS01 | PHBS | PHBS | USNM 496931 |
Rx023 | Indonesia | Sulawesi | RInSul008 | ISS01 | PHBS | PHBS | USNM 496935 |
Rx023 | Indonesia | Sulawesi | RInSul007 | ISS01 | PHBS | PHBS | USNM 496945 |
Rx024 | Philippines | Negros Island | RPhNeg005 | PN01 | PHBS | PHBS | |
Rx024 | Philippines | Negros Island | RPhNeg007 | PN01 | PHBS | PHBS | |
Rx025 | Philippines | Luzon | RPhLuz008 | PL01 | PHBS | PHBS | |
Rx026 | Philippines | Negros Island | RPhNeg002 | PN01 | PHBS | PHBS | |

Rx027 | Philippines | Negros Island | RPhNeg001 | PN01 | PHBS | PHBS | |
Rx028 | Philippines | Luzon | RPhLuz005 | PL01 | PHBS | PHBS | |
Rx029 | Philippines | Negros Island | RPhNeg008 | PN01 | PHBS | PHBS | |
Rx030 | PNG | Woodlark Island | RPNGMuy005 | INP12 | NO | NO | |
Rx030 | PNG | Woodlark Island | RPNGMuy006 | INP12 | NO | NO | |
Rx030 | PNG | Woodlark Island | RPNGMuy007 | INP12 | NO | NO | |
Rx031 | Philippines | Negros Island | RPhNeg011 | PN01 | PHBS | PHBS | |
Rx031 | Philippines | Negros Island | RPhNeg014 | PN01 | PHBS | PHBS | |
Rx031 | Philippines | Negros Island | RPhNeg016 | PN01 | PHBS | PHBS | M‐207549 | AY604204
Rx031 | Indonesia | Batam Island | RInBat001 | ISBa01 | PHBS | PHBS | USNM 142127 |
Rx031 | Malaysia | Borneo | RMalBor005 | ISBo03 | PHBS | PHBS | USNM 292726 |
Rx031 | Malaysia | Borneo | RMalBor004 | ISBo03 | PHBS | PHBS | USNM 292727 |
Rx031 | Philippines | Palawan | RPhPal002 | PP02 | PHBS | PHBS | USNM 478028 |
Rx031 | Malaysia | Borneo | RMalBor001 | ISBo03 | PHBS | PHBS | USNM 489012 |
Rx032 | Cook Islands | Pukapuka | RCIPuk001 | RCI01 | RO | NO | |
Rx033 | Cook Islands | Pukapuka | RCIPuk005 | RCI01 | RO | RO | |
Rx034 | Samoa | Manua Islands | RSam002 | RSa01 | RO | RO | | EF186316.1
Rx035 | Samoa | Manua Islands | RSam001 | RSa01 | RO | RO | | EF186315.1
Rx036 | New Zealand | Chatham Island | RNZCha042 | RNZ02 | RO | RO | | EF186311.1
Rx037 | Cook Islands | Aitutaki | RCIAiu001 | RCI02 | RO | RO | | EF186299.1
Rx038 | PNG | Emira Island | RPNGEm001 | INP03 | NO | NO | |

Rx038 | PNG | Emira Island | RPNGEm002 | INP03 | NO | NO | |
Rx038 | PNG | Emira Island | RPNGEm003 | INP03 | NO | NO | |
Rx038 | PNG | Emira Island | RPNGEm004 | INP03 | NO | NO | |
Rx038 | PNG | Emira Island | RPNGEm005 | INP03 | NO | NO | |
Rx038 | PNG | Emira Island | RPNGEm006 | INP03 | NO | NO | |
Rx038 | PNG | Emira Island | RPNGEm007 | INP03 | NO | NO | |
Rx038 | PNG | Emira Island | RPNGEm008 | INP03 | NO | NO | |
Rx038 | PNG | Emira Island | RPNGEm010 | INP03 | NO | NO | |
Rx038 | PNG | Emira Island | RPNGEm011 | INP03 | NO | NO | |
Rx038 | PNG | Emira Island | RPNGEm012 | INP03 | NO | NO | |
Rx038 | PNG | Emira Island | RPNGEm013 | INP03 | NO | NO | |
Rx038 | PNG | Emira Island | RPNGEm014 | INP03 | NO | NO | |
Rx038 | PNG | Emira Island | RPNGEm015 | INP03 | NO | NO | |
Rx038 | PNG | Emira Island | RPNGEm017 | INP03 | NO | NO | |
Rx038 | PNG | Emira Island | RPNGEm020 | INP03 | NO | NO | |
Rx039 | New Caledonia | New Caledonia | RNCal005 | RNC01 | RO | RO | |
Rx039 | PNG | Tench Island | RPNGTe003 | INP02 | NO | RO | |
Rx039 | PNG | Tench Island | RPNGTe004 | INP02 | NO | RO | |
Rx039 | PNG | Tench Island | RPNGTe008 | INP02 | NO | RO | |
Rx039 | Vanuatu | Malekula | RVanMal004 | RV01 | RO | RO | |
Rx040 | PNG | Tench Island | RPNGTe002 | INP02 | NO | RO | |

Rx041 | PNG | Emira Island | RPNGEm009 | INP03 | NO | NO | |
Rx041 | PNG | Emira Island | RPNGEm016 | INP03 | NO | NO | |
Rx041 | PNG | Emira Island | RPNGEm018 | INP03 | NO | NO | |
Rx041 | PNG | Emira Island | RPNGEm019 | INP03 | NO | NO | |
Rx041 | PNG | Emira Island | RPNGEm021 | INP03 | NO | NO | |
Rx042 | PNG | New Guinea | ABTC43212 | NP03 | NO | NO | ABTC43212 | KJ155763
Rx042 | PNG | New Guinea | RPNGDoi001 | NP02 | NO | NO | ABTC43734 |
Rx043 | Fiji | Fiji | RFij001 | RF01 | RO | RO | | EF186302.1
Rx043 | Indonesia | Kei Besar | ABTC110267 | ISM05 | NO | RO | ABTC110267 | KJ155752
Rx043 | USA | Aguijan | ABTC112973 | MM01 | RO | RO | ABTC112973 | KJ155752
Rx044 | Hawaii | Hawaii | RHaw005 | RH01 | RO | RO | | EF186304.1
Rx045 | French Polynesia | Raiatea | RFPSIR002 | RFP02 | RO | RO | | EF186314.1
Rx046 | PNG | New Guinea | RPNGKos001 | NP01 | NO | NO | ABTC42509 |
Rx047 | Indonesia | Java | RInJav006 | ISJ02 | SMA | SMA | ABTC48011 | EF186306.1
Rx048 | Kiribati | Kirimati | RKir001 | RK01 | RO | RO | |
Rx048 | Kiribati | Kirimati | RKir002 | RK01 | RO | RO | |
Rx048 | Kiribati | Kirimati | RKir003 | RK01 | RO | RO | |
Rx049 | PNG | Lavongai | RPNGNH010 | INP04 | NO | NO | |
Rx050 | Malaysia | Borneo | RMalBor003 | ISBo03 | PHBS | PHBS | USNM 317242 |
Rx051 | Malaysia | Borneo | RMalBor006 | ISBo03 | PHBS | PHBS | USNM 317243 |
Rx052 | Malaysia | Borneo | RMalBor002 | ISBo03 | PHBS | PHBS | USNM 489013 |

Rx053 | Indonesia | Java | RInJav004 | ISJ01 | SMA | SMA | USNM 521911 |
Rx054 | Indonesia | Java | RInJav005 | ISJ01 | SMA | SMA | USNM 521913 |
Rx055 | Indonesia | Borneo | RInBor001 | ISBo02 | SMA | SMA | USNM 521878 |
Rx056 | Indonesia | Borneo | RInBor002 | ISBo02 | SMA | SMA | USNM 521883 |
Rx056 | Indonesia | Borneo | RInBor003 | ISBo01 | PHBS | SMA | USNM 521884 |
Rx057 | Indonesia | Borneo | RInBor004 | ISBo02 | SMA | SMA | USNM 521881 |
Rx058 | Indonesia | Borneo | RInBor005 | ISBo02 | SMA | SMA | USNM 521882 |
Rx059 | PNG | Goodenough Island | RPNGGoI001 | INP10 | NO | NO | M‐157931 | AY604208
Rx059 | Indonesia | New Guinea | RInIJay005 | NIP01 | NO | NO | USNM 277332 |
Rx060 | Indonesia | Java | RInJav001 | ISJ01 | SMA | SMA | USNM 521915 |
Rx061 | Indonesia | Java | RInJav002 | ISJ01 | SMA | SMA | USNM 521912 |
Rx062 | Indonesia | Java | RInJav003 | ISJ01 | SMA | SMA | USNM 521914 |
Rx063 | Indonesia | Sulawesi | RInSul001 | ISS04 | SMA | SMA | USNM 257626 |
Rx064 | Indonesia | Sulawesi | RInSul006 | ISS01 | PHBS | PHBS | USNM 496932 |
Rx064 | Indonesia | Sulawesi | RInSul009 | ISS01 | PHBS | PHBS | USNM 496949 |
Rx065 | Philippines | Palawan | RPhPal003 | PP01 | PHBS | PHBS | USNM 478050 |
Rx065 | Philippines | Palawan | RPhPal004 | PP01 | PHBS | PHBS | USNM 478051 |
Rx065 | Philippines | Palawan | RPhPal006 | PP01 | PHBS | PHBS | USNM 478088 |
Rx066 | Philippines | Palawan | RPhPal005 | PP01 | PHBS | PHBS | USNM 478087 |
Rx067 | FSM | Kapingamarangi | RFSMKap002 | MFSM01 | RO | RO | |
Rx068 | FSM | Kapingamarangi | RFSMKap001 | MFSM01 | RO | RO | |

Rx069 | USA | Saipan | RNMaSai001 | MM01 | RO | RO | |
Rx070 | Solomon Islands | Bougainville Island | RPNGSol001 | INSI01 | NO | NO | M‐79815 |
Rx071 | Indonesia | Halmahera | RInHal002 | ISM02 | NO | RO | M‐101299 |
Rx072 | Indonesia | Sulawesi | RInSul018 | ISS03 | PHBS | PHBS | M‐153015 | AY604203
Rx072 | Indonesia | Sulawesi | RInSul017 | ISS03 | PHBS | PHBS | M‐153028 | AY604203
Rx073 | PNG | Goodenough Island | RPNGGoI002 | INP10 | NO | NO | M‐157925 | AY604211
Rx075 | PNG | Goodenough Island | RPNGGoI003 | INP10 | NO | NO | M‐157932 | AY604210
Rx076 | Indonesia | Batam Island | RInBat004 | ISBa01 | PHBS | PHBS | USNM 142126 |
Rx078 | Indonesia | Halmahera | RInHal005 | ISM02 | NO | RO | M‐267679 |
Rx079 | Indonesia | Halmahera | RInHal006 | ISM02 | NO | NO | M‐267691 |
Rx080 | New Caledonia | New Caledonia | RNCal007 | RNC01 | RO | RO | |
Rx080 | USA | Guam | RGuam007 | MM02 | RO | RO | |
Rx080 | USA | Rota | ABTC112981 | MM03 | RO | RO | ABTC112981 | KJ155753
Rx080 | USA | Rota | ABTC112982 | MM03 | RO | RO | ABTC112982 | KJ155753
Rx081 | New Caledonia | New Caledonia | RNCal011 | RNC01 | RO | RO | |
Rx083 | Indonesia | Flores | RInFLB004 | ISF01 | SMA | SMA | |
Rx083 | Indonesia | Flores | ABTC121003 | ISF03 | SMA | SMA | ABTC121003 | KJ155773
Rx083 | Indonesia | Flores | ABTC121685 | ISF03 | SMA | SMA | ABTC121685 | KJ155773
Rx083 | Indonesia | Flores | RInFLo004 | ISF01 | SMA | SMA | ABTC121686 | KJ155770
Rx083 | Australia | Adele Island | ROzAI001 | ISOz01 | SMA | SMA | BP01952 |
Rx083 | Australia | Adele Island | ROzAI002 | ISOz01 | SMA | SMA | BP01953 |

Rx083 | Australia | Adele Island | ROzAI003 | ISOz01 | SMA | SMA | BP01954 |
Rx083 | Australia | Adele Island | ROzAI004 | ISOz01 | SMA | SMA | BP01955 |
Rx083 | Australia | Adele Island | ROzAI005 | ISOz01 | SMA | SMA | BP01956 |
Rx084 | Vanuatu | Malekula | RVanMal001 | RV01 | RO | RO | |
Rx085 | Vanuatu | Malekula | RVanMal002 | RV01 | RO | RO | |
Rx086 | Vanuatu | Malekula | RVanMal003 | RV01 | RO | RO | |
Rx087 | Vanuatu | Malekula | RVanMal005 | RV01 | RO | RO | |
Rx088 | Solomon Islands | Reef‐Santa Cruz | RSolRSC001 | RSI03 | RO | NO | |
Rx088 | Solomon Islands | Reef‐Santa Cruz | RSolRSC002 | RSI03 | RO | NO | |
Rx089 | Philippines | Palawan | RPhPal007 | PP02 | PHBS | SMA | USNM 478019 |
Rx090 | Cambodia | | RCam009 | SC02 | SEA | SEA | |
Rx090 | Cambodia | | RCam010 | SC02 | SEA | SEA | |
Rx091 | Cambodia | | RCam015 | SC01 | SEA | SEA | |
Rx091 | Cambodia | | RCam017 | SC01 | SEA | SEA | |
Rx091 | Cambodia | | RCam019 | SC01 | SEA | SEA | |
Rx091 | Cambodia | | RCam020 | SC01 | SEA | SEA | |
Rx092 | Cambodia | | RCam012 | SC02 | SEA | NO | |
Rx093 | Thailand | | RTha026 | ST01 | SEA | SEA | |
Rx094 | Indonesia | Borneo | RInBor009 | ISBo01 | PHBS | PHBS | M‐103831 | AY604204
VTH18 | Indonesia | Timor | ET13 | ISF02 | SMA | SMA | | KJ155755
VTH18 | Indonesia | Timor | ET15 | ISF02 | SMA | SMA | | KJ155755

VTH18 | Indonesia | Timor | ET2 | ISF02 | SMA | SMA | | KJ155755
VTH18 | Indonesia | Timor | ET7 | ISF02 | SMA | SMA | | KJ155755
VTH18 | Indonesia | Timor | ET8 | ISF02 | SMA | SMA | | KJ155754
VTH18 | Indonesia | Timor | ET9 | ISF02 | SMA | SMA | | KJ155755
VTH19 | Indonesia | Flores | ABTC121002 | ISF03 | SMA | SMA | ABTC121002 | KJ155772
VTH20 | Indonesia | Flores | ABTC121684 | ISF03 | SMA | SMA | ABTC121684 | KJ155760
VTH21 | Indonesia | Yamdena | ABTC110269 | ISM04 | NO | SMA | ABTC110269 | KJ155761
VTH25 | Indonesia | Timor | ET6 | ISF02 | SMA | SMA | | KJ155756
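A table of this kind lends itself to quick summaries, such as how many samples carry each haplotype and in which countries it was recorded. The following Python sketch illustrates one way to do this. It uses only a handful of rows copied from Table A 1 (haplotype, country and lab_ID); the subset, the variable names and the function name are illustrative assumptions, not part of the original analysis.

```python
from collections import defaultdict

# Illustrative subset of Table A 1: (haplotype, country, lab_ID).
ROWS = [
    ("Rx001", "Cook Islands", "RCIAiu002"),
    ("Rx001", "New Zealand", "RNZKap041"),
    ("Rx001", "PNG", "RPNGMa001"),
    ("Rx005", "Burma", "RMya003"),
    ("Rx005", "Thailand", "RTha010"),
    ("Rx010", "Philippines", "RPhNeg004"),
    ("Rx010", "PNG", "RPNGNH001"),
    ("Rx010", "FSM", "RFSMYap001"),
]


def summarise(rows):
    """Return {haplotype: (sample count, sorted list of countries)}."""
    counts = defaultdict(int)
    countries = defaultdict(set)
    for haplotype, country, _lab_id in rows:
        counts[haplotype] += 1
        countries[haplotype].add(country)
    return {h: (counts[h], sorted(countries[h])) for h in counts}


summary = summarise(ROWS)
for haplotype in sorted(summary):
    n_samples, where = summary[haplotype]
    print(f"{haplotype}: {n_samples} samples from {', '.join(where)}")
```

Run on the full table, the same grouping would show at a glance which haplotypes are widespread and which are confined to a single country.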

REFERENCES

Addison, D.J. & Matisoo-Smith, E. (2010) Rethinking Polynesians origins: a West-Polynesia Triple-I model. Archaeology in Oceania, 45, 1-12.
Adelaar, A. (1995) Asian roots of the Malagasy: a linguistic perspective. Bijdragen tot de Taal-, Land- en Volkenkunde, 151, 325-356.
Adelaar, A. (2006) The Indonesian migrations to Madagascar: making sense of the multidisciplinary evidence. Austronesian diaspora and the ethnogeneses of people in Indonesian archipelago: proceedings of the international symposium (ed. by T. Simanjuntak, I.H.E. Pojoh and M. Hisyam), p. 205. Yayasan Obor Indonesia.
Akaike, H. (1973) Maximum likelihood identification of Gaussian autoregressive moving average models. Biometrika, 60, 255-265.
Allen, C.R. (1965) Transcurrent faults in continental areas.
Allen, J. (1984) In Search of the Lapita homeland: reconstructing the prehistory of the Bismarck Archipelago. The Journal of Pacific History, 19, 186-201.
Allen, J. (1989) The Lapita homeland some new data and an interpretation. Journal of the Polynesian Society, 98, 129-146.
Allen, J. (1996) The pre-austronesian settlement of Island Melanesia: implications for Lapita archaeology. Transactions of the American Philosophical Society, 86, 11-27.
Allen, J., Gosden, C. & White, J.P. (1989) Human Pleistocene adaptations in the tropical island Pacific: recent evidence from New Ireland, a Greater Australian outlier. Antiquity, 63, 548-561.
Allen, J.S. (1991) Introduction. Report of the Lapita homeland project (ed. by J.S. Allen and C. Gosden), pp. 1-8. The Australian National University, Canberra.
Allentoft, M.E., Collins, M., Harker, D., Haile, J., Oskam, C.L., Hale, M.L., Campos, P.F., Samaniego, J.A., Gilbert, M.T.P., Willerslev, E., Zhang, G., Scofield, R.P., Holdaway, R.N. & Bunce, M. (2012) The half-life of DNA in bone: measuring decay kinetics in 158 dated fossils. Proceedings of the Royal Society B: Biological Sciences, 279, 4724-4733.
Altmann, R. (1890) Die Elementarorganismen und ihre Beziehungen zu den Zellen, 1 edn. Veit & Co, Leipzig.
Ambrose, W.R. (1978) The loneliness of the long distance trader in Melanesia. Mankind, 11, 326-333.
Anderson, A. (2001a) Mobility models of Lapita migration. The Archaeology of Lapita dispersal in Oceania. Papers from the Fourth Lapita Conference June 2000, Canberra, Australia (ed. by G.R. Clark, A.J. Anderson and T. Vunidilo), pp. 15-23. Terra Australis 17. Canberra, Pandanus Books, Research School of Pacific and Asian Studies, Australian National University.


Anderson, A., Bedford, S., Clark, G.R., Lilley, I., Sand, C., Summerhayes, G.R. & Torrence, R. (2001) An inventory of Lapita sites containing dentate-stamped pottery. The Archaeology of Lapita dispersal in Oceania. Papers from the Fourth Lapita Conference June 2000, Canberra, Australia (ed. by G.R. Clark, A.J. Anderson and T. Vunidilo), pp. 1-13. Terra Australis 17. Canberra, Pandanus Books, Research School of Pacific and Asian Studies, Australian National University.
Anderson, A.J. (2001b) Towards the sharp end: The form and performance of prehistoric voyaging canoes. Proceedings of the fifth international conference on Easter Island and the Pacific (ed. by C.M. Stevenson, G. Lee and F.J. Morin), p. 35. Easter Island Foundation, Los Osos.
Anderson, S., Bankier, A.T., Barrell, B.G., de Bruijn, M.H.L., Coulson, A.R., Drouin, J., Eperon, I.C., Nierlich, D.P., Roe, B.A., Sanger, F., Schreier, P.H., Smith, A.J.H., Staden, R. & Young, I.G. (1981) Sequence and organization of the human mitochondrial genome. Nature, 290, 457-465.
Aplin, K.P., Suzuki, H., Chinen, A.A., Chesser, R.T., ten Have, J., Donnellan, S.C., Austin, J., Frost, A., Gonzalez, J.P., Herbreteau, V., Catzeflis, F., Soubrier, J., Fang, Y.-P., Robins, J., Matisoo-Smith, E., Bastos, A.D.S., Maryanto, I., Sinaga, M.H., Denys, C., Van Den Bussche, R.A., Conroy, C., Rowe, K. & Cooper, A. (2011) Multiple geographic origins of commensalism and complex dispersal history of black rats. PLoS ONE, 6, e26357.
Armitage, S.J., Jasim, S.A., Marks, A.E., Parker, A.G., Usik, V.I. & Uerpmann, H.-P. (2011) The southern route “out of Africa”: evidence for an early expansion of modern humans into Arabia. Science, 331, 453-456.
Ashby, S. (2004) Understanding human movement and interaction through the movement of animals and animal products. 9th Conference of the International Council of Archaeozoology (ed. by M. Mondini, S. Muñoz and S. Wickler), pp. 4-9. Durham.
Atkinson, I.A.E. (1973) Spread of the shiprat in New Zealand. Journal of The Royal Society of New Zealand, 3, 457-472.
Atkinson, I.A.E. (1985) The spread of commensal species of Rattus to oceanic islands and their effects on island avifaunas. Conservation of island birds. ICBP Technical Publication No.3 (ed. by P.J. Moors), pp. 35-81.
Atkinson, I.A.E. & Atkinson, T.J. (2000) Land vertebrates as invasive species on the islands of the South Pacific Regional Environment Programme. Invasive species in the Pacific: A technical review and draft regional strategy (ed. by G. Sherley), pp. 19-84. South Pacific Regional Environment Programme, Samoa.
Avise, J.C. (1986) Mitochondrial DNA and the evolutionary genetics of higher animals. Philosophical Transactions of the Royal Society of London B, 312, 325-42.
Avise, J.C. (1991) Ten unorthodox perspectives on evolution prompted by comparative population genetic findings on mitochondrial DNA. Annual Review of Genetics, 25, 45-69.
Avise, J.C., Arnold, J., Ball, R.M., Bermingham, E., Lamb, T., Neigel, J.E., Reeb, C.A. & Saunders, N.C. (1987) Intraspecific Phylogeography: The Mitochondrial DNA Bridge Between Population Genetics and Systematics. Annual Review of Ecology and Systematics, 18, 489-522.
Baele, G. (2012) Interpreting AICM results. Available at: https://groups.google.com/forum/#!msg/beast-users/7-3_XRT7lHs/mGhbXy5NYMIJ (accessed 12.12 2014).
Baele, G., Lemey, P., Bedford, T., Rambaut, A., Suchard, M.A. & Alekseyenko, A.V. (2012) Improving the accuracy of demographic and molecular clock model comparison while accommodating phylogenetic uncertainty. Molecular Biology and Evolution, 29, 2157-2167.
Bahlo, M. & Griffiths, R.C. (2000) Inference from gene trees in a subdivided population. Theoretical Population Biology, 57, 79-95.
Ballard, J.W.O. & Kreitman, M. (1995) Is mitochondrial DNA a strictly neutral marker? Trends in Ecology & Evolution, 10, 485-488.
Bär, W., Kratzer, A., Mächler, M. & Schmid, W. (1988) Postmortem stability of DNA. Forensic Science International, 39, 59-70.
Barber, P.H., Palumbi, S.R., Erdmann, M.V. & Moosa, M.K. (2000) Biogeography: a marine Wallace's line? Nature, 406, 692-693.
Barr, C.M., Neiman, M. & Taylor, D.R. (2005) Inheritance and recombination of mitochondrial genomes in plants, fungi and animals. New Phytologist, 168, 39-50.
Beaumont, M.A. & Rannala, B. (2004) The Bayesian revolution in genetics. Nature Reviews, 5, 254-264.
Beerli, P. (1998) Estimation of migration rates and population sizes in geographically structured populations. Advances in Molecular Ecology (ed. by G.R. Carvalho), pp. 39-53. IOS Press, Inc, Burke.
Beerli, P. (2006) Comparison of Bayesian and maximum-likelihood inference of population genetic parameters. Bioinformatics, 22, 341-345.
Beerli, P. & Felsenstein, J. (1998) Migrate - Maximum likelihood estimation of migration rates and population numbers. Program and documentation distributed by the authors. Department of Genetics, University of Washington.
Beerli, P. & Felsenstein, J. (1999) Maximum likelihood estimation of migration rates and effective population numbers in two populations using a coalescent approach. Genetics, 152, 763-773.
Beerli, P. & Felsenstein, J. (2001) Maximum likelihood estimation of a migration matrix and effective population sizes in n subpopulations by using a coalescent approach. Proceedings of the National Academy of Sciences USA, 98, 4563-4568.
Behar, D.M., Rosset, S., Blue-Smith, J., Balanovsky, O., Tzur, S., Comas, D., Mitchell, R.J., Quintana-Murci, L., Tyler-Smith, C., Wells, R.S. & The Genographic Consortium (2007) The Genographic Project public participation mitochondrial DNA database. PLoS Genetics, 3, e104. Bellwood, P., White, P., Larson, G., Dobney, K., Albarella, U., Matisoo-Smith, E., Robins, J., Lowden, S., Rowley-Conwy, P. & Andersson, L. (2005) Domesticated pigs in eastern Indonesia. Science, 309, 381-381. Bellwood, P.S. (1978a) The prehistory of Melanesia. Man's conquest of the Pacific (ed. by P. Bellwood), pp. 233-279. Collins, Hong Kong. Bellwood, P.S. (1978b) Man's conquest of the Pacific: the prehistory of Southeast Asia and Oceania. Collins, Hong Kong. Bellwood, P.S. (1978c) The origins of the Polynesians. The Polynesians: prehistory of an island people, pp. 45-57. Thames and Hudson, London. Bellwood, P.S. (1997) The prehistory of the Indo-Malaysian Archipelago, 2nd edn. University of Hawaii Press, Honolulu. Bellwood, P.S. (2001) Early agriculturalist population diasporas? Farming, languages, and genes. Annual Review of Anthropology, 30, 181-207. Bellwood, P.S., Chambers, G., Ross, M. & Hung, H.-c. (2011) Are ‘cultures’ inherited? Multidisciplinary perspectives on the origins and migrations of Austronesian-speaking peoples prior to 1000 BC. Investigating Archaeological Cultures (ed. by B.W. Roberts and M. Vander Linden), pp. 321-354. Springer, New York.

Bergstrom, C.T. & Pritchard, J. (1998) Germline bottlenecks and the evolutionary maintenance of mitochondrial genomes. Genetics, 149, 2135-2146. Best, E. (1942) Forest lore of the Maori, New Zealand Electronic Text Centre edn, Wellington, New Zealand. Available at: http://www.nzetc.org/tm/scholarly/tei-BesFore.html. Bettesworth, D.J. (1972) Rattus exulans on Red Mercury. Tane: Auckland University Field Club Journal, 18, 117-118. Black, F. (1975) Infectious diseases in primitive societies. Science, 187, 515-518. Blench, R. (2007) New palaeozoogeographical evidence for the settlement of Madagascar. Azania, 42, 69-82. Blench, R. (2012) Almost everything you believed about the Austronesians isn’t true. Crossing Borders: Selected Papers from the 13th International Conference of the European Association of Southeast Asian Archaeologists (ed. by M.L. Tjoa-Bonatz, A. Reinecke and D. Bonatz), pp. 128-148. NUS Press, Singapore. Blust, R. (1984) The Austronesian homeland: a linguistic perspective. Asian Perspectives, 26, 45-67. Blust, R. (2009/2013) The Austronesian languages. The Australian National University, Research School of Pacific and Asian Studies, Canberra. Bodmer, W.F. & Cavalli-Sforza, L.L. (1968) A migration matrix model for the study of random genetic drift. Genetics, 59, 565-592. Boom, R., Sol, C.J., Salimans, M.M., Jansen, C.L., Wertheim-van Dillen, P.M. & van der Noordaa, J. (1990) Rapid and simple method for purification of nucleic acids. Journal of Clinical Microbiology, 28, 495-503. Bouckaert, R., Heled, J., Kühnert, D., Vaughan, T., Wu, C.-H., Xie, D., Suchard, M.A., Rambaut, A. & Drummond, A.J. (2014) BEAST 2: A Software Platform for Bayesian Evolutionary Analysis. PLoS Computational Biology, 10, e1003537. Bradman, H., Grewe, P. & Appleton, B. (2011) Direct comparison of mitochondrial markers for the analysis of swordfish population structure. Fisheries Research, 109, 95-99. Bramley, G.N. (2014) Habitat use by kiore (Rattus exulans) and Norway rats (R.
norvegicus) on Kapiti Island, New Zealand. New Zealand Journal of Ecology, 38. Brandstetter, R. (1893) Die Beziehungen des Malagasy zum Malaiischen. Festschrift zur Eröffnung des neuen Kantonsschul-Gebäudes in Luzern, pp. 65-107. Räber, Luzern. Bromham, L., Eyre-Walker, A., Smith, N.H. & Maynard Smith, J. (2003) Mitochondrial Steve: paternal inheritance of mitochondria in humans. Trends in Ecology & Evolution (Personal edition), 18, 2-4. Brooks, D., Thorson, T. & Mayes, M. (1981) Freshwater stingrays (Potamotrygonidae) and their helminth parasites: testing hypotheses of evolution and coevolution. Advances in cladistics: proceedings of the first meeting of the Willi Hennig society (ed. by V.A. Funk and D.R. Brooks), pp. 147-176. Brouat, C., Tollenaere, C., Estoup, A., Loiseau, A., Sommer, S., Soanandrasana, R., Rahalison, L., Rajerison, M., Piry, S., Goodman, S.M. & Duplantier, J.M. (2014) Invasion genetics of a human commensal rodent: the Rattus rattus in Madagascar. Molecular Ecology, 23, 4153-4167. Brown, G.G., Gadaleta, G., Pepe, G., Saccone, C. & Sbisà, E. (1986) Structural conservation and variation in the d-loop-containing region of vertebrate mitochondrial DNA. Journal of Molecular Biology, 192, 503-511. Brown, R.P. & Yang, Z. (2011) Rate variation and estimation of divergence times using strict and relaxed clocks. BMC Evolutionary Biology, 11, 271.

Brown, W.M., George Jr., M. & Wilson, A.C. (1979) Rapid evolution of animal mitochondrial DNA. Proceedings of the National Academy of Sciences USA, 76, 1967-1971. Buller, W. (1870) On the New Zealand Rat. Transactions and Proceedings of the Royal Society of New Zealand, 3. Burley, D.V. (2001) Comment on J.E. Terrell, K.M. Kelly and P. Rainbird, 2001. Foregone conclusions? In search of "Papuans" and "Austronesians". Current Anthropology, 42, 109-110. Burney, D.A., Burney, L.P., Godfrey, L.R., Jungers, W.L., Goodman, S.M., Wright, H.T. & Jull, A.J.T. (2004) A chronology for late prehistoric Madagascar. Journal of Human Evolution, 47, 25-63. Campbell, D.J., Moller, H., Ramsay, G.W. & Wait, J.C. (1984) Observations of foods of Kiore (Rattus exulans) found in husking stations on northern offshore islands of New Zealand. New Zealand Journal of Ecology, 7, 131-138. Cann, R., Stoneking, M. & Wilson, A. (1987) Mitochondrial DNA and human evolution. Nature, 325, 31-36. Cann, R.L. & Wilson, A.C. (1983) Length mutations in human mitochondrial DNA. Genetics, 104, 699-711. Capelli, C., Wilson, J.F., Richards, M.B., Stumpf, M.P.H., Gratrix, F., Oppenheimer, S.J., Underhill, P., Pascali, V.L., Ko, T.-M. & Goldstein, D.B. (2001) A predominantly indigenous paternal heritage for the Austronesian-speaking peoples of insular Southeast Asia and Oceania. American Journal of Human Genetics, 68, 432-443. Capps, G.J., Samuels, D.C. & Chinnery, P.F. (2003) A model of the nuclear control of mitochondrial DNA replication. Journal of Theoretical Biology, 221, 565-583. Caputo, R. (2007) Sea-level curves: perplexities of an end-user in morphotectonic applications. Global and Planetary Change, 57, 417-423. Carré, M., Sachs, J.P., Purca, S., Schauer, A.J., Braconnot, P., Falcón, R.A., Julien, M. & Lavallée, D. (2014) Holocene history of ENSO variance and asymmetry in the eastern tropical Pacific. Science, 345, 1045-1048. Carson, M.T. (2013) Austronesian migrations and developments in Micronesia.
Journal of Austronesian Studies, 4, 25-52. Carson, M.T., Hung, H.-c., Summerhayes, G. & Bellwood, P. (2013) The pottery trail from Southeast Asia to Remote Oceania. The Journal of Island and Coastal Archaeology, 8, 17-36. Cassin, J. (1858) Atlas mammalogy and ornithology. C. Sherman & Son, Philadelphia. Castro, J.A., Picornell, A. & Ramon, M. (1998) Mitochondrial DNA: a tool for populational genetics studies. International Microbiology, 1, 327-332. Cavalli-Sforza, L.L. & Edwards, A.W.F. (1967) Phylogenetic analysis: models and estimation procedures. Evolution, 32, 550-570. Chang, C.-S., Chen, C., Berthouly‐Salazar, C., Chazara, O., Lee, Y., Chang, C., Chang, K., Bed’Hom, B. & Tixier‐Boichard, M. (2012) A global analysis of molecular markers and phenotypic traits in local chicken breeds in Taiwan. Animal Genetics, 43, 172-182. Clark, J.R., Ree, R.H., Alfaro, M.E., King, M.G., Wagner, W.L. & Roalson, E.H. (2008) A comparative study in ancestral range reconstruction methods: retracing the uncertain histories of insular lineages. Systematic Biology, 57, 693-707. Clayton, D.A. (2000) Transcription and replication of mitochondrial DNA. Human Reproduction, 15, 11-17. Colangelo, P., Abiadh, A., Aloise, G., Amori, G., Capizzi, D., Vasa, E., Annesi, F. & Castiglia, R. (2014) Mitochondrial phylogeography of the black rat supports a
single invasion of the western . NCBI Accessions: LN554990 - LN555005. Coller, M. (2009) SahulTime: rethinking archaeological representation in the digital age. Archaeologies: Journal of the World Archaeological Congress, 5, 110-123. Cottam, C. (1948) General notes: aquatic habits of the Norway rat. Journal of Mammalogy, 29, 299. Cotton, C.A. (1958) The rim of the Pacific. The Geographical Journal, 124, 223-231. Cowdry, E.V. & Olitsky, P.K. (1922) Differences between mitochondria and bacteria. Cox, M.P. (2003) Genetic patterning at Austronesian contact zones. PhD, University of Otago, Dunedin. Cox, M.P. (2005) Indonesian mitochondrial DNA and its opposition to a Pleistocene era origin of proto-Polynesians in Island Southeast Asia. Human Biology, 77, 179-188. Cox, M.P., Redd, A.J., Karafet, T.M., Ponder, C.A., Lansing, J.S., Sudoyo, H. & Hammer, M.F. (2007) A Polynesian motif on the Y chromosome: population structure in remote Oceania. Human biology, 79, 525-535. Cranbrook, E.o. (2000) Northern Borneo environments of the past 40,000 years: archaeozoological evidence. Sarawak Museum Journal, 55, 61. Cummins, J.M. (2000) Fertilization and elimination of the paternal mitochondrial genome. Human reproduction, 15, 92-101. Darriba, D., Taboada, G.L., Doallo, R. & Posada, D. (2012) jModelTest 2: more models, new heuristics and parallel computing. Nature Methods, 9, 772-772. Darwin, C. (1968) The variation of animals and plants under domestication. Orange Judd & Company, New York. Darwin, C. & Wallace, A.R. (1858) On the tendency of species to form varieties; and on the perpetuation of varieties and species by natural means of selection. Journal of the Proceedings of the Linnean Society, 3, 45-62. Davis, L.S. (1979) Social rank behaviour in a captive colony of Polynesian rats. New Zealand Journal of Zoology, 6, 371-380. Delfin, F., Myles, S., Choi, Y., Hughes, D., Illek, R., van Oven, M., Pakendorf, B., Kayser, M. & Stoneking, M. 
(2012) Bridging Near and Remote Oceania: mtDNA and NRY variation in the Solomon Islands. Molecular Biology and Evolution, 29, 545-564. Denham, T.P., Haberle, S.G., Lentfer, C., Fullagar, R., Field, J., Therin, M., Porch, N. & Winsborough, B. (2003) Origins of agriculture at Kuk swamp in the highlands of New Guinea. Science, 301, 189-193. Diamond, J. (1988) Express train to Polynesia. Nature, 336, 307-308. Diamond, J. (1997) Guns, germs, and steel: the fates of human societies, London. Diamond, J.M. (2000) Linguistics: Taiwan's gift to the world. Nature, 403, 709-710. Dieffenbach, E. (1843) Travels in New Zealand II: With contributions to the geography, geology, botany, and natural history of that country. John Murray, London. Ding, Z.-L., Oskarsson, M., Ardalan, A., Angleby, H., Dahlgren, L.-G., Tepeli, C., Kirkness, E., Savolainen, P. & Zhang, Y.-P. (2012) Origins of domestic dog in Southern East Asia is supported by analysis of Y-chromosome DNA. Heredity, 108, 507-514. Donoghue, M. & Sanderson, M. (1992) The suitability of molecular and morphological evidence in reconstructing plant phylogeny. Molecular Systematics of Plants (ed. by P. Soltis, D. Soltis and J. Doyle), pp. 340-368. Chapman & Hall, New York. Donohue, M. & Denham, T. (2010) Farming and language in Island Southeast Asia: reframing Austronesian history. Current Anthropology, 51, 223-256. Donohue, M. & Denham, T. (2011) Languages and genes attest different histories in Island Southeast Asia. Oceanic Linguistics, 50, 536-542.
Drummond, A. & Rodrigo, A.G. (2000) Reconstructing genealogies of serial samples under the assumption of a molecular clock using serial-sample UPGMA. Molecular Biology and Evolution, 17, 1807-1815. Drummond, A.J. & Bouckaert, R.R. (2014/2015) Bayesian evolutionary analysis with BEAST 2. Cambridge University Press. Drummond, A.J., Nicholls, G.K., Rodrigo, A.G. & Solomon, W. (2002) Estimating mutation parameters, population history and genealogy simultaneously from temporally spaced sequence data. Genetics, 161, 1307-1320. Drummond, A.J., Pybus, O.G., Rambaut, A., Forsberg, R. & Rodrigo, A.G. (2003) Measurably evolving populations. Trends in Ecology & Evolution, 18, 481-488. Duchêne, S., Lanfear, R. & Ho, S.Y.W. (2014) The impact of calibration and clock-model choice on molecular estimates of divergence times. Molecular Phylogenetics and Evolution, 78, 277-289. Dumont d’Urville, J.-S.-C. (1832/2003) On the islands of the great ocean. The Journal of Pacific History, 38, 163-174. Dupanloup, I., Schneider, S. & Excoffier, L. (2002) A simulated annealing approach to define the genetic structure of populations. Molecular Ecology, 11, 2571-2581. Dwyer, P. (1978) A Study of Rattus exulans (Peale) (Rodentia : ) in the New Guinea Highlands. Wildlife Research, 5, 221-248. Egoscue, H.J. (1970) A laboratory colony of the Polynesian rat, Rattus exulans. Journal of Mammalogy, 51, 261-266. Ellerman, J.R. (1941) The families and genera of living rodents / by J.R. Ellerman ; with a list of named forms (1758-1936) by R.W. Hayman and G.W.C.Holt. British Museum, London. Esser, C., Ahmadinejad, N., Wiegand, C., Rotte, C., Sebastiani, F., Gelius-Dietrich, G., Henze, K., Kretschmann, E., Richly, E., Leister, D., Bryant, D., Steel, M.A., Lockhart, P.J., Penny, D. & Martin, W. (2004) A genome phylogeny for mitochondria among α-proteobacteria and a predominantly eubacterial ancestry of yeast nuclear genes. Molecular Biology and Evolution, 21, 1643-1660. 
Evin, A., Flink, L.G., Bălăşescu, A., Popovici, D., Andreescu, R., Bailey, D., Mirea, P., Lazăr, C., Boroneanţ, A. & Bonsall, C. (2015) Unravelling the complexity of domestication: a case study using morphometrics and ancient DNA analyses of archaeological pigs from Romania. Philosophical Transactions of the Royal Society B: Biological Sciences, 370, 20130616. Ewing, G. & Rodrigo, A. (2006) Coalescent-based estimation of population parameters when the number of demes changes over time. Mol Biol Evol, 23, 988-996. Ewing, G., Nicholls, G. & Rodrigo, A. (2004) Using temporally spaced sequences to simultaneously estimate migration rates, mutation rate and population sizes in measurably evolving populations. Genetics, 168, 2407-2420. Excoffier, L. (2004) Patterns of DNA sequence diversity and genetic structure after a range expansion: lessons from the infinite‐island model. Molecular Ecology, 13, 853-864. Excoffier, L. & Yang, Z. (1999) Substitution rate variation among sites in mitochondrial hypervariable region I of humans and chimpanzees. Molecular Biology and Evolution, 16, 1357-1368. Excoffier, L. & Lischer, H.E.L. (2010) Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows. Molecular Ecology Resources, 10, 564-567. Excoffier, L., Smouse, P.E. & Quattro, J.M. (1992) Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data. Genetics, 131, 479-491.

Excoffier, L., Laval, G. & Schneider, S. (2005) Arlequin 3.0: an integrated software package for population genetics data analysis. Evolutionary Bioinformatics Online, 1, 47-50. Fabre, P.-H., Pagès, M., Musser, G.G., Fitriana, Y.S., Fjeldså, J., Jennings, A., Jønsson, K.A., Kennedy, J., Michaux, J., Semiadi, G., Supriatna, N. & Helgen, K.M. (2013) A new genus of rodent from Wallacea (Rodentia: Muridae: Murinae: Rattini), and its implication for biogeography and Indo-Pacific Rattini systematics. Zoological Journal of the Linnean Society, 169, 408-447. Fall, M.W., Medina, A.B. & Jackson, W.B. (1971) Feeding patterns of Rattus rattus and Rattus exulans on Eniwetok Atoll, Marshall Islands. Journal of Mammalogy, 52, 69-76. Feldman, M.W. & Christiansen, F.B. (1974) The effect of population subdivision on two loci without selection. Genetical Research, 24, 151-162. Felsenstein, J. (1973) Maximum-likelihood estimation of evolutionary trees from continuous characters. American Journal of Human Genetics, 25, 471-492. Felsenstein, J. (1974) The evolutionary advantage of recombination. Genetics, 78, 737-756. Felsenstein, J. (1976) The theoretical population genetics of variable selection and migration. Annual Review of Genetics, 10, 253-280. Felsenstein, J. (1978) Cases in which parsimony or compatibility methods will be positively misleading. Systematic Zoology, 27, 401-410. Felsenstein, J. (1982) How can we infer geography and history from gene frequencies? Journal of Theoretical Biology, 96, 9-20. Felsenstein, J. (1984) Distance methods for inferring phylogenies: a justification. Evolution, 38, 16-24. Felsenstein, J. (1985) Confidence limits of phylogenies: an approach using the bootstrap. Evolution, 39, 783-791. Felsenstein, J. (1987-2007) Theoretical evolutionary genetics. Available at: http://evolution.genetics.washington.edu/pgbook/pgbook.html (accessed 2007). Felsenstein, J. (1988) Phylogenies from molecular sequences: inference and reliability.
Annual Review of Genetics, 22, 521-565. Flannery, T., Bellwood, P., White, P., Moore, A., Boeadi & Nitihaminoto, G. (1995) Fossil marsupials (Macropodidae, Peroryctidae) and other mammals of Holocene age from Halmahera, North Moluccas, Indonesia. Alcheringa, 19, 17-25. Flannery, T.F., Kirch, P.V., Specht, J. & Spriggs, M. (1988) Holocene mammal faunas from archaeological sites in Island Melanesia. Archaeology in Oceania, 23, 89-94. Fraenkel-Conrat, H. & Mecham, D.K. (1949) The reaction of formaldehyde with proteins VII. Demonstration of intermolecular cross-linking by means of osmotic pressure measurements. Journal of Biological Chemistry, 177, 477-486. Frankham, R., Ballou, J.D. & Briscoe, D.A. (2002) Introduction to conservation genetics. Cambridge University Press, Cambridge. Frantz, L.A.F., Madsen, O., Megens, H.-J., Groenen, M.A.M. & Lohse, K. (2014) Testing models of speciation from genome sequences: divergence and asymmetric admixture in Island South-East Asian Sus species during the Plio-Pleistocene climatic fluctuations. Molecular Ecology, 23, 5566-5574. French, N., Yu, S., Biggs, P., Holland, B.R., Fearnhead, P., Binney, B., Fox, A., Grove-White, D.H., Leigh, J.W., Miller, W., Muellner, P. & Carter, P. (2013) Evolution of Campylobacter species in New Zealand. Campylobacter Ecology and Evolution (ed. by S. Sheppard and G. Méric), p. 360. Horizon Scientific Press.

Friedlaender, J.S., Friedlaender, F.R., Hodgson, J.A., Stoltz, M., Koki, G., Horvat, G., Zhadanov, S., Schurr, T.G. & Merriwether, D.A. (2007) Melanesian mtDNA complexity. In: PLoS ONE, p. e248. Friedlaender, J.S., Friedlaender, F.R., Reed, F.A., Kidd, K.K., Kidd, J.R., Chambers, G.K., Lea, R.A., Loo, J.-H., Koki, G., Hodgson, J.A., Merriwether, D.A. & Weber, J.L. (2008) The genetic structure of Pacific Islanders. PLoS Genetics, 4, e19. Fu, Y.-X. (1997) Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics, 147, 915-925. Gadaleta, G., Pepe, G., De Candia, G., Quagliariello, C., Sbisa, E. & Saccone, C. (1989) The complete nucleotide sequence of the Rattus norvegicus mitochondrial genome: cryptic signals revealed by comparative analysis between vertebrates. Journal of Molecular Evolution, 28, 497-516. Galtier, N., Nabholz, B., Glémin, S. & Hurst, G.D.D. (2009) Mitochondrial DNA as a marker of molecular diversity: a reappraisal. Molecular Ecology, 18, 4541-4550. Geneious created by Biomatters. Available from http://www.geneious.com/. Gill, W.W. (1880) Savage life in Polynesia. George Didsbury, Government Printer, Wellington. Gillespie, R. (2002) Dating the first . Radiocarbon, 44, 455-472. Glover, I. (1986) Archaeology in Eastern Timor, 1966-67. Terra Australis, 11. González, S., Maldonado, J.E., Leonard, J.A., Vilà, C., Barbanti Duarte, J.M., Merino, M., Brum-Zorrilla, N. & Wayne, R.K. (1998) Conservation genetics of the endangered Pampas deer (Ozotoceros bezoarticus). Molecular Ecology, 7, 47-56. Goodenough, W.H. (1957) Oceania and the problem of controls in the study of cultural and human evolution. The Journal of the Polynesian Society, 146-155. Gray, R.D. & Jordan, F.M. (2000) Language trees support the express train sequence of Austronesian expansion. Nature, 405, 1052-1055. Gray, R.D., Drummond, A.J. & Greenhill, S.J.
(2009) Language phylogenies reveal expansion pulses and pauses in Pacific settlement. Science, 323, 479-483. Grayson, D.K. (2001) The archaeological record of human impacts on animal populations. Journal of World Prehistory, 15, 1-68. Green, R.C. (1991a) Near and Remote Oceania - disestablishing "Melanesia" in culture history. Man and a half, Essays in Pacific Anthropology and Ethnobiology in honour of Ralph Bulmer (ed. by A. Pawley), pp. 491-502. The Polynesian Society, Auckland. Green, R.C. (1991b) The Lapita cultural complex: current evidence and proposed models. Indo-Pacific Prehistory 1990 (ed. by P.S. Bellwood), pp. 295-305. Bulletin of the Indo-Pacific Prehistory Association, Canberra and . Green, R.C. (2003) The Lapita horizon and tradition - signature for one set of oceanic migrations. Pacific Archaeology: assessments and prospects. Proceedings of the International Conference for the 50th anniversary of the first Lapita excavation (ed. by C. Sand), pp. 95-120. Les Cahiers de l'archeologie en Nouvelle-Caledonie, Koné-Nouméa. Greenhill, S.J., Drummond, A.J. & Gray, R.D. (2010) How accurate and robust are the phylogenetic estimates of Austronesian language relationships? PLoS ONE, 5, e9573. Grene, M. & Depew, D. (2004) The philosophy of biology. Cambridge University Press. Gribaldo, S., Poole, A.M., Daubin, V., Forterre, P. & Brochier-Armanet, C. (2010) The origin of eukaryotes and their relationship with the Archaea: are we at a phylogenomic impasse? Nature Reviews Microbiology, 8, 743-752.

Griffiths, R.C. & Tavaré, S. (1997) Computational methods for the coalescent. Progress in population genetics and human evolution (ed. by P. Donnelly and S. Tavaré), pp. 165-182. Springer, New York. Groves, C.P. (1984) Of mice and men and pigs in the Indo-Australian Archipelago. Canberra Anthropology, 7, 1-19. Guindon, S. & Gascuel, O. (2003) A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Systematic Biology, 52, 696-704. Gyllensten, U., Wharton, D., Josefsson, A. & Wilson, A.C. (1991) Paternal inheritance of mitochondrial DNA in mice. Nature, 352, 255-257. Haeckel, E. (1889) Die Natürliche Schöpfungsgeschichte. Gemeinverständliche wissenschaftliche Vorträge über die Entwicklungslehre im Allgemeinen und diejenige von Darwin, Goethe und Lamarck im Besonderen, 8th edn. Georg Reimer, Berlin. Hage, P. & Marck, J. (2003) Matrilineality and the Melanesian origin of Polynesian Y chromosomes. Current Anthropology, 44, S121-S127. Hagelberg, E. (1996) Mitochondrial DNA in ancient and modern humans. Molecular biology and human diversity (ed. by A.J. Boyce and C.G.N. Mascie-Taylor), pp. 1-11. Cambridge University Press, Cambridge. Hagelberg, E., Kayser, M., Nagy, M., Roewer, L., Zimdahl, H., Krawczak, M., Lio, P., Schiefenhövel, W., Bradman, N. & Sykes, B. (1999) Molecular genetic evidence for the human settlement of the Pacific: analysis of mitochondrial DNA, Y chromosome and HLA markers. Philosophical Transactions: Biological Sciences, 354, 141-152. Haldane, J.B.S. (1930) A mathematical theory of natural and artificial selection. (Part VI, Isolation.). Mathematical Proceedings of the Cambridge Philosophical Society, 26, 220-230. Hale, H. (1846) Ethnography and philology. Lea and Blanchard, Philadelphia. Hall, R. (2002) Cenozoic geological and plate tectonic evolution of SE Asia and the SW Pacific: computer-based reconstructions, model and animations. Journal of Asian Earth Sciences, 20, 353-431. Hanebuth, T., Stattegger, K.
& Grootes, P.M. (2000) Rapid flooding of the Sunda Shelf: a late-glacial sea-level record. Science, 288, 1033-1035. Harrison, J.L. (1951) Reproduction in rats of the subgenus Rattus. Proceedings of the Zoological Society of London, 121, 673-694. Harrison, J.L. (1955) Data on the reproduction of some Malayan mammals. Proceedings of the Zoological Society of London, 125, 445-460. Harrison, J.L. (1957) Habitat of some Malayan rats. Proceedings of the Zoological Society of London, 128, 1-22. Harrison, R.G. (1989) Animal mitochondrial DNA as a genetic marker in population and evolutionary biology. Trends in Ecology and Evolution, 4, 6-11. Hasegawa, M., Kishino, H. & Yano, T.-a. (1985) Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. Journal of Molecular Evolution, 22, 160-174. Hastings, W.K. (1970) Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 57, 97-109. Hayashi, J.-I., Tagashira, Y. & Yoshida, M.C. (1985) Absence of extensive recombination between inter- and intraspecies mitochondrial DNA in mammalian cells. Experimental Cell Research, 160, 387-395.

Heaney, L.R. (1986) Biogeography of mammals in SE Asia: estimates of rates of colonization, extinction and speciation. Biological Journal of the Linnean Society, 28, 127-165. Heinsohn, T. (2003) Animal translocation: long-term human influences on the vertebrate zoogeography of Australasia (natural dispersal versus ethnophoresy). Australian Zoologist, 32, 351-376. Heinsohn, T.E. (2010) Marsupials as introduced species: long-term anthropogenic expansion of the marsupial frontier and its implications for zoogeographic interpretation. Altered ecologies: Fire, climate and human influence on terrestrial landscapes, 133-176. Herman, J.S. & Searle, J.B. (2011) Post-glacial partitioning of mitochondrial genetic variation in the field vole. 278, 3601-3607. Hertzberg, M., Mickleson, K.N., Serjeantson, S.W., Prior, J.F. & Trent, R.J. (1989) An Asian-specific 9-bp deletion of mitochondrial DNA is frequently found in Polynesians. American Journal of Human Genetics, 44, 504-510. Hey, J. (2010) Isolation with Migration Models for More Than Two Populations. Molecular Biology and Evolution, 27, 905-920. Hey, J., Fitch, W.M. & Ayala, F.J. (2005) Systematics and the origin of species: An introduction. Proceedings of the National Academy of Sciences, 102, 6515-6519. Hicks, G.R.F., McColl, H.P., Meads, M.J., Hardy, G.S. & Roser, R.J. (1975) An ecological reconnaissance of Korapuki Island, Mercury Islands. Notornis, 22, 195-220. Hingston, M., Goodman, S.M., Ganzhorn, J.U. & Sommer, S. (2005) Reconstruction of the colonization of southern Madagascar by introduced Rattus rattus. Journal of Biogeography, 32, 1549-1559. Ho, S.Y., Phillips, M.J., Cooper, A. & Drummond, A.J. (2005) Time dependency of molecular rate estimates and systematic overestimation of recent divergence times. Molecular biology and evolution, 22, 1561-1568. Ho, S.Y.W. & Phillips, M.J. (2009) Accounting for Calibration Uncertainty in Phylogenetic Estimation of Evolutionary Divergence Times. Systematic Biology, Holder, M.T. 
& Lewis, P.O. (2003) Phylogeny estimation: traditional and Bayesian approaches. Nature Genetics, 4, 275-284. Hooijer, D.A. (1974) Quaternary mammals west and east of Wallace's Line. Netherlands Journal of Zoology, 25, 46-56. Höss, M. & Pääbo, S. (1993) DNA extraction from Pleistocene bones by a silica-based purification method. Nucleic Acids Research, 21, 3913-3914. Hudson, R.R., Slatkin, M. & Maddison, W.P. (1992) Estimation of levels of gene flow from DNA sequence data. Genetics, 132, 583-589. Huelsenbeck, J.P. & Ronquist, F. (2001) MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics, 17, 754-755. Huelsenbeck, J.P., Ronquist, F., Nielsen, R. & Bollback, J.P. (2001) Bayesian inference of phylogeny and its impact on evolutionary biology. Science, 294, 2310-2314. Huerta-Sanchez, E., Jin, X., Asan, Bianba, Z., Peter, B.M., Vinckenbosch, N., Liang, Y., Yi, X., He, M., Somel, M., Ni, P., Wang, B., Ou, X., Huasang, Luosang, J., Cuo, Z.X.P., Li, K., Gao, G., Yin, Y., Wang, W., Zhang, X., Xu, X., Yang, H., Li, Y., Wang, J., Wang, J. & Nielsen, R. (2014) Altitude adaptation in Tibetans caused by introgression of -like DNA. Nature, 512, 194-197. Hurles, M.E., Matisoo-Smith, E., Gray, R.D. & Penny, D. (2003) Untangling Oceanic settlement: the edge of the knowable. Trends in Ecology & Evolution, 18, 531-540.

Hurles, M.E., Sykes, B.C., Jobling, M.A. & Forster, P. (2005) The dual origin of the Malagasy in Island Southeast Asia and East Africa: evidence from maternal and paternal lineages. The American Journal of Human Genetics, 76, 894-901. Hurles, M.E., Nicholson, J., Bosch, E., Renfrew, C., Sykes, B.C. & Jobling, M.A. (2002) Y chromosomal evidence for the origins of Oceanic-speaking peoples. Genetics, 160, 289-303. Huson, D.H. & Bryant, D. (2006) Application of phylogenetic networks in evolutionary studies. Molecular Biology and Evolution, 23, 254-267. Huson, D.H., Dezulian, T., Klopper, T. & Steel, M.A. (2004) Phylogenetic super-networks from partial trees. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 1, 151-158. Hutchison, C.A., Newbold, J.E., Potter, S.S. & Edgell, M.H. (1974) Maternal inheritance of mammalian mitochondrial DNA. Nature, 251, 536-538. Huxley, T.H. (1868) On the classification and distribution of the Alectoromorphae and Heteromorphae. Proceedings of the Zoological Society of London, 294-319. Hympendahl, K. (1997) Die Kunst der polynesischen Navigation. Palstek, 4, 22-29. Intoh, M. (1999) Cultural contacts between Micronesia and Melanesia. The Pacific from 5000 to 2000 BP: colonisation and transformations (ed. by J.-C. Galipaud and I. Lilley), pp. 407-422. IRD, Paris. Irwin, G. (1993) Voyaging. A community of culture: The people and prehistory of the Pacific (ed. by M. Spriggs, D.E. Yen, W. Ambrose, R. Jones, A. Thorne and A. Andrews), pp. 73-87. The Australian National University, Canberra. Irwin, G.J. (1992) The prehistoric exploration and colonisation of the Pacific. Cambridge University Press, Hong Kong. Irwin, G.J. (2006) Voyaging and settlement. Vaka moana, voyages of the ancestors (ed. by K.R. Howe). David Bateman Ltd., Auckland. Jackson, W.B. (1962) Population studies: D. Reproduction. Pacific island rat ecology. Report of a study made on Ponape and adjacent islands 1955-1958 (ed. by T.I. Storer), pp. 92-107.
Bernice P. Bishop Museum, Honolulu. Jackson, W.B. & Strecker, R.L. (1962) Ecological distribution and relative numbers. Pacific island rat ecology. Report of a study made on Ponape and adjacent islands 1955-1958 (ed. by T.I. Storer), pp. 45-63. Bernice P. Bishop Museum, Honolulu. Jacobs, L.L. & Pilbeam, D. (1980) Of mice and men: fossil-based divergence dates and molecular “clocks”. Journal of Human Evolution, 9, 551-555. Jacobs, L.L. & Downs, W.R. (1994) The evolution of murine rodents in Asia. National Science Museum Monographs, 8, 149-156. Jacobs, L.L. & Flynn, L. (2005) Of mice… again: the Siwalik rodent record, murine distribution, and molecular clocks. American school of prehistoric research monograph series. Interpreting the past: essays on human, primate, and mammal evolution in honor of David Pilbeam (ed. by D.E. Lieberman, R.J. Smith and J. Kelley), pp. 63-80. Brill Academic Publishers, Inc., Boston. Jaeger, J., Tong, H. & Denys, C. (1986) The age of the Mus-Rattus divergence: paleontological data compared with the molecular clock. Comptes Rendus de l'Académie des Sciences. Série 2, 302, 917-922. Jayaswal, V., Robinson, J. & Jermiin, L. (2007) Estimation of phylogeny and invariant sites under the general Markov model of nucleotide sequence evolution. Systematic Biology, 56, 155-162. Jentink, A. (1890) Mammalia from the Malay Archipelago II; with Plate VIII, IX, X, XI. Zoologische Ergebnisse einer Reise in Niederländisch Ost-Indien (ed. by M. Weber), pp. 115-129. Brill, Leiden.
Jing, M., Yu, H.-T., Bi, X., Lai, Y.-C., Jiang, W. & Huang, L. (2014) Phylogeography of Chinese house mice (Mus musculus musculus/castaneus): distribution, routes of colonization and geographic regions of hybridization. Molecular Ecology, 23, 4387-4405. Jittapalapong, S., Sarataphan, N., Maruyama, S., Hugot, J.-P., Morand, S. & Herbreteau, V. (2011) Toxoplasmosis in rodents: ecological survey and first evidences in Thailand. Vector-Borne and Zoonotic Diseases, 11, 231-237. Johnson, D.H. (1946) The rat population of a newly established military base in the Solomon Islands. United States naval medical bulletin, 46, 1628-1632. Joly, S., Stevens, M.I. & van Vuuren, B.J. (2007) Haplotype networks can be misleading in the presence of missing data. Systematic Biology, 56, 857-862. Jukes, T.H. & Cantor, C.R. (1969) Evolution of protein molecules. Mammalian protein metabolism III (ed. by H.N. Munro), pp. 21-132. Academic Press, New York. Kami, H.T. (1966) Foods of rodents in the Hamakua District, Hawaii. Pacific Science, 20, 367-373. Kaneda, H., Hayashi, J., Takahama, S., Taya, C., Lindahl, K.F. & Yonekawa, H. (1995) Elimination of paternal mitochondrial DNA in intraspecific crosses during early mouse embryogenesis. Proceedings of the National Academy of Sciences, 92, 4542-4546. Karafet, T.M., Mendez, F.L., Meilerman, M.B., Underhill, P.A., Zegura, S.L. & Hammer, M.F. (2008) New binary polymorphisms reshape and increase resolution of the human Y chromosomal haplogroup tree. Genome Research, 18, 830-838. Karafet, T.M., Hallmark, B., Cox, M.P., Sudoyo, H., Downey, S., Lansing, J.S. & Hammer, M.F. (2010) Major east–west division underlies Y chromosome stratification across Indonesia. Molecular Biology and Evolution, 27, 1833-1844. Kayser, M., Brauer, S., Weiss, G., Schiefenhoevel, W., Underhill, P.A. & Stoneking, M. (2001) Independent histories of human Y chromosomes from Melanesia and Australia. American Journal of Human Genetics, 68, 173-190.
Kayser, M., Brauer, S., Weiss, G., Underhill, P.A., Roewer, L., Schiefenhövel, W. & Stoneking, M. (2000) Melanesian origin of Polynesian Y chromosomes. Current Biology, 10, 1237-1246. Kayser, M., Lao, O., Saar, K., Brauer, S., Wang, X., Nürnberg, P., Trent, R.J. & Stoneking, M. (2008a) Genome-wide analysis indicates more Asian than Melanesian ancestry of Polynesians. American Journal of Human Genetics, 82, 194-198. Kayser, M., Choi, Y., van Oven, M., Mona, S., Brauer, S., Trent, R.J., Suarkia, D., Schiefenhovel, W. & Stoneking, M. (2008b) The impact of the Austronesian expansion: evidence from mtDNA and Y chromosome diversity in the Admiralty Islands of Melanesia. Molecular Biology and Evolution, 25, 1362-1374. Kayser, M., Brauer, S., Cordaux, R., Casto, A., Lao, O., Zhivotovsky, L.A., Moyse-Faurie, C., Rutledge, R.B., Schiefenhoevel, W., Gil, D., Lin, A.A., Underhill, P.A., Oefner, P.J., Trent, R.J. & Stoneking, M. (2006) Melanesian and Asian origins of Polynesians: mtDNA and Y chromosome gradients across the Pacific. Molecular Biology and Evolution, 23, 2234-2244. Kimura, M. (1953) "Stepping Stone" model of population. Annual Report of the National Institute of Genetics, 3, 62-63. Kimura, M. (1980) A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. Journal of Molecular Evolution, 16, 111-120.


Kimura, M. (1983) The neutral theory of molecular evolution. Cambridge University Press, Cambridge. Kimura, M. & Weiss, G.H. (1964) The stepping stone model of population structure and the decrease of genetic correlation with distance. Genetics, 49, 561-576. Kingman, J.F.C. (1982a) On the genealogy of large populations. Journal of Applied Probability, 19, 27-43. Kingman, J.F.C. (1982b) The coalescent. Stochastic Processes and their Applications, 13, 235-248. Kirch, P.V. (1997) The Lapita Peoples: ancestors of the oceanic world. Blackwell Publishers, Oxford & Malden MA. Kirch, P.V. (2002) On the road of the winds. University of California Press, Berkeley, Los Angeles, London. Kitchener, D.J., How, R.A. & Maharadatunkamsi, A. (1991) A new species of Rattus from the mountains of West Flores, Indonesia. Records of the Western Australian Museum, 15, 611-626. Kleppe, K., Ohtsuka, E., Kleppe, R., Molineux, I. & Khorana, H.G. (1971) Studies on polynucleotides: XCVI. Repair replication of short synthetic DNA's as catalyzed by DNA polymerases. Journal of Molecular Biology, 56, 341-361. Knippers, R. (1997) Molekulare Genetik. Georg Thieme Verlag, Stuttgart & New York. Knowles, L.L. (2009) Statistical phylogeography. Annual Review of Ecology, Evolution, and Systematics, 40, 593-612. Kopstein, F. (1931) Die Ökologie der javanischen Ratten und ihre Bedeutung für die Epidemiologie der Pest. Zeitschrift für Morphologie und Ökologie der Tiere, 22, 774-807. Krebs, C.J. (2001) Ecology, the experimental analysis of distribution and abundance, 5th edn. Benjamin Cummings, San Francisco. Kuhner, M.K. (2006) LAMARC 2.0: maximum likelihood and Bayesian estimation of population parameters. Bioinformatics, 22, 768-770. Kuhner, M.K. (2014) LAMARC Manual: What do the population parameters mean? Available at: http://evolution.genetics.washington.edu/lamarc/documentation/parameters.html (accessed December 2014). Kuhner, M.K., Yamato, J. & Felsenstein, J. 
(1995) Estimating effective population size and mutation rate from sequence data using Metropolis-Hastings sampling. Genetics, 140, 1421-1430. Lambeck, K. & Chappell, J. (2001) Sea level change through the last glacial cycle. Science, 292, 679-686. Landis, M.J., Matzke, N.J., Moore, B.R. & Huelsenbeck, J.P. (2013) Bayesian analysis of biogeography when the number of areas is large. Systematic Biology, 62, 789-804. Larkin, M.A., Blackshields, G., Brown, N.P., Chenna, R., McGettigan, P.A., McWilliam, H., Valentin, F., Wallace, I.M., Wilm, A., Lopez, R., Thompson, J.D., Gibson, T.J. & Higgins, D.G. (2007) Clustal W and Clustal X version 2.0. Bioinformatics, 23, 2947-2948. Larson, G. & Burger, J. (2013) A population genetics view of animal domestication. Trends in Genetics, 29, 197-205. Larson, G., Dobney, K., Albarella, U., Fang, M., Matisoo-Smith, E., Robins, J., Lowden, S., Finlayson, H., Brand, T., Willerslev, E., Rowley-Conwy, P., Andersson, L. & Cooper, A. (2005) Worldwide phylogeography of wild boar reveals multiple centers of pig domestication. Science, 307, 1618-1621.


Larson, G., Cucchi, T., Fujita, M., Matisoo-Smith, E., Robins, J., Anderson, A., Rolett, B., Spriggs, M., Dolman, G., Kim, T.-H., Thuy, N.T.D., Randi, E., Doherty, M., Due, R.A., Bollt, R., Djubiantono, T., Griffin, B., Intoh, M., Keane, E., Kirch, P., Li, K.-T., Morwood, M., Pedrina, L.M., Piper, P.J., Rabett, R.J., Shooter, P., Van den Bergh, G., West, E., Wickler, S., Yuan, J., Cooper, A. & Dobney, K. (2007) Phylogeny and ancient DNA of Sus provides insights into neolithic expansion in Island Southeast Asia and Oceania. Proceedings of the National Academy of Sciences, 104, 4834-4839. Leach, B.F. & Ward, G.K. (1981) Archaeology on Kapingamarangi atoll: a Polynesian outlier in the Eastern Caroline Islands. In: Studies in Prehistoric Anthropology. University of Otago, Otago. Leavesley, M., Bird, M.I., Fifield, L.K., Hausladen, P.A., Santos, G.M. & di Tada, M.L. (2002) Buang Merabak: early evidence for human occupation in the Bismarck Archipelago, Papua New Guinea. , 54, 55-57. Leavesley, M.G. & Chappell, J. (2004) Buang Merabak: additional early radiocarbon evidence of the colonisation of the Bismarck Archipelago, Papua New Guinea. Antiquity, 78, (project gallery at http://www.antiquity.ac.uk/projgall/leavesley; accessed 2014). Lehtonen, J.T., Mustonen, O., Ramiarinjanahary, H., Niemelä, J. & Rita, H. (2001) Habitat use by endemic and introduced rodents along a gradient of forest disturbance in Madagascar. Biodiversity and Conservation, 10, 1185-1202. Leigh, J.W. & Bryant, D. (2012) popART v1, available from http://popart.otago.ac.nz. Lemey, P., Rambaut, A., Drummond, A.J. & Suchard, M.A. (2009) Bayesian phylogeography finds its roots. PLoS Computational Biology, 5, e1000520. Lemey, P., Rambaut, A., Welch, J.J. & Suchard, M.A. (2010) Phylogeography takes a relaxed random walk in continuous space and time. Molecular Biology and Evolution, 27, 1877-1885. Lemmon, A.R. & Lemmon, E.M. 
(2008) A likelihood framework for estimating phylogeographic history on a continuous landscape. Systematic Biology, 57, 544-561. Leonard, J.A., den Tex, R.-J., Hawkins, M.T.R., Muñoz-Fuentes, V., Thorington, R. & Maldonado, J.E. (2015) Phylogeography of vertebrates on the Sunda Shelf: a multi-species comparison. Journal of Biogeography, online early: doi 10.1111/jbi.12465. Levene, H. (1953) Genetic equilibrium when more than one ecological niche is available. The American Naturalist, 87, 331-333. Lewis, P.O. & Swofford, D.L. (2001) Back to the future: Bayesian inference arrives in phylogenetics. Trends in Ecology & Evolution, 16, 600-601. Librado, P. & Rozas, J. (2009) DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics, 25, 1451-1452. Liu, Y.-P., Wu, G.-S., Yao, Y.-G., Miao, Y.-W., Luikart, G., Baig, M., Beja-Pereira, A., Ding, Z.-L., Palanichamy, M.G. & Zhang, Y.-P. (2006) Multiple maternal origins of chickens: out of the Asian jungles. Molecular phylogenetics and evolution, 38, 12-19. Locatelli, E. (2011) Insular small mammals from Quaternary deposits of Sicily and Flores. PhD, Università degli Studi di Ferrara, Ferrara. Locatelli, E., Awe Due, R., Van Den Bergh, G.D. & Van Den Hoek Ostende, L.W. (2012) Pleistocene survivors and Holocene extinctions: the giant rats from Liang Bua (Flores, Indonesia). Quaternary International, 281, 47-57. Lowe, A., Harris, S. & Ashton, P. (2004) Ecological genetics: design, analysis, and application. Blackwell Publishing, Oxford.


Lowry II, P.P., Schatz, G.E. & Phillipson, P.B. (1997) The classification of natural and anthropogenic vegetation in Madagascar. Natural Change and Human Impact in Madagascar (ed. by S.M. Goodman and B.D. Patterson), pp. 93-123. Smithsonian Institution Press, Washington, D.C. Lum, J.K. & Cann, R.L. (1998) mtDNA and language support a common origin of Micronesians and Polynesians in Island Southeast Asia. American Journal of Physical Anthropology, 105, 109-119. Lum, J.K. & Cann, R.L. (2000) mtDNA lineage analyses: origins and migrations of Micronesians and Polynesians. American Journal of Physical Anthropology, 113, 151-168. Lum, J.K., Rickards, O., Ching, C. & Cann, R.L. (1994) Polynesian mitochondrial DNAs reveal three deep maternal lineage clusters. Human biology, 66, 567-590. Lum, J.K., McIntyre, J.K., Greger, D.L., Huffman, K.W. & Vilar, M.G. (2006) Recent Southeast Asian domestication and Lapita dispersal of sacred male pseudohermaphroditic “tuskers” and hairless pigs of Vanuatu. Proceedings of the National Academy of Sciences, 103, 17190-17195. MacArthur, R.H. & Wilson, E.O. (1963) An equilibrium theory of insular zoogeography. Evolution, 373-387. Macaulay, V., Hill, C., Achilli, A., Rengo, C., Clarke, D., Meehan, W., Blackburn, J., Semino, O., Scozzari, R. & Cruciani, F. (2005) Single, rapid coastal settlement of Asia revealed by analysis of complete mitochondrial genomes. Science, 308, 1034-1036. Madigan, M.T., Martinko, J.M. & Parker, J. (2001) Brock Mikrobiologie. Spektrum Akademischer Verlag, Heidelberg - Berlin. Marples, R.R. (1955) Rattus exulans in Western Samoa. Pacific Science, 9, 171-176. Martin, J. & Mariner, W. (1817) William Mariner - The natives of the Tonga islands in the South Pacific ocean, London. Martin, W., Hoffmeister, M., Rotte, C. & Henze, K. (2001) An overview of endosymbiotic models for the origins of eukaryotes, their ATP-producing organelles (mitochondria and hydrogenosomes), and their heterotrophic lifestyle. 
Biological Chemistry, 382, 1521-1539. Matisoo-Smith, E. (1996) No hea te kiore: mtDNA variation in Rattus exulans: a model for human colonisation and contact in prehistoric Polynesia. PhD Thesis, University of Auckland, Auckland. Matisoo-Smith, E. (2004) More than just old bones: on the contribution of Biological Anthropology to New Zealand Archaeology. Change through Time, 50 Years of New Zealand Archaeology (ed. by L. Furey and S. Holdaway), pp. 235-249. New Zealand Archaeological Association Monograph 26. Matisoo-Smith, E. (2007) Animal translocations, genetic variation and the human settlement of the Pacific. Population Genetics, Linguistics, and Culture History in the Southwest Pacific (ed. by J.S. Friedlaender), pp. 346-378. Matisoo-Smith, E. (2015) Ancient DNA and the human settlement of the Pacific: a review. Journal of Human Evolution, online early: doi:10.1016/j.jhevol.2014.10.017. Matisoo-Smith, E. & Robins, J.H. (2004) Origins and dispersals of Pacific peoples: evidence from mtDNA phylogenies of the Pacific rat. Proceedings of the National Academy of Sciences USA, 101, 9167-9172. Matisoo-Smith, E., Allen, J.S., Ladefoged, T.N., Roberts, R.M. & Lambert, D.M. (1997) Ancient DNA from Polynesian rats: extraction, amplification and sequence from single small bones. Electrophoresis, 18, 1534-1537.


Matisoo-Smith, E., Allen, J.S., Roberts, R.M., Irwin, G.J. & Lambert, D.M. (1999) Rodents of the sunrise: mitochondrial DNA phylogenies of Polynesian Rattus exulans and the settlement of Polynesia. The Pacific from 5000 to 2000 BP: colonisation and transformations (ed. by J.-C. Galipaud and I. Lilley), p. 619. Institut de Recherche pour le Developpement, Paris. Matisoo-Smith, E., Roberts, R.M., Irwin, G.J., Allen, J.S., Penny, D. & Lambert, D.M. (1998) Patterns of prehistoric human mobility in Polynesia indicated by mtDNA from the Pacific rat. Proceedings of the National Academy of Sciences USA, 95, 15145-15150. Matisoo-Smith, E., Hingston, M., Summerhayes, G.R., Robins, J., Ross, H.A. & Hendy, M.D. (2009) On the rat trail in Near Oceania applying the commensal model to the question of the Lapita colonization. Pacific Science, 63, 465-475. Matisoo-Smith, E.A. (1994) The human colonisation of Polynesia. A novel approach: genetic analysis of the Polynesian rat (Rattus exulans). Journal of the Polynesian Society, 103, 75-87. Matson, C.W. & Baker, R.J. (2001) DNA sequence variation in the mitochondrial control region of red-backed voles (Clethrionomys). Molecular Biology and Evolution, 18, 1494-1501. Mayr, E. (1944) Wallace's line in the light of recent zoogeographic studies. The Quarterly Review of Biology, 19, 1-14. Meirmans, P.G. & Hedrick, P.W. (2011) Assessing population structure: FST and related measures. Molecular Ecology Resources, 11, 5-18. Melton, T., Peterson, R., Redd, A.J., Saha, N., Sofro, A., Martinson, J. & Stoneking, M. (1995) Polynesian genetic affinities with Southeast Asian populations as identified by mtDNA analysis. American journal of human genetics, 57, 403. Mendel, G.J. (1866) Versuche über Pflanzen-Hybriden. Verhandlungen des naturforschenden Vereines Brünn, Bd. IV für das Jahr 1865, 3-47. Merriwether, D.A., Friedlaender, J.S., Mediavilla, J., Mgone, C., Gentz, F. & Ferrell, R.E. 
(1999) Mitochondrial DNA variation is an indicator of Austronesian influence in Island Melanesia. American Journal of Physical Anthropology, 110, 234-270. Merriwether, D.A., Hodgson, J.A., Friedlaender, F.R., Allaby, R., Cerchio, S., Koki, G. & Friedlaender, J.S. (2005) Ancient mitochondrial M haplogroups identified in the Southwest Pacific. Proceedings of the National Academy of Sciences of the United States of America, 102, 13034-13039. Mertens, R. (1936) Die Säugetiere der Inseln Bali, Lombok, Sumbawa und Flores. (Beiträge zur Fauna der Kleinen Sunda-Inseln, II). Zoologische Jahrbücher, Abteilung für Systematik, Ökologie und Geographie der Tiere, 68, 273-324. Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H. & Teller, E. (1953) Equation of state calculations by fast computing machines. Journal of Chemical Physics, 21, 1087-1092. Meyer, P.O. (1909a) Funde prähistorischer Töpferei und Steinmesser auf Vuatom, Bismarck-Archipel. Anthropos, 4, 251-252. Meyer, P.O. (1909b) Nachtrag. Anthropos, 4, 1093-1095. Meyer, R.R. (1973) On the evolutionary origin of mitochondrial DNA. Journal of Theoretical Biology, 38, 647-663. Miao, Y.W., Peng, M.S., Wu, G.S., Ouyang, Y.N., Yang, Z.Y., Yu, N., Liang, J.P., Pianchou, G., Beja-Pereira, A., Mitra, B., Palanichamy, M.G., Baig, M., Chaudhuri, T.K., Shen, Y.Y., Kong, Q.P., Murphy, R.W., Yao, Y.G. & Zhang, Y.P. (2013) Chicken domestication: an updated perspective based on mitochondrial genomes. Heredity, 110, 277-282.


Michalakis, Y. & Excoffier, L. (1996) A generic estimation of population subdivision using distances between alleles with special reference for microsatellite loci. Genetics, 142, 1061-1064. Miller, G.S. & Ewing, H.E. (1924) The characters and probable history of the Hawaiian rat. Bernice P. Bishop Museum Bulletin, 14, 1-6. Mirabal, S., Herrera, K.J., Gayden, T., Regueiro, M., Underhill, P.A., Garcia-Bertrand, R.L. & Herrera, R.J. (2012) Increased Y-chromosome resolution of haplogroup O suggests genetic ties between the Ami aborigines of Taiwan and the Polynesian Islands of Samoa and Tonga. Gene, 492, 339-348. Moller, H. & Craig, J.L. (1987) The population ecology of Rattus exulans on Tiritiri Matangi Island, and a model of comparative population dynamics in NZ. New Zealand Journal of Zoology, 14 Moodie, P.M., Booth, P.B. & Sanford, R. (1969) The Nus - a genetic survey of Tench Islanders. Archaeology & Physical Anthropology in Oceania, 4, 129-143. Moritz, C., Dowling, T.E. & Brown, W.M. (1987) Evolution of animal mitochondrial DNA: relevance for population biology and systematics. Annual Review of Ecology and Systematics, 18, 269-292. Morwood, M.J., Sutikna, T., Saptomo, E.W., Jatmiko, Hobbs, D.R. & Westaway, K.E. (2009) Preface: research at Liang Bua, Flores, Indonesia. Journal of Human Evolution, 57, 437-449. Mosby, J.M., Wodzicki, K.A. & Shorland, F.B. (1974) Fatty acid composition of the depot fats of the Polynesian rat, Rattus exulans, Tokelau Islands. New Zealand Journal of Zoology, 1, 67-70. Moss, S.J. & Wilson, M.E.J. (1998) Biogeographic implications of the Tertiary palaeogeographic evolution of Sulawesi and Borneo. Biogeography and Geological Evolution of SE Asia (ed. by R. Hall and J.D. Holloway), pp. 133-163. Backhuys Publishers, Leiden. Motokawa, M., Lin, L.-K. & Lu, K.-H. (2004) Geographic variation in cranial features of the Polynesian rat Rattus exulans (Peale, 1848)(Mammalia: Rodentia: Muridae). Raffles Bulletin of Zoology, 52, 653-663. 
Motokawa, M., Lu, K.-H., Harada, M. & Lin, L.-K. (2001) New records of the Polynesian rat Rattus exulans (Mammalia:Rodentia) from Taiwan and the Ryukyus. Zoological Studies, 40, 299-304. Motulsky, A.G. (1989) Metabolic polymorphisms and the role of infectious diseases in human evolution. Human Biology, 61, 835-869. Muller, H.J. (1964) The relation of recombination to mutational advance. Mutation Research/Fundamental and Molecular Mechanisms of Mutagenesis, 1, 2-9. Mullis, K.B., Faloona, F., Scharf, S., Saiki, R., Horn, G. & Erlich, H. (1986) Specific enzymatic amplification of DNA in vitro: the Polymerase Chain Reaction. Cold Spring Harbor Symposia on Quantitative Biology, 51, 263-273. Murray-McIntosh, R.P., Scrimshaw, B.J., Hatfield, P.J. & Penny, D. (1998) Testing migration patterns and estimating founding population size in Polynesia by using human mtDNA sequences. Proceedings of the National Academy of Sciences, 95, 9047-9052. Musser, G.G. (1977) Epimys benguetensis, a composite, and one zoogeographic view of rat and mouse faunas in the Philippines and Celebes. American Museum Novitates, 2624, 1-15. Musser, G.G. (1981a) The giant rat of Flores and its relatives east of Borneo and Bali. Bulletin of the AMNH, 169, 67-176.


Musser, G.G. (1981b) Notes on systematics of Indo-Malayan murid rodents, and descriptions of new genera and species from Ceylon, Sulawesi, and the Philippines. Bulletin of the American Museum of Natural History, 168, 225-334. Musser, G.G. & Newcomb, C. (1983) Malaysian murids and the giant rat of Sumatra. Bulletin of the American Museum of Natural History, 174, 327-598. Musser, G.G. & Carleton, M.D. (1993) Family Muridae. Mammal Species of the World. A taxonomic and geographic reference (ed. by D.E. Wilson and D.M. Reeder), pp. 501-754. Smithsonian Institution Press, Washington. Nass, R.D. (1977) Movements and home ranges of Polynesian rats in Hawaiian sugarcane. Pacific Science, 31, 135-142. Nei, M. (1973) Analysis of gene diversity in subdivided populations. Proceedings of the National Academy of Sciences USA, 70, 3321-3323. Neigel, J.E. (2002) Is FST obsolete? Conservation Genetics, 3, 167–173. Newman, D.G. & McFadden, I. (1989) Seasonal fluctuations of numbers, breeding, and food of kiore (Rattus exulans) on Lady Alice Island (Hen and Chickens Group), with a consideration of kiore: tuatara (Sphenodon punctatus) relationships in New Zealand. New Zealand Journal of Zoology, 17, 55-63. Nielsen, R. & Wakeley, J. (2001) Distinguishing migration from isolation: a Markov Chain Monte Carlo approach. Genetics, 158, 885-896. Nitatpattana, N., Henrich, T., Palabodeewat, S., Tangkanakul, W., Poonsuksombat, D., Chauvancy, G., Barbazan, P., Yoksan, S. & Gonzalez, J.P. (2002) Hantaan virus antibody prevalence in rodent populations of several provinces of northeastern Thailand. Tropical Medicine & International Health, 7, 840-845. O'Connell, J.F. & Allen, J. (2004) Dating the colonization of Sahul (Pleistocene Australia–New Guinea): a review of recent research. Journal of Archaeological Science, 31, 835-853. O'Connell, J.F. & Allen, J. (2012) The restaurant at the end of the universe: Modelling the colonisation of Sahul. Australian Archaeology, 74, 5-17. O'Connor, S. 
(2007) New evidence from contributes to our understanding of earliest modern human colonisation east of the Sunda Shelf. Antiquity, 81, 523. O'Connor, S., Spriggs, M. & Veth, P. (2002) Excavation at Lene Hara Cave establishes occupation in East Timor at least 30,000–35,000 years ago. Antiquity, 76, 45-50. O'Connor, S., Ono, R. & Clarkson, C. (2011a) Pelagic fishing at 42,000 years before the present and the maritime skills of modern humans. Science, 334, 1117-1121. O'Connor, S., Barham, A., Aplin, K., Dobney, K., Fairbairn, A. & Richards, M. (2011b) The power of paradigms: Examining the evidential basis for early to mid-Holocene pigs and pottery in Melanesia. Journal of Pacific Archaeology, 2, 1-25. O'Rourke, D.H., Hayes, M.G. & Carlyle, S.W. (2000) Ancient DNA studies in Physical Anthropology. Annual Review of Anthropology, 29, 217-242. Oppenheimer, S.J. (2003) Austronesian spread into Southeast Asia and Oceania: where from and when? Pacific Archaeology: assessments and prospects. Proceedings of the International Conference for the 50th anniversary of the first Lapita excavation (ed. by C. Sand), pp. 54-70. Les Cahiers de l'archeologie en Nouvelle-Caledonie, Koné-Nouméa. Oppenheimer, S.J. & Richards, M.B. (2001a) Fast trains, slow boats, and the ancestry of the Polynesian islanders. Science Progress, 84, 157-181. Oppenheimer, S.J. & Richards, M.B. (2001b) Slow boat to Melanesia? Nature, 410, 166-167. Oskarsson, M.C.R., Klütsch, C.F.C., Boonyaprakob, U., Wilton, A., Tanabe, Y. & Savolainen, P. (2011) Mitochondrial DNA data indicate an introduction through


Mainland Southeast Asia for Australian dingoes and Polynesian domestic dogs. Proceedings of the Royal Society B: Biological Sciences, Pagès, M., Chaval, Y., Herbreteau, V., Waengsothorn, S., Cosson, J.-F., Hugot, J.-P., Morand, S. & Michaux, J. (2010) Revisiting the taxonomy of the Rattini tribe: a phylogeny-based delimitation of species boundaries. BMC Evolutionary Biology, 10, 184. Pang, J.-F., Kluetsch, C., Zou, X.-J., Zhang, A.-b., Luo, L.-Y., Angleby, H., Ardalan, A., Ekström, C., Sköllermo, A., Lundeberg, J., Matsumura, S., Leitner, T., Zhang, Y.-P. & Savolainen, P. (2009) MtDNA data indicate a single origin for dogs south of Yangtze river, less than 16,300 years ago, from numerous wolves. Molecular Biology and Evolution, 26, 2849-2864. Parkinson, R. (1907) Dreißig Jahre in der Südsee: Land und Leute, Sitten und Gebräuche im Bismarckarchipel und auf den deutschen Salomoinseln. Strecker und Schröder, Stuttgart. Parks, D.H., Mankowski, T., Zangooei, S., Porter, M.S., Armanini, D.G., Baird, D.J., Langille, M.G.I. & Beiko, R.G. (2013) GenGIS 2: Geospatial Analysis of Traditional and Genetic Biodiversity, with New Gradient Algorithms and an Extensible Plugin Framework. PLoS ONE, 8, e69885. Pawley, A. & Green, R. (1973) Dating the dispersal of the . Oceanic Linguistics, 12, 1-67. Peale, T.R. (1848) Mammalia and ornithology. US Exploring Expedition during the years 1838, 1839, 1840, 1841, 1842 under the command of Charles Wilkes, U.S.N. (ed. by C. Wilkes). C. Sherman, Philadelphia. Penny, D. (2005) Evolutionary biology: Relativity for molecular clocks. Nature, 436, 183-184. Perez, J., Brescia, F., Becam, J., Mauron, C. & Goarant, C. (2011) Rodent abundance dynamics and leptospirosis carriage in an area of hyper-endemicity in New Caledonia. PLoS Neglected Tropical Diseases, 5, e1361. Pesole, G., Gissi, C., De Chirico, A. & Saccone, C. (1999) Nucleotide substitution rate of mammalian mitochondrial genomes. Journal of Molecular Evolution, 48, 427-434. 
Pierron, D., Razafindrazaka, H., Pagani, L., Ricaut, F.-X., Antao, T., Capredon, M., Sambo, C., Radimilahy, C., Rakotoarisoa, J.-A., Blench, R.M., Letellier, T. & Kivisild, T. (2014) Genome-wide evidence of Austronesian–Bantu admixture and cultural reversion in a hunter-gatherer group of Madagascar. Proceedings of the National Academy of Sciences, 111, 936-941. Pierson, M.J., Martinez-Arias, R., Holland, B.R., Gemmell, N.J., Hurles, M.E. & Penny, D. (2006) Deciphering Past Human Population Movements in Oceania: Provably Optimal Trees of 127 mtDNA Genomes. Molecular Biology and Evolution, 23, 1966-1975. Piper, P.J., Ochoa, J., Robles, E.C., Lewis, H. & Paz, V. (2011) Palaeozoology of Palawan Island, Philippines. Quaternary International, 233, 142-158. Poole, A.M. & Penny, D. (2007) Evaluating hypotheses for the origin of eukaryotes. BioEssays, 29, 74-84. Portier, P.J. (1918) Les symbiotes. Masson et cie, Paris. Posada, D. (2008) jModelTest: phylogenetic model averaging. Molecular Biology and Evolution, 25, 1253-1256. Posada, D. & Crandall, K.A. (1998) Modeltest: testing the model of DNA substitution. Bioinformatics, 14, 817-818. Posada, D. & Crandall, K.A. (2001) Selecting the best-fit model of nucleotide substitution. Systematic Biology, 50, 580-601.


Pulliam, H.R. (1988) Sources, sinks, and population regulation. The American Naturalist, 132, 652-661. Raftery, A.E., Newton, M.A., Satagopan, J.M. & Krivitsky, P.N. (2007) Estimating the integrated likelihood via posterior simulation using the harmonic mean identity. Bayesian Statistics, 8, 1-45. Rambaut, A. & Drummond, A.J. (2009) Tracer v 1.5, available from http://tree.bio.ed.ac.uk/software/tracer/. Ramos-Onsins, S.E. & Rozas, J. (2002) Statistical properties of new neutrality tests against population growth. Molecular Biology and Evolution, 19, 2092-2100. Randriamasimanana, C. (1999) The Malayo-Polynesian origins of Malagasy. From Neanderthal to Easter Island: A Tribute to, and a Celebration of, the Work of W. Wilfried Shuhmacher Presented on the Occasion of His 60th Birthday, Rannala, B. & Hartigan, J.A. (1996) Estimating gene flow in island populations. Genetical Research, 67, 147-158. Rantanen, A. & Larsson, N.-G. (2000) Regulation of mitochondrial DNA copy number during spermatogenesis. Human reproduction, 15, 86-91. Rasmussen, M., Guo, X., Wang, Y., Lohmueller, K.E., Rasmussen, S., Albrechtsen, A., Skotte, L., Lindgreen, S., Metspalu, M., Jombart, T., Kivisild, T., Zhai, W., Eriksson, A., Manica, A., Orlando, L., De La Vega, F.M., Tridico, S., Metspalu, E., Nielsen, K., Ávila-Arcos, M.C., Moreno-Mayar, J.V., Muller, C., Dortch, J., Gilbert, M.T.P., Lund, O., Wesolowska, A., Karmin, M., Weinert, L.A., Wang, B., Li, J., Tai, S., Xiao, F., Hanihara, T., van Driem, G., Jha, A.R., Ricaut, F.-X., de Knijff, P., Migliano, A.B., Gallego Romero, I., Kristiansen, K., Lambert, D.M., Brunak, S., Forster, P., Brinkmann, B., Nehlich, O., Bunce, M., Richards, M., Gupta, R., Bustamante, C.D., Krogh, A., Foley, R.A., Lahr, M.M., Balloux, F., Sicheritz-Pontén, T., Villems, R., Nielsen, R., Wang, J. & Willerslev, E. (2011) An aboriginal Australian genome reveals separate human dispersals into Asia. Science, 334, 94-98. Raymo, M.E., Oppo, D.W. & Curry, W. 
(1997) The Mid-Pleistocene climate transition: a deep sea carbon isotopic perspective. Paleoceanography, 12, 546-559. Ree, R.H. & Smith, S.A. (2008) Maximum likelihood inference of geographic range evolution by dispersal, local extinction, and cladogenesis. Systematic Biology, 57, 4-14. Ree, R.H. & Sanmartín, I. (2009) Prospects and challenges for parametric models in historical biogeographical inference. Journal of Biogeography, 36, 1211-1220. Ree, R.H., Moore, B.R., Webb, C.O. & Donoghue, M.J. (2005) A likelihood framework for inferring the evolution of geographic range on phylogenetic trees. Evolution, 59, 2299-2311. Reepmeyer, C., O'Connor, S.U.E. & Brockwell, S. (2011) Long-term obsidian use at the Jerimalai rock shelter in East Timor. Archaeology in Oceania, 46, 85-90. Reis, K.R. & Garong, A.M. (2001) Late quaternary terrestrial vertebrates from Palawan Island, Philippines. Palaeogeography, Palaeoclimatology, Palaeoecology, 171, 409-421. Rensch, B. (1936) Die Geschichte des Sundabogens. Eine Tiergeographische Untersuchung. Gebrüder Bornträger, Berlin. Richards, M.B. & Macaulay, V. (2001) The mitochondrial gene tree comes of age. American Journal of Human Genetics, 68, 1315-1320. Richards, M.B., Oppenheimer, S.J. & Sykes, B. (1998) mtDNA suggests Polynesian origins in Eastern Indonesia. American Journal of Human Genetics, 63, 1234-1236.


Roberts, M. (1991a) Origin, dispersal routes, and geographic distribution of Rattus exulans, with special reference to New Zealand. Pacific Science, 45, 123-130. Roberts, M. (1991b) The parasites of the Polynesian rat: biogeography and origins of the New Zealand parasite fauna. International Journal for Parasitology, 21, 785-793. Roberts, R.G., Westaway, K.E., Zhao, J.-x., Turney, C.S., Bird, M.I., Rink, W.J. & Fifield, L.K. (2009) Geochronology of cave deposits at Liang Bua and of adjacent river terraces in the Wae Racang valley, western Flores, Indonesia: a synthesis of age estimates for the type locality of Homo floresiensis. Journal of Human Evolution, 57, 484-502. Robertson, K.M., LeDuc, C.A., LeDuc, R.G. & Morin, P.A. (2007) Extraction of DNA from formalin-fixed cetacean tissues. In: NOAA Technical Memorandum NMFS, pp. 1-17. National Marine Fisheries Services, National Oceanic and Atmospheric Administration, La Jolla. Robins, J.H., Matisoo-Smith, E. & Furey, L. (2001) Hit or miss? Factors affecting DNA preservation in Pacific archaeological material. In: Australasian Archaeometry Conference 2001, pp. 303-312. Robins, J.H., Hingston, M., Matisoo-Smith, E. & Ross, H.A. (2007) Identifying Rattus species using mitochondrial DNA. Molecular Ecology Notes, 7, 717-729. Robins, J.H., McLenachan, P.A., Phillips, M.J., Craig, L., Ross, H.A. & Matisoo-Smith, E. (2008) Dating of divergences within the Rattus genus phylogeny using whole mitochondrial genomes. Molecular Phylogenetics and Evolution, 49, 460-466. Robins, J.H., McLenachan, P.A., Phillips, M.J., McComish, B., Matisoo-Smith, E. & Ross, H.A. (2010) Evolutionary relationships and divergence times among the native rats of Australia. BMC Evolutionary Biology, 10, 375. Robins, J.H., Tintinger, V., Aplin, K.P., Hingston, M., Matisoo-Smith, E., Penny, D. & Lavery, S.D. (2014) Phylogenetic species identification in Rattus highlights rapid radiation and morphological similarity of New Guinean species. 
PLoS one, 9, e98002. Robinson, D.M., Jones, D.T., Kishino, H., Goldman, N. & Thorne, J.L. (2003) Protein evolution with dependence among codons due to tertiary structure. Molecular Biology and Evolution, 20, 1692-1704. Robles, E., Piper, P., Ochoa, J., Lewis, H., Paz, V. & Ronquillo, W. (2014) Late Quaternary sea-level changes and the palaeohistory of Palawan Island, Philippines. The Journal of Island and Coastal Archaeology, online early, 1-21. Rodrigo, A.G., Goode, M., Forsberg, R., Ross, H.A. & Drummond, A. (2003) Inferring evolutionary rates using serially sampled sequences from several populations. Molecular Biology and Evolution, 20, 2010-2018. Rodriguez, F., Oliver, J.L., Marin, A. & Medina, J.R. (1990) The general stochastic model of nucleotide substitution. Journal of Theoretical Biology, 142, 485-501. Rogers, A.R. & Harpending, H. (1992) Population growth makes waves in the distribution of pairwise genetic differences. Molecular Biology and Evolution, 9, 552-569. Rohland, N. & Hofreiter, M. (2007a) Ancient DNA extraction from bones and teeth. Nature Protocols, 2, 1756-1762. Rohland, N. & Hofreiter, M. (2007b) Comparison and optimization of ancient DNA extraction. Biotechniques, 42, 343-352. Rolett, B. & Diamond, J. (2004) Environmental predictors of pre-European on Pacific islands. Nature, 431, 443-446. Rolett, B.V., Tianlong, J. & Gongwu, L. (2002) Early seafaring in the Taiwan strait and the search for Austronesian origins. Journal of East Asian Archaeology, 4, 307-319.


Ronquist, F. (1997) Dispersal-vicariance analysis: a new approach to the quantification of historical biogeography. Systematic Biology, 46, 195-203. Ronquist, F. & Sanmartín, I. (2011) Phylogenetic methods in biogeography. Annual Review of Ecology, Evolution, and Systematics, 42, 441-464. Rosenberg, N.A. & Nordborg, M. (2002) Genealogical trees, coalescent theory and the analysis of genetic polymorphisms. Nature Reviews Genetics, 3, 380-390. Ross, H.A., Lento, G.M., Dalebout, M.L., Goode, M., Ewing, G., McLaren, P., Rodrigo, A.G., Lavery, S. & Baker, C.S. (2003) DNA Surveillance: Web-based molecular identification of whales, dolphins, and porpoises. Journal of Heredity, 94, 111-114. Rowe, K.C., Aplin, K.P., Baverstock, P.R. & Moritz, C. (2011) Recent and rapid speciation with limited morphological disparity in the genus Rattus. Systematic Biology, 60, 188-203. Rozas, J., Sanchez-DelBarrio, J.C., Messeguer, X. & Rozas, R. (2003) DnaSP, DNA polymorphism analyses by coalescent and other methods. Bioinformatics, 19, 2496-2497. Rozenfeld, A.F., Arnaud-Haond, S., Hernández-García, E., Eguíluz, V.M., Serrão, E.A. & Duarte, C.M. (2008) Network analysis identifies weak and strong links in a metapopulation system. Proceedings of the National Academy of Sciences, 105, 18824-18829. Russell, J.C., Faulquier, L. & Tonione, M.A. (2010) Rat invasion of Tetiaroa atoll, French Polynesia. NCBI Accession: HQ588111. Russell, J.C., Gleeson, D.M. & Le Corre, M. (2011) The origin of Rattus rattus on the Îles Éparses, Western . Journal of Biogeography, 38, 1834-1836. Rutland, J. (1889) On the habits of the New Zealand bush-rat (Mus maorium). Transactions of the New Zealand Institute, XXII, 300-307. Saitou, N. & Nei, M. (1987) The neighbor-joining method: a new method for reconstructing phylogenetic trees. Molecular Biology and Evolution, 4, 406-425. Sandweiss, D.H., Maasch, K.A., Burger, R.L., Richardson, J.B., III, Rollins, H.B. & Clement, A. 
(2001) Variation in Holocene El Nino frequencies: climate records and cultural consequences in ancient Peru. Geology, 29, 603-606. Sanger, F., Nicklen, S. & Coulson, A.R. (1977) DNA sequencing with chain-terminating inhibitors. Proceedings of the National Academy of Sciences USA, 74, 5463-5467. Sanmartín, I., Van Der Mark, P. & Ronquist, F. (2008) Inferring dispersal: a Bayesian approach to phylogeny-based island biogeography, with special reference to the Canary Islands. Journal of Biogeography, 35, 428-449. Sathiamurthy, E. & Voris, H.K. (2006) Maps of Holocene sea level transgression and submerged lakes on the Sunda shelf. The Natural History Journal of Chulalongkorn University, Supplement 2, 1-43. Savolainen, P., Leitner, T., Wilton, A.N., Matisoo-Smith, E. & Lundeberg, J. (2004) A detailed picture of the origin of the Australian dingo, obtained from the study of mitochondrial DNA. PNAS, 101, 12387-12390. Schander, C. & Kenneth, H.M. (2003) DNA, PCR and formalinized animal tissue - a short review and protocols. Organisms Diversity & Evolution, 3, 195-205. Schatz, G. & Mason, T.L. (1974) The biosynthesis of mitochondrial proteins. Annual Review of Biochemistry, 43, 51-87. Scheinfeldt, L., Friedlaender, F., Friedlaender, J., Latham, K., Koki, G., Karafet, T., Hammer, M. & Lorenz, J. (2006) Unexpected NRY chromosome variation in northern Island Melanesia. Molecular Biology and Evolution, 23, 1628-1641.


Schneider, S., Roessli, D. & Excoffier, L. (2000) Arlequin: a software for population genetics data analysis. Genetics and Biometry Lab, Dept. of Anthropology, University of Geneva.
Schwarz, E. (1960) Classification, origin and distribution of commensal rats. Bulletin of the World Health Organization, 23, 411-416.
Schwarz, E. & Schwarz, H.K. (1967) A monograph of the Rattus rattus group. Anales de la Escuela Nacional de Ciencias Biologicas, 14, 79-178.
Searle, J.B., Jones, C.S., Gündüz, İ., Scascitelli, M., Jones, E.P., Herman, J.S., Rambau, R.V., Noble, L.R., Berry, R., Giménez, M.D. & Jóhannesdóttir, F. (2009) Of mice and (Viking?) men: phylogeography of British and Irish house mice. Proceedings of the Royal Society B: Biological Sciences, 276, 201-207.
Shedlock, A.M., Haygood, M.G., Pietsch, T.W. & Bentzen, P. (1997) Enhanced DNA extraction and PCR amplification of mitochondrial genes from formalin-fixed museum specimens. Biotechniques, 22, 394-400.
Sheppard, P.J. (2011) Lapita Colonization across the Near/Remote Oceania Boundary. Current Anthropology, 52, 799-840.
Shi, H., Dong, Y.-l., Wen, B., Xiao, C.-J., Underhill, P.A., Shen, P.-d., Chakraborty, R., Jin, L. & Su, B. (2005) Y-chromosome evidence of southern origin of the East Asian–specific haplogroup O3-M122. The American Journal of Human Genetics, 77, 408-419.
Shi, S.-R., Cote, R.J., Wu, L., Liu, C., Datar, R., Shi, Y., Liu, D., Lim, H. & Taylor, C.R. (2002) DNA extraction from archival formalin-fixed, paraffin-embedded tissue sections based on the antigen retrieval principle: heating under the influence of pH. Journal of Histochemistry & Cytochemistry, 50, 1005-1011.
Shipley, G.P., Taylor, D.A., Tyagi, A., Tiwari, G. & Redd, A.J. (2015) Genetic structure among Fijian island populations. Journal of Human Genetics.
Shoubridge, E.A. (2000) Mitochondrial DNA segregation in the developing embryo. Human Reproduction, 15, 229-234.
Shriver, M.D. & Kittles, R.A. (2004) Genetic ancestry and the search for personalized genetic histories. Nature Reviews Genetics, 5, 611-618.
Sigurðardóttir, S., Helgason, A., Gulcher, J.R., Stefansson, K. & Donnelly, P. (2000) The Mutation Rate in the Human mtDNA Control Region. American Journal of Human Genetics, 66, 1599-1609.
Simpson, G.G. (1977) Too many lines; the limits of the Oriental and Australian zoogeographic regions. Proceedings of the American Philosophical Society, 121, 107-120.
Slatkin, M. (1995) A measure of population subdivision based on microsatellite allele frequencies. Genetics, 139, 457-462.
Slatkin, M. & Maddison, W.P. (1989) A cladistic measure of gene flow inferred from the phylogenies of alleles. Genetics, 123, 603-613.
Slatkin, M. & Voelm, L. (1991) F(ST) in a hierarchical island model. Genetics, 127, 627-629.
Slatkin, M. & Excoffier, L. (2012) Serial founder effects during range expansion: a spatial analog of genetic drift. Genetics, 191, 171-181.
Smith, W.H.F. & Sandwell, D.T. (1997) Global seafloor topography from satellite altimetry and ship depth soundings. Science, 277, 1957-1962.
Soares, P., Ermini, L., Thomson, N., Mormina, M., Rito, T., Röhl, A., Salas, A., Oppenheimer, S., Macaulay, V. & Richards, M.B. (2009) Correcting for purifying selection: an improved human mitochondrial molecular clock. The American Journal of Human Genetics, 84, 740-759.


Soares, P., Trejaut, J.A., Loo, J.H., Hill, C., Mormina, M., Lee, C.L., Chen, Y.M., Hudjashov, G., Forster, P., Macaulay, V., Bulbeck, D., Oppenheimer, S., Lin, M. & Richards, M.B. (2008) Climate change and postglacial human dispersals in Southeast Asia. Molecular Biology and Evolution, 25, 1209-1218.
Soares, P., Rito, T., Trejaut, J., Mormina, M., Hill, C., Tinkler-Hundal, E., Braid, M., Clarke, D.J., Loo, J.-H., Thomson, N., Denham, T., Donohue, M., Macaulay, V., Lin, M., Oppenheimer, S. & Richards, M.B. (2011) Ancient Voyaging and Polynesian Origins. American Journal of Human Genetics, 88, 239-247.
Solihuddin, T. (2014) A drowning Sunda shelf model during last glacial maximum (LGM) and Holocene: a review.
Song, Y., Lan, Z. & Kohn, M.H. (2014) Mitochondrial DNA Phylogeography of the Norway Rat. PLoS ONE, 9, e88425.
Soodyall, H., Jenkins, T. & Stoneking, M. (1995) 'Polynesian' mtDNA in the Malagasy. Nature Genetics, 10, 377-378.
Specht, J., Denham, T., Goff, J. & Terrell, J. (2013) Deconstructing the Lapita cultural complex in the Bismarck Archipelago. Journal of Archaeological Research, 1-52.
Spennemann, D.H.R. (1996) Gifts from the waves: a case of marine transport of obsidian to Nadikdik Atoll and the occurrence of other drift materials in the Marshall Islands. Charles Sturt University.
Spennemann, D.H.R. (1997) Distribution of rat species (Rattus spp.) on the atolls of the Marshall islands: past and present dispersal. Atoll Research Bulletin, 445, 1-8.
Spennemann, D.H.R. & Rapp, G. (1989) Can rats colonise Oceanic islands unaided? An assessment and review of the swimming capabilities of the genus Rattus (Rodentia:Muridae) with particular reference to tropical waters. Zoologische Abhandlungen des Museums für Tierkunde Dresden, 45, 81-91.
Spriggs, M. (1989) The dating of the Island Southeast Asian Neolithic: an attempt at chronometric hygiene and linguistic correlation. Antiquity, 63, 587-613.
Spriggs, M. (1997) The Island Melanesians. Blackwell Publishing.
Spriggs, M. (1998) Research questions in Maluku archaeology. Cakalele, 9, 51-64.
Spriggs, M. (2012) Is the neolithic spread in island southeast Asia really as confusing as the archaeologists (and some linguists) make it seem? Crossing borders: selected papers from the 13th International Conference of the European Association of Southeast Asian Archaeologists (ed. by M.L. Tjoa-Bonatz, A. Reinecke and D. Bonatz), pp. 109-121. NUS Press - National University of Singapore, Singapore.
Spriggs, M., Reepmeyer, C., Lape, P., Neri, L., Ronquillo, W.P., Simanjuntak, T., Summerhayes, G., Tanudirjo, D. & Tiauzon, A. (2011) Obsidian sources and distribution systems in Island Southeast Asia: a review of previous research. Journal of Archaeological Science, 38, 2873-2881.
St. John, J.C., Facucho-Oliveira, J., Jiang, Y., Kelly, R. & Salah, R. (2010) Mitochondrial DNA transmission, replication and inheritance: a journey from the gamete through the embryo and into offspring and embryonic stem cells. Human Reproduction Update, 16, 488-509.
Stead, E.F. (1937) The Maori rat. Transactions and Proceedings of the Royal Society of New Zealand, 66, 178-181.
Steppan, S.J., Zawadzki, C. & Heaney, L.R. (2003) Molecular phylogeny of the endemic Philippine rodent Apomys (Muridae) and the dynamics of diversification in an oceanic archipelago. Biological Journal of the Linnean Society, 80, 699-715.
Steppan, S.J., Adkins, R.M. & Anderson, J. (2004) Phylogeny and divergence-date estimates of rapid radiations in muroid rodents based on multiple nuclear genes. Systematic Biology, 53, 533-553.


Storer, T.I. (1962) Introduction. Pacific island rat ecology. Report of a study made on Ponape and adjacent islands 1955-1958 (ed. by T.I. Storer), pp. 3-13. Bernice P Bishop Museum, Honolulu.
Storey, A.A. (2008) Migrations Most Fowl: Archaeological and Ancient Mitochondrial DNA Signatures of Pacific Chickens. PhD, University of Auckland, Auckland.
Storey, A.A., Ladefoged, T. & Matisoo-Smith, E.A. (2008) Counting your chickens: density and distribution of chicken remains in archaeological sites of Oceania. International Journal of Osteoarchaeology, 18, 240-261.
Strecker, R.L. & Jackson, W.B. (1962) Habitats and habits. Pacific island rat ecology. Report of a study made on Ponape and adjacent islands 1955-1958 (ed. by T.I. Storer), pp. 64-74. Bernice P Bishop Museum, Honolulu.
Sugihara, R.T. (1997) Abundance and diets of rats in two native Hawaiian forests. Pacific Science, 51, 189-198.
Summerhayes, G., Matisoo-Smith, E., Mandui, H., Allen, J., Specht, J., Hogg, N. & McPherson, S. (2010) Tamuarawai (EQS): an early Lapita site on Emirau, New Ireland, PNG. Journal of Pacific Archaeology, 1, 62-75.
Summerhayes, G.R. (2001) Defining the chronology of Lapita in the Bismarck Archipelago. The Archaeology of Lapita dispersal in Oceania. Papers from the Fourth Lapita Conference June 2000, Canberra, Australia (ed. by G.R. Clark, A.J. Anderson and T. Vunidilo), pp. 25-38. Terra Australis 17. Canberra, Pandanus Books, Research School of Pacific and Asian Studies, Australian National University.
Summerhayes, G.R. (2009) Obsidian network patterns in Melanesia - sources, characterisation and distribution. IPPA Bulletin, 29, 109-123.
Sunnucks, P. (2000) Efficient genetic markers for population biology. Trends in Ecology and Evolution, 15, 199-203.
Swofford, D.L. (2003) PAUP*. Phylogenetic analysis using parsimony (* and other methods). Version 4.
Swofford, D.L., Olsen, G.J., Waddell, P.J. & Hillis, D.M. (1996) Phylogenetic inference. Molecular Systematics (ed. by D.M. Hillis, C. Moritz and B.K. Mable), p. 655. Sinauer Associates Inc.
Sykes, B.C., Leiboff, A., Low-Beer, J., Tetzner, S. & Richards, M.B. (1995) The origins of the Polynesians: an interpretation from mitochondrial lineage analysis. American Journal of Human Genetics, 57, 1463-1475.
Szabó, K. & O'Connor, S. (2004) Migration and complexity in Holocene Island Southeast Asia. World Archaeology, 36, 621-628.
Tajima, F. (1989) Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics, 123, 585-595.
Takahashi, K. & Nei, M. (2000) Efficiencies of fast algorithms of phylogenetic inference under the criteria of maximum parsimony, minimum evolution, and maximum likelihood when a large number of sequences are used. Molecular Biology and Evolution, 17, 1251-1258.
Tamarin, R.H. & Malecha, S.R. (1971) The population biology of Hawai'ian rodents: demographic parameters. Ecology, 52, 384-394.
Tamarin, R.H. & Malecha, S.R. (1972) Reproductive parameters in Rattus rattus and Rattus exulans of Hawai'i, 1968 to 1970. Journal of Mammalogy, 53, 513-528.
Tanner, M.A. & Wong, W.H. (1987) The calculation of posterior distributions by data augmentation. Journal of the American Statistical Association, 82, 528-540.
Tate, G. (1951) Rodents of Australia and New Guinea. Results of Archbold expedition. No. 65. Bulletin of the American Museum of Natural History, 97, 430.


Tate, G.H.H. (1935) Rodents of the genera Rattus and Mus from the Pacific Islands. Bulletin of the American Museum of Natural History, 68, 145-178.
Tate, G.H.H. (1936) Some Muridae of the Indo-Australian Region. Bulletin of the American Museum of Natural History, 72, 501-728.
Taylor, J.M., Calaby, J.H. & Van Deusen, H.M. (1982) A revision of the genus Rattus (Rodentia, Muridae) in the New Guinean region. Bulletin of the American Museum of Natural History, 173, 177-336.
Terrell, J. (1988) History as a family tree, history as an entangled bank: constructing images and interpretations of prehistory in the South Pacific. History, 62, 642-657.
Terrell, J.E. (2004) Introduction: 'Austronesia' and the great Austronesian migration. World Archaeology, 36, 586-590.
Terrell, J.E. & Welsh, R.L. (1997) Lapita and the temporal geography of prehistory. Antiquity, 71, 548-572.
Terrell, J.E., Hunt, T.L. & Gosden, C. (1997) The dimension of social life in the Pacific. Current Anthropology, 38, 155-195.
Thomas, O. (1898) On the Mammals obtained by Mr. John Whitehead during his recent Expedition to the Philippines. The Transactions of the Zoological Society of London, 14, 377-412.
Thomson, V., Aplin, K.P., Cooper, A., Hisheh, S., Suzuki, H., Maryanto, I., Yap, G. & Donnellan, S.C. (2014) Molecular genetic evidence for the place of origin of the Pacific rat, Rattus exulans. PLoS ONE, 9, e91356.
Tian, J., Wang, P., Cheng, X. & Li, Q. (2002) Astronomically tuned Plio-Pleistocene benthic δ18O record from South China Sea and Atlantic-Pacific comparison. Earth and Planetary Science Letters, 203, 1015-1029.
Tollenaere, C., Brouat, C., Duplantier, J.-M., Rahalison, L., Rahelinirina, S., Pascal, M., Moné, H., Mouahid, G., Leirs, H. & Cosson, J.-F. (2010) Phylogeography of the introduced species Rattus rattus in the western Indian Ocean, with special emphasis on the colonization history of Madagascar. Journal of Biogeography, 37, 398-410.
Tomczak, M. & Godfrey, J.S. (2001) Regional Oceanography: an Introduction. Online version: http://www.es.flinders.edu.au/~mattom/regoc/pdfversion.html.
Tomich, P.Q. & Kami, H.T. (1966) Coat color inheritance of the roof rat in Hawaii. Journal of Mammalogy, 47, 423-431.
Trejaut, J., Poloni, E., Yen, J.-C., Lai, Y.-H., Loo, J.-H., Lee, C.-L., He, C.-L. & Lin, M. (2014) Taiwan Y-chromosomal DNA variation and its relationship with Island Southeast Asia. BMC Genetics, 15, 77.
Trejaut, J.A., Kivisild, T., Loo, J.H., Lee, C.L., He, C.L., Hsu, C.J., Li, Z.Y. & Lin, M. (2005) Traces of archaic mitochondrial lineages persist in Austronesian-speaking Formosan populations. PLoS Biology, 3, e247.
Tufto, J., Engen, S. & Hindar, K. (1996) Inferring patterns of migration from gene frequencies under equilibrium conditions. Genetics, 144, 1911-1921.
Tumonggor, M.K., Karafet, T.M., Hallmark, B., Lansing, J.S., Sudoyo, H., Hammer, M.F. & Cox, M.P. (2013) The Indonesian archipelago: an ancient genetic highway linking Asia and the Pacific. Journal of Human Genetics, 58, 165-173.
Tykot, R.H. & Chia, S. (1997) Long-distance obsidian trade in Indonesia. Materials Research Society Symposium Proceedings (ed by, pp. 175-180.
Underhill, P.A. & Kivisild, T. (2007) Use of Y chromosome and mitochondrial DNA population structure in tracing human migrations. Annual Review of Genetics, 41, 539-564.


van den Bergh, G.D., de Vos, J. & Sondaar, P.Y. (2001) The Late Quaternary palaeogeography of mammal evolution in the Indonesian Archipelago. Palaeogeography, Palaeoclimatology, Palaeoecology, 171, 385-408.
Vernot, B. & Akey, J.M. (2014) Resurrecting surviving Neandertal lineages from modern human genomes. Science, 343, 1017-1021.
Vigilant, L.A., Stoneking, M., Harpending, H., Hawkes, K. & Wilson, A.C. (1991) African populations and the evolution of mitochondrial DNA. Science, 253, 1503-1507.
von Haeseler, A., Sajantila, A. & Pääbo, S. (1996) The genetical archaeology of the human genome. Nature Genetics, 14, 135-140.
von Hochstetter, F. (1867) New Zealand, Chapter VIII & X. In. Early New Zealand Books.
von Kotzebue, O. (1821) Entdeckungsreise in die Südsee und nach der Beringstrasse zu Erforschung einer nordöstlichen Durchfahrt. Unternommen in den Jahren 1815, 1816, 1817 und 1818. Gebrüder Hoffmann, Weimar.
Voris, H.K. (2000) Maps of Pleistocene sea levels in Southeast Asia: shorelines, river systems and time durations. Journal of Biogeography, 27, 1153-1167.
Waite, E.R. (1897) The mammals, reptiles, and fishes of Funafuti. VIII. The mammals, reptiles, and fishes. Australian Museum Memoir, 3, 165-202.
Walberg, M.W. & Clayton, D.A. (1981) Sequence and properties of the human KB cell and mouse L cell D-loop regions of mitochondrial DNA. Nucleic Acids Research, 9, 5411-5421.
Walker, J. (1892) XIX. The bird-life of Adèle Island, north-west Australia. Ibis, 34, 254-261.
Wallace, A.R. (1869) The Malay Archipelago. Public version available at http://www.authorama.com.
Wallin, I.E. (1923) On the nature of mitochondria. The Anatomical Record, 25, 1-7.
Watts, C.H.S. & Baverstock, P. (1994) Evolution in some South-east Asian Murinae (Rodentia), as assessed by microcomplement fixation of albumin, and their relationship to Australian Murines. Australian Journal of Zoology, 42, 711-722.
Webb, C.O. & Ree, R. (2012) Historical biogeography inference in Malesia. Biotic evolution and environmental change in Southeast Asia, 191-215.
Webb III, T. & Bartlein, P. (1992) Global changes during the last 3 million years: climatic controls and biotic responses. Annual Review of Ecology and Systematics, 23, 141-173.
Wegener, A. (1912) Die Entstehung der Kontinente. Geologische Rundschau, 3, 276-292.
Weir, B.S. & Cockerham, C.C. (1984) Estimating F-statistics for the analysis of population structure. Evolution, 38, 1358-1370.
Weiss, R.A. (2001) The Leeuwenhoek Lecture 2001. Animal origins of human infectious disease. Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences, 356, 957-977.
White, D.J., Wolff, J.N., Pierson, M. & Gemmell, N.J. (2008) Revealing the hidden complexities of mtDNA inheritance. Molecular Ecology, 17, 4925-4942.
White, P.J. (2004) Where the wild things are: Prehistoric animal translocation in the circum New Guinea Archipelago. Voyages of Discovery: The Archaeology of Islands (ed. by S.M. Fitzpatrick), pp. 147-164. Society of American Archaeology, Westport.
Whitlock, M.C. & McCauley, D.E. (1999) Indirect measures of gene flow and migration: FST ≠ 1/(4Nm + 1). Heredity, 82, 117-125.
Wiens, J.J. (2001) Character analysis in morphological phylogenetics: problems and solutions. Systematic Biology, 50, 689-699.
Williams, J.M. (1973) The ecology of Rattus exulans (Peale) reviewed. Pacific Science, 27, 120-127.


Wilmshurst, J.M., Hunt, T.L., Lipo, C.P. & Anderson, A.J. (2011) High-precision radiocarbon dating shows recent and rapid initial human colonization of East Polynesia. Proceedings of the National Academy of Sciences, 108, 1815-1820.
Wilson, G.A. & Rannala, B. (2003) Bayesian inference of recent migration rates using multilocus genotypes. Genetics, 163, 1177-1191.
Wirtz, W.O. (1972) Population ecology of the Polynesian rat, Rattus exulans, on Kure Atoll Hawaii. Pacific Science, 26, 431-464.
Woodruff, D.S. (2003) Neogene marine transgressions, palaeogeography and biogeographic transitions on the Thai–Malay Peninsula. Journal of Biogeography, 30, 551-567.
Wright, S. (1922) Coefficients of inbreeding and relationship. The American Naturalist, 56, 330-338.
Wright, S. (1931) Evolution in mendelian populations. Genetics, 16, 97-159.
Wright, S. (1932) The roles of mutation, inbreeding, crossbreeding and selection in evolution. Proceedings of The Sixth International Congress of Genetics, 1, 356-366.
Wright, S. (1940) Breeding structure of populations in relation to speciation. American Naturalist, 74, 232-248.
Wright, S. (1950) Genetical structure of populations. Nature, 166, 241-280.
Wright, S. (1951) The genetical structure of populations. Annals of Eugenics, 15, 323-354.
Wright, S. (1978) Evolution and the genetics of populations. IV. Variability within and among populations. University of Chicago Press, Chicago.
Wyrtki, K. (1961) Physical oceanography of the Southeast Asian waters. In: Scientific results of marine investigations of the South China Sea and the Gulf of Thailand. University of California, Scripps Institution of Oceanography, La Jolla, California.
Yang, Z. & Rannala, B. (1997) Bayesian phylogenetic inference using DNA sequences: a Markov chain Monte Carlo method. Molecular Biology and Evolution, 14, 717-724.
Yu, Y., Harris, A. & He, X.-J. (2014) RASP (Reconstruct Ancestral State in Phylogenies) 3.0. Available at http://mnh.scu.edu.cn/soft/blog/RASP.
Zuckerkandl, E. & Pauling, L. (1962) Molecular disease, evolution, and genic heterogeneity. Horizons in Biochemistry (ed. by M. Kasha and B. Pullman), pp. 189-225. Academic Press, New York.
